Wednesday, May 28, 2008
Research Blog
I have decided to take my personal blog private, so this is where I will pen down my work-related thoughts and issues, hoping to share my insights.
SortedDictionaryBug
Recently I came across a bug that took up a good part of my afternoon. .NET offers two type-safe generic replacements for Hashtable: Dictionary and SortedDictionary. SortedDictionary keeps its entries sorted by key.
Now, back to my bug: I was dumping a Dictionary to a text file and reading it back into a SortedDictionary object. While doing so, a "Key Already Present" exception was raised, which seemed absurd since I was reading in a dictionary that should have had unique keys.
As it turned out, SortedDictionary and Dictionary treat Unicode strings differently. For example, a Dictionary object treats "johann_strauß" and "johann_strauss" as two different keys, whereas SortedDictionary treats them as the same key, which is what caused it to raise an exception (since I was adding both). The difference is equivalent to using the CompareTo method versus the Equals method of the string class: Equals compares the strings ordinally, while CompareTo performs a culture-sensitive comparison that can consider "ß" and "ss" equal. SortedDictionary relies on the latter, which I suspect is a bug in the SortedDictionary implementation.
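using System;
using System.Collections.Generic;

namespace SortedDictionaryBug
{
    class Program
    {
        static void Main(string[] args) {
            // Dictionary matches keys with string.Equals (ordinal).
            Dictionary<string, int> dict = new Dictionary<string, int>();
            dict.Add("johann_strauß", 1);
            dict.Add("johann_strauss", 1);
            // Everything works well

            // Ordinal comparison: the strings differ, so nothing is printed here.
            if ("johann_strauß".Equals("johann_strauss")) {
                Console.WriteLine("Equal");
            }
            // Culture-sensitive comparison: this is the check that treats the two strings as equal.
            if ("johann_strauß".CompareTo("johann_strauss") == 0) {
                Console.WriteLine("Equal");
            }

            // SortedDictionary matches keys with CompareTo.
            SortedDictionary<string, int> sortedDict = new SortedDictionary<string, int>();
            sortedDict.Add("johann_strauß", 1);
            // Exception is raised
            sortedDict.Add("johann_strauss", 1);
        }
    }
}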
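One possible workaround, assuming ordinal (character-by-character) key comparison is what you actually want, is to hand SortedDictionary an explicit comparer instead of relying on its default. The sketch below is only an illustration: the class name OrdinalSortedDictionaryDemo and the helper Load are made up for this example, while the SortedDictionary constructor that accepts an IComparer and StringComparer.Ordinal are part of the framework.

```csharp
using System;
using System.Collections.Generic;

static class OrdinalSortedDictionaryDemo
{
    // Hypothetical helper: copies entries into a SortedDictionary whose keys
    // are compared ordinally, so "ß" and "ss" stay distinct regardless of culture.
    public static SortedDictionary<string, int> Load(
        IEnumerable<KeyValuePair<string, int>> entries)
    {
        var sorted = new SortedDictionary<string, int>(StringComparer.Ordinal);
        foreach (var entry in entries)
        {
            // Add still throws on a genuine duplicate, i.e. an identical string.
            sorted.Add(entry.Key, entry.Value);
        }
        return sorted;
    }

    static void Main()
    {
        var dict = new Dictionary<string, int>
        {
            { "johann_strauß", 1 },
            { "johann_strauss", 1 }
        };

        var sorted = Load(dict);
        Console.WriteLine(sorted.Count); // prints 2: both keys are kept
    }
}
```

With an ordinal comparer the sort order follows raw character values rather than the culture's alphabetical order, so this trades linguistic sorting for keys that match the way Dictionary matches them.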
Monday, July 23, 2007
David Heckerman's Talk
Today I attended a talk by David Heckerman about his work on using machine learning methods to help the biology community with the AIDS vaccine. I did not understand a fair portion of the talk, but here are a few things I learnt from it:
1. Fisher's exact test (p-value)
2. False discovery rate
3. Some people are immune to AIDS, and this depends on the kind of immune system they have (HLA-B57 vs HLA-B27).