That was the first time I had to dig into optimization of SQL queries to maintain usability of the application. While I was looking to optimizing my LINQ structure, I had to go a level deeper into the SQL statements created by the framework. I came across two major "bugs" that i find worth sharing and knowing about. The first one might cause logical mistakes in your program, if you are not aware. The "bug" is related to the contains function which surfaces in the following example:
dbcontext.table.select(r => stringList.Contains(r.name));
While you would expect that the generated SQL would really check if the string is contained in the string list it actually does this:
dbcontext.table.select(r => string.join("", stringList).IndexOf(r.name) >= 0);
Or in words it concats the string list and only checks if that contains the name, which has a considerable different result than the intended one.
If you consider a string list with "1", "2", "3" the check - contains("23") - will have the result - true. I found a quick work around for my case by concatenating the string myself with
var concatString = "-" + string.Join("-", stringList) + "-";
dbcontext.table.select(r => concatString.IndexOf("-" + r.name + "-") >= 0);
This way the separators make sure that the string start and endings are matching with the intended string. If you are not using numbers you should consider using other separation chars than i did here.
The second bug is related to the new version of the entity framework 5. The LINQ statement
var q = dbcontext.table.Where(r => r.name = "1");
would be translated into SQL:
Select Extent1.name From (
Select name From table
) as Extent1
Where name = 1;
In small tables and easy structures this is not impacting the performance notably, but querying a 500k table definitely does. The only way to work around that problem i could think of was to move back to the Entity Framework version 4 type of model creation. By changing the Code Generation Strategy from Default to None this can be achieved without any further hastle.
To retrieve the actual sql statements created by the entity franework you could wether check on the database log or directly debug in the .net.
EF 4:
var query = dbcontext.Table.Where(r => r.Id == searchId)
.Select(r => r);
Console.WriteLine(((ObjectQuery)query).ToTraceString());
EF 5:
var query = dbset.Table.Where(r => r.Id == searchId) .Select(r => r);
Console.WriteLine(((DbQuery)query).ToTraceString());