AsParallel makes your query topless

Here is a simple code snippet using Entity Framework:

using (AdventureWorksEntities AwContext = new AdventureWorksEntities())
{
    var LoginId = AwContext.Employees
        .Where(u => u.LoginID.Length > 0)
        .Select(u => u.LoginID)
        .Take(10);
}

As you can see from the code above, I am trying to get the top 10 rows from the Employee table, and only one column, LoginID, where the LoginID length is greater than zero. If you run this query, SQL Profiler shows SQL like the following:

      SELECT TOP (10)
      [c].[LoginID] AS [LoginID]
        FROM [HumanResources].[Employee] AS [c]
            WHERE LEN( [c].[LoginID] ) > 0

Now, the web is full of examples of how to take advantage of multiple cores and run your query in parallel using PLINQ. Just add the magic word AsParallel to the data source and your code will run on multiple threads across multiple cores. But if you are a developer who reads 5000 words a minute, you may miss the fact that PLINQ applies only to LINQ to Objects (i.e. IEnumerable-based sources where lambdas are bound to delegates, not IQueryable-based sources where the lambdas are bound to expressions), and you may add AsParallel to your query thinking that it will use all the cores of your CPU and somehow make the query faster. Unfortunately, your code may still work, but it now has severe side effects behind the scenes.
using (AdventureWorksEntities AwContext = new AdventureWorksEntities())
{
    var LoginId = AwContext.Employees.AsParallel()
        .Where(u => u.LoginID.Length > 0)
        .Select(u => u.LoginID)
        .Take(10);
}

With luck your code may still work, but under the covers many things have changed, because everything after AsParallel now runs in memory rather than in the database. First, the Top 10 selection is gone from the query that goes to SQL. This can be a problem if your table is large, with a couple of million rows. Second, you are now getting all the columns. Third, if you are lucky, you will get exceptions in the code as the worker threads process the rest of your statements, such as the Where clause, and fail with a NullReferenceException; otherwise you will just squander resources under the false pretext that you have written efficient code. Here is the SQL generated after adding the AsParallel call:
   SELECT
   [Extent1].[EmployeeID] AS [EmployeeID],
   [Extent1].[NationalIDNumber] AS [NationalIDNumber],
   [Extent1].[ContactID] AS [ContactID],
   [Extent1].[LoginID] AS [LoginID],
   ... all other columns
     FROM [HumanResources].[Employee] AS [Extent1]
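
The reason the provider never sees the rest of the query can be checked without a database. The sketch below is only an illustration: `BindingDemo` and `Source` are hypothetical names, and an in-memory list wrapped with `AsQueryable()` stands in for the entity set. It shows that `Where` on an `IQueryable` source stays an `IQueryable` (an expression tree a provider could translate), while `Where` after `AsParallel()` does not.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BindingDemo
{
    // Hypothetical stand-in for an entity set: any IQueryable<T> source binds the same way.
    public static IQueryable<string> Source() =>
        new List<string> { "adventure-works\\guy1", "", "adventure-works\\kevin0" }.AsQueryable();

    // Where binds to Queryable.Where, so the result is still an IQueryable.
    public static bool StaysQueryable()
    {
        object query = Source().Where(s => s.Length > 0);
        return query is IQueryable<string>;
    }

    // AsParallel returns a ParallelQuery, so Where binds to ParallelEnumerable.Where
    // and the lambda is compiled to a delegate that runs in memory.
    public static bool StaysQueryableAfterAsParallel()
    {
        object query = Source().AsParallel().Where(s => s.Length > 0);
        return query is IQueryable<string>;
    }

    public static void Main()
    {
        Console.WriteLine(StaysQueryable());                // True
        Console.WriteLine(StaysQueryableAfterAsParallel()); // False
    }
}
```

Once the chain stops being an `IQueryable`, the only thing left for the provider to do is enumerate the whole source, which matches the SQL above.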

So remember: LINQ to SQL and LINQ to Entities queries are executed by their respective query providers and databases, and PLINQ does not offer a way to parallelize those queries. However, if you wish to process the results of those queries in memory, including joining the output of many heterogeneous queries, then PLINQ can be quite useful.
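
As a rough sketch of that last point, the idea is to let the database do the filtering, projection and Top 10, and hand PLINQ only the rows that came back. Everything named here is an assumption for illustration: `InMemoryStage`, `FetchLoginIds` (a stand-in for the materialized query result, e.g. after `ToList()`), `UserNames`, and the sample login values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InMemoryStage
{
    // Hypothetical stand-in for rows already returned by the database,
    // e.g. the filtered and projected query after ToList().
    public static List<string> FetchLoginIds() =>
        new List<string>
        {
            "adventure-works\\guy1",
            "adventure-works\\kevin0",
            "adventure-works\\roberto0"
        };

    // In-memory work on the materialized rows; this is the part PLINQ can spread across cores.
    // AsOrdered keeps the output in the same order as the input.
    public static List<string> UserNames(IEnumerable<string> loginIds) =>
        loginIds
            .AsParallel()
            .AsOrdered()
            .Select(id => id.Substring(id.IndexOf('\\') + 1))
            .ToList();

    public static void Main()
    {
        foreach (var name in UserNames(FetchLoginIds()))
            Console.WriteLine(name);
    }
}
```

For ten short strings the parallel version will not be faster than a plain loop; the pattern only pays off when the per-row work in memory is expensive.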

