By kwcto


2009-02-20 01:27:22 8 Comments

I'm trying to create a generic function to help me select thousands of records using LINQ to SQL from a local list. SQL Server (2005 at least) limits queries to 2100 parameters and I'd like to select more records than that.

Here would be a good example usage:

var some_product_numbers = new int[] { 1,2,3 ... 9999 };

Products.SelectByParameterList(some_product_numbers, p => p.ProductNumber);

Here is my (non-working) implementation:

public static IEnumerable<T> SelectByParameterList<T, PropertyType>(Table<T> items, 

IEnumerable<PropertyType> parameterList, Expression<Func<T, PropertyType>> property) where T : class
{
    var groups = parameterList
        .Select((Parameter, index) =>
            new
            {
                GroupID = index / 2000, //2000 parameters per request
                Parameter
            }
        )
        .GroupBy(x => x.GroupID)
        .AsEnumerable();

    var results = groups
    .Select(g => new { Group = g, Parameters = g.Select(x => x.Parameter) } )
    .SelectMany(g => 
        /* THIS PART FAILS MISERABLY */
        items.Where(item => g.Parameters.Contains(property.Compile()(item)))
    );

    return results;
}

I have seen plenty of examples of building predicates using expressions. In this case I only want to execute the delegate to return the value of the current ProductNumber. Or rather, I want to translate this into the SQL query (it works fine in non-generic form).

I know that compiling the Expression just takes me back to square one (passing in the delegate as Func) but I'm unsure of how to pass a parameter to an "uncompiled" expression.

Thanks for your help!

**** EDIT:** Let me clarify further:

Here is a working example of what I want to generalize:

var local_refill_ids = Refills.Select(r => r.Id).Take(20).ToArray();

var groups = local_refill_ids
    .Select((Parameter, index) =>
        new
        {
            GroupID = index / 5, //5 parameters per request
            Parameter
        }
    )
    .GroupBy(x => x.GroupID)
    .AsEnumerable();

var results = groups
.Select(g => new { Group = g, Parameters = g.Select(x => x.Parameter) } )
.SelectMany(g => 
    Refills.Where(r => g.Parameters.Contains(r.Id))
)
.ToArray()
;

Results in this SQL code:

SELECT [t0].[Id], ... [t0].[Version]
FROM [Refill] AS [t0]
WHERE [t0].[Id] IN (@p0, @p1, @p2, @p3, @p4)

... That query 4 more times (20 / 5 = 4)

5 comments

@Zero 2014-02-07 16:38:59

You can create your own QueryProvider

public class QueryProvider : IQueryProvider
{
    // Translates LINQ query to SQL.
    private readonly Func<IQueryable, DbCommand> _translator;

    // Executes the translated SQL and retrieves results.
    private readonly Func<Type, string, object[], IEnumerable> _executor;

    public QueryProvider(
        Func<IQueryable, DbCommand> translator,
        Func<Type, string, object[], IEnumerable> executor)
    {

        this._translator = translator;
        this._executor = executor;
    }

    #region IQueryProvider Members

    public IQueryable<TElement> CreateQuery<TElement>(Expression expression)
    {
        return new Queryable<TElement>(this, expression);
    }

    public IQueryable CreateQuery(Expression expression)
    {
        throw new NotImplementedException();
    }

    public TResult Execute<TResult>(Expression expression)
    {
        bool isCollection = typeof(TResult).IsGenericType &&
            typeof(TResult).GetGenericTypeDefinition() == typeof(IEnumerable<>);
        var itemType = isCollection
            // TResult is an IEnumerable`1 collection.
            ? typeof(TResult).GetGenericArguments().Single()
            // TResult is not an IEnumerable`1 collection, but a single item.
            : typeof(TResult);
        var queryable = Activator.CreateInstance(
            typeof(Queryable<>).MakeGenericType(itemType), this, expression) as IQueryable;

        IEnumerable queryResult;

        // Translates LINQ query to SQL.
        using (var command = this._translator(queryable))
        {
            var parameters = command.Parameters.OfType<DbParameter>()
                .Select(parameter => parameter)
                .ToList();

            var query = command.CommandText;
            var newParameters = GetNewParameterList(ref query, parameters);

            queryResult = _executor(itemType,query,newParameters);
        }

        return isCollection
            ? (TResult)queryResult // Returns an IEnumerable`1 collection.
            : queryResult.OfType<TResult>()
                         .SingleOrDefault(); // Returns a single item.
    }       

    public object Execute(Expression expression)
    {
        throw new NotImplementedException();
    }

    #endregion

     private static object[] GetNewParameterList(ref string query, List<DbParameter> parameters)
    {
        var newParameters = new List<DbParameter>(parameters);

        foreach (var dbParameter in parameters.Where(p => p.DbType == System.Data.DbType.Int32))
        {
            var name = dbParameter.ParameterName;
            var value = dbParameter.Value != null ? dbParameter.Value.ToString() : "NULL";
            var pattern = String.Format("{0}[^0-9]", dbParameter.ParameterName);
            query = Regex.Replace(query, pattern, match => value + match.Value.Replace(name, ""));
            newParameters.Remove(dbParameter);
        }

        for (var i = 0; i < newParameters.Count; i++)
        {
            var parameter = newParameters[i];
            var oldName = parameter.ParameterName;
            var pattern = String.Format("{0}[^0-9]", oldName);
            var newName = "@p" + i;
            query = Regex.Replace(query, pattern, match => newName + match.Value.Replace(oldName, ""));
        }      

        return newParameters.Select(x => x.Value).ToArray();
    }
}


    static void Main(string[] args)
    {
        using (var dc=new DataContext())
        {
            var provider = new QueryProvider(dc.GetCommand, dc.ExecuteQuery);

            var serviceIds = Enumerable.Range(1, 2200).ToArray();

            var tasks = new Queryable<Task>(provider, dc.Tasks).Where(x => serviceIds.Contains(x.ServiceId) && x.CreatorId==37 && x.Creator.Name=="12312").ToArray();

        }

    }

@Muhammad Adnan 2012-04-13 10:22:14

Pass IQuerable to the Contains function instead of list or array. please see the below example

var df_handsets = db.DataFeed_Handsets.Where(m => m.LaunchDate != null).
                  Select(m => m.Name);
var Make = (from m in db.MobilePhones
    where (m.IsDeleted != true || m.IsDeleted == null)
        && df_handsets.Contains(m.Name)
    orderby m.Make
    select new { Value = m.Make, Text = m.Make }).Distinct();

when you pass list or array it is passed in form of parameters and its exceed the counts when the list items count is greater than 2100.

@kwcto 2012-05-07 17:56:37

You are assuming the collection to match against is coming from the database itself. This is not always the case.

@kwcto 2009-02-20 16:01:23

Easiest way to do this: Use LINQKit (Free, non-restrictive license)

Working version of code:

public static IEnumerable<T> SelectByParameterList<T, PropertyType>(this Table<T> items, IEnumerable<PropertyType> parameterList, Expression<Func<T, PropertyType>> propertySelector, int blockSize) where T : class
{
    var groups = parameterList
        .Select((Parameter, index) =>
            new
            {
                GroupID = index / blockSize, //# of parameters per request
                Parameter
            }
        )
        .GroupBy(x => x.GroupID)
        .AsEnumerable();

    var selector = LinqKit.Linq.Expr(propertySelector);

    var results = groups
    .Select(g => new { Group = g, Parameters = g.Select(x => x.Parameter) } )
    .SelectMany(g => 
        /* AsExpandable() extension method requires LinqKit DLL */
        items.AsExpandable().Where(item => g.Parameters.Contains(selector.Invoke(item)))
    );

    return results;
}

Example usage:

    Guid[] local_refill_ids = Refills.Select(r => r.Id).Take(20).ToArray();

    IEnumerable<Refill> results = Refills.SelectByParameterList(local_refill_ids, r => r.Id, 10); //runs 2 SQL queries with 10 parameters each

Thanks again for all your help!

@Marc Gravell 2009-02-20 21:47:06

I'd be interested in what the TSQL for that does, compared to my InRange reply...

@kwcto 2009-02-20 21:55:30

SELECT [t0].[Id], ... [t0].[Version] FROM [Refill] AS [t0] WHERE [t0].[Id] IN (@p0, @p1, @p2, @p3, @p4, @p5, @p6, @p7, @p8, @p9) ... That query 2times (20 / 10 = 2)

@ni5ni6 2012-12-24 09:56:21

What is your suggestion for blockSize in order to optimize queries using LinqToSql? Or, to put it differently, is it better to have less queries with bigger blocks, or more queries with smaller blocks?

@kwcto 2013-01-03 20:04:41

You generally want as few blocks as possible. The main issue is that SQL Server only allows 2100 parameters per query/command. I'll usually use a blockSize of 2000.

@Marc Gravell 2009-02-20 08:49:22

I've come up with a way to chunk the query into pieces - i.e. you give it 4000 values, so it might do 4 requests of 1000 each; with full Northwind example. Note that this might not work on Entity Framework, due to Expression.Invoke - but is fine on LINQ to SQL:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

namespace ConsoleApplication5 {
    /// SAMPLE USAGE
    class Program {
        static void Main(string[] args) {
            // get some ids to play with...
            string[] ids;
            using(var ctx = new DataClasses1DataContext()) {
                ids = ctx.Customers.Select(x => x.CustomerID)
                    .Take(100).ToArray();
            }

            // now do our fun select - using a deliberately small
            // batch size to prove it...
            using (var ctx = new DataClasses1DataContext()) {
                ctx.Log = Console.Out;
                foreach(var cust in ctx.Customers
                        .InRange(x => x.CustomerID, 5, ids)) {
                    Console.WriteLine(cust.CompanyName);
                }
            }
        }
    }

    /// THIS IS THE INTERESTING BIT
    public static class QueryableChunked {
        public static IEnumerable<T> InRange<T, TValue>(
                this IQueryable<T> source,
                Expression<Func<T, TValue>> selector,
                int blockSize,
                IEnumerable<TValue> values) {
            MethodInfo method = null;
            foreach(MethodInfo tmp in typeof(Enumerable).GetMethods(
                    BindingFlags.Public | BindingFlags.Static)) {
                if(tmp.Name == "Contains" && tmp.IsGenericMethodDefinition
                        && tmp.GetParameters().Length == 2) {
                    method = tmp.MakeGenericMethod(typeof (TValue));
                    break;
                }
            }
            if(method==null) throw new InvalidOperationException(
                "Unable to locate Contains");
            foreach(TValue[] block in values.GetBlocks(blockSize)) {
                var row = Expression.Parameter(typeof (T), "row");
                var member = Expression.Invoke(selector, row);
                var keys = Expression.Constant(block, typeof (TValue[]));
                var predicate = Expression.Call(method, keys, member);
                var lambda = Expression.Lambda<Func<T,bool>>(
                      predicate, row);
                foreach(T record in source.Where(lambda)) {
                    yield return record;
                }
            }
        }
        public static IEnumerable<T[]> GetBlocks<T>(
                this IEnumerable<T> source, int blockSize) {
            List<T> list = new List<T>(blockSize);
            foreach(T item in source) {
                list.Add(item);
                if(list.Count == blockSize) {
                    yield return list.ToArray();
                    list.Clear();
                }
            }
            if(list.Count > 0) {
                yield return list.ToArray();
            }
        }
    }
}

@Matt Sach 2011-03-30 12:11:48

This handles the case of queryable.Where(o => values.Contains(o.propertyToTest)) by replacing it with queryable.InRange(o => o.propertyToTest, blockSize, values) (if I understand it correctly), but I'm looking at a similar overflow on the 2100 parameters limit, with e.g. queryable.Where(o => !values.Contains(o.propertyToTest)). I'm trying to modify InRange() to obtain a NotInRange() equivalent, and I'm not sure how to do the boolean negation. My thinking was at the foreach (T record in source.Where(lambda)) line?

@Matt Sach 2011-03-30 14:56:32

Actually, after much hunting, I think I've found what's needed, appropriately from an answer you'd given a month previously to this one: stackoverflow.com/questions/457316/…, ref "This also works well to negate a single operation:"

@Code Maverick 2012-10-22 19:04:53

@Marc - How would this be handled in VB? yield return obviously doesn't exist for us.

@Dilbert789 2013-07-23 13:25:46

What is the reason for making the batch size configurable? Wouldn't I always want this to be like 2090? So I'm just under the 2100 limit with a bit of play?

@Frans Bouma 2014-06-12 08:51:02

The Invoke isn't necessary. You can re-use the selector lambda and parameter, although I admit it's a bit dirty. Avoiding the invoke will also make it usable on other ORMs I think. (I made it working on LLBLGen Pro using: gist.github.com/FransBouma/5e7031fe557df4b5b688

@Marc Gravell 2009-02-20 04:34:42

LINQ-to-SQL still works via standard SQL parameters, so writing a fancy expression isn't going to help. There are 3 common options here:

  • pack the ids into (for example) csv/tsv; pass down as a varchar(max) and use a udf to split it (at the server) into a table variable; join to the table variable
  • use a table-valued-parameter in SQL Server 2008
  • have a table on the server that you could push the ids into (perhaps via SqlBulkCopy) (perhaps with a "session guid" or similar); join to this table

The first is the simplest; getting a "split csv udf" is trivial (just search for it). Drag the udf onto the data-context and consume from there.

@kwcto 2009-02-20 18:14:31

This isn't necessary. See my answer below.

Related Questions

Sponsored Content

21 Answered Questions

[SOLVED] What is the best way to give a C# auto-property an initial value?

34 Answered Questions

[SOLVED] What is the Java equivalent for LINQ?

  • 2009-08-01 18:53:26
  • Ahmed
  • 267780 View
  • 805 Score
  • 34 Answer
  • Tags:   java linq

20 Answered Questions

[SOLVED] Retrieving Property name from lambda expression

23 Answered Questions

[SOLVED] LINQ query on a DataTable

20 Answered Questions

[SOLVED] LINQ's Distinct() on a particular property

14 Answered Questions

[SOLVED] Returning IEnumerable<T> vs. IQueryable<T>

7 Answered Questions

[SOLVED] Multiple "order by" in LINQ

  • 2008-11-18 13:34:11
  • Sasha
  • 574362 View
  • 1534 Score
  • 7 Answer
  • Tags:   linq sql-order-by

9 Answered Questions

[SOLVED] Group by in LINQ

  • 2011-09-06 19:44:20
  • test123
  • 1212580 View
  • 1000 Score
  • 9 Answer
  • Tags:   c# linq group-by

9 Answered Questions

[SOLVED] What is a lambda expression in C++11?

16 Answered Questions

[SOLVED] Entity Framework vs LINQ to SQL

Sponsored Content