By user278618


2010-03-29 12:31:49 8 Comments

I have a collection:

List<Car> cars = new List<Car>();

Cars are uniquely identified by their property CarCode.

I have three cars in the collection, and two with identical CarCodes.

How can I use LINQ to convert this collection to Cars with unique CarCodes?


@Patrick Hofman 2017-06-22 11:43:22

You can't effectively use Distinct on a collection of objects (without additional work). I will explain why.

The documentation says:

It uses the default equality comparer, Default, to compare values.

For objects, that means it uses the default equality method to compare them, which relies on Equals and on their hash code. Since your objects don't override the GetHashCode() and Equals methods, the comparison falls back to the object reference, and every reference is different, so no two objects are ever treated as duplicates.
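The "additional work" mentioned above could look like the following minimal sketch. It assumes a Car class whose only identifying property is CarCode (the Name property and the sample values are made up for illustration) and overrides Equals and GetHashCode so that the plain Distinct() call compares by CarCode rather than by reference.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Car class: only CarCode comes from the question, Name is illustrative.
public sealed class Car : IEquatable<Car>
{
    public string CarCode { get; set; }
    public string Name { get; set; }

    // Two cars are considered equal when their CarCodes match.
    public bool Equals(Car other) => other != null && CarCode == other.CarCode;

    public override bool Equals(object obj) => Equals(obj as Car);

    // Equal objects must produce equal hash codes, so hash only on CarCode.
    public override int GetHashCode() => CarCode == null ? 0 : CarCode.GetHashCode();
}

public static class Program
{
    public static void Main()
    {
        var cars = new List<Car>
        {
            new Car { CarCode = "A1", Name = "first" },
            new Car { CarCode = "A1", Name = "duplicate" },
            new Car { CarCode = "B2", Name = "second" },
        };

        // The default equality comparer now picks up Car.Equals and Car.GetHashCode.
        List<Car> unique = cars.Distinct().ToList();
        Console.WriteLine(unique.Count); // prints 2
    }
}
```

The trade-off is that this changes what equality means for Car everywhere, not just in this one query, which may or may not be what you want.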

@Jon Skeet 2010-03-29 12:34:48

Use MoreLINQ, which has a DistinctBy method :)

IEnumerable<Car> distinctCars = cars.DistinctBy(car => car.CarCode);

(This is only for LINQ to Objects, mind you.)

@Diogo 2013-07-17 16:48:23

@gdoron 2013-10-17 12:57:08

Hi Jon, two questions if I may. 1) Why don't you add the library to NuGet? 2) What about LINQ to SQL\EF\NH? How can we implement that? Do we have to use Guffa's version (which is your version if NO_HASHSET is true...)? Thank you very much!

@Jon Skeet 2013-10-17 12:58:39

@gdoron: 1) It's in NuGet already: nuget.org/packages/morelinq 2) I doubt that LINQ to SQL etc are flexible enough to allow that.

@gdoron 2013-10-17 13:04:59

Ohh, it's prerelease... that's why I couldn't find it. 2) Well, I'm wary of adding the lib to my project: I'm afraid someone will use it with IQueryable<T>, try to DistinctBy it, and thus query the whole God damn table... Isn't it error prone? Thanks again for your extremely quick response!

@Jon Skeet 2013-10-17 13:14:22

@gdoron: No, 2.0 is prerelease - 1.0 isn't. As for whether it's error-prone... well, that's true of LINQ in general, in that you could always pass an IQueryable<T> to something expecting IEnumerable<T>.

@Gustavo Guevara 2014-06-05 17:57:06

I tried using MoreLinq for the same task, particularly DistinctBy. It wasn't very efficient at all. The MoreLinq/DistinctBy query made the method take 2.5 minutes to execute as per Chrome's network tab, whereas the GroupBy approach took 1.5 seconds. I had to convert to Queryable using the AsQueryable method; maybe that had some influence.

@Shimmy 2015-01-24 21:46:45

Would you consider it a bad habit to include the MoreLinq features under the System.Linq namespace, so no need to add additional usings to each file that wants to access those features?

@Jon Skeet 2015-01-24 21:55:45

@Shimmy: I'd personally feel nervous about writing code under System as that gives a false impression of it being "official". But your tastes may vary, of course :)

@Jon Skeet 2016-01-20 21:01:46

@gdoron: Yes, it is. I'll edit the LINQ.

@Anestis Kivranoglou 2015-12-09 13:33:32

I think the best option in terms of performance (or in any terms) is to use Distinct with the IEqualityComparer interface.

However, implementing a new comparer for each class is cumbersome and produces boilerplate code.

So here is an extension method which produces a new IEqualityComparer on the fly for any class using reflection.

Usage:

var filtered = taskList.DistinctBy(t => t.TaskExternalId).ToArray();

Extension Method Code

public static class LinqExtensions
{
    public static IEnumerable<T> DistinctBy<T, TKey>(this IEnumerable<T> items, Func<T, TKey> property)
    {
        GeneralPropertyComparer<T, TKey> comparer = new GeneralPropertyComparer<T,TKey>(property);
        return items.Distinct(comparer);
    }   
}
public class GeneralPropertyComparer<T,TKey> : IEqualityComparer<T>
{
    private Func<T, TKey> expr { get; set; }
    public GeneralPropertyComparer (Func<T, TKey> expr)
    {
        this.expr = expr;
    }
    public bool Equals(T left, T right)
    {
        var leftProp = expr.Invoke(left);
        var rightProp = expr.Invoke(right);
        if (leftProp == null && rightProp == null)
            return true;
        else if (leftProp == null ^ rightProp == null)
            return false;
        else
            return leftProp.Equals(rightProp);
    }
    public int GetHashCode(T obj)
    {
        var prop = expr.Invoke(obj);
        return (prop==null)? 0:prop.GetHashCode();
    }
}

@MistyK 2017-04-04 14:24:28

where is the reflection here?

@Luke Puplett 2014-07-18 10:59:46

Another extension method for Linq-to-Objects, without using GroupBy:

    /// <summary>
    /// Returns the set of items, made distinct by the selected value.
    /// </summary>
    /// <typeparam name="TSource">The type of the source.</typeparam>
    /// <typeparam name="TResult">The type of the result.</typeparam>
    /// <param name="source">The source collection.</param>
    /// <param name="selector">A function that selects a value to determine unique results.</param>
    /// <returns>IEnumerable&lt;TSource&gt;.</returns>
    public static IEnumerable<TSource> Distinct<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, TResult> selector)
    {
        HashSet<TResult> set = new HashSet<TResult>();

        foreach(var item in source)
        {
            var selectedValue = selector(item);

            if (set.Add(selectedValue))
                yield return item;
        }
    }

@Andrzej Gis 2013-08-15 20:21:58

You can check out my PowerfulExtensions library. It's currently at a very early stage, but you can already use methods like Distinct, Union, Intersect, and Except on any number of properties.

This is how you use it:

using PowerfulExtensions.Linq;
...
var distinct = myArray.Distinct(x => x.A, x => x.B);

@Thomas 2017-10-18 09:08:45

If I have a list of objects where I want to delete all objects with the same IDs, will it be myList.Distinct(x => x.ID)?

@Sheldor the conqueror 2013-02-27 14:14:16

Same approach as Guffa but as an extension method:

public static IEnumerable<T> DistinctBy<T, TKey>(this IEnumerable<T> items, Func<T, TKey> property)
{
    return items.GroupBy(property).Select(x => x.First());
}

Used as:

var uniqueCars = cars.DistinctBy(x => x.CarCode);

@Savage 2018-10-06 10:16:19

Perfect. This same method is also provided on the Microsoft.Ajax.Utilities library.

@JwJosefy 2012-09-27 18:09:10

Another way to accomplish the same thing...

List<Car> distinctBy = cars
    .Select(car => car.CarCode)
    .Distinct()
    .Select(code => cars.First(car => car.CarCode == code))
    .ToList();

It's possible to create an extension method to do this in a more generic way. It would be interesting if someone could evaluate the performance of this 'DistinctBy' against the GroupBy approach.

@Guffa 2013-12-24 11:21:16

The second Select would be an O(n*m) operation, so that won't scale well. It could perform better if there are a lot of duplicates, i.e. if the result of the first Select is a very small part of the original collection.

@Anthony Pegram 2010-03-29 12:52:51

You can implement an IEqualityComparer and use that in your Distinct extension.

class CarEqualityComparer : IEqualityComparer<Car>
{
    #region IEqualityComparer<Car> Members

    public bool Equals(Car x, Car y)
    {
        return x.CarCode.Equals(y.CarCode);
    }

    public int GetHashCode(Car obj)
    {
        return obj.CarCode.GetHashCode();
    }

    #endregion
}

And then

var uniqueCars = cars.Distinct(new CarEqualityComparer());

@Parsa 2017-02-18 11:39:05

How can we use this without writing new CarEqualityComparer()?

@user2864740 2018-05-11 22:41:15

@Parsa You can create an IEqualityComparer wrapper type that accepts lambdas. This would make it generalized: cars.Distinct(new GenericEqualityComparer<Car>((a,b) => a.CarCode == b.CarCode, x => x.CarCode.GetHashCode())). I've used such in the past as it sometimes adds value when performing a one-off Distinct.
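A lambda-accepting wrapper of the kind described in this comment might be sketched as follows. GenericEqualityComparer<T> is not a framework type; the class below, the Car class, and the sample values are all assumptions written only to show the idea.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper that builds an IEqualityComparer<T> from two delegates.
public sealed class GenericEqualityComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, T, bool> _equals;
    private readonly Func<T, int> _getHashCode;

    public GenericEqualityComparer(Func<T, T, bool> equals, Func<T, int> getHashCode)
    {
        _equals = equals ?? throw new ArgumentNullException(nameof(equals));
        _getHashCode = getHashCode ?? throw new ArgumentNullException(nameof(getHashCode));
    }

    public bool Equals(T x, T y) => _equals(x, y);

    public int GetHashCode(T obj) => _getHashCode(obj);
}

// Minimal stand-in for the Car class from the question.
public class Car
{
    public string CarCode { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var cars = new List<Car>
        {
            new Car { CarCode = "A1" },
            new Car { CarCode = "A1" },
            new Car { CarCode = "B2" },
        };

        // The two lambdas must agree: items that compare equal need equal hash codes.
        var comparer = new GenericEqualityComparer<Car>(
            (a, b) => a.CarCode == b.CarCode,
            x => x.CarCode.GetHashCode());

        List<Car> unique = cars.Distinct(comparer).ToList();
        Console.WriteLine(unique.Count); // prints 2
    }
}
```

Note that this sketch does not guard against null cars or null CarCodes; the hash lambda would throw on either.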

@Guffa 2010-03-29 12:44:02

You can use grouping, and get the first car from each group:

List<Car> distinct =
  cars
  .GroupBy(car => car.CarCode)
  .Select(g => g.First())
  .ToList();

@Guffa 2013-10-23 21:57:32

@NateGates: I was talking to the person that downvoted two days ago.

@Amirhossein Mehrvarzi 2013-12-23 13:47:34

I think that no Overhead exists!

@Guffa 2013-12-24 11:17:36

@AmirHosseinMehrvarzi: There is a bit of overhead, as the groups are created, and then only one item from each group is used.

@Ali Rasouli 2015-03-09 12:43:47

For more keys, write: .GroupBy(car => new { car.CarCode, car.PID, car.CID })
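Grouping on several keys works because anonymous types compare by the values of their properties. A rough sketch, assuming PID and CID are int properties on Car (their real types are not given in the thread, and the sample data is invented):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Car shape: the int types for PID and CID are assumptions.
public class Car
{
    public string CarCode { get; set; }
    public int PID { get; set; }
    public int CID { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var cars = new List<Car>
        {
            new Car { CarCode = "A1", PID = 1, CID = 10 },
            new Car { CarCode = "A1", PID = 1, CID = 10 }, // same on all three keys
            new Car { CarCode = "A1", PID = 2, CID = 10 }, // differs in PID only
        };

        // Two cars land in the same group only if all three key values match.
        List<Car> distinct = cars
            .GroupBy(car => new { car.CarCode, car.PID, car.CID })
            .Select(g => g.First())
            .ToList();

        Console.WriteLine(distinct.Count); // prints 2
    }
}
```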

@Nani 2018-09-14 09:29:47

instead of First() we should use FirstOrDefault() in this case, since First() can only be used as a final query operation

@Maximilian Ast 2018-09-16 19:07:31

@Nani Generally speaking you're right, but since a group is only created if there is a matching element in the collection, there will be at least one element per group, so First() is totally okay in this use case.
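The point about groups never being empty can be checked with a small sketch. The Car class here is a minimal stand-in; the example runs the GroupBy/First() query against an empty list, where no groups are produced and First() is therefore never called.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal stand-in for the Car class from the question.
public class Car
{
    public string CarCode { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var noCars = new List<Car>();

        // An empty source yields zero groups, so g.First() never runs and nothing throws.
        List<Car> distinct = noCars
            .GroupBy(car => car.CarCode)
            .Select(g => g.First())
            .ToList();

        Console.WriteLine(distinct.Count); // prints 0
    }
}
```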
