
Over the years of writing C# code, I’ve seen many developers who use IEnumerable<T> as a method parameter without realizing the potential for multiple enumeration. This issue can arise when the caller of the method is not aware that the code inside the method will enumerate the collection multiple times. This can lead to slow and inefficient code, especially for large collections.

IEnumerable<T> is a powerful interface in C# that allows for a variety of collection types to be used as a method parameter. However, when working with this interface, it is important to be mindful of the potential for multiple enumeration. In this blog post, we will explore what multiple enumeration is and how to avoid it when using IEnumerable<T> in your code.

What is multiple enumeration?

Multiple enumeration occurs when a sequence exposed as IEnumerable<T> is enumerated more than once. This typically happens when a lazily evaluated query is passed as an IEnumerable<T> to a method that iterates over it several times. The code inside any LINQ Select or Where lambdas is then executed again on every enumeration of the sequence.

An example of multiple enumeration

Consider the following code:

private static void DoSomething(IEnumerable<int> values)
{
  Console.WriteLine(values.Count());
  Console.WriteLine(values.Sum());
}

private static void Main()
{
  var list = Enumerable.Range(1, 10).ToList();
  var result = list.Where(x => x % 2 == 0)
                   .Select(x => x * 2);

  DoSomething(result);
}

In this example, Where and Select are applied to the list and the resulting lazy query is passed to DoSomething. Inside DoSomething, Count and Sum each enumerate the entire query, so the Where and Select lambdas run twice: once per pass. For a ten-element list this is harmless, but with large collections or expensive lambdas the repeated work can make the code noticeably slower.
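To make the repeated work visible, here is a minimal sketch that counts how often the Where lambda is invoked when the same query is consumed by Count and then Sum. The class name MultipleEnumerationDemo and the helper CountPredicateCalls are made up for this illustration and are not part of the original example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MultipleEnumerationDemo
{
    // Hypothetical helper: builds the same pipeline as the example above and
    // returns how many times the Where lambda ran after Count() and Sum().
    public static int CountPredicateCalls()
    {
        var calls = 0;
        var list = Enumerable.Range(1, 10).ToList();

        // Lazy query: nothing is executed at this point.
        IEnumerable<int> result = list
            .Where(x => { calls++; return x % 2 == 0; })
            .Select(x => x * 2);

        _ = result.Count(); // first full pass over the ten source items
        _ = result.Sum();   // second full pass over the ten source items

        return calls;
    }

    public static void Main()
    {
        Console.WriteLine(CountPredicateCalls()); // prints 20
    }
}
```

The list has ten items, yet the predicate runs twenty times, because each of the two consuming calls walks the whole source again.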

How to avoid multiple enumeration

There are two simple ways to avoid multiple enumeration when working with IEnumerable<T>:

  1. Materialize the query before passing it. In the previous example, call ToList on the query so that DoSomething receives a concrete list instead of the lazy IEnumerable<T> result.
private static void Main()
{
  var list = Enumerable.Range(1, 10).ToList();
  var result = list.Where(x => x % 2 == 0)
                   .Select(x => x * 2)
                   .ToList();

  DoSomething(result);
}
  2. Use a materialized collection type as the method parameter, such as IList<T>. This forces the caller to pass an already enumerated result rather than a lazily evaluated IEnumerable<T>. This approach is useful when you know that the method will enumerate the values more than once, or when you are writing a class library and cannot control what callers pass in.
private static void DoSomething(IList<int> values)
{
  Console.WriteLine(values.Count());
  Console.WriteLine(values.Sum());
}
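If changing the parameter type is not an option, one further possibility (not shown in the examples above, so treat it as an assumption on my part) is to keep the IEnumerable<int> parameter and buffer the sequence once inside the method. The sketch below uses a hypothetical Summarize method in place of DoSomething and returns the values instead of printing them so the result is easy to check.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MaterializeOnceDemo
{
    // Hypothetical variant of DoSomething: accepts any IEnumerable<int>
    // but guarantees the incoming sequence is enumerated at most once.
    public static (int Count, int Sum) Summarize(IEnumerable<int> values)
    {
        // Reuse the caller's collection if it is already materialized;
        // otherwise copy the lazy sequence into a list exactly once.
        IReadOnlyCollection<int> buffered =
            values as IReadOnlyCollection<int> ?? values.ToList();

        return (buffered.Count, buffered.Sum());
    }

    public static void Main()
    {
        var lazy = Enumerable.Range(1, 10)
                             .Where(x => x % 2 == 0)
                             .Select(x => x * 2);

        var (count, sum) = Summarize(lazy);
        Console.WriteLine($"{count} {sum}"); // prints "5 60"
    }
}
```

The trade-off is an extra copy when the caller passes a lazy query, in exchange for a signature that stays as flexible as before.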

Understanding lazy evaluation of IEnumerable<T>

When using IEnumerable<T> in conjunction with LINQ, it’s important to understand that LINQ queries are not executed immediately. Instead, they are evaluated lazily, meaning that they are only executed when the data is actually needed. This can lead to some unintended consequences when using IEnumerable<T> as a method parameter.

For example, consider the following code:

IEnumerable<int> FilterAndSquare(IEnumerable<int> numbers)
{
    return numbers.Where(n => n % 2 == 0).Select(n => n * n);
}

var numbers = new List<int> { 1, 2, 3, 4, 5 };
var result = FilterAndSquare(numbers);

foreach (var item in result)
{
    Console.WriteLine(item);
}

In this code, the FilterAndSquare method takes an IEnumerable<int> as a parameter and applies the Where and Select LINQ operators to it. The calling code then enumerates the result and writes each item to the console.

However, because the LINQ queries are evaluated lazily, the Where and Select queries are not executed until the result is actually enumerated. This means that if the caller of the FilterAndSquare method enumerates the result multiple times, the LINQ queries will be executed multiple times as well.

This can lead to unexpected performance issues, especially for large collections. It also makes the code harder to understand and maintain, as the reader may not be aware that the queries are being executed multiple times.
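The following sketch shows this behaviour with a counter. It reuses the shape of FilterAndSquare and adds a static PredicateCalls field and a Run helper, both of which exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyEvaluationDemo
{
    // Counts how often the Where lambda runs; illustration only.
    public static int PredicateCalls;

    // Same pipeline as FilterAndSquare, with the counter added.
    public static IEnumerable<int> FilterAndSquare(IEnumerable<int> numbers)
    {
        return numbers
            .Where(n => { PredicateCalls++; return n % 2 == 0; })
            .Select(n => n * n);
    }

    // Returns the counter after the call, after one pass, and after two passes.
    public static (int AfterCall, int AfterFirstPass, int AfterSecondPass) Run()
    {
        PredicateCalls = 0;
        var numbers = new List<int> { 1, 2, 3, 4, 5 };

        var result = FilterAndSquare(numbers);
        var afterCall = PredicateCalls;       // query is built, nothing has run

        foreach (var item in result) { }      // first enumeration
        var afterFirstPass = PredicateCalls;

        foreach (var item in result) { }      // second enumeration
        var afterSecondPass = PredicateCalls;

        return (afterCall, afterFirstPass, afterSecondPass);
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints (0, 5, 10)
    }
}
```

Calling the method costs nothing by itself. The work happens on each enumeration, and it happens again in full every time.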

To avoid these issues, it’s important to be mindful of the lazy evaluation behavior of LINQ and to use alternative methods, such as eager evaluation, when necessary. For example, the following code uses the ToList method to force eager evaluation and avoid multiple enumeration:

IEnumerable<int> FilterAndSquare(IEnumerable<int> numbers)
{
    return numbers.Where(n => n % 2 == 0).Select(n => n * n).ToList();
}

By using eager evaluation, we can be sure that the LINQ queries are executed only once, making the code more efficient and easier to understand.
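As a rough check, the sketch below adds a counter to the eager version and consumes the result twice. PredicateCalls and Run are hypothetical names used only here. The sketch also shows one consequence worth keeping in mind: the returned list is a snapshot, so later changes to the source are not reflected in it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EagerEvaluationDemo
{
    // Counts how often the Where lambda runs; illustration only.
    public static int PredicateCalls;

    // Eager variant: ToList runs the pipeline once, inside the method.
    public static IEnumerable<int> FilterAndSquare(IEnumerable<int> numbers)
    {
        return numbers
            .Where(n => { PredicateCalls++; return n % 2 == 0; })
            .Select(n => n * n)
            .ToList();
    }

    // Returns the counter after two passes and the item count after the source changed.
    public static (int CallsAfterTwoPasses, int ItemsAfterSourceChange) Run()
    {
        PredicateCalls = 0;
        var numbers = new List<int> { 1, 2, 3, 4, 5 };

        var result = FilterAndSquare(numbers); // the lambdas run here, once

        _ = result.Sum(); // iterates the stored list, not the query
        _ = result.Sum(); // same again, no lambda is invoked

        numbers.Add(6);   // the snapshot taken by ToList does not see this

        return (PredicateCalls, result.Count());
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints (5, 2)
    }
}
```

The predicate runs five times in total no matter how often the result is read, and the result still holds the two squares computed at call time.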

In conclusion, when using IEnumerable<T> with LINQ, it is important to understand that queries are evaluated lazily. Where a sequence will be enumerated more than once, use eager evaluation or pass a concrete collection type to avoid multiple enumeration and to keep the code efficient and maintainable.
