By StriplingWarrior


2012-01-17 17:21:07 8 Comments

When using lambda expressions or anonymous methods in C#, we have to be wary of the access to modified closure pitfall. For example:

foreach (var s in strings)
{
   query = query.Where(i => i.Prop == s); // access to modified closure
   ...
}

Due to the modified closure, the above code will cause all of the Where clauses on the query to be based on the final value of s.

As explained here, this happens because the s variable declared in foreach loop above is translated like this in the compiler:

string s;
while (enumerator.MoveNext())
{
   s = enumerator.Current;
   ...
}

instead of like this:

while (enumerator.MoveNext())
{
   string s;
   s = enumerator.Current;
   ...
}

As pointed out here, there are no performance advantages to declaring a variable outside the loop, and under normal circumstances the only reason I can think of for doing this is if you plan to use the variable outside the scope of the loop:

string s;
while (enumerator.MoveNext())
{
   s = enumerator.Current;
   ...
}
var finalString = s;

However variables defined in a foreach loop cannot be used outside the loop:

foreach(string s in strings)
{
}
var finalString = s; // won't work: you're outside the scope.

So the compiler declares the variable in a way that makes it highly prone to an error that is often difficult to find and debug, while producing no perceivable benefits.

Is there something you can do with foreach loops this way that you couldn't if they were compiled with an inner-scoped variable, or is this just an arbitrary choice that was made before anonymous methods and lambda expressions were available or common, and which hasn't been revised since then?

5 comments

@TemaTre 2018-06-15 07:31:34

In my opinion it's strange question. It's good to know how compiler works but that only "good to know".

If you write code that depends on algorithm of compiler it's bad practice.And it's better to rewrite code to exclude this dependency.

This is good question for job interview. But in real life I'v not faced with any problems that I solved on job interview.

The 90% of foreach uses for process each element of collection (not for select or calculate some values). sometimes you need to compute some values inside the loop but it's not good practice to create BIG loop.

It's better to use LINQ expressions for computing the values. Because when you calculate a lot of thing inside the loop, after 2-3 month when you (or anybody else) will read this code person will not understand what is this and how it should works.

@StriplingWarrior 2018-06-15 15:24:34

"If you write code that depends on algorithm of compiler it's bad practice." This isn't a question about compiler optimization. It's about the actual behavior of the language, which is something you can't get away from depending on. I'll admit that knowing why a given language behavior was made is more of an academic exercise, but it's clearly a legitimate question, considering the C# team decided to make a breaking change to the language in order to correct it.

@StriplingWarrior 2018-06-15 15:28:50

"It's better to use LINQ expressions for computing the values." You'll note that I am using the loop to build a LINQ expression. Yes, using Aggregate would have avoided this problem in the first place, but it's extremely common for people to use loops like this. I see questions arising from this sort of problem in JavaScript all the time on StackOverflow, and they were similarly common in C# before the team decided to change this language behavior.

@Godeke 2012-01-17 17:47:33

Having been bitten by this, I have a habit of including locally defined variables in the innermost scope which I use to transfer to any closure. In your example:

foreach (var s in strings)
{
    query = query.Where(i => i.Prop == s); // access to modified closure

I do:

foreach (var s in strings)
{
    string search = s;
    query = query.Where(i => i.Prop == search); // New definition ensures unique per iteration.

Once you have that habit, you can avoid it in the very rare case you actually intended to bind to the outer scopes. To be honest, I don't think I have ever done so.

@StriplingWarrior 2012-01-17 17:53:59

That is the typical workaround Thanks for the contribution. Resharper is smart enough to recognize this pattern and bring it to your attention too, which is nice. I haven't been bit by this pattern in a while, but since it is, in Eric Lippert's words, "the single most common incorrect bug report we get," I was curious to know the why more than the how to avoid it.

@Krizz 2012-01-17 17:39:41

What you are asking is thoroughly covered by Eric Lippert in his blog post Closing over the loop variable considered harmful and its sequel.

For me, the most convincing argument is that having new variable in each iteration would be inconsistent with for(;;) style loop. Would you expect to have a new int i in each iteration of for (int i = 0; i < 10; i++)?

The most common problem with this behavior is making a closure over iteration variable and it has an easy workaround:

foreach (var s in strings)
{
    var s_for_closure = s;
    query = query.Where(i => i.Prop == s_for_closure); // access to modified closure

My blog post about this issue: Closure over foreach variable in C#.

@Random832 2012-01-17 17:44:52

Ultimately, what people actually want when they write this isn't to have multiple variables, it's to close over the value. And it's hard to think of a usable syntax for that in the general case.

@Krizz 2012-01-17 17:50:29

Yes, it is not possible to close by the value, however there is a very easy workaround I have just edited my answer to include.

@Sean U 2012-01-17 17:53:32

It's too bad closures in C# close over references. If they closed over values by default, we could easily specify closing over variables instead with ref.

@Andy 2012-01-17 19:41:53

@Krizz, this is a case where the forced consistency is more harmful than being inconsistent. It should "just work" as people expect, and clearly people expect something different when using foreach as opposed to a for loop, given the number of people that have hit problems before we knew about the access to modified closure issue (such as myself).

@Krizz 2012-01-17 19:48:38

That's probably the reason why C# team at Microsoft decides to do this breaking change as declared by Eric in his answer.

@Will Ness 2012-01-18 18:09:49

@Random832 don't know about C# but in Common LISP there is a syntax for that, and it figures that any language with mutable variables and closures would (nay, must) have it too. We either close over reference to the changing place, or a value it has at a given moment in time (creation of a closure). This discusses similar stuff in Python and Scheme (cut for refs/vars and cute for keeping evaluated values in partially-evaluated closures).

@Jim Balter 2013-03-11 01:34:51

"Would you expect to have a new int i in each iteration of for (int i = 0; i < 10; i++)?" -- In a well-designed language, I would definitely expect a closure within a loop to capture the current iteration's value of any variable with loop scope. Changing the semantics of foreach but not of for is an even bigger botch than scoping the foreach variable outside the loop.

@N30 2013-10-24 18:44:52

@Krizz link to your blog article is not working.

@ErikE 2017-11-20 20:46:25

@SeanU No, it's not too bad closures in C# close over references. If it closed over values, you'd eliminate a significant portion of the value of closures in the first place! How would you update a variable outside the lambda otherwise?

@Sнаđошƒаӽ 2018-12-14 06:04:14

Hey, I just made you 9999. So I deserve a little something? :D :D Never mind, just kidding.

@Luaan 2019-07-26 07:51:51

@ErikE Read again. Sean suggested closures would be over values by default, and over references with an extra keyword.

@Eric Lippert 2012-01-17 17:56:25

The compiler declares the variable in a way that makes it highly prone to an error that is often difficult to find and debug, while producing no perceivable benefits.

Your criticism is entirely justified.

I discuss this problem in detail here:

Closing over the loop variable considered harmful

Is there something you can do with foreach loops this way that you couldn't if they were compiled with an inner-scoped variable? or is this just an arbitrary choice that was made before anonymous methods and lambda expressions were available or common, and which hasn't been revised since then?

The latter. The C# 1.0 specification actually did not say whether the loop variable was inside or outside the loop body, as it made no observable difference. When closure semantics were introduced in C# 2.0, the choice was made to put the loop variable outside the loop, consistent with the "for" loop.

I think it is fair to say that all regret that decision. This is one of the worst "gotchas" in C#, and we are going to take the breaking change to fix it. In C# 5 the foreach loop variable will be logically inside the body of the loop, and therefore closures will get a fresh copy every time.

The for loop will not be changed, and the change will not be "back ported" to previous versions of C#. You should therefore continue to be careful when using this idiom.

@Random832 2012-01-17 18:38:59

So, no chance of a syntax to close on values? (Of course, that has a "no-one would use it" issue, because the syntax to close on variables is more natural and makes no difference 90% of the time)

@Eric Lippert 2012-01-17 19:13:14

@Random832: Unlikely. We are, however, considering for Roslyn adding a static analyzer that determines if the closed-over variable is ever written to after the construction of the closure; if it is not, then we could close over the value rather than the variable.

@Marc Gravell 2012-01-17 19:13:48

Actually, there is an indirect reference in the 1.x spec; if you look at the definite assignment rules, it gives an example of the compiler interpretation, and IIRC it is declared inside the loop. This is indirect and circumstantial, though. Not an explicit spec statement.

@Eric Lippert 2012-01-17 19:21:55

We did in fact push back on this change in C# 3 and C# 4. When we designed C# 3 we did realize that the problem (which already existed in C# 2) was going to get worse because there would be so many lambdas (and query comprehensions, which are lambdas in disguise) in foreach loops thanks to LINQ. I regret that we waited for the problem to get sufficiently bad to warrant fixing it so late, rather than fixing it in C# 3.

@leppie 2012-01-18 05:58:43

And now we will have to remember foreach is 'safe' but for is not.

@Michiel van Oosterhout 2012-01-18 07:56:42

I wouldn't call it a breaking change, but rather an enabling change: code that was once subtly wrong will now start to behave as expected. Of course, some may have depended on the old behavior...

@leppie 2012-01-18 09:00:07

@michielvoo: The change is breaking in the sense that it is not backward compatible. New code will not run correctly on when using an older compiler.

@CodesInChaos 2012-01-18 10:13:16

I think I would have changed it to a compiler error. Subtly changing the semantics looks too dangerous to me. I wonder how much code there is that depends on the old behavior.

@Dan Neely 2012-01-18 13:39:41

@usr same here; but without a tool to say safe/unsafe I opt for almost always using an explicit temp. There are too many ways to create bugs to add a subtle one into the mix.

@Adam Rackis 2012-01-18 22:20:16

@Matthew - the whole point of closures is that they capture the variable, and not the value. There's tremendous power in this, but you can get tripped up if you're not careful. The problem here is that each lambda in OPs question is closing over a variable s that was declared outside the loop. Therefore, each lambda is closing over the same s. Eric is saying that in C# 5, s will be declared inside the loop, so each lambda will be closing over a fresh, new variable

@Benjol 2012-01-19 07:47:53

@EricLippert, just out of curiosity, did you guys manage to come up with any not-completely-wacky scenarios which were likely to get broken by this 'breaking' change?

@Eric Lippert 2012-01-19 14:20:07

@Benjol: No, which is why we're willing to take it. Jon Skeet pointed out to me an important breaking change scenario though, which is that someone writes code in C# 5, tests it, and then shares it with people who are still using C# 4, who then naively believe it to be correct. Hopefully the number of people affected by such a scenario is small.

@Eric Lippert 2012-01-19 14:22:24

@CodeInChaos: We can't make it into a compiler error to write a lambda that is closed over the loop variable because then you can't use LINQ inside a foreach loop! The cure would be far worse than the disease.

@Michael Stum 2012-01-19 16:24:49

@EricLippert I assume this only affects code, not the generated assembly? In that case, it should indeed be small.

@Mike Chamberlain 2012-03-21 22:57:13

As an aside, ReSharper has always caught this and reports it as "access to modified closure". Then, by hitting Alt+Enter, will it even automatically fix your code for you. jetbrains.com/resharper

@Om Deshmane 2012-04-05 04:36:34

@EricLippert, I think basic problem is instance of lambda was created outside loop even when the closed-over variable was modified inside the loop. Consider a case where I declare a variable outside loop (not a loop variable) and then use it in lambda inside the loop (there's no connection to loop variable) For ex : string not_loop_var; foreach (var item in strings) { not_loop_var = item; actions.Add(() => Console.WriteLine(not_loop_var)); }

@Om Deshmane 2012-04-05 04:53:41

@Random832 close on values vs variable: (an opinion) I think it is more natural if closures followed same symantics as function calls (pass by value/ref). For example, if I declare public static Action GetLambda(int val) { return () => Console.WriteLine(val); } then GetLambda(x) should be replacable by "() => Console.WriteLine(x)" everywhere

@newacct 2012-05-13 23:00:58

I don't think that you can dismiss this behavior "harmful" or wrong. It works exactly this way in many languages, including Python and JavaScript.

@Alexandre Araujo Moreira 2012-05-20 13:08:22

@Random832 I know this is old but, if you're only interested in behaviour (not in performance) when closing over the value, I believe a simple generic identity function should do the trick. Something like public static T identity<T>(T x) { return x; } I never tried it exactly like that in C#, but I believe it'd work for your scenario.

@Random832 2012-05-20 21:01:28

No, because the call to identity(x) occurs within the closure, and it will return a different value when x changes.

@ta.speot.is 2012-06-09 07:09:49

I feel the current behaviour is understandable given that the lambdas will be evaluated after the loop has been evaluated (or at various stages through its evaluation) but its not intuitive. Thank you for changing it.

@supercat 2012-08-21 18:58:41

@mho: I find it curious that even though when designing vb.net, Microsoft seems to have recognized that parameter passing should default to value semantics, changing a long-standing tradition in their BASIC interpreters (going all the way back to QBasic), they made closures default to reference semantics. Having implicit closures with reference semantics seems a recipe for confusion. I wonder in what cases it would have been problematic to require that all variables used in a closure must have some special declaration, explicitly recognizing that their lifetime would be extended?

@Nikhil Agrawal 2012-10-19 06:22:16

Which VB.Net version will have this corrected?

@Heinzi 2016-01-25 15:46:07

@NikhilAgrawal: Visual Studio 2012 fixed this issue both for C# (C# 5.0) and VB (VB 11.0). See Visual Basic Breaking Changes in Visual Studio 2012.

@Paolo Moretti 2012-09-03 13:58:51

In C# 5.0, this problem is fixed and you can close over loop variables and get the results you expect.

The language specification says:

8.8.4 The foreach statement

(...)

A foreach statement of the form

foreach (V v in x) embedded-statement

is then expanded to:

{
  E e = ((C)(x)).GetEnumerator();
  try {
      while (e.MoveNext()) {
          V v = (V)(T)e.Current;
          embedded-statement
      }
  }
  finally {
      … // Dispose e
  }
}

(...)

The placement of v inside the while loop is important for how it is captured by any anonymous function occurring in the embedded-statement. For example:

int[] values = { 7, 9, 13 };
Action f = null;
foreach (var value in values)
{
    if (f == null) f = () => Console.WriteLine("First value: " + value);
}
f();

If v was declared outside of the while loop, it would be shared among all iterations, and its value after the for loop would be the final value, 13, which is what the invocation of f would print. Instead, because each iteration has its own variable v, the one captured by f in the first iteration will continue to hold the value 7, which is what will be printed. (Note: earlier versions of C# declared v outside of the while loop.)

@colinfang 2013-04-29 16:00:20

Why this early version of C# declared v inside of the while loop?msdn.microsoft.com/en-GB/library/aa664754.aspx

@Paolo Moretti 2013-04-29 18:35:23

@colinfang Be sure to read Eric's answer: The C# 1.0 specification (in your link we are talking about VS 2003, i.e. C# 1.2) actually did not say whether the loop variable was inside or outside the loop body, as it make no observable difference. When closure semantics were introduced in C# 2.0, the choice was made to put the loop variable outside the loop, consistent with the "for" loop.

@colinfang 2013-04-29 18:47:07

So you are saying that the examples in the link were not definitive spec at that time?

@Paolo Moretti 2013-04-29 22:01:58

@colinfang They were definitive specifications. The problem is that we are talking about a feature (i.e. function closures) that was introduced later (with C# 2.0). When C# 2.0 came up they decided to put the loop variable outside the loop. And than they changed their mind again with C# 5.0 :)

Related Questions

Sponsored Content

24 Answered Questions

[SOLVED] What is the scope of variables in JavaScript?

21 Answered Questions

[SOLVED] LINQ equivalent of foreach for IEnumerable<T>

7 Answered Questions

[SOLVED] How does PHP 'foreach' actually work?

33 Answered Questions

[SOLVED] What's the difference between using "let" and "var"?

18 Answered Questions

[SOLVED] Using global variables in a function

34 Answered Questions

[SOLVED] How do you get the index of the current iteration of a foreach loop?

  • 2008-09-04 01:38:39
  • Matt Mitchell
  • 794506 View
  • 847 Score
  • 34 Answer
  • Tags:   c# foreach

296 Answered Questions

[SOLVED] Hidden Features of C#?

  • 2008-08-12 16:32:24
  • Serhat Ozgel
  • 677320 View
  • 1475 Score
  • 296 Answer
  • Tags:   c# hidden-features

3 Answered Questions

[SOLVED] Using a foreach variable in a Laravel view

11 Answered Questions

[SOLVED] Calling remove in foreach loop in Java

2 Answered Questions

[SOLVED] C# foreach variable scope

  • 2015-06-05 09:34:49
  • Pierre-Antoine
  • 2117 View
  • 2 Score
  • 2 Answer
  • Tags:   c# .net foreach scope

Sponsored Content