for loops really "bad"? If not, in what situation(s) would they be better than using a more conventional "vectorized" approach? [1]
I am familiar with the concept of "vectorization", and how pandas employs vectorized techniques to speed up computation. Vectorized functions broadcast operations over the entire series or DataFrame to achieve speedups much greater than conventionally iterating over the data.
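To make the distinction concrete, here is a minimal sketch of the two styles side by side. The DataFrame and its column names `a` and `b` are made up for illustration; the only point is that both forms compute the same values, with the vectorized form expressed as a single operation over whole columns.

```python
import numpy as np
import pandas as pd

# Hypothetical example data; the column names are illustrative only.
df = pd.DataFrame({"a": np.arange(5), "b": np.arange(5) * 10})

# Vectorized: one expression applied to whole columns at once.
vectorized = df["a"] + df["b"]

# Loop-based: the same sums computed one pair of values at a time in Python.
looped = pd.Series([a + b for a, b in zip(df["a"], df["b"])], index=df.index)

# Both approaches produce identical values.
assert vectorized.tolist() == looped.tolist()
```

Which form is faster on a given dataset is something to measure (for example with `timeit`) rather than assume.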
However, I am quite surprised to see a lot of code (including from answers on Stack Overflow) offering solutions to problems that involve looping through data using for loops and list comprehensions. The documentation and API say that loops are "bad", and that one should "never" iterate over arrays, series, or DataFrames. So, how come I sometimes see users suggesting loop-based solutions?
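As an illustration of the kind of loop-based answer I mean, here is a rough sketch of one task solved both ways. The data and the "user@domain" format are assumptions made up for this example; `Series.str.split` and `.str[...]` are the standard pandas string accessor methods.

```python
import pandas as pd

# Hypothetical data: the values and the "user@domain" format are assumptions.
emails = pd.Series(["ann@example.com", "bob@example.org", "cy@example.net"])

# Accessor-based solution using the pandas .str API.
domains_str = emails.str.split("@").str[1]

# Loop-based solution of the sort often posted in answers.
domains_loop = pd.Series([e.split("@")[1] for e in emails], index=emails.index)

# Both solutions extract the same domains.
assert domains_str.tolist() == domains_loop.tolist()
```

Both give the same result here, which is exactly why I am asking when the loop-based version is the better choice.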
1 - While it is true that the question sounds somewhat broad, the truth is that there are very specific situations when for loops are usually better than the conventional vectorized approach. This post aims to capture this for posterity.