By Loren C Fortner


2011-11-07 13:24:57 8 Comments

I want to read a text file line by line, and I'd like to know whether I'm doing it as efficiently as possible in .NET/C#.

This is what I'm trying so far:

using (var filestream = new System.IO.FileStream(textFilePath,
                                                 System.IO.FileMode.Open,
                                                 System.IO.FileAccess.Read,
                                                 System.IO.FileShare.ReadWrite))
using (var file = new System.IO.StreamReader(filestream, System.Text.Encoding.UTF8, true, 128))
{
    string lineOfText;
    while ((lineOfText = file.ReadLine()) != null)
    {
        //Do something with the lineOfText
    }
}

8 comments

@Saeed Amiri 2011-11-07 13:29:39

If the file is not big, it is faster to read the whole file and then split the string:

var lines = sr.ReadToEnd().Split(new[] { Environment.NewLine },
                                 StringSplitOptions.RemoveEmptyEntries);

@jgauffin 2011-11-07 13:33:56

File.ReadAllLines()

@Saeed Amiri 2011-11-07 13:37:54

@jgauffin I don't know the implementation behind File.ReadAllLines(), but I think it has a limited buffer, whereas the ReadToEnd buffer should be larger. That way the number of accesses to the file decreases, and when the file is not big, doing a string.Split is faster than accessing the file multiple times.

@jgauffin 2011-11-07 14:06:03

I doubt that File.ReadAllLines has a fixed buffer size, since the file size is known.

@Martin Liversage 2011-11-07 14:26:21

@jgauffin: In .NET 4.0 File.ReadAllLines creates a list and adds to this list in a loop using StreamReader.ReadLine (with potential reallocation of the underlying array). This method uses a default buffer size of 1024. The StreamReader.ReadToEnd avoids the line parsing part and the buffer size can be set in the constructor if desired.

@user2671536 2013-08-11 02:11:22

Use the following code:

foreach (string line in File.ReadAllLines(fileName))

This was a HUGE difference in reading performance.

It comes at the cost of memory consumption, but totally worth it!

@Martin Liversage 2011-11-07 15:41:29

To find the fastest way to read a file line by line you will have to do some benchmarking. I have done some small tests on my computer but you cannot expect that my results apply to your environment.

Using StreamReader.ReadLine

This is basically your method. For some reason you set the buffer size to the smallest possible value (128). Increasing this will in general increase performance. The default size is 1,024 and other good choices are 512 (the sector size in Windows) or 4,096 (the cluster size in NTFS). You will have to run a benchmark to determine an optimal buffer size. A bigger buffer is - if not faster - at least not slower than a smaller buffer.

const Int32 BufferSize = 128;
using (var fileStream = File.OpenRead(fileName))
using (var streamReader = new StreamReader(fileStream, Encoding.UTF8, true, BufferSize)) {
  String line;
  while ((line = streamReader.ReadLine()) != null) {
    // Process line
  }
}

The FileStream constructor allows you to specify FileOptions. For example, if you are reading a large file sequentially from beginning to end, you may benefit from FileOptions.SequentialScan. Again, benchmarking is the best thing you can do.
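As a rough sketch of what passing FileOptions looks like (the class name, method name and the 4,096 default are only illustrative assumptions, not a measured optimum), you could open the stream like this:

```csharp
using System.IO;
using System.Text;

static class SequentialReader
{
    // Hypothetical helper: counts the lines of a file read front to back.
    public static int CountLines(string path, int bufferSize = 4096)
    {
        // FileOptions.SequentialScan is a hint that the file is read
        // sequentially from beginning to end.
        using (var fileStream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                               FileShare.ReadWrite, bufferSize,
                                               FileOptions.SequentialScan))
        using (var streamReader = new StreamReader(fileStream, Encoding.UTF8, true, bufferSize))
        {
            int count = 0;
            while (streamReader.ReadLine() != null)
            {
                count += 1;
            }
            return count;
        }
    }
}
```

Whether the hint helps depends on the file size and the storage, so it is worth timing it with and without the option.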

Using File.ReadLines

This is very much like your own solution except that it is implemented using a StreamReader with a fixed buffer size of 1,024. On my computer this results in slightly better performance compared to your code with the buffer size of 128. However, you can get the same performance increase by using a larger buffer size. This method is implemented using an iterator block and does not consume memory for all lines.

var lines = File.ReadLines(fileName);
foreach (var line in lines) {
  // Process line
}

Using File.ReadAllLines

This is very much like the previous method except that this method grows a list of strings used to create the returned array of lines so the memory requirements are higher. However, it returns String[] and not an IEnumerable<String> allowing you to randomly access the lines.

var lines = File.ReadAllLines(fileName);
for (var i = 0; i < lines.Length; i += 1) {
  var line = lines[i];
  // Process line
}

Using String.Split

This method is considerably slower, at least on big files (tested on a 511 KB file), probably due to how String.Split is implemented. It also allocates an array for all the lines increasing the memory required compared to your solution.

using (var streamReader = File.OpenText(fileName)) {
  var lines = streamReader.ReadToEnd().Split("\r\n".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
  foreach (var line in lines) {
    // Process line
  }
}

My suggestion is to use File.ReadLines because it is clean and efficient. If you require special sharing options (for example you use FileShare.ReadWrite), you can use your own code but you should increase the buffer size.

@Richard K. 2013-01-13 00:30:16

Thanks for this - your inclusion of the buffer size parameter on the StreamReader's constructor was really helpful. I'm streaming from Amazon's S3 API, and using a matching buffer size speeds things up considerably in conjunction with ReadLine().

@h9uest 2015-01-21 15:55:43

I don't understand. In theory, the vast majority of the time spent reading the file would be the seek time on disk and the overhead of manipulating streams, like what you'd do with File.ReadLines. File.ReadAllLines, on the other hand, is supposed to read the entire file into memory in one go. How could it be worse in performance?

@bkqc 2016-09-16 13:39:15

I can't speak to speed, but one thing is certain: it is far worse on memory consumption. If you have to handle very large files (several GB, for instance), this is critical, even more so if it forces memory to be swapped. On the speed side, you could add that ReadAllLines needs to read ALL lines BEFORE returning the result, which delays processing. In some scenarios, the IMPRESSION of speed is more important than raw speed.

@Kim Lage 2018-11-29 16:37:40

If you read the stream as byte arrays, the file is read 20% to 80% faster (in the tests I did). What you need to do is get the byte array and convert it to a string. This is how I did it: for reading, use stream.Read(); you can put it in a loop to read in chunks. After appending the whole content into a byte array (use System.Buffer.BlockCopy), convert the bytes into a string: Encoding.Default.GetString(byteContent,0,byteContent.Length - 1).Split(new string[] { "\r\n", "\r", "\n" }, StringSplitOptions.None);
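A minimal sketch of that chunked byte approach might look like the following. The class name, method name and 64 KB chunk size are assumptions for illustration. It also assumes the file is UTF-8 (rather than Encoding.Default), fits in a single array and does not change while it is being read. It decodes every byte that was read.

```csharp
using System;
using System.IO;
using System.Text;

static class ByteChunkReader
{
    // Hypothetical helper: reads the file as raw bytes in chunks, then decodes and splits once.
    public static string[] ReadLinesFromBytes(string path, int chunkSize = 64 * 1024)
    {
        using (var stream = File.OpenRead(path))
        {
            var content = new byte[stream.Length]; // whole file ends up in memory
            var chunk = new byte[chunkSize];
            int total = 0;
            int read;
            while (total < content.Length &&
                   (read = stream.Read(chunk, 0, Math.Min(chunk.Length, content.Length - total))) > 0)
            {
                // Append the chunk that was just read to the accumulated content.
                Buffer.BlockCopy(chunk, 0, content, total, read);
                total += read;
            }

            // Note: GetString does not strip a byte order mark if the file has one.
            return Encoding.UTF8.GetString(content, 0, total)
                           .Split(new[] { "\r\n", "\r", "\n" }, StringSplitOptions.None);
        }
    }
}
```

With StringSplitOptions.None, a file that ends with a newline produces a trailing empty entry, which you may want to skip.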

@Free Coder 24 2014-07-23 13:12:44

While File.ReadAllLines() is one of the simplest ways to read a file, it is also one of the slowest.

If you're just wanting to read lines in a file without doing much, according to these benchmarks, the fastest way to read a file is the age old method of:

using (StreamReader sr = File.OpenText(fileName))
{
        string s = String.Empty;
        while ((s = sr.ReadLine()) != null)
        {
               //do minimal amount of work here
        }
}

However, if you have to do a lot with each line, then this article concludes that the best way is the following (and it's faster to pre-allocate a string[] if you know how many lines you're going to read):

string[] AllLines = new string[MAX]; //only allocate memory here; MAX must be >= the number of lines
int lineCount = 0;

using (StreamReader sr = File.OpenText(fileName))
{
        while (!sr.EndOfStream)
        {
               AllLines[lineCount] = sr.ReadLine();
               lineCount += 1;
        }
} //Finished. Close the file

//Now parallel process each line that was actually read
Parallel.For(0, lineCount, x =>
{
    DoYourStuff(AllLines[x]); //do your work here
});

@Marcel James 2013-08-12 14:04:17

There's a good discussion of this in the Stack Overflow question "Is 'yield return' slower than "old school" return?".

It says:

ReadAllLines loads all of the lines into memory and returns a string[]. All well and good if the file is small. If the file is larger than will fit in memory, you'll run out of memory.

ReadLines, on the other hand, uses yield return to return one line at a time. With it, you can read any size file. It doesn't load the whole file into memory.

Say you wanted to find the first line that contains the word "foo", and then exit. Using ReadAllLines, you'd have to read the entire file into memory, even if "foo" occurs on the first line. With ReadLines, you only read one line. Which one would be faster?
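To make the "foo" scenario concrete, here is a small sketch using File.ReadLines with LINQ. The class and method names are made up for the example.

```csharp
using System.IO;
using System.Linq;

static class FirstMatch
{
    // Hypothetical helper: returns the first line containing the word, or null if none does.
    public static string FindFirstLineContaining(string path, string word)
    {
        // File.ReadLines is lazy, so enumeration (and reading) stops at the first match.
        return File.ReadLines(path)
                   .FirstOrDefault(line => line.Contains(word));
    }
}
```

Calling FirstMatch.FindFirstLineContaining(fileName, "foo") stops enumerating lines as soon as one matches. The same call on File.ReadAllLines would build the whole array before the search even starts.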

@Jon Skeet 2011-11-07 13:26:40

If you're using .NET 4, simply use File.ReadLines which does it all for you. I suspect it's much the same as yours, except it may also use FileOptions.SequentialScan and a larger buffer (128 seems very small).

@stt106 2017-04-20 10:02:40

Another benefit of ReadLines() is that it's lazy, so it works well with LINQ.

@jgauffin 2011-11-07 13:30:45

You can't get any faster if you want to use an existing API to read the lines. But reading larger chunks and manually finding each newline in the read buffer would probably be faster.
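A rough sketch of that idea, reading large blocks of characters and scanning them for newlines by hand, could look like this. The names and the 64K chunk size are assumptions, it only treats "\n" and "\r\n" as line endings, and I have not benchmarked it, so measure before assuming it is faster.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

static class ChunkedLineReader
{
    // Hypothetical helper: reads big char chunks and splits them on '\n' manually.
    public static IEnumerable<string> ReadLines(string path, int chunkChars = 64 * 1024)
    {
        using (var reader = new StreamReader(path, Encoding.UTF8, true, chunkChars))
        {
            var buffer = new char[chunkChars];
            var pending = new StringBuilder(); // partial line carried across chunks
            int read;
            while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
            {
                int start = 0;
                for (int i = 0; i < read; i++)
                {
                    if (buffer[i] != '\n') continue;

                    pending.Append(buffer, start, i - start);
                    // Drop the '\r' of a "\r\n" pair, even if it arrived in the previous chunk.
                    if (pending.Length > 0 && pending[pending.Length - 1] == '\r')
                        pending.Length -= 1;

                    yield return pending.ToString();
                    pending.Clear();
                    start = i + 1;
                }
                // Keep whatever follows the last newline for the next chunk.
                pending.Append(buffer, start, read - start);
            }
            // Last line without a trailing newline.
            if (pending.Length > 0)
                yield return pending.ToString();
        }
    }
}
```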

@Kibbee 2011-11-07 13:28:59

If you have enough memory, I've found some performance gains by reading the entire file into a memory stream, and then opening a stream reader on that to read the lines. As long as you actually plan on reading the whole file anyway, this can yield some improvements.
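Something along these lines, as a sketch only. The class and method names are hypothetical, and it assumes UTF-8 text and a file that fits comfortably in memory.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

static class InMemoryReader
{
    // Hypothetical helper: loads the file into memory once, then reads lines from a MemoryStream.
    public static List<string> ReadLinesViaMemoryStream(string path)
    {
        // One bulk read from disk; the line splitting below only touches memory.
        byte[] bytes = File.ReadAllBytes(path);
        var lines = new List<string>();
        using (var memoryStream = new MemoryStream(bytes, false))
        using (var reader = new StreamReader(memoryStream, Encoding.UTF8))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lines.Add(line);
            }
        }
        return lines;
    }
}
```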

@jgauffin 2011-11-07 13:33:19

File.ReadAllLines seems to be a better choice then.
