By dbkk


2008-08-19 07:23:46 8 Comments

Is there a free or open source library to read Excel files (.xls) directly from a C# program?

It does not need to be too fancy, just to select a worksheet and read the data as strings. So far, I've been using Export to Unicode text function of Excel, and parsing the resulting (tab-delimited) file, but I'd like to eliminate the manual step.

30 comments

@Lizzy 2012-03-06 08:49:41

I want to show a simple method to read xls/xlsx file with .NET. I hope that the following will be helpful for you.

 private DataTable ReadExcelToTable(string path)    
 {

     //Connection String

     string connstring = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + path + ";Extended Properties='Excel 8.0;HDR=NO;IMEX=1';";  
     //the same name 
     //string connstring = Provider=Microsoft.JET.OLEDB.4.0;Data Source=" + path + //";Extended Properties='Excel 8.0;HDR=NO;IMEX=1';"; 

     using(OleDbConnection conn = new OleDbConnection(connstring))
     {
        conn.Open();
        //Get All Sheets Name
        DataTable sheetsName = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables,new object[]{null,null,null,"Table"});  

        //Get the First Sheet Name
        string firstSheetName = sheetsName.Rows[0][2].ToString(); 

        //Query String 
        string sql = string.Format("SELECT * FROM [{0}]",firstSheetName); 
        OleDbDataAdapter ada =new OleDbDataAdapter(sql,connstring);
        DataSet set = new DataSet();
        ada.Fill(set);
        return set.Tables[0];   
   }
 }

Code is from article: http://www.c-sharpcorner.com/uploadfile/d2dcfc/read-excel-file-with-net/. You can get more details from it.

@martinstoeckli 2012-03-29 21:07:43

It was helpful, especially the part about reading the sheetnames.

@Balena 2011-09-15 03:33:19

Take.io Spreadsheet will do this work for you, and at no charge. Just take a look at this.

@Drewmate 2012-11-28 23:24:21

This is a really great little library. It just converts everything into Lists of Lists of strings, which is just fine for the kind of work I needed it for.

@DeeDee 2012-10-26 15:09:52

Late to the party, but I'm a fan of LinqToExcel

@Doctor Rudolf 2012-08-17 08:34:36

We use ClosedXML in rather large systems.

  • Free
  • Easy to install
  • Straight forward coding
  • Very responsive support
  • Developer team is extremly open to new suggestions. Often new features and bug fixes are implemented within the same week

@cless 2011-09-13 04:47:14

If it's just tabular data. I would recommend file data helpers by Marcos Melli which can be downloaded here.

@dbkk 2011-09-13 14:01:47

That's a great library!

@VBK 2011-07-20 02:33:05

If you have multiple tables in the same worksheet you can give each table an object name and read the table using the OleDb method as shown here: http://vbktech.wordpress.com/2011/05/10/c-net-reading-and-writing-to-multiple-tables-in-the-same-microsoft-excel-worksheet/

@Thetam 2011-04-21 10:07:48

The solution that we used, needed to:

  • Allow Reading/Writing of Excel produced files
  • Be Fast in performance (not like using COMs)
  • Be MS Office Independent (needed to be usable without clients having MS Office installed)
  • Be Free or Open Source (but actively developed)

There are several choices, but we found NPoi (.NET port of Java's long existing Poi open source project) to be the best: http://npoi.codeplex.com/

It also allows working with .doc and .ppt file formats

@Bonnie Cornell 2011-04-12 07:30:55

The .NET component Excel Reader .NET may satisfy your requirement. It's good enought for reading XLSX and XLS files. So try it from:

http://www.devtriogroup.com/ExcelReader

@JP Negri 2011-02-04 19:41:51

Excel Data Reader is the way to go!

It´s Open Source, at http://exceldatareader.codeplex.com/ and actively developed.

We been using it for reading Tabular (and sometimes not so tabular) worksheets for a couple of years now (In a financial application).

Works like a charm to read unit test data from human-readable sheets.

Just avoid the feature of trying to return DateTime's, as, for Excel, DateTime's are just double numbers.

@nemke 2011-02-06 13:26:34

There is already mention of exceldatareader here stackoverflow.com/questions/15828/reading-excel-files-from-c‌​/… .Why do you think we need another answer. You should comment the link, not to create long thread garbage

@Michał Pawłowski 2010-09-08 08:45:52

How about Excel Data Reader?

http://exceldatareader.codeplex.com/

I've used in it anger, in a production environment, to pull large amounts of data from a variety of Excel files into SQL Server Compact. It works very well and it's rather robust.

@David Keaveny 2010-10-20 05:14:09

I'll second Excel Data Reader; it has also led to the incredibly useful Excel Data Driven Tests library, which uses NUnit 2.5's TestCaseSource attribute to make data-driven tests using Excel spreadsheets ridiculously easy. Just beware that Resharper doesn't yet support TestCaseSource, so you have to use the NUnit runner.

@Ian1971 2012-10-23 13:02:00

Unfortunately, there are some issues with this library that we've just encountered. Firstly we've had some currency fields coming out as dates. Secondly it is crashing if the workbook has any empty sheets in it. So, although it was very easy to integrate we are now re-evaluating whether to keep using this library. It does not seem to be being actively developed.

@RichieHindle 2012-12-20 12:25:58

It also assumes the presence of some optional elements in xlsx file that cause it to fail to read the data if they're absent.

@Peter 2013-01-14 08:40:48

We're having problems with Excel files coming from SQL Server Reporting Services. They just don't work, unless you open them and save them (even unedited). @RichieHindle: what optional elements are you talking about (hoping this might help me with my SSRS Excel files)?

@RichieHindle 2013-01-14 09:12:54

@Peter: I think it was a missing <dimension> element in the <worksheet> that was causing trouble for me.

@Peter 2013-01-14 09:24:14

@RichieHindle: ah, I think that has been solved now. Thanks

@Ian1971 2013-01-17 12:39:02

As an update to my comment above. We did keep going with this library, and in fact I and another guy have become developers on the project and it is now actively being worked on again. The issues I mentioned have now been fixed, as has open office support and hopefully SSRS (need someone to test it).

@John R 2010-08-26 18:01:25

SpreadsheetGear is awesome. Yes it's an expense, but compared to twiddling with these other solutions, it's worth the cost. It is fast, reliable, very comprehensive, and I have to say after using this product in my fulltime software job for over a year and a half, their customer support is fantastic!

@Anonymous Type 2010-12-19 22:11:59

Hard to justify when there are so many simple and effective ways (for free) of reading from and writing to Excel.

@user289261 2010-03-09 00:16:47

You can try using this open source solution that makes dealing with Excel a lot more cleaner.

http://excelwrapperdotnet.codeplex.com/

@dbkk 2009-10-01 03:44:08

Excel Package is an open-source (GPL) component for reading/writing Excel 2007 files. I used it on a small project, and the API is straightforward. Works with XLSX only (Excel 200&), not with XLS.

The source code also seems well-organized and easy to get around (if you need to expand functionality or fix minor issues as I did).

At first, I tried the ADO.Net (Excel connection string) approach, but it was fraught with nasty hacks -- for instance if second row contains a number, it will return ints for all fields in the column below and quietly drop any data that doesn't fit.

@liya 2009-09-30 03:18:16

SmartXLS is another excel spreadsheet component which support most features of excel Charts,formulas engines, and can read/write the excel2007 openxml format.

@Hafthor 2009-08-24 06:21:56

I just used ExcelLibrary to load an .xls spreadsheet into a DataSet. Worked great for me.

@anonymous 2009-05-11 15:26:26

ExcelMapper is an open source tool (http://code.google.com/p/excelmapper/) that can be used to read Excel worksheets as Strongly Typed Objects. It supports both xls and xlsx formats.

@Rune Grimstad 2008-11-04 11:22:57

Koogra is an open-source component written in C# that reads and writes Excel files.

@Dave 2009-02-02 20:39:09

I think this link needs updated... koogra.sourceforge.net

@Rune Grimstad 2009-02-02 22:07:53

Oops! Very wrong url! Fixed now

@David Burton 2012-08-15 11:13:40

Doesn't look particularly active any more, compared to, say, NPOI

@Joe Erickson 2009-01-19 15:49:51

SpreadsheetGear for .NET is an Excel compatible spreadsheet component for .NET. You can see what our customers say about performance on the right hand side of our product page. You can try it yourself with the free, fully-functional evaluation.

@Dmitry Shechtman 2008-09-04 11:39:19

This is what I used for Excel 2003:

Dictionary<string, string> props = new Dictionary<string, string>();
props["Provider"] = "Microsoft.Jet.OLEDB.4.0";
props["Data Source"] = repFile;
props["Extended Properties"] = "Excel 8.0";

StringBuilder sb = new StringBuilder();
foreach (KeyValuePair<string, string> prop in props)
{
    sb.Append(prop.Key);
    sb.Append('=');
    sb.Append(prop.Value);
    sb.Append(';');
}
string properties = sb.ToString();

using (OleDbConnection conn = new OleDbConnection(properties))
{
    conn.Open();
    DataSet ds = new DataSet();
    string columns = String.Join(",", columnNames.ToArray());
    using (OleDbDataAdapter da = new OleDbDataAdapter(
        "SELECT " + columns + " FROM [" + worksheet + "$]", conn))
    {
        DataTable dt = new DataTable(tableName);
        da.Fill(dt);
        ds.Tables.Add(dt);
    }
}

@lb. 2010-02-24 01:15:59

very clean code!

@Jeremy Holovacs 2011-07-19 15:02:44

worksheet isn't defined... seems a bit odd to me after clearly defining everything else.

@kenny 2008-09-04 11:10:40

Lately, partly to get better at LINQ.... I've been using Excel's automation API to save the file as XML Spreadsheet and then get process that file using LINQ to XML.

@dbkk 2009-10-01 03:47:27

XML Spreadsheet is a fairly clean format :)

@gsvirdi 2010-05-26 05:52:58

But like excel files.... can we protect xml files with password?

@kenny 2010-05-26 09:05:23

I would suspect you can protect it from Excel, but not from man with compiler...like anything...it's just bytes.

@Anonymous Type 2010-12-19 22:12:43

@gsvirdi, post a seperate question on Excel file security, this question is on performance.

@Jason Von Ruden 2008-09-04 10:24:13

I recommend the FileHelpers Library which is a free and easy to use .NET library to import/export data from EXCEL, fixed length or delimited records in files, strings or streams + More.

The Excel Data Link Documentation Section http://filehelpers.sourceforge.net/example_exceldatalink.html

@Will 2009-11-27 18:31:59

I won't down you, but I recently started using FileHelpers and was shocked at how ... crappy it is. For instance, the only way to map columns in a csv to properties... excuse me, FIELDS, of a model is to create the fields in the order of the columns. I don't know about you, but I wouldn't rely on a quirk of the compiler for one of the most central design considerations of my f8king framework.

@Hafthor 2008-09-04 06:00:46

While you did specifically ask for .xls, implying the older file formats, for the OpenXML formats (e.g. xlsx) I highly recommend the OpenXML SDK (http://msdn.microsoft.com/en-us/library/bb448854.aspx)

@Qerim Shahini 2010-06-03 05:52:07

No thanks, the OpenXml API is awful!

@Anonymous Type 2010-12-19 22:09:04

@Quoo, disagree completely.

@Carl Seleborg 2008-08-20 13:24:35

I did a lot of reading from Excel files in C# a while ago, and we used two approaches:

  • The COM API, where you access Excel's objects directly and manipulate them through methods and properties
  • The ODBC driver that allows to use Excel like a database.

The latter approach was much faster: reading a big table with 20 columns and 200 lines would take 30 seconds via COM, and half a second via ODBC. So I would recommend the database approach if all you need is the data.

Cheers,

Carl

@Ian Nelson 2008-08-20 13:19:17

The ADO.NET approach is quick and easy, but it has a few quirks which you should be aware of, especially regarding how DataTypes are handled.

This excellent article will help you avoid some common pitfalls: http://blog.lab49.com/archives/196

@Kevin Le - Khnle 2010-06-03 15:07:54

You answered my question (in the form of a comment above).

@Robin Robinson 2008-08-19 12:18:20

var fileName = string.Format("{0}\\fileNameHere", Directory.GetCurrentDirectory());
var connectionString = string.Format("Provider=Microsoft.Jet.OLEDB.4.0; data source={0}; Extended Properties=Excel 8.0;", fileName);

var adapter = new OleDbDataAdapter("SELECT * FROM [workSheetNameHere$]", connectionString);
var ds = new DataSet();

adapter.Fill(ds, "anyNameHere");

DataTable data = ds.Tables["anyNameHere"];

This is what I usually use. It is a little different because I usually stick a AsEnumerable() at the edit of the tables:

var data = ds.Tables["anyNameHere"].AsEnumerable();

as this lets me use LINQ to search and build structs from the fields.

var query = data.Where(x => x.Field<string>("phoneNumber") != string.Empty).Select(x =>
                new MyContact
                    {
                        firstName= x.Field<string>("First Name"),
                        lastName = x.Field<string>("Last Name"),
                        phoneNumber =x.Field<string>("Phone Number"),
                    });

@Kevin Le - Khnle 2010-06-03 14:47:12

If seems like the Select in this approach tries to guess the data type of the column and force upon that guessed data type. For example, if you have a column with mostly double values, it won't like you passing x.Field<string>, but expects x.Field<double>. IS this true?

@Robin Robinson 2010-06-03 18:20:47

Just looked it up on MSDN. Looks like the <T> is just used to attempt to cast the contents in the column to a type. In this example and just casting the data in the columns to strings. If you wanted a double you would need to call double.Parse(x.Field<string>("Cost") or something like that. Field is an extension method for DataRow and it looks like there aren't an non generic versions.

@Anonymous Type 2010-12-19 22:06:44

Does adding a double.Parse to the Linq query slow it down much?

@Robin Robinson 2010-12-21 17:07:13

Not that I have noticed. I haven't done any real performance on this. For our uses, it isn't being done a lot.

@Shaul Behr 2011-10-06 17:38:29

+1 for the Linq twist - I LOVE LINQ!!

@Andreas Grech 2012-03-10 18:50:01

Note that if you're reading xlsx, you need to use this connection string instead: string.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0}; Extended Properties=Excel 12.0;", fileName)

@Duncan 2012-06-06 16:04:28

Sadly the Jet.OLEDB driver is not 64-bit compatible; you will need to switch to target x86 rather than Any CPU (if you still want to go ahead with this method). Alternatively install the 64-bit ACE driver and change the conn string to use this driver (as indicated by Andreas) - microsoft.com/en-us/download/…

@Neal 2012-10-29 01:18:29

Cannot install the 64 bit ACE driver if the target machine has a 32 bit version of office installed.

@CodeRedick 2013-01-14 08:00:43

If this helps anyone, the Jet driver works fine in Win7 64bit... as long as I actually have the document open in Excel.

@hitec 2008-08-19 10:50:55

Here's some code I wrote in C# using .NET 1.1 a few years ago. Not sure if this would be exactly what you need (and may not be my best code :)).

using System;
using System.Data;
using System.Data.OleDb;

namespace ExportExcelToAccess
{
    /// <summary>
    /// Summary description for ExcelHelper.
    /// </summary>
    public sealed class ExcelHelper
    {
        private const string CONNECTION_STRING = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=<FILENAME>;Extended Properties=\"Excel 8.0;HDR=Yes;\";";

        public static DataTable GetDataTableFromExcelFile(string fullFileName, ref string sheetName)
        {
            OleDbConnection objConnection = new OleDbConnection();
            objConnection = new OleDbConnection(CONNECTION_STRING.Replace("<FILENAME>", fullFileName));
            DataSet dsImport = new DataSet();

            try
            {
                objConnection.Open();

                DataTable dtSchema = objConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);

                if( (null == dtSchema) || ( dtSchema.Rows.Count <= 0 ) )
                {
                    //raise exception if needed
                }

                if( (null != sheetName) && (0 != sheetName.Length))
                {
                    if( !CheckIfSheetNameExists(sheetName, dtSchema) )
                    {
                        //raise exception if needed
                    }
                }
                else
                {
                    //Reading the first sheet name from the Excel file.
                    sheetName = dtSchema.Rows[0]["TABLE_NAME"].ToString();
                }

                new OleDbDataAdapter("SELECT * FROM [" + sheetName + "]", objConnection ).Fill(dsImport);
            }
            catch (Exception)
            {
                //raise exception if needed
            }
            finally
            {
                // Clean up.
                if(objConnection != null)
                {
                    objConnection.Close();
                    objConnection.Dispose();
                }
            }


            return dsImport.Tables[0];
            #region Commented code for importing data from CSV file.
            //              string strConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;" +"Data Source=" + System.IO.Path.GetDirectoryName(fullFileName) +";" +"Extended Properties=\"Text;HDR=YES;FMT=Delimited\"";
            //
            //              System.Data.OleDb.OleDbConnection conText = new System.Data.OleDb.OleDbConnection(strConnectionString);
            //              new System.Data.OleDb.OleDbDataAdapter("SELECT * FROM " + System.IO.Path.GetFileName(fullFileName).Replace(".", "#"), conText).Fill(dsImport);
            //              return dsImport.Tables[0];

            #endregion
        }

        /// <summary>
        /// This method checks if the user entered sheetName exists in the Schema Table
        /// </summary>
        /// <param name="sheetName">Sheet name to be verified</param>
        /// <param name="dtSchema">schema table </param>
        private static bool CheckIfSheetNameExists(string sheetName, DataTable dtSchema)
        {
            foreach(DataRow dataRow in dtSchema.Rows)
            {
                if( sheetName == dataRow["TABLE_NAME"].ToString() )
                {
                    return true;
                }   
            }
            return false;
        }
    }
}

@Cherian 2009-07-11 08:24:38

This code needs some Resharper love

@hitec 2009-07-13 17:48:08

Couldn't agree more Cherian. This code is many years old... before I even was proficient with Resharper :)

@Sam 2010-07-23 09:16:36

The code is ugly, but it shows how to get the sheet names, great!

@Gaʀʀʏ 2012-07-12 04:49:53

This gets the job done. Thanks!

@Christian Hagelid 2008-08-19 07:48:58

Just did a quick demo project that required managing some excel files. The .NET component from GemBox software was adequate for my needs. It has a free version with a few limitations.

http://www.gemboxsoftware.com/GBSpreadsheet.htm

@ChadD 2012-10-10 16:56:43

FYI: I tried it and it didn't meet my need to be able to read an encrypted file.

@Ryan Farley 2008-08-19 07:28:37

If it is just simple data contained in the Excel file you can read the data via ADO.NET. See the connection strings listed here:

http://www.connectionstrings.com/?carrier=excel2007 or http://www.connectionstrings.com/?carrier=excel

-Ryan

Update: then you can just read the worksheet via something like select * from [Sheet1$]

@StingyJack 2009-01-19 14:06:28

This way is by far the fastest.

@Will 2009-11-27 18:27:50

Of course that's not true, Stingy. You have to sift through all the data and write crappy DB code (hand craft your models, map columns to properties, yadda yadda). The quickest way is to let some other poor SOB do this for you. That's why people use frameworks instead of writing everything from the bottom up.

@cyberzed 2010-02-11 13:26:32

Besides that I have had times where it didn't give me the right results due to localization problems... the neverending fight of seperators

@Triynko 2010-05-13 18:29:55

Worthless method! Truncates text columns to 255 characters when read. Beware! See: stackoverflow.com/questions/1519288/… ACE engine does same thing!

@Ryan Farley 2010-05-13 22:24:24

Triynko, it has been a super long time since I used this method, but IIRC you can get around the 255 char limit by defining an ODBC DSN for the spreadsheet and then define the columns as longer in length and then use the DSN to connect to the spreadsheet. It's a pain to do that, but I believe that gets around that.

@zihotki 2011-01-05 16:43:14

Be aware that using ADO.NET to read data from exel requires Microsoft Access or Microsoft Access Database Engine Redistributable installed.

@Brian Low 2011-04-13 18:50:00

The driver will also guess at the columns types based on the first several rows. If you have a column with what looks like integers in the first rows you will encounter an error when you hit a non-integer (e.g. a float, a string)

@aquinas 2011-09-23 21:19:46

This also will not work at ALL if you are running in a 64 bit process. forums.asp.net/p/1128266/1781961.aspx

@xanadont 2008-08-19 07:31:57

Not free, but with the latest Office there's a very nice automation .Net API. (there has been an API for a long while but was nasty COM) You can do everything you want / need in code all while the Office app remains a hidden background process.

@Anonymous Type 2010-12-19 22:13:20

sorry but i don't think you even read the question.

@xanadont 2010-12-20 20:25:44

@Anonymous-type I did read the question and was offering a helpful alternative to a desired OSS implementation ... because, well, I was pretty sure there was nothing available. And, judging by the accepted answer, a requirement of having Office installed is not an issue.

@Rob Cooper 2008-08-19 07:28:03

Forgive me if I am off-base here, but isn't this what the Office PIA's are for?

@Adam Ralph 2009-01-19 22:28:15

Yes, but that would involve creating an Excel.Application instance, loading the xls file, etc. If the requirement is purely to read some data from the file then it's much easier and far more lightweight to use one of the ADO.NET methods described in the other answers.

@Anonymous Type 2010-12-19 22:11:14

Too slow, using Office PIA as the baseline, everything else is faster - even just using an Object array passed from .Value2 property. Which is still using the PIA.

Related Questions

Sponsored Content

26 Answered Questions

[SOLVED] How do I enumerate an enum in C#?

8 Answered Questions

[SOLVED] How to loop through all enum values in C#?

19 Answered Questions

[SOLVED] Why is Dictionary preferred over Hashtable in C#?

26 Answered Questions

[SOLVED] Get int value from enum in C#

  • 2009-06-03 06:46:39
  • jim
  • 1252790 View
  • 1530 Score
  • 26 Answer
  • Tags:   c# enums casting int

24 Answered Questions

[SOLVED] Cast int to enum in C#

  • 2008-08-27 03:58:21
  • lomaxx
  • 1139829 View
  • 2796 Score
  • 24 Answer
  • Tags:   c# enums casting

42 Answered Questions

[SOLVED] How to create Excel (.XLS and .XLSX) file in C# without installing Ms Office?

  • 2008-09-29 22:30:28
  • mistrmark
  • 980937 View
  • 1696 Score
  • 42 Answer
  • Tags:   c# .net excel file-io

38 Answered Questions

57 Answered Questions

[SOLVED] What is the difference between String and string in C#?

9 Answered Questions

[SOLVED] What are the correct version numbers for C#?

63 Answered Questions

[SOLVED] How do I calculate someone's age in C#?

  • 2008-07-31 23:40:59
  • Jeff Atwood
  • 524150 View
  • 1743 Score
  • 63 Answer
  • Tags:   c# .net datetime

Sponsored Content