By rlb.usa


2010-07-12 21:07:50 8 Comments

I have an Excel worksheet I want to read into a datatable - all is well except for one particular column in my Excel sheet. The column, 'ProductID', is a mix of values like ########## and n#########.

I tried to let OleDB handle everything by itself automatically by reading it into a dataset/datatable, but any values in 'ProductID' like n###### are missing, ignored, and left blank. I tried manually creating my DataTable by looping through each row with a datareader, but with the exact same results.

Here's the code :

// add the column names manually to the datatable as column_1, column_2, ...
for (colnum = 0; colnum < num_columns; colnum ++){
  ds.Tables["products"].Columns.Add("column_" +colnum , System.Type.GetType("System.String")); 
}
while(myDataReader.Read()){
  // loop through each excel row adding a new respective datarow to my datatable 
  DataRow a_row = ds.Tables["products"].NewRow();
  for (col = 0; col < num_columns; col ++){
    try {  a_row[col] = rdr.GetString(col);  }
    catch {  a_row[col] = rdr.GetValue(col).ToString(); }
  }
  ds.Tables["products"].Rows.Add(a_row);
}

I don't understand why it won't let me read in values like n######. How can I do this?

6 comments

@Ibo 2018-06-05 17:56:20

Shortcut --> if you have a mixed type column in Excel: Sort your column Z to A

I pretty much went through all of the answers here and some of them worked for me and some did not, however none was desirable for me because somehow ADO did not pick the data in a mixed type column that I had in my Excel file. I had to set HDR=NO to make ADO read my spreadsheet column that is a mix of text and numbers and that way I lose the ability of using column headers in my SQL statements which is not good. If the order of columns change in the Excel file, the SQL statement will result in error or wrong output.

In a mixed data type column the key is the first 8 rows. ADO determines the data type for the column based on the first 8 rows So if you still want to modify your connection string with the extended parameters, simply sort your column Z to A on your Excel file before reading the data by ADO so this way the rows on top are the text ones and then your column will be picked as text.

If your initial rows are numbers (regardless if your column is set to format TEXT in Excel) ADO will determine that columns as a numeric type, so once it read the text rows below, it cannot cast those into number. On the opposite case, if the column is determined text, if any row if number, it can be cast as text.

@user1424725 2012-08-21 21:05:57

Several forums I found claim that by adding IMEX=1;TypeGuessRows=0;ImportMixedTypes=Text to the Extended Properties in the connection string would fix the problem, but this was not the case. I finally solved this problem by adding "HDR=NO" to the Extended Properties in the connection string (as Brian Wells shows above) so that I could import mixed types.

I then added some generic code to name the columns after the first row of data, then remove the first row.

    public static DataTable ImportMyDataTableFromExcel(string filePath)
    {
        DataTable dt = new DataTable();

        string fullPath = Path.GetFullPath(filePath);

        string connString =
           "Provider=Microsoft.Jet.OLEDB.4.0;" +
           "Data Source=\"" + fullPath + "\";" +
           "Extended Properties=\"Excel 8.0;HDR=No;IMEX=1;\"";

        string sql = @"SELECT * FROM [sheet1$]";

        using (OleDbDataAdapter dataAdapter = new OleDbDataAdapter(sql, connString))
        {
            dataAdapter.Fill(dt);
        }

        dt = BuildHeadersFromFirstRowThenRemoveFirstRow(dt);

        return dt;
    }

    private static DataTable BuildHeadersFromFirstRowThenRemoveFirstRow(DataTable dt)
    {
        DataRow firstRow = dt.Rows[0];

        for (int i = 0; i < dt.Columns.Count; i++)
        {
            if(!string.IsNullOrWhiteSpace(firstRow[i].ToString())) // handle empty cell
              dt.Columns[i].ColumnName = firstRow[i].ToString().Trim();
        }

        dt.Rows.RemoveAt(0);

        return dt;
    }

@3Sphere 2012-08-24 21:55:44

All of this difficulty with accessing data in Excel spreadsheets raises an obvious question: Why doesn't Microsoft provide a more modern (and performant) mechanism to get at the data than using the crappy Jet driver? There are any number of 3rd party tools for this that do a much better job than anything MS provides.

@Simon_Weaver 2012-12-05 21:56:19

@3Sphere perhaps that's why they dont....

@Simon_Weaver 2012-12-05 21:56:48

@3Sphere do you know of something that can read an XLS file. I tried a few things but they can only read the newer XLSX grrr

@3Sphere 2012-12-06 03:47:42

@Simon_Weaver The mechanism discussed in this thread, as clunky as it is, provides a reliable (but extremely non-performant) means to extract data from an XLS file. If you need something that's faster, you are either going to have to write your own parser or buy a 3rd party component (such as Spreadsheet Gear)

@Amos Zoellner 2014-10-29 19:21:12

This technique works well and is a much more generic solution than other answers - except it throws an exception if the file has duplicate data in 2 cells of row 1. I proposed a revision but it was rejected, so use care if copying this solution!

@Brian Wells 2011-04-19 19:20:45

Using .Net 4.0 and reading Excel files, I had a similar issue with OleDbDataAdapter - i.e. reading in a mixed data type on a "PartID" column in MS Excel, where a PartID value can be numeric (e.g. 561) or text (e.g. HL4354), even though the excel column was formatted as "Text".

From what I can tell, ADO.NET chooses the data type based on the majority of the values in the column (with a tie going to numeric data type). i.e. if most of the PartID's in the sample set are numeric, ADO.NET will declare the column to be numeric. Therefore ADO.Net will attempt to cast each cell to a number, which will fail for the "text" PartID values and not import those "text" PartID's.

My solution was to set the OleDbConnection connectionstring to use Extended Properties=IMEX=1;HDR=NO to indicate this is an Import and that the table(s) will not include headers. The excel file has a header row, so in this case tell ado.net not to use it. Then later in the code, remove that header row from the dataset and voilĂ  you have mixed data type for that column.

string sql = "SELECT F1, F2, F3, F4, F5 FROM [sheet1$] WHERE F1 IS NOT NULL";

OleDbConnection connection = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + PrmPathExcelFile + @";Extended Properties=""Excel 8.0;IMEX=1;HDR=NO;TypeGuessRows=0;ImportMixedTypes=Text""");

OleDbCommand cmd = new OleDbCommand(sql, connection);
OleDbDataAdapter da = new OleDbDataAdapter(cmd);

DataSet ds = new DataSet();
ds.Tables.Add("xlsImport", "Excel");
da.Fill(ds, "xlsImport");

// Remove the first row (header row)
DataRow rowDel = ds.Tables["xlsImport"].Rows[0];
ds.Tables["xlsImport"].Rows.Remove(rowDel);

ds.Tables["xlsImport"].Columns[0].ColumnName = "LocationID";
ds.Tables["xlsImport"].Columns[1].ColumnName = "PartID";
ds.Tables["xlsImport"].Columns[2].ColumnName = "Qty";
ds.Tables["xlsImport"].Columns[3].ColumnName = "UserNotes";
ds.Tables["xlsImport"].Columns[4].ColumnName = "UserID";

connection.Close(); 

// now you can use LINQ to search the fields

    var data = ds.Tables["xlsImport"].AsEnumerable();
    var query = data.Where(x => x.Field<string>("LocationID") == "COOKCOUNTY").Select(x =>
                new Contact
                {
                    LocationID= x.Field<string>("LocationID"),
                    PartID = x.Field<string>("PartID"),
                    Quantity = x.Field<string>("Qty"),
                    Notes = x.Field<string>("UserNotes"),
                    UserID = x.Field<string>("UserID")
                });

@TheVillageIdiot 2011-05-11 16:54:10

man "Excel 8.0;IMEX=1;HDR=NO;TypeGuessRows=0;ImportMixedTypes=Text" saved the day for somebody

@Brian Wells 2011-05-12 21:44:57

No prob - glad it worked for somebody!

@Jeff Machamer 2011-07-15 18:50:40

I wish I had taken the time to search for this answer earlier! This just saved me alot of time and effort! Thanks!

@suryakiran 2011-09-29 13:19:14

Great one. Saved a lot of time for me.

@Rn222 2011-11-21 14:23:30

This article shows how to set the parameters in the registry: blog.lab49.com/archives/196 I didn't know you could set TypeGuessRows = 0; ImportMixedTypes = Text in the connection string.

@zunetastic 2012-07-06 12:32:14

@TheVillageIdiot - Thanks mate, saved me plenty time.

@Solo 2013-08-29 02:02:45

Actually "Extended Properties=IMEX=1;" fix the issues. There's no need to include HDR=NO. But thanks anyway

@mjb 2014-01-04 01:34:37

Thanks this save my day too :)

@Yagnesh.Dixit 2014-03-21 06:39:43

Thanks a lot Saved my time too.

@voddy 2014-04-11 05:07:28

Saved mine too. Kudos

@BornToCode 2014-06-25 10:09:54

@Solo - your solution worked for me too! On a sheet that office warned that "the number in this cell is formatted as text or preceded by an apostrophe", using Extended Properties=\"Excel 8.0;IMEX=1;\"; solved it

@causita 2014-10-02 23:26:50

Great. I had similar issue and IMEX=1 solve this. thank you!!!

@Mido 2015-01-15 12:26:09

To avoid the installable ISAM error you need to add quotes around the Extended Properties content. Extended Properties="Excel8.0;IMEX=1"

@Magnus 2015-12-13 02:11:47

It turns out that if you change your SQL like this "SELECT F1, F2, F3, F4, F5 FROM [sheet1$] WHERE F1 IS NOT NULL And F1<>'PartID'" It still works and you don't need to exclude the PartID-row later on in code. This can be of great advantage if you are doing more complex queries which for example directly update a database from the Excel rows.

@Hugo Nava Kopp 2016-02-09 11:31:00

Great job! thanks for this :-)

@Tom H 2016-09-09 18:31:30

After trying everything else, some of which I was already aware of, it was setting HDR=No and skipping that row that finally did the trick for me. The other steps let the data load without error, but it was silently generating empty strings for non-numeric values until that change. Thanks!

@Simon 2017-03-22 17:32:40

IMEX=1 was all that was needed, was driving me crazy why it was working for some imports with mixed types and not for others. Worse that instead of erroring (which it should do) it just inserted blanks in where the mixed type wasnt the one it decided to use.

@szooky 2017-08-29 12:39:30

Watch out for "IMEX=1" only solution, I thought that it works, but when I imported over 1k Excel files for some of them there were still empty cells in some files. All excel files were copy of same "template", all cells in each file were "text" type (but contained only numbers sometime) so I have no idea why it didn't work for some files. However adding HDR=NO;TypeGuessRows=0;ImportMixedTypes=Text part solved problem.

@Brian Wells 2011-04-29 14:48:46

No problem sh4, glad it helps w/ the mixed type issue.

The DateTime column is whole other animal that I recall caused me grief in the past... we have one excel file we process that the OleDbDataAdapter will sometimes convert dates to a double data type (apparently Excel stores dates as doubles, which encode the number of days elapsed since January 0, 1900 ).

The workaround was to use:

OleDbConnection mobjExcelConn = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + txtExcelFile.Text + @";Extended Properties=""Excel 8.0;IMEX=1;HDR=Yes;""");

OleDbDataAdapter mobjExcelDataAdapter = new OleDbDataAdapter("Select * from [" + txtSheet.Text + "$] where [Supplier ID] <> '' ", mobjExcelConn);


DateTime dtShipStatus = DateTime.MinValue;
shipStatusOrig = excelRow["Est Ship Date"].ToString(); // excelRow is DataRow in the DataSet via the OleDbDataAdapter             

if (shipStatusOrig != string.Empty)
{
    // Date may be read in via oledb adapter as a double
    if (IsNumeric(shipStatusOrig))
    {
        double d = Convert.ToDouble(shipStatusOrig);
        dtShipStatus = DateTime.FromOADate(d);

        if (DateTime.TryParse(dtShipStatus.ToString(), out dtShipStatus))
        {
            validDate = true;
            Debug.WriteLine("{0} converted: ", dtShipStatus.ToString("s"));
        }
    }
    else
    {
        if (ValidateShipDate(shipStatusOrig))
        {
            dtShipStatus = DateTime.Parse(shipStatusOrig);
            validDate = true;
            Debug.WriteLine("{0} converted: ", dtShipStatus.ToString("s"));
        }
        else
        {
            validDate = false;
            MessageBox.Show("Invalid date format in the Excel spreadsheet.\nLine # " + progressBar1.Value + ", the 'Ship Status' value '" + shipStatusOrig + "' is invalid.\nDate should be in a valid date time format.\ne.g. M/DD/YY, M.D.Y, YYYY-MM-DD, etc.", "Invaid Ship Status Date");
        }
    }
...
}
        public static Boolean IsNumeric (Object Expression)
        {
            if(Expression == null || Expression is DateTime)
                return false;

            if(Expression is Int16 || Expression is Int32 || Expression is Int64 || Expression is Decimal || Expression is Single || Expression is Double || Expression is Boolean)
                return true;

            try
            {
                if(Expression is string)
                    Double.Parse(Expression as string);
                else
                   Double.Parse(Expression.ToString());
                return true;
            } catch {} // just dismiss errors but return false

            return false;
        }

        public bool ValidateShipDate(string shipStatus)
        {
            DateTime startDate;
            try
            {
                startDate = DateTime.Parse(shipStatus);
                return true;
            }
            catch
            {
                return false;
            }
        }

@sh4 2011-04-28 14:33:47

@Brian Wells Thank you, your suggestion did the trick, but not totally... Worked for the mixed field int-string, but the datetime columns went with strange characters after that, so i applied a "hack" over the "hack".

1.- Do a System.Io.File.Copy and create a copy of the excel file.

2.- Modify the Datetime column headers programatically at runtime to something in datetime format, i.e. "01/01/0001".

3.- Save the excel, and then apply your trick doing the query with HDR=NO to the modified file.

Tricky, yes, but worked, and reasonabily fast, if anyone has any alternative to this, i will be glad to hear.

Greetings.

P.D. Excuse my english, it isn't my native language.

@Brian Wells 2011-04-29 14:46:39

No problem glad it helps! I'll post a DateTime resolution I've used previously in a different Answer in this Post (not enough chars given here).

@rlb.usa 2010-07-12 21:22:28

There are two ways to handle mixed datatypes & excel.

Method 1

  • Open up your excel spreadsheet and set the column format to the desired format manually. In this case, 'Text'.

Method 2

  • There is a "hack" that consists of appending "IMEX=1" to your connection string like so:

    Provider=Microsoft.Jet.OLEDB.4.0;Data Source=myfile.xls;Extended Properties=Excel 8.0;IMEX=1

  • This will attempt to handle mixed Excel formats according to how it is set in your registry. This can be set locally by you, but for a server, this is probably not an option.

@Pratik 2011-08-02 14:32:42

It gives error : System.Data.OleDb.OleDbException was unhandled by user code Message="Could not find installable ISAM." Source="Microsoft JET Database Engine" ErrorCode=-2147467259

@Simon Dugré 2012-02-27 21:47:45

I know this thread's now old but I have the same issue as Pratik... I can't specify IMEX=1 because of "Could not find installable ISAM" error :(

@Jake1164 2012-07-06 11:47:33

To advoid the installable ISAM error you need to add quotes around the Extended Properties content. Extended Properties="Excel8.0;IMEX=1"

@user1274820 2019-03-20 23:29:11

I can't believe I had to read this far to find the most obvious solution. I had thought this might work, but for some reason never bothered to try it. This is the best solution: Open up your excel spreadsheet and set the column format to the desired format manually. In this case, 'Text'. I may post some code in a bit that automates this process in VBA.

Related Questions

Sponsored Content

42 Answered Questions

[SOLVED] How to create Excel (.XLS and .XLSX) file in C# without installing Ms Office?

  • 2008-09-29 22:30:28
  • mistrmark
  • 997861 View
  • 1732 Score
  • 42 Answer
  • Tags:   c# .net excel file-io

1 Answered Questions

[SOLVED] reading excel with oledb not displaying correct values

1 Answered Questions

[SOLVED] Reading one & Update some other Excel with c#

0 Answered Questions

Getting wrong data type in DataTable read from Excel

1 Answered Questions

Incorrect error data from excel

  • 2017-01-12 10:51:52
  • user1926138
  • 61 View
  • 0 Score
  • 1 Answer
  • Tags:   c# excel oledb

2 Answered Questions

Get the Row values from a excel based on the column Value

  • 2014-06-17 08:20:57
  • Pankaj
  • 2987 View
  • 0 Score
  • 2 Answer
  • Tags:   c# excel oledb

1 Answered Questions

Set Specific DataRows to read-only based on their value

0 Answered Questions

OLEDB read excel mixed datatypes, no header

3 Answered Questions

[SOLVED] OleDB Jet - Float issues in reading excel data

1 Answered Questions

[SOLVED] Reading .xlsx files using OLEDB + Issues with certain cells not loading data

  • 2012-06-02 05:13:07
  • ShaneW
  • 5783 View
  • 3 Score
  • 1 Answer
  • Tags:   c# excel oledb

Sponsored Content