By ecleel


2009-02-28 07:55:41 8 Comments

What are the differences between multidimensional arrays double[,] and array-of-arrays double[][] in C#?

If there is a difference, what is the best use for each one?

9 comments

@adsamcik 2017-07-31 07:46:56

I would like to update on this, because in .NET Core multi-dimensional arrays are faster than jagged arrays. I ran the tests from John Leidegren and these are the results on .NET Core 2.0 preview 2. I increased the dimension value to make any possible influences from background apps less visible.

Debug (code optimalization disabled)
Running jagged 
187.232 200.585 219.927 227.765 225.334 222.745 224.036 222.396 219.912 222.737 

Running multi-dimensional  
130.732 151.398 131.763 129.740 129.572 159.948 145.464 131.930 133.117 129.342 

Running single-dimensional  
 91.153 145.657 111.974  96.436 100.015  97.640  94.581 139.658 108.326  92.931 


Release (code optimalization enabled)
Running jagged 
108.503 95.409 128.187 121.877 119.295 118.201 102.321 116.393 125.499 116.459 

Running multi-dimensional 
 62.292  60.627  60.611  60.883  61.167  60.923  62.083  60.932  61.444  62.974 

Running single-dimensional 
 34.974  33.901  34.088  34.659  34.064  34.735  34.919  34.694  35.006  34.796 

I looked into disassemblies and this is what I found

jagged[i][j][k] = i * j * k; needed 34 instructions to execute

multi[i, j, k] = i * j * k; needed 11 instructions to execute

single[i * dim * dim + j * dim + k] = i * j * k; needed 23 instructions to execute

I wasn't able to identify why single-dimensional arrays were still faster than multi-dimensional but my guess is that it has to do with some optimalization made on the CPU

@Joe Amenta 2017-06-16 17:53:40

In addition to the other answers, note that a multidimensional array is allocated as one big chunky object on the heap. This has some implications:

  1. Some multidimensional arrays will get allocated on the Large Object Heap (LOH) where their equivalent jagged array counterparts would otherwise not have.
  2. The GC will need to find a single contiguous free block of memory to allocate a multidimensional array, whereas a jagged array might be able to fill in gaps caused by heap fragmentation... this isn't usually an issue in .NET because of compaction, but the LOH doesn't get compacted by default (you have to ask for it, and you have to ask every time you want it).
  3. You'll want to look into <gcAllowVeryLargeObjects> for multidimensional arrays way before the issue will ever come up if you only ever use jagged arrays.

@Eglin 2013-10-28 16:23:47

Preface: This comment is intended to address the answer provided by okutane, but because of SO's silly reputation system, I can not post it where it belongs.

Your assertion that one is slower than the other because of the method calls isn't correct. One is slower than the other because of more complicated bounds-checking algorithms. You can easily verify this by looking, not at the IL, but at the compiled assembly. For example, on my 4.5 install, accessing an element (via pointer in edx) stored in a two-dimensional array pointed to by ecx with indexes stored in eax and edx looks like so:

sub eax,[ecx+10]
cmp eax,[ecx+08]
jae oops //jump to throw out of bounds exception
sub edx,[ecx+14]
cmp edx,[ecx+0C]
jae oops //jump to throw out of bounds exception
imul eax,[ecx+0C]
add eax,edx
lea edx,[ecx+eax*4+18]

Here, you can see that there's no overhead from method calls. The bounds checking is just very convoluted thanks to the possibility of non-zero indexes, which is a functionality not on offer with jagged arrays. If we remove the sub,cmp,and jmps for the non-zero cases, the code pretty much resolves to (x*y_max+y)*sizeof(ptr)+sizeof(array_header). This calculation is about as fast (one multiply could be replaced by a shift, since that's the whole reason we choose bytes to be sized as powers of two bits) as anything else for random access to an element.

Another complication is that there are plenty of cases where a modern compiler will optimize away the nested bounds-checking for element access while iterating over a single-dimension array. The result is code that basically just advances an index pointer over the contiguous memory of the array. Naive iteration over multi-dimensional arrays generally involves an extra layer of nested logic, so a compiler is less likely to optimize the operation. So, even though the bounds-checking overhead of accessing a single element amortizes out to constant runtime with respect to array dimensions and sizes, a simple test-case to measure the difference may take many times longer to execute.

@Elmue 2017-05-29 19:13:17

Thanks for correcting the answer of okutane (not Dmitry). It is annoying that people give wrong answers on Stackoverflow and get 250 up-votes while others give correct answers and get far less. But at the end the IL code is irrelevant. You have to really MEASURE the speed to say anything about performance. Did you do that? I think the difference will be ridiculous.

@Thomas C Hutchinson 2017-02-02 17:56:24

I am parsing .il files generated by ildasm to build a database of assemnblies, classes, methods, and stored procedures for use doing a conversion. I came across the following, which broke my parsing.

.method private hidebysig instance uint32[0...,0...] 
        GenerateWorkingKey(uint8[] key,
                           bool forEncryption) cil managed

The book Expert .NET 2.0 IL Assembler, by Serge Lidin, Apress, published 2006, Chapter 8, Primitive Types and Signatures, pp. 149-150 explains.

<type>[] is termed a Vector of <type>,

<type>[<bounds> [<bounds>**] ] is termed an array of <type>

** means may be repeated, [ ] means optional.

Examples: Let <type> = int32.

1) int32[...,...] is a two-dimensional array of undefined lower bounds and sizes

2) int32[2...5] is a one-dimensional array of lower bound 2 and size 4.

3) int32[0...,0...] is a two-dimensional array of lower bounds 0 and undefined size.

Tom

@shahkalpesh 2009-02-28 08:22:18

Simply put multidimensional arrays are similar to a table in DBMS.
Array of Array (jagged array) lets you have each element hold another array of the same type of variable length.

So, if you are sure that the structure of data looks like a table (fixed rows/columns), you can use a multi-dimensional array. Jagged array are fixed elements & each element can hold an array of variable length

E.g. Psuedocode:

int[,] data = new int[2,2];
data[0,0] = 1;
data[0,1] = 2;
data[1,0] = 3;
data[1,1] = 4;

Think of the above as a 2x2 table:

1 | 2
3 | 4
int[][] jagged = new int[3][]; 
jagged[0] = new int[4] {  1,  2,  3,  4 }; 
jagged[1] = new int[2] { 11, 12 }; 
jagged[2] = new int[3] { 21, 22, 23 }; 

Think of the above as each row having variable number of columns:

 1 |  2 |  3 | 4
11 | 12
21 | 22 | 23

@Xaser 2016-04-23 22:00:47

this is what really matters when deciding what to use.. not this speed thingy.. well speed may become a factor when you have a square array.

@lznt 2015-09-02 21:46:06

This might have been mentioned in the above answers but not explicitly: with jagged array you can use array[row] to refer a whole row of data, but this is not allowed for multi-d arrays.

@John Leidegren 2009-02-28 09:34:49

A multidimensional array creates a nice linear memory layout while a jagged array implies several extra levels of indirection.

Looking up the value jagged[3][6] in a jagged array var jagged = new int[10][5] works like this: Look up the element at index 3 (which is an array) and look up the element at index 6 in that array (which is a value). For each dimension in this case, there's an additional look up (this is an expensive memory access pattern).

A multidimensional array is laid out linearly in memory, the actual value is found by multiplying together the indexes. However, given the array var mult = new int[10,30], the Length property of that multidimensional array returns the total number of elements i.e. 10 * 30 = 300.

The Rank property of a jagged array is always 1, but a multidimensional array can have any rank. The GetLength method of any array can be used to get the length of each dimension. For the multidimensional array in this example mult.GetLength(1) returns 30.

Indexing the multidimensional array is faster. e.g. given the multidimensional array in this example mult[1,7] = 30 * 1 + 7 = 37, get the element at that index 37. This is a better memory access pattern because only one memory location is involved, which is the base address of the array.

A multidimensional array therefore allocates a continuous memory block, while a jagged array does not have to be square, e.g. jagged[1].Length does not have to equal jagged[2].Length, which would be true for any multidimensional array.

Performance

Performance wise, multidimensional arrays should be faster. A lot faster, but due to a really bad CLR implementation they are not.

 23.084  16.634  15.215  15.489  14.407  13.691  14.695  14.398  14.551  14.252 
 25.782  27.484  25.711  20.844  19.607  20.349  25.861  26.214  19.677  20.171 
  5.050   5.085   6.412   5.225   5.100   5.751   6.650   5.222   6.770   5.305 

The first row are timings of jagged arrays, the second shows multidimensional arrays and the third, well that's how it should be. The program is shown below, FYI this was tested running mono. (The windows timings are vastly different, mostly due to the CLR implementation variations).

On windows, the timings of the jagged arrays are greatly superior, about the same as my own interpretation of what multidimensional array look up should be like, see 'Single()'. Sadly the windows JIT-compiler is really stupid, and this unfortunately makes these performance discussions difficult, there are too many inconsistencies.

These are the timings I got on windows, same deal here, the first row are jagged arrays, second multidimensional and third my own implementation of multidimensional, note how much slower this is on windows compared to mono.

  8.438   2.004   8.439   4.362   4.936   4.533   4.751   4.776   4.635   5.864
  7.414  13.196  11.940  11.832  11.675  11.811  11.812  12.964  11.885  11.751
 11.355  10.788  10.527  10.541  10.745  10.723  10.651  10.930  10.639  10.595

Source code:

using System;
using System.Diagnostics;
static class ArrayPref
{
    const string Format = "{0,7:0.000} ";
    static void Main()
    {
        Jagged();
        Multi();
        Single();
    }

    static void Jagged()
    {
        const int dim = 100;
        for(var passes = 0; passes < 10; passes++)
        {
            var timer = new Stopwatch();
            timer.Start();
            var jagged = new int[dim][][];
            for(var i = 0; i < dim; i++)
            {
                jagged[i] = new int[dim][];
                for(var j = 0; j < dim; j++)
                {
                    jagged[i][j] = new int[dim];
                    for(var k = 0; k < dim; k++)
                    {
                        jagged[i][j][k] = i * j * k;
                    }
                }
            }
            timer.Stop();
            Console.Write(Format,
                (double)timer.ElapsedTicks/TimeSpan.TicksPerMillisecond);
        }
        Console.WriteLine();
    }

    static void Multi()
    {
        const int dim = 100;
        for(var passes = 0; passes < 10; passes++)
        {
            var timer = new Stopwatch();
            timer.Start();
            var multi = new int[dim,dim,dim];
            for(var i = 0; i < dim; i++)
            {
                for(var j = 0; j < dim; j++)
                {
                    for(var k = 0; k < dim; k++)
                    {
                        multi[i,j,k] = i * j * k;
                    }
                }
            }
            timer.Stop();
            Console.Write(Format,
                (double)timer.ElapsedTicks/TimeSpan.TicksPerMillisecond);
        }
        Console.WriteLine();
    }

    static void Single()
    {
        const int dim = 100;
        for(var passes = 0; passes < 10; passes++)
        {
            var timer = new Stopwatch();
            timer.Start();
            var single = new int[dim*dim*dim];
            for(var i = 0; i < dim; i++)
            {
                for(var j = 0; j < dim; j++)
                {
                    for(var k = 0; k < dim; k++)
                    {
                        single[i*dim*dim+j*dim+k] = i * j * k;
                    }
                }
            }
            timer.Stop();
            Console.Write(Format,
                (double)timer.ElapsedTicks/TimeSpan.TicksPerMillisecond);
        }
        Console.WriteLine();
    }
}

@okutane 2009-02-28 09:53:58

@Hosam Aly 2009-02-28 14:27:19

Try timing them yourself, and see how both perform. Jagged arrays are much more optimized in .NET. It may be related to bounds checking, but regardless of the reason, timings and benchmarks clearly show that jagged arrays are faster to access than multi-dimensional ones.

@John Leidegren 2009-02-28 15:19:27

Did, care to weigh in, I updated my answer, I get seriously different results between mono and windows.

@ILoveFortran 2009-02-28 15:30:31

Hosam, that's what he is saying. He is also saying, correctly, that multi-dimensional arrays SHOULD be faster, and certainly not slower than jagged arrays.

@Hosam Aly 2009-03-01 08:11:16

Now this is much clearer about what you mean. +1. I totally agree that multidimensional arrays should be faster, but as you saw yourself, reality is different (at least on Windows; I never used Mono). Sad, but true.

@Hosam Aly 2009-03-01 08:12:34

But your timings appear to be too small (a few milliseconds). At this level you'll have much interference from system services and/or drivers. Make your tests much larger, at least taking a second or two.

@John Leidegren 2009-03-01 09:18:41

I disagree, in your your test a sizable amount of time is being spent on things not necessarily related to array indexing. I ran my test many times over and got consistent results, even if there was interference (unlikely related to services or driver IO) the results are reliable.

@supercat 2012-08-26 21:19:49

On some systems, access to multidimensional arrays will be much faster when the inner loop indexes the last dimension; on some systems, they will be much faster when it accesses the first. Have you tested both directions for timing?

@John Leidegren 2012-08-27 11:55:19

@supercat I haven't, unfortunately. It sounds interesting, what systems are we talking about? Most of my experience is on X86/X64.

@supercat 2012-08-27 14:51:10

@JohnLeidegren: The fact that multi-dimensional arrays work better when indexing one dimension than another has been understood for half a century, since elements which differ in only one particular dimension will be stored consecutively in memory, and with many types of memory (past and present), accessing consecutive items is faster than accessing distant items. I think in .net one should get optimal results indexing by the last subscript which is what you were doing, but testing the time with the subscripts swapped may be informative in any case.

@John Leidegren 2012-08-27 17:34:05

@supercat I agree, though the evidence presented here suggest otherwise. The program is complete, if you wish you can copy and run it swapping the indexing. I'd like to see some benchmarks run och the .NET 4.0 CLR, right now I have other things to attend to.

@John Leidegren 2013-10-09 05:35:32

@jnm2 not my strong suit, fixed.

@Amro 2013-11-05 07:53:09

@supercat: multi-dimensional arrays in C# are stored in row-major order, swapping the order of the subscripts would be slower since you would be accessing the memory non-consecutively. BTW the times reported are no longer accurate, I get almost twice as fast times for multi-dimensional arrays than jagged arrays (tested on latest .NET CLR), which is how it ought to be..

@supercat 2013-11-08 17:26:24

@Amro: The run-time includes array types internally--a simple and a fancy one--and optimization has historically focused on the simple one. For performance-critical code, two other approaches to consider are using a single-dimensional array (e.g. for an 8x8 array, accessing item Board[row*8+col]) and splitting out structures into parallel arrays (if one would often only be accessing one field in many consecutive array elements, moving that data into its own array may be helpful).

@Magus 2014-03-06 17:52:55

I know this is a bit pedantic, but I have to mention that this isn't Windows vs Mono, but CLR vs Mono. You sometimes seem to confuse those. The two are not equivalent; Mono works on Windows as well.

@Jenix 2018-06-02 13:03:28

@Magus While you are right about CLR and Mono, it's also worth mentioning about OS too when benchmarking. It's because OS specific APIs can be called under the hood. For example, Mono uses clock_gettime() on Linux, QueryPerformanceFrequency() on Windows for Stopwatch class and this makes a difference when running the same code on both platforms.

@okutane 2009-02-28 08:07:08

Array of arrays (jagged arrays) are faster than multi-dimensional arrays and can be used more effectively. Multidimensional arrays have nicer syntax.

If you write some simple code using jagged and multidimensional arrays and then inspect the compiled assembly with an IL disassembler you will see that the storage and retrieval from jagged (or single dimensional) arrays are simple IL instructions while the same operations for multidimensional arrays are method invocations which are always slower.

Consider the following methods:

static void SetElementAt(int[][] array, int i, int j, int value)
{
    array[i][j] = value;
}

static void SetElementAt(int[,] array, int i, int j, int value)
{
    array[i, j] = value;
}

Their IL will be the following:

.method private hidebysig static void  SetElementAt(int32[][] 'array',
                                                    int32 i,
                                                    int32 j,
                                                    int32 'value') cil managed
{
  // Code size       7 (0x7)
  .maxstack  8
  IL_0000:  ldarg.0
  IL_0001:  ldarg.1
  IL_0002:  ldelem.ref
  IL_0003:  ldarg.2
  IL_0004:  ldarg.3
  IL_0005:  stelem.i4
  IL_0006:  ret
} // end of method Program::SetElementAt

.method private hidebysig static void  SetElementAt(int32[0...,0...] 'array',
                                                    int32 i,
                                                    int32 j,
                                                    int32 'value') cil managed
{
  // Code size       10 (0xa)
  .maxstack  8
  IL_0000:  ldarg.0
  IL_0001:  ldarg.1
  IL_0002:  ldarg.2
  IL_0003:  ldarg.3
  IL_0004:  call       instance void int32[0...,0...]::Set(int32,
                                                           int32,
                                                           int32)
  IL_0009:  ret
} // end of method Program::SetElementAt

When using jagged arrays you can easily perform such operations as row swap and row resize. Maybe in some cases usage of multidimensional arrays will be more safe, but even Microsoft FxCop tells that jagged arrays should be used instead of multidimensional when you use it to analyse your projects.

@jason saldo 2009-02-28 08:17:18

hmm. I was betting on the other way around. Can you suggest links?

@Hosam Aly 2009-02-28 09:11:11

@Hosam Aly 2009-02-28 14:32:09

@John, measure them yourself, and don't make assumptions.

@Henk Holterman 2009-02-28 14:34:20

@John: My first reaction too but i was wrong - see Hosams question for details.

@ILoveFortran 2009-02-28 15:28:06

Multi-dimensional arrays should logically be more efficent but their implementation by the JIT compiler is not. The above code is not useful since it does not show array access in a loop.

@John Leidegren 2009-03-01 08:09:35

@Henk Holterman - See my answer below, It might be the case that on windows jagged arrays are fast but one has to realize that this is entirely CLR specific and not the case with e.g. mono...

@Anthony Nichols 2016-04-29 15:11:02

I know this is an old question, just wondering if the CLR has been optimized for multidimensional arrays since this question was asked.

@Arthur Castro 2016-09-06 20:16:52

@AnthonyNichols, no optimizations has been done. It still works just like this answer from 2009.

@BlueRaja - Danny Pflughoeft 2017-12-23 16:42:09

@arthur The C# compiler doesn't do optimizations, the JIT does. Looking at the IL won't tell you how it's optimized.

@abatishchev 2009-02-28 20:23:18

Multi-dimension arrays are (n-1)-dimension matrices.

So int[,] square = new int[2,2] is square matrix 2x2, int[,,] cube = new int [3,3,3] is a cube - square matrix 3x3. Proportionality is not required.

Jagged arrays are just array of arrays - an array where each cell contains an array.

So MDA are proportional, JD may be not! Each cell can contains an array of arbitrary length!

Related Questions

Sponsored Content

27 Answered Questions

[SOLVED] What is the best way to iterate over a dictionary?

  • 2008-09-26 18:20:06
  • Jake Stewart
  • 1369811 View
  • 2275 Score
  • 27 Answer
  • Tags:   c# dictionary loops

31 Answered Questions

[SOLVED] What is the difference between a field and a property?

  • 2008-11-17 08:41:38
  • Anonymous
  • 384185 View
  • 974 Score
  • 31 Answer
  • Tags:   c# properties field

26 Answered Questions

[SOLVED] How do I enumerate an enum in C#?

61 Answered Questions

[SOLVED] What is the difference between String and string in C#?

32 Answered Questions

[SOLVED] What is the difference between const and readonly?

38 Answered Questions

13 Answered Questions

[SOLVED] Difference Between Select and SelectMany

  • 2009-06-06 03:54:16
  • Tarik
  • 438647 View
  • 929 Score
  • 13 Answer
  • Tags:   c# linq-to-sql linq

9 Answered Questions

[SOLVED] What are the correct version numbers for C#?

24 Answered Questions

[SOLVED] Cast int to enum in C#

  • 2008-08-27 03:58:21
  • lomaxx
  • 1185522 View
  • 2870 Score
  • 24 Answer
  • Tags:   c# enums casting

15 Answered Questions

[SOLVED] Calculate difference between two dates (number of days)?

  • 2009-10-22 13:47:15
  • leora
  • 1012579 View
  • 957 Score
  • 15 Answer
  • Tags:   c# date

Sponsored Content