By Rob


2008-09-25 07:00:03 8 Comments

I notice that modern C and C++ code seems to use size_t instead of int/unsigned int pretty much everywhere - from parameters for C string functions to the STL. I am curious as to the reason for this and the benefits it brings.

8 comments

@azeemarif 2008-09-25 07:13:02

Classic C (the early dialect of C described by Brian Kernighan and Dennis Ritchie in The C Programming Language, Prentice-Hall, 1978) didn't provide size_t. The C standards committee introduced size_t to eliminate a portability problem

Explained in detail at embedded.com (with a very good example)

@Ihor Kaharlichenko 2011-06-15 13:58:26

Another great article explaining both size_t and ptrdiff_t: viva64.com/en/a/0050

@Rose Perrone 2010-11-28 03:41:35

In short, size_t is never negative, and it maximizes performance because it's typedef'd to be the unsigned integer type that's big enough -- but not too big -- to represent the size of the largest possible object on the target platform.

Sizes should never be negative, and indeed size_t is an unsigned type. Also, because size_t is unsigned, you can store numbers roughly twice as big as in the corresponding signed type, because the sign bit is freed up to represent magnitude like all the other bits. Gaining one more bit roughly doubles the range of numbers we can represent.

So, you ask, why not just use an unsigned int? It may not be able to hold big enough numbers. In an implementation where unsigned int is 16 bits, the biggest number it can represent is 65535. Some processors, such as the IP16L32, can copy objects larger than 65535 bytes.

So, you ask, why not use an unsigned long int? It exacts a performance toll on some platforms. Standard C requires that a long occupy at least 32 bits. An IP16L32 platform implements each 32-bit long as a pair of 16-bit words. Almost all 32-bit operators on these platforms require two instructions, if not more, because they work with the 32 bits in two 16-bit chunks. For example, moving a 32-bit long usually requires two machine instructions -- one to move each 16-bit chunk.

Using size_t avoids this performance toll. According to this fantastic article, "Type size_t is a typedef that's an alias for some unsigned integer type, typically unsigned int or unsigned long, but possibly even unsigned long long. Each Standard C implementation is supposed to choose the unsigned integer that's big enough--but no bigger than needed--to represent the size of the largest possible object on the target platform."

@Mitch 2012-04-09 11:05:06

Sorry to comment on this after so long, but I just had to confirm the biggest number that an unsigned int can hold - perhaps I'm misunderstanding your terminology, but I thought that the biggest number an unsigned int can hold is 4294967295, 65356 being the maximum of an unsigned short.

@Rose Perrone 2012-04-11 04:24:18

If your unsigned int occupies 32 bits, then yes, the biggest number it can hold is 2^32 - 1, which is 4294967295 (0xffffffff). Do you have another question?

@Mitch 2012-04-11 07:25:34

No other questions, I was just curious as to why you used 65,356 which would imply a 16 bit unsigned int, which I've never known to be the most common case by any means.

@Keith Thompson 2012-04-12 23:12:18

@Mitch: The largest value that can be represented in an unsigned int can and does vary from one system to another. It's required to be at least 65536, but it's commonly 4294967295 and could be 18446744073709551615 (2**64-1) on some systems.

@Mitch 2012-04-14 15:10:55

Oh ok. Is there a standard or something that dictates that it be at least 65536? Also, I just realised I was writing 65356 instead of 65536 - whoops!

@oiyio 2012-09-09 08:41:03

In this article, it is said that: Using unsigned int as the parameter type, as in void *memcpy(void *s1, void const *s2, unsigned int n); works just dandy on any platform in which an unsigned int can represent the size of the largest data object. Then can we say size_t = unsigned int? Can we say that there is no difference between them? (My PC is 32-bit.)

@Sie Raybould 2013-12-10 23:00:12

The largest value a 16 bit unsigned int can contain is 65535, not 65536. A small but important difference as 65536 is the same as 0 in a 16 bit unsigned int.

@gnasher729 2014-04-12 21:41:52

The "standard or something" that dictates that an unsigned int must be capable of holding at least 65,536 different values is the C Standard (the C++ Standard says the same thing).

@Marc van Leeuwen 2014-06-15 07:38:01

@gnasher729: Are you sure about the C++ standard? Having searched for some time I am under the impression that they simply removed all absolute guarantees about integer ranges (excluding unsigned char). The standard does not seem to contain the string '65535' or '65536' anywhere, and '+32767' only occurs (1.9:9) in a note as possible largest integer representable in int; no guarantee is given even that INT_MAX cannot be smaller than that!

@Ruslan 2016-04-17 09:59:42

@MarcvanLeeuwen in 18.3.3/2, C++11 standard says about <climits>: "The contents are the same as the Standard C library header <limits.h>". I'd suppose that the requirements to the contents are the same. C99 says in 5.2.4.2.1/1 "Their implementation-defined values shall be equal or greater in magnitude (absolute value) to those shown, with the same sign.", followed by the values themselves.

@Zebrafish 2016-01-19 20:55:04

If my compiler is set to 32 bit, size_t is nothing other than a typedef for unsigned int. If my compiler is set to 64 bit, size_t is nothing other than a typedef for unsigned long long.

@StaceyGirl 2018-08-17 20:50:17

Can be just defined as unsigned long for both cases on some OSes.

@Remo.D 2008-09-25 07:08:12

The size_t type is the unsigned integer type that is the result of the sizeof operator (and the offsetof operator), so it is guaranteed to be big enough to contain the size of the biggest object your system can handle (e.g., a static array of 8 GiB).

The size_t type may be bigger than, equal to, or smaller than an unsigned int, and your compiler might make assumptions about it for optimization.

You may find more precise information in the C99 standard, section 7.17, a draft of which is available on the Internet in pdf format, or in the C11 standard, section 7.19, also available as a pdf draft.

@dan04 2010-11-28 03:46:51

Nope. Think of x86-16 with the large (not huge) memory model: Pointers are far (32-bit), but individual objects are limited to 64k (so size_t can be 16-bit).

@gnasher729 2014-04-12 21:39:01

"size of the biggest object" is not poor wording, but absolutely correct. The size of an object can be much more limited than the address space.

@Marc van Leeuwen 2014-06-15 05:42:32

"your compiler might make assumption about it": I would hope the compiler knows the exact range of values that size_t can represent! If it doesn't, who does?

@user1084944 2015-08-31 22:02:06

@Marc: I think the point was more that the compiler might be able to do something with that knowledge.

@user2023370 2016-11-11 10:26:14

I just wish this increasingly popular type didn't require the inclusion of a header file.

@YoYoYonnY 2019-03-05 12:24:36

Actually, compilers generally make better optimizations when not using size_t, because unsigned types have fully defined wraparound behavior and thus (under certain circumstances) the compiler has to preserve overflow semantics, among other things. Compared to other unsigned types, simple operations like pointer/array indexing might still be faster with size_t, because the compiler might, for example, not have to widen to 64-bit integers first (if your CPU only performs pointer arithmetic in 64 bits).

@YoYoYonnY 2019-03-05 12:30:58

On top of that, the address space is rarely more than 48 bits even on 64-bit systems, so unless you need to access more than 2 * sizeof(T) GB of elements, I would highly recommend storing your array sizes and indices as int instead of size_t whenever you can get away with it. It can save you up to 50% memory, and may even speed up your code as well.

@Graeme Burke 2013-09-24 02:44:37

This excerpt from the glibc manual 0.02 may also be relevant when researching the topic:

There is a potential problem with the size_t type and versions of GCC prior to release 2.4. ANSI C requires that size_t always be an unsigned type. For compatibility with existing systems' header files, GCC defines size_t in `stddef.h` to be whatever type the system's `sys/types.h` defines it to be. Most Unix systems that define size_t in `sys/types.h` define it to be a signed type. Some code in the library depends on size_t being an unsigned type, and will not work correctly if it is signed.

The GNU C library code which expects size_t to be unsigned is correct. The definition of size_t as a signed type is incorrect. We plan that in version 2.4, GCC will always define size_t as an unsigned type, and the `fixincludes` script will massage the system's `sys/types.h` so as not to conflict with this.

In the meantime, we work around this problem by telling GCC explicitly to use an unsigned type for size_t when compiling the GNU C library. `configure` will automatically detect what type GCC uses for size_t and arrange to override it if necessary.

@who 2008-09-27 05:21:29

size_t is the size of a pointer.

So in 32-bit code, or the common ILP32 (integer, long, pointer) model, size_t is 32 bits; and in 64-bit code, or the common LP64 (long, pointer) model, size_t is 64 bits (integers are still 32 bits).

There are other models, but these are the ones that g++ uses (at least by default).

@Keith Thompson 2012-04-12 23:13:32

size_t is not necessarily the same size as a pointer, though it commonly is. A pointer has to be able to point to any location in memory; size_t only has to be big enough to represent the size of the largest single object.

@Kevin S. 2008-09-25 07:31:28

The size_t type is the type returned by the sizeof operator. It is an unsigned integer capable of expressing the size in bytes of any memory range supported on the host machine. It is (typically) related to ptrdiff_t in that ptrdiff_t is a signed integer value such that sizeof(ptrdiff_t) and sizeof(size_t) are equal.

When writing C code you should always use size_t whenever dealing with memory ranges.

The int type on the other hand is basically defined as the size of the (signed) integer value that the host machine can use to perform integer arithmetic most efficiently. For example, on many older PC-type computers the value sizeof(size_t) would be 4 (bytes) but sizeof(int) would be 2 (bytes): 16-bit arithmetic was faster than 32-bit arithmetic, though the CPU could handle a (logical) memory space of up to 4 GiB.

Use the int type only when you care about efficiency as its actual precision depends strongly on both compiler options and machine architecture. In particular the C standard specifies the following invariants: sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long) placing no other limitations on the actual representation of the precision available to the programmer for each of these primitive types.

Note: This is NOT the same as in Java (which actually specifies the bit precision for each of the types 'char', 'byte', 'short', 'int' and 'long').

@Clearer 2014-12-07 13:51:42

The de facto definition of int is that it's 16 bits on 16-bit machines and 32 bits on anything larger. Too much code has been written which assumes that int is 32 bits wide to change this now, and as a result people should always use size_t or {,u}int{8,16,32,64}_t if they want something specific; as a precaution, people should just always use these instead of the built-in integer types.

@chux - Reinstate Monica 2015-10-09 01:36:36

"It is an unsigned integer capable of expressing the size in bytes of any memory range supported on the host machine." --> No. size_t is capable of representing the size of any single object (e.g.: number, array, structure). The entire memory range may exceed what size_t can represent.

@Maciej Hehl 2008-09-25 07:08:27

Type size_t must be big enough to store the size of any possible object. Unsigned int doesn't have to satisfy that condition.

For example in 64 bit systems int and unsigned int may be 32 bit wide, but size_t must be big enough to store numbers bigger than 4G

@R.. 2010-08-07 18:28:09

"object" is the language used by the standard.

@supercat 2014-03-28 20:24:15

I think size_t would only have to be that big if the compiler could accept a type X such that sizeof(X) would yield a value bigger than 4G. Most compilers would reject e.g. typedef unsigned char foo[1000000000000LL][1000000000000LL], and even foo[65536][65536]; could be legitimately rejected if it exceeded a documented implementation-defined limit.

@Lightness Races with Monica 2015-04-05 01:18:58

@MattJoiner: The wording is fine. "Object" is not vague at all, but rather defined to mean "region of storage".
