By chux


2019-03-09 19:13:13 8 Comments

Rumor is that the next version of C will disallow sign magnitude and ones' complement signed integer encoding. True or not, it seems efficient to not have to code and test for those rare encodings.

Yet if code does not handle non-2's complement encodings, it is prudent to detect them and fail compilation today.
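As a point of comparison, a check for the 2's complement case alone can be written without static_assert, which also suits pre-C11 compilers. This is a minimal sketch and not part of the header in this post; the macro and function names are hypothetical. It relies on the low two bits of -1: 11 in 2's complement, 10 in ones' complement, and 01 in sign-magnitude.

```c
#include <assert.h>

/* Hypothetical name, not from unicorn.h.
 * (-1 & 3) is 3 for 2's complement, 2 for ones' complement,
 * and 1 for sign-magnitude. */
#define UNICORN_TWOS_COMPLEMENT ((-1 & 3) == 3)

/* Inside #if the expression is evaluated in the widest integer type,
 * so other encodings stop the translation unit here. */
#if !UNICORN_TWOS_COMPLEMENT
  #error "Dinosaur: non-2's complement"
#endif

/* Run-time view of the same value for type int (hypothetical helper). */
int unicorn_low_bits_of_minus_one(void) {
  return -1 & 3;
}
```

Unlike the per-type MIN/MAX comparisons, this inspects only one type per use, so it is a complement to those tests and not a replacement.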

Rather than just look for that one kind of dinosaur¹, below is C code that looks for various unicorns² and dinosaurs. Certainly some tests are more useful than others.

Review goal:

  • Please report any dinosaur¹ or unicorn² compilers found by this code.

  • Review how well this code would successfully flag true passé compilers and not report new innovative ones (e.g. 128-bit intmax_t.)

  • Suggest any additional or refined tests.

  • Pre-C11 compilers that lack static_assert may well need a better #define static_assert ... than the one in this code. Better alternatives are appreciated, but not a main goal of this post.
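On the last point, one possible pre-C11 fallback gives each assertion its own typedef name, because C99 and earlier do not allow a typedef to be repeated in the same scope. It also uses a negative array size so that a false condition is a constraint violation. A sketch under those assumptions: the UNICORN_* names are hypothetical, and the version guard reflects that C23 made static_assert a keyword.

```c
#include <assert.h>   /* C11 defines static_assert here as a macro */
#include <limits.h>

/* Hypothetical helper names; none of them come from the post. */
#define UNICORN_CAT_(a, b) a##b
#define UNICORN_CAT(a, b)  UNICORN_CAT_(a, b)

/* Works in any C version: array size -1 is rejected when e is false.
 * The message m is ignored, but the failing line is reported. */
#define UNICORN_STATIC_ASSERT(e, m) \
    typedef char UNICORN_CAT(unicorn_assert_line_, __LINE__)[(e) ? 1 : -1]

/* Only supply static_assert when neither <assert.h> (C11) nor the
 * language itself (C23 keyword) already provides it. */
#if !defined(static_assert) && \
    (!defined(__STDC_VERSION__) || __STDC_VERSION__ < 202311L)
  #define static_assert(e, m) UNICORN_STATIC_ASSERT(e, m)
#endif

static_assert(CHAR_BIT >= 8, "CHAR_BIT below the C minimum");
UNICORN_STATIC_ASSERT(sizeof(char) == 1, "sizeof(char) must be 1");
```

The __LINE__ suffix still collides if two assertions share a line number in one scope, for example the same line in two different headers, so a counter or file-specific prefix may be needed in larger projects.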

Note: I am not trying to rate strict adherence to IEEE_754 and the like.


/*
 * unicorn.h
 * Various tests to detect old and strange compilers.
 *
 *  Created on: Mar 8, 2019
 *      Author: chux
 */

#ifndef UNICORN_H_
#define UNICORN_H_

#include <assert.h>
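#ifndef static_assert
  /* Pre-C11 fallback: a negative array size forces a diagnostic when e is false.
   * An extern declaration may be repeated and reserves no storage. */
  #define static_assert( e, m ) extern char brevit_static_assert_[(e) ? 1 : -1]
#endif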

#include <float.h>
#include <limits.h>
#include <stdint.h>

/*
 *  Insure 2's complement
 *  Could also check various int_leastN_t, int_fastN_t
 */
static_assert(SCHAR_MIN < -SCHAR_MAX && SHRT_MIN < -SHRT_MAX &&
    INT_MIN < -INT_MAX && LONG_MIN < -LONG_MAX &&
    LLONG_MIN < -LLONG_MAX && INTMAX_MIN < -INTMAX_MAX &&
    INTPTR_MIN < -INTPTR_MAX && PTRDIFF_MIN < -PTRDIFF_MAX
    , "Dinosuar: Non-2's complement.");

/*
 *  Insure the range of unsigned is 2x that of positive signed
 *  Only ever seen this once: the widest unsigned and signed types had the same max
 */
static_assert(SCHAR_MAX == UCHAR_MAX/2 && SHRT_MAX == USHRT_MAX/2 &&
    INT_MAX == UINT_MAX/2 && LONG_MAX == ULONG_MAX/2 &&
    LLONG_MAX == ULLONG_MAX/2 && INTMAX_MAX == UINTMAX_MAX/2, 
        "Dinosuar: narrowed unsigned.");

/*
 *  Insure char is sub-range of int
 *  When char values exceed int, makes for tough code using fgetc()
 */
static_assert(CHAR_MAX <= INT_MAX, "Dinosuar: wide char");

/*
 *  Insure char is a power-2-octet
 *  I suspect many folks would prefer just CHAR_BIT == 8
 */
static_assert((CHAR_BIT & (CHAR_BIT - 1)) == 0, "Dinosaur: Uncommon byte width.");

/*
 *  Only binary FP
 */
static_assert(FLT_RADIX == 2, "Dinosuar: Non binary FP");

/*
 *  Some light checking for pass-able FP types
 *  Certainly this is not a full IEEE check
 *  Tolerate float as double
 */
static_assert(sizeof(float)*CHAR_BIT == 32 || sizeof(float)*CHAR_BIT == 64,
    "Dinosuar: Unusual float");
static_assert(sizeof(double)*CHAR_BIT == 64, "Dinosuar: Unusual double");

/*
 *  Heavier IEEE checking
 */
static_assert(DBL_MAX_10_EXP == 308 && DBL_MAX_EXP == 1024 &&
    DBL_MIN_10_EXP == -307 && DBL_MIN_EXP == -1021 &&
    DBL_DIG == 15 && DBL_DECIMAL_DIG == 17 && DBL_MANT_DIG == 53,
    "Dinosuar: Unusual double");

/*
 *  Insure uxxx_t range > int
 *  Strange when unsigned helper types promote to int
 */
static_assert(INT_MAX < UINTPTR_MAX, "Unicorn: narrow uintptr_t");
static_assert(INT_MAX < SIZE_MAX, "Unicorn: narrow size_t");

/*
 *  Insure xxx_t range >= int
 *  Also expect signed helper types at least int range
 */
static_assert(INT_MAX <= PTRDIFF_MAX, "Unicorn: narrow ptrdiff_t");
static_assert(INT_MAX <= INTPTR_MAX, "Unicorn: narrow intptr_t");

/*
 *  Insure all integers are within `float` finite range
 */
// Works OK when uintmax_t lacks padding
static_assert(FLT_RADIX == 2 && sizeof(uintmax_t)*CHAR_BIT < FLT_MAX_EXP,
    "Unicorn: wide integer range");
// Better method
#define UNICODE_BW1(x) ((x) > 0x1u ? 2 : 1)
#define UNICODE_BW2(x) ((x) > 0x3u ? UNICODE_BW1((x)/0x4)+2 : UNICODE_BW1(x))
#define UNICODE_BW3(x) ((x) > 0xFu ? UNICODE_BW2((x)/0x10)+4 : UNICODE_BW2(x))
#define UNICODE_BW4(x) ((x) > 0xFFu ? UNICODE_BW3((x)/0x100)+8 : UNICODE_BW3(x))
#define UNICODE_BW5(x) ((x) > 0xFFFFu ? UNICODE_BW4((x)/0x10000)+16 : UNICODE_BW4(x))
#define UNICODE_BW6(x) ((x) > 0xFFFFFFFFu ? \
    UNICODE_BW5((x)/0x100000000)+32 : UNICODE_BW5(x))
#define UNICODE_BW(x) ((x) > 0xFFFFFFFFFFFFFFFFu ? \
    UNICODE_BW6((x)/0x100000000/0x100000000)+64 : UNICODE_BW6(x))
static_assert(FLT_RADIX == 2 && UNICODE_BW(UINTMAX_MAX) < FLT_MAX_EXP,
    "Unicorn: wide integer range");

/*
 *  Insure size_t range > int
 *  Strange code when a `size_t` object promotes to an `int`.
 */
static_assert(INT_MAX < SIZE_MAX, "Unicorn: narrow size_t");

/*
 *  Recommended practice, C11 §7.19 4
 */
static_assert(PTRDIFF_MAX <= LONG_MAX, "Unicorn: ptrdiff_t wider than long");
static_assert(SIZE_MAX <= ULONG_MAX, "Unicorn: size_t wider than unsigned long");

/*
 *  Insure range of integers within float
 */
static_assert(FLT_RADIX == 2 && sizeof(uintmax_t)*CHAR_BIT < FLT_MAX_EXP,
    "Unicorn: wide integer range");

// Additional code could #undef the various UNICODE_BWn

#endif /* UNICORN_H_ */

Test driver

#include "unicorn.h"
#include <stdio.h>

int main(void) {
  printf("Hello World!\n");
  return 0;
}
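The driver above only confirms that the header compiles. To gain some confidence in the UNICODE_BW bit-width macro itself, its result can be compared with a plain loop at run time. A minimal sketch: the macros are copied from the header, while unicorn_bit_width_loop is a hypothetical helper that is not part of the post.

```c
#include <assert.h>
#include <stdint.h>

/* Macros copied unchanged from unicorn.h. */
#define UNICODE_BW1(x) ((x) > 0x1u ? 2 : 1)
#define UNICODE_BW2(x) ((x) > 0x3u ? UNICODE_BW1((x)/0x4)+2 : UNICODE_BW1(x))
#define UNICODE_BW3(x) ((x) > 0xFu ? UNICODE_BW2((x)/0x10)+4 : UNICODE_BW2(x))
#define UNICODE_BW4(x) ((x) > 0xFFu ? UNICODE_BW3((x)/0x100)+8 : UNICODE_BW3(x))
#define UNICODE_BW5(x) ((x) > 0xFFFFu ? UNICODE_BW4((x)/0x10000)+16 : UNICODE_BW4(x))
#define UNICODE_BW6(x) ((x) > 0xFFFFFFFFu ? \
    UNICODE_BW5((x)/0x100000000)+32 : UNICODE_BW5(x))
#define UNICODE_BW(x) ((x) > 0xFFFFFFFFFFFFFFFFu ? \
    UNICODE_BW6((x)/0x100000000/0x100000000)+64 : UNICODE_BW6(x))

/* Hypothetical helper: number of bits needed to represent x,
 * counted by shifting until nothing is left. Returns 0 for x == 0,
 * whereas the macro reports 1 for that input. */
int unicorn_bit_width_loop(uintmax_t x) {
  int width = 0;
  while (x != 0) {
    width++;
    x >>= 1;
  }
  return width;
}
```

Calling both on UINTMAX_MAX and on a few values around powers of two (255u, 256u) is a cheap way to spot a mistyped constant in the macro chain.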

¹ C is very flexible, yet some of its allowances apply only to compilers that have not been in use for over 10 years. Compilers that rely on out-of-favor features (non-2's complement, non-power-of-2 bit width "bytes", non-binary floating-point, etc.) I'll call dinosaurs.

² C is very flexible for new platforms/compilers too. Some of these potential and theoretical compilers could employ very unusual features. I'll call these compilers unicorns. Should one appear, I would rather have code fail to compile than compile into errant code.

3 comments

@chux 2019-03-22 19:26:19

In addition to the fine answers by @Toby Speight and @Lundin, and a related FP question, I came up with some additional ideas/details.

Spelling

"Dinosuar" --> "Dinosaur".

ASCII or not

Could use a lengthy test of the execution character set C11 §5.2.1 3

A to Z
a to z
0 to 9
! " # % & ' ( ) * + , - . / : ; < = > ? [ \ ] ^ _ { | } ~
space character, 
  and control characters representing horizontal tab, vertical tab, and form feed.
some way of indicating the end of each line of text

Note that the 3 characters $, @ and `, as well as ASCII 127 and various control characters, are not mentioned above.

  static_assert(
      'A' == 65 && 'B' == 66 && 'C' == 67 && 'D' == 68 && 'E' == 69 && 'F' == 70
          && 'G' == 71 && 'H' == 72 && 'I' == 73 && 'J' == 74 && 'K' == 75
          && 'L' == 76 && 'M' == 77 && 'N' == 78 && 'O' == 79 && 'P' == 80
          && 'Q' == 81 && 'R' == 82 && 'S' == 83 && 'T' == 84 && 'U' == 85
          && 'V' == 86 && 'W' == 87 && 'X' == 88 && 'Y' == 89 && 'Z' == 90,
      "Dinosaur: not ASCII A-Z");
  static_assert(
      'a' == 97 && 'b' == 98 && 'c' == 99 && 'd' == 100 && 'e' == 101
          && 'f' == 102 && 'g' == 103 && 'h' == 104 && 'i' == 105 && 'j' == 106
          && 'k' == 107 && 'l' == 108 && 'm' == 109 && 'n' == 110 && 'o' == 111
          && 'p' == 112 && 'q' == 113 && 'r' == 114 && 's' == 115 && 't' == 116
          && 'u' == 117 && 'v' == 118 && 'w' == 119 && 'x' == 120 && 'y' == 121
          && 'z' == 122, "Dinosaur: not ASCII a-z");
  static_assert('0' == 48, "Dinosaur: not ASCII 0-9");  // 1-9 follow 0 by spec.
  static_assert(
      '!' == 33 && '"' == 34 && '#' == 35 && '%' == 37 && '&' == 38
          && '\'' == 39 && '(' == 40 && ')' == 41 && '*' == 42 && '+' == 43
          && ',' == 44 && '-' == 45 && '.' == 46 && '/' == 47 && ':' == 58
          && ';' == 59 && '<' == 60 && '=' == 61 && '>' == 62 && '?' == 63
          && '[' == 91 && '\\' == 92 && ']' == 93 && '^' == 94 && '_' == 95
          && '{' == 123 && '|' == 124 && '}' == 125 && '~' == 126,
      "Dinosaur: not ASCII punct");
  static_assert(
      ' ' == 32 && '\t' == 9 && '\v' == 11 && '\f' == 12 && '\n' == 10,
      "Dinosaur: not ASCII space, ctrl");
  static_assert('\a' == 7 && '\b' == 8 && '\r' == 13,
      "Dinosaur: not ASCII spaces");
  // Not 100% confident safe to do the following test
  static_assert('$' == 36 && '@' == 64 && '`' == 96,
      "Dinosaur: not ASCII special");

@Lundin 2019-03-11 15:41:45

  • I think that the (CHAR_BIT & (CHAR_BIT - 1)) == 0 test can be pretty safely replaced by CHAR_BIT == 8. There are various old DSP compilers that would fail the test, but they are indeed dinosaur systems.

  • stdint.h and constants like SIZE_MAX, PTRDIFF_MAX were added in C99. So by using such macros/constants, you'll essentially cause all C90 compilers to fail compilation.

    Are C90 compilers dinosaurs per your definition? If not, then maybe check whether __STDC_VERSION__ is defined and, if so, which version it reports, because most of the exotic ones are likely to follow C90.
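The two points above could be combined roughly as follows. This is a sketch and not a drop-in change to the header: UNICORN_HAVE_C99 is a hypothetical macro, and the checks use #if/#error so that they also work where static_assert is unavailable.

```c
#include <assert.h>   /* also present in C90 */
#include <limits.h>

/* Hypothetical macro: 1 when the compiler claims C99 or later.
 * Plain C90 compilers do not define __STDC_VERSION__ at all. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
  #define UNICORN_HAVE_C99 1
  #include <stdint.h>
#else
  #define UNICORN_HAVE_C99 0
#endif

/* C90-safe check: the <limits.h> macros are usable in #if. */
#if CHAR_BIT != 8
  #error "Dinosaur: CHAR_BIT is not 8"
#endif

/* C99-only check: SIZE_MAX comes from <stdint.h>. */
#if UNICORN_HAVE_C99
  #if SIZE_MAX <= INT_MAX
    #error "Unicorn: narrow size_t"
  #endif
#endif
```

With this layout a C90 compiler still gets the <limits.h>-based tests and simply skips the ones that depend on <stdint.h>.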

@Toby Speight 2019-03-11 11:07:14

I'm appalled! What kind of code are you writing that's so inflexible it needs all these tests? ;-p

Seriously, it ought to be possible to enable only the tests that the including code needs, perhaps by predefining macros that declare its non-portabilities:

#ifdef REQUIRE_BINARY_FP
static_assert(FLT_RADIX == 2, "Dinosuar: Non binary FP");
#endif

(to pick a simple example)


On an extremely minor note, in the comments you've consistently written "insure" where you evidently mean "ensure".


Additional tests to consider:

  • I've seen code that breaks if 'z' - 'a' != 25 and/or 'Z' - 'A' != 25.
  • Some code requires the existence of exact-width integer types such as uint32_t, which are not available on all platforms (it's possible this is covered by the power-of-two byte-width test, but I can't prove it).
  • Perhaps some code requires long double to be bigger (in precision and/or range) than double?
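These three suggestions might look like the following, assuming C11 or later so that <assert.h> supplies static_assert. It is only a sketch: REQUIRE_WIDE_LONG_DOUBLE is a hypothetical opt-in macro in the style of REQUIRE_BINARY_FP, and the messages are illustrative.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>

/* Letters contiguous within each case (true for ASCII, not for EBCDIC). */
static_assert('z' - 'a' == 25 && 'Z' - 'A' == 25,
    "Dinosaur: non-contiguous letters");

/* Exact-width types are optional; their limit macros are defined
 * only when the corresponding types exist. */
#if !defined(INT32_MAX) || !defined(UINT32_MAX)
  #error "Unicorn: no exact-width 32-bit integer types"
#endif

/* Opt-in, since several current platforms use the same format for
 * double and long double. Hypothetical macro name. */
#ifdef REQUIRE_WIDE_LONG_DOUBLE
static_assert(LDBL_MANT_DIG > DBL_MANT_DIG || LDBL_MAX_EXP > DBL_MAX_EXP,
    "Unicorn: long double no bigger than double");
#endif
```

Code that needs both extra precision and extra range would replace the || with &&.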
