By Alex Gaynor


2008-11-21 21:36:45 8 Comments

I'm working on a programming language, and today I got the point where I could compile the factorial function(recursive), however due to the maximum size of an integer the largest I can get is factorial(12). What are some techniques for handling integers of an arbitrary maximum size. The language currently works by translating code to C++.

7 comments

@artificialidiot 2008-11-21 23:46:04

If I were implement my own language and want to support arbitrary length numbers, I will use a target language with the carry/borrow concept. But since there is no HLL that implements this without severe performance implications (like exceptions), I will certainly go implement it in assembly. It will probably take a single instruction (as in JC in x86) to check for overflow and handle it (as in ADC in x86), which is an acceptable compromise for a language implementing arbitrary precision. Then I will use a few functions written in assembly instead of regular operators, if you can utilize overloading for a more elegant output, even better. But I don't expect generated C++ to be maintainable (or meant to be maintained) as a target language.

Or, just use a library which has more bells and whistles than you need and use it for all your numbers.

As a hybrid approach, detect overflow in assembly and call the library function if overflow instead of rolling your own mini library.

@John D. Cook 2008-11-21 23:03:11

If you want to roll your own arbitrary precision library, see Knuth's Seminumerical Algorithms, volume 2 of his magnum opus.

@Steve Gilham 2009-10-04 09:05:07

+1 for Knuth -- otherwise it's possible you'll miss out the rare corner case for division.

@Bill K 2008-11-21 22:46:14

If you're building this into a language (for learning purposes I'd guess), I think I would probably write a little BCD library. Just store your BCD numbers inside byte arrays.

In fact, with today's gigantic storage abilities, you might just use a byte array where each byte just holds a digit (0-9). You then write your own routine to add, subtract multiply and divide your byte arrays.

(Divide is the hard one, but I bet you can find some code out there somewhere.)

I can give you some Java-like psuedocode but can't really do C++ from scratch at this point:

class BigAssNumber {
    private byte[] value;

    // This constructor can handle numbers where overflows have occurred.
    public BigAssNumber(byte[] value) {
        this.value=normalize(value);
    }

    // Adds two numbers and returns the sum.  Originals not changed.
    public BigAssNumber add(BigAssNumber other) {
        // This needs to be a byte by byte copy in newly allocated space, not pointer copy!
        byte[] dest = value.length > other.length ? value : other.value;         

        // Just add each pair of numbers, like in a pencil and paper addition problem.
        for(int i=0; i<min(value.length, other.value.length); i++)
            dest[i]=value[i]+other.value[i];

        // constructor will fix overflows.
        return new BigAssNumber(dest);
    }

    // Fix things that might have overflowed  0,17,22 will turn into 1,9,2        
    private byte[] normalize(byte [] value) {
        if (most significant digit of value is not zero)
            extend the byte array by a few zero bytes in the front (MSB) position.

        // Simple cheap adjust.  Could lose inner loop easily if It mattered.
        for(int i=0;i<value.length;i++)
            while(value[i] > 9) {
                value[i] -=10;
                value[i+1] +=1;
            }
        }
    }
}

I use the fact that we have a lot of extra room in a byte to help deal with addition overflows in a generic way. Can work for subtraction too, and help on some multiplies.

@Bill K 2008-11-21 22:57:13

And if you're not doing it for school, just go grab some BigInteger code somewhere and use that.

@Alex Gaynor 2008-11-21 22:11:21

My prefered approach would be to use my current int type for 32-bit ints(or maybe change it to internally to be a long long or some such, so long as it can continue to use the same algorithms), then when it overflows, have it change to storing as a bignum, whether of my own creation, or using an external library. However, I feel like I'd need to be checking for overflow on every single arithmetic operation, roughly 2x overhead on arithmetic ops. How could I solve that?

@Bill K 2008-11-21 22:48:07

Don't worry too much about performance. Code it without any regard to performance, then go in and refactor stuff if you can't meet some benchmark.

@artificialidiot 2008-11-21 23:39:01

Yeah, you can even refactor a bubblesort to a mergesort... And you certainly want a good marketing image compared to others to sell your shrink wrapped boxes of your general purpose OO language to big boys. What? is it not general purpose?

@Clayton 2008-11-22 19:13:28

The problems you describe are why I suggested creating a new data type. C++, Java, etc. would not automatically convert a 16-bit int to 32 bits if a multiplication overflowed, so why should you. On the other hand, if this is a documented requirement, you'll just have to accept the performance hit.

@Clayton 2008-11-21 21:53:35

Other posters have given links to libraries that will do this for you, but it seem like you're trying to build this into your language. My first thought is: are you sure you need to do that? Most languages would use an add-on library as others have suggested.

Assuming you're writing a compiler and you do need this feature, you could implement integer arithmetic functions for arbitrarily large values in assembly.

For example, a simple (but non-optimal) implementation would represent the numbers as Binary Coded Decimal. The arithmetic functions could use the same algorithms as you'd use if you were doing the math with pencil and paper.

Also, consider using a specialized data type for these large integers. That way "normal" integers can use the standard 32 bit arithmetic.

@sellibitze 2009-10-04 09:24:25

What's people's obsession with BCD? Nodoby asked for it here.

@Adam Rosenfield 2008-11-21 21:42:31

There's no easy way to do it in C++. You'll have to use an external library such as GNU Multiprecision, or use a different language which natively supports arbitrarily large integers such as Python.

@sellibitze 2009-10-04 09:22:25

I think it's fairly easy. GMP comes along with a nice C++ header

@Barry Kelly 2008-11-21 21:41:18

If you need larger than 32-bits you could consider using 64-bit integers (long long), or use or write an arbitrary precision math library, e.g. GNU MP.

Related Questions

Sponsored Content

31 Answered Questions

[SOLVED] How do I detect unsigned integer multiply overflow?

27 Answered Questions

[SOLVED] Convert a string to an integer?

76 Answered Questions

[SOLVED] How do I iterate over the words of a string?

  • 2008-10-25 08:58:21
  • Ashwin Nanjappa
  • 2139021 View
  • 2886 Score
  • 76 Answer
  • Tags:   c++ string split

18 Answered Questions

[SOLVED] Generate random integers between 0 and 9

  • 2010-10-22 12:48:29
  • aneuryzm
  • 1765917 View
  • 1219 Score
  • 18 Answer
  • Tags:   python random integer

65 Answered Questions

[SOLVED] How do I generate random integers within a specific range in Java?

  • 2008-12-12 18:20:57
  • user42155
  • 3904418 View
  • 3361 Score
  • 65 Answer
  • Tags:   java random integer

27 Answered Questions

[SOLVED] How do you set, clear, and toggle a single bit?

23 Answered Questions

[SOLVED] How do I parse a string to a float or int?

24 Answered Questions

[SOLVED] Image Processing: Algorithm Improvement for 'Coca-Cola Can' Recognition

10 Answered Questions

8 Answered Questions

Sponsored Content