A brief post on a long-standing omission in type checking in Pascal, and the limitations of Range Checking as applied to the problem.
Consider this contrived example of a simple function:
    function IsLessThan64K(aValue: Integer): Boolean;
    begin
      // $ffff is 65535, so "<=" is needed for "less than 65536"
      result := aValue <= Word($ffff);
    end;
This very simple function accepts an explicitly 32-bit Integer parameter and simply returns TRUE if the value passed is less than 64K (65536). Worth noting at this point is that Integer is 32-bit regardless of whether you are compiling for 32-bit or 64-bit platforms.
Now consider two Int64 variables, declared and initialised as follows:
    a, b: Int64;

    a := $7fff;              // 32767
    b := $7fffffffffffffff;  // 9223372036854775807
Now consider the results of the following calls to that IsLessThan64K() function, involving a and b:
    IsLessThan64K( a );
    IsLessThan64K( b );
    IsLessThan64K( b - a );
Somewhat surprisingly, given the dramatic discrepancy between the types of the variables and the type of the function parameter, these all compile without grumbling. But Pascal is strongly typed, so if the compiler is happy then all must be right with the world, surely ?
Well, let’s look at the results of these calls:
    IsLessThan64K( a );      // Returns TRUE - so far so good
    IsLessThan64K( b );      // Returns TRUE ... um, what ?
    IsLessThan64K( b - a );  // Also returns TRUE ... OMG THE COMPILER IS BROKEN!
That last comment is wrong. The compiler isn’t broken, it is just not being very helpful in this particular case.
Using the debugger, if we step into the last of these three calls we can see what value the function thinks it has been passed, and we find that this value is -32768 when the correct result of b - a should be 9223372036854743040.
Hmmm, maybe the compiler is broken after all ?
No. So what’s going on ?
If we examine these values in their hexadecimal notation, things become a little clearer:
                 Hexadecimal          Decimal
    a            $0000000000007fff    32767
    b            $7fffffffffffffff    9223372036854775807
    b - a        $7fffffffffff8000    9223372036854743040
                         $ffff8000    -32768
The 64-bit value passed to that function has simply been truncated to 32 bits, to fit into that parameter. As a result, the bit pattern of that massively large positive number is interpreted as a much smaller, negative number. Precisely the sort of problem that a strongly typed language is supposed to prevent, surely ?
So it may come as a surprise to some that all three of those calls will compile without any indication that the wrong type has been passed.
There is no warning or even so much as a hint about the inappropriate attempt to pass a 64-bit value in that 32-bit parameter.
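The truncation described above can be reproduced with a small console program. This is only a sketch: the program name is made up for illustration, the `{$MODE DELPHI}` line is an assumption added so that Free Pascal also treats Integer as 32-bit, and the explicit Integer(...) casts are there purely to display the low 32 bits that the parameter ends up receiving.

```pascal
program TruncationDemo; // hypothetical name, for illustration only

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF} // assumption: keeps Integer 32-bit under Free Pascal
{$APPTYPE CONSOLE}
{$R-} // range checking off, which is the default

function IsLessThan64K(aValue: Integer): Boolean;
begin
  Result := aValue <= Word($ffff); // i.e. below 65536
end;

var
  a, b: Int64;
begin
  a := $7fff;
  b := $7fffffffffffffff;

  // An explicit cast keeps only the low 32 bits of the Int64,
  // mirroring what happens implicitly at the call site.
  WriteLn(Integer(b));      // low 32 bits are $ffffffff
  WriteLn(Integer(b - a));  // low 32 bits are $ffff8000

  // All three calls compile without a warning or hint.
  WriteLn(IsLessThan64K(a));
  WriteLn(IsLessThan64K(b));
  WriteLn(IsLessThan64K(b - a));
end.
```

With range checking off, the two casts should display the same negative values that the debugger shows inside the function, which is why every call reports TRUE.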
Range Checking: An Incomplete Solution
There is a compiler setting that can help in these circumstances: Range Checking.
First of all, be aware that this option is not on by default. Not even in debug builds. Ideally you would leave it disabled in release builds to avoid the performance overhead involved, but in debug builds it makes sense to enable it.
The overhead exists because, as suggested by the fact that this compiler option is grouped under “Runtime Errors”, it does not affect the compilation of code. Rather, it introduces additional checks at runtime to ensure that values are within the required ranges of the types involved.
This point bears emphasis: Range Checking applies to values, not to types.
So, for example, even with Range Checking enabled, the call involving a both compiles and runs without any problem whatsoever:
IsLessThan64K( a );
The range checking introduced by the compiler does not care that a is the wrong type. It merely checks the value, finds that it is in the range required for a 32-bit Integer and waves it through. In this contrived example of course a always has a value in the required range, but in a real-world scenario it may have any number of different values at different times in the execution of the code, only some of which may trigger a range check exception.
What you really want – and perhaps expect – from a strongly typed language, is for the compiler to tell you that the potential error exists in the first place.
Caveat Developor: It doesn’t.
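To make the value-versus-type distinction concrete, here is a rough sketch of the same calls with Range Checking switched on. The program name and the TryCall helper are invented for this example, and the `{$MODE DELPHI}` line is again an assumption for Free Pascal. It relies on the behaviour described above: the check happens at runtime, when the Int64 value is narrowed to the Integer parameter.

```pascal
program RangeCheckDemo; // hypothetical name, for illustration only

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF} // assumption: keeps Integer 32-bit under Free Pascal
{$APPTYPE CONSOLE}
{$R+} // range checking on

uses
  SysUtils; // provides ERangeError

function IsLessThan64K(aValue: Integer): Boolean;
begin
  Result := aValue <= Word($ffff); // i.e. below 65536
end;

// Hypothetical helper: makes the call and reports either the result
// or the range check exception raised while narrowing the argument.
procedure TryCall(const aLabel: string; aValue: Int64);
var
  lessThan: Boolean;
begin
  try
    lessThan := IsLessThan64K(aValue); // Int64 narrowed to Integer here
    WriteLn(aLabel, ' -> ', lessThan);
  except
    on E: ERangeError do
      WriteLn(aLabel, ' -> ', E.ClassName, ' raised at runtime');
  end;
end;

var
  a, b: Int64;
begin
  a := $7fff;
  b := $7fffffffffffffff;

  TryCall('a', a);          // value fits in an Integer, so the check passes
  TryCall('b', b);          // value does not fit
  TryCall('b - a', b - a);  // value does not fit
end.
```

Nothing here changes at compile time: the program builds just as quietly as before, and the problem only surfaces for the values that happen to be out of range when the code runs.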
Not Just an Int(64) Problem
This isn’t limited only to 32/64-bit integer types. Exactly the same issue also affects other ordinal types. e.g. Passing a Word in a Byte parameter, an Integer in a Word, etc.
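A Word passed in a Byte parameter shows the same thing on a smaller scale. As before this is just a sketch: the program and procedure names are made up, and the `{$MODE DELPHI}` line is an assumption for Free Pascal.

```pascal
program NarrowingDemo; // hypothetical name, for illustration only

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
{$R-} // range checking off, which is the default

// Hypothetical procedure taking an 8-bit parameter.
procedure ShowByte(aValue: Byte);
begin
  WriteLn(aValue);
end;

var
  w: Word;
begin
  w := $1234;
  ShowByte(w); // compiles quietly; only the low 8 bits ($34) fit in a Byte
end.
```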
Also, Not Just a Delphi Problem
Having quickly investigated other variants of Pascal, I found that this appears to be consistent behaviour, suggesting that perhaps it is some sort of hang-over from legacy Pascal compiler days.
Neither Oxygene (at least when compiling for .NET) nor FreePascal complains about this sort of thing either. In the case of FreePascal this is particularly odd, since this compiler supports a warning which seems aimed at precisely this situation: Type size mismatch, possible loss of data / range check error.
For Oxygene/Delphi the question is why does no such warning exist ?
For FreePascal the question is if this code does not trigger that warning, what would ?
It is interesting to note that modern C and C++ compilers can produce warnings for such cases (GCC and Clang with -Wconversion, for example).
Similar QC Report #107447: http://qc.embarcadero.com/wc/qcmain.aspx?d=107447
It’s only been three years… I’m sure a fix is coming any day now….
The issue has nothing to do with 32/64 bit platforms/compilers. It applies equally to all ordinal types of differing sizes. In those terms it has been far more than 3 years. 🙂
The loss of precision should probably generate an optional warning. That would be a good QC ticket for a feature request.
As Shurshik has mentioned in his comment, a (similar) QC report already exists.
However, that QC raises the concern in the specific context of 32 > 64 bit migration, which is a red herring. This has nothing to do with the introduction of 64-bit platform support per se. The same issue affects all ordinal types. e.g. Passing a Word in a Byte parameter, an Integer in a Word, etc.
It’s also worth noting that it’s nothing new. As far as I can tell it has been this way since forever.
Note that enabling range checking will break a lot of library code and you will probably find it easier to selectively enable it in your units than to enable it for a project then modify every library that you include to not have it. Which is annoying because it is quite a useful warning… just not in this case.
Enable range checking everywhere and when you hit an error, fix the error rather than suppress range checking. That’s what I do anyway. Every now and again there are times when range and overflow checks should be suppressed. For instance rng, hash and similar types of algos.
Like I said, that gets frustrating when I have lots of library units that break with this enabled. Many popular libraries break, and often fixing them is a lot of work that has to be redone every time we get a new version. Pushing fixes upstream ranges from trivial to pointless (it took DevExpress 3 years to put in a one line obvious fix that I suggested, for example). IME some people are just opposed to range checking and regard requests to fix problems that it reveals as pointless whining.
But there are people like Primoz and Madshi who are very “you’re right, thanks for that, here’s a new version with the fix”, sometimes within 24 hours.
Although if you really want to see “disagree on the basics” look up “curl valgrind memory leak” some time.
Also, does Peganza (or any other static checker) pick this up?
Great to see you posting again!
Cheers Matthew.