I am performing arithmetic operations on integers in a checked scope in C#/.NET, in order to catch when an overflow happens. I want to find out if the overflow was positive or negative, in a short, smart, simple way, without a lot of special cases and checks depending on the operands or operation.
checked
{
    try
    {
        // The code below is an example operation that will throw an
        // overflow exception, which I expect to be a positive overflow.
        // In my real code, all arithmetic operations are inside this
        // block and can result in both positive and negative overflows.
        int foo = int.MaxValue;
        int bar = 1;
        foo += bar;
    }
    catch (OverflowException)
    {
        // I have found out that an overflow occurred,
        // but was it positive or negative?
    }
}
Can it be done? I found no information in the exception itself that could be used to find out.
CodePudding user response:
TL;DR:
I want to find out if the overflow was positive or negative, in a short, smart, simple way, without a lot of special cases and checks depending on the operands or operation.
You can't: C# doesn't expose that information because:
- CPUs today don't make it easy to detect overflow direction.
- While it can be done, the steps required to introspect the CPU's state post-mortem would wreck performance on modern superscalar processors.
- The alternative is to perform safety-checks before performing any arithmetic, but that also ruins performance (a sketch of what that looks like follows this list).
- And that's just x86/x64 alone. There are probably about a dozen radically different CPU ISAs that the .NET CLR now supports, and having to handle each of their own overflow/carry/sign idiosyncrasies, to ensure that C# programs all behave the same, and correctly, for your proposed "checked-with-overflow-direction", just isn't feasible.
- All of the per-ISA logic happens in .NET's JIT component, which is already a hideously complex beast some 21 years in the making now.
  - That single repo directory has 15 megabytes of C and C++ source code in it. That's a lot.
- There is very little value in knowing the direction of an arithmetic overflow. The important thing is that the system detected that an overflow happened, which means there's a bug in your code that you need to go fix. Because you'll have to reproduce the problem as part of normal debugging practice anyway, you'll be able to trace execution in full detail and capture every value and state - which is all you need to correct whatever underlying issue caused the overflow. Whereas knowing that minor detail of overflow direction during normal runtime execution helps... how?
- (That's a rhetorical question: I don't believe it significantly helps at all, and might even be a red-herring that just wastes your time)
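For completeness, here is a minimal sketch of what those per-operation safety checks look like in practice - exactly the kind of special-casing you said you wanted to avoid. The helper and enum names below are hypothetical, not part of .NET; the idea is simply to do the arithmetic in a wider type first and classify the result, instead of letting checked arithmetic throw:

enum OverflowDirection { None, Positive, Negative }

// Hypothetical helper: adds two ints and reports which way an overflow went.
// Widening to long is safe because a long can hold the sum of any two ints.
static OverflowDirection ClassifyAdd(int a, int b, out int result)
{
    long wide = (long)a + b;
    if (wide > int.MaxValue) { result = 0; return OverflowDirection.Positive; }
    if (wide < int.MinValue) { result = 0; return OverflowDirection.Negative; }
    result = (int)wide;
    return OverflowDirection.None;
}

You would need a sibling helper for every other operation (subtraction, multiplication, and so on), each with its own classification logic - which is precisely the pile of special cases the question hoped to dodge, and it throws away the performance benefit of letting the CPU's own flags do the work.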
Longer answer:
Problem 1: CPUs don't care about the direction of overflow
Some background: pretty much every microprocessor today has a set of special-function registers (aka CPU flags, aka status registers) which are often similar to these 4 in ARM:
- N - Negative flag
- Z - Zero flag
- C - Carry flag
- V - Signed overflow flag
And of course, the CS-theoretical basic design of an ALU (the bit that does arithmetic) is such that integer operations are the same, regardless of whether they're signed or unsigned, positive or negative (e.g. subtraction is addition with negative operands), and the flags by themselves don't automatically signal an error (e.g. the overflow flag is ignored for unsigned arithmetic, while the carry-flag is actually less significant in signed arithmetic than unsigned).
(This post won't explain what they represent or how they work, as I assume that you, my erudite reader, are already familiar with the basic fundamentals of computer integer arithmetic)
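To make that concrete, here's a tiny illustration - in C# rather than assembly, for convenience - of the "same bits, different interpretation" point. In an unchecked context the addition simply wraps, and whether that wrap counts as an overflow at all depends entirely on how you choose to read the resulting bit pattern:

using System;

class WrapDemo
{
    static void Main()
    {
        // The ALU just produces the wrapped bit pattern; "overflow" is purely
        // a matter of how we interpret those bits afterwards.
        int asSigned = unchecked(int.MaxValue + 1);   // wraps to int.MinValue
        uint asUnsigned = unchecked((uint)asSigned);  // the same 32 bits, read as unsigned

        Console.WriteLine(asSigned);    // -2147483648 : looks like it went "negative"
        Console.WriteLine(asUnsigned);  //  2147483648 : a perfectly ordinary unsigned value
    }
}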
Now, you might assume that in a checked block in a C#/.NET program the native machine code will check the status of these CPU flags after each and every arithmetic operation, to see if the immediately preceding operation caused a signed overflow or an unexpected bit-carry - and, if so, jump to the CLR's internal function that creates and throws the OverflowException.
...and to an extent that is what happens, except that surprisingly little useful information can realistically be gotten from the CPU. Here's why:
- In a C# checked block on x86/x64, the CLR's JIT inserts an x86/x64 jo [CORINFO_HELP_OVERFLOW] instruction after every arithmetic instruction that might overflow.
  - You can see it in this Godbolt example.
- CORINFO_HELP_OVERFLOW is the address of the native function JIT_Overflow, which (eventually) calls RealCOMPlusThrowWorker to throw the OverflowException.
- Note that the jo instruction is only capable of telling us that the Overflow flag was set: it doesn't expose or reveal the state of any of the other CPU flags, nor the signs of the instruction's operands, so jo cannot be used to tell whether the overflow was (to use your terminology) a "negative overflow" or a "positive overflow".
- So if a program wants more information than just "it overflowed, Jim", it will need to use CPU instructions that save/copy the rest of the CPU flags state into memory; and if those flags aren't sufficient to determine the direction of overflow, the JIT compiler will also have to retain copies of all arithmetic operands in memory somewhere - which in practice means drastically increasing your stack usage, or wasting CPU registers holding old values that you can't drop until the arithmetic operation succeeds.
...unfortunately the CPU instructions used to copy CPU flags to memory or other registers tend to wreck overall system performance:
Consider the sheer complexity of modern CPU designs, what with their superscalar, speculative and out-of-order execution, and other neat gizmos: modern CPUs work best when programs follow a predictable "happy path" that doesn't use too many awkward instructions that mess around with the CPU's internal state. So altering a program to be more introspective will harm more than just your own program's performance; it drags down the entire computer system. Oog.
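If you want to see this for yourself, a minimal sketch like the one below (the method name is made up) can be pasted into a disassembly tool such as the Godbolt example linked above: on x64 the checked addition becomes a plain add followed by a jo to the overflow helper, and nothing else - no record of the operands or of which direction the value escaped in.

// Minimal sketch: on x64 the JIT compiles the checked add to something like
//   add  <reg>, <reg>           ; the arithmetic itself
//   jo   CORINFO_HELP_OVERFLOW  ; taken only if the CPU's Overflow flag is set
// Nothing about the operands' signs survives past this point.
static int CheckedAdd(int a, int b)
{
    checked
    {
        return a + b;
    }
}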
- This comment from Rust contributor Tom-Phinney summarizes the situation well:
Instruction-level access to a "carry bit", so that the value can be used as an input to a subsequent instruction, was trivial to implement in the early days of computing when each instruction was completed before the next instruction was begun.
For a modern, out-of-order, superscalar processor implementation that cost/benefit is reversed; the cost in gates and/or instruction-cycle slowdown of the "carry-bit feature" far, far outweighs any possible benefit. That is why RISC-V, which is a state-of-the-art computer architecture whose expected implementations span the range from embedded processors of 10k gate complexity (e.g., RV32EC) to superscalar processors with 100x more gates, does not materialize an instruction-stream-synchronous carry bit.
Problem 2: Heterogeneity
The .NET CLR is ostensibly portable: .NET has to run on every platform Windows supports, and on other platforms as per the whims of Microsoft's C-levels and D-levels. Today it runs on x86/x64 and different varieties of ARM (including Apple Silicon); in the past it ran on Itanium, while the XNA build ran on the Xbox 360's PowerPC chip, and the Compact Framework supported SH-3/SH-4, MIPS, and I'm sure dozens of others. Oh, and don't forget that Silverlight had its own edition of the CLR - which ultimately became the basis for .NET Core and now .NET 5, which replaced .NET Framework 4.x - and Silverlight also ran on PowerPC back in 2007.
Or in list form, an off-the-top-of-my-head list of all the ISAs that official .NET CLR implementations have supported... that I can think of:
- x86/x64
- ARM / ARM-Thumb
- SH-3 (Compact Framework)
- SH-4 (Compact Framework)
- MIPS (Compact Framework)
- PowerPC (Silverlight 1.0 on PPC Mac, XNA on Xbox 360)
- Itanium IA-64
So that's a nice variety - I'm sure there are others I've forgotten, not to mention all the platforms that Mono supported.
What do all of these processors/ISAs have in common? Well, they all have their own ways of handling integer overflow - sometimes very different ways.
- For example, some ISAs (like MIPS) raise a hardware exception (like divide-by-zero) on overflow instead of setting a flag.
- While .NET is fairly portable already, the granddaddy of portability is probably the venerable C Programming Language: if there's an ISA out there then someone has certainly written a C compiler for it. Yet for all of C's life and history, from the early 1970s through to today (2022), it never featured support for checked arithmetic (signed overflow is simply UB), because doing so would be a lot of work for something not really needed in systems programming, which tends to use a lot of intentional unchecked overflows and bitwise operations.
- ...though C23 (for release in 2023) does (finally) add checked arithmetic to the standard library. It only took 50 years though...
- ...though of course C compilers were always free to add extensions to support checked arithmetic, but it was never a part of the portable C language.
- C programmers who needed it had to resort to gnarly (and performance-killing) workarounds involving validating every operand before each operation and aborting early, instead of performing the calculation and then checking CPU flags - again, because there's zero consistency in overflow handling across the myriad CPUs/archs that C supports (a sketch of that pre-validation pattern is at the end of this answer).
- ...so if C, of all programming languages, backed by all the major players and international standards organizations, had this much trouble with arithmetic overflow, then we really can't expect Microsoft to handle that degree of complexity - heck, I must say we're actually very lucky that we even have support for checked arithmetic at all, considering that .NET's progenitor, Java, didn't support checked arithmetic until Java 8 (and then only via a handful of static Math.*Exact helper methods), which also don't reveal the direction of overflow either.
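For illustration, here's what that "validate before you compute" pattern looks like - written in C# for consistency with the rest of this answer rather than C, and with a made-up method name. Each operation needs its own precondition, and the branch that fires is the only thing that tells you which direction the overflow would have been:

// Hypothetical pre-check in the classic C style: test the operands against
// the representable range *before* adding, so the addition itself can never
// overflow. The branch taken tells you the direction of the would-be overflow.
static bool TryAddChecked(int a, int b, out int sum)
{
    if (b > 0 && a > int.MaxValue - b)   // would overflow positively
    {
        sum = 0;
        return false;
    }
    if (b < 0 && a < int.MinValue - b)   // would overflow negatively
    {
        sum = 0;
        return false;
    }
    sum = a + b;   // now guaranteed not to overflow
    return true;
}

Subtraction, multiplication, division (int.MinValue / -1), and shifts each need their own, subtly different, precondition - which is exactly the per-operation special-casing that both the question and fifty years of C programmers were hoping to avoid.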