574

I know that an "undefined behaviour" in C++ can pretty much allow the compiler to do anything it wants. However, I had a crash that surprised me, as I assumed that the code was safe enough.

In this case, the real problem happened only on a specific platform using a specific compiler, and only if optimization was enabled.

I tried several things in order to reproduce the problem and simplify it to the maximum. Here's an extract of a function called Serialize, which takes a bool parameter and copies the string true or false into an existing destination buffer.

Were this function to appear in a code review, there would be no way to tell that it could, in fact, crash if the bool parameter were an uninitialized value.

// Zero-filled global buffer of 16 characters
char destBuffer[16];

void Serialize(bool boolValue) {
    // Determine which string to print based on boolValue
    const char* whichString = boolValue ? "true" : "false";

    // Compute the length of the string we selected
    const size_t len = strlen(whichString);

    // Copy string into destination buffer, which is zero-filled (thus already null-terminated)
    memcpy(destBuffer, whichString, len);
}

If this code is compiled with clang 5.0.0 with optimizations enabled, it can (and in my case does) crash.

The ternary operator boolValue ? "true" : "false" looked safe enough to me; I was assuming, "Whatever garbage value is in boolValue doesn't matter, since it will evaluate to true or false anyhow."

I have set up a Compiler Explorer example that shows the problem in the disassembly; here is the complete example. Note: in order to reproduce the issue, the combination I've found that works is Clang 5.0.0 with the -O2 optimisation level.

#include <iostream>
#include <cstring>

// Simple struct, with an empty constructor that doesn't initialize anything
struct FStruct {
    bool uninitializedBool;

   __attribute__ ((noinline))  // Note: the constructor must be declared noinline to trigger the problem
   FStruct() {};
};

char destBuffer[16];

// Small utility function that copies the string "true" or "false" into the destination buffer, depending on the value of the parameter
void Serialize(bool boolValue) {
    // Determine which string to print depending if 'boolValue' is evaluated as true or false
    const char* whichString = boolValue ? "true" : "false";

    // Compute the length of the string we selected
    size_t len = strlen(whichString);

    memcpy(destBuffer, whichString, len);
}

int main()
{
    // Locally construct an instance of our struct here on the stack. The bool member uninitializedBool is uninitialized.
    FStruct structInstance;

    // Output "true" or "false" to stdout
    Serialize(structInstance.uninitializedBool);
    return 0;
}

The problem arises because of the optimizer: it was clever enough to deduce that the strings "true" and "false" differ in length only by 1. So instead of really calculating the length, it uses the value of the bool itself, which should technically be either 0 or 1, and goes like this:

const size_t len = strlen(whichString); // original code
const size_t len = 5 - boolValue;       // clang clever optimization

While this is "clever", so to speak, my question is: Does the C++ standard allow a compiler to assume a bool can only have an internal numerical representation of '0' or '1' and use it in such a way?

Or is this a case of implementation-defined, in which case the implementation assumed that all its bools will only ever contain 0 or 1, and any other value is undefined behaviour territory?

17
  • 246
    It's a great question. It's a solid illustration of how undefined behavior isn't just a theoretical concern. When people say anything can happen as a result of UB, that "anything" can really be quite surprising. One might assume that undefined behavior still manifests in predictable ways, but these days with modern optimizers that's not at all true. OP took the time to create a MCVE, investigated the problem thoroughly, inspected the disassembly, and asked a clear, straightforward question about it. Couldn't ask for more. Commented Jan 10, 2019 at 2:04
  • 9
    Observe that the requirement that “non-zero evaluates to true” is a rule about Boolean operations including “assignment to a bool” (which might implicitly invoke a static_cast<bool>() depending on specifics). It is however not a requirement about the internal representation of a bool chosen by the compiler. Commented Jan 10, 2019 at 3:48
  • 2
    Comments are not for extended discussion; this conversation has been moved to chat. Commented Jan 11, 2019 at 12:28
  • 4
    On a very related note, this is a "fun" source of binary incompatibility. If you have an ABI A that zero-pads values before calling a function, but compiles functions such that it assumes parameters are zero-padded, and an ABI B that's the opposite (doesn't zero-pad, but doesn't assume zero-padded parameters), it'll mostly work, but a function using the B ABI will cause issues if it calls a function using the A ABI that takes a 'small' parameter. IIRC you have this on x86 with clang and ICC.
    – TLW
    Commented Jan 12, 2019 at 19:36
  • 3
    I'm reminded of one of the strangest C++ bugs I had recently. The code if(a && b) { ... } was behaving strangely, as if b was false when I thought it had to be true, and in desperation I added a debugging printout (using a debugger was inconvenient) to make it if(b) printf("b is true\n"); if(a && b) { ... }, which printed b is true even though the next test still acted as if b was false. Turned out b was 2, and gcc was emitting a test-low-bit instruction in one place, and a !=0 instruction in the other. Commented Oct 29, 2019 at 13:52

6 Answers 6

324

Yes, ISO C++ allows (but doesn't require) implementations to make this choice.

But also note that ISO C++ allows a compiler to emit code that crashes on purpose (e.g. with an illegal instruction) if the program encounters UB, e.g. as a way to help you find errors. (Or because it's a DeathStation 9000. Being strictly conforming is not sufficient for a C++ implementation to be useful for any real purpose). So ISO C++ would allow a compiler to make asm that crashed (for totally different reasons) even on similar code that read an uninitialized uint32_t. Even though that's required to be a fixed-layout type with no trap representations. (Note that C has different rules from C++; an uninitialized variable has an indeterminate value in C which might be a trap representation, but reading one at all is fully UB in C++. Not sure if there are extra rules for C11 _Bool which could allow the same crash behaviour as C++.)

It's an interesting question about how real implementations work, but remember that even if the answer was different, your code would still be unsafe because modern C++ is not a portable version of assembly language.


You're compiling for the x86-64 System V ABI, which specifies that a bool as a function arg in a register is represented by the bit-patterns false=0 and true=1 in the low 8 bits of the register (footnote 1). In memory, bool is a 1-byte type that again must have an integer value of 0 or 1.

(An ABI is a set of implementation choices that compilers for the same platform agree on so they can make code that calls each other's functions, including type sizes, struct layout rules, and calling conventions. In terms of the ISO C++ standard, an ABI-violating object-representation is called a trap representation, despite the CPU itself not directly trapping when running instructions on the bytes. Only leading to faults later due to violated software assumptions. In ISO C17, 6.2.6.1 #5 - Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined ... and goes on to say it's called a trap representation. I don't know if the same language is present in ISO C++.)

ISO C++ doesn't specify it, but this ABI decision is widespread because it makes bool->int conversion cheap (just zero-extension). I'm not aware of any ABIs that don't let the compiler assume 0 or 1 for bool, for any architecture (not just x86). It allows optimizations like compiling !mybool to xor eax,1 to flip the low bit (see Any possible code that can flip a bit/integer/bool between 0 and 1 in single CPU instruction), or compiling a&&b to a bitwise AND for bool types. Some compilers do actually take advantage of this; see Boolean values as 8 bit in compilers. Are operations on them inefficient?.

In general, the as-if rule allows the compiler to take advantage of things that are true on the target platform being compiled for, because the end result will be executable code that implements the same externally-visible behaviour as the C++ source. (With all the restrictions that Undefined Behaviour places on what is actually "externally visible": not what you see with a debugger, but what another thread can observe in a well-formed / legal C++ program.)

The compiler is definitely allowed to take full advantage of an ABI guarantee in its code-gen, and make code like you found, which optimizes strlen(whichString) to 5U - boolValue. (BTW, this optimization is kind of clever, but maybe shortsighted vs. branching and inlining memcpy as stores of immediate data (footnote 2).)

Or the compiler could have created a table of pointers and indexed it with the integer value of the bool, again assuming it was a 0 or 1. (This possibility is what @Barmar's answer suggested.)


Your __attribute((noinline)) constructor with optimization enabled led to clang just loading a byte from the stack to use as uninitializedBool. It made space for the object in main with push rax (which is smaller and, for various reasons, about as efficient as sub rsp, 8), so whatever garbage was in AL on entry to main is the value it used for uninitializedBool. This is why you actually got values that weren't just 0.

5U - random garbage can easily wrap to a large unsigned value, leading memcpy to go into unmapped memory. The destination is in static storage, not the stack, so you're not overwriting a return address or something.


Other implementations could make different choices, e.g. false=0 and true=any non-zero value. Then clang probably wouldn't make code that crashes for this specific instance of UB. (But it would still be allowed to if it wanted to.) I don't know of any implementations that choose anything other than what x86-64 does for bool, but the C++ standard allows many things that nobody does or even would want to do on hardware that's anything like current CPUs.

ISO C++ leaves it unspecified what you'll find when you examine or modify the object representation of a bool. (e.g. by memcpying the bool into unsigned char, which you're allowed to do because char* can alias anything. And unsigned char is guaranteed to have no padding bits, so the C++ standard does formally let you hexdump object representations without any UB. Pointer-casting to copy the object representation is different from assigning char foo = my_bool, of course, so booleanization to 0 or 1 wouldn't happen and you'd get the raw object representation.)

You've partially "hidden" the UB on this execution path from the compiler with noinline. Even if it doesn't inline, though, interprocedural optimizations could still make a version of the function that depends on the definition of another function. (First, clang is making an executable, not a Unix shared library where symbol-interposition can happen. Second, the definition is inside the class{} definition, so all translation units must have the same definition. Like with the inline keyword.)

So a compiler could emit just a ret or ud2 (illegal instruction) as the definition for main, because the path of execution starting at the top of main unavoidably encounters Undefined Behaviour. (Which the compiler can see at compile time if it decided to follow the path through the non-inline constructor.)

Any program that encounters UB is totally undefined for its entire existence. But UB inside a function or if() branch that never actually runs doesn't corrupt the rest of the program. In practice that means that compilers can decide to emit an illegal instruction, or a ret, or not emit anything and fall into the next block / function, for the whole basic block that can be proven at compile time to contain or lead to UB.

GCC and Clang in practice do actually sometimes emit ud2 on UB, instead of even trying to generate code for paths of execution that make no sense. Or for cases like falling off the end of a non-void function, gcc will sometimes omit a ret instruction. If you were thinking that "my function will just return with whatever garbage is in RAX", you are sorely mistaken. Modern C++ compilers don't treat the language like a portable assembly language any more. Your program really has to be valid C++, without making assumptions about how a stand-alone, non-inlined version of your function might look in asm.

Another fun example is Why does unaligned access to mmap'ed memory sometimes segfault on AMD64?. x86 doesn't fault on unaligned integers, right? So why would a misaligned uint16_t* be a problem? Because alignof(uint16_t) == 2, and violating that assumption led to a segfault when auto-vectorizing with SSE2.

See also What Every C Programmer Should Know About Undefined Behavior #1/3, an article by a clang developer.

Key point: if the compiler noticed the UB at compile time, it could "break" (emit surprising asm) the path through your code that causes UB even if targeting an ABI where any bit-pattern is a valid object representation for bool.

Expect total hostility toward many mistakes by the programmer, especially things modern compilers warn about. This is why you should use -Wall and fix warnings. C++ is not a user-friendly language, and something in C++ can be unsafe even if it would be safe in asm on the target you're compiling for. (e.g. signed overflow is UB in C++ and compilers will assume it doesn't happen, even when compiling for 2's complement x86, unless you use clang/gcc -fwrapv.)

Compile-time-visible UB is always dangerous, and it's really hard to be sure (with link-time optimization) that you've really hidden UB from the compiler and can thus reason about what kind of asm it will generate.

Not to be over-dramatic; often compilers do let you get away with some things and emit code like you're expecting even when something is UB. But maybe it will be a problem in the future if compiler devs implement some optimization that gains more info about value-ranges (e.g. that a variable is non-negative, maybe allowing it to optimize sign-extension to free zero-extension on x86-64). For example, in current gcc and clang, doing tmp = a+INT_MIN doesn't optimize a<0 as always-false, only that tmp is always negative. (Because INT_MIN + a can be at most INT_MIN + INT_MAX = -1, which is negative on this 2's complement target; a can't be any higher than INT_MAX.)

So gcc/clang don't currently backtrack to derive range info for the inputs of a calculation, only for the results, based on the assumption of no signed overflow: example on Godbolt. I don't know if this optimization is intentionally "missed" in the name of user-friendliness or what.

Also note that implementations (aka compilers) are allowed to define behaviour that ISO C++ leaves undefined. For example, all compilers that support Intel's intrinsics (like _mm_add_ps(__m128, __m128) for manual SIMD vectorization) must allow forming mis-aligned pointers, which is UB in C++ even if you don't dereference them. __m128i _mm_loadu_si128(const __m128i *) does unaligned loads by taking a misaligned __m128i* arg, not a void* or char*. Is `reinterpret_cast`ing between hardware SIMD vector pointer and the corresponding type an undefined behavior?

GNU C/C++ also defines the behaviour of left-shifting a negative signed number (even without -fwrapv), separately from the normal signed-overflow UB rules. (This is UB in ISO C++, while right shifts of signed numbers are implementation-defined (logical vs. arithmetic); good quality implementations choose arithmetic on HW that has arithmetic right shifts, but ISO C++ doesn't specify). This is documented in the GCC manual's Integer section, along with defining implementation-defined behaviour that C standards require implementations to define one way or another.

There are definitely quality-of-implementation issues that compiler developers care about; they generally aren't trying to make compilers that are intentionally hostile, but taking advantage of all the UB potholes in C++ (except the ones they choose to define) to optimize better can be nearly indistinguishable from hostility at times.


Footnote 1: The upper 56 bits can be garbage which the callee must ignore, as usual for types narrower than a register.

(Other ABIs do make different choices here. Some do require narrow integer types to be zero- or sign-extended to fill a register when passed to or returned from functions, like MIPS64 and PowerPC64. See the last section of this x86-64 answer which compares vs. those earlier ISAs.)

For example, a caller might have calculated a & 0x01010101 in RDI and used it for something else, before calling bool_func(a&1). The caller could optimize away the &1 because it already did that to the low byte as part of and edi, 0x01010101, and it knows the callee is required to ignore the high bytes.

Or if a bool is passed as the 3rd arg, maybe a caller optimizing for code-size loads it with mov dl, [mem] instead of movzx edx, [mem], saving 1 byte at the cost of a false dependency on the old value of RDX (or other partial-register effect, depending on CPU model). Or for the first arg, mov dil, byte [r10] instead of movzx edi, byte [r10], because both require a REX prefix anyway.

This is why clang emits movzx eax, dil in Serialize, instead of sub eax, edi. (For integer args, clang violates this ABI rule, instead depending on the undocumented behaviour of gcc and clang to zero- or sign-extend narrow integers to 32 bits. Is a sign or zero extension required when adding a 32bit offset to a pointer for the x86-64 ABI? So I was interested to see that it doesn't do the same thing for bool.)


Footnote 2: After branching, you'd just have a 4-byte mov-immediate, or a 4-byte + 1-byte store. The length is implicit in the store widths + offsets.

OTOH, glibc memcpy will do two 4-byte loads/stores with an overlap that depends on length, so this really does end up making the whole thing free of conditional branches on the boolean. See the L(between_4_7): block in glibc's memcpy/memmove. Or at least, go the same way for either boolean in memcpy's branching to select a chunk size.

If inlining, you could use 2x mov-immediate + cmov and a conditional offset, or you could leave the string data in memory.

Or if tuning for Intel Ice Lake (with the Fast Short REP MOV feature), an actual rep movsb might be optimal. glibc memcpy might start using rep movsb for small sizes on CPUs with that feature, saving a lot of branching.


Tools for detecting UB and usage of uninitialized values

In gcc and clang, you can compile with -fsanitize=undefined to add run-time instrumentation that will warn or error out on UB that happens at runtime. That won't catch uninitialized variables, though. (Because it doesn't increase type sizes to make room for an "uninitialized" bit.)

See https://developers.redhat.com/blog/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/

To find usage of uninitialized data, there's Address Sanitizer and Memory Sanitizer in clang/LLVM. https://github.com/google/sanitizers/wiki/MemorySanitizer shows examples of clang -fsanitize=memory -fPIE -pie detecting uninitialized memory reads. It might work best if you compile without optimization, so all reads of variables end up actually loading from memory in the asm. They show it being used at -O2 in a case where the load wouldn't optimize away. I haven't tried it myself. (In some cases, e.g. not initializing an accumulator before summing an array, clang -O3 will emit code that sums into a vector register that it never initialized. So with optimization, you can have a case where there's no memory read associated with the UB. But -fsanitize=memory changes the generated asm, and might result in a check for this.)

It will tolerate copying of uninitialized memory, and also simple logic and arithmetic operations with it. In general, MemorySanitizer silently tracks the spread of uninitialized data in memory, and reports a warning when a code branch is taken (or not taken) depending on an uninitialized value.

MemorySanitizer implements a subset of functionality found in Valgrind (Memcheck tool).

It should work for this case because the call to glibc memcpy with a length calculated from uninitialized memory will (inside the library) result in a branch based on length. If it had inlined a fully branchless version that just used cmov, indexing, and two stores, it might not have worked.

Valgrind's memcheck will also look for this kind of problem, again not complaining if the program simply copies around uninitialized data. But it says it will detect when a "Conditional jump or move depends on uninitialised value(s)", to try to catch any externally-visible behaviour that depends on uninitialized data.

Perhaps the idea behind not flagging just a load is that structs can have padding, and copying the whole struct (including padding) with a wide vector load/store is not an error even if the individual members were only written one at a time. At the asm level, the information about what was padding and what is actually part of the value has been lost.

14
  • 14
    Moreover, this also illustrates why the UB featurebug was introduced in the design of the languages C and C++ in the first place: because it gives the compiler exactly this kind of freedom, which has now permitted the most modern compilers to perform these high-quality optimizations that make C/C++ such high-performance mid-level languages. Commented Jan 11, 2019 at 7:04
  • 7
    And so the war between C++ compiler writers and C++ programmers trying to write useful programs continues. This answer, totally comprehensive in answering this question, could also be used as is as convincing ad copy for vendors of static analysis tools ...
    – davidbak
    Commented Jan 12, 2019 at 2:45
  • 6
    @The_Sympathizer: UB was included to allow implementations to behave in whatever ways would be most useful to their customers. It was not intended to suggest that all behaviors should be considered equally useful.
    – supercat
    Commented Jan 12, 2019 at 22:23
  • 3
    @Joshua: On some implementations, many forms of UB would by design crash outright with very high (sometimes 100%) probability. Reliable trapping of various erroneous actions would often impose a significant run-time performance penalty, but if one is e.g. performing load calculations for a highway bridge, an assurance that overflows could not have caused the program to produce erroneous results may be worth an increase in execution time, and the authors of the Standard would not have wanted to forbid such implementations.
    – supercat
    Commented Jan 14, 2019 at 15:54
  • 3
    @VioletGiraffe - the definition of "undefined behavior" in C++ is so loose it catches many programmers unaware from time to time. Certainly has caught me from time to time. I wish I could program as tightly to the C++ standard as you apparently always can without fail, but I can't. And to be fair to me, neither can anyone I've ever worked with and I've worked with top people.
    – davidbak
    Commented Jul 8, 2019 at 17:10
60

The compiler is allowed to assume that a boolean value passed as an argument is a valid boolean value (i.e. one which has been initialised or converted to true or false). The true value doesn't have to be the same as the integer 1 -- indeed, there can be various representations of true and false -- but the parameter must be some valid representation of one of those two values, where "valid representation" is implementation-defined.

So if you fail to initialise a bool, or if you succeed in overwriting it through some pointer of a different type, then the compiler's assumptions will be wrong and Undefined Behaviour will ensue. You had been warned:

50) Using a bool value in ways described by this International Standard as “undefined”, such as by examining the value of an uninitialized automatic object, might cause it to behave as if it is neither true nor false. (Footnote to para 6 of §6.9.1, Fundamental Types)

11
  • 11
    The "true value doesn't have to be the same as the integer 1" is kind of misleading. Sure, the actual bit pattern could be something else, but when implicitly converted/promoted (the only way you'd see a value other than true/false), true is always 1, and false is always 0. Of course, such a compiler would also be unable to use the trick this compiler was trying to use (using the fact that bools actual bit pattern could only be 0 or 1), so it's kind of irrelevant to the OP's problem. Commented Jan 10, 2019 at 2:08
  • 4
    @ShadowRanger You can always inspect the object representation directly.
    – T.C.
    Commented Jan 10, 2019 at 2:12
  • 7
    @shadowranger: my point is that the implementation is in charge. If it limits valid representations of true to the bit pattern 1, that's its prerogative. If it chooses some other set of representations, then it indeed could not use the optimisation noted here. If it does choose that particular representation, then it can. It only needs to be internally consistent. You can examine the representation of a bool by copying it into a byte array; that is not UB (but it is implementation-defined)
    – rici
    Commented Jan 10, 2019 at 2:28
  • 3
    Yes, optimizing compilers (i.e. real-world C++ implementation) often will sometimes emit code that depends on a bool having a bit-pattern of 0 or 1. They don't re-booleanize a bool every time they read it from memory (or a register holding a function arg). That's what this answer is saying. examples: gcc4.7+ can optimize return a||b to or eax, edi in a function returning bool, or MSVC can optimize a&b to test cl, dl. x86's test is a bitwise and, so if cl=1 and dl=2 test sets flags according to cl&dl = 0. Commented Jan 10, 2019 at 8:21
  • 7
    The point about undefined behavior is that the compiler is allowed to draw far more conclusions about it, e.g. to assume that a code path which would lead to accessing an uninitialized value is never taken at all, as ensuring that is precisely the responsibility of the programmer. So it’s not just about the possibility that the low level values could be different than zero or one.
    – Holger
    Commented Jan 10, 2019 at 10:47
53

The function itself is correct, but in your test program, the statement that calls the function causes undefined behaviour by using the value of an uninitialized variable.

The bug is in the calling function, and it could be detected by code review or static analysis of the calling function. Using your compiler explorer link, the gcc 8.2 compiler does detect the bug. (Maybe you could file a bug report against clang that it doesn't find the problem).

Undefined behaviour means anything can happen, which includes the program crashing a few lines after the event that triggered the undefined behaviour.

NB. The answer to "Can undefined behaviour cause _____ ?" is always "Yes". That's literally the definition of undefined behaviour.

8
  • 2
    Is the first clause true? Does merely copying an uninitialized bool trigger UB? Commented Jan 10, 2019 at 3:25
  • 12
    @JoshuaGreen see [dcl.init]/12 "If an indeterminate value is produced by an evaluation, the behaviour is undefined except in the following cases:" (and none of those cases have an exception for bool). Copying requires evaluating the source
    – M.M
    Commented Jan 10, 2019 at 3:34
  • 8
    @JoshuaGreen And the reason for that is that you might have a platform that triggers a hardware fault if you access some invalid values for some types. These are sometimes called "trap representations". Commented Jan 10, 2019 at 11:15
  • 8
    Itanium, while obscure, is a CPU that's still in production, has trap values, and has two at least semi-modern C++ compilers (Intel/HP). It literally has true, false and not-a-thing values for booleans.
    – MSalters
    Commented Jan 10, 2019 at 20:03
  • 3
    On the flip side, the answer to "Does the standard require all compilers to process something a certain way" is generally "no", even/especially in cases where it's obvious that any quality compiler should do so; the more obvious something is, the less need there should be for the authors of the Standard to actually say it.
    – supercat
    Commented Jan 10, 2019 at 21:23
23

A bool is only allowed to hold the implementation-dependent values used internally for true and false, and the generated code can assume that it will only hold one of these two values.

Typically, the implementation will use the integer 0 for false and 1 for true, to simplify conversions between bool and int, and make if (boolvar) generate the same code as if (intvar). In that case, one can imagine that the code generated for the ternary in the assignment would use the value as the index into an array of pointers to the two strings, i.e. it might be converted to something like:

// the compiler could make asm that "looks" like this, from your source
const static char *strings[] = {"false", "true"};
const char *whichString = strings[boolValue];

If boolValue is uninitialized, it could actually hold any integer value, which would then cause accessing outside the bounds of the strings array.

14
  • 1
    @SidS Thanks. Theoretically, the internal representations could be the opposite of how they cast to/from integers, but that would be perverse.
    – Barmar
    Commented Jan 10, 2019 at 2:09
  • 1
    You are right, and your example will also crash. However it is "visible" to a code review that you are using an uninitialized variable as an index to an array. Also, it would crash even in debug (for example some debugger/compiler will initialize with specific patterns to make it easier to see when it crashes). In my example, the surprising part is that the usage of the bool is invisible: The optimizer decided to use it in a calculation not present in the source code.
    – Remz
    Commented Jan 10, 2019 at 2:25
  • 3
    @Remz I'm just using the array to show what the generated code could be equivalent to, not suggesting that anyone would actually write that.
    – Barmar
    Commented Jan 10, 2019 at 2:28
  • 2
    @Havenard, int is likely to be bigger than bool so that wouldn't prove anything.
    – Sid S
    Commented Jan 10, 2019 at 4:11
  • 2
    @MSalters: std::bitset<8> doesn't give me nice names for all my different flags. Depending on what they are, that may be important. Commented Jan 11, 2019 at 15:13
17

Summarising your question a lot, you are asking Does the C++ standard allow a compiler to assume a bool can only have an internal numerical representation of '0' or '1' and use it in such a way?

The standard says nothing about the internal representation of a bool. It only defines what happens when casting a bool to an int (or vice versa). Mostly, because of these integral conversions (and the fact that people rely rather heavily on them), the compiler will use 0 and 1, but it doesn't have to (although it has to respect the constraints of any lower level ABI it uses).

So, the compiler, when it sees a bool is entitled to consider that said bool contains either of the 'true' or 'false' bit patterns and do anything it feels like. So if the values for true and false are 1 and 0, respectively, the compiler is indeed allowed to optimise strlen to 5 - <boolean value>. Other fun behaviours are possible!

As gets repeatedly stated here, undefined behaviour has undefined results. Including but not limited to

  • Your code working as you expected it to
  • Your code failing at random times
  • Your code not being run at all.

See What every programmer should know about undefined behavior

5

Does the C++ standard allow a compiler to assume a bool can only have an internal numerical representation of '0' or '1' and use it in such a way?

Yes indeed, and in case it's useful to anyone, here's another real-world example.

I once spent several weeks tracking down an obscure bug in a large codebase. There were several aspects that made it challenging, but the root cause was an uninitialized boolean member of a class variable.

There was a test with a complicated expression involving this member variable:

if(COMPLICATED_EXPRESSION_INVOLVING(class->member)) {
    ...
}

I began to suspect that this test was not evaluating "true" when it should. I don't remember whether it was not convenient to run things under a debugger, or if I didn't trust the debugger, or what, but I went for the brute-force technique of augmenting the code with some debugging printouts:

printf("%s\n", COMPLICATED_EXPRESSION_INVOLVING(class->member) ? "yes" : "no");

if(COMPLICATED_EXPRESSION_INVOLVING(class->member)) {
    printf("doing the thing\n");
    ...
}

Imagine my surprise when the code printed "no" followed by "doing the thing".

Inspecting the assembly code revealed that sometimes, the compiler (which was gcc) was testing the boolean member by comparing it to 0, but other times, it was using a test-least-significant-bit instruction. When things failed, the uninitialized boolean variable happened to contain the value 2. So, in machine language, the test equivalent to

if(class->member != 0)

succeeded, but the test equivalent to

if(class->member % 2 != 0)

failed. The boolean variable was literally true and false at the same time! And if that's not undefined behavior, I don't know what is!

1
  • It must be a qubit! Welcome to the 21st century! :-) Commented Mar 8, 2022 at 19:26
