What does switch-case really do?

I just ran into a strange issue with a switch-case that was not matching the case I expected it to. Here's some example code that shows the problem, tested on an Arduino Micro:

void setup() {
  Serial.begin(9600);
  while(!Serial);     
  
  char c = 0xE7;
  
  switch(c) {
    case 0xE7:
      Serial.println("We got E7!");
      break;
    default:
      Serial.println("Swing and a miss!");
      break;
  }

  switch((byte) c) {
    case 0xE7:
      Serial.println("We got byte E7!");
      break;
    default:
      Serial.println("Swing and a byte miss!");
      break;
  }
  
  switch(c) {
    case 0xFFE7:
      Serial.println("We got 16-bit integer FFE7!");
      break;
    default:
      Serial.println("Swing and a int miss!");
      break;
  }
}

void loop() {
}

This is the output from the above:
Swing and a miss!
We got byte E7!
We got 16-bit integer FFE7!

As far as I can tell, the first and last switch are upgrading the signed 8-bit value of the switch expression to signed 16-bit but it's not upgrading the values of the cases to match. Why would it prefer to use 16-bit ints? That seems odd for an 8-bit processor. What is it really doing behind the scenes?

I fear that the middle example is also using 16-bit integers but it's using unsigned expansion so that 0xE7 is converted to 0x00E7 in both the test expression and the case. If that is the case, then switch-case shouldn't be used in time-critical parts of the code on 8-bit Arduinos.

Why would it prefer to use 16-bit ints?

The avr-gcc folks decided int is 16 bits. In the absence of programmer intervention the compiler promotes smaller integer types to int.
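A minimal host-side sketch of that promotion, in case it helps. It assumes a signed 8-bit type stands in for avr-gcc's plain char; the names avr_char and promoted are made up for illustration. On a PC int is usually 32 bits rather than 16, but the promotion rule is the same.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for avr-gcc's plain char, made explicitly signed
// so the result does not depend on the host compiler's default.
using avr_char = std::int8_t;

// Arithmetic and comparisons on a small integer type are carried out in int.
static_assert(std::is_same<decltype(avr_char{} + 0), int>::value,
              "8-bit operand is promoted to int");

// Returns the int value a comparison or switch sees after promotion.
constexpr int promoted(avr_char c) { return c; }

// 0xE7 stored in a signed 8-bit variable promotes to -25, not 231.
static_assert(promoted(static_cast<avr_char>(0xE7)) == -25,
              "sign extension keeps the negative value");
```

The literal 0xE7 is already an int with value 231, so it never goes through this conversion.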

If that is the case, then switch-case shouldn’t be used in time-critical parts of the code on 8-bit Arduinos.

Impossible to say with your example. Because the “variable” c evaluates to a compile-time constant, my compiler optimizes away everything except the Serial.println calls.

This is setup

000000be <setup>:
  be:	26 e0       	ldi	r18, 0x06	; 6
  c0:	40 e8       	ldi	r20, 0x80	; 128
  c2:	55 e2       	ldi	r21, 0x25	; 37
  c4:	60 e0       	ldi	r22, 0x00	; 0
  c6:	70 e0       	ldi	r23, 0x00	; 0
  c8:	80 e6       	ldi	r24, 0x60	; 96
  ca:	91 e0       	ldi	r25, 0x01	; 1
  cc:	0e 94 5e 01 	call	0x2bc	; 0x2bc <_ZN14HardwareSerial5beginEmh>

  d0:	60 e0       	ldi	r22, 0x00	; 0
  d2:	71 e0       	ldi	r23, 0x01	; 1
  d4:	80 e6       	ldi	r24, 0x60	; 96
  d6:	91 e0       	ldi	r25, 0x01	; 1
  d8:	0e 94 b7 02 	call	0x56e	; 0x56e <_ZN5Print7printlnEPKc>

  dc:	62 e1       	ldi	r22, 0x12	; 18
  de:	71 e0       	ldi	r23, 0x01	; 1
  e0:	80 e6       	ldi	r24, 0x60	; 96
  e2:	91 e0       	ldi	r25, 0x01	; 1
  e4:	0e 94 b7 02 	call	0x56e	; 0x56e <_ZN5Print7printlnEPKc>

  e8:	62 e2       	ldi	r22, 0x22	; 34
  ea:	71 e0       	ldi	r23, 0x01	; 1
  ec:	80 e6       	ldi	r24, 0x60	; 96
  ee:	91 e0       	ldi	r25, 0x01	; 1
  f0:	0c 94 b7 02 	jmp	0x56e	; 0x56e <_ZN5Print7printlnEPKc>

A call to Serial.begin followed by three calls to Serial.println.

Does it use a proper variable if it’s declared as volatile?

Obviously the compiler is smarter than we usually give credit for. I didn’t realise it could look that far ahead in the code and optimise that much.

Making it volatile would be a more realistic test.
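Something along these lines is what I have in mind, written so it also builds on a PC. It is only a sketch: std::int8_t stands in for avr-gcc's signed plain char, and the names sample, plainSwitchHits, byteSwitchHits and checkAgainstReportedOutput are made up. What the compiler actually emits for it can only be settled by looking at the disassembly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the sketch's variable: signed 8-bit like
// avr-gcc's plain char, and volatile so every read is a real load.
volatile std::int8_t sample = static_cast<std::int8_t>(0xE7);

// First switch from the original sketch. The cast to int only spells out
// the promotion the compiler applies to the switch expression anyway.
bool plainSwitchHits() {
    switch (static_cast<int>(sample)) {
        case 0xE7: return true;   // the label is the int 231
        default:   return false;  // sample promotes to -25
    }
}

// Second switch: the value is converted to unsigned before promotion.
bool byteSwitchHits() {
    switch (static_cast<std::uint8_t>(sample)) {
        case 0xE7: return true;   // 231 on both sides
        default:   return false;
    }
}

// Mirrors the output reported in the first post.
void checkAgainstReportedOutput() {
    assert(!plainSwitchHits());
    assert(byteSwitchHits());
}
```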

It's nothing at all to do with switch/case. Try this test:

void setup ()
  {
  Serial.begin (115200);
  char c = 0xE7;
  if (c == 0xE7)
    Serial.println ("true");
  else
    Serial.println ("false");
  }  // end of setup
void loop () { }

Predictions, anyone?

MorganS:
Obviously the compiler is smarter than we usually give credit for. I didn't realise it could look that far ahead in the code and optimise that much.

Which is why I discourage people from "making code faster" by dropping into assembler. Probably, they are making it slower.

MorganS:
Does it use a proper variable if it's declared as volatile?

Extremely "proper". volatile forces the creation of a stack frame and forces c to be a stack variable. In other words, no optimization. Looks like a bit of dead code is even included.

Looks like the second comparison ((byte) c == 0xE7?) is 8 bit…

  ec:    89 81           ldd    r24, Y+1    ; 0x01
  ee:    87 3e           cpi    r24, 0xE7    ; 231
  f0:    19 f4           brne    .+6          ; 0xf8 <setup+0x3a>

The third comparison is 8 bit / identical…

 104:	89 81       	ldd	r24, Y+1	; 0x01
 106:	87 3e       	cpi	r24, 0xE7	; 231
 108:	19 f4       	brne	.+6      	; 0x110 <setup+0x52>

The first comparison is optimized away.

Very likely because the compiler is treating char as signed by default.

Ugh. I got that backwards. The compiler is treating char as signed by default. The first comparison is optimized away; the other two are performed at run-time. All comparisons are 8 bit.

Which brings up a very important safety tip: when the signedness matters, always use unsigned char (uint8_t) or signed char (int8_t), but never just char.
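A small sketch of what that tip looks like in practice. The function names isE7Unsigned and isE7Signed are hypothetical, and the code is written to build on a PC as well as on an AVR.

```cpp
#include <cstdint>

// Unsigned byte: 0xE7 stays 231 after promotion, so the bare literal matches.
constexpr bool isE7Unsigned(std::uint8_t b) { return b == 0xE7; }

// Signed byte: convert the literal to the same type before comparing,
// so both sides promote to -25.
constexpr bool isE7Signed(std::int8_t b) {
    return b == static_cast<std::int8_t>(0xE7);
}

static_assert(isE7Unsigned(std::uint8_t{0xE7}),
              "unsigned byte matches 0xE7");
static_assert(isE7Signed(static_cast<std::int8_t>(0xE7)),
              "signed byte matches the converted literal");
```

Either way the signedness is spelled out in the type, so the result no longer depends on what the compiler picks for plain char.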

What was your prediction? Did it match observed results?

I assume you're asking @MorganS. I have worked that horse corpse enough for one night.

This post seems to have dropped well away from the first page of Programming questions.

To briefly summarize: there is nothing wrong with switch/case. The confusion arises from the way a char is promoted to int before it is compared with a literal constant (which is already an int), as the C++ standard requires.

I may be mistaken w.r.t. Arduino C++, and don't have time to test it right now on an Arduino, but my understanding has always been that, at least in C++, the argument to a switch is NOT necessarily an int, but must be either an integral type or convertible to an integral type. The case labels will then be converted to that same integral type. So it is perfectly OK to use an unsigned int as the argument to a switch, but all the case labels will then be converted to unsigned int as well. Similarly, if the switch argument is signed, then all the case labels will be converted to signed ints. That is consistent with the examples in the OP.

Regards,
Ray L.

My code above in reply #5 does not use switch. In case you haven't tested it, it prints "false". The important lines are:

  char c = 0xE7;
  if (c == 0xE7)
    Serial.println ("true");
  else
    Serial.println ("false");

Thus this is nothing to do with switch arguments or case labels.

Let me explain it like this:

0xE7 is the same as 231 in decimal.

char c = 231;

Well, c can't hold 231 (the maximum value of a signed char is +127), so it becomes 231 - 256 = -25. So now c contains -25.

Then we compare:

  if (c == 0xE7)

That is:

  if (-25 == 231)

Well, no they are not the same, so the code prints "false".

char c = 0xE7;

switch(c) {
case 0xE7:
Serial.println("We got E7!");
break;

From the first post, " case 0xE7: " doesn't look like a char. And you can't put single quotes on it.

From the first post:

char c = 0xE7;

OK, c now contains -25.

 switch(c) {
    case 0xE7:

0xE7 is not -25.

I tried this on another computer and it would not even compile, even though that character is in the single-byte extended range 128-255:

void dumb()
{
    char c = 'ç' ;


    switch (c)
    {
        case 'a': printf("A");
         break;

        case 'ç' : printf("Stoopid french or turkish letter");
    }

}
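One possible reason, and this is an assumption since I don't have the error message: if the source file is saved as UTF-8, 'ç' is two bytes and no longer a single-character constant. A hex escape avoids the encoding question. The sketch below uses made-up names (classify, checkClassify) and is only meant to show that a char-typed constant such as '\xE7' can serve as a case label.

```cpp
#include <cassert>

// Returns 2 for the byte 0xE7, 1 for 'a', 0 otherwise.
int classify(char c) {
    switch (c) {
        case 'a':    return 1;
        case '\xE7': return 2;  // char-typed constant: -25 if char is signed,
                                // 231 if char is unsigned
        default:     return 0;
    }
}

// Hypothetical self-check: the variable and the label come from the same
// char-typed constant, so they agree whichever way char is signed.
void checkClassify() {
    char c = '\xE7';
    assert(classify(c) == 2);
    assert(classify('a') == 1);
}
```

Because both sides start out as char, they are promoted the same way and the comparison does not depend on the platform's choice of signedness.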

OK, so the example of if( -25 == 231 ) is a good way of showing why this comparison should not have worked in the first place.

Why does this work then?

 switch(c) {
    case 0xFFE7:
      Serial.println("We got 16-bit integer FFE7!");

If that were comparing -25 with 65511, it should fail. In the switch() example, it seems like the expression is being promoted to a 16-bit signed int and the too-large case value has wrapped to -25 as well.
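Here is a host-side sketch of that guess, as far as I can tell how it works out. It assumes avr-gcc's 16-bit int and unsigned int can be modelled with std::int16_t and std::uint16_t, and that the compiler wraps out-of-range values modulo 2^16 when converting to signed (GCC does). The names avr_int, avr_uint, promotedExpr and convertedLabel are made up.

```cpp
#include <cstdint>

// Hypothetical 16-bit stand-ins for avr-gcc's int and unsigned int.
using avr_int  = std::int16_t;
using avr_uint = std::uint16_t;

// The switch expression: a signed 8-bit value sign-extended to 16 bits.
constexpr avr_int promotedExpr(std::int8_t c) { return c; }

// The case label: 0xFFE7 does not fit in a 16-bit int, so it starts out
// unsigned (65511) and wraps modulo 2^16 when converted to signed.
constexpr avr_int convertedLabel(avr_uint label) {
    return static_cast<avr_int>(label);
}

static_assert(convertedLabel(0xFFE7) == -25, "65511 - 65536 == -25");
static_assert(promotedExpr(static_cast<std::int8_t>(0xE7)) ==
                  convertedLabel(0xFFE7),
              "both sides end up as -25, so the case matches");
```

On a machine with 32-bit int, 0xFFE7 fits in int as 65511 and the same case would not match, so this behaviour is specific to the 16-bit int.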