Longer Bit Contants, Please

Well, in a way, I appreciate the code, but that's not the solution I am asking for.

I asked for a small improvement to the Arduino pre-processor, which might only be 15 lines if done similarly to the code above, in the interest of good code legibility and memory savings.

I don't think it's a valid argument to hide behind which version of C/C++ Arduino is using, because I seriously doubt that those standard libraries and syntax include pinMode, digitalWrite, and so forth.

Sometimes people use certain bits for certain things. These things are micro-controllers. We also make masks, and it's easier to see your binary mask if it is written in binary rather than hex.

Has anyone seen a CRC explanation done entirely in Hex?

Unfortunately, without optimizations (-O0), this code fails to compile. In C++20, you could use consteval instead of constexpr, which would work in this case (because it's always evaluated at compile time, it doesn't have to generate a version of the function that could be called at runtime).

Instead of calling the function B it could be nice to use a user-defined literal:

consteval uint32_t operator""_2 (const char *s, size_t len) {
  uint32_t n = 0;
  while (len --> 0) {
    if (*s == '1') {
      n = (n<<1) + 1;
    } else if (*s == '0') {
      n = n<<1;
    } else {
      assert(*s == ' ' || *s == '_');
    }
    ++s;
  }
  return n;
}

This allows you to write:

"0101 1100 1010 0011"_2

Run this code online: Compiler Explorer

How would it save any memory?

This is completely different from functions like pinMode: those are just ordinary C++ functions, not special keywords defined by the C++ standard. The “Arduino Core” is simply a collection of standard C++ functions and classes.
What you're asking for is a much more fundamental change to the actual C++ syntax.

Arduino uses a standard C++ compiler, so you have to work with the version of C++ that is used by the Arduino configuration.
A further diversion from standard C++ by changing the Arduino preprocessor would be a terrible idea and would only cause confusion.
In hindsight I think even the basic function prototype generation it does right now was a mistake. It's not that hard to teach beginners to declare their functions before using them, just like for variables, and the slight advantage in beginner-friendliness is annulled by the fact that it completely breaks down as soon as you start defining types that are used in these function definitions.
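To illustrate the point about types, here is a minimal sketch of declaring things by hand in an order the compiler accepts. The names `Reading`, `scale` and `doubled` are made up for this example, and the block is plain C++ rather than a complete Arduino sketch.

```cpp
#include <cstdint>

// Hypothetical type defined in the sketch itself.
struct Reading {
  std::uint16_t raw;
  std::uint8_t channel;
};

// Hand-written prototype, placed after the type it mentions.
constexpr std::uint32_t scale(Reading r, std::uint32_t factor);

// scale() can now be called before its definition appears.
constexpr std::uint32_t doubled(Reading r) { return scale(r, 2); }

// The definition itself can live further down the file.
constexpr std::uint32_t scale(Reading r, std::uint32_t factor) {
  return r.raw * factor;
}
```

The only rule to remember is the same one that applies to variables: a name has to be declared before its first use.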

If you're interested in the reason why the C++ committee decided to use apostrophes instead of spaces, see c++ - Why was the space character not chosen for C++14 digit separators? - Stack Overflow.

Personally, I prefer the look of apostrophes over spaces, but even if it were the other way around, I would prefer that Arduino stick to the standard: the tiny change in readability wouldn't make up for the inconvenience of creating a new non-standard C++ dialect.

Sure, that's why GCC and C++14 support the 0b prefix for integer literals. If you want a separator between nibbles, use apostrophes and C++14.
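As a small sketch of what that looks like in practice, assuming the compiler is run in C++14 mode or later (the constant names are invented for this example):

```cpp
#include <cstdint>

// Binary literal with apostrophes between nibbles (needs C++14 or later).
constexpr std::uint16_t kMask = 0b0101'1100'1010'0011;

// The same value written in hex, for comparison.
constexpr std::uint16_t kMaskHex = 0x5CA3;

static_assert(kMask == kMaskHex, "both spellings denote the same value");

// Separators may go between any two digits, e.g. one per byte.
constexpr std::uint32_t kWide = 0b10101011'11001101'11101111;
static_assert(kWide == 0xABCDEF, "grouping does not change the value");
```

The apostrophes are ignored by the compiler, so the grouping is purely for the reader.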

If any Arduino devs are reading this: Please add this to the list of reasons for updating the platform files to C++17 (or at least C++14). C++14 removed many restrictions on constexpr functions, which is really useful, especially in an embedded context. And the binary literals with digit separators are a nice feature as well.
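For a rough idea of what the relaxed constexpr rules allow, here is a sketch of a function with a local variable and a loop that the compiler can evaluate on its own. `countSetBits` is a hypothetical helper written for this example, not part of any Arduino API, and the block assumes C++14 or later.

```cpp
#include <cstdint>

// C++14 allows local variables and loops inside constexpr functions.
constexpr unsigned countSetBits(std::uint32_t value) {
  unsigned count = 0;
  while (value != 0) {
    count += value & 1u;  // add the lowest bit
    value >>= 1;          // move on to the next bit
  }
  return count;
}

// Checked by the compiler; nothing here runs on the microcontroller.
static_assert(countSetBits(0b1010'1101) == 5, "five bits are set");
```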

My understanding was that, at the time the macros were created, GCC did not have binary literals.

The note here indicates they were added in GCC 4.3. I don't know the history of the GCC versions used by the Arduino IDE in the olden times, though.

As you can see here, there has been a concerted effort to deprecate those macros and to replace their use in the documentation:

The preprocessor currently only does two things to the code:

  • Generate function prototypes
  • Add #include <Arduino.h>

So adding one more is actually a very significant increase in the scope, and also opens the door to further deviations from real C++. The things that might seem simple at a glance can turn out to be very complex once you start taking into account all the corner cases. A huge amount of work has gone into the function prototype generation system, with at least three complete reworkings, yet it still fails in some cases.

Once again, I am asking for a small change in the compilers.

The current situation is not that dissimilar to the poorly made Internet forms that cannot parse your credit card number if it has spaces.

Like it or not, Arduino's non-standard language became its own language. Either it will remain a living language--or become a dead one.

The Arduino "language" is not non-standard; rather, it provides extensions to standard C++ to suit its use case. As such, it will not die.

I doubt there is such a thing as a small compiler change. This will affect all developers in the Arduino community. It will allow them to write code that will be incompatible with compilers without that change. It might also break something else and therefore will likely require some serious testing.

If you do not like the way the compiler works, I suspect you could create your own version of the compiler or some pre-processor/ code generator.

I would vote against anything like this

You could do what generations of developers did with old versions of the compiler and just type in the readable format as a comment…
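A minimal sketch of that comment approach, with constant names invented for the example. It works with any C++ version, at the cost of keeping the comment in sync by hand:

```cpp
#include <cstdint>

// Hex in the code, grouped binary in the comment (kept in sync by hand).
constexpr std::uint8_t kEnableMask = 0xA5;  // 1010 0101
constexpr std::uint16_t kStatus = 0x0309;   // 0000 0011 0000 1001
```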

We don’t need to depart from the standard language and compiler. When Arduino moves to a more recent C++ version, you’ll get what the standard allows for.

This requires an explanation… care to explain?

.... somewhere during runtime;

myByte = myDeCipher( "my formatted text" );

and yeah -- been there, done that for user-IO.

Now that the functions and macros have been provided, I'll cast black magic.
Not highly recommended. :crazy_face:

unsigned long operator"" _isBin(const char* c,size_t s){unsigned long x=1,y=0,z=0;
while(y++<s){switch ((char)(*(c+s-y))-0x30){case 1:z+=x;case 0:x<<=1;}}return z;}

void setup()
{
  Serial.begin(9600);
  Serial.println("1010 1011 1100 1101 1110 1111"_isBin, HEX);
  Serial.println("  11 0000 0011 1001"_isBin, DEC);
}

void loop() {}

EDIT: It's a joke. :roll_eyes:

chrisknightley, I guess you didn't read what I wanted.

I wanted the ability to separate binary numbers by either the byte or nibble, when setting variables and constants--as a means of increasing readability.

Arduino's syntax and usage are not entirely standard. Arduino doesn't require pointers, which deviates from C. Also, looking at the documentation, there is very little information on structs, likely because the powers that be (you) want people to fall into OOP, which is not always best for the bit twiddling that happens on micro-controllers.

Syntactically, many of the methods visible to the Arduino programmer could have been naturalized to C syntax. Arduino as a language, feels like it was intended to be C-like, but then people took it over, and tried to make it something else.

I am just a lowly person who asked for a small change in the compilers to remove spaces from bitfield byte constants and variables.

That's exactly what his example does: the nibbles and bytes of his binary constants are separated by spaces.

You have already been given multiple possible solutions, the easiest one being simply upgrading to C++14 and using apostrophes like 0b1010'1101'0011'1001.

If you're not satisfied with that, you have two options: either write a custom preprocessor that you apply to your Arduino sketches before compilation, or submit a proposal to the C++ standard committee.
Either way, you'll have to find a way to modify the C++ grammar in such a way that doesn't break existing C++ code and that doesn't cause any parsing ambiguities. It's an interesting exercise if you're interested in formal grammars and compilers.

Like J-M-L, I would absolutely vote against Arduino departing from standard/GNU C++.
And even if that would be an option, it would be much more helpful to present a feasible solution rather than just demanding that the compiler be changed to suit your personal preferences.

Arduino's syntax is standard C++ syntax, compiled by the popular GNU C++ compiler.
Arduino documentation doesn't focus on pointers and structs, because those are already covered in the C++ documentation.

Again, Arduino is not a language. Arduino code is C++ code with a hidden #include <Arduino.h> at the top and a preprocessor that hoists function prototypes for you. That's it. The syntax is exactly the C++ syntax. This has already been explained by Pert in Longer Bit Contants, Please - #24 by pert. It feels “C-like” because C++ was intended to be “C-like”.

If you think that Arduino hasn't become a language of its own, you are kidding yourself. The fact that the Teensy uses the same syntax as the Arduino should be proof enough.

It doesn't make any sense to ignore apostrophes but not spaces. You are that person writing the credit-card form that cannot have spaces. I find it hard to believe that you are championing the cause of messy code. LOL!

I wanted more readable code.

Arduino and Teensy both use the same C++ syntax. Just open the arduino-1.8.15/hardware/teensy/avr/platform.txt file and you can see exactly how your sketch is compiled: using a GNU C++ compiler.

In fact, I compile all my Arduino libraries using standard GNU and Clang C++ compilers to run my unit tests on my laptop and GitHub CI servers. Apart from the #include <Arduino.h> and the function hoisting, there's nothing separating the “Arduino language” from C++.

I'm sure the standard committee had good reasons to choose apostrophes over spaces.
Spaces are already used for tokenization, overloading them for such a specific use case unnecessarily complicates the parsing process and might contribute to worse diagnostics for syntax errors.
IMO, apostrophes are a better choice than spaces for this purpose anyway. If you disagree, that's a personal preference, but in that case I could return your remark and say “it doesn't make any sense that you're okay with spaces but not apostrophes”.

This is a programming language with strict syntax rules, not a website with broad accessibility requirements.
I'm advocating for uniform syntax and coding style. If that means putting aside personal aesthetic preferences, so be it.

Messiness is subjective: I do find i = 0b1010'1101'0011'1001 + 0b1010'1101'0011'1001 less messy than i = 0b1010 1101 0011 1001 + 0b1010 1101 0011 1001. It's your right to disagree, but you also have to put it into perspective and take into account the fragmentation such a meaningless change would result in, and accept the possibility of other people having different opinions on the matter.

If you are confident that you can convince the standard committee of the importance of allowing spaces as digit separators, you can write a language proposal and have them vote on it. Otherwise, I think there are more important issues to be addressed and code to be written, so any further bikeshedding over a single character is a waste of time.

I disagree. Byte also is not a standard C or C++ statement, but it looks better than its equivalent.

It would seem that if it were up to the people so far replying in this thread, nothing ever would have improved to this point. I guess that it was so difficult to learn the first time that the slightest change would just blow everyone's minds.

When doing actual programming down to the bit and byte level, it helps to be able to clearly see the value in the variable, because each bit has a meaning.

Let's look at some code, before:

After, excusing the green highlighting, where the interface doesn't also understand what I am doing:

Look at your credit card, you see number grouping. Look at phone numbers. Look at internet addresses. You will see delimiters.

The processors themselves have delimiters--breaking the bits into bytes, unless you haven't noticed that. LOL!

It's not a statement, it's a type alias, which is just plain C++, no magic involved:
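A minimal sketch of such an alias, assuming `byte` names an 8-bit unsigned integer; the exact declaration in the Arduino core may be spelled differently (for example with `typedef`):

```cpp
#include <cstdint>
#include <type_traits>

// Assumed form of the alias; the Arduino core may use a typedef instead.
using byte = std::uint8_t;

// An alias introduces no new type: byte and std::uint8_t are the same type.
static_assert(std::is_same<byte, std::uint8_t>::value,
              "byte is just another name");

// It is used like any other integer type.
constexpr byte kFlags = 0b1010'0101;
```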

Why do you insist that it is somehow a separate language? It isn't.

You're in luck! Because that's exactly the problem the standard committee solved by introducing binary integer literals with optional apostrophes as digit separators.

If you're not happy with the character they chose, you should address your concerns to them, arguing about it with random people on the internet is a waste of time.

It’s C++, not C. You can code in both without pointers if you want, but they are part of the language. In C++ you can also use references, which helps avoid pointers.

Plus all the answers from other members

All the arguments you have actually work against your claim… it’s C++, period.

Maybe you need to reread the explanation of what a programming language is, as you seem to confuse the language itself with the tools, libraries and associated (limited) documentation provided by the Arduino company to make it easier to start something on a board.

I thought that was the case.

unsigned long operator"" _isBin(const char* c,size_t s){unsigned long x=1,y=0,z=0;
while(y++<s){switch ((char)(*(c+s-y))-0x30){case 1:z+=x;case 0:x<<=1;}}return z;}

#define defA 0b101010111100110111101111
#define defB " 1010 1011 1100 1101 1110 1111 "_isBin

const unsigned long cstA = 0b000000000011000000111001;
const unsigned long cstB = " 0000 0000 0011 0000 0011 1001 "_isBin;

void setup()
{
  int varA = 0b0000001100001001;
  int varB = " 0000 0011 0000 1001 "_isBin;
  
  Serial.begin(9600);
  if(defA == defB) Serial.println("def same");
  if(cstA == cstB) Serial.println("cst same");
  if(varA == varB) Serial.println("var same");
  Serial.println(defA, HEX);
  Serial.println(defB, HEX);
  Serial.println(cstA);
  Serial.println(cstB);
  Serial.println(varA);
  Serial.println(varB);
}
void loop() {}

I thought you wanted to put in spaces because a definition written as a 0b literal like these is hard to read.
Is that not it? If so, I'm sorry for misreading.

EDIT:
It looks like a #define of a string, but it's not.
It defines a numerical value, just like using "0b" does.
Of course it can be used for calculations, register values, etc.
Well...
But I don't set 32-bit registers in bit notation.
Also, in the case of 8-bit registers, I don't feel the need to separate them into nibbles...