Understanding Font File Syntax

Hello!

I am currently working on a project where I am writing custom code to read a font file (such as the Arial14 font that appears in many libraries). However, I'm having some issues with the character width data.

Here is Arial14.h as an example of what I'm about to explain (I had to post a web link instead of the file contents, as pasting it pushed the message over 11,000 characters):
https://forum.arduino.cc/index.php?action=dlattach;topic=279510.0;attach=103055

The font data makes perfect sense, and I can understand the firstchar variable and all that. What I was having trouble comprehending is the "char widths" section. I initially thought it tells the code how many bytes make up each letter (as anyone would presume), but at the time it seemed completely jumbled up and incomprehensible.

Symbol 32 (hex code 0x20) is the space character and hence has no width (seems odd, but OK). The problem presents itself as soon as you look at the second hex code in the "char widths" section. Why does it say 0x01 when there are 2 bytes for letter 33? And why is the next code along 0x03, referring to character 34 with 6 bytes? I am now thinking that whatever the value is for a character, you double it to get the number of bytes (unless something weird pops up later on). Why would this be? Why do you have to double the value to get the number of hex codes that define that character? Why not just have the correct value in the first place? It just seems like such a stupid way of defining the font. If there are 6 hex codes that define the character, shouldn't the char width hex code just be 0x06?
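To make the pattern concrete, here is what I'm seeing for the first few printable characters (width bytes taken from the "char widths" table, byte counts from counting the hex codes in the font data by hand):

    char 33 (0x21): width byte 0x01, 2 bytes of font data
    char 34 (0x22): width byte 0x03, 6 bytes of font data
    char 35 (0x23): width byte 0x08, 16 bytes of font data
    char 36 (0x24): width byte 0x07, 14 bytes of font data
    char 37 (0x25): width byte 0x0A, 20 bytes of font data

In every case the number of bytes is exactly double the width byte.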

I feel this may have something to do with a conversion between hex codes and some other base in some obscure library, but I don't see why that would require font files to be defined this way.

Also, I am 100% aware that reading from the font file DIRECTLY is considered highly unusual. This has proven to be a very unusual project.

This is not an urgent matter, as I can work with this. I would just like to know the reasoning behind the logic.

On a related note, does anyone know what "size" refers to (the very first value defined in the array)? I'm presuming it isn't actually too important when reading from the font.

Thanking you all in advance,
Andrey

pcbbc:
Because the height is 0x0E=14.
14 pixels require 14 bits to store them, so 2 bytes, with a couple of bits left over.
It seems the font format, presumably for speed/efficiency reasons, doesn't pack the bits of adjacent columns into the same byte, so any bits left over are wasted.
If the font height were more, say 0x12=18, but the width still 1, then there would (I assume) be 3 bytes of data.

Edit: I'm not sure why anyone would assume that something called "width" (and not "length" or "size") defined anything other than the width of the character in pixels. Unless of course you first made the assumption that the height of the character was 8 pixels (or less), which it isn't. But if it were, then in that very specific case only, it does follow that width == size.

static uint8_t Arial_14[] PROGMEM = {
    0x1E, 0x6C, // size
    0x0A, // width
    0x0E, // height
    0x20, // first char
    0x60, // char count
    
    // char widths
    0x00, 0x01, 0x03, 0x08, 0x07, 0x0A, 0x08, 0x01, 0x03, ...


    // font data
                                                    // 32 (0x20) 0 columns

    0xFE, 0x14,                                     // 33 (0x21) 1 column

    0x1E, 0x00, 0x1E, 0x00, 0x00, 0x00,             // 34 (0x22) 3 columns

    0x90, 0x90, 0xF8, 0x96, 0x90, 0xF8, 0x96, 0x90, 
    0x00, 0x1C, 0x00, 0x00, 0x1C, 0x00, 0x00, 0x00, // 35 (0x23) 8 columns

    0x18, 0x24, 0x22, 0xFF, 0x42, 0x42, 0x84, 0x08, 
    0x10, 0x10, 0x3C, 0x10, 0x08, 0x04,             // 36 (0x24) 7 columns

    0x1C, 0x22, 0x22, 0x1C, 0xC0, 0x30, 0x8C, 0x42, 
    0x40, 0x80, 0x00, 0x00, 0x10, 0x0C, 0x00, 0x00, 
    0x0C, 0x10, 0x10, 0x0C,                         // 37 (0x25) 10 columns

    0x80, 0x5C, 0x22, 0x62, 0x92, 0x0C, 0x80, 0x00, 
    0x0C, 0x10, 0x10, 0x10, 0x10, 0x0C, 0x08, 0x10, // 38 (0x26)  8 columns

    0x1E, 0x00,                                     // 39  (0x27) 1 column

    0xF0, 0x0C, 0x02, 0x1C, 0x60, 0x80,             // 40  (0x28) 3 columns


pcbbc, this is EXACTLY what I was looking for, thank you so much! That makes sense now. The character is considered 0x03 wide because that is how many pixels wide it is, not how many hex codes define it. It also explains why another font I was looking at seems to have 3 hex codes per column, not 2. Thank you for clearing that up for me!
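For what it's worth, that follows directly from the bytes-per-column count being the height rounded up to whole bytes, i.e. (height + 7) / 8 in integer maths:

    height  1 to 8  -> 1 byte per column  (the width == size case pcbbc mentioned)
    height  9 to 16 -> 2 bytes per column (Arial_14, height 14)
    height 17 to 24 -> 3 bytes per column (presumably the other font I mentioned)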