Aliexpress OLED 16x02 not working/WS0012

I have the OLED_016002B datasheet. It uses the same IST0012 instruction set
as the Raystar RS0012 OLED that the OP had and that I now have as well.

Your first instruction doesn't make sense to me.
0x28 would be a function set for 4 bit mode and 2 lines.
Not something that would be done on the SPI interface.
Maybe the device ignores the Function Set DL bit when in SPI mode?
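
To show how I'm reading 0x28, here is a small decoder for the Function Set byte. It is only a sketch and it assumes the standard hd44780 bit layout (0 0 1 DL N F x x). The function name and the returned field names are mine, and it does not model whatever meaning the OLED chipset gives to the two low bits.

```python
def decode_function_set(instr):
    """Decode a Function Set byte using the standard hd44780 layout.

    Bit pattern: 0 0 1 DL N F x x  (bits 0 and 1 are ignored by the hd44780).
    """
    if instr & 0xE0 != 0x20:
        raise ValueError("not a Function Set instruction")
    return {
        "data_length_bits": 8 if instr & 0x10 else 4,  # DL bit
        "lines": 2 if instr & 0x08 else 1,             # N bit
        "font": "5x10" if instr & 0x04 else "5x8",     # F bit
    }


print(decode_function_set(0x28))
```

With this layout 0x28 decodes as a 4 bit interface with 2 lines, which is why it looks odd as the first instruction on an SPI connection.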

Anyway, the issue I'm having isn't initializing the device. I can do that,
I can display things properly on the screen and can even do dimming.
I use the same CMD2 Power on/off options as you did.
I also set the CMD2 RAM access control bits FTD1 and FTD0 to 0 as specified in the table on page 36.
On the Raystar display, the chip won't display anything if you don't do this.

The issue is initializing the device reliably when using 4 bit mode such that it ALWAYS works, no matter what state the device is in.

The initialization reliability issue is for pin control mode.
(both 4 bit and 8 bit mode will have issues)

They screwed up when they extended the instruction set.
As I said before, the hd44780 intentionally ignores bits 0 and 1 of the Function Set instruction. This is related to how to reliably initialize the device in 4 bit mode and also in 8 bit mode.

This chipset also made a fatal mistake by defining 0x03 as a command.
This is the "go to command table 2" (CMD2) mode instruction. This was a particularly bad choice.
The problem is that it breaks the reliable hd44780 initialization, which is very important when pin control is being used.
Here is why:
The hd44780 chipset has no way to reset the chip from software.
(IST0012 chipset also has no s/w reset)
This creates a problem, particularly when using 4 bit mode where the host can only control 4 data pins, since the host and the device can get out of nibble sync.
Once out of nibble sync, the host can no longer properly communicate with the device and it will never recover on its own.
The way to get them back in sync is to send a special sequence of commands/instructions that first reliably gets the device into 8 bit mode.
After that you can put the device into 4 bit mode.

This also applies to 8 bit/pin mode, but is mostly an issue for 4 bit mode where the host can only control 4 data pins.

The issue is that the upper nibble of the Function Set instruction to put the device into 8 bit mode is 0x3.
In order to reliably get the device into 8 bit mode, a 4 bit host will send this 0x3 nibble 3 times. This ensures that no matter what state the device was in, after the 3rd transfer it will be in 8 bit mode.
What the device does with the instructions from those 3 strobes of E can vary depending on what state it was in when it started, but it will always end up in 8 bit mode.
This device has issues because the way it has extended its instructions stops this sequence from always working.
i.e. you can't always initialize the device depending on its current state.
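
To make the "3 times 0x3" argument concrete, here is a toy model of a standard hd44780 interface as seen by a host that only drives D4-D7. It is a sketch under my own assumptions: the class and function names are made up, the undriven D0-D3 pins are modeled as reading 0, and the only instruction the model cares about is Function Set. It tries every start state (8 bit, 4 bit waiting for a high nibble, and 4 bit waiting for a low nibble with each possible pending high nibble) and checks where three strobes of 0x3 leave the device.

```python
class Hd44780Interface:
    """Toy model of the standard hd44780 bus interface (4 data pins wired)."""

    def __init__(self, eight_bit, pending_high=None):
        self.eight_bit = eight_bit
        self.pending_high = pending_high  # high nibble latched in 4 bit mode

    def strobe(self, nibble):
        """One pulse of E with `nibble` on D4-D7."""
        if self.eight_bit:
            # Assumption: the undriven D0-D3 pins read as 0.
            self._execute(nibble << 4)
        elif self.pending_high is None:
            self.pending_high = nibble
        else:
            instr = (self.pending_high << 4) | nibble
            self.pending_high = None
            self._execute(instr)

    def _execute(self, instr):
        # Function Set is 0 0 1 DL N F x x; only DL matters for this model.
        if instr & 0xE0 == 0x20:
            self.eight_bit = bool(instr & 0x10)


def start_states():
    """Every interface state the device could be in when the host resets."""
    yield Hd44780Interface(eight_bit=True)
    yield Hd44780Interface(eight_bit=False)
    for high in range(16):
        yield Hd44780Interface(eight_bit=False, pending_high=high)


def ends_in_8bit(dev):
    """Send the 0x3 nibble three times and report the final state."""
    for _ in range(3):
        dev.strobe(0x3)
    return dev.eight_bit and dev.pending_high is None


results = [ends_in_8bit(dev) for dev in start_states()]
print("all start states end in 8 bit mode:", all(results))
```

In this model every start state ends up in 8 bit mode with no half-finished instruction, which is the property the hd44780 sequence depends on.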

On this device, if the host and the device are out of nibble sync and the lower nibble of the last instruction sent was 0x0, that 0x0 becomes the upper nibble to the device. The first 0x3 of the initialization then completes a 0x03 instruction, and you end up going into command table 2 mode, which breaks the initialization.
And suppose the device is in CMD2 mode and the host was reset.
You would need to send a 0x00 instruction to the device to get it out of CMD2 mode and back to the standard commands to start the initialization.
But if the device and host are out of nibble sync, then you would need to send multiple nibbles: one to finish off the out of sync instruction and two for the 0x00 to exit CMD2 mode.
While blindly sending multiple 0x0 nibbles (say 3 or more) would get the device out of CMD2 mode, the host could then end up having sent a 0x0 that the device takes as the high order nibble of the next instruction.
And as soon as the host sends the 0x3 to start the reliable initialization, it would go right back into CMD2 mode.

The problem here is that if the device was waiting for a high nibble, you would need to send two 0x0 nibbles. If the device was waiting for a low nibble, you would need to send three 0x0 nibbles to get it out of CMD2 mode.
There is no way for the host to know what state the device is in.
And therein lies the problem.
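
Here is the same kind of toy model extended with the two behaviors described above: 0x03 enters CMD2 mode and 0x00 leaves it. Treat it as a sketch, not as the datasheet. The names are made up, the device is assumed to start in 4 bit mode and in CMD2 mode, and I assume that while in CMD2 mode no instruction other than 0x00 changes the interface state. It only tests the "blindly send N zero nibbles, then 0x3 three times" strategy, so it does not prove that no other sequence exists.

```python
class ExtendedInterface:
    """Toy model: hd44780-style interface plus CMD2 enter/exit instructions."""

    def __init__(self, pending_high=None, cmd2=True):
        self.eight_bit = False            # assumption: device is in 4 bit mode
        self.cmd2 = cmd2                  # True while command table 2 is active
        self.pending_high = pending_high  # high nibble already latched, if any

    def strobe(self, nibble):
        """One pulse of E with `nibble` on D4-D7."""
        if self.eight_bit:
            self._execute(nibble << 4)    # assumption: D0-D3 read as 0
        elif self.pending_high is None:
            self.pending_high = nibble
        else:
            instr = (self.pending_high << 4) | nibble
            self.pending_high = None
            self._execute(instr)

    def _execute(self, instr):
        if self.cmd2:
            if instr == 0x00:
                self.cmd2 = False         # 0x00 returns to the standard table
            return                        # assumption: nothing else matters here
        if instr == 0x03:
            self.cmd2 = True              # the problematic extension
        elif instr & 0xE0 == 0x20:        # Function Set: 0 0 1 DL N F x x
            self.eight_bit = bool(instr & 0x10)


def recovers(pending_high, zero_nibbles):
    """Send `zero_nibbles` 0x0 nibbles, then 0x3 three times."""
    dev = ExtendedInterface(pending_high=pending_high)
    for _ in range(zero_nibbles):
        dev.strobe(0x0)
    for _ in range(3):
        dev.strobe(0x3)
    return dev.eight_bit and not dev.cmd2 and dev.pending_high is None


# Column 2: device was waiting for a high nibble.
# Column 3: device was waiting for a low nibble (0x8 is an arbitrary pending high nibble).
for n in range(6):
    print(n, recovers(None, n), recovers(0x8, n))
```

In this model no single count of zero nibbles recovers both start states: the counts that work when the device was waiting for a high nibble fail when it was waiting for a low nibble, and the other way around.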

I'm still looking, but so far I have not seen a way to create a reliable initialization for this chipset.
IMO, they have totally broken the ability to create a reliable/bullet-proof initialization due to the way that they have defined their instruction set.

I think I probably need to just move on to the I2C mode and work on that.
While i2c isn't as fast, it is simpler to wire and the library can do things like auto discover the i2c address to make things "plug and play".
My concern is that these devices may not come strapped for i2c, and it takes some soldering of surface mount resistors to change that, which may be a bar too high for many users.

After i2c is working, I may come back to 4 bit mode, or perhaps not support it at all given its issues and that there looks to be no way to ever make it fully reliable.