MegaTiny core support for micros() etc. with the inbuilt RTC

I've just been playing with the RTC on the ATtiny1614 with the @DrAzzy MegaTiny core as part of a satellite clock project with a view to using it [the RTC] to maintain time in periods of poor satellite coverage or a system power up.

The plan is to run the millis() timer on the RTC peripheral, which is a compiler option in the core. However, micros() etc. is then not supported*, presumably because of the limited, low-frequency oscillators that can be connected to the RTC. The consequence is that this prevents the use of some libraries, such as the LiquidCrystal libraries, which use delayMicroseconds() for small waits.

Here is a sample work around, in this case a private version of delayMicroseconds()

#define DELAY_CYCLES(n) __builtin_avr_delay_cycles(n)  
. . .
void LiquidCrystal_I2C::delayMicroseconds( uint16_t us ) {
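    delay( us / 1000 ) ;   // delay whole milliseconds
    us = us % 1000 ;       // remaining microseconds (0..999)
    // __builtin_avr_delay_cycles() wants a compile-time constant, and F_CPU * us
    // can overflow 32 bits, so burn one microsecond per pass instead.
    // Loop overhead makes each pass slightly long, so the wait errs long, not short.
    while ( us-- ) DELAY_CYCLES( F_CPU / 1000000UL ) ;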
}

For a rough approximation of micros(), if required, you'd have to start looking at the RTC timer registers (say RTC.CNT ).
Since the clock is running at 32768Hz "watch crystal" frequency, you'd get a resolution of only 32us
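
To make the arithmetic concrete, here is a rough sketch of the tick-to-microsecond conversion, assuming the counter really does tick at 32768 Hz. The helper name rtcTicksToMicros is made up for illustration and is not part of the core; on the target the argument would come from reading RTC.CNT.

```cpp
#include <cstdint>

// Hypothetical helper (not part of megaTinyCore): convert a count of
// 32768 Hz RTC ticks into microseconds.
// 1000000 / 32768 reduces to 15625 / 512, so one tick is about 30.5 us.
// The largest product, 65535 * 15625, still fits in 32 bits.
constexpr uint32_t rtcTicksToMicros(uint16_t ticks) {
    return (static_cast<uint32_t>(ticks) * 15625u) / 512u;
}
```

Since the 16-bit counter wraps every 2 seconds at this rate, anything built on top of this would also have to count overflows.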

*You could argue that micros() is anyway not supported properly on other AVR platforms because the resolution is 4us. However, it [micros] does at least exist.
Maybe, extending that logic further, micros() should also be available if clocks are running at "watch crystal" frequencies with the warning that the resolution is then a grim 32us.

delayMicroseconds does not rely on micros();

delayMicroseconds is a cycle counting busywait loop

Next release will make delayMicroseconds() a bit more accurate. I went over it and tweaked how it was organized... but yeah, RTC millis means no micros.

Even with millis disabled, however, delay() and delayMicroseconds() should still work.
delay() loses its normal trait of not being affected by interrupts that fire during the delay (as long as they don't run for so long that millis drops time), but it works.

delay() without millis is implemented as follows: if the value passed to it is a constant, it just calls _delay_ms() from avr-libc; otherwise it calls _delay_ms(1) however many times were specified.

delayMicroseconds() with a constant argument will use _delay_us() (from avr-libc again); if the value is not constant, it will instead use our implementation of it.
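
A rough desktop sketch of that constant-versus-variable dispatch, assuming GCC or Clang. The names sketchDelayMicroseconds, stubConstantDelayUs, stubVariableDelayUs and demoBothCalls are made up for illustration; this is not the core's code, and the stubs merely stand in for _delay_us() and the variable-delay loop.

```cpp
#include <cstdint>

// Desktop stand-ins so the sketch builds without avr-libc. On the target these
// would be _delay_us() and the core's own variable-delay loop.
static uint32_t constantPathUs = 0;   // microseconds requested via the constant path
static uint32_t variablePathUs = 0;   // microseconds requested via the variable path
static void stubConstantDelayUs(uint16_t us) { constantPathUs += us; }
static void stubVariableDelayUs(uint16_t us) { variablePathUs += us; }

// Hypothetical wrapper: forced inlining lets the compiler see, per call site,
// whether the argument is a compile-time constant.
static inline __attribute__((always_inline)) void sketchDelayMicroseconds(uint16_t us) {
    if (__builtin_constant_p(us)) {
        stubConstantDelayUs(us);
    } else {
        stubVariableDelayUs(us);
    }
}

// One call with a literal and one with a runtime value; returns the total
// number of microseconds requested across both paths.
static uint32_t demoBothCalls(uint16_t runtimeUs) {
    sketchDelayMicroseconds(50);
    sketchDelayMicroseconds(runtimeUs);
    return constantPathUs + variablePathUs;
}
```

Which branch a given call takes depends on the optimizer: with optimization off, the compiler will generally treat the argument as non-constant, so the variable path has to be correct on its own.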

Some of those changes may not yet be in board manager.
Once I port 3 features from DxCore and finish a 4th I'm going to make a beta version available for both cores through board manager because they need some testing before mainstream release.
There are kind of a lot of features...

* **New I2C/TWI implementation** (I2C and TWI refer to the same interface, the one provided by Wire.h; don't get me started about how these and the library are named).
  * Support for acting as both master and slave (on the same pins); this configuration, sometimes known as "multi-master", includes not only the simple case of multiple masters and slaves on an I2C bus, each of which is always exclusively either a master or a slave, but also the more complicated case, which has been a frequent request: For the AVR device to act as both master AND slave. It can both initiate transactions with slaves, or respond to transactions initiated by other masters.
  * New tools menu option to select whether to support only being a master *or* a slave, as we do now (default) or to support acting as both master *and* slave (new functionality).
  * Support for Dual Mode (one instance of TWI acting as master on one pair of pins and slave on another) on parts that support it (there are no current or announced tinyAVR parts with dual mode support).
  * Significantly reduced flash usage under all circumstances (even master+slave mode should use less flash than old master or slave mode); the cost of supporting master-and-slave mode is RAM for a second buffer. There is an implementation included that can use the same memory for both buffers; however, it is not currently exposed via an option due to the risk of breakage if you receive while preparing to send something.
  * Failed attempts will timeout instead of hanging indefinitely.
  * Support for slave counting the bytes read by the master, and for slave to check whether it's in the middle of a transaction (for example, before sleeping)
  * Correct defect in the changelog (this file) due to a suspected CEBCAK (cat exists between chair and keyboard).
* Recent change to C++17 required additions to new.cpp and new.h, including sized deallocation (`delete`) and alignment-aware `new` and `delete` operators. The sized deallocation operator is called when existing code that worked before is compiled to the C++ 17 standard; since free() doesn't care about the size, implementation was straightforward. Discussion is ongoing about the aligned `new` and `delete` operators, which are also new in this version of the standards. It is likely that we will not support them, since other Arduino cores aren't even building to C++ 17 standard, so if your code needs aligned new/delete, it also won't work anywhere else in Arduino-land. While we are not shy about adding features, we do so only to support hardware features. If conditions change we will revisit this matter.
* Using millis or micros (or rather attempting to) when they are unavailable, due to millis being disabled or, in the case of micros, the RTC being used as the millis time source, now gives better errors.
* Clarified licence (for one thing, renamed to a .md so people can read it more easily, and know that it's readable if they're on windows) for tinyNeoPixel.
* Improved docs for tinyNeoPixel. The two libraries each have a README.md linking to a greatly expanded library document.
* Document use of WDT for its original purpose, to protect against hangs!
* Actually prevent disabling warnings: -Wall for all! You should not be compiling code with warnings disabled. The vast majority of the time they're pointing to problems, and in the cases that aren't bugs, they're still a weak point that future bugs could come from (and that people will comment on when you post the code on forums to ask for help). I thought I'd done this a long time ago. Also pull in some warning-related flags from DxCore, including making implicit function declarations an error, since the implied declarations basically never result in working code, yet the code occasionally compiles and then malfunctions.
* Fix timekeeping on clock speeds only supported with external clocks or tuning when a TCA or TCB is used for millis (it's still busted with the TCD)
* Correct SYSCFG0 fuse settings when burning bootloader for 2-series parts - they default the reserved bits to 1 not 0, and worse still, setting them to 0 enables a mode we probably don't want.
* Stop clearing fuse 4 by writing the default values for TCD0 on a 1-series. Now, with great difficulty, we only set that on parts that actually have the type D timer in order to keep our promise of burn bootloader restoring the chip to a fully known state. (well, except for the user row, and EEPROM if you've got it set to retain).
* Fix theoretical EEPROM.h bug inherited from avr-libc, and keep millis() from losing time when writing more than one byte at a time; update and harmonize with DxCore.
* Harmonize Comparator.h with DxCore.
* Fix 402 with bad signature support.
* Fix names of .lst and .map
* Add avrdude.conf for the 32k 2-series parts which are now becoming available.
* Fix bug with disabled millis on tinyNeoPixel libraries not working. Again.
* Correctly comment out leftover debugging prints that would be called when using `tone()` (megaTinyCore #550).
* Adjust serial buffer size on 512b and 1k parts by adding an intermediate 32b serial buffer size.
  * Parts with 512b are changed - from 16->32 for RX, TX unchanged at 16 (32->48 for each port used).
  * Parts with 1k are changed - from 64 to 32 for TX, RX unchanged at 64 (128->96 for each port used).
  * Smaller and larger parts are unchanged. This mostly helps to smooth out the RAM usage curve as you change flash size: going from 256 to 512 didn't previously change the allocation, while the jump from 512b to 1k was alarmingly large. The fact that the 8k 2-series have two ports each makes this more noticeable. This, combined with another breakpoint, led me to think that something else was broken.
* Officially deprecate jtag2updi.
* Port micros and delay-microseconds improvements from DxCore.
* Add a set of compatibility defines to make life easier for people porting non-Event library event-using code to 0/1-series.
* SerialUPDI reference now links to its actual location.
* (TODO) - Port Serial changes here from DxCore
* (TODO) - Finish Event changes and port to here.
* (TODO) - Port enhanced documentation from DxCore.
* (TODO) - Port new attach interrupt from DxCore.
* (TODO) - Port new printf option from DxCore.
* Platform.txt organization and commenting. Fix issues where defines were missing from lib-discovery phase.

I'm not working on adding functionality to RTC millis because it will be made irrelevant by the sleep+powersave library.

I'll note that micros is more accurate than 4us with most combinations of settings. I only have a recent table available for DxCore though.

TCA: DxCore/Ref_Timers.md at master · SpenceKonde/DxCore · GitHub
TCB: DxCore/Ref_Timers.md at master · SpenceKonde/DxCore · GitHub

That's definitely true for megaTinyCore with the version of wiring.c in github, and I think it's the same for the last release

OK. Thanks for the very quick and full answer. That is a very comprehensive activity list you have there.

Is the sleep+powersave library your own development? It sounds good if the user is spared the direct register manipulation.

This is how a library which uses micros() fails under these circumstances, namely an ATtiny1614 with the compiler option for the millis() timer on the RTC. I'll just point it out to the author, @bperrybap. I now have a workaround using another library. I wrongly assumed that delayMicroseconds() was relevant to this issue. Clearly, it is only micros().


#include <Wire.h>
// #include <LiquidCrystal_I2C.h>            // LiquidCrystal_I2C.h
// LiquidCrystal_I2C lcd(0x27, 16, 2) ;      // LiquidCrystal_I2C.h

#include <hd44780.h>
#include <hd44780ioClass/hd44780_I2Cexp.h>
hd44780_I2Cexp  lcd ;

void setup() {
// lcd.init();               // LiquidCrystal_I2C.h
   lcd.begin( 16, 2 ) ;      // hd44780.h
  
   lcd.setBacklight(HIGH);
   lcd.setCursor(0, 0);
   lcd.print(F("lcd ntp clock" )) ;
}

void loop() { }

With millis() timer on RTC I get this:

 
Linking everything together...
"C:\\Users\\6v6gt\\Documents\\ArduinoData\\packages\\DxCore\\tools\\avr-gcc\\7.3.0-atmel3.6.1-azduino4b/bin/avr-gcc" -Wall -Wextra -Os -g -flto -fuse-linker-plugin -Wl,--gc-sections -Wl,--section-start=.text=0x0 -mrelax -mmcu=attiny1614 -o "C:\\Users\\6v6gt\\AppData\\Local\\Temp\\arduino_build_856054/sketch_nov26b.ino.elf" "C:\\Users\\6v6gt\\AppData\\Local\\Temp\\arduino_build_856054\\sketch\\sketch_nov26b.ino.cpp.o" "C:\\Users\\6v6gt\\AppData\\Local\\Temp\\arduino_build_856054\\libraries\\Wire\\Wire.cpp.o" "C:\\Users\\6v6gt\\AppData\\Local\\Temp\\arduino_build_856054\\libraries\\Wire\\utility\\twi.c.o" "C:\\Users\\6v6gt\\AppData\\Local\\Temp\\arduino_build_856054\\libraries\\hd44780\\hd44780.cpp.o" "C:\\Users\\6v6gt\\AppData\\Local\\Temp\\arduino_build_856054/core\\core.a" "-LC:\\Users\\6v6gt\\AppData\\Local\\Temp\\arduino_build_856054" -lm
C:\Users\6v6gt\AppData\Local\Temp\ccuAK1Jo.ltrans0.ltrans.o: In function `hd44780::command(unsigned char)':
<artificial>:(.text+0x426): undefined reference to `micros'
C:\Users\6v6gt\AppData\Local\Temp\ccuAK1Jo.ltrans0.ltrans.o: In function `hd44780::write(unsigned char)':
<artificial>:(.text+0x4d8): undefined reference to `micros'
C:\Users\6v6gt\AppData\Local\Temp\ccuAK1Jo.ltrans0.ltrans.o: In function `hd44780_I2Cexp::ioread(hd44780::iotype)':
<artificial>:(.text+0x594): undefined reference to `micros'
C:\Users\6v6gt\AppData\Local\Temp\ccuAK1Jo.ltrans0.ltrans.o: In function `hd44780_I2Cexp::iowrite(hd44780::iotype, unsigned char)':
<artificial>:(.text+0x74e): undefined reference to `micros'
C:\Users\6v6gt\AppData\Local\Temp\ccuAK1Jo.ltrans0.ltrans.o: In function `global constructors keyed to 65535_0_sketch_nov26b.ino.cpp.o.2455':
<artificial>:(.text.startup+0x62): undefined reference to `micros'
collect2.exe: error: ld returned 1 exit status
Using library Wire at version 1.1.2 in folder: C:\Users\6v6gt\Documents\ArduinoData\packages\megaTinyCore\hardware\megaavr\2.4.2\libraries\Wire 
Using library hd44780 at version 1.3.1 in folder: C:\Users\6v6gt\Documents\Arduino\libraries\hd44780 
exit status 1
Error compiling for board ATtiny3224/1624/1614/1604/824/814/804/424/414/404/214/204.

bummer.
The hd44780 library needs micros() to work. It is needed to allow the LCD to execute instructions in parallel with the processor and to allow the delays to scale up and down in time depending on the speed of the processor and what the sketch is doing.
If micros() doesn't exist, the hd44780 library can't tell how much time has elapsed since the lcd was handed the instruction.
It is pretty much an integral part of how the library works.
The only way to work around this would be to do the full delay just after an instruction was handed to the lcd, assuming delayMicroseconds() still works.
Not having micros() is a pretty big core deficiency.
Assuming I were to take on a work around, it would require the hd44780 library being able to detect this deficiency at compile time so that conditionals could be used to modify the code.
I haven't looked at the various compile line options/defines set for this mode to see if it is possible.

I found another issue with the hd44780 library related to this core.
The newer core uses compiler tools/options that disable support for typeof()

When I first tried to compile a hd44780 test sketch, I had version 2.0.2 of the core installed.
When I updated to the latest (version 2.4.2) I ran into another issue.
This relates to some unreleased hd44780 code that has some new i/o classes for i2c that allow the sketch to specify the i2c wire object to make things easier and more portable across platforms for things like software wire interfaces and multiple i2c buses that use different wire objects.
This is accomplished using C++ templates.
What breaks, is that I used typeof() to allow the user to not have to know the class name of the Wire object class (they are not all the same).
This worked with version 2.0.2 of the megaTinyCore but breaks under version 2.4.2 as it is setting different compiler options and the gcc extension of typeof() is no longer recognized.
While I can switch over to the newer decltype(), and I would prefer to use that, I am not sure if it is supported by all the other cores, as some may be using older tools or options that force older C++ standards.
I'll have to do some testing to see what is supported and how far back in versions I can support for decltype() vs typeof()
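
For illustration, a minimal sketch of the decltype() approach. FakeTwoWire, SketchI2cIo, WireType and lcdIo are hypothetical stand-ins, not hd44780 code, and the global Wire here is a dummy object so the sketch builds on a desktop compiler. As far as I know, the double-underscore spelling __typeof__() remains available in GCC even when the plain typeof keyword is turned off by strict -std=c++NN options, so that would be another route to test.

```cpp
// Hypothetical stand-ins: a bus class whose name the sketch author may not
// know, and the global object a core would normally provide.
struct FakeTwoWire {
    int begun = 0;
    void begin() { ++begun; }
};
static FakeTwoWire Wire;

// Hypothetical i/o class templated on the bus type.
template <typename BusT>
class SketchI2cIo {
public:
    explicit SketchI2cIo(BusT &bus) : bus_(bus) {}
    void begin() { bus_.begin(); }
private:
    BusT &bus_;
};

// decltype(Wire) names the object's class without spelling it out (C++11 and later).
using WireType = decltype(Wire);
static SketchI2cIo<WireType> lcdIo(Wire);
```

The sketch author only ever writes the object name, so the same line works whether the core calls its class TwoWire or something else.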

But like I said this is unreleased hd44780 library code that is currently in development which will need lots of compatibility testing.

I only bring this up as the disabling of gcc extensions such as typeof() may affect other sketch code/libraries.

--- bill

OK. Thanks for that.
It seems that LiquidCrystal_I2C.h is a bit cruder in that it always blocks for the minimum time defined for an instruction, without attempting to optimise this. So it escapes the problem by using delayMicroseconds().
I suppose you could make your own rough version of micros() just for this core and compiler option combination. If I understand correctly, you'd look at RTC.CNT which is 16bit and has, therefore, 2 second rollover. You'd get micros() to a resolution of 32us.

At the moment I have a work around so I can wait until this area stabilises on its own but I'd certainly help with some testing if it gets that far.

If I were to do a work around, I would just change the code to wait the full instruction time like all the other libraries since it appears that micros() doesn't work/exist and delayMicroseconds() still works.
So far I have avoided doing anything that steps outside of Arduino APIs to touch h/w directly and I'd like to keep it that way to ensure maximum portability.

This is a core issue IMO, and should be fixed there.
If I were to do some custom code for micros() functionality for this core, rather than do it in hd44780, I would push to put the code into the actual core since that is where it needs to go and all other libraries and sketches could benefit.

--- bill

Yes. That is fair enough. It is probably more an issue of documenting the prerequisites for your library as and when the limitations of various platforms and/or configuration options become apparent.
The RTC timer runs prescaled to 1024 Hz when it is running the "millis" timer so even the "rough" micros(), which I had thought about, would not be a simple option.

Seems like the core should provide a micros() even if it has a large/ridiculous resolution, in this case 1024us, and then spit out a warning at compile time when this is the case.
But I can see how doing this wouldn't really be that useful to any code that actually needs micros() to work with a much finer resolution.
So maybe the hard error is better?

I have looked at what it takes to work around this in the hd44780 library and it is pretty simple.
There is a define in the core that indicates when this RTC mode is used when micros() is not available so the two internal timing delay functions used in the hd44780 library can change for this environment to work around it with no changes to any other hd44780 code.
The timing functions can be altered to call delayMicroseconds to wait the appropriate time when waiting for the lcd to be ready rather than spinning on micros(), which on most i/o classes will just be the full lcd instruction time.
The exception is for i2c devices where it will be no delay for all but clear() and home() since the i2c transmission time overhead is known and is longer than the lcd instruction time.
So for an i/o class like hd44780_I2Cexp there will be no performance hit, it might actually get a tiny bit faster.

I guess I should obtain one of these ATtiny boards so I can actually test it on the real h/w.
(I can test it on other h/w but I kind of like to see it run in the real environment.)
For testing I'll create a custom Arduino UNO board type in the IDE that sets the magic define that indicates RTC mode with no micros() support.

--- bill

The hd44780 library can now operate without using micros() on the MegaTinyCore.
It looks at some defines in the core: MEGATINYCORE and MILLIS_USE_TIMERRTC to detect it.
This gets the library working, and the hd44780_I2Cexp i/o class is just a tiny bit smaller and faster, but then there are many example issues.
Things like LCDiSpeed require micros(); maybe it could be modified to send more data, which could allow it to switch to millis() instead.

But then there are some modes where even millis() doesn't exist.
And there are several examples included in the library that depend on millis().

Not sure how I want to handle the lack of millis() if at all.

The lack of micros() and millis() is very specific to this core and just a few modes within this core so I don't want to ugly up the library all over the place to try to deal with it - when possible, because in some cases, there is no way to work around it.

Does anybody have any actual information on how popular these specific modes really are?
So far I've not seen this show up in discussion yet, but then I've not really been following this specific chipset.

--- bill

It is a bit difficult to generalise but I can imagine that these chips with their built in RTC peripheral could replace some existing configurations, for example a data logger which wakes up on sample delivery to time stamp it, which would previously have used say an ATmega328p and separate RTC. The good feature is that the millis() timer can be configured to survive the low power standby. I did publish a design recently for an ATtiny1614 based radio link where I used the RTC as a one-shot timer but did not need millis()/micros() to bridge the sleep period.

But some modes, say completely excluding the millis timer, have to be seen as "expert" modes. It is good these are available and people using these are, hopefully, doing it for a good reason and can understand the consequences. Further, there are obviously limits as to what a library designer should be expected to support.

The two cases where this is relevant are when MILLIS_USE_TIMERRTC is set, and when MILLIS_USE_TIMERNONE is set.

On DxCore I decided against implementing RTC millis on the grounds that the use case was mainly to keep time during sleep, and one of the big (much delayed) projects was to implement a sleep library that pauses millis, stores the current time, starts the RTC to track time while sleeping, and restores it after waking for any reason other than the RTC overflowing (which is just tallied before returning to sleep); I hadn't conceived of that library when I first implemented RTC millis on megaTinyCore. I probably wouldn't have implemented RTC millis if I'd had that idea. It turns out to, among other things, significantly increase the flash overhead of millis().

I wish I had numbers on how many people use which modes too! All I have is what generates the most complaints (which suggests that RTC millis is fairly popular - but that's a biased sample, because when it works people don't complain).
I don't have much of an idea of what the ratio of "yes millis" to "rtc millis" to "disable millis" is like, though. I would think that disabled millis probably has at least as much usage as RTC millis, since the tiny412 seems to be one of the most popular parts, possibly the most popular (or the people who use them are particularly loud).

It's unfortunate that there was never a universal define for checking if the API functions exist or not, because one would like to detect that there was no micros and fall back to delayMicroseconds (better still, _delay_us if the delay is constant; the latest versions of my cores do the "always-inline function testing __builtin_constant_p()" thing, using _delay_us() for constant delays, as that you can really count on, more so than the hackjob that is delayMicroseconds, and only use the janky Arduino-style implementation if the delay is not constant).

The question of what micros() should do when called when there isn't a way to answer the question with anything approaching the normal resolution is a fundamental question that I'm not sure I know how to answer. Arduino official code goes way too far in the "never produce errors" direction. It lets you compile all sorts of stuff that it knows damned bloody well won't do what the user hopes. I think this is partly a consequence of the API being "designed" (with apologies to API designers everywhere) without LTO, and hence without __builtin_constant_p() to detect obviously unreasonable constant values for anything. I don't let people specify non-existent pins other than NOT_A_PIN (code often relies on NOT_A_PIN silently passing through the digital I/O functions), I don't let people try to analogRead from pins that don't support analogRead(), and so on. But the matter of a library calling a function like micros or millis that isn't guaranteed to exist is harder to deal with. I try to decide based on how hard the problems would be to debug. A typo'ed pin number that points to a pin that doesn't exist and fails silently, tha
The problem that I see with having micros just fall back to a resolution of 1024 isn't that it would result in requested 1us delays taking 1ms; so be it (printing a warning would be nice. Have you ever managed to make the warning attribute work? I haven't... The error attribute was easy, but the analogous warning attribute has never compiled for me). The problem is the code that requires a 50us delay, calls micros the first time at microsecond number 1023, and has its delay expire in 1us instead of 50. If the external device fails to respond when the wait isn't at least 40us, and the library never had to handle such a "can't happen", the outcome could very well be code that hangs 4% of the time when it tries to talk to that device, which would be an absolute nightmare to debug... And as I said, I think there are at least as many people running with millis disabled entirely to save flash and/or get rid of the constant interrupts firing in the background (off the top of my head, I know one guy doing that in the university-level classes he teaches, using the ADC on the 2-series for capacitive proximity testing).
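
To put numbers on that failure mode, here is a small desktop simulation. It assumes a hypothetical micros() that only advances in 1024us steps; coarseMicros and realElapsedForSpinWait are names invented for this sketch and nothing like them exists in the core.

```cpp
#include <cstdint>

// Hypothetical coarse clock: real time in microseconds, truncated to 1024 us steps.
constexpr uint32_t coarseMicros(uint32_t realUs) {
    return (realUs / 1024u) * 1024u;
}

// Simulates the usual "spin until micros() has advanced by waitUs" loop against
// the coarse clock and returns how much real time actually passed.
uint32_t realElapsedForSpinWait(uint32_t startRealUs, uint32_t waitUs) {
    const uint32_t startSeen = coarseMicros(startRealUs);
    uint32_t now = startRealUs;
    while (coarseMicros(now) - startSeen < waitUs) {
        ++now;  // one simulated microsecond passes
    }
    return now - startRealUs;
}
```

Starting a 50us wait at real time 1023 returns after 1us, while starting it at 0 takes the full 1024us. Any start in the last few dozen microseconds of a tick comes up short, which is where the "few percent of the time" figure comes from.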

On the topic of the specific case of that display library... it looks like you've done a very good job of encapsulating the time-dependent aspects of it, such that just changing markStart to

#if (defined(MILLIS_USE_TIMERRTC) || defined(MILLIS_USE_TIMERNONE) || defined(DISABLEMILLIS))
delayMicroseconds(exectime); 
#else
(current definition)
#endif

and then turning _waitReady() into a no-op under the same conditions would be sufficient to make it work for megaTinyCore, DxCore and ATTinyCore (MILLIS_USE_TIMERNONE will work from ATTinyCore 2.0.0, but the existing versions use DISABLEMILLIS).
That was just based on a global search of the source for micros() - are there other references to it hiding somewhere?

I've got a partly done monster refactor of that core to fix all the things about it that are horrifically bad (so that I don't feel guilty putting it into minimal maintenance mode, since it's still apparently the most-used of my cores), which will be the next project after I wrap up what I started on Event and get the Serial changes from DxCore ported to megaTinyCore. That will bring both those cores to a point where I can step away and get ATTinyCore 2.0.0 sorted out, and once that's ready, then comes the megaTinyCore/DxCore sleep library.

I guess I should obtain one of these ATtiny boards so I can actually test it on the real h/w.
(I can test it on other h/w but I kind of like to see it run in the real environment.)

DM me your mailing address and I'll hook you up.

For your core users, you could provide a macro that automatically did this for them. Take advantage of the required preprocessor behavior that a self reference inside a macro is not expanded.
So you can do things like:

#define delayMicroseconds(_t)               \
        do {                                \
            if (__builtin_constant_p(_t))   \
                _delay_us(_t);              \
            else                            \
                delayMicroseconds(_t);      \
        } while (0)

You can also do things like:

#define micros micros
#define millis millis

This is a handy way to indicate that functions exist as they are effectively nop defines that create the conditional symbols.

Having defines like this would allow code to use a conditional to check if micros() or millis() existed by checking to see if the defines micros or millis exists.
Code could check the define for this core, i.e. MEGATINYCORE, then look for those defines to see if the functions exist.
IMO, it seems more convenient and obvious than the current checks.
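
A small sketch of how library code could use such a define. SKETCH_HAS_MICROS and sketchNow are hypothetical names, and the dummy micros() below only stands in for a core that provides the function so that the snippet builds on a desktop compiler.

```cpp
// Stand-in for a core that provides micros() and advertises it with a
// self-referential define. A core without micros() would omit both lines.
unsigned long micros(void) { return 0; }   // dummy body for a desktop build
#define micros micros

// Library side: pick a strategy at compile time.
#if defined(micros)
  #define SKETCH_HAS_MICROS 1
#else
  #define SKETCH_HAS_MICROS 0
#endif

// Hypothetical helper showing that the define does not get in the way of
// calling the real function: micros() still expands to a plain call.
unsigned long sketchNow() {
#if SKETCH_HAS_MICROS
    return micros();
#else
    return 0;  // the caller would fall back to blocking delays here
#endif
}
```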

The question of what micros() should do when called when there isn't a way to answer the question with anything approaching the normal resolution is a fundamental question that I'm not sure I know how to answer.

I agree it is tough as there is no good answer.
About the 1024 us resolution. The resolution of a timing function can be no better than twice the timing interval period to ensure that the minimum delay is always met since normally it is unacceptable to ever have a delay that is shorter than the requested amount.
When the timing interval becomes many times, or in this case many orders of magnitude the resolution of the API, it will create issues.
I'm quite familiar with the util/delay.h stuff since I spent months working with Atmel (pushing them) to get the new APIs defined and get them working properly. I was the one that drove getting those fixed many years back since the original ones really sucked and didn't ensure that you always got at least what you asked for.
I pushed to get them fixed since it broke a graphics library I had.

My assumption is that most uses of micros would break if the timing interval/accuracy is more than a few us as I'd guess that most code is definitely not prepared or written to deal with such large added potential latency.

That said it isn't ever as easy as just switching from micros() to delayMicroseconds()
The fixes can be anything from trivial, to quite complex, to just not possible, depending on the actual code.

Like you, I've never gotten warning attributes to work on functions.
I think I came to the conclusion that it had to do with some of the compiler options used, so it would take changing them, and in my case I don't have control over the platform.txt file so I just gave up ever trying to use them.

After thinking about it, I think the best solution in this case may be to just hard error using something like this:

#ifdef __cplusplus
extern "C"{
#endif
	unsigned long __attribute__ ((error("micros() is not supported, for more info see: XXX "))) micros(void);

#ifdef __cplusplus
}
#endif

to print a message that micros() is not supported and to see XXX for more information on workarounds and then somewhere provide some information on the issue and some potential workarounds.

In the hd44780 library, it is handled a bit differently than you had assumed.
Delays are never actually done for an lcd instruction before returning.
They are done before the next instruction starts (if needed). This allows the processor to run in parallel with the LCD after sending it an instruction.
When it can't tell how much time has elapsed, I decided to treat it as zero elapsed time to preserve the existing methodology.
So markStart() essentially becomes a no-op and waitReady() still does the wait.
This allows waitReady() to account for any known fixed timing offset that was specified by the i/o class before it calls delayMicroseconds().
The waitReady() timing offset allows an i/o class using slow interfaces such as i2c to use the time it takes to transfer the bytes over the interface (which is a constant) to reduce the delay that the processor must do. In fact, on lcds that use an i2c backpack, the processor will usually not have to do any additional waiting since it takes longer to transfer the bytes over the i2c interface than most instructions take to complete.
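
Roughly, the arithmetic looks like the sketch below. This is only an illustration with a made-up helper name, remainingWaitUs, and is not the actual hd44780 code; the microsecond figures used with it are illustrative too.

```cpp
#include <cstdint>

// Hypothetical helper, not the hd44780 library's actual code: how long the
// processor still has to block before sending the next instruction when the
// elapsed time cannot be measured and is therefore treated as zero.
// ioOffsetUs is the fixed transfer time the i/o class is known to spend anyway.
constexpr uint32_t remainingWaitUs(uint32_t execTimeUs, uint32_t ioOffsetUs) {
    return (ioOffsetUs >= execTimeUs) ? 0u : (execTimeUs - ioOffsetUs);
}
```

With an instruction time of 37us and a transfer overhead of 200us the result is 0, so no delayMicroseconds() call is needed; with a 2000us instruction such as clear() or home() and the same overhead, 1800us of waiting remains.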

Working around the lack of millis() is a bigger effort.
The library itself doesn't use or need millis() but several of the examples do.
Not sure how I want to handle it.
The one application that will definitely work around it is a diagnostic sketch for the i2c backpacked based devices.
The others, I'm not sure how I want to handle it yet, if at all.

There is one sketch that measures byte transfer times, LCDiSpeed, which can't function without some way to detect elapsed time. Currently it uses micros(); with some work it could use bigger/longer transfers and switch to millis(), but if both are gone there is no way for it to work. I think I'll just crash the compile for that one and print a message that it can't run on that core/board with options that don't have micros() support.

--- bill

That's a good idea with the #define micros micros trick, I will incorporate that into 1.4.0 DxCore, 2.5.0 megaTinyCore and 2.0.0 ATTinyCore (the next releases of each)

I already use _delay_us() for delayMicroseconds much like what you suggest, except that I do it with an always-inline function, and the real variable-delay delayMicroseconds is named _delayMicroseconds(); LTO optimizes away the intermediary function, inlining either the _delay_us or the _delayMicroseconds call. I do a similar thing in a few other places, for largely the same reason. delay() gets that treatment too when there's no millis() or micros().

Interesting, I'd love to know what would make the warning attribute work. glad it's not just me at least >.>

That approach of making _waitReady have the delay is probably better, yeah, though in the long run I'm not sure the net effect will ever improve runtime by more than 1 delay period.

I mean, if you have millis, then one could test millis and if it's increased by more than 1, skip the wait, but I think that's more trouble than it's worth, especially since millis is slow like micros is when using RTC as timing source (SIMPLY BECAUSE OF THE GODDAMNED 1024 to 1000 CONVERSION. Bitshift division is pretty cool, but a 2-term expansion of division to bitshifts is not negligible when the operands are a pair of 32-bit values).
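
For anyone wondering what such a two-term expansion looks like, here is a rough sketch. It is not the core's actual code, and ticks1024ToMillis is a name made up for this example.

```cpp
#include <cstdint>

// Hypothetical sketch, not the core's actual code: scale a count of 1/1024 s
// ticks to milliseconds without a divide.
// 1000/1024 = 1 - 3/128 = 1 - 1/64 - 1/128, so subtract two shifted copies.
constexpr uint32_t ticks1024ToMillis(uint32_t ticks) {
    return ticks - (ticks >> 6) - (ticks >> 7);
}
```

Each shift truncates, so for tick counts that are not multiples of 128 the result can read a count or two higher than the exact value (100 ticks gives 99 rather than 97). On an 8-bit AVR each of those 32-bit shifts and subtractions is several instructions, which is the cost being complained about.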

I do give "better" errors when millis or micros isn't available, though it's not in the megaTinyCore release yet I don't think.

Personally I think it's fine if library examples don't compile on my core with some options: the error messages are clear, the user chose to disable millis, and the call to millis is right in front of their eyes in the sketch they're compiling. The time it's a problem, IMO, is when the code that's generating the error is in the library, where it's less amenable to the user working around it themselves.