Advanced: Reading pins A6 and A7, digital, quickly

I am working on a project which flashes and sequences 16 different sets of LEDs. I'm using an Arduino Nano so I'm using D2-D13 plus A0-A3 to get 16 different digital outputs. I am using A4 and A5 for I2C to drive an LCD character display with a backpack. And finally, I have 2 buttons to control the speed and pattern. The only pins left were A6 and A7, so that's where I connected the buttons.

It all basically works, and reasonably well! But I am doing software PWM on all 16 outputs and I am trying to optimize and gain some speed to get rid of the noticeable flicker. I have already changed all the output code to do direct port manipulation rather than digitalWrite() and that has helped quite a bit.

I've been doing some research, and it seems that pins A6 and A7 on the Nano and Mini can only be used with analogRead() as they don't directly connect to any of the port pins. I've also been reading up on how analogRead actually works behind the scenes and the hardware and process involved. The most detailed discussion I've found is here: http://www.instructables.com/id/Girino-Fast-Arduino-Oscilloscope/?ALLSTEPS I've gotten a lot of theory, but I'm missing some specifics. I'm trying to write a fast digitalRead function for just these 2 pins. I don't care about resolution because I am using A6 and A7 as digital inputs with pushbuttons attached, not analog.

So it seems that to write my own routine, I need to do several things:
1: Set the prescaler to a low value to get speed at the expense of resolution.
2: Set the ADC Multiplexer Selection Register ADMUX to select which channel I want to read. In my case, A6 and A7.
3: Start the ADC process.
4: Check Status / Wait for it to finish / Go do something else for up to 108us.
5: Read the value from the ADCL and ADCH registers. Bit placement depends on ADLAR being 0 or 1.
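
For reference, the five steps above map onto the ATmega328P registers roughly as follows. This is a minimal sketch, not a tested drop-in. The function names (adcFastInit, adcStart, adcDone, adcIsHigh) are made up for illustration. It assumes a 16 MHz Nano, AVcc as the voltage reference (REFS0 set, which is also what analogRead() uses by default), ADLAR = 1 so the top 8 bits of the result land in ADCH, and a button that pulls the pin high when pressed. The #else branch only provides stand-in registers so the logic can be compiled off-target.

```cpp
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>   // real ADMUX, ADCSRA, ADCH and their bit names
#else
// Stand-ins so the logic can be compiled and checked on a PC.
static volatile uint8_t ADMUX, ADCSRA, ADCH;
enum { ADPS2 = 2, ADLAR = 5, REFS0 = 6, ADSC = 6, ADEN = 7 };
#endif

// Step 1: enable the ADC with a divide-by-16 prescaler (ADPS2:0 = 100).
// At 16 MHz that is a 1 MHz ADC clock: fast, but with reduced accuracy.
static inline void adcFastInit() {
  ADCSRA = (1 << ADEN) | (1 << ADPS2);
}

// Steps 2 and 3: pick the channel, then start a single conversion.
// REFS0 selects AVcc as reference; ADLAR left-adjusts the result into ADCH.
static inline void adcStart(uint8_t channel) {
  ADMUX = (1 << REFS0) | (1 << ADLAR) | (channel & 0x0F);
  ADCSRA |= (1 << ADSC);
}

// Step 4: the hardware clears ADSC when the conversion has finished.
static inline bool adcDone() {
  return (ADCSRA & (1 << ADSC)) == 0;
}

// Step 5: with ADLAR = 1, ADCH holds the top 8 bits of the result.
// Anything at or above half scale is treated as "high" (assumed: pressed).
static inline bool adcIsHigh() {
  return ADCH >= 128;
}
```

A caller would run adcFastInit() once in setup(), then call adcStart(6), do other work, and read adcIsHigh() only after adcDone() returns true. A6 and A7 have no internal pull-ups, so the buttons are assumed to have external pull-down (or pull-up) resistors.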

So, now how do I do each of these things? Best as I can determine, to select the input channel, I just need:

  // Set Analog Input Multiplexer for Channel 6, using lowest 4 bits of ADMUX register
  // Set ADLAR low on bit 5, and bits 6 and 7 should be low to select AREF as the voltage reference source
  ADMUX = 6;

(or = 7 for pin A7)
To start the conversion process:

  ADCSRA |= 0x40; // Turn on bit 6 (ADSC) of the ADCSRA register.

Then wait or go do something else (I know how to do that!)
Then if the button is pressed, bits 0 and 1 of the ADCH register will correspond to the highest 2 bits (8 and 9) of the 10-bit conversion value. I can check that with:

  if (ADCH) {
    // Button is pressed...
  }

(Assuming that ADLAR = 0, as it was set above)
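
To make the ADLAR point concrete, here is a small sketch of how a 10-bit result is laid out across ADCH and ADCL for both settings. It is plain C++ that models the register layout described in the ATmega328P datasheet and touches no hardware. AdcRegs and splitResult are hypothetical names used only for this illustration.

```cpp
#include <stdint.h>

// What the two result registers would hold for a given 10-bit reading.
struct AdcRegs {
  uint8_t adch;
  uint8_t adcl;
};

static inline AdcRegs splitResult(uint16_t value10, bool adlar) {
  AdcRegs r;
  if (adlar) {
    // Left-adjusted: ADCH = bits 9..2, ADCL carries bits 1..0 in its top two bits.
    r.adch = (uint8_t)(value10 >> 2);
    r.adcl = (uint8_t)((value10 & 0x03) << 6);
  } else {
    // Right-adjusted: ADCH = bits 9..8, ADCL = bits 7..0.
    r.adch = (uint8_t)(value10 >> 8);
    r.adcl = (uint8_t)(value10 & 0xFF);
  }
  return r;
}
```

With ADLAR = 0, ADCH is non-zero only for readings of 256 or more, so a bare `if (ADCH)` test switches at roughly a quarter of the reference voltage. With ADLAR = 1, ADCH alone is an 8-bit reading and can be compared against any threshold.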

Life is never simple of course, and obviously I am missing something(s) as my test sketch doesn't work. Here's the full test sketch:

#include <Wire.h>
#include <LiquidCrystal_I2C.h>

LiquidCrystal_I2C lcd(0x27, 2, 1, 0, 4, 5, 6, 7, 3, POSITIVE);  // Set the LCD I2C address and pin numbers

bool speedButton = false;
bool patButton = false;

void setup() {

  lcd.begin(16,2);
  lcd.backlight();
  // Clear lowest 3 bits to set the prescaler to "2" and get the fastest sampling rate at the expense of resolution
  //ADCSRA &= 0xf8;

} // End void setup{}

void loop() {
  // Set Analog Input Multiplexer for Channel 6, using lowest 4 bits of ADMUX register
  // Set ADLAR low on bit 5, and bits 6 and 7 should be low to select AREF as the voltage reference source
  //ADMUX = 6;
  // Start Conversion
  ADCSRA |= 0x40; // Turn on bit 6 (ADSC) of the ADCSRA register
 
  //go do something else here... waste some time while waiting for A-D conversion.
  //might as well update the display while waiting!
  lcd.setCursor(0,0);
  if (patButton) {
    lcd.print ("PATTERN!!");
  } else {
    lcd.print ("nopattern");
  }    
  
  speedButton = (ADCH);
  
  // Set Analog Input Multiplexer for Channel 7, using lowest 4 bits of ADMUX register
  // Set ADLAR low on bit 5, and bits 6 and 7 should be low to select AREF as the voltage reference source
  ADMUX = 7;
  // Start Conversion
  ADCSRA |= 0x40; // Turn on bit 6 (ADSC) of the ADCSRA register
 
  //go do something else here... waste some time while waiting for A-D conversion.
  //might as well update the display while waiting!
  lcd.setCursor(0,1);
  if (speedButton) {
    lcd.print ("SPEED!!");
  } else {
    lcd.print ("nospeed");
  }    
  
  patButton = (ADCH);

} // End void loop()

Can anyone tell me what I'm missing?

I started looking at what you're doing to drive the ADC manually, but then I wondered why you were doing it. A straightforward analog read only takes about 112 microseconds, and if you're doing a blocking read then just how much time do you really expect to save?
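
As a rough check on that figure: per the ATmega328P datasheet a normal conversion takes 13 ADC clock cycles, and the stock Arduino core runs the ADC with a prescaler of 128 on a 16 MHz board. The helper below (adcConversionMicros is a made-up name) just does that arithmetic. It ignores the longer first conversion after the ADC is enabled and any function-call overhead, so treat the numbers as approximate.

```cpp
// Approximate time for one normal ADC conversion, in microseconds.
// 13 ADC clocks per conversion; ADC clock = CPU clock / prescaler.
static inline double adcConversionMicros(unsigned prescaler, double cpuMHz = 16.0) {
  return 13.0 * prescaler / cpuMHz;
}
```

At the default prescaler of 128 this gives 104 microseconds, which is in line with the figure quoted above once overhead is added. A prescaler of 16 would bring it down to 13 microseconds, at the cost of accuracy.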

Regarding the digital I/O performance, there is a digital write fast library which provides a similar API to the conventional Arduino digitalRead/digitalWrite functions and gives performance comparable to direct port access when the pin numbers are compile-time constants, without sacrificing the abstraction and portability of the Arduino API.

I started looking at what you're doing to drive the ADC manually, but then I wondered why you were doing it.

I have been using analogRead() and it works acceptably. But I'd like to figure out how to do it manually and faster because:

  1. I'm always trying to learn new things for my bag of tricks, and would like to better understand how this works.
  2. I'm a stickler for speed and efficiency. I will spend (waste) hours of coding to save a few bytes or clock cycles, even where it doesn't really matter. But again, that is often a learning process.
  3. Although the speed of analogRead() is acceptable in this case, my profiling has revealed that the 2 analogReads are eating about 1/3 of the time it takes for each pass of loop(), and the loop() is otherwise somewhat complex. I expect that is because analogRead() does a delay() while waiting for the ADC, and I could be using that delay time for other things.
  4. I am doing a software PWM on all 16 outputs, and there is a barely perceptible flicker to it. This may give me just the edge I need to make it imperceptible.
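
One way to reclaim the waiting time from point 3 is to never wait at all: start a conversion, return to the main work, and collect the result on a later pass of loop(). As far as I know, analogRead() in the stock AVR core busy-waits on the ADSC bit rather than calling delay(), but the effect is the same. The sketch below is only an illustration under assumptions: ButtonPoller is a hypothetical name, the channel-to-button mapping (6 = speed, 7 = pattern) follows the test sketch, AVcc is the reference, ADLAR = 1, and a pressed button reads high. The #else branch supplies stand-in registers for compiling off-target.

```cpp
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>   // real ADMUX, ADCSRA, ADCH and their bit names
#else
// Stand-ins so the logic can be compiled and checked on a PC.
static volatile uint8_t ADMUX, ADCSRA, ADCH;
enum { ADPS2 = 2, ADLAR = 5, REFS0 = 6, ADSC = 6, ADEN = 7 };
#endif

struct ButtonPoller {
  uint8_t channel = 6;        // channel currently being converted
  bool speedButton = false;   // latest state seen on A6
  bool patButton = false;     // latest state seen on A7

  // Select the current channel (AVcc reference, left-adjusted) and start.
  void startConversion() {
    ADMUX = (1 << REFS0) | (1 << ADLAR) | channel;
    ADCSRA |= (1 << ADSC);
  }

  // Call once from setup(): enable the ADC at divide-by-16 and kick it off.
  void begin() {
    ADCSRA = (1 << ADEN) | (1 << ADPS2);
    channel = 6;
    startConversion();
  }

  // Call once per pass of loop(). Returns immediately if the ADC is busy.
  void poll() {
    if (ADCSRA & (1 << ADSC)) return;   // conversion still running
    bool high = ADCH >= 128;            // top 8 bits, half-scale threshold
    if (channel == 6) {
      speedButton = high;
      channel = 7;
    } else {
      patButton = high;
      channel = 6;
    }
    startConversion();                  // next channel converts in the background
  }
};
```

Each call to poll() costs a handful of instructions, and each button is refreshed every second completed conversion, which should be far faster than a finger can move.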

I could set up a timer interrupt and read the buttons a lot less often instead of every pass of loop(). But that introduces the inefficiencies of interrupting the main code, pushing everything to the stack, reading the buttons, and pulling everything back from the stack. I expect that may make my flicker worse.