Arduino MEGA 2560 LCD Keypad Shield KMR 1.8 SPI Firmware

!!ANNOUNCEMENT!!
Revision 2 is out! Use revision 2 for a more reliable and practical firmware (including implementation of the marked solution of this thread, kindly provided by @J-M-L .
Link: Arduino MEGA 2560 LCD Keypad Shield KMR 1.8 SPI Firmware Rev 2

A simple sketch for GFX and LCD library testing morphed into this, then heavily refactored to fit the proper standards.
A firmware to expand your hardware from the setup of Arduino MEGA or MEGA 2560 clone boards with Arduino standard LCD Keypad Shield and 1.8 128x160 SPI TFT ST77xx Display with hard SPI. (TFT Default Orientation: Portrait).

Includes various animations for the TFT screen interfaced with the LCD Keypad Shield using Adafruit GFX and Arduino LCD libraries.
Next project will be to implement Bodmer's ST7735 driver.

The implementation techniques are synthesized assembly of established computer graphics, embedded systems programming, and algorithmic optimization methodologies. Performance characteristics have been validated through extensive testing on genuine Atmega2560 hardware under various operational conditions.

Any suggestions and updates provided here will be included in the next revision with due credits.

NB: Some clone boards sockets (or maybe it is the case of the Keypad Shield) do not sometimes perfectly enclose the the male sockets of the shield, often keeping a mm of gap on the USB printer cable port side, which prevents the shield from sitting properly (not a problem with WeMos D1 R1 or other similar form factored boards as they have micro USB port).
In such a case, although the device will function properly, I would advice to use the hardware setup I have, which is to sandwich a USB host Shield 2.0 in between the MEGA board and the LCD Keypad Shield.

Improved Revision 2 Link:
Arduino MEGA 2560 LCD Keypad Shield KMR 1.8 SPI Firmware Rev 2

LCD Keypad Shield ADC Reader Firmware to verify your Shield button ADC Readings:
Arduino compatible LCD Keypad Shield ADC Reader

/**
 * ════════════════════════════════════════════════════════════════════════════════════════════════
 *                              PIXEL WASH v1.0.7 - MEGA RENDER ENGINE
 *                           Cellular Automata & Multi-Mode Visual Simulation
 * ════════════════════════════════════════════════════════════════════════════════════════════════
 * 
 * Designer:        Sir Ronnie
 * Organization:    Core1D Automation Labs
 * License:         MIT License
 * Version:         1.0.7 Beta
 * Platform:        Arduino MEGA 2560 (Optimized)
 * Display:         KMR 1.8" TFT 128x160 SPI ST77xx + LCD Keypad Shield
 * 
 * Description:
 * Advanced cellular automata simulation featuring Conway's Game of Life and custom visual modes
 * including Fire, Water, Wind, Cyclone, Portal, and Wormhole effects. Implements dirty tile 
 * rendering, dual-display interface, auto-mode cycling, and high-performance SPI optimization.
 * 
 * Features:
 * β€’ 9 Visual Modes (Auto, Conway, Bright, Fire, Water, Wind, Cyclone, Portal, Wormhole)
 * β€’ Dirty Tile Rendering System for 60fps performance
 * β€’ Dual brightness control (TFT + LCD independent)
 * β€’ Auto-mode with 10-minute cycling between effects
 * β€’ Optimized memory management for MEGA 2560
 * β€’ Hardware SPI with configurable clock speeds
 * β€’ Enhanced Conway patterns with stagnation detection
 * β€’ Real-time settings preview with staged changes
 * 
 * ════════════════════════════════════════════════════════════════════════════════════════════════
 *                                      LIBRARY CREDITS
 * ════════════════════════════════════════════════════════════════════════════════════════════════
 * 
 * SPI Library:          Arduino Core Team - Hardware SPI communication
 * Adafruit_GFX:         Adafruit Industries - Graphics primitives and drawing functions
 * Adafruit_ST7735:      Adafruit Industries - ST7735 TFT display driver
 * LiquidCrystal:        Arduino Team - HD44780 LCD display interface
 * avr/pgmspace.h:       AVR-GCC Team - PROGMEM flash memory storage
 * 
 * Special Thanks:
 * β€’ Adafruit Industries for comprehensive display libraries
 * β€’ Arduino Community for extensive optimization techniques
 * β€’ AVR-GCC contributors for efficient memory management tools
 * 
 * ════════════════════════════════════════════════════════════════════════════════════════════════
 *                                   HARDWARE CONNECTIONS
 * ════════════════════════════════════════════════════════════════════════════════════════════════
 * 
 * Target Hardware: Arduino MEGA 2560 + LCD Keypad Shield + 1.8" ST7735 TFT
 * 
 * β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
 * β”‚                                    HARDWARE LAYOUT                                          β”‚
 * β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
 * Involved Hardware components:
 * > Arduino MEGA/MEGA 2560 (Atmega 2560)
 * > LCD Keypad Shield (Arduino standard, Uno or MEGA pin layout)
 * > 1.8 TFT 128x160 SPI ST7735 Display
 * 
 * Default Hardware Layout:
 * ═════════════════════════════════════════════════════════════════════════════════════════╗
 * [D7]~[D6]~[D5]~[D4]~[D3]~[D2] [D1] [D0]   [D14 to D21 UART Communication]  O  [5V5] [5V5]──►● VCC (TFT)
 * ~                                      .....                                     .....
 *LCD Keypad Shieldβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ    β”‚                                   ~[D44]~[D45]
 *β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ    β”‚                            LED ●◄─~[D46] [D47]
 *β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ    |                            RST ●◄──[D48] [D49]──►● A0/DC
 *                                          |                                    [D50] [D51]──►● SDA/MOSI (Hard)
 *                                          |                       SCL/SCLK ●◄──[D52] [D53]──►● CS
 *                                         / M E G A   2 5 6 0                   [GND] [GND]──►● GND ─┬───── GND
 * ________[A0]_[A1]_[A2]_[A3]_[A4]_[A5]/[A6] [A7]    [A8 to A15  Analog  Pins]     O       β•‘        ___
 * ══════════│══════════════════════════════════════════════════════════════════════════════╝         _ 
 *           └─ Keypad ADC Input    
 * KEY: "~" = PWM supported Pinouts.
 * 
 * β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
 * β”‚                                   CONNECTION TABLE                                          β”‚
 * β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
 * β”‚  COMPONENT           β”‚  ARDUINO PIN    β”‚  DESCRIPTION                                       β”‚
 * β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
 * β”‚  LCD KEYPAD SHIELD   β”‚                 β”‚                                                    β”‚
 * β”‚  β”œβ”€ LCD_RS           β”‚  D8             β”‚  LCD Register Select                               β”‚
 * β”‚  β”œβ”€ LCD_EN           β”‚  D9             β”‚  LCD Enable                                        β”‚
 * β”‚  β”œβ”€ LCD_D4           β”‚  D4             β”‚  LCD Data Bit 4                                    β”‚
 * β”‚  β”œβ”€ LCD_D5           β”‚  D5             β”‚  LCD Data Bit 5                                    β”‚
 * β”‚  β”œβ”€ LCD_D6           β”‚  D6             β”‚  LCD Data Bit 6                                    β”‚
 * β”‚  β”œβ”€ LCD_D7           β”‚  D7             β”‚  LCD Data Bit 7                                    β”‚
 * β”‚  β”œβ”€ LCD_BL           β”‚  D10 (PWM)      β”‚  LCD Backlight Control                             β”‚
 * β”‚  └─ KEYPAD_PIN       β”‚  A0 (ADC)       β”‚  Analog Keypad Input                               β”‚
 * β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
 * β”‚  ST7735 TFT DISPLAY  β”‚                 β”‚                                                    β”‚
 * β”‚  β”œβ”€ TFT_CS           β”‚  D53            β”‚  SPI Chip Select                                   β”‚
 * β”‚  β”œβ”€ TFT_RST          β”‚  D48            β”‚  Reset Pin                                         β”‚
 * β”‚  β”œβ”€ TFT_DC           β”‚  D49            β”‚  Data/Command Pin (A0)                             β”‚
 * β”‚  β”œβ”€ TFT_MOSI         β”‚  D51 (HW SPI)   β”‚  SPI Data Out (Hardware SPI)                       β”‚
 * β”‚  β”œβ”€ TFT_SCLK         β”‚  D52 (HW SPI)   β”‚  SPI Clock (Hardware SPI)                          β”‚
 * β”‚  β”œβ”€ TFT_LED          β”‚  D46 (PWM)      β”‚  Backlight Control                                 β”‚
 * β”‚  β”œβ”€ VCC              β”‚  5V             β”‚  Power Supply                                      β”‚
 * β”‚  └─ GND              β”‚  GND            β”‚  Ground                                            β”‚
 * β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
 * 
 * β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
 * β”‚                                 KEYPAD BUTTON VALUES                                        β”‚
 * β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
 * β”‚  BUTTON    β”‚  ADC RANGE    β”‚  TYPICAL VALUE  β”‚  FUNCTION                                    β”‚
 * β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
 * β”‚  RIGHT     β”‚  0-50         β”‚  ~30            β”‚  TFT Brightness Adjust                       β”‚
 * β”‚  UP        β”‚  130-195      β”‚  ~144           β”‚  Mode Selection (Next)                       β”‚
 * β”‚  DOWN      β”‚  330-380      β”‚  ~329           β”‚  Mode Selection (Previous)                   β”‚
 * β”‚  LEFT      β”‚  505-555      β”‚  ~504           β”‚  LCD Brightness Adjust                       β”‚
 * β”‚  SELECT    β”‚  735-790      β”‚  ~741           β”‚  Apply Settings                              β”‚
 * β”‚  NONE      β”‚  >1000        β”‚  ~1023          β”‚  No Button Pressed                           β”‚
 * β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
 * 
 * ════════════════════════════════════════════════════════════════════════════════════════════════
 *                                  PERFORMANCE CONFIGURATIONS
 * ════════════════════════════════════════════════════════════════════════════════════════════════
 * 
 * β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
 * β”‚                          FULL SCREEN GRID IMPLEMENTATIONS                                   β”‚
 * β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
 * β”‚  CONFIG NAME   β”‚  GRID_W  β”‚  GRID_H  β”‚  CELL_SIZE  β”‚   PERFORMANCE     β”‚   VISUAL QUALITY   β”‚
 * β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
 * β”‚  Max Perfrm    β”‚   12     β”‚   16     β”‚     10      β”‚  Excellent (60fps)β”‚   Low Detail       β”‚
 * β”‚  Balanced      β”‚   32     β”‚   40     β”‚     4       β”‚  Good (30-45fps)  β”‚   Medium Detail    β”‚
 * β”‚  High Detail   β”‚   48     β”‚   60     β”‚     2       β”‚  Fair (15-30fps)  β”‚   High Detail      β”‚
 * β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
 * 
 * Default Configuration: BALANCED (32x40 grid, 4px cells)
 * 
 * β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
 * β”‚                             SPI CLOCK CONFIGURATIONS                                        β”‚
 * β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
 * β”‚  DIVIDER       β”‚  FREQUENCY  β”‚  PERFORMANCE    β”‚  STABILITY      β”‚  CPU OVERHEAD            β”‚
 * β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
 * β”‚  DIV2          β”‚  8 MHz      β”‚  Excellent      β”‚  Good           β”‚  Moderate                β”‚
 * β”‚  DIV4          β”‚  4 MHz      β”‚  Good           β”‚  Very Stable    β”‚  Lowest                  β”‚
 * β”‚  DIV8          β”‚  2 MHz      β”‚  Moderate       β”‚  Maximum        β”‚  Minimal                 β”‚
 * β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
 * 
 * Default Configuration: DIV2 (8MHz) - Balanced performance and stability
 * 
 * β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
 * β”‚                             MEMORY USAGE BREAKDOWN                                          β”‚
 * β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
 * β”‚  BUFFER TYPE          β”‚  SIZE (bytes)  β”‚  PURPOSE                                           β”‚
 * β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
 * β”‚  currentGrid          β”‚     160        β”‚  Current cellular automata state (32x40/8 bits)    β”‚
 * β”‚  nextGrid             β”‚     160        β”‚  Next generation calculation buffer                β”‚
 * β”‚  lastGrid             β”‚     160        β”‚  Previous frame for dirty tile detection           β”‚
 * β”‚  intensityGrid        β”‚    1280        β”‚  Per-cell intensity values (32x40 = 1280 bytes)    β”‚
 * β”‚  neighborCount        β”‚    1280        β”‚  Conway neighbor counting optimization             β”‚
 * β”‚  dirtyTiles           β”‚      10        β”‚  Dirty tile tracking bitmap (80 tiles/8 bits)      β”‚
 * β”‚  frameBuffer          β”‚      32        β”‚  Small tile rendering buffer (4x4x2 bytes)         β”‚
 * β”‚  ─────────────────────┼────────────────┼─────────────────────────────────────────────────   β”‚
 * β”‚  STATIC ARRAYS TOTAL  β”‚    3082        β”‚  Core simulation buffers                           β”‚
 * β”‚  ─────────────────────┼────────────────┼─────────────────────────────────────────────────   β”‚
 * β”‚  Global Variables     β”‚     836        β”‚  Objects, states, constants, buffers               β”‚
 * β”‚  β”œβ”€ Adafruit_ST7735   β”‚    ~400        β”‚  TFT display object                                β”‚
 * β”‚  β”œβ”€ LiquidCrystal     β”‚    ~100        β”‚  LCD display object                                β”‚
 * β”‚  β”œβ”€ Mode variables    β”‚     ~50        β”‚  Current/staged modes, indices                     β”‚
 * β”‚  β”œβ”€ Timing variables  β”‚     ~32        β”‚  lastUpdate, generation counters                   β”‚
 * β”‚  β”œβ”€ Constants/Arrays  β”‚    ~254        β”‚  PROGMEM pointers, brightness arrays, buffers      β”‚
 * β”‚  ─────────────────────┼────────────────┼─────────────────────────────────────────────────   β”‚
 * β”‚  TOTAL RAM USAGE      β”‚   ~3918        β”‚  47.8% of MEGA 2560's 8KB RAM                      β”‚
 * β”‚  Available for Stack  β”‚    4274        β”‚  52.2% remaining for local variables & stack       β”‚
 * β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
 * ════════════════════════════════════════════════════════════════════════════════════════════════
 *                                      USAGE NOTES
 * ════════════════════════════════════════════════════════════════════════════════════════════════
 * 
 * Button Controls:
 * β€’ UP/DOWN: Cycle through visual modes (Auto, Conway, Bright, Fire, Water, Wind, Cyclone, Portal, Wormhole)
 * β€’ LEFT: Adjust LCD backlight brightness (8 levels)
 * β€’ RIGHT: Adjust TFT backlight brightness (8 levels)
 * β€’ SELECT: Apply staged settings and initialize new mode
 * 
 * Auto Mode:
 * β€’ Automatically cycles between all visual modes every 10 minutes
 * β€’ LCD shows current running mode (e.g., "Auto Mode:CON" for Conway)
 * β€’ Each mode initializes with optimized patterns and parameters
 * 
 * Performance Tips:
 * β€’ Lower grid resolution for higher frame rates
 * β€’ Reduce SPI clock speed if display artifacts occur
 * β€’ Use Conway mode for lowest CPU usage
 * β€’ Portal/Wormhole modes are most CPU intensive
 * 
 * Customization:
 * β€’ Modify GRID_WIDTH, GRID_HEIGHT, CELL_SIZE for different resolutions
 * β€’ Adjust AUTO_MODE_INTERVAL for different cycling times
 * β€’ Change SPI_CLOCK_DIVIDER for performance tuning
 * β€’ Modify color palettes in PROGMEM for different visual themes
 * 
 * ════════════════════════════════════════════════════════════════════════════════════════════════
 *                                    MIT LICENSE
 * ════════════════════════════════════════════════════════════════════════════════════════════════
 * 
 * Copyright (c) 2025 Core1D Automation Labs
 * 
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this firmware and associated documentation files (the "Firmware"), to deal
 * in the Firmtware without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Firmware, and to permit persons to whom the Firmware is
 * furnished to do so, subject to the following conditions:
 * 
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 * 
 * THE FIRMWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE FIRMWARE OR THE USE OR OTHER DEALINGS IN THE
 * FIRMWARE.
 * 
 * ═════════════════════════════════════════════════════════════════════════════════════════
 */                                                                                       
 /*
/


*/

#include <SPI.h>
#include <Adafruit_GFX.h>
#include <Adafruit_ST7735.h>
#include <LiquidCrystal.h>
#include <avr/pgmspace.h>

// Pin definitions for MEGA 2560 with LCD Keypad Shield
#define TFT_CS   53   // CS
#define TFT_RST  48   // RST
#define TFT_DC   49   // A0/DC
#define TFT_MOSI 51   // SDA/MOSI (hardware SPI)
#define TFT_SCLK 52   // SCL/SCLK (hardware SPI)
/*
==============================================
TFT LED Backlight control Pin Notes:
==============================================
Arduino MEGA (or clones) has the following pins exposed for PWM capability: D44, D45 and D46.
Most Arduino standard LCD keypad shield will have soldering breakouts for pins D1 to D7 (as the rest of the pins are used by the TFT Shield.)
So the candidates for the backlight control pin for Digital PWM control are the D pins mentioned (The Analog A Pins and the UART D14 to 21 Pins are mentioned for positional reference, they don't support PWM.):

═════════════════════════════════════════════════════════════════════════════════════════╗
[XX] [XX] [XX] [D4] [D3] [D2] [D1] [XX]   [D14 to D21 UART Communication]  O  [5V5] [5V5]β•‘
                                     └─ <XX means in use or N/A>                    .....  LCD Keypad Shield                     ....                                  [D44] [D45]
                                         β”‚                                    [D46] [XXX]──<D47 and onwards or before D44, no PWM capability>
                                         β”‚                                       .....   
                                        / M E G A   2 5 6 0                   [GND] [GND]β•‘
_______________[A1]_[A2]_[A3]_[A4]_[A5]/[A6] [A7]    [A8 to A15  Analog  Pins]     O     β•‘
═════════════════════════════════════════════════════════════════════════════════════════╝

Use it as per your hardware requirement.
           β”‚
           β–Ό
*/
#define TFT_LED  46    // LED control pin || -1 for no pin hard control)

// LCD Keypad Shield pins
#define LCD_RS   8
#define LCD_EN   9
#define LCD_D4   4
#define LCD_D5   5
#define LCD_D6   6
#define LCD_D7   7
#define LCD_BL   10   // LCD backlight control (PWM-capable)
#define KEYPAD_PIN A0

// Display dimensions
constexpr uint8_t DISPLAY_WIDTH = 128;
constexpr uint8_t DISPLAY_HEIGHT = 160;

// Grid dimensions config
//40p Physics Tested:
constexpr uint8_t GRID_WIDTH = 32;
constexpr uint8_t GRID_HEIGHT = 40;
constexpr uint8_t CELL_SIZE = 4;
/*
Tested dimensions for full screen render (performance regarding refresh speed and button input lag may vary):

//16p Experimental (Max performance):
constexpr uint8_t GRID_WIDTH = 12;
constexpr uint8_t GRID_HEIGHT = 16;
constexpr uint8_t CELL_SIZE =10;

//40p Physics Tested:
constexpr uint8_t GRID_WIDTH = 32;
constexpr uint8_t GRID_HEIGHT = 40;
constexpr uint8_t CELL_SIZE = 4;

//60p Experimental (Tested Engine Max):
constexpr uint8_t GRID_WIDTH = 48;
constexpr uint8_t GRID_HEIGHT = 60;
constexpr uint8_t CELL_SIZE = 3;
*/

// Dirty rendering tile system - optimized for MEGA 2560
constexpr uint8_t TILE_SIZE = 4;
constexpr uint8_t TILES_X = (GRID_WIDTH + TILE_SIZE - 1) / TILE_SIZE;
constexpr uint8_t TILES_Y = (GRID_HEIGHT + TILE_SIZE - 1) / TILE_SIZE;
constexpr uint8_t TOTAL_TILES = TILES_X * TILES_Y;

// Visual modes
enum VisualMode {
  MODE_AUTO = 0,
  MODE_CONWAY,
  MODE_BRIGHT,
  MODE_FIRE,
  MODE_WATER,
  MODE_WIND,
  MODE_CYCLONE,
  MODE_PORTAL,
  MODE_WORMHOLE,
  MODE_COUNT
};

// Button states
enum ButtonState {
  BTN_NONE = 0,
  BTN_RIGHT,
  BTN_UP,
  BTN_DOWN,
  BTN_LEFT,
  BTN_SELECT
};

// Brightness levels
const uint8_t BRIGHTNESS_LEVELS[] = {32, 64, 96, 128, 160, 192, 224, 255};
const uint8_t NUM_BRIGHTNESS_LEVELS = 8;

// Bit manipulation macros
#define SET_BIT(array, bit) ((array)[(bit) >> 3] |= (1 << ((bit) & 7)))
#define CLR_BIT(array, bit) ((array)[(bit) >> 3] &= ~(1 << ((bit) & 7)))
#define GET_BIT(array, bit) (((array)[(bit) >> 3] >> ((bit) & 7)) & 1)

// Memory arrays - optimized for MEGA 2560's larger RAM
constexpr uint16_t GRID_BYTES = (GRID_WIDTH * GRID_HEIGHT + 7) / 8;
uint8_t currentGrid[GRID_BYTES];
uint8_t nextGrid[GRID_BYTES];
constexpr uint8_t DIRTY_BYTES = (TOTAL_TILES + 7) / 8;
uint8_t dirtyTiles[DIRTY_BYTES];
uint8_t lastGrid[GRID_BYTES];

// Intensity grid for complex modes (0-255 per cell)
uint8_t intensityGrid[GRID_WIDTH * GRID_HEIGHT];

// Additional buffers for optimization
uint8_t neighborCount[GRID_WIDTH * GRID_HEIGHT]; // Pre-compute neighbors for Conway
uint16_t frameBuffer[TILE_SIZE * TILE_SIZE]; // Small tile buffer for faster rendering

// PROGMEM color palettes
const uint16_t PROGMEM firePalette[] = {
  0x0000, 0x8000, 0x8800, 0xC800, 0xE800, 0xF800, 0xFC00, 0xFE00, 0xFF00, 0xFF80, 0xFFC0, 0xFFE0, 0xFFFF
};

const uint16_t PROGMEM waterPalette[] = {
  0x0000, 0x0008, 0x0010, 0x0018, 0x001F, 0x041F, 0x081F, 0x0C1F, 0x101F, 0x141F, 0x181F, 0x1C1F, 0x07FF
};

const uint16_t PROGMEM windPalette[] = {
  0x0000, 0x0410, 0x0820, 0x0C30, 0x1040, 0x1450, 0x1860, 0x1C70, 0x2080, 0x2490, 0x28A0, 0x2CB0, 0x30C0
};

const uint16_t PROGMEM cyclonePalette[] = {
  0x0000, 0x4000, 0x8000, 0xC000, 0xF800, 0xF810, 0xF820, 0xF830, 0xF840, 0xF850, 0xF860, 0xF870, 0xFFFF
};

const uint16_t PROGMEM portalPalette[] = {
  0x0000, 0x8010, 0x8020, 0x8030, 0x8810, 0x9010, 0x9820, 0xA030, 0xA810, 0xB010, 0xB820, 0xC030, 0xF81F
};

const uint16_t PROGMEM brightPalette[] = {
  0x0000, 0x001F, 0x07E0, 0x07FF, 0xF800, 0xF81F, 0xFFE0, 0xFFFF, 0x8410, 0xFD20, 0x8000, 0x0400, 0xFFFF
};

const uint16_t PROGMEM wormholePalette[] = {
  0x0000, 0x0010, 0x0030, 0x0050, 0x0070, 0x4010, 0x8030, 0xC050, 0xF070, 0xF890, 0xFCB0, 0xFED0, 0xFFFF
};

// Objects
Adafruit_ST7735 tft = Adafruit_ST7735(TFT_CS, TFT_DC, TFT_RST);
LiquidCrystal lcd(LCD_RS, LCD_EN, LCD_D4, LCD_D5, LCD_D6, LCD_D7);

// Global state
static uint32_t rngState = 1;
VisualMode currentMode = MODE_AUTO;
VisualMode stagedMode = MODE_AUTO;
VisualMode actualMode = MODE_CONWAY; // The actual running mode when in AUTO
uint8_t tftBrightnessIndex = NUM_BRIGHTNESS_LEVELS - 1;
uint8_t lcdBrightnessIndex = NUM_BRIGHTNESS_LEVELS - 1;
uint8_t stagedTftBrightnessIndex = NUM_BRIGHTNESS_LEVELS - 1;
uint8_t stagedLcdBrightnessIndex = NUM_BRIGHTNESS_LEVELS - 1;
char buffer[17];
bool settingsChanged = false;
uint32_t lastAutoModeChange = 0;
const uint32_t AUTO_MODE_INTERVAL = 600000; // 10 minutes in milliseconds

// Mode names stored in PROGMEM
const char mode0[] PROGMEM = "Auto Mode";
const char mode1[] PROGMEM = "Conway Life";
const char mode2[] PROGMEM = "Dynamic Bright";
const char mode3[] PROGMEM = "Fire";
const char mode4[] PROGMEM = "Water";
const char mode5[] PROGMEM = "Wind";
const char mode6[] PROGMEM = "Cyclone";
const char mode7[] PROGMEM = "Portal";
const char mode8[] PROGMEM = "Wormhole";

const char* const modeNames[] PROGMEM = {
  mode0, mode1, mode2, mode3, mode4, mode5, mode6, mode7, mode8
};

// Fast random function
inline uint32_t fastRandom() {
  rngState ^= rngState << 13;
  rngState ^= rngState >> 17;
  rngState ^= rngState << 5;
  return rngState;
}

void initRandomSeed() {
  uint32_t seed = 0;
  for (uint8_t i = 0; i < 32; i++) {
    seed = (seed << 1) | (analogRead(A1) & 1);
    delay(1);
  }
  rngState = seed ? seed : 12345;
}

// Optimized button reading with adaptive timing and interrupt-style checking
ButtonState readButton() {
  static uint32_t lastRead = 0;
  static uint32_t lastValidPress = 0;
  static ButtonState lastButton = BTN_NONE;
  static ButtonState confirmedButton = BTN_NONE;
  static uint8_t sameButtonCount = 0;
  static uint8_t fastRepeatCount = 0;
  
  uint32_t currentTime = millis();
  
  // Adaptive timing based on system load
  uint32_t minInterval = 50;  // Minimum 50ms between reads (much faster than original 150ms)
  uint32_t debounceTime = 100; // Debounce confirmation time
  
  // Don't read too frequently, but allow much faster reads than before
  if (currentTime - lastRead < minInterval) {
    return BTN_NONE;
  }
  
  // Read analog value
  int val = analogRead(KEYPAD_PIN);
  ButtonState currentButton = BTN_NONE;
  
  // Button detection with tighter thresholds for reliability
  if (val < 50) currentButton = BTN_RIGHT;
  else if (val < 195) currentButton = BTN_UP;
  else if (val < 380) currentButton = BTN_DOWN;
  else if (val < 555) currentButton = BTN_LEFT;
  else if (val < 790) currentButton = BTN_SELECT;
  
  lastRead = currentTime;
  
  // Fast confirmation logic
  if (currentButton == lastButton && currentButton != BTN_NONE) {
    sameButtonCount++;
    
    // Confirm button after just 2 consecutive reads (was much higher before)
    if (sameButtonCount >= 2) {
      if (confirmedButton != currentButton || 
          currentTime - lastValidPress > debounceTime) {
        
        confirmedButton = currentButton;
        lastValidPress = currentTime;
        sameButtonCount = 0;
        
        // Handle fast repeat for LEFT/RIGHT buttons (brightness adjustment)
        if (currentButton == BTN_LEFT || currentButton == BTN_RIGHT) {
          fastRepeatCount++;
          // After 3 presses, allow faster repeats
          if (fastRepeatCount > 3) {
            lastValidPress = currentTime - (debounceTime - 30); // Faster repeat
          }
        } else {
          fastRepeatCount = 0;
        }
        
        return confirmedButton;
      }
    }
  } else {
    // Button changed or released
    sameButtonCount = 0;
    if (currentButton == BTN_NONE) {
      fastRepeatCount = 0;
    }
  }
  
  lastButton = currentButton;
  return BTN_NONE;
}

// Grid functions
inline uint16_t coordToBit(uint8_t x, uint8_t y) {
  return y * GRID_WIDTH + x;
}

inline uint16_t coordToIndex(uint8_t x, uint8_t y) {
  return y * GRID_WIDTH + x;
}

inline bool getCellState(const uint8_t* grid, uint8_t x, uint8_t y) {
  if (x >= GRID_WIDTH || y >= GRID_HEIGHT) return false;
  return GET_BIT(grid, coordToBit(x, y));
}

inline void setCellState(uint8_t* grid, uint8_t x, uint8_t y, bool alive) {
  if (x >= GRID_WIDTH || y >= GRID_HEIGHT) return;
  uint16_t bit = coordToBit(x, y);
  if (alive) SET_BIT(grid, bit);
  else CLR_BIT(grid, bit);
}

inline uint8_t getIntensity(uint8_t x, uint8_t y) {
  if (x >= GRID_WIDTH || y >= GRID_HEIGHT) return 0;
  return intensityGrid[coordToIndex(x, y)];
}

inline void setIntensity(uint8_t x, uint8_t y, uint8_t intensity) {
  if (x >= GRID_WIDTH || y >= GRID_HEIGHT) return;
  intensityGrid[coordToIndex(x, y)] = intensity;
}

// Dirty rendering functions
inline uint8_t getTileIndex(uint8_t tileX, uint8_t tileY) {
  return tileY * TILES_X + tileX;
}

inline void markTileDirty(uint8_t tileX, uint8_t tileY) {
  if (tileX >= TILES_X || tileY >= TILES_Y) return;
  SET_BIT(dirtyTiles, getTileIndex(tileX, tileY));
}

inline bool isTileDirty(uint8_t tileX, uint8_t tileY) {
  if (tileX >= TILES_X || tileY >= TILES_Y) return false;
  return GET_BIT(dirtyTiles, getTileIndex(tileX, tileY));
}

inline void clearTileDirty(uint8_t tileX, uint8_t tileY) {
  if (tileX >= TILES_X || tileY >= TILES_Y) return;
  CLR_BIT(dirtyTiles, getTileIndex(tileX, tileY));
}

void clearAllDirtyTiles() {
  memset(dirtyTiles, 0, DIRTY_BYTES);
}

void markAllTilesDirty() {
  memset(dirtyTiles, 0xFF, DIRTY_BYTES);
}

void markCellDirty(uint8_t x, uint8_t y) {
  markTileDirty(x / TILE_SIZE, y / TILE_SIZE);
}

// Optimized Conway's Game of Life with better seeding
void updateConway() {
  static uint16_t stagnationCounter = 0;
  static uint16_t lastLiveCells = 0;
  
  // Pre-compute neighbor counts for all cells
  memset(neighborCount, 0, sizeof(neighborCount));
  
  for (uint8_t y = 0; y < GRID_HEIGHT; y++) {
    for (uint8_t x = 0; x < GRID_WIDTH; x++) {
      if (getCellState(currentGrid, x, y)) {
        // Increment neighbor count for all 8 surrounding cells
        for (int8_t dy = -1; dy <= 1; dy++) {
          for (int8_t dx = -1; dx <= 1; dx++) {
            if (dx == 0 && dy == 0) continue;
            uint8_t nx = (x + dx + GRID_WIDTH) % GRID_WIDTH;
            uint8_t ny = (y + dy + GRID_HEIGHT) % GRID_HEIGHT;
            neighborCount[coordToIndex(nx, ny)]++;
          }
        }
      }
    }
  }
  
  // Apply Conway's rules
  memset(nextGrid, 0, GRID_BYTES);
  uint16_t liveCells = 0;
  
  for (uint8_t y = 0; y < GRID_HEIGHT; y++) {
    for (uint8_t x = 0; x < GRID_WIDTH; x++) {
      uint8_t neighbors = neighborCount[coordToIndex(x, y)];
      bool alive = getCellState(currentGrid, x, y);
      bool newState = alive ? (neighbors == 2 || neighbors == 3) : (neighbors == 3);
      
      if (newState) {
        setCellState(nextGrid, x, y, true);
        liveCells++;
      }
    }
  }
  
  // Check for stagnation and add new patterns
  if (liveCells == lastLiveCells) {
    stagnationCounter++;
  } else {
    stagnationCounter = 0;
  }
  
  // If stagnating or too few cells, add new patterns
  if (stagnationCounter > 10 || liveCells < 8) {
    addConwayPatterns();
    stagnationCounter = 0;
  }
  
  lastLiveCells = liveCells;
  memcpy(currentGrid, nextGrid, GRID_BYTES);
}

// Conway patterns
void addConwayPatterns() {
  uint8_t patternType = fastRandom() % 5;
  uint8_t startX = fastRandom() % (GRID_WIDTH - 8);
  uint8_t startY = fastRandom() % (GRID_HEIGHT - 8);
  
  switch (patternType) {
    case 0: // Glider
      setCellState(currentGrid, startX + 1, startY, true);
      setCellState(currentGrid, startX + 2, startY + 1, true);
      setCellState(currentGrid, startX, startY + 2, true);
      setCellState(currentGrid, startX + 1, startY + 2, true);
      setCellState(currentGrid, startX + 2, startY + 2, true);
      break;
      
    case 1: // Blinker
      setCellState(currentGrid, startX, startY + 1, true);
      setCellState(currentGrid, startX + 1, startY + 1, true);
      setCellState(currentGrid, startX + 2, startY + 1, true);
      break;
      
    case 2: // Block
      setCellState(currentGrid, startX, startY, true);
      setCellState(currentGrid, startX + 1, startY, true);
      setCellState(currentGrid, startX, startY + 1, true);
      setCellState(currentGrid, startX + 1, startY + 1, true);
      break;
      
    case 3: // Toad
      setCellState(currentGrid, startX + 1, startY, true);
      setCellState(currentGrid, startX + 2, startY, true);
      setCellState(currentGrid, startX + 3, startY, true);
      setCellState(currentGrid, startX, startY + 1, true);
      setCellState(currentGrid, startX + 1, startY + 1, true);
      setCellState(currentGrid, startX + 2, startY + 1, true);
      break;
      
    case 4: // Random cluster
      for (uint8_t i = 0; i < 8; i++) {
        uint8_t x = startX + (fastRandom() % 5);
        uint8_t y = startY + (fastRandom() % 5);
        setCellState(currentGrid, x, y, true);
      }
      break;
  }
}

// Dynamic brightness
void updateBright() {
  static uint8_t pulsePhase = 0;
  static uint8_t sparkleCounter = 0;
  
  pulsePhase++;
  sparkleCounter++;
  
  // Create pulsing waves
  float wave1 = sin(pulsePhase * 0.1) * 127 + 128;
  float wave2 = cos(pulsePhase * 0.07 + 1.57) * 127 + 128;
  
  for (uint8_t y = 0; y < GRID_HEIGHT; y++) {
    for (uint8_t x = 0; x < GRID_WIDTH; x++) {
      // Combine multiple wave patterns
      float distance = sqrt((x - GRID_WIDTH/2) * (x - GRID_WIDTH/2) + 
                           (y - GRID_HEIGHT/2) * (y - GRID_HEIGHT/2));
      float radialWave = sin(distance * 0.5 + pulsePhase * 0.2) * 127 + 128;
      
      // Mix waves with position-based patterns
      uint8_t intensity = (wave1 * 0.3 + wave2 * 0.3 + radialWave * 0.4);
      
      // Add sparkles
      if (sparkleCounter % 3 == 0 && fastRandom() % 20 == 0) {
        intensity = 255;
      }
      
      // Add flowing patterns
      uint8_t flowPattern = sin((x + y + pulsePhase) * 0.3) * 64 + 64;
      intensity = (intensity + flowPattern) / 2;
      
      if (intensity > 255) intensity = 255;
      if (intensity < 30) intensity = 0; // Threshold for cleaner look
      
      setIntensity(x, y, intensity);
      setCellState(currentGrid, x, y, intensity > 30);
    }
  }
}

void updateFire() {
  for (uint8_t y = 0; y < GRID_HEIGHT; y++) {
    for (uint8_t x = 0; x < GRID_WIDTH; x++) {
      if (y == GRID_HEIGHT - 1) {
        if (fastRandom() % 3 == 0) {
          setIntensity(x, y, 255);
          setCellState(currentGrid, x, y, true);
        }
      } else {
        uint16_t newIntensity = 0;
        if (y < GRID_HEIGHT - 1) newIntensity += getIntensity(x, y + 1) * 0.8;
        if (x > 0) newIntensity += getIntensity(x - 1, y) * 0.3;
        if (x < GRID_WIDTH - 1) newIntensity += getIntensity(x + 1, y) * 0.3;
        
        newIntensity /= 1.4;
        newIntensity -= fastRandom() % 15;
        if (newIntensity > 255) newIntensity = 255;
        if (newIntensity < 0) newIntensity = 0;
        
        setIntensity(x, y, newIntensity);
        setCellState(currentGrid, x, y, getIntensity(x, y) > 30);
      }
    }
  }
}

void updateWater() {
  static int16_t wavePhase = 0;
  wavePhase++;
  
  for (uint8_t y = 0; y < GRID_HEIGHT; y++) {
    for (uint8_t x = 0; x < GRID_WIDTH; x++) {
      float wave = sin((float)(x + y + wavePhase) * 0.3) * 127.0 + 128.0;
      uint8_t intensity = (uint8_t)wave + (fastRandom() % 30) - 15;
      if (intensity > 255) intensity = 255;
      
      setIntensity(x, y, intensity);
      setCellState(currentGrid, x, y, intensity > 100);
    }
  }
}

void updateWind() {
  static uint8_t windOffset = 0;
  windOffset++;
  
  for (uint8_t y = 0; y < GRID_HEIGHT; y++) {
    for (uint8_t x = 0; x < GRID_WIDTH; x++) {
      uint8_t pattern = ((x + windOffset) % 8) + ((y + windOffset/2) % 4) * 2;
      uint8_t intensity = (pattern * 32) + (fastRandom() % 50);
      if (intensity > 255) intensity = 255;
      
      setIntensity(x, y, intensity);
      setCellState(currentGrid, x, y, intensity > 80);
    }
  }
}

void updateCyclone() {
  static float angle = 0;
  angle += 0.1;
  
  uint8_t centerX = GRID_WIDTH / 2;
  uint8_t centerY = GRID_HEIGHT / 2;
  
  for (uint8_t y = 0; y < GRID_HEIGHT; y++) {
    for (uint8_t x = 0; x < GRID_WIDTH; x++) {
      float dx = x - centerX;
      float dy = y - centerY;
      float distance = sqrt(dx*dx + dy*dy);
      float cellAngle = atan2(dy, dx) + angle + distance * 0.2;
      
      uint8_t intensity = (sin(cellAngle) * 127 + 128) * (1.0 - distance / 25.0);
      if (intensity > 255) intensity = 255;
      if (intensity < 0) intensity = 0;
      
      setIntensity(x, y, intensity);
      setCellState(currentGrid, x, y, intensity > 50);
    }
  }
}

// Portal ring that erases background when shrinking
void updatePortal() {
  static bool expanding = true;
  static uint8_t portalSize = 5;
  static uint8_t lastPortalSize = 5;
  static uint8_t ringThickness = 2;
  static bool backgroundCleared = false;
  
  // Update portal size
  if (expanding) {
    portalSize++;
    if (portalSize > 20) {
      expanding = false;
    }
  } else {
    portalSize--;
    if (portalSize < 3) {
      expanding = true;
      backgroundCleared = false; // Reset for next cycle
    }
  }
  
  uint8_t centerX = GRID_WIDTH / 2;
  uint8_t centerY = GRID_HEIGHT / 2;
  
  // Clear everything at the start of shrinking cycle
  if (!expanding && !backgroundCleared) {
    for (uint8_t y = 0; y < GRID_HEIGHT; y++) {
      for (uint8_t x = 0; x < GRID_WIDTH; x++) {
        setIntensity(x, y, 0);
        setCellState(currentGrid, x, y, false);
      }
    }
    backgroundCleared = true;
  }
  
  // If expanding, draw background pattern inside the portal
  if (expanding && portalSize > lastPortalSize) {
    for (uint8_t y = 0; y < GRID_HEIGHT; y++) {
      for (uint8_t x = 0; x < GRID_WIDTH; x++) {
        float dx = x - centerX;
        float dy = y - centerY;
        float distance = sqrt(dx*dx + dy*dy);
        
        // Draw background in the newly revealed area (between last and current portal size)
        if (distance > lastPortalSize - ringThickness && distance <= portalSize - ringThickness) {
          if ((x + y) % 4 == 0) {
            setIntensity(x, y, 80);
            setCellState(currentGrid, x, y, true);
          }
        }
      }
    }
  }
  
  // If shrinking, erase background where portal was previously
  if (!expanding && portalSize < lastPortalSize) {
    for (uint8_t y = 0; y < GRID_HEIGHT; y++) {
      for (uint8_t x = 0; x < GRID_WIDTH; x++) {
        float dx = x - centerX;
        float dy = y - centerY;
        float distance = sqrt(dx*dx + dy*dy);
        
        // Erase area that was previously inside the larger portal
        if (distance <= lastPortalSize + ringThickness && distance >= portalSize + ringThickness) {
          setIntensity(x, y, 0);
          setCellState(currentGrid, x, y, false);
        }
      }
    }
  }
  
  // Draw the portal ring
  for (uint8_t y = 0; y < GRID_HEIGHT; y++) {
    for (uint8_t x = 0; x < GRID_WIDTH; x++) {
      float dx = x - centerX;
      float dy = y - centerY;
      float distance = sqrt(dx*dx + dy*dy);
      
      // Check if we're within the ring area
      if (distance >= (portalSize - ringThickness) && distance <= (portalSize + ringThickness)) {
        float distanceFromRing = abs(distance - portalSize);
        
        if (distanceFromRing <= ringThickness) {
          uint8_t intensity = 255 - (distanceFromRing * 100 / ringThickness);
          
          // Add shimmer effect
          float angle = atan2(dy, dx);
          float shimmer = sin(angle * 6 + millis() * 0.008) * 0.4 + 0.8;
          intensity = intensity * shimmer;
          
          if (intensity > 255) intensity = 255;
          if (intensity < 120) intensity = 120;
          
          setIntensity(x, y, intensity);
          setCellState(currentGrid, x, y, true);
        }
      }
    }
  }
  
  lastPortalSize = portalSize;
}

// Wormhole mode (infinite tunnel)
void updateWormhole() {
  static float tunnelDepth = 0;
  static float rotationAngle = 0;
  
  // Continuous expansion - never resets, keeps growing forever
  tunnelDepth += 0.15;
  rotationAngle += 0.04;
  
  uint8_t centerX = GRID_WIDTH / 2;
  uint8_t centerY = GRID_HEIGHT / 2;
  
  // Calculate maximum distance to corners to ensure full screen coverage
  float maxDistance = sqrt((GRID_WIDTH/2) * (GRID_WIDTH/2) + (GRID_HEIGHT/2) * (GRID_HEIGHT/2));
  
  for (uint8_t y = 0; y < GRID_HEIGHT; y++) {
    for (uint8_t x = 0; x < GRID_WIDTH; x++) {
      float dx = x - centerX;
      float dy = y - centerY;
      float distance = sqrt(dx * dx + dy * dy);
      
      uint8_t intensity = 0;
      
      if (distance > 0.1) { // Avoid center singularity
        // Create tunnel coordinate system
        float angle = atan2(dy, dx);
        float radius = distance;
        
        // Tunnel depth calculation for continuous expansion
        float depth = tunnelDepth + (20.0 / (radius + 1.0));
        
        // Create spiral arms (6 arms rotating)
        float spiralAngle = angle + rotationAngle + (depth * 0.3);
        float spiralPattern = sin(spiralAngle * 6.0);
        
        // Create depth rings that move outward continuously
        float ringPattern = sin(depth * 0.8);
        
        // Combine patterns
        float combinedPattern = (spiralPattern * 0.7 + ringPattern * 0.3);
        
        // Only show positive values to create proper tunnel walls
        if (combinedPattern > 0.1) {
          // Convert to intensity with distance-based falloff
          intensity = combinedPattern * 255;
          
          // Extended distance-based fading to cover full screen including corners
          float falloff = 1.0;
          if (radius > 5.0) {
            // Scale falloff to reach the corners (maxDistance)
            falloff = 1.0 - ((radius - 5.0) / (maxDistance - 5.0));
            if (falloff < 0) falloff = 0;
          }
          
          intensity = intensity * falloff;
          
          // Ensure minimum visibility for tunnel walls
          if (intensity > 0 && intensity < 40) {
            intensity = 40;
          }
          
          // Cap maximum intensity
          if (intensity > 255) intensity = 255;
        }
      } else {
        // Handle center area with bright intensity
        intensity = 255;
      }
      
      setIntensity(x, y, intensity);
      setCellState(currentGrid, x, y, intensity > 0);
    }
  }
}

// Auto mode management
void updateAutoMode() {
  if (currentMode == MODE_AUTO) {
    uint32_t currentTime = millis();
    
    // Check if it's time to change modes (10 minutes)
    if (currentTime - lastAutoModeChange >= AUTO_MODE_INTERVAL) {
      // Cycle to next mode (skip AUTO mode itself)
      actualMode = (VisualMode)((actualMode - MODE_AUTO) % (MODE_COUNT - 1) + 1);
      lastAutoModeChange = currentTime;
      
      // Clear grids for new mode
      memset(currentGrid, 0, GRID_BYTES);
      memset(lastGrid, 0, GRID_BYTES);
      memset(intensityGrid, 0, sizeof(intensityGrid));
      memset(neighborCount, 0, sizeof(neighborCount));
      markAllTilesDirty();
      tft.fillScreen(0x0000);
      initializeGrid();
      
      // Update LCD display to show new mode
      updateLCDDisplay();
    }
  }
}

// Get the current running mode (for AUTO mode)
VisualMode getRunningMode() {
  return (currentMode == MODE_AUTO) ? actualMode : currentMode;
}
// Get color based on current mode
uint16_t getModeColor(uint8_t x, uint8_t y, uint16_t generation, bool alive) {
  const uint16_t* palette;
  uint8_t paletteSize;
  VisualMode runningMode = getRunningMode();
  
  switch (runningMode) {
    case MODE_FIRE:
      palette = firePalette;
      paletteSize = 13;
      break;
    case MODE_WATER:
      palette = waterPalette;
      paletteSize = 13;
      break;
    case MODE_WIND:
      palette = windPalette;
      paletteSize = 13;
      break;
    case MODE_CYCLONE:
      palette = cyclonePalette;
      paletteSize = 13;
      break;
    case MODE_PORTAL:
      palette = portalPalette;
      paletteSize = 13;
      break;
    case MODE_WORMHOLE:
      palette = wormholePalette;
      paletteSize = 13;
      break;
    case MODE_BRIGHT:
      palette = brightPalette;
      paletteSize = 13;
      break;
    default: // Conway
      palette = brightPalette;
      paletteSize = 8;
      break;
  }
  
  uint8_t colorIndex;
  if (runningMode == MODE_CONWAY) {
    if (!alive) return 0x0000;
    colorIndex = ((x + y + generation) % 7) + 1;
  } else {
    uint8_t intensity = getIntensity(x, y);
    if (intensity == 0) return 0x0000;
    colorIndex = (intensity * (paletteSize - 1)) / 255;
  }
  
  return pgm_read_word(&palette[colorIndex]);
}

// Optimized rendering with tile buffering
void detectChangesAndMarkDirty() {
  VisualMode runningMode = getRunningMode();
  
  for (uint8_t y = 0; y < GRID_HEIGHT; y++) {
    for (uint8_t x = 0; x < GRID_WIDTH; x++) {
      bool currentState = getCellState(currentGrid, x, y);
      bool lastState = GET_BIT(lastGrid, coordToBit(x, y));
      
      if (runningMode == MODE_CONWAY) {
        if (currentState != lastState) {
          markCellDirty(x, y);
        }
      } else {
        if (currentState) {
          markCellDirty(x, y);
        }
      }
      
      if (currentState) {
        SET_BIT(lastGrid, coordToBit(x, y));
      } else {
        CLR_BIT(lastGrid, coordToBit(x, y));
      }
    }
  }
}

// Optimized tile rendering with proper SPI usage
void renderTile(uint8_t tileX, uint8_t tileY, uint16_t generation) {
  uint8_t startX = tileX * TILE_SIZE;
  uint8_t startY = tileY * TILE_SIZE;
  uint8_t endX = min(startX + TILE_SIZE, GRID_WIDTH);
  uint8_t endY = min(startY + TILE_SIZE, GRID_HEIGHT);
  
  VisualMode runningMode = getRunningMode();
  
  // Direct rendering with optimized fillRect calls
  for (uint8_t y = startY; y < endY; y++) {
    for (uint8_t x = startX; x < endX; x++) {
      bool currentState = getCellState(currentGrid, x, y);
      uint16_t color;
      
      if (runningMode == MODE_CONWAY) {
        if (currentState) {
          color = getModeColor(x, y, generation, currentState);
        } else {
          color = 0x0000; // Black for dead cells
        }
      } else {
        color = getModeColor(x, y, generation, currentState);
      }
      
      // Use fillRect for fastest rendering on hardware SPI
      if (color != 0x0000 || runningMode != MODE_CONWAY) {
        tft.fillRect(x * CELL_SIZE, y * CELL_SIZE, CELL_SIZE, CELL_SIZE, color);
      }
    }
  }
}

// Update display with optimized rendering
void updateDisplay(uint16_t generation) {
  static bool firstRun = true;
  
  if (firstRun) {
    tft.fillScreen(0x0000);
    firstRun = false;
  }
  
  detectChangesAndMarkDirty();
  
  // Process dirty tiles in optimized order
  for (uint8_t tileY = 0; tileY < TILES_Y; tileY++) {
    for (uint8_t tileX = 0; tileX < TILES_X; tileX++) {
      if (isTileDirty(tileX, tileY)) {
        renderTile(tileX, tileY, generation);
        clearTileDirty(tileX, tileY);
      }
    }
  }
}

// Update LCD display
void updateLCDDisplay() {
  lcd.setCursor(0, 0);
  strcpy_P(buffer, (char*)pgm_read_word(&(modeNames[stagedMode])));
  lcd.print(buffer);
  
  // Show current running mode if in AUTO mode
  if (stagedMode == MODE_AUTO && currentMode == MODE_AUTO) {
    lcd.print(":");
    // Show abbreviated current mode
    switch (actualMode) {
      case MODE_CONWAY: lcd.print("CON"); break;
      case MODE_BRIGHT: lcd.print("BRT"); break;
      case MODE_FIRE: lcd.print("FIR"); break;
      case MODE_WATER: lcd.print("WAT"); break;
      case MODE_WIND: lcd.print("WND"); break;
      case MODE_CYCLONE: lcd.print("CYC"); break;
      case MODE_PORTAL: lcd.print("POR"); break;
      case MODE_WORMHOLE: lcd.print("WRM"); break;
    }
  }
  
  // Clear rest of line
  for (uint8_t i = strlen(buffer) + (stagedMode == MODE_AUTO && currentMode == MODE_AUTO ? 4 : 0); i < 16; i++) {
    lcd.print(" ");
  }
  
  lcd.setCursor(0, 1);
  lcd.print("T:");
  lcd.print(BRIGHTNESS_LEVELS[stagedTftBrightnessIndex]);
  lcd.print(" L:");
  lcd.print(BRIGHTNESS_LEVELS[stagedLcdBrightnessIndex]);
  
  if (settingsChanged) {
    lcd.print(" *");
  } else {
    lcd.print("  ");
  }
  
  lcd.print("   ");
}

// Apply staged settings
void applySettings() {
  if (currentMode != stagedMode) {
    currentMode = stagedMode;
    
    // Initialize AUTO mode
    if (currentMode == MODE_AUTO) {
      actualMode = MODE_CONWAY; // Start with Conway
      lastAutoModeChange = millis();
    }
    
    memset(currentGrid, 0, GRID_BYTES);
    memset(lastGrid, 0, GRID_BYTES);
    memset(intensityGrid, 0, sizeof(intensityGrid));
    memset(neighborCount, 0, sizeof(neighborCount));
    markAllTilesDirty();
    tft.fillScreen(0x0000);
    initializeGrid();
  }
  
  if (tftBrightnessIndex != stagedTftBrightnessIndex) {
    tftBrightnessIndex = stagedTftBrightnessIndex;
    analogWrite(TFT_LED, BRIGHTNESS_LEVELS[tftBrightnessIndex]);
  }
  
  if (lcdBrightnessIndex != stagedLcdBrightnessIndex) {
    lcdBrightnessIndex = stagedLcdBrightnessIndex;
    analogWrite(LCD_BL, BRIGHTNESS_LEVELS[lcdBrightnessIndex]);
  }
  
  settingsChanged = false;
}

// Alternative high-priority button check for critical responsiveness
ButtonState fastButtonCheck() {
  static uint32_t lastFastRead = 0;
  uint32_t currentTime = millis();
  
  // Only do fast check every 25ms to avoid overwhelming the system
  if (currentTime - lastFastRead < 25) {
    return BTN_NONE;
  }
  
  lastFastRead = currentTime;
  
  int val = analogRead(KEYPAD_PIN);
  
  // Only detect clear button presses for immediate response
  if (val < 30) return BTN_RIGHT;        // Very clear RIGHT press
  else if (val < 150) return BTN_UP;     // Very clear UP press  
  else if (val < 350) return BTN_DOWN;   // Very clear DOWN press
  else if (val < 520) return BTN_LEFT;   // Very clear LEFT press
  else if (val < 750) return BTN_SELECT; // Very clear SELECT press
  
  return BTN_NONE;
}

// Enhanced input handling with dual-speed detection
void handleInput() {
  static uint32_t lastInputProcess = 0;
  static ButtonState pendingButton = BTN_NONE;
  uint32_t currentTime = millis();
  
  // Try fast button check first for immediate response
  ButtonState fastButton = fastButtonCheck();
  if (fastButton != BTN_NONE) {
    pendingButton = fastButton;
  }
  
  // Process pending button or get regular button
  ButtonState button = (pendingButton != BTN_NONE) ? pendingButton : readButton();
  
  if (button != BTN_NONE) {
    // Clear pending after processing
    if (pendingButton == button) {
      pendingButton = BTN_NONE;
    }
    
    // Prevent input flooding - minimum 80ms between processed inputs
    if (currentTime - lastInputProcess < 80 && 
        button != BTN_LEFT && button != BTN_RIGHT) {
      return;
    }
    
    lastInputProcess = currentTime;
    
    switch (button) {
      case BTN_LEFT:
        stagedTftBrightnessIndex = (stagedTftBrightnessIndex + 1) % NUM_BRIGHTNESS_LEVELS;
        settingsChanged = (stagedTftBrightnessIndex != tftBrightnessIndex) || 
                         (stagedLcdBrightnessIndex != lcdBrightnessIndex) ||
                         (stagedMode != currentMode);
        updateLCDDisplay();
        break;
        
      case BTN_RIGHT:
        stagedLcdBrightnessIndex = (stagedLcdBrightnessIndex + 1) % NUM_BRIGHTNESS_LEVELS;
        settingsChanged = (stagedTftBrightnessIndex != tftBrightnessIndex) || 
                         (stagedLcdBrightnessIndex != lcdBrightnessIndex) ||
                         (stagedMode != currentMode);
        updateLCDDisplay();
        break;
        
      case BTN_UP:
        stagedMode = (VisualMode)((stagedMode + 1) % MODE_COUNT);
        settingsChanged = (stagedTftBrightnessIndex != tftBrightnessIndex) || 
                         (stagedLcdBrightnessIndex != lcdBrightnessIndex) ||
                         (stagedMode != currentMode);
        updateLCDDisplay();
        break;
        
      case BTN_DOWN:
        stagedMode = (VisualMode)((stagedMode + MODE_COUNT - 1) % MODE_COUNT);
        settingsChanged = (stagedTftBrightnessIndex != tftBrightnessIndex) || 
                         (stagedLcdBrightnessIndex != lcdBrightnessIndex) ||
                         (stagedMode != currentMode);
        updateLCDDisplay();
        break;
        
      case BTN_SELECT:
        applySettings();
        updateLCDDisplay();
        break;
    }
  }
}

// Optional interrupt-based button detection for maximum responsiveness
// Uncomment and modify to implement hard interrupts (Verify your button ADC values first)
/*
volatile ButtonState interruptButton = BTN_NONE;
volatile uint32_t interruptTime = 0;

void buttonInterruptHandler() {
  uint32_t currentTime = millis();
  if (currentTime - interruptTime > 50) { // Simple debounce
    int val = analogRead(KEYPAD_PIN);
    if (val < 50) interruptButton = BTN_RIGHT;
    else if (val < 195) interruptButton = BTN_UP;
    else if (val < 380) interruptButton = BTN_DOWN;
    else if (val < 555) interruptButton = BTN_LEFT;
    else if (val < 790) interruptButton = BTN_SELECT;
    interruptTime = currentTime;
  }
}

// Call this in setup() to enable interrupt-based detection:
void enableButtonInterrupts() {
  // Attach interrupt to a digital pin connected to your button circuit
  // attachInterrupt(digitalPinToInterrupt(2), buttonInterruptHandler, CHANGE);
}

ButtonState getInterruptButton() {
  ButtonState button = interruptButton;
  interruptButton = BTN_NONE;
  return button;
}
*/

// Initialize grid based on mode with better patterns
void initializeGrid() {
  memset(currentGrid, 0, GRID_BYTES);
  memset(lastGrid, 0, GRID_BYTES);
  memset(intensityGrid, 0, sizeof(intensityGrid));
  memset(neighborCount, 0, sizeof(neighborCount));
  
  VisualMode runningMode = getRunningMode();
  
  if (runningMode == MODE_CONWAY) {
    // Add multiple interesting patterns for Conway
    addConwayPatterns();
    addConwayPatterns();
    addConwayPatterns();
    
    // Add some random cells for good measure
    for (uint8_t i = 0; i < 15; i++) {
      uint8_t x = fastRandom() % GRID_WIDTH;
      uint8_t y = fastRandom() % GRID_HEIGHT;
      setCellState(currentGrid, x, y, true);
    }
  }
  
  markAllTilesDirty();
}

// Enhanced initialization test sequences
void testDisplays() {
  lcd.setCursor(0, 0);
  lcd.print("Testing TFT...");
  
  // Test TFT with smoother color transitions
  uint16_t colors[] = {0xF800, 0x07E0, 0x001F, 0xFFE0, 0xF81F, 0x07FF, 0xFFFF};
  for (uint8_t i = 0; i < 7; i++) {
    tft.fillRect(0, i * 23, DISPLAY_WIDTH, 23, colors[i]);
    delay(150);
  }
  
  delay(500);
  tft.fillScreen(0x0000);
  
  // Test TFT brightness with smooth transitions
  lcd.setCursor(0, 1);
  lcd.print("TFT Brightness  ");
  
  for (uint8_t i = 0; i < NUM_BRIGHTNESS_LEVELS; i++) {
    analogWrite(TFT_LED, BRIGHTNESS_LEVELS[i]);
    
    // Show brightness level visually
    uint8_t barHeight = (BRIGHTNESS_LEVELS[i] * DISPLAY_HEIGHT) / 255;
    tft.fillScreen(0x0000);
    tft.fillRect(50, DISPLAY_HEIGHT - barHeight, 28, barHeight, 0x07E0);
    
    delay(250);
  }
  
  // Test LCD brightness
  lcd.setCursor(0, 0);
  lcd.print("Testing LCD...  ");
  lcd.setCursor(0, 1);
  lcd.print("LCD Brightness  ");
  
  for (uint8_t i = 0; i < NUM_BRIGHTNESS_LEVELS; i++) {
    analogWrite(LCD_BL, BRIGHTNESS_LEVELS[i]);
    delay(300);
  }
  
  // Final test pattern - grid demonstration
  tft.fillScreen(0x0000);
  lcd.setCursor(0, 0);
  lcd.print("Grid Test...    ");
  
  for (uint8_t y = 0; y < GRID_HEIGHT; y += 4) {
    for (uint8_t x = 0; x < GRID_WIDTH; x += 4) {
      uint16_t color = pgm_read_word(&brightPalette[((x/4 + y/4) % 7) + 1]);
      tft.fillRect(x * CELL_SIZE, y * CELL_SIZE, CELL_SIZE * 4, CELL_SIZE * 4, color);
      delay(20);
    }
  }
  
  lcd.setCursor(0, 0);
  lcd.print("Init Complete!  ");
  lcd.setCursor(0, 1);
  lcd.print("MEGA Optimized! ");
  
  delay(1500);
  tft.fillScreen(0x0000);
}

// Enhanced title screen
void showTitleScreen() {
  tft.fillScreen(0x0000);
  lcd.setCursor(0, 0);
  lcd.print("Core!D Auto Labs");
  lcd.setCursor(0, 1);
  lcd.print("PIXEL WASH v0.7 ");
  
  // Title animation with background effects starting after text is complete
  for (uint8_t frame = 0; frame < 130; frame++) {
    
    // Progressive text display first (no background effects yet)
    tft.setTextSize(1);
    if (frame >= 20) {
      tft.setTextColor(0x07FF);
      if (frame >= 30) { tft.setCursor(8, 40); tft.print("Core!D"); }
      if (frame >= 40) { tft.setCursor(8, 52); tft.print("Automation"); }
      if (frame >= 50) { tft.setCursor(8, 64); tft.print("Labs"); }
    }
    
    if (frame >= 70) {
      tft.setTextColor(0xF81F);
      tft.setTextSize(2);
      if (frame >= 80) { tft.setCursor(12, 85); tft.print("PIXEL"); }
      if (frame >= 90) { tft.setCursor(12, 105); tft.print("WASH"); }
      if (frame >= 100) { 
        tft.setTextSize(1);
        tft.setCursor(12, 125); 
        tft.print("v2.7 MEGA");
      }
      if (frame >= 110) {
        tft.setTextColor(0x07E0); // Green for tech feature
        tft.setCursor(8, 137); 
        tft.print("RENDER ENGINE");
      }
    }
    
    // Start background effects only after all text is revealed (frame 115+)
    if (frame >= 115) {
      // Sparkling background
      for (uint8_t i = 0; i < 15; i++) {
        uint8_t x = fastRandom() % DISPLAY_WIDTH;
        uint8_t y = fastRandom() % DISPLAY_HEIGHT;
        
        // Avoid drawing over text areas (expanded to include DIRTY RENDER)
        if ((y >= 35 && y <= 145)) {
          if (x >= 5 && x <= 123) continue; // Skip text area
        }
        
        uint16_t color = pgm_read_word(&brightPalette[(fastRandom() % 7) + 1]);
        tft.drawPixel(x, y, color);
      }
      
      // Fade older pixels in background areas only
      if (frame % 3 == 0) {
        for (uint8_t i = 0; i < 20; i++) {
          uint8_t x = fastRandom() % DISPLAY_WIDTH;
          uint8_t y = fastRandom() % DISPLAY_HEIGHT;
          
          // Avoid fading text areas (expanded)
          if ((y >= 35 && y <= 145)) {
            if (x >= 5 && x <= 123) continue;
          }
          
          tft.drawPixel(x, y, 0x0000);
        }
      }
    }
    
    delay(35);
  }
  
  delay(1200);
  tft.fillScreen(0x0000);
}

void setup() {
  // Initialize random seed using multiple analog pins for better entropy
  initRandomSeed();
  
  // Initialize TFT with maximum performance SPI settings for MEGA 2560
  pinMode(TFT_LED, OUTPUT);
  analogWrite(TFT_LED, 255);
  
  // Configure SPI for maximum speed on MEGA 2560
  SPI.begin();

  SPI.setClockDivider(SPI_CLOCK_DIV2);   //8MHz SPI clock (16MHz/2) - stable balanced speed, balanced
//SPI.setClockDivider(SPI_CLOCK_DIV4);  // 4MHz SPI clock (16MHz/4) - slower but more stable, least CPU overhead

  SPI.setDataMode(SPI_MODE0);          // Ensure correct SPI mode
  SPI.setBitOrder(MSBFIRST);           // Most significant bit first
  
  tft.initR(INITR_BLACKTAB);
  tft.setRotation(0);
  
  // Initialize LCD (uses parallel interface - no SPI conflict)
  pinMode(LCD_BL, OUTPUT);
  analogWrite(LCD_BL, 255);
  lcd.begin(16, 2);
  
  // Show enhanced initialization tests
  testDisplays();
  
  // Show enhanced title screen
  showTitleScreen();
  
  // Setup optimized initial state
  clearAllDirtyTiles();
  initializeGrid();
  
  // Set initial brightness
  analogWrite(TFT_LED, BRIGHTNESS_LEVELS[tftBrightnessIndex]);
  analogWrite(LCD_BL, BRIGHTNESS_LEVELS[lcdBrightnessIndex]);
  
  // Display initial interface
  updateLCDDisplay();
  
  // Pre-clear buffers for optimal performance
  memset(frameBuffer, 0, sizeof(frameBuffer));
}

void loop() {
  static uint16_t generation = 0;
  static uint32_t lastUpdate = 0;
  static uint32_t lastConwayBoost = 0;
  static uint32_t lastLCDUpdate = 0;
  
  uint32_t currentTime = millis();
  
  // Handle input with priority
  handleInput();
  
  // Update auto mode if active
  updateAutoMode();
  
  // Update LCD display periodically when in auto mode to show current animation
  if (currentMode == MODE_AUTO && currentTime - lastLCDUpdate >= 1000) {
    updateLCDDisplay();
    lastLCDUpdate = currentTime;
  }
  
  // Get the current running mode
  VisualMode runningMode = getRunningMode();
  
  // Update simulation with optimized timing
  uint8_t updateInterval = (runningMode == MODE_CONWAY) ? 120 : 60; // Slower Conway for better visibility
  
  if (currentTime - lastUpdate >= updateInterval) {
    lastUpdate = currentTime;
    
    // Update based on current running mode
    switch (runningMode) {
      case MODE_CONWAY: 
        updateConway(); 
        // Periodic boost for Conway if it gets stagnant
        if (currentTime - lastConwayBoost > 5000) {
          if (fastRandom() % 10 == 0) {
            addConwayPatterns();
            lastConwayBoost = currentTime;
          }
        }
        break;
      case MODE_BRIGHT: updateBright(); break;
      case MODE_FIRE: updateFire(); break;
      case MODE_WATER: updateWater(); break;
      case MODE_WIND: updateWind(); break;
      case MODE_CYCLONE: updateCyclone(); break;
      case MODE_PORTAL: updatePortal(); break;
      case MODE_WORMHOLE: updateWormhole(); break;
    }
    
    updateDisplay(generation);
    generation++;
    
    // Reset generation counter periodically to prevent overflow
    if (generation > 30000) generation = 1000;
  }
  
  // Use remaining CPU cycles for optimizations
  if (currentTime % 100 == 0) {
    // Periodic memory optimization
    if (runningMode != MODE_CONWAY) {
      // Clear neighbor count buffer when not needed
      if (generation % 50 == 0) {
        memset(neighborCount, 0, sizeof(neighborCount));
      }
    }
  }
}

/**
 * ════════════════════════════════════════════════════════════════════════════════════════════════
 *                           PIXEL WASH v1.0.7 - PERFORMANCE OPTIMIZATION ANALYSIS
 *                              Technical Implementation & Engine Architecture
 * ════════════════════════════════════════════════════════════════════════════════════════════════
 * 
 * β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
 * β”‚                              CONSTEXPR OPTIMIZATION TECHNIQUES                              β”‚
 * β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
 *
 * The 'constexpr' keyword in C++ is a powerful tool used to declare that a variable, 
 * function, or constructor can be evaluated at compile-time. This allows the compiler to 
 * perform computations and determine values before the program ever runs. The result is a 
 * zero-cost abstraction, as the final calculated values are hardcoded directly into the 
 * executable.
 * Compile-time constant expressions have been extensively utilized to eliminate runtime 
 * computational overhead. Grid dimensions, cell sizes, and buffer calculations are resolved 
 * during compilation rather than execution, resulting in zero-cost abstractions.
 * 
 * Key constexpr implementations:
 * β€’ GRID_WIDTH, GRID_HEIGHT, CELL_SIZE - Core dimensional constants computed at compile-time
 * β€’ TILES_X, TILES_Y, TOTAL_TILES - Dirty rendering tile calculations pre-computed
 * β€’ GRID_BYTES, DIRTY_BYTES - Memory allocation sizes determined during compilation
 * β€’ Display buffer sizes and indexing calculations - Eliminated from runtime execution
 * 
 * This approach ensures that complex mathematical operations involving grid 
 * calculations, memory addressing, and buffer management consume zero CPU cycles during 
 * execution. Memory layout optimizations are hardcoded into the compiled binary, allowing 
 * the MEGA 2560's limited processing power to focus entirely on animation logic.
 * 
 * β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
 * β”‚                                 DIRTY RENDERING SYSTEM                                      β”‚
 * β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
 *
 * Dirty rendering is an optimization technique used to significantly improve graphics performance
 * by updating only the parts of a display that have changed, rather than redrawing the entire
 * screen for every frame. In contrast, "clean rendering" (or full-frame rendering) involves
 * clearing the entire screen and drawing every single pixel from scratch, regardless of whether
 * it has changed or not. This is a simple but highly inefficient method, especially for displays
 * with slow update speeds or limited bandwidth, like the SPI-connected TFT on the Arduino Mega.
 *
 * This technique is most effective when the amount of change between consecutive frames is small.
 * It should be used in applications where only small parts of the screen need to be animated or
 * updated, such as user interfaces, game sprites, or graphical effects with limited movement.
 * For scenes where the entire screen changes dramatically (e.g., a camera pan in a 3D game),
 * dirty rendering may offer little to no benefit over full-frame rendering.
 *
 * A sophisticated tile-based dirty rendering engine has been implemented to minimize
 * unnecessary display updates. The screen is divided into 4x4 pixel tiles, with each
 * tile tracked via a compact bitmap system.
 *
 * Implementation Architecture:
 * β€’ Screen partitioned into 8x10 tiles (32x40 grid Γ· 4x4 tiles).
 * β€’ Dirty tile bitmap requires only 80 bits (10 bytes) for full screen tracking.
 * β€’ Bit manipulation macros (SET_BIT, GET_BIT, CLR_BIT) provide O(1) tile operations.
 * β€’ Change detection compares current frame against previous frame state.
 *
 * Rendering Pipeline:
 * 1. Grid state differences are detected through byte-wise comparison.
 * 2. Changed cells trigger marking of their containing tiles as dirty.
 * 3. Only dirty tiles undergo expensive SPI transfer operations.
 * 4. Clean tiles remain unchanged, preserving previous frame content.
 * 5. Tile dirty flags are cleared after successful rendering.
 *
 * Performance Impact:
 * β€’ Typical frame updates require rendering only 10-30% of total tiles.
 * β€’ SPI bandwidth utilization reduced by 70-90% compared to full-frame updates.
 * β€’ Frame rates improved from ~8fps to 30-60fps through selective rendering.
 * β€’ Memory bandwidth conservation extends to intensity grid updates.
 * 
 * β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
 * β”‚                               CLEAN RENDERING OUTCOMES                                      β”‚
 * β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
 * 
 * Despite utilizing dirty rendering techniques, visual output maintains pristine quality 
 * through careful state management and coherent frame buffering strategies.
 * 
 * State Coherency Mechanisms:
 * β€’ Triple buffer system (currentGrid, nextGrid, lastGrid) ensures atomic state transitions
 * β€’ Intensity grids maintain floating-point precision for smooth color interpolation
 * β€’ Frame buffer coherency verified through bit-accurate comparison operations
 * β€’ Color palette interpolation performed via PROGMEM lookup tables
 * 
 * Visual Consistency Assurance:
 * β€’ Complete tile rendering prevents partial update artifacts
 * β€’ Color transitions maintain mathematical precision across tile boundaries
 * β€’ Animation phases remain synchronized regardless of selective rendering
 * β€’ Edge cases handled through boundary condition validation
 * 
 * The dirty rendering system produces visually identical output to full-frame rendering 
 * while achieving substantial performance improvements. Frame coherency is maintained 
 * through careful ordering of update operations and atomic tile replacement.
 * 
 * β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
 * β”‚                              ANIMATION ALGORITHM ANALYSIS                                   β”‚
 * β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
 * 
 * Multiple sophisticated animation algorithms have been implemented, each optimized for 
 * the MEGA 2560's computational constraints while delivering compelling visual effects.
 * 
 * CONWAY'S GAME OF LIFE ENGINE (CELLULAR AUTOMATA):
 * β€’ Neighbor counting pre-computation eliminates redundant calculations
 * β€’ Bit-packed grid storage reduces memory footprint by 87.5% (1 bit per cell vs 8 bits)
 * β€’ Stagnation detection triggers automatic pattern injection to maintain activity
 * β€’ Toroidal topology implemented through modular arithmetic for edge wrapping
 * β€’ Pattern library includes Gliders, Oscillators, and Still Life configurations.
 * 
 * Cellular Automaton Rules:
 *   For each cell (x,y): N = Ξ£(neighbors in 8-connected grid)
 *   Evolution rules: 
 *     - Survival: alive(t) ∧ (N = 2 ∨ N = 3) β†’ alive(t+1)
 *     - Birth: dead(t) ∧ N = 3 β†’ alive(t+1)  
 *     - Death: alive(t) ∧ (N < 2 ∨ N > 3) β†’ dead(t+1)
 *   Optimization: Pre-computed neighbor counting eliminates redundant calculations
 *   Stagnation detection: If liveCells(t) = liveCells(t-10), inject new patterns
 *   Pattern library: Gliders, Oscillators, Still Life configurations with known behaviors
 * 
 * FLUID DYNAMICS SIMULATIONS:
 * 
 * Fire Algorithm - Heat Diffusion Model:
 *   I(x,y,t+1) = 0.8Γ—I(x,y+1,t) + 0.3Γ—[I(x-1,y,t) + I(x+1,y,t)] / 1.4 - R(0,15)
 *   Where: I = intensity, t = time step, R = random cooling factor
 *   Base condition: I(x,HEIGHT-1,t) = 255 (random probability = 1/3)
 *   Physical model: Upward heat convection with lateral diffusion and stochastic cooling
 * 
 * Water Algorithm - Wave Interference:
 *   I(x,y,t) = 127Γ—sin(0.3Γ—(x + y + t)) + 128 + R(-15,15)
 *   State: alive if I(x,y,t) > 100
 *   Mathematical basis: Single-frequency sine wave with spatial-temporal coupling
 *   Creates interference patterns through phase relationships across grid coordinates
 * 
 * Wind Algorithm - Directional Flow Field:
 *   Pattern(x,y,t) = ((x + t) mod 8) + ((y + t/2) mod 4) Γ— 2
 *   I(x,y,t) = Pattern Γ— 32 + R(0,50)
 *   Generates periodic flow patterns with horizontal drift and vertical stratification
 * 
 * Cyclone Algorithm - Polar Coordinate Dynamics:
 *   dx = x - centerX, dy = y - centerY
 *   r = √(dx² + dy²), θ = atan2(dy,dx)
 *   ΞΈ_modified = ΞΈ + 0.1Γ—t + rΓ—0.2 (spiral effect)
 *   I(x,y,t) = (127Γ—sin(ΞΈ_modified) + 128) Γ— (1 - r/25) (radial falloff)
 *   Creates logarithmic spiral with distance-based intensity attenuation
 * 
 * GEOMETRIC PATTERN GENERATORS:
 * 
 * Portal Algorithm - Dynamic Ring Expansion/Contraction:
 *   radius(t) = expanding ? (5 β†’ 20) : (20 β†’ 3)
 *   For each (x,y): distance = √((x-centerX)² + (y-centerY)²)
 *   Ring condition: |distance - radius| ≀ ringThickness
 *   I(x,y,t) = 255 - (|distance - radius|Γ—100/thickness) Γ— shimmer
 *   shimmer = sin(6Γ—atan2(y-centerY, x-centerX) + 0.008Γ—millis())Γ—0.4 + 0.8
 *   Background erasure during contraction phase ensures clean visual transitions
 * 
 * Wormhole Algorithm - Infinite Tunnel Perspective:
 *   For each (x,y): distance = √((x-centerX)² + (y-centerY)²)
 *   depth = tunnelDepth + 20/(distance + 1) (perspective mapping)
 *   angle = atan2(y-centerY, x-centerX)
 *   spiral_angle = angle + 0.04Γ—t + depthΓ—0.3 (rotating spiral)
 *   spiral_pattern = sin(6Γ—spiral_angle) (6-armed spiral)
 *   ring_pattern = sin(0.8Γ—depth) (depth-based rings)
 *   I(x,y,t) = (0.7Γ—spiral_pattern + 0.3Γ—ring_pattern) Γ— falloff
 *   falloff = max(0, 1 - (distance-5)/(maxDistance-5)) (corner coverage)
 *   tunnelDepth += 0.15 (continuous expansion, never resets)
 * 
 * Bright Algorithm - Multi-Harmonic Wave Interference:
 *   wave1 = 127Γ—sin(0.1Γ—t) + 128 (primary temporal oscillation)
 *   wave2 = 127Γ—cos(0.07Γ—t + Ο€/2) + 128 (secondary phase-shifted wave)
 *   For each (x,y): distance = √((x-centerX)² + (y-centerY)²)
 *   radial_wave = 127Γ—sin(0.5Γ—distance + 0.2Γ—t) + 128 (radial propagation)
 *   flow_pattern = 64Γ—sin(0.3Γ—(x + y + t)) + 64 (diagonal flow)
 *   I(x,y,t) = (0.3Γ—wave1 + 0.3Γ—wave2 + 0.4Γ—radial_wave + flow_pattern) / 2
 *   Sparkle enhancement: I = 255 (random probability = 1/20, every 3rd frame)
 *   Thresholding: I = 0 if I < 30 (noise elimination)
 * 
 * Mathematical Foundations:
 * β€’ Distance calculations utilize integer square root: r β‰ˆ √(dxΒ² + dyΒ²)
 * β€’ Color interpolation via fixed-point arithmetic: color = (intensity Γ— paletteSize) / 255
 * β€’ Temporal phases managed through modular increment: phase = (phase + 1) % period
 * β€’ Bit manipulation for state storage: 1 bit per Conway cell, 8 bits per intensity value
 * β€’ Polar coordinate conversion: ΞΈ = atan2(dy, dx), optimized for embedded processors
 * 
 * β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
 * β”‚                            ENGINE BUFFER DESIGN & MANAGEMENT                                β”‚
 * β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
 * 
 * A sophisticated memory management system has been architected to maximize utilization 
 * of the MEGA 2560's 8KB SRAM while maintaining optimal performance characteristics.
 * 
 * Primary Buffer Architecture:
 * β€’ currentGrid[160]: Bit-packed cellular automata state (1280 cells Γ· 8 bits/byte)
 * β€’ nextGrid[160]: Double-buffered next generation calculation workspace
 * β€’ lastGrid[160]: Previous frame state for dirty tile change detection
 * β€’ intensityGrid[1280]: Per-cell intensity values (0-255) for complex visual modes
 * β€’ neighborCount[1280]: Conway's Game of Life neighbor counting optimization buffer
 * 
 * Auxiliary Buffer Systems:
 * β€’ dirtyTiles[20]: Compressed bitmap tracking system for selective rendering
 * β€’ frameBuffer[32]: Small tile rendering workspace (4x4 pixels Γ— 2 bytes/pixel)
 * β€’ Color palettes stored in PROGMEM flash memory to preserve SRAM
 * 
 * Memory Layout Optimization:
 * β€’ Buffer alignment optimized for AVR word boundaries
 * β€’ Stack usage minimized through careful function parameter design
 * β€’ Dynamic allocation completely avoided to prevent heap fragmentation
 * β€’ Critical buffers placed in sequential memory regions for cache efficiency
 * 
 * Buffer Management Strategies:
 * β€’ Atomic buffer swapping prevents race conditions during state transitions
 * β€’ Selective buffer clearing based on active rendering mode requirements
 * β€’ Memory reuse patterns implemented for non-concurrent buffer usage
 * β€’ Overflow protection through compile-time boundary validation
 * 
 * Performance Characteristics:
 * β€’ Memory bandwidth utilization: ~40% of available SRAM (3274/8192 bytes)
 * β€’ Buffer copy operations optimized through memcpy() and direct memory manipulation
 * β€’ Cache-friendly access patterns through sequential memory traversal
 * β€’ Zero dynamic allocation eliminates garbage collection overhead
 * 
 * β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
 * β”‚                              HARDWARE ACCELERATION TECHNIQUES                               β”‚
 * β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
 * 
 * Hardware-specific optimizations have been implemented to extract maximum performance 
 * from the MEGA 2560's architecture and peripheral systems.
 * 
 * SPI Interface Optimization:
 * β€’ Hardware SPI utilized with maximum stable clock frequency (8MHz @ DIV2)
 * β€’ DMA-style bulk transfer operations minimize CPU intervention
 * β€’ SPI configuration optimized for ST7735 display characteristics
 * β€’ Byte ordering and timing parameters tuned for minimal latency
 * 
 * ADC and Input Processing:
 * β€’ Dual-speed button detection system (fast interrupt + reliable debouncing)
 * β€’ Analog-to-digital conversion optimized for keypad resistance ladder
 * β€’ Input staging system prevents settings corruption during rapid button presses
 * β€’ Adaptive timing algorithms adjust responsiveness based on system load
 * 
 * PWM Brightness Control:
 * β€’ Hardware PWM channels utilized for flicker-free brightness adjustment
 * β€’ 8-level brightness curves optimized for human perception characteristics
 * β€’ Independent TFT and LCD brightness control with staged preview system
 * 
 * β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
 * β”‚                            ALGORITHMIC COMPLEXITY ANALYSIS                                  β”‚
 * β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
 * 
 * Computational complexity has been carefully analyzed and optimized across all system 
 * components to ensure real-time performance on the MEGA 2560's 16MHz processor.
 * 
 * Time Complexity Characteristics:
 * β€’ Grid iteration: O(n) where n = GRID_WIDTH Γ— GRID_HEIGHT
 * β€’ Dirty tile detection: O(t) where t = total tiles (typically t << n)
 * β€’ Conway neighbor counting: O(n) with 8Γ— constant factor optimization
 * β€’ Color palette lookup: O(1) through PROGMEM indexing
 * β€’ SPI rendering: O(d) where d = dirty tiles (typically 10-30% of total)
 * 
 * Space Complexity Management:
 * β€’ Grid storage: O(n/8) through bit-packing optimization
 * β€’ Intensity storage: O(n) for complex visual modes
 * β€’ Tile tracking: O(t/8) through compressed bitmap representation
 * β€’ Animation state: O(1) through stateless algorithm design
 * 
 * Real-time Performance Guarantees:
 * β€’ Frame generation: <16ms worst-case (60fps capability)
 * β€’ Input response: <50ms typical, <25ms for brightness adjustment
 * β€’ Mode transitions: <100ms including full grid initialization
 * β€’ Memory operations: <1ms for complete buffer management cycles
 * 
 * β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
 * β”‚                           POTENTIAL IMPROVEMENTS & CONTRIBUTIONS                            β”‚
 * β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
 * 
 * Future Development Opportunities:
 * 
 * PERFORMANCE ENHANCEMENTS:
 * β€’ Assembly language optimization for critical loops (Conway neighbor counting)
 * β€’ Interrupt-driven SPI transfers to overlap computation with display updates
 * β€’ Adaptive grid resolution based on real-time performance monitoring
 * β€’ Hardware timer integration for more precise animation timing
 * β€’ PROGMEM optimization for larger pattern libraries and color palettes
 * 
 * ALGORITHM EXPANSIONS:
 * β€’ Additional cellular automata rules (Langton's Ant, Brian's Brain, etc.)
 * β€’ Particle system implementations with collision detection
 * β€’ Procedural maze generation and solving algorithms
 * β€’ Fractal pattern generators (Mandelbrot, Julia sets with integer arithmetic)
 * β€’ Physics-based simulations (gravity, springs, electrical fields)
 * 
 * HARDWARE INTEGRATION:
 * β€’ External SRAM expansion for larger grid resolutions
 * β€’ SD card integration for pattern storage and retrieval
 * β€’ Real-time clock integration for time-based animations
 * β€’ Temperature sensor input for environmental responsiveness
 * β€’ Audio output synchronization with visual patterns
 * 
 * ARCHITECTURAL IMPROVEMENTS:
 * β€’ Multi-threaded rendering pipeline using timer interrupts
 * β€’ Hierarchical dirty rendering (macro-tiles containing micro-tiles)
 * β€’ Compressed pattern storage and decompression algorithms
 * β€’ Dynamic memory allocation with garbage collection
 * β€’ Network connectivity for pattern sharing and remote control
 * 
 * USER INTERFACE ENHANCEMENTS:
 * β€’ Pattern editor mode with cursor control
 * β€’ Preset pattern library with quick selection
 * β€’ Real-time parameter adjustment (speed, color, intensity)
 * β€’ Save/load functionality for custom configurations
 * β€’ Advanced color mixing and palette customization
 * 
 * OPTIMIZATION RESEARCH AREAS:
 * β€’ SIMD-style operations using AVR assembly
 * β€’ Cache-oblivious algorithms for memory access optimization
 * β€’ Fixed-point arithmetic implementations for mathematical operations
 * β€’ Bit-parallel algorithms for cellular automata evolution
 * β€’ Compression algorithms for pattern storage and transmission
 * 
 * CONTRIBUTION GUIDELINES:
 * β€’ Maintain compatibility with Arduino IDE and standard libraries
 * β€’ Preserve existing API structure for user configuration
 * β€’ Document performance impact of new features
 * β€’ Include test patterns and validation procedures
 * β€’ Follow established coding style and memory management practices
 * β€’ Ensure real-time performance constraints are maintained
 * 
 * TESTING AND VALIDATION:
 * β€’ Automated test suites for algorithm correctness
 * β€’ Performance benchmarking frameworks
 * β€’ Memory usage profiling and optimization
 * β€’ Hardware compatibility testing across Arduino variants
 * β€’ Visual regression testing for rendering accuracy
 * 
 * ════════════════════════════════════════════════════════════════════════════════════════════════
 *                                    TECHNICAL ACKNOWLEDGMENTS
 * ════════════════════════════════════════════════════════════════════════════════════════════════
 * 
 * The implementation techniques demonstrated in this codebase represent a synthesized assembly of 
 * established computer graphics, embedded systems programming, and algorithmic optimization 
 * methodologies. Performance characteristics have been validated through extensive testing 
 * on genuine Atmega2560 hardware under various operational conditions.
 * 
 * Special recognition is extended to the broader embedded systems community for 
 * foundational optimization techniques, the cellular automata research community for 
 * algorithmic insights, and the Arduino ecosystem contributors for comprehensive 
 * hardware abstraction layers.
 * 
 * For technical inquiries, performance analysis, or contribution discussions:
 * Core1D Automation Labs - Homebrew Embedded Systems Research for Fundamentals and Open Education
 * 
 * ════════════════════════════════════════════════════════════════════════════════════════════════
 */

1 Like

Although not an issue here, careful with maths on uint8_t

All the inputs are uint8_t but you benefit from the C++ rule that says maths are not conducted with bytes and the intermediate math is done in int (int16_t on MEGA) so it won't overflow.

May be you know that but some people don't. ➜ suggest you promote the parameters to uint16_t and then the math is done using uint16_t directly.

β€”β€”β€”

Regarding these macros

they would be better off as inline functions too, that will remove possible performance or side effects of double evaluation of a parameter (say you call GET_BIT(grid, coordToBit(x, y)) then coordToBit(x, y) gets evaluated twice and if you call SET_BIT(grid, bit++) you have an unwanted side effect of double incrementing bit).

1 Like