Advice and contributions wanted: hardware-dependent code for a 4 MHz PWM signal

I am writing a library for an external IC that needs a 4 MHz clock signal to work properly. This signal could be sourced from an external hardware oscillator or from an MCU output pin.

I prefer the idea of the MCU generating the clock, as this reduces the external hardware requirements. However, it also means that the library needs hardware-dependent code - the clock signal would most likely need to be generated by low-level register bit settings. For example, on an ATmega328P (Arduino Uno), this is set up using fast PWM and counter resets, and it then runs in the background independently of the application code.
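To make the "fast PWM and counter resets" arithmetic concrete, here is a minimal sketch of how the two timer values can be worked out. The helper names are made up for this post and are not part of the library. It assumes non-inverting fast PWM where the counter runs from 0 to a TOP value, and that the requested frequency divides the timer clock evenly.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; they just do the arithmetic. */

/* TOP value (OCR2A on the Uno) so that the counter wraps at f_out.
   The counter runs 0..TOP, so one period is (TOP + 1) timer ticks.
   Assumes f_out <= f_cpu / prescaler. */
static uint32_t pwm_top(uint32_t f_cpu, uint32_t prescaler, uint32_t f_out)
{
  return f_cpu / (prescaler * f_out) - 1;
}

/* Compare value (OCR2B on the Uno) for a 50% duty cycle: in non-inverting
   fast PWM the pin is high for (compare + 1) ticks out of (TOP + 1). */
static uint32_t pwm_compare_50(uint32_t top)
{
  return (top + 1) / 2 - 1;
}

/* Frequency actually produced for a given TOP, to check for rounding. */
static uint32_t pwm_actual_hz(uint32_t f_cpu, uint32_t prescaler, uint32_t top)
{
  return f_cpu / (prescaler * (top + 1));
}
```

For a 16 MHz Uno with no prescaling, pwm_top(16000000, 1, 4000000) gives 3 and pwm_compare_50(3) gives 1, which are the OCR2A and OCR2B values used in my fragment. An 8 MHz board would need 1 and 0 instead, which is part of why the values cannot simply be hard-coded per chip family.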

However, as you can see from the code fragment below, supporting different processors has the potential to become a library maintenance problem. I also don’t have enough knowledge or experience to write the equivalent code for all the different types of processors.

My question is whether this is the best way to organise the code. I don’t normally write processor-dependent code, so I am looking for what is considered best practice.
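To make the question concrete, one alternative layout would be a single function per processor that reports whether it could start the clock, so that unsupported boards fall back to an external oscillator instead of failing to compile. This is only a sketch for discussion: startClockMCU() is a made-up name, and only the Uno branch is filled in, with the same register values as my fragment.

```c
#include <stdbool.h>

/* Hypothetical layout for discussion; startClockMCU() is a made-up name. */

#if defined(__AVR_ATmega328P__)
#include <avr/io.h>

/* Arduino Uno: 4 MHz on pin 3 (OC2B) using Timer2.
   The pin must also be configured as an output elsewhere. */
static bool startClockMCU(void)
{
  TCCR2A = (1 << COM2B1) | (1 << WGM21) | (1 << WGM20);
  TCCR2B = (1 << WGM22) | (1 << CS20);
  OCR2A = 3;
  OCR2B = 1;
  return true;
}

#else

/* No implementation for this processor yet: report that instead of
   breaking the build, so the IC can use an external oscillator. */
static bool startClockMCU(void)
{
  return false;
}

#endif
```

The application would call startClockMCU() once during setup and, if it returns false, rely on the external oscillator. Each contributed processor would then add one more #elif branch, or its own source file, without touching the others.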

I am also looking for contributions to this code base (i.e., snippets for other processors that generate this clock signal).

void startClock(void)
{
#if defined(__AVR_ATmega328P__)
  /*
  Using Timer2, creates a 4 MHz PWM signal with 50% duty cycle on pin 3 of the Arduino Uno.

  This tells the MCU to:
  - Enable Fast PWM mode with OCR2A as TOP (WGM22, WGM21, WGM20), so the
    counter TCNT2 starts over from 0 after it reaches OCR2A.
  - Don't scale the clock signal - keep it at 16 MHz (CS22, CS21, CS20).
  - When the counter TCNT2 equals OCR2B, set pin 3 to 0; when the counter
    TCNT2 returns to 0, set pin 3 to 1 (COM2B1, COM2B0).

  OCR2A = 3: The counter will start over after 3, so it counts 0, 1, 2, 3, 0, 1, etc.
  OCR2B = 1: Pin 3 goes low on the compare match at 1 and high again at 0.
  Since the pin makes a complete cycle every four clock ticks, the resulting
  PWM frequency is 4 MHz (16 MHz / 4).
  */
  #define CLOCK_PIN 3  // Pin 3 (OC2B) will be the 4 MHz signal.
  // CLOCK_PIN must also be configured as an output for the signal to appear.

  TCCR2A = (1 << COM2B1) | (0 << COM2B0) | (1 << WGM21) | (1 << WGM20);
  TCCR2B = (1 << WGM22) | (0 << CS22) | (0 << CS21) | (1 << CS20);
  OCR2A = 3;
  OCR2B = 1;

#define _STARTCLOCK_
#endif

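#if defined(__AVR_ATmega32U4__) && defined(CORE_TEENSY)
  // Teensy 2.0
#define CLOCK_PIN 14

  // Make pin 14 (OC1A) a 4 MHz signal using Timer1: fast PWM with OCR1A
  // as TOP, toggle OC1A on compare match, no prescaling.
  TCCR1A = 0x43;  // 01000011: COM1A0, WGM11, WGM10
  TCCR1B = 0x19;  // 00011001: WGM13, WGM12, CS10
  OCR1A = 1;      // counts 0, 1; pin toggles every 2 ticks: 16 MHz / 4 = 4 MHz

#define _STARTCLOCK_
#endif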

#if defined(__AVR_ATmega32U4__) && !defined(CORE_TEENSY)
  // Leonardo (Teensy 2.0 uses the same chip and is handled above)
#endif

#if defined(__AVR_ATmega168__)
  // Arduino Duemilanove, Diecimila, and NG
#endif

#if defined(__AVR_ATmega1280__)
  // Mega
#endif

#if defined(__AVR_ATmega2560__)
  // Mega 2560
#endif

#if defined(__SAM3X8E__)
  // DUE
#endif

#if defined(__AVR_AT90USB162__)
  // Teensy 1.0
#endif

#if defined(__MK20DX128__) || defined(__MK20DX256__)
  // Teensy 3.0 and 3.1
#endif

#if defined(__AVR_AT90USB646__) || defined(__AVR_AT90USB1286__)
  // Teensy++ 1.0 and 2.0
#endif
#ifndef _STARTCLOCK_