When performing a "read-modify-write" operation, you must make it atomic if the same register might ever be modified from an interrupt context. Failing to do so leads to infrequent but extremely hard-to-diagnose problems. For example:
http://code.google.com/p/arduino/issues/detail?id=146
http://forums.adafruit.com/viewtopic.php?f=31&t=7594
I looked briefly at your pin.cpp. My first impression is that it has the same problem as issue 146.
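To make the atomicity point concrete, here is a minimal sketch of a guarded read-modify-write. The names IrqGuard, setBitsAtomic and clearBitsAtomic are made up for illustration and are not taken from pin.cpp or from Arduino. It assumes an 8-bit AVR, where SREG holds the global interrupt flag; the non-AVR branch is only a do-nothing stand-in so the code can be compiled on a PC.

```cpp
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#include <avr/interrupt.h>
// Saves SREG, disables interrupts, and restores SREG on scope exit,
// so interrupts are only re-enabled if they were enabled before.
struct IrqGuard {
    uint8_t saved;
    IrqGuard() : saved(SREG) { cli(); }
    ~IrqGuard() { SREG = saved; }
};
#else
// Host stand-in (assumption: no interrupts to worry about off-target).
struct IrqGuard {
    IrqGuard() {}
    ~IrqGuard() {}
};
#endif

// Set bits in a register whose address is only known at run time.
// Without the guard, an interrupt that writes the same register between
// the load and the store would have its change silently overwritten.
inline void setBitsAtomic(volatile uint8_t *reg, uint8_t mask) {
    IrqGuard guard;   // interrupts off for the load, OR, store sequence
    *reg |= mask;
}                     // previous interrupt state restored here

// Clear bits with the same protection.
inline void clearBitsAtomic(volatile uint8_t *reg, uint8_t mask) {
    IrqGuard guard;
    *reg &= static_cast<uint8_t>(~mask);
}
```

On AVR a call would look like setBitsAtomic(&PORTB, _BV(5)). avr-libc also provides ATOMIC_BLOCK(ATOMIC_RESTORESTATE) in <util/atomic.h>, which does the same save/disable/restore.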
On the library design as a whole, I have 3 general comments. This may sound a bit harsh, but hopefully it helps...
First, passing register addresses as parameters to the constructors is going to lead to inefficient code. It might even end up slower than Arduino's functions. You should really consider using templates, especially since your API already exposes C++ syntax. Oleg's USB Host Shield library has a hardware abstraction layer that would be one good example. With proper use of templates, it's possible to get the compiler to reduce each pin operation to a single instruction (which is automatically atomic), and similarly impressive gains could be made with the timers and other peripherals.
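A rough sketch of the template idea, to show the shape of it. This is not Oleg's actual code; PortB, FastPin and Led are hypothetical names. The port and bit number are compile-time constants, so no register address is stored or passed around at run time. The non-AVR branch is a stand-in so the code can be compiled and tested on a PC.

```cpp
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
// Hypothetical port tag wrapping one real AVR output register.
struct PortB {
    static volatile uint8_t &out() { return PORTB; }
};
#else
// Host stand-in: a fake register so the sketch runs off-target.
struct PortB {
    static volatile uint8_t &out() {
        static volatile uint8_t fake = 0;
        return fake;
    }
};
#endif

// Port and Bit are template parameters, so the compiler knows the
// register address and the mask while compiling each call.
template <typename Port, uint8_t Bit>
struct FastPin {
    static_assert(Bit < 8, "an 8-bit port has bits 0..7");

    static void high() { Port::out() |= static_cast<uint8_t>(1u << Bit); }
    static void low()  { Port::out() &= static_cast<uint8_t>(~(1u << Bit)); }
    static bool isHigh() { return (Port::out() & (1u << Bit)) != 0; }
};

// PB5 is the on-board LED pin on an Arduino Uno.
typedef FastPin<PortB, 5> Led;
```

With optimization enabled and a port in the low I/O address range, avr-gcc will normally turn Led::high() into a single sbi instruction, but check the disassembly rather than taking that on faith.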
Second, passing numerous related registers to a constructor seems unnecessarily complex. You may feel it's hardware agnostic to do so, but it's not. The register set itself is specific to 8-bit AVR, not to mention the implementation within the class. Even 8-bit AVR XMEGA has a different (larger) set of registers, as does 32-bit AVR. By publishing an API consisting of hardware-specific registers tied to exactly one platform, you'll never be (easily) portable to any other hardware platform. It's also error-prone to require more parameters than necessary, especially multiple consecutive parameters of the same type. If the types differed, getting them mixed up would at least produce a compile error instead of wrong runtime behavior.
Third, why go to so much trouble to build classes around hardware access, but tie the API so closely to exactly one architecture? Especially if your implementation doesn't strive for good performance, why would anyone want to use it? What's the advantage?
If your goal is a clean API, I think you'd do much better to get a ChipKit or Maple board and think carefully about how your API can work across dramatically different hardware.
If your goal is performance, at the very least look at Oleg's code and, as you develop yours, use avr-objdump to disassemble the generated code and make sure the compiler really is doing the optimization at compile time.
If you're really good, you might try for both some degree of cross-platform support (or at least an API capable of it) and good performance. But that's very hard...