Choosing (and learning) new hardware

I've become increasingly successful at basic Arduino functions and at writing code in general. I recently (after much failure, but much learning :D) finished my first decent-sized project. Several features throughout my apartment are fully controlled by an Arduino Mega (ATmega1280). I have a keypad and RGB LCD interface to reset anything. I was curious about how to "shrinkify" my project (and, more importantly, future projects) with smaller hardware such as (http://www.digikey.com/product-detail/en/ATMEGA48-20PU/ATMEGA48-20PU-ND/739777?WT.mc_id=PLA_739777&gclid=CNC3zPHtobYCFaZFMgodEEkAyw). The Arduino is costly and good for learning, but not practical compared with smaller components that could get the job done for less.

The main question is: what are the limitations of such hardware? My code is about 20K in size and, like I said, runs an LCD and a keypad. I've started looking up starter tutorials, but have found nothing yet about sizing components and the like. More questions as they arise.

Thanks for the continuing help and support.

Make a list of all your requirements:
Flash
SRAM
IO
power - onboard regulator or wallwart
connector types

A minimal Arduino or equivalent with just the chip, crystal, caps, reset resistor, and connectors might do all you need - like a Pro Mini.

The code may be inefficient, but I can make things work effectively, and I regularly use functions to cut down on repetition. I'm not by any means advanced, though. The code currently stands at 20K in size. I currently use 18 IO: one to a PowerSwitch Tail, 8 to a universal keypad, and 9 to an RGB LCD. I've heard of methods of running an LCD on fewer pins, but I'm still looking into that. SRAM is the one thing I am unsure of and still researching; I don't yet know how much is needed to run what. The power supply is 5 V, and I am able to supply that.

EDIT: To clarify the LCD, 4 pins for data, 3 pins for the RGB backlight, 1 for RS, and 1 for enable. Also, SRAM wouldn't likely be a problem. 2k SRAM is more than enough.
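To make the sizing question concrete, here is a minimal sketch of a requirements check in plain C++. The project figures come from the posts above; the SRAM need is taken as the 2K upper bound mentioned in the edit, since actual usage is not stated. The chip figures are assumptions typed in from the ATmega48 and ATmega328P datasheets (20 usable IO assumes the crystal and reset pins are taken), so verify them before relying on the result. The names `Resources` and `fits` are made up for illustration.

```cpp
#include <cstdint>

// Resources a project needs, or that a chip offers.
struct Resources {
    uint32_t flashBytes;
    uint32_t sramBytes;
    uint8_t  ioPins;
};

// Pin budget from the thread:
// 1 (PowerSwitch Tail) + 8 (keypad) + 9 (LCD: 4 data, 3 backlight, RS, enable).
constexpr uint8_t kPinsNeeded = 1 + 8 + (4 + 3 + 1 + 1);

// Project figures quoted in the thread. SRAM is an upper bound, not a measurement.
constexpr Resources kProject = {20u * 1024u, 2u * 1024u, kPinsNeeded};

// ASSUMPTION: datasheet figures, check them against the current datasheets.
// 23 IO lines minus 2 crystal pins and the reset pin leaves 20 usable.
constexpr Resources kATmega48   = {4u * 1024u, 512u, 20};
constexpr Resources kATmega328P = {32u * 1024u, 2u * 1024u, 20};

// True when the chip offers at least what the project needs in every category.
constexpr bool fits(const Resources& need, const Resources& chip) {
    return need.flashBytes <= chip.flashBytes &&
           need.sramBytes  <= chip.sramBytes  &&
           need.ioPins     <= chip.ioPins;
}
```

With these numbers, `fits(kProject, kATmega48)` is false (20K of code does not fit in 4K of flash), while `fits(kProject, kATmega328P)` is true. A bootloader, if used, takes some flash as well, which this sketch ignores.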

Would it be possible to break the code into smaller chunks and load it across different microcontrollers?

It's not clear what you want to do. You seem to have sufficient IO for the peripherals you are using, and sufficient flash space.
Do you want help making the code smaller? Then you need to post it. Do you want to find a way to use less IO? Shift registers are one way: it takes 3 pins to do what you have 4 doing now for the LCD data. You could likely do the same for the keypad, using a shift-out to drive one column low at a time, which again saves only 1 pin.
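The shift register idea can be sketched roughly as follows. This is a host-side C++ model, not the poster's code: `ShiftRegister` is a hypothetical software stand-in for a 74HC595-style serial-in, parallel-out part, and `writeByte` and `columnPattern` are illustrative names. It assumes four keypad columns on the low four outputs.

```cpp
#include <cstdint>

// Software model of a 74HC595-style serial-in, parallel-out shift register.
struct ShiftRegister {
    uint8_t shift = 0;    // internal shift stage
    uint8_t outputs = 0;  // latched output pins Q0..Q7

    // One rising clock edge: shift left and take the data pin into bit 0.
    void clockIn(bool dataBit) {
        shift = static_cast<uint8_t>((shift << 1) | (dataBit ? 1 : 0));
    }

    // Rising latch edge: copy the shift stage to the output pins.
    void latch() { outputs = shift; }
};

// Send one byte MSB first using three signals: data, clock, latch.
void writeByte(ShiftRegister& sr, uint8_t value) {
    for (int bit = 7; bit >= 0; --bit) {
        sr.clockIn(((value >> bit) & 1) != 0);
    }
    sr.latch();
}

// Keypad scan pattern: all four column lines high except the selected one.
uint8_t columnPattern(uint8_t column) {
    return static_cast<uint8_t>(~(1u << column) & 0x0Fu);
}
```

On real hardware, the Arduino call `shiftOut(dataPin, clockPin, MSBFIRST, value)` followed by a pulse on the latch pin plays the role of `writeByte`. For the keypad, you would write `columnPattern(0)` through `columnPattern(3)` in turn and read the row pins after each write.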