Glad you got it working. My design used a PNP transistor (BC327, emitter connected to battery, collector to Arduino). Yours uses an NPN (collector to battery, emitter to Arduino). That's why you had to swap the inputs. Mine drops about 0.1 V across the transistor because the PNP saturates; yours drops about 0.7 V or a bit more, because with the load on the emitter you always lose the base-emitter voltage. So mine will work down to a lower battery voltage, but that probably doesn't matter.
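To put rough numbers on the difference, here is a minimal sketch of the arithmetic. The two drops are the approximate figures quoted above; the 3.0 V minimum supply is purely an assumed example value, not something from your circuit, so substitute whatever your board actually needs.

```python
def min_battery_voltage(v_load_min, v_drop):
    """Lowest battery voltage that still leaves v_load_min after the switch."""
    return v_load_min + v_drop


# Approximate drops quoted in the post (real values vary with load current).
PNP_DROP = 0.1  # saturated PNP high-side switch, e.g. BC327
NPN_DROP = 0.7  # NPN with the load on the emitter

# ASSUMPTION: hypothetical minimum supply voltage the Arduino needs.
V_ARDUINO_MIN = 3.0

if __name__ == "__main__":
    for name, drop in (("PNP", PNP_DROP), ("NPN", NPN_DROP)):
        v = min_battery_voltage(V_ARDUINO_MIN, drop)
        print(f"{name}: battery must stay above about {v:.1f} V")
```

With those assumed numbers the NPN version needs roughly 0.6 V more battery voltage than the PNP version, which only matters if you plan to run the battery right down.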
If the power-hungry devices are switched using N-channel MOSFETs or NPN transistors, you may not need to switch the supply to them, because when the Arduino is powered down its outputs no longer drive the gates or bases, so the transistors/MOSFETs will be turned off anyway.