Hardware interrupt counter, measuring two independent signals

Hi to all!
Excuse me if I put this topic in the wrong place.
Anyway, I am a novice at programming the ATmega. I picked the ATmega328P for a simple reason: it can still be obtained in a DIL package, which is easy and "human" to handle in the various DIY projects I might be doing. SMT is much harder for me, for lack of tools and lack of sight.
I have a particular problem on which I could use some help from more conversant people here.
I am trying to count two independent signals with the ATmega328P, while sparing enough cycles for the main loop to add more functionality, like handling the LCD, keyboard, SD card module, BT module, audio output etc.
Only the counting is time critical; the rest is not. So I found one method to do this and adapted it for the present setup.
I am using pins D2 and D3 for this purpose.
Two interrupts.
In loop() I placed all the rest of the code, which is not time critical. So it works almost all right. I will emphasize the word "almost". It counts the two signals up to ~60 kHz on both pins. Then the ATmega slows down, chokes and starts behaving erratically. An obvious overflow. To keep the loop behaving well, I added two external /32 dividers (to preserve as many resources as possible) using CD4040s. So now f/32 arrives at D2 and D3. It appears all right now. The original values are restored in code by multiplying by 32. OK.
So, I would like to evaluate any other options I might have. Is there any way to measure two independent signals, fed to two pins, up to 300 kHz on both, while avoiding any kind of "downsampling", hardware prescaling etc.? The important thing is to give the measurements the highest priority. Everything else in the loop can be adjusted accordingly (LCD, 5-key keyboard, SD card, BT and audio on one pin).
BTW the signals are robust 5 V rectangular waves. No need for additional shaping or buffering.

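// Shared between the ISRs and loop(), hence volatile
volatile unsigned long currentMicros_pin2, previousMicros_pin2, duration_pin2;
volatile unsigned long currentMicros_pin3, previousMicros_pin3, duration_pin3;

void setup() {
    attachInterrupt(0, handler_INT0_pin_D2, RISING); // or FALLING
    attachInterrupt(1, handler_INT1_pin_D3, RISING); // or FALLING
}

// Runs on every edge at D2: accumulates the time between edges
void handler_INT0_pin_D2() {
    currentMicros_pin2 = micros();
    duration_pin2 += currentMicros_pin2 - previousMicros_pin2;
    previousMicros_pin2 = currentMicros_pin2;
}

// Runs on every edge at D3: accumulates the time between edges
void handler_INT1_pin_D3() {
    currentMicros_pin3 = micros();
    duration_pin3 += currentMicros_pin3 - previousMicros_pin3;
    previousMicros_pin3 = currentMicros_pin3;
}

void loop() {
    // non- or less time-critical code here
    // ...
    // ...
    // ... end of the loop
}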

What exactly are you doing with this "counting"? Do you have to count ALL the pulses continuously? Or, is it OK to just sample the pulse stream for a certain period and then calculate frequency or whatever you're doing? Then go back to sampling.

If the latter, I'd set up the input signal as the clock source for Timer 1 with interrupt on overflow. Take a time hack with micros() then start the timer at 0. When the overflow interrupt occurs get the micros() time again. Now you have a pulse count (65536) and the time it took to count that many. Then start the process over again.

Since you want to measure 2 signals, use an external hardware selector to send them to the input pin one at a time. Use an output pin from the processor to control the selector.

I am measuring the frequency outputs of two fluxgate sensors, which vary with variations in the Earth's magnetic field. Obviously I have to measure them all the time, then "decide" in code when changes occur.

"...use an external hardware selector to send them to the input pin one at a time..."

Yup! That seems to be a good alternative; I thought about that.
But I wanted to hear second opinions on this from people more conversant than me, since I am a real novice in these things.
I was hoping that there are ways to do this within the ATmega code, without additional hardware.

The device is finished, operational, kicking. I don't have many complaints.
But I want to go a step further and get rid of the f/32 division and those two CD4040s.
In other words, I want to recover the original resolution and slightly better accuracy.
Main question: is the ATmega328P capable of such a task?

I already told you how. When driven by your 300 kHz signal, Timer 1 (16 bits) will overflow and generate an interrupt every 218 ms or so. Then, use the exact time difference from the two micros() reads to compute the frequency. Then, switch the external selector to have the other signal drive Timer 1.

Rinse and repeat.

With 218 ms (minimum) between interrupts, the ATmega won't have any problems handling other tasks in loop(). That's the benefit of pushing the counting task into a hardware counter.