An Arduino is capable of detecting beats as groups of analog reads that average high, but it's not good for audio-frequency in or out.
Human eyes only see so fast (a 5 ms bright flash will persist long enough to be seen, but a 5 ms OFF is undetectable); 24 FPS is seen as motion, 30 FPS is TV, 60 FPS is better, but figure that 50 FPS is good.
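Here is one rough way to read "groups of average high analog reads", as a minimal sketch and not the only way to do it. The names BeatDetector, kWindow and kThreshold are made up for illustration, and the group size and threshold are guesses you would tune for your own signal. On a board, the readings would come from analogRead().

```cpp
#include <cstdint>
#include <cstddef>

constexpr size_t kWindow = 8;          // reads per group (assumed value)
constexpr uint16_t kThreshold = 600;   // "high" level on the 0..1023 scale (assumed value)

// Hypothetical helper: averages readings in fixed groups and flags the high ones.
struct BeatDetector {
    uint32_t sum = 0;   // running total for the current group
    size_t count = 0;   // readings collected so far in this group

    // Feed one analog reading; returns true when a full group averaged high.
    bool feed(uint16_t reading) {
        sum += reading;
        if (++count < kWindow) return false;   // group not full yet
        bool beat = (sum / kWindow) >= kThreshold;
        sum = 0;                               // start the next group
        count = 0;
        return beat;
    }
};
```

Feeding it one reading per pass keeps each call tiny, which fits the non-blocking style described further down.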
What 50 FPS means is changing all the lights every 1000/50 = 20 milliseconds. On a 16 MHz Arduino that's 320,000 CPU cycles. At Arduino speed, 20 millis IS a long time. Even if you waste it with delay(20) it's still a long time wasted, so get into non-blocking code!
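To make the arithmetic and the non-blocking idea concrete, here is a small sketch. FrameTimer and the constants are illustrative names, and the 16 MHz clock is an assumption (it matches an Uno). It is written as plain C++ so the timing logic can be checked off the board; on an Arduino you would pass millis() as nowMs.

```cpp
#include <cstdint>

constexpr uint32_t kFps = 50;
constexpr uint32_t kFrameMs = 1000 / kFps;                      // 20 ms per frame
constexpr uint32_t kCpuHz = 16000000;                           // assumed 16 MHz clock
constexpr uint32_t kCyclesPerFrame = kCpuHz / 1000 * kFrameMs;  // 320000 cycles

// Hypothetical helper: says when the next frame is due, without blocking.
struct FrameTimer {
    uint32_t intervalMs;  // frame length in milliseconds
    uint32_t lastMs;      // time the last frame was due

    // Returns true once per interval. 'nowMs' would be millis() on an Arduino.
    bool due(uint32_t nowMs) {
        // Unsigned subtraction stays correct when the millisecond counter rolls over.
        if (nowMs - lastMs >= intervalMs) {
            lastMs += intervalMs;  // step by the interval to keep a steady cadence
            return true;
        }
        return false;
    }
};
```

Where delay(20) would sit there doing nothing, a loop built around due() just asks "is it time yet?" on each pass and gets on with other work when the answer is no.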
With non-blocking code an Arduino can do an amazing amount per millisecond, as multiple small tasks handling inputs and generating outputs. I have posted examples with buttons/LEDs/serial that show how many times void loop() ran every second: over 67 kHz. That's checking every task and running the ones that are due (like the task that keeps a 32-bit loop count and shows it) 67 times on average every millisecond. That's doing light work, but an Arduino has plenty of do-it in just 1 milli to run many tasks at once.
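This is not the posted example code, just a minimal sketch of the same pattern under my own assumptions: a table of small tasks, each with its own interval, checked on every pass. Task, pollButton, reportLoops and loopOnce are made-up names. On an Arduino, the body of loopOnce() would live in loop() with millis() as nowMs, and reportLoops() would print the count over serial instead of just storing it.

```cpp
#include <cstdint>
#include <cstddef>

// One small task: how often it runs, when it was last due, and what it does.
struct Task {
    uint32_t intervalMs;
    uint32_t lastMs;
    void (*run)();
};

uint32_t loopCount = 0;       // 32-bit count of passes since the last report
uint32_t loopsPerSecond = 0;  // last reported value
uint32_t buttonPolls = 0;     // how many times the stand-in input task ran

void pollButton() { ++buttonPolls; }  // stand-in for reading a real button
void reportLoops() { loopsPerSecond = loopCount; loopCount = 0; }

Task tasks[] = {
    {1, 0, pollButton},      // every millisecond
    {1000, 0, reportLoops},  // once a second
};

// One pass of what would be loop(): check every task, run only the ones due.
void loopOnce(uint32_t nowMs) {
    for (Task &t : tasks) {
        if (nowMs - t.lastMs >= t.intervalMs) {
            t.lastMs += t.intervalMs;
            t.run();
        }
    }
    ++loopCount;  // count this pass after the tasks have been checked
}
```

Most passes find nothing due and fall straight through, which is why the loop can spin tens of thousands of times a second.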
Say you have a task that analog-reads the voltage on left-channel bass one ms, mid-range the next ms and treble the ms after that, leaving time in between for other tasks. Then it turns those three reads into RGB for the next addressable LED in sequence and waits for the next 20 ms frame to do the next LED, so the color string moves at 50 LEDs a second.
You take an input that means one thing and apply it to something else with another meaning: abstraction. Works with circuits, works with art, works with coding too.
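That bass/mid/treble task could be sketched roughly like this. Everything named here (Rgb, strip, sampleBand, commitFrame, the 8-LED strip length) is hypothetical, and mapping bass to red, mid to green and treble to blue is only one possible choice. The readings would come from analogRead() on three filtered inputs, and a real sketch would push strip[] out through whatever LED library you use.

```cpp
#include <cstdint>
#include <cstddef>

constexpr size_t kNumLeds = 8;  // hypothetical strip length

struct Rgb { uint8_t r, g, b; };

Rgb strip[kNumLeds];  // colors that would be sent to the addressable LEDs
size_t nextLed = 0;   // which LED gets the next frame's color
uint16_t band[3];     // latest bass, mid, treble readings (0..1023)

// Scale a 10-bit analog reading (0..1023) down to an 8-bit color channel.
uint8_t toChannel(uint16_t reading) { return static_cast<uint8_t>(reading >> 2); }

// Call once per millisecond for the first three ms of a frame:
// step 0 = bass, 1 = mid, 2 = treble.
void sampleBand(uint8_t step, uint16_t reading) { band[step] = reading; }

// Call once per 20 ms frame: color the next LED, then advance along the strip.
void commitFrame() {
    // Assumed mapping: bass -> red, mid -> green, treble -> blue.
    strip[nextLed] = { toChannel(band[0]), toChannel(band[1]), toChannel(band[2]) };
    nextLed = (nextLed + 1) % kNumLeds;  // wrap around at the end of the strip
}
```

Each call does one small thing and returns, so the other 17 ms of every frame stay free for other tasks.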
If you don't know the language up to variables and arrays, loops and if-else/switch-case logic, then get at least that far, just to save yourself from wasting major time typing and debugging huge messes. Never fear, there's more to learn, but this basic set can do a lot pretty easily.