Is Arduino suitable for this kind of real-time audio processing?

For example, a project that would recognize a certain note in a solo piano piece and cut it out in real-time (or change it to a different note/frequency)?

No.

Can you help me figure out the math here?
I have found this data:
Master clock frequency is 84 MHz
DAC Clock frequency is 42 MHz
ADC clock maximum speed of 200 kHz

Can you clarify where the bottleneck would be if the Arduino took an analog signal, converted it to 10-bit values at 200 kHz, analyzed the values in the main loop (84 MHz / 200 kHz = 420 cycles between readouts), and if they do not match the note, just passed the same values to the DAC and speakers (or did any other bit manipulation first)?
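For reference, the cycle-budget arithmetic in the question can be checked with a few lines of C++. This is only a sketch: the 84 MHz and 200 kHz figures are the ones quoted above, while the 44.1 kHz audio rate and the name cyclesPerSample are assumptions added for comparison.

```cpp
#include <cstdint>

// Whole CPU cycles available between two consecutive samples.
constexpr std::uint32_t cyclesPerSample(std::uint32_t cpuHz, std::uint32_t sampleHz) {
    return cpuHz / sampleHz;  // integer division, fractional cycles are dropped
}

constexpr std::uint32_t kCpuHz   = 84000000u;  // master clock quoted in the question
constexpr std::uint32_t kAdcHz   = 200000u;    // ADC rate quoted in the question
constexpr std::uint32_t kAudioHz = 44100u;     // assumption: a typical audio sample rate

constexpr std::uint32_t kBudgetAtAdcRate   = cyclesPerSample(kCpuHz, kAdcHz);    // 420
constexpr std::uint32_t kBudgetAtAudioRate = cyclesPerSample(kCpuHz, kAudioHz);  // 1904
```

Either way the budget is a few hundred to a couple of thousand cycles per sample, and all of the analysis has to fit inside it.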

It's not a matter of crunching those three numbers. The major issue is buried in this phrase:

if they do not match the note,

Determining the frequency of a musical note takes a lot more CPU cycles than an Arduino can provide. It requires a very fast processor or, better yet, one with DSP instructions.
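To give a feel for the per-sample work involved, here is a minimal sketch of the Goertzel algorithm, which measures the energy at one target frequency in a block of samples. It is a swapped-in illustration, not something proposed in this thread, and the names goertzelPower and sineBlock are hypothetical. It only handles a clean, steady tone; it would have to be repeated for every note of interest and says nothing about harmonics or overlapping notes.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

constexpr double kPi = 3.14159265358979323846;

// Goertzel algorithm: squared magnitude of one frequency within a block of samples.
double goertzelPower(const float* samples, std::size_t n, double targetHz, double sampleHz) {
    const double coeff = 2.0 * std::cos(2.0 * kPi * targetHz / sampleHz);
    double s1 = 0.0;  // filter state one sample back
    double s2 = 0.0;  // filter state two samples back
    for (std::size_t i = 0; i < n; ++i) {
        const double s0 = samples[i] + coeff * s1 - s2;  // one multiply, two adds per sample
        s2 = s1;
        s1 = s0;
    }
    return s1 * s1 + s2 * s2 - coeff * s1 * s2;
}

// Test helper (assumption): a unit-amplitude sine tone, standing in for a "note".
std::vector<float> sineBlock(double toneHz, double sampleHz, std::size_t n) {
    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = static_cast<float>(std::sin(2.0 * kPi * toneHz * static_cast<double>(i) / sampleHz));
    }
    return out;
}
```

For an 800-sample block of a clean 440 Hz sine sampled at 8 kHz, goertzelPower at 440 Hz comes out far larger than at 880 Hz. A real piano note is nowhere near that clean, which is where the cost and the difficulty start to pile up.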
This gets even more complicated by the original stated requirement that it then

cut it [i.e. the note] out in real-time (or change it to a different note/frequency)?

Cutting the note out requires recognizing it in the first place and then subtracting that note from the rest of the audio, which is another difficult problem. Changing it to a different note would presumably require pitch shifting, which is another CPU-intensive problem.
In summary, my (second) opinion is, absolutely not.

Pete

The bottleneck is in identifying a piano note. That is almost impossible, let alone in real time. It is not a matter of looking at the zero crossings, because the harmonic content of a piano note changes continuously through the duration of the note. To stand any chance of just identifying a piano note you need to do a running FFT that overlaps sample buffers. You do not have enough memory to do that in a Due or Zero, nor is the clock rate fast enough.
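A rough way to see the memory side of this is to ask how long an FFT must be before its bins are narrow enough to separate the two lowest piano notes. The sketch below assumes equal temperament with A4 = 440 Hz, a 44.1 kHz sample rate, a power-of-two FFT, and 4-byte values; the function names are hypothetical.

```cpp
#include <cmath>
#include <cstddef>

// Equal-tempered frequency of a MIDI note number (A4 = note 69 = 440 Hz).
double noteHz(int midiNote) {
    return 440.0 * std::pow(2.0, (midiNote - 69) / 12.0);
}

// Smallest power-of-two FFT length whose bin width (sampleHz / N)
// is no wider than the given spacing between two notes.
std::size_t fftLengthFor(double sampleHz, double spacingHz) {
    std::size_t n = 1;
    while (sampleHz / static_cast<double>(n) > spacingHz) {
        n *= 2;
    }
    return n;
}

// Bytes for one buffer of n complex samples (real + imaginary parts).
std::size_t fftBufferBytes(std::size_t n, std::size_t bytesPerValue) {
    return n * 2 * bytesPerValue;
}
```

The lowest two keys, A0 (MIDI note 21, 27.5 Hz) and A#0 (note 22), are only about 1.6 Hz apart. Under the assumptions above, fftLengthFor(44100.0, noteHz(22) - noteHz(21)) gives 32768 points, and fftBufferBytes(32768, 4) gives 262144 bytes for a single buffer, before any overlap. As far as I know the Due has 96 KB of SRAM, so it does not fit, and a 32768-sample window at 44.1 kHz spans roughly three quarters of a second, which is hardly real time.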

In short you want to do something that is impossible no matter what memory or processing speed you throw at the problem.

"The bottleneck is in identifying a piano note. "
Especially as a piano note is made by a hammer striking up to three strings, picking up harmonics from other strings and the case, with sustain from other notes added through the use of the pedals.

Thanks guys. I clearly underestimated the complexity of the problem.

There is a software package, MidiGuitar 2, that samples (analog-to-digital) an electric guitar being played and passes the "note values" (frequencies) to VST (Virtual Studio Technology) plugins and similar DLLs (dynamic-link libraries), which then play the notes as a saxophone, organ, cello, whatever. But I have spent many hours with it and found that it kinda works with a super fast computer. The guitar problem is this: when the finger is taken off the string, that counts as a hammer-off (a note). I guess it would work with other, cleaner inputs like piano or voice. Anyway, it is a way-out-there topic.

I have spent many hours with it and found that it kinda works with a super fast computer.

That is the problem: it "kinda works".