And it can't be that complicated
Really? Off you go then.
This is a window-manager-level function. It is the window manager's job to control (amongst other things) input functionality. You will need to find a way to hook into the window manager (Explorer.exe in MS Windows) and get it to notify a routine that you will need to write. That routine will then need to send something to your Arduino. There are, I believe, interrupt hooks that are used by the international language/keyboard processes when text input fields get focus. You might be able to attach to that process somehow, but this is not trivial and requires fairly low-level knowledge of Windows internals.
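To give a rough idea of what "hook in and get notified" might look like, here is a minimal Python sketch. It does not attach to the language/keyboard processes mentioned above. Instead it assumes a different, documented route: the WinEvent accessibility hook (`SetWinEventHook` in user32.dll), which calls back whenever focus moves. The names `encode_focus_message`, `run_focus_hook` and `send`, and the "FOCUS <hwnd>" message format, are made up for illustration. Note that it reports every focus change; working out whether the focused thing is actually a text input field is the hard part and is left out.

```python
import sys

EVENT_OBJECT_FOCUS = 0x8005      # WinEvent constant: an object received keyboard focus
WINEVENT_OUTOFCONTEXT = 0x0000   # callback runs in our own process, no DLL injection


def encode_focus_message(hwnd):
    """Hypothetical wire format for the Arduino: one ASCII line per focus change."""
    return ("FOCUS %d\n" % (hwnd or 0)).encode("ascii")


def run_focus_hook(send):
    """Call send(bytes) on every focus change until the thread receives WM_QUIT."""
    if sys.platform != "win32":
        raise OSError("WinEvent hooks exist only on Windows")
    import ctypes
    from ctypes import wintypes

    user32 = ctypes.WinDLL("user32")
    # Callback signature documented for WINEVENTPROC.
    WinEventProc = ctypes.WINFUNCTYPE(
        None, wintypes.HANDLE, wintypes.DWORD, wintypes.HWND,
        wintypes.LONG, wintypes.LONG, wintypes.DWORD, wintypes.DWORD)

    def on_event(hook, event, hwnd, id_object, id_child, thread_id, time_ms):
        # hwnd arrives as an int, or None for a NULL window handle.
        send(encode_focus_message(hwnd))

    callback = WinEventProc(on_event)  # keep a reference so it is not garbage collected
    user32.SetWinEventHook.restype = wintypes.HANDLE
    user32.SetWinEventHook.argtypes = [
        wintypes.DWORD, wintypes.DWORD, wintypes.HMODULE, WinEventProc,
        wintypes.DWORD, wintypes.DWORD, wintypes.DWORD]
    hook = user32.SetWinEventHook(
        EVENT_OBJECT_FOCUS, EVENT_OBJECT_FOCUS, None, callback,
        0, 0, WINEVENT_OUTOFCONTEXT)  # 0, 0 = all processes, all threads
    if not hook:
        raise OSError("SetWinEventHook failed")

    # Out-of-context WinEvents are delivered through this thread's message loop.
    msg = wintypes.MSG()
    try:
        while user32.GetMessageW(ctypes.byref(msg), None, 0, 0) > 0:
            user32.TranslateMessage(ctypes.byref(msg))
            user32.DispatchMessageW(ctypes.byref(msg))
    finally:
        user32.UnhookWinEvent(hook)
```

The `send` argument is whatever gets bytes to the board. If you have pyserial installed, something like `run_focus_hook(serial.Serial("COM3", 9600).write)` would probably do, with the port name and baud rate being guesses you would need to change for your setup.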