but the bottleneck seems to be the bridge.
Yes, the bridge is indeed the bottleneck. Using YunServer and YunClient to implement an HTML or AJAX interface is easy, but very inefficient.
The 32u4 reacts "instantly" to the request, but takes 1.5 seconds to parse the data.
No, the AR9331 reacts "instantly." That delay is the time spent getting the data up to the '32U4 and back down again through the bridge.
What's happening is that the HTTP request is sent to the web server running on Linux on the AR9331. It parses the request and sees that the first token in the URL after the address is "/arduino/", so it sends the rest of the URL after that token up through the bridge to the '32U4 processor. The sketch receives that data through the YunServer/YunClient connection, parses the URL, and sends a response. That response is sent down through the bridge to the AR9331, which formats it as an HTTP response and sends it back to the node that made the initial request.
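To make that routing decision concrete, here is a minimal Python sketch of the idea. It is not the actual web server code on the Yun; the function name route_request and the returned labels are made up for illustration. It only shows the split on the "/arduino/" token described above.

```python
ARDUINO_PREFIX = "/arduino/"  # the URL token that triggers forwarding


def route_request(path):
    """Decide where a request path is handled (illustrative only).

    Returns a (destination, payload) tuple:
      ("bridge", rest) when the path starts with /arduino/ and the remainder
                       has to travel over the slow serial bridge to the sketch
      ("linux", path)  when the request can be served on the Linux side
    """
    if path.startswith(ARDUINO_PREFIX):
        # Everything after the token is what the sketch ends up parsing.
        return ("bridge", path[len(ARDUINO_PREFIX):])
    return ("linux", path)
```

For example, route_request("/arduino/digital/13/1") yields ("bridge", "digital/13/1"), and every request of that kind pays for a round trip over the bridge.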
The serial port that is the bridge between the two processors is relatively slow (at least compared to the speed of the AR9331 processor and the network connection) and the speed of the '32U4 is also slow compared to the AR9331 processor. Using YunServer/YunClient is easy, but it's probably the slowest way to service HTTP requests.
Do you have any tips or idea how I can increase the speed of the communication?
Yes. Handle the requests completely on the AR9331 processor using code on the Linux side, and forget about trying to handle the HTTP requests in the sketch.
I have built several projects on the Yun, and found that the most efficient and responsive way to use it is to do all of the heavy processing (and certainly all of the network processing) on the Linux side of the system, and only use the sketch as an I/O processor to access the shield pins. An example of such a system is HERE.
That system does all of the "business" logic in Linux: responding to web requests, determining what should happen when, and tracking the system status. All of that logic is implemented in a Python script that periodically prints out the required output status, and periodically reads in the status of inputs. In this case, there is one output: a relay; and one input: an analog value representing the current draw.
The sketch consists of a Process object from the Bridge Library that runs the Python script. When the Python script uses a "print" statement to write out the desired output state, the sketch can read that from the Process object using the available() and read() methods. When new data is available from the Process object, the sketch reads the data, and controls the outputs as commanded. The sketch also periodically reads the analog input, and writes the raw data to the Process object, which the Python script can read as standard input: it reads the raw value, converts it to Amps, and does the appropriate processing.
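As a rough illustration of the Linux side of that exchange, here is a minimal Python sketch. It is not the script from my project. The scaling constants, the trip threshold, and the one-value-per-line text protocol ("1" for relay on, "0" for relay off) are all assumptions made up for the example. The one detail that matters in practice is flushing standard output after each write, since Python buffers output when it goes to a pipe and the sketch would otherwise see the command late.

```python
ADC_MAX = 1023            # 10-bit ADC on the '32U4
ADC_REF_VOLTS = 5.0       # assumed analog reference voltage
ZERO_CURRENT_VOLTS = 2.5  # hypothetical sensor output at 0 A
VOLTS_PER_AMP = 0.1       # hypothetical sensor sensitivity
TRIP_AMPS = 10.0          # hypothetical limit above which the relay is opened


def raw_to_amps(raw):
    """Convert a raw analogRead() value to amps (assumed linear sensor)."""
    volts = raw * ADC_REF_VOLTS / ADC_MAX
    return (volts - ZERO_CURRENT_VOLTS) / VOLTS_PER_AMP


def relay_command(amps):
    """Return "1" (relay on) or "0" (relay off); placeholder logic."""
    return "0" if abs(amps) > TRIP_AMPS else "1"


def run(instream, outstream):
    """Read one raw reading per line, write one relay command per line."""
    for line in instream:
        line = line.strip()
        if not line:
            continue
        try:
            raw = int(line)
        except ValueError:
            continue  # ignore anything that is not a plain integer
        outstream.write(relay_command(raw_to_amps(raw)) + "\n")
        outstream.flush()  # push the command to the sketch right away
```

To use this as the script launched by the Process object, it would end with import sys followed by run(sys.stdin, sys.stdout). The sketch then writes each analog reading as a line of text and reads back a single character per command.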
Then there is the web service logic. For that, I write an application in Python using the Bottle Web Framework. (There are others out there as well, if you would prefer a different framework.) This Python script is launched by the sketch using another Process object. The resulting Bottle application serves up all of the dynamic and static HTTP requests and responses.
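A minimal sketch of what such a Bottle application might look like follows. The route names, the state dictionary, and the STATIC_ROOT path are hypothetical, and how the web application obtains the current status from the control script is left out. Bottle is a third-party package that has to be installed on the Linux side; it is imported inside build_app() here only so that the helper above it can be exercised without Bottle present.

```python
STATIC_ROOT = "/path/to/static"  # hypothetical directory for HTML/JS/CSS files

# Placeholder for the system status; a real application would get this
# from the control logic rather than from a module-level dictionary.
state = {"relay_on": False, "amps": 0.0}


def status_payload():
    """Build the JSON-serializable body for an AJAX status poll."""
    return {
        "relay": "on" if state["relay_on"] else "off",
        "amps": round(state["amps"], 2),
    }


def build_app():
    """Create the Bottle application with one dynamic and one static route."""
    from bottle import Bottle, static_file  # third-party package

    app = Bottle()

    @app.route("/status")
    def status():
        # Bottle serializes a returned dict as a JSON response.
        return status_payload()

    @app.route("/static/<filename:path>")
    def static(filename):
        return static_file(filename, root=STATIC_ROOT)

    return app
```

Starting the server would then be something like build_app().run(host="0.0.0.0", port=8080), with the port chosen so it does not clash with the web server already running on the Yun. None of these requests touch the bridge, which is where the speedup comes from.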
With YunServer/YunClient I was seeing similar response times of 1 to 1.5 seconds. With Bottle on the Linux side, AJAX calls are answered in a few tens of milliseconds.