I finally got a tcpdump of the connection from my laptop to the Arduino over a dedicated Ethernet link. I used curl to transfer a file of about 450 KB, and the transfer has consistently taken about 45 seconds. Here is an excerpt from the middle of the trace:
00:03:03.745287 IP arduino.http > laptop.41472: Flags [.], ack 316534, win 2048, length 0
00:03:03.745358 IP laptop.41472 > arduino.http: Flags [P.], seq 316534:317558, ack 20, win 29200, length 1024
00:03:03.745365 IP laptop.41472 > arduino.http: Flags [.], seq 317558:318582, ack 20, win 29200, length 1024
00:03:03.948151 IP arduino.http > laptop.41472: Flags [.], ack 318582, win 2048, length 0
00:03:03.948222 IP laptop.41472 > arduino.http: Flags [P.], seq 318582:319606, ack 20, win 29200, length 1024
00:03:03.948228 IP laptop.41472 > arduino.http: Flags [.], seq 319606:320630, ack 20, win 29200, length 1024
00:03:04.150957 IP arduino.http > laptop.41472: Flags [.], ack 320630, win 2048, length 0
00:03:04.151028 IP laptop.41472 > arduino.http: Flags [P.], seq 320630:321654, ack 20, win 29200, length 1024
00:03:04.151034 IP laptop.41472 > arduino.http: Flags [.], seq 321654:322678, ack 20, win 29200, length 1024
00:03:04.353723 IP arduino.http > laptop.41472: Flags [.], ack 322678, win 2048, length 0
00:03:04.353758 IP laptop.41472 > arduino.http: Flags [P.], seq 322678:323702, ack 20, win 29200, length 1024
00:03:04.353763 IP laptop.41472 > arduino.http: Flags [P.], seq 323702:324726, ack 20, win 29200, length 1024
00:03:04.556663 IP arduino.http > laptop.41472: Flags [.], ack 324726, win 2048, length 0
00:03:04.556730 IP laptop.41472 > arduino.http: Flags [.], seq 324726:325750, ack 20, win 29200, length 1024
00:03:04.556736 IP laptop.41472 > arduino.http: Flags [P.], seq 325750:326774, ack 20, win 29200, length 1024
00:03:04.759463 IP arduino.http > laptop.41472: Flags [.], ack 326774, win 2048, length 0
00:03:04.759504 IP laptop.41472 > arduino.http: Flags [P.], seq 326774:327798, ack 20, win 29200, length 1024
00:03:04.759509 IP laptop.41472 > arduino.http: Flags [.], seq 327798:328822, ack 20, win 29200, length 1024
00:03:04.962329 IP arduino.http > laptop.41472: Flags [.], ack 328822, win 2048, length 0
00:03:04.962397 IP laptop.41472 > arduino.http: Flags [P.], seq 328822:329846, ack 20, win 29200, length 1024
00:03:04.962404 IP laptop.41472 > arduino.http: Flags [.], seq 329846:330870, ack 20, win 29200, length 1024
00:03:05.165186 IP arduino.http > laptop.41472: Flags [.], ack 330870, win 2048, length 0
00:03:05.165251 IP laptop.41472 > arduino.http: Flags [P.], seq 330870:331894, ack 20, win 29200, length 1024
00:03:05.165257 IP laptop.41472 > arduino.http: Flags [.], seq 331894:332918, ack 20, win 29200, length 1024
00:03:05.368028 IP arduino.http > laptop.41472: Flags [.], ack 332918, win 2048, length 0
00:03:05.368096 IP laptop.41472 > arduino.http: Flags [P.], seq 332918:333942, ack 20, win 29200, length 1024
00:03:05.368103 IP laptop.41472 > arduino.http: Flags [.], seq 333942:334966, ack 20, win 29200, length 1024
00:03:05.570882 IP arduino.http > laptop.41472: Flags [.], ack 334966, win 2048, length 0
00:03:05.570953 IP laptop.41472 > arduino.http: Flags [P.], seq 334966:335990, ack 20, win 29200, length 1024
00:03:05.570960 IP laptop.41472 > arduino.http: Flags [.], seq 335990:337014, ack 20, win 29200, length 1024
00:03:05.773694 IP arduino.http > laptop.41472: Flags [.], ack 337014, win 1600, length 0
00:03:05.773767 IP laptop.41472 > arduino.http: Flags [P.], seq 337014:338038, ack 20, win 29200, length 1024
If you look closely, my Linux machine sends the next two segments well under 100 microseconds after each ACK arrives. The W5100, however, takes about 200 ms to ACK each pair of segments, every single time. So this may indeed be the time the W5100 needs to process received data. I wonder why it takes so long, when it can transmit much faster.
The TCP protocol conformance itself seems to be pretty accurate. The W5100 correctly advertises a window of 2048 bytes, which is its buffer size.
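As a rough sanity check, the numbers in the trace do add up to the 45 seconds I am seeing. The sketch below uses the ACK timestamps and ACK numbers copied from the excerpt above; the variable names are my own, and I am assuming "450 KB" means 450 * 1024 bytes and that the rest of the transfer behaves like this excerpt.

```python
# ACK arrival times from the trace, as seconds past 00:03:00.
ack_times = [
    3.745287, 3.948151, 4.150957, 4.353723, 4.556663, 4.759463,
    4.962329, 5.165186, 5.368028, 5.570882, 5.773694,
]
# ACK numbers carried by those same packets.
ack_numbers = [
    316534, 318582, 320630, 322678, 324726, 326774,
    328822, 330870, 332918, 334966, 337014,
]

# Average gap between consecutive ACKs from the W5100.
elapsed = ack_times[-1] - ack_times[0]
mean_gap = elapsed / (len(ack_times) - 1)

# Bytes acknowledged over the excerpt, and the resulting throughput.
bytes_acked = ack_numbers[-1] - ack_numbers[0]
throughput = bytes_acked / elapsed  # bytes per second

# Assumption: the file is 450 * 1024 bytes and the rate stays constant.
file_size = 450 * 1024
estimated_seconds = file_size / throughput

print(f"mean ACK gap: {mean_gap * 1000:.1f} ms")
print(f"throughput: {throughput / 1024:.1f} KiB/s")
print(f"estimated transfer time: {estimated_seconds:.1f} s")
```

So one 2048-byte window per roughly 200 ms round works out to around 10 KB/s, which would explain the whole transfer time by itself.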