How to use RS232 in that case?

Is every RS232 device communicating the same way (besides the baud rate)? Or are there differences?

There can be extensive differences, including complex communications protocols and data packaging. In this case, it looks like you're in luck: the interface is very ordinary and uses plain text for most of the commands. From the FLIR manual:

An RS-232 terminal or host computer connects to the female DB-9 connector on the Pan-Tilt Unit Controller (PTU-C). The host terminal or computer should be set to 9600 baud, 1 start bit, 8 data bits, 1 stop bit, and no parity. Hardware handshaking and XON/XOFF are not used.
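To make those settings concrete, here is a small Python sketch of what "1 start bit, 8 data bits, 1 stop bit, no parity" means for each byte sent. The function names `frame_bits` and `bytes_per_second` are made up for illustration, and the bits shown are logical values, not the voltages on the wire.

```python
def frame_bits(byte, data_bits=8):
    """Return the logical bit sequence for one byte: start bit, data LSB first, stop bit."""
    data = [(byte >> i) & 1 for i in range(data_bits)]  # least significant bit goes first
    return [0] + data + [1]  # start bit is 0, stop bit is 1, no parity bit


def bytes_per_second(baud=9600, data_bits=8, parity_bits=0, stop_bits=1):
    """Upper bound on throughput: every byte costs start + data + parity + stop bits."""
    bits_per_frame = 1 + data_bits + parity_bits + stop_bits
    return baud / bits_per_frame


print(frame_bits(ord("H")))  # the letter 'H' is 0x48
print(bytes_per_second())    # 10 bits per byte at 9600 baud
```

With these settings every byte occupies 10 bit times, so 9600 baud carries at most 960 bytes per second, which is plenty for short text commands.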

From computer networking, you get the concept of "layering." At the bottom "physical" layer, you have the allowed voltage levels and such. This is different between the Arduino ("TTL serial") and "real RS-232" as used by your PTU, which is why you need that "adaptor" board or chip (the common MAX232).
At the next layer you have the "data link layer," which describes the format of bits and bytes, flow control, and so on. These happen to be the same for Arduino serial ports and the PTU (the Arduino's default serial configuration is also 8 data bits, no parity, 1 stop bit). Then there are a bunch more layers, which in this case are pretty minimal and amount to "send text commands." (Arguably, "commands are composed of ASCII text strings ending with a delimiter" is a separate layer from "the H command halts all movement.")
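The split between those last two layers can be sketched in a few lines of Python. This is only an illustration: `frame_command` and `halt` are hypothetical names, and the newline delimiter is an assumption, so check the PTU manual for the delimiters it actually accepts.

```python
DELIMITER = "\n"  # assumption: the real delimiter must be taken from the PTU manual


def frame_command(command, delimiter=DELIMITER):
    """Framing layer: turn a text command into ASCII bytes ending with a delimiter."""
    # encode("ascii") raises UnicodeEncodeError if the command is not plain ASCII
    return (command + delimiter).encode("ascii")


def halt():
    """Command layer: 'H' is the halt command mentioned above."""
    return frame_command("H")


print(halt())  # the bytes that would be written to the serial port
```

The bytes returned here are what you would hand to whatever serial write call your platform provides; the layers below (bit framing and voltage levels) are then handled by the UART and the level shifter.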