does it make sense to declare stuff on the "heap"/free store?
In general, no. Specifically, on 8-bit AVRs the answer is definitely no. On 32-bit Arduinos, the answer is usually no. But it really comes down to the difference between a PC (virtual memory, rebooted whenever) and an embedded environment (or a high-reliability server) that is expected to run indefinitely.
Transitioning from one environment to the other is difficult, but lots of people do it. Some people never quite get the difference, and continue to write as if the program were running on a platform with an OS and a hard drive. The turning point is usually when their sketch runs for 2 hours (or 2 days) and then "hangs", or runs for a random amount of time and then starts spitting out garbage.
A primary characteristic of an embedded program is that it is deterministic: an input produces a predictable and repeatable change in the state of the program, and any combination of inputs is also predictable. The heap does not behave predictably, because different sequences of malloc/free calls produce different heap states (fragments).
Limited RAM will exacerbate the fragmentation, so you shouldn't even consider malloc/free (or new/delete) on an Arduino with 2K of RAM. However, it is very common to declare pools or blocks of fixed-size elements. You are explicitly declaring the maximum number of "things", and there is no possibility of fragmentation. You can still use pointers.
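To make the pool idea concrete, here is a minimal sketch of a fixed-size pool. The names `Pool` and `Reading` are made up for illustration, and the linear search for a free slot assumes the pool is small. All storage lives inside the object, so declaring it `static` fixes the RAM cost at link time.

```cpp
#include <stddef.h>
#include <stdint.h>

// Hypothetical record type; stands in for whatever "thing" the sketch tracks.
struct Reading {
  uint16_t value;
  uint32_t timestamp;
};

// Fixed-capacity pool of N elements. No malloc/new is involved, and every
// slot has the same size, so the pool cannot fragment.
template <typename T, size_t N>
class Pool {
public:
  // Returns a pointer to a free slot, or nullptr when all N are in use.
  T *acquire() {
    for (size_t i = 0; i < N; ++i) {
      if (!used_[i]) {
        used_[i] = true;
        return &items_[i];
      }
    }
    return nullptr;
  }

  // Marks a slot as free again. Pointers outside the pool are ignored.
  void release(T *p) {
    if (p == nullptr) return;
    size_t i = static_cast<size_t>(p - items_);
    if (i < N) used_[i] = false;
  }

  // Number of slots currently free.
  size_t available() const {
    size_t n = 0;
    for (size_t i = 0; i < N; ++i) {
      if (!used_[i]) ++n;
    }
    return n;
  }

private:
  T items_[N] = {};
  bool used_[N] = {};
};
```

A sketch would declare something like `static Pool<Reading, 8> readings;` and check the result of `acquire()` for `nullptr`. Running out of slots becomes an explicit, testable condition instead of a random failure hours later.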
A second aspect is that non-embedded programs tend to "store" some information, then "use" it later. In the embedded world, you can usually write the program to "use" the information as it is received. A frequent example is processing a line of input characters. The typical PC approach is to store characters until a newline is received, and then process the array of characters all at once. You'll need enough RAM to hold on to the longest line. In contrast, the typical embedded approach is to use each character as it is received. You only need enough RAM to hold on to the "interesting" values in the line.
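As a rough illustration of "use each character as it is received", the sketch below pulls one integer out of a comma-separated line without ever storing the line. The class name `FieldParser` and the choice of the third field are assumptions made up for this example. On an Arduino you would call `feed()` with each character returned by something like `Serial.read()`.

```cpp
#include <stdint.h>

// Streaming parser: keeps only the third comma-separated field of each line,
// interpreted as an unsigned decimal integer. RAM use is a few bytes no
// matter how long the line is.
class FieldParser {
public:
  // Feed one character. Returns true when a newline completes a line;
  // result() then holds the value found on that line.
  bool feed(char c) {
    if (c == '\n') {
      result_ = value_;
      field_ = 0;
      value_ = 0;
      return true;
    }
    if (c == ',') {
      if (field_ < 255) ++field_;  // saturate so the counter cannot wrap
      return false;
    }
    if (field_ == kWantedField && c >= '0' && c <= '9') {
      value_ = value_ * 10 + static_cast<uint32_t>(c - '0');
    }
    return false;  // every other character is simply dropped
  }

  uint32_t result() const { return result_; }

private:
  static const uint8_t kWantedField = 2;  // zero-based index (assumption)
  uint8_t field_ = 0;
  uint32_t value_ = 0;
  uint32_t result_ = 0;
};
```

The "store, then use" version would need a buffer sized for the longest possible line. This one only needs room for the value it cares about.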
Handling TCP streams in an embedded environment is similar. You must "scan" the incoming stream for the interesting parts, and use them immediately if possible, so they don't have to be stored in RAM. For example, if you're logging the data to an SD card, you must write the data as it is received. Otherwise, your SD files could only be as big as your available RAM. Not very useful.
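Here is a hedged sketch of the "write it as it arrives" pattern. The `relay` function and the `ArraySource`/`CountingSink` stand-ins are hypothetical, not part of any Arduino library. In a real sketch the source would be a network client and the sink an SD file object, and the 32-byte chunk size is an arbitrary choice.

```cpp
#include <stddef.h>
#include <stdint.h>

// Forward every byte currently available from src to dst through a small
// fixed buffer. Source must provide int read() returning -1 when empty;
// Sink must provide void write(const uint8_t *, size_t).
template <typename Source, typename Sink>
size_t relay(Source &src, Sink &dst) {
  uint8_t chunk[32];  // the only buffering, regardless of stream length
  size_t n = 0;
  size_t total = 0;
  int b;
  while ((b = src.read()) >= 0) {
    chunk[n++] = static_cast<uint8_t>(b);
    if (n == sizeof(chunk)) {  // buffer full: hand it off and reuse it
      dst.write(chunk, n);
      total += n;
      n = 0;
    }
  }
  if (n > 0) {  // flush the final partial chunk
    dst.write(chunk, n);
    total += n;
  }
  return total;
}

// Stand-ins for trying the sketch off-target.
struct ArraySource {
  const uint8_t *data;
  size_t len;
  size_t pos;
  int read() { return pos < len ? data[pos++] : -1; }
};

struct CountingSink {
  size_t bytes;
  size_t calls;
  void write(const uint8_t *, size_t len) {
    bytes += len;
    ++calls;
  }
};
```

The amount of data that passes through is limited by the sink, not by RAM: a 100-byte payload goes through the same 32 bytes of buffer as a 100 MB one would.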
Just like using String, using the heap seems so easy because it is easy to understand. The long-term behaviour of the heap is much more subtle, and the problems it can cause are frustratingly random. Save yourself 600 bytes of flash memory and find a way to do it without RAM (processed immediately), with local RAM (processed and released within one routine), or with static RAM (resource available during entire program), in that order.
Regarding STL in general, template instantiation is a powerful technique, and it can be used to minimize RAM and flash size. Can STL be used without the heap, just for local or static variables? I don't know. But if you're familiar with OO techniques, you should also take a look at Cosa. It is ridiculously efficient and a terrific OO design. It may also help you with the transition to the embedded mindset.