Hooray - so there is a second chance for me!
(a) First you must understand that transistors can work in two modes: digitally as switches (on/off), and as a kind of amplifier. This is where the problems start. It did not really start with this, but the first popular transistors were "bipolar" (whatever that may mean), and they were current amplifiers: you inject a small current into the "base", and a significantly higher current is drawn in at the "collector". This was absolutely weird - no one really wanted to have such a thing. Vacuum tubes were so much more useful!! But it got much worse!
The amplification factor was not a reliable constant: it varied with the voltage applied to the collector, from device to device, and with temperature.
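To put some numbers on that complaint, here is a minimal Python sketch of the "current amplifier" view. The gain values and the labels are made up for illustration (they are assumptions, not datasheet figures); the point is only that the same base current gives very different collector currents.

```python
def collector_current(base_current_a, gain):
    """Idealized bipolar transistor: collector current = gain * base current."""
    return gain * base_current_a


base_current_a = 10e-6  # 10 microamps into the base

# Hypothetical gain values, chosen only to illustrate the spread.
examples = [
    ("device A, cold", 100),
    ("device A, warm", 150),
    ("device B", 300),
]

for label, gain in examples:
    ic = collector_current(base_current_a, gain)
    print(f"{label}: I_C = {ic * 1e3:.2f} mA")
```

A circuit that relies on one particular gain value will behave differently for every line of that printout, which is why designers learned to build circuits that do not care much about the exact gain.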
But people can learn Vietnamese, so they also can learn how to work with transistors. The main pressure came from the market: Transistors were the only way to build portable devices and huge things like (mainframe) computers.
That does not change the fact that no one really liked bipolar transistors. In a university course a long time ago, my professor once said: "And now I shall tell you a secret: No one really understands why transistors work...."
(b) Enter the FET. This is also called a transistor for reasons unknown, maybe because many of them also have three leads, and they are also used to switch and to amplify.... They are very similar to vacuum tubes in that they are controlled by voltage, which however is only part of the truth, as the gate - especially the early gates - had a considerable capacitance, which is not at all good for someone who is short of current.... And they had other issues.
But designers now had the option to jump from the frying pan into the fire.
(c) Now for something completely different: Engineers had been working with analog computers for some time to solve (mainly) differential equations. For this they had to add, multiply, and integrate values, and they found out that electrical voltage and current were able to help. E.g. take an (ideal) capacitor: It can collect charge (integration). What they also needed was an (ideal) buffer which would leave the charge at the input untouched and hold nothing back at the output. It turned out that this could also be described as infinite amplification, but that was just by the way. The idea behind this was not to amplify things much, but to have no bad influence. So the actual amplification factor was a measure of quality, like buying a 200 kW car. Asking "How fast can it drive?" is not the point. You just cannot avoid that it can drive fast...
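The "capacitor integrates" remark can be sketched numerically. This is a simplified model assuming an ideal capacitor (no leakage) and a simple fixed-step sum; the function name and the component values are my own choices for illustration.

```python
def capacitor_voltage(currents_a, dt_s, capacitance_f, v0=0.0):
    """Ideal capacitor: each current sample adds charge i*dt, and V = Q / C."""
    v = v0
    voltages = []
    for i in currents_a:
        v += i * dt_s / capacitance_f  # dV = I * dt / C
        voltages.append(v)
    return voltages


# Assumed example values: 1 microamp flowing for 1 second
# (1000 steps of 1 ms) into a 1 microfarad capacitor.
steps = 1000
trace = capacitor_voltage([1e-6] * steps, dt_s=1e-3, capacitance_f=1e-6)
print(f"voltage after 1 s: {trace[-1]:.3f} V")
```

The voltage on the capacitor is the running sum (the integral) of the current fed into it, which is exactly the operation an analog computer needs. The ideal buffer is there so that reading this voltage does not drain the charge again.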
But back to electronics. Those devices were called instrumentation amplifiers and were as expensive as other lab equipment. You would expect to buy one or two in your lifetime, or so....
(d) However.... Instrumentation amplifiers were the holy grail of designers, like a Porsche 911. They were ideal; one could even pre-compute the behavior of a circuit (which one could not with a circuit that contained - say - 5 transistors of whatever kind).
The first affordable ones - now called operational amplifiers - were a disappointment: they were still expensive, they needed "calibration" and a negative supply voltage, and they were hardly useful for anything but what they had been invented for: near-DC signals.
But this changed rapidly. Opamps are nowadays used all over the analog world because you - everyone! - can design reliable and predictable circuits. The rule for an opamp is as simple as this:
Wired with negative feedback, it adjusts its output voltage in such a way that there is no voltage difference between its inputs.
This has revolutionized a whole industry...
Of course a set of transistors can in many cases do the same, but no one is willing to pay an expert to develop such a circuit. On the other hand, we have very, very good simulation software at the moment. Well, maybe the situation will change....
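For those who want to see the opamp rule at work, here is a minimal sketch of the classic inverting amplifier (input resistor to the inverting input, feedback resistor from output back to it, non-inverting input at ground). The circuit choice, the resistor values, and the gain figures are my assumptions for illustration; the opamp is modelled as nothing but a finite gain.

```python
def inverting_amp(v_in, r_in, r_f, open_loop_gain):
    """Inverting amplifier with a finite-gain opamp.

    Solves v_out = -A * v_minus together with the resistor divider
    v_minus = (v_in * r_f + v_out * r_in) / (r_in + r_f).
    Returns (v_out, v_minus); v_plus is grounded, so v_minus is also
    the voltage difference between the two inputs.
    """
    a = open_loop_gain
    v_out = -a * v_in * r_f / (r_in + r_f + a * r_in)
    v_minus = -v_out / a
    return v_out, v_minus


# Assumed example: 1 V in, 1 kOhm input resistor, 10 kOhm feedback resistor.
for gain in (1e2, 1e4, 1e6):
    v_out, v_diff = inverting_amp(1.0, 1e3, 10e3, gain)
    print(f"A = {gain:.0e}: v_out = {v_out:.4f} V, input difference = {v_diff:.6f} V")
```

The higher the gain, the closer the input difference gets to zero and the closer the output gets to -v_in * r_f / r_in, a value set by the two resistors alone. That is the "amplification as a measure of quality" idea: the huge gain is not used to amplify, it is used to make the rule hold.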
(e) Back to transistors. You remember the "switch" thing? Transistors are very good switches, and very popular because many of their idiosyncrasies are acceptable in this mode. So you will find transistors all over the place in the digital business. Sometimes they are hidden in a black housing called e.g. ULN2801 or something. You would not have guessed that this "8 channel driver" contained nothing but 8 transistors? (Well, in fact 16, but that would be hair-splitting.) Transistors as switches are well understood, and there is hardly any reason they should vanish from this application, except - of course - that they might be packaged differently to satisfy requests for "16 at a stretch".
Hope you enjoyed my little story :-)
Edit:
I just fixed 24 typos...