The 8051 was actually one of the very early microcontrollers (~1980). The original 8x51 had 4k of program memory, 128 bytes of RAM, several 8-bit wide IO ports, a couple of timers, and a UART. They predate Flash memory, so your choice of program memory was "none" (8031), "mask-programmed (at the factory) ROM" (8051), or UV-erasable EPROM (8751).
One of the early Arduino-like projects (in the sense of "A microcontroller system for doing things without having to know all about programming and building electronics") was based on the 8051, in the form of "8052 Basic" boards. These used the on-chip ROM to hold a BASIC interpreter, so with the addition of relatively simple (for the time) hardware you could have a small and inexpensive (for the time) computer that was programmable in BASIC (from a terminal, which was a bit of an issue in those days...). A lot of early home automation (and similar) efforts used such things. (Ahh, 1982, when a usable home PC was several thousand (1980s) dollars, without a hard drive...)
The 8052-Basic chip sold for about $30 (chip only; no support circuitry). But one feature of the 8051 family of chips was that a single signal (the EA pin) could be connected so that they would operate in "microprocessor mode": instead of using the internal ROM, they would re-purpose a couple of the IO ports as an external memory bus, and run code from external memory INSTEAD of the internal memory. This meant that obsolete 8051s with "bad" code could be harvested from equipment, or bought/sold on the surplus market (cheap!), and were still useful for building these "external memory" systems. (Paul Stoffregen, of "Teensy" fame in the Arduino world, was a player in the 8051 world, and originally sold a nice 8051 SBC.)
A big advantage of the 8051 family is the ubiquity of the thing. EVERYBODY sells an 8051 chip. Atmel sells a variety of 8051 chips. Microchip sells a variety of 8051 chips. NXP sells 8051 chips. USB Hub controller chips have integrated 8051s. MP3 player chips have integrated 8051s. You can get free FPGA cores that implement the architecture. It is the 8-bit equivalent of ARM in this respect. A "modern" 8051 implementation (say, one of the chips from Silicon Labs) has 64k or more of on-chip ISP flash program memory, significant RAM, and runs at close to 1 instruction per clock (compared to 12 or more clocks per instruction in the original), making it faster than most AVRs.
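To put rough numbers on the clocks-per-instruction difference, here is a back-of-the-envelope sketch in C. The function name `instr_per_second` is made up for illustration, and it pretends every instruction takes the same number of clocks, which is not true of real code (on the original 8051, some instructions take 24 or 48 clocks).

```c
#include <stdint.h>

/* Approximate instruction throughput, in instructions per second,
 * for a given oscillator frequency and a fixed number of clocks per
 * instruction. This is an idealized figure: real instruction mixes
 * include slower instructions, so treat the result as an upper bound. */
uint32_t instr_per_second(uint32_t clock_hz, uint32_t clocks_per_instr)
{
    return clock_hz / clocks_per_instr;
}
```

With this, `instr_per_second(12000000u, 12u)` gives 1,000,000 for an original 8051 on a 12 MHz crystal, while `instr_per_second(25000000u, 1u)` gives 25,000,000 for a single-cycle core at 25 MHz (an assumed clock rate, picked only as an example). A 16 MHz AVR, at roughly one instruction per clock, lands at about 16,000,000 by the same estimate, so the comparison depends heavily on which clock speeds you pick.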
The 8051 architecture is not very well suited to running C. There is a free C compiler (SDCC) with a somewhat mixed reputation, and various not-free compilers (some with "limited" evaluation versions). The architecture is from the days of assembly language programming, and the chip has a bunch of neat features at the instruction-set level that are difficult to get a compiler to use.
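One example of such a feature is bit addressing: the 8051 has instructions (SETB, CLR, JB) that set, clear, or test a single bit directly. Standard C has no way to name a single bit, so portable code has to spell the same thing as mask operations and hope the compiler recognizes the pattern (8051 compilers typically add non-standard extensions for this instead). A minimal sketch of the portable version, where `fake_port` and the helper names are made up for illustration and the "port" is just an ordinary variable so it runs on any host:

```c
#include <stdint.h>
#include <stdbool.h>

/* Stand-in for an 8-bit port register. On a real 8051 this would be
 * a special function register; here it is a plain variable. */
static volatile uint8_t fake_port;

/* Set one bit with a read-modify-write, the portable C idiom. */
void set_bit(uint8_t bit)
{
    fake_port |= (uint8_t)(1u << bit);
}

/* Clear one bit, again as a read-modify-write on the whole byte. */
void clear_bit(uint8_t bit)
{
    fake_port &= (uint8_t)~(1u << bit);
}

/* Test one bit by shifting and masking the whole byte. */
bool test_bit(uint8_t bit)
{
    return ((fake_port >> bit) & 1u) != 0;
}

/* Read back the whole byte, mainly useful for checking the result. */
uint8_t port_value(void)
{
    return fake_port;
}
```

Each of these helpers corresponds to what a single 8051 instruction can do when the bit number is known at assembly time, which is the sort of thing that is easy to write by hand in assembler and harder for a C compiler to discover on its own.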
For the typical Arduino user, I don't think there is any advantage to an 8051 system. It would depend somewhat on exactly which chip was used, and how, and just how good a bargain you were getting.
From a professional point of view, it would be foolish to ignore 8051s. Some of the modern versions have very nice peripheral sets (16-bit ADCs, for instance!). I've been thinking that it would be nice to have an Arduino-class (small, cheap, USB-connected, bootloader-based) board with an 8051 chip of some kind, just from an educational perspective. I'm not sure that it's possible to do that in a cost-effective way, though. :-(