Really? I don't think there has been an unashamedly big-endian processor since the 68k series. (Not counting the modern RISC processors, which have memory loads/stores so divorced from the rest of the instruction set that they can be configured either way.) (ARM and MIPS are both configurable at some level, but every ARM and MIPS microcontroller I'm aware of has been configured as little-endian in silicon.)
Everyone numbers their bits in little-endian fashion these days, because math: bit 0 has a value of 2**0, bit n a value of 2**n, and so on. Little-endian byte order just continues that numbering: byte n carries a weight of 2**(8n).
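To make that concrete, a minimal sketch (C just for illustration; it assumes it's running on a little-endian host):

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    int main(void) {
        uint32_t x = 0x11223344;
        uint8_t b[4];
        memcpy(b, &x, sizeof x);              /* bytes in raw memory order */
        uint32_t sum = 0;
        for (int i = 0; i < 4; i++)
            sum += (uint32_t)b[i] << (8 * i); /* byte i weighted 2**(8i) */
        printf("%08x == %08x\n", (unsigned)x, (unsigned)sum); /* equal on LE */
        return 0;
    }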
I think LE might have had an advantage for doing multi-precision math on early 8-bit CPUs. To add two n-byte numbers, you have to start with the LSB. In an LE architecture, the LSB is right there where your pointer is pointing. If your instruction set is limited in its ability to do indexed addressing, or math on pointer registers, this is much more convenient than needing your first add to be @(Y+4) + @(X+4).
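In C terms, the loop looks something like this (a sketch with byte-sized limbs; the function name and signature are mine, not from any particular library):

    #include <stdint.h>
    #include <stddef.h>

    /* dst = a + b, all n bytes long, least significant byte first */
    unsigned add_le(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n) {
        unsigned carry = 0;
        for (size_t i = 0; i < n; i++) {   /* start at the LSB: index 0 */
            unsigned s = a[i] + b[i] + carry;
            dst[i] = (uint8_t)s;           /* low 8 bits stay in this limb */
            carry = s >> 8;                /* anything above carries onward */
        }
        return carry;                      /* final carry out, if any */
    }

On a little-endian machine the limbs are already stored in that order, so index 0 is exactly where the pointer points; on big-endian you'd have to start at offset n-1 and walk backwards.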
TCP/IP networking is all big-endian. I spent too many years making big-endian code run on little-endian chips. (Nowadays, Intel has a bi-endian compiler that does most of the work for you. And the cost is negligible: if it takes 40ns to fetch a byte from memory and 0.5ns to swap the bytes around with the BSWAP instruction on a modern x86 CPU, the swap disappears into the fetch.)
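The usual fix at the boundary, for reference: the standard POSIX htonl/ntohl calls from <arpa/inet.h>, which are no-ops on a big-endian host and typically compile down to a single BSWAP on x86:

    #include <stdio.h>
    #include <stdint.h>
    #include <arpa/inet.h>

    int main(void) {
        uint32_t host = 0x0A000001;   /* 10.0.0.1 as a host-order integer */
        uint32_t wire = htonl(host);  /* to network (big-endian) order */
        uint32_t back = ntohl(wire);  /* and back again */
        printf("host %08x wire %08x back %08x\n",
               (unsigned)host, (unsigned)wire, (unsigned)back);
        return 0;
    }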