It goes back to your definitions. MANY processors lack "single bit" operations, but have plenty of other mechanisms for manipulating single bits. For example, on the ARMv6-M (Cortex-M0, M0+) chips, the only way to do things with single bits is via operations on 32-bit registers. If you want to test a bit somewhere in memory, you load a byte (or larger) from that memory location into a register, and then shift the register until the desired bit becomes the carry or high bit of the word (where it can be tested with conditional branch instructions). (You might think that you'd use "bitwise AND with an immediate value", which is what most chips do. But v6m doesn't have an "AND immediate" instruction.)
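To make the load-then-shift idea concrete, here is a minimal C sketch of three equivalent ways to test one bit of a 32-bit word. The function names are made up for illustration, and the bit number is assumed to be in the range 0..31. Which instruction sequence a compiler actually emits for any of these depends on the target and the optimizer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Shift the wanted bit down to bit 0 and look at it there.
   Works with a run-time bit number (assumed 0..31). */
static bool bit_is_set_shift(uint32_t word, unsigned bit)
{
    return ((word >> bit) & 1u) != 0;
}

/* Shift the wanted bit up into the top (sign) position, which is
   roughly the "shift until it is the high bit" idiom: after the
   shift, a branch on the sign flag decides the test. */
static bool bit_is_set_top(uint32_t word, unsigned bit)
{
    return ((word << (31u - bit)) & UINT32_C(0x80000000)) != 0;
}

/* The mask form that most chips do with an AND-immediate. On a chip
   without one, the mask has to be built in a register first. */
static bool bit_is_set_mask(uint32_t word, unsigned bit)
{
    return (word & (UINT32_C(1) << bit)) != 0;
}
```

All three return the same answer for the same inputs; the difference is only in what the underlying instruction set makes cheap.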
The AVR (if you consider it a "real" processor) lacks a dedicated instruction to set or clear a numbered bit in a general register (the SBI/CBI instructions only work on the low 32 I/O registers).
AVR does have SBR and CBR for bit set/clear in the general purpose registers, but they take a mask rather than a bit number (they're just other names for ORI and ANDI, and only work on r16-r31). It also has SBRC and SBRS for checking single bits, and BST/BLD for extracting/setting single bits.
Many architectures that DO have single-bit instructions have a bunch of restrictions. For example, on AVR and PIC, the bit number you want to manipulate has to be a constant encoded in the instruction.
I'm pretty sure you can do single-bit operations even without logical instructions. At worst, you do conversion to and from binary and count digits... (sort of like using a programmable calculator with essentially only floating point math to do integer stuff like computing check digits. BTDT.)
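As a rough illustration of the "no logical instructions" case, the sketch below tests, sets, and clears a bit using only addition, subtraction, division, and remainder. The function names are hypothetical, and the bit number is assumed to be 0..31.

```c
#include <stdbool.h>
#include <stdint.h>

/* 2^bit computed by repeated doubling, so no shift is needed. */
static uint32_t pow2_arith(unsigned bit)
{
    uint32_t p = 1;
    while (bit-- > 0) {
        p += p;
    }
    return p;
}

/* Divide away the lower bits, then check whether the result is odd. */
static bool bit_test_arith(uint32_t value, unsigned bit)
{
    return (value / pow2_arith(bit)) % 2 == 1;
}

/* Setting a bit is adding its weight, but only if it is not already set. */
static uint32_t bit_set_arith(uint32_t value, unsigned bit)
{
    return bit_test_arith(value, bit) ? value : value + pow2_arith(bit);
}

/* Clearing a bit is subtracting its weight, but only if it is set. */
static uint32_t bit_clear_arith(uint32_t value, unsigned bit)
{
    return bit_test_arith(value, bit) ? value - pow2_arith(bit) : value;
}
```

It's slow and ugly compared to a real AND/OR, but it shows that nothing more than integer arithmetic is strictly required.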
A C compiler will hide all of this for/from you.
C compilers for architectures like PIC or 8051, with extensive CPU single-bit capabilities, frequently go beyond the C standard to allow the use and definition of single-bit variables.
ARMv7m (Cortex-M3, M4) had an optional feature called "bit banding" that allowed single bits in data or peripheral address space to be addressed (an actual cpu-style address, not some hack.)
They didn't push compiler support for this, and apparently it doesn't get along well with high speed memory, shared buses, or caches, and most of the newer and faster CM3/CM4 processors don't seem to implement it.
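For reference, the bit-band mapping is simple enough to compute by hand. The sketch below assumes the standard Cortex-M3/M4 layout (a 1MB bit-band region at 0x20000000 or 0x40000000, with its 32MB alias at 0x22000000 or 0x42000000); the macro and function names are made up for illustration. It only calculates the alias address, so it runs anywhere.

```c
#include <stdint.h>

/* Standard Cortex-M3/M4 bit-band regions and their alias regions. */
#define SRAM_BB_BASE    UINT32_C(0x20000000)
#define SRAM_BB_ALIAS   UINT32_C(0x22000000)
#define PERIPH_BB_BASE  UINT32_C(0x40000000)
#define PERIPH_BB_ALIAS UINT32_C(0x42000000)

/* Each bit in the 1MB region gets its own 32-bit word in the alias
   region: 32 alias bytes per target byte, 4 alias bytes per bit.
   byte_addr must lie inside the region; bit is 0..7. */
static uint32_t bitband_alias(uint32_t region_base, uint32_t alias_base,
                              uint32_t byte_addr, unsigned bit)
{
    return alias_base + (byte_addr - region_base) * 32u + bit * 4u;
}
```

On a part that actually implements bit banding, writing 1 or 0 to the resulting address through a `volatile uint32_t *` sets or clears just that bit. On a part without it, the access will do something else entirely (probably a fault), so check the datasheet first.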
Freescale (NXP) implemented their own extension of bit banding called the "bit manipulation engine" that considerably expanded bit and bitfield operations; I don't know offhand whether it got much language or library support. (Apparently you can do quite a lot if you throw about 1GB of address space at about 1MB of actual "stuff.") (Now, RISC philosophy says this sort of thing is pretty silly, requiring special compiler support and ending up not significantly faster than constructing a sequence of more primitive operations. But "atomicity" and the need to woo 8-bit customers probably had some effect.)
The DEC PDP-10 had "byte pointers" that could effectively address any bitfield in the memory address space from 1 to 36 bits in length, for load/store purposes. But C compilers didn't like the 36-bit memory size very much. (A 36-bit address to cover 512k 36-bit words. Lovely for assembly language.)
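A loose C emulation of the byte-pointer idea, for anyone who hasn't seen it: the pointer carries a position (number of bits to the right of the field), a size, and a word address, and load/deposit operate on just that field. The struct and function names are hypothetical, 36-bit words are held in the low bits of a `uint64_t`, and indexing, indirection, and pointer incrementing are left out.

```c
#include <stdint.h>

#define WORD_MASK UINT64_C(0xFFFFFFFFF)   /* 36 one-bits */

/* Simplified byte pointer: which word, and which field within it. */
typedef struct {
    unsigned pos;    /* bits to the right of the field, 0..35 */
    unsigned size;   /* field width in bits, 1..36 */
    unsigned addr;   /* index of the word in memory */
} byte_ptr;

static uint64_t field_mask(unsigned size)
{
    return (UINT64_C(1) << size) - 1;
}

/* Like a "load byte": fetch the field, right-justified. */
static uint64_t load_byte(const uint64_t *mem, byte_ptr bp)
{
    return (mem[bp.addr] >> bp.pos) & field_mask(bp.size);
}

/* Like a "deposit byte": replace the field, leave the rest of the word. */
static void deposit_byte(uint64_t *mem, byte_ptr bp, uint64_t value)
{
    uint64_t m = field_mask(bp.size) << bp.pos;
    mem[bp.addr] = ((mem[bp.addr] & ~m) | ((value << bp.pos) & m)) & WORD_MASK;
}
```

With size set to 1, this is a pointer to a single bit anywhere in memory, which is the part relevant to the original question.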