The architecture of AVR or Cortex CPUs is essentially the same as that of CPUs from 50 years ago, and for all of that time people have been wrestling with the problem of bare metal vs OS.
Unfortunately, there is no way to create OS-like functionality without incurring some overhead. The more independence of processing that you want, the more overhead you need. This is an inevitable feature of several processes sharing a single CPU core, memory and peripherals.
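As a rough illustration of that trade-off, here is a back-of-envelope sketch of how much of a core is lost to task switching. The function name and all the figures (tick rates, 200 cycles per context switch) are assumptions picked for illustration, not measurements of any particular RTOS.

```python
def scheduler_overhead(clock_hz, switches_per_sec, cycles_per_switch):
    """Fraction of CPU cycles spent switching between tasks
    rather than running them."""
    return switches_per_sec * cycles_per_switch / clock_hz


if __name__ == "__main__":
    # Hypothetical figures: 16 MHz core, ~200 cycles per context switch.
    for rate in (100, 1_000, 10_000):
        share = scheduler_overhead(16_000_000, rate, 200)
        print(f"{rate:>6} switches/s -> {share:.2%} of the CPU")
```

With these assumed numbers the cost is small at a 100 Hz tick and grows in direct proportion to how often you switch, which is the "more independence, more overhead" effect in its simplest form. It ignores cache effects, interrupt latency and the RAM each task needs.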
Fortunately, the hardware guys keep making faster chips, so we get round the problem that way. The alternative is to have independent cores, e.g. like the Parallax Propeller. But I am pretty sure there is no software trick to get the best of both worlds.
There is a niche at the bottom end for bare metal systems, and I think the basic Arduino will continue to provide an entry-level system for that. It's also clear there is demand for systems with rich environments like the Raspberry Pi. The price/performance of those will continue to improve, and the price will get closer to that of an Arduino, if it is not there already.
What will probably not change is the middle ground between bare metal and a feature-rich OS like Linux. For example, I would like to add TCP/IP, a web server, and USB host with support for a wi-fi dongle to my bare metal project. That's a big chunk of code. There is no obvious off-the-shelf solution, or "go to" RTOS which supports all that out of the box.
The problem is that there are thousands of RTOSes to choose from, but drivers and middleware for them are lacking. I think this is one area where diversity doesn't help, and a single standard would allow people to develop drivers and middleware instead of re-inventing the RTOS.
Perhaps the Japanese had the right idea with ITRON, but a government funded standard seems like anathema to the western way of doing things. ITRON is claimed to be the most widely used OS (by units shipped), but few people have heard of it.
Ironically, an old version of Unix might be a good real time OS, at least for 32 bit devices. It has a simple clean architecture and is well known.
The gap between a 16MHz 8-bit AVR and a 500MHz 32-bit ARM with 4MB RAM (roughly where Linux becomes runnable) is too big to be bridged with a single RTOS. It should be possible to scale from, say, a 50MHz ARM with 32KB upwards on a single RTOS, but chips with only a few KB of RAM will be too resource constrained to run anything more than a very basic RTOS, if at all.
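To put rough numbers on the RAM side of that, here is a small sketch of how many tasks fit once the kernel has taken its share. Everything in it is an assumption for illustration: the function name, the kernel footprint, and the per-task stack and control block sizes are made-up round figures, not taken from any real RTOS. The 2 KB case matches the SRAM of an ATmega328P-based Arduino Uno; the 32 KB case is the small ARM mentioned above.

```python
def max_tasks(ram_bytes, kernel_bytes, stack_bytes, tcb_bytes, app_bytes=0):
    """How many tasks fit in RAM, given a fixed kernel footprint and a
    per-task cost of one stack plus one task control block (TCB)."""
    free = ram_bytes - kernel_bytes - app_bytes
    if free <= 0:
        return 0
    return free // (stack_bytes + tcb_bytes)


if __name__ == "__main__":
    # Hypothetical footprints, chosen only to show the scale of the problem.
    small = max_tasks(2 * 1024, kernel_bytes=512, stack_bytes=256, tcb_bytes=32)
    mid = max_tasks(32 * 1024, kernel_bytes=4096, stack_bytes=1024, tcb_bytes=64)
    print("2 KB part: ", small, "tasks")
    print("32 KB part:", mid, "tasks")
```

Under these assumptions the 2 KB part has room for a handful of tasks with tiny stacks and nothing left over for network buffers, while the 32 KB part has room for a couple of dozen tasks with more realistic stacks. Real figures depend heavily on the RTOS and on how deep each task's call chain goes.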
So I think Arduino will continue as a cheap and simple way to interface to electronics, systems like the BeagleBone will reach the price point of Arduino, and the problem of the middle ground between bare metal and Linux will remain.
Perhaps the answer is dual-function boards like the Arduino Tre and UDOO, with a fast interface between the low-level and high-level CPUs.