Let's see
Firmata implements a protocol. A protocol is an agreed-upon method of exchanging data. If the sender and the receiver agree that <R,D,4> means that the packet starts with < and ends with >, that R means read, that D means digital, and that the value is the pin number to read, then reading the serial data is trivial. I've posted code to read start- and end-of-packet delimited data many times.
Parsing the serial data is trivial, too, since there is a consistent delimiter between tokens. I've posted code many times that illustrates how to use strtok() to parse data.
A simple if statement comparing the first token (the what-to-do command) to R is trivial. If the command is not R, then an else if to deal with S (set) is trivial.
The pin type and pin number need to be parsed, if the command is R, or the pin number and value need to be parsed, if the command is S. Trivial.
Having the pin type defines whether to call analogRead() or digitalRead(), if the command is R. The pin number to read is known. The return type is int, in both cases, so returning a value is the same in both cases.
If the command is S, the pin number (and a knowledge of the board) defines whether to call digitalWrite() or analogWrite(). Removing the knowledge of the board from the equation would be simple. Add another command, A, for analogWrite() and leave S strictly for digitalWrite(). Neither analogWrite() nor digitalWrite() returns a value, so there is nothing to return to the caller.
Actually putting this together in code is trivial.
One could decide that S for set is not a good idea, because S should be for servo. OK. So change S for set to W for write, and make S stand for servo.
So, now we have 4 commands - R (read), W (write digital), A (write analog), and S (servo N to position P). Each is followed by two arguments - pin type and pin number, or pin number and value. As illustrated, implementation is pretty easy.
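As a rough sketch of how the four commands could be dispatched once the tokens are available: runCommand() is a hypothetical name, and I've used a switch where an if/else if chain would do just as well. The functions at the top are stand-ins so the code compiles off the board. On an Arduino you would delete them and use the real digitalRead(), analogRead(), digitalWrite() and analogWrite(), and servoWrite() is a placeholder for whatever you do with the Servo library.

```cpp
#include <cstdlib>

// Stand-ins for the Arduino functions (assumption: PC build for testing).
// They just record what was called; the return values are arbitrary.
struct CallLog { char what; int pin; int value; };
static CallLog lastCall = {'?', -1, -1};

int digitalRead(int pin)              { lastCall = {'d', pin, -1};    return 1; }
int analogRead(int pin)               { lastCall = {'a', pin, -1};    return 512; }
void digitalWrite(int pin, int value) { lastCall = {'W', pin, value}; }
void analogWrite(int pin, int value)  { lastCall = {'A', pin, value}; }
void servoWrite(int pin, int angle)   { lastCall = {'S', pin, angle}; } // placeholder

// Handles one tokenized command. Returns the value read for R,
// or -1 when there is nothing to return to the caller.
int runCommand(const char* cmd, const char* arg1, const char* arg2) {
    switch (cmd[0]) {
    case 'R': {                               // R,<pin type>,<pin number>
        int pin = std::atoi(arg2);
        return (arg1[0] == 'A') ? analogRead(pin) : digitalRead(pin);
    }
    case 'W':                                 // W,<pin number>,<value>
        digitalWrite(std::atoi(arg1), std::atoi(arg2));
        break;
    case 'A':                                 // A,<pin number>,<value>
        analogWrite(std::atoi(arg1), std::atoi(arg2));
        break;
    case 'S':                                 // S,<pin number>,<position>
        servoWrite(std::atoi(arg1), std::atoi(arg2));
        break;
    default:                                  // unknown command: do nothing
        break;
    }
    return -1;
}
```

The value returned for R would be sent back to the PC with Serial.println() or similar. The other three commands have nothing to report.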
What else would you want a PC application to be able to make the Arduino do?
Writing and testing the Arduino code could be done in 30 minutes.
A drink and a nap, and then we'll tackle the PC side. That won't take but about 10 minutes.