Have a read of how One-Wire works: http://en.wikipedia.org/wiki/One_wire
It's moderately timing dependent, e.g.:
To send a "1", the bus master software sends a very brief (1–15 µs) low pulse. To send a "0", the software sends a 60 µs low pulse. The falling (negative) edge of the pulse is used to start a monostable multivibrator in the slave device. The multivibrator in the slave clocks to read the data line about 30 µs after the falling edge. The slave's multivibrator unavoidably has analog tolerances that affect its timing accuracy, which is why the output pulses have to be 60 µs long, and the starting pulse can't be longer than 15 µs.
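To make that concrete, here is a rough model of what the slave decodes for a given low-pulse length. It's only a sketch: the 30 µs sample point comes from the description above, decoded_bit is a made-up name for illustration, and it ignores pull-up rise time and the slave's tolerances.

```cpp
#include <cstdint>

// Nominal sample point from the description above: the slave reads the
// line about 30 us after the falling edge (real parts vary around this).
constexpr std::uint32_t kSlaveSampleUs = 30;

// Hypothetical model, not library code: the bit a slave would decode for a
// low pulse of the given length. If the master is still holding the line
// low at the sample point the slave sees 0, otherwise the pull-up has
// already released the line and it sees 1.
constexpr int decoded_bit(std::uint32_t low_pulse_us) {
    return low_pulse_us >= kSlaveSampleUs ? 0 : 1;
}

// Compile-time sanity checks of the model.
static_assert(decoded_bit(10) == 1, "a 10 us pulse reads as 1");
static_assert(decoded_bit(60) == 0, "a 60 us pulse reads as 0");
```

Under this model a pulse meant as a "1" only reads as a "1" while it stays shorter than the sample point, so anything that stretches it past about 30 µs turns it into a "0".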
Running at 1 MHz when the library is written for 8 or 16 MHz means every delay takes 8 or 16 times longer than intended.
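To put rough numbers on that, here is a minimal sketch of the scaling. It assumes an idealised busy-wait delay that scales exactly with the clock ratio (a real delayMicroseconds has some fixed overhead as well), and actual_delay_us and start_pulse_ok are hypothetical helpers, not anything from the OneWire library.

```cpp
#include <cstdint>

// Hypothetical helper: real duration of a busy-wait delay when code built
// for compiled_hz actually runs on an actual_hz clock. Idealised: assumes
// the delay scales exactly with the clock ratio, with no fixed overhead.
constexpr std::uint32_t actual_delay_us(std::uint32_t nominal_us,
                                        std::uint32_t compiled_hz,
                                        std::uint32_t actual_hz) {
    return static_cast<std::uint32_t>(
        static_cast<std::uint64_t>(nominal_us) * compiled_hz / actual_hz);
}

// Limit taken from the description above: the starting low pulse
// can't be longer than 15 us.
constexpr std::uint32_t kMaxStartPulseUs = 15;

// True if the stretched start pulse is still within the 15 us limit.
constexpr bool start_pulse_ok(std::uint32_t nominal_us,
                              std::uint32_t compiled_hz,
                              std::uint32_t actual_hz) {
    return actual_delay_us(nominal_us, compiled_hz, actual_hz)
           <= kMaxStartPulseUs;
}

// A 3 us delay compiled for 8 MHz but run at 1 MHz lasts 24 us.
static_assert(actual_delay_us(3, 8000000, 1000000) == 24, "8:1 stretch");
static_assert(!start_pulse_ok(3, 8000000, 1000000), "exceeds 15 us");
```

With these assumptions a nominal 3 µs low pulse becomes 24 µs when compiled for 8 MHz and 48 µs when compiled for 16 MHz, both past the 15 µs limit. Compiled and run at the same speed, it stays at 3 µs.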
Looking at the library, it looks like it uses delayMicroseconds. That would probably be OK if the sketch was compiled for the correct clock speed, and you didn't just compile under 8 MHz and run under 1 MHz.
Even if it was, lines like this probably won't work too well:
DIRECT_MODE_INPUT(reg, mask); // let pin float, pull up will raise
r = DIRECT_READ(reg, mask);
Even if delayMicroseconds adjusts to the slower clock speed, the difference between 3 and 9, after dividing by 16, is not going to be much.