As AN4248 makes clear, the reference frame is X north, Y east and Z down, so the reference XY plane is indeed parallel to the Earth's surface. In that reference frame, the Earth's magnetic field is a vector lying in the XZ plane, at an inclination angle delta with respect to the X axis. The calculation of pitch and yaw from the magnetometer readings makes no assumption about the orientation of the magnetometer, but obviously the result cannot be unique: any rotation of the sensor about the field vector leaves the readings unchanged.
This is mostly an intellectual exercise. Since sensors that combine both a magnetometer and an accelerometer are becoming very cheap, it seems silly not to use both (as long as the sensor is not subject to acceleration other than gravity).
As an aside, there are much simpler ways of implementing a tilt-compensated compass than the one described in AN4248. For example, using a bit of vector algebra (taken from an example placed in the public domain by the Pololu engineers):
// Returns a heading (in degrees) given an acceleration vector a due to gravity,
// a magnetic vector m, and a facing vector p. Semantic mods to Pololu code example by J. Remington
int get_heading(const vector *a, const vector *m, const vector *p)
{
    vector W;
    vector N;

    // cross magnetic vector (magnetic north + inclination) with "down" (acceleration vector) to produce "west"
    vector_cross(m, a, &W);
    vector_normalize(&W);

    // cross "down" with "west" to produce "north" (parallel to the ground)
    vector_cross(a, &W, &N);
    vector_normalize(&N);

    // compute heading
    int heading = round(atan2(vector_dot(&W, p), vector_dot(&N, p)) * 180 / M_PI);
    if (heading < 0)
        heading += 360;
    return heading;
}