EDIT: I removed that post because my YouTube link wasn't finished, so I wanted to
wait. Thanks AWOL for your fast answer!
First of all, thanks to everyone for your patience and your help! I'm really happy!
Thanks Nick for the explanation. I write a lot of JavaScript, but I'm quite new to
advanced C programming.
Not sure why the motors should be slower, unless it is simply a matter of the "step, dammit!" instructions taking longer to be generated.
I made a little video about the speed problem with float/long calculations:
Both times I'm going to Point(50;10).
- Part 1: All calculations and stored positions are FLOAT
- Part 2: All calculations and stored positions are LONG (I chose a value that is still OK for long calculations, < 2,147,000,000)
What I observed:
- With all float values the motors need approx. 45 seconds to get to Point(50;10), and
the motors are very loud
- With all long values the motors need 22 seconds (that's twice as fast) and the motors don't get
that loud.
My stepping method looks like this:
void AxisCtrl::motorStep()
{
  unsigned long currentMicros = micros();
  if(currentMicros - _previousCycle > _motorSpeed) {
    if(_stepDone == 0){
      setMotorDir();
      incrementCurrStep();
    }
    _previousCycle = currentMicros;
    if(_motorPinState == 0){
      _motorPinState = 1;
    } else {
      _motorPinState = 0;
    }
    digitalWrite(_motorPin,_motorPinState);
    ++_stepDone;
    if(_stepDone == 2){
      _stepDone = 0;
      if(_currPos > 0){
        _axisHomed = false;
      }
    }
  }
}
void AxisCtrl::incrementCurrStep()
{
  if(_motorDir == true){
    _currPos += _normLenghtPerStep;
  } else {
    _currPos -= _normLenghtPerStep;
  }
}
void AxisCtrl::setMotorDir()
{
  if(_currPos<_newPos){
    digitalWrite(_dirPin, HIGH);
    _motorDir = true;
  }else{
    digitalWrite(_dirPin, LOW);
    _motorDir = false;
  }
}
And the main code:
if(!readyForNextCmd){
  // XY INTERPOLATION
  x2 = x.getCurrPos();
  y2 = linearInterpolate(x.getOldPos(),y.getOldPos(),x.getNewPos(),y.getNewPos(),x2);
  y.setNextPos(y2);
  // IF ALL DONE READY FOR NEXT COMMANDS
  if(abs(x.getCurrPos() - x.getNewPos()) < 0.0005 &&
     abs(y.getCurrPos() - y.getNewPos()) < 0.0005)
  {
    readyForNextCmd = true;
  }
  // DO THE STEPPING
  if(abs(x.getCurrPos() - x.getNewPos()) > 0.0005){
    x.motorStep();
  }
  if(abs(y.getCurrPos() - y.getNextPos()) > 0.0005){
    y.motorStep();
  }
}
float linearInterpolate(float x1, float y1, float x2, float y2, float X)
{
  float _Y = 0;
  if(x1 == x2){
    _Y = y2;
  } else {
    _Y = y1 + ((X-x1)*y2 - (X-x1)*y1)/(x2-x1);
  }
  return _Y;
}
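A side note on the float version: the interpolation multiplies by (X-x1) twice. Below is a minimal sketch of an algebraically equivalent form with one multiplication fewer. The name linearInterpolateFactored is made up for illustration, results can differ from the original in the last float digit, and whether it makes a measurable difference on the actual board is an assumption that would need timing.

```cpp
// Hypothetical variant of linearInterpolate (the name is made up for this sketch).
// Same line equation, with (X - x1) factored out so only one multiplication is needed.
float linearInterpolateFactored(float x1, float y1, float x2, float y2, float X)
{
  // x does not change on this move, so return the target y directly
  // (this also avoids dividing by zero).
  if (x1 == x2) {
    return y2;
  }
  return y1 + (X - x1) * (y2 - y1) / (x2 - x1);
}
```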
I'm using only one direction pin for all 3 motors. I switch the pin every time a
motor has to move, so each motor checks its direction and sets the pin LOW or HIGH
every time.
The sad thing is that I can't use unsigned longs, because the linear interpolation has
to handle negative values as well (when going back), and long longs are no option either,
because they make the system even slower than float.
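A minimal sketch of the sign problem, not taken from the real code: the function names lerpSigned and lerpUnsigned are made up for this example, and it uses fixed-width 32-bit types on the assumption that long is 32 bits on the board (as on AVR-based Arduinos).

```cpp
#include <cstdint>

// Signed 32-bit interpolation: negative differences (moving back) work as expected.
// Assumption: (X - x1) * (y2 - y1) must itself fit in 32 bits, otherwise it overflows.
int32_t lerpSigned(int32_t x1, int32_t y1, int32_t x2, int32_t y2, int32_t X)
{
  if (x1 == x2) return y2;
  return y1 + (X - x1) * (y2 - y1) / (x2 - x1);
}

// Same formula with unsigned 32-bit values: a difference that should be
// negative wraps around to a huge positive number, so the division is wrong.
uint32_t lerpUnsigned(uint32_t x1, uint32_t y1, uint32_t x2, uint32_t y2, uint32_t X)
{
  if (x1 == x2) return y2;
  return y1 + (X - x1) * (y2 - y1) / (x2 - x1);
}
```

Going back from (50;10) to (0;0) with X = 25, lerpSigned returns 5, while lerpUnsigned returns 10 because the wrapped differences no longer divide correctly.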
Any ideas what could be causing the speed difference?