Go Down

Topic: server enter and stays in a strange state (Read 321 times) previous topic - next topic

guy_c

Apr 14, 2019, 08:14 pm Last Edit: Apr 14, 2019, 08:18 pm by guy_c
hello,

my 'loop' calls~ 20 functions among which the server.

sometimes, once a week or once a day, after but with no more client access, it becomes much slower: 14 instead of 7ms.

It's as if, after my access, a mysterious client was knocking on its door and its code was (partly) executed.

I am pretty sure that it happens only after a genuine access.

Furthermore, when in this state, other accesses to the Internet (email, etc) become shaky.

Attaching the server itself and the page it serves (index - changed ext to .txt for the attachment)
,
Any help is most welcome


server

Code: [Select]

void serve() {
  if (!getArg(HAS_ENET)){return;}
  bool authenticated = false, ajaxFound = false, dumpFound = false, cnxOK;
  unsigned int bufPtr;
  const char ajaxPhrase[] = "ajax_inputs";
  const char dumpPhrase[] = "Dump";
  const char credPhrase[] = "Authorization: Basic";
  const char emptyPhrase[] = "\r\n";
  const int buttons = 18; //16 args, siren, test
  char fmEnet;
  int param, value;
  char* strStart;
  const int IS_PARAM_LEN = strlen("ARG");
  const int IS_PARAM_OFFSET = strlen("ajax_inputs&");
  const int PARAM_OFFSET = strlen("ajax_inputs&ARG");
  const int VALUE_OFFSET = strlen("ajax_inputs&ARG00=");
  const int TIME_OUT=10000;// max 10 sec idle cnx. prevent connected client with no data deadlock
  const byte MAX_LINES=20;
  const byte MAX_LINE_LEN=200;//characters
  byte lineLen;
  byte lines;
  bool existCommand = false;
  int idleCnxCtr=0;
  client = server.available();
  clTestStart=micros();
  if (!client) {
    return;
  }//no client cnx
  report(F("server-> new client\r\n"), HI_VERB);
  cnxOK=true;
  bufPtr=0; memset(genBuf, '\0', sizeof(genBuf)); //zero buffer
  lines=0; lineLen=0;
  while (client.connected()&&cnxOK) {
    if (client.available()) {
      idleCnxCtr=0;
      //new byte from line
      fmEnet = client.read(); if (bufPtr < (sizeof(genBuf) - 1)) genBuf[bufPtr++]=fmEnet; reportChar(fmEnet, HI_VERB);
      if (fmEnet=='\n') { //full line assembled
        lineLen=0;
        if(lines++>MAX_LINES) {report(F("server-> too many lines"),HI_VERB); logger(__LINE__, lines&0xff, LOG_DONT_HALT); break;}
        if (strstr(genBuf, credPhrase)) {
          report(F("server-> cred phrase found\r\n"), HI_VERB);
          if (strstr(genBuf, SERVER_ADMIN_CREDS)) {
            report(F("server-> admin ok\r\n"), HI_VERB);
            authenticated = true, adminNotGuest = true;
          }
          else if (strstr(genBuf, SERVER_GUEST_CREDS)) {
            report(F("server-> guest ok\r\n"),HI_VERB);
            authenticated = true, adminNotGuest = false;
          }
          else report(F("server-> creds NOK\r\n"), LO_VERB);
        }
        else if ((strStart = strstr(genBuf, ajaxPhrase))) { //ajax can come w/o a command=>just refresh buttons
          ajaxFound = true;
          if (!strncmp(strStart + IS_PARAM_OFFSET, "arg", IS_PARAM_LEN)) {
            existCommand = true;
            param = atoi(strStart + PARAM_OFFSET); value = *(strStart + VALUE_OFFSET) - '0';
            if ((param<0)||(param>=buttons)||(((value!=0)&&(value!=1)))){
              report(F("server-> got strange param/value: "), MI_VERB); reportInt(param, MI_VERB); sp(F(" / "));
              reportInt(value, MI_VERB); spln();
              existCommand=false;
              logger(__LINE__, (-(param&0xf)<<4)+(value&0xf), LOG_DONT_HALT);
            }
            if(existCommand) report(F("server-> nonempty ajax req; param/value "), HI_VERB); sp(param); sp_; spln(value);
          }
          else {
            report(F("server-> empty ajax request\r\n"),HI_VERB);
            existCommand=false;
          }
        }
        else if (strstr(genBuf, dumpPhrase)) dumpFound = true;
        else if (!strcmp(genBuf, emptyPhrase)) { //----------------request assembled
          report(F("server-> found empty\r\n"), HI_VERB);
          if (authenticated) {
            report(F("server-> authenticated\r\n"), HI_VERB);
            if (ajaxFound) sendAjax(client, param, value, existCommand);
            else if (dumpFound)sendDump(client);
            else sendPage(client);
          }
          else {
            report(F("server-> not auhenticated\r\n"), MI_VERB);
            sendAuthenticationPage(client);
          }
          break;
        }//-------------------------------------------------------end req assembled
        else {
          ; //none of above on this line
        }
        bufPtr = 0; memset(genBuf, '\0', sizeof(genBuf)); //sign the string with zero whatever its length
      }//end line has been assembled
      else{
        if(lineLen++>MAX_LINE_LEN) {
          report(F("server-> too long line"),HI_VERB); logger(__LINE__, lineLen&0xff, LOG_DONT_HALT); break;}
      }
    }//end if client available
    else {// connected but nothing comes from enet
      delay(1);
      if((idleCnxCtr++)>TIME_OUT) cnxOK=false;
    }
  }
  // end client connected())
  if(!cnxOK){report(F("server-> long idle, closing cnx"), MI_VERB); logger(__LINE__, 167, LOG_DONT_HALT);}
  if (!client.connected())report(F("server-> connection closed by client\r\n"), MI_VERB);
  report(F("server-> sending stop cnx\r\n"), HI_VERB);
  client.stop(); // close the connection
  delay(1);//?
  report(F("server-> stop cnx sent\r\n\r\n"), HI_VERB);
  if (existCommand) {
    if (getArg(PERS_HALT)) logger(__LINE__, 0, LOG_PERS_HALT);
    writeArgs();
  }
}


pylon

Post complete code! The posted code should compile on an up-to-date IDE. If you used libraries not included with the IDE post links to the used libraries.

guy_c

Thanks, attached
ping from https://github.com/BlakeFoster/Arduino-Ping

pylon

I cannot find a link to that ICMPPing51.h library. The link you provided includes a ICMPPing.h only and using that library it doesn't compile.

The link to the ds3234.h library is also missing.

guy_c

Sorry. attached my 'libraries' folder that you can put in your sketch folder. I checked that it compiles ok

guy_c


IMPORTANT NEWS

First I was wrong to think that there was a need to access the server in order to bring upon the 'broken' state

I could record the phenomenon while on the monitor. No server access and yet spontaneously the server code started to be  very slow and the server code which is rich in printouts did not print anything:

My loop - please focus only #1 and #20. Every loop, execution time of each function is summed

Code: [Select]

void loop() {
  if(getArg(WDOG)){//reset watchdog once pe loop if enabled
    wdt_reset();
    WDTCSR|=(1<<WDIE); //enable wd interrupt
    WDextender=extendTo32s;//extend wad timeout
  }

  unsigned long int debut, fin, duree, loopStart, loopEnd;
  byte i=0;

  loopStart=micros();

  debut=micros();  
  userInterrupt();
  fin=micros();
  duree=fin-debut;
  loDurations[i]+=duree;//<loDurations[i]?duree:loDurations[i];//0
  //hiDurations[i]=duree>=hiDurations[i]?duree:hiDurations[i];//0

  i++;
  debut=micros();  
  serve();
  fin=micros();
  duree=fin-debut;
  loDurations[i]+=duree;//<loDurations[i]?duree:loDurations[i];//
  //hiDurations[i]=duree>=hiDurations[i]?duree:hiDurations[i];//

  i++;
  debut=micros();  
  synchronizer();
  fin=micros();
  duree=fin-debut;
  loDurations[i]+=duree;//<loDurations[i]?duree:loDurations[i];//2
  //hiDurations[i]=duree>=hiDurations[i]?duree:hiDurations[i];//

  i++;
  debut=micros();  
  anySensorInterrupt();
  fin=micros();
  duree=fin-debut;
  loDurations[i]+=duree;//<loDurations[i]?duree:loDurations[i];//3
  //hiDurations[i]=duree>=hiDurations[i]?duree:hiDurations[i];//

  i++;
  debut=micros();  
  evaluateDensity();
  fin=micros();
  duree=fin-debut;
  loDurations[i]+=duree;//<loDurations[i]?duree:loDurations[i];//4
  //hiDurations[i]=duree>=hiDurations[i]?duree:hiDurations[i];//

  i++;
  debut=micros();  
  exceptionMgt();
  fin=micros();
  duree=fin-debut;
  loDurations[i]+=duree;//<loDurations[i]?duree:loDurations[i];//5
  //hiDurations[i]=duree>=hiDurations[i]?duree:hiDurations[i];//

  i++;
  debut=micros();  
  localDisarm();
  fin=micros();
  duree=fin-debut;
  loDurations[i]+=duree;//<loDurations[i]?duree:loDurations[i];//6
  //hiDurations[i]=duree>=hiDurations[i]?duree:hiDurations[i];//

  i++;
  debut=micros();  
  IPchangeCheck();
  fin=micros();
  duree=fin-debut;
  loDurations[i]+=duree;//<loDurations[i]?duree:loDurations[i];//7
  //hiDurations[i]=duree>=hiDurations[i]?duree:hiDurations[i];//

  i++;
  debut=micros();  
  rainServer();
  fin=micros();
  duree=fin-debut;
  loDurations[i]+=duree;//<loDurations[i]?duree:loDurations[i];//8
  //hiDurations[i]=duree>=hiDurations[i]?duree:hiDurations[i];//

  i++;
  debut=micros();  
  flowProcessor();
  fin=micros();
  duree=fin-debut;
  loDurations[i]+=duree;//<loDurations[i]?duree:loDurations[i];//9
  //hiDurations[i]=duree>=hiDurations[i]?duree:hiDurations[i];//

  i++;
  debut=micros();  
  checkReqTest();
  fin=micros();
  duree=fin-debut;
  loDurations[i]+=duree;//<loDurations[i]?duree:loDurations[i];//10
  //hiDurations[i]=duree>=hiDurations[i]?duree:hiDurations[i];//

  i++;
  debut=micros();  
  localAlarm();
  fin=micros();
  duree=fin-debut;
  loDurations[i]+=duree;//<loDurations[i]?duree:loDurations[i];//11
  //hiDurations[i]=duree>=hiDurations[i]?duree:hiDurations[i];//

  i++;
  debut=micros();  
  manageSiren();
  fin=micros();
  duree=fin-debut;
  loDurations[i]+=duree;//<loDurations[i]?duree:loDurations[i];//12
  //hiDurations[i]=duree>=hiDurations[i]?duree:hiDurations[i];//

  i++;
  debut=micros();  
  flipBoardLed();
  fin=micros();
  duree=fin-debut;
  loDurations[i]+=duree;//<loDurations[i]?duree:loDurations[i];//13
  //hiDurations[i]=duree>=hiDurations[i]?duree:hiDurations[i];//

  i++;
  debut=micros();  
//  tenSecClock();
  fin=micros();
  duree=fin-debut;
  loDurations[i]+=duree;//<loDurations[i]?duree:loDurations[i];//14
  //hiDurations[i]=duree>=hiDurations[i]?duree:hiDurations[i];//

  i++;
  debut=micros();  
  premule();//presence emulation
  fin=micros();
  duree=fin-debut;
  loDurations[i]+=duree;//<loDurations[i]?duree:loDurations[i];//15
  //hiDurations[i]=duree>=hiDurations[i]?duree:hiDurations[i];//

i++;
  debut=micros();  
  camLight();//presence emulation
  fin=micros();
  duree=fin-debut;
  loDurations[i]+=duree;//<loDurations[i]?duree:loDurations[i];//16
  //hiDurations[i]=duree>=hiDurations[i]?duree:hiDurations[i];//
  
  i++;
  debut=micros();  
  entrance();
  fin=micros();
  duree=fin-debut;
  loDurations[i]+=duree;//<loDurations[i]?duree:loDurations[i];//17
  //hiDurations[i]=duree>=hiDurations[i]?duree:hiDurations[i];//

  i++;
  debut=micros();  
  newdayUpdates();
  fin=micros();
  duree=fin-debut;
  loDurations[i]+=duree;//<loDurations[i]?duree:loDurations[i];//18
  //hiDurations[i]=duree>=hiDurations[i]?duree:hiDurations[i];//

  i++;
  debut=micros();  
  emailException();
  fin=micros();
  duree=fin-debut;
  loDurations[i]+=duree;//<loDurations[i]?duree:loDurations[i];//19
  //hiDurations[i]=duree>=hiDurations[i]?duree:hiDurations[i];//

  i++;
  loopEnd=micros();
  duree=loopEnd-loopStart;
  loDurations[i]+=duree;//<loDurations[i]?duree:loDurations[i];//20
  //hiDurations[i]=duree>=hiDurations[i]?duree:hiDurations[i];//


  i++;
  debut=micros();  
  minuteClock();
  fin=micros();
  duree=fin-debut;
  loDurations[i]+=duree;//<loDurations[i]?duree:loDurations[i];//21
  //hiDurations[i]=duree>=hiDurations[i]?duree:hiDurations[i];//
}




Every minute, minuteClock calls showDuration to display the function # and a scaled mean of the execution time.

Here is the monitor display -texto- during the transition from 'normal' operation to 'broken' one - please focus on bold values. As can be seen NOTHING PRINTED BY THE SERVER AND YET IT THE SERVER THAT BECAME SLOW


19:22:00 minute clock-> thursday. sync time 20:3. loops per sec 3006. loCtr 0
0-305 1-6401 2-235 3-223
4-461 5-630 6-516 7-338
8-352 9-246 10-213 11-289
12-215 13-254 14-193 15-481
16-261 17-441 18-232 19-261
20-18476
19:23:00 minute clock-> thursday. sync time 20:3. loops per sec 3006. loCtr 0
0-303 1-6402 2-235 3-223
4-461 5-631 6-515 7-338
8-351 9-246 10-212 11-289
12-215 13-254 14-194 15-481
16-261 17-440 18-231 19-262
20-18476
19:24:00 minute clock-> thursday. sync time 20:3. loops per sec 2599. loCtr 1
0-301 1-9526 2-234 3-224
4-462 5-634 6-514 7-343
8-348 9-247 10-212 11-288
12-216 13-254 14-194 15-481
16-262 17-439 18-230 19-261
20-21598
19:25:00 minute clock-> thursday. sync time 20:3. loops per sec 2055. loCtr 2
0-303 1-15632 2-234 3-223
4-461 5-632 6-515 7-340
8-349 9-245 10-212 11-289
12-215 13-254 14-194 15-481
16-261 17-440 18-230 19-261
20-27703
19:26:00 minute clock-> thursday. sync time 20:3. loops per sec 2055. loCtr 3
0-303 1-15633 2-235 3-223
4-461 5-632 6-516 7-340
8-349 9-245 10-212 11-289
12-214 13-254 14-194 15-480
16-261 17-439 18-231 19-261
20-27703


One could think that it's the statement
Code: [Select]

if (!client)

of the server that became slow but I think I can affirm that it did not

guy_c

I am realizing that I timed 'if(!client)' instead of 'client = server.available()'; which is nonsense. Redoing

guy_c

no more doubts possible: client=server.available is becoming slow and this also cause other Inet accesses to become shaky.

Is this a hardware issue with the enet shield?
Is this due to a software bug?

pylon

The server.available() does almost nothing that might need much more time except if it looses the listen state. My guess points into another direction. You have 4 interrupt handlers, at least one of them (timer 1) is absolutely unnecessary (it does mostly the same as timer 0 just 1000 times slower). Your code is overly complex, badly formatted and contains a huge number of empty else clauses.
Why do you extend the watchdog to 32s? If your loop isn't run for more than 1 second you should reorganize your code.

Keep a few things in mind regarding the Ethernet shield:
- the hardware is able to handle 4 sockets concurrently, that's including sockets in FIN_WAIT state
- the software expects the control all 4 sockets if a server is set up (so while the server is listening no other requests should be made)

In a huge sketch like that you should avoid any call to delay() especially if you have timing requirements within a few ms.

guy_c

Thanks Pylon,

I am pretty sure that there is a hw issue with the enet shield. Bought a new one from Hobbytronic, hopefully this will solve the issue.

If you have specific recommendations regarding the program 'formating' I'll be happy to hear. As for timer one, it generates my local time based on external time such as rtc or ntp (by syncing to external every so often). Empty else close forces me to explicitly check that nothing has to be done, a practice learned in another programming language where else doNothig is must.

Thanks again

pylon

Quote
If you have specific recommendations regarding the program 'formating' I'll be happy to hear.
The IDE has an "Auto Format" option in the "Tools" menu which does a rather good job.

Quote
As for timer one, it generates my local time based on external time such as rtc or ntp (by syncing to external every so often).
I know that, I can read the code. What I wrote is that timer 0 does more or less the same by providing a millis() value. Divide that by 1000 and you get about your secondsLocal value. The rest of the ISR shouldn't be in an interrupt handler anyway, you can calculate these values when you need them.

Quote
I am pretty sure that there is a hw issue with the enet shield.
I have serious doubts about that but I'm curious to hear about how the new hardware runs.

guy_c


Auto format: wasn't aware about it. Will try it when back home in few days

Clock: i had no doubts you could understand the code rather that you won't bother about details.

I dont know how to efficiently derive  local sec  out of millis or did you mean using timer0 interrupt?

I don't know how to create precise local sec, min etc outside  of the isr

Many function use the time vector and to my opinion it's better to compute it once and not every time its needed.


WD extension: communicating functions as email last up to ~10" and so the loop may last as much

Of course, ill post here the outcome of the new hardware. I *always* write the final conclusions once I am confident I have them. Currently, the watchdog kicks in (logger do halt) when the server slows down and it happens up to few times a day, last three days it didn't happen at all while the machine seems to be running flowlessely

I much appreciate that you take the time to review and suggest improvements!

pylon

Quote
I dont know how to efficiently derive  local sec  out of millis or did you mean using timer0 interrupt?
millis() / 1000 gives you seconds. To adjust using the RTC just store the difference.

Code: [Select]
uint32_t secondsToday = RTCt.hour * 3600 + RTCt.minute * 60 + RTCt.sec;
uint32_t localDifference = millis() / 1000 - secondsToday;


For the rest of the day you get the time by

Code: [Select]
uint32_t secondsToday = millis() / 1000 - localDifference;
byte h = secondsToday / 3600 % 24;
byte m = secondsToday / 60 % 60;
byte s = secondsToday % 60;


Quote
Many function use the time vector and to my opinion it's better to compute it once and not every time its needed.
You mean you use it more than once a second?


guy_c

Hi Pylon,

Indeed, I did not think of this way of creating local time elements. Your procedure to derive them should work nearly as mine. The main differences I can see is that every time the elements are needed (~20 (function) X ~2000 (loops per sec) =~40000 times per sec there will be a calls to an 'updateTimeElements' function before actually using the elements, usually for a comparison to the 'time to act' and also for the update of the 'next time to act'



BTW, am back home and got the new shield :)

guy_c

#14
May 04, 2019, 11:29 am Last Edit: May 04, 2019, 11:34 am by guy_c
Got the shield from hobbytronics. This is even not a low quality product but a piece of sh.. Installed ver96 with the 5500 based seeed part but had to change enet and ping libs :( this run slower than ver95 and has a loop per sec of 2550. Set the discriminator at 2400 loops/sec

Go Up