Help parsing text file through wishield

Hi all, I’m very new at this and have been searching the forums for an example that would help me but I can’t seem to find one.

I’ve set up my Arduino and WiShield to hit a text file on my webserver. That part works fine and I can print my result to the serial monitor. What I need to do now is parse the result down to a specific value and place that in a variable that will ultimately light my LEDs. I’m not worried about putting the value into a variable yet; it’s the parsing part that I’m having issues with.

Most, if not all, examples I’ve seen in the forums talk about looking for a pointer when parsing, like a |, and then reading beyond that, however I don’t have that luxury. (I think.)

I’m modifying the SimpleClient example that takes the GET result and puts it into an array. I have no parsing code in here yet. My code:

/*
 * A simple sketch that uses WiServer to get the hourly weather data from LAX and prints
 * it via the Serial API
 */

#include <WiServer.h>

#define WIRELESS_MODE_INFRA      1
#define WIRELESS_MODE_ADHOC      2

// Wireless configuration parameters ----------------------------------------
unsigned char local_ip[] = {192,168,1,200};      // IP address of WiShield
unsigned char gateway_ip[] = {192,168,1,1};      // router or gateway IP address
unsigned char subnet_mask[] = {255,255,255,0};      // subnet mask for the local network
const prog_char ssid[] PROGMEM = {"wifinetwork"};            // max 32 bytes

unsigned char security_type = 3;      // 0 - open; 1 - WEP; 2 - WPA; 3 - WPA2

// WPA/WPA2 passphrase
const prog_char security_passphrase[] PROGMEM = {"password"};      // max 64 characters

// WEP 128-bit keys
// sample HEX keys
prog_uchar wep_keys[] PROGMEM = { 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d,      // Key 0
                          0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,      // Key 1
                          0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,      // Key 2
                          0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00      // Key 3
                        };

// setup the wireless mode
// infrastructure - connect to AP
// adhoc - connect to another WiFi device
unsigned char wireless_mode = WIRELESS_MODE_INFRA;

unsigned char ssid_len;
unsigned char security_passphrase_len;

// End of wireless configuration parameters ----------------------------------------

// Function that prints data from the server
void printPublicData(char* data, int len) {
  
  // Print the data returned by the server
  // Note that the data is not null-terminated, may be broken up into smaller packets, and 
  // includes the HTTP header. 
 while (len-- > 0) {
    Serial.print(*(data++));
  }
}


// IP Address for Public Prod  
uint8 ip[] = {8,8,8,8};

// Base64 encoded USERNAME:PASSWORD
char auth[] = "user:pass";

// A request that gets the latest production server from Remix. (Don't use http:// in the domain)
GETrequest getPublicProd(ip, 80, "url", "/deploy.txt");

void setup() {
    // Initialize WiServer (we'll pass NULL for the page serving function since we don't need to serve web pages) 
  WiServer.init(NULL);
  
  // Enable Serial output and ask WiServer to generate log messages (optional)
  Serial.begin(57600);
  WiServer.enableVerboseMode(true);

 
  //Set the auth string to be sent
  getPublicProd.setAuth(auth);

  // Have the printPublicData function called when data is returned by the server
  getPublicProd.setReturnFunc(printPublicData);
}


// Time (in millis) when the data should be retrieved 
long updateTime = 0;

void loop(){

  // Check if it's time to get an update
  if (millis() >= updateTime) {
    
    getPublicProd.submit(); 

    // Get another update one hour from now
    // (UL forces 32-bit math; 1000 * 60 * 60 overflows a 16-bit int on the Arduino)
    updateTime += 1000UL * 60 * 60;
  }
  
  // Run WiServer
  WiServer.server_task();
 
  delay(10);
}

This is the complete result I get. The value that I want to parse out is plain text, either ra, rb or rc. (Really just a, b or c.) And I can’t modify the text file on the server.

Connected to foo.com
TX 118 bytes
RX 0 bytes from a foo.com
RX 164 bytes from foo.com
HTTP/1.0 200 OK
Connection: close
X-Mashery-Responder: mashery-web2.ATL
Date: Mon, 01 Nov 2010 02:48:51 GMT
Server: Apache/2.2.16 (Unix) mod_ssl/2.2.16 OpenSSL/RX 164 bytes from foo.com
0.9.8e-fips-rhel5
Last-Modified: Thu, 14 Oct 2010 17:08:00 GMT
Etag: "4990b66-20-49296c08c1c00"
Content-Type: text/plain
Accept-Ranges: bytes
Content-Length: 3RX 37 bytes from foo.com
2

rb Thu Oct 14 12:08:00 CDT 2010
Ended connection with foo.com

Can someone help with advice on how I’d parse that text out?

Much appreciated,
selch

Most, if not all, examples I’ve seen in the forums talk about looking for a pointer when parsing, like a |, and then reading beyond that, however I don’t have that luxury. (I think.)

The | is not a pointer. It is a delimiter.

If you are going to write a parser to parse some data, it helps tremendously if you are not trying to parse random data.

So, the first question that needs to be asked is whether you have any control over the process that sends the data.

/*
 * A simple sketch that uses WiServer to get the hourly weather data from LAX and prints
 * it via the Serial API
 */

The output does not appear to match this comment.

The output is fairly well understandable, though. The only difficulty is that two processes are writing to the stream that is received. It will be easier to parse the data if you can shut one of the processes off.

It appears that one of the processes that writes to the stream is the server that you are accessing (that provides the data of interest) and the other is the WiServer library.

Try changing the WiServer.enableVerboseMode(true); statement to WiServer.enableVerboseMode(false);.

The "HTTP/1.0 200 OK" line starts the reply from the server. The "Content-Length: " text precedes a number (as text). That number defines how many bytes of data the server is really returning to you. In this case, there are 32 bytes (the RX output from WiServer breaks the flow of data from the server, which is why the 3 and the 2 appear on separate lines).
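The RX lines in the log mark packet boundaries, and the comment in the sketch notes that the data may be broken up into smaller packets. One way to cope with that, sketched here under assumptions, is to scan each packet as it arrives and remember how much of the blank line ("\r\n\r\n") has been seen so far. BodyScanner and feed() are hypothetical names, not part of WiServer, and the sketch assumes the callback is invoked once per packet with a valid pointer and length.

```cpp
// Hypothetical streaming scanner: call feed() with every packet, in order.
struct BodyScanner {
    int matched = 0;            // characters of "\r\n\r\n" matched so far
    bool inBody = false;        // true once the blank line has been seen
    char body[3] = {0, 0, 0};   // first two body characters, NUL-terminated
    int bodyLen = 0;            // number of body characters stored

    void feed(const char* data, int len) {
        static const char sep[] = "\r\n\r\n";
        for (int i = 0; i < len; ++i) {
            char c = data[i];
            if (inBody) {
                // Past the headers: keep only the first two characters.
                if (bodyLen < 2) body[bodyLen++] = c;
                continue;
            }
            if (c == sep[matched]) {
                if (++matched == 4) inBody = true;
            } else {
                // A stray '\r' can still start a new separator.
                matched = (c == '\r') ? 1 : 0;
            }
        }
    }
};
```

Because the match state lives in the struct rather than in the function, it does not matter whether the blank line falls inside one packet or straddles two.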

The 32 bytes then follow the blank line. They are “rb Thu Oct 14 12:08:00 CDT 2010”. You are interested in the first two.
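As a rough illustration of reading that number, the following sketch pulls the Content-Length value out of a header string. parseContentLength is a hypothetical helper, and it assumes the text passed in is NUL-terminated and contains the whole header line (the callback data is not NUL-terminated, so it would need to be copied into a terminated buffer first).

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns the Content-Length value found in a
// NUL-terminated header string, or -1 if the header is not present.
long parseContentLength(const char* headers) {
    const char* key = "Content-Length: ";
    const char* p = std::strstr(headers, key);
    if (p == nullptr) {
        return -1;                      // header not found
    }
    p += std::strlen(key);              // skip past the header name
    return std::strtol(p, nullptr, 10); // convert the digits that follow
}
```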

So, locating the data of interest is easy enough. You have a pointer to some data. In printPublicData, use the strstr() function to locate the string "Content-Length: " in data (the pointer to the string to search). The strstr function returns a pointer to the first occurrence of "Content-Length: ", or NULL if the string is not found. It does not return an offset.

Point data at the returned location. The data pointer will now point to “Content-Length: 32”, followed by the blank line and “rb Thu Oct 14 12:08:00 CDT 2010”.

Use strstr again, starting from there, to find the first line feed (the string to search for is “\n”). HTTP header lines end in “\r\n”, and the blank line is a second “\r\n”, so the body starts 3 characters past that line feed. Advance the pointer to the line feed plus 3. The data pointer will now point to “rb Thu Oct 14 12:08:00 CDT 2010”.

Then, you can use array notation. data[0] will be ‘r’, and data[1] will be ‘b’.
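The steps above could be sketched roughly as follows. copyPacket and findBody are hypothetical helpers, not WiServer functions. The sketch assumes that Content-Length is the last header (as in the sample output), that lines end in "\r\n", and that the whole response is in the buffer being searched. Because the callback data is not NUL-terminated, it is copied into a terminated buffer before any str* function touches it. Note that strstr() returns a pointer (or NULL), and that the body begins 3 characters past the line feed that ends the Content-Length line.

```cpp
#include <cstring>

// Hypothetical helper: copies at most cap-1 bytes of a packet into buf and
// NUL-terminates it. Assumes cap is at least 1.
void copyPacket(const char* data, int len, char* buf, int cap) {
    int n = (len < cap - 1) ? len : cap - 1;
    if (n < 0) n = 0;
    std::memcpy(buf, data, n);
    buf[n] = '\0';
}

// Returns a pointer to the first body character in a NUL-terminated
// response, or nullptr if the expected layout is not found.
const char* findBody(const char* text) {
    const char* p = std::strstr(text, "Content-Length: ");
    if (p == nullptr) return nullptr;
    p = std::strstr(p, "\n");           // end of the Content-Length line
    if (p == nullptr) return nullptr;
    // Expect the line feed followed by the blank line ("\r\n").
    if (std::strncmp(p, "\n\r\n", 3) != 0) return nullptr;
    return p + 3;                       // first character of the body
}
```

With the sample response, the returned pointer can then be indexed as described: element 0 is 'r' and element 1 is 'b'.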

Thanks, Paul! That information helps tremendously and I think it should be what I need to get started. As for the 'weather' comment in the code, that was left over from the example sketch and I simply haven't changed it.

Your response is much appreciated.

selch

@selch Question, what wifi shield are you using?

I'm running the wishield 2. http://asynclabs.com/wiki/index.php?title=WiShield_2.0

So I've gotten this far. If I print out the first variable, cl, I get the letter 'C' - which seems right. However, the second variable that I grab with strstr() is a garbage character. Please reference my first post above for the text that I'm trying to parse.

Anyone offer some insight to my error(s)? Thanks!

char env;

void printPublicData(char* data, int len) {   // how the headers returned from getData routine are handled
  char *cl;
  char *cr;
  
  if (len > 0) {
    cl = strstr(data, "Content-Length");
//    if( !cl) {
//      // output a debug error here
//    }
    
    cr = strstr(cl, "\n");
//    if( !cr) {
//      // output an debug error here 
//    }

    cr += 3;
    env = *cr;
  }
  
 Serial.print(env);
}

If I print out the first variable, cl, I get the letter 'C'- which seems right.

It's not. cl should print as "Content-Length: 3RX 37 bytes from foo.com 2

rb Thu Oct 14 12:08:00 CDT 2010 Ended connection with foo.com

" (from your original example).

So, how are you printing cl?

Did you print data to validate that it is correct?

Disregard my previous post. I re-wrote it with some help. Only odd thing is that the serial window prints the line twice. I would have used strcpy but couldn't wrap my brain around it and the String functions work fine.

And for what it's worth, I'm lighting up RGB LEDs based on the data returned by the server. I'll comment out the Serial.println lines once I have things working properly.

void printPublicData(char* data, int len) {   // how the headers returned from getData routine are handled
  char *cl;
  char *cr;
  
  if (len != -1) {
    cl = strstr(data, "\r\n\r\n"); //gets me past the headers and to the body

    cl += 4;
    
    String stringOne = cl;
    
    if (stringOne.startsWith("ra",0)) {
      Serial.println("Public Prod is A");
      digitalWrite(A_green, HIGH);
      digitalWrite(A_red, LOW);
      digitalWrite(A_blue, LOW);
    }
    else if (stringOne.startsWith("rb",0)) {
      Serial.println("Public Prod is B");
      digitalWrite(B_green, HIGH);
      digitalWrite(B_red, LOW);
      digitalWrite(B_blue, LOW);
    }
    else if (stringOne.startsWith("rc",0)) {
      Serial.println("Public Prod is C");
      digitalWrite(C_green, HIGH);
      digitalWrite(C_red, LOW);
      digitalWrite(C_blue, LOW);
    }
    
  }
  
}
    cl += 4;

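Before executing this instruction, you should verify that strstr() actually found the string, that is, that cl is not NULL. strstr() returns NULL when "\r\n\r\n" is not in the data (for example, when the packet holds only headers), and advancing a NULL pointer and then reading through it takes you into some other memory space. You should also verify that there are characters after those 4 before reading them. If you don't, you risk reading past the array pointed to by cl.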
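A minimal sketch of that check, assuming the response has already been copied into a NUL-terminated string. environmentFromResponse is a hypothetical helper, and it uses plain character comparisons in place of the String class.

```cpp
#include <cstring>

// Hypothetical helper: returns 'a', 'b' or 'c' when the body starts with
// "ra", "rb" or "rc", and 0 otherwise. text must be NUL-terminated.
char environmentFromResponse(const char* text) {
    const char* sep = std::strstr(text, "\r\n\r\n");
    if (sep == nullptr) return 0;    // separator not found: do not advance
    const char* body = sep + 4;      // safe: the 4 matched characters exist
    if (body[0] != 'r') return 0;    // also covers an empty body ('\0')
    char c = body[1];
    return (c == 'a' || c == 'b' || c == 'c') ? c : 0;
}
```

The caller can then switch on the returned character to decide which LEDs to light, and treat 0 as "nothing usable in this packet".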