Detecting a binary pattern on a stream of data

I have a noise input on my arduino, is a phototransistor and since i cant control the light conditions where my circuit is, I consider the channel noise.

Using a resistor and some easy software filtering I transformed the analog input into digital just by evaluating if sensorValue is below some value the result is considered a 0, if not, a 1. The resulting value is concatenated to an string variable that is cleared if the length is more than 20.

So imagine that you have the variable textInput that grows over time like this:

""
"1"
"10"
"100"
"1001"
"10010"
"100101"
"1001011"

and so... until the length is 20 and then the variable gets cleared. Well how do you search for a pattern?? I mean, I need to stop the program when the pattern "0110110" (or any other given sequence) appears, but i have no idea how to evaluate the stream of data,

What sort of pattern are you looking for?

You are sampling your input at fixed intervals, I guess. If you are looking for a signal generated by person walking by the sensor or something like that, it would be more useful to measure the timing between on->off and off->on transitions.

It is likely to be a useful technique to design a state-machine with transitions driven by your inputs to detect and match an anticipated pattern. There was a recent length thread about finite state machines that you could search for.

Can you please clarify the data type of the textinput variable ?
string or String ?

One method would be to examine each value as it arrives. If the first character received does not match the first character of the pattern that you are looking for, set a flag variable to false and look at the next character received. Once you get a match on the first character set the flag to true and move on to looking for a match on the second one and so on. If at any time there is not a match reset the index to the string to be found and the flag to false and start looking for a match again using the next character received and the first one of the target.

If you reach the end of the target string you have found a match on the whole string.

I don't understand the purpose of this pattern matching, but the approach you're using could be improved. To start with I suggest using a circular buffer rather than a linear buffer that is flushed. This means that you can resume pattern matching based on historical data if a potential match eventually fails. To be completely general, the circular buffer needs to be long enough to hold the longest data sequence that you need to match.

To do the match, you would have a 'read cursor' into the circular buffer which points to the the element you're considering as the start of a potential match. Have a variable which records how many subsequent elements in the buffer match your pattern (i.e. in textual terms it's the 'strlen()' of the matching segment). Each time a new character is available in the buffer, if it matches the pattern you increment the match count - if that reaches the length of your pattern then you have found a match. If your next element doesn't match the pattern then reset the match count to zero, advance the read cursor by one element and start again. In effect it's a finite state machine where the 'state' is the number of characters matching so far.

If you don't want the pattern matching to be able to restart matching without losing historical data then you can get rid of the circular buffer and just have a state machine where the state represents the number of matching bytes so far, and update the state machine each time a new character becomes available.

Thanks to UKHeliBob and PeterH, both of you gave a good idea!