It seems I get dealing with interrupts wrong :(

I have a pump station with a water meter.
It runs on an Arduino UNO; receives its start via MQTT; and reports water flow rate and current volume via MQTT.

This set-up has been working for two year.

A few days ago, I noticed that the volume counter is reset during the pump operation at random intervals... today 16 times, resulting incorrect water volumes being reported.

For simplicity sake I have shortened the code to say:
in setup() -> attach the ISR called pulse_count_isr()
in loop() I call water_flow_measure()

Here the functions:

/*
 * ---------------------------------------------|----------------------------------
 * @brief Add flowmeter pulse via interrupt routine
 * ---------------------------------------------|----------------------------------
 */
void pulse_count_isr ()
{
    g_meter_pulse_counter++;

}   /* void pulse_count_isr () */


/*
 * ---------------------------------------------|----------------------------------
 * @brief Measures water flow and volume
 * ---------------------------------------------|----------------------------------
 */
void water_flow_measure ()
{
    // last time we took a count
    static uint32_t time_flow_read_previously = 1;
    static uint16_t flow_litres_per_minute = 0;

    if ((millis () - time_flow_read_previously) > 1000)
    {
        // process counters once per second

        // disable interrupt while calculating flow rate
        // keep the interrupt detach -> attach period as short as possible
        detachInterrupt (digitalPinToInterrupt(PIN_FLOW_SENSOR));

        g_water_flow_rate = ((1000.0 / (millis () - time_flow_read_previously))
                            * g_meter_pulse_counter) / FLOW_FREQUENCY;

        time_flow_read_previously = millis ();
        flow_litres_per_minute = (g_water_flow_rate / 60) * 1000;
        g_litres_total += flow_litres_per_minute;

        if (0 != int (g_water_flow_rate))
        {
            // only if g_water_flow_rate is not zero
            water_meter_read ();                // ... care about meter and flow updates
        }

        g_meter_pulse_counter = 0;              // reset to start incrementing again

        // attach the interrupt again
        attachInterrupt (digitalPinToInterrupt(PIN_FLOW_SENSOR), pulse_count_isr, RISING);
    }

}   /* void water_flow_measure () */


/*
 * ---------------------------------------------|----------------------------------
 * @brief Outputs flow and volume every 5 seconds
 * function is only called when flow rate is not zero
 * ---------------------------------------------|----------------------------------
 */
void water_meter_read ()
{
    uint32_t time_now = millis ();                      // memorise the time
    static const uint32_t PUB_METER_INTERVAL = 5000;    // 5 s
    static uint32_t time_meter_last_published = 0;      // time stamp when flow was last published
    static uint16_t meter_read_last_value = 0;          // last meter as int

    // publish flow at defined interval
    if (time_now - time_meter_last_published > PUB_METER_INTERVAL)
    {
        time_meter_last_published = time_now;   // update time_meter_last_published here

        g_meter_current = g_litres_total / 1000;// get litres

        if (g_meter_current != meter_read_last_value)
        {
            // only publish meter if it has changed
            static char charBuf[7];             // char for meter

            //dtostrf(floatVar, minStringWidthIncDecimalPoint, numVarsAfterDecimal, charBuffer);
            dtostrf (g_meter_current, 6, 0, charBuf);    // convert meter to char
            // space leading value due to formatting hence the deblank()
            message_publish (MQTT_PUB_WATER_METER, deblank (charBuf));
            charBuf[0] = '\0';                  // reset the char buffer

            dtostrf (int(g_water_flow_rate), 6, 0, charBuf);    // convert float to char
            message_publish (MQTT_PUB_WATER_FLOW, deblank (charBuf));
            charBuf[0] = '\0';                  // reset the char buffer

            meter_read_last_value = g_meter_current;
        }
    }

}   /* void water_meter_read () */

In essence: I detach the interrupt to do house-keeping and sending data yadda yadda;
And yes, I loose a few litres with this approach (which is admittedly sub-optimal), but is has worked to my satisfaction.

Now, here the interesting bit...

According to the code, it should report every 5 seconds; however, the log shows the weirdest of update intervals, ranging from 11 ms to 33 s.

While the code has worked for two years, and it is strange it shows potential weaknesses now, something seems wrong with my approach --- I just don't see it.

Any hints appreciated. Thanks.

A shortened version is good but you need to post a complete program that demonstrates the problem. We need to see how variables are declared. And the problem is usually in the place where you are not looking.

If the pump runs continuously then I would never detach the ISR. Also I would use a volatile unsigned long variable for the pulse count and I would never reset it.

When reading a multibyte variable that is updated by the ISR you need to pause interrupts while you read the value - in case it gets updated in the middle of the read. Like this

noInterrupts();
  copyOfPulseCount = pulseCount;
interrupts();

and then use the copy in your calculations
It would also be a good idea to have a single byte variable updated by the ISR to signal when a new pulse has been detected so your code in loop() only copies the value when it changes - like this

if (newPulse == true) {
   noInterrupts();
      copyOfPulseCount = pulseCount;
      newPulse = false;
   interrupts();
}

...R

Thanks! :slight_smile:

There is certainly truth in “Two or three hours spent thinking and reading documentation solves most programming problems.” I have been looking at this and reading since I posted, I came to a similar conclusion to what you wrote.

Firstly, I posted the abbreviated version as the code verbatim is 1,200 lines, but I should have posted the constants/variable definitions.

To illustrate that most of the functions do not impact the topic, see the prototypes:

/* --------------- Prototypes ----------------------------------------------- */
// common static code modules
void mac_address_build (uint8_t pin);
void ethernet_connection_maintain ();
void bicolour_led_set (uint8_t ledState);
char *deblank (char *str);

void mac_address_publish ();
void message_publish (const char *topic, const char *message);
void error_publish (uint8_t errorType, char *errorMessage);

void temperature_publish (float temp);
float temperature_get (uint8_t pin);
float temperature_validate (float temperature_sensed, uint8_t pin);
void temperature_collect ();

// common code but modify as required
bool b_connected_to_broker ();
void mqtt_call_back (char *topic, byte *payload, uint16_t length);

// controller specific functions
void pulse_count_isr();
void water_flow_measure();
void water_meter_read ();
void soil_permittivity_measure();

Yes,I have:

volatile uint32_t g_meter_pulse_counter = 0;

The key is to not detach the ISR; agreed.

I thought to leave it counting and only copy the g_meter_pulse_counter value every x interval when I want to publish the values (flow and volume); and after I read it, reset it.

I do not understand as to why use a flag.
It would mean the loop copies the value from the ISR a lot, when I only want the value say every 5 seconds (or even longer).

I think letting the ISR run its course until I want its value to publish the calculated flow/volume should be the most efficient approach.

[update 1] I think I get the flag: there is not need to run the calculation loop for 23 hours for the day (the pump runs an hour per day)… only when pulse come in. Smart :slight_smile:

For the benefit of review and others, here the revised functions:

Abbreviated loop()

void loop () {
        if (true == g_b_pulse_detected)
        {
            water_flow_volume_calculate ();     // get pulses, calculate flow and volume
        }
}

And the others:

/*
 * ---------------------------------------------|----------------------------------
 * @brief Add flowmeter pulse via interrupt routine
 * ---------------------------------------------|----------------------------------
 */
void water_pulse_counter_isr ()
{
    // Every time this function is called, increment "count" by 1
    g_water_pulse_counter++;
    g_b_pulse_detected = true;

}   /* void water_pulse_counter_isr () */


/*
 * ---------------------------------------------|----------------------------------
 * @brief Calculates water flow and volume.
 * Briefly stop/starts interrupts to copy pulse_couter value
 * ---------------------------------------------|----------------------------------
 */
void water_flow_volume_calculate ()
{
    const static uint32_t PULSE_COUNTER_COLLECTION_INTERVAL = 5000; // 5 sec
    static uint32_t last_time_pulse_collected = 0;
    uint32_t time_now = millis ();

    // only if we exceed the PULSE_COUNTER_COLLECTION_INTERVAL...
    if ((time_now - last_time_pulse_collected) > PULSE_COUNTER_COLLECTION_INTERVAL)
    {
        static uint16_t flow_litres_per_minute = 0;
        static uint32_t water_pulse_counter_copy = 0;

        // disable interrupt while copying the pulse_counter value
        noInterrupts ();
        water_pulse_counter_copy = g_water_pulse_counter;
        g_water_pulse_counter = 0;
        g_b_pulse_detected = false;
        interrupts ();

        /*
            Calculate flow rate (fr)

                      1000          count
            fr = ---------------- * -----
                 t(now) - t(last)   freq
        */

        g_water_flow_rate = ((1000.0 / (millis () - last_time_pulse_collected))
                            * water_pulse_counter_copy) / FLOW_FREQUENCY;

        flow_litres_per_minute = (g_water_flow_rate / 60) * 1000;

        g_litres_total += flow_litres_per_minute;

        last_time_pulse_collected = time_now;

        if (0 != (g_water_flow_rate))
        {
            water_meter_publish ();
        }
    }

}   /* void water_flow_volume_calculate () */


/*
 * ---------------------------------------------|----------------------------------
 * @brief Outputs flow and volume every 5 seconds
 * function is only called when flow rate is not zero
 * ---------------------------------------------|----------------------------------
 */
void water_meter_publish ()
{
    g_meter_current = g_litres_total / 1000;    // get litres

    static char charBuf[7];                     // char for values to publish

    //dtostrf(floatVar, minStringWidthIncDecimalPoint, numVarsAfterDecimal, charBuffer);
    dtostrf (g_meter_current, 6, 0, charBuf);   // convert meter to char
    // space leading value due to formatting hence the deblank()
    message_publish (MQTT_PUB_WATER_METER, deblank (charBuf));
    charBuf[0] = '\0';                          // reset the char buffer

    // convert float to char
    dtostrf (int (g_water_flow_rate), 6, 0, charBuf);
    message_publish (MQTT_PUB_WATER_FLOW, deblank (charBuf));
    charBuf[0] = '\0';

}   /* void water_meter_publish () */

These functions are now simpler...

In essence: as per Robin2's suggestions, the loop keeps looping
when the pulse_counter receives a change, the g_b_pulse_detected is set to true
the loop than executes water_flow_volume_calculate()
it stops the ISR, copies and resets its value, resets g_b_pulse_detected to false,
calculates flow and volume, and then publishes it.

Not tested yet; will do so tomorrow... but it should work :slight_smile:

MaxG:
Firstly, I posted the abbreviated version as the code verbatim is 1,200 lines,

It's always a good idea to get the bugs out before the program gets to 1200 lines long.

Alternatively, if the bug started when you added the last 5 lines of code then you know where to look for the problem.

Test early and often.

...R

Robin2:
It’s always a good idea to get the bugs out before the program gets to 1200 lines long.

Alternatively, if the bug started when you added the last 5 lines of code then you know where to look for the problem.

Test early and often.

…R

Agree on all three points… :slight_smile:

Given the function names though, only three functions are controller specific.

The key point:

The program still resets the meter to zero in the middle of a valve being open.

Valve 1 starts at 10:00:00; all is good, as in the interval to publish is close to 5 seconds, until 10:57:16.73
Then the interval varies from zero to 15 seconds… to reach 31 s at 11:01:28.19, which is the point it changes the meter from 100 to 0 litres. (see log below)

09/17/20 10:56:46.73 'WaterMeter', (3 05.005
09/17/20 10:56:51.73 'WaterMeter', (3 04.998
09/17/20 10:56:56.73 'WaterMeter', (3 05.004
09/17/20 10:57:01.74 'WaterMeter', (3 05.001
09/17/20 10:57:06.73 'WaterMeter', (3 04.996
09/17/20 10:57:12.67 'WaterMeter', (3 05.936
09/17/20 10:57:16.73 'WaterMeter', (3 04.066
09/17/20 10:57:26.96 'WaterMeter', (3 10.222
09/17/20 10:57:26.99 'WaterMeter', (3 00.030
09/17/20 10:57:41.94 'WaterMeter', (3 14.956
09/17/20 10:57:41.97 'WaterMeter', (3 00.030
09/17/20 10:57:41.99 'WaterMeter', (3 00.021
09/17/20 10:57:46.74 'WaterMeter', (3 04.751
09/17/20 10:57:51.74 'WaterMeter', (3 04.996
09/17/20 10:57:57.35 'WaterMeter', (3 05.611
09/17/20 10:58:02.35 'WaterMeter', (3 04.997
09/17/20 10:58:06.74 'WaterMeter', (3 04.394
09/17/20 10:58:11.74 'WaterMeter', (3 05.001
09/17/20 10:58:16.74 'WaterMeter', (3 05.001
09/17/20 10:58:36.97 'WaterMeter', (3 20.223
09/17/20 10:58:36.99 'WaterMeter', (3 00.025
09/17/20 10:58:37.01 'WaterMeter', (3 00.022
09/17/20 10:58:37.04 'WaterMeter', (3 00.022
09/17/20 10:58:43.17 'WaterMeter', (3 06.131
09/17/20 10:58:46.75 'WaterMeter', (3 03.582
09/17/20 10:58:56.96 'WaterMeter', (3 10.210
09/17/20 10:58:57.02 'WaterMeter', (3 00.059
09/17/20 10:59:06.95 'WaterMeter', (3 09.937
09/17/20 10:59:06.98 'WaterMeter', (3 00.024
09/17/20 10:59:11.76 'WaterMeter', (3 04.778
09/17/20 10:59:21.96 'WaterMeter', (3 10.202
09/17/20 10:59:21.99 'WaterMeter', (3 00.027
09/17/20 10:59:26.77 'WaterMeter', (3 04.781
09/17/20 10:59:31.76 'WaterMeter', (3 04.993
09/17/20 10:59:41.96 'WaterMeter', (3 10.198
09/17/20 10:59:41.98 'WaterMeter', (3 00.023
09/17/20 10:59:47.37 'WaterMeter', (3 05.386
09/17/20 10:59:56.96 'WaterMeter', (3 09.593
09/17/20 10:59:56.98 'WaterMeter', (3 00.023
09/17/20 11:00:01.97 'WaterMeter', (3 04.985
09/17/20 11:00:06.77 'WaterMeter', (3 04.799
09/17/20 11:00:16.97 'WaterMeter', (3 10.201
09/17/20 11:00:17.00 'WaterMeter', (3 00.028
09/17/20 11:00:32.20 'WaterMeter', (3 15.205
09/17/20 11:00:32.26 'WaterMeter', (3 00.063
09/17/20 11:00:32.31 'WaterMeter', (3 00.042
09/17/20 11:00:36.97 'WaterMeter', (3 04.665
09/17/20 11:00:44.80 'WaterMeter', (3 07.830
09/17/20 11:00:46.77 'WaterMeter', (3 01.968
09/17/20 11:00:56.97 'WaterMeter', (3 10.206
09/17/20 11:00:57.00 'WaterMeter', (3 00.022
09/17/20 11:01:28.19 'WaterMeter', (3 31.198
09/17/20 11:01:45.92 'WaterMeter', (1 17.729
09/17/20 11:01:45.97 'WaterMeter', (1 00.045
09/17/20 11:01:46.01 'WaterMeter', (1 00.042
09/17/20 11:01:48.19 'WaterMeter', (1 02.175
09/17/20 11:01:59.45 'WaterMeter', (1 11.266
09/17/20 11:01:59.48 'WaterMeter', (1 00.025
09/17/20 11:02:06.22 'WaterMeter', (1 06.746
09/17/20 11:02:08.19 'WaterMeter', (1 01.968
09/17/20 11:02:13.19 'WaterMeter', (1 05.002
09/17/20 11:02:18.19 'WaterMeter', (1 04.995
09/17/20 11:02:23.19 'WaterMeter', (1 05.001
09/17/20 11:02:28.80 'WaterMeter', (1 05.612
09/17/20 11:02:40.93 'WaterMeter', (1 12.130
09/17/20 11:02:40.96 'WaterMeter', (1 00.027
09/17/20 11:02:43.20 'WaterMeter', (1 02.238
09/17/20 11:02:48.19 'WaterMeter', (1 04.997
09/17/20 11:02:53.20 'WaterMeter', (1 05.006
09/17/20 11:02:58.20 'WaterMeter', (1 05.001
09/17/20 11:03:03.20 'WaterMeter', (1 05.001
09/17/20 11:03:08.20 'WaterMeter', (1 04.996
09/17/20 11:03:13.20 'WaterMeter', (1 05.005

I have attached the program files for those interested. These are straight out of VSCode and PlatformIO.

config.h (2.71 KB)

main.cpp (40.8 KB)

Sorry. I don't have the patience to study 40k of code. Apologies for being lazy.

Can you illustrate the problem in a much shorter demonstration program?

Did you have an earlier version that worked properly? If so, what changed since then.

It is convenient to assume the problem is in a particular group of functions but in my experience it is unwise to limit your investigations.

...R

digitalWrite (PINS_UNUSED[i], INPUT_PULLUP);    // set pin INPUT_PULLUP

-- ? --

Ideally, you will configure the unconnected pins as inputs with the internal pull-up resistor enabled.

Ideally you should use the pinMode() function to set the mode of a pin

Ahh...

        pinMode (PINS_UNUSED[i], INPUT_PULLUP); // set unused pins to INPUT

... like so... (just read the reference again. Thanks for the hint (I clearly didn't get it).

I want to reliably read a water flow meter.

I have a volatile counter and flag, an interrupt service routine, and a function to count flow and flow rate.

I read up on interrrupts, and still ponder over the question, whether to use noInterrupts() as suggested here, or detachInterrupt / attachInterrupt as suggested here under “How are interrupts queued?”.

noInterrupts disables most interrupts, while
detachInterrupt only disables the specific interrupt.

My programs does other things on non-blocking code, such as check if connectivity exists every 10 minutes) or measures temperature every 5 minutes. I do not want to stop these, and do not see how these would interfere the water flow meter pulse counting.

Here the core version.

#include <Arduino.h>

#define PIN_FLOW_SENSOR 2

const float FLOW_FREQUENCY = 4.8;
volatile uint16_t g_water_pulse_counter = 0;
volatile bool g_b_pulse_detected = false;
uint16_t g_meter_current = 0;
uint32_t g_litres_total = 0;
double g_water_flow_rate = 0;

void meter_pulse_counter_isr();
void meter_values_calculate();
void meter_values_publish ();


void meter_pulse_counter_isr ()
{
    g_water_pulse_counter++;
    g_b_pulse_detected = true;
}


void meter_values_calculate ()
{
    static uint32_t last_time_counter_collected = 0;
    uint32_t time_now = millis ();

    if ((time_now - last_time_counter_collected) > 1000)
    {
        last_time_counter_collected = time_now;

        static uint16_t water_pulse_counter_copy = 0;

        // noInterrupts ();
        detachInterrupt (digitalPinToInterrupt (PIN_FLOW_SENSOR));
        water_pulse_counter_copy = g_water_pulse_counter;
        g_water_pulse_counter = 0;
        g_b_pulse_detected = false;
        // interrupts ();
        attachInterrupt (digitalPinToInterrupt (PIN_FLOW_SENSOR), meter_pulse_counter_isr, RISING);

        g_water_flow_rate = ((1000.0 / (millis () - last_time_counter_collected))
                            * water_pulse_counter_copy) / FLOW_FREQUENCY;

        g_litres_total += (int) (g_water_flow_rate / 60) * 1000;

        meter_values_publish ();
    }
}


void meter_values_publish ()
{
    static const uint32_t PUB_METER_INTERVAL = 5000;    // 5 s
    uint32_t time_now = millis();               // memorise the time
    static uint32_t meter_last_published = 0;    // time stamp when flow was last published
    static uint16_t last_flow_volume = 0;       // last meter as int
    static char buffer[7];                      // char for values to publish

    if (time_now - meter_last_published > PUB_METER_INTERVAL)
    {
        meter_last_published = time_now;
        g_meter_current = g_litres_total / 1000;

        if (g_meter_current != last_flow_volume)
        {
            dtostrf (g_meter_current, 6, 0, buffer);
            //message_publish (MQTT_PUB_WATER_METER, deblank (buffer));
            buffer[0] = '\0';                   // reset the char buffer

            dtostrf (int (g_water_flow_rate), 6, 0, buffer);
            //message_publish (MQTT_PUB_WATER_FLOW, deblank (buffer));
            buffer[0] = '\0';

            last_flow_volume = g_meter_current;
        }
    }
}


void setup ()
{
    attachInterrupt (digitalPinToInterrupt(PIN_FLOW_SENSOR), meter_pulse_counter_isr, RISING);
}


void loop ()
{
    if (true == g_b_pulse_detected)
    {
        meter_values_calculate ();
    }
}

Is the position of the detach/attach statements correct?

Don't use attached / detach. Too much messing around.

Just use noInterrupts(), copy the volatile variable to a local one, then interrupts(). That's very fast. Even if another interrupting event happens while they are disabled, it will be queued and serviced as soon as interrupts are re-enabled.

This has already been covered in Reply #1 in your other Thread about the same project.

I am suggesting to the Moderator to merge your Threads so we have all the project info in one place.

...R

Generally, people use noInterrupts. Detach is for when you want to stop servicing a particular interrupt for a while.

Note that noInterrupts does indeed turn them all off, but any that occur will be queued and serviced when you turn them back on. Caveat: only one is queued for each interrupt.

Robin2:
This has already been covered in Reply #1 in your other Thread about the same project.

I am suggesting to the Moderator to merge your Threads so we have all the project info in one place.

...R

I disagree... this is separate.

In particular WRT your reply that you want to see code, and when you see it, it is too long and should be shortened.

This post is about the correct use of either method of interrupts for this use case.

I respect Nick's opinion, and he suggests attach/detach... while you suggest noInterrupts/interrupts.
While both may work I am after the proper use, not shortcuts or substitutes.

wildbill:
Generally, people use noInterrupts. Detach is for when you want to stop servicing a particular interrupt for a while.

Note that noInterrupts does indeed turn them all off, but any that occur will be queued and serviced when you turn them back on.

Caveat: only one is queued for each interrupt.

The latter seems a potential culprit… in what is referred to as my other post.

I have non-blocking functions doing certain things after specific yet different intervals which I suspect eventually mess up variable space or corrupt these.

@MaxG

TOPIC MERGED.

Could you take a few moments to Learn How To Use The Forum.
Other general help and troubleshooting advice can be found here.
It will help you get the best out of the forum.

MaxG:
The latter seems a potential culprit... in what is referred to as my other post.

I have non-blocking functions doing certain things after specific yet different intervals which I suspect eventually mess up variable space or corrupt these.

Highly unlikely. If you use the noInterrupts() / Copy / interrupts() technique, interrupts will only be off for a handful of microseconds.

Is it then possible that the non-blocking function coincide with the ISR?
... considering these are set to occur at round seconds, e.g. 5 10, 15.
I assume they could occur all at once, say at 30 seconds the 5 10 and 15 occur, plus the ISR every second, also occurring at 30 seconds.

Would it be better to skew the intervals by adding a few milliseconds, like so:
a_interval = 5007
b_interval = 60011
c_interval = 300071

It will then be highly unlikely to experience more than one interval at any given time, other than the ISR.

BTW: all this runs on an Arduino UNO.

As for the questions: What was changed?
Nothing! The thing ran for almost three years without a problem; two weeks ago it started randomly resetting the counter to 0 mid valve open.
I updated the code yesterday, making no difference.
I have swapped it out for a new one; no change in outcome.
What I also observed that it rebooted itself, sometimes two time in a row. That never happened before (I keep track of reboots).