Subsetting loops over ranges in languages like Go

OK, so this is not specifically an Arduino question, other than the fact that arduino-cli is written in Go.

In C, a for loop can modify the variables used by the loop, like:

  char *strings[] = {"test", "foo", "bar", "baz", "glork", "Another test", "overflow", "packets", "datagrams" };
  char buffer[20];
  for (int i=0; i < (sizeof strings)/sizeof(*strings);) {
    int used=0;
    buffer[0] = 0;
    while (sizeof(buffer) > used+strlen(strings[i])+1 ) {
      used += strlen(strings[i]);
      strcat(buffer,strings[i++]);
      if (i >= (sizeof strings)/sizeof(*strings)) {
        break;  /* Out of strings; the printf after the loop prints the last buffer once. */
      }
    }
    printf("%s\n", buffer);
  }

This will happily assemble the individual strings into longer strings that still fit in the buffer. Essentially, the body of the for loop advances the index by a variable amount.

Can I do something similar with the languages that automatically iterate over groups of objects?
Like Python's for s in strings, Go language's for _, objectFile := range objectFilesToArchive, or I guess even C++'s for (char*s: strings)?

I'm trying to think of easy patches to make Arduino-builder process more than one file at a time, for commands that allow that (like "ar")

Python and Go sort of don't include the concept of "buffer full" with strings.

Is one of the goals to limit the maximum length of the result regardless of language?

Yes. There is apparently a limit on the size of a command line when you "exec" a process.
It's pretty big on modern systems (~8k?), but it's not so hard to hit when you start specifying a lot of fully-qualified path names for files. (For example, people had trouble building microPython from source for rp2040, because of the large number of "-Ipath" options.)
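
To make the constraint concrete, here is a minimal Python sketch of batching file paths so that each batch, joined with single spaces, stays within a length budget. The function name batch_args, the file names, and the budget of 20 are all made up for illustration. A real limit depends on the operating system, is counted in bytes, and on POSIX systems also covers the environment, so any real budget would need to be conservative.

```python
def batch_args(paths, max_len):
    """Yield lists of paths whose space-joined length stays within max_len.

    max_len is a hypothetical budget for the argument portion of one
    command line; it is not a value taken from any real tool.
    """
    batch, used = [], 0
    for path in paths:
        # Every argument after the first also costs one separator.
        cost = len(path) + (1 if batch else 0)
        if batch and used + cost > max_len:
            yield batch
            batch, used = [], 0
            cost = len(path)
        # A path longer than max_len still gets a batch of its own.
        batch.append(path)
        used += cost
    if batch:
        yield batch


# Hypothetical object files and archive name, for illustration only.
object_files = ["core/a.o", "core/bb.o", "core/ccc.o", "core/d.o"]
for batch in batch_args(object_files, 20):
    print(["ar", "rcs", "core.a", *batch])
```

Unlike plain concatenation, this keeps the individual paths intact and counts the separators between them, which is what matters when the pieces become separate command-line arguments.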

Are you looking for something like this?

def cat(strings, max_size):
    buffer = b''
    for string in strings:
        if len(buffer) + len(string) > max_size:
            return buffer
        buffer += string
    return buffer

Usage:

>>> print(cat([b'this', b'is', b'a', b'test'], 8))
b'thisisa'

Well, if I could stick in an array of strings and get back an array of longer strings ("each not to exceed maxsize"), that would work. Sort of like automatic fill for text, I guess, except that would typically not start with separate strings for each word... Hmm.

Something like this then?

def cat(buffer, strings, max_size):
    for string in strings:
        if len(buffer) + len(string) > max_size:
            return buffer
        buffer += string
    return buffer

Usage:

>>> print(cat(b'oi', [b'this', b'is', b'a', b'test'], 8))
b'oithisis'

Alternatively, you could call the original function as follows:

>>> print(cat([b'oi'] + [b'this', b'is', b'a', b'test'], 8))
b'oithisis'

Actually, a function returning a collection of strings, each a concatenation of consecutive values from the original input, would probably do nicely for this particular problem.

myfunc('a'..'m') --> ('abcde', 'fghij', 'klm')  // (not the syntax of any real language, AFAIK.)

But I'm also interested in the more general question: advancing a loop "iterator" from within the body of the loop...

Like this?:

https://go.dev/play/p/h8-jdK2aeA-

package main

import "fmt"

func main() {
	for i := 0; i < 10; {
		fmt.Println(i)
		i++
	}
}

Output:

0
1
2
3
4
5
6
7
8
9

What about this?

def chunk(strings, max_size):
    string = ''.join(strings)
    return [string[i:i + max_size] for i in range(0, len(string), max_size)]

Usage:

>>> print(chunk(['a', 'bc', 'de', 'f', 'gh', 'i', 'jklm'], 4))
['abcd', 'efgh', 'ijkl', 'm']

In Python this can be done with a while loop.

letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']

i = 0
while i < len(letters):
    print(letters[i])
    if i % 2:
        i += 2
    i += 1

Output:

a
b
e
f
i
j

For things like that you are far better off handling the error instead of trying to limit the length.
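
A rough Python sketch of what "handling the error" could look like: try the whole argument list first, and only if the OS rejects it as too long (E2BIG on POSIX systems) split the list in half and retry. The helper name run_in_batches and the fake runner below are hypothetical stand-ins, and other platforms may report the failure differently.

```python
import errno


def run_in_batches(run, args):
    """Call run(args); if the OS rejects the argument list as too long,
    split it in half and retry each half (recursively)."""
    try:
        run(args)
    except OSError as exc:
        # Re-raise anything that is not "Argument list too long", and give
        # up when a single argument is already too big to pass.
        if exc.errno != errno.E2BIG or len(args) <= 1:
            raise
        mid = len(args) // 2
        run_in_batches(run, args[:mid])
        run_in_batches(run, args[mid:])


# Stand-in for the real command: pretends that more than 3 arguments is
# "too long". A real caller might pass something like
#   lambda batch: subprocess.run(["ar", "rcs", "core.a", *batch], check=True)
# where "core.a" is a made-up archive name.
calls = []


def fake_run(batch):
    if len(batch) > 3:
        raise OSError(errno.E2BIG, "Argument list too long")
    calls.append(batch)


run_in_batches(fake_run, list("abcdefgh"))
print(calls)  # [['a', 'b'], ['c', 'd'], ['e', 'f'], ['g', 'h']]
```

The appeal of this approach is that no limit has to be guessed in advance; the cost is a few failed attempts when the list really is too long.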

No, because that's not using a "range" or "iterator."
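
For comparison, a minimal Python sketch of advancing a real iterator from inside the loop body. It relies on the fact that a for loop simply calls next() on the iterator it is given, so consuming items from that same iterator object inside the body makes the loop skip them. The variable names are illustrative only. As far as I know, Go's "for ... range" over a slice offers no comparable handle, since assigning to the index variable inside the body does not affect the iteration.

```python
letters = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]

it = iter(letters)  # One iterator object, shared by the loop and its body.
seen = []
# enumerate() pulls lazily from `it`, so `index` counts the items the loop
# body has actually received, not positions in the original list.
for index, letter in enumerate(it):
    seen.append(letter)
    if index % 2:
        # Consume the next two items so the for loop never sees them.
        # The default of None avoids StopIteration at the end of the input.
        next(it, None)
        next(it, None)

print(seen)  # ['a', 'b', 'e', 'f', 'i', 'j']
```

This only works with an explicit iterator: writing "for letter in letters" creates a fresh, hidden iterator that the body cannot reach.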

A C++23 solution, for fun: https://godbolt.org/z/doGvozasT

#include <fmt/ranges.h>
#include <ranges>
#include <span>
#include <string_view>

auto chunk(std::span<const std::string_view> strings, size_t bufsize) {
    auto pred = [bufsize, size{0uz}](const auto &a, const auto &b) mutable {
        size += a.size();  // Keep track of the size of the current chunk.
        if (size + b.size() > bufsize) {  // If adding b would exceed the buffer
            size = 0;                     // capacity, reset the size and
            return false;                 // split the array between a and b.
        }
        return true;  // Don't split otherwise.
    };
    return std::views::chunk_by(std::move(strings), pred);
}

constexpr std::string_view strings[]{"test", "foo", "bar", "baz", "glork", "Another test", "overflow", "packets", "datagrams"};
constexpr size_t bufsize = 19;  // All sizes exclude null terminator

int main() {
    for (auto &&chunk : chunk(strings, bufsize))
        fmt::println("{}", chunk);
}

Output:

["test", "foo", "bar", "baz", "glork"]
["Another test"]
["overflow", "packets"]
["datagrams"]

It has the advantage that all chunks and their elements are lazily evaluated, so no copies are made, and no dynamic allocation takes place.


A similar idea in Python:

from itertools import groupby

class Chunker:
    def __init__(self, max_size):
        self.max_size = max_size
        self._size = 0
        self._count = 0

    def __call__(self, x):
        this_size = len(x)
        self._size += this_size
        if self._size > self.max_size:  # Start a new chunk only once the limit is exceeded.
            self._count += 1
            self._size = this_size
        return self._count

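# Same input as the C and C++ examples above.
strings = ["test", "foo", "bar", "baz", "glork", "Another test", "overflow", "packets", "datagrams"]

for _, chunk in groupby(strings, Chunker(19)):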
    print(list(chunk))

Output:

['test', 'foo', 'bar', 'baz', 'glork']
['Another test']
['overflow', 'packets']
['datagrams']

You could also use a generator function:

def chunk_by_size(strings: list[str], max_size: int):
    buffer = ""
    for s in strings:
        if len(buffer) + len(s) <= max_size:
            buffer += s
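        else:
            if buffer:  # Skip the empty chunk when the very first string is oversized.
                yield buffer
            buffer = s
    if buffer:  # Nothing to flush for empty input.
        yield buffer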


for chunk in chunk_by_size(strings, 19):
    print(chunk)

Output:

testfoobarbazglork
Another test
overflowpackets
datagrams

In C++, range-based for loops are simply syntactic sugar for standard for loops with explicit iterators, see "Range-based for loop (since C++11)" on cppreference.com.

In C++11, the following loop

for (auto element : range)
    foo(element);

is roughly equivalent to (see the link above for the exact equivalence)

auto &&__range = range;
for (auto __it = begin(__range), __end = end(__range); __it != __end; ++__it) {
    auto element = *__it;
    foo(element);
}

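If you wanted to, you could write this out yourself (with your own names, since identifiers containing a double underscore are reserved for the implementation), and then you have access to the iterator (__it above) in the loop body, where you could use std::advance(__it, n) or ++__it whenever you want, as long as you never step past the end of the range.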

That being said, I believe this is bad practice, because it is very hard to read (in all languages, not just in C++).
Instead, try to express your intent more clearly by using one of the standard library algorithms or views. That makes it much easier to see what's going on compared to raw loops, especially ones that manipulate their own iterators or indices all over the place.
