How to create large .txt file?

Hello everyone,

I want to do some testing on my SD card and need a large .txt file with data to do this.
Is there a way to create a .txt file (>1 GB) that is filled with data throughout? I tried using fsutil and echo in cmd. While fsutil creates a large file without data, I can use echo to put data in, but I want something that fills the complete file with data, not just one line. Maybe there is a loop command to fill a file? Or is there another way to do this quickly?

Thanks in advance,

Are you looking for a solution to run on the PC or the Arduino?

What do you want inside your file? The same line repeated over and over? The same letter? Random stuff? Does it need to be valid ASCII? Can it be binary nonsense? Can it be all zeros?

If you like the letter A, this is one solution:

</dev/zero tr \\0 \\101 | dd count=1000000 bs=1k of=big_file.txt

PowerShell or a regular batch file would also work.

Remember, too, that since you presumably have the Arduino IDE installed, you have gcc; and since I assume you know how to program in C++, you can write a ~fifteen-line program to generate a suitable file.

I don't know if you can still do it, but under MSDOS you could copy a file onto itself to double its size. Do that a few times and the file will be HUGE!

scan your hard disk for a large file, or download a nice film and rename the file to .txt

it even works if the film is not nice :slight_smile:

The OP didn't specify what the file will be used for. They did specify .txt, though, so maybe the file should contain ASCII? Hello @noobquestions?

Hello Sir,

You are right, I was not being specific about my idea, sorry.
I'm going to do a two-week test run with an Arduino UNO, using a data logger with an SD slot.
My plan is to check the SD card for errors (my goal is to find bit flips). That's why I was asking for a large .txt file (which can be filled with anything, just to have a setup to check for errors). Over the two weeks I will check the .txt file repeatedly in 512-byte blocks and use CRCs for every block. I will log a few events (test correct, test corrupt, ...) to the internal EEPROM.
So do you think renaming a film to .txt will do that?
Thanks a lot for your attention

Since you are only interested in reading raw bytes, any binary data will do, including a film renamed as .txt, but this is a weird approach. Why a film? Why the .txt extension if the file is not a text file? What's wrong with the solution from #3? If you prefer a file filled with random bytes, try:

dd if=/dev/urandom count=1000000 bs=1k of=big_file.raw

It will just take a little longer to complete.

The OP's mention of fsutil suggests Windows, so I wouldn't think that dd is an option.

I'm pretty sure dd should work from cygwin for a task like that. I can't believe Win users are stuck with renaming a film just to make some big raw data file. The mention of fsutil also suggests that the OP is at least somewhat familiar with a CLI, so why shy away from cygwin, after all?

Your suggestion from post #4 is also perfectly sound. Downloading a film, on the other hand...

it's not about the download, it's just about finding a large file. Movies tend to be large. so it's a possible source of content readily available


Yes, I see the point, I've seen it from the start. A "solution" like this can be seen as a quick and clever hack or as a lazy and inelegant hack.

Does it work? Sure! Should one solve problems this way? Hmm...

not willing to start a long discussion about this but the original question was

Is there a way to create a .txt file (>1GB) with Data which fills the whole file??

renaming a large file as .txt seems to fit the bill very simply. I don't see the issue with that.

of course you can do more complicated stuff with the command line and ensure proper distribution of the randomness of the data in the file etc... but that was not the ask. over engineered answer?

PS: the OP has also clarified the real intent ("my goal is to find bit flips"); maybe that's more interesting

some reading for the OP: Soft error - Wikipedia

Possibly. At least, the OP now has more tools to pick from.

With a file completely filled with the same value repeated, however, it is easier to spot when the data is departing from the expected value, so I still prefer a file with repeated A chars (or whatever else): I think it has a practical advantage over the big film.

sure - more solutions better than one :slight_smile:

using only one value (8 identical bits scattered around) may induce a bias in the analysis.

I'm aware of bit flips on DRAM, but on an SD card, given the level of energy required to flip those bits, I'm not sure there will be anything to measure.

on Windows you could use something like this in a cmd window (note that `type dummy.txt >> dummy.txt` appends a file to itself and doesn't terminate cleanly, so the loop doubles the file through a temporary copy instead; 27 doublings of the 13-byte seed line give roughly 1.7 GB):

echo A123456789 > dummy.txt
for /L %i in (1,1,27) do (copy /b dummy.txt+dummy.txt tmp.txt & move /y tmp.txt dummy.txt)

However, the disadvantage of such ASCII-only files is that, for example, bit 7 of every cell is only ever written as 0; even if bit 7 of some cells could not hold a 1, you would never find those damaged cells. Therefore I would simply use a random binary file and do a write-and-readback test.