Inoliblist

Hello,

How can I install and use inoliblist? I want to find bugs and then make pull requests.

TomasRoj: Hello,

How can I install and use inoliblist? I want to find bugs and then make pull requests.

OK. I give up.

What is inoliblist ?

Yes, I know that I could Google for it, but do us a favour and explain what it is and where it can be found. After all, it is you who wants the help, so make it as easy as possible for us to provide it.

Message was originally for per1234 as a github issue question. Inoliblist is a github repo for finding bugs and typos in arduino libraries using python codespell library.

UKHeliBob: What is inoliblist ?

The discussion started here: https://github.com/per1234/inolibbuglist/issues/6 (Just for reference).

For anyone curious, here is the inoliblist repository: https://github.com/per1234/inoliblist

TomasRoj: Inoliblist is a github repo for finding bugs and typos in arduino libraries.

That description is actually my inolibbuglist project. inoliblist is a Python script that automatically generates a list of the Arduino libraries on GitHub. The last time I ran it, 8 months ago, it found 8848 repositories. There will be significantly more than that by now. Some of those are duplicates or non-libraries, but I have made a sincere effort to filter as many of those out as possible.

The inolibbuglist project is a Python script that scans all the repositories on the list generated by inoliblist for common issues and then generates a list of the results. I have used this list to find thousands of bugs and submit pull requests to fix many of them.

TomasRoj: using python codespell library.

Just to be clear, the codespell check is only one of many checks that inolibbuglist runs on the library repositories. codespell is a nice tool for detecting commonly misspelled words. The primary tool I use with inolibbuglist is a Bash script I wrote called arduino-ci-script. The codespell check adds a bit of time to the scan, and there are so many other bugs found by inolibbuglist that I have always disabled the codespell check, other than during testing of inolibbuglist. That makes it a definite "low-hanging fruit" section of the list, where you'll find plenty of opportunities to contribute to Arduino libraries.

I know your real interest is in inolibbuglist, but you'll want to run inolibbuglist with an up to date version of the inoliblist output list, so the first step is to run inoliblist.

  • Install Python 3.x: https://www.python.org/
  • Download inoliblist.py to a location on your computer where your user account has write access: https://raw.githubusercontent.com/per1234/inoliblist/master/inoliblist.py
  • Sign in to your GitHub account.
  • Click your avatar at the top right corner of the GitHub website.
  • Click "Settings".
  • Click "Developer settings".
  • Click "Personal access tokens".
  • Click the "Generate new token" button.
  • Enter your password if requested.
  • Add some text to the "Token description" to remind you of the purpose of the token.
  • You don't need to check any of the scope boxes. The purpose of the token is to give you more generous rate limits in the public GitHub API. You actually can use inoliblist without a token, but it would take days to run the script because of constantly having to wait for the unauthenticated rate limit to reset. Even authenticated, a significant portion of the script duration is spent waiting for limit resets.
  • Click the "Generate token" button.
  • When the generated token is displayed click the clipboard icon next to it to copy the token to the clipboard.
  • Open a command line at the location of inoliblist.py.
  • Run the command:
python inoliblist.py --ghtoken token

where token is the GitHub personal access token you generated.
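Before committing to a run that takes hours, it may be worth confirming that the token is accepted. The following is only a sketch and is not taken from inoliblist itself: it queries the public GitHub API's rate limit endpoint, and the function names are made up for illustration.

```python
import json
import urllib.request

# Public GitHub API endpoint that reports the current rate limit status.
RATE_LIMIT_URL = "https://api.github.com/rate_limit"


def build_rate_limit_request(token=None):
    """Build the request; the token is sent only if one is supplied."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = "token " + token
    return urllib.request.Request(RATE_LIMIT_URL, headers=headers)


def fetch_core_rate_limit(token=None):
    """Send the request (needs Internet access) and return the "core" limits."""
    with urllib.request.urlopen(build_rate_limit_request(token)) as response:
        data = json.load(response)
    return data["resources"]["core"]


# Example use (hypothetical token value, not executed here):
#   limits = fetch_core_rate_limit("your-token-here")
#   print(limits["limit"], limits["remaining"])
```

If the reported limit with the token is noticeably higher than without it, the token is being accepted.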

After an initial delay of about a minute, you should start seeing repository URLs being printed to the screen as they are discovered by the script.

The script will take something like 6 hours to run. Make sure your computer will have power and Internet access for the entire time.

After the script finishes, you will find a file named inoliblist.csv in the output subfolder of the folder that contains inoliblist.py. That is a tab separated file, which can be opened in a spreadsheet program.
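If a spreadsheet program is not convenient, the file can also be read with Python's csv module. This is a minimal sketch that assumes the first row holds the column names and that the file is UTF-8 encoded; neither is confirmed above, so adjust as needed.

```python
import csv


def read_tab_separated(path):
    """Return the rows of a tab-separated file as dicts keyed by the header row.

    Assumption: the first row of the file contains the column names.
    """
    with open(path, newline="", encoding="utf-8") as list_file:
        return list(csv.DictReader(list_file, delimiter="\t"))


# Example use (path relative to the folder that contains inoliblist.py):
#   rows = read_tab_separated("output/inoliblist.csv")
#   print(len(rows), "repositories listed")
```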

If you have trouble, you can try running inoliblist.py with the --verbose option to get verbose output for debugging purposes.

Once you have successfully finished generating the inoliblist output list, comment here and I'll proceed with instructions about how to run the inolibbuglist script.

Thanks for the reply. This is the answer I needed.

Ok I ran it. I see only the URLs of the libraries. I don't see the problems there. So how can I run inolibbuglist?

TomasRoj:
Ok I ran it.

Congratulations!

TomasRoj:
I see only the URLs of the libraries.

If you open inoliblist.csv, you’ll see much more than that. The inoliblist output list provides 37 metadata fields in addition to the repository URL.

TomasRoj:
I don't see the problems there.

Nor should you. I thought I already made that clear. The purpose of inoliblist is to automatically generate a list of GitHub repositories of Arduino libraries, with useful metadata. inoliblist is a component of the larger inolibbuglist that I split out into a separate project because I felt that the list of Arduino libraries would be something generally useful to people, while the list of bugs in Arduino libraries would likely only be useful to a very few people (likely limited to myself so far).

TomasRoj:
So how can I run inolibbuglist?

The first thing I should say is that, although I’ve made some efforts to provide Windows compatibility (with Git Bash installed), I have always run the script on Linux and so can’t guarantee it will work on Windows. I have no clue about Mac, as I don’t own one to use for testing, and have zero experience with macOS, but I’d expect that it should work. I found it a real pain to get the shell commands working for all possible path cases on both operating systems and I don’t remember exactly where I left off on that. I usually try to provide a higher quality of documentation and user friendliness for the projects I publish online, but inolibbuglist is such a niche project that I didn’t have much expectation of anyone other than myself taking an interest. I wanted to publish it in the spirit of open source, but the project went way over the time I had allotted to it, so once I got it to the point where it worked reliably for my needs, I didn’t have the time to deal with all the remaining lower priority plans.

Download inolibbuglist: https://github.com/per1234/inolibbuglist/archive/master.zip

Unzip the downloaded .zip file.

inolibbuglist uses inoliblist and expects the folder structure to look like this:

|_inolibbuglist
| |_inolibbuglist.py
| |_etc…
|_inoliblist
  |_inoliblist.py
  |_output
    |_inoliblist.csv

Make sure the folder structure follows the diagram above.
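A quick way to check the layout before starting a long run is a few lines of Python. This is only an illustrative sketch; the function name is made up, and it checks just the three files named in the diagram.

```python
from pathlib import Path

# Relative paths taken from the folder diagram above.
EXPECTED_FILES = (
    Path("inolibbuglist") / "inolibbuglist.py",
    Path("inoliblist") / "inoliblist.py",
    Path("inoliblist") / "output" / "inoliblist.csv",
)


def missing_files(parent_folder):
    """Return the expected files (as relative paths) not found under parent_folder."""
    parent = Path(parent_folder)
    return [relative for relative in EXPECTED_FILES
            if not (parent / relative).is_file()]


# Example use, run from the folder that contains both project folders:
#   for path in missing_files("."):
#       print("Missing:", path)
```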

I know you’re interested in the codespell check, but I have that check turned off by default in the script because it slows down the list generation, and the other checks had provided me with plenty of bugs to fix in previous runs of the script. So you will want to enable the codespell check. Open inolibbuglist.py in a text editor and change line 20 from:

check_for_typos_default = False

to:

check_for_typos_default = True

Save and close the file.

Here is an example of the command to run the script:

python inolibbuglist.py --github_login TomasRoj --ghtoken token --browser_command "\"/c/Program Files/Mozilla Firefox/firefox.exe\" -new-tab" --arduino_ci_script_branch master --arduino_ci_script_application_folder ~/ArduinoIDE --arduino_ci_script_arduino_ide_version 1.8.9

I’ll provide some explanation of the options:

  • github_login: The script uses this to determine whether you have open pull requests or issue reports, and whether you are a contributor in each Arduino library repository. By default, the script will skip scanning of any repository where you already have an open pull request. The idea is that this keeps you from wasting time either on a repository where you have already submitted a fix for the bug detected by the scan, or on a repository where the maintainer hasn’t made the effort to merge or close your previous PR and will likely do the same with any future PRs. You can change that behavior by setting the value of process_repos_with_open_pr_default to True at line 19 of inolibbuglist.py. A couple of the issues found by the scan are only resolvable by opening an issue report, so the check for whether you have an open issue report can be used to avoid duplicate effort on those issues. The idea behind recording whether you are a contributor to the repository is that it indicates the repository was maintained actively enough, by someone receptive to contributions, at some point during your time as a contributor on GitHub that they merged one of your PRs, and thus may have a greater likelihood of still being actively maintained. You could sort the list accordingly to prioritize your efforts toward those repositories. I’ve found that it’s a very big problem that a lot of Arduino developers either don’t maintain their repositories (but also don’t bother to document that the repo is unmaintained) or else are so dysfunctional when it comes to GitHub that they don’t get notifications when a PR is submitted, or don’t know what to do when they get one. I currently have 1533 open PRs, some of them going back as far as 2015. Now, you may think that after a few months sitting open, a simple, non-controversial, obvious, well-documented, valid PR will have no chance of ever being merged, but I’m always amazed to get regular emails of these zombie PRs coming back to life and getting merged, often with expressions of amazement from the developer that they didn’t notice it for so long.
  • ghtoken: Replace token with the same token you used when you ran inoliblist.
  • browser_command: Adjust this to be the command required to open a tab in your web browser of choice. To make the process of submitting bug fixes more efficient, inolibbuglist generates scripts that open batches of repository URLs in browser tabs, and uses this command in those scripts.
  • arduino_ci_script_branch: The branch of the arduino-ci-script tool used by inolibbuglist. This is helpful to me while testing during development work and you will just want to leave it set to master.
  • arduino_ci_script_arduino_ide_version: The version of the Arduino IDE to use. The arduino-ci-script tool uses the Arduino IDE for an obscure check that likely won’t interest you. However, I don’t currently have a system in place to skip that so you’ll want to define it anyway.
  • arduino_ci_script_application_folder: Set this to the location on your computer where you have the Arduino IDE installed. The script will expect to find the Arduino IDE at {arduino_ci_script_application_folder}/arduino-{arduino_ci_script_arduino_ide_version}. If it doesn’t find it there, it will install it. Due to being primarily targeted to use in Travis CI builds, the arduino-ci-script tool currently assumes it is running on a 64 bit Linux system, and will install the 64 bit Linux version of the Arduino IDE. If you have a different OS you definitely will want to make sure you have manually installed the Arduino IDE to the correct location to avoid the wrong IDE being installed. I hope to eventually make arduino-ci-script more friendly to running on any platform, but that has not been a high priority for me yet.
  • bash_command: The Bash shell command used to run the arduino-ci-script tool (which is a Bash script). The default command is bash, which works fine on Linux (and probably macOS?) and so you can leave this option off if you’re using Linux. If you’re attempting to use the script on Windows, you will need to install Git for Windows, which includes Git Bash. In that case, you may want to specify the path to where Git Bash is installed. For my Windows tests, I used the option --bash_command "c:/Program Files/Git/bin/bash.exe".
  • verbose: Causes the script to print verbose output while it’s running, which you can use for debugging.
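Because the browser command contains spaces and nested quotes, the shell quoting is easy to get wrong. One way around that is to assemble the arguments as a Python list, which can be passed to subprocess.run without any shell quoting at all. This is a sketch, not part of inolibbuglist: the helper name is made up, the option names are the ones listed above, and the --verbose spelling for inolibbuglist is assumed from the inoliblist option of the same name. shlex.join needs Python 3.8 or newer.

```python
import shlex


def build_inolibbuglist_command(github_login, token, browser_command,
                                application_folder, ide_version,
                                branch="master", bash_command=None,
                                verbose=False):
    """Return the inolibbuglist command line as a list of arguments."""
    command = [
        "python", "inolibbuglist.py",
        "--github_login", github_login,
        "--ghtoken", token,
        "--browser_command", browser_command,
        "--arduino_ci_script_branch", branch,
        "--arduino_ci_script_application_folder", application_folder,
        "--arduino_ci_script_arduino_ide_version", ide_version,
    ]
    if bash_command:
        command += ["--bash_command", bash_command]
    if verbose:
        command.append("--verbose")  # assumed flag spelling
    return command


example = build_inolibbuglist_command(
    github_login="TomasRoj",
    token="token",  # placeholder, replace with the real token
    browser_command='"/c/Program Files/Mozilla Firefox/firefox.exe" -new-tab',
    application_folder="C:/Program Files (x86)",
    ide_version="1.8.9",
)
# Print a POSIX-shell-quoted version of the command for inspection.
print(shlex.join(example))
# To actually start the run: subprocess.run(example, check=True)
```

Note that the quoting printed by shlex.join is for POSIX shells such as Bash, not for the Windows command prompt.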

OK, when I am back home I will try it and tell you whether it works. Thanks for now!

I forgot to mention that the inolibbuglist script also takes a very long time to run.

How long does it take? I fully understand these two projects now and I will be glad to help you with this project in the future. BTW can't understand how you made this :o

TomasRoj: How long does it take?

I can't remember how long it took the last time I ran it with codespell enabled. It will take even longer than that now because more libraries have been created over the last 8 months. I wouldn't be surprised if it takes 24 hours.

TomasRoj: I fully understand these two projects now

That was my hope. It would have been easier for me to just run the scripts for you, but I thought it would be beneficial if you ran them yourself. It's unfortunate that inolibbuglist is not as user friendly as inoliblist, but we'll hope for the best.

TomasRoj: I will be glad to help you with this project in the future.

You're probably seeing already that there are some areas where improvements are possible.

TomasRoj: BTW can't understand how you made this :o

A lot of trial and error and frustration and confusion and headaches. This was my first real project using Python so that made it especially a struggle. I split the project into three components:

  • inoliblist
  • Adding check_library_structure, check_library_properties, check_keywords_txt, and check_library_manager_compliance functions to arduino-ci-script
  • inolibbuglist

The idea was that inolibbuglist would probably just be a personal project of no interest to anyone else, but that some of the components of the project could potentially be more generally useful, and so those should be split out separately.

The inolibbuglist part of the project was especially frustrating because I was already way past the amount of time I could devote to a hobby project and every time I thought it was finally working, the script would fail from something obscure like weird characters in paths or odd file encodings, which would only be encountered after the script had been running for many hours. When you run a script through 9000 Arduino libraries, it's bound to encounter just about any wacky thing someone could possibly do. Luckily, I also used this project to learn how to do unit testing in both Python and Bash projects, and that has been a life saver, even though it took a lot of time to get it set up.

I did have the advantage that I had already been essentially going through much of this process manually for years, so I was already familiar with the common Arduino library bugs that could be automatically scanned for, and I had put a lot of thought into the project before I ever started on it.

I don't fully understand the IDE folder option. My "Arduino" folder is in Program Files (x86), so should I give the path to arduino.exe there?

arduino-ci-script expects the Arduino IDE installation folder to be named like arduino-1.8.9 (this is how the Linux and "Windows ZIP file for non admin install" versions of the Arduino IDE are named when you download them) because it supports CI testing with multiple IDE versions for backwards compatibility tests. Let's say you installed the IDE like this:

C:
|_Program Files (x86)
  |_arduino-1.8.9
    |_arduino.exe

Then you would use the option --arduino_ci_script_application_folder "C:\Program Files (x86)"
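To double-check which folder the script will look for, the naming rule described above (application folder, then arduino- followed by the IDE version) can be reproduced in a couple of lines. This is an illustrative sketch with made-up function names, not code from arduino-ci-script.

```python
from pathlib import Path


def expected_ide_folder(application_folder, ide_version):
    """Return the folder where the IDE is expected, per the naming rule above."""
    # expanduser() is a convenience so that paths like ~/ArduinoIDE also work here.
    return Path(application_folder).expanduser() / ("arduino-" + ide_version)


def ide_is_installed(application_folder, ide_version):
    """Return True if the expected IDE folder exists."""
    return expected_ide_folder(application_folder, ide_version).is_dir()


# Example use with the values from this post:
#   print(expected_ide_folder("C:/Program Files (x86)", "1.8.9"))
#   print(ide_is_installed("C:/Program Files (x86)", "1.8.9"))
```

If the second call prints False, the version passed on the command line probably does not match the name of the installed folder.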

I am sorry, but I am still having a problem running the file. The command:

python inolibbuglist.py --github_login TomasRoj --ghtoken xyz --browser_command "\"/c/Program Files/Mozilla Firefox/firefox.exe\" -new-tab" --arduino_ci_script_branch master --arduino_ci_script_application_folder --arduino_ci_script_application_folder "C:\Program Files (x86)" --arduino_ci_script_arduino_ide_version 1.8.8

Now it gives me an error that there is no module named urllib.error and urllib.request.

Can you write here how to run it in CI? I think it would be more effective and better for me.

You can't run the script on a free CI service like Travis CI or Circle CI because the duration is much longer than they allow.

TomasRoj: Now it gives me an error that there is no module named urllib.error and urllib.request

That could be caused by using Python 2 instead of the required Python 3. Try running the command python --version to see which version is being used.
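If more than one Python is installed, the python command may not start the one you expect. A small check like the following, saved to a file and run with the same python command used for the script, shows which interpreter is actually in use and whether the two modules from the error message can be imported. It is just a diagnostic sketch; the function name is made up.

```python
import importlib
import sys

# The two standard library modules named in the error message.
REQUIRED_MODULES = ("urllib.error", "urllib.request")


def missing_modules(module_names=REQUIRED_MODULES):
    """Return the names of the modules that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


print("Interpreter:", sys.executable)
print("Version:", sys.version.split()[0])
print("Missing modules:", missing_modules())
```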

I just tried running the script under Windows and it started running with no errors. I didn't run it all the way through though, so it's possible the shell stuff will make it fail later on in the process.

I would like to help you, but inoliblist in particular was a personal project that I published on the chance that it might be useful to others, but without a desire to get stuck putting a lot of effort into supporting it. I'm pretty busy with work right now and don't have much more time to put towards this. Spend some time seeing if you can get it working. If you still can't get it after that, I'll just run the script myself and give you the output. That will be way less work for me in the end.

OK, thanks. I have Python 3, but I will try to fix it. If it doesn't run, I will send you a PM.