What is this
This is a parser for the file listing produced by rclone ls.
If you redirect rclone's output to a file like so:
rclone ls wikimedia: > size_output.txt
you'll get one line per file, containing the file's size in bytes and its path, for every file in that remote.
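Each of those lines can be split into a size and a path. A minimal sketch of that parsing step (the function name and the sample file names are illustrative, not the script's actual API):

```python
def parse_line(line: str) -> tuple[int, str]:
    """Split one `rclone ls` line ("<size in bytes> <path>") into (size, path)."""
    size_str, path = line.strip().split(maxsplit=1)
    return int(size_str), path

# Example lines in the shape `rclone ls` produces (file names are made up):
sizes = [parse_line(l) for l in [
    "  1048576 wikipedia/wikipedia_en_all_nopic.zim",
    "   524288 wikiquote/wikiquote_en_all.zim",
]]
```

Splitting with maxsplit=1 keeps paths with spaces in them intact.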
Then, by running this Python script like so:
python3 main.py ./size_output.txt https://dumps.wikimedia.org/other/kiwix/zim/ 500
you'll get all those entries sorted, completed into full URLs with the given base link, and chunked into lists capped at the number of gigabytes you choose,
organized into folders matching the root folders of that repository.
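The chunking step described above can be sketched as a greedy packer: walk the URL list and start a new chunk whenever adding the next file would push the running total over the gigabyte limit. This is an illustrative sketch, not the script's actual implementation:

```python
def chunk_links(links: list[tuple[int, str]], limit_gb: float) -> list[list[str]]:
    """Greedily pack (size_in_bytes, url) pairs into chunks of at most limit_gb."""
    limit = limit_gb * 1024 ** 3  # gigabyte limit expressed in bytes
    chunks, current, used = [], [], 0
    for size, url in links:
        # Close the current chunk if this file would overflow it.
        if current and used + size > limit:
            chunks.append(current)
            current, used = [], 0
        current.append(url)
        used += size
    if current:
        chunks.append(current)
    return chunks
```

A single file larger than the limit still gets its own chunk rather than being dropped.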
The command also prints the total size per root folder, resulting in something like this:
{'wikiversity': '10.97 GB', 'wikivoyage': '13.33 GB', 'wikinews': '34.19 GB', 'wikisource': '261.42 GB', 'wikiquote': '8.77 GB', 'wiktionary': '63.32 GB', 'wikibooks': '28.29 GB', 'wikipedia': '2.81 TB'}
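Human-readable totals like '10.97 GB' and '2.81 TB' can be produced with a small byte-formatting helper along these lines (a sketch, not necessarily the script's own implementation):

```python
def human_size(num_bytes: float) -> str:
    """Format a byte count as a human-readable string, e.g. '10.97 GB'."""
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if num_bytes < 1024 or unit == "TB":
            return f"{num_bytes:.2f} {unit}"
        num_bytes /= 1024  # step up to the next unit
```

This uses binary units (1 GB = 1024**3 bytes), matching how rclone reports sizes.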
$ tree
.
├── chunked
│   ├── wikibooks
│   │   ├── wikibooks_chunk_1.txt
│   │   └── wikibooks_links.txt
│   ├── wikinews
│   │   ├── wikinews_chunk_1.txt
│   │   └── wikinews_links.txt
│   ├── wikipedia
│   │   ├── wikipedia_chunk_10.txt
│   │   ├── wikipedia_chunk_11.txt
│   │   ├── wikipedia_chunk_12.txt
│   │   ├── wikipedia_chunk_13.txt
│   │   ├── wikipedia_chunk_14.txt
│   │   ├── wikipedia_chunk_1.txt
│   │   ├── wikipedia_chunk_2.txt
│   │   ├── wikipedia_chunk_3.txt
│   │   ├── wikipedia_chunk_4.txt
│   │   ├── wikipedia_chunk_5.txt
│   │   ├── wikipedia_chunk_6.txt
│   │   ├── wikipedia_chunk_7.txt
│   │   ├── wikipedia_chunk_8.txt
│   │   ├── wikipedia_chunk_9.txt
│   │   └── wikipedia_links.txt
│   ├── wikiquote
│   │   ├── wikiquote_chunk_1.txt
│   │   └── wikiquote_links.txt
│   ├── wikisource
│   │   ├── wikisource_chunk_1.txt
│   │   ├── wikisource_chunk_2.txt
│   │   └── wikisource_links.txt
│   ├── wikiversity
│   │   ├── wikiversity_chunk_1.txt
│   │   └── wikiversity_links.txt
│   ├── wikivoyage
│   │   ├── wikivoyage_chunk_1.txt
│   │   └── wikivoyage_links.txt
│   └── wiktionary
│       ├── wiktionary_chunk_1.txt
│       └── wiktionary_links.txt
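The per-root folders in the tree above come from bucketing each path by its first component. A hedged sketch of that grouping step (helper name is illustrative):

```python
from collections import defaultdict

def group_by_root(entries: list[tuple[int, str]]) -> dict[str, list[tuple[int, str]]]:
    """Bucket (size, path) entries by the first path component, e.g. 'wikibooks'."""
    groups = defaultdict(list)
    for size, path in entries:
        root = path.split("/", 1)[0]
        groups[root].append((size, path))
    return dict(groups)
```

Each bucket then gets its own output directory with a full link list and its numbered chunk files.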