User:Mjb/ia

Revision as of 16:55, 15 May 2020 by Mjb (talk | contribs) (Upload a folder of videos)

Sometimes, to upload and manage content on archive.org, I use the Internet Archive command-line tool, ia.

The official documentation for it is pretty good:

What I am writing here is just a supplement to cover some things that came up as I used the tool.

Installation and setup

It only runs on Unix-like systems, so I installed it in Lubuntu, which I run in a VirtualBox virtual machine on Windows:

   sudo apt-get install ia

You only have to set your login credentials once, unless you are going to be using multiple accounts:

   ia configure

Bandwidth limiting

Sometimes I want to throttle the network traffic used by ia, but the normal way of doing this (using trickle) does not have any effect. So, with the help of a guide found in a web search, I discovered I can use tc to throttle the entire network interface (as seen by Lubuntu) instead:

   sudo tc qdisc add dev enp0s3 root tbf rate 880kbit latency 60ms burst 1540

In that command, 880kbit is the rate I want to limit to (about 110 KB/s). enp0s3 is the name of the network interface, as found by running ip link.
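The arithmetic behind that figure: in tc, kbit means 1000 bits per second, so multiplying a target in kilobytes per second by 8 gives the value to pass as the rate. A minimal sketch, where kbytes_to_kbit is just an illustrative helper name and not part of tc or ia:

```shell
# Convert a target rate in kilobytes per second to the kbit value tc expects.
# kbytes_to_kbit is a hypothetical helper, not part of tc or ia.
kbytes_to_kbit() {
    echo "$(( $1 * 8 ))kbit"
}

kbytes_to_kbit 110   # prints 880kbit
```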

I discovered through trial and error that 60ms latency results in no dropped packets, while 50ms results in about a 4% drop rate. To see the packet stats:

   tc -s -d qdisc ls dev enp0s3
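The limit stays in place until the qdisc is deleted (or the machine reboots, since tc settings are not persistent by themselves). A small sketch of on/off wrappers follows; the function names, the IFACE variable, and the DRY_RUN switch are all made up for illustration, and the tbf parameters are simply the ones from the command above:

```shell
# Hypothetical wrappers around the tc commands from this section.
# Set DRY_RUN=1 to print the command instead of running it.
IFACE="${IFACE:-enp0s3}"   # interface name, as found with: ip link

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = rate, e.g. 880kbit
throttle_on() {
    run sudo tc qdisc add dev "$IFACE" root tbf rate "$1" latency 60ms burst 1540
}

# Remove the root qdisc again, which lifts the limit.
throttle_off() {
    run sudo tc qdisc del dev "$IFACE" root
}
```

For example, throttle_on 880kbit before starting an upload and throttle_off afterwards.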

Upload a folder of videos

Choose a URL-friendly identifier, say what file or folder to upload, and set the mediatype correctly. Everything else is optional.

   ia upload identifier folderpath --metadata="title:This is a Better Title than the Identifier" --metadata="mediatype:movies" --metadata="date:1990-05-20" --metadata="language:English"

Common mediatypes: texts, movies, image, or data. If you accidentally enter videos, it will be interpreted as movies.

When you upload a folder, it only uploads the contents of the folder, not the top-level folder you gave on the command line.
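To catch an awkward identifier before uploading, a quick pre-flight check can help. This sketch assumes that "URL-friendly" means only ASCII letters, digits, periods, underscores, and hyphens; archive.org's actual rules may be stricter (for example on length), and is_url_friendly is a made-up name:

```shell
# Hypothetical pre-flight check. The allowed character set is an assumption,
# not archive.org's official rule.
is_url_friendly() {
    case "$1" in
        "" | *[!A-Za-z0-9._-]*) return 1 ;;   # empty, or contains another character
        *) return 0 ;;
    esac
}

is_url_friendly "my-videos_1990" && echo "ok"
is_url_friendly "My Videos (1990)" || echo "not URL-friendly"
```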

Replacing files in a folder is not possible

Do not try to upload individual files to replace files in a folder; they will not go into the subfolder. You have to delete and upload the whole folder!

Resuming is not possible

As far as I know, there is no way to resume an interrupted transfer. So if you only get part of a file uploaded, it is gone. If you only get part of a folder uploaded, you need to delete everything and try again.
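One thing that may be worth trying before deleting everything: the help text for ia upload lists a --checksum option, described as skipping files based on checksum, and a --retries option. Assuming they behave as described, re-running the same upload should skip files that already arrived intact. This is a sketch, not something verified here; resume_upload is a hypothetical name, and the retry count of 10 is arbitrary:

```shell
# Sketch only: re-run an upload, asking ia to skip files whose checksum
# already matches what is on the server. resume_upload is a hypothetical name.
# $1 = identifier, $2 = file or folder to upload
resume_upload() {
    ia upload "$1" "$2" --checksum --retries=10
}
```

It would be called with the same identifier and folderpath as the original upload.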

Delete all content for a given identifier

   ia delete identifier --all -H x-archive-keep-old-version:0

The tool returns before the server has actually deleted everything, so give it a few minutes and be patient.

It is possible some files will still be left behind, e.g. metadata .xml, .sqlite, and a thumbnail image. Don't sweat it.

See what files are in the item

   ia list identifier

See what changes are pending

Whenever you upload, delete, or change something, a bunch of related tasks are added to a queue. To get a list of pending tasks applicable to a certain item, you can visit https://archive.org/catalog.php?identifier=foo (replace foo) which is also linked from https://archive.org/manage/foo (the archive item manager).

A history page at https://catalogd.archive.org/history/foo shows even more info.
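To avoid retyping those addresses, a tiny helper can print all three for a given identifier. The function name item_urls is made up; the URLs are the ones above. The ia tool also appears to offer a tasks subcommand (ia tasks identifier) that may show the same queue from the command line, but check ia tasks --help before relying on it.

```shell
# Print the task queue, item manager, and history URLs for an identifier.
# item_urls is a hypothetical helper name.
item_urls() {
    printf 'https://archive.org/catalog.php?identifier=%s\n' "$1"
    printf 'https://archive.org/manage/%s\n' "$1"
    printf 'https://catalogd.archive.org/history/%s\n' "$1"
}

item_urls foo
```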

See what metadata is associated with the item and its files

   ia metadata identifier

Update metadata

   ia metadata identifier --modify="date:1985"
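When several items need the same correction, the command can be wrapped in a loop. A minimal sketch, with set_date_for_items as a made-up name and the identifiers as placeholders; each call queues its own metadata task on the server:

```shell
# Apply the same date to several items, one ia call per identifier.
# set_date_for_items is a hypothetical helper name.
# $1 = date value, remaining arguments = identifiers
set_date_for_items() {
    date_value=$1
    shift
    for id in "$@"; do
        ia metadata "$id" --modify="date:$date_value"
    done
}
```

For example, set_date_for_items 1985 first-item second-item.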