Caching USB ID Database Updates
The USB ID Database
Where Is It?
The USB ID database is stored locally, at least on Ubuntu, as a text file located at /var/lib/usbutils/usb.ids.
How Does It Update?
When you call update-usbids, literally all that happens is that a backup is made of your current database and then the latest copy of the database is downloaded from http://www.linux-usb.org/usb.ids.
HTTP Headers
If you inspect the headers returned when you issue a GET request for http://www.linux-usb.org/usb.ids, you'll see both Last-Modified and Etag headers, and not only that, you'll also see an Expires header too!
We have all we need to check whether our database is up to date with linux-usb.org rather than just blindly downloading it every time.
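As a rough illustration of reading those headers from the command line, the sketch below pulls a single header value out of a raw header dump. The helper name header_value is made up for this example and is not part of usbutils or of the script described later.

```shell
#!/bin/sh
# header_value NAME
# Hypothetical helper: reads raw HTTP headers on stdin and prints the
# value of the first header called NAME (matched case-insensitively).
header_value() {
    # HTTP lines end in CRLF, so drop the CR before parsing.
    tr -d '\r' | awk -v want="$1" '
        BEGIN { want = tolower(want) }
        {
            i = index($0, ":")
            if (i > 0 && tolower(substr($0, 1, i - 1)) == want) {
                value = substr($0, i + 1)
                sub(/^[ \t]+/, "", value)   # trim leading whitespace
                print value
                exit
            }
        }'
}

# Example usage (needs network access, so it is left commented out):
#   curl -sI http://www.linux-usb.org/usb.ids | header_value etag
#   curl -sI http://www.linux-usb.org/usb.ids | header_value expires
```

Here curl -I sends a HEAD request, so only the headers come back and the database itself is never transferred.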
The New update-usbids
I put together a little bash script that can be used in place of update-usbids and that makes use of the caching headers that are exposed to us. A copy is available on GitHub.
How it works
- Make a HEAD request for the compressed version of the database, http://www.linux-usb.org/usb.ids.gz.
- Check if the returned Etag header matches the Etag we stored the last time we requested an update.
- If it does match, great! Nothing more is required, and it cost us only a couple hundred bytes!
- If it doesn't match, there's something new in the database!
- Create a backup of our current copy of the database, just in case.
- Download the new version of the database.
- Keep a note of the Etag of this database.
Also, it'll draw pretty progress bars for you, courtesy of pv.
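The steps above could be sketched roughly as follows. This is not the script from GitHub, only a minimal approximation of the same flow: the Etag file location, the .old and .new suffixes, and the function names are assumptions made for the example, and the pv progress bars are left out.

```shell
#!/bin/sh
# Assumed locations; the real script may store things elsewhere.
URL="${USBIDS_URL:-http://www.linux-usb.org/usb.ids.gz}"
DB="${USBIDS_DB:-/var/lib/usbutils/usb.ids}"
ETAG_FILE="${USBIDS_ETAG_FILE:-/var/lib/usbutils/usb.ids.etag}"

# HEAD request only: print the Etag value, or nothing if there is none.
remote_etag() {
    curl -sI "$1" | tr -d '\r' | awk 'tolower($1) == "etag:" { print $2; exit }'
}

# Succeeds when an update is needed, i.e. the Etag is missing or changed.
needs_update() {
    new_etag="$1"
    [ -n "$new_etag" ] || return 0      # no Etag returned: play safe
    [ -f "$ETAG_FILE" ] || return 0     # nothing stored from last time
    [ "$(cat "$ETAG_FILE")" != "$new_etag" ]
}

update_usbids() {
    etag="$(remote_etag "$URL")"
    if ! needs_update "$etag"; then
        echo "usb.ids is already up to date"
        return 0
    fi
    # Back up the current database, just in case.
    if [ -f "$DB" ]; then
        cp "$DB" "$DB.old" || return 1
    fi
    # Download and decompress to a temporary file, then swap it in.
    if curl -fsS "$URL" | gunzip > "$DB.new"; then
        mv "$DB.new" "$DB" || return 1
    else
        rm -f "$DB.new"
        return 1
    fi
    # Remember the Etag for the next run.
    printf '%s\n' "$etag" > "$ETAG_FILE"
}

# Run with: update_usbids   (needs write access to the paths above)
```

Writing to a temporary file first means a failed download never replaces a working database.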
Conclusion
I'm a little surprised this hasn't already made it into the source of usbutils; it seems like kind of an obvious optimisation...
It allows you to always have an up-to-date USB device list, especially if you alias lsusb to update-usbids && lsusb!
It's true that the whole database, once Gzip has finished with it, is only 246kB at the time of writing, so there's not much bandwidth or time to be saved, at least not from the user's point of view.
What about from the server's point of view? Then it's a numbers game. I couldn't find a number of Ubuntu users in the world, but say for the sake of argument it's 20 million. If just 5% of these people ran update-usbids once a week, that's over 234GB of bandwidth required on the server end every week. I have no idea how often the database updates, but given the Expires header is set to 3 days in the future, you could at least halve that 234GB number and save time and bandwidth for the users too!
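For anyone who wants to check the arithmetic, here is a small sketch using the figures above. It assumes the 234GB figure was reached by dividing kilobytes by 1024 twice, which is a guess that happens to reproduce the number.

```shell
#!/bin/sh
users=20000000     # assumed number of Ubuntu users, from the text
percent=5          # share of users running update-usbids once a week
size_kb=246        # size of the gzipped database in kB

weekly_downloads=$(( users * percent / 100 ))
weekly_kb=$(( weekly_downloads * size_kb ))
weekly_gb=$(( weekly_kb / 1024 / 1024 ))   # integer division, rounds down
halved_gb=$(( weekly_gb / 2 ))

echo "$weekly_downloads downloads, about $weekly_gb GB per week"
echo "Halved by caching: about $halved_gb GB per week"
```

With these inputs the script reports 1000000 downloads and about 234 GB per week, or about 117 GB if half of the downloads are avoided.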
This is only going to become more pronounced as the number of people using USBUtils grows and the number of entries in the database grows...
It's Open Source.. So Why Not Submit It?
In short, I have no idea how. I had a Google and found the source for USBUtils but found no way of contributing my code...