
wget & cURL

Date: Wed, 17th-Nov-2004

Tags: Commentary, Shell


How many times have you worked through a page of links patiently right-clicking and saving images/MP3s/files? How many times have you wished you could capture an entire website and save it to disk? I won't even ask about pr0n!

Last night, quite by chance, I came across a website that has revolutionised my downloading: wgets and cURLs...


GNU wget and cURL are freely available, cross-platform utilities, both designed for retrieving files from the Internet. cURL offers range "globbing" facilities in the URL itself (e.g. [0-13] for a numeric range, {1,2,3} for a set of alternatives), and GNU wget supports directory traversal and recursion. Between the two, there's not much that you can't capture with a single command.
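To make the globbing idea concrete, here is a minimal sketch in plain shell that prints the URLs a range or a set would stand for. It doesn't call cURL at all; expand_range and expand_set are hypothetical helpers written for this illustration, and example.com is a placeholder host.

```shell
#!/bin/sh
# Illustration only: list the URLs that cURL-style globs stand for.
# expand_range and expand_set are made-up helper names, not part of cURL.

# Numeric range, like page[0-3].jpg
expand_range() {
    prefix=$1 first=$2 last=$3 suffix=$4
    i=$first
    while [ "$i" -le "$last" ]; do
        printf '%s%s%s\n' "$prefix" "$i" "$suffix"
        i=$((i + 1))
    done
}

# Set of alternatives, like track{1,2,3}.mp3
expand_set() {
    prefix=$1 suffix=$2
    shift 2
    for item in "$@"; do
        printf '%s%s%s\n' "$prefix" "$item" "$suffix"
    done
}

expand_range "http://example.com/page" 0 3 ".jpg"
expand_set "http://example.com/track" ".mp3" 1 2 3
```

cURL does this expansion itself when it sees the brackets or braces in a quoted URL, so a single command can stand in for a whole list of downloads.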

Let's look at a couple of examples (taken from wgets and cURLs):

The WIRED CD: 16 MP3s under a Creative Commons license. You can right-click and save the tracks individually, or you can download all 16 tracks by opening a shell (command line) window and typing:

wget -r --span-hosts --level=1 -nd --accept mp3 http://creativecommons.org/wired/
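For anyone wondering what each option in that command does, here is a sketch that builds the same wget command one option at a time, with a comment on each. The wrapper name fetch_by_ext and the DRY_RUN variable are my own inventions for this example; by default it only prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical wrapper around the wget command above.
# fetch_by_ext and DRY_RUN are made-up names for this sketch.
fetch_by_ext() {
    ext=$1 url=$2
    set -- wget
    set -- "$@" -r                # recursive: follow links found on the page
    set -- "$@" --span-hosts      # allow following links to other hosts
    set -- "$@" --level=1         # go only one level deep from the start page
    set -- "$@" -nd               # no directories: save everything in the current folder
    set -- "$@" --accept "$ext"   # keep only files ending in this suffix
    set -- "$@" "$url"
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"        # dry run: just show the command
    else
        "$@"                      # actually run wget
    fi
}

# Prints the command; set DRY_RUN=0 to really download.
fetch_by_ext mp3 http://creativecommons.org/wired/
```

The --span-hosts option matters when the files are hosted on a different server from the page that links to them, and --level=1 stops wget from wandering off across the rest of the web.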

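A Piece of Apple History: When the Mac came out in 1984, Apple bought all the advertising space in the election special issue of Newsweek. In addition to the portrait of a famous actor pictured on that Newsweek cover, you can grab all 39 pages of Apple ads using the following cURL command (-O saves each file under its remote name; -f makes cURL fail on an HTTP error instead of saving the server's error page):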

curl -O -f "http://www.aci.com.pl/mwichary/pics/computerhistory/ads/international/apple/mac-newsweek/page[0-114].big.jpg"

I think these two utilities have massive potential and I shall certainly be exploring them further.

Further Reading

MP3 Blogs and wget - Jeffrey Veen


Thank you to Pat Croteau for the image entitled "Spider Hunting", which is kindly provided under the Creative Commons "Attribution-NonCommercial-ShareAlike 2.0" License.


17th November, 2004: Corrected the erroneous cURL for the Apple advertisements.
