Tuesday, December 29, 2009

Linux wget: your ultimate command line downloader

SkyHi @ Tuesday, December 29, 2009

It is common practice to manage UNIX/Linux/BSD servers remotely over an ssh session. While managing servers, you often need to download software or other files for installation, or the latest ISO image of a Linux distribution (or even MP3s). These days there are plenty of GUI downloaders for the X Window System, such as:

  • d4x: http://www.krasu.ru/soft/chuchelo
  • kget: KDE download manager
  • gwget2: GNOME 2 wget front-end

However, when it comes to the command line (shell prompt), wget, the non-interactive downloader, rules. It supports the http, https and ftp protocols, along with authentication and a large number of options. Here are some tips to get the most out of it:

Download a single file using wget

$ wget http://www.cyberciti.biz/here/lsst.tar.gz
$ wget ftp://ftp.freebsd.org/pub/sys.tar.gz
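
When a download like this runs inside a script rather than at the prompt, it helps to check whether wget succeeded. A minimal sketch, assuming a hypothetical helper named fetch_one (the name is made up for illustration and is not part of wget); it relies only on wget returning a non-zero exit status on failure:

```shell
#!/bin/sh
# fetch_one is a hypothetical helper, not part of wget.
# It downloads one URL quietly (-q) and reports the outcome.
fetch_one() {
    url=$1
    if wget -q "$url"; then
        echo "OK: $url"
    else
        # Inside the else branch, $? still holds wget's exit status.
        status=$?
        echo "FAILED (wget exit status $status): $url" >&2
        return "$status"
    fi
}
```

After pasting the function into your shell, it can be called like the plain command:
$ fetch_one http://www.cyberciti.biz/here/lsst.tar.gz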

Download multiple files on command line using wget

$ wget http://www.cyberciti.biz/download/lsst.tar.gz ftp://ftp.freebsd.org/pub/sys.tar.gz ftp://ftp.redhat.com/pub/xyz-1rc-i386.rpm

OR

i) Create a variable that holds all the URLs, to be used later in a bash for loop:
$ URLS="http://www.cyberciti.biz/download/lsst.tar.gz ftp://ftp.freebsd.org/pub/sys.tar.gz ftp://ftp.redhat.com/pub/xyz-1rc-i386.rpm http://xyz.com/abc.iso"
ii) Use a for loop as follows ($URLS is deliberately left unquoted so that the shell splits it into separate URLs):
$ for u in $URLS; do wget "$u"; done
iii) However, a better way is to put all the URLs in a text file and use wget's -i option to download them:

(a) Create a text file using vi:
$ vi /tmp/download.txt
Add the list of URLs:
http://www.cyberciti.biz/download/lsst.tar.gz
ftp://ftp.freebsd.org/pub/sys.tar.gz
ftp://ftp.redhat.com/pub/xyz-1rc-i386.rpm
http://xyz.com/abc.iso
(b) Run wget as follows:
$ wget -i /tmp/download.txt

Force wget to resume download

You can use wget's -c option. This is useful when you want to finish a download that a previous instance of wget started before the network connection was lost. In such a case, add the -c option as follows:
$ wget -c http://www.cyberciti.biz/download/lsst.tar.gz
$ wget -c -i /tmp/download.txt
Please note that not all ftp/http servers support resuming downloads.
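
The list file does not have to be typed in an editor. As a sketch of the steps above in script form, the block below writes the same four URLs with a here-document and wraps the resume-capable call in a hypothetical helper, fetch_list (the helper name is an assumption made for illustration):

```shell
#!/bin/sh
# Same path as in the text above.
list=/tmp/download.txt

# Write the URL list non-interactively, one URL per line.
cat > "$list" <<'EOF'
http://www.cyberciti.biz/download/lsst.tar.gz
ftp://ftp.freebsd.org/pub/sys.tar.gz
ftp://ftp.redhat.com/pub/xyz-1rc-i386.rpm
http://xyz.com/abc.iso
EOF

# fetch_list is a hypothetical helper: download every URL in the file,
# continuing partially downloaded files (-c) where the server allows it.
fetch_list() {
    wget -c -i "$1"
}
```

If the connection drops, running the same call again picks up where it stopped:
$ fetch_list /tmp/download.txt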

Force wget to download all files in background, and log the activity in a file:

$ wget -cb -o /tmp/download.log -i /tmp/download.txt

OR

$ nohup wget -c -o /tmp/download.log -i /tmp/download.txt &

nohup runs the given command (wget in this example) with hangup signals ignored, so that it can keep running in the background after you log out.
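
A sketch of how the background variant might be wrapped so that it also reminds you where the log is. It assumes a hypothetical helper named bg_download (not part of wget) and uses only the -c, -b, -o and -i options shown above:

```shell
#!/bin/sh
# bg_download is a hypothetical helper, not part of wget.
# Usage: bg_download URL_LIST LOG_FILE
bg_download() {
    list=$1
    log=$2
    # -b makes wget detach itself; -o sends its messages to the log file.
    wget -c -b -o "$log" -i "$list"
    echo "Follow progress with: tail -f $log"
}
```

For example:
$ bg_download /tmp/download.txt /tmp/download.log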

Limit the download speed to a given number of bytes or kilobytes per second

This is useful when you download a large file, such as an ISO image. Recently an admin started downloading a SuSE Linux DVD on a production server for evaluation purposes. Soon wget was eating up all the bandwidth. No need to spell out how such a disaster ends.
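
To judge what a cap costs in time, a back-of-the-envelope estimate is enough. The sketch below assumes a hypothetical 4 GiB image and a limit of 50 KB/s, counting a kilobyte as 1024 bytes; eta_seconds is an illustrative helper, not a wget feature:

```shell
#!/bin/sh
# eta_seconds is a hypothetical helper for illustration.
# Usage: eta_seconds SIZE_IN_BYTES RATE_IN_KB_PER_SECOND
# Prints the transfer time in whole seconds (integer division rounds down).
eta_seconds() {
    echo $(( $1 / ($2 * 1024) ))
}

# Hypothetical example: a 4 GiB image at 50 KB/s.
size=$((4 * 1024 * 1024 * 1024))
secs=$(eta_seconds "$size" 50)
echo "$secs seconds, about $((secs / 3600)) hours"
# prints: 83886 seconds, about 23 hours
```

In other words, a cap this low keeps the link usable for everyone else, at the price of a download that runs for most of a day.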
$ wget -c -o /tmp/susedvd.log --limit-rate=50k ftp://ftp.novell.com/pub/suse/dvd1.iso
The above command limits the retrieval rate to 50KB/s. Use the m suffix for megabytes (--limit-rate=1m). It is also possible to specify a disk quota for automatic retrievals to avoid a disk DoS attack. With the following command, wget stops fetching further files from the list once the quota is exceeded; the file in progress is completed first, hence 100MB+:
$ wget -cb -o /tmp/download.log -i /tmp/download.txt --quota=100m

Use http username/password on an HTTP server:
$ wget --http-user=foo --http-password=bar http://cyberciti.biz/vivek/csits.tar.gz

Download all mp3 or pdf files from a remote FTP server:
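Generally you can use shell special characters (wildcards) such as *, ? and [] to specify selection criteria for files. The same can be used with FTP servers while downloading files. Quote the URL so that wget, not your shell, expands the wildcard:
$ wget 'ftp://somedom.com/pub/downloads/*.pdf'
OR, on older wget releases that accept it, turn globbing on explicitly (current versions enable it by default for FTP URLs containing wildcards and offer --no-glob to switch it off):
$ wget -g on 'ftp://somedom.com/pub/downloads/*.pdf'

Use aget when you need multithreaded http download: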
aget fetches HTTP URLs in a manner similar to wget, but splits the retrieval into multiple segments to increase download speed. In some circumstances it can be many times faster than wget (it is like FlashGet under MS Windows, but with a CLI):
$ aget -n=5 http://download.soft.com/soft1.tar.gz
The above command downloads soft1.tar.gz in 5 segments.

Please note that the wget command is available on Linux and on UNIX/BSD-like operating systems.

See the wget(1) man page for more advanced options.
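
One such option is the accept list (-A), which can stand in for FTP wildcards when several file types are wanted at once, as in the mp3 or pdf case above. A minimal sketch, assuming a hypothetical helper named fetch_types and the placeholder host used earlier; -r, -l, -nd, -np and -A are standard wget options, everything else is made up for illustration:

```shell
#!/bin/sh
# fetch_types is a hypothetical helper, not part of wget.
# Usage: fetch_types URL EXT [EXT...]
fetch_types() {
    url=$1
    shift
    # Join the remaining arguments with commas: "pdf mp3" -> "pdf,mp3".
    accept=$(IFS=,; echo "$*")
    # -r -l 1: recurse one level deep; -nd: do not recreate directories;
    # -np: never ascend to the parent; -A: keep only matching suffixes.
    wget -r -l 1 -nd -np -A "$accept" "$url"
}
```

For example, to fetch both file types from one directory:
$ fetch_types ftp://somedom.com/pub/downloads/ pdf mp3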


Reference: http://www.cyberciti.biz/tips/linux-wget-your-ultimate-command-line-downloader.html