Saturday, January 16, 2010

using wget to grab all images from a web page

SkyHi @ Saturday, January 16, 2010

the command

  • wget -A.jpg -r -l1 -np http://www.mentallandscape.com/C_CatalogMoon.htm

explanation

  • -A: accept list; here we’re accepting only .jpg files.
  • -r: recursive download
  • -l1: recursion depth; here, one level.
  • -np: no parent, i.e. do not go up in the directory tree.
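the accept list boils down to suffix matching on filenames: wget downloads each candidate and keeps it only if its name matches one of the -A patterns. a local sketch of that idea (the filenames below are made up for illustration):

```shell
# mimic -A.jpg: keep only names ending in .jpg
accepted=""
for f in photo.jpg index.html thumb.png crater_map.jpg; do
  case "$f" in
    *.jpg) accepted="$accepted $f" ;;   # matches the accept list
  esac
done
echo "accepted:$accepted"
```

running this prints `accepted: photo.jpg crater_map.jpg` — the .html and .png names are dropped, just as wget would discard them after the crawl.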

more on wget

wget on mac os x

wget does not ship with mac os x. you can find a pre-compiled version of wget at status-q. if you don’t want to install wget, you might try out curl, which is already installed.
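if a script has to run on machines that may or may not have wget, one approach is to probe for it and fall back to curl. a minimal sketch — the url and output filename are just placeholders taken from the example above:

```shell
# use wget if installed, otherwise curl (pre-installed on mac os x)
url="http://www.mentallandscape.com/C_CatalogMoon.htm"
out="C_CatalogMoon.htm"
if command -v wget >/dev/null 2>&1; then
  fetch() { wget -O "$out" "$url"; }
  tool=wget
else
  fetch() { curl -o "$out" "$url"; }
  tool=curl
fi
echo "would fetch with: $tool"
# call 'fetch' to actually download
```

note that curl has no equivalent of wget's recursive mode, so this fallback only covers single-file downloads.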

making curl behave like wget

  • curl http://url.com/remote.html -o local.html

this writes the page to local.html rather than printing it to the screen.
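curl also has a capital -O flag, which saves the file under its remote name — the part of the url after the last slash — so you don’t have to pick a local filename yourself. the naming rule can be illustrated without touching the network:

```shell
# what curl -O would name the file: the url path's final component
url="http://url.com/remote.html"
fname="${url##*/}"   # strip everything up to and including the last slash
echo "$fname"
```

this prints `remote.html`, which is exactly where `curl -O http://url.com/remote.html` would put the download.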

Reference: http://enure.net/post/article/using-wget-to-grab-all-images-from-a-web-page


wget -P Slides -r -p -nd -t5 -H --domains=.blogger.com,kaspere.blogspot.com http://kaspere.blogspot.com/ -A.jpg,.jpeg,.jpg.1,.jpg.2,.jpeg.1,.jpeg.2 -erobots=off

explanation

  • -P Slides: save everything into the Slides directory.
  • -r: recursive download
  • -p: also fetch page requisites (images, stylesheets) needed to display each page.
  • -nd: no directories; save all files flat instead of recreating the remote hierarchy.
  • -t5: retry each download up to 5 times.
  • -H: span hosts, allowing recursion onto other domains…
  • --domains: …but only onto the listed ones.
  • -A: accept list; the .jpg.1-style suffixes catch duplicate-name downloads.
  • -erobots=off: ignore robots.txt restrictions.


Reference: http://ubuntuforums.org/showthread.php?s=1545eff4caefc2a35f8e94550f8471a2&t=718549&page=2