This is a tutorial on using the wget command to download your data.
Wget is one of the most powerful tools available for downloading things from the internet. You can do a lot with it, but the basic use is to download files.
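To download a file, just type
wget http://your-link-to/file
But a download started this way cannot be resumed if it breaks; running the command again starts from scratch. Use the -c option to continue a partially downloaded file: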
wget -c http://your-link-to/file
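If the connection keeps dropping, the -c option can be wrapped in a small retry loop. This is only a minimal sketch: the function name fetch_with_resume and its max_tries and pause parameters are made up for illustration and are not part of wget.

```shell
# Sketch: keep retrying an interrupted download with `wget -c`.
# fetch_with_resume, max_tries and pause are hypothetical names.
fetch_with_resume() {
    local url="$1"
    local max_tries="${2:-5}"   # give up after this many attempts
    local pause="${3:-10}"      # seconds to wait between attempts
    local attempt=1

    while [ "$attempt" -le "$max_tries" ]; do
        # -c continues a partial file instead of starting over
        if wget -c "$url"; then
            return 0
        fi
        echo "Attempt $attempt of $max_tries failed" >&2
        attempt=$((attempt + 1))
        # Only pause if another attempt is coming
        if [ "$attempt" -le "$max_tries" ]; then
            sleep "$pause"
        fi
    done
    return 1
}
```

For example, fetch_with_resume http://your-link-to/file 10 30 would try up to 10 times, waiting 30 seconds between attempts. wget also has its own --tries option, so a loop like this is mostly useful when wget itself gives up and exits.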
You can also make wget identify itself as a web browser using the -U (user agent) option.
This helps when a site doesn’t allow download managers.
wget -c -U Mozilla http://your-link-to/file
Download Entire Website
You can download an entire website using the -r (recursive) option.
wget -r http://your-site.com
But be careful: this follows links recursively (up to 5 levels deep by default) and can pull down the whole site. Since the tool can put a large load on servers, it obeys robots.txt by default. You can mirror a site on your local drive using the -m option.
wget -m http://your-site.com
You can choose how many levels deep wget digs into the site using the -l option.
wget -r -l3 http://your-site.com
This will download only up to 3 levels deep. Suppose you want to download only a subfolder of a website. Use the --no-parent option: with it, wget downloads only that folder and what is below it, and ignores the parent folders.
wget -r --no-parent http://your-site.com/subfldr/subfolder
Now for the terrible ideas. To hell with webmasters who don't allow downloading their website: type this to ignore robots.txt.
wget -r -U Mozilla -erobots=off http://url-to-site/
P.S. I have heard on the net that masquerading as a browser may be a crime in some countries, or something like that. I have not verified it.
Fooling the Webmasters
Do you think the webmaster cannot stop you when you use the above command? To fool him, use
wget -r -U Mozilla -erobots=off -w 5 --limit-rate=20k http://url-to-site/
Here -w 5 tells wget to wait 5 seconds before downloading another file, and --limit-rate=20k caps the download speed at 20 KB/s (without the k suffix the number is read as bytes per second). So you can fool the webmaster…
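The two throttling options can be kept in a small helper so you don't have to retype them. This is just a sketch. The name throttled_mirror and its wait_secs and rate parameters are hypothetical. wget reads a bare number for --limit-rate as bytes per second, so the sketch passes 20k to get 20 KB/s.

```shell
# Sketch: recursive download with a pause between files and a speed cap.
# throttled_mirror, wait_secs and rate are hypothetical names.
throttled_mirror() {
    local url="$1"
    local wait_secs="${2:-5}"   # seconds to wait between files
    local rate="${3:-20k}"      # k = kilobytes per second, m = megabytes

    # -w pauses between requests; --limit-rate caps the bandwidth used
    wget -r -w "$wait_secs" --limit-rate="$rate" "$url"
}
```

Calling throttled_mirror http://url-to-site/ uses the defaults of 5 seconds and 20 KB/s; throttled_mirror http://url-to-site/ 10 50k would be gentler on the waiting and faster on the transfer.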
Download all PDFs
You can download all files of a particular format, such as all the PDFs linked from a web page:
wget -r -l1 -A.pdf --no-parent http://url-to-webpage-with-pdfs/
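The -A option takes a comma-separated list, so the same idea works for several file types at once. Here is a rough sketch. The function name fetch_by_extension is made up for illustration.

```shell
# Sketch: download files with the given extensions linked from one page.
# fetch_by_extension is a hypothetical name.
fetch_by_extension() {
    local url="$1"
    shift
    # Join the remaining arguments with commas, e.g. "pdf,ps"
    local accept
    accept=$(IFS=,; echo "$*")

    # -r -l1: follow links one level deep
    # --no-parent: never climb above the starting folder
    # -A: keep only files whose names match the accept list
    wget -r -l1 --no-parent -A "$accept" "$url"
}
```

For example, fetch_by_extension http://url-to-webpage-with-pdfs/ pdf ps would fetch both the PDF and PostScript files linked from that page.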
The error was “Error: Host key verification failed. Please select another viewer and try again”.
To fix it, remove the stale key for the IP address you are trying to access from known_hosts:
ssh-keygen -f "/home/vaibhav/.ssh/known_hosts" -R 192.168.3.216
OR do the following (sudo is not needed, since the file belongs to you):
gedit ~/.ssh/known_hosts
That will open the file in gedit; now remove the old offending key. I only had one in there, so I just removed all the text from the file and saved.
If you’re not feeling terminally, browse to your home folder and select View > Show Hidden Files from the menu, or hit Ctrl+h. From there open the .ssh folder and open the known_hosts file. Rinse and repeat.
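If you have several keys in the file and don't want to wipe them all, you can delete just the offending line. When you connect with ssh from a terminal, the warning usually includes something like "Offending key in /home/user/.ssh/known_hosts:3", where the number after the colon is the line to remove. The sketch below assumes you have that line number. The function name remove_known_hosts_line is hypothetical.

```shell
# Sketch: delete one line from known_hosts, keeping a backup copy.
# remove_known_hosts_line is a hypothetical name.
remove_known_hosts_line() {
    local line_no="$1"
    local file="${2:-$HOME/.ssh/known_hosts}"

    # Refuse anything that is not a plain number
    case "$line_no" in
        ''|*[!0-9]*)
            echo "line number must be a positive integer" >&2
            return 1
            ;;
    esac

    # Keep the original as known_hosts.bak, then rewrite the file
    # without the given line
    cp "$file" "$file.bak" || return 1
    sed "${line_no}d" "$file.bak" > "$file"
}
```

For example, remove_known_hosts_line 3 removes line 3 of ~/.ssh/known_hosts and leaves the old version in ~/.ssh/known_hosts.bak.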
To install standalone BLAST for Ubuntu 11.04:
For the user manual, you can visit: http://manpages.ubuntu.com/manpages/natty/man1/blast.1.html
Sometimes, while updating Ubuntu from the terminal, you get an error about a missing public key, and the update fails. You can get the key using the command given below. Just replace the key ID with the one shown in the error.
wget -q "http://keyserver.ubuntu.com:11371/pks/lookup?op=get&search=0x4874D3686E80C6B7" -O- | sudo apt-key add -
Visit http://keyserver.ubuntu.com:11371/ to get the public key manually.
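If several keys are missing, picking the IDs out of the error text by hand gets tedious. The sketch below assumes the error lines contain NO_PUBKEY followed by the key ID, which is how apt normally reports it. The function names missing_keys and import_missing_keys are made up for illustration. It fetches keys with apt-key adv rather than the wget pipeline above.

```shell
# Sketch: pull missing key IDs out of apt's output and import them.
# missing_keys and import_missing_keys are hypothetical names.

# Reads apt output on stdin, prints each missing key ID once
missing_keys() {
    grep -o 'NO_PUBKEY [0-9A-Fa-f]*' | awk '{ print $2 }' | sort -u
}

# Reads apt output on stdin and fetches every missing key
import_missing_keys() {
    local key
    for key in $(missing_keys); do
        sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys "$key"
    done
}
```

You could use it like this:
sudo apt-get update 2>&1 | import_missing_keys
and then run the update again.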