Archive | June 2011

Download using wget command


This is a tutorial on using the wget command to download your data.

Basics

Wget is one of the most powerful tools available for downloading stuff from the internet. You can do a lot of things with it, but the basic use is to download files.

To download a file just type

wget http://your-url-to/file

But a plain wget does not pick up a broken download where it left off. Use the -c option to make downloads resumable.

wget -c http://your-link-to/file

You can also make wget identify itself as a web browser using -U.
This helps when a site doesn’t allow download managers.

wget -c -U Mozilla http://your-link-to/file
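On a flaky connection wget can still give up before the file is complete. Here is a minimal sketch of a retry loop around wget -c. The function name resume_download, the WGET override variable and the default numbers are all made up for illustration; they are not part of wget.

```shell
# Sketch: keep retrying a resumable download until it succeeds.
# resume_download, WGET and the defaults below are hypothetical names/values.
resume_download() {
    url=$1
    max_tries=${2:-5}   # assumed default: give up after 5 attempts
    pause=${3:-10}      # assumed default: wait 10 seconds between attempts
    try=1
    while [ "$try" -le "$max_tries" ]; do
        # -c continues a partially downloaded file instead of starting over
        if "${WGET:-wget}" -c "$url"; then
            return 0
        fi
        echo "attempt $try of $max_tries failed" >&2
        try=$((try + 1))
        # only pause if another attempt is coming
        if [ "$try" -le "$max_tries" ]; then
            sleep "$pause"
        fi
    done
    return 1
}

# Usage (placeholder URL from the examples above):
# resume_download http://your-link-to/file
```

Setting WGET to some other command is only there so the loop can be tried out without touching the network.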

Download Entire Website

You can download an entire website using the -r option.

wget -r http://your-site.com

But be careful: a recursive download can pull in a huge amount of data (by default -r follows links up to 5 levels deep). Since this can put a large load on servers, wget obeys robots.txt. You can mirror a site on your local drive using the -m option.

wget -m http://your-site.com

You can choose how many levels deep wget digs into the site using the -l option.

wget -r -l3 http://your-site.com

This will download only up to 3 levels. Suppose you want to download only the sub folders under a website URL: use the --no-parent option. With this option wget never climbs up to the parent folder, so it downloads only what sits under the folder you gave it.

wget -r --no-parent http://your-site.com/subfldr/subfolder

Now coming to terrible ideas: to hell with webmasters who do not allow downloading their website. To ignore robots.txt, type

wget -r -U Mozilla -erobots=off http://url-to-site/ 

P.S. Masking as a browser is a crime in some countries... or something like that, I have heard on the net.

Fooling the Webmasters

Do you think the webmaster cannot stop you with the above command? To fool him, use

wget -r -U Mozilla -erobots=off -w 5 --limit-rate=20k http://url-to-site/

Here -w 5 instructs wget to wait 5 seconds before downloading another file, and --limit-rate=20k caps the download speed at 20 KB/s (without the k suffix the number is read as bytes per second). So you can fool the webmaster...

Download all PDFs

You can download all files of a particular format, like all PDFs listed on a webpage:

wget -r -l1 -A.pdf --no-parent http://url-to-webpage-with-pdfs/
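The -A option also accepts a comma-separated list, so several formats can be grabbed in one go. A small sketch, assuming a made-up helper called fetch_by_type (the WGET override is also hypothetical, there only for a dry run):

```shell
# Sketch: download several file types linked from one page.
# fetch_by_type and WGET are hypothetical names, not wget features.
fetch_by_type() {
    url=$1
    shift
    # Join the remaining arguments with commas: pdf djvu -> pdf,djvu
    accept=$(IFS=,; echo "$*")
    # Same options as the command above, with the joined list passed to -A
    "${WGET:-wget}" -r -l1 --no-parent -A "$accept" "$url"
}

# Usage (placeholder URL from the example above):
# fetch_by_type http://url-to-webpage-with-pdfs/ pdf djvu
```

Running it with WGET=echo prints the options it would pass instead of downloading anything.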


Error: Host key verification failed. Please select another viewer and try again


The Error

The error was “Error: Host key verification failed. Please select another viewer and try again”.

The Fix

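ssh-keygen -f "/home/vaibhav/.ssh/known_hosts" -R 192.168.3.216

This removes the old key stored for the IP address you are trying to access, so a fresh one gets recorded the next time you connect. Use your own known_hosts path and IP address.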

Or do the following:

gedit ~/.ssh/known_hosts

That will open the file in gedit. Now remove the old offending key. I only had one in there, so I just removed all text from the file and saved.

If you’re not feeling terminally, browse to your home folder and select View > Show Hidden Files from the menu, or hit Ctrl+h. From there open the .ssh folder and open the known_hosts file. Rinse and repeat.
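If the file holds more than one key, wiping all of it is overkill. The SSH warning usually names the offending line, something like "Offending key in /home/vaibhav/.ssh/known_hosts:3", and that one line can be deleted on its own. A rough sketch, assuming a made-up helper name remove_known_hosts_line; it keeps a .bak copy first:

```shell
# Sketch: delete one line from known_hosts by its line number.
# remove_known_hosts_line is a hypothetical helper, not an OpenSSH tool.
remove_known_hosts_line() {
    line=$1
    file=${2:-$HOME/.ssh/known_hosts}
    # Accept only a positive whole number as the line number
    case $line in
        ''|*[!0-9]*|0) echo "line must be a positive number" >&2; return 1 ;;
    esac
    # Keep a backup, then rewrite the file without that line
    cp "$file" "$file.bak" || return 1
    if ! sed "${line}d" "$file.bak" > "$file"; then
        # Put the original back if sed failed
        cp "$file.bak" "$file"
        return 1
    fi
}

# Usage, for a warning that ends in known_hosts:3
# remove_known_hosts_line 3
```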

Install standalone BLAST for Ubuntu 11.04


To install standalone BLAST for Ubuntu 11.04:

Go to the Ubuntu Software Center, type “Basic Local Alignment Search Tool” or “blast2” & click on Install.

For the user manual, you can visit: http://manpages.ubuntu.com/manpages/natty/man1/blast.1.html

Verify Java Plugin/JRE Installation For Browser.


You can go to http://www.java.com/en/download/installed.jsp & it will show the status of your Java installation, or any upgrade available for it.

Apt-get GPG Error: Public Key Not Available in Ubuntu


Sometimes, while updating Ubuntu from the terminal, you get an error about a missing public key and the update fails. You can fetch the key using the command given below. Just replace the key ID (4874D3686E80C6B7 here) with the one from your error.

wget -q "http://keyserver.ubuntu.com:11371/pks/lookup?op=get&search=0x4874D3686E80C6B7" -O- | sudo apt-key add -

or

visit http://keyserver.ubuntu.com:11371/ to get the public key manually
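The key ID is the string that follows NO_PUBKEY in apt's error message. If you don't want to copy it by hand, here is a small sketch that pulls it out and builds the keyserver URL used above. The function name keyserver_url is made up for illustration, and it assumes the usual 16-character hexadecimal ID.

```shell
# Sketch: turn apt's "NO_PUBKEY <id>" error text into a keyserver lookup URL.
# keyserver_url is a hypothetical helper name.
keyserver_url() {
    # Pick out the 16 hex digits that follow NO_PUBKEY (first match only)
    key=$(printf '%s\n' "$1" |
        sed -n 's/.*NO_PUBKEY \([0-9A-Fa-f]\{16\}\).*/\1/p' |
        head -n 1)
    # Fail if the text did not contain a key ID
    [ -n "$key" ] || return 1
    echo "http://keyserver.ubuntu.com:11371/pks/lookup?op=get&search=0x$key"
}

# Usage: paste the error text, then feed the URL to the command shown earlier
# url=$(keyserver_url "... NO_PUBKEY 4874D3686E80C6B7")
# wget -q "$url" -O- | sudo apt-key add -
```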
