Tuesday, 8 November 2011

Python and downloading a file

When you are programming or scripting in Python, there comes a time when you want to get Python to download a file.
This is fairly well documented around the web, though less so in the official Python documentation.
There are two libraries designed to handle web requests (and at least one that handles socket requests). The two are called urllib and urllib2, and urllib2 is the one I would suggest, at least for downloading a file or a web page.
To make a simple start:
import urllib2
url = "http://www.python.com"
request = urllib2.Request(url)
urlobj = urllib2.urlopen(request)
readpage = urlobj.read()
print readpage
First we import the urllib2 library into Python... yeah, I know that you probably know this, but humor me.
Next I define the URL that I would like to download, which is http://www.python.com. I picked this web page simply to show that it is possible.
Then we make a request for this URL (urllib2.Request), and after that we create a URL object with urllib2.urlopen. When that is done, we can read the web page with the object's read method (urlobj.read()) and print it out in the terminal.
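For readers on Python 3, where urllib2 was split into urllib.request and urllib.error, the same steps could look roughly like the sketch below. This is only an assumption about how one might port it, since the script above targets Python 2, and the function name fetch is made up for illustration.

```python
# Sketch for Python 3 (assumption: the script above is Python 2).
from urllib.request import Request, urlopen


def fetch(url):
    """Return the raw bytes that come back for url."""
    request = Request(url)
    # The response object is a context manager, so it is closed
    # for us when the with block ends.
    with urlopen(request) as urlobj:
        return urlobj.read()


# Usage (not run here, it needs network access):
#   readpage = fetch("http://www.python.com")
#   print(readpage.decode("utf-8", errors="replace"))
```

Note that read() returns bytes in Python 3, which is why the usage comment decodes the result before printing it.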
This is fine if we just want to print the page to the terminal, but what if we want to write it to a file? For that we create a new script, which is almost the same:
import urllib2
import shutil
url = "http://www.python.com"
fname = "python.com.html"
request = urllib2.Request(url)
urlobj = urllib2.urlopen(request)
try:
    with open(fname, 'wb') as f:
        shutil.copyfileobj(urlobj, f)
finally:
    urlobj.close()

When we open this file in an editor such as gedit, we see the source of the downloaded page.
We open our local file "python.com.html" with Python's built-in open function in 'wb' mode, where 'w' means the file is opened for writing and 'b' means binary mode. shutil.copyfileobj then copies the data from the URL object into that file, and the finally clause makes sure the URL object is closed even if the copy fails. To read more about the built-in open function, see reference no. 1. We use the with statement to open our local file. I'm not going into depth on the with statement, since effbot.org has already done that; see reference no. 2 to read up on it.
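To see what the with block and shutil.copyfileobj do without touching the network, here is a small sketch that uses an in-memory io.BytesIO object as a stand-in for urlobj. The stand-in and the save_stream helper are assumptions made for the example. copyfileobj only needs an object with a read method, which is why it also works on the result of urlopen.

```python
import io
import shutil


def save_stream(source, fname):
    """Copy everything from a readable binary object into fname."""
    # 'wb' = write + binary: the bytes land on disk unchanged.
    with open(fname, "wb") as f:
        # copyfileobj reads and writes in chunks, so the whole
        # payload never has to sit in memory at once.
        shutil.copyfileobj(source, f)
    # Leaving the with block has already closed f at this point.
    return f.closed


# Hypothetical stand-in for urlobj, holding a few made-up bytes.
fake_response = io.BytesIO(b"not a real download, just bytes")
```

Calling save_stream(fake_response, "demo.bin") would write those bytes to a file named demo.bin (a name picked for the example) and return True, because the with statement has closed the file by then.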

Creating a class to handle our download
Okay, now our code works for downloading a file. But if we are writing a bigger script that needs to download more than one file, it can be useful to put the code in a function, or to make a class that can handle all the downloads.
If you need to read up on Python classes, see reference no. 3. Create a file called "DownloadFile.py"; here is the code:
from urllib2 import Request, urlopen, URLError, HTTPError
from shutil import copyfileobj
from os.path import exists


class DownloadFile:
    def __init__(self, url, tofile):
        self.link = url
        self.filename = tofile
        self.urlobj = None

    def __connect(self):
        # Open the URL; return True on success, False on any error.
        returncode = True
        try:
            request = Request(self.link)
            self.urlobj = urlopen(request)
        except HTTPError as e:
            # HTTPError is a subclass of URLError, so it has to be
            # caught first. There probably needs to be some code
            # here to handle the error code.
            returncode = False
        except URLError as e:
            # There probably needs to be some code here to handle
            # the error.
            returncode = False
        return returncode

    def __write(self):
        # Copy the response into the local file, then close it.
        try:
            with open(self.filename, 'wb') as f:
                copyfileobj(self.urlobj, f)
        finally:
            self.urlobj.close()

    def run(self):
        # Skip the download if the local file is already there.
        if not exists(self.filename):
            if self.__connect():
                self.__write()
                return "Downloaded"
            else:
                return "Failed"
        else:
            return "Exists"

    def info(self):
        # Return the response headers, connecting first if needed.
        # Gives None when the connection cannot be made.
        if self.urlobj is None and not self.__connect():
            return None
        return self.urlobj.info()


if __name__ == "__main__":
    url = "http://meganfoxfans.net/wp-content/uploads/2011/10/Megan_Fox_Picture-coffee.jpg"
    lname = "Megan_Fox_Walking_with_Coffee.jpg"
    print "Will try to download a picture of Megan Fox:",
    print DownloadFile(url, lname).run()
To run it, just do this:
python /path_to_file/DownloadFile.py
It will print something like:
"Will try to download a picture of Megan Fox: Downloaded"

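The except branches in __connect are left as placeholders. One way they could be filled in is sketched below, assuming Python 3, where these exceptions live in urllib.error (in urllib2 they have the same names). The describe_error helper is hypothetical and not part of any library. HTTPError is a subclass of URLError, so it has to be checked first.

```python
from urllib.error import HTTPError, URLError


def describe_error(exc):
    """Turn a urllib exception into a short message for the user."""
    # HTTPError is a subclass of URLError, so test for it first.
    if isinstance(exc, HTTPError):
        # The server answered, but with an error status such as 404.
        return "Server returned HTTP %d (%s)" % (exc.code, exc.reason)
    if isinstance(exc, URLError):
        # No usable answer at all: DNS failure, refused connection, ...
        return "Could not reach the server: %s" % (exc.reason,)
    return "Unexpected error: %r" % (exc,)
```

In __connect, each except block could then print or log describe_error(e) before setting returncode to False, so that a "Failed" result comes with a reason.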
That is it for now. I hope it can be useful to somebody :D And if you can use it, and you meet me someday, just remember to buy me a beer ;)


References:
[1] Python open function
[2] Python with statement
[3] Python classes
