Title: cookielib Example
Submitter: Michael Foord
Last Updated: 2004/12/28
Version no: 1.1
Category: Web
2 vote(s)
Description:
cookielib is a library new to Python 2.4.
Prior to Python 2.4 it existed as ClientCookie, but it's not a drop-in replacement: some of the functionality of ClientCookie has been moved into urllib2.
This example shows code for fetching URIs (with cookie handling, including loading and saving) that will work unchanged on:
a machine with python 2.4 (and cookielib)
a machine with ClientCookie installed
a machine with neither
(Obviously on the machine with neither the cookies won't be handled or saved).
Where either cookielib or ClientCookie is available the cookies will be saved in a file.
If that file exists already the cookies will first be loaded from it.
The file format is a useful, human-readable plain text format, and the attributes of each cookie are accessible in the CookieJar instance (once loaded).
This may be helpful to those just using ClientCookie, as the ClientCookie documentation doesn't appear to document the LWPCookieJar class, which is needed for saving and loading cookies.
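The save/load round trip described above can be tried without any network access. The following is a minimal sketch, not part of the original recipe: the make_cookie helper, the cookie values, the example.com domain and the temporary directory are all illustrative assumptions. It falls back to http.cookiejar, the Python 3 name for cookielib, so it should run on either version.

```python
import os
import tempfile
import time

try:
    import cookielib                      # Python 2.4 to 2.7
except ImportError:
    import http.cookiejar as cookielib    # the same module under its Python 3 name


def make_cookie(name, value, domain, expires):
    """Hypothetical helper: build a Cookie by hand so no server is needed."""
    return cookielib.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path='/', path_specified=True,
        secure=False, expires=expires, discard=False,
        comment=None, comment_url=None, rest={})


# Save into a scratch directory rather than the recipe's 'cookies.lwp'.
cookie_path = os.path.join(tempfile.mkdtemp(), 'cookies.lwp')

jar = cookielib.LWPCookieJar()
jar.set_cookie(make_cookie('session_id', 'abc123', 'www.example.com',
                           int(time.time()) + 3600))
jar.save(cookie_path)

# The file is plain text and starts with a recognisable header line.
f = open(cookie_path)
first_line = f.readline().strip()
f.close()

# Load the file into a fresh jar and read the cookie attributes back.
loaded = cookielib.LWPCookieJar()
loaded.load(cookie_path)
attributes = [(c.name, c.value, c.domain, c.path) for c in loaded]
print(first_line)
print(attributes)
```

Each item yielded by iterating over the jar is a Cookie object, so attributes such as name, value, domain, path and expires can be read directly.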
Source:
COOKIEFILE = 'cookies.lwp'          # the file the cookies are saved in

import os.path

cj = None
ClientCookie = None
cookielib = None

# First try cookielib (Python 2.4), which works together with urllib2.
try:
    import cookielib
except ImportError:
    pass
else:
    import urllib2
    urlopen = urllib2.urlopen
    cj = cookielib.LWPCookieJar()   # a CookieJar that can load and save
    Request = urllib2.Request

# If cookielib isn't available, fall back to ClientCookie.
if not cookielib:
    try:
        import ClientCookie
    except ImportError:
        # Neither library is available: plain urllib2, no cookie handling.
        import urllib2
        urlopen = urllib2.urlopen
        Request = urllib2.Request
    else:
        urlopen = ClientCookie.urlopen
        cj = ClientCookie.LWPCookieJar()
        Request = ClientCookie.Request

# If we have a cookie jar, load any saved cookies and install an opener
# that uses the jar for every call to urlopen.
if cj != None:
    if os.path.isfile(COOKIEFILE):
        cj.load(COOKIEFILE)
    if cookielib:
        opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
        urllib2.install_opener(opener)
    else:
        opener = ClientCookie.build_opener(ClientCookie.HTTPCookieProcessor(cj))
        ClientCookie.install_opener(opener)

theurl = 'http://www.diy.co.uk'
txdata = None                       # set this to encoded form data to POST
txheaders = {'User-agent' : 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'}

try:
    req = Request(theurl, txdata, txheaders)
    handle = urlopen(req)
except IOError, e:
    print 'We failed to open "%s".' % theurl
    if hasattr(e, 'code'):
        print 'We failed with error code - %s.' % e.code
else:
    print 'Here are the headers of the page :'
    print handle.info()

print
if cj == None:
    print "We don't have a cookie library available - sorry."
    print "I can't show you any cookies."
else:
    print 'These are the cookies we have received so far :'
    for index, cookie in enumerate(cj):
        print index, ' : ', cookie
    cj.save(COOKIEFILE)             # save the cookies again
Discussion:
We can always tell which import was successful:
If we are using cookielib then cookielib != None.
If we are using ClientCookie then ClientCookie != None.
If we are using neither then cj == None.
Request is the name to use to create Request objects, and urlopen is the function to use to open URLs.
Both names will be bound to the appropriate object whichever library is being used.
*WHY*
I'm writing a cgi-proxy called approx.py (see www.voidspace.org.uk/atlantibots/pythonutils.html#cgiproxy ).
It remotely fetches webpages for those in a restricted internet environment.
If ClientCookie is available it will handle cookies (and works well) - including loading/saving a different set of cookies for each user.
My server has Python 2.2, but I'd like the script to function well on machines with Python 2.4 or without ClientCookie at all.
This code creates a CookieJar and installs an opener with an HTTPCookieProcessor as the default for urlopen, if these are available.
Otherwise calls to urlopen work as normal.
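To see what the installed cookie processor does, it can be exercised offline. This is only a sketch under stated assumptions: the hand-made cookie and the example.com URL are invented for illustration, and calling the handler's http_request method directly stands in for what the opener does before it sends a request. The imports fall back to the Python 3 names for cookielib and urllib2.

```python
try:
    import cookielib
    import urllib2
except ImportError:
    import http.cookiejar as cookielib    # Python 3 name for cookielib
    import urllib.request as urllib2      # Python 3 home of the urllib2 API

cj = cookielib.CookieJar()
processor = urllib2.HTTPCookieProcessor(cj)
opener = urllib2.build_opener(processor)
urllib2.install_opener(opener)            # urlopen() now goes through this opener

# Illustrative cookie, as if www.example.com had set it earlier.
cj.set_cookie(cookielib.Cookie(
    version=0, name='session_id', value='abc123',
    port=None, port_specified=False,
    domain='www.example.com', domain_specified=False, domain_initial_dot=False,
    path='/', path_specified=True,
    secure=False, expires=None, discard=True,
    comment=None, comment_url=None, rest={}))

# Before a request is sent, the processor adds any matching cookies to it.
req = urllib2.Request('http://www.example.com/')
req = processor.http_request(req)
sent_header = req.get_header('Cookie')
print(sent_header)
```

A request to a different host would get no Cookie header, because the jar only returns cookies whose domain and path match the request.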
If the example works as it should then you'll see some page headers printed and then the cookie that the server sent you.
This should then be saved to a file 'cookies.lwp'.
(Of course you may need to install ClientCookie.)
This example also illustrates using Request objects, headers and so on to fetch webpages.
Number of comments: 9
backporting cookielib, Ian Bicking, 2004/09/01
Is cookielib backward compatible to older versions of Python? Or can it be ported if not? This seems easier than dealing with both ClientCookie and cookielib.
Backporting cookielib, Michael Foord, 2004/09/01
The new cookielib uses a modified urllib2 - so it's not as straightforward as just making cookielib available. ClientCookie also has various other 'goodies' that weren't included in cookielib - which is another reason for someone still wanting to use ClientCookie rather than cookielib.
Having said that... it still might be possible. I already have ClientCookie installed on the server I use, and am more than happy with it. The above chunk of code means my script will run with the same functionality on a machine with Python 2.4 *and* will work fine on a machine with neither.
This code is magnificent and just works as it should be :), Nikos Kouremenos, 2004/09/03
and you shouldn't mind the whole 2 libs being checked.
Nicely done.
some small proposals:
i) very often one needs to spoof the Referer header, so you could have added that header too, apart from adding the User-agent:
txheaders = {'User-agent' : 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7)', 'Referer' : refererUrl}
ii) even though you describe it, you don't say *exactly* how it is done if you want to POST and not GET. Something like this in a comment could be better [not sure though, you decide]:
params = {'DomainNumber':'0', 'PhoneNo':PHONE_NO, 'Password':PASSWD}
txdata = urllib.urlencode(params)
Anyway, excellent code [I voted for you 5 out of 5]. I just put the above here in case anyone was wondering [as I did].
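The two suggestions in this comment can be combined into one request. The sketch below is an illustration only: the URL, the referer and the form values are placeholders (the comment leaves refererUrl, PHONE_NO and PASSWD undefined), and nothing is actually sent. The imports fall back to the Python 3 locations of urllib2 and urlencode.

```python
try:
    import urllib2
    from urllib import urlencode
except ImportError:
    import urllib.request as urllib2      # Python 3 home of the urllib2 API
    from urllib.parse import urlencode

# Placeholder values, assumed purely for the sake of the example.
theurl = 'http://www.example.com/login'
referer_url = 'http://www.example.com/'
PHONE_NO = '5551234'
PASSWD = 'secret'

# A list of pairs keeps the fields in a predictable order.
params = [('DomainNumber', '0'), ('PhoneNo', PHONE_NO), ('Password', PASSWD)]
txdata = urlencode(params).encode('ascii')   # Python 3 needs bytes for the body

txheaders = {
    'User-agent': 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7)',
    'Referer': referer_url,
}

# Supplying data turns the request into a POST; with data=None it is a GET.
req = urllib2.Request(theurl, txdata, txheaders)
method = req.get_method()
print(method)
print(req.get_header('Referer'))
```

Passing req to urlopen would then submit the form, with cookie handling if an opener with a cookie processor has been installed.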
Thanks, Michael Foord, 2004/09/06
Thanks for the appreciation !
I also like your additional examples.... - Fuzzy
Typo?, Mikael Norgren, 2004/10/03
Think there's a lil' typo in the article.
Shouldn't Request = urlib2.Request be Request = urllib2.Request (urlib2 -> urllib2)?
yes, it's a typo.., Nikos Kouremenos, 2004/12/17
and it became obvious to me too while using Python 2.4 :)
Oops.., Michael Foord, 2004/12/28
Sorry about that... typos belatedly corrected.
Empty cookies.lwp file when save() called, Alen Ribic, 2007/03/09
Hi Michael,
When running cookie_example.py my cookies.lwp gets updated, but it only has the following line in it: "#LWP-Cookies-2.0". I checked the log and I do see the output for: "for index, cookie in enumerate(cj): print index, ' : ', cookie".
Any ideas why the file would be writing just "#LWP-Cookies-2.0" on first line and not the cookie entries?
Regards,
-Alen
session cookies, Vladimir Cambur, 2007/07/06
If there are only session cookies you won't see them in cookies.lwp,
because by default session cookies are not saved.
If you pass ignore_discard=True to save() then they will be saved.
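The difference is easy to reproduce offline. This is a minimal sketch with invented cookies (the make_cookie helper, names and domain are assumptions), using the Python 3 module name as a fallback. Note that load() skips session cookies by default as well, so ignore_discard=True is passed there too.

```python
import os
import tempfile
import time

try:
    import cookielib
except ImportError:
    import http.cookiejar as cookielib    # Python 3 name for cookielib


def make_cookie(name, expires):
    """Hypothetical helper: expires=None makes a session (discard) cookie."""
    return cookielib.Cookie(
        version=0, name=name, value='x',
        port=None, port_specified=False,
        domain='www.example.com', domain_specified=False,
        domain_initial_dot=False,
        path='/', path_specified=True,
        secure=False, expires=expires, discard=(expires is None),
        comment=None, comment_url=None, rest={})


jar = cookielib.LWPCookieJar()
jar.set_cookie(make_cookie('session', None))
jar.set_cookie(make_cookie('persistent', int(time.time()) + 3600))

tmpdir = tempfile.mkdtemp()

# Default save: the session cookie is left out of the file.
default_path = os.path.join(tmpdir, 'default.lwp')
jar.save(default_path)
reloaded = cookielib.LWPCookieJar()
reloaded.load(default_path)
default_names = sorted(c.name for c in reloaded)

# ignore_discard=True on both save() and load() keeps the session cookie.
full_path = os.path.join(tmpdir, 'full.lwp')
jar.save(full_path, ignore_discard=True)
reloaded = cookielib.LWPCookieJar()
reloaded.load(full_path, ignore_discard=True)
all_names = sorted(c.name for c in reloaded)

print(default_names)
print(all_names)
```

If a server sends only session cookies, the default save therefore writes just the "#LWP-Cookies-2.0" header line, which matches the empty file reported in the previous comment.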