Monday, December 8, 2008

Tutorial for cURL, libcurl, and Python

When troubleshooting websites, success often comes down to which tools you know and how well you can use them. One of the most valuable tools in my internet troubleshooting toolkit is cURL. I usually use curl to emulate a web browser, because it gives me fine-grained control over the traffic sent to the web server.

cURL (or curl) usually refers to the command-line binary built on libcurl. The binary is a very well designed Swiss Army knife for working with HTTP connections. It supports other protocols as well, but for someone just learning curl the HTTP capabilities are the most useful.

The other extremely useful thing about curl is libcurl itself. The library lets you script a lot of HTTP processes. Which language you script in depends on whether a libcurl binding exists for it. Although libcurl is written in C, you do not need to use C.

Bindings for libcurl are available in a number of languages; they are listed at http://curl.haxx.se/libcurl/bindings.html

So far I've used the bindings in PHP, Java, Perl, and Python, so pick the language you are most familiar with and go from there. The best way to learn a new tool is by example.

The following Python code snippet sends a POST request to http://www.google.com/ through an authenticating proxy.

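# Python 2 code: it relies on the StringIO module and the print statement.
import pycurl
import StringIO

# Placeholder proxy settings; replace with your own proxy and credentials.
proxy_host_and_port = 'localhost:8888'
proxy_authentication = 'username:password'

# Collects the response body in memory ("buffer" would shadow a Python 2 builtin).
response_buffer = StringIO.StringIO()

c = pycurl.Curl()
c.setopt(c.URL, 'http://www.google.com/')

# Save cookies to, and load them from, the same file so they persist across runs.
c.setopt(c.COOKIEJAR, 'cookies.txt')
c.setopt(c.COOKIEFILE, 'cookies.txt')

# Send a form-encoded POST body.
c.setopt(c.POST, 1)
c.setopt(c.POSTFIELDS, "User=smith&Password=password")

# Print the request and response headers to stderr for troubleshooting.
c.setopt(c.VERBOSE, 1)
c.setopt(c.REFERER, '')
c.setopt(c.USERAGENT, 'Curl')
c.setopt(c.WRITEFUNCTION, response_buffer.write)

# Certificate checks are disabled here; this is insecure outside of testing.
c.setopt(c.SSL_VERIFYHOST, 0)
c.setopt(c.SSL_VERIFYPEER, False)

# Route the request through the authenticating proxy.
c.setopt(c.PROXY, proxy_host_and_port)
c.setopt(c.PROXYUSERPWD, proxy_authentication)

c.perform()
c.close()

print response_buffer.getvalue()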

Not that useful on its own, but the main things to glean from the code are the use of cookies, the POST method, an authenticating proxy, and getting the results back as a string. Curl can also set the Referer and User-Agent headers to arbitrary values. This whole package is what makes curl so powerful.
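
The POSTFIELDS string in the snippet is written by hand, which only works while the values contain no reserved characters such as & or =. As a minimal sketch, assuming you would rather let the standard library do the escaping, the body could be built with urlencode. The helper name build_post_body and the sample credentials are made up for illustration and are not part of pycurl.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2


def build_post_body(fields):
    """Return a form-encoded string suitable for the POSTFIELDS option."""
    # A list of (name, value) pairs keeps the field order stable.
    return urlencode(fields)


# Hypothetical credentials; the password contains characters that would
# corrupt a hand-built string.
body = build_post_body([("User", "smith"), ("Password", "p&ss word=1")])
print(body)
```

The resulting string can be passed to c.setopt(c.POSTFIELDS, body) in place of the literal used earlier.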

3 comments:

Yogesh said...

I downloaded pycurl-7.19.0 and tried to install it.

I came across the following issue.

When I first tried to install it, it gave me this error:

C:\Documents and Settings\Administrator\Desktop\pycurl\pycurl-7.19.0>setup.py install
Using curl directory: c:/python25/Lib/site-packages/pycurl/curl-7.16.2.1
Traceback (most recent call last):
  File "C:\Documents and Settings\Administrator\Desktop\pycurl\pycurl-7.19.0\setup.py", line 69, in <module>
    assert os.path.isdir(CURL_DIR), "please check CURL_DIR in setup.py"
AssertionError: please check CURL_DIR in setup.py


I modified the CURL_DIR path and tried again, and it gave this error:

C:\Documents and Settings\Administrator\Desktop\pycurl\pycurl-7.19.0>setup.py install
Using curl directory: c:/python25/Lib/site-packages/
Traceback (most recent call last):
  File "C:\Documents and Settings\Administrator\Desktop\pycurl\pycurl-7.19.0\setup.py", line 210, in <module>
    assert os.path.isfile(o), o
AssertionError: c:/python25/Lib/site-packages/lib\libcurl.lib

Most Python libraries go into site-packages, so I tried that, but I could not understand what is going wrong.
I am using Python 2.5.

Any help is appreciated.
You can mail me at yogesh12585[at]gmail[dot]com

regards,
Yogesh

Seo online said...

I think it's the best.

Limos said...

It's great work.