Parsing URL query parameters in Python
I’ve been dabbling with Python for several months now, but I’m not quite as proficient with it as I’d like.
I was hacking on some stuff recently, and needed to parse the query parameters in a URL. Python’s urlparse splits a URL into its components, but it hands back the querystring as a single unparsed string.
This was the pleasantly easy solution:
from urlparse import urlparse
url = urlparse('http://www.google.com/search?hl=en&safe=off&q=atomized&btnG=Search')
params = dict([part.split('=') for part in url[4].split('&')])
What’s going on here
Let’s break the example down a bit.
-
from urlparse import urlparse
This pulls in Python’s URL parser.
-
url = urlparse('http://www.google.com/search?hl=en&safe=off&q=atomized&btnG=Search')
This passes the URL through urlparse, leaving us with a six-item tuple of URL components. The querystring is the item at index 4.
-
url[4].split('&')
This splits the querystring on “&” characters, leaving us with a list of “key=val” strings. The output of this would be:
['hl=en', 'safe=off', 'q=atomized', 'btnG=Search']
-
[part.split('=') for part in url[4].split('&')]
This is a Python list comprehension. It lets us apply an expression to every item of a list, in this case part.split('=') to each of the “key=value” strings from the previous step. This leaves us with:
[['hl', 'en'], ['safe', 'off'], ['q', 'atomized'], ['btnG', 'Search']]
That is, a list of lists, where the first member of each inner list is the key and the second is the value.
-
params = dict([part.split('=') for part in url[4].split('&')])
The last part of this is dict(), which turns the nested structure above into a dictionary:
{'q': 'atomized', 'safe': 'off', 'btnG': 'Search', 'hl': 'en'}
This is roughly equivalent to a hash map or associative array. It allows us to access specific keys, such as params['q'].
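The steps above can be run one at a time to see each intermediate value. This is only a sketch, and it assumes Python 3, where urlparse lives in urllib.parse rather than in the urlparse module used in the post. The variable names are mine. The last line compares the result with parse_qsl from the standard library, which does the same splitting and also decodes percent-escapes.

```python
from urllib.parse import urlparse, parse_qsl

url = urlparse('http://www.google.com/search?hl=en&safe=off&q=atomized&btnG=Search')

# Step 1: the querystring is item 4 of the result (also exposed as .query).
query = url[4]

# Step 2: split the querystring into "key=value" strings.
pairs = query.split('&')

# Step 3: split each "key=value" string into a [key, value] list.
key_values = [part.split('=') for part in pairs]

# Step 4: build a dictionary from the [key, value] lists.
params = dict(key_values)

# For comparison: the standard library's own querystring parser.
stdlib_params = dict(parse_qsl(url.query))

print(query)
print(pairs)
print(key_values)
print(params['q'])
print(params == stdlib_params)
```

For this particular URL both approaches give the same dictionary. They can differ once values contain percent-escapes, repeated keys, or blank values.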

June 3rd, 2008 at 1:15 am
Aren’t URLs parsed from HTML actually supposed to use “&amp;” instead of just the & in URLs?
June 3rd, 2008 at 8:23 am
When they’re output in HTML or XML, yes, they need to be escaped. You want to avoid escaping them until it’s time to output them.
Ideally, this stuff is transparent. The escaping is only a property of HTML and XML, so when you get values out, you get the unescaped original value.
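To illustrate the point about escaping only at output time, here is a small sketch. It assumes Python 3 and its html module, and the link markup is made up for the example. The URL is kept in its raw form while it is being worked with, and & only becomes &amp; when it is written into HTML.

```python
from html import escape, unescape

# Keep the raw URL around while working with it.
url = 'http://www.google.com/search?hl=en&safe=off&q=atomized&btnG=Search'

# Escape only at the moment the URL is written into HTML.
link = '<a href="%s">search</a>' % escape(url)
print(link)

# Reading the attribute back out of HTML reverses the escaping.
print(unescape(escape(url)) == url)
```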
June 27th, 2008 at 6:26 am
Thanks for that article.
Can somebody point me to the same thing, but for PHP?
Regards
Dimi
June 27th, 2008 at 8:35 am
PHP does it for you: $_GET.
July 10th, 2008 at 10:55 pm
It would be better if
part.split('=')
were changed to
part.split('=', 1).
Thanks a lot.
July 11th, 2008 at 9:04 am
That’s a good tip, and it should improve performance a bit. Any actual = in the strings would get URL encoded to %3D.
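A quick sketch of what the second argument to split changes, assuming Python 3. The querystring here is invented for the example: one value carries an encoded = (%3D) and one carries a literal =, which some clients do send unencoded. The unquote_plus step is my addition, since neither split variant decodes %3D by itself.

```python
from urllib.parse import unquote_plus

# Hypothetical querystring: an encoded "=" in q, a literal "=" in filter.
query = 'q=a%3Db&filter=size=large'

# Without a split limit, 'filter=size=large' becomes three items,
# and dict() rejects anything that is not a pair.
try:
    dict(part.split('=') for part in query.split('&'))
    unlimited_split_failed = False
except ValueError:
    unlimited_split_failed = True

# Splitting only on the first "=" keeps the rest of the value intact.
raw = dict(part.split('=', 1) for part in query.split('&'))

# Percent-escapes still need decoding as a separate step.
decoded = {unquote_plus(k): unquote_plus(v) for k, v in raw.items()}

print(unlimited_split_failed)
print(raw)
print(decoded)
```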
July 30th, 2008 at 11:11 pm
You don’t need the square brackets in the dict expression. This works as well, and it’s slightly more efficient since it uses a generator expression:
params = dict(part.split('=') for part in url[4].split('&'))
No spam intended but if you are “just getting into python” check this article a friend wrote that is basically a dense list of python tricks and hacks http://www.siafoo.net/article/52
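For anyone comparing the two spellings, a minimal sketch (Python 3 assumed, variable names mine). With the square brackets, the full list of pairs is built first and then handed to dict(). Without them, dict() pulls pairs from a generator one at a time. The resulting dictionary is the same.

```python
query = 'hl=en&safe=off&q=atomized&btnG=Search'

# List comprehension: builds the whole list of pairs, then passes it to dict().
from_list = dict([part.split('=') for part in query.split('&')])

# Generator expression: dict() consumes the pairs as they are produced.
from_generator = dict(part.split('=') for part in query.split('&'))

print(from_list == from_generator)
```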
August 6th, 2008 at 10:22 am
No urlparse needed:
def parse_qs(u):
    return '?' in u and dict(p.split('=') for p in u[u.index('?') + 1:].split('&')) or {}
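The same idea, parsing without urlparse, can be spelled out a little more defensively. This is a sketch rather than a drop-in replacement. It assumes Python 3, parse_query is a hypothetical name, and the choices to drop any #fragment and to skip parts without an = are mine.

```python
def parse_query(url):
    """Return the query parameters of url as a dict (hypothetical helper)."""
    # Drop any "#fragment" first, then take everything after the first "?".
    without_fragment = url.partition('#')[0]
    query = without_fragment.partition('?')[2]
    # Skip parts with no "=" so that an empty or odd querystring
    # gives {} instead of raising ValueError.
    return dict(part.split('=', 1) for part in query.split('&') if '=' in part)


print(parse_query('http://www.google.com/search?hl=en&q=atomized#top'))
print(parse_query('http://www.google.com/'))
```

str.partition always returns three items, so there is no need to test for “?” up front or to fall back on the and/or idiom.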