I’ve been dabbling with Python for several months now, but I’m not quite as proficient with it as I’d like.
I was hacking on some stuff recently and needed to parse the query parameters in a URL. Python’s urlparse splits a URL into its components, but it leaves the querystring as a single unparsed string.
This was the pleasantly easy solution:
from urlparse import urlparse
url = urlparse('http://www.google.com/search?hl=en&safe=off&q=atomized&btnG=Search')
params = dict([part.split('=') for part in url[4].split('&')])
Let’s break the example down a bit.
from urlparse import urlparse
This pulls in Python’s URL parser. (In Python 3, the same function lives in urllib.parse.)
url = urlparse('http://www.google.com/search?hl=en&safe=off&q=atomized&btnG=Search')
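This passes the URL through urlparse, leaving us with a tuple of URL components: scheme, network location, path, parameters, query and fragment. The querystring is the item at index 4.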
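As a small sketch of what that tuple looks like, here is the same call written for Python 3 (assuming the urllib.parse module rather than the urlparse module used in this post). The result also exposes its fields by name, so url.query is the same string as url[4]. The variable names query_by_index and query_by_name are only for illustration.

```python
from urllib.parse import urlparse

url = urlparse('http://www.google.com/search?hl=en&safe=off&q=atomized&btnG=Search')

# The result behaves like a 6-item tuple:
# (scheme, netloc, path, params, query, fragment)
query_by_index = url[4]

# The same fields are also available as named attributes.
query_by_name = url.query

print(url.scheme)      # http
print(url.netloc)      # www.google.com
print(url.path)        # /search
print(query_by_name)   # hl=en&safe=off&q=atomized&btnG=Search
```

Using the attribute name is usually easier to read than remembering that the query sits at index 4.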
url[4].split('&')
This splits the querystring on “&” characters, leaving us with a list of “key=val” strings. The output of this would be:
['hl=en', 'safe=off', 'q=atomized', 'btnG=Search']
[part.split('=') for part in url[4].split('&')]
This is a list comprehension, Python’s way of mapping over a list. It lets us apply an expression to every item of a list; in this case, part.split('=') on each of the “key=value” strings from the previous step. This leaves us with:
[['hl', 'en'], ['safe', 'off'], ['q', 'atomized'], ['btnG', 'Search']]
That is, a list of lists, where the first member of each inner list is the key and the second is the value.
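One caveat: a plain split('=') only yields a clean [key, value] pair when each part contains exactly one “=”. A value that itself contains “=”, a key with no value, or an empty segment would produce inner lists of the wrong length. The following is a minimal sketch of a more tolerant version of this step; split_pairs is a hypothetical helper name, not something from the standard library.

```python
def split_pairs(query):
    """Split a querystring into [key, value] lists.

    Hypothetical helper for illustration: splits each part on the
    first '=' only and skips empty segments.
    """
    pairs = []
    for part in query.split('&'):
        if not part:
            continue  # skip empty segments, e.g. from 'a=1&&b=2'
        # partition splits on the first '=' only; a missing '='
        # leaves the value as an empty string.
        key, _, value = part.partition('=')
        pairs.append([key, value])
    return pairs


pairs = split_pairs('hl=en&q=a=b&&flag')
print(pairs)
```

Every inner list here has exactly two members, so the result can be handed to dict() safely.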
params = dict([part.split('=') for part in url[4].split('&')])
The last part of this is dict(), which turns the list of key/value pairs above into a dictionary:
{'q': 'atomized', 'safe': 'off', 'btnG': 'Search', 'hl': 'en'}
A dictionary is roughly equivalent to a hash map or associative array in other languages. It allows us to access specific keys, such as params['q'].
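For comparison, the standard library can do this step too. The sketch below assumes Python 3, where parse_qs and parse_qsl live in urllib.parse. Unlike the hand-rolled split, these functions also decode percent-escapes and “+” signs in values. The names multi and decoded are only for illustration.

```python
from urllib.parse import urlparse, parse_qs, parse_qsl

url = urlparse('http://www.google.com/search?hl=en&safe=off&q=atomized&btnG=Search')

# parse_qs maps each key to a list of values, because a key
# may appear more than once in a querystring.
multi = parse_qs(url.query)

# parse_qsl returns (key, value) tuples; passing them to dict()
# gives a flat dictionary that keeps the last value for each key.
params = dict(parse_qsl(url.query))

# Values are URL-decoded along the way.
decoded = dict(parse_qsl('q=hello%20world'))

print(multi['q'])     # ['atomized']
print(params['q'])    # atomized
print(decoded['q'])   # hello world
```

The flat dict(parse_qsl(...)) form gives the same result as the one-liner in this post for simple querystrings, while parse_qs is the safer choice when keys can repeat.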