My brother wrote about calculating distances with PostGIS. I followed up with some suggestions, but WordPress seems to be eating my comments (strip_tags() is mutilating them, stripping anything after a “<”). So I’m following up here.
Most of my first comment got eaten; I went on to optimize the query significantly. It should have a WHERE s.zip = 10001, which makes it O(n+1). Obviously, there’s no point in an unconstrained O(nn) JOIN, since we only care about the distance of one zip to the others.
What I described was an optimization where you use the JOIN to narrow the resultset to an approximate area, then refine it with distance_sphere(). So you tighten the focus of the expensive operation to the results which are most likely to be relevant.
In other words, if you’re looking for something in Delaware, don’t bother with Nevada.
The magic here is that the average distance of one degree of lat/lon is 69.2 miles. So you can:
SELECT d.* FROM `zip` s JOIN `zip` d WHERE s.`zip` = 10001 AND (ABS(s.`lat` - d.`lat`) < = (15 / 34.6)) AND (ABS(s.`lon` - d.`lon`) < = (15 / 34.6)) AND distance_sphere(s.`geom`, d.`geom`) < = (15 * 1609.344);
So the first WHERE clause narrows it to one row in the left zip table, then the next two narrow the results in the right table to anything within a 15×15 mile square around the source, then distance_sphere() operates on this much smaller dataset. You’d likely need to EXPLAIN that to get the correct order – some conditions may need to go into ON rather than WHERE.
He’s right about the index, though, it wouldn’t help.
Discussion