Posts Tagged ‘qgis’

Topology cleaning with PostGIS

Monday, November 21st, 2011

An early tester of the new PostGIS Topology submitted an interesting dataset which kept me busy for a couple of weeks fixing a bunch of bugs related to numerical stability/robustness.

Finally, the ST_CreateTopoGeo function succeeded and imported the dataset as a proper topological schema. Here’s what it looks like:

Edges of the built topology

At first glance it doesn’t seem to be particularly problematic. Here’s the composition summary:

=# select topologysummary('small_sample_topo');
 Topology small_sample_topo (2042), SRID 0, precision 0
 83 nodes, 156 edges, 74 faces, 0 topogeoms in 0 layers

But the devil hides at high zoom levels. Where to zoom? What are we looking for?

We are guaranteed that none of the constructed edges cross, so the only leftover problem we might encounter is very small faces constructed wherever the original input had small overlaps or underlaps (gaps). We can get a visual signal of those faces by creating a view showing faces with an area below a given threshold. Let’s do that:

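-- Faces whose area falls below a threshold (0.1 square units here).
CREATE VIEW small_sample_topo.small_areas AS
SELECT face_id,
       ST_GetFaceGeometry('small_sample_topo', face_id) AS geom
FROM small_sample_topo.face
WHERE face_id > 0  -- skip face 0, the universal (unbounded) face
AND ST_Area(ST_GetFaceGeometry('small_sample_topo', face_id)) < 0.1;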

That view lets us see where the faces with an area below 0.1 square units are. And here’s QGIS showing it:

Areas smaller than 0.1 square units

Now we know where to zoom, and also the ID of the offending faces.

Let’s zoom in and show some labels:

Detail of small area

You can now see that face 59 is bounded by (among others) edges 130 and 129. Just get rid of one of them to assign the area to an adjacent face.

We drop edge 130 using ST_RemEdgeModFace, passing it the topology name and the edge id, so that the area is assigned to face 52. Here’s the result:

Area after cleanup

Cleaning further would require removing more edges until all the small faces are gone. There’s a lot of room for automating such processes. The good news is that you can now build your own automation around your specific use cases while still relying on robust and standard foundations.

Note that the whole process I described here only involved the geometrical/topological level and didn’t affect the semantic/feature level at all. If we had TopoGeometry objects defined by the faces, we’d also know which small faces were part of overlaps or underlaps and could then act accordingly by adding or removing faces to the definition of the appropriate TopoGeometry object. Such a step would have been required for the overlap situations, as the ST_RemEdgeModFace function doesn’t let you change the shape of a defined TopoGeometry.

Unfortunately the semantic level is lost when using the ISO functions, as the whole ISO topology model doesn’t deal with features at all. This is why I think PostGIS would benefit from having a function that converts your simple features into topologically-defined features by adding any missing primitive to a topology schema and constructing the feature for you. Such a function is only waiting for sponsorship to become a reality in PostGIS 2.0. If you like what we’re building for your data integrity, please consider supporting the effort!

Getting just the tip of a remote git branch

Tuesday, June 7th, 2011

As projects move their code under git control, people get frustrated about being unable to do the most basic operations they are used to performing with SVN or CVS. That’s a fact, so let’s see if I can relieve some pain by sharing what I know or learn as I crawl up the learning curve myself.

Yesterday I met with Markus Neteler and he was complaining about being unable to check out the release branch of QuantumGIS without filling up his laptop hard drive. He got me curious, so here are some numbers and a recipe.

An SVN checkout as of April 30 is 281 MB in size, 1 of which is the .svn directory.

A full git clone at the time of writing (June 7) is 330 MB in size, 200 of which are the .git database and 130 the working copy (the “checkout”).

A full git clone contains all the data available in the original repository. Once you get the clone, you have all the branches and all the history. No need for any more bandwidth.

But Markus was only interested in a single branch, not the whole set, and he wanted no history either. So he could clone just the objects referenced by the tip commit of the release-1_7_0 branch and no further parents (back history). Here’s how you do it:

 git clone --depth 1 --branch release-1_7_0 \

The resulting shallow repository (the .git directory) is 110 MB in size. Add 133 MB of working directory (yes, release-1_7_0 is 3 MB bigger than master) for a total of 243 MB of disk space used.
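To see the effect without touching the network, here is a minimal sketch that builds a throwaway local repository and takes a depth-1 clone of a single branch from it. The temporary paths, the demo identity and the two dummy commits are assumptions made up for illustration; only the branch name is borrowed from the recipe. Note the file:// URL: given a plain local path, git ignores --depth.

```shell
#!/bin/sh
set -eu

tmp=$(mktemp -d)
origin="$tmp/origin"   # hypothetical stand-in for the real remote repository

# Build a tiny "remote" holding two commits on a release branch.
git init -q "$origin"
git -C "$origin" config user.email "demo@example.com"
git -C "$origin" config user.name "Demo"
echo one > "$origin/file.txt"
git -C "$origin" add file.txt
git -C "$origin" commit -q -m "first"
git -C "$origin" branch -m release-1_7_0
echo two > "$origin/file.txt"
git -C "$origin" commit -q -a -m "second"

# Shallow clone of that single branch: only the tip commit is fetched.
git clone -q --depth 1 --branch release-1_7_0 "file://$origin" "$tmp/shallow"

origin_count=$(git -C "$origin" rev-list --count HEAD)
shallow_count=$(git -C "$tmp/shallow" rev-list --count HEAD)
echo "commits in origin: $origin_count, in shallow clone: $shallow_count"

# Compare the on-disk size (in KB) of the two object databases.
du -sk "$origin/.git" "$tmp/shallow/.git"
```

On such a tiny repository the size difference is negligible; the saving only shows up when there is real history to leave behind.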

  1. A shallow repository (one with short history) cannot be further cloned, but there are no problems pulling updates from the origin, producing patches or pushing changes.
  2. If you don’t know the name of the branch in advance, you can query it from the remote repository using git ls-remote.
  3. Every git command has a manual page in the form git-command (e.g. man git-ls-remote).
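
As for the second note, this is roughly what querying branch names looks like. It is a sketch run against a throwaway local repository; the paths and the release-1_6_0 branch are made up for illustration, and with a real project you would pass the remote URL instead of "$remote".

```shell
#!/bin/sh
set -eu

tmp=$(mktemp -d)
remote="$tmp/remote"   # hypothetical stand-in for the remote repository

# Build a tiny repository with a couple of release branches.
git init -q "$remote"
git -C "$remote" config user.email "demo@example.com"
git -C "$remote" config user.name "Demo"
echo hi > "$remote/README"
git -C "$remote" add README
git -C "$remote" commit -q -m "initial"
git -C "$remote" branch release-1_6_0
git -C "$remote" branch release-1_7_0

# --heads limits the listing to branches; each output line is
# "<sha><TAB>refs/heads/<name>", so strip everything up to the name.
branches=$(git ls-remote --heads "$remote" | sed 's|.*refs/heads/||')
echo "$branches"
```

The listing also includes the default branch of the demo repository, whose name depends on your git configuration.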
Happy learning!