Topology cleaning with PostGIS


An early tester of the new PostGIS Topology submitted an interesting dataset which kept me busy for a couple of weeks fixing a bunch of bugs related to numerical stability/robustness.

Finally, the ST_CreateTopoGeo function succeeded and imported the dataset as a proper topological schema. Here’s what it looks like:

At a first glance it doesn’t seem to be particularly problematic. Here’s the composition summary:

But the devil hides at high zoom levels. Where to zoom ? What are we looking for ?

We are guaranteed none of the constructed edges cross so the only leftover problem we might encounter is very small faces constructed wherever the original input had small overlaps or underlaps (gaps). We can have a visual signal of those faces by creating a view showing faces with an area below a given threshold. Let’s do that:

That query would let us see where to find faces with area < 0.1 units. And here’s qgis showing it:

Now we know where to zoom, and also the ID of the offending faces.

Let’s zoom in and show some labels:

You can now see that face 59 is bound by (among others) edges 130 and 129. Just get rid of one of them to assign the area to an adjacent face.

We drop edge 130 using ST_RemEdgeModFace, assigning the area to face 52. Here’s the result:

Cleaning further would require removing further edges and thus getting rid of all the small faces. There’s a lot of room for automating such processes. The good new is you can now build your own automation around your specific use cases while still using robust and standard foundations.

It is to be noted that the whole process I described here only involved the geometrical/topological level and didn’t affect at all the semantic/feature level. If we had TopoGeometry objects defined by the faces we’d also know which small faces were part of overlaps or underlaps and could then act consequently by adding or removing faces to the definition of the appropriate TopoGeometry object. Such step would have been required for the overlap situations as the ST__RemEdgeModFace_ function doesn’t let you change the shape of a defined TopoGeometry.

Unfortunately the semantic level is lost when using the ISO functions, as the whole ISO topology model doesn’t deal with features at all. This is why I think PostGIS would benefit from having a function that converts your simple features into topologically-defined features by adding any missing primitive to a topology schema and constructing the feature for you. Such function, is only waiting for sponsorship to become a reality of PostGIS 2.0. If you like what we’re building for your data integrity, please consider supporting the effort!