Posts Tagged ‘postgis’

On the fly simplification of topologically defined geometries

Friday, March 8th, 2013

Keeping GIS data in full-resolution and simplifying it on demand is a known challenge: simplification has to be fast and its output has to be topologically consistent.

simplified geometriessimplified topogeometries

We saw how to get a topologically consistent simplified version of a full layer, but that method isn’t fast enough for on-demand usage. Also, we saw how to perform a fast simplification by sacrificing the degree or generalization so that the introduced inconsistency would not be visible on a rendering surface.

An approach balancing speed and quality would take advantage of the topological definition of geometries to constraint the simplification work by ensuring shared edges get an identical treatment.

This is now possible with the addition in PostGIS of a  new version of ST_Simplify accepting a TopoGeometry as input.

In the picture above you can see the difference between using Geometry or TopoGeometry objects for simplifying the provinces of Tuscany with a tolerance of 8km. The ST_Simplify function runs at comparable speed in both cases.

Simplifying a map layer using PostGIS topology

Friday, April 13th, 2012

Following a recent research about how to simplify a multipolygon layer while keeping topological relationships intact, here’s my take on that, using the PostGIS topological support.

The data

French administrative subdivisions, called “départements”, will be used. Data can be downloaded here.

It is composed by 96 multipolygons for a total of  47036 vertices.

Principle of simplification

  • We convert a layer’s Geometries to TopoGeometries
  • We simplify all edges of the built topology
  • We convert the (now-simplified) TopoGeometries back to Geometries

Steps

The following steps assume you loaded the shapefile into a table named “france_dept”.

 -- Create a topology
SELECT CreateTopology('france_dept_topo', find_srid('public', 'france_dept', 'geom'));

-- Add a layer
SELECT AddTopoGeometryColumn('france_dept_topo', 'public', 'france_dept', 'topogeom', 'MULTIPOLYGON');

-- Populate the layer and the topology
UPDATE france_dept SET topogeom = toTopoGeom(geom, 'france_dept_topo', 1); -- 8.75 seconds

-- Simplify all edges up to 10000 units
SELECT SimplifyEdgeGeom('france_dept_topo', edge_id, 10000) FROM france_dept_topo.edge; -- 3.86 seconds

-- Convert the TopoGeometries to Geometries for visualization
ALTER TABLE france_dept ADD geomsimp GEOMETRY;
UPDATE france_dept SET geomsimp = topogeom::geometry; -- 0.11 seconds

The SimplifyEdgeGeom function

You may have noticed that the “SimplifyEdgeGeom” is not a core function. It is a function I wrote for the purpose of catching topological problems introduced by simplification.

The naive call would be:

SELECT ST_ChangeEdgeGeom('france_dept_topo', edge_id, ST_Simplify(geom, 10000))
FROM france_dept_topo.edge;

The problem with the above call is that any simplification introducing a topology error would be rejected by ST_ChangeEdgeGeom by throwing an exception and the exception would rollback the whole transaction leaving you with no edge changed. Possible topology errors introduced are: edges collapsing to points, intersecting self or other edges.

The SimplifyEdgeGeom function wraps the ST_ChangeEdgeGeom call into a subtransaction and handles exceptions by reducing the simplification factor until it succeeds. The version I used reduces simplification factor in half at each failure, dropping down to zero around 1e-8. You can roll your own with other heuristics or generalize this one to take parameters about stepping and limits.

Here’s the function:

CREATE OR REPLACE FUNCTION SimplifyEdgeGeom(atopo varchar, anedge int, maxtolerance float8)
RETURNS float8 AS $$
DECLARE
  tol float8;
  sql varchar;
BEGIN
  tol := maxtolerance;
  LOOP
    sql := 'SELECT topology.ST_ChangeEdgeGeom(' || quote_literal(atopo) || ', ' || anedge
      || ', ST_Simplify(geom, ' || tol || ')) FROM '
      || quote_ident(atopo) || '.edge WHERE edge_id = ' || anedge;
    BEGIN
      RAISE DEBUG 'Running %', sql;
      EXECUTE sql;
      RETURN tol;
    EXCEPTION
     WHEN OTHERS THEN
      RAISE WARNING 'Simplification of edge % with tolerance % failed: %', anedge, tol, SQLERRM;
      tol := round( (tol/2.0) * 1e8 ) / 1e8; -- round to get to zero quicker
      IF tol = 0 THEN RAISE EXCEPTION '%', SQLERRM; END IF;
    END;
  END LOOP;
END
$$ LANGUAGE 'plpgsql' STABLE STRICT;

Performance

The times shown near the “expensive” steps give you an indication of the performance you may expect. It’s about 13 seconds in total for the 3 steps outlined in the first paragraph.

A single run of the simplification step brought vertices down to 1369 (from 47036).

Timings are take on this system:

POSTGIS=”2.0.1SVN r9637″ GEOS=”3.4.0dev-CAPI-1.8.0″ PROJ=”Rel. 4.8.0, 6 March 2012″ GDAL=”GDAL 1.9.0, released 2011/12/29″ LIBXML=”2.7.6″ LIBJSON=”UNKNOWN” TOPOLOGY RASTER

PostgreSQL 8.4.10 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.4.real (Ubuntu 4.4.3-4ubuntu5) 4.4.3, 64-bit

CUSTOM OPTIONS:
shared_buffers = 128MB      (default is 24MB)
temp_buffers = 32MB         (default is 8MB)
work_mem = 8MB              (default is 1MB)
maintenance_work_mem = 32MB (default is 16MB)
max_stack_depth = 4MB       (default is 2MB)
checkpoint_segments = 24    (default is 3)

CPU: Intel(R) Core(TM)2 Duo CPU P9500 @ 2.53GHz (2 x 5054.34 bogomips)

RAM: 4GB

Considerations

The procedure described in this post is also valid for LINESTRING and MULTILINESTRING layers, using the exactly same code.

You could reuse the topology to produce multiple resolution levels w/out incurring again in the construction cost (and with a reduced input complexity at each level).

The simplification step doesn’t use TopoGeometry objects at all so you could choose to perform  topology construction and  attribute assignment in a different way.

Running the SimplifyEdgeGeom function again might give you more simplification because edges which may have intersected to the simplified version of an edge may not be intersecting anymore after their own simplification. The function can be changed to behave differently on exception to improve performance or quality.

PostGIS 2.0.0 released

Wednesday, April 4th, 2012

The long-awaited full featured PostGIS 2.0.0 is finally out. Coupled with GEOS 3.3.3 (released a few days before) and GDAL-1.9.0, it brings you the best spatial database system in town, complete with raster analysis and topology modeling support.

Complete announcement, with list of changes,  here.

It’s been an great pleasure to work with the rest of the team on getting this release ready for shipping, drawing a line after over two years of hard work on new features. I’m particularly proud of the persistent topology support, which kept me busy for the most part of 2011, and the raster support of which I wrote the foundations two years before.

I’m also happy to see an healthily growing community around PostGIS: we’ve had two successful pledges, a growing list of contributors and more corporate sponsors.

A special thank goes to Vizzuality for investing in a full-time PostGIS hacker. Their PostGIS-in-the-cloud solution (cartodb.com) is likely the first one putting PostGIS 2.0.0 in production.

A walk on the wild side

Saturday, January 28th, 2012

Lou Reed, topological versionI’ve been spending the last few days profiling and optimizing the new simple-to-topological converter you will find in PostGIS 2.0.0 thanks to a community effort.

The most expensive operation was found to be the ST_AddEdgeModFace function, which adds an edge and checks if such edge creates a new face.

Face-splitting detection was implemented using a brute force approach consisting in invoking the GEOS polygonizer and then checking if any polygon created contained the newly added edge in its boundary. It can get pretty wild when you add an edge in the universe face, and thus end up feeding the polygonizer with thousands of edges.

So I started thinking about two fields we make great effort to maintain and were (so far) unused: next_left_edge and next_right_edge. Each edge has these two pointers which may be used to walk clockwise along its left or right face.

By walking on the edge side you can know a lot about the edge you add !

First of all if you get to the other side of the edge you know for sure you didn’t split any face (no ring was created). Second, you can easily get the identifiers of all the edges you’ll need to update for setting their new left_face and right_face attributes. Finally by computing ring winding you can know whether each side of the edge is forming a fill or an hole.

I was afraid that such approach could have been slower than the brute force one, mainly due to more database querying and IO. But the dataset I was using for a testcase (Italian municipalities by ISTAT: 8094 multipolygons with an average of 560 vertices each) showed a speedup of up to 10x !

The testsuite was an invaluable tool for correctness checking. I actually spent a fair amount of time enhancing it further with new corner cases (new algorithm, new corners!).

I’m still checking correctness of  the resulting topology for the ISTAT case,  making sure that the improvements didn’t introduce any regression. Hopefully next week the code will be committed upstream for broader delectation. Stay tuned, prepare your input for conversion and start playing and timing the current routines !

GEOS 3.3.2 released

Friday, January 6th, 2012

The second bugfix release in the 3.3 branch of GEOS was released today.

This is the version required by the topology support shipped with the upcoming PostGIS 2.0 release.
Everyone is recommended to upgrade. Changes can be read here, package can be downloaded here.

Topology cleaning with PostGIS

Monday, November 21st, 2011

An early tester of the new PostGIS Topology submitted an interesting dataset which kept me busy for a couple of weeks fixing a bunch of bugs related to numerical stability/robustness.

Finally, the ST_CreateTopoGeo function succeeded and imported the dataset as a proper topological schema. Here’s what it looks like:

Edges of the built topology

At a first glance it doesn’t seem to be particularly problematic. Here’s the composition summary:

=# select topologysummary('small_sample_topo');
                    topologysummary
--------------------------------------------------------
 Topology small_sample_topo (2042), SRID 0, precision 0
 83 nodes, 156 edges, 74 faces, 0 topogeoms in 0 layers

But the devil hides at high zoom levels. Where to zoom ? What are we looking for ?

We are guaranteed none of the constructed edges cross so the only leftover problem we might encounter is very small faces constructed wherever the original input had small overlaps or underlaps (gaps). We can have a visual signal of those faces by creating a view showing faces with an area below a given threshold. Let’s do that:

CREATE VIEW small_sample_topo.small_areas AS
SELECT face_id, st_getfacegeometry('small_sample_topo', face_id)
FROM small_sample_topo.face
WHERE face_id > 0
AND st_area(st_getfacegeometry('small_sample_topo', face_id)) < 0.1;

That query would let us see where to find faces with area < 0.1 units. And here’s qgis showing it:

Areas smaller than 0.1 square units

Now we know where to zoom, and also the ID of the offending faces.

Let’s zoom in and show some labels:

Detail of small area

You can now see that face 59 is bound by (among others) edges 130 and 129. Just get rid of one of them to assign the area to an adjacent face.

We drop edge 130 using ST_RemEdgeModFace, assigning the area to face 52. Here’s the result:

Area after cleanup

Cleaning further would require removing further edges and thus getting rid of all the small faces. There’s a lot of room for automating such processes. The good new is you can now build your own automation around your specific use cases while still using robust and standard foundations.

It is to be noted that the whole process I described here only involved the geometrical/topological level and didn’t affect at all the semantic/feature level. If we had TopoGeometry objects defined by the faces we’d also know which small faces were part of overlaps or underlaps and could then act consequently by adding or removing faces to the definition of the appropriate TopoGeometry object. Such step would have been required for the overlap situations as the ST_RemEdgeModFace function doesn’t let you change the shape of a defined TopoGeometry.

Unfortunately the semantic level is lost when using the ISO functions, as the whole ISO topology model doesn’t deal with features at all. This is why I think PostGIS would benefit from having a function that converts your simple features into topologically-defined features by adding any missing primitive to a topology schema and constructing the feature for you. Such function, is only waiting for sponsorship to become a reality of PostGIS 2.0. If you like what we’re building for your data integrity, please consider supporting the effort!

PostGIS topology ISO SQL/MM complete

Friday, October 14th, 2011

PostGIS implementation of the ISO SQL/MM Topology-Geometry model is finally complete.

The SQL/MM model is just a portion of the whole topology support, but an important one, including schema definition and functions to create and populate the schema with primitive components (nodes, edges, faces).

In addition to the base model, PostGIS adds a TopoGeometry data type for use wherever you would normally use a Geometry type, except the former will be defined by references to primitive, shared, topological components.

The work on SQL/MM topology for PostGIS 2.0 was mainly funded by the Land Management department of the Tuscany Region as part of a larger project aimed at enhancing effectiveness of free software tools already in use within the institution for the management of the topographic database (chapeau!).

But more work is needed to make it easier to convert your layers to topological layers, your Geometries to TopoGeometries. Because I know you want to generalize your vectors in a topologically consistent way, don’t you ? And you want to edit the boundaries just once. And do you want to waste all that space for your triangles ?

Feature freeze for 2.0 will be by the end of November. Drop a note ASAP if you’d like to help with getting the enhanced TopoGeometry constructor shipped with it.