The most expensive operation was found to be the ST_AddEdgeModFace function, which adds an edge and checks if such edge creates a new face.
Face-splitting detection was implemented using a brute force approach consisting in invoking the GEOS polygonizer and then checking if any polygon created contained the newly added edge in its boundary. It can get pretty wild when you add an edge in the universe face, and thus end up feeding the polygonizer with thousands of edges.
So I started thinking about two fields we make great effort to maintain and were (so far) unused: next_left_edge and next_right_edge. Each edge has these two pointers which may be used to walk clockwise along its left or right face.
By walking on the edge side you can know a lot about the edge you add !
First of all if you get to the other side of the edge you know for sure you didn’t split any face (no ring was created). Second, you can easily get the identifiers of all the edges you’ll need to update for setting their new left_face and right_face attributes. Finally by computing ring winding you can know whether each side of the edge is forming a fill or an hole.
I was afraid that such approach could have been slower than the brute force one, mainly due to more database querying and IO. But the dataset I was using for a testcase (Italian municipalities by ISTAT: 8094 multipolygons with an average of 560 vertices each) showed a speedup of up to 10x !
The testsuite was an invaluable tool for correctness checking. I actually spent a fair amount of time enhancing it further with new corner cases (new algorithm, new corners!).
I’m still checking correctness of the resulting topology for the ISTAT case, making sure that the improvements didn’t introduce any regression. Hopefully next week the code will be committed upstream for broader delectation. Stay tuned, prepare your input for conversion and start playing and timing the current routines !