Topology-Geometry: an advanced tutorial
|
TopoLayers and TopoFeatures
In the previous tutorial we've examined a first couple of SQL functions intended to export a GeoTable out from an existing Topology: TopoGeo_ToGeoTable() and TopoGeo_ToGeoTableGeneralize().
Anyway SpatiaLite supports an more sophisticated (and complex) approach based on TopoLayers and TopoFeatures specifically intended to support a very flexible mechanism for exporting a full set of GeoTable(s) directly deriving from a Topology.
a conceptual reference framework
- an ISO Topology is just intended to support a consistent set of Nodes, Edges and Faces primitives fully covering the plane.
Topology primitives correspond to pure geometries, and can never be directly associated to a corresponding set of information attributes.
Both Node and Edge primitives directly correspond to a real geometry, but Face primitives simply correspond to a virtual geometry thus necessarily requiring to be dynamically reconstructed every time that they are referenced (and this could easily become a real bottleneck).
- on the other hand a GeoTable (aka layer, as in the common GIS parlance) is always based on many distinct features, and for each feature a Geometry and an associated set of information attributes are expected to be declared.
- there is further difference to be taken in proper account: Topology primitives necessarily are elementary. Feature Geometries can usually represent some complex type: MULTIPOINT, MULTILINESTRING, MULTIPOLYGON and, may be, GEOMETRYCOLLECTION. So it should be absolutely clear that a Feature Geometry does not necessarily corresponds to a single Topology primitive ; a Feature Geometry could easily be derived from aggregating many different Topology primitives.
- that's not all: many different GeoTables could eventually be derived from a single Topology; and the same Topology primitive could play a very different role in everyone of these GeoTables.
Just a basically simple example: imagine a Topology corresponding to some land registry map. We'll surely have many potential layers: buildings, agricultural land, populated places, administrative boundaries, roads, ponds, rivers, fences and so on.
In such a context a fence could easily correspond to a Feature within the "fences" layer, could probably delimit two adjacent agricultural areas and could eventually be part of some administrative boundary.
- TopoLayers and TopoFeatures represent a complex and flexible structure intended to establish permanent explicit correspondences between Topology Primitives and GeoTable Features.
- each TopoLayer is uniquely identified by its name and directly corresponds to a single GeoTable to be exported from the Topology
An arbitrary number of TopoLayers can be created on the same Topology, and each one of them will act as a separate container.
- each TopoFeature is uniquely identified by its fid (feature-id) and corresponds to a single Feature containing an individual set of information attributes.
The Geometry for each TopoFeature is always indirectly defined by specifying a list of Topology primitives individually referenced by their IDs.
Such a geometry obviously is a virtual one, and will be actually materialized (by aggregating all referenced Topology primitives) only when exporting the destination GeoTable.
- The TopoLayers / TopoFeatures structure is practically implemented as a set of several closely related DB-tables )we'll examine later their respective layouts in full detail):
- <topo-prefix>_topolayers: this first table simply is a catalog of all TopoLayers supporting the same Topology.
Each TopoLayer is identified by its name and by its ID; both identifiers are granted to be unique.
- <topo-prefix>_topofeatures: this second table is intended to permanently store all cross-references existing between Topology primitives and TopoFeatures.
- <topo-prefix>_topofeatures_<topolayer-id>: for each single TopoLayer a separate table is required. The actual correspondence is established via the TopoLayer-ID suffix.
All these tables are intended to permanently store the information attributes for each TopoFeature in the same TopoLayer.
Different TopoLayers will obviously support a different set of information attributes thus requiring an individual table layout.
Anyway all these tables are always expected to declare a Primary Key of the INTEGER type named fid and intended to be an unique identifier for each TopoFeature.
- Several SQL functions are specifically intended to support processing operations based on TopoLayers and TopoFeatures; we'll examine later all them in full detail.
a quick practical exercise
Prerequisites:
- download the sample DB-file from here
It contains Census data (2011) kindly released by ISTAT under CC-BY license terms.
The original datasets have been slightly rearranged in a more convenient form.
All geometries are in the SRID 32632 (WGS 84 / UTM zone 32N):
- census_2011: all Census Areas (2011) covering Tuscany.
Note: several Census Areas are completely uninhabited (lakes, marshlands, high mountains and alike).
- ppl_2011: Populated Places (defined as an aggregation of Census Areas).
Note: not all Census Areas belong to a Populated Place: there are many self-standing dispersed rural areas.
- com_2011: Tuscany Municipalities / Local Councils 2011 (defined as an aggregations of Census Areas).
- com_2014_15: few new Municipalities created during years 2014/15 by merging pre-existing smaller Municipalities.
- prov_2011: Tuscany Provinces / Counties (defined as an aggregation of Municipalities).
- reg_2011: Tuscany Region (defined as an aggregation of Provinces).
- just census_2011 alone supports geometries; any other administrative level is simply defined by relational codes.
- uncompress the downloaded db-file (tuscany-census-2011.sqlite).
- now you can finally start a SpatiaLite session using your preferred SpatiaLite front end tool:
- Attention: you must not directly connect to tuscany-census-2011.sqlite; you are expected instead to connect a new (empty) DB-file named tuscany-topo-2011.sqlite
SELECT CreateTopology('census2011', 32632, 0, 0);
1
ATTACH DATABASE "./tuscany-census-2011.sqlite" AS istat;
SELECT TopoGeo_FromGeoTable('census2011', 'istat', 'census_2011', NULL, 0, 512, -1);
1
SELECT ST_ValidateTopoGeo('census2011');
NULL
SELECT * FROM TEMP.census2011_validate_topogeo;
SELECT TopoGeo_CreateTopoLayer('census2011', 'istat', 'census_2011', NULL, 'census_areas');
1
SELECT TopoGeo_InitTopoLayer('census2011', 'istat', 'pop_ppl_2011', 'ppl');
1
SELECT TopoGeo_InitTopoLayer('census2011', 'istat', 'pop_com_2011', 'com');
1
SELECT TopoGeo_InitTopoLayer('census2011', 'istat', 'pop_prov_2011', 'prov');
1
SELECT TopoGeo_InitTopoLayer('census2011', 'istat', 'pop_reg_2011', 'reg');
1
DETACH DATABASE istat;
We'll start this practical tutorial by duly replicating the same steps we've already examined in the intermediate tutorial:
- we'll create a new 2D Topology named census2011 and located into SRID 32632.
- then we'll attach the tuscany-topo-2011.sqlite external DB-file.
- now we'll populate the census2011 Topology by importing the istat.census_2011 GeoTable, and then we'll immediately check if this Topology is fully valid.
Now we are finally ready to start defining all TopoLayers based on the census2011 Topology; each single TopoLayer will precisely represent some administrative level, and all TopoLayers altogether will completely represent the whole Tuscan administrative hierarchy.
- so we'll duly invoke TopoGeo_CreateTopoLayer() in order to completely define a first TopoLayer representing Census Areas.
Recall: the istat.census_2011 is the unique real GeoTable we have in our initial dataset, because it contains both information attributes and geometries; and each Census Area is expected to directly correspond to a single Topology Face. (we'll examine later in more depth how TopoLayer / TopoFeature relationships have been actually defined).
- then we'll continue by defining several further TopoLayers respectively corresponding to Populated Places, Municipalities, Provinces and Region administrative levels.
Recall: all these administrative levels in the initial dataset are simply defined by relational codes intended to aggregate lower level entities; in this case there are no explicit geometries at all. So are are not directly allowed to create a fully defined TopoLayer, and calling TopoGeo_CreateTopolayer() is just impossible because we lack any appropriate Geometry column.
Anyway we can invoke TopoGeo_InitTopoLayer(); this will initialize an only partially defined TopoLayer, but we can easily complete in a second time any missing information so to get a properly working TopoLayer (as we'll see later in full detail).
- finally we'll detach the now useless external db-file.
TopoLayers / TopoFeatures: layout of DB Tables and corresponding relationships
<topo-prefix>_topolayers
SELECT * FROM census2011_topolayers;
The TopoLayers table contains a distinct row for every TopoLayer defined on the current Topology.
Each TopoLayer is identified by an id and by a name; both are required to be unique values.
|
"census2011_topolayers"
topolayer_id |
topolayer_name |
1 | census_areas |
2 | ppl |
3 | com |
4 | prov |
5 | reg |
|
<topo-prefix>_topofeatures_<topolayer-id>
SELECT * FROM census2011_topofeatures_2;
SELECT * FROM census2011_topofeatures_4;
The same Topology can contain more than a single TopoFeatures table, and each TopoFeatures table corresponds to a single TopoLayer.
Every table can have a specific layout on its own (strictly reflecting the information attributes for each layer), anyway all them will have a unique TopoFeature identifier (fid) acting in the Primary Key role.
The relationship between table names and TopoLayers is established by appending a numeric suffix to the common name that must exactly match the topolayer_id value declared in the TopoLayers main table.
As is shown in this example the first table census2011_topofeatures_2 (topolayer_id=2) contains all TopoFeatures associated to the TopoLayer ppl (Populated Places).
The second table census2011_topofeatures_4 (topolayer_id=4) contains all TopoFeatures associated to the TopoLayer prov (Provinces).
|
"census2011_topofeatures_2"
fid |
cod_ppl |
cod_com |
name |
pop_2011 |
m_2011 |
f_2011 |
1 | 205142705 | 45001 | Albiano Magra | 1907 | 919 | 988 |
2 | 205142706 | 45001 | Aulla | 4321 | 2046 | 2275 |
3 | 205142710 | 45001 | Caprigliola | 482 | 226 | 256 |
|
1018 | 1410775415 | 100007 | Sasseta | 285 | 137 | 148 |
1019 | 1410785415 | 100007 | Terrigoli | 537 | 264 | 273 |
"census2011_topofeatures_4"
fid |
cod_prov |
cod_reg |
name |
abbrev |
pop_2011 |
m_2011 |
f_2011 |
1 | 45 | 9 | Massa Carrara | MS | 199650 | 95754 | 103896 |
2 | 46 | 9 | Lucca | LU | 388327 | 186183 | 202144 |
3 | 47 | 9 | Pistoia | PT | 287866 | 138054 | 149812 |
|
9 | 53 | 9 | Grosseto | GR | 220564 | 105585 | 114979 |
10 | 100 | 9 | Prato | PO | 245916 | 119088 | 126828 |
|
<topo-prefix>_topofeatures
SELECT * FROM census2011_topofeatures;
The TopoFeatures-geometries table is intended to permanently store all relations between TopoFeatures and Topology primitives required in order to correctly build the expected output Geometry for each TopoFeature.
- uid is the Primary Key, and is simply intended to be an unique identifier for each row but doesn't intend to have any special meaning.
- node_id, edge_id and face_id are Foreign Keys directly referencing a Topology primitive; two of these values are always expected to be NULL, and only one is expected to effectively reference a Topology primitive depending on its type.
- topolayer_id and fid together are intended to establish a relational reference to some specific TopoFeature.
Example #1: all Topology primitives directly referenced by rows declaring topolayer_id=2 and fid=3 must be aggregated in order to build the output Geometry corresponding to the Populated Place of Caprigliola.
Example #2: all primitives referenced by rows declaring topolayer_id=4 and fid=9 must be aggregated in order to build the output Geometry corresponding to the Province of Grosseto.
|
"census2011_topofeatures"
uid |
node_id |
edge_id |
face_id |
topolayer_id |
fid |
1 | NULL | NULL | 1 | 1 | 1 |
2 | NULL | NULL | 2 | 1 | 2 |
3 | NULL | NULL | 3 | 1 | 3 |
|
28870 | NULL | NULL | 28864 | 1 | 28867 |
28871 | NULL | NULL | 28868 | 1 | 28868 |
|
Final remarks
- both TopoGeo_CreateTopoLayer() and TopoGeo_InitTopoLayer() will register the TopoLayer into the TopoLayers table, and will create and populate the corresponding TopoFeatures table by importing all information attributes for each Feature defined by the reference table (or reference view).
- only TopoGeo_CreateTopoLayer() will automatically populate the TopoFeatures-geometry table, and while doing so, identify all relationships existing between the Geometries found into the reference table or view and the Topology primitives based on the values of the TopoSeeds.
- as apposed to TopoGeo_InitTopoLayer(), which will never attempt to identify the relationships intercurring between output Geometries and Topology primitives. This task will always be deferred for the user to deal with later.
You could, for example, manually select all Topology primitives corresponding to a single TopoFeature; or more probably you could perform this task in a second round by executing some appropriate SQL statement (as we'll seen soon in the next example).
|
TopoGeo_ExportTopoLayer: exporting a full TopoLayer into a GeoTable
exporting the Census Areas TopoLayer |
We'll start by exporting first the only completely defined TopoLayer we have at this point in the test DB-file, i.e. census_areas
SELECT TopoGeo_ExportTopoLayer('census2011', 'census_areas', 'out_census_2011', 1);
As you can easily check by yourself a new GeoTable named out_census_2011 has been created by TopoGeo_ExportTopoLayer():
- this GeoTable contains all TopoFeatures defined into the TopoLayer.
- each TopoFeature faithfully preserves its initial information attributes.
- all Geometries are precisely built by aggregating the corresponding Topology primitives as specified by TopoLayer / TopoFeatures relationships.
The image examplifies a choropleth centered around the Island of Elba and based on the out_census_2011 GeoTable: all Census Areas are displayed by adopting a colour directly corresponding to their population density.
|
|
Building a full hierarchy of Administrative Levels by SQL statements
As we've already seen before we are now expected to complete someway the TopoFeatures definitions supporting the Populated Places, Municipalities, Provinces and Region TopoLayers we've previously created by calling TopoGeo_InitTopoLayer().
Happily enough all these administrative levels are simply based on direct aggregations of Census Areas, so we just have to execute few appropriate SQL statements.
INSERT INTO census2011_topofeatures
SELECT NULL, c.node_id, c.edge_id, c.face_id, 2, a.fid
FROM census2011_topofeatures_2 AS a
JOIN census2011_topofeatures_1 AS b ON (b.cod_ppl = a.cod_ppl)
JOIN census2011_topofeatures AS c ON (c.topolayer_id = 1 AND c.fid = b.fid);
- census2011_topofeatures_1 is the TopoFeatures table corresponding to Census Areas.
- census2011_topofeatures_2 is the corresponds to Populated Places; the clause b.cod_ppl = a.cod_ppl will relationally join each Populated Place to its underlaying Census Areas.
- so we can duly insert into the TopoFeatures-geometry table (i.e. census2011_topofeatures) a new level of references to Topology primitives by simply copying all definitions already stored into the previous hierarchical level.
And to do such a thing we simply have to read from topolayer=1 then inserting into topolayer=2 after setting the appropriate fid values.
Really not a difficult task.
INSERT INTO census2011_topofeatures
SELECT NULL, c.node_id, c.edge_id, c.face_id, 3, a.fid
FROM census2011_topofeatures_3 AS a
JOIN census2011_topofeatures_1 AS b ON (b.cod_com = a.cod_com)
JOIN census2011_topofeatures AS c ON (c.topolayer_id = 1 AND c.fid = b.fid);
We can adopt exactly the same identical approach in order to complete topolayer=3 corresponding to Municipalities.
INSERT INTO census2011_topofeatures
SELECT NULL, c.node_id, c.edge_id, c.face_id, 4, a.fid
FROM census2011_topofeatures_4 AS a
JOIN census2011_topofeatures_3 AS b ON (b.cod_prov = a.cod_prov)
JOIN census2011_topofeatures AS c ON (c.topolayer_id = 3 AND c.fid = b.fid);
More or less the same is for deriving topolayer=4 corresponding to Provinces; this time we'll simply aggregate the underlying Municipalities from topolayer=3.
We can adopt exactly the same identical approach in order to complete topolayer=3 corresponding to Municipalities.
INSERT INTO census2011_topofeatures
SELECT NULL, c.node_id, c.edge_id, c.face_id, 5, a.fid
FROM census2011_topofeatures_5 AS a
JOIN census2011_topofeatures_4 AS b ON (b.cod_reg = a.cod_reg)
JOIN census2011_topofeatures AS c ON (c.topolayer_id = 4 AND c.fid = b.fid);
And finally we can derive topolayer=5 corresponding to Regions by directly aggregating the underlying Provinces from topolayer=4.
step #1: exporting the Region TopoLayer
SELECT TopoGeo_ExportTopoLayer('census2011', 'reg', 'out_reg_2011', 1);
We are now definitely ready to export any other Administrative Level.
For the sake of clarity we'll follow a top-bottom order, so we'll start by exporting first the whole Tuscany.
|
|
step #2: exporting the Provinces TopoLayer
SELECT TopoGeo_ExportTopoLayer('census2011', 'prov', 'out_prov_2011', 1);
We'll continue by exporting all the Tuscany Provinces.
|
|
step #3: exporting the Municipalities TopoLayer
SELECT TopoGeo_ExportTopoLayer('census2011', 'com', 'out_com_2011', 1);
Then we'll export all the Tuscany Municipalities.
For better clarity the figure shows a magnified detail centered around the Island of Elba.
|
|
step #4: exporting the Populated Places TopoLayer
SELECT TopoGeo_ExportTopoLayer('census2011', 'ppl', 'out_ppl_2011', 1);
And we'll finally export all Populated Places.
Conclusion: we started this advanced tutorial by importing into a Topology just a single layer (Census Areas) and now we've finished by producing a complete set of administrative boundaries at different hierarchical levels:
- Populated Places
- Municipalities
- Provinces
- Region
|
|
to be done
ATTACH DATABASE "./tuscany-census-2011.sqlite" AS istat;
CREATE TABLE com2015 AS
SELECT * FROM istat.pop_com_2011;
INSERT INTO com2015
SELECT a.cod_com, a.cod_prov, a.cod_istat, a.cod_cadrastre, a.name, a.townhall_addr,
Sum(b.pop_2011), Sum(b.m_2011), Sum(b.f_2011)
FROM istat.com_2014_15 AS a
JOIN istat.pop_com_2011 AS b ON (b.cod_com IN (a.old_cod_com_1, a.old_cod_com_2))
GROUP BY a.cod_com;
DETACH DATABASE istat;
SELECT TopoGeo_InitTopoLayer('census2011', NULL, 'com2015', 'com2015');
INSERT INTO census2011_topofeatures
SELECT NULL, a.node_id, a.edge_id, a.face_id, 6, a.fid
FROM census2011_topofeatures AS a
JOIN census2011_topofeatures_3 AS b ON (a.topolayer_id = 3 AND a.fid = b.fid);
UPDATE census2011_topofeatures
SET fid = (SELECT fid FROM census2011_topofeatures_6 WHERE cod_com = 51040)
WHERE topolayer_id = 6 AND fid IN (SELECT fid FROM census2011_topofeatures_6 WHERE cod_com in (51009, 51029));
UPDATE census2011_topofeatures
SET fid = (SELECT fid FROM census2011_topofeatures_6 WHERE cod_com = 51041)
WHERE topolayer_id = 6 AND fid IN (SELECT fid FROM census2011_topofeatures_6 WHERE cod_com in (51032, 51036));
SELECT TopoGeo_ExportTopoLayer('census2011', 'com2015', 'out_com_2015', 1, 1);
SELECT TopoGeo_InsertFeatureFromTopoLayer ('census2011', 'com2015', 'out_com_2015', fid)
FROM census2011_topofeatures_8
WHERE cod_prov = 51 AND cod_com NOT IN (51009, 51029, 51032, 51036);
AR 2011
|
|
AR 2015
|
|
|