A quick tutorial to SpatiaLite - a Spatial extension for SQLite

Table of Contents


1. Preparing to start

This tutorial is intended to let you become quickly trained in order to understand and use SpatiaLite and its Spatial functions.
May be you are not a computer programmer, neither a system administrator, nor a DBMS administrator. [but if you are, don't worry; this tutorial can help you too]
Don't bother about this; the tutorial tries to avoid using a too complex terminology, or to explain every arcane detail concerning spatial data processing, SQL and so on.
The main scope of this tutorial is simply to guide you in a step by step process, exploring what features SpatiaLite can offer you, and how and when you are expected to use them.
Each single feature will be explained with simple but realistic examples, based on the test.db that you should have downloaded with this tutorial.

1.1. What you need to begin

We suppose you have at least a basic familiarity with computers, and we expect too you are able to list a directory, install software and alike; some knowledge of shell environment will be very useful as well.
For the sake of simplicity all the following examples supposes you are doing the tests using a Windows PC; if not, I'm sorry ... but I presume it will be very easy for you to accommodate some few adaption needed for Linux, MacOs etc
  1. In order to start using SpatiaLite you must have the SpatiaLite.dll somewhere on your local hard disk. You surely have received one copy of it within the download.
  2. SpatiaLite is simply an extension; you absolutely need to have the SQLite front end slite3.exe as well, and sqlite3.dll too, in order to use SpatiaLite. Note that you must have a recent version, with a version number 3.3.6 or higher.
    If not, please download them at www.sqlite.org
  3. Neither sqlite3.exe nor SpatiaLite.dll requires an installation.
    You simply have to create an empty folder (e.g. C:\SpatiaLite) and just copy the files inside; put the test.db sample database in the same folder.
  4. If you don't have any GIS tool already installed, may be a good idea to install the open source QuantumGis; it can be useful in order to display at a glance the shapefiles we are going to produce among this tutorial.
    You can download it from www.qgis.org; it has an ordinary Windows Installer that creates a Desktop icon and so on.
Now you have all that is needed to start the tutorial.

1.2. The test.db database

This one is a simple, small-sized but quite realistic database containing some free available GIS data [you can found the original ones at www.rigacci.org]
The test.db contains the following tables:
  1. the regions table contains 109 rows; each entity has a geometry, that is a POLYGON.
  2. the towns table contains 8101 rows; each entity has a geometry, that is a POINT.
  3. the highways table contains 775 rows, ; each entity has a geometry, that is a LINESTRING.

1.3. Getting started with SQL and SQLite

This section will try to let you become accustomed to SQLite in very quick way. If you are anyway more seriously interested in this, please read the official SQLite documentation.
To start using SQLite, open a new shell (command window), then go to the folder you have prepared in the preliminaries, and type the following command:

sqlite3 test.db

you should receive this prompt:
SQLite version 3.x.x
enter ".help" for instructions
sqlite>


ok, you have just started you first SQLite working session, that is currently connected to the test.db database; congratulations.
Now type the following commands:
sqlite> .nullvalue NULL
sqlite> .headers on
sqlite> .mode column

nothing of very interesting in this; you are simply selecting some trivial operative options: if you are curious to discover what exactly they means, you can discover it by yourself just typing:
sqlite> .help

All command starting with a dot [.], as .mode or headers is interpreted by SQLite as an internal command, and not as an SQL expression. i.e. internal commands set some option or mode for SQLite, and have nothing to do with real data processing.
now that the SQLite session is started, we can execute our first SQL query on the test.db database:
sqlite> SELECT * FROM towns LIMIT 5;
GaiaPrimaryKeyGaiaGeometryNamePeoples LocalCouncilCountyRegion
1Brozolo435 100
2Campiglione-Fenile1284 100
3Canischio274 100
4Cavagnolo2281 100
5Magliano Alfieri1674 100
sqlite>
A few points to understand:
  1. the SELECT statement requests SQLite to perform a query
  2. fetching all columns [*]
  3. FROM the database table of name towns
  4. retrieving only the first five rows [LIMIT 5]
  5. note the any SQL statement must end with a semicolon [;]

now try this second SQL query:
sqlite> select name AS Town, peoples as Population from towns ORDER BY name LIMIT 5;
TownPopulation
Abano Terme18206
Abbadia Cerreto276
Abbadia Lariana3151
Abbadia San Salvatore6832
Abbasanta2815
sqlite>
  1. in SQL constructs using lower- or upper-case has identical effects; SELECT is the same of select, FROM and from are equivalents.
  2. you can freely choose which one columns has to be fetched, determine at your will their order, and even rename then if you want to do so by using the AS clause.
  3. you can order the fetched rows by using the ORDER BY clause.

a more complex SQL query:
sqlite> select name, peoples from towns WHERE peoples > 350000 order by peoples DESC;
TownNamePopulation
Roma2546804
Milano1256211
Napoli1004500
Torino865263
Palermo686722
Genova610307
Bologna371217
Firenze356118
sqlite>
  1. you can filter a specific set of rows by imposing a WHERE clause; only those rows that satisfies the logical expression you specify will be fetched.
    In this example only Towns with a population greater than 350000 peoples has been fetched.
  2. you can order rows in descending order if appropriate, by using the DESC clause.

now we'll see a different mode of using an SQL query:
sqlite> select COUNT(*) as '# Towns',
     ... > MIN(peoples) as Smaller,
     ... > MAX(peoples) as Bigger,
     ... > SUM(peoples) as 'Total peoples',
     ... > SUM(peoples) / COUNT(*) as 'mean peoples for town'
     ... > from towns;
# TownsSmallerBigger Total peoplesmean peoples for town
8101332546804 570061477036
sqlite>
  1. you can split complex queries along many lines; SQLite starts executing the query only where it finds a line closed by a semicolon [;] In other words, until you enter a [;] as the last character in a line, you are always continuing to type the same query, and SQLite waits until you have finished to complete the query.
  2. you can use functions in an SQL query.
    COUNT(), MIN(), MAX() and SUM() are functions. Not at all surprisingly:
  3. even more, you can docalculations in your query.
    e.g. we have calculated the mean of peoples per village dividing the SUM() by the COUNT() values.

note that a valid SQL query may simply consists of expressions and functions to be evaluated:
sqlite> select (10 - 11) * 2 as Number, ABS((10 - 11) * 2) as AbsoluteValue;
NumberAbsoluteValue
-22
sqlite>
  1. the (10 - 11) * 2 term is an example of expression
  2. the ABS() functions returns the absolute value of a number.
  3. note that in this example we have not used any DB column or DB table at all.

now we'll repeat an SQL query we have already performed, but with a small variation:
sqlite> select name, peoples, HEX(GaiaGeometry)
     ... > from Towns where peoples > 350000 order by peoples DESC;
NamePeoplesHEX(GaiaGeometry)
Roma2546804 0001787F00003D0AD723BF11284133333313C1B851413D0AD723BF11284133333313C1B851417C010000003D0AD723BF11284133333313C1B85141FE
Milano1256211 0001787F00005C8FC2F5116C1F413D0AD7A3893453415C8FC2F5116C1F413D0AD7A3893453417C010000005C8FC2F5116C1F413D0AD7A389345341FE
Napoli1004500 0001787F00003333333358C42C4133333323FA4C51413333333358C42C4133333323FA4C51417C010000003333333358C42C4133333323FA4C5141FE
Torino865263 0001787F000052B81E85862418419A999939C60A534152B81E85862418419A999939C60A53417C0100000052B81E85862418419A999939C60A5341FE
Palermo686722 0001787F0000713D0A5766DC2A4152B81E05F41F5041713D0A5766DC2A4152B81E05F41F50417C01000000713D0A5766DC2A4152B81E05F41F5041FE
Genova610307 0001787F0000295C8FC24A0D1E4148E17A6462C35241295C8FC24A0D1E4148E17A6462C352417C01000000295C8FC24A0D1E4148E17A6462C35241FE
Bologna371217 0001787F00005C8FC2756EF124419A999949DFCD52415C8FC2756EF124419A999949DFCD52417C010000005C8FC2756EF124419A999949DFCD5241FE
Firenze356118 0001787F00000AD7A3F055CC244114AE4701207F52410AD7A3F055CC244114AE4701207F52417C010000000AD7A3F055CC244114AE4701207F5241FE
sqlite>
  1. the HEX() function returns the hexadecimal representation of a BLOB column value.
  2. in the preceding execution of this query, the geom column seemed empty; now, by using the HEX() function, we discover that contains lots of strange binary data.
  3. geom contains GEOMETRY values, stored as BLOBs and encoded in the internal representation used by SpatiaLite.
    SQLite in its own hasn't the slightest idea of what GEOMETRY is, and cannot do any other operation on it.
    To really use GEOMETRY values, it's time to load the SpatiaLite extension.
We have completed our first step. If you have understood all the preceding examples, you become to be familiar with basic SQL, and that's enough to start using SpatiaLite.

To terminate the current SQLite session and exit, you simply have to type:
sqlite> .quit

2. Getting started with SpatiaLite

This section examines the most basilar operations concerning Spatial data. In order to enable SQLite to process Spatial data as well, you need first to load the SpatiaLite extension.
SpatiaLite [to be pronounced as spatial light] is simply a small library [or DLL in Windows jargon], that supports a collection of SQL functions in a way conformant to the OpenGisConsortium OGC specification for Spatial data processing.
If you are interested on this, you can obtain a copy of OGC specification from the OGC web site

2.1 How to load SpatiaLite

You should first start the sqlite3.exe as usual, so type the following command:

sqlite3 test.db

you should receive this prompt:
SQLite version 3.x.x
enter ".help" for instructions
sqlite>


the SQLite session begins, and is currently connected to the test.db database;
now type the following commands to set the various options for output formatting:
sqlite> .nullvalue NULL
sqlite> .headers on
sqlite> .mode column
sqlite>


and now you can load the SpatiaLite extension:
sqlite> .load 'SpatiaLite.dll'
sqlite>

an alternative way to load SpatiaLite is to execute the following SQL expression:
SELECT load_extension('SpatiaLite.dll');

2.2 A first glance: familiarizing with GEOMETRY

All right, now you can execute your first Spatial query: we'll repeat an SQL query we have already performed:
sqlite> SELECT name, peoples, AsText(GaiaGeometry)
     ... > from Towns where peoples > 350000 order by peoples DESC;
NamePeoplesAsText(GaiaGeometry)
Roma2546804POINT(788703.57 4645636.3)
Milano1256211POINT(514820.49 5034534.56)
Napoli1004500POINT(942636.1 4535272.55)
Torino865263POINT(395553.63 4991768.9)
Palermo686722POINT(880179.17 4227024.08)
Genova610307POINT(492370.69 4918665.57)
Bologna371217POINT(686263.23 4929405.15)
Firenze356118POINT(681514.97 4848768.02)
sqlite>
  1. the AsText() function is a SpatiaLite one, and returns the Well Known Text - WKT representation for a GEOMETRY column value.
    WKT is a standard notation conformant to OpenGIS specification.
  2. in the preceding execution of this query, the HEX() function returned lots of strange binary data.
    Now the AsText() function shows useful and quite easily understandable GEOMETRY values.
  3. a POINT is the simplest GEOMETRY class, and has only a couple of [X,Y] coordinates.
repeat the preceding query in slightly different way:
sqlite> SELECT name, X(GaiaGeometry), Y(GaiaGeometry) FROM Towns
     ... > WHERE peoples > 350000
     ... > ORDER BY peoples DESC;
NameX(GaiaGeometry)Y(GaiaGeometry)
Roma788703.574645636.3
Milano514820.495034534.56
Napoli942636.14535272.55
Torino395553.634991768.9
Palermo880179.174227024.08
Genova492370.694918665.57
Bologna686263.234929405.15
Firenze681514.974848768.02
sqlite>
  1. the SpatiaLite X() function returns the X coordinate for a POINT.
  2. the Y() function returns the Y coordinate for a POINT.
you can use the following GEOMETRY format conversion functions:
sqlite> SELECT HEX(GeomFromText('POINT(10 20)'));
hex(GeomFromText('POINT(10 20)'))
0001FFFFFFFF00000000000024400000000000003440000000000000244000000000000034407C0100000000000000000024400000000000003440FE

sqlite> SELECT HEX(AsBinary(GeomFromText('POINT(10 20)')));
hex(AsBinary(GeomFromText('POINT(10 20)')))
010100000000000000000024400000000000003440

sqlite> SELECT AsText(GeomFromWKB(X'010100000000000000000024400000000000003440'));
AsText(GeomFromWKB(X'010100000000000000000024400000000000003440'))
POINT(10 20)
sqlite>
  1. the SpatiaLite GeomFromText() function returns the internal BLOB representation for a GEOMETRY.
  2. the AsBinary() function returns the Well Known Binary - WKB representation for a GEOMETRY column value.
    WKB is a standard notation conformant to OpenGIS specification.
  3. the GeomFromWKB() function converts a WKB value into the corresponding internal BLOB value.

2.3. GEOMETRY classes

This section helps you to discover the various GEOMETRY classes supported by SpatiaLite as they are defined in the OpenGIS specification; simply speaking, any GEOMETRY CLASS is a peculiar kind of GEOMETRY.
You have already met the POINT class in the preceding paragraph; that's time you'll become accustomed to her sister classes too.

a simple SQL query to explore a new GEOMETRY class:
sqlite> SELECT GaiaPrimaryKey, AsText(GaiaGeometry) FROM HighWays WHERE gaiaprimarykey = 883;
GaiaPrimaryKeyAsText(GaiaGeometry)
883LINESTRING(689001.702718 4798988.808442,
689027.602471 4798996.686619,
689029.54214 4798989.585948,
689029.54214 4798989.585948)
sqlite>
  1. LINESTRING is another GEOMETRY class, and has lots of POINTs.
  2. in this one case you have fetched a very simple LINESTRING, representing a polyline with just 4 vertices.
  3. it isn't at all unusual to encounter LINESTRINGs with some thousands of vertices in real GIS data.
sqlite> SELECT GaiaPrimaryKey, NumPoints(GaiaGeometry), GLength(GaiaGeometry),
     ... > Dimension(GaiaGeometry), GeometryType(GaiaGeometry)
     ... > FROM HighWays ORDER BY NumPoints(GaiaGeometry) DESC LIMIT 5;
GaiaPrimaryKeyNumPoints(GaiaGeometry)GLength(GaiaGeometry)Dimension(GaiaGeometry) GeometryType(GaiaGeometry)
1062675894997.8721344157 1LINESTRING
1063512075453.1247726708 1LINESTRING
273432569052.1024677445 1LINESTRING
343310946831.3235596409 1LINESTRING
1061275560165.9776174597 1LINESTRING
sqlite>
  1. the SpatiaLite NumPoints() function returns the number of vertices for a LINESTRING GEOMETRY.
  2. the GLength() function returns the geometric length [expressed in map units] for a LINESTRING GEOMETRY.
  3. the Dimension() function returns the dimensions' number for any GEOMETRY class [obviously 1 for lines].
  4. the GeometryType() function returns the class type for any kind of GEOMETRY value.
sqlite> SELECT GaiaPrimaryKey, NumPoints(GaiaGeometry),
     ... > AsText(StartPoint(GaiaGeometry)), AsText(EndPoint(GaiaGeometry)),
     ... > X(PointN(GaiaGeometry, 2)), Y(PointN(GaiaGeometry, 2))
     ... > FROM HighWays ORDER BY NumPoints(GaiaGeometry) DESC LIMIT 5;
GaiaPrimaryKeyNumPoints(GaiaGeometry) AsText(StartPoint(
GaiaGeometry))
AsText(EndPoint(
GaiaGeometry))
X(PointN(
GaiaGeometry, 2))
Y(PointN(
GaiaGeometry, 2))
10626758POINT(632090.156998 4835616.546126) POINT(663300.737479 4795631.803342)632086.0096648844835625.74875358
10635120POINT(663292.190654 4795627.307765) POINT(632085.166691 4835620.171885)663295.9924954534795626.48941986
2734325POINT(668247.593086 4862272.349444) POINT(671618.13304 4854179.734158)668232.5292849544862273.56196672
3433109POINT(671613.424233 4854121.472532) POINT(654264.259259 4855357.41189)671610.5236143034854129.55436818
10612755POINT(619601.675367 4855174.599496) POINT(668724.797158 4862015.941886)619593.7115396854855174.74398836
sqlite>
  1. the SpatiaLite StartPoint() function returns the first POINT for a LINESTRING GEOMETRY.
  2. the EndPoint() function returns the last POINT for a LINESTRING GEOMETRY.
  3. the PointN() function returns the selected vertex as a POINT; each one vertex is identified by a relative index.
    The first vertex is identified by an index value 1, the second by an index value 2 and so on.
  4. please note that you can freely nest the various SpatiaLite functions, by passing the return value of the inner function as an argument for the outer one.

sqlite> SELECT name, AsText(GaiaGeometry) FROM Regions WHERE GaiaPrimaryKey = 52;
nameAsText(GaiaGeometry)
EMILIA-ROMAGNAPOLYGON((761808.155309 4966649.458816,
762432.549628 4966393.94736,
764168.27812 4966137.407299,
763486.212544 4966081.379442,
762462.485356 4966252.975485,
761808.155309 4966649.458816))
sqlite>
  1. POLYGON is another GEOMETRY class.
  2. in this one case you have fetched a very simple POLYGON, having only the exterior ring [i.e. it doesn't contains any internal hole].
    Remember that POLYGONs may optionally contain an arbitrary number of internal holes, each one delimited by an interior ring.
  3. the exterior ring in itself is simply a LINESTRING [and interior rings too are LINESTRINGS].
  4. note that a POLYGON is a closed geometry, and thus the first and the last POINT for each ring are exactly identical.
sqlite> SELECT GaiaPrimaryKey,
     ... > Area(GaiaGeometry), AsText(Centroid(GaiaGeometry)),
     ... > Dimension(GaiaGeometry), GeometryType(GaiaGeometry)
     ... > FROM Regions ORDER BY Area(GaiaGeometry) DESC LIMIT 5;
GaiaPrimaryKeyArea(GaiaGeometry)AsText(Centroid(GaiaGeometry)) Dimension(GaiaGeometry)GeometryType(GaiaGeometry)
10125779695636.9109POINT(955762.441021 4173082.511675)2POLYGON
10525391974998.0765POINT(414613.865386 4990738.194216)2POLYGON
10624069782849.5208POINT(560092.709693 5052167.502959)2POLYGON
7423962322618.287POINT(502694.575482 4437256.791656)2POLYGON
5322657959720.0959POINT(672979.667565 4814145.593235)2POLYGON
sqlite>
  1. we have already meet the SpatiaLite Dimension() and GeometryType() functions; they works for POLYGONs exactly in same fashion as for any other kind of GEOMETRY.
  2. the SpatiaLite Area() function returns the geometric area [expressed in square map units] for a POLYGON GEOMETRY.
  3. the Centroid() function returns the POINT identifying the centroid for a POLYGON GEOMETRY.
sqlite> SELECT GaiaPrimaryKey, NumInteriorRings(GaiaGeometry),
     ... > NumPoints(ExteriorRing(GaiaGeometry)), NumPoints(InteriorRingN(GaiaGeometry, 1))
     ... > FROM regions ORDER BY NumInteriorRings(GaiaGeometry) DESC LIMIT 5;
GaiaPrimaryKeyNumInteriorRings(GaiaGeometry) NumPoints(ExteriorRing(GaiaGeometry))NumPoints(InteriorRingN(GaiaGeometry, 1))
60391217
5516029
106NULL
2012NULL
3020NULL
  1. the SpatiaLite ExteriorRing() functions returns the exterior ring for a given GEOMETRY.
    Any valid POLYGON must have an exterior ring.
    Remember: each one of the rings belonging to a POLYGON is a closed LINESTRING
  2. the SpatiaLite NumInteriorRings() function returns the number of interior rings belonging to a POLYGON.
    A valid POLYGON may have any number of interior rings, including zero i.e. no interior ring at all.
  3. The SpatiaLite InteriorRingN() function returns the selected interior rings as a LINESTRING; each one interior ring is identified by a relative index.
    The first interior ring is identified by an index value 1, the second by an index value 2 and so on.
  4. Any ring is a LINESTRING, so we can use the NumPoints() function in order to detect the number of related vertices.
    If we call the NumPoints() function on a NULL GEOMETRY [or on a GEOMETRY of non-LINESTRING class] we'll get a NULL result.
    This explains why the the last three rows has a NULL NumPoints() result; there is no corresponding interior ring !
sqlite> SELECT AsText(InteriorRingN(GaiaGeometry, 1)),
     ... > AsText(PointN(InteriorRingN(GaiaGeometry, 1), 4)),
     ... > X(PointN(InteriorRingN(GaiaGeometry, 1), 5)),
     ... > Y(PointN(InteriorRingN(GaiaGeometry, 1), 5))
     ... > FROM Regions WHERE GaiaPrimaryKey = 55;
AsText(InteriorRingN(GaiaGeometry, 1))AsText(PointN(InteriorRingN(
GaiaGeometry, 1), 4))
X(PointN(InteriorRingN(
GaiaGeometry, 1), 5))
Y(PointN(InteriorRingN(
GaiaGeometry, 1), 5))
LINESTRING(756881.706704 4850692.62625,
760361.595005 4852743.267975,
759582.880944 4855493.610807,
757549.382306 4855414.551183,
755734.189332 4856112.118807,
755020.910885 4855996.887913,
754824.031873 4854723.577451,
756021.000385 4850937.420842,
756881.706704 4850692.62625)
POINT(757549.382306 4855414.551183)755734.1893322574856112.11880693
  1. we have already met in the preceding ones the usage of nested functions.
    For POLYGONs it becomes to be a little more tedious, but still easily understandable.
  2. e.g. to obtain the last column we have used InteriorRingN() in order to get the first interior ring, and then PointN() to get the fifth vertex.
    At last we can call Y() to get the coordinate value.
  3. it's quite boring, but not difficult at all, no ?

POINT, LINESTRING and POLYGON are the elementary classes for GEOMETRY.
But GEOMETRY supports the following complex classes as well:
  1. a MULTIPOINT is a collection of two or more POINTSs belonging to the same entity.
  2. a MULTILINESTRING is a collection of two or more LINESTRINGs.
  3. a MULTIPOLYGON is a collection of two or more POLYGONs.
  4. a GEOMETRYCOLLECTION is an arbitrary collection containing any other kind of geometries.
We'll not explain in detail this kind of collections, because it will be simply too boring and dispersive.
Generally speaking they regularly extends in the expected way properties corresponding to elementary corresponding classes.
  1. the SpatiaLite NumGeometries() function returns the number of elements for a collection.
  2. the GeometryN() function returns the N-th element for a collection.
  3. the GLength() function applied to a MULTILINESTRING returns the sum of individual lengths for each LINESTRING composing the collection.
  4. the Area() function applied to a MULTIPOLYGON returns the sum of individual areas for each POLYGON composing the collection.
  5. the Centroid() function in the SpatiaLite implementation always returns the centroid related to the POLYGON composing the collection which has the largest individual Area().

2.4. GEOMETRY ENVELOPE

Let see a fundamental property for any GEOMETRY class:
sqlite> SELECT Name, AsText(Envelope(GaiaGeometry)) FROM Regions LIMIT 5;
NameAsText(Envelope(GaiaGeometry))
VENETOPOLYGON((752912.250297 5027429.54477,
753828.826422 5027429.54477,
753828.826422 5028928.677375,
752912.250297 5028928.677375,
752912.250297 5027429.54477))
VENETOPOLYGON((751455.937063 5026778.301798,
752928.785333 5026778.301798,
752928.785333 5029157.835014,
751455.937063 5029157.835014,
751455.937063 5026778.301798))
VENETOPOLYGON((759461.944608 5026112.935302,
766247.780711 5026112.935302,
766247.780711 5036802.775999,
759461.944608 5036802.775999,
759461.944608 5026112.935302))
VENETOPOLYGON((750183.233817 5025314.495578,
750842.445895 5025314.495578,
750842.445895 5026793.815968,
750183.233817 5026793.815968,
750183.233817 5025314.495578))
LIGURIAPOLYGON((378359.802362 4848919.510676,
584633.642747 4848919.510676,
584633.642747 4947990.084988,
378359.802362 4947990.084988,
378359.802362 4848919.510676))
sqlite>
  1. the SpatiaLite Envelope() function always returns a POLYGON that is the Minimun Bounding Rectangle - MBR for the given GEOMETRY.
    Because an MBR is a rectangle, it always has 5 POINTs [remember: in closed geometries the last POINT must be identical to the first one].
  2. individual POINTs are as follow:
  3. MBRs are of peculiar interest, because by using them you can evaluate spatial relationships between two geometries in a simplified and roughly approximative way. But MBRs comparisons are very fast to compute, so they are very useful and widely used to speed up data processing.
All right, if you've followed and understood all the above examples, now you have acquired a basic familiarity with the fundamental core of Spatial data processing. We can now pass to examine some more advanced exercise.

3. Managing a GIS databases

Very simply speaking, a Geographic Information System - GIS is the one that supports geographic [or cartographic] informations; as a rule of thumb Spatial data [and Spatial-enabled DBMSs] exists mainly because they are the key component in GIS applications.
This section examines how you can manage a real GIS database, i.e.:
  1. how can you import any existing GIS dataset in a new SpatiaLite database.
  2. how can you insert, modify or delete the GIS datasets once they are stored inside the database you have just created.
  3. how can you export GIS datasets out of the SpatiaLite database.
Situations in which GIS datasets travels back and forward from a workstation and another, even at different locations and using completely different systems and application software, are widespread.
Good new #1: by using SQLite as your preferred GIS DBMS, this task becomes incredibly simple:
  1. an SQLite database is just an ordinary file; you can freely copy it, or even upload and download it on the WEB, send it as an e-mail attachment, zip and unzip it, and so on.
  2. SQLite's database are cross-platform, so you can produce them on a Windows PC and then use them on a Linux server, without any problem.
    Of course, SpatiaLite too guarantees that Spatial data didn't corrupts when going cross-platform.
Good new #2: SpatiaLite supports the almost universal ESRI shapefile format. So you can easily obtain some GIS dataset from an application like ArcGis, manipulate it using SpatiaLite as needed, and then transfer it back again. Shapefiles too can safely be used in a cross-platform fashion.

3.1 Exporting GIS data outside a SpatiaLite db

You have a SpatiaLite database on your hands, the test.db one.
To export all tables as shapefiles you simply have to type:

sqlite> SELECT ExportShapefile('Towns', 'POINTS','shape_towns', 'GaiaGeometry');
========
Dumping SQLite table 'Towns'
into shapefile at 'shape_towns'
8101 POLYGON/MULTIPOLYGON rows fetched from table
8101 entities exported to shapefile
[0.219 sec]
sqlite> SELECT ExportShapefile('HighWays', 'LINESTRINGS', 'shape_highways', 'GaiaGeometry');
========
Dumping SQLite table 'HighWays'
into shapefile at 'shape_highways'
775 LINESTRING/MULTILINESTRING rows fetched from table
775 entities exported to shapefile
[0.234 sec]
sqlite> SELECT ExportShapefile('Regions', 'POLYGONS', 'shape_regions', 'GaiaGeometry');
========
Dumping SQLite table 'Regions'
into shapefile at 'shape_regions'
109 POLYGON/MULTIPOLYGON rows fetched from table
109 entities exported to shapefile
[0.016 sec]
sqlite>
  1. the SpatiaLite ExportShapefile() function exports a whole table as a shapefile.
  2. argument #1 identifies the table name you want to export.
  3. argument #2 specifies which one GEOMETRY CLASS you intend to export.
  4. argument #3 is the pathname of the shapefile to be generated.
  5. argument #4 is the column name that identifies geometries to be exported.

If you now perform a directory listing [e.g. executing the DIR system command], you'll notice something like this:

17/03/200817.104.716shape_highways.dbf
17/03/200817.105.027.164shape_highways.shp
17/03/200817.106.300shape_highways.shx
17/03/200817.10318.988shape_regions.dbf
17/03/200817.10226.928shape_regions.shp
17/03/200817.1064.908shape_regions.shx
17/03/200817.10648.306shape_towns.dbf
17/03/200817.10226.928shape_towns.shp
17/03/200817.1064.908shape_towns.shx

Those files are the shapefiles you've right now exported.
All right, you can launch your preferred GIS viewer, e.g. QGis, and explore them.

view #1 shapefiles

view #2 shapefiles

At last you can view a real geographic map, instead of quite obscure WKB or WKT encoded Spatial data !
Don't play too much with GUI tools; let alone the GIS viewer and come back again to the SQL tutorial.
A very good idea may be the one to automate the boring tasks, like the ones of exporting your GIS data.
To do this you simply have to write an ordinary text file using some text editor, save it somewhere, and then execute it as an SQL script.
If you look inside the scripts directory of the SpatiaLite pack you have downloaded, you'll find a export-all.sql file.
Open and read it with your favorite text editor:

--
-- this SQL script automates shapefile export
--
-- all table of test.db are dumped as shapefiles
--

SELECT DumpShapefile('Towns', 'POINTS','shape_towns', 'GaiaGeometry');
SELECT DumpShapefile('HighWays', 'LINESTRINGS', 'shape_highways', 'GaiaGeometry');
SELECT DumpShapefile('Regions', 'POLYGONS', 'shape_regions', 'GaiaGeometry');
  1. lines starting with a double hyphen [--] are comments, and are not executed.
  2. other lines are exactly the same you have just manually typed in the last exercise.
  3. DumpShapefile() is just another alias name that identifies the ExportShapefile() function.
To execute this SQL script you just have to type:
sqlite> .read export-all.sql

It's very easy and simple, isn't it? Even if you are a GUI tools addict [and thus you judge shell environments to be very primitive, obsolete and awful things, and you hate them], you must admit that this one is the simplest and quickest way of exporting a whole GIS project at a snap you've ever seen since now.

3.2 Creating a new SpatiaLite db and populating it

Now you surely have some shapefiles, at least the ones you've just exported.
So in this example we are going to create from scratch a brand-new, empty database and then feeding it with our shapefiles.
Creating a new SQLite database is really simple; you just have to start a new SQLite session:
sqlite3 my_new_db.db

When SQLite begins a new session, if the db you requested didn't exists, it will create a new one.
SQLite version 3.x.x
enter ".help" for instructions
sqlite>


the SQLite session begins, and is currently connected to the test.db database;
now type the following commands to set the various options for output formatting:
sqlite> .nullvalue NULL
sqlite> .headers on
sqlite> .mode column
sqlite>.load 'SpatiaLite.dll'
sqlite>
That's always the same standard stuff, you already know ...

sqlite> SELECT LoadShapefile('shape_regions', 'NewRegions');
========
Loading shapefile from 'shape_regions'
into SQLite table 'NewRegions'
========
table 'NewRegions' created
acquired 109 POLYGON/MULTIPOLYGON entities from shapefile
inserted 109 rows
[0.281 sec]
sqlite> SELECT LoadShapefile('shape_highways', 'NewHighWays');
========
Loading shapefile from 'shape_highways'
into SQLite table 'NewHighWays'
========
table 'NewHighWays' created
acquired 775 LINESTRING/MULTILINESTRING entities from shapefile
inserted 775 rows
[3.156 sec]
sqlite> SELECT LoadShapefile('shape_towns', 'NewTowns');
========
Loading shapefile from 'shape_towns'
into SQLite table 'NewTowns'
========
table 'NewTowns' created
acquired 8101 POINT entities from shapefile
inserted 8101 rows
[1.469 sec]
sqlite> ANALYZE;
sqlite> VACUUM;
sqlite>
  1. the SpatiaLite LoadShapefile() function creates a new table and populate she by importing a whole shapefile.
  2. the ANALYZE and VACUUM SQL commands tells SQLite to reorganize and restructure the database.
    It's always a very good idea to perform them each time you insert or delete a large amount of data, in order to maintain your db in a well performing state.

Just to check if you have really populated the db you can try the followings:
sqlite> .tables
NewHighWaysNewRegionsNewTowns
sqlite>
The .tables directive causes SQLite to list all the tables contained in the current database.

sqlite> PRAGMA table_info(NewTowns);
cidnametypenotnulldflt_valuepk
0PK_UIDINTEGER0NULL1
1GaiaPrimarINTEGER99NULL0
2NameTEXT99NULL0
3PeoplesDOUBLE99NULL0
4LocalCouncBOOLEAN99NULL0
5CountyBOOLEAN99NULL0
6RegionBOOLEAN99NULL0
7geomBLOB99NULL0
sqlite>
The PRAGMA table_info(table_name) directive causes SQLite to list all columns belonging to the requested table.
  1. please note that the GaiaPrimaryKey and LocalCouncil original column names have been truncated.
    Unfortunately, shapefile format didn't allow column names to be longer than 10 character.
  2. the primary key column now has a PK_UID name, and the geometry column now has a geom name.
  3. Those are the default column names SpatiaLite assumes.
    You can alter this behaviour, of course, but you need to specify some extra parameter to the LoadShapefile() function.
sqlite> SELECT count(*), GeometryType(geom)
     ... > FROM NewTowns GROUP BY GeometryType(geom);
count(*)GeometryType(geom)
8101POINT
sqlite> SELECT count(*), GeometryType(geom)
     ... > FROM NewHighWays GROUP BY GeometryType(geom);
count(*)GeometryType(geom)
775LINESTRINGS
sqlite> SELECT count(*), GeometryType(geom)
     ... > FROM NewRegions GROUP BY GeometryType(geom);
count(*)GeometryType(geom)
109POLYGON
sqlite>

Ok, our new databases seems correctly created and loaded; we can pass to a new exercise.

3.3 Inserting, updating and deleting rows in a SpatiaLite db

Now you have two databases, the original test.db and the brand-new my_new_db.db; so you can safely use one of them to continue testing, and keep the other one as a backup copy.
Let you start with ordinary SQL commands:

sqlite> DELETE FROM NewTowns WHERE peoples < 100000;
sqlite> SELECT count(*) FROM NewTowns;
count(*)
42
sqlite>
Best wishes ... you've just unrecoverably lost some 8.000 rows.
If you repent yourself, is too late; they are gone for ever.

A wisest way to do this is:
sqlite> BEGIN;
sqlite> DELETE FROM NewTowns WHERE peoples < 100000;
sqlite> SELECT count(*) FROM NewTowns;
count(*)
42
sqlite> ROLLBACK;
sqlite> SELECT count(*) FROM NewTowns;
count(*)
8101
sqlite>
  1. the BEGIN SQL statement tells SQLite that you intend to start a transaction.
    Any operation that you perform within a transaction can be cancelled at any subsequent time, and the db will come back to its original unchanged state.
  2. the ROLLBACK SQL statement tells SQLite exactly this.
Otherwise, if you want to really confirm what you have already done:
sqlite> BEGIN;
sqlite> DELETE FROM NewTowns WHERE peoples < 100000;
sqlite> SELECT count(*) FROM NewTowns;
count(*)
42
sqlite> COMMIT;
sqlite> SELECT count(*) FROM NewTowns;
count(*)
42
sqlite>
  1. the COMMIT SQL statement definitively confirms all operations still pending.
  2. both ROLLBACK and COMMIT close the current transaction; if you want to continue working in transactional way, you should start a new transaction invoking BEGIN once again.
  3. if you simply omit to use at all the BEGIN, COMMIT or ROLLBACK statements, SQLite will assume you wish to operate in the so-called autocommit mode, i.e. each single statement is implicitly assumed to be contained within an automatic transaction.
    When you are working in autocommit mode typing a simple DELETE ... statement is understood to mean: BEGIN; DELETE ...; COMMIT;
  4. SQLite is a transactional database engine; working in autocommit mode easily can be a major cause for performance degradation.
  5. so, you are strongly encouraged to use transactions as appropriate in a sensible way.

Now you will create a new table, and then insert some rows in it:

sqlite> BEGIN;
sqlite> CREATE TABLE MyTable (
     ... > name TEXT NOT NULL,
     ... > geom BLOB NOT NULL);
sqlite> INSERT INTO MyTable (name, geom) VALUES
     ... > ('one', GeomFromText('POINT(1 1)'));
sqlite> INSERT INTO MyTable (name, geom) VALUES
     ... > ('two', GeomFromText('POINT(2 2)'));
sqlite> INSERT INTO MyTable (name, geom) VALUES
     ... > ('three', GeomFromText('POINT(3 3)'));
sqlite> SELECT name, AsText(geom) FROM MyTable;
name AsText(geom)
one POINT(1 1)
two POINT(2 2)
three POINT(3 3)
sqlite> .quit
  1. the CREATE TABLE allows you to create a new table, specifying the columns it contains too.
  2. you know well that a Spatial column is a BLOB one
  3. the INSERT INTO allows you to store new rows in the table.
  4. using the GeomFromText() you can create new GEOMETRY values at your will.
  5. you have also performed a final check, using SELECT.
  6. wow !!! you have just created a table and inserted Spatial data in it, so you can happily exit the SQLite session now.
After while, when you start a new SQLite session, unfortunately you discover that:

sqlite> SELECT name, AsText(geom) FROM MyTable;
SQL error: no such table: MyTable
sqlite>


Gosh, what's happening ? Where is your newly created table? It seems to be vanished ...
And actually it has gone; read again and carefully the last session:
  1. you started a transaction by invoking BEGIN
  2. then you have created your table
  3. then you have inserted some rows
  4. ... and after all you have suddenly stopped your session omitting to confirm all pending operation.
    You've forgotten to invoke COMMIT.
    So all the job you had done has gone forever.
  5. this one is the dark side of transactions ... never forget to COMMIT a pending transaction, or your work will simply disappear.
    I hope this catastrophic example may help you to avoid transaction disasters.

Updating rows is as simple as inserting or deleting them:

sqlite> SELECT pk_uid, name, peoples, AsText(geom)
     ... > FROM NewTowns WHERE pk_uid = 8006;
PK_UIDNamePeoplesAsText(geom)
8006Monza120204.0POINT(521332.99 5047818.45)
sqlite> UPDATE NewTowns SET peoples = 150000,
     ... > name = 'MONZA',
     ... > geom = GeomFromText('POINT(10 10)')
     ... > WHERE pk_uid = 8006;
sqlite> SELECT pk_uid, name, peoples, AsText(geom)
     ... > FROM NewTowns WHERE pk_uid = 8006;
PK_UIDNamePeoplesAsText(geom)
8006MONZA150000.0POINT(10 10)
sqlite>
  1. the UPDATE SQL statement lets you modify any column value.
    You simply have to SET column names and new values that will replace the current ones.
  2. you can use as usual the GeomFromText() function to obtain GEOMETRY values.
  3. the PK_UID column is a special one, being the PRIMARY KEY for this table.
    In every table a PRIMARY KEY column assures you that one and only one row may contains a selected value, thus assuring univocity.
  4. if one of your tables, for any reason, has no PRIMARY KEY, don't worry; every way SQLite manages automatically an useful unique identifier for each row in a table, whom name is ROWID; you can reference it SQL expressions exactly as it was an ordinary column name.
  5. note also that in this example you have apparently not used any transaction [you have used neither BEGIN nor COMMIT]; we have already explained this specific point, but it's interesting to underline this a second time.
    SQLite acts always as a transactional DBMS; what really happens behind the scenes in cases like this, is that you are implicitly using auto-commit mode i.e.:

The following examples requires the original my_test_db.db database.
Thus it's better if you rebuilt it now, to avoid any misalignment due to DELETEs and UPDATEs we have just performed.

The SQLite's SQL syntax offers you a straightforward way to create a new table and simultaneously populate it with selected data coming out from an another table:

sqlite> BEGIN;
sqlite> CREATE TABLE Villages AS
     ... > SELECT * FROM NewTowns WHERE peoples < 500;
sqlite> COMMIT;
sqlite> SELECT count(*) FROM Villages;
count(*)
845
sqlite>
  1. using CREATE TABLE .... AS SELECT ... SQLite clones automatically a new table [exactly identical to the original one; same column names etc].
  2. and then all rows coming out from the SELECT will be written in this new table.
SQLite support also another different SQL construct that allows you to create a new table and then populate it with data coming out from an existing table:

sqlite> BEGIN;
sqlite> CREATE TABLE Metropolis (
     ... > Name TEXT NOT NULL,
     ... > Population INTEGER NOT NULL,
     ... > Geometry BLOB NOT NULL);
sqlite> INSERT INTO Metropolis (Name, Population, Geometry)
     ... > SELECT name, peoples, geom FROM NewTowns
     ... > WHERE peoples > 1000000;
sqlite> COMMIT;
sqlite> SELECT name, population, AsText(geometry) FROM Metropolis;
NamePopulationAsText(geometry)
Roma2546804POINT(788703.57 4645636.3)
Milano1256211POINT(514820.49 5034534.56)
Napoli1004500POINT(942636.1 4535272.55)
sqlite>
  1. first you have create the new table, freely choosing columns, their names etc.
  2. then you call INSERT INTO ... SELECT that performs an automatic data transfer [either selected or complete] between the two tables.
Both these SQL constructs are very useful in a GIS-oriented environment; very often you can need to extract a small subset of your data [e.g. only the ones related to a single County, or Town], edit them in a convenient way, perform some very specifically oriented Spatial analysis, produce some specialized output, dump a selected shapefile and so on.
Under these circumstances, creating a set of derived tables can largely help.

Once you don't need no more to keep those short lived tables, you can eliminate them:

sqlite> DROP TABLE Villages;
sqlite> DROP TABLE Metropolis;
sqlite> VACUUM;
sqlite>
  1. a DROP statement will completely eliminate a table and all data it contains.
  2. never forget to perform a VACUUM after dropping tables, in order to really free unused space, compact the database and so on.

Before leaving this section, it's better to underline a few basilar concepts about SQLite's very peculiar way to manage column types. Consider the following example:

sqlite> CREATE TABLE some_table (
     ... > N1 SMALLINT,
     ... > N2 INTEGER NOT NULL,
     ... > N3 DOUBLE,
     ... > STR VARCHAR(4) NOT NULL);
sqlite> INSERT INTO some_table VALUES (10, 11, 111.1111, 'first');
sqlite> INSERT INTO some_table VALUES (NULL, 12, NULL, 'second');
sqlite> INSERT INTO some_table VALUES (30, NULL, 222.2222, NULL);
SQL error: some_table.N2 may not be NULL


all right since now; any other DBMS you have ever used will perform these operations exactly in this way, no?

sqlite> INSERT INTO some_table VALUES ('aaaa', 'bbbb', 'cccc', 1234);
sqlite> INSERT INTO some_table VALUES ('A', 'B', 'C', 1234.6789);


Oops, this one is an unexpected one; you have just inserted without any complaint string values in numeric columns, and numeric values in a CHAR column ....
it's very puzzling, no ?

sqlite> SELECT * FROM some_table;
N1N2N3Str
1011111.1111first
NULL12NULLsecond
aaaabbbbcccc1234
ABC1234.6789
sqlite>
  1. SQLite does not enforce type checking for column values
  2. you can place any kind of data you want in each column, without restrictions; no checking is performed in order to assure conformance with column type declared in CREATE TABLE
  3. you may either love or hate this; it's not at all a bug, it's an explicit design feature of SQLite.
Anyway, if you really absolutely need to implement an effective type-value checking for some column, SQLite can help you to do such a thing:

sqlite> CREATE TABLE checked_table (
     ... > num SMALLINT CHECK (num BETWEEN -32768 AND 32767),
     ... > str VARCHAR(4) CHECK (      ... > length(str) <= 4 AND str BETWEEN ' ' AND 'zzzz'));
sqlite> INSERT INTO checked_table VALUES (1, 'a');
sqlite> INSERT INTO checked_table VALUES (1111, 'aaaa');
sqlite> INSERT INTO checked_table VALUES (1234, 'abcd');
sqlite> INSERT INTO checked_table VALUES ('a', 'abcd');
SQL error: constraint failed


it seems to work just fine, isn't it ?

sqlite> INSERT INTO checked_table VALUES (5678, 1);
sqlite> INSERT INTO checked_table VALUES ('9999', 1234);


oh no, it doesn't seem to works anymore ...

sqlite> INSERT INTO checked_table VALUES ('35000', 1);
SQL error: constraint failed
sqlite> INSERT INTO checked_table VALUES (2, 12345);
SQL error: constraint failed

sqlite>

it really works, after all.
  1. the CHECK clause allows you to define column-based constraints, thus enforcing type conformity.
  2. SQLite enforces CHECKing before inserting or updating column values.
  3. an implicit type conversion may be performed as well when appropriate.

4. Performing some Spatial data analysis

Since now you have acquired the most basic knowledge about SpatiaLite and SQLite.
Now you should be able to manage a spatially enabled database, populate it, add new tables, inserting and updating Spatial data and so on.
Time has come to finally examine the hard core of Spatial data processing, i.e. how you can evaluate relations between geometries.
The ordinary way standard SQL works is by evaluating relations existing between entities, e.g. when you perform a JOIN that imposes some clause like:

... WHERE t1.sex = t2.sex AND t1.name = t2.name ...

Spatial analysis do exactly the same, but evaluates spatial relationships that may join [or disjoin] geometries using some spatial criterion.

4.1 Evaluating MBRs relationships

Every GEOMETRY as its own Minimum Bounding Rectangle - MBR; see the Envelope() function, if you don't remember what an MBR is.
mbrs relationships
These ones are the spatial relations that can exists between a couple of MBRs.
  1. Evaluating spatial relationship existing between two geometries by comparing their MBRs relations is a very imperfect and approximative way to check real spatial relations.
  2. Anyway, MBRs comparisons are the coarsest but fastest way to compare spatial relations.
    Evaluating real relation existing between two arbitrary complex geometries may be an heavily time consuming task. Vice-versa an MBRs comparison is very quick to evaluate.
  3. So evaluating MRBs is at least the best strategy to restrict the field.
    Once you have eliminated the [many] incompatible couples, then you can pass to check better only the [few] couples that may be of real interest.
mbrs relationships (2)
  1. if two MBRs are disjointed, surely related geometries are disjointed as well.
  2. but if two MBRs overlaps or intersects, you cannot be sure at all that related geometries overlaps or intersects as well; in this circumstance you need to check better by evaluating real Spatial relations.

A very common problem in GIS applications is the one to quickly and efficiently fetch lots of entities belonging to some specific frame; e.g. think of drawing a rectangular map.
You'll need to fetch all the few GIS entities within the frame, discarding the many ones outside the frame [that will be invisibles at all].

sqlite> SELECT count(*) FROM NewTowns;
count(*)
8101

this one is the unfiltered dataset for NewTowns [remember: for this table GEOMETRY is a POINT class one]

sqlite> SELECT count(*) FROM NewTowns WHERE MBRContains(
     ... > GeomFromText('POLYGON((554000 4692000, 770000 4692000,
     ... > 770000 4925000, 554000 4925000, 554000 4692000))'), geom);
count(*)
480
sqlite>
  1. the SpatiaLite MBRContains() function allows you to restrict the search field.
  2. the GeomFromText('POLYGON(...)') build the MBR corresponding to current frame
    [i.e. screen, some raster image, paper sheet on a printer etc]
sqlite> SELECT count(*) FROM NewTowns WHERE MBRContains(
     ... > GeomFromText('POLYGON((654000 4692000, 770000 4692000,
     ... > 770000 4924000, 654000 4924000, 654000 4692000))'), geom);
count(*)
295
sqlite> SELECT count(*) FROM NewTowns WHERE MBRContains(
     ... > GeomFromText('POLYGON((754000 4692000, 770000 4692000,
     ... > 770000 4924000, 754000 4924000, 754000 4692000))'), geom);
count(*)
47
sqlite>
  1. you can easily use subsequent call to MBRContains() with different frames [MBRs]in order to obtain a zoom in sequence.
  2. simply you have to use each time a wider or narrower MBR, as needed.
sqlite> SELECT count(*) FROM NewTowns WHERE MBRWithin(
     ... > geom, GeomFromText('POLYGON((754000 4692000, 770000 4692000,
     ... > 770000 4924000, 754000 4924000, 754000 4692000))'));
count(*)
47
sqlite>
  1. the MBRWithin() function is the same as MBRContains(), but the order of MBRs is inverted.
  2. you can freely use the one of them you find more convenient; result is the same anyway.
sqlite> SELECT count(*) FROM NewHighWays WHERE MBRIntersects(
     ... > GeomFromText('POLYGON((754000 4692000, 770000 4692000,
     ... > 770000 4924000, 754000 4924000, 754000 4692000))'), geom);
count(*)
15
sqlite>
  1. do you remember? the HighWay table has a GEOMETRY class of type LINESTRING / MULTILINESTRING
  2. when accessing LINESTRINGs or POLYGONs [or their MULTI-collections] the SpatiaLite MBRIntersects() function allow you to select entities within a given frame.
  3. for MBRIntersects() the order by which you refer geometries hasn't any influence.

5. Managing Coordinate Reference Systems and Coordinate Transformation

Each coordinate value, to be fully qualified, needs to be explicitly assigned to some SRID, or Spatial Reference Identifier.
This value identifies the geometry's associated Spatial Reference System that describes the coordinate space in which the geometry object is defined.
As a general rule, different geometries can interoperate in a meaningful way only if their coordinates are expressed in the same Coordinate Reference System.
Very often some kind of coordinate reprojection is required in order to convert different GIS data in an unique, homogeneous, Coordinate Reference System and thus allowing interoperation and integration.

srid example #1
When you try to use two GIS datasets belonging to different Coordinate Reference Systems, entities falls very far because numeric values for coordinates are obviously in two different spaces.
srid example #2
But if you apply some opportune coordinate reprojection for one dataset, so to put it in the same Coordinate Reference Systems of the other one, entities will correctly overlap as expected.

The European Petroleum Survey Group [EPSG] maintains and distributes a large dataset of geodetic parameters describing quite any Coordinate Reference System and Coordinate Transformation used worldwide for GIS data.
SpatiaLite on its own implements SRIDs for any kind of geometric class, and supports the EPSG dataset to identify Coordinate Reference Systems.
The test.db already contains the epsg table, so we can quickly start an SQLite session:

sqlite> SELECT * FROM epsg LIMIT 5;
sridnameproj_params
2000Anguilla 1957 / British West Indies Grid +proj=tmerc +lat_0=0 +lon_0=-62 +k=0.9995000000000001 +x_0=400000 +y_0=0 +ellps=clrk80 +units=m +no_defs
2001Antigua 1943 / British West Indies Grid +proj=tmerc +lat_0=0 +lon_0=-62 +k=0.9995000000000001 +x_0=400000 +y_0=0 +ellps=clrk80 +units=m +no_defs
2002Dominica 1945 / British West Indies Grid +proj=tmerc +lat_0=0 +lon_0=-62 +k=0.9995000000000001 +x_0=400000 +y_0=0 +ellps=clrk80 +towgs84=725,685,536,0,0,0,0 +units=m +no_defs
2003Grenada 1953 / British West Indies Grid +proj=tmerc +lat_0=0 +lon_0=-62 +k=0.9995000000000001 +x_0=400000 +y_0=0 +ellps=clrk80 +towgs84=72,213.7,93,0,0,0,0 +units=m +no_defs
2004Montserrat 1958 / British West Indies Grid +proj=tmerc +lat_0=0 +lon_0=-62 +k=0.9995000000000001 +x_0=400000 +y_0=0 +ellps=clrk80 +towgs84=174,359,365,0,0,0,0 +units=m +no_defs
sqlite>
  1. each Coordinate Reference System is uniquely identified by a srid value.
  2. and is qualified by an intelligible name
  3. and has a lot of [quite obscure] geodetic projection parameters
    Don't worry about this ... you can just get them as black magic for now ...
    If you want to learn more about them, you can usefully read the followings: http://en.wikipedia.org/wiki/Geodetic and ftp://ftp.remotesensing.org/proj/OF90-284.pdf
sqlite> SELECT DISTINCT Srid(GaiaGeometry) FROM Towns;
Srid(GaiaGeometry)
32632
sqlite>
  1. the SpatiaLite SRID() function allows you to identify the srid value that identifies any kind of geometry.
  2. all geometries in the test.db have a srid value of 32632.
    You can check this by yourself.
sqlite> SELECT DISTINCT SRID(Towns.GaiaGeometry), epsg.name
     ...> FROM Towns, epsg
     ...> WHERE SRID(Towns.GaiaGeometry) = epsg.srid;

SRID(Towns.GaiaGeometry)epsg.name
32632WGS 84 / UTM zone 32N
sqlite>
  1. you can perform a simple join to discover what srid 32632 really means.
  2. all geometries in the test.db pertains to the UTM zone 32N Coordinate Reference System.
Now suppose that for any good reason you absolutely need to transform coordinates, in such a way to obtain a new dataset pertaining to the WGS 84 Coordinate Reference System.
You can do this quite complex task by simply using a plain SQL query.

sqlite> BEGIN;
sqlite> ALTER TABLE Towns ADD COLUMN wgs84 BLOB;
sqlite> UPDATE Towns SET wgs84 = Transform(GaiaGeometry, 4326);
sqlite> COMMIT;
sqlite> SELECT AsText(GaiaGeometry), Srid(GaiaGeometry),
     ... > AsText(wgs84), Srid(wgs84) FROM Towns LIMIT 5;

AsText(GaiaGeometry)Srid(GaiaGeometry)AsText(wgs84)Srid(wgs84)
POINT(427002.77 4996361.33)32632POINT(8.071929 45.116952)4326
POINT(367470.48 4962414.5)32632POINT(7.324219 44.802838)4326
POINT(390084.12 5025551.73)32632POINT(7.596214 45.374854)4326
POINT(425246.99 5000248.3)32632POINT(8.049028 45.151753)4326
POINT(426418.89 4957737.37)32632POINT(8.070136 44.769244)4326
sqlite>
  1. the srid that identifies WGS 84 is 4326.
    You can check this by yourself.
  2. the SpatiaLite Transform() function obtains a new geometry from an original one, applying a coordinate reprojection as required.
  3. Now you have two alternative geometries for the Towns table:

Now we'll try to repeat the same example, but using the other my_new_db.db database, so we'll start another SQLite session using the other database.

sqlite> SELECT DISTINCT Srid(geom) FROM NewTowns;
Srid(geom)
–1
sqlite>
  1. the SpatiaLite SRID() function returns an unexpected –1 value.
    For SpatiaLite this means our geometries have an unknown / undefined Coordinate Reference System.
  2. the reason for this is in that we loaded the my_new_db.db database from shapefiles.
    Unhappily shapefile is a data format that don't contains any information referencing SRIDs.
  3. we can easily recover from this mismatch; simply we have to assign explicitly a SRID to our geometries.
sqlite> BEGIN;
sqlite> UPDATE NewTowns SET geom = SetSrid(geom, 32632);
sqlite> COMMIT;
sqlite> SELECT DISTINCT Srid(geom) FROM NewTowns;
Srid(geom)
32632
sqlite>
  1. the SpatiaLite SetSRID() function returns a new geometry identical to the original one, but with a different SRID value.
  2. Please note that no coordinate transformation is applied; this function merely modifies the SRID value.
sqlite> SELECT DISTINCT SRID(NewTowns.geom), epsg.name
     ...> FROM NewTowns, epsg
     ...> WHERE SRID(NewTowns.geom) = epsg.srid;

SQL error: no such table: epsg
sqlite>
  1. we have encountered a second problem.
  2. the epsg table doesn't exists in the my_new_db.db database we are currently using.
  3. if you remember the steps we have performed since now, this one should not be a surprise at all:
  4. we have to learn now how it is possible to create and load the epsg table required by SpatiaLite in order to perform coordinate reprojections.
Don't worry; to run a sql script is really very easy:

sqlite> .read epsg-sqlite.sql
sqlite> SELECT DISTINCT SRID(NewTowns.geom), epsg.name
     ...> FROM NewTowns, epsg
     ...> WHERE SRID(NewTowns.geom) = epsg.srid;

SRID(NewTowns.geom)epsg.name
32632WGS 84 / UTM zone 32N
sqlite>

All done; the my_new_db.db database is now completed with its own epsg table.

6. Performing SQL queries directly on shapefiles

The SQLite DBMS implements a very interesting feature; you can develop some specialized driver in order to access any generic [and possibly exotic] data source, and then those external data will appear to the SQL engine exactly as they where native ones.
This means you can apply any SQL operation on them, if appropriate / possible and if the driver offers an adequate support.

The VirtualShape extension implements this kind of driver for SQLite.
Not only; VirtualShape is designed to interact very strictly with SpatiaLite, in order to support a complete direct access to shapefiles via standard SQL, for both attributes and geometries.

sqlite> .load 'SpatiaLite.dll'
sqlite> .load 'VirtualShape.dll'
sqlite>

  1. You already know this; you can use .load in order to dynamically load an SQLite extension.
  2. If you load both the VirtualShape and the SpatiaLite extensions, your SQLite db will be able to perform SQL standard queries directly on shapefiles, with no needing at all to import them in some SQLite table.
  3. we can perform a very quick test by using one shapefile, i.e. the shape_towns we have previously created.
sqlite> CREATE VIRTUAL TABLE test_shape USING VirtualShape(shape_towns);
sqlite> PRAGMA table_info(test_shape);

cidnametypenotnulldflt_valuepk
0PKUIDINT00
1GEOMETRYBLOB00
2GaiaPrimarINT00
3NameVARCHAR(62)00
4PeoplesDOUBLE00
5LocalCouncVARCHAR(1)00
6CountyVARCHAR(1)00
7RegionVARCHAR(1)00
sqlite>
  1. with CREATE VIRTUAL TABLE you request SQLite to access some exotic external data source:
  2. the PRAGMA info_table simply requests SQLite to show all columns within a table; you already known this.
  3. as you can see, any attribute defined for your shapefile has been mapped into a corresponding SQL type.
  4. you can notice too that there are two other columns:
sqlite> SELECT PKUID, Name, Peoples, AsText(GEOMETRY)
     ...> FROM test_shape
     ...> WHERE Peoples > 350000 ORDER BY Name;

PKUIDNamePeoplesAsText(GEOMETRY)
8078Bologna371217.0POINT(686263.23 4929405.15)
7980Firenze356118.0POINT(681514.97 4848768.02)
203Genova610307.0POINT(492370.69 4918665.57)
8003Milano1256211.0POINT(514820.49 5034534.56)
8072Napoli1004500.0POINT(942636.1 4535272.55)
7922Palermo686722.0POINT(880179.17 4227024.08)
7918Roma2546804.0POINT(788703.57 4645636.3)
8012Torino865263.0POINT(395553.63 4991768.9)
sqlite>
  1. the VirtualShape driver is delegated to physically access the external data source [actually, a shapefile]
  2. the only restriction is that [for now ...] the current VirtualShape implementation is limited to read only operations
  3. please note that VIRTUAL TABLEs definitions are stored permanently inside the SQLite database
sqlite> DROP TABLE test_shape; sqlite>
  1. your VIRTUAL TABLE will be cancelled, and is no longer available
  2. don't worry about the original external shapefile; it still exists, with no modifications.
    Dropping a VIRTUAL TABLE simply cancels the SQLite link with the external data source, but don't deletes this later one.