A quick tutorial to SpatiaLite - a Spatial extension for SQLite

Table of Contents


1. Preparing to start

This tutorial is intended to let you become quickly trained in order to understand and use SpatiaLite and its Spatial functions.
May be you are not a computer programmer, neither a system administrator, nor a DBMS administrator. [but if you are, don't worry; this tutorial can help you too]
Don't bother about this; the tutorial tries to avoid using a too complex terminology, or to explain every arcane detail concerning spatial data processing, SQL and so on.
The main scope of this tutorial is simply to guide you in a step by step process, exploring what features SpatiaLite can offer you, and how and when you are expected to use them.
Each single feature will be explained with simple but realistic examples, based on the test-2.3.sqlite that you should have downloaded with this tutorial.

1.1. What you need to begin

We suppose you have at least a basic familiarity with computers, and we expect too you are able to list a directory, install software and alike; some knowledge of shell environment will be very useful as well.
For the sake of simplicity all the following examples supposes you are doing the tests using a Windows PC; if not, I'm sorry ... but I presume it will be very easy for you to accommodate some few adaption needed for Linux, MacOs etc

SpatiaLite comes in various flavours; you must choice one of them before starting.
  1. The simplest way is to use the spatialite.exe executable; it contains all-in-one anything you need in order to get started.
  2. If you are not accustomed to use command line tools, and you strongly dislike this kind of stuff, don't worry.
    You are free to use a true GUI-tool as well [i.e. one supporting graphic windows, a mouse, buttons and so on].
    You simply have to download and run the spatialite-gui suited for the platform you are currently using; this too doesn't require any installation at all, and contains all-in-one anything you need.
  3. Alternatively, you may wish to load SpatiaLite as a dynamic extension into the basic, standard sqlite.exe.
    You can do such a thing, but it's a little harder to do, and you may easily encounter some configuration headache.
  4. If you don't have any GIS tool already installed, may be a good idea to install the open source QuantumGis; it can be useful in order to display at a glance the shapefiles we are going to produce among this tutorial.
    You can download it from www.qgis.org; it has an ordinary Windows Installer that creates a Desktop icon and so on.
Now you have all that is needed to start the tutorial.

1.2. The test-2.3.sqlite database

This one is a simple, small-sized but quite realistic database containing some free available GIS data [you can found the original ones at www.rigacci.org]
The test-2.3.sqlite contains the following tables:
  1. the regions table contains 109 rows; each entity has a geometry, that is a POLYGON.
  2. the towns table contains 8101 rows; each entity has a geometry, that is a POINT.
  3. the highways table contains 775 rows, ; each entity has a geometry, that is a LINESTRING.

1.3. Getting started with SQL and SQLite

This section will try to let you become accustomed to SQLite in very quick way. If you are anyway more seriously interested in this, please read the official SQLite documentation.
To start using SQLite, open a new shell (command window), then go to the folder you have prepared in the preliminaries, and type the following command:

sqlite3 test-2.3.sqlite

Or even better, type: spatialite test-2.3.sqlite

spatialite.exe includes a complete support for SQLite; so it allows you to do anything you can do using sqlite3, and in exactly the same way.
you should receive this prompt:

SpatiaLite version ..: 2.3.0 Supported Extensions:
    - 'VirtualShape' [direct Shapefile access]
    - 'VirtualText' [direct CSV/TXT access]
    - 'VirtualNetwork' [Dijkstra shortest path]
    - 'RTree' [Spatial Index - R*Tree]
    - 'MbrCache' [Spatial Index - MBR cache]
    - 'VirtualFDO' [Spatial Index - MBR cache]
    - 'SpatiaLite' [FDO-OGR interoperability]
PROJ.4 version ......: Rel. 4.6.1, 21 August 2008
GEOS version ........: 3.1.0-CAPI-1.5.0
SQLite version ......: 3.6.12
Enter ".help" for instructions
spatialite>


OK, you have just started you first SQLite working session, that is currently connected to the test-2.3.sqlite database; congratulations.
Now type the following commands:
spatialite> .nullvalue NULL
spatialite> .headers on
spatialite> .mode column

nothing of very interesting in this; you are simply selecting some trivial operative options: if you are curious to discover what exactly they means, you can discover it by yourself just typing:
spatialite> .help

All command starting with a dot [.], as .mode or headers is interpreted by SQLite as an internal command, and not as an SQL expression. i.e. internal commands set some option or mode for SQLite, and have nothing to do with real data processing.
now that the SQLite session is started, we can execute our first SQL query on the test-2.3.sqlite database:
spatialite> SELECT * FROM towns LIMIT 5;
PK_UIDGeometryNamePeoples LocalCouncilCountyRegion
1Brozolo435 100
2Campiglione-Fenile1284 100
3Canischio274 100
4Cavagnolo2281 100
5Magliano Alfieri1674 100
spatialite>
A few points to understand:
  1. the SELECT statement requests SQLite to perform a query
  2. fetching all columns [*]
  3. FROM the database table of name towns
  4. retrieving only the first five rows [LIMIT 5]
  5. note the any SQL statement must end with a semicolon [;]

now try this second SQL query:
spatialite> select name AS Town, peoples as Population from towns ORDER BY name LIMIT 5;
TownPopulation
Abano Terme18206
Abbadia Cerreto276
Abbadia Lariana3151
Abbadia San Salvatore6832
Abbasanta2815
spatialite>
  1. in SQL constructs using lower- or upper-case has identical effects; SELECT is the same of select, FROM and from are equivalents.
  2. you can freely choose which one columns has to be fetched, determine at your will their order, and even rename then if you want to do so by using the AS clause.
  3. you can order the fetched rows by using the ORDER BY clause.

a more complex SQL query:
spatialite> select name, peoples from towns WHERE peoples > 350000 order by peoples DESC;
TownNamePopulation
Roma2546804
Milano1256211
Napoli1004500
Torino865263
Palermo686722
Genova610307
Bologna371217
Firenze356118
spatialite>
  1. you can filter a specific set of rows by imposing a WHERE clause; only those rows that satisfies the logical expression you specify will be fetched.
    In this example only Towns with a population greater than 350000 peoples has been fetched.
  2. you can order rows in descending order if appropriate, by using the DESC clause.

now we'll see a different mode of using an SQL query:
spatialite> select COUNT(*) as '# Towns',
           ... > MIN(peoples) as Smaller,
           ... > MAX(peoples) as Bigger,
           ... > SUM(peoples) as 'Total peoples',
           ... > SUM(peoples) / COUNT(*) as 'mean peoples for town'
           ... > from towns;
# TownsSmallerBigger Total peoplesmean peoples for town
8101332546804 570061477036
spatialite>
  1. you can split complex queries along many lines; SQLite starts executing the query only where it finds a line closed by a semicolon [;] In other words, until you enter a [;] as the last character in a line, you are always continuing to type the same query, and SQLite waits until you have finished to complete the query.
  2. you can use functions in an SQL query.
    COUNT(), MIN(), MAX() and SUM() are functions. Not at all surprisingly:
  3. even more, you can docalculations in your query.
    e.g. we have calculated the mean of peoples per village dividing the SUM() by the COUNT() values.

note that a valid SQL query may simply consists of expressions and functions to be evaluated:
spatialite> select (10 - 11) * 2 as Number, ABS((10 - 11) * 2) as AbsoluteValue;
NumberAbsoluteValue
-22
spatialite>
  1. the (10 - 11) * 2 term is an example of expression
  2. the ABS() functions returns the absolute value of a number.
  3. note that in this example we have not used any DB column or DB table at all.

now we'll repeat an SQL query we have already performed, but with a small variation:
spatialite> select name, peoples, HEX(Geometry)
           ... > from Towns where peoples > 350000 order by peoples DESC;
NamePeoplesHEX(Geometry)
Roma2546804 0001787F00003D0AD723BF11284133333313C1B851413D0AD723BF11284133333313C1B851417C010000003D0AD723BF11284133333313C1B85141FE
Milano1256211 0001787F00005C8FC2F5116C1F413D0AD7A3893453415C8FC2F5116C1F413D0AD7A3893453417C010000005C8FC2F5116C1F413D0AD7A389345341FE
Napoli1004500 0001787F00003333333358C42C4133333323FA4C51413333333358C42C4133333323FA4C51417C010000003333333358C42C4133333323FA4C5141FE
Torino865263 0001787F000052B81E85862418419A999939C60A534152B81E85862418419A999939C60A53417C0100000052B81E85862418419A999939C60A5341FE
Palermo686722 0001787F0000713D0A5766DC2A4152B81E05F41F5041713D0A5766DC2A4152B81E05F41F50417C01000000713D0A5766DC2A4152B81E05F41F5041FE
Genova610307 0001787F0000295C8FC24A0D1E4148E17A6462C35241295C8FC24A0D1E4148E17A6462C352417C01000000295C8FC24A0D1E4148E17A6462C35241FE
Bologna371217 0001787F00005C8FC2756EF124419A999949DFCD52415C8FC2756EF124419A999949DFCD52417C010000005C8FC2756EF124419A999949DFCD5241FE
Firenze356118 0001787F00000AD7A3F055CC244114AE4701207F52410AD7A3F055CC244114AE4701207F52417C010000000AD7A3F055CC244114AE4701207F5241FE
spatialite>
  1. the HEX() function returns the hexadecimal representation of a BLOB column value.
  2. in the preceding execution of this query, the geom column seemed empty; now, by using the HEX() function, we discover that contains lots of strange binary data.
  3. geom contains GEOMETRY values, stored as BLOBs and encoded in the internal representation used by SpatiaLite.
    SQLite in its own hasn't the slightest idea of what GEOMETRY is, and cannot do any other operation on it.
    To really use GEOMETRY values, it's time use the SpatiaLite extension.
We have completed our first step. If you have understood all the preceding examples, you become to be familiar with basic SQL, and that's enough to start using SpatiaLite.

To terminate the current SQLite session and exit, you simply have to type:
spatialite> .quit

2. Getting started with SpatiaLite

This section examines the most basilar operations concerning Spatial data. In order to enable SQLite to process Spatial data as well, you need to use the SpatiaLite extension.
SpatiaLite [to be pronounced as spatial light] is simply a small library, that supports a collection of SQL functions in a way conformant to the OpenGisConsortium OGC specification for Spatial data processing.
If you are interested on this, you can obtain a copy of OGC specification from the OGC web site

2.1 How to load SpatiaLite [for sqlite3 users only]

You should first start the sqlite3.exe as usual, so type the following command:

sqlite3 test-2.3.sqlite

you should receive this prompt:
SQLite version 3.x.x
enter ".help" for instructions
sqlite>


the SQLite session begins, and is currently connected to the test-2.3.sqlite database;
now type the following commands to set the various options for output formatting:
sqlite> .nullvalue NULL
sqlite> .headers on
sqlite> .mode column
sqlite>


and now you can load the SpatiaLite extension:
sqlite> .load 'libspatialite-2.dll'

You'll then notice the following message:
PROJ.4 Rel. 4.6.0, 21 Dec 2007
GEOS version 3.0.0-CAPI-1.4.1
    *** Extension 'RTree' ready [Spatial Index support]
    *** Extension 'VirtualShape' ready [direct Shapefile access]
    *** SQLite's extension 'SpatiaLite' is enabled ******
sqlite>


an alternative way to load SpatiaLite is to execute the following SQL expression:
SELECT load_extension('libspatialite-2.dll');

If you are using the spatialite.exe executable, you don't need to load any extension, because SpatiaLite is automatically included

2.2 A first glance: familiarizing with GEOMETRY

All right, now you can execute your first Spatial query: we'll repeat an SQL query we have already performed:
spatialite> SELECT name, peoples, AsText(Geometry)
           ... > from Towns where peoples > 350000 order by peoples DESC;
NamePeoplesAsText(Geometry)
Roma2546804POINT(788703.57 4645636.3)
Milano1256211POINT(514820.49 5034534.56)
Napoli1004500POINT(942636.1 4535272.55)
Torino865263POINT(395553.63 4991768.9)
Palermo686722POINT(880179.17 4227024.08)
Genova610307POINT(492370.69 4918665.57)
Bologna371217POINT(686263.23 4929405.15)
Firenze356118POINT(681514.97 4848768.02)
spatialite>
  1. the AsText() function is a SpatiaLite one, and returns the Well Known Text - WKT representation for a GEOMETRY column value.
    WKT is a standard notation conformant to OpenGIS specification.
  2. in the preceding execution of this query, the HEX() function returned lots of strange binary data.
    Now the AsText() function shows useful and quite easily understandable GEOMETRY values.
  3. a POINT is the simplest GEOMETRY class, and has only a couple of [X,Y] coordinates.
repeat the preceding query in slightly different way:
spatialite> SELECT name, X(Geometry), Y(Geometry) FROM Towns
           ... > WHERE peoples > 350000
           ... > ORDER BY peoples DESC;
NameX(Geometry)Y(Geometry)
Roma788703.574645636.3
Milano514820.495034534.56
Napoli942636.14535272.55
Torino395553.634991768.9
Palermo880179.174227024.08
Genova492370.694918665.57
Bologna686263.234929405.15
Firenze681514.974848768.02
spatialite>
  1. the SpatiaLite X() function returns the X coordinate for a POINT.
  2. the Y() function returns the Y coordinate for a POINT.
you can use the following GEOMETRY format conversion functions:
spatialite> SELECT HEX(GeomFromText('POINT(10 20)'));
hex(GeomFromText('POINT(10 20)'))
0001FFFFFFFF00000000000024400000000000003440000000000000244000000000000034407C0100000000000000000024400000000000003440FE

spatialite> SELECT HEX(AsBinary(GeomFromText('POINT(10 20)')));
hex(AsBinary(GeomFromText('POINT(10 20)')))
010100000000000000000024400000000000003440

spatialite> SELECT AsText(GeomFromWKB(X'010100000000000000000024400000000000003440'));
AsText(GeomFromWKB(X'010100000000000000000024400000000000003440'))
POINT(10 20)
spatialite>
  1. the SpatiaLite GeomFromText() function returns the internal BLOB representation for a GEOMETRY.
  2. the AsBinary() function returns the Well Known Binary - WKB representation for a GEOMETRY column value.
    WKB is a standard notation conformant to OpenGIS specification.
  3. the GeomFromWKB() function converts a WKB value into the corresponding internal BLOB value.

2.3. GEOMETRY classes

This section helps you to discover the various GEOMETRY classes supported by SpatiaLite as they are defined in the OpenGIS specification; simply speaking, any GEOMETRY CLASS is a peculiar kind of GEOMETRY.
You have already met the POINT class in the preceding paragraph; that's time you'll become accustomed to her sister classes too.

a simple SQL query to explore a new GEOMETRY class:

spatialite> SELECT PK_UID, AsText(Geometry) FROM HighWays WHERE PK_UID = 883;
PK_UIDAsText(Geometry)
883LINESTRING(689001.702718 4798988.808442,
689027.602471 4798996.686619,
689029.54214 4798989.585948,
689029.54214 4798989.585948)
spatialite>
  1. LINESTRING is another GEOMETRY class, and has lots of POINTs.
  2. in this one case you have fetched a very simple LINESTRING, representing a polyline with just 4 vertices.
  3. it isn't at all unusual to encounter LINESTRINGs with some thousands of vertices in real GIS data.
spatialite> SELECT PK_UID, NumPoints(Geometry), GLength(Geometry),
           ... > Dimension(Geometry), GeometryType(Geometry)
           ... > FROM HighWays ORDER BY NumPoints(Geometry) DESC LIMIT 5;
PK_UIDNumPoints(Geometry)GLength(Geometry)Dimension(Geometry) GeometryType(Geometry)
1062675894997.8721344156 1LINESTRING
1063512075453.1247726708 1LINESTRING
273432569052.1024677445 1LINESTRING
343310946831.3235596409 1LINESTRING
1061275560165.9776174597 1LINESTRING
spatialite>
  1. the SpatiaLite NumPoints() function returns the number of vertices for a LINESTRING GEOMETRY.
  2. the GLength() function returns the geometric length [expressed in map units] for a LINESTRING GEOMETRY.
  3. the Dimension() function returns the dimensions' number for any GEOMETRY class [obviously 1 for lines].
  4. the GeometryType() function returns the class type for any kind of GEOMETRY value.
spatialite> SELECT PK_UID, NumPoints(Geometry),
           ... > AsText(StartPoint(Geometry)), AsText(EndPoint(Geometry)),
           ... > X(PointN(Geometry, 2)), Y(PointN(Geometry, 2))
           ... > FROM HighWays ORDER BY NumPoints(Geometry) DESC LIMIT 5;
PK_UIDNumPoints(Geometry) AsText(StartPoint(
Geometry))
AsText(EndPoint(
Geometry))
X(PointN(
Geometry, 2))
Y(PointN(
Geometry, 2))
10626758POINT(632090.156998 4835616.546126) POINT(663300.737479 4795631.803342)632086.0096648844835625.74875358
10635120POINT(663292.190654 4795627.307765) POINT(632085.166691 4835620.171885)663295.9924954534795626.48941986
2734325POINT(668247.593086 4862272.349444) POINT(671618.13304 4854179.734158)668232.5292849544862273.56196672
3433109POINT(671613.424233 4854121.472532) POINT(654264.259259 4855357.41189)671610.5236143034854129.55436818
10612755POINT(619601.675367 4855174.599496) POINT(668724.797158 4862015.941886)619593.7115396854855174.74398836
spatialite>
  1. the SpatiaLite StartPoint() function returns the first POINT for a LINESTRING GEOMETRY.
  2. the EndPoint() function returns the last POINT for a LINESTRING GEOMETRY.
  3. the PointN() function returns the selected vertex as a POINT; each one vertex is identified by a relative index.
    The first vertex is identified by an index value 1, the second by an index value 2 and so on.
  4. please note that you can freely nest the various SpatiaLite functions, by passing the return value of the inner function as an argument for the outer one.

spatialite> SELECT name, AsText(Geometry) FROM Regions WHERE PK_UID = 52;
nameAsText(Geometry)
EMILIA-ROMAGNAPOLYGON((761808.155309 4966649.458816,
762432.549628 4966393.94736,
764168.27812 4966137.407299,
763486.212544 4966081.379442,
762462.485356 4966252.975485,
761808.155309 4966649.458816))
spatialite>
  1. POLYGON is another GEOMETRY class.
  2. in this one case you have fetched a very simple POLYGON, having only the exterior ring [i.e. it doesn't contains any internal hole].
    Remember that POLYGONs may optionally contain an arbitrary number of internal holes, each one delimited by an interior ring.
  3. the exterior ring in itself is simply a LINESTRING [and interior rings too are LINESTRINGS].
  4. note that a POLYGON is a closed geometry, and thus the first and the last POINT for each ring are exactly identical.
spatialite> SELECT PK_UID,
           ... > Area(Geometry), AsText(Centroid(Geometry)),
           ... > Dimension(Geometry), GeometryType(Geometry)
           ... > FROM Regions ORDER BY Area(Geometry) DESC LIMIT 5;
PK_UIDArea(Geometry)AsText(Centroid(Geometry)) Dimension(Geometry)GeometryType(Geometry)
10125779695636.9131POINT(955762.441021 4173082.511675)2POLYGON
10525391974998.079POINT(414613.865386 4990738.194216)2POLYGON
10624069782849.5217POINT(560092.709693 5052167.502959)2POLYGON
7423962322618.2869POINT(502694.575482 4437256.791656)2POLYGON
5322657959720.0999POINT(672979.667565 4814145.593235)2POLYGON
spatialite>
  1. we have already meet the SpatiaLite Dimension() and GeometryType() functions; they works for POLYGONs exactly in same fashion as for any other kind of GEOMETRY.
  2. the SpatiaLite Area() function returns the geometric area [expressed in square map units] for a POLYGON GEOMETRY.
  3. the Centroid() function returns the POINT identifying the centroid for a POLYGON GEOMETRY.
spatialite> SELECT PK_UID, NumInteriorRings(Geometry),
           ... > NumPoints(ExteriorRing(Geometry)), NumPoints(InteriorRingN(Geometry, 1))
           ... > FROM regions ORDER BY NumInteriorRings(Geometry) DESC LIMIT 5;
PK_UIDNumInteriorRings(Geometry) NumPoints(ExteriorRing(Geometry))NumPoints(InteriorRingN(Geometry, 1))
60391217
5516029
106NULL
2012NULL
3020NULL
  1. the SpatiaLite ExteriorRing() functions returns the exterior ring for a given GEOMETRY.
    Any valid POLYGON must have an exterior ring.
    Remember: each one of the rings belonging to a POLYGON is a closed LINESTRING
  2. the SpatiaLite NumInteriorRings() function returns the number of interior rings belonging to a POLYGON.
    A valid POLYGON may have any number of interior rings, including zero i.e. no interior ring at all.
  3. The SpatiaLite InteriorRingN() function returns the selected interior rings as a LINESTRING; each one interior ring is identified by a relative index.
    The first interior ring is identified by an index value 1, the second by an index value 2 and so on.
  4. Any ring is a LINESTRING, so we can use the NumPoints() function in order to detect the number of related vertices.
    If we call the NumPoints() function on a NULL GEOMETRY [or on a GEOMETRY of non-LINESTRING class] we'll get a NULL result.
    This explains why the the last three rows has a NULL NumPoints() result; there is no corresponding interior ring !
spatialite> SELECT AsText(InteriorRingN(Geometry, 1)),
           ... > AsText(PointN(InteriorRingN(Geometry, 1), 4)),
           ... > X(PointN(InteriorRingN(Geometry, 1), 5)),
           ... > Y(PointN(InteriorRingN(Geometry, 1), 5))
           ... > FROM Regions WHERE PK_UID = 55;
AsText(InteriorRingN(Geometry, 1))AsText(PointN(InteriorRingN(
Geometry, 1), 4))
X(PointN(InteriorRingN(
Geometry, 1), 5))
Y(PointN(InteriorRingN(
Geometry, 1), 5))
LINESTRING(756881.706704 4850692.62625,
760361.595005 4852743.267975,
759582.880944 4855493.610807,
757549.382306 4855414.551183,
755734.189332 4856112.118807,
755020.910885 4855996.887913,
754824.031873 4854723.577451,
756021.000385 4850937.420842,
756881.706704 4850692.62625)
POINT(757549.382306 4855414.551183)755734.1893322574856112.11880693
  1. we have already met in the preceding ones the usage of nested functions.
    For POLYGONs it becomes to be a little more tedious, but still easily understandable.
  2. e.g. to obtain the last column we have used InteriorRingN() in order to get the first interior ring, and then PointN() to get the fifth vertex.
    At last we can call Y() to get the coordinate value.
  3. it's quite boring, but not difficult at all, no ?

POINT, LINESTRING and POLYGON are the elementary classes for GEOMETRY.
But GEOMETRY supports the following complex classes as well:
  1. a MULTIPOINT is a collection of two or more POINTSs belonging to the same entity.
  2. a MULTILINESTRING is a collection of two or more LINESTRINGs.
  3. a MULTIPOLYGON is a collection of two or more POLYGONs.
  4. a GEOMETRYCOLLECTION is an arbitrary collection containing any other kind of geometries.
We'll not explain in detail this kind of collections, because it will be simply too boring and dispersive.
Generally speaking they regularly extends in the expected way properties corresponding to elementary corresponding classes.
  1. the SpatiaLite NumGeometries() function returns the number of elements for a collection.
  2. the GeometryN() function returns the N-th element for a collection.
  3. the GLength() function applied to a MULTILINESTRING returns the sum of individual lengths for each LINESTRING composing the collection.
  4. the Area() function applied to a MULTIPOLYGON returns the sum of individual areas for each POLYGON composing the collection.
  5. the Centroid() function returns the average centroid when applied to a MULTYPOLYGON.

2.4. GEOMETRY ENVELOPE

Let see a fundamental property for any GEOMETRY class:
spatialite> SELECT Name, AsText(Envelope(Geometry)) FROM Regions LIMIT 5;
NameAsText(Envelope(Geometry))
VENETOPOLYGON((752912.250297 5027429.54477,
753828.826422 5027429.54477,
753828.826422 5028928.677375,
752912.250297 5028928.677375,
752912.250297 5027429.54477))
VENETOPOLYGON((751455.937063 5026778.301798,
752928.785333 5026778.301798,
752928.785333 5029157.835014,
751455.937063 5029157.835014,
751455.937063 5026778.301798))
VENETOPOLYGON((759461.944608 5026112.935302,
766247.780711 5026112.935302,
766247.780711 5036802.775999,
759461.944608 5036802.775999,
759461.944608 5026112.935302))
VENETOPOLYGON((750183.233817 5025314.495578,
750842.445895 5025314.495578,
750842.445895 5026793.815968,
750183.233817 5026793.815968,
750183.233817 5025314.495578))
LIGURIAPOLYGON((378359.802362 4848919.510676,
584633.642747 4848919.510676,
584633.642747 4947990.084988,
378359.802362 4947990.084988,
378359.802362 4848919.510676))
spatialite>
  1. the SpatiaLite Envelope() function always returns a POLYGON that is the Minimum Bounding Rectangle - MBR for the given GEOMETRY.
    Because an MBR is a rectangle, it always has 5 POINTs [remember: in closed geometries the last POINT must be identical to the first one].
  2. individual POINTs are as follow:
  3. MBRs are of peculiar interest, because by using them you can evaluate spatial relationships between two geometries in a simplified and roughly approximative way. But MBRs comparisons are very fast to compute, so they are very useful and widely used to speed up data processing.
  4. MBRs are also widely referenced as bounding boxes, or BBOX as well.
All right, if you've followed and understood all the above examples, now you have acquired a basic familiarity with the fundamental core of Spatial data processing. We can now pass to examine some more advanced exercise.

3. Managing a GIS databases

Very simply speaking, a Geographic Information System - GIS is the one that supports geographic [or cartographic] informations; as a rule of thumb Spatial data [and Spatial-enabled DBMSs] exists mainly because they are the key component in GIS applications.
This section examines how you can manage a real GIS database, i.e.:
  1. how can you import any existing GIS dataset in a new SpatiaLite database.
  2. how can you insert, modify or delete the GIS datasets once they are stored inside the database you have just created.
  3. how can you export GIS datasets out of the SpatiaLite database.
Situations in which GIS datasets travels back and forward from a workstation and another, even at different locations and using completely different systems and application software, are widespread.
Good new #1: by using SQLite as your preferred GIS DBMS, this task becomes incredibly simple:
  1. an SQLite database is just an ordinary file; you can freely copy it, or even upload and download it on the WEB, send it as an e-mail attachment, zip and unzip it, and so on.
  2. SQLite's database are cross-platform, so you can produce them on a Windows PC and then use them on a Linux server, without any problem.
    Of course, SpatiaLite too guarantees that Spatial data didn't corrupts when going cross-platform.
Good new #2: SpatiaLite supports the almost universal ESRI shapefile format. So you can easily obtain some GIS dataset from an application like ArcGis, manipulate it using SpatiaLite as needed, and then transfer it back again. Shapefiles too can safely be used in a cross-platform fashion.

3.1 Exporting GIS data outside a SpatiaLite db

You have a SpatiaLite database on your hands, the test-2.3.sqlite one.
To export all tables as shapefiles you simply have to type:

spatialite> .dumpshp Towns Geometry shape_towns CP1252 POINT
========
Dumping SQLite table 'Towns' into shapefile at 'shape_towns'

SELECT * FROM Towns WHERE GeometryType(Geometry) = 'POINT';

Exported 8101 rows into shapefile
========
spatialite> .dumpshp HighWays Geometry shape_highways CP1252 LINESTRING
========
Dumping SQLite table 'HighWays' into shapefile at 'shape_highways'

SELECT * FROM HighWays WHERE GeometryType(Geometry) = 'LINESTRING' OR GeometryType(Geometry) = 'MULTILINESTRING';

Exported 775 rows into shapefile
========
spatialite> .dumpshp Regions Geometry shape_regions CP1252 POLYGON
========
Dumping SQLite table 'Regions' into shapefile at 'shape_regions'

SELECT * FROM Regions WHERE GeometryType(Geometry) = 'POLYGON' OR GeometryType(Geometry) = 'MULTIPOLYGON';

Exported 109 rows into shapefile
========
spatialite>
  1. the SpatiaLite .dumpshp macro command exports a whole table as a shapefile.
  2. SpatiaLite handles macro command arguments as they where shell arguments:

If you now perform a directory listing [e.g. executing the DIR system command], you'll notice something like this:

21/06/200813.197.816shape_highways.dbf
21/06/200813.195.027.164shape_highways.shp
21/06/200813.196.300shape_highways.shx
21/06/200813.194.240shape_regions.dbf
21/06/200813.19318.988shape_regions.shp
21/06/200813.19972shape_regions.shx
21/06/200813.19875.134shape_towns.dbf
21/06/200813.19226.928shape_towns.shp
21/06/200813.1964.908shape_towns.shx

Those files are the shapefiles you've right now exported.
All right, you can launch your preferred GIS viewer, e.g. QGis, and explore them.

view #1 shapefiles

view #2 shapefiles

At last you can view a real geographic map, instead of quite obscure WKB or WKT encoded Spatial data !

3.2 Creating a new SpatiaLite db and populating it

Now you surely have some shapefiles, at least the ones you've just exported.
So in this example we are going to create from scratch a brand-new, empty database and then feeding it with our shapefiles.
Creating a new SQLite database is really simple; you just have to start a new SQLite session:

spatialite my_new_db.sqlite

When SQLite begins a new session, if the db you requested didn't exists, it will create a new one.
SpatiaLite version ..: 2.2 Supported Extensions:
    - 'VirtualShape' [direct Shapefile access]
    - 'VirtualText [direct CSV/TXT access]
    - 'RTree' [Spatial Index - R*Tree]
    - 'MbrCache' [Spatial Index - MBR cache]
    - 'SpatiaLite' [Spatial SQL - OGC]
PROJ.4 version ......: Rel. 4.6.1, 21 August 2008
GEOS version ........: 3.0.0-CAPI-1.4.1
SQLite version ......: 3.6.2
Enter ".help" for instructions
spatialite>


the SQLite session begins, and is currently connected to the my_new_db.sqlite database;
now type the following commands to set the various options for output formatting:

spatialite> .nullvalue NULL
spatialite> .headers on
spatialite> .mode column
spatialite>

That's always the same standard stuff, you already know ... Now try:

spatialite> .tables
spatialite>


The .tables directive causes SQLite to list all the tables contained in the current database.
As you can see, this one is a brand new database, and doesn't contains yet any table.
You can check too (using a different windows), that after executing .tables a file named my_new_db.sqlite has just been created in your current directory.
Now you can try to populate this database importing the previously generated shapefiles:

spatialite> .loadshp shape_regions NewRegions CP1252
========
Loading shapefile at 'shape_regions' into SQLite table 'NewRegions'

BEGIN;
CREATE TABLE NewRegions (
PK_UID INTEGER PRIMARY KEY AUTOINCREMENT,
PK_UID_X0 INTEGER NOT NULL,
Name TEXT NOT NULL,
Geometry BLOB NOT NULL);
INSERT INTO NewRegions (
PK_UID,PK_UID_X0,Name,Geometry)
VALUES (1,1,'VENETO',
GeomFromWkb(X'01030000000100000006000000ECE8268020FA2641139C...', -1));
...
COMMIT;

Inserted 109 rows into 'NewRegions' from SHAPEFILE
========
spatialite> .loadshp shape_highways NewHighWays CP1252
========
Loading shapefile at 'shape_highways' into SQLite table 'NewHighWays'

BEGIN;
CREATE TABLE NewHighWays (
PK_UID INTEGER PRIMARY KEY AUTOINCREMENT,
PK_UID_X0 INTEGER NOT NULL,
Geometry BLOB NOT NULL);
INSERT INTO NewHighWays (
PK_UID,PK_UID_X0,Geometry)
VALUES (1,1,
GeomFromWkb(X'01020000006A00000018AEEA69F9662441BD4DE2F66688...', -1));
...
COMMIT;

Inserted 775 rows into 'NewHighWays' from SHAPEFILE
========
spatialite>
  1. the SpatiaLite .loadshp macro command imports a whole shapefile creating a new database table.
  2. SpatiaLite handles macro command arguments as they where shell arguments:
Let us quickly pinpoint some interesting points: We can now perform some operation in order to check what really happened:

spatialite> .tables
NewHighWaysNewRegions
spatialite>

The .tables directive causes SQLite to list all the tables contained in the current database;
all right, the NewHighWays and NewRegions tables have been created.

spatialite> PRAGMA table_info(NewRegions);
cidnametypenotnulldflt_valuepk
0PK_UIDINTEGER0NULL1
1PK_UID_X0INTEGER990
3NameTEXT99NULL0
4GeometryBLOB99NULL0
spatialite>

The PRAGMA table_info(table_name) directive causes SQLite to list all columns belonging to the requested table.
  1. please note that original column names can be subject to truncation.
    Unfortunately, shapefile format didn't allow column names to be longer than 10 character.
  2. there is now a PK_UID_X0 column name; this is to disambiguate a potential duplicate column name
spatialite> SELECT count(*), GeometryType(Geometry)
           ... > FROM NewHighWays GROUP BY GeometryType(Geometry);
count(*)GeometryType(Geometry)
775LINESTRING
spatialite> SELECT count(*), GeometryType(Geometry)
           ... > FROM NewRegions GROUP BY GeometryType(Geometry);
count(*)GeometryType(Geometry)
109POLYGON
spatialite>

Anything seem to be as expected. We can pass over.
Now we have to import our latest shapefile, the shape_towns one.
This time we'll apply some little change, just to examine a different way to import shapefiles.

spatialite> .read init_spatialite-2.3.sql ASCII
InitSpatialMetaData()
1
spatialite> .tables
NewHighWaysNewRegionsgeometry_columnsspatial_ref_sys
spatialite> .loadshp shape_towns NewTowns CP1252 32632 geom
========
Loading shapefile at 'shape_highways' into SQLite table 'NewHighWays'

BEGIN;
CREATE TABLE NewTowns (
PK_UID INTEGER PRIMARY KEY AUTOINCREMENT,
PK_UID_X0 INTEGER NOT NULL,
Name TEXT NOT NULL,
Peoples INTEGER NOT NULL,
LocalCounc INTEGER NOT NULL,
County INTEGER NOT NULL,
Region INTEGER NOT NULL);
SELECT AddGeometryColumn('NewTowns', 'geom', 32632, 'POINT', 2);
INSERT INTO NewTowns (
PK_UID,PK_UID_X0,Name,Peoples,LocalCounc,County,Region,geom)
VALUES (1,1,'Brozolo',435,1,0,0,
GeomFromWkb(X'010100000048E17A14EB0F1A4152B81E55420F5341', 32632));
...
COMMIT;

Inserted 8101 rows into 'NewTowns' from SHAPEFILE
========
spatialite> .tables
NewHighWaysNewTownsgeometry_columns
NewRegionsgeom_cols_ref_sysspatial_ref_sys
spatialite>

  1. the .read macro command executes an SQL script.
    The init_spatialite-2.3.sql script is a standard one [you can download it] and it's used to initialize the so called Spatial MetaData
    As you can notice, it invokes the InitSpatialMetaData() SpatiaLite's SQL function
    You have to specify the charset_name used to encode the SQL Script text; this time we can use the ASCII one.
  2. immediately after executing the init_spatialite-2.3.sql script you can check that two new tables have been created, i.e. spatial_ref_sys and geometry_columns.
    A geom_cols_ref_sys table-like object has been created; this actually is not a real table; we'll see very soon what it really is
  3. as you can notice too, this time we used the 5 arguments form for .loadshp, specifying a SRID and a geometry column name as well.
  4. consequently, the .loadshp macro function has used a separate AddGeometryColumn() statement in order to create the geometry column in way that is consistent with Spatial MetaData handling.
spatialite> SELECT * FROM geometry_columns;
f_table_namef_geometry_columntypecoord_dimensionsridspatial_index_enabled
NewTownsgeomPOINT2326320
spatialite> SELECT * FROM spatial_ref_sys WHERE Srid = 32632;
sridauth_nameauth_sridref_sys_nameproj4text
32632epsg32632WGS 84 / UTM zone 32N +proj=utm +zone=32 +ellps=WGS84 +datum=WGS84 +units=m +no_defs
spatialite>
  1. as you can check, now the Spatial MetaData correctly report that there is a Spatial column in the table NewTowns
  2. and that this column has to contain Point geometries within the 32632 SRID
  3. you can check too that the 32632 SRID identifies the UTM zone 32N, and has lots of geodetic parameters fully qualifying it.
spatialite> SELECT * FROM geom_cols_ref_sys;
f_table_namef_geometry_columntypecoord_dimensionsrid auth_nameauth_sridref_sys_nameproj4text
NewTownsgeomPOINT232632epsg32632WGS 84 / UTM zone 32N +proj=utm +zone=32 +ellps=WGS84 +datum=WGS84 +units=m +no_defs
spatialite> SELECT * FROM sqlite_master
           ...> WHERE name = 'geom_cols_ref_sys';

typenametbl_namerootpagesql
viewgeom_cols_ref_sysgeom_cols_ref_sys0 CREATE VIEW geom_cols_ref_sys AS
SELECT f_table_name, f_geometry_column, type,
coord_dimension, spatial_ref_sys.srid AS srid,
auth_name, auth_srid, ref_sys_name, proj4text
FROM geometry_columns, spatial_ref_sys
WHERE geometry_columns.srid = spatial_ref_sys.srid
spatialite>
  1. the geom_cols_ref_sys seems to be a table, but really is a view
  2. a view is actually a kind of virtual table, and is internally defined as an SQL query; but you can directly query it as if it was a real table
  3. in the SQLite implementation, you can apply SELECT on a view, but INSERT, UPDATE or DELETE are forbidden
  4. defining views is a very useful way to simplify database queries, as in this example
spatialite> SELECT * FROM sqlite_master
           ...> WHERE type = 'trigger' AND tbl_name = 'NewTowns';

typenametbl_namerootpagesql
triggergti_NewTowns_geomNewTowns0 CREATE TRIGGER gti_NewTowns_geom BEFORE INSERT ON NewTowns
FOR EACH ROW BEGIN
    SELECT
    RAISE(ROLLBACK, '''NewTowns.geom'' violates Geometry constraint [geom-type not allowed]') WHERE (
        SELECT type FROM geometry_columns WHERE
            f_table_name = 'NewTowns' AND f_geometry_column = 'geom' AND (type = GeometryType(NEW.geom)
                OR type = GeometryAliasType(NEW.geom))) IS NULL;
END
triggergsi_NewTowns_geomNewTowns0 CREATE TRIGGER gsi_NewTowns_geom BEFORE INSERT ON NewTowns
FOR EACH ROW BEGIN
    SELECT
    RAISE(ROLLBACK, '''NewTowns.geom'' violates Geometry constraint [SRID not allowed]') WHERE (
        SELECT srid FROM geometry_columns WHERE
            f_table_name = 'NewTowns' AND f_geometry_column = 'geom' AND srid = SRID(NEW.geom)) IS NULL;
END
triggergtu_NewTowns_geomNewTowns0 CREATE TRIGGER gtu_NewTowns_geom BEFORE UPDATE ON NewTowns
FOR EACH ROW BEGIN
    SELECT
    RAISE(ROLLBACK, '''NewTowns.geom'' violates Geometry constraint [geom-type not allowed]') WHERE (
        SELECT type FROM geometry_columns WHERE
            f_table_name = 'NewTowns' AND f_geometry_column = 'geom' AND (type = GeometryType(NEW.geom)
                OR type = GeometryAliasType(NEW.geom))) IS NULL;
END
triggergsu_NewTowns_geomNewTowns0 CREATE TRIGGER gsu_NewTowns_geom BEFORE UPDATE ON NewTowns
FOR EACH ROW BEGIN
    SELECT
    RAISE(ROLLBACK, '''NewTowns.geom'' violates Geometry constraint [SRID not allowed]') WHERE (
        SELECT srid FROM geometry_columns WHERE
            f_table_name = 'NewTowns' AND f_geometry_column = 'geom' AND srid = SRID(NEW.geom)) IS NULL;
END
spatialite>
  1. SQLite supports triggers; so SpatiaLite, while creating a new geometry column via AddGeometryColumn(), defines 4 triggers in oder to ensure:
It's really a pity we didn't know this useful stuff before (i.e. Spatial MetaData).
We've already loaded two tables [NewRegions and NewHighWays] without associating them to Spatial MetaData ...
It's never to late; we can recover just now.

spatialite> UPDATE NewRegions SET Geometry = SetSrid(Geometry, 32632);
spatialite> SELECT RecoverGeometryColumn('NewRegions', 'Geometry',
           ...> 32632, 'MULTIPOLYGON', 2);

RecoverGeometryColumn('NewRegions', 'Geometry',32632, 'MULTIPOLYGON', 2)
1
spatialite>
  1. first we have to set an explicit SRID value for any geometry within the NewRegions table
    Otherwise any attempt to associate Spatial MetaData will fail
  2. and then we can try to use the RecoverGeometryColumn() function in order to associate the required column with Spatial MetaData:
spatialite> SELECT * FROM geometry_columns;
f_table_namef_geometry_columntypecoord_dimensionsridspatial_index_enabled
NewTownsgeomPOINT2326320
NewRegionsGeometryMULTIPOLYGON2326320
spatialite> SELECT name, tbl_name FROM sqlite_master
           ...> WHERE type = 'trigger';

nametbl_name
fkd_refsys_geocolsspatial_ref_sys
fki_geocols_refsysgeometry_columns
fku_geocols_refsysgeometry_columns
gti_NewTowns_geomNewTowns
gsi_NewTowns_geomNewTowns
gtu_NewTowns_geomNewTowns
gsu_NewTowns_geomNewTowns
gti_NewRegions_GeometryNewRegions
gsi_NewRegions_GeometryNewRegions
gtu_NewRegions_GeometryNewRegions
gsu_NewRegions_GeometryNewRegions
spatialite>

3.3 Inserting, updating and deleting rows in a SpatiaLite db

Now you have two databases, the original test-2.3.sqlite and the brand-new my_new_db.sqlite; so you can safely use one of them to continue testing, and keep the other one as a backup copy.
Let you start with ordinary SQL commands:

spatialite> DELETE FROM NewTowns WHERE peoples < 100000;
spatialite> SELECT count(*) FROM NewTowns;
count(*)
42
spatialite>

Best wishes ... you've just unrecoverably lost some 8.000 rows.
If you repent yourself, is too late; they are gone for ever.
A wisest way to do this is:

spatialite> BEGIN;
spatialite> DELETE FROM NewTowns WHERE peoples < 100000;
spatialite> SELECT count(*) FROM NewTowns;
count(*)
42
spatialite> ROLLBACK;
spatialite> SELECT count(*) FROM NewTowns;
count(*)
8101
spatialite>
  1. the BEGIN SQL statement tells SQLite that you intend to start a transaction.
    Any operation that you perform within a transaction can be cancelled at any subsequent time, and the db will come back to its original unchanged state.
  2. the ROLLBACK SQL statement tells SQLite exactly this.
Otherwise, if you want to really confirm what you have already done:

spatialite> BEGIN;
spatialite> DELETE FROM NewTowns WHERE peoples < 100000;
spatialite> SELECT count(*) FROM NewTowns;
count(*)
42
spatialite> COMMIT;
spatialite> SELECT count(*) FROM NewTowns;
count(*)
42
spatialite>
  1. the COMMIT SQL statement definitively confirms all operations still pending.
  2. both ROLLBACK and COMMIT close the current transaction; if you want to continue working in transactional way, you should start a new transaction invoking BEGIN once again.
  3. if you simply omit to use at all the BEGIN, COMMIT or ROLLBACK statements, SQLite will assume you wish to operate in the so-called autocommit mode, i.e. each single statement is implicitly assumed to be contained within an automatic transaction.
    When you are working in autocommit mode typing a simple DELETE ... statement is understood to mean: BEGIN; DELETE ...; COMMIT;
  4. SQLite is a transactional database engine; working in autocommit mode easily can be a major cause for performance degradation.
  5. so, you are strongly encouraged to use transactions as appropriate in a sensible way.

Now you will create a new table, and then insert some rows in it:

spatialite> BEGIN;
spatialite> CREATE TABLE MyTable (
           ... > name TEXT NOT NULL,
           ... > geom BLOB NOT NULL);
spatialite> INSERT INTO MyTable (name, geom) VALUES
           ... > ('one', GeomFromText('POINT(1 1)'));
spatialite> INSERT INTO MyTable (name, geom) VALUES
           ... > ('two', GeomFromText('POINT(2 2)'));
spatialite> INSERT INTO MyTable (name, geom) VALUES
           ... > ('three', GeomFromText('POINT(3 3)'));
spatialite> SELECT name, AsText(geom) FROM MyTable;
name AsText(geom)
one POINT(1 1)
two POINT(2 2)
three POINT(3 3)
spatialite> .quit
  1. the CREATE TABLE allows you to create a new table, specifying the columns it contains too.
  2. you know well that a Spatial column is a BLOB one
  3. the INSERT INTO allows you to store new rows in the table.
  4. using the GeomFromText() you can create new GEOMETRY values at your will.
  5. you have also performed a final check, using SELECT.
  6. wow !!! you have just created a table and inserted Spatial data in it, so you can happily exit the SQLite session now.
After while, when you start a new SQLite session, unfortunately you discover that:

spatialite> SELECT name, AsText(geom) FROM MyTable;
SQL error: no such table: MyTable
spatialite>


Gosh, what's happening ? Where is your newly created table? It seems to be vanished ...
And actually it has gone; read again and carefully the last session:
  1. you started a transaction by invoking BEGIN
  2. then you have created your table
  3. then you have inserted some rows
  4. ... and after all you have suddenly stopped your session omitting to confirm all pending operation.
    You've forgotten to invoke COMMIT.
    So all the job you had done has gone forever.
  5. this one is the dark side of transactions ... never forget to COMMIT a pending transaction, or your work will simply disappear.
    I hope this catastrophic example may help you to avoid transaction disasters.

Updating rows is as simple as inserting or deleting them:

spatialite> SELECT pk_uid, name, peoples, AsText(geom)
           ... > FROM NewTowns WHERE pk_uid = 8006;
PK_UIDNamePeoplesAsText(geom)
8006Monza120204.0POINT(521332.99 5047818.45)
spatialite> UPDATE NewTowns SET peoples = 150000,
           ... > name = 'MONZA',
           ... > geom = GeomFromText('POINT(10 10)', 32632)
           ... > WHERE pk_uid = 8006;
spatialite> SELECT pk_uid, name, peoples, AsText(geom)
           ... > FROM NewTowns WHERE pk_uid = 8006;
PK_UIDNamePeoplesAsText(geom)
8006MONZA150000.0POINT(10 10)
spatialite>
  1. the UPDATE SQL statement lets you modify any column value.
    You simply have to SET column names and new values that will replace the current ones.
  2. you can use as usual the GeomFromText() function to obtain GEOMETRY values.
  3. the PK_UID column is a special one, being the PRIMARY KEY for this table.
    In every table a PRIMARY KEY column assures you that one and only one row may contains a selected value, thus assuring univocity.
  4. if one of your tables, for any reason, has no PRIMARY KEY, don't worry; every way SQLite manages automatically an useful unique identifier for each row in a table, whom name is ROWID; you can reference it SQL expressions exactly as it was an ordinary column name.
  5. note also that in this example you have apparently not used any transaction [you have used neither BEGIN nor COMMIT]; we have already explained this specific point, but it's interesting to underline this a second time.
    SQLite acts always as a transactional DBMS; what really happens behind the scenes in cases like this, is that you are implicitly using auto-commit mode i.e.:

The following examples requires the original my_new_db.sqlite database.
Thus it's better if you rebuilt it now, to avoid any misalignment due to DELETEs and UPDATEs we have just performed.

The SQLite's SQL syntax offers you a straightforward way to create a new table and simultaneously populate it with selected data coming out from an another table:

spatialite> BEGIN;
spatialite> CREATE TABLE Villages AS
           ... > SELECT * FROM NewTowns WHERE peoples < 500;
spatialite> COMMIT;
spatialite> SELECT count(*) FROM Villages;
count(*)
845
spatialite>
  1. using CREATE TABLE .... AS SELECT ... SQLite clones automatically a new table [exactly identical to the original one; same column names etc].
  2. and then all rows coming out from the SELECT will be written in this new table.
SQLite support also another different SQL construct that allows you to create a new table and then populate it with data coming out from an existing table:

spatialite> BEGIN;
spatialite> CREATE TABLE Metropolis (
           ... > Name TEXT NOT NULL,
           ... > Population INTEGER NOT NULL,
           ... > Geometry BLOB NOT NULL);
spatialite> INSERT INTO Metropolis (Name, Population, Geometry)
           ... > SELECT name, peoples, geom FROM NewTowns
           ... > WHERE peoples > 1000000;
spatialite> COMMIT;
spatialite> SELECT name, population, AsText(geometry) FROM Metropolis;
NamePopulationAsText(geometry)
Roma2546804POINT(788703.57 4645636.3)
Milano1256211POINT(514820.49 5034534.56)
Napoli1004500POINT(942636.1 4535272.55)
sqlite>
  1. first you have create the new table, freely choosing columns, their names etc.
  2. then you call INSERT INTO ... SELECT that performs an automatic data transfer [either selected or complete] between the two tables.
Both these SQL constructs are very useful in a GIS-oriented environment; very often you can need to extract a small subset of your data [e.g. only the ones related to a single County, or Town], edit them in a convenient way, perform some very specifically oriented Spatial analysis, produce some specialized output, dump a selected shapefile and so on.
Under these circumstances, creating a set of derived tables can largely help.

Once you don't need no more to keep those short lived tables, you can eliminate them:

spatialite> DROP TABLE Villages;
spatialite> DROP TABLE Metropolis;
spatialite> VACUUM;
spatialite>
  1. a DROP statement will completely eliminate a table and all data it contains.
  2. never forget to perform a VACUUM after dropping tables, in order to really free unused space, compact the database and so on.

Before leaving this section, it's better to underline a few basilar concepts about SQLite's very peculiar way to manage column types. Consider the following example:

spatialite> CREATE TABLE some_table (
           ... > N1 SMALLINT,
           ... > N2 INTEGER NOT NULL,
           ... > N3 DOUBLE,
           ... > STR VARCHAR(4) NOT NULL);
spatialite> INSERT INTO some_table VALUES (10, 11, 111.1111, 'first');
spatialite> INSERT INTO some_table VALUES (NULL, 12, NULL, 'second');
spatialite> INSERT INTO some_table VALUES (30, NULL, 222.2222, NULL);
SQL error: some_table.N2 may not be NULL


all right since now; any other DBMS you have ever used will perform these operations exactly in this way, no?

spatialite> INSERT INTO some_table VALUES ('aaaa', 'bbbb', 'cccc', 1234);
spatialite> INSERT INTO some_table VALUES ('A', 'B', 'C', 1234.6789);


Oops, this one is an unexpected one; you have just inserted without any complaint string values in numeric columns, and numeric values in a CHAR column ....
it's very puzzling, no ?

spatialite> SELECT * FROM some_table;
N1N2N3Str
1011111.1111first
NULL12NULLsecond
aaaabbbbcccc1234
ABC1234.6789
spatialite>
  1. SQLite does not enforce type checking for column values
  2. you can place any kind of data you want in each column, without restrictions; no checking is performed in order to assure conformance with column type declared in CREATE TABLE
  3. you may either love or hate this; it's not at all a bug, it's an explicit design feature of SQLite.
Anyway, if you really absolutely need to implement an effective type-value checking for some column, SQLite can help you to do such a thing:

spatialite> CREATE TABLE checked_table (
           ... > num SMALLINT CHECK (num BETWEEN -32768 AND 32767),
           ... > str VARCHAR(4) CHECK (
           ... > length(str) <= 4 AND str BETWEEN ' ' AND 'zzzz'));
spatialite> INSERT INTO checked_table VALUES (1, 'a');
spatialite> INSERT INTO checked_table VALUES (1111, 'aaaa');
spatialite> INSERT INTO checked_table VALUES (1234, 'abcd');
spatialite> INSERT INTO checked_table VALUES ('a', 'abcd');
SQL error: constraint failed


it seems to work just fine, isn't it ?

spatialite> INSERT INTO checked_table VALUES (5678, 1);
spatialite> INSERT INTO checked_table VALUES ('9999', 1234);


oh no, it doesn't seem to works anymore ...

spatialite> INSERT INTO checked_table VALUES ('35000', 1);
SQL error: constraint failed
spatialite> INSERT INTO checked_table VALUES (2, 12345);
SQL error: constraint failed

spatialite>

it really works, after all.
  1. the CHECK clause allows you to define column-based constraints, thus enforcing type conformity.
  2. SQLite enforces CHECKing before inserting or updating column values.
  3. an implicit type conversion may be performed as well when appropriate.

4. Performing some Spatial data analysis

Since now you have acquired the most basic knowledge about SpatiaLite and SQLite.
Now you should be able to manage a spatially enabled database, populate it, add new tables, inserting and updating Spatial data and so on.
Time has come to finally examine the hard core of Spatial data processing, i.e. how you can evaluate relations between geometries.
The ordinary way standard SQL works is by evaluating relations existing between entities, e.g. when you perform a JOIN that imposes some clause like:

... WHERE t1.sex = t2.sex AND t1.name = t2.name ...

Spatial analysis do exactly the same, but evaluates spatial relationships that may join [or disjoin] geometries using some spatial criterion.

4.1 Evaluating MBRs relationships

Every GEOMETRY has its own Minimum Bounding Rectangle - MBR; see the Envelope() function, if you don't remember what an MBR is.
mbrs relationships
These ones are the spatial relations that can exists between a couple of MBRs.
  1. Evaluating spatial relationship existing between two geometries by comparing their MBRs relations is a very imperfect and approximative way to check real spatial relations.
  2. Anyway, MBRs comparisons are the coarsest but fastest way to compare spatial relations.
    Evaluating real relation existing between two arbitrary complex geometries may be an heavily time consuming task. Vice-versa an MBRs comparison is very quick to evaluate.
  3. So evaluating MRBs is at least the best strategy to restrict the field.
    Once you have eliminated the [many] incompatible couples, then you can pass to check better only the [few] couples that may be of real interest.
mbrs relationships (2)
  1. if two MBRs are disjointed, surely related geometries are disjointed as well.
  2. but if two MBRs overlaps or intersects, you cannot be sure at all that related geometries overlaps or intersects as well; in this circumstance you need to check better by evaluating real Spatial relations.

A very common problem in GIS applications is the one to quickly and efficiently fetch lots of entities belonging to some specific frame; e.g. think of drawing a rectangular map.
You'll need to fetch all the few GIS entities within the frame, discarding the many ones outside the frame [that will be invisibles at all].

spatialite> SELECT count(*) FROM NewTowns;
count(*)
8101

this one is the unfiltered dataset for NewTowns [remember: for this table GEOMETRY is a POINT class one]

spatialite> SELECT count(*) FROM NewTowns WHERE MBRContains(
           ... > GeomFromText('POLYGON((554000 4692000, 770000 4692000,
           ... > 770000 4925000, 554000 4925000, 554000 4692000))'), geom);
count(*)
480
spatialite>
  1. the SpatiaLite MBRContains() function allows you to restrict the search field.
  2. the GeomFromText('POLYGON(...)') build the MBR corresponding to current frame
    [i.e. screen, some raster image, paper sheet on a printer etc]
spatialite> SELECT count(*) FROM NewTowns WHERE MBRContains(
           ... > BuildMBR(554000, 4692000, 770000, 4925000), geom);
count(*)
480
spatialite>
  1. defining an MBR via GeomFromText('POLYGON(...)') is quite complex, and unpractical.
  2. so SpatiaLite implements an alternative way: you can use the BuildMBR( X1 , Y1 , X2 , Y2 ) function in order to define an MBR.
  3. another similar function is BuildCircleMBR( X , Y , radius ).

spatialite> SELECT count(*) FROM NewTowns WHERE MBRContains(
           ... > BuildMBR(654000, 4692000, 770000, 4924000), geom);
count(*)
295
spatialite> SELECT count(*) FROM NewTowns WHERE MBRContains(
           ... > BuildMBR(754000, 4692000, 770000, 4924000), geom);
count(*)
47
spatialite>
  1. you can easily use subsequent call to MBRContains() with different frames [MBRs]in order to obtain a zoom in sequence.
  2. simply you have to use each time a wider or narrower MBR, as needed.
spatialite> SELECT count(*) FROM NewTowns WHERE MBRWithin(
           ... > geom, BuildMBR(754000, 4692000, 770000, 4924000));
count(*)
47
spatialite>
  1. the MBRWithin() function is the same as MBRContains(), but the order of MBRs is inverted.
  2. you can freely use the one of them you find more convenient; result is the same anyway.
spatialite> SELECT count(*) FROM NewHighWays WHERE MBRIntersects(
           ... > BuildMBR(754000, 4692000, 770000, 4924000), geom);
count(*)
15
spatialite>
  1. do you remember? the HighWay table has a GEOMETRY class of type LINESTRING / MULTILINESTRING
  2. when accessing LINESTRINGs or POLYGONs [or their MULTI-collections] the SpatiaLite MBRIntersects() function allow you to select entities within a given frame.
  3. for MBRIntersects() the order by which you refer geometries hasn't any influence.

4.2 Evaluating relationships between geometries

As we have just seen in the preceding paragraph, evaluating spatial relationships via MBRs comparison is simply a quick trick, not at all a completely reliable and exhaustive method.
In a general way, a couple of arbitrary geometries can establish one (or more) of the following relationships: Some practical (illustrated) example may help you to understand better:
Equal(g1, g2) equal example
Disjoint(g1, g2) disjoint example
Touches(g1, g2) touches example
Within(g1, g2) within example
Overlaps(g1, g2) overlaps example
Crosses(g1, g2) crosses example
Contains(g1, g2) contains example

SpatiaLite wraps corresponding GEOS library functions in order to evaluate spatial relations intercurring between geometries.
The following SQL functions are supported by SpatiaLite:
  1. all these functions may return one of the following values:
  2. the Relate() function is very general one, and requires a patternMatrix defining what kind of spatial relation has to be checked:
spatialite> SELECT NewRegions.Name, COUNT(*) FROM NewTowns, NewRegions
           ... > WHERE NewRegions.Name IN (
           ... > 'VALLE D''AOSTA', 'PIEMONTE', 'UMBRIA', 'LOMBARDIA',
           ... > 'CALABRIA', 'MOLISE', 'MARCHE', 'BASILICATA') AND
           ... > Within(NewTowns.geom, NewRegions.Geometry)
           ... > GROUP BY NewRegions.Name;
NameCOUNT(*)
BASILICATA133
CALABRIA409
LOMBARDIA1548
MARCHE248
MOLISE136
PIEMONTE1201
UMBRIA91
VALLE D'AOSTA74
spatialite>
  1. this query counts how many towns fall inside a Regional boundary
  2. we want to scan only a limited list of Regions, and consequently we use the IN (..) clause
  3. the Within() function will then test the spatial relation existing between each town and regional boundary
  4. A little curiosity: as you can notice, there is a strange notation inside the IN() clause:
    the string 'VALLE D''AOSTA' really needs four apices [it's intended, it's not a typo]
    this is because there is an apostrophe within this noun, and the SQL syntax pretends to mask apostrophes marking them as a double apex
  5. as you can easily notice, this one is a slow running query.
    Reasons explaining this behaviour are very easy to identify:We can take full profit from an opportune usage of Spatial Index in order to optimize this query
    We'll better explain this step further away, when analyzing the Spatial Index stuff [if you now feel a morbose curiosity about this, you can jump immediately here]

spatialite> SELECT t2.Name, t2.Peoples,
           ... > Distance(t1.geom, t2.geom) AS Distance
           ... > FROM NewTowns AS t1, NewTowns AS t2
           ... > WHERE t1.Name = 'Firenze' AND
           ... > Distance(t1.geom, t2.geom) < 10000;
NamePeoplesDistance
Bagno a Ripoli252325794.81919712765
Impruneta146379196.22864616252
Fiesole140855001.54448646674
Scandicci501368106.97263099484
Sesto Fiorentino460548674.74855857514
Firenze3561180.0
spatialite>
  1. the SpatiaLite Distance() functions measures the minimum distance intercurring between two geometries
    this one too is an useful kind of spatial relation.
  2. this query identifies any town laying in the neighbourhood of the Florence town
    we've assumed a 10Km range to do this
  3. Please note: if the two geometries aren't disjoined, their relative distance is exactly 0

4.3 Boolean operations on geometries

We can apply various boolean operations on two geometries in order to obtain a third geometry. Some practical (illustrated) example may help you to understand better:
g3 = Intersection(g1, g2) intersection example
g3 = Difference(g1, g2) difference example
g3 = GUnion(g1, g2) union example
g3 = SymDifference(g1, g2) symdifferece example

SpatiaLite wraps corresponding GEOS library functions in order to perform boolean operations on geometries.
The following SQL functions are supported by SpatiaLite:
  1. all these functions return a GEOMETRY representing the result of the boolean operation
    please note that the returned GEOMETRY may be an empty one
  2. the OpenGis standard defines the Union() function; but in the SQLite's own syntax Union is a reserved keyword, so SpatiaLite renames this function as GUnion()

4.4 Further operations on geometries

We can use other useful operations on a single geometry in order to obtain a second one. Some practical (illustrated) example may help you to understand better:
g2 = ConvexHull(g1) convex hull example the convex hull may be easily visualized by imagining an elastic band stretched open to encompass the given object;
when released, it will assume the shape of the required convex hull.
g2 = Buffer(g1, radius) buffer example A GIS buffer operation creates a zone of a specified width around a POINT or a LINESTRING or a POLYGON.
it is also referred to as a zone of specified distance around any generic geometry (this late being of complex type as well, such as MULTIPOINT, MULTILINESTRING and so on).

this example shows how you can obtain different buffers repeatedly applying an increasing radius on the same geometries.
g2 = Simplify(g1, tolerance) a Douglas-Peuker simplify example the aim of Douglas Peuker algorithm is to simplify geometries.
this algorithm tries to preserve directional trends in a line using a tolerance factor which may be varied according to the amount of simplification required.
SpatiaLite wraps corresponding GEOS library functions in order to perform these kind of operations on geometries.
The following SQL functions are supported by SpatiaLite:
  1. all these functions return a GEOMETRY representing the result of the requested operation
    please note that the returned GEOMETRY may be an empty one
  2. the OpenGis standard defines the ConvexHull() and Buffer() functions;
    the Simplify() and SimplifyPreserveTopology() aren't defined by OpenGis, but are commonly implemented by many spatially enabled DBMSs
  3. ConvexHull() can be applied to any kind of geometry, because we can assume that line vertices and / or polygon border vertices are Points anyway
  4. Buffer() also can be applied to any kind of geometry.
    the requested radius is intended as expressed in the same units used by referred Geometry
  5. Simplify() can be applied only to linear geometries [this including also polygon borders], but not to Points or MultiPoints.
    the tolerance factor determines the desired amount of simplification [i.e. a biggest tolerance value produces a strongest simplification than a smallest one]
    tolerance has to be expressed in the same units used by referred Geometry
  6. SimplifyPreserveTopology() represents a specialized version of Simplify(), applying extra care in order to fully preserve the original topology.
    e.g. it's best suited for Polygons including internal holes.

5. Managing Coordinate Reference Systems and Coordinate Transformation

Each coordinate value, to be fully qualified, needs to be explicitly assigned to some SRID, or Spatial Reference Identifier.
This value identifies the geometry's associated Spatial Reference System describing the coordinate space in which the geometry object is defined.
As a general rule, different geometries can interoperate in a meaningful way only if their coordinates are expressed in the same Coordinate Reference System.
Very often some kind of coordinate reprojection is required in order to convert different GIS data in an unique, homogeneous, Coordinate Reference System and thus allowing interoperation and integration.

srid example #1
When you try to use two GIS datasets belonging to different Coordinate Reference Systems, entities falls very far because numeric values for coordinates are obviously in two different spaces.
srid example #2
But if you apply some opportune coordinate reprojection for one dataset, so to put it in the same Coordinate Reference Systems of the other one, entities will correctly overlap as expected.

The European Petroleum Survey Group [EPSG] maintains and distributes a large dataset of geodetic parameters describing quite any Coordinate Reference System and Coordinate Transformation used worldwide for GIS data.
SpatiaLite on its own implements SRIDs for any kind of geometric class, and supports the EPSG dataset to identify Coordinate Reference Systems.
The my_new_db.sqlite already contains epsg data; in a preceding exercise, when we loaded shapefiles, we called a
.read init_spatialite-2.3.sql
script in order to initialize Spatial MetaData; do you remember ?
So we can quickly start a new spatialite.exe session:

spatialite> SELECT * FROM spatial_ref_sys LIMIT 5;
sridauth_nameauth_sridref_sys_nameproj4text
2000epgs2000Anguilla 1957 / British West Indies Grid +proj=tmerc +lat_0=0 +lon_0=-62 +k=0.9995000000000001 +x_0=400000 +y_0=0 +ellps=clrk80 +units=m +no_defs
2001epgs2001Antigua 1943 / British West Indies Grid +proj=tmerc +lat_0=0 +lon_0=-62 +k=0.9995000000000001 +x_0=400000 +y_0=0 +ellps=clrk80 +units=m +no_defs
2002epgs2002Dominica 1945 / British West Indies Grid +proj=tmerc +lat_0=0 +lon_0=-62 +k=0.9995000000000001 +x_0=400000 +y_0=0 +ellps=clrk80 +towgs84=725,685,536,0,0,0,0 +units=m +no_defs
2003epgs2003Grenada 1953 / British West Indies Grid +proj=tmerc +lat_0=0 +lon_0=-62 +k=0.9995000000000001 +x_0=400000 +y_0=0 +ellps=clrk80 +towgs84=72,213.7,93,0,0,0,0 +units=m +no_defs
2004epgs2004Montserrat 1958 / British West Indies Grid +proj=tmerc +lat_0=0 +lon_0=-62 +k=0.9995000000000001 +x_0=400000 +y_0=0 +ellps=clrk80 +towgs84=174,359,365,0,0,0,0 +units=m +no_defs
spatialite>
  1. the spatial_ref_sys table is a Spatial MetaData one, and is initialized by executing the init_spatialite-2.3.sql script
  2. each Coordinate Reference System is uniquely identified by a srid value.
  3. the auth_name and auth_srid respectively identifies the authority producing this data and the original Srid
  4. each CSR is qualified by an intelligible ref_sys_name
  5. and has a lot of [quite obscure] geodetic projection parameters in the proj4text column
    Don't worry about this ... you can just get them as black magic for now ...
    If you want to learn more about them, you can usefully read the followings: http://en.wikipedia.org/wiki/Geodetic and ftp://ftp.remotesensing.org/proj/OF90-284.pdf
spatialite> SELECT DISTINCT Srid(geom) FROM NewTowns;
Srid(geom)
32632
spatialite> SELECT DISTINCT SRID(NewTowns.geom), spatial_ref_sys.ref_sys_name
           ... > FROM NewTowns, spatial_ref_sys
           ... > WHERE SRID(NewTowns.geom) = spatial_ref_sys.srid;

SRID(NewTowns.geom)spatial_ref_sys.ref_sys_name
32632WGS 84 / UTM zone 32N
spatialite>
  1. the SpatiaLite SRID() function allows you to identify the srid value that identifies any kind of geometry.
  2. you can perform a simple join to discover what srid 32632 really means.
Now suppose that for any good reason you absolutely need to transform coordinates, in such a way to obtain a new dataset pertaining to the WGS 84 Coordinate Reference System.
You can do this quite complex task by simply using a plain SQL query.

spatialite> BEGIN;
spatialite> SELECT AddGeometryColumn('NewTowns', 'wgs84', 4326, 'POINT', 2);
AddGeometryColumn('NewTowns', 'wgs84', 4326, 'POINT', 2)
1
spatialite> UPDATE NewTowns SET wgs84 = Transform(geom, 4326);
spatialite> COMMIT;
spatialite> SELECT AsText(geom), Srid(geom),
           ... > AsText(wgs84), Srid(wgs84) FROM NewTowns LIMIT 5;

AsText(geom)Srid(geom)AsText(wgs84)Srid(wgs84)
POINT(427002.77 4996361.33)32632POINT(8.071929 45.116952)4326
POINT(367470.48 4962414.5)32632POINT(7.324219 44.802838)4326
POINT(390084.12 5025551.73)32632POINT(7.596214 45.374854)4326
POINT(425246.99 5000248.3)32632POINT(8.049028 45.151753)4326
POINT(426418.89 4957737.37)32632POINT(8.070136 44.769244)4326
spatialite>
  1. the srid that identifies WGS 84 is 4326.
    You can check this by yourself.
  2. the SpatiaLite Transform() function obtains a new geometry from an original one, applying a coordinate reprojection as required.
  3. Now you have two alternative geometries for the NewTowns table:

6. Performing SQL queries directly on shapefiles

The SQLite DBMS implements a very interesting feature; you can develop some specialized driver in order to access any generic [and possibly exotic] data source, and then those external data will appear to the SQL engine exactly as they where native ones.
This means you can apply any SQL operation on them, if appropriate / possible and if the driver offers an adequate support.

The VirtualShape extension implements this kind of driver for SQLite.
Not only; VirtualShape is designed to interact very strictly with SpatiaLite, in order to support a complete direct access to shapefiles via standard SQL, for both attributes and geometries.

  1. The VirtualShape extension is automatically bounded within the SpatiaLite extensions, so while loading SpatiaLite you can perform SQL standard queries directly on shapefiles, with no needing at all to import them in some SQLite table.
  2. we can perform a very quick test by using one shapefile, i.e. the shape_towns we have previously created.
spatialite> CREATE VIRTUAL TABLE test_shape USING VirtualShape(shape_towns, CP1252, 32632);
spatialite> PRAGMA table_info(test_shape);

cidnametypenotnulldflt_valuepk
0PK_UIDINT00
1GeometryBLOB00
3NameVARCHAR(62)00
4PeoplesDOUBLE00
5LocalCouncVARCHAR(1)00
6CountyVARCHAR(1)00
7RegionVARCHAR(1)00
spatialite>
  1. with CREATE VIRTUAL TABLE you request SQLite to access some exotic external data source:
  2. the PRAGMA info_table simply requests SQLite to show all columns within a table; you already known this.
  3. as you can see, any attribute defined for your shapefile has been mapped into a corresponding SQL type.
  4. you can notice too that there are two other columns:
spatialite> SELECT PK_UID, Name, Peoples, AsText(Geometry)
           ... > FROM test_shape
           ... > WHERE Peoples > 350000 ORDER BY Name;

PK_UIDNamePeoplesAsText(Geometry)
8078Bologna371217.0POINT(686263.23 4929405.15)
7980Firenze356118.0POINT(681514.97 4848768.02)
203Genova610307.0POINT(492370.69 4918665.57)
8003Milano1256211.0POINT(514820.49 5034534.56)
8072Napoli1004500.0POINT(942636.1 4535272.55)
7922Palermo686722.0POINT(880179.17 4227024.08)
7918Roma2546804.0POINT(788703.57 4645636.3)
8012Torino865263.0POINT(395553.63 4991768.9)
spatialite>
  1. the VirtualShape driver is delegated to physically access the external data source [actually, a shapefile]
  2. the only restriction is that [for now ...] the current VirtualShape implementation is limited to read only operations
  3. please note that VIRTUAL TABLEs definitions are stored permanently inside the SQLite database
spatialite> DROP TABLE test_shape;
spatialite>
  1. your VIRTUAL TABLE will be cancelled, and is no longer available
  2. don't worry about the original external shapefile; it still exists, with no modifications.
    Dropping a VIRTUAL TABLE simply cancels the SQLite link with the external data source, but don't deletes this later one.

7. Performing SQL queries directly on CSV or TXT-tab files

Formats based on delimited text are widespread; usually you can export quite any kind of tabular data using the CSV or TXT-tab formats. Any mainstream spreadsheet [such as MS Excel or Open Office Calc] supports import / export to and from delimited text.
The VirtualText module is another VIRTUAL TABLE driver supporting direct read only access to CSV and TXT-tab files. Let us see a very simple example.



Author,Book,Lang,Category,Price
"Alighieri, Dante",Divina commedia,Italian,Literature,"12,50"
"Hobbes, Thomas",The Leviathan,English,Philosophy,"18,75"
"Shakespeare, William",Julius Cesar,English,Literature,"8,25"
"Knut, Donald",The art of computer programming,English,Computing,"35,50"
Stendhal,Le rouge et le noir,French,Literature,"8,50"
"Gould, Stephen Jay",Wonderful life,English,Biology,"12,60"
"Diderot, Denis",Jacques le fataliste,French,Literature,"6,50"
"Mann, Heinrich",Der blaue Engel,German,Literature,"6,60"
"Caesar, Caius Julius",Commentarii de bello gallico,Latin,History,"15,00"
"Morris, Desmond",The naked ape,English,Biology,"8,75"
"Kerningham, Brian : Ritchie, Dennis",The C programming Language,English,Computing,"17,50"
"de Cervantes, Miguel",Don Quichote,Spanish,Literature,"14,20"
Voltaire,TraitÚ sur la tolÚrance,French,Philosophy,"12,00"
"Mann, Thomas",Der Zauberberg,German,Literature,"11,00"
"Darwin, Charles",Origin of the Species,English,Biology,"9,50"
"Gibson, Henry W.",The decline and fall of the Roman Empire,English,History,"18,50"
"Flaubert, Gustave",L'Úducation sentimentale,Literature,French,"7,50"
"Thackeray, William M.",Vanity fair,English,Literature,"10,00"
"Zola, Emile",Germinal,French,Literature,"8,40"
"Dawkins, Richard",The selfish gene,English,Biology,"12,50"
"Dickens, Charles",Our mutual friend,English,Literature,"9,30"
"Marx, Karl",Grundrisse der Kritik der politischen OEkonomie,German,Philosophy",10,00"
"Manzoni, Alessandro",I promessi sposi,Italian,Literature,"15,00"


The preceding one is an example of CSV, i.e. represents a Comma Separated Value text file.
You simply have to copy this CSV sample, paste it into your preferred text editor (notepad, vi ...); and then save the books.txt file.
Now you are ready to start a SpatiaLite session:

spatialite> CREATE VIRTUAL TABLE books USING
           ... > VirtualText(books.txt, CP1252, 1, COMMA, DOUBLEQUOTE, ',');
spatialite> PRAGMA table_info(books);

cidnametypenotnulldflt_valuepk
0ROWNOINTEGER00
1AuthorTEXT00
2BookTEXT00
3LangTEXT00
4CategoryTEXT00
5PriceDOUBLE00
spatialite>
  1. the VirtualText driver is delegated to physically access the external data source [actually, a CSV or TXT-tab text file]
  2. please note: a VirtualText table is a read only table
spatialite> SELECT * FROM books ORDER BY Lang, Author;
ROWNOAuthorBookLangCategoryPrice
15Darwin, CharlesOrigin of the SpeciesEnglishBiology9.50
20Dawkins, RichardThe selfish geneEnglishBiology12.50
21Dickens, CharlesOur mutual friendEnglishLiterature9.30
16Gibson, Henry W.The decline and fall of the Roman EmpireEnglishHistory18.50
6Gould, Stephen JayWonderful lifeEnglishBiology12.60
2Hobbes, ThomasThe LeviathanEnglishPhilosophy18.75
11Kerningham, Brian : Ritchie, DennisThe C programming LanguageEnglishComputing17.50
4Knut, DonaldThe art of computer programmingEnglishComputing35.50
10Morris, DesmondThe naked apeEnglishBiology8.75
3Shakespeare, WilliamJulius CesarEnglishLiterature8.25
18Thackeray, William M.Vanity fairEnglishLiterature10.00
7Diderot, DenisJacques le fatalisteFrenchLiterature6.50
5StendhalLe rouge et le noirFrenchLiterature8.50
13VoltaireTraitÚ sur la tolÚranceFrenchPhilosophy12.00
19Zola, EmileGerminalFrenchLiterature8.40
8Mann, HeinrichDer blaue EngelGermanLiterature6.60
14Mann, ThomasDer ZauberbergGermanLiterature11.00
22Marx, KarlGrundrisse der Kritik der politischen OEkonomieGermanPhilosophy10.00
1Alighieri, DanteDivina commediaItalianLiterature12.50
23Manzoni, AlessandroI promessi sposiItalianLiterature15.00
9Caesar, Caius JuliusCommentarii de bello gallicoLatinHistory15.00
17Flaubert, GustaveL'Úducation sentimentaleLiteratureFrench7.50
12de Cervantes, MiguelDon QuichoteSpanishLiterature14.20
spatialite> SELECT Book, Author FROM Books
           ... > WHERE Category = 'Literature' AND Price < 10 AND Lang = 'English';
BookAuthor
Julius CesarShakespeare, William
Our mutual friendDickens, Charles
spatialite>

Once you have created the Virtual Table based on the VirtualText module, you can obviously perform any SQL SELECT statement using this table.

8. Spatial Index: using SQLite's R*Tree

The SQLite DBMS (as many other DBMS) implements Indices as a mean to quickly retrieve selected data.
Imagine a table containing million and million rows, such as the national telephone directory; retrieving the people's name owning some specific telephone number may be a slow operation, if you perform a so called full table scan; i.e. if the DBMS engine has to read and evaluate every row in order to filter the requested one.

Implementing an Index on the telephone number column allows for very fast querying; now the DBMS engine can simply read the Index, and immediately will find the required row.
There is no need at all to perform the lengthy full table scan process.

Please note: if your table contains a limited number of rows (let say 5,000 or 20,000) implementing an Index hasn't any practical effect. SQLite can perform a full table scan in a very short time anyway. But if your table contains an huge number of rows (let say more than 100,000), then implementing an Index can have a dramatic effect on query's execution time.
As biggest is the number of rows, as more evident becomes the benefit to implement an Index.

spatialite> CREATE INDEX IdxTownName ON NewTowns (Name);
spatialite>
  1. the CREATE INDEX SQL statement is used in order to build an Index
  2. when an Index is added to some table, then SQLite applies the following actions:
spatialite> DROP INDEX IdxTownName;
spatialite>
  1. the DROP INDEX SQL statement is used in order to remove an Index

Unhappily a SQLite's Index cannot be used against a Geometry-type column.
This is because an Index as the one created be Create Index represents an implementation of a so called BTree (i.e. Binary Tree); binary trees can usefully be applied on numbers and / or strings, but not on geometries.
For geometries we need a completely different kind of Index, i.e. the so called RTree (Rectangle Tree).

Here comes a good notice; starting with v.3.6.0 SQLite supports a stable implementation for RTRee. SQLite's RTree seems to be really reliable (and very efficient, too); so SpatiaLite includes it in order to implement Spatial Index on Geometry-type columns as well.

spatialite> SELECT * FROM geometry_columns;
f_table_namef_geometry_columntypecoord_dimensionsridspatial_index_enabled
NewTownsgeomPOINT2326320
NewRegionsGeometryMULTIPOLYGON2326320
NewTownswgs84POINT243260
spatialite>
  1. do you remember of Spatial MetaData ? we've met them in a preceding paragraph.
  2. the spatial_index_enabled column is used in order to detect if a Geometry-type column has to be Spatially-Indexed.
spatialite> SELECT CreateSpatialIndex('NewTowns', 'geom');
CreateSpatialIndex('NewTowns', 'geom')
1
spatialite> SELECT * FROM geometry_columns;
f_table_namef_geometry_columntypecoord_dimensionsridspatial_index_enabled
NewTownsgeomPOINT2326321
NewRegionsGeometryMULTIPOLYGON2326320
NewTownswgs84POINT243260
spatialite>

The CreateSpatialIndex() SpatiaLite's function allows to define a Spatial Index.
Let see some detail in depth:

spatialite> .tables
NewHighWaysidx_NewTowns_geomspatial_ref_sys
NewRegionsidx_NewTowns_geom_nodetest_shape
NewTownsidx_NewTowns_geom_parent
geometry_columnsidx_NewTowns_geom_rowid
spatialite> PRAGMA table_info(idx_NewTowns_geom);
cidnametypenotnulldflt_valuepk
0pkid0NULL0
1xmin0NULL0
2xmax0NULL0
3ymin0NULL0
4ymax0NULL0
spatialite> SELECT * FROM Idx_NewTowns_geom LIMIT 5;
pkidxminxmaxyminymax
7695319224.0319224.04994219.04994219.0
3442322716.71875322716.718754978712.04978712.0
3436325736.90625325736.906254980018.54980018.5
69332594.65625332594.656254980547.04980547.0
3477333373.40625333373.406254993133.04993133.0
spatialite>
  1. the SQLite RTree is implemented as a VIRTUAL TABLE
    you already know what SQLite's Virtual Tables are; we have just see this point about the VirtualShape
  2. an RTree requires four tables:
  3. the most relevant one is the prefix, that represents the MBRs [aka BBOXs] for indexed geometries;
    the pkid column contains the Primary Key value identifying the corresponding row within the indexed table
    the xmin, xmax, ymin and ymax columns are used to represent the MBR.
spatialite> SELECT name, tbl_name, sql FROM sqlite_master
           ... > WHERE name LIKE 'gi%';
nametbl_namesql
gii_NewTowns_geomNewTowns CREATE TRIGGER gii_NewTowns_geom AFTER INSERT ON NewTowns
FOR EACH ROW BEGIN
INSERT INTO idx_NewTowns_geom (pkid, xmin, xmax, ymin, ymax)
VALUES (NEW.ROWID, MbrMinX(NEW.geom), MbrMaxX(NEW.geom), MbrMinY(NEW.geom), MbrMaxY(NEW.geom));
END
giu_NewTowns_geomNewTowns CREATE TRIGGER giu_NewTowns_geom AFTER UPDATE ON NewTowns
FOR EACH ROW BEGIN
UPDATE idx_NewTowns_geom SET xmin = MbrMinX(NEW.geom), xmax = MbrMaxX(NEW.geom),
ymin = MbrMinY(NEW.geom), ymax = MbrMaxY(NEW.geom) WHERE pkid = NEW.ROWID;
END
gid_NewTowns_geomNewTowns CREATE TRIGGER gid_NewTowns_geom AFTER DELETE ON NewTowns
FOR EACH ROW BEGIN
DELETE FROM idx_NewTowns_geom WHERE pkid = NEW.ROWID;
END
spatialite>
  1. SpatiaLite on its own implements three triggers in order to ensure a full correspondence between geometries into the main table and corresponding spatial index.
  2. each time you'll insert, update or delete a row within the main table, you can safely assume that index infos are consequently updated the right way, because a trigger actively enforces such a constraint.

All right, now we have built a Spatial Index; it's time to take profit of its presence.
RTrees are only loosely integrated into the SQLite's query engine, so we need some extra care in order to use them. You can't assume SpatiaLite will automatically use some Spatial Index in order to speed up a slow query; you have to explicitly request this feature.
It's not too complex; you simply have to compose a nested query in order to take full profit from a Spatial Index.

spatialite> SELECT name FROM NewTowns
           ... > WHERE X(geom) > 730000 AND X(geom) < 734000
           ... > AND Y(geom) > 4810000 AND Y(geom) < 4850000;

Name
Capolona
Chitignano
Subbiano
Arezzo
spatialite> SELECT name FROM NewTowns WHERE ROWID IN
           ... > (SELECT pkid FROM idx_NewTowns_geom WHERE
           ... > xmin > 730000 AND xmax < 734000 AND ymin > 4810000 AND ymax < 4850000);

Name
Capolona
Chitignano
Subbiano
Arezzo
spatialite>
  1. The first query has nothing of interesting; it's a trivial query, and doesn't involves the Spatial Index.
  2. The second query instead will take full advantage from using the Spatial Index.
  3. Important notice: when you'll try these two queries, very easily you'll notice no difference at all between them.
    This is because the NewTowns table contains only 8,000 rows; if you are using a quite decent hardware, a so simple dataset cannot evidentiate any penalization when using a full table scan.
    But while accessing a more complex dataset (say one containing some million rows) the difference become quite impressive; query execution time may vary from some minutes to few seconds
Do you remember ? In a previous step we examined a slow running spatial query, and we committed ourselves to better explain later how we can optimize such a query by using a Spatial Index; let us now rewrite such an optimized query:

spatialite> SELECT NewRegions.Name, COUNT(*) FROM NewTowns, NewRegions
           ... > WHERE NewRegions.Name IN (
           ... > 'VALLE D''AOSTA', 'PIEMONTE', 'UMBRIA', 'LOMBARDIA',
           ... > 'CALABRIA', 'MOLISE', 'MARCHE', 'BASILICATA') AND
           ... > Within(NewTowns.geom, NewRegions.Geometry) AND
           ... > NewTowns.ROWID IN
           ... > (SELECT pkid FROM idx_NewTowns_geom WHERE
           ... > xmin > MbrMinX(NewRegions.Geometry) AND
           ... > xmax < MbrMaxX(NewRegions.Geometry) AND
           ... > ymin > MbrMinY(NewRegions.Geometry) AND
           ... > ymax < MbrMaxY(NewRegions.Geometry))
           ... > GROUP BY NewRegions.Name;

NewRegions.NameCOUNT(*)
BASILICATA133
CALABRIA409
LOMBARDIA1548
MARCHE248
MOLISE136
PIEMONTE1201
UMBRIA91
VALLE D'AOSTA74
spatialite>
  1. We've simply rewritten the former query in order to take advantage of the Spatial Index that is now available.
  2. This means that we are no longer evaluating the Within() spatial relationship for each NewTowns entity, but only for the ones falling inside each NewRegions entity MBR.
  3. The Spatial Index allows to quickly retrieve such NewTowns entities
  4. As you can easily notice, the query rewritten in this later form takes a noticeably shorter time to complete

It's not very alike you'll desire to destroy a Spatial Index once you've created one.
Anyway you can do such a thing, if you will:

spatialite> SELECT DisableSpatialIndex('NewTowns', 'geom');
spatialite> .tables
NewHighWaysidx_NewTowns_geomspatial_ref_sys
NewRegionsidx_NewTowns_geom_nodetest_shape
NewTownsidx_NewTowns_geom_parent
geometry_columnsidx_NewTowns_geom_rowid
spatialite> DROP TABLE idx_NewTowns_geom;
spatialite> .tables
NewHighWaysNewTownsspatial_ref_sys
NewRegionsgeometry_columnstest_shape
spatialite> VACUUM;
spatialite>

  1. first you have to call DisableSpatialIndex(); this will remove the triggers, so to stop Spatial Index updating.
  2. this action doesn't removes the RTree itself; you have to call DROP TABLE explicitly in order to physically remove it.
  3. after this, performing a VACUUM is a very good idea, in order to maintain your database in an efficient state.

9. Spatial Index take two: using MbrCache

SpatiaLite supports an alternative implementation for Spatial Index, i.e. the one based on MBRs in memory caching.
Each Geometry has its own MBR aka BBOX; storing in memory all these MBRs we'll then obtain an MbrCache.
And when ad indexed Spatial selection is required, scanning this cache in order to detect which MBRs satisfy the required Spatial criteria will take a very short time, because no slow disk access will be then required at all; all the MBRs are already cached in memory, so retrieving the required ones will be a very fast operation.
This caching technique presents the advantage of an unbeatable simplicity; it isn't at all a true index [as R*Tree is], but for any practical purpose using an MbrCache or an R*Tree can be assumed to be the same [at least, if you have to cache a reasonable amount of MBRs, let say one million or so on].

The following is a practical example of how to use MbrCache: we'll now connect the test-2.3.sqlite database.

spatialite> SELECT CreateMbrCache('Towns', 'Geometry');
CreateMbrCache('Towns', 'Geometry')
1
spatialite> SELECT * FROM geometry_columns;
f_table_namef_geometry_columntypecoord_dimensionsridspatial_index_enabled
RegionsGeometryMULTIPOLYGON2326320
HighWaysGeometryMULTILINESTRING2326320
TownsGeometryPOINT2326322
spatialite>

The CreateMbrCache() SpatiaLite's function allows to define an MbrCache.
Let see some detail in depth:

spatialite> .tables
HighWayscache_Towns_Geometryspatial_ref_sys
Regionsgeom_cols_ref_sys
Townsgeometry_columns
spatialite> PRAGMA table_info(cache_Towns_Geometry);
cidnametypenotnulldflt_valuepk
0rowidINTEGER00
1mbrBLOB00
spatialite> SELECT * FROM cache_Towns_Geometry LIMIT 5;
rowidmbr
1POLYGON((427002.77 4996361.33, 427002.77 4996361.33, 427002.77 4996361.33, 427002.77 4996361.33, 427002.77 4996361.33))
2POLYGON((367470.48 4962414.50, 367470.48 4962414.50, 367470.48 4962414.50, 367470.48 4962414.50, 367470.48 4962414.50))
3POLYGON((390084.12 5025551.73, 390084.12 5025551.73, 390084.12 5025551.73, 390084.12 5025551.73, 390084.12 5025551.73))
4POLYGON((425246.99 5000248.30, 425246.99 5000248.30, 425246.99 5000248.30, 425246.99 5000248.30, 425246.99 5000248.30))
5POLYGON((426418.89 4957737.37, 426418.89 4957737.37, 426418.89 4957737.37, 426418.89 4957737.37, 426418.89 4957737.37))
spatialite>
  1. the MbrCache is implemented as a VIRTUAL TABLE; you already know what SQLite's Virtual Tables are
  2. each row of this Virtual Table represents the MBRs [aka BBOXs] for corresponding indexed Geometry;
    the rowid column contains the Primary Key value identifying the corresponding row within the indexed table.
    the mbr column is used to represent the MBR [please note: this is not an ordinary Geometry-type value; thus you cannot use the AsText() etc functions on it]
spatialite> SELECT name, tbl_name, sql FROM sqlite_master
           ... > WHERE name LIKE 'gc%';
nametbl_namesql
gci_Towns_GeometryTowns CREATE TRIGGER gci_Towns_Geometry AFTER INSERT ON Towns
FOR EACH ROW BEGIN
INSERT INTO cache_Towns_Geometry (rowid, mbr)
VALUES (NEW.ROWID,
BuildMbrFilter(MbrMinX(NEW.Geometry), MbrMinY(NEW.Geometry), MbrMaxX(NEW.Geometry), MbrMaxY(NEW.Geometry)));
END
gcu_Towns_GeometryTowns CREATE TRIGGER gcu_Towns_Geometry AFTER UPDATE ON Towns
FOR EACH ROW BEGIN
UPDATE cache_Towns_Geometry SET mbr =
BuildMbrFilter(MbrMinX(NEW.Geometry), MbrMinY(NEW.Geometry), MbrMaxX(NEW.Geometry), MbrMaxY(NEW.Geometry))
WHERE rowid = NEW.ROWID;
END
gcd_Towns_GeometryTowns CREATE TRIGGER gcd_Towns_Geometry AFTER DELETE ON Towns
FOR EACH ROW BEGIN
DELETE FROM cache_Towns_Geometry WHERE rowid = OLD.ROWID;
END
spatialite>
  1. SpatiaLite on its own implements three triggers in order to ensure a full correspondence between geometries into the main table and corresponding MbrCache.
  2. each time you'll insert, update or delete a row within the main table, you can safely assume that cached infos are consequently updated the right way, because a trigger actively enforces such a constraint.
  3. the BuildMbrFilter() function is a special one, and has to be mandatorily used in order to create the MBRs stored within an MbrCache

All right, now we have built an MbrCache; it's time to take profit of its presence.
MbrCaches are only loosely integrated into the SQLite's query engine, so we need to compose a nested query in order to use them [Do you remember ? we have already seen how to use nested queries for T*Trees].

spatialite> SELECT name FROM Towns
           ... > WHERE MbrWithin(Geometry, BuildMbr(730000, 4810000, 734000, 4850000));

Name
Capolona
Chitignano
Subbiano
Arezzo
spatialite> SELECT name FROM Towns WHERE ROWID IN
           ... > (SELECT rowid FROM cache_Towns_Geometry WHERE
           ... > mbr = FilterMbrWithin(730000, 4810000, 734000, 4850000));

Name
Capolona
Chitignano
Subbiano
Arezzo
spatialite>
  1. The first query has nothing of interesting; it's a trivial query, and doesn't involves the MbrCache.
  2. The second query instead will take full advantage from using the MbrCache.
  3. Important notice: when you'll try these two queries, very easily you'll notice no difference at all between them.
    This is because the Towns table contains only 8,000 rows; if you are using a quite decent hardware, a so simple dataset cannot evidentiate any penalization when using a full table scan.
    But while accessing a more complex dataset (say one containing a million rows or so on) the difference become quite impressive; query execution time may vary from several seconds to few milliseconds
While composing SQL expressions to be used against an MbrCache, you can use the following filtering functions:

10. Struggling with locale_charset settings

Some little history [to better understand]: SQLite internally manages TEXT values as they where supposed to be UTF-8 encoded; but it does not perform any checking in order to actually enforce this [I suppose this is due to performance reasons].
The open source libiconv library supports transforming any text from some charset to any other known one; SpatiaLite uses libiconv in order to convert any input into UTF-8, and any output from UTF-8, so supporting in an effective way any locale_charset setting.
You are simply required (when needed) to explicitly select the locale_charset to be supported: this may seem at a first glance as a little bit confusing; but you'll discover by experience that this one is a big good feature, allowing for maximum flexibility. SpatiaLite supports the following locale_charsets:
ARMSCII-8Armenian
ASCIIUS-ASCII
BIG5Chinese/Traditional
BIG5-HKSCS
BIG5-HKSCS:1999
BIG5-HKSCS:2001
Chinese/Hong Kong
CP850DOS/OEM Western Europe
CP862DOS/OEM Hebrew
CP866DOS/OEM Cyrillic
CP874DOS/OEM Thai
CP932DOS/OEM Japanese
CP936DOS/OEM Chinese
CP949DOS/OEM Korean
CP950DOS/OEM Chinese/Big5
CP1133Laotian
CP1250Windows Central Europe
CP1251Windows Cyrillic
CP1252Windows Latin 1
CP1253Windows Greek
CP1254Windows Turkish
CP1255Windows Hebrew
CP1256Windows Arabic
CP1257Windows Baltic
CP1258Windows Vietnamese
EUC-CNChinese
EUC-JPJapanese
EUC-KRKorean
EUC-TWTaiwan
GB18030Chinese/National Standard
GBKChinese/Simplified
Georgian-AcademyGeorgian
Georgian-PSGeorgian
HZChinese
ISO-2022-CN
ISO-2022-CN-EXT
Chinese
ISO-2022-JP
ISO-2022-JP-1
ISO-2022-JP-2
Japanese
ISO-2022-KRKorean
ISO-8859-1Latin-1 Western European
ISO-8859-2Latin-2 Central European
ISO-8859-3Latin-3 South European
ISO-8859-4Latin-4 North European
ISO-8859-5Latin/Cyrillic
ISO-8859-6Latin/Arabic
ISO-8859-7Latin/Greek
ISO-8859-8Latin/Hebrew
ISO-8859-9Latin-5 Turkish
ISO-8859-10Latin-6 Nordic
ISO-8859-11Latin/Thai
ISO-8859-13Latin-7 Baltic Rim
ISO-8859-14Latin-8 Celtic
ISO-8859-15Latin-9
ISO-8859-16Latin-10 South-Eastern European
JOHABKorean
KOI8-RRussian
KOI8-UUkrainian
KOI8-RUBelarusian
KOI8-TTajik
MacArabicMAC Arabic
MacCentralEuropeMAC Central Europe
MacCroatianMAC Croatian
MacCyrillicMAC Cyrillic
MacGreekMAC Greek
MacHebrewMAC Hebrew
MacIcelandMAC Iceland
MacintoshMAC
MacRomanMAC European/Western languages
MacRomaniaMAC Romania
MacThaiMAC Thai
MacTurkishMAC Turkish
MacUkraineMAC Ukraine
MuleLao-1Laotian
PT154Kazakh
RK1048Kazakh
SHIFT_JISJapanese
TCVNVietnamese
TIS-620Thai
UTF-8UNICODE/Universal
VISCIIVietnamese

Let us try some test query, in order to better understand: spatialite> SELECT PK_UID, Name FROM Towns WHERE Name LIKE '%¨%';
spatialite>

Apparently, this query found any row at all ... spatialite> SELECT PK_UID, Name FROM Towns WHERE PK_UID = 7749;
PK_UIDName
7749Vi+Ž
spatialite>

Trying to select a row surely containing an ¨ within the Name, we'll then get some strange character ...
Reason for this behaviour is quickly explained: spatialite> .charset CP850
spatialite> SELECT PK_UID, Name FROM Towns WHERE PK_UID = 7749;
PK_UIDName
7749Vi¨
spatialite> SELECT PK_UID, Name FROM Towns WHERE Name LIKE '%¨%';
PK_UIDName
644Pal¨ del Fersina
1689Viggi¨
1878Mont¨ Beccaria
3217Tem¨
4289Cefal¨
5051Carr¨
5333Pal¨
6684Pat¨
7553Cant¨
7749Vi¨
spatialite>
You have to resolve similar problems when importing [exporting] some shapefile, or when executing an SQL script.
You absolutely cannot assume they are UTF-8 encoded; this is the exception, more than the rule ... so you have to be very careful, and check well and thoroughly what are you really doing.
As a rule of the thumb, you can quite safely make the following assumptions: