SpatiaLite - spatial extensions for SQLite

Table of Contents


1. Introduction to SpatiaLite spatial support

SpatiaLite is just a small sized SQLite extension. Once you have installed SpatiaLite (a very simple and elementary task), the SQLite DBMS is enable to load, store and manipulate Spatial Data (aka Geographic Data, GIS Data, Cartographic Data, GeoSpatial Data, Geometry Data and alike).
SpatiaLite implements spatial extensions following the specification of the Open Geospatial Consortium (OGC), an international consortium of companies, agencies, and universities participating in the development of publicly available conceptual solutions that can be useful with all kinds of applications that manage spatial data.
At a very basic level, a DBMS that supports Spatial Data offers an SQL environment that has been extended with a set of geometry types, and thus may be usefully employed i.e. by some GIS application.
A geometry-valued SQL column is implemented as a column that has a geometry type. The OGC specification describe a set of SQL geometry types, as well as functions on those types to create and analyze geometry values. The followings are examples of geographic entities with a geometry: To be completely honest, SpatiaLite implements just a subset of the SQL with Geometry Types environment proposed by OGC.
However this limited support of Spatial Data SQL processing is quite reasonably sufficient to enable SQLite for realistic and useful deployment in a simple GIS environments as well.
The strength points of the SQLite + SpatiaLite couple are: If you are looking for a full fledged Geo-DBMS, capable to support you in every kind of GIS operation, and perhaps for you is paramount to cope with millions and millions of geographic entities, don't even try to use SpatiaLite + SQLite. Obviously it's not the solution you are looking for; simply it falls too short for your needs.
Otherwise, if you are looking for an effective but very basic Geo-DBMS, with no frills, no undue complexity, and the average volume of your data didn't exceeds a some hundredth of thousands entities, SpatiaLite + SQLite can be really the right solution for your needs.

Just to do a very quick comparison with others widely used open source DBMSs supporting Spatial Data:
DBMS Binary size Brief description
PostgreSQL
PostGis
25 MB
1.5 MB
It claims to be "The world's most advanced open source database", and that's simply the truth.
Not very user friendly (may cause you some headache when you'll try to install and configure it at first time), but surely a very complete and impressive full ACID DBMS.
The PostGIS add-on on its own implements a full set of Spatial features (OGC certified); whatever kind of GIS operation you need, PostGIS can support you in an useful way and without dimensional limits.
If you are seriously looking for a complete and unrestricted GIS support, and you aren't frightened at all about managing and caring a very sophisticated DBMS, as powerful as complex, PostgreSQL + PostGis is just what you need.
MySQL Spatial 42 MB It claims to be "The world's most popular open source database", and this too is simply the truth.
Reasonably easy to use, very fast, decently simple to manage; not surprisingly it's the preferred choice for so many WEB developers.
Often criticized for not being completely conformant to SQL standards, supports a wide set (sometimes puzzling too wide) of data engines.
MySql's Spatials enable you to manage in a very effective way GIS data; it's a pity that you can create a spatial index only if you are using MyISAM tables, that aren't neither transactional nor ACIDs. You can use Spatial data in InnoDB tables as well (full ACIDs), but in this case you cannot have spatial indexes.
Anyway, if you are looking for a DMBS able to effectively support WEB publishing of very large amounts of GIS data, and aren't worried to use a non-ACID DBMS (e.g. because you are publishing cold data, i.e. replicated data coming out from some other source), you can be very satisfied with MySQL Spatial.
SQLite
SpatiaLite
200 KB
150 KB
A very light weighted DBMS; it doesn't pretends to be, but perhaps it is "The world's smallest and simplest database".
As the aforementioned two tends to be vast and complex, as SQLite tends to be simple.
But simplicity does not always means incompleteness; not in this case, at least.
SpatiaLite offers you the option to enable the SQLite DBMS to support a decent and standard-conformant environment for Spatial data as well.
It doesn't pretends at all to be better, or fastest, or most powerful than others; most humbly, it just pretends to be simple, light weighted and reasonably useful.
SpatiaLite does not implements any kind of spatial index, mainly because SQLite in its own does not supports R-trees, but only B-trees. Please note well that this one may be a severe limitation, if you need to fastly access a very large data set by using some spatial selection criterion; but if yours data sets are not too large [less than 10.000 rows, just to fix an arbitrary limit] lacking of any kind of spatial index may be practically unnoticed.
If you are looking for an elementary simple-to-manage GIS environment, without frills and undue complexity, and if your have to manage only small- or medium-sized GIS data, SQLite and SpatiaLite can make you feel very happy [I hope so, at least ...]

2. The OpenGIS geometry model in a nutshell

The set of geometry types proposed by OGC's SQL with Geometry Types environment is based on the OpenGIS Geometry Model. In this model, each geometric object has the following general properties:

2.1. The geometry class hierarchy

The geometry classes define a hierarchy as follows: It is not possible to create objects in non-instantiable classes. It is possible to create objects in instantiable classes.
Geometry is the root class of the hierarchy. It is a non-instantiable class but has a number of properties that are common to all geometry values created from any of the Geometry subclasses. These properties are described in the following list. Particular subclasses have their own specific properties, described later.
Geometry Properties
A geometry value has the following properties:

2.3. class Point

A Point is a geometry that represents a single location in coordinate space.
Point Properties
point example

2.4. class Curve

A Curve is a one-dimensional geometry, usually represented by a sequence of points. Particular subclasses of Curve define the type of interpolation between points. Curve is a non-instantiable class.
Curve Properties

2.5. class LineString

A LineString is a Curve with linear interpolation between points.
LineString Properties
linestring example

2.6. class Surface

A Surface is a two-dimensional geometry. It is a non-instantiable class. Its only instantiable subclass is Polygon.
Surface Properties

2.7. class Polygon

A Polygon is a planar Surface representing a multisided geometry. It is defined by a single exterior boundary and zero or more interior boundaries, where each interior boundary defines a hole in the Polygon.
Polygon Assertions
The boundary of a Polygon consists of a set of LinearRing objects (that is, LineString objects that are both simple and closed) that make up its exterior and interior boundaries. The preceding assertions make a Polygon a simple geometry.

polygon example

2.8. class GeometryCollection

A GeometryCollection is a geometry that is a collection of one or more geometries of any class. All the elements in a GeometryCollection must be in the same Spatial Reference System (that is, in the same coordinate system). There are no other constraints on the elements of a GeometryCollection, although the subclasses of GeometryCollection described in the following sections may restrict membership. Restrictions may be based on:

2.9. class MultiPoint

A MultiPoint is a geometry collection composed of Point elements. The points are not connected or ordered in any way.
MultiPoint Properties
multipoint example

2.10. class MultiCurve

A MultiCurve is a geometry collection composed of Curve elements. MultiCurve is a non-instantiable class.
MultiCurve Properties

2.11. class MultiLineString

A MultiLineString is a MultiCurve geometry collection composed of LineString elements. multilinestring example

2.12. class MultiSurface

A MultiSurface is a geometry collection composed of surface elements. MultiSurface is a non-instantiable class. Its only instantiable subclass is MultiPolygon.
MultiSurface Assertions

2.13. class MultiPolygon

A MultiPolygon is a MultiSurface object composed of Polygon elements.
MultiPolygon Assertions
MultiPolygon Properties
multipolygon example

3. Supported spatial data formats

This section describes the standard spatial data formats that are used to represent geometry objects in queries. They are:

3.1. Well-Known Text (WKT) format

The Well-Known Text (WKT) representation of Geometry is designed to exchange geometry data in ASCII form.
Examples of WKT representations of geometry objects: A Backus-Naur grammar that specifies the formal production rules for writing WKT values can be found in the OpenGIS specification document referenced near the beginning of this chapter.

3.2. Well-Known Binary (WKB) format

The Well-Known Binary (WKB) representation for geometric values is defined by the OpenGIS specification. It is also defined in the ISO SQL/MM Part 3: Spatial standard. WKB is used to exchange geometry data as binary streams represented by BLOB values containing geometric WKB information.
WKB uses one-byte unsigned integers, four-byte unsigned integers, and eight-byte double-precision numbers (IEEE 754 format). A byte is eight bits.

For example, a WKB value that corresponds to POINT(1 1) consists of this sequence of 21 bytes (each represented here by two hex digits):
0101000000000000000000F03F000000000000F03F
The sequence may be broken down into these components: Component representation is as follows: The WKB value that corresponds to LINESTRING(1 1, 2 2) consists of the following sequence of 41 bytes:
0102000000000000000000F03F000000000000F03F00000000000000400000000000000040
The sequence may be broken down into these components: A last example referred to collection geometries: the WKB value that corresponds to MULTIPOINT(1 1, 2 2) consists of the following sequence of 49 bytes:
010200000001000000000000000000F03F000000000000F03F0100000000000000000000400000000000000040
The sequence may be broken down into these components:

3.3. Internal BLOB format

Internally, SpatiaLite stores geometry values using ordinary SQLite's BLOB columns in a format that is very closely related to WKB format, but not exactly identical.
The main rationale to adopt a modified WKB format is needing to directly expose explicit MBRs; this helps a lot in order to grant a quick access to entities selected on a spatial basis, and partially balances the lacking of spatial indexes.
The second good reason to use a peculiarly encoded format is needing to be sure that a generic BLOB value really corresponds to a valid SpatiaLite GEOMETRY; remember that one of SQLite specific features is to offer a very weak column-type enforcement.
So SpatiaLite simply relies on ordinary SQLite general support to check that one column value contains a generic BLOB, and then implements on its own any further check if this may be considered as a valid GEOMETRY value.
To do this, SpatiaLite inserts some special markers at predictable and strategic positions. The following is the general internal BLOB format for GEOMETRY used by SpatiaLite:
Byte Offset Content Notes
0 START [hex 00] a GEOMETRY encoded BLOB value must always start with a 0x00 byte
1 ENDIAN
[hex 00 or hex 01]
if this GEOMETRY is BIG_ENDIAN ordered must contain a 0x00 byte value
otherwise, if this GEOMETRY is LITTLE_ENDIAN ordered must contain a 0x01 byte value
Any other value [neither 0x00 nor 0x01], means that this BLOB cannot be a valid GEOMETRY
2 - 5 SRID a 32-bits integer value [little- big-endian ordered, accordingly with the precedent one]
corresponding to the SRID for this GEOMETRY
6 - 13 MBR_MIN_X a double value [little- big-endian ordered, accordingly with the precedent one]
corresponding to the MBR minimum X coordinate for this GEOMETRY
14 - 21 MBR_MIN_Y a double value corresponding to the MBR minimum Y coordinate
22 - 29 MBR_MAX_X a double value corresponding to the MBR maximum X coordinate
30 - 37 MBR_MAX_Y a double value corresponding to the MBR maximum Y coordinate
38 MBR_END [hex 7C] a GEOMETRY encoded BLOB value must always have an 0x7C byte in this position
39 - 42 CLASS TYPE a 32-bits integer; must identify a valid WKB class, i.e.
  • 1 = POINT
  • 2 = LINESTRING
  • 3 = POLYGON
  • 4 = MULTIPOINT
  • 5 = MULTILINESTRING
  • 6 = MULTIPOLYGON
  • 7 = GEOMETRYCOLLECTION
All following bytes are interpreted accordingly to this one class declaration; any other value in this position causes the current one to be considered as an invalid GEOMETRY.
43 - ... geometry class specific length and content depends on geometry class [POINT, LINESTRING, MULTIPOLYGON, GEOMETRYCOLLECTION etc]
LAST END [hex FE] a GEOMETRY encoded BLOB value must always end with a 0xFE byte

The preceding one is the general format that SpatiaLite uses for internal representations of any BLOB encoded GEOMETRY. Specific formats for the various classes of GEOMETRY are strictly based upon WKB, as follows:
GEOMETRY class
Format
Byte Offset Content Notes
POINT
0 - 7 X coordinate a double value corresponding to the X coordinate for this POINT
8 - 15 Y coordinate a double value corresponding to the Y coordinate for this POINT
LINESTRING
0 - 3 number of POINTs a 32-bits integer specifying the number of POINTs in this LINESTRING
4 - ... POINTs a corresponding sequence of POINTs
POLYGON
0 - 3 number of RINGs a 32-bits integer specifying the number of RINGs in this POLYGON
4 - ... RINGs a corresponding sequence of RINGs [LINESTRINGs]
MULTIPOINT
0 - 3 number of POINTs a 32-bits integer specifying the number of POINTs in this MULTIPOINT
4 - ... collection entities a corresponding sequence of collection entities, all of the POINT type
MULTILINESTRING
0 - 3 number of LINESTRINGs a 32-bits integer specifying the number of LINESTRINGs in this MULTILINESTRING
4 - ... collection entities a corresponding sequence of collection entities, all of the LINESTRING type
MULTIPOLYGON
0 - 3 number of POLYGONSs a 32-bits integer specifying the number of POLYGONs in this MULTIPOLYGON
4 - ... collection entities a corresponding sequence of collection entities, all of the POLYGON type
GEOMETRYCOLLECTION
0 - 3 number of collection entities a 32-bits integer specifying the number of collection entities in this GEOMETRYCOLLECTION
4 - ... collection entities a corresponding sequence of collection entities, of any supported kind

The following is the format expected for each one collection entity:
Byte Offset Content Notes
0 ENTITY [hex 69] a GEOMETRY encoded BLOB value must always have an 0x69 byte in this position
1 - 4 CLASS TYPE a 32-bits integer; must identify a valid WKB class that can be included within a collection, i.e.
  • 1 = POINT
  • 2 = LINESTRING
  • 3 = POLYGON
All following bytes are interpreted accordingly to this one class declaration; any other value in this position causes the current one to be considered as an invalid GEOMETRY.
5 - ... geometry class specific Length and content depends on geometry class [POINT, LINESTRING or POLYGON]

4. Managing a spatially enabled SQLite database

This section describes the data types you can use for representing spatial data in SpatiaLite, and the functions available for creating and retrieving spatial values.

4.1. SpatiaLite spatial data types

SpatiaLite has data types that correspond to OpenGIS classes. GEOMETRYCOLLECTION can store a collection of objects of any type. The other collection types (MULTIPOINT, MULTILINESTRING, MULTIPOLYGON, and GEOMETRYCOLLECTION) restrict collection members to those having a particular geometry type.

4.2. Creating spatial values

Creating geometry values using WKT functions
SpatiaLite provides a number of functions that take as input parameters a Well-Known Text representation and, optionally, a spatial reference system identifier (SRID); the SRID parameter is anyway ignored.
They return the corresponding geometry. Those type-specific construction functions for construction of geometry values of each geometry type are also available: If one of those functions fails for every reason [non well formed WKT, unclosed rings, illegal or unsupported geometries and the alike], a NULL value is returned instead of expected geometry.

SpatiaLite does not supports the following functions defined in OpenGis specification:
Creating geometry values using WKB functions
SpatiaLite provides a number of functions that take as input parameters a BLOB containing a Well-Known Binary representation and, optionally, a spatial reference system identifier (SRID); the SRID parameter is anyway ignored.
They return the corresponding geometry. Those type-specific construction functions for construction of geometry values of each geometry type are also available: If one of those functions fails for every reason [non well formed wkb, unclosed rings, illegal or unsupported geometries and the alike aNULL value is returned instead of expected geometry.

SpatiaLite does not supports the following functions defined in OpenGis specification:

4.3. Creating spatial columns

SpatiaLite in order to store any kind of GEOMETRY simply requires a standard SQLite column of BLOB type; since SQLite has a very loose check for column values, this means you can store GEOMETRY values in every column, with a very few restrictions.
This may be as puzzling as confusing; you are strongly suggested to create specific columns in order to store GEOMETRY data, and (possibly) to identify them by using sensible names, such as geom, geometry or geo_data and so on.
You can do this by simply using the standard SQL syntax as intended by SQLite, as in:

CREATE TABLE my_geo_test (....., geo_data BLOB NOT NULL);
or
ALTER TABLE my_geo_test ADD COLUMN geo_data BLOB NOT NULL;

4.4. Populating spatial columns

After you have created spatial columns, you can populate them with spatial data. Values should be stored in internal geometry format, but you can convert them to that format from either Well-Known Text (WKT) or Well-Known Binary (WKB) format. The following examples demonstrate how to insert geometry values into a table by converting WKT or WKB values into internal geometry format: Anyway, populating a large dataset manually translating lots of complex WKT expressions one by one, can be at least very boring; and trying to do it using WKB can be a true nightmare ...
A more realistic alternative to quickly populate your Spatial table without pain can be using the LoadShapefile() SpatiaLite function in order to automatically create a new table and store in it your Spatial data. Shapefiles are de facto standard stuff in any GIS oriented environment, and you should expect to obtain one very easily.

4.5. Fetching spatial data

Geometry values stored in a table can be fetched in internal BLOB format. You can also convert them into WKT or WKB format.

5. Analyzing spatial information

After populating spatial columns with values, you are ready to query and analyze them. SpatiaLite provides a set of functions to perform various operations on spatial data. These functions can be grouped into four major categories according to the type of operation they perform:

5.1. Geometry format conversion functions

SpatiaLite supports the following functions for converting geometry values between internal format and either WKT or WKB format:

5.2. Geometry functions

Each function that belongs to this group takes a geometry value as its argument and returns some quantitative or qualitative property of the geometry.
Some functions restrict their argument type. Such functions return NULL if the argument is of an incorrect geometry type. For example, Area() returns NULL if the object type is neither Polygon nor MultiPolygon.
General Geometry Functions
The functions listed in this section do not restrict their argument and accept a geometry value of any type. SpatiaLite does not supports the following function defined in OpenGis specification: SpatiaLite supports the following non-standard function that is not defined in OpenGis specification:
Point Functions
A Point consists of X and Y coordinates, which may be obtained using the following functions:
LineString Functions
A LineString consists of Point values. You can extract particular points of a LineString, count the number of points that it contains, or obtain its length.
MultiLineString Functions
Polygon Functions
SpatiaLite does not supports the following function defined in OpenGis specification:
MultiPolygon Functions
SpatiaLite does not supports the following function defined in OpenGis specification:
GeometryCollection Functions

5.3. Functions that create new geometries from existing ones

Section: Geometry Functions discusses several functions that construct new geometries from existing ones. See that section for descriptions of these functions:

5.4. Functions for testing spatial relations between geometric objects

5.5. Relations on geometry Minimal Bounding Rectangles (MBRs)

SpatiaLite provides several functions that test relations between minimal bounding rectangles of two geometries geom1 and geom2. The return values 1 and 0 indicate true and false, respectively.
Testing spatial relations by using MBRs is only a very approximative way to test true spatial relations; but the main advantage it offers consist in a big performance improvement. Algorithms to test spatial relations may be heavy and time-consuming; vice-versa testing MRBs' spatial relations is a very simple task. mbrs spatial relations example

6. non-standard SpatiaLite utility functions

SpatiaLite implements the following utility functions, useful to manage an SQLite spatially enable DBMS, but not conformant to OpenGis specification.

6.a Importing and exporting shapefiles


The shapefile format is defined by ESRI and is very widely used in GIS, mainly because it is defined as an open and publicly documented format. You can find a copy of the ESRI specification at http://www.esri.com/library/whitepapers/pdfs/shapefile.pdf. If you are interested in this argument, you can usefuly read this article: http://en.wikipedia.org/wiki/Shapefile
Basically, as a short reference, shapefile consists of three individual files in the same directory:
  • the file identified by .shp suffix contains the geometries for each entity in the dataset.
  • the file identified by .dbf suffix contains the attributes for each entity in the dataset.
  • the file identified by .shx suffix contains correspondences between the preceding ones.
The abstract pathname that identifies a shapefile is the one that does not contains any suffix at all. e.g., a shapefile identified by an abstract path test_shape is actually implemented by test_shape.shp, test_shape.shx and test_shape.dbf.
Always remember that a shapefile is valid (i.e. can be accessed) only if all three components are accessible. As an addition precaution you should be very careful when handling upper- and lower-case paths.
Too often, a shapefile that mysteriously refuses to open under Linux but works on Windows has components path as the following ones:
  • test_path.shp
  • test_path.shx
  • test_path.DBF
or else
  • test_path.shp
  • test_path.shx
  • TEST_PATH.dbf
LoadShapefile() and DumpShapefile() expects that paths identifying some shapefile will be abstract ones, without any trailing suffix, and internally appends suffixes as required; e.g., executing:
  • DumpShapefile('myTable', 'POINTS', 'myTable.shp')
don't be puzzled when you find out that the following ones has been created:
  • myTable.shp.shp
  • myTable.shp.shx
  • myTable.shp.dbf

6.b Simple coordinates translation-rotation

mbrs spatial relations example

6.c Coordinates reprojection [PROJ4-EPSG]

Each coordinate value, to be fully qualified, needs to be explicitly assigned to some SRID, or Spatial Reference Identifier.
This value identifies the geometry's associated Spatial Reference System that describes the coordinate space in which the geometry object is defined.
As a general rule, different geometries can interoperate in a meaningful way only if their coordinates are expressed in the same Coordinate Reference System.
GIS data coming from different sources can easily use different Coordinate Reference Systems, so very often some kind of coordinate reprojection is required in order to transform them in an unique, homogeneous, Coordinate Reference System and thus allowing interoperation and integration.

The PROJ.4 library supports a large set of methods to compute coordinate reprojection between different Coordinate Reference Systems.
The PROJ.4 library are the de facto standard for this kind of task, widely used by the vast majority of GIS software.

The EPSG [European Petroleum Survey Group] on its own maintains and distributes a large dataset of geodetic parameters describing quite any Coordinate Reference System and Coordinate Transformation used worldwide for GIS data.

Putting all together, we can assume that:
  • any GIS data should be explicitly referenced to some Coordinate Reference System; this one can be expressed as an EPSG SRID, i.e. a numeric code uniquely identifying it.
  • if the preceding one condition is satisfied, then you can apply a coordinate transformation to any other Coordinate Reference System identified by an EPSG SRID
SpatiaLite supports SRIDs, EPSG geodetic parameters and PROJ.4 in order to implement coordinate transformation between different Coordinate Reference Systems.
To enable this option you need as a prerequisite that the SQLite database file you intend to work with contains an epsg table
So you have to:
  • download the epsg-sqlite.sql script
  • run it against your SQLite database in order to create and populate the epsg table
To run the SQL script you simply start the sqlite3 client and then type:
sqlite> .read epsg-sqlite.sql
sqlite>

7. SpatiaLite conformance and compatibility

SpatiaLite implements a set of spatial extensions that enable an SQLite database to generate, store, and analyze geographic features, in an OpenGIS conformant fashion.
Only a partial and limited subset of the OpenGIS specification is implemented by SpatiaLite; only those very basic and really needed features of the OpenGIS specification are supported.
The aim of SpatiaLite is just to allow realistic deployment of SQLite databases in simple GIS contexts, not to offer a full, complete and exhaustive implementation for OpenGIS specification.
The Open Geospatial Consortium publishes the OpenGIS® Simple Features Specifications For SQL, that defines how to extend an SQL RDBMS to support spatial data.
This specification is available from the OGC Web site at http://www.opengeospatial.org/standards/sfs