Update of "version 4.2.0"
Not logged in

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview

Artifact ID: be5d45e04a3130d9e9d337ce27a1277ad498b088
Page Name:version 4.2.0
Date: 2014-08-25 09:17:37
Original User: sandro
Parent: aaaac53ce19b2f1ec20841c084e13eccb2a639ac (diff)
Content

Back to main SpatiaLite-Tools Wiki page


Changes affecting the spatialite CLI frontend

A new dot macro already supported by the standard sqlite3 front end has been added starting since version 3.8.5:
.shell CMD ARGS......

or

.system CMD ARGS.....
Both .shell and .system are exact synonyms: invoking the one or the other always produces identical effects. By invoking one of these new dot macros you are now free to executes any system command directly from the CLI; and this obviously extends to any SQL script being executed from the standard input.
$ spatialite
......
Enter ".help" for instructions
SQLite version 3.8.5 2014-06-04 14:06:34
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
spatialite> .shell ls -l *.sqlite
-rw-r--r-- 1 sandro Administrators  27779072 Mar  5 18:44 beppe.sqlite
-rw-r--r-- 1 sandro Administrators    427008 Jul  3 21:14 gpkg_test.sqlite
-rw-r--r-- 1 sandro Administrators 276952064 Apr 28 19:00 grafo.sqlite
spatialite>
As you can see in this example, you simply have to specify the command name followed by any required argument immediately after declaring the .shell or .system macro.


new XML processing tools

Starting since version 4.2.0 three new tools have been added, and they are specifically intended to support XML processing via pure SQL:

when and why using the XML tools could be a brilliant idea

Just few general order considerations:
  1. if you simply are an ordinary end-user (i.e. a people mainly interested in consuming already available data coming from external trusted sources) you'll probably judge the whole XML/SQL stuff boring, cumbersome and practically useless.
  2. if you are some kind of power-user (i.e. an highly skilled professional not worrying about performing complex data processing tasks) you'll probably find the XML/SQL tools some way interesting and may be sometimes useful.
  3. anyway the intended scope for the XML/SQL tools is to efficiently support complex (and possibly repetitive) processes related with data validation and acceptance tests. In other worlds: it's an highly specialized tool mainly intended for peoples involved in producing high-quality and robustly self-consistent datasets.
Short conclusion: the XML tools make relatively easy and painless transforming any complex XML document into a DBMS schema. Starting from this point forward you'll be absolutely free to deploy full SQL (and eventually Spatial SQL) firepower in any conceivable way.
There are no practical limits except your own technical skills and creative imagination: you'll be now free to process, edit or validate your data in any possible way. And by wisely using SQL scripting techniques you could eventually nicely automatize many boring repetitive tasks.
If such a scenario sounds even vaguely interesting to you, giving a deeper glance at the new XML tools could be an interesting experience.

Mapping XML Entities into DBMS relational Tables: general order principles

  • any XML document can always be represented as a tree composed of elementary nodes.
  • every node in the tree will have a parent and could eventually have children.
    • with the very remarkable exception of the root node who has no parent at all; this always is the first node found in any XML document.
      Only a single root can exist in a tree.
    • nodes lacking any child are usually known as leaf nodes.
    • all nodes sharing the same parent are usually referenced as siblings.
  • parent-child relationships are the real backbone modeling the whole tree.
  • Special condition: the <C><D><Z> chain has no internal ramification, so it's a very good candidate to be collapsed in a single cumulative node.
  • Please note: the same kind of node can eventually appear in different tree positions under different kind of parents.
    In this arbitrary example this is e.g. the case of <Z> nodes (child of either <D> or <A> nodes).
    The same condition applies to <W> nodes (child of either <B> or <E> nodes).
<W alpha="value1" beta="value2">
    <X gamma="value3">value4</X>
    <Y>value5</Y>
</W>
Accordingly to the XML format specifications:
  • a node could eventually contain a value, as e.g. in
    <X>value4</X>
    <Y>value5</Y>
  • a node could eventually contain one (or more) attributes, and each attribute will be expressed as a name=value pair as e.g. in
    <W alpha="value1" beta="value2">
    <X gamma="value3">
XML Tree

  • the whole XML tree will be mapped into many DBMS tables.
  • each kind of node will directly match a corresponding Table.
  • all parent-child relationships will be represented as Primary / Foreign Key relational constraints joining two Tables.
  • as a direct consequence, when the same kind of node appears on the tree in different positions (under different kind of parents) distinct tables must necessarily be defined, so to faithfully represent the original parenthood.
    This is the case of e.g. "a_b_c_d_z" and "a_z" tables; both them are intended to match an original <Z> XML node, but we absolutely have to keep these two tables well separated so to exactly map parent-child relationships as PK-FK constraints.
    Exactly the same applies to "a_b_w" and "a_e_w" tables; but in this second case this further propagates respectively to "a_b_w_x", "a_b_w_y" and "a_e_w_x", "a_e_w_y" tables.
CREATE TABLE "a_b_c_d_z" (
node_id INTEGER PRIMARY KEY AUTOINCREMENT,
parent_id INTEGER NOT NULL,
node_value TEXT,
CONSTRAINT "fk_a_b_c_d_z" FOREIGN KEY (parent_id) 
  REFERENCES "a_b_c_d" (node_id) ON DELETE CASCADE);
Each DBMS table will always support this basic layout:
  • the node_id column (Primary Key) is the unique identifier for each node.
  • the parent_id column (Foreign Key) directly references the parent node.
  • the node_value column will eventually contain the value associated to this node (if any). Such value will always be of the Text type accordingly to XML baseline requirements.
CREATE TABLE "a_e_w" (
node_id INTEGER PRIMARY KEY AUTOINCREMENT,
parent_id INTEGER NOT NULL,
alpha TEXT,
beta TEXT,
node_value TEXT,
CONSTRAINT "fk_a_e_w" FOREIGN KEY (parent_id) 
  REFERENCES "a_e" (node_id) ON DELETE CASCADE);
  • if an XML node will eventually contain one (or more) attributes, they'll be consequently added to the DBMS Table as Text columns.
DBMS layout

Mapping XML Entities into DBMS relational Tables: fine-grained details

The above criteria allow to faithfully map any possible XML document into a DBMS relational schema. It's a nice and elegant design, it's rather simple and intuitive, and it can easily be extended so to automatically adapt to any possible data layout without imposing any constrained assumption.
Anyway real-world XML documents can easily present very ramified and very deep tree structures, thus causing an impressive proliferation of the required DBMS tables. This is not a big issue by itself: any DBMS is surely capable to store many hundredths tables.

So the first practical problem we have to preliminary resolve is the one to assign appropriate (and possibly intuitive) names to each DBMS tables.

A second option allowing to significantly reduce the depth of the XML tree structured is the one to immediately collapse all GLM geometries:

XML MetaCatalog

Any DB-file created by the spatialite_xml_load tool will contain these two tables intended to facilitate the correct interpretation of the DBMS layout:
The XML MetaCatalog contains many useful informations, and allows to fully reconstruct the original structure of the input XML tree.
It's used e.g. by the spatialite_xml_print tool when exporting again an XML document from the DBMS.


Tutorials and practical examples

  1. how-to process some simple GML dataset.
  2. how-to process some complex GML-Topology dataset.


Back to main SpatiaLite-Tools Wiki page


Credits
Development of SpatiaLite XML-Tools 4.2.0 has been funded by Tuscany Region - Territorial and Environmental Information System
Regione Toscana - Settore Sistema Informativo Territoriale ed Ambientale.