[Xastir-dev] Spatial database support

Fri Jul 6 19:26:42 EDT 2007

I have been working on code to allow xastir to store and retrieve data
from relational databases, including MySQL and PostGIS with support for
spatial objects, indicies, and queries.  There are a number of things
this could support, including persistent storage of station data,
persistent storage and spatial querying of CAD objects, logging of
station data into APRSWorld tables, using xastir with other applications
(such as QGIS for GIS, or APRSWorld for webserving), and so on.  My main
interest here is in interaction with GIS systems, but data persistence
and running multiple clients over a single database (APRSWorld
webserving plus multiple xastir clients) also seem of potential use.

I have a chunk of code that doesn't yet actually do anything, but I
think is starting to lay out a framework on which we can build some
flexible external database and GIS functionality.  

Before I add the code to the cvs repository, I wanted to describe the
direction I'm working in and see if it makes sense to folks.  

------------------------

Here's some of the documentation for the code: 

Functions supporting connections to databases, including GIS enabled
databases that hold OpenGIS objects and can apply spatial indicies.

XASTIR GIS database code is separated into three layers

 1) Supporting XASTIR logic (ui elements, cad integration, 
    map drawing, etc).
 2a) Generic db storage/retrieval code - wrappers for layer 3
 2b) Connection management code
 3a) DBMS specific db storage/retrieval code for spatial databases
 3b) DBMS specific db storage/retrieval code for non-spatial databases

Data structures in an underlying database can be considered as a fourth 
level.

 Code for layers 2 and 3 is in this file.

 Layer 2 functions should be extern and called from elsewhere to
   perform spatial database operations.  Xastir shouldn't need to care
   if an underlying database has spatial support or not for simple data.
   Some functionality might require spatial object support and might be 
   included only if a spatial database is available.  Thus 3b code may
only   support a subset of the 2a functions, while 3a code should
support all   2a functions.
   Layer 2a wrappers should take and return values in xastir
coordinates,   and convert them to decimal degrees to pass on to layer
3.  Likewise   return values from layer 3 to layer 2a should be in
decimal degrees,    limiting the number of different places at which the
xastir/decimal    degree conversion code needs to be invoked.  This
would not be true if    data are fed directly from decimal degree feeds
into the database, so    there may also be a need for layer 2 functions
that deal only with    latitude and longitude in decimal degrees.

 Layer 3 functions should not be extern and should only be called 
   by layer 2 functions from within this file.  
   Layer 3 functions should take and return values in decimal degrees.
   Xastir objects should be passed down into layer 3, as doing
   so should make code easier to maintain (but harder to extend) than
using   generic structures for transport of data between layers 2 and 3.
    Passing a station struct from layer 2 to 3 makes layer 2 a very
simple   wrapper, but requires new layer 3 code to write station data to
a map   layer rather than to a DataRow (to, for example, prepare a layer
of    temperature data at points for analysis and generation of a
temperature   grid.)  [Using generic structures for transport would let
the layer 3     code remain unchanged while layer 2 functions are added
or extended, but   requires added maintainance to synchronise xastir
structs, the generic   structs, and database structures.]

 A spatially enabled database is expected to support OpenGIS spatial 
 objects and be able to apply spatial indicies to the data.  A 
 non-spatially enabled database is expected to hold coordinates using
 separate fields for latitude and longitude.  Layer 3 functions that
 interact with a spatial database will need to convert decimal degrees
 to well known text (WKT, and perhaps also well known binary, WKB) 
 representations.  Layer 3 functions that interact with non-spatially 
 enabled databases can just pass raw latitudes and longitudes.

 All spatial data are expected to be in WGS84 projection EPSG:4326

 Support for five sets of underlying database schema elements is
envisioned  - a very simple station at point and time table
  - schema elements to support CAD objects with arbitrary associated
data    tables.
  - a schema capable of holding the full range of aprs data using
spatial    elements (Points, Polygons, etc).
  - full support for APRSWorld tables (using latitude and longitude
fields    rather than spatial elements).  
  - arbitrary tables with schema discovery for arbitrary GIS databases 
    such as Tiger data.  
  The first three of these will require schema version awareness and
will  produce compatability/database lifecycle issues.  

 Descriptions of how to make connections to databases are stored in 
 connection descriptors.  Connection descriptors describe the DBMS,
whether the database has/lacks spatial support, the schema type (simple,
 simple+cad, xastir full, APRSWorld, etc for the database, and
connection  parameters (server, user, database).  The layer 2/3
separation is intended to allow functions (layer 2) to be called from
within xastir (layer 1)  without the need to test which function to call
for which dbms.  Some  functions may be schema specific, others may be
able to use any of  several different schemas.  Connections can be
opened from a database  descriptor, and more than one descriptor can
point to the same database.  (Thus a single MySQL database may contain
simple xastir tables, xastir  CAD object tables, and APRSWorld tables,
but two different descriptors  would be used to define connections to
talk to the APRSWorld tables and  the simple+cad tables within what
MySQL considers one schema.  A given  version of xastir will expect a
particular version or range of versions  for database schemas - an older
version of xastir may expect fields that  no longer exist in a database
created for a newer version of xastir and  vice versa.  

 Data selected from a spatial database might be brought into xastir as 
  stations just like an internet feed or findu fetch trail query, as 
  editable CAD objects, or as map layers.  

-------------------

And here's one example of how I'm seeing function calls working: 
With xastir running, user turns on logging of packets to external
database (probably implemented through the interface UI), as each
station heard is added to the xastir db, a call is made to a wrapper
function (layer 2a):

storeStationToGisDb(connection, station)

This wrapper determines that the connection is to a MySQL database and
calls 
storeStationToGisDbMysql(connection, station)

This function determines that the connection is to tables with the
APRSWorld structures, generates a set of appropriate insert and update
sql queries and uses the MySQL C client library to run the queries
against the database.

On stopping and restarting xastir, the user could then load all stations
from this MySQL database into the internal xastir db - invoking:

retrieveAllStationsFromGisDb(connection)

This wrapper determines that the connection is to a MySQL database and
calls

retrieveAllStationsFromGisDbMySQL(connection)

This function determines that the connection is to tables with the
APRSWorld structures, generates appropriate select statements, runs
them, and steps through the returned results to populate the internal
xastir db.  

-------------------------

Comments and suggestions are very welcome.

73,
-Paul
--------------
Paul J. Morris
Biodiversity Informatics Manager
Harvard University Herbaria/Museum of Comparative Zoölogy
mole at morris.net  AA3SD  PGP public key available