[Xastir-dev] Optimizing Shapefile drawing

Tom Russo russo at bogodyn.org
Thu Dec 16 13:58:25 EST 2004


I had an idea today for how to improve the speed of shapefile rendering.
I wanna run it up the flagpole first.

There is a public-domain (not even GPL) implementation of an algorithm called
Rtree --- it's used in Grass for their vector-layer query feature --- 
from http://www.superliminal.com/sources/sources.htm

What you do is create a data structure containing the bounding boxes of
all the shapes you want to query, and then there's a fast search algorithm
for finding all those bounding boxes that a query box overlaps.

Right now, xastir loops over all shapes in a shapefile, checks each bounding
box against the viewport, and if the shape overlaps then it processes 
it (either by running dbfawk over the dbf record for that shape, or by
the hard-coded non-dbfawk processing code).

If, however, we were to build up an Rtree for each shapefile the first time
it's processed, and then do a search for all shapes in the file that 
are overlapped by the current viewport, we could loop over the matching shapes
and access them directly rather than searching them one by one.

The issues I see in the way of just doing it include making sure to throw
away the rtree if the shapefile's timestamp is newer than it was when the
rtree was generated, and storing the rtree structure for each shapefile in a way
that can be looked up quickly by draw_shapefile_map given that all it is told
by the generic call from draw_map about the file is its name.

I was thinking that the latter issue could be handled with an associative
list structure, perhaps a skiplist for which there happens to be a similarly 
available implementation from 
http://resnet.uoregon.edu/~gurney_j/jmpc/skiplist.html
When generating an rtree for the first time, one would store not only
a pointer to the rtree structure and the shapefile's name (the key), but
also the timestamp of the shapefile.  The thing would go sort of like this
in pseudocode (where the associative array is assumed to look like an STL map 
in C++ instead of what it really is in C)

  shape_struct = theShapes[shapefile_name]; // here's the skiplist/map/whatever
   //obviously there's a similer "if it doesn't exist, create it" step
   // above
  if (!(shape_struct->rtree_ptr))
  { 
     shape_struct->rtree_ptr= generate_rtree(shapefile_name);
  } 
  else if (timestamp(shapefile_name) later than shape_struct->timestamp)
  {
     destroy(shape_struct->rtree_ptr)
     shape_struct->rtree_ptr= generate_rtree(shapefile_name);
  } 

  n_hits=RTreeSearch(shape_struct->rtree_ptr, current_viewport_rectangle, callback, 0);

  //this is the actual form of the search call in the RTree implementation
  // callback is called every time a shape in the tree is overlapping,
  // and in our case would fill in an array of shape numbers that match
  // say "int shape_hits[]"

  for (i=0;i<=n_hits; i++) 
   {  
     structure = shape_hits[i];
     // now we do everything exactly the same as we did before, with the
     // code above replacing what used to be a simple loop over all
     // entities in the shapefile
  
Doing this could be a huge speed improvement when using very large shapefiles
(like the ones James Ewen was talking about with street-level maps of entire
Canadian provinces in one file), because there is no linear search through
the shapefile for the (possibly) small fraction of shapes that fall within 
the current view.

Generating the rtree is a linear step through the shapefile that pulls out
bounding boxes for every shape and stores them in a datastructure, so the first
time the shapefile is rendered it would be slower than it is now, but
subsequent passes could be much, much quicker.

Thoughts?

-- 
Tom Russo    KM5VY     SAR502  DM64ux         http://www.swcp.com/~russo/
Tijeras, NM  QRPL#1592 K2#398  SOC#236 AHTB#1 http://www.qsl.net/~km5vy/
 "That which does not kill me is better than that which does."
    --Irving Nietzche, lesser known of the famous Nietzche twins



More information about the Xastir-dev mailing list