[Xastir] ESRI shapefiles and dbfawk

Tom Russo russo at bogodyn.org
Wed Dec 15 02:46:26 EST 2004


On Wed, Dec 15, 2004 at 12:06:21AM -0700, we recorded a bogon-computron collision of the <jewen at shaw.ca> flavor, containing:
> > > How do you find out what values are in the CFCC field. Can that be 
> > > determined from the shp file with one of these tools, or do 
> > I have to 
> > > find a descriptor file for the particular shp file I am 
> > trying to parse?
> > 
> > This is part of the "metadata" that the provider of the 
> > shapefiles should have.  There is no way to determine this 
> > from the shapefiles themselves.
> 
> Okay, I think I know the metadata format... fgdc-std-001-1998, which appears
> to be a standard metadata format. I have not been able to find anything that
> lists the field names and possible entries.

Hmmmm.  Well, that's unfortunate.  If you can get your hands on the document
that describes that standard it'd really help you out, especially if you
have to do a lot of work on these files.

> However, I used dbfinfo to find out the field names. By viewing the
> shapefile with ArcExplorer, I was able to find values used for NATRDCLASS
> which would allow me to select different types of roads. 

Excellent.

> So, now this still leaves me in a quandary. The dbfawk file basically has a
> list of patterns to match, and instructions on how to display data that
> matches the pattern.

Yes.

> Which file is the pattern matching run against? Should I not be able to grep
> the file, looking for the data?

The pattern is matched against fields in the DBF file, which is binary so
you can't grep for it.  Xastir creates a string against which to run the
pattern matching on-the-fly from the dbf records.

> I can not figure out how to determine what pattern I need to match, when I
> have no idea what the file data looks like.

If your DBF file has a field called NATRDCLASS (and it sounds like it does),
and that contains the data you want to select on, be sure to include that field
in the "dbffields" variable of your dbfawk file, along with any others
that you might need (let's just say there's also one called "ROADNAME"):

  dbffields="NATRDCLASS:ROADNAME"

Now, as xastir is looping over each shape in the shapefile, it pulls out
the fields NATRDCLASS and ROADNAME for that shape's record in the associated 
dbf file.  xastir creates one string per field, like "NATRDCLASS=<whatever 
value>" and "ROADNAME=<whichever value>", and runs the dbfawk rules on each one.
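
To make that loop concrete, here is a rough Python sketch of the process as
described above.  It is not Xastir's code: the sample records, the RULES
table and the apply_rules helper are all made up for illustration, and
Python's "re" module stands in for whatever regex engine dbfawk really uses.

```python
import re

# Hypothetical stand-in for dbffields="NATRDCLASS:ROADNAME"
DBFFIELDS = "NATRDCLASS:ROADNAME".split(":")

# Hypothetical rules: (pattern, attributes to set when the pattern matches).
# A value of None means "fill this attribute from the first capture group".
RULES = [
    (re.compile(r"^NATRDCLASS=FREEWAY"), {"lanes": 4, "color": 4}),
    (re.compile(r"^ROADNAME=(.*)$"), {"name": None}),
]

def apply_rules(record):
    """Build one "FIELD=value" string per listed field and test the rules on it."""
    attrs = {}
    for field in DBFFIELDS:
        line = f"{field}={record.get(field, '')}"
        for pattern, settings in RULES:
            m = pattern.search(line)
            if not m:
                continue
            for key, value in settings.items():
                attrs[key] = m.group(1) if value is None else value
            break  # stop at the first matching rule, as a rule ending in "next" would
    return attrs

print(apply_rules({"NATRDCLASS": "FREEWAY", "ROADNAME": "HWY 1"}))
print(apply_rules({"NATRDCLASS": "LOCAL", "ROADNAME": "ELM ST"}))
```

The point to take away is that the rules only ever see strings of the form
FIELD=value, so a pattern has to be written against that form and against
the value exactly as it is stored in the dbf file.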

> I have guessed at a few possible patterns, none of which work.
> /^NATRDCLASS=FREEWAY/ { lanes=4;color=4; next}

So if you've got dbffields including NATRDCLASS, and "FREEWAY" is in fact
a valid value for that field in the dbf file, this rule will execute whenever
a record is found that has that value.

What you want to do is a dbfdump of the dbf file --- that'll show you the
actual contents of every record in the file, unlike dbfinfo which just tells
you what fields there are.

You'll want to redirect the output to a text file, fer sher, fer sher.  It'll 
be big, as it'll be a straight ASCII dump of the entire contents of the dbf
file.

Once you have that, you can deduce the valid values for the fields
you want to select on. (having the metadata for the data would eliminate
the guesswork, of course)
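
If wading through a huge dump by eye sounds painful, a short script can
tally the distinct values of one field instead.  The sketch below is only
an assumption-laden illustration: it reads the basic dBASE III-style DBF
layout directly (32-byte header, 32-byte field descriptors ending in a 0x0D
byte, fixed-width records), the function name is made up, and "roads.dbf"
in the usage comment is a hypothetical file name.

```python
import struct
from collections import Counter

def dbf_field_values(data, field_name):
    """Count the distinct values of one field in raw DBF bytes."""
    # Header: record count at bytes 4-7, header length at 8-9, record length at 10-11
    num_records, header_len, record_len = struct.unpack_from("<IHH", data, 4)

    # Field descriptors are 32 bytes each, start at byte 32, and end at a 0x0D byte
    fields = []   # (name, offset within record, width)
    offset = 1    # byte 0 of every record is the deletion flag
    pos = 32
    while data[pos] != 0x0D:
        name = data[pos:pos + 11].split(b"\x00")[0].decode("ascii")
        width = data[pos + 16]
        fields.append((name, offset, width))
        offset += width
        pos += 32

    start, width = next((o, w) for n, o, w in fields if n == field_name)
    counts = Counter()
    for i in range(num_records):
        rec = data[header_len + i * record_len:header_len + (i + 1) * record_len]
        if rec[:1] == b"*":
            continue  # record is flagged as deleted
        counts[rec[start:start + width].decode("latin-1").strip()] += 1
    return counts

# Usage (hypothetical file name):
#   with open("roads.dbf", "rb") as f:
#       print(dbf_field_values(f.read(), "NATRDCLASS"))
```

Note that this strips the padding from each value, so it won't tell you
about leading spaces; dbfdump is still the thing to trust for the raw
record.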

> I am guessing that NATRDCLASS should be okay, since dbfinfo found it. 

Yes, it sure sounds like it.

> I
> found the value FREEWAY using ArcExplorer. However, I don't know if the
> string NATRDCLASS=FREEWAY would ever be found in whatever file would have
> the data. Perhaps there are spaces in there... Perhaps the value has quotes
> around it as ArcExplorer shows.

The surest way to tell what's in the file is with dbfdump.  I'm not sure
what kind of formatting operations ArcExplorer might perform on the data
before it shows it to you.

Regardless, you won't find "NATRDCLASS=FREEWAY" -- at best you'd find
"FREEWAY" in the NATRDCLASS column.

dbfdump will show you the raw record, so you can be a little more sure that
what it shows is what's really there.

If there are spaces, parens, or other cruft in the value of NATRDCLASS
you can just wildcard things to ignore it, e.g.
  /^NATRDCLASS=.*FREEWAY/
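
Here is a quick way to see what the wildcard buys you, using Python's "re"
module as a stand-in for dbfawk's matcher (an assumption; the two engines
may differ in details).  The sample strings are made-up guesses at how the
value might be stored, not anything taken from a real file.

```python
import re

strict = re.compile(r"^NATRDCLASS=FREEWAY")
loose = re.compile(r"^NATRDCLASS=.*FREEWAY")

# Made-up examples of how a value might actually be stored
samples = [
    "NATRDCLASS=FREEWAY",
    "NATRDCLASS= FREEWAY",
    'NATRDCLASS="FREEWAY"',
    "NATRDCLASS=(FREEWAY)",
    "NATRDCLASS=NON-FREEWAY",  # the wildcard also lets this one through
    "NATRDCLASS=1",            # a numeric code: neither pattern can match
]

for s in samples:
    print(f"{s!r:28} strict={bool(strict.search(s))} loose={bool(loose.search(s))}")
```

The wildcard forgives stray characters in front of the value, but it also
matches any longer value that merely contains FREEWAY, and it does nothing
for you if the stored value is a code rather than the name.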

But you have to be sure that FREEWAY is really the value that's in the 
DBF file --- it could be that ArcExplorer has done some interpretation for you.
For example, when I viewed some SDTS data with GlobalMapper, it displayed
the *name* of a feature type rather than the actual contents of the field 
that defined the feature type (which is numeric) --- it had a lookup table
of its own and was decoding things for me.  If I exported that data to 
shapefile and used those names in a dbfawk file, it'd never have matched.

> This sure sounds like just the thing I'm looking for, but when it takes 30
> to 40 man hours to figure out how to set up the dbfawk file, it really gets
> frustrating.

I feel your pain.  I had a ton of trouble figuring out how to use it myself,
which is why I ended up making that tutorial in the first place.  Once you
get a process down it makes sense and goes quickly, but it takes some doing
to get there.  

Having the metadata or user's guide for the data you are trying to work with 
makes life a lot simpler.  

-- 
Tom Russo    KM5VY     SAR502  DM64ux         http://www.swcp.com/~russo/
Tijeras, NM  QRPL#1592 K2#398  SOC#236 AHTB#1 http://www.qsl.net/~km5vy/
 "That which does not kill me is better than that which does."
    --Irving Nietzche, lesser known of the famous Nietzche twins


