Tiger Technical Page
from Robin on 20th April 2007:
Please see the
page for the latest USA mapping.
This page is simply some "history" and "technical detail" for those interested.
This page was set up in 2005 to document the old TIGER.EXE, written in 1999, to process the raw Tiger data into the USA street level mapping on www.gpss.co.uk/usa.
I must thank Martin Brilliant who has worked closely with me in recent weeks, resulting in release of much better USA street level mapping, the details of which you will find on the USA Download page. Our work has included development of a new TIGER32 offline program, used to create the data, freely available for use with GPSS. The new GPSS Baseline supports the new, more compact, VEC file format, and the display of more detailed street maps including railroads, water and landmarks such as churches, schools, airports and rail stations.
You should visit the Tiger page for earlier "change history", copyright and legal aspects.
Those reading these pages should have already done steps 1, 2 and 3 on the download page. There is some documentation of GPSS .VEC files on the vector page.
This "Tiger Technical" page was set up to help Robin and Martin get to grips with long forgotten and poorly documented software and methods, so that improvements might be made to the free USA street mapping available for GPSS. It is possible that this page may also be of use to businesses, academics or "enthusiasts" who wish to convert vector mapping for other countries for use with GPSS.
Robin Lovelock, Sunninghill UK, December 2005.
Work stopped on the 16-bit VB2 version of TIGER.EXE since it was easier to use VB5 for the 32-bit TIGER32.EXE and not have restrictions on array sizes - already important for processing street names, and crucial for RT2/shape processing.
Marty made changes to TIGER32 which resulted in us having all the RT2/shape data available. See the comparison in these two pictures. The map on the right is much closer to reality.
Unfortunately there was a price to pay - a massive increase in the size of the mapping needed to cover the whole USA. The 1192 highly compressed EXE files of USA mapping on gpss.co.uk/usa occupy only 267 MB and, when expanded to run with GPSS, will fit on a 600 MB CD. However, the extra shape data means that the 1192 EXE downloads need 1,325 MB, and when expanded would need a DVD rather than a CD.
We are now using a "shape code" for each line, which seems to give a reasonable compromise of low storage and yet more realistic shape. The picture on the right illustrates shape codes for a northward drawn vector. Changes to GPSS and TIGER32 to pack and unpack the shape code are now being tested.
We are also testing logic to take some, but not all, of the shape points.
See at the very bottom of this page under "watching paint dry".
On 1st December runs started with TIGER32 using a modified VEC format providing greater compaction of data. Details are here. Also, the option was provided to filter shape points so that only a proportion are used. Results from the overnight run on 1st December are as follows:
Marty has added much better filtering options for shape points in TIGER32 v2o 7th Dec. See picture at the very bottom. This gives much better shape with little increase in storage expected.
The GPSS Baseline has been raised to v5.96 to exploit data on /usanew. It will also handle the old data on /usa.
v2s is the latest version of TIGER32 used by Robin and Marty. This includes a [Landmarks] button for generation of L.VEC files, and a [USANAMES] button for creation of USANAMES.TXT. Robin is working on v2t, with the aim of it giving identical results on his different PCs.
Pre-release versions of GPSS.EXE v5.97 are on the oldnew page. These include a bug fix for "rogue lines" seen at Goat Island - 430500n0790385w. The 32-bit version of GPSS.EXE, now on the oldnew page, seems to be three times faster in reading the USA vector mapping.
The latest mapping is on www.gpss.co.uk/usanew.
This is in the new VEC format, and so you will need to be using v5.96 or later of GPSS.
The USA street mapping uploaded onto this site in 1999/2000 consists of 1192 EXE files totaling 267 MB.
The new 1192 files total approximately 389 MB. The biggest file is 40N074W.EXE at 2,740,589 bytes.
All VEC files are now capable of fitting onto a CD again.
Details of adding the mapping to GPSS are on the USA Download page.
In late 1999 Robin received the raw 3.9GB of Tiger data on 7 CDs and wrote TIGER.EXE in VB2 to run on an old P1 PC dedicated to the task.
TIGER.EXE took about 10 days of continuous processing to complete the conversion of raw Tiger data into GPSS .VEC files that have been used with GPSS unchanged for over 5 years. This process included the shelling to PKUNZIP to unzip each raw set of files, and when, in 2005, TIGER.EXE was run under XP it was found that the shelling to DOS applications made the program unreliable. In short: it would not work.
In recent days changes have been made to TIGER.EXE and facilities like compression, now standard in Windows XP, have made the process simpler and faster - typically an overnight run of a few hours instead of 10 days.
The steps involved are now:
At the time of writing, Robin is resurrecting the old TIGER.EXE process, not used in over 5 years. The first goal is to get the new processing running under XP and producing exactly the same results as five years ago, i.e. exactly the same 1192 EXE downloads that have been on www.gpss.co.uk/usa for over five years, based on the 1999 Tiger source data which Robin still has on the CDs sent to him by Ted.
However, we will then want to re-run the process with the latest Tiger source data downloaded - or more probably purchased on CD - see links on tiger page.
Some might think it prudent to test some of the new data - to see if it is still of the same format as five years ago :-)
After visiting the US Government Tiger site, Robin used his MS Explorer browser to download all the 2004 Tiger-Line files for California. On broadband this took almost an hour - just in case the reader is thinking of downloading all of it ;-)
When testing the old TIGER.EXE in 1999 it seems the Golden Gate bridge near San Francisco was chosen as a place for some simple spot checks. This corresponds to TGR06075.ZIP and the street level file TGR06075.RT1.
The old 1999 file TGR06075.RT1 starts:
10700 192276334 J H74 0006060750759279092790 67000670000604 0604 999 999 -123126114+37821664-123126114+37798725
10700 1922763351J H75 10 06 075 92790 67000 0604 999 -123173715+37776113-123153863+37807065
10700 192294047 K F74 0006060750759279092790 6700067000017902017902999B999D-122367481+37830127-122370001+37805918
10700 192276337 J H74 0006060750759279092790 67000670000604 0604 999 999 -123173715+37776113-123126114+37760282
10700 1922763381J H75 10 06 075 92790 67000 0604 999 -123153863+37807065-123126114+37821664
10700 192293008 A Thornburg Rd A41 1006060750759279092790 67000670000601 0601 103B103B-122451095+37801873-122449982+37801428
10700 192276340 J H74
and the new 2004 file starts:
11204 192295942 N Wyman Ave A41 1801 1899 1800 189800009412994129 06060750759279092790 670006700006010006010010531053-122473312+37787310-122472848+37787557
11204 1922763351J H75 06 075 92790 67000 060400 2999-123173715+37776113-123153863+37807065
11204 192295943 O 14th Ave A41 06060750759279092790 670006700006010006010010531053-122473321+37787193-122473312+37787310
11204 192295945 O Wedemeyer A41 06060750759279092790 670006700006010006010010531053-122473312+37787310-122474581+37787842
11204 1922763381J H75 06 075 92790 67000 060400 2999-123153863+37807065-123126114+37821664
- looks to be the same format, even if for a different area.
Let's see what TIGER.EXE does with it... seems to process and plot OK.
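As a rough illustration of the comparison above, the sketch below pulls the start and end coordinates from the tail of an RT1 record. It assumes that the last 38 characters hold from-longitude, from-latitude, to-longitude and to-latitude as signed integers with six implied decimal places, which is consistent with the samples shown. The name parse_rt1_endpoints is made up for this example and is not part of TIGER.EXE.

```python
def parse_rt1_endpoints(record):
    """Return ((from_lon, from_lat), (to_lon, to_lat)) in decimal degrees.

    Assumption: the record ends with four signed integer fields of
    widths 10, 9, 10 and 9 characters, in millionths of a degree.
    """
    tail = record.rstrip()[-38:]
    values = []
    pos = 0
    for width in (10, 9, 10, 9):
        values.append(int(tail[pos:pos + width]) / 1_000_000)
        pos += width
    from_lon, from_lat, to_lon, to_lat = values
    return (from_lon, from_lat), (to_lon, to_lat)


# One record from each sample above (1999 and 2004).
old_1999 = ("10700 192293008 A Thornburg Rd A41 1006060750759279092790 "
            "67000670000601 0601 103B103B-122451095+37801873-122449982+37801428")
new_2004 = ("11204 192295943 O 14th Ave A41 06060750759279092790 "
            "670006700006010006010010531053-122473321+37787193-122473312+37787310")

for rec in (old_1999, new_2004):
    print(parse_rt1_endpoints(rec))
```

Reading the fields from the end of the line sidesteps the differences earlier in the record between the 1999 and 2004 layouts, which is why both samples parse with the same code.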
There is some documentation of GPSS .VEC files on the vector page. If you unzip the download www.gpss.co.uk/usa/37N122W.EXE you will see that 18 files hold all the vector data within the 1 degree x 1 degree area:
327,843 37N122W$.VEC - text of names such as street names.
15,384 37N122WD.VEC - drainage vectors. DCW DR drainage water
33,000 37N122WH.VEC - highway vectors.
8,964 37N122WM.VEC - major highway vectors. DCW RD Major road
8,940 37N122WP.VEC - population centre outline vectors. DCW PO population
33,552 37N122WR.VEC - road vectors.
35,928 37N122WS.VE0 - street vectors - first 0.1 latitude.
26,364 37N122WS.VE1 - street vectors - next 0.1 latitude.
41,724 37N122WS.VE2 - street vectors - etc
133,272 37N122WS.VE3 - street vectors
144,288 37N122WS.VE4 - street vectors
184,788 37N122WS.VE5 - street vectors
164,616 37N122WS.VE6 - street vectors
314,496 37N122WS.VE7 - street vectors
221,220 37N122WS.VE8 - street vectors
280,992 37N122WS.VE9 - street vectors - last 0.1 latitude.
23,496 37N122WT.VEC - positions of name text.
87,576 37N122WW.VEC - water vectors.
Most of these files are created from the Tiger data, but three are from much less detailed vector data from Digital Chart of the World (DCW). Note that the last character of the name (e.g. S in 37N122WS.VE0) gives the type of data.
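The naming convention can be sketched in a few lines of Python. This is only an illustration: parse_vec_name and the field names it returns are invented here, the type descriptions are copied from the 37N122W listing above, and the degrees are taken at face value with no claim about which corner of the 1 degree square they denote.

```python
import re

# Type letters as described in the 37N122W file listing.
VEC_TYPES = {
    "$": "text of names",
    "D": "drainage vectors",
    "H": "highway vectors",
    "M": "major highway vectors",
    "P": "population centre outline vectors",
    "R": "road vectors",
    "S": "street vectors",
    "T": "positions of name text",
    "W": "water vectors",
}

# Pattern ??N???W?.VE? : latitude, longitude, type letter, extension.
NAME_RE = re.compile(r"^(\d{2})([NS])(\d{3})([EW])(.)\.V(EC|E\d)$")


def parse_vec_name(filename):
    """Split a name such as 37N122WS.VE0 into its parts, or return None."""
    m = NAME_RE.match(filename.upper())
    if m is None:
        return None
    lat, ns, lon, ew, kind, ext = m.groups()
    return {
        "lat_deg": int(lat) if ns == "N" else -int(lat),
        "lon_deg": int(lon) if ew == "E" else -int(lon),
        "type": kind,
        "description": VEC_TYPES.get(kind, "unknown"),
        # Street files end in .VE0 ... .VE9; everything else ends in .VEC.
        "band": int(ext[1]) if ext[1].isdigit() else None,
    }


print(parse_vec_name("37N122WS.VE0"))
print(parse_vec_name("37N122W$.VEC"))
```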
Back in 1999/2000 it seemed a good idea to allow the less detailed DCW data to be processed into the same VEC file format for use in countries other than USA. Some of this data created by TIGER.EXE is included within the EXE downloads.
The categories of DCW data, and their mapping to VEC file categories are:
DCW   VEC  Category
"PO"  "P"  DCW population
"PP"  "C"  DCW coast water
"DR"  "D"  DCW drainage water
"RD"  "M"  DCW Major road
"H"   "W"  water
"B"   "T"  rail tracks
"A4"  "S"  streets
"A3"  "R"  major roads
"A2"  "R"  major roads
"A1"  "H"  major highways
TIGER.EXE can be run without DCW data present, using only Tiger data. If DCW data is to be included, it is generated by Robin's DUMPDCW program, which creates compact binary files from the DCW CDs. Typical files from DUMPDCW put into the \tigerzip folder (was \tiger where output TMP and VEC files placed) are DCW.VDN, DCW.VPO, DCW.VPP, DCW.VRD:
11/10/2005 21:11 24,774,738 DCW.VDN
11/10/2005 20:32    473,452 DCW.VPO
11/10/2005 20:31  2,974,328 DCW.VPP
11/10/2005 20:36  4,661,756 DCW.VRD
The above DCW files cover all the USA, Canada and Mexico. A smaller sample is available for testing with TIGER.EXE (provided to Martin and others helping in this project). The sample can be downloaded from www.gpss.co.uk/dcw1.exe, is only 1,113,909 bytes, and self-extracts into DCW1.VDN, DCW1.VPO, DCW1.VPP and DCW1.VRD. These need to be renamed from DCW1 to DCW to be recognized by TIGER.EXE.
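The category table above can be read as a simple lookup. The sketch below is not taken from TIGER.EXE: it assumes that the lower rows of the table ("H", "B", "A1" to "A4") are matched as prefixes of the three-character feature class seen in the RT1 samples (A41, H74 and so on), with the longest prefix tried first. The function name vec_type_for_cfcc is hypothetical.

```python
# DCW category -> VEC type letter, copied from the table.
DCW_TO_VEC = {"PO": "P", "PP": "C", "DR": "D", "RD": "M"}

# Tiger feature class prefix -> VEC type letter, copied from the table.
TIGER_TO_VEC = {"A1": "H", "A2": "R", "A3": "R", "A4": "S", "B": "T", "H": "W"}


def vec_type_for_cfcc(cfcc):
    """Return the VEC type letter for a Tiger feature class, or None.

    Assumption: table entries are prefixes of the feature class code.
    """
    for prefix in sorted(TIGER_TO_VEC, key=len, reverse=True):
        if cfcc.startswith(prefix):
            return TIGER_TO_VEC[prefix]
    return None  # class not listed in the table


for code in ("A41", "H74", "F74"):
    print(code, vec_type_for_cfcc(code))
```

Under this reading, a street such as Thornburg Rd (A41) would go to the S files, an H74 line to the W files, and an F74 line would be left out because the table has no row for it.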
Each CD holds Tiger ZIP files in folders corresponding to USA states. e.g. AL, GA, NY, WY, etc.
A folder was created with name c:\tigerzip and Windows XP "properties" set up for this to be maintained in a compressed state.
xcopy /s d:*.zip
- was used for each CD to copy all the ZIP files. Unfortunately, XCOPY also copies the folders.
A list of ZIP files was made in COPYZIP.TXT using:
dir *.zip /s >copyzip.txt
- to produce a COPYZIP.TXT starting:
AK
al
AR
A little program COPYZIP.BAS was written:
PRINT "COPYZIP.EXE makes COPYZIP.BAT from COPYZIP.TXT"
OPEN "copyzip.txt" FOR INPUT AS #1
OPEN "copyzip.bat" FOR OUTPUT AS #2
DO WHILE NOT EOF(1)
  LINE INPUT #1, t$
  PRINT #2, "COPY \TIGERZIP\"; t$; "\*.ZIP"
LOOP
CLOSE #1
CLOSE #2
PRINT "all done :-)"
- which produced a file COPYZIP.BAT starting:
COPY \TIGERZIP\AK\*.ZIP
COPY \TIGERZIP\al\*.ZIP
COPY \TIGERZIP\AR\*.ZIP
COPYZIP.BAT was put in the \tigerzip folder and run to copy the ZIP files.
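For readers without a BASIC compiler to hand, the same batch file can be produced with a few lines of Python. This is a hedged equivalent of COPYZIP.BAS rather than anything used in the project; the function name make_copyzip_lines and the root parameter are invented for the example.

```python
def make_copyzip_lines(folders, root="\\TIGERZIP"):
    """Build one COPY command per state folder, as COPYZIP.BAS does."""
    lines = []
    for name in folders:
        name = name.strip()
        if name:  # skip blank lines in the folder list
            lines.append(f"COPY {root}\\{name}\\*.ZIP")
    return lines


# The first three folder names from COPYZIP.TXT.
for line in make_copyzip_lines(["AK", "al", "AR"]):
    print(line)
```

To write the batch file itself, the returned lines could be joined with newlines and saved as COPYZIP.BAT.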
This took about an hour:
TIGER.EXE is run and [makeVEC] clicked to start the whole process, which will take several hours if the input folder (c:\tigerzip) holds TGR files covering all the USA (TGR*.* were about 30 GB). Any DCW data will increase this time by a few tens of minutes (the 4 DCW files were about 30 MB).
Within a few seconds progress can be seen as a plot on a USA-wide map.
The TIGER.EXE run should result in the required .VEC files within the \tiger folder.
The *.TMP and ???????S.VEC files are not needed and should be deleted or moved, e.g. MOVE *.TMP TMP - where TMP is a subfolder. Note that the S.VEC files are split into 10 smaller files S.VE0 ... S.VE9 late in the TIGER.EXE process.
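One plausible reading of the S.VE0 ... S.VE9 split is that the digit is the first decimal place of the latitude within the 1 degree square. The sketch below shows that reading only; it has not been checked against TIGER.EXE, the function names are hypothetical, and it assumes the bands count upward from the whole degree for northern latitudes. Latitudes are kept as integer millionths of a degree, as in the Tiger records, to avoid floating point rounding at band edges.

```python
def street_band(lat_microdeg):
    """Return 0..9: which tenth of a degree the latitude falls in.

    Assumption: band 0 starts at the whole degree and bands count upward.
    """
    return (lat_microdeg % 1_000_000) // 100_000


def street_file_name(tile, lat_microdeg):
    """Name of the street file for a tile such as 37N122W (hypothetical)."""
    return f"{tile}S.VE{street_band(lat_microdeg)}"


# Latitude +37787310 comes from the 2004 sample record for 14th Ave.
print(street_file_name("37N122W", 37787310))
```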
The final result will be in the region of 26,423 files, with name ??N???W?.VE? occupying about 468MB. These are in the form directly used by GPSS.
Compression into the EXE uploaded onto www.gpss.co.uk/usa is explained below.
A list of EXE downloads was made using:
dir *.exe >vec2exe.txt
- in the \wwwgpss\usa folder holding the old contents of www.gpss.co.uk/usa
At this stage the aim is to re-create exactly the same data as that made in 1999/2000.
The file VEC2EXE.TXT starts:
13N144W.EXE
14N145W.EXE
14N169W.EXE
and the program VEC2EXE.BAS was:
PRINT "VEC2EXE.EXE makes VEC2EXE.BAT from VEC2EXE.TXT"
OPEN "vec2exe.txt" FOR INPUT AS #1
OPEN "vec2exe.bat" FOR OUTPUT AS #2
DO WHILE NOT EOF(1)
  LINE INPUT #1, t$                                  '37N122W.EXE
  t$ = MID$(t$, 1, 7)                                '37N122W
  tt$ = "\LHA A " + t$ + " " + t$ + "?.V??"          '\LHA A 37N122W 37N122W?.V??
  PRINT #2, tt$
  tt$ = "\LHA S " + t$                               '\LHA S 37N122W
  PRINT #2, tt$
LOOP
CLOSE #1
CLOSE #2
PRINT "all done :-)"
- which creates VEC2EXE.BAT starting:
\LHA A 13N144W 13N144W?.V??
\LHA S 13N144W
\LHA A 14N145W 14N145W?.V??
\LHA S 14N145W
\LHA A 14N169W 14N169W?.V??
\LHA S 14N169W
VEC2EXE.BAT was run in the \tiger folder, to create the EXE for www.gpss.co.uk/usa
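The logic of VEC2EXE.BAS is small enough to restate in Python for anyone wanting to reproduce the batch file today. This is an equivalent sketch, not the program actually used; make_vec2exe_lines is a made-up name, and the two LHA commands are copied from the BASIC source without any claim about other LHA options.

```python
def make_vec2exe_lines(exe_names):
    """Build the two LHA commands per tile that VEC2EXE.BAS writes."""
    lines = []
    for name in exe_names:
        tile = name.strip()[:7]  # 37N122W.EXE -> 37N122W
        if not tile:
            continue
        lines.append(f"\\LHA A {tile} {tile}?.V??")  # add the tile's VEC files
        lines.append(f"\\LHA S {tile}")              # second command from the BASIC
    return lines


# The first three names from VEC2EXE.TXT.
for line in make_vec2exe_lines(["13N144W.EXE", "14N145W.EXE", "14N169W.EXE"]):
    print(line)
```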
A GPSS folder is created, with a copy of GPSS extracted from the baseline files GPSSA.EXE and GPSSB.EXE from the download page. The latest pre-release GPSS.EXE can be taken from the oldnew page. All the VEC files can be moved from \tiger into this GPSS folder. The mapping can then be browsed using tips from the tips page.
Most paints dry quicker than TIGER.EXE takes to run - but I've found keeping an eye on the laptop screen educational: it shows where the USA states are. This is because the source Tiger data is organized into folders by state and, by an accident of fate, TIGER plots the Tiger data by state, in alphabetical order :-)
The picture below is a recent one, showing "shape" curves, and the "arrow" debug info.
- and this one is an even more recent run taking some of the shape points instead of shape code.