Geog 280: Basic Geographic Techniques

Exercise 5: Available Data for GIS

Many GIS Data Sets are Available

In the early days of GIS, few data were available to most users.  Instead, each user created their own data from paper maps or other sources.  These days, a huge amount of data has been computerized and made available to GIS users.  For some GIS projects, all the data you need may already be in digital format.  This page describes a few of the data sets that are readily available to you -- there are many more we can't go into here.  The types described here are vendor-supplied data sets, TIGER street files, DRG topographic map images, and DEM elevation files.

In general, it's much easier to use data already digitized than to reinvent the wheel on your own.  Finding the data can be a problem, of course, as can ensuring that the data are accurate and align geographically.  For other GIS projects, you need to create new data sets.  We'll cover the data creation process on the next page.

Vendor-Supplied Data Sets

For most GIS software, the company that sells it (the vendor) usually provides a variety of data sets with the software.  Usually these data are small-scale base-map information, such as countries of the world, states and counties of the US, and major roads and rivers.  These are great for creating regional maps.  We've used several of these data sets in previous exercises.  Specialized topics, such as areas of linguistic groups or geologic map units, naturally wouldn't be included.

Software vendors less commonly provide detailed data sets for local areas.  For example, if you wanted to create a GIS for Rohnert Park, the data that come with ArcView wouldn't get you very far.  At best you'd have the outline of Sonoma County, Highway 101, maybe the Russian River and perhaps a few other features.

For local mapping you usually must turn to other sources such as those below, which usually require specialized translation software.  Or you can digitize your own, as described on the next page.  Many vendors, including ESRI, sell data sets based on those below that are already in ArcView or another popular format.  Of course, you pay extra money for those easy-to-use data.

Vendor-supplied data come in particular software formats and in particular coordinate systems.  This means that if you need the data in a different format or coordinate system, you must translate them.  For example, you might obtain data in MapInfo (another GIS package) format and in State Plane coordinates, but you use ArcView and want to use UTM coordinates.  In this example, ArcView does have a utility to import MapInfo data (see the ArcView program group), but ArcView cannot convert data from one projected coordinate system, such as State Plane, to another, such as UTM.  For that you'd have to use Arc/Info or other software.

Local US Streets: TIGER Files from US Census Bureau

The US Census Bureau has created detailed street maps of the entire US.  The data for these maps are found in TIGER files.  TIGER files were originally created to allow mapping of census information, so in addition to streets they include census area boundaries (states, counties, cities, census tracts, etc.) as well as streams and other lines and points.

Let's look at a small sample of TIGER data for the area around Sonoma State.

As you can see, TIGER files have a lot of detail about all kinds of features in the local area.  They aren't perfect, though, since some streets may be missing or incorrect.  Some companies clean up and resell these files for profit.

An important point about TIGER files is that they have their own particular format that ArcView cannot read directly.  How were you able to add this one?  The file was translated to Arc format for you.  Several companies do this kind of translation, or you can convert the files yourself with tools such as those in Arc/Info.  This need for conversion applies to the other files below as well.

Topographic Map Images: DRG Files

Topographic maps are great sources of information about places all over the US.  The US Geological Survey, which publishes these maps in the US, is slowly converting all these maps to computer format.  The ideal format for most of the topo map information is vector, so that you could use the streams, roads, boundaries and other features as you can with TIGER files.  These vector equivalents of topo maps are called Digital Line Graph (DLG) files.  DLG files are easily available for smaller (less detailed) scales of 1:100,000 and 1:2,000,000.  But few of the detailed DLG files for the familiar 1:24,000-scale topo maps are yet available.

Instead, the USGS is quickly completing a project to scan the topo maps.  This produces a raster format file that appears exactly like the original map.  These files are called Digital Raster Graphics (DRG).  Since a DRG is a raster image, individual features such as roads or streams cannot be extracted from it.  But DRGs can be useful background information for GIS projects.  They usually come in a format that ArcView can read, such as one called TIFF.

One drawback of these DRG files is that they are large -- some are 20 megabytes or larger.  This is true in general of scanned maps and other raster data.

Add a portion of a DRG for viewing in ArcView:

Note this DRG segment looks similar to the topo maps in Exercise 3.  The only difference is that the DRG here was scanned by a government agency, whereas the earlier exercise used maps we scanned ourselves.

Coordinate Systems Must Match for Data

You might have noticed when adding the DRG file that the View-Zoom to Themes command made your other data disappear.  This was because the data from the two themes are in different coordinate systems.  The previous data were in latitude/longitude degrees, whereas the DRG scan is in simple units from the scanner.  A major challenge in using data in a GIS is that oftentimes the themes don't match up well because they come in different coordinate systems.
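The mismatch can be illustrated with a minimal Python sketch that compares the bounding boxes (extents) of two themes.  This is not how ArcView works internally; the Extent type, the function name and all coordinate values below are hypothetical, chosen only to show why data in degrees and data in scanner units never land in the same place.

```python
from typing import NamedTuple


class Extent(NamedTuple):
    """Bounding box of a theme, in whatever units the theme is stored in."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def extents_overlap(a: Extent, b: Extent) -> bool:
    """Return True if the two bounding boxes share any area or edge."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax and
            a.ymin <= b.ymax and b.ymin <= a.ymax)


# Hypothetical extents, for illustration only.
tiger_degrees = Extent(-122.75, 38.30, -122.62, 38.38)  # longitude/latitude
drg_pixels = Extent(0.0, 0.0, 1200.0, 1500.0)           # scanner pixel units

# The numbers describe the same area on the ground, but the boxes
# do not overlap because the units and origins differ.
print(extents_overlap(tiger_degrees, drg_pixels))  # False
```

Zooming to the extent of one theme therefore pushes the other theme off the screen.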

Examine this problem a little more closely:

It's sometimes possible in ArcView to adjust the coordinates of raster (image) files so they line up with other data.  A file called a world reference file can be constructed to tell ArcView where to place the image with reference to other data.  But other times more complicated means must be followed to get data to line up.
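As a rough sketch of what a world file does, the Python below assumes the common six-line world file layout (pixel width, two rotation terms, negative pixel height, then the x and y map coordinates of the center of the upper-left pixel).  The sample values and function names are hypothetical, and a real DRG would come with its own numbers.

```python
def parse_world_file(text):
    """Read the six numbers of a world file, one per line."""
    values = tuple(float(token) for token in text.split())
    if len(values) != 6:
        raise ValueError("a world file must contain exactly six numbers")
    return values


def pixel_to_map(params, col, row):
    """Convert an image column/row to map coordinates."""
    a, d, b, e, c, f = params  # order in which the lines appear in the file
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


# Hypothetical world file: 2.5-unit pixels, no rotation, upper-left pixel
# centered at x = 520000, y = 4250000 in the map coordinate system.
sample = "2.5\n0.0\n0.0\n-2.5\n520000.0\n4250000.0\n"
params = parse_world_file(sample)

print(pixel_to_map(params, 0, 0))     # (520000.0, 4250000.0)
print(pixel_to_map(params, 100, 50))  # (520250.0, 4249875.0)
```

The pixel height is negative because image rows count downward from the top, while map y coordinates increase upward.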

Elevation Data: DEM files

The last example of data is quite different in appearance from those above.  Elevation data can be created in several formats.  For example, the contour lines from topo maps can be converted to vector lines.  But a more common approach is to create a raster image of elevation.  Here, each cell in the image represents the elevation in the cell (usually at the center of the cell).  The common name for these files is Digital Elevation Model (DEM).  DEMs also come in their own special format, so you must translate them to your own format to use them in ArcView or other GIS (ArcView does have extension software that can read "raw" DEMs, but this extension is an expensive optional add-on).
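The idea of a raster of elevations can be sketched in a few lines of Python.  The grid values, cell size and corner coordinates below are made up for illustration, and the lookup simply returns the value of the cell that contains the point; real DEM files carry this information in their own header format.

```python
import math

# Hypothetical 3 x 3 elevation grid (meters); row 0 is the northern edge.
dem = [
    [120, 135, 150],
    [60, 75, 90],
    [30, 35, 40],
]
CELL_SIZE = 30.0    # ground distance covered by one cell, in meters
X_MIN = 500000.0    # west edge of the grid (hypothetical)
Y_MAX = 4240000.0   # north edge of the grid (hypothetical)


def elevation_at(x, y):
    """Return the elevation stored in the cell that contains point (x, y)."""
    col = math.floor((x - X_MIN) / CELL_SIZE)
    row = math.floor((Y_MAX - y) / CELL_SIZE)
    if not (0 <= row < len(dem) and 0 <= col < len(dem[0])):
        raise ValueError("point lies outside the DEM")
    return dem[row][col]


print(elevation_at(500045.0, 4239985.0))  # 135
```

Every point inside a cell gets the same elevation, which is why the cell size matters so much for how much detail a DEM can show.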

Let's look at a sample of these DEM files:

This DEM shows elevations in the Cotati area (it matches the Cotati 7.5-minute topographic quad).  Here the elevation is portrayed as light-to-dark shades, with higher elevations lighter and low points in black.  The exception is that the very highest elevations, on Sonoma Mountain at the upper right, are too high for the palette and are also shown in black.

But the other data have disappeared again!  Again the problem is with the coordinate system.  The DEM is placed in the UTM system, whereas the TIGER data are in latitude/longitude and the DRG was in pixel units.  Fortunately, ArcView is able to reproject data that are in latitude/longitude (decimal degrees, technically).  So you can get the TIGER data to overlay on the DEM.  Follow these steps:

Now the DEM data may make more sense.  For example, you should be able to see where Highway 101 travels south of Cotati over the Meacham Grade, which is in lighter colors toward the bottom of the DEM.
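For background on the UTM system the DEM uses: UTM divides the globe into zones 6 degrees of longitude wide, numbered eastward from 180 degrees west.  The short Python sketch below finds the zone for a given longitude.  It ignores the special zone exceptions near Norway and Svalbard, and the longitude used for the Cotati area is only approximate.

```python
import math


def utm_zone(longitude_deg):
    """Return the standard UTM zone number (1-60) for a longitude in degrees."""
    if not -180.0 <= longitude_deg < 180.0:
        raise ValueError("longitude must be in the range [-180, 180)")
    return int(math.floor((longitude_deg + 180.0) / 6.0)) + 1


def central_meridian(zone):
    """Return the longitude of the central meridian of a UTM zone."""
    return -183.0 + 6.0 * zone


cotati_longitude = -122.7  # approximate value, assumed for illustration
zone = utm_zone(cotati_longitude)
print(zone, central_meridian(zone))  # 10 -123.0
```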

The US Geological Survey is completing DEMs for all the 7.5-minute maps, in addition to having files available at the equivalent of the 1:250,000 maps.  The 1:24,000 files have an elevation every 30 meters (approx. 100 feet), whereas the 1:250,000 files have an elevation about every 90 meters (300 feet).
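The arithmetic behind those spacings can be checked with a short Python sketch.  It converts the spacings to feet and estimates how many elevation values cover one square kilometer, assuming a simple square grid; the function names are illustrative only.

```python
METERS_PER_FOOT = 0.3048


def meters_to_feet(meters):
    """Convert a distance in meters to feet."""
    return meters / METERS_PER_FOOT


def samples_per_square_km(spacing_m):
    """Number of elevation values in one square kilometer of a square grid."""
    return (1000.0 / spacing_m) ** 2


for spacing in (30.0, 90.0):
    print(f"{spacing:.0f} m = {meters_to_feet(spacing):.1f} ft, "
          f"about {samples_per_square_km(spacing):.0f} elevations per square km")
# 30 m = 98.4 ft, about 1111 elevations per square km
# 90 m = 295.3 ft, about 123 elevations per square km
```

Tripling the spacing cuts the number of elevation values by a factor of nine, so the 1:250,000 files are much smaller but also much coarser.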

Searching for Other Data Sources

We've only looked at a few major sources of data for GIS.  The federal government has others, as do state and local governments as well as private companies.  How can you find these sources?  Fortunately most of them are listed on Internet sites nowadays.  If you have time, you can look at a couple of the sites listed below.  Or, later on you can come back and use the links to help find data that you may need.  The indexes are good places to start if you don't know what's available or don't know how to connect to a site you've heard about.

Tip: If you click with the right mouse button on one of these links, you can choose an option to "Open in New Window," which keeps this window but creates another Netscape window for the new link.  That way you can close that window when you're done with that site and simply continue the exercise in this window.

Questions on this Page



Bryan Baker, Sonoma State University, bryan.baker@sonoma.edu
Updated 17 February 1999