Metadata Education Project

Metadata education suggestions and materials for:

Clearinghouse Concepts



Learning Outcomes

Conviction

Motivation

Skills

Knowledge


Preparatory topics:


Complementary topics:


Vocabulary

Vocabulary definitions


Material for this topic


Overview of the NSDI Clearinghouse

"Geographic Information Systems (GIS), imagery software, and graphics packages have made their way to the desktop of every scientist, sociologist and politician who can afford a personal computer. Likewise, anyone with a connection to the Internet can techno-publish information they create. To alleviate this information overload, it becomes more important than ever to distinguish between usable and dispensable data. In the realm of Geospatial data (information referenced to the earth's surface) a structure of tools and applications have been developed to help assist data creators and users in navigating through the maze of information." (NC-CGIA)

The dilemma for GIS users is posed by the following questions:

The National Spatial Data Infrastructure (NSDI) is a national effort to link data users with data providers. The U.S. Federal Geographic Data Committee (FGDC) has developed resources to help governmental, non-profit, and commercial participants make their collections of spatial information searchable and accessible on the Internet using free reference implementation software. A fundamental goal of the FGDC's NSDI clearinghouse program is to provide access to digital spatial data through metadata. The FGDC approved the Content Standard for Digital Geospatial Metadata (CSDGM) in 1994 to provide a common set of terminology and definitions for documentation. The standard specifies information that helps prospective users determine what data exist, whether these data are fit for their applications, and the conditions for accessing them. Agencies and organizations that produce or distribute data can become "nodes" of the national clearinghouse by linking their metadata to a centralized system of servers and search engines. Users can then search the metadata to locate data appropriate to their purposes and follow links to view or download the data once they have been identified.

In April 1994, President Clinton signed an executive order instructing Federal agencies to use the CSDGM to document their geospatial data and to provide the metadata to the public as part of the National Spatial Data Infrastructure. Coordinated by the FGDC, the NSDI is an umbrella of policies, standards, and procedures under which organizations and technologies interact to foster more efficient use, management, and production of geospatial data.

The major components of the NSDI are: a basic framework of digital geospatial data; thematic data of known quality and critical national importance; standards to facilitate data collection, documentation, access, and transfer; and the means to search, query, find, access, and use geospatial data.

From concept to reality
Although initially targeted at federal agencies, Clearinghouse prototypes have included federal, state, university, and vendor participants in the United States and abroad. The FGDC website provides instructions and tutorials on how to set up a clearinghouse node. As of January 2000, more than 180 nodes are participating, including many international sites.

Data producers who do not want to become an official node should still consider contacting a node within their state to arrange for assistance in creating metadata and making it available through that node. What are the benefits to a data producer in doing this? Simply this: all data producers are also data users. Very few data producers use only the data that they produce; in most cases they also rely on other sources and providers of data. The more data producers that participate in some form of metadata and clearinghouse activity, the more accessible, affordable, and reliable data will become.


How to use the NSDI Clearinghouse

Go to the FGDC's clearinghouse website and click on the link: Search for geospatial data

There are several "gateways" to the clearinghouse: multiple gateways are provided to cut down on Internet traffic and to speed up response time. Each gateway is an exact mirror of the others and contains the same information. Choose any one by clicking on its name on the map.

Select one of the first two search interfaces: United States Placenames or Country Names.

There are three basic ways to perform a search:

You can use all three, or only one.

Full text or field search:

Type "vegetation", "soil", "ownership", "roads" or some other word that describes a theme you are looking for. The default search is "anywhere" in the documentation. (You can limit the search to just the title, abstract, purpose, keyword, or some other field, but we recommend starting with "anywhere" and later on adding refinements if necessary).

Select data servers to search, for instance all the servers within your state. The FGDC recommends that you select no more than 40 at a time. If you want to find out more about a particular site, the FGDC maintains links to all the participating sites.

Press the Search button. A "status" page will then appear showing the status of the servers you have requested; this page reloads automatically until all the requested servers have been contacted. If any results are found on a particular server, the page shows the number of documents and provides a link to view the results.

Results are shown by dataset title, with a link to view the full metadata document.
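
The difference between the default "anywhere" search and a search limited to one field can be illustrated with a small sketch. This is not the clearinghouse's own search software: the records, the field names, and the `search` function are made up for illustration, and a real node indexes complete metadata documents.

```python
# Hypothetical metadata records; the field names mirror the search
# options mentioned above (title, abstract, purpose, keyword).
RECORDS = [
    {
        "title": "County Roads 1998",
        "abstract": "Road centerlines digitized from aerial photographs.",
        "purpose": "Transportation planning.",
        "keyword": "roads, transportation",
    },
    {
        "title": "Vegetation Cover",
        "abstract": "Vegetation classes mapped along forest roads and trails.",
        "purpose": "Habitat assessment.",
        "keyword": "vegetation, land cover",
    },
]


def search(records, term, field="anywhere"):
    """Return the titles of records whose text contains term (case-insensitive)."""
    term = term.lower()
    hits = []
    for record in records:
        if field == "anywhere":
            # Search every field of the record at once.
            text = " ".join(record.values())
        else:
            # Search only the requested field; a missing field matches nothing.
            text = record.get(field, "")
        if term in text.lower():
            hits.append(record["title"])
    return hits


print(search(RECORDS, "roads"))                 # searches all fields
print(search(RECORDS, "roads", field="title"))  # searches titles only
```

Searching "anywhere" returns both records, because the second one mentions roads in its abstract; limiting the search to the title returns only the first. This is why starting broad and refining later is usually the easier path.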

What should you look for in the metadata document? Start at the top, with the dataset's title, publisher, and date (citation information). Then read its abstract and purpose. This should give you enough information to decide whether you want to look any further for more details such as format, pricing, data quality information, and more detailed information on content (entity and attributes). Please see the section How to use metadata for more details.
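
If the metadata document is available as XML, the fields recommended above can be pulled out with a few lines of Python. The sketch below assumes the record uses the CSDGM short element names (`idinfo`, `citeinfo`, `descript`, and so on); the sample record and the `summarize` function are invented for illustration, and real records contain many more elements.

```python
import xml.etree.ElementTree as ET

# Invented sample record using CSDGM short element names (an assumption
# about how the node encodes its metadata).
SAMPLE = """<metadata>
  <idinfo>
    <citation>
      <citeinfo>
        <origin>Example County GIS Office</origin>
        <pubdate>19990615</pubdate>
        <title>Example County Roads</title>
        <pubinfo>
          <pubplace>Example City</pubplace>
          <publish>Example County</publish>
        </pubinfo>
      </citeinfo>
    </citation>
    <descript>
      <abstract>Road centerlines for Example County.</abstract>
      <purpose>Transportation planning.</purpose>
    </descript>
  </idinfo>
</metadata>"""

# Where each "first look" field lives, relative to the <metadata> root.
FIELDS = {
    "title": "idinfo/citation/citeinfo/title",
    "publisher": "idinfo/citation/citeinfo/pubinfo/publish",
    "date": "idinfo/citation/citeinfo/pubdate",
    "abstract": "idinfo/descript/abstract",
    "purpose": "idinfo/descript/purpose",
}


def summarize(xml_text):
    """Return the citation, abstract, and purpose fields as a dict.

    Fields that are missing from the record come back as empty strings.
    """
    root = ET.fromstring(xml_text)
    return {name: (root.findtext(path) or "").strip()
            for name, path in FIELDS.items()}


for name, value in summarize(SAMPLE).items():
    print(f"{name}: {value}")
```

Reading only these five fields first mirrors the advice above: decide from the citation, abstract, and purpose whether the record deserves a closer look at format, quality, and attribute details.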

Area search

Since you are required to select data servers, which are usually organized by area (county, state, or smaller unit) you are already limiting the area of your search. There are some nodes, however, that are not area-specific. Examples include ESRI's ArcData Online Server, CIESIN's Socioeconomic Data and Applications Center, FEMA flood insurance map server, the federal government's Global Environmental Information Locator Service, the K-12 Educator's clearinghouse, and the Inter-American Geospatial Data Network. These sites may contain data from anywhere in the U.S. or sometimes world-wide.

Even if you choose one or more area-specific servers, you can still refine your search in one of two ways: choosing place-names or entering bounding coordinates.

For instance, suppose you are interested in the Rocky Mountain region and have selected one or more of the non-area-specific servers mentioned above. You can either select all or some of the Rocky Mountain states (Colorado, Wyoming, Utah, Idaho, Montana), or you can enter bounding coordinates in decimal degrees such as West -120, North 50, East -100, and South 40, where negative longitudes lie west of the prime meridian and positive latitudes lie north of the equator.
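
A bounding-coordinate search amounts to an overlap test between two rectangles. The sketch below shows one way to express it, assuming decimal degrees with negative values for west longitudes and boxes that do not cross the 180-degree meridian. The `BoundingBox` class and the example boxes are illustrative assumptions, not the clearinghouse's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """A rectangle in decimal degrees (west/south are the smaller values)."""
    west: float
    east: float
    south: float
    north: float

    def intersects(self, other):
        """True if the two boxes share any area, edge, or corner."""
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)


# Rough search box for the Rocky Mountain region (illustrative values).
SEARCH_AREA = BoundingBox(west=-120.0, east=-100.0, south=40.0, north=50.0)

# Approximate extents of two hypothetical datasets.
colorado_roads = BoundingBox(west=-109.05, east=-102.04, south=37.0, north=41.0)
florida_wetlands = BoundingBox(west=-87.6, east=-80.0, south=24.5, north=31.0)

print(SEARCH_AREA.intersects(colorado_roads))    # overlaps the search box
print(SEARCH_AREA.intersects(florida_wetlands))  # lies entirely outside it
```

Note that a dataset only needs to overlap the search box to match, so a generous box returns more records than a tight one.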


How to participate in the NSDI Clearinghouse

As mentioned above, the more data producers that participate in some form of metadata and clearinghouse activity, the more accessible, affordable, and reliable data will become for everyone.

Data producers include more than government organizations or large GIS projects. Most students taking GIS-related course-work are at some point required to produce data for class projects.

Data takes time to produce. Whether you are collecting and processing GPS locations, digitizing a map using a digitizing tablet, digitizing features on-screen using a backdrop of aerial photographs, geocoding a database of addresses, or registering and processing satellite imagery, you are involved in a detailed and, at times, painstaking process. With all the different variables and choices involved in producing data, it is important and worthwhile to document your sources and procedures, at the very least so that no one can claim "garbage in, garbage out" about your data or your methods.

Once you have some form of documentation - any form of documentation! - you have two choices for turning it into formal metadata. First, you can introduce yourself to the formal content of metadata and discover how the documentation you've collected maps onto the formal schema. If you focus on content, and do not let yourself become overwhelmed by the formatting issues that may come along with metadata standards, this does not have to be a painful process and in fact can be a very interesting, informative exercise.

If you don't want to convert your documentation into formal metadata yourself, you can contact one or more of the clearinghouse nodes within your state or region. Most states also have a GIS office or GIS coordinating body (see the National States Geographic Information Council, NSGIC, for contacts for your state). Metadata specialists are often affiliated with these types of GIS offices or organizations (sometimes also state libraries) and may be interested in getting a copy of your metadata.

Finally, there is the option of creating your own official node of the NSDI clearinghouse. This requires a web server, web server software and indexing/search software, and some knowledge of networking (or access to someone with this knowledge). In addition, you'll need to know how to create formal metadata and "parse" it into a format compliant with the indexing/search software. The FGDC provides a step-by-step tutorial on creating your own clearinghouse node. See the topic on Using and implementing metadata for pointers on creating and parsing metadata.


Advanced material

Beyond the Clearinghouse: providing geographic information services, not just geographic information.

Metadata plays an important role in automated communication of information between "agents". Agents help users access distributed data objects and GIS components on heterogeneous GIS platforms across the internet by interpreting, filtering, and converting information automatically. An agent is an autonomous computer program that has specific functions and responds to specific events, based on pre-defined knowledge rules or user designated instructions. Using metadata, an agent can bridge heterogeneous information systems and translate different data types and models for different GIS tasks.
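
The idea of an agent acting on pre-defined rules can be made concrete with a toy example. The sketch below is not the mechanism described in the paper cited below; it only illustrates, under invented assumptions, how metadata fields could drive automatic decisions. The field names (`format`, `projection`), the target values, and the `plan_actions` function are all hypothetical.

```python
# What the (hypothetical) user's GIS can read directly.
TARGET_FORMAT = "shapefile"
TARGET_PROJECTION = "geographic"


def needs_format_change(meta):
    """Condition: the dataset is not in the user's preferred format."""
    return meta.get("format") != TARGET_FORMAT


def needs_reprojection(meta):
    """Condition: the dataset is not in the user's preferred projection."""
    return meta.get("projection") != TARGET_PROJECTION


# Each rule pairs a condition on the metadata with the action to take
# when that condition holds. Actions are plain descriptions here; a real
# agent would invoke conversion services instead.
RULES = [
    (needs_format_change,
     lambda meta: f"convert {meta['title']} from {meta.get('format')} to {TARGET_FORMAT}"),
    (needs_reprojection,
     lambda meta: f"reproject {meta['title']} from {meta.get('projection')} to {TARGET_PROJECTION}"),
]


def plan_actions(meta, rules=RULES):
    """Return the actions whose conditions match this metadata record."""
    return [action(meta) for condition, action in rules if condition(meta)]


record = {"title": "Roads", "format": "coverage", "projection": "geographic"}
for step in plan_actions(record):
    print(step)
```

The point of the sketch is that the agent never inspects the data themselves: everything it needs to decide what to do comes from the metadata record.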

View the full abstract:
Tsou, M.H. and B.P. Buttenfield, 2000. Agent-based mechanisms for distributing geographic information services on the Internet. GIScience 2000: the First International Conference on Geographic Information Science.


References

North Carolina Center for Geographic Information and Analysis (NC-CGIA). Tutorials. http://cgia.cgia.state.nc.us:80/tutorials/index.html