Logistics Management Institute

Recommended XML Namespace
for Government Organizations

GS301L1/March 2003

By Jessica L. Glace and Mark R. Crawford

Introduction

The U.S. government needs to establish a cohesive namespace approach for its various Extensible Markup Language (XML) efforts. Without a coordinated approach, individual governmental organizations will create a proliferation of disparate XML namespaces, and chaos in managing XML components will ensue.[1] Further, harmonizing the names and namespaces used by different governmental organizations will become increasingly difficult over time.

According to the World Wide Web Consortium (W3C), the purpose of XML namespaces is to “provide a simple method for qualifying element and attribute names used in Extensible Markup Language documents by associating them with namespaces identified by URI references.”[2] For an organization to establish an appropriate XML namespace, it must decide on a namespace strategy and a naming convention. The General Services Administration (GSA) asked LMI to recommend a strategy and naming convention that could be used throughout the government. It is our understanding that this strategy is intended to be a central source of guidance that will enable all trading partners of the U.S. Government to develop their XML namespaces using a common strategy and will promote an organized roll-out of governmental organizations’ ever-expanding array of XML components.

This report conveys the results of our work. We begin the report with some background information. We then describe three options available when deciding on a namespace strategy. Next, we discuss the technical options for a namespace naming convention. In the last section, we outline actions necessary to implement the recommended approaches in this paper.

Background

As stated previously, the concept of XML namespaces is defined in the W3C XML namespaces technical specification.[3] XML namespace features are available in the W3C XML Schema (XSD) and in document-type definitions (DTDs).[4] In the following subsections, we discuss namespace declaration and qualification and target namespaces. Because the majority of new XML development work is being done using schemas, and because the Federal XML Developers Guide recommends the use of XSD, we generally focus our discussion on the use of namespaces in XSD.[5]

Namespace Declaration and Qualification

A namespace is declared in the root element of a Schema using a namespace identifier. Schema constructs are associated with a namespace identifier through a user-defined namespace prefix, making the constructs “namespace qualified.” In the following example, the namespace identifier is “urn:us:gov:gsa” and the namespace prefix is “gsa”:

<schema xmlns:gsa="urn:us:gov:gsa">

 

This means that any construct in the Schema with a namespace prefix of “gsa” belongs to the GSA namespace, as in the following example:

<element name="gsa:FederalAcquisitionRegulationIndicator" type="xsd:boolean"/>
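As a minimal sketch of how a processor resolves such a prefix, the following Python snippet (standard library only) parses a small instance fragment. The Order wrapper element and the value "true" are assumptions made for illustration; only the prefix and the namespace identifier come from the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance fragment; the prefix "gsa" is bound to the
# namespace identifier used in the example above.
doc = """<gsa:Order xmlns:gsa="urn:us:gov:gsa">
  <gsa:FederalAcquisitionRegulationIndicator>true</gsa:FederalAcquisitionRegulationIndicator>
</gsa:Order>"""

root = ET.fromstring(doc)
child = root[0]

# ElementTree expands each prefix to its identifier and reports the
# element name as "{identifier}localname".
print(child.tag)  # {urn:us:gov:gsa}FederalAcquisitionRegulationIndicator
```

The prefix itself carries no meaning; a processor compares the namespace identifier to which the prefix is bound.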

 

 

Namespaces allow constructs with the same name but from different markup vocabularies to be used in the same Schema with no adverse effects. In the following example, two “State” elements are used in the same Schema, but they are associated with two different namespaces. One element represents a U.S. state abbreviation (AK, AL, AR) in the EPA’s namespace, while the other represents the state of water quality (acidic, basic, high turbidity) in a specific state’s environmental department namespace:[6]

<?xml version="1.0" encoding="UTF-8"?>

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"

xmlns:epa="urn:us:gov:epa" xmlns:vadeq="urn:us:gov:va:state:environmental">

<xsd:element name="epa:State" type="epa:StatePostalCodeType"/>

<xsd:element name="vadeq:State" type="vadeq:WaterQualityIndicatorType"/>

<!--information removed for example purposes-->

</xsd:schema>

 

If the “State” elements described above were not in separate namespaces, an XML processor would generate an error. This condition is known as “name collision.”
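The distinction can be sketched from the consumer's side with Python's standard library. The Sample wrapper element and the values are illustrative assumptions; the two namespace identifiers are those of the schema example above.

```python
import xml.etree.ElementTree as ET

# Prefix-to-identifier bindings taken from the schema example above.
NS = {"epa": "urn:us:gov:epa", "vadeq": "urn:us:gov:va:state:environmental"}

# Hypothetical instance: the Sample wrapper and the values are
# illustrative assumptions.
doc = """<Sample xmlns:epa="urn:us:gov:epa"
        xmlns:vadeq="urn:us:gov:va:state:environmental">
  <epa:State>AK</epa:State>
  <vadeq:State>acidic</vadeq:State>
</Sample>"""

root = ET.fromstring(doc)

# The same local name, "State", is looked up in two different namespaces.
postal_code = root.find("epa:State", NS).text
water_quality = root.find("vadeq:State", NS).text
print(postal_code, water_quality)  # AK acidic
```

Because each lookup names a namespace, neither "State" element can be mistaken for the other.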

The advantages of namespace declaration and qualification are:

• Namespaces associate schema constructs with a conceptual space

• Namespace qualification of schema constructs identifies the namespace in which the constructs belong

• Namespaces allow constructs with the same name but from different markup vocabularies to be used in the same schema with no adverse effects

The disadvantages of namespace declaration and qualification are:

• Namespace qualification of schema constructs can increase verbosity

In the following example, the instance has no namespace qualification:

 

<AgencyName>GSA</AgencyName>

<AgencyID>9986</AgencyID>

<ContactPartyID>222345897</ContactPartyID>

<OrderedQuantity>100</OrderedQuantity>

<OrderedQuantityAmount>399</OrderedQuantityAmount>

 

In this example, the instance has namespace qualification:

<gsa:AgencyName>GSA</gsa:AgencyName>

<gsa:AgencyID>9986</gsa:AgencyID>

<gsa:ContactPartyID>222345897</gsa:ContactPartyID>

<gsa:OrderedQuantity>100</gsa:OrderedQuantity>

<gsa:OrderedQuantityAmount>399</gsa:OrderedQuantityAmount>

 

In summary, although namespace qualification of schema constructs can increase verbosity as shown in the above example, the ability to easily identify the namespace in which a construct belongs (visually or automatically) is valuable. Use of namespaces will be valuable for the U.S. government because it will allow constructs developed in different areas to be associated with their own unique conceptual space.
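The verbosity cost in the two instance examples can be estimated with simple arithmetic. The sketch below counts only the characters that the prefix adds to the start and end tags; it ignores the one-time namespace declaration, which the instance examples do not show.

```python
# Local names from the instance examples above.
names = ["AgencyName", "AgencyID", "ContactPartyID",
         "OrderedQuantity", "OrderedQuantityAmount"]

prefix = "gsa"

# Each element carries the prefix and a colon in both its start tag and
# its end tag.
extra_per_element = 2 * (len(prefix) + 1)
total_extra = extra_per_element * len(names)
print(total_extra)  # 40
```

Under these assumptions, qualification adds eight characters per element, which grows linearly with the number of elements in an instance.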

Target Namespaces

Declaration of a target namespace in a schema indicates that the schema is acting as a “collector” of constructs declared within it. While a schema may have more than one declared namespace, only one namespace can be designated as the target namespace. It is not required that a target namespace be declared in a schema.

A target namespace is declared using the namespace identifier of the selected namespace. In the following example, the “urn:us:gov:gsa” namespace is declared as the target namespace:

<?xml version="1.0" encoding="UTF-8"?>

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"

xmlns:gsa="urn:us:gov:gsa"

targetNamespace="urn:us:gov:gsa">

 

This means that any element, attribute, or data type declared in the schema belongs to the schema’s target namespace.

The advantage of declaring a target namespace is that it marks the schema as a “collector” of the constructs declared within it. The disadvantage is that other users of such a schema will need to reference the declared target namespace.

In summary, target namespaces are valuable because they allow a set of schema constructs to be collected into a single conceptual space. This allows the constructs to be identified as a single set of constructs.
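The "collector" behavior can be sketched by reading a schema with Python's standard library and listing the names it contributes to its target namespace. The two element declarations are assumptions added for illustration; the schema header follows the target namespace example above.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

# Schema modeled on the target namespace example; the two element
# declarations are illustrative assumptions.
schema_text = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:gsa="urn:us:gov:gsa"
            targetNamespace="urn:us:gov:gsa">
  <xsd:element name="AgencyName" type="xsd:string"/>
  <xsd:element name="AgencyID" type="xsd:string"/>
</xsd:schema>"""

schema = ET.fromstring(schema_text)
target = schema.get("targetNamespace")

# Every globally declared element is collected into the target namespace.
declared = [f"{{{target}}}{el.get('name')}"
            for el in schema.findall(XSD + "element")]
print(declared)
# ['{urn:us:gov:gsa}AgencyName', '{urn:us:gov:gsa}AgencyID']
```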

Architectural Namespace Strategy

An organization can choose to have no target namespace; one target namespace (referred to as a single namespace configuration), which is used for all schemas within the organization; or multiple target namespaces (referred to as a multiple namespace configuration). We outline the advantages and disadvantages of each option in the following subsections. We also provide guidance or a recommendation concerning each option and provide our justification for the guidance or recommendation.

No Namespace

As stated previously, a schema does not need to declare a target namespace. When a namespace is not declared, this is referred to as the “no namespace” option.


No Namespace Option

Advantages and Disadvantages

Advantages:

Having no namespace provides for “clean” instance documents because prefixes aren’t needed to qualify the constructs.

Disadvantages:

Using no namespace hides the origin of the XML constructs in the instance document because there are no prefixes.

Not using a namespace makes it possible for a no-namespace schema (schema A), when used by another schema with a declared namespace (schema B), to appear to be part of schema B’s namespace.

Guidance or Recommendation

Governmental organization schemas SHOULD use namespaces.

Justification

The intent of namespaces is to clarify semantics. Having no namespaces would cause confusion and increase the time for discerning the origin of XML constructs.
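A short sketch, using Python's standard library, shows why the origin of an unqualified construct cannot be recovered from the instance alone. The helper namespace_of is a hypothetical name introduced here; the element and value come from the earlier instance examples.

```python
import xml.etree.ElementTree as ET

def namespace_of(element):
    """Return the element's namespace identifier, or None if it has none."""
    tag = element.tag
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return None

qualified = ET.fromstring(
    '<gsa:AgencyID xmlns:gsa="urn:us:gov:gsa">9986</gsa:AgencyID>')
unqualified = ET.fromstring("<AgencyID>9986</AgencyID>")

print(namespace_of(qualified))    # urn:us:gov:gsa
print(namespace_of(unqualified))  # None
```

The unqualified element gives a receiving application nothing to indicate which organization defined it.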

 

Single Namespace Configuration

Under a single namespace concept, all XML components would reference the same namespace, regardless of Department, Agency, or initiative focus. The implication of having a single namespace for the federal government is that all elements and attributes would need to be unique.


 

One Namespace Option

Advantages and Disadvantages

Advantages:

A single namespace will enforce front-end agreement of elements and attributes.a

A single namespace will not allow duplication of XML construct semantic names.

Disadvantages:

A single namespace increases the risk of name collision.

A single namespace may not be adopted voluntarily because it uses proportionately more resources in the short run.

Guidance or Recommendation

The U.S. government SHOULD NOT use a single-namespace configuration. Individual governmental organizations MAY use a single-namespace configuration.

Justification

One namespace is ideal for ensuring maximum interoperability. However, given the size and requirements of the U.S. government and the volume of data involved, having only one namespace is not practical. An individual governmental organization, however, may choose to have a single namespace to promote the highest level of interoperability, depending on its size and requirements.

a “Front-end” means organizations will need to harmonize data or make various data constructs conform to one another when XML Schemas are created, rather than waiting until there is a need to harmonize data to accommodate back-end needs which often happens after the XML Schemas have been created.
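The front-end agreement that a single namespace demands can be illustrated with a small check for names defined by more than one organization. This is a sketch only: find_collisions is a hypothetical helper, and the vocabularies are invented for illustration, with "State" echoing the earlier EPA and state environmental department example.

```python
from collections import defaultdict

def find_collisions(vocabularies):
    """Map each element name defined by more than one organization to
    the sorted list of organizations that define it."""
    owners = defaultdict(set)
    for organization, names in vocabularies.items():
        for name in names:
            owners[name].add(organization)
    return {name: sorted(orgs)
            for name, orgs in owners.items() if len(orgs) > 1}

# Hypothetical vocabularies that would be merged into one namespace.
vocabularies = {
    "epa": ["State", "FacilityName"],
    "vadeq": ["State", "SampleDate"],
}
print(find_collisions(vocabularies))  # {'State': ['epa', 'vadeq']}
```

Every name reported by such a check would have to be harmonized or renamed before the vocabularies could share a single namespace.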

 

Multiple Namespace Configuration

The multiple namespace configuration allows any number of namespaces. For the federal government, the configuration should consist of one namespace for the federal enterprise, with federal agencies and other governmental organizations having the option to create their own namespace strategies. A governmental organization that opts not to create a namespace may use the federal enterprise namespace. Additionally, the federal enterprise namespace would hold shared XML constructs that other schemas would reference. The moderation of shared XML components would be crucial to the success of such a strategy. Each governmental organization would need to modify existing shared XML components so that they are harmonized across all governmental organizations before placing them into the federal enterprise namespace. Each governmental organization would control the content of its own namespace.

Multiple Namespace Option

Advantages and Disadvantages

Advantages:

A multiple-namespace configuration decreases the risk of name collision.

A multiple-namespace configuration enables an organization’s structure to be represented by the namespaces.

A multiple-namespace configuration can call out a single enterprise namespace to promote interoperability, while organization-specific namespaces promote flexibility for rapid implementation.

One federal enterprise namespace will provide a common namespace for shared elements, attributes, and data types.

Using different namespaces for each government organization will enable rapid development of government organization XML schemas without the need to harmonize across the entire U.S. government.

Using one namespace for each government organization will ensure maximum flexibility in naming XML constructs, with the namespace providing the semantic context.

Using one namespace for each government organization will ensure the XML constructs in U.S. government schemas can be identified easily because of the constructs’ unique namespaces.

Using one namespace for each government organization will give a context to the namespace that could provide presentation opportunities for schemas using Extensible Style Language (XSL) in the future.

Disadvantages:

A multiple-namespace configuration becomes more complex as more namespaces are used.

A multiple-namespace configuration does not enable the highest degree of interoperability.

A multiple-namespace configuration requires the use of the “import” construct for external schema references, increasing the complexity of government schemas.

Using one namespace for each government organization discourages harmonizing within the U.S. government.

Using one namespace for each government organization will be complex.

Guidance or Recommendation

The U.S. government SHOULD use multiple namespaces. Governmental organizations SHOULD adopt the strategy of referencing the shared federal enterprise namespace and defining their own namespace architecture strategies.

Justification

Creating a “shared” namespace for housing commonly used XML constructs mitigates the risk of lowering interoperability. The shared namespace may also be used by organizations not wishing to maintain their own namespace.

Allowing namespaces for individual government organizations increases flexibility and initially reduces development cost and time. The additional complexity of multiple namespaces is minimal compared to the additional flexibility provided by the solution.

In the long run, as documents are passed between more systems, the same element defined in multiple namespaces will require additional mapping and translation and will eventually need to be harmonized. If the naming conventions prescribed by the Draft Federal XML Developer’s Guide are used, creating semantically unique names in different agencies should not be overly burdensome.

Having a namespace for each organization will create a complex XML namespace architecture, but this is necessary to develop a robust network of U.S. governmental schemas.
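The “import” construct mentioned among the disadvantages can be sketched as follows, using Python's standard library to list what an agency schema imports. The identifier "urn:us:gov:shared", the file name shared.xsd, and the Supplier and OrganizationType names are placeholders; the text above does not assign an identifier to the federal enterprise namespace.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical agency schema. "urn:us:gov:shared" and "shared.xsd" are
# placeholders for the federal enterprise namespace and its schema file.
schema_text = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:gsa="urn:us:gov:gsa"
            xmlns:fed="urn:us:gov:shared"
            targetNamespace="urn:us:gov:gsa">
  <xsd:import namespace="urn:us:gov:shared" schemaLocation="shared.xsd"/>
  <xsd:element name="Supplier" type="fed:OrganizationType"/>
</xsd:schema>"""

schema = ET.fromstring(schema_text)

# Each xsd:import names one external namespace the schema depends on.
imports = [(imp.get("namespace"), imp.get("schemaLocation"))
           for imp in schema.findall(XSD + "import")]
print(imports)  # [('urn:us:gov:shared', 'shared.xsd')]
```

Listing imports in this way is one possible means of tracking which agency schemas depend on shared constructs.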

 

Naming Conventions

Technical Options

A namespace requires a uniform resource identifier (URI), either a uniform resource locator (URL) or a uniform resource name (URN).

A URL does not have to be a resolvable World Wide Web address, but information regarding a schema may be put at the same URL as the target namespace identifier. Management of URLs—that is, maintenance of unique identifiers—is institutionalized. Currently, GSA is responsible for the management of the dot-gov domain, the Department of Commerce is responsible for the dot-us domain, and the Department of Defense[7] is responsible for the management of the dot-mil domain.

URL Option

Advantages and Disadvantages

Advantages:

Additional information regarding a schema may be stored at the URL.

The dot-gov URLs have an established registry to ensure uniqueness.

Disadvantages:

A URL may cause confusion in that people expect the URL to be resolvable, when in fact this is not required.

Guidance or Recommendation

Governmental organization schemas SHOULD NOT use URLs for namespaces.

Justification

URLs are designed as locations rather than persistent names. URLs lead to great confusion regarding XML namespaces because people assume an XML namespace in the form of a URL will be resolvable.

 

URN Option

Advantages and Disadvantages

Advantages:

URNs are designed “with the specific goal of providing persistent naming of resources.”a

Disadvantages:

A Request for Comment (RFC) b for registration of a namespace identifier will need to be made to the Internet Assigned Numbers Authority.

Guidance or Recommendation

Governmental organization schemas SHOULD use URNs for namespaces.

Justification

URNs are designed as persistent names, a requirement for a schema namespace. The short-term disadvantage of needing to register the namespace identifier is outweighed by the long-term advantage of a registered persistent name. URL domain names are already managed by the GSA domain registration. URNs can take advantage of this service as discussed later in this document. In addition, although not a requirement, URNs can be registered in a global Namespace Identifier Directory, providing the same opportunity as a URL to be resolvable and to store information pertinent to the schema.c Additionally, the schemaLocation attribute may be used to provide information regarding where the Schema resides, rather than trying to use a namespace to identify a storage location.

a RFC 3406 (October 2002) is available at http://www.ietf.org/rfc/rfc3406.txt.

b RFCs are managed by the Internet Engineering Task Force and are the equivalent of International Organization for Standardization (ISO) standards.

c Mechanisms for URN resolution and use in Internet applications are proposed in RFC 3401 and RFC 3405. See IETF RFC 3406, Uniform Resource Names (URN) Namespace Definition Mechanisms, October 2002.
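The separation between a persistent name and a storage location can be sketched with the schemaLocation attribute mentioned in the justification above. In an instance document the attribute lives in the XML Schema instance namespace and holds pairs of namespace identifier and location hint. The Order element and the example.org address are placeholders.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical instance; the schema address is a placeholder.
doc = """<gsa:Order xmlns:gsa="urn:us:gov:gsa"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:us:gov:gsa https://example.org/schemas/order.xsd"/>"""

root = ET.fromstring(doc)
tokens = root.get(f"{{{XSI}}}schemaLocation").split()

# The attribute value is a whitespace-separated list of
# (namespace identifier, location hint) pairs.
locations = dict(zip(tokens[0::2], tokens[1::2]))
print(locations)
# {'urn:us:gov:gsa': 'https://example.org/schemas/order.xsd'}
```

The URN stays fixed as the name, while the location hint can change if the schema file is moved.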

 

Recommended Naming Convention

The recommended URN solution to XML namespaces requires a naming convention and a source of management. Currently GSA manages the dot-gov top-level domains for URLs.[8] Governmental organizations as defined on http://www.nic.gov/ are qualified to pursue a dot-gov registration. Qualified governmental organizations may register for second- and third-level domains. The URN for XML namespaces should utilize the registered domains of government organizations in the dot-gov domain to create unique URNs.

Our proposed structure follows the structure defined by the IETF Network Working Group in RFC 2141. In that structure, a URN consists of the prefix “urn”, the namespace identifier (NID), and the namespace-specific string (NSS), separated by colons. The following is an example:

[Figure: example URN with the NSS labeled]

The NID is proposed to be “US.” The NSS would conform to the second- and/or third-level domain as registered with the GSA Government Domain Registration and Services.[9] Further delineation of the URN would be at the governmental organization’s discretion. A prudent measure for organizations utilizing the recommendations in this paper is to manage the further delineations of URNs within their organizations to ensure uniqueness and to avoid chaos. However, additional management of namespaces at an organizational level is not required in order to adopt the proposed recommendation in this paper.
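The convention can be sketched in a few lines of Python. The function names domain_to_urn and split_urn are hypothetical, and the sketch assumes its input is the registered domain (for example, gsa.gov) rather than a host name such as www.gsa.gov.

```python
def domain_to_urn(domain):
    """Build a namespace URN from a registered domain, most significant
    label first: "gsa.gov" -> "urn:us:gov:gsa"."""
    labels = domain.lower().strip(".").split(".")
    return ":".join(["urn", "us"] + labels[::-1])

def split_urn(urn):
    """Split a URN into its NID and NSS, following the RFC 2141 layout
    "urn:" NID ":" NSS."""
    scheme, nid, nss = urn.split(":", 2)
    if scheme.lower() != "urn":
        raise ValueError("not a URN: " + urn)
    return nid, nss

print(domain_to_urn("gsa.gov"))     # urn:us:gov:gsa
print(split_urn("urn:us:gov:gsa"))  # ('us', 'gov:gsa')
```

Further delineation within an organization would append additional colon-separated segments to the NSS.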

The following paragraphs give notional examples of how the naming convention would be applied to different government organizations.

• United States Environmental Protection Agency (USEPA). USEPA has registered epa.gov. Following our recommendation, the URN would be urn:us:gov:epa.

• General Services Administration. GSA has registered gsa.gov. Following our recommendation, the URN would be urn:us:gov:gsa.

• The Office of the U.S. Courts. The Office of the U.S. Courts has registered uscourts.gov.[10] Following our recommendation, the URN would be urn:us:gov:uscourts.

• The U.S. District Court for Eastern District of Pennsylvania. The U.S. District Court for Eastern District of Pennsylvania has a URL of http://www.paed.uscourts.gov/. Although the “paed” extension of the URL is not registered with GSA NIC, it is the registered domain owner’s right to further extend the URN. So a further delineation of uscourts.gov enables the Eastern District of Pennsylvania to have its own URN: urn:us:gov:uscourts:paed.

•