HumanML & Government Related IT Directives (Part One)
By Rex Brooks and Russell Ruggiero
Contents
Introduction
Overview
ROI (Government vs. Private)
The Current Landscape
The HumanML Effort
HumanML and Governmental ROI
“REPURPOSING”
The Functional Divisions of Human Markup Language
HumanML and DHS Consolidation
HumanML and Security by Preference in DHS
Simplicity is Key
The Core Components
The Framework
Changing Core Dynamics
Information Focus
Benefits and Examples
Postscript
Terms
Copyright
Introduction
Emerging technologies are expected to play a vital role in Government-related
domestic and foreign IT directives by providing quantifiable
benefits such as quicker response times and reduced costs.
(Historical
Note: The original purpose behind the creation of the Internet may be directly
attributed to a Cold War landscape that required governments and related
agencies to maintain a dependable level of communication between vulnerable
central facilities to prevent the decimation of military-industrial information
resources and command and control centers. As a result, the ARPANET was created
in the United States by the agency now known as the Defense Advanced
Research Projects Agency (DARPA). This network was adopted by the Western
academic research community, and Hypertext was later developed for it by the
multi-national scientists at the European Organization for Nuclear Research
(CERN) facility to avoid duplication of experiments and to maximize the
utilization of resources regarding the one and only large particle accelerator
for Western Europe. The salient point is that
necessity, in the form of vulnerability of communications networks and an early
scarcity of expensive research resources, brought this now ubiquitous phenomenon
into being where none had existed before. However, the preexistence of the
electronic media of communications such as the telegraph, telephone, radio and
television had established an existing infrastructure that was simply ready to
be exploited in new ways. The Archie, Gopher and Veronica search tools of the
early 1990s were followed by Mosaic, which was created to add a
graphical display to Hypertext. Hence, the birth of the World Wide Web, which
runs on top of the Internet.)
Overview
The events of September 11, 2001 led to the creation of the Department of Homeland Security (DHS), which is meant to safeguard citizens by establishing coordinated nationwide emergency readiness and response, and to intensify security through improved communication between intelligence and investigation agencies as well as improved screening at borders and ports of entry. Furthermore, it has been given the task of consolidating the work of twenty-two federal agencies. The Department of Homeland Security comprises four main divisions: 1) Border and Transportation Security; 2) Emergency Preparedness and Response; 3) Chemical, Biological, Radiological and Nuclear Countermeasures; and 4) Information Analysis and Infrastructure Protection. This report focuses on how emerging technologies may be leveraged in this important and far-reaching project.
ROI (Government vs. Private)
Since March of 2000, return on investment (ROI) has been a very popular and frequently overused term. The so-called “dot-com bubble bust” has put additional emphasis on the bottom line, which should be viewed as a positive development for all parties concerned. As with most blanket assumptions, the portrayal of the overhyped “New” Economy as baseless is an oversimplification. Nevertheless, we can readily agree that a renewed emphasis on ROI is quite relevant to both the government and private sectors. It is important to note that ROI means different things in each sector. A case in point: key ROI factors such as consolidation and the automation of manual processes are applicable to both sectors, but an entity such as the Department of Homeland Security must also add the element of public safety into the equation. For example, in the private sector a misused data-mining tool may lead to an ineffective sales campaign. In the case of the Department of Homeland Security, however, the same misuse could lead to unneeded infrastructure spending, waste of manpower, loss of property, or even loss of life. Accordingly, a specific ROI model should be created. It should incorporate critical elements such as identifying key vulnerabilities within national power distribution networks, information distribution networks, food and water supply lines, transportation networks, and the changing daily loci of population centers or gathering points, since terrorist acts against any of these can directly affect the cost-effectiveness of taxpayers’ investment in protecting and defending the common welfare.
The Current Landscape
The effort to consolidate twenty-two separate United States Government agencies will prove to be a monumental task for the Department of Homeland Security, and will involve a myriad of both established and emerging technologies. There are a number of strategic emerging technologies under development that can positively advance this work, but the very dynamic conditions of incubation during difficult economic times tend to weed out poorly supported and/or poorly explained efforts in the marketplace. Such efforts offer some promise, but they do not promise immediate improvements in bottom-line analyses within the private sector. Within the government sector it is almost impossible to write topics for grant proposals aimed at developing such emerging technologies, even within the military, which has a tradition of such endeavors. The Defense Advanced Research Projects Agency (DARPA) is one example of this ongoing tradition. One simply cannot know where chance and circumstance will bring together the elements necessary to germinate new and innovative combinations of concepts and spark the development of emerging technologies.
The HumanML Effort
There are numerous areas in which new technologies may be cultivated and utilized. Two prime examples are open source and open standards. This is where the Organization for the Advancement of Structured Information Standards (OASIS) and the genesis of Human Markup Language (HumanML) come into perspective. HumanML is a new specification being developed at OASIS, a not-for-profit consortium that advances electronic business by promoting open, collaborative development of interoperability specifications. While the effort to develop HumanML began independently, in an ad hoc Yahoo Groups mailing list, it migrated to OASIS through a process that OASIS (primarily an industry-based organization of member companies whose sponsoring dues pay for the organization’s operations) allows. In basic terms, HumanML has been designed to represent human characteristics (e.g., cultural, physical, psychological, etc.) through XML, and is focused on enhancing the fidelity of human communication. While the goal of HumanML may seem somewhat ambitious, the vision is very well timed for the dramatic changes currently taking place, such as the explosive Internet growth of the non-Western world. During the next fifteen years, accurate information exchange between people of unlike cultures and origins is expected to be an ever-increasing concern with regard to Internet communication. Accordingly, there is a clear need for emerging technologies that will improve global communication while advancing the Internet to a higher level of interconnectedness.
HumanML and Governmental ROI
Improving the manner in which America explains itself, and the benefits of democratic institutions and an open, market-based society, to the developing world is a key area where HumanML shows great promise relating to governmental ROI. In terms of easing the stresses of consolidation within DHS, there are various ways in which HumanML can offer aid. Perhaps the most important is the consolidation of personal information among employees. In the HR-XML Consortium (which, it should be noted, is separate from OASIS though aligned with its goals), Human Resources-related information is given a standard XML-based vocabulary. Some of that vocabulary is incorporated in HumanML, much as other vocabularies such as Anatomy and Medicine are also slated to be included under the wide-ranging umbrella of the Human Markup Language. HumanML is envisioned to be as inclusive of existing vocabularies as possible, with the aim of harmonizing these vocabularies and making them more interoperable, so that duplication of information can be avoided more effectively.
“REPURPOSING”
In terms of DHS and the consolidation of Executive Branch Agencies, the single concept that most clearly represents how HumanML can aid this work is “REPURPOSING.” What this means is that the functions listed in job descriptions within individual employee files, a copy of which can be entered into a common database, can be assembled to evaluate how to make the maximum use of individual skill sets. This is, of course, oversimplified for the purpose of giving an example. In point of fact, a specific vocabulary for doing this would need to be assembled. The most important concept here is that HumanML offers a defined framework within which this information can be assembled. Because this framework is quite apart from and independent of the bureaucratic channels of existing agencies, it can help produce an overall picture of capabilities and provide a way in which skill sets and people can be matched up more easily. This is not to say that this framework should be instituted immediately, or even soon; in fact, it should not be approached that rapidly. What needs to be seen is that a framework exists that can be built upon further, with specific tools developed for these tasks.
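The matching of skill sets to needs described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the record shapes (Employee, Role), the skill labels, the sample people and agencies, and the simple coverage score are hypothetical and are not drawn from HumanML, HR-XML, or any DHS system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:
    """Hypothetical record assembled from an employee file in a common database."""
    name: str
    agency: str
    skills: frozenset


@dataclass(frozen=True)
class Role:
    """Hypothetical description of a position and the skills it calls for."""
    title: str
    required: frozenset


def rank_candidates(role, employees):
    """Return (employee, coverage) pairs, best match first.

    Coverage is the fraction of the role's required skills an employee holds.
    Employees with no matching skill are left out.
    """
    if not role.required:
        return []
    scored = []
    for emp in employees:
        matched = role.required & emp.skills
        if matched:
            scored.append((emp, len(matched) / len(role.required)))
    # Highest coverage first; ties are broken alphabetically by name.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0].name))


# Illustrative data only.
EMPLOYEES = [
    Employee("A. Rivera", "Agency A", frozenset({"arabic", "port-inspection"})),
    Employee("B. Chen", "Agency B", frozenset({"gis-mapping", "logistics"})),
    Employee("C. Okafor", "Agency C",
             frozenset({"arabic", "gis-mapping", "port-inspection"})),
]
ROLE = Role("Port screening analyst",
            frozenset({"arabic", "port-inspection", "gis-mapping"}))

if __name__ == "__main__":
    for emp, coverage in rank_candidates(ROLE, EMPLOYEES):
        print(f"{emp.name} ({emp.agency}): {coverage:.0%} of required skills")
```

The sketch deliberately ignores agency boundaries when ranking, which mirrors the point that the framework sits apart from existing bureaucratic channels. A real vocabulary would need far richer descriptions than flat skill labels.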
The Functional Divisions of Human Markup Language
The process of building the Human Markup Language has been
broken down into functional divisions. A Primary Base set of terms, largely for
categories of information, has been assembled. From this Base, a set of
extensions called Secondary Base Languages is being assembled. Each of the
Secondary Base Languages is intended to represent an application area. That is
to say that from the Primary Base, a Secondary Base Language such as The Human
Physical Characteristics Description Markup Language (HPCDML) will be derived. In
this case the HPCDML will include the description of human anatomy,
incorporating the existing vocabularies and datatypes for anatomy and medicine;
the description of human ancestral specimens,
incorporating the existing languages and datatypes for human and pre-human
hominid archeology; and humanoid animation for the World Wide Web, incorporating
the H-Anim Specification of X3D/VRML.
HumanML and DHS Consolidation
By definition, superset languages such as HR-XML (Human Resources-XML), with its many modules, and a HumanML-enhanced and -enabled language based on DHS needs and requirements may be combined to develop a hybrid language or vocabulary. The result would be a useful pool of information based on skill sets, cultural and educational backgrounds, and proven competencies, which can then be applied to the consolidation effort. The most important concept that HumanML brings to this mix is a capacity to better define human criteria for human requirements within the informational requirements of DHS.
HumanML and Security by Preference in DHS
It should be noted
that it is and has been the goal of the HumanML effort to put control of an
individual’s personal information in their own hands to the extent possible,
and to create such languages as “Human PreferencesML,” which can be used to
improve basic security and to virtually eliminate spam, or transform it into a
valued resource, while optimizing the delivery of information actually of
interest to an individual about their preferred activities. Such a language or
offshoot could allow for a secondary, in-depth authentication of identity, and
serve to increase security by allowing individuals to divulge information only
they could know or authorize (presumably) based on their personal preferences,
and thus streamline both transportation screening and border screening. Wide
implementation of such a language would also change spam for the
better for both the buyers and sellers of the Internet marketplace. It would
elevate competition for individual users’ attention from a battle of flashy
claims to a battle of well-presented and easily understood information in
which recipients have shown an interest and which they have specifically allowed or requested to be sent to them. This is by no means
the extent of probable secondary languages and application areas that could be
created within HumanML. Nor is it the extent of applications within HumanML
that could be of service in DHS. It is only the beginning.
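As a rough illustration of the opt-in idea behind a preferences language, the following sketch delivers a message only when the recipient has allowed its topic or its sender. The Preferences and Message shapes, the field names, and the sample addresses are hypothetical. No “Human PreferencesML” vocabulary is defined in this report, so nothing here should be read as its actual design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preferences:
    """Hypothetical preference record controlled by the individual."""
    owner: str
    allowed_topics: frozenset
    allowed_senders: frozenset = frozenset()


@dataclass(frozen=True)
class Message:
    sender: str
    topic: str
    body: str


def is_welcome(prefs, msg):
    """True only if the recipient explicitly allowed the topic or the sender."""
    return msg.topic in prefs.allowed_topics or msg.sender in prefs.allowed_senders


def deliver(prefs, inbox):
    """Keep the messages the recipient has opted in to; drop everything else."""
    return [m for m in inbox if is_welcome(prefs, m)]


# Illustrative data only.
PREFS = Preferences(
    owner="pat@example.org",
    allowed_topics=frozenset({"sailing"}),
    allowed_senders=frozenset({"library@example.org"}),
)
INBOX = [
    Message("shop@example.com", "sailing", "New charts in stock"),
    Message("promo@example.com", "loans", "Act now"),
    Message("library@example.org", "notices", "Your hold is ready"),
]

if __name__ == "__main__":
    for m in deliver(PREFS, INBOX):
        print(f"{m.sender}: {m.body}")
```

The default here is to reject, so senders compete only for attention the recipient has already offered. That is the shift from “flashy claims” to requested information that the text describes.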
Simplicity is Key
It may seem
incongruous, but the key that will make HumanML one of the most important tools for
transformation is its inherent simplicity. First, the vocabularies or modules
of the Human Markup Language can only function properly to clarify
communication if these modules are constructed from the simplest and clearest
base components. We must keep its terms
within the framework of a single meaning as much as possible, even by carefully
excluding unwanted alternate definitions and requiring that the definition used
be the one we specify in our HumanML documents in our HumanML namespace. This is
the fundamental aspect of XML vocabularies, or “Markup Languages” that enables
extensibility. You can take a single, simple term such as “personality” and
define it within a specific context, such as <huml:personality> and be
assured that when used this way ONLY the definition of “personality” defined
within the namespace specified as “huml” will be applied. Hence, this emerging
technology provides a method that unlocks a new world of clear meanings and
understandings across an extensive range of Government IT-related applications. With this world unlocked, a construction such as
<huml:personality:MBTI:ESFP> would specifically refer to a HumanML-defined
personality type using the Myers-Briggs Type Indicator, in this case the
Extraverted, Sensing, Feeling and Perceiving (ESFP) type. While the fact that a
simpler set of definitions can lead
to a much more accurate as well as much more complex definition for a human
characteristic or overall collection of characteristics may seem odd, if not
counter-intuitive, it is nonetheless quite true, and it therefore offers a more
utilitarian method for improving human communication across the broad spectrum
of digital information systems. Applied to Governmental IT-related applications
and their usages of HumanML, these improvements, many of which can’t be
anticipated or predicted, will rather quickly demonstrate improved
communications and cost reductions, as confusion and the need to
correct misunderstandings in the delivery of services are reduced, yielding
tangible and measurable results.
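The effect of namespace qualification can be sketched with Python's standard xml.etree.ElementTree module. Two points are assumptions: the namespace URI urn:example:humanml is a placeholder and not the committee's actual URI, and the model and type attributes are an invented way of carrying the MBTI value. The second choice matters because the Namespaces in XML recommendation allows only one colon in a qualified name, so a colon-chained name such as the one quoted above is best read as shorthand, not as a literal element name.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI, assumed for illustration only.
HUML_NS = "urn:example:humanml"

# Three elements share the local name "personality", but only one of them
# belongs to the (assumed) HumanML namespace.
DOC = f"""
<profile xmlns:huml="{HUML_NS}" xmlns:other="urn:example:other">
  <huml:personality model="MBTI" type="ESFP"/>
  <other:personality>cheerful</other:personality>
  <personality>unqualified</personality>
</profile>
""".strip()


def huml_personalities(xml_text):
    """Return (model, type) for every personality element in the HumanML namespace.

    ElementTree expands a prefixed name to "{namespace-uri}local-name", so
    elements named "personality" in other namespaces are never matched.
    """
    root = ET.fromstring(xml_text)
    qualified = f"{{{HUML_NS}}}personality"
    return [(el.get("model"), el.get("type")) for el in root.iter(qualified)]


if __name__ == "__main__":
    print(huml_personalities(DOC))
```

Only the namespace-qualified element is returned. The other two elements are ignored even though they use the same word, which is the “single meaning” guarantee the text attributes to the huml namespace.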
The Core Components
HumanML is an always-evolving standard XML vocabulary
designed to represent human characteristics through XML, while at the same time
enhancing the fidelity of human communication. Following are the three core
components that provide the foundation for this emerging technology:
HumanML. Is an abbreviation for the entire
set of Human Markup Language specifications.
Huml. Relates to the name of the “root” XML Schema Element of the Human
Markup Language Primary Base XML Schema Specification (and, therefore of the
entire set of languages).
Huml. Is the namespace prefix that identifies the HumanML
namespace, and also the abbreviation for the OASIS HumanMarkup Technical
Committee.
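The difference between a namespace prefix and the namespace it identifies can be shown in a short sketch. The URI urn:example:humanml and the lowercase root element name huml are assumptions made for illustration, and the actual values are defined by the Primary Base schema. The point is general to XML: a prefix such as huml is only a local label, and a parser compares the namespace it is bound to.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI, assumed for illustration only.
HUML_NS = "urn:example:humanml"

# The same document written with two different prefixes bound to one namespace.
DOC_A = f'<huml:huml xmlns:huml="{HUML_NS}"><huml:personality/></huml:huml>'
DOC_B = f'<h:huml xmlns:h="{HUML_NS}"><h:personality/></h:huml>'


def expanded_names(xml_text):
    """List every element as "{namespace-uri}local-name", in document order."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]


if __name__ == "__main__":
    print(expanded_names(DOC_A))
    print(expanded_names(DOC_A) == expanded_names(DOC_B))
```

Both documents expand to the same element names, so software that keys on the namespace will treat them identically whichever prefix an author chose.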
The Framework
HumanML describes the XML and Resource Description Framework (RDF) Schema specifications being developed by a designated technical committee at OASIS. These specifications contain sets of modules that frame and embed contextual human characteristics. Other efforts within the scope of the HumanML Technical Committee that address the overall concerns of representing and amalgamating human information within data include:
Changing Core Dynamics
The core dynamics of the Internet are rapidly changing, as evidenced by the predicted growth of non-Western users, who are estimated to make up 75 percent of the Internet population by the year 2007 according to the top analyst firms. This development warrants an accepted standard that will help foster better communication among the entire Internet population. The ability to properly articulate information will greatly reduce the incidence of misrepresentation, which will no doubt have a positive effect on personal and business relationships. In a perfect world, issues relating to cultural, human and social characteristics would not hinder communication. Since this is not a perfect world, such issues do get in the way of the message, and they must be viewed as barriers in the current Internet landscape. Hence the growing interest in HumanML, which can be viewed as the global communications equalizer. Allowing for and accommodating cultural differences will certainly improve our ability to communicate effectively.
Information Focus
The success of HumanML hinges on its ability to improve information exchange between interested parties. Two primary areas of focus are human characteristics and applications:
Human Characteristics
Applications for HumanML
Benefits and Examples
The primary objective of HumanML is to ensure that all of the characteristics relating to human nature (e.g., behavioral, mental, and physical traits) are properly conveyed to enhance the fidelity of human communication. It brings together human qualities and technology. This synergy provides an efficient framework for interested parties to (a) gain a better understanding of one another and (b) improve bi-directional information exchange. This markup language will play a key role in eradicating various barriers we are currently experiencing regarding global Internet communication. With HumanML it will be possible to create a set of applications that accomplish the following:
These examples expound upon a number of
crucial areas where HumanML can be applied to enhance the fidelity of human
communication.
Figure 1: XML Schema Huml Definitions (Sample)

Figure 2: HumanML Components

The HumanML Technical Committee is strongly committed to supporting a pliable framework regarding the current and future development of HumanML. These are the chief areas of focus pertaining to HumanML pliability:
Postscript
What the Human Markup Language effort provides is an existing framework designed to grow and acclimatize with the development and adaptation of human society. It is meant to evolve with humanity as an aid to humanity’s evolution through increasing the overall clarity of communications. The progression and acceptance of Markup Languages is evidenced by the evolution of SGML to HTML to XML, which has now become the “lingua franca” of modern computing. It is important to recognize that any such effort must weather the tests of time, and use what is learned to improve its efforts. We offer this first report on HumanML in Government IT Directives as evidence of this process.
Important Terms
Applets. Are Java
applications embedded in a Web page.
ASP. Active Server Pages are a
set of software components that run on a Web server and allow Web developers to
build dynamic Web pages.
ASP+. Are the
next generation of Active Server Pages (ASP). They provide the services
necessary for developers to build Enterprise type Web applications.
COM. Component Object Model is an
object-based programming specification, designed to provide object
interoperability through sets of predefined routines called interfaces.
COM+. Provides an enterprise
development environment, based on the Microsoft Component Object Model (COM),
for creating component-based, distributed applications.
CORBA. Common Object Request Broker
Architecture is the Object Management Group (OMG) vendor-independent
architecture and infrastructure, which computer applications use to work
together over networks.
CSS. Cascading
Style Sheets is a style sheet language that enables authors and users to attach
style (fonts, spacing, and aural cues) to structured documents, including HTML
and XML applications.
DOM. Document Object Model is a
platform and language neutral interface that allows programs and scripts to
dynamically access and update the content, structure and style of documents.
DTD. Document Type Definition is
a set of declarations that specifies which elements and attributes may appear
in a document and how they may be nested.
EJB. Enterprise JavaBeans are
server-side components that conform to the Sun EJB component model. An
EJB may be used to create a business object, and related content may be sent
using Java Server Pages (JSPs).
FTP. File Transfer Protocol is
the protocol used on the Internet for sending files.
HPCDML. Human Physical Characteristics Description Markup
Language includes the description of human anatomy, incorporating the existing
vocabularies and datatypes for anatomy and medicine; the description of human ancestral specimens, incorporating the
existing languages and datatypes for human and pre-human hominid archeology;
and humanoid animation for the World Wide Web, incorporating the H-Anim
Specification of X3D/VRML.
HTML. Hypertext Markup Language is
a non-proprietary format based on SGML, and is the publishing language of the
Web.
HR-XML. Human Resources-XML is an XML-based specification that has been designed to
enable e-business and the automation of human resources-related data exchanges.
HumanML. Is an abbreviation for the entire set of Human Markup Language specifications. It has been designed to represent human characteristics (e.g., cultural, physical, psychological, etc.) through XML, and is focused on enhancing the fidelity of human communication.
Huml. Relates to the name of the “root” XML Schema Element of the Human Markup Language Primary Base XML Schema Specification (and, therefore of the entire set of languages).
Huml. Is the namespace prefix that identifies the HumanML namespace, and also the abbreviation for the OASIS HumanMarkup Technical Committee.
IDL. Interface Definition
Language is the language used to describe the interfaces of CORBA objects. Java
IDL is the standard API for calling CORBA services.
Java. Is a cross-platform source
programming language that allows applications to be distributed over networks
and the Internet.
J2C. The J2EE Connector Architecture
Specification (JCA Specification) is a standard architecture for integrating
Java applications with existing enterprise information systems.
J2EE. Java 2 Platform
Enterprise Edition defines a standard for developing multitier applications.
J2ME. Java 2 Platform Micro
Edition provides an application-development platform for mobile devices including
cell phones and PDAs.
JDBC. Java Database Connectivity
is the standard API for accessing relational data.
JMS. Java Message Service is
the standard API for sending and receiving messages.
JNDI. Java Naming and Directory
Interface is the standard API for accessing information in enterprise naming
and directory services.
JSP. Java Server Pages are a way
to create dynamic Web content. They may also be used to generate and consume
XML between n-tier servers or between servers and clients.
JVM. The Java Virtual Machine
runs the Java applications.
JTA. Java Transaction API
defines a high-level transaction management specification.
JTS. Java Transaction Service
ensures interoperability with sophisticated transaction resources.
LDAP. Lightweight Directory
Access Protocol is based on the standards contained within the X.500 standard,
but is significantly simpler. And unlike X.500, LDAP supports TCP/IP, which is
necessary for any type of Internet access.
Namespaces. Provide a simple method for
qualifying element and attribute names used in XML documents by associating
them with namespaces identified by URI references.
OASIS. The Organization for the Advancement of Structured Information Standards is a not-for-profit consortium that advances electronic business by promoting open, collaborative development of interoperability specifications.
ODBC. Open Database Connectivity
is a widely accepted API for database access. It is based on the Call-Level
Interface (CLI) specifications from X/Open and ISO/IEC for database APIs and
uses Structured Query Language (SQL) as its database access language.
OMG. Object Management Group is
the industry group dedicated to promoting object-oriented (OO) technology and
its standardization.
PKI. Public-key infrastructure is the combination of software, encryption technologies, and services designed to protect the security of communications and business transactions on the Internet.
RMI. Remote Method Invocation
allows a Java object to invoke methods on an object running in another Java
Virtual Machine, and is used for creating distributed Java applications.
RMI/IIOP. Provides developers an
implementation of the Java RMI API over the Object Management Group (OMG)
standard Internet Inter-ORB Protocol (IIOP). This allows developers to write
remote interfaces between clients and servers.
SAX. Simple API for XML is an
event-based interface for processing XML documents.
Servlets. Allow users to run Java
code on the server and send HTML pages to a browser.
SSL. Secure Sockets Layer is a
security technology that is commonly used to secure server to browser
transactions.
SOAP. Simple
Object Access Protocol is a World Wide Web Consortium (W3C) specification that
facilitates interoperability between a broad mixture of programs and
platforms.
SQL.
Structured Query Language is a standard language for making interactive queries
from and updating databases.
Telnet. A terminal emulation
program for TCP/IP networks such as the Internet.
The Department of Homeland
Security (DHS). Was created after September 11, 2001 and is meant to safeguard citizens
and intensify security at borders and ports of entry.
UDDI. Universal Description, Discovery
and Integration is an online directory that gives businesses and
organizations a uniform way to describe their services, discover other
companies’ services, and understand the methods required to conduct business
with a specific company.
UDP. User
Datagram Protocol is a connectionless protocol that runs on top of IP networks.
Web Services. Are components that
reside on the Internet and have been designed to be published, discovered, and
invoked dynamically across various platforms and unlike networks.
X3D. Extensible 3D
is an XML-based standard, the successor to VRML, that can be used to deliver
interactive 3D objects and worlds across the Internet.
VRML. Virtual Reality Modeling Language allows
authors to create "virtual worlds" networked via the Internet and hyperlinked
with the World Wide Web.
WSDL. Web Services Description Language is an XML-based language for describing Web services. WSDL documents may be published to a UDDI directory. WSDL provides interface/implementation details of available Web services, using XML to describe data types, interfaces, locations, and protocols.
XML. Extensible Markup Language
is a non-proprietary subset of SGML. It is focused on data structure, and uses
tags to specify the content of the data elements in a document.
XML Schema. Schemas are used to define
and document XML applications.
XPath. XML Path language’s primary
purpose is to address parts of an XML document. In support of this primary
purpose, it also provides basic facilities for manipulation of strings, numbers
and booleans.
XPointer. XML Pointer Language is
based on the XML Path language (XPath) and supports addressing into the
internal structures of XML documents. It allows for examination of a
hierarchical document structure and choice of its internal parts based on various
properties, such as element types, attribute values, character content, and
relative position.
XQuery. Is a query language that uses the structure of XML intelligently. It can express queries across many kinds of data, whether physically stored in XML or viewed as XML via middleware. XQuery is designed to be broadly applicable across many types of XML data sources.
XSL. Extensible Stylesheet
Language describes how data is presented. XSL may also be used to transform XML
data into HTML/CSS documents on Web servers.
XSLT. Extensible Stylesheet
Language Transformations is a language for transforming XML documents into
other XML documents. XSLT is designed for use as part of XSL, which is a
stylesheet language for XML.
W3C. The World Wide Web Consortium has become the primary organization for creating Web specifications. Its principal goal is interoperability.
This report was
researched and written by Rex Brooks and Russell Ruggiero
1361-A Addison,
Berkeley, CA, 94702, 510-849-2309
This report
represents the opinions of the authors solely.
This document may not be used without written permission of the authors.
Copyright November, 2003