HumanML & Government Related IT Directives (Part One)

By Rex Brooks and Russell Ruggiero

Contents

Introduction
Overview
ROI (Government vs. Private)
The Current Landscape
The HumanML Effort
HumanML and Governmental ROI
“REPURPOSING”
The Functional Divisions of Human Markup Language
HumanML and DHS Consolidation
HumanML and Security by Preference in DHS
Simplicity is Key
The Core Components
The Framework
Changing Core Dynamics
Information Focus
Benefits and Examples
Postscript
Important Terms
Copyright

Introduction

Emerging technologies are expected to play a vital role in government-related domestic and foreign IT directives by providing quantifiable benefits such as quicker response times and reduced costs.

 

(Historical Note: The creation of the Internet is commonly attributed to a Cold War landscape that required governments and related agencies to maintain dependable communication between vulnerable central facilities, in order to prevent the destruction of military-industrial information resources and command and control centers. As a result, the ARPANET, the network of the agency now known as the Defense Advanced Research Projects Agency (DARPA), was created in the United States. The Western academic research community adopted this kind of networking, and multinational scientists at the European Organization for Nuclear Research (CERN) developed Hypertext on top of it to avoid duplication of experiments and to maximize the use of a scarce resource, Western Europe's large particle accelerator. The salient point is that necessity, in the form of vulnerable communications networks and an early scarcity of expensive research resources, brought this now ubiquitous phenomenon into being where none had existed before. The preexisting electronic communications media, such as the telegraph, telephone, radio and television, had also established an infrastructure that was ready to be exploited in new ways. The Archie, Gopher and Veronica search tools of the early 1990s were followed by Mosaic, a browser that added a graphical display to Hypertext and brought the World Wide Web, which runs on top of the Internet, to a mass audience.)

 

Overview

The events of September 11, 2001 led to the creation of the Department of Homeland Security (DHS), which is meant to safeguard citizens by establishing coordinated nationwide emergency readiness and response, by intensifying security through improved communication between intelligence and investigation agencies, and by improving screening at borders and ports of entry. Furthermore, it has been given the task of consolidating the work of twenty-two federal agencies. The organizational structure of the Department of Homeland Security comprises four main divisions: 1) Border and Transportation Security; 2) Emergency Preparedness and Response; 3) Chemical, Biological, Radiological and Nuclear Countermeasures; and 4) Information Analysis and Infrastructure Protection. This report focuses on how emerging technologies may be leveraged in this important and far-reaching project.

 

ROI (Government vs. Private)

Since March of 2000, return on investment (ROI) has been a very popular and frequently overused term. The so-called “dot-com bubble bust” has put additional emphasis on the bottom line, which should be viewed as a positive development for all parties concerned. As with most blanket assumptions, the portrayal of the overhyped “New” Economy as baseless is an oversimplification. Nevertheless, we can readily agree that a renewed emphasis on ROI is quite relevant to both the government and private sectors. ROI does not, however, mean the same thing in each sector. A case in point: key ROI factors such as consolidation and the automation of manual processes apply to both sectors, but an entity such as the Department of Homeland Security must also add public safety to the equation. For example, in the private sector a misused data-mining tool may lead to an ineffective sales campaign. In the case of the Department of Homeland Security, the same misuse could lead to unneeded infrastructure spending, waste of manpower, loss of property, or even loss of life. Accordingly, a specific ROI model should be created that incorporates critical elements such as identifying key vulnerabilities within national power distribution networks, information distribution networks, food and water supply lines, transportation networks, and the changing daily loci of population centers or gathering points. These are the places where terrorist acts can directly affect the bottom line, which here is the cost-effective use of taxpayers' investment to protect and defend the common welfare.

 

The Current Landscape

The effort to consolidate twenty-two separate United States Government agencies will prove to be a monumental task for the Department of Homeland Security, and will involve a myriad of both established and emerging technologies. A number of strategic emerging technologies under development can advance this work, but the very dynamic conditions of incubation during difficult economic times tend to weed out of the marketplace any effort that is poorly supported and/or poorly explained. Such efforts may hold long-term promise, yet they do not show immediate improvements in bottom-line analyses within the private sector. Within the government sector it is almost impossible to write topics for grant proposals aimed at developing such emerging technologies, even within the military, which has a tradition of such endeavors. The Defense Advanced Research Projects Agency (DARPA) is one example of this ongoing tradition. One simply cannot know where chance and circumstance will bring together the elements needed to germinate the new and innovative combinations of concepts that spark the development of emerging technologies.

 

The HumanML Effort

There are numerous areas in which new technologies may be cultivated and utilized. Two prime examples are open source and open standards. This is where the Organization for the Advancement of Structured Information Standards (OASIS) and the genesis of Human Markup Language (HumanML) come into the picture. HumanML is a new specification being developed at OASIS, a not-for-profit consortium that advances electronic business by promoting open, collaborative development of interoperability specifications. The effort to develop HumanML began independently, in an ad hoc Yahoo Groups mailing list, and later migrated to OASIS through a process that OASIS allows. (OASIS is primarily an industry-based organization of member companies whose sponsoring dues pay for its operations.) In basic terms, HumanML has been designed to represent human characteristics (e.g., cultural, physical, psychological, etc.) through XML, and is focused on enhancing the fidelity of human communication. While the goal of HumanML may seem somewhat ambitious, the vision is very well timed for the dramatic changes currently taking place, such as the explosive Internet growth of the non-Western world. During the next fifteen years, accurate information exchange between people of unlike cultures and origins is expected to be an ever-increasing concern with regard to Internet communication. Accordingly, there is a clear need for emerging technologies that will improve global communication while advancing the Internet to a higher level of interconnectedness.

HumanML and Governmental ROI

Improving the manner in which America explains itself, and the benefits of democratic institutions and an open, market-based society, to the developing world is a key area where HumanML shows great promise for governmental ROI. HumanML can also ease the stresses of consolidation within DHS in various ways. Perhaps the most important is the consolidation of personal information among employees. The HR-XML Consortium (which, it should be noted, is separate from OASIS though aligned with its goals) gives Human Resources related information a standard XML-based vocabulary. Some of that vocabulary is incorporated in HumanML, much as other vocabularies such as Anatomy and Medicine are slated to be included under the wide-ranging umbrella of the Human Markup Language. HumanML is envisioned to be as inclusive of existing vocabularies as possible, with the aim of harmonizing them and making them more interoperable, so that duplication of information can be avoided more effectively.

 

“REPURPOSING”

In terms of DHS and the consolidation of Executive Branch agencies, the single concept that most clearly represents how HumanML can aid this work is “REPURPOSING.” What this means is that the functions listed in job descriptions within individual employee files, copies of which can be entered into a common database, can be assembled and used to evaluate how to make maximum use of individual skill sets. This is, of course, oversimplified for the purpose of giving an example. In point of fact, a specific vocabulary for doing this would need to be assembled. The most important concept is that HumanML offers a defined framework within which this information can be assembled, and that this framework is independent of the bureaucratic channels of existing agencies, so it can help produce an overall picture of capabilities and make it easier to match skill sets and people. This is not to say that the framework should be instituted immediately, or even soon; in fact, it should not be approached that rapidly. What needs to be seen is that a framework exists that can be built upon, and for which specific tools can be developed for these tasks.
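
The matching step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields, the skill labels and the ranking rule are hypothetical, and none of them come from a HumanML or HR-XML specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmployeeRecord:
    """Hypothetical entry in the common database (field names are assumptions)."""
    name: str
    agency: str
    skills: frozenset


def match_candidates(records, required_skills):
    """Rank records by how many required skills each one covers.

    Records that cover none of the required skills are dropped.
    Ties are broken alphabetically by name so the result is stable.
    """
    required = frozenset(required_skills)
    scored = [(len(required & record.skills), record) for record in records]
    scored = [(score, record) for score, record in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1].name))
    return [(record.name, score) for score, record in scored]


# Fictitious sample data, for illustration only.
RECORDS = [
    EmployeeRecord("Employee A", "Agency 1", frozenset({"customs", "arabic"})),
    EmployeeRecord("Employee B", "Agency 2", frozenset({"customs"})),
    EmployeeRecord("Employee C", "Agency 3", frozenset({"accounting"})),
]

if __name__ == "__main__":
    print(match_candidates(RECORDS, {"customs", "arabic"}))
```

The point of the sketch is that the matching logic never refers to an agency's own filing conventions, only to a shared vocabulary of skills, which is the role the text assigns to a common framework.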

 

The Functional Divisions of Human Markup Language

The process of building the Human Markup Language has been broken down into functional divisions. A Primary Base set of terms, largely for categories of information, has been assembled. From this Base, a set of extensions called Secondary Base Languages is being assembled. Each of the Secondary Base Languages is intended to represent an application area. That is to say that from the Primary Base, a Secondary Base Language such as the Human Physical Characteristics Description Markup Language (HPCDML) will be derived. In this case the HPCDML will include the description of human anatomy, incorporating the existing vocabularies and datatypes for anatomy and medicine; the description of human ancestral specimens, incorporating the existing languages and datatypes for human and pre-human hominid archeology; and humanoid animation for the World Wide Web, incorporating the H-Anim Specification of X3D/VRML.

HumanML and DHS Consolidation

By definition, a superset language such as HR-XML (Human Resources-XML), with its many modules, and a HumanML-enhanced and HumanML-enabled language based on DHS needs and requirements may be combined into a hybrid language or vocabulary. That hybrid can be used to create a useful pool of information based on skill sets, cultural and educational backgrounds, and proven competencies, which can then be applied to the consolidation effort. The most important concept that HumanML brings to this mix is a capacity to better define Human criteria for Human requirements within the informational requirements of DHS.

 

HumanML and Security by Preference in DHS

It should be noted that it has always been the goal of the HumanML effort to put control of an individual’s personal information in their own hands to the extent possible, and to create languages such as “Human PreferencesML” that can be used to improve basic security and to virtually eliminate spam, or transform it into a valued resource, while optimizing the delivery of information that is actually of interest to an individual about their preferred activities. Such a language or offshoot could allow for a secondary, in-depth authentication of identity. It could increase security by allowing individuals to divulge information, based on their personal preferences, that presumably only they could know or authorize, and thus streamline both transportation screening and border screening. Wide implementation of such a language would also change spam for the better for both the buyers and the sellers of the Internet marketplace. It would elevate competition for individual users’ attention from a battle of flashy claims to a battle of well-presented and easily understood information in which recipients have shown an interest and which they have specifically allowed or requested be sent to them. This is by no means the extent of the probable secondary languages and application areas that could be created within HumanML. Nor is it the extent of the applications within HumanML that could be of service in DHS. It is only the beginning.
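
As a rough illustration of delivery by preference, the Python sketch below keeps only the messages whose topic a recipient has explicitly allowed. The message layout and the topic labels are assumptions made for this example; no “Human PreferencesML” vocabulary is defined in this report.

```python
def deliverable(messages, allowed_topics):
    """Return the messages whose topic the recipient has explicitly allowed.

    `messages` is a list of dicts with hypothetical "sender" and "topic" keys.
    Topic comparison ignores letter case.
    """
    allowed = {topic.lower() for topic in allowed_topics}
    return [m for m in messages if m["topic"].lower() in allowed]


# Fictitious inbox, for illustration only.
INBOX = [
    {"sender": "vendor-1", "topic": "Sailing"},
    {"sender": "vendor-2", "topic": "Mortgages"},
    {"sender": "vendor-3", "topic": "sailing"},
]

if __name__ == "__main__":
    for message in deliverable(INBOX, {"sailing"}):
        print(message["sender"], message["topic"])
```

The design choice worth noting is that the filter is opt-in: anything the recipient has not asked for is dropped, which is the inversion of today's spam model that the paragraph above describes.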

 

Simplicity is Key

It may seem incongruous, but the key that will make HumanML one of the most important tools for transformation is its inherent simplicity. The vocabularies or modules of the Human Markup Language can only clarify communication if they are constructed from the simplest and clearest base components. We must keep each term within the framework of a single meaning as much as possible, carefully excluding unwanted alternate definitions and requiring that the definition used be the one we specify in our HumanML documents, in our HumanML namespace. This is the fundamental aspect of XML vocabularies, or “Markup Languages,” that enables extensibility. You can take a single, simple term such as “personality” and define it within a specific context, such as <huml:personality>, and be assured that when it is used this way ONLY the definition of “personality” defined within the namespace bound to the prefix “huml” will be applied. Hence, this emerging technology provides a method that unlocks a new world of clear meanings and understandings across an extensive range of government IT applications. With this world unlocked, a construction such as <huml:personality:MBTI:ESFP> would specifically refer to a HumanML-defined personality type, using the Myers-Briggs Type Indicator, for an exemplar of the Extraverted, Sensing, Feeling and Perceiving type. (The notation is shorthand: a namespace-qualified XML name may contain only one colon, so an actual document would carry the same information in nested elements or attributes.) It may seem odd, if not counter-intuitive, that a simpler set of definitions can lead to a much more accurate, as well as much more complex, definition of a human characteristic or overall collection of characteristics. It is nonetheless quite true, and it is therefore a more utilitarian method for improving human communication across the broad spectrum of digital information systems.

Applied to governmental IT applications, these improvements, many of which cannot be anticipated or predicted, should rather quickly demonstrate better communications and lower costs, as reductions in confusion and in the need to correct misunderstandings in the delivery of services are achieved as tangible, measurable results.
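
To make the namespace idea concrete, here is a minimal Python sketch using the standard library's xml.etree.ElementTree module. The namespace URIs, the element layout and the "system" and "type" attributes are assumptions invented for this example, not definitions taken from a HumanML schema. Because a namespace-qualified XML name can contain only one colon, the MBTI classification is carried in attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, chosen only for this illustration.
HUML_NS = "urn:example:humanml"
OTHER_NS = "urn:example:marketing"

# Two elements share the local name "personality" but live in different namespaces.
DOCUMENT = f"""
<profile xmlns:huml="{HUML_NS}" xmlns:mkt="{OTHER_NS}">
  <huml:personality system="MBTI" type="ESFP"/>
  <mkt:personality>brand voice: friendly</mkt:personality>
</profile>
""".strip()


def huml_personalities(xml_text):
    """Return (system, type) pairs for personality elements in the huml namespace only."""
    root = ET.fromstring(xml_text)
    # The prefix used in the query is mapped to the namespace URI explicitly,
    # so the match depends on the URI, not on the prefix text in the document.
    matches = root.findall("huml:personality", {"huml": HUML_NS})
    return [(element.get("system"), element.get("type")) for element in matches]


if __name__ == "__main__":
    print(huml_personalities(DOCUMENT))
```

Only the element in the assumed huml namespace is returned; the identically named marketing element is ignored, which is the single-meaning guarantee the section describes.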

 

The Core Components

HumanML is an always-evolving standard XML vocabulary designed to represent human characteristics through XML, while at the same time enhancing the fidelity of human communication. Following are the three core components that provide the foundation for this emerging technology:

HumanML. Is an abbreviation for the entire set of Human Markup Language specifications.

Huml. Relates to the name of the “root” XML Schema Element of the Human Markup Language Primary Base XML Schema Specification (and, therefore, of the entire set of languages).

Huml. Is the namespace prefix that identifies the HumanML namespace, and also the abbreviation for the OASIS HumanMarkup Technical Committee.

 

The Framework

HumanML describes the XML Schema and Resource Description Framework (RDF) Schema specifications being developed by a designated technical committee at OASIS. These specifications contain sets of modules that frame and embed contextual human characteristics. Other efforts within the scope of the HumanML Technical Committee that address the overall concerns of representing and amalgamating human information within data include:

Changing Core Dynamics

The core dynamics of the Internet are rapidly changing, as evidenced by the predicted growth of non-Western users, who are estimated to make up 75 percent of the Internet population by the year 2007 according to the top analyst firms. This development warrants an accepted standard that will help foster better communication among the entire Internet population. The ability to articulate information properly will greatly reduce the incidence of misrepresentation, which will no doubt have a positive effect on personal and business relationships. In a perfect world, issues relating to cultural, human and social characteristics would not hinder communication. Since this is not a perfect world, such issues do get in the way of the message, and they must be viewed as barriers in the current Internet landscape. Hence the growing interest in HumanML, which can be viewed as the global communications equalizer. Allowing for and accommodating cultural differences will certainly improve our ability to communicate effectively.

 

Information Focus

The success of HumanML hinges on its ability to improve information exchange between interested parties. Two primary areas of focus are human characteristics and applications:

 

Human Characteristics

 

Applications for HumanML

Benefits and Examples

The primary objective of HumanML is to ensure that all of the characteristics relating to human nature (e.g., behavioral, mental, and physical traits) are properly conveyed to enhance the fidelity of human communication. It brings together human qualities and technology. This synergy provides an efficient framework for interested parties to (a) gain a better understanding of one another and (b) improve bi-directional information exchange. This markup language will play a key role in eradicating various barriers we are currently experiencing regarding global Internet communication. With HumanML it will be possible to create a set of applications that accomplish the following:

These examples expound upon a number of crucial areas where HumanML can be applied to enhance the fidelity of human communication.

Figure 1: XML Schema Huml Definitions (Sample)



Figure 2: HumanML Components

 

The HumanML Technical Committee is strongly committed to supporting a pliable framework regarding the current and future development of HumanML. These are the chief areas of focus pertaining to HumanML pliability:

Postscript

What the Human Markup Language effort provides is an existing framework designed to grow and acclimatize with the development and adaptation of human society. It is meant to evolve with humanity as an aid to humanity’s evolution through increasing the overall clarity of communications. The progression and acceptance of Markup Languages is evidenced by the evolution of SGML to HTML to XML, which has now become the “lingua franca” of modern computing. It is important to recognize that any such effort must weather the tests of time, and use what is learned to improve its efforts. We offer this first report on HumanML in Government IT Directives as evidence of this process.

Important Terms

 

Applets. Are Java applications embedded in a Web page.

 

ASP. Active Server Pages are a set of software components that run on a Web server and allow Web developers to build dynamic Web pages.

 

ASP+. Are the next generation of Active Server Pages (ASP). They provide the services necessary for developers to build Enterprise type Web applications.

 

COM. Component Object Model is an object-based programming specification, designed to provide object interoperability through sets of predefined routines called interfaces.

 

COM+. Provides an enterprise development environment, based on the Microsoft Component Object Model (COM), for creating component-based, distributed applications.

 

CORBA. Common Object Request Broker Architecture is the Object Management Group (OMG) vendor-independent architecture and infrastructure, which computer applications use to work together over networks.

 

CSS. Cascading Style Sheets is a style sheet language that enables authors and users to attach style (fonts, spacing, and aural cues) to structured documents, including HTML and XML applications.

 

DOM. Document Object Model is a platform and language neutral interface that allows programs and scripts to dynamically access and update the content, structure and style of documents.

 

DTD. Document Type Definition is a set of declarations that specifies which elements and attributes may appear in a document and how they may be nested.

 

EJB. Enterprise JavaBeans is a server-side component architecture defined by the Sun EJB component model. An EJB may be used to create a business object, and related content may be sent using Java Server Pages (JSPs).

 

FTP. File Transfer Protocol is the protocol used on the Internet for sending files.

 

HPCDML. Human Physical Characteristics Description Markup Language includes the description of human anatomy, incorporating the existing vocabularies and datatypes for anatomy and medicine; the description of human ancestral specimens, incorporating the existing languages and datatypes for human and pre-human hominid archeology; and humanoid animation for the World Wide Web, incorporating the H-Anim Specification of X3D/VRML.

 

HTML. Hypertext Markup Language is a non-proprietary format based on SGML, and is the publishing language of the Web.

 

HR-XML. Human Resources-XML is an XML-based specification that has been designed to enable e-business and the automation of human resources-related data exchanges.

 

HumanML. Is an abbreviation for the entire set of Human Markup Language specifications. It has been designed to represent human characteristics (e.g., cultural, physical, psychological, etc.) through XML, and is focused on enhancing the fidelity of human communication.

 

Huml. Relates to the name of the “root” XML Schema Element of the Human Markup Language Primary Base XML Schema Specification (and, therefore, of the entire set of languages).

 

Huml. Is the namespace prefix that identifies the HumanML namespace, and also the abbreviation for the OASIS HumanMarkup Technical Committee.

 

IDL. Interface Definition Language is the standard language for describing the interfaces through which CORBA services are called.

 

Java. Is a cross-platform source programming language that allows applications to be distributed over networks and the Internet.

 

J2C. The J2EE Connector Architecture (JCA) Specification is a standard architecture for integrating Java applications with existing enterprise information systems.

 

J2EE. Java 2 Platform Enterprise Edition defines a standard for developing multitier applications.

 

J2ME. Java 2 Platform Micro Edition provides an application-development platform for mobile devices, including cell phones and PDAs.

 

JDBC. Java Database Connectivity is the standard API for accessing relational data.

 

JMS. Java Message Service is the standard API for sending and receiving messages.

 

JNDI. Java Naming and Directory Interface is the standard API for accessing enterprise naming and directory services.

 

JSP. Java Server Pages are a way to create dynamic Web content. They may also be used to generate and consume XML between n-tier servers or between servers and clients.

 

JVM. The Java Virtual Machine runs the Java applications.

 

JTA. Java Transaction API defines a high-level transaction management specification.

 

JTS. Java Transaction Service ensures interoperability with sophisticated transaction resources.

 

LDAP. Lightweight Directory Access Protocol is based on the standards contained within the X.500 standard, but is significantly simpler. And unlike X.500, LDAP supports TCP/IP, which is necessary for any type of Internet access.

 

Namespaces. Provide a simple method for qualifying element and attribute names used in XML documents by associating them with namespaces identified by URI references.

 

OASIS. The Organization for the Advancement of Structured Information Standards is a not-for-profit consortium that advances electronic business by promoting open, collaborative development of interoperability specifications.

 

ODBC. Open Database Connectivity is a widely accepted API for database access. It is based on the Call-Level Interface (CLI) specifications from X/Open and ISO/IEC for database APIs and uses Structured Query Language (SQL) as its database access language.

 

OMG. Object Management Group is the industry group dedicated to promoting object-oriented (OO) technology and its standardization.


PKI. Public-key infrastructure is the combination of software, encryption technologies, and services designed to protect the security of communications and business transactions on the Internet.

 

RMI. Remote Method Invocation lets a Java program invoke methods on Java objects that reside in another Java Virtual Machine, and is used for creating and/or distributing Java objects.

 

RMI/IIOP. Provides developers with an implementation of the Java RMI API over the Object Management Group (OMG) standard Internet Inter-ORB Protocol (IIOP). This allows developers to write remote interfaces between clients and servers.

 

SAX. Simple API for XML is an event-based interface for processing XML documents.

 

Servlets. Allow users to run Java code on the server and send HTML pages to a browser.

 

SSL. Secure Sockets Layer is a security technology that is commonly used to secure server-to-browser transactions.

 

SOAP. Simple Object Access Protocol is a World Wide Web Consortium (W3C) specification that facilitates interoperability between a broad mixture of programs and platforms.

 

SQL. Structured Query Language is a standard language for making interactive queries from and updating databases.

 

Telnet. A terminal emulation program for TCP/IP networks such as the Internet.

 

The Department of Homeland Security (DHS). Was created after September 11, 2001 and is meant to safeguard citizens and intensify security at borders and ports of entry.

 

UDDI. Universal Description, Discovery and Integration is an on-line directory that gives businesses and organizations a uniform way to describe their services, discover other companies’ services, and understand the methods required to conduct business with a specific company.

 

UDP. User Datagram Protocol is a connectionless protocol that runs on top of IP networks.

Web Services. Are components that reside on the Internet and have been designed to be published, discovered, and invoked dynamically across various platforms and unlike networks.

 

X3D. Extensible 3D is the XML-based successor to VRML, used to deliver interactive 3D objects and worlds across the Internet.

 

VRML. Virtual Reality Modeling Language allows authors to create "virtual worlds" networked via the Internet and hyperlinked with the World Wide Web.

 

WSDL. Web Services Description Language is an XML-based language for describing Web services; WSDL descriptions may be published to a UDDI directory. WSDL provides interface and implementation details of available Web services and UDDI Registrants. It leverages XML to describe data types, details, interface, location, and protocols.

 

XML. Extensible Markup Language is a non-proprietary subset of SGML. It is focused on data structure, and uses tags to specify the content of the data elements in a document.

 

XML Schema. Schemas are used to define and document XML applications.

 

XPath. XML Path language’s primary purpose is to address parts of an XML document. In support of this primary purpose, it also provides basic facilities for manipulation of strings, numbers and booleans.

 

XPointer. XML Pointer Language is based on the XML Path language (XPath) and supports addressing into the internal structures of XML documents. It allows for examination of a hierarchical document structure and choice of its internal parts based on various properties, such as element types, attribute values, character content, and relative position.

 

XQuery. Is a query language that uses the structure of XML intelligently. It can express queries across many kinds of data, whether physically stored in XML or viewed as XML via middleware. XQuery is designed to be broadly applicable across many types of XML data sources.

 

XSL. Extensible Stylesheet Language describes how data is presented. XSL may also be used to transform XML data into HTML/CSS documents on Web servers.

 

XSLT. Extensible Stylesheet Language Transformations is a language for transforming XML documents into other XML documents. XSLT is designed for use as part of XSL, which is a stylesheet language for XML.

 

W3C. The World Wide Web Consortium has become the primary organization for creating Web specifications, and its principal goal is interoperability.

Copyright

 

This report was researched and written by Rex Brooks and Russell Ruggiero.

                        1361-A Addison, Berkeley, CA, 94702, 510-849-2309

 

This report represents the opinions of the authors solely.

 

This document may not be used without written permission of the authors.

Copyright November, 2003