| A Brief Introduction to XML and AnIML [PDF] | |
| Gary W. Kramer, NIST | XML (Extensible Markup Language) is a meta-language for describing markup languages that, in turn, are used to describe the structure of, and relationships between, entities in a document. XML provides a standard way of marking up documents through the use of delimiting tags that label entities and create structures. Today, the term XML refers to a family of related languages and technologies for dealing with structured documents, such as XSL, the Extensible Stylesheet Language; XLink, the XML Linking Language; XPointer, the XML Pointer Language; namespaces, a mechanism for dealing with multiple tag sets; Dsig, algorithms for implementing digital signatures; and XQuery, the XML query language.
Markup languages for information in specific application domains can be created using XML. The Analytical Information Markup Language (AnIML) is being created to facilitate the interchange and archiving of chromatography and molecular spectroscopy data and metadata. ASTM Subcommittee E13.15 and the IUPAC Subcommittee on Electronic Data Standards have worked together to flesh out AnIML around a core schema that describes the data and their representations, a metadata schema that details how the ancillary information about the data is represented, and a series of instance documents that delineate the terms used by each technique for its "scientific metadata." Each technique document contains a standardized portion that will be balloted through the appropriate standards organization, but may be augmented by vendors, user organizations, and end users. To ensure that the information in AnIML documents is complete and valid, both syntactic and semantic checking tools are being developed, digital signatures are incorporated to ensure data integrity, and audit trails provide the data tracking, verification, and validation necessary for use in regulated industries. |
| Requirements for a New Analytical Data Standard [PDF] | |
| Mark F. Bean, GSK | Data standards serve multiple functions. In the past, the focus was on a format for exchanging information between data systems. More recently, the pharmaceutical sector has pushed to preserve data for long periods (30-60 years) to meet FDA requirements, so data standards must also serve as long-term data repositories that can outlive the vendor software. Finally, in the future we may hope for vendor-independent processing or viewing of analytical instrument data.
Some of the required properties identified so far are that the standard be: flexible; strongly constrained; simple to understand; extensible; long-lived; quickly machine readable yet also human readable; capable of being verified and validated; capable of handling complex analysis contexts (metadata); capable of being stored in or restored from databases; convertible from prior standards (especially ANDI and JCAMP); independent of hardware, operating system, vendor, and software; and able to encode raw or processed data. One of the aspects that make the task of creating analytical information standards difficult is the constant evolution of analytical techniques. As a result, it is important that technique-constrained software be able to read its technique sections of the standard without failing when it encounters extensions. This talk will provide insight into how AnIML requirements have been gathered and offer a forum for further contributions from the audience. |
| Architecture of the Analytical Information Markup Language (AnIML) | |
| Burkhard A. Schaefer, Dipl.-Inf. BSSN, Mainz, Germany |
The Analytical Information Markup Language (AnIML) is a standardization effort of the E13.15 Sub-Committee of the American Society for Testing and Materials. AnIML provides an XML-based format for analytical data. It is suitable for many different analytical measurement techniques.
AnIML consists of a generic data container that permits the storage of arbitrary analytical data. This includes multi-dimensional data, name-value pairs, and hierarchies. The concept of Technique Definitions permits the formal specification of constraints for using this data container. This way, a definition can prescribe how the data for specific measurement techniques should be captured in the data file. To address changing requirements, AnIML supports an extension concept that allows vendors or end users to specify additional data that should be stored for a measurement technique. These extensions can also be formally documented so that they do not break compatibility with existing software. This paper will present a short introduction to AnIML and describe its architectural fundamentals. It demonstrates how AnIML can be used to record data from everyday analytical experiments in a laboratory environment. It also describes how workflows consisting of multiple experiments can be documented. In addition, AnIML features related to its application in regulated environments will be briefly mentioned. This includes digital signatures and audit trail functionality. |
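The interplay described above between a generic container, a technique definition that constrains it, and an extension that adds fields without breaking existing software can be sketched as follows. This is only an illustration under stated assumptions: the class `TechniqueDefinition`, the function `check`, and all field names are hypothetical and are not taken from the AnIML schemas.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class TechniqueDefinition:
    """Hypothetical stand-in for a technique definition (not the AnIML schema)."""
    name: str
    required: Set[str]
    optional: Set[str] = field(default_factory=set)

    def extend(self, extra_optional: Set[str]) -> "TechniqueDefinition":
        # An extension only adds optional fields, so any container that
        # conformed to the base definition still conforms to the extension.
        return TechniqueDefinition(
            self.name, set(self.required), self.optional | extra_optional
        )


def check(container: Dict[str, object], definition: TechniqueDefinition) -> List[str]:
    """Return a list of problems; an empty list means the container conforms."""
    present = set(container)
    known = definition.required | definition.optional
    problems = [
        f"missing required field: {key}"
        for key in sorted(definition.required - present)
    ]
    problems += [f"undeclared field: {key}" for key in sorted(present - known)]
    return problems


# Illustrative (made-up) definition and a vendor extension of it.
uv_vis = TechniqueDefinition(
    "UV/Vis", required={"wavelength", "absorbance"}, optional={"solvent"}
)
vendor_uv_vis = uv_vis.extend({"lamp_hours"})

# A generic container holding name-value pairs for one measurement.
record = {
    "wavelength": [200.0, 200.5],
    "absorbance": [0.012, 0.015],
    "lamp_hours": 1200,
}

print(check(record, uv_vis))         # the vendor field is flagged as undeclared
print(check(record, vendor_uv_vis))  # the documented extension accepts it
```

Because the sketch lets an extension add only optional fields, software written against the base definition can still read the required fields of an extended file. That is one possible reading of "do not break compatibility", not a statement of how AnIML implements it.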
| Flexible standardization with AnIML technique definitions [PDF] | |
| Maren Fiege, Waters GmbH | To make analytical data meaningful, it is necessary to agree upon common terms, i.e. standard data dictionaries for analytical techniques. On the other hand, there needs to be enough flexibility in a standard to accommodate new techniques and special needs. This workshop will show how AnIML meets this challenge, and how it offers both standardization and flexibility. |
| The AnIML Data Model [PDF] | |
| Peter J. Linstrom, NIST |
ASTM Subcommittee E13.15 is developing the Analytical Information Markup Language (AnIML) for the storage of analytical instrument data. The language is able to store data from a wide range of analytical instruments. AnIML was designed to support complex experimental designs and meet data retention requirements imposed by regulatory agencies.
AnIML uses a generic approach to data storage based on a limited number of base data types. The AnIML data model allows storage of n-dimensional data sets along with some basic metadata. A compact representation for evenly monotonic data series is provided. Base data types supported by AnIML include integers in text format and floating-point values in both text and binary formats. This talk will discuss how AnIML stores data and illustrate the various data types supported by AnIML. Examples of applications will be provided. |
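Two ideas from this abstract, a compact representation for a regularly incrementing series and a binary encoding for floating-point values, can be illustrated with a minimal sketch. It assumes that "evenly monotonic" means a constant increment between points and that binary values are carried as base64 text. Both are assumptions, and the names `EvenlySpacedSeries`, `encode_floats`, and `decode_floats` are hypothetical rather than part of AnIML.

```python
import base64
import struct
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EvenlySpacedSeries:
    """Hypothetical compact form: store start, increment, and count only."""
    start: float
    increment: float
    count: int

    def expand(self) -> List[float]:
        # Regenerate the explicit values on demand instead of storing them.
        return [self.start + i * self.increment for i in range(self.count)]


def encode_floats(values: List[float]) -> str:
    """Pack values as little-endian 64-bit floats, then wrap them in base64 text."""
    raw = struct.pack(f"<{len(values)}d", *values)
    return base64.b64encode(raw).decode("ascii")


def decode_floats(text: str) -> List[float]:
    """Reverse encode_floats: base64 text back to a list of floats."""
    raw = base64.b64decode(text)
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))


# Independent axis: three numbers describe the whole series.
wavelengths = EvenlySpacedSeries(start=200.0, increment=0.5, count=5)

# Dependent axis: explicit values, stored in binary form (made-up numbers).
absorbances = [0.012, 0.015, 0.021, 0.034, 0.050]
encoded = encode_floats(absorbances)

print(wavelengths.expand())
print(decode_floats(encoded) == absorbances)
```

The compact form trades a stored list for three numbers, and the binary form round-trips 64-bit values exactly, whereas a text representation has to be written with enough digits to do so.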
| Analytical Instrument Control Using XML-based Web Service [PDF] | |
| Alex Mutin, Shimadzu Scientific Instruments |
There is a growing interest among analytical instrument users for multivendor support of their equipment in terms of instrument control, data acquisition and data processing capabilities.
Different vendors provide different software interfaces to control their instruments. Many users prefer to standardize on software to minimize validation and training costs, while keeping their hardware diverse. Because most laboratory software has limited multi-vendor support, users shopping for a new instrument are often burdened by the need to stay with one type of software.
An XML-based web service embedded in an analytical instrument is a new technology that can potentially overcome the multi-vendor support limitations of current software. An HPLC equipped with a web server is connected directly to a computer network. Such a system can be controlled from any PC without any additional software except a web browser such as Internet Explorer. If laboratory software is linked with such a web service, one can easily assemble systems out of multi-vendor hardware components while controlling them from the same application. In addition, the data can be interchanged between instruments, applications and databases using the Analytical Information Markup Language (AnIML) format. |
| Long Term Storage of Chromatographic Data... AnIML, TNF, Viewers, and Plenty of Challenges! [PDF] | |
| Mark Mullins, Agilent Technologies |
Long term storage of chromatographic data is a necessity. Some data needs to be kept upwards of 100 years! In order to guarantee this data will be accessible for this extended time period, a Technology Neutral Format (TNF) must be utilized for the file format and storage of these files.
Currently, Extensible Markup Language (XML) is an ideal format for the TNF storage of files. Chromatographic data stored in XML format is technology neutral, as it can be read and understood without the original creating application. However, without some sort of standardization, every XML file will look different. With standardization, every XML file has the same format, and tools can be developed around the standard to make creating, editing, and viewing much easier. This is where the Analytical Information Markup Language (AnIML) comes into the picture. AnIML is a standard for storing analytical data in XML.
This workshop will cover some of the challenges that are faced in creating applications to transform data into AnIML format. Some of the topics covered will be isolation from changes in the XML format, schema validation, and AnIML viewers. Examples of actual AnIML data files and a live AnIML viewer will be presented. |
| AnIML in Regulated Environments [PDF] | |
| Antony N. Davies, Waters Corporation | In an electronic age where regulatory compliance and the protection of intellectual property are of ever-increasing importance, one of the key technologies needed is the capability to ensure electronic data longevity. This talk will outline how the future IUPAC/ASTM AnIML data standards will provide this longevity and meet legal requirements. |
| The Path to the New ASTM AnIML Standard [PDF] | |
| David P. Martinsen, ACS | The Analytical Information Markup Language (AnIML) is being created within the framework of ASTM. This work is the focus of ASTM Subcommittee E13.15 on Analytical Data, a subcommittee of ASTM Committee E13 on Molecular Spectroscopy and Chromatography. The IUPAC Subcommittee on Electronic Data Standards, which is responsible for the JCAMP-DX standards, has joined with ASTM E13.15 to define this new standard for analytical information. The standard is centered on a core schema that will be used across all analytical techniques. A technique schema defines the framework for creating technique definition files for each specific analytical technique. The core and technique schemas are being created and will be maintained by ASTM E13.15. The technique definitions will require input from experts in each technique. For several common techniques (e.g., UV/Vis, IR, MS), these definition files will be created through collaboration of E13.15 with those experts. This talk will examine the standardization process, recount the work that has already taken place, and discuss the steps remaining to complete the standards process. |