How XML Files Encapsulate Your Data
Extensible Markup Language (XML) has quickly established itself as a viable technology with a huge range of real-world applications. A major reason for its importance and wide acceptance is that it offers a working solution to a problem faced by software developers and computer users alike: the exchange of data between incompatible systems. Many software environments store data in their own proprietary formats, often binary, which other programs cannot readily understand. Once data is exported in XML format, it becomes a known quantity, independent of the environment in which it originated.
The PDF format is another example of a platform-independent format which has gained worldwide acceptance. Once a document is saved in PDF format, its appearance is set in stone: it can be viewed and printed with its layout and formatting intact, without the need for the software which created the original document. However, where the PDF format concerns itself mainly with the presentation of information, XML is used to describe and encapsulate the information itself.
Though XML itself is still fairly new, the idea behind it is not. Work on Standard Generalized Markup Language (SGML) began back in the 1970s in an attempt to create an application-independent method of describing data, and SGML was published as an ISO standard in 1986. SGML is a text-based language built on the concept of adding markup to data in order to describe that data. An SGML document contains both data and a set of rules defining the structure of the data. SGML is a pretty complex language and, unlike XML, has never become mainstream. In the early 1990s, HTML was defined as an application of SGML, and in the late 1990s SGML also served as the basis for the development of XML. So, basically, XML is a restricted form of SGML.
XML has already proved itself to be an excellent medium for storing, describing and transporting data, particularly over the internet. It provides flexibility, clarity and simplicity. An XML document may look similar to an HTML document and uses the same style of human-readable tags. However, the tags used to mark up HTML documents are pre-defined: only a limited set of tags can legitimately be included. XML allows you to create your own markup language and define the tags which are legitimate for your data. It does this via a schema document, which can itself be an XML document. The schema document specifies the vocabulary and grammar which can be used within the XML document which contains your information.
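To make the idea of a custom vocabulary concrete, here is a minimal sketch in Python using the standard library's `xml.etree.ElementTree` module. The tags `<library>`, `<book>`, `<title>`, `<author>` and `<year>`, the `isbn` attribute and the `read_books` function are all invented for this illustration; they are not part of any standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: every tag and attribute below was made up
# for this example to describe a small collection of books.
DOCUMENT = """\
<library>
  <book isbn="0000000001">
    <title>Example Title One</title>
    <author>A. Writer</author>
    <year>2001</year>
  </book>
  <book isbn="0000000002">
    <title>Example Title Two</title>
    <author>B. Author</author>
    <year>2008</year>
  </book>
</library>
"""


def read_books(xml_text):
    """Parse the document and return one dictionary per <book> element."""
    root = ET.fromstring(xml_text)
    books = []
    for book in root.findall("book"):
        books.append({
            "isbn": book.get("isbn"),            # attribute value
            "title": book.findtext("title"),     # text of a child element
            "author": book.findtext("author"),
            "year": int(book.findtext("year")),  # convert text to a number
        })
    return books


if __name__ == "__main__":
    for entry in read_books(DOCUMENT):
        print(entry)
```

Because the document is plain text with self-describing tags, any program that knows the vocabulary can read it, whatever software produced it.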
Because you can invent all the rules when creating or generating XML documents, you never have to twist and force your data into a container which was not designed to hold it. You design tags which reflect the nature of your information; you create a schema document which defines the hierarchical structure of that information; and you specify the type of data each element within your document is permitted to contain. In short, if you end up with an XML document which is not suitable for holding your information, you have only yourself to blame!
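The kind of rules a schema expresses (which elements are allowed, how often they may appear, and what type of data each may hold) can be sketched by hand in Python. This is only an illustration of the idea, not a real schema validator: the `BOOK_RULES` table and the `check_book` function are hypothetical, and unlike a real schema language this sketch ignores element order, attributes and nesting deeper than one level.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules for a <book> element: each permitted child tag is
# mapped to a converter that raises ValueError on unsuitable text.
BOOK_RULES = {
    "title": str,
    "author": str,
    "year": int,
}


def check_book(book):
    """Return a list of problems found in one <book> element."""
    problems = []
    if book.tag != "book":
        problems.append(f"expected <book>, found <{book.tag}>")
        return problems

    seen = [child.tag for child in book]

    # Every tag in the vocabulary must appear exactly once with valid content.
    for tag, convert in BOOK_RULES.items():
        if seen.count(tag) != 1:
            problems.append(f"<{tag}> must appear exactly once")
            continue
        text = (book.findtext(tag) or "").strip()
        if not text:
            problems.append(f"<{tag}> is empty")
            continue
        try:
            convert(text)
        except ValueError:
            problems.append(
                f"<{tag}> holds {text!r}, not a valid {convert.__name__}"
            )

    # Tags outside the vocabulary are rejected.
    for tag in seen:
        if tag not in BOOK_RULES:
            problems.append(f"<{tag}> is not part of the vocabulary")
    return problems


if __name__ == "__main__":
    sample = ET.fromstring(
        "<book><title>T</title><year>soon</year><price>5</price></book>"
    )
    for problem in check_book(sample):
        print(problem)
```

In practice you would write these rules once in a schema document and let a validating parser enforce them, rather than coding the checks yourself.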
Posted on March 14th, 2009 in Web Development