What is XML?
XML is a versatile markup language for structuring and transferring data. It features customizable tags, hierarchical structure, and schema definitions. Unlike HTML, XML focuses on data representation rather than display. It's widely used in various applications, including SSO configurations like Logto's SAML implementation.
Extensible Markup Language (XML) uses tags to indicate how text in data files should be structured, stored, and transmitted. XML is designed to be readable by both humans and machines, making it a powerful and highly customizable markup tool.
Work on XML began at the World Wide Web Consortium (W3C) in 1996, and XML 1.0 became a W3C Recommendation in 1998. The W3C's goal was to create a language that could help define document types and provide the ability to create custom tags. The resulting markup language lets you define how data is marked up before sharing it as an XML file with another system. As long as two systems agree on the same XML vocabulary (tags they can both interpret), both systems can "understand" the file. When XML is used properly, a data file can be stored, transmitted, and then read back with exactly the same data and structure each time it is accessed.
XML content example
XML data consists of text in a digital file. Similar to HTML, you create the necessary "code" for XML files by inserting tags to indicate how the text should be interpreted. For example:
This example shows three users. The XML content includes name, username, email, and user level.
This creates an XML document that can be shared and read between identity providers and resource providers.
Importantly, the above example demonstrates the hierarchical nature of XML documents. For instance:
- The first line, <?xml version="1.0" encoding="UTF-8"?>, is the XML declaration, which specifies the version and encoding
- An XML document must have a root element, which in this example is <users>
- All other elements contained within the root element are called "child elements"
- In the above example, there are 3 child elements, marked with the <user> tag
- Within the <user> child elements, there are several other child elements, such as <name>, <username>, <email>, and <level>
- On the <user> tag, there's also an id, which is called an XML attribute. An element cannot contain multiple attributes with the same name
The tags used show what each type of data is, with plain text serving as the data itself. Note also how the content is indented. This is not to help systems process the XML file, but to help humans more easily browse the XML file and its hierarchical order to discover and resolve any errors or omissions.
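The hierarchy described above can be explored with a short Python sketch using the standard library's xml.etree.ElementTree module. The sample values (names, emails, levels) are hypothetical placeholders; only the element names and the id attribute come from the description above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the element names and the id attribute follow
# the description in the text, but every value here is made up.
USERS_XML = """<?xml version="1.0" encoding="UTF-8"?>
<users>
  <user id="1">
    <name>Alice Example</name>
    <username>alice</username>
    <email>alice@example.com</email>
    <level>admin</level>
  </user>
  <user id="2">
    <name>Bob Example</name>
    <username>bob</username>
    <email>bob@example.com</email>
    <level>member</level>
  </user>
  <user id="3">
    <name>Carol Example</name>
    <username>carol</username>
    <email>carol@example.com</email>
    <level>guest</level>
  </user>
</users>
"""


def summarize_users(xml_text):
    """Return the root tag and one dict per <user> child element."""
    # Parse from bytes so the encoding named in the XML declaration is honored.
    root = ET.fromstring(xml_text.encode("utf-8"))
    users = []
    for user in root.findall("user"):  # direct children of the root element
        users.append({
            "id": user.get("id"),  # XML attribute on the <user> tag
            "name": user.findtext("name"),
            "username": user.findtext("username"),
            "email": user.findtext("email"),
            "level": user.findtext("level"),
        })
    return root.tag, users


root_tag, users = summarize_users(USERS_XML)

if __name__ == "__main__":
    print(root_tag, len(users))
    for entry in users:
        print(entry["id"], entry["username"], entry["level"])
```

Because the parser ignores the indentation between elements when looking up children and text, the same function would return the same result for an unindented copy of the document.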
What is XML schema?
A schema acts as the "definition" for an XML document. It describes the key rules and constraints of the file's structure: which elements and attributes may appear, how they nest, and what content and data types they may hold. Validating documents against a shared schema helps preserve data integrity when the XML is exchanged between different applications or systems.
The two main schema languages are Document Type Definition (DTD) and XML Schema Definition (XSD). XSD is the more commonly used of the two because it is itself written in XML, supports namespaces, and offers a rich set of data types.
What is XML syntax?
While schemas such as XSD specify what a particular kind of XML document may contain, XML syntax is the set of rules that governs the overall structure of every XML file. Syntax covers, for example, how text content is written, self-closing elements (elements that carry no content of their own), and the XML declaration (placed at the beginning of a document to state key information such as the XML version and character encoding).
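As a rough illustration of syntax rules in practice, the following Python sketch asks the standard library parser whether a few small documents are well-formed. The sample documents and the helper name is_well_formed are hypothetical, chosen only for this example. Note that this checks syntax only; validating against a DTD or XSD is a separate step that the standard library does not perform.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False if it rejects it."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Hypothetical sample documents, one per syntax rule.
samples = {
    "declaration plus root": b'<?xml version="1.0" encoding="UTF-8"?><note/>',
    "self-closing element": "<note><flag/></note>",
    "explicit empty element": "<note><flag></flag></note>",
    "missing closing tag": "<note><flag></note>",
    "mismatched tag case": "<Note></note>",
    "duplicate attribute": '<user id="1" id="2"/>',
}

results = {label: is_well_formed(doc) for label, doc in samples.items()}

if __name__ == "__main__":
    for label, ok in results.items():
        print(f"{label}: {'well-formed' if ok else 'rejected'}")
```

The first three samples are accepted and the last three are rejected: tags must be closed and matched exactly (names are case-sensitive), and an element cannot repeat an attribute name.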
XML vs HTML
If you're familiar with HTML, the way XML files use markup will look very familiar. The key difference between the two markup languages is their purpose:
- HTML is a Markup Language used to help browsers understand how to display content on screen
- XML is a Markup Language used to store, structure, and transfer data
Besides this, they have some other differences:
- Because HTML is for rendering page content for browsers, its tag types are predefined, while XML tags can be customized according to actual situations
- HTML usually describes content that is meant to be displayed as written, while XML carries data that programs generate, exchange, and process
- HTML doesn't support namespaces, while XML can use namespaces to distinguish tags that might have the same name but different contexts to avoid confusion
For example, two prefixed elements can each be tied to their own namespace:
- tech:title uses http://www.example.com/tech as the namespace, indicating that this <title> element belongs to the tech namespace
- bio:author uses http://www.example.com/bio as the namespace, indicating that this <author> element belongs to the bio namespace
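A minimal Python sketch of how a parser resolves these prefixes is shown here. The surrounding <book> element and the text values are assumptions made for illustration; only the tech and bio prefixes and their namespace URIs come from the description above.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the <book> root and the text values are made up;
# the prefixes and namespace URIs follow the description in the text.
BOOK_XML = """<book xmlns:tech="http://www.example.com/tech"
      xmlns:bio="http://www.example.com/bio">
  <tech:title>Learning XML</tech:title>
  <bio:author>Jane Doe</bio:author>
</book>"""

# Prefix-to-URI mapping used for lookups. The prefixes chosen here only need
# to map to the same URIs; they do not have to match the document's prefixes.
NAMESPACES = {
    "tech": "http://www.example.com/tech",
    "bio": "http://www.example.com/bio",
}

root = ET.fromstring(BOOK_XML)
title = root.find("tech:title", NAMESPACES)
author = root.find("bio:author", NAMESPACES)

if __name__ == "__main__":
    # ElementTree stores each tag as {namespace-uri}local-name.
    print(title.tag)   # {http://www.example.com/tech}title
    print(author.tag)  # {http://www.example.com/bio}author
```

The parser identifies each element by its namespace URI plus its local name, so a <title> in the tech namespace would never be confused with a <title> declared in another namespace.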
How is XML used in Logto?
Logto supports the SAML protocol for SSO. When configuring SSO, the identity provider supplies an XML configuration file that includes important settings such as certificates and single sign-on URLs.
When configuring SAML SSO, Logto lets you upload the XML configuration file obtained from the identity provider directly, eliminating the need to fill in each configuration item manually.
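To give a sense of what such a file contains, here is a minimal Python sketch that reads the certificate and single sign-on URL from a SAML 2.0 identity provider metadata document. This is not Logto's implementation. The sample metadata, the example.com URLs, the placeholder certificate, and the function name read_idp_metadata are all hypothetical, and the sketch assumes the root element is a single EntityDescriptor.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the SAML 2.0 metadata and XML Signature specifications.
NS = {
    "md": "urn:oasis:names:tc:SAML:2.0:metadata",
    "ds": "http://www.w3.org/2000/09/xmldsig#",
}

# Hypothetical metadata: the URLs and the certificate value are placeholders.
SAMPLE_METADATA = """<md:EntityDescriptor
    xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata"
    xmlns:ds="http://www.w3.org/2000/09/xmldsig#"
    entityID="https://idp.example.com/metadata">
  <md:IDPSSODescriptor
      protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
    <md:KeyDescriptor use="signing">
      <ds:KeyInfo>
        <ds:X509Data>
          <ds:X509Certificate>PLACEHOLDER_BASE64_CERTIFICATE</ds:X509Certificate>
        </ds:X509Data>
      </ds:KeyInfo>
    </md:KeyDescriptor>
    <md:SingleSignOnService
        Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect"
        Location="https://idp.example.com/sso"/>
  </md:IDPSSODescriptor>
</md:EntityDescriptor>"""


def read_idp_metadata(xml_text):
    """Extract the entity ID, certificates, and SSO URLs from IdP metadata."""
    root = ET.fromstring(xml_text)
    idp = root.find("md:IDPSSODescriptor", NS)
    if idp is None:
        raise ValueError("no IDPSSODescriptor element found")
    certificates = [
        (cert.text or "").strip()
        for cert in idp.findall(".//ds:X509Certificate", NS)
    ]
    # Map each SAML binding to the URL that accepts sign-on requests for it.
    sso_urls = {
        service.get("Binding"): service.get("Location")
        for service in idp.findall("md:SingleSignOnService", NS)
    }
    return {
        "entity_id": root.get("entityID"),
        "certificates": certificates,
        "sso_urls": sso_urls,
    }


config = read_idp_metadata(SAMPLE_METADATA)

if __name__ == "__main__":
    print(config["entity_id"])
    print(config["sso_urls"])
```

Reading these values from the uploaded file is what removes the manual step: each field that would otherwise be typed into a form is already present in the XML.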