0 bytes

About the XML Formatter & Validator

XML (eXtensible Markup Language) is a markup language for storing and transporting structured data. Unlike HTML, XML tags are not predefined — you define your own element names to describe your data. XML is the foundation of SVG, RSS, SOAP web services, Microsoft Office file formats (.docx, .xlsx), Android layout files, and many enterprise data standards.

Well-formed vs valid XML

Most common XML errors

Unclosed tags, unescaped ampersands (write &amp; not &), multiple root elements, mismatched tag names (XML is case-sensitive: <Item> differs from <item>), and missing quotes around attribute values are the most frequent mistakes that cause XML parsers to fail.

Real-world XML formats you encounter

XML is far from obsolete — it underlies many formats you use daily. Understanding these helps when debugging or transforming data.

Frequently Asked Questions

What is the difference between XML and HTML?
HTML is a fixed-tag language for web pages with lenient parsing. XML is strict and self-describing with custom tags you define yourself. XML requires all tags to be closed, is case-sensitive, and stops parsing entirely on a syntax error. HTML is designed to be forgiving; XML is designed to be precise.
What is the difference between XML and JSON?
Both represent structured data. JSON is simpler, more compact, and native to JavaScript, making it dominant for REST APIs. XML supports attributes, namespaces, schemas, and mixed content (text and elements interleaved), making it more flexible for document-oriented formats. XML remains standard in enterprise systems, SOAP services, and document formats like .docx.
How do I escape special characters in XML?
Replace & with &amp;, < with &lt;, > with &gt;, " with &quot;, and ' with &apos;. The mandatory escapes in text content are & and <. In attribute values, also escape the delimiter quote character.
What is an XML namespace?
A namespace prevents name collisions when combining XML from multiple schemas. Declared with xmlns: <root xmlns:h="http://example.com/html">. The URL is just a unique identifier — it does not need to resolve to anything. Namespaces are essential in compound document formats like OOXML (.docx).
How do I format XML on the command line?
Using xmllint (Linux/Mac): xmllint --format input.xml. Using Python: python3 -c "import xml.dom.minidom,sys; print(xml.dom.minidom.parse(sys.argv[1]).toprettyxml())" input.xml. Using pandoc: pandoc -f html -t plain input.xml.
Related tools
Ad