HTML to XML Conversion Sample #2
This is the second of several examples showing in detail how the Chilkat HTML-to-XML library converts HTML into well-formed XML.
Here is another HTML sample:
<html>
<head>
<title>This is a test</title>
<meta http-equiv="Content-Language" content="en-us">
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
</head>
<body>
<ul>
<li>A quibus tantum dissentio
<li>Sophocles vel optime scripserit Electram
<li>tamen male conversam Atilii mihi legendam putem
<li>de quo Lucilius: 'ferreum scriptorem', verum, opinor, scriptorem tamen
</ul>
</body>
</html>
The XML output is shown below.
- Unclosed tags, such as <li>, are automatically closed.
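- All text content is placed under <text> nodes.
- Empty elements, such as <meta>, are given explicit closing tags.
- The converted document is wrapped in a <root> element.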
<?xml version="1.0" encoding="windows-1252" ?>
<root>
<html>
<head>
<title>
<text>This is a test</text>
</title>
<meta http-equiv="Content-Language" content="en-us"></meta>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252"></meta>
</head>
<body>
<ul>
<li>
<text>A quibus tantum dissentio
</text>
</li>
<li>
<text>Sophocles vel optime scripserit Electram
</text>
</li>
<li>
<text>tamen male conversam Atilii mihi legendam putem
</text>
</li>
<li>
<text>de quo Lucilius: 'ferreum scriptorem', verum, opinor, scriptorem tamen
</text>
</li>
</ul>
</body>
</html>
</root>
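The conversion rules visible in this output (an open <li> is closed when the next <li> or the enclosing </ul> arrives, text is placed under <text> nodes, and everything is wrapped in <root>) can be illustrated with a small sketch built only on the Python standard library. This is not Chilkat's implementation or API. The names HtmlToXmlSketch, convert, and VOID_TAGS are hypothetical, and the handling of unclosed tags is deliberately simplified to the <li> case shown in this sample.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical sketch, not Chilkat's code: tags that never have content
# and therefore are not kept open on the element stack.
VOID_TAGS = {"meta", "br", "hr", "img", "input", "link"}


class HtmlToXmlSketch(HTMLParser):
    """Builds an ElementTree from loosely written HTML."""

    def __init__(self):
        super().__init__()
        self.root = ET.Element("root")   # wrapper element, as in the sample output
        self.stack = [self.root]         # currently open elements

    def handle_starttag(self, tag, attrs):
        # A new <li> implicitly closes an <li> that is still open.
        if tag == "li" and self.stack[-1].tag == "li":
            self.stack.pop()
        attributes = {name: value or "" for name, value in attrs}
        node = ET.SubElement(self.stack[-1], tag, attributes)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest matching open element together with any
        # unclosed descendants (for example a trailing <li> before </ul>).
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        # Text is stored under a <text> child; whitespace-only runs are skipped.
        if data.strip():
            ET.SubElement(self.stack[-1], "text").text = data


def convert(html):
    """Return the <root> element produced from an HTML string."""
    parser = HtmlToXmlSketch()
    parser.feed(html)
    parser.close()
    return parser.root


if __name__ == "__main__":
    sample = "<html><body><ul><li>first item\n<li>second item\n</ul></body></html>"
    print(ET.tostring(convert(sample), encoding="unicode"))
```

Keeping a stack of open elements is what allows an unclosed tag to be closed at the right point: the parser always knows which element is innermost when a new tag or an end tag arrives. A production converter has to handle many more implicit-close cases (<p>, <td>, <option>, and so on) than this sketch does.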
(The Chilkat HTML-to-XML API is offered across many programming languages: Ruby, Perl, Python, Java, C#, VB.NET, etc.)