
HTML to XML Conversion Sample #3

Go to Sample #1

Go to Sample #2

Go to Sample #4

This is the third of several examples showing how the Chilkat HTML-to-XML library converts HTML into well-formed XML.

Here is another HTML sample. It contains several errors (a missing </td>, stray </abc> end tags, and a missing final </tr>), which the HTML-to-XML library corrects automatically:

<html>
<head>
<title>This is a test</title>
<meta http-equiv="Content-Language" content="en-us">
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
</head>
<body>
<table>
<tr>
<td>Row 1, column 1</td>
<td>Row 1, column 2</td>
<td>Row 1, column 3 Oops forgot the ending td
</tr>
<tr>
<td>Row 2, column 1 Oops...</abc>
<td>Row 2, column 2</td>
<td>Row 2, column 3</td>
</tr>
<tr>
<td>Row 2, column 1 Oops...</abc>
<td>Row 2, <div> This is a test </div> column 2</td>
<td>Row 2, column 3</td>
<!-- Oops, forgot to close the last tr -->
</table>

</body>
</html>
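
The kinds of errors planted in the sample can be located programmatically. The sketch below is not part of the Chilkat library; it uses only Python's standard html.parser module, and the names TagBalanceChecker, find_tag_problems and VOID_TAGS are hypothetical helpers introduced for illustration. It assumes a simple policy: an end tag with no matching open tag is reported as stray, and any tag still open when an enclosing end tag arrives is reported as unclosed.

```python
from html.parser import HTMLParser

# Elements that never take an end tag in HTML (partial list, enough for the sample).
VOID_TAGS = {"meta", "br", "hr", "img", "input", "link"}


class TagBalanceChecker(HTMLParser):
    """Collects unclosed and stray tags while scanning an HTML string."""

    def __init__(self):
        super().__init__()
        self.stack = []     # tags that are currently open
        self.problems = []  # human-readable findings, in document order

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return  # e.g. the implicit end of <meta ... />
        if tag not in self.stack:
            self.problems.append(f"stray </{tag}>")
            return
        # Anything opened after the matching tag was never closed.
        while self.stack[-1] != tag:
            self.problems.append(f"<{self.stack.pop()}> not closed before </{tag}>")
        self.stack.pop()


def find_tag_problems(html):
    """Return a list of tag-balance problems found in the given HTML."""
    checker = TagBalanceChecker()
    checker.feed(html)
    checker.close()
    return checker.problems


if __name__ == "__main__":
    fragment = "<table><tr><td>one</td><td>two\n</tr><tr><td>three</abc></td>\n</table>"
    for problem in find_tag_problems(fragment):
        print(problem)
```

Tags still open at the end of the input are not reported by this sketch; only mismatches detected at an end tag are listed.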

The XML output is shown below.

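  • The XML below is well-formed: missing end tags have been supplied and the stray </abc> end tags have been dropped.
  • In this output the second and third rows appear nested inside the first <tr> rather than beside it.
  • <meta> elements, which have no end tag in the HTML, are written with an explicit </meta>.
  • HTML comments are saved within <comment> nodes.
  • All text content is placed under <text> nodes.
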
<?xml version="1.0" encoding="windows-1252" ?>

<root>
    <html>
        <head>
            <title>
                <text>This is a test</text>
            </title>
            <meta http-equiv="Content-Language" content="en-us"></meta>
            <meta http-equiv="Content-Type" content="text/html; charset=windows-1252"></meta>
        </head>
        <body>
            <table>
                <tr>
                    <td>
                        <text>Row 1, column 1</text>
                    </td>
                    <td>
                        <text>Row 1, column 2</text>
                    </td>
                    <td>
                        <text>Row 1, column 3 Oops forgot the ending td
                        </text>
                    </td>
                    <tr>
                        <td>
                            <text>Row 2, column 1 Oops...</text>
                        </td>
                        <td>
                            <text>Row 2, column 2</text>
                        </td>
                        <td>
                            <text>Row 2, column 3</text>
                        </td>
                    </tr>
                    <tr>
                        <td>
                            <text>Row 2, column 1 Oops...</text>
                        </td>
                        <td>
                            <text>Row 2, </text>
                            <div>
                                <text>This is a test </text>
                            </div>
                            <text>column 2</text>
                        </td>
                        <td>
                            <text>Row 2, column 3</text>
                        </td>
                        <comment>Oops, forgot to close the last tr</comment>
                    </tr>
                </tr>
            </table>
        </body>
    </html>
</root>
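
The conventions visible in the output above (a <root> wrapper, text under <text> nodes, comments under <comment> nodes, stray end tags dropped) can be imitated in a few lines. The following is a minimal sketch using only Python's standard library, not Chilkat's implementation or API; HtmlToXmlSketch, html_to_xml and VOID_TAGS are hypothetical names. Its recovery policy is an assumption: an end tag closes the nearest matching open element along with anything left open inside it, so unlike the output above it places table rows side by side, and it writes empty elements in the short <meta ... /> form.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Elements that never take an end tag in HTML (partial list, enough for the sample).
VOID_TAGS = {"meta", "br", "hr", "img", "input", "link"}


class HtmlToXmlSketch(HTMLParser):
    """Builds an XML tree from lenient HTML, following the conventions shown above."""

    def __init__(self):
        super().__init__()
        self.root = ET.Element("root")
        self.open_elements = [self.root]  # the last entry is the current parent

    def handle_starttag(self, tag, attrs):
        attributes = {name: value or "" for name, value in attrs}
        element = ET.SubElement(self.open_elements[-1], tag, attributes)
        if tag not in VOID_TAGS:
            self.open_elements.append(element)

    def handle_endtag(self, tag):
        open_tags = [element.tag for element in self.open_elements[1:]]
        if tag not in open_tags:
            return  # stray end tag such as </abc>: ignore it
        # Close the matching element and everything left open inside it.
        while self.open_elements.pop().tag != tag:
            pass

    def handle_data(self, data):
        if data.strip():  # skip whitespace-only runs between tags
            ET.SubElement(self.open_elements[-1], "text").text = data

    def handle_comment(self, data):
        ET.SubElement(self.open_elements[-1], "comment").text = data.strip()


def html_to_xml(html):
    """Return a well-formed XML string for the given HTML."""
    parser = HtmlToXmlSketch()
    parser.feed(html)
    parser.close()  # elements still open are closed implicitly by the tree
    return ET.tostring(parser.root, encoding="unicode")


if __name__ == "__main__":
    print(html_to_xml("<table><tr><td>a</td><td>b</abc><!-- note --></table>"))
```

Because the tree is built with ElementTree, the result is well-formed by construction, whatever the input looks like.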

(The Chilkat HTML-to-XML API is offered across many programming languages: Ruby, Perl, Python, Java, C#, VB.NET, etc.)


Copyright 2000-2008 Chilkat Software, Inc. All rights reserved.
Send feedback to support@chilkatsoft.com

Components for Microsoft Windows XP, 2000, 2003 Server, Vista, and Windows 95/98/NT4.
