HTML-to-XML Component Features
The Chilkat HTML-to-XML component transforms HTML into well-formed XML for parsing. In effect, it is an HTML parser / scraper. Once HTML is converted to XHTML (i.e. well-formed XML), the many existing XML parsing components and libraries can be used for HTML parsing and scraping.
- File-to-file HTML to XML conversion.
- Memory-to-memory HTML to XML conversion.
- Convert character encoding during conversion process.
- Flexibility in controlling how HTML entities are handled.
- Automatically convert HTML entities to corresponding 8-bit characters.
- Optionally drop all text formatting tags from the output.
- Drop/undrop specific tags from the output.
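
To make the idea concrete, here is a minimal sketch of the same workflow using only the Python standard library. It is not the Chilkat component and does not use its API; it is a stand-in that shows memory-to-memory conversion of loosely written HTML into well-formed XML, conversion of HTML entities to characters, and optional dropping of text formatting tags. The names HtmlToXmlSketch, html_to_xml, VOID_TAGS, and FORMATTING_TAGS, and the particular tag sets, are assumptions made for this illustration.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical tag sets chosen for this sketch, not taken from the component.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}
FORMATTING_TAGS = {"b", "i", "u", "em", "strong", "font"}


class HtmlToXmlSketch(HTMLParser):
    """Builds an ElementTree from loosely written HTML (illustrative only)."""

    def __init__(self, drop_formatting=False):
        # convert_charrefs=True turns entities such as &eacute; into characters.
        super().__init__(convert_charrefs=True)
        self.drop_formatting = drop_formatting
        self.root = ET.Element("root")
        self.stack = [self.root]  # currently open elements, innermost last

    def handle_starttag(self, tag, attrs):
        if self.drop_formatting and tag in FORMATTING_TAGS:
            return  # drop the tag itself but keep the text inside it
        elem = ET.SubElement(self.stack[-1], tag, {k: v or "" for k, v in attrs})
        if tag not in VOID_TAGS:
            self.stack.append(elem)  # void elements never stay open

    def handle_endtag(self, tag):
        if self.drop_formatting and tag in FORMATTING_TAGS:
            return
        # Close everything up to the nearest matching open element.
        # Stray end tags with no matching open element are ignored.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        parent = self.stack[-1]
        if len(parent):
            # Text after a child element belongs to that child's tail.
            last = parent[-1]
            last.tail = (last.tail or "") + data
        else:
            parent.text = (parent.text or "") + data


def html_to_xml(html, drop_formatting=False):
    """Convert an HTML string to a well-formed XML string (sketch)."""
    parser = HtmlToXmlSketch(drop_formatting)
    parser.feed(html)
    parser.close()
    return ET.tostring(parser.root, encoding="unicode")


if __name__ == "__main__":
    # Unclosed <b> and <br>, plus an HTML entity, still yield parseable XML.
    xml_text = html_to_xml("<p>Caf&eacute; <b>bold<br>text</p>")
    doc = ET.fromstring(xml_text)  # any XML parser can now be used
    print(doc.find("p/b").text)
```

Once the output is well-formed, ordinary XML tooling (here xml.etree.ElementTree) can query it, which is the point of converting first and scraping second. A production converter handles many more cases (implied end tags, character encodings, scripts, namespaces) than this sketch attempts.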
Privacy Statement. Copyright 2000-2008 Chilkat Software, Inc. All rights reserved.
Send feedback to support@chilkatsoft.com. Components for Microsoft Windows XP, 2000, 2003 Server, Vista, and Windows 95/98/NT4.