HtmlToText C Library Reference

HtmlToText

A C library for converting HTML to plain text. The conversion does more than strip the tags from an HTML document: its internal processing is considerably more sophisticated than what the simple regular-expression snippets freely available on the Internet can accomplish.
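To illustrate why tag removal alone falls short, here is a minimal sketch of a naive tag stripper. It is not how the library works internally; strip_tags_naive is a hypothetical helper written only for this comparison.

```c
#include <stddef.h>

/* Hypothetical helper, NOT part of the library: copies every character
   that does not lie between '<' and '>' into out (at most outsz - 1
   characters) and returns the number of characters written. */
size_t strip_tags_naive(const char *html, char *out, size_t outsz)
{
    size_t n = 0;
    int in_tag = 0;

    if (outsz == 0) {
        return 0;
    }
    for (; *html != '\0'; ++html) {
        if (*html == '<') {
            in_tag = 1;                 /* start skipping a tag */
        } else if (*html == '>' && in_tag) {
            in_tag = 0;                 /* tag finished, resume copying */
        } else if (!in_tag && n + 1 < outsz) {
            out[n++] = *html;           /* ordinary text outside a tag */
        }
    }
    out[n] = '\0';
    return n;
}
```

Given the input "<p>Tom &amp; Jerry</p>", this sketch yields "Tom &amp; Jerry" with the entity still undecoded, and given "<script>var x=1;</script>Hi" it yields "var x=1;Hi", leaking script source into the text. Handling cases like these is the kind of work that goes beyond tag removal.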

Create/Dispose

HCkHtmlToText CkHtmlToText_Create(void);

Creates an instance of the CkHtmlToText object and returns a handle (i.e. a "void *" pointer). The handle is passed as the first argument to the functions listed on this page.

void CkHtmlToText_Dispose(HCkHtmlToText handle);

Objects created by calling CkHtmlToText_Create must be freed by calling this function. A handle that is never disposed leaks memory.

C "Properties"

void CkHtmlToText_getLastErrorHtml(HCkHtmlToText handle, HCkString retval);

Error information in HTML format for the last method called.

void CkHtmlToText_getLastErrorText(HCkHtmlToText handle, HCkString retval);

Error information in plain-text format for the last method called.

void CkHtmlToText_getLastErrorXml(HCkHtmlToText handle, HCkString retval);

Error information in XML format for the last method called.

BOOL CkHtmlToText_getUtf8(HCkHtmlToText handle);
void CkHtmlToText_putUtf8(HCkHtmlToText handle, BOOL newVal);

To be documented soon...

C "Methods"

BOOL CkHtmlToText_IsUnlocked(HCkHtmlToText handle);

Returns true if the component is already unlocked. Otherwise returns false.

BOOL CkHtmlToText_ReadFileToString(HCkHtmlToText handle, const char *filename, const char *srcCharset, HCkString str);

Convenience method for reading a text file into a string. The character encoding of the text file is specified by srcCharset. Valid values, such as "iso-8859-1" or "utf-8", are listed at: List of Charsets.

BOOL CkHtmlToText_SaveLastError(HCkHtmlToText handle, const char *filename);

Saves the last error information to an XML formatted file.

BOOL CkHtmlToText_ToText(HCkHtmlToText handle, const char *html, HCkString outStr);

Converts HTML to plain-text.

BOOL CkHtmlToText_UnlockComponent(HCkHtmlToText handle, const char *code);

Unlocks the component. An arbitrary unlock code may be passed to automatically begin a 30-day trial.

This class is included with the Chilkat HTML-to-XML conversion component license. A permanent unlock code for Chilkat HTML-to-XML should be used to unlock this object.

BOOL CkHtmlToText_WriteStringToFile(HCkHtmlToText handle, const char *str, const char *filename, const char *charset);

Convenience method for saving a string to a file. The character encoding of the output text file is specified by charset (the string is converted to this charset when writing). Valid values, such as "iso-8859-1" or "utf-8", are listed at: List of Charsets.

const char *CkHtmlToText_lastErrorHtml(HCkHtmlToText handle);

Error information in HTML format for the last method called.

const char *CkHtmlToText_lastErrorText(HCkHtmlToText handle);

Error information in plain-text format for the last method called.

const char *CkHtmlToText_lastErrorXml(HCkHtmlToText handle);

Error information in XML format for the last method called.

const char *CkHtmlToText_readFileToString(HCkHtmlToText handle, const char *filename, const char *srcCharset);

Convenience method for reading a text file into a string. The character encoding of the text file is specified by srcCharset. Valid values, such as "iso-8859-1" or "utf-8", are listed at: List of Charsets.

const char *CkHtmlToText_toText(HCkHtmlToText handle, const char *html);

Converts HTML to plain-text.
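
The functions above can be combined into a create, unlock, convert, dispose sequence. The following is a minimal sketch rather than an official example. It rests on several assumptions not stated on this page: that the header is named C_CkHtmlToText.h, that CkHtmlToText_toText returns NULL on failure, and that the returned pointer is owned by the object (so it is copied before the handle is disposed). The HAVE_CHILKAT macro, the stand-in stubs, and the html_to_text_copy helper are hypothetical and exist only so the sketch compiles without the library.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#ifdef HAVE_CHILKAT
#include <C_CkHtmlToText.h>   /* assumed header name */
#else
/* Hypothetical stand-ins so the sketch compiles without the library.
   They mirror only the signatures listed on this page and perform no
   real conversion. */
typedef void *HCkHtmlToText;
typedef int BOOL;
HCkHtmlToText CkHtmlToText_Create(void) { return malloc(1); }
void CkHtmlToText_Dispose(HCkHtmlToText h) { free(h); }
BOOL CkHtmlToText_UnlockComponent(HCkHtmlToText h, const char *code)
{ (void)h; return code != NULL; }
const char *CkHtmlToText_toText(HCkHtmlToText h, const char *html)
{ (void)h; return html != NULL ? "stub text" : NULL; }
const char *CkHtmlToText_lastErrorText(HCkHtmlToText h)
{ (void)h; return "stub error"; }
#endif

/* Hypothetical helper: converts html and returns a heap copy that the
   caller must free(), or NULL if any step fails. */
char *html_to_text_copy(const char *html, const char *unlock_code)
{
    HCkHtmlToText h = CkHtmlToText_Create();
    char *copy = NULL;

    if (h == NULL) {
        return NULL;
    }
    if (!CkHtmlToText_UnlockComponent(h, unlock_code)) {
        fprintf(stderr, "%s\n", CkHtmlToText_lastErrorText(h));
    } else {
        const char *text = CkHtmlToText_toText(h, html);
        if (text == NULL) {          /* assumed failure convention */
            fprintf(stderr, "%s\n", CkHtmlToText_lastErrorText(h));
        } else {
            size_t len = strlen(text);
            copy = malloc(len + 1);  /* copy before the handle goes away */
            if (copy != NULL) {
                memcpy(copy, text, len + 1);
            }
        }
    }
    CkHtmlToText_Dispose(h);         /* every Create is paired with a Dispose */
    return copy;
}
```

A caller would invoke html_to_text_copy(html, code), use the returned string, and then free() it. Routing every exit path through the single CkHtmlToText_Dispose call avoids the memory leak described under Create/Dispose.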