CkString Tcl Reference Documentation

CkString

The Chilkat string class.

Object Creation

set myCkString [new CkString]

Properties

NumArabic (integer)

set intVal [CkString_get_NumArabic $myCkString]

Introduced in version 9.5.0.25

The number of Arabic characters contained in this string.

NumAscii (integer)

set intVal [CkString_get_NumAscii $myCkString]

Introduced in version 9.5.0.25

The number of us-ascii characters contained in this string.

NumCentralEuro (integer)

set intVal [CkString_get_NumCentralEuro $myCkString]

Introduced in version 9.5.0.25

The number of Central European and Eastern European characters found in this string. These are characters specific to Polish, Czech, Slovak, Hungarian, Slovene, Croatian, Serbian (Latin script), Romanian and Albanian.

NumChinese (integer)

set intVal [CkString_get_NumChinese $myCkString]

Introduced in version 9.5.0.25

The number of Chinese characters contained in this string.

NumCyrillic (integer)

set intVal [CkString_get_NumCyrillic $myCkString]

Introduced in version 9.5.0.25

The number of Cyrillic characters contained in this string. The Cyrillic alphabet (also called azbuka, from the old name of the first two letters) is actually a family of alphabets, subsets of which are used by certain East and South Slavic languages (Belarusian, Bulgarian, Macedonian, Russian, Rusyn, Serbian and Ukrainian) as well as many other languages of the former Soviet Union, Asia and Eastern Europe.

NumGreek (integer)

set intVal [CkString_get_NumGreek $myCkString]

Introduced in version 9.5.0.25

The number of Greek characters contained in this string.

NumHebrew (integer)

set intVal [CkString_get_NumHebrew $myCkString]

Introduced in version 9.5.0.25

The number of Hebrew characters contained in this string.

NumJapanese (integer)

set intVal [CkString_get_NumJapanese $myCkString]

Introduced in version 9.5.0.25

The number of Japanese characters contained in this string.

NumKorean (integer)

set intVal [CkString_get_NumKorean $myCkString]

Introduced in version 9.5.0.25

The number of Korean characters contained in this string.

NumLatin (integer)

set intVal [CkString_get_NumLatin $myCkString]

Introduced in version 9.5.0.25

The number of Latin characters contained in this string. Latin characters include all major Western European languages, such as German, Spanish, French, Italian, Nordic languages, etc.

NumThai (integer)

set intVal [CkString_get_NumThai $myCkString]

Introduced in version 9.5.0.25

The number of Thai characters contained in this string.
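To make the idea of per-script character counts concrete, here is a minimal pure-Tcl sketch that counts characters falling inside a few Unicode blocks. It does not call Chilkat, and the block ranges are an assumption: the rules Chilkat actually uses to classify characters are not stated in this reference. The countScript proc name is hypothetical.

```tcl
# Pure-Tcl sketch; does not call Chilkat.
# Assumption: each script is approximated by its main Unicode block.
proc countScript {s script} {
    set ranges {
        Arabic   {\u0600-\u06FF}
        Cyrillic {\u0400-\u04FF}
        Greek    {\u0370-\u03FF}
        Hebrew   {\u0590-\u05FF}
        Thai     {\u0E00-\u0E7F}
    }
    set range [dict get $ranges $script]
    # regexp -all with no match variables returns the number of matches.
    return [regexp -all -- "\[$range\]" $s]
}

# Three Latin letters, two Cyrillic letters, one Greek letter.
set sample "abc \u0430\u0431 \u03b1"
puts "Cyrillic: [countScript $sample Cyrillic]"
puts "Greek:    [countScript $sample Greek]"
puts "Thai:     [countScript $sample Thai]"
```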

Methods

# str is a string
CkString_append $str

Appends str to the end of this instance.

# str is a string
CkString_appendAnsi $str

Appends an ANSI string to the end of this instance. str should always be a null terminated ANSI string regardless of the Utf8 property setting.

# c is a char
CkString_appendChar $c

Appends a single ANSI character to the end of this instance.

CkString_appendCurrentDateRfc822

Appends the current date/time to the end of this instance. The date/time is formatted according to the RFC822 standard, which is the typical format used in the "Date" header field of email. For example: "Fri, 27 Jul 2012 17:41:41 -0500"
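The RFC822 layout described above can be reproduced with Tcl's built-in clock command. The following is only an illustrative sketch and does not call Chilkat; the rfc822Date proc name and its optional timezone argument are hypothetical.

```tcl
# Pure-Tcl sketch; does not call Chilkat.
# Formats an epoch time in the RFC 822 style used by the email "Date" header.
proc rfc822Date {seconds {tz :UTC}} {
    return [clock format $seconds \
        -format {%a, %d %b %Y %H:%M:%S %z} -timezone $tz]
}

# 1343428901 is the same instant as "Fri, 27 Jul 2012 17:41:41 -0500".
puts [rfc822Date 1343428901]        ;# Fri, 27 Jul 2012 22:41:41 +0000
puts [rfc822Date [clock seconds]]   ;# the current time, in UTC
```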

# str is a string
# charsetEncoding is a string
CkString_appendEnc $str $charsetEncoding

Appends a string of any character encoding to the end of this instance. Examples of charsetEncoding are: Shift_JIS, windows-1255, iso-8859-2, gb2312, etc. The str should point to a null-terminated string that uses the charset specified by charsetEncoding.

Supported Character Encodings

# byteData is a CkByteData
# numBytes is an integer
CkString_appendHexData $byteData $numBytes

Converts the binary data to a hexadecimal string representation and appends it to the end of this instance.

# n is an integer
CkString_appendInt $n

Appends the decimal string representation of an integer to the end of this instance.

# str is a string
# numBytes is an integer
CkString_appendN $str $numBytes

Appends N bytes of character data to the end of this instance. If the Utf8 property is set to 1, then str should point to characters in the utf-8 encoding, otherwise it should point to characters using the ANSI encoding. Note: numBytes is not necessarily the number of characters. It is the length, in bytes, of the string to be appended. This method exists to allow for non-null terminated strings to be appended.
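The difference between a byte count and a character count is easy to see in plain Tcl. This sketch does not call Chilkat; it only shows why numBytes can be larger than the number of characters when the data is utf-8.

```tcl
# Pure-Tcl sketch; does not call Chilkat.
set s "caf\u00e9"                                          ;# 4 characters
set numChars [string length $s]
set numBytes [string length [encoding convertto utf-8 $s]]  ;# utf-8 byte count
puts "characters: $numChars"   ;# characters: 4
puts "utf-8 bytes: $numBytes"  ;# utf-8 bytes: 5
```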

# wideStr is a utf-16 string
# numChars is an integer
CkString_appendNU $wideStr $numChars

Appends N Unicode characters to the end of this instance. wideStr points to a Unicode string that uses 2 bytes per character. numChars is the number of Unicode characters to be appended (not the number of bytes).

# numBytes is an integer
# encoding is a string
CkString_appendRandom $numBytes $encoding

Appends numBytes random bytes to the end of this instance. Because arbitrary byte values in the range 0 to 255 do not necessarily represent valid characters, the bytes must be encoded to a string friendly representation such as hex, base64, etc. The encoding specifies the encoding to be used. Possible values are "hex", "base64", "quoted-printable", "asc", or "url".
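The reason for the encoding argument can be sketched in pure Tcl (8.6 or later): raw random bytes are generated first and then turned into printable text. This does not call Chilkat, handles only "hex" and "base64", and uses Tcl's rand(), which is not a cryptographic generator. The randomEncoded proc name is hypothetical.

```tcl
# Pure-Tcl sketch (Tcl 8.6+); does not call Chilkat.
# Assumption: rand() is used only for illustration; it is not secure.
proc randomEncoded {numBytes encoding} {
    set bytes ""
    for {set i 0} {$i < $numBytes} {incr i} {
        # Append one random byte in the range 0 to 255.
        append bytes [binary format c [expr {int(rand() * 256)}]]
    }
    switch -- $encoding {
        hex     { return [binary encode hex $bytes] }
        base64  { return [binary encode base64 $bytes] }
        default { error "encoding not handled by this sketch: $encoding" }
    }
}

puts [randomEncoded 8 hex]      ;# 16 hex digits
puts [randomEncoded 8 base64]   ;# 12 base64 characters
```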

# strObj is a CkString
CkString_appendStr $strObj

Appends the contents of strObj to the end of this instance.

# unicode is a utf-16 string
CkString_appendU $unicode

Appends a Unicode string to the CkString object.

# str is a string
CkString_appendUtf8 $str

Appends a utf-8 string to the existing contents of this instance. str should always be a null terminated utf-8 string regardless of the Utf8 property setting.

# charsetEncoding is a string
CkString_base64Decode $charsetEncoding

In-place base64 decodes the string and interprets the results according to the character encoding specified.

Supported Character Encodings

# charsetEncoding is a utf-16 string
CkString_base64DecodeW $charsetEncoding

The utf-16 version of base64Decode.

# charsetEncoding is a string
CkString_base64Encode $charsetEncoding

In-place base64 encodes the string. Internally, the string is first converted to the character encoding specified and then base-64 encoded. Typical charsetEncoding values are "utf-8", "ANSI", "iso-8859-1", etc.
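The two-step process (convert to a charset, then base64 encode the bytes) can be illustrated in pure Tcl 8.6 or later. This sketch does not call Chilkat. The base64OfString proc name is hypothetical, and Tcl's own encoding names are used (for example "iso8859-1" rather than "iso-8859-1").

```tcl
# Pure-Tcl sketch (Tcl 8.6+); does not call Chilkat.
proc base64OfString {s charset} {
    # Step 1: convert the characters to bytes in the requested charset.
    set bytes [encoding convertto $charset $s]
    # Step 2: base64 encode those bytes.
    return [binary encode base64 $bytes]
}

# The same text yields different base64 output for different charsets.
set s "h\u00e9llo"
puts [base64OfString $s utf-8]       ;# aMOpbGxv
puts [base64OfString $s iso8859-1]   ;# aOlsbG8=
```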

Supported Character Encodings

# charsetEncoding is a utf-16 string
CkString_base64EncodeW $charsetEncoding

The utf-16 version of base64Encode.

# substr is a string
set retBool [CkString_beginsWith $substr]

Returns 1 if this string begins with substr (case sensitive), otherwise returns 0.

# strObj is a CkString
set retBool [CkString_beginsWithStr $strObj]

Returns 1 if the string begins with the contents of strObj. Otherwise returns 0. This method is case sensitive.

# str is a utf-16 string
set retBool [CkString_beginsWithW $str]

The utf-16 version of beginsWith.

# idx is an integer
set retChar [CkString_charAt $idx]

Returns the ANSI character at a specified index. The first character is at index 0.

# idx is an integer
set utf16_char [CkString_charAtU $idx]

Returns the Nth character as a Unicode character.

# ch is a char
CkString_chopAtFirstChar $ch

Finds the first occurrence of ch and discards ch and all characters following it.

# subStrObj is a CkString
CkString_chopAtStr $subStrObj

Finds the first occurrence of a substring and chops the string at that point. The result is that the substring and all subsequent characters are removed from the string.

CkString_clear

Clears the string. The string contains 0 characters after calling this method.

# returns a CkString
set ret_ckString [CkString_clone]

Creates a copy of the string. As with any newly created Chilkat object instance returned by a Chilkat method, the returned CkString object must be deleted by the calling application.

# str is a CkString
set retInt [CkString_compareStr $str]

Compares two strings. A return value of 0 means they are equal. A return value of 1 indicates that the calling object is lexicographically less than the argument. A return value of -1 indicates that the calling object is lexicographically greater than the argument.

# substr is a string
set retBool [CkString_containsSubstring $substr]

Returns 1 if the string contains the specified substring, otherwise returns 0. The string comparison is case-sensitive.

# substr is a string
set retBool [CkString_containsSubstringNoCase $substr]

Same as containsSubstring except the matching is case insensitive.

# substr is a utf-16 string
set retBool [CkString_containsSubstringNoCaseW $substr]

The utf-16 version of containsSubstringNoCase.

# substr is a utf-16 string
set retBool [CkString_containsSubstringW $substr]

The utf-16 version of containsSubstring.

# ch is a char
set retInt [CkString_countCharOccurances $ch]

Returns the number of occurrences of the specified ANSI char.

CkString_decodeXMLSpecial

Decodes XML special characters. For example, &lt; is converted to '<'

set retDouble [CkString_doubleValue]

Converts the string to a double and returns the value.

# ansiChar is a char
# startIndex is an integer
CkString_eliminateChar $ansiChar $startIndex

Eliminates all occurrences of a particular ANSI character.

CkString_encodeXMLSpecial

Encodes XML special characters. For example, '<' is converted to &lt;
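A rough pure-Tcl equivalent of XML special-character encoding and decoding is shown here. It does not call Chilkat, and the set of characters handled (the five predefined XML entities) is an assumption, since this reference gives only the '<' example. The xmlEscape and xmlUnescape proc names are hypothetical.

```tcl
# Pure-Tcl sketch; does not call Chilkat.
# Assumption: only the five predefined XML entities are handled.
proc xmlEscape {s} {
    set map [list & "&amp;" < "&lt;" > "&gt;" \" "&quot;" ' "&apos;"]
    # string map makes a single pass, so inserted "&" characters are not re-escaped.
    return [string map $map $s]
}

proc xmlUnescape {s} {
    set map [list "&lt;" < "&gt;" > "&quot;" \" "&apos;" ' "&amp;" &]
    return [string map $map $s]
}

puts [xmlEscape {a < b & c}]                ;# a &lt; b &amp; c
puts [xmlUnescape [xmlEscape {a < b & c}]]  ;# a < b & c
```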

# substr is a string
set retBool [CkString_endsWith $substr]

Returns 1 if the string ends with substr (case-sensitive). Otherwise returns 0.

# substrObj is a CkString
set retBool [CkString_endsWithStr $substrObj]

Returns 1 if the string ends with the specified substring, otherwise returns 0.

# s is a utf-16 string
set retBool [CkString_endsWithW $s]

The utf-16 version of endsWith.

CkString_entityDecode

Decodes any HTML entities found within the string, replacing them with the characters represented.

CkString_entityEncode

HTML encodes any characters that are special to HTML or cannot be represented by 7-bit us-ascii.

# str is a string
set retBool [CkString_equals $str]

Returns 1 if the strings are equal, otherwise returns 0. (case-sensitive)

# str is a string
set retBool [CkString_equalsIgnoreCase $str]

Returns 1 if the strings are equal, otherwise returns 0. (case-insensitive)

# strObj is a CkString
set retBool [CkString_equalsIgnoreCaseStr $strObj]

Returns 1 if the strings are equal, otherwise returns 0. (case-insensitive)

# s is a utf-16 string
set retBool [CkString_equalsIgnoreCaseW $s]

The utf-16 version of equalsIgnoreCase.

# strObj is a CkString
set retBool [CkString_equalsStr $strObj]

Returns 1 if the strings are equal, otherwise returns 0. (case-sensitive)

# s is a utf-16 string
set retBool [CkString_equalsW $s]

The utf-16 version of the "equals" method.

# returns a CkString
# idx is an integer
set ret_ckString [CkString_getChar $idx]

Returns a new CkString object containing the Nth character. (Note, it does not contain the Nth byte, but the Nth character.) For languages such as Chinese, Japanese, etc. individual characters are represented by multiple or varying number of bytes.

# encoding is a string
set retStr [CkString_getEnc $encoding]

Returns the string as a null-terminated string in the character encoding specified by encoding.

Returns NULL on failure

set retInt [CkString_getNumChars]

Returns the number of characters in the string.

set retInt [CkString_getSizeAnsi]

Returns the size, in bytes, of the ANSI encoding of the string.

set retInt [CkString_getSizeUnicode]

Returns the size, in bytes, of the Unicode encoding of the string.

set retInt [CkString_getSizeUtf8]

Returns the size, in bytes, of the utf-8 encoding of the string.

set retStr [CkString_getString]

Returns the contents of this instance.

Returns NULL on failure

set retStr [CkString_getStringAnsi]

Returns the string as null-terminated ANSI.

Returns NULL on failure

set retStr [CkString_getStringUtf8]

Returns the string as null-terminated utf-8.

Returns NULL on failure

set utf16_text [CkString_getUnicode]

Returns a pointer to memory containing the string in Unicode.

# charsetEncoding is a string
CkString_hexDecode $charsetEncoding

Hex decodes a string and interprets the bytes according to the character encoding specified.

Supported Character Encodings

# charsetEncoding is a utf-16 string
CkString_hexDecodeW $charsetEncoding

The utf-16 version of hexDecode.

# charsetEncoding is a string
CkString_hexEncode $charsetEncoding

Converts the string to the character encoding specified and replaces the string contents with the hex encoding of the character data.
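The hex encode and decode round trip can be sketched in pure Tcl 8.6 or later. This does not call Chilkat; the hexOfString and stringOfHex proc names are hypothetical, and Tcl's binary encode hex emits lowercase digits, which may differ from Chilkat's output.

```tcl
# Pure-Tcl sketch (Tcl 8.6+); does not call Chilkat.
proc hexOfString {s charset} {
    # Convert characters to bytes in the charset, then hex encode the bytes.
    return [binary encode hex [encoding convertto $charset $s]]
}

proc stringOfHex {hex charset} {
    # Hex decode to bytes, then interpret the bytes in the charset.
    return [encoding convertfrom $charset [binary decode hex $hex]]
}

set original "h\u00e9llo"
set hex [hexOfString $original utf-8]
puts $hex                                                ;# 68c3a96c6c6f
puts [expr {[stringOfHex $hex utf-8] eq $original}]      ;# 1
```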

Supported Character Encodings

# charsetEncoding is a utf-16 string
CkString_hexEncodeW $charsetEncoding

The utf-16 version of hexEncode.

# substr is a string
set retInt [CkString_indexOf $substr]

Returns the index of the first occurrence of a substring. Returns -1 if not found.

# substrObj is a CkString
set retInt [CkString_indexOfStr $substrObj]

Returns the index of the first occurrence of a substring. Returns -1 if not found.

# s is a utf-16 string
set retInt [CkString_indexOfW $s]

The utf-16 version of "indexOf".

set retInt [CkString_intValue]

Converts the string to an integer and returns the integer value.

set retBool [CkString_isEmpty]

Returns 1 if the string object is empty, otherwise returns 0.

set retChar [CkString_lastChar]

Returns the last ANSI character in the string.

# path is a string
# charsetEncoding is a string
set status [CkString_loadFile $path $charsetEncoding]

Loads the contents of a text file into the CkString object. The string is cleared before loading. The character encoding of the text file is specified by charsetEncoding. This method allows text files in any charset to be loaded: utf-8, Unicode, Shift_JIS, iso-8859-1, etc.

Returns 1 for success, 0 for failure.

Supported Character Encodings

# path is a utf-16 string
# charsetEncoding is a utf-16 string
set status [CkString_loadFileW $path $charsetEncoding]

The utf-16 version of loadFile.

Returns 1 for success, 0 for failure.

# strPattern is a string
set retBool [CkString_matches $strPattern]

Returns 1 if the string matches the strPattern, which may contain one or more asterisk wildcard characters. Returns 0 if the string does not match. This method is case-sensitive.

# strPattern is a string
set retBool [CkString_matchesNoCase $strPattern]

Returns 1 if the string matches the strPattern, which may contain one or more asterisk wildcard characters. Returns 0 if the string does not match. This method is case-insensitive.

# s is a utf-16 string
set retBool [CkString_matchesNoCaseW $s]

The utf-16 version of matchesNoCase.

# strPatternObj is a CkString
set retBool [CkString_matchesStr $strPatternObj]

Returns 1 if the string matches a pattern, otherwise returns 0. The pattern may contain any number of wildcard '*' characters, each of which represents 0 or more occurrences of any character. This method is case-sensitive.
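Asterisk-only wildcard matching of the kind described for the matches methods can be sketched in pure Tcl by translating the pattern into an anchored regular expression. This is an illustration, not Chilkat's implementation; the wildcardMatch proc name and its nocase argument are hypothetical. Tcl's own string match is not used directly because it also treats ?, [ and \ as special.

```tcl
# Pure-Tcl sketch; does not call Chilkat.
proc wildcardMatch {str pattern {nocase 0}} {
    # Escape every non-word character so it matches literally...
    set re [regsub -all {\W} $pattern {\\&}]
    # ...then turn each escaped asterisk into "match anything".
    set re [string map {{\*} {.*}} $re]
    set re "^${re}\$"
    if {$nocase} {
        return [regexp -nocase -- $re $str]
    }
    return [regexp -- $re $str]
}

puts [wildcardMatch "hello world" "hel*wor*"]   ;# 1
puts [wildcardMatch "hello" "HEL*"]             ;# 0
puts [wildcardMatch "hello" "HEL*" 1]           ;# 1
```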

# s is a utf-16 string
set retBool [CkString_matchesW $s]

The utf-16 version of the "matches" method.

CkString_minimizeMemory

Minimizes the amount of memory consumed by this object. For example, consider the following: A CkString object is loaded with the contents of a text file. The "replaceAllOccurances" method is called, replacing longer substrings with shorter replacements. The actual string length will become shorter than the internal buffer space that is allocated. The minimizeMemory method will, if necessary, allocate a new internal buffer that is exactly the size needed to hold the current contents of the string, copy the string to the new internal buffer, and deallocate the old buffer.

CkString_obfuscate

Obfuscates the string. (The unobfuscate method can be called to reverse the obfuscation to restore the original string.)

The Chilkat string obfuscation algorithm works by taking the utf-8 bytes of the string, base64 encoding it, and then scrambling the letters of the base64 encoded string. It is deterministic in that the same string will always obfuscate to the same result. It is not a secure way of encrypting a string. It is only meant to be a simple means of transforming a string into something unintelligible.

# str is a string
CkString_prepend $str

Prepends str to this instance.

# s is a utf-16 string
CkString_prependW $s

The utf-16 version of the "prepend" method.

CkString_punyDecode

Introduced in version 9.5.0.52

In-place decodes the string from punycode.

Punycode Encoding / Decoding

CkString_punyEncode

Introduced in version 9.5.0.52

In-place encodes the string to punycode.

Punycode Encoding / Decoding

# charsetEncoding is a string
CkString_qpDecode $charsetEncoding

Quoted-printable decodes the string and interprets the resulting character data according to the specified character encoding. The result is that the quoted-printable string is in-place decoded.

Supported Character Encodings

# charset is a utf-16 string
CkString_qpDecodeW $charset

The utf-16 version of the qpDecode method.

# charsetEncoding is a string
CkString_qpEncode $charsetEncoding

Quoted-printable encodes the string. The string is first converted to the charset specified, and those bytes are QP-encoded. The contents of the string are replaced with the QP-encoded result.

Supported Character Encodings

# charset is a utf-16 string
CkString_qpEncodeW $charset

The utf-16 version of the qpEncode method.

# substr is a CkString
set retInt [CkString_removeAll $substr]

Removes all occurrences of substr.

# ch is a char
CkString_removeCharOccurances $ch

Removes all occurrences of a specific ANSI character from the string.

# charStartPos is an integer
# numChars is an integer
CkString_removeChunk $charStartPos $numChars

Removes a chunk of characters specified by starting index and length.

# beginDelim is a string
# endDelim is a string
# caseSensitive is a boolean
CkString_removeDelimited $beginDelim $endDelim $caseSensitive

Introduced in version 9.5.0.52

Removes all occurrences of strings delimited by beginDelim and endDelim. Also removes the delimiters.

# substr is a CkString
set retBool [CkString_removeFirst $substr]

Removes the first occurrence of a substring.

# findStrObj is a CkString
# replaceStrObj is a CkString
set retInt [CkString_replaceAll $findStrObj $replaceStrObj]

Replaces all occurrences of a substring with another. The replacement string is allowed to be empty or different in length.

# findStr is a string
# replaceStr is a string
set retInt [CkString_replaceAllOccurances $findStr $replaceStr]

Replaces all occurrences of a substring with another substring. The replacement string is allowed to be empty or different in length.

# pattern is a utf-16 string
# replacement is a utf-16 string
set retInt [CkString_replaceAllOccurancesW $pattern $replacement]

The utf-16 version of the replaceAllOccurances method.

# findCh is a char
# replaceCh is a char
CkString_replaceChar $findCh $replaceCh

Replaces all occurrences of a specified ANSI character with another.

# findStrObj is a CkString
# replaceStrObj is a CkString
set retBool [CkString_replaceFirst $findStrObj $replaceStrObj]

Replaces the first occurrence of a substring with another. The replacement string is allowed to be empty or different in length.

# findStr is a string
# replaceStr is a string
set retBool [CkString_replaceFirstOccurance $findStr $replaceStr]

Replaces the first occurrence of a substring with another. The replacement string is allowed to be empty or different in length.

# pattern is a utf-16 string
# replacement is a utf-16 string
set retBool [CkString_replaceFirstOccuranceW $pattern $replacement]

The utf-16 version of replaceFirstOccurance.

# path is a string
# charsetEncoding is a string
set status [CkString_saveToFile $path $charsetEncoding]

Saves the string to a file using the character encoding specified by charsetEncoding. If a file of the same name exists, it is overwritten. For charsets such as "utf-8", "utf-16", or others that have a possible BOM/preamble, the preamble is output by default. To exclude the BOM/preamble, prepend "no-bom-" to the charset name. For example "no-bom-utf-8".

Returns 1 for success, 0 for failure.

Supported Character Encodings

# path is a utf-16 string
# charset is a utf-16 string
set status [CkString_saveToFileW $path $charset]

The utf-16 version of the saveToFile method.

Returns 1 for success, 0 for failure.

# s is a CkString
CkString_setStr $s

Replaces the contents of the string with another.

# str is a string
CkString_setString $str

Clears the contents of this instance and appends str.

# s is a string
CkString_setStringAnsi $s

Sets the CkString object from an ANSI string.

# unicode is a utf-16 string
CkString_setStringU $unicode

Sets the CkString object from a Unicode string.

# s is a string
CkString_setStringUtf8 $s

Sets the string object from a utf-8 string.

# n is an integer
CkString_shorten $n

Discards the last N characters.

# returns a CkStringArray
# delimiterChar is a char
# exceptDoubleQuoted is a boolean
# exceptEscaped is a boolean
# keepEmpty is a boolean
set ret_stringArray [CkString_split $delimiterChar $exceptDoubleQuoted $exceptEscaped $keepEmpty]

Splits a string into a collection of strings using a delimiter character. If exceptEscaped is 1, then delimiter chars escaped with a backslash are ignored. If exceptDoubleQuoted is 1, then delimiter chars inside quotes are ignored. If keepEmpty is 0, then empty strings are excluded from being added to the returned CkStringArray object.
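How the three flags interact can be sketched with a small character-by-character splitter in pure Tcl. This is not Chilkat's implementation and returns a Tcl list rather than a CkStringArray. It is an assumption of this sketch that quotes and backslashes are kept in the returned pieces; the reference does not say whether Chilkat strips them. The splitSketch proc name is hypothetical.

```tcl
# Pure-Tcl sketch; does not call Chilkat.
proc splitSketch {str delim exceptDoubleQuoted exceptEscaped keepEmpty} {
    set parts {}
    set cur ""
    set inQuotes 0
    set n [string length $str]
    for {set i 0} {$i < $n} {incr i} {
        set c [string index $str $i]
        if {$exceptEscaped && $c eq "\\" && $i + 1 < $n} {
            # Copy the backslash and the escaped character; never split here.
            append cur $c [string index $str [incr i]]
        } elseif {$exceptDoubleQuoted && $c eq "\""} {
            # Toggle quoted state; delimiters inside quotes are ignored.
            set inQuotes [expr {!$inQuotes}]
            append cur $c
        } elseif {$c eq $delim && !$inQuotes} {
            if {$keepEmpty || $cur ne ""} { lappend parts $cur }
            set cur ""
        } else {
            append cur $c
        }
    }
    if {$keepEmpty || $cur ne ""} { lappend parts $cur }
    return $parts
}

# Quoted comma is kept; the empty field is dropped because keepEmpty is 0.
foreach piece [splitSketch {a,"b,c",,d} , 1 0 0] { puts $piece }
```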

# returns a CkStringArray
# delimiterChars is a string
# exceptDoubleQuoted is a boolean
# exceptEscaped is a boolean
# keepEmpty is a boolean
set ret_stringArray [CkString_split2 $delimiterChars $exceptDoubleQuoted $exceptEscaped $keepEmpty]

Same as "split", except a set of characters can be used for delimiters.

# returns a CkStringArray
# splitCharSet is a utf-16 string
# exceptDoubleQuoted is a boolean
# exceptEscaped is a boolean
# keepEmpty is a boolean
set ret_stringArray [CkString_split2W $splitCharSet $exceptDoubleQuoted $exceptEscaped $keepEmpty]

The utf-16 version of the split2 method.

# returns a CkStringArray
set ret_stringArray [CkString_splitAtWS]

Equivalent to split2(" \t\r\n",true,true,false)

# returns a CkString
# startCharIndex is an integer
# numChars is an integer
set ret_ckString [CkString_substring $startCharIndex $numChars]

Returns a substring specified by starting character position and number of characters. (The 1st char is at index 0.)

CkString_toCRLF

Converts all line endings to CRLF.

CkString_toLF

Converts all line endings to bare-LF (Unix/Linux style line endings).
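Line-ending normalization of this kind can be expressed in pure Tcl with string map. The sketch below does not call Chilkat, and treating a bare CR as a line ending is an assumption of the sketch. The normalizeToLF and normalizeToCRLF proc names are hypothetical.

```tcl
# Pure-Tcl sketch; does not call Chilkat.
proc normalizeToLF {s} {
    # CRLF is listed first so it is replaced as a unit before a bare CR is considered.
    return [string map [list \r\n \n \r \n] $s]
}

proc normalizeToCRLF {s} {
    # Normalize to LF first so existing CRLF pairs are not doubled.
    return [string map [list \n \r\n] [normalizeToLF $s]]
}

set mixed "a\nb\r\nc\rd"
puts [string length [normalizeToLF $mixed]]     ;# 7
puts [string length [normalizeToCRLF $mixed]]   ;# 10
```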

CkString_toLowerCase

Converts the string to lowercase.

CkString_toUpperCase

Converts the string to uppercase.

# returns a CkStringArray
# punctuation is a string
set ret_stringArray [CkString_tokenize $punctuation]

Tokenizes a string. The string is split at whitespace characters, and any single punctuation character is returned as a separate token. For example, this string:
CkStringArray *CkString::tokenize(char *punctuation) const

is tokenized to

CkStringArray
*
CkString
:
:
tokenize
(
char
*
punctuation
)
const
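The tokenizing rule (split at whitespace, emit each punctuation character as its own token) can be sketched in pure Tcl. This does not call Chilkat and returns a Tcl list rather than a CkStringArray. The tokenizeSketch proc name is hypothetical, and the punctuation set "*:()" passed in the usage line is an assumption.

```tcl
# Pure-Tcl sketch; does not call Chilkat.
proc tokenizeSketch {str punctuation} {
    set tokens {}
    set cur ""
    foreach c [split $str ""] {
        if {[string is space -strict $c]} {
            # Whitespace ends the current token and is itself discarded.
            if {$cur ne ""} { lappend tokens $cur; set cur "" }
        } elseif {[string first $c $punctuation] >= 0} {
            # A punctuation character ends the current token and becomes its own token.
            if {$cur ne ""} { lappend tokens $cur; set cur "" }
            lappend tokens $c
        } else {
            append cur $c
        }
    }
    if {$cur ne ""} { lappend tokens $cur }
    return $tokens
}

set tokens [tokenizeSketch {CkStringArray *CkString::tokenize(char *punctuation) const} {*:()}]
foreach t $tokens { puts $t }
```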

# returns a CkStringArray
# punctuation is a utf-16 string
set ret_stringArray [CkString_tokenizeW $punctuation]

The utf-16 version of the "tokenize" method.

CkString_trim

Trims SPACE and Tab characters from both ends of the string.

CkString_trim2

Trims SPACE, Tab, CR, and LF characters from both ends of the string.

CkString_trimInsideSpaces

Replaces all tabs, CRs, and LFs with SPACE chars, and removes extra SPACEs so that there are no occurrences of more than one SPACE char in a row.
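The two steps of this whitespace collapse can be shown in pure Tcl. The sketch does not call Chilkat and does not trim the ends of the string, since the description does not mention trimming. The collapseSpaces proc name is hypothetical.

```tcl
# Pure-Tcl sketch; does not call Chilkat.
proc collapseSpaces {s} {
    # Step 1: turn every tab, CR and LF into a space.
    set s [string map [list \t " " \r " " \n " "] $s]
    # Step 2: squeeze each run of two or more spaces down to one.
    return [regsub -all { {2,}} $s " "]
}

puts [collapseSpaces "a\t\tb \r\n c"]   ;# a b c
```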

CkString_unobfuscate

Unobfuscates the string.

The Chilkat string obfuscation algorithm works by taking the utf-8 bytes of the string, base64 encoding it, and then scrambling the letters of the base64 encoded string. It is deterministic in that the same string will always obfuscate to the same result. It is not a secure way of encrypting a string. It is only meant to be a simple means of transforming a string into something unintelligible.

# charsetEncoding is a string
CkString_urlDecode $charsetEncoding

URL decodes the string and interprets the resulting byte data in the specified charset encoding.

Supported Character Encodings

# charsetEncoding is a utf-16 string
CkString_urlDecodeW $charsetEncoding

The utf-16 version of the urlDecode method.

# charsetEncoding is a string
CkString_urlEncode $charsetEncoding

URL encodes the string. The string is first converted to the specified charset encoding, and those bytes are URL-encoded. The contents of the string are replaced with the URL-encoded result.
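The convert-then-encode order described above can be illustrated in pure Tcl. This sketch does not call Chilkat. It is an assumption that only the RFC 3986 unreserved characters are left as-is and that a space becomes %20; Chilkat's exact choices are not stated in this reference. The urlEncodeSketch proc name is hypothetical.

```tcl
# Pure-Tcl sketch; does not call Chilkat.
proc urlEncodeSketch {s charset} {
    # Convert the characters to bytes in the requested charset first.
    set bytes [encoding convertto $charset $s]
    set out ""
    foreach b [split $bytes ""] {
        if {[regexp {^[A-Za-z0-9._~-]$} $b]} {
            append out $b
        } else {
            # Percent-encode the byte value as two uppercase hex digits.
            append out [format %%%02X [scan $b %c]]
        }
    }
    return $out
}

puts [urlEncodeSketch "caf\u00e9 au lait" utf-8]       ;# caf%C3%A9%20au%20lait
puts [urlEncodeSketch "caf\u00e9 au lait" iso8859-1]   ;# caf%E9%20au%20lait
```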

Supported Character Encodings

# charsetEncoding is a utf-16 string
CkString_urlEncodeW $charsetEncoding

The utf-16 version of the urlEncode method.