Chilkat.StringBuilder Class Overview
Chilkat.StringBuilder is a mutable string container for building,
editing, searching, encoding, decoding, hashing, loading, saving, and transforming
text. It includes practical helpers for line endings, substring extraction,
replacements, regular expressions, quoted-string masking, Markdown-to-HTML
conversion, UUID generation, random text, entity decoding, Punycode, emoji
handling, and secure clearing.
What the Class Is Used For
Use Chilkat.StringBuilder when an application needs
a reusable text buffer that can be modified in place. It is especially useful for
constructing text, parsing text, normalizing line endings, converting encodings,
hashing text, loading or writing files with a specific charset, preparing HTML
from Markdown, or exchanging text with other Chilkat classes such as
BinData, JsonObject, and
StringTable.
Build Text Incrementally
Append strings, integers, UUIDs, random data, lines, encoded bytes, or another
StringBuilder.
Search and Extract
Find substrings, whole words, ranges, list items, text before/after markers, or
text between markers.
Replace and Transform
Replace text, words, regions, line endings, case, accents, whitespace, emojis,
quoted strings, and encoded content.
Encode, Decode, Hash, and Save
Encode/decode Base64, hex, URL, quoted-printable, HTML entities, Punycode, and
more; hash text with a chosen charset; load and write files.
Typical Workflow
- Create a StringBuilder object.
- Load or set content with SetString or LoadFile, or build content with append methods such as Append, AppendLine, and AppendSb.
- Inspect the content with Length, Contains, StartsWith, EndsWith, or ContentsEqual.
- Transform the content with methods such as Replace, Trim, ToCRLF, Encode, or Decode.
- Extract needed text with GetBetween, GetBefore, GetAfterFinal, GetRange, or GetNth.
- Write the result with WriteFile or WriteFileIfModified, or return it with GetAsString.
- Use SecureClear when the string contains sensitive data, and check LastErrorText after failures.
Core Concepts
| Concept | Meaning | Important Members |
| --- | --- | --- |
| Mutable Text Buffer | The object holds text that can be appended, replaced, shortened, cleared, encoded, decoded, and written without repeatedly creating new strings. | Append, SetString, Replace, Clear |
| Charset-Aware Bytes | When converting between text and bytes, the charset controls the byte representation. | AppendBd, Encode, Decode, GetHash, WriteFile |
| Encoded Text | Text or decoded bytes can be encoded as Base64, hex, URL encoding, quoted-printable, HTML entities, and other formats. | Encode, Decode, GetEncoded, GetDecoded |
| Marker-Based Extraction | Text can be extracted or removed using markers such as “before this”, “after final occurrence”, or “between begin and end”. | GetBefore, GetAfterFinal, GetBetween, ReplaceAllBetween |
| Delimited Lists | A string can be treated as a delimiter-separated list, with options to ignore delimiters inside double quotes or escaped with backslash. | GetNth, SetNth |
| Safe Temporary Masking | Quoted strings can be masked before text processing and restored afterward. | MaskQuotedStrings, RestoreMaskedStrings, StringTable |
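The delimited-list concept can be illustrated with a small plain-Python sketch. It is not Chilkat's implementation: the names split_fields and get_nth, the 0-based index, the single-character delimiter, and the empty-string result for an out-of-range index are all assumptions made for illustration.

```python
def split_fields(text, delimiter=",", except_double_quoted=True, except_escaped=True):
    """Split text on a single-character delimiter, honoring quotes and escapes."""
    fields, current = [], []
    in_quotes = False
    i = 0
    while i < len(text):
        ch = text[i]
        if except_escaped and ch == "\\" and i + 1 < len(text):
            # Keep the backslash and the character it escapes together.
            current.append(text[i:i + 2])
            i += 2
            continue
        if except_double_quoted and ch == '"':
            in_quotes = not in_quotes
        if ch == delimiter and not in_quotes:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append("".join(current))
    return fields


def get_nth(text, index, delimiter=",", except_double_quoted=True, except_escaped=True):
    """Return the field at a 0-based index, or "" when the index is out of range."""
    fields = split_fields(text, delimiter, except_double_quoted, except_escaped)
    return fields[index] if 0 <= index < len(fields) else ""
```

With quote handling enabled, a comma inside double quotes stays within its field. With it disabled, the same comma splits the field.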
Core Properties
| Property | Purpose | Guidance |
| --- | --- | --- |
| Length | Returns the number of characters in the current string. | Use to check whether content is empty or to validate expected size after edits. |
| IntValue | Gets or sets the content as an integer. | Useful when the string content is expected to be a decimal integer. |
| IsBase64 | Indicates whether the content contains only characters allowed in Base64. | Whitespace is ignored. Base64 characters include A-Z, a-z, 0-9, +, /, and optional trailing = padding. |
| HasEmojis | Returns true when the content contains one or more emoji characters. | Use with RemoveEmojis when emoji-free text is required. |
| LastErrorText | Diagnostic text for the last method or property access. | Check after failures or unexpected results. Diagnostic information may be available regardless of success or failure. |
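The rule described for IsBase64 (ignore whitespace, then allow only the Base64 alphabet plus optional trailing padding) can be sketched in plain Python. The function name looks_like_base64 is hypothetical, and treating empty input as valid is this sketch's choice, not a documented behavior of the property.

```python
import re

# Base64 alphabet, followed by at most two trailing '=' padding characters.
_BASE64_BODY = re.compile(r"[A-Za-z0-9+/]*={0,2}")


def looks_like_base64(text):
    """Return True if text, with whitespace removed, uses only Base64 characters."""
    compact = "".join(text.split())  # str.split() with no argument drops all whitespace
    return _BASE64_BODY.fullmatch(compact) is not None
```

A check like this only confirms the character set. It does not prove that the content decodes to anything meaningful.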
Building and Appending Text
| Task | Method | Behavior |
| --- | --- | --- |
| Append text | Append | Appends a string to the current content. |
| Prepend text | Prepend | Inserts text at the beginning of the current content. |
| Append integer | AppendInt, AppendInt64 | Appends the decimal string representation of a 32-bit or 64-bit integer. |
| Append line | AppendLine, AppendLn | Appends text followed by CRLF or LF. AppendLn always appends CRLF. |
| Append another StringBuilder | AppendSb | Appends the contents of another StringBuilder. |
| Append bytes from BinData | AppendBd | Interprets bytes from BinData using the specified charset and appends the resulting text. |
| Append encoded binary data | AppendEncoded | Encodes byte data using a specified binary encoding, such as Base64 or hex, and appends the encoded text. |
| Append random bytes as text | AppendRandom | Generates random bytes and appends them encoded as hex, Base64, Base64Url, or another supported encoding. |
| Append UUID | AppendUuid, AppendUuid7 | Appends a random version 4 UUID or random version 7 UUID. Hex case is controlled by the lowerCase argument. |
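To make the AppendRandom idea concrete, the following plain-Python sketch generates random bytes and renders them in a text encoding. It uses only the standard library. The function name random_text, the lowercase hex output, and the stripped padding for the URL-safe variant are assumptions of this sketch, not statements about Chilkat's output.

```python
import base64
import secrets


def random_text(num_bytes, encoding="hex"):
    """Return num_bytes of cryptographically strong random data as encoded text."""
    raw = secrets.token_bytes(num_bytes)
    if encoding == "hex":
        return raw.hex()  # two hex digits per byte
    if encoding == "base64":
        return base64.b64encode(raw).decode("ascii")
    if encoding == "base64url":
        # Assumption: drop '=' padding, as is common for URL-safe tokens.
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    raise ValueError("unsupported encoding: " + encoding)
```

The number of random bytes, not the length of the encoded text, determines how unpredictable the value is.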
Loading, Setting, Getting, and Saving
| Need | Method | Notes |
| --- | --- | --- |
| Set content | SetString | Replaces the current content with the specified string. |
| Get content | GetAsString | Returns the current content as a string. |
| Load file | LoadFile | Loads file content using the specified charset. |
| Write file | WriteFile | Writes content using the specified charset. Can emit a BOM for charsets that define one. |
| Write only if changed | WriteFileIfModified | Writes a file only when the file is new or the content differs from the existing file. |
| Clear content | Clear | Removes all characters. |
| Securely clear content | SecureClear | Removes all characters and writes zero bytes to allocated memory before deallocating it. |
File charset note:
LoadFile, WriteFile, and
WriteFileIfModified use a charset such as
utf-8, utf-16, or another
supported character encoding.
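The write-only-if-changed behavior listed for WriteFileIfModified can be approximated in plain Python as shown below. This is a sketch of the idea rather than Chilkat's code: the function name, the byte-for-byte comparison, and the boolean return value meaning "a write happened" are assumptions.

```python
from pathlib import Path


def write_file_if_modified(path, text, charset="utf-8"):
    """Write text to path unless the file already holds the same bytes.

    Returns True when the file was written, False when it was left alone.
    """
    target = Path(path)
    data = text.encode(charset)  # the charset decides the bytes being compared
    if target.exists() and target.read_bytes() == data:
        return False
    target.write_bytes(data)
    return True
```

Skipping the write leaves the file's modification time untouched, which helps build tools that rely on timestamps.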
Searching and Comparing
| Task | Method | Behavior |
| --- | --- | --- |
| Contains substring | Contains | Returns true if a substring exists, with optional case sensitivity. |
| Contains whole word | ContainsWord | Finds a whole word with optional case sensitivity. Limited to Latin1-style word matching. |
| Starts with substring | StartsWith | Checks whether content begins with the specified substring. |
| Ends with substring | EndsWith | Checks whether content ends with the specified substring. |
| Compare to string | ContentsEqual | Compares the content to a string, with optional case sensitivity. |
| Compare to another StringBuilder | ContentsEqualSb | Compares the content to another StringBuilder. |
Whole-word limitation:
ContainsWord is limited to strings containing Latin1
characters. Word characters are alphanumeric Latin1 characters, and underscore is
also considered part of a word.
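The whole-word rule above (alphanumeric Latin1 characters and underscore count as word characters) can be sketched in plain Python. The names is_word_char and contains_word are hypothetical, and the exact character classification Chilkat applies may differ from Python's str.isalnum.

```python
def is_word_char(ch):
    """Treat underscore and alphanumeric characters in the Latin1 range as word characters."""
    return ch == "_" or (ord(ch) < 256 and ch.isalnum())


def contains_word(text, word, case_sensitive=True):
    """Return True if word occurs in text with non-word characters (or edges) on both sides."""
    if not word:
        return False
    if not case_sensitive:
        text, word = text.lower(), word.lower()
    start = text.find(word)
    while start != -1:
        end = start + len(word)
        before_ok = start == 0 or not is_word_char(text[start - 1])
        after_ok = end == len(text) or not is_word_char(text[end])
        if before_ok and after_ok:
            return True
        start = text.find(word, start + 1)
    return False
```

Because underscore is a word character, "cat" is not a whole word inside "cat_food".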
Extraction Methods
| Need | Method | Behavior |
| --- | --- | --- |
| Text between markers | GetBetween | Returns text between the first occurrence of a begin marker and the next occurrence of an end marker. |
| Text after a prior marker, then between markers | GetAfterBetween | Starts searching after searchAfter, then returns text between the next begin and end markers. |
| Text before marker | GetBefore | Returns text before the first marker. Can remove returned text plus the marker from this object. |
| Text after final marker | GetAfterFinal | Returns text after the final occurrence of a marker. Can remove the marker and following content from this object. |
| Character range | GetRange, GetRangeSb | Retrieves a character range and optionally removes it from this object. GetRangeSb appends the range to another StringBuilder. |
| Nth delimited field | GetNth | Returns the Nth substring in a delimiter-separated list, optionally ignoring delimiters inside double quotes or escaped with backslash. |
| Last N lines | LastNLines | Returns the last N lines, converting line endings to CRLF or LF. |
Removal behavior:
Methods such as GetBefore,
GetAfterFinal, and
GetRange can optionally remove the returned portion
from this object. If a marker is not present and removal is requested, the object
may be cleared.
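The marker-based lookups can be pictured with a few plain-Python helpers. These are illustrative stand-ins, not Chilkat calls. Returning the whole text when a marker is absent is an assumption of this sketch, chosen because it fits the note that the object may be cleared when removal is requested. Returning an empty string when get_between finds no markers is also an assumption.

```python
def get_between(text, begin, end):
    """Text between the first begin marker and the next end marker, or "" if not found."""
    start = text.find(begin)
    if start == -1:
        return ""
    start += len(begin)
    stop = text.find(end, start)
    return "" if stop == -1 else text[start:stop]


def get_before(text, marker):
    """Text before the first marker (the whole text when the marker is absent)."""
    head, _, _ = text.partition(marker)
    return head


def get_after_final(text, marker):
    """Text after the last marker (the whole text when the marker is absent)."""
    _, _, tail = text.rpartition(marker)
    return tail
```

For example, get_after_final applied to a path with "/" as the marker yields the file name.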
Replacement and Removal
| Task | Method | Behavior |
| --- | --- | --- |
| Replace all exact matches | Replace | Replaces all occurrences and returns the number of replacements. |
| Replace first match | ReplaceFirst | Replaces only the first occurrence. |
| Case-insensitive replace | ReplaceNoCase | Replaces all case-insensitive occurrences. |
| Replace with integer | ReplaceI | Replaces occurrences of a string with a decimal integer value. |
| Replace whole words | ReplaceWord | Replaces whole-word occurrences. Limited to Latin1-style word matching. |
| Replace after final marker | ReplaceAfterFinal | Replaces content after the final occurrence of a marker. |
| Replace between markers | ReplaceAllBetween, ReplaceBetween | Replaces text between marker pairs, or replaces values only inside marker regions. |
| Remove before marker | RemoveBefore | Removes text before the first marker and also removes the marker. |
| Remove after final marker | RemoveAfterFinal | Removes text after the final marker and also removes the marker. |
| Remove character range | RemoveCharsAt | Removes a specified range of characters. |
| Shorten from end | Shorten | Removes the last N characters. |
Whole-word limitation:
ReplaceWord, like
ContainsWord, is limited to Latin1 text. Underscore is
treated as part of a word.
Encoding, Decoding, and Hashing
| Need | Method | Important Detail |
| --- | --- | --- |
| Encode content in place | Encode | Converts the current text to bytes using a charset, then encodes those bytes as Base64, hex, URL encoding, quoted-printable, HTML entities, etc. |
| Get encoded content without modifying | GetEncoded | Returns encoded text while leaving this object unchanged. |
| Decode content in place | Decode | Decodes the current content and interprets the decoded bytes using the specified charset. |
| Decode and append | DecodeAndAppend | Decodes a supplied encoded value and appends the resulting string. |
| Get decoded bytes | GetDecoded | Decodes the current content and returns the decoded bytes. |
| Decode HTML entities | EntityDecode | Decodes HTML entity references in place. |
| Hash content | GetHash | Converts text to bytes using the requested charset, hashes those bytes, and returns the hash in the requested encoding. |
Supported encoding examples:
Encoding names include base64,
hex, quoted-printable or
qp, url,
base32, Q,
B, url_rfc1738,
url_rfc2396,
url_rfc3986,
url_oauth, uu,
modBase64, and html.
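The two-step nature of encoding (text to bytes via a charset, then bytes to an encoded string) is easier to see in a plain-Python sketch. The helper names encode_text and decode_text are hypothetical and cover only two encodings. Lowercase hex output is Python's convention and is not a claim about Chilkat's output.

```python
import base64


def encode_text(text, encoding, charset):
    """Convert text to bytes with charset, then encode those bytes as text."""
    raw = text.encode(charset)
    if encoding == "base64":
        return base64.b64encode(raw).decode("ascii")
    if encoding == "hex":
        return raw.hex()
    raise ValueError("unsupported encoding: " + encoding)


def decode_text(encoded, encoding, charset):
    """Decode encoded text to bytes, then interpret the bytes with charset."""
    if encoding == "base64":
        raw = base64.b64decode(encoded)
    elif encoding == "hex":
        raw = bytes.fromhex(encoded)
    else:
        raise ValueError("unsupported encoding: " + encoding)
    return raw.decode(charset)
```

The same character produces different encoded output under different charsets: "é" is "w6k=" as UTF-8 Base64 but "6Q==" as Latin-1 Base64, so both sides of an exchange must agree on the charset.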
Hash algorithms:
GetHash supports algorithms including
sha1, sha256,
sha384, sha512,
sha3-224, sha3-256,
sha3-384, sha3-512,
md2, md5,
ripemd128, ripemd160,
ripemd256, and ripemd320.
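Because a hash is computed over bytes, the charset is part of the input. The following plain-Python sketch uses hashlib to show the principle. The function name text_hash and its parameter order are hypothetical and do not mirror the GetHash signature.

```python
import base64
import hashlib


def text_hash(text, algorithm="sha256", encoding="hex", charset="utf-8"):
    """Hash the charset-specific bytes of text and return the digest as encoded text."""
    digest = hashlib.new(algorithm, text.encode(charset)).digest()
    if encoding == "hex":
        return digest.hex()
    if encoding == "base64":
        return base64.b64encode(digest).decode("ascii")
    raise ValueError("unsupported encoding: " + encoding)
```

Pure ASCII text hashes the same under UTF-8 and Latin-1, but any non-ASCII character makes the two digests differ, which is a common cause of hash mismatches between systems.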
Line Endings, Case, Whitespace, and Text Cleanup
| Task | Method | Behavior |
| --- | --- | --- |
| Convert to CRLF | ToCRLF | Converts line endings to Windows CRLF format. |
| Convert to LF | ToLF | Converts line endings to LF-only Unix format. |
| Lowercase | ToLowercase | Converts content to lowercase. |
| Uppercase | ToUppercase | Converts content to uppercase. |
| Trim ends | Trim | Removes whitespace from both ends. |
| Normalize inner whitespace | TrimInsideSpaces | Replaces tabs, CR, and LF with spaces, then collapses multiple spaces into one. |
| Remove Latin/Central European accents | RemoveAccents | Removes diacritics for accented characters in Windows-1252 and Windows-1250 character sets. |
| Remove emojis | RemoveEmojis | Removes emoji characters from the content. |
Accent-removal scope:
RemoveAccents applies only to accented characters in
Windows-1252 and Windows-1250. Accent marks for other languages are not removed.
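The line-ending and inner-whitespace conversions listed in the table above can be sketched in plain Python as follows. The function names are hypothetical, and treating a bare CR as a line ending is an assumption of this sketch.

```python
import re


def to_lf(text):
    """Convert CRLF (and, by assumption, bare CR) line endings to LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def to_crlf(text):
    """Convert all line endings to CRLF without doubling existing CRLF pairs."""
    return to_lf(text).replace("\n", "\r\n")


def trim_inside_spaces(text):
    """Replace tabs, CR, and LF with spaces, then collapse runs of spaces into one."""
    spaced = re.sub(r"[\t\r\n]", " ", text)
    return re.sub(r" {2,}", " ", spaced)
```

Normalizing to LF first is what keeps to_crlf from turning an existing CRLF into CR CR LF.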
Regular Expressions
| Task | Method | Behavior |
| --- | --- | --- |
| Find regex matches | RegexMatch | Searches the content for matches to a regular expression and returns match details in a JsonObject. Returns the number of matches, or -1 for failure. |
| Replace capture groups | RegexReplace | Replaces substrings of capture groups found in a previous call to RegexMatch. |
Timeout control:
RegexMatch accepts a
timeoutMs argument. Pass
0 for no timeout.
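The "count plus match details" pattern described for RegexMatch can be illustrated with Python's re module. This is only an analogy: the dictionary layout below is invented for the example and is not the JSON structure Chilkat produces, and the sketch has no timeout because Python's re module does not offer one.

```python
import re


def regex_match(text, pattern):
    """Return (match_count, details), or (-1, []) when the pattern is invalid."""
    try:
        compiled = re.compile(pattern)
    except re.error:
        return -1, []
    details = [
        {"match": m.group(0), "start": m.start(), "groups": list(m.groups())}
        for m in compiled.finditer(text)
    ]
    return len(details), details
```

Keeping the capture-group details from the match step is what makes a later group-level replacement possible.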
Quoted-String Masking
Masking quoted strings is useful when performing replacements or parsing where
quoted content should be temporarily protected.
| Method | Purpose | quoteType Values |
| --- | --- | --- |
| MaskQuotedStrings | Replaces content inside quoted strings with a mask character and stores the original quoted strings in a StringTable. | 0 = single and double quotes, 1 = single quotes only, 2 = double quotes only. |
| RestoreMaskedStrings | Restores strings previously masked by MaskQuotedStrings. | Uses the same quote-type convention. |
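A mask, edit, restore cycle might look like the following plain-Python sketch. It stands in for the idea only: a Python list replaces the StringTable, the mask character "*" is an arbitrary choice, and the function names and signatures are hypothetical.

```python
import re

# quote_type convention from the table: 0 = both, 1 = single only, 2 = double only.
_QUOTE_CLASSES = {0: "'\"", 1: "'", 2: '"'}


def _quoted_pattern(quote_type):
    # Match an opening quote, the shortest possible body, and the same closing quote.
    return re.compile(r"([%s])(.*?)\1" % _QUOTE_CLASSES[quote_type], re.S)


def mask_quoted_strings(text, quote_type=0, mask_char="*"):
    """Replace quoted content with mask characters; return (masked_text, saved_strings)."""
    saved = []

    def _mask(m):
        saved.append(m.group(2))
        return m.group(1) + mask_char * len(m.group(2)) + m.group(1)

    return _quoted_pattern(quote_type).sub(_mask, text), saved


def restore_masked_strings(masked, saved, quote_type=0):
    """Put the saved strings back into the quoted regions, in order."""
    remaining = iter(saved)
    return _quoted_pattern(quote_type).sub(
        lambda m: m.group(1) + next(remaining) + m.group(1), masked
    )
```

A replacement applied between the two calls, such as changing every comma to a semicolon, then affects only the unquoted text.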
Markdown to HTML
MarkdownToHtml converts the Markdown content of this
object to HTML, writing the result to another
StringBuilder. It supports both full-document and
streaming conversion.
| Option Area | JSON Options | Behavior |
| --- | --- | --- |
| Streaming conversion | streaming behavior through options | In streaming mode, complete Markdown lines are converted and removed from this object, leaving a final partial line if present. |
| JavaScript output | emitJavascript: true | Emits JavaScript function calls instead of HTML, useful for real-time embedded browser updates. |
| HTML shell | streamingShell: true | Converts an empty string to an HTML shell for use before applying streaming JavaScript updates. |
| Theme | theme | Supports themes such as ChatGPT, cleanWin, cleanMac, and raw. |
| Prism highlighting | usesPrism, prism.theme, prism.version | Includes Prism code highlighting. Themes include tomorrow, default, coy, dark, funky, okaidia, solarizedlight, and twilight. |
| HTML document parts | docType, rootElement, head, bodyStart, bodyEnd, noContentDiv | Allows control over the generated HTML wrapper when no predefined theme is used. |
| Copy button | copyButton: true | Can include copy-button HTML in non-streaming mode. The copy button is added by default in streaming mode. |
Streaming behavior:
In streaming mode, no HTML is emitted if the object contains only a partial
Markdown line. Complete lines are converted and removed, leaving the unfinished
line for the next call.
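The buffering rule behind streaming mode (consume complete lines, keep the unfinished one) can be shown without any Markdown logic. The helper name take_complete_lines is hypothetical, and using LF as the line terminator is an assumption of this sketch.

```python
def take_complete_lines(buffer):
    """Split buffer into (complete_lines, leftover_partial_line)."""
    cut = buffer.rfind("\n")
    if cut == -1:
        # No line terminator yet: nothing is ready to convert.
        return "", buffer
    return buffer[:cut + 1], buffer[cut + 1:]
```

A caller appending streamed chunks would convert the first element and keep the second as the new buffer content, which is why a call made while only a partial line is present produces no output.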
Punycode, Emojis, Obfuscation, and IDs
| Feature | Methods / Properties | Behavior |
| --- | --- | --- |
| Punycode | PunyEncode, PunyDecode | Encodes or decodes the current string using Punycode. |
| Emoji detection/removal | HasEmojis, RemoveEmojis | Detects and removes emoji characters. |
| Obfuscation | Obfuscate, Unobfuscate | Applies reversible Chilkat string obfuscation. This is not encryption and is not secure. |
| UUID generation | AppendUuid, AppendUuid7 | Generates and appends random version 4 or version 7 UUIDs. |
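For readers unfamiliar with Punycode, Python's built-in punycode codec gives a quick way to see what the transformation produces. The wrapper names below are hypothetical. The codec emits raw Punycode without the "xn--" prefix used in internationalized domain names, and whether PunyEncode adds that prefix is not stated here.

```python
def puny_encode(text):
    """Encode Unicode text as raw Punycode (no "xn--" prefix)."""
    return text.encode("punycode").decode("ascii")


def puny_decode(text):
    """Decode raw Punycode back to Unicode text."""
    return text.encode("ascii").decode("punycode")
```

For example, "münchen" encodes to "mnchen-3ya", and decoding that value restores the original text.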
Obfuscation is not encryption:
Chilkat string obfuscation is deterministic and reversible. It is intended only to
make a string unintelligible, not to protect secrets.
Method Summary by Category
| Category | Methods | Purpose |
| --- | --- | --- |
| Build content | Append, AppendLine, AppendLn, AppendInt, AppendInt64, Prepend, SetString | Add, prepend, or replace text content. |
| Binary and random append | AppendBd, AppendEncoded, AppendRandom, AppendUuid, AppendUuid7 | Append text derived from bytes, random bytes, or generated UUIDs. |
| Search and compare | Contains, ContainsWord, StartsWith, EndsWith, ContentsEqual, ContentsEqualSb | Locate or compare text. |
| Extract text | GetBefore, GetAfterFinal, GetBetween, GetAfterBetween, GetRange, GetRangeSb, GetNth, LastNLines | Extract substrings by marker, range, delimiter, or line count. |
| Replace and remove | Replace, ReplaceFirst, ReplaceNoCase, ReplaceWord, ReplaceBetween, ReplaceAllBetween, RemoveBefore, RemoveAfterFinal, RemoveCharsAt, Shorten | Modify or remove content. |
| Encode, decode, hash | Encode, Decode, DecodeAndAppend, GetEncoded, GetDecoded, EntityDecode, GetHash | Convert between text, bytes, encodings, entities, and hashes. |
| Normalize and cleanup | ToCRLF, ToLF, ToLowercase, ToUppercase, Trim, TrimInsideSpaces, RemoveAccents, RemoveEmojis | Normalize line endings, case, whitespace, accents, and emojis. |
| Advanced processing | RegexMatch, RegexReplace, MaskQuotedStrings, RestoreMaskedStrings, MarkdownToHtml, PunyEncode, PunyDecode | Perform regex, quoted-string-safe transformations, Markdown conversion, and Punycode conversion. |
| File I/O and clearing | LoadFile, WriteFile, WriteFileIfModified, Clear, SecureClear | Load, save, conditionally save, clear, or securely clear text. |
Diagnostics and Troubleshooting
| Problem Area | Member | What to Check |
| --- | --- | --- |
| General failure | LastErrorText | Check after failed file loading, file writing, encoding, decoding, regex, Markdown conversion, or other unexpected behavior. |
| Unexpected text length | Length | Confirm whether the operation appended, replaced, removed, shortened, or cleared content. |
| Encoded data cannot decode to text | Decode | Confirm the binary encoding is correct and that decoded bytes are valid for the specified charset. |
| Hash mismatch | GetHash | Confirm the same charset, hash algorithm, and output encoding are used by both sides. |
| Whole-word search or replace behaves unexpectedly | ContainsWord, ReplaceWord | These methods are limited to Latin1-style word matching. |
| Markdown streaming emits no HTML | MarkdownToHtml | In streaming mode, only complete Markdown lines are converted. A final partial line is retained until completed. |
| Regex operation takes too long | RegexMatch | Use the timeoutMs argument to limit processing time. |
Common Pitfalls
| Pitfall | Better Approach |
| --- | --- |
| Using Encode when the original text must remain unchanged. | Use GetEncoded, which returns the encoded value without modifying this object. |
| Decoding binary data that does not represent text. | Use GetDecoded to retrieve decoded bytes when the decoded result is binary rather than a string. |
| Ignoring charset when hashing or encoding text. | Specify the charset explicitly, typically utf-8, so byte representation is predictable. |
| Assuming obfuscation protects secrets. | Use real encryption for secrets. Obfuscate is reversible and not secure. |
| Expecting ContainsWord and ReplaceWord to handle all Unicode word rules. | Use them for Latin1 word matching only. |
| Forgetting that some extraction methods can remove content. | Review the removeFlag argument before calling methods such as GetBefore, GetAfterFinal, and GetRange. |
| Using Clear for sensitive text. | Use SecureClear when the buffer contains passwords, tokens, private data, or other secrets. |
Best Practices
| Recommendation | Reason |
| --- | --- |
| Use StringBuilder for text that will be modified repeatedly. | It avoids repeatedly creating new strings and makes append, remove, and transform operations straightforward. |
| Use explicit charsets for file, hash, encode, decode, and BinData operations. | It ensures text is converted to and from bytes consistently. |
| Use WriteFileIfModified for generated files. | It avoids rewriting files when the generated content is unchanged. |
| Use GetEncoded when the original content should remain intact. | It returns an encoded value without modifying the object. |
| Use quoted-string masking before broad replacements when quoted text should be preserved. | MaskQuotedStrings and RestoreMaskedStrings make it safer to process only unquoted portions. |
| Use MarkdownToHtml for both complete and streaming Markdown conversion. | It supports full HTML output, streaming response updates, themes, Prism code highlighting, and configurable document wrappers. |
| Use SecureClear for secrets. | It overwrites allocated memory with zero bytes before deallocating. |
| Check LastErrorText after failures. | It provides the most useful diagnostic detail for loading, writing, encoding, decoding, regex, Markdown conversion, and transformation failures. |
Summary
Chilkat.StringBuilder is a flexible mutable text
container for constructing and transforming strings. It supports appending text,
integers, random values, UUIDs, encoded bytes, and BinData content. It also
supports searching, comparing, extracting, replacing, masking, regex processing,
line-ending normalization, encoding, decoding, hashing, Markdown conversion,
file I/O, and secure clearing.
The most important practical guidance is to use explicit charsets whenever text is
converted to bytes, use GetEncoded instead of
Encode when the original content should remain
unchanged, and use SecureClear for sensitive strings.