Chilkat.StringBuilder Class Overview

Chilkat.StringBuilder is a mutable string container for building, editing, searching, encoding, decoding, hashing, loading, saving, and transforming text. It includes practical helpers for line endings, substring extraction, replacements, regular expressions, quoted-string masking, Markdown-to-HTML conversion, UUID generation, random text, entity decoding, Punycode, emoji handling, and secure clearing.

What the Class Is Used For

Use Chilkat.StringBuilder when an application needs a reusable text buffer that can be modified in place. It is especially useful for constructing text, parsing text, normalizing line endings, converting encodings, hashing text, loading or writing files with a specific charset, preparing HTML from Markdown, or exchanging text with other Chilkat classes such as BinData, JsonObject, and StringTable.

Build Text Incrementally: Append strings, integers, UUIDs, random data, lines, encoded bytes, or another StringBuilder.
Search and Extract: Find substrings, whole words, ranges, list items, text before/after markers, or text between markers.
Replace and Transform: Replace text, words, regions, line endings, case, accents, whitespace, emojis, quoted strings, and encoded content.
Encode, Decode, Hash, and Save: Encode/decode Base64, hex, URL, quoted-printable, HTML entities, Punycode, and more; hash text with a chosen charset; load and write files.

Typical Workflow

  1. Create a StringBuilder object.
  2. Load or set content with SetString or LoadFile, or build content with append methods such as Append, AppendLine, and AppendSb.
  3. Inspect the content with Length, Contains, StartsWith, EndsWith, or ContentsEqual.
  4. Transform the content with methods such as Replace, Trim, ToCRLF, Encode, or Decode.
  5. Extract needed text with GetBetween, GetBefore, GetAfterFinal, GetRange, or GetNth.
  6. Write the result with WriteFile or WriteFileIfModified, or return it with GetAsString.
  7. Use SecureClear when the string contains sensitive data, and check LastErrorText after failures.

Core Concepts

Concept | Meaning | Important Members
Mutable Text Buffer | The object holds text that can be appended, replaced, shortened, cleared, encoded, decoded, and written without repeatedly creating new strings. | Append, SetString, Replace, Clear
Charset-Aware Bytes | When converting between text and bytes, the charset controls the byte representation. | AppendBd, Encode, Decode, GetHash, WriteFile
Encoded Text | Text or decoded bytes can be encoded as Base64, hex, URL encoding, quoted-printable, HTML entities, and other formats. | Encode, Decode, GetEncoded, GetDecoded
Marker-Based Extraction | Text can be extracted or removed using markers such as “before this”, “after final occurrence”, or “between begin and end”. | GetBefore, GetAfterFinal, GetBetween, ReplaceAllBetween
Delimited Lists | A string can be treated as a delimiter-separated list, with options to ignore delimiters inside double quotes or escaped with backslash. | GetNth, SetNth
Safe Temporary Masking | Quoted strings can be masked before text processing and restored afterward. | MaskQuotedStrings, RestoreMaskedStrings, StringTable
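
To make the Delimited Lists concept concrete, here is a minimal Python sketch of how a GetNth-style lookup could behave. It is a plain-Python stand-in, not the Chilkat API. The function name get_nth, its parameter names, the 0-based index, the choice to keep quotes and backslashes in the returned field, and the empty-string result for an out-of-range index are all assumptions made for illustration.

```python
def get_nth(text: str, index: int, delimiter: str,
            except_double_quoted: bool = True,
            except_escaped: bool = True) -> str:
    """Return the index-th (0-based, assumed) field of a delimited string."""
    fields = []
    current = []
    in_quotes = False
    i = 0
    while i < len(text):
        ch = text[i]
        if except_escaped and ch == "\\" and i + 1 < len(text):
            # Keep the backslash and the escaped character together,
            # so an escaped delimiter never splits the field.
            current.append(text[i:i + 2])
            i += 2
            continue
        if except_double_quoted and ch == '"':
            in_quotes = not in_quotes
        if ch == delimiter and not in_quotes:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append("".join(current))
    # Out-of-range index yields an empty string (an assumption of this sketch).
    return fields[index] if 0 <= index < len(fields) else ""
```

With quote handling enabled, the comma inside "b,c" does not split the field; with it disabled, the same input yields four fields instead of three.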

Core Properties

Property | Purpose | Guidance
Length | Returns the number of characters in the current string. | Use to check whether content is empty or to validate expected size after edits.
IntValue | Gets or sets the content as an integer. | Useful when the string content is expected to be a decimal integer.
IsBase64 | Indicates whether the content contains only characters allowed in Base64. | Whitespace is ignored. Base64 characters include A-Z, a-z, 0-9, +, /, and optional trailing = padding.
HasEmojis | Returns true when the content contains one or more emoji characters. | Use with RemoveEmojis when emoji-free text is required.
LastErrorText | Diagnostic text for the last method or property access. | Check after failures or unexpected results. Diagnostic information may be available regardless of success or failure.
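
The IsBase64 rule described above (only Base64 characters, whitespace ignored, optional trailing = padding) can be approximated in a few lines of plain Python. This is a sketch of the rule as stated, not Chilkat code; the name looks_like_base64 and the decision to treat an empty string as not Base64 are assumptions.

```python
import string

# Characters the text lists as valid Base64: A-Z, a-z, 0-9, '+', '/'.
_BASE64_ALPHABET = set(string.ascii_letters + string.digits + "+/")


def looks_like_base64(text: str) -> bool:
    """Return True if text contains only Base64 characters.

    Whitespace is ignored and '=' is accepted only as trailing padding.
    """
    compact = "".join(text.split())   # drop all whitespace, including CR/LF
    body = compact.rstrip("=")        # strip optional trailing padding
    if not body:
        return False                  # assumption: empty content is not Base64
    return all(ch in _BASE64_ALPHABET for ch in body)
```

Note that a check like this only validates the alphabet. Ordinary words such as "hello" also pass, so a true result does not prove the content was produced by a Base64 encoder.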

Building and Appending Text

Task | Method | Behavior
Append text | Append | Appends a string to the current content.
Prepend text | Prepend | Inserts text at the beginning of the current content.
Append integer | AppendInt, AppendInt64 | Appends the decimal string representation of a 32-bit or 64-bit integer.
Append line | AppendLine, AppendLn | Appends text followed by CRLF or LF. AppendLn always appends CRLF.
Append another StringBuilder | AppendSb | Appends the contents of another StringBuilder.
Append bytes from BinData | AppendBd | Interprets bytes from BinData using the specified charset and appends the resulting text.
Append encoded binary data | AppendEncoded | Encodes byte data using a specified binary encoding, such as Base64 or hex, and appends the encoded text.
Append random bytes as text | AppendRandom | Generates random bytes and appends them encoded as hex, Base64, Base64Url, or another supported encoding.
Append UUID | AppendUuid, AppendUuid7 | Appends a random version 4 UUID or random version 7 UUID. Hex case is controlled by the lowerCase argument.
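
As a rough illustration of what AppendRandom and AppendUuid produce, the sketch below generates the same kinds of values with the Python standard library. It does not call Chilkat. The helper names random_text and uuid4_text, the encoding labels, and the removal of padding from the Base64Url form are assumptions of this sketch. Version 7 UUIDs are omitted because they are not available in the standard library of every Python version.

```python
import base64
import secrets
import uuid


def random_text(num_bytes: int, encoding: str = "hex") -> str:
    """Generate num_bytes random bytes and return them as encoded text."""
    raw = secrets.token_bytes(num_bytes)
    if encoding == "hex":
        return raw.hex()
    if encoding == "base64":
        return base64.b64encode(raw).decode("ascii")
    if encoding == "base64url":
        # Dropping '=' padding is an assumption of this sketch.
        return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")
    raise ValueError(f"unsupported encoding: {encoding}")


def uuid4_text(lower_case: bool = True) -> str:
    """Return a random version 4 UUID, in lower or upper case hex."""
    text = str(uuid.uuid4())
    return text if lower_case else text.upper()
```

Sixteen random bytes become 32 hex characters or 24 Base64 characters, which is worth keeping in mind when a field has a fixed width.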

Loading, Setting, Getting, and Saving

Need | Method | Notes
Set content | SetString | Replaces the current content with the specified string.
Get content | GetAsString | Returns the current content as a string.
Load file | LoadFile | Loads file content using the specified charset.
Write file | WriteFile | Writes content using the specified charset. Can emit a BOM for charsets that define one.
Write only if changed | WriteFileIfModified | Writes a file only when the file is new or the content differs from the existing file.
Clear content | Clear | Removes all characters.
Securely clear content | SecureClear | Removes all characters and writes zero bytes to allocated memory before deallocating it.
File charset note: LoadFile, WriteFile, and WriteFileIfModified use a charset such as utf-8, utf-16, or another supported character encoding.
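
The write-only-if-changed behavior described for WriteFileIfModified can be sketched in plain Python as follows. This is an illustration of the idea rather than Chilkat's implementation; the function name, the byte-for-byte comparison, and the boolean return value are assumptions.

```python
from pathlib import Path


def write_file_if_modified(path: str, text: str, charset: str = "utf-8") -> bool:
    """Write text to path only when the file is new or its bytes differ.

    Returns True if the file was written, False if it was left untouched.
    """
    target = Path(path)
    new_bytes = text.encode(charset)   # the charset decides the bytes on disk
    if target.exists() and target.read_bytes() == new_bytes:
        return False                   # identical content: skip the write
    target.write_bytes(new_bytes)
    return True
```

Skipping the write keeps the file's modification time stable, which helps build tools and file watchers that react to timestamp changes.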

Searching and Comparing

Task | Method | Behavior
Contains substring | Contains | Returns true if a substring exists, with optional case sensitivity.
Contains whole word | ContainsWord | Finds a whole word with optional case sensitivity. Limited to Latin1-style word matching.
Starts with substring | StartsWith | Checks whether content begins with the specified substring.
Ends with substring | EndsWith | Checks whether content ends with the specified substring.
Compare to string | ContentsEqual | Compares the content to a string, with optional case sensitivity.
Compare to another StringBuilder | ContentsEqualSb | Compares the content to another StringBuilder.
Whole-word limitation: ContainsWord is limited to strings containing Latin1 characters. Word characters are alphanumeric Latin1 characters, and underscore is also considered part of a word.
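
The whole-word rule above (Latin1 alphanumerics and underscore are word characters) can be modeled with a short Python sketch. It is not the Chilkat implementation; the names is_word_char and contains_word are hypothetical, and the exact set of Latin1 characters Chilkat treats as alphanumeric may differ from Python's isalnum.

```python
def is_word_char(ch: str) -> bool:
    """Latin1 alphanumeric characters and underscore count as word characters."""
    return ch == "_" or (ord(ch) < 256 and ch.isalnum())


def contains_word(text: str, word: str, case_sensitive: bool = True) -> bool:
    """Return True if word occurs in text with non-word characters on both sides."""
    if not word:
        return False
    if not case_sensitive:
        text, word = text.lower(), word.lower()
    start = text.find(word)
    while start != -1:
        end = start + len(word)
        before_ok = start == 0 or not is_word_char(text[start - 1])
        after_ok = end == len(text) or not is_word_char(text[end])
        if before_ok and after_ok:
            return True
        start = text.find(word, start + 1)   # keep scanning for a later match
    return False
```

Because underscore is a word character, "cat" is not found as a whole word inside "my_cat", which matters when searching identifiers in source code.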

Extraction Methods

Need | Method | Behavior
Text between markers | GetBetween | Returns text between the first occurrence of a begin marker and the next occurrence of an end marker.
Text after a prior marker, then between markers | GetAfterBetween | Starts searching after searchAfter, then returns text between the next begin and end markers.
Text before marker | GetBefore | Returns text before the first marker. Can remove returned text plus the marker from this object.
Text after final marker | GetAfterFinal | Returns text after the final occurrence of a marker. Can remove the marker and following content from this object.
Character range | GetRange, GetRangeSb | Retrieves a character range and optionally removes it from this object. GetRangeSb appends the range to another StringBuilder.
Nth delimited field | GetNth | Returns the Nth substring in a delimiter-separated list, optionally ignoring delimiters inside double quotes or escaped with backslash.
Last N lines | LastNLines | Returns the last N lines, converting line endings to CRLF or LF.
Removal behavior: Methods such as GetBefore, GetAfterFinal, and GetRange can optionally remove the returned portion from this object. If a marker is not present and removal is requested, the object may be cleared.
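
The interaction between extraction and removal is easy to get wrong, so the following plain-Python model may help. It mimics the behavior described above for GetBetween, GetBefore, and GetAfterFinal, but it is not Chilkat code. The class name TextBuffer and its method names are hypothetical, and the choice to return the whole string when a marker is missing (and to return an empty string from get_between when either marker is missing) is an assumption.

```python
class TextBuffer:
    """Tiny stand-in for a mutable text buffer with marker-based extraction."""

    def __init__(self, text: str = "") -> None:
        self.text = text

    def get_between(self, begin: str, end: str) -> str:
        """Text between the first begin marker and the next end marker."""
        start = self.text.find(begin)
        if start == -1:
            return ""
        start += len(begin)
        stop = self.text.find(end, start)
        return "" if stop == -1 else self.text[start:stop]

    def get_before(self, marker: str, remove: bool) -> str:
        """Text before the first marker; optionally remove it plus the marker."""
        pos = self.text.find(marker)
        if pos == -1:
            result = self.text
            if remove:
                self.text = ""        # marker missing: buffer is cleared
            return result
        result = self.text[:pos]
        if remove:
            self.text = self.text[pos + len(marker):]
        return result

    def get_after_final(self, marker: str, remove: bool) -> str:
        """Text after the last marker; optionally remove the marker and the tail."""
        pos = self.text.rfind(marker)
        if pos == -1:
            result = self.text
            if remove:
                self.text = ""        # marker missing: buffer is cleared
            return result
        result = self.text[pos + len(marker):]
        if remove:
            self.text = self.text[:pos]
        return result
```

Calling get_before or get_after_final with removal enabled consumes the buffer step by step, which is convenient for simple parsers but surprising if the caller expected the content to stay intact.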

Replacement and Removal

Task | Method | Behavior
Replace all exact matches | Replace | Replaces all occurrences and returns the number of replacements.
Replace first match | ReplaceFirst | Replaces only the first occurrence.
Case-insensitive replace | ReplaceNoCase | Replaces all case-insensitive occurrences.
Replace with integer | ReplaceI | Replaces occurrences of a string with a decimal integer value.
Replace whole words | ReplaceWord | Replaces whole-word occurrences. Limited to Latin1-style word matching.
Replace after final marker | ReplaceAfterFinal | Replaces content after the final occurrence of a marker.
Replace between markers | ReplaceAllBetween, ReplaceBetween | Replaces text between marker pairs, or replaces values only inside marker regions.
Remove before marker | RemoveBefore | Removes text before the first marker and also removes the marker.
Remove after final marker | RemoveAfterFinal | Removes text after the final marker and also removes the marker.
Remove character range | RemoveCharsAt | Removes a specified range of characters.
Shorten from end | Shorten | Removes the last N characters.
Whole-word limitation: ReplaceWord, like ContainsWord, is limited to Latin1 text. Underscore is treated as part of a word.

Encoding, Decoding, and Hashing

Need | Method | Important Detail
Encode content in place | Encode | Converts the current text to bytes using a charset, then encodes those bytes as Base64, hex, URL encoding, quoted-printable, HTML entities, etc.
Get encoded content without modifying | GetEncoded | Returns encoded text while leaving this object unchanged.
Decode content in place | Decode | Decodes the current content and interprets the decoded bytes using the specified charset.
Decode and append | DecodeAndAppend | Decodes a supplied encoded value and appends the resulting string.
Get decoded bytes | GetDecoded | Decodes the current content and returns the decoded bytes.
Decode HTML entities | EntityDecode | Decodes HTML entity references in place.
Hash content | GetHash | Converts text to bytes using the requested charset, hashes those bytes, and returns the hash in the requested encoding.
Supported encoding examples: Encoding names include base64, hex, quoted-printable or qp, url, base32, Q, B, url_rfc1738, url_rfc2396, url_rfc3986, url_oauth, uu, modBase64, and html.
Hash algorithms: GetHash supports algorithms including sha1, sha256, sha384, sha512, sha3-224, sha3-256, sha3-384, sha3-512, md2, md5, ripemd128, ripemd160, ripemd256, and ripemd320.
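
Why the charset matters is easiest to see with real bytes. The sketch below reproduces the text-to-bytes-to-encoding pipeline described for GetEncoded and GetHash using only the Python standard library. The function names and the two supported encoding labels are hypothetical, and the lowercase hex output is a choice of this sketch rather than a statement about Chilkat's output.

```python
import base64
import hashlib


def encode_bytes(raw: bytes, encoding: str) -> str:
    """Encode raw bytes as hex or Base64 text."""
    if encoding == "hex":
        return raw.hex()
    if encoding == "base64":
        return base64.b64encode(raw).decode("ascii")
    raise ValueError(f"unsupported encoding: {encoding}")


def get_encoded(text: str, encoding: str, charset: str) -> str:
    """Convert text to bytes with charset, then encode those bytes."""
    return encode_bytes(text.encode(charset), encoding)


def get_hash(text: str, algorithm: str, charset: str, encoding: str) -> str:
    """Convert text to bytes with charset, hash the bytes, encode the digest."""
    digest = hashlib.new(algorithm, text.encode(charset)).digest()
    return encode_bytes(digest, encoding)
```

The same character can map to different bytes: é is the two bytes c3 a9 in utf-8 but the single byte e9 in iso-8859-1, so hashes and encodings computed under different charsets will not match even though the text looks identical.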

Line Endings, Case, Whitespace, and Text Cleanup

Task | Method | Behavior
Convert to CRLF | ToCRLF | Converts line endings to Windows CRLF format.
Convert to LF | ToLF | Converts line endings to LF-only Unix format.
Lowercase | ToLowercase | Converts content to lowercase.
Uppercase | ToUppercase | Converts content to uppercase.
Trim ends | Trim | Removes whitespace from both ends.
Normalize inner whitespace | TrimInsideSpaces | Replaces tabs, CR, and LF with spaces, then collapses multiple spaces into one.
Remove Latin/Central European accents | RemoveAccents | Removes diacritics for accented characters in Windows-1252 and Windows-1250 character sets.
Remove emojis | RemoveEmojis | Removes emoji characters from the content.
Accent-removal scope: RemoveAccents applies only to accented characters in Windows-1252 and Windows-1250. Accent marks for other languages are not removed.
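
The line-ending and whitespace rules above are simple enough to express directly. The following Python functions are a sketch of the described behavior of ToLF, ToCRLF, and TrimInsideSpaces, not the Chilkat code; the function names are hypothetical, and handling of a bare CR (old Mac style) is deliberately left out because the text does not describe it.

```python
import re


def to_lf(text: str) -> str:
    """Convert CRLF line endings to LF."""
    return text.replace("\r\n", "\n")


def to_crlf(text: str) -> str:
    """Convert line endings to CRLF without doubling existing CRLF pairs."""
    return to_lf(text).replace("\n", "\r\n")


def trim_inside_spaces(text: str) -> str:
    """Replace tabs, CR, and LF with spaces, then collapse runs of spaces."""
    spaced = re.sub(r"[\t\r\n]", " ", text)
    return re.sub(r" {2,}", " ", spaced)
```

Normalizing to LF first is what keeps to_crlf safe on mixed input: a naive replacement of LF with CRLF would turn an existing CRLF into CR CR LF.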

Regular Expressions

Task | Method | Behavior
Find regex matches | RegexMatch | Searches the content for matches to a regular expression and returns match details in a JsonObject. Returns the number of matches, or -1 for failure.
Replace capture groups | RegexReplace | Replaces substrings of capture groups found in a previous call to RegexMatch.
Timeout control: RegexMatch accepts a timeoutMs argument. Pass 0 for no timeout.

Quoted-String Masking

Masking quoted strings is useful when performing replacements or parsing where quoted content should be temporarily protected.

Method | Purpose | quoteType Values
MaskQuotedStrings | Replaces content inside quoted strings with a mask character and stores the original quoted strings in a StringTable. | 0 = single and double quotes, 1 = single quotes only, 2 = double quotes only.
RestoreMaskedStrings | Restores strings previously masked by MaskQuotedStrings. | Uses the same quote-type convention.
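
The mask-then-restore pattern can be sketched in plain Python to show why it is useful. This is an illustrative model, not Chilkat's implementation: the function names, the use of a Python list in place of a StringTable, the same-length masking, and the lack of support for escaped quotes inside strings are all assumptions. Only the quote_type values (0, 1, 2) follow the convention given above.

```python
import re
from typing import List, Tuple

_DOUBLE = r'"[^"]*"'
_SINGLE = r"'[^']*'"
# quote_type: 0 = single and double quotes, 1 = single only, 2 = double only.
_PATTERNS = {0: _DOUBLE + "|" + _SINGLE, 1: _SINGLE, 2: _DOUBLE}


def mask_quoted_strings(text: str, mask_char: str = "*",
                        quote_type: int = 0) -> Tuple[str, List[str]]:
    """Mask the content of quoted strings and return the originals in order."""
    saved: List[str] = []

    def _mask(match: "re.Match[str]") -> str:
        quoted = match.group(0)
        saved.append(quoted)
        # Keep the quote characters, mask everything between them.
        return quoted[0] + mask_char * (len(quoted) - 2) + quoted[-1]

    masked = re.sub(_PATTERNS[quote_type], _mask, text)
    return masked, saved


def restore_masked_strings(text: str, saved: List[str],
                           quote_type: int = 0) -> str:
    """Put the saved quoted strings back, matching them in order."""
    originals = iter(saved)
    return re.sub(_PATTERNS[quote_type],
                  lambda m: next(originals, m.group(0)), text)
```

A typical use is to mask, run a broad replacement such as removing spaces around "=", and then restore, so that the same characters inside quoted values are left exactly as they were.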

Markdown to HTML

MarkdownToHtml converts the Markdown content of this object to HTML, writing the result to another StringBuilder. It supports both full-document and streaming conversion.

Option Area | JSON Options | Behavior
Streaming conversion | streaming behavior through options | In streaming mode, complete Markdown lines are converted and removed from this object, leaving a final partial line if present.
JavaScript output | emitJavascript: true | Emits JavaScript function calls instead of HTML, useful for real-time embedded browser updates.
HTML shell | streamingShell: true | Converts an empty string to an HTML shell for use before applying streaming JavaScript updates.
Theme | theme | Supports themes such as ChatGPT, cleanWin, cleanMac, and raw.
Prism highlighting | usesPrism, prism.theme, prism.version | Includes Prism code highlighting. Themes include tomorrow, default, coy, dark, funky, okaidia, solarizedlight, and twilight.
HTML document parts | docType, rootElement, head, bodyStart, bodyEnd, noContentDiv | Allows control over the generated HTML wrapper when no predefined theme is used.
Copy button | copyButton: true | Can include copy-button HTML in non-streaming mode. The copy button is added by default in streaming mode.
Streaming behavior: In streaming mode, no HTML is emitted if the object contains only a partial Markdown line. Complete lines are converted and removed, leaving the unfinished line for the next call.
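
The buffering rule behind streaming mode (convert complete lines, keep the unfinished one) can be shown without any Markdown logic at all. The Python sketch below models only that rule; it performs no Markdown conversion and is not the Chilkat implementation. The names take_complete_lines and StreamingBuffer are hypothetical, and LF is assumed to be the line terminator.

```python
from typing import Tuple


def take_complete_lines(buffer: str) -> Tuple[str, str]:
    """Split buffer into (complete lines, trailing partial line)."""
    cut = buffer.rfind("\n") + 1   # index just past the last newline; 0 if none
    return buffer[:cut], buffer[cut:]


class StreamingBuffer:
    """Accumulates chunks and releases only complete lines."""

    def __init__(self) -> None:
        self.pending = ""

    def feed(self, chunk: str) -> str:
        """Add a chunk; return the complete lines now ready for conversion."""
        self.pending += chunk
        complete, self.pending = take_complete_lines(self.pending)
        return complete
```

Feeding "# Ti" releases nothing, and feeding "tle\nSome" afterward releases the full heading line while "Some" waits for its newline. This is why a call made on a partial line alone produces no HTML.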

Punycode, Emojis, Obfuscation, and IDs

Feature | Methods / Properties | Behavior
Punycode | PunyEncode, PunyDecode | Encodes or decodes the current string using Punycode.
Emoji detection/removal | HasEmojis, RemoveEmojis | Detects and removes emoji characters.
Obfuscation | Obfuscate, Unobfuscate | Applies reversible Chilkat string obfuscation. This is not encryption and is not secure.
UUID generation | AppendUuid, AppendUuid7 | Generates and appends random version 4 or version 7 UUIDs.
Obfuscation is not encryption: Chilkat string obfuscation is deterministic and reversible. It is intended only to make a string unintelligible, not to protect secrets.
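
For readers unfamiliar with Punycode, the round trip mentioned above for PunyEncode and PunyDecode looks like this when performed with Python's built-in punycode codec. This is not Chilkat code, and the helper names are hypothetical. Whether Chilkat's methods add or expect the "xn--" prefix used in internationalized domain names is not stated in this overview, so the sketch works with the bare Punycode form only.

```python
def puny_encode(text: str) -> str:
    """Encode Unicode text as bare Punycode (no 'xn--' prefix)."""
    return text.encode("punycode").decode("ascii")


def puny_decode(text: str) -> str:
    """Decode bare Punycode back to Unicode text."""
    return text.encode("ascii").decode("punycode")
```

The encoded form keeps the ASCII characters in place and appends a suffix describing where the non-ASCII characters belong, so "münchen" becomes "mnchen-3ya".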

Method Summary by Category

Category | Methods | Purpose
Build content | Append, AppendLine, AppendLn, AppendInt, AppendInt64, Prepend, SetString | Add, prepend, or replace text content.
Binary and random append | AppendBd, AppendEncoded, AppendRandom, AppendUuid, AppendUuid7 | Append text derived from bytes, random bytes, or generated UUIDs.
Search and compare | Contains, ContainsWord, StartsWith, EndsWith, ContentsEqual, ContentsEqualSb | Locate or compare text.
Extract text | GetBefore, GetAfterFinal, GetBetween, GetAfterBetween, GetRange, GetRangeSb, GetNth, LastNLines | Extract substrings by marker, range, delimiter, or line count.
Replace and remove | Replace, ReplaceFirst, ReplaceNoCase, ReplaceWord, ReplaceBetween, ReplaceAllBetween, RemoveBefore, RemoveAfterFinal, RemoveCharsAt, Shorten | Modify or remove content.
Encode, decode, hash | Encode, Decode, DecodeAndAppend, GetEncoded, GetDecoded, EntityDecode, GetHash | Convert between text, bytes, encodings, entities, and hashes.
Normalize and cleanup | ToCRLF, ToLF, ToLowercase, ToUppercase, Trim, TrimInsideSpaces, RemoveAccents, RemoveEmojis | Normalize line endings, case, whitespace, accents, and emojis.
Advanced processing | RegexMatch, RegexReplace, MaskQuotedStrings, RestoreMaskedStrings, MarkdownToHtml, PunyEncode, PunyDecode | Perform regex, quoted-string-safe transformations, Markdown conversion, and Punycode conversion.
File I/O and clearing | LoadFile, WriteFile, WriteFileIfModified, Clear, SecureClear | Load, save, conditionally save, clear, or securely clear text.

Diagnostics and Troubleshooting

Problem Area | Member | What to Check
General failure | LastErrorText | Check after failed file loading, file writing, encoding, decoding, regex, Markdown conversion, or other unexpected behavior.
Unexpected text length | Length | Confirm whether the operation appended, replaced, removed, shortened, or cleared content.
Encoded data cannot decode to text | Decode | Confirm the binary encoding is correct and that decoded bytes are valid for the specified charset.
Hash mismatch | GetHash | Confirm the same charset, hash algorithm, and output encoding are used by both sides.
Whole-word search or replace behaves unexpectedly | ContainsWord, ReplaceWord | These methods are limited to Latin1-style word matching.
Markdown streaming emits no HTML | MarkdownToHtml | In streaming mode, only complete Markdown lines are converted. A final partial line is retained until completed.
Regex operation takes too long | RegexMatch | Use the timeoutMs argument to limit processing time.

Common Pitfalls

Pitfall | Better Approach
Using Encode when the original text must remain unchanged. | Use GetEncoded, which returns the encoded value without modifying this object.
Decoding binary data that does not represent text. | Use GetDecoded to retrieve decoded bytes when the decoded result is binary rather than a string.
Ignoring charset when hashing or encoding text. | Specify the charset explicitly, typically utf-8, so byte representation is predictable.
Assuming obfuscation protects secrets. | Use real encryption for secrets. Obfuscate is reversible and not secure.
Expecting ContainsWord and ReplaceWord to handle all Unicode word rules. | Use them for Latin1 word matching only.
Forgetting that some extraction methods can remove content. | Review the removeFlag argument before calling methods such as GetBefore, GetAfterFinal, and GetRange.
Using Clear for sensitive text. | Use SecureClear when the buffer contains passwords, tokens, private data, or other secrets.
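
The difference between clearing and securely clearing can be illustrated with a mutable byte buffer. The Python sketch below only models the idea of overwriting memory with zero bytes before releasing it; it is not how Chilkat implements SecureClear, and the function names are hypothetical. Python strings are immutable and may be copied by the runtime, so this sketch uses a bytearray and makes no guarantee that no other copy of the data exists.

```python
def zero_fill(buffer: bytearray) -> None:
    """Overwrite every byte of the buffer with zero, in place."""
    for i in range(len(buffer)):
        buffer[i] = 0


def secure_clear(buffer: bytearray) -> None:
    """Zero the buffer's bytes, then remove its contents."""
    zero_fill(buffer)
    buffer.clear()
```

An ordinary clear only drops the reference to the data, so the old bytes can linger in memory until they happen to be overwritten. Zeroing first narrows that window.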

Best Practices

Recommendation | Reason
Use StringBuilder for text that will be modified repeatedly. | It avoids repeated string replacement and makes append, remove, and transform operations straightforward.
Use explicit charsets for file, hash, encode, decode, and BinData operations. | It ensures text is converted to and from bytes consistently.
Use WriteFileIfModified for generated files. | It avoids rewriting files when the generated content is unchanged.
Use GetEncoded when the original content should remain intact. | It returns an encoded value without modifying the object.
Use quoted-string masking before broad replacements when quoted text should be preserved. | MaskQuotedStrings and RestoreMaskedStrings make it safer to process only unquoted portions.
Use MarkdownToHtml for both complete and streaming Markdown conversion. | It supports full HTML output, streaming response updates, themes, Prism code highlighting, and configurable document wrappers.
Use SecureClear for secrets. | It overwrites allocated memory with zero bytes before deallocating.
Check LastErrorText after failures. | It provides the most useful diagnostic detail for loading, writing, encoding, decoding, regex, Markdown conversion, and transformation failures.

Summary

Chilkat.StringBuilder is a flexible mutable text container for constructing and transforming strings. It supports appending text, integers, random values, UUIDs, encoded bytes, and BinData content, as well as searching, comparing, extracting, replacing, masking, regex processing, line-ending normalization, encoding, decoding, hashing, Markdown conversion, file I/O, and secure clearing.

The most important practical guidance is to use explicit charsets whenever text is converted to bytes, use GetEncoded instead of Encode when the original content should remain unchanged, and use SecureClear for sensitive strings.