Binary Encodings Supported by Chilkat
In the Chilkat API, method arguments and properties such as EncodingMode specify a binary encoding for data, such as hex or base64.
Your application can choose from these binary encodings whenever applicable.
The supported binary encodings are:
- base64
- base64_mime (same as base64, but the output is split into lines, just as it would appear in MIME or in a PEM)
- modbase64 / base64url
- base58
- base45
- base32
- hex
- hex_lower
- uu
- qp (for quoted-printable)
- url / url_rfc3986
- url_rfc1738
- Q (for MIME Q-Encoding)
- B (for MIME B-Encoding)
- fingerprint
- decimal
- eda (for encoding/decoding to the UN/EDIFACT Syntax Level A character set)
- json (for escaping and unescaping JSON strings)
- decList
- ascii85
base64
Base64 encoding is a method of converting binary data (like images, files, or encrypted text) into an ASCII string. It uses 64 printable characters (A-Z, a-z, 0-9, +, /)
to represent binary data.
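As a minimal illustration, independent of Chilkat, Python's standard base64 module produces this kind of output. The sample bytes are an arbitrary choice for this sketch.

```python
import base64

data = b"Hello"                                   # arbitrary sample bytes
encoded = base64.b64encode(data).decode("ascii")  # ASCII text using A-Z, a-z, 0-9, +, / and = padding
decoded = base64.b64decode(encoded)               # round-trips back to the original bytes
```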
base64_mime
For MIME (Multipurpose Internet Mail Extensions) and PEM (Privacy-Enhanced Mail), Base64 is split into fixed-length lines for readability and compatibility with older email systems.
Example:
Q2F0cyBhcmUgYXdlc29tZSBhbmltYWxzIHRoYXQgY2FuIHJ1biB2ZXJ5IGZhc3QsIGJ1dCB0aGV5
IGFsc28gdGFrZSBuYXBzIGZvciBsb25nIHBlcmlvZHMgb2YgdGltZS4=
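The line splitting can be sketched with Python's standard library, whose `base64.encodebytes` breaks the output after every 76 characters. This is only an illustration of the format, not Chilkat's implementation, and the variable names are made up for the sketch.

```python
import base64

text = (b"Cats are awesome animals that can run very fast, "
        b"but they also take naps for long periods of time.")

# encodebytes() inserts a newline after every 76 output characters (57 input bytes).
mime_lines = base64.encodebytes(text).decode("ascii").splitlines()

# Decoders ignore the line breaks, so joining the lines restores plain base64.
restored = base64.b64decode("".join(mime_lines))
```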
modbase64 / base64url
ModBase64/Base64Url is a URL-safe variant of Base64 where:
- + is replaced with -
- / is replaced with _
- Padding (=) is often omitted
ModBase64 is URL-safe and does not need extra encoding for web transmission. ModBase64 is just another name sometimes used for Base64Url.
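A small sketch of the character substitution, using Python's standard base64 module rather than Chilkat. The two sample bytes are chosen only because they make the standard alphabet emit both special characters, and `b64url_decode` is a hypothetical helper name.

```python
import base64

data = b"\xfb\xff"   # chosen so the standard alphabet produces both '+' and '/'

standard = base64.b64encode(data).decode("ascii")
url_safe = base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")  # padding omitted

def b64url_decode(s: str) -> bytes:
    # Restore the stripped padding before decoding.
    return base64.urlsafe_b64decode(s + "=" * (-len(s) % 4))
```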
base58
Base58 is a binary-to-text encoding format designed to be:
- URL and filename safe (no "0", "O", "I", "l", "+", "/").
- Human-friendly, avoiding similar-looking characters.
It is commonly used in Bitcoin addresses.
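The following is a minimal sketch of Base58 using the Bitcoin alphabet. It assumes that alphabet and the usual convention that each leading zero byte maps to the character "1"; the function names are hypothetical, and this is not a description of Chilkat's internals.

```python
# Bitcoin Base58 alphabet: digits and letters without 0, O, I and l.
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def b58encode(data: bytes) -> str:
    n = int.from_bytes(data, "big")      # treat the bytes as one big integer
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = ALPHABET[rem] + out
    # Each leading zero byte is represented by the first alphabet character.
    pad = len(data) - len(data.lstrip(b"\x00"))
    return ALPHABET[0] * pad + out

def b58decode(text: str) -> bytes:
    n = 0
    for ch in text:
        n = n * 58 + ALPHABET.index(ch)
    pad = len(text) - len(text.lstrip(ALPHABET[0]))
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return b"\x00" * pad + body
```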
base45
Base45 is an encoding format that uses 45 characters (0-9, A-Z, and space/punctuation).
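As a sketch, assuming the alphabet and grouping defined in RFC 9285 (two bytes become three characters, a trailing single byte becomes two), an encoder can be written as follows. The names are hypothetical and the code is not taken from Chilkat.

```python
# RFC 9285 alphabet: 0-9, A-Z, then space and $ % * + - . / :
ALPHABET45 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"

def b45encode(data: bytes) -> str:
    out = []
    for i in range(0, len(data), 2):
        chunk = data[i:i + 2]
        n = int.from_bytes(chunk, "big")
        # Two bytes yield three characters; a trailing single byte yields two.
        for _ in range(3 if len(chunk) == 2 else 2):
            n, rem = divmod(n, 45)
            out.append(ALPHABET45[rem])   # least significant digit first
    return "".join(out)
```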
base32
Base32 is a text encoding format that uses 32 characters (A-Z and 2-7). Padding ("=") is used to align the output to a multiple of 8 characters. Because it uses a single letter case, it is well suited to case-insensitive systems, and its alphabet is URL-safe.
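A short illustration with Python's standard base64 module (not Chilkat) shows the padding and the case-insensitive decoding. The sample inputs are arbitrary.

```python
import base64

b32_full = base64.b32encode(b"Hello").decode("ascii")   # 5 bytes fill one 8-character group
b32_padded = base64.b32encode(b"Hi").decode("ascii")    # shorter input is padded with '='

# casefold=True lets the decoder accept lowercase input as well.
decoded = base64.b32decode("jbswy3dp", casefold=True)
```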
hex
Hexadecimal (uppercase) encoding represents binary data as:
- Pairs of hex digits (0-9, A-F).
- Each byte (8 bits) is shown as two characters.
- Example: The us-ascii bytes for Hello (decimal bytes 72, 101, 108, 108, 111) are 48656C6C6F in uppercase hexadecimal.
hex_lower
Hexadecimal encoding, but using lowercase a-f instead of uppercase A-F.
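Both hex variants can be illustrated with plain Python, independent of Chilkat. Decoding accepts either letter case.

```python
data = "Hello".encode("ascii")

hex_upper = data.hex().upper()   # the "hex" style: uppercase A-F
hex_lower = data.hex()           # the "hex_lower" style: lowercase a-f

# Both spellings decode to the same bytes.
same = bytes.fromhex(hex_upper) == bytes.fromhex(hex_lower) == data
```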
uu
UU Encoding (Unix-to-Unix Encoding) is a binary-to-text encoding format that:
- Encodes binary data into printable ASCII characters.
- Uses a 6-bit representation for each character.
- Often used for email attachments in early Unix systems.
- Begins with a begin line and ends with an end line.
Example of UU Encoding:
Original Text:
Hello
UU Encoded Output:
begin 644 file.dat
%2&5L;&\`
`
end
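The framing can be sketched with Python's `binascii.b2a_uu`, which encodes one line of at most 45 input bytes. The `uu_encode` helper, its default file name, and its default mode are hypothetical choices for this illustration, not part of Chilkat.

```python
import binascii

def uu_encode(data: bytes, name: str = "file.dat", mode: str = "644") -> str:
    lines = [f"begin {mode} {name}"]
    for i in range(0, len(data), 45):            # at most 45 input bytes per line
        chunk = data[i:i + 45]
        # backtick=True writes zero values as '`' instead of a space.
        line = binascii.b2a_uu(chunk, backtick=True).decode("ascii")
        lines.append(line.rstrip("\n"))
    lines += ["`", "end"]                        # zero-length line, then the trailer
    return "\n".join(lines)

uu_lines = uu_encode(b"Hello").splitlines()
```

The first character of each data line encodes the number of bytes on that line, which is why the single data line for the 5-byte input starts with "%" (32 + 5).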
qp / quoted-printable
Quoted-Printable encoding is a binary-to-text encoding scheme that:
- Encodes non-printable and special characters as = followed by their hex value.
- Keeps most ASCII characters readable.
- Commonly used in MIME email to safely transmit special characters.
Example:
Café → Caf=E9 (with é as the single ISO-8859-1 byte 0xE9; as UTF-8 it would be Caf=C3=A9)
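Because quoted-printable works on bytes, the result depends on the charset used to turn the text into bytes. The sketch below uses Python's standard quopri module (not Chilkat) to show this for two charsets.

```python
import quopri

# The same text gives different output depending on the byte encoding of "é".
qp_latin1 = quopri.encodestring("Café".encode("latin-1")).decode("ascii")
qp_utf8 = quopri.encodestring("Café".encode("utf-8")).decode("ascii")

decoded = quopri.decodestring(qp_utf8).decode("utf-8")
```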
url / url_rfc3986
URL Encoding converts characters into a format safe for URLs:
- Special characters (space, &, /) are replaced with % followed by their hexadecimal ASCII value.
- Spaces become %20, / becomes %2F, & becomes %26.
- Ensures URLs are properly transmitted over HTTP.
Example:
Hello World! → Hello%20World%21
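The same percent-encoding can be reproduced with Python's `urllib.parse.quote`. Passing `safe=""` is a choice made for this sketch so that "/" is also encoded; it does not describe any Chilkat setting.

```python
from urllib.parse import quote, unquote

encoded = quote("Hello World!", safe="")   # space and "!" are percent-encoded
path_like = quote("a/b&c", safe="")        # "/" and "&" are percent-encoded too
decoded = unquote(encoded)
```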
url_rfc1738
The difference between RFC 1738 and RFC 3986 for URL encoding primarily revolves around the encoding of spaces:
Character | RFC 1738 (1994) | RFC 3986 (2005) |
---|---|---|
Space | Encoded as + | Encoded as %20 |
Reserved characters | % encoding for reserved characters | Same, but more explicitly defined |
Usage Context | Mainly for form submissions (application/x-www-form-urlencoded) | For generic URIs and URLs |
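The difference in how a space is encoded can be seen by comparing two functions from Python's standard library. Here `quote_plus` stands in for the form-style behavior and `quote` for the generic URI style; this is an analogy for illustration, and Chilkat's handling of other characters may differ.

```python
from urllib.parse import quote, quote_plus, unquote_plus

form_style = quote_plus("Hello World!")        # space becomes "+"
uri_style = quote("Hello World!", safe="")     # space becomes "%20"

decoded = unquote_plus(form_style)             # "+" is turned back into a space
```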
Q
Q encoding is a method defined in MIME (Multipurpose Internet Mail Extensions) for encoding non-ASCII characters in email header fields (like Subject and From). It allows special characters to be safely transmitted over protocols that only support ASCII.
How It Works
- Spaces are replaced with underscores (_).
- Non-ASCII characters are encoded as = followed by two hex digits (=XX).
- The encoded text is prefixed with =?charset?Q? and terminated with ?=.
Example
Original:
Subject: Hello Grüß Gott
Q-encoded:
Subject: =?UTF-8?Q?Hello_Gr=C3=BC=C3=9F_Gott?=
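The steps above can be sketched in a few lines of Python. The `q_encode` helper is hypothetical and deliberately conservative about which ASCII characters it leaves unencoded; it is an illustration, not Chilkat's implementation. The standard `email.header.decode_header` function is used to check the result.

```python
from email.header import decode_header

def q_encode(text: str, charset: str = "UTF-8") -> str:
    out = []
    for b in text.encode(charset):
        ch = chr(b)
        if ch == " ":
            out.append("_")                      # spaces become underscores
        elif b < 128 and (ch.isalnum() or ch in "!*+-/"):
            out.append(ch)                       # safe ASCII is kept as-is
        else:
            out.append(f"={b:02X}")              # everything else becomes =XX
    return f"=?{charset}?Q?{''.join(out)}?="

encoded_word = q_encode("Hello Grüß Gott")
raw, _charset = decode_header(encoded_word)[0]   # raw holds the decoded bytes
```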
B
B encoding is a MIME (Multipurpose Internet Mail Extensions) mechanism for encoding non-ASCII characters in email header fields. It uses Base64 encoding to safely represent binary data as ASCII text.
How It Works
- The header is prefixed with =?charset?B?
- The content is Base64 encoded.
- Used for Subject, From, and other header fields.
Example
Original:
Subject: Hello Grüß Gott
B-encoded:
Subject: =?UTF-8?B?SGVsbG8gR3LDvMOfIEdvdHQ=?=
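A minimal sketch of the same steps in Python: the header text is converted to bytes in the chosen charset, Base64 encoded, and wrapped in the encoded-word markers. The `b_encode` helper is hypothetical, and `email.header.decode_header` from the standard library is used to confirm that the result decodes back to the original text.

```python
import base64
from email.header import decode_header

def b_encode(text: str, charset: str = "UTF-8") -> str:
    payload = base64.b64encode(text.encode(charset)).decode("ascii")
    return f"=?{charset}?B?{payload}?="

encoded_word = b_encode("Hello Grüß Gott")
raw, _charset = decode_header(encoded_word)[0]   # raw holds the decoded bytes
```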
fingerprint
The fingerprint encoding is a lowercase hex encoding where each byte (a pair of hex digits) is separated by a colon character. For example: 6a:de:e0:af:56:f8:0c:04:11:5b:ef:4d:49:ad:09:23
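The format is simple to reproduce in plain Python, shown here only as an illustration. The variable names are made up for this sketch.

```python
digest = bytes.fromhex("6adee0af56f80c04115bef4d49ad0923")

# Two lowercase hex digits per byte, joined with colons.
fingerprint = ":".join(f"{b:02x}" for b in digest)

# Removing the colons gives ordinary hex, which decodes back to the bytes.
restored = bytes.fromhex(fingerprint.replace(":", ""))
```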
decimal
The decimal encoding is for converting large decimal integers to/from a big-endian binary representation. For example, the decimal string 72623859790382856 converts to the bytes 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08.
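The conversion can be checked with Python's arbitrary-precision integers. The helper names are hypothetical, and the sketch assumes the minimal number of bytes is wanted, with no leading zero bytes.

```python
def decimal_to_bytes(s: str) -> bytes:
    n = int(s)
    length = max(1, (n.bit_length() + 7) // 8)   # minimal byte count
    return n.to_bytes(length, "big")             # most significant byte first

def bytes_to_decimal(b: bytes) -> str:
    return str(int.from_bytes(b, "big"))
```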
eda
EDA encoding is a mechanism used to encode data into the UN/EDIFACT Syntax Level A character set, which is a restricted 7-bit ASCII subset defined for electronic data interchange (EDI). This encoding ensures that data conforms to the limited character set required by certain EDI systems.
json
JSON escape encoding is the process of using backslashes (\) to represent characters in JSON strings that might otherwise break parsing or are non-printable.
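As an illustration using Python's standard json module (not Chilkat), escaping can be done by serializing the string and dropping the surrounding quotes. The helper names are hypothetical.

```python
import json

def json_escape(s: str) -> str:
    # dumps() returns a quoted JSON string; strip the outer quotes.
    return json.dumps(s, ensure_ascii=False)[1:-1]

def json_unescape(s: str) -> str:
    return json.loads('"' + s + '"')

escaped = json_escape('He said "hi"\n')   # quotes and the newline get a backslash escape
```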
decList
The decList encoding is for converting comma-separated lists of decimal integers to bytes and back. For example, 84, 104, 101, 32, 116, 114, 117, 101, 32, 115, 105, 103, 110 represents the us-ascii bytes for "The true sign". Each decimal integer is a value from 0 to 255.
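A short Python sketch of the conversion in both directions. The helper names are hypothetical; `bytes()` itself rejects any value outside 0 to 255.

```python
def declist_to_bytes(s: str) -> bytes:
    # int() tolerates the spaces around each number.
    return bytes(int(part) for part in s.split(","))

def bytes_to_declist(b: bytes) -> str:
    return ", ".join(str(x) for x in b)

sample = "84, 104, 101, 32, 116, 114, 117, 101, 32, 115, 105, 103, 110"
decoded = declist_to_bytes(sample)
```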
ascii85
ASCII85 encoding (also called Base85) is a method for encoding binary data into ASCII text.
How It Works
- Takes 4 bytes (32 bits) of binary data and encodes it as 5 ASCII characters.
- It is more space-efficient than Base64, using only 25% overhead instead of 33%.
- Encoded text is readable and safe for text-based protocols (e.g., email, JSON).
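The 4-to-5 ratio can be observed with `base64.a85encode` from Python's standard library. This is an illustration only; other Ascii85 variants (for example ones that add `<~` and `~>` delimiters) exist, and which variant Chilkat emits is not stated here.

```python
import base64

a85 = base64.a85encode(b"Hello").decode("ascii")   # 5 bytes: one full group plus one extra byte
four = base64.a85encode(b"\x01\x02\x03\x04")       # exactly one 4-byte group
restored = base64.a85decode(a85)
```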