Why Charset Matters in Digital Signatures
The character encoding (charset) of text matters when digitally signing it, because a digital signature is computed over the exact byte sequence of the content. Even a small difference in encoding produces a different byte stream, and therefore a different hash, so a signature made over one encoding will not verify against another.
- Signatures are computed over bytes, not characters.
  - Text like Grünhöfer can be represented differently in:
    - UTF-8: 47 72 C3 BC 6E 68 C3 B6 66 65 72
    - ISO-8859-1: 47 72 FC 6E 68 F6 66 65 72
  - These byte sequences are not equal, so the signature over them will differ.
- Verification must use the same encoding.
  - If the receiver re-encodes the text (e.g., from UTF-8 to ISO-8859-1) before verifying, verification fails whenever the text contains characters that the two charsets encode differently (any non-ASCII character, in this case).
- Determinism is essential.
  - For a digital signature to verify, the verifier must be given exactly the bytes that were signed; any encoding change breaks this.
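The points above can be checked directly. The following is a minimal sketch that encodes the same string under both charsets and compares the SHA-256 digests; SHA-256 is an assumption here, standing in for whatever hash a given signature scheme uses, and the string is written with escapes so the precomposed ü and ö are unambiguous.

```python
import hashlib

# "Grünhöfer" with precomposed ü (U+00FC) and ö (U+00F6).
text = "Gr\u00fcnh\u00f6fer"

# Encode the same characters under two different charsets.
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("iso-8859-1")

print(utf8_bytes.hex(" ").upper())    # 47 72 C3 BC 6E 68 C3 B6 66 65 72
print(latin1_bytes.hex(" ").upper())  # 47 72 FC 6E 68 F6 66 65 72

# Different bytes give different digests, so a signature over one
# digest cannot verify against the other.
utf8_digest = hashlib.sha256(utf8_bytes).hexdigest()
latin1_digest = hashlib.sha256(latin1_bytes).hexdigest()
print(utf8_digest == latin1_digest)   # False
```

Only the two non-ASCII characters change (two bytes each in UTF-8, one byte each in ISO-8859-1), but that is enough to change the digest completely.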
Example Scenario
If you sign this string:
"Grünhöfer"
with UTF-8, it is hashed and signed as:
47 72 C3 BC 6E 68 C3 B6 66 65 72
If someone verifies using ISO-8859-1, the hash input is instead:
47 72 FC 6E 68 F6 66 65 72
The signature check fails because the hash input is different.
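The scenario can be reproduced in a few lines. This is a sketch under stated assumptions: the Python standard library has no asymmetric signature scheme, so HMAC-SHA256 stands in for a real signature algorithm (it is a MAC, not a digital signature, but it depends on the exact input bytes in the same way). The key and the `sign` and `verify` helpers are hypothetical names used only for illustration.

```python
import hashlib
import hmac

# Hypothetical demo key; HMAC-SHA256 stands in for a real signature scheme.
KEY = b"demo-key-for-illustration-only"


def sign(text: str, charset: str) -> bytes:
    """Encode text with the given charset and return a tag over those bytes."""
    return hmac.new(KEY, text.encode(charset), hashlib.sha256).digest()


def verify(text: str, charset: str, tag: bytes) -> bool:
    """Recompute the tag over the re-encoded text and compare in constant time."""
    return hmac.compare_digest(sign(text, charset), tag)


# "Grünhöfer" with precomposed ü and ö.
text = "Gr\u00fcnh\u00f6fer"
tag = sign(text, "utf-8")

print(verify(text, "utf-8", tag))       # True
print(verify(text, "iso-8859-1", tag))  # False
```

One way to avoid the mismatch is to treat the signed bytes themselves as the thing being verified: the verifier checks the signature against the bytes as received and decodes them to text only afterwards.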