How Encryption Transforms Bytes (General Overview)

Encryption, in its most general form, is the process of converting plain data (plaintext) into encrypted data (ciphertext) using a specific algorithm and a key. This transformation always operates on bytes, regardless of the original data format (text, files, images, etc.).


Plaintext to Byte Conversion

Encryption algorithms do not understand text directly; they operate on raw bytes. If you are encrypting text, it must first be encoded into bytes using a character encoding (charset):

* Character Encoding Examples:

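* UTF-8: The most common web encoding, variable-length (1 to 4 bytes per character).

* UTF-16: Uses 2 bytes for most characters and 4 bytes (a surrogate pair) for characters outside the Basic Multilingual Plane, such as most emoji.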

* ISO-8859-1: Single-byte encoding, mainly for Western characters.
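To make the size differences concrete, here is a minimal sketch that counts the bytes each encoding needs for a few sample characters. The helper name encoded_length and the sample characters are illustrative choices, not part of any library; utf-16-le is used so that no byte order mark is added to the count.

```python
# Compare how many bytes each encoding needs for the same character.
# Escapes are used so the script does not depend on the source file's encoding.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]   # A, e-acute, euro sign, an emoji
encodings = ["utf-8", "utf-16-le", "iso-8859-1"]    # utf-16-le: no BOM prepended


def encoded_length(ch, encoding):
    """Return the byte count, or None if the encoding cannot represent ch."""
    try:
        return len(ch.encode(encoding))
    except UnicodeEncodeError:
        return None


for ch in samples:
    sizes = {enc: encoded_length(ch, enc) for enc in encodings}
    print(f"U+{ord(ch):04X}", sizes)
```

A result of None means the character has no representation in that encoding, which is the case for the euro sign and the emoji in ISO-8859-1.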

# Example: Text to bytes conversion
text = "Hello, Encryption!"
utf8_bytes = text.encode('utf-8')
print(utf8_bytes)

If you change the charset, the bytes will differ:

utf16_bytes = text.encode('utf-16')
print(utf16_bytes)

Output: Ciphertext (Encrypted Bytes)

The result of the encryption process is ciphertext, which is a series of encrypted bytes. These bytes are not human-readable and usually look like random data.

To make it transferable or storable:

* It is often encoded in:

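* Base64: Compact (about one third larger than the raw bytes) and safe for text channels. The standard alphabet uses + and /, so URLs and filenames need the URL-safe variant, which uses - and _ instead.

* Hexadecimal: Easier to read but twice the size of the raw bytes.

* Raw binary: The most compact form, but it only survives channels and storage that handle arbitrary bytes.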

import base64
ciphertext = b'\x89\xab\xcd\xef...'  # Example encrypted bytes
ciphertext_base64 = base64.b64encode(ciphertext)
print(ciphertext_base64)
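The trade-offs between these encodings can be checked directly. The sketch below uses three arbitrary bytes as a stand-in for ciphertext (they were picked only because they map onto the + and / characters of the standard Base64 alphabet) and compares the standard, URL-safe, and hexadecimal forms.

```python
import base64

# Three arbitrary bytes standing in for ciphertext.
raw = b"\xfb\xff\xfe"

std = base64.b64encode(raw)           # standard alphabet, may contain + and /
url = base64.urlsafe_b64encode(raw)   # URL-safe alphabet, uses - and _ instead
hexed = raw.hex()                     # two hex digits per byte

print(std)     # b'+//+'
print(url)     # b'-__-'
print(hexed)   # fbfffe
print(len(raw), len(std), len(hexed))   # 3 4 6

# Every text form decodes back to exactly the same bytes.
assert base64.b64decode(std) == raw
assert base64.urlsafe_b64decode(url) == raw
assert bytes.fromhex(hexed) == raw
```

Whichever form is chosen, it is only a transport wrapper: the receiver must undo it to get the original ciphertext bytes back before decrypting.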

Character Encoding Matters

When you are encrypting text data, the character encoding (charset) you use to convert the text to bytes directly affects the encryption result:

* UTF-8 and UTF-16 produce different byte representations of the same string.

* The encryption algorithm only sees bytes, not the original characters.

Example:

# Original string
original = "Encrypt me!"
# UTF-8 encoding
utf8_bytes = original.encode('utf-8')
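# UTF-16 encoding (Python prepends a byte order mark and uses the platform's
# byte order; the output shown below is from a little-endian machine)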
utf16_bytes = original.encode('utf-16')
print(utf8_bytes)    # b'Encrypt me!'
print(utf16_bytes)   # b'\xff\xfeE\x00n\x00c\x00r\x00y\x00p\x00t\x00 \x00m\x00e\x00!\x00'
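To see how the encoding choice carries through to the ciphertext, the following sketch feeds both byte sequences to a toy cipher. The function toy_xor_cipher and the key are assumptions made up for illustration: it XORs the data with a keystream derived from SHA-256 and is not secure, so it should not stand in for a real algorithm such as AES.

```python
import hashlib


def toy_xor_cipher(data: bytes, key: bytes) -> bytes:
    """Illustration only, NOT secure: XOR data with a SHA-256-derived keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        # Each block of keystream is the hash of the key plus a counter.
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    # zip stops at the end of data, so extra keystream bytes are ignored.
    return bytes(b ^ k for b, k in zip(data, stream))


key = b"demo key"          # hypothetical key, for illustration only
message = "Encrypt me!"

ct_utf8 = toy_xor_cipher(message.encode("utf-8"), key)
ct_utf16 = toy_xor_cipher(message.encode("utf-16-le"), key)

print(len(ct_utf8), ct_utf8.hex())     # 11 bytes of ciphertext
print(len(ct_utf16), ct_utf16.hex())   # 22 bytes of ciphertext
```

Same string and same key, yet the two ciphertexts differ in both length and content, because the cipher only ever saw two different byte sequences.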

Decryption

The decryption process is the reverse of encryption:

  1. The ciphertext is transformed back into plaintext bytes.
  2. These bytes are decoded back into readable text using the same character encoding that was used to encode the text before encryption.
    # decrypt_function is a placeholder for your cipher's decryption routine
    # Decrypt the bytes and decode
    decrypted_bytes = decrypt_function(ciphertext)
    original_text = decrypted_bytes.decode('utf-8')
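The second step is where mismatched encodings usually surface. As a small sketch, assume decryption has already succeeded and returned the original plaintext bytes (the variable below is a stand-in for that result); decoding them with the wrong charset does not raise an error here, it silently produces different text.

```python
# Stand-in for the bytes a successful decryption would return:
# the word "cafe" with an acute accent on the e, encoded as UTF-8.
decrypted_bytes = "caf\u00e9".encode("utf-8")      # b'caf\xc3\xa9'

right = decrypted_bytes.decode("utf-8")        # same charset as at encryption time
wrong = decrypted_bytes.decode("iso-8859-1")   # mismatched charset

# ascii() is used so the output does not depend on the terminal's encoding.
print(ascii(right))   # 'caf\xe9'      -> 4 characters, the original text
print(ascii(wrong))   # 'caf\xc3\xa9'  -> 5 characters, each byte became its own character
print(len(right), len(wrong))   # 4 5
```

Because ISO-8859-1 maps every byte to some character, this kind of mistake tends to go unnoticed until non-ASCII text appears, which is why the charset is best recorded alongside the ciphertext or fixed by convention.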
    

Summary

  1. Encryption transforms bytes, not characters.
  2. Text must be encoded into bytes before encryption.
  3. The charset matters: different encodings produce different bytes and therefore different ciphertexts.
  4. Ciphertext is binary data and often needs to be Base64 or Hex encoded for storage or transport.
  5. Decryption reverses the process, requiring the same encoding to restore readable text.