Base64 is one of the most common “data-to-text” encodings used in software systems. You’ll see it in APIs, JSON payloads, email attachments, JWTs, data URLs, and anywhere binary data must travel through systems designed primarily for text.
This article explains what Base64 is, how it works, when to use it, and how to encode/decode it correctly in popular languages and tools.
What Base64 Is (and What It Isn’t)
Base64 is an encoding, not encryption.
It converts arbitrary bytes into a limited set of ASCII characters so the data can be safely transmitted or embedded in text-based formats.
- Encoding goal: represent binary data using printable characters.
- Security: Base64 does not hide data; anyone can decode it.
- Compression: Base64 increases size (roughly 33% overhead), so it is not a compression technique.
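The last two points can be checked directly. The following is a minimal sketch using Python's standard base64 module; the 300-byte payload is an arbitrary value chosen only for illustration.

```python
import base64

# Hypothetical payload: 300 arbitrary bytes (values chosen only for illustration).
payload = bytes(range(256)) + bytes(44)

encoded = base64.b64encode(payload)

# Anyone can reverse the encoding; no key or secret is involved.
assert base64.b64decode(encoded) == payload

# Every 3 input bytes become 4 output characters, hence the ~33% growth.
print(len(payload), len(encoded))  # 300 400
```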
Why Base64 Exists
Many protocols and storage formats historically assumed text content (and sometimes only ASCII). Raw bytes can break these systems due to:
- reserved control characters (e.g., newline, null byte)
- encoding/charset conversions (UTF-8 vs legacy encodings)
- transport layers that are not “binary clean”
Base64 solves this by producing text made only from characters that survive these paths reliably.
How Base64 Works Internally
Base64 processes bytes (8-bit values) and turns them into characters by grouping bits:
- Take input bytes and concatenate their bits.
- Split into chunks of 6 bits.
- Map each 6-bit value (0–63) to a character from the Base64 alphabet.
Standard Base64 Alphabet
- A–Z → 0–25
- a–z → 26–51
- 0–9 → 52–61
- + → 62
- / → 63
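To make the bit grouping concrete, here is a rough from-scratch sketch in Python. The function name encode_full_groups is hypothetical, and the sketch deliberately handles only inputs whose length is a multiple of 3, so no padding is involved. In real code you would use the standard library instead.

```python
import base64
import string

# Standard alphabet: A-Z, a-z, 0-9, then '+' and '/'.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_full_groups(data: bytes) -> str:
    """Encode input whose length is a multiple of 3 (padding is not handled)."""
    if len(data) % 3 != 0:
        raise ValueError("this sketch only handles complete 3-byte groups")
    out = []
    for i in range(0, len(data), 3):
        # Concatenate three bytes into one 24-bit integer.
        group = int.from_bytes(data[i:i + 3], "big")
        # Read the four 6-bit chunks, most significant first.
        for shift in (18, 12, 6, 0):
            out.append(ALPHABET[(group >> shift) & 0b111111])
    return "".join(out)


# 'M', 'a', 'n' -> 6-bit values 19, 22, 5, 46 -> "TWFu"
print(encode_full_groups(b"Man"))  # TWFu

# Cross-check against the standard library.
assert encode_full_groups(b"Man") == base64.b64encode(b"Man").decode("ascii")
```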
Padding With =
Because Base64 encodes 3 input bytes (24 bits) into 4 output characters (4 × 6 bits = 24 bits), inputs not divisible by 3 need padding:
- 1 leftover byte → output ends with ==
- 2 leftover bytes → output ends with =
Padding ensures the output length is a multiple of 4 and allows the decoder to restore the original byte length.
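The padding rule is easy to observe with short inputs. This sketch uses Python's base64 module; encoded_length is a hypothetical helper name for the padded output length.

```python
import base64

# 3, 2, and 1 input bytes: zero, one, and two '=' characters respectively.
for data in (b"abc", b"ab", b"a"):
    print(len(data), base64.b64encode(data).decode("ascii"))
# 3 YWJj
# 2 YWI=
# 1 YQ==


def encoded_length(n: int) -> int:
    """Length of padded Base64 output for n input bytes: 4 * ceil(n / 3)."""
    return 4 * ((n + 2) // 3)


print(encoded_length(5))  # 8, matching len("aGVsbG8=")
```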
Common Variants: Standard vs URL-Safe Base64
When Base64 appears in URLs or filenames, + and / can cause trouble. URL-safe Base64 replaces them:
- + → -
- / → _
- padding = may be omitted depending on the convention
This variant is often used for JWT segments and URL parameters.
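A short Python sketch of the difference, using two bytes picked specifically because their 6-bit groups land on alphabet indices 62 and 63:

```python
import base64

# Two bytes chosen so the output uses alphabet indices 62 and 63.
data = b"\xfb\xff"

standard = base64.b64encode(data).decode("ascii")
url_safe = base64.urlsafe_b64encode(data).decode("ascii")

print(standard)  # +/8=
print(url_safe)  # -_8=

# Conventions that omit padding simply strip the trailing '='.
print(url_safe.rstrip("="))  # -_8
```

Note that Python's urlsafe_b64encode keeps the padding, so stripping it is a separate step when the receiving side expects unpadded input.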
A Quick Example
The ASCII text:
hello
Base64-encodes to:
aGVsbG8=
Decoding aGVsbG8= returns the original bytes representing hello.
Encoding and Decoding Base64 in Popular Languages
Python
    import base64

    # Encode text -> Base64 string
    text = "hello"
    encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
    print(encoded)  # aGVsbG8=

    # Decode Base64 string -> text
    decoded_bytes = base64.b64decode(encoded)
    decoded_text = decoded_bytes.decode("utf-8")
    print(decoded_text)  # hello

    # URL-safe variant
    url_encoded = base64.urlsafe_b64encode(text.encode("utf-8")).decode("ascii")
    print(url_encoded)
    url_decoded = base64.urlsafe_b64decode(url_encoded).decode("utf-8")
    print(url_decoded)
Key point: Base64 operates on bytes, so always encode/decode text using a consistent charset like UTF-8.
JavaScript (Browser and Node.js)
Browser (Web APIs)
    // Encoding/decoding ASCII only (btoa/atob are limited)
    const encoded = btoa("hello");
    console.log(encoded); // aGVsbG8=

    const decoded = atob(encoded);
    console.log(decoded); // hello
For Unicode text (e.g., Korean, emoji), use TextEncoder / TextDecoder:
    const text = "hello 🌍";
    const bytes = new TextEncoder().encode(text);

    // bytes -> Base64
    let binary = "";
    bytes.forEach(b => (binary += String.fromCharCode(b)));
    const encoded = btoa(binary);
    console.log(encoded);

    // Base64 -> bytes -> text
    const raw = atob(encoded);
    const outBytes = new Uint8Array([...raw].map(c => c.charCodeAt(0)));
    const decoded = new TextDecoder().decode(outBytes);
    console.log(decoded); // hello 🌍
Node.js
    const text = "hello 🌍";

    // Encode
    const encoded = Buffer.from(text, "utf8").toString("base64");
    console.log(encoded);

    // Decode
    const decoded = Buffer.from(encoded, "base64").toString("utf8");
    console.log(decoded);
PHP
    <?php
    $text = "hello";

    // Encode
    $encoded = base64_encode($text);
    echo $encoded . PHP_EOL; // aGVsbG8=

    // Decode
    $decoded = base64_decode($encoded, true);
    echo $decoded . PHP_EOL; // hello
    ?>
Use the second parameter true in base64_decode to enable strict decoding (invalid characters cause failure).
Java
    import java.nio.charset.StandardCharsets;
    import java.util.Base64;

    public class Main {
        public static void main(String[] args) {
            String text = "hello 🌍";

            // Encode
            String encoded = Base64.getEncoder()
                    .encodeToString(text.getBytes(StandardCharsets.UTF_8));
            System.out.println(encoded);

            // Decode
            byte[] decodedBytes = Base64.getDecoder().decode(encoded);
            String decoded = new String(decodedBytes, StandardCharsets.UTF_8);
            System.out.println(decoded);

            // URL-safe
            String urlEncoded = Base64.getUrlEncoder()
                    .withoutPadding()
                    .encodeToString(text.getBytes(StandardCharsets.UTF_8));
            System.out.println(urlEncoded);
        }
    }
Go
    package main

    import (
        "encoding/base64"
        "fmt"
    )

    func main() {
        text := "hello 🌍"

        encoded := base64.StdEncoding.EncodeToString([]byte(text))
        fmt.Println(encoded)

        decodedBytes, err := base64.StdEncoding.DecodeString(encoded)
        if err != nil {
            panic(err)
        }
        fmt.Println(string(decodedBytes))

        urlEncoded := base64.RawURLEncoding.EncodeToString([]byte(text)) // no padding
        fmt.Println(urlEncoded)
    }
Command-Line Base64 (Linux/macOS)
Encode a file:
base64 input.bin > output.txt
Decode:
base64 -d output.txt > restored.bin
Encode a string (note: echo -n avoids adding a newline):
echo -n "hello" | base64
Decode:
echo "aGVsbG8=" | base64 -d
Command flags vary slightly across operating systems and versions; if -d doesn’t work, try --decode (older macOS releases use -D).
Base64 in Real-World Scenarios
1) Embedding Images in HTML/CSS (Data URLs)
<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA..." alt="Inline image">
This eliminates separate image requests but increases HTML size and can reduce caching efficiency.
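Building such a URL programmatically is mostly string concatenation. The sketch below assumes a hypothetical helper named to_data_url and, to stay self-contained, encodes only the 8-byte PNG file signature rather than a real image; in practice the bytes would come from reading the image file in binary mode.

```python
import base64


def to_data_url(data: bytes, mime_type: str) -> str:
    """Build a data URL that embeds the given bytes as Base64."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Only the PNG file signature, standing in for real image bytes (not a valid image).
png_signature = b"\x89PNG\r\n\x1a\n"

print(to_data_url(png_signature, "image/png"))
# data:image/png;base64,iVBORw0KGgo=
```

This is also why Base64-encoded PNG data always begins with iVBORw0KGgo.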
2) JSON APIs Carrying Binary Data
Because JSON is text, binary payloads (certificates, small files, signatures) are often Base64-encoded to fit in a JSON string.
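A minimal round trip might look like the following Python sketch. The field names filename and content_b64 and the sample bytes are assumptions made for illustration, not part of any particular API.

```python
import base64
import json

# Hypothetical binary payload and field names, for illustration only.
signature = bytes([0x00, 0xFF, 0x10, 0x80, 0x7F])

# json.dumps cannot serialize bytes directly, so encode them as a Base64 string first.
message = json.dumps({
    "filename": "sig.bin",
    "content_b64": base64.b64encode(signature).decode("ascii"),
})

# The receiver parses the JSON and decodes the field back into bytes.
received = json.loads(message)
restored = base64.b64decode(received["content_b64"])
assert restored == signature
```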
3) Tokens and Claims (JWT)
JWT uses Base64URL encoding for its header and payload segments. Padding is typically omitted, and the URL-safe alphabet is used.
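The sketch below shows how such segments can be produced and read back in Python. The helper names b64url_encode and b64url_decode, the sample claims, and the signature placeholder are all illustrative assumptions. It only inspects the segments and does not verify a signature, so it is not a substitute for a JWT library.

```python
import base64
import json


def b64url_encode(data: bytes) -> str:
    """URL-safe Base64 with the trailing '=' padding removed."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    """Restore the padding that was stripped, then decode."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


# Illustrative header and payload; real tokens carry application-specific claims.
header = {"alg": "HS256", "typ": "JWT"}
payload = {"sub": "1234567890"}

segments = [
    b64url_encode(json.dumps(obj, separators=(",", ":")).encode("utf-8"))
    for obj in (header, payload)
]

# Placeholder where the real signature segment would go.
token = ".".join(segments + ["signature-placeholder"])

header_seg, payload_seg, _ = token.split(".")
print(json.loads(b64url_decode(header_seg)))   # {'alg': 'HS256', 'typ': 'JWT'}
print(json.loads(b64url_decode(payload_seg)))  # {'sub': '1234567890'}
```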
Common Pitfalls and How to Avoid Them
Treating Base64 as Encryption
If you need confidentiality, use real cryptography (e.g., AES-GCM) and then Base64-encode the ciphertext if you need a text representation.
Confusing Text With Bytes
Base64 encodes bytes. If you start with text, choose a charset (usually UTF-8) and stick to it on both sides.
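A small Python sketch of what goes wrong when the two sides disagree: the same character produces different bytes, and therefore different Base64 output, under different charsets.

```python
import base64

text = "é"

# Same text, different charsets, different bytes, different Base64 output.
utf8_encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
latin1_encoded = base64.b64encode(text.encode("latin-1")).decode("ascii")

print(utf8_encoded)    # w6k=
print(latin1_encoded)  # 6Q==

# Base64 decoding succeeds either way; the damage appears when bytes become text.
raw = base64.b64decode(utf8_encoded)
print(raw.decode("utf-8"))    # é
print(raw.decode("latin-1"))  # Ã©
```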
Newlines and Whitespace in Encoded Output
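Some encoders insert line breaks (common in email/MIME contexts, where output is wrapped at 76 characters per line). Lenient decoders ignore this whitespace, but strict decoders reject it, so strip line breaks before decoding when you are unsure which kind you are dealing with.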
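Python's standard library happens to show both behaviors, which makes for a compact illustration. In this sketch, base64.encodebytes produces MIME-style wrapped output, and the validate flag of base64.b64decode switches between lenient and strict decoding.

```python
import base64
import binascii

original = bytes(100)

# encodebytes wraps its output MIME-style, adding a newline after each line.
wrapped = base64.encodebytes(original).decode("ascii")

# Lenient (default) decoding discards characters outside the alphabet, newlines included.
assert base64.b64decode(wrapped) == original

# Strict decoding rejects the same input.
try:
    base64.b64decode(wrapped, validate=True)
except binascii.Error as exc:
    print("strict decode failed:", exc)

# Removing all whitespace first makes the input acceptable to a strict decoder.
compact = "".join(wrapped.split())
assert base64.b64decode(compact, validate=True) == original
```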
Padding Differences
Some systems omit = padding, especially in URL contexts. If decoding fails, verify whether you’re dealing with standard Base64 or Base64URL and whether padding is expected.
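When the variant genuinely cannot be known in advance, one option is to normalize the input before decoding. The following Python sketch uses a hypothetical helper, decode_any_variant, that accepts either alphabet with or without padding. Being this lenient is a design choice rather than a recommendation: where the format is specified, decoding exactly that format gives better error detection.

```python
import base64


def decode_any_variant(value: str) -> bytes:
    """Decode standard or URL-safe Base64, with or without '=' padding."""
    # Map the URL-safe characters back onto the standard alphabet.
    normalized = value.strip().replace("-", "+").replace("_", "/")
    # Restore padding so the length is a multiple of 4.
    normalized += "=" * (-len(normalized) % 4)
    # validate=True raises binascii.Error on any remaining invalid character.
    return base64.b64decode(normalized, validate=True)


print(decode_any_variant("aGVsbG8="))  # b'hello'
print(decode_any_variant("aGVsbG8"))   # b'hello'
print(decode_any_variant("-_8"))       # b'\xfb\xff'
print(decode_any_variant("+/8="))      # b'\xfb\xff'
```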
Using btoa/atob for Unicode in Browsers
These functions are not Unicode-safe. For non-ASCII data, use TextEncoder/TextDecoder or a robust library.
Quick Checklist for Correct Base64 Handling
- Decide whether you need standard or URL-safe Base64.
- Always convert text to bytes using UTF-8 before encoding.
- Expect about 33% size increase compared to the original bytes.
- Don’t rely on Base64 for security; it is reversible.
- When interoperability matters, confirm:
  - padding behavior (= included or omitted)
  - line wrapping (none vs fixed width)
  - allowed characters (standard vs URL-safe)
Summary
Base64 is a reliable way to represent binary data as text, making it ideal for transporting bytes through systems that expect textual content. Understanding its bit-level structure, padding rules, and URL-safe variant helps prevent subtle bugs—especially across different languages, platforms, and libraries.