How to Encode and Decode Base64

Base64 is one of the most common “data-to-text” encodings used in software systems. You’ll see it in APIs, JSON payloads, email attachments, JWTs, data URLs, and anywhere binary data must travel through systems designed primarily for text.

This article explains what Base64 is, how it works, when to use it, and how to encode/decode it correctly in popular languages and tools.

What Base64 Is (and What It Isn’t)

Base64 is an encoding, not encryption.
It converts arbitrary bytes into a limited set of ASCII characters so the data can be safely transmitted or embedded in text-based formats.

Encoding goal: represent binary data using printable characters.
Security: Base64 does not hide data; anyone can decode it.
Compression: Base64 increases size by roughly 33% (4 output characters for every 3 input bytes), so it is not a compression technique.
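The overhead figure can be checked with a few lines of Python. This is a minimal sketch: the helper name `encoded_length` is made up for illustration, and it assumes padded output with no line breaks.

```python
import base64


def encoded_length(n_bytes: int) -> int:
    """Length of padded standard Base64 output for n_bytes of input.

    Hypothetical helper: every started group of 3 input bytes
    produces exactly 4 output characters.
    """
    return 4 * ((n_bytes + 2) // 3)


for n in (1, 2, 3, 100, 1000):
    actual = len(base64.b64encode(bytes(n)))
    print(n, encoded_length(n), actual)
```

For 1000 input bytes both columns show 1336, which is about 33.6% more than the original.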

Why Base64 Exists

Many protocols and storage formats historically assumed text content (and sometimes only ASCII). Raw bytes can break these systems due to:

reserved control characters (e.g., newline, null byte)
encoding/charset conversions (UTF-8 vs legacy encodings)
transport layers that are not “binary clean”

Base64 solves this by producing text made only from characters that survive these paths reliably.

How Base64 Works Internally

Base64 processes bytes (8-bit values) and turns them into characters by grouping bits:

Take input bytes and concatenate their bits.
Split into chunks of 6 bits.
Map each 6-bit value (0–63) to a character from the Base64 alphabet.

Standard Base64 Alphabet

A–Z → 0–25
a–z → 26–51
0–9 → 52–61
+ → 62
/ → 63
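The grouping and lookup steps can be sketched in Python for a single complete 3-byte group. The names `ALPHABET` and `encode_group` are illustrative, not part of any library, and inputs shorter than 3 bytes are deliberately out of scope here.

```python
import string

# Standard Base64 alphabet in index order: A-Z, a-z, 0-9, '+', '/'
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_group(three_bytes: bytes) -> str:
    """Encode exactly 3 bytes (24 bits) into 4 Base64 characters."""
    if len(three_bytes) != 3:
        raise ValueError("this sketch only handles complete 3-byte groups")
    # Concatenate the three bytes into one 24-bit integer.
    bits = int.from_bytes(three_bytes, "big")
    # Slice out four 6-bit values, most significant first.
    indices = [(bits >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    # Map each 6-bit value to its alphabet character.
    return "".join(ALPHABET[i] for i in indices)


print(encode_group(b"Man"))  # TWFu
```

The bytes of "Man" are 01001101 01100001 01101110, which regroup into 010011 010110 000101 101110, that is 19, 22, 5, 46, or T, W, F, u.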

Padding With `=`

Because Base64 encodes 3 input bytes (24 bits) into 4 output characters (4 × 6 bits = 24 bits), inputs not divisible by 3 need padding:

1 leftover byte → output ends with ==
2 leftover bytes → output ends with =

Padding ensures the output length is a multiple of 4 and allows the decoder to restore the original byte length.
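To make the padding rule concrete, here is a hand-rolled encoder sketch compared against Python's standard library. The function name `b64encode_manual` is hypothetical, and real code should simply call `base64.b64encode`.

```python
import base64
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def b64encode_manual(data: bytes) -> str:
    """Illustrative encoder: 3 bytes in, 4 characters out, '=' for missing bytes."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        missing = 3 - len(chunk)  # 0, 1, or 2 bytes short in the final group
        # Zero-fill the group so it is always 24 bits wide.
        bits = int.from_bytes(chunk + b"\x00" * missing, "big")
        chars = [ALPHABET[(bits >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # Characters that carry no input bits are replaced by '='.
        if missing:
            chars[-missing:] = "=" * missing
        out.append("".join(chars))
    return "".join(out)


for sample in (b"h", b"he", b"hel", b"hello"):
    print(sample, b64encode_manual(sample), base64.b64encode(sample).decode("ascii"))
```

A decoder can count the trailing `=` characters to know how many bytes of the final group to discard.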

Common Variants: Standard vs URL-Safe Base64

When Base64 appears in URLs or filenames, + and / can cause trouble: + is often decoded as a space in query strings, and / is a path separator. URL-safe Base64 replaces them:

+ → -
/ → _
padding = may be omitted depending on the convention

This variant is often used for JWT segments and URL parameters.
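The difference between the two alphabets is easiest to see with input that happens to produce both special characters. The two bytes below are chosen purely for illustration.

```python
import base64

data = b"\xfb\xff"  # sample bytes whose encoding contains index 62 and index 63

standard = base64.b64encode(data).decode("ascii")
url_safe = base64.urlsafe_b64encode(data).decode("ascii")

print(standard)  # +/8=
print(url_safe)  # -_8=

# Python's URL-safe encoder keeps the padding; stripping it is a separate,
# convention-dependent step.
unpadded = url_safe.rstrip("=")
print(unpadded)  # -_8
```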

A Quick Example

The ASCII text:

hello

Base64-encodes to:

aGVsbG8=

Decoding aGVsbG8= returns the original bytes representing hello.

Encoding and Decoding Base64 in Popular Languages

Python

			
import base64
# Encode text -> Base64 string
text = "hello"
encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
print(encoded)  # aGVsbG8=
# Decode Base64 string -> text
decoded_bytes = base64.b64decode(encoded)
decoded_text = decoded_bytes.decode("utf-8")
print(decoded_text)  # hello
# URL-safe variant
url_encoded = base64.urlsafe_b64encode(text.encode("utf-8")).decode("ascii")
print(url_encoded)
url_decoded = base64.urlsafe_b64decode(url_encoded).decode("utf-8")
print(url_decoded)

		

Key point: Base64 operates on bytes, so always encode/decode text using a consistent charset like UTF-8.

JavaScript (Browser and Node.js)

Browser (Web APIs)

			
// btoa/atob work on "binary strings": each character must be in the range U+0000–U+00FF
const encoded = btoa("hello");
console.log(encoded); // aGVsbG8=
const decoded = atob(encoded);
console.log(decoded); // hello

		

For Unicode text (e.g., Korean, emoji), use TextEncoder / TextDecoder:

			
const text = "hello 🌍";
const bytes = new TextEncoder().encode(text);
// bytes -> Base64
let binary = "";
bytes.forEach(b => (binary += String.fromCharCode(b)));
const encoded = btoa(binary);
console.log(encoded);
// Base64 -> bytes -> text
const raw = atob(encoded);
const outBytes = new Uint8Array([...raw].map(c => c.charCodeAt(0)));
const decoded = new TextDecoder().decode(outBytes);
console.log(decoded); // hello 🌍

		

Node.js

			
const text = "hello 🌍";
// Encode
const encoded = Buffer.from(text, "utf8").toString("base64");
console.log(encoded);
// Decode
const decoded = Buffer.from(encoded, "base64").toString("utf8");
console.log(decoded);

		

PHP

			
<?php
$text = "hello";
// Encode
$encoded = base64_encode($text);
echo $encoded . PHP_EOL; // aGVsbG8=
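// Decode (strict mode returns false on invalid input)
$decoded = base64_decode($encoded, true);
if ($decoded === false) {
    fwrite(STDERR, "Invalid Base64 input" . PHP_EOL);
    exit(1);
}
echo $decoded . PHP_EOL; // hello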
?>

		

Pass true as the second argument to base64_decode to enable strict decoding: the function then returns false if the input contains characters outside the Base64 alphabet, instead of silently discarding them.

Java

			
import java.nio.charset.StandardCharsets;
import java.util.Base64;
public class Main {
    public static void main(String[] args) {
        String text = "hello 🌍";
        // Encode
        String encoded = Base64.getEncoder()
                .encodeToString(text.getBytes(StandardCharsets.UTF_8));
        System.out.println(encoded);
        // Decode
        byte[] decodedBytes = Base64.getDecoder().decode(encoded);
        String decoded = new String(decodedBytes, StandardCharsets.UTF_8);
        System.out.println(decoded);
        // URL-safe
        String urlEncoded = Base64.getUrlEncoder()
                .withoutPadding()
                .encodeToString(text.getBytes(StandardCharsets.UTF_8));
        System.out.println(urlEncoded);
    }
}

		

Go

			
package main
import (
	"encoding/base64"
	"fmt"
)
func main() {
	text := "hello 🌍"
	encoded := base64.StdEncoding.EncodeToString([]byte(text))
	fmt.Println(encoded)
	decodedBytes, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(decodedBytes))
	urlEncoded := base64.RawURLEncoding.EncodeToString([]byte(text)) // no padding
	fmt.Println(urlEncoded)
}

		

Command-Line Base64 (Linux/macOS)

Encode a file:

base64 input.bin > output.txt

Decode:

base64 -d output.txt > restored.bin

Encode a string (note: echo -n avoids adding a newline):

echo -n "hello" | base64

Decode:

echo "aGVsbG8=" | base64 -d

Flags vary between implementations: if -d doesn’t work, try --decode (older macOS versions use -D).

Base64 in Real-World Scenarios

1) Embedding Images in HTML/CSS (Data URLs)

			
<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA..." alt="Inline image">

This eliminates separate image requests but increases HTML size and can reduce caching efficiency.
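As a rough sketch of how such a data URL could be produced, the hypothetical helper `to_data_url` below wraps arbitrary bytes and a MIME type. The sample payload is plain text so the result stays short; for an image you would pass the file's bytes and a type such as `image/png`.

```python
import base64


def to_data_url(data: bytes, mime_type: str) -> str:
    """Build a data URL from raw bytes (hypothetical helper for illustration)."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{payload}"


print(to_data_url(b"hello", "text/plain"))  # data:text/plain;base64,aGVsbG8=
```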

2) JSON APIs Carrying Binary Data

Because JSON is text, binary payloads (certificates, small files, signatures) are often Base64-encoded to fit in a JSON string.
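A minimal round trip might look like the following. The field names `name` and `data` and the four sample bytes are assumptions made for the example, not part of any particular API.

```python
import base64
import json

signature = bytes([0x00, 0xFF, 0x10, 0x80])  # placeholder binary payload

# Sender: bytes -> Base64 text -> JSON string field
message = json.dumps({
    "name": "sig.bin",
    "data": base64.b64encode(signature).decode("ascii"),
})
print(message)

# Receiver: JSON string field -> Base64 text -> bytes
received = json.loads(message)
restored = base64.b64decode(received["data"], validate=True)
print(restored == signature)  # True
```

Passing `validate=True` makes the receiver reject stray characters instead of silently skipping them.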

3) Tokens and Claims (JWT)

JWT uses Base64URL encoding for its header, payload, and signature segments. The URL-safe alphabet is used and the trailing = padding is omitted.
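The sketch below shows how one JSON segment could be encoded and decoded in that style. The helper names `encode_segment` and `decode_segment` are illustrative, and the code only handles the encoding: it does not sign or verify anything, so it is no substitute for a JWT library.

```python
import base64
import json


def encode_segment(obj: dict) -> str:
    """Serialize to compact JSON, then Base64URL-encode without padding."""
    raw = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_segment(segment: str) -> dict:
    """Restore the stripped '=' padding, then decode and parse the JSON."""
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


header = {"alg": "HS256", "typ": "JWT"}
segment = encode_segment(header)
print(segment)                  # eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9
print(decode_segment(segment))  # {'alg': 'HS256', 'typ': 'JWT'}
```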

Common Pitfalls and How to Avoid Them

Treating Base64 as Encryption

If you need confidentiality, use real cryptography (e.g., AES-GCM) and then Base64-encode the ciphertext if you need a text representation.

Confusing Text With Bytes

Base64 encodes bytes. If you start with text, choose a charset (usually UTF-8) and stick to it on both sides.
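A short illustration of why the charset matters: the same one-character string yields different Base64 output under different encodings, and decoding with the wrong charset silently produces garbage. The sample character is chosen arbitrarily.

```python
import base64

text = "é"

utf8_b64 = base64.b64encode(text.encode("utf-8")).decode("ascii")
latin1_b64 = base64.b64encode(text.encode("latin-1")).decode("ascii")
print(utf8_b64)    # w6k=
print(latin1_b64)  # 6Q==

# The Base64 step succeeds, but interpreting the bytes with the wrong
# charset gives the wrong text.
wrong = base64.b64decode(utf8_b64).decode("latin-1")
print(wrong)  # Ã©
```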

Newlines and Whitespace in Encoded Output

Some encoders insert line breaks (MIME, used for email, wraps output at 76 characters per line). Many decoders ignore whitespace, but strict decoders may reject it.
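Python's standard library happens to offer both behaviors, which makes the difference easy to observe. This sketch uses 100 arbitrary bytes so the wrapped output spans two lines.

```python
import base64
import binascii

data = bytes(range(100))

wrapped = base64.encodebytes(data)    # MIME-style: newline after every 76 characters
single_line = base64.b64encode(data)  # no line breaks
print(wrapped.count(b"\n"), single_line.count(b"\n"))  # 2 0

# Lenient decoding (the default) discards characters outside the alphabet.
lenient_ok = base64.b64decode(wrapped) == data

# Strict decoding rejects the embedded newlines.
try:
    base64.b64decode(wrapped, validate=True)
    strict_rejected = False
except binascii.Error:
    strict_rejected = True

# Removing whitespace first makes the input acceptable to a strict decoder.
cleaned = b"".join(wrapped.split())
strict_ok = base64.b64decode(cleaned, validate=True) == data

print(lenient_ok, strict_rejected, strict_ok)  # True True True
```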

Padding Differences

Some systems omit = padding, especially in URL contexts. If decoding fails, verify whether you’re dealing with standard Base64 or Base64URL and whether padding is expected.
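When the variant of incoming data is unknown, one possible approach is to normalize it before decoding. The helper `decode_any` below is a hypothetical sketch: it accepts either alphabet, with or without padding, and still rejects input that is invalid after normalization.

```python
import base64


def decode_any(value: str) -> bytes:
    """Decode standard or URL-safe Base64, padded or unpadded (illustrative)."""
    # Map the URL-safe characters back to the standard alphabet.
    normalized = value.strip().replace("-", "+").replace("_", "/")
    # Restore any '=' padding that was stripped.
    normalized += "=" * (-len(normalized) % 4)
    # validate=True raises binascii.Error on characters outside the alphabet.
    return base64.b64decode(normalized, validate=True)


print(decode_any("aGVsbG8="))  # b'hello'
print(decode_any("aGVsbG8"))   # b'hello'
print(decode_any("-_8"))       # b'\xfb\xff'
```

Being this permissive is a design choice: if a protocol specifies one exact variant, a decoder that accepts only that variant catches more mistakes.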

Using `btoa/atob` for Unicode in Browsers

These functions are not Unicode-safe: btoa throws an error for any character above U+00FF. For such text, convert it to UTF-8 bytes with TextEncoder/TextDecoder or use a robust library.

Quick Checklist for Correct Base64 Handling

Decide whether you need standard or URL-safe Base64.
Always convert text to bytes using UTF-8 before encoding.
Expect about 33% size increase compared to the original bytes.
Don’t rely on Base64 for security; it is reversible.
When interoperability matters, confirm:
- padding behavior (= included or omitted)
- line wrapping (none vs fixed width)
- allowed characters (standard vs URL-safe)

Summary

Base64 is a reliable way to represent binary data as text, making it ideal for transporting bytes through systems that expect textual content. Understanding its bit-level structure, padding rules, and URL-safe variant helps prevent subtle bugs—especially across different languages, platforms, and libraries.
