Toolpile

The three encodings every web dev should know cold: Base64, URL-encoding, JWT

What each one is for, when each one is wrong, and the bug stories that prove it.

8 min read · By Umur Yavuz

These three show up in the same paragraph in tutorials, on the same shelf in dev tools, and in the same blog post titled "encryption explained." None of them is encryption. All of them are encodings — schemes for shoving data through a channel that can't carry it natively. Mix them up and you ship a bug that will eat half a sprint.

Here's what each one is, when each one is the right answer, and the specific bug each one causes when used wrong.

Base64: getting bytes through a text-only channel

Base64 takes any sequence of bytes and turns it into a string drawn from 64 safe characters: A-Z, a-z, 0-9, + and / (plus = for padding). Every 3 bytes become 4 characters, so the output is about 4/3 the size of the input, rounded up to a multiple of 4 by the padding. That's the cost of making binary data survive systems that were designed for ASCII text.
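
To make the 3-bytes-to-4-characters ratio concrete, here is a minimal sketch using Node's built-in Buffer. The helper names toBase64 and encodedLength are made up for this example, not part of any library.

```javascript
// Encode a list of byte values as standard (padded) Base64.
function toBase64(bytes) {
  return Buffer.from(bytes).toString("base64");
}

// Length of padded Base64 output for a given number of input bytes.
function encodedLength(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

console.log(toBase64([0x4d, 0x61, 0x6e])); // TWFu  (3 bytes -> 4 chars, no padding)
console.log(toBase64([0x4d, 0x61]));       // TWE=  (2 bytes -> 4 chars, one =)
console.log(toBase64([0x4d]));             // TQ==  (1 byte  -> 4 chars, two =)

console.log(encodedLength(10)); // 16
```

The padding is why the ratio is only approximately 4/3 for short inputs: a single byte still costs four characters.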

Use it when you genuinely need binary in a text channel. Email attachments (MIME), inline images in HTML or CSS (data URIs), JSON fields that have to carry a small image or signature, OAuth client credentials in an Authorization header. Those are the textbook cases and they're real.
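
As one example of the header case, here is a rough sketch of building an HTTP Basic Authorization value in Node. The client ID and secret are placeholders invented for the example, and basicAuthHeader is a hypothetical helper. The encoding only gets the colon-joined pair through a text header; it does nothing to protect it, so this still needs TLS underneath.

```javascript
// Hypothetical credentials, for illustration only.
const clientId = "my-client-id";
const clientSecret = "my-client-secret";

// Join id and secret with a colon, Base64 the bytes, prefix with "Basic ".
function basicAuthHeader(id, secret) {
  const encoded = Buffer.from(`${id}:${secret}`, "utf8").toString("base64");
  return `Basic ${encoded}`;
}

const authorization = basicAuthHeader(clientId, clientSecret);
console.log(authorization.startsWith("Basic ")); // true
```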

Don't use it as a security measure. "We Base64 the password before sending it" is one of the most common comments in pull request reviews and one of the most wrong. Base64 is reversible by every developer on earth in two seconds. It is not encryption. It is not even obfuscation. It is a transport encoding.
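
The "two seconds" claim is easy to check. A minimal sketch, with a made-up password:

```javascript
// What the client "protects" before sending.
const sent = Buffer.from("hunter2", "utf8").toString("base64");
console.log(sent); // aHVudGVyMg==

// What anyone who sees the request can do, with no key and no secret.
const recovered = Buffer.from(sent, "base64").toString("utf8");
console.log(recovered); // hunter2
```

No key is involved at any point, which is the whole difference between an encoding and encryption.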

There's also Base64URL, which swaps + for - and / for _ so the output is safe to drop into a URL or filename. JWTs use it (with the = padding dropped), and so do many modern APIs. Standard Base64 is fragile in a URL because + means "space" in form-encoding and / can break path matching. If you're putting Base64 anywhere near a URL, use the URL-safe variant.
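
Here is a small sketch of the difference, using three bytes chosen specifically because they encode to nothing but + and /. It assumes a Node version that supports the "base64url" Buffer encoding; toBase64Url is a hypothetical fallback for runtimes that don't, and it also strips padding the way JWTs do.

```javascript
// Three bytes whose Base64 form is made entirely of the two problem characters.
const bytes = Buffer.from([0xfb, 0xef, 0xff]);

const standard = bytes.toString("base64");
const urlSafe = bytes.toString("base64url");

console.log(standard); // ++//
console.log(urlSafe);  // --__

// Manual conversion: swap the two characters and drop trailing padding.
function toBase64Url(standardBase64) {
  return standardBase64
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");
}

console.log(toBase64Url(standard) === urlSafe); // true
```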

Tool · Base64
Encode and decode in both standard and URL-safe variants.
Tool · Image to Base64
For when you need a data URI for an inline image — outputs the full data:image/png;base64,... string.

URL-encoding: making strings safe for URLs

Also called percent-encoding. Takes characters that have special meaning in a URL — space, &, ?, =, /, #, %, and anything outside ASCII — and replaces them with a % followed by the byte's hex representation. A space becomes %20. An ampersand becomes %26. "café" becomes "caf%C3%A9" because é is two UTF-8 bytes.
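
The examples above can be reproduced directly in JavaScript. The sketch below also spells out where %C3%A9 comes from by percent-encoding the UTF-8 bytes by hand; percentEncodeBytes is a made-up helper for illustration, not something you'd use in place of the built-in.

```javascript
console.log(encodeURIComponent(" "));         // %20
console.log(encodeURIComponent("&"));         // %26
console.log(encodeURIComponent("caf\u00e9")); // caf%C3%A9  ("\u00e9" is é)

// Percent-encode every byte: "%" plus two uppercase hex digits.
function percentEncodeBytes(bytes) {
  return Array.from(bytes, (b) => "%" + b.toString(16).toUpperCase().padStart(2, "0")).join("");
}

// é is U+00E9, which UTF-8 stores as the two bytes 0xC3 0xA9.
const utf8Bytes = new TextEncoder().encode("\u00e9");
console.log(utf8Bytes.length);              // 2
console.log(percentEncodeBytes(utf8Bytes)); // %C3%A9
```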

Use it on every value you put into a query string or a path segment that wasn't already produced by your URL builder. Hand-built URLs are where this goes wrong constantly.

The classic bug: a search query containing & gets concatenated into a URL without encoding. A user searches for "cats & dogs". The unencoded URL is /search?q=cats & dogs. The server treats the & as a parameter separator, so everything after it becomes a second query parameter (" dogs", with an empty value), and your search box shows results for "cats " with a trailing space. The fix is one function call (encodeURIComponent in JavaScript) but only if you remember to make it.
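
The bug can be reproduced without a server. This sketch uses the standard URLSearchParams parser as a stand-in for whatever your backend does; parseQuery is a hypothetical helper, and a given framework may differ in details such as trimming the stray parameter name.

```javascript
const userInput = "cats & dogs";

// Hand-built URL, no encoding: the bug.
const brokenUrl = "/search?q=" + userInput;
// Same URL with the value encoded: the fix.
const fixedUrl = "/search?q=" + encodeURIComponent(userInput);

// Parse everything after the "?" the way a form-encoded parser would.
function parseQuery(url) {
  return new URLSearchParams(url.slice(url.indexOf("?") + 1));
}

const broken = parseQuery(brokenUrl);
console.log(JSON.stringify(broken.get("q")));    // "cats "
console.log(JSON.stringify([...broken.keys()])); // ["q"," dogs"]

const fixed = parseQuery(fixedUrl);
console.log(fixedUrl);       // /search?q=cats%20%26%20dogs
console.log(fixed.get("q")); // cats & dogs
```

Letting a builder do the work avoids the problem entirely: new URLSearchParams({ q: userInput }).toString() produces q=cats+%26+dogs, which decodes to the same value.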

There are two encoders in the JS standard library and they are subtly different. encodeURI assumes you're encoding an entire URL and leaves /, :, ?, &, #, = alone. encodeURIComponent assumes you're encoding a single value to drop into a URL and encodes those characters. Use encodeURIComponent for query string values. encodeURI is almost always the wrong choice — it'll let through the very characters you needed to encode.
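
A side-by-side run shows the difference. The sample value is contrived to contain most of the characters in question.

```javascript
// A value that happens to contain URL delimiters.
const value = "a&b=c/d?e#f";

// encodeURI treats the input as a whole URL and leaves the delimiters alone.
console.log(encodeURI(value));          // a&b=c/d?e#f

// encodeURIComponent treats the input as one opaque value and escapes them.
console.log(encodeURIComponent(value)); // a%26b%3Dc%2Fd%3Fe%23f

// Only the second is safe to splice into a query string.
const url = "/search?q=" + encodeURIComponent(value);
console.log(new URLSearchParams(url.slice(url.indexOf("?") + 1)).get("q")); // a&b=c/d?e#f
```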

Tool · URL Encoder
Encode and decode percent-encoded URL components. Helpful for diagnosing double-encoding, where an already-encoded value gets encoded a second time and %20 turns into %2520: if you decode once and get back something that's still percent-encoded, you've found it.
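
The same decode-once check can be sketched in code. looksDoubleEncoded is a hypothetical helper and only a heuristic: a value that legitimately contains text like %41 after decoding would be flagged too.

```javascript
const original = "cats & dogs";
const once = encodeURIComponent(original);
const twice = encodeURIComponent(once); // the bug: encoding an encoded value

console.log(once);  // cats%20%26%20dogs
console.log(twice); // cats%2520%2526%2520dogs

// Heuristic: decode once; if a %XX escape is still present, suspect double-encoding.
function looksDoubleEncoded(text) {
  let decoded;
  try {
    decoded = decodeURIComponent(text);
  } catch {
    return false; // not valid percent-encoding at all
  }
  return /%[0-9A-Fa-f]{2}/.test(decoded);
}

console.log(looksDoubleEncoded(once));  // false
console.log(looksDoubleEncoded(twice)); // true
```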

JWT: a signed, structured, transparent token

JSON Web Tokens are the auth token format that won. A JWT is three Base64URL-encoded parts joined with dots: a header (says which algorithm signs it), a payload (your data, as JSON), and a signature (proves the first two haven't been tampered with). The whole thing looks like xxxxx.yyyyy.zzzzz and is what your API hands clients after they log in.
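
To show how the three parts fit together, here is a minimal sketch that assembles an HS256 token by hand with Node's crypto module. It is for illustration only: signHs256, the claims, and the secret are all made up, and in real code you'd reach for a maintained JWT library rather than this.

```javascript
const nodeCrypto = require("node:crypto");

// Serialize a value as JSON, then Base64URL-encode the bytes.
function base64UrlJson(value) {
  return Buffer.from(JSON.stringify(value), "utf8").toString("base64url");
}

// Build header.payload, then sign exactly that string with HMAC-SHA256.
function signHs256(claims, secret) {
  const header = base64UrlJson({ alg: "HS256", typ: "JWT" });
  const payload = base64UrlJson(claims);
  const signingInput = `${header}.${payload}`;
  const signature = nodeCrypto
    .createHmac("sha256", secret)
    .update(signingInput)
    .digest("base64url");
  return `${signingInput}.${signature}`;
}

// Hypothetical claims and secret.
const token = signHs256({ sub: "user-123", role: "user" }, "demo-secret-not-for-production");
const parts = token.split(".");

console.log(parts.length); // 3
console.log(parts[0]);     // eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9
```

That eyJ prefix is just the Base64URL form of an opening brace and quote, which is why so many JWTs start the same way.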

The thing to internalise: the payload is not encrypted. It is encoded. Anyone who has the token can decode the middle part and read it. That includes the user, that includes whoever stole the token from the user's browser, that includes the proxy logging requests in plaintext. Do not put anything in a JWT payload that isn't safe to show to the bearer. Email, user ID, plan tier, role — fine. Password reset codes, internal database IDs that grant admin access, the user's actual password — never.
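
Reading a payload takes a split and a decode, and no key. In this sketch decodeJwtPayload is a hypothetical helper and the sample token is assembled inline with invented claims; the signature segment is a placeholder because decoding never looks at it.

```javascript
// Decode the middle segment of a JWT. No secret, no verification.
function decodeJwtPayload(token) {
  const parts = token.split(".");
  if (parts.length !== 3) {
    throw new Error("expected three dot-separated parts");
  }
  return JSON.parse(Buffer.from(parts[1], "base64url").toString("utf8"));
}

const encode = (value) => Buffer.from(JSON.stringify(value), "utf8").toString("base64url");

// Sample token with made-up claims and a junk signature.
const sampleToken = [
  encode({ alg: "HS256", typ: "JWT" }),
  encode({ sub: "user-123", email: "ada@example.test", role: "user" }),
  "signature-is-never-checked-here",
].join(".");

const claims = decodeJwtPayload(sampleToken);
console.log(claims.email); // ada@example.test
console.log(claims.role);  // user
```

Anything your server puts in that object is one function call away for whoever holds the token.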

What the signature gives you is integrity. If a malicious client decodes the payload, changes "role":"user" to "role":"admin", and re-encodes it, the signature won't match (because they don't have your signing key) and your server will reject the token. This is the entire reason to use JWTs over a random opaque token: you can verify the token without a database lookup.
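
The role-swap scenario can be sketched as follows, assuming HS256 and a made-up secret. The sign and verify helpers are hypothetical and deliberately incomplete: verify checks only the signature, not the alg header or claims like exp, both of which a real verifier must also check.

```javascript
const nodeCrypto = require("node:crypto");

const encode = (value) => Buffer.from(JSON.stringify(value), "utf8").toString("base64url");
const hmacSha256 = (input, secret) =>
  nodeCrypto.createHmac("sha256", secret).update(input).digest();

function sign(claims, secret) {
  const signingInput = `${encode({ alg: "HS256", typ: "JWT" })}.${encode(claims)}`;
  return `${signingInput}.${hmacSha256(signingInput, secret).toString("base64url")}`;
}

// Recompute the signature over header.payload and compare in constant time.
function verify(token, secret) {
  const parts = token.split(".");
  if (parts.length !== 3) return false;
  const [header, payload, signature] = parts;
  const expected = hmacSha256(`${header}.${payload}`, secret);
  const actual = Buffer.from(signature, "base64url");
  return actual.length === expected.length && nodeCrypto.timingSafeEqual(actual, expected);
}

const secret = "demo-secret-not-for-production"; // hypothetical key
const token = sign({ sub: "user-123", role: "user" }, secret);

// The attacker swaps the payload but has to reuse the old signature.
const [header, , signature] = token.split(".");
const forged = [header, encode({ sub: "user-123", role: "admin" }), signature].join(".");

console.log(verify(token, secret));  // true
console.log(verify(forged, secret)); // false
```

Note that verify touches no database: the secret and the token are enough.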

The famous JWT bug is the alg:none attack. Early JWT libraries treated the alg field in the header as the source of truth: if the token said alg:none, the library skipped signature verification. Attackers found this approximately 30 seconds after it shipped. Mainstream JWT libraries now let you pin the accepted algorithm on the verify call (and some require it), but if you're rolling your own, never trust the alg field.
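
Here is a compact sketch of the flaw and the fix. naiveVerify is intentionally vulnerable and reconstructs the general shape of the bug, not any specific library's code; pinnedVerify accepts exactly one algorithm chosen by the server. All names and the secret are invented for the example.

```javascript
const nodeCrypto = require("node:crypto");

const encodeJson = (value) => Buffer.from(JSON.stringify(value), "utf8").toString("base64url");
const decodeJson = (segment) => JSON.parse(Buffer.from(segment, "base64url").toString("utf8"));
const hmacSha256 = (input, secret) =>
  nodeCrypto.createHmac("sha256", secret).update(input).digest();

function signatureMatches(header, payload, signature, secret) {
  const expected = hmacSha256(`${header}.${payload}`, secret);
  const actual = Buffer.from(signature, "base64url");
  return actual.length === expected.length && nodeCrypto.timingSafeEqual(actual, expected);
}

// VULNERABLE ON PURPOSE: lets the token decide how it gets verified.
function naiveVerify(token, secret) {
  const [header, payload, signature] = token.split(".");
  const { alg } = decodeJson(header);
  if (alg === "none") return true; // attacker-controlled shortcut
  return signatureMatches(header, payload, signature, secret);
}

// Safer: the server decides the algorithm; the header only has to agree.
function pinnedVerify(token, secret) {
  const parts = token.split(".");
  if (parts.length !== 3) return false;
  const [header, payload, signature] = parts;
  let alg;
  try {
    ({ alg } = decodeJson(header));
  } catch {
    return false; // header is not valid JSON
  }
  if (alg !== "HS256") return false;
  return signatureMatches(header, payload, signature, secret);
}

const secret = "demo-secret-not-for-production"; // hypothetical key

// Attacker's token: alg none, admin role, empty signature, no key needed.
const forged = `${encodeJson({ alg: "none", typ: "JWT" })}.${encodeJson({ sub: "user-123", role: "admin" })}.`;

// A genuine HS256 token for comparison.
const signingInput = `${encodeJson({ alg: "HS256", typ: "JWT" })}.${encodeJson({ sub: "user-123", role: "user" })}`;
const genuine = `${signingInput}.${hmacSha256(signingInput, secret).toString("base64url")}`;

console.log(naiveVerify(forged, secret));   // true  (the bug)
console.log(pinnedVerify(forged, secret));  // false
console.log(pinnedVerify(genuine, secret)); // true
```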

Tool · JWT Decoder
Paste a JWT, see the header and payload as JSON. Useful when debugging auth — half of "why am I getting 401" turns out to be "the token is expired" and the exp claim is sitting right there.

The pattern

All three encodings exist because the channel they're crossing (text, URL, header) couldn't carry the data raw. None of them adds security. None of them needs a key to reverse. They make data fit through a pipe; they don't hide it.

If you remember one thing: encoding is for transport, hashing is for fingerprinting, encryption is for secrecy. Three different jobs, three different tools. The next article in this series is about the second one.
