Last month, I watched a senior engineer spend two days debugging a slow query. The culprit? UUIDs as primary keys on a 50M row table. The index was 2x larger than it needed to be, and range scans were painfully slow.
UUIDs have become the default choice for many teams. But defaults should be questioned.
UUIDs are 16 bytes. Integers are 4-8 bytes. That sounds trivial until you do the math.
A 100M row table with 5 foreign keys stores an ID in six columns (the primary key plus five FKs). At 8 bytes per BIGINT that's roughly 4.8 GB of key data; at 16 bytes per UUID it's 9.6 GB, and every index on those columns duplicates the keys again.
Double the storage. Double the memory pressure. Double the I/O.
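The arithmetic as a quick sanity check (table data only, before counting indexes):

```python
ROWS = 100_000_000
ID_COLUMNS = 6            # primary key + 5 foreign keys
BIGINT_BYTES = 8
UUID_BYTES = 16

bigint_total = ROWS * ID_COLUMNS * BIGINT_BYTES   # 4.8 GB of key data
uuid_total = ROWS * ID_COLUMNS * UUID_BYTES       # 9.6 GB of key data
print(bigint_total, uuid_total)  # 4800000000 9600000000
```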
But the bigger problem is index fragmentation. UUIDv4 is random, so inserts land all over the B-tree. Page splits happen constantly.
Sequential BIGINT inserts:
[1] [2] [3] [4] [5] [6] [7] [8] → append to end
B-tree stays compact. Pages fill left-to-right.
No page splits. Excellent cache locality.
Random UUIDv4 inserts:
[a7f...] [3b2...] [f91...] [0d4...] → scattered everywhere
B-tree fragments. Pages split constantly.
Random I/O. Poor cache locality.
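The two diagrams can be simulated with a sorted Python list standing in for a (very simplified) index page. Sequential keys always land at the end; random UUIDs land everywhere:

```python
import bisect
import uuid

# Sequential BIGINT-style keys: every insert position is the current end.
page = []
seq_positions = []
for key in range(1, 1001):
    seq_positions.append(bisect.bisect_left(page, key))
    page.append(key)
# Every position equals the list length at insert time: pure appends.

# Random UUIDv4 keys: insert positions scatter across the whole structure.
page = []
rand_positions = []
for _ in range(1000):
    key = uuid.uuid4().int
    rand_positions.append(bisect.bisect_left(page, key))
    bisect.insort(page, key)
print(len(set(rand_positions)))  # hundreds of distinct insert positions
```

A real B-tree amortizes this better than a flat list, but the access pattern is the point: appends stay in one hot page, scattered inserts touch cold pages and force splits.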
On our production PostgreSQL setup with millions of rows, BIGINT inserts were consistently 30-40% faster than UUIDv4. Range scans? BIGINT was dramatically faster. UUIDv7 was close to BIGINT on writes, which is the main reason it matters.
UUIDs solve specific problems:
Distributed systems — You can't coordinate auto-increment across 10 database shards. UUIDs generate anywhere without coordination.
Public APIs — Sequential IDs leak information. With GET /api/users/1000, an attacker learns you have roughly 1000 users, then tries /api/users/1001, /api/users/1002... classic IDOR vulnerability. With GET /api/users/550e8400-e29b-..., they learn nothing and can't enumerate other users.
Merge conflicts — Syncing data between databases? Sequential IDs will collide. UUIDs won't.
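The coordination-free property is easy to demonstrate. Two "shards" below generate IDs independently; with 122 random bits per UUIDv4, the odds of any overlap at this scale are around 10⁻²⁹ (a sketch, not a proof):

```python
import uuid

# Two shards generating IDs with zero coordination between them.
shard_a = {uuid.uuid4() for _ in range(10_000)}
shard_b = {uuid.uuid4() for _ in range(10_000)}

print(shard_a.isdisjoint(shard_b))  # True: no collisions
```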
If none of these apply to your use case, you probably don't need UUIDs.
Here's what actually works in production: use both.
Internal queries use the fast integer. APIs expose the secure UUID. Best of both.
UUIDv7 is time-ordered. The first 48 bits are a timestamp, so inserts are sequential like integers but globally unique like UUIDs.
UUIDv4 (550e8400-e29b-41d4-a716-446655440000): Random bits, no order. Fragments the B-tree index on every insert.
UUIDv7 (019006f3-2e47-7000-8000-000000000001): First 48 bits are Unix timestamp (ms), naturally sorted. Inserts go to the end of the index like integers.
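The layout can be verified directly on that example value with Python's stdlib uuid module; extracting the timestamp is just a 80-bit right shift:

```python
import uuid
from datetime import datetime, timezone

u = uuid.UUID("019006f3-2e47-7000-8000-000000000001")
ts_ms = u.int >> 80                                  # top 48 bits: Unix ms
when = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
print(u.version, when)  # 7, a June 2024 timestamp
```

Because the high bits are the timestamp, comparing two UUIDv7 values as raw bytes also compares their creation times, which is exactly what keeps index inserts append-only.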
Our benchmarks show UUIDv7 recovers most of the insert performance while keeping uniqueness guarantees.
Insert performance (1M rows): BIGINT is the baseline at 100%. UUIDv7 hits 92% — almost as fast. UUIDv4 drops to 68% — significant penalty.
If you need UUIDs, use v7. PostgreSQL 18 ships a native uuidv7() function. For older versions, generate in application code or use the pg_uuidv7 extension.
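For application-side generation: recent Python (3.14) added uuid.uuid7() to the stdlib, and on older versions the RFC 9562 layout is only a few lines by hand. A minimal sketch, not a production generator:

```python
import os
import time
import uuid

def uuidv7() -> uuid.UUID:
    """Build a UUIDv7: 48-bit Unix-ms timestamp, version/variant bits,
    and 74 random bits (simplified sketch of the RFC 9562 layout)."""
    ts_ms = time.time_ns() // 1_000_000
    rand_a = int.from_bytes(os.urandom(2), "big") & 0x0FFF        # 12 bits
    rand_b = int.from_bytes(os.urandom(8), "big") & 0x3FFFFFFFFFFFFFFF  # 62 bits
    value = (ts_ms << 80) | (0x7 << 76) | (rand_a << 64) | (0x2 << 62) | rand_b
    return uuid.UUID(int=value)

print(uuidv7())  # e.g. 0193b2c1-...-7...-...
```

Within a single millisecond the ordering of the random bits is arbitrary; if you need strict monotonicity inside one process, RFC 9562 suggests using some of the rand_a bits as a counter.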
The key isn't which one is "better." It's understanding what you're trading off. Make the choice intentionally.
There's no universal ID. Only the right one for your access pattern.
In practice, the hybrid pattern looks like this:

CREATE TABLE orders (
    -- Internal: fast lookups, compact indexes
    id BIGSERIAL PRIMARY KEY,
    -- External: safe to expose, no enumeration
    public_id UUID NOT NULL DEFAULT gen_random_uuid() UNIQUE,
    customer_id BIGINT REFERENCES customers(id), -- FK uses BIGINT
    created_at TIMESTAMPTZ DEFAULT now()
);

-- The UNIQUE constraint already creates the index that API lookups use,
-- so no separate CREATE INDEX on public_id is needed.

-- Internal query (fast, uses integer PK):
SELECT * FROM orders WHERE id = 12345;

-- API query (secure, uses UUID):
SELECT * FROM orders WHERE public_id = '550e8400-e29b-...'::uuid;

— blanho