Database Design & Internals

Pagination at Scale: Why OFFSET Is Killing Your Performance

OFFSET pagination seems simple until you have millions of rows. Here's why it degrades linearly, how cursor-based pagination stays constant-time, and when each approach makes sense.

· 7 min read

The Query That Gets Slower Every Page

You've built pagination. Users click through pages and your SQL looks like this:

SELECT * FROM products ORDER BY created_at DESC LIMIT 20 OFFSET 10000;

Page 1 loads in 2ms. Page 500 takes 800ms. Page 5000? Your database is sweating. Same query, same LIMIT -- so why does it get slower?


How OFFSET Actually Works

OFFSET doesn't "skip" rows. It reads and discards them.

1. Database scans from the start. OFFSET 10000 means the database reads from the beginning of the result set (or the index) and walks through the first 10,000 rows.

2. It throws away what you don't want. All 10,000 rows are fetched and sorted, then discarded. The database does the full work of retrieving them -- it just doesn't return them to you.

3. It returns the next LIMIT rows. After discarding 10,000 rows, it returns the next 20. You paid for 10,020 rows of work to get 20 rows of data.
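The arithmetic behind those three steps can be sketched as a toy cost model (illustrative only -- `rowsTouched` is not a real database API, just the offset + limit accounting described above):

```typescript
// Illustrative cost model: OFFSET pagination reads offset + limit rows
// and discards the first `offset` of them before returning anything.
function rowsTouched(offset: number, limit: number): number {
  return offset + limit;
}

console.log(rowsTouched(0, 20));     // page 1: 20 rows of work for 20 rows
console.log(rowsTouched(10000, 20)); // deep page: 10020 rows of work for 20 rows
```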

[Interactive demo: how OFFSET degrades vs. cursor] At page 1 the two approaches are indistinguishable: OFFSET 0 scans and discards nothing (~1ms), and WHERE id > cursor jumps directly via the index (~1ms). The gap opens as the offset grows.

The deeper you paginate, the more rows the database scans and throws away. This is O(offset + limit) for every single page request.

PostgreSQL EXPLAIN -- Page 500
$ EXPLAIN ANALYZE SELECT * FROM products ORDER BY created_at DESC LIMIT 20 OFFSET 10000;
Sort  (cost=15234.12..15234.17 rows=20 width=128)
  Sort Key: created_at DESC
  Sort Method: top-N heapsort  Memory: 42kB
  ->  Seq Scan on products  (cost=0.00..12834.00 rows=500000 width=128)
Planning Time: 0.09 ms
Execution Time: 842.31 ms

That Seq Scan on products (rows=500000) tells the whole story. The database scanned 500k rows to return 20.


Why This Matters at Scale

📌 Linear Degradation

OFFSET pagination is O(offset + limit) per request -- the work grows with the offset, not the page size. Page 1 touches ~20 rows. Page 500 touches ~10,020. Page 50,000 touches ~1,000,020. The cost grows linearly with page depth. Every. Single. Request.

In a real application, this creates compounding problems:

  • Database CPU spikes as users (or bots) crawl deep pages
  • Connection pool exhaustion -- slow queries hold connections longer
  • Inconsistent results -- rows inserted between page requests cause items to appear on two pages or disappear entirely

That last one is subtle but critical. If a new row is inserted while a user is on page 5, every subsequent page shifts by one row. Users see duplicates or miss items entirely.
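The drift is easy to reproduce with an in-memory sketch (a toy model of OFFSET slicing, not database code):

```typescript
// Simulate OFFSET page drift: a row inserted between page requests
// pushes an already-seen item onto the next page.
function pageOf(items: string[], page: number, perPage: number): string[] {
  return items.slice((page - 1) * perPage, page * perPage);
}

let feed = ["e", "d", "c", "b", "a"]; // newest first
const page1 = pageOf(feed, 1, 2);     // ["e", "d"]

// A new item arrives before the user requests page 2...
feed = ["f", ...feed];

const page2 = pageOf(feed, 2, 2);     // ["d", "c"] -- "d" appears twice!
```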


Cursor-Based Pagination

Instead of saying "skip N rows," cursor pagination says "give me rows after this specific point."

-- Instead of OFFSET
SELECT * FROM products
ORDER BY created_at DESC
LIMIT 20 OFFSET 10000;
 
-- Use a cursor (the last item's created_at from the previous page)
SELECT * FROM products
WHERE created_at < '2026-01-15T10:30:00Z'
ORDER BY created_at DESC
LIMIT 20;
📌 Cursor Pagination: Why It's Effectively O(1)

The database uses the index on created_at to jump directly to the cursor position -- the B-Tree seek itself is O(log n), a handful of page reads no matter how large the table. It doesn't read any rows before the cursor. Whether you're on page 2 or page 50,000, the database does the same amount of work: seek to the cursor position, read 20 rows. Effectively constant time.

PostgreSQL EXPLAIN -- Cursor-based (any depth)
$ EXPLAIN ANALYZE SELECT * FROM products WHERE created_at < '2026-01-15T10:30:00Z' ORDER BY created_at DESC LIMIT 20;
Limit  (cost=0.42..1.28 rows=20 width=128)
->  Index Scan Backward using idx_products_created_at on products
    (cost=0.42..21234.56 rows=250000 width=128)
      Index Cond: (created_at < '2026-01-15 10:30:00+00'::timestamptz)
Planning Time: 0.11 ms
Execution Time: 0.089 ms

0.089ms regardless of how deep into the dataset you are. The Index Scan Backward means the database walked the B-Tree index directly to the right position.


The Cursor Encoding Pattern

In practice, you don't expose raw column values as cursors. You encode them:

// API Response
{
  "data": [...],
  "pagination": {
    "next_cursor": "eyJjcmVhdGVkX2F0IjoiMjAyNi0wMS0xNVQxMDozMDowMFoiLCJpZCI6NDU2N30=",
    "has_more": true
  }
}

That cursor is a base64-encoded JSON object containing enough information to resume:

// Decoded cursor
{ "created_at": "2026-01-15T10:30:00Z", "id": 4567 }
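Encoding and decoding such a cursor might look like this in Node.js (a sketch -- error handling and input validation are omitted, and the field names simply mirror the example above):

```typescript
// Cursor = the resume point, serialized as base64-encoded JSON.
interface Cursor {
  created_at: string;
  id: number;
}

function encodeCursor(c: Cursor): string {
  return Buffer.from(JSON.stringify(c)).toString("base64");
}

function decodeCursor(s: string): Cursor {
  return JSON.parse(Buffer.from(s, "base64").toString("utf8"));
}

const cursor: Cursor = { created_at: "2026-01-15T10:30:00Z", id: 4567 };
const token = encodeCursor(cursor);
// decodeCursor(token) returns the original { created_at, id } object
```

In production you'd also want to reject cursors that fail to parse, and possibly sign them so clients can't tamper with the resume point.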

โš ๏ธ Sort column must be unique (or combined with one that is)

If you sort by created_at and two rows have the same timestamp, cursor pagination can skip or duplicate rows. Always include a unique tiebreaker -- typically the primary key. Your WHERE clause becomes: WHERE (created_at, id) < ('2026-01-15T10:30:00Z', 4567).
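In application code, the same lexicographic rule can be sketched as follows (hypothetical helper; ISO-8601 timestamps compare correctly as plain strings):

```typescript
// Lexicographic (created_at, id) comparison -- mirrors SQL's row value
// comparison: earlier timestamp wins; id breaks ties on equal timestamps.
function isBeforeCursor(
  row: { created_at: string; id: number },
  cursor: { created_at: string; id: number }
): boolean {
  if (row.created_at !== cursor.created_at) {
    return row.created_at < cursor.created_at;
  }
  return row.id < cursor.id;
}

// Same timestamp, smaller id: still strictly "before" the cursor
isBeforeCursor(
  { created_at: "2026-01-15T10:30:00Z", id: 4566 },
  { created_at: "2026-01-15T10:30:00Z", id: 4567 }
); // true
```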


When OFFSET Is Still Fine

Cursor pagination isn't always the right answer. OFFSET works well when:

Scenario                          | OFFSET                                | Cursor
Small datasets (< 10k rows)       | Fine -- degradation is negligible     | Unnecessary complexity
Admin dashboards                  | Acceptable -- low traffic, few pages  | Over-engineered
Jump to arbitrary page (page 47)  | Works directly                        | Not possible without sequential traversal
Public API / infinite scroll      | Degrades under load                   | Ideal -- constant performance
Data export / crawling            | Gets progressively slower             | Ideal -- consistent speed
Sorting changes between requests  | Works (stateless)                     | Cursor may be invalid after sort change

The Hybrid Approach

Some systems use both. GitHub's API, for example:

  • OFFSET for the first ~1000 results (user-facing, low depth)
  • Cursor for deep pagination and API consumers (automated, potentially deep)

The implementation looks like:

// Accept both pagination styles
interface PaginationParams {
  // OFFSET style
  page?: number;
  per_page?: number;
  // Cursor style
  after?: string;
  first?: number;
}
 
function buildQuery(params: PaginationParams) {
  if (params.after) {
    const cursor = decodeCursor(params.after);
    return sql`
      SELECT * FROM products
      WHERE (created_at, id) < (${cursor.created_at}, ${cursor.id})
      ORDER BY created_at DESC, id DESC
      LIMIT ${params.first || 20}
    `;
  }
 
  // Fallback to OFFSET with a hard cap
  const page = Math.min(params.page || 1, 50); // Cap at page 50
  const offset = (page - 1) * (params.per_page || 20);
  return sql`
    SELECT * FROM products
    ORDER BY created_at DESC
    LIMIT ${params.per_page || 20}
    OFFSET ${offset}
  `;
}

✅ The pragmatic approach

If you expose OFFSET pagination, cap the maximum page depth. OFFSET > 10000? Return a 400 error and guide consumers toward cursor-based pagination. This protects your database from deep-scan abuse while keeping the simple cases simple.
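A guard along those lines might look like this (hypothetical names; the "after" parameter matches the hybrid example above):

```typescript
// Reject deep OFFSET requests with a 400-style error and point the
// caller at cursor pagination instead of scanning thousands of rows.
const MAX_OFFSET = 10_000;

type OffsetCheck =
  | { ok: true }
  | { ok: false; status: number; message: string };

function validateOffset(offset: number): OffsetCheck {
  if (offset > MAX_OFFSET) {
    return {
      ok: false,
      status: 400,
      message: `offset exceeds ${MAX_OFFSET}; use cursor pagination (the "after" parameter) instead`,
    };
  }
  return { ok: true };
}
```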


Keyset Pagination: The SQL-Level Pattern

Cursor pagination is the API pattern. Under the hood, it uses keyset pagination -- a SQL technique:

-- Standard keyset pagination (single column)
SELECT * FROM products
WHERE created_at < :last_seen_created_at
ORDER BY created_at DESC
LIMIT 20;
 
-- Composite keyset (handles duplicate sort values)
SELECT * FROM products
WHERE (created_at, id) < (:last_seen_created_at, :last_seen_id)
ORDER BY created_at DESC, id DESC
LIMIT 20;

The composite version uses row value comparison -- a SQL standard feature that compares tuples in order. PostgreSQL, MySQL 8+, and SQLite all support the syntax, though how well each engine uses an index for row-value ranges varies -- verify with EXPLAIN on your target database.

💡 Index requirement

Keyset pagination needs an index that matches your ORDER BY clause. For ORDER BY created_at DESC, id DESC, create: CREATE INDEX idx_products_pagination ON products(created_at DESC, id DESC). Without this index, the database falls back to sorting, and you lose the performance advantage.


Key Takeaway

✅ Match your pagination to your scale

OFFSET is simple and works for small datasets or shallow pages. But it has a fundamental flaw: it does work proportional to the page depth, not the page size. Cursor-based pagination costs the same whether you're on page 1 or page 10,000 -- it seeks via index instead of scanning and discarding. For any API that might be paginated deeply, build cursor pagination from the start. Retrofitting it later means API versioning headaches.
