An iOS Engineering Reference

Learning iOS SQLite

32 Sections Swift / iOS Long-Form

A deep, hands-on guide to using SQLite directly from Swift on iOS — no ORM, no Core Data, no SwiftData, no GRDB. Just the SQLite C library that already ships inside every iOS device, driven from Swift through the SQLite3 system module. We’ll go from opening a database connection to writing a typed Swift wrapper layer, through migrations, indexing, the query planner, concurrency and WAL mode, full-text search, the backup API, custom functions, encryption, and SwiftUI integration.

This is the lowest level at which you can reasonably work with a database on iOS. Everything else — GRDB, SQLite.swift, FMDB, even Core Data — sits on top of the same C API you’ll learn here. Understanding it directly gives you three things: the ability to read and debug what those higher-level libraries do, the ability to drop down to raw SQL when a library’s abstractions get in your way, and a genuine understanding of how relational persistence works on the platform rather than a memorized set of framework incantations.

This guide assumes you’re comfortable with Swift — closures, optionals, generics, enums with associated values, basic concurrency. It does not assume you know SQL or the SQLite C API. We’ll build both up from scratch, with iOS-specific Swift examples throughout. Where the C API has sharp edges when called from Swift (and it has several — pointer lifetimes, the missing SQLITE_TRANSIENT constant, 1-based vs 0-based indexing), we’ll call them out explicitly because these are exactly the things that waste hours when you hit them cold.

A note on scope: SQLite is a vast piece of software with decades of features. This guide covers what an iOS engineer actually needs, in depth, and points you to the official documentation for the long tail. By the end you’ll have written a small but real Swift database layer and will understand the tradeoffs well enough to decide when raw SQLite is the right tool and when you should reach for something higher level.

Table of Contents

  1. What SQLite Is and Why Use It Directly on iOS
  2. The C API: How SQLite Ships in iOS and Bridging to Swift
  3. Opening and Closing a Database Connection
  4. Executing Simple Statements with sqlite3_exec
  5. Prepared Statements: Compile, Bind, Step, Finalize
  6. Binding Parameters Safely
  7. Reading Results: Column Access and Types
  8. CRUD Operations in Practice
  9. Error Handling and Result Codes
  10. Transactions: BEGIN, COMMIT, ROLLBACK
  11. Building a Swift Wrapper Layer, Part 1: Connection and Statement
  12. Building a Swift Wrapper Layer, Part 2: Row Decoding and a Query API
  13. Schema Design and Data Types in SQLite
  14. Migrations: Versioning Your Schema with user_version
  15. Indexes and Query Optimization
  16. EXPLAIN QUERY PLAN and the Query Planner
  17. Joins, Subqueries, and Aggregations
  18. Date, Time, and Storing Complex Types
  19. NULL Handling and Type Affinity
  20. Concurrency: Threading Modes and Connections
  21. WAL Mode and Write Concurrency
  22. Busy Handling and Locking
  23. Prepared Statement Caching and Reuse
  24. Bulk Inserts and Performance Tuning
  25. Full-Text Search with FTS5
  26. The Backup API and Database Copying
  27. Blobs, Incremental I/O, and Large Data
  28. Custom Functions and Collations
  29. Encryption with SQLCipher
  30. Integrating SQLite with SwiftUI
  31. Common Gotchas and Anti-Patterns
  32. Where to Go Deeper

1. What SQLite Is and Why Use It Directly on iOS

Before you write a line of code, you should understand what SQLite actually is, what it isn’t, and why you’d choose to talk to it directly instead of through one of the many libraries that wrap it. This framing shapes every decision you’ll make in the rest of the guide.

What SQLite is

SQLite is an embedded relational database engine. “Embedded” is the key word. Unlike PostgreSQL or MySQL, there is no server process. There is no network connection, no daemon, no separate installation. SQLite is a C library that you link into your application, and your database is a single ordinary file on disk. When your code “connects” to the database, it’s opening a file. When it runs a query, it’s executing C functions in your own process that read and write that file.

This architecture has profound consequences:

  • It’s astonishingly reliable. SQLite is among the most thoroughly tested software in existence — its test suite has more lines than the library itself, with 100% branch coverage. It runs on aircraft, in browsers, in operating systems. On your iPhone it backs Messages, Photos, Notes, and dozens of other system apps.
  • It’s fast for local work. No network round trips, no inter-process communication. A query is a function call. For local persistence on a single device, it’s hard to beat.
  • It’s a real SQL database. Transactions, indexes, joins, triggers, views, foreign keys, full-text search, JSON functions, window functions — it has the features you’d expect from a serious relational engine, with a few documented limitations.
  • The database is portable. The file format is stable, cross-platform, and backward compatible. A database written on an iPhone opens unchanged on a Mac, a Linux server, or a Windows machine.

SQLite is already on every iOS device

You don’t install SQLite on iOS. It’s part of the operating system. There’s a system library, libsqlite3, and a corresponding Swift/Clang module called SQLite3. You add import SQLite3 to a Swift file and you immediately have access to the entire C API — sqlite3_open_v2, sqlite3_prepare_v2, sqlite3_step, all of it. No package, no dependency, no CocoaPods, no SPM entry. It’s just there.

This is a meaningful advantage. Your app gains zero binary size from using SQLite (the library is shared by the OS), you get whatever version Apple ships and maintains with security updates, and there’s nothing to keep up to date. The tradeoff is that you’re stuck with Apple’s build of SQLite, which means certain compile-time options and extensions aren’t available unless you bundle your own copy (more on that when we reach FTS5 and SQLCipher).

Why use the C API directly

Given that GRDB, SQLite.swift, FMDB, and others exist — and they’re good — why would you write against the raw C API?

A few honest reasons:

  • Understanding. Every Swift SQLite library is a thin-to-medium wrapper over the same C functions. When something goes wrong inside GRDB, or when you need a feature it doesn’t expose, you need to understand what’s happening underneath. This guide gives you that.
  • Zero dependencies. Some teams have hard constraints against third-party dependencies. Raw SQLite via the system module satisfies that constraint completely.
  • Control. When you write the layer yourself, there’s no abstraction deciding things for you. You control exactly when statements compile, when they finalize, when transactions begin, how connections are pooled. For specialized workloads this control matters.
  • Learning leverage. Once you understand raw SQLite, learning any wrapper library takes an afternoon, because you already understand the model. The reverse isn’t true — learning GRDB first leaves you helpless when you need to drop below it.

And the honest counterpoint: for most production apps, you should use a wrapper. Writing and maintaining a correct, safe, ergonomic SQLite layer is real work, and libraries like GRDB have solved the hard problems (statement caching, observation, concurrency, Codable integration) better than you will on a deadline. This guide teaches you the foundation precisely so you can make that choice from knowledge rather than ignorance — and so that when you do reach below the wrapper, you know what you’re doing.

What SQLite is not

To set expectations:

  • It’s not a client-server database. No concurrent writers across machines, no network access, no user accounts or permissions. One file, one process (well — possibly several processes on one device, with file locking, but that’s an edge case on iOS).
  • It’s not a replacement for a backend. SQLite on the device is local storage. Syncing to a server, sharing across users, conflict resolution — none of that is SQLite’s job. You build it on top, or you use CloudKit, or you use a backend.
  • It’s not schemaless. It’s relational. You define tables and columns. (It does have a flexible type system and JSON support, which softens this, but it’s fundamentally relational.)
  • It’s not infinitely scalable. It’s excellent up to databases of many gigabytes, but it’s designed for a single device’s local data, not for hundreds of concurrent connections hammering a shared store.

When raw SQLite is the right call

Concretely, reach for direct SQLite when: you want zero dependencies; you’re building your own minimal data layer with full control; you’re learning how persistence works; you need a specific SQL feature and want no abstraction between you and it; or you’re debugging or extending a higher-level library. Reach for a wrapper (GRDB especially) when: you’re shipping a real app on a timeline; you want Codable record mapping, database observation, and migration tooling out of the box; or you’d rather spend your effort on your product than on database plumbing.

What to internalize

SQLite is an embedded, serverless, single-file relational database that ships inside iOS and backs much of the OS itself. You access it from Swift through the SQLite3 system module with no dependencies. Working against the raw C API gives you understanding, control, and zero dependencies — at the cost of writing and maintaining plumbing that mature libraries already provide. The right mental model: a database is a file, a connection is an open file handle, and a query is a C function call in your own process. Everything in this guide builds on that model.


2. The C API: How SQLite Ships in iOS and Bridging to Swift

SQLite’s public interface is a C API — a few hundred functions operating on opaque pointers. Calling C from Swift works smoothly most of the time, but SQLite’s API has a handful of patterns that translate awkwardly, and knowing them up front saves real pain. This section is the bridge: how the module is structured, what the core types look like in Swift, and the specific translation gotchas.

Importing the module

Everything starts with one line:

import SQLite3

This imports Apple’s system SQLite module. After this, every public SQLite C function and constant is available with its original C name. There’s no Swift-ified naming, no namespacing — sqlite3_open_v2 is sqlite3_open_v2, SQLITE_OK is SQLITE_OK. You’re calling C, lightly dressed in Swift syntax.

If you’re in a framework target rather than an app target, you may need to ensure libsqlite3.tbd is linked (Build Phases → Link Binary With Libraries). In a normal app target with import SQLite3, the linker generally resolves it automatically.

The core opaque types

SQLite hands you pointers to internal structures you never inspect directly. In C these are sqlite3* (a database connection) and sqlite3_stmt* (a prepared statement). In Swift, both come through as OpaquePointer:

var db: OpaquePointer?       // a database connection handle
var statement: OpaquePointer? // a prepared statement handle

OpaquePointer? is Swift’s representation of “a pointer to something I can’t see inside.” You pass it back to SQLite functions; you never dereference it yourself. The optionality matters: before you successfully open a database, db is nil; after, it points to a real connection. The same is true for statements.

Because you pass the address of these variables to SQLite (so it can write the pointer into them), you’ll see a lot of &db and &statement:

let result = sqlite3_open_v2(path, &db, SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE, nil)

sqlite3_open_v2 takes a sqlite3** in C — a pointer to a pointer. You give it &db, the address of your OpaquePointer?, and it fills in the pointer value.

Result codes are plain integers

Almost every SQLite function returns an Int32 result code. The important ones:

SQLITE_OK    // 0  — success
SQLITE_ROW   // 100 — sqlite3_step produced a row
SQLITE_DONE  // 101 — sqlite3_step finished (no more rows)
SQLITE_ERROR // 1  — generic error
SQLITE_BUSY  // 5  — database is locked
SQLITE_CONSTRAINT // 19 — a constraint (e.g. UNIQUE) was violated

These constants come through from the module as Int32. You’ll compare function results against them constantly:

if sqlite3_open_v2(path, &db, flags, nil) != SQLITE_OK {
    // handle error
}

We’ll build proper Swift error handling around these in Section 9. For now, know that the C API communicates almost everything through these integer codes, and that sqlite3_errmsg(db) turns the last error into a human-readable C string.

Reading C strings

SQLite returns text as C strings — const char*, which Swift sees as UnsafePointer<CChar>? or UnsafePointer<UInt8>? depending on the function. To turn one into a Swift String:

if let cString = sqlite3_column_text(statement, 0) {
    let swiftString = String(cString: cString)
}

String(cString:) copies the bytes into a proper Swift String. The copy matters: the pointer SQLite gives you is only valid until the next call that touches that statement (the next sqlite3_step, or sqlite3_finalize). If you stored the raw pointer and read it later, you’d be reading freed or overwritten memory. Always convert to a String immediately.

Error messages work the same way:

let message = String(cString: sqlite3_errmsg(db))

Passing Swift strings into C

When you pass a Swift String to a C function expecting const char*, Swift bridges it automatically for the duration of the call:

sqlite3_open_v2(path, &db, flags, nil)  // `path` is a Swift String; bridged transparently

This automatic bridging is safe only for the duration of the call. The C string Swift creates is valid while the function runs and is deallocated afterward. This is fine for sqlite3_open_v2 (which copies the path internally) but is the source of the single most infamous SQLite-from-Swift bug, which we’ll meet next.

The SQLITE_TRANSIENT gotcha

When you bind a text or blob value to a prepared statement, SQLite needs to know whether it can use your buffer directly or must copy it. You tell it with a “destructor” argument. C defines two special sentinel values:

#define SQLITE_STATIC      ((sqlite3_destructor_type)0)
#define SQLITE_TRANSIENT   ((sqlite3_destructor_type)-1)

SQLITE_STATIC means “this buffer will outlive the statement, don’t copy it.” SQLITE_TRANSIENT means “copy this immediately because my buffer is about to disappear.” For Swift strings bridged just for a call, you almost always want SQLITE_TRANSIENT — the bridged C string vanishes the moment the bind call returns, so SQLite must copy it now.

The problem: these are C macros, and Clang does not import function-like macro casts cleanly. SQLITE_TRANSIENT is not available as a usable symbol in Swift. If you try to use it, your code won’t compile. The fix, which you’ll copy into essentially every raw-SQLite-in-Swift project, is to define it yourself:

let SQLITE_TRANSIENT = unsafeBitCast(-1, to: sqlite3_destructor_type.self)
let SQLITE_STATIC = unsafeBitCast(0, to: sqlite3_destructor_type.self)

unsafeBitCast(-1, ...) reinterprets the integer -1 as a sqlite3_destructor_type (a function pointer type), reproducing exactly what the C macro does. This is one of the rare legitimate uses of unsafeBitCast. Define it once in your project and forget about it.

If you bind a Swift string and pass SQLITE_STATIC by mistake (or nil), you’ll get intermittent corruption: sometimes the value is right, sometimes it’s garbage, depending on whether the memory happened to survive. It’s a maddening bug precisely because it’s nondeterministic. Use SQLITE_TRANSIENT for transient Swift values and the problem never appears.

Indexing: bind is 1-based, columns are 0-based

A second classic trap, and a purely arbitrary one: when you bind parameters to a statement, the first parameter is at index 1. When you read columns from a result row, the first column is at index 0.

sqlite3_bind_text(statement, 1, name, -1, SQLITE_TRANSIENT)  // first bind param = index 1
let firstColumn = sqlite3_column_text(statement, 0)          // first result column = index 0

This inconsistency is baked into the C API and there’s nothing to do but remember it. Off-by-one errors here produce “no such column” errors or silently wrong data. When something is mysteriously empty or shifted, check your indices first.

The shape of a typical call sequence

To preview where we’re going, almost all real SQLite work follows this rhythm:

var statement: OpaquePointer?
// 1. Compile SQL text into a statement
sqlite3_prepare_v2(db, "SELECT name FROM users WHERE id = ?", -1, &statement, nil)
// 2. Bind values to the ? placeholders
sqlite3_bind_int64(statement, 1, userId)
// 3. Step through results
while sqlite3_step(statement) == SQLITE_ROW {
    let name = String(cString: sqlite3_column_text(statement, 0))
}
// 4. Finalize to release the statement
sqlite3_finalize(statement)

Prepare, bind, step, finalize. We’ll dissect each step in the coming sections, but hold this shape in your head — it’s the heartbeat of the entire API.

What to internalize

import SQLite3 gives you Apple’s system SQLite as raw C, with original names and no Swift sugar. Connections and statements are OpaquePointer?; you pass &db so SQLite can fill them in. Functions communicate through Int32 result codes you compare against constants like SQLITE_OK, SQLITE_ROW, SQLITE_DONE. Convert C strings to Swift Strings immediately with String(cString:) because the pointers are short-lived. Define SQLITE_TRANSIENT yourself with unsafeBitCast(-1, ...) and use it when binding transient Swift values, or face nondeterministic corruption. Remember: bind parameters are 1-based, result columns are 0-based. The eternal rhythm is prepare, bind, step, finalize.


3. Opening and Closing a Database Connection

A SQLite database is a file, and a connection is an open handle to that file. Getting the open-and-close lifecycle right — correct paths, correct flags, correct cleanup — is the foundation everything else rests on. This section covers where to put the database file on iOS, how to open it with the right options, and how to close it without leaking.

Where to put the database file on iOS

iOS sandboxes every app into its own container with several standard directories. Where you put your SQLite file matters because the directories have different backup and purge behavior:

  • Documents (.documentDirectory) — backed up to iCloud/iTunes, visible to the user if file sharing is enabled. Use for user-generated content the user would expect to be preserved.
  • Application Support (.applicationSupportDirectory) — backed up, hidden from the user. The conventional home for a database that’s app-managed but should survive reinstalls-from-backup. This is usually the right choice for an app database.
  • Caches (.cachesDirectory) — not backed up, can be purged by the OS under storage pressure. Use only for data you can regenerate.
  • Temporary (.temporaryDirectory) — can be cleared anytime. For scratch databases.

For a typical app database you want Application Support. One wrinkle: that directory isn’t guaranteed to exist on a fresh install, so you create it if needed:

import Foundation

func databaseURL() throws -> URL {
    let fileManager = FileManager.default
    let appSupport = try fileManager.url(
        for: .applicationSupportDirectory,
        in: .userDomainMask,
        appropriateFor: nil,
        create: true   // creates the directory if it doesn't exist
    )
    return appSupport.appendingPathComponent("app.sqlite")
}

The create: true flag handles the “directory doesn’t exist yet” case. The filename extension is conventional — .sqlite, .db, and .sqlite3 are all common and functionally identical.

Excluding the database from iCloud backup

If your database is a local cache that shouldn’t count against the user’s iCloud storage, mark it excluded from backup even when it lives in a backed-up directory:

func excludeFromBackup(_ url: URL) throws {
    var url = url
    var values = URLResourceValues()
    values.isExcludedFromBackup = true
    try url.setResourceValues(values)
}

This is optional and situational. User data you want preserved should be backed up; regenerable caches should not. Decide deliberately.

Opening with sqlite3_open_v2

There are two open functions. sqlite3_open is the old, simple one. sqlite3_open_v2 is the modern one that lets you specify flags, and it’s the one you should always use:

var db: OpaquePointer?

let path = try databaseURL().path

let flags = SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE | SQLITE_OPEN_FULLMUTEX

let result = sqlite3_open_v2(path, &db, flags, nil)

guard result == SQLITE_OK else {
    let message = db.map { String(cString: sqlite3_errmsg($0)) } ?? "unknown error"
    sqlite3_close(db)   // close even on failure — the handle may be partially allocated
    throw DatabaseError.openFailed(message)
}

Let’s unpack each piece.

The path is the filesystem path as a Swift String. Note we pass url.path, not the URL itself — SQLite wants a C string path. (On modern iOS you could use url.path(percentEncoded:); the plain .path is fine for local files.)

The flags are an OR-ed bitmask:

  • SQLITE_OPEN_READWRITE — open for reading and writing.
  • SQLITE_OPEN_CREATE — create the file if it doesn’t exist. Without this, opening a nonexistent file fails.
  • SQLITE_OPEN_FULLMUTEX — use serialized threading mode (we’ll discuss threading modes in detail in Section 20). For now, this is the safe default that lets you use the connection from multiple threads, at some performance cost.

Other useful flags: SQLITE_OPEN_READONLY (for a bundled, read-only database), SQLITE_OPEN_NOMUTEX (multi-thread mode), SQLITE_OPEN_URI (allow URI filenames with query parameters).

The fourth argument (nil) is an optional VFS name — the virtual file system to use. nil means the default, which is correct on iOS.

Handling open failure correctly

A subtle and important point: even when sqlite3_open_v2 fails, it may have allocated a connection handle (so it can attach an error message to it). The documentation is explicit that you should call sqlite3_close on the handle whether the open succeeded or failed, unless the handle is nil. The code above does this — note it reads the error message before closing, then closes regardless. Skipping the close on the failure path leaks the handle.

In-memory databases

For tests, prototypes, and scratch work, you can open a database that lives entirely in RAM and vanishes when closed:

sqlite3_open_v2(":memory:", &db, SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE, nil)

The special path ":memory:" creates a private, in-memory database. It’s extremely fast and perfect for unit tests because each test gets a clean, isolated database with no file system side effects. We’ll use in-memory databases throughout the guide’s testing examples.

Closing the connection

When you’re done, close the connection to release its resources:

sqlite3_close(db)
db = nil

But there’s a critical precondition: sqlite3_close fails if any prepared statements on the connection haven’t been finalized. If you’ve prepared statements and not called sqlite3_finalize on them, sqlite3_close returns SQLITE_BUSY and the connection stays open, leaking. There are two ways to deal with this:

The disciplined way is to finalize every statement before closing. The pragmatic way is sqlite3_close_v2, which is more forgiving:

sqlite3_close_v2(db)
db = nil

sqlite3_close_v2 marks the connection as a “zombie” if statements are still live, then frees it automatically when the last statement finalizes. It never fails due to outstanding statements. For most app code, sqlite3_close_v2 is the safer choice. The plain sqlite3_close is useful precisely because its failure tells you that you’ve leaked a statement — a diagnostic signal you might want during development.

Wrapping it in a class

In real code you’d wrap this lifecycle in a class so the connection is owned and cleaned up deterministically:

final class Database {
    private var handle: OpaquePointer?

    init(url: URL) throws {
        let flags = SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE | SQLITE_OPEN_FULLMUTEX
        let result = sqlite3_open_v2(url.path, &handle, flags, nil)
        guard result == SQLITE_OK else {
            let message = handle.map { String(cString: sqlite3_errmsg($0)) } ?? "unknown"
            sqlite3_close(handle)
            throw DatabaseError.openFailed(message)
        }
    }

    deinit {
        sqlite3_close_v2(handle)
    }
}

enum DatabaseError: Error {
    case openFailed(String)
}

The init opens; the deinit closes. As long as you hold the Database instance for the app’s lifetime (typically in a long-lived object), the connection stays open, and when the instance is finally released, the connection closes cleanly. We’ll grow this class substantially in Sections 11 and 12.

How many connections?

A natural question: one connection for the whole app, or many? The short answer for now: one long-lived connection is the simplest correct choice for most apps, especially in WAL mode (Section 21). You open it at launch, keep it for the app’s lifetime, and route work through it. Multiple connections are useful for true read/write concurrency, but they introduce locking complexity we’ll address later. Start with one.

Open/close pitfalls

Forgetting SQLITE_OPEN_CREATE. Opening a database that doesn’t exist yet without this flag fails with “unable to open database file” — a confusing message when the real problem is just that the file isn’t there.

Passing a URL where a path string is expected. SQLite wants a C string path. Pass url.path, not url.absoluteString (which has a file:// scheme that SQLite doesn’t want by default).

Not closing on the open-failure path. Leaks the partially-allocated handle. Always close, even on failure.

Using sqlite3_close with live statements. Returns SQLITE_BUSY and leaks. Finalize statements first, or use sqlite3_close_v2.

Putting the database in Caches and being surprised when it vanishes. The OS purges Caches under pressure. User data belongs in Application Support or Documents.

Opening a new connection per query. Opening is relatively expensive and discards the page cache each time. Keep a connection open.

What to internalize

A SQLite database is a file; put your app’s database in Application Support (creating the directory with create: true), or use ":memory:" for tests. Open with sqlite3_open_v2 using SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE plus a threading flag, passing url.path and the address of your OpaquePointer?. Read the error message before closing on failure, and always close the handle even when open fails. Prefer sqlite3_close_v2 to avoid the outstanding-statement failure mode. Wrap the lifecycle in a class whose init opens and deinit closes, and keep one long-lived connection for most apps.


4. Executing Simple Statements with sqlite3_exec

Not every SQL statement needs the full prepare-bind-step-finalize ceremony. For one-off statements with no parameters and no results — creating a table, setting a pragma, running a migration step — SQLite offers a convenience function, sqlite3_exec, that compiles and runs SQL in a single call. This section covers when to use it, how it works, and its hard limits.

The simplest possible execution

sqlite3_exec takes the connection, a SQL string, an optional callback for results, a context pointer for the callback, and a pointer for an error message:

let sql = "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);"

let result = sqlite3_exec(db, sql, nil, nil, nil)

if result != SQLITE_OK {
    let message = String(cString: sqlite3_errmsg(db))
    throw DatabaseError.execFailed(message)
}

That’s a complete table creation. No statement handle to manage, no stepping, no finalizing — sqlite3_exec does all of that internally. The three nils are: no result callback, no callback context, and no separate error-message buffer (we read the error from sqlite3_errmsg instead).

What sqlite3_exec actually does

Under the hood, sqlite3_exec is a convenience wrapper. It repeatedly calls sqlite3_prepare_v2, sqlite3_step, and sqlite3_finalize for you, looping over every statement in the string. That last part is important and useful: sqlite3_exec can run multiple statements separated by semicolons in a single call. This makes it ideal for schema setup:

let schema = """
CREATE TABLE IF NOT EXISTS users (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    email TEXT
);
CREATE TABLE IF NOT EXISTS posts (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    title TEXT NOT NULL,
    body TEXT,
    created_at INTEGER NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_posts_user ON posts(user_id);
"""

if sqlite3_exec(db, schema, nil, nil, nil) != SQLITE_OK {
    throw DatabaseError.execFailed(String(cString: sqlite3_errmsg(db)))
}

Three statements, one call. If any statement fails, sqlite3_exec stops and returns the error code; statements before the failure have already executed (there’s no implicit transaction unless you add one — more on that in Section 10). The prepared-statement path we’ll learn next can only compile and run one statement at a time, so for multi-statement schema scripts, sqlite3_exec is genuinely the right tool.

Pragmas

Pragmas are SQLite’s mechanism for configuring connection and database behavior. You’ll set several at connection startup, and sqlite3_exec is the natural way:

func configureConnection(_ db: OpaquePointer?) throws {
    let pragmas = """
    PRAGMA journal_mode = WAL;
    PRAGMA foreign_keys = ON;
    PRAGMA busy_timeout = 5000;
    PRAGMA synchronous = NORMAL;
    """
    if sqlite3_exec(db, pragmas, nil, nil, nil) != SQLITE_OK {
        throw DatabaseError.execFailed(String(cString: sqlite3_errmsg(db)))
    }
}

What these do (each covered in depth later):

  • journal_mode = WAL — enable write-ahead logging for better concurrency (Section 21).
  • foreign_keys = ON — enforce foreign key constraints, which are off by default in SQLite. If you declare REFERENCES and expect them enforced, you must turn this on for every connection.
  • busy_timeout = 5000 — wait up to 5 seconds for a lock before returning SQLITE_BUSY (Section 22).
  • synchronous = NORMAL — a durability/speed tradeoff that’s safe and fast under WAL.

The foreign_keys one deserves emphasis: it’s a per-connection setting that defaults off, which surprises everyone the first time. Set it at startup on every connection or your foreign key declarations are documentation, not enforcement.

Note that journal_mode = WAL actually returns a result row (the new journal mode), but sqlite3_exec with a nil callback simply discards it. That’s fine here.

The result callback

sqlite3_exec can return query results through a C callback function, invoked once per row. In practice you almost never use this from Swift — the callback is a C function pointer, the row data arrives as arrays of C strings, and threading Swift closures through it requires unsafe context-pointer juggling. For anything that returns data, you’ll use prepared statements (Section 5), which give you clean, typed column access.

For completeness, the callback signature is a C function taking the context pointer, column count, an array of value C strings, and an array of column-name C strings. If you ever need it, you pass a context pointer as the fourth sqlite3_exec argument and cast it back inside the callback. But treat this as a last resort. The rule of thumb: use sqlite3_exec for statements that return no data; use prepared statements for everything that does.

When to use exec vs prepared statements

A clear division:

Use sqlite3_exec for: CREATE TABLE, CREATE INDEX, DROP, ALTER, PRAGMA settings, transaction control (BEGIN/COMMIT), and multi-statement schema scripts — anything static, parameterless, and result-free.

Use prepared statements for: any statement with values that vary at runtime (which means anything with user data), any SELECT that returns rows, and any statement you’ll run repeatedly (to amortize compilation).

The reason for the second rule is the single most important security principle in this entire guide, which the next point makes concrete.

Never build SQL with string interpolation

It is tempting, especially with sqlite3_exec, to build SQL by interpolating values:

// NEVER DO THIS
let name = userProvidedName
let sql = "INSERT INTO users (name) VALUES ('\(name)');"
sqlite3_exec(db, sql, nil, nil, nil)

This is a SQL injection vulnerability. If name is Robert'); DROP TABLE users;--, you’ve just handed an attacker control of your database. Even without malice, a name containing an apostrophe (O'Brien) breaks the SQL syntactically. String-interpolated SQL with runtime values is wrong both for security and for correctness.

The fix is parameter binding with prepared statements, which we cover starting in the next section. Values go in through bind calls, never through string interpolation. sqlite3_exec is for static SQL only — SQL whose full text you wrote at compile time, with no runtime values spliced in. The moment a runtime value needs to enter a statement, you switch to a prepared statement and bind it. Hold this line absolutely; there is no acceptable exception in production code.

Checking for changes

After an INSERT, UPDATE, or DELETE (whether via exec or a prepared statement), you often want to know how many rows changed:

sqlite3_exec(db, "DELETE FROM posts WHERE created_at < 0;", nil, nil, nil)
let deleted = sqlite3_changes(db)
print("Deleted \(deleted) rows")

sqlite3_changes(db) reports the number of rows modified by the most recent statement. sqlite3_total_changes(db) reports the cumulative count since the connection opened. These are handy for confirming an operation did what you expected.

exec pitfalls

Using exec with runtime values via interpolation. SQL injection and breakage. Static SQL only; bind everything else.

Forgetting PRAGMA foreign_keys = ON. Foreign keys silently aren’t enforced. Set it per connection at startup.

Assuming exec is transactional. A multi-statement exec is not automatically wrapped in a transaction. If statement three fails, statements one and two stay committed. Wrap in BEGIN/COMMIT (Section 10) if you need atomicity.

Trying to read results through exec. Possible but painful from Swift. Use prepared statements for anything returning rows.

Ignoring the return code. sqlite3_exec returns a code; a nil error-message argument doesn’t mean errors vanish. Check the return value and read sqlite3_errmsg on failure.

What to internalize

sqlite3_exec compiles and runs SQL in one call, handles multiple semicolon-separated statements, and is the right tool for static, parameterless, result-free SQL: table creation, indexes, pragmas, and schema scripts. Set critical pragmas at connection startup — especially foreign_keys = ON, which defaults off. Never interpolate runtime values into SQL strings; that’s SQL injection, and the fix is parameter binding with prepared statements, which we turn to now. Use sqlite3_changes to see how many rows a statement affected.


5. Prepared Statements: Compile, Bind, Step, Finalize

Prepared statements are the core of real SQLite work. Every query that takes runtime values or returns rows goes through them. The lifecycle — compile the SQL once, bind values, step through results, finalize — is the rhythm you’ll repeat thousands of times. This section dissects each phase so you understand not just the mechanics but why each step exists.

Why prepared statements exist

When you hand SQLite a SQL string, it must parse it, validate it against the schema, and generate a little bytecode program (SQLite has its own virtual machine, the VDBE) that executes the query. This compilation is real work. A prepared statement separates compilation from execution: you compile once, then execute many times with different bound values, paying the parse-and-plan cost only once.

They also separate code from data. The SQL text — the code — is fixed at compile time. The values — the data — come in afterward through binding. This separation is what makes prepared statements injection-proof: there is no way for a bound value to be interpreted as SQL, because the SQL was already parsed before the value arrived. This is both the performance story and the security story, and they’re the same story.

Phase 1: Compile with sqlite3_prepare_v2

var statement: OpaquePointer?
let sql = "SELECT id, name, email FROM users WHERE id = ?;"

let result = sqlite3_prepare_v2(db, sql, -1, &statement, nil)

guard result == SQLITE_OK else {
    let message = String(cString: sqlite3_errmsg(db))
    throw DatabaseError.prepareFailed(message)
}

The arguments:

  • db — the connection.
  • sql — the SQL text, a Swift String bridged to a C string.
  • -1 — the length of the SQL in bytes. -1 means “read until the null terminator,” which is what you want for a Swift string. You can pass an explicit length to compile only a prefix, but -1 is standard.
  • &statement — the address of your OpaquePointer?, which SQLite fills with the compiled statement handle.
  • nil — an optional output pointer to the unused tail of the SQL (for the multi-statement case). nil ignores it.

If preparation fails — a syntax error, a reference to a nonexistent table or column — you get a non-SQLITE_OK code and sqlite3_errmsg explains. Preparation is where SQL errors surface, so this is your first line of defense against malformed queries.

The ? in the SQL is a parameter placeholder. It marks a spot where a value will be bound. The statement is compiled with the placeholder; the value comes next.

Phase 2: Bind values

The ? placeholder is parameter index 1 (remember: bind indices are 1-based):

let userId: Int64 = 42
sqlite3_bind_int64(statement, 1, userId)

sqlite3_bind_int64 binds a 64-bit integer to parameter 1. There’s a bind function per type — sqlite3_bind_int, sqlite3_bind_int64, sqlite3_bind_double, sqlite3_bind_text, sqlite3_bind_blob, sqlite3_bind_null. We’ll cover all of them in detail in Section 6. For now, note that binding happens after compile and before stepping.

A statement with no placeholders (like SELECT * FROM users) skips this phase entirely.

Phase 3: Step through results

sqlite3_step advances the statement’s execution. Its return value tells you what happened:

while sqlite3_step(statement) == SQLITE_ROW {
    let id = sqlite3_column_int64(statement, 0)
    let name = String(cString: sqlite3_column_text(statement, 1))
    let email = sqlite3_column_text(statement, 2).map { String(cString: $0) }
    print("\(id): \(name), \(email ?? "no email")")
}

Each call to sqlite3_step does one of these:

  • Returns SQLITE_ROW — a result row is available. Read its columns with sqlite3_column_* functions (column indices are 0-based), then call sqlite3_step again for the next row.
  • Returns SQLITE_DONE — execution finished, no more rows. For a SELECT, this means you’ve read every row; for an INSERT/UPDATE/DELETE, it’s the normal single completion.
  • Returns an error code (SQLITE_BUSY, SQLITE_ERROR, SQLITE_CONSTRAINT, etc.) — something went wrong.

The while sqlite3_step(...) == SQLITE_ROW loop is the canonical pattern for reading a result set. It naturally stops at SQLITE_DONE. For a statement that returns no rows (an INSERT), you call sqlite3_step once and expect SQLITE_DONE:

let stepResult = sqlite3_step(insertStatement)
guard stepResult == SQLITE_DONE else {
    throw DatabaseError.stepFailed(String(cString: sqlite3_errmsg(db)))
}

Phase 4: Finalize

When you’re done with a statement, you must release it:

sqlite3_finalize(statement)
statement = nil

sqlite3_finalize frees the compiled bytecode and all resources the statement holds. Forgetting to finalize is a resource leak — the statement stays allocated, and (as we saw in Section 3) it prevents sqlite3_close from succeeding. Every sqlite3_prepare_v2 must be balanced by a sqlite3_finalize.

Finalizing is also what makes the statement’s column pointers invalid. Any C string you got from sqlite3_column_text is dead after finalize (and after the next step, actually). This is why you convert to Swift Strings during the step loop, not after.

The complete lifecycle in one place

Putting all four phases together for a query that returns rows:

func fetchUser(id: Int64, db: OpaquePointer?) throws -> (name: String, email: String?)? {
    var statement: OpaquePointer?
    let sql = "SELECT name, email FROM users WHERE id = ?;"

    guard sqlite3_prepare_v2(db, sql, -1, &statement, nil) == SQLITE_OK else {
        throw DatabaseError.prepareFailed(String(cString: sqlite3_errmsg(db)))
    }
    defer { sqlite3_finalize(statement) }   // guarantees finalize on every exit path

    sqlite3_bind_int64(statement, 1, id)

    let stepResult = sqlite3_step(statement)
    if stepResult == SQLITE_ROW {
        let name = String(cString: sqlite3_column_text(statement, 0))
        let email = sqlite3_column_text(statement, 1).map { String(cString: $0) }
        return (name, email)
    } else if stepResult == SQLITE_DONE {
        return nil   // no matching user
    } else {
        throw DatabaseError.stepFailed(String(cString: sqlite3_errmsg(db)))
    }
}

Notice the defer { sqlite3_finalize(statement) } immediately after a successful prepare. This is the single most valuable habit in raw SQLite-from-Swift code: defer guarantees the statement is finalized no matter how the function exits — normal return, early return, or thrown error. Without it, every throw and every early return is a chance to leak the statement. With it, you write the cleanup once and forget about it. Adopt this pattern universally.

Resetting and reusing a statement

You don’t have to finalize a statement to run it again with different values. sqlite3_reset rewinds a stepped statement back to its starting state so you can rebind and re-step:

var statement: OpaquePointer?
sqlite3_prepare_v2(db, "INSERT INTO users (name) VALUES (?);", -1, &statement, nil)
defer { sqlite3_finalize(statement) }

for name in ["Alice", "Bob", "Carol"] {
    sqlite3_reset(statement)                                   // rewind
    sqlite3_clear_bindings(statement)                          // optional: clear old binds
    sqlite3_bind_text(statement, 1, name, -1, SQLITE_TRANSIENT)
    guard sqlite3_step(statement) == SQLITE_DONE else {
        throw DatabaseError.stepFailed(String(cString: sqlite3_errmsg(db)))
    }
}

This is the foundation of efficient bulk operations (Section 24) and statement caching (Section 23). Compile once, then reset + rebind + step for each iteration. sqlite3_reset keeps the compiled bytecode; it only resets execution state. sqlite3_clear_bindings additionally clears the bound values back to NULL — useful if you don’t rebind every parameter each time, optional if you always rebind all of them.

A subtlety: sqlite3_reset returns the error code from the previous execution if it ended in an error. In the happy path it returns SQLITE_OK. Don’t be confused if reset “returns an error” — it’s reporting the prior step’s outcome, not a new failure.

Getting the inserted row id

After an INSERT into a table with an INTEGER PRIMARY KEY, you usually want the new row’s id:

sqlite3_step(insertStatement)   // SQLITE_DONE
let newId = sqlite3_last_insert_rowid(db)

sqlite3_last_insert_rowid(db) returns the rowid of the most recent successful insert on that connection. For a table with id INTEGER PRIMARY KEY, the rowid and the id are the same value, so this gives you the generated id. Capture it right after the step, before any other insert on the connection changes it.

Prepared statement pitfalls

Forgetting to finalize. Leaks the statement and blocks sqlite3_close. Use defer { sqlite3_finalize(statement) } right after prepare.

Reading columns after finalize or after the next step. The column pointers are invalid. Convert to Swift values during the step loop.

Confusing bind indices (1-based) with column indices (0-based). Produces wrong data or “no such column.” Double-check when results look shifted.

Expecting SQLITE_OK from sqlite3_step. Step returns SQLITE_ROW or SQLITE_DONE on success, never SQLITE_OK. Checking == SQLITE_OK after step is always wrong.

Not resetting before reuse. Re-stepping a finished statement without sqlite3_reset returns SQLITE_MISUSE or stale state. Reset first.

Preparing the same statement in a hot loop. Wastes the compile cost prepared statements exist to amortize. Prepare once outside the loop, reset inside it.

What to internalize

Prepared statements separate SQL code (fixed at compile time) from data (bound afterward), which is simultaneously the performance optimization and the injection defense. The lifecycle is prepare → bind → step → finalize. sqlite3_prepare_v2 compiles SQL with ? placeholders; bind functions fill the placeholders (1-based); sqlite3_step returns SQLITE_ROW per result row and SQLITE_DONE at the end; sqlite3_finalize releases everything. Put defer { sqlite3_finalize(statement) } right after a successful prepare so cleanup is automatic on every exit path. Reuse a statement with sqlite3_reset instead of recompiling. Get generated ids with sqlite3_last_insert_rowid.


6. Binding Parameters Safely

Binding is how runtime values enter a prepared statement. Get it right and your queries are safe, correct, and fast. Get it wrong — especially the text/blob lifetime question — and you get nondeterministic corruption that’s miserable to debug. This section covers every bind function, the placeholder styles, and the memory-lifetime rules that matter most.

The bind functions by type

Each SQLite storage type has a corresponding bind function. The first two arguments are always the statement and the 1-based parameter index:

// Integers
sqlite3_bind_int(statement, 1, Int32(value))      // 32-bit
sqlite3_bind_int64(statement, 1, Int64(value))    // 64-bit — prefer this for Swift Int

// Floating point
sqlite3_bind_double(statement, 1, 3.14159)

// Text
sqlite3_bind_text(statement, 1, "hello", -1, SQLITE_TRANSIENT)

// Blob (binary data)
data.withUnsafeBytes { buffer in
    sqlite3_bind_blob(statement, 1, buffer.baseAddress, Int32(buffer.count), SQLITE_TRANSIENT)
}

// NULL
sqlite3_bind_null(statement, 1)

A few notes on type choices. Swift’s Int is 64-bit on all modern iOS devices, so sqlite3_bind_int64 with Int64(myInt) is the natural choice; sqlite3_bind_int (32-bit) risks overflow for large values. For floating point, sqlite3_bind_double handles both Float and Double (convert Float to Double first). Booleans don’t have a dedicated function — SQLite stores them as integers 0 and 1, so bind value ? 1 : 0 with sqlite3_bind_int.

The text and blob lifetime argument

sqlite3_bind_text and sqlite3_bind_blob take two extra arguments the others don’t: a length and a destructor. These encode the most important rule in binding.

The length: for text, -1 means “the string is null-terminated, compute the length yourself.” This is correct for Swift strings bridged to C. For blobs you pass the explicit byte count, since binary data isn’t null-terminated.

The destructor: this tells SQLite what to do with your buffer. The two values that matter:

  • SQLITE_TRANSIENT — “make your own copy of this data immediately, because my buffer is about to disappear.” This is what you want for Swift values bridged just for the call.
  • SQLITE_STATIC — “don’t copy; my buffer will stay valid as long as you need it.” Only safe when you genuinely control the buffer’s lifetime past the bind and step.

Recall from Section 2 that when you pass a Swift String to a C function, Swift creates a temporary C string that’s valid only during the call. So when you write:

sqlite3_bind_text(statement, 1, swiftString, -1, SQLITE_TRANSIENT)

the bridged C string for swiftString exists only while sqlite3_bind_text runs. If you passed SQLITE_STATIC, SQLite would keep a pointer to that buffer and read it later during sqlite3_step — but by then the buffer is gone, and you read garbage. With SQLITE_TRANSIENT, SQLite copies the bytes during the bind call while the buffer is still alive. For bridged Swift strings, always use SQLITE_TRANSIENT. This single rule prevents the most notorious raw-SQLite-in-Swift bug.

And recall you must define SQLITE_TRANSIENT yourself because the C macro doesn’t import:

let SQLITE_TRANSIENT = unsafeBitCast(-1, to: sqlite3_destructor_type.self)

Binding blobs from Data

Binary data needs withUnsafeBytes to get a pointer into the Data’s storage:

let imageData: Data = ...
imageData.withUnsafeBytes { (buffer: UnsafeRawBufferPointer) in
    sqlite3_bind_blob(statement, 1, buffer.baseAddress, Int32(buffer.count), SQLITE_TRANSIENT)
}

withUnsafeBytes gives you a pointer valid only inside the closure. Because you pass SQLITE_TRANSIENT, SQLite copies the bytes before the closure ends, so the copy is safe. If buffer.count could exceed Int32.max (a 2 GB blob — you have other problems), you’d need the blob64 variant, but for normal data Int32(buffer.count) is fine.

An empty Data is a subtlety: buffer.baseAddress may be nil for zero-length data. SQLite handles a nil pointer with zero length as an empty blob, so this generally works, but if you want to be explicit you can special-case empty data to sqlite3_bind_blob(statement, 1, "", 0, SQLITE_TRANSIENT) or bind an actual empty blob.

Binding optionals

Swift optionals map naturally to NULL. The pattern:

func bindOptionalText(_ statement: OpaquePointer?, _ index: Int32, _ value: String?) {
    if let value = value {
        sqlite3_bind_text(statement, index, value, -1, SQLITE_TRANSIENT)
    } else {
        sqlite3_bind_null(statement, index)
    }
}

nil becomes a bound NULL; a present value binds normally. You’ll write small helpers like this (or generalize them, as we do in Section 11) so optional handling isn’t scattered through your call sites.

Named parameters

? placeholders are positional — bind index 1, 2, 3 in order. For statements with many parameters, positional binding gets error-prone (miscount and everything shifts). SQLite supports named parameters, which are self-documenting:

let sql = "INSERT INTO users (name, email, age) VALUES (:name, :email, :age);"
var statement: OpaquePointer?
sqlite3_prepare_v2(db, sql, -1, &statement, nil)

let nameIndex = sqlite3_bind_parameter_index(statement, ":name")
let emailIndex = sqlite3_bind_parameter_index(statement, ":email")
let ageIndex = sqlite3_bind_parameter_index(statement, ":age")

sqlite3_bind_text(statement, nameIndex, "Alice", -1, SQLITE_TRANSIENT)
sqlite3_bind_text(statement, emailIndex, "alice@example.com", -1, SQLITE_TRANSIENT)
sqlite3_bind_int64(statement, ageIndex, 30)

sqlite3_bind_parameter_index looks up the numeric index of a named parameter. Names start with :, @, or $ (:name is most common). Named parameters trade a little verbosity for clarity and resilience to reordering — valuable in long INSERTs and complex WHERE clauses. Positional ? is fine for one or two parameters.

Verifying you bound everything

A statement with three placeholders needs three binds. If you forget one, that parameter stays NULL (its default), which is usually a silent bug rather than a loud error. sqlite3_bind_parameter_count tells you how many parameters a statement has:

let expected = sqlite3_bind_parameter_count(statement)
// after binding, you can assert you bound `expected` parameters

In a wrapper layer you can use this to validate that the number of values supplied matches the number of placeholders, turning a silent NULL bug into an early, explicit error. We’ll do exactly this in Section 11.

Bind return values

Bind functions return a result code. SQLITE_OK means the bind succeeded. They rarely fail — the main failure is SQLITE_RANGE (an out-of-bounds parameter index) or SQLITE_NOMEM (out of memory). Many examples ignore bind return values for brevity, but in a robust layer you check them, because a SQLITE_RANGE means your index is wrong and the value didn’t go where you think:

guard sqlite3_bind_int64(statement, 1, id) == SQLITE_OK else {
    throw DatabaseError.bindFailed(String(cString: sqlite3_errmsg(db)))
}

A worked example: parameterized insert

Tying it together, here’s a safe insert with mixed types and an optional:

func insertUser(name: String, email: String?, age: Int, db: OpaquePointer?) throws -> Int64 {
    let sql = "INSERT INTO users (name, email, age) VALUES (?, ?, ?);"
    var statement: OpaquePointer?
    guard sqlite3_prepare_v2(db, sql, -1, &statement, nil) == SQLITE_OK else {
        throw DatabaseError.prepareFailed(String(cString: sqlite3_errmsg(db)))
    }
    defer { sqlite3_finalize(statement) }

    sqlite3_bind_text(statement, 1, name, -1, SQLITE_TRANSIENT)
    if let email = email {
        sqlite3_bind_text(statement, 2, email, -1, SQLITE_TRANSIENT)
    } else {
        sqlite3_bind_null(statement, 2)
    }
    sqlite3_bind_int64(statement, 3, Int64(age))

    guard sqlite3_step(statement) == SQLITE_DONE else {
        throw DatabaseError.stepFailed(String(cString: sqlite3_errmsg(db)))
    }
    return sqlite3_last_insert_rowid(db)
}

Every value enters through a bind. No interpolation. Text uses SQLITE_TRANSIENT. The optional email branches to bind_null. The function returns the generated id. This is the shape of safe, correct SQLite writes.

Binding pitfalls

SQLITE_STATIC (or omitting the destructor) for a Swift string. Nondeterministic corruption — the value is sometimes right, sometimes garbage. Always SQLITE_TRANSIENT for bridged Swift values.

Forgetting to define SQLITE_TRANSIENT. It doesn’t exist in Swift out of the box. Define it with unsafeBitCast(-1, ...) once per project.

Binding to index 0. Bind parameters are 1-based. Index 0 is invalid and the value goes nowhere.

Forgetting a parameter. It stays NULL silently. Validate bind count against placeholder count in your wrapper.

Using sqlite3_bind_int for large values. 32-bit overflow. Use sqlite3_bind_int64 for Swift Int.

Holding the blob pointer past the closure with SQLITE_STATIC. The withUnsafeBytes buffer is gone after the closure. Use SQLITE_TRANSIENT so SQLite copies it inside the closure.

What to internalize

There’s a bind function per type: int/int64, double, text, blob, null. Prefer int64 for Swift Int. For text and blob, the destructor argument is critical: use SQLITE_TRANSIENT (which you define yourself via unsafeBitCast(-1, ...)) so SQLite copies your transient Swift buffer before it disappears. Bind blobs through withUnsafeBytes. Map Swift optionals to sqlite3_bind_null. Named parameters (:name, looked up with sqlite3_bind_parameter_index) make long statements self-documenting. Validate bind count against placeholder count to catch missing binds. Every runtime value enters through a bind, never through string interpolation.


7. Reading Results: Column Access and Types

After sqlite3_step returns SQLITE_ROW, the current row’s columns are available through the sqlite3_column_* family. Reading them correctly — right function for the type, right index, right NULL handling, right string conversion timing — is where a lot of subtle bugs live. This section covers column access exhaustively.

The column functions by type

Mirroring the bind functions, there’s a column accessor per type. The arguments are the statement and the 0-based column index (remember: columns are 0-based, unlike binds):

let intValue    = sqlite3_column_int(statement, 0)     // Int32
let int64Value  = sqlite3_column_int64(statement, 0)   // Int64 — prefer this
let doubleValue = sqlite3_column_double(statement, 0)   // Double
let textPtr     = sqlite3_column_text(statement, 0)     // UnsafePointer<UInt8>?
let blobPtr     = sqlite3_column_blob(statement, 0)     // UnsafeRawPointer?
let byteCount   = sqlite3_column_bytes(statement, 0)    // Int32 — size of text/blob

sqlite3_column_int64 is the right integer accessor for the same reason sqlite3_bind_int64 is the right binder: Swift Int is 64-bit, and the 32-bit accessor truncates large values.

Converting text columns to Swift strings

sqlite3_column_text returns UnsafePointer<UInt8>? — a pointer to UTF-8 bytes, or nil if the value is NULL. Convert immediately:

if let textPtr = sqlite3_column_text(statement, 0) {
    let string = String(cString: textPtr)
} else {
    // the column was NULL
}

The if let does double duty: it both unwraps the optional pointer and distinguishes NULL (pointer is nil) from a present string. This is the standard way to read a nullable text column. For a column you know is NOT NULL, you can force it, but branching on the optional is safer and handles NULL explicitly.

Why convert immediately? The pointer is owned by SQLite and valid only until the next sqlite3_step, sqlite3_reset, or sqlite3_finalize on this statement. String(cString:) copies the bytes into a Swift-owned String right now, so the copy survives. Storing the raw pointer and reading it after the next step is a use-after-free.

Reading blob columns into Data

Blobs need both the pointer and the byte count:

func readBlob(_ statement: OpaquePointer?, _ index: Int32) -> Data? {
    guard let pointer = sqlite3_column_blob(statement, index) else {
        return nil   // NULL
    }
    let count = Int(sqlite3_column_bytes(statement, index))
    return Data(bytes: pointer, count: count)
}

sqlite3_column_blob gives the pointer, sqlite3_column_bytes gives the length, and Data(bytes:count:) copies them into a Swift-owned Data. As with text, the copy must happen now — the pointer dies on the next step. A NULL blob returns a nil pointer, which we map to nil Data.

Note the order: call sqlite3_column_blob first, then sqlite3_column_bytes. SQLite’s documentation specifies that for blobs you should fetch the pointer before the length to get a consistent answer, because some type conversions can change the reported length.

Detecting NULL explicitly with sqlite3_column_type

For text and blob, a nil pointer signals NULL. But for integer and double columns, there’s no “nil” — sqlite3_column_int64 returns 0 for a NULL column, indistinguishable from a stored 0. When you need to tell NULL apart from zero, ask the column’s type directly:

let columnType = sqlite3_column_type(statement, 0)
switch columnType {
case SQLITE_INTEGER: let v = sqlite3_column_int64(statement, 0)
case SQLITE_FLOAT:   let v = sqlite3_column_double(statement, 0)
case SQLITE_TEXT:    let v = String(cString: sqlite3_column_text(statement, 0))
case SQLITE_BLOB:    let v = readBlob(statement, 0)
case SQLITE_NULL:    break   // the value is NULL
default: break
}

sqlite3_column_type returns one of SQLITE_INTEGER, SQLITE_FLOAT, SQLITE_TEXT, SQLITE_BLOB, SQLITE_NULL. This is the authoritative way to detect NULL for numeric columns and the basis for reading columns whose type you don’t know at compile time (a generic row reader, which we’ll build in Section 12).

A practical pattern for a nullable integer:

func readOptionalInt64(_ statement: OpaquePointer?, _ index: Int32) -> Int64? {
    if sqlite3_column_type(statement, index) == SQLITE_NULL {
        return nil
    }
    return sqlite3_column_int64(statement, index)
}

Check for NULL first, then read. Without the type check, you can’t distinguish a stored 0 from NULL.

Column metadata: names and count

You can introspect the result set’s shape at runtime:

let columnCount = sqlite3_column_count(statement)
for i in 0..<columnCount {
    let name = String(cString: sqlite3_column_name(statement, i))
    print("Column \(i): \(name)")
}

sqlite3_column_count returns the number of columns in the result. sqlite3_column_name returns each column’s name. This metadata lets you build a generic reader that maps column names to values — the foundation of decoding rows into dictionaries or Codable types. Note that the count is only meaningful after a successful sqlite3_prepare_v2 (the column structure is known at compile time); the names likewise.

Type conversion happens automatically

SQLite is loosely typed (Section 19 covers type affinity in depth). If you call sqlite3_column_int64 on a column that holds text like "42", SQLite converts it to the integer 42. If you call sqlite3_column_text on an integer column, you get the string "42". This automatic coercion is occasionally convenient but more often a trap — it means a wrong-type read returns a converted value rather than an error, hiding bugs. The defense is to read each column with the function matching the type you actually stored, and to use sqlite3_column_type when you’re unsure.

A complete typed read

Bringing it together — reading a result set into Swift structs:

struct User {
    let id: Int64
    let name: String
    let email: String?
    let age: Int64?
}

func fetchAllUsers(db: OpaquePointer?) throws -> [User] {
    let sql = "SELECT id, name, email, age FROM users ORDER BY name;"
    var statement: OpaquePointer?
    guard sqlite3_prepare_v2(db, sql, -1, &statement, nil) == SQLITE_OK else {
        throw DatabaseError.prepareFailed(String(cString: sqlite3_errmsg(db)))
    }
    defer { sqlite3_finalize(statement) }

    var users: [User] = []
    while sqlite3_step(statement) == SQLITE_ROW {
        let id = sqlite3_column_int64(statement, 0)
        let name = String(cString: sqlite3_column_text(statement, 1))
        let email = sqlite3_column_text(statement, 2).map { String(cString: $0) }
        let age: Int64? = sqlite3_column_type(statement, 3) == SQLITE_NULL
            ? nil
            : sqlite3_column_int64(statement, 3)
        users.append(User(id: id, name: name, email: email, age: age))
    }
    return users
}

Each column is read with the matching accessor. The non-null id and name read directly. The nullable email uses .map over the optional text pointer. The nullable age checks sqlite3_column_type for NULL because a numeric NULL can’t be detected by the value alone. The whole thing accumulates into an array during the step loop, converting every value to Swift-owned storage before the next step invalidates the pointers.

Reading column pitfalls

Reading columns with 1-based indices. Columns are 0-based. The first column is index 0. (Binds are 1-based — the asymmetry is the trap.)

Storing a column pointer and reading it after the next step. Use-after-free. Convert to String/Data immediately during the step iteration.

Treating a numeric 0 as NULL or vice versa. sqlite3_column_int64 returns 0 for NULL. Use sqlite3_column_type == SQLITE_NULL to distinguish.

Calling sqlite3_column_bytes before sqlite3_column_blob. Can yield an inconsistent length. Fetch the blob pointer first.

Relying on automatic type conversion. Reading an integer column as text “works” but masks schema confusion. Read with the right accessor for the stored type.

Forcing String(cString:) on a possibly-NULL text column. Crashes on the nil pointer. Branch on the optional.

What to internalize

Read columns with the accessor matching the stored type (int64, double, text, blob), using 0-based indices. Convert text to String and blobs to Data immediately during the step loop, because SQLite’s pointers die on the next step. For text and blob, a nil pointer means NULL; for numeric columns, use sqlite3_column_type == SQLITE_NULL since the value alone can’t reveal NULL. sqlite3_column_count and sqlite3_column_name expose the result shape for generic readers. Be wary of SQLite’s automatic type coercion — read each column as the type you stored.


8. CRUD Operations in Practice

You now have every primitive: open, prepare, bind, step, read, finalize. This section assembles them into the four operations every persistence layer needs — Create, Read, Update, Delete — written out as complete, idiomatic functions. Think of it as a reference for the shapes you’ll reuse constantly, with the reasoning behind each.

A consistent table to work with

Throughout, assume this schema (created once via sqlite3_exec, per Section 4):

let schema = """
CREATE TABLE IF NOT EXISTS tasks (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    notes TEXT,
    is_done INTEGER NOT NULL DEFAULT 0,
    priority INTEGER NOT NULL DEFAULT 0,
    created_at INTEGER NOT NULL,
    due_date INTEGER
);
"""

id is the auto-incrementing rowid. title is required; notes and due_date are optional. is_done is a boolean stored as 0/1. created_at and due_date are Unix timestamps stored as integers (Section 18 covers date storage). A Swift model to match:

struct Task {
    var id: Int64
    var title: String
    var notes: String?
    var isDone: Bool
    var priority: Int
    var createdAt: Date
    var dueDate: Date?
}

Create (INSERT)

func createTask(title: String, notes: String?, priority: Int, dueDate: Date?, db: OpaquePointer?) throws -> Int64 {
    let sql = """
    INSERT INTO tasks (title, notes, is_done, priority, created_at, due_date)
    VALUES (?, ?, 0, ?, ?, ?);
    """
    var statement: OpaquePointer?
    guard sqlite3_prepare_v2(db, sql, -1, &statement, nil) == SQLITE_OK else {
        throw DatabaseError.prepareFailed(String(cString: sqlite3_errmsg(db)))
    }
    defer { sqlite3_finalize(statement) }

    sqlite3_bind_text(statement, 1, title, -1, SQLITE_TRANSIENT)
    if let notes { sqlite3_bind_text(statement, 2, notes, -1, SQLITE_TRANSIENT) }
    else { sqlite3_bind_null(statement, 2) }
    sqlite3_bind_int64(statement, 3, Int64(priority))
    sqlite3_bind_int64(statement, 4, Int64(Date().timeIntervalSince1970))
    if let dueDate { sqlite3_bind_int64(statement, 5, Int64(dueDate.timeIntervalSince1970)) }
    else { sqlite3_bind_null(statement, 5) }

    guard sqlite3_step(statement) == SQLITE_DONE else {
        throw DatabaseError.stepFailed(String(cString: sqlite3_errmsg(db)))
    }
    return sqlite3_last_insert_rowid(db)
}

Points worth noting: is_done is hardcoded to 0 in the SQL since a new task isn’t done — not every column needs a placeholder. Timestamps are converted from Date to integer seconds via timeIntervalSince1970. The optional notes and dueDate branch to bind_null. The function returns the new id from sqlite3_last_insert_rowid. This is the canonical insert shape.

Read (SELECT) — single row

Fetching by primary key returns zero or one row:

func fetchTask(id: Int64, db: OpaquePointer?) throws -> Task? {
    let sql = "SELECT id, title, notes, is_done, priority, created_at, due_date FROM tasks WHERE id = ?;"
    var statement: OpaquePointer?
    guard sqlite3_prepare_v2(db, sql, -1, &statement, nil) == SQLITE_OK else {
        throw DatabaseError.prepareFailed(String(cString: sqlite3_errmsg(db)))
    }
    defer { sqlite3_finalize(statement) }

    sqlite3_bind_int64(statement, 1, id)

    guard sqlite3_step(statement) == SQLITE_ROW else {
        return nil   // SQLITE_DONE means no row matched
    }
    return readTaskRow(statement)
}

// A reusable row reader — we'll factor this out so every read uses it
func readTaskRow(_ statement: OpaquePointer?) -> Task {
    let id = sqlite3_column_int64(statement, 0)
    let title = String(cString: sqlite3_column_text(statement, 1))
    let notes = sqlite3_column_text(statement, 2).map { String(cString: $0) }
    let isDone = sqlite3_column_int64(statement, 3) != 0
    let priority = Int(sqlite3_column_int64(statement, 4))
    let createdAt = Date(timeIntervalSince1970: TimeInterval(sqlite3_column_int64(statement, 5)))
    let dueDate: Date? = sqlite3_column_type(statement, 6) == SQLITE_NULL
        ? nil
        : Date(timeIntervalSince1970: TimeInterval(sqlite3_column_int64(statement, 6)))
    return Task(id: id, title: title, notes: notes, isDone: isDone,
                priority: priority, createdAt: createdAt, dueDate: dueDate)
}

Factoring the column-reading into readTaskRow is a deliberate move: single-row and multi-row reads share the exact same column mapping, so writing it once eliminates a whole class of “the list query reads columns differently from the detail query” bugs. The boolean reconstitutes from the integer (!= 0), and the dates convert back from integer seconds. Note the column order in readTaskRow must match the SELECT column order — keep them lockstep.

Read (SELECT) — many rows

func fetchTasks(includeDone: Bool, db: OpaquePointer?) throws -> [Task] {
    let sql = includeDone
        ? "SELECT id, title, notes, is_done, priority, created_at, due_date FROM tasks ORDER BY priority DESC, created_at;"
        : "SELECT id, title, notes, is_done, priority, created_at, due_date FROM tasks WHERE is_done = 0 ORDER BY priority DESC, created_at;"

    var statement: OpaquePointer?
    guard sqlite3_prepare_v2(db, sql, -1, &statement, nil) == SQLITE_OK else {
        throw DatabaseError.prepareFailed(String(cString: sqlite3_errmsg(db)))
    }
    defer { sqlite3_finalize(statement) }

    var tasks: [Task] = []
    while sqlite3_step(statement) == SQLITE_ROW {
        tasks.append(readTaskRow(statement))
    }
    return tasks
}

The while ... == SQLITE_ROW loop, the shared readTaskRow, accumulation into an array. Choosing between the two SQL strings based on includeDone is a simple form of dynamic query building — fine for a binary choice. (For filters with runtime values, you’d keep one SQL string with placeholders and bind, never interpolate.)

Update (UPDATE)

func updateTask(_ task: Task, db: OpaquePointer?) throws {
    let sql = """
    UPDATE tasks
    SET title = ?, notes = ?, is_done = ?, priority = ?, due_date = ?
    WHERE id = ?;
    """
    var statement: OpaquePointer?
    guard sqlite3_prepare_v2(db, sql, -1, &statement, nil) == SQLITE_OK else {
        throw DatabaseError.prepareFailed(String(cString: sqlite3_errmsg(db)))
    }
    defer { sqlite3_finalize(statement) }

    sqlite3_bind_text(statement, 1, task.title, -1, SQLITE_TRANSIENT)
    if let notes = task.notes { sqlite3_bind_text(statement, 2, notes, -1, SQLITE_TRANSIENT) }
    else { sqlite3_bind_null(statement, 2) }
    sqlite3_bind_int64(statement, 3, task.isDone ? 1 : 0)
    sqlite3_bind_int64(statement, 4, Int64(task.priority))
    if let dueDate = task.dueDate { sqlite3_bind_int64(statement, 5, Int64(dueDate.timeIntervalSince1970)) }
    else { sqlite3_bind_null(statement, 5) }
    sqlite3_bind_int64(statement, 6, task.id)   // the WHERE clause id — last placeholder

    guard sqlite3_step(statement) == SQLITE_DONE else {
        throw DatabaseError.stepFailed(String(cString: sqlite3_errmsg(db)))
    }

    if sqlite3_changes(db) == 0 {
        throw DatabaseError.notFound   // no row had that id
    }
}

A few important details. The WHERE id = ? placeholder is the last parameter (index 6), after all the SET values — placeholder order follows their position in the SQL text, not their logical importance. The boolean binds as isDone ? 1 : 0. And critically, after the step we check sqlite3_changes(db): an UPDATE that matches no rows still returns SQLITE_DONE (it succeeded — it just updated nothing). If you expect the row to exist, checking the change count turns a silent no-op into an explicit notFound error.

Delete (DELETE)

func deleteTask(id: Int64, db: OpaquePointer?) throws {
    let sql = "DELETE FROM tasks WHERE id = ?;"
    var statement: OpaquePointer?
    guard sqlite3_prepare_v2(db, sql, -1, &statement, nil) == SQLITE_OK else {
        throw DatabaseError.prepareFailed(String(cString: sqlite3_errmsg(db)))
    }
    defer { sqlite3_finalize(statement) }

    sqlite3_bind_int64(statement, 1, id)

    guard sqlite3_step(statement) == SQLITE_DONE else {
        throw DatabaseError.stepFailed(String(cString: sqlite3_errmsg(db)))
    }
    // Optionally check sqlite3_changes(db) to confirm something was deleted
}

The simplest of the four. Bind the id, step once, expect SQLITE_DONE. As with update, sqlite3_changes tells you whether a row actually went away if you care to know.

Upsert (INSERT … ON CONFLICT)

A common need is “insert, or update if it already exists.” SQLite’s ON CONFLICT clause (an “upsert”) does this in one statement, provided there’s a unique constraint to conflict on:

let sql = """
INSERT INTO tasks (id, title, notes, is_done, priority, created_at, due_date)
VALUES (?, ?, ?, ?, ?, ?, ?)
ON CONFLICT(id) DO UPDATE SET
    title = excluded.title,
    notes = excluded.notes,
    is_done = excluded.is_done,
    priority = excluded.priority,
    due_date = excluded.due_date;
"""

ON CONFLICT(id) DO UPDATE means “if a row with this id already exists, update it instead of failing.” The special excluded table refers to the values you tried to insert. This is more efficient than a fetch-then-insert-or-update round trip and is atomic. Upserts are invaluable for sync-style code where you’re reconciling local data with a server.

CRUD pitfalls

Mismatched column order between SELECT and the row reader. The reader reads index 0, 1, 2… in the order the SELECT lists columns. Reorder the SELECT and the reader silently reads the wrong fields. Keep them in lockstep; share one reader.

Forgetting the WHERE clause placeholder is the last bind index. In an UPDATE ... WHERE id = ?, the id is bound after all the SET values. Count carefully.

Not checking sqlite3_changes on update/delete. A no-match UPDATE/DELETE succeeds silently. Check the change count when you expect a row to exist.

Interpolating filter values for dynamic queries. Switching SQL strings for a boolean flag is fine; splicing runtime values into SQL is injection. Bind values; only vary the static structure.

Re-preparing the same CRUD statement in a tight loop. Wasteful. For bulk operations, prepare once and reset (Section 24).

What to internalize

CRUD is the four primitives composed into whole operations, all following the prepare/bind/step/finalize rhythm with defer-based finalize. Factor column reading into one shared row reader so single-row and multi-row queries can’t drift apart. Convert between Swift types and SQLite storage at the boundary — Bool as 0/1, Date as integer seconds. The WHERE placeholder comes last in bind order. Check sqlite3_changes after UPDATE/DELETE when you expect a row to exist, since a no-match still “succeeds.” Use INSERT ... ON CONFLICT ... DO UPDATE for atomic upserts. Vary static SQL structure freely; route every runtime value through a bind.


9. Error Handling and Result Codes

SQLite communicates almost everything through integer result codes. Turning those codes into something a Swift program can reason about — typed errors, meaningful messages, sensible recovery — is the difference between a layer you can debug and one that fails mysteriously. This section covers the result-code system, extended codes, and how to build clean Swift error handling on top.

The result code system

Every SQLite function that can fail returns an Int32. The primary codes you’ll encounter:

SQLITE_OK         // 0   success
SQLITE_ERROR      // 1   generic error (often a SQL logic/syntax error)
SQLITE_BUSY       // 5   the database file is locked
SQLITE_LOCKED     // 6   a table in the database is locked
SQLITE_NOMEM      // 7   out of memory
SQLITE_READONLY   // 8   attempt to write a readonly database
SQLITE_CONSTRAINT // 19  a constraint violation (UNIQUE, NOT NULL, FK, CHECK)
SQLITE_MISMATCH   // 20  data type mismatch
SQLITE_MISUSE     // 21  library used incorrectly (a programming error)
SQLITE_RANGE      // 25  bind/column index out of range
SQLITE_ROW        // 100 sqlite3_step has a row ready
SQLITE_DONE       // 101 sqlite3_step has finished

SQLITE_OK, SQLITE_ROW, and SQLITE_DONE are the success codes. Everything else (in the typical range) indicates a problem. Two of these deserve special attention: SQLITE_BUSY is recoverable (retry or wait — Section 22), and SQLITE_MISUSE means your code did something illegal (used a finalized statement, called functions out of order), which is a bug to fix, not an error to handle gracefully.

Getting the error message

A bare code like 19 isn’t actionable. sqlite3_errmsg returns a human-readable description of the most recent error on a connection:

let message = String(cString: sqlite3_errmsg(db))
// e.g. "UNIQUE constraint failed: users.email"

This message is often remarkably specific — it’ll name the exact constraint and column that failed. Always capture it when handling an error; it’s the single most useful piece of diagnostic information SQLite gives you. There’s also sqlite3_errstr(code), which translates a code into a generic string (e.g. "constraint failed") without needing the connection, useful when you don’t have the db handle handy.

Extended result codes

The primary codes are coarse. SQLITE_CONSTRAINT tells you a constraint failed but not which kind. SQLite has extended result codes that refine the primary ones:

SQLITE_CONSTRAINT_UNIQUE     // a UNIQUE (or PRIMARY KEY) constraint
SQLITE_CONSTRAINT_NOTNULL    // a NOT NULL constraint
SQLITE_CONSTRAINT_FOREIGNKEY // a foreign key constraint
SQLITE_CONSTRAINT_CHECK      // a CHECK constraint

To receive extended codes, either call sqlite3_extended_result_codes(db, 1) once after opening (which makes step/prepare return extended codes), or fetch the extended code on demand:

let extended = sqlite3_extended_errcode(db)
if extended == SQLITE_CONSTRAINT_UNIQUE {
    // specifically a uniqueness violation — e.g. duplicate email
}

Extended codes let you respond precisely: a SQLITE_CONSTRAINT_UNIQUE on a user signup means “that email is taken” (a normal, user-facing condition), while a SQLITE_CONSTRAINT_FOREIGNKEY means “you referenced a row that doesn’t exist” (usually a logic bug). Treating them differently makes for a much better app.

Designing a Swift error type

Raw codes don’t belong in your app’s upper layers. Wrap them in a Swift Error enum that carries the code, the message, and ideally a category:

enum SQLiteError: Error {
    case open(message: String)
    case prepare(message: String, sql: String)
    case step(message: String, code: Int32)
    case bind(message: String)
    case constraint(ConstraintKind, message: String)
    case notFound
    case misuse(message: String)

    enum ConstraintKind {
        case unique, notNull, foreignKey, check, other
    }
}

Then a single helper that builds the right case from a connection and code, so error construction lives in one place:

extension SQLiteError {
    static func from(db: OpaquePointer?, code: Int32, sql: String = "") -> SQLiteError {
        let message = db.map { String(cString: sqlite3_errmsg($0)) } ?? sqlite3_errstr(code).map { String(cString: $0) } ?? "unknown"
        let extended = db.map { sqlite3_extended_errcode($0) } ?? code

        switch code {
        case SQLITE_CONSTRAINT:
            let kind: ConstraintKind
            switch extended {
            case SQLITE_CONSTRAINT_UNIQUE, SQLITE_CONSTRAINT_PRIMARYKEY: kind = .unique
            case SQLITE_CONSTRAINT_NOTNULL: kind = .notNull
            case SQLITE_CONSTRAINT_FOREIGNKEY: kind = .foreignKey
            case SQLITE_CONSTRAINT_CHECK: kind = .check
            default: kind = .other
            }
            return .constraint(kind, message: message)
        case SQLITE_MISUSE:
            return .misuse(message: message)
        default:
            return .step(message: message, code: code)
        }
    }
}

Now your call sites throw rich, typed errors:

let rc = sqlite3_step(statement)
guard rc == SQLITE_DONE else {
    throw SQLiteError.from(db: db, code: rc, sql: sql)
}

And callers can pattern-match meaningfully:

do {
    _ = try createUser(email: email, db: db)
} catch SQLiteError.constraint(.unique, _) {
    showAlert("That email is already registered.")
} catch {
    showAlert("Something went wrong. Please try again.")
}

This is the payoff: the messy integer-code world is confined to one conversion function, and the rest of your app handles clean, expressive Swift errors.

Where errors surface in the lifecycle

Knowing where in the prepare/bind/step/finalize lifecycle an error appears helps you place your checks:

  • Prepare surfaces SQL syntax errors and references to nonexistent tables/columns. A non-SQLITE_OK from sqlite3_prepare_v2 means your SQL is malformed.
  • Bind rarely fails; SQLITE_RANGE means a bad parameter index.
  • Step surfaces runtime errors: constraint violations, SQLITE_BUSY from locking, type mismatches, I/O errors. Most “real” database errors appear here.
  • Finalize returns the error code of the last step (like reset does). A non-OK finalize usually just echoes an earlier failure rather than being a new one.

In practice you check prepare and step rigorously, check bind in robust code, and treat finalize’s return as informational.

The misuse code is a bug signal

SQLITE_MISUSE (21) is categorically different from other errors. It doesn’t mean “the data was bad” — it means “you called the library incorrectly.” Common causes: stepping a finalized statement, binding to a statement that’s mid-execution, using a connection from multiple threads in single-thread mode, finalizing twice. When you see SQLITE_MISUSE, don’t write recovery code — find and fix the lifecycle bug. It’s the SQLite equivalent of an assertion failure.

Logging for diagnosis

During development, a global error log hook surfaces problems you might otherwise miss:

// Called once, early in app startup
func installSQLiteLogging() {
    sqlite3_config(SQLITE_CONFIG_LOG, { (context, code, messagePtr) in
        let message = messagePtr.map { String(cString: $0) } ?? ""
        print("SQLite [\(code)]: \(message)")
    } as (@convention(c) (UnsafeMutableRawPointer?, Int32, UnsafePointer<CChar>?) -> Void), nil)
}

SQLITE_CONFIG_LOG registers a callback SQLite invokes for warnings and errors, including some that don’t surface through normal return codes (like a query that ran without an index when it could have used one). The @convention(c) annotation is required because SQLite needs a C function pointer. Install it once at startup during development; it’s a cheap source of insight.

Error handling pitfalls

Checking codes against SQLITE_OK after sqlite3_step. Step never returns SQLITE_OK on success — it returns SQLITE_ROW or SQLITE_DONE. Comparing to SQLITE_OK after step is always a logic error.

Discarding sqlite3_errmsg. The message names the exact problem. Capture it before the next operation overwrites it.

Treating SQLITE_BUSY as fatal. It’s recoverable — set a busy timeout (Section 22) or retry. Don’t crash on it.

Treating SQLITE_MISUSE as a runtime error to handle. It’s a programming bug. Fix the lifecycle violation instead of catching it.

Ignoring extended codes. Without them, every constraint failure looks the same. Enable them to distinguish “duplicate email” from “dangling foreign key.”

Reading the error from the wrong connection. sqlite3_errmsg is per-connection and reports the latest error. In multi-connection code, read it from the connection that failed, immediately.

What to internalize

SQLite reports through Int32 result codes; success is SQLITE_OK/SQLITE_ROW/SQLITE_DONE, and sqlite3_errmsg gives a precise human-readable message. Enable extended result codes to distinguish kinds of constraint failures, which lets you handle “email taken” differently from a logic bug. Confine the raw-code world to one conversion helper that produces a typed Swift Error enum, so the rest of your app pattern-matches clean cases. Errors surface mostly at prepare (syntax) and step (runtime/constraints); SQLITE_MISUSE signals a bug in your code, not bad data. Install SQLITE_CONFIG_LOG during development for extra diagnostics.


10. Transactions: BEGIN, COMMIT, ROLLBACK

A transaction groups multiple statements into a single atomic unit: either all of them take effect or none do. Transactions are what make a database trustworthy — they protect you from half-completed operations, give you enormous performance wins on bulk writes, and define the isolation guarantees between concurrent readers and writers. This section covers how to use them correctly from Swift.

What a transaction guarantees

SQLite is fully ACID — Atomic, Consistent, Isolated, Durable. A transaction is the unit those guarantees apply to:

  • Atomic — all the statements in the transaction succeed together or the whole thing rolls back. A crash mid-transaction leaves the database as if the transaction never started.
  • Consistent — constraints (NOT NULL, UNIQUE, foreign keys, CHECK) hold at transaction boundaries.
  • Isolated — concurrent connections don’t see each other’s uncommitted changes.
  • Durable — once committed, the data survives a crash or power loss (subject to the synchronous pragma’s tradeoffs).

The practical upshot: if you have an operation that must be all-or-nothing — transferring a value from one row to another, inserting a parent and its children together, applying a batch of related edits — wrap it in a transaction.

The basic transaction

Transactions are controlled with three SQL statements you run via sqlite3_exec:

sqlite3_exec(db, "BEGIN TRANSACTION;", nil, nil, nil)
// ... do several inserts/updates/deletes ...
sqlite3_exec(db, "COMMIT;", nil, nil, nil)

BEGIN opens the transaction; every statement after it is buffered. COMMIT makes all the buffered changes permanent in one atomic operation. If something goes wrong before commit, you ROLLBACK instead:

sqlite3_exec(db, "ROLLBACK;", nil, nil, nil)   // discard everything since BEGIN

Wrapping a transaction in Swift safely

The danger with manual BEGIN/COMMIT is forgetting to commit or rollback on an error path — leaving a transaction open, which locks the database. The Swift solution is a higher-order function that guarantees correct cleanup:

func transaction<T>(_ db: OpaquePointer?, _ work: () throws -> T) throws -> T {
    guard sqlite3_exec(db, "BEGIN TRANSACTION;", nil, nil, nil) == SQLITE_OK else {
        throw SQLiteError.from(db: db, code: sqlite3_errcode(db))
    }
    do {
        let result = try work()
        guard sqlite3_exec(db, "COMMIT;", nil, nil, nil) == SQLITE_OK else {
            throw SQLiteError.from(db: db, code: sqlite3_errcode(db))
        }
        return result
    } catch {
        sqlite3_exec(db, "ROLLBACK;", nil, nil, nil)   // best-effort rollback on any failure
        throw error
    }
}

Now atomic operations are clean and exception-safe:

try transaction(db) {
    let parentId = try createTask(title: "Project", notes: nil, priority: 1, dueDate: nil, db: db)
    try createSubtask(parentId: parentId, title: "Step 1", db: db)
    try createSubtask(parentId: parentId, title: "Step 2", db: db)
}

If any of the three inserts throws, the catch rolls back and re-throws, so the parent task and any subtasks created before the failure are all undone. If all succeed, the COMMIT makes them permanent together. The closure-based design means you can never forget to commit or rollback — the function does it for you on every path. This pattern is the single most useful piece of transaction infrastructure you’ll write.

Transactions are a massive performance win for bulk writes

Here’s something that surprises people: by default, every individual INSERT/UPDATE/DELETE is its own implicit transaction. SQLite wraps each lone statement in a transaction and commits it immediately, which means flushing to disk. Flushing to disk is slow. So inserting 10,000 rows as 10,000 implicit transactions can take seconds — thousands of disk syncs.

Wrap those 10,000 inserts in one explicit transaction and they share a single commit — one disk sync instead of thousands:

try transaction(db) {
    var stmt: OpaquePointer?
    sqlite3_prepare_v2(db, "INSERT INTO tasks (title, is_done, priority, created_at) VALUES (?, 0, 0, ?);", -1, &stmt, nil)
    defer { sqlite3_finalize(stmt) }
    let now = Int64(Date().timeIntervalSince1970)
    for title in tenThousandTitles {
        sqlite3_reset(stmt)
        sqlite3_bind_text(stmt, 1, title, -1, SQLITE_TRANSIENT)
        sqlite3_bind_int64(stmt, 2, now)
        guard sqlite3_step(stmt) == SQLITE_DONE else {
            throw SQLiteError.from(db: db, code: sqlite3_errcode(db))
        }
    }
}

This combination — one prepared statement reused with reset inside one transaction — is the canonical fast bulk insert, and it’s commonly hundreds of times faster than the naive loop. We’ll return to this in Section 24, but the transaction is the bigger half of the speedup. Any time you’re writing many rows, wrap them in a transaction.

Savepoints: nested transactions

SQLite doesn’t truly nest BEGIN/COMMIT, but it offers savepoints, which are named, nestable transaction markers:

sqlite3_exec(db, "SAVEPOINT outer;", nil, nil, nil)
// ... work ...
sqlite3_exec(db, "SAVEPOINT inner;", nil, nil, nil)
// ... more work ...
sqlite3_exec(db, "RELEASE inner;", nil, nil, nil)        // commit the inner savepoint
// ... 
sqlite3_exec(db, "ROLLBACK TO outer;", nil, nil, nil)    // undo back to the outer savepoint
sqlite3_exec(db, "RELEASE outer;", nil, nil, nil)

SAVEPOINT name creates a marker; RELEASE name commits everything since that marker (or merges it into the enclosing transaction); ROLLBACK TO name undoes back to the marker without ending the transaction. Savepoints let you build composable transactional functions — an inner function can use a savepoint so it’s atomic whether or not it’s called inside an outer transaction. For most app code, plain transactions suffice, but savepoints are the tool when you need partial rollback or composability.

Transaction types: DEFERRED, IMMEDIATE, EXCLUSIVE

BEGIN has three flavors that control when locks are acquired:

  • BEGIN DEFERRED (the default) — acquires no lock until the first read or write. The transaction starts “lazily.”
  • BEGIN IMMEDIATE — acquires a write lock immediately, before any statement runs.
  • BEGIN EXCLUSIVE — acquires an exclusive lock immediately (relevant mainly outside WAL mode).

The distinction matters for concurrency. With the default DEFERRED, a transaction that starts by reading and later tries to write can fail with SQLITE_BUSY if another connection grabbed the write lock in between — and at that point you can’t simply retry because you might have read stale data. BEGIN IMMEDIATE avoids this by taking the write lock up front: if it can’t, it fails immediately and cleanly before you’ve done any work, so retrying is safe. For any transaction that will write, prefer BEGIN IMMEDIATE in multi-connection setups. We’ll revisit this under concurrency (Sections 20-22); for now, know the option exists and why it matters.

func writeTransaction<T>(_ db: OpaquePointer?, _ work: () throws -> T) throws -> T {
    guard sqlite3_exec(db, "BEGIN IMMEDIATE;", nil, nil, nil) == SQLITE_OK else {
        throw SQLiteError.from(db: db, code: sqlite3_errcode(db))
    }
    // ... same commit/rollback structure as before
    do {
        let result = try work()
        sqlite3_exec(db, "COMMIT;", nil, nil, nil)
        return result
    } catch {
        sqlite3_exec(db, "ROLLBACK;", nil, nil, nil)
        throw error
    }
}

Checking whether you’re in a transaction

sqlite3_get_autocommit(db) returns nonzero when the connection is in autocommit mode (i.e., not inside an explicit transaction) and zero when a transaction is open:

let inTransaction = sqlite3_get_autocommit(db) == 0

This is occasionally useful for defensive checks — e.g., asserting that a function that must run inside a transaction actually is, or deciding whether to open one.

Transaction pitfalls

Forgetting to commit or rollback. Leaves the transaction open, holding locks, blocking other writes. Use the closure-based transaction wrapper so cleanup is automatic.

Doing slow non-database work inside a transaction. A transaction holds locks for its entire duration. Network calls, image processing, or user prompts inside a transaction can block other connections for a long time. Keep transactions short — gather data first, then open the transaction only around the writes.

Relying on implicit per-statement transactions for bulk writes. Thousands of implicit commits mean thousands of disk syncs. Wrap bulk writes in one explicit transaction.

Using DEFERRED for write transactions in multi-connection code. Risks a mid-transaction SQLITE_BUSY that you can’t safely retry. Use BEGIN IMMEDIATE for writers.

Nesting plain BEGIN/COMMIT. SQLite errors on a BEGIN inside a transaction. Use savepoints for nesting.

Catching the error but not rolling back. An open transaction after an error poisons subsequent work. Always roll back on the failure path (the wrapper does this for you).

What to internalize

A transaction makes a group of statements atomic: all commit or all roll back. Wrap them in a closure-based transaction(_:_:) helper so COMMIT/ROLLBACK happen automatically on every exit path, including thrown errors. Transactions are also the biggest performance lever for bulk writes — one transaction means one disk sync instead of thousands. Use BEGIN IMMEDIATE for transactions that will write in multi-connection setups, to avoid un-retryable mid-transaction SQLITE_BUSY. Keep transactions short and free of slow non-database work, since they hold locks. Use savepoints when you need nesting or partial rollback.


11. Building a Swift Wrapper Layer, Part 1: Connection and Statement

You’ve seen that every raw operation repeats the same boilerplate: prepare, check, bind, step, check, finalize. Writing that by hand at every call site is tedious and error-prone. In this section and the next, we build a small, principled Swift wrapper that hides the boilerplate while keeping full control — the kind of layer you’d actually ship if you chose raw SQLite. Part 1 covers the connection and statement abstractions; Part 2 adds row decoding and a query API.

Design goals

A good wrapper should: own resource lifetimes (no manual finalize/close), surface errors as typed Swift Errors, make binding ergonomic and type-safe, prevent the common mistakes (injection, leaks, lifetime bugs), and stay thin enough that you can always see what SQL runs. We’re not rebuilding GRDB — we’re building the minimum that makes raw SQLite pleasant and safe.

A value type for bindable parameters

First, a Swift enum to represent any value we can bind. This lets us pass heterogeneous parameters as a single array and centralize the bind logic:

enum SQLiteValue {
    case integer(Int64)
    case real(Double)
    case text(String)
    case blob(Data)
    case null
}

// Conveniences so call sites can pass natural Swift types
extension SQLiteValue {
    init(_ value: Int)    { self = .integer(Int64(value)) }
    init(_ value: Int64)  { self = .integer(value) }
    init(_ value: Double) { self = .real(value) }
    init(_ value: Bool)   { self = .integer(value ? 1 : 0) }
    init(_ value: String) { self = .text(value) }
    init(_ value: Data)   { self = .blob(value) }
    init(_ value: Date)   { self = .integer(Int64(value.timeIntervalSince1970)) }

    init<T>(_ value: T?) where T == String { self = value.map { .text($0) } ?? .null }
}

This enum is the bridge between Swift’s type system and SQLite’s five storage classes. The convenience initializers mean call sites write SQLiteValue(task.title) rather than .text(task.title), and the optional initializer maps nil to .null.

The Statement class

The statement wrapper owns an OpaquePointer? and finalizes it on deinit. It exposes bind, step, and column-read methods:

final class Statement {
    private let handle: OpaquePointer?
    private unowned let connection: Connection

    init(connection: Connection, sql: String) throws {
        self.connection = connection
        var stmt: OpaquePointer?
        let rc = sqlite3_prepare_v2(connection.handle, sql, -1, &stmt, nil)
        guard rc == SQLITE_OK else {
            throw SQLiteError.from(db: connection.handle, code: rc, sql: sql)
        }
        self.handle = stmt
    }

    deinit {
        sqlite3_finalize(handle)   // lifetime is owned by the class — no manual finalize
    }

    // Bind an array of values positionally (1-based internally)
    func bind(_ values: [SQLiteValue]) throws {
        let expected = Int(sqlite3_bind_parameter_count(handle))
        guard values.count == expected else {
            throw SQLiteError.bind(message: "Expected \(expected) parameters, got \(values.count)")
        }
        for (offset, value) in values.enumerated() {
            let index = Int32(offset + 1)   // convert 0-based array to 1-based bind index
            try bindOne(value, at: index)
        }
    }

    private func bindOne(_ value: SQLiteValue, at index: Int32) throws {
        let rc: Int32
        switch value {
        case .integer(let i): rc = sqlite3_bind_int64(handle, index, i)
        case .real(let d):    rc = sqlite3_bind_double(handle, index, d)
        case .text(let s):    rc = sqlite3_bind_text(handle, index, s, -1, SQLITE_TRANSIENT)
        case .blob(let data):
            rc = data.withUnsafeBytes { buffer in
                sqlite3_bind_blob(handle, index, buffer.baseAddress, Int32(buffer.count), SQLITE_TRANSIENT)
            }
        case .null:           rc = sqlite3_bind_null(handle, index)
        }
        guard rc == SQLITE_OK else {
            throw SQLiteError.from(db: connection.handle, code: rc)
        }
    }

    // Advance one row; returns true if a row is available, false at the end
    func step() throws -> Bool {
        let rc = sqlite3_step(handle)
        switch rc {
        case SQLITE_ROW:  return true
        case SQLITE_DONE: return false
        default:          throw SQLiteError.from(db: connection.handle, code: rc)
        }
    }

    func reset() {
        sqlite3_reset(handle)
        sqlite3_clear_bindings(handle)
    }
}

Several design decisions are worth calling out. The class owns the statement handle and finalizes in deinit, so callers never finalize manually — the lifetime tracks the object’s lifetime. The bind(_:) method validates parameter count against sqlite3_bind_parameter_count, turning a silent missing-bind bug into a thrown error. It converts the 0-based Swift array to 1-based bind indices internally, hiding the asymmetry. step() returns a clean Bool (true = row available, false = done) and throws on any real error, so callers write while try stmt.step() instead of comparing magic numbers. The connection is held unowned to avoid a retain cycle (the connection will outlive its statements by construction).

Reading columns from the Statement

We add typed column accessors that the row decoder (Part 2) will use:

extension Statement {
    func columnCount() -> Int32 { sqlite3_column_count(handle) }

    func columnName(_ index: Int32) -> String {
        String(cString: sqlite3_column_name(handle, index))
    }

    func isNull(_ index: Int32) -> Bool {
        sqlite3_column_type(handle, index) == SQLITE_NULL
    }

    func int64(_ index: Int32) -> Int64 { sqlite3_column_int64(handle, index) }
    func double(_ index: Int32) -> Double { sqlite3_column_double(handle, index) }

    func string(_ index: Int32) -> String? {
        guard let ptr = sqlite3_column_text(handle, index) else { return nil }
        return String(cString: ptr)
    }

    func data(_ index: Int32) -> Data? {
        guard let ptr = sqlite3_column_blob(handle, index) else { return nil }
        let count = Int(sqlite3_column_bytes(handle, index))
        return Data(bytes: ptr, count: count)
    }
}

These wrap the sqlite3_column_* calls with correct NULL handling and immediate string/data conversion (the lifetime rules from Section 7 are baked in here, so callers can’t get them wrong).

The Connection class

The connection owns the database handle, configures the connection at open, and is the factory for statements:

final class Connection {
    let handle: OpaquePointer?

    init(url: URL, readonly: Bool = false) throws {
        var db: OpaquePointer?
        let flags = (readonly ? SQLITE_OPEN_READONLY : SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE)
            | SQLITE_OPEN_FULLMUTEX
        let rc = sqlite3_open_v2(url.path, &db, flags, nil)
        guard rc == SQLITE_OK else {
            let err = SQLiteError.open(message: db.map { String(cString: sqlite3_errmsg($0)) } ?? "unknown")
            sqlite3_close(db)
            throw err
        }
        self.handle = db
        try configure()
    }

    init(inMemory: Bool) throws {
        var db: OpaquePointer?
        let rc = sqlite3_open_v2(":memory:", &db, SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE | SQLITE_OPEN_FULLMUTEX, nil)
        guard rc == SQLITE_OK else {
            sqlite3_close(db)
            throw SQLiteError.open(message: "could not open in-memory database")
        }
        self.handle = db
        try configure()
    }

    private func configure() throws {
        try execute("PRAGMA journal_mode = WAL;")
        try execute("PRAGMA foreign_keys = ON;")
        try execute("PRAGMA busy_timeout = 5000;")
        try execute("PRAGMA synchronous = NORMAL;")
        sqlite3_extended_result_codes(handle, 1)
    }

    deinit {
        sqlite3_close_v2(handle)   // forgiving close; cleans up even with stray statements
    }

    // Static, parameterless SQL (schema, pragmas)
    func execute(_ sql: String) throws {
        let rc = sqlite3_exec(handle, sql, nil, nil, nil)
        guard rc == SQLITE_OK else {
            throw SQLiteError.from(db: handle, code: rc, sql: sql)
        }
    }

    func prepare(_ sql: String) throws -> Statement {
        try Statement(connection: self, sql: sql)
    }

    var lastInsertRowID: Int64 { sqlite3_last_insert_rowid(handle) }
    var changes: Int32 { sqlite3_changes(handle) }
}

The connection configures sensible pragmas at open (WAL, foreign keys on, a busy timeout, NORMAL synchronous, extended result codes), opens read-write-create or read-only based on a flag, supports in-memory construction for tests, and closes forgivingly in deinit. It exposes execute for static SQL, prepare for parameterized statements, and the lastInsertRowID / changes accessors.

A run helper for non-query statements

For inserts/updates/deletes — prepare, bind, step once, done — a convenience that packages the whole flow:

extension Connection {
    @discardableResult
    func run(_ sql: String, _ values: [SQLiteValue] = []) throws -> Int64 {
        let statement = try prepare(sql)
        try statement.bind(values)
        _ = try statement.step()   // a write statement steps once to SQLITE_DONE
        return lastInsertRowID
    }
}

Now a write is one line:

let id = try connection.run(
    "INSERT INTO tasks (title, is_done, priority, created_at) VALUES (?, 0, ?, ?);",
    [SQLiteValue("Buy milk"), SQLiteValue(1), SQLiteValue(Date())]
)

The statement is created locally, so it finalizes when run returns (the Statement’s deinit fires). Binding is validated. Errors throw. No pointers, no manual cleanup, no injection surface. This is the ergonomic payoff of the wrapper.

The transaction method on Connection

Folding in the transaction wrapper from Section 10 as a method:

extension Connection {
    func transaction<T>(_ work: () throws -> T) throws -> T {
        try execute("BEGIN IMMEDIATE;")
        do {
            let result = try work()
            try execute("COMMIT;")
            return result
        } catch {
            try? execute("ROLLBACK;")
            throw error
        }
    }
}

Wrapper Part 1 pitfalls

Retain cycles between Connection and Statement. A statement holding a strong reference to its connection while the connection caches statements creates a cycle. The unowned reference on Statement.connection breaks it; just ensure connections outlive their statements (they will, by construction).

Finalizing or closing manually despite the wrapper. The whole point is that lifetimes are automatic. Mixing manual sqlite3_finalize/sqlite3_close with the wrapper double-frees. Let the classes own their handles.

Forgetting configure() on one of the initializers. If the in-memory init skips pragma setup, tests behave differently from production. Route both inits through the same configure().

Validating bind count but not in run. The validation lives in Statement.bind, which run calls, so it’s covered — but if you add bind paths that bypass bind(_:), replicate the check.

What to internalize

A thin wrapper turns the repetitive C ceremony into safe, ergonomic Swift. Model bindable values as a SQLiteValue enum with convenience initializers so heterogeneous parameters travel as one array and nil maps to .null. The Statement class owns its handle (finalizing in deinit), validates bind count, hides the 1-based/0-based asymmetry, and exposes a clean step() -> Bool. The Connection class owns the database handle (closing in deinit), configures pragmas at open including the easily-forgotten foreign_keys = ON, and serves as the statement factory. A run helper collapses a write into one line, and a closure-based transaction method makes atomic blocks foolproof. Part 2 builds row decoding and a query API on top of this.


12. Building a Swift Wrapper Layer, Part 2: Row Decoding and a Query API

Part 1 gave us safe connections, statements, binding, and a one-line run for writes. What’s missing is the read side: a pleasant way to run a SELECT and turn each result row into a Swift value. This section builds a row-decoding abstraction and a generic query API, then shows how the whole layer comes together in a small repository.

A Row abstraction

Rather than have callers poke at column indices, we give them a Row that wraps the current statement state and offers typed, name-or-index access:

struct Row {
    fileprivate let statement: Statement
    private let columnIndexByName: [String: Int32]

    init(_ statement: Statement) {
        self.statement = statement
        var map: [String: Int32] = [:]
        let count = statement.columnCount()
        for i in 0..<count {
            map[statement.columnName(i)] = i
        }
        self.columnIndexByName = map
    }

    // By index
    func int64(_ index: Int32) -> Int64 { statement.int64(index) }
    func int(_ index: Int32) -> Int { Int(statement.int64(index)) }
    func double(_ index: Int32) -> Double { statement.double(index) }
    func bool(_ index: Int32) -> Bool { statement.int64(index) != 0 }
    func string(_ index: Int32) -> String? { statement.string(index) }
    func data(_ index: Int32) -> Data? { statement.data(index) }
    func date(_ index: Int32) -> Date? {
        statement.isNull(index) ? nil : Date(timeIntervalSince1970: TimeInterval(statement.int64(index)))
    }

    // By name (resolves to index via the map)
    func int64(_ name: String) -> Int64 { int64(columnIndexByName[name]!) }
    func int(_ name: String) -> Int { int(columnIndexByName[name]!) }
    func double(_ name: String) -> Double { double(columnIndexByName[name]!) }
    func bool(_ name: String) -> Bool { bool(columnIndexByName[name]!) }
    func string(_ name: String) -> String? { string(columnIndexByName[name]!) }
    func data(_ name: String) -> Data? { data(columnIndexByName[name]!) }
    func date(_ name: String) -> Date? { date(columnIndexByName[name]!) }
}

The Row builds a name-to-index map once from the statement’s column metadata, so callers can read by column name (row.string("title")) instead of fragile numeric indices. It also bakes in the Swift-friendly conversions — bool from integer, date from a Unix timestamp — so model decoding reads naturally. The by-index methods remain for hot paths where the name lookup’s small cost matters.

The query method

Now Connection gets a generic query that prepares, binds, steps the whole result set, and maps each row through a decode closure:

extension Connection {
    func query<T>(_ sql: String, _ values: [SQLiteValue] = [], decode: (Row) -> T) throws -> [T] {
        let statement = try prepare(sql)
        try statement.bind(values)

        var results: [T] = []
        let row = Row(statement)   // column map built once; same Row reused per step
        while try statement.step() {
            results.append(decode(row))
        }
        return results
    }
}

The decode closure receives a Row for the current step and returns a value of whatever type the caller wants. Building the Row (and its column map) once before the loop is a deliberate optimization — the column structure is fixed for the statement’s lifetime, so there’s no need to rebuild the map per row. The statement finalizes when query returns.

A convenience for the common “exactly zero or one row” case:

extension Connection {
    func queryOne<T>(_ sql: String, _ values: [SQLiteValue] = [], decode: (Row) -> T) throws -> T? {
        try query(sql, values, decode: decode).first
    }
}

Using the query API

With this in place, reads become declarative. Recall the Task model from Section 8. A single decode function maps a row to a Task:

extension Task {
    init(row: Row) {
        self.id = row.int64("id")
        self.title = row.string("title") ?? ""
        self.notes = row.string("notes")
        self.isDone = row.bool("is_done")
        self.priority = row.int("priority")
        self.createdAt = row.date("created_at") ?? Date(timeIntervalSince1970: 0)
        self.dueDate = row.date("due_date")
    }
}

And queries read cleanly:

// All open tasks, highest priority first
let openTasks = try connection.query(
    "SELECT * FROM tasks WHERE is_done = 0 ORDER BY priority DESC, created_at;",
    decode: Task.init(row:)
)

// One task by id
let task = try connection.queryOne(
    "SELECT * FROM tasks WHERE id = ?;",
    [SQLiteValue(taskId)],
    decode: Task.init(row:)
)

// Just the count — decode a scalar
let count = try connection.queryOne(
    "SELECT COUNT(*) FROM tasks WHERE is_done = 1;",
    decode: { $0.int(0) }
) ?? 0

Reading by column name means SELECT * is safe even if you later reorder the table’s columns — the decoder looks up "title" by name, not by position. (For maximum performance or stability against schema changes you might list columns explicitly, but name-based decoding is wonderfully resilient during development.)

A repository built on the layer

The layer now supports a clean repository — a type that exposes domain operations and hides SQL:

final class TaskRepository {
    private let db: Connection

    init(db: Connection) {
        self.db = db
    }

    func create(title: String, notes: String?, priority: Int, dueDate: Date?) throws -> Int64 {
        try db.run(
            "INSERT INTO tasks (title, notes, is_done, priority, created_at, due_date) VALUES (?, ?, 0, ?, ?, ?);",
            [SQLiteValue(title), SQLiteValue(notes), SQLiteValue(priority),
             SQLiteValue(Date()), dueDate.map { SQLiteValue($0) } ?? .null]
        )
    }

    func all(includeDone: Bool) throws -> [Task] {
        let sql = includeDone
            ? "SELECT * FROM tasks ORDER BY priority DESC, created_at;"
            : "SELECT * FROM tasks WHERE is_done = 0 ORDER BY priority DESC, created_at;"
        return try db.query(sql, decode: Task.init(row:))
    }

    func find(id: Int64) throws -> Task? {
        try db.queryOne("SELECT * FROM tasks WHERE id = ?;", [SQLiteValue(id)], decode: Task.init(row:))
    }

    func update(_ task: Task) throws {
        try db.run(
            "UPDATE tasks SET title = ?, notes = ?, is_done = ?, priority = ?, due_date = ? WHERE id = ?;",
            [SQLiteValue(task.title), task.notes.map { SQLiteValue($0) } ?? .null,
             SQLiteValue(task.isDone), SQLiteValue(task.priority),
             task.dueDate.map { SQLiteValue($0) } ?? .null, SQLiteValue(task.id)]
        )
        guard db.changes > 0 else { throw SQLiteError.notFound }
    }

    func delete(id: Int64) throws {
        try db.run("DELETE FROM tasks WHERE id = ?;", [SQLiteValue(id)])
    }

    func markAllDone() throws {
        try db.transaction {
            try db.run("UPDATE tasks SET is_done = 1 WHERE is_done = 0;")
        }
    }
}

This is the shape of a real data layer built on raw SQLite. Each method is a few lines. There’s no pointer management, no finalize, no injection surface, and errors propagate as typed Swift Errors. The SQL is visible and under your control. Compare this to the multi-dozen-line raw functions from Section 8 — the wrapper has collapsed the ceremony while keeping every bit of the power.

Testing the layer

Because Connection supports in-memory construction, the repository tests are fast and isolated:

import Testing

@Test func testCreateAndFetch() throws {
    let db = try Connection(inMemory: true)
    try db.execute("""
        CREATE TABLE tasks (id INTEGER PRIMARY KEY, title TEXT NOT NULL, notes TEXT,
        is_done INTEGER NOT NULL DEFAULT 0, priority INTEGER NOT NULL DEFAULT 0,
        created_at INTEGER NOT NULL, due_date INTEGER);
    """)
    let repo = TaskRepository(db: db)

    let id = try repo.create(title: "Test", notes: nil, priority: 2, dueDate: nil)
    let fetched = try repo.find(id: id)

    #expect(fetched?.title == "Test")
    #expect(fetched?.priority == 2)
    #expect(fetched?.isDone == false)
}

@Test func testUpdateMissingThrows() throws {
    let db = try Connection(inMemory: true)
    try db.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, title TEXT NOT NULL, notes TEXT, is_done INTEGER NOT NULL DEFAULT 0, priority INTEGER NOT NULL DEFAULT 0, created_at INTEGER NOT NULL, due_date INTEGER);")
    let repo = TaskRepository(db: db)
    var ghost = Task(id: 999, title: "x", notes: nil, isDone: false, priority: 0, createdAt: Date(), dueDate: nil)
    #expect(throws: SQLiteError.self) {
        try repo.update(ghost)   // no row with id 999 → notFound
    }
}

Each test opens a fresh :memory: database, creates the schema, exercises the repository, and asserts. No file system, no shared state, fast enough to run hundreds in a second.

Where to stop

This wrapper is intentionally minimal. It does not do: automatic schema migration (Section 14 adds that), statement caching (Section 23), database change observation, Codable integration, or a query builder. Those are real features that mature libraries provide and that you’d add incrementally if you kept going. The point of building this much is that you now understand exactly what such a library does, and you can reach below any of its abstractions when needed. If your app’s needs grow past this, that’s the signal to evaluate GRDB — but you’ll evaluate it from a position of understanding rather than hope.

Wrapper Part 2 pitfalls

Rebuilding the column-name map per row. Wasteful in large result sets. Build it once before the step loop (as query does).

Force-unwrapping a name that isn’t in the result. row.string("titel") (typo) crashes on the !. During development this surfaces mistakes fast; for production robustness you might return optionals or throw on unknown names.

Decoding closures that capture mutable state unsafely. The decode closure runs synchronously per row inside query, so simple accumulation is fine, but don’t escape the Row — it’s only valid during its step.

Assuming SELECT * column order is stable. It follows table definition order, which migrations can change. Name-based decoding tolerates this; index-based decoding does not. Prefer names, or list columns explicitly.

What to internalize

The read side completes the wrapper: a Row builds a column-name-to-index map once and offers typed, name-based accessors with Swift-friendly conversions baked in (bool from int, date from timestamp). Connection.query prepares, binds, steps the whole set, and maps each row through a decode closure, reusing one Row for efficiency; queryOne handles the single-row case. A model gains an init(row:) and queries become one-liners. On top of this, a repository exposes clean domain methods with no pointer management or injection surface, and in-memory connections make it all trivially testable. This is a real, shippable raw-SQLite layer — and building it means you understand precisely what the bigger libraries do for you.


13. Schema Design and Data Types in SQLite

A database schema is the set of tables, columns, constraints, and relationships that define your data’s shape. SQLite’s approach to types is unusual — looser than most databases — and understanding it prevents a category of subtle bugs. This section covers the type system, column constraints, keys, and the practical decisions of designing tables for an iOS app.

SQLite’s five storage classes

Despite SQL’s traditional rich type vocabulary, SQLite stores everything as one of just five storage classes:

  • NULL — the absence of a value.
  • INTEGER — a signed integer, 1 to 8 bytes depending on magnitude.
  • REAL — an 8-byte IEEE floating-point number.
  • TEXT — a string, stored in the database encoding (UTF-8 by default).
  • BLOB — a blob of bytes, stored exactly as given.

That’s it. There is no native boolean, no native date, no native decimal, no VARCHAR(50) with an enforced length. Those concepts are conventions layered on top of the five storage classes, which is why we store Bool as INTEGER 0/1 and Date as an INTEGER timestamp.

Type affinity: the surprising part

Here’s where SQLite diverges sharply from other databases. When you declare a column’s type, you’re not declaring a strict type — you’re declaring a type affinity, a preference for how values should be stored. SQLite will try to coerce values toward that affinity, but by default it will happily store a value of any storage class in any column.

The affinities derived from declared types:

  • A column declared INTEGER, INT, BIGINT, etc. gets INTEGER affinity.
  • A column declared TEXT, VARCHAR, CHAR, CLOB gets TEXT affinity.
  • A column declared REAL, DOUBLE, FLOAT gets REAL affinity.
  • A column declared BLOB, or with no type at all, gets BLOB affinity (no coercion).
  • A column declared NUMERIC, DECIMAL, BOOLEAN, DATE, DATETIME gets NUMERIC affinity (tries integer/real, falls back to text).

The practical consequence: if you declare age INTEGER and then bind the string "thirty" to it, SQLite stores the text "thirty" in that integer column without complaint. The affinity is a hint, not a contract. This flexibility is occasionally useful but mostly a footgun — it means type errors don’t surface as errors. The defenses are to always bind values of the type you intend (which the SQLiteValue enum from Section 11 helps enforce) and, if you want true strictness, to use STRICT tables.

STRICT tables

Since SQLite 3.37 (iOS 15.4+), you can opt into rigid typing with STRICT:

let sql = """
CREATE TABLE accounts (
    id INTEGER PRIMARY KEY,
    balance INTEGER NOT NULL,
    name TEXT NOT NULL
) STRICT;
"""

In a STRICT table, each column must be one of a fixed set of types (INT, INTEGER, REAL, TEXT, BLOB, ANY), and SQLite enforces them — binding a string to an INTEGER column raises an error instead of silently storing text. For new schemas where you want type safety, STRICT is worth adopting. It eliminates a whole class of “wrong type stored silently” bugs at the cost of the historical flexibility (which you rarely want anyway). The one caveat is the minimum iOS version; check your deployment target supports the SQLite version that introduced it.

Column constraints

Constraints enforce rules at the column or table level:

let sql = """
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE,
    username TEXT NOT NULL,
    age INTEGER CHECK (age >= 0 AND age < 150),
    country TEXT NOT NULL DEFAULT 'US',
    created_at INTEGER NOT NULL
);
"""

The constraints in play:

  • PRIMARY KEY — uniquely identifies a row. For INTEGER PRIMARY KEY specifically, this column becomes an alias for the rowid (more below).
  • NOT NULL — the column must have a value; inserting NULL fails.
  • UNIQUE — no two rows may share this value; a duplicate insert fails with a constraint error.
  • CHECK (expression) — the expression must be true for every row; violations fail. Great for domain rules like non-negative ages.
  • DEFAULT value — the value used when an insert omits the column.

These are enforced at write time (and surface as the SQLITE_CONSTRAINT codes from Section 9). They’re your first line of data integrity — far better to have the database reject bad data than to rely on every code path to validate.

INTEGER PRIMARY KEY and the rowid

Every SQLite table (unless declared WITHOUT ROWID) has a hidden 64-bit integer key called the rowid. When you declare a column as exactly INTEGER PRIMARY KEY, that column becomes the rowid — they’re the same value. This has performance implications: looking up a row by its INTEGER PRIMARY KEY is the fastest possible lookup, because it’s the table’s physical organization key, no separate index needed.

CREATE TABLE tasks (id INTEGER PRIMARY KEY, ...);   -- id IS the rowid; lookups by id are fastest

A few rules: it must be exactly INTEGER (not INT, oddly — INT PRIMARY KEY does not alias the rowid), and there can be only one. If you don’t declare an INTEGER PRIMARY KEY, SQLite still maintains an automatic rowid you can reference as rowid, but giving it an explicit name (id) is clearer and lets you reference it in foreign keys. For the overwhelming majority of tables, declare id INTEGER PRIMARY KEY and use it as your identity.

AUTOINCREMENT is a separate, often-misunderstood keyword: id INTEGER PRIMARY KEY AUTOINCREMENT prevents reuse of deleted ids (a normal INTEGER PRIMARY KEY may reuse the id of a deleted row). It has a small cost and is rarely needed — only use it if you specifically require that ids never repeat across the table’s lifetime.

Foreign keys and relationships

Relationships between tables are expressed with foreign keys:

let sql = """
CREATE TABLE posts (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    title TEXT NOT NULL,
    created_at INTEGER NOT NULL
);
"""

user_id INTEGER NOT NULL REFERENCES users(id) declares that posts.user_id must match an existing users.id. The ON DELETE CASCADE clause says: when a user is deleted, automatically delete their posts. Other actions: ON DELETE SET NULL (orphan the posts by nulling user_id — requires the column be nullable), ON DELETE RESTRICT/NO ACTION (forbid deleting a user who still has posts).

The critical caveat, repeated from Section 4 because it bites everyone: foreign keys are not enforced unless you run PRAGMA foreign_keys = ON on the connection. Declared foreign keys with the pragma off are pure documentation — you can insert a user_id that references nothing, and deletes don’t cascade. Set the pragma at connection startup, always. With it on, foreign key violations surface as SQLITE_CONSTRAINT_FOREIGNKEY.

Designing tables for an iOS app: practical guidance

Some opinionated advice distilled from real apps:

Use INTEGER PRIMARY KEY for identity. It’s the fastest key and the natural choice. If you need a stable identifier for syncing across devices (where rowids differ per device), add a separate uuid TEXT NOT NULL UNIQUE column alongside the integer primary key. The integer is your local identity; the UUID is your portable identity.

Store dates as INTEGER Unix timestamps. Integer seconds since 1970 sort correctly, compare cheaply, and convert trivially to/from Date (Section 18). Avoid storing dates as text unless you need human readability in the raw file.

Store enums as INTEGER or TEXT. A small fixed set (priority: low/medium/high) can be an integer (0/1/2) for compactness or text (‘low’/’medium’/’high’) for readability in raw queries. Text is more debuggable; integer is more compact. Either is fine.

Use NOT NULL liberally. Every column that should always have a value should say so. NULLs propagate in surprising ways through queries (Section 19); fewer nullable columns means fewer surprises.

Add CHECK constraints for invariants. If a quantity must be positive, a CHECK (quantity > 0) makes the database enforce it. Cheap insurance against bad data from any code path.

Name things consistently. Pick snake_case for SQL identifiers (it’s the SQL convention and avoids quoting), singular or plural table names (just be consistent), and clear column names. Your future self reading raw SQL will thank you.

Altering tables

SQLite’s ALTER TABLE is limited compared to other databases. You can:

ALTER TABLE users ADD COLUMN phone TEXT;           -- add a column (with a default or nullable)
ALTER TABLE users RENAME TO app_users;             -- rename the table
ALTER TABLE users RENAME COLUMN username TO handle; -- rename a column (SQLite 3.25+)
ALTER TABLE users DROP COLUMN phone;               -- drop a column (SQLite 3.35+)

What you cannot directly do (in older versions) is change a column’s type, add a constraint to an existing column, or reorder columns. The classic workaround for complex changes is the “twelve-step” recreate: create a new table with the desired shape, copy data over, drop the old, rename the new. We’ll see this in the migrations section. For simple additive changes, ADD COLUMN is all you need, and adding a nullable column or one with a default is cheap and safe.

Schema design pitfalls

Relying on type affinity for type safety. A declared INTEGER column accepts text by default. Bind correct types, or use STRICT tables for real enforcement.

Forgetting PRAGMA foreign_keys = ON. Foreign keys become decorative. Enable per connection at startup.

Using INT PRIMARY KEY instead of INTEGER PRIMARY KEY. INT does not alias the rowid; you lose the fast-lookup optimization. Spell it INTEGER.

Storing dates as locale-formatted text. Breaks sorting and comparison, and is locale-dependent. Use integer timestamps.

Over-using AUTOINCREMENT. It has overhead and is rarely needed. Plain INTEGER PRIMARY KEY reuses freed ids, which is fine for almost everything.

Too many nullable columns. NULLs complicate queries and logic. Use NOT NULL with defaults wherever a value should always exist.

What to internalize

SQLite has five storage classes (NULL, INTEGER, REAL, TEXT, BLOB) and uses type affinity — a preference, not a contract — so by default a column accepts any type. Use STRICT tables (iOS 15.4+) for real type enforcement. Declare id INTEGER PRIMARY KEY for fast identity (it aliases the rowid), add a uuid column if you need cross-device identity, store dates as integer timestamps and booleans as 0/1. Enforce integrity with NOT NULL, UNIQUE, CHECK, and foreign keys — but remember foreign keys require PRAGMA foreign_keys = ON per connection. ALTER TABLE is limited to additive changes and renames; complex changes use the recreate-and-copy pattern.


14. Migrations: Versioning Your Schema with user_version

Your schema will change. You’ll add a column, create a table, add an index, restructure a relationship — and you’ll need to apply those changes to databases that already exist on users’ devices, with their data intact. A migration system tracks which changes have been applied and runs the missing ones in order. SQLite gives you a simple, durable mechanism for this: the user_version pragma. This section builds a complete migration system on it.

The problem migrations solve

When your app first installs, there’s no database — you create the schema fresh. When you ship version 2 with a new column, the user already has a version-1 database full of their data. You can’t just re-run CREATE TABLE (it already exists), and you can’t drop and recreate (that destroys their data). You need to apply only the changes since their current version, in order, exactly once. That’s migration.

The requirements: know what version a given database is at, define an ordered list of schema changes, apply only the ones newer than the database’s version, do it atomically (a half-applied migration is corruption), and record the new version when done.

The user_version pragma

SQLite stores a 32-bit integer in the database header that’s entirely yours to use — it means nothing to SQLite itself. You read and write it with a pragma:

// Read
func schemaVersion(_ db: OpaquePointer?) -> Int32 {
    var statement: OpaquePointer?
    sqlite3_prepare_v2(db, "PRAGMA user_version;", -1, &statement, nil)
    defer { sqlite3_finalize(statement) }
    guard sqlite3_step(statement) == SQLITE_ROW else { return 0 }
    return sqlite3_column_int(statement, 0)
}

// Write (note: user_version doesn't accept a bound parameter, so we build the SQL — safe because it's an integer we control)
func setSchemaVersion(_ db: OpaquePointer?, _ version: Int32) {
    sqlite3_exec(db, "PRAGMA user_version = \(version);", nil, nil, nil)
}

A fresh database has user_version 0. We’ll use it as the count of migrations applied: after applying migration 1, set it to 1; after migration 2, set it to 2; and so on. The version persists in the database file, surviving app restarts — which is exactly what we need, because it must reflect the database’s true state regardless of the app’s lifecycle.

(The string interpolation in setSchemaVersion is the rare safe case: version is an Int32 we generate, never user input, and PRAGMA user_version doesn’t accept bound parameters. This is the only kind of interpolation that’s acceptable — a value you fully control, of a type that can’t carry SQL.)

A migration as a unit of work

We model each migration as a closure that performs its schema change against a connection:

struct Migration {
    let version: Int32
    let apply: (OpaquePointer?) throws -> Void
}

Each migration knows its version number and how to apply itself. The migrations are an ordered list:

let migrations: [Migration] = [
    Migration(version: 1) { db in
        try exec(db, """
            CREATE TABLE tasks (
                id INTEGER PRIMARY KEY,
                title TEXT NOT NULL,
                is_done INTEGER NOT NULL DEFAULT 0,
                created_at INTEGER NOT NULL
            );
        """)
    },
    Migration(version: 2) { db in
        // Add a priority column to existing tasks
        try exec(db, "ALTER TABLE tasks ADD COLUMN priority INTEGER NOT NULL DEFAULT 0;")
    },
    Migration(version: 3) { db in
        // Add an index for the common "incomplete tasks by priority" query
        try exec(db, "CREATE INDEX idx_tasks_open ON tasks(is_done, priority);")
    },
    Migration(version: 4) { db in
        // Add a notes column and a separate tags table
        try exec(db, "ALTER TABLE tasks ADD COLUMN notes TEXT;")
        try exec(db, """
            CREATE TABLE tags (
                id INTEGER PRIMARY KEY,
                task_id INTEGER NOT NULL REFERENCES tasks(id) ON DELETE CASCADE,
                name TEXT NOT NULL
            );
        """)
    },
]

// small helper used above
func exec(_ db: OpaquePointer?, _ sql: String) throws {
    guard sqlite3_exec(db, sql, nil, nil, nil) == SQLITE_OK else {
        throw SQLiteError.from(db: db, code: sqlite3_errcode(db), sql: sql)
    }
}

The list is the complete history of your schema. Version 1 creates the initial tables; later versions evolve them. A user installing fresh runs all of them in order; a user upgrading from version 2 runs only 3 and 4. The list only ever grows — you append new migrations, never edit shipped ones (editing a shipped migration means users who already ran it never get the change, while fresh installs do, creating divergent schemas).

The migration runner

The runner reads the current version, applies every migration with a higher version in order, each in its own transaction, and updates user_version after each:

func migrate(_ db: OpaquePointer?, migrations: [Migration]) throws {
    let current = schemaVersion(db)
    let pending = migrations
        .filter { $0.version > current }
        .sorted { $0.version < $1.version }

    for migration in pending {
        try exec(db, "BEGIN IMMEDIATE;")
        do {
            try migration.apply(db)
            setSchemaVersion(db, migration.version)
            try exec(db, "COMMIT;")
        } catch {
            try? exec(db, "ROLLBACK;")
            throw error   // stop; the database stays at the last good version
        }
    }
}

This is the heart of the system. Each migration runs inside its own BEGIN IMMEDIATE / COMMIT, so it’s atomic: either the schema change and the version bump both happen, or neither does. If migration 3 fails, migrations 1 and 2 are already committed (the database is at version 2), 3 is rolled back, and the error propagates — the database is never left in a half-migrated state. On the next launch, the runner picks up again from version 2 and retries 3. The per-migration transaction is what makes the whole thing crash-safe.

Wiring it into the connection setup:

extension Connection {
    func runMigrations(_ migrations: [Migration]) throws {
        try migrate(handle, migrations: migrations)
    }
}

// At app startup:
let db = try Connection(url: databaseURL())
try db.runMigrations(migrations)

You call runMigrations once, right after opening the connection, before any other database work. By the time it returns, the schema is at the latest version regardless of where the database started.

Foreign keys during migration

One sharp edge: foreign key enforcement and schema changes interact badly. If you’re doing a complex migration that recreates tables (the “twelve-step” pattern), you want foreign keys off during the migration so intermediate states don’t trip constraints, then back on after. The recommended sequence for such migrations:

func migrateWithTableRebuild(_ db: OpaquePointer?) throws {
    try exec(db, "PRAGMA foreign_keys = OFF;")
    try exec(db, "BEGIN IMMEDIATE;")
    do {
        // create new table, copy data, drop old, rename new...
        try exec(db, "COMMIT;")
    } catch {
        try? exec(db, "ROLLBACK;")
        try exec(db, "PRAGMA foreign_keys = ON;")
        throw error
    }
    try exec(db, "PRAGMA foreign_keys = ON;")
    // Optionally verify integrity:
    try exec(db, "PRAGMA foreign_key_check;")
}

Note PRAGMA foreign_keys cannot be changed inside a transaction, so the OFF/ON toggles sit outside the BEGIN/COMMIT. For simple additive migrations (just ADD COLUMN or CREATE TABLE), you don’t need this dance — foreign keys can stay on. It’s only the table-rebuild migrations that require it.

The complex migration: rebuilding a table

When you need to change a column’s type, drop a constraint, or otherwise alter a table in a way ALTER TABLE can’t, the pattern is recreate-and-copy:

Migration(version: 5) { db in
    // Goal: change tasks.priority from INTEGER to a CHECK-constrained column,
    // which ALTER TABLE can't do directly. Rebuild the table.
    try exec(db, """
        CREATE TABLE tasks_new (
            id INTEGER PRIMARY KEY,
            title TEXT NOT NULL,
            is_done INTEGER NOT NULL DEFAULT 0,
            created_at INTEGER NOT NULL,
            priority INTEGER NOT NULL DEFAULT 0 CHECK (priority BETWEEN 0 AND 5),
            notes TEXT
        );
    """)
    try exec(db, """
        INSERT INTO tasks_new (id, title, is_done, created_at, priority, notes)
        SELECT id, title, is_done, created_at,
               CASE WHEN priority BETWEEN 0 AND 5 THEN priority ELSE 0 END,
               notes
        FROM tasks;
    """)
    try exec(db, "DROP TABLE tasks;")
    try exec(db, "ALTER TABLE tasks_new RENAME TO tasks;")
    // Recreate any indexes that were on the old table
    try exec(db, "CREATE INDEX idx_tasks_open ON tasks(is_done, priority);")
}

Create the new table with the desired shape, INSERT ... SELECT to copy (transforming data as needed — here clamping out-of-range priorities), drop the old, rename the new, and recreate indexes (which don’t survive the drop). Because the whole migration runs in a transaction (from the runner), it’s atomic. This pattern handles any structural change SQLite’s ALTER TABLE can’t do directly.

Testing migrations

Migrations are high-stakes — a bug ships data loss — so test them. The key test is “a database at version N migrates correctly to the latest version with data intact”:

@Test func testMigrationFromV1() throws {
    let db = try Connection(inMemory: true)
    // Manually set up a version-1 database with data
    try db.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, title TEXT NOT NULL, is_done INTEGER NOT NULL DEFAULT 0, created_at INTEGER NOT NULL);")
    try db.execute("PRAGMA user_version = 1;")
    try db.run("INSERT INTO tasks (title, is_done, created_at) VALUES (?, 0, ?);",
               [SQLiteValue("Existing task"), SQLiteValue(Date())])

    // Run all migrations
    try db.runMigrations(migrations)

    // Verify the data survived and new columns exist
    let tasks = try db.query("SELECT id, title, priority, notes FROM tasks;") { row in
        (row.int64("id"), row.string("title"), row.int("priority"), row.string("notes"))
    }
    #expect(tasks.count == 1)
    #expect(tasks.first?.1 == "Existing task")
    #expect(tasks.first?.2 == 0)   // priority defaulted
}

Set up a database at an old version with representative data, run the migrations, assert the data survived and the new schema is present. Write one such test per “starting version” you care about supporting. Add them to CI so a broken migration can’t ship.

Migration pitfalls

Editing a shipped migration. Users who ran the old version never see the edit; fresh installs do. Divergent schemas. Never edit a migration once it’s shipped — append a new one.

Migrations not in a transaction. A crash mid-migration leaves a half-applied schema and a wrong version number. Run each migration atomically with version bump inside the same transaction.

Toggling foreign_keys inside a transaction. It’s a no-op there. Toggle outside the BEGIN/COMMIT.

Forgetting to recreate indexes after a table rebuild. Dropping the table drops its indexes. Recreate them in the same migration.

Not testing upgrades from old versions. The bugs live in the data-preservation paths. Test migrating from each supported starting version with real data.

Using user_version for something else too. It’s a single integer; if you also try to encode other state in it, the migration logic breaks. Dedicate it to migration versioning.

What to internalize

Migrations apply ordered, incremental schema changes to existing databases exactly once. Track progress in SQLite’s user_version pragma, which persists in the file. Model each migration as a versioned closure; keep them in an append-only ordered list (never edit shipped ones). The runner applies every migration newer than the current version, each in its own transaction with the version bump, so the database is always at a consistent, known version even across crashes. Simple additive changes use ALTER TABLE ADD COLUMN; structural changes use the create-new / copy / drop-old / rename pattern with foreign keys toggled off around it. Test upgrades from every supported starting version, with data, in CI.


15. Indexes and Query Optimization

An index is a data structure that lets SQLite find rows without scanning the whole table. The difference between a query with the right index and one without can be milliseconds versus seconds — or, on a large table, instant versus unusable. This section covers what indexes are, when to create them, how to design them well, and the costs they carry.

What an index is and why it’s fast

Conceptually, an index is a sorted copy of one or more columns, paired with pointers back to the full rows. Because it’s sorted, SQLite can binary-search it — jumping to the matching range in logarithmic time instead of examining every row linearly.

Without an index, a query like SELECT * FROM tasks WHERE title = 'Buy milk' is a full table scan: SQLite reads every row and checks each one’s title. On 100 rows that’s instant; on 1,000,000 rows it’s slow. With an index on title, SQLite binary-searches the sorted title index, finds the matching entries immediately, and follows the pointers to the rows. The query becomes fast regardless of table size.

The INTEGER PRIMARY KEY (rowid) is automatically indexed — that’s why lookups by id are always fast. Every other column you frequently search, filter, sort, or join on is a candidate for an explicit index.

Creating an index

CREATE INDEX idx_tasks_title ON tasks(title);
CREATE INDEX IF NOT EXISTS idx_posts_user ON posts(user_id);
CREATE UNIQUE INDEX idx_users_email ON users(email);

CREATE INDEX name ON table(columns) builds an index. The IF NOT EXISTS guard makes it safe to run repeatedly (useful in setup code, though in practice indexes live in migrations). CREATE UNIQUE INDEX additionally enforces uniqueness — it’s how you add a uniqueness constraint after table creation, and it doubles as both a constraint and a performance index.

Naming convention: idx_<table>_<columns> is common and self-documenting. The name only matters for dropping the index later (DROP INDEX idx_tasks_title); SQLite picks indexes by their columns, not their names.

When to create an index

Create an index on a column when it appears in:

  • WHERE clauses you run often. WHERE user_id = ? wants an index on user_id.
  • JOIN conditions. Joining posts to users on posts.user_id = users.id wants an index on posts.user_id (the users.id side is already the rowid).
  • ORDER BY clauses. Sorting by created_at can use an index on created_at to avoid a separate sort step.
  • UNIQUE requirements. A column that must be unique gets a unique index (which is also a fast lookup index).

The strongest signal is a frequent query against a large or growing table that filters or sorts on a non-primary-key column. That’s exactly where the full-scan cost hurts and the index pays off.

Don’t index everything

Indexes aren’t free, and over-indexing is a real anti-pattern. Each index:

  • Takes disk space — it’s a sorted copy of the indexed columns.
  • Slows down writes — every INSERT, UPDATE, and DELETE must update every affected index, in addition to the table. A table with eight indexes pays eight index-maintenance costs per write.
  • Adds nothing if unused — an index on a column you never query is pure overhead.

The discipline: index the columns your actual queries use, and no more. Don’t speculatively index “just in case.” If you’re not sure whether an index helps, measure with EXPLAIN QUERY PLAN (Section 16) before and after. A good rule of thumb for a typical app table is a small handful of carefully chosen indexes, not one per column.

Composite indexes and column order

An index can cover multiple columns, and the order of columns within it is significant:

CREATE INDEX idx_tasks_status_priority ON tasks(is_done, priority);

This composite index sorts by is_done first, then priority within each is_done group. It accelerates:

  • WHERE is_done = 0 — uses the leading column.
  • WHERE is_done = 0 AND priority > 3 — uses both columns.
  • WHERE is_done = 0 ORDER BY priority — uses both: filter on the leading column, and the index’s secondary sort provides the ordering for free.

But it does not efficiently accelerate WHERE priority > 3 alone (without an is_done filter), because priority isn’t the leading column — the index is sorted by is_done first, so priority values are scattered across the index. This is the “leftmost prefix” rule: a composite index helps queries that use a leftmost prefix of its columns (is_done, or is_done + priority), not arbitrary subsets.

The design implication: order composite index columns by how you query them. Put equality-filter columns first, then range/sort columns. For “incomplete tasks ordered by priority,” (is_done, priority) is exactly right — equality on is_done, then order by priority.

Covering indexes

If an index contains all the columns a query needs — both the filtered columns and the selected columns — SQLite can answer the query entirely from the index without touching the table at all. This is a covering index, the fastest possible read:

CREATE INDEX idx_tasks_cover ON tasks(is_done, priority, title);

-- This query is "covered"  every column it needs is in the index:
SELECT title FROM tasks WHERE is_done = 0 ORDER BY priority;

Because title, is_done, and priority are all in the index, SQLite reads only the index. No row lookups. For hot read paths, deliberately including the selected columns in the index (even though they’re not filtered or sorted) can be a meaningful optimization. The tradeoff is a larger index, so reserve covering indexes for genuinely hot queries.

Partial indexes

Sometimes you only ever query a subset of rows. A partial index indexes only those rows, staying smaller and cheaper:

CREATE INDEX idx_open_tasks ON tasks(priority) WHERE is_done = 0;

This indexes priority only for incomplete tasks. If your app overwhelmingly queries open tasks (the common case for a to-do app), the partial index is smaller than a full one and the write cost is lower (completed tasks don’t touch it). The WHERE clause in the index definition must match the query’s WHERE for the index to apply. Partial indexes are a nice optimization when there’s a clear “hot subset” of rows.

Expression indexes

You can index the result of an expression, not just a raw column:

CREATE INDEX idx_users_email_lower ON users(lower(email));

-- Now case-insensitive lookups can use the index:
SELECT * FROM users WHERE lower(email) = lower(?);

By indexing lower(email), a case-insensitive email lookup becomes index-accelerated. Without it, WHERE lower(email) = ? would scan every row applying lower to each. The query’s expression must match the index’s expression. Expression indexes are the tool when you consistently query a transformed value.

Measuring before optimizing

The cardinal rule: don’t guess, measure. Before adding an index, confirm the query is actually slow and that it’s doing a full scan. After adding it, confirm the index is used and the query got faster. The tool for this is EXPLAIN QUERY PLAN, covered in full next section. The short version: prefix any query with EXPLAIN QUERY PLAN and SQLite tells you whether it’s scanning or using an index, and which one. Adding indexes blindly bloats the database and slows writes for no benefit; adding them based on measured query plans is precise and effective.

Rebuilding statistics with ANALYZE

SQLite’s query planner uses statistics about your data’s distribution to choose indexes. After substantial data changes, those statistics can go stale, leading the planner to make poor choices. ANALYZE refreshes them:

sqlite3_exec(db, "ANALYZE;", nil, nil, nil)

Running ANALYZE occasionally (after a large import, or periodically in a maintenance routine) helps the planner pick the right indexes. It’s cheap relative to the queries it improves. You don’t need it constantly, but it’s worth running after big data shifts.

Index pitfalls

No index on a frequently filtered column of a large table. Full scans that get slower as data grows. Add an index on the filtered column.

Indexing everything. Bloats the database and slows every write. Index only columns your queries actually use.

Wrong column order in a composite index. A query on a non-leading column can’t use the index efficiently. Order columns to match your query patterns (equality first, then range/sort).

Expecting an index to help a query whose WHERE doesn’t match it. WHERE priority > 3 won’t use an index on (is_done, priority). Design indexes around real queries.

Adding indexes without measuring. You might add an unused index or miss the one that matters. Use EXPLAIN QUERY PLAN before and after.

Forgetting that functions in WHERE defeat plain indexes. WHERE lower(email) = ? won’t use an index on email. Use an expression index on lower(email).

What to internalize

An index is a sorted structure that turns full table scans into fast lookups, sized for the columns your queries filter, join, sort, or require unique on. The INTEGER PRIMARY KEY is auto-indexed; everything else is your decision. Indexes cost disk space and write speed, so index deliberately — the columns real queries use, not every column. Composite indexes follow the leftmost-prefix rule, so order their columns to match query patterns (equality then range/sort). Covering indexes answer queries from the index alone; partial and expression indexes optimize hot subsets and transformed lookups. Always measure with EXPLAIN QUERY PLAN rather than guessing, and run ANALYZE after large data changes to keep the planner’s statistics fresh.


16. EXPLAIN QUERY PLAN and the Query Planner

You can’t optimize what you can’t see. SQLite’s query planner decides, for every query, whether to scan a table, use an index, build a temporary sort structure, or some combination. EXPLAIN QUERY PLAN shows you those decisions in a compact, readable form. Learning to read its output is the single most valuable optimization skill you can develop — it turns performance work from guesswork into diagnosis. This section teaches you to read query plans and act on them.

What the query planner does

When you prepare a statement, SQLite doesn’t just blindly execute your SQL top to bottom. It analyzes the query and chooses an execution strategy: which table to read first, whether to use an index or scan, how to satisfy ORDER BY, how to perform joins. It uses heuristics and statistics (from ANALYZE) to estimate which strategy is cheapest. Usually it chooses well. EXPLAIN QUERY PLAN is how you confirm that — and catch the cases where it’s forced into something slow because the right index doesn’t exist.

Reading a basic plan

Prefix any query with EXPLAIN QUERY PLAN and run it like a normal SELECT. Each result row describes one step of the plan. Let’s wrap it in a helper that prints the plan:

func explainQueryPlan(_ db: OpaquePointer?, _ sql: String) {
    let explainSQL = "EXPLAIN QUERY PLAN " + sql
    var statement: OpaquePointer?
    guard sqlite3_prepare_v2(db, explainSQL, -1, &statement, nil) == SQLITE_OK else {
        print("Could not explain: \(String(cString: sqlite3_errmsg(db)))")
        return
    }
    defer { sqlite3_finalize(statement) }

    print("PLAN FOR: \(sql)")
    while sqlite3_step(statement) == SQLITE_ROW {
        // EXPLAIN QUERY PLAN returns columns: id, parent, notused, detail
        let detail = String(cString: sqlite3_column_text(statement, 3))
        print("  \(detail)")
    }
}

The interesting column is the last one, detail — a human-readable description of the step. Calling it on a query with no useful index:

explainQueryPlan(db, "SELECT * FROM tasks WHERE title = 'Buy milk';")
// PLAN FOR: SELECT * FROM tasks WHERE title = 'Buy milk';
//   SCAN tasks

SCAN tasks means a full table scan — SQLite reads every row. For a small table, fine. For a large one, this is the signal to add an index.

The two words that matter most in plan output:

  • SCAN — SQLite examines every row in the table (or every row in a range it can’t narrow with an index). Cost grows linearly with table size. Acceptable for small tables or when you genuinely need all rows; a red flag for a selective query on a large table.
  • SEARCH — SQLite uses an index to jump directly to the matching rows without examining the rest. Cost grows logarithmically. This is what you want for selective queries.

Add an index and re-explain the same query:

sqlite3_exec(db, "CREATE INDEX idx_tasks_title ON tasks(title);", nil, nil, nil)
explainQueryPlan(db, "SELECT * FROM tasks WHERE title = 'Buy milk';")
// PLAN FOR: SELECT * FROM tasks WHERE title = 'Buy milk';
//   SEARCH tasks USING INDEX idx_tasks_title (title=?)

Now it says SEARCH tasks USING INDEX idx_tasks_title (title=?) — SQLite found the index and uses it to locate the row directly. The (title=?) part tells you which index column and constraint it’s exploiting. This before/after comparison — SCAN becoming SEARCH — is the canonical way to verify an index does its job.

Recognizing a covering index

When the plan says USING COVERING INDEX, SQLite answered the query entirely from the index without reading the table — the fastest possible read (Section 15):

sqlite3_exec(db, "CREATE INDEX idx_cover ON tasks(is_done, priority, title);", nil, nil, nil)
explainQueryPlan(db, "SELECT title FROM tasks WHERE is_done = 0 ORDER BY priority;")
//   SEARCH tasks USING COVERING INDEX idx_cover (is_done=?)

COVERING INDEX in the output confirms your covering-index design worked — there’s no separate row fetch. If you built an index expecting it to cover a query and the plan shows plain USING INDEX (not COVERING), the index is missing one of the selected columns.

Detecting an expensive sort

ORDER BY on a column with no usable index forces SQLite to collect all the rows and sort them in a temporary structure. The plan reveals this:

explainQueryPlan(db, "SELECT * FROM tasks ORDER BY created_at;")
//   SCAN tasks
//   USE TEMP B-TREE FOR ORDER BY

USE TEMP B-TREE FOR ORDER BY means SQLite built a temporary sort tree — extra memory and time. If this query is hot, an index on created_at lets SQLite read the rows in sorted order directly, eliminating the temp b-tree:

sqlite3_exec(db, "CREATE INDEX idx_created ON tasks(created_at);", nil, nil, nil)
explainQueryPlan(db, "SELECT * FROM tasks ORDER BY created_at;")
//   SCAN tasks USING INDEX idx_created

Now the scan walks the index in order — no separate sort. Seeing USE TEMP B-TREE is a specific, actionable signal: an index can make the sort free.

Reading join plans

For a query with a join, the plan shows one line per table and the order SQLite chose to process them:

explainQueryPlan(db, """
    SELECT t.title, tg.name FROM tasks t
    JOIN tags tg ON tg.task_id = t.id
    WHERE t.is_done = 0;
""")
//   SEARCH t USING INDEX idx_tasks_open (is_done=?)
//   SEARCH tg USING INDEX idx_tags_task (task_id=?)

This reads as: filter tasks using the is_done index, then for each matching task, look up its tags using an index on tags.task_id. Both steps are SEARCH (index-driven) — an efficient join. If instead you saw SCAN tg for the inner table, it would mean SQLite re-scans all tags for every task — an O(tasks × tags) disaster on large data, fixed by indexing tags.task_id. Join performance lives almost entirely in whether the joined columns are indexed; the plan tells you immediately.

The full EXPLAIN (bytecode)

EXPLAIN without QUERY PLAN dumps the actual VDBE bytecode — the low-level virtual machine instructions SQLite will execute:

// EXPLAIN SELECT ... → rows of opcodes like OpenRead, Column, ResultRow, Next

This is far more detail than you usually need; it’s a tool for deep debugging of the engine’s behavior, not everyday optimization. Stick with EXPLAIN QUERY PLAN for normal work — it’s the right altitude. Reach for full EXPLAIN only when you’re investigating something exotic.

A practical optimization workflow

Put it together into a repeatable process:

  1. Identify a slow query. Measure actual time (wrap the prepare/step/finalize in a timer), or notice UI jank tied to a database operation.
  2. Explain it. Run EXPLAIN QUERY PLAN and look for SCAN on large tables, USE TEMP B-TREE, or SCAN on the inner table of a join.
  3. Hypothesize an index. Based on the WHERE, JOIN, and ORDER BY columns, design an index (mind the leftmost-prefix rule from Section 15).
  4. Add it and re-explain. Confirm SCAN became SEARCH, the temp b-tree vanished, or the join inner became SEARCH.
  5. Re-measure. Verify the wall-clock time actually improved. Sometimes an index the plan “uses” doesn’t help because the query wasn’t the bottleneck.
  6. Run ANALYZE if the planner seems to be ignoring a good index — stale statistics can mislead it.

This loop is precise and fast. Most query optimization on SQLite is exactly this: explain, spot the scan or sort, add the matching index, confirm.

When a scan is fine

Not every SCAN is a problem. A full scan is the right plan when you’re reading most or all of a table (no index helps if you need every row), when the table is small (a scan of 50 rows is faster than index overhead), or for a one-off maintenance query where speed doesn’t matter. Don’t reflexively index away every scan — index the scans that are both selective (returning few of many rows) and hot (run often on large tables). The plan tells you what SQLite does; judgment tells you whether it matters.

Query plan pitfalls

Optimizing without explaining. You might add an index the query can’t use, or miss that the real cost is a sort. Always explain first.

Assuming USING INDEX means optimal. It might be a non-covering index forcing row lookups, or the wrong index. Read the full detail, and check for COVERING on hot reads.

Ignoring USE TEMP B-TREE. A hidden sort cost. An index on the ORDER BY columns often eliminates it.

Missing a SCAN on a join’s inner table. That’s a per-outer-row rescan — quadratic. Index the join column.

Trusting the plan but not the clock. The plan shows strategy, not wall-clock time. Re-measure after changes; sometimes the database wasn’t the bottleneck.

Forgetting ANALYZE. Without fresh statistics the planner can pick a poor index. Run ANALYZE after large data changes.

What to internalize

EXPLAIN QUERY PLAN reveals how SQLite executes a query, turning optimization from guesswork into diagnosis. The key vocabulary: SCAN (reads every row, linear cost, a flag on large selective queries) versus SEARCH (index-driven, logarithmic, what you want); USING COVERING INDEX (answered from the index alone, the fastest read); USE TEMP B-TREE FOR ORDER BY (an expensive sort an index can eliminate); and per-table join lines that expose whether the inner table is searched or rescanned. The workflow is identify → explain → add the matching index → re-explain → re-measure, with ANALYZE to refresh statistics. And remember a scan is fine when you genuinely need most rows or the table is tiny — optimize the scans that are selective and hot.


17. Joins, Subqueries, and Aggregations

Relational databases earn their name from relationships between tables, and SQL’s power to combine and summarize data across them. This section covers the query constructs you’ll use constantly once your schema has more than one table: joins to combine rows, subqueries to nest queries, and aggregations to summarize. All from Swift, reading the results through the wrapper from Section 12.

Joins combine rows from multiple tables

Suppose tasks and tags (a task has many tags, from the migration in Section 14). To list each task with its tags, you join the tables on the relationship column:

let sql = """
SELECT t.id, t.title, tg.name
FROM tasks t
INNER JOIN tags tg ON tg.task_id = t.id
WHERE t.is_done = 0
ORDER BY t.title;
"""

INNER JOIN tags tg ON tg.task_id = t.id matches each tasks row with every tags row whose task_id equals the task’s id. The result has one row per (task, tag) pair. The table aliases (t, tg) keep the column references short and unambiguous — essential when both tables have an id column.

Reading a join result through the wrapper is the same as any query; you just select columns from multiple tables:

struct TaskTag { let taskId: Int64; let title: String; let tag: String }

let rows = try connection.query(sql) { row in
    TaskTag(taskId: row.int64("id"), title: row.string("title") ?? "", tag: row.string("name") ?? "")
}

INNER JOIN versus LEFT JOIN

The join type determines what happens to rows with no match:

  • INNER JOIN — only returns rows where the join condition matches on both sides. A task with no tags simply doesn’t appear in the result.
  • LEFT JOIN (left outer join) — returns every row from the left table, with NULLs for the right table’s columns when there’s no match. A task with no tags appears once, with a NULL tag.
let sql = """
SELECT t.id, t.title, tg.name
FROM tasks t
LEFT JOIN tags tg ON tg.task_id = t.id
ORDER BY t.title;
"""

With LEFT JOIN, a tagless task yields a row where tg.name is NULL. In Swift, row.string("name") returns nil for that row. Choose INNER when you only care about matched pairs (tasks that have tags) and LEFT when you want all left-table rows regardless (all tasks, with their tags if any). Getting this choice wrong is a common source of “rows mysteriously missing from my results” — if tagless tasks vanished and you didn’t expect that, you wanted a LEFT JOIN.

Aggregations summarize rows

Aggregate functions collapse many rows into a single summary value:

SELECT COUNT(*) FROM tasks WHERE is_done = 0;        -- how many open tasks
SELECT AVG(priority) FROM tasks;                      -- average priority
SELECT MAX(created_at) FROM tasks;                    -- most recent creation time
SELECT SUM(priority) FROM tasks WHERE is_done = 0;    -- total priority of open tasks
SELECT MIN(due_date) FROM tasks WHERE due_date IS NOT NULL;  -- earliest due date

The five core aggregates: COUNT (how many), SUM (total), AVG (mean), MIN, MAX. Reading a single aggregate value is the scalar-query pattern:

let openCount = try connection.queryOne("SELECT COUNT(*) FROM tasks WHERE is_done = 0;") {
    $0.int(0)
} ?? 0

COUNT(*) counts rows; COUNT(column) counts non-NULL values in that column (a distinction that matters with nullable columns — COUNT(due_date) counts only tasks that have a due date). Aggregates ignore NULLs (except COUNT(*)), which interacts with the NULL semantics we’ll cover in Section 19.

GROUP BY: aggregation per group

Aggregating the whole table gives one number. GROUP BY aggregates within groups, giving one row per group:

let sql = """
SELECT priority, COUNT(*) AS task_count
FROM tasks
WHERE is_done = 0
GROUP BY priority
ORDER BY priority DESC;
"""

struct PriorityCount { let priority: Int; let count: Int }
let counts = try connection.query(sql) { row in
    PriorityCount(priority: row.int("priority"), count: row.int("task_count"))
}

GROUP BY priority partitions the open tasks by their priority value, and COUNT(*) counts within each partition. The result is one row per distinct priority, with its task count. The AS task_count aliases the aggregate column so you can read it by a clean name. GROUP BY is how you build summaries like “tasks per priority,” “posts per user,” “sales per month” — any “X per Y” question.

HAVING: filtering groups

WHERE filters rows before grouping; HAVING filters groups after aggregating. To find priorities with more than five open tasks:

let sql = """
SELECT priority, COUNT(*) AS task_count
FROM tasks
WHERE is_done = 0
GROUP BY priority
HAVING COUNT(*) > 5
ORDER BY task_count DESC;
"""

The WHERE is_done = 0 filters to open tasks before grouping; the HAVING COUNT(*) > 5 then keeps only the groups whose count exceeds five. You can’t put an aggregate in WHERE (it runs before aggregation exists); HAVING is specifically for conditions on aggregate results. The mental model: WHERE picks the rows that go into the groups, HAVING picks which finished groups survive.

Subqueries

A subquery is a query nested inside another. They come in a few flavors. A scalar subquery returns a single value usable in an expression:

-- Tasks created more recently than the average creation time
SELECT id, title FROM tasks
WHERE created_at > (SELECT AVG(created_at) FROM tasks);

The inner (SELECT AVG(created_at) FROM tasks) computes one number, which the outer query compares against. A subquery in IN filters by membership in a set:

-- Tasks that have at least one tag named 'urgent'
SELECT id, title FROM tasks
WHERE id IN (SELECT task_id FROM tags WHERE name = 'urgent');

The inner query produces a set of task_ids; the outer keeps tasks whose id is in that set. A correlated subquery references the outer query and runs per outer row (powerful but potentially slow — each outer row triggers the inner query):

-- Tasks along with their tag count, computed via a correlated subquery
SELECT t.id, t.title,
       (SELECT COUNT(*) FROM tags tg WHERE tg.task_id = t.id) AS tag_count
FROM tasks t;

Here the subquery references t.id from the outer query, so it recomputes for each task. This particular example is often better expressed as a LEFT JOIN ... GROUP BY, which the planner can optimize better — correlated subqueries are clear but watch their performance on large tables (explain them, per Section 16).

Combining: a realistic query

A query that exercises several constructs — the open tasks with the most tags, including their tag counts, for tasks that have at least two tags:

let sql = """
SELECT t.id, t.title, COUNT(tg.id) AS tag_count
FROM tasks t
JOIN tags tg ON tg.task_id = t.id
WHERE t.is_done = 0
GROUP BY t.id, t.title
HAVING COUNT(tg.id) >= 2
ORDER BY tag_count DESC, t.title
LIMIT 20;
"""

struct TaggedTask { let id: Int64; let title: String; let tagCount: Int }
let results = try connection.query(sql) { row in
    TaggedTask(id: row.int64("id"), title: row.string("title") ?? "", tagCount: row.int("tag_count"))
}

This joins tasks to tags, filters to open tasks (WHERE), groups by task (GROUP BY), counts tags per task (COUNT), keeps only tasks with two or more tags (HAVING), orders by tag count then title, and caps the result at 20 rows (LIMIT). It reads cleanly through the wrapper into a Swift struct. This is the kind of query that would be awkward and slow to assemble by fetching everything and processing in Swift — letting SQLite do it in one query is both faster and clearer.

A note on doing work in SQL versus Swift

A recurring decision: should you compute something in SQL or fetch raw rows and compute in Swift? The guideline: let the database do set-oriented work — filtering, joining, aggregating, sorting, limiting. SQLite is highly optimized for these, operates close to the data, and (with indexes) avoids loading rows you don’t need. Pull computation into Swift when it’s genuinely application logic that doesn’t map to SQL, or when you need the raw rows anyway. Fetching an entire table into Swift to count or filter it in a loop is almost always the wrong call — it loads more data, runs slower, and discards SQLite’s optimizations. When in doubt, express it in SQL and let the planner work.

Joins and aggregation pitfalls

Using INNER JOIN when you meant LEFT JOIN. Rows with no match silently disappear. If you expect all left-table rows, use LEFT JOIN.

Unindexed join columns. A join on a non-indexed column rescans the inner table per outer row — quadratic. Index foreign key columns used in joins (Section 16’s plan reading catches this).

Putting an aggregate in WHERE. Aggregates don’t exist until after grouping. Filter groups with HAVING, rows with WHERE.

Forgetting COUNT(*) counts rows but COUNT(col) counts non-NULLs. With nullable columns these differ. Pick the one you mean.

Correlated subqueries on large tables. They run per outer row. Often a join + GROUP BY is faster; explain to compare.

Fetching everything to filter/aggregate in Swift. Wastes memory and time. Express set work in SQL.

What to internalize

Joins combine rows across tables on a relationship column: INNER JOIN keeps only matches, LEFT JOIN keeps all left-table rows with NULLs for missing matches (choosing wrong makes rows appear or vanish). Aggregates (COUNT, SUM, AVG, MIN, MAX) summarize rows; GROUP BY aggregates per group for “X per Y” summaries; HAVING filters groups after aggregation while WHERE filters rows before. Subqueries nest queries — scalar, IN-set, and correlated (mind the per-row cost of correlated ones). Read all of these through the same wrapper query/queryOne into Swift structs. The overarching principle: let SQLite do set-oriented filtering, joining, aggregating, and sorting close to the data, rather than pulling raw rows into Swift to process in loops.


18. Date, Time, and Storing Complex Types

SQLite has no native date type and no native way to store a Swift struct or array. You store these as one of the five storage classes, using a convention. Choosing the right convention for dates, and a sound approach for complex types, prevents a lot of pain — wrong choices here lead to dates that won’t sort, timezone bugs, and brittle serialization. This section covers the practical patterns for both.

Three ways to store a date, and which to choose

There’s no DATE storage class, so a date becomes an integer, a real, or text. The three conventions:

  • INTEGER Unix timestamp — seconds (or milliseconds) since 1970-01-01 UTC. Compact, sorts correctly as a number, trivial to convert to/from Date, timezone-independent (it’s always UTC). This is the recommended default.
  • REAL Julian day number — days since November 24, 4714 BC, as a floating-point number. SQLite’s date functions use this internally. Rarely the right choice for app code — the conversion to Date is awkward and the format is unintuitive.
  • TEXT ISO-8601 string — like "2026-01-15T10:30:00Z". Human-readable in the raw database (nice for debugging), sorts correctly if you use a consistent zero-padded UTC format, but takes more space and requires string parsing to get a Date.

For almost every iOS app, store dates as INTEGER Unix timestamps. They’re the smallest, fastest to compare and sort, and convert to Date in one line. Use TEXT ISO-8601 only when raw-file human readability genuinely matters and you’re disciplined about a consistent UTC format. Avoid REAL Julian days for app data.

Converting Date to and from integer timestamps

The conversion is Date.timeIntervalSince1970, which gives seconds since the epoch as a Double:

// Date → INTEGER (seconds)
let timestamp = Int64(date.timeIntervalSince1970)
sqlite3_bind_int64(statement, 1, timestamp)

// INTEGER → Date
let seconds = sqlite3_column_int64(statement, 0)
let date = Date(timeIntervalSince1970: TimeInterval(seconds))

If you need sub-second precision, store milliseconds (Int64(date.timeIntervalSince1970 * 1000)) and divide on the way out. For most apps, second precision is plenty. This conversion is exactly what the SQLiteValue(_ value: Date) initializer and Row.date(_:) accessor from Sections 11-12 encapsulate — you define it once and never think about it again. The crucial property: because the timestamp is UTC seconds, it’s free of timezone ambiguity. The display timezone is a presentation concern you handle with DateFormatter when showing the date, never in storage.

SQLite’s date and time functions

Even with integer storage, SQLite’s built-in date functions are useful for queries. They operate on timestamps when you tell them the value is a Unix epoch with the 'unixepoch' modifier:

-- Format a stored timestamp as a readable date in a query
SELECT id, title, date(created_at, 'unixepoch') AS created_day FROM tasks;

-- Tasks created today (compares date portions)
SELECT * FROM tasks WHERE date(created_at, 'unixepoch') = date('now');

-- Tasks created in the last 7 days
SELECT * FROM tasks WHERE created_at >= strftime('%s', 'now', '-7 days');

-- Group tasks by the month they were created
SELECT strftime('%Y-%m', created_at, 'unixepoch') AS month, COUNT(*) AS n
FROM tasks GROUP BY month ORDER BY month;

The functions: date() extracts the date portion, datetime() the full date-time, time() the time portion, and strftime() formats with a pattern (%Y year, %m month, %d day, %s Unix seconds, etc.). The 'unixepoch' modifier tells these functions your integer is Unix seconds (without it they’d interpret the number as a Julian day). 'now' is the current time, and modifiers like '-7 days', 'start of month', '+1 hour' do date arithmetic. These let you do date-based filtering and grouping in SQL rather than fetching everything and filtering by date in Swift.

One subtlety: comparing created_at >= strftime('%s', 'now', '-7 days') compares two integer timestamps, which is fast and can use an index on created_at. But wrapping the column in a function — date(created_at, 'unixepoch') = date('now') — applies the function per row and defeats an index on created_at. For index-friendly date range queries, compare the raw integer column against a computed bound, not a function-wrapped column. (This echoes the expression-index discussion from Section 15.)

Storing Codable types as JSON

For genuinely complex values — a Swift struct, an array, a dictionary — the clean approach is to encode them to JSON and store the JSON as TEXT (or BLOB). Swift’s Codable makes this nearly automatic:

struct TaskMetadata: Codable {
    var labels: [String]
    var estimatedMinutes: Int?
    var recurrence: String?
}

// Encode → store as TEXT
func encodeMetadata(_ metadata: TaskMetadata) throws -> String {
    let data = try JSONEncoder().encode(metadata)
    return String(data: data, encoding: .utf8)!
}

// Decode ← read from TEXT
func decodeMetadata(_ json: String) throws -> TaskMetadata {
    let data = json.data(using: .utf8)!
    return try JSONDecoder().decode(TaskMetadata.self, from: data)
}

You store the JSON string in a TEXT column and decode it on read. This is the pragmatic way to persist structured side-data without modeling every field as its own column. It’s a deliberate denormalization: you trade queryability (you can’t easily WHERE on a field inside the JSON, at least not without SQLite’s JSON functions) for simplicity and flexibility (the structure can evolve without a migration). Use it for ancillary, rarely-queried structured data — settings blobs, cached API responses, flexible metadata. Don’t use it for data you need to filter, sort, or join on; that data belongs in proper columns.

SQLite’s JSON functions

If you do store JSON and occasionally need to query into it, SQLite’s JSON functions (JSON1, included in Apple’s build) let you reach inside:

-- Extract a field from a JSON column
SELECT id, json_extract(metadata, '$.estimatedMinutes') AS minutes FROM tasks;

-- Filter on a JSON field
SELECT * FROM tasks WHERE json_extract(metadata, '$.recurrence') = 'weekly';

-- Check array membership
SELECT * FROM tasks WHERE EXISTS (
    SELECT 1 FROM json_each(metadata, '$.labels') WHERE value = 'urgent'
);

json_extract(column, '$.path') pulls a value out by JSON path; json_each expands a JSON array into rows you can query. These bridge the gap when you mostly treat JSON as opaque but occasionally need to peek inside. They’re a middle ground — more flexible than separate columns, more queryable than fully opaque JSON. But note that filtering with json_extract scans and parses every row’s JSON (no index unless you add an expression index on the extraction), so it’s fine for occasional queries on modest data, not for hot paths on large tables. If you find yourself querying a JSON field constantly, promote it to a real column.

Storing other common types

A few more type-mapping conventions you’ll hit:

Bool — store as INTEGER 0/1 (covered in Section 8). SQLite has no boolean type; WHERE is_done = 1 and WHERE is_done = 0 work naturally.

UUID — store as TEXT (uuid.uuidString) for readability, or as a 16-byte BLOB (withUnsafeBytes on the uuid) for compactness. TEXT is more debuggable and usually worth the few extra bytes; BLOB matters only at large scale.

URL — store as TEXT via url.absoluteString, reconstruct with URL(string:). Simple and readable.

Enums — store the raw value. A String-backed enum stores as TEXT (readable); an Int-backed enum stores as INTEGER (compact). For a RawRepresentable enum, bind enumValue.rawValue and reconstruct with MyEnum(rawValue:).

Decimal/currency — be careful. Storing money as a floating-point REAL invites rounding errors. Store currency as an INTEGER count of the smallest unit (cents, not dollars) and format for display, or as TEXT if you need exact arbitrary precision. Never store money you care about as a plain Double.

Complex types pitfalls

Storing dates as locale-formatted text. "Jan 15, 2026" won’t sort and is locale-dependent. Use integer UTC timestamps.

Wrapping a date column in a function in WHERE. date(created_at, 'unixepoch') = ... defeats the index. Compare the raw integer against a computed bound for index-friendly range queries.

Storing currency as Double. Floating-point rounding corrupts money. Use integer smallest-units or exact TEXT.

Putting queryable data inside a JSON blob. You can’t index or efficiently filter it. Data you query belongs in its own column; JSON is for opaque side-data.

Forgetting timezones are a display concern. Storage is UTC seconds; timezone conversion happens at display with DateFormatter. Mixing timezones into storage creates ambiguity.

Force-unwrapping JSON encode/decode. Malformed data crashes. Handle the throws from JSONEncoder/JSONDecoder and decide on a fallback.

What to internalize

SQLite has no native date or complex-type storage, so you use conventions. Store dates as INTEGER Unix timestamps (UTC seconds) — compact, sortable, timezone-free, one line to convert to/from Date — reserving ISO-8601 TEXT for when raw readability matters and avoiding Julian REAL. Use SQLite’s date/datetime/strftime functions with the 'unixepoch' modifier for in-SQL date filtering and grouping, but compare raw integer columns (not function-wrapped ones) to keep indexes effective. Persist complex Swift values by encoding Codable types to JSON in a TEXT column — great for opaque side-data, wrong for anything you need to query (which belongs in real columns). SQLite’s JSON functions can peek inside when needed. Store booleans as 0/1, UUIDs/URLs as text, enums as their raw value, and money as integer smallest-units — never as Double.


19. NULL Handling and Type Affinity

Two of SQLite’s behaviors trip up developers more than any others: NULL’s three-valued logic and the loose type affinity system. Both behave differently from Swift’s clean, strict model, and both cause bugs that are subtle precisely because the database doesn’t complain — it just returns a result you didn’t expect. This section makes both explicit so they stop surprising you.

NULL means “unknown,” not “empty”

In Swift, nil is a clear absence. In SQL, NULL means unknown — a value that exists but isn’t known — and this distinction drives logic that feels strange until you internalize it. The headline consequence: NULL is not equal to anything, including itself.

NULL = NULL      -- evaluates to NULL (not true!)
NULL = 5         -- NULL
NULL <> 5        -- NULL
5 = NULL         -- NULL

Every comparison involving NULL yields NULL, which in a WHERE clause is treated as “not true,” so the row is excluded. This means a query like WHERE column = NULL returns nothing, ever — even for rows where the column is actually NULL. It’s a classic bug: you think you’re finding the NULL rows, but you’re finding none.

Testing for NULL with IS NULL / IS NOT NULL

Because = NULL never matches, SQL provides dedicated operators:

SELECT * FROM tasks WHERE due_date IS NULL;       -- tasks with no due date
SELECT * FROM tasks WHERE due_date IS NOT NULL;   -- tasks that have a due date

IS NULL and IS NOT NULL are the only correct way to test for NULL. They return true/false (not NULL), so they work as you’d expect in a WHERE. Whenever you mean “where this column has no value” or “where it has some value,” reach for IS NULL / IS NOT NULL, never = NULL or <> NULL. From Swift, when you build a query that conditionally filters on a nullable column, remember this — a generated column = ? with a NULL bound value will not match NULL rows; you need column IS NULL for that case.

Three-valued logic in compound conditions

NULL propagates through AND/OR with three-valued logic (true, false, unknown):

true  AND NULL  -- NULL  (unknown — the result depends on the unknown value)
false AND NULL  -- false (false regardless of the unknown)
true  OR  NULL  -- true  (true regardless of the unknown)
false OR  NULL  -- NULL  (unknown)
NOT NULL        -- NULL  (the negation of unknown is still unknown)

The practical effect surfaces in compound WHERE clauses. A condition like WHERE priority > 3 OR notes IS NOT NULL behaves fine, but WHERE priority > 3 AND bonus_flag = 1 excludes rows where bonus_flag is NULL even though you might have intended NULL to mean “no flag = treat as 0.” When a nullable column participates in a condition, think through what NULL does to the logic. Often the fix is to coalesce the NULL to a concrete default first.

COALESCE and IFNULL: substituting defaults

COALESCE returns its first non-NULL argument; IFNULL is the two-argument version. They turn NULL into a usable default:

-- Treat a NULL priority as 0 for sorting
SELECT * FROM tasks ORDER BY COALESCE(priority, 0) DESC;

-- Display a fallback for missing notes
SELECT id, title, IFNULL(notes, '(no notes)') AS notes_display FROM tasks;

-- First non-null among several columns
SELECT COALESCE(nickname, full_name, 'Anonymous') AS display_name FROM users;

COALESCE(a, b, c) evaluates to the first of a, b, c that isn’t NULL. This is the standard way to provide defaults in queries — far cleaner than handling NULL in Swift after the fact, and it lets the NULL substitution participate correctly in sorting and comparison. When a nullable column needs to behave like a concrete value in a query, coalesce it.

NULL in aggregates

Aggregate functions (except COUNT(*)) ignore NULLs:

SELECT AVG(priority) FROM tasks;   -- averages only the non-NULL priorities
SELECT COUNT(due_date) FROM tasks; -- counts only rows where due_date IS NOT NULL
SELECT COUNT(*) FROM tasks;        -- counts ALL rows, NULL or not
SELECT SUM(priority) FROM tasks;   -- sums non-NULL; returns NULL if ALL are NULL

This is usually what you want — an average shouldn’t be dragged down by treating unknown values as zero — but it has edge cases. COUNT(column) versus COUNT(*) differ whenever the column has NULLs (Section 17 noted this). And SUM over a set that’s entirely NULL returns NULL, not 0, which can surprise downstream code; coalesce it (COALESCE(SUM(priority), 0)) if you need a numeric zero. AVG of all-NULL is also NULL. Keep these in mind when an aggregate “comes back empty” — it may be NULL from an all-NULL input.

NULL in ORDER BY and UNIQUE

Two more NULL behaviors worth knowing. In ORDER BY, SQLite sorts NULLs first in ascending order (they’re considered smaller than everything):

SELECT * FROM tasks ORDER BY due_date;          -- NULL due_dates appear first
SELECT * FROM tasks ORDER BY due_date DESC;     -- NULL due_dates appear last
-- To control it explicitly:
SELECT * FROM tasks ORDER BY due_date IS NULL, due_date;  -- non-NULLs first, then NULLs

If you want NULLs at a specific end regardless of direction, the ORDER BY column IS NULL, column trick sorts the NULL-ness first. Second, in a UNIQUE constraint, NULLs are considered distinct from each other — you can have many rows with NULL in a UNIQUE column, because NULL ≠ NULL means no two NULLs “collide.” This is standard SQL behavior but surprises people expecting “unique” to allow only one NULL. If you need at most one NULL, you need additional logic (a partial unique index or a sentinel value).

Type affinity revisited: the loose typing trap

Section 13 introduced type affinity — that a column’s declared type is a preference, not a strict rule. Here’s where it bites in practice. Because SQLite coerces toward affinity but accepts mismatches, comparisons can behave unexpectedly when types don’t line up:

-- If 'count' is a TEXT-affinity column that happens to hold '10' and '9' as text:
SELECT * FROM items WHERE count > 5;   -- text comparison! '9' > '5' but '10' < '5' as strings

Text compares lexicographically, so '10' sorts before '5' (comparing first characters, ‘1’ < ‘5’). If a column you think is numeric actually stores text, range comparisons silently produce wrong results. The defenses: use the right affinity (declare numeric columns INTEGER/REAL), bind values of the intended type (the SQLiteValue enum helps), and adopt STRICT tables (Section 13) to make type mismatches an error instead of a silent coercion.

A related gotcha: comparing a number to a string. WHERE id = '5' (string) versus WHERE id = 5 (integer) can behave differently depending on column affinity and SQLite’s conversion rules. Bind the correct Swift type and this never arises — but it’s a reason to be disciplined about types at the binding boundary rather than relying on coercion.

Affinity and NULL together in a generic reader

When you build a generic row reader (like the Row in Section 12) that reads columns by inspecting sqlite3_column_type, NULL and affinity intersect. A column declared INTEGER might, due to loose typing, actually contain text or NULL in a given row. Robust generic readers check sqlite3_column_type per row (it reports the actual storage class of the value in this row, not the column’s declared affinity) and branch accordingly. This is why the Row.date and Row.int64 accessors check isNull before reading — the value’s real type can differ from the column’s nominal type, and NULL must be handled explicitly because numeric NULL reads as 0.

NULL and affinity pitfalls

WHERE column = NULL. Never matches anything. Use IS NULL.

Expecting column = ? with a NULL bound value to match NULL rows. It won’t. Branch to IS NULL for the NULL case.

Forgetting NULL excludes rows in compound AND conditions. A NULL in an AND makes the whole condition not-true. Coalesce nullable columns to defaults when that’s the intent.

SUM/AVG over all-NULL returning NULL, not 0. Surprises downstream numeric code. COALESCE(SUM(x), 0) if you need zero.

Assuming UNIQUE allows only one NULL. It allows many — NULLs are distinct. Use a partial unique index if you need at most one.

Trusting declared column types for comparisons. Loose affinity means a “numeric” column might hold text, breaking range comparisons. Bind correct types and/or use STRICT tables.

Numeric NULL reading as 0 in Swift. sqlite3_column_int64 returns 0 for NULL. Check sqlite3_column_type == SQLITE_NULL first.

What to internalize

NULL means “unknown,” so it’s never equal to anything — = NULL matches nothing; use IS NULL / IS NOT NULL. NULL propagates through AND/OR with three-valued logic and silently excludes rows from compound conditions, so coalesce nullable columns to defaults (COALESCE, IFNULL) when you need concrete behavior. Aggregates ignore NULLs except COUNT(*), and SUM/AVG over all-NULL return NULL not 0. NULLs sort first ascending and are distinct under UNIQUE (so many NULLs are allowed). Separately, type affinity is loose: a column accepts mismatched types and coerces, which can break comparisons (text '10' < '5'); bind correct Swift types, use STRICT tables for enforcement, and in generic readers check sqlite3_column_type per row since the actual value type and NULL-ness can differ from the column’s declaration.


20. Concurrency: Threading Modes and Connections

iOS apps are concurrent by nature — UI on the main thread, work on background queues, increasingly structured concurrency with async/await and actors. SQLite can be used safely from multiple threads, but only if you understand its threading model and follow a small set of rules. Getting this wrong produces crashes, corruption, and SQLITE_MISUSE errors that are hard to reproduce. This section establishes the rules.

SQLite’s three threading modes

SQLite can be compiled and configured in three threading modes, which control how much internal locking it does:

  • Single-thread — no internal mutexes at all. The library may only be used by one thread, period. Fastest, least safe.
  • Multi-thread — the library may be used by multiple threads, provided no single database connection is used by more than one thread at a time. Each thread needs its own connection.
  • Serialized — the library may be used freely from multiple threads with no restriction; SQLite serializes access with internal mutexes. Safest, with some locking overhead.

On iOS, the default is serialized mode. You can also select per-connection via the open flags from Section 3: SQLITE_OPEN_FULLMUTEX requests serialized, SQLITE_OPEN_NOMUTEX requests multi-thread (no per-connection mutex). The mode determines what’s safe, so it’s worth being deliberate.

The core rule: one connection per thread (or serialize access to one)

Regardless of mode, the safe patterns reduce to two:

  1. One connection, accessed by one thread/queue at a time. Use SQLITE_OPEN_FULLMUTEX (serialized) and funnel all database access through a single serial dispatch queue or a single actor. SQLite’s mutex protects you, and the serial queue guarantees no overlap. Simple and correct.

  2. One connection per thread. Use SQLITE_OPEN_NOMUTEX (multi-thread) and give each thread its own connection, never sharing a connection across threads. Each connection is single-threaded from SQLite’s perspective.

What you must never do is share one connection across threads without serialization in a non-serialized mode. That’s the path to corruption and SQLITE_MISUSE. An OpaquePointer connection handle used concurrently from two threads, with no mutex, will eventually crash or corrupt data.

OpaquePointer is not Sendable

In Swift’s concurrency model, this manifests concretely: OpaquePointer is not Sendable. The compiler will (under strict concurrency checking) flag attempts to pass a connection handle across actor or task boundaries. This is the type system telling you the truth — a raw connection handle is not safe to move between concurrency domains casually. Your wrapper types (Connection, Statement) wrap these pointers, and you must design their concurrency story deliberately rather than scatter handles across threads.

Pattern 1: A serial queue owning one connection

The simplest robust design: one connection, all access serialized through a private dispatch queue. The queue guarantees one operation at a time; the connection never sees concurrent use.

final class DatabaseActor {
    private let connection: Connection
    private let queue = DispatchQueue(label: "com.app.database")

    init(url: URL) throws {
        self.connection = try Connection(url: url)   // opened with FULLMUTEX
    }

    func read<T>(_ work: @escaping (Connection) throws -> T) async throws -> T {
        try await withCheckedThrowingContinuation { continuation in
            queue.async {
                do { continuation.resume(returning: try work(self.connection)) }
                catch { continuation.resume(throwing: error) }
            }
        }
    }

    func write<T>(_ work: @escaping (Connection) throws -> T) async throws -> T {
        try await withCheckedThrowingContinuation { continuation in
            queue.async(flags: .barrier) {
                do {
                    let result = try self.connection.transaction { try work(self.connection) }
                    continuation.resume(returning: result)
                } catch { continuation.resume(throwing: error) }
            }
        }
    }
}

All access goes through the queue, bridged to async/await with continuations. Reads and writes are serialized; the connection is touched by exactly one queue at a time. This is correct, simple, and sufficient for the large majority of apps — most apps don’t need true read/write parallelism on a local database. (For true read parallelism you’d use a concurrent queue with .barrier writes plus WAL mode and a connection pool — more complex, rarely needed.)

Pattern 2: A Swift actor wrapping the connection

In modern Swift concurrency, an actor provides exactly the serialization guarantee SQLite wants — an actor processes one call at a time, so a connection owned by an actor is never accessed concurrently:

actor Database {
    private let connection: Connection

    init(url: URL) throws {
        self.connection = try Connection(url: url)
    }

    func fetchTasks() throws -> [Task] {
        try connection.query("SELECT * FROM tasks WHERE is_done = 0;", decode: Task.init(row:))
    }

    func createTask(title: String) throws -> Int64 {
        try connection.run(
            "INSERT INTO tasks (title, is_done, priority, created_at) VALUES (?, 0, 0, ?);",
            [SQLiteValue(title), SQLiteValue(Date())]
        )
    }
}

// Usage:
let db = try Database(url: databaseURL())
let tasks = try await db.fetchTasks()
let id = try await db.createTask(title: "New task")

The actor’s isolation means every method runs in mutual exclusion — the connection is safe by construction, no manual queue, no continuations. Calls from outside are async. This is the cleanest modern pattern: the actor is the serialization. The connection handle never leaves the actor’s isolation domain, which also satisfies the non-Sendable constraint naturally. For new code targeting Swift concurrency, an actor-wrapped connection is the recommended approach.

The trade-off: a single connection serializes everything

Both patterns above route all database work — reads and writes — through one serialized point. That’s correct, but it means a long-running read blocks a write and vice versa. For most apps this is fine: SQLite operations on a local database are fast, and the simplicity is worth far more than the theoretical parallelism. But if you have a workload with genuine concurrent read demand — say, a feed that reads constantly while a sync writes in the background — you’ll want multiple connections and WAL mode, which lets readers proceed without blocking the writer. That’s the subject of the next section. The key insight here: start with one serialized connection, and only add connection-level concurrency when profiling shows the single connection is a real bottleneck.

Keeping database work off the main thread

A non-negotiable iOS rule: don’t do database work on the main thread if it could be slow. A query that scans a large table, a bulk write, or a migration can take long enough to drop frames and freeze the UI. With the actor or queue patterns above, database calls are async and run off the main thread automatically — you await them from the main actor, and the work happens elsewhere. This is a major reason to adopt one of these patterns rather than calling raw SQLite synchronously from wherever: it structurally keeps the work off the main thread. When you read results back to drive UI, you hop back to the main actor to update state (Section 30 covers the SwiftUI integration explicitly).

Connections are cheap to keep, expensive to churn

A reminder from Section 3 that matters more under concurrency: opening a connection is relatively expensive (file operations, page cache initialization), so you keep connections alive rather than opening one per operation. Under the single-connection patterns, you open once at startup and hold it for the app’s lifetime. If you adopt multiple connections for concurrency, you pool them — open a fixed set and reuse them — rather than opening and closing per query. Connection churn is a performance anti-pattern; connection longevity is the norm.

Concurrency pitfalls

Sharing one connection across threads without serialization. Corruption and SQLITE_MISUSE. Serialize access (queue or actor) or give each thread its own connection.

Passing an OpaquePointer across actor/task boundaries. It’s not Sendable for good reason. Keep the handle inside one isolation domain (an actor or a queue-owning class).

Doing database work synchronously on the main thread. Janks the UI. Route work through an async actor/queue that runs off-main.

Opening a connection per query. Expensive churn. Keep connections long-lived; pool them if you use several.

Assuming a single serialized connection gives read parallelism. It serializes everything. For genuine concurrent reads, use WAL plus multiple connections (next section) — but only when measured need justifies the complexity.

Mixing the two patterns. Don’t half-serialize with a queue and also hand the connection to other threads. Pick one ownership model and hold it.

What to internalize

SQLite has three threading modes (single-thread, multi-thread, serialized); iOS defaults to serialized, selectable per connection via SQLITE_OPEN_FULLMUTEX/SQLITE_OPEN_NOMUTEX. The two safe patterns are: one connection accessed strictly one-at-a-time (serial queue or actor), or one connection per thread — never a shared connection used concurrently without serialization. OpaquePointer is not Sendable, which is the type system enforcing this. The cleanest modern design is a Swift actor wrapping the connection: the actor’s one-call-at-a-time isolation is the serialization, it keeps work off the main thread via async, and it confines the handle to one isolation domain. Start with a single serialized connection; keep connections long-lived; and only reach for multiple connections plus WAL when profiling proves the single connection is a genuine bottleneck.


21. WAL Mode and Write Concurrency

Write-Ahead Logging (WAL) is the journal mode that makes SQLite genuinely good at concurrent access: readers don’t block the writer, and the writer doesn’t block readers. For an iOS app that reads on the main path while writing in the background, WAL is transformative. This section explains what WAL is, how to enable and manage it, and the iOS-specific considerations around its extra files.

The default journal mode and its limitation

Before WAL, SQLite’s default journal mode was “rollback journal.” In that mode, a write transaction creates a separate journal file holding the original data, modifies the database file in place, and deletes the journal on commit. The crucial limitation: while a write transaction is in progress, it holds a lock that blocks all readers. Readers and the writer are mutually exclusive. For a concurrent app, this serializes everything — a background write stalls every read until it commits.

What WAL does differently

WAL inverts the approach. Instead of writing changes into the main database file and journaling the old data, WAL writes new data to a separate “write-ahead log” file (-wal) and leaves the main database file untouched until a “checkpoint” later folds the WAL back in. The consequences are exactly what concurrent apps want:

  • Readers don’t block the writer, and the writer doesn’t block readers. A reader sees a consistent snapshot of the database as of when its read began, reading from the main file plus the relevant WAL frames. The writer appends new frames to the WAL concurrently. They don’t contend.
  • Only writers block other writers. There can still be just one writer at a time (SQLite remains single-writer), but writers no longer lock out readers.
  • Writes are often faster, because appending to the WAL is sequential and the expensive checkpoint is deferred and batched.

For a typical iOS app — UI reading current data while background tasks sync and write — WAL means the writes stop freezing the reads. This is why WAL is the recommended journal mode for app databases, and why the Connection.configure() in Section 11 set it at startup.

Enabling WAL

WAL is enabled with a pragma, and it’s persistent — once set on a database file, it stays in WAL mode across connections and app launches until explicitly changed:

sqlite3_exec(db, "PRAGMA journal_mode = WAL;", nil, nil, nil)

This pragma returns a row (the resulting mode), which a nil-callback sqlite3_exec discards harmlessly. You set it once after opening. Because the mode is a property of the database file, not the connection, you don’t strictly need to set it on every connection — but it’s harmless to, and doing so in configure() ensures it regardless of how the file was created. To check the current mode:

// PRAGMA journal_mode; with no value returns the current mode as a text row

The -wal and -shm files

WAL mode introduces two companion files alongside your database (app.sqlite):

  • app.sqlite-wal — the write-ahead log itself, holding committed-but-not-yet-checkpointed changes.
  • app.sqlite-shm — a shared-memory file used to coordinate access to the WAL among connections.

These are normal and expected. A few implications for iOS:

First, the database is now three files, not one. If you copy, back up, or move the database, you must handle all three (or properly checkpoint and close first so the WAL is empty — more below). Copying just app.sqlite while a populated -wal exists gives you a stale database missing recent changes. The backup API (Section 26) handles this correctly; naive file copies don’t.

Second, the -wal file grows as writes accumulate, until a checkpoint folds it back into the main database. An app that writes a lot without checkpointing can see a large -wal file. SQLite checkpoints automatically by default (see below), so this is usually self-managing, but it’s worth knowing where the disk space goes.

Checkpointing

A checkpoint transfers the changes accumulated in the WAL back into the main database file and (if no readers are mid-snapshot) resets the WAL. SQLite does this automatically: by default, after the WAL reaches about 1000 pages, the next commit triggers an automatic checkpoint. You can tune or trigger checkpoints manually:

// Tune the automatic checkpoint threshold (in pages); 0 disables auto-checkpoint
sqlite3_exec(db, "PRAGMA wal_autocheckpoint = 1000;", nil, nil, nil)

// Trigger a checkpoint manually
sqlite3_wal_checkpoint_v2(db, nil, SQLITE_CHECKPOINT_TRUNCATE, nil, nil)

The checkpoint modes for sqlite3_wal_checkpoint_v2:

  • SQLITE_CHECKPOINT_PASSIVE — checkpoint as much as possible without blocking; the default automatic behavior.
  • SQLITE_CHECKPOINT_FULL — wait for readers, then checkpoint everything.
  • SQLITE_CHECKPOINT_TRUNCATE — checkpoint everything and truncate the WAL file to zero, reclaiming its disk space.

For most apps, the automatic checkpointing is fine and you never call this. A manual TRUNCATE checkpoint is useful at specific moments — before backing up the database, when entering the background and you want the WAL flushed and small, or before copying the file. Don’t checkpoint obsessively; it’s work, and the automatic threshold balances it well.

WAL and app lifecycle on iOS

iOS suspends and terminates apps aggressively, which interacts with WAL. A couple of practical points:

When your app moves to the background, it’s a reasonable moment to checkpoint so the WAL is folded in and small — if the OS then terminates the app, the database file is more complete. A SQLITE_CHECKPOINT_PASSIVE or TRUNCATE checkpoint in your scenePhase background handler (or the older applicationDidEnterBackground) is a tidy habit:

func appDidEnterBackground() {
    // best-effort checkpoint; ignore errors
    sqlite3_wal_checkpoint_v2(connection.handle, nil, SQLITE_CHECKPOINT_PASSIVE, nil, nil)
}

WAL is also crash-safe: if the app is killed with changes in the WAL, the next time the database opens, SQLite recovers automatically from the WAL — committed transactions survive. You don’t lose committed data to an abrupt termination. This durability is one of WAL’s quiet virtues on a platform that terminates apps without warning.

App Groups and shared databases

If you share a database across an app and its extensions (a widget, a share extension, a notification service) via an App Group container, WAL has special considerations. The -shm shared-memory coordination relies on memory-mapping that behaves differently across process boundaries, and there have historically been subtleties with WAL databases accessed from multiple processes in an App Group. The practical guidance: it’s doable, but test carefully, ensure all processes open the database from the same App Group path, and be aware that the shared-memory WAL index assumes cooperating processes. Some teams use a checkpoint-and-close discipline or PRAGMA journal_mode choices tuned for the multi-process case. If you only access the database from the main app, none of this applies — it’s specifically the cross-process App Group scenario that needs care.

When not to use WAL

WAL is the right default, but a couple of cases favor the rollback journal. A strictly read-only database (like a bundled reference database you ship in the app and never write) doesn’t benefit from WAL and can stay in the default mode — there are no writes to make concurrent. And the multi-process App Group case above sometimes pushes teams toward different configurations. For the common case — a read-write database accessed from the main app — WAL is what you want.

WAL pitfalls

Copying or backing up only the main file with a populated WAL. You get a stale database missing recent changes. Checkpoint first, or use the backup API which handles the WAL correctly.

Expecting WAL to allow multiple concurrent writers. It doesn’t — still one writer at a time. WAL frees readers from blocking on the writer, not writers from each other.

Disabling auto-checkpoint and never checkpointing manually. The -wal file grows unbounded. Leave auto-checkpoint on or checkpoint deliberately.

Forgetting the extra files exist. A “move the database” routine that moves one file breaks. Account for -wal and -shm, or checkpoint-and-close to empty the WAL first.

Using WAL naively across App Group processes. The shared-memory coordination has cross-process subtleties. Test thoroughly or use a configuration suited to multi-process access.

What to internalize

WAL (write-ahead logging) lets readers and the writer proceed concurrently — readers see a consistent snapshot while the writer appends to a separate -wal file — which is exactly what an iOS app reading on the UI path while writing in the background needs. Enable it once with PRAGMA journal_mode = WAL (it’s persistent on the file); it adds -wal and -shm companion files, so treat the database as three files for copying/backup or checkpoint-and-close first. Checkpointing folds the WAL back into the main file and happens automatically (~1000 pages); trigger it manually with sqlite3_wal_checkpoint_v2 before backups or when entering the background to keep the WAL small. WAL still allows only one writer at a time and is crash-safe across iOS’s aggressive termination. It’s the right default for read-write app databases; read-only bundled databases and some App Group multi-process scenarios are the exceptions.


22. Busy Handling and Locking

Even in WAL mode, SQLite allows only one writer at a time, and a connection that tries to write while another holds the write lock gets SQLITE_BUSY. How you handle that — wait, retry, or fail — determines whether your app is robust under concurrent writes or fragile and flaky. This section covers SQLite’s locking model and the mechanisms for dealing with contention gracefully.

The SQLITE_BUSY condition

SQLITE_BUSY (result code 5) means: “the resource you want is locked by another connection; try again later.” It’s not an error in the sense of “something is broken” — it’s a normal, expected outcome of concurrent access that you handle, not crash on. You’ll see it when one connection tries to acquire a lock another connection holds: most commonly, a writer trying to start while another writer is mid-transaction.

The key distinction from a real error: SQLITE_BUSY is transient and recoverable. The lock will eventually release when the other connection commits or rolls back. Your job is to wait for that and retry, within reason.

Lock states (briefly)

Under the hood, SQLite connections move through lock states on the database. In the classic rollback-journal model these are UNLOCKED → SHARED (readers) → RESERVED → PENDING → EXCLUSIVE (the writer escalates as it commits). In WAL mode it’s simpler: readers don’t take locks that block the writer, and a single write lock governs writers. You rarely need to reason about the specific states; what matters operationally is: multiple readers coexist freely (especially in WAL), and exactly one writer proceeds at a time, with others getting SQLITE_BUSY until the writer finishes.

The busy timeout: the simplest fix

The easiest and most effective way to handle SQLITE_BUSY is to tell SQLite to wait a while before returning it. PRAGMA busy_timeout (or the equivalent sqlite3_busy_timeout) sets a duration during which SQLite, upon hitting a lock, sleeps and retries internally rather than immediately returning SQLITE_BUSY:

// Wait up to 5000 milliseconds for a lock before giving up
sqlite3_exec(db, "PRAGMA busy_timeout = 5000;", nil, nil, nil)
// or equivalently:
sqlite3_busy_timeout(db, 5000)

With a 5-second busy timeout, a write that hits a lock held by a brief transaction simply waits (transparently, inside the SQLite call) until the lock frees — typically milliseconds — and then proceeds. You never even see SQLITE_BUSY for normal short contentions. This single pragma (set in configure() in Section 11) resolves the vast majority of busy situations with zero application logic. It’s the first thing to set, and often the only busy handling you need.

The busy handler callback

busy_timeout is actually a convenience built on a lower-level mechanism: the busy handler, a callback SQLite invokes when it hits a lock, letting you decide whether to retry. sqlite3_busy_handler registers a custom one:

sqlite3_busy_handler(db, { (context, attempts) -> Int32 in
    // attempts = how many times we've been called for this lock
    if attempts < 50 {
        usleep(10_000)   // sleep 10ms
        return 1          // return nonzero → retry
    } else {
        return 0          // return zero → give up, SQLite returns SQLITE_BUSY
    }
}, nil)

The handler receives the number of prior attempts and returns nonzero to retry (after whatever sleep you do) or zero to give up. busy_timeout is essentially a built-in handler that retries with escalating sleeps until the timeout elapses. You’d write a custom handler only for special backoff strategies; for almost all apps, busy_timeout is simpler and sufficient. Note you use either busy_timeout or a custom busy_handler — setting one replaces the other.

Application-level retry for the cases timeout can’t cover

A busy timeout handles contention within SQLite calls, but some situations need retry logic at the application level. The notable one: a SQLITE_BUSY returned from a COMMIT (or from a deferred transaction that upgraded to a write and lost a race) where you need to redo the whole transaction. A retry wrapper handles this:

func withRetry<T>(maxAttempts: Int = 5, _ work: () throws -> T) throws -> T {
    var attempt = 0
    while true {
        do {
            return try work()
        } catch let error as SQLiteError {
            attempt += 1
            if case .step(_, let code) = error, code == SQLITE_BUSY, attempt < maxAttempts {
                let backoff = UInt32(attempt * 20_000)   // escalating microseconds
                usleep(backoff)
                continue
            }
            throw error
        }
    }
}

// Usage: retry a whole write transaction on busy
try withRetry {
    try connection.transaction {
        // ... writes ...
    }
}

This retries the entire unit of work on SQLITE_BUSY, with escalating backoff, up to a cap. Retrying the whole transaction (not just the failed statement) is the correct granularity, because a busy that aborts a transaction needs the transaction redone from BEGIN. Combined with a busy timeout, this makes writes robust under contention: the timeout absorbs short waits inside calls, and the retry wrapper handles the rarer transaction-level aborts.

BEGIN IMMEDIATE avoids the un-retryable busy

Section 10 introduced BEGIN IMMEDIATE and promised it’d matter here. The reason: with a default BEGIN DEFERRED transaction that reads first, then writes, the connection takes a read lock at the first read and only tries to upgrade to a write lock at the first write. If another writer grabbed the write lock in between, the upgrade fails with SQLITE_BUSY — and now you’re stuck, because you’ve already read data under the old state and can’t safely just retry the write without potentially acting on stale reads. BEGIN IMMEDIATE takes the write lock up front, before any reads:

sqlite3_exec(db, "BEGIN IMMEDIATE;", nil, nil, nil)

If the write lock isn’t available, BEGIN IMMEDIATE fails immediately, before you’ve read or written anything — so retrying the whole transaction is clean and safe. For any transaction that will write, use BEGIN IMMEDIATE (the Connection.transaction method in Section 11 does this). It converts a dangerous mid-transaction busy into a safe, retryable, do-nothing-yet busy. This pairing — BEGIN IMMEDIATE plus the retry wrapper — is the robust write pattern.

Keep transactions short to minimize contention

The best busy handling is avoiding contention in the first place, and the lever for that is transaction duration. A write transaction holds the write lock for its entire span; the longer it runs, the longer other writers wait and the more likely they hit SQLITE_BUSY. So:

  • Don’t do slow non-database work inside a write transaction — no network calls, no heavy computation, no waiting on user input. Gather everything you need before BEGIN, then do the writes quickly and COMMIT.
  • Batch related writes into one short transaction rather than many tiny ones (also a performance win, Section 24), but don’t let a “batch” become a multi-second lock-holder.
  • Move reads outside the write transaction when possible, so the write lock is held only for the actual writes.

Short transactions make contention rare, which makes busy handling almost a non-issue. The busy timeout and retry wrapper are the safety net; short transactions are the design that keeps you off the net.

Locking and busy pitfalls

Treating SQLITE_BUSY as a fatal error. It’s transient and recoverable. Set a busy timeout and/or retry; never crash on it.

No busy timeout set. Every momentary contention surfaces as SQLITE_BUSY to your code. Set busy_timeout at startup so short waits are handled transparently.

Retrying a single statement instead of the whole transaction. A busy that aborts a transaction needs the transaction redone from BEGIN. Retry at the transaction granularity.

Using BEGIN DEFERRED for read-then-write transactions. Risks an un-retryable mid-transaction upgrade busy. Use BEGIN IMMEDIATE for writers.

Long transactions holding the write lock. Maximize contention. Keep transactions short; do slow work outside them.

Setting both busy_timeout and a custom busy_handler. The second replaces the first. Choose one (timeout for almost everyone).

What to internalize

SQLITE_BUSY is a normal, recoverable signal that another connection holds a lock — handle it, don’t crash. The first and usually sufficient fix is PRAGMA busy_timeout (e.g. 5000ms), which makes SQLite wait-and-retry inside its calls so short contentions never reach your code. For the rarer transaction-level aborts, wrap whole write transactions in an application retry with escalating backoff. Use BEGIN IMMEDIATE for any writing transaction so a busy fails up front (cleanly retryable) instead of mid-transaction (un-retryable). Above all, keep write transactions short and free of slow non-database work — short transactions make contention rare, turning busy handling into a safety net you seldom hit. (WAL, from the previous section, already removes reader/writer contention; this section handles the remaining writer/writer case.)


23. Prepared Statement Caching and Reuse

Compiling SQL into a prepared statement costs real time — parsing, validation, query planning. If you recompile the same statement on every call, you pay that cost repeatedly for no reason. Statement caching compiles each distinct SQL once and reuses the compiled statement across calls. This section shows how to build a statement cache into the wrapper and why it’s one of the highest-leverage optimizations for a chatty app.

The cost of recompiling

Recall the lifecycle: sqlite3_prepare_v2 compiles, you bind/step, then sqlite3_finalize releases. The wrapper’s run and query methods (Sections 11-12) prepare a fresh statement each call and finalize it on return. That’s correct and safe, but for a statement you run thousands of times — say, “fetch task by id” called for every row in a list — you’re recompiling identical SQL thousands of times. Compilation can dominate the cost of a simple query whose execution is otherwise a single index lookup.

The fix is to compile each distinct SQL string once, keep the compiled statement, and on subsequent calls reset it and rebind rather than recompile. This is exactly what sqlite3_reset enables (Section 5): it rewinds a statement to its initial state, ready to rebind and re-step, keeping the compiled bytecode.

A statement cache keyed by SQL text

The design: a dictionary mapping SQL strings to their compiled Statement objects. When you ask to run a SQL string, the cache returns the existing compiled statement (reset and ready) or compiles and stores a new one. Building this into Connection:

final class Connection {
    let handle: OpaquePointer?
    private var statementCache: [String: Statement] = [:]

    // ... existing init/configure/deinit ...

    /// Returns a cached statement for this SQL, compiling and caching on first use.
    private func cachedStatement(_ sql: String) throws -> Statement {
        if let existing = statementCache[sql] {
            existing.reset()              // rewind + clear bindings for reuse
            return existing
        }
        let statement = try Statement(connection: self, sql: sql)
        statementCache[sql] = statement   // keep it alive for future calls
        return statement
    }

    deinit {
        statementCache.removeAll()        // releases cached Statements → their deinit finalizes
        sqlite3_close_v2(handle)
    }
}

The cache holds each Statement alive (so its handle isn’t finalized between calls), and cachedStatement resets a hit before handing it back so it’s clean for rebinding. On Connection deinit, clearing the cache releases the Statement objects, whose own deinit finalizes the handles — and only then does the forgiving sqlite3_close_v2 run. The ordering matters: statements must finalize before (or as part of) the close.

Rewriting run and query to use the cache

Now the public methods use cached statements instead of fresh ones:

extension Connection {
    @discardableResult
    func run(_ sql: String, _ values: [SQLiteValue] = []) throws -> Int64 {
        let statement = try cachedStatement(sql)
        try statement.bind(values)
        _ = try statement.step()
        return lastInsertRowID
    }

    func query<T>(_ sql: String, _ values: [SQLiteValue] = [], decode: (Row) -> T) throws -> [T] {
        let statement = try cachedStatement(sql)
        try statement.bind(values)
        var results: [T] = []
        let row = Row(statement)
        while try statement.step() {
            results.append(decode(row))
        }
        return results
    }
}

The only change from the Section 11-12 versions is cachedStatement(sql) instead of prepare(sql). The statement is no longer finalized at the end of the call — it’s reset on the next use. From the caller’s perspective nothing changed; under the hood, identical SQL now compiles once and is reused. For an app that runs the same handful of queries repeatedly (which is most apps), this eliminates nearly all compilation cost.

Why parameterized SQL makes caching work

Statement caching is only effective because you parameterize your SQL. Recall that values enter through binds, not string interpolation (Sections 5-6). That means the SQL text is identical across calls — "SELECT * FROM tasks WHERE id = ?;" is the same string whether you fetch task 1 or task 1000, with the id supplied as a bound parameter. The cache keys on that stable text and finds a hit every time.

If you had (wrongly) interpolated values into the SQL — "SELECT * FROM tasks WHERE id = \(id);" — every call would produce a different SQL string, the cache would miss every time, and you’d compile afresh constantly (on top of the injection vulnerability). So parameterization isn’t just security and correctness; it’s what makes caching possible. The same discipline pays off three ways.

Cache invalidation and schema changes

One correctness concern: a prepared statement is compiled against the schema as it was at prepare time. If you change the schema — run a migration, drop a column, alter a table — cached statements compiled against the old schema can become stale or invalid. sqlite3_prepare_v2 (the v2 variant) handles many schema changes gracefully by automatically recompiling on the next step when it detects a schema change, returning the statement transparently re-prepared. But to be safe, clear the statement cache after applying migrations or any DDL:

extension Connection {
    func clearStatementCache() {
        statementCache.removeAll()   // Statements deinit → finalize
    }
}

// After migrations:
try db.runMigrations(migrations)
db.clearStatementCache()

In practice, migrations run once at startup before the cache fills, so this is rarely an issue — but if you alter the schema at runtime, clear the cache afterward. The _v2 prepare’s auto-recompilation is a safety net, not a license to ignore the concern entirely.

Bounding the cache

A naive cache grows without limit — every distinct SQL string ever run stays compiled forever. For an app with a fixed, modest set of queries (the common case), that’s fine; the cache plateaus at a small size. But if you generate many distinct SQL strings (e.g., dynamically built queries with varying structure), an unbounded cache leaks memory and holds many compiled statements. For that case, bound the cache with an LRU eviction policy — keep the N most-recently-used statements, finalize the rest:

// Sketch: track access order, evict least-recently-used when over capacity.
// On eviction, remove from the dictionary so the Statement deinits and finalizes.

Most apps don’t need this — they run a small, stable set of parameterized queries and the cache naturally stays small. Reach for LRU bounding only if you profile a growing statement cache, which signals you’re generating too many distinct SQL strings (often a hint to parameterize more or restructure dynamic query building).

Caching and concurrency

The statement cache is per-connection mutable state, so it inherits the connection’s concurrency rules (Section 20). Under the single-serialized-connection patterns (a queue or an actor owning the connection), the cache is accessed one-at-a-time and is safe. If you use multiple connections, each has its own cache — statements are connection-specific and cannot be shared across connections (a Statement belongs to the Connection that prepared it). Never share one statement cache across connections; the cache lives and dies with its connection.

Statement caching pitfalls

Recompiling the same SQL every call. Wastes compilation cost on hot queries. Cache compiled statements keyed by SQL text.

Interpolating values, defeating the cache. Distinct SQL strings per call means zero cache hits (and injection). Parameterize so the SQL text is stable.

Finalizing a cached statement out from under the cache. Double-free or use-after-finalize. Let the cache own statement lifetimes; clear the cache to release them.

Sharing statements or a cache across connections. A statement belongs to its connection. Each connection has its own cache.

Unbounded cache with dynamically generated SQL. Memory growth. Bound with LRU if you generate many distinct queries (or, better, generate fewer).

Stale statements after a runtime schema change. Clear the cache after migrations/DDL; rely on _v2 auto-recompilation only as a backstop.

What to internalize

Compiling SQL is expensive, so cache compiled statements keyed by their SQL text and reuse them via sqlite3_reset (rewind + rebind) instead of recompiling. Build the cache into Connection: a [String: Statement] dictionary, with run/query fetching a reset cached statement rather than preparing a fresh one, and the cache released on connection deinit so statements finalize before the close. Caching works precisely because you parameterize — stable SQL text means consistent cache hits — so the bind-don’t-interpolate discipline pays off in security, correctness, and now performance. Clear the cache after runtime schema changes (migrations at startup rarely need it), and bound the cache with LRU only if dynamic query generation makes it grow. The cache is per-connection state subject to the connection’s concurrency rules; never share statements or caches across connections.


24. Bulk Inserts and Performance Tuning

When you need to write thousands of rows — importing data, seeding a database, syncing a large payload — the difference between a naive approach and a tuned one is often two or three orders of magnitude. This section assembles every performance lever (transactions, statement reuse, pragmas) into the canonical fast bulk-write pattern and then covers the broader tuning pragmas worth knowing.

The naive approach and why it’s catastrophically slow

The obvious way to insert 10,000 rows is a loop that runs an insert each iteration. With the wrapper’s per-call statement (pre-caching) and no transaction, each insert is its own implicit transaction:

// SLOW: 10,000 implicit transactions, 10,000 disk syncs, 10,000 compilations
for item in tenThousandItems {
    try connection.run("INSERT INTO items (name, value) VALUES (?, ?);",
                       [SQLiteValue(item.name), SQLiteValue(item.value)])
}

This is slow for three compounding reasons: (1) each insert is wrapped in its own implicit transaction that commits — and committing means flushing to disk, the single most expensive operation, done 10,000 times; (2) without caching, the identical SQL recompiles 10,000 times; (3) per-call overhead accumulates. The disk syncs dominate — this loop can take seconds. On a real device with slower flash, even worse.

The fast pattern: one transaction, one statement, reset per row

The fix combines three levers: wrap all inserts in a single transaction (one disk sync instead of 10,000), reuse one prepared statement via reset (one compilation instead of 10,000), and bind per iteration:

func bulkInsert(_ items: [Item], into connection: Connection) throws {
    try connection.transaction {                       // (1) ONE transaction
        let statement = try connection.prepare(         // (2) ONE compilation
            "INSERT INTO items (name, value) VALUES (?, ?);"
        )
        for item in items {
            statement.reset()                           // rewind for reuse
            try statement.bind([SQLiteValue(item.name), SQLiteValue(item.value)])
            _ = try statement.step()                    // (3) bind + step per row
        }
        // statement finalizes when it goes out of scope at the end of the closure
    }
}

This is the canonical fast bulk insert. The single transaction means SQLite buffers all 10,000 inserts and flushes once at COMMIT. The reused statement means one compilation. The result is commonly hundreds of times faster than the naive loop — the same 10,000 inserts that took seconds now take tens of milliseconds. The transaction is the bigger win (eliminating 9,999 disk syncs), with statement reuse a meaningful secondary gain. If you internalize one performance pattern from this entire guide, make it this one.

(With the statement cache from Section 23 in place, the connection.run version inside a transaction also reuses the compiled statement automatically, so try connection.transaction { for item in items { try connection.run(...) } } is nearly as fast and more ergonomic. The explicit-statement version above makes the reuse visible and shaves the last bit of per-call overhead for the largest imports.)

Batching very large imports

For truly enormous imports (hundreds of thousands or millions of rows), holding a single transaction open for the entire import means the WAL (or rollback journal) grows to hold all the uncommitted changes, consuming memory and disk. The refinement is to commit in batches — say, every 10,000 rows — balancing the per-transaction sync savings against unbounded journal growth:

func bulkInsertBatched(_ items: [Item], batchSize: Int = 10_000, into connection: Connection) throws {
    var index = 0
    while index < items.count {
        let end = min(index + batchSize, items.count)
        let batch = items[index..<end]
        try connection.transaction {
            let statement = try connection.prepare("INSERT INTO items (name, value) VALUES (?, ?);")
            for item in batch {
                statement.reset()
                try statement.bind([SQLiteValue(item.name), SQLiteValue(item.value)])
                _ = try statement.step()
            }
        }
        index = end
    }
}

Each batch is one transaction (one sync), but the journal only ever holds one batch’s worth of changes. For most app-scale imports (thousands, not millions), a single transaction is fine and simpler; batch only when the import is large enough that journal growth matters. The batch size is a tunable — larger batches amortize sync cost better but use more memory; 10,000 is a reasonable starting point.

Performance pragmas

Beyond transactions and statement reuse, several pragmas tune SQLite’s behavior. The ones worth knowing:

PRAGMA synchronous controls how aggressively SQLite flushes to disk for durability:

sqlite3_exec(db, "PRAGMA synchronous = NORMAL;", nil, nil, nil)

FULL (the default outside WAL) flushes at every critical moment — safest, slowest. NORMAL flushes less often and, in WAL mode, is durable across app crashes (you can only lose data on an OS crash or power loss, not an app crash), while being significantly faster. OFF skips syncs entirely — fastest but risks corruption on power loss. For a WAL-mode app database, NORMAL is the standard choice: a great speed gain with crash-safety preserved. This is why configure() (Section 11) set it.

PRAGMA cache_size sets how many database pages SQLite keeps in memory:

sqlite3_exec(db, "PRAGMA cache_size = -8000;", nil, nil, nil)  // negative = KB; -8000 = 8MB

A larger page cache means fewer disk reads for hot data. A negative value specifies the cache in kibibytes (so -8000 is roughly 8 MB) rather than a page count. Bumping this up helps read-heavy workloads at the cost of memory; tune it to your app’s data size and the device’s memory budget.

PRAGMA temp_store controls where temporary tables and indices (e.g., from sorts) live:

sqlite3_exec(db, "PRAGMA temp_store = MEMORY;", nil, nil, nil)

MEMORY keeps temporary structures in RAM instead of on disk — faster for queries that build temp b-trees (Section 16), at the cost of memory. Reasonable for an app where temp structures are modest.

PRAGMA mmap_size enables memory-mapped I/O for reads:

sqlite3_exec(db, "PRAGMA mmap_size = 268435456;", nil, nil, nil)  // 256MB

Memory-mapping lets SQLite read the database via the OS page cache, avoiding some copy overhead. It can speed up read-heavy workloads. The value caps how much of the database is mapped. Benefits vary by workload and device; measure before committing to it.

Measure, don’t assume

Every pragma above is a tradeoff, and the right values depend on your data, your access patterns, and the device. The discipline is to measure. Wrap an operation in a timer, run it on a real device (the simulator’s disk performance is unrepresentative), try a pragma change, and compare:

func measure(_ label: String, _ work: () throws -> Void) rethrows {
    let start = DispatchTime.now()
    try work()
    let elapsed = Double(DispatchTime.now().uptimeNanoseconds - start.uptimeNanoseconds) / 1_000_000
    print("\(label): \(elapsed) ms")
}

try measure("bulk insert 10k") {
    try bulkInsert(tenThousandItems, into: connection)
}

Real-device measurement is essential because flash storage performance, memory pressure, and thermal state all differ from the simulator and from one device to another. Tune based on measurements of your actual workload, not folklore. The transaction-plus-statement-reuse pattern is a near-universal win; the pragmas are situational and deserve measurement.

Other write-performance considerations

A few more levers worth knowing. Drop and recreate indexes around a massive import: maintaining indexes during a bulk insert means updating them per row; for a one-time large import into an empty or near-empty table, dropping the indexes, inserting, then recreating them can be faster (the index builds once at the end). Only worth it for large imports. Insert in primary-key order when possible — inserting rows roughly in INTEGER PRIMARY KEY order is more cache-friendly than random order. Avoid unnecessary work in the loop — compute timestamps and other constants once before the loop, not per iteration (a small but free gain).

Bulk insert and tuning pitfalls

Inserting in a loop without a transaction. Thousands of disk syncs; catastrophically slow. Always wrap bulk writes in a transaction.

Recompiling the statement per row. Wasteful. Reuse one statement with reset (or rely on the statement cache).

One giant transaction for a million rows. Unbounded journal/WAL growth and memory. Batch into transactions of ~10,000.

Tuning pragmas by folklore. The right values are workload- and device-specific. Measure on a real device before and after.

synchronous = OFF in production. Fast but risks corruption on power loss. NORMAL under WAL is the safe-and-fast choice.

Measuring only on the simulator. Simulator disk performance is unrepresentative. Profile on real devices.

What to internalize

The single most important write-performance pattern is one transaction wrapping a reused prepared statement (reset + rebind + step per row) — it eliminates thousands of disk syncs and recompilations, commonly yielding hundreds-fold speedups; the transaction is the bigger half. For very large imports, batch into transactions of ~10,000 rows to cap journal growth. Tune with pragmas — synchronous = NORMAL (fast and crash-safe under WAL), a larger cache_size, temp_store = MEMORY, and optionally mmap_size — but treat them as measured tradeoffs, not defaults, and profile on real devices since flash performance varies. For massive imports, consider dropping/recreating indexes and inserting in primary-key order. The reliable universal win is transaction + statement reuse; everything else is situational tuning you justify with measurement.


25. Full-Text Search with FTS5

When users need to search text — notes, messages, documents, any free-form content — a plain WHERE title LIKE '%query%' is the wrong tool: it scans every row, can’t use an index, and offers no relevance ranking. SQLite’s FTS5 extension is purpose-built for this: it indexes text for fast, ranked, full-text search. Apple’s SQLite build includes FTS5, so it’s available on iOS with no extra setup. This section covers creating an FTS5 table, querying it, ranking results, and keeping it in sync with your data.

Why LIKE isn’t enough

SELECT * FROM notes WHERE body LIKE '%budget%' works for tiny datasets but fails at scale. The leading wildcard %budget% means no index can help — SQLite scans every row and substring-matches. It also can’t match word stems, can’t rank by relevance, can’t search multiple terms intelligently, and treats text as opaque bytes. For real search over a meaningful corpus, you want an inverted index that maps each word to the rows containing it, which is exactly what FTS5 builds.

Creating an FTS5 virtual table

FTS5 tables are “virtual tables” — backed by the extension rather than a plain b-tree. You create one with CREATE VIRTUAL TABLE ... USING fts5:

let sql = """
CREATE VIRTUAL TABLE notes_fts USING fts5(
    title,
    body,
    tokenize = 'unicode61'
);
"""
sqlite3_exec(db, sql, nil, nil, nil)

This creates a full-text index over title and body. Every column in an FTS5 table is a searchable text column (FTS5 doesn’t have typed columns like a normal table — everything is text to be indexed). The tokenize option selects how text is split into tokens; unicode61 is a good general-purpose tokenizer that handles Unicode word boundaries and case-folding. Other tokenizers include porter (which stems words, so “running” matches “run”) and ascii. For most apps, unicode61 (optionally with porter layered on) is the right starting point.

Querying with MATCH

You search an FTS5 table with the MATCH operator, which is the FTS5 query interface:

let sql = "SELECT title, body FROM notes_fts WHERE notes_fts MATCH ?;"
let results = try connection.query(sql, [SQLiteValue("budget")]) { row in
    (row.string("title") ?? "", row.string("body") ?? "")
}

WHERE notes_fts MATCH 'budget' finds all rows whose indexed text contains the term “budget,” using the inverted index for speed. The query term is bound as a parameter (still no interpolation — the FTS query string is data). FTS5’s query syntax is rich:

"budget"                  // rows containing "budget"
"budget report"           // rows containing BOTH "budget" AND "report" (implicit AND)
"budget OR forecast"      // rows containing either
"budget NOT draft"        // "budget" but not "draft"
"\"quarterly budget\""    // the exact phrase "quarterly budget"
"budg*"                   // prefix search: budget, budgeting, budgetary...
"title:budget"            // restrict the term to the title column

These operators (AND implicit, OR, NOT, phrase quoting, * prefix, column: scoping) give users real search power. Note prefix search (budg*) is fast in FTS5 because of how the index is structured, unlike LIKE 'budg%' on a normal column.

Ranking results by relevance

The point of full-text search is showing the best matches first. FTS5 has a built-in bm25 ranking function (BM25 is a standard relevance algorithm) that scores each match — lower scores are better matches in FTS5’s convention:

let sql = """
SELECT title, body, bm25(notes_fts) AS rank
FROM notes_fts
WHERE notes_fts MATCH ?
ORDER BY rank
LIMIT 20;
"""
let ranked = try connection.query(sql, [SQLiteValue(userQuery)]) { row in
    (row.string("title") ?? "", row.double("rank"))
}

bm25(notes_fts) computes a relevance score per matching row; ORDER BY rank sorts best-first (FTS5’s bm25 returns more-negative values for better matches, so plain ascending order puts the best first). You can even weight columns — bm25(notes_fts, 10.0, 1.0) weights the first column (title) ten times more than the second (body), so a match in the title ranks higher than a match in the body. This is how you build search that feels intelligent: a query matching a note’s title surfaces above one merely mentioning the term in the body.

Highlighting matches

FTS5 can return snippets with the matched terms highlighted — useful for showing search results with context:

let sql = """
SELECT title, snippet(notes_fts, 1, '[', ']', '...', 20) AS excerpt
FROM notes_fts
WHERE notes_fts MATCH ?
ORDER BY bm25(notes_fts)
LIMIT 20;
"""

snippet(table, columnIndex, startMark, endMark, ellipsis, tokenCount) returns a fragment of the matched column around the match, wrapping matched terms in your markers (here [ and ]) and using ... for elision, up to ~20 tokens. There’s also highlight() for highlighting within the full text. These let you show users why a result matched, with the matching words marked — a hallmark of polished search UI.

Keeping FTS in sync: external content tables

Here’s the central design question: FTS5 stores its own copy of the indexed text, separate from your real notes table. If you store text in both your notes table and the FTS5 table, you’re duplicating data and must keep them synchronized. FTS5’s solution is the external content table — the FTS index references your real table for the content, storing only the index, not a copy:

// Your real table
CREATE TABLE notes (id INTEGER PRIMARY KEY, title TEXT, body TEXT, created_at INTEGER);

// FTS index that references the real table as its content source
CREATE VIRTUAL TABLE notes_fts USING fts5(
    title, body,
    content = 'notes',        // external content: read from the notes table
    content_rowid = 'id'      // notes.id is the rowid linking the two
);

With content = 'notes', the FTS5 table doesn’t duplicate the text — it indexes notes and reads content from there. But now you are responsible for updating the index when notes changes, because FTS5 won’t see your inserts/updates/deletes automatically. The standard mechanism is triggers that mirror changes into the FTS index:

let triggers = """
CREATE TRIGGER notes_ai AFTER INSERT ON notes BEGIN
    INSERT INTO notes_fts(rowid, title, body) VALUES (new.id, new.title, new.body);
END;
CREATE TRIGGER notes_ad AFTER DELETE ON notes BEGIN
    INSERT INTO notes_fts(notes_fts, rowid, title, body) VALUES('delete', old.id, old.title, old.body);
END;
CREATE TRIGGER notes_au AFTER UPDATE ON notes BEGIN
    INSERT INTO notes_fts(notes_fts, rowid, title, body) VALUES('delete', old.id, old.title, old.body);
    INSERT INTO notes_fts(rowid, title, body) VALUES (new.id, new.title, new.body);
END;
"""
sqlite3_exec(db, triggers, nil, nil, nil)

These triggers fire automatically on changes to notes: an insert adds to the index, a delete removes from it (the special 'delete' command), and an update does both. Now your code just writes to notes as normal, and the FTS index stays current transparently. This trigger-based external-content pattern is the recommended way to add full-text search to an existing table without duplicating its data. Set up the triggers in a migration alongside the FTS table.

Searching and joining back to the real table

With an external content table, a search typically joins the FTS results back to the real table to fetch full rows:

let sql = """
SELECT n.id, n.title, n.created_at, bm25(notes_fts) AS rank
FROM notes_fts
JOIN notes n ON n.id = notes_fts.rowid
WHERE notes_fts MATCH ?
ORDER BY rank
LIMIT 20;
"""

The FTS table provides the match and rank; the join to notes (on notes.id = notes_fts.rowid) fetches whatever columns you need from the real row, including non-indexed ones like created_at. This is the typical shape: FTS finds and ranks, the join hydrates the results.

Rebuilding the index

If the FTS index ever gets out of sync (a bug in your triggers, a bulk operation that bypassed them, or initial population for an existing table), you can rebuild it from the content table:

sqlite3_exec(db, "INSERT INTO notes_fts(notes_fts) VALUES('rebuild');", nil, nil, nil)

The special 'rebuild' command re-indexes everything from the external content table. You’ll run this once when first adding FTS to a table that already has data (the triggers only catch future changes), and occasionally as a repair. It’s also how you’d reindex after changing tokenizer settings.

FTS5 pitfalls

Using LIKE '%term%' for real search. Full scan, no ranking, no stemming. Use FTS5 for meaningful text search.

Duplicating content in both the real table and the FTS table. Wasted space and sync burden. Use an external content table with triggers.

Forgetting the sync triggers (or initial rebuild). The index goes stale — searches miss recent data. Set up insert/update/delete triggers and run 'rebuild' for pre-existing data.

Interpolating the user’s query into the SQL. Still injection-prone, and FTS query syntax has its own special characters. Bind the query as a parameter.

Not ranking results. Unranked full-text results feel random. Use bm25 (optionally column-weighted) and ORDER BY rank.

Assuming FTS5 is unavailable on iOS. Apple’s build includes it. You don’t need to bundle your own SQLite just for FTS5 (unlike SQLCipher, next-but-one section).

What to internalize

FTS5 is SQLite’s full-text search extension — included in Apple’s iOS build — and the right tool whenever users search free-form text, replacing slow, unrankable LIKE '%...%' scans with a fast inverted index. Create a virtual table with USING fts5(...) and a tokenizer like unicode61; query with the MATCH operator (which supports AND/OR/NOT, phrases, prefix *, and column scoping) and bind the user’s query as a parameter. Rank with bm25 (column-weightable) and ORDER BY rank for relevance, and use snippet/highlight to show matched context. Crucially, use an external content table (content = 'your_table') plus insert/update/delete triggers so the index references your real data instead of duplicating it and stays in sync automatically; run the 'rebuild' command to index pre-existing data. Search by matching in the FTS table and joining back to the real table to hydrate full rows.


26. The Backup API and Database Copying

Sometimes you need to copy an entire database: back it up to a safe location, snapshot an in-memory database to disk (or vice versa), seed a fresh database from one bundled in your app, or export a consistent copy while the database is in use. A naive file copy is wrong in WAL mode and unsafe while the database is open. SQLite’s online backup API does it correctly — even on a live database. This section covers the backup API and the related task of seeding from a bundled database.

Why not just copy the file?

The tempting approach — FileManager.copyItem(at:to:) on app.sqlite — is wrong for two reasons. First, in WAL mode (Section 21) the current state is split across app.sqlite and app.sqlite-wal; copying only the main file gives you a stale database missing everything in the WAL. You’d have to copy all the companion files, and even then only safely when no writes are in flight. Second, copying a file that’s open and being written produces a torn, inconsistent copy — you might capture the file mid-transaction. The file copy can be made to work (checkpoint fully, ensure no writers, copy all files), but it’s fragile. The backup API sidesteps all of it.

The online backup API

SQLite’s backup API copies one open database to another, page by page, producing a consistent copy even if the source is being modified during the backup. The three functions:

func backup(from source: OpaquePointer?, to destination: OpaquePointer?) throws {
    // Initialize a backup from source's "main" database to destination's "main" database
    guard let backup = sqlite3_backup_init(destination, "main", source, "main") else {
        throw SQLiteError.from(db: destination, code: sqlite3_errcode(destination))
    }

    // Copy all pages in one step (-1 means "all remaining pages")
    let rc = sqlite3_backup_step(backup, -1)

    // Finish: release the backup object
    sqlite3_backup_finish(backup)

    guard rc == SQLITE_DONE else {
        throw SQLiteError.from(db: destination, code: rc)
    }
}

sqlite3_backup_init(dest, "main", source, "main") sets up a backup from the source connection’s main database to the destination connection’s main database (the "main" strings name the database — "main" is the primary; attached databases have other names). sqlite3_backup_step(backup, -1) copies pages; -1 copies all remaining pages in one call (you can pass a smaller number to copy incrementally — see below). sqlite3_backup_finish releases the backup handle. A SQLITE_DONE from the step means the copy completed.

Both source and destination are open Connections. To back up your live database to a file:

func backupDatabase(_ source: Connection, to url: URL) throws {
    let destination = try Connection(url: url)   // opens/creates the backup file
    try backup(from: source.handle, to: destination.handle)
    // destination's deinit closes the backup file
}

This produces a consistent, complete copy of the source at url, handling WAL correctly, even if the source is being read or written during the backup. It’s the right way to make a backup.

Incremental backup for large databases

Copying all pages at once (step(backup, -1)) holds a read lock on the source for the duration, which for a large database could block writers noticeably. For large databases or when you want to avoid a long lock, copy in chunks, yielding between them:

func incrementalBackup(from source: OpaquePointer?, to destination: OpaquePointer?, pagesPerStep: Int32 = 100) throws {
    guard let backup = sqlite3_backup_init(destination, "main", source, "main") else {
        throw SQLiteError.from(db: destination, code: sqlite3_errcode(destination))
    }
    defer { sqlite3_backup_finish(backup) }

    var rc: Int32 = SQLITE_OK
    repeat {
        rc = sqlite3_backup_step(backup, pagesPerStep)   // copy a chunk of pages
        // Between chunks, the source lock is released, letting writers proceed.
        // sqlite3_backup_remaining(backup) and sqlite3_backup_pagecount(backup)
        // give progress for a progress bar.
        if rc == SQLITE_BUSY || rc == SQLITE_LOCKED {
            usleep(50_000)   // brief pause, then continue
        }
    } while rc == SQLITE_OK || rc == SQLITE_BUSY || rc == SQLITE_LOCKED

    guard rc == SQLITE_DONE else {
        throw SQLiteError.from(db: destination, code: rc)
    }
}

Stepping a chunk at a time releases the source’s lock between chunks, so writers aren’t blocked for the whole backup. sqlite3_backup_remaining and sqlite3_backup_pagecount report progress (remaining pages and total), perfect for a progress indicator on a large backup. If the source is modified mid-backup, SQLite automatically restarts the affected portions to keep the copy consistent. For most app databases (modest size), the one-shot step(backup, -1) is simpler and fine; reach for incremental backup when the database is large enough that a long source lock matters.

Snapshotting an in-memory database to disk

The backup API works between any two connections, including in-memory ones. A useful pattern: build or load a database in memory (fast), then snapshot it to disk:

let memory = try Connection(inMemory: true)
// ... build up the in-memory database with lots of fast writes ...
let disk = try Connection(url: databaseURL())
try backup(from: memory.handle, to: disk.handle)   // persist the in-memory db to a file

And the reverse — load a disk database entirely into memory for blazing-fast read-only access:

let disk = try Connection(url: databaseURL(), readonly: true)
let memory = try Connection(inMemory: true)
try backup(from: disk.handle, to: memory.handle)    // load disk db into memory
// now query `memory` with no disk I/O

These are handy for specialized workloads: building a database with heavy writes in memory then persisting once, or loading a read-only reference database into memory for the fastest possible queries.

Seeding from a bundled database

A common iOS need: ship a prepopulated database in your app bundle (reference data, initial content) and copy it into the writable container on first launch. Because the bundled database is read-only (the app bundle is read-only) and you need a writable copy, you copy it out on first run:

func seedDatabaseIfNeeded() throws {
    let destinationURL = try databaseURL()
    let fileManager = FileManager.default

    // Only seed if the database doesn't exist yet
    guard !fileManager.fileExists(atPath: destinationURL.path) else { return }

    guard let bundledURL = Bundle.main.url(forResource: "seed", withExtension: "sqlite") else {
        throw SQLiteError.open(message: "bundled seed database not found")
    }

    // For a simple, not-in-use bundled file, a file copy is acceptable here:
    try fileManager.copyItem(at: bundledURL, to: destinationURL)
}

For seeding specifically, a plain file copy is acceptable — the bundled database isn’t open or being written, and you’d ship it checkpointed (WAL empty) so there are no companion files to worry about. The backup API is the tool for live databases; a bundled seed file is static, so the simple copy works. Run seedDatabaseIfNeeded() before opening the connection, guarded by a file-existence check so it only happens once. After seeding, open the copied database normally and run migrations on top of it (the seed should be at a known schema version).

Backup pitfalls

Naive file copy of a WAL database. Misses uncommitted-to-main WAL contents → stale copy. Use the backup API, or checkpoint fully and copy all files.

Copying an open, actively-written database file. Torn, inconsistent copy. The backup API copies consistently even on a live database.

One-shot backup locking a huge source too long. Blocks writers for the whole copy. Use incremental backup (chunked steps) for large databases.

Seeding by copying without a first-run guard. Overwrites the user’s data on every launch. Guard with a file-existence check so seeding happens once.

Forgetting the bundled seed must be checkpointed/single-file. Shipping a seed with a stray -wal complicates the copy. Ship a clean, checkpointed seed file.

Not running migrations after seeding. The seed is at some schema version; your migration runner should bring it to current. Seed, then migrate.

What to internalize

To copy an entire database, use SQLite’s online backup API, not a file copy — a file copy misses WAL contents and can capture a torn, inconsistent state. sqlite3_backup_init / sqlite3_backup_step / sqlite3_backup_finish copy one open connection’s database to another page by page, producing a consistent copy even while the source is live; step(backup, -1) copies everything at once (fine for modest databases), while chunked stepping with progress via backup_remaining/backup_pagecount avoids holding a long source lock on large ones. The API works between any connections, enabling in-memory↔disk snapshots for specialized fast-write or fast-read workloads. For seeding a bundled prepopulated database into the writable container, a simple guarded first-run file copy is acceptable (the bundled file is static and single), after which you open it and run migrations to bring it current.


27. Blobs, Incremental I/O, and Large Data

Sometimes you need to store binary data — an image thumbnail, a small audio clip, an encoded payload, a file’s contents. SQLite stores binary as BLOBs, and for large blobs it offers incremental I/O that reads and writes pieces without loading the whole thing into memory. But there’s also a more fundamental question: should large binary data live in the database at all? This section covers blob storage, the incremental blob API, and the file-versus-blob decision.

The file-versus-blob decision

Before reaching for blobs, decide whether the data belongs in the database. The well-known guidance, supported by SQLite’s own research: small blobs (under ~100 KB) are often faster read from the database than from individual files; large blobs are generally better stored as files on disk with only their path (or filename) in the database. The reasoning: for small data, the per-file open/read/close overhead exceeds the cost of reading from SQLite’s already-open page cache; for large data, files avoid bloating the database, keep the database’s page cache effective for actual structured data, and let the OS file cache do its job.

So the practical rule for iOS:

  • Small binary data (thumbnails, icons, short encoded values, modest payloads): store as a BLOB in the database. It’s convenient, transactional (the blob commits atomically with related rows), and fast at small sizes.
  • Large binary data (full-resolution images, video, large documents): store as a file in the app container (e.g., a subdirectory of Application Support or Caches), and store the filename or relative path in the database as TEXT. The database row references the file; the file lives on disk.

This split keeps the database lean and fast for its core job — structured, queryable data — while binary bulk lives where binary bulk belongs. A photos app stores image files and indexes their metadata (path, date, dimensions) in SQLite; it does not stuff multi-megabyte images into the database.

Storing and reading small blobs

For data you do store as a blob, binding and reading are as covered in Sections 6 and 7 — sqlite3_bind_blob with withUnsafeBytes and SQLITE_TRANSIENT, reading with sqlite3_column_blob + sqlite3_column_bytes into Data. Through the wrapper, a blob is just a SQLiteValue.blob(Data):

// Store a thumbnail
let thumbnail: Data = makeThumbnail(from: image)   // small, e.g. 20KB
try connection.run(
    "INSERT INTO photos (caption, thumbnail, created_at) VALUES (?, ?, ?);",
    [SQLiteValue("Sunset"), SQLiteValue(thumbnail), SQLiteValue(Date())]
)

// Read it back
let photo = try connection.queryOne(
    "SELECT caption, thumbnail FROM photos WHERE id = ?;",
    [SQLiteValue(photoId)]
) { row in
    (caption: row.string("caption") ?? "", thumbnail: row.data("thumbnail"))
}

The Row.data accessor handles the pointer-and-length read and NULL case (Section 12). For small blobs this whole-value read is exactly right — you want the entire thumbnail in memory anyway. The blob commits transactionally with the rest of the row, so there’s no risk of a row referencing a thumbnail that didn’t save.

The incremental blob I/O API

For a large blob that you nonetheless want in the database (occasionally justified), reading or writing the entire thing as one Data means holding the whole blob in memory at once — wasteful for, say, a multi-megabyte value you only need to read a slice of. SQLite’s incremental blob API opens a blob as a stream you read or write in pieces:

func readBlobIncrementally(db: OpaquePointer?, table: String, column: String, rowid: Int64) throws -> Data {
    var blob: OpaquePointer?
    // Open the blob for reading (0 = readonly; 1 would be read/write)
    let rc = sqlite3_blob_open(db, "main", table, column, rowid, 0, &blob)
    guard rc == SQLITE_OK else {
        throw SQLiteError.from(db: db, code: rc)
    }
    defer { sqlite3_blob_close(blob) }

    let totalSize = Int(sqlite3_blob_bytes(blob))
    var result = Data(count: totalSize)
    let chunkSize = 64 * 1024   // read 64KB at a time

    var offset = 0
    while offset < totalSize {
        let thisChunk = min(chunkSize, totalSize - offset)
        let readRC = result.withUnsafeMutableBytes { buffer in
            sqlite3_blob_read(blob, buffer.baseAddress!.advanced(by: offset), Int32(thisChunk), Int32(offset))
        }
        guard readRC == SQLITE_OK else {
            throw SQLiteError.from(db: db, code: readRC)
        }
        offset += thisChunk
    }
    return result
}

sqlite3_blob_open opens a handle to a specific blob (identified by table, column, and rowid). sqlite3_blob_bytes gives its size. sqlite3_blob_read(blob, buffer, count, offset) reads count bytes starting at offset into your buffer — letting you read in chunks rather than all at once. sqlite3_blob_write similarly writes a slice (the blob must already be the right size — incremental I/O can’t grow a blob, only overwrite within its existing size). sqlite3_blob_close releases the handle.

The value of incremental I/O is reading or writing part of a large blob, or streaming it in bounded-memory chunks rather than materializing the whole thing. If you need only the first kilobyte of a large blob (a header, say), incremental read fetches just that. In practice, for iOS apps this API is rarely needed — it’s a tool for the unusual case of large blobs in the database that you process in pieces. The far more common answer for large binary data is the file-on-disk approach above, which avoids the question entirely.

Practical pattern: files on disk, metadata in SQLite

The recommended pattern for media-heavy apps, spelled out:

// Schema: store the file reference, not the bytes
// CREATE TABLE photos (id INTEGER PRIMARY KEY, caption TEXT, filename TEXT NOT NULL,
//                      thumbnail BLOB, width INTEGER, height INTEGER, created_at INTEGER);

func savePhoto(image: Data, caption: String, connection: Connection) throws -> Int64 {
    // 1. Write the large image to a file in the container
    let filename = UUID().uuidString + ".jpg"
    let fileURL = try imagesDirectory().appendingPathComponent(filename)
    try image.write(to: fileURL)

    // 2. Store a small thumbnail as a blob, plus metadata and the filename, in the database
    let thumbnail = makeThumbnail(from: image)   // small
    return try connection.run(
        "INSERT INTO photos (caption, filename, thumbnail, created_at) VALUES (?, ?, ?, ?);",
        [SQLiteValue(caption), SQLiteValue(filename), SQLiteValue(thumbnail), SQLiteValue(Date())]
    )
}

func loadFullImage(for photoId: Int64, connection: Connection) throws -> Data? {
    guard let filename = try connection.queryOne(
        "SELECT filename FROM photos WHERE id = ?;", [SQLiteValue(photoId)]
    ) ({ $0.string("filename") }) ?? nil else { return nil }
    let fileURL = try imagesDirectory().appendingPathComponent(filename)
    return try? Data(contentsOf: fileURL)
}

The large image goes to a file; the database stores the filename, a small thumbnail blob (for fast list rendering without touching the file system), and queryable metadata. Lists render from the thumbnails and metadata (all in the database, fast); opening a photo loads the full file on demand. This gives you the best of both: a lean, fast, fully-queryable database, and binary bulk stored efficiently as files.

Consistency between files and rows

The one cost of the files-on-disk approach is that the database and file system can drift out of sync — a row references a deleted file, or a file is orphaned with no row. Mitigations: delete the file in the same operation that deletes the row (and consider doing the row delete in a transaction, then the file delete after commit, so a failed commit doesn’t delete a file you still reference); run a periodic reconciliation that finds orphaned files (files with no referencing row) and dangling references (rows whose file is missing). It’s a manageable cost, and far preferable to the database bloat of storing large blobs inline. Plan for the reconciliation rather than assuming perfect consistency.

Blob and large data pitfalls

Storing large media as inline blobs. Bloats the database, hurts its page cache, slows everything. Store large binary as files; keep the path in the database.

Storing tiny values as files. Per-file overhead exceeds database read cost for small data. Small blobs belong in the database.

Reading a huge blob as one Data. Materializes the whole thing in memory. Use incremental blob I/O to read in chunks (or, better, don’t store it inline).

Trying to grow a blob with incremental write. sqlite3_blob_write can only overwrite within the existing size. Size the blob first (e.g., insert a zeroblob of the right size), then write.

Forgetting file/row consistency. Deleting a row without its file (or vice versa) leaves orphans/dangles. Delete together and reconcile periodically.

Putting blob bytes where a thumbnail would do. Lists don’t need full-resolution data. Store a small thumbnail blob for list rendering and load the full file only on demand.

What to internalize

First decide whether binary data belongs in the database at all: store small blobs (under ~100 KB — thumbnails, icons, short payloads) inline as BLOBs (convenient, transactional, fast at small sizes), but store large binary (full images, video, big documents) as files in the app container with only the filename/path in the database, keeping the database lean and fully queryable. Small blobs read and write as whole Data values through the wrapper’s SQLiteValue.blob / Row.data. SQLite’s incremental blob I/O (sqlite3_blob_open/read/write/close) reads or writes large blobs in bounded-memory chunks, but it’s rarely needed on iOS because the files-on-disk pattern avoids the problem. The recommended media pattern is files on disk plus a small thumbnail blob and queryable metadata in the database — fast lists from the database, full files loaded on demand — at the manageable cost of reconciling file/row consistency.


28. Custom Functions and Collations

SQLite lets you extend its SQL with your own functions and sorting rules, written in Swift. A custom function can compute anything you can express in Swift and call it from SQL; a custom collation defines how text sorts and compares. These are advanced, occasionally invaluable tools — and they’re the place where bridging Swift closures into SQLite’s C callback API gets genuinely tricky, because C function pointers can’t capture Swift context directly. This section covers both the capability and the bridging technique.

Why custom functions

SQLite ships with many built-in functions (length, upper, abs, date, etc.), but sometimes you need logic SQLite doesn’t have: a custom distance calculation, a domain-specific transformation, a Swift-side formatting rule applied in a query. A custom function registers Swift code as a SQL function so you can use it in SELECT, WHERE, ORDER BY — anywhere a function is allowed. The function runs inside SQLite’s execution, per row, which can be far more efficient than fetching rows and post-processing in Swift.

The bridging challenge

sqlite3_create_function_v2 expects a C function pointer for the implementation — specifically an @convention(c) closure, which cannot capture Swift context (no captured variables, no self). But your function logic usually needs context (the Swift closure you actually want to run). The standard technique: pass your Swift logic as a heap-allocated context pointer through the API’s “application data” parameter, and inside the C callback, cast that pointer back to retrieve your closure. It’s the same context-pointer pattern that C APIs use everywhere; it just looks unusual in Swift.

Registering a scalar function

Here’s a complete example registering a custom function. We’ll register a levenshtein-style or, more simply, a custom reverse_string function to keep the logic clear and focus on the bridging:

// The Swift logic we want to expose, boxed so we can pass it through a C pointer.
final class FunctionBox {
    let implementation: ([SQLiteValue]) -> SQLiteValue
    init(_ implementation: @escaping ([SQLiteValue]) -> SQLiteValue) {
        self.implementation = implementation
    }
}

func registerFunction(
    _ db: OpaquePointer?,
    name: String,
    argCount: Int32,
    implementation: @escaping ([SQLiteValue]) -> SQLiteValue
) {
    // Box the closure and hand ownership to SQLite as the function's "app data".
    let box = FunctionBox(implementation)
    let boxPointer = Unmanaged.passRetained(box).toOpaque()

    sqlite3_create_function_v2(
        db,
        name,
        argCount,
        SQLITE_UTF8 | SQLITE_DETERMINISTIC,   // text encoding + "same input → same output" hint
        boxPointer,                            // app data: our boxed closure
        { (context, argc, argv) in             // the @convention(c) callback — no captures allowed
            // Recover our box from the app data.
            guard let raw = sqlite3_user_data(context) else { return }
            let box = Unmanaged<FunctionBox>.fromOpaque(raw).takeUnretainedValue()

            // Convert the C arguments into Swift SQLiteValues.
            var args: [SQLiteValue] = []
            if let argv = argv {
                for i in 0..<Int(argc) {
                    let value = argv[i]
                    switch sqlite3_value_type(value) {
                    case SQLITE_INTEGER: args.append(.integer(sqlite3_value_int64(value)))
                    case SQLITE_FLOAT:   args.append(.real(sqlite3_value_double(value)))
                    case SQLITE_TEXT:
                        let text = sqlite3_value_text(value).map { String(cString: $0) } ?? ""
                        args.append(.text(text))
                    case SQLITE_NULL:    args.append(.null)
                    default:             args.append(.null)
                    }
                }
            }

            // Run the Swift logic and report the result back to SQLite.
            let result = box.implementation(args)
            switch result {
            case .integer(let i): sqlite3_result_int64(context, i)
            case .real(let d):    sqlite3_result_double(context, d)
            case .text(let s):    sqlite3_result_text(context, s, -1, SQLITE_TRANSIENT)
            case .blob(let data):
                data.withUnsafeBytes { sqlite3_result_blob(context, $0.baseAddress, Int32(data.count), SQLITE_TRANSIENT) }
            case .null:           sqlite3_result_null(context)
            }
        },
        nil,   // step function (for aggregates only)
        nil,   // final function (for aggregates only)
        { (raw) in   // destructor for the app data — release our box when the function is unregistered
            if let raw = raw { Unmanaged<FunctionBox>.fromOpaque(raw).release() }
        }
    )
}

That’s a lot, so let’s trace the bridging carefully. We box the Swift closure in a FunctionBox class (so it has a stable heap address), then Unmanaged.passRetained(box).toOpaque() gives us a raw pointer that keeps the box alive and is handed to SQLite as the function’s app data. The implementation argument is an @convention(c) closure — it captures nothing, which is why it must recover the box from sqlite3_user_data(context) rather than closing over it. Inside, we convert the C sqlite3_value* arguments to our SQLiteValue enum, run the boxed Swift closure, and report the result with the sqlite3_result_* functions. Finally, the destructor (the last argument) releases the box when SQLite unregisters the function, balancing the passRetained. The SQLITE_DETERMINISTIC flag tells SQLite the function always returns the same output for the same input, allowing optimizations (omit it for non-deterministic functions like a random or time-based one).

Using the registered function

Once registered, the function is callable from SQL like any built-in:

registerFunction(db, name: "reverse_string", argCount: 1) { args in
    guard case .text(let s) = args.first else { return .null }
    return .text(String(s.reversed()))
}

// Now usable in queries:
let results = try connection.query("SELECT reverse_string(title) AS rev FROM tasks;") {
    $0.string("rev")
}

reverse_string(title) runs your Swift closure for each row’s title, inside the query. This is genuinely powerful — arbitrary Swift logic available in SQL — but use it judiciously: a custom function called per row can’t be optimized by the planner the way built-in operations can, and a slow Swift function makes the whole query slow. Reserve custom functions for logic that genuinely can’t be expressed in SQL and benefits from running close to the data.

Custom collations: defining sort order

A collation is the rule SQLite uses to compare and order text. The default (BINARY) compares byte-by-byte, which sorts ASCII correctly but mishandles case-insensitivity, locale-specific ordering, and accented characters. SQLite has built-in NOCASE (ASCII case-insensitive) and RTRIM, but for proper locale-aware or custom ordering you register your own collation:

func registerCollation(_ db: OpaquePointer?, name: String, compare: @escaping (String, String) -> ComparisonResult) {
    final class CollationBox {
        let compare: (String, String) -> ComparisonResult
        init(_ c: @escaping (String, String) -> ComparisonResult) { self.compare = c }
    }
    let box = CollationBox(compare)
    let boxPointer = Unmanaged.passRetained(box).toOpaque()

    sqlite3_create_collation_v2(
        db, name, SQLITE_UTF8, boxPointer,
        { (raw, len1, bytes1, len2, bytes2) -> Int32 in
            guard let raw = raw, let bytes1 = bytes1, let bytes2 = bytes2 else { return 0 }
            let box = Unmanaged<CollationBox>.fromOpaque(raw).takeUnretainedValue()
            let s1 = String(data: Data(bytes: bytes1, count: Int(len1)), encoding: .utf8) ?? ""
            let s2 = String(data: Data(bytes: bytes2, count: Int(len2)), encoding: .utf8) ?? ""
            switch box.compare(s1, s2) {
            case .orderedAscending:  return -1
            case .orderedSame:       return 0
            case .orderedDescending: return 1
            }
        },
        { (raw) in if let raw = raw { Unmanaged<CollationBox>.fromOpaque(raw).release() } }
    )
}

// Register a localized, case- and accent-insensitive collation:
registerCollation(db, name: "LOCALIZED") { a, b in
    a.compare(b, options: [.caseInsensitive, .diacriticInsensitive], range: nil, locale: .current)
}

The same bridging pattern: box the Swift comparison closure, pass it as app data, recover it in the @convention(c) callback, and translate Swift’s ComparisonResult to the -1/0/1 that SQLite expects. Now you can apply the collation in SQL:

// Sort using the localized collation
"SELECT * FROM contacts ORDER BY name COLLATE LOCALIZED;"
// Or attach it to a column at table creation so it's the default for that column:
"CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT COLLATE LOCALIZED);"

ORDER BY name COLLATE LOCALIZED sorts names the way a user expects in their locale — case- and accent-insensitive, locale-aware — rather than by raw bytes. This is the right way to get human-friendly text sorting, especially for non-English content where byte ordering is simply wrong. A collation attached at the column level also makes equality comparisons (WHERE name = ?) use that collation, enabling case-insensitive lookups.

When to use these (and when not)

Custom functions and collations are advanced features you’ll use rarely. Most apps never need them — built-in functions and NOCASE collation cover the common cases. Reach for a custom function when you have genuinely Swift-only logic that must run inside a query for efficiency, and a custom collation when you need locale-aware or domain-specific text ordering that the built-ins don’t provide (proper localized sorting is the most common real use). The bridging is intricate, so encapsulate it once in helpers like the above and reuse them. And remember the performance caveat: custom functions run per row and aren’t optimizable by the planner, so a heavy one can dominate a query’s cost.

Custom functions and collations pitfalls

Trying to capture Swift context in the @convention(c) callback. It can’t capture. Box the closure and pass it as app data; recover it via sqlite3_user_data.

Leaking the boxed closure. passRetained without a matching release leaks. Provide the destructor argument that releases the box.

Forgetting SQLITE_TRANSIENT for text/blob results. Same lifetime issue as binding. Use SQLITE_TRANSIENT when reporting text/blob results.

Marking a non-deterministic function SQLITE_DETERMINISTIC. The planner may cache results wrongly. Only flag genuinely deterministic functions.

Heavy custom functions in hot queries. They run per row, unoptimizable. Keep them light, or precompute.

Byte-ordering text that needs locale awareness. Default BINARY collation misorders accented/non-English text. Register a localized collation for human-friendly sorting.

What to internalize

SQLite lets you register Swift code as SQL functions (sqlite3_create_function_v2) and text sort rules as collations (sqlite3_create_collation_v2), callable directly from SQL. The central difficulty is bridging: the implementation must be an @convention(c) closure that can’t capture Swift context, so you box your Swift closure in a class, pass it as the function’s app data via Unmanaged.passRetained(...).toOpaque(), recover it inside the callback with sqlite3_user_data/fromOpaque, and release it in a destructor — converting arguments to/from your SQLiteValue enum and reporting results with sqlite3_result_*. Custom functions run per row and aren’t planner-optimizable, so reserve them for genuinely Swift-only logic that benefits from running in-query. The most common real use is a custom collation for locale-aware, case- and accent-insensitive text sorting (ORDER BY name COLLATE LOCALIZED), which byte-wise BINARY ordering gets wrong for non-English content. Encapsulate the intricate bridging once and reuse it; most apps need these rarely.


29. Encryption with SQLCipher

SQLite databases are, by default, plain files on disk — anyone with file access can read them with any SQLite tool. For sensitive data, you may want the database encrypted at rest. SQLite itself has no built-in open-source encryption, and critically, Apple’s bundled SQLite does not include encryption — so unlike FTS5 or the backup API, you cannot get database encryption from the system SQLite3 module. This section covers the options: SQLCipher (the standard encryption extension, which you must bundle yourself) and iOS’s own file Data Protection as an alternative.

What Apple’s SQLite does and doesn’t give you

The system SQLite3 module is Apple’s standard build. It includes FTS5, JSON1, the backup API, R-Trees, and most common features — but not the encryption hooks. The official SQLite Encryption Extension (SEE) is a paid commercial product from the SQLite authors, and SQLCipher is a widely-used open-source alternative — but neither is in Apple’s build. The consequence: if you want the database file itself encrypted via SQLite’s encryption mechanism, you must bundle your own build of SQLite with encryption compiled in, replacing the system module. You can’t import SQLite3 and get encryption.

SQLCipher: the standard open-source option

SQLCipher is an open-source extension to SQLite that transparently encrypts the entire database file with AES-256. To the application, an encrypted database looks and behaves exactly like a normal one — same C API, same SQL — except you provide a key after opening, and every page is encrypted on disk. It’s the de facto standard for encrypted SQLite on mobile.

Because it replaces SQLite, you add it as a dependency rather than using the system module. Via Swift Package Manager or CocoaPods you’d add SQLCipher (or a Swift wrapper that bundles it), and crucially you stop importing the system SQLite3 and instead link SQLCipher’s build. Some Swift SQLite libraries (including GRDB) offer a SQLCipher-enabled variant precisely so you don’t have to wire this up by hand — for an encrypted database, using such a library is often the pragmatic path even if you’d otherwise use the raw API.

Keying an encrypted database

With SQLCipher linked, the only API difference from normal SQLite is that you set a key immediately after opening, before any other operation, using PRAGMA key:

// After sqlite3_open_v2, before any other statement:
let key = retrieveEncryptionKey()   // from the Keychain — see below
let keyedPragma = "PRAGMA key = '\(key)';"   // SQLCipher reads the key from this pragma
sqlite3_exec(db, keyedPragma, nil, nil, nil)

// Now the database is unlocked for this connection; proceed normally.

PRAGMA key provides the passphrase that derives the encryption key. It must be the very first thing you do after opening — before any query, before setting other pragmas that read data. With the correct key, the database decrypts transparently for that connection; with a wrong key (or none), any read fails because the bytes are ciphertext. SQLCipher also supports PRAGMA rekey to change the key on an existing database and various tuning pragmas for the KDF iteration count and page size.

(Note this is one of the few legitimate uses of building SQL with a value — but the key should still be handled carefully; SQLCipher also supports passing the key as raw bytes via a blob form to avoid string handling of the secret. The string form is shown for clarity.)

Where the key comes from: the Keychain

Encryption is only as good as key management. The encryption key must not be hardcoded in your app (anyone can extract strings from a binary), must not live in UserDefaults (unencrypted), and must not be derived from something guessable. On iOS, the right place for the key is the Keychain, which is itself hardware-encrypted and access-controlled:

import Security

func storeEncryptionKey(_ key: Data) throws {
    let query: [String: Any] = [
        kSecClass as String: kSecClassGenericPassword,
        kSecAttrAccount as String: "database-encryption-key",
        kSecValueData as String: key,
        kSecAttrAccessible as String: kSecAttrAccessibleAfterFirstUnlockThisDeviceOnly
    ]
    SecItemDelete(query as CFDictionary)   // remove any existing
    let status = SecItemAdd(query as CFDictionary, nil)
    guard status == errSecSuccess else { throw KeychainError.storeFailed(status) }
}

func retrieveEncryptionKey() throws -> Data {
    let query: [String: Any] = [
        kSecClass as String: kSecClassGenericPassword,
        kSecAttrAccount as String: "database-encryption-key",
        kSecReturnData as String: true,
        kSecMatchLimit as String: kSecMatchLimitOne
    ]
    var result: AnyObject?
    let status = SecItemCopyMatching(query as CFDictionary, &result)
    guard status == errSecSuccess, let data = result as? Data else {
        throw KeychainError.retrieveFailed(status)
    }
    return data
}

The pattern: generate a strong random key once (e.g., 32 random bytes from SecRandomCopyBytes), store it in the Keychain on first launch, and retrieve it to key the database on each launch. The kSecAttrAccessibleAfterFirstUnlockThisDeviceOnly accessibility means the key is available after the first unlock following a boot and never leaves the device (not synced, not backed up to another device) — a sensible default for a database key. The Keychain’s hardware backing means even a jailbroken-device attacker faces real resistance. This Keychain-stored, randomly-generated key is the standard approach; deriving the key from a user password (with a strong KDF) is an alternative when you want the data inaccessible without the user’s secret.

The simpler alternative: iOS Data Protection

Before committing to SQLCipher’s complexity, consider whether iOS’s built-in file Data Protection suffices. iOS can encrypt files at rest automatically, tied to the device passcode, via file protection classes. You set a protection level on the database file and the OS encrypts it — no SQLCipher, no bundled SQLite, no key management of your own:

func enableDataProtection(on url: URL) throws {
    try FileManager.default.setAttributes(
        [.protectionKey: FileProtectionType.completeUntilFirstUserAuthentication],
        ofItemAtPath: url.path
    )
}

FileProtectionType levels: .complete (file unreadable whenever the device is locked), .completeUntilFirstUserAuthentication (readable after the first unlock post-boot — the usual choice for a database you access in the background), .completeUnlessOpen, and .none. With Data Protection, the file is encrypted by the OS using keys derived from the device passcode and hardware — so on a locked or powered-off device, the data is genuinely protected, and you wrote almost no code.

The tradeoffs versus SQLCipher: Data Protection is OS-managed (no bundled dependency, no manual keys, uses the system SQLite3), but it only protects against the device-locked/at-rest threat — once the device is unlocked, the file is readable to the app’s process (and to anything that can run as the app). It also requires the user to have a passcode set (no passcode means no real protection). SQLCipher encrypts regardless of device lock state and gives you control of the key, protecting against threats like a copied database file even on an unlocked device. For most apps, Data Protection (.completeUntilFirstUserAuthentication) is sufficient and dramatically simpler — it’s the right first choice. Reach for SQLCipher when you have a specific threat model that Data Protection doesn’t cover, or a compliance requirement for application-level encryption.

Encryption pitfalls

Expecting the system SQLite3 to encrypt. Apple’s build has no encryption. You must bundle SQLCipher (replacing the system module) for database-level encryption.

Hardcoding the key in the app or storing it in UserDefaults. Trivially extractable. Store keys in the Keychain.

Setting PRAGMA key after other statements. It must be the first operation after opening, before any read. Key first, then everything else.

Choosing SQLCipher when Data Protection would do. Adds a bundled dependency and key management for no benefit if the threat is only at-rest-on-a-locked-device. Try Data Protection first.

Relying on Data Protection without a device passcode. No passcode means the protection keys are weak/absent. Data Protection assumes the user has a passcode.

Forgetting that an unlocked device exposes Data Protection files to the app process. It guards at-rest, not in-use. If your threat model includes an unlocked-device attacker reading the file, you need SQLCipher.

What to internalize

Apple’s system SQLite3 module includes FTS5, the backup API, and JSON — but not encryption, so database-file encryption requires bundling your own SQLite build with SQLCipher (the standard open-source AES-256 extension), replacing the system module; many teams use a SQLCipher-enabled variant of a library like GRDB rather than wiring it by hand. With SQLCipher, the only API change is PRAGMA key as the very first operation after opening, with the key stored in the hardware-backed Keychain (randomly generated, ...ThisDeviceOnly accessibility), never hardcoded or in UserDefaults. But before adopting SQLCipher, consider iOS’s built-in file Data Protection (FileProtectionType.completeUntilFirstUserAuthentication), which encrypts the file at rest tied to the device passcode with almost no code and using the system module — sufficient for most apps’ at-rest threat. Choose SQLCipher only when your threat model needs application-level encryption that survives an unlocked device or a copied file.


30. Integrating SQLite with SwiftUI

A database layer only matters when it drives your UI. SwiftUI’s declarative, state-driven model wants data as observable Swift values that views render and that update reactively when the data changes. This section connects the raw-SQLite layer you’ve built to SwiftUI: keeping database work off the main thread, exposing data through @Observable, structuring the read/write flow, and handling change propagation. The patterns here align with the concurrency model from Section 20.

The shape of the problem

SwiftUI views are functions of state. When state changes, the view re-renders. So integrating SQLite means: load data from the database into observable state, render that state in views, and when the user acts (creating, editing, deleting), write to the database and update the state so the UI reflects the change. The two constraints that shape the design: database work must stay off the main thread (Section 20), and state that drives the UI must be updated on the main thread (SwiftUI requirement). The architecture has to bridge a background database and a main-thread UI cleanly.

An @Observable store backed by the database

The recommended structure is an @Observable store (iOS 17+, matching the modern preference for @Observable over ObservableObject) that owns access to the database, exposes the data as observable properties, and offers methods for loading and mutating. The store is @MainActor so its observable state is always touched on the main thread, while the actual database calls hop to a background-isolated database actor:

import Observation

@MainActor
@Observable
final class TaskStore {
    private(set) var tasks: [Task] = []
    private(set) var isLoading = false
    private(set) var errorMessage: String?

    private let database: Database   // the actor from Section 20

    init(database: Database) {
        self.database = database
    }

    func load() async {
        isLoading = true
        errorMessage = nil
        do {
            // Database work happens inside the actor (off the main thread);
            // the result is assigned back here on the main actor.
            tasks = try await database.fetchOpenTasks()
        } catch {
            errorMessage = "Could not load tasks."
        }
        isLoading = false
    }

    func add(title: String) async {
        do {
            _ = try await database.createTask(title: title)
            await load()   // reload to reflect the new task
        } catch {
            errorMessage = "Could not add task."
        }
    }

    func toggleDone(_ task: Task) async {
        do {
            try await database.setDone(id: task.id, done: !task.isDone)
            await load()
        } catch {
            errorMessage = "Could not update task."
        }
    }

    func delete(_ task: Task) async {
        do {
            try await database.deleteTask(id: task.id)
            await load()
        } catch {
            errorMessage = "Could not delete task."
        }
    }
}

The store is the bridge. It’s @MainActor, so tasks, isLoading, and errorMessage — the observable state SwiftUI watches — are always mutated on the main thread. The database calls are awaited into the Database actor, which runs them off the main thread (Section 20’s actor pattern). After a mutation, it reloads so the observable tasks reflects the new database state, and SwiftUI re-renders. Errors become user-facing messages rather than crashes. This @MainActor @Observable store over a background database actor is the clean, modern integration pattern.

The Database actor’s UI-facing methods

The Database actor (from Section 20) exposes the domain methods the store calls. They use the wrapper’s query/run/transaction and return plain value types (Task structs) that cross the actor boundary safely:

actor Database {
    private let connection: Connection
    init(url: URL) throws { self.connection = try Connection(url: url) }

    func fetchOpenTasks() throws -> [Task] {
        try connection.query(
            "SELECT * FROM tasks WHERE is_done = 0 ORDER BY priority DESC, created_at;",
            decode: Task.init(row:)
        )
    }
    func createTask(title: String) throws -> Int64 {
        try connection.run(
            "INSERT INTO tasks (title, is_done, priority, created_at) VALUES (?, 0, 0, ?);",
            [SQLiteValue(title), SQLiteValue(Date())]
        )
    }
    func setDone(id: Int64, done: Bool) throws {
        try connection.run("UPDATE tasks SET is_done = ? WHERE id = ?;",
                           [SQLiteValue(done), SQLiteValue(id)])
    }
    func deleteTask(id: Int64) throws {
        try connection.run("DELETE FROM tasks WHERE id = ?;", [SQLiteValue(id)])
    }
}

Task is a struct of value types, so returning [Task] from the actor to the @MainActor store is safe — value types are Sendable and copy across the boundary cleanly. This is why the wrapper decodes into plain Swift structs rather than handing back anything tied to the connection: the decoded values are detached from the database and free to travel between concurrency domains.

Wiring it into views

The view observes the store and triggers loads and mutations:

struct TaskListView: View {
    @State private var store: TaskStore

    init(database: Database) {
        _store = State(initialValue: TaskStore(database: database))
    }

    var body: some View {
        List {
            ForEach(store.tasks) { task in
                HStack {
                    Button { Task { await store.toggleDone(task) } } label: {
                        Image(systemName: task.isDone ? "checkmark.circle.fill" : "circle")
                    }
                    Text(task.title)
                }
            }
            .onDelete { indexSet in
                for index in indexSet {
                    let task = store.tasks[index]
                    Task { await store.delete(task) }
                }
            }
        }
        .overlay { if store.isLoading { ProgressView() } }
        .task { await store.load() }   // load when the view appears
    }
}

@State private var store holds the @Observable store; SwiftUI tracks which of its properties the view reads and re-renders on changes. The .task { await store.load() } modifier loads data when the view appears, on an async context. User actions wrap the async store methods in Task { ... }. Because the store is @MainActor and its mutations reload observable state on the main thread, the UI stays consistent and the database work stays off the main thread. (For ForEach(store.tasks) to work, Task should be Identifiable — add id conformance, which the id: Int64 already supports.)

The reload-after-write tradeoff and change observation

The pattern above reloads the full list after every mutation. That’s simple and always correct, and for modest lists it’s perfectly fine — a query that fetches a few hundred rows is fast. But it’s not the most efficient: a single toggle reloads everything. Two refinements, in increasing sophistication:

Local mutation plus targeted reload. After a write, update the in-memory tasks array directly (flip the toggled task’s isDone) instead of reloading, keeping the database and the array in sync manually. Faster, but you must be careful the in-memory state matches what the database now holds.

Database change observation. SQLite can notify you when data changes via sqlite3_update_hook, a callback fired on every insert/update/delete with the table and rowid. You could use it to know precisely what changed and refresh only that. This is essentially what libraries like GRDB build their ValueObservation on — automatic, fine-grained UI updates when the underlying data changes. Building robust observation by hand on the raw API is real work (you must handle transactions, coalescing, and threading), which is one of the strongest reasons apps with significant reactive-data needs adopt GRDB rather than the raw layer. For a hand-rolled layer, the reload-after-write approach is the pragmatic default; add sqlite3_update_hook-based observation only if you need it and are prepared for the complexity.

Loading large datasets and pagination

For a list that could be large, loading every row into tasks is wasteful — the UI only shows a screenful at a time. Use SQL LIMIT/OFFSET (or keyset pagination on an indexed column) to load pages:

func fetchTasks(limit: Int, offset: Int) throws -> [Task] {
    try connection.query(
        "SELECT * FROM tasks ORDER BY created_at DESC LIMIT ? OFFSET ?;",
        [SQLiteValue(limit), SQLiteValue(offset)],
        decode: Task.init(row:)
    )
}

The store loads the first page on appearance and appends further pages as the user scrolls (driven by .onAppear on the last row, or SwiftUI’s list prefetching). Keyset pagination (WHERE created_at < ? using the last-seen value) scales better than OFFSET for deep lists because OFFSET still scans the skipped rows, but LIMIT/OFFSET is fine for modest depths. Pagination keeps memory bounded and the UI responsive regardless of table size — important once a table grows beyond a few hundred rows.

SwiftUI integration pitfalls

Calling the database synchronously from a view or the main thread. Janks the UI. Route through an async store/actor that runs off-main.

Mutating observable state off the main thread. SwiftUI requires main-thread state updates. Keep the store @MainActor and assign results there.

Returning connection-tied objects across the actor boundary. Only Sendable value types (your decoded structs) should cross. Decode into plain structs; never pass statements or the connection out.

Reloading huge lists on every change. Wasteful for large data. Paginate, and consider targeted updates or change observation.

Building hand-rolled change observation casually. sqlite3_update_hook observation is subtle (transactions, threading, coalescing). If you need robust reactive updates, that’s a strong signal to use GRDB.

Forgetting Identifiable for ForEach. Lists need stable identity. Conform your model to Identifiable via its primary key.

What to internalize

Bridge the background database and main-thread UI with a @MainActor @Observable store over a Database actor: the store holds observable state (tasks, isLoading, errorMessage) always mutated on the main thread, while database calls await into the actor that runs them off the main thread (Section 20’s model). The actor’s methods decode into plain Sendable value-type structs that cross the boundary safely — which is why the wrapper returns detached structs, not connection-tied objects. Views observe the store, .task { await store.load() } on appearance, and wrap user actions in Task { await ... }. Reloading after each write is simple and correct for modest data; for large lists, paginate with LIMIT/OFFSET (or keyset) to bound memory. Fine-grained reactive updates via sqlite3_update_hook are possible but subtle — needing them is a strong argument for adopting GRDB, whose observation solves exactly this. Keep database work off the main thread and observable state on it, and the integration stays clean.


31. Common Gotchas and Anti-Patterns

This section is a consolidated field guide to the mistakes that bite people working with raw SQLite from Swift. Most have appeared in context throughout the guide; gathering them here gives you a single place to scan when something behaves strangely, and a checklist to review before shipping. They cluster into a few themes: the C-bridging traps, the lifetime traps, the SQL-semantics traps, and the performance traps.

The C-bridging traps

SQLITE_TRANSIENT doesn’t exist until you define it. The C macro doesn’t import into Swift. Define let SQLITE_TRANSIENT = unsafeBitCast(-1, to: sqlite3_destructor_type.self) once per project. Without it, text/blob binding won’t compile — or worse, if you reach for SQLITE_STATIC, you get nondeterministic corruption.

Using SQLITE_STATIC for a bridged Swift string. The bridged C string is destroyed when the bind call returns, so SQLite reads freed memory during step. The value is sometimes right, sometimes garbage — the worst kind of bug. Always SQLITE_TRANSIENT for transient Swift values so SQLite copies them immediately.

Bind indices are 1-based; column indices are 0-based. This arbitrary asymmetry produces “no such column” errors and shifted data. When results look wrong or shifted by one, check your indices first. (The wrapper hides this — another reason to use one.)

Expecting SQLITE_OK from sqlite3_step. Step returns SQLITE_ROW or SQLITE_DONE on success, never SQLITE_OK. Comparing to SQLITE_OK after step is always a logic error.

The lifetime traps

Reading a column pointer after the next step or after finalize. sqlite3_column_text/_blob pointers are valid only until the next sqlite3_step, sqlite3_reset, or sqlite3_finalize on that statement. Convert to String/Data immediately during the step loop, never store the raw pointer.

Forgetting to finalize a statement. Leaks the statement and blocks sqlite3_close (returns SQLITE_BUSY). Use defer { sqlite3_finalize(statement) } immediately after a successful prepare so cleanup happens on every exit path. (The wrapper’s Statement finalizes in deinit, eliminating this entirely.)

Not closing the connection on the open-failure path. A failed sqlite3_open_v2 may still allocate a handle. Read the error message, then close regardless, or you leak. Prefer sqlite3_close_v2 for forgiving cleanup.

Numeric NULL reads as 0. sqlite3_column_int64 returns 0 for a NULL column — indistinguishable from a stored 0 by value alone. Check sqlite3_column_type == SQLITE_NULL before reading numeric columns that can be NULL.

The SQL-semantics traps

PRAGMA foreign_keys is OFF by default. Declared foreign keys aren’t enforced until you run PRAGMA foreign_keys = ON on the connection — and it’s per-connection. Without it, you can insert dangling references and cascades don’t fire. Set it at startup, always. This one surprises everyone exactly once.

column = NULL never matches. NULL means unknown, so = NULL is always not-true. Use IS NULL / IS NOT NULL. A generated column = ? with a NULL bound value also won’t match NULL rows — branch to IS NULL for that case.

NULL silently excludes rows from compound AND conditions. A NULL in an AND makes the whole condition not-true, dropping the row. Coalesce nullable columns to defaults (COALESCE) when you intend NULL to mean something concrete.

Type affinity is a preference, not a contract. A column declared INTEGER will silently store text by default, breaking range comparisons ('10' < '5' as strings). Bind values of the intended type, and use STRICT tables (iOS 15.4+) for real enforcement.

INT PRIMARY KEY doesn’t alias the rowid; INTEGER PRIMARY KEY does. The spelling matters. INT PRIMARY KEY creates a separate indexed column, losing the fast-lookup optimization. Spell it INTEGER.

Building SQL with string interpolation. SQL injection and breakage (an apostrophe in a name breaks the syntax). Every runtime value goes through a bind; only static SQL whose full text you wrote at compile time uses interpolation, and even then only for values you fully control (like an integer user_version).

The performance traps

Inserting many rows without a transaction. Each implicit per-statement transaction means a disk sync — thousands of syncs, catastrophically slow. Wrap bulk writes in one transaction (one sync); reuse one prepared statement with reset. This is the single biggest write-performance lever.

Recompiling the same SQL on every call. Wastes the compilation cost. Cache prepared statements keyed by SQL text and reuse them via reset (Section 23). Parameterization is what makes this possible — interpolated SQL produces distinct strings that never cache-hit.

Doing database work on the main thread. A slow query or bulk write janks the UI. Route work through an async actor or serial queue that runs off-main (Section 20).

Missing indexes on filtered/joined/sorted columns of large tables. Full scans that worsen as data grows; quadratic joins when the inner table’s join column is unindexed. Diagnose with EXPLAIN QUERY PLAN (look for SCAN and USE TEMP B-TREE), add the matching index, and re-explain to confirm SEARCH.

Over-indexing. Every index costs disk and slows every write. Index the columns real queries use, not every column.

Storing large blobs inline. Bloats the database and degrades its page cache. Store large binary as files on disk with the path in the database; keep only small blobs (thumbnails, under ~100 KB) inline.

The concurrency traps

Sharing one connection across threads without serialization. Corruption and SQLITE_MISUSE. Serialize access through one actor or serial queue, or give each thread its own connection. OpaquePointer is not Sendable — the type system is warning you.

Using BEGIN DEFERRED for read-then-write transactions in multi-connection code. Risks an un-retryable mid-transaction SQLITE_BUSY. Use BEGIN IMMEDIATE for writers so a busy fails up front, cleanly retryable.

Treating SQLITE_BUSY as fatal. It’s transient and recoverable. Set PRAGMA busy_timeout so short contentions wait transparently, and retry whole transactions on the rare busy that aborts one.

Copying a WAL database with a plain file copy. Misses the -wal contents → stale copy. Use the backup API, or checkpoint fully and copy all files.

The “you probably want a library” signals

A meta-anti-pattern: continuing to hand-build raw SQLite infrastructure past the point where a library would serve you better. Specific signals that you’ve outgrown the raw layer: you’re building change observation with sqlite3_update_hook for reactive UI; you’re writing a query builder because string SQL has become unwieldy; you need robust Codable record mapping across many types; you’re managing a connection pool for read concurrency; or your migration and statement-caching code is becoming a framework of its own. Each of these is a problem GRDB (or similar) has already solved well. Reaching the edge of the raw API isn’t failure — it’s the moment your understanding of the foundation lets you adopt a library wisely. The anti-pattern is reinventing that library by hand on a deadline.

What to internalize

The recurring mistakes cluster predictably. C-bridging: define SQLITE_TRANSIENT and use it for Swift strings; remember binds are 1-based and columns 0-based; step returns ROW/DONE, never OK. Lifetimes: convert column pointers immediately, finalize every statement (use defer or the wrapper’s deinit), close on the failure path, and detect numeric NULL with column_type. SQL semantics: turn foreign_keys ON per connection, use IS NULL not = NULL, watch NULL in compound conditions, don’t trust loose type affinity (use STRICT), spell INTEGER PRIMARY KEY exactly, and never interpolate runtime values. Performance: wrap bulk writes in a transaction with a reused statement, cache prepared statements, keep work off the main thread, index what queries use (no more), and store large blobs as files. Concurrency: serialize connection access (or one per thread), use BEGIN IMMEDIATE for writers, handle SQLITE_BUSY with a timeout and retries, and back up WAL databases with the backup API. And recognize the signals that you’ve outgrown the raw layer — adopting a library from understanding is the goal, not a defeat.


32. Where to Go Deeper

You’ve covered the full arc of raw SQLite on iOS — from opening a connection through a typed wrapper layer, migrations, indexing, the query planner, concurrency, WAL, full-text search, the backup API, custom functions, encryption, and SwiftUI integration. This final section points you toward the resources, deeper topics, and the decision of when to graduate to a higher-level library, so you can keep growing from here.

The official documentation is exceptional

SQLite’s own documentation at sqlite.org is among the best in software — thorough, precise, and written by the people who built it. A few documents are worth reading in full once you’ve internalized this guide:

  • “Quirks, Caveats, and Gotchas In SQLite” — the authors’ own catalog of surprising behaviors, including the type affinity and NULL handling subtleties we covered. Reading their version cements the concepts.
  • “The Architecture of SQLite” — explains the internal structure (the parser, the code generator, the virtual machine, the B-tree layer, the pager, the OS interface). Understanding the layers demystifies a lot of behavior and is genuinely interesting.
  • “Datatypes In SQLite” — the authoritative treatment of storage classes, type affinity, and STRICT tables.
  • The C/C++ Interface reference — the complete API documentation. You’ve used a few dozen functions; the reference covers them precisely and reveals options we didn’t touch.
  • “Write-Ahead Logging” — the full WAL documentation, including the checkpoint details and the cross-process considerations relevant to App Groups.

Bookmark sqlite.org and treat it as the source of truth. When this guide and your memory disagree with the official docs, the docs win.

Topics worth deeper exploration

Several topics we introduced have more depth available when you need it:

The query planner and sqlite3_stmt_status. Beyond EXPLAIN QUERY PLAN, SQLite exposes per-statement statistics (full-scan counts, sort operations) via sqlite3_stmt_status, useful for profiling which statements do expensive work. The query planner documentation (“The SQLite Query Optimizer Overview” and “Next-Generation Query Planner”) goes deep on how index selection actually works.

Window functions. SQLite supports SQL window functions (ROW_NUMBER(), RANK(), LAG(), LEAD(), running totals with SUM() OVER (...)). These let you express analytical queries — rankings, running aggregates, per-group sequencing — that would otherwise require awkward self-joins or Swift post-processing. Well worth learning when you hit a “rank within group” or “compare to previous row” problem.

Common Table Expressions (CTEs) and recursion. The WITH clause defines named subqueries that make complex queries readable, and recursive CTEs (WITH RECURSIVE) traverse hierarchies and graphs (a tree of categories, a chain of references) in pure SQL. A powerful tool for hierarchical data.

Triggers and views. We used triggers for FTS sync; they’re more general (enforcing invariants, maintaining derived data, auditing changes). Views are saved queries that act like virtual tables, useful for encapsulating complex joins behind a simple name.

The sqlite3_update_hook and building observation. If you want reactive data without a library, the update hook plus pre-update and commit hooks are the raw materials. Studying how GRDB’s ValueObservation is built on them is illuminating even if you ultimately use the library.

Generated columns and partial/expression indexes. We touched expression and partial indexes; generated columns (computed from other columns, optionally stored) and the full range of index types reward study when optimizing a specific schema.

Tooling worth adopting

A few tools make working with SQLite much more pleasant:

The sqlite3 command-line shell (available on macOS) lets you open your app’s database file directly and run queries, inspect schema (.schema), check the plan (.eqp on), and debug data — invaluable for understanding what your app actually wrote. You can pull a database off the simulator (or a development device via the container) and poke at it directly.

A GUI database browser like DB Browser for SQLite gives you a visual view of tables, lets you run queries, and helps you understand your data’s shape during development.

For schema and query work, simply keeping a .sql file of your schema and exploratory queries, runnable in the shell, speeds iteration enormously compared to round-tripping through the app.

When to graduate to GRDB (or another library)

This guide has been honest throughout that for many production apps, a library is the right choice — and now you can make that call from understanding. The signals that it’s time, gathered:

You want database change observation driving reactive SwiftUI updates. You want Codable record mapping without writing decoders by hand. You want a typed query interface instead of string SQL. You need a connection pool for genuine read concurrency. Your migration, caching, and wrapper code is becoming a framework. Or simply: you’d rather invest your effort in your product than in database plumbing, and you’ve confirmed the foundation works the way you now understand it to.

GRDB is the standout recommendation for Swift — it’s built directly on the C API you now know, embraces SQL rather than hiding it, has excellent observation and concurrency support, integrates cleanly with SwiftUI, and is actively maintained. Because you understand the raw layer, GRDB will feel like a well-designed convenience over familiar machinery rather than a black box — and when you need to drop to raw SQL or understand its behavior, you can. Other options exist (SQLite.swift for a type-safe query builder, FMDB for an Objective-C-era Swift-usable wrapper), but GRDB is where most new Swift projects that want a SQLite library land today.

The point of learning the foundation was never to avoid libraries forever — it was to understand what they do, debug them when needed, drop below them when warranted, and choose them deliberately. You can now read GRDB’s source and recognize the sqlite3_* calls underneath. That’s leverage no amount of library-only knowledge provides.

A closing perspective

Raw SQLite is a small, sharp, durable tool. The C API hasn’t fundamentally changed in many years and won’t change much in many more; what you’ve learned here will still be true a decade from now, on whatever Apple platforms exist then. The same can’t be said for most frameworks. That stability is part of what makes the time invested here worthwhile: relational persistence, prepared statements, transactions, indexes, and the query planner are foundational concepts that transfer to every database you’ll ever use, on every platform. You haven’t just learned an iOS API — you’ve learned how databases work, with iOS as the concrete setting.

Build something with it. Start with the wrapper from Sections 11-12, add a migration or two, index a slow query you find with EXPLAIN QUERY PLAN, and feel how the pieces fit. Then, when your app’s needs justify it, reach for GRDB knowing exactly what it’s doing for you. The understanding compounds: every hard database problem you meet from here, you’ll meet with a real model of what’s happening underneath.

Good luck, and have fun.

End of Document