pg_trickle
pg_trickle is a PostgreSQL 18 extension that turns ordinary SQL views into
self-maintaining stream tables — no external processes, no sidecars, no
bespoke refresh pipelines. Just CREATE EXTENSION pg_trickle and your views
stay fresh.
-- Declare a stream table — a view that maintains itself
SELECT pgtrickle.create_stream_table(
name => 'active_orders',
query => 'SELECT * FROM orders WHERE status = ''active''',
schedule => '30s'
);
-- Insert a row — the stream table updates automatically on the next refresh
INSERT INTO orders (id, status) VALUES (42, 'active');
SELECT count(*) FROM active_orders; -- 1
The problem with materialized views
PostgreSQL's materialized views are powerful but frustrating.
REFRESH MATERIALIZED VIEW re-runs the entire query from scratch, even if
only one row changed in a million-row table. Your choices are: burn CPU on
full recomputation, or accept stale data. Most teams end up building bespoke
refresh pipelines just to keep summary tables current.
What pg_trickle does differently
pg_trickle captures changes to your source tables and — on each refresh cycle — derives a delta query that processes only the changed rows and merges the result into the materialized table. One insert into a million-row source table? pg_trickle touches exactly one row's worth of computation.
The approach is grounded in the DBSP differential dataflow framework (Budiu et al., 2022). Delta queries are derived automatically from your SQL's operator tree: joins produce the classic bilinear expansion, aggregates maintain auxiliary counters, and linear operators like filters pass deltas through unchanged.
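For joins, the bilinear expansion mentioned above can be written out concretely. A minimal sketch in plain SQL, assuming hypothetical `delta_r`/`delta_s` change sets and `r_old`/`s_old` pre-change snapshots (all identifiers here are invented for illustration, not pg_trickle internals):

```sql
-- Sketch only: the delta of an inner join R ⋈ S expands into three terms,
--   Δ(R ⋈ S) = (ΔR ⋈ S₀) ∪ (R₀ ⋈ ΔS) ∪ (ΔR ⋈ ΔS)
-- where R₀/S₀ are the pre-change snapshots.
SELECT * FROM delta_r dr JOIN s_old   s  ON dr.k = s.k    -- ΔR ⋈ S₀
UNION ALL
SELECT * FROM r_old   r  JOIN delta_s ds ON r.k  = ds.k   -- R₀ ⋈ ΔS
UNION ALL
SELECT * FROM delta_r dr JOIN delta_s ds ON dr.k = ds.k;  -- ΔR ⋈ ΔS
```

Because each term joins a (small) change set against at most one full relation, the work is proportional to the size of the deltas, not the base tables.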
Key capabilities
| Feature | Description |
|---|---|
| Incremental refresh | Only changed rows are recomputed — never a full table scan |
| Cascading DAG | Stream tables that depend on stream tables propagate deltas downstream automatically |
| Demand-driven scheduling | Set a freshness interval on the views your app queries; upstream layers inherit the tightest schedule automatically |
| Hybrid CDC | Starts with lightweight row-level triggers; seamlessly transitions to WAL-based logical replication once available |
| Broad SQL support | JOINs, GROUP BY, DISTINCT, UNION/INTERSECT/EXCEPT, subqueries, CTEs (including WITH RECURSIVE), window functions, LATERAL, and more |
| Built-in observability | Monitoring views, refresh history, NOTIFY-based alerting |
| CloudNativePG-ready | Ships as an Image Volume extension image for Kubernetes deployments |
Demand-driven scheduling
With the default CALCULATED schedule mode, you only set an explicit refresh
interval on the stream tables your application actually queries. The system
propagates that cadence upward through the dependency graph: each upstream
stream table inherits the tightest schedule among its downstream dependents.
You declare freshness requirements where they matter — at the consumer — and
the entire pipeline adjusts without manual coordination.
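In practice this means only the consumer-facing table gets an interval. A sketch using `create_stream_table` (table names and queries are invented for illustration):

```sql
-- Hypothetical two-layer pipeline: only the table the app queries gets an
-- explicit interval; the upstream layer runs in CALCULATED mode.
SELECT pgtrickle.create_stream_table(
  name     => 'order_totals',                          -- upstream layer
  query    => 'SELECT customer_id, SUM(amount) AS total
               FROM orders GROUP BY customer_id',
  schedule => 'calculated'   -- inherit cadence from downstream dependents
);
SELECT pgtrickle.create_stream_table(
  name     => 'top_customers',                         -- what the app queries
  query    => 'SELECT * FROM order_totals WHERE total > 1000',
  schedule => '5s'           -- order_totals now inherits this 5s cadence
);
```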
Hybrid change capture
pg_trickle bootstraps with lightweight row-level triggers — no configuration
needed, works out of the box. Once the first refresh succeeds and
wal_level = logical is available, the system automatically transitions to
WAL-based logical replication for lower write-side overhead. The transition
is seamless: trigger → transitioning → WAL-only. If anything goes wrong, it
falls back to triggers.
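To see where a deployment stands, you can check the server prerequisite and the monitoring view (`pgt_status()` is the same call used later in this documentation; exact columns may vary by version):

```sql
-- Prerequisite for the WAL transition:
SHOW wal_level;                      -- 'logical' enables WAL-based capture
-- Health of each stream table (monitoring view shipped with pg_trickle):
SELECT name, status, staleness
FROM   pgtrickle.pgt_status();
```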
Explore this documentation
- Getting Started — build a three-layer DAG from scratch in minutes
- SQL Reference — every function and option
- Architecture — how the engine works internally
- Configuration — GUC variables and tuning
Tutorials
- Fuse Circuit Breaker — protect stream tables from bulk-change storms
- Tiered Scheduling — configure multi-tier refresh cadences
- Migrating from Materialized Views — step-by-step migration guide
- Circular Dependencies — handle SCCs in your DAG
- Monitoring & Alerting — set up observability for stream tables
- ETL Bulk Load Patterns — safely load large batches without overwhelming CDC
Integrations
- CloudNativePG — deploy pg_trickle on Kubernetes
- Prometheus & Grafana — metrics and dashboards
- PgBouncer — connection pooling configuration
Source & releases
Written in Rust using pgrx. Targets PostgreSQL 18. Apache 2.0 licensed.
- Repository: github.com/grove/pg-trickle
- Install instructions: Installation
- Changelog: Changelog
- Roadmap: Roadmap
Getting Started with pg_trickle
What is pg_trickle?
pg_trickle adds stream tables to PostgreSQL — tables that are defined by a SQL query and kept automatically up to date as the underlying data changes. Think of them as materialized views that refresh themselves, but smarter: instead of re-running the entire query on every refresh, pg_trickle uses Incremental View Maintenance (IVM) to process only the rows that changed.
Traditional materialized views force a choice: either re-run the full query (expensive) or accept stale data. pg_trickle eliminates this trade-off. When you insert a single row into a million-row table, pg_trickle computes the effect of that one row on the query result — it doesn't touch the other 999,999.
How data flows
The key concept is that data flows downstream automatically — from your base tables through any chain of stream tables, without you writing a single line of orchestration code:
You write to base tables
│
▼
┌─────────────┐ triggers (or WAL) ┌─────────────────────┐
│ Base Tables │ ─────────────────────▶ │ Change Buffers │
│ (you write) │ │ (pgtrickle_changes.*) │
└─────────────┘ └──────────┬──────────┘
│
delta query (ΔQ) on refresh
│
▼
┌──────────────────────────────────────────────────────────────┐
│ Stream Table A ◀── depends on base tables │
└──────────────────────────┬───────────────────────────────────┘
│ change captured, buffer written
▼
┌──────────────────────────────────────────────────────────────┐
│ Stream Table B ◀── depends on Stream Table A │
└──────────────────────────────────────────────────────────────┘
One write to a base table can ripple through an entire DAG of stream tables — each layer refreshed in the correct topological order, each doing only the work proportional to what actually changed.
- You write to your base tables normally — `INSERT`, `UPDATE`, `DELETE`
- Lightweight `AFTER` row-level triggers capture each change into a buffer, atomically in the same transaction. No polling, no logical replication slots required by default.
- On each refresh cycle, pg_trickle derives a delta query (ΔQ) that reads only the buffered changes since the last refresh frontier
- The delta is merged into the stream table — only the affected rows are written
- If other stream tables depend on this one, they are scheduled next (topological order)
- Optionally: once `wal_level = logical` is available and the first refresh succeeds, pg_trickle automatically transitions from triggers to WAL-based CDC (near-zero write-path overhead, compared to ~2–15 μs per row for triggers). The transition is seamless and transparent.
This tutorial walks through a concrete org-chart example so you can see this flow end to end, including a chain of stream tables that propagates changes automatically.
Prerequisites
- PostgreSQL 18.x with pg_trickle installed (see INSTALL.md)
- `shared_preload_libraries = 'pg_trickle'` in `postgresql.conf`
- `max_worker_processes` raised to at least 32 (see INSTALL.md); the PostgreSQL default of 8 is often exhausted if you have several databases, causing stream tables to silently stop refreshing
- `psql` or any SQL client
Deploying to production? See the Pre-Deployment Checklist for a complete list of requirements, pooler compatibility, and recommended GUC values.
Playground: The fastest way to experiment is the playground — a Docker Compose environment with sample tables and stream tables pre-loaded.
`cd playground && docker compose up -d` and you're running.
Quick start with Docker: Pull the pre-built GHCR image — PostgreSQL 18.3 + pg_trickle ready to run, no configuration needed:
docker run --rm -e POSTGRES_PASSWORD=secret -p 5432:5432 ghcr.io/grove/pg_trickle:latest

All GUC defaults (`wal_level`, `shared_preload_libraries`, scheduler settings) are pre-configured. See INSTALL.md for tag details and volume mounting.
Connect to the database you want to use and enable the extension:
CREATE EXTENSION pg_trickle;
No additional configuration is needed. pg_trickle automatically discovers all databases on the server and starts a scheduler for each one where the extension is installed.
Chapter 1: Hello World — Your First Stream Table
Before diving into multi-table joins and recursive CTEs, start with the simplest possible stream table: a single-source aggregate with no joins.
1.1 Setup
Create one table and enable the extension:
CREATE EXTENSION IF NOT EXISTS pg_trickle;
CREATE TABLE products (
id SERIAL PRIMARY KEY,
category TEXT NOT NULL,
price NUMERIC(10,2) NOT NULL,
in_stock BOOLEAN NOT NULL DEFAULT true
);
INSERT INTO products (category, price) VALUES
('Electronics', 299.99),
('Electronics', 49.99),
('Books', 14.99),
('Books', 24.99),
('Books', 9.99);
1.2 Create the stream table
SELECT pgtrickle.create_stream_table(
name => 'category_summary',
query => $$
SELECT
category,
COUNT(*) AS product_count,
ROUND(AVG(price), 2) AS avg_price,
MIN(price) AS min_price,
MAX(price) AS max_price,
COUNT(*) FILTER (WHERE in_stock) AS in_stock_count
FROM products
GROUP BY category
$$,
schedule => '1s'
);
Query it immediately — it was populated by the initial full refresh:
SELECT category, product_count, avg_price, min_price, max_price, in_stock_count
FROM category_summary ORDER BY category;
category | product_count | avg_price | min_price | max_price | in_stock_count
-------------+---------------+-----------+-----------+-----------+----------------
Books | 3 | 16.66 | 9.99 | 24.99 | 3
Electronics | 2 | 174.99 | 49.99 | 299.99 | 2
(2 rows)
1.3 Watch an INSERT update one group
INSERT INTO products (category, price) VALUES ('Books', 39.99);
Within ~1 second (or call SELECT pgtrickle.refresh_stream_table('category_summary') to force it):
SELECT category, product_count, avg_price, min_price, max_price, in_stock_count
FROM category_summary WHERE category = 'Books';
category | product_count | avg_price | min_price | max_price | in_stock_count
----------+---------------+-----------+-----------+-----------+----------------
Books | 4 | 22.49 | 9.99 | 39.99 | 4
(1 row)
The Electronics row was not touched at all — pg_trickle read exactly one
row from the change buffer and adjusted only the Books group.
1.4 Watch an UPDATE propagate
UPDATE products SET price = 19.99 WHERE price = 299.99;
After the next refresh:
SELECT category, product_count, avg_price, min_price, max_price, in_stock_count
FROM category_summary WHERE category = 'Electronics';
category | product_count | avg_price | min_price | max_price | in_stock_count
-------------+---------------+-----------+-----------+-----------+----------------
Electronics | 2 | 34.99 | 19.99 | 49.99 | 2
(1 row)
For AVG, pg_trickle maintains running sum and count columns internally, so
re-aggregating a group is O(1) regardless of group size.
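One way to picture this: the group's running sum absorbs the delta directly. A conceptual sketch, assuming a hypothetical internal table `category_summary_internal` with `price_sum`/`price_count` columns (pg_trickle's real storage layout is internal and will differ):

```sql
-- Illustrative arithmetic only. Electronics before the UPDATE:
--   price_sum = 349.98, price_count = 2
-- The change buffer says: old price 299.99 out, new price 19.99 in.
-- Both SET expressions read the pre-update row, so the adjusted sum is used:
UPDATE category_summary_internal
SET    price_sum = price_sum - 299.99 + 19.99,                    -- 69.98
       avg_price = ROUND((price_sum - 299.99 + 19.99) / price_count, 2)
WHERE  category = 'Electronics';                         -- 69.98 / 2 = 34.99
```

The adjustment is a single-row update whose cost does not depend on how many products the group contains.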
1.5 What you just saw
- A single function call created the storage table, installed CDC triggers, ran the initial full refresh, and registered a 1-second schedule.
- Every subsequent DML on `products` was captured by an `AFTER` trigger — no polling, no logical replication.
- Each refresh touched only the rows and groups that changed.
- The stream table is a real PostgreSQL table — you can `SELECT`, index, and join against `category_summary` like any other table.
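For example (hypothetical follow-on queries, not part of the tutorial's required steps):

```sql
-- Standard DDL works, because the stream table is ordinary heap storage:
CREATE INDEX ON category_summary (category);

-- Join it like any other table: products priced above their category average
SELECT p.id, p.category, p.price, cs.avg_price
FROM   products p
JOIN   category_summary cs USING (category)
WHERE  p.price > cs.avg_price;
```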
Clean up:
SELECT pgtrickle.drop_stream_table('category_summary');
DROP TABLE products;
Chapter 2: Joins, Aggregates & Chains
What you'll build
An employee org-chart system with three stream tables:

- `department_tree` — a recursive CTE that flattens a department hierarchy into paths like `Company > Engineering > Backend`
- `department_stats` — a join + aggregation over `department_tree` (a stream table!) that computes headcount and salary budget, with the full path included
- `department_report` — a further aggregation that rolls up stats to top-level departments
The chain departments → department_tree → department_stats → department_report demonstrates automatic downstream propagation: modify a department name in the base table and all three stream tables update automatically, in the right order, without any manual orchestration.
By the end you will have:
- Seen how stream tables are created, queried, and refreshed
- Watched a single `UPDATE` in a base table cascade through three layers of stream tables automatically
- Understood the four refresh modes and IVM strategies
Prefer dbt? A runnable dbt companion project mirrors every step below. Clone the repo and run:
`./examples/dbt_getting_started/scripts/run_example.sh` — see examples/dbt_getting_started/ for full details.
2.1 Create the Base Tables
These are ordinary PostgreSQL tables — pg_trickle doesn't require any special column types, annotations, or schema conventions.
Tables without a primary key work, but pg_trickle will emit a WARNING at stream table creation time: change detection falls back to a content-based hash across all columns, which is slower for wide tables and cannot distinguish between identical duplicate rows. Adding a primary key gives the best performance and most reliable change detection. A primary key is also required for automatic transition to WAL-based CDC (cdc_mode = 'auto'); without one the source table stays on trigger-based CDC.
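For example, a keyless legacy table can be upgraded before creating stream tables over it (the table name here is hypothetical):

```sql
-- Add a surrogate key so change detection can track rows by identity
-- instead of hashing every column (also enables cdc_mode = 'auto'):
ALTER TABLE legacy_events ADD COLUMN id BIGINT GENERATED ALWAYS AS IDENTITY;
ALTER TABLE legacy_events ADD PRIMARY KEY (id);
```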
-- Department hierarchy (self-referencing tree)
CREATE TABLE departments (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
parent_id INT REFERENCES departments(id)
);
-- Employees belong to a department
CREATE TABLE employees (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
department_id INT NOT NULL REFERENCES departments(id),
salary NUMERIC(10,2) NOT NULL
);
Now insert some data — a three-level department tree and a handful of employees:
-- Top-level
INSERT INTO departments (id, name, parent_id) VALUES
(1, 'Company', NULL);
-- Second level
INSERT INTO departments (id, name, parent_id) VALUES
(2, 'Engineering', 1),
(3, 'Sales', 1),
(4, 'Operations', 1);
-- Third level (under Engineering)
INSERT INTO departments (id, name, parent_id) VALUES
(5, 'Backend', 2),
(6, 'Frontend', 2),
(7, 'Platform', 2);
-- Employees
INSERT INTO employees (name, department_id, salary) VALUES
('Alice', 5, 120000), -- Backend
('Bob', 5, 115000), -- Backend
('Charlie', 6, 110000), -- Frontend
('Diana', 7, 130000), -- Platform
('Eve', 3, 95000), -- Sales
('Frank', 3, 90000), -- Sales
('Grace', 4, 100000); -- Operations
At this point these are plain tables with no triggers, no change tracking, nothing special. The department tree looks like this:
Company (1)
├── Engineering (2)
│ ├── Backend (5) — Alice, Bob
│ ├── Frontend (6) — Charlie
│ └── Platform (7) — Diana
├── Sales (3) — Eve, Frank
└── Operations (4) — Grace
2.2 Create the First Stream Table — Recursive Hierarchy
Our first stream table flattens the department tree. For every department, it computes the full path from the root and the depth level. This uses WITH RECURSIVE — a SQL construct that can't be differentiated with simple algebraic rules (the recursion depends on itself), but pg_trickle handles it using incremental strategies (semi-naive evaluation for inserts, Delete-and-Rederive for mixed changes) that we'll explain later.
SELECT pgtrickle.create_stream_table(
name => 'department_tree',
query => $$
WITH RECURSIVE tree AS (
-- Base case: root departments (no parent)
SELECT id, name, parent_id, name AS path, 0 AS depth
FROM departments
WHERE parent_id IS NULL
UNION ALL
-- Recursive step: children join back to the tree
SELECT d.id, d.name, d.parent_id,
tree.path || ' > ' || d.name AS path,
tree.depth + 1
FROM departments d
JOIN tree ON d.parent_id = tree.id
)
SELECT id, name, parent_id, path, depth FROM tree
$$,
schedule => 'calculated'  -- CALCULATED: inherit the schedule from downstream dependents
);
Note on short schedules: A 1-second schedule (such as the one used in Chapter 1) is safe for development and production thanks to `auto_backoff` (on by default since v0.10.0). If a refresh takes more than 95% of the schedule window, the scheduler automatically stretches the effective interval (up to 8× the configured schedule) to prevent CPU runaway, then resets to 1× as soon as a refresh completes on time. You will see a `WARNING` message when backoff activates.

v0.2.0+: `create_stream_table` also accepts `diamond_consistency` (`'none'` or `'atomic'`) and `diamond_schedule_policy` (`'fastest'` or `'slowest'`) for diamond-shaped dependency graphs. Schedules can be cron expressions (e.g., `'*/5 * * * *'`, `'@hourly'`). Set `pooler_compatibility_mode => true` if you're connecting through PgBouncer or another transaction-mode connection pooler. See SQL_REFERENCE.md for the full parameter list.
What just happened?
That single function call did a lot of work atomically (all in one transaction):
- Parsed the defining query into an operator tree — identifying the recursive CTE, the scan on `departments`, the join, the union
- Created a storage table called `department_tree` in the `public` schema — a real PostgreSQL heap table with columns matching the SELECT output, plus an internal column `__pgt_row_id` (a hash used to track individual rows)
- Installed CDC triggers on the `departments` table — lightweight `AFTER INSERT OR UPDATE OR DELETE` row-level triggers that will capture every future change
- Created a change buffer table in the `pgtrickle_changes` schema — this is where the triggers write captured changes
- Ran an initial full refresh — executed the recursive query against the current data and populated the storage table
- Registered the stream table in pg_trickle's catalog
TRUNCATE caveat: Row-level triggers do not fire on `TRUNCATE`. If you `TRUNCATE` a base table, the change is not captured incrementally — the stream table will become stale. Use `DELETE FROM table` instead, or call `pgtrickle.refresh_stream_table('department_tree')` after a TRUNCATE. If the stream table uses DIFFERENTIAL mode, temporarily switch to FULL for a full recompute: `pgtrickle.alter_stream_table('department_tree', refresh_mode => 'FULL')`, refresh, then switch back.

Query it immediately — it's already populated:
SELECT id, name, parent_id, path, depth FROM department_tree ORDER BY path;
Expected output:
id | name | parent_id | path | depth
----+-------------+-----------+----------------------------------+-------
1 | Company | | Company | 0
2 | Engineering | 1 | Company > Engineering | 1
5 | Backend | 2 | Company > Engineering > Backend | 2
6 | Frontend | 2 | Company > Engineering > Frontend | 2
7 | Platform | 2 | Company > Engineering > Platform | 2
4 | Operations | 1 | Company > Operations | 1
3 | Sales | 1 | Company > Sales | 1
(7 rows)
This is a real PostgreSQL table — you can create indexes on it, join it in other queries, reference it in views, or even use it as a source for other stream tables. pg_trickle keeps it in sync automatically.
Key insight: The recursive query that computes paths and depths would normally need to be re-run manually (or via `REFRESH MATERIALIZED VIEW`). With pg_trickle, it stays fresh — any change to the `departments` table is automatically reflected within the configured schedule bound.
2.3 Chain Stream Tables — Build the Downstream Layers
Now create department_stats. The twist: instead of joining directly against departments, it joins against department_tree — the stream table we just created. This creates a chain: changes to departments update department_tree, whose changes then trigger department_stats to update.
This demonstrates how pg_trickle builds a DAG — a directed acyclic graph of stream tables — and automatically schedules refreshes in topological order.
SELECT pgtrickle.create_stream_table(
name => 'department_stats',
query => $$
SELECT
t.id AS department_id,
t.name AS department_name,
t.path AS full_path,
t.depth,
COUNT(e.id) AS headcount,
COALESCE(SUM(e.salary), 0) AS total_salary,
COALESCE(AVG(e.salary), 0) AS avg_salary
FROM department_tree t
LEFT JOIN employees e ON e.department_id = t.id
GROUP BY t.id, t.name, t.path, t.depth
$$,
schedule => 'calculated' -- CALCULATED: inherit schedule from downstream; see explanation below
);
What just happened — and why this one is different?
Like before, pg_trickle parsed the query, created a storage table, and set up CDC. But department_stats depends on department_tree, not a base table — so no new triggers were installed. Instead, pg_trickle registered department_tree as an upstream dependency in the DAG.
The schedule is 'calculated' (CALCULATED mode), which means: "don't give this table its own schedule — inherit the tightest schedule of any downstream table that queries it". Internally this stores NULL in the catalog, but you must pass the string 'calculated' — passing SQL NULL is an error. Since no other stream table has been created yet, it will be refreshed on demand or when a downstream dependent triggers it.
The query has no recursive CTE, so pg_trickle uses algebraic differentiation:
- Decomposed the query into operators: `Scan(department_tree)` → `LEFT JOIN` → `Scan(employees)` → `Aggregate(GROUP BY + COUNT/SUM/AVG)` → `Project`
- Derived a differentiation rule for each:
  - `Δ(Scan)` = read only change buffer rows (not the full table)
  - `Δ(LEFT JOIN)` = join change rows from one side against the full other side
  - `Δ(Aggregate)` = for COUNT/SUM/AVG, add or subtract per group — no rescan needed
- Composed these into a single delta query (ΔQ) that never touches unchanged rows
When one employee is inserted, the refresh reads one change buffer row, joins to find the department, and adjusts only that group's count and sum.
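The composed ΔQ might look roughly like the following hand-written approximation. The buffer table name, the `action`/`sign` columns, and the handling of UPDATEs (which contribute an old row with sign −1 and a new row with +1) are simplified; pg_trickle's generated SQL is internal and will differ:

```sql
-- Illustrative shape of a derived ΔQ, not pg_trickle's actual output:
WITH changed_emps AS (                -- Δ(Scan): read buffered changes only
  SELECT department_id, salary,
         CASE action WHEN 'D' THEN -1 ELSE 1 END AS sign
  FROM   pgtrickle_changes.changes_employees
)
SELECT t.id                    AS department_id,
       SUM(c.sign)             AS headcount_delta,     -- Δ(COUNT)
       SUM(c.sign * c.salary)  AS salary_delta         -- Δ(SUM)
FROM   changed_emps c
JOIN   department_tree t ON t.id = c.department_id     -- Δ side ⋈ full side
GROUP  BY t.id;                                        -- per-group adjustment
```

The per-group deltas are then merged into the stored aggregates, so unchanged groups are never read or written.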
Query it:
SELECT department_name, full_path, headcount, total_salary
FROM department_stats
ORDER BY full_path;
Expected output:
department_name | full_path | headcount | total_salary
-----------------+----------------------------------+-----------+--------------
Company | Company | 0 | 0
Engineering | Company > Engineering | 0 | 0
Backend | Company > Engineering > Backend | 2 | 235000.00
Frontend | Company > Engineering > Frontend | 1 | 110000.00
Platform | Company > Engineering > Platform | 1 | 130000.00
Operations | Company > Operations | 1 | 100000.00
Sales | Company > Sales | 2 | 185000.00
(7 rows)
Notice that the full_path column comes from department_tree — this data already went through one layer of incremental maintenance before landing here.
Add a third layer: department_report
Now add a rollup that aggregates department_stats by top-level group (depth = 1):
SELECT pgtrickle.create_stream_table(
name => 'department_report',
query => $$
SELECT
split_part(full_path, ' > ', 2) AS division,
SUM(headcount) AS total_headcount,
SUM(total_salary) AS total_payroll
FROM department_stats
WHERE depth >= 1
GROUP BY 1
$$,
schedule => '1s' -- this is the only explicit schedule; CALCULATED tables above inherit it
);
The DAG is now:
departments (base) employees (base)
│ │
▼ │
department_tree ──────────┤
(DIFF, CALCULATED) │
│ ▼
└──────▶ department_stats
(DIFF, CALCULATED)
│
▼
department_report
(DIFF, 1s) ◀── only explicit schedule
department_report drives the whole pipeline. Because it has a 1-second schedule, pg_trickle automatically propagates that cadence upstream: department_stats and department_tree will also be refreshed within 1 second of a base table change, in topological order, with no manual configuration.
Query the report:
SELECT division, total_headcount, total_payroll FROM department_report ORDER BY division;
division | total_headcount | total_payroll
-------------+-----------------+---------------
Engineering | 4 | 475000.00
Operations | 1 | 100000.00
Sales | 2 | 185000.00
(3 rows)
2.4 Watch a Change Cascade Through All Three Layers
This is the heart of pg_trickle. We'll make four changes to the base tables and watch changes propagate automatically through the three-layer DAG — each layer doing only the minimum work.
The data flow pipeline (three layers)
Your SQL statement
│
▼
CDC trigger fires (same transaction)
Change buffer receives one row
│
▼
Background scheduler fires (within ~1 second)
│
├──▶ [Layer 1] Refresh department_tree
│ delta query reads change buffer
│ MERGE touches only affected rows in department_tree
│ department_tree's own change buffer is updated
│
├──▶ [Layer 2] Refresh department_stats
│ delta query reads department_tree's change buffer
│ MERGE touches only affected department groups
│
└──▶ [Layer 3] Refresh department_report
delta query reads department_stats' change buffer
MERGE touches only affected division rows
All change buffers cleaned up ✓
All three layers run in a single scheduled pass, in topological order.
2.4a: INSERT ripples through all three layers
INSERT INTO employees (name, department_id, salary) VALUES
('Heidi', 6, 105000); -- New Frontend engineer
What happened immediately (in your transaction): The AFTER INSERT trigger on employees fired and wrote one row to pgtrickle_changes.changes_<employees_oid>. The row contains the new values, action type I, and the LSN at the time of insert. Your transaction committed normally — no blocking.
The stream tables don't know about Heidi yet. The change is in the buffer, waiting for the next refresh.
The background scheduler handles this automatically. With a 1-second schedule, `department_stats` and `department_report` refresh within about a second.

To confirm a refresh has happened, check `data_timestamp` in the monitoring view:

SELECT name, data_timestamp, staleness FROM pgtrickle.pgt_status();

To force an immediate synchronous refresh, wait a moment first (so the scheduler can finish its current tick), then call in topological order. Note that `refresh_stream_table` only refreshes the named table — it does not cascade upstream:

SELECT pg_sleep(2); -- let the scheduler finish any in-progress tick
SELECT pgtrickle.refresh_stream_table('department_stats');
SELECT pgtrickle.refresh_stream_table('department_report');
What happened across the three layers:
| Layer | What ran | Rows touched |
|---|---|---|
| `department_tree` | No change — `employees` is not a source for this stream table | 0 |
| `department_stats` | Delta query: read 1 buffer row, joined to Frontend, COUNT+1, SUM+105000 | 1 (Frontend group only) |
| `department_report` | Delta query: read 1 change from `department_stats`, headcount += 1, payroll += 105000 | 1 (Engineering row only) |
Check the result:
SELECT department_name, headcount, total_salary FROM department_stats
WHERE department_name = 'Frontend';
department_name | headcount | total_salary
-----------------+-----------+--------------
Frontend | 2 | 215000.00
The 6 other groups in department_stats were not touched at all.
Contrast with a standard materialized view: `REFRESH MATERIALIZED VIEW` would re-scan all 8 employees, re-join with all 7 departments, re-aggregate, and update all 7 rows. With pg_trickle, the work was proportional to the 1 changed row — across all three layers.
2.4b: A department change cascades through the whole DAG
Now change the departments table — the root of the entire chain:
INSERT INTO departments (id, name, parent_id) VALUES
(8, 'DevOps', 2); -- New team under Engineering
What happened: The CDC trigger on departments fired. The change buffer for departments has one new row. None of the stream tables know about it yet.
The scheduler handles this automatically — all three tables will refresh within a second, in the correct dependency order (upstream first). To force it synchronously, wait a moment first, then refresh each table in topological order (`refresh_stream_table` does not cascade upstream):

SELECT pg_sleep(2);
SELECT pgtrickle.refresh_stream_table('department_tree');
SELECT pgtrickle.refresh_stream_table('department_stats');
SELECT pgtrickle.refresh_stream_table('department_report');
What happened across all three layers:
| Layer | What ran | Rows touched |
|---|---|---|
| `department_tree` | Semi-naive evaluation: base case finds the new dept, recursive term computes its path. Result: 1 new row | 1 inserted |
| `department_stats` | Delta query reads the new row from `department_tree`'s change buffer; DevOps has 0 employees so the delta is minimal | 1 inserted (headcount=0) |
| `department_report` | Delta on the Engineering row: headcount stays the same (DevOps has 0 employees) | 0 effective changes |
How the recursive CTE refresh works — unlike department_stats, recursive CTEs can't be algebraically differentiated (the recursion references itself). pg_trickle uses incremental fixpoint strategies:
- INSERT → semi-naive evaluation: differentiate the base case, propagate the delta through the recursive term, stopping when no new rows are produced. Only new rows inserted.
- DELETE or UPDATE → Delete-and-Rederive (DRed): remove rows derived from deleted facts, re-derive rows that may have alternative derivation paths, handle cascades cleanly.
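To make semi-naive evaluation concrete for the insert case, here is a hand-written approximation for a newly inserted, non-root department. The buffer table name and columns are invented, and root inserts plus the DRed path for deletes are omitted — this is a sketch of the strategy, not the engine's actual SQL:

```sql
-- Seed the recursion with the *changed* rows only, attach them under the
-- already-materialized tree, then expand until no new rows are produced:
WITH RECURSIVE delta AS (
  SELECT d.id, d.name, d.parent_id,
         t.path || ' > ' || d.name AS path, t.depth + 1 AS depth
  FROM   pgtrickle_changes.changes_departments d  -- Δ rows, not the full table
  JOIN   department_tree t ON t.id = d.parent_id  -- parents already in the ST
  UNION ALL
  SELECT d.id, d.name, d.parent_id,
         delta.path || ' > ' || d.name, delta.depth + 1
  FROM   departments d
  JOIN   delta ON d.parent_id = delta.id          -- descendants of new rows
)
SELECT * FROM delta;   -- the rows to INSERT into department_tree
```

The fixpoint touches only the new subtree; the existing 7 rows of `department_tree` are never recomputed.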
SELECT id, name, depth, path FROM department_tree WHERE name = 'DevOps';
id | name | depth | path
----+--------+-------+--------------------------------
8 | DevOps | 2 | Company > Engineering > DevOps
(1 row)
The recursive CTE automatically expanded to include the new department at the correct depth and path. One inserted row in departments produced one new row in the stream table.
2.4c: UPDATE — A single rename that cascades everywhere
Rename "Engineering" to "R&D":
UPDATE departments SET name = 'R&D' WHERE id = 2;
What happened in the change buffer: The CDC trigger captured the old row (name='Engineering') and the new row (name='R&D'). Both old and new values are stored so the delta can compute what to remove and what to add.
Wait a moment for the scheduler to propagate the rename through all layers. To force it synchronously, wait then refresh each table in topological order (refresh_stream_table does not cascade upstream):
SELECT pg_sleep(2);
SELECT pgtrickle.refresh_stream_table('department_tree');
SELECT pgtrickle.refresh_stream_table('department_stats');
SELECT pgtrickle.refresh_stream_table('department_report');
What happened across all three layers:
| Layer | Work done | Result |
|---|---|---|
| `department_tree` | DRed strategy: delete rows derived with the old name, re-derive with the new name. 5 rows updated (Engineering + 4 sub-teams) | Paths now say `Company > R&D > …` |
| `department_stats` | Delta reads 5 changed rows from `department_tree`'s buffer; updates the `full_path` column for those 5 departments | 5 rows updated |
| `department_report` | Division name changed: "Engineering" row replaced by "R&D" row | 1 DELETE + 1 INSERT |
Query to verify the cascade:
SELECT name, path FROM department_tree WHERE path LIKE '%R&D%' ORDER BY depth, name;
name | path
----------+--------------------------
R&D | Company > R&D
Backend | Company > R&D > Backend
DevOps | Company > R&D > DevOps
Frontend | Company > R&D > Frontend
Platform | Company > R&D > Platform
(5 rows)
One UPDATE to a department name flowed through all three layers automatically — updating 5 + 5 + 2 rows across the chain.
2.4d: DELETE — Remove an employee
DELETE FROM employees WHERE name = 'Bob';
What happened: The AFTER DELETE trigger on employees fired, writing a change buffer row with action type D and Bob's old values (department_id=5, salary=115000). The delta query will use these old values to compute the correct aggregate adjustment — it knows to subtract 115000 from Backend's salary sum and decrement the count.
Important — refresh before querying: The background scheduler refreshes all three tables within ~1 second, in topological order. To see the result immediately, wait a moment then explicitly refresh in upstream-first order:
SELECT pg_sleep(2);
SELECT pgtrickle.refresh_stream_table('department_stats');
SELECT pgtrickle.refresh_stream_table('department_report');
Why call `department_stats` first? `department_stats` sources from both `employees` and `department_tree`. Refreshing in topological order ensures each layer processes its upstream changes before computing its own deltas. Even when `department_tree` has unprocessed changes from 2.4c and a new employee change arrives simultaneously, pg_trickle's differential engine handles both correctly — using the pre-change left snapshot (L₀) to avoid double-counting.
Then verify the result:
SELECT department_name, headcount, total_salary, avg_salary
FROM department_stats WHERE department_name = 'Backend';
department_name | headcount | total_salary | avg_salary
-----------------+-----------+--------------+---------------------
Backend | 1 | 120000.00 | 120000.000000000000
(1 row)
Headcount dropped from 2 → 1 and the salary aggregates updated. Again, only the Backend group was recomputed — the other 6 department rows were untouched.
Chapter 3: Scheduling & Backpressure
Automatic Scheduling — Let the DAG Drive Itself
pg_trickle runs a background scheduler that automatically refreshes stale tables in topological order. In the Step 4 examples above, the scheduler handled every change within about a second. You can also call refresh_stream_table() directly when needed (e.g. in scripts or tests), but in normal operation the scheduler takes care of everything.
How schedules propagate
We gave department_report a '1s' schedule and the two upstream tables a NULL schedule (CALCULATED mode). This is the recommended pattern:
department_tree (CALCULATED → inherits 1s from downstream)
│
department_stats (CALCULATED → inherits 1s from downstream)
│
department_report (1s — the only explicit schedule)
CALCULATED mode (pass schedule => 'calculated') means: compute the tightest schedule across all downstream dependents. You declare freshness requirements at the tables your application queries — the system figures out how often each upstream table needs to refresh.
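The propagation rule is easy to model. Below is a conceptual Python sketch (names and structure are illustrative, not pg_trickle's internals): a CALCULATED table inherits the minimum effective interval among its downstream dependents, computed transitively.

```python
def effective_schedule(table, explicit, dependents):
    """Return the effective refresh interval (seconds) for `table`.

    explicit   -- dict: table -> interval in seconds, or None for CALCULATED
    dependents -- dict: table -> list of downstream tables that read from it
    """
    if explicit.get(table) is not None:
        return explicit[table]
    # CALCULATED: inherit the tightest (smallest) schedule downstream.
    downstream = [effective_schedule(d, explicit, dependents)
                  for d in dependents.get(table, [])]
    return min(downstream) if downstream else None

explicit = {"department_tree": None, "department_stats": None,
            "department_report": 1}
dependents = {"department_tree": ["department_stats"],
              "department_stats": ["department_report"]}

print(effective_schedule("department_tree", explicit, dependents))  # 1
```

Because the rule takes a minimum, adding a second dependent with a looser schedule never slows an upstream table down; only a tighter downstream requirement changes it.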
What the scheduler does every second
- Queries the catalog for stream tables past their freshness bound
- Sorts them topologically (upstream first) — `department_tree` refreshes before `department_stats`, which refreshes before `department_report`
- Runs each refresh (respecting `pg_trickle.max_concurrent_refreshes`)
- Updates the last-refresh frontier
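The ordering logic of a scheduler tick can be sketched conceptually in Python using the standard library's `graphlib`; this models only the upstream-first ordering, not the real worker code:

```python
from graphlib import TopologicalSorter

def refresh_order(stale, deps):
    """Order stale stream tables upstream-first.

    stale -- set of table names past their freshness bound
    deps  -- dict: table -> set of upstream tables it reads from
    """
    # Restrict each dependency set to stale tables: fresh upstreams
    # don't need to refresh first.
    ts = TopologicalSorter({t: deps.get(t, set()) & stale for t in stale})
    return list(ts.static_order())

stale = {"department_report", "department_stats", "department_tree"}
deps = {"department_stats": {"department_tree"},
        "department_report": {"department_stats"}}
print(refresh_order(stale, deps))  # department_tree first, department_report last
```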
Monitoring
-- Current status of all stream tables
SELECT name, status, refresh_mode, schedule, data_timestamp, staleness
FROM pgtrickle.pgt_status();
name | status | refresh_mode | schedule | data_timestamp | staleness
-----------------------------+--------+---------------+----------+-----------------------------+-----------------
public.department_tree | ACTIVE | DIFFERENTIAL | | 2026-02-26 10:30:00.123+01 | 00:00:00.877
public.department_stats | ACTIVE | DIFFERENTIAL | | 2026-02-26 10:30:00.456+01 | 00:00:00.544
public.department_report | ACTIVE | DIFFERENTIAL | 1s | 2026-02-26 10:30:00.789+01 | 00:00:00.211
-- Detailed performance stats
SELECT pgt_name, total_refreshes, avg_duration_ms, successful_refreshes
FROM pgtrickle.pg_stat_stream_tables;
-- Health check: quick triage of common issues
SELECT check_name, severity, detail FROM pgtrickle.health_check();
-- Visualize the dependency DAG
SELECT * FROM pgtrickle.dependency_tree();
-- Recent refresh timeline across all stream tables
SELECT * FROM pgtrickle.refresh_timeline(10);
-- Check CDC change buffer sizes (spotting buffer build-up)
SELECT * FROM pgtrickle.change_buffer_sizes();
See SQL_REFERENCE.md for the full list of monitoring functions including list_sources(), trigger_inventory(), and diamond_groups().
Chapter 4: Monitoring In Depth
This chapter expands on the monitoring overview in Chapter 3. For the five most important day-to-day introspection queries, see the Monitoring Quick Reference at the end of this guide.
Optional: WAL-based CDC
By default pg_trickle uses triggers. If wal_level = logical is configured, set:
ALTER SYSTEM SET pg_trickle.cdc_mode = 'auto';
SELECT pg_reload_conf();
pg_trickle will automatically transition each stream table from trigger-based to WAL-based capture after the first successful refresh — reducing per-write overhead from ~2–15 μs (triggers) to near-zero (WAL-based capture adds no synchronous overhead to your DML). The transition is transparent; your queries and the refresh schedule are unaffected.
Optional: Parallel Refresh (v0.4.0+)
By default the scheduler refreshes stream tables sequentially in topological order within a single background worker. This is correct and efficient for most workloads.
For deployments with many independent stream tables, enable parallel refresh:
ALTER SYSTEM SET pg_trickle.parallel_refresh_mode = 'on';
ALTER SYSTEM SET pg_trickle.max_dynamic_refresh_workers = 4; -- cluster-wide cap
SELECT pg_reload_conf();
Independent stream tables at the same DAG level will then refresh concurrently in separate dynamic background workers. Refresh pairs with IMMEDIATE-trigger connections and atomic consistency groups still run in a single worker for correctness.
Before enabling, ensure max_worker_processes has enough room:
max_worker_processes >= 1 (launcher)
+ number of databases with stream tables
+ max_dynamic_refresh_workers (default 4)
+ autovacuum and other extension workers
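The budget above is just a sum; a hypothetical helper makes the arithmetic explicit (the `other_workers` estimate is an assumption, not a documented value):

```python
def min_worker_processes(n_databases, dynamic_workers=4, other_workers=3):
    """Lower bound for max_worker_processes per the sizing rule above.

    other_workers is a placeholder estimate for autovacuum and
    other extensions' background workers.
    """
    launcher = 1  # the pg_trickle launcher worker
    return launcher + n_databases + dynamic_workers + other_workers

print(min_worker_processes(n_databases=2))  # 1 + 2 + 4 + 3 = 10
```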
Monitor parallel refresh:
SELECT * FROM pgtrickle.worker_pool_status(); -- live worker budget
SELECT * FROM pgtrickle.parallel_job_status(60); -- recent jobs
See CONFIGURATION.md — Parallel Refresh for the complete tuning reference.
Optional: PgBouncer / Connection Pooler Compatibility (v0.10.0+)
If you're connecting through PgBouncer or another connection pooler in transaction mode (the default on Supabase, Railway, Neon, and most managed PostgreSQL platforms), set pooler_compatibility_mode when creating or altering a stream table:
SELECT pgtrickle.create_stream_table(
name => 'live_headcount',
query => 'SELECT department_id, COUNT(*) FROM employees GROUP BY 1',
schedule => '1s',
pooler_compatibility_mode => true
);
This disables prepared statements and NOTIFY emissions for that table — the two features that break in transaction-pool mode. Leave it off (the default) if you connect directly to PostgreSQL.
Optional: Change Buffer Compaction (v0.10.0+)
For high-churn tables, pg_trickle automatically compacts the pending change buffer before each refresh cycle when it exceeds pg_trickle.compact_threshold (default 100,000 rows). INSERT→DELETE pairs that cancel each other out are eliminated, and multiple changes to the same row are collapsed to a single net change, reducing delta scan overhead by 50–90%.
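The net effect of compaction can be illustrated with a small Python sketch; this is a conceptual model of the rules described above, not the extension's actual code:

```python
def compact(changes):
    """Collapse a change buffer to at most one net change per primary key.

    changes -- ordered list of (pk, action, row), action in {'I', 'U', 'D'}.
    """
    first, latest = {}, {}
    for pk, action, row in changes:
        first.setdefault(pk, action)  # remember the earliest action per row
        latest[pk] = (action, row)    # and the latest image
    net = {}
    for pk, (action, row) in latest.items():
        if first[pk] == "I" and action == "D":
            continue                  # INSERT ... DELETE cancels out entirely
        if first[pk] == "I":
            net[pk] = ("I", row)      # row is new overall: net INSERT of latest image
        elif action == "D":
            net[pk] = ("D", None)     # row existed before and is gone: net DELETE
        else:
            net[pk] = ("U", row)      # row existed and still exists: net UPDATE
    return net

buf = [(1, "I", {"x": 1}), (1, "U", {"x": 2}),   # two changes -> one net INSERT
       (2, "I", {"x": 9}), (2, "D", None),       # cancels out, key dropped
       (3, "U", {"x": 5}), (3, "U", {"x": 6})]   # collapses to one net UPDATE
print(compact(buf))
```

Six buffered changes compact to two net changes, which is exactly the 50–90% reduction in delta scan work the paragraph describes.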
Chapter 5: Advanced Topics
Refresh Modes and IVM Strategies
You've now seen how pg_trickle performs incremental view maintenance (IVM). Understanding the four refresh modes and when each strategy applies helps you write efficient stream table queries.
The Four Refresh Modes
| Mode | When it refreshes | Use case |
|---|---|---|
| AUTO (default) | On a schedule (background) | Most use cases — uses DIFFERENTIAL when possible, falls back to FULL automatically |
| DIFFERENTIAL | On a schedule (background) | Like AUTO but errors if the query can't be differentiated |
| FULL | On a schedule (background) | Forces full recompute every cycle |
| IMMEDIATE | Synchronously, in the same transaction as the DML | Real-time dashboards, audit tables — the stream table is always up-to-date |
When you omit refresh_mode, the default is 'AUTO' — it uses differential (delta-only) maintenance when the query supports it, and automatically falls back to full recomputation when it doesn't. You only need to specify a mode explicitly for advanced cases.
IMMEDIATE mode (new in v0.2.0) maintains stream tables synchronously within the same transaction as the base table DML. It uses statement-level AFTER triggers with transition tables — no change buffers, no scheduler. The stream table is always consistent with the current transaction.
-- Create a stream table that updates in real-time
SELECT pgtrickle.create_stream_table(
name => 'live_headcount',
query => $$
SELECT department_id, COUNT(*) AS headcount
FROM employees
GROUP BY department_id
$$,
refresh_mode => 'IMMEDIATE'
);
-- After any INSERT/UPDATE/DELETE on employees,
-- live_headcount is already up-to-date — no refresh needed!
IMMEDIATE mode supports joins, aggregates, window functions, LATERAL subqueries, and cascading IMMEDIATE stream tables. Recursive CTEs are not supported in IMMEDIATE mode (use DIFFERENTIAL instead).
You can switch between modes at any time:
-- Switch from DIFFERENTIAL to IMMEDIATE
SELECT pgtrickle.alter_stream_table('department_stats', refresh_mode => 'IMMEDIATE');
-- Switch back to DIFFERENTIAL with a schedule
SELECT pgtrickle.alter_stream_table('department_stats', refresh_mode => 'DIFFERENTIAL', schedule => '1s');
Algebraic Differentiation (used by department_stats)
For queries composed of scans, filters, joins, and algebraic aggregates (COUNT, SUM, AVG), pg_trickle can derive the IVM delta mathematically. The rules come from the DBSP differential dataflow framework:
| Operator | Delta Rule | Cost |
|---|---|---|
| Scan | Read only change buffer rows (not the full table) | O(changes) |
| Filter (WHERE) | Apply predicate to change rows | O(changes) |
| Join | Join change rows from one side against the full other side | O(changes × lookup) |
| Aggregate (COUNT/SUM/AVG) | Add or subtract deltas per group — no rescan | O(affected groups) |
| Project | Pass through | O(changes) |
The total cost is proportional to the number of changes, not the table size. For a million-row table with 10 changes, the delta query touches ~10 rows.
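To make the aggregate rule concrete, here is a Python sketch of applying deltas to per-group COUNT/SUM state, mirroring the Backend example from the DELETE step earlier (illustrative only; pg_trickle does this in SQL via MERGE):

```python
def apply_delta(groups, delta):
    """Apply change-buffer deltas to per-group COUNT/SUM aggregates.

    groups -- dict: key -> [count, sum]  (AVG is derived as sum/count)
    delta  -- list of (key, sign, value); sign +1 for inserted rows,
              -1 for deleted rows (an UPDATE contributes one of each).
    Only the groups present in the delta are touched.
    """
    for key, sign, value in delta:
        cnt, total = groups.get(key, [0, 0])
        cnt, total = cnt + sign, total + sign * value
        if cnt == 0:
            groups.pop(key, None)  # last member removed: group vanishes
        else:
            groups[key] = [cnt, total]
    return groups

stats = {"Backend": [2, 235000], "DevOps": [3, 300000]}
# Bob (Backend, 115000) deleted: subtract his old values from his group only
apply_delta(stats, [("Backend", -1, 115000)])
print(stats)  # {'Backend': [1, 120000], 'DevOps': [3, 300000]}
```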
Incremental Strategies for Recursive CTEs (used by department_tree)
For recursive CTEs, pg_trickle can't derive an algebraic delta because the recursion references itself. Instead it uses two complementary strategies, chosen automatically based on what changed:
Semi-naive evaluation (for INSERT-only changes):
- Differentiate the base case — find the new seed rows
- Propagate the delta through the recursive term, iterating until no new rows are produced
- The result is only the new rows created by the change — not the whole tree
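Semi-naive evaluation is easiest to see on a transitive-closure example. The Python sketch below (conceptual, on a toy org-chart graph) joins only the *delta* against known facts on each iteration, so work is proportional to the new rows:

```python
def seminaive_insert(closure, new_edges):
    """Semi-naive delta evaluation for a transitive-closure recursion.

    closure   -- existing set of (ancestor, descendant) facts, already closed
    new_edges -- newly inserted base edges (the delta seed)
    Returns only the facts created by the insertion; each iteration joins
    the previous delta against known facts instead of rescanning everything.
    """
    known = set(closure)
    delta = set(new_edges) - known
    derived = set()
    while delta:
        derived |= delta
        known |= delta
        # join the delta with everything known so far, on either side
        step = {(a, d) for (a, b) in delta for (b2, d) in known if b == b2}
        step |= {(a, d) for (a, b) in known for (b2, d) in delta if b == b2}
        delta = step - known
    return derived

closed = {("company", "eng"), ("eng", "backend"), ("company", "backend")}
print(seminaive_insert(closed, {("backend", "alice")}))
# only alice's three new ancestry facts are derived
```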
Delete-and-Rederive (DRed) (for DELETE or UPDATE):
- Remove all rows derived from the old fact
- Re-derive rows that had the old fact as one of their derivation paths (they may still be reachable via other paths)
- Insert the newly derived rows under the new fact
Both strategies are more efficient than full recomputation — they work on the affected portion of the result set, not the entire recursive query. The MERGE only modifies rows that actually changed.
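Delete-and-Rederive can likewise be sketched on a transitive-closure toy example. The over-delete step marks every fact whose path could pass through the deleted edge; the re-derive step restores facts still provable via alternative paths (conceptual Python, not the extension's implementation):

```python
def dred_delete(full_closure, base_edges, removed):
    """Delete-and-Rederive (DRed) for a transitive-closure view.

    full_closure -- materialized (ancestor, descendant) facts
    base_edges   -- base edges, still including `removed`
    removed      -- the (parent, child) base edge being deleted
    """
    u, v = removed
    def reaches(x, y):
        return x == y or (x, y) in full_closure
    # 1. Over-delete: any fact whose path may pass through the deleted edge.
    suspect = {(a, d) for (a, d) in full_closure if reaches(a, u) and reaches(v, d)}
    # 2. Re-derive: restore suspects still provable without the deleted edge.
    base = set(base_edges) - {removed}
    good = (full_closure - suspect) | (suspect & base)
    changed = True
    while changed:
        changed = False
        for a, d in suspect - good:
            # still derivable in one step from surviving facts?
            if any(b == a and (c, d) in good for (b, c) in good):
                good.add((a, d))
                changed = True
    return good

base = {("company", "eng"), ("eng", "backend"), ("company", "backend")}
closure = base  # already transitively closed for this tiny graph
print(dred_delete(closure, base, ("eng", "backend")))
# backend stays reachable via its direct edge from company
```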
When to use which strategy?
You don't choose — pg_trickle detects the strategy automatically based on the query structure:
| Query Pattern | Strategy | Performance |
|---|---|---|
| Scan + Filter + Join + algebraic Aggregate (COUNT/SUM/AVG) | Algebraic | Excellent — O(changes) |
| CORR, COVAR_POP/SAMP, REGR_* (12 functions) | Algebraic (Welford running totals) | O(changes) — running totals updated per changed row, no group rescan (v0.10.0+) |
| Non-recursive CTEs | Algebraic (inlined) | CTE body is differentiated inline |
| MIN / MAX aggregates | Semi-algebraic | Uses LEAST/GREATEST merge; per-group rescan only when an extremum is deleted |
| STRING_AGG, ARRAY_AGG, ordered-set aggregates | Group-rescan | Affected groups fully re-aggregated from source |
| GROUPING SETS / CUBE / ROLLUP | Algebraic (rewritten) | Auto-expanded to UNION ALL of GROUP BY queries; CUBE capped at 64 branches |
| Recursive CTEs (WITH RECURSIVE) — INSERT | Semi-naive evaluation | O(new rows derived from the change) |
| Recursive CTEs (WITH RECURSIVE) — DELETE/UPDATE | Delete-and-Rederive | Re-derives rows with alternative paths; O(affected subgraph) (v0.10.0+) |
| LATERAL subqueries | Correlated re-evaluation | Only outer rows correlated with changed inner data are re-evaluated — O(correlated rows) (v0.10.0+) |
| Window functions | Partition recompute | Only affected partitions recomputed |
| ORDER BY … LIMIT N (TopK) | Scoped recomputation | Re-evaluates top-N via MERGE; stores exactly N rows |
| IMMEDIATE mode queries | In-transaction delta | Same algebraic strategies, applied synchronously via transition tables |
FUSE Circuit Breaker (v0.11.0+)
The fuse is a circuit breaker that stops a stream table from processing an unexpectedly large batch of changes — for example from a runaway script or mass-delete migration — without operator review.
-- Arm a fuse: blow when pending changes exceed 50,000 rows
SELECT pgtrickle.alter_stream_table(
'category_summary',
fuse => 'on',
fuse_ceiling => 50000
);
-- Check fuse status across all stream tables
SELECT name, fuse_mode, fuse_state, fuse_ceiling, blown_at
FROM pgtrickle.fuse_status();
-- After investigating and deciding to apply the batch:
SELECT pgtrickle.reset_fuse('category_summary', action => 'apply');
-- Or skip the oversized batch entirely and resume from current state:
SELECT pgtrickle.reset_fuse('category_summary', action => 'skip_changes');
reset_fuse supports three actions:
- `'apply'` — process all pending changes and resume normal scheduling.
- `'reinitialize'` — drop and repopulate the stream table from scratch.
- `'skip_changes'` — discard pending changes and resume from the current frontier.
A pgtrickle_alert NOTIFY is emitted when the fuse blows, making it easy to
hook into alerting pipelines or LISTEN from application code.
Partitioned Stream Tables (v0.11.0+)
For large stream tables, declare a partition key at creation time so MERGE operations are scoped to only the relevant partitions:
SELECT pgtrickle.create_stream_table(
name => 'sales_by_month',
query => $$
SELECT
DATE_TRUNC('month', sale_date) AS month,
product_id,
SUM(amount) AS total_sales
FROM sales
GROUP BY 1, 2
$$,
schedule => '1m',
partition_by => 'month' -- partition key must be in the SELECT output
);
pg_trickle creates the storage table as PARTITION BY RANGE (month) with a
catch-all partition, then on each refresh:
- Inspects the delta to find the `MIN` and `MAX` of the partition key.
- Injects `AND st.month BETWEEN min AND max` into the MERGE ON clause.
- PostgreSQL prunes all partitions outside the range — giving ~100× I/O reduction for a 0.1% change rate on a 10M-row table.
See SQL_REFERENCE.md for full partitioning options.
Multi-Tenant Worker Quotas (v0.11.0+)
In deployments with multiple databases, one busy database can starve others
if all dynamic refresh workers are claimed. The per_database_worker_quota
GUC prevents this:
-- Limit one performance-critical database to 4 workers (with burst to 6)
ALTER DATABASE analytics SET pg_trickle.per_database_worker_quota = 4;
-- Allow a reporting database only 2 base workers
ALTER DATABASE reporting SET pg_trickle.per_database_worker_quota = 2;
-- Apply changes
SELECT pg_reload_conf();
When the cluster has spare capacity (active workers < 80% of
max_dynamic_refresh_workers), a database may temporarily burst to 150% of
its quota. Burst is reclaimed within 1 scheduler cycle once load rises.
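The burst rule reduces to a couple of comparisons; a conceptual sketch (the 80% and 150% figures come from the text above, the function itself is hypothetical):

```python
def allowed_workers(quota, active_total, max_workers):
    """Effective worker cap for one database under per-database quotas.

    When cluster-wide active workers are below 80% of
    max_dynamic_refresh_workers, a database may burst to 150% of its
    quota (rounded down); otherwise the base quota applies.
    """
    spare_capacity = active_total < 0.8 * max_workers
    return int(quota * 1.5) if spare_capacity else quota

print(allowed_workers(quota=4, active_total=2, max_workers=8))  # 6 (burst)
print(allowed_workers(quota=4, active_total=7, max_workers=8))  # 4 (no burst)
```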
Within each dispatch tick, IMMEDIATE-trigger closures are always dispatched
first, followed by atomic groups, singletons, and cyclic SCCs.
See CONFIGURATION.md for full quota tuning options.
Clean Up
When you're done experimenting, drop the stream tables. Drop dependents before their sources:
SELECT pgtrickle.drop_stream_table('department_report');
SELECT pgtrickle.drop_stream_table('department_stats');
SELECT pgtrickle.drop_stream_table('department_tree');
DROP TABLE employees;
DROP TABLE departments;
drop_stream_table atomically removes in a single transaction:
- The storage table (e.g., `public.department_stats`)
- CDC triggers on source tables (removed only if no other stream table references the same source)
- Change buffer tables in `pgtrickle_changes`
- Catalog entries in `pgtrickle.pgt_stream_tables`
Monitoring Quick Reference
pg_trickle ships several built-in monitoring functions and a ready-made Prometheus/Grafana stack. Here are the five most useful functions for day-to-day operations.
Stream Table Status
-- Overview of all stream tables: status, staleness, last refresh time, errors
SELECT name, status, staleness, last_refresh_at, last_error
FROM pgtrickle.pgt_status();
Health Check
-- Run all built-in health checks; returns severity (OK/WARNING/CRITICAL) per check
SELECT check_name, severity, detail FROM pgtrickle.health_check();
Change Buffer Sizes
-- Show CDC buffer row counts per source table — useful for spotting backlogs
SELECT * FROM pgtrickle.change_buffer_sizes();
Dependency Tree
-- Visualize the DAG: which stream tables depend on what
SELECT * FROM pgtrickle.dependency_tree();
Fuse Status
-- Check circuit breaker state for all stream tables (v0.11.0+)
SELECT * FROM pgtrickle.fuse_status();
Prometheus & Grafana
For production monitoring, pg_trickle ships a ready-made observability stack
in the monitoring/ directory:
cd monitoring && docker compose up
This starts PostgreSQL + postgres_exporter + Prometheus + Grafana with
pre-configured dashboards and alerting rules. Grafana is available at
http://localhost:3000 (admin/admin). See monitoring/README.md
for the full list of exported metrics and alert conditions.
Key Prometheus metrics:
| Metric | Description |
|---|---|
| `pgtrickle_refresh_total` | Cumulative refresh count per table |
| `pgtrickle_refresh_duration_seconds` | Last refresh duration per table |
| `pgtrickle_staleness_seconds` | Seconds since last successful refresh |
| `pgtrickle_consecutive_errors` | Current error streak per table |
| `pgtrickle_cdc_buffer_rows` | Pending change buffer rows per source table |
Pre-configured alerts: staleness > 5 min, ≥3 consecutive failures, table SUSPENDED, CDC buffer > 1 GB, scheduler down, high refresh duration.
Summary: What You Learned
| Concept | What you saw |
|---|---|
| Stream tables | Tables defined by a SQL query that stay automatically up to date |
| CDC triggers | Lightweight change capture in the same transaction — no logical replication or polling required |
| DAG scheduling | Stream tables can depend on other stream tables; refreshes run in topological order, schedules propagate upstream via CALCULATED mode |
| Algebraic IVM | Delta queries that process only changed rows — O(changes) regardless of table size |
| Semi-naive / DRed | Incremental strategies for WITH RECURSIVE — INSERT uses semi-naive, DELETE/UPDATE uses Delete-and-Rederive (v0.10.0+) |
| IMMEDIATE mode | Synchronous in-transaction IVM — stream tables updated within the same transaction as your DML, always consistent |
| TopK | ORDER BY … LIMIT N queries store exactly N rows, refreshed via scoped recomputation |
| Diamond consistency | Atomic refresh groups for diamond-shaped dependency graphs via diamond_consistency = 'atomic' |
| Downstream propagation | A single base table write cascades through an entire chain of stream tables, automatically, in the right order |
| Trigger-based CDC | Lightweight row-level triggers by default (no WAL configuration needed); optional transition to WAL-based capture via pg_trickle.cdc_mode = 'auto' |
| Parallel refresh | Independent stream tables refresh concurrently in dynamic background workers via pg_trickle.parallel_refresh_mode = 'on' (v0.4.0+, default off) |
| auto_backoff | Scheduler automatically stretches effective interval when refresh cost exceeds 95% of the schedule window, capped at 8× (on by default, v0.10.0+) |
| PgBouncer compatibility | Set pooler_compatibility_mode => true per stream table to work behind transaction-mode connection poolers (v0.10.0+) |
| Monitoring | pgt_status(), health_check(), dependency_tree(), pg_stat_stream_tables, and more for freshness, timing, and error history |
The key takeaway: you write to base tables — pg_trickle does the rest. Data flows downstream automatically, each layer doing the minimum work proportional to what changed, in dependency order.
Troubleshooting
Stream table is stale / not refreshing
Check the status view first:
SELECT name, status, last_error, last_refresh_at, staleness FROM pgtrickle.pgt_status();
A status of ERROR means the last refresh failed. last_error contains the message. Fix the underlying issue (e.g., a dropped column referenced in the query) then call:
SELECT pgtrickle.refresh_stream_table('your_table');
For a broader health check:
SELECT check_name, severity, detail FROM pgtrickle.health_check();
Change buffer growing large
If a stream table has status = 'PAUSED' or refreshes are falling behind:
SELECT * FROM pgtrickle.change_buffer_sizes(); -- find large buffers
Large buffers are normal under heavy load — auto_backoff slows the schedule to avoid CPU runaway and will self-correct once throughput stabilises. If a buffer stays large indefinitely, check last_error in pgt_status() for a blocked refresh.
CDC triggers missing after restore / point-in-time recovery
PITR restores the heap table but not the triggers if the extension was installed after the base backup. Verify:
SELECT * FROM pgtrickle.trigger_inventory(); -- expected vs installed triggers
Any missing trigger can be reinstalled with:
SELECT pgtrickle.repair_stream_table('your_table');
Deployment Best Practices
Once you've built your stream tables interactively, you'll want to deploy them reliably — via SQL migration scripts, dbt, or GitOps pipelines.
Kubernetes Deployment (CloudNativePG)
pg_trickle integrates natively with CloudNativePG
using Image Volume Extensions (Kubernetes 1.33+). The extension is packaged
as a scratch-based OCI image containing only the .so, .control, and .sql
files — no custom PostgreSQL image required.
Prerequisites
- Kubernetes 1.33+ with the `ImageVolume` feature gate enabled
- CloudNativePG operator 1.28+
- pg_trickle extension image pushed to your cluster registry
Quick Start
1. Deploy the Cluster with the extension mounted as an Image Volume:
# cnpg/cluster-example.yaml (abridged)
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
name: pg-trickle-demo
spec:
instances: 3
imageName: ghcr.io/cloudnative-pg/postgresql:18
postgresql:
shared_preload_libraries:
- pg_trickle
extensions:
- name: pg-trickle
image:
reference: ghcr.io/<owner>/pg_trickle-ext:<version>
parameters:
max_worker_processes: "8"
2. Create the extension declaratively with a CNPG Database resource:
# cnpg/database-example.yaml
apiVersion: postgresql.cnpg.io/v1
kind: Database
metadata:
name: pg-trickle-app
spec:
name: app
owner: app
cluster:
name: pg-trickle-demo
extensions:
- name: pg_trickle
3. Apply both resources:
kubectl apply -f cnpg/cluster-example.yaml
kubectl apply -f cnpg/database-example.yaml
Full example manifests are in the cnpg/ directory.
Health Monitoring
CNPG manages PostgreSQL liveness/readiness probes via its instance manager. For pg_trickle-specific health, use the built-in health check function:
-- Run against the primary or any replica:
SELECT * FROM pgtrickle.health_check();
This returns rows for scheduler status, error/suspended tables, stale tables, CDC buffer growth, WAL slot lag, and worker pool utilization. Integrate it into your monitoring stack:
- Prometheus: Use the CNPG monitoring integration to expose `pgtrickle.health_check()` results as custom metrics
- Kubernetes CronJob: Schedule periodic health checks and alert via your existing alerting pipeline
- pgtrickle-tui: The TUI tool has a dedicated Health view that polls `health_check()` continuously
Probe Configuration
The example manifests include probe settings tuned for pg_trickle workloads:
probes:
startup:
periodSeconds: 10
failureThreshold: 60 # 10 min for shared_preload_libraries init
liveness:
periodSeconds: 10
failureThreshold: 6 # 60s before restart
readiness:
type: streaming
maximumLag: 64Mi # replicas must be streaming before serving reads
Why readiness: streaming? Stream tables are readable on replicas, but
a lagging replica serves stale stream table data. The maximumLag setting
ensures replicas are caught up before receiving traffic.
Failover Behavior
When the primary pod fails and CNPG promotes a replica:
- Scheduler: The new primary starts the pg_trickle scheduler background worker automatically (registered via `shared_preload_libraries`)
- Stream tables: All stream table definitions are stored in the `pgtrickle.pgt_stream_tables` catalog table, which is replicated to all replicas. The promoted replica has the complete catalog.
- CDC triggers: Trigger definitions are replicated as part of the WAL stream. The new primary's triggers fire normally on new writes.
- Change buffers: Uncommitted change buffer rows from in-flight transactions on the old primary are lost (standard PostgreSQL behavior). The next refresh cycle detects the gap and performs a FULL refresh to resynchronize.
- Refresh frontiers: Each stream table's last-refresh frontier is stored in the catalog. If the frontier is ahead of the available change buffer data (due to WAL replay lag), the scheduler falls back to FULL refresh once and then resumes DIFFERENTIAL.
No manual intervention is required after failover.
Idempotent SQL Migrations
Use create_or_replace_stream_table() in your migration scripts. It's safe to
run on every deploy:
-- migrations/V003__stream_tables.sql
-- Creates if absent, updates if definition changed, no-op if identical.
SELECT pgtrickle.create_or_replace_stream_table(
name => 'employee_salaries',
query => 'SELECT e.id, e.name, d.name AS department, e.salary
FROM employees e JOIN departments d ON e.department_id = d.id',
schedule => '30s',
refresh_mode => 'DIFFERENTIAL'
);
SELECT pgtrickle.create_or_replace_stream_table(
name => 'department_stats',
query => 'SELECT department, COUNT(*) AS headcount, AVG(salary) AS avg_salary
FROM employee_salaries GROUP BY department',
schedule => '30s',
refresh_mode => 'DIFFERENTIAL'
);
If someone changes the query in a later migration, create_or_replace detects
the difference and migrates the storage table in place — no need to drop and
recreate.
dbt Integration
With the dbt-pgtrickle
package, stream tables are just dbt models with materialized='stream_table':
-- models/department_stats.sql
{{ config(
materialized='stream_table',
schedule='30s',
refresh_mode='DIFFERENTIAL'
) }}
SELECT department, COUNT(*) AS headcount, AVG(salary) AS avg_salary
FROM {{ ref('employee_salaries') }}
GROUP BY department
Every dbt run calls create_or_replace_stream_table() under the hood,
so deployments are always idempotent.
Day 2 Operations
Added in v0.20.0 (UX-4).
Once your stream tables are running in production, pg_trickle can monitor itself using its own stream tables — a technique called dog-feeding.
Enabling Dog-Feeding
-- Create all five monitoring stream tables (idempotent, safe to repeat).
SELECT pgtrickle.setup_dog_feeding();
-- Check what was created.
SELECT * FROM pgtrickle.dog_feeding_status();
This creates five stream tables in the pgtrickle schema:
| Stream Table | Purpose |
|---|---|
| `df_efficiency_rolling` | Rolling-window refresh statistics (replaces manual `refresh_efficiency()` calls) |
| `df_anomaly_signals` | Detects duration spikes, error bursts, mode oscillation |
| `df_threshold_advice` | Recommends threshold adjustments based on multi-cycle analysis |
| `df_cdc_buffer_trends` | Tracks CDC buffer growth rates per source table |
| `df_scheduling_interference` | Detects concurrent refresh overlap patterns |
Checking Recommendations
After at least 10–20 refresh cycles have accumulated:
-- Which stream tables have poorly calibrated thresholds?
SELECT pgt_name, current_threshold, recommended_threshold, confidence, reason
FROM pgtrickle.df_threshold_advice
WHERE confidence IN ('HIGH', 'MEDIUM')
AND abs(recommended_threshold - current_threshold) > 0.05;
-- Are any stream tables experiencing anomalies?
SELECT pgt_name, duration_anomaly, recent_failures
FROM pgtrickle.df_anomaly_signals
WHERE duration_anomaly IS NOT NULL OR recent_failures >= 2;
Automatic Threshold Tuning
To let pg_trickle automatically apply threshold recommendations:
SET pg_trickle.dog_feeding_auto_apply = 'threshold_only';
This applies changes only when confidence is HIGH and the recommended threshold
differs by more than 5%. Changes are rate-limited to once per 10 minutes per
stream table and logged with initiated_by = 'DOG_FEED'.
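The apply guard combines three conditions: confidence, magnitude, and a rate limit. A conceptual Python sketch (function and parameter names are illustrative, not the extension's API):

```python
import time

def should_auto_apply(confidence, current, recommended, last_applied_at, now=None):
    """Guard conditions for dog-feeding auto-apply, as described above.

    Applies only HIGH-confidence recommendations whose threshold differs
    by more than 0.05, at most once per 10 minutes per stream table.
    """
    now = time.time() if now is None else now
    return (confidence == "HIGH"
            and abs(recommended - current) > 0.05
            and now - last_applied_at >= 600)

print(should_auto_apply("HIGH", 0.30, 0.40, last_applied_at=0, now=601))    # True
print(should_auto_apply("MEDIUM", 0.30, 0.40, last_applied_at=0, now=601))  # False
```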
Visualizing the DAG
-- See the full refresh graph (Mermaid format, paste into any Mermaid renderer).
SELECT pgtrickle.explain_dag();
Dog-feeding STs appear in green, user STs in blue, suspended in red.
Disabling Dog-Feeding
SELECT pgtrickle.teardown_dog_feeding();
This drops all monitoring stream tables. User stream tables are never affected. The control plane continues operating identically without dog-feeding.
What's Next?
- TUI.md — Terminal UI & CLI tool for managing and monitoring stream tables from outside SQL
- SQL_REFERENCE.md — Full API reference for all functions, views, and configuration
- ARCHITECTURE.md — Deep dive into the system architecture and data flow
- DVM_OPERATORS.md — How each SQL operator is differentiated for incremental maintenance
- CONFIGURATION.md — GUC variables for tuning schedule, concurrency, and cleanup behavior
- Flyway & Liquibase Integration — Migration patterns for Flyway and Liquibase
- ORM Integration — SQLAlchemy and Django ORM patterns for stream tables
- What Happens on INSERT — Detailed trace of a single INSERT through the entire pipeline
- What Happens on UPDATE — How UPDATEs are split into D+I, group key changes, and net-effect computation
- What Happens on DELETE — Reference counting, group deletion, and INSERT+DELETE cancellation
- What Happens on TRUNCATE — Why TRUNCATE bypasses triggers and how to recover
- dbt Getting Started example — Everything above, expressed as dbt models and seeds with a one-command Docker runner
Playground
The quickest way to explore pg_trickle is the playground — a pre-configured Docker environment with sample data and stream tables ready to query. No installation, no configuration. One command and you're running.
Quick Start
git clone https://github.com/grove/pg-trickle.git
cd pg-trickle/playground
docker compose up -d
Then connect:
psql postgresql://postgres:playground@localhost:5432/playground
PostgreSQL 18+ note: The Docker image stores data in a versioned subdirectory (`/var/lib/postgresql/18/main`). The compose file mounts `/var/lib/postgresql` (not `.../data`) — this is intentional.
What's Pre-Loaded
The seed script creates three base tables and five stream tables that cover the most common pg_trickle patterns.
Base Tables
| Table | Description |
|---|---|
| `products` | Product catalog with categories and prices |
| `orders` | Order line items with quantities and timestamps |
| `customers` | Customer profiles with regions |
Stream Tables
| Stream Table | Query | Pattern demonstrated |
|---|---|---|
| `sales_by_region` | SUM(total) grouped by region | Basic aggregate, DIFFERENTIAL mode |
| `top_products` | SUM(quantity) ranked by category | Window function (RANK()) |
| `customer_lifetime_value` | Revenue + order count per customer | Multi-table join + aggregates |
| `daily_revenue` | Revenue per day | Time-series aggregation |
| `active_products` | Products with orders | EXISTS subquery |
Exercises
1. Watch an INSERT propagate
-- Current state
SELECT * FROM sales_by_region ORDER BY region;
-- Insert a new order
INSERT INTO orders (customer_id, product_id, quantity, order_date)
VALUES (1, 1, 10, CURRENT_DATE);
-- After ~1 s the stream table refreshes
SELECT * FROM sales_by_region ORDER BY region;
2. Inspect pg_trickle internals
-- Overall health
SELECT * FROM pgtrickle.health_check();
-- Status of all stream tables
SELECT name, status, refresh_mode, staleness
FROM pgtrickle.pgt_status()
ORDER BY name;
-- Recent refresh activity
SELECT start_time, stream_table, action, status, duration_ms
FROM pgtrickle.refresh_timeline(10);
-- Delta SQL for a stream table
SELECT pgtrickle.explain_st('sales_by_region');
-- Change buffer sizes
SELECT * FROM pgtrickle.change_buffer_sizes();
3. Update and Delete
-- Update a product price
UPDATE products SET price = 99.99 WHERE name = 'Widget';
-- customer_lifetime_value re-calculates
SELECT * FROM customer_lifetime_value ORDER BY total_revenue DESC LIMIT 5;
-- Delete a customer's orders
DELETE FROM orders WHERE customer_id = 3;
-- Stream tables reflect the removal
SELECT * FROM sales_by_region ORDER BY region;
4. Create your own stream table
SELECT pgtrickle.create_stream_table(
name => 'my_experiment',
query => $$
SELECT p.category,
COUNT(DISTINCT o.customer_id) AS unique_buyers,
SUM(o.quantity) AS total_units
FROM orders o
JOIN products p ON p.id = o.product_id
GROUP BY p.category
HAVING SUM(o.quantity) > 5
$$,
schedule => '2s'
);
SELECT * FROM my_experiment;
Tear Down
docker compose down -v
The -v flag removes the data volume. Omit it if you want to keep your changes.
Next Steps
- Getting Started Guide — full tutorial with an org-chart example
- SQL Reference — all functions and parameters
- Best-Practice Patterns — production-ready patterns
Best-Practice Patterns for pg_trickle
This guide covers common data modeling patterns and recommended configurations for pg_trickle stream tables. Each pattern includes worked SQL examples, anti-patterns to avoid, and refresh mode recommendations.
Version: v0.14.0+. Some features require recent versions — check SQL_REFERENCE.md for per-feature availability.
Table of Contents
- Pattern 1: Bronze / Silver / Gold Materialization
- Pattern 2: Event Sourcing with Stream Tables
- Pattern 3: Slowly Changing Dimensions (SCD)
- Pattern 4: High-Fan-Out Topology
- Pattern 5: Real-Time Dashboards
- Pattern 6: Tiered Refresh Strategy
- General Guidelines
Pattern 1: Bronze / Silver / Gold Materialization
A multi-layer approach where raw data flows through progressively refined stream tables, similar to a medallion architecture.
Architecture
[raw_events] ← Bronze: raw ingest table (regular table)
↓
[events_cleaned] ← Silver: filtered, deduplicated, typed
↓
[events_aggregated] ← Gold: business-level aggregates
SQL Example
-- Bronze: regular PostgreSQL table (source of truth)
CREATE TABLE raw_events (
event_id BIGSERIAL PRIMARY KEY,
user_id INT NOT NULL,
event_type TEXT NOT NULL,
payload JSONB,
received_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- Silver: cleaned and deduplicated events
SELECT pgtrickle.create_stream_table(
'events_cleaned',
$$SELECT DISTINCT ON (event_id)
event_id,
user_id,
event_type,
(payload->>'amount')::numeric AS amount,
received_at
FROM raw_events
WHERE event_type IN ('purchase', 'refund', 'subscription')$$,
schedule => '5s',
refresh_mode => 'DIFFERENTIAL'
);
-- Gold: per-user purchase summary
SELECT pgtrickle.create_stream_table(
'user_purchase_summary',
$$SELECT user_id,
COUNT(*) AS total_purchases,
SUM(amount) AS total_spent,
AVG(amount) AS avg_order
FROM events_cleaned
WHERE event_type = 'purchase'
GROUP BY user_id$$,
schedule => 'calculated',
refresh_mode => 'DIFFERENTIAL'
);
Recommended Configuration
| Layer | Refresh Mode | Schedule | Tier |
|---|---|---|---|
| Silver | DIFFERENTIAL | 5s – 30s | hot |
| Gold | DIFFERENTIAL | calculated | hot |
Anti-Patterns
- Don't use FULL refresh for Silver. With frequent small inserts, DIFFERENTIAL is 10–100x faster.
- Don't skip the Silver layer. Joining raw tables directly in Gold queries produces wider joins and slower deltas.
- Don't use IMMEDIATE mode for Gold. Aggregate maintenance on every DML row is expensive — batched DIFFERENTIAL is more efficient.
Pattern 2: Event Sourcing with Stream Tables
Use stream tables as projections of an append-only event log. The source table is the event store; stream tables materialize different read models.
SQL Example
-- Event store (append-only source)
CREATE TABLE events (
event_id BIGSERIAL PRIMARY KEY,
aggregate_id UUID NOT NULL,
event_type TEXT NOT NULL,
payload JSONB NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- Projection 1: Current state per aggregate
SELECT pgtrickle.create_stream_table(
'aggregate_state',
$$SELECT DISTINCT ON (aggregate_id)
aggregate_id,
event_type AS last_event,
payload AS current_state,
created_at AS last_updated
FROM events
ORDER BY aggregate_id, created_at DESC$$,
schedule => '2s',
refresh_mode => 'DIFFERENTIAL'
);
-- Projection 2: Event counts by type per hour
SELECT pgtrickle.create_stream_table(
'hourly_event_counts',
$$SELECT date_trunc('hour', created_at) AS hour,
event_type,
COUNT(*) AS event_count
FROM events
GROUP BY 1, 2$$,
schedule => '10s',
refresh_mode => 'DIFFERENTIAL'
);
Recommended Configuration
| Projection | Refresh Mode | Why |
|---|---|---|
| Current state | DIFFERENTIAL | Small delta per cycle; DISTINCT ON supported |
| Hourly counts | DIFFERENTIAL | Algebraic aggregate (COUNT), efficient delta |
| String aggregations | AUTO | GROUP_RESCAN aggs may benefit from FULL |
Anti-Patterns
- Don't DELETE from the event store. pg_trickle tracks changes via triggers; mixing append and delete on the source creates unnecessary delta complexity. Archive old events to a separate table.
- Don't use `append_only => true` with UPDATE/DELETE patterns. The `append_only` flag skips DELETE tracking in the change buffer — only use it when the source truly never updates or deletes.
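The "archive, don't delete" advice above can be sketched as a periodic bulk move. Caveat: the bulk DELETE still lands in the change buffer, so run it during quiet periods (or follow it with a manual FULL refresh). The `events_archive` table here is a hypothetical name:

```sql
-- Hypothetical archive table mirroring the event store's structure.
CREATE TABLE IF NOT EXISTS events_archive (LIKE events INCLUDING ALL);

-- Move events older than 90 days in one statement. The DELETE is still
-- tracked by pg_trickle, so schedule this during low-traffic windows.
WITH moved AS (
    DELETE FROM events
    WHERE created_at < now() - interval '90 days'
    RETURNING *
)
INSERT INTO events_archive SELECT * FROM moved;
```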
Pattern 3: Slowly Changing Dimensions (SCD)
SCD Type 1: Overwrite
The stream table always reflects the current state. Source updates overwrite previous values.
-- Source: customer dimension table (updated in place)
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
name TEXT NOT NULL,
email TEXT,
tier TEXT DEFAULT 'standard',
updated_at TIMESTAMPTZ DEFAULT now()
);
-- SCD-1: current customer state enriched with order stats
SELECT pgtrickle.create_stream_table(
'customer_360',
$$SELECT c.customer_id,
c.name,
c.email,
c.tier,
COUNT(o.id) AS total_orders,
COALESCE(SUM(o.amount), 0) AS lifetime_value
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.name, c.email, c.tier$$,
schedule => '30s',
refresh_mode => 'DIFFERENTIAL'
);
SCD Type 2: History Tracking
For SCD-2, maintain a history table with valid-from/valid-to ranges. The stream table provides the current snapshot.
-- Source: customer history with validity ranges
CREATE TABLE customer_history (
customer_id INT NOT NULL,
name TEXT NOT NULL,
tier TEXT NOT NULL,
valid_from TIMESTAMPTZ NOT NULL,
valid_to TIMESTAMPTZ, -- NULL = current
PRIMARY KEY (customer_id, valid_from)
);
-- Current active records only
SELECT pgtrickle.create_stream_table(
'customers_current',
$$SELECT customer_id, name, tier, valid_from
FROM customer_history
WHERE valid_to IS NULL$$,
schedule => '10s',
refresh_mode => 'DIFFERENTIAL'
);
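The SCD-2 write path itself is plain DML against the history table above; a minimal sketch (the customer values are illustrative):

```sql
-- Record a tier change for customer 42 as an SCD-2 transition.
-- Both statements run in one transaction, so customers_current never
-- observes zero or two "current" rows for the customer after a refresh.
BEGIN;

-- Close the currently valid row.
UPDATE customer_history
SET valid_to = now()
WHERE customer_id = 42 AND valid_to IS NULL;

-- Open the new current row.
INSERT INTO customer_history (customer_id, name, tier, valid_from)
VALUES (42, 'Ada Lovelace', 'premium', now());

COMMIT;
```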
Anti-Patterns
- Don't use FULL refresh for SCD-1 with large dimension tables. Customer tables with millions of rows but few changes per cycle are ideal for DIFFERENTIAL.
- Don't forget to index `valid_to IS NULL` for SCD-2 sources. Without it, the delta scan touches all historical rows.
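For the `customer_history` table above, such a partial index might look like:

```sql
-- Partial index: only current rows (valid_to IS NULL) are indexed,
-- so the delta scan skips closed history rows entirely.
CREATE INDEX idx_customer_history_current
    ON customer_history (customer_id)
    WHERE valid_to IS NULL;
```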
Pattern 4: High-Fan-Out Topology
When a single source table feeds many downstream stream tables.
Architecture
[orders]
↙ ↓ ↓ ↘
[daily_totals] [by_region] [by_product] [top_customers]
SQL Example
-- Single source feeding multiple views
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_id INT NOT NULL,
region TEXT NOT NULL,
product_id INT NOT NULL,
amount NUMERIC(10,2) NOT NULL,
order_date DATE NOT NULL DEFAULT CURRENT_DATE
);
-- Fan-out: 4 stream tables on 1 source
SELECT pgtrickle.create_stream_table('daily_totals',
'SELECT order_date, SUM(amount) AS daily_total, COUNT(*) AS order_count
FROM orders GROUP BY order_date',
schedule => '5s', refresh_mode => 'DIFFERENTIAL');
SELECT pgtrickle.create_stream_table('by_region',
'SELECT region, SUM(amount) AS total, COUNT(*) AS cnt
FROM orders GROUP BY region',
schedule => '5s', refresh_mode => 'DIFFERENTIAL');
SELECT pgtrickle.create_stream_table('by_product',
'SELECT product_id, SUM(amount) AS total, COUNT(*) AS cnt
FROM orders GROUP BY product_id',
schedule => '5s', refresh_mode => 'DIFFERENTIAL');
SELECT pgtrickle.create_stream_table('top_customers',
'SELECT customer_id, SUM(amount) AS lifetime_value, COUNT(*) AS order_count
FROM orders GROUP BY customer_id',
schedule => '10s', refresh_mode => 'DIFFERENTIAL');
Recommended Configuration
- All fan-out targets share the same source change buffer — CDC overhead is paid once regardless of how many stream tables read from `orders`.
- Use `schedule => 'calculated'` on downstream STs when they chain from other stream tables.
- Consider raising `pg_trickle.max_workers` if fan-out exceeds 8 (default: 4 workers).
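A chained downstream ST using the calculated schedule might look like this (`weekly_totals` is an illustrative name, reading from the `daily_totals` ST defined above):

```sql
-- Sketch: a second-level ST chained off daily_totals. With
-- schedule => 'calculated' it inherits its cadence from the scheduler's
-- dependency analysis instead of polling on its own timer.
SELECT pgtrickle.create_stream_table('weekly_totals',
    'SELECT date_trunc(''week'', order_date) AS week,
            SUM(daily_total) AS weekly_total
     FROM daily_totals GROUP BY 1',
    schedule => 'calculated', refresh_mode => 'DIFFERENTIAL');
```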
Anti-Patterns
- Don't use IMMEDIATE mode on high-fan-out sources. Each DML row triggers N refreshes (one per downstream ST). Use DIFFERENTIAL with a batched schedule instead.
- Don't set different schedules on STs that should be consistent. If `daily_totals` and `by_region` must agree, give them the same schedule or use `diamond_consistency => 'atomic'`.
Pattern 5: Real-Time Dashboards
For dashboards that need sub-second refresh latency.
SQL Example
-- Live order monitor (sub-second freshness)
SELECT pgtrickle.create_stream_table(
'order_monitor',
$$SELECT
date_trunc('minute', order_date) AS minute,
region,
COUNT(*) AS orders,
SUM(amount) AS revenue
FROM orders
WHERE order_date >= CURRENT_DATE
GROUP BY 1, 2$$,
schedule => '1s',
refresh_mode => 'DIFFERENTIAL'
);
-- For truly real-time needs, use IMMEDIATE mode (triggers on each DML)
SELECT pgtrickle.create_stream_table(
'live_counter',
$$SELECT region, COUNT(*) AS cnt, SUM(amount) AS total
FROM orders GROUP BY region$$,
schedule => 'IMMEDIATE',
refresh_mode => 'DIFFERENTIAL'
);
When to Use IMMEDIATE vs Scheduled DIFFERENTIAL
| Scenario | Mode | Why |
|---|---|---|
| Dashboard polls every 1s | 1s | Batched delta amortizes overhead |
| GraphQL subscription, < 100ms | IMMEDIATE | Triggers fire synchronously per DML |
| Aggregate with GROUP_RESCAN | 5s+ | Avoid per-row full rescans |
| High write throughput (>1K/s) | 2s–5s | IMMEDIATE adds latency to each INSERT |
Anti-Patterns
- Don't use IMMEDIATE for complex joins. Each INSERT/UPDATE/DELETE fires the full DVM delta SQL synchronously — multi-table joins in IMMEDIATE mode add significant latency to writes.
- Don't forget `pooler_compatibility_mode` with PgBouncer. Transaction pooling drops temp tables between transactions; enable this flag to avoid stale PREPARE statements.
Pattern 6: Tiered Refresh Strategy
Assign refresh importance tiers to control scheduling priority.
-- Hot: real-time operational dashboard
SELECT pgtrickle.create_stream_table('live_metrics', ...);
SELECT pgtrickle.alter_stream_table('live_metrics', tier => 'hot');
-- Warm: hourly business reports (2x interval multiplier)
SELECT pgtrickle.create_stream_table('hourly_report', ...,
schedule => '1m');
SELECT pgtrickle.alter_stream_table('hourly_report', tier => 'warm');
-- Cold: daily analytics (10x interval multiplier)
SELECT pgtrickle.create_stream_table('daily_analytics', ...,
schedule => '5m');
SELECT pgtrickle.alter_stream_table('daily_analytics', tier => 'cold');
-- Frozen: archive/audit (skip refresh entirely)
SELECT pgtrickle.alter_stream_table('audit_log_summary', tier => 'frozen');
Tier Multipliers
| Tier | Schedule Multiplier | Use Case |
|---|---|---|
| hot | 1x | Operational dashboards, alerts |
| warm | 2x | Hourly reports, batch pipelines |
| cold | 10x | Daily analytics, low-priority STs |
| frozen | skip | Paused/archived, manual refresh |
General Guidelines
Choosing a Refresh Mode
| Scenario | Recommended Mode |
|---|---|
| Source has < 5% change ratio per cycle | DIFFERENTIAL |
| Source changes > 50% per cycle | FULL |
| Query is a simple filter/projection | DIFFERENTIAL |
| Query has GROUP_RESCAN aggregates (MIN, MAX) | AUTO |
| Query joins 4+ tables | DIFFERENTIAL |
| Target table < 1000 rows | FULL |
| Need per-row latency guarantee | IMMEDIATE |
Use pgtrickle.recommend_refresh_mode() (v0.14.0+) for automated
analysis:
SELECT pgt_name, recommended_mode, confidence, reason
FROM pgtrickle.recommend_refresh_mode();
Monitoring Checklist
-- Check refresh efficiency across all stream tables
SELECT pgt_name, refresh_mode, diff_speedup, avg_change_ratio
FROM pgtrickle.refresh_efficiency()
ORDER BY total_refreshes DESC;
-- Find stream tables that might benefit from mode change
SELECT pgt_name, current_mode, recommended_mode, reason
FROM pgtrickle.recommend_refresh_mode()
WHERE recommended_mode != 'KEEP';
-- Check for error states
SELECT pgt_name, status, last_error_message
FROM pgtrickle.stream_tables_info
WHERE status IN ('ERROR', 'SUSPENDED');
-- Export definitions for backup
SELECT pgtrickle.export_definition(pgt_schema || '.' || pgt_name)
FROM pgtrickle.pgt_stream_tables;
Common Mistakes
- Using FULL refresh by default. Start with DIFFERENTIAL — it's correct for 80%+ of workloads. Switch to FULL only when `recommend_refresh_mode()` suggests it.
- Over-scheduling. A 1-second schedule on a table with 1-hour change cycles wastes CPU. Match the schedule to the actual data arrival rate.
- Ignoring `append_only`. If the source table is truly append-only (no UPDATEs, no DELETEs), set `append_only => true` to halve change buffer writes.
- Not using the `calculated` schedule for chained STs. When ST-B reads from ST-A, use `schedule => 'calculated'` on ST-B to avoid unnecessary refreshes. The scheduler automatically propagates ST-A changes downstream.
- Mixing IMMEDIATE and complex joins. IMMEDIATE mode fires delta SQL on every DML — an 8-table join in IMMEDIATE mode adds 50–200ms to each INSERT. Use scheduled DIFFERENTIAL for complex queries.
Pre-Deployment Checklist
Complete this checklist before deploying pg_trickle to a new environment. Each item links to the relevant documentation for details.
Version: v0.14.0+. Earlier versions may have different requirements.
1. PostgreSQL Version
- PostgreSQL 18.x is required (pg_trickle is compiled against PG 18)
- Extension binary matches your exact PostgreSQL major version
SELECT version(); -- Must show PostgreSQL 18.x
2. shared_preload_libraries
pg_trickle must be loaded at server startup via shared_preload_libraries.
Without this, GUC variables and the background scheduler are not available.
# postgresql.conf
shared_preload_libraries = 'pg_trickle'
- `shared_preload_libraries` includes `pg_trickle`
- PostgreSQL has been restarted after changing this setting (a reload is not sufficient)
SHOW shared_preload_libraries; -- Must include pg_trickle
Managed PostgreSQL: Some providers (Supabase, Neon) do not support custom
shared_preload_libraries. Check your provider's extension compatibility list. AWS RDS and Google Cloud SQL support custom shared libraries via parameter groups.
3. WAL Configuration (Optional but Recommended)
pg_trickle works without wal_level = logical — it uses trigger-based
CDC by default. However, WAL-based CDC provides lower overhead on
write-heavy workloads.
# postgresql.conf (optional — for WAL-based CDC)
wal_level = logical
max_replication_slots = 10 # At least 1 per tracked source table
- Decide: trigger-based CDC (default) or WAL-based CDC
- If WAL: `wal_level = logical` is set and the server has been restarted
- If WAL: `max_replication_slots` is sufficient for your source table count
Note: CDC mode is configurable per stream table. The default `cdc_mode = 'auto'` starts with triggers and transitions to WAL automatically when `wal_level = logical` is detected. See CONFIGURATION.md for details.
4. Extension Installation
CREATE EXTENSION pg_trickle;
-- Verify installation
SELECT extname, extversion FROM pg_extension WHERE extname = 'pg_trickle';
- Extension created successfully
- Version matches expected release
5. Background Scheduler
The scheduler runs as a background worker and manages automatic refresh. Verify it's running:
SELECT pid, backend_type, state
FROM pg_stat_activity
WHERE backend_type = 'pg_trickle scheduler';
- Scheduler process is visible in `pg_stat_activity`
- `pg_trickle.enabled = true` (default; set to `false` to disable)
6. Connection Pooler Compatibility
PgBouncer (Transaction Mode)
PgBouncer in transaction pooling mode drops session state between transactions. pg_trickle needs special handling:
- Enable `pooler_compatibility_mode` on affected stream tables:
SELECT pgtrickle.alter_stream_table('my_st',
pooler_compatibility_mode => true);
- Or set globally via GUC:
pg_trickle.pooler_compatibility_mode = true
PgBouncer (Session Mode)
Session mode preserves session state — no special configuration needed.
Supavisor / Other Poolers
Some poolers (Supavisor, pgcat) have their own compatibility
characteristics. Test with pgtrickle.validate_query() before deploying.
7. Recommended GUC Starting Values
These are sensible defaults for most workloads. Adjust based on monitoring data.
# Core settings (usually fine as defaults)
pg_trickle.enabled = true # Enable scheduler
pg_trickle.schedule_interval = '5s' # Global default refresh interval
pg_trickle.max_workers = 4 # Parallel refresh workers
# Performance tuning
pg_trickle.planner_aggressive = true # Enable MERGE planner hints
pg_trickle.tiered_scheduling = true # Tier-aware scheduling
# CDC mode
pg_trickle.cdc_mode = 'auto' # auto | trigger | wal
# Safety
pg_trickle.unlogged_buffers = false # true = faster but not crash-safe
pg_trickle.fuse_default_ceiling = 10000 # Auto-fuse change threshold
- Review GUC values for your workload
- See CONFIGURATION.md for the full reference
8. Resource Planning
Memory
- Each background worker uses a separate PostgreSQL backend
- `work_mem` applies to each worker's delta SQL execution
- Monitor RSS growth via `pg_stat_activity` or OS-level tools
Storage
- Change buffer tables (`pgtrickle_changes.changes_*`) grow between refreshes
- Buffer size depends on DML rate × refresh interval
- Monitor via `pgtrickle.shared_buffer_stats()`
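To put numbers on that growth, PostgreSQL's standard size functions work on change buffer tables like on any other relation; a quick sketch, assuming the `pgtrickle_changes` schema named above:

```sql
-- Approximate on-disk size of each change buffer table, largest first.
SELECT c.relname,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS size
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'pgtrickle_changes'
ORDER BY pg_total_relation_size(c.oid) DESC;
```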
Connections
- The scheduler uses `pg_trickle.max_workers` backend connections
- Ensure `max_connections` has headroom for workers + application
- `max_connections` is at least application connections + `pg_trickle.max_workers` + 5
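A quick headroom check (reading the pg_trickle GUC assumes the extension has been preloaded):

```sql
-- Compare the connection ceiling against the worker count.
SELECT current_setting('max_connections')::int         AS max_connections,
       current_setting('pg_trickle.max_workers')::int  AS pgt_workers;
```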
9. Monitoring Setup
Essential Queries
-- Stream table health overview
SELECT pgt_name, status, staleness, refresh_mode
FROM pgtrickle.stream_tables_info
ORDER BY staleness DESC NULLS LAST;
-- Refresh efficiency
SELECT pgt_name, diff_speedup, avg_change_ratio
FROM pgtrickle.refresh_efficiency();
-- Error states
SELECT pgt_name, status, last_error_message, last_error_at
FROM pgtrickle.pgt_stream_tables
WHERE status IN ('ERROR', 'SUSPENDED');
Grafana / Prometheus
See the monitoring/ directory for ready-to-use Grafana dashboards and Prometheus configuration.
- Monitoring configured for stream table health
- Alerting on ERROR/SUSPENDED status
10. Backup & Restore
pg_trickle stream tables are standard PostgreSQL tables and are included
in pg_dump / pg_restore. See BACKUP_AND_RESTORE.md
for details.
- Backup strategy accounts for both source tables and stream tables
- Restore procedure tested (stream tables may need re-initialization)
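One way to script that re-initialization, using `pgtrickle.refresh_stream_table` and the `pgt_stream_tables` catalog referenced elsewhere in these docs — verify the column name against your installed version:

```sql
-- Sketch: after a restore, force a refresh of every stream table
-- so their contents are rebuilt from the restored source tables.
SELECT pgtrickle.refresh_stream_table(pgt_name)
FROM pgtrickle.pgt_stream_tables;
```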
Quick Validation Script
Run this after deployment to verify everything is working:
-- 1. Extension loaded
SELECT extname, extversion FROM pg_extension WHERE extname = 'pg_trickle';
-- 2. Scheduler running
SELECT COUNT(*) > 0 AS scheduler_alive
FROM pg_stat_activity
WHERE backend_type = 'pg_trickle scheduler';
-- 3. Create a test stream table
CREATE TABLE _deploy_test_src (id INT PRIMARY KEY, val INT);
INSERT INTO _deploy_test_src VALUES (1, 100), (2, 200);
SELECT pgtrickle.create_stream_table(
'_deploy_test_st',
'SELECT id, val FROM _deploy_test_src',
refresh_mode => 'FULL'
);
SELECT pgtrickle.refresh_stream_table('_deploy_test_st');
-- 4. Verify data
SELECT * FROM _deploy_test_st ORDER BY id;
-- Expected: (1, 100), (2, 200)
-- 5. Cleanup
SELECT pgtrickle.drop_stream_table('_deploy_test_st');
DROP TABLE _deploy_test_src;
Connection Pooler Compatibility
Added in v0.19.0 (UX-4 / STAB-1).
pg_trickle uses prepared statements and NOTIFY internally. These features
require special handling when a connection pooler sits between the application
and PostgreSQL.
PgBouncer Transaction Mode
In PgBouncer transaction pooling mode, each transaction may land on a different server-side connection. Prepared statements and LISTEN/NOTIFY do not survive across transactions.
Recommended configuration:
# postgresql.conf
pg_trickle.connection_pooler_mode = 'transaction'
This cluster-wide GUC:
- Disables prepared-statement reuse for all stream tables.
- Suppresses `NOTIFY pg_trickle_refresh` emissions (listeners on other connections will not receive them anyway in transaction mode).
Alternatively, enable pooler compatibility per stream table:
SELECT pgtrickle.alter_stream_table('my_stream_table',
pooler_compatibility_mode => true);
PgBouncer Session Mode
Session pooling is fully compatible — no special configuration needed.
pgcat / Supavisor
These poolers generally support prepared statements and NOTIFY. Set
pg_trickle.connection_pooler_mode = 'off' (the default).
Kubernetes / CNPG
See Scaling — CNPG for connection pooler configuration in Kubernetes environments.
Related Documentation
- Getting Started — First stream table in 5 minutes
- Configuration Reference — All GUC variables
- SQL Reference — Complete function reference
- Best-Practice Patterns — Common data modeling patterns
- Architecture — How pg_trickle works internally
- Backup & Restore — Backup considerations
SQL Reference
Complete reference for all SQL functions, views, and catalog tables provided by pgtrickle.
Table of Contents
- Functions
- Expression Support
- Conditional Expressions
- Comparison Operators
- Boolean Tests
- SQL Value Functions
- Array and Row Expressions
- Subquery Expressions
- Auto-Rewrite Pipeline
- HAVING Clause
- Tables Without Primary Keys (Keyless Tables)
- Volatile Function Detection
- COLLATE Expressions
- IS JSON Predicate (PostgreSQL 16+)
- SQL/JSON Constructors (PostgreSQL 16+)
- JSON_TABLE (PostgreSQL 17+)
- Unsupported Expression Types
- Restrictions & Interoperability
- Referencing Other Stream Tables
- Views as Sources in Defining Queries
- Partitioned Tables as Sources
- Foreign Tables as Sources
- IMMEDIATE Mode Query Restrictions
- Logical Replication Targets
- Views on Stream Tables
- Materialized Views on Stream Tables
- Logical Replication of Stream Tables
- Known Delta Computation Limitations
- What Is NOT Allowed
- Row-Level Security (RLS)
- Views
- Catalog Tables
- Delta SQL Profiling (v0.13.0)
- dbt Integration (v0.13.0)
Functions
Core Lifecycle
Create, modify, and manage the lifecycle of stream tables.
pgtrickle.create_stream_table
Create a new stream table.
pgtrickle.create_stream_table(
name text,
query text,
schedule text DEFAULT 'calculated',
refresh_mode text DEFAULT 'AUTO',
initialize bool DEFAULT true,
diamond_consistency text DEFAULT NULL,
diamond_schedule_policy text DEFAULT NULL,
cdc_mode text DEFAULT NULL,
append_only bool DEFAULT false,
pooler_compatibility_mode bool DEFAULT false
) → void
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| name | text | — | Name of the stream table. May be schema-qualified (myschema.my_st). Defaults to the public schema. |
| query | text | — | The defining SQL query. Must be a valid SELECT statement using supported operators. |
| schedule | text | 'calculated' | Refresh schedule as a Prometheus/GNU-style duration string (e.g., '30s', '5m', '1h', '1h30m', '1d') or a cron expression (e.g., '*/5 * * * *', '@hourly'). Use 'calculated' for CALCULATED mode (inherits schedule from downstream dependents). |
| refresh_mode | text | 'AUTO' | 'AUTO' (adaptive — uses DIFFERENTIAL when possible, falls back to FULL if the query is not differentiable), 'FULL' (truncate and reload), 'DIFFERENTIAL' (apply delta only — errors if the query is not differentiable), or 'IMMEDIATE' (synchronous in-transaction maintenance via statement-level triggers). |
| initialize | bool | true | If true, populates the table immediately via a full refresh. If false, creates the table empty. |
| diamond_consistency | text | NULL (defaults to 'atomic') | Diamond dependency consistency mode: 'atomic' (SAVEPOINT-based atomic group refresh) or 'none' (independent refresh). |
| diamond_schedule_policy | text | NULL (defaults to 'fastest') | Schedule policy for atomic diamond groups: 'fastest' (fire when any member is due) or 'slowest' (fire when all are due). Set on the convergence node. |
| cdc_mode | text | NULL (use pg_trickle.cdc_mode) | Optional per-stream-table CDC override: 'auto', 'trigger', or 'wal'. This affects all deferred TABLE sources of the stream table. |
| append_only | bool | false | When true, differential refreshes use a fast INSERT path instead of MERGE. Skips DELETE/UPDATE/IS DISTINCT FROM checks. If a DELETE or UPDATE is later detected in the change buffer, the flag is automatically reverted to false. Not compatible with FULL, IMMEDIATE, or keyless sources. |
| pooler_compatibility_mode | bool | false | When true, the refresh engine uses inline SQL instead of PREPARE/EXECUTE and suppresses all NOTIFY emissions for this stream table. Enable this when the stream table is accessed through a transaction-mode connection pooler (e.g. PgBouncer). |
When refresh_mode => 'IMMEDIATE', the cluster-wide pg_trickle.cdc_mode
setting is ignored. IMMEDIATE mode always uses statement-level IVM triggers
instead of CDC triggers or WAL replication slots. If you explicitly pass
cdc_mode => 'wal' together with refresh_mode => 'IMMEDIATE', pg_trickle
rejects the call because WAL CDC is asynchronous and incompatible with
in-transaction maintenance.
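A minimal illustration of the rejected combination (the exact error text may differ):

```sql
-- Rejected: WAL CDC is asynchronous, IMMEDIATE is in-transaction.
SELECT pgtrickle.create_stream_table(
    name         => 'bad_combo',
    query        => 'SELECT id, amount FROM orders',
    refresh_mode => 'IMMEDIATE',
    cdc_mode     => 'wal'   -- ERROR: incompatible with IMMEDIATE mode
);
```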
Duration format:
| Unit | Suffix | Example |
|---|---|---|
| Seconds | s | '30s' |
| Minutes | m | '5m' |
| Hours | h | '2h' |
| Days | d | '1d' |
| Weeks | w | '1w' |
| Compound | — | '1h30m', '2m30s' |
Cron expression format:
schedule also accepts standard cron expressions for time-based scheduling. The scheduler refreshes the stream table when the cron schedule fires, rather than checking staleness.
| Format | Fields | Example | Description |
|---|---|---|---|
| 5-field | min hour dom mon dow | '*/5 * * * *' | Every 5 minutes |
| 6-field | sec min hour dom mon dow | '0 */5 * * * *' | Every 5 minutes at :00 seconds |
| Alias | — | '@hourly' | Every hour |
| Alias | — | '@daily' | Every day at midnight |
| Alias | — | '@weekly' | Every Sunday at midnight |
| Alias | — | '@monthly' | First of every month |
| Weekday range | — | '0 6 * * 1-5' | 6 AM on weekdays |
Note: Cron-scheduled stream tables do not participate in CALCULATED schedule resolution. The `stale` column in monitoring views returns `NULL` for cron-scheduled tables.
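When monitoring by staleness, cron-scheduled tables can simply be filtered out; a sketch (column names follow the monitoring examples in this reference):

```sql
-- Staleness-based monitoring: skip cron-scheduled tables, whose
-- staleness is reported as NULL.
SELECT name, staleness
FROM pgtrickle.pgt_status()
WHERE staleness IS NOT NULL
ORDER BY staleness DESC;
```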
Example:
-- Duration-based: refresh when data is staler than 2 minutes (refresh_mode defaults to 'AUTO')
SELECT pgtrickle.create_stream_table(
name => 'order_totals',
query => 'SELECT region, SUM(amount) AS total FROM orders GROUP BY region',
schedule => '2m'
);
-- Cron-based: refresh every hour
SELECT pgtrickle.create_stream_table(
name => 'hourly_summary',
query => 'SELECT date_trunc(''hour'', ts), COUNT(*) FROM events GROUP BY 1',
schedule => '@hourly',
refresh_mode => 'FULL'
);
-- Cron-based: refresh at 6 AM on weekdays
SELECT pgtrickle.create_stream_table(
name => 'daily_report',
query => 'SELECT region, SUM(revenue) AS total FROM sales GROUP BY region',
schedule => '0 6 * * 1-5',
refresh_mode => 'FULL'
);
-- Immediate mode: maintained synchronously within the same transaction
-- No schedule needed — updates happen automatically when base table changes
SELECT pgtrickle.create_stream_table(
name => 'live_totals',
query => 'SELECT region, SUM(amount) AS total FROM orders GROUP BY region',
refresh_mode => 'IMMEDIATE'
);
-- Force WAL CDC for this stream table even if the global GUC is 'trigger'
SELECT pgtrickle.create_stream_table(
name => 'wal_orders',
query => 'SELECT id, amount FROM orders',
schedule => '1s',
refresh_mode => 'DIFFERENTIAL',
cdc_mode => 'wal'
);
Aggregate Examples:
All supported aggregate functions work in AUTO mode (and all other modes).
Examples below omit refresh_mode — the default 'AUTO' selects DIFFERENTIAL automatically.
Explicit modes are shown only when the mode itself is being demonstrated.
-- Algebraic aggregates (fully differential — no rescan needed)
SELECT pgtrickle.create_stream_table(
name => 'sales_summary',
query => 'SELECT region, COUNT(*) AS cnt, SUM(amount) AS total, AVG(amount) AS avg_amount
FROM orders GROUP BY region',
schedule => '1m'
);
-- Semi-algebraic aggregates (MIN/MAX)
SELECT pgtrickle.create_stream_table(
name => 'salary_ranges',
query => 'SELECT department, MIN(salary) AS min_sal, MAX(salary) AS max_sal
FROM employees GROUP BY department',
schedule => '2m'
);
-- Group-rescan aggregates (BOOL_AND/OR, STRING_AGG, ARRAY_AGG, JSON_AGG, JSONB_AGG,
-- BIT_AND, BIT_OR, BIT_XOR, JSON_OBJECT_AGG, JSONB_OBJECT_AGG,
-- STDDEV, STDDEV_POP, STDDEV_SAMP, VARIANCE, VAR_POP, VAR_SAMP,
-- MODE, PERCENTILE_CONT, PERCENTILE_DISC,
-- CORR, COVAR_POP, COVAR_SAMP, REGR_AVGX, REGR_AVGY,
-- REGR_COUNT, REGR_INTERCEPT, REGR_R2, REGR_SLOPE,
-- REGR_SXX, REGR_SXY, REGR_SYY, ANY_VALUE)
SELECT pgtrickle.create_stream_table(
name => 'team_members',
query => 'SELECT department,
STRING_AGG(name, '', '' ORDER BY name) AS members,
ARRAY_AGG(employee_id) AS member_ids,
BOOL_AND(active) AS all_active,
JSON_AGG(name) AS members_json
FROM employees
GROUP BY department',
schedule => '1m'
);
-- Bitwise aggregates
SELECT pgtrickle.create_stream_table(
name => 'permission_summary',
query => 'SELECT department,
BIT_OR(permissions) AS combined_perms,
BIT_AND(permissions) AS common_perms,
BIT_XOR(flags) AS xor_flags
FROM employees
GROUP BY department',
schedule => '1m'
);
-- JSON object aggregates
SELECT pgtrickle.create_stream_table(
name => 'config_map',
query => 'SELECT department,
JSON_OBJECT_AGG(setting_name, setting_value) AS settings,
JSONB_OBJECT_AGG(key, value) AS metadata
FROM config
GROUP BY department',
schedule => '1m'
);
-- Statistical aggregates
SELECT pgtrickle.create_stream_table(
name => 'salary_stats',
query => 'SELECT department,
STDDEV_POP(salary) AS sd_pop,
STDDEV_SAMP(salary) AS sd_samp,
VAR_POP(salary) AS var_pop,
VAR_SAMP(salary) AS var_samp
FROM employees
GROUP BY department',
schedule => '1m'
);
-- Ordered-set aggregates (MODE, PERCENTILE_CONT, PERCENTILE_DISC)
SELECT pgtrickle.create_stream_table(
name => 'salary_percentiles',
query => 'SELECT department,
MODE() WITHIN GROUP (ORDER BY grade) AS most_common_grade,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary) AS median_salary,
PERCENTILE_DISC(0.9) WITHIN GROUP (ORDER BY salary) AS p90_salary
FROM employees
GROUP BY department',
schedule => '1m'
);
-- Regression / correlation aggregates (CORR, COVAR_*, REGR_*)
SELECT pgtrickle.create_stream_table(
name => 'regression_stats',
query => 'SELECT department,
CORR(salary, experience) AS sal_exp_corr,
COVAR_POP(salary, experience) AS covar_pop,
COVAR_SAMP(salary, experience) AS covar_samp,
REGR_SLOPE(salary, experience) AS slope,
REGR_INTERCEPT(salary, experience) AS intercept,
REGR_R2(salary, experience) AS r_squared,
REGR_COUNT(salary, experience) AS regr_n
FROM employees
GROUP BY department',
schedule => '1m'
);
-- ANY_VALUE aggregate (PostgreSQL 16+)
SELECT pgtrickle.create_stream_table(
name => 'dept_sample',
query => 'SELECT department, ANY_VALUE(office_location) AS sample_office
FROM employees GROUP BY department',
schedule => '1m'
);
-- FILTER clause on aggregates
SELECT pgtrickle.create_stream_table(
name => 'order_metrics',
query => 'SELECT region,
COUNT(*) AS total,
COUNT(*) FILTER (WHERE status = ''active'') AS active_count,
SUM(amount) FILTER (WHERE status = ''shipped'') AS shipped_total
FROM orders
GROUP BY region',
schedule => '1m'
);
-- PgBouncer compatibility (transaction-mode pooler)
SELECT pgtrickle.create_stream_table(
name => 'pooled_orders',
query => 'SELECT id, amount FROM orders',
schedule => '5m',
pooler_compatibility_mode => true
);
CTE Examples:
Non-recursive CTEs are fully supported in both FULL and DIFFERENTIAL modes:
-- Simple CTE
SELECT pgtrickle.create_stream_table(
name => 'active_order_totals',
query => 'WITH active_users AS (
SELECT id, name FROM users WHERE active = true
)
SELECT a.id, a.name, SUM(o.amount) AS total
FROM active_users a
JOIN orders o ON o.user_id = a.id
GROUP BY a.id, a.name',
schedule => '1m'
);
-- Chained CTEs (CTE referencing another CTE)
SELECT pgtrickle.create_stream_table(
name => 'top_regions',
query => 'WITH regional AS (
SELECT region, SUM(amount) AS total FROM orders GROUP BY region
),
ranked AS (
SELECT region, total FROM regional WHERE total > 1000
)
SELECT * FROM ranked',
schedule => '2m'
);
-- Multi-reference CTE (referenced twice in FROM — shared delta optimization)
SELECT pgtrickle.create_stream_table(
name => 'self_compare',
query => 'WITH totals AS (
SELECT user_id, SUM(amount) AS total FROM orders GROUP BY user_id
)
SELECT t1.user_id, t1.total, t2.total AS next_total
FROM totals t1
JOIN totals t2 ON t1.user_id = t2.user_id + 1',
schedule => '1m'
);
-- Append-only stream table (INSERT-only fast path)
SELECT pgtrickle.create_stream_table(
name => 'event_log_st',
query => 'SELECT id, event_type, payload, created_at FROM events',
schedule => '30s',
append_only => true
);
Recursive CTEs work with FULL, DIFFERENTIAL, and IMMEDIATE modes:
-- Recursive CTE (hierarchy traversal)
SELECT pgtrickle.create_stream_table(
name => 'category_tree',
query => 'WITH RECURSIVE cat_tree AS (
SELECT id, name, parent_id, 0 AS depth
FROM categories WHERE parent_id IS NULL
UNION ALL
SELECT c.id, c.name, c.parent_id, ct.depth + 1
FROM categories c
JOIN cat_tree ct ON c.parent_id = ct.id
)
SELECT * FROM cat_tree',
schedule => '5m',
refresh_mode => 'FULL' -- FULL mode: standard re-execution
);
-- Recursive CTE with DIFFERENTIAL mode (incremental semi-naive / DRed)
SELECT pgtrickle.create_stream_table(
name => 'org_chart',
query => 'WITH RECURSIVE reports AS (
SELECT id, name, manager_id FROM employees WHERE manager_id IS NULL
UNION ALL
SELECT e.id, e.name, e.manager_id
FROM employees e JOIN reports r ON e.manager_id = r.id
)
SELECT * FROM reports',
schedule => '2m',
refresh_mode => 'DIFFERENTIAL' -- Uses semi-naive, DRed, or recomputation (auto-selected)
);
-- Recursive CTE with IMMEDIATE mode (same-transaction maintenance)
SELECT pgtrickle.create_stream_table(
name => 'org_chart_live',
query => 'WITH RECURSIVE reports AS (
SELECT id, name, manager_id FROM employees WHERE manager_id IS NULL
UNION ALL
SELECT e.id, e.name, e.manager_id
FROM employees e JOIN reports r ON e.manager_id = r.id
)
SELECT * FROM reports',
refresh_mode => 'IMMEDIATE' -- Uses transition tables with semi-naive / DRed maintenance
);
Non-monotone recursive terms: If the recursive term contains operators like
EXCEPT, aggregate functions, window functions, DISTINCT, INTERSECT (set semantics), or anti-joins, the system automatically falls back to recomputation to guarantee correctness. The semi-naive and DRed strategies require monotone recursive terms (JOIN, UNION ALL, filter/project only).
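For instance, a recursive term containing DISTINCT is non-monotone, so the stream table is still accepted but each refresh uses the recomputation strategy — a sketch, assuming a hypothetical edges(src, dst) table:

```sql
-- Hypothetical example: DISTINCT in the recursive term is non-monotone,
-- so DIFFERENTIAL mode falls back to recomputation on each refresh.
SELECT pgtrickle.create_stream_table(
    name => 'reachable_nodes',
    query => 'WITH RECURSIVE reach AS (
        SELECT src, dst FROM edges WHERE src = 1
        UNION ALL
        SELECT DISTINCT r.src, e.dst
        FROM edges e JOIN reach r ON e.src = r.dst
    )
    SELECT * FROM reach',
    schedule => '2m',
    refresh_mode => 'DIFFERENTIAL'
);
```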
Set Operation Examples:
INTERSECT, INTERSECT ALL, EXCEPT, EXCEPT ALL, UNION, and UNION ALL are supported:
-- INTERSECT: customers who placed orders in BOTH regions
SELECT pgtrickle.create_stream_table(
name => 'bi_region_customers',
query => 'SELECT customer_id FROM orders_east
INTERSECT
SELECT customer_id FROM orders_west',
schedule => '2m'
);
-- INTERSECT ALL: preserves duplicates (bag semantics)
SELECT pgtrickle.create_stream_table(
name => 'common_items',
query => 'SELECT item_name FROM warehouse_a
INTERSECT ALL
SELECT item_name FROM warehouse_b',
schedule => '1m'
);
-- EXCEPT: orders not yet shipped
SELECT pgtrickle.create_stream_table(
name => 'unshipped_orders',
query => 'SELECT order_id FROM orders
EXCEPT
SELECT order_id FROM shipments',
schedule => '1m'
);
-- EXCEPT ALL: preserves duplicate counts (bag subtraction)
SELECT pgtrickle.create_stream_table(
name => 'excess_inventory',
query => 'SELECT sku FROM stock_received
EXCEPT ALL
SELECT sku FROM stock_shipped',
schedule => '5m'
);
-- UNION: deduplicated merge of two sources
SELECT pgtrickle.create_stream_table(
name => 'all_contacts',
query => 'SELECT email FROM customers
UNION
SELECT email FROM newsletter_subscribers',
schedule => '5m'
);
LATERAL Set-Returning Function Examples:
Set-returning functions (SRFs) in the FROM clause are supported in both FULL and DIFFERENTIAL modes. Common SRFs include jsonb_array_elements, jsonb_each, jsonb_each_text, and unnest:
-- Flatten JSONB arrays into rows
SELECT pgtrickle.create_stream_table(
name => 'flat_children',
query => 'SELECT p.id, child.value AS val
FROM parent_data p,
jsonb_array_elements(p.data->''children'') AS child',
schedule => '1m'
);
-- Expand JSONB key-value pairs (multi-column SRF)
SELECT pgtrickle.create_stream_table(
name => 'flat_properties',
query => 'SELECT d.id, kv.key, kv.value
FROM documents d,
jsonb_each(d.metadata) AS kv',
schedule => '2m'
);
-- Unnest arrays
SELECT pgtrickle.create_stream_table(
name => 'flat_tags',
query => 'SELECT t.id, tag.tag
FROM tagged_items t,
unnest(t.tags) AS tag(tag)',
schedule => '1m'
);
-- SRF with WHERE filter
SELECT pgtrickle.create_stream_table(
name => 'high_value_items',
query => 'SELECT p.id, (e.value)::int AS amount
FROM products p,
jsonb_array_elements(p.prices) AS e
WHERE (e.value)::int > 100',
schedule => '5m'
);
-- SRF combined with aggregation
SELECT pgtrickle.create_stream_table(
name => 'element_counts',
query => 'SELECT a.id, count(*) AS cnt
FROM arrays a,
jsonb_array_elements(a.data) AS e
GROUP BY a.id',
schedule => '1m',
refresh_mode => 'FULL'
);
LATERAL Subquery Examples:
LATERAL subqueries in the FROM clause are supported in both FULL and DIFFERENTIAL modes. Use them for top-N per group, correlated aggregation, and conditional expansion:
-- Top-N per group: latest item per order
SELECT pgtrickle.create_stream_table(
name => 'latest_items',
query => 'SELECT o.id, o.customer, latest.amount
FROM orders o,
LATERAL (
SELECT li.amount
FROM line_items li
WHERE li.order_id = o.id
ORDER BY li.created_at DESC
LIMIT 1
) AS latest',
schedule => '1m'
);
-- Correlated aggregate
SELECT pgtrickle.create_stream_table(
name => 'dept_summaries',
query => 'SELECT d.id, d.name, stats.total, stats.cnt
FROM departments d,
LATERAL (
SELECT SUM(e.salary) AS total, COUNT(*) AS cnt
FROM employees e
WHERE e.dept_id = d.id
) AS stats',
schedule => '1m'
);
-- LEFT JOIN LATERAL: preserve outer rows with NULLs when subquery returns no rows
SELECT pgtrickle.create_stream_table(
name => 'dept_stats_all',
query => 'SELECT d.id, d.name, stats.total
FROM departments d
LEFT JOIN LATERAL (
SELECT SUM(e.salary) AS total
FROM employees e
WHERE e.dept_id = d.id
) AS stats ON true',
schedule => '1m'
);
WHERE Subquery Examples:
Subqueries in the WHERE clause are automatically transformed into semi-join, anti-join, or scalar subquery operators in the DVM operator tree:
-- EXISTS subquery: customers who have placed orders
SELECT pgtrickle.create_stream_table(
name => 'active_customers',
query => 'SELECT c.id, c.name
FROM customers c
WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)',
schedule => '1m'
);
-- NOT EXISTS: customers with no orders
SELECT pgtrickle.create_stream_table(
name => 'inactive_customers',
query => 'SELECT c.id, c.name
FROM customers c
WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)',
schedule => '1m'
);
-- IN subquery: products that have been ordered
SELECT pgtrickle.create_stream_table(
name => 'ordered_products',
query => 'SELECT p.id, p.name
FROM products p
WHERE p.id IN (SELECT product_id FROM order_items)',
schedule => '1m'
);
-- NOT IN subquery: products never ordered
SELECT pgtrickle.create_stream_table(
name => 'unordered_products',
query => 'SELECT p.id, p.name
FROM products p
WHERE p.id NOT IN (SELECT product_id FROM order_items)',
schedule => '1m'
);
-- Scalar subquery in SELECT list
SELECT pgtrickle.create_stream_table(
name => 'products_with_max_price',
query => 'SELECT p.id, p.name, (SELECT max(price) FROM products) AS max_price
FROM products p',
schedule => '1m'
);
Notes:
- The defining query is parsed into an operator tree and validated for DVM support.
- Views as sources — views referenced in the defining query are automatically inlined as subqueries (auto-rewrite pass #0). CDC triggers are created on the underlying base tables. Nested views (view → view → table) are fully expanded. The user's original query is preserved in original_query for reinit and introspection. Materialized views are rejected in DIFFERENTIAL mode (use FULL mode or the underlying query directly). Foreign tables are likewise rejected in DIFFERENTIAL mode.
- CDC triggers and change buffer tables are created automatically for each source table.
- TRUNCATE on source tables — when a source table is TRUNCATEd, a CDC trigger writes a marker row (action='T') into the change buffer. On the next refresh cycle, pg_trickle detects the marker and automatically falls back to a FULL refresh. For single-source stream tables with no subsequent DML after the TRUNCATE, an optimized fast path deletes all ST rows directly without re-running the full defining query.
- The ST is registered in the dependency DAG; cycles are rejected.
- Non-recursive CTEs are inlined as subqueries during parsing (Tier 1). Multi-reference CTEs share delta computation (Tier 2).
- Recursive CTEs in DIFFERENTIAL mode use three strategies, auto-selected per refresh: semi-naive evaluation for INSERT-only changes, DRed (Delete-and-Rederive) for mixed DELETE/UPDATE changes, and recomputation fallback when CTE columns do not match ST storage columns. Non-monotone recursive terms (containing EXCEPT, Aggregate, Window, DISTINCT, AntiJoin, or INTERSECT SET) automatically fall back to recomputation to ensure correctness.
- DRed for recursive CTEs (DIFFERENTIAL mode; implemented in v0.10.0, P2-1) — mixed DELETE/UPDATE changes use the DRed (Delete-and-Rederive) algorithm: (1) semi-naive INSERT propagation; (2) over-deletion cascade from ST storage; (3) rederivation from current source tables; (4) combination into net deletions. DRed correctly handles derived-column changes such as path rebuilds under a renamed ancestor node. When CTE output columns differ from ST storage columns, recomputation is used instead.
- LATERAL SRFs in DIFFERENTIAL mode use row-scoped recomputation: when a source row changes, only the SRF expansions for that row are re-evaluated.
- LATERAL subqueries in DIFFERENTIAL mode also use row-scoped recomputation: when an outer row changes, the correlated subquery is re-executed only for that row.
- WHERE subqueries (EXISTS, IN, scalar) are parsed into dedicated semi-join, anti-join, and scalar subquery operators with specialized delta computation. ALL (subquery) is the only subquery form that is currently rejected.
- ORDER BY is accepted but silently discarded — row order in the storage table is undefined (consistent with PostgreSQL's CREATE MATERIALIZED VIEW behavior). Apply ORDER BY when querying the stream table.
- TopK (ORDER BY + LIMIT) — when a top-level ORDER BY … LIMIT N is present (with a constant integer limit, optionally with OFFSET M), the query is recognized as a "TopK" pattern and accepted. TopK stream tables store exactly N rows (starting from position M+1 if OFFSET is specified) and are refreshed via a scoped-recomputation MERGE strategy. The DVM delta pipeline is bypassed; instead, each refresh re-evaluates the full ORDER BY + LIMIT [+ OFFSET] query and merges the result into the storage table. The catalog records topk_limit, topk_order_by, and optionally topk_offset for the stream table. TopK is not supported with set operations (UNION/INTERSECT/EXCEPT) or with GROUP BY ROLLUP/CUBE/GROUPING SETS.
- LIMIT / OFFSET without ORDER BY is rejected — stream tables materialize the full result set. Apply LIMIT when querying the stream table.
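The TopK pattern described above is declared like any other stream table — a sketch, assuming an orders table:

```sql
-- TopK stream table: stores exactly 10 rows, refreshed via scoped MERGE.
SELECT pgtrickle.create_stream_table(
    name => 'top_orders',
    query => 'SELECT id, customer_id, amount
              FROM orders
              ORDER BY amount DESC
              LIMIT 10',
    schedule => '1m'
);
```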
pgtrickle.create_stream_table_if_not_exists
Create a stream table if it does not already exist. If a stream table with the
given name already exists, this is a silent no-op (an INFO message is logged).
The existing definition is never modified.
pgtrickle.create_stream_table_if_not_exists(
name text,
query text,
schedule text DEFAULT 'calculated',
refresh_mode text DEFAULT 'AUTO',
initialize bool DEFAULT true,
diamond_consistency text DEFAULT NULL,
diamond_schedule_policy text DEFAULT NULL,
cdc_mode text DEFAULT NULL,
append_only bool DEFAULT false,
pooler_compatibility_mode bool DEFAULT false
) → void
Parameters: Same as create_stream_table.
Example:
-- Safe to re-run in migrations:
SELECT pgtrickle.create_stream_table_if_not_exists(
'order_totals',
'SELECT customer_id, sum(amount) AS total FROM orders GROUP BY customer_id',
'1m',
'DIFFERENTIAL'
);
Notes:
- Useful for deployment / migration scripts that should be safe to re-run.
- If the stream table already exists, the provided query, schedule, and other parameters are ignored — the existing definition is preserved.
pgtrickle.create_or_replace_stream_table
Create a stream table if it does not exist, or replace the existing one if the definition changed. This is the declarative, idempotent API for deployment workflows (dbt, SQL migrations, GitOps).
pgtrickle.create_or_replace_stream_table(
name text,
query text,
schedule text DEFAULT 'calculated',
refresh_mode text DEFAULT 'AUTO',
initialize bool DEFAULT true,
diamond_consistency text DEFAULT NULL,
diamond_schedule_policy text DEFAULT NULL,
cdc_mode text DEFAULT NULL,
append_only bool DEFAULT false,
pooler_compatibility_mode bool DEFAULT false
) → void
Parameters: Same as create_stream_table.
Behavior:
| Current state | Action taken |
|---|---|
| Stream table does not exist | Create — identical to create_stream_table(...) |
| Stream table exists, query and all config identical | No-op — logs INFO, returns immediately |
| Stream table exists, query identical but config differs | Alter config — delegates to alter_stream_table(...) for schedule, refresh_mode, diamond settings, cdc_mode, append_only, pooler_compatibility_mode |
| Stream table exists, query differs | Replace query — in-place ALTER QUERY migration plus any config changes; a full refresh is applied |
The initialize parameter is honoured on create only. On replace, the stream table is always repopulated via a full refresh.
Query comparison uses the post-rewrite (normalized) form of the SQL. Cosmetic differences such as whitespace, casing, and extra parentheses are ignored.
Example:
-- Idempotent deployment — safe to run on every deploy:
SELECT pgtrickle.create_or_replace_stream_table(
name => 'order_totals',
query => 'SELECT region, SUM(amount) AS total FROM orders GROUP BY region',
schedule => '2m',
refresh_mode => 'DIFFERENTIAL'
);
-- If the query changed since last deploy, the stream table is
-- migrated in place (no data gap). If nothing changed, it's a no-op.
Notes:
- Mirrors PostgreSQL's CREATE OR REPLACE convention (CREATE OR REPLACE VIEW, CREATE OR REPLACE FUNCTION).
- Never drops the stream table — even for incompatible schema changes, the ALTER QUERY path rebuilds storage in place while preserving the catalog entry (pgt_id).
- For migration scripts that must not modify an existing definition, use create_stream_table_if_not_exists instead.
pgtrickle.bulk_create
Create multiple stream tables in a single transaction.
pgtrickle.bulk_create(
definitions jsonb -- Array of stream table definitions
) → jsonb -- Array of result objects
Each element in the definitions array must be a JSON object with at least name and query keys. All other keys match the parameters of create_stream_table (snake_case):
| Key | Type | Default | Description |
|---|---|---|---|
name | string | (required) | Stream table name (optionally schema-qualified). |
query | string | (required) | Defining SQL query. |
schedule | string | 'calculated' | Refresh schedule. |
refresh_mode | string | 'AUTO' | 'AUTO', 'FULL', 'DIFFERENTIAL', or 'IMMEDIATE'. |
initialize | boolean | true | Whether to populate immediately. |
diamond_consistency | string | NULL | 'atomic' or 'none'. |
diamond_schedule_policy | string | NULL | 'fastest' or 'slowest'. |
cdc_mode | string | NULL | 'auto', 'trigger', or 'wal'. |
append_only | boolean | false | Enable append-only fast path. |
pooler_compatibility_mode | boolean | false | PgBouncer compatibility. |
partition_by | string | NULL | Partition key. |
max_differential_joins | integer | NULL | Max join scan limit. |
max_delta_fraction | number | NULL | Max delta fraction (0.0–1.0). |
Returns a JSONB array of result objects:
[
{"name": "st1", "status": "created", "pgt_id": 42},
{"name": "st2", "status": "created", "pgt_id": 43}
]
On any error, the entire transaction is rolled back (standard PostgreSQL transactional semantics). The error message includes the index and name of the failing definition.
Example:
SELECT pgtrickle.bulk_create('[
{"name": "order_totals", "query": "SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id", "schedule": "30s"},
{"name": "product_stats", "query": "SELECT product_id, COUNT(*) AS cnt FROM order_items GROUP BY product_id", "schedule": "1m"}
]'::jsonb);
pgtrickle.alter_stream_table
Alter properties of an existing stream table.
pgtrickle.alter_stream_table(
name text,
query text DEFAULT NULL,
schedule text DEFAULT NULL,
refresh_mode text DEFAULT NULL,
status text DEFAULT NULL,
diamond_consistency text DEFAULT NULL,
diamond_schedule_policy text DEFAULT NULL,
cdc_mode text DEFAULT NULL,
append_only bool DEFAULT NULL,
pooler_compatibility_mode bool DEFAULT NULL,
tier text DEFAULT NULL
) → void
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
name | text | — | Name of the stream table (schema-qualified or unqualified). |
query | text | NULL | New defining query. Pass NULL to leave unchanged. When set, the function validates the new query, migrates the storage table schema if needed, updates catalog entries and dependencies, and runs a full refresh. Schema changes are classified as same (no DDL), compatible (ALTER TABLE ADD/DROP COLUMN), or incompatible (full storage rebuild with OID change). |
schedule | text | NULL | New schedule as a duration string (e.g., '5m'). Pass NULL to leave unchanged. Pass 'calculated' to switch to CALCULATED mode. |
refresh_mode | text | NULL | New refresh mode ('AUTO', 'FULL', 'DIFFERENTIAL', or 'IMMEDIATE'). Pass NULL to leave unchanged. Switching to/from 'IMMEDIATE' migrates trigger infrastructure (IVM triggers ↔ CDC triggers), clears or restores the schedule, and runs a full refresh. |
status | text | NULL | New status ('ACTIVE', 'SUSPENDED'). Pass NULL to leave unchanged. Resuming resets consecutive errors to 0. |
diamond_consistency | text | NULL | New diamond consistency mode ('none' or 'atomic'). Pass NULL to leave unchanged. |
diamond_schedule_policy | text | NULL | New schedule policy for atomic diamond groups ('fastest' or 'slowest'). Pass NULL to leave unchanged. |
cdc_mode | text | NULL | New requested CDC mode override ('auto', 'trigger', or 'wal'). Pass NULL to leave unchanged. |
append_only | bool | NULL | Enable or disable the append-only INSERT fast path. Pass NULL to leave unchanged. When true, rejected for FULL, IMMEDIATE, or keyless source stream tables. |
pooler_compatibility_mode | bool | NULL | Enable or disable pooler-safe mode. When true, prepared statements are bypassed and NOTIFY emissions are suppressed. Pass NULL to leave unchanged. |
tier | text | NULL | Refresh tier for tiered scheduling ('hot', 'warm', 'cold', or 'frozen'). Only effective when pg_trickle.tiered_scheduling GUC is enabled. Hot (1×), Warm (2×), Cold (10×), Frozen (skip). Pass NULL to leave unchanged. |
If you switch a stream table to refresh_mode => 'IMMEDIATE' while the
cluster-wide pg_trickle.cdc_mode GUC is set to 'wal', pg_trickle logs an
INFO and proceeds with IVM triggers. WAL CDC does not apply to IMMEDIATE mode.
If the stream table has an explicit cdc_mode => 'wal' override, switching to
IMMEDIATE is rejected until you change the requested CDC mode back to
'auto' or 'trigger'.
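A sketch of that workaround, assuming a stream table named order_totals that was previously pinned with cdc_mode => 'wal':

```sql
-- Clear the explicit WAL override first, then switch modes.
SELECT pgtrickle.alter_stream_table('order_totals', cdc_mode => 'trigger');
SELECT pgtrickle.alter_stream_table('order_totals', refresh_mode => 'IMMEDIATE');
```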
Examples:
-- Change the defining query (same output schema — fast path)
SELECT pgtrickle.alter_stream_table('order_totals',
query => 'SELECT customer_id, SUM(amount) AS total FROM orders WHERE status = ''active'' GROUP BY customer_id');
-- Change query and add a column (compatible schema migration)
SELECT pgtrickle.alter_stream_table('order_totals',
query => 'SELECT customer_id, SUM(amount) AS total, COUNT(*) AS cnt FROM orders GROUP BY customer_id');
-- Change query and mode simultaneously
SELECT pgtrickle.alter_stream_table('order_totals',
query => 'SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id',
refresh_mode => 'FULL');
-- Change schedule
SELECT pgtrickle.alter_stream_table('order_totals', schedule => '5m');
-- Switch to full refresh mode
SELECT pgtrickle.alter_stream_table('order_totals', refresh_mode => 'FULL');
-- Switch to immediate (transactional) mode — installs IVM triggers, clears schedule
SELECT pgtrickle.alter_stream_table('order_totals', refresh_mode => 'IMMEDIATE');
-- Switch from immediate back to differential — re-creates CDC triggers, restores schedule
SELECT pgtrickle.alter_stream_table('order_totals',
refresh_mode => 'DIFFERENTIAL', schedule => '5m');
-- Pin a deferred stream table to trigger CDC even when the global GUC is 'auto'
SELECT pgtrickle.alter_stream_table('order_totals', cdc_mode => 'trigger');
-- Enable append-only INSERT fast path
SELECT pgtrickle.alter_stream_table('event_log_st', append_only => true);
-- Enable pooler compatibility mode (for PgBouncer transaction mode)
SELECT pgtrickle.alter_stream_table('order_totals', pooler_compatibility_mode => true);
-- Set refresh tier (requires pg_trickle.tiered_scheduling = on)
SELECT pgtrickle.alter_stream_table('order_totals', tier => 'warm');
SELECT pgtrickle.alter_stream_table('archive_stats', tier => 'frozen');
-- Suspend a stream table
SELECT pgtrickle.alter_stream_table('order_totals', status => 'SUSPENDED');
-- Resume a suspended stream table
SELECT pgtrickle.resume_stream_table('order_totals');
-- Or via alter_stream_table
SELECT pgtrickle.alter_stream_table('order_totals', status => 'ACTIVE');
Notes:
- When query is provided, the function runs the full query rewrite pipeline (view inlining, DISTINCT ON, GROUPING SETS, etc.) and validates the new query before applying changes.
- The entire ALTER QUERY operation runs within a single transaction. If any step fails, the stream table is left unchanged.
- For same-schema and compatible-schema changes, the storage table OID is preserved — views, policies, and publications referencing the stream table remain valid.
- For incompatible schema changes (e.g., changing a column from integer to text), the storage table is rebuilt and the OID changes. A WARNING is emitted.
- The stream table is temporarily suspended during query migration to prevent concurrent scheduler refreshes.
pgtrickle.drop_stream_table
Drop a stream table, removing the storage table and all catalog entries.
pgtrickle.drop_stream_table(name text) → void
Parameters:
| Parameter | Type | Description |
|---|---|---|
name | text | Name of the stream table to drop. |
Example:
SELECT pgtrickle.drop_stream_table('order_totals');
Notes:
- Drops the underlying storage table with CASCADE.
- Removes all catalog entries (metadata, dependencies, refresh history).
- Cleans up CDC triggers and change buffer tables for source tables that are no longer tracked by any ST.
pgtrickle.resume_stream_table
Resume a suspended stream table, clearing its consecutive error count and re-enabling automated and manual refreshes.
pgtrickle.resume_stream_table(name text) → void
Parameters:
| Parameter | Type | Description |
|---|---|---|
name | text | Name of the stream table to resume (schema-qualified or unqualified). |
Example:
-- Resume a stream table that was auto-suspended due to repeated errors
SELECT pgtrickle.resume_stream_table('order_totals');
Notes:
- Errors if the ST is not in SUSPENDED state.
- Resets consecutive_errors to 0 and sets status = 'ACTIVE'.
- Emits a resumed event on the pg_trickle_alert NOTIFY channel.
- After resuming, the scheduler will include the ST in its next cycle.
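To observe that event, a monitoring client can listen on the alert channel — a minimal sketch:

```sql
-- In a monitoring session:
LISTEN pg_trickle_alert;

-- In another session:
SELECT pgtrickle.resume_stream_table('order_totals');

-- The listening session receives an asynchronous notification
-- carrying the 'resumed' event for order_totals.
```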
pgtrickle.refresh_stream_table
Manually trigger a synchronous refresh of a stream table.
pgtrickle.refresh_stream_table(name text) → void
Parameters:
| Parameter | Type | Description |
|---|---|---|
name | text | Name of the stream table to refresh. |
Example:
SELECT pgtrickle.refresh_stream_table('order_totals');
Notes:
- Blocked if the ST is SUSPENDED — use pgtrickle.resume_stream_table(name) first.
- Uses an advisory lock to prevent concurrent refreshes of the same ST.
- For DIFFERENTIAL mode, generates and applies a delta query. For FULL mode, truncates and reloads.
- Records the refresh in pgtrickle.pgt_refresh_history with initiated_by = 'MANUAL'.
pgtrickle.repair_stream_table
Repair a stream table by reinstalling any missing CDC triggers, validating catalog entries, and reconciling change buffer state.
pgtrickle.repair_stream_table(name text) → void
Parameters:
| Parameter | Type | Description |
|---|---|---|
name | text | Name of the stream table to repair. |
Example:
-- Reinstall missing CDC triggers after a point-in-time recovery
SELECT pgtrickle.repair_stream_table('order_totals');
Notes:
- Inspects all source tables in the stream table's dependency graph and reinstalls any missing or disabled CDC triggers.
- Validates that the stream table's catalog entry, storage table, and change buffer tables are consistent.
- Useful after pg_basebackup or PITR restores where triggers may not have been captured in the backup.
- Use pgtrickle.trigger_inventory() first to identify which triggers are missing.
- Safe to call on a healthy stream table — it is a no-op if everything is intact.
Status & Monitoring
Query the state of stream tables, view refresh statistics, and diagnose problems.
pgtrickle.pgt_status
Get the status of all stream tables.
pgtrickle.pgt_status() → SETOF record(
name text,
status text,
refresh_mode text,
is_populated bool,
consecutive_errors int,
schedule text,
data_timestamp timestamptz,
staleness interval
)
Example:
SELECT * FROM pgtrickle.pgt_status();
| name | status | refresh_mode | is_populated | consecutive_errors | schedule | data_timestamp | staleness |
|---|---|---|---|---|---|---|---|
| public.order_totals | ACTIVE | DIFFERENTIAL | true | 0 | 5m | 2026-02-21 12:00:00+00 | 00:02:30 |
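A typical monitoring query filters this output to tables that have fallen behind — a sketch; the 10-minute threshold is arbitrary:

```sql
SELECT name, schedule, staleness
FROM pgtrickle.pgt_status()
WHERE status = 'ACTIVE'
  AND staleness > interval '10 minutes'
ORDER BY staleness DESC;
```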
pgtrickle.health_check
Run a set of health checks against the pg_trickle installation and return one row per check.
pgtrickle.health_check() → SETOF record(
check_name text, -- identifier for the check
severity text, -- 'OK', 'WARN', or 'ERROR'
detail text -- human-readable explanation
)
Filter to problems only:
SELECT check_name, severity, detail
FROM pgtrickle.health_check()
WHERE severity != 'OK';
Checks: scheduler_running, error_tables, stale_tables, needs_reinit,
consecutive_errors, buffer_growth (> 10 000 pending rows), slot_lag
(retained WAL above pg_trickle.slot_lag_warning_threshold_mb, default 100 MB),
worker_pool (all worker tokens in use — parallel mode only), job_queue
(> 10 jobs queued — parallel mode only).
pgtrickle.health_summary
Single-row summary of the entire pg_trickle deployment's health. Designed for monitoring dashboards that want one endpoint to poll instead of joining multiple views.
pgtrickle.health_summary() → SETOF record(
total_stream_tables int,
active_count int,
error_count int,
suspended_count int,
stale_count int,
reinit_pending int,
max_staleness_seconds float8, -- NULL if no stream tables
scheduler_status text, -- 'ACTIVE', 'STOPPED', or 'NOT_LOADED'
cache_hit_rate float8 -- NULL if no cache lookups yet
)
Example:
SELECT * FROM pgtrickle.health_summary();
| total_stream_tables | active_count | error_count | suspended_count | stale_count | reinit_pending | max_staleness_seconds | scheduler_status | cache_hit_rate |
|---|---|---|---|---|---|---|---|---|
| 12 | 11 | 0 | 1 | 0 | 0 | 45.2 | ACTIVE | 0.94 |
Tip: Use this in a Grafana single-stat panel or a Prometheus exporter to surface fleet-level health at a glance.
pgtrickle.refresh_timeline
Return recent refresh records across all stream tables in a single chronological view.
pgtrickle.refresh_timeline(
max_rows int DEFAULT 50
) → SETOF record(
start_time timestamptz,
stream_table text,
action text,
status text,
rows_inserted bigint,
rows_deleted bigint,
duration_ms float8,
error_message text
)
Example:
-- Most recent 20 events across all stream tables:
SELECT start_time, stream_table, action, status, round(duration_ms::numeric,1) AS ms
FROM pgtrickle.refresh_timeline(20);
-- Just failures in the last 100 events:
SELECT * FROM pgtrickle.refresh_timeline(100) WHERE status = 'ERROR';
pgtrickle.st_refresh_stats
Return per-ST refresh statistics aggregated from the refresh history.
pgtrickle.st_refresh_stats() → SETOF record(
pgt_name text,
pgt_schema text,
status text,
refresh_mode text,
is_populated bool,
total_refreshes bigint,
successful_refreshes bigint,
failed_refreshes bigint,
total_rows_inserted bigint,
total_rows_deleted bigint,
avg_duration_ms float8,
last_refresh_action text,
last_refresh_status text,
last_refresh_at timestamptz,
staleness_secs float8,
stale bool
)
Example:
SELECT pgt_name, status, total_refreshes, avg_duration_ms, stale
FROM pgtrickle.st_refresh_stats();
pgtrickle.get_refresh_history
Return refresh history for a specific stream table.
pgtrickle.get_refresh_history(
name text,
max_rows int DEFAULT 20
) → SETOF record(
refresh_id bigint,
data_timestamp timestamptz,
start_time timestamptz,
end_time timestamptz,
action text,
status text,
rows_inserted bigint,
rows_deleted bigint,
duration_ms float8,
error_message text
)
Example:
SELECT action, status, rows_inserted, duration_ms
FROM pgtrickle.get_refresh_history('order_totals', 5);
pgtrickle.get_staleness
Get the current staleness in seconds for a specific stream table.
pgtrickle.get_staleness(name text) → float8
Returns NULL if the ST has never been refreshed.
Example:
SELECT pgtrickle.get_staleness('order_totals');
-- Returns: 12.345 (seconds since last refresh)
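This enables a simple freshness gate before querying — a sketch, assuming the application tolerates up to 60 seconds of staleness (a never-refreshed ST is treated as infinitely stale):

```sql
-- Refresh synchronously only when the data is older than the tolerance.
DO $$
BEGIN
    IF coalesce(pgtrickle.get_staleness('order_totals'),
                'Infinity'::float8) > 60 THEN
        PERFORM pgtrickle.refresh_stream_table('order_totals');
    END IF;
END $$;
```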
pgtrickle.explain_refresh_mode
Added in v0.11.0
Explain the configured vs. effective refresh mode for a stream table, including the reason for any downgrade (e.g., AUTO choosing FULL).
pgtrickle.explain_refresh_mode(name text) → TABLE(
configured_mode text,
effective_mode text,
downgrade_reason text
)
Columns:
| Column | Type | Description |
|---|---|---|
configured_mode | text | The refresh mode set on the stream table (e.g., DIFFERENTIAL, AUTO, FULL, IMMEDIATE) |
effective_mode | text | The mode actually used on the most recent refresh. NULL for IMMEDIATE mode (handled by triggers) |
downgrade_reason | text | Human-readable explanation when effective_mode differs from configured_mode, or informational note for IMMEDIATE / APPEND_ONLY |
Example:
SELECT * FROM pgtrickle.explain_refresh_mode('public.orders_summary');
| configured_mode | effective_mode | downgrade_reason |
|---|---|---|
| AUTO | FULL | The most recent refresh used FULL mode. Possible causes: defining query contains a CTE or unsupported operator, adaptive change-ratio threshold was exceeded, or aggregate saturation occurred. Check pgtrickle.pgt_refresh_history for details. |
pgtrickle.cache_stats
Return template cache statistics from shared memory.
Reports L1 (thread-local) hits, L2 (catalog table) hits, full misses (DVM re-parse), evictions (generation flushes), and the current L1 cache size for this backend.
pgtrickle.cache_stats() → SETOF record(
l1_hits bigint,
l2_hits bigint,
misses bigint,
evictions bigint,
l1_size integer
)
| Column | Description |
|---|---|
l1_hits | Number of delta template cache hits in the thread-local (L1) cache. ~0 ns lookup. |
l2_hits | Number of delta template cache hits in the catalog table (L2) cache. ~1 ms SPI lookup. |
misses | Number of full cache misses requiring DVM re-parse (~45 ms). |
evictions | Number of entries evicted from L1 due to DDL-triggered generation flushes. |
l1_size | Current number of entries in this backend's L1 cache. |
Example:
SELECT * FROM pgtrickle.cache_stats();
| l1_hits | l2_hits | misses | evictions | l1_size |
|---|---|---|---|---|
| 142 | 3 | 5 | 10 | 8 |
Note: Counters are cluster-wide (shared memory) except `l1_size`, which is per-backend. Requires `shared_preload_libraries = 'pg_trickle'`; returns zeros when the library is loaded dynamically.
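As a rough health check, a hit rate can be derived from these counters. This is a sketch using only the columns documented above; `NULLIF` guards against division by zero on a fresh cluster:

```sql
-- Approximate delta-template cache hit rate
SELECT l1_hits, l2_hits, misses,
       round(100.0 * (l1_hits + l2_hits)
             / NULLIF(l1_hits + l2_hits + misses, 0), 1) AS hit_rate_pct
FROM pgtrickle.cache_stats();
```

A consistently low hit rate with high evictions usually points at frequent DDL on tracked sources.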
CDC Diagnostics
Inspect CDC pipeline health, replication slots, change buffers, and trigger coverage.
pgtrickle.slot_health
Check replication slot health for all tracked CDC slots.
pgtrickle.slot_health() → SETOF record(
slot_name text,
source_relid bigint,
active bool,
retained_wal_bytes bigint,
wal_status text
)
Example:
SELECT * FROM pgtrickle.slot_health();
| slot_name | source_relid | active | retained_wal_bytes | wal_status |
|---|---|---|---|---|
| pg_trickle_slot_16384 | 16384 | false | 1048576 | reserved |
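To surface only slots that look at-risk, one possible filter is shown below. The 64 MB cutoff is illustrative, not a built-in threshold; `wal_status` values follow PostgreSQL's `pg_replication_slots` convention:

```sql
-- Inactive slots retaining significant WAL, or slots whose WAL is at risk
SELECT slot_name, retained_wal_bytes, wal_status
FROM pgtrickle.slot_health()
WHERE (NOT active AND retained_wal_bytes > 64 * 1024 * 1024)
   OR wal_status IN ('unreserved', 'lost');
```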
pgtrickle.check_cdc_health
Check CDC health for all tracked source tables. Returns per-source health status including the current CDC mode, replication slot details, estimated lag, and any alerts.
The `alert` column uses the critical threshold configured by `pg_trickle.slot_lag_critical_threshold_mb` (default 1024 MB).
pgtrickle.check_cdc_health() → SETOF record(
source_relid bigint,
source_table text,
cdc_mode text,
slot_name text,
lag_bytes bigint,
confirmed_lsn text,
alert text
)
Columns:
| Column | Type | Description |
|---|---|---|
| source_relid | bigint | OID of the tracked source table |
| source_table | text | Resolved name of the source table (e.g., public.orders) |
| cdc_mode | text | Current CDC mode: TRIGGER, TRANSITIONING, or WAL |
| slot_name | text | Replication slot name (NULL for TRIGGER mode) |
| lag_bytes | bigint | Replication slot lag in bytes (NULL for TRIGGER mode) |
| confirmed_lsn | text | Last confirmed WAL position (NULL for TRIGGER mode) |
| alert | text | Alert message if unhealthy (e.g., slot_lag_exceeds_threshold, replication_slot_missing) |
Example:
SELECT * FROM pgtrickle.check_cdc_health();
| source_relid | source_table | cdc_mode | slot_name | lag_bytes | confirmed_lsn | alert |
|---|---|---|---|---|---|---|
| 16384 | public.orders | TRIGGER | | | | |
| 16390 | public.events | WAL | pg_trickle_slot_16390 | 524288 | 0/1A8B000 | |
pgtrickle.change_buffer_sizes
Show pending change counts and estimated on-disk sizes for all CDC-tracked source tables.
Returns one row per (stream_table, source_table) pair.
pgtrickle.change_buffer_sizes() → SETOF record(
stream_table text, -- qualified stream table name
source_table text, -- qualified source table name
source_oid bigint,
cdc_mode text, -- 'trigger', 'wal', or 'transitioning'
pending_rows bigint, -- rows in buffer not yet consumed
buffer_bytes bigint -- estimated buffer table size in bytes
)
Example:
SELECT * FROM pgtrickle.change_buffer_sizes()
ORDER BY pending_rows DESC;
Useful for spotting a source table whose CDC buffer is growing unexpectedly (which may indicate a stalled differential refresh or a high-write source that has outpaced the schedule).
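A simple alerting query can flag buffers above a chosen size. The 100,000-row threshold here is arbitrary and should be tuned to your workload:

```sql
-- CDC buffers with a suspicious backlog
SELECT stream_table, source_table, pending_rows,
       pg_size_pretty(buffer_bytes) AS buffer_size
FROM pgtrickle.change_buffer_sizes()
WHERE pending_rows > 100000
ORDER BY pending_rows DESC;
```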
pgtrickle.worker_pool_status
Snapshot of the parallel refresh worker pool. Returns a single row.
pgtrickle.worker_pool_status() → SETOF record(
active_workers int, -- workers currently executing refresh jobs
max_workers int, -- cluster-wide worker budget (GUC)
per_db_cap int, -- per-database dispatch cap (GUC)
parallel_mode text -- current parallel_refresh_mode value
)
Example:
SELECT * FROM pgtrickle.worker_pool_status();
Returns 0 active workers when parallel_refresh_mode = 'off'.
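A quick utilization check can be built from these columns (a sketch; `NULLIF` avoids division by zero when the budget is zero):

```sql
-- Fraction of the cluster-wide worker budget currently in use
SELECT active_workers, max_workers, parallel_mode,
       round(100.0 * active_workers / NULLIF(max_workers, 0), 1) AS utilization_pct
FROM pgtrickle.worker_pool_status();
```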
pgtrickle.parallel_job_status
Active and recently completed scheduler jobs from the pgt_scheduler_jobs
table. Shows jobs that are currently queued or running, plus jobs that
finished within the last max_age_seconds (default 300).
pgtrickle.parallel_job_status(
max_age_seconds int DEFAULT 300
) → SETOF record(
job_id bigint,
unit_key text, -- stable unit identifier (s:42, a:1,2, etc.)
unit_kind text, -- 'singleton', 'atomic_group', 'immediate_closure'
status text, -- 'QUEUED', 'RUNNING', 'SUCCEEDED', etc.
member_count int,
attempt_no int,
scheduler_pid int,
worker_pid int, -- NULL if not yet claimed
enqueued_at timestamptz,
started_at timestamptz, -- NULL if still queued
finished_at timestamptz, -- NULL if not finished
duration_ms float8 -- NULL if not finished
)
Example — show running and recently failed jobs:
SELECT job_id, unit_key, status, duration_ms
FROM pgtrickle.parallel_job_status(60)
WHERE status NOT IN ('SUCCEEDED');
pgtrickle.trigger_inventory
List all CDC triggers that pg_trickle should have installed, and verify each one exists and is enabled in pg_catalog.
pgtrickle.trigger_inventory() → SETOF record(
source_table text, -- qualified source table name
source_oid bigint,
trigger_name text, -- expected trigger name
trigger_type text, -- 'DML' or 'TRUNCATE'
present bool, -- trigger exists in pg_catalog
enabled bool -- trigger is not disabled
)
A `present = false` row means change capture is broken for that source.
Example:
-- Show only missing or disabled triggers:
SELECT source_table, trigger_type, trigger_name
FROM pgtrickle.trigger_inventory()
WHERE NOT present OR NOT enabled;
pgtrickle.fuse_status
Return the circuit-breaker (fuse) state for every stream table that has a fuse configured.
pgtrickle.fuse_status() → SETOF record(
name text, -- stream table name
fuse_mode text, -- 'off', 'on', or 'auto'
fuse_state text, -- 'armed' or 'blown'
fuse_ceiling bigint, -- change-count threshold
fuse_sensitivity int, -- consecutive over-ceiling cycles before blow
blown_at timestamptz, -- when the fuse last blew (NULL if armed)
blow_reason text -- reason the fuse blew (NULL if armed)
)
Example:
-- Check all fuse-enabled stream tables
SELECT name, fuse_mode, fuse_state, fuse_ceiling, blown_at
FROM pgtrickle.fuse_status();
-- Find blown fuses
SELECT name, blow_reason, blown_at
FROM pgtrickle.fuse_status()
WHERE fuse_state = 'blown';
Notes:
- Returns one row per stream table where `fuse_mode != 'off'`.
- A blown fuse suspends differential refreshes until cleared with `pgtrickle.reset_fuse()`.
- A `pgtrickle_alert` NOTIFY with event `fuse_blown` is emitted when the fuse trips.
- See Configuration — fuse_default_ceiling for global defaults.
pgtrickle.reset_fuse
Clear a blown circuit-breaker fuse and resume scheduling for the stream table.
pgtrickle.reset_fuse(name text, action text DEFAULT 'apply') → void
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| name | text | — | Name of the stream table whose fuse to reset. |
| action | text | 'apply' | How to handle the pending changes that caused the fuse to blow. |
Actions:
| Action | Behavior |
|---|---|
| 'apply' | Process all pending changes normally and resume scheduling. |
| 'reinitialize' | Drop and repopulate the stream table from scratch (full refresh from the defining query). |
| 'skip_changes' | Discard the pending changes that triggered the fuse and resume from the current frontier. |
Example:
-- After investigating a bulk load, apply the changes:
SELECT pgtrickle.reset_fuse('category_summary', action => 'apply');
-- Or skip the oversized batch entirely:
SELECT pgtrickle.reset_fuse('category_summary', action => 'skip_changes');
-- Or rebuild from scratch:
SELECT pgtrickle.reset_fuse('category_summary', action => 'reinitialize');
Notes:
- Errors if the stream table's fuse is not in `'blown'` state.
- After reset, the fuse returns to `'armed'` state and the scheduler resumes normal operation.
- Use `pgtrickle.fuse_status()` to inspect the fuse state before resetting.
- The `'skip_changes'` action advances the frontier past the pending changes without applying them — use only when you are certain the changes should be discarded.
Dependency & Inspection
Visualize dependencies, understand query plans, and audit source table relationships.
pgtrickle.dependency_tree
Render all stream table dependencies as an indented ASCII tree.
pgtrickle.dependency_tree() → SETOF record(
tree_line text, -- indented visual line (├──, └──, │ characters)
node text, -- qualified name (schema.table)
node_type text, -- 'stream_table' or 'source_table'
depth int,
status text, -- NULL for source_table nodes
refresh_mode text -- NULL for source_table nodes
)
Roots (stream tables with no stream-table parents) appear at depth 0. Each
dependent is indented beneath its parent. Plain source tables are rendered as
leaf nodes tagged [src].
Example:
SELECT tree_line, status, refresh_mode
FROM pgtrickle.dependency_tree();
tree_line status refresh_mode
----------------------------------------+---------+--------------
report_summary ACTIVE DIFFERENTIAL
├── orders_by_region ACTIVE DIFFERENTIAL
│ ├── public.orders [src]
│ └── public.customers [src]
└── revenue_totals ACTIVE DIFFERENTIAL
└── public.orders [src]
pgtrickle.diamond_groups
List all detected diamond dependency groups and their members.
When stream tables form diamond-shaped dependency graphs (multiple paths converge at a single fan-in node), the scheduler groups them for coordinated refresh. This function exposes those groups for monitoring and debugging.
pgtrickle.diamond_groups() → SETOF record(
group_id int4,
member_name text,
member_schema text,
is_convergence bool,
epoch int8,
schedule_policy text
)
Return columns:
| Column | Type | Description |
|---|---|---|
| group_id | int4 | Numeric identifier for the consistency group (1-based). |
| member_name | text | Name of the stream table in this group. |
| member_schema | text | Schema of the stream table. |
| is_convergence | bool | true if this member is a convergence (fan-in) node where multiple paths meet. |
| epoch | int8 | Group epoch counter — advances on each successful atomic refresh of the group. |
| schedule_policy | text | Effective schedule policy for this group ('fastest' or 'slowest'). Computed from convergence node settings with strictest-wins. |
Example:
SELECT * FROM pgtrickle.diamond_groups();
| group_id | member_name | member_schema | is_convergence | epoch | schedule_policy |
|---|---|---|---|---|---|
| 1 | st_b | public | false | 0 | fastest |
| 1 | st_c | public | false | 0 | fastest |
| 1 | st_d | public | true | 0 | fastest |
Notes:
- Singleton stream tables (not part of any diamond) are omitted.
- The DAG is rebuilt on each call from the catalog — results reflect the current dependency graph.
- Groups are only relevant when `diamond_consistency = 'atomic'` is set on the convergence node or globally via the `pg_trickle.diamond_consistency` GUC.
pgtrickle.pgt_scc_status
List all cyclic strongly connected components (SCCs) and their convergence status.
When stream tables form circular dependencies (with pg_trickle.allow_circular = true), they are grouped into SCCs and iterated to a fixed point. This function exposes those groups for monitoring and debugging.
pgtrickle.pgt_scc_status() → SETOF record(
scc_id int4,
member_count int4,
members text[],
last_iterations int4,
last_converged_at timestamptz
)
Return columns:
| Column | Type | Description |
|---|---|---|
| scc_id | int4 | SCC group identifier (1-based). |
| member_count | int4 | Number of stream tables in this SCC. |
| members | text[] | Array of schema.name for each member. |
| last_iterations | int4 | Number of fixpoint iterations in the last convergence (NULL if never iterated). |
| last_converged_at | timestamptz | Timestamp of the most recent refresh among SCC members (NULL if never refreshed). |
Example:
SELECT * FROM pgtrickle.pgt_scc_status();
| scc_id | member_count | members | last_iterations | last_converged_at |
|---|---|---|---|---|
| 1 | 2 | {public.reach_a,public.reach_b} | 3 | 2026-03-15 12:00:00+00 |
Notes:
- Only cyclic SCCs (with `scc_id IS NOT NULL`) are returned; acyclic stream tables are omitted.
- `last_iterations` reflects the maximum `last_fixpoint_iterations` across SCC members.
- Results are queried from the catalog on each call.
pgtrickle.explain_st
Explain the DVM plan for a stream table's defining query.
pgtrickle.explain_st(name text) → SETOF record(
property text,
value text
)
Example:
SELECT * FROM pgtrickle.explain_st('order_totals');
| property | value |
|---|---|
| pgt_name | public.order_totals |
| defining_query | SELECT region, SUM(amount) ... |
| refresh_mode | DIFFERENTIAL |
| status | active |
| is_populated | true |
| dvm_supported | true |
| operator_tree | Aggregate → Scan(orders) |
| output_columns | region, total |
| source_oids | 16384 |
| delta_query | WITH ... SELECT ... |
| frontier | {"orders": "0/15A3B80"} |
| amplification_stats | {"samples":10,"min":1.0,...} |
| refresh_timing_stats | {"samples":10,"min_ms":12.3,...} |
| source_partitions | [{"source":"public.orders",...}] |
| dependency_graph_dot | digraph dependency_subgraph { ... } |
| spill_info | {"temp_blks_read":0,"temp_blks_written":1234,...} |
Output Fields
| Property | Description |
|---|---|
| pgt_name | Fully-qualified stream table name |
| defining_query | The SQL query that defines the stream table |
| refresh_mode | DIFFERENTIAL, FULL, or IMMEDIATE |
| status | Current status (active, suspended, etc.) |
| is_populated | Whether the stream table has been initially populated |
| dvm_supported | Whether the defining query supports differential view maintenance |
| operator_tree | Debug representation of the DVM operator tree |
| output_columns | Comma-separated list of output column names |
| source_oids | Comma-separated list of source table OIDs |
| aggregate_strategies | Per-aggregate maintenance strategies (JSON, if aggregates present) |
| delta_query | The generated delta SQL used for DIFFERENTIAL refresh |
| frontier | Current LSN/watermark frontier (JSON) |
| amplification_stats | Delta amplification ratio statistics over the last 20 refreshes (JSON) |
| refresh_timing_stats | Refresh duration statistics over the last 20 completed refreshes (JSON). Fields: samples, min_ms, max_ms, avg_ms, latest_ms, latest_action |
| source_partitions | Partition info for partitioned source tables (JSON array). Fields per entry: source, partition_key, partitions |
| dependency_graph_dot | Dependency sub-graph in DOT format. Shows immediate upstream sources (ellipses for base tables, boxes for stream tables) and downstream dependents. Paste into a Graphviz renderer to visualize. |
| spill_info | Temp file spill metrics from pg_stat_statements (JSON). Fields: temp_blks_read, temp_blks_written, threshold, exceeds_threshold. Only present when pg_trickle.spill_threshold_blocks > 0. |
Note: Properties are only included when data is available. For example, `source_partitions` only appears when at least one source table is partitioned, and `refresh_timing_stats` only appears after at least one completed refresh.
pgtrickle.list_sources
List the source tables that a stream table depends on.
pgtrickle.list_sources(name text) → SETOF record(
source_table text, -- qualified source table name
source_oid bigint,
source_type text, -- 'table', 'stream_table', etc.
cdc_mode text, -- 'trigger', 'wal', or 'transitioning'
columns_used text -- column-level dependency info (if available)
)
Example:
SELECT * FROM pgtrickle.list_sources('order_totals');
Returns the tables tracked by CDC for the given stream table, along with how they are being tracked. Useful when diagnosing why a stream table is not refreshing or to audit which source tables are being trigger-tracked.
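Combined with `pgtrickle.trigger_inventory()`, this can narrow down a broken capture path. A sketch, assuming both functions report the same qualified `source_table` names:

```sql
-- Sources of 'order_totals' that are trigger-tracked but whose
-- CDC trigger is missing or disabled
SELECT s.source_table, s.cdc_mode, t.trigger_name, t.present, t.enabled
FROM pgtrickle.list_sources('order_totals') s
JOIN pgtrickle.trigger_inventory() t USING (source_table)
WHERE s.cdc_mode = 'trigger'
  AND (NOT t.present OR NOT t.enabled);
```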
Utilities
Utility functions for CDC management and row identity hashing.
pgtrickle.rebuild_cdc_triggers
Rebuild all CDC triggers (function body + trigger DDL) for every source table tracked by pg_trickle. This recreates trigger functions and re-attaches the trigger to each source table.
pgtrickle.rebuild_cdc_triggers() → text
Returns 'done' on success. Emits a WARNING per table on error and
continues processing remaining sources.
When to use:
- After changing `pg_trickle.cdc_trigger_mode` from `row` to `statement` (or vice versa).
- After `ALTER EXTENSION pg_trickle UPDATE` when the CDC trigger function body has changed.
- After restoring from a backup where triggers may have been lost.
Example:
-- Switch to statement-level triggers and rebuild
SET pg_trickle.cdc_trigger_mode = 'statement';
SELECT pgtrickle.rebuild_cdc_triggers();
Notes:
- Called automatically during the `ALTER EXTENSION pg_trickle UPDATE` (0.3.0 → 0.4.0) migration.
- Safe to call at any time — existing triggers are dropped and recreated.
- On error for a specific table, a `WARNING` is logged and processing continues with remaining sources.
pgtrickle.pg_trickle_hash
Compute a 64-bit xxHash row ID from a text value.
pgtrickle.pg_trickle_hash(input text) → bigint
Marked IMMUTABLE, PARALLEL SAFE.
Example:
SELECT pgtrickle.pg_trickle_hash('some_key');
-- Returns: 1234567890123456789
pgtrickle.pg_trickle_hash_multi
Compute a row ID by hashing multiple text values (composite keys).
pgtrickle.pg_trickle_hash_multi(inputs text[]) → bigint
Marked IMMUTABLE, PARALLEL SAFE. Uses \x1E (record separator) between values and \x00NULL\x00 for NULL entries.
Example:
SELECT pgtrickle.pg_trickle_hash_multi(ARRAY['key1', 'key2']);
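Because the values are joined with an explicit record separator and NULLs get a distinct encoding, different splits of the same text and NULL-vs-'NULL' inputs should hash differently. A quick sanity check (expected results follow from the encoding described above):

```sql
-- Different splits of the same concatenated text hash differently
SELECT pgtrickle.pg_trickle_hash_multi(ARRAY['ab', 'c'])
     = pgtrickle.pg_trickle_hash_multi(ARRAY['a', 'bc']) AS same_hash;

-- A NULL entry is encoded distinctly from the literal string 'NULL'
SELECT pgtrickle.pg_trickle_hash_multi(ARRAY['a', NULL])
     = pgtrickle.pg_trickle_hash_multi(ARRAY['a', 'NULL']) AS same_hash;
```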
Operator Support Matrix — Summary
pg_trickle supports 60+ SQL constructs across three refresh modes. The table below summarises broad categories. For the complete per-operator matrix (including notes on caveats, auxiliary columns and strategies), see DVM_OPERATORS.md.
| Category | FULL | DIFFERENTIAL | IMMEDIATE | Notes |
|---|---|---|---|---|
| Basic SELECT / WHERE / DISTINCT | ✅ | ✅ | ✅ | |
| Joins (INNER, LEFT, RIGHT, FULL, CROSS, LATERAL) | ✅ | ✅ | ✅ | Hybrid delta strategy |
| Subqueries (EXISTS, IN, NOT EXISTS, NOT IN, scalar) | ✅ | ✅ | ✅ | |
| Set operations (UNION ALL, INTERSECT, EXCEPT) | ✅ | ✅ | ✅ | |
| Algebraic aggregates (COUNT, SUM, AVG, STDDEV, …) | ✅ | ✅ | ✅ | Fully invertible delta |
| Semi-algebraic aggregates (MIN, MAX) | ✅ | ✅ | ✅ | Group rescan on ambiguous delete |
| Group-rescan aggregates (STRING_AGG, ARRAY_AGG, …) | ✅ | ⚠️ | ⚠️ | Warning emitted at creation time |
| Window functions (ROW_NUMBER, RANK, LAG, LEAD, …) | ✅ | ✅ | ✅ | Partition-scoped recompute |
| CTEs (non-recursive and WITH RECURSIVE) | ✅ | ✅ | ✅ | Semi-naive / DRed strategies |
| TopK (ORDER BY … LIMIT) | ✅ | ✅ | ✅ | Scoped recomputation |
| LATERAL / set-returning functions / JSON_TABLE | ✅ | ✅ | ✅ | Row-scoped re-execution |
| ST-to-ST dependencies | ✅ | ✅ | ✅ | Differential via change buffers |
| VOLATILE functions | ✅ | ❌ | ❌ | Rejected at creation time |
Legend: ✅ fully supported — ⚠️ supported with caveats — ❌ not supported
For details on each operator's delta strategy, auxiliary columns, and known limitations, see the full Operator Support Matrix.
Expression Support
pg_trickle's DVM parser supports a wide range of SQL expressions in defining queries. All expressions work in both FULL and DIFFERENTIAL modes.
Conditional Expressions
| Expression | Example | Notes |
|---|---|---|
| CASE WHEN … THEN … ELSE … END | CASE WHEN amount > 100 THEN 'high' ELSE 'low' END | Searched CASE |
| CASE <expr> WHEN … THEN … END | CASE status WHEN 1 THEN 'active' WHEN 2 THEN 'inactive' END | Simple CASE |
| COALESCE(a, b, …) | COALESCE(phone, email, 'unknown') | Returns first non-NULL argument |
| NULLIF(a, b) | NULLIF(divisor, 0) | Returns NULL if a = b |
| GREATEST(a, b, …) | GREATEST(score1, score2, score3) | Returns the largest value |
| LEAST(a, b, …) | LEAST(price, max_price) | Returns the smallest value |
Comparison Operators
| Expression | Example | Notes |
|---|---|---|
| IN (list) | category IN ('A', 'B', 'C') | Also supports NOT IN |
| BETWEEN a AND b | price BETWEEN 10 AND 100 | Also supports NOT BETWEEN |
| IS DISTINCT FROM | a IS DISTINCT FROM b | NULL-safe inequality |
| IS NOT DISTINCT FROM | a IS NOT DISTINCT FROM b | NULL-safe equality |
| SIMILAR TO | name SIMILAR TO '%pattern%' | SQL regex matching |
| op ANY(array) | id = ANY(ARRAY[1,2,3]) | Array comparison |
| op ALL(array) | score > ALL(ARRAY[50,60]) | Array comparison |
Boolean Tests
| Expression | Example |
|---|---|
| IS TRUE | active IS TRUE |
| IS NOT TRUE | flag IS NOT TRUE |
| IS FALSE | completed IS FALSE |
| IS NOT FALSE | valid IS NOT FALSE |
| IS UNKNOWN | result IS UNKNOWN |
| IS NOT UNKNOWN | flag IS NOT UNKNOWN |
SQL Value Functions
| Function | Description |
|---|---|
| CURRENT_DATE | Current date |
| CURRENT_TIME | Current time with time zone |
| CURRENT_TIMESTAMP | Current date and time with time zone |
| LOCALTIME | Current time without time zone |
| LOCALTIMESTAMP | Current date and time without time zone |
| CURRENT_ROLE | Current role name |
| CURRENT_USER | Current user name |
| SESSION_USER | Session user name |
| CURRENT_CATALOG | Current database name |
| CURRENT_SCHEMA | Current schema name |
Array and Row Expressions
| Expression | Example | Notes |
|---|---|---|
| ARRAY[…] | ARRAY[1, 2, 3] | Array constructor |
| ROW(…) | ROW(a, b, c) | Row constructor |
| Array subscript | arr[1] | Array element access |
| Field access | (rec).field | Composite type field access |
| Star indirection | (data).* | Expand all fields |
Subquery Expressions
Subqueries are supported in the WHERE clause and SELECT list. They are parsed into dedicated DVM operators with specialized delta computation for incremental maintenance.
| Expression | Example | DVM Operator |
|---|---|---|
| EXISTS (subquery) | WHERE EXISTS (SELECT 1 FROM orders WHERE orders.cid = c.id) | Semi-Join |
| NOT EXISTS (subquery) | WHERE NOT EXISTS (SELECT 1 FROM orders WHERE orders.cid = c.id) | Anti-Join |
| IN (subquery) | WHERE id IN (SELECT product_id FROM order_items) | Semi-Join (rewritten as equality) |
| NOT IN (subquery) | WHERE id NOT IN (SELECT product_id FROM order_items) | Anti-Join |
| ALL (subquery) | WHERE price > ALL (SELECT price FROM competitors) | Anti-Join (NULL-safe) |
| Scalar subquery (SELECT) | SELECT (SELECT max(price) FROM products) AS max_p | Scalar Subquery |
Notes:
- `EXISTS` and `IN (subquery)` in the `WHERE` clause are transformed into semi-join operators. `NOT EXISTS` and `NOT IN (subquery)` become anti-join operators.
- Multi-column `IN (subquery)` is not supported (e.g., `WHERE (a, b) IN (SELECT x, y FROM ...)`). Rewrite as `WHERE EXISTS (SELECT 1 FROM ... WHERE a = x AND b = y)` for equivalent semantics.
- Multiple subqueries in the same `WHERE` clause are supported when combined with `AND`. Subqueries combined with `OR` are also supported — they are automatically rewritten into a `UNION` of separate filtered queries.
- Scalar subqueries in the `SELECT` list are supported as long as they return exactly one row and one column.
- `ALL (subquery)` is supported — see the worked example below.
ALL (subquery) — Worked Example
ALL (subquery) tests whether a comparison holds against every row returned
by the subquery. pg_trickle rewrites it to a NULL-safe anti-join so it can be
maintained incrementally.
Comparison operators supported: >, >=, <, <=, =, <>
Example — products cheaper than all competitors:
-- Source tables
CREATE TABLE products (
id INT PRIMARY KEY,
name TEXT,
price NUMERIC
);
CREATE TABLE competitor_prices (
id INT PRIMARY KEY,
product_id INT,
price NUMERIC
);
-- Sample data
INSERT INTO products VALUES (1, 'Widget', 9.99), (2, 'Gadget', 24.99), (3, 'Gizmo', 14.99);
INSERT INTO competitor_prices VALUES (1, 1, 12.99), (2, 1, 11.50), (3, 2, 19.99), (4, 3, 14.99);
-- Stream table: find products priced below ALL competitor prices
SELECT pgtrickle.create_stream_table(
name => 'cheapest_products',
query => $$
SELECT p.id, p.name, p.price
FROM products p
WHERE p.price < ALL (
SELECT cp.price
FROM competitor_prices cp
WHERE cp.product_id = p.id
)
$$,
schedule => '1m'
);
Result: Widget (9.99 < all of [12.99, 11.50]) is included. Gadget (24.99 ≮ 19.99) is excluded. Gizmo (14.99 ≮ 14.99) is excluded.
How pg_trickle handles it internally:
- `WHERE price < ALL (SELECT ...)` is parsed into an anti-join with a NULL-safe condition.
- The condition `NOT (x op col)` is wrapped as `(col IS NULL OR NOT (x op col))` to correctly handle NULL values in the subquery — if any subquery row is NULL, the ALL comparison fails (standard SQL semantics).
- The anti-join uses the same incremental delta computation as `NOT EXISTS`, so changes to either `products` or `competitor_prices` are propagated efficiently.
Other common patterns:
-- Employees whose salary meets or exceeds all department maximums
WHERE salary >= ALL (SELECT max_salary FROM department_caps)
-- Orders with ratings better than all thresholds
WHERE rating > ALL (SELECT min_rating FROM quality_thresholds)
Auto-Rewrite Pipeline
pg_trickle transparently rewrites certain SQL constructs before parsing. These rewrites are applied automatically and require no user action:
| Order | Trigger | Rewrite |
|---|---|---|
| #0 | View references in FROM | Inline view body as subquery |
| #1 | DISTINCT ON (expr) | Convert to ROW_NUMBER() OVER (PARTITION BY expr ORDER BY ...) = 1 subquery |
| #2 | GROUPING SETS / CUBE / ROLLUP | Decompose into UNION ALL of separate GROUP BY queries |
| #3 | Scalar subquery in WHERE | Convert to CROSS JOIN with inline view |
| #4 | Correlated scalar subquery in SELECT | Convert to LEFT JOIN with grouped inline view |
| #5 | EXISTS/IN inside OR | Split into UNION of separate filtered queries |
| #6 | Multiple PARTITION BY clauses | Split into joined subqueries, one per distinct partitioning |
| #7 | Window functions inside expressions | Lift to inner subquery with synthetic __pgt_wf_N columns (see below) |
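As an illustration of rewrite #1, here is a DISTINCT ON query and the rough shape pg_trickle derives from it. The exact generated SQL may differ; this is a sketch of the documented transformation:

```sql
-- What you write: latest order per customer
SELECT DISTINCT ON (customer_id) customer_id, order_date, total
FROM orders
ORDER BY customer_id, order_date DESC;

-- Roughly what the rewrite produces
SELECT customer_id, order_date, total
FROM (
  SELECT *,
         ROW_NUMBER() OVER (PARTITION BY customer_id
                            ORDER BY order_date DESC) AS rn
  FROM orders
) sub
WHERE rn = 1;
```

The rewritten form uses only operators the DVM engine already maintains incrementally (window functions with partition-scoped recompute).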
Window Functions in Expressions (Auto-Rewrite)
Window functions nested inside expressions (e.g., CASE WHEN ROW_NUMBER() ...,
ABS(RANK() OVER (...) - 5)) are automatically rewritten. pg_trickle lifts
each window function call into a synthetic column in an inner subquery, then
applies the original expression in the outer SELECT.
This rewrite is transparent — you write your query naturally and pg_trickle handles it:
Your query:
SELECT
id,
name,
CASE WHEN ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) = 1
THEN 'top earner'
ELSE 'other'
END AS rank_label
FROM employees
What pg_trickle generates internally:
SELECT
"__pgt_wf_inner".id,
"__pgt_wf_inner".name,
CASE WHEN "__pgt_wf_inner"."__pgt_wf_1" = 1
THEN 'top earner'
ELSE 'other'
END AS "rank_label"
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) AS "__pgt_wf_1"
FROM employees
) "__pgt_wf_inner"
The inner subquery produces the window function result as a plain column
(__pgt_wf_1), which the DVM engine can maintain incrementally using its
existing window function support. The outer expression is then a simple
column reference.
More examples:
-- Arithmetic with window functions
SELECT id, ABS(RANK() OVER (ORDER BY score) - 5) AS adjusted_rank
FROM players
-- COALESCE with window function
SELECT id, COALESCE(LAG(value) OVER (ORDER BY ts), 0) AS prev_value
FROM sensor_readings
-- Multiple window functions in expressions
SELECT id,
ROW_NUMBER() OVER (ORDER BY created_at) * 100 AS seq,
SUM(amount) OVER (ORDER BY created_at) / COUNT(*) OVER (ORDER BY created_at) AS running_avg
FROM transactions
All of these are handled automatically — each distinct window function call
is extracted to its own __pgt_wf_N synthetic column.
HAVING Clause
HAVING is fully supported. The filter predicate is applied on top of the aggregate delta computation — groups that pass the HAVING condition are included in the stream table.
SELECT pgtrickle.create_stream_table(
name => 'big_departments',
query => 'SELECT department, COUNT(*) AS cnt FROM employees GROUP BY department HAVING COUNT(*) > 10',
schedule => '1m'
);
Tables Without Primary Keys (Keyless Tables)
Tables without a primary key can be used as sources. pg_trickle generates a content-based row identity
by hashing all column values using pg_trickle_hash_multi(). This allows DIFFERENTIAL mode to work,
though at the cost of being unable to distinguish truly duplicate rows (rows with identical values in all columns).
-- No primary key — pg_trickle uses content hashing for row identity
CREATE TABLE events (ts TIMESTAMPTZ, payload JSONB);
SELECT pgtrickle.create_stream_table(
name => 'event_summary',
query => 'SELECT payload->>''type'' AS event_type, COUNT(*) FROM events GROUP BY 1',
schedule => '1m'
);
Known Limitation — Duplicate Rows in Keyless Tables (G7.1)
When a keyless table contains exact duplicate rows (identical values in every column), content-based hashing produces the same `__pgt_row_id` for each copy. Consequences:
- INSERT of a duplicate row may appear as a no-op (the hash already exists in the stream table).
- DELETE of one copy may delete all copies (the MERGE matches on `__pgt_row_id`, hitting every duplicate).
- Aggregate counts over keyless tables with duplicates may drift from the true query result.
Recommendation: Add a `PRIMARY KEY` or at least a `UNIQUE` constraint to source tables used in DIFFERENTIAL mode. This eliminates the ambiguity entirely. If duplicates are expected and correctness matters, use `FULL` refresh mode, which always recomputes from scratch.
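If the table has no natural key, a surrogate identity column is usually the cheapest fix. This is standard PostgreSQL DDL, not a pg_trickle feature; note it rewrites the table to backfill the new column:

```sql
-- Give the keyless events table a synthetic primary key
ALTER TABLE events
  ADD COLUMN event_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY;
```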
Volatile Function Detection
pg_trickle checks all functions and operators in the defining query against pg_proc.provolatile:
- VOLATILE functions (e.g., `random()`, `clock_timestamp()`, `gen_random_uuid()`) are rejected in DIFFERENTIAL and IMMEDIATE modes because they produce different results on each evaluation, breaking delta correctness.
- VOLATILE operators — custom operators backed by volatile functions are also detected. The check resolves the operator's implementation function via `pg_operator.oprcode` and checks its volatility in `pg_proc`.
- STABLE functions (e.g., `now()`, `current_timestamp`, `current_setting()`) produce a warning in DIFFERENTIAL and IMMEDIATE modes — they are consistent within a single refresh but may differ between refreshes.
- IMMUTABLE functions are always safe and produce no warnings.
FULL mode accepts all volatility classes since it re-evaluates the entire query each time.
Volatile Function Policy (VOL-1)
The pg_trickle.volatile_function_policy GUC controls how volatile functions are handled:
| Value | Behavior |
|---|---|
| reject (default) | ERROR — volatile functions are rejected at creation time. |
| warn | WARNING emitted but creation proceeds. Delta correctness is not guaranteed. |
| allow | Silent — no warning or error. Use when you understand the implications. |
-- Allow volatile functions with a warning
SET pg_trickle.volatile_function_policy = 'warn';
-- Allow volatile functions silently
SET pg_trickle.volatile_function_policy = 'allow';
-- Restore default (reject volatile functions)
SET pg_trickle.volatile_function_policy = 'reject';
COLLATE Expressions
COLLATE clauses on expressions are supported:
SELECT pgtrickle.create_stream_table(
name => 'sorted_names',
query => 'SELECT name COLLATE "C" AS c_name FROM users',
schedule => '1m'
);
IS JSON Predicate (PostgreSQL 16+)
The IS JSON predicate validates whether a value is valid JSON. All variants are supported:
-- Filter rows with valid JSON
SELECT pgtrickle.create_stream_table(
name => 'valid_json_events',
query => 'SELECT id, payload FROM events WHERE payload::text IS JSON',
schedule => '1m'
);
-- Type-specific checks
SELECT pgtrickle.create_stream_table(
name => 'json_objects_only',
query => 'SELECT id, data IS JSON OBJECT AS is_obj,
data IS JSON ARRAY AS is_arr,
data IS JSON SCALAR AS is_scalar
FROM json_data',
schedule => '1m',
refresh_mode => 'FULL'
);
Supported variants: IS JSON, IS JSON OBJECT, IS JSON ARRAY, IS JSON SCALAR, IS NOT JSON (all forms), WITH UNIQUE KEYS.
SQL/JSON Constructors (PostgreSQL 16+)
SQL-standard JSON constructor functions are supported in both FULL and DIFFERENTIAL modes:
-- JSON_OBJECT: construct a JSON object from key-value pairs
SELECT pgtrickle.create_stream_table(
name => 'user_json',
query => 'SELECT id, JSON_OBJECT(''name'' : name, ''age'' : age) AS data FROM users',
schedule => '1m'
);
-- JSON_ARRAY: construct a JSON array from values
SELECT pgtrickle.create_stream_table(
name => 'value_arrays',
query => 'SELECT id, JSON_ARRAY(a, b, c) AS arr FROM measurements',
schedule => '1m',
refresh_mode => 'FULL'
);
-- JSON(): parse a text value as JSON
-- JSON_SCALAR(): wrap a scalar value as JSON
-- JSON_SERIALIZE(): serialize a JSON value to text
Note: `JSON_ARRAYAGG()` and `JSON_OBJECTAGG()` are SQL-standard aggregate functions fully recognized by the DVM engine. In DIFFERENTIAL mode, they use the group-rescan strategy (affected groups are re-aggregated from source data). The full deparsed SQL is preserved to handle the special `key : value`, `ABSENT ON NULL`, `ORDER BY`, and `RETURNING` clause syntax.
JSON_TABLE (PostgreSQL 17+)
JSON_TABLE() generates a relational table from JSON data. It is supported in the FROM clause in both FULL and DIFFERENTIAL modes. Internally, it is modeled as a LateralFunction.
-- Extract structured data from a JSON column
SELECT pgtrickle.create_stream_table(
name => 'user_phones',
query => $$SELECT u.id, j.phone_type, j.phone_number
FROM users u,
JSON_TABLE(u.contact_info, '$.phones[*]'
COLUMNS (
phone_type TEXT PATH '$.type',
phone_number TEXT PATH '$.number'
)
) AS j$$,
schedule => '1m'
);
Supported column types:
- Regular columns — name TYPE PATH '$.path' (with optional ON ERROR / ON EMPTY behaviors)
- EXISTS columns — name TYPE EXISTS PATH '$.path'
- Formatted columns — name TYPE FORMAT JSON PATH '$.path'
- Nested columns — NESTED PATH '$.path' COLUMNS (...)
The PASSING clause is also supported for passing named variables to path expressions.
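For example, a sketch of PASSING in use (the jsonpath filter and source table are illustrative):

```sql
-- Filter phones by type using a PASSING variable ($t) in the path:
SELECT pgtrickle.create_stream_table(
    name => 'mobile_phones',
    query => $$SELECT u.id, j.phone_number
               FROM users u,
               JSON_TABLE(u.contact_info, '$.phones[*] ? (@.type == $t)'
                   PASSING 'mobile' AS t
                   COLUMNS (phone_number TEXT PATH '$.number')
               ) AS j$$,
    schedule => '1m'
);
```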
Unsupported Expression Types
The following are rejected with clear error messages rather than producing broken SQL:
| Expression | Error Behavior | Suggested Rewrite |
|---|---|---|
TABLESAMPLE | Rejected — stream tables materialize the complete result set | Use WHERE random() < 0.1 if sampling is needed |
FOR UPDATE / FOR SHARE | Rejected — stream tables do not support row-level locking | Remove the locking clause |
| Unknown node types | Rejected with type information | — |
Note: Window functions inside expressions (e.g.,
CASE WHEN ROW_NUMBER() OVER (...) ...) were unsupported in earlier versions but are now automatically rewritten — see Auto-Rewrite Pipeline § Window Functions in Expressions.
Restrictions & Interoperability
Stream tables are standard PostgreSQL heap tables stored in the pgtrickle schema with an additional __pgt_row_id BIGINT PRIMARY KEY column managed by the refresh engine. This section describes what you can and cannot do with them.
Referencing Other Stream Tables
Stream tables can reference other stream tables in their defining query. This creates a dependency edge in the internal DAG, and the scheduler refreshes upstream tables before downstream ones. By default, cycles are detected and rejected at creation time.
When pg_trickle.allow_circular = true, circular dependencies are allowed for stream tables that use DIFFERENTIAL refresh mode and have monotone defining queries (no aggregates, EXCEPT, window functions, or NOT EXISTS/NOT IN). Cycle members are assigned an scc_id and the scheduler iterates them to a fixed point. Non-monotone operators are rejected because they prevent convergence.
-- ST1 reads from a base table
SELECT pgtrickle.create_stream_table(
name => 'order_totals',
query => 'SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id',
schedule => '1m'
);
-- ST2 reads from ST1
SELECT pgtrickle.create_stream_table(
name => 'big_customers',
query => 'SELECT customer_id, total FROM pgtrickle.order_totals WHERE total > 1000',
schedule => '1m'
);
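If you do enable circular dependencies, cycle membership and convergence can be observed through the monitoring view documented later in this reference:

```sql
-- Inspect SCC membership and fixpoint behavior for cyclic stream tables:
SELECT pgt_name, scc_id, last_fixpoint_iterations
FROM pgtrickle.pg_stat_stream_tables
WHERE scc_id IS NOT NULL;
```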
Views as Sources in Defining Queries
PostgreSQL views can be used as source tables in a stream table's defining query. Views are automatically inlined — replaced with their underlying SELECT definition as subqueries — so CDC triggers land on the actual base tables.
CREATE VIEW active_orders AS
SELECT * FROM orders WHERE status = 'active';
-- This works (views are auto-inlined):
SELECT pgtrickle.create_stream_table(
name => 'order_summary',
query => 'SELECT customer_id, COUNT(*) FROM active_orders GROUP BY customer_id',
schedule => '1m'
);
-- Internally, 'active_orders' is replaced with:
-- (SELECT ... FROM orders WHERE status = 'active') AS active_orders
Nested views (view → view → table) are fully expanded via a fixpoint loop. Column-renaming views (CREATE VIEW v(a, b) AS ...) work correctly — pg_get_viewdef() produces the proper column aliases.
When a view is inlined, the user's original SQL is stored in the original_query catalog column for reinit and introspection. The defining_query column contains the expanded (post-inlining) form.
DDL hooks: CREATE OR REPLACE VIEW on a view that was inlined into a stream table marks that ST for reinit. DROP VIEW sets affected STs to ERROR status.
Materialized views are rejected in DIFFERENTIAL mode — their stale-snapshot semantics prevent CDC triggers from tracking changes. Use the underlying query directly, or switch to FULL mode. In FULL mode, materialized views are allowed (no CDC needed).
Foreign tables are rejected in DIFFERENTIAL mode — row-level triggers cannot be created on foreign tables. Use FULL mode instead.
Partitioned Tables as Sources
Partitioned tables are fully supported as source tables in both FULL and DIFFERENTIAL modes. CDC triggers are installed on the partitioned parent table, and PostgreSQL 13+ ensures the trigger fires for all DML routed to child partitions. The change buffer uses the parent table's OID (pgtrickle_changes.changes_<parent_oid>).
CREATE TABLE orders (
id INT, region TEXT, amount NUMERIC
) PARTITION BY LIST (region);
CREATE TABLE orders_us PARTITION OF orders FOR VALUES IN ('US');
CREATE TABLE orders_eu PARTITION OF orders FOR VALUES IN ('EU');
-- Works — inserts into any partition are captured:
SELECT pgtrickle.create_stream_table(
name => 'order_summary',
query => 'SELECT region, SUM(amount) FROM orders GROUP BY region',
schedule => '1m'
);
ATTACH PARTITION detection: When a new partition is attached to a tracked
source table via ALTER TABLE parent ATTACH PARTITION child ..., pg_trickle's
DDL event trigger detects the change in partition structure and automatically
marks affected stream tables for reinitialization. This ensures pre-existing rows
in the newly attached partition are included on the next refresh. DETACH
PARTITION is also detected and triggers reinitialization.
WAL mode: When using WAL-based CDC (cdc_mode = 'wal'), publications for
partitioned source tables are created with publish_via_partition_root = true.
This ensures changes from child partitions are published under the parent
table's identity, matching trigger-mode CDC behavior.
Note: pg_trickle targets PostgreSQL 18. On PostgreSQL 12 or earlier (not supported), parent triggers do not fire for partition-routed rows, which would cause silent data loss.
Foreign Tables as Sources
Foreign tables (via postgres_fdw or other FDWs) can be used as stream table
sources with these constraints:
| CDC Method | Supported? | Why |
|---|---|---|
| Trigger-based | ❌ No | Foreign tables don't support row-level triggers |
| WAL-based | ❌ No | Foreign tables don't generate local WAL entries |
| FULL refresh | ✅ Yes | Re-executes the remote query each cycle |
| Polling-based | ✅ Yes | When pg_trickle.foreign_table_polling = on |
-- Foreign table source — FULL refresh only
SELECT pgtrickle.create_stream_table(
name => 'remote_summary',
query => 'SELECT region, SUM(amount) FROM remote_orders GROUP BY region',
schedule => '5m',
refresh_mode => 'FULL'
);
When pg_trickle detects a foreign table source, it emits an INFO message explaining the constraints. If you request DIFFERENTIAL mode without polling enabled, creation succeeds but each refresh falls back to FULL.
Polling-based CDC creates a local snapshot table and computes EXCEPT ALL
differences on each refresh. Enable with:
SET pg_trickle.foreign_table_polling = on;
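Conceptually, each polling refresh computes a diff with roughly this shape (the snapshot table name is illustrative — the internal naming is not part of the public API):

```sql
-- Rows added on the remote side since the last poll:
SELECT * FROM remote_orders
EXCEPT ALL
SELECT * FROM local_snapshot_remote_orders;

-- Rows removed on the remote side since the last poll:
SELECT * FROM local_snapshot_remote_orders
EXCEPT ALL
SELECT * FROM remote_orders;
```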
For a complete step-by-step setup guide, see the Foreign Table Sources tutorial.
IMMEDIATE Mode Query Restrictions
The 'IMMEDIATE' refresh mode supports all SQL constructs available in 'DIFFERENTIAL' and 'FULL' modes. Queries are validated at stream table creation and when switching to IMMEDIATE mode via alter_stream_table.
Supported in IMMEDIATE mode:
- Simple SELECT ... FROM table scans, filters, projections
- JOIN (INNER, LEFT, FULL OUTER)
- GROUP BY with standard aggregates (COUNT, SUM, AVG, MIN, MAX, etc.)
- DISTINCT
- Non-recursive WITH (CTEs)
- UNION ALL, INTERSECT, EXCEPT
- EXISTS / IN subqueries (SemiJoin, AntiJoin)
- Subqueries in FROM
- Window functions (ROW_NUMBER, RANK, DENSE_RANK, etc.)
- LATERAL subqueries
- LATERAL set-returning functions (unnest(), jsonb_array_elements(), etc.)
- Scalar subqueries in SELECT
- Cascading IMMEDIATE stream tables (ST depending on another IMMEDIATE ST)
- Recursive CTEs (WITH RECURSIVE) — uses semi-naive evaluation (INSERT-only) or Delete-and-Rederive (DELETE/UPDATE); bounded by pg_trickle.ivm_recursive_max_depth (default 100) to guard against infinite loops from cyclic data
Not yet supported in IMMEDIATE mode:
None — all constructs that work in 'DIFFERENTIAL' mode are now also available in
'IMMEDIATE' mode.
Notes on WITH RECURSIVE in IMMEDIATE mode:
- A __pgt_depth counter is injected into the generated semi-naive SQL. Propagation stops when the counter reaches ivm_recursive_max_depth (default 100). Raise this GUC for deeper hierarchies or set it to 0 to disable the guard.
- A WARNING is emitted at stream table creation time reminding operators to monitor for stack depth limit exceeded errors on very deep hierarchies.
- Non-linear recursion (multiple self-references) is rejected — PostgreSQL itself enforces this restriction.
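A sketch of a recursive hierarchy in IMMEDIATE mode (the org table is illustrative):

```sql
-- Allow deeper hierarchies than the default guard of 100 levels:
SET pg_trickle.ivm_recursive_max_depth = 500;

SELECT pgtrickle.create_stream_table(
    name => 'org_tree',
    query => $$WITH RECURSIVE tree AS (
                   SELECT id, parent_id, name FROM org WHERE parent_id IS NULL
                   UNION ALL
                   SELECT o.id, o.parent_id, o.name
                   FROM org o JOIN tree t ON o.parent_id = t.id
               )
               SELECT id, parent_id, name FROM tree$$,
    schedule => '1m',
    refresh_mode => 'IMMEDIATE'
);
```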
Attempting to create a stream table with an unsupported construct produces a clear error message.
Logical Replication Targets
Tables that receive data via logical replication require special consideration. Changes arriving via replication do not fire normal row-level triggers, which means CDC triggers will miss those changes.
pg_trickle emits a WARNING at stream table creation time if any source table is detected as a logical replication target (via pg_subscription_rel).
Workarounds:
- Use cdc_mode = 'wal' for WAL-based CDC that captures all changes regardless of origin.
- Use FULL refresh mode, which recomputes entirely from the current table state.
- Set a frequent refresh schedule with FULL mode to limit staleness.
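For example, the FULL-mode workaround looks like this (table names are illustrative):

```sql
-- 'replicated_events' receives rows via a subscription, so trigger-based
-- CDC would miss them; FULL mode recomputes from current state every 30s.
SELECT pgtrickle.create_stream_table(
    name => 'replicated_summary',
    query => 'SELECT region, COUNT(*) AS n FROM replicated_events GROUP BY region',
    schedule => '30s',
    refresh_mode => 'FULL'
);
```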
Views on Stream Tables
PostgreSQL views can reference stream tables. The view reflects the data as of the most recent refresh.
CREATE VIEW top_customers AS
SELECT customer_id, total
FROM pgtrickle.order_totals
WHERE total > 500
ORDER BY total DESC;
Materialized Views on Stream Tables
Materialized views can reference stream tables, though this is typically redundant (both are physical snapshots of a query). The materialized view requires its own REFRESH MATERIALIZED VIEW — it does not auto-refresh when the stream table refreshes.
Logical Replication of Stream Tables
Stream tables can be published for logical replication like any ordinary table:
-- On publisher
CREATE PUBLICATION my_pub FOR TABLE pgtrickle.order_totals;
-- On subscriber
CREATE SUBSCRIPTION my_sub
CONNECTION 'host=... dbname=...'
PUBLICATION my_pub;
Caveats:
- The __pgt_row_id column is replicated (it is the primary key), which is an internal implementation detail.
- The subscriber receives materialized data, not the defining query. Refreshes on the publisher propagate as normal DML via logical replication.
- Do not install pg_trickle on the subscriber and attempt to refresh the replicated table — it will have no CDC triggers or catalog entries.
- The internal change buffer tables (pgtrickle_changes.changes_<oid>) and catalog tables are not published by default; subscribers only receive the final output.
Known Delta Computation Limitations
The following edge cases produce incorrect delta results in DIFFERENTIAL mode under specific data mutation patterns. They have no effect on FULL mode.
JOIN Key Column Change + Simultaneous Right-Side Delete — Fixed (EC-01)
Resolved in v0.14.0. This limitation no longer exists — the delta query now uses a pre-change right snapshot (R₀) for DELETE deltas, ensuring stale rows are correctly removed even when the join partner is simultaneously deleted.
The fix splits Part 1 of the JOIN delta into two arms:
- Part 1a (inserts):
ΔL_inserts ⋈ R₁— uses current right state - Part 1b (deletes):
ΔL_deletes ⋈ R₀— uses pre-change right state
R₀ is reconstructed as R_current EXCEPT ALL ΔR_inserts UNION ALL ΔR_deletes (or via
NOT EXISTS anti-join for simple Scan nodes). This ensures the DELETE half always
finds the old join partner, even if that partner was deleted in the same cycle.
The fix applies to INNER JOIN, LEFT JOIN, and FULL OUTER JOIN delta operators. See DVM_OPERATORS.md for implementation details.
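In SQL terms, the R₀ reconstruction has roughly this shape (relation names are illustrative of the generated delta query, not literal internals):

```sql
-- Pre-change right snapshot: undo this cycle's right-side changes
-- by subtracting the inserts and adding back the deletes.
(SELECT * FROM r_current
 EXCEPT ALL
 SELECT * FROM delta_r_inserts)
UNION ALL
SELECT * FROM delta_r_deletes;
```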
CUBE/ROLLUP Expansion Limit
CUBE(a, b, c, ..., n) on N columns generates $2^N$ grouping-set branches (a UNION ALL of $2^N$ queries).
pg_trickle rejects CUBE/ROLLUP that would produce more than 64 branches to prevent runaway
memory usage during query generation. Use explicit GROUPING SETS(...) instead:
-- Rejected: CUBE(a, b, c, d, e, f, g) would generate 128 branches
-- Use instead:
SELECT pgtrickle.create_stream_table(
name => 'multi_dim',
query => 'SELECT a, b, c, SUM(v) FROM t
GROUP BY GROUPING SETS ((a, b, c), (a, b), (a), ())',
schedule => '5m'
);
What Is NOT Allowed
| Operation | Restriction | Reason |
|---|---|---|
Direct DML (INSERT, UPDATE, DELETE) | ❌ Not supported | Stream table contents are managed exclusively by the refresh engine. |
Direct DDL (ALTER TABLE) | ❌ Not supported | Use pgtrickle.alter_stream_table() to change the defining query or schedule. |
| Foreign keys referencing or from a stream table | ❌ Not supported | The refresh engine performs bulk MERGE operations that do not respect FK ordering. |
| User-defined triggers on stream tables | ✅ Supported (DIFFERENTIAL) | In DIFFERENTIAL mode, the refresh engine decomposes changes into explicit DELETE + UPDATE + INSERT statements so triggers fire with correct TG_OP, OLD, and NEW. Row-level triggers are suppressed during FULL refresh. Controlled by pg_trickle.user_triggers GUC (default: auto). |
TRUNCATE on a stream table | ❌ Not supported | Use pgtrickle.refresh_stream_table() to reset data. |
Tip: The __pgt_row_id column is visible but should be ignored by consuming queries — it is an implementation detail used for delta MERGE operations.
Internal __pgt_* Auxiliary Columns
Stream tables may contain additional hidden columns whose names begin with __pgt_. These are managed exclusively by the refresh engine — they are not part of the user-visible schema and should never be read or written by application queries.
__pgt_row_id — Row identity (always present)
Every stream table has a BIGINT PRIMARY KEY column named __pgt_row_id. It is a content hash of all output columns (xxHash3-128 with Fibonacci-mixing of multiple column hashes), updated by the refresh engine on every MERGE. It is used as the MERGE join key to detect inserts/updates/deletes.
__pgt_count — Group multiplicity (aggregates & DISTINCT)
Added when the defining query contains GROUP BY, DISTINCT, UNION ALL ... GROUP BY, or any aggregate expression that requires tracking how many source rows contribute to each output row.
| Type | Triggers |
|---|---|
BIGINT NOT NULL DEFAULT 0 | GROUP BY, DISTINCT, COUNT(*), SUM(...), AVG(...), STDDEV(...), VAR(...), UNION deduplication |
__pgt_count_l / __pgt_count_r — Dual multiplicity (INTERSECT / EXCEPT)
Added when the defining query contains INTERSECT or EXCEPT. Stores independently the left-branch and right-branch row counts for Z-set delta algebra.
| Type | Triggers |
|---|---|
BIGINT NOT NULL DEFAULT 0 each | INTERSECT, INTERSECT ALL, EXCEPT, EXCEPT ALL |
__pgt_aux_sum_<alias> / __pgt_aux_count_<alias> — Running totals for AVG
Pairs of auxiliary columns added for each AVG(expr) in the query. Instead of recomputing the average from scratch on each delta, the refresh engine maintains a running sum and count and derives the average algebraically.
| Type | Triggers |
|---|---|
NUMERIC NOT NULL DEFAULT 0 (sum), BIGINT NOT NULL DEFAULT 0 (count) | Any AVG(expr) in GROUP BY query |
Named __pgt_aux_sum_<output_alias> and __pgt_aux_count_<output_alias>, where <output_alias> is the column alias for the AVG expression in the SELECT list.
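Conceptually, the maintenance step is a simple algebraic update. This is a simplified sketch — the real engine performs it inside a generated MERGE, and the table, group key, and delta names below are illustrative:

```sql
-- Fold a per-group delta (delta_sum, delta_count) into the accumulators,
-- then rederive the average; NULLIF guards against an emptied group.
UPDATE pgtrickle.sales_avg AS st SET
    __pgt_aux_sum_avg_amount   = st.__pgt_aux_sum_avg_amount   + d.delta_sum,
    __pgt_aux_count_avg_amount = st.__pgt_aux_count_avg_amount + d.delta_count,
    avg_amount = (st.__pgt_aux_sum_avg_amount + d.delta_sum)
                 / NULLIF(st.__pgt_aux_count_avg_amount + d.delta_count, 0)
FROM delta AS d
WHERE st.region = d.region;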
__pgt_aux_sum2_<alias> — Sum-of-squares for STDDEV / VARIANCE
Added alongside the sum/count pair when the query contains STDDEV, STDDEV_POP, STDDEV_SAMP, VARIANCE, VAR_POP, or VAR_SAMP. Enables O(1) algebraic computation of variance from the Welford identity.
| Type | Triggers |
|---|---|
NUMERIC NOT NULL DEFAULT 0 | STDDEV(...), STDDEV_POP(...), STDDEV_SAMP(...), VARIANCE(...), VAR_POP(...), VAR_SAMP(...) |
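The algebraic step amounts to the textbook identity for sample variance (column names follow the documented scheme but are illustrative):

```sql
-- VAR_SAMP = (sum(x^2) - sum(x)^2 / n) / (n - 1), derived from the
-- maintained accumulators without rescanning the group:
SELECT (__pgt_aux_sum2_v
        - (__pgt_aux_sum_v * __pgt_aux_sum_v) / NULLIF(__pgt_aux_count_v, 0))
       / NULLIF(__pgt_aux_count_v - 1, 0) AS var_samp
FROM pgtrickle.metrics_stats;
```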
__pgt_aux_sumx_* / __pgt_aux_sumy_* / __pgt_aux_sumxy_* / __pgt_aux_sumx2_* / __pgt_aux_sumy2_* — Cross-product accumulators for regression aggregates
Five auxiliary columns per aggregate, used for O(1) algebraic maintenance of the twelve PostgreSQL regression and correlation aggregates.
| Type | Triggers |
|---|---|
NUMERIC NOT NULL DEFAULT 0 (five columns per aggregate) | CORR(Y,X), COVAR_POP(Y,X), COVAR_SAMP(Y,X), REGR_AVGX(Y,X), REGR_AVGY(Y,X), REGR_COUNT(Y,X), REGR_INTERCEPT(Y,X), REGR_R2(Y,X), REGR_SLOPE(Y,X), REGR_SXX(Y,X), REGR_SXY(Y,X), REGR_SYY(Y,X) |
The five columns are named with base prefix __pgt_aux_<kind>_<output_alias> where <kind> is sumx, sumy, sumxy, sumx2, or sumy2. The shared group count is stored in the companion __pgt_aux_count_<output_alias> column.
__pgt_aux_nonnull_<alias> — Non-NULL count for SUM + FULL OUTER JOIN
Added when the query contains SUM(expr) inside a FULL OUTER JOIN aggregate. When matched rows transition to unmatched (null-padded), standard algebraic SUM would produce 0 instead of NULL. This counter tracks how many non-NULL argument values exist in each group; when it reaches zero the SUM is definitively NULL without a full rescan.
| Type | Triggers |
|---|---|
BIGINT NOT NULL DEFAULT 0 | SUM(expr) in a query with FULL OUTER JOIN at the top level |
__pgt_wf_<N> — Window function lift-out (query rewrite)
Added at query-rewrite time (before storage table creation) when the defining query contains window functions embedded inside larger expressions (e.g. CASE WHEN ROW_NUMBER() OVER (...) = 1 THEN ...). The engine lifts the window function to a synthetic inner-subquery column so the outer SELECT can reference it by alias.
| Type | Triggers |
|---|---|
| Inherits the window-function return type | Window function inside expression — e.g. RANK(), ROW_NUMBER(), DENSE_RANK(), LAG(), LEAD(), etc. |
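The rewrite is roughly this source-to-source transformation (a sketch with illustrative table and column names):

```sql
-- Before: window function embedded inside an expression.
SELECT CASE WHEN ROW_NUMBER() OVER (PARTITION BY k ORDER BY ts DESC) = 1
            THEN 'latest' ELSE 'older' END AS tag
FROM t;

-- After lift-out: the window function becomes a synthetic subquery column.
SELECT CASE WHEN __pgt_wf_1 = 1 THEN 'latest' ELSE 'older' END AS tag
FROM (SELECT *,
             ROW_NUMBER() OVER (PARTITION BY k ORDER BY ts DESC) AS __pgt_wf_1
      FROM t) AS s;
```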
__pgt_depth — Recursion depth counter (recursive CTE)
Present only inside the DVM-generated SQL for recursive CTE queries. Used to limit unbounded recursion in semi-naive evaluation. Not added as a permanent column to the storage table.
Rule of thumb: Unless you see an ALTER TABLE query mentioning one of these columns, they are transparent to consuming queries. Never SELECT __pgt_* columns in application code — their names, types, and presence may change across minor versions.
Row-Level Security (RLS)
Stream tables follow the same RLS model as PostgreSQL's built-in
MATERIALIZED VIEW: the refresh always materializes the full, unfiltered
result set. Access control is applied at read time via RLS policies on the
stream table itself.
How It Works
| Area | Behavior |
|---|---|
| RLS on source tables | Ignored during refresh. The scheduler runs as superuser; manual refresh_stream_table() and IMMEDIATE-mode triggers bypass RLS via SET LOCAL row_security = off / SECURITY DEFINER. The stream table always contains all rows. |
| RLS on the stream table | Works naturally. Enable RLS and create policies on the stream table to filter reads per role — exactly as you would on any regular table. |
| RLS policy changes on source tables | CREATE POLICY, ALTER POLICY, and DROP POLICY on a source table are detected by pg_trickle's DDL event trigger and mark the stream table for reinitialisation. |
| ENABLE/DISABLE RLS on source tables | ALTER TABLE … ENABLE ROW LEVEL SECURITY and DISABLE ROW LEVEL SECURITY on a source table mark the stream table for reinitialisation. |
| Change buffer tables | RLS is explicitly disabled on all change buffer tables (pgtrickle_changes.changes_*) so CDC trigger inserts always succeed regardless of schema-level RLS settings. |
| IMMEDIATE mode | IVM trigger functions are SECURITY DEFINER with a locked search_path, so the delta query always sees all rows. The DML issued by the calling user is still filtered by that user's RLS policies on the source table — only the stream table maintenance runs with elevated privileges. |
Recommended Pattern: RLS on the Stream Table
-- 1. Create a stream table (materializes all rows)
SELECT pgtrickle.create_stream_table(
name => 'order_totals',
query => 'SELECT tenant_id, SUM(amount) AS total FROM orders GROUP BY tenant_id'
);
-- 2. Enable RLS on the stream table
ALTER TABLE pgtrickle.order_totals ENABLE ROW LEVEL SECURITY;
-- 3. Create per-tenant policies
CREATE POLICY tenant_isolation ON pgtrickle.order_totals
USING (tenant_id = current_setting('app.tenant_id')::INT);
-- 4. Each role sees only its own rows
SET app.tenant_id = '42';
SELECT * FROM pgtrickle.order_totals; -- only tenant 42's rows
Note: This is identical to how you would apply RLS to a regular
MATERIALIZED VIEW. One stream table serves all tenants; per-tenant filtering happens at query time with zero storage duplication.
Views
pgtrickle.stream_tables_info
Status overview with computed staleness information.
SELECT * FROM pgtrickle.stream_tables_info;
Columns include all pgtrickle.pgt_stream_tables columns plus:
| Column | Type | Description |
|---|---|---|
staleness | interval | now() - last_refresh_at |
stale | bool | true when the scheduler itself is behind (last_refresh_at age exceeds schedule); false when the scheduler is healthy even if source tables have had no writes |
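A typical use is a staleness check, for example:

```sql
-- Stream tables the scheduler has fallen behind on, worst first:
SELECT pgt_name, staleness
FROM pgtrickle.stream_tables_info
WHERE stale
ORDER BY staleness DESC;
```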
pgtrickle.pg_stat_stream_tables
Comprehensive monitoring view combining catalog metadata with aggregate refresh statistics.
SELECT * FROM pgtrickle.pg_stat_stream_tables;
Key columns:
| Column | Type | Description |
|---|---|---|
pgt_id | bigint | Stream table ID |
pgt_schema / pgt_name | text | Schema and name |
status | text | INITIALIZING, ACTIVE, SUSPENDED, ERROR |
refresh_mode | text | FULL or DIFFERENTIAL |
data_timestamp | timestamptz | Timestamp of last refresh |
staleness | interval | now() - last_refresh_at |
stale | bool | true when the scheduler is behind its schedule; false when the scheduler is healthy (quiet source tables do not count as stale) |
total_refreshes | bigint | Total refresh count |
successful_refreshes | bigint | Successful refresh count |
failed_refreshes | bigint | Failed refresh count |
avg_duration_ms | float8 | Average refresh duration |
consecutive_errors | int | Current error streak |
cdc_modes | text[] | Distinct CDC modes across TABLE-type sources (e.g. {wal}, {trigger,wal}, {transitioning,wal}) |
scc_id | int | SCC group identifier for circular dependencies (NULL if not in a cycle) |
last_fixpoint_iterations | int | Number of fixpoint iterations in the last SCC convergence (NULL if not cyclic) |
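For example, to surface error-prone stream tables:

```sql
-- Stream tables currently failing or in an error streak:
SELECT pgt_schema, pgt_name, status,
       consecutive_errors, failed_refreshes, avg_duration_ms
FROM pgtrickle.pg_stat_stream_tables
WHERE status = 'ERROR' OR consecutive_errors > 0
ORDER BY consecutive_errors DESC;
```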
pgtrickle.quick_health
Single-row health summary for dashboards and alerting. Returns the overall health status of the pg_trickle extension at a glance.
SELECT * FROM pgtrickle.quick_health;
| Column | Type | Description |
|---|---|---|
total_stream_tables | bigint | Total number of stream tables |
error_tables | bigint | Stream tables with status = 'ERROR' or consecutive_errors > 0 |
stale_tables | bigint | Stream tables whose data is older than their schedule interval |
scheduler_running | boolean | Whether a pg_trickle scheduler backend is detected in pg_stat_activity |
status | text | Overall status: EMPTY, OK, WARNING, or CRITICAL |
Status values:
- EMPTY — No stream tables exist.
- OK — All stream tables are healthy and up-to-date.
- WARNING — Some tables have errors or are stale.
- CRITICAL — At least one stream table is SUSPENDED.
pgtrickle.pgt_cdc_status
Convenience view for inspecting the CDC mode and WAL slot state of every TABLE-type source for all stream tables. Useful for monitoring in-progress TRIGGER→WAL transitions.
SELECT * FROM pgtrickle.pgt_cdc_status;
| Column | Type | Description |
|---|---|---|
pgt_schema | text | Schema of the stream table |
pgt_name | text | Name of the stream table |
source_relid | oid | OID of the source table |
source_name | text | Name of the source table |
source_schema | text | Schema of the source table |
cdc_mode | text | Current CDC mode: trigger, transitioning, or wal |
slot_name | text | Replication slot name (NULL for trigger mode) |
decoder_confirmed_lsn | pg_lsn | Last WAL position decoded (NULL for trigger mode) |
transition_started_at | timestamptz | When the trigger→WAL transition began (NULL if not transitioning) |
Subscribe to the pgtrickle_cdc_transition NOTIFY channel to receive real-time
events when a source moves between CDC modes (payload is a JSON object with
source_oid, from, and to fields).
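For example (the payload values shown are illustrative):

```sql
LISTEN pgtrickle_cdc_transition;
-- Example notification payload:
--   {"source_oid": 16420, "from": "trigger", "to": "wal"}
```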
Catalog Tables
pgtrickle.pgt_stream_tables
Core metadata for each stream table.
| Column | Type | Description |
|---|---|---|
pgt_id | bigserial | Primary key |
pgt_relid | oid | OID of the storage table |
pgt_name | text | Table name |
pgt_schema | text | Schema name |
defining_query | text | The SQL query that defines the ST |
original_query | text | The user-supplied query before normalization |
schedule | text | Refresh schedule (duration or cron expression) |
refresh_mode | text | FULL, DIFFERENTIAL, or IMMEDIATE |
status | text | INITIALIZING, ACTIVE, SUSPENDED, ERROR |
is_populated | bool | Whether the table has been populated |
data_timestamp | timestamptz | Timestamp of the data in the ST |
frontier | jsonb | Per-source LSN positions (version tracking) |
last_refresh_at | timestamptz | When last refreshed |
consecutive_errors | int | Current error streak count |
needs_reinit | bool | Whether upstream DDL requires reinitialization |
auto_threshold | double precision | Per-ST adaptive fallback threshold (overrides GUC) |
last_full_ms | double precision | Last FULL refresh duration in milliseconds |
functions_used | text[] | Function names used in the defining query (for DDL tracking) |
topk_limit | int | LIMIT value for TopK stream tables (NULL if not TopK) |
topk_order_by | text | ORDER BY clause SQL for TopK stream tables |
topk_offset | int | OFFSET value for paged TopK queries (NULL if not paged) |
diamond_consistency | text | Diamond consistency mode: none or atomic |
diamond_schedule_policy | text | Diamond schedule policy: fastest or slowest |
has_keyless_source | bool | Whether any source table lacks a PRIMARY KEY (EC-06) |
function_hashes | text | MD5 hashes of referenced function bodies for change detection (EC-16) |
scc_id | int | SCC group identifier for circular dependencies (NULL if not in a cycle) |
last_fixpoint_iterations | int | Number of iterations in the last SCC fixpoint convergence (NULL if never iterated) |
created_at | timestamptz | Creation timestamp |
updated_at | timestamptz | Last modification timestamp |
pgtrickle.pgt_dependencies
DAG edges — records which source tables each ST depends on, including CDC mode metadata.
| Column | Type | Description |
|---|---|---|
pgt_id | bigint | FK to pgt_stream_tables |
source_relid | oid | OID of the source table |
source_type | text | TABLE, STREAM_TABLE, VIEW, MATVIEW, or FOREIGN_TABLE |
columns_used | text[] | Which columns are referenced |
column_snapshot | jsonb | Snapshot of source column metadata at creation time |
schema_fingerprint | text | SHA-256 fingerprint of column snapshot for fast equality checks |
cdc_mode | text | Current CDC mode: TRIGGER, TRANSITIONING, or WAL |
slot_name | text | Replication slot name (WAL/TRANSITIONING modes) |
decoder_confirmed_lsn | pg_lsn | WAL decoder's last confirmed position |
transition_started_at | timestamptz | When the trigger→WAL transition started |
pgtrickle.pgt_refresh_history
Audit log of all refresh operations.
| Column | Type | Description |
|---|---|---|
refresh_id | bigserial | Primary key |
pgt_id | bigint | FK to pgt_stream_tables |
data_timestamp | timestamptz | Data timestamp of the refresh |
start_time | timestamptz | When the refresh started |
end_time | timestamptz | When it completed |
action | text | NO_DATA, FULL, DIFFERENTIAL, REINITIALIZE, SKIP |
rows_inserted | bigint | Rows inserted |
rows_deleted | bigint | Rows deleted |
delta_row_count | bigint | Number of delta rows processed from change buffers |
merge_strategy_used | text | Which merge strategy was used (e.g. MERGE, DELETE+INSERT) |
was_full_fallback | bool | Whether the refresh fell back to FULL from DIFFERENTIAL |
error_message | text | Error message if failed |
status | text | RUNNING, COMPLETED, FAILED, SKIPPED |
initiated_by | text | What triggered: SCHEDULER, MANUAL, or INITIAL |
freshness_deadline | timestamptz | SLA deadline (duration schedules only; NULL for cron) |
fixpoint_iteration | int | Iteration of the fixed-point loop (NULL for non-cyclic refreshes) |
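A typical audit query:

```sql
-- Most recent failed refreshes with their error messages:
SELECT pgt_id, start_time, action, was_full_fallback, error_message
FROM pgtrickle.pgt_refresh_history
WHERE status = 'FAILED'
ORDER BY start_time DESC
LIMIT 20;
```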
pgtrickle.pgt_change_tracking
CDC slot tracking per source table.
| Column | Type | Description |
|---|---|---|
source_relid | oid | OID of the tracked source table |
slot_name | text | Logical replication slot name |
last_consumed_lsn | pg_lsn | Last consumed WAL position |
tracked_by_pgt_ids | bigint[] | Array of ST IDs depending on this source |
pgtrickle.pgt_source_gates
Bootstrap source gate registry. One row per source table that has ever been
gated. Only sources with gated = true are actively blocking scheduler
refreshes.
| Column | Type | Description |
|---|---|---|
source_relid | oid | OID of the gated source table (PK) |
gated | boolean | true while the source is gated; false after ungate_source() |
gated_at | timestamptz | When the gate was most recently set |
ungated_at | timestamptz | When the gate was cleared (NULL if still active) |
gated_by | text | Actor that set the gate (e.g. 'gate_source') |
pgtrickle.pgt_refresh_groups
User-declared Cross-Source Snapshot Consistency groups (v0.9.0). A refresh
group guarantees that all member stream tables are refreshed against a snapshot
taken at the same point in time, preventing partial-update visibility (e.g.
orders and order_lines both reflecting the same transaction boundary).
| Column | Type | Description |
|---|---|---|
group_id | serial | Primary key |
group_name | text | Unique human-readable group name |
member_oids | oid[] | OIDs of the stream table storage relations that participate in this group |
isolation | text | Snapshot isolation level for the group: 'read_committed' (default) or 'repeatable_read' |
created_at | timestamptz | When the group was created |
Management API
-- Create a refresh group
SELECT pgtrickle.create_refresh_group(
'orders_snapshot',
ARRAY['public.orders_summary', 'public.order_lines_summary'],
'repeatable_read' -- or 'read_committed' (default)
);
-- List all groups:
SELECT * FROM pgtrickle.refresh_groups();
-- Remove a group:
SELECT pgtrickle.drop_refresh_group('orders_snapshot');
Validation rules:
- At least 2 member stream tables are required.
- All members must exist in pgt_stream_tables.
- No member can appear in more than one refresh group.
- Valid isolation levels: 'read_committed' (default), 'repeatable_read'.
Bootstrap Source Gating (v0.5.0)
These functions let operators pause and resume scheduler-driven refreshes for individual source tables — useful during large bulk loads or ETL windows.
pgtrickle.gate_source(source TEXT)
Mark a source table as gated. The scheduler will skip any stream table that
reads from this source until ungate_source() is called.
SELECT pgtrickle.gate_source('my_schema.big_source');
Manual refresh_stream_table() calls are not affected by gates.
pgtrickle.ungate_source(source TEXT)
Clear a gate set by gate_source(). After this call the scheduler resumes
normal refresh scheduling for dependent stream tables.
SELECT pgtrickle.ungate_source('my_schema.big_source');
pgtrickle.source_gates()
Table function returning the current gate status for all registered sources.
SELECT * FROM pgtrickle.source_gates();
-- source_table | schema_name | gated | gated_at | ungated_at | gated_by
| Column | Type | Description |
|---|---|---|
source_table | text | Relation name |
schema_name | text | Schema name |
gated | boolean | Whether the source is currently gated |
gated_at | timestamptz | When the gate was set |
ungated_at | timestamptz | When the gate was cleared (NULL if active) |
gated_by | text | Which function set the gate |
Typical workflow
-- 1. Gate the source before a bulk load.
SELECT pgtrickle.gate_source('orders');
-- 2. Load historical data (scheduler sits idle for orders-based STs).
COPY orders FROM '/data/historical_orders.csv';
-- 3. Ungate — the next scheduler tick refreshes everything cleanly.
SELECT pgtrickle.ungate_source('orders');
pgtrickle.bootstrap_gate_status() (v0.6.0)
Rich introspection of bootstrap gate lifecycle. Returns the same columns as
source_gates() plus computed fields for debugging.
SELECT * FROM pgtrickle.bootstrap_gate_status();
-- source_table | schema_name | gated | gated_at | ungated_at | gated_by | gate_duration | affected_stream_tables
| Column | Type | Description |
|---|---|---|
source_table | text | Relation name |
schema_name | text | Schema name |
gated | boolean | Whether the source is currently gated |
gated_at | timestamptz | When the gate was set (updated on re-gate) |
ungated_at | timestamptz | When the gate was cleared (NULL if active) |
gated_by | text | Which function set the gate |
gate_duration | interval | How long the gate has been active (gated: now() - gated_at; ungated: ungated_at - gated_at) |
affected_stream_tables | text | Comma-separated list of stream tables whose scheduler refreshes are blocked by this gate |
Rows are sorted with currently-gated sources first, then alphabetically.
ETL Coordination Cookbook (v0.6.0)
Step-by-step recipes for common bulk-load patterns using source gating.
Recipe 1 — Single Source Bulk Load
Gate one source table during a large data import. The scheduler pauses refreshes for all stream tables that depend on this source.
-- 1. Gate the source before loading.
SELECT pgtrickle.gate_source('orders');
-- 2. Load the data. The scheduler sits idle for orders-dependent STs.
COPY orders FROM '/data/orders_2026.csv' WITH (FORMAT csv, HEADER);
-- 3. Ungate. On the next tick the scheduler refreshes everything cleanly.
SELECT pgtrickle.ungate_source('orders');
Recipe 2 — Coordinated Multi-Source Load
When multiple sources feed into a shared downstream stream table, gate them all before loading so no intermediate refreshes occur.
-- 1. Gate all sources that will be loaded.
SELECT pgtrickle.gate_source('orders');
SELECT pgtrickle.gate_source('order_lines');
-- 2. Load each source (can be parallel, any order).
COPY orders FROM '/data/orders.csv' WITH (FORMAT csv, HEADER);
COPY order_lines FROM '/data/lines.csv' WITH (FORMAT csv, HEADER);
-- 3. Ungate all sources. The scheduler refreshes downstream STs once.
SELECT pgtrickle.ungate_source('orders');
SELECT pgtrickle.ungate_source('order_lines');
Recipe 3 — Gate + Deferred Initialization
Combine gating with initialize => false to prevent incomplete initial
population when sources are loaded asynchronously.
-- 1. Gate sources before creating any stream tables.
SELECT pgtrickle.gate_source('orders');
SELECT pgtrickle.gate_source('order_lines');
-- 2. Create stream tables without initial population.
SELECT pgtrickle.create_stream_table(
'order_summary',
'SELECT region, SUM(amount) FROM orders GROUP BY region',
'1m', initialize => false
);
SELECT pgtrickle.create_stream_table(
'order_report',
'SELECT s.region, s.total, l.line_count
FROM order_summary s
JOIN (SELECT region, COUNT(*) AS line_count FROM order_lines GROUP BY region) l
USING (region)',
'1m', initialize => false
);
-- 3. Run ETL processes (can be in separate transactions).
BEGIN;
COPY orders FROM 's3://warehouse/orders.parquet';
SELECT pgtrickle.ungate_source('orders');
COMMIT;
BEGIN;
COPY order_lines FROM 's3://warehouse/lines.parquet';
SELECT pgtrickle.ungate_source('order_lines');
COMMIT;
-- 4. Once all sources are ungated, the scheduler initializes and refreshes
-- all stream tables in dependency order.
Recipe 4 — Nightly Batch Pattern
For scheduled ETL that runs overnight, gate sources before the batch starts and ungate after the batch completes.
-- Nightly ETL script:
-- Gate all sources that will be refreshed.
SELECT pgtrickle.gate_source('sales');
SELECT pgtrickle.gate_source('inventory');
-- Truncate and reload (or use COPY, INSERT...SELECT, etc.).
TRUNCATE sales;
COPY sales FROM '/data/nightly/sales.csv' WITH (FORMAT csv, HEADER);
TRUNCATE inventory;
COPY inventory FROM '/data/nightly/inventory.csv' WITH (FORMAT csv, HEADER);
-- All data loaded — ungate and let the scheduler handle the rest.
SELECT pgtrickle.ungate_source('sales');
SELECT pgtrickle.ungate_source('inventory');
-- Verify: check the gate status to confirm everything is ungated.
SELECT * FROM pgtrickle.bootstrap_gate_status();
Recipe 5 — Monitoring During a Gated Load
Use bootstrap_gate_status() to monitor progress when streams appear stalled.
-- Check which sources are currently gated and how long they've been paused.
SELECT source_table, gate_duration, affected_stream_tables
FROM pgtrickle.bootstrap_gate_status()
WHERE gated = true;
-- If a gate has been active too long (e.g. ETL failed), ungate manually.
SELECT pgtrickle.ungate_source('stale_source');
Watermark Gating (v0.7.0)
Watermark gating is a scheduling control for ETL pipelines where multiple source tables are populated by separate jobs that finish at different times. Each ETL job declares "I'm done up to timestamp X", and the scheduler waits until all sources in a group are caught up within a configurable tolerance before refreshing downstream stream tables.
Catalog Tables
pgtrickle.pgt_watermarks
Per-source watermark state. One row per source table that has had a watermark advanced.
| Column | Type | Description |
|---|---|---|
source_relid | oid | Source table OID (primary key) |
watermark | timestamptz | Current watermark value |
updated_at | timestamptz | When the watermark was last advanced |
advanced_by | text | User/role that advanced the watermark |
wal_lsn_at_advance | text | WAL LSN at the time of advancement |
pgtrickle.pgt_watermark_groups
Watermark group definitions. Each group declares that a set of sources must be temporally aligned.
| Column | Type | Description |
|---|---|---|
group_id | serial | Auto-generated group ID (primary key) |
group_name | text | Unique group name |
source_relids | oid[] | Array of source table OIDs in the group |
tolerance_secs | float8 | Maximum allowed lag in seconds (default 0) |
created_at | timestamptz | When the group was created |
pgtrickle.pgt_template_cache
Added in v0.16.0. Cross-backend delta SQL template cache (UNLOGGED). Stores compiled delta query templates so new backends skip the ~45 ms DVM parse+differentiate step. Managed automatically — no user interaction required.
| Column | Type | Description |
|---|---|---|
pgt_id | bigint | Stream table ID (PK, FK → pgt_stream_tables) |
query_hash | bigint | Hash of the defining query (staleness detection) |
delta_sql | text | Delta SQL template with LSN placeholder tokens |
columns | text[] | Output column names |
source_oids | integer[] | Source table OIDs |
is_dedup | boolean | Whether the delta is deduplicated per row ID |
key_changed | boolean | Whether __pgt_key_changed column is present |
all_algebraic | boolean | Whether all aggregates are algebraically invertible |
cached_at | timestamptz | When the entry was last populated |
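Although the cache is managed automatically, it can be inspected directly, for example to confirm that templates are being reused across backends (a read-only sketch using the columns above):

```sql
-- One row per stream table with a compiled delta template.
SELECT pgt_id, is_dedup, all_algebraic, cached_at
FROM pgtrickle.pgt_template_cache;
```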
Functions
pgtrickle.advance_watermark(source TEXT, watermark TIMESTAMPTZ)
Signal that a source table's data is complete through the given timestamp.
- Monotonic: rejects watermarks that go backward (raises error).
- Idempotent: advancing to the same value is a silent no-op.
- Transactional: the watermark is part of the caller's transaction.
SELECT pgtrickle.advance_watermark('orders', '2026-03-01 12:05:00+00');
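The monotonic and idempotent properties can be illustrated directly (a sketch; the exact error text is not specified here):

```sql
SELECT pgtrickle.advance_watermark('orders', '2026-03-01 12:05:00+00');
-- Same value again: silent no-op (idempotent).
SELECT pgtrickle.advance_watermark('orders', '2026-03-01 12:05:00+00');
-- Earlier value: raises an error (monotonic).
SELECT pgtrickle.advance_watermark('orders', '2026-03-01 12:00:00+00');
```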
pgtrickle.create_watermark_group(group_name TEXT, sources TEXT[], tolerance_secs FLOAT8 DEFAULT 0)
Create a watermark group. Requires at least 2 sources.
tolerance_secs: maximum allowed lag between the most-advanced and least-advanced watermarks. Default 0 means strict alignment.
SELECT pgtrickle.create_watermark_group(
'order_pipeline',
ARRAY['orders', 'order_lines'],
0 -- strict alignment (default)
);
pgtrickle.drop_watermark_group(group_name TEXT)
Remove a watermark group by name.
SELECT pgtrickle.drop_watermark_group('order_pipeline');
pgtrickle.watermarks()
Return the current watermark state for all registered sources.
SELECT * FROM pgtrickle.watermarks();
| Column | Type | Description |
|---|---|---|
source_table | text | Source table name |
schema_name | text | Schema name |
watermark | timestamptz | Current watermark value |
updated_at | timestamptz | Last advancement time |
advanced_by | text | User that advanced it |
wal_lsn | text | WAL LSN at advancement |
pgtrickle.watermark_groups()
Return all watermark group definitions.
SELECT * FROM pgtrickle.watermark_groups();
pgtrickle.watermark_status()
Return live alignment status for each watermark group.
SELECT * FROM pgtrickle.watermark_status();
| Column | Type | Description |
|---|---|---|
group_name | text | Group name |
min_watermark | timestamptz | Least-advanced watermark |
max_watermark | timestamptz | Most-advanced watermark |
lag_secs | float8 | Lag in seconds between max and min |
aligned | boolean | Whether lag is within tolerance |
sources_with_watermark | int4 | Number of sources that have a watermark |
sources_total | int4 | Total sources in the group |
Recipes
Recipe 6 — Nightly ETL with Watermarks
-- Create a watermark group for the order pipeline.
SELECT pgtrickle.create_watermark_group(
'order_pipeline',
ARRAY['orders', 'order_lines']
);
-- Nightly ETL job 1: Load orders
BEGIN;
COPY orders FROM '/data/orders_20260301.csv';
SELECT pgtrickle.advance_watermark('orders', '2026-03-01');
COMMIT;
-- Nightly ETL job 2: Load order lines (may run later)
BEGIN;
COPY order_lines FROM '/data/lines_20260301.csv';
SELECT pgtrickle.advance_watermark('order_lines', '2026-03-01');
COMMIT;
-- order_report refreshes on the next tick after both watermarks align.
Recipe 7 — Micro-Batch Tolerance
-- Allow up to 30 seconds of skew between trades and quotes.
SELECT pgtrickle.create_watermark_group(
'realtime_pipeline',
ARRAY['trades', 'quotes'],
30 -- 30-second tolerance
);
-- External process advances watermarks every few seconds.
SELECT pgtrickle.advance_watermark('trades', '2026-03-01 12:00:05+00');
SELECT pgtrickle.advance_watermark('quotes', '2026-03-01 12:00:02+00');
-- Lag is 3s, within 30s tolerance → stream tables refresh normally.
Recipe 8 — Monitoring Watermark Alignment
-- Check which groups are currently misaligned.
SELECT group_name, lag_secs, aligned
FROM pgtrickle.watermark_status()
WHERE NOT aligned;
-- Check individual source watermarks.
SELECT source_table, watermark, updated_at
FROM pgtrickle.watermarks()
ORDER BY watermark;
Stuck Watermark Detection (WM-7, v0.15.0)
When pg_trickle.watermark_holdback_timeout is set to a positive value
(seconds), the scheduler periodically checks all watermark sources. If any
source in a watermark group has not been advanced within the timeout,
downstream stream tables in that group are paused (refresh is skipped)
and a pgtrickle_alert NOTIFY is emitted.
This protects against silent data staleness when an ETL pipeline breaks and stops advancing watermarks. Without this guard, stream tables would continue refreshing with stale external data.
Behavior:
- Stuck detection: Every ~60 seconds, the scheduler checks updated_at for all watermark sources. If now() - updated_at > watermark_holdback_timeout, the source is stuck.
- Pause: Any stream table whose source set overlaps a group containing a stuck source is skipped. A SKIP record with "stuck" in the reason is logged to pgt_refresh_history.
- Alert: A pgtrickle_alert NOTIFY with event watermark_stuck is emitted (once per newly-stuck source, not repeated every check cycle).
- Auto-resume: When the stuck watermark is advanced via advance_watermark(), the next scheduler check detects the advancement, lifts the pause, and emits a watermark_resumed event.
Recipe 9 — Stuck Watermark Protection
-- Enable stuck-watermark detection with a 10-minute timeout.
ALTER SYSTEM SET pg_trickle.watermark_holdback_timeout = 600;
SELECT pg_reload_conf();
-- Listen for alerts in a monitoring process.
LISTEN pgtrickle_alert;
-- When the ETL pipeline breaks and stops calling advance_watermark(),
-- the scheduler will start skipping downstream STs after 10 minutes.
-- You'll receive a NOTIFY payload like:
-- {"event":"watermark_stuck","group":"order_pipeline","source_oid":16385,"age_secs":620}
-- When the ETL pipeline recovers and advances the watermark:
SELECT pgtrickle.advance_watermark('orders', '2026-03-02 00:00:00+00');
-- The scheduler automatically resumes, and you'll receive:
-- {"event":"watermark_resumed","source_oid":16385}
Developer Diagnostics (v0.12.0)
Four SQL-callable introspection functions that surface internal DVM state without side-effects. All functions are read-only — they never modify catalog tables or trigger refreshes.
pgtrickle.explain_query_rewrite(query TEXT)
Walk a query through the full DVM rewrite pipeline and report each pass.
Returns one row per rewrite pass. When a pass changes the query, changed = true
and sql_after contains the SQL after the transformation. Two synthetic rows
are appended: topk_detection (detects ORDER BY … LIMIT) and dvm_patterns
(lists detected DVM constructs such as aggregation strategy, join types, and
volatility).
SELECT pass_name, changed, sql_after
FROM pgtrickle.explain_query_rewrite(
'SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id'
);
Return columns:
| Column | Type | Description |
|---|---|---|
pass_name | text | Rewrite pass name (e.g. view_inlining, distinct_on, grouping_sets) |
changed | bool | Whether this pass modified the query |
sql_after | text | SQL text after this pass (NULL if unchanged) |
Rewrite passes (in order):
| Pass | Description |
|---|---|
view_inlining | Expand view references to their defining SQL |
nested_window_lift | Lift window functions out of expressions (e.g. CASE WHEN ROW_NUMBER() OVER (...) ...) |
distinct_on | Rewrite DISTINCT ON to a ROW_NUMBER() window |
grouping_sets | Expand GROUPING SETS / CUBE / ROLLUP to UNION ALL of GROUP BY |
scalar_subquery_in_where | Rewrite scalar subqueries in WHERE to CROSS JOIN |
correlated_scalar_in_select | Rewrite correlated scalar subqueries in SELECT to LEFT JOIN |
sublinks_in_or_demorgan | Apply De Morgan normalization and expand SubLinks inside OR |
rows_from | Rewrite ROWS FROM() multi-function expressions |
topk_detection | Detect ORDER BY … LIMIT n TopK pattern |
dvm_patterns | Detected DVM constructs: join types, aggregate strategies, volatility |
pgtrickle.diagnose_errors(name TEXT)
Return the last 5 FAILED refresh events for a stream table, with each error classified by type and supplied with a remediation hint.
SELECT event_time, error_type, error_message, remediation
FROM pgtrickle.diagnose_errors('my_stream_table');
Return columns:
| Column | Type | Description |
|---|---|---|
event_time | timestamptz | When the failed refresh started |
error_type | text | Classification: user, schema, correctness, performance, infrastructure |
error_message | text | Raw error text from pgt_refresh_history |
remediation | text | Suggested next step |
Error types:
| Type | Trigger patterns | Typical action |
|---|---|---|
user | query parse error, unsupported operator, type mismatch | Check query; run validate_query() |
schema | upstream table schema changed, upstream table dropped | Reinitialize; check pgt_dependencies |
correctness | phantom, EXCEPT ALL, row count mismatch | Switch to refresh_mode='FULL'; report bug |
performance | lock timeout, deadlock, serialization failure, spill | Tune lock_timeout; enable buffer_partitioning |
infrastructure | permission denied, SPI error, replication slot | Check role grants; verify slot config |
pgtrickle.list_auxiliary_columns(name TEXT)
List all __pgt_* internal columns on a stream table's storage relation,
with an explanation of each column's role.
These columns are normally hidden from SELECT * output. This function
surfaces them for debugging and operator visibility.
SELECT column_name, data_type, purpose
FROM pgtrickle.list_auxiliary_columns('my_stream_table');
Return columns:
| Column | Type | Description |
|---|---|---|
column_name | text | Internal column name (e.g. __pgt_row_id) |
data_type | text | PostgreSQL type (e.g. bigint, text) |
purpose | text | Human-readable description of the column's role |
Common auxiliary columns:
| Column | Purpose |
|---|---|
__pgt_row_id | Row identity hash — MERGE join key for delta application |
__pgt_count | Multiplicity counter for DISTINCT / aggregation / UNION dedup |
__pgt_count_l | Left-side multiplicity for INTERSECT / EXCEPT |
__pgt_count_r | Right-side multiplicity for INTERSECT / EXCEPT |
__pgt_aux_sum_<col> | Running SUM for algebraic AVG maintenance |
__pgt_aux_count_<col> | Running COUNT for algebraic AVG maintenance |
__pgt_aux_sum2_<col> | Sum-of-squares for STDDEV / VAR maintenance |
__pgt_aux_sum{x,y,xy,x2,y2}_<col> | Five-column set for CORR / COVAR / REGR_* |
__pgt_aux_nonnull_<col> | Non-null count for SUM-above-FULL-JOIN maintenance |
pgtrickle.validate_query(query TEXT)
Parse and validate a query through the DVM pipeline without creating a stream table. Returns detected SQL constructs, warnings, and the resolved refresh mode.
SELECT check_name, result, severity
FROM pgtrickle.validate_query(
'SELECT customer_id, COUNT(*) FROM orders GROUP BY customer_id'
);
Return columns:
| Column | Type | Description |
|---|---|---|
check_name | text | Name of the check or detected construct |
result | text | Resolved value or construct description |
severity | text | INFO, WARNING, or ERROR |
The first row always has check_name = 'resolved_refresh_mode' with the mode
that would be assigned under refresh_mode = 'AUTO': DIFFERENTIAL, FULL,
or TOPK.
Common check names:
| Check | Description |
|---|---|
resolved_refresh_mode | DIFFERENTIAL, FULL, or TOPK |
topk_pattern | Detected LIMIT + ORDER BY values |
unsupported_construct | Feature not supported for DIFFERENTIAL mode (→ WARNING) |
matview_or_foreign_table | Query references matview/foreign table (→ WARNING, FULL) |
ivm_support_check | DVM parse result (→ WARNING if DIFFERENTIAL not possible) |
aggregate | Aggregate with strategy: ALGEBRAIC_INVERTIBLE, ALGEBRAIC_VIA_AUX, SEMI_ALGEBRAIC, or GROUP_RESCAN |
join | Detected join type: INNER, LEFT_OUTER, FULL_OUTER, SEMI, ANTI |
set_op | Set operation: DISTINCT, UNION_ALL, INTERSECT, EXCEPT, EXCEPT_ALL |
window_function | Query contains window functions |
scalar_subquery | Query contains scalar subqueries |
lateral | Query contains LATERAL functions or subqueries |
recursive_cte | Query uses WITH RECURSIVE |
volatility | Worst-case volatility of functions used: immutable, stable, volatile |
needs_pgt_count | Multiplicity counter column will be added |
needs_dual_count | Left/right multiplicity counters will be added |
parse_warning | Advisory warning from the DVM parse phase |
Example output for a GROUP_RESCAN query:
SELECT check_name, result, severity
FROM pgtrickle.validate_query(
'SELECT grp, STRING_AGG(tag, '','') FROM events GROUP BY grp'
);
| check_name | result | severity |
|---|---|---|
resolved_refresh_mode | DIFFERENTIAL | INFO |
aggregate | STRING_AGG(GROUP_RESCAN) | WARNING |
needs_pgt_count | true — multiplicity counter column required | INFO |
volatility | immutable | INFO |
Note on GROUP_RESCAN:
STRING_AGG, ARRAY_AGG, BOOL_AND, and other non-algebraic aggregates use a group-rescan strategy: any change in a group triggers full re-aggregation of that group from the source data. This is still DIFFERENTIAL (only changed groups are rescanned), but it has a higher per-group cost than algebraic strategies. If this is performance-sensitive, consider pre-aggregating with a simpler aggregate and post-processing.
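The pre-aggregation advice above can be sketched as a two-layer pattern (hypothetical tables): the cheap algebraic COUNT is maintained incrementally, and the expensive STRING_AGG runs only at query time over the compact result.

```sql
-- Maintain the cheap algebraic layer incrementally.
SELECT pgtrickle.create_stream_table(
    'tag_counts',
    'SELECT grp, tag, COUNT(*) AS n FROM events GROUP BY grp, tag',
    '1m'
);

-- Post-process at query time over the much smaller stream table.
SELECT grp, STRING_AGG(tag, ',' ORDER BY tag) AS tags
FROM tag_counts
GROUP BY grp;
```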
Delta SQL Profiling (v0.13.0)
pgtrickle.explain_delta(st_name text, format text DEFAULT 'text')
Generate the delta SQL query plan for a stream table without executing a refresh.
explain_delta produces the differential delta SQL that would be used on the
next DIFFERENTIAL refresh, then runs EXPLAIN (ANALYZE false, FORMAT <format>)
on it and returns the plan lines. This function is useful for:
- Identifying slow joins or missing indexes in auto-generated delta SQL.
- Comparing plan complexity between different query forms.
- Monitoring how the size of change buffers affects plan shape.
The delta SQL is generated against a hypothetical "scan all changes" window
(LSN 0/0 → FF/FFFFFFFF) so the plan shows the full join/filter structure
even when the change buffer is currently empty.
Parameters:
| Name | Type | Description |
|---|---|---|
st_name | text | Qualified stream table name (e.g. 'public.orders_summary'). |
format | text | Plan format: 'text' (default), 'json', 'xml', or 'yaml'. |
Returns: SETOF text — one row per plan line (text format) or one row containing the full JSON/XML/YAML plan.
Example:
-- Show the text plan for the delta query
SELECT line FROM pgtrickle.explain_delta('public.orders_summary');
-- Get the JSON plan for programmatic analysis
SELECT line FROM pgtrickle.explain_delta('public.orders_summary', 'json');
Environment variable (PGS_PROFILE_DELTA=1): When the environment variable
PGS_PROFILE_DELTA=1 is set in the PostgreSQL server process, every
DIFFERENTIAL refresh automatically captures EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
for the resolved delta SQL and writes the plan to
/tmp/delta_plans/<schema>_<table>.json. This is intended for E2E test
diagnostics and local profiling sessions.
pgtrickle.dedup_stats()
Show MERGE deduplication profiling counters accumulated since server start.
When the delta cannot be guaranteed to contain at most one row per
__pgt_row_id (e.g. for aggregate queries or keyless sources), the MERGE
must group and aggregate the delta before merging. This is tracked as
dedup needed. A consistently high ratio indicates that pre-MERGE compaction
in the change buffer would reduce refresh latency.
Returns: one row with:
| Column | Type | Description |
|---|---|---|
total_diff_refreshes | bigint | Total DIFFERENTIAL refreshes executed since server start that processed at least one change. Resets on server restart. |
dedup_needed | bigint | Number of those refreshes where the delta required weight aggregation / deduplication in the MERGE USING clause. |
dedup_ratio_pct | float8 | dedup_needed / total_diff_refreshes × 100. 0 when total_diff_refreshes = 0. |
Example:
SELECT * FROM pgtrickle.dedup_stats();
-- total_diff_refreshes | dedup_needed | dedup_ratio_pct
-- ----------------------+--------------+-----------------
-- 1234 | 87 | 7.05
A dedup_ratio_pct ≥ 10 is the threshold recommended for investigating a
two-pass MERGE strategy. See plans/performance/REPORT_OVERALL_STATUS.md §14
for background.
pgtrickle.shared_buffer_stats()
Added in v0.13.0
D-4 observability function. Returns one row per shared change buffer (one per tracked source table), showing how many stream tables share the buffer, which columns are tracked, the safe cleanup frontier, and the current buffer size.
Return columns:
| Column | Type | Description |
|---|---|---|
source_oid | bigint | PostgreSQL OID of the source table |
source_table | text | Fully qualified source table name |
consumer_count | integer | Number of stream tables sharing this buffer |
consumers | text | Comma-separated list of consumer stream table names |
columns_tracked | integer | Number of new_* columns in the buffer (column superset) |
safe_frontier_lsn | text | MIN(frontier LSN) across all consumers — rows at or below this are safe to clean up |
buffer_rows | bigint | Current number of rows in the change buffer |
is_partitioned | boolean | Whether the buffer uses LSN-range partitioning |
Example:
SELECT * FROM pgtrickle.shared_buffer_stats();
-- source_oid | source_table | consumer_count | consumers | columns_tracked | safe_frontier_lsn | buffer_rows | is_partitioned
-- -----------+--------------------+----------------+------------------------------------+-----------------+-------------------+-------------+----------------
-- 16456 | public.orders | 3 | public.orders_by_region, public... | 5 | 0/1A2B3C4D | 142 | f
UNLOGGED Change Buffers (v0.14.0)
pgtrickle.convert_buffers_to_unlogged()
Converts all existing logged change buffer tables to UNLOGGED. This
eliminates WAL writes for trigger-inserted CDC rows, reducing WAL
amplification by ~30%.
Returns: bigint — the number of buffer tables converted.
SELECT pgtrickle.convert_buffers_to_unlogged();
-- convert_buffers_to_unlogged
-- ----------------------------
-- 5
Warning: Each conversion acquires an ACCESS EXCLUSIVE lock on the buffer table. Run this function during a low-traffic maintenance window to minimize lock contention.
After conversion: Buffer contents will be lost on crash recovery. The scheduler automatically detects this and enqueues a FULL refresh for affected stream tables. See pg_trickle.unlogged_buffers for the full trade-off discussion.
Refresh Mode Diagnostics (v0.14.0)
pgtrickle.recommend_refresh_mode(st_name TEXT DEFAULT NULL)
Analyze stream table workload characteristics and recommend the optimal
refresh mode (FULL vs DIFFERENTIAL). When st_name is NULL, returns one
row per stream table. When provided, returns a single row for the named
stream table.
The function evaluates seven weighted signals (current and historical-average change ratio, empirical timing, query complexity, target size, index coverage, and latency variance) and computes a composite score. Scores above +0.15 recommend DIFFERENTIAL; below −0.15 recommend FULL; in between, the function recommends KEEP (the current mode is near-optimal).
Parameters:
| Name | Type | Default | Description |
|---|---|---|---|
st_name | text | NULL | Optional stream table name. NULL = all stream tables. |
Return columns:
| Column | Type | Description |
|---|---|---|
pgt_schema | text | Stream table schema |
pgt_name | text | Stream table name |
current_mode | text | Currently configured refresh mode |
effective_mode | text | Mode actually used in the last refresh |
recommended_mode | text | DIFFERENTIAL, FULL, or KEEP |
confidence | text | high, medium, or low |
reason | text | Human-readable explanation of the recommendation |
signals | jsonb | Detailed signal breakdown with scores and weights |
Example:
-- Check all stream tables
SELECT pgt_name, current_mode, recommended_mode, confidence, reason
FROM pgtrickle.recommend_refresh_mode();
-- Check a specific stream table
SELECT recommended_mode, confidence, reason, signals
FROM pgtrickle.recommend_refresh_mode('public.orders_summary');
Signal weights:
| Signal | Base Weight | Description |
|---|---|---|
change_ratio_current | 0.25 | Current pending changes / source rows |
change_ratio_avg | 0.30 | Historical average change ratio |
empirical_timing | 0.35 | Observed DIFF vs FULL speed ratio |
query_complexity | 0.10 | JOIN/aggregate/window count |
target_size | 0.10 | Target relation + index size |
index_coverage | 0.05 | Whether __pgt_row_id index exists |
latency_variance | 0.05 | DIFF latency p95/p50 ratio |
pgtrickle.refresh_efficiency()
Per-table refresh efficiency metrics. Returns operational statistics for every stream table — useful for monitoring dashboards and Grafana alerts.
Return columns:
| Column | Type | Description |
|---|---|---|
pgt_schema | text | Stream table schema |
pgt_name | text | Stream table name |
refresh_mode | text | Current refresh mode |
total_refreshes | bigint | Total completed refresh count |
diff_count | bigint | DIFFERENTIAL refresh count |
full_count | bigint | FULL refresh count |
avg_diff_ms | float8 | Average DIFFERENTIAL duration (ms) |
avg_full_ms | float8 | Average FULL duration (ms) |
avg_change_ratio | float8 | Average change ratio from history |
diff_speedup | text | Speedup factor (e.g. 12.3x) of FULL / DIFF timing |
last_refresh_at | text | Timestamp of last data refresh |
Example:
SELECT pgt_name, refresh_mode, diff_count, full_count,
avg_diff_ms, avg_full_ms, diff_speedup
FROM pgtrickle.refresh_efficiency()
ORDER BY total_refreshes DESC;
Export API (v0.14.0)
pgtrickle.export_definition(st_name TEXT)
Export a stream table's configuration as reproducible DDL. Returns a SQL
script containing DROP STREAM TABLE IF EXISTS followed by
SELECT pgtrickle.create_stream_table(...) with all configured options,
plus any ALTER STREAM TABLE calls for post-creation settings (tier,
fuse mode, etc.).
Parameters:
| Name | Type | Description |
|---|---|---|
st_name | text | Fully qualified or search-path-resolved stream table name. |
Returns: text — SQL script that recreates the stream table.
Example:
-- Export a single definition
SELECT pgtrickle.export_definition('public.orders_summary');
-- Export all definitions
SELECT pgtrickle.export_definition(pgt_schema || '.' || pgt_name)
FROM pgtrickle.pgt_stream_tables;
dbt Integration (v0.13.0)
The dbt-pgtrickle package exposes two new config(...) keys added in
v0.13.0: partition_by and the fuse circuit-breaker options. Use them directly
in any stream_table materialization model.
For full dbt documentation see dbt-pgtrickle/README.md.
partition_by config
Partition the stream table's underlying storage table using PostgreSQL
PARTITION BY RANGE. Only applied at creation time — changing it after the
stream table exists has no effect (use --full-refresh to recreate).
-- models/marts/events_by_day.sql
{{ config(
materialized='stream_table',
schedule='1m',
refresh_mode='DIFFERENTIAL',
partition_by='event_day'
) }}
SELECT
event_day,
user_id,
COUNT(*) AS event_count
FROM {{ source('raw', 'events') }}
GROUP BY event_day, user_id
pg_trickle creates a PARTITION BY RANGE (event_day) storage table with an
automatic default catch-all partition. Add named partitions via standard DDL:
CREATE TABLE analytics.events_by_day_2026
PARTITION OF analytics.events_by_day
FOR VALUES FROM ('2026-01-01') TO ('2027-01-01');
The partition_by value is stored in pgtrickle.pgt_stream_tables.st_partition_key
and visible via pgtrickle.stream_tables_info.
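To confirm the stored key after a dbt run, a quick catalog check (assumes the events_by_day model above):

```sql
SELECT pgt_name, st_partition_key
FROM pgtrickle.pgt_stream_tables
WHERE pgt_name = 'events_by_day';
```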
fuse config
The fuse circuit breaker suspends differential refreshes when the incoming
change volume exceeds a threshold, preventing runaway refresh cycles during
bulk ingestion. Fuse parameters are applied via alter_stream_table() on
every dbt run; they are a no-op if the values have not changed.
-- models/marts/order_totals.sql
{{ config(
materialized='stream_table',
schedule='5m',
refresh_mode='DIFFERENTIAL',
fuse='auto',
fuse_ceiling=50000,
fuse_sensitivity=3
) }}
SELECT customer_id, SUM(amount) AS total
FROM {{ source('raw', 'orders') }}
GROUP BY customer_id
| Config key | Type | Default | Description |
|---|---|---|---|
fuse | 'off'|'on'|'auto' | null (no-op) | Fuse mode. 'auto' activates only when FULL refresh would be cheaper than DIFFERENTIAL. |
fuse_ceiling | integer | null | Change-count threshold (number of changed rows) that triggers the fuse. null uses the global pg_trickle.fuse_default_ceiling GUC. |
fuse_sensitivity | integer | null | Number of consecutive over-ceiling observations required before the fuse blows. null means 1 (blow immediately). |
Monitor fuse state via pgtrickle.dedup_stats() or check
pgtrickle.pgt_stream_tables.fuse_state directly:
SELECT pgt_name, fuse_mode, fuse_state, fuse_ceiling, fuse_sensitivity
FROM pgtrickle.pgt_stream_tables
WHERE fuse_mode != 'off';
Dog Feeding — Self-Monitoring (v0.20.0)
pg_trickle can monitor itself using its own stream tables. Five dog-feeding stream tables maintain reactive analytics over the internal catalog, replacing repeated full-scan diagnostic queries with continuously-maintained incremental views.
Quick Start
-- Create all five dog-feeding stream tables (idempotent).
SELECT pgtrickle.setup_dog_feeding();
-- Check status.
SELECT * FROM pgtrickle.dog_feeding_status();
-- View threshold recommendations (after 10+ refresh cycles).
SELECT * FROM pgtrickle.df_threshold_advice
WHERE confidence IN ('HIGH', 'MEDIUM');
-- View anomalies.
SELECT * FROM pgtrickle.df_anomaly_signals
WHERE duration_anomaly IS NOT NULL;
-- Enable auto-apply (optional).
SET pg_trickle.dog_feeding_auto_apply = 'threshold_only';
-- Clean up.
SELECT pgtrickle.teardown_dog_feeding();
pgtrickle.setup_dog_feeding()
Creates all five dog-feeding stream tables. Idempotent — safe to call multiple
times. Emits a warm-up warning if pgt_refresh_history has fewer than 50 rows.
Stream tables created:
| Name | Schedule | Mode | Purpose |
|---|---|---|---|
| pgtrickle.df_efficiency_rolling | 48s | AUTO | Rolling-window refresh statistics |
| pgtrickle.df_anomaly_signals | 48s | AUTO | Duration spikes, error bursts, mode oscillation |
| pgtrickle.df_threshold_advice | 96s | AUTO | Multi-cycle threshold recommendations |
| pgtrickle.df_cdc_buffer_trends | 48s | AUTO | CDC buffer growth rates per source |
| pgtrickle.df_scheduling_interference | 96s | FULL | Concurrent refresh overlap detection |
pgtrickle.teardown_dog_feeding()
Drops all dog-feeding stream tables. Safe with partial setups — missing tables are silently skipped. User stream tables are never affected.
pgtrickle.dog_feeding_status()
Returns the status of all five expected dog-feeding stream tables:
| Column | Type | Description |
|---|---|---|
| st_name | text | Stream table name |
| exists | bool | Whether the ST exists |
| status | text | Current status (ACTIVE, SUSPENDED, etc.) |
| refresh_mode | text | Effective refresh mode |
| last_refresh_at | text | Last successful refresh timestamp |
| total_refreshes | bigint | Total completed refreshes |
pgtrickle.scheduler_overhead()
Returns scheduler efficiency metrics for the last hour:
| Column | Type | Description |
|---|---|---|
| total_refreshes_1h | bigint | Total refreshes in the last hour |
| df_refreshes_1h | bigint | Dog-feeding refreshes in the last hour |
| df_refresh_fraction | float | Fraction of refreshes that are dog-feeding |
| avg_refresh_ms | float | Average refresh duration (ms) |
| avg_df_refresh_ms | float | Average DF refresh duration (ms) |
| total_refresh_time_s | float | Total time spent refreshing (seconds) |
| df_refresh_time_s | float | Time spent on DF refreshes (seconds) |
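For example, to confirm that self-monitoring stays a small share of scheduler work, the documented columns can be combined directly (a usage sketch, not a prescribed query):

```sql
-- What share of the last hour's refresh activity is self-monitoring?
SELECT df_refresh_fraction,
       df_refresh_time_s / NULLIF(total_refresh_time_s, 0) AS df_time_share,
       avg_df_refresh_ms - avg_refresh_ms AS df_latency_delta_ms
FROM pgtrickle.scheduler_overhead();
```

A df_time_share well under the df_refresh_fraction indicates the dog-feeding refreshes are cheaper than average user refreshes.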
pgtrickle.explain_dag(format)
Returns the full refresh DAG as a Mermaid markdown (default) or Graphviz DOT string. Node colours: user STs = blue, dog-feeding STs = green, suspended = red, fused = orange.
-- Mermaid format (default).
SELECT pgtrickle.explain_dag();
-- Graphviz DOT format.
SELECT pgtrickle.explain_dag('dot');
Auto-Apply Policy
The pg_trickle.dog_feeding_auto_apply GUC controls whether analytics can
automatically adjust stream table configuration:
| Value | Behaviour |
|---|---|
| off (default) | Advisory only — no automatic changes |
| threshold_only | Apply threshold recommendations when confidence is HIGH and delta > 5% |
| full | Also apply scheduling hints from interference analysis |
Auto-apply is rate-limited to at most one threshold change per stream table
per 10 minutes. Changes are logged to pgt_refresh_history with
initiated_by = 'DOG_FEED'.
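Because every auto-applied change is recorded with the documented initiated_by marker, the history table doubles as an audit log (a usage sketch over the documented catalog):

```sql
-- List refresh-history rows recorded by the auto-apply policy
SELECT *
FROM pgtrickle.pgt_refresh_history
WHERE initiated_by = 'DOG_FEED';
```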
Confidence Levels and Sparse History
df_threshold_advice assigns a confidence level to each recommendation:
| Confidence | Criteria | What to expect |
|---|---|---|
| HIGH | ≥ 20 total refreshes, ≥ 5 DIFFERENTIAL, ≥ 2 FULL | Reliable recommendation — auto-apply will act on this |
| MEDIUM | ≥ 10 total refreshes | Directionally useful, but may lack enough FULL/DIFF mix |
| LOW | < 10 total refreshes | Insufficient data — recommendation equals the current threshold |
When you see LOW confidence: This is normal during the first minutes after
setup_dog_feeding(). The stream tables need time to accumulate refresh
history. In typical deployments with a 1-minute schedule, expect:
- LOW for the first ~10 minutes
- MEDIUM after ~10 minutes
- HIGH after ~20 minutes (requires at least 2 FULL refreshes — these happen naturally when the auto-threshold triggers a mode switch)
If a stream table uses FULL mode exclusively, the advice will remain
at MEDIUM because no DIFFERENTIAL observations exist for comparison.
The sla_headroom_pct column shows how much faster DIFFERENTIAL is compared
to FULL as a percentage. A value of 70% means "DIFF is 70% faster than FULL".
This column is NULL when either FULL or DIFF observations are missing.
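Since sla_headroom_pct is NULL until both FULL and DIFFERENTIAL observations exist, filter it out when reading the advice (a sketch over the documented columns):

```sql
-- DIFFERENTIAL-vs-FULL headroom for recommendations with usable data
SELECT sla_headroom_pct, confidence
FROM pgtrickle.df_threshold_advice
WHERE sla_headroom_pct IS NOT NULL
  AND confidence IN ('HIGH', 'MEDIUM');
```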
Public API Stability Contract
Added in v0.19.0 (DB-6).
Stable (will not break without a major version bump)
| Surface | Guarantee |
|---|---|
| All functions in the pgtrickle schema documented in this reference | Signature and return type preserved across minor releases. New optional parameters may be added with defaults that preserve existing behaviour. |
| Catalog tables pgtrickle.pgt_stream_tables, pgtrickle.pgt_dependencies, pgtrickle.pgt_refresh_history | Existing columns are not renamed or removed. New columns may be added. |
| NOTIFY channels pg_trickle_refresh, pgtrickle_alert, pgtrickle_wake | Channel names and JSON payload structure preserved. New keys may be added to JSON payloads. |
| GUC names listed in docs/CONFIGURATION.md | Names preserved; default values may change between minor releases (documented in CHANGELOG). |
Unstable (may change in any release)
| Surface | Notes |
|---|---|
| Functions prefixed with _ (e.g. _signal_launcher_rescan) | Internal use only. |
| Catalog tables not listed above (e.g. pgt_scheduler_jobs, pgt_source_gates, pgt_watermarks) | Schema may change. |
| The pgtrickle_changes schema and its changes_* tables | CDC implementation detail; format may change. |
| SQL generated by the DVM engine (MERGE, delta CTEs) | Internal query structure is not an API. |
| The pgtrickle.pgt_schema_version table | Migration infrastructure; rows and schema may change. |
Versioning Policy
- Patch releases (0.x.Y): Bug fixes only. No breaking changes.
- Minor releases (0.X.0): New features. Stable API preserved; unstable surfaces may change. Breaking changes to stable API only with a deprecation cycle (WARNING for one release, removal in the next).
- Major release (1.0.0): Stable API locked. Breaking changes require a major version bump.
Configuration
Complete reference for all pg_trickle GUC (Grand Unified Configuration) variables.
Table of Contents
- Overview
- GUC Variables
- Essential
- WAL CDC
- Refresh Performance
- pg_trickle.differential_max_change_ratio
- pg_trickle.refresh_strategy
- pg_trickle.cost_model_safety_margin
- pg_trickle.max_delta_estimate_rows
- pg_trickle.planner_aggressive
- pg_trickle.merge_join_strategy
- pg_trickle.merge_strategy
- pg_trickle.merge_strategy_threshold
- pg_trickle.merge_planner_hints (deprecated)
- pg_trickle.merge_work_mem_mb
- pg_trickle.merge_seqscan_threshold
- pg_trickle.auto_backoff
- pg_trickle.tiered_scheduling
- pg_trickle.cleanup_use_truncate
- pg_trickle.use_prepared_statements
- pg_trickle.user_triggers
- Guardrails & Limits
- pg_trickle.block_source_ddl
- pg_trickle.buffer_alert_threshold
- pg_trickle.compact_threshold
- pg_trickle.max_buffer_rows
- pg_trickle.auto_index
- pg_trickle.aggregate_fast_path
- pg_trickle.template_cache
- pg_trickle.buffer_partitioning
- pg_trickle.max_grouping_set_branches
- pg_trickle.max_parse_depth
- pg_trickle.ivm_topk_max_limit
- pg_trickle.ivm_recursive_max_depth
- Parallel Refresh
- Advanced / Internal
- Guardrails & Diagnostics
- Connection Pooler
- History & Retention
- Circular Dependencies
- GUC Interaction Matrix
- Tuning Profiles
- Complete postgresql.conf Example
- Runtime Configuration
- Further Reading
Overview
pg_trickle exposes over forty configuration variables in the pg_trickle namespace. All can be set in postgresql.conf or at runtime via SET / ALTER SYSTEM.
Required postgresql.conf settings:
shared_preload_libraries = 'pg_trickle'
The extension must be loaded via shared_preload_libraries because it registers GUC variables and a background worker at startup.
Note:
wal_level = logical and max_replication_slots are recommended but not required. The default CDC mode (auto) uses lightweight row-level triggers initially and transparently transitions to WAL-based capture if wal_level = logical is available. If wal_level is not logical, pg_trickle stays on triggers permanently — no degradation, no errors. Set pg_trickle.cdc_mode = 'trigger' to disable WAL transitions entirely (see pg_trickle.cdc_mode).
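To check whether the transparent trigger → WAL transition can ever engage on your server, inspect the relevant settings (plain PostgreSQL, no pg_trickle required):

```sql
-- WAL-based CDC requires wal_level = logical and a free replication slot
SELECT current_setting('wal_level') AS wal_level,
       current_setting('max_replication_slots') AS max_replication_slots;
```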
GUC Variables
Essential
The settings most users configure at install time.
pg_trickle.enabled
Enable or disable the pg_trickle extension.
| Property | Value |
|---|---|
| Type | bool |
| Default | true |
| Context | SUSET (superuser) |
| Restart Required | No |
When set to false, the background scheduler stops processing refreshes. Existing stream tables remain in the catalog but are not refreshed. Manual pgtrickle.refresh_stream_table() calls still work.
-- Disable automatic refreshes
SET pg_trickle.enabled = false;
-- Re-enable
SET pg_trickle.enabled = true;
pg_trickle.cdc_mode
CDC (Change Data Capture) mechanism selection.
| Value | Description |
|---|---|
| 'auto' | (default) Use triggers at creation; transition to WAL-based CDC if wal_level = logical. Falls back to triggers automatically on error. |
| 'trigger' | Always use row-level triggers for change capture |
| 'wal' | Require WAL-based CDC (fails if wal_level != logical) |
Default: 'auto'
pg_trickle.cdc_mode only affects deferred refresh modes ('AUTO', 'FULL',
and 'DIFFERENTIAL'). refresh_mode = 'IMMEDIATE' bypasses CDC entirely and
always uses statement-level IVM triggers. If the GUC is set to 'wal' when a
stream table is created or altered to IMMEDIATE, pg_trickle logs an INFO and
continues with IVM triggers instead of creating CDC triggers or WAL slots.
Per-stream-table overrides take precedence over the GUC when you pass
cdc_mode => 'auto' | 'trigger' | 'wal' to
pgtrickle.create_stream_table(...) or pgtrickle.alter_stream_table(...).
The override is stored in pgtrickle.pgt_stream_tables.requested_cdc_mode.
For shared source tables, pg_trickle resolves the effective source-level CDC
mechanism conservatively: any dependent stream table that requests 'trigger'
keeps the source on trigger CDC; otherwise 'wal' wins over 'auto'.
-- Enable automatic trigger → WAL transition (default)
SET pg_trickle.cdc_mode = 'auto';
-- Force trigger-only CDC (disable WAL transitions)
SET pg_trickle.cdc_mode = 'trigger';
-- Require WAL-based CDC (error if wal_level != logical)
SET pg_trickle.cdc_mode = 'wal';
pg_trickle.scheduler_interval_ms
How often the background scheduler checks for stream tables that need refreshing.
| Property | Value |
|---|---|
| Type | int |
| Default | 1000 (1 second) |
| Range | 100 – 60000 (100ms to 60s) |
| Context | SUSET |
| Restart Required | No |
Tuning Guidance:
- Low-latency workloads (sub-second schedules): Set to 100–500.
- Standard workloads (minute-scale schedules): The default 1000 is appropriate.
- Low-overhead workloads (many STs with long schedules): Increase to 5000–10000 to reduce scheduler overhead.
The scheduler interval does not determine refresh frequency — it determines how often the scheduler checks whether any ST's staleness exceeds its schedule (or whether a cron expression has fired). The actual refresh frequency is governed by schedule (duration or cron) and canonical period alignment.
SET pg_trickle.scheduler_interval_ms = 500;
pg_trickle.event_driven_wake
Enable event-driven scheduler wake via LISTEN/NOTIFY. When enabled, CDC triggers emit pg_notify('pgtrickle_wake', '') after writing to the change buffer, and the scheduler LISTENs on that channel, waking immediately instead of waiting for the full scheduler_interval_ms poll. This reduces median end-to-end latency from ~500 ms to ~15 ms for low-volume workloads.
| Property | Value |
|---|---|
| Type | bool |
| Default | true |
| Context | SUSET |
| Restart Required | No |
Tuning Guidance:
- Low-latency workloads: Leave enabled (default) for the best latency.
- Extreme write throughput (>100K DML/s): Consider disabling if the per-statement NOTIFY overhead is measurable. The NOTIFY is coalesced by PostgreSQL (one notification per transaction), so the actual overhead is negligible for most workloads.
-- Disable event-driven wake (fall back to poll-only)
SET pg_trickle.event_driven_wake = off;
pg_trickle.wake_debounce_ms
After the scheduler receives the first pgtrickle_wake notification, it waits this many milliseconds to coalesce rapidly arriving notifications before starting a refresh tick. Lower values reduce latency; higher values reduce wake overhead during bulk DML.
| Property | Value |
|---|---|
| Type | int |
| Default | 10 (10 milliseconds) |
| Range | 1 – 5000 |
| Context | SUSET |
| Restart Required | No |
Tuning Guidance:
- Single-statement, latency-sensitive workloads: Use 1–5 ms.
- Bulk DML workloads: Use 50–200 ms to coalesce more notifications per tick.
- The default (10 ms) balances sub-20 ms latency with reasonable coalescing.
SET pg_trickle.wake_debounce_ms = 50;
pg_trickle.min_schedule_seconds
Minimum allowed schedule value (in seconds) when creating or altering a stream table with a duration-based schedule. This limit does not apply to cron expressions.
| Property | Value |
|---|---|
| Type | int |
| Default | 1 (1 second) |
| Range | 1 – 86400 (1 second to 24 hours) |
| Context | SUSET |
| Restart Required | No |
This acts as a safety guardrail to prevent users from setting impractically small schedules that would cause excessive refresh overhead.
Tuning Guidance:
- Development/testing: The default 1 allows sub-second testing.
- Production: Raise to 60 or higher to prevent excessive WAL consumption and CPU usage.
-- Restrict to 10-second minimum schedules
SET pg_trickle.min_schedule_seconds = 10;
pg_trickle.default_schedule_seconds
Default effective schedule (in seconds) for isolated CALCULATED stream tables that have no downstream dependents.
| Property | Value |
|---|---|
| Type | int |
| Default | 1 (1 second) |
| Range | 1 – 86400 (1 second to 24 hours) |
| Context | SUSET |
| Restart Required | No |
When a CALCULATED stream table (scheduled with 'calculated') has no downstream dependents to derive a schedule from, this value is used as its effective refresh interval. This is distinct from min_schedule_seconds, which is the validation floor for duration-based schedules.
Tuning Guidance:
- Development/testing: The default 1 allows rapid iteration.
- Production standalone CALCULATED tables: Raise to match your desired update cadence (e.g., 60 for once-per-minute).
-- Set default for isolated CALCULATED tables to 30 seconds
SET pg_trickle.default_schedule_seconds = 30;
pg_trickle.max_consecutive_errors
Maximum consecutive refresh failures before a stream table is moved to ERROR status.
| Property | Value |
|---|---|
| Type | int |
| Default | 3 |
| Range | 1 – 100 |
| Context | SUSET |
| Restart Required | No |
When a ST's consecutive_errors reaches this threshold:
- The ST status changes to ERROR.
- Automatic refreshes stop for this ST.
- Manual intervention is required: SELECT pgtrickle.alter_stream_table('...', status => 'ACTIVE').
Tuning Guidance:
- Strict (production): 3 — fail fast to surface issues.
- Lenient (development): 10–20 — tolerate transient errors.
SET pg_trickle.max_consecutive_errors = 5;
WAL CDC
Settings specific to WAL-based CDC. Only relevant when pg_trickle.cdc_mode = 'auto' or 'wal'.
pg_trickle.wal_transition_timeout
Note: This setting is only relevant when pg_trickle.cdc_mode = 'auto' or 'wal'. See ARCHITECTURE.md for the full CDC transition lifecycle.
Maximum time (seconds) to wait for the WAL decoder to catch up during the transition from trigger-based to WAL-based CDC. If the decoder has not caught up within this timeout, the system falls back to triggers.
Default: 300 (5 minutes)
Range: 10 – 3600
SET pg_trickle.wal_transition_timeout = 300;
pg_trickle.slot_lag_warning_threshold_mb
Warning threshold for retained WAL on pg_trickle replication slots.
| Property | Value |
|---|---|
| Type | int |
| Default | 100 (MB) |
| Range | 1 – 1048576 |
| Context | SUSET |
| Restart Required | No |
When retained WAL for a pg_trickle replication slot exceeds this threshold:
- The scheduler emits a slot_lag_warning event on the pgtrickle_alert NOTIFY channel.
- pgtrickle.health_check() reports WARN for the slot_lag check.
Raise this on high-throughput systems that intentionally tolerate larger WAL retention. Lower it if you want earlier warning before slots risk invalidation.
SET pg_trickle.slot_lag_warning_threshold_mb = 256;
pg_trickle.slot_lag_critical_threshold_mb
Critical threshold for retained WAL on pg_trickle replication slots.
| Property | Value |
|---|---|
| Type | int |
| Default | 1024 (MB) |
| Range | 1 – 1048576 |
| Context | SUSET |
| Restart Required | No |
When retained WAL for a pg_trickle replication slot exceeds this threshold,
pgtrickle.check_cdc_health() returns a per-source
slot_lag_exceeds_threshold alert.
This threshold is intentionally higher than the warning threshold so operators can separate early warning from source-level unhealthy state.
SET pg_trickle.slot_lag_critical_threshold_mb = 2048;
Refresh Performance
Fine-grained tuning for the differential refresh engine.
pg_trickle.differential_max_change_ratio
Maximum change-to-table ratio before DIFFERENTIAL refresh falls back to FULL refresh.
| Property | Value |
|---|---|
| Type | float |
| Default | 0.15 (15%) |
| Range | 0.0 – 1.0 |
| Context | SUSET |
| Restart Required | No |
When the number of pending change buffer rows exceeds this fraction of the source table's estimated row count, the refresh engine switches from DIFFERENTIAL (which uses JSONB parsing and window functions) to FULL refresh. At high change rates FULL refresh is cheaper because it avoids the per-row JSONB overhead.
Special Values:
- 0.0: Disable adaptive fallback — always use DIFFERENTIAL.
- 1.0: Always fall back to FULL (effectively forces FULL mode).
Tuning Guidance:
- OLTP with low change rates (< 5%): The default 0.15 is appropriate.
- Batch-load workloads (bulk inserts): Lower to 0.05–0.10 so large batches trigger FULL refresh sooner.
- Latency-sensitive (want deterministic refresh time): Set to 0.0 to always use DIFFERENTIAL.
-- Lower threshold for batch-heavy workloads
SET pg_trickle.differential_max_change_ratio = 0.10;
-- Disable adaptive fallback
SET pg_trickle.differential_max_change_ratio = 0.0;
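Before tuning the ratio, it can help to see roughly where a source table sits today by comparing the pending buffer count to the planner's row estimate. This is only a diagnostic sketch: changes_orders stands in for the internal change buffer of a hypothetical orders table, and the pgtrickle_changes schema is documented above as an unstable surface that may change between releases.

```sql
-- Approximate change-to-table ratio for a hypothetical 'orders' source
SELECT (SELECT count(*) FROM pgtrickle_changes.changes_orders)::float8
       / NULLIF((SELECT reltuples FROM pg_class
                 WHERE relname = 'orders'), 0) AS change_ratio;
```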
pg_trickle.refresh_strategy
Cluster-wide refresh strategy override.
| Property | Value |
|---|---|
| Type | string |
| Default | 'auto' |
| Values | 'auto', 'differential', 'full' |
| Context | SUSET |
| Restart Required | No |
Controls the FULL vs. DIFFERENTIAL decision for all stream tables whose refresh_mode is DIFFERENTIAL:
- 'auto' (default): Use the adaptive cost-based heuristic that considers differential_max_change_ratio, per-ST auto_threshold, refresh history, and spill detection to pick the optimal strategy per refresh cycle.
- 'differential': Always use DIFFERENTIAL refresh — skip the adaptive ratio check entirely. The BUF-LIMIT safety check (max_buffer_rows) still applies.
- 'full': Always use FULL refresh regardless of change volume. Useful for debugging or when you know DIFFERENTIAL is consistently slower for your workload.
Important: Per-ST refresh_mode in the catalog takes precedence. Stream tables explicitly configured as refresh_mode = 'FULL' always use FULL regardless of this GUC.
Tuning Guidance:
- Most workloads: Leave at 'auto' — the adaptive heuristic learns from refresh history.
- Known-low-churn workloads: Set to 'differential' to eliminate the per-source capped-count query overhead.
- Debugging delta issues: Temporarily set to 'full' to compare behavior.
-- Force DIFFERENTIAL for all stream tables (skip ratio check)
SET pg_trickle.refresh_strategy = 'differential';
-- Force FULL for all stream tables (debugging)
SET pg_trickle.refresh_strategy = 'full';
-- Reset to adaptive heuristic
SET pg_trickle.refresh_strategy = 'auto';
pg_trickle.cost_model_safety_margin
Added in v0.17.0. Safety margin for the predictive cost model that decides FULL vs. DIFFERENTIAL.
| Property | Value |
|---|---|
| Type | float |
| Default | 0.8 |
| Range | 0.1 – 2.0 |
| Context | SUSET |
| Restart Required | No |
When refresh_strategy = 'auto', the cost model estimates DIFFERENTIAL and FULL costs from recent refresh history. DIFFERENTIAL is chosen when:
estimated_diff_cost < estimated_full_cost × safety_margin
A value below 1.0 biases toward DIFFERENTIAL (which has lower lock contention and is generally preferred). A value above 1.0 biases toward FULL.
The cost model also classifies each stream table's query complexity (scan, filter, aggregate, join, or join+aggregate) and uses per-class coefficients learned from historical data.
Tuning Guidance:
- 0.8 (default): Prefer DIFFERENTIAL unless it's nearly as expensive as FULL.
- 0.5: Strongly prefer DIFFERENTIAL — only fall back when it's clearly more expensive.
- 1.0: Neutral — pick whichever is estimated to be cheaper.
- 1.2: Slightly prefer FULL — useful when FULL is very fast and DIFFERENTIAL lock contention is a concern.
-- Strongly prefer DIFFERENTIAL
SET pg_trickle.cost_model_safety_margin = 0.5;
-- Neutral (pick the estimated cheapest)
SET pg_trickle.cost_model_safety_margin = 1.0;
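A quick worked example of the decision rule: with a FULL estimate of 200 ms and the default margin of 0.8, DIFFERENTIAL is chosen only while its estimate stays under 160 ms.

```sql
-- estimated_diff_cost < estimated_full_cost × safety_margin
SELECT 150.0 < 200.0 * 0.8 AS choose_differential;  -- true (150 < 160)
```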
pg_trickle.max_delta_estimate_rows
Added in v0.15.0. Maximum estimated delta output cardinality before falling back to FULL refresh.
| Property | Value |
|---|---|
| Type | int |
| Default | 0 (disabled) |
| Range | 0 – 10,000,000 |
| Context | SUSET |
| Restart Required | No |
Before executing the MERGE, the refresh executor extracts the delta subquery and runs a capped SELECT count(*) FROM (delta LIMIT N+1). If the count reaches the configured limit, the refresh emits a NOTICE and falls back to FULL refresh to prevent OOM or excessive temp-file spills from unexpectedly large delta output.
This is complementary to differential_max_change_ratio which checks input change buffer size as a ratio of source table size. max_delta_estimate_rows checks output cardinality — catching cases where a small number of input changes produce a large delta output after JOINs.
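The capped count means the check never scans more than N+1 delta rows, so its own cost is bounded even when the delta is huge. The pattern is illustrated below with generate_series standing in for the delta subquery:

```sql
-- With a limit of 100000, count at most 100001 rows and then stop
SELECT count(*) >= 100001 AS over_limit
FROM (SELECT 1 FROM generate_series(1, 1000000) LIMIT 100001) capped;
```

Here over_limit is true, so a real refresh would emit the NOTICE and fall back to FULL.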
Special Values:
- 0 (default): Disable the estimation check entirely.
Tuning Guidance:
- Servers with 8–16 GB RAM: Start with 100000 and adjust based on observed refresh behavior.
- Large-memory servers (32+ GB): 500000 or higher.
- Complex multi-join queries: Lower to 50000 since join fan-out can amplify small changes.
-- Enable delta output estimation with 100K row limit
SET pg_trickle.max_delta_estimate_rows = 100000;
-- Disable estimation (default)
SET pg_trickle.max_delta_estimate_rows = 0;
pg_trickle.cleanup_use_truncate
Use TRUNCATE instead of per-row DELETE for change buffer cleanup when the entire buffer is consumed by a refresh.
| Property | Value |
|---|---|
| Type | bool |
| Default | true |
| Context | SUSET |
| Restart Required | No |
After a differential refresh consumes all rows from the change buffer, the engine must clean up the buffer table. TRUNCATE is O(1) regardless of row count, versus DELETE which must update indexes row-by-row. This saves 3–5 ms per refresh at 10%+ change rates.
Trade-off: TRUNCATE acquires an AccessExclusiveLock on the change buffer table. If concurrent DML on the source table is actively inserting into the same change buffer via triggers, this lock can cause brief contention.
Tuning Guidance:
- Most workloads: Leave at true — the performance benefit outweighs the brief lock.
- High-concurrency OLTP with continuous writes during refresh: Set to false if you observe lock-wait timeouts on the change buffer.
- PgBouncer / connection poolers: The AccessExclusiveLock acquired by TRUNCATE is held only on the change buffer table (not the source table), but in transaction-pooling mode with frequent refreshes, even brief exclusive locks can cause connection queuing. If you observe elevated pg_stat_activity wait events on change buffer tables, switch to false.
-- Use per-row DELETE for change buffer cleanup
SET pg_trickle.cleanup_use_truncate = false;
pg_trickle.planner_aggressive
Added in v0.14.0. Consolidated switch for all MERGE planner hints. Replaces the deprecated merge_planner_hints GUC.
| Property | Value |
|---|---|
| Type | bool |
| Default | true |
| Context | SUSET |
| Restart Required | No |
When enabled, the refresh executor estimates the delta size and applies optimizer hints within the transaction:
- Delta ≥ 100 rows: SET LOCAL enable_nestloop = off — forces hash joins instead of nested-loop joins.
- Delta ≥ 10,000 rows: additionally SET LOCAL work_mem = '<N>MB' (see pg_trickle.merge_work_mem_mb).
Tuning Guidance:
- Most workloads: Leave at true — the hints improve tail latency without affecting small deltas.
- Custom plan overrides: Set to false if you manage planner settings yourself or if the hints conflict with your pg_hint_plan configuration.
- Memory-constrained environments: When enabled, large deltas (≥ 10,000 rows) raise work_mem to 64 MB (configurable via merge_work_mem_mb). If your server has limited RAM and runs many concurrent refreshes, this can cause unexpected memory pressure or temp-file spills. Monitor temp_blks_written in pg_stat_statements and consider lowering merge_work_mem_mb or disabling this GUC if spills are frequent.
-- Disable all planner hints
SET pg_trickle.planner_aggressive = false;
pg_trickle.merge_join_strategy
Added in v0.15.0. Manual override for the join strategy used during MERGE execution.
| Property | Value |
|---|---|
| Type | text |
| Default | 'auto' |
| Values | auto, hash_join, nested_loop, merge_join |
| Context | SUSET |
| Restart Required | No |
Controls which join strategy the refresh executor hints to PostgreSQL via SET LOCAL during differential refresh. Requires planner_aggressive to be enabled.
| Value | Behaviour |
|---|---|
| auto (default) | Delta-size heuristics choose: nested-loop for tiny deltas, hash-join for larger ones |
| hash_join | Always disable nested-loop joins and raise work_mem — best for medium-to-large deltas |
| nested_loop | Always disable hash-join and merge-join — best for very small deltas against indexed tables |
| merge_join | Always disable hash-join and nested-loop — useful if data is pre-sorted |
Tuning Guidance:
- Most workloads: Leave at auto — the built-in heuristic performs well.
- Consistently large deltas (1K+ rows): Setting to hash_join avoids heuristic overhead.
- Troubleshooting: If refresh is slow, try different strategies and compare with explain_st().
-- Force hash joins for all MERGE operations
SET pg_trickle.merge_join_strategy = 'hash_join';
-- Revert to automatic heuristics
SET pg_trickle.merge_join_strategy = 'auto';
pg_trickle.merge_strategy
Added in v0.16.0. Controls how differential refresh applies deltas to stream tables.
| Property | Value |
|---|---|
| Type | text |
| Default | 'auto' |
| Values | auto, merge |
| Context | SUSET |
| Restart Required | No |
| Value | Behaviour |
|---|---|
| auto (default) | Use DELETE+INSERT when delta_rows / target_rows is below merge_strategy_threshold; MERGE otherwise |
| merge | Always use the PostgreSQL MERGE statement |
Breaking change (v0.19.0): The delete_insert value was removed in v0.19.0 (CORR-1) because it was semantically unsafe for aggregate and DISTINCT queries. Setting it now logs a WARNING and falls back to auto.
The DELETE+INSERT strategy avoids the MERGE join cost by executing two targeted statements:
a DELETE for removed rows (matched by __pgt_row_id), then an INSERT for new rows.
This is significantly cheaper for sub-1% deltas against large tables because it avoids
scanning the entire target for the MERGE join.
Tuning Guidance:
- Most workloads: Leave at auto — the heuristic picks DELETE+INSERT for small deltas automatically.
- Correctness concerns: The merge setting preserves the pre-v0.16.0 behaviour.
-- Force MERGE for all differential refreshes
SET pg_trickle.merge_strategy = 'merge';
-- Revert to automatic heuristics
SET pg_trickle.merge_strategy = 'auto';
pg_trickle.merge_strategy_threshold
Added in v0.16.0. Delta ratio threshold for the auto merge strategy.
| Property | Value |
|---|---|
| Type | float |
| Default | 0.01 (1%) |
| Range | 0.001 – 1.0 |
| Context | SUSET |
| Restart Required | No |
When merge_strategy is auto, DELETE+INSERT is used instead of
MERGE when delta_rows / target_rows is below this threshold. The target row count is estimated
from pg_class.reltuples.
Tuning Guidance:
- Default (0.01): DELETE+INSERT for deltas under 1% of the target table size.
- Higher values (0.05–0.10): More aggressive use of DELETE+INSERT; useful for wide tables where MERGE join overhead is high.
- Lower values (0.001): Only use DELETE+INSERT for very tiny deltas.
-- Use DELETE+INSERT for deltas under 5% of target size
SET pg_trickle.merge_strategy_threshold = 0.05;
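The target row count comes from the planner statistic rather than an exact count, so the decision itself is cheap. You can read the same estimate yourself (the table name below is illustrative):

```sql
-- Planner row estimate used as the denominator of the delta ratio
SELECT reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relname = 'order_totals';  -- hypothetical stream table
```

Note that reltuples is only as fresh as the last ANALYZE or autovacuum pass on the table.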
pg_trickle.merge_planner_hints
Deprecated in v0.14.0. Use pg_trickle.planner_aggressive instead. This GUC is still accepted for backward compatibility but is ignored at runtime.
Inject SET LOCAL planner hints before MERGE execution during differential refresh.
| Property | Value |
|---|---|
| Type | bool |
| Default | true |
| Context | SUSET |
| Restart Required | No |
When enabled, the refresh executor estimates the delta size and applies optimizer hints within the transaction:
- Delta ≥ 100 rows: SET LOCAL enable_nestloop = off — forces hash joins instead of nested-loop joins.
- Delta ≥ 10,000 rows: additionally SET LOCAL work_mem = '<N>MB' (see pg_trickle.merge_work_mem_mb).
This reduces P95 latency spikes caused by PostgreSQL choosing nested-loop plans for medium/large delta sizes.
Tuning Guidance:
- Most workloads: Leave at true — the hints improve tail latency without affecting small deltas.
- Custom plan overrides: Set to false if you manage planner settings yourself or if the hints conflict with your pg_hint_plan configuration.
-- Disable planner hints
SET pg_trickle.merge_planner_hints = false;
pg_trickle.merge_work_mem_mb
work_mem value (in MB) applied via SET LOCAL when the delta exceeds 10,000 rows and planner hints are enabled.
| Property | Value |
|---|---|
| Type | int |
| Default | 64 (64 MB) |
| Range | 8 – 4096 (8 MB to 4 GB) |
| Context | SUSET |
| Restart Required | No |
A higher value lets PostgreSQL use larger in-memory hash tables for the MERGE join, avoiding disk-spilling sort/merge strategies on large deltas. This setting is only applied when planner_aggressive = true and the delta exceeds 10,000 rows.
Tuning Guidance:
- Servers with ample RAM (32+ GB): Increase to 128–256 for faster large-delta refreshes.
- Memory-constrained: Lower to 16–32 or disable planner hints entirely.
- Very large deltas (100K+ rows): Consider 256–512 if refresh latency matters.
SET pg_trickle.merge_work_mem_mb = 128;
pg_trickle.delta_work_mem_cap_mb
Maximum work_mem (in MB) that planner hints are allowed to set during delta MERGE execution. When the deep-join or large-delta path would set work_mem above this cap, the refresh falls back to FULL instead of risking OOM.
| Property | Value |
|---|---|
| Type | int |
| Default | 0 (disabled — no cap) |
| Range | 0 – 8192 (0 to 8 GB) |
| Context | SUSET |
| Restart Required | No |
Set to 0 to disable the cap entirely (default). When enabled, the cap is checked before any SET LOCAL work_mem in apply_planner_hints(). If the configured or computed work_mem exceeds the cap, the refresh emits a NOTICE and falls back to FULL refresh.
Tuning Guidance:
- Production servers with tight memory budgets: Set to 256–512 to prevent runaway hash joins.
- Servers with ample RAM (64+ GB): Leave at 0 (disabled) or set high (2048+).
- If you see SCAL-3 fallback notices: Either raise the cap or investigate why delta sizes are unexpectedly large.
SET pg_trickle.delta_work_mem_cap_mb = 512;
pg_trickle.merge_seqscan_threshold
Delta-to-ST row ratio below which sequential scans are disabled for the MERGE transaction. Requires planner hints to be enabled.
| Property | Value |
|---|---|
| Type | real |
| Default | 0.001 |
| Range | 0.0 – 1.0 |
| Context | SUSET |
| Restart Required | No |
When the estimated delta row count divided by the stream table's reltuples falls below this threshold, the refresh executor issues SET LOCAL enable_seqscan = off, coercing PostgreSQL into using the __pgt_row_id B-tree index instead of a full sequential scan.
Set to 0.0 to disable the feature entirely.
Tuning Guidance:
- Default (0.001): Suitable for most workloads. A 10M-row ST with fewer than 10K delta rows triggers the hint.
- High-throughput / small STs: Increase to 0.01 if your STs are small and you want more aggressive index usage.
- Disable: Set to 0.0 if index-only scans are not beneficial for your access pattern.
SET pg_trickle.merge_seqscan_threshold = 0.01;
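Because the hint compares the estimated delta size against PostgreSQL's reltuples estimate for the stream table, you can sanity-check whether a given delta would trip the threshold yourself. A sketch, where 'my_st' and the 5,000-row delta are placeholder values:

```sql
-- Would a 5,000-row delta disable seqscans for stream table my_st
-- at the default threshold of 0.001?
SELECT 5000 / greatest(reltuples, 1) < 0.001 AS hint_would_fire
FROM pg_class
WHERE relname = 'my_st';
```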
pg_trickle.auto_backoff
Automatically back off the refresh schedule when a stream table is consistently falling behind.
| Property | Value |
|---|---|
| Type | bool |
| Default | on |
| Context | SUSET |
| Restart Required | No |
When enabled (the default), the scheduler tracks a per-stream-table backoff factor. If a
refresh cycle takes more than 95% of the scheduled interval, the backoff factor doubles
(capped at 8×), effectively stretching the schedule to avoid runaway refresh storms.
The factor resets to 1× on the first on-time completion, and a WARNING is emitted whenever
the factor changes so you always know why a stream table is refreshing more slowly than expected.
With the 95% trigger threshold, a refresh that consumes nearly its whole interval (e.g. a 950 ms
refresh on a 1-second schedule) engages backoff, while a 900 ms refresh on the same schedule
does not. The EC-11 operator alert (the scheduler_falling_behind NOTIFY) continues to fire at
the lower 80% threshold, giving you advance warning before the scheduler is actually stuck.
This is a safety net for overloaded systems — it prevents a single slow stream table from monopolizing the background worker when operators are not available to intervene.
Tuning Guidance:
- Leave on (the default) for both production and development environments.
- Disable only if you are deliberately running stream tables at the limit of their schedule budget and want the scheduler to keep trying at full speed regardless.
-- Disable if you want no backoff (not recommended for production)
SET pg_trickle.auto_backoff = off;
pg_trickle.tiered_scheduling
Enable tiered refresh scheduling (Hot/Warm/Cold/Frozen) for stream tables.
| Property | Value |
|---|---|
| Type | bool |
| Default | on |
| Context | SUSET |
| Restart Required | No |
When enabled, the scheduler applies a per-stream-table refresh tier multiplier
to duration-based schedules. Each stream table has a refresh_tier column
(default 'hot') that controls how often it is refreshed relative to its
configured schedule:
| Tier | Multiplier | Effect |
|---|---|---|
| hot | 1× | Refresh at the configured schedule (default) |
| warm | 2× | Refresh at 2× the configured interval |
| cold | 10× | Refresh at 10× the configured interval |
| frozen | skip | Never refreshed until manually promoted |
Cron-based schedules are not affected by the tier multiplier.
Set the tier via:
SELECT pgtrickle.alter_stream_table('my_table', tier => 'warm');
SELECT pgtrickle.alter_stream_table('my_table', tier => 'frozen');
Design note: Tiers are user-assigned only. Automatic classification from
pg_stat_user_tables was rejected because pg_trickle's own MERGE scans
pollute the read counters, making auto-classification unreliable.
Tier Thresholds Reference
The following table summarizes the effective refresh behavior for each tier.
All multipliers apply to duration-based schedules only — cron-based
schedules are always honored as-is. New stream tables default to hot.
| Tier | Multiplier | Effective Schedule (1 s base) | Use Case |
|---|---|---|---|
| hot | 1× | 1 s | Real-time dashboards, alerting tables, SLA-bound queries |
| warm | 2× | 2 s | Important but not latency-critical tables; reduces CPU by 50% |
| cold | 10× | 10 s | Reporting tables queried infrequently; saves significant CPU |
| frozen | skip | never (until promoted) | Archival tables, tables under maintenance, or seasonal reports |
When to use each tier:
- Hot — default for all new stream tables. Appropriate when downstream consumers expect near-real-time freshness.
- Warm — set for tables where a few seconds of staleness is acceptable. Halves the refresh CPU cost compared to Hot.
- Cold — set for tables queried only by batch jobs or low-frequency dashboards. 10× reduction in refresh overhead.
- Frozen — set when a table should not be refreshed at all (e.g., during a maintenance window or when the upstream source is being migrated). Promote back to Hot/Warm/Cold when ready.
-- Promote a frozen table back to warm
SELECT pgtrickle.alter_stream_table('seasonal_report', tier => 'warm');
-- Freeze a table during maintenance
SELECT pgtrickle.alter_stream_table('my_table', tier => 'frozen');
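Tiers compose with the schedule chosen at creation time. A sketch, reusing the orders table from the quick-start example:

```sql
-- Base schedule 30 s; demoting to cold stretches the effective
-- refresh interval to 300 s (30 s × 10) without changing the schedule.
SELECT pgtrickle.create_stream_table(
    name     => 'orders_by_status',
    query    => 'SELECT status, count(*) FROM orders GROUP BY status',
    schedule => '30s'
);
SELECT pgtrickle.alter_stream_table('orders_by_status', tier => 'cold');
```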
Changed in v0.12.0: The default for pg_trickle.tiered_scheduling changed from 'off' to 'on'. Set pg_trickle.tiered_scheduling = off in postgresql.conf to restore the pre-v0.12.0 behavior (all STs refresh at full speed regardless of tier assignment).
Diamond Schedule Policy (per-stream-table)
Controls how the scheduler fires diamond consistency groups — sets of stream tables that share upstream sources through a diamond-shaped DAG topology.
| Property | Value |
|---|---|
| Column | diamond_schedule_policy in pgt_stream_tables |
| Values | 'fastest' (default), 'slowest' |
| Set via | create_stream_table(..., diamond_schedule_policy => 'slowest') |
| Alter via | alter_stream_table('name', diamond_schedule_policy => 'slowest') |
Only meaningful when diamond_consistency = 'atomic' is also set.
fastest (default): The atomic group fires when any member is due.
This maximizes freshness but can cause CPU multiplication. In an asymmetric
diamond where stream table B refreshes every 1 s and stream table C every 5 s,
both feeding D with diamond_consistency = 'atomic': C refreshes 5× more
often than its schedule because B triggers the group every second. For N
members with schedules S₁ < S₂ < … < Sₙ, the total refresh count is
N × (cycle_time / S₁), meaning slower members do up to Sₙ/S₁ times more work
than their schedule implies.
slowest: The atomic group fires only when all members are due.
This minimizes CPU cost at the expense of freshness — faster members are held
back until the slowest member's schedule fires.
Tuning Guidance:
- Use 'fastest' when freshness of the diamond tip matters and the cost of extra refreshes is acceptable.
- Use 'slowest' when CPU budget is tight or members have very different schedules (e.g., 1 s vs 60 s) and the multiplication would be excessive.
-- Create with slowest policy to avoid CPU multiplication
SELECT pgtrickle.create_stream_table(
'my_diamond_tip',
'SELECT ... FROM a JOIN b ...',
diamond_consistency => 'atomic',
diamond_schedule_policy => 'slowest'
);
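To make the trade-off concrete, consider a hypothetical two-member group with schedules of 1 s and 60 s feeding an atomic tip:

```sql
-- 'fastest': the group fires every 1 s, so the 60 s member refreshes
-- 60× more often than its schedule implies.
-- 'slowest': the group fires every 60 s; the 1 s member is held back.
-- Existing tips can be switched without recreating them:
SELECT pgtrickle.alter_stream_table('my_diamond_tip',
    diamond_schedule_policy => 'slowest');
```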
pg_trickle.use_prepared_statements
Use SQL PREPARE / EXECUTE for MERGE statements during differential refresh.
| Property | Value |
|---|---|
| Type | bool |
| Default | true |
| Context | SUSET |
| Restart Required | No |
When enabled, the refresh executor issues PREPARE __pgt_merge_{id} on the first cache-hit cycle, then uses EXECUTE on subsequent cycles. After approximately 5 executions, PostgreSQL switches from a custom plan to a generic plan, saving 1–2 ms of parse/plan overhead per refresh.
Tuning Guidance:
- Most workloads: Leave at true; the cumulative parse/plan savings are significant for frequently refreshed stream tables.
- Highly skewed data: Set to false if prepared-statement parameter sniffing produces poor plans (e.g., highly skewed LSN distributions causing bad join estimates).
-- Disable prepared statements
SET pg_trickle.use_prepared_statements = false;
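Before disabling prepared statements outright, note that stock PostgreSQL offers a middle ground: the standard plan_cache_mode server setting (not part of pg_trickle) keeps the parse-time savings while preventing the switch to a generic plan:

```sql
-- Keep pg_trickle's prepared statements but stop PostgreSQL from
-- switching to a generic plan after ~5 executions:
SET plan_cache_mode = force_custom_plan;
```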
pg_trickle.user_triggers
Control how user-defined triggers on stream tables are handled during refresh.
| Property | Value |
|---|---|
| Type | text |
| Default | 'auto' |
| Values | 'auto', 'off' ('on' accepted as deprecated alias for 'auto') |
| Context | SUSET |
| Restart Required | No |
When a stream table has user-defined row-level triggers, the refresh engine can decompose the MERGE into explicit DELETE + UPDATE + INSERT statements so triggers fire with correct TG_OP, OLD, and NEW values.
Values:
- 'auto' (default): Automatically detect user triggers on the stream table. If present, use the explicit DML path; otherwise use MERGE.
- 'off': Always use MERGE. User triggers are suppressed during refresh. This is the escape hatch if explicit DML causes issues.
- 'on': Deprecated compatibility alias for 'auto'. Existing configs continue to work, but new configs should use 'auto'.
Notes:
- Row-level triggers do not fire during FULL refresh regardless of this setting. FULL refresh uses DISABLE TRIGGER USER / ENABLE TRIGGER USER to suppress them.
- The explicit DML path adds ~25–60% overhead compared to MERGE for affected stream tables.
- Stream tables without user triggers have zero overhead under 'auto' (only a fast pg_trigger check).
-- Auto-detect (default)
SET pg_trickle.user_triggers = 'auto';
-- Suppress triggers, use MERGE
SET pg_trickle.user_triggers = 'off';
-- Backward-compatible legacy setting (treated the same as 'auto')
SET pg_trickle.user_triggers = 'on';
Guardrails & Limits
Safety controls and hard limits.
pg_trickle.block_source_ddl
When enabled, column-affecting DDL (e.g., ALTER TABLE ... DROP COLUMN,
ALTER TABLE ... ALTER COLUMN ... TYPE) on source tables tracked by stream
tables is blocked with an ERROR instead of silently marking stream tables
for reinitialization.
This is useful in production environments where you want to prevent accidental schema changes that would trigger expensive full recomputation of downstream stream tables.
Default: false
Context: Superuser
-- Block column-affecting DDL on tracked source tables
SET pg_trickle.block_source_ddl = true;
-- Allow DDL (stream tables will be marked for reinit instead)
SET pg_trickle.block_source_ddl = false;
Note: Only column-affecting changes are blocked. Benign DDL (adding indexes, comments, constraints) is always allowed regardless of this setting.
pg_trickle.buffer_alert_threshold
When any source table's change buffer exceeds this number of rows, a
BufferGrowthWarning alert is emitted. Raise for high-throughput workloads,
lower for small tables.
Default: 1000000 (1 million rows)
Range: 1000 – 100000000
SET pg_trickle.buffer_alert_threshold = 500000;
pg_trickle.compact_threshold
When a source table's pending change buffer exceeds this many rows,
compaction is triggered before the next refresh cycle. Compaction eliminates
net-zero INSERT+DELETE pairs (rows inserted then deleted within the same
refresh window) and collapses multi-change groups to first+last rows per
pk_hash, reducing delta scan overhead by 50–90% for high-churn tables.
Set to 0 to disable compaction.
Default: 100000 (100K rows)
Range: 0 – 100000000
-- Trigger compaction at 50K pending rows
SET pg_trickle.compact_threshold = 50000;
-- Disable compaction
SET pg_trickle.compact_threshold = 0;
pg_trickle.max_buffer_rows
Added in v0.16.0. Hard limit on change buffer rows per source table. When a source table's change buffer exceeds this limit at refresh time, pg_trickle forces a FULL refresh and truncates the buffer, preventing unbounded disk growth when differential refresh fails repeatedly.
| Property | Value |
|---|---|
| Type | integer |
| Default | 1000000 (1 million rows) |
| Range | 0 – 100000000 |
| Context | SUSET |
| Restart Required | No |
Set to 0 to disable the limit (not recommended for production).
Tuning Guidance:
- Most workloads: Leave at 1000000. This accommodates high-throughput tables while preventing runaway growth.
- High-throughput event tables: Raise to 5000000–10000000 if your source tables regularly accumulate large change buffers between refresh cycles.
- Small databases / tight disk budgets: Lower to 100000–500000 to limit change buffer disk usage.
-- Set buffer limit to 5 million rows
SET pg_trickle.max_buffer_rows = 5000000;
-- Disable the limit (not recommended)
SET pg_trickle.max_buffer_rows = 0;
pg_trickle.auto_index
Added in v0.16.0. Controls whether create_stream_table() automatically
creates performance indexes on stream tables.
| Property | Value |
|---|---|
| Type | bool |
| Default | true |
| Context | SUSET |
| Restart Required | No |
When enabled, the following indexes are created automatically:
- GROUP BY composite index — for aggregate queries in DIFFERENTIAL mode, a composite index on the GROUP BY columns is created to speed up group lookups during MERGE.
- DISTINCT composite index — for DISTINCT queries with ≤ 8 output columns, a composite index on all output columns is created.
- Covering __pgt_row_id index — for stream tables with ≤ 8 output columns, the __pgt_row_id index includes all user columns via INCLUDE, enabling index-only scans during MERGE (20–50% faster for small deltas against large targets).
The __pgt_row_id index itself is always created regardless of this setting
(it is required for correctness).
Tuning Guidance:
- Most workloads: Leave at true.
- Custom index strategies: Set to false if you prefer to manage indexes manually or if the auto-created indexes conflict with your workload patterns.
-- Disable automatic index creation
SET pg_trickle.auto_index = false;
pg_trickle.aggregate_fast_path
Added in v0.16.0. Controls whether stream tables with all-algebraic aggregates use the explicit DML fast-path instead of MERGE.
| Property | Value |
|---|---|
| Type | bool |
| Default | true |
| Context | SUSET |
| Restart Required | No |
When enabled, stream tables whose aggregates are all algebraically invertible (COUNT, SUM, AVG, STDDEV, VAR, CORR, REGR_*, etc.) use the explicit DML path (DELETE + UPDATE + INSERT via a materialized temp table) instead of the generic MERGE statement. This avoids the MERGE hash-join cost, which dominates for aggregate queries with high group cardinality.
Not eligible:
- Queries with SEMI_ALGEBRAIC aggregates (MIN, MAX) — these may require group-rescan on extremum deletion
- Queries with GROUP_RESCAN aggregates (STRING_AGG, ARRAY_AGG, JSON_AGG, etc.)
- Queries with user-defined triggers on the stream table (already use explicit DML via the user-trigger path)
The explain_st() output shows the aggregate_path field:
- explicit_dml — fast-path is active
- merge — using the default MERGE path
- merge (fast-path disabled) — eligible but the GUC is off
-- Disable aggregate fast-path
SET pg_trickle.aggregate_fast_path = false;
-- Check the current aggregate path for a stream table
SELECT * FROM pgtrickle.explain_st('my_agg_st');
pg_trickle.template_cache
Added in v0.16.0. Controls the cross-backend delta template cache backed by an UNLOGGED catalog table.
| Property | Value |
|---|---|
| Type | bool |
| Default | true |
| Context | SUSET |
| Restart Required | No |
When enabled, delta SQL templates generated by the DVM engine are persisted in
pgtrickle.pgt_template_cache so that new backends skip the ~45 ms
parse+differentiate step on their first refresh of each stream table (down to
~1 ms SPI lookup).
Templates are automatically invalidated when:
- A stream table's defining query changes (ALTER STREAM TABLE ... SET QUERY)
- A stream table is dropped
- A stream table is reinitialized
The explain_st() output includes template_cache (enabled/disabled) and
template_cache_stats with L2 hit and full miss counters.
-- Disable the template cache for debugging
SET pg_trickle.template_cache = false;
-- Check template cache stats
SELECT * FROM pgtrickle.explain_st('my_st')
WHERE property IN ('template_cache', 'template_cache_stats');
pg_trickle.buffer_partitioning
Controls whether change buffer tables use PARTITION BY RANGE (lsn) for
O(1) cleanup via partition detach instead of O(n) DELETE.
| Value | Behaviour |
|---|---|
| 'off' | (Default) Unpartitioned heap tables. Cleanup uses DELETE or TRUNCATE. Lowest DDL overhead per cycle. |
| 'on' | Always create partitioned change buffers. Old partitions are detached and dropped after consumption — O(1) cleanup regardless of buffer size. Best for high-throughput sources where buffers routinely exceed compact_threshold. |
| 'auto' | Start with unpartitioned buffers. If a buffer accumulates more rows than compact_threshold within a single refresh cycle, automatically promote it to RANGE(lsn) partitioned mode. Once promoted, the buffer stays partitioned. Combines low overhead for quiet sources with O(1) cleanup for hot ones. |
Default: 'off'
Context: SUSET (superuser session-level)
-- Always partition change buffers
SET pg_trickle.buffer_partitioning = 'on';
-- Auto-promote based on throughput
SET pg_trickle.buffer_partitioning = 'auto';
-- Disable partitioning (default)
SET pg_trickle.buffer_partitioning = 'off';
Interaction with compact_threshold: In 'auto' mode, the compact_threshold value serves double duty: it triggers both compaction and the auto-promotion decision. Lowering compact_threshold makes auto-promotion more sensitive.
pg_trickle.max_grouping_set_branches
Maximum allowed grouping set branches in CUBE/ROLLUP queries.
CUBE(n) produces $2^n$ branches — without a limit, large cubes cause
memory exhaustion during parsing. Users who genuinely need more than
64 branches can raise this GUC.
Default: 64
Range: 1 – 65536
-- Allow up to 128 grouping set branches
SET pg_trickle.max_grouping_set_branches = 128;
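For example, a CUBE over seven columns expands to 2^7 = 128 branches and would be rejected at the default ceiling of 64, so the GUC must be raised first. A sketch with a hypothetical sales table:

```sql
SET pg_trickle.max_grouping_set_branches = 128;
SELECT pgtrickle.create_stream_table(
    name     => 'sales_cube',
    query    => 'SELECT region, country, city, store, product, category,
                 brand, sum(amount) AS total
                 FROM sales
                 GROUP BY CUBE (region, country, city, store, product,
                                category, brand)',
    schedule => '60s'
);
```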
pg_trickle.volatile_function_policy
Controls how volatile functions in defining queries are handled for DIFFERENTIAL and IMMEDIATE modes.
| Value | Behaviour |
|---|---|
| reject | (Default) Volatile functions cause an ERROR at stream table creation time. |
| warn | Volatile functions emit a WARNING but creation proceeds. Delta correctness is not guaranteed. |
| allow | Volatile functions are permitted silently. Use only when you understand that delta computation may produce incorrect results. |
Default: reject
Context: SUSET (superuser session-level)
-- Allow volatile functions with a warning
SET pg_trickle.volatile_function_policy = 'warn';
-- Allow volatile functions silently
SET pg_trickle.volatile_function_policy = 'allow';
Note: Volatile functions (e.g., random(), clock_timestamp()) produce different values on each evaluation. In DIFFERENTIAL/IMMEDIATE modes, the delta computation assumes deterministic functions, so volatile functions may cause stale or incorrect rows. FULL mode is unaffected since it recomputes from scratch every time.
pg_trickle.unlogged_buffers
Create new change buffer tables as UNLOGGED to reduce WAL amplification
from CDC trigger inserts.
| Value | Behaviour |
|---|---|
| false | (Default) Change buffers are WAL-logged. Crash-safe — no data loss on crash recovery. |
| true | New change buffers are created as UNLOGGED. Eliminates WAL writes for trigger-inserted rows, reducing WAL amplification by ~30%. Trade-off: buffers are truncated on crash recovery; affected stream tables automatically receive a FULL refresh on the next scheduler cycle. |
Default: false
Context: SUSET (superuser session-level)
-- Enable UNLOGGED buffers for new stream tables
SET pg_trickle.unlogged_buffers = true;
Crash recovery: After a PostgreSQL crash or standby restart, UNLOGGED buffer tables are automatically truncated by PostgreSQL. The pg_trickle scheduler detects this condition and enqueues a FULL refresh for each affected stream table on the next tick. During the window between crash recovery and FULL refresh completion, stream table data may be stale.
Standby replicas: UNLOGGED tables are not replicated to standbys. Stream tables on read replicas will be stale after any standby restart until the next FULL refresh completes on the primary.
Converting existing buffers: This GUC only affects newly created change buffer tables. To convert existing logged buffers, use:
SELECT pgtrickle.convert_buffers_to_unlogged();
This function acquires an ACCESS EXCLUSIVE lock on each buffer table. Run it during a low-traffic maintenance window.
pg_trickle.max_parse_depth
Maximum recursion depth for the query parser's tree visitors (G13-SD).
Prevents stack-overflow crashes on pathological queries with deeply nested
subqueries, CTEs, or set operations. When the limit is exceeded, the
parser returns a QueryTooComplex error instead of crashing.
Default: 64
Range: 1 – 10000
-- Raise the limit for deeply nested queries
SET pg_trickle.max_parse_depth = 128;
pg_trickle.ivm_topk_max_limit
Maximum LIMIT value for TopK stream tables in IMMEDIATE mode.
TopK queries exceeding this threshold are rejected because the inline
micro-refresh (recomputing top-K rows on every DML statement) adds
latency proportional to LIMIT. Set to 0 to disable TopK in
IMMEDIATE mode entirely.
Default: 1000
Range: 0 – 1000000
-- Allow TopK up to LIMIT 5000 in IMMEDIATE mode
SET pg_trickle.ivm_topk_max_limit = 5000;
pg_trickle.ivm_recursive_max_depth
Maximum recursion depth for WITH RECURSIVE queries in IMMEDIATE mode.
The semi-naive evaluation injects a __pgt_depth counter column into the
recursive SQL; iteration stops when the counter reaches this limit. Protects
against infinite recursion in pathological graphs.
Default: 100
Range: 1 – 10000
-- Allow deeper recursion for large hierarchies
SET pg_trickle.ivm_recursive_max_depth = 500;
Parallel Refresh
These settings control whether and how the scheduler dispatches refresh work to multiple dynamic background workers instead of processing stream tables sequentially. See PLAN_PARALLELISM.md for the design.
Note: Parallel refresh is new in v0.4.0 and defaults to 'off'. Enable it via pg_trickle.parallel_refresh_mode after validating your workload.
pg_trickle.parallel_refresh_mode
Controls whether the scheduler dispatches refresh work to dynamic background workers.
| Property | Value |
|---|---|
| Type | text |
| Default | 'off' |
| Values | 'off', 'dry_run', 'on' |
| Context | SUSET |
| Restart Required | No |
- 'off' (default): Sequential execution. All stream tables are refreshed one at a time in topological order by the single scheduler background worker. This is the proven, stable default.
- 'dry_run': The scheduler computes execution units and logs dispatch decisions (unit keys, ready-queue contents, budget) but still executes refreshes inline. Useful for previewing parallel behaviour without actually spawning workers.
- 'on': True parallel refresh. The coordinator builds an execution-unit DAG, dispatches ready units to dynamic background workers, and respects both the per-database cap (max_concurrent_refreshes) and the cluster-wide cap (max_dynamic_refresh_workers).
-- Preview parallel dispatch decisions without changing runtime behaviour
SET pg_trickle.parallel_refresh_mode = 'dry_run';
-- Enable parallel refresh
SET pg_trickle.parallel_refresh_mode = 'on';
pg_trickle.max_dynamic_refresh_workers
Cluster-wide cap on concurrently active pg_trickle dynamic refresh workers.
| Property | Value |
|---|---|
| Type | int |
| Default | 4 |
| Range | 0 – 64 |
| Context | SUSET |
| Restart Required | No |
This is distinct from pg_trickle.max_concurrent_refreshes (per-database
cap). When multiple databases each have their own scheduler, this GUC
prevents them from overcommitting the shared PostgreSQL
max_worker_processes budget.
Worker-budget planning: Each dynamic refresh worker consumes one
max_worker_processes slot. In addition, pg_trickle uses one slot for
the launcher and one per-database scheduler. Ensure:
max_worker_processes >= pg_trickle launchers (1)
+ pg_trickle schedulers (1 per database)
+ max_dynamic_refresh_workers
+ autovacuum workers
+ parallel query workers
+ other extensions
A typical small deployment (1–2 databases, 4 parallel workers) needs at
least max_worker_processes = 16. The E2E test Docker image uses 128.
-- Allow up to 8 concurrent refresh workers cluster-wide
SET pg_trickle.max_dynamic_refresh_workers = 8;
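Putting the budget formula into numbers (a hypothetical sizing; adjust the slot counts for your own autovacuum and parallel-query settings):

```sql
-- 2 pg_trickle databases + 8 dynamic workers:
--   1 launcher + 2 schedulers + 8 dynamic workers = 11 slots,
--   plus headroom for autovacuum and parallel query workers.
ALTER SYSTEM SET max_worker_processes = 24;   -- requires a server restart
ALTER SYSTEM SET pg_trickle.max_dynamic_refresh_workers = 8;
```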
pg_trickle.max_concurrent_refreshes
Per-database dispatch cap for parallel refresh workers.
| Property | Value |
|---|---|
| Type | int |
| Default | 4 |
| Range | 1 – 32 |
| Context | SUSET |
| Restart Required | No |
When parallel_refresh_mode = 'on', this limits how many execution units
a single database coordinator may have in-flight at the same time. In
sequential mode (parallel_refresh_mode = 'off'), this setting has no
effect.
The effective concurrent refreshes for a database is:
min(max_concurrent_refreshes, max_dynamic_refresh_workers - workers_used_by_other_dbs)
-- Allow up to 8 concurrent refreshes in this database
SET pg_trickle.max_concurrent_refreshes = 8;
pg_trickle.per_database_worker_quota
Per-database dynamic refresh worker quota for multi-tenant cluster isolation.
| Property | Value |
|---|---|
| Type | int |
| Default | 0 (disabled) |
| Range | 0 – 64 |
| Context | SUSET |
| Restart Required | No |
When greater than 0, each per-database scheduler limits itself to this many
concurrently active dynamic refresh workers drawn from the shared
max_dynamic_refresh_workers pool. This prevents a single busy database
from starving others in multi-tenant clusters.
Burst capacity: when the cluster is lightly loaded (active workers
< 80% of max_dynamic_refresh_workers), a database may temporarily
exceed its quota by up to 50% to absorb sudden change backlogs. The burst
is reclaimed automatically within 1 scheduler cycle once global load rises
back above the 80% threshold.
Priority dispatch: within each dispatch tick, IMMEDIATE-trigger closures are dispatched before all other unit kinds, ensuring transactional consistency requirements are always met first, even under quota pressure.
-- Limit the analytics DB to 4 base workers (bursts to 6 when cluster is idle)
ALTER DATABASE analytics SET pg_trickle.per_database_worker_quota = 4;
-- Give the reporting DB only 2 base workers
ALTER DATABASE reporting SET pg_trickle.per_database_worker_quota = 2;
SELECT pg_reload_conf();
When per_database_worker_quota = 0 (the default), this feature is
disabled and all databases share the max_dynamic_refresh_workers pool
on a first-come-first-served basis, bounded per coordinator by
max_concurrent_refreshes.
Note: Set this GUC per-database with ALTER DATABASE rather than globally with ALTER SYSTEM, so different databases can have different quotas.
Advanced / Internal
pg_trickle.change_buffer_schema
Schema name for change-buffer tables created by the trigger-based CDC pipeline.
Default: 'pgtrickle_changes'
Change buffer tables are named <schema>.changes_<oid> where <oid> is
the source table's OID. Placing them in a dedicated schema keeps them out
of the public namespace.
SET pg_trickle.change_buffer_schema = 'my_change_buffers';
pg_trickle.foreign_table_polling
Enable polling-based change detection for foreign table sources. When
enabled, the scheduler periodically re-executes the foreign table query
and computes deltas via snapshot comparison (EXCEPT ALL). Foreign tables
cannot use trigger or WAL-based CDC, so this is the only mechanism for
incremental maintenance.
Default: false
SET pg_trickle.foreign_table_polling = true;
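The snapshot-comparison mechanism can be pictured as two EXCEPT ALL queries. An illustrative sketch, not pg_trickle's internal SQL; remote_orders and its _prev shadow table are hypothetical names:

```sql
-- Rows added since the previous poll:
SELECT * FROM remote_orders          -- current foreign-table contents
EXCEPT ALL
SELECT * FROM remote_orders_prev;    -- shadow copy from the previous poll

-- Rows removed since the previous poll:
SELECT * FROM remote_orders_prev
EXCEPT ALL
SELECT * FROM remote_orders;
```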
pg_trickle.matview_polling
Enable polling-based CDC for materialized views. When enabled, materialized
views referenced in defining queries are supported via snapshot-comparison
(the same mechanism as foreign table polling). A local shadow table stores
the previous state; EXCEPT ALL computes the delta on each refresh cycle.
| Property | Value |
|---|---|
| Type | boolean |
| Default | false |
| Context | SUSET (superuser) |
| Restart required | No |
SET pg_trickle.matview_polling = true;
pg_trickle.cdc_trigger_mode
Controls the CDC trigger granularity: statement (default) or row.
statement uses statement-level AFTER triggers with transition tables
(NEW TABLE / OLD TABLE). A single invocation per DML statement processes
all affected rows in one bulk INSERT ... SELECT, giving 50–80% less
write-side overhead for bulk UPDATE/DELETE. Single-row DML is unaffected.
row uses the legacy per-row trigger approach (pg_trickle < 0.4.0 behavior).
Changing this setting takes effect for newly installed CDC triggers. Call
pgtrickle.rebuild_cdc_triggers() to migrate existing stream tables.
| Property | Value |
|---|---|
| Type | string |
| Default | 'statement' |
| Valid values | statement, row |
| Context | SUSET (superuser) |
| Restart required | No |
-- Switch to statement-level triggers (default, recommended)
SET pg_trickle.cdc_trigger_mode = 'statement';
-- After changing, rebuild existing triggers:
SELECT pgtrickle.rebuild_cdc_triggers();
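For reference, a statement-level trigger with transition tables has this general shape. This is an illustrative sketch using hypothetical names (capture_inserts, change_buffer), not the trigger pg_trickle actually installs:

```sql
CREATE FUNCTION capture_inserts() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
  -- One bulk INSERT ... SELECT per DML statement, however many rows
  INSERT INTO change_buffer (pk, op)
  SELECT id, 'I' FROM new_rows;
  RETURN NULL;
END $$;

CREATE TRIGGER orders_cdc_ins
  AFTER INSERT ON orders
  REFERENCING NEW TABLE AS new_rows
  FOR EACH STATEMENT EXECUTE FUNCTION capture_inserts();
```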
pg_trickle.tick_watermark_enabled
Cap CDC consumption to the WAL LSN at scheduler tick start. When enabled
(default), each scheduler tick captures pg_current_wal_lsn() at its start
and prevents any refresh from consuming WAL changes beyond that LSN. This
bounds cross-source staleness without requiring user configuration.
Disable only if you need stream tables to always advance to the latest available LSN.
| Property | Value |
|---|---|
| Type | boolean |
| Default | true |
| Context | SUSET (superuser) |
| Restart required | No |
-- Disable tick watermark bounding
SET pg_trickle.tick_watermark_enabled = false;
pg_trickle.watermark_holdback_timeout
Maximum seconds a user-provided watermark may remain un-advanced before
being considered stuck. When a watermark group contains a source whose
watermark has not been advanced within this timeout, downstream stream
tables in that group are paused (refresh is skipped) and a
pgtrickle_alert NOTIFY with watermark_stuck event is emitted.
When the stuck watermark is advanced again (via advance_watermark()), the
pause is automatically lifted and a watermark_resumed event is emitted.
Set to 0 to disable stuck-watermark detection (default). Useful values depend on your ETL
pipeline cadence: for a pipeline that loads every 5 minutes, a timeout of 600 (10 min)
gives a safety margin.
| Property | Value |
|---|---|
| Type | integer |
| Default | 0 (disabled) |
| Min | 0 |
| Max | 86400 (24 hours) |
| Context | SUSET (superuser) |
| Restart required | No |
-- Set stuck-watermark timeout to 10 minutes
ALTER SYSTEM SET pg_trickle.watermark_holdback_timeout = 600;
SELECT pg_reload_conf();
NOTIFY payloads:
{"event":"watermark_stuck","group":"order_pipeline","source_oid":16385,"age_secs":620}
{"event":"watermark_resumed","source_oid":16385}
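Operators can subscribe to these events on the pgtrickle_alert channel from any session:

```sql
LISTEN pgtrickle_alert;
-- Notifications arrive asynchronously; in psql they are printed
-- after the next command completes.
```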
pg_trickle.spill_threshold_blocks
Temp blocks written threshold for spill detection. After each differential
MERGE, pg_trickle queries pg_stat_statements for the temp_blks_written
metric. If the value exceeds this threshold, the refresh is considered a
spill.
After spill_consecutive_limit consecutive spills, the scheduler forces a
FULL refresh for that stream table to prevent repeated expensive
differential merges.
Requires the pg_stat_statements extension to be installed. Set to 0 to
disable spill detection (default).
| Property | Value |
|---|---|
| Type | integer |
| Default | 0 (disabled) |
| Min | 0 |
| Max | 100000000 |
| Context | SUSET (superuser) |
| Restart required | No |
-- Enable spill detection: flag > 1000 temp blocks as a spill
ALTER SYSTEM SET pg_trickle.spill_threshold_blocks = 1000;
SELECT pg_reload_conf();
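Spill detection depends on pg_stat_statements being available; the usual setup steps (standard PostgreSQL, independent of pg_trickle) are:

```sql
-- Add pg_stat_statements to shared_preload_libraries in postgresql.conf
-- (keeping any existing entries), restart the server, then:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
```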
pg_trickle.spill_consecutive_limit
Number of consecutive spilling differential refreshes before the scheduler automatically forces a FULL refresh. Resets after any non-spilling refresh.
Only effective when spill_threshold_blocks > 0.
| Property | Value |
|---|---|
| Type | integer |
| Default | 3 |
| Min | 1 |
| Max | 100 |
| Context | SUSET (superuser) |
| Restart required | No |
-- Force FULL after 5 consecutive spills (default: 3)
ALTER SYSTEM SET pg_trickle.spill_consecutive_limit = 5;
SELECT pg_reload_conf();
pg_trickle.log_merge_sql
Log the generated MERGE SQL template on every refresh cycle. When enabled,
the MERGE SQL template built during differential refresh is emitted to the
PostgreSQL server log at LOG level.
Intended for debugging MERGE query generation only. Do not enable in production — the output is verbose and includes the full SQL for every refresh.
| Property | Value |
|---|---|
| Type | boolean |
| Default | false |
| Context | SUSET (superuser) |
| Restart required | No |
SET pg_trickle.log_merge_sql = true;
Guardrails & Diagnostics
These GUCs control safety thresholds and diagnostic warnings.
pg_trickle.fuse_default_ceiling
Global default change-count ceiling for the fuse circuit breaker. When a
stream table has fuse_mode = 'on' or 'auto' and no per-ST fuse_ceiling,
this value is used. If pending changes exceed this count, the fuse blows
and the stream table is suspended (status = SUSPENDED).
Set to 0 to disable the global default (per-ST ceilings still apply).
| Property | Value |
|---|---|
| Type | integer |
| Default | 0 (disabled) |
| Range | 0 - 2,000,000,000 |
| Context | SUSET (superuser) |
| Restart required | No |
-- Set global fuse ceiling to 1 million rows
SET pg_trickle.fuse_default_ceiling = 1000000;
pg_trickle.delta_amplification_threshold
Delta amplification detection threshold (output/input ratio). When a
DIFFERENTIAL refresh produces more than this multiple of the input delta
rows, a WARNING is emitted so operators can identify pathological join
fan-out or many-to-many amplification.
Set to 0.0 to disable.
| Property | Value |
|---|---|
| Type | float |
| Default | 0.0 (disabled) |
| Range | 0.0 - 100,000.0 |
| Context | SUSET (superuser) |
| Restart required | No |
-- Warn when delta output is 10x the input
SET pg_trickle.delta_amplification_threshold = 10.0;
pg_trickle.algebraic_drift_reset_cycles
Differential cycles between automatic full recomputes for algebraic
aggregates. After this many differential refresh cycles, stream tables
with algebraic aggregates (AVG, STDDEV, VAR) are automatically
reinitialized to reset accumulated floating-point drift in auxiliary
columns.
Set to 0 to disable automatic resets.
| Property | Value |
|---|---|
| Type | integer |
| Default | 0 (disabled) |
| Range | 0 - 100,000 |
| Context | SUSET (superuser) |
| Restart required | No |
-- Reset algebraic aggregates every 10,000 cycles
SET pg_trickle.algebraic_drift_reset_cycles = 10000;
pg_trickle.agg_diff_cardinality_threshold
Estimated GROUP BY cardinality threshold for aggregate warnings.
At create_stream_table time, if the defining query uses aggregates
(e.g. SUM, COUNT, AVG) in DIFFERENTIAL mode and the estimated
group cardinality is below this threshold, a WARNING is emitted suggesting
FULL or AUTO mode.
Set to 0 to disable the warning.
| Property | Value |
|---|---|
| Type | integer |
| Default | 0 (disabled) |
| Range | 0 - 100,000,000 |
| Context | SUSET (superuser) |
| Restart required | No |
-- Warn when GROUP BY cardinality is below 100
SET pg_trickle.agg_diff_cardinality_threshold = 100;
Connection Pooler
v0.19.0+ (STAB-1).
pg_trickle.connection_pooler_mode
Cluster-wide connection pooler compatibility override.
| Property | Value |
|---|---|
| Type | string |
| Default | 'off' |
| Valid values | 'off', 'transaction', 'session' |
| Context | SUSET |
| Value | Behaviour |
|---|---|
| off (default) | The per-ST pooler_compatibility_mode governs behaviour |
| transaction | Globally disables prepared-statement reuse and suppresses NOTIFY emissions (PgBouncer transaction-pool compatibility) |
| session | Explicit opt-in to session mode (same as off today; reserved for future use) |
See Connection Pooler Compatibility for deployment guidance.
-- Enable transaction-mode pooler compatibility globally
SET pg_trickle.connection_pooler_mode = 'transaction';
History & Retention
v0.19.0+ (DB-5).
pg_trickle.history_retention_days
Number of days to retain rows in pgtrickle.pgt_refresh_history.
| Property | Value |
|---|---|
| Type | integer |
| Default | 90 |
| Min | 0 (disabled) |
| Max | 36500 (~100 years) |
| Context | SUSET |
The scheduler runs a daily background cleanup that deletes rows older than
this many days. Set to 0 to disable automatic cleanup (history grows
unbounded — monitor disk usage).
-- Keep 30 days of refresh history
SET pg_trickle.history_retention_days = 30;
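If automatic cleanup is disabled, history can be pruned manually. A sketch, assuming the start_time column used by the monitoring queries later in these docs:

```sql
-- Manual pruning when history_retention_days = 0:
-- keep ~90 days of refresh history; adjust the interval to taste.
DELETE FROM pgtrickle.pgt_refresh_history
WHERE start_time < now() - interval '90 days';
```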
Circular Dependencies
v0.7.0+ — Circular dependency support is available for safe monotone cycles in DIFFERENTIAL mode. These settings control whether cycles are allowed at all, and how many fixpoint iterations the scheduler attempts before surfacing a non-convergence error.
pg_trickle.allow_circular
Master switch for circular (cyclic) stream table dependencies. When false
(default), creating a stream table that would introduce a cycle in the
dependency graph is rejected with a CycleDetected error. When true,
monotone cycles — those containing only safe operators (joins, filters,
projections, UNION ALL, INTERSECT, EXISTS) — are allowed.
Non-monotone operators (Aggregate, EXCEPT, Window functions, NOT EXISTS) always block cycle creation regardless of this setting, because they cannot guarantee convergence to a fixed point.
Default: false
SET pg_trickle.allow_circular = true;
pg_trickle.max_fixpoint_iterations
Maximum number of iterations per strongly connected component (SCC) before the scheduler declares non-convergence and marks all SCC members as ERROR. Prevents runaway loops in circular dependency chains.
For most practical use cases (transitive closure, graph reachability), convergence happens in 2–5 iterations. The default of 100 provides ample headroom.
Default: 100
Range: 1 – 10000
SET pg_trickle.max_fixpoint_iterations = 50;
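As an illustrative sketch only (the edges table and column names are hypothetical, and the exact query shape a given release accepts may differ), a self-referencing reachability stream table built from the monotone operators listed above might look like:

```sql
SET pg_trickle.allow_circular = true;

-- Hypothetical edges(src, dst) base table. The stream table references
-- itself, forming a one-node cycle of monotone operators (join +
-- UNION ALL) that the fixpoint scheduler can iterate to convergence.
SELECT pgtrickle.create_stream_table(
    name     => 'reach',
    query    => 'SELECT src, dst FROM edges
                 UNION ALL
                 SELECT r.src, e.dst
                 FROM reach r JOIN edges e ON e.src = r.dst',
    schedule => '30s'
);
```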
pg_trickle.dog_feeding_auto_apply
Added in v0.20.0 (DF-G1).
Controls whether the dog-feeding analytics stream tables can automatically adjust stream table configuration.
| Value | Behaviour |
|---|---|
| off (default) | Advisory only — no automatic changes. Dog-feeding stream tables produce analytics that operators and dashboards can read, but nothing is applied automatically. |
| threshold_only | After each 10-minute auto-apply cycle, reads df_threshold_advice. If a recommendation has HIGH confidence and the recommended threshold differs from the current threshold by more than 5%, applies ALTER STREAM TABLE ... SET auto_threshold = <recommended>. Changes are logged with initiated_by = 'DOG_FEED'. |
| full | Same as threshold_only, plus applies scheduling hints from df_scheduling_interference (future enhancement). |
Default: off
-- Enable threshold auto-apply.
SET pg_trickle.dog_feeding_auto_apply = 'threshold_only';
-- Check current setting.
SHOW pg_trickle.dog_feeding_auto_apply;
Prerequisites: Dog-feeding stream tables must be created first via
SELECT pgtrickle.setup_dog_feeding(). If the stream tables do not exist,
the auto-apply worker is a no-op.
Rate limiting: At most one threshold change per stream table per 10 minutes.
Audit trail: All auto-apply changes are recorded in pgt_refresh_history
with initiated_by = 'DOG_FEED' and a SKIP action describing the old and new
threshold values.
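To review what the auto-apply worker has changed, the audit trail can be queried directly. A sketch — the start_time column matches the monitoring queries elsewhere in these docs, but the full column set is not guaranteed:

```sql
-- Recent auto-applied threshold changes
SELECT *
FROM pgtrickle.pgt_refresh_history
WHERE initiated_by = 'DOG_FEED'
ORDER BY start_time DESC
LIMIT 20;
```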
GUC Interaction Matrix
Some GUC variables interact with or depend on each other. The table below documents these cross-dependencies to help avoid misconfiguration.
| GUC A | GUC B | Interaction |
|---|---|---|
event_driven_wake | scheduler_interval_ms | When event_driven_wake = true, the scheduler wakes on NOTIFY and scheduler_interval_ms serves only as the poll-based fallback interval. Lowering scheduler_interval_ms below 100 ms with event-driven wake enabled adds little value and wastes CPU. |
event_driven_wake | wake_debounce_ms | wake_debounce_ms only takes effect when event_driven_wake = true. It coalesces rapid-fire notifications during bulk DML. Set higher (50–100 ms) for write-heavy workloads, lower (5–10 ms) for latency-sensitive workloads. |
auto_backoff | min_schedule_seconds | auto_backoff stretches the effective interval up to 8× the configured schedule, but never below min_schedule_seconds. If min_schedule_seconds is high, backoff has limited room to operate. |
auto_backoff | default_schedule_seconds | The backoff multiplier is applied to default_schedule_seconds (or the per-ST override); raising this value gives backoff a wider range. |
parallel_refresh_mode | max_concurrent_refreshes | parallel_refresh_mode = 'on' dispatches independent STs to parallel workers, up to max_concurrent_refreshes per database. Setting max_concurrent_refreshes = 1 effectively disables parallelism even when the mode is 'on'. |
parallel_refresh_mode | max_dynamic_refresh_workers | max_dynamic_refresh_workers is a cluster-wide cap across all databases. If you have 4 databases each wanting 4 concurrent refreshes, set this to ≥16 (or accept queuing). |
max_dynamic_refresh_workers | per_database_worker_quota | When per_database_worker_quota > 0, each database claims at most that many workers from the shared max_dynamic_refresh_workers pool. Set per_database_worker_quota to max_dynamic_refresh_workers / n_databases for equal sharing. Burst to 150% is allowed when the cluster is < 80% loaded. |
differential_max_change_ratio | fuse_default_ceiling | Both guard against large change batches but at different levels: differential_max_change_ratio triggers a FULL refresh fallback (proportional to table size), while fuse_default_ceiling halts refresh entirely (absolute row count). The fuse fires first if the change count exceeds it, regardless of the ratio. |
block_source_ddl | DDL operations | When true, DDL on source tables (ALTER TABLE, DROP COLUMN) is blocked by an event trigger. Disable temporarily with SET pg_trickle.block_source_ddl = false before schema migrations, then re-enable. |
cdc_mode | cdc_trigger_mode | cdc_trigger_mode ('statement' / 'row') only applies when CDC is trigger-based. When cdc_mode = 'wal' (or after auto-transition to WAL), cdc_trigger_mode is irrelevant. |
cdc_mode | wal_transition_timeout | wal_transition_timeout only applies when cdc_mode = 'auto'. It controls how many seconds to wait for the first WAL-based refresh to succeed before falling back to triggers. |
cleanup_use_truncate | compact_threshold | cleanup_use_truncate = true uses TRUNCATE to clear consumed change buffers (fastest, acquires AccessExclusiveLock briefly). compact_threshold controls when fully-consumed buffers are compacted via DELETE — only relevant when TRUNCATE is disabled. |
buffer_partitioning | compact_threshold | In 'auto' mode, compact_threshold serves as the promotion trigger: if a buffer exceeds this many rows in a single refresh cycle, it is promoted to RANGE(lsn) partitioned mode. Lowering compact_threshold makes auto-promotion more sensitive. |
allow_circular | max_fixpoint_iterations | max_fixpoint_iterations is only evaluated when allow_circular = true. It caps the number of convergence iterations for circular dependency chains. |
ivm_topk_max_limit | TopK queries | Queries with LIMIT > ivm_topk_max_limit fall back to FULL refresh instead of the optimized TopK path. Raise this if you have legitimate large TopK queries. |
ivm_recursive_max_depth | Recursive CTEs | Recursive expansion beyond ivm_recursive_max_depth iterations is terminated with a warning and falls back to FULL refresh. Set to 0 to disable the guard (not recommended). |
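When tuning interacting GUCs, it helps to review every pg_trickle setting in one place before changing anything. This uses only the standard pg_settings catalog:

```sql
-- Current value, built-in default, and required privilege level
-- for all pg_trickle GUCs.
SELECT name, setting, boot_val, context
FROM pg_settings
WHERE name LIKE 'pg_trickle.%'
ORDER BY name;
```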
Tuning Profiles
Three named profiles for common deployment patterns. Copy the relevant
settings into your postgresql.conf and adjust to taste.
Low-Latency Profile
Goal: Minimize end-to-end latency from base table write to stream table update. Best for dashboards, real-time analytics, and operational monitoring.
# Event-driven wake — sub-50ms median latency
pg_trickle.event_driven_wake = true
pg_trickle.wake_debounce_ms = 5 # aggressive: 5ms coalesce
# Fast scheduling
pg_trickle.scheduler_interval_ms = 200 # poll fallback (rarely used)
pg_trickle.min_schedule_seconds = 1
pg_trickle.default_schedule_seconds = 1
# Parallel refresh for independent STs
pg_trickle.parallel_refresh_mode = 'on'
pg_trickle.max_concurrent_refreshes = 4
# Lean merge
pg_trickle.merge_planner_hints = true
pg_trickle.merge_work_mem_mb = 128 # more memory = fewer disk sorts
pg_trickle.cleanup_use_truncate = true
pg_trickle.use_prepared_statements = true
# Guardrails
pg_trickle.auto_backoff = true # prevent CPU runaway
pg_trickle.fuse_default_ceiling = 0 # disabled — latency over safety
pg_trickle.block_source_ddl = true
High-Throughput Profile
Goal: Maximize rows-per-second processed across many stream tables under heavy write load. Accepts slightly higher latency in exchange for better batching and resource efficiency.
# Batched wake — coalesce writes into larger deltas
pg_trickle.event_driven_wake = true
pg_trickle.wake_debounce_ms = 50 # 50ms coalesce window
# Relaxed scheduling
pg_trickle.scheduler_interval_ms = 2000 # 2-second poll fallback
pg_trickle.min_schedule_seconds = 2
pg_trickle.default_schedule_seconds = 5
# Heavy parallelism
pg_trickle.parallel_refresh_mode = 'on'
pg_trickle.max_concurrent_refreshes = 8
pg_trickle.max_dynamic_refresh_workers = 8
# Aggressive performance
pg_trickle.merge_planner_hints = true
pg_trickle.merge_work_mem_mb = 256 # large work_mem for big deltas
pg_trickle.merge_seqscan_threshold = 0.01 # allow seq scans for >1% changes
pg_trickle.cleanup_use_truncate = true
pg_trickle.use_prepared_statements = true
pg_trickle.auto_backoff = true
pg_trickle.buffer_partitioning = 'auto' # O(1) cleanup for hot buffers
# Safety for bulk workloads
pg_trickle.fuse_default_ceiling = 500000 # pause on >500K changes
pg_trickle.differential_max_change_ratio = 0.25 # FULL fallback at 25%
pg_trickle.block_source_ddl = true
Resource-Constrained Profile
Goal: Minimize CPU and memory footprint for small instances, shared hosting, or development environments. Accepts higher latency and slower throughput.
# Poll-based only — no NOTIFY overhead
pg_trickle.event_driven_wake = false
pg_trickle.scheduler_interval_ms = 5000 # 5-second poll
# Conservative scheduling
pg_trickle.min_schedule_seconds = 5
pg_trickle.default_schedule_seconds = 10
# Minimal parallelism
pg_trickle.parallel_refresh_mode = 'off' # single-threaded refresh
pg_trickle.max_concurrent_refreshes = 1
pg_trickle.max_dynamic_refresh_workers = 1
# Conservative memory
pg_trickle.merge_work_mem_mb = 32
pg_trickle.merge_planner_hints = true
pg_trickle.cleanup_use_truncate = true
# Tight guardrails
pg_trickle.auto_backoff = true
pg_trickle.fuse_default_ceiling = 100000
pg_trickle.differential_max_change_ratio = 0.10
pg_trickle.block_source_ddl = true
pg_trickle.buffer_alert_threshold = 500000
Complete postgresql.conf Example
# Required
shared_preload_libraries = 'pg_trickle'
# Essential
pg_trickle.enabled = true
pg_trickle.cdc_mode = 'auto'
pg_trickle.scheduler_interval_ms = 1000
pg_trickle.min_schedule_seconds = 1
pg_trickle.default_schedule_seconds = 1
pg_trickle.max_consecutive_errors = 3
# WAL CDC
pg_trickle.wal_transition_timeout = 300
pg_trickle.slot_lag_warning_threshold_mb = 100
pg_trickle.slot_lag_critical_threshold_mb = 1024
# Refresh performance
pg_trickle.differential_max_change_ratio = 0.15
pg_trickle.merge_planner_hints = true
pg_trickle.merge_work_mem_mb = 64
pg_trickle.cleanup_use_truncate = true
pg_trickle.use_prepared_statements = true
pg_trickle.user_triggers = 'auto'
# Guardrails & limits
pg_trickle.block_source_ddl = false
pg_trickle.buffer_alert_threshold = 1000000
pg_trickle.compact_threshold = 100000
pg_trickle.buffer_partitioning = 'off'
pg_trickle.max_grouping_set_branches = 64
pg_trickle.max_parse_depth = 64
pg_trickle.ivm_topk_max_limit = 1000
pg_trickle.ivm_recursive_max_depth = 100
# Circular dependencies (v0.7.0+)
pg_trickle.allow_circular = false # master switch
pg_trickle.max_fixpoint_iterations = 100 # convergence limit
# Parallel refresh (v0.4.0+, default off)
pg_trickle.parallel_refresh_mode = 'off' # 'off' | 'dry_run' | 'on'
pg_trickle.max_dynamic_refresh_workers = 4 # cluster-wide worker cap
pg_trickle.max_concurrent_refreshes = 4 # per-database dispatch cap
# Advanced / internal
pg_trickle.change_buffer_schema = 'pgtrickle_changes'
pg_trickle.foreign_table_polling = false
Runtime Configuration
All GUC variables can be changed at runtime by a superuser:
-- View current settings
SHOW pg_trickle.enabled;
SHOW pg_trickle.parallel_refresh_mode;
-- Enable parallel refresh for current session
SET pg_trickle.parallel_refresh_mode = 'on';
-- Change persistently (requires reload)
ALTER SYSTEM SET pg_trickle.scheduler_interval_ms = 500;
SELECT pg_reload_conf();
Further Reading
- INSTALL.md — Installation and initial configuration
- ARCHITECTURE.md — System architecture overview
- SQL_REFERENCE.md — Complete function reference
Scaling Guide
This document provides guidance for scaling pg_trickle to hundreds of stream tables and beyond. It covers worker pool sizing, scheduler tuning, and diagnostic queries for identifying bottlenecks.
Architecture Overview
pg_trickle uses a two-tier background worker model:
- Launcher — one per server. Scans pg_database every 10 seconds, spawns per-database schedulers, and auto-restarts crashed workers.
- Per-database scheduler — one per database. Wakes every scheduler_interval_ms (default: 1 s), reads DAG changes from shared memory, consumes CDC buffers, and dispatches refreshes.
When parallel_refresh_mode = 'on', the scheduler dispatches refresh work to a
pool of dynamic background workers instead of running refreshes inline.
Worker Pool Sizing
| Deployment Size | Stream Tables | Recommended max_dynamic_refresh_workers | Notes |
|---|---|---|---|
| Small | 1–20 | 2–4 | Default (4) is usually sufficient |
| Medium | 20–100 | 4–8 | Monitor worker saturation |
| Large | 100–200 | 8–16 | Enable tiered scheduling |
| Very Large | 200+ | 16–32 | Tune per-database quotas |
Budget Formula
Worker slots are drawn from max_worker_processes, which is shared with
autovacuum, parallel queries, and other extensions:
max_worker_processes >= launchers(1)
+ schedulers(N_databases)
+ max_dynamic_refresh_workers
+ autovacuum_max_workers
+ max_parallel_workers
+ other_extensions
Example for 200 STs across 2 databases with 16 workers:
# postgresql.conf
max_worker_processes = 40
pg_trickle.max_dynamic_refresh_workers = 16
pg_trickle.max_concurrent_refreshes = 8
pg_trickle.per_database_worker_quota = 8
pg_trickle.parallel_refresh_mode = 'on'
Tiered Scheduling
For deployments with 50+ stream tables, enable tiered scheduling to reduce scheduler overhead:
pg_trickle.tiered_scheduling = on -- default since v0.12.0
The scheduler classifies stream tables into tiers based on change frequency:
| Tier | Schedule Multiplier | Behavior |
|---|---|---|
| Hot | 1× (base interval) | Tables with frequent changes |
| Warm | 2× | Tables with moderate changes |
| Cold | 10× | Tables with rare changes |
| Frozen | skip | Tables with no recent changes |
This reduces the CPU cost of the scheduling loop itself, which can become a bottleneck at 200+ STs when every table is polled every cycle.
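As a worked example (the base interval here is assumed, not prescribed): with a 30-second base schedule, the multipliers above yield effective refresh intervals of roughly:

```
Hot:    30 s   (1× base)
Warm:   60 s   (2×)
Cold:   300 s  (10×)
Frozen: refresh skipped
```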
Dispatch Priority
When multiple stream tables are ready simultaneously, the scheduler dispatches in priority order:
- IMMEDIATE closures — time-critical refresh requests
- Atomic groups / Repeatable-read groups / Fused chains — multi-ST units
- Singletons — individual stream tables
- Cyclic SCCs — strongly-connected components
Within each priority band, the tier sort applies (Hot > Warm > Cold).
Per-Database Quotas and Burst
When per_database_worker_quota > 0, each database gets a guaranteed slice
of the worker pool:
- Normal load (cluster < 80% capacity): database can burst to 150% of its quota using idle capacity from other databases.
- High load (cluster ≥ 80% capacity): strict quota enforcement.
This prevents a single high-traffic database from starving others.
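For example, to share 16 workers equally across 4 databases using the equal-sharing guidance above:

```
# 16 cluster-wide workers / 4 databases = quota of 4 each.
# Each database may burst to 6 workers (150% of quota) while the
# cluster is under 80% load; at or above 80%, the quota of 4 is strict.
pg_trickle.max_dynamic_refresh_workers = 16
pg_trickle.per_database_worker_quota = 4
```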
Monitoring
Worker Pool Status
SELECT * FROM pgtrickle.worker_pool_status();
-- Returns: active_workers, max_workers, per_db_cap, parallel_mode
Active Job Details
SELECT * FROM pgtrickle.parallel_job_status(300);
-- Returns recent jobs (last 300s): status, duration, worker PID, etc.
Health Summary
SELECT * FROM pgtrickle.health_summary();
-- Returns: total/active/error/suspended/stale counts, scheduler status, cache hit rate
Buffer Backlog Check
SELECT * FROM pgtrickle.change_buffer_sizes()
ORDER BY row_count DESC
LIMIT 20;
Identifying Bottlenecks
Is the scheduler loop the bottleneck?
-- If queue_depth is consistently > 10 while workers are not saturated,
-- the scheduler loop is the bottleneck. Reduce scheduler_interval_ms.
SELECT active_workers, max_workers,
       (SELECT COUNT(*) FROM pgtrickle.parallel_job_status(5)
         WHERE status = 'QUEUED') AS queue_depth
FROM pgtrickle.worker_pool_status();
Are workers saturated?
-- If active_workers == max_workers consistently, increase the pool.
SELECT active_workers >= max_workers AS saturated
FROM pgtrickle.worker_pool_status();
Which STs take the longest?
SELECT st.pgt_schema, st.pgt_name,
AVG(EXTRACT(EPOCH FROM (h.end_time - h.start_time))) AS avg_sec,
MAX(EXTRACT(EPOCH FROM (h.end_time - h.start_time))) AS max_sec,
COUNT(*) AS refreshes
FROM pgtrickle.pgt_refresh_history h
JOIN pgtrickle.pgt_stream_tables st ON st.pgt_id = h.pgt_id
WHERE h.start_time > now() - interval '1 hour'
AND h.status = 'COMPLETED'
GROUP BY st.pgt_schema, st.pgt_name
ORDER BY avg_sec DESC
LIMIT 20;
Tuning Profiles
Low-Latency (< 50 ms P99)
pg_trickle.scheduler_interval_ms = 200
pg_trickle.event_driven_wake = on
pg_trickle.parallel_refresh_mode = 'on'
pg_trickle.max_dynamic_refresh_workers = 8
pg_trickle.tiered_scheduling = on
High-Throughput (200+ STs)
pg_trickle.scheduler_interval_ms = 500
pg_trickle.parallel_refresh_mode = 'on'
pg_trickle.max_dynamic_refresh_workers = 16
pg_trickle.max_concurrent_refreshes = 8
pg_trickle.per_database_worker_quota = 8
pg_trickle.tiered_scheduling = on
pg_trickle.merge_work_mem_mb = 128
Resource-Constrained (4 CPU / 8 GB RAM)
pg_trickle.scheduler_interval_ms = 2000
pg_trickle.parallel_refresh_mode = 'on'
pg_trickle.max_dynamic_refresh_workers = 2
pg_trickle.max_concurrent_refreshes = 2
pg_trickle.tiered_scheduling = on
pg_trickle.delta_work_mem_cap_mb = 256
pg_trickle.merge_work_mem_mb = 32
Profiling Methodology
To profile worker utilization at scale, run a test with 200+ stream tables
and max_workers set to 4, 8, and 16 in turn. Collect the following metrics
at 1-second intervals:
-- Worker pool utilization over time
SELECT now() AS ts,
(SELECT active_workers FROM pgtrickle.worker_pool_status()) AS active,
(SELECT max_workers FROM pgtrickle.worker_pool_status()) AS pool_size,
(SELECT COUNT(*) FROM pgtrickle.parallel_job_status(5)
WHERE status = 'QUEUED') AS queue_depth;
Plot active / pool_size (utilization) and queue_depth over time.
If utilization is consistently > 90% with non-zero queue depth, the pool
is undersized. If utilization is < 50%, the pool is oversized and consuming
max_worker_processes slots unnecessarily.
Known Scaling Limits
| Resource | Practical Limit | Bottleneck |
|---|---|---|
| Stream tables per DB | ~500 | Scheduler loop CPU |
| Worker pool size | 64 | GUC max |
| Change buffer rows | max_buffer_rows (default 1M) | Disk I/O |
| Template cache size | 128 entries (L1) | Evictions increase at >128 STs |
| DAG depth | ~20 levels | Topological sort + cascade latency |
Read Replicas & Hot Standby
Added in v0.19.0 (SCAL-1 / STAB-2).
pg_trickle is a primary-only extension. Stream tables are maintained by the background scheduler through DML (INSERT, DELETE, MERGE), which is only possible on the primary server.
Behaviour on Replicas
When the pg_trickle shared library is loaded on a read replica (physical standby or streaming replica):
- The launcher worker detects pg_is_in_recovery() = true and enters a sleep loop, checking every 30 seconds for promotion.
- Upon promotion (e.g. pg_promote()), the launcher resumes normal operation and spawns per-database schedulers.
- Manual refresh calls (pgtrickle.refresh_stream_table()) on a replica are rejected with a clear error message.
Recommended Setup
- Include pg_trickle in shared_preload_libraries on both primary and replicas. This ensures immediate availability after failover without a restart.
- Stream tables are read-queryable on replicas via physical replication — the storage tables are regular PostgreSQL tables that replicate normally.
- Monitor the replication lag to estimate stream table staleness on replicas.
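One way to do that from the primary is the standard pg_stat_replication view (no pg_trickle-specific assumptions):

```sql
-- Byte and time lag per standby — an upper bound on how stale
-- replica-side stream table reads can be.
SELECT application_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes,
       replay_lag
FROM pg_stat_replication;
```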
CNPG & Kubernetes Operations
Added in v0.19.0 (SCAL-3).
CloudNativePG (CNPG) is the recommended Kubernetes operator for running pg_trickle. The extension is packaged as a custom container image that extends the official PostgreSQL image.
Container Image
Build the pg_trickle image using the provided Dockerfiles:
# GHCR image (multi-stage build)
docker build -f Dockerfile.ghcr -t pg-trickle:latest .
# Or use the CNPG-specific Dockerfile
docker build -f cnpg/Dockerfile.ext -t pg-trickle-cnpg:latest .
CNPG Cluster Configuration
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
name: pg-trickle-cluster
spec:
instances: 3
imageName: your-registry/pg-trickle:0.19.0
postgresql:
shared_preload_libraries:
- pg_trickle
parameters:
pg_trickle.enabled: "true"
pg_trickle.scheduler_interval_ms: "1000"
pg_trickle.max_concurrent_refreshes: "4"
# STAB-1: If using PgBouncer sidecar in transaction mode:
# pg_trickle.connection_pooler_mode: "transaction"
Operational Notes
- Failover: pg_trickle detects promotion automatically (see Read Replicas above). After CNPG promotes a replica, the launcher starts within 30 seconds.
- Scaling replicas: Stream table data replicates to all replicas via physical replication. No pg_trickle-specific configuration needed on replicas.
- Backup: Use CNPG's built-in Barman backup. pg_trickle's catalog tables are included automatically. See Backup & Restore.
- Monitoring: The Prometheus endpoint (pgtrickle.health_summary()) is compatible with CNPG's monitoring sidecar. See the Grafana dashboards in monitoring/grafana/.
Installation Guide
Prerequisites
| Requirement | Version |
|---|---|
| PostgreSQL | 18.x |
Building from source additionally requires Rust 1.85+ (edition 2024) and pgrx 0.17.x. Pre-built release artifacts only need a running PostgreSQL 18.x instance.
Installing from a Pre-built Release
1. Download the release archive
Download the archive for your platform from the GitHub Releases page:
| Platform | Archive |
|---|---|
| Linux x86_64 | pg_trickle-<ver>-pg18-linux-amd64.tar.gz |
| macOS Apple Silicon | pg_trickle-<ver>-pg18-macos-arm64.tar.gz |
| Windows x64 | pg_trickle-<ver>-pg18-windows-amd64.zip |
Optionally verify the checksum against SHA256SUMS.txt from the same release:
sha256sum -c SHA256SUMS.txt
2. Extract and install
Linux / macOS:
tar xzf pg_trickle-<ver>-pg18-linux-amd64.tar.gz
cd pg_trickle-<ver>-pg18-linux-amd64
sudo cp lib/*.so "$(pg_config --pkglibdir)/"
sudo cp extension/*.control extension/*.sql "$(pg_config --sharedir)/extension/"
Windows (PowerShell):
Expand-Archive pg_trickle-<ver>-pg18-windows-amd64.zip -DestinationPath .
cd pg_trickle-<ver>-pg18-windows-amd64
Copy-Item lib\*.dll "$(pg_config --pkglibdir)\"
Copy-Item extension\* "$(pg_config --sharedir)\extension\"
3. Using with CloudNativePG (Kubernetes)
pg_trickle is distributed as an OCI extension image for use with CloudNativePG Image Volume Extensions.
Requirements: Kubernetes 1.33+, CNPG 1.28+, PostgreSQL 18.
# Pull the extension image
docker pull ghcr.io/grove/pg_trickle-ext:<ver>
See cnpg/cluster-example.yaml and cnpg/database-example.yaml for complete Cluster and Database deployment examples.
4. GHCR Docker image (recommended for local dev)
pg_trickle is published as a ready-to-run Docker image on the GitHub Container
Registry. PostgreSQL 18.3 and pg_trickle are pre-installed and all sensible GUC
defaults (wal_level, shared_preload_libraries, memory, scheduler settings)
are baked in — no configuration file editing needed.
docker pull ghcr.io/grove/pg_trickle:latest
docker run --rm \
-e POSTGRES_PASSWORD=secret \
-p 5432:5432 \
ghcr.io/grove/pg_trickle:latest
CREATE EXTENSION pg_trickle; runs automatically on the default postgres
database at first startup.
Available tags:
| Tag | Meaning |
|---|---|
| latest | Most recent release |
| pg18 | Floating alias for the latest PostgreSQL 18 build |
| <version>-pg18.3 | Immutable tag, e.g. 0.13.0-pg18.3 |
Override any GUC at runtime without rebuilding:
docker run --rm \
-e POSTGRES_PASSWORD=secret \
-p 5432:5432 \
ghcr.io/grove/pg_trickle:latest \
-c shared_buffers=2GB -c work_mem=64MB -c effective_cache_size=6GB
For persistent data, mount a volume:
docker run -d \
--name pg_trickle \
-e POSTGRES_PASSWORD=secret \
-p 5432:5432 \
-v pg_trickle_data:/var/lib/postgresql/data \
ghcr.io/grove/pg_trickle:latest
Alternative — manual mount from a release archive:
If you prefer to use the stock postgres:18.3 image rather than the pre-built
image, extract the extension files from a release archive and mount them:
tar xzf pg_trickle-<ver>-pg18-linux-amd64.tar.gz
cd pg_trickle-<ver>-pg18-linux-amd64
docker run --rm \
-v $PWD/lib/pg_trickle.so:/usr/lib/postgresql/18/lib/pg_trickle.so:ro \
-v $PWD/extension/:/tmp/ext/:ro \
-e POSTGRES_PASSWORD=postgres \
postgres:18.3 \
sh -c 'cp /tmp/ext/* /usr/share/postgresql/18/extension/ && \
exec postgres -c shared_preload_libraries=pg_trickle'
Installing from PGXN
pg_trickle is published on the PostgreSQL Extension Network (PGXN). Installing via PGXN compiles the extension from source, so the Rust toolchain and pgrx are required.
1. Install prerequisites
# Rust toolchain
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source "$HOME/.cargo/env"
# pgrx build tool
cargo install --locked cargo-pgrx --version 0.17.0
cargo pgrx init --pg18 "$(pg_config --bindir)/pg_config"
2. Install the pgxn client
pip install pgxnclient
3. Install pg_trickle
pgxn install pg_trickle
To install a specific version:
pgxn install pg_trickle=0.10.0
Note: After installation, follow the PostgreSQL Configuration and Extension Installation steps below.
Building from Source
1. Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
2. Install pgrx
cargo install --locked cargo-pgrx --version 0.17.0
cargo pgrx init --pg18 $(pg_config --bindir)/pg_config
3. Build the Extension
# Development build (faster compilation)
cargo pgrx install --pg-config $(pg_config --bindir)/pg_config
# Release build (optimized, for production)
cargo pgrx install --release --pg-config $(pg_config --bindir)/pg_config
# Package for deployment (creates installable artifacts)
cargo pgrx package --pg-config $(pg_config --bindir)/pg_config
PostgreSQL Configuration
Add the following to postgresql.conf before starting PostgreSQL:
# Required — loads the extension shared library at server start
shared_preload_libraries = 'pg_trickle'
# Must accommodate the pg_trickle launcher + one scheduler per database
# with pg_trickle installed + optional parallel refresh workers.
#
# WARNING: when this limit is reached, the launcher silently skips
# databases it cannot spawn a scheduler for and retries every 5 minutes.
# Those databases stop refreshing without any visible error.
# Check PostgreSQL logs for:
# WARNING: pg_trickle launcher: could not spawn scheduler for database '...'
#
# Formula:
# 1 (launcher) + N (one scheduler per DB) + max_dynamic_refresh_workers
# + autovacuum_max_workers + parallel query workers + other extensions
#
# 32 is a safe starting point for most clusters:
max_worker_processes = 32
Note: wal_level = logical and max_replication_slots are not required. The extension uses lightweight row-level triggers for CDC, not logical replication.
Restart PostgreSQL after modifying these settings:
pg_ctl restart -D /path/to/data
# or
systemctl restart postgresql
Extension Installation
Connect to the target database and run:
CREATE EXTENSION pg_trickle;
This creates:
- The pgtrickle schema with catalog tables and SQL functions
- The pgtrickle_changes schema for change buffer tables
- Event triggers for DDL tracking
- The pgtrickle.pg_stat_stream_tables monitoring view
Verification
After installation, verify everything is working:
-- Check the extension version
SELECT extname, extversion FROM pg_extension WHERE extname = 'pg_trickle';
-- Or get a full status overview (includes version, scheduler state, stream table count)
SELECT * FROM pgtrickle.pgt_status();
Quick functional test
CREATE TABLE test_source (id INT PRIMARY KEY, val TEXT);
INSERT INTO test_source VALUES (1, 'hello');
SELECT pgtrickle.create_stream_table(
'test_st',
'SELECT id, val FROM test_source',
'1m',
'FULL'
);
SELECT * FROM test_st;
-- Should return: 1 | hello
-- Clean up
SELECT pgtrickle.drop_stream_table('test_st');
DROP TABLE test_source;
Upgrading
To upgrade pg_trickle to a newer version without losing data:
For comprehensive upgrade instructions, version-specific notes, troubleshooting, and rollback procedures, see docs/UPGRADING.md.
1. Install the new extension files
Follow the same steps as Installing from a Pre-built Release to overwrite the shared library and SQL files with the new version. You do not need to drop the extension from your databases first.
Linux / macOS:
tar xzf pg_trickle-<new-ver>-pg18-linux-amd64.tar.gz
cd pg_trickle-<new-ver>-pg18-linux-amd64
sudo cp lib/*.so "$(pg_config --pkglibdir)/"
sudo cp extension/*.control extension/*.sql "$(pg_config --sharedir)/extension/"
2. Restart PostgreSQL (when required)
If the shared library ABI has changed, restart PostgreSQL before proceeding so the
new .so/.dll is loaded. The release notes for each version will call this out
explicitly when a restart is required.
pg_ctl restart -D /path/to/data
# or
systemctl restart postgresql
3. Apply the schema migration in each database
Connect to every database where pg_trickle is installed and run:
-- Upgrade to the latest bundled version
ALTER EXTENSION pg_trickle UPDATE;
-- Or upgrade to a specific version
ALTER EXTENSION pg_trickle UPDATE TO '<new-version>';
PostgreSQL uses the versioned SQL migration scripts bundled with the release
(e.g. pg_trickle--0.2.3--0.3.0.sql, pg_trickle--0.3.0--0.4.0.sql) to
apply catalog and SQL-surface changes.
PostgreSQL automatically chains these scripts when you run ALTER EXTENSION pg_trickle UPDATE. The command is a no-op when no migration script is needed
for a given release.
You can confirm the active version afterwards:
SELECT extversion FROM pg_extension WHERE extname = 'pg_trickle';
Coming soon: A future release will include a helper function (
pgtrickle.upgrade()) that automates steps 2–3 across all databases in the cluster and validates catalog integrity after the migration. Until then, the manual steps above are the supported upgrade path.
Uninstallation
-- Drop all stream tables first
SELECT pgtrickle.drop_stream_table(pgt_schema || '.' || pgt_name)
FROM pgtrickle.pgt_stream_tables;
-- Drop the extension
DROP EXTENSION pg_trickle CASCADE;
Remove pg_trickle from shared_preload_libraries in postgresql.conf and restart PostgreSQL.
Troubleshooting
Unit tests crash on macOS 26+ (symbol not found in flat namespace)
macOS 26 (Tahoe) changed dyld to eagerly resolve all flat-namespace symbols
at binary load time. pgrx extensions reference PostgreSQL server-internal
symbols (e.g. CacheMemoryContext, SPI_connect) via the
-Wl,-undefined,dynamic_lookup linker flag. These symbols are normally
provided by the postgres executable when the extension is loaded as a shared
library — but for cargo test --lib there is no postgres process, so the test
binary aborts immediately:
dyld[66617]: symbol not found in flat namespace '_CacheMemoryContext'
This affects local development only — integration tests, E2E tests, and the extension itself running inside PostgreSQL are unaffected.
The fix is built into the just test-unit recipe. It automatically:
- Compiles a tiny C stub library (`scripts/pg_stub.c` → `target/libpg_stub.dylib`) that provides NULL/no-op definitions for the ~28 PostgreSQL symbols.
- Compiles the test binary with `--no-run`.
- Runs the binary with `DYLD_INSERT_LIBRARIES` pointing to the stub.
The stub is only built on macOS 26+. On Linux or older macOS, just test-unit
runs cargo test --lib directly with no changes.
Note: The stub symbols are never called — unit tests exercise pure Rust logic only. If a test accidentally calls a PostgreSQL function it will crash with a NULL dereference (the desired fail-fast behavior).
If you run unit tests without just (e.g. directly via cargo test --lib),
you can use the wrapper script instead:
./scripts/run_unit_tests.sh pg18
# With test name filter:
./scripts/run_unit_tests.sh pg18 -- test_parse_basic
Extension fails to load
Ensure shared_preload_libraries = 'pg_trickle' is set and PostgreSQL has been restarted (not just reloaded). The extension requires shared memory initialization at startup.
Background worker not starting
Check that max_worker_processes is high enough. In sequential mode (default) pg_trickle needs one slot per database with stream tables. With parallel refresh enabled (pg_trickle.parallel_refresh_mode = 'on') it additionally needs max_dynamic_refresh_workers slots (default 4) shared across all databases.
See the worker-budget formula in CONFIGURATION.md for sizing guidance.
Check logs for details
The extension logs at various levels. Enable debug logging for more detail:
SET client_min_messages TO debug1;
Next Steps
- Getting Started — Create your first stream table in 5 minutes
- Pre-Deployment Checklist — Complete checklist for production deployments
- Best-Practice Patterns — Common data modeling patterns
- Configuration Reference — All GUC variables and tuning
Upgrading pg_trickle
This guide covers upgrading pg_trickle from one version to another.
Quick Upgrade (Recommended)
-- 1. Check current version
SELECT extversion FROM pg_extension WHERE extname = 'pg_trickle';
-- 2. Replace the binary files (.so/.dylib, .control, .sql)
-- See the installation method below for your platform.
-- 3. Restart PostgreSQL (required for shared library changes)
-- sudo systemctl restart postgresql
-- 4. Run the upgrade in each database that has pg_trickle installed
ALTER EXTENSION pg_trickle UPDATE;
-- 5. Verify the upgrade
SELECT pgtrickle.version();
SELECT * FROM pgtrickle.health_check();
Step-by-Step Instructions
1. Check Current Version
SELECT extversion FROM pg_extension WHERE extname = 'pg_trickle';
-- Returns your current installed version, e.g. '0.9.0'
2. Install New Binary Files
Replace the extension files in your PostgreSQL installation directory. The method depends on how you originally installed pg_trickle.
From release tarball:
# Replace <new-version> with the target release, for example 0.2.3
curl -LO https://github.com/getretake/pg_trickle/releases/download/v<new-version>/pg_trickle-<new-version>-pg18-linux-amd64.tar.gz
tar xzf pg_trickle-<new-version>-pg18-linux-amd64.tar.gz
# Copy files to PostgreSQL directories
sudo cp pg_trickle-<new-version>-pg18-linux-amd64/lib/* $(pg_config --pkglibdir)/
sudo cp pg_trickle-<new-version>-pg18-linux-amd64/extension/* $(pg_config --sharedir)/extension/
From source (cargo-pgrx):
cargo pgrx install --release
3. Restart PostgreSQL
The shared library (.so / .dylib) is loaded at server start via
shared_preload_libraries. A restart is required for the new binary to
take effect.
sudo systemctl restart postgresql
# or on macOS with Homebrew:
brew services restart postgresql@18
4. Run ALTER EXTENSION UPDATE
Connect to each database where pg_trickle is installed and run:
ALTER EXTENSION pg_trickle UPDATE;
This executes the upgrade migration scripts in order (for example,
pg_trickle--0.5.0--0.6.0.sql → pg_trickle--0.6.0--0.7.0.sql).
PostgreSQL automatically determines the full upgrade chain from your current
version to the new default_version.
5. Verify the Upgrade
-- Check version
SELECT pgtrickle.version();
-- Run health check
SELECT * FROM pgtrickle.health_check();
-- Verify stream tables are intact
SELECT * FROM pgtrickle.stream_tables_info;
-- Test a refresh
SELECT pgtrickle.refresh_stream_table('your_stream_table');
Version-Specific Notes
0.1.3 → 0.2.0
New functions added:
- `pgtrickle.list_sources(name)` — list source tables for a stream table
- `pgtrickle.change_buffer_sizes()` — inspect CDC change buffer sizes
- `pgtrickle.health_check()` — diagnostic health checks
- `pgtrickle.dependency_tree()` — visualize the dependency DAG
- `pgtrickle.trigger_inventory()` — audit CDC triggers
- `pgtrickle.refresh_timeline(max_rows)` — refresh history
- `pgtrickle.diamond_groups()` — diamond dependency group info
- `pgtrickle.version()` — extension version string
- `pgtrickle.pgt_ivm_apply_delta(...)` — internal IVM delta application
- `pgtrickle.pgt_ivm_handle_truncate(...)` — internal TRUNCATE handler
- `pgtrickle._signal_launcher_rescan()` — internal launcher signal
No schema changes to pgtrickle.pgt_stream_tables or
pgtrickle.pgt_dependencies catalog tables.
No breaking changes. All v0.1.3 functions and views continue to work as before.
0.2.0 → 0.2.1
Three new catalog columns added to pgtrickle.pgt_stream_tables:
| Column | Type | Default | Purpose |
|---|---|---|---|
| topk_offset | INT | NULL | Pre-provisioned for paged TopK OFFSET (activated in v0.2.2) |
| has_keyless_source | BOOLEAN NOT NULL | FALSE | EC-06: keyless source flag; switches apply strategy from MERGE to counted DELETE |
| function_hashes | TEXT | NULL | EC-16: stores MD5 hashes of referenced function bodies for change detection |
The migration script (pg_trickle--0.2.0--0.2.1.sql) adds these columns
via ALTER TABLE … ADD COLUMN IF NOT EXISTS.
No breaking changes. All v0.2.0 functions, views, and event triggers continue to work as before.
What's also new:
- Upgrade migration safety infrastructure (scripts, CI, E2E tests)
- GitHub Pages book expansion (6 new documentation pages)
- User-facing upgrade guide (this document)
0.2.1 → 0.2.2
No catalog table DDL changes. The topk_offset column needed for paged
TopK was already added in v0.2.1.
Two SQL function updates are applied by `pg_trickle--0.2.1--0.2.2.sql`:

- `pgtrickle.create_stream_table(...)`
  - default `schedule` changes from `'1m'` to `'calculated'`
  - default `refresh_mode` changes from `'DIFFERENTIAL'` to `'AUTO'`
- `pgtrickle.alter_stream_table(...)`
  - adds the optional `query` parameter used by ALTER QUERY support
Because PostgreSQL stores argument defaults and function signatures in
pg_proc, the migration script must DROP FUNCTION and recreate both
signatures during ALTER EXTENSION ... UPDATE.
Behavioral notes:
- Existing stream tables keep their current catalog values. The migration only changes the defaults used by future `create_stream_table(...)` calls.
- Existing applications can opt a table into the new defaults explicitly via `pgtrickle.alter_stream_table(...)` after the upgrade.
- After installing the new binary and restarting PostgreSQL, the scheduler now warns if the shared library version and SQL-installed extension version do not match. This helps detect stale `.so`/`.dylib` files after partial upgrades.
0.2.2 → 0.2.3
One new catalog column is added to pgtrickle.pgt_stream_tables:
| Column | Type | Default | Purpose |
|---|---|---|---|
| requested_cdc_mode | TEXT | NULL | Optional per-stream-table CDC override ('auto', 'trigger', 'wal') |
The upgrade script also recreates two SQL functions:
- `pgtrickle.create_stream_table(...)` — adds the optional `cdc_mode` parameter
- `pgtrickle.alter_stream_table(...)` — adds the optional `cdc_mode` parameter
Monitoring view updates:
- `pgtrickle.pg_stat_stream_tables` gains the `cdc_modes` column
- `pgtrickle.pgt_cdc_status` is added for per-source CDC visibility
Because PostgreSQL stores function signatures and defaults in pg_proc, the
upgrade script drops and recreates both lifecycle functions during
ALTER EXTENSION ... UPDATE.
0.6.0 → 0.7.0
One new catalog column is added to pgtrickle.pgt_stream_tables:
| Column | Type | Default | Purpose |
|---|---|---|---|
| last_fixpoint_iterations | INT | NULL | Records how many rounds the last circular-dependency fixpoint run required |
Two new catalog tables are added:
| Table | Purpose |
|---|---|
| pgtrickle.pgt_watermarks | Stores per-source watermark progress reported by external loaders |
| pgtrickle.pgt_watermark_groups | Stores groups of sources that must stay temporally aligned before refresh |
The upgrade script also updates and adds SQL functions:
- Recreates `pgtrickle.pgt_status()` so the result includes `scc_id`
- Adds `pgtrickle.pgt_scc_status()` for circular-dependency monitoring
- Adds `pgtrickle.advance_watermark(source, watermark)`
- Adds `pgtrickle.create_watermark_group(name, sources[], tolerance_secs)`
- Adds `pgtrickle.drop_watermark_group(name)`
- Adds `pgtrickle.watermarks()`
- Adds `pgtrickle.watermark_groups()`
- Adds `pgtrickle.watermark_status()`
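A sketch of how the watermark functions might be combined, following the signatures listed above. The group and table names are placeholders, and the exact argument types are an assumption:

```sql
-- Hypothetical: keep two fact tables temporally aligned before dependent refreshes
SELECT pgtrickle.create_watermark_group(
  'nightly_facts',                             -- group name (placeholder)
  ARRAY['public.orders', 'public.shipments'],  -- member sources
  300                                          -- tolerance in seconds
);

-- An external loader reports progress after each committed batch
SELECT pgtrickle.advance_watermark('public.orders', now());

-- Inspect alignment before the scheduler releases dependent refreshes
SELECT * FROM pgtrickle.watermark_status();
```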
Behavioral notes:
- Circular stream table dependencies can now run to convergence when `pg_trickle.allow_circular = true` and every member of the cycle is safe for monotone DIFFERENTIAL refresh.
- The scheduler can now hold back refreshes until related source tables are aligned within a configured watermark tolerance.
- Existing non-circular stream tables continue to work as before. The new catalog objects are additive.
0.7.0 → 0.8.0
No catalog schema changes. The upgrade migration script contains no DDL.
New operational features:
- `pg_dump`/`pg_restore` support: stream tables are now safely exported and re-connected after restore without manual intervention.
- Connection pooler opt-in was introduced at the per-stream level (superseded by the more comprehensive `pooler_compatibility_mode` added in v0.10.0).
No breaking changes. All v0.7.0 functions, views, and event triggers continue to work as before.
0.8.0 → 0.9.0
No catalog schema DDL changes to pgtrickle.pgt_stream_tables or the
dependency catalog.
New API function added:
- `pgtrickle.restore_stream_tables()` — re-installs CDC triggers and re-registers stream tables after a `pg_restore` from a `pg_dump`.
Hidden auxiliary columns for AVG / STDDEV / VAR aggregates. Stream tables
using these aggregates will automatically receive hidden __pgt_aux_*
columns on the next refresh after upgrading. No manual action is needed —
pg_trickle detects missing auxiliary columns and performs a single full
reinitialise to add them.
Behavioral notes:
- COUNT, SUM, and AVG now update in constant time (O(changed rows)) instead of rescanning the whole group.
- STDDEV and VAR variants likewise update in O(changed rows) via hidden sum-of-squares auxiliary columns.
- MIN/MAX still requires a group rescan only when the deleted value is the current extreme.
- Refresh groups (`create_refresh_group`, `drop_refresh_group`, `refresh_groups()`) are available starting from this version.
0.9.0 → 0.10.0
Two new catalog columns added to pgtrickle.pgt_stream_tables:
| Column | Type | Default | Purpose |
|---|---|---|---|
| pooler_compatibility_mode | BOOLEAN NOT NULL | FALSE | Disables prepared statements and NOTIFY for this stream table — required when accessed through PgBouncer in transaction-pool mode |
| refresh_tier | TEXT NOT NULL | 'hot' | Tiered scheduling tier: hot, warm, cold, or frozen |
One new catalog table is added:
| Table | Purpose |
|---|---|
| pgtrickle.pgt_refresh_groups | Stores refresh groups for snapshot-consistent multi-table refresh |
The upgrade script also updates and adds SQL functions:
- `pgtrickle.create_stream_table(...)` gains the `pooler_compatibility_mode` parameter
- `pgtrickle.create_stream_table_if_not_exists(...)` likewise
- `pgtrickle.create_or_replace_stream_table(...)` likewise
- `pgtrickle.alter_stream_table(...)` likewise
- Adds `pgtrickle.create_refresh_group(name, members, isolation)`
- Adds `pgtrickle.drop_refresh_group(name)`
- Adds `pgtrickle.refresh_groups()` — lists all declared groups
Behavioral notes:
- `pooler_compatibility_mode` defaults to `false`. Existing stream tables are unaffected. Enable it only for stream tables accessed through PgBouncer transaction-mode pooling.
- `pg_trickle.auto_backoff` now defaults to `on` (was `off`). The backoff threshold is raised from 80 % → 95 % and the maximum slowdown is capped at 8× (was 64×). If you relied on the old opt-in behaviour, set `pg_trickle.auto_backoff = off` explicitly.
- `diamond_consistency` now defaults to `'atomic'` for new stream tables (was `'none'`). Existing stream tables keep their current setting.
- The scheduler now uses row-level locking for concurrency control instead of session-level advisory locks, making pg_trickle compatible with PgBouncer transaction-pool and similar connection poolers.
- Statistical aggregates (`CORR`, `COVAR_*`, `REGR_*`) now update incrementally using Welford-style accumulation, no longer requiring a group rescan.
- Materialized view sources can now be used in DIFFERENTIAL mode when `pg_trickle.matview_polling = on` is set.
- Recursive CTE stream tables with DELETE/UPDATE now use the Delete-and-Rederive algorithm (O(delta) instead of O(n)).
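As an illustrative sketch of opting an existing stream table into pooler-safe mode, assuming `alter_stream_table` accepts the parameter by name as the other named-argument calls in these docs do (the table name is a placeholder):

```sql
-- Hypothetical: disable prepared statements / NOTIFY for a table served
-- through PgBouncer in transaction-pool mode
SELECT pgtrickle.alter_stream_table(
  'active_orders',
  pooler_compatibility_mode => true
);
```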
0.10.0 → 0.11.0
New catalog columns added to pgtrickle.pgt_stream_tables:
| Column | Type | Default | Purpose |
|---|---|---|---|
| effective_refresh_mode | TEXT | NULL | Actual refresh mode used in the last cycle (FULL / DIFFERENTIAL / APPEND_ONLY / TOP_K / NO_DATA); populated by the scheduler after each completed refresh |
| fuse_mode | TEXT NOT NULL | 'off' | Circuit-breaker mode: off, on, or auto |
| fuse_state | TEXT NOT NULL | 'armed' | Circuit-breaker state: armed, blown, or disabled |
| fuse_ceiling | BIGINT | NULL | Maximum change-row count that can pass through in one refresh before the fuse blows; NULL = unlimited |
| fuse_sensitivity | INT | NULL | Sensitivity multiplier for auto-fuse detection |
| blown_at | TIMESTAMPTZ | NULL | Timestamp when the fuse last triggered |
| blow_reason | TEXT | NULL | Human-readable reason the fuse blew |
| st_partition_key | TEXT | NULL | Partition key column for declaratively partitioned stream tables; NULL = not partitioned |
Updated function signatures — existing calls continue to work because new parameters all have defaults:
- `pgtrickle.create_stream_table(...)` gains `partition_by TEXT DEFAULT NULL`
- `pgtrickle.create_stream_table_if_not_exists(...)` likewise
- `pgtrickle.create_or_replace_stream_table(...)` likewise
- `pgtrickle.alter_stream_table(...)` gains `fuse TEXT DEFAULT NULL`, `fuse_ceiling BIGINT DEFAULT NULL`, `fuse_sensitivity INT DEFAULT NULL`
New functions:
- `pgtrickle.reset_fuse(name TEXT, action TEXT DEFAULT 'apply')` — clear a blown fuse and resume scheduling
- `pgtrickle.fuse_status()` — returns circuit-breaker state for every stream table
- `pgtrickle.explain_refresh_mode(name TEXT)` — shows configured mode, effective mode, and the reason for any downgrade
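A sketch of a recovery workflow using these functions, following the signatures listed above; `'orders_agg'` is a placeholder stream table name:

```sql
-- Hypothetical fuse-recovery workflow
SELECT * FROM pgtrickle.fuse_status();                       -- find blown fuses
SELECT * FROM pgtrickle.explain_refresh_mode('orders_agg');  -- why was the mode downgraded?
SELECT pgtrickle.reset_fuse('orders_agg');                   -- re-arm and resume scheduling
```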
Behavioral notes:
- Event-driven wake (`pg_trickle.event_driven_wake`) is `on` by default — the background worker now wakes within ~15 ms of a source-table write instead of waiting up to 500 ms.
- Stream-table-to-stream-table chains now refresh incrementally — downstream tables receive a small insert/delete delta rather than cascading full refreshes.
- `pg_trickle.tiered_scheduling` now defaults to `on`.
- Declaratively partitioned stream tables are supported via `partition_by` — the refresh MERGE is automatically restricted to only the changed partitions.
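A sketch of creating a partitioned stream table, assuming the `partition_by TEXT DEFAULT NULL` parameter introduced in this release; table and column names are placeholders:

```sql
-- Hypothetical: a daily rollup partitioned on its date column, so refreshes
-- only MERGE into the partitions that actually changed
SELECT pgtrickle.create_stream_table(
  name         => 'daily_sales',
  query        => 'SELECT sale_date, sum(amount) AS total
                   FROM sales GROUP BY sale_date',
  schedule     => '5m',
  partition_by => 'sale_date'
);
```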
0.11.0 → 0.12.0
No schema changes. This release adds four new diagnostic SQL functions only:
| Function | Returns | Purpose |
|---|---|---|
| pgtrickle.explain_query_rewrite(query TEXT) | TABLE(pass_name TEXT, changed BOOL, sql_after TEXT) | Walk a query through every DVM rewrite pass to see how pg_trickle transforms it |
| pgtrickle.diagnose_errors(name TEXT) | TABLE(event_time TIMESTAMPTZ, error_type TEXT, error_message TEXT, remediation TEXT) | Last 5 FAILED refresh events with error classification and suggested fixes |
| pgtrickle.list_auxiliary_columns(name TEXT) | TABLE(column_name TEXT, data_type TEXT, purpose TEXT) | List all hidden __pgt_* auxiliary columns on a stream table's storage relation |
| pgtrickle.validate_query(query TEXT) | TABLE(valid BOOL, mode TEXT, reason TEXT) | Parse and validate a query for stream-table compatibility without creating one |
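An illustrative dry-run using these diagnostics, following the signatures in the table above; the query text and table names are placeholders:

```sql
-- Check compatibility before creating anything
SELECT * FROM pgtrickle.validate_query(
  'SELECT customer_id, sum(total) AS revenue FROM orders GROUP BY customer_id'
);

-- Inspect how each DVM rewrite pass would transform the same query
SELECT * FROM pgtrickle.explain_query_rewrite(
  'SELECT customer_id, sum(total) AS revenue FROM orders GROUP BY customer_id'
);
```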
Behavioral notes:
- The incremental engine now handles multi-table join deletes correctly — phantom rows after simultaneous deletes from multiple join sides no longer occur.
- Stream-table-to-stream-table row identity is now computed consistently between the change buffer and the downstream table, eliminating stale duplicate rows after upstream UPDATEs.
- `pg_trickle.tiered_scheduling` defaults to `on` (same as 0.11.0 runtime behaviour; this release makes it the explicit default).
0.12.0 → 0.13.0
Ten new catalog columns added to pgtrickle.pgt_stream_tables:
| Column | Type | Default | Purpose |
|---|---|---|---|
| effective_refresh_mode | TEXT | NULL | Computed refresh mode after AUTO resolution |
| fuse_mode | TEXT NOT NULL | 'off' | Fuse configuration: off, auto, or manual |
| fuse_state | TEXT NOT NULL | 'armed' | Current fuse state: armed or blown |
| fuse_ceiling | BIGINT | NULL | Maximum change count before fuse blows |
| fuse_sensitivity | INT | NULL | Consecutive cycles above ceiling before triggering |
| blown_at | TIMESTAMPTZ | NULL | Timestamp when the fuse last blew |
| blow_reason | TEXT | NULL | Reason the fuse blew |
| st_partition_key | TEXT | NULL | Partition key specification (RANGE, LIST, or HASH) |
| max_differential_joins | INT | NULL | Maximum join count for differential mode (auto-fallback to FULL when exceeded) |
| max_delta_fraction | DOUBLE PRECISION | NULL | Maximum delta-to-table ratio for differential mode (auto-fallback to FULL when exceeded) |
All columns use ADD COLUMN IF NOT EXISTS for idempotent upgrades.
Nine new SQL functions (plus one replacement with new signature):
| Function | Purpose |
|---|---|
| pgtrickle.explain_delta(name, format) | Delta SQL query plan inspection |
| pgtrickle.dedup_stats() | MERGE deduplication frequency counters |
| pgtrickle.shared_buffer_stats() | Per-source-buffer observability |
| pgtrickle.explain_refresh_mode(name) | Refresh mode decision explanation |
| pgtrickle.reset_fuse(name) | Reset a blown fuse |
| pgtrickle.fuse_status() | Fuse state across all stream tables |
| pgtrickle.explain_query_rewrite(query) | DVM rewrite pass inspection |
| pgtrickle.diagnose_errors(name) | Error classification and remediation |
| pgtrickle.list_auxiliary_columns(name) | Hidden __pgt_* column listing |
| pgtrickle.validate_query(query) | Query compatibility validation |
| pgtrickle.alter_stream_table(...) | (replaced) — new partition_by parameter |
New GUC variables:
| GUC | Default | Purpose |
|---|---|---|
| pg_trickle.per_database_worker_quota | 0 (auto) | Per-database parallel worker limit |
Behavioral notes:
- Shared change buffers: Multiple stream tables reading from the same source now automatically share a single change buffer. No migration action required — existing per-source buffers continue to work.
- Columnar change tracking: Wide-table UPDATEs that touch only value columns (not GROUP BY / JOIN / WHERE columns) now generate significantly less delta volume. This is fully automatic.
- Auto buffer partitioning: Set `pg_trickle.buffer_partitioning = 'auto'` to let high-throughput buffers self-promote to partitioned mode for O(1) cleanup.
- dbt macros: If you use dbt-pgtrickle, update your macros to the matching v0.13.0 version. New config options: `partition_by`, `fuse`, `fuse_ceiling`, `fuse_sensitivity`.
No breaking changes. All v0.12.0 functions, views, and event triggers continue to work as before.
0.13.0 → 0.14.0
Two new catalog columns added to pgtrickle.pgt_stream_tables:
| Column | Type | Default | Purpose |
|---|---|---|---|
| last_error_message | TEXT | NULL | Error message from the last permanent refresh failure |
| last_error_at | TIMESTAMPTZ | NULL | Timestamp of the last permanent refresh failure |
Updated function signature (return type gained new columns):
- `pgtrickle.st_refresh_stats()` — gains `consecutive_errors`, `schedule`, `refresh_tier`, and `last_error_message` columns. The upgrade script drops and recreates the function. No behavior change for existing callers that ignore unknown columns.
New SQL functions (available immediately after ALTER EXTENSION ... UPDATE):
| Function | Purpose |
|---|---|
| pgtrickle.recommend_refresh_mode(name) | Workload-based refresh mode recommendation with confidence level |
| pgtrickle.refresh_efficiency(name) | Per-table FULL vs. DIFFERENTIAL performance metrics |
| pgtrickle.export_definition(name) | Export stream table as reproducible DROP+CREATE+ALTER DDL |
| pgtrickle.convert_buffers_to_unlogged() | Convert logged change buffers to UNLOGGED |
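An illustrative sketch combining the new functions, following the signatures in the table above; `'active_orders'` is a placeholder name:

```sql
-- Hypothetical tuning-and-backup pass for one stream table
SELECT * FROM pgtrickle.recommend_refresh_mode('active_orders');  -- suggested mode
SELECT * FROM pgtrickle.refresh_efficiency('active_orders');      -- FULL vs. DIFFERENTIAL cost
SELECT pgtrickle.export_definition('active_orders');              -- reproducible DDL
```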
New GUC variables:
| GUC | Default | Purpose |
|---|---|---|
| pg_trickle.planner_aggressive | true | Consolidated switch replacing merge_planner_hints + merge_work_mem_mb |
| pg_trickle.unlogged_buffers | false | Create new change buffers as UNLOGGED (reduces WAL by ~30%) |
| pg_trickle.agg_diff_cardinality_threshold | 1000 | Warn at creation time when GROUP BY cardinality is below this |
Deprecated GUCs (still accepted but ignored at runtime):
- `pg_trickle.merge_planner_hints` → use `pg_trickle.planner_aggressive`
- `pg_trickle.merge_work_mem_mb` → use `pg_trickle.planner_aggressive`
Behavioral notes:
- Error-state circuit breaker: A single permanent refresh failure (e.g. a function that doesn't exist for the column type) now immediately sets the stream table status to `ERROR` with a message stored in `last_error_message`. The scheduler skips `ERROR` tables. Use `pgtrickle.resume_stream_table(name)` followed by `pgtrickle.alter_stream_table(name, query => ...)` to recover.
- Tiered scheduling NOTICE: Demoting a stream table from `hot` to `cold` or `frozen` now emits a NOTICE so operators are aware the effective refresh interval has changed (10× for cold, suspended for frozen).
- SECURITY DEFINER triggers: All CDC trigger functions now run with `SECURITY DEFINER` and an explicit `SET search_path`, hardening against privilege-escalation attacks. This is applied automatically on upgrade — no manual action needed.
- TUI binary: A `pgtrickle` command-line tool is now included in the package. See TUI.md for usage.
No breaking changes. All v0.13.0 functions, views, and event triggers continue to work as before.
Supported Upgrade Paths
The following migration hops are available. PostgreSQL chains them
automatically when you run ALTER EXTENSION pg_trickle UPDATE.
| From | To | Script |
|---|---|---|
| 0.1.3 | 0.2.0 | pg_trickle--0.1.3--0.2.0.sql |
| 0.2.0 | 0.2.1 | pg_trickle--0.2.0--0.2.1.sql |
| 0.2.1 | 0.2.2 | pg_trickle--0.2.1--0.2.2.sql |
| 0.2.2 | 0.2.3 | pg_trickle--0.2.2--0.2.3.sql |
| 0.2.3 | 0.3.0 | pg_trickle--0.2.3--0.3.0.sql |
| 0.3.0 | 0.4.0 | pg_trickle--0.3.0--0.4.0.sql |
| 0.4.0 | 0.5.0 | pg_trickle--0.4.0--0.5.0.sql |
| 0.5.0 | 0.6.0 | pg_trickle--0.5.0--0.6.0.sql |
| 0.6.0 | 0.7.0 | pg_trickle--0.6.0--0.7.0.sql |
| 0.7.0 | 0.8.0 | pg_trickle--0.7.0--0.8.0.sql |
| 0.8.0 | 0.9.0 | pg_trickle--0.8.0--0.9.0.sql |
| 0.9.0 | 0.10.0 | pg_trickle--0.9.0--0.10.0.sql |
| 0.10.0 | 0.11.0 | pg_trickle--0.10.0--0.11.0.sql |
| 0.11.0 | 0.12.0 | pg_trickle--0.11.0--0.12.0.sql |
| 0.12.0 | 0.13.0 | pg_trickle--0.12.0--0.13.0.sql |
| 0.13.0 | 0.14.0 | pg_trickle--0.13.0--0.14.0.sql |
That means any installation currently on 0.1.3 through 0.13.0 can upgrade to 0.14.0 with a single `ALTER EXTENSION pg_trickle UPDATE` once the new binaries are installed and PostgreSQL has been restarted.
Rollback / Downgrade
PostgreSQL does not support automatic extension downgrades. To roll back:
1. Export stream table definitions (if you want to recreate them later):

   cargo run --bin pg_trickle_dump -- --output backup.sql

   Or, if the binary is already installed in your PATH:

   pg_trickle_dump --output backup.sql

   Use `--dsn '<connection string>'` or standard `PG*` / `DATABASE_URL` environment variables when the default local connection parameters are not sufficient.

2. Drop the extension (destroys all stream tables):

   DROP EXTENSION pg_trickle CASCADE;

3. Install the old version and restart PostgreSQL.

4. Recreate the extension at the old version:

   CREATE EXTENSION pg_trickle VERSION '0.1.3';

5. Recreate stream tables from your backup.
Troubleshooting
"function pgtrickle.xxx does not exist" after upgrade
This means the upgrade script is missing a function. Workaround:
-- Check what version PostgreSQL thinks is installed
SELECT extversion FROM pg_extension WHERE extname = 'pg_trickle';
-- If the version looks correct but functions are missing,
-- the upgrade script may be incomplete. Try a clean reinstall:
DROP EXTENSION pg_trickle CASCADE;
CREATE EXTENSION pg_trickle CASCADE;
-- Warning: this destroys all stream tables!
Report this as a bug — upgrade scripts should never silently drop functions.
"could not access file pg_trickle" after restart
The new shared library file was not installed correctly. Verify:
ls -la $(pg_config --pkglibdir)/pg_trickle*
ALTER EXTENSION UPDATE says "already at version X"
PostgreSQL found no pending migration because the `.control` file's `default_version` still matches your currently installed version. This usually means the new `.control` and `.sql` files were not copied into place. Check:
cat $(pg_config --sharedir)/extension/pg_trickle.control
Multi-Database Environments
ALTER EXTENSION UPDATE must be run in each database where pg_trickle
is installed. A common pattern:
for db in $(psql -t -c "SELECT datname FROM pg_database WHERE datname NOT IN ('template0', 'template1')"); do
psql -d "$db" -c "ALTER EXTENSION pg_trickle UPDATE;" 2>/dev/null || true
done
CloudNativePG (CNPG)
For CNPG deployments, see cnpg/README.md for upgrade instructions specific to the Kubernetes operator.
Backup and Restore
Like any standard PostgreSQL extension, pg_trickle supports logical backups via pg_dump and physical backups (via tools like pgBackRest or pg_basebackup).
Because pg_trickle maintains internal state automatically (such as Change Data Capture buffers and DDL event triggers), specific workflows should be followed to ensure a smooth recovery.
Physical Backups (pgBackRest / pg_basebackup)
Physical backups copy the underlying data blocks. These are the most robust backups.
No special steps are needed during restore. When the database comes online, pg_trickle's catalogs, CDC buffers, and internal dependencies exist precisely as they did at the moment the snapshot was taken.
Note for WAL-Mode Users: Physical backups do not export replication slot data by default. If your CDC pipeline was in wal mode, logical slots might not survive the recreation. The pg_trickle scheduler handles missing slots gracefully by temporarily re-enabling table triggers.
Logical Backups (pg_dump / pg_restore)
Logical backups dump your database schema as generic cross-compatible SQL (CREATE TABLE, INSERT, CREATE INDEX).
pg_trickle integrates with pg_dump natively. When restoring these backups (which typically involves sequentially recreating schemas, inserting data into those tables, and lastly applying indexes and triggers), you must follow a precise restore sequence so the extension can rewrite its own internal triggers correctly without conflicting with the plain PostgreSQL commands in the dump.
The Recommended Multi-Stage pg_restore Strategy
The most reliable approach is to use the --section arguments of pg_restore. By breaking the restore into sections, we guarantee that by the time the schema, data, and constraints are created, all required settings and catalog state are already present in the database, and our custom hook DdlEventKind::ExtensionChange intercepts the command and automatically calls pgtrickle.restore_stream_tables() internally.
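A sketch of the three-stage restore using pg_restore's standard `--section` flags; the dump file and database names are placeholders:

```shell
# 1. Schema only (tables, extension, views): pg_trickle's catalog comes back first
pg_restore --section=pre-data  -d restored_db backup.dump

# 2. Bulk data for source tables and stream table storage
pg_restore --section=data      -d restored_db backup.dump

# 3. Indexes, constraints, and triggers: the extension's DDL hook fires here
pg_restore --section=post-data -d restored_db backup.dump
```

The `--section` values `pre-data`, `data`, and `post-data` are standard pg_restore behavior; only the interaction with pg_trickle's event hook is specific to this extension.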
pg_trickle TUI — User Guide
pgtrickle is a terminal tool for managing and monitoring pg_trickle stream
tables. It works in two modes:
- Interactive dashboard — run `pgtrickle` with no arguments to launch a live-updating TUI that shows all your stream tables, their health, dependencies, and configuration.
- One-shot CLI — run `pgtrickle <command>` to perform a single operation and exit. Output goes to stdout in table, JSON, or CSV format. Designed for scripts, CI pipelines, and automation.
Building
The TUI is a standalone Rust binary in the pgtrickle-tui workspace member.
It does not require the PostgreSQL extension to compile — only a Rust
toolchain.
# Build (debug)
cargo build -p pgtrickle-tui
# Build (release, optimized)
cargo build --release -p pgtrickle-tui
# The binary is at:
# target/debug/pgtrickle (debug)
# target/release/pgtrickle (release)
To install it on your PATH:
cargo install --path pgtrickle-tui
Verify:
pgtrickle --version
pgtrickle --help
Requirements
- Rust 2024 edition (1.85+)
- A running PostgreSQL 18 server with the `pg_trickle` extension installed
- Network access to the database (no local socket required)
Connecting to a Database
pgtrickle resolves connection parameters in this order (first match wins):
| Priority | Method | Example |
|---|---|---|
| 1 | --url flag | pgtrickle --url postgres://user:pass@host:5432/mydb list |
| 2 | PGTRICKLE_URL env var | export PGTRICKLE_URL=postgres://... |
| 3 | Individual flags | --host, --port, --dbname, --user, --password |
| 4 | Standard libpq env vars | PGHOST, PGPORT, PGDATABASE, PGUSER, PGPASSWORD |
| 5 | Defaults | localhost:5432/postgres as user postgres |
Connection flags work with every subcommand and with the interactive dashboard:
# URL-style connection
pgtrickle --url postgres://admin:secret@db.example.com:5432/analytics
# Environment variables (most common in production)
export PGHOST=db.example.com
export PGPORT=5432
export PGDATABASE=analytics
export PGUSER=admin
export PGPASSWORD=secret
pgtrickle list
# Explicit flags
pgtrickle --host db.example.com --dbname analytics --user admin list
Interactive Dashboard
Run pgtrickle with no subcommand:
pgtrickle
This opens a full-screen terminal UI that auto-refreshes every 2 seconds. The screen has three areas:
- Header — application name, current view, connection status (● connected / ✗ disconnected), and time since last poll.
- Body — the active view (see below).
- Footer — keyboard shortcuts for switching views and a filter indicator.
Press q or Ctrl+C to exit.
Views
There are 14 views. Switch between them by pressing the key shown:
| Key | View | What it shows |
|---|---|---|
1 | Dashboard | All stream tables in a sortable list with status, mode, staleness, and last refresh time. A status ribbon at the top summarizes active/error/stale counts. |
2 | Detail | Deep dive into the selected stream table: properties (schema, status, mode, schedule, tier, refresh mode explanation), source tables, refresh history, CDC health, and diagnosed errors for error-state tables. |
3 | Dependencies | The stream table dependency graph rendered as an ASCII tree. Edges are color-coded by status (green = active, red = error). |
4 | Refresh Log | A scrollable timeline of recent refreshes across all tables — timestamp, mode (DIFF/FULL), table name, status, duration, and rows affected. |
5 | Diagnostics | Output of recommend_refresh_mode() — shows each table's current mode vs. recommended mode with confidence level and reasoning. |
6 | CDC Health | Change buffer sizes and byte counts per source table, plus the CDC mode (trigger/WAL). Large buffers are highlighted as warnings. |
7 | Configuration | All pg_trickle.* GUC parameters: current value, unit, category, and description. |
8 | Health Checks | Results of health_check() — each check displays a name, severity (OK/WARN/CRITICAL), and detail message. Critical items are shown in red. |
9 | Alerts | Real-time alert feed from LISTEN pg_trickle_alert. Shows timestamp, severity icon, and message for each event. |
w | Workers | Background scheduler worker pool: each worker's state (running/idle), the table it's refreshing, and duration. Below that, the pending job queue with priority and wait time. |
f | Fuse | Circuit breaker status for each stream table: fuse state (ARMED/TRIPPED/BLOWN), consecutive error count, and last error message. |
m | Watermarks | Watermark group alignment: group name, member count, min/max watermarks, and whether the group is gated. Two tabs: Groups and Gates. |
d | Delta Inspector | Fetches and displays the auto-generated delta SQL for the selected stream table (two tabs: Delta SQL and Auxiliary Columns). Press e to show the table's CREATE DDL. |
i | Issues | All detected DAG issues (cycles, orphans, missing sources) sorted by severity and blast radius. |
Keyboard Shortcuts
Navigation — works in all views:
| Key | Action |
|---|---|
j or ↓ | Move selection down |
k or ↑ | Move selection up |
Page Down / Page Up | Scroll 20 rows |
Home | Jump to first row |
End | Jump to last row |
Enter | Drill into detail (Dashboard → Detail view; Delta Inspector → reload delta SQL) |
Esc | Go back to Dashboard / close overlay / clear filter |
Tab | Switch sub-tabs (Delta Inspector: SQL ↔ Auxiliary Columns; Watermarks: Groups ↔ Gates) |
Write actions (view-specific):
| Key | View | Action |
|---|---|---|
r | Dashboard, Detail | Refresh selected stream table |
R | Dashboard | Refresh all active tables (with confirmation) |
p | Dashboard, Detail | Pause selected (with confirmation) |
P | Dashboard, Detail | Resume selected |
e | Detail, Delta Inspector | Show CREATE DDL overlay for selected table |
A | Fuse | Re-arm fuse for selected (with confirmation) |
g | Watermarks (Gates tab) | Gate / ungate selected source (confirmation for gate) |
Global actions:
| Key | Action |
|---|---|
/ | Open filter — type to search, Enter to apply, Esc to cancel |
: | Open command palette |
s / S | Cycle sort field / reverse sort direction (Dashboard) |
t | Toggle light/dark theme |
Ctrl+R | Force an immediate poll |
Ctrl+E | Export current view to JSON file (/tmp/pgtrickle_export_*.json) |
? | Toggle help overlay |
q or Ctrl+C | Quit |
View switching:
Press 1–9, w, f, m, d, or i to jump directly to any view.
The active view and selected table are shown in both the header bar and the
footer nav bar.
Command Palette
Press : to open the command palette. Tab-completion works on stream table
names. Available commands:
| Command | Description |
|---|---|
refresh <name> | Refresh a stream table (or refresh all) |
pause <name> | Pause a stream table |
resume <name> | Resume a paused stream table |
repair <name> | Re-install CDC triggers |
export <name> | Show CREATE DDL overlay |
explain <name> | Fetch and display delta SQL for a stream table |
validate <SQL> | Validate a SQL query against the extension |
fuse reset <name> | Reset the circuit breaker fuse |
quit | Exit the TUI |
LISTEN/NOTIFY
The TUI opens a second, dedicated database connection that runs
LISTEN pg_trickle_alert. Alerts (refresh failures, auto-suspension events,
etc.) appear in the Alerts view (9) in real time, without waiting for
the next poll cycle.
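The same channel can be watched from any client. For instance, in psql (the channel name comes from the extension; the payload format is not shown here):

```sql
-- Subscribe to pg_trickle alerts from psql
LISTEN pg_trickle_alert;
-- psql prints each notification as it arrives on the connection
```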
CLI Subcommands
Every subcommand runs non-interactively: it connects, executes one query, prints the result, and exits. This makes them suitable for shell scripts, cron jobs, CI pipelines, and monitoring probes.
Output Formats
All subcommands that produce tabular output accept --format / -f:
| Format | Flag | Description |
|---|---|---|
| Table | --format table (default) | Human-readable aligned columns |
| JSON | --format json | Array of objects on stdout |
| CSV | --format csv | Comma-separated values |
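JSON output composes well with standard tooling. A jq-free sketch using python3, where the sample payload is hypothetical but shaped like pgtrickle list --format json output:

```shell
# Hypothetical sample of `pgtrickle list --format json` output
sample='[{"name":"order_totals","status":"ACTIVE"},{"name":"daily_stats","status":"ERROR"}]'

# Print the names of tables that are not ACTIVE, using only python3
not_active=$(printf '%s' "$sample" | python3 -c '
import json, sys
for t in json.load(sys.stdin):
    if t["status"] != "ACTIVE":
        print(t["name"])
')
echo "$not_active"
```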
Command Reference
pgtrickle list
List all stream tables with status, mode, schedule, tier, and refresh stats.
pgtrickle list
pgtrickle list --format json
pgtrickle status <name>
Show detailed status for a single stream table.
pgtrickle status order_totals
pgtrickle status order_totals --format json
pgtrickle refresh <name>
Trigger a manual refresh of one stream table, or all of them.
pgtrickle refresh order_totals
pgtrickle refresh --all
pgtrickle create <name> <query>
Create a new stream table with the given defining query.
pgtrickle create my_totals "SELECT region, SUM(amount) FROM orders GROUP BY region"
pgtrickle create my_totals "SELECT ..." --schedule 5m --mode differential
pgtrickle create my_totals "SELECT ..." --no-initialize
| Flag | Description |
|---|---|
--schedule | Refresh schedule (e.g. 5m, @hourly) |
--mode | Refresh mode: auto, differential, full, immediate |
--no-initialize | Skip the initial refresh after creation |
pgtrickle drop <name>
Drop a stream table.
pgtrickle drop my_totals
pgtrickle alter <name>
Change a stream table's settings.
pgtrickle alter order_totals --mode full
pgtrickle alter order_totals --schedule 10m
pgtrickle alter order_totals --tier cold
pgtrickle alter order_totals --status paused
pgtrickle alter order_totals --query "SELECT ..."
| Flag | Description |
|---|---|
--mode | New refresh mode |
--schedule | New refresh schedule |
--tier | New scheduling tier (hot, warm, cold, frozen) |
--status | New status (active, paused, suspended) |
--query | New defining query (ALTER QUERY) |
pgtrickle export <name>
Print the DDL (SQL definition) for a stream table.
pgtrickle export order_totals
pgtrickle diag [name]
Show refresh mode diagnostics and recommendations. Without a name, shows all tables. With a name, shows diagnostics for that table only.
pgtrickle diag
pgtrickle diag order_totals
pgtrickle diag --format json
pgtrickle cdc
Show CDC change buffer sizes and health.
pgtrickle cdc
pgtrickle cdc --format json
pgtrickle graph
Print the stream table dependency graph as an ASCII tree.
pgtrickle graph
pgtrickle graph --format json
pgtrickle config
Show all pg_trickle.* GUC parameters, or set one.
pgtrickle config
pgtrickle config --set pg_trickle.unlogged_buffers=true
pgtrickle config --format json
The --set flag runs ALTER SYSTEM SET followed by pg_reload_conf().
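The equivalent, done by hand in psql, would look like this (using the GUC from the example above):

```sql
ALTER SYSTEM SET pg_trickle.unlogged_buffers = true;
SELECT pg_reload_conf();
```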
pgtrickle health
Run system health checks. Returns exit code 1 if any check is CRITICAL.
pgtrickle health
pgtrickle health --format json
# Use in CI/monitoring:
pgtrickle health || echo "Health check failed"
pgtrickle workers
Show the background worker pool status and pending job queue.
pgtrickle workers
pgtrickle workers --format json
pgtrickle fuse
Show fuse (circuit breaker) status for all stream tables.
pgtrickle fuse
pgtrickle fuse --format json
pgtrickle watermarks
Show watermark groups and source gating status.
pgtrickle watermarks
pgtrickle watermarks --format json
pgtrickle explain <name>
Inspect the generated delta SQL, DVM operator tree, or deduplication stats for a stream table. By default shows the delta SQL.
pgtrickle explain order_totals # Delta SQL
pgtrickle explain order_totals --analyze # EXPLAIN ANALYZE on the delta
pgtrickle explain order_totals --operators # DVM operator tree
pgtrickle explain order_totals --dedup # Dedup stats per source
pgtrickle explain order_totals --format json
| Flag | Description |
|---|---|
--analyze | Run EXPLAIN ANALYZE on the delta query |
--operators | Show the DVM operator tree instead of raw SQL |
--dedup | Show change buffer deduplication statistics |
pgtrickle watch
Non-interactive continuous output mode. Polls the database and prints a status table at regular intervals. Useful for CI logs, monitoring, and terminals without TUI support.
pgtrickle watch # Default: every 2 seconds
pgtrickle watch -n 10 # Every 10 seconds
pgtrickle watch --compact # One line per table
pgtrickle watch --no-color # No ANSI color codes
pgtrickle watch --append # Append mode (don't clear screen)
# Log to a file
pgtrickle watch --compact --no-color --append >> /var/log/pgtrickle.log
| Flag | Short | Description |
|---|---|---|
--interval | -n | Poll interval in seconds (default: 2) |
| --compact | | One-line-per-table output |
| --no-color | | Disable ANSI color codes |
| --append | | Append to stdout instead of clearing the screen |
pgtrickle completions <shell>
Generate shell completion scripts. Install them once and get tab-completion for all subcommands and flags.
# Bash
pgtrickle completions bash > /etc/bash_completion.d/pgtrickle
# or for the current user:
pgtrickle completions bash > ~/.local/share/bash-completion/completions/pgtrickle
# Zsh
pgtrickle completions zsh > ~/.zfunc/_pgtrickle
# Fish
pgtrickle completions fish > ~/.config/fish/completions/pgtrickle.fish
# PowerShell
pgtrickle completions powershell > pgtrickle.ps1
Examples
Quick health check in CI
#!/bin/bash
set -e
export PGHOST=db.example.com PGDATABASE=analytics PGUSER=monitor
pgtrickle health || { echo "pg_trickle health check failed"; exit 1; }
pgtrickle list --format json | jq '.[] | select(.status != "ACTIVE")'
Monitor stream tables in a tmux pane
pgtrickle watch -n 5
Export all definitions for version control
for name in $(pgtrickle list --format json | jq -r '.[].name'); do
pgtrickle export "$name" > "sql/stream_tables/${name}.sql"
done
Debug a slow differential refresh
pgtrickle explain order_totals --analyze
pgtrickle explain order_totals --operators
pgtrickle explain order_totals --dedup
How It Works
The TUI connects to PostgreSQL using tokio-postgres (async, no TLS by
default) and queries pg_trickle's built-in SQL API functions:
| View | SQL function(s) |
|---|---|
| Dashboard | pgtrickle.st_refresh_stats() |
| Detail | pgtrickle.explain_refresh_mode(), pgtrickle.list_sources(), pgtrickle.get_refresh_history(), pgtrickle.diagnose_errors() |
| Dependencies | pgtrickle.dependency_tree() |
| Refresh Log | pgtrickle.refresh_timeline() |
| Diagnostics | pgtrickle.recommend_refresh_mode() |
| CDC Health | pgtrickle.change_buffer_sizes(), pgtrickle.check_cdc_health() |
| Configuration | pg_settings WHERE name LIKE 'pg_trickle.%' |
| Health Checks | pgtrickle.health_check() |
| Alerts | LISTEN pg_trickle_alert (real-time) |
| Workers | pgtrickle.worker_pool_status(), pgtrickle.parallel_job_status() |
| Fuse | pgtrickle.fuse_status() |
| Watermarks | pgtrickle.watermark_groups(), pgtrickle.source_gate_status() |
| Delta Inspector | pgtrickle.explain_delta(), pgtrickle.list_auxiliary_columns(), pgtrickle.pgt_stream_tables (DDL) |
| Issues | pgtrickle.dag_issues() |
In interactive mode, a background task polls all of these every 2 seconds
and pushes state updates to the rendering loop. A second connection runs
LISTEN pg_trickle_alert for real-time notifications.
The TUI is purely a client — it reads from pg_trickle's monitoring API and
sends commands (refresh, create, drop, alter) through the same SQL functions
you would call from psql. It does not require any special privileges beyond
what the pg_trickle SQL API requires.
Planned: cache_stats() and health_summary() Integration
Status: Not yet surfaced in the TUI (v0.18.0 gap).
The following SQL functions are available but not yet integrated into the TUI:
- pgtrickle.cache_stats() — template cache hit rate, L1 hits, evictions, delta cache entries. Useful for monitoring cache effectiveness.
- pgtrickle.health_summary() — single-row deployment overview with total/active/error/stale stream table counts, P99 refresh latency, scheduler status, and cache hit rate.
Lightest integration path: Add cache hit rate to the Dashboard status
ribbon (currently shows scheduler status from quick_health). The Health
Checks view (8) could display health_summary() fields alongside the
existing health_check() results. Both functions are already available via
raw SQL (psql, Grafana, or the Prometheus exporter).
Tech Stack
| Component | Crate | Purpose |
|---|---|---|
| Terminal rendering | ratatui 0.29 + crossterm 0.28 | Full-screen TUI with color, layout, widgets |
| Async runtime | tokio 1.x | Background polling, LISTEN/NOTIFY, signals |
| PostgreSQL | tokio-postgres 0.7 | Async database queries |
| CLI parsing | clap 4.x | Subcommands, flags, env var integration |
| Table output | comfy-table 7.x | Aligned text tables for CLI mode |
| Serialization | serde + serde_json | JSON and CSV output formats |
| Shell completions | clap_complete 4.x | bash/zsh/fish/PowerShell completions |
Contributing to pg_trickle
Thank you for your interest in contributing! pg_trickle is an Apache 2.0-licensed open-source project and welcomes contributions of all kinds.
Before You Start
- Check the open issues and discussions to avoid duplicating work.
- For non-trivial changes, open an issue first to discuss the approach.
- Read AGENTS.md — it is the authoritative guide for all coding conventions, error handling rules, module layout, and test requirements.
- Read docs/ARCHITECTURE.md to understand the system.
- Read ROADMAP.md to see what work is planned.
Ways to Contribute
| Type | Where to start |
|---|---|
| Bug report | Open an issue |
| Feature request | Open an issue or start a discussion |
| Documentation fix | Open a PR directly — no issue needed for typos/clarity |
| Code fix or feature | Open an issue first, then a PR |
| Performance improvement | Include benchmark numbers (see just bench) |
Development Setup
# Install pgrx
cargo install cargo-pgrx --version "=0.17.0"
cargo pgrx init --pg18 /usr/lib/postgresql/18/bin/pg_config
# Build
cargo build
# Format + lint (required before every PR)
just fmt
just lint
# Run tests
just test-unit # fast, no DB
just test-integration # Testcontainers
just test-light-e2e # PR-equivalent Light E2E tier (stock postgres)
just test-e2e # full E2E (builds Docker image)
just test-pgbouncer # PgBouncer transaction-pool compatibility tests
Full setup instructions are in INSTALL.md.
Devcontainer / Containerized Development
If you are developing in a devcontainer, use the default non-root vscode user
and run the normal commands from the workspace root:
just fmt
just lint
just test-unit
just test-unit uses scripts/run_unit_tests.sh, which now selects a writable
and cache-friendly target directory in this order:
- target/ (preferred)
- .cargo-target/ (project-local fallback)
- $HOME/.cache/pg_trickle-target
- ${TMPDIR:-/tmp}/pg_trickle-target (last resort)
This avoids permission failures on bind mounts and preserves incremental builds when source or test files change.
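The selection logic can be sketched in portable shell (an illustration of the fallback order, not the actual contents of scripts/run_unit_tests.sh):

```shell
# Pick the first candidate directory that can be created and written to.
pick_target_dir() {
  for dir in "$@"; do
    if mkdir -p "$dir" 2>/dev/null && [ -w "$dir" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}

# Candidates in the documented preference order
CARGO_TARGET_DIR=$(pick_target_dir \
  target \
  .cargo-target \
  "$HOME/.cache/pg_trickle-target" \
  "${TMPDIR:-/tmp}/pg_trickle-target")
echo "target dir: $CARGO_TARGET_DIR"
```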
If you see permission errors in containerized runs, verify you are not forcing a different container user/UID than expected by your workspace mount.
Run E2E tests in devcontainer
E2E tests use Testcontainers and require Docker access from inside the
devcontainer (provided by the Docker-in-Docker feature in
.devcontainer/devcontainer.json).
Run from the workspace root inside the devcontainer:
just build-e2e-image
just test-e2e
Notes:
- The E2E harness starts containers via testcontainers (tests/e2e/mod.rs).
- The default E2E image is pg_trickle_e2e:latest (built by tests/build_e2e_image.sh).
- A plain docker run of the dev image is not equivalent to a full VS Code devcontainer session with features/lifecycle hooks enabled.
Making a Pull Request
- Fork the repository and create a branch: git checkout -b fix/my-fix
- Make your changes following the conventions in AGENTS.md
- Run just fmt && just lint — both must pass with zero warnings
- Add or update tests — see AGENTS.md § Testing
- Open a PR against main
The PR template will walk you through the checklist.
CI Coverage on PRs
PR CI runs a three-tier gate:
- Unit tests (Linux only)
- Integration tests
- Light E2E — curated PR-friendly end-to-end coverage split across three
shards and executed against stock
postgres:18.3
Full E2E, TPC-H tests, benchmarks, dbt, CNPG smoke, and the extra macOS / Windows unit jobs stay off the PR critical path and run on push-to-main, schedule, or manual dispatch. This keeps typical PR feedback closer to the single-digit-minute range while preserving broader scheduled coverage.
To trigger the full CI matrix on your PR branch (recommended for DVM engine, refresh, or CDC changes):
gh workflow run ci.yml --ref <your-branch>
To run all tests locally before pushing:
just test-all # unit + integration + e2e
# PR-equivalent fast path:
just test-unit
just test-integration
just test-light-e2e
# TPC-H correctness tests (requires e2e Docker image):
cargo test --test e2e_tpch_tests -- --ignored --test-threads=1 --nocapture
See AGENTS.md § Testing for the full CI coverage matrix.
Coding Conventions (summary)
- No unwrap() or panic!() in non-test code
- All unsafe blocks require a // SAFETY: comment
- Errors go through PgTrickleError in src/error.rs
- New SQL functions use #[pg_extern(schema = "pgtrickle")]
- Tests use Testcontainers — never a local PostgreSQL instance
Full details are in AGENTS.md.
Commit Messages
Use Conventional Commits:
fix: correct pgoutput action parsing for tables named INSERT_LOG
feat: add CUBE explosion guard (max 64 UNION ALL branches)
docs: document JOIN key change limitation in SQL_REFERENCE
test: add E2E test for keyless table duplicate-row behaviour
License
By contributing you agree that your contributions will be licensed under the Apache License 2.0.
dbt-pgtrickle
A dbt package that integrates
pg_trickle stream tables into your dbt
project via a custom stream_table materialization.
No custom Python adapter required — works with the standard dbt-postgres
adapter. Just Jinja SQL macros that call pg_trickle's SQL API.
Prerequisites
| Requirement | Minimum Version |
|---|---|
| dbt Core | ≥ 1.9 |
| dbt-postgres adapter | Matching dbt Core version |
| PostgreSQL | 18.x |
| pg_trickle extension | ≥ 0.1.0 (CREATE EXTENSION pg_trickle;) |
Installation
From Git (recommended until dbt Hub listing is live)
Add to your packages.yml:
packages:
- git: "https://github.com/grove/pg-trickle.git"
revision: v0.15.0
subdirectory: "dbt-pgtrickle"
From dbt Hub (once published)
After the package is listed on dbt Hub, you can install by package name:
packages:
- package: grove/dbt_pgtrickle
version: [">=0.15.0", "<1.0.0"]
Note: dbt Hub listing requires a separate GitHub repository for the package. See docs/integrations/dbt-hub-submission.md for the submission checklist and steps.
Then run:
dbt deps
Quick Start
Create a model with materialized='stream_table':
-- models/marts/order_totals.sql
{{
config(
materialized='stream_table',
schedule='5m',
refresh_mode='DIFFERENTIAL'
)
}}
SELECT
customer_id,
SUM(amount) AS total_amount,
COUNT(*) AS order_count
FROM {{ source('raw', 'orders') }}
GROUP BY customer_id
dbt run --select order_totals # Creates the stream table
dbt test --select order_totals # Tests work normally (it's a real table)
Configuration Reference
| Key | Type | Default | Description |
|---|---|---|---|
materialized | string | — | Must be 'stream_table' |
schedule | string/null | '1m' | Refresh schedule (e.g., '5m', '1h', cron). null for pg_trickle's CALCULATED schedule. |
refresh_mode | string | 'DIFFERENTIAL' | 'FULL', 'DIFFERENTIAL', 'AUTO', or 'IMMEDIATE' |
initialize | bool | true | Populate on creation |
status | string/null | null | 'ACTIVE' or 'PAUSED'. When set, applies on subsequent runs via alter_stream_table(). |
stream_table_name | string | model name | Override stream table name |
stream_table_schema | string | target schema | Override schema |
cdc_mode | string/null | null | CDC mode override: 'auto', 'trigger', or 'wal'. null uses the GUC default. |
partition_by | string/null | null | Column name for RANGE partitioning of the storage table (v0.13.0+). Cannot be changed after creation. |
fuse | string/null | null | Fuse circuit-breaker mode: 'off', 'on', or 'auto' (v0.13.0+). Applied via alter_stream_table() on every run; no-op if unchanged. |
fuse_ceiling | int/null | null | Change-count threshold that triggers the fuse (v0.13.0+). null uses the global GUC default. |
fuse_sensitivity | int/null | null | Number of consecutive over-ceiling observations before the fuse blows (v0.13.0+). null means 1 (immediate). |
partition_by — RANGE partitioning
Partition the stream table's storage table by a column value. pg_trickle creates a PARTITION BY RANGE (<col>) storage table with a default catch-all partition. Add your own date/integer range partitions via standard PostgreSQL DDL after dbt run.
-- models/marts/events_by_day.sql
{{ config(
materialized='stream_table',
schedule='1m',
refresh_mode='DIFFERENTIAL',
partition_by='event_day'
) }}
SELECT
event_day,
user_id,
COUNT(*) AS event_count
FROM {{ source('raw', 'events') }}
GROUP BY event_day, user_id
Note: partition_by is applied only at creation time. Changing it after the stream table exists has no effect. Use dbt run --full-refresh to recreate with a new partition key.
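Adding a range partition afterwards uses ordinary PostgreSQL DDL. A sketch, assuming the stream table landed in an analytics schema (the table and schema names are assumptions; if rows for that range already sit in the default partition, PostgreSQL rejects the new partition until they are moved):

```sql
-- Attach a one-day partition for 2025-01-01 (names are illustrative)
CREATE TABLE analytics.events_by_day_20250101
  PARTITION OF analytics.events_by_day
  FOR VALUES FROM ('2025-01-01') TO ('2025-01-02');
```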
fuse — Circuit breaker
The fuse circuit breaker suspends refreshes when the change volume exceeds a threshold, protecting against runaway refresh cycles during bulk ingestion.
-- models/marts/order_totals.sql
{{ config(
materialized='stream_table',
schedule='5m',
refresh_mode='DIFFERENTIAL',
fuse='auto',
fuse_ceiling=50000,
fuse_sensitivity=3
) }}
SELECT customer_id, SUM(amount) AS total
FROM {{ source('raw', 'orders') }}
GROUP BY customer_id
| fuse value | Behaviour |
|---|---|
'off' | Fuse disabled (default) |
'on' | Fuse always active; blows when ceiling is exceeded |
'auto' | Fuse activates only when the delta is large enough to make FULL refresh cheaper than DIFFERENTIAL |
Fuse parameters are applied on every dbt run via alter_stream_table(); the macro only calls the SQL function when the values have actually changed from the catalog state.
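To see what ended up being applied, you can query the fuse state directly; fuse_status() is the same function the TUI's Fuse view reads (its exact column set is not documented here):

```sql
SELECT * FROM pgtrickle.fuse_status();
```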
Project-level defaults
# dbt_project.yml
models:
my_project:
marts:
+materialized: stream_table
+schedule: '5m'
+refresh_mode: DIFFERENTIAL
Operations
pgtrickle_refresh — Manual refresh
dbt run-operation pgtrickle_refresh --args '{"model_name": "order_totals"}'
refresh_all_stream_tables — Refresh all in dependency order
Refreshes all dbt-managed stream tables in topological (dependency) order.
Upstream tables are refreshed before downstream ones. Designed for CI pipelines:
run after dbt run and before dbt test to ensure all data is current.
# Refresh all dbt-managed stream tables
dbt run-operation refresh_all_stream_tables
# Refresh only stream tables in a specific schema
dbt run-operation refresh_all_stream_tables --args '{"schema": "analytics"}'
drop_all_stream_tables — Drop dbt-managed stream tables
Drops only stream tables defined as dbt models (safe in shared environments):
dbt run-operation drop_all_stream_tables
drop_all_stream_tables_force — Drop ALL stream tables
Drops everything from the pg_trickle catalog, including non-dbt stream tables:
dbt run-operation drop_all_stream_tables_force
pgtrickle_check_cdc_health — CDC pipeline health
dbt run-operation pgtrickle_check_cdc_health
Raises an error (non-zero exit) if any CDC source is unhealthy.
Freshness Monitoring
Native dbt source freshness is not supported (the last_refresh_at column lives in
the catalog, not on the stream table). Use the pgtrickle_check_freshness run-operation
instead:
# Check all active stream tables (defaults: warn=600s, error=1800s)
dbt run-operation pgtrickle_check_freshness
# Custom thresholds
dbt run-operation pgtrickle_check_freshness \
--args '{model_name: order_totals, warn_seconds: 300, error_seconds: 900}'
Exits non-zero when any stream table exceeds the error threshold — safe for CI.
Useful dbt Commands
# List all stream table models
dbt ls --select config.materialized:stream_table
# Full refresh (drop + recreate)
dbt run --select order_totals --full-refresh
# Build models + tests in DAG order
dbt build --select order_totals
Note: dbt build runs stream table models early in the DAG. If downstream models
depend on a stream table with initialize: false, the table may not be populated yet.
Testing
Stream tables are standard PostgreSQL heap tables — all dbt tests work normally:
models:
- name: order_totals
columns:
- name: customer_id
tests:
- not_null
- unique
Stream Table Health Test
Use the built-in stream_table_healthy generic test to fail your dbt test suite
when a stream table is stale, erroring, or paused:
models:
- name: order_totals
tests:
- dbt_pgtrickle.stream_table_healthy:
warn_seconds: 300 # fail if stale for more than 5 minutes
The test queries pgtrickle.pg_stat_stream_tables and returns rows for any
unhealthy condition. An empty result set means the stream table is healthy.
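The generated test SQL is conceptually similar to the following sketch. The view name comes from this guide; the column names used in the predicate are illustrative assumptions, not the view's documented schema:

```sql
-- Sketch: one row per unhealthy condition; empty result means healthy
SELECT *
FROM pgtrickle.pg_stat_stream_tables
WHERE table_name = 'order_totals'
  AND (
    status <> 'ACTIVE'
    OR extract(epoch FROM now() - last_refresh_at) > 300  -- warn_seconds
  );
```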
Stream Table Status Macro
For more programmatic control, use the pgtrickle_stream_table_status() macro
directly in custom tests or run-operations:
{%- set st = dbt_pgtrickle.pgtrickle_stream_table_status('order_totals', warn_seconds=300) -%}
{# st.status is one of: 'healthy', 'stale', 'erroring', 'paused', 'not_found' #}
{# st.staleness_seconds, st.consecutive_errors, st.total_refreshes, etc. #}
__pgt_row_id Column
pg_trickle adds an internal __pgt_row_id column to stream tables for row identity
tracking. This column:
- Appears in SELECT * and dbt docs generate
- Does not affect dbt test unless you check column counts
- Can be documented to reduce confusion:
columns:
- name: __pgt_row_id
description: "Internal pg_trickle row identity hash. Ignore this column."
Limitations
| Limitation | Workaround |
|---|---|
| No in-place query alteration | Materialization auto-drops and recreates when query changes |
| __pgt_row_id visible | Document it; exclude in downstream SELECT |
| No native dbt source freshness | Use pgtrickle_check_freshness run-operation |
| No dbt snapshot support | Snapshot the stream table as a regular table |
| Query change detection is whitespace-sensitive | dbt compiles deterministically; unnecessary recreations are safe |
| PostgreSQL 18 required | Extension requirement |
| Shared version tags with pg_trickle extension | Pin to specific git revision |
Contributing
See AGENTS.md for development guidelines and the implementation plan for design rationale.
Running tests locally
The quickest way (requires Docker and dbt installed):
# Full run — builds Docker image, starts container, runs tests, cleans up
just test-dbt
# Fast run — reuses existing Docker image (run after first build)
just test-dbt-fast
Or use the script directly with options:
cd dbt-pgtrickle/integration_tests/scripts
# Default: builds image, runs tests with dbt 1.9, cleans up
./run_dbt_tests.sh
# Skip image rebuild (faster iteration)
./run_dbt_tests.sh --skip-build
# Keep the container running after tests (for debugging)
./run_dbt_tests.sh --skip-build --keep-container
# Use a custom port (avoids conflicts with local PostgreSQL)
PGPORT=25432 ./run_dbt_tests.sh
Manual testing against an existing pg_trickle instance
If you already have PostgreSQL 18 + pg_trickle running locally:
export PGHOST=localhost PGPORT=5432 PGUSER=postgres PGPASSWORD=postgres PGDATABASE=postgres
cd dbt-pgtrickle/integration_tests
dbt deps
dbt seed
dbt run
./scripts/wait_for_populated.sh order_totals 30
dbt test
dbt run-operation drop_all_stream_tables
License
Apache 2.0 — see LICENSE.
CloudNativePG / Kubernetes
pg_trickle is designed to work with CloudNativePG (CNPG) — the Kubernetes operator for PostgreSQL. The extension is loaded via Image Volume Extensions, meaning no custom PostgreSQL image is needed.
Prerequisites
- Kubernetes 1.33+ with the ImageVolume feature gate enabled
- CloudNativePG operator 1.28+
- The pg_trickle-ext OCI image available in your cluster registry
Architecture
┌─────────────────────────────────────┐
│ CNPG Cluster (3 pods) │
│ │
│ ┌──────────┐ ┌──────────────────┐ │
│ │ Primary │ │ pg_trickle-ext │ │
│ │ PG 18 │◄─┤ (ImageVolume) │ │
│ │ │ │ .so + .sql only │ │
│ └──────────┘ └──────────────────┘ │
│ ┌──────────┐ ┌──────────┐ │
│ │ Replica 1│ │ Replica 2│ │
│ │ (standby)│ │ (standby)│ │
│ └──────────┘ └──────────┘ │
└─────────────────────────────────────┘
- The scheduler runs on the primary pod only. Replica pods detect recovery mode (pg_is_in_recovery() = true) and sleep.
- Stream tables are replicated to standbys via physical streaming replication like any other heap table.
- Pod restarts are safe — the scheduler resumes from the stored frontier with no data loss.
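You can verify which pod is actively scheduling by checking the recovery state on each instance:

```sql
-- false on the primary (scheduler active), true on standbys (scheduler sleeps)
SELECT pg_is_in_recovery();
```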
Deploying pg_trickle on CNPG
1. Build the extension image
The cnpg/Dockerfile.ext builds a scratch-based OCI image containing
only the shared library, control file, and SQL migrations:
# From the dist/ directory with pre-built artifacts:
docker build -t ghcr.io/<owner>/pg_trickle-ext:0.13.0 -f cnpg/Dockerfile.ext dist/
docker push ghcr.io/<owner>/pg_trickle-ext:0.13.0
2. Deploy the Cluster
Apply the Cluster manifest with pg_trickle configured as an Image Volume extension:
# cnpg/cluster-example.yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
name: pg-trickle-demo
spec:
instances: 3
imageName: ghcr.io/cloudnative-pg/postgresql:18
postgresql:
shared_preload_libraries:
- pg_trickle
extensions:
- name: pg-trickle
image:
reference: ghcr.io/<owner>/pg_trickle-ext:0.13.0
parameters:
max_worker_processes: "8"
bootstrap:
initdb:
database: app
owner: app
storage:
size: 10Gi
storageClass: standard
kubectl apply -f cnpg/cluster-example.yaml
3. Enable the extension
Use the CNPG Database resource for declarative extension management:
# cnpg/database-example.yaml
apiVersion: postgresql.cnpg.io/v1
kind: Database
metadata:
name: app
spec:
cluster:
name: pg-trickle-demo
name: app
owner: app
extensions:
- name: pg_trickle
kubectl apply -f cnpg/database-example.yaml
4. Verify
kubectl exec -it pg-trickle-demo-1 -- psql -U postgres -d app -c \
"SELECT pgtrickle.version();"
Key Considerations
Worker processes
Each database with pg_trickle needs one background worker slot. Set
max_worker_processes in the Cluster manifest to accommodate the launcher
(1) + one scheduler per database + any parallel refresh workers:
parameters:
max_worker_processes: "16"
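For instance, a cluster hosting three pg_trickle databases with up to four parallel refresh workers needs at least 1 + 3 + 4 = 8 slots; the database and worker counts here are illustrative, not a tuned recommendation:

```yaml
postgresql:
  parameters:
    # 1 launcher + 3 per-database schedulers + 4 parallel refresh workers = 8,
    # with headroom for parallel query workers and other extensions
    max_worker_processes: "16"
```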
Persistent volumes
Catalog tables (pgtrickle.pgt_stream_tables) and change buffers
(pgtrickle_changes.*) are stored in regular PostgreSQL tablespaces.
Persistent volume claims preserve them across pod rescheduling.
Backups
pg_trickle state (catalog, change buffers, stream table data) is included in CNPG's Barman object-store backups automatically. After a restore, the scheduler detects frontier inconsistencies and performs a full refresh on the first cycle. See Backup and Restore for details.
Failover
When the primary pod fails and a replica is promoted, the new primary's scheduler starts automatically. Since stream tables were replicated via streaming replication, they are already up-to-date (minus replication lag). The scheduler resumes refreshing from the stored frontier.
Resource limits
For production deployments, set resource requests and limits in the Cluster manifest to prevent the scheduler from starving other workloads:
resources:
requests:
memory: 512Mi
cpu: 500m
limits:
memory: 2Gi
cpu: 2000m
Example manifests
The repository includes ready-to-use manifests in the cnpg/ directory:
| File | Purpose |
|---|---|
| cnpg/Dockerfile.ext | Build the scratch-based extension image |
| cnpg/Dockerfile.ext-build | Multi-stage build for CI/CD pipelines |
| cnpg/cluster-example.yaml | Complete Cluster manifest with pg_trickle |
| cnpg/database-example.yaml | Database resource with declarative extension management |
Further reading
- CloudNativePG Image Volume Extensions
- CloudNativePG Declarative Database Management
- Backup and Restore
- Configuration Reference
Prometheus & Grafana Monitoring
pg_trickle ships with a complete observability stack based on
postgres_exporter, Prometheus, and Grafana. The monitoring/
directory in the repository contains everything you need.
Quick Start
cd monitoring/
docker compose up -d
Open Grafana at http://localhost:3000 (default: admin / admin).
The pg_trickle Overview dashboard is pre-provisioned.
Architecture
PostgreSQL + pg_trickle
│
│ custom SQL queries
▼
postgres_exporter (:9187)
│
│ /metrics (Prometheus format)
▼
Prometheus (:9090)
│
│ data source
▼
Grafana (:3000)
postgres_exporter runs custom SQL queries defined in
prometheus/pg_trickle_queries.yml against the pg_trickle monitoring views
(pgtrickle.stream_tables_info, pgtrickle.pg_stat_stream_tables, etc.)
and exposes them as Prometheus metrics.
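If you want to consume the exporter's output programmatically, the exposition format is easy to parse. A minimal sketch, assuming the standard Prometheus text format; the sample text below is illustrative, not captured exporter output.

```python
# Pull apart Prometheus exposition-format text (as served by
# postgres_exporter on :9187/metrics) and keep only pg_trickle_ series.
def pg_trickle_metrics(text: str) -> dict[str, float]:
    out = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip HELP/TYPE comments and blank lines
        series, _, value = line.rpartition(" ")
        if series.startswith("pg_trickle_"):
            out[series] = float(value)
    return out

sample = """\
# HELP pg_trickle_staleness_seconds Seconds since last successful refresh
pg_trickle_staleness_seconds{stream_table="order_totals"} 4.2
pg_trickle_scheduler_running 1
pg_up 1
"""
print(pg_trickle_metrics(sample))
```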
Connecting to an Existing Database
If you already have PostgreSQL + pg_trickle running, configure the exporter to point at your instance:
export PG_HOST=your-pg-host
export PG_PORT=5432
export PG_USER=postgres
export PG_PASSWORD=yourpassword
export PG_DATABASE=yourdb
docker compose up -d
Or edit the DATA_SOURCE_NAME in docker-compose.yml directly.
Metrics Exposed
All metrics are prefixed pg_trickle_.
| Metric | Type | Description |
|---|---|---|
| pg_trickle_stream_tables_total | gauge | Total stream tables by status |
| pg_trickle_stale_tables_total | gauge | Tables with data older than schedule |
| pg_trickle_consecutive_errors | gauge | Per-table consecutive error count |
| pg_trickle_refresh_duration_ms | gauge | Average refresh duration (ms) |
| pg_trickle_total_refreshes | counter | Total refresh count per table |
| pg_trickle_failed_refreshes | counter | Failed refresh count per table |
| pg_trickle_rows_inserted_total | counter | Rows inserted per table |
| pg_trickle_rows_deleted_total | counter | Rows deleted per table |
| pg_trickle_staleness_seconds | gauge | Seconds since last successful refresh |
| pg_trickle_cdc_pending_rows | gauge | Pending rows in CDC change buffer |
| pg_trickle_cdc_buffer_bytes | gauge | CDC change buffer size in bytes |
| pg_trickle_scheduler_running | gauge | 1 if scheduler background worker is alive |
| pg_trickle_health_status | gauge | Overall health: 0=OK, 1=WARNING, 2=CRITICAL |
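The health encoding in the last row suggests a simple worst-of aggregation. A hedged sketch, assuming per-table severities as input; the input shape is illustrative, not a pg_trickle API.

```python
# Fold per-table severities into the pg_trickle_health_status encoding
# above (0=OK, 1=WARNING, 2=CRITICAL) by taking the worst severity seen.
LEVELS = {"OK": 0, "WARNING": 1, "CRITICAL": 2}

def overall_health(severities: dict[str, str]) -> int:
    return max((LEVELS[s] for s in severities.values()), default=0)

print(overall_health({"order_totals": "OK", "daily_revenue": "WARNING"}))  # 1
```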
Pre-configured Alerts
Alerting rules are defined in prometheus/alerts.yml:
| Alert | Condition | Severity |
|---|---|---|
| PgTrickleTableStale | Staleness > 5 min past schedule | warning |
| PgTrickleConsecutiveErrors | ≥ 3 consecutive refresh failures | warning |
| PgTrickleTableSuspended | Any table in SUSPENDED status | critical |
| PgTrickleCdcBufferLarge | CDC buffer > 1 GB | warning |
| PgTrickleSchedulerDown | Scheduler not running for > 2 min | critical |
| PgTrickleHighRefreshDuration | Avg refresh > 30 s | warning |
NOTIFY-Based Alerting
In addition to Prometheus alerts, pg_trickle emits real-time PostgreSQL
NOTIFY events on the pg_trickle_alert channel:
LISTEN pg_trickle_alert;
Events include stale_data, auto_suspended, reinitialize_needed,
buffer_growth_warning, fuse_blown, refresh_completed, and
refresh_failed. Each notification carries a JSON payload with the stream
table name and relevant details.
You can bridge NOTIFY events to external alerting systems (PagerDuty, Slack,
etc.) using tools like pgnotify or a
simple LISTEN loop in your application.
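The dispatch half of such a bridge can be sketched without a database connection. This assumes only what the docs state: the payload is JSON carrying the event name and stream table name; every other key and the handler wiring are illustrative.

```python
import json

# Route a pg_trickle_alert NOTIFY payload (JSON, per the docs above) to an
# alert handler keyed by event name, falling back to a default handler.
def route_alert(payload: str, handlers: dict) -> str:
    event = json.loads(payload)
    handler = handlers.get(event.get("event"), handlers["default"])
    return handler(event)

handlers = {
    "refresh_failed": lambda e: f"page on-call: refresh failed for {e['table']}",
    "default": lambda e: f"log: {e.get('event')} on {e.get('table')}",
}
print(route_alert('{"event": "refresh_failed", "table": "order_totals"}',
                  handlers))  # page on-call: refresh failed for order_totals
```

In a real bridge, the payload string would come from the LISTEN loop's notification object rather than a literal.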
Grafana Dashboard
The pre-provisioned pg_trickle Overview dashboard
(grafana/dashboards/pg_trickle_overview.json) includes panels for:
- Stream table status distribution (active / suspended / error)
- Refresh rate and duration over time
- Staleness heatmap
- CDC buffer sizes
- Consecutive error counts
- Scheduler uptime
Built-in SQL Monitoring Views
pg_trickle also provides built-in monitoring accessible without Prometheus:
-- Quick health overview (returns warnings and errors)
SELECT * FROM pgtrickle.health_check() WHERE severity != 'OK';
-- Stream table status and staleness
SELECT name, status, refresh_mode, staleness
FROM pgtrickle.stream_tables_info;
-- Detailed refresh statistics
SELECT * FROM pgtrickle.pg_stat_stream_tables;
-- CDC health per source table
SELECT * FROM pgtrickle.check_cdc_health();
-- Change buffer sizes
SELECT * FROM pgtrickle.change_buffer_sizes()
ORDER BY pending_rows DESC;
See the SQL Reference for the complete list of monitoring functions.
Files Reference
| File | Purpose |
|---|---|
| monitoring/docker-compose.yml | Demo stack: PG + exporter + Prometheus + Grafana |
| monitoring/prometheus/prometheus.yml | Prometheus scrape configuration |
| monitoring/prometheus/pg_trickle_queries.yml | Custom SQL queries for postgres_exporter |
| monitoring/prometheus/alerts.yml | Alerting rules |
| monitoring/grafana/provisioning/ | Auto-provisioned data source + dashboard |
| monitoring/grafana/dashboards/pg_trickle_overview.json | Overview dashboard |
Requirements
- Docker 24+ with Compose v2
- pg_trickle 0.10.0+ installed in the target database
- PostgreSQL user with SELECT on the pgtrickle.* schema
PgBouncer & Connection Poolers
pg_trickle's background scheduler uses session-level PostgreSQL features. This page explains how to configure pg_trickle alongside connection poolers like PgBouncer, Supavisor (Supabase), and PgCat.
Compatibility Matrix
| Pooling Mode | Compatible? | Notes |
|---|---|---|
| Session mode (pool_mode = session) | ✅ Fully | All features work. |
| Direct connection (no pooler for scheduler) | ✅ Fully | Application queries can still go through a pooler. |
| Transaction mode (pool_mode = transaction) | ❌ Not supported | Advisory locks, prepared statements, and LISTEN/NOTIFY are session-scoped. |
| Statement mode (pool_mode = statement) | ❌ Not supported | Same session-scoped limitations. |
Why Transaction Mode Breaks
The pg_trickle scheduler relies on three session-level features:
| Feature | Problem in Transaction Mode |
|---|---|
| pg_advisory_lock() | Session lock released when connection returns to pool — concurrent refreshes become possible |
| PREPARE / EXECUTE | Prepared statements vanish on connection hop — "prepared statement does not exist" errors |
| LISTEN / NOTIFY | Listener loses notifications when assigned a different backend connection |
Recommended Setup
Route the pg_trickle background worker through a direct connection while keeping application traffic on the pooler:
┌─────────────────┐ ┌──────────────┐
│ Application │────▶│ PgBouncer │──┐
│ (transaction │ │ (txn mode) │ │
│ mode OK) │ └──────────────┘ │
└─────────────────┘ │
▼
┌─────────────────┐ ┌─────────────┐
│ pg_trickle │───────────────▶│ PostgreSQL │
│ scheduler │ direct conn │ │
│ (session mode) │ └─────────────┘
└─────────────────┘
The scheduler connects directly to PostgreSQL as a background worker — it does not go through the pooler at all. No special configuration is needed for this; the scheduler always uses an internal SPI connection.
The pooler only matters for application queries that read from stream
tables or call pg_trickle functions (e.g., refresh_stream_table()).
Platform-Specific Notes
Supabase
Supabase uses Supavisor in transaction mode by default. pg_trickle's
scheduler works because it runs as a background worker (bypasses the
pooler). Application queries against stream tables work normally through
the pooler since they are regular SELECT statements.
If you call pgtrickle.refresh_stream_table() from application code,
use the direct connection string (port 5432) rather than the pooled
connection (port 6543).
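One way to enforce that split in application code is to keep two DSNs and route by operation. A hedged sketch; the host, credentials, and the exact set of management functions are placeholders following the port convention described above.

```python
# Keep two DSNs: pooled for ordinary reads, direct for pg_trickle
# management calls, since advisory locks and prepared statements are
# session-scoped and must not hop between backends.
POOLED_DSN = "postgresql://app:secret@db.example.com:6543/postgres"  # pooler
DIRECT_DSN = "postgresql://app:secret@db.example.com:5432/postgres"  # direct

MANAGEMENT_CALLS = {
    "create_stream_table", "alter_stream_table",
    "drop_stream_table", "refresh_stream_table",
}

def dsn_for(operation: str) -> str:
    return DIRECT_DSN if operation in MANAGEMENT_CALLS else POOLED_DSN

print(dsn_for("refresh_stream_table"))  # the direct (port 5432) DSN
```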
Neon
Neon uses a custom proxy that supports both session and transaction modes. Use the session-mode connection string for any pg_trickle management calls. The scheduler runs as a background worker and is unaffected by the proxy.
AWS RDS Proxy
RDS Proxy only supports transaction-mode pooling. The pg_trickle scheduler runs as a background worker inside the RDS instance and is unaffected. Application queries reading stream tables work normally through the proxy.
Manual refresh_stream_table() calls through the proxy may fail due to
advisory lock issues. Use a direct connection for management operations.
Pooler Compatibility Mode
pg_trickle includes a pooler_compatibility_mode setting (v0.10.0+) that
adjusts internal behavior for environments where the scheduler's SPI
connection may be affected by pooler-like middleware:
-- Usually not needed — the scheduler bypasses external poolers
SHOW pg_trickle.pooler_compatibility_mode;
This GUC is primarily for edge cases in managed PostgreSQL services. For standard deployments, the default setting works correctly.
Further Reading
Flyway & Liquibase Migration Frameworks
pg_trickle stream tables are managed through SQL function calls, not standard
DDL (CREATE TABLE / ALTER TABLE). This page documents patterns for
integrating pg_trickle with Flyway and Liquibase migration frameworks.
Key Principle
Stream tables are created and managed via pgtrickle.create_stream_table(),
pgtrickle.alter_stream_table(), and pgtrickle.drop_stream_table(). These
are regular SQL function calls that can be embedded in any migration script.
CDC triggers are automatically installed on source tables during stream table creation — no manual trigger management is needed.
Flyway
Creating Stream Tables in Migrations
Place stream table definitions in versioned migration files alongside your regular schema changes:
-- V3__create_order_stream_tables.sql
-- 1. Create the source tables first (standard DDL)
CREATE TABLE IF NOT EXISTS orders (
id SERIAL PRIMARY KEY,
region TEXT NOT NULL,
amount NUMERIC(10,2) NOT NULL,
created_at TIMESTAMPTZ DEFAULT now()
);
-- 2. Create stream tables via pg_trickle API
SELECT pgtrickle.create_stream_table(
'order_totals',
$$SELECT region, COUNT(*) AS order_count, SUM(amount) AS total
FROM orders GROUP BY region$$,
schedule => '5s',
refresh_mode => 'DIFFERENTIAL'
);
Altering Stream Tables
Use pgtrickle.alter_stream_table() in a new migration:
-- V5__update_order_totals_schedule.sql
SELECT pgtrickle.alter_stream_table(
'order_totals',
schedule => '10s'
);
Altering the Defining Query
Use alter_query to change the SQL without dropping and recreating:
-- V7__add_avg_to_order_totals.sql
SELECT pgtrickle.alter_stream_table(
'order_totals',
alter_query => $$SELECT region,
COUNT(*) AS order_count,
SUM(amount) AS total,
AVG(amount) AS avg_amount
FROM orders GROUP BY region$$
);
Dropping Stream Tables
-- V9__remove_legacy_stream_tables.sql
SELECT pgtrickle.drop_stream_table('legacy_report');
Bulk Creation
For environments with many stream tables, use bulk_create to create
them atomically:
-- V4__create_all_stream_tables.sql
SELECT pgtrickle.bulk_create('[
{
"name": "order_totals",
"query": "SELECT region, COUNT(*) AS cnt, SUM(amount) AS total FROM orders GROUP BY region",
"schedule": "5s",
"refresh_mode": "DIFFERENTIAL"
},
{
"name": "daily_revenue",
"query": "SELECT date_trunc(''day'', created_at) AS day, SUM(amount) AS revenue FROM orders GROUP BY 1",
"schedule": "30s",
"refresh_mode": "DIFFERENTIAL"
}
]'::jsonb);
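Hand-writing that JSON inside a SQL string literal forces awkward quote escaping (note the `''day''` above). If you generate migrations, building the payload from plain dicts is less error-prone. A hedged sketch; the required-key validation is an assumption, mirroring the fields used in the example.

```python
import json

# Build the jsonb argument for pgtrickle.bulk_create() from Python dicts,
# sidestepping hand-escaped quotes in migration files.
def bulk_create_payload(tables: list) -> str:
    for t in tables:
        missing = {"name", "query", "schedule"} - t.keys()
        if missing:
            raise ValueError(f"missing keys: {sorted(missing)}")
    return json.dumps(tables)

payload = bulk_create_payload([{
    "name": "order_totals",
    "query": "SELECT region, COUNT(*) AS cnt FROM orders GROUP BY region",
    "schedule": "5s",
    "refresh_mode": "DIFFERENTIAL",
}])
print('"schedule": "5s"' in payload)  # True
```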
Ordering: Source Tables Before Stream Tables
Flyway executes migrations in version order. Ensure source tables are created in an earlier migration than their dependent stream tables:
V1__create_schema.sql -- CREATE TABLE orders, products, ...
V2__create_indexes.sql -- CREATE INDEX ...
V3__create_stream_tables.sql -- SELECT pgtrickle.create_stream_table(...)
Repeatable Migrations
If you want stream table definitions to be re-applied on every Flyway run (for development environments), use repeatable migrations:
-- R__stream_tables.sql
-- Drop and recreate all stream tables
SELECT pgtrickle.drop_stream_table('order_totals')
WHERE EXISTS (
SELECT 1 FROM pgtrickle.pgt_stream_tables
WHERE pgt_name = 'order_totals'
);
SELECT pgtrickle.create_stream_table(
'order_totals',
$$SELECT region, COUNT(*) AS cnt FROM orders GROUP BY region$$,
schedule => '5s',
refresh_mode => 'DIFFERENTIAL'
);
Or use create_or_replace_stream_table for idempotent definitions:
-- R__stream_tables.sql (idempotent)
SELECT pgtrickle.create_or_replace_stream_table(
'order_totals',
$$SELECT region, COUNT(*) AS cnt FROM orders GROUP BY region$$,
schedule => '5s',
refresh_mode => 'DIFFERENTIAL'
);
Handling ALTER TABLE on Source Tables
When a Flyway migration alters a source table (e.g., adding a column), pg_trickle's DDL event trigger detects the change and suspends affected stream tables. After the schema change, stream tables resume automatically on the next refresh cycle.
If the source table change invalidates the stream table's defining query (e.g., removing a referenced column), you must update or drop the stream table in the same or a subsequent migration.
Liquibase
Creating Stream Tables in Changesets
Use Liquibase's <sql> tag to call pg_trickle functions:
<!-- changelog-3.0.xml -->
<changeSet id="create-order-stream-tables" author="dev">
<sql>
SELECT pgtrickle.create_stream_table(
'order_totals',
$pgt$SELECT region, COUNT(*) AS order_count, SUM(amount) AS total
FROM orders GROUP BY region$pgt$,
schedule => '5s',
refresh_mode => 'DIFFERENTIAL'
);
</sql>
<rollback>
<sql>SELECT pgtrickle.drop_stream_table('order_totals');</sql>
</rollback>
</changeSet>
Rollback Support
Always include <rollback> blocks that drop the stream table:
<changeSet id="add-daily-revenue-st" author="dev">
<sql>
SELECT pgtrickle.create_stream_table(
'daily_revenue',
$pgt$SELECT date_trunc('day', created_at) AS day,
SUM(amount) AS revenue
FROM orders GROUP BY 1$pgt$,
schedule => '30s',
refresh_mode => 'DIFFERENTIAL'
);
</sql>
<rollback>
<sql>SELECT pgtrickle.drop_stream_table('daily_revenue');</sql>
</rollback>
</changeSet>
Altering Stream Tables
<changeSet id="update-order-totals-schedule" author="dev">
<sql>
SELECT pgtrickle.alter_stream_table(
'order_totals',
schedule => '10s'
);
</sql>
<rollback>
<sql>
SELECT pgtrickle.alter_stream_table(
'order_totals',
schedule => '5s'
);
</sql>
</rollback>
</changeSet>
Preconditions
Use Liquibase preconditions to check whether pg_trickle is available:
<changeSet id="create-stream-tables" author="dev">
<preConditions onFail="MARK_RAN">
<sqlCheck expectedResult="1">
SELECT COUNT(*) FROM pg_extension WHERE extname = 'pg_trickle'
</sqlCheck>
</preConditions>
<sql>
SELECT pgtrickle.create_stream_table(...);
</sql>
</changeSet>
Common Patterns
Environment-Specific Schedules
Use different schedules for development vs. production:
-- Use a function to parameterize schedules
SELECT pgtrickle.create_stream_table(
'order_totals',
$$SELECT region, COUNT(*) AS cnt FROM orders GROUP BY region$$,
schedule => CASE
WHEN current_setting('pg_trickle.enabled', true) = 'on'
THEN '5s'
ELSE '1m'
END,
refresh_mode => 'DIFFERENTIAL'
);
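If your migrations are generated rather than static, the same branching can live in the generator instead of in SQL. A hedged sketch; the environment names and intervals are assumptions, and the SQL template simply mirrors the call shape shown earlier.

```python
# Pick a per-environment refresh schedule at migration-generation time.
SCHEDULES = {"dev": "1m", "staging": "30s", "prod": "5s"}  # illustrative

def schedule_for(env: str, default: str = "1m") -> str:
    return SCHEDULES.get(env, default)

# The migration template then interpolates the chosen value:
sql = (
    "SELECT pgtrickle.create_stream_table("
    "'order_totals', "
    "$$SELECT region, COUNT(*) AS cnt FROM orders GROUP BY region$$, "
    f"schedule => '{schedule_for('prod')}', "
    "refresh_mode => 'DIFFERENTIAL');"
)
print("'5s'" in sql)  # True
```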
CI/Test Environments
In CI, set pg_trickle.enabled = off in postgresql.conf to prevent the
background scheduler from running during schema migrations. Stream tables
will still be created correctly — they just won't auto-refresh until the
scheduler is enabled.
Extension Dependency
Ensure CREATE EXTENSION pg_trickle runs before any stream table migration.
In Flyway, use an early versioned migration:
-- V0__extensions.sql
CREATE EXTENSION IF NOT EXISTS pg_trickle;
In Liquibase:
<changeSet id="install-extensions" author="dev" runOnChange="true">
<sql>CREATE EXTENSION IF NOT EXISTS pg_trickle;</sql>
</changeSet>
Further Reading
- SQL Reference — Complete function reference
- Configuration — GUC variables for schedule tuning
- Getting Started — First stream table walkthrough
ORM Integration Guides
pg_trickle stream tables are read-only materialized views that refresh automatically. This page documents how to use stream tables from popular Python ORMs — SQLAlchemy and Django ORM.
Key Principles
- Stream tables are read-only. All writes go to the source tables; pg_trickle refreshes stream tables in the background.
- Model stream tables as views, not regular tables. ORMs should never attempt INSERT, UPDATE, or DELETE on a stream table.
- Internal columns are hidden. The __pgt_row_id column used for incremental maintenance is excluded from SELECT * queries.
SQLAlchemy
Read-Only Model Definition
Map a stream table as a read-only model using __table_args__:
from sqlalchemy import Column, Integer, Numeric, String, BigInteger
from sqlalchemy.orm import DeclarativeBase
class Base(DeclarativeBase):
pass
class OrderTotals(Base):
"""Read-only model backed by pg_trickle stream table."""
__tablename__ = "order_totals"
    # Map the stream table's row ID as primary key for ORM identity.
    # Use a non-dunder attribute name: a leading "__" in a class body is
    # name-mangled by Python, so the column name is given explicitly.
    pgt_row_id = Column("__pgt_row_id", BigInteger, primary_key=True)
region = Column(String, nullable=False)
order_count = Column(BigInteger, nullable=False)
total = Column(Numeric(10, 2), nullable=False)
__table_args__ = {
"info": {"readonly": True}, # Convention marker
}
Querying
Query stream tables like any other SQLAlchemy model:
from sqlalchemy import select
# All regions
stmt = select(OrderTotals).order_by(OrderTotals.total.desc())
results = session.execute(stmt).scalars().all()
# Filtered
stmt = (
select(OrderTotals)
.where(OrderTotals.order_count > 10)
.where(OrderTotals.region == "East")
)
row = session.execute(stmt).scalar_one_or_none()
Preventing Accidental Writes
Use SQLAlchemy events to block write operations:
from sqlalchemy import event
from sqlalchemy.orm import Session

READONLY_TABLES = {"order_totals", "daily_revenue", "customer_stats"}

# Listen on the Session class so the guard applies to every session
@event.listens_for(Session, "before_flush")
def block_stream_table_writes(session, flush_context, instances):
for obj in session.new | session.dirty | session.deleted:
table_name = obj.__class__.__tablename__
if table_name in READONLY_TABLES:
raise RuntimeError(
f"Cannot write to stream table '{table_name}'. "
f"Write to the source table instead."
)
Reflecting Stream Tables
If you prefer reflection over explicit models:
from sqlalchemy import MetaData, Table, create_engine
engine = create_engine("postgresql://...")
metadata = MetaData()
# Reflect the stream table (treated as a regular table by PostgreSQL)
order_totals = Table("order_totals", metadata, autoload_with=engine)
# Query
with engine.connect() as conn:
result = conn.execute(order_totals.select().limit(10))
for row in result:
print(row)
Checking Freshness
Query the stream table's metadata to check when it was last refreshed:
from sqlalchemy import text
def get_staleness(session, st_name: str) -> dict:
"""Return freshness info for a stream table."""
result = session.execute(
text("SELECT * FROM pgtrickle.get_staleness(:name)"),
{"name": st_name},
).mappings().one()
return dict(result)
# Usage
staleness = get_staleness(session, "order_totals")
print(f"Last refresh: {staleness['data_timestamp']}")
print(f"Stale for: {staleness['staleness_seconds']}s")
Async SQLAlchemy (2.0+)
Works identically with async_session:
from sqlalchemy.ext.asyncio import AsyncSession
async def get_top_regions(session: AsyncSession, limit: int = 10):
stmt = (
select(OrderTotals)
.order_by(OrderTotals.total.desc())
.limit(limit)
)
result = await session.execute(stmt)
return result.scalars().all()
Django ORM
Read-Only Model Definition
Use managed = False so Django never creates, alters, or drops the table:
# models.py
from django.db import models
class OrderTotals(models.Model):
"""Read-only model backed by pg_trickle stream table."""
region = models.CharField(max_length=255)
order_count = models.BigIntegerField()
total = models.DecimalField(max_digits=10, decimal_places=2)
class Meta:
managed = False # Django will not create/alter this table
db_table = "order_totals"
def save(self, *args, **kwargs):
raise NotImplementedError("Stream tables are read-only")
def delete(self, *args, **kwargs):
raise NotImplementedError("Stream tables are read-only")
Querying
Standard Django QuerySet operations work:
# All regions sorted by total
OrderTotals.objects.all().order_by("-total")
# Filtered
OrderTotals.objects.filter(
order_count__gt=10,
region="East"
).first()
# Aggregation (on the stream table itself)
from django.db.models import Sum, Avg
OrderTotals.objects.aggregate(
total_revenue=Sum("total"),
avg_orders=Avg("order_count"),
)
Django Migrations
Since managed = False, Django migrations won't touch stream tables.
Create stream tables in a custom migration using RunSQL:
# migrations/0003_create_stream_tables.py
from django.db import migrations
class Migration(migrations.Migration):
dependencies = [
("myapp", "0002_create_orders_table"),
]
operations = [
migrations.RunSQL(
sql="""
SELECT pgtrickle.create_stream_table(
'order_totals',
$pgt$SELECT region,
COUNT(*) AS order_count,
SUM(amount) AS total
FROM orders GROUP BY region$pgt$,
schedule => '5s',
refresh_mode => 'DIFFERENTIAL'
);
""",
reverse_sql="""
SELECT pgtrickle.drop_stream_table('order_totals');
""",
),
]
Read-Only Mixin
Create a reusable mixin for all stream table models:
class StreamTableMixin(models.Model):
"""Base class for pg_trickle stream table models."""
class Meta:
abstract = True
managed = False
def save(self, *args, **kwargs):
raise NotImplementedError(
f"{self.__class__.__name__} is a read-only stream table. "
f"Write to the source table instead."
)
def delete(self, *args, **kwargs):
raise NotImplementedError(
f"{self.__class__.__name__} is a read-only stream table."
)
# Usage
class OrderTotals(StreamTableMixin):
region = models.CharField(max_length=255)
order_count = models.BigIntegerField()
total = models.DecimalField(max_digits=10, decimal_places=2)
class Meta(StreamTableMixin.Meta):
db_table = "order_totals"
class DailyRevenue(StreamTableMixin):
day = models.DateField()
revenue = models.DecimalField(max_digits=12, decimal_places=2)
class Meta(StreamTableMixin.Meta):
db_table = "daily_revenue"
Checking Freshness
Use raw SQL to query pg_trickle diagnostics:
from django.db import connection
def get_staleness(st_name: str) -> dict:
"""Return freshness info for a stream table."""
with connection.cursor() as cursor:
cursor.execute(
"SELECT * FROM pgtrickle.get_staleness(%s)", [st_name]
)
columns = [col.name for col in cursor.description]
row = cursor.fetchone()
return dict(zip(columns, row)) if row else {}
Django REST Framework
Stream table models work with DRF serializers and viewsets:
from rest_framework import serializers, viewsets
class OrderTotalsSerializer(serializers.ModelSerializer):
class Meta:
model = OrderTotals
fields = ["region", "order_count", "total"]
class OrderTotalsViewSet(viewsets.ReadOnlyModelViewSet):
"""Read-only API endpoint for order totals stream table."""
queryset = OrderTotals.objects.all()
serializer_class = OrderTotalsSerializer
Common Patterns
Write to Source, Read from Stream
The fundamental pattern: all writes go to source tables (normal ORM models), reads come from stream tables (read-only models).
# Write to source table (normal ORM)
order = Order(region="East", amount=Decimal("99.99"))
session.add(order)
session.commit()
# Read from stream table (auto-refreshed by pg_trickle)
totals = session.execute(
select(OrderTotals).where(OrderTotals.region == "East")
).scalar_one()
print(f"East: {totals.order_count} orders, ${totals.total}")
Handling Eventual Consistency
Stream tables refresh on a schedule (e.g., every 5 seconds). After writing to a source table, the stream table may be briefly stale. Options:
- Accept staleness — suitable for dashboards and reports.
- Force refresh — call pgtrickle.refresh_stream_table() after critical writes.
- Use IMMEDIATE mode — the stream table refreshes within the same transaction.
# Option 2: Force refresh after a critical write
session.execute(text(
"SELECT pgtrickle.refresh_stream_table('order_totals')"
))
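A middle ground between options 1 and 2 is to poll for freshness after a write. A hedged sketch built on the `get_staleness()` helper pattern shown earlier; `probe` is a stand-in for a real database call, and the timing values are illustrative.

```python
import time

# Poll a staleness probe until the stream table's data_timestamp passes
# the time of our write, or give up after a timeout.
def wait_until_fresh(probe, written_at: float, timeout_s: float = 10.0) -> bool:
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if probe()["data_timestamp"] >= written_at:
            return True
        time.sleep(0.01)  # real code would poll on the refresh schedule
    return False

samples = iter([{"data_timestamp": 1.0}, {"data_timestamp": 5.0}])
print(wait_until_fresh(lambda: next(samples), written_at=4.0))  # True
```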
Further Reading
- SQL Reference — Complete function reference
- Configuration — Schedule tuning and refresh modes
- Getting Started — First stream table walkthrough
- dbt Integration — Using pg_trickle with dbt
Architecture
This document describes the internal architecture of pg_trickle — a PostgreSQL 18 extension that implements stream tables with differential view maintenance. For a high-level description of what pg_trickle does and why, read ESSENCE.md. For release milestones and future plans, see ROADMAP.md.
High-Level Overview
┌─────────────────────────────────────────────────────────────────┐
│ PostgreSQL 18 Backend │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌─────────────┐ │
│ │ Source │ │ Source │ │ Storage │ │ Storage │ │
│ │ Table A │ │ Table B │ │ Table X │ │ Table Y │ │
│ └────┬─────┘ └────┬─────┘ └────▲─────┘ └────▲────────┘ │
│ │ │ │ │ │
│ ═════╪══════════════╪══════════════╪══════════════╪════════ │
│ │ │ │ │ │
│ ┌────▼──────────────▼────┐ ┌────┴──────────────┴────┐ │
│ │ Hybrid CDC Layer │ │ Delta Application │ │
│ │ Triggers ──or── WAL │ │ (INSERT/DELETE diffs) │ │
│ └────────────┬───────────┘ └────────────▲───────────┘ │
│ │ │ │
│ ┌────────────▼───────────┐ ┌────────────┴───────────┐ │
│ │ Change Buffer │ │ DVM Engine │ │
│ │ (pgtrickle_changes.*) │ │ (Operator Tree) │ │
│ └────────────┬───────────┘ └────────────▲───────────┘ │
│ │ │ │
│ └────────────┬───────────────┘ │
│ │ │
│ ┌─────────────────────────▼─────────────────────────────┐ │
│ │ Refresh Engine │ │
│ │ ┌──────────┐ ┌──────────┐ ┌─────────────────────┐ │ │
│ │ │ Frontier │ │ DAG │ │ Scheduler │ │ │
│ │ │ Tracker │ │ Resolver │ │ (canonical schedule)│ │ │
│ │ └──────────┘ └──────────┘ └─────────────────────┘ │ │
│ └───────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ Catalog (pgtrickle.*) │ │
│ │ pgt_stream_tables │ pgt_dependencies │ pgt_refresh_history│ │
│ └────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Monitoring Layer │ │
│ │ st_refresh_stats │ slot_health │ check_cdc_health │ │
│ │ explain_st │ views │ NOTIFY alerting │ │
│ └──────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
Component Details
1. SQL API Layer (src/api.rs)
The public entry point for users. All operations are exposed as #[pg_extern] functions in the pgtrickle schema:
- create_stream_table — Applies a chain of auto-rewrite passes (view inlining → DISTINCT ON → GROUPING SETS → scalar subquery in WHERE → correlated scalar subquery in SELECT → SubLinks in OR → multi-PARTITION BY windows), parses the defining query, builds an operator tree, creates the storage table, registers CDC slots, populates the catalog, and optionally performs an initial full refresh.
- alter_stream_table — Modifies schedule, refresh mode, status (ACTIVE/SUSPENDED), or defining query. Query changes trigger schema migration, dependency updates, and a full refresh within a single transaction.
- drop_stream_table — Removes the storage table, catalog entries, and cleans up CDC slots.
- refresh_stream_table — Triggers a manual refresh (same path as automatic scheduling).
- pgt_status — Returns a summary of all registered stream tables.
2. Catalog (src/catalog.rs)
The catalog manages persistent metadata stored in PostgreSQL tables within the pgtrickle schema:
| Table | Purpose |
|---|---|
| pgtrickle.pgt_stream_tables | Core metadata: name, query, schedule, status, frontier, etc. |
| pgtrickle.pgt_dependencies | DAG edges from ST to source tables |
| pgtrickle.pgt_refresh_history | Audit log of every refresh operation |
| pgtrickle.pgt_change_tracking | Per-source CDC slot metadata |
Schema creation is handled by extension_sql!() macros that run at CREATE EXTENSION time.
Entity-Relationship Diagram
erDiagram
pgt_stream_tables {
bigserial pgt_id PK
oid pgt_relid UK "OID of materialized storage table"
text pgt_name
text pgt_schema
text defining_query
text original_query "User's original SQL (pre-inlining)"
text schedule "Duration or cron expression"
text refresh_mode "FULL | DIFFERENTIAL | IMMEDIATE"
text status "INITIALIZING | ACTIVE | SUSPENDED | ERROR"
boolean is_populated
timestamptz data_timestamp "Freshness watermark"
jsonb frontier "DBSP-style version frontier"
timestamptz last_refresh_at
int consecutive_errors
boolean needs_reinit
float8 auto_threshold
float8 last_full_ms
timestamptz created_at
timestamptz updated_at
}
pgt_dependencies {
bigint pgt_id PK,FK "References pgt_stream_tables.pgt_id"
oid source_relid PK "OID of source table"
text source_type "TABLE | STREAM_TABLE | VIEW"
text_arr columns_used "Column-level lineage"
text cdc_mode "TRIGGER | TRANSITIONING | WAL"
text slot_name "Replication slot (WAL mode)"
pg_lsn decoder_confirmed_lsn "WAL decoder progress"
timestamptz transition_started_at "Trigger→WAL transition start"
}
pgt_refresh_history {
bigserial refresh_id PK
bigint pgt_id FK "References pgt_stream_tables.pgt_id"
timestamptz data_timestamp
timestamptz start_time
timestamptz end_time
text action "NO_DATA | FULL | DIFFERENTIAL | REINITIALIZE | SKIP"
bigint rows_inserted
bigint rows_deleted
text error_message
text status "RUNNING | COMPLETED | FAILED | SKIPPED"
text initiated_by "SCHEDULER | MANUAL | INITIAL"
timestamptz freshness_deadline
}
pgt_change_tracking {
oid source_relid PK "OID of tracked source table"
text slot_name "Trigger function name"
pg_lsn last_consumed_lsn
bigint_arr tracked_by_pgt_ids "ST IDs sharing this source"
}
pgt_stream_tables ||--o{ pgt_dependencies : "has sources"
pgt_stream_tables ||--o{ pgt_refresh_history : "has refresh history"
pgt_stream_tables }o--o{ pgt_change_tracking : "tracks via pgt_ids array"
Note: Change buffer tables (pgtrickle_changes.changes_<oid>) are created dynamically per source table OID and live in the separate pgtrickle_changes schema.
3. CDC / Change Data Capture (src/cdc.rs, src/wal_decoder.rs)
pg_trickle uses a hybrid CDC architecture that starts with triggers and optionally transitions to WAL-based (logical replication) capture for lower write-side overhead.
Trigger Mode (default)
- Trigger Management — Creates AFTER INSERT OR UPDATE OR DELETE row-level triggers (pg_trickle_cdc_<oid>) on each tracked source table. Each trigger fires a PL/pgSQL function (pg_trickle_cdc_fn_<oid>()) that writes changes to the buffer table.
- Change Buffering — Captured changes are written to per-source change buffer tables in the pgtrickle_changes schema. Each row captures the LSN (pg_current_wal_lsn()), transaction ID, action type (I/U/D), and the new/old row data as typed columns (new_<col> TYPE, old_<col> TYPE) — native PostgreSQL types, not JSONB.
- Cleanup — Consumed changes are deleted after each successful refresh via delete_consumed_changes(), bounded by the upper LSN to prevent unbounded scans.
- Lifecycle — Triggers and trigger functions are automatically created when a source table is first tracked, and dropped when the last stream table referencing a source is removed.
The trigger approach was chosen as the default for transaction safety (triggers can be created in the same transaction as DDL), simplicity (no slot management, no wal_level = logical requirement), and immediate visibility (changes are visible in buffer tables as soon as the source transaction commits).
WAL Mode (optional, automatic transition)
When pg_trickle.cdc_mode is set to 'auto' or 'wal' and wal_level = logical is available, the system transitions from trigger-based to WAL-based CDC after the first successful refresh:
- WAL Availability Detection — At stream table creation, checks whether `wal_level = logical` is configured. If so, the source dependency is marked for WAL transition.
- WAL Decoder Background Worker — A dedicated background worker (`src/wal_decoder.rs`) polls logical replication slots and writes decoded changes into the same change buffer tables used by triggers, ensuring a uniform format for the DVM engine.
- Transition Orchestration — The transition is a three-step process: (a) create a replication slot, (b) wait for the decoder to catch up to the trigger's last confirmed LSN, (c) drop the trigger and switch the dependency to WAL mode. If the decoder doesn't catch up within `pg_trickle.wal_transition_timeout` (default 300 s), the system falls back to triggers.
- CDC Mode Tracking — Each source dependency in `pgt_dependencies` carries a `cdc_mode` column (TRIGGER / TRANSITIONING / WAL) and WAL-specific metadata (`slot_name`, `decoder_confirmed_lsn`, `transition_started_at`).
See ADR-001 and ADR-002 in plans/adrs/PLAN_ADRS.md for the original design rationale and plans/sql/PLAN_HYBRID_CDC.md for the full implementation plan.
Immediate Mode / Transactional IVM (src/ivm.rs)
When refresh_mode = 'IMMEDIATE', pg_trickle uses statement-level AFTER triggers with transition tables instead of row-level CDC triggers. The stream table is maintained synchronously within the same transaction as the base table DML.
- BEFORE Triggers — Statement-level BEFORE triggers on each base table acquire an advisory lock on the stream table to prevent concurrent conflicting updates.
- AFTER Triggers — Statement-level AFTER triggers with `REFERENCING NEW TABLE AS ... OLD TABLE AS ...` copy the transition table data to temp tables, then call the Rust `pgt_ivm_apply_delta()` function.
- Delta Computation — The DVM engine's `Scan` operator reads from the temp tables (via `DeltaSource::TransitionTable`) instead of change buffer tables. No LSN filtering or net-effect computation is needed — each trigger invocation represents a single atomic statement.
- Delta Application — The computed delta is applied via explicit DML (DELETE + INSERT ON CONFLICT) to the stream table.
- TRUNCATE — A separate AFTER TRUNCATE trigger calls `pgt_ivm_handle_truncate()`, which truncates the stream table and re-populates it from the defining query.
No change buffer tables, no scheduler involvement, and no WAL infrastructure is needed for IMMEDIATE mode. See plans/sql/PLAN_TRANSACTIONAL_IVM.md for the design plan.
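The statement-level trigger shape used here can be sketched as follows. The trigger and function names are placeholders; pg_trickle generates the real objects per base table and DML verb.

```sql
-- Illustrative IMMEDIATE-mode trigger: a statement-level AFTER trigger
-- with transition tables (names are hypothetical)
CREATE TRIGGER pgt_ivm_upd
AFTER UPDATE ON orders
REFERENCING OLD TABLE AS pgt_old NEW TABLE AS pgt_new
FOR EACH STATEMENT
EXECUTE FUNCTION pgt_ivm_statement_fn();
```

Transition tables (`pgt_old` / `pgt_new`) give the trigger function the full set of rows touched by the statement in one invocation, which is what makes synchronous, per-statement delta computation practical.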
ST-to-ST Change Capture (v0.11.0+)
When a stream table's defining query references another stream table (rather than a base table), neither triggers nor WAL capture apply — the upstream source is itself maintained by pg_trickle. A dedicated ST change buffer mechanism enables downstream stream tables to refresh differentially even when their source is another stream table.
Base Table ──trigger/WAL──▶ changes_<oid> (base-table buffer)
Stream Table A ──refresh──▶ changes_pgt_<pgt_id> (ST buffer for A's consumers)
Stream Table B reads from changes_pgt_<pgt_id> (B depends on A)
Buffer schema. ST change buffers are named pgtrickle_changes.changes_pgt_<pgt_id> (using the internal pgt_id rather than the OID). Unlike base-table buffers, they store only new_* columns — no old_* columns — because ST deltas are expressed as INSERT/DELETE pairs, not UPDATE rows.
Delta capture — DIFFERENTIAL path. When an upstream stream table refreshes in DIFFERENTIAL mode and has downstream consumers, the refresh engine captures the computed delta (the INSERT and DELETE rows applied to the upstream ST) into the ST change buffer via explicit DML. Downstream stream tables then read from this buffer exactly as they would read from a base-table change buffer.
Delta capture — FULL path. When an upstream stream table refreshes in FULL mode (e.g., due to a mode downgrade or full => true), the engine takes a pre-refresh snapshot, executes the full refresh, then computes an EXCEPT ALL diff between the old and new contents. The resulting INSERT/DELETE pairs are written to the ST change buffer. This prevents FULL refreshes from cascading through the entire dependency chain — downstream STs always receive a minimal delta regardless of how the upstream was refreshed.
Frontier tracking. ST source positions are tracked in the same frontier JSONB structure as base-table sources, using pgt_<upstream_pgt_id> as the key (e.g., {"pgt_42": 157}) rather than the OID-based keys used for base tables. The scheduler's has_stream_table_source_changes() function compares the downstream's last-consumed frontier position against the upstream buffer's current maximum LSN to decide whether a refresh is needed.
Lifecycle. ST change buffers are created automatically when a stream table gains its first downstream consumer (create_st_change_buffer_table()), and dropped when the last downstream consumer is removed (drop_st_change_buffer_table()). On upgrade from pre-v0.11.0, existing ST-to-ST dependencies have their buffers auto-created on the first scheduler tick. Consumed rows are cleaned up by cleanup_st_change_buffers_by_frontier() after each successful downstream refresh.
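The FULL-path diff described above boils down to a pair of `EXCEPT ALL` queries. As a sketch, with a hypothetical pre-refresh snapshot `old_snapshot` of stream table `stream_table_a`:

```sql
-- Rows recorded as DELETEs: present before the FULL refresh, absent after
SELECT * FROM old_snapshot  EXCEPT ALL SELECT * FROM stream_table_a;

-- Rows recorded as INSERTs: absent before the FULL refresh, present after
SELECT * FROM stream_table_a EXCEPT ALL SELECT * FROM old_snapshot;
```

`EXCEPT ALL` (rather than `EXCEPT`) preserves duplicate multiplicities, so the resulting INSERT/DELETE pairs form an exact delta even for bag semantics.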
4. DVM Engine (src/dvm/)
The Differential View Maintenance engine is the core of the system. It transforms the defining SQL query into an executable operator tree that can compute deltas efficiently.
Auto-Rewrite Pipeline (src/dvm/parser.rs)
Before the defining query is parsed into an operator tree, it passes through a chain of auto-rewrite passes that normalize SQL constructs the DVM parser doesn't handle directly:
| Pass | Function | Purpose |
|---|---|---|
| #0 | rewrite_views_inline() | Replace view references with (view_definition) AS alias subqueries |
| #1 | rewrite_distinct_on() | Convert DISTINCT ON to ROW_NUMBER() OVER (…) = 1 window subquery |
| #2 | rewrite_grouping_sets() | Decompose GROUPING SETS / CUBE / ROLLUP into UNION ALL of GROUP BY |
| #3 | rewrite_scalar_subquery_in_where() | Convert WHERE col > (SELECT …) to CROSS JOIN |
| #4 | rewrite_sublinks_in_or() | Split WHERE a OR EXISTS (…) into UNION branches |
| #5 | rewrite_multi_partition_windows() | Split multiple PARTITION BY clauses into joined subqueries |
The view inlining pass (#0) runs first so that view definitions containing DISTINCT ON, GROUPING SETS, etc. are further rewritten by downstream passes. Nested views are expanded via a fixpoint loop (max depth 10).
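As an example of pass #1, a `DISTINCT ON` query can be rewritten into an equivalent `ROW_NUMBER()` window subquery. The query and alias names below are illustrative, not the exact SQL the rewriter emits:

```sql
-- Before:
--   SELECT DISTINCT ON (customer_id) customer_id, total
--   FROM orders ORDER BY customer_id, created_at DESC;

-- After (semantically equivalent window form):
SELECT customer_id, total
FROM (
    SELECT customer_id, total,
           ROW_NUMBER() OVER (PARTITION BY customer_id
                              ORDER BY created_at DESC) AS rn
    FROM orders
) q
WHERE rn = 1;
```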
Query Parser (src/dvm/parser.rs)
Parses the defining query using PostgreSQL's internal parser (via pgrx raw_parser) and extracts:
- WITH clause — CTE definitions (non-recursive: inline expansion or shared delta; recursive: detected for mode gating)
- Target list — output columns
- FROM clause — source tables, joins, subqueries, and CTE references
- WHERE clause — filters
- GROUP BY / aggregate functions
- DISTINCT / UNION ALL / INTERSECT / EXCEPT
The parser produces an OpTree — a tree of operator nodes. CTE handling follows a tiered approach:
- Tier 1 (Inline Expansion) — Non-recursive CTEs referenced once are expanded into `Subquery` nodes, equivalent to subqueries in FROM.
- Tier 2 (Shared Delta) — Non-recursive CTEs referenced multiple times produce `CteScan` nodes that share a single delta computation via a CTE registry and delta cache.
- Tier 3a/3b/3c (Recursive) — Recursive CTEs (`WITH RECURSIVE`) are detected via `query_has_recursive_cte()`. In FULL mode, the query executes as-is. In DIFFERENTIAL mode, the strategy is auto-selected: semi-naive evaluation for INSERT-only changes, Delete-and-Rederive (DRed) for mixed changes, or recomputation fallback when CTE columns don't match ST storage or when the recursive term contains non-monotone operators (EXCEPT, Aggregate, Window, DISTINCT, AntiJoin, INTERSECT SET). In IMMEDIATE mode, the same semi-naive / DRed machinery runs against statement transition tables and is bounded by `pg_trickle.ivm_recursive_max_depth` to guard against unbounded recursion.
Operators (src/dvm/operators/)
Each operator knows how to generate a delta query — given a set of changes to its inputs, it produces the corresponding changes to its output:
| Operator | Delta Strategy |
|---|---|
| Scan | Direct passthrough of CDC changes |
| Filter | Apply WHERE predicate to deltas |
| Project | Apply column projection to deltas |
| Join | Join deltas against the other side's current state |
| OuterJoin | LEFT/RIGHT outer join with NULL padding |
| FullJoin | FULL OUTER JOIN with 8-part delta (both sides may produce NULLs) |
| Aggregate | Recompute group values where affected keys changed |
| Distinct | COUNT-based duplicate tracking |
| UnionAll | Merge deltas from both branches |
| Intersect | Dual-count multiplicity with LEAST boundary crossing |
| Except | Dual-count multiplicity with GREATEST(0, L-R) boundary crossing |
| Subquery | Transparent delegation + optional column renaming (CTEs, subselects) |
| CteScan | Shared delta lookup from CTE cache (multi-reference CTEs) |
| RecursiveCte | Semi-naive / DRed / recomputation for WITH RECURSIVE |
| Window | Partition-based recomputation for window functions |
| LateralFunction | Row-scoped recomputation for SRFs in FROM (jsonb_array_elements, unnest, etc.) |
| LateralSubquery | Row-scoped recomputation for correlated subqueries in LATERAL FROM |
| SemiJoin | EXISTS / IN subquery delta via semi-join |
| AntiJoin | NOT EXISTS / NOT IN subquery delta via anti-join |
| ScalarSubquery | Correlated scalar subquery in SELECT list |
See DVM_OPERATORS.md for detailed descriptions.
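To make the Join strategy concrete: the bilinear expansion means a join delta can be computed from the deltas and snapshots of its inputs, never by re-joining the full tables. One valid factoring, sketched with hypothetical tables `delta_a` / `delta_b` (the captured changes) and `b_old` / `a_new` (pre- and post-change snapshots):

```sql
-- One factoring of the bilinear join delta:
--   delta(A JOIN B) = (dA JOIN B_old)  UNION ALL  (A_new JOIN dB)
SELECT * FROM delta_a JOIN b_old   USING (k)
UNION ALL
SELECT * FROM a_new   JOIN delta_b USING (k);
```

Expanding `A_new = A_old + dA` shows this equals the three-term form `dA⋈B + A⋈dB + dA⋈dB`; the cost is proportional to the delta sizes, not the table sizes.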
Diff Engine (src/dvm/diff.rs)
Generates the final diff SQL that:
- Computes the delta from the operator tree
- Produces `('+', row)` for inserts and `('-', row)` for deletes
- Applies the diff via `DELETE` matching old rows and `INSERT` for new rows
5. DAG / Dependency Graph (src/dag.rs)
Stream tables can depend on other stream tables (cascading), forming a Directed Acyclic Graph:
- Cycle detection — Detects circular dependencies at creation time using Kahn's algorithm (BFS topological sort). When `pg_trickle.allow_circular = true`, monotone cycles (queries using only safe operators — joins, filters, UNION ALL, etc.) are allowed; non-monotone cycles (aggregates, EXCEPT, window functions, anti-joins) are rejected. SCC IDs are automatically assigned to cycle members and recomputed on drop/alter.
- SCC decomposition — Tarjan's algorithm decomposes the graph into strongly connected components. Singleton SCCs are acyclic; multi-node SCCs contain cycles that are handled by fixed-point iteration in the scheduler.
- Monotonicity analysis — A static check (`check_monotonicity()` in `src/dvm/parser.rs`) determines whether a query's operators are safe for cyclic fixed-point iteration. Non-monotone operators (Aggregate, EXCEPT, Window, NOT EXISTS) block cycle creation.
- Topological ordering — Determines refresh order: upstream STs must be refreshed before downstream STs.
- Condensation order — `condensation_order()` returns SCCs in topological order, grouping cyclic STs for fixed-point iteration. The scheduler's `iterate_to_fixpoint()` processes multi-node SCCs by refreshing all members repeatedly until convergence (zero net changes) or `max_fixpoint_iterations` is exceeded.
- Cascade operations — When a source table changes, all transitive dependents are identified for refresh.
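Transitive-dependent discovery is a natural fit for a recursive CTE over the dependency catalog. The column names below (`pgt_id`, `source_oid`, `source_pgt_id`) are illustrative; consult `pgtrickle.pgt_dependencies` for the actual layout.

```sql
-- Sketch: all stream tables transitively downstream of public.orders
WITH RECURSIVE dependents AS (
    SELECT d.pgt_id
    FROM   pgtrickle.pgt_dependencies d
    WHERE  d.source_oid = 'public.orders'::regclass
  UNION
    SELECT d.pgt_id
    FROM   pgtrickle.pgt_dependencies d
    JOIN   dependents dep ON d.source_pgt_id = dep.pgt_id
)
SELECT pgt_id FROM dependents;
```

`UNION` (not `UNION ALL`) deduplicates, which also terminates the recursion if the same ST is reachable along multiple paths.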
6. Version / Frontier Tracking (src/version.rs)
Implements a per-source frontier (JSONB map of source_oid → LSN) to track exactly how far each stream table has consumed changes:
- Read frontier — Before refresh, read the frontier to know where to start consuming changes.
- Advance frontier — After a successful refresh, the frontier is updated to the latest consumed LSN.
- Consistent snapshots — The frontier ensures that each refresh processes a contiguous, non-overlapping window of changes.
Delayed View Semantics (DVS) Guarantee
The contents of every stream table are logically equivalent to evaluating its defining query at some past point in time — the data_timestamp. The scheduler refreshes STs in topological order so that when ST B references upstream ST A, A has already been refreshed to the target data_timestamp before B runs its delta query against A's contents. The frontier lifecycle is:
- Created — on first full refresh; records the LSN of each source at that moment.
- Advanced — on each differential refresh; the old frontier becomes the lower bound and the new frontier (with fresh LSNs) the upper bound. The DVM engine reads changes in `[old, new]`.
- Reset — on reinitialize; a fresh frontier is created from scratch.
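A frontier is just a JSONB map and can be inspected directly. The column and table names here are illustrative; see the catalog section above for the real schema.

```sql
-- Sketch: unpack a stream table's per-source frontier into rows of
-- (source key, consumed position), e.g. ("16394", "0/1A2B3C4")
SELECT jsonb_each_text(frontier)
FROM   pgtrickle.pgt_stream_tables
WHERE  st_name = 'active_orders';
```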
7. Refresh Engine (src/refresh.rs)
Orchestrates the complete refresh cycle:
┌──────────────┐
│ Check State │ → Is ST active? Has it been populated?
└──────┬───────┘
│
┌─────▼──────┐
│ Drain CDC │ → Read WAL changes into change buffer tables
└─────┬──────┘
│
┌─────▼──────────────┐
│ Determine Action │ → FULL, DIFFERENTIAL, NO_DATA, REINITIALIZE, or SKIP?
│ │ (adaptive: if change ratio > pg_trickle.differential_max_change_ratio,
│ │ downgrade DIFFERENTIAL → FULL automatically)
└─────┬──────────────┘
│
┌─────▼──────┐
│ Execute │ → Full: TRUNCATE + INSERT ... SELECT
│ │ Differential: Generate & apply delta SQL
└─────┬──────┘
│
┌─────▼──────────────┐
│ Record History │ → Write to pgtrickle.pgt_refresh_history
└─────┬──────────────┘
│
┌─────▼──────────────┐
│ Advance Frontier │ → Update JSONB frontier in catalog
└─────┬──────────────┘
│
┌─────▼──────────────┐
│ Reset Error Count │ → On success, reset consecutive_errors to 0
└──────────────────────┘
8. Background Worker & Scheduling (src/scheduler.rs)
Registration & Lifecycle
pg_trickle registers one PostgreSQL background worker — the scheduler — during _PG_init() (extension load). Because it is registered at startup, pg_trickle must appear in shared_preload_libraries, which requires a server restart.
┌──────────────────────────────────────────────────────────────────┐
│ PostgreSQL postmaster │
│ │
│ shared_preload_libraries = 'pg_trickle' │
│ │ │
│ ▼ │
│ _PG_init() │
│ ├─ Register GUCs (pg_trickle.enabled, scheduler_interval_ms …) │
│ ├─ Register shared memory (PgTrickleSharedState, atomics) │
│ └─ BackgroundWorkerBuilder::new("pg_trickle scheduler") │
│ .set_start_time(RecoveryFinished) │
│ .set_restart_time(5s) ← auto-restart on crash │
│ .load() │
│ │
│ After recovery finishes: │
│ │ │
│ ▼ │
│ pg_trickle_scheduler_main() ← background worker starts │
│ ├─ Attach SIGHUP + SIGTERM handlers │
│ ├─ Connect to SPI (database = "postgres") │
│ ├─ Crash recovery: mark stale RUNNING records as FAILED │
│ └─ Enter main loop ─────────────────────────┐ │
│ │ │ │
│ ▼ │ │
│ wait_latch(scheduler_interval_ms) │ │
│ │ │ │
│ ┌───▼───────────────────────────────┐ │ │
│ │ SIGTERM? → log + break │ │ │
│ │ pg_trickle.enabled = false? → skip │ │ │
│ │ Otherwise → scheduler tick │ │ │
│ └───┬───────────────────────────────┘ │ │
│ │ │ │
│ └──────────── loop ────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘
Key lifecycle properties:
| Property | Behaviour |
|---|---|
| Start condition | After PostgreSQL recovery finishes (RecoveryFinished) |
| Auto-restart | 5-second delay after an unexpected crash |
| Graceful shutdown | Handles SIGTERM — breaks the main loop and exits cleanly |
| Config reload | Handles SIGHUP — re-reads GUC values on the next latch wake |
| Crash recovery | On startup, any pgt_refresh_history rows stuck in RUNNING status are marked FAILED (the transaction that wrote them was rolled back by PostgreSQL, but the status row may have been committed in a prior transaction) |
| Database | Connects to the postgres database via SPI |
| Standby / replica | On standby servers (pg_is_in_recovery() = true), the worker enters a sleep loop and does not attempt refreshes. Stream tables are still readable on standbys — they are regular heap tables replicated via physical streaming replication. After promotion the scheduler resumes automatically. See the FAQ § Replication for details on logical replication and subscriber limitations. |
Scheduler Tick
Each tick of the main loop performs the following steps inside a single transaction:
- DAG rebuild — Compare the shared-memory `DAG_REBUILD_SIGNAL` counter against the local copy. If it advanced (a `CREATE`, `ALTER`, or `DROP` stream table occurred), rebuild the in-memory dependency graph (`StDag`) from the catalog.
- Topological traversal — Walk stream tables in dependency order (upstream before downstream). This ensures that when ST B references ST A, A is refreshed first.
- Per-ST evaluation — For each active ST:
  - Skip if in retry backoff (exponential, per-ST).
  - Skip if the schedule/cron says it is not yet due.
  - Skip if a row-level lock on the catalog entry indicates a concurrent refresh.
  - Check upstream change buffers for pending rows.
- Execute refresh — Acquire a row-level lock on the catalog entry → record `RUNNING` in history → run `FULL` / `DIFFERENTIAL` / `REINITIALIZE` → store the new frontier → release the lock → record completion.
- WAL transitions — Advance any trigger→WAL CDC mode transitions (`src/wal_decoder.rs`).
- Slot health — Check replication slot health and emit `NOTIFY` alerts.
- Prune retry state — Remove backoff entries for STs that no longer exist.
Sequential Processing (Default)
By default (`parallel_refresh_mode = 'off'`), the scheduler processes stream tables sequentially within a single background worker: all STs are refreshed one at a time in topological order.
`pg_trickle.max_concurrent_refreshes` (default 4) only prevents a manual `pgtrickle.refresh_stream_table()` call from overlapping with the scheduler on the same ST — it does not spawn additional workers.
The PostgreSQL GUC max_worker_processes (default 8) sets the server-wide budget for all background workers (autovacuum, parallel query, logical replication, extensions). In sequential mode pg_trickle consumes one slot from that budget.
Parallel Refresh (parallel_refresh_mode = 'on')
When enabled, the scheduler builds an execution-unit DAG from the stream-table dependency graph and dispatches independent units to dynamic background workers:
- Execution units — Each independent stream table becomes a singleton unit. Atomic consistency groups and IMMEDIATE-trigger closures are collapsed into composite units that run in a single worker for correctness.
- Ready queue — Units whose upstream dependencies have all completed enter the ready queue. The coordinator dispatches them subject to a per-database cap (`max_concurrent_refreshes`) and a cluster-wide cap (`max_dynamic_refresh_workers`).
- Dynamic workers — Each dispatched unit spawns a short-lived background worker via `BackgroundWorkerBuilder::load_dynamic()`. Workers claim a job from the `pgtrickle.pgt_scheduler_jobs` catalog table, execute the refresh, and exit.
The parallel path respects the same topological ordering as the
sequential path — downstream units only become ready after all upstream
units succeed. The worker-budget caps ensure pg_trickle does not exhaust
max_worker_processes.
See PLAN_PARALLELISM.md for the full design and CONFIGURATION.md for tuning guidance.
Retry & Error Handling
Each ST maintains an in-memory RetryState (reset on scheduler restart):
- Retryable errors (SPI failures, lock contention, slot issues) trigger exponential backoff.
- Permanent errors (schema mismatch, user errors) skip backoff but increment `consecutive_errors`.
- When `consecutive_errors` reaches `pg_trickle.max_consecutive_errors` (default 3), the ST is auto-suspended and a `NOTIFY` alert is emitted.
- Schema errors additionally set `needs_reinit`, triggering a `REINITIALIZE` on the next successful cycle.
Scheduling Policy
Automatic refresh scheduling uses canonical periods (48·2ⁿ seconds, n = 0, 1, 2, …) snapped to the user's schedule:
- Picks the smallest canonical period ≤ `schedule`.
- For a DOWNSTREAM schedule (NULL schedule), the ST refreshes only when explicitly triggered or when a downstream ST needs it.
- Advisory locks prevent concurrent refreshes of the same ST.
- The scheduler is driven by the background worker polling at the `pg_trickle.scheduler_interval_ms` GUC interval.
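For reference, the canonical period grid 48·2ⁿ can be enumerated directly in SQL (illustrative only; the extension computes this internally):

```sql
-- First few canonical periods: 48, 96, 192, 384, 768, 1536 seconds
SELECT n, 48 * 2 ^ n AS period_seconds
FROM generate_series(0, 5) AS n;
```

Doubling periods keep the schedule space small, so STs with similar schedules tend to land on the same canonical period and refresh together.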
Shared Memory (src/shmem.rs)
The scheduler background worker and user sessions share a PgTrickleSharedState structure protected by a PgLwLock. Key fields:
| Field | Type | Purpose |
|---|---|---|
dag_version | u64 | Incremented when the ST catalog changes; used by the scheduler to detect when the DAG needs rebuilding. |
scheduler_pid | i32 | PID of the scheduler background worker (0 if not running). |
scheduler_running | bool | Whether the scheduler is active. |
last_scheduler_wake | i64 | Unix timestamp of the last scheduler wake cycle (for monitoring). |
A separate PgAtomic<AtomicU64> named DAG_REBUILD_SIGNAL is incremented by API functions (create, alter, drop) after catalog mutations. The scheduler compares its local copy against the atomic counter to detect when to rebuild its in-memory DAG without holding a lock.
A second PgAtomic<AtomicU64> named CACHE_GENERATION tracks DDL events that may invalidate cached delta or MERGE templates across backends. When DDL hooks fire (view change, ALTER TABLE, function change) or API functions mutate the catalog, CACHE_GENERATION is bumped. Each backend maintains a thread-local generation counter; on the next refresh, if the shared generation has advanced, the backend flushes its delta template cache, MERGE template cache, and explicitly DEALLOCATEs tracked __pgt_merge_* prepared statements before rebuilding local state.
9. DDL Tracking (src/hooks.rs)
Event triggers monitor DDL changes to source tables and functions:
- `_on_ddl_end` — Fires on `ALTER TABLE` to detect column adds/drops/type changes. If a source table used by a ST is altered, the ST's `needs_reinit` flag is set. Also detects `CREATE OR REPLACE FUNCTION` / `ALTER FUNCTION` — if the function appears in a ST's `functions_used` catalog column, the ST is marked for reinit.
- `_on_sql_drop` — Fires on `DROP TABLE` to set `needs_reinit` for affected STs. Also detects `DROP FUNCTION` and marks affected STs for reinit.
- Function name extraction — `object_identity` strings (e.g., `public.my_func(integer, text)`) are parsed to extract the bare function name, which is matched against the `functions_used TEXT[]` column in `pgt_stream_tables`.
Reinitialization is deferred until the next refresh cycle, which then performs a REINITIALIZE action (drop and recreate the storage table from the updated query).
10. Error Handling (src/error.rs)
Centralized error types using thiserror:
- `PgTrickleError` variants cover catalog access, SQL execution, CDC, DVM, DAG, and config errors.
- Each refresh failure increments `consecutive_errors`.
- When `consecutive_errors` reaches `pg_trickle.max_consecutive_errors` (default 3), the ST is moved to `ERROR` status and suspended from automatic refresh.
- Manual intervention (`ALTER ... status => 'ACTIVE'`) resets the counter.
11. Monitoring (src/monitor.rs)
Provides observability functions:
- st_refresh_stats — Aggregate statistics (total/successful/failed refreshes, avg duration, staleness status).
- get_refresh_history — Per-ST audit trail.
- get_staleness — Current staleness in seconds.
- slot_health — Checks replication slot state and WAL retention.
- check_cdc_health — Per-source CDC health status including mode, slot lag, confirmed LSN, and alerts.
- explain_st — Describes the DVM plan for a given ST.
- diamond_groups — Lists detected diamond dependency groups, their members, convergence points, and epoch counters.
- Views — `pgtrickle.stream_tables_info` (computed staleness) and `pgtrickle.pg_stat_stream_tables` (combined stats).
NOTIFY Alerting
Operational events are broadcast via PostgreSQL NOTIFY on the pg_trickle_alert channel. Clients can subscribe with LISTEN pg_trickle_alert; and receive JSON-formatted events:
| Event | Condition |
|---|---|
stale | data staleness exceeds 2× schedule |
auto_suspended | ST suspended after pg_trickle.max_consecutive_errors failures |
reinitialize_needed | Upstream DDL change detected |
slot_lag_warning | Replication slot WAL retention exceeded pg_trickle.slot_lag_warning_threshold_mb |
cdc_transition_complete | Source transitioned from trigger to WAL-based CDC |
cdc_transition_failed | Trigger→WAL transition failed (fell back to triggers) |
refresh_completed | Refresh completed successfully |
refresh_failed | Refresh failed with an error |
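For example, from psql (the payload shown in the comment is illustrative; the exact JSON fields may differ):

```sql
LISTEN pg_trickle_alert;
-- Asynchronous notification received on a subsequent event, e.g.:
--   {"event": "auto_suspended", "stream_table": "active_orders", ...}
```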
12. Row ID Hashing (src/hash.rs)
Provides deterministic 64-bit row identifiers using xxHash (xxh64) with a fixed seed. Two SQL functions are exposed:
- `pgtrickle.pg_trickle_hash(text)` — Hash a single text value; used for simple single-column row IDs.
- `pgtrickle.pg_trickle_hash_multi(text[])` — Hash multiple values (separated by a record-separator byte `\x1E`) for composite keys (join row IDs, GROUP BY keys).
Row IDs are written into every stream table's storage as an internal __pgt_row_id BIGINT column and are used by the delta application phase to match DELETE candidates precisely.
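As a sketch of how a composite row ID might be derived (the function names come from the list above; this particular usage is illustrative, not the exact SQL the engine generates):

```sql
-- Hypothetical composite row ID over a two-column GROUP BY key
SELECT pgtrickle.pg_trickle_hash_multi(
           ARRAY[customer_id::text, order_date::text]
       ) AS __pgt_row_id
FROM orders;
```

Because the hash is deterministic and seeded identically everywhere, the same logical row always maps to the same `__pgt_row_id`, which is what lets the DELETE side of a delta find its target rows.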
13. Diamond Dependency Consistency (src/dag.rs)
When stream tables form diamond-shaped dependency graphs, a convergence (fan-in) node may read from multiple upstream STs that share a common ancestor:
A (source table)
/ \
B C (intermediate STs)
\ /
D (convergence / fan-in ST)
If B refreshes successfully but C fails, D would read a fresh version of B's data alongside stale data from C — a split-version inconsistency.
Detection
StDag::detect_diamonds() walks all fan-in nodes (STs with multiple upstream ST dependencies) and computes transitive ancestor sets per branch. If two or more branches share ancestors, a diamond is detected. Overlapping diamonds are merged.
Consistency Groups
StDag::compute_consistency_groups() converts detected diamonds into consistency groups — topologically ordered sets of STs that must be refreshed atomically. Each group contains:
- Members — All intermediate STs plus the convergence node, in refresh order.
- Convergence points — The fan-in nodes where multiple paths meet.
- Epoch counter — Advances on each successful atomic refresh.
STs not involved in any diamond are placed in singleton groups (no overhead).
Scheduler Wiring
When diamond_consistency = 'atomic' (per-ST or via the pg_trickle.diamond_consistency GUC):
- The scheduler wraps each multi-member group in a `SAVEPOINT pgt_consistency_group`.
- Each member is refreshed in topological order within the savepoint.
- If all succeed — `RELEASE SAVEPOINT` and advance the group epoch.
- If any member fails — `ROLLBACK TO SAVEPOINT` undoes all members' changes. The failure is logged and the group retries on the next scheduler tick.
With diamond_consistency = 'none', members refresh independently in topological order — matching pre-feature behavior.
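The atomic-group pattern reduces to standard savepoint SQL (issued internally by the scheduler; shown here only as a sketch):

```sql
SAVEPOINT pgt_consistency_group;
-- refresh B ... refresh C ... refresh D  (topological order)
RELEASE SAVEPOINT pgt_consistency_group;         -- all members succeeded
-- or, on any member's failure:
-- ROLLBACK TO SAVEPOINT pgt_consistency_group;  -- undo the whole group
```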
Schedule Policy
The diamond_schedule_policy setting (per-convergence-node or via the pg_trickle.diamond_schedule_policy GUC) controls when an atomic group fires:
| Policy | Trigger condition | Trade-off |
|---|---|---|
'fastest' (default) | Any member is due | Higher freshness, more refreshes |
'slowest' | All members are due | Lower resource cost, staler data |
The policy is set on the convergence (fan-in) node. When multiple convergence nodes exist in the same group (nested diamonds), the strictest policy wins (slowest > fastest). The GUC serves as a cluster-wide fallback for nodes without an explicit per-node setting.
Monitoring
The pgtrickle.diamond_groups() SQL function exposes detected groups for operational visibility. See SQL_REFERENCE.md for details.
14. Configuration (src/config.rs)
Runtime behavior is controlled by a growing set of GUC (Grand Unified Configuration) variables. See CONFIGURATION.md for the complete, current list.
| GUC | Default | Purpose |
|---|---|---|
pg_trickle.enabled | true | Master on/off switch for the scheduler |
pg_trickle.scheduler_interval_ms | 1000 | Scheduler background worker wake interval (ms) |
pg_trickle.min_schedule_seconds | 60 | Minimum allowed schedule |
pg_trickle.max_consecutive_errors | 3 | Errors before auto-suspending a ST |
pg_trickle.change_buffer_schema | pgtrickle_changes | Schema for change buffer tables |
pg_trickle.max_concurrent_refreshes | 4 | Maximum parallel refresh workers |
pg_trickle.differential_max_change_ratio | 0.15 | Change-to-table-size ratio above which DIFFERENTIAL falls back to FULL |
pg_trickle.cleanup_use_truncate | true | Use TRUNCATE instead of DELETE for change buffer cleanup when the entire buffer is consumed |
pg_trickle.user_triggers | 'auto' | User-defined trigger handling: auto / off (on accepted as deprecated alias for auto) |
pg_trickle.block_source_ddl | false | Block column-affecting DDL on tracked source tables instead of reinit |
pg_trickle.cdc_mode | 'auto' | CDC mechanism: auto / trigger / wal |
pg_trickle.wal_transition_timeout | 300 | Max seconds to wait for WAL decoder catch-up during transition |
pg_trickle.slot_lag_warning_threshold_mb | 100 | Warning threshold for WAL slot retention used by slot_lag_warning and health_check() |
pg_trickle.slot_lag_critical_threshold_mb | 1024 | Critical threshold for WAL slot retention used by check_cdc_health() alerts |
pg_trickle.diamond_consistency | 'atomic' | Diamond dependency consistency mode: atomic or none |
pg_trickle.diamond_schedule_policy | 'fastest' | Schedule policy for atomic diamond groups: fastest or slowest |
pg_trickle.merge_planner_hints | true | Inject SET LOCAL planner hints (disable nestloop, raise work_mem) before MERGE |
pg_trickle.merge_work_mem_mb | 64 | work_mem (MB) applied when delta exceeds 10 000 rows and planner hints enabled |
pg_trickle.use_prepared_statements | true | Use SQL PREPARE/EXECUTE for cached MERGE templates |
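Like any extension GUC, these can be set cluster-wide and reloaded without a restart (only `shared_preload_libraries` itself requires one). A minimal example, assuming superuser access:

```sql
ALTER SYSTEM SET pg_trickle.scheduler_interval_ms = 500;
SELECT pg_reload_conf();   -- signal backends to re-read configuration
SHOW pg_trickle.scheduler_interval_ms;
```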
Data Flow: End-to-End Refresh
Source Table INSERT/UPDATE/DELETE
│
▼
Hybrid CDC Layer:
┌─────────────────────────────────────────────┐
│ TRIGGER mode: Row-Level AFTER Trigger │
│ pg_trickle_cdc_fn_<oid>() → buffer table │
│ │
│ WAL mode: Logical Replication Slot │
│ wal_decoder bgworker → same buffer table │
│ │
│ ST-to-ST: Refresh engine captures delta │
│ → changes_pgt_<pgt_id> buffer table │
└─────────────────────────────────────────────┘
│
▼
Change Buffer Table
Base tables: pgtrickle_changes.changes_<oid>
ST sources: pgtrickle_changes.changes_pgt_<pgt_id>
Columns: change_id, lsn, action (I/U/D), pk_hash, new_<col>, old_<col> (typed)
│
▼
DVM Engine: generate delta SQL from operator tree
- Scan operator reads from changes_<oid> or changes_pgt_<id>
- Filter/Project/Join transform the deltas
- Aggregate recomputes affected groups
│
▼
Diff Engine: produce (+/-) diff rows
│
▼
Delta Application:
DELETE FROM storage WHERE __pgt_row_id IN (removed)
INSERT INTO storage SELECT ... FROM (added)
│
▼
Frontier Update: advance per-source LSN
│
▼
History Record: log to pgtrickle.pgt_refresh_history
Module Map
src/
├── lib.rs # Extension entry, module declarations, _PG_init
├── bin/
│   └── pgrx_embed.rs # pgrx SQL entity embedding (generated)
├── api.rs # SQL API functions (create/alter/drop/refresh/status)
├── catalog.rs # Catalog CRUD operations
├── cdc.rs # Change data capture (triggers + WAL transition)
├── config.rs # GUC variable registration
├── dag.rs # Dependency graph (cycle detection, SCC decomposition, topo sort)
├── error.rs # Centralized error types
├── hash.rs # xxHash row ID generation (pg_trickle_hash / pg_trickle_hash_multi)
├── hooks.rs # DDL event trigger handlers (_on_ddl_end, _on_sql_drop)
├── ivm.rs # Transactional IVM (IMMEDIATE mode: statement-level triggers)
├── shmem.rs # Shared memory state (PgTrickleSharedState, DAG_REBUILD_SIGNAL, CACHE_GENERATION)
├── dvm/
│ ├── mod.rs # DVM module root + recursive CTE orchestration
│ ├── parser.rs # Query → OpTree converter (CTE extraction, subquery, window support)
│ ├── diff.rs # Delta SQL generation (CTE delta cache)
│ ├── row_id.rs # Row ID generation
│ └── operators/
│ ├── mod.rs # Operator trait + registry
│ ├── scan.rs # Table scan (CDC passthrough)
│ ├── filter.rs # WHERE clause filtering
│ ├── project.rs # Column projection
│ ├── join.rs # Inner join
│ ├── join_common.rs # Shared join utilities (snapshot subqueries, column disambiguation)
│ ├── outer_join.rs # LEFT/RIGHT outer join
│ ├── full_join.rs # FULL OUTER JOIN (8-part delta)
│ ├── aggregate.rs # GROUP BY + aggregate functions (39 AggFunc variants)
│ ├── distinct.rs # DISTINCT deduplication
│ ├── union_all.rs # UNION ALL merging
│ ├── intersect.rs # INTERSECT / INTERSECT ALL (dual-count LEAST)
│ ├── except.rs # EXCEPT / EXCEPT ALL (dual-count GREATEST)
│ ├── subquery.rs # Subquery / inlined CTE delegation
│ ├── cte_scan.rs # Shared CTE delta (multi-reference)
│ ├── recursive_cte.rs # Recursive CTE (semi-naive + DRed + recomputation)
│ ├── window.rs # Window function (partition recomputation)
│ ├── lateral_function.rs # LATERAL SRF (row-scoped recomputation)
│ ├── lateral_subquery.rs # LATERAL correlated subquery
│ ├── semi_join.rs # EXISTS / IN subquery (semi-join delta)
│ ├── anti_join.rs # NOT EXISTS / NOT IN subquery (anti-join delta)
│ └── scalar_subquery.rs # Correlated scalar subquery in SELECT
├── monitor.rs # Monitoring & observability functions
├── refresh.rs # Refresh orchestration
├── scheduler.rs # Automatic scheduling with canonical periods
├── version.rs # Frontier / LSN tracking
└── wal_decoder.rs # WAL-based CDC (logical replication slot polling, transitions)
Extension Control File (pg_trickle.control)
The pg_trickle.control file in the repository root is required by PostgreSQL's
extension infrastructure. It declares the extension's description, default
version, shared-library path, and privilege requirements. PostgreSQL reads this
file when CREATE EXTENSION pg_trickle; is executed.
During packaging (cargo pgrx package), pgrx replaces the @CARGO_VERSION@
placeholder with the version from Cargo.toml and copies the file into the
target's share/extension/ directory alongside the SQL migration scripts.
DVM Operators
This document describes the Differential View Maintenance (DVM) operators implemented by pg_trickle. Each operator transforms a stream of row-level changes (deltas) propagated from source tables through the operator tree.
Prior Art
- Budiu, M. et al. (2023). "DBSP: Automatic Incremental View Maintenance for Rich Query Languages." VLDB 2023. (comparison)
- Gupta, A. & Mumick, I.S. (1999). Materialized Views: Techniques, Implementations, and Applications. MIT Press.
- Koch, C. et al. (2014). "DBToaster: Higher-order Delta Processing for Dynamic, Frequently Fresh Views." VLDB Journal.
- PostgreSQL 9.4+ — materialized views with REFRESH MATERIALIZED VIEW CONCURRENTLY.
Overview
When a stream table is created, the defining SQL query is parsed into a tree of DVM operators. During a differential refresh, changes flow bottom-up through this tree:
Aggregate
│
Project
│
Filter
│
┌───────┴───────┐
Join │
┌─┴─┐ │
Scan(A) Scan(B) Scan(C)
Each operator implements a differentiation rule: given the delta (Δ) to its input(s), it produces the corresponding delta to its output. This is conceptually similar to automatic differentiation in calculus.
The general contract:
- Input: a set of ('+', row) and ('-', row) tuples (inserts and deletes)
- Output: a set of ('+', row) and ('-', row) tuples
Updates are modeled as a delete of the old row followed by an insert of the new row.
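As an illustrative model only (plain Python, not the SQL the extension generates), the contract can be sketched as a list of signed rows, with an update expanding into a delete/insert pair:

```python
# Hypothetical model of the operator contract: a delta is a list of
# ('+', row) / ('-', row) tuples.

def update_to_delta(old_row, new_row):
    """An UPDATE is modeled as a delete of the old row plus an insert of the new row."""
    return [('-', old_row), ('+', new_row)]

delta = update_to_delta(('order-1', 'pending'), ('order-1', 'shipped'))
# delta == [('-', ('order-1', 'pending')), ('+', ('order-1', 'shipped'))]
```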
DIFFERENTIAL and IMMEDIATE maintenance require deterministic expressions. VOLATILE functions and custom operators such as random() or clock_timestamp() are rejected during stream table creation because re-evaluation would corrupt delta semantics. STABLE functions such as now() and current_timestamp are allowed with a warning; FULL mode accepts all volatility classes because it recomputes the full result on each refresh.
Operator Support Matrix
The following table shows which SQL constructs are supported under each refresh mode.
| SQL Construct | FULL | DIFFERENTIAL | IMMEDIATE | Notes |
|---|---|---|---|---|
| Basic | ||||
Simple SELECT / projection | ✅ | ✅ | ✅ | |
WHERE filter | ✅ | ✅ | ✅ | |
| Column expressions / aliases | ✅ | ✅ | ✅ | |
DISTINCT | ✅ | ✅ | ✅ | Uses __pgt_dup_count reference counting |
DISTINCT ON | ✅ | ✅ | ✅ | |
| Joins | ||||
INNER JOIN | ✅ | ✅ | ✅ | Hybrid delta strategy |
LEFT OUTER JOIN | ✅ | ✅ | ✅ | NULL-padding transitions tracked |
RIGHT OUTER JOIN | ✅ | ✅ | ✅ | |
FULL OUTER JOIN | ✅ | ✅ | ✅ | 8-part UNION ALL delta |
CROSS JOIN | ✅ | ✅ | ✅ | |
LATERAL JOIN | ✅ | ✅ | ✅ | Row-scoped re-execution |
| Multi-table join (≤2 right scans) | ✅ | ✅ | ✅ | Full phantom-row-after-DELETE fix |
| Multi-table join (≥3 right scans) | ✅ | ⚠️ | ⚠️ | Falls back to post-change snapshot for right subtree (EC-01 boundary, fix planned for v0.12.0) |
| Subqueries | ||||
EXISTS / IN (semi-join) | ✅ | ✅ | ✅ | Delta-key pre-filter on left side |
NOT EXISTS / NOT IN (anti-join) | ✅ | ✅ | ✅ | Inverted semantics; two-part delta |
| Scalar subquery (SELECT-list) | ✅ | ✅ | ✅ | Pre/post snapshot EXCEPT ALL diff |
Correlated LATERAL subquery | ✅ | ✅ | ✅ | |
| Set Operations | ||||
UNION ALL | ✅ | ✅ | ✅ | Dual-branch merge |
INTERSECT / INTERSECT ALL | ✅ | ✅ | ✅ | Dual-count tracking |
EXCEPT / EXCEPT ALL | ✅ | ✅ | ✅ | |
| Aggregates | ||||
COUNT, SUM, AVG | ✅ | ✅ | ✅ | Algebraic — fully invertible delta |
MIN, MAX | ✅ | ✅ | ✅ | Semi-algebraic — group rescan on ambiguous delete |
COUNT(DISTINCT), SUM(DISTINCT) | ✅ | ✅ | ✅ | Algebraic via auxiliary columns |
BOOL_AND, BOOL_OR, BIT_AND, BIT_OR | ✅ | ✅ | ✅ | Algebraic via auxiliary columns |
EVERY | ✅ | ✅ | ✅ | Algebraic via auxiliary columns |
STRING_AGG, ARRAY_AGG | ✅ | ⚠️ | ⚠️ | Group-rescan strategy — warning emitted at creation time in DIFFERENTIAL mode |
STDDEV, VARIANCE, STDDEV_POP, VAR_POP | ✅ | ✅ | ✅ | Algebraic via auxiliary M2/sum/count columns |
COVAR_SAMP, COVAR_POP, CORR | ✅ | ✅ | ✅ | Algebraic via auxiliary columns |
REGR_* (all 9 regression functions) | ✅ | ✅ | ✅ | Algebraic via auxiliary columns |
PERCENTILE_CONT, PERCENTILE_DISC | ✅ | ⚠️ | ⚠️ | Group-rescan strategy |
MODE | ✅ | ⚠️ | ⚠️ | Group-rescan strategy |
XMLAGG, JSON_AGG, JSONB_AGG | ✅ | ⚠️ | ⚠️ | Group-rescan strategy |
JSON_OBJECT_AGG, JSONB_OBJECT_AGG | ✅ | ⚠️ | ⚠️ | Group-rescan strategy |
GROUP BY / HAVING | ✅ | ✅ | ✅ | |
GROUP BY ROLLUP / CUBE / GROUPING SETS | ✅ | ✅ | ✅ | Branch count capped by max_grouping_set_branches (default 64) |
| Window Functions | ||||
ROW_NUMBER, RANK, DENSE_RANK | ✅ | ✅ | ✅ | Partition-scoped recompute |
LAG, LEAD, FIRST_VALUE, LAST_VALUE | ✅ | ✅ | ✅ | Partition-scoped recompute |
NTILE, CUME_DIST, PERCENT_RANK | ✅ | ✅ | ✅ | Partition-scoped recompute |
Window frame clauses (ROWS, RANGE, GROUPS) | ✅ | ✅ | ✅ | |
| CTEs | ||||
Non-recursive WITH | ✅ | ✅ | ✅ | Inlined or delta-cached (multi-ref) |
WITH RECURSIVE (INSERT-only workload) | ✅ | ✅ | ✅ | Semi-naive evaluation |
WITH RECURSIVE (mixed INSERT/DELETE/UPDATE) | ✅ | ✅ | ✅ | Delete-and-Rederive (DRed) strategy |
| TopK | ||||
ORDER BY … LIMIT N | ✅ | ✅ | ✅ | Scoped recomputation; metadata validated each refresh |
ORDER BY … LIMIT N OFFSET M | ✅ | ✅ | ✅ | |
| Lateral / SRF | ||||
LATERAL with set-returning function | ✅ | ✅ | ✅ | Row-scoped re-execution |
JSON_TABLE | ✅ | ✅ | ✅ | Via lateral function operator |
generate_series() | ✅ | ✅ | ✅ | |
unnest() | ✅ | ✅ | ✅ | |
| ST-to-ST Dependencies | ||||
| Stream table reading from another stream table | ✅ | ✅ | ✅ | Differential via changes_pgt_ buffers (v0.11.0); FULL upstream produces I/D diff so downstream stays differential |
| Multi-level ST chains | ✅ | ✅ | ✅ | Topological order; per-level delta propagation |
| Function Volatility | ||||
IMMUTABLE functions | ✅ | ✅ | ✅ | |
STABLE functions (now(), current_timestamp) | ✅ | ⚠️ | ⚠️ | Allowed with warning — value may differ between initial load and delta evaluation |
VOLATILE functions (random(), clock_timestamp()) | ✅ | ❌ | ❌ | Rejected at creation time — re-evaluation corrupts delta semantics |
Legend: ✅ = fully supported — ⚠️ = supported with caveats (see Notes column) — ❌ = not supported (blocked at creation time)
Operators
Scan
Module: src/dvm/operators/scan.rs
The leaf operator. Reads CDC changes from a source table's change buffer.
Delta Rule:
$$\Delta(\text{Scan}(R)) = \Delta R$$
The scan operator is a direct passthrough — inserts in the source become inserts in the output, deletes become deletes.
SQL Generation:
SELECT op, row_data FROM pgtrickle_changes.changes_<oid>
WHERE xid >= <last_consumed_xid>
Notes:
- Each source table has a dedicated change buffer table created by the CDC module.
- Row data is stored as JSONB with column names as keys.
- The __pgt_row_id column (xxHash of the primary key) is included for deduplication.
Filter
Module: src/dvm/operators/filter.rs
Applies a WHERE clause predicate to the delta stream.
Delta Rule:
$$\Delta(\sigma_p(R)) = \sigma_p(\Delta R)$$
Filtering is applied to the deltas in the same way as to the base data — only rows satisfying the predicate pass through.
SQL Generation:
SELECT * FROM (<input_delta>) AS d
WHERE <predicate>
Example:
If the defining query is:
SELECT * FROM orders WHERE status = 'shipped'
And a new row (id=5, status='pending') is inserted, it does not appear in the delta output. If (id=3, status='shipped') is inserted, it passes through.
Edge Cases:
- For updates that change the predicate column (e.g., status from 'pending' to 'shipped'), the CDC produces a delete of the old row and an insert of the new row. The filter passes the insert (it matches the predicate) and blocks the delete (the old row does not match), correctly resulting in a net insert.
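Because the filter is linear, this edge case falls out of applying the predicate to each signed row independently. A minimal Python model (illustrative names; the extension emits SQL):

```python
def filter_delta(delta, predicate):
    # σ_p is linear: apply the predicate to each signed row independently.
    return [(sign, row) for sign, row in delta if predicate(row)]

# An UPDATE of status 'pending' -> 'shipped' arrives as a delete/insert pair:
delta = [('-', {'id': 3, 'status': 'pending'}),
         ('+', {'id': 3, 'status': 'shipped'})]
out = filter_delta(delta, lambda r: r['status'] == 'shipped')
# Only the insert survives, so the view gains the row — a net insert.
```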
Project
Module: src/dvm/operators/project.rs
Applies column projection from the target list.
Delta Rule:
$$\Delta(\pi_L(R)) = \pi_L(\Delta R)$$
Projects the same columns from the delta that the query projects from the base data.
SQL Generation:
SELECT <target_columns> FROM (<input_delta>) AS d
Notes:
- Projection is applied after filtering for efficiency.
- Computed expressions in the target list (e.g., price * quantity AS total) are evaluated on the delta rows.
Join (Inner)
Module: src/dvm/operators/join.rs
Implements inner join between two inputs.
Delta Rule:
For $R \bowtie S$:
$$\Delta(R \bowtie S) = (\Delta R \bowtie S) \cup (R' \bowtie \Delta S)$$
Where $R' = R \cup \Delta R$ (the new state of R after applying deltas).
In practice, when only one side has changes (common case), the delta join simplifies to joining the changed rows against the current state of the other side.
SQL Generation:
-- Changes to left side joined with current right side
SELECT '+' AS op, l.*, r.*
FROM (<left_delta> WHERE op = '+') AS l
JOIN <right_table> AS r ON <join_condition>
UNION ALL
-- Current left side joined with changes to right side
SELECT '+' AS op, l.*, r.*
FROM <left_table> AS l
JOIN (<right_delta> WHERE op = '+') AS r ON <join_condition>
(And corresponding DELETE queries for op = '-'.)
Notes:
- The join uses the current state of the non-changed side, not the change buffer.
- For equi-joins, this is efficient — the join key narrows the scan.
- Non-equi joins (theta joins) may require broader scans.
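The bilinear expansion can be checked against a from-scratch recomputation. The following insert-only Python sketch (hypothetical helper names; signed deltas with '-' rows are handled analogously as retractions) shows the delta producing exactly the rows the full recompute gains:

```python
def equi_join(left, right):
    # rows are (key, payload) pairs; join on the key
    return [(l, r) for l in left for r in right if l[0] == r[0]]

def join_delta(R, S, dR, dS):
    """Δ(R ⋈ S) = (ΔR ⋈ S) ∪ (R' ⋈ ΔS), where R' = R ∪ ΔR."""
    R_new = R + dR
    return equi_join(dR, S) + equi_join(R_new, dS)

R, S = [(1, 'a')], [(1, 'x')]
dR, dS = [(2, 'b')], [(2, 'y')]
delta = join_delta(R, S, dR, dS)
# delta contains only the new pair ((2, 'b'), (2, 'y')) — the same rows a
# full recomputation of the join would add.
```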
Outer Join
Module: src/dvm/operators/outer_join.rs (LEFT JOIN), src/dvm/operators/full_join.rs (FULL JOIN)
Implements LEFT, RIGHT, and FULL OUTER JOIN.
RIGHT JOIN Handling:
RIGHT JOIN is automatically converted to a LEFT JOIN with swapped left/right operands during query parsing. This normalization happens transparently — the user can write RIGHT JOIN and the parser rewrites it to an equivalent LEFT JOIN before the operator tree is constructed.
Delta Rule:
Similar to inner join, but additionally handles NULL-padded rows:
$$\Delta(R \text{ LEFT JOIN } S) = (\Delta R \bowtie_L S) \cup (R' \bowtie_L \Delta S)$$
With special handling for:
- Rows in ΔR that have no match in S → emit ('+', row, NULLs)
- Rows in ΔS that create a first match for an R row → emit ('-', row, NULLs) and ('+', row, s_data)
- Rows in ΔS that remove the last match for an R row → emit ('-', row, s_data) and ('+', row, NULLs)
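The first-match transition can be sketched as follows (an illustrative Python model with invented names, not the generated SQL; it covers only the case of a single row inserted into S):

```python
def left_join_s_insert_delta(R, S_old, new_s, match):
    """Delta for R LEFT JOIN S when one row `new_s` is inserted into S."""
    out = []
    for r in R:
        if match(r, new_s):
            if not any(match(r, s) for s in S_old):
                out.append(('-', (r, None)))   # retract the NULL-padded row
            out.append(('+', (r, new_s)))      # emit the newly joined row
    return out

# 'k1' previously had no match, so its NULL-padded row is retracted and
# replaced by the joined row; 'k2' is untouched.
out = left_join_s_insert_delta(['k1', 'k2'], [], 'k1', lambda r, s: r == s)
```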
SQL Generation (LEFT JOIN):
Uses anti-join detection (via NOT EXISTS) to correctly handle the NULL padding transitions.
FULL OUTER JOIN Delta Rule:
FULL OUTER JOIN extends the LEFT JOIN delta with symmetric right-side handling. The delta is computed as an 8-part UNION ALL:
- Parts 1–5: Same as LEFT JOIN delta (inserted/deleted rows from both sides, with NULL-padding transitions)
- Parts 6–7: Symmetric anti-join transitions for the right side (rows in ΔL that remove/create the last/first match for an S row)
- Part 8: Right-side insertions that have no match in the left side → emit ('+', NULLs, s_data)
Each part uses pre-computed delta flags (__has_ins_*, __has_del_*) to efficiently detect first-match/last-match transitions without redundant subqueries.
Nested Join Support:
Module: src/dvm/operators/join_common.rs
All join operators (inner, left, full) support nested children — i.e., a join whose left or right operand is itself another join. The join_common module provides shared helpers:
- build_snapshot_sql() — returns the table reference for simple (Scan) operands, or a parenthesized subquery with disambiguated columns for nested join operands
- rewrite_join_condition() — rewrites column references in ON conditions to use the correct alias prefixes for nested children (e.g., o.cust_id → dl.o__cust_id)
This enables queries with 3 or more joined tables, e.g.:
SELECT o.id, c.name, p.title
FROM orders o
JOIN customers c ON o.cust_id = c.id
JOIN products p ON o.prod_id = p.id
Limitations:
- FULL OUTER JOIN delta computation can be expensive due to dual-side NULL tracking (8 UNION ALL parts).
- Performance degrades with high-cardinality join keys.
- NATURAL JOIN is supported — common columns are resolved automatically and synthesized into an explicit equi-join condition.
- EC-01 pre-change snapshot boundary (SF-5): The phantom-row-after-DELETE fix (EC-01) uses EXCEPT ALL to reconstruct the pre-change state of a join side. This is limited to join subtrees with ≤ 2 scan nodes to avoid PostgreSQL temporary file exhaustion on wide join chains. For queries with ≥ 3 base tables on one side of a join (e.g. TPC-H Q7/Q8/Q9), a simultaneous DELETE on both join sides may leave a phantom row in the stream table until the next full refresh. See use_pre_change_snapshot() in join_common.rs for the full rationale.
Aggregate
Module: src/dvm/operators/aggregate.rs
Handles GROUP BY with aggregate functions (COUNT, SUM, AVG, MIN, MAX, BOOL_AND, BOOL_OR, STRING_AGG, ARRAY_AGG, JSON_AGG, JSONB_AGG, BIT_AND, BIT_OR, BIT_XOR, JSON_OBJECT_AGG, JSONB_OBJECT_AGG, STDDEV_POP, STDDEV_SAMP, VAR_POP, VAR_SAMP, MODE, PERCENTILE_CONT, PERCENTILE_DISC, JSON_ARRAYAGG, JSON_OBJECTAGG) and the FILTER (WHERE …) and WITHIN GROUP (ORDER BY …) clauses.
Delta Rule:
$$\Delta(\gamma_{G, \text{agg}}(R)) = \gamma_{G, \text{agg}}(R' \text{ WHERE } G \in \text{affected_keys}) - \gamma_{G, \text{agg}}(R \text{ WHERE } G \in \text{affected_keys})$$
Where:
- $G$ = grouping columns
- affected_keys = the set of group key values that appear in ΔR
- $R'$ = $R \cup \Delta R$ (the new state)
Strategy:
- Identify affected groups — Collect all group key values that appear in the delta (either inserted or deleted rows).
- Recompute old values — Query the storage table for current aggregate values of affected groups.
- Recompute new values — Query the updated source for new aggregate values of affected groups.
- Diff — For each affected group:
  - If old exists and new differs → emit ('-', old) and ('+', new)
  - If old exists and new is gone → emit ('-', old) (group eliminated)
  - If no old and new exists → emit ('+', new) (new group appeared)
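The affected-groups strategy can be modeled in a few lines of Python (illustrative only, using SUM; function names are invented — the extension does this with generated SQL against the storage table and source):

```python
from collections import defaultdict

def sum_by_group(rows, groups):
    # rows are (group_key, value) pairs; aggregate only the affected groups
    totals = defaultdict(int)
    for g, v in rows:
        if g in groups:
            totals[g] += v
    return dict(totals)

def aggregate_delta(old_rows, new_rows, delta):
    # 1. Affected group keys are exactly those appearing in the delta.
    affected = {g for _sign, (g, _v) in delta}
    old = sum_by_group(old_rows, affected)   # current aggregate values
    new = sum_by_group(new_rows, affected)   # post-change aggregate values
    out = []
    for g in affected:
        if g in old and g in new and old[g] != new[g]:
            out += [('-', (g, old[g])), ('+', (g, new[g]))]
        elif g in old and g not in new:
            out.append(('-', (g, old[g])))    # group eliminated
        elif g not in old and g in new:
            out.append(('+', (g, new[g])))    # new group appeared
    return out
```

Groups untouched by the delta never appear in `affected` and cost nothing, which is the point of the strategy.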
Supported Aggregate Functions:
| Function | DVM Strategy | Notes |
|---|---|---|
COUNT(*) | Algebraic | Fully differential |
COUNT(expr) | Algebraic | Fully differential |
SUM(expr) | Algebraic | Fully differential |
AVG(expr) | Algebraic | Decomposed to SUM/COUNT internally |
MIN(expr) | Semi-algebraic | Uses LEAST merge; falls back to per-group rescan when min row is deleted |
MAX(expr) | Semi-algebraic | Uses GREATEST merge; falls back to per-group rescan when max row is deleted |
BOOL_AND(expr) | Group-rescan | Affected groups are re-aggregated from source data |
BOOL_OR(expr) | Group-rescan | Affected groups are re-aggregated from source data |
STRING_AGG(expr, sep) | Group-rescan | Affected groups are re-aggregated from source data |
ARRAY_AGG(expr) | Group-rescan | Affected groups are re-aggregated from source data |
JSON_AGG(expr) | Group-rescan | Affected groups are re-aggregated from source data |
JSONB_AGG(expr) | Group-rescan | Affected groups are re-aggregated from source data |
BIT_AND(expr) | Group-rescan | Affected groups are re-aggregated from source data |
BIT_OR(expr) | Group-rescan | Affected groups are re-aggregated from source data |
BIT_XOR(expr) | Group-rescan | Affected groups are re-aggregated from source data |
JSON_OBJECT_AGG(key, value) | Group-rescan | Affected groups are re-aggregated from source data |
JSONB_OBJECT_AGG(key, value) | Group-rescan | Affected groups are re-aggregated from source data |
STDDEV_POP(expr) / STDDEV(expr) | Group-rescan | Affected groups are re-aggregated from source data |
STDDEV_SAMP(expr) | Group-rescan | Affected groups are re-aggregated from source data |
VAR_POP(expr) | Group-rescan | Affected groups are re-aggregated from source data |
VAR_SAMP(expr) / VARIANCE(expr) | Group-rescan | Affected groups are re-aggregated from source data |
MODE() WITHIN GROUP (ORDER BY expr) | Group-rescan | Ordered-set aggregate; affected groups re-aggregated |
PERCENTILE_CONT(frac) WITHIN GROUP (ORDER BY expr) | Group-rescan | Ordered-set aggregate; affected groups re-aggregated |
PERCENTILE_DISC(frac) WITHIN GROUP (ORDER BY expr) | Group-rescan | Ordered-set aggregate; affected groups re-aggregated |
CORR(Y, X) | Group-rescan | Regression aggregate; affected groups re-aggregated |
COVAR_POP(Y, X) | Group-rescan | Regression aggregate; affected groups re-aggregated |
COVAR_SAMP(Y, X) | Group-rescan | Regression aggregate; affected groups re-aggregated |
REGR_AVGX(Y, X) | Group-rescan | Regression aggregate; affected groups re-aggregated |
REGR_AVGY(Y, X) | Group-rescan | Regression aggregate; affected groups re-aggregated |
REGR_COUNT(Y, X) | Group-rescan | Regression aggregate; affected groups re-aggregated |
REGR_INTERCEPT(Y, X) | Group-rescan | Regression aggregate; affected groups re-aggregated |
REGR_R2(Y, X) | Group-rescan | Regression aggregate; affected groups re-aggregated |
REGR_SLOPE(Y, X) | Group-rescan | Regression aggregate; affected groups re-aggregated |
REGR_SXX(Y, X) | Group-rescan | Regression aggregate; affected groups re-aggregated |
REGR_SXY(Y, X) | Group-rescan | Regression aggregate; affected groups re-aggregated |
REGR_SYY(Y, X) | Group-rescan | Regression aggregate; affected groups re-aggregated |
ANY_VALUE(expr) | Group-rescan | PostgreSQL 16+; affected groups re-aggregated |
JSON_ARRAYAGG(expr ...) | Group-rescan | SQL-standard JSON aggregation (PostgreSQL 16+); full deparsed SQL preserved |
JSON_OBJECTAGG(key: value ...) | Group-rescan | SQL-standard JSON aggregation (PostgreSQL 16+); full deparsed SQL preserved |
User-defined aggregates (CREATE AGGREGATE) | Group-rescan | Any custom aggregate is supported via group-rescan; full aggregate call SQL preserved verbatim |
FILTER Clause:
All aggregate functions support the FILTER (WHERE …) clause:
SELECT COUNT(*) FILTER (WHERE status = 'active') AS active_count FROM orders GROUP BY region
The filter predicate is applied within the delta computation — only rows matching the filter contribute to the aggregate delta. Filtered aggregates are excluded from the P5 direct-bypass optimization.
SQL Generation:
The aggregate operator uses a 3-CTE pipeline:
- Merge CTE — Joins affected group keys against old (storage) and new (source) aggregate values, producing __pgt_meta_action ('I' for new-only groups, 'D' for disappeared groups, 'U' for changed groups).
- LATERAL VALUES expansion — A single-pass LATERAL (VALUES ...) clause expands each merge row into insert and delete actions, avoiding a 4-branch UNION ALL:
FROM merge_cte m,
LATERAL (VALUES
('I', m.new_count, m.new_total),
('D', m.old_count, m.old_total)
) v(action, count_val, val_total)
WHERE (m.__pgt_meta_action = 'I' AND v.action = 'I')
OR (m.__pgt_meta_action = 'D' AND v.action = 'D')
OR (m.__pgt_meta_action = 'U')
- Final projection — Emits ('+', row) and ('-', row) tuples for the refresh engine.
MIN/MAX Merge Strategy:
MIN and MAX use a semi-algebraic strategy with two cases:
- Non-extremum deletion — When the deleted row is NOT the current minimum (or maximum), the merge uses LEAST(old_value, new_inserts) for MIN or GREATEST(old_value, new_inserts) for MAX. This is fully algebraic and requires no rescan.
- Extremum deletion — When the row holding the current minimum (or maximum) IS deleted, the new value cannot be computed from the delta alone. The merge expression returns NULL as a sentinel, which triggers the change-detection guard (IS DISTINCT FROM) to emit the group for re-aggregation. The MERGE layer treats this as a DELETE + INSERT pair, recomputing the group from source data. This is still more efficient than a full table refresh since only affected groups are rescanned.
Distinct
Module: src/dvm/operators/distinct.rs
Implements SELECT DISTINCT using reference counting.
Delta Rule:
$$\Delta(\delta(R)) = \{\, r \in \Delta R : \text{count}(r, R) = 0 \land \text{count}(r, R') > 0 \,\} - \{\, r \in \Delta R : \text{count}(r, R) > 0 \land \text{count}(r, R') = 0 \,\}$$
In other words:
- A row enters the output when its count transitions from 0 to ≥1
- A row leaves the output when its count transitions from ≥1 to 0
Strategy:
Maintains a hidden __pgt_dup_count column in the storage table to track how many times each distinct row appears in the pre-distinct input.
- On insert: increment the count. If the count was 0, emit ('+', row).
- On delete: decrement the count. If the count becomes 0, emit ('-', row).
Notes:
- The duplicate count is not visible in user queries against the storage table (projected away by the view layer).
- Duplicate counting uses __pgt_row_id (xxHash) for efficient lookups.
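The reference-counting rule maps directly onto a counter keyed by row (an illustrative Python model; the extension keeps the count in the hidden __pgt_dup_count column instead):

```python
from collections import Counter

def distinct_delta(counts, delta):
    """counts: Counter mapping row -> current multiplicity."""
    out = []
    for sign, row in delta:
        if sign == '+':
            counts[row] += 1
            if counts[row] == 1:          # 0 -> 1 transition: row enters output
                out.append(('+', row))
        else:
            counts[row] -= 1
            if counts[row] == 0:          # 1 -> 0 transition: row leaves output
                out.append(('-', row))
    return out

counts = Counter({'x': 2})
out = distinct_delta(counts, [('-', 'x'), ('-', 'x'), ('+', 'y')])
# 'x' drops from 2 to 0 and leaves; 'y' appears for the first time.
```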
Union All
Module: src/dvm/operators/union_all.rs
Merges deltas from two branches.
Delta Rule:
$$\Delta(R \cup_{\text{all}} S) = \Delta R \cup_{\text{all}} \Delta S$$
Simply concatenates the delta streams from both branches.
SQL Generation:
SELECT * FROM (<left_delta>)
UNION ALL
SELECT * FROM (<right_delta>)
Notes:
- Column count and types must match between branches.
- Each branch is independently processed through its own operator sub-tree.
- This is the simplest operator since UNION ALL preserves all duplicates.
Intersect
Module: src/dvm/operators/intersect.rs
Implements INTERSECT and INTERSECT ALL using dual-count per-branch multiplicity tracking.
Delta Rule:
$$\Delta(R \cap S): \text{emit rows where } \min(\text{count}_L, \text{count}_R) \text{ crosses the 0 boundary}$$
- INTERSECT (set): a row is present when both branches contain it.
- INTERSECT ALL (bag): a row appears $\min(\text{count}_L, \text{count}_R)$ times.
SQL Generation (3-CTE chain):
- Delta CTE — tags rows from left/right child deltas with a branch indicator ('L'/'R') and computes per-row net_count.
- Merge CTE — joins with the storage table to compute old and new per-branch counts (__pgt_count_l, __pgt_count_r).
- Final CTE — detects boundary crossings using LEAST(old_count_l, old_count_r) vs LEAST(new_count_l, new_count_r).
Notes:
- Storage table requires hidden columns __pgt_count_l and __pgt_count_r for multiplicity tracking.
- Both set and bag variants share the same 3-CTE structure and the same LEAST-based effective count; they differ only in how multiplicity changes above the 0 boundary are emitted.
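The boundary-crossing check for the set variant reduces to comparing old and new LEAST values (an illustrative Python model with invented names; counters stand in for the hidden per-branch count columns):

```python
from collections import Counter

def intersect_delta(old_l, old_r, new_l, new_r):
    """Emit rows whose effective count min(count_l, count_r) crosses 0."""
    out = []
    for row in set(old_l) | set(old_r) | set(new_l) | set(new_r):
        old_c = min(old_l[row], old_r[row])   # LEAST(old_count_l, old_count_r)
        new_c = min(new_l[row], new_r[row])   # LEAST(new_count_l, new_count_r)
        if old_c == 0 and new_c > 0:
            out.append(('+', row))            # row enters the intersection
        elif old_c > 0 and new_c == 0:
            out.append(('-', row))            # row leaves the intersection
    return out

# 'a' gains its first match on the right branch, so it enters the result.
out = intersect_delta(Counter({'a': 1}), Counter(),
                      Counter({'a': 1}), Counter({'a': 2}))
```

The Except operator swaps the effective-count function to GREATEST(0, count_l - count_r) but is otherwise identical.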
Except
Module: src/dvm/operators/except.rs
Implements EXCEPT and EXCEPT ALL using dual-count per-branch multiplicity tracking.
Delta Rule:
$$\Delta(R - S): \text{emit rows where } \max(0, \text{count}_L - \text{count}_R) \text{ crosses the 0 boundary}$$
- EXCEPT (set): a row is present when it exists in the left but not the right branch.
- EXCEPT ALL (bag): a row appears $\max(0, \text{count}_L - \text{count}_R)$ times.
SQL Generation (3-CTE chain):
- Delta CTE — same as Intersect: tags rows from both child deltas with branch indicator.
- Merge CTE — joins with storage table for old/new per-branch counts.
- Final CTE — detects boundary crossings using GREATEST(0, old_count_l - old_count_r) vs GREATEST(0, new_count_l - new_count_r).
Notes:
- EXCEPT is not commutative — left branch is the positive input, right is subtracted.
- Storage table requires hidden columns __pgt_count_l and __pgt_count_r.
- Same 3-CTE structure as Intersect with a different effective-count function.
Subquery
Module: src/dvm/operators/subquery.rs
Handles both inlined CTEs and explicit subqueries in the FROM clause ((SELECT ...) AS alias).
Delta Rule:
$$\Delta(\rho_{\text{alias}}(Q)) = \rho_{\text{alias}}(\Delta Q)$$
A subquery wrapper is transparent for differentiation — it delegates to its child's delta and optionally renames output columns to match the subquery's column aliases.
SQL Generation:
-- If column aliases differ from child output columns:
SELECT __pgt_row_id, __pgt_action, child_col1 AS alias_col1, child_col2 AS alias_col2
FROM (<child_delta>)
If the child columns already match the aliases, the subquery is a pure passthrough — no additional CTE is emitted.
Notes:
- This operator enables both CTE support (Tier 1) and standalone subqueries in FROM.
- Column aliases on subqueries (FROM (...) AS x(a, b)) are handled by emitting a thin renaming CTE.
- The subquery body is fully differentiated as a normal operator sub-tree.
CTE Scan (Shared Delta)
Module: src/dvm/operators/cte_scan.rs
Handles multi-reference CTEs by computing the CTE body's delta once and reusing it across all references (Tier 2).
Delta Rule:
$$\Delta(\text{CteScan}(\text{id}, Q)) = \text{cache}[\text{id}] \quad \text{(computed once, reused)}$$
When a CTE is referenced multiple times in a query, each reference produces a CteScan node with the same cte_id. The diff engine differentiates the CTE body once and caches the result. Subsequent CteScan nodes for the same CTE reuse the cached delta.
SQL Generation:
-- First reference: differentiates the CTE body and stores result in cache
-- Subsequent references: point to the same system CTE name
SELECT __pgt_row_id, __pgt_action, <columns>
FROM __pgt_cte_<cte_name>_delta -- shared across all references
If column aliases are present, a thin renaming CTE is added on top of the cached delta.
Notes:
- Without CteScan (Tier 1), multi-reference CTEs are inlined: each reference duplicates the full operator sub-tree. CteScan (Tier 2) eliminates this duplication.
- The CTE body is pre-differentiated in dependency order (earlier CTEs before later ones that reference them).
- Column alias support follows the same pattern as the Subquery operator.
Recursive CTEs
Recursive CTEs (WITH RECURSIVE) are supported in FULL, DIFFERENTIAL, and IMMEDIATE modes, with different execution paths depending on the refresh mode:
FULL Mode
Recursive CTEs work out-of-the-box with refresh_mode = 'FULL'. The defining query is executed as-is via INSERT INTO ... SELECT ..., and PostgreSQL handles the iterative evaluation internally.
DIFFERENTIAL Mode (Three-Strategy Incremental Maintenance)
Recursive CTEs with refresh_mode = 'DIFFERENTIAL' use an automatic three-strategy approach, selected based on column compatibility and change type:
Strategy 1: Semi-Naive Evaluation (INSERT-only changes)
When only INSERT changes are present in the change buffer, pg_trickle uses semi-naive evaluation — the standard technique for incremental fixpoint computation. The base case is differentiated normally through the DVM operator tree, then the resulting delta is propagated through the recursive term using a nested WITH RECURSIVE:
WITH RECURSIVE
__pgt_base_delta AS (
-- Normal DVM differentiation of the base case (INSERT rows only)
<differentiated base case>
),
__pgt_rec_delta AS (
-- Seed: base case delta rows
SELECT cols FROM __pgt_base_delta WHERE __pgt_action = 'I'
UNION ALL
-- Seed: new base rows joining existing ST storage
SELECT cols FROM <recursive term with self_ref = ST_storage, base = change_buffer>
UNION ALL
-- Propagation: recursive term applied to growing delta
SELECT cols FROM <recursive term with self_ref = __pgt_rec_delta, base = full>
)
SELECT pgtrickle.pg_trickle_hash(...) AS __pgt_row_id, 'I' AS __pgt_action, cols
FROM __pgt_rec_delta
The cost is proportional to the number of new rows produced by the change, not the full result set.
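Semi-naive evaluation is easiest to see on transitive closure. The following is an illustrative Python model, not the generated SQL: the closure plays the role of the ST storage, and only rows reachable through the new edges are derived.

```python
def transitive_closure(edges):
    # naive fixpoint; used here only to check the incremental result
    closure = set(edges)
    while True:
        new = {(a, c) for (a, b) in closure
                      for (b2, c) in closure if b == b2} - closure
        if not new:
            return closure
        closure |= new

def semi_naive_insert(closure, new_edges):
    """Propagate newly inserted edges through an existing closure; cost is
    proportional to the rows the insert adds, not the full result set."""
    known = set(closure)
    delta = set(new_edges) - known
    added = set()
    while delta:
        added |= delta
        known |= delta
        # bilinear semi-naive step: join the delta against everything known
        frontier = ({(a, c) for (a, b) in delta for (b2, c) in known if b == b2} |
                    {(a, c) for (a, b) in known for (b2, c) in delta if b == b2})
        delta = frontier - known
    return added
```

Inserting edge (2, 3) into a graph whose closure is {(1, 2)} derives only {(2, 3), (1, 3)} — exactly the rows a from-scratch recomputation would add.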
Strategy 2: Delete-and-Rederive / DRed (mixed INSERT/DELETE/UPDATE changes)
When the change buffer contains DELETE or UPDATE changes, simple propagation is insufficient — a deleted base row may have transitively derived many recursive rows, some of which may still be derivable from alternative paths. DRed handles this in four phases:
- Insert propagation — semi-naive evaluation for the INSERT portion (same as Strategy 1)
- Over-deletion cascade — propagate base-case deletions through the recursive term against ST storage to find all transitively-derived rows that might be invalidated
- Rederivation — re-execute the recursive CTE from the remaining (non-deleted) base rows to restore any over-deleted rows that have alternative derivations
- Combine — final delta = inserts + (over-deletions − rederived rows)
This avoids full recomputation while correctly handling deletions with alternative derivation paths.
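The four phases can be sketched on transitive closure (an illustrative Python model with invented names; the real implementation runs these phases as SQL against ST storage, and the rederivation phase is shown here as a simple fixpoint for brevity):

```python
def dred_delete(edges, closure, deleted_edges):
    """Delete-and-Rederive sketch. `closure` is the materialized result;
    returns the net deletions (Phase 4)."""
    remaining = edges - deleted_edges
    # Phase 2: over-deletion cascade — mark every pair that has some
    # derivation involving an already-suspect pair.
    suspect = set(deleted_edges) & closure
    changed = True
    while changed:
        changed = False
        for (a, c) in list(closure - suspect):
            if (a, c) in remaining:
                continue                      # surviving base fact
            for (x, b) in closure:
                if x == a and (b, c) in closure and \
                        ((a, b) in suspect or (b, c) in suspect):
                    suspect.add((a, c))
                    changed = True
                    break
    # Phase 3: rederivation — re-run the recursive step from surviving
    # facts; suspect pairs still derivable are restored.
    survivors = (closure - suspect) | remaining
    while True:
        new = {(a, c) for (a, b) in survivors
                      for (b2, c) in survivors if b == b2} - survivors
        if not new:
            break
        survivors |= new
    # Phase 4: combine — net deletions = over-deletions - rederivations
    return suspect - survivors
```

In a diamond graph 1→2→4 and 1→3→4, deleting edge (2, 4) over-deletes (1, 4), but the rederivation phase restores it via the surviving path through 3, so only (2, 4) is actually removed.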
IMMEDIATE Mode
Recursive CTEs with refresh_mode = 'IMMEDIATE' use the same semi-naive and
Delete-and-Rederive machinery as DIFFERENTIAL mode, but the base changes come
from PostgreSQL statement transition tables instead of the background change
buffer. This keeps the stream table transactionally up to date within the same
statement. To guard against cyclic data or unexpectedly deep recursion, the
semi-naive SQL injects a depth counter capped by
pg_trickle.ivm_recursive_max_depth (default 100; set to 0 to disable the
guard).
Strategy 3: Recomputation Fallback
When the CTE defines more columns than the outer SELECT projects (column mismatch), the incremental strategies cannot be used because the ST storage table lacks columns needed for recursive self-joins. In this case, the full defining query is re-executed and anti-joined against current storage:
WITH __pgt_recomp_new AS (
SELECT pgtrickle.pg_trickle_hash(row_to_json(sub)::text) AS __pgt_row_id, col1, col2, ...
FROM (<defining_query>) sub
),
__pgt_recomp_ins AS (
SELECT n.__pgt_row_id, 'I'::text AS __pgt_action, n.col1, n.col2, ...
FROM __pgt_recomp_new n
LEFT JOIN <storage_table> s ON s.__pgt_row_id = n.__pgt_row_id
WHERE s.__pgt_row_id IS NULL
),
__pgt_recomp_del AS (
SELECT s.__pgt_row_id, 'D'::text AS __pgt_action, s.col1, s.col2, ...
FROM <storage_table> s
LEFT JOIN __pgt_recomp_new n ON n.__pgt_row_id = s.__pgt_row_id
WHERE n.__pgt_row_id IS NULL
)
SELECT * FROM __pgt_recomp_ins
UNION ALL
SELECT * FROM __pgt_recomp_del
The cost is proportional to the full result set size.
Strategy Selection
| CTE columns match ST? | Change type | refresh_mode / DeltaSource | Strategy |
|---|---|---|---|
| Match | INSERT-only | DIFFERENTIAL (ChangeBuffer) | Semi-naive (Strategy 1) |
| Match | Mixed (INSERT+DELETE/UPDATE) | DIFFERENTIAL (ChangeBuffer) | DRed (Strategy 2) |
| Match | INSERT-only | IMMEDIATE (TransitionTable) | Semi-naive (Strategy 1) |
| Match | Mixed (INSERT+DELETE/UPDATE) | IMMEDIATE (TransitionTable) | DRed (Strategy 2) |
| Mismatch | Any | Any | Recomputation (Strategy 3) |
DRed in DIFFERENTIAL mode (P2-1, implemented in v0.10.0)
DRed is now active in both DIFFERENTIAL and IMMEDIATE modes when CTE output columns match ST storage columns. Phase 1 propagates inserts via semi-naive evaluation; Phase 2 cascades deletions through ST storage; Phase 3 rederives over-deleted rows that have alternative derivation paths; Phase 4 combines the results. DRed correctly handles derived-column changes such as path rebuilds under a renamed ancestor node. Column-mismatch cases still use recomputation fallback.
Notes:
- Non-linear recursion (multiple self-references in the recursive term) is rejected — PostgreSQL restricts the recursive term to reference the CTE at most once.
- The `__pgt_row_id` column (xxHash of the JSON-serialized row) is used for row identity.
- For write-heavy workloads on very large recursive result sets with frequent mixed changes, `refresh_mode = 'FULL'` may still be more efficient than DRed.
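As a concrete illustration, a recursive hierarchy view is declared like any other stream table; a sketch with a hypothetical `employees(id, name, manager_id)` table:

```sql
-- Hypothetical org chart: inserts/deletes on `employees` flow through the
-- semi-naive or DRed delta rather than re-running the whole recursion.
SELECT pgtrickle.create_stream_table(
  name => 'org_paths',
  query => $$
    WITH RECURSIVE paths AS (
      SELECT id, name, name::text AS path
      FROM employees WHERE manager_id IS NULL
      UNION ALL
      SELECT e.id, e.name, p.path || ' > ' || e.name
      FROM employees e JOIN paths p ON e.manager_id = p.id
    )
    SELECT * FROM paths
  $$,
  schedule => '30s'
);
```

Renaming an ancestor here changes a derived column (`path`) for a whole subtree, which is exactly the case DRed's delete/rederive phases handle.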
Window Functions
Module: src/dvm/operators/window.rs
Handles window functions (ROW_NUMBER, RANK, DENSE_RANK, SUM() OVER, etc.) using partition-based recomputation.
Delta Rule:
When any row in a partition changes (insert, update, or delete), the entire partition's window function output is recomputed:
$$\Delta(\omega_{f, P}(R)) = \omega_{f, P}(R'|_{\text{affected partitions}}) - \omega_{f, P}(R|_{\text{affected partitions}})$$
Where $P$ is the PARTITION BY key and $f$ is the window function.
Strategy:
- Identify affected partition keys from the child delta.
- Delete old window function results for affected partitions from storage.
- Build the current input for affected partitions by excluding changed rows via NOT EXISTS on pass-through columns.
- Recompute the window function on the current input for affected partitions.
- Compute unique row IDs via `row_to_json` + `row_number` (handles tied values in ranking functions).
- Emit the recomputed rows as inserts.
SQL Generation:
-- CTE 1: Affected partition keys from delta
WITH affected_partitions AS (
SELECT DISTINCT <partition_cols> FROM (<child_delta>)
),
-- CTE 2: Current input (surviving rows not in delta) for affected partitions
current_input AS (
SELECT * FROM <child_snapshot>
WHERE (<partition_cols>) IN (SELECT * FROM affected_partitions)
AND NOT EXISTS (
SELECT 1 FROM (<child_delta>) d
WHERE d.<col1> IS NOT DISTINCT FROM <child_alias>.<col1>
AND d.<col2> IS NOT DISTINCT FROM <child_alias>.<col2> ...
)
),
-- CTE 3: Recompute window function with unique row IDs
recomputed AS (
SELECT *, pgtrickle.pg_trickle_hash(
row_to_json(w)::text || '/' || row_number() OVER ()::text
) AS __pgt_row_id
FROM (
SELECT *, <window_func> OVER (PARTITION BY <partition_cols> ORDER BY <order_cols>) AS <alias>
FROM current_input
) w
)
-- Delete old results + insert recomputed results
SELECT 'D' AS __pgt_action, ... -- old rows from affected partitions
UNION ALL
SELECT 'I' AS __pgt_action, ... -- recomputed rows
Notes:
- The cost is proportional to the size of affected partitions, not the full table. For workloads where changes spread across few partitions, this is efficient.
- When multiple window functions use different PARTITION BY clauses, the parser accepts all of them. If they share the same partition key it is used directly; otherwise the operator falls back to un-partitioned (full) recomputation.
- Without PARTITION BY, the entire table is treated as a single partition — any change triggers a full recomputation.
- Window functions wrapping aggregates (e.g., `RANK() OVER (ORDER BY SUM(x))`) are supported: the window diff rewrites ORDER BY / PARTITION BY expressions to reference aggregate output aliases via `build_agg_alias_map`.
- Row IDs are computed from the full row content (`row_to_json`) plus a positional disambiguator (`row_number`) to avoid hash collisions with tied ranking values (`DENSE_RANK`, `RANK`).
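For example, a per-category ranking view (hypothetical `products` schema) maintains itself with partition-scoped recomputation: changing one product's price recomputes only that category's partition.

```sql
SELECT pgtrickle.create_stream_table(
  name => 'product_ranks',
  query => 'SELECT id, category, price,
                   RANK() OVER (PARTITION BY category ORDER BY price DESC) AS price_rank
            FROM products',
  schedule => '1m'
);
```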
Known Limitation: O(partition_size) Recomputation Cost
Any single-row change within a window partition triggers recomputation of the entire partition. For queries with large partitions (e.g., `PARTITION BY region` where a region has 500K rows), a single INSERT into that partition causes all 500K rows to be recomputed and diffed. This is inherent to the partition-based delta strategy — window functions cannot be incrementally maintained at sub-partition granularity, because a single row insertion can shift the rank, row number, or running aggregate of every other row in the same partition.
Mitigation strategies:
- Use more granular `PARTITION BY` keys to keep partition sizes small.
- For queries without `PARTITION BY`, consider restructuring as a `GROUP BY` aggregate if the window function is equivalent (e.g., `SUM(x) OVER ()` → `SUM(x)` as a scalar subquery).
- Accept the cost for low-change-frequency partitions; the recomputation is still cheaper than a full table refresh since only affected partitions are touched.
- If partition sizes routinely exceed 100K rows and changes are frequent, consider the FULL refresh mode, which bypasses the per-partition delta entirely.
Window Frame Clauses:
Window frame specifications are fully supported:
- Modes: `ROWS`, `RANGE`, `GROUPS`
- Bounds: `UNBOUNDED PRECEDING`, `N PRECEDING`, `CURRENT ROW`, `N FOLLOWING`, `UNBOUNDED FOLLOWING`
- Between syntax: `BETWEEN <start> AND <end>`
- Exclusion: `EXCLUDE CURRENT ROW`, `EXCLUDE GROUP`, `EXCLUDE TIES`, `EXCLUDE NO OTHERS`
Example: `SUM(val) OVER (ORDER BY ts ROWS BETWEEN 3 PRECEDING AND CURRENT ROW)`
Named WINDOW Clauses:
Named window definitions are resolved from the query-level WINDOW clause:
SELECT id, SUM(val) OVER w, AVG(val) OVER w
FROM data
WINDOW w AS (PARTITION BY category ORDER BY ts)
The parser resolves OVER w by looking up the window definition from the WINDOW clause and merging partition, order, and frame specifications.
Lateral Function (Set-Returning Functions in FROM)
Module: src/dvm/operators/lateral_function.rs
Handles set-returning functions (SRFs) used in the FROM clause with implicit LATERAL semantics: jsonb_array_elements, jsonb_each, jsonb_each_text, unnest, etc.
Delta Rule:
When a source row changes (insert, update, or delete), the SRF expansion is re-evaluated only for that source row:
$$\Delta(R \ltimes f(R.\text{col})) = (R' \ltimes f(R'.\text{col}))|_{\text{changed rows}} - (R \ltimes f(R.\text{col}))|_{\text{changed rows}}$$
Where $R$ is the source table, $f$ is the SRF, and changed rows are identified via the child delta.
Strategy (Row-Scoped Recomputation):
- Propagate the child delta to identify changed source rows.
- Find all existing ST rows derived from changed source rows (via column matching).
- Delete old SRF expansions for those source rows.
- Re-expand the SRF for inserted/updated source rows.
- Emit deletes + inserts as the final delta.
SQL Generation (4-CTE chain):
-- CTE 1: Changed source rows from child delta
WITH lat_changed AS (
SELECT DISTINCT "__pgt_row_id", "__pgt_action", <child_cols>
FROM <child_delta>
),
-- CTE 2: Old ST rows for changed source rows (to be deleted)
lat_old AS (
SELECT st."__pgt_row_id", st.<all_output_cols>
FROM <st_table> st
WHERE EXISTS (
SELECT 1 FROM lat_changed cs
WHERE st.<col1> IS NOT DISTINCT FROM cs.<col1>
AND st.<col2> IS NOT DISTINCT FROM cs.<col2>
...
)
),
-- CTE 3: Re-expand SRF for inserted/updated source rows
lat_expand AS (
SELECT pg_trickle_hash(<all_cols>::text) AS "__pgt_row_id",
cs.<child_cols>, <srf_alias>.<srf_cols>
FROM lat_changed cs,
LATERAL <srf_function>(cs.<arg>) AS <srf_alias>
WHERE cs."__pgt_action" = 'I'
),
-- CTE 4: Final delta
lat_final AS (
SELECT "__pgt_row_id", 'D' AS "__pgt_action", <cols> FROM lat_old
UNION ALL
SELECT "__pgt_row_id", 'I' AS "__pgt_action", <cols> FROM lat_expand
)
Row Identity:
Content-based: hash(child_columns || srf_result_columns). This is stable as long as the same source row produces the same expanded values.
Supported SRFs:
| Function | Output Columns | Notes |
|---|---|---|
| `jsonb_array_elements(jsonb)` | `value` (jsonb) | Expands JSONB array to rows |
| `jsonb_array_elements_text(jsonb)` | `value` (text) | Text variant |
| `jsonb_each(jsonb)` | `key` (text), `value` (jsonb) | Expands JSONB object to key-value pairs |
| `jsonb_each_text(jsonb)` | `key` (text), `value` (text) | Text variant |
| `unnest(anyarray)` | Element type | Unnests PostgreSQL arrays |
| Custom SRFs | User-provided column aliases | `AS alias(col1, col2)` |
Notes:
- The cost is proportional to the number of changed source rows × average SRF expansion size, not the full table.
- `WITH ORDINALITY` is supported — adds a `bigint` ordinality column to the output. `ROWS FROM()` with multiple functions is not supported (rejected at parse time).
- Column aliases (e.g., `AS child(value)`) are used to determine output column names; for known SRFs without aliases, the function's default column names are used.
- JSON_TABLE (PostgreSQL 17+) — `JSON_TABLE(expr, path COLUMNS (...))` is modeled as a `LateralFunction` and uses the same row-scoped recomputation strategy. Supported column types: regular, EXISTS, formatted, and nested columns with `ON ERROR` / `ON EMPTY` behaviors and `PASSING` clauses.
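A typical use case (hypothetical `orders` table with a JSONB `items` array) flattens the array into rows; when one source row changes, only that row's elements are re-expanded:

```sql
SELECT pgtrickle.create_stream_table(
  name => 'order_items',
  query => 'SELECT o.id AS order_id, it.value AS item
            FROM orders o, jsonb_array_elements(o.items) AS it(value)',
  schedule => '30s'
);
```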
Lateral Subquery (Correlated Subqueries in FROM)
Module: src/dvm/operators/lateral_subquery.rs
Handles correlated subqueries used in the FROM clause with explicit or implicit LATERAL semantics: FROM t, LATERAL (SELECT ... WHERE ref = t.col) AS alias or FROM t LEFT JOIN LATERAL (...) AS alias ON true.
Delta Rule:
When an outer row changes, the correlated subquery is re-executed only for that row:
$$\Delta(R \ltimes Q(R)) = (R' \ltimes Q(R'))|_{\text{changed rows}} - (R \ltimes Q(R))|_{\text{changed rows}}$$
Where $R$ is the outer table, $Q(R)$ is the correlated subquery, and changed rows are identified via the child delta.
Strategy (Row-Scoped Recomputation):
- Propagate the child delta to identify changed outer rows.
- Find all existing ST rows derived from changed outer rows (via column matching with `IS NOT DISTINCT FROM`).
- Delete old subquery expansions for those outer rows.
- Re-execute the subquery for inserted/updated outer rows using the original outer alias.
- Emit deletes + inserts as the final delta.
SQL Generation (4-CTE chain):
-- CTE 1: Changed outer rows from child delta
WITH lat_sq_changed AS (
SELECT DISTINCT "__pgt_row_id", "__pgt_action", <child_cols>
FROM <child_delta>
),
-- CTE 2: Old ST rows for changed outer rows (to be deleted)
lat_sq_old AS (
SELECT st."__pgt_row_id", st.<all_output_cols>
FROM <st_table> st
WHERE EXISTS (
SELECT 1 FROM lat_sq_changed cs
WHERE st.<col1> IS NOT DISTINCT FROM cs.<col1>
AND st.<col2> IS NOT DISTINCT FROM cs.<col2>
...
)
),
-- CTE 3: Re-execute subquery for inserted/updated outer rows
lat_sq_expand AS (
SELECT pg_trickle_hash(<all_cols>::text) AS "__pgt_row_id",
<outer_alias>.<child_cols>, <sub_alias>.<sub_cols>
FROM lat_sq_changed AS <outer_alias>, -- Original outer alias!
LATERAL (<subquery_sql>) AS <sub_alias>
WHERE <outer_alias>."__pgt_action" = 'I'
),
-- CTE 4: Final delta
lat_sq_final AS (
SELECT "__pgt_row_id", 'D' AS "__pgt_action", <cols> FROM lat_sq_old
UNION ALL
SELECT "__pgt_row_id", 'I' AS "__pgt_action", <cols> FROM lat_sq_expand
)
LEFT JOIN LATERAL Handling:
For queries using LEFT JOIN LATERAL (...) ON true, the expand CTE uses LEFT JOIN LATERAL instead of comma syntax and wraps subquery columns in COALESCE for hash stability:
lat_sq_expand AS (
SELECT pg_trickle_hash(<outer_cols>::text || '/' || COALESCE(<sub_cols>::text, '')) AS "__pgt_row_id",
<outer_alias>.<child_cols>, <sub_alias>.<sub_cols>
FROM lat_sq_changed AS <outer_alias>
LEFT JOIN LATERAL (<subquery_sql>) AS <sub_alias> ON true
WHERE <outer_alias>."__pgt_action" = 'I'
)
Row Identity:
Content-based: hash(outer_columns || '/' || subquery_result_columns). For LEFT JOIN with NULL results, COALESCE ensures a stable hash.
Supported Patterns:
| Pattern | Syntax | Notes |
|---|---|---|
| Top-N per group | LATERAL (SELECT ... ORDER BY ... LIMIT N) | Most common use case |
| Correlated aggregate | LATERAL (SELECT SUM(x) FROM t WHERE t.fk = p.pk) | Returns single row per outer row |
| Existence with data | LEFT JOIN LATERAL (...) ON true | Preserves outer rows with NULLs |
| Multi-column lookup | LATERAL (SELECT a, b FROM t WHERE t.fk = p.pk LIMIT 1) | Multiple derived values |
| GROUP BY inside subquery | LATERAL (SELECT type, COUNT(*) FROM t WHERE t.fk = p.pk GROUP BY type) | Multiple rows per outer row |
Key Design Decision: Outer Alias Rewriting
The subquery body contains column references to the outer table (e.g., WHERE li.order_id = o.id). In the expansion CTE, the changed-sources CTE is aliased with the original outer table alias (e.g., lat_sq_changed AS o) so that the subquery's column references resolve naturally without rewriting.
Notes:
- The cost is proportional to the number of changed outer rows × average subquery result size, not the full table.
- The subquery is stored as raw SQL (like `LateralFunction`) because it cannot be independently differentiated — it depends on outer row context.
- Source table OIDs referenced by the subquery body are extracted at parse time for CDC trigger setup.
- ORDER BY + LIMIT inside the subquery are valid (they apply per-outer-row, not to the stream table).
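The Top-N-per-group pattern from the table above looks like this as a stream table (hypothetical `customers` / `orders` schema):

```sql
SELECT pgtrickle.create_stream_table(
  name => 'latest_orders_per_customer',
  query => 'SELECT c.id AS customer_id, o.id AS order_id, o.placed_at
            FROM customers c,
                 LATERAL (SELECT id, placed_at FROM orders
                          WHERE orders.customer_id = c.id
                          ORDER BY placed_at DESC LIMIT 3) AS o',
  schedule => '1m'
);
```

The `ORDER BY ... LIMIT 3` applies per outer row, so each customer contributes at most three rows to the result.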
Semi-Join (EXISTS / IN Subquery)
Module: src/dvm/operators/semi_join.rs
Handles WHERE EXISTS (SELECT ... FROM ...) and WHERE col IN (SELECT ...) patterns. The parser transforms these into a SemiJoin operator with a left (outer) child, a right (inner) child, and a join condition.
Delta Rule:
$$\Delta(L \ltimes R) = \Delta L|_{R} + L|_{\Delta R \text{ causes existence change}}$$
- Part 1: Outer rows that changed and still satisfy the semi-join condition.
- Part 2: Existing outer rows whose semi-join result flipped due to inner changes (a matching inner row was inserted or deleted).
Strategy (Two-Part Delta):
- Part 1 (outer delta): Filter `delta_left` to rows that have at least one match in the current right-hand snapshot.
- Part 2 (inner delta): For each row in the left snapshot, check whether the existence of matching right-hand rows changed between the old and current state. Emit `'I'` if a match appeared, `'D'` if all matches disappeared.
The "old" right-hand state is reconstructed from the current state by reversing the delta: `R_old = (R_current EXCEPT ALL delta_right(action='I')) UNION ALL delta_right(action='D')`.
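A sketch of that reversal in SQL, using the same angle-bracket placeholders as the generated-SQL fragments elsewhere in this document:

```sql
-- R_old: the right-hand state as it was before the current delta was applied.
-- Subtract the rows the delta inserted, then add back the rows it deleted.
WITH r_old AS (
  (SELECT <right_cols> FROM <right_snapshot>
   EXCEPT ALL
   SELECT <right_cols> FROM <delta_right> WHERE "__pgt_action" = 'I')
  UNION ALL
  SELECT <right_cols> FROM <delta_right> WHERE "__pgt_action" = 'D'
)
SELECT * FROM r_old
```

`EXCEPT ALL` / `UNION ALL` keep multiset semantics, so duplicate rows retain the correct counts.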
Row Identity:
- Part 1: Uses `__pgt_row_id` from the left delta.
- Part 2: Content-based hash via `pg_trickle_hash_multi` on left-side columns.
Supported Patterns:
| Pattern | SQL | Notes |
|---|---|---|
| `EXISTS` | `WHERE EXISTS (SELECT 1 FROM t WHERE t.fk = s.pk)` | Direct semi-join |
| `IN (subquery)` | `WHERE id IN (SELECT fk FROM t)` | Rewritten to EXISTS with equality |
| Multiple conditions | `WHERE EXISTS (... AND ...)` | Additional predicates in subquery WHERE |
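Declared as a stream table (hypothetical `customers` / `orders` schema), the EXISTS pattern looks like:

```sql
SELECT pgtrickle.create_stream_table(
  name => 'customers_with_orders',
  query => 'SELECT * FROM customers c
            WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)',
  schedule => '30s'
);
```

Deleting a customer's last order flips the existence test and emits a `'D'` for that customer on the next refresh.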
Anti-Join (NOT EXISTS / NOT IN Subquery)
Module: src/dvm/operators/anti_join.rs
Handles WHERE NOT EXISTS (SELECT ... FROM ...) and WHERE col NOT IN (SELECT ...) patterns. The inverse of the semi-join operator.
Delta Rule:
$$\Delta(L \triangleright R) = \Delta L|_{\neg R} + L|_{\Delta R \text{ causes existence change}}$$
- Part 1: Outer rows that changed and have no match in the right-hand snapshot.
- Part 2: Existing outer rows whose anti-join result flipped due to inner changes.
Strategy (Two-Part Delta):
- Part 1 (outer delta): Filter `delta_left` to rows with `NOT EXISTS` in the current right snapshot.
- Part 2 (inner delta): For each row in the left snapshot, detect existence changes. Emit `'D'` if a match appeared (row no longer qualifies), `'I'` if all matches disappeared (row now qualifies).
Note the inverted semantics compared to semi-join: a new match means deletion, losing all matches means insertion.
Row Identity: Same as semi-join.
Supported Patterns:
| Pattern | SQL | Notes |
|---|---|---|
| `NOT EXISTS` | `WHERE NOT EXISTS (SELECT 1 FROM t WHERE t.fk = s.pk)` | Direct anti-join |
| `NOT IN (subquery)` | `WHERE id NOT IN (SELECT fk FROM t)` | Rewritten to NOT EXISTS with equality |
Scalar Subquery (Correlated SELECT Subquery)
Module: src/dvm/operators/scalar_subquery.rs
Handles scalar subqueries appearing in the SELECT list, e.g., SELECT a, (SELECT max(x) FROM t) AS mx FROM s. The subquery must return exactly one row and one column.
Delta Rule:
$$\Delta(L \times q) = \Delta L \times q' + L \times (q' - q)$$
Where $q$ is the scalar subquery value and $q'$ is the updated value.
Strategy (Two-Part Delta):
- Part 1 (outer delta): Propagate the child delta, appending the current scalar subquery value to each row.
- Part 2 (scalar value change): When the scalar subquery's result changes, emit deletes for all existing outer rows (with the old scalar value) and re-inserts for all outer rows (with the new value). The old scalar value is reconstructed by reversing the inner delta.
SQL Generation (3 or 4 CTEs):
-- Part 1: child delta + current scalar value
WITH sq_outer AS (
SELECT *, (<scalar_subquery>) AS "<alias>"
FROM <child_delta>
),
-- Part 2a: DELETE all outer rows when scalar changed
sq_del AS (
SELECT "__pgt_row_id", 'D' AS "__pgt_action", <cols>
FROM <st_table>
WHERE (<scalar_old>) IS DISTINCT FROM (<scalar_current>)
),
-- Part 2b: INSERT all outer rows with new scalar value
sq_ins AS (
SELECT pg_trickle_hash_multi(...) AS "__pgt_row_id",
'I' AS "__pgt_action", <cols>, (<scalar_current>) AS "<alias>"
FROM <source_snapshot>
WHERE (<scalar_old>) IS DISTINCT FROM (<scalar_current>)
)
-- Final: UNION ALL of all parts
SELECT * FROM sq_outer
UNION ALL SELECT * FROM sq_del
UNION ALL SELECT * FROM sq_ins
Row Identity:
- Part 1: `__pgt_row_id` from the child delta.
- Part 2: Content-based hash via `pg_trickle_hash_multi` on all output columns.
Notes:
- The scalar subquery is stored as raw SQL (deparsed from the parse tree).
- The old scalar value is approximated using the same `EXCEPT ALL / UNION ALL` reversal technique as semi/anti-join.
- If the scalar subquery references a table that changes, all outer rows must be re-evaluated — the delta can be large.
- Source OIDs used by the scalar subquery are captured at parse time for CDC trigger registration.
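For example (hypothetical `products` schema), a view comparing each row to a table-wide scalar; per the note above, any change to `products` that moves the average re-emits every row:

```sql
SELECT pgtrickle.create_stream_table(
  name => 'products_vs_avg',
  query => 'SELECT p.id, p.price,
                   (SELECT AVG(price) FROM products) AS avg_price
            FROM products p',
  schedule => '1m'
);
```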
Operator Tree Construction
The DVM engine builds the operator tree by analyzing the parsed query:
- WITH clause → CTE definitions extracted into a name→body map (non-recursive) or CTE registry (multi-reference)
- FROM clause → `Scan` nodes for physical tables; `Subquery` nodes for inlined CTEs and subqueries in FROM; `CteScan` nodes for multi-reference CTEs; `LateralFunction` nodes for SRFs and JSON_TABLE in FROM; `LateralSubquery` nodes for correlated subqueries in FROM
- JOIN → `Join` or `OuterJoin` wrapping two sub-trees
- LATERAL SRFs → `LateralFunction` wrapping the left-hand FROM item as its child
- LATERAL subqueries → `LateralSubquery` wrapping the left-hand FROM item as its child (comma syntax or JOIN LATERAL)
- WHERE subqueries → `SemiJoin` for `EXISTS` / `IN (subquery)`, `AntiJoin` for `NOT EXISTS` / `NOT IN (subquery)`, extracted from the WHERE clause
- Scalar subqueries → `ScalarSubquery` for `(SELECT ...)` in the SELECT list, wrapping the child tree
- WHERE → `Filter` wrapping the scan/join tree (remaining non-subquery predicates)
- SELECT list → `Project` for column selection and expressions
- GROUP BY → `Aggregate` wrapping the filtered/projected tree
- DISTINCT → `Distinct` on top
- UNION ALL → `UnionAll` combining two complete sub-trees
- INTERSECT / EXCEPT → `Intersect` or `Except` combining two sub-trees with dual-count tracking
- Window functions → `Window` wrapping the sub-tree with PARTITION BY / ORDER BY metadata
- ORDER BY → silently discarded (storage row order is undefined)
- LIMIT / OFFSET → `ORDER BY + LIMIT [+ OFFSET]` is accepted as TopK (scoped recomputation); standalone `LIMIT` or `OFFSET` without `ORDER BY` is rejected
For recursive CTEs (WITH RECURSIVE), the query is parsed into an OpTree with RecursiveCte operator nodes. In DIFFERENTIAL mode, the strategy (semi-naive, DRed, or recomputation) is selected automatically based on column compatibility and change type — see the Recursive CTEs section above for details.
The tree is then traversed bottom-up during delta generation: each operator's generate_delta_sql() method composes its SQL fragment around the output of its child operator(s).
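As a worked example, a small query touching several of the clauses above (hypothetical schema) decomposes as sketched in the comments:

```sql
-- Defining query:
SELECT region, SUM(amount) AS total
FROM orders
WHERE status = 'active'
GROUP BY region;

-- Resulting operator tree (bottom-up):
--   Scan(orders)
--     -> Filter(status = 'active')
--       -> Aggregate(GROUP BY region; SUM(amount))
--
-- Delta generation then composes SQL in the same order: the Scan emits the
-- change-buffer delta, Filter passes matching rows through (linear operator),
-- and Aggregate folds the delta into its maintained group counters.
```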
Further Reading
- ARCHITECTURE.md — System-wide component overview
- SQL_REFERENCE.md — Complete function reference
- CONFIGURATION.md — GUC tuning guide
pg_trickle — Benchmark Guide
This document explains how the database-level refresh benchmarks work and how to interpret their output.
Overview
The benchmark suite in tests/e2e_bench_tests.rs measures wall-clock refresh time for FULL vs DIFFERENTIAL mode across a matrix of table sizes, change rates, and query complexities. Each benchmark spawns an isolated PostgreSQL 18.x container via Testcontainers, ensuring reproducible and interference-free measurements.
The core question the benchmarks answer:
How much faster is a DIFFERENTIAL refresh compared to a FULL refresh, given a specific workload?
Prerequisites
Build the E2E test Docker image before running any benchmarks:
./tests/build_e2e_image.sh
Docker must be running on the host.
Running Benchmarks
All benchmark tests are tagged #[ignore] so they are skipped during normal CI. The --nocapture flag is required to see the printed output tables.
Quick Spot Checks (~5–10 seconds each)
# Simple scan, 10K rows, 1% change rate
cargo test --test e2e_bench_tests --features pg18 -- --ignored --nocapture bench_scan_10k_1pct
# Aggregate query, 100K rows, 1% change rate
cargo test --test e2e_bench_tests --features pg18 -- --ignored --nocapture bench_aggregate_100k_1pct
# Join + aggregate, 100K rows, 10% change rate
cargo test --test e2e_bench_tests --features pg18 -- --ignored --nocapture bench_join_agg_100k_10pct
Zero-Change Latency (~5 seconds)
cargo test --test e2e_bench_tests --features pg18 -- --ignored --nocapture bench_no_data_refresh_latency
Full Matrix (~15–30 minutes)
Runs all 30 combinations and prints a consolidated summary:
cargo test --test e2e_bench_tests --features pg18 -- --ignored --nocapture bench_full_matrix
Run All Benchmarks in Parallel
cargo test --test e2e_bench_tests --features pg18 -- --ignored --nocapture
Note: each test starts its own container, so parallel execution requires sufficient Docker resources.
Benchmark Dimensions
Table Sizes
| Size | Rows | Purpose |
|---|---|---|
| Small | 10,000 | Fast iteration; measures per-row overhead |
| Medium | 100,000 | More realistic; reveals scaling characteristics |
Change Rates
| Rate | Description |
|---|---|
| 1% | Low churn — the sweet spot for incremental refresh |
| 10% | Moderate churn — tests delta query scalability |
| 50% | High churn — stress test; approaches full-refresh cost |
Query Complexities
| Scenario | Defining Query | Operators Tested |
|---|---|---|
| scan | SELECT id, region, category, amount, score FROM src | Table scan only |
| filter | SELECT id, region, amount FROM src WHERE amount > 5000 | Scan + filter (WHERE) |
| aggregate | SELECT region, SUM(amount), COUNT(*) FROM src GROUP BY region | Scan + group-by aggregate |
| join | SELECT s.id, s.region, s.amount, d.region_name FROM src s JOIN dim d ON ... | Scan + inner join |
| join_agg | SELECT d.region_name, SUM(s.amount), COUNT(*) FROM src s JOIN dim d ON ... GROUP BY ... | Scan + join + aggregate |
DML Mix per Cycle
Each change cycle applies a realistic mix of operations:
| Operation | Fraction | Example at 10K rows, 10% rate |
|---|---|---|
| UPDATE | 70% | 700 rows have amount incremented |
| DELETE | 15% | 150 rows removed |
| INSERT | 15% | 150 new rows added |
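A sketch of one change cycle's DML at 10K rows and a 10% rate, using the `src` columns from the scenario queries above (the actual generator lives in the Rust test harness; exact values and value distributions here are illustrative):

```sql
-- 70% updates: increment amount on 700 random rows
UPDATE src SET amount = amount + 1
 WHERE id IN (SELECT id FROM src ORDER BY random() LIMIT 700);

-- 15% deletes: remove 150 random rows
DELETE FROM src
 WHERE id IN (SELECT id FROM src ORDER BY random() LIMIT 150);

-- 15% inserts: add 150 new rows
INSERT INTO src (region, category, amount, score)
SELECT 'r' || (i % 5), 'c' || (i % 10), (random() * 10000)::int, random()
FROM generate_series(1, 150) AS i;
```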
What Each Benchmark Does
1. Start a fresh PostgreSQL 18.x container
2. Install the pg_trickle extension
3. Create and populate the source table (10K or 100K rows)
4. Create dimension table if needed (for join scenarios)
5. ANALYZE for stable query plans
── FULL mode ──
6. Create a Stream Table in FULL refresh mode
7. For each of 3 cycles:
a. Apply random DML (updates + deletes + inserts)
b. ANALYZE
c. Time the FULL refresh (TRUNCATE + re-execute entire query)
d. Record refresh_ms and ST row count
8. Drop the FULL-mode ST
── DIFFERENTIAL mode ──
9. Reset source table to same starting state
10. Create a Stream Table in DIFFERENTIAL refresh mode
11. For each of 3 cycles:
a. Apply random DML (same parameters)
b. ANALYZE
c. Time the DIFFERENTIAL refresh (delta query + MERGE)
d. Record refresh_ms and ST row count
12. Print results table and summary
Both modes start from the same data to ensure a fair comparison. The 3-cycle design captures warm-up effects (cycle 1 may be slower due to plan caching).
Reading the Output
Detail Table
╔══════════════════════════════════════════════════════════════════════════════════════╗
║ pg_trickle Refresh Benchmark Results ║
╠════════════╤══════════╤════════╤═════════════╤═══════╤════════════╤═════════════════╣
║ Scenario │ Rows │ Chg % │ Mode │ Cycle │ Refresh ms │ ST Rows ║
╠════════════╪══════════╪════════╪═════════════╪═══════╪════════════╪═════════════════╣
║ aggregate │ 10000 │ 1% │ FULL │ 1 │ 22.1 │ 5 ║
║ aggregate │ 10000 │ 1% │ FULL │ 2 │ 4.8 │ 5 ║
║ aggregate │ 10000 │ 1% │ FULL │ 3 │ 5.3 │ 5 ║
║ aggregate │ 10000 │ 1% │ DIFFERENTIAL │ 1 │ 8.4 │ 5 ║
║ aggregate │ 10000 │ 1% │ DIFFERENTIAL │ 2 │ 4.4 │ 5 ║
║ aggregate │ 10000 │ 1% │ DIFFERENTIAL │ 3 │ 4.6 │ 5 ║
╚════════════╧══════════╧════════╧═════════════╧═══════╧════════════╧═════════════════╝
| Column | Meaning |
|---|---|
| Scenario | Query complexity level (scan, filter, aggregate, join, join_agg) |
| Rows | Number of rows in the base table |
| Chg % | Percentage of rows changed per cycle |
| Mode | FULL (truncate + recompute) or DIFFERENTIAL (delta + merge) |
| Cycle | Which of the 3 measurement rounds (cycle 1 often includes warm-up) |
| Refresh ms | Wall-clock time for the refresh operation |
| ST Rows | Row count in the Stream Table after refresh (sanity check) |
Summary Table
┌─────────────────────────────────────────────────────────────────────────┐
│ Summary (avg ms per cycle) │
├────────────┬──────────┬────────┬─────────────────┬──────────────────────┤
│ Scenario │ Rows │ Chg % │ FULL avg ms │ DIFFERENTIAL avg ms │
├────────────┼──────────┼────────┼─────────────────┼──────────────────────┤
│ aggregate │ 10000 │ 1% │ 10.7 │ 5.8 ( 1.8x) │
└────────────┴──────────┴────────┴─────────────────┴──────────────────────┘
The Speedup value in parentheses is FULL avg / DIFFERENTIAL avg — how many times faster the incremental refresh is compared to a full refresh.
Interpreting the Speedup
What to Expect
| Change Rate | Table Size | Expected Speedup | Explanation |
|---|---|---|---|
| 1% | 10K | 1.5–5x | Small table; overhead is similar, delta is tiny |
| 1% | 100K | 5–50x | Larger table amplifies full-refresh cost |
| 10% | 100K | 2–10x | Moderate delta; still significantly faster |
| 50% | any | 1–2x | Delta is nearly as large as full table |
Rules of Thumb
| Speedup | Interpretation |
|---|---|
| > 10x | Strong win for DIFFERENTIAL — typical at low change rates on larger tables |
| 5–10x | Clear advantage for DIFFERENTIAL |
| 2–5x | Moderate advantage — DIFFERENTIAL is the right choice |
| 1–2x | Marginal gain — either mode is acceptable |
| ~1x | Break-even — change rate is too high for incremental to help |
| < 1x | DIFFERENTIAL is slower — would indicate overhead exceeds savings (investigate) |
Key Patterns to Look For
- Scaling with table size: For the same change rate, speedup should increase with table size. FULL must re-process all rows; DIFFERENTIAL processes only the delta.
- Degradation with change rate: As change rate rises from 1% → 50%, speedup should decrease. At 50%, DIFFERENTIAL processes half the table, which approaches FULL cost.
- Query complexity amplifies speedup: Aggregate and join queries benefit more from DIFFERENTIAL because they avoid expensive re-computation. A join_agg at 1% changes should show higher speedup than a simple scan at the same parameters.
- Cycle 1 warm-up: The first cycle in each mode may be slower due to PostgreSQL plan cache population. Use cycles 2–3 for the steadiest numbers.
- ST Rows consistency: The ST row count should be similar between FULL and DIFFERENTIAL for the same scenario (accounting for random DML). Large discrepancies indicate a correctness issue.
Zero-Change Latency
The bench_no_data_refresh_latency test measures the overhead of a refresh when no data has changed — the NO_DATA code path.
┌──────────────────────────────────────────────┐
│ NO_DATA Refresh Latency (10 iterations) │
├──────────────────────────────────────────────┤
│ Avg: 3.21 ms │
│ Max: 5.10 ms │
│ Target: < 10 ms │
│ Status: ✅ PASS │
└──────────────────────────────────────────────┘
| Metric | Meaning |
|---|---|
| Avg | Average wall-clock time across 10 no-op refreshes |
| Max | Worst-case single iteration |
| Target | The PLAN.md goal: < 10 ms per no-op refresh |
| Status | PASS if avg < 10 ms, SLOW otherwise |
A passing result confirms the scheduler's per-cycle overhead is negligible. Values > 10 ms in containerized environments may be acceptable due to Docker overhead; bare-metal PostgreSQL should comfortably meet the target.
Available Tests
Individual Tests (10K rows)
| Test Name | Scenario | Change Rate |
|---|---|---|
| `bench_scan_10k_1pct` | scan | 1% |
| `bench_scan_10k_10pct` | scan | 10% |
| `bench_scan_10k_50pct` | scan | 50% |
| `bench_filter_10k_1pct` | filter | 1% |
| `bench_aggregate_10k_1pct` | aggregate | 1% |
| `bench_join_10k_1pct` | join | 1% |
| `bench_join_agg_10k_1pct` | join_agg | 1% |
Individual Tests (100K rows)
| Test Name | Scenario | Change Rate |
|---|---|---|
| `bench_scan_100k_1pct` | scan | 1% |
| `bench_scan_100k_10pct` | scan | 10% |
| `bench_scan_100k_50pct` | scan | 50% |
| `bench_aggregate_100k_1pct` | aggregate | 1% |
| `bench_aggregate_100k_10pct` | aggregate | 10% |
| `bench_join_agg_100k_1pct` | join_agg | 1% |
| `bench_join_agg_100k_10pct` | join_agg | 10% |
Special Tests
| Test Name | Description |
|---|---|
| `bench_full_matrix` | All 30 combinations (5 queries × 2 sizes × 3 rates) |
| `bench_no_data_refresh_latency` | Zero-change overhead (10 iterations) |
Nexmark Streaming Benchmark
The Nexmark benchmark validates correctness against a sustained high-frequency DML workload modelling an online auction system. It is adapted from the Nexmark benchmark specification used by streaming systems like Flink, Feldera, and Materialize.
Data Model
| Table | Description | Default Size |
|---|---|---|
| `person` | Registered users (sellers/bidders) | 100 rows |
| `auction` | Items listed for sale | 500 rows |
| `bid` | Bids placed on auctions | 2,000 rows |
Queries
| Query | Features | Description |
|---|---|---|
| Q0 | Passthrough | Identity projection of all bids |
| Q1 | Projection + arithmetic | Currency conversion |
| Q2 | Filter | Bids on specific auctions |
| Q3 | JOIN + filter | Local item suggestion (person-auction join) |
| Q4 | JOIN + GROUP BY + AVG | Average selling price by category |
| Q5 | GROUP BY + COUNT | Hot items (bid count per auction) |
| Q6 | JOIN + GROUP BY + AVG | Average bid price per seller |
| Q7 | Aggregate (MAX) | Highest bid price |
| Q8 | JOIN | Person-auction join (new users monitoring) |
| Q9 | JOIN + DISTINCT ON | Winning bid per auction with bidder info |
Running Nexmark Tests
# Default scale (100 persons, 500 auctions, 2000 bids, 3 cycles)
cargo test --test e2e_nexmark_tests -- --ignored --test-threads=1 --nocapture
# Larger scale
NEXMARK_PERSONS=1000 NEXMARK_AUCTIONS=5000 NEXMARK_BIDS=50000 NEXMARK_CYCLES=5 \
cargo test --test e2e_nexmark_tests -- --ignored --test-threads=1 --nocapture
What Each Cycle Does
Each refresh cycle applies three mutation functions (RF1-RF3) then refreshes all stream tables and asserts multiset equality:
- RF1 (INSERT): New persons, auctions, and bids
- RF2 (DELETE): Remove oldest bids, orphaned auctions, orphaned persons
- RF3 (UPDATE): Price changes, reserve adjustments, city moves
- Refresh + Assert: Differential refresh → EXCEPT ALL correctness check
Correctness Validation
The test uses the same DBSP invariant as TPC-H: after every differential
refresh, the stream table must be multiset-equal to re-executing the
defining query from scratch (symmetric EXCEPT ALL). Additionally, negative
__pgt_count values (over-retraction bugs) are detected.
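The symmetric-difference check can be sketched as follows, with placeholders for the stream table and its defining query:

```sql
-- Multiset equality: both directions of EXCEPT ALL must be empty.
SELECT count(*) = 0 AS ok FROM (
  (SELECT <output_cols> FROM <stream_table>
   EXCEPT ALL
   <defining_query>)
  UNION ALL
  (<defining_query>
   EXCEPT ALL
   SELECT <output_cols> FROM <stream_table>)
) diff;
```

Because `EXCEPT ALL` respects duplicate counts, this also catches rows present the wrong number of times, not just missing or extra distinct rows.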
DAG Topology Benchmarks
The DAG topology benchmark suite in tests/e2e_dag_bench_tests.rs measures end-to-end propagation latency and throughput through multi-level DAG topologies. While the single-ST benchmarks above measure per-operator refresh speed, these benchmarks measure how efficiently changes propagate through chains, fan-outs, diamonds, and mixed topologies with 5–100+ stream tables.
The core questions these benchmarks answer:
How long does it take for a source-table INSERT to propagate through an entire DAG to the leaf stream tables?
How does PARALLEL refresh mode compare to CALCULATED mode across different topology shapes?
Running DAG Benchmarks
# Full suite (rebuilds Docker image)
just test-dag-bench
# Skip Docker image rebuild
just test-dag-bench-fast
# Individual topology tests
cargo test --test e2e_dag_bench_tests --features pg18 -- --ignored bench_latency_linear_5 --test-threads=1 --nocapture
cargo test --test e2e_dag_bench_tests --features pg18 -- --ignored bench_throughput_diamond --test-threads=1 --nocapture
Topology Patterns
| Topology | Shape | Description |
|---|---|---|
| Linear Chain | src → st_1 → st_2 → ... → st_N | Sequential pipeline; L1 aggregate, L2+ alternating project/filter |
| Wide DAG | src → [W parallel chains × D deep] | W independent chains of depth D from a shared source; tests parallel refresh mode |
| Fan-Out Tree | src → root → [b children] → [b² grandchildren] → ... | Exponential fan-out; each parent spawns b children with filter/project variants |
| Diamond | src → [fan-out aggregates] → JOIN → [extension] | Fan-out to independent aggregates (SUM/COUNT/MAX/MIN/AVG) then converge via JOIN |
| Mixed | Two sources, 4 layers, ~15 STs | Realistic e-commerce scenario with chains, fan-out, cross-source joins, and alerts |
Measurement Modes
Latency benchmarks (auto-refresh): The scheduler is enabled with a 200 ms interval. The test INSERTs into the source table and polls pgt_refresh_history until the leaf stream table has a new COMPLETED entry. This measures the full propagation latency including scheduler overhead.
Throughput benchmarks (manual refresh): The scheduler is disabled. The test applies mixed DML (70% UPDATE, 15% DELETE, 15% INSERT) then manually refreshes all STs in topological order. This isolates pure refresh cost from scheduler overhead.
Theoretical Comparison
Each latency benchmark computes the theoretical prediction from PLAN_DAG_PERFORMANCE.md and reports the delta:
| Mode | Formula |
|---|---|
| CALCULATED | L = I_s + N × T_r |
| PARALLEL(C) | L = Σ ⌈W_l / C⌉ × max(I_p, T_r) per level |
Where T_r is the measured average per-ST refresh time, I_s = 200 ms (scheduler interval), and C is the concurrency limit.
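The formulas can be checked numerically with a short sketch. The per-level widths and `I_p` (the parallel poll interval, assumed here to equal the 200 ms scheduler interval) are assumptions, not measured values:

```python
import math

def theory_calculated(n_sts, t_r_ms, i_s_ms=200.0):
    # L = I_s + N * T_r: one scheduler wake-up, then N sequential refreshes
    return i_s_ms + n_sts * t_r_ms

def theory_parallel(level_widths, c, t_r_ms, i_p_ms=200.0):
    # L = sum over levels of ceil(W_l / C) * max(I_p, T_r):
    # each level needs ceil(W_l / C) waves of up to C concurrent refreshes
    return sum(math.ceil(w / c) * max(i_p_ms, t_r_ms) for w in level_widths)

# With T_r = 50 ms these reproduce the theory_ms values in the sample output:
assert theory_calculated(10, 50.0) == 700.0              # linear chain, 10 STs
assert theory_parallel([20, 20, 20], 8, 50.0) == 1800.0  # wide DAG, 3 levels x 20
```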
Reading the Output
Per-Cycle Machine-Parseable Lines (stderr)
[DAG_BENCH] topology=linear_chain mode=CALCULATED sts=10 depth=10 width=1 cycle=1 actual_ms=820.3 theory_ms=700.0 overhead_pct=17.2 per_hop_ms=82.0
ASCII Summary Table (stdout)
╔══════════════════════════════════════════════════════════════════════════════════════════════════════╗
║ pg_trickle DAG Topology Benchmark Results ║
╠═══════════════╤═══════════════╤══════╤═══════╤═══════╤════════════╤════════════╤═══════════════════╣
║ Topology │ Mode │ STs │ Depth │ Width │ Actual ms │ Theory ms │ Overhead ║
╠═══════════════╪═══════════════╪══════╪═══════╪═══════╪════════════╪════════════╪═══════════════════╣
║ linear_chain │ CALCULATED │ 10 │ 10 │ 1 │ 820.3 │ 700.0 │ +17.2% ║
║ wide_dag │ PARALLEL_C8 │ 60 │ 3 │ 20 │ 2430.1 │ 1800.0 │ +35.0% ║
╚═══════════════╧═══════════════╧══════╧═══════╧═══════╧════════════╧════════════╧═══════════════════╝
Per-Level Breakdown
Per-Level Breakdown (linear_chain D=10, CALCULATED):
Level 1: avg 52.3ms [st_lc_1]
Level 2: avg 48.7ms [st_lc_2]
...
Level 10: avg 51.2ms [st_lc_10]
Total: 513.5ms (scheduler overhead: 306.8ms)
JSON Export
Results are written to target/dag_bench_results/<timestamp>.json (overridable via PGS_DAG_BENCH_JSON_DIR env var) for cross-run comparison.
Available DAG Benchmark Tests
Latency Tests (Auto-Refresh)
| Test Name | Topology | Mode | STs |
|---|---|---|---|
| `bench_latency_linear_5_calc` | Linear, D=5 | CALCULATED | 5 |
| `bench_latency_linear_10_calc` | Linear, D=10 | CALCULATED | 10 |
| `bench_latency_linear_20_calc` | Linear, D=20 | CALCULATED | 20 |
| `bench_latency_linear_10_par4` | Linear, D=10 | PARALLEL(4) | 10 |
| `bench_latency_wide_3x20_calc` | Wide, D=3 W=20 | CALCULATED | 60 |
| `bench_latency_wide_3x20_par4` | Wide, D=3 W=20 | PARALLEL(4) | 60 |
| `bench_latency_wide_3x20_par8` | Wide, D=3 W=20 | PARALLEL(8) | 60 |
| `bench_latency_wide_5x20_calc` | Wide, D=5 W=20 | CALCULATED | 100 |
| `bench_latency_wide_5x20_par8` | Wide, D=5 W=20 | PARALLEL(8) | 100 |
| `bench_latency_fanout_b2d5_calc` | Fan-out, b=2 d=5 | CALCULATED | 31 |
| `bench_latency_fanout_b2d5_par8` | Fan-out, b=2 d=5 | PARALLEL(8) | 31 |
| `bench_latency_diamond_4_calc` | Diamond, fan=4 | CALCULATED | 5 |
| `bench_latency_mixed_calc` | Mixed, ~15 STs | CALCULATED | ~15 |
| `bench_latency_mixed_par8` | Mixed, ~15 STs | PARALLEL(8) | ~15 |
Throughput Tests (Manual Refresh)
| Test Name | Topology | STs | Delta Sizes |
|---|---|---|---|
| `bench_throughput_linear_5` | Linear, D=5 | 5 | 10, 100, 1000 |
| `bench_throughput_linear_10` | Linear, D=10 | 10 | 10, 100, 1000 |
| `bench_throughput_linear_20` | Linear, D=20 | 20 | 10, 100, 1000 |
| `bench_throughput_wide_3x20` | Wide, D=3 W=20 | 60 | 10, 100, 1000 |
| `bench_throughput_fanout_b2d5` | Fan-out, b=2 d=5 | 31 | 10, 100, 1000 |
| `bench_throughput_diamond_4` | Diamond, fan=4 | 5 | 10, 100, 1000 |
| `bench_throughput_mixed` | Mixed, ~15 STs | ~15 | 10, 100, 1000 |
What to Look For
- Linear chain: CALCULATED faster than PARALLEL. For width=1 DAGs, PARALLEL adds poll overhead without any parallelism benefit, so CALCULATED should win.
- Wide DAG: PARALLEL(C=8) speedup over CALCULATED. For width ≥ 20, PARALLEL should show a measurable improvement — it refreshes up to C STs concurrently per level instead of sequentially.
- Overhead < 100%. The gap between theoretical and actual latency should stay below 100% across all topologies — the formulas should be in the right ballpark.
- DIFFERENTIAL action in per-ST breakdown. ST-on-ST hops should show `DIFFERENTIAL` rather than `FULL`, confirming differential propagation is working.
- Throughput scaling with delta size. Smaller deltas (10 rows) should yield lower per-cycle wall-clock time than larger deltas (1000 rows).
In-Process Micro-Benchmarks (Criterion.rs)
In addition to the E2E database benchmarks, the project includes two Criterion.rs benchmark suites that measure pure Rust computation time without database overhead. These are useful for tracking performance regressions in the internal query-building and IVM differentiation logic.
Benchmark Suites
refresh_bench — Utility Functions
benches/refresh_bench.rs benchmarks the low-level helper functions used during refresh operations:
| Benchmark Group | What It Measures |
|---|---|
| quote_ident | PostgreSQL identifier quoting speed |
| col_list | Column list SQL generation |
| prefixed_col_list | Prefixed column list generation (e.g., NEW.col) |
| expr_to_sql | AST expression → SQL string conversion |
| output_columns | Output column extraction from parsed queries |
| source_oids | Source table OID resolution |
| lsn_gt | LSN comparison expression generation |
| frontier_json | Frontier state JSON serialization |
| canonical_period | Interval parsing and canonicalization |
| dag_operations | DAG topological sort and cycle detection |
| xxh64 | xxHash-64 hashing throughput |
diff_operators — IVM Operator Differentiation
benches/diff_operators.rs benchmarks the delta SQL generation for every IVM operator. Each benchmark creates a realistic operator tree and measures differentiate() throughput:
| Benchmark Group | What It Measures |
|---|---|
| diff_scan | Table scan differentiation (3, 10, 20 columns) |
| diff_filter | Filter (WHERE) differentiation |
| diff_project | Projection (SELECT subset) differentiation |
| diff_aggregate | GROUP BY aggregate differentiation (simple + complex) |
| diff_inner_join | Inner join differentiation |
| diff_left_join | Left outer join differentiation |
| diff_distinct | DISTINCT differentiation |
| diff_union_all | UNION ALL differentiation (2, 5, 10 children) |
| diff_window | Window function differentiation |
| diff_join_aggregate | Composite join + aggregate pipeline |
| differentiate_full | Full differentiate() call for scan-only and filter+scan trees |
Running Micro-Benchmarks
# Run all Criterion benchmarks
just bench
# Run only refresh utility benchmarks
cargo bench --bench refresh_bench --features pg18
# Run only IVM diff operator benchmarks
just bench-diff
# or equivalently:
cargo bench --bench diff_operators --features pg18
# Output in Bencher-compatible format (for CI integration)
just bench-bencher
Output and Reports
Criterion produces statistical analysis for each benchmark including:
- Mean and standard deviation of execution time
- Throughput (iterations/sec)
- Comparison with previous run — reports improvements/regressions with confidence intervals
HTML reports are generated in target/criterion/ with interactive charts showing distributions and regression history. Open target/criterion/report/index.html to browse all results.
Sample output:
diff_scan/3_columns time: [11.834 µs 12.074 µs 12.329 µs]
diff_scan/10_columns time: [16.203 µs 16.525 µs 16.869 µs]
diff_aggregate/simple time: [21.447 µs 21.862 µs 22.301 µs]
diff_inner_join time: [25.919 µs 26.421 µs 26.952 µs]
Continuous Benchmarking with Bencher
Bencher provides continuous benchmark tracking in CI, detecting performance regressions on pull requests before they merge.
How It Works
The .github/workflows/benchmarks.yml workflow:
- On `main` pushes — runs both Criterion suites and uploads results to Bencher as the baseline. This establishes the expected performance for each benchmark.
- On pull requests — runs the same benchmarks and compares against the `main` baseline using a Student's t-test with a 99% upper confidence boundary. If any benchmark regresses beyond the threshold, the PR check fails.
Setup
To enable Bencher for your fork or deployment:
1. Create a Bencher account at bencher.dev and create a project.
2. Add the API token as a GitHub Actions secret:
   - Go to Settings → Secrets and variables → Actions
   - Add `BENCHER_API_TOKEN` with your Bencher API token
3. Update the project slug in `.github/workflows/benchmarks.yml` if your Bencher project name differs from `pg-trickle`.
The workflow gracefully degrades — if BENCHER_API_TOKEN is not set, benchmarks still run and upload artifacts but skip Bencher tracking.
Local Bencher-Format Output
To see what Bencher would receive from CI:
just bench-bencher
This runs both suites with --output-format bencher, producing JSON output compatible with bencher run.
Dashboard
Once configured, the Bencher dashboard shows:
- Historical trends for every benchmark across commits
- Statistical thresholds with configurable alerting
- PR annotations highlighting which benchmarks regressed and by how much
Troubleshooting
| Issue | Resolution |
|---|---|
| `docker: command not found` | Install Docker Desktop and ensure it is running |
| Container startup timeout | Increase Docker memory allocation (≥ 4 GB recommended) |
| `image not found` | Run `./tests/build_e2e_image.sh` to build the test image |
| Highly variable timings | Close other workloads; use `--test-threads=1` to avoid container contention |
| SLOW status on latency test | Expected in Docker; bare-metal should pass < 10 ms |
CDC Write-Side Overhead Benchmarks
The CDC write-overhead benchmark suite in tests/e2e_cdc_write_overhead_tests.rs measures the DML throughput cost of pg_trickle's CDC triggers on source tables. This quantifies the "write amplification factor" — how much slower DML becomes when a stream table is attached.
The core question this benchmark answers:
How much write throughput do you sacrifice by attaching a stream table to a source table?
Running CDC Write Overhead Benchmarks
# Full suite (all 5 scenarios)
cargo test --test e2e_cdc_write_overhead_tests --features pg18 -- --ignored --nocapture bench_cdc_write_overhead_full
# Individual scenarios
cargo test --test e2e_cdc_write_overhead_tests --features pg18 -- --ignored --nocapture bench_cdc_single_row_insert
cargo test --test e2e_cdc_write_overhead_tests --features pg18 -- --ignored --nocapture bench_cdc_bulk_insert
cargo test --test e2e_cdc_write_overhead_tests --features pg18 -- --ignored --nocapture bench_cdc_bulk_update
cargo test --test e2e_cdc_write_overhead_tests --features pg18 -- --ignored --nocapture bench_cdc_bulk_delete
cargo test --test e2e_cdc_write_overhead_tests --features pg18 -- --ignored --nocapture bench_cdc_concurrent_writers
Scenarios
| Scenario | Description | Rows per Cycle |
|---|---|---|
| Single-row INSERT | One INSERT statement per row, 1,000 rows total | 1,000 |
| Bulk INSERT | Single INSERT ... SELECT generate_series(...) | 10,000 |
| Bulk UPDATE | Single UPDATE ... WHERE id <= N | 10,000 |
| Bulk DELETE | Single DELETE ... WHERE id <= N | 10,000 |
| Concurrent writers | 4 parallel sessions each inserting 5,000 rows | 20,000 total |
Reading the Output
╔═══════════════════════════════════════════════════════════════════════════════════╗
║ pg_trickle CDC Write-Side Overhead Benchmark ║
╠═══════════════════════╤═══════════════╤═══════════════╤═════════════════════════╣
║ Scenario │ Baseline (ms) │ With CDC (ms) │ Write Amplification ║
╠═══════════════════════╪═══════════════╪═══════════════╪═════════════════════════╣
║ single-row INSERT │ 450.2 │ 890.5 │ 1.98× ║
║ bulk INSERT (10K) │ 35.1 │ 72.3 │ 2.06× ║
║ bulk UPDATE (10K) │ 48.7 │ 105.2 │ 2.16× ║
║ bulk DELETE (10K) │ 22.4 │ 51.8 │ 2.31× ║
║ concurrent (4×5K) │ 65.3 │ 142.1 │ 2.18× ║
╚═══════════════════════╧═══════════════╧═══════════════╧═════════════════════════╝
| Column | Meaning |
|---|---|
| Scenario | DML pattern being measured |
| Baseline | Average wall-clock time with no stream table (no CDC trigger) |
| With CDC | Average wall-clock time with an active stream table (CDC trigger fires) |
| Write Amplification | With CDC / Baseline — how many times slower the write path becomes |
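As a sanity check, the amplification factor is simply the two averages divided (numbers taken from the sample table above):

```python
def write_amplification(baseline_ms, cdc_ms):
    # With CDC / Baseline: how many times slower the write path becomes
    return cdc_ms / baseline_ms

assert round(write_amplification(450.2, 890.5), 2) == 1.98  # single-row INSERT
assert round(write_amplification(22.4, 51.8), 2) == 2.31    # bulk DELETE (10K)
```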
Machine-Readable Output
[CDC_BENCH] scenario=single-row_INSERT baseline_avg_ms=450.2 cdc_avg_ms=890.5 write_amplification=1.98
Interpreting Write Amplification
| Write Amplification | Interpretation |
|---|---|
| 1.0–1.5× | Minimal overhead — triggers add negligible cost. Typical for bulk DML with statement-level triggers. |
| 1.5–2.5× | Expected range for statement-level CDC triggers. Each DML statement incurs one additional INSERT into the change buffer. |
| 2.5–4.0× | Moderate overhead — acceptable for most workloads. Common with row-level triggers or single-row DML. |
| 4.0–10× | High overhead — consider pg_trickle.cdc_trigger_mode = 'statement' if using row-level triggers, or reduce DML frequency. |
| > 10× | Investigate — may indicate lock contention on the change buffer or pathological trigger interaction. |
Key Patterns to Look For
- Statement-level triggers vs row-level: Statement-level triggers (default since v0.11.0) should show significantly lower overhead for bulk DML compared to row-level triggers.
- Bulk DML advantage: Bulk INSERT/UPDATE/DELETE should show lower write amplification than single-row INSERT because the trigger fires once per statement, not once per row.
- Concurrent writer safety: The concurrent scenario should complete without deadlocks or errors, and the write amplification should be similar to the serial bulk INSERT case.
- DELETE overhead: DELETE triggers tend to be slightly more expensive than INSERT triggers because the trigger must capture the `OLD` row values.
CI Benchmark Workflows
All benchmark jobs run only on a weekly schedule and via workflow_dispatch — never on PR or push — so long-running tests don't block the merge gate.
e2e-benchmarks.yml — E2E Benchmark Tracking
Produces the numbers in README.md and this document. Each job posts a summary table to the GitHub Actions run page and uploads artifacts at 90-day retention. Manual dispatch accepts a job input (refresh | latency | cdc | tpch | all) to re-run a single job.
| Job | Test(s) | README Section | Timeout | just command |
|---|---|---|---|---|
| `bench-refresh` | `bench_full_matrix` | Differential vs Full Refresh | 60 min | `just test-bench-e2e-fast` |
| `bench-latency` | `bench_no_data_refresh_latency` | Zero-Change Latency | 20 min | `just test-bench-e2e-fast` |
| `bench-cdc` | `bench_cdc_trigger_overhead` | Write-Path Overhead | 30 min | `just test-bench-e2e-fast` |
| `bench-tpch` | `test_tpch_performance_comparison` | TPC-H per-query table | 30 min | `just bench-tpch-fast` |
ci.yml — Benchmark Jobs
Criterion micro-benchmarks and DAG topology benchmarks. Run on the daily schedule and workflow_dispatch.
| Job | Test Suite | What It Measures | Timeout | just command |
|---|---|---|---|---|
| `benchmarks` | `benches/refresh_bench.rs`, `benches/diff_operators.rs` | In-process Rust: query building, delta SQL generation (sub-µs) | 20 min | `just bench` |
| `dag-bench-calc` | `e2e_dag_bench_tests` (excl. `par*`) | DAG propagation latency + throughput, CALCULATED mode | 30 min | `just test-dag-bench-fast` |
| `dag-bench-parallel` | `e2e_dag_bench_tests` (`par*`) | DAG propagation with 4–8 parallel workers | 120 min | `just test-dag-bench-fast` |
benchmarks.yml — Bencher Integration (opt-in)
Disabled by default (no scheduled trigger). Re-enable by restoring push/pull_request triggers and adding a BENCHER_API_TOKEN secret. When active, it annotates PRs with regressions detected via Student’s t-test at a 99% upper confidence boundary.
| Job | Test Suite | What It Measures | Tracking |
|---|---|---|---|
| `benchmark` | `benches/refresh_bench.rs`, `benches/diff_operators.rs` | Same as ci.yml `benchmarks` job | Bencher (regression alert on PR) |
Artifact Retention Summary
| Workflow | Artifact | Retention |
|---|---|---|
| `e2e-benchmarks.yml` | `bench-{refresh,latency,cdc,tpch}-results` (stdout + JSON) | 90 days |
| `ci.yml` `benchmarks` | `benchmark-results` (Criterion HTML + JSON) | 7 days |
| `benchmarks.yml` | `criterion-results` (Criterion HTML + JSON) | 7 days |
What Happens When You INSERT a Row?
This tutorial traces the complete lifecycle of a single INSERT statement on a base table that is referenced by a stream table — from the moment the row is written to the moment the stream table reflects the change.
Setup: A Real-World Example
Suppose you run an e-commerce platform. You have an orders table and a stream table that maintains a running total per customer:
-- Base table
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer TEXT NOT NULL,
amount NUMERIC(10,2) NOT NULL
);
-- Stream table: always-fresh customer totals
SELECT pgtrickle.create_stream_table(
name => 'customer_totals',
query => $$
SELECT customer, SUM(amount) AS total, COUNT(*) AS order_count
FROM orders GROUP BY customer
$$,
schedule => '1m' -- refresh when data is staler than 1 minute
-- refresh_mode defaults to 'AUTO' (differential with full-refresh fallback)
);
After creation, customer_totals is a real PostgreSQL table:
SELECT * FROM customer_totals;
-- (empty — no orders yet)
Phase 1: The INSERT
A new order arrives:
INSERT INTO orders (customer, amount) VALUES ('alice', 49.99);
What happens inside PostgreSQL
When create_stream_table() was called, pg_trickle installed an AFTER INSERT OR UPDATE OR DELETE trigger on the orders table. This trigger fires automatically — the user's INSERT statement triggers it transparently.
The trigger function (pgtrickle_changes.pg_trickle_cdc_fn_<oid>()) executes inside the same transaction as the INSERT and writes a single row into the change buffer table:
pgtrickle_changes.changes_16384 (where 16384 = orders table OID)
┌───────────┬─────────────┬────────┬─────────┬──────────┬──────────┬────────────┐
│ change_id │ lsn │ action │ pk_hash │ new_id │ new_cust │ new_amount │
├───────────┼─────────────┼────────┼─────────┼──────────┼──────────┼────────────┤
│ 1 │ 0/1A3F2B80 │ I │ -837291 │ 1 │ alice │ 49.99 │
└───────────┴─────────────┴────────┴─────────┴──────────┴──────────┴────────────┘
Key details:
- `lsn`: The current WAL Log Sequence Number (`pg_current_wal_lsn()`), used to bound which changes belong to which refresh cycle.
- `action`: `'I'` for INSERT, `'U'` for UPDATE, `'D'` for DELETE.
- `pk_hash`: A pre-computed hash of the primary key (`orders.id`), used later for efficient row matching.
- `new_*` columns: The actual column values from `NEW`, stored as native PostgreSQL types (not JSONB). There are no `old_*` values for INSERTs.
Beyond this single INSERT into the buffer table, the trigger adds no overhead to the user's transaction commit: no JSONB serialization, no logical replication slot, and no external process.
Phase 2: The Scheduler Wakes Up
A background worker called the scheduler runs inside PostgreSQL (registered via shared_preload_libraries). It wakes up every pg_trickle.scheduler_interval_ms milliseconds (default: 1000ms) and performs a tick:
- Rebuild the DAG (if any stream tables were created/dropped since last tick) — a dependency graph of all stream tables and their source tables.
- Topological sort — determine the refresh order so that stream tables depending on other stream tables are refreshed after their dependencies.
- For each stream table, check: has its staleness exceeded its schedule?
For customer_totals with a '1m' schedule, the scheduler compares:
now()minusdata_timestamp(the freshness watermark from the last refresh)- Against the schedule: 60 seconds
If more than 60 seconds have elapsed and the stream table isn't already being refreshed, the scheduler begins a refresh.
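The tick's ordering and staleness logic can be sketched as follows. Names like `daily_rollup` are hypothetical, and the real scheduler works on the catalog's dependency graph rather than Python dicts:

```python
from collections import deque

def topo_order(deps):
    """Kahn's algorithm: deps maps each stream table to its upstream STs.
    Returns a refresh order where every ST follows its dependencies."""
    indeg = {n: len(d) for n, d in deps.items()}
    dependents = {n: [] for n in deps}
    for n, d in deps.items():
        for up in d:
            dependents[up].append(n)
    queue = deque(n for n, k in indeg.items() if k == 0)
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for m in dependents[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                queue.append(m)
    return order

def needs_refresh(now_s, data_timestamp_s, schedule_s):
    # Refresh only when staleness exceeds the schedule
    return now_s - data_timestamp_s > schedule_s

# daily_rollup depends on customer_totals, so it refreshes second
assert topo_order({"daily_rollup": {"customer_totals"},
                   "customer_totals": set()}) == ["customer_totals", "daily_rollup"]
assert needs_refresh(now_s=120.0, data_timestamp_s=50.0, schedule_s=60.0)
assert not needs_refresh(now_s=100.0, data_timestamp_s=50.0, schedule_s=60.0)
```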
Phase 3: Frontier Advancement
Before executing the refresh, the scheduler creates a new frontier — a snapshot of how far to read changes from each source table:
Previous frontier: { orders(16384): lsn = 0/1A3F2A00 }
New frontier: { orders(16384): lsn = 0/1A3F2C00 }
The frontier is a DBSP-inspired version vector. Each source table has its own LSN cursor. The refresh will process all changes in the buffer table where lsn > previous_frontier_lsn AND lsn <= new_frontier_lsn.
This means:
- Changes committed before the previous refresh are already reflected.
- Changes committed after the new frontier will be picked up in the next cycle.
- The INSERT we made (`lsn = 0/1A3F2B80`) falls within this window.
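The window arithmetic can be sketched by treating each `pg_lsn` as a 64-bit integer built from its hi/lo hex halves:

```python
def parse_lsn(lsn):
    """Parse a textual pg_lsn ('hi/lo' in hex) into a 64-bit integer."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

prev_frontier = parse_lsn("0/1A3F2A00")
new_frontier = parse_lsn("0/1A3F2C00")
insert_lsn = parse_lsn("0/1A3F2B80")

# Half-open window: lsn > previous frontier AND lsn <= new frontier
assert prev_frontier < insert_lsn <= new_frontier
```

The half-open bounds guarantee every change lands in exactly one refresh cycle — nothing is processed twice, nothing is skipped.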
Phase 4: Change Detection — Is There Anything to Do?
Before running the full delta query, the scheduler runs a short-circuit check: does the change buffer actually have any rows in the LSN window?
SELECT count(*)::bigint FROM (
SELECT 1 FROM pgtrickle_changes.changes_16384
WHERE lsn > '0/1A3F2A00'::pg_lsn
AND lsn <= '0/1A3F2C00'::pg_lsn
LIMIT <threshold>
) __pgt_capped
This query also checks the adaptive threshold: if the number of changes exceeds a percentage of the source table size (default: 10%), the scheduler falls back to a FULL refresh instead of DIFFERENTIAL, because applying thousands of individual deltas would be slower than a bulk reload.
For our single INSERT, the count is 1 — well below the threshold. The scheduler proceeds with a DIFFERENTIAL refresh.
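The decision amounts to the logic below (the function name and the `SKIP` outcome are illustrative, not the extension's API; the 10% default mirrors the documented threshold):

```python
def choose_refresh_action(change_count, source_row_count, threshold_pct=10.0):
    """Pick the refresh strategy for one cycle."""
    if change_count == 0:
        return "SKIP"          # nothing in the LSN window: no work this cycle
    if change_count > source_row_count * threshold_pct / 100.0:
        return "FULL"          # bulk reload beats applying huge deltas row by row
    return "DIFFERENTIAL"      # small delta: incremental maintenance wins

assert choose_refresh_action(0, 1_000_000) == "SKIP"
assert choose_refresh_action(1, 1_000_000) == "DIFFERENTIAL"
assert choose_refresh_action(200_000, 1_000_000) == "FULL"
```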
Phase 5: Delta Query Generation (DVM Engine)
This is where the Differential View Maintenance (DVM) engine does its work. The defining query:
SELECT customer, SUM(amount) AS total, COUNT(*) AS order_count
FROM orders GROUP BY customer
is parsed into an operator tree:
Aggregate(GROUP BY customer, SUM(amount), COUNT(*))
└── Scan(orders)
The DVM engine differentiates each operator — converting it from "compute the full result" to "compute only what changed":
Step 1: Differentiate the Scan
The Scan(orders) operator becomes a read from the change buffer:
-- Reads only changes in the LSN window, splitting UPDATEs into DELETE+INSERT
WITH __pgt_raw AS (
SELECT c.pk_hash, c.action,
c."new_customer", c."old_customer",
c."new_amount", c."old_amount"
FROM pgtrickle_changes.changes_16384 c
WHERE c.lsn > '0/1A3F2A00'::pg_lsn
AND c.lsn <= '0/1A3F2C00'::pg_lsn
)
-- INSERT rows: take new_* values
SELECT pk_hash AS __pgt_row_id, 'I' AS __pgt_action,
"new_customer" AS customer, "new_amount" AS amount
FROM __pgt_raw WHERE action IN ('I', 'U')
UNION ALL
-- DELETE rows: take old_* values
SELECT pk_hash AS __pgt_row_id, 'D' AS __pgt_action,
"old_customer" AS customer, "old_amount" AS amount
FROM __pgt_raw WHERE action IN ('D', 'U')
For our single INSERT, this produces:
__pgt_row_id | __pgt_action | customer | amount
-------------|--------------|----------|-------
-837291 | I | alice | 49.99
Step 2: Differentiate the Aggregate
The Aggregate differentiation is the heart of incremental maintenance. Instead of re-computing SUM(amount) over the entire orders table, it computes:
-- Delta for SUM: add new values, subtract deleted values
SELECT customer,
SUM(CASE WHEN __pgt_action = 'I' THEN amount
WHEN __pgt_action = 'D' THEN -amount END) AS total,
SUM(CASE WHEN __pgt_action = 'I' THEN 1
WHEN __pgt_action = 'D' THEN -1 END) AS order_count,
pgtrickle.pg_trickle_hash(customer::text) AS __pgt_row_id,
'I' AS __pgt_action
FROM <scan_delta>
GROUP BY customer
For our INSERT of ('alice', 49.99), this yields:
customer | total | order_count | __pgt_row_id | __pgt_action
---------|--------|-------------|--------------|-------------
alice | +49.99 | +1 | 7283194 | I
The stream table uses reference counting: it tracks __pgt_count (how many source rows contribute to each group). When __pgt_count reaches 0, the group row is deleted.
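The reference-counting behavior can be sketched like this — `refs` plays the role of `__pgt_count`; the dict layout is illustrative, not the storage format:

```python
def apply_deltas(groups, deltas):
    """groups: group key -> {'total': running sum, 'refs': contributing rows}.
    A group's row disappears when its refcount drops to zero."""
    for key, d_total, d_refs in deltas:
        g = groups.setdefault(key, {"total": 0.0, "refs": 0})
        g["total"] += d_total
        g["refs"] += d_refs
        if g["refs"] == 0:
            del groups[key]    # last contributing source row is gone
    return groups

groups = apply_deltas({}, [("alice", 49.99, +1)])   # INSERT creates the group
assert groups["alice"]["refs"] == 1
groups = apply_deltas(groups, [("alice", -49.99, -1)])  # DELETE retracts it
assert "alice" not in groups
```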
Phase 6: MERGE Into the Stream Table
The delta is applied to the customer_totals storage table using a single SQL MERGE statement:
MERGE INTO public.customer_totals AS st
USING (<delta_query>) AS d
ON st.__pgt_row_id = d.__pgt_row_id
WHEN MATCHED AND d.__pgt_action = 'D' THEN DELETE
WHEN MATCHED AND d.__pgt_action = 'I' THEN
UPDATE SET customer = d.customer, total = d.total, order_count = d.order_count
WHEN NOT MATCHED AND d.__pgt_action = 'I' THEN
INSERT (__pgt_row_id, customer, total, order_count)
VALUES (d.__pgt_row_id, d.customer, d.total, d.order_count)
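The three MERGE arms behave like this dictionary-based sketch (plain Python standing in for SQL; `st` is keyed by `__pgt_row_id`):

```python
def merge_apply(st, delta_rows):
    """Mimic the MERGE arms above. Each delta row is (row_id, action, row)."""
    for row_id, action, row in delta_rows:
        matched = row_id in st
        if matched and action == "D":
            del st[row_id]            # WHEN MATCHED AND 'D' THEN DELETE
        elif matched and action == "I":
            st[row_id].update(row)    # WHEN MATCHED AND 'I' THEN UPDATE
        elif not matched and action == "I":
            st[row_id] = dict(row)    # WHEN NOT MATCHED AND 'I' THEN INSERT
    return st

st = merge_apply({}, [(7283194, "I",
                       {"customer": "alice", "total": 49.99, "order_count": 1})])
assert st[7283194]["total"] == 49.99  # alice's group created via NOT MATCHED
```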
Since alice didn't exist before, this is a NOT MATCHED → INSERT. The stream table now contains:
SELECT * FROM customer_totals;
customer | total | order_count
----------|-------|------------
alice | 49.99 | 1
Phase 7: Cleanup and Bookkeeping
After the MERGE succeeds:
- Consumed changes are deleted from the buffer table:
  `DELETE FROM pgtrickle_changes.changes_16384 WHERE lsn > '0/1A3F2A00'::pg_lsn AND lsn <= '0/1A3F2C00'::pg_lsn`
- The frontier is saved to the catalog as JSONB, so the next refresh knows where to start.
- The refresh is recorded in `pgtrickle.pgt_refresh_history`:
  refresh_id | pgt_id | action       | rows_inserted | rows_deleted | delta_row_count | status    | initiated_by
  1          | 1      | DIFFERENTIAL | 1             | 0            | 1               | COMPLETED | SCHEDULER
  The `delta_row_count` column (new in v0.2.0) records the total number of change buffer rows consumed during this refresh cycle.
- The data timestamp on the stream table is advanced, resetting the staleness clock.
- The MERGE template is cached in thread-local storage. The next refresh for this stream table skips SQL parsing, operator tree construction, and differentiation — it only substitutes LSN values into the cached template. This saves ~45ms per refresh cycle.
What About UPDATE and DELETE?
UPDATE
UPDATE orders SET amount = 59.99 WHERE id = 1;
The trigger writes a single row with action = 'U', capturing both OLD and NEW values:
action | new_amount | old_amount | new_customer | old_customer
-------|------------|------------|--------------|-------------
U | 59.99 | 49.99 | alice | alice
The scan differentiation splits this into:
- DELETE old: `(alice, 49.99)` with action `'D'`
- INSERT new: `(alice, 59.99)` with action `'I'`
The aggregate differentiation computes: +59.99 - 49.99 = +10.00 for alice's total. The MERGE updates the existing row.
DELETE
DELETE FROM orders WHERE id = 1;
The trigger writes action = 'D' with the OLD values. The aggregate differentiation computes -49.99 for the total and -1 for the count. If the __pgt_count reaches 0 (no more orders for alice), the MERGE deletes alice's row from the stream table entirely.
Performance: Why This Is Fast
| Step | What it avoids |
|---|---|
| Trigger-based CDC | No logical replication slot, no WAL parsing, no external process |
| Typed columns | No JSONB serialization in the trigger, no jsonb_populate_record in the delta query |
| Pre-computed pk_hash | No per-row hash computation during the delta query |
| LSN-bounded reads | Index scan on the change buffer, not a full table scan |
| Algebraic differentiation | Processes only changed rows — O(changes) not O(table size) |
| MERGE statement | Single SQL round-trip for all inserts, updates, and deletes |
| Cached templates | After the first refresh, delta SQL generation is skipped entirely |
| Adaptive fallback | Automatically switches to FULL refresh when changes exceed a threshold |
For a table with 10 million rows and 100 changed rows, a DIFFERENTIAL refresh processes only those 100 rows. A FULL refresh would need to scan all 10 million.
What About IMMEDIATE Mode?
Everything described above applies to the default AUTO mode — changes accumulate in a buffer and are applied on a schedule using differential (delta-only) maintenance. As of v0.2.0, pg_trickle also supports IMMEDIATE mode, which takes a fundamentally different path.
With IMMEDIATE mode, there are no change buffers, no scheduler, and no waiting:
SELECT pgtrickle.create_stream_table(
name => 'customer_totals_live',
query => $$
SELECT customer, SUM(amount) AS total, COUNT(*) AS order_count
FROM orders GROUP BY customer
$$,
refresh_mode => 'IMMEDIATE'
);
How IMMEDIATE Mode Differs for INSERT
| Phase | DIFFERENTIAL | IMMEDIATE |
|---|---|---|
| Trigger type | Row-level AFTER trigger | Statement-level AFTER trigger with REFERENCING NEW TABLE |
| What's captured | One buffer row per INSERT | A transition table containing all inserted rows |
| When delta runs | Next scheduler tick (up to schedule bound) | Immediately, in the same transaction |
| Delta source | Change buffer table (pgtrickle_changes.*) | Temp table copied from transition table |
| Concurrency | No locking between writers | Advisory lock per stream table |
When you run INSERT INTO orders ...:
- A BEFORE INSERT statement-level trigger acquires an advisory lock on the stream table
- The AFTER INSERT trigger captures the transition table (`NEW TABLE AS __pgt_newtable`) into a temp table
- The DVM engine generates the same delta query, but reads from the temp table instead of the change buffer
- The delta is applied to the stream table via INSERT/DELETE DML (not MERGE)
- The stream table is immediately up-to-date — within the same transaction
BEGIN;
INSERT INTO orders (customer, amount) VALUES ('alice', 49.99);
-- customer_totals_live already shows alice with total=49.99 here!
SELECT * FROM customer_totals_live;
COMMIT;
The delta SQL template is cached per (pgt_id, source_oid, has_new, has_old) combination, so subsequent trigger invocations skip query parsing entirely.
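The caching behaves like a dictionary keyed by that tuple — a sketch of the idea, not the extension's internals:

```python
_template_cache = {}

def get_delta_template(pgt_id, source_oid, has_new, has_old, build):
    """Build the delta SQL once per key; later calls reuse the cached text."""
    key = (pgt_id, source_oid, has_new, has_old)
    if key not in _template_cache:
        _template_cache[key] = build()   # expensive parse + differentiate
    return _template_cache[key]

calls = []
build = lambda: calls.append(1) or "SELECT ... FROM __pgt_newtable"
first = get_delta_template(1, 16384, True, False, build)
second = get_delta_template(1, 16384, True, False, build)
assert first == second and len(calls) == 1   # built exactly once
```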
Next in This Series
- What Happens When You UPDATE a Row? — D+I split, group key changes, net-effect for multiple UPDATEs
- What Happens When You DELETE a Row? — Reference counting, group deletion, INSERT+DELETE cancellation
- What Happens When You TRUNCATE a Table? — Why TRUNCATE bypasses triggers and how to recover
What Happens When You UPDATE a Row?
This tutorial traces what happens when an UPDATE statement hits a base table that is referenced by a stream table. It covers the trigger capture, the scan-level decomposition into DELETE + INSERT, and how each DVM operator propagates the change — including cases where the group key changes, where JOINs are involved, and where multiple UPDATEs happen within a single refresh window.
Prerequisite: Read WHAT_HAPPENS_ON_INSERT.md first — it introduces the full 7-phase lifecycle. This tutorial focuses on how UPDATE differs.
Setup
Same e-commerce example:
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer TEXT NOT NULL,
amount NUMERIC(10,2) NOT NULL
);
SELECT pgtrickle.create_stream_table(
name => 'customer_totals',
query => $$
SELECT customer, SUM(amount) AS total, COUNT(*) AS order_count
FROM orders GROUP BY customer
$$,
schedule => '1m'
);
-- Seed some data
INSERT INTO orders (customer, amount) VALUES
('alice', 49.99),
('alice', 30.00),
('bob', 75.00);
After the first refresh, the stream table contains:
customer | total | order_count
---------|-------|------------
alice | 79.99 | 2
bob | 75.00 | 1
Case 1: Simple Value UPDATE (Same Group Key)
UPDATE orders SET amount = 59.99 WHERE id = 1;
Alice's first order changes from 49.99 to 59.99. The customer (group key) stays the same.
Phase 1: Trigger Capture
The AFTER UPDATE trigger fires and writes one row to the change buffer with both OLD and NEW values:
pgtrickle_changes.changes_16384
┌───────────┬─────────────┬────────┬──────────┬──────────┬────────────┬──────────┬────────────┐
│ change_id │ lsn │ action │ new_cust │ new_amt │ old_cust │ old_amt │ pk_hash │
├───────────┼─────────────┼────────┼──────────┼──────────┼────────────┼──────────┼────────────┤
│ 4 │ 0/1A3F3000 │ U │ alice │ 59.99 │ alice │ 49.99 │ -837291 │
└───────────┴─────────────┴────────┴──────────┴──────────┴────────────┴──────────┴────────────┘
Key difference from INSERT: the trigger writes both new_* and old_* columns. The pk_hash is computed from NEW.id.
Phase 2–4: Scheduler, Frontier, Change Detection
Identical to the INSERT flow. The scheduler detects one change row in the LSN window.
Phase 5: Scan Differentiation — The U → D+I Split
This is where UPDATE handling diverges fundamentally. The scan delta operator decomposes the UPDATE into two events:
__pgt_row_id | __pgt_action | customer | amount
-------------|--------------|----------|-------
-837291 | D | alice | 49.99 ← old values (DELETE)
-837291 | I | alice | 59.99 ← new values (INSERT)
Why split into D+I? This is a core IVM principle. Downstream operators (aggregates, joins, filters) don't have special "update" logic — they only understand insertions and deletions. By decomposing the UPDATE:
- The DELETE event subtracts the old values from running aggregates
- The INSERT event adds the new values
This algebraic approach handles arbitrary operator trees without operator-specific update logic.
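The update-as-D+I algebra can be sketched in a few lines. This is an illustrative Python model of the principle, not pg_trickle's internal code; the dict of per-group sums stands in for the stream table's aggregate state:

```python
# Illustrative model (not pg_trickle internals): an UPDATE decomposed into
# DELETE(old values) + INSERT(new values), applied to per-group aggregates.

groups = {"alice": {"total": 79.99, "count": 2},
          "bob":   {"total": 75.00, "count": 1}}

def apply_event(groups, action, customer, amount):
    g = groups.setdefault(customer, {"total": 0.0, "count": 0})
    sign = 1 if action == "I" else -1      # 'I' adds, 'D' subtracts
    g["total"] = round(g["total"] + sign * amount, 2)
    g["count"] += sign

# UPDATE orders SET amount = 59.99 WHERE id = 1  ->  D(49.99) + I(59.99)
apply_event(groups, "D", "alice", 49.99)   # subtract the old value
apply_event(groups, "I", "alice", 59.99)   # add the new value

print(groups["alice"])   # {'total': 89.99, 'count': 2}
```

The same two calls handle a group-key change with no extra logic: the D lands in the old group and the I in the new one.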
Phase 5 (continued): Aggregate Differentiation
The aggregate operator processes both events against the alice group:
-- DELETE event: subtract old values
alice: total += CASE WHEN action='D' THEN -49.99 END → -49.99
alice: count += CASE WHEN action='D' THEN -1 END → -1
-- INSERT event: add new values
alice: total += CASE WHEN action='I' THEN +59.99 END → +59.99
alice: count += CASE WHEN action='I' THEN +1 END → +1
Net effect on alice's group:
total delta: -49.99 + 59.99 = +10.00
count delta: -1 + 1 = 0
The aggregate emits this as an INSERT (because the group still exists and its value changed):
customer | total | order_count | __pgt_row_id | __pgt_action
---------|--------|-------------|--------------|-------------
alice | +10.00 | 0 | 7283194 | I
Phase 6: MERGE
The MERGE matches alice's existing row and updates it. Note that the MERGE itself doesn't add the delta to the stored row; the aggregate delta query has already computed the new absolute value by combining the stored state with the delta:
-- MERGE WHEN MATCHED AND action = 'I' THEN UPDATE
COALESCE(existing.total, 0) + delta.total → 79.99 + 10.00 = 89.99
COALESCE(existing.__pgt_count, 0) + delta.__pgt_count → 2 + 0 = 2
Result:
SELECT * FROM customer_totals;
customer | total | order_count
----------|-------|------------
alice | 89.99 | 2 ← was 79.99
bob | 75.00 | 1
Case 2: Group Key Change (Customer Reassignment)
UPDATE orders SET customer = 'bob' WHERE id = 2;
Alice's second order (amount=30.00) is reassigned to Bob. The group key itself changes.
Trigger Capture
change_id | lsn | action | new_cust | new_amt | old_cust | old_amt | pk_hash
5 | 0/1A3F3100 | U | bob | 30.00 | alice | 30.00 | 4521038
The old and new customer values differ.
Scan Delta: D+I Split
__pgt_row_id | __pgt_action | customer | amount
-------------|--------------|----------|-------
4521038 | D | alice | 30.00 ← removes from alice's group
4521038 | I | bob | 30.00 ← adds to bob's group
Aggregate Delta
The aggregate groups by customer, so the DELETE and INSERT land in different groups:
Group "alice":
total delta: -30.00
count delta: -1
Group "bob":
total delta: +30.00
count delta: +1
After MERGE
SELECT * FROM customer_totals;
customer | total | order_count
----------|--------|------------
alice | 59.99 | 1 ← lost one order (-30.00)
bob | 105.00 | 2 ← gained one order (+30.00)
This is why the D+I decomposition is essential. Without it, you'd need special "move between groups" logic. With it, the standard aggregate differentiation handles group key changes naturally.
Case 3: UPDATE That Deletes a Group
-- Alice only has one order left. Reassign it to bob.
UPDATE orders SET customer = 'bob' WHERE id = 1;
Aggregate Delta
Group "alice":
total delta: -59.99
count delta: -1
new __pgt_count: 1 - 1 = 0 → group vanishes!
Group "bob":
total delta: +59.99
count delta: +1
When __pgt_count reaches 0, the aggregate emits a DELETE for alice's group:
customer | total | __pgt_row_id | __pgt_action
---------|-------|--------------|-------------
alice | — | 7283194 | D ← group removed
bob | ... | 9182734 | I ← group updated
The MERGE deletes alice's row entirely:
SELECT * FROM customer_totals;
customer | total | order_count
----------|--------|------------
bob | 164.99 | 3
Case 4: Multiple UPDATEs on the Same Row (Within One Refresh Window)
What if a row is updated multiple times before the next refresh?
UPDATE orders SET amount = 10.00 WHERE id = 3; -- bob: 75 → 10
UPDATE orders SET amount = 20.00 WHERE id = 3; -- bob: 10 → 20
UPDATE orders SET amount = 30.00 WHERE id = 3; -- bob: 20 → 30
The change buffer now has 3 rows for pk_hash of order #3:
change_id | action | old_amt | new_amt
6 | U | 75.00 | 10.00
7 | U | 10.00 | 20.00
8 | U | 20.00 | 30.00
Net-Effect Computation
The scan delta uses a split fast-path design. Since order #3 has multiple changes (cnt > 1), it takes the multi-change path with window functions:
FIRST_VALUE(action) OVER (PARTITION BY pk_hash ORDER BY change_id) → 'U'
LAST_VALUE(action) OVER (...) → 'U'
Both first and last actions are 'U', so:
- DELETE: emits using old values from the earliest change (change_id=6): old_amt = 75.00
- INSERT: emits using new values from the latest change (change_id=8): new_amt = 30.00
Net delta:
__pgt_row_id | __pgt_action | amount
-------------|--------------|-------
pk_hash_3 | D | 75.00 ← original value before all changes
pk_hash_3 | I | 30.00 ← final value after all changes
The aggregate sees -75.00 + 30.00 = -45.00. This is correct regardless of the intermediate values. The intermediate rows (10.00, 20.00) are never seen.
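Under the assumption that changes arrive ordered by change_id, the net-effect rule can be modeled as a small function over one PK's change list (the names here are illustrative, not pg_trickle's internals):

```python
# Net effect of all changes to one primary key within a refresh window.
# Each change is (action, old_value, new_value), ordered by change_id.

def net_effect(changes):
    events = []
    first_action = changes[0][0]
    last_action = changes[-1][0]
    if first_action != "I":                    # row existed before the window
        events.append(("D", changes[0][1]))    # old values of earliest change
    if last_action != "D":                     # row exists after the window
        events.append(("I", changes[-1][2]))   # new values of latest change
    return events

# Three UPDATEs on order #3: 75 -> 10 -> 20 -> 30
print(net_effect([("U", 75.00, 10.00),
                  ("U", 10.00, 20.00),
                  ("U", 20.00, 30.00)]))       # [('D', 75.0), ('I', 30.0)]
```

The same two conditions cover the mixed histories: an I-first history emits no DELETE, a D-last history emits no INSERT, and I followed by D emits nothing at all.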
Case 5: INSERT + UPDATE in Same Window
INSERT INTO orders (customer, amount) VALUES ('charlie', 100.00);
UPDATE orders SET amount = 200.00 WHERE customer = 'charlie';
Both happen before the next refresh. The buffer has:
change_id | action | old_amt | new_amt
9 | I | NULL | 100.00
10 | U | 100.00 | 200.00
Net-effect analysis:
- first_action = 'I' (row didn't exist before this window)
- last_action = 'U' (row exists after)
Result:
- No DELETE emitted (first_action = 'I' means the row was born in this window)
- INSERT with final values:
(charlie, 200.00)
The aggregate sees a pure insertion of (charlie, 200.00) — the intermediate value of 100.00 never appears.
Case 6: UPDATE + DELETE in Same Window
UPDATE orders SET amount = 999.99 WHERE id = 3;
DELETE FROM orders WHERE id = 3;
Net-effect:
- first_action = 'U' (row existed before)
- last_action = 'D' (row no longer exists)
Result:
- DELETE with original old values from the first change
- No INSERT (last_action = 'D')
The aggregate correctly sees only a removal.
Case 7: UPDATE with JOINs
Consider a stream table that joins two tables:
CREATE TABLE customers (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
tier TEXT NOT NULL DEFAULT 'standard'
);
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_id INT REFERENCES customers(id),
amount NUMERIC(10,2)
);
SELECT pgtrickle.create_stream_table(
name => 'order_details',
query => $$
SELECT c.name, c.tier, o.amount
FROM orders o
JOIN customers c ON o.customer_id = c.id
$$,
schedule => '1m'
);
Now update a customer's tier:
UPDATE customers SET tier = 'premium' WHERE name = 'alice';
How the JOIN Delta Works
The join differentiation follows the formula:
$$\Delta(L \bowtie R) = (\Delta L \bowtie R) \cup (L \bowtie \Delta R) - (\Delta L \bowtie \Delta R)$$
Since only the customers table changed:
- $\Delta L$ = changes to orders (empty)
- $\Delta R$ = changes to customers (alice's tier: standard → premium)
So:
- Part 1: $\Delta\text{orders} \bowtie \text{customers}$ = empty (no order changes)
- Part 2: $\text{orders} \bowtie \Delta\text{customers}$ = all of alice's orders joined with her tier change
- Part 3: $\Delta\text{orders} \bowtie \Delta\text{customers}$ = empty (no order changes)
Part 2 produces the delta: for each of alice's orders, DELETE the old row (with tier='standard') and INSERT a new row (with tier='premium').
The stream table is updated to reflect the new tier across all of alice's order rows.
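The bilinear expansion can be checked numerically on signed multisets. A minimal Python sketch, assuming L and R denote the post-update table states (the assumption that makes the minus sign on the third term correct):

```python
from collections import Counter

# Signed multisets as plain dicts {row: count}; deltas carry negative counts.
def madd(*ms):                                  # multiset sum
    out = Counter()
    for m in ms:
        for row, c in m.items():
            out[row] += c
    return {row: c for row, c in out.items() if c != 0}

def neg(m):
    return {row: -c for row, c in m.items()}

def join(a, b):                                 # join on the first field
    out = Counter()
    for (ka, va), ca in a.items():
        for (kb, vb), cb in b.items():
            if ka == kb:
                out[(ka, va, vb)] += ca * cb
    return {row: c for row, c in out.items() if c != 0}

# orders(customer_id, amount) and customers(id, tier) before the UPDATE
L_old = {(1, 50.00): 1, (1, 30.00): 1}
R_old = {(1, "standard"): 1}

# UPDATE customers SET tier = 'premium' WHERE id = 1  ->  D + I on customers
dL, dR = {}, {(1, "standard"): -1, (1, "premium"): 1}
L_new, R_new = madd(L_old, dL), madd(R_old, dR)

# delta(L join R) = (dL join R) + (L join dR) - (dL join dR)
delta = madd(join(dL, R_new), join(L_new, dR), neg(join(dL, dR)))

# Must equal the brute-force difference between the full joins
assert delta == madd(join(L_new, R_new), neg(join(L_old, R_old)))
```

Here delta contains a -1 (DELETE) for each of alice's old tier='standard' rows and a +1 (INSERT) for each new tier='premium' row, exactly the Part 2 result described above.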
Performance Summary
| Scenario | Buffer rows | Delta rows emitted | Work |
|---|---|---|---|
| Simple value change | 1 | 2 (D+I) | O(1) per group |
| Group key change | 1 | 2 (D+I, different groups) | O(1) per affected group |
| Group deletion (key change) | 1 | 2 (D vanished group, I gaining group) | O(1) |
| N updates same row | N | 2 (D first-old + I last-new) | O(N) scan, O(1) aggregate |
| INSERT+UPDATE same window | 2 | 1 (I only) | O(1) |
| UPDATE+DELETE same window | 2 | 1 (D only) | O(1) |
In all cases, the work is proportional to the number of changed rows, not the total table size. A single UPDATE on a billion-row table produces the same delta cost as on a 10-row table.
What About IMMEDIATE Mode?
Everything above describes DIFFERENTIAL mode — changes accumulate in a buffer and are applied on a schedule. As of v0.2.0, pg_trickle also supports IMMEDIATE mode, where the stream table is updated synchronously within the same transaction as your UPDATE.
How IMMEDIATE Mode Differs for UPDATE
| Phase | DIFFERENTIAL | IMMEDIATE |
|---|---|---|
| Trigger type | Row-level AFTER trigger | Statement-level AFTER trigger with REFERENCING OLD TABLE, NEW TABLE |
| What's captured | One buffer row with old_* and new_* | Two transition tables: __pgt_oldtable and __pgt_newtable |
| When delta runs | Next scheduler tick | Immediately, in the same transaction |
| D+I decomposition | In the scan delta CTE | Same algebra, but reading from transition temp tables |
| Concurrency | No locking between writers | Advisory lock per stream table |
When you run UPDATE orders SET amount = 59.99 WHERE id = 1:
- A BEFORE UPDATE trigger acquires an advisory lock on the stream table
- The AFTER UPDATE trigger captures both OLD TABLE AS __pgt_oldtable and NEW TABLE AS __pgt_newtable into temp tables
- The DVM engine generates the same D+I decomposition, reading old values from the old-table and new values from the new-table
- The delta is applied to the stream table immediately
- Any query within the same transaction sees the updated stream table
BEGIN;
UPDATE orders SET amount = 59.99 WHERE id = 1;
-- customer_totals already reflects the new amount here!
SELECT * FROM customer_totals WHERE customer = 'alice';
COMMIT;
The same D+I split, aggregate differentiation, and net-effect logic applies — the only difference is the data source (transition tables vs change buffer) and timing (synchronous vs scheduled).
Next in This Series
- What Happens When You INSERT a Row? — The full 7-phase lifecycle (start here if you haven't already)
- What Happens When You DELETE a Row? — Reference counting, group deletion, INSERT+DELETE cancellation
- What Happens When You TRUNCATE a Table? — Why TRUNCATE bypasses triggers and how to recover
What Happens When You DELETE a Row?
This tutorial traces what happens when a DELETE statement hits a base table that is referenced by a stream table. It covers the trigger capture, how the scan delta emits a single DELETE event, and how each DVM operator propagates the removal — including group deletion, partial group reduction, JOINs, cascading deletes within a single refresh window, and the important edge case where a DELETE cancels a prior INSERT.
Prerequisite: Read WHAT_HAPPENS_ON_INSERT.md first — it introduces the full 7-phase lifecycle (trigger → scheduler → frontier → change detection → DVM delta → MERGE → cleanup). This tutorial focuses on how DELETE differs.
Setup
Same e-commerce example used throughout the series:
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer TEXT NOT NULL,
amount NUMERIC(10,2) NOT NULL
);
SELECT pgtrickle.create_stream_table(
name => 'customer_totals',
query => $$
SELECT customer, SUM(amount) AS total, COUNT(*) AS order_count
FROM orders GROUP BY customer
$$,
schedule => '1m'
);
-- Seed some data
INSERT INTO orders (customer, amount) VALUES
('alice', 50.00),
('alice', 30.00),
('bob', 75.00),
('bob', 25.00);
After the first refresh, the stream table contains:
customer | total | order_count
---------|--------|------------
alice | 80.00 | 2
bob | 100.00 | 2
Case 1: Delete One Row (Group Survives)
DELETE FROM orders WHERE id = 2; -- alice's 30.00 order
Alice still has one remaining order (id=1, amount=50.00). The group shrinks but doesn't vanish.
Phase 1: Trigger Capture
The AFTER DELETE trigger fires and writes one row to the change buffer with only OLD values:
pgtrickle_changes.changes_16384
┌───────────┬─────────────┬────────┬──────────┬──────────┬────────────┬──────────┬────────────┐
│ change_id │ lsn │ action │ new_cust │ new_amt │ old_cust │ old_amt │ pk_hash │
├───────────┼─────────────┼────────┼──────────┼──────────┼────────────┼──────────┼────────────┤
│ 5 │ 0/1A3F3000 │ D │ NULL │ NULL │ alice │ 30.00 │ 4521038 │
└───────────┴─────────────┴────────┴──────────┴──────────┴────────────┴──────────┴────────────┘
Key difference from INSERT and UPDATE:
- new_* columns are all NULL — the row no longer exists, so there are no NEW values
- old_* columns contain the deleted row's data — this is what gets subtracted
- pk_hash is computed from OLD.id (the deleted row's primary key)
Phase 2–4: Scheduler, Frontier, Change Detection
Identical to the INSERT flow. The scheduler detects one change row in the LSN window.
Phase 5: Scan Differentiation — Pure DELETE
Unlike UPDATE (which splits into D+I), a DELETE produces a single event:
__pgt_row_id | __pgt_action | customer | amount
-------------|--------------|----------|-------
4521038 | D | alice | 30.00
The scan delta applies the net-effect filtering rule:
- first_action = 'D' → row existed before the refresh window
- last_action = 'D' → row does not exist after
Result: emit a DELETE using old values. No INSERT is emitted (because last_action = 'D').
This is the simplest path through the scan delta — one change, one PK, one DELETE event.
Phase 5 (continued): Aggregate Differentiation
The aggregate operator processes the DELETE event against the alice group:
-- DELETE event: subtract old values from alice's group
__ins_count = 0 -- no inserts
__del_count = 1 -- one deletion
__ins_total = 0 -- no amount added
__del_total = 30.00 -- 30.00 removed
The merge CTE joins this delta with the existing stream table state:
new_count = old_count + ins_count - del_count = 2 + 0 - 1 = 1 (still > 0)
Since new_count > 0 and the group already existed (old_count = 2), the action is classified as 'U' (update). The aggregate emits the group with its new values:
customer | total | order_count | __pgt_row_id | __pgt_action
---------|-------|-------------|--------------|-------------
alice | 50.00 | 1 | 7283194 | I
Note: the 'U' meta-action is emitted as __pgt_action = 'I' because the MERGE treats it as an update-via-INSERT (see aggregate final CTE: CASE WHEN __pgt_meta_action = 'D' THEN 'D' ELSE 'I' END).
Phase 6: MERGE
The MERGE statement matches alice's existing row and updates it:
MERGE INTO customer_totals AS st
USING (...delta...) AS d
ON st.__pgt_row_id = d.__pgt_row_id
WHEN MATCHED AND d.__pgt_action = 'I' THEN
UPDATE SET customer = d.customer, total = d.total, order_count = d.order_count, ...
Result:
SELECT * FROM customer_totals;
customer | total | order_count
----------|--------|------------
alice | 50.00 | 1 ← was 80.00 / 2
bob | 100.00 | 2
Phase 7: Cleanup
The change buffer rows in the consumed LSN window are deleted:
DELETE FROM pgtrickle_changes.changes_16384
WHERE lsn > '0/1A3F2FFF'::pg_lsn AND lsn <= '0/1A3F3000'::pg_lsn;
Case 2: Delete Last Row in Group (Group Vanishes)
-- Alice has one order left (id=1, amount=50.00). Delete it.
DELETE FROM orders WHERE id = 1;
Trigger Capture
change_id | lsn | action | old_cust | old_amt | pk_hash
6 | 0/1A3F3100 | D | alice | 50.00 | -837291
Scan Delta
Single DELETE event:
__pgt_row_id | __pgt_action | customer | amount
-------------|--------------|----------|-------
-837291 | D | alice | 50.00
Aggregate Delta
Group "alice":
ins_count = 0
del_count = 1
new_count = old_count + 0 - 1 = 1 - 1 = 0 → group vanishes!
When new_count drops to 0 (or below), the aggregate classifies this as action 'D' (delete). The reference count has reached zero — no rows contribute to this group anymore.
The aggregate emits a DELETE for alice's group:
customer | __pgt_row_id | __pgt_action
---------|--------------|-------------
alice | 7283194 | D
MERGE
The MERGE matches alice's existing row and deletes it:
WHEN MATCHED AND d.__pgt_action = 'D' THEN DELETE
Result:
SELECT * FROM customer_totals;
customer | total | order_count
----------|--------|------------
bob | 100.00 | 2
Alice's row is completely removed from the stream table. This is the correct behavior — with zero contributing rows, the group should not exist.
Case 3: Delete Multiple Rows (Same Group, Same Window)
-- Delete both of bob's orders before the next refresh
DELETE FROM orders WHERE id = 3; -- bob, 75.00
DELETE FROM orders WHERE id = 4; -- bob, 25.00
The change buffer has two rows with different pk_hash values (different PKs):
change_id | action | old_cust | old_amt | pk_hash
7 | D | bob | 75.00 | pk_hash_3
8 | D | bob | 25.00 | pk_hash_4
Scan Delta
Each PK has exactly one change, so both take the single-change fast path:
__pgt_row_id | __pgt_action | customer | amount
-------------|--------------|----------|-------
pk_hash_3 | D | bob | 75.00
pk_hash_4 | D | bob | 25.00
Two DELETE events, both targeting bob's group.
Aggregate Delta
The aggregate sums both deletions:
Group "bob":
ins_count = 0
del_count = 2
del_total = 75.00 + 25.00 = 100.00
new_count = 2 + 0 - 2 = 0 → group vanishes!
The aggregate emits a DELETE for bob's group.
MERGE
Bob's row is deleted from the stream table. With both alice and bob gone (from Cases 1+2+3), the stream table is now empty.
Case 4: INSERT + DELETE in Same Window (Cancellation)
What if a row is inserted and then deleted before the next refresh?
INSERT INTO orders (customer, amount) VALUES ('charlie', 200.00);
DELETE FROM orders WHERE customer = 'charlie';
The change buffer has:
change_id | action | new_cust | new_amt | old_cust | old_amt | pk_hash
9 | I | charlie | 200.00 | NULL | NULL | pk_hash_new
10 | D | NULL | NULL | charlie | 200.00 | pk_hash_new
Net-Effect Computation
Both changes share the same pk_hash. The pk_stats CTE finds cnt = 2, so this goes through the multi-change path:
first_action = FIRST_VALUE(action) OVER (...) → 'I'
last_action = LAST_VALUE(action) OVER (...) → 'D'
The scan delta applies the net-effect filtering:
- DELETE branch: requires first_action != 'I' → FAILS (first_action = 'I')
- INSERT branch: requires last_action != 'D' → FAILS (last_action = 'D')
Result: zero events emitted. The INSERT and DELETE completely cancel each other out.
The aggregate never sees charlie. The stream table is unchanged. This is correct — the row was born and died within the same refresh window, so it should have no visible effect.
Case 5: UPDATE + DELETE in Same Window
UPDATE orders SET amount = 999.99 WHERE id = 3; -- bob: 75 → 999.99
DELETE FROM orders WHERE id = 3;
The change buffer:
change_id | action | old_amt | new_amt
11 | U | 75.00 | 999.99
12 | D | 999.99 | NULL
Net-Effect Computation
Same pk_hash, cnt = 2:
first_action = 'U' (row existed before this window)
last_action = 'D' (row no longer exists)
Filtering:
- DELETE branch: first_action != 'I' → OK. Emit DELETE with old values from the earliest change: old_amt = 75.00
- INSERT branch: last_action != 'D' → FAILS. No INSERT emitted.
Net delta:
__pgt_row_id | __pgt_action | amount
-------------|--------------|-------
pk_hash_3 | D | 75.00
The intermediate value of 999.99 never appears. The aggregate sees only the removal of the original value (75.00), which is correct — that's the value that was previously accounted for in the stream table.
Case 6: DELETE with JOINs
Consider a stream table that joins two tables:
CREATE TABLE customers (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
tier TEXT NOT NULL DEFAULT 'standard'
);
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_id INT REFERENCES customers(id),
amount NUMERIC(10,2)
);
SELECT pgtrickle.create_stream_table(
name => 'order_details',
query => $$
SELECT c.name, c.tier, o.amount
FROM orders o
JOIN customers c ON o.customer_id = c.id
$$,
schedule => '1m'
);
Seed data:
INSERT INTO customers VALUES (1, 'alice', 'premium'), (2, 'bob', 'standard');
INSERT INTO orders VALUES (1, 1, 50.00), (2, 1, 30.00), (3, 2, 75.00);
After refresh, the stream table has:
name | tier | amount
------|----------|-------
alice | premium | 50.00
alice | premium | 30.00
bob | standard | 75.00
Now delete an order:
DELETE FROM orders WHERE id = 2; -- alice's 30.00 order
How the JOIN Delta Works
The join differentiation formula:
$$\Delta(L \bowtie R) = (\Delta L \bowtie R) \cup (L \bowtie \Delta R) - (\Delta L \bowtie \Delta R)$$
Since only the orders table changed:
- $\Delta L$ = changes to orders (one DELETE: order #2)
- $\Delta R$ = changes to customers (empty)
So:
- Part 1: $\Delta\text{orders} \bowtie \text{customers}$ = the deleted order joined with its customer
- Part 2: $\text{orders} \bowtie \Delta\text{customers}$ = empty (no customer changes)
- Part 3: $\Delta\text{orders} \bowtie \Delta\text{customers}$ = empty (customers unchanged)
Part 1 produces:
name | tier | amount | __pgt_action
------|---------|--------|-------------
alice | premium | 30.00 | D
The deleted order is joined with alice's customer record to produce a DELETE delta row with the complete joined values.
MERGE
The MERGE matches the row (alice, premium, 30.00) and deletes it:
SELECT * FROM order_details;
name | tier | amount
-------|----------|-------
alice | premium | 50.00 ← alice's remaining order
bob | standard | 75.00
What About Deleting From the Dimension Table?
DELETE FROM customers WHERE id = 2; -- remove bob entirely
Now $\Delta R$ has a DELETE for bob, while $\Delta L$ is empty:
- Part 1: $\Delta\text{orders} \bowtie \text{customers}$ = empty
- Part 2: $\text{orders} \bowtie \Delta\text{customers}$ = bob's order(s) joined with deleted customer record
Part 2 produces DELETE events for every order that referenced bob:
name | tier | amount | __pgt_action
-----|----------|--------|-------------
bob | standard | 75.00 | D
After MERGE, bob's rows vanish from the stream table.
Note: This assumes referential integrity — if orders still references customer #2, a foreign key constraint would prevent the DELETE in practice. But from the IVM perspective, the join delta correctly handles the removal regardless.
Case 7: Bulk DELETE
DELETE FROM orders WHERE amount < 50.00;
This deletes multiple rows across potentially multiple groups. The trigger fires once per row (it's a FOR EACH ROW trigger), writing one change buffer entry per deleted row:
change_id | action | old_cust | old_amt | pk_hash
13 | D | alice | 30.00 | pk_hash_2
14 | D | bob | 25.00 | pk_hash_4
Scan Delta
Each deleted PK is independent (different pk_hash values), so each takes the single-change fast path. Two DELETE events:
__pgt_row_id | __pgt_action | customer | amount
-------------|--------------|----------|-------
pk_hash_2 | D | alice | 30.00
pk_hash_4 | D | bob | 25.00
Aggregate Delta
The aggregate groups these by customer:
Group "alice":
del_count = 1, del_total = 30.00
new_count = 2 - 1 = 1 (survives)
Group "bob":
del_count = 1, del_total = 25.00
new_count = 2 - 1 = 1 (survives)
Both groups survive (count > 0), so the aggregate emits UPDATE (as 'I') events with new values:
customer | total | order_count
---------|-------|------------
alice | 50.00 | 1
bob | 75.00 | 1
The MERGE updates both rows. All work is proportional to the number of deleted rows (2), not the total table size.
Case 8: TRUNCATE (Automatic Full Refresh)
TRUNCATE orders;
TRUNCATE does not fire row-level triggers. However, as of v0.2.0, pg_trickle installs a statement-level AFTER TRUNCATE trigger that writes a 'T' marker to the change buffer. On the next refresh cycle, the scheduler detects this marker and automatically performs a full refresh — truncating the stream table and recomputing from the defining query.
No manual intervention is required. For details on how TRUNCATE is handled across all three refresh modes (DIFFERENTIAL, IMMEDIATE, FULL), see What Happens When You TRUNCATE a Table?.
How DELETE Differs From INSERT and UPDATE — A Summary
| Aspect | INSERT | UPDATE | DELETE |
|---|---|---|---|
| Trigger writes | new_* columns only | Both new_* and old_* | old_* columns only |
| new_* columns | Row values | New values | NULL |
| old_* columns | NULL | Old values | Row values |
| pk_hash source | NEW.pk | NEW.pk | OLD.pk |
| Scan delta output | 1 INSERT event | 2 events (D+I split) | 1 DELETE event |
| Aggregate effect | Adds to group count/sum | Subtracts old, adds new | Subtracts from group |
| Can delete a group? | No (only creates/grows) | Yes (if group key changes) | Yes (if count reaches 0) |
| MERGE action | INSERT new row | UPDATE existing row | DELETE matched row |
The Reference Counting Principle
The core insight behind incremental DELETE handling is reference counting. Every aggregate group in the stream table maintains an internal counter (__pgt_count) that tracks how many source rows contribute to the group:
Stream table internal state:
customer | total | order_count | __pgt_count (hidden)
---------|-------|-------------|---------------------
alice | 80.00 | 2 | 2
bob | 100.00| 2 | 2
- INSERT → __pgt_count += 1
- DELETE → __pgt_count -= 1
- UPDATE → __pgt_count += 0 (D cancels I for same-group updates)
When __pgt_count reaches 0:
- The group has zero contributing rows
- The aggregate emits a DELETE event
- The MERGE removes the row from the stream table
This is mathematically rigorous — the stream table always reflects the correct result of the defining query over the current base table contents, incrementally maintained through algebraic delta operations.
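The reference-counting rule can be modeled directly. An illustrative Python sketch (apply_delta and its field names are hypothetical; pg_trickle keeps __pgt_count as a hidden column and performs this work inside the MERGE):

```python
# Illustrative model of reference counting: a delta that drives a group's
# count to zero produces a 'D' for the whole group; otherwise an 'I'.

def apply_delta(state, group, ins_count, del_count, ins_total, del_total):
    old = state.get(group, {"total": 0.0, "count": 0})
    new_count = old["count"] + ins_count - del_count
    if new_count <= 0:
        state.pop(group, None)                 # zero contributors: remove row
        return (group, "D")
    state[group] = {"total": round(old["total"] + ins_total - del_total, 2),
                    "count": new_count}
    return (group, "I")                        # create or update via INSERT

state = {"alice": {"total": 80.00, "count": 2},
         "bob":   {"total": 100.00, "count": 2}}

print(apply_delta(state, "alice", 0, 1, 0.0, 30.00))   # ('alice', 'I') survives
print(apply_delta(state, "alice", 0, 1, 0.0, 50.00))   # ('alice', 'D') vanishes
print(sorted(state))                                   # ['bob']
```

This mirrors Cases 1 and 2 above: the first DELETE shrinks alice's group, the second drives its count to zero and removes the row.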
Performance Summary
| Scenario | Buffer rows | Delta rows emitted | Work |
|---|---|---|---|
| Single row DELETE (group survives) | 1 | 1 (D) | O(1) per group |
| Single row DELETE (group vanishes) | 1 | 1 (D) | O(1) |
| N deletes same group | N | N (D) → 1 group delta | O(N) scan, O(1) per group |
| INSERT+DELETE same window | 2 | 0 (cancels) | O(1) |
| UPDATE+DELETE same window | 2 | 1 (D original) | O(1) |
| Bulk DELETE across M groups | N | N (D) → M group deltas | O(N) scan, O(M) aggregate |
| JOIN table DELETE | 1 | K (one per matched join row) | O(K) join |
In all cases, the work is proportional to the number of changed rows, not the total table size. Deleting 3 rows from a billion-row table produces the same delta cost as from a 10-row table.
What About IMMEDIATE Mode?
Everything above describes DIFFERENTIAL mode — changes accumulate in a buffer and are applied on a schedule. As of v0.2.0, pg_trickle also supports IMMEDIATE mode, where the stream table is updated synchronously within the same transaction as your DELETE.
How IMMEDIATE Mode Differs for DELETE
| Phase | DIFFERENTIAL | IMMEDIATE |
|---|---|---|
| Trigger type | Row-level AFTER trigger | Statement-level AFTER trigger with REFERENCING OLD TABLE |
| What's captured | One buffer row with old_* columns per deleted row | A transition table containing all deleted rows |
| When delta runs | Next scheduler tick | Immediately, in the same transaction |
| Delta source | Change buffer rows with action='D' | Temp table copied from transition table |
| Concurrency | No locking between writers | Advisory lock per stream table |
When you run DELETE FROM orders WHERE id = 2:
- A BEFORE DELETE trigger acquires an advisory lock on the stream table
- The AFTER DELETE trigger captures OLD TABLE AS __pgt_oldtable into a temp table
- The DVM engine generates the same aggregate delta, reading deleted values from the old-table
- The delta is applied to the stream table immediately — groups are decremented, and groups reaching count=0 are removed
- Any query within the same transaction sees the updated stream table
BEGIN;
DELETE FROM orders WHERE id = 2; -- alice's 30.00 order
-- customer_totals already reflects the deletion here!
SELECT * FROM customer_totals WHERE customer = 'alice';
-- Shows: alice | 50.00 | 1
COMMIT;
The same reference counting, group deletion, and net-effect logic applies — the only difference is the data source (transition tables vs change buffer) and timing (synchronous vs scheduled).
Next in This Series
- What Happens When You INSERT a Row? — The full 7-phase lifecycle (start here if you haven't already)
- What Happens When You UPDATE a Row? — D+I split, group key changes, net-effect for multiple UPDATEs
- What Happens When You TRUNCATE a Table? — Why TRUNCATE bypasses triggers and how to recover
What Happens When You TRUNCATE a Table?
This tutorial explains what happens when a TRUNCATE statement hits a base table that is referenced by a stream table. Unlike INSERT, UPDATE, and DELETE — which are fully tracked by the CDC trigger — TRUNCATE is a special case that bypasses row-level triggers entirely. Understanding this gap is essential for operating pg_trickle correctly.
Prerequisite: Read WHAT_HAPPENS_ON_INSERT.md first — it introduces the 7-phase lifecycle. This tutorial explains why TRUNCATE breaks that lifecycle and how to recover.
Setup
Same e-commerce example used throughout the series:
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer TEXT NOT NULL,
amount NUMERIC(10,2) NOT NULL
);
SELECT pgtrickle.create_stream_table(
name => 'customer_totals',
query => $$
SELECT customer, SUM(amount) AS total, COUNT(*) AS order_count
FROM orders GROUP BY customer
$$,
schedule => '1m'
);
-- Seed some data
INSERT INTO orders (customer, amount) VALUES
('alice', 50.00),
('alice', 30.00),
('bob', 75.00),
('bob', 25.00);
After the first refresh, the stream table contains:
customer | total | order_count
---------|--------|------------
alice | 80.00 | 2
bob | 100.00 | 2
Case 1: TRUNCATE the Base Table (DIFFERENTIAL Mode)
TRUNCATE orders;
All four rows are removed instantly.
What Happens at the Trigger Level: TRUNCATE Marker
Updated in v0.2.0: pg_trickle now installs a statement-level AFTER TRUNCATE trigger on tracked source tables. This trigger writes a single marker row to the change buffer with action = 'T'.
Unlike the per-row DML triggers, the TRUNCATE trigger cannot capture individual row data (PostgreSQL's TRUNCATE does not provide OLD records). Instead, it writes a sentinel:
pgtrickle_changes.changes_16384
┌───────────┬─────────────┬────────┬──────────┬──────────┐
│ change_id │ lsn │ action │ new_* │ old_* │
├───────────┼─────────────┼────────┼──────────┼──────────┤
│ 5 │ 0/1A3F4000 │ T │ NULL │ NULL │
└───────────┴─────────────┴────────┴──────────┴──────────┘
The 'T' action marker tells the refresh engine: "a TRUNCATE happened — a full refresh is required."
What Happens at the Scheduler: Automatic Full Refresh
On the next refresh cycle, the scheduler:
- Checks the change buffer for rows in the LSN window
- Finds the action = 'T' marker row
- Falls back to a FULL refresh — regardless of the stream table's configured refresh_mode
- TRUNCATEs the stream table
- Re-executes the defining query against the current base table state
- Inserts all results
Since the orders table is now empty, the defining query returns zero rows:
-- After the next scheduled refresh:
SELECT * FROM customer_totals;
customer | total | order_count
----------|-------|------------
(0 rows) ← correct: orders is empty
No manual intervention required. The TRUNCATE marker ensures the stream table is automatically brought back into consistency on the next refresh cycle.
Note: In versions before v0.2.0, TRUNCATE was not captured at all — the change buffer stayed empty and the stream table became silently stale. If you're running an older version, you still need to call pgtrickle.refresh_stream_table() manually after a TRUNCATE.
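The scheduler's decision reduces to a simple rule over the buffered actions. An illustrative sketch (the real engine does this check in SQL over the LSN window):

```python
# A 'T' marker anywhere in the refresh window forces a full refresh and the
# row-level events are ignored; otherwise the normal differential path runs.

def choose_refresh(buffered_actions):
    return "FULL" if "T" in buffered_actions else "DIFFERENTIAL"

print(choose_refresh(["I", "I", "U"]))       # DIFFERENTIAL
print(choose_refresh(["T"]))                 # FULL
print(choose_refresh(["T", "I", "I"]))       # FULL (TRUNCATE-then-load ETL)
```

The third call shows why the rule is safe for mixed windows: recomputing from the current base table state subsumes any INSERTs that followed the TRUNCATE.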
Case 2: Manual Refresh (Explicit Recovery)
Although TRUNCATE is now automatically handled on the next refresh cycle, you can force an immediate recovery without waiting:
SELECT pgtrickle.refresh_stream_table('customer_totals');
This executes a full refresh regardless of the stream table's configured refresh mode:
- TRUNCATE the stream table itself (clearing the stale data)
- Re-execute the defining query
- INSERT the results into the stream table
- Update the frontier so future differential refreshes start from the current LSN
This is useful when you can't wait for the next scheduled refresh cycle and need the stream table consistent immediately.
Case 3: TRUNCATE Then INSERT (Common ETL Pattern)
A common data loading pattern is:
BEGIN;
TRUNCATE orders;
INSERT INTO orders (customer, amount) VALUES
('charlie', 100.00),
('charlie', 200.00),
('dave', 150.00);
COMMIT;
What the Change Buffer Sees
- TRUNCATE: 1 marker event (action = 'T') — captured by the statement-level trigger
- INSERT charlie 100.00: 1 event (captured)
- INSERT charlie 200.00: 1 event (captured)
- INSERT dave 150.00: 1 event (captured)
The change buffer has 4 rows — the TRUNCATE marker plus 3 INSERT events.
What the Scheduler Does
The scheduler sees the action = 'T' marker and triggers a full refresh, ignoring the individual INSERT events. The full refresh re-executes the defining query against the current state of orders, which now contains only charlie and dave:
-- After the next scheduled refresh:
SELECT * FROM customer_totals;
customer | total | order_count
----------|--------|------------
charlie | 300.00 | 2 ← correct
dave | 150.00 | 1 ← correct
The old data (alice, bob) is gone because the full refresh recomputed from scratch. This is correct — the TRUNCATE marker ensures consistency regardless of what other changes occurred in the same window.
Case 4: TRUNCATE a Dimension Table in a JOIN
Consider a stream table that joins two tables:
CREATE TABLE customers (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
tier TEXT NOT NULL DEFAULT 'standard'
);
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_id INT REFERENCES customers(id),
amount NUMERIC(10,2)
);
SELECT pgtrickle.create_stream_table(
name => 'order_details',
query => $$
SELECT c.name, c.tier, o.amount
FROM orders o
JOIN customers c ON o.customer_id = c.id
$$,
schedule => '1m'
);
Now truncate the dimension table:
TRUNCATE customers CASCADE;
The CASCADE also truncates orders (due to the foreign key). Both tables have TRUNCATE triggers installed, so both write a 'T' marker to their respective change buffers.
On the next refresh cycle, the scheduler detects the TRUNCATE markers and performs a full refresh. The stream table is recomputed from the now-empty base tables:
-- After the next scheduled refresh:
SELECT * FROM order_details;
-- (0 rows) — correct
Case 5: FULL Mode Stream Tables Are Immune
If the stream table uses FULL refresh mode instead of DIFFERENTIAL:
SELECT pgtrickle.create_stream_table(
name => 'customer_totals_full',
query => $$
SELECT customer, SUM(amount) AS total, COUNT(*) AS order_count
FROM orders GROUP BY customer
$$,
schedule => '1m',
refresh_mode => 'FULL'
);
A FULL-mode stream table doesn't use the change buffer at all. Every refresh cycle:
- TRUNCATEs the stream table
- Re-executes the defining query
- Inserts all results
So after a TRUNCATE of the base table, the next scheduled refresh automatically picks up the correct state — no manual intervention needed. The trade-off is that every refresh recomputes from scratch, which is more expensive for large result sets.
Why PostgreSQL Doesn't Fire Row Triggers on TRUNCATE
Understanding the PostgreSQL internals helps explain why per-row capture is impossible:
| Operation | Mechanism | Row triggers fired? | Statement triggers fired? |
|---|---|---|---|
| DELETE FROM t | Scans and removes rows one by one | Yes — AFTER DELETE per row | Yes |
| TRUNCATE t | Removes all heap files and reinitializes the table storage | No — no per-row processing | Yes — AFTER TRUNCATE |
| DELETE FROM t WHERE true | Same as DELETE FROM t (full scan) | Yes — AFTER DELETE per row | Yes |
TRUNCATE is fundamentally different from DELETE. It's an O(1) operation that replaces the table's storage files, while DELETE is O(N) — scanning every row and recording each removal in WAL.
pg_trickle uses a statement-level AFTER TRUNCATE trigger to detect the event and write a 'T' marker to the change buffer. This marker does not contain per-row data (PostgreSQL's TRUNCATE trigger doesn't provide OLD records), but it's sufficient to signal that a full refresh is needed.
Alternative: DELETE FROM Instead of TRUNCATE
For DIFFERENTIAL mode, TRUNCATE is now handled automatically (via the 'T' marker and full refresh fallback). However, using DELETE FROM instead of TRUNCATE has its own advantages:
-- Instead of: TRUNCATE orders;
DELETE FROM orders;
This fires the row-level DELETE trigger for every row. The change buffer captures all removals, and the next differential refresh correctly decrements all reference counts through the standard algebraic delta path — avoiding the need for a full refresh fallback.
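The per-row delta path for an aggregate like customer_totals can be sketched with a simple counter model. This is an illustrative Python analogue (pg_trickle does this in SQL via its derived delta query); the group state shape is an assumption:

```python
# Sketch of differential maintenance for SUM/COUNT GROUP BY under row-level
# deltas: weight +1 for an inserted row, -1 for a deleted row.
from collections import defaultdict

state = defaultdict(lambda: [0.0, 0])   # customer -> [total, order_count]

def apply_delta(customer, amount, weight):
    total, cnt = state[customer]
    state[customer] = [total + weight * amount, cnt + weight]
    if state[customer][1] == 0:         # count reaches zero: group disappears
        del state[customer]

for cust, amt in [("alice", 50.0), ("alice", 30.0), ("bob", 100.0)]:
    apply_delta(cust, amt, +1)

# DELETE FROM orders fires one row trigger per row; each delete is a
# weight -1 delta, so every group drains to zero and is removed:
for cust, amt in [("alice", 50.0), ("alice", 30.0), ("bob", 100.0)]:
    apply_delta(cust, amt, -1)

print(dict(state))  # {}
```

The end state matches a full refresh over the now-empty table, but it was reached purely through per-row deltas, with no fallback.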
| Approach | Speed | Stream table consistent? | Refresh type |
|---|---|---|---|
| TRUNCATE orders | O(1) — instant | Yes — automatic full refresh on next cycle | FULL (fallback) |
| DELETE FROM orders | O(N) — scans all rows | Yes — per-row triggers fire | DIFFERENTIAL |
| TRUNCATE + manual refresh | O(1) + O(query) | Yes — immediately | FULL (manual) |
For tables with millions of rows, DELETE FROM can be slow and generate significant WAL. TRUNCATE is generally the better choice — the automatic full refresh fallback makes it safe to use.
Best Practices
1. TRUNCATE Is Safe to Use
As of v0.2.0, TRUNCATE on tracked source tables is automatically detected and triggers a full refresh on the next scheduler cycle. No manual intervention is required for standard operation.
2. Use Manual Refresh for Immediate Consistency
If you need the stream table to be consistent immediately (not on the next cycle), call refresh explicitly:
TRUNCATE orders;
SELECT pgtrickle.refresh_stream_table('customer_totals');
3. Consider IMMEDIATE Mode for Real-Time Needs
For stream tables that need to reflect TRUNCATE instantly (within the same transaction), use IMMEDIATE mode. The TRUNCATE trigger automatically performs a full refresh synchronously.
4. Consider FULL Mode for ETL-Heavy Tables
If a table is routinely truncated and reloaded, FULL refresh mode may be simpler than DIFFERENTIAL — it naturally handles TRUNCATE because it recomputes from scratch every cycle.
5. Use trigger_inventory() to Verify Triggers
You can verify that both the DML trigger and the TRUNCATE trigger are installed and enabled:
SELECT * FROM pgtrickle.trigger_inventory();
This shows one row per (source table, trigger type) confirming both pg_trickle_cdc_<oid> (DML) and pg_trickle_cdc_truncate_<oid> (TRUNCATE) triggers are present.
How TRUNCATE Compares to Other Operations
| Aspect | INSERT | UPDATE | DELETE | TRUNCATE |
|---|---|---|---|---|
| Row trigger fires? | Yes (per row) | Yes (per row) | Yes (per row) | No |
| Statement trigger fires? | Yes | Yes | Yes | Yes (writes 'T' marker) |
| Change buffer | 1 row per INSERT | 1 row per UPDATE | 1 row per DELETE | 1 marker row (action='T') |
| Stream table updated? | Yes (next refresh) | Yes (next refresh) | Yes (next refresh) | Yes (full refresh on next cycle) |
| Recovery | Automatic (differential) | Automatic (differential) | Automatic (differential) | Automatic (full refresh fallback) |
| FULL mode affected? | N/A (recomputes) | N/A (recomputes) | N/A (recomputes) | N/A (recomputes) |
| IMMEDIATE mode? | Synchronous delta | Synchronous delta | Synchronous delta | Synchronous full refresh |
| Speed | O(1) per row | O(1) per row | O(1) per row | O(1) + O(query) for refresh |
What About IMMEDIATE Mode?
In IMMEDIATE mode, TRUNCATE is handled synchronously within the same transaction:
- The BEFORE TRUNCATE trigger acquires an advisory lock on the stream table
- The AFTER TRUNCATE trigger calls pgt_ivm_handle_truncate(pgt_id)
- This function TRUNCATEs the stream table and re-populates it by re-executing the defining query
- The stream table is immediately consistent — within the same transaction
SELECT pgtrickle.create_stream_table(
name => 'customer_totals_live',
query => $$
SELECT customer, SUM(amount) AS total, COUNT(*) AS order_count
FROM orders GROUP BY customer
$$,
refresh_mode => 'IMMEDIATE'
);
BEGIN;
TRUNCATE orders;
-- customer_totals_live is already empty here!
SELECT * FROM customer_totals_live; -- (0 rows)
COMMIT;
No waiting for a scheduler cycle, no stale data — TRUNCATE is fully handled in real-time.
Summary
As of v0.2.0, TRUNCATE is fully tracked by pg_trickle across all three refresh modes. While it cannot be captured as per-row DELETE events (PostgreSQL's TRUNCATE doesn't process individual rows), pg_trickle uses a statement-level trigger to detect the event and respond appropriately.
The key takeaways:
- TRUNCATE is automatically handled — a statement-level AFTER TRUNCATE trigger writes a 'T' marker to the change buffer
- DIFFERENTIAL mode: automatic full refresh — the scheduler detects the marker and falls back to a full refresh on the next cycle
- IMMEDIATE mode: synchronous full refresh — the stream table is rebuilt within the same transaction
- FULL mode: naturally immune — every refresh recomputes from scratch regardless
- Manual refresh for instant consistency — call pgtrickle.refresh_stream_table() if you can't wait for the next cycle
- DELETE FROM remains an alternative — fires per-row triggers, enabling incremental delta processing instead of full refresh fallback
Next in This Series
- What Happens When You INSERT a Row? — The full 7-phase lifecycle (start here if you haven't already)
- What Happens When You UPDATE a Row? — D+I split, group key changes, net-effect for multiple UPDATEs
- What Happens When You DELETE a Row? — Reference counting, group deletion, INSERT+DELETE cancellation
Row-Level Security (RLS) on Stream Tables
This tutorial shows how to apply PostgreSQL Row-Level Security to stream tables so that different database roles see only the rows they are permitted to access.
Background
Stream tables materialize the full result set of their defining query,
regardless of any RLS policies on the source tables. This matches the behavior
of PostgreSQL's built-in MATERIALIZED VIEW — the cache contains everything,
and access control is enforced at read time.
The recommended pattern is:
- Source tables: may or may not have RLS. Stream tables always see all rows.
- Stream table: enable RLS on the stream table and create per-role policies so each role sees only its permitted rows.
Setup: Multi-Tenant Orders
-- Source table: all tenant orders
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
tenant_id INT NOT NULL,
product TEXT NOT NULL,
amount NUMERIC(10,2) NOT NULL
);
INSERT INTO orders (tenant_id, product, amount) VALUES
(1, 'Widget A', 19.99),
(1, 'Widget B', 9.50),
(2, 'Gadget X', 49.00),
(2, 'Gadget Y', 25.00),
(3, 'Doohickey', 5.00);
-- Stream table: per-tenant spend summary
SELECT pgtrickle.create_stream_table(
name => 'tenant_spend',
query => $$
SELECT tenant_id,
COUNT(*) AS order_count,
SUM(amount) AS total_spend
FROM orders
GROUP BY tenant_id
$$,
schedule => '1m'
);
After the first refresh, tenant_spend contains all three tenants:
SELECT * FROM pgtrickle.tenant_spend ORDER BY tenant_id;
-- tenant_id | order_count | total_spend
-- -----------+-------------+-------------
-- 1 | 2 | 29.49
-- 2 | 2 | 74.00
-- 3 | 1 | 5.00
Step 1: Enable RLS on the Stream Table
ALTER TABLE pgtrickle.tenant_spend ENABLE ROW LEVEL SECURITY;
Once RLS is enabled, non-superuser roles see zero rows unless a policy grants access. The superuser (table owner) bypasses RLS by default.
Step 2: Create Per-Tenant Roles
CREATE ROLE tenant_1 LOGIN;
CREATE ROLE tenant_2 LOGIN;
GRANT USAGE ON SCHEMA pgtrickle TO tenant_1, tenant_2;
GRANT SELECT ON pgtrickle.tenant_spend TO tenant_1, tenant_2;
Step 3: Create RLS Policies
-- Tenant 1 sees only tenant_id = 1
CREATE POLICY tenant_1_policy ON pgtrickle.tenant_spend
FOR SELECT
TO tenant_1
USING (tenant_id = 1);
-- Tenant 2 sees only tenant_id = 2
CREATE POLICY tenant_2_policy ON pgtrickle.tenant_spend
FOR SELECT
TO tenant_2
USING (tenant_id = 2);
Step 4: Verify Filtering
Connect as each tenant role and query:
-- As tenant_1:
SET ROLE tenant_1;
SELECT * FROM pgtrickle.tenant_spend;
-- tenant_id | order_count | total_spend
-- -----------+-------------+-------------
-- 1 | 2 | 29.49
RESET ROLE;
-- As tenant_2:
SET ROLE tenant_2;
SELECT * FROM pgtrickle.tenant_spend;
-- tenant_id | order_count | total_spend
-- -----------+-------------+-------------
-- 2 | 2 | 74.00
RESET ROLE;
Each tenant sees only their own data. The underlying stream table still contains all rows — the filtering happens at query time via RLS.
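The semantics can be pictured as read-time predicate filtering over an unchanged stored set. A minimal sketch, with hypothetical policy predicates mirroring the two policies above:

```python
# Sketch of RLS read-time filtering: the materialized rows never change,
# only the subset visible to each role does.

rows = [
    {"tenant_id": 1, "order_count": 2, "total_spend": 29.49},
    {"tenant_id": 2, "order_count": 2, "total_spend": 74.00},
    {"tenant_id": 3, "order_count": 1, "total_spend": 5.00},
]

policies = {                              # role -> USING predicate
    "tenant_1": lambda r: r["tenant_id"] == 1,
    "tenant_2": lambda r: r["tenant_id"] == 2,
}

def visible(role):
    # Default deny: a role with no matching policy sees zero rows.
    predicate = policies.get(role, lambda r: False)
    return [r for r in rows if predicate(r)]

print(len(visible("tenant_1")))  # 1
print(len(visible("tenant_3")))  # 0
```

The "default deny" branch is why testing with a non-superuser role matters: forgetting a policy silently yields an empty result, not an error.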
How Refresh Works with RLS
Both scheduled and manual refreshes run with superuser-equivalent privileges, so RLS on source tables is always bypassed during refresh. This ensures:
- The stream table always contains the complete result set.
- A refresh_stream_table() call produces the same result regardless of who calls it.
- IMMEDIATE mode (IVM triggers) also bypasses RLS via SECURITY DEFINER trigger functions.
Policy Change Detection
pg_trickle automatically detects RLS-related DDL on source tables:
| DDL on source table | Effect |
|---|---|
| CREATE POLICY / ALTER POLICY / DROP POLICY | Stream table marked for reinit |
| ALTER TABLE ... ENABLE ROW LEVEL SECURITY | Stream table marked for reinit |
| ALTER TABLE ... DISABLE ROW LEVEL SECURITY | Stream table marked for reinit |
| ALTER TABLE ... FORCE ROW LEVEL SECURITY | Stream table marked for reinit |
| ALTER TABLE ... NO FORCE ROW LEVEL SECURITY | Stream table marked for reinit |
Since refresh always bypasses RLS, a policy change on a source table does not actually alter what the stream table materializes; these reinits are a conservative safeguard that rebuilds the data so it is guaranteed consistent after the source table's security posture changes.
Tips
- One stream table, many roles: A single stream table can serve all tenants. Each role's RLS policy filters at read time — no per-tenant duplication needed.
- Write policies: Stream tables are maintained by pg_trickle. Restrict writes to the pg_trickle system by only creating FOR SELECT policies.
- Default deny: Once RLS is enabled, roles without a matching policy see zero rows. Always test with a non-superuser role.
- FORCE ROW LEVEL SECURITY: By default, table owners bypass RLS. Use ALTER TABLE ... FORCE ROW LEVEL SECURITY if the owner should also be subject to policies.
Partitioned Tables as Sources
This tutorial shows how pg_trickle works with PostgreSQL's declarative table partitioning. It covers RANGE, LIST, and HASH partitioned source tables, explains what happens when you add or remove partitions, and documents known caveats.
Background
PostgreSQL lets you split large tables into smaller "partitions" — for
example one partition per month for an orders table. This is a common
technique for managing very large datasets. pg_trickle handles partitioned
source tables transparently:
- CDC triggers fire on all partitions. PostgreSQL 13+ automatically clones row-level triggers from the parent to every child partition. All DML (INSERT, UPDATE, DELETE) on any partition is captured in a single change buffer keyed by the parent table's OID.
- ATTACH PARTITION is detected automatically. When you add a new partition with pre-existing data, pg_trickle's DDL event trigger detects the change and marks affected stream tables for reinitialization. No manual intervention required.
- WAL-based CDC works correctly. When using WAL mode, publications are created with publish_via_partition_root = true so all partition changes appear under the parent table's identity.
Example: Monthly Sales Partitions (RANGE)
-- Create a RANGE-partitioned source table
CREATE TABLE sales (
id SERIAL,
sale_date DATE NOT NULL,
region TEXT NOT NULL,
amount NUMERIC NOT NULL,
PRIMARY KEY (id, sale_date)
) PARTITION BY RANGE (sale_date);
-- Create partitions for each half of the year
CREATE TABLE sales_h1_2025 PARTITION OF sales
FOR VALUES FROM ('2025-01-01') TO ('2025-07-01');
CREATE TABLE sales_h2_2025 PARTITION OF sales
FOR VALUES FROM ('2025-07-01') TO ('2026-01-01');
-- Insert data across partitions
INSERT INTO sales (sale_date, region, amount) VALUES
('2025-02-15', 'US', 100.00),
('2025-05-20', 'EU', 250.00),
('2025-08-10', 'US', 175.00),
('2025-11-30', 'EU', 300.00);
-- Create a stream table over the partitioned source
SELECT pgtrickle.create_stream_table(
name => 'regional_sales',
query => $$
SELECT region, SUM(amount) AS total, COUNT(*) AS cnt
FROM sales
GROUP BY region
$$,
schedule => '1 minute',
refresh_mode => 'DIFFERENTIAL'
);
-- Refresh to populate
SELECT pgtrickle.refresh_stream_table('regional_sales');
-- Verify — aggregates span all partitions:
SELECT * FROM regional_sales ORDER BY region;
-- region | total | cnt
-- --------+--------+-----
-- EU | 550.00 | 2
-- US | 275.00 | 2
Adding New Partitions
When you add a new partition, any new rows inserted through the parent are automatically captured by CDC triggers. The trigger on the parent is cloned to the new partition by PostgreSQL.
-- Add a new partition for 2026
CREATE TABLE sales_h1_2026 PARTITION OF sales
FOR VALUES FROM ('2026-01-01') TO ('2026-07-01');
-- Inserts into the new partition are captured normally
INSERT INTO sales (sale_date, region, amount)
VALUES ('2026-03-15', 'US', 400.00);
-- Next refresh picks up the new row
SELECT pgtrickle.refresh_stream_table('regional_sales');
SELECT * FROM regional_sales ORDER BY region;
-- region | total | cnt
-- --------+--------+-----
-- EU | 550.00 | 2
-- US | 675.00 | 3
ATTACH PARTITION with Pre-Existing Data
The most important edge case: attaching a table that already contains rows. These rows were never seen by CDC triggers, so the stream table would be stale. pg_trickle detects this automatically.
-- Create a standalone table with existing data
CREATE TABLE sales_h2_2026 (
id SERIAL,
sale_date DATE NOT NULL,
region TEXT NOT NULL,
amount NUMERIC NOT NULL,
PRIMARY KEY (id, sale_date)
);
INSERT INTO sales_h2_2026 (sale_date, region, amount) VALUES
('2026-08-01', 'EU', 500.00),
('2026-09-15', 'US', 200.00);
-- Attach it to the partitioned table
ALTER TABLE sales ATTACH PARTITION sales_h2_2026
FOR VALUES FROM ('2026-07-01') TO ('2027-01-01');
-- pg_trickle detects the partition change and marks the stream table
-- for reinitialize. Check:
SELECT pgt_name, needs_reinit
FROM pgtrickle.pgt_stream_tables
WHERE pgt_name = 'regional_sales';
-- pgt_name | needs_reinit
-- -----------------+--------------
-- regional_sales | t
-- The next refresh reinitializes — re-reading all data from scratch:
SELECT pgtrickle.refresh_stream_table('regional_sales');
SELECT * FROM regional_sales ORDER BY region;
-- region | total | cnt
-- --------+---------+-----
-- EU | 1050.00 | 3
-- US | 875.00 | 4
DETACH PARTITION
When you detach a partition, the detached table's data is no longer visible through the parent. pg_trickle detects this too and marks stream tables for reinitialize.
-- Archive the old partition
ALTER TABLE sales DETACH PARTITION sales_h1_2025;
-- Stream table is marked for reinit:
SELECT pgt_name, needs_reinit
FROM pgtrickle.pgt_stream_tables
WHERE pgt_name = 'regional_sales';
-- pgt_name | needs_reinit
-- -----------------+--------------
-- regional_sales | t
-- After refresh, the detached partition's rows are gone:
SELECT pgtrickle.refresh_stream_table('regional_sales');
SELECT * FROM regional_sales ORDER BY region;
-- (only rows from remaining partitions)
LIST Partitioning
LIST partitioning splits rows by discrete values. It works identically:
CREATE TABLE events (
id SERIAL,
region TEXT NOT NULL,
payload TEXT,
PRIMARY KEY (id, region)
) PARTITION BY LIST (region);
CREATE TABLE events_us PARTITION OF events FOR VALUES IN ('US');
CREATE TABLE events_eu PARTITION OF events FOR VALUES IN ('EU');
CREATE TABLE events_ap PARTITION OF events FOR VALUES IN ('AP');
SELECT pgtrickle.create_stream_table(
name => 'event_counts',
query => 'SELECT region, count(*) AS cnt FROM events GROUP BY region',
schedule => '1 minute'
);
HASH Partitioning
HASH partitioning distributes rows across a fixed number of partitions. Useful for spreading write load evenly:
CREATE TABLE metrics (
id SERIAL PRIMARY KEY,
sensor_id INT NOT NULL,
value DOUBLE PRECISION
) PARTITION BY HASH (id);
CREATE TABLE metrics_0 PARTITION OF metrics
FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE metrics_1 PARTITION OF metrics
FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE metrics_2 PARTITION OF metrics
FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE metrics_3 PARTITION OF metrics
FOR VALUES WITH (MODULUS 4, REMAINDER 3);
SELECT pgtrickle.create_stream_table(
name => 'sensor_avg',
query => $$
SELECT sensor_id, AVG(value) AS avg_val, COUNT(*) AS cnt
FROM metrics GROUP BY sensor_id
$$,
schedule => '1 minute'
);
Foreign Tables
Tables from other databases (via postgres_fdw) can be used as sources,
but with restrictions:
- No trigger-based CDC — foreign tables don't support row-level triggers.
- No WAL-based CDC — foreign tables don't generate local WAL.
- FULL refresh works — SELECT * executes a remote query each time.
- Polling-based CDC works — when pg_trickle.foreign_table_polling is enabled, pg_trickle creates a local snapshot table and detects changes via EXCEPT ALL comparison.
When you use a foreign table as a source, pg_trickle emits an info message explaining the limitations:
CREATE EXTENSION postgres_fdw;
CREATE SERVER remote_db
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'remote-host', dbname 'analytics');
CREATE USER MAPPING FOR CURRENT_USER
SERVER remote_db OPTIONS (user 'reader');
CREATE FOREIGN TABLE remote_orders (
id INT,
amount NUMERIC
) SERVER remote_db OPTIONS (table_name 'orders');
-- Only FULL refresh is available:
SELECT pgtrickle.create_stream_table(
name => 'remote_totals',
query => 'SELECT SUM(amount) AS total FROM remote_orders',
schedule => '5 minutes',
refresh_mode => 'FULL'
);
-- INFO: pg_trickle: source table remote_orders is a foreign table.
-- Foreign tables cannot use trigger-based or WAL-based CDC —
-- only FULL refresh mode or polling-based change detection is supported.
Known Caveats
| Caveat | Description |
|---|---|
| PostgreSQL 13+ required | Parent-table triggers only propagate to child partitions on PG 13+. pg_trickle targets PostgreSQL 18, so this is always satisfied. |
| Partition key in PRIMARY KEY | PostgreSQL requires the partition key to be part of any unique constraint. This means your PRIMARY KEY must include the partition column. |
| ATTACH with data = reinitialize | Attaching a partition with pre-existing rows triggers a full reinitialize on the next refresh. For very large tables, this may be slow. Consider gating the source with pgtrickle.gate_source() during bulk partition operations. |
| Sub-partitioning | Multi-level partitioning (partitions of partitions) works in principle because triggers propagate through the entire hierarchy, but it is not extensively tested. |
| pg_partman compatibility | pg_partman dynamically creates and drops partitions. Since pg_trickle detects ATTACH/DETACH via DDL event triggers, it should work, but this combination is not yet tested. |
| Partitioned storage tables | Using a partitioned table as the stream table's storage is not supported. This is tracked for a future release. |
| DETACH PARTITION CONCURRENTLY | DETACH PARTITION ... CONCURRENTLY is a two-phase operation. The DDL event trigger fires after the first phase; the partition is not fully detached until the second phase commits. The stream table may briefly reflect the old partition count. |
Foreign Table Sources
This tutorial shows how to use a postgres_fdw foreign table as a source for
a stream table. Foreign tables let you aggregate data from remote PostgreSQL
databases into a local stream table that refreshes automatically.
Background
PostgreSQL's Foreign Data Wrapper
(postgres_fdw) lets you define tables that transparently query a remote
database. pg_trickle can use these foreign tables as stream table sources,
but with different change-detection semantics than regular tables.
Key difference: Foreign tables cannot use trigger-based or WAL-based CDC. Changes are detected either by re-scanning the entire remote table (FULL refresh) or by comparing a local snapshot to the remote data (polling-based CDC).
Step 1 — Set Up the Foreign Server
-- Enable the foreign data wrapper extension
CREATE EXTENSION IF NOT EXISTS postgres_fdw;
-- Create a connection to the remote database
CREATE SERVER warehouse_db
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'warehouse.example.com', dbname 'analytics', port '5432');
-- Map the current user to a remote user
CREATE USER MAPPING FOR CURRENT_USER
SERVER warehouse_db
OPTIONS (user 'readonly_user', password 'secret');
Step 2 — Define the Foreign Table
CREATE FOREIGN TABLE remote_orders (
id INT,
customer_id INT,
amount NUMERIC(12,2),
region TEXT,
created_at TIMESTAMP
) SERVER warehouse_db
OPTIONS (schema_name 'public', table_name 'orders');
Alternatively, import an entire remote schema:
IMPORT FOREIGN SCHEMA public
LIMIT TO (orders, customers)
FROM SERVER warehouse_db
INTO public;
Step 3 — Create a Stream Table with FULL Refresh
The simplest approach uses FULL refresh mode — pg_trickle re-executes the query against the remote table on every refresh cycle:
SELECT pgtrickle.create_stream_table(
name => 'orders_by_region',
query => $$
SELECT
region,
COUNT(*) AS order_count,
SUM(amount) AS total_revenue,
AVG(amount) AS avg_order_value
FROM remote_orders
GROUP BY region
$$,
schedule => '5m',
refresh_mode => 'FULL'
);
pg_trickle will emit an informational message:
INFO: pg_trickle: source table remote_orders is a foreign table.
Foreign tables cannot use trigger-based or WAL-based CDC —
only FULL refresh mode or polling-based change detection is supported.
How FULL refresh works with foreign tables:
- Every 5 minutes, pg_trickle executes the defining query.
- The query is sent to the remote database via postgres_fdw.
- The complete result set replaces the stream table contents.
- This is equivalent to a MATERIALIZED VIEW refresh, but automated.
Step 4 — Polling-Based CDC (Optional)
If the remote table is large and changes are small, FULL refresh becomes expensive because it transfers the entire result set every cycle. Polling-based CDC provides a more efficient alternative:
-- Enable polling globally (or per-session)
SET pg_trickle.foreign_table_polling = on;
-- Now create with DIFFERENTIAL mode — pg_trickle will use polling
SELECT pgtrickle.create_stream_table(
name => 'orders_by_region_polling',
query => $$
SELECT
region,
COUNT(*) AS order_count,
SUM(amount) AS total_revenue,
AVG(amount) AS avg_order_value
FROM remote_orders
GROUP BY region
$$,
schedule => '5m',
refresh_mode => 'DIFFERENTIAL'
);
How polling works:
- On the first refresh, pg_trickle creates a local snapshot table that mirrors the remote table's data.
- On subsequent refreshes, it fetches the current remote data and computes an EXCEPT ALL difference against the snapshot.
- Only the changed rows are written to the change buffer and processed through the incremental delta pipeline.
- The snapshot table is updated to reflect the new remote state.
- When the stream table is dropped, the snapshot table is cleaned up.
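The diffing step is a multiset difference, analogous to SQL's EXCEPT ALL. A minimal sketch of that comparison (row shapes are illustrative):

```python
# Sketch of polling-based change detection: compare the local snapshot
# to the current remote rows as multisets (duplicates count).
from collections import Counter

snapshot = Counter([("US", 100.0), ("EU", 250.0), ("EU", 250.0)])
remote   = Counter([("US", 100.0), ("EU", 250.0), ("AP", 75.0)])

deletes = snapshot - remote   # rows gone from the remote since last poll
inserts = remote - snapshot   # rows that appeared since last poll

print(dict(deletes))  # {('EU', 250.0): 1}
print(dict(inserts))  # {('AP', 75.0): 1}

# These deltas become D and I events in the change buffer; the snapshot is
# then replaced by the current remote state, ready for the next poll.
```

Using multisets rather than sets is essential: if the remote table holds two identical rows and one is deleted, a set-based diff would miss it.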
Trade-offs:
| Aspect | FULL Refresh | Polling CDC |
|---|---|---|
| Network transfer | Full result set every cycle | Full remote scan, but only diffs applied |
| Local storage | Stream table only | Stream table + snapshot table |
| Best for | Small remote tables | Large remote tables with small change rates |
| GUC required | No | pg_trickle.foreign_table_polling = on |
Step 5 — Verify and Monitor
-- Check stream table status
SELECT * FROM pgtrickle.pgt_status('orders_by_region');
-- Check CDC health (will show foreign table constraints)
SELECT * FROM pgtrickle.check_cdc_health();
-- View refresh history
SELECT * FROM pgtrickle.get_refresh_history('orders_by_region', 5);
-- Monitor staleness
SELECT * FROM pgtrickle.get_staleness('orders_by_region');
Worked Example — Remote Inventory Dashboard
This example aggregates inventory data from a remote warehouse database into a local dashboard table:
-- Remote table definition
CREATE FOREIGN TABLE remote_inventory (
sku TEXT,
warehouse TEXT,
quantity INT,
updated_at TIMESTAMP
) SERVER warehouse_db
OPTIONS (schema_name 'inventory', table_name 'stock_levels');
-- Dashboard: inventory summary by warehouse
SELECT pgtrickle.create_stream_table(
name => 'inventory_dashboard',
query => $$
SELECT
warehouse,
COUNT(DISTINCT sku) AS unique_products,
SUM(quantity) AS total_units,
MIN(updated_at) AS oldest_update,
MAX(updated_at) AS newest_update
FROM remote_inventory
GROUP BY warehouse
$$,
schedule => '10m',
refresh_mode => 'FULL'
);
After the first refresh:
SELECT * FROM inventory_dashboard;
warehouse | unique_products | total_units | oldest_update | newest_update
-----------+-----------------+-------------+---------------------+---------------------
east | 142 | 23500 | 2026-03-14 08:00:00 | 2026-03-14 09:15:00
west | 98 | 15200 | 2026-03-14 07:30:00 | 2026-03-14 09:10:00
central | 215 | 41000 | 2026-03-14 06:00:00 | 2026-03-14 09:20:00
Constraints and Caveats
| Constraint | Details |
|---|---|
| No trigger CDC | Foreign tables don't support PostgreSQL row-level triggers. |
| No WAL CDC | Foreign tables don't generate local WAL entries. |
| Network latency | Each refresh cycle queries the remote database. Schedule accordingly. |
| Remote availability | If the remote database is down, the refresh will fail (logged in pgt_refresh_history). The stream table retains its last successful data. |
| Authentication | CREATE USER MAPPING credentials must remain valid. Use .pgpass or environment variables in production. |
| Snapshot storage | Polling CDC creates a snapshot table sized proportionally to the remote table. Monitor disk usage. |
FAQ
Q: Why does my foreign table stream table only work in FULL mode?
Foreign tables cannot install row-level triggers (the mechanism pg_trickle uses
for trigger-based CDC) and don't generate local WAL records (used by WAL-based
CDC). FULL refresh works because it simply re-executes the remote query.
Enable pg_trickle.foreign_table_polling if you need differential-style
change detection.
Q: Can I mix foreign and local tables in the same defining query?
Yes. If your query joins a foreign table with a local table, pg_trickle uses trigger/WAL CDC for the local table and FULL-rescan or polling for the foreign table. The refresh mode must be FULL unless polling is enabled for the foreign table sources.
Q: What happens if the remote database is temporarily unavailable?
The refresh attempt fails, is logged in pgt_refresh_history with status
FAILED, and the consecutive_errors counter increments. The stream table
retains its last successful data. When the remote database recovers, the next
scheduled refresh succeeds and the error counter resets.
Tutorial: Migrating from Materialized Views
This guide shows how to incrementally migrate existing PostgreSQL
MATERIALIZED VIEW + manual REFRESH workflows to pg_trickle stream
tables.
Why Migrate?
| | Materialized View | Stream Table |
|---|---|---|
| Refresh | Manual (REFRESH MATERIALIZED VIEW) | Automatic (scheduler) or manual |
| Incremental refresh | Not supported | Built-in differential mode |
| Blocking reads | REFRESH without CONCURRENTLY blocks readers | Never blocks readers |
| Dependency ordering | Manual | Automatic (DAG-aware topological refresh) |
| Monitoring | None | Built-in views, stats, NOTIFY alerts |
| Scheduling | External (cron, pg_cron) | Native (duration, cron, CALCULATED) |
Step-by-Step Migration
1. Identify materialized views to migrate
-- List all materialized views with their defining queries
SELECT schemaname, matviewname, definition
FROM pg_matviews
ORDER BY schemaname, matviewname;
2. Create the stream table
Take the materialized view's defining query and pass it to
create_stream_table():
Before (materialized view):
CREATE MATERIALIZED VIEW order_totals AS
SELECT customer_id, SUM(amount) AS total, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id;
-- Refreshed via cron or pg_cron:
-- */5 * * * * psql -c "REFRESH MATERIALIZED VIEW CONCURRENTLY order_totals"
After (stream table):
SELECT pgtrickle.create_stream_table(
name => 'order_totals',
query => 'SELECT customer_id, SUM(amount) AS total, COUNT(*) AS order_count
FROM orders GROUP BY customer_id',
schedule => '5m'
);
3. Update application queries
Stream tables live in the pgtrickle schema by default. Update your
application queries to reference the new location:
-- Before:
SELECT * FROM public.order_totals WHERE total > 1000;
-- After:
SELECT * FROM pgtrickle.order_totals WHERE total > 1000;
Or create a view in the original schema for backward compatibility:
CREATE VIEW public.order_totals AS
SELECT customer_id, total, order_count
FROM pgtrickle.order_totals;
4. Recreate indexes
Stream tables are regular heap tables — you can add indexes just like any other table. Recreate the indexes your queries depend on:
-- Before (on materialized view):
CREATE UNIQUE INDEX ON order_totals (customer_id);
-- After (on stream table):
CREATE INDEX ON pgtrickle.order_totals (customer_id);
Note: The `__pgt_row_id` column is the primary key on stream tables. You cannot add another primary key, but you can add regular or `UNIQUE` indexes on your business columns.
5. Remove the old materialized view
Once you've verified the stream table is working correctly:
DROP MATERIALIZED VIEW IF EXISTS public.order_totals;
6. Remove external refresh jobs
Delete any cron jobs, pg_cron entries, or application-level refresh triggers that were maintaining the old materialized view.
Migrating Concurrent Refresh Patterns
If you use REFRESH MATERIALIZED VIEW CONCURRENTLY (which requires a
unique index), the stream table equivalent is simpler — differential
refresh never blocks readers and doesn't require a unique index:
Before:
CREATE MATERIALIZED VIEW active_users AS
SELECT user_id, MAX(login_at) AS last_login
FROM logins
WHERE login_at > NOW() - INTERVAL '30 days'
GROUP BY user_id;
CREATE UNIQUE INDEX ON active_users (user_id);
REFRESH MATERIALIZED VIEW CONCURRENTLY active_users;
After:
SELECT pgtrickle.create_stream_table(
name => 'active_users',
query => 'SELECT user_id, MAX(login_at) AS last_login
FROM logins
WHERE login_at > NOW() - INTERVAL ''30 days''
GROUP BY user_id',
schedule => '1m'
);
-- No unique index needed. No manual refresh needed.
Migrating Cascading Materialized Views
If you have materialized views that depend on other materialized views, the migration is straightforward — pg_trickle handles dependency ordering automatically:
Before:
CREATE MATERIALIZED VIEW order_totals AS
SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id;
CREATE MATERIALIZED VIEW big_customers AS
SELECT customer_id, total FROM order_totals WHERE total > 1000;
-- Must refresh in order:
REFRESH MATERIALIZED VIEW order_totals;
REFRESH MATERIALIZED VIEW big_customers;
After:
SELECT pgtrickle.create_stream_table(
name => 'order_totals',
query => 'SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id',
schedule => '1m'
);
SELECT pgtrickle.create_stream_table(
name => 'big_customers',
query => 'SELECT customer_id, total FROM pgtrickle.order_totals WHERE total > 1000',
schedule => '1m'
);
-- Dependency ordering is automatic. No manual refresh needed.
Idempotent Deployment
For CI/CD pipelines, use create_or_replace_stream_table() so your
migration scripts are safe to re-run:
SELECT pgtrickle.create_or_replace_stream_table(
name => 'order_totals',
query => 'SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id',
schedule => '5m',
refresh_mode => 'DIFFERENTIAL'
);
Choosing the Right Refresh Mode
| Scenario | Mode |
|---|---|
| Most migrations | DIFFERENTIAL — only processes changes |
| Volatile functions (NOW(), RANDOM()) in the query | FULL — the query result changes even without source DML |
| Need real-time consistency within a transaction | IMMEDIATE |
| Unsure | AUTO (default) — pg_trickle picks the best mode per cycle |
Migration Checklist
- Identify all materialized views and their refresh schedules
- Create equivalent stream tables with matching queries
- Recreate any required indexes on the stream tables
- Update application queries to reference the `pgtrickle` schema
- Verify data correctness (compare stream table vs. materialized view)
- Remove external refresh jobs (cron, pg_cron)
- Drop the old materialized views
- Set up monitoring (Prometheus/Grafana or built-in views)
Further Reading
- Getting Started
- SQL Reference — create_stream_table()
- SQL Reference — create_or_replace_stream_table()
- FAQ — Materialized View vs Stream Table
Tutorial: Fuse Circuit Breaker
The fuse circuit breaker (v0.11.0+) suspends differential refreshes when the incoming change volume exceeds a threshold. This protects your database from runaway refresh cycles during bulk data loads, accidental mass-deletes, or migration scripts.
When to Use It
- Bulk ETL loads — loading millions of rows that would overwhelm a differential refresh
- Data migration scripts — large schema or data changes that temporarily spike the change buffer
- Protection against accidents — an errant `DELETE FROM orders` shouldn't silently cascade through all downstream stream tables
How It Works
Normal operation Fuse blows After reset
───────────────── ───────────────── ─────────────────
Source DML ──▶ CDC ──▶ Refresh Source DML ──▶ CDC ──▶ BLOCKED Source DML ──▶ CDC ──▶ Refresh
│ (resumed)
▼
NOTIFY alert
(fuse_blown)
- Each refresh cycle, the scheduler counts pending changes in the buffer.
- If the count exceeds `fuse_ceiling` for `fuse_sensitivity` consecutive cycles, the fuse blows.
- The stream table enters a paused state — no refreshes occur.
- A `fuse_blown` alert is emitted via `NOTIFY pg_trickle_alert`.
- An operator investigates and calls `reset_fuse()` to resume.
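The blow condition above — ceiling exceeded for a run of consecutive cycles — can be sketched as a small state machine. This is an illustrative model, not pg_trickle's internal implementation; the class and method names are hypothetical.

```python
class Fuse:
    """Illustrative sketch of the fuse blow logic: the fuse blows only after
    `sensitivity` consecutive scheduler cycles exceed `ceiling`."""

    def __init__(self, ceiling: int, sensitivity: int):
        self.ceiling = ceiling          # max pending changes per cycle
        self.sensitivity = sensitivity  # consecutive over-ceiling cycles required
        self.over_count = 0
        self.state = "armed"

    def observe_cycle(self, pending_changes: int) -> str:
        """Called once per scheduler cycle; returns the resulting fuse state."""
        if self.state == "blown":
            return self.state           # a blown fuse stays blown until reset
        if pending_changes > self.ceiling:
            self.over_count += 1
            if self.over_count >= self.sensitivity:
                self.state = "blown"
        else:
            self.over_count = 0         # the streak resets on any normal cycle
        return self.state

    def reset(self):
        """Mirrors reset_fuse(): re-arm and clear the streak."""
        self.over_count = 0
        self.state = "armed"
```

Note that a single over-ceiling cycle followed by a quiet one does not blow the fuse — the streak counter resets, which is what makes `fuse_sensitivity` a guard against transient spikes.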
Step-by-Step Example
1. Create a stream table with fuse protection
SELECT pgtrickle.create_stream_table(
name => 'category_summary',
query => 'SELECT category, COUNT(*) AS cnt, SUM(price) AS total
FROM products GROUP BY category',
schedule => '1m',
refresh_mode => 'DIFFERENTIAL'
);
-- Arm the fuse: blow when pending changes exceed 50,000 rows
SELECT pgtrickle.alter_stream_table(
'category_summary',
fuse => 'on',
fuse_ceiling => 50000,
fuse_sensitivity => 3 -- require 3 consecutive over-ceiling cycles
);
2. Observe normal operation
-- Insert a small batch — well under the ceiling
INSERT INTO products (name, category, price)
SELECT 'Product ' || i, 'Electronics', 9.99
FROM generate_series(1, 100) i;
-- After the next refresh cycle, the stream table is updated normally
SELECT * FROM pgtrickle.category_summary;
3. Trigger a bulk load
-- Simulate a large ETL load — 100,000 rows
INSERT INTO products (name, category, price)
SELECT 'Bulk ' || i, 'Imported', 4.99
FROM generate_series(1, 100000) i;
After fuse_sensitivity scheduler cycles (3 in our example), the fuse
blows. The stream table stops refreshing.
4. Inspect the fuse state
SELECT name, fuse_mode, fuse_state, fuse_ceiling, blown_at, blow_reason
FROM pgtrickle.fuse_status();
name | fuse_mode | fuse_state | fuse_ceiling | blown_at | blow_reason
-------------------+-----------+------------+--------------+----------------------------+---------------------------
category_summary | on | blown | 50000 | 2026-03-31 14:22:01.123+00 | change_count_exceeded
5. Decide how to recover
You have three options:
-- Option A: Apply the changes (process the bulk load normally)
SELECT pgtrickle.reset_fuse('category_summary', action => 'apply');
-- Option B: Skip the changes (discard the batch, resume from current state)
SELECT pgtrickle.reset_fuse('category_summary', action => 'skip_changes');
-- Option C: Reinitialize (full rebuild from the defining query)
SELECT pgtrickle.reset_fuse('category_summary', action => 'reinitialize');
After resetting, the fuse returns to 'armed' state and the scheduler
resumes.
Fuse Modes
| Mode | Behavior |
|---|---|
| `'off'` | No fuse protection (default) |
| `'on'` | Always armed — blows when changes exceed `fuse_ceiling` |
| `'auto'` | Blows only when a FULL refresh would be cheaper than DIFFERENTIAL |
'auto' mode is recommended for most use cases — it protects against
bulk loads while allowing large-but-efficient differential refreshes to
proceed.
Using with dbt
In dbt models, configure the fuse via the stream_table materialization:
-- models/marts/category_summary.sql
{{ config(
materialized='stream_table',
schedule='5m',
refresh_mode='DIFFERENTIAL',
fuse='auto',
fuse_ceiling=50000,
fuse_sensitivity=3
) }}
SELECT category, COUNT(*) AS cnt, SUM(price) AS total
FROM {{ source('raw', 'products') }}
GROUP BY category
Global Defaults
Set a cluster-wide default ceiling via the pg_trickle.fuse_default_ceiling
GUC. Stream tables with fuse_ceiling = NULL inherit this value:
ALTER SYSTEM SET pg_trickle.fuse_default_ceiling = 100000;
SELECT pg_reload_conf();
Monitoring
- `pgtrickle.fuse_status()` — inspect fuse state for all stream tables
- `LISTEN pg_trickle_alert` — receive real-time `fuse_blown` notifications
- `pgtrickle.dedup_stats()` — includes fuse-related counters
- `pgtrickle.pgt_stream_tables.fuse_state` — direct catalog query
Further Reading
- SQL Reference — fuse_status()
- SQL Reference — reset_fuse()
- Configuration — fuse_default_ceiling
- Tutorial: ETL & Bulk Load Patterns
Tutorial: Tiered Scheduling
Tiered scheduling (v0.12.0+) lets you assign refresh priorities to stream tables using four tiers: Hot, Warm, Cold, and Frozen. This reduces CPU and I/O overhead by refreshing less-critical tables less frequently.
When to Use It
- You have many stream tables (50+) and want to reduce scheduler load
- Some tables power real-time dashboards (need hot refresh) while others serve weekly reports (can be cold)
- You want to freeze tables during maintenance windows without dropping them
Tier Overview
| Tier | Multiplier | Effect |
|---|---|---|
| `hot` | 1× | Refresh at the configured schedule (default) |
| `warm` | 2× | Refresh at 2× the configured interval |
| `cold` | 10× | Refresh at 10× the configured interval |
| `frozen` | skip | Never refreshed until manually promoted |
For a stream table with schedule => '1m':
| Tier | Effective Interval |
|---|---|
| hot | 1 minute |
| warm | 2 minutes |
| cold | 10 minutes |
| frozen | never |
Note: Cron-based schedules are not affected by the tier multiplier. They always fire at the configured cron time.
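The multiplier arithmetic is simple enough to sketch directly. This is an illustrative model of the table above (duration schedules only — per the note, cron schedules ignore the multiplier); the function name is hypothetical.

```python
# Tier multipliers as described above; 'frozen' has no finite interval.
TIER_MULTIPLIER = {"hot": 1, "warm": 2, "cold": 10}

def effective_interval_seconds(schedule_seconds: int, tier: str):
    """Return the effective refresh interval for a duration-based schedule,
    or None for a frozen table (never refreshed until promoted)."""
    if tier == "frozen":
        return None
    return schedule_seconds * TIER_MULTIPLIER[tier]
```

For example, the `customer_lifetime_value` table later in this tutorial has a 5-minute schedule, so demoting it to `cold` yields a 50-minute effective interval.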
Step-by-Step Example
1. Enable tiered scheduling
Tiered scheduling is enabled by default since v0.12.0. Verify:
SHOW pg_trickle.tiered_scheduling;
-- Should return: on
2. Create stream tables with different priorities
-- Real-time dashboard — stays hot (default)
SELECT pgtrickle.create_stream_table(
name => 'live_order_count',
query => 'SELECT COUNT(*) AS total FROM orders WHERE status = ''active''',
schedule => '30s'
);
-- Important but not latency-critical
SELECT pgtrickle.create_stream_table(
name => 'daily_revenue',
query => 'SELECT DATE_TRUNC(''day'', created_at) AS day, SUM(amount) AS revenue
FROM orders GROUP BY 1',
schedule => '1m'
);
-- Weekly report — rarely queried
SELECT pgtrickle.create_stream_table(
name => 'customer_lifetime_value',
query => 'SELECT customer_id, SUM(amount) AS lifetime_value
FROM orders GROUP BY customer_id',
schedule => '5m'
);
3. Assign tiers
-- live_order_count stays at 'hot' (default) — refreshes every 30s
-- daily_revenue: 2× multiplier → effective interval = 2 minutes
SELECT pgtrickle.alter_stream_table('daily_revenue', tier => 'warm');
-- customer_lifetime_value: 10× multiplier → effective interval = 50 minutes
SELECT pgtrickle.alter_stream_table('customer_lifetime_value', tier => 'cold');
4. Verify effective schedules
SELECT pgt_name, schedule, refresh_tier,
CASE refresh_tier
WHEN 'hot' THEN schedule
WHEN 'warm' THEN schedule || ' ×2'
WHEN 'cold' THEN schedule || ' ×10'
WHEN 'frozen' THEN 'never'
END AS effective
FROM pgtrickle.pgt_stream_tables
ORDER BY refresh_tier;
5. Freeze a table during maintenance
-- Freeze before a schema migration
SELECT pgtrickle.alter_stream_table('customer_lifetime_value', tier => 'frozen');
-- ... perform migration ...
-- Promote back when ready
SELECT pgtrickle.alter_stream_table('customer_lifetime_value', tier => 'warm');
Choosing the Right Tier
| Use Case | Recommended Tier |
|---|---|
| Real-time dashboards, alerting tables | hot |
| Operational reports queried hourly | warm |
| Weekly/monthly analytics, batch consumers | cold |
| Tables under maintenance, seasonal reports | frozen |
Rules of thumb:
- Start with everything at hot (the default). Move tables to warm or cold as you identify which ones can tolerate more staleness.
- Warm halves the refresh CPU cost compared to hot.
- Cold reduces refresh overhead by 90%.
- Use frozen sparingly — changes accumulate in the buffer and will be processed when you promote the table back.
Monitoring Tiers
-- Check which tables are in which tier
SELECT pgt_name, refresh_tier, status, staleness
FROM pgtrickle.stream_tables_info
ORDER BY refresh_tier, staleness DESC;
-- Find frozen tables (these are NOT being refreshed)
SELECT pgt_name, refresh_tier
FROM pgtrickle.pgt_stream_tables
WHERE refresh_tier = 'frozen';
Troubleshooting
All tables are frozen and nothing is refreshing:
If every stream table is set to frozen, the scheduler has nothing to do.
Promote at least one table back to hot or warm.
Staleness exceeds expectations for cold tables:
Remember that cold applies a 10× multiplier. A 5-minute schedule becomes
a 50-minute effective interval. If this is too stale, use warm instead.
Further Reading
Tutorial: Tuning Refresh Mode
This tutorial walks you through using pg_trickle's built-in diagnostics to determine whether your stream tables are running in the most efficient refresh mode (FULL vs DIFFERENTIAL), and how to act on the recommendations.
Prerequisites
- pg_trickle v0.14.0 or later
- At least one stream table with several completed refresh cycles (the diagnostics become more accurate with more history)
Step 1: Check Current Refresh Efficiency
Start by reviewing how your stream tables are performing with their current refresh mode:
SELECT pgt_name, refresh_mode, diff_count, full_count,
avg_diff_ms, avg_full_ms, diff_speedup
FROM pgtrickle.refresh_efficiency();
Example output:
| pgt_name | refresh_mode | diff_count | full_count | avg_diff_ms | avg_full_ms | diff_speedup |
|---|---|---|---|---|---|---|
| order_totals | DIFFERENTIAL | 142 | 3 | 12.4 | 850.2 | 68.6x |
| user_stats | FULL | 0 | 145 | — | 320.1 | — |
| daily_metrics | DIFFERENTIAL | 98 | 47 | 425.8 | 410.3 | 1.0x |
Key observations:
- order_totals: DIFFERENTIAL is 68× faster — this is a great fit.
- user_stats: Running in FULL mode with no DIFFERENTIAL history — worth checking if DIFFERENTIAL would be faster.
- daily_metrics: DIFFERENTIAL and FULL take about the same time (1.0× speedup). FULL might actually be simpler and more predictable here.
Step 2: Get Recommendations
Use recommend_refresh_mode() to get weighted-signal recommendations:
SELECT pgt_name, current_mode, recommended_mode, confidence, reason
FROM pgtrickle.recommend_refresh_mode();
Example output:
| pgt_name | current_mode | recommended_mode | confidence | reason |
|---|---|---|---|---|
| order_totals | DIFFERENTIAL | KEEP | high | DIFFERENTIAL is 68.6× faster than FULL with low latency variance |
| user_stats | FULL | DIFFERENTIAL | medium | Query is simple (no complex joins), change ratio is low (2.1%), target table is large |
| daily_metrics | DIFFERENTIAL | FULL | medium | DIFFERENTIAL shows no speedup over FULL (1.0×); high latency variance (p95/p50 = 4.2) suggests unstable performance |
For a single table with full signal details:
SELECT recommended_mode, confidence, reason,
jsonb_pretty(signals) AS signal_details
FROM pgtrickle.recommend_refresh_mode('daily_metrics');
Step 3: Understand the Signals
The signals JSONB column contains the detailed breakdown of all seven
weighted signals that contributed to the recommendation:
{
"composite_score": -0.22,
"signals": [
{ "name": "change_ratio_avg", "score": -0.1, "weight": 0.30 },
{ "name": "empirical_timing", "score": -0.3, "weight": 0.35 },
{ "name": "change_ratio_current", "score": -0.2, "weight": 0.25 },
{ "name": "query_complexity", "score": 0.0, "weight": 0.10 },
{ "name": "target_size", "score": 0.1, "weight": 0.10 },
{ "name": "index_coverage", "score": 0.0, "weight": 0.05 },
{ "name": "latency_variance", "score": -0.4, "weight": 0.05 }
]
}
Positive scores favour DIFFERENTIAL; negative scores favour FULL. A composite score above +0.15 recommends DIFFERENTIAL; below −0.15 recommends FULL; in between, the current mode is near-optimal (KEEP).
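The threshold logic can be sketched as follows. This is an illustrative model of the composite-score bands described above — the actual scoring pipeline may normalize or round differently (the example JSON's `composite_score` is not a plain weighted sum of the shown signals), so treat the function as a sketch, not the extension's exact arithmetic.

```python
def recommend(signals):
    """Combine per-signal scores by weight and map the composite onto the
    recommendation bands: > +0.15 → DIFFERENTIAL, < −0.15 → FULL, else KEEP."""
    composite = sum(s["score"] * s["weight"] for s in signals)
    if composite > 0.15:
        return "DIFFERENTIAL", composite
    if composite < -0.15:
        return "FULL", composite
    return "KEEP", composite
```

A composite of −0.22, as in the `daily_metrics` example, falls below the −0.15 band and therefore recommends FULL.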
Confidence levels:
| Level | Meaning |
|---|---|
| `high` | 10+ completed refresh cycles; strong signal agreement |
| `medium` | 5–10 cycles or mixed signals |
| `low` | Fewer than 5 cycles; recommendation is speculative |
Step 4: Apply the Recommendation
If you decide to follow a recommendation, use ALTER STREAM TABLE:
-- Switch daily_metrics from DIFFERENTIAL to FULL
SELECT pgtrickle.alter_stream_table('daily_metrics',
refresh_mode => 'FULL'
);
Or switch a table to DIFFERENTIAL:
-- Switch user_stats to DIFFERENTIAL mode
SELECT pgtrickle.alter_stream_table('user_stats',
refresh_mode => 'DIFFERENTIAL'
);
The change takes effect on the next refresh cycle. No data is lost during the transition.
Step 5: Monitor After the Change
After switching modes, wait for several refresh cycles and re-check:
-- Wait a few minutes, then re-check efficiency
SELECT pgt_name, refresh_mode, diff_count, full_count,
avg_diff_ms, avg_full_ms, diff_speedup
FROM pgtrickle.refresh_efficiency()
WHERE pgt_name = 'daily_metrics';
Run the recommendation function again to verify the change was beneficial:
SELECT recommended_mode, confidence, reason
FROM pgtrickle.recommend_refresh_mode('daily_metrics');
If the recommendation now says KEEP, the new mode is working well.
Common Scenarios
High-cardinality aggregates
Stream tables with SUM/COUNT/AVG over high-cardinality GROUP BY keys
(1000+ groups) are almost always better in DIFFERENTIAL mode. pg_trickle
warns about low-cardinality groups at creation time (DIAG-2).
Small tables with frequent full rewrites
If the source table is small (< 10,000 rows) and changes affect > 30% of rows per cycle, FULL refresh is often faster because it avoids the overhead of change tracking and delta application.
Complex multi-join queries
Queries with 4+ JOINs may have high DIFFERENTIAL overhead due to the
delta propagation rules. If diff_speedup is below 2×, consider FULL mode.
Tables with volatile functions
Stream tables using volatile functions (e.g., now(), random()) must use
FULL mode. pg_trickle rejects volatile functions in DIFFERENTIAL mode at
creation time.
Using the TUI
The pgtrickle TUI provides a visual diagnostics panel. Press 5 or d
in the interactive dashboard to open the diagnostics view, which shows
recommendations with confidence levels for all stream tables at a glance.
From the CLI:
# Show recommendations for all tables
pgtrickle diag
# Show recommendations in JSON format (for automation)
pgtrickle diag --format json
See Also
- SQL Reference: recommend_refresh_mode() — Full function documentation
- SQL Reference: refresh_efficiency() — Efficiency metrics documentation
- Configuration: agg_diff_cardinality_threshold — Cardinality warning threshold
- DVM Operators — Full operator support matrix
Tutorial: Circular Dependencies
pg_trickle supports circular (cyclic) stream table dependencies (v0.7.0+) for queries that use only monotone operators. The scheduler groups circular dependencies into Strongly Connected Components (SCCs) and iterates them to a fixed point.
When to Use It
- Transitive closure — computing all reachable nodes in a graph
- Graph reachability — finding all paths between nodes
- Iterative convergence — mutual dependencies that stabilize after a few iterations
Prerequisites
Circular dependencies are disabled by default. Enable them:
SET pg_trickle.allow_circular = true;
Monotone Operator Requirement
Only monotone operators are allowed in circular dependency chains. Monotone operators guarantee convergence — the result set grows (or stays the same) with each iteration until a fixed point is reached.
| Allowed (Monotone) | Blocked (Non-Monotone) |
|---|---|
| Joins (INNER, LEFT, RIGHT, FULL) | Aggregates (SUM, COUNT, etc.) |
| Filters (WHERE) | EXCEPT |
| Projections (SELECT) | Window functions |
| UNION ALL | NOT EXISTS / NOT IN |
| INTERSECT | |
| EXISTS | |
Creating a circular dependency with non-monotone operators is rejected
with a clear error message, regardless of the allow_circular setting.
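The validation described above amounts to scanning the cycle's operator tree for any non-monotone operator. A minimal sketch, using hypothetical operator labels (pg_trickle's internal representation will differ):

```python
# Operator sets mirroring the allowed/blocked table above (labels are illustrative).
MONOTONE = {"INNER_JOIN", "LEFT_JOIN", "RIGHT_JOIN", "FULL_JOIN",
            "FILTER", "PROJECT", "UNION_ALL", "INTERSECT", "EXISTS"}
NON_MONOTONE = {"AGGREGATE", "EXCEPT", "WINDOW", "NOT_EXISTS", "NOT_IN"}

def validate_cycle(operators):
    """Return (ok, offender): a single non-monotone operator anywhere in the
    cycle rejects it, since it would break fixed-point convergence."""
    for op in operators:
        if op in NON_MONOTONE:
            return False, op
    return True, None
```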
Step-by-Step Example: Transitive Closure
Suppose you have a graph of relationships:
CREATE TABLE edges (src INT, dst INT);
INSERT INTO edges VALUES
(1, 2), (2, 3), (3, 4), (4, 5),
(1, 3), (2, 5);
1. Create the base reachability table
-- Direct edges: all nodes directly connected
SELECT pgtrickle.create_stream_table(
name => 'reachable_direct',
query => 'SELECT src, dst FROM edges',
schedule => '1m',
refresh_mode => 'DIFFERENTIAL'
);
2. Create the transitive closure with a self-reference
-- Transitive closure: if A→B and B→C, then A→C
-- This creates a circular dependency (reachable depends on itself via the join)
SELECT pgtrickle.create_stream_table(
name => 'reachable',
query => 'SELECT DISTINCT r1.src, r2.dst
FROM pgtrickle.reachable_direct r1
JOIN pgtrickle.reachable_direct r2 ON r1.dst = r2.src
UNION ALL
SELECT src, dst FROM edges',
schedule => '1m',
refresh_mode => 'DIFFERENTIAL'
);
Note: This example uses the `reachable_direct` table for the join rather than self-referencing `reachable` directly. For a true self-referencing cycle, pg_trickle detects the SCC and iterates.
3. Observe the fixed-point iteration
When the scheduler processes an SCC, it iterates until no new rows are produced (the fixed point):
-- Check SCC status
SELECT * FROM pgtrickle.pgt_scc_status();
Output:
scc_id | members | iteration | converged
--------+----------------------------------+-----------+-----------
1 | {reachable_direct,reachable} | 3 | true
4. Add new edges and watch convergence
INSERT INTO edges VALUES (5, 1); -- creates a cycle in the graph
On the next refresh cycle, the scheduler re-iterates the SCC until the transitive closure stabilizes with the new edge.
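The fixed-point iteration can be illustrated in miniature with a semi-naïve transitive closure over the example `edges` data: each round joins only the newly derived pairs against the base edges, stopping when a round produces nothing new. This is a sketch of the general technique, not pg_trickle's scheduler code.

```python
def transitive_closure(edges, max_iterations=100):
    """Semi-naive fixpoint: join the delta (newly derived pairs) with the base
    edges each round; the fixed point is reached when the delta is empty.
    `max_iterations` mirrors the role of pg_trickle.max_fixpoint_iterations."""
    closure = set(edges)
    delta = set(edges)
    iterations = 0
    while delta and iterations < max_iterations:
        iterations += 1
        # (a, b) in delta and (b, c) in edges derive (a, c)
        derived = {(a, c) for (a, b) in delta for (b2, c) in edges if b == b2}
        delta = derived - closure   # keep only genuinely new pairs
        closure |= delta
    return closure, iterations
```

On the six example edges this converges in two rounds to ten reachable pairs; adding the cycle-creating edge `(5, 1)` would simply trigger more rounds until the closure stabilizes again.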
Monitoring SCCs
-- View all SCCs and their convergence status
SELECT * FROM pgtrickle.pgt_scc_status();
-- Check which stream tables belong to which SCC
SELECT pgt_name, scc_id
FROM pgtrickle.pgt_stream_tables
WHERE scc_id IS NOT NULL;
Controlling Iteration Limits
The pg_trickle.max_fixpoint_iterations GUC limits how many iterations
the scheduler attempts before declaring non-convergence:
-- Default: 100 (generous headroom)
SHOW pg_trickle.max_fixpoint_iterations;
-- Lower it for fast-converging workloads
SET pg_trickle.max_fixpoint_iterations = 20;
If convergence is not reached within the limit, all SCC members are marked
as ERROR. This prevents runaway infinite loops.
Limitations
- Non-monotone operators are always rejected — aggregates, EXCEPT, window functions, and NOT EXISTS/NOT IN cannot appear in circular chains because they prevent convergence.
- Performance scales with iteration count — each iteration runs a full differential refresh cycle for all SCC members. Keep cycles small.
- All SCC members must use DIFFERENTIAL mode — FULL and IMMEDIATE modes are not supported for circular dependencies.
Further Reading
- Configuration — pg_trickle.allow_circular
- Configuration — pg_trickle.max_fixpoint_iterations
- SQL Reference — pgt_scc_status()
- FAQ — Cycle Detection
Tutorial: Monitoring & Alerting
This guide consolidates all pg_trickle monitoring capabilities into a single reference: built-in SQL views, NOTIFY-based alerts, and the Prometheus/Grafana observability stack.
Quick Health Check
The fastest way to verify pg_trickle is healthy:
SELECT * FROM pgtrickle.health_check() WHERE severity != 'OK';
If this returns no rows, everything is working. Any WARN or ERROR rows
tell you where to investigate.
Built-in Monitoring Views
Stream table status
-- Overview: name, status, mode, staleness
SELECT name, status, refresh_mode, staleness, stale
FROM pgtrickle.stream_tables_info;
-- Detailed stats: refresh counts, duration, error streaks
SELECT pgt_name, total_refreshes, avg_duration_ms, consecutive_errors, stale
FROM pgtrickle.pg_stat_stream_tables;
-- Live status with error counts
SELECT * FROM pgtrickle.pgt_status();
Refresh history
-- Last 10 refreshes for a specific stream table
SELECT start_time, action, status, duration_ms, rows_inserted, rows_deleted, error_message
FROM pgtrickle.get_refresh_history('order_totals', 10);
-- Global refresh timeline (last 20 events across all stream tables)
SELECT start_time, stream_table, action, status, duration_ms, error_message
FROM pgtrickle.refresh_timeline(20);
-- Aggregate refresh statistics
SELECT * FROM pgtrickle.st_refresh_stats();
CDC pipeline health
-- Per-source CDC mode, WAL lag, and alerts
SELECT * FROM pgtrickle.check_cdc_health();
-- Change buffer sizes (pending changes not yet consumed)
SELECT stream_table, source_table, cdc_mode, pending_rows, buffer_bytes
FROM pgtrickle.change_buffer_sizes()
ORDER BY pending_rows DESC;
-- Verify all CDC triggers are installed and enabled
SELECT source_table, trigger_type, trigger_name
FROM pgtrickle.trigger_inventory()
WHERE NOT present OR NOT enabled;
Dependencies
-- ASCII tree view of the entire dependency graph
SELECT tree_line, status, refresh_mode
FROM pgtrickle.dependency_tree();
-- Diamond consistency groups
SELECT * FROM pgtrickle.diamond_groups();
Fuse circuit breaker
-- Check fuse state for all stream tables
SELECT name, fuse_mode, fuse_state, fuse_ceiling, blown_at
FROM pgtrickle.fuse_status();
Parallel workers
-- Worker pool status (when parallel_refresh_mode = 'on')
SELECT * FROM pgtrickle.worker_pool_status();
-- Recent parallel job history
SELECT job_id, unit_key, status, duration_ms
FROM pgtrickle.parallel_job_status(60);
NOTIFY-Based Alerting
pg_trickle emits real-time events via PostgreSQL's NOTIFY system:
LISTEN pg_trickle_alert;
Event Types
| Event | Trigger | Severity |
|---|---|---|
| `stale_data` | Scheduler is behind — view is genuinely out of date | Warning |
| `no_upstream_changes` | Scheduler is healthy but source tables have had no writes — view is correct | Info |
| `auto_suspended` | Stream table suspended after max consecutive errors | Critical |
| `resumed` | Stream table resumed after suspension | Info |
| `reinitialize_needed` | Upstream DDL change detected | Warning |
| `buffer_growth_warning` | Change buffer growing unexpectedly | Warning |
| `slot_lag_warning` | WAL replication slot retaining excessive data | Warning |
| `fuse_blown` | Circuit breaker tripped | Warning |
| `refresh_completed` | Refresh completed successfully | Info |
| `refresh_failed` | Refresh failed | Error |
| `diamond_partial_failure` | One member of an atomic diamond group failed | Warning |
| `scheduler_falling_behind` | Refresh duration approaching the schedule interval | Warning |
| `spill_threshold_exceeded` | Delta MERGE spilled to temp files for consecutive refreshes, forcing FULL | Warning |
Notification Payload
Each notification carries a JSON payload:
{
"event": "auto_suspended",
"stream_table": "order_totals",
"consecutive_errors": 3,
"last_error": "column \"deleted_column\" does not exist",
"timestamp": "2026-03-31T14:22:01.123Z"
}
Bridging to External Systems
To forward NOTIFY events to external alerting systems (PagerDuty, Slack, OpsGenie), use a listener process:
# Example: Python listener using psycopg
import psycopg
import json
conn = psycopg.connect("postgresql://user:pass@host/db", autocommit=True)
conn.execute("LISTEN pg_trickle_alert")
for notify in conn.notifies():
payload = json.loads(notify.payload)
event = payload["event"]
# no_upstream_changes is informational — source tables are quiet but healthy.
# Only page on actionable events.
if event in ("auto_suspended", "fuse_blown", "refresh_failed"):
send_to_pagerduty(payload)
elif event == "stale_data": # scheduler itself is falling behind
send_to_pagerduty(payload)
Prometheus & Grafana Stack
For production deployments, use the pre-built observability stack in the
monitoring/ directory:
cd monitoring/
docker compose up -d
This gives you:
- Prometheus scraping pg_trickle metrics via postgres_exporter
- Grafana with a pre-provisioned dashboard
- Alerting rules for staleness, errors, CDC lag, and scheduler health
See Prometheus & Grafana Integration for full setup details.
Diagnostic Workflow
When something is wrong, follow this systematic workflow:
Step 1 — Global health
SELECT * FROM pgtrickle.health_check() WHERE severity != 'OK';
Step 2 — Status and staleness
SELECT name, status, consecutive_errors, staleness
FROM pgtrickle.pgt_status()
ORDER BY staleness DESC NULLS FIRST;
Step 3 — Recent refresh activity
SELECT start_time, stream_table, action, status, error_message
FROM pgtrickle.refresh_timeline(20);
Step 4 — Error details for a specific stream table
SELECT * FROM pgtrickle.diagnose_errors('my_stream_table');
Step 5 — CDC pipeline
SELECT stream_table, source_table, pending_rows, buffer_bytes
FROM pgtrickle.change_buffer_sizes()
ORDER BY pending_rows DESC;
Step 6 — Trigger verification
SELECT source_table, trigger_type, trigger_name
FROM pgtrickle.trigger_inventory()
WHERE NOT present OR NOT enabled;
Common Alert Responses
| Alert | Likely Cause | Action |
|---|---|---|
| `stale_data` | Scheduler behind, long refresh, or lock contention | Check `pgt_status()` and `refresh_timeline()` |
| `auto_suspended` | Repeated refresh failures | Fix root cause, then `resume_stream_table()` |
| `fuse_blown` | Bulk load exceeded fuse ceiling | Investigate, then `reset_fuse()` |
| `buffer_growth_warning` | Scheduler not consuming buffers fast enough | Check scheduler status and refresh errors |
| `reinitialize_needed` | Source table DDL changed | Verify schema compatibility; scheduler handles automatically |
Further Reading
- Prometheus & Grafana Integration
- SQL Reference — Monitoring Functions
- Configuration Reference
- FAQ — Troubleshooting
Tutorial: ETL & Bulk Load Patterns
pg_trickle provides source gating (v0.5.0+) and watermark gating (v0.7.0+) to coordinate stream table refreshes with ETL pipelines and bulk data loads. This tutorial covers common patterns for pausing refreshes during loads and resuming them safely afterward.
The Problem
When you bulk-load data into a source table (e.g., a nightly ETL job), the change buffer fills rapidly. Without coordination:
- A differential refresh mid-load sees a partial batch, producing incomplete results
- The adaptive fallback may trigger repeated FULL refreshes during the load
- The fuse circuit breaker may blow, requiring manual intervention
Source gating solves this by telling pg_trickle to skip refreshes for gated sources until the load completes.
Recipe 1 — Single Source Bulk Load
The simplest pattern: gate the source, load data, ungate.
-- 1. Gate the source table — all dependent stream tables pause
SELECT pgtrickle.gate_source('public.orders');
-- 2. Perform the bulk load
COPY orders FROM '/data/orders_20260331.csv' WITH (FORMAT csv, HEADER);
-- or: INSERT INTO orders SELECT ... FROM staging_orders;
-- 3. Ungate — stream tables resume and process the full batch
SELECT pgtrickle.ungate_source('public.orders');
While gated, the scheduler skips all stream tables that depend on the gated source. Changes still accumulate in the CDC buffer and are processed in a single batch after ungating.
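The skip-but-accumulate behavior can be sketched as follows. This is an illustrative model of the semantics just described — class and method names are hypothetical, not pg_trickle's API.

```python
class GatedScheduler:
    """Sketch: a refresh is skipped while ANY of the stream table's sources is
    gated; CDC changes keep accumulating and drain in one batch after ungating."""

    def __init__(self):
        self.gated = set()   # gated source table names
        self.buffers = {}    # stream table name -> pending change count

    def gate(self, source):
        self.gated.add(source)

    def ungate(self, source):
        self.gated.discard(source)

    def record_change(self, stream_table, n=1):
        # CDC capture continues regardless of gating
        self.buffers[stream_table] = self.buffers.get(stream_table, 0) + n

    def try_refresh(self, stream_table, sources):
        if any(s in self.gated for s in sources):
            return "skipped"                     # buffered changes are retained
        pending = self.buffers.pop(stream_table, 0)
        return f"refreshed {pending} changes"    # the whole batch drains at once
```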
Recipe 2 — Coordinated Multi-Source Load
When your ETL loads multiple tables that feed into the same stream table:
-- Gate all sources involved in the load
SELECT pgtrickle.gate_source('public.orders');
SELECT pgtrickle.gate_source('public.customers');
SELECT pgtrickle.gate_source('public.products');
-- Load all tables
COPY orders FROM '/data/orders.csv' WITH (FORMAT csv, HEADER);
COPY customers FROM '/data/customers.csv' WITH (FORMAT csv, HEADER);
COPY products FROM '/data/products.csv' WITH (FORMAT csv, HEADER);
-- Ungate all at once — stream tables see a consistent snapshot
SELECT pgtrickle.ungate_source('public.orders');
SELECT pgtrickle.ungate_source('public.customers');
SELECT pgtrickle.ungate_source('public.products');
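If the gate calls run as separate statements, there is a brief window in which only some sources are gated and a refresh could fire against an inconsistent set. One way to avoid this, assuming `gate_source()` is an ordinary transactional function (an assumption worth verifying), is to issue all gates in a single transaction so they become visible atomically:

```sql
-- Gate all sources atomically so no refresh sees a partially gated set
BEGIN;
SELECT pgtrickle.gate_source('public.orders');
SELECT pgtrickle.gate_source('public.customers');
SELECT pgtrickle.gate_source('public.products');
COMMIT;
```

The same pattern applies to the ungate calls at the end of the load.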
Recipe 3 — Gate + Deferred Stream Table Creation
For initial deployments where data must be loaded before stream tables are created:
-- 1. Gate the source before any stream tables exist
SELECT pgtrickle.gate_source('public.orders');
-- 2. Load the initial data
COPY orders FROM '/data/historical_orders.csv' WITH (FORMAT csv, HEADER);
-- 3. Create stream tables — they won't refresh yet (source is gated)
SELECT pgtrickle.create_stream_table(
name => 'order_totals',
query => 'SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id',
schedule => '1m'
);
-- 4. Ungate — the first refresh processes all data cleanly
SELECT pgtrickle.ungate_source('public.orders');
Recipe 4 — Nightly Batch Pattern
A common production pattern using a scheduled batch job:
-- Run nightly at 02:00 UTC
-- Step 1: Gate all ETL sources
DO $$
DECLARE
src TEXT;
BEGIN
FOR src IN SELECT DISTINCT source_table
FROM pgtrickle.list_sources('daily_report')
LOOP
PERFORM pgtrickle.gate_source(src);
END LOOP;
END;
$$;
-- Step 2: Run the ETL pipeline
CALL etl.load_daily_data();
-- Step 3: Ungate all sources
DO $$
DECLARE
gated RECORD;
BEGIN
FOR gated IN SELECT source_name FROM pgtrickle.source_gates()
WHERE is_gated = true
LOOP
PERFORM pgtrickle.ungate_source(gated.source_name);
END LOOP;
END;
$$;
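One gap in the pattern above: if `etl.load_daily_data()` raises, step 3 never runs and the sources stay gated until someone notices. A defensive sketch, assuming the procedure does not manage its own transactions, is to trap the failure and downgrade it to a warning so the script continues to the ungate step:

```sql
-- Run the ETL step with a guarantee that the ungate step still executes.
DO $$
BEGIN
  CALL etl.load_daily_data();
EXCEPTION WHEN OTHERS THEN
  -- Re-raising here would abort the script before the ungate step,
  -- so surface the failure as a warning instead and alert out-of-band.
  RAISE WARNING 'ETL load failed: %', SQLERRM;
END;
$$;

-- Always ungate, whether or not the load succeeded
DO $$
DECLARE
  gated RECORD;
BEGIN
  FOR gated IN SELECT source_name FROM pgtrickle.source_gates()
               WHERE is_gated
  LOOP
    PERFORM pgtrickle.ungate_source(gated.source_name);
  END LOOP;
END;
$$;
```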
Monitoring During a Gated Load
While sources are gated, verify the gate status:
-- Check which sources are currently gated
SELECT * FROM pgtrickle.source_gates();
-- Bootstrap gate status (v0.6.0+)
SELECT * FROM pgtrickle.bootstrap_gate_status();
Combining with the Fuse Circuit Breaker
For extra safety, combine gating with the fuse circuit breaker:
-- Arm the fuse as a safety net
SELECT pgtrickle.alter_stream_table('order_totals',
fuse => 'on',
fuse_ceiling => 500000
);
-- Gate for controlled loads
SELECT pgtrickle.gate_source('public.orders');
-- ... load data ...
SELECT pgtrickle.ungate_source('public.orders');
-- The fuse catches any unexpected bulk changes outside the gated window
Watermark Gating (v0.7.0+)
Watermark gating extends source gating with LSN-based coordination for more precise control:
-- Set a watermark — refreshes only consume changes up to this LSN
SELECT pgtrickle.set_watermark('public.orders', pg_current_wal_lsn());
-- Load new data (changes accumulate beyond the watermark)
COPY orders FROM '/data/new_orders.csv' WITH (FORMAT csv, HEADER);
-- Advance the watermark to include the new data
SELECT pgtrickle.advance_watermark('public.orders', pg_current_wal_lsn());
-- Or clear the watermark entirely
SELECT pgtrickle.clear_watermark('public.orders');
See the SQL Reference — Watermark Gating for the complete API.
Further Reading
- SQL Reference — Bootstrap Source Gating
- SQL Reference — Watermark Gating
- SQL Reference — ETL Coordination Cookbook
- Tutorial: Fuse Circuit Breaker
Tutorial: Migrating from pg_ivm to pg_trickle
This guide walks through migrating existing pg_ivm IMMVs (Incrementally
Maintained Materialized Views) to pg_trickle stream tables. It covers API
mapping, behavioral differences, and a step-by-step migration checklist.
See also: plans/ecosystem/GAP_PG_IVM_COMPARISON.md for the full feature comparison and gap analysis between the two extensions.
Why Migrate?
| | pg_ivm (IMMV) | pg_trickle (Stream Table) |
|---|---|---|
| Maintenance model | Immediate only (in-transaction) | Deferred (scheduler) and Immediate |
| Aggregate functions | 5 (COUNT, SUM, AVG, MIN, MAX) | 60+ (all built-in + user-defined) |
| Window functions | Not supported | Full support |
| CTEs (recursive) | Not supported | Semi-naive, DRed, recomputation |
| Subqueries | Very limited | Full (EXISTS, NOT EXISTS, IN, LATERAL, scalar) |
| Set operations | Not supported | UNION, INTERSECT, EXCEPT (bag + set) |
| HAVING clause | Not supported | Supported |
| GROUPING SETS / CUBE / ROLLUP | Not supported | Auto-rewritten to UNION ALL |
| DISTINCT ON | Not supported | Auto-rewritten to ROW_NUMBER |
| Views as sources | Not supported | Auto-inlined |
| Cascading views | Not supported | DAG-aware topological scheduling |
| Background scheduling | None (manual only) | Native cron, duration, CALCULATED |
| Monitoring | 1 catalog table | 15+ diagnostic functions |
| Concurrency | ExclusiveLock during maintenance | Advisory locks, non-blocking reads |
| Parallel refresh | Not supported | Worker pool with caps |
Concept Mapping
| pg_ivm Concept | pg_trickle Equivalent | Notes |
|---|---|---|
| IMMV (Incrementally Maintained Materialized View) | Stream table | Same idea — a query result kept incrementally up to date |
| pgivm.create_immv(name, query) | pgtrickle.create_stream_table(name, query) | pg_trickle adds optional schedule and refresh_mode parameters |
| pgivm.refresh_immv(name, true) | pgtrickle.refresh_stream_table(name) | Manual refresh |
| pgivm.refresh_immv(name, false) | No direct equivalent | pg_trickle has pgtrickle.alter_stream_table(name, enabled => false) to suspend |
| pgivm.pg_ivm_immv catalog | pgtrickle.pgt_stream_tables | Plus pgt_status(), refresh_timeline(), etc. |
| DROP TABLE immv_name | pgtrickle.drop_stream_table(name) | Stream tables must be dropped via the API |
| ALTER TABLE immv RENAME TO ... | pgtrickle.alter_stream_table(old, name => new) | Rename via API |
| In-transaction maintenance (AFTER row triggers) | refresh_mode => 'IMMEDIATE' | Same model — triggers fire in the writing transaction |
| (not available) | refresh_mode => 'DIFFERENTIAL' | Deferred incremental refresh via change buffers |
| (not available) | refresh_mode => 'AUTO' | Picks DIFFERENTIAL or FULL automatically |
| Auto-created indexes on GROUP BY / PK | Manual CREATE INDEX | pg_trickle auto-creates the primary key but not secondary indexes |
Step-by-Step Migration
1. Inventory existing IMMVs
List all pg_ivm IMMVs in your database:
-- pg_ivm catalog
SELECT immvrelid::regclass AS immv_name,
pgivm.get_immv_def(immvrelid) AS defining_query
FROM pgivm.pg_ivm_immv
ORDER BY immvrelid::regclass::text;
Record each IMMV's name, defining query, and any indexes you have created on it.
2. Check query compatibility
pg_trickle supports a superset of pg_ivm's SQL dialect, so any query that works with pg_ivm will work with pg_trickle. However, there are a few things to verify:
- Data types: pg_ivm requires btree operator classes for all columns (excluding json, xml, point, etc.). pg_trickle has no such restriction.
- Outer joins: If your IMMV uses outer joins, pg_trickle removes pg_ivm's restrictions (single equijoin, no aggregates, no CASE). Your query may work unchanged, or you may be able to simplify workarounds you added for pg_ivm.
3. Choose a refresh mode
For each IMMV, decide which pg_trickle refresh mode to use:
| pg_ivm behavior | pg_trickle refresh mode | When to choose |
|---|---|---|
| Zero staleness required | IMMEDIATE | Same in-transaction behavior as pg_ivm |
| Some staleness acceptable | DIFFERENTIAL with schedule | Lower write latency, batched refresh |
| Let pg_trickle decide | AUTO (default) | Recommended for most cases |
4. Create stream tables
For each IMMV, create the corresponding stream table:
pg_ivm (before):
SELECT pgivm.create_immv(
'order_totals',
'SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id'
);
pg_trickle — IMMEDIATE mode (same behavior as pg_ivm):
SELECT pgtrickle.create_stream_table(
'order_totals',
'SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id',
NULL, -- no schedule needed for IMMEDIATE
'IMMEDIATE'
);
pg_trickle — deferred mode (lower write latency):
SELECT pgtrickle.create_stream_table(
'order_totals',
'SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id',
'30s' -- refresh every 30 seconds; mode defaults to AUTO
);
5. Recreate indexes
pg_ivm auto-creates indexes on GROUP BY, DISTINCT, and primary key columns.
pg_trickle auto-creates the primary key (pgt_row_id) but not secondary indexes.
Recreate any indexes that your read queries depend on:
-- Example: index on the GROUP BY column for lookup queries
CREATE INDEX ON pgtrickle.order_totals (customer_id);
6. Update application queries
pg_ivm IMMVs live in the schema where they were created (usually public).
pg_trickle stream tables default to the pgtrickle schema.
-- Before (pg_ivm):
SELECT * FROM public.order_totals WHERE customer_id = 42;
-- After (pg_trickle):
SELECT * FROM pgtrickle.order_totals WHERE customer_id = 42;
To avoid changing application code, create a compatibility view:
CREATE VIEW public.order_totals AS
SELECT * FROM pgtrickle.order_totals;
7. Verify correctness
After creating the stream table and running a refresh, compare results. The examples below assume the old IMMV was renamed to order_totals_immv (for example, to free its name for the compatibility view from step 6):
-- Compare row counts
SELECT 'immv' AS source, COUNT(*) FROM public.order_totals_immv
UNION ALL
SELECT 'stream_table', COUNT(*) FROM pgtrickle.order_totals;
-- Full diff (should return zero rows)
(SELECT * FROM public.order_totals_immv EXCEPT SELECT * FROM pgtrickle.order_totals)
UNION ALL
(SELECT * FROM pgtrickle.order_totals EXCEPT SELECT * FROM public.order_totals_immv);
8. Drop the old IMMV
Once you have verified the stream table is correct and applications are updated:
DROP TABLE public.order_totals_immv;
9. (Optional) Remove pg_ivm
After all IMMVs are migrated:
DROP EXTENSION pg_ivm CASCADE;
Remove pg_ivm from shared_preload_libraries if it was listed there and
restart PostgreSQL.
Behavioral Differences to Be Aware Of
Locking
- pg_ivm: Holds ExclusiveLock on the IMMV during maintenance. In REPEATABLE READ / SERIALIZABLE, concurrent writes to the same IMMV's base tables may raise serialization errors.
- pg_trickle (IMMEDIATE): Uses advisory locks. Concurrent reads of the stream table are never blocked.
- pg_trickle (deferred): Base table writes only insert into change buffers (~2–50 μs). No lock contention with refresh.
TRUNCATE
- pg_ivm: Synchronously truncates or fully refreshes the IMMV.
- pg_trickle (IMMEDIATE): Performs a full refresh within the same transaction.
- pg_trickle (deferred): Clears the change buffer and queues a full refresh on the next cycle.
Logical Replication
- pg_ivm: Not compatible with logical replication — subscriber nodes do not have triggers that fire for replicated changes.
- pg_trickle (deferred): Supports WAL-based CDC (pg_trickle.cdc_mode = 'wal'), which reads from the WAL directly. Trigger-based CDC also works with logical replication if triggers are created on the subscriber.
Schema Changes
- pg_ivm: No automatic DDL tracking. If a base table column is altered, the IMMV may break silently.
- pg_trickle: Event triggers detect DDL changes on source tables and automatically reinitialize affected stream tables.
Upgrading Queries That pg_ivm Couldn't Handle
pg_ivm's SQL restrictions often force users to create workarounds. With pg_trickle, many of these workarounds can be simplified:
HAVING clauses
-- pg_ivm workaround: filter in application or wrap in a view
SELECT pgivm.create_immv('big_customers',
'SELECT customer_id, SUM(amount) AS total
FROM orders GROUP BY customer_id'
);
-- Then: SELECT * FROM big_customers WHERE total > 1000;
-- pg_trickle: use HAVING directly
SELECT pgtrickle.create_stream_table('big_customers',
'SELECT customer_id, SUM(amount) AS total
FROM orders GROUP BY customer_id
HAVING SUM(amount) > 1000'
);
NOT EXISTS / anti-joins
-- pg_ivm: not supported — manual workaround required
-- pg_trickle: works directly
SELECT pgtrickle.create_stream_table('orphan_orders',
'SELECT o.* FROM orders o
WHERE NOT EXISTS (SELECT 1 FROM customers c WHERE c.id = o.customer_id)'
);
Window functions
-- pg_ivm: not supported
-- pg_trickle: works directly
SELECT pgtrickle.create_stream_table('ranked_products',
'SELECT product_id, category, revenue,
RANK() OVER (PARTITION BY category ORDER BY revenue DESC) AS rnk
FROM product_revenue'
);
UNION ALL pipelines
-- pg_ivm: not supported — requires separate IMMVs + application-side UNION
-- pg_trickle: works directly
SELECT pgtrickle.create_stream_table('all_events',
'SELECT id, ts, ''order'' AS type FROM order_events
UNION ALL
SELECT id, ts, ''return'' AS type FROM return_events'
);
Monitoring After Migration
pg_trickle provides extensive monitoring that pg_ivm does not offer:
-- Overall health
SELECT * FROM pgtrickle.health_check();
-- Status of all stream tables (includes staleness, last refresh, error count)
SELECT * FROM pgtrickle.pgt_status();
-- Recent refresh history across all stream tables
SELECT * FROM pgtrickle.refresh_timeline(20);
-- CDC pipeline health
SELECT * FROM pgtrickle.change_buffer_sizes();
-- Diagnose errors for a specific stream table
SELECT * FROM pgtrickle.diagnose_errors('order_totals');
See SQL Reference for the complete list of monitoring functions.
Frequently Asked Questions
This FAQ covers everything from core concepts and getting started, through SQL support details, to operational topics like deployment, monitoring, and troubleshooting. Use the table of contents below to jump to a specific topic.
New User FAQ — Top 15 Questions
New to pg_trickle? Start here. Each answer is a short summary with a link to the full explanation further down.
1. What is pg_trickle?
A PostgreSQL 18 extension that adds stream tables — materialized views that refresh themselves incrementally, processing only changed rows instead of re-running the entire query. Full answer →
2. How is this different from a materialized view?
Stream tables refresh automatically on a schedule, support incremental
(differential) refresh, track changes via CDC triggers, and propagate updates
through dependency chains — none of which REFRESH MATERIALIZED VIEW provides.
Full answer →
3. How do I install pg_trickle?
Install from the Docker image, PGXN, or build from source. Add
shared_preload_libraries = 'pg_trickle' to postgresql.conf, then
CREATE EXTENSION pg_trickle; in each database. Full answer →
4. How do I create my first stream table?
One function call: SELECT pgtrickle.create_stream_table(name => 'my_st', query => 'SELECT ...', schedule => '5s');
See the Getting Started guide for a walkthrough.
Full answer →
5. What is the difference between FULL and DIFFERENTIAL refresh?
FULL re-runs the entire defining query. DIFFERENTIAL reads only the changed rows from the change buffer and computes the delta — orders of magnitude faster for small changes on large tables. AUTO mode picks the best strategy per cycle. Full answer →
6. Which refresh mode should I use?
Use AUTO (the default) — it selects DIFFERENTIAL when possible and falls back to FULL when needed. Use IMMEDIATE for same-transaction consistency. Use FULL only when the defining query uses volatile functions or is not IVM-eligible. Full answer →
7. What SQL features are supported?
Joins (INNER, LEFT, RIGHT, FULL OUTER, CROSS, LATERAL), aggregates (60+ functions including SUM, COUNT, AVG, array_agg, jsonb_agg), CTEs (including recursive), window functions, UNION/INTERSECT/EXCEPT, subqueries, CASE, COALESCE, DISTINCT, GROUP BY with ROLLUP/CUBE/GROUPING SETS, and more. Full answer →
8. How fresh is my stream table data?
As fresh as the refresh schedule allows. With a 1s schedule, data is typically
< 2 seconds stale. With IMMEDIATE mode, data is updated within the same
transaction as the source write. Full answer →
9. Can I chain stream tables (ST reads from another ST)?
Yes — stream tables can reference other stream tables. pg_trickle builds a dependency DAG and refreshes them in topological order automatically. Full answer →
10. How does change data capture work?
Lightweight row-level AFTER triggers capture every INSERT, UPDATE, and DELETE
into per-table change buffers. If wal_level = logical is available,
pg_trickle can automatically transition to WAL-based CDC for near-zero
write-path overhead. Full answer →
11. Do I need wal_level = logical?
No. pg_trickle works with the default wal_level = replica using trigger-based
CDC. WAL-based CDC is optional and provides lower write-path overhead.
Full answer →
12. Can I use pg_trickle with PgBouncer / connection poolers?
Yes. pg_trickle's background workers use direct connections, not pooled ones. Your application can use any pooler for reads and writes — the scheduler operates independently. Full answer →
13. How do I monitor stream table health?
Built-in views (pgtrickle.pgt_status, pgtrickle.pgt_refresh_history),
Prometheus metrics endpoint, Grafana dashboard, NOTIFY-based alerts, and
a TUI tool. Full answer →
14. What happens if a refresh fails?
The stream table is marked SUSPENDED after exceeding the fuse threshold (default
5 consecutive failures). Data in the change buffer is preserved. Use
pgtrickle.reset_fuse('my_st') to resume after fixing the issue.
Full answer →
15. Can I use pg_trickle with dbt?
Yes — the dbt-pgtrickle package provides a stream_table materialization.
dbt run creates/alters stream tables, dbt source freshness checks staleness.
Full answer →
Table of Contents
Getting started
- General — What pg_trickle is, how IVM works, key concepts
- Installation & Setup — Installing, configuring, uninstalling
- Creating & Managing Stream Tables — Create, alter, drop, schedules
Consistency & refresh modes
- Data Freshness & Consistency — Staleness, read-your-writes, DVS
- IMMEDIATE Mode (Transactional IVM) — Same-transaction refresh
SQL features
- SQL Support — Supported and unsupported SQL constructs
- Aggregates & Group-By — Incremental aggregates, HAVING, auxiliary columns
- Joins — Multi-table delta computation, FULL OUTER JOIN
- CTEs & Recursive Queries — Semi-naive, DRed, recomputation strategies
- Window Functions & LATERAL — Partition-based recomputation, SRFs
- TopK (ORDER BY … LIMIT) — Bounded result sets
- Tables Without Primary Keys — Content-based row identity
Internals & architecture
- Change Data Capture (CDC) — Triggers, WAL transition, why auto is the default, change buffers
- Diamond Dependencies & DAG Scheduling — Topological ordering, atomic groups
- Schema Changes & DDL Events — Reinitialize, event triggers
Operations
- Performance & Tuning — Scheduler tuning, min schedule risks, disk space, adaptive fallback
- Interoperability — Views, replication, connection poolers, triggers, pgvector
- dbt Integration — Materialization, commands, freshness checks
- Row-Level Security (RLS) — Source vs stream table policies, SECURITY DEFINER triggers
- Deployment & Operations — Workers, upgrades, replicas, Kubernetes
- Monitoring & Alerting — Views, NOTIFY alerts, failure handling
- Configuration Reference — All GUC parameters
Troubleshooting & reference
- Troubleshooting — Common problems and debugging
- Why Are These SQL Features Not Supported? — Technical explanations for each limitation
- Why Are These Stream Table Operations Restricted? — Why direct DML, ALTER TABLE, and TRUNCATE are disallowed
General
These questions cover fundamental concepts — what pg_trickle is, how incremental view maintenance works, and the key building blocks (frontiers, row IDs, the auto-rewrite pipeline) that power the extension.
What is pg_trickle?
pg_trickle is a PostgreSQL 18 extension that implements stream tables — declarative, automatically-refreshing materialized views with Differential View Maintenance (DVM). You define a SQL query and a refresh schedule; the extension handles change capture, delta computation, and incremental refresh automatically.
It is inspired by the DBSP differential dataflow framework. See DBSP_COMPARISON.md for a detailed comparison.
What is incremental view maintenance (IVM) and why does it matter?
Incremental View Maintenance means updating a materialized view by processing only the changes (deltas) to the source data, rather than re-executing the entire defining query from scratch.
Consider a stream table defined as SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id over a 10-million-row orders table. When you insert 5 new rows:
- Without IVM (FULL refresh): Re-scans all 10 million rows and recomputes every group. Cost: O(total rows).
- With IVM (DIFFERENTIAL refresh): Reads only the 5 new rows from the change buffer, identifies the affected groups, and updates just those groups. Cost: O(changed rows × affected groups).
pg_trickle's DVM engine implements IVM using differentiation rules for each SQL operator (Scan, Filter, Join, Aggregate, etc.), generating a delta query that computes the exact changes to the stream table from the exact changes to the source.
What is the difference between a stream table and a regular materialized view, in practice?
| Feature | Materialized Views | Stream Tables |
|---|---|---|
| Refresh | Manual (REFRESH MATERIALIZED VIEW) | Automatic (scheduler) or manual |
| Incremental refresh | Not supported natively | Built-in differential mode |
| Change detection | None — always full recompute | CDC triggers track row-level changes |
| Dependency ordering | None | DAG-aware topological refresh |
| Monitoring | None | Built-in views, stats, NOTIFY alerts |
| Schedule | None | Duration strings (5m) or cron (*/5 * * * *) |
| Transactional IVM | No | Yes (IMMEDIATE mode) |
In practice, stream tables are regular PostgreSQL heap tables under the hood — you can query them, create indexes on them, join them with other tables, and reference them from views. The key difference is that pg_trickle manages their contents automatically.
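Since stream tables are ordinary heap tables, the usual tooling applies to them directly. A short illustration, reusing the order_totals example from elsewhere in this FAQ (the customers table and its columns are assumed for the join):

```sql
-- Index a stream table for fast point lookups
CREATE INDEX ON pgtrickle.order_totals (customer_id);

-- Join it like any other table
SELECT c.name, t.total
FROM customers c
JOIN pgtrickle.order_totals t ON t.customer_id = c.id
WHERE t.total > 1000;
```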
What happens behind the scenes when I INSERT a row into a table tracked by a stream table?
The full data flow for a DIFFERENTIAL-mode stream table:
1. Your INSERT completes normally. The row is written to the source table.
2. A CDC trigger fires (row-level AFTER INSERT). It writes a change record (action=I, the new row data as JSONB, the current WAL LSN) into the source's change buffer table (pgtrickle_changes.changes_<oid>). This happens within your transaction — if you roll back, the change record is also rolled back.
3. You commit. Both the source row and the change record become visible.
4. The scheduler wakes up (every pg_trickle.scheduler_interval_ms, default 1 second). It checks whether the stream table's schedule says a refresh is due.
5. If due, the refresh engine runs. It reads the change buffer for rows with LSN > the stream table's current frontier, generates a delta query from the DVM operator tree, and applies the result via MERGE.
6. Frontier advances. The stream table's frontier is updated to the new LSN, and the consumed change buffer rows are cleaned up.
For IMMEDIATE-mode stream tables, steps 2–6 are replaced: a statement-level AFTER trigger computes and applies the delta within your transaction, so the stream table is updated before your transaction commits.
What does "differential" mean in the context of pg_trickle?
"Differential" refers to the mathematical approach of computing differences (deltas) rather than absolute values. Given a query Q and a set of changes ΔR to source table R, the DVM engine computes ΔQ(R, ΔR) — the change to the query result caused by the change to the source. This delta is then applied (merged) into the stream table.
Each SQL operator has its own differentiation rule. For example:
- Filter: ΔFilter(R, ΔR) = Filter(ΔR) — just apply the filter to the changes.
- Join: ΔJoin(R, S, ΔR) = Join(ΔR, S) — join the changes against the other side's current state.
- Aggregate: Recompute only the groups whose keys appear in the changes.
See DVM_OPERATORS.md for the complete set of differentiation rules.
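To make the join rule concrete, here is roughly what a delta query for a batch of order changes could look like. This is a hand-written sketch, not the engine's literal output, and the buffer table name is illustrative (real buffers are named changes_<oid>):

```sql
-- Sketch of a join delta: join the changed orders (Δorders) against the
-- current state of customers, carrying a weight column (+1 for inserts,
-- -1 for deletes) so the MERGE step can apply the change.
SELECT d.customer_id, c.name, d.amount,
       CASE d.action WHEN 'I' THEN +1 ELSE -1 END AS weight
FROM pgtrickle_changes.changes_orders d   -- illustrative buffer name
JOIN customers c ON c.id = d.customer_id;
```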
What is a frontier, and why does pg_trickle track LSNs?
A frontier is a per-source map of {source_oid → LSN} that records exactly how far each stream table has consumed changes from each of its source tables. It is stored as JSONB in the pgtrickle.pgt_stream_tables catalog.
Why LSNs? PostgreSQL's Write-Ahead Log Sequence Number (LSN) provides a globally ordered, monotonically increasing position in the change stream. By recording the LSN at which each source was last consumed, the frontier ensures:
- No missed changes. The next refresh reads changes with LSN > frontier, ensuring contiguous, non-overlapping windows.
- No duplicate processing. Changes at or below the frontier are never re-read.
- Consistent snapshots. When a stream table depends on multiple source tables, the frontier tracks each source independently, enabling consistent multi-source delta computation.
Lifecycle: Created on first full refresh → Advanced on each differential refresh → Reset on reinitialize.
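The frontier lives in the catalog, so you can check how far each stream table has consumed its sources. A sketch (the frontier column name is inferred from the description above; verify it against your installed version):

```sql
-- Show each stream table's per-source frontier next to the current WAL position
SELECT pgt_name,
       frontier,                      -- JSONB: {source_oid: LSN}
       pg_current_wal_lsn() AS current_lsn
FROM pgtrickle.pgt_stream_tables;
```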
What is the __pgt_row_id column and why does it appear in my stream tables?
Every stream table has a __pgt_row_id BIGINT PRIMARY KEY column. It stores a 64-bit xxHash of the row's group-by key (for aggregate queries) or all output columns (for non-aggregate queries). The refresh engine uses it to match incoming deltas against existing rows during the MERGE operation.
You should ignore this column in your queries. It is an implementation detail. If it bothers you, exclude it explicitly:
SELECT customer_id, total FROM order_totals; -- omit __pgt_row_id
What is the auto-rewrite pipeline and how does it affect my queries?
Before parsing a defining query into the DVM operator tree, pg_trickle runs six automatic rewrite passes:
| # | Pass | What it does |
|---|---|---|
| 0 | View inlining | Replaces view references with (view_definition) AS alias subqueries (fixpoint, max depth 10) |
| 1 | DISTINCT ON | Converts to ROW_NUMBER() OVER (PARTITION BY … ORDER BY …) = 1 subquery |
| 2 | GROUPING SETS / CUBE / ROLLUP | Decomposes into UNION ALL of separate GROUP BY queries |
| 3 | Scalar subquery in WHERE | Rewrites WHERE col > (SELECT …) to CROSS JOIN |
| 4 | Correlated scalar subquery in SELECT | Rewrites to LEFT JOIN with grouped inline view |
| 5 | SubLinks in OR | Splits WHERE a OR EXISTS (…) into UNION branches |
The rewrites are transparent — your original query is preserved in the catalog (original_query column) while the rewritten version is stored in defining_query. The DVM engine only sees standard SQL operators after rewriting.
See ARCHITECTURE.md for details on each pass.
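As an example of pass 1, here is a DISTINCT ON query next to the shape it is rewritten into. Both queries are hand-written to illustrate the transformation, not the engine's literal output:

```sql
-- Original defining query
SELECT DISTINCT ON (customer_id) customer_id, ordered_at, amount
FROM orders
ORDER BY customer_id, ordered_at DESC;

-- Rewritten shape: ROW_NUMBER() = 1 in a subquery
SELECT customer_id, ordered_at, amount
FROM (
  SELECT customer_id, ordered_at, amount,
         ROW_NUMBER() OVER (PARTITION BY customer_id
                            ORDER BY ordered_at DESC) AS rn
  FROM orders
) s
WHERE rn = 1;
```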
How does pg_trickle compare to DBSP (the academic framework)?
pg_trickle is inspired by DBSP but is not a direct implementation. Key differences:
- DBSP is a general-purpose differential dataflow framework with a Rust runtime (Feldera). It models computation as circuits over Z-sets (multisets with integer weights).
- pg_trickle implements the same mathematical principles (delta queries, frontier tracking) but embedded inside PostgreSQL as an extension. It generates SQL delta queries rather than running a separate computation engine.
- Trade-off: pg_trickle leverages PostgreSQL's optimizer, indexes, and storage engine but is limited to what can be expressed as SQL queries. DBSP can implement arbitrary dataflow computations.
See DBSP_COMPARISON.md for a detailed comparison.
How does pg_trickle compare to pg_ivm?
| Feature | pg_ivm | pg_trickle |
|---|---|---|
| Refresh timing | Immediate (same transaction) only | Immediate, Deferred (scheduled), or Manual |
| Incremental strategy | Transition tables + query rewriting | DVM operator tree + delta SQL generation |
| Supported SQL | Inner joins, simple outer joins, COUNT/SUM/AVG/MIN/MAX, EXISTS, DISTINCT | All of the above + window functions, recursive CTEs, LATERAL, UNION/INTERSECT/EXCEPT, 60+ aggregates, TopK, GROUPING SETS |
| Cascading (view-on-view) | No | Yes (DAG-aware topological refresh) |
| Scheduling | None (always immediate) | Duration, cron, CALCULATED, or NULL |
| Monitoring | None | Built-in views, stats, NOTIFY alerts |
| PostgreSQL version | 14–17 | 18 only (until v0.4.0) |
pg_trickle's IMMEDIATE mode is designed as a migration path for pg_ivm users — it uses the same statement-level trigger approach with transition tables.
What PostgreSQL versions are supported?
PostgreSQL 18.x exclusively. pg_trickle uses PostgreSQL 18 features such as enhanced MERGE syntax with NOT MATCHED BY SOURCE and improved event trigger payloads. These features are not available in earlier versions.
Backward compatibility with PostgreSQL 16–17 is planned for a future release (tracked in the roadmap).
Does pg_trickle require wal_level = logical?
No. By default, pg_trickle uses lightweight row-level triggers for change data capture instead of logical replication. This means you do not need to set wal_level = logical, configure max_replication_slots, or create publications.
If you later enable the hybrid CDC mode (pg_trickle.cdc_mode = 'auto'), WAL-based capture becomes an option — but this is opt-in and not required for normal operation.
Is pg_trickle production-ready?
pg_trickle is under active development and approaching production readiness. It has a comprehensive test suite with 700+ unit tests and 290+ end-to-end tests covering correctness, failure recovery, and concurrency scenarios.
That said, as with any new extension, you should evaluate it against your specific workloads before deploying to production. Start with non-critical dashboards or reporting tables, monitor refresh performance and data correctness, and gradually expand usage as confidence grows.
Installation & Setup
How do I install pg_trickle?
1. Add pg_trickle to shared_preload_libraries in postgresql.conf: shared_preload_libraries = 'pg_trickle'
2. Restart PostgreSQL.
3. Run CREATE EXTENSION pg_trickle;
See INSTALL.md for platform-specific instructions and pre-built release artifacts.
What are the minimum configuration requirements?
The only mandatory setting is adding pg_trickle to shared_preload_libraries in postgresql.conf (this requires a PostgreSQL restart):
shared_preload_libraries = 'pg_trickle'
All other GUC parameters have sensible defaults and can be tuned later. However, max_worker_processes often needs to be raised from its default of 8 to leave headroom for pg_trickle's scheduler and refresh workers.
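A minimal postgresql.conf fragment for a typical setup (the worker value is an illustrative starting point, not a recommendation):

```
# Required: load the extension at server start
shared_preload_libraries = 'pg_trickle'

# pg_trickle's scheduler and refresh workers count against this limit;
# the default of 8 is often too low once other extensions are loaded.
max_worker_processes = 16
```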
Can I install pg_trickle on a managed PostgreSQL service (RDS, Cloud SQL, etc.)?
It depends on whether the service allows custom extensions and shared_preload_libraries modifications. Many managed services restrict these. However, pg_trickle has one advantage over replication-based extensions: it does not require wal_level = logical, which avoids one of the most common restrictions on managed PostgreSQL services.
Check your provider's documentation for custom extension support. Services that support custom extensions (e.g., some tiers of Azure Flexible Server, Supabase, Neon) are more likely to work.
How do I uninstall pg_trickle?
1. Drop all stream tables first (or they will be cascade-dropped): SELECT pgtrickle.drop_stream_table(pgt_name) FROM pgtrickle.pgt_stream_tables;
2. Drop the extension: DROP EXTENSION pg_trickle CASCADE;
3. Remove pg_trickle from shared_preload_libraries and restart PostgreSQL.
Creating & Managing Stream Tables
Do I need to choose a refresh mode?
No. The default mode ('AUTO') is adaptive: it uses differential (delta-only)
maintenance when efficient, and automatically falls back to full
recomputation when the change volume is high or the query cannot be
differentiated. This works well for the vast majority of queries.
You only need to specify a mode explicitly when:
- You want FULL mode to force recomputation every time (rare).
- You want IMMEDIATE mode for sub-second, in-transaction updates (adds overhead to every write on source tables).
- You want strict DIFFERENTIAL mode and prefer an error over silent fallback when the query isn't differentiable.
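When you do want to pin a mode, it is passed at creation time. Based on the call signatures shown elsewhere in this guide, an explicit-mode creation might look like this (the refresh_mode parameter name is taken from the concept-mapping table above):

```sql
-- Force strict differential maintenance: raise an error instead of
-- silently falling back to FULL when the query can't be differentiated.
SELECT pgtrickle.create_stream_table(
  name  => 'order_totals',
  query => 'SELECT customer_id, SUM(amount) AS total
            FROM orders GROUP BY customer_id',
  schedule => '30s',
  refresh_mode => 'DIFFERENTIAL'
);
```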
How do I create a stream table?
-- Minimal: just name and query. Refreshes on a calculated schedule
-- using adaptive differential maintenance.
SELECT pgtrickle.create_stream_table(
'order_totals',
'SELECT customer_id, SUM(amount) AS total
FROM orders GROUP BY customer_id'
);
-- With custom schedule:
SELECT pgtrickle.create_stream_table(
name => 'order_totals',
query => 'SELECT customer_id, SUM(amount) AS total
FROM orders GROUP BY customer_id',
schedule => '5m'
);
What is the difference between FULL and DIFFERENTIAL refresh mode?
- FULL — Truncates the stream table and re-runs the entire defining query every refresh cycle. Simple but expensive for large result sets.
- DIFFERENTIAL — Computes only the delta (changes since the last refresh) using the DVM engine and applies it via a MERGE statement. Much faster when only a small fraction of source data changes between refreshes. When the change ratio exceeds pg_trickle.differential_max_change_ratio (default 15%), DIFFERENTIAL automatically falls back to FULL for that cycle.
- IMMEDIATE — Maintains the stream table synchronously within the same transaction as the base table DML. Uses statement-level triggers with transition tables — no change buffers, no scheduler. The stream table is always up-to-date.
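The adaptive DIFFERENTIAL-to-FULL fallback amounts to a ratio check per refresh cycle. A minimal Python sketch of that decision (illustrative only — the real accounting happens inside the extension):

```python
def choose_refresh_strategy(changed_rows: int, source_rows: int,
                            max_change_ratio: float = 0.15) -> str:
    """Fall back to FULL when the fraction of changed source rows exceeds
    the configured ratio (pg_trickle.differential_max_change_ratio,
    default 15%). An empty source is trivially cheap to recompute."""
    if source_rows == 0:
        return "FULL"
    ratio = changed_rows / source_rows
    return "FULL" if ratio > max_change_ratio else "DIFFERENTIAL"

print(choose_refresh_strategy(1_000, 1_000_000))    # DIFFERENTIAL (0.1% changed)
print(choose_refresh_strategy(200_000, 1_000_000))  # FULL (20% changed)
```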
Why does FULL mode exist if DIFFERENTIAL can fall back to it automatically?
DIFFERENTIAL mode with adaptive fallback covers most user needs — it uses incremental deltas when changes are small and automatically switches to a full recompute when the change ratio is high. However, explicit FULL mode still has its place:
- No CDC overhead. FULL mode installs CDC triggers on source tables (for DAG tracking), but the refresh itself ignores the change buffers entirely. If your workload has very high write throughput and you know you'll always do a full recompute, FULL mode avoids the per-row trigger overhead of writing change records that will never be consumed incrementally.
- Simpler debugging. When investigating data correctness issues, FULL mode is a clean baseline — it re-runs the defining query with no delta computation, no frontier tracking, and no MERGE logic. If FULL produces correct results but DIFFERENTIAL doesn't, the bug is in the delta pipeline.
- Predictable performance. DIFFERENTIAL refresh time varies with the number of changes, which can be unpredictable. FULL refresh time is proportional to the total result set size, which is stable. For SLA-sensitive workloads where you'd rather have consistent 500ms refreshes than variable 5ms–500ms refreshes, FULL provides that predictability.
- Unsupported-but-planned constructs. Some queries may parse correctly in DIFFERENTIAL mode but produce suboptimal deltas. Using FULL mode explicitly is a safe fallback while the DVM engine matures.
For most users, DIFFERENTIAL is the right default. Use FULL when you have a specific reason.
When should I use FULL vs. DIFFERENTIAL vs. IMMEDIATE?
Use DIFFERENTIAL (default) when:
- Source tables are large and changes between refreshes are small
- The defining query uses supported operators (most common SQL is supported)
- Some staleness (seconds to minutes) is acceptable
Use FULL when:
- The defining query uses unsupported aggregates (CORR, COVAR_*, REGR_*)
- Source tables are small and a full recompute is cheap
- You see frequent adaptive fallbacks to FULL (check refresh history)
Use IMMEDIATE when:
- The stream table must always reflect the latest committed data
- You need transactional consistency (reads within the same transaction see updated data)
- Write-side overhead per DML statement is acceptable
- The defining query is relatively simple (no TopK, no materialized view sources)
What are the advantages and disadvantages of IMMEDIATE vs. deferred (FULL/DIFFERENTIAL) refresh modes?
IMMEDIATE mode
| Trade-off | Detail |
|---|---|
| ✅ Read-your-writes consistency | The stream table is updated within the same transaction as the base table DML — always current from the writer's perspective. |
| ✅ No lag | No background worker, no schedule interval. The view is never stale. |
| ✅ No change buffers | pgtrickle_changes.* tables are not used, reducing write overhead on source tables. |
| ✅ pg_ivm compatibility | Drop-in migration path for existing pg_ivm / IMMV users. |
| ❌ Write amplification | Every DML statement on a base table also executes IVM trigger logic, adding latency to the original transaction. |
| ❌ Serialized concurrent writes | An ExclusiveLock is taken on the stream table during maintenance, serializing writers. |
| ❌ Limited SQL support | Window functions, recursive CTEs, LATERAL joins, scalar subqueries, and TopK (ORDER BY … LIMIT) are not supported — use DIFFERENTIAL instead. |
| ❌ Cascading limitations | Cascading IMMEDIATE stream tables work but may require manual refresh for deep chains. |
| ❌ No throttling | The refresh cannot be delayed or rate-limited. |
Deferred mode (FULL / DIFFERENTIAL)
| Trade-off | Detail |
|---|---|
| ✅ Decoupled write path | Base table writes are fast; view maintenance runs later via the scheduler or manual refresh. |
| ✅ Broadest SQL support | Window functions, recursive CTEs, LATERAL, UNION, user-defined aggregates, TopK, cascading stream tables, and more. |
| ✅ Adaptive cost control | DIFFERENTIAL automatically falls back to FULL when the change ratio exceeds pg_trickle.differential_max_change_ratio. |
| ✅ Concurrency-friendly | Writers never block on view maintenance. |
| ❌ Staleness | The stream table lags by up to one schedule interval (e.g. 1m). |
| ❌ No read-your-writes | A writer querying the stream table immediately after a write may see the pre-change data. |
| ❌ Infrastructure overhead | Requires change buffer tables, a background worker, and frontier tracking. |
Rule of thumb: use IMMEDIATE when the query is simple and freshness within the transaction matters. Use DIFFERENTIAL (or FULL) for complex queries, high concurrency, or when you want to decouple write latency from view maintenance.
What happens if I have an IMMEDIATE stream table between two DIFFERENTIAL stream tables in a dependency chain?
Consider the chain: source → ST_A (DIFFERENTIAL) → ST_B (IMMEDIATE) → ST_C (DIFFERENTIAL). This is a valid but unusual configuration with important behavioral consequences:
- ST_A refreshes on its schedule (e.g., every 1 minute) via the background scheduler.
- ST_B is IMMEDIATE, so it has no CDC triggers on ST_A — it uses statement-level IVM triggers. But ST_A is updated by the scheduler (not by user DML), and the scheduler's MERGE operation does fire statement-level triggers on ST_A's dependents. So ST_B updates within the scheduler's transaction when ST_A refreshes.
- ST_C is DIFFERENTIAL and depends on ST_B. Since ST_B is a stream table, ST_C's CDC triggers fire when ST_B is modified. The scheduler refreshes ST_C on its own schedule.
The practical concern: write latency stacking. When the scheduler refreshes ST_A, ST_B's IVM triggers fire synchronously within that same transaction, adding IVM overhead to ST_A's refresh. If ST_B's delta computation is expensive, it slows down the entire scheduler cycle.
Recommendation: Avoid mixing IMMEDIATE into the middle of a deferred chain. Either make the entire chain IMMEDIATE (for small, simple queries) or keep it entirely DIFFERENTIAL. If you need read-your-writes for one specific step, consider making that the terminal (leaf) stream table in the chain.
What schedule formats are supported?
Duration strings:
| Unit | Suffix | Example |
|---|---|---|
| Seconds | s | 30s |
| Minutes | m | 5m |
| Hours | h | 2h |
| Days | d | 1d |
| Weeks | w | 1w |
| Compound | — | 1h30m |
Cron expressions:
| Format | Example | Description |
|---|---|---|
| 5-field | */5 * * * * | Every 5 minutes |
| Aliases | @hourly, @daily | Built-in shortcuts |
CALCULATED mode: Pass NULL as the schedule to inherit the schedule from downstream dependents.
How do cron schedules handle timezones? What does @daily really mean?
pg_trickle evaluates cron expressions in UTC. The underlying croner crate computes the next occurrence from a UTC timestamp, and the scheduler compares this against chrono::Utc::now(). There is no per-stream-table timezone setting.
This means:
- @daily (equivalent to 0 0 * * *) fires at midnight UTC, not midnight in your local timezone.
- @hourly (equivalent to 0 * * * *) fires at the top of each UTC hour.
- 0 9 * * 1-5 fires at 09:00 UTC on weekdays — if your server is in America/New_York, that's 04:00 or 05:00 local time depending on DST.
If you need a schedule aligned to a local timezone, convert the desired local time to UTC and write the cron expression accordingly. For example, to refresh at 08:00 Europe/Oslo (UTC+1 in winter, UTC+2 in summer), use 0 6 * * * in summer and 0 7 * * * in winter — or accept the 1-hour seasonal shift and pick one.
Tip: For most analytics workloads, UTC-based schedules are preferable because they don't shift with daylight saving transitions.
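As a sanity check for the local-to-UTC conversion described above, here is a small Python sketch using the standard zoneinfo module. This is illustrative tooling for computing the cron fields, not part of pg_trickle:

```python
from datetime import datetime, time
from zoneinfo import ZoneInfo

def local_daily_to_utc_cron(hour: int, minute: int, tz: str,
                            on_date: datetime) -> str:
    """Convert a daily local-time schedule to a 5-field UTC cron expression,
    valid for the UTC offset in effect on `on_date` (DST shifts it)."""
    local = datetime.combine(on_date.date(), time(hour, minute),
                             tzinfo=ZoneInfo(tz))
    utc = local.astimezone(ZoneInfo("UTC"))
    return f"{utc.minute} {utc.hour} * * *"

# 08:00 Europe/Oslo in winter (UTC+1):
print(local_daily_to_utc_cron(8, 0, "Europe/Oslo", datetime(2024, 1, 15)))  # 0 7 * * *
# 08:00 Europe/Oslo in summer (UTC+2):
print(local_daily_to_utc_cron(8, 0, "Europe/Oslo", datetime(2024, 7, 15)))  # 0 6 * * *
```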
What is the minimum allowed schedule?
The pg_trickle.min_schedule_seconds GUC (default: 60 seconds) sets the shortest allowed refresh schedule. Any create_stream_table or alter_stream_table call with a schedule shorter than this floor is rejected with a clear error message.
This guard exists to prevent accidentally creating stream tables that refresh too frequently, which could overload the scheduler or the source tables. During development and testing, you can lower it:
ALTER SYSTEM SET pg_trickle.min_schedule_seconds = 1;
SELECT pg_reload_conf();
What happens if all stream tables in the DAG have a CALCULATED schedule?
When every stream table uses a CALCULATED schedule (schedule => 'calculated'), there
are no explicit schedules for the resolution algorithm to derive from. The
CALCULATED logic works by propagating MIN(effective_schedule) from downstream
dependents upward through the DAG. If no node has an explicit duration:
- Leaf nodes (no downstream dependents) have no schedules to take the minimum of, so they fall back to the pg_trickle.min_schedule_seconds GUC (default: 60 seconds).
- Upstream nodes then resolve to MIN(fallback) = fallback.
- The result: every stream table in the DAG gets the fallback schedule (60 s by default).
This is safe but usually not what you want — the whole DAG refreshes at the same generic interval. Best practice is to set an explicit schedule on at least the leaf (most-downstream) stream tables so that upstream CALCULATED schedules resolve to something meaningful:
-- Leaf ST with an explicit schedule
SELECT pgtrickle.create_stream_table(
name => 'daily_summary',
query => 'SELECT region, SUM(total) FROM pgtrickle.order_totals GROUP BY region',
schedule => '10m'
);
-- Upstream ST inherits that 10 m schedule via CALCULATED
SELECT pgtrickle.create_stream_table(
name => 'order_totals',
query => 'SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id',
schedule => 'calculated'
);
You can inspect the resolved effective schedules with:
SELECT pgt_name, schedule, effective_schedule
FROM pgtrickle.pgt_stream_tables;
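The MIN-propagation described above can be sketched in a few lines of Python. This is a conceptual model of the resolution algorithm, not pg_trickle's implementation; the dictionary shapes are assumptions for illustration:

```python
def resolve_schedules(explicit, dependents, fallback=60):
    """Resolve effective schedules (seconds) for a stream-table DAG.
    explicit:   {name: seconds or None}  (None = CALCULATED)
    dependents: {name: [downstream tables that read from this one]}
    Assumes an acyclic graph (cycles are rejected at creation time)."""
    memo = {}

    def effective(name):
        if name in memo:
            return memo[name]
        if explicit.get(name) is not None:
            memo[name] = explicit[name]          # explicit schedule wins
        else:
            downstream = [effective(d) for d in dependents.get(name, [])]
            # CALCULATED: tightest downstream schedule, else the GUC floor
            memo[name] = min(downstream) if downstream else fallback
        return memo[name]

    return {name: effective(name) for name in explicit}

# order_totals (CALCULATED) feeds daily_summary (600 s = 10m):
sched = resolve_schedules(
    explicit={"order_totals": None, "daily_summary": 600},
    dependents={"order_totals": ["daily_summary"]},
)
print(sched)  # {'order_totals': 600, 'daily_summary': 600}
```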
Can a stream table reference another stream table?
Yes. Stream tables can depend on other stream tables. The scheduler automatically refreshes them in topological order (upstream first). Circular dependencies are detected and rejected at creation time.
-- ST1: aggregates orders
SELECT pgtrickle.create_stream_table(
name => 'order_totals',
query => 'SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id',
schedule => '1m',
refresh_mode => 'DIFFERENTIAL'
);
-- ST2: filters ST1
SELECT pgtrickle.create_stream_table(
name => 'big_customers',
query => 'SELECT customer_id, total FROM pgtrickle.order_totals WHERE total > 1000',
schedule => '1m',
refresh_mode => 'DIFFERENTIAL'
);
How do I change a stream table's schedule or mode?
-- Change schedule
SELECT pgtrickle.alter_stream_table('order_totals', schedule => '10m');
-- Switch refresh mode
SELECT pgtrickle.alter_stream_table('order_totals', refresh_mode => 'FULL');
-- Suspend
SELECT pgtrickle.alter_stream_table('order_totals', status => 'SUSPENDED');
-- Resume
SELECT pgtrickle.alter_stream_table('order_totals', status => 'ACTIVE');
Can I change the defining query of a stream table?
Yes — use the query parameter of alter_stream_table():
SELECT pgtrickle.alter_stream_table('order_totals',
query => 'SELECT customer_id, SUM(amount) AS total, COUNT(*) AS order_count
FROM orders GROUP BY customer_id');
The ALTER QUERY operation validates the new query, migrates the storage table schema if needed, updates catalog entries and source dependencies, and runs a full refresh — all within a single transaction. Concurrent readers see either the old data or the new data, never an empty table.
Schema migration behavior:
| Schema change | Behavior |
|---|---|
| Same columns | Fast path — no storage DDL, just catalog update + full refresh |
| Columns added or removed | Compatible migration via ALTER TABLE ADD/DROP COLUMN — storage table OID preserved |
| Column type incompatible | Full rebuild — storage table dropped and recreated (OID changes, WARNING emitted) |
You can also change the query and other parameters simultaneously:
SELECT pgtrickle.alter_stream_table('order_totals',
query => 'SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id',
refresh_mode => 'FULL');
How do I deploy stream tables idempotently?
Use create_or_replace_stream_table() — one function call that does the right
thing automatically:
-- Safe to run on every deploy — creates, updates, or no-ops as needed:
SELECT pgtrickle.create_or_replace_stream_table(
name => 'order_totals',
query => 'SELECT region, SUM(amount) AS total FROM orders GROUP BY region',
schedule => '2m',
refresh_mode => 'DIFFERENTIAL'
);
What happens on each deploy:
| Situation | Action |
|---|---|
| First deploy (stream table doesn't exist) | Creates it, populates data |
| Nothing changed since last deploy | No-op — logs INFO, returns instantly |
| You changed the schedule or mode | Updates config in place (no data loss) |
| You changed the query | Migrates storage schema + runs a full refresh |
This mirrors PostgreSQL's CREATE OR REPLACE VIEW / CREATE OR REPLACE FUNCTION pattern.
When to use which function:
| Function | Use case |
|---|---|
| create_or_replace_stream_table() | Recommended for most deployments. Declarative, idempotent — handles all cases automatically. |
| create_stream_table_if_not_exists() | Safe re-run, but never modifies an existing definition. Good for one-time seed migrations. |
| create_stream_table() | Strict mode — errors if the stream table already exists. Use when you want an explicit failure on duplicates. |
How do I trigger a manual refresh?
Call refresh_stream_table() to immediately refresh a stream table without waiting for the next scheduled cycle:
SELECT pgtrickle.refresh_stream_table('order_totals');
This runs a synchronous refresh in your current session and returns when complete. It works even when the background scheduler is disabled (pg_trickle.enabled = false), making it useful for testing, debugging, or one-off data refreshes.
To force a full refresh regardless of the stream table's configured mode, temporarily change the refresh mode:
SELECT pgtrickle.alter_stream_table('order_totals', refresh_mode => 'FULL');
SELECT pgtrickle.refresh_stream_table('order_totals');
-- Switch back to the original mode when done:
SELECT pgtrickle.alter_stream_table('order_totals', refresh_mode => 'DIFFERENTIAL');
Data Freshness & Consistency
Understanding when and how stream tables become current is the #1 conceptual hurdle for users coming from synchronous materialized views. This section explains staleness guarantees, read-your-writes behavior, and Delayed View Semantics (DVS).
How stale can a stream table be?
For deferred modes (FULL / DIFFERENTIAL): A stream table can be at most one schedule interval behind the source data, plus the time it takes to execute the refresh itself. For example, with schedule => '1m', the maximum staleness is approximately 1 minute + refresh duration.
In practice, staleness is often less than the schedule interval because the scheduler continuously checks for due refreshes at pg_trickle.scheduler_interval_ms (default: 1 second).
For IMMEDIATE mode: The stream table is always current within the transaction that modified the source data. There is zero staleness.
Check current staleness:
SELECT pgtrickle.get_staleness('order_totals'); -- returns seconds, NULL if never refreshed
-- Or check all stream tables:
SELECT pgt_name, staleness, stale FROM pgtrickle.stream_tables_info;
Can I read my own writes immediately after an INSERT?
It depends on the refresh mode:
- IMMEDIATE mode: Yes. The stream table is updated within the same transaction as your INSERT. You can query it immediately and see the updated data.
- DIFFERENTIAL / FULL mode: No. The stream table is updated by the background scheduler in a separate transaction. Your INSERT is captured by the CDC trigger, but the stream table won't reflect it until the next scheduled refresh (or a manual refresh_stream_table() call).
If read-your-writes consistency is a requirement, use refresh_mode => 'IMMEDIATE'.
What consistency guarantees does pg_trickle provide?
pg_trickle provides Delayed View Semantics (DVS): the contents of every stream table are logically equivalent to evaluating its defining query at some past point in time — the data_timestamp. This means:
- The data is always internally consistent — it corresponds to a valid snapshot of the source data.
- The data may be stale — it reflects the source state at data_timestamp, not necessarily the current state.
- For cascading stream tables, the scheduler refreshes in topological order so that when ST B references upstream ST A, A has already been refreshed before B runs its delta query against A's contents.
For IMMEDIATE mode, the guarantee is stronger: the stream table always reflects the state of the source data as of the current transaction.
What are "Delayed View Semantics" (DVS)?
DVS is the formal consistency guarantee: a stream table's contents are equivalent to evaluating its defining query at a specific past time (the data_timestamp). This is analogous to how a materialized view captured at a point in time is always internally consistent, even if the source data has since changed.
The data_timestamp is recorded in the catalog and advanced after each successful refresh:
SELECT pgt_name, data_timestamp FROM pgtrickle.pgt_stream_tables;
What happens if the scheduler is behind — does data get lost?
No. Change data is never lost, even if the scheduler falls behind. Changes accumulate in the change buffer tables (pgtrickle_changes.changes_<oid>) until consumed by a refresh. The frontier ensures that each refresh picks up exactly where the last one left off.
However, a growing change buffer increases:
- Disk usage (change buffer tables grow)
- Refresh time (more changes to process per cycle)
- Risk of adaptive fallback to FULL (if the change ratio exceeds pg_trickle.differential_max_change_ratio)
The monitoring system emits a buffer_growth_warning NOTIFY alert if buffers grow unexpectedly.
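The "pick up exactly where the last refresh left off" guarantee is the frontier pattern. A conceptual Python sketch (class and field names are illustrative, not pg_trickle's internals):

```python
class ChangeBuffer:
    """Append-only change log, as written by CDC triggers."""
    def __init__(self):
        self.rows = []      # [(seq, change), ...] in commit order
        self.next_seq = 1

    def capture(self, change):
        self.rows.append((self.next_seq, change))
        self.next_seq += 1

class Refresher:
    """Consumes changes strictly after the frontier, then advances it,
    so no change is ever lost or applied twice."""
    def __init__(self, buffer):
        self.buffer = buffer
        self.frontier = 0   # highest seq already applied

    def refresh(self):
        delta = [c for seq, c in self.buffer.rows if seq > self.frontier]
        if self.buffer.rows:
            self.frontier = self.buffer.rows[-1][0]
        return delta

buf = ChangeBuffer()
r = Refresher(buf)
buf.capture("INSERT id=1")
buf.capture("INSERT id=2")
print(r.refresh())  # both changes consumed
buf.capture("DELETE id=1")
print(r.refresh())  # only the new change — even if the scheduler fell behind
```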
How does pg_trickle ensure deltas are applied in the right order across cascading stream tables?
The scheduler uses topological ordering from the dependency DAG. When ST B depends on ST A:
- ST A is refreshed first — its data is brought up to date and its frontier advances.
- ST A's refresh writes are captured by CDC triggers (since ST A is a source for ST B).
- ST B is refreshed next — its delta query reads ST A's current (just-refreshed) data and the change buffer.
This ensures that downstream stream tables always see consistent upstream data. Circular dependencies are rejected at creation time.
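The refresh order above is a standard topological sort of the dependency DAG. A sketch using Kahn's algorithm (illustrative — pg_trickle's scheduler implements its own ordering):

```python
from collections import defaultdict, deque

def refresh_order(depends_on):
    """Return a refresh order where every stream table comes after all of
    its upstream dependencies. depends_on: {table: [upstream tables]}."""
    indegree = {t: len(ups) for t, ups in depends_on.items()}
    downstream = defaultdict(list)
    for t, ups in depends_on.items():
        for u in ups:
            downstream[u].append(t)
            indegree.setdefault(u, 0)
    ready = deque(sorted(t for t, d in indegree.items() if d == 0))
    order = []
    while ready:
        t = ready.popleft()
        order.append(t)
        for d in downstream[t]:
            indegree[d] -= 1
            if indegree[d] == 0:
                ready.append(d)
    if len(order) != len(indegree):
        raise ValueError("circular dependency")  # rejected at creation time
    return order

# source -> order_totals -> big_customers
print(refresh_order({"order_totals": [], "big_customers": ["order_totals"]}))
# ['order_totals', 'big_customers']
```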
IMMEDIATE Mode (Transactional IVM)
IMMEDIATE mode maintains the stream table synchronously — within the same transaction as the source DML. This section covers when to use it, what SQL it supports, locking behavior, and how to switch between modes.
When should I use IMMEDIATE mode instead of DIFFERENTIAL?
Use IMMEDIATE when:
- Your application requires read-your-writes consistency — e.g., a user inserts an order and immediately queries a dashboard that must include that order.
- The defining query is relatively simple (single-table aggregation, joins, filters).
- The source table write rate is moderate (IMMEDIATE adds latency to every DML statement).
Stick with DIFFERENTIAL when:
- Staleness of a few seconds to minutes is acceptable.
- The defining query uses unsupported IMMEDIATE constructs (materialized-view sources, foreign-table sources).
- Write-side performance is critical (high-throughput OLTP).
- You need to decouple write latency from view maintenance.
What SQL features are NOT supported in IMMEDIATE mode?
IMMEDIATE mode supports all constructs that DIFFERENTIAL supports, with two source-type exceptions:
| Feature | Status | Notes |
|---|---|---|
| WITH RECURSIVE | ✅ Supported (IM1) | Semi-naive evaluation inside the trigger. A depth counter guards against infinite loops (pg_trickle.ivm_recursive_max_depth, default 100). A warning is emitted at create time for very deep hierarchies. |
| TopK (ORDER BY … LIMIT N [OFFSET M]) | ✅ Supported (IM2) | Micro-refresh: recomputes the top-N rows on every DML statement. Gated by pg_trickle.ivm_topk_max_limit to prevent unbounded scans. |
| Materialized views as sources | ❌ Rejected | Stale snapshots prevent trigger-based capture — use the underlying query instead. |
| Foreign tables as sources | ❌ Rejected | No triggers on foreign tables — use FULL mode instead. |
Attempting to create or switch to IMMEDIATE mode with an unsupported construct produces a clear error message.
What happens when I TRUNCATE a source table in IMMEDIATE mode?
A statement-level AFTER TRUNCATE trigger fires and truncates the stream table, then re-populates it by executing a full refresh from the defining query — all within the same transaction. The stream table remains consistent.
Can I have cascading IMMEDIATE stream tables (ST A → ST B)?
Yes. When ST A is IMMEDIATE and ST B depends on ST A and is also IMMEDIATE, changes propagate through the chain within the same transaction. The IVM triggers on the base table update ST A, and since that write is visible within the transaction, ST B's triggers fire and update ST B.
What locking does IMMEDIATE mode use?
IMMEDIATE mode acquires statement-level locks on the stream table during delta application:
- Simple queries (single-table scan/filter without aggregates or DISTINCT): RowExclusiveLock — allows concurrent readers, blocks other writers.
- Complex queries (joins, aggregates, DISTINCT, window functions): ExclusiveLock — blocks both readers and writers to ensure delta consistency.
This means concurrent writes to the same base table are serialized through the stream table lock. For high-concurrency write workloads, DIFFERENTIAL mode avoids this bottleneck.
How do I switch an existing DIFFERENTIAL stream table to IMMEDIATE?
SELECT pgtrickle.alter_stream_table('order_totals', refresh_mode => 'IMMEDIATE');
This:
- Validates the defining query against IMMEDIATE mode restrictions.
- Removes the row-level CDC triggers from source tables.
- Installs statement-level IVM triggers (BEFORE + AFTER with transition tables).
- Clears the schedule (IMMEDIATE mode has no schedule).
- Performs a full refresh to establish a consistent baseline.
To switch back:
SELECT pgtrickle.alter_stream_table('order_totals', refresh_mode => 'DIFFERENTIAL');
This reverses the process: removes IVM triggers, installs CDC triggers, restores the schedule (default 1m), and performs a full refresh.
What happens to IMMEDIATE mode during a manual refresh_stream_table() call?
For IMMEDIATE mode stream tables, refresh_stream_table() performs a FULL refresh — truncates and re-populates from the defining query. This is useful for recovering from edge cases or forcing a clean baseline. It is equivalent to pg_ivm's refresh_immv(name, true).
How much write-side overhead does IMMEDIATE mode add?
Each DML statement on a base table tracked by an IMMEDIATE stream table incurs:
- BEFORE trigger: Advisory lock acquisition + pre-state setup (~0.1–0.5 ms).
- AFTER trigger: Transition table copy to temp tables + delta SQL generation + delta application (~1–50 ms depending on query complexity and delta size).
For a simple single-table aggregate, expect 2–10 ms overhead per statement. For multi-table joins or window functions, overhead is higher. The overhead scales with the number of IMMEDIATE stream tables that depend on the same source table.
SQL Support
pg_trickle supports a broad range of SQL in defining queries. This section
covers what’s supported, what’s rejected (with rewrites), and how specific
constructs like aggregates and ORDER BY are handled. The subsections that
follow dive deeper into aggregates, joins, CTEs, window functions, and TopK.
What SQL features are supported in defining queries?
Most common SQL is supported in both FULL and DIFFERENTIAL modes:
- Table scans, projections, WHERE / HAVING filters
- INNER, LEFT, RIGHT, FULL OUTER JOIN (including multi-table joins)
- GROUP BY with 25+ aggregate functions (COUNT, SUM, AVG, MIN, MAX, BOOL_AND/OR, STRING_AGG, ARRAY_AGG, JSON_AGG, JSONB_AGG, BIT_AND/OR/XOR, STDDEV, VARIANCE, MODE, PERCENTILE_CONT/DISC, and more)
- FILTER (WHERE ...) on aggregates
- DISTINCT
- Set operations: UNION ALL, UNION, INTERSECT, INTERSECT ALL, EXCEPT, EXCEPT ALL
- Subqueries: EXISTS, NOT EXISTS, IN (subquery), NOT IN (subquery), scalar subqueries
- Non-recursive and recursive CTEs
- Window functions (ROW_NUMBER, RANK, SUM OVER, etc.)
- LATERAL joins with set-returning functions and correlated subqueries
- CASE, COALESCE, NULLIF, GREATEST, LEAST, BETWEEN, IS DISTINCT FROM
See DVM Operators for the complete list.
What SQL features are NOT supported?
The following are rejected with clear error messages and suggested rewrites:
| Feature | Reason | Suggested Rewrite |
|---|---|---|
| TABLESAMPLE | Stream tables materialize the full result set | Use WHERE random() < fraction in consuming query |
| Window functions in expressions | Cannot be differentially maintained | Move window function to a separate column |
| LIMIT / OFFSET (without ORDER BY) | Stream tables materialize the full result set; ORDER BY … LIMIT N [OFFSET M] is supported as TopK | Apply when querying the stream table, or add ORDER BY + LIMIT to use the TopK pattern |
| FOR UPDATE / FOR SHARE | Row-level locking not applicable | Remove the locking clause |
| RANGE_AGG / RANGE_INTERSECT_AGG | No incremental delta decomposition exists for range aggregates | Use FULL mode, or compute range unions in the consuming query |
Each rejected feature is explained in detail in the Why Are These SQL Features Not Supported? section below.
What happens to ORDER BY in defining queries?
ORDER BY in the defining query is accepted but silently discarded. This is consistent with how PostgreSQL handles CREATE MATERIALIZED VIEW AS SELECT ... ORDER BY ... — the ordering only affects the initial INSERT, not the stored data.
Stream tables are heap tables with no guaranteed row order. Apply ORDER BY when querying the stream table instead:
-- Don't rely on ORDER BY in the defining query:
-- 'SELECT region, SUM(amount) AS total FROM orders GROUP BY region ORDER BY total DESC'
-- Instead, order when reading:
SELECT * FROM regional_totals ORDER BY total DESC;
Exception: When ORDER BY is paired with LIMIT N (with or without OFFSET M), pg_trickle recognizes the TopK pattern and preserves the ordering, limit, and offset.
Which aggregates support DIFFERENTIAL mode?
Algebraic (O(changes), fully incremental): COUNT, SUM, AVG
Semi-algebraic (incremental with occasional group rescan): MIN, MAX
Group-rescan (affected groups re-aggregated from source): STRING_AGG, ARRAY_AGG, JSON_AGG, JSONB_AGG, BOOL_AND, BOOL_OR, BIT_AND, BIT_OR, BIT_XOR, JSON_OBJECT_AGG, JSONB_OBJECT_AGG, STDDEV, STDDEV_POP, STDDEV_SAMP, VARIANCE, VAR_POP, VAR_SAMP, MODE, PERCENTILE_CONT, PERCENTILE_DISC, CORR, COVAR_POP, COVAR_SAMP, REGR_AVGX, REGR_AVGY, REGR_COUNT, REGR_INTERCEPT, REGR_R2, REGR_SLOPE, REGR_SXX, REGR_SXY, REGR_SYY
37 aggregate function variants are supported in total.
Aggregates & Group-By
Aggregate handling is one of the most complex parts of incremental view maintenance. This section explains how pg_trickle categorizes aggregates by their incremental cost, how hidden auxiliary columns work, and what happens when groups are created or destroyed.
Which aggregates are fully incremental (O(1) per change) vs. group-rescan?
pg_trickle categorizes aggregates into three tiers:
| Tier | Cost per change | Aggregates | Mechanism |
|---|---|---|---|
| Algebraic | O(1) | COUNT, SUM, AVG | Hidden auxiliary columns (__pgt_count, __pgt_sum_x) track running totals. Delta updates these columns arithmetically. |
| Semi-algebraic | O(1) normally, O(group) on extremum deletion | MIN, MAX | Maintained via LEAST/GREATEST. If the current MIN/MAX is deleted, the group is rescanned to find the new extremum. |
| Group-rescan | O(group size) per affected group | All others (35 functions) | Affected groups are re-aggregated from source data. A NULL sentinel marks stale groups for rescan. |
For most workloads, the algebraic tier (COUNT/SUM/AVG) covers the majority of aggregations and is the fastest.
Why do some aggregates have hidden auxiliary columns?
For algebraic aggregates (COUNT, SUM, AVG), the DVM engine adds hidden __pgt_count and __pgt_sum_x columns to the stream table's storage. These store running totals that can be updated with O(1) arithmetic per change instead of rescanning the entire group.
For example, a stream table defined as SELECT dept, AVG(salary) FROM employees GROUP BY dept internally stores:
- dept — the group-by key
- avg — the user-visible average (computed as __pgt_sum_x / __pgt_count)
- __pgt_count — running count of rows in the group
- __pgt_sum_x — running sum of salary values
- __pgt_row_id — row identity hash
When a new employee is inserted, the refresh updates __pgt_count += 1, __pgt_sum_x += new_salary, and recomputes avg. No rescan of the source table is needed.
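The per-group arithmetic can be sketched in a few lines of Python. This models the algebraic maintenance described above (using the __pgt_count / __pgt_sum_x naming from the docs); it is a conceptual sketch, not the extension's code:

```python
class AvgGroup:
    """One group's hidden counters for an incremental AVG."""
    def __init__(self):
        self.count = 0     # __pgt_count
        self.sum_x = 0.0   # __pgt_sum_x

    def apply(self, value, multiplicity):
        """multiplicity is +1 for an insert, -1 for a delete; an UPDATE is
        a delete of the old value plus an insert of the new one."""
        self.count += multiplicity
        self.sum_x += multiplicity * value

    @property
    def avg(self):
        # The user-visible column, derived from the hidden counters.
        return self.sum_x / self.count if self.count else None

g = AvgGroup()
g.apply(1000, +1)
g.apply(2000, +1)
print(g.avg)       # 1500.0
g.apply(1000, -1)  # O(1) deletion — no rescan of the source group
print(g.avg)       # 2000.0
```

When the count reaches zero, the group row itself is deleted, matching the "empty group" behavior described below.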
How does HAVING work with incremental refresh?
HAVING is fully supported in DIFFERENTIAL mode. The DVM engine tracks threshold transitions — groups entering or exiting the HAVING condition:
- Group crosses threshold upward: A previously excluded group (e.g., HAVING COUNT(*) > 5) gains enough members → the group is inserted into the stream table.
- Group crosses threshold downward: A group that was included drops below the threshold → the group is deleted from the stream table.
- Group stays above threshold: Normal delta update (adjust aggregate values).
This means the stream table always reflects only the groups that satisfy the HAVING clause, even as group membership changes.
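The transition logic reduces to comparing a group's old and new state against the predicate. A minimal sketch for HAVING COUNT(*) > threshold (illustrative — the DVM engine emits this as part of the MERGE delta):

```python
def having_delta(old_count, new_count, threshold=5):
    """Classify a group's transition for HAVING COUNT(*) > threshold and
    return the action applied to that group's row in the stream table."""
    was_in = old_count > threshold
    now_in = new_count > threshold
    if not was_in and now_in:
        return "INSERT"   # crossed the threshold upward
    if was_in and not now_in:
        return "DELETE"   # crossed the threshold downward
    if was_in and now_in:
        return "UPDATE"   # stays visible, aggregates adjusted
    return "NOOP"         # stays excluded

print(having_delta(5, 6))  # INSERT
print(having_delta(6, 5))  # DELETE
print(having_delta(6, 7))  # UPDATE
```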
What happens to a group when all its rows are deleted?
When the last row of a group is deleted from the source table, the DVM engine detects that __pgt_count drops to zero and deletes the group row from the stream table. The hidden auxiliary columns are cleaned up along with it.
If a new row for the same group-by key is later inserted, a fresh group row is created from scratch.
Why are CORR, COVAR_*, and REGR_* limited to FULL mode?
Regression aggregates like CORR, COVAR_POP, COVAR_SAMP, and the REGR_* family require maintaining running sums of products and squares across the entire group. Unlike COUNT/SUM/AVG (where deltas can be computed from the change alone), regression aggregates:
- Lack algebraic delta rules. There is no closed-form way to update a correlation coefficient from a single row change without access to the full group's data.
- Would degrade to group-rescan anyway. Even if supported, the implementation would need to rescan the full group from source — identical to FULL mode for most practical group sizes.
These aggregates work fine in FULL refresh mode, which re-runs the entire query from scratch each cycle.
Joins
Join delta computation can produce surprising results when both sides change simultaneously. This section covers the standard IVM join rule, FULL OUTER JOIN support, and known edge cases.
How does a DIFFERENTIAL refresh handle a join when both sides changed?
When both tables in a join have changes since the last refresh, the DVM engine computes the join delta using the standard IVM join rule:
$$\Delta(R \bowtie S) = (\Delta R \bowtie S) \cup (R \bowtie \Delta S) \cup (\Delta R \bowtie \Delta S)$$
In practice, this means:
- Join the changes from the left against the current state of the right.
- Join the current state of the left against the changes from the right.
- Join the changes from both sides (handles simultaneous changes to matching keys).
All three parts are combined into a single CTE-based delta query that PostgreSQL executes in one pass.
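In skeleton form, the generated delta query resembles the following. This is a simplified sketch: the real query also carries row ids, +/- multiplicities, and frontier predicates, and the table and column names here are illustrative.
```sql
-- Δ(R ⋈ S) for tables r(k, a) and s(k, b), with per-cycle change sets
-- r_changes and s_changes (illustrative names):
WITH delta_r AS (SELECT k, a FROM r_changes),  -- ΔR since the last refresh
     delta_s AS (SELECT k, b FROM s_changes)   -- ΔS since the last refresh
SELECT dr.k, dr.a, s.b                         -- ΔR ⋈ S
FROM delta_r dr JOIN s ON s.k = dr.k
UNION ALL
SELECT r.k, r.a, ds.b                          -- R ⋈ ΔS
FROM r JOIN delta_s ds ON ds.k = r.k
UNION ALL
SELECT dr.k, dr.a, ds.b                        -- ΔR ⋈ ΔS
FROM delta_r dr JOIN delta_s ds ON ds.k = dr.k;
```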
Does pg_trickle support FULL OUTER JOIN incrementally?
Yes. FULL OUTER JOIN is supported in DIFFERENTIAL mode with an 8-part delta computation. This handles all four cases: matched rows on both sides, left-only rows, right-only rows, and rows that transition between matched and unmatched states as data changes.
The 8 parts cover: new left matches, removed left matches, new right matches, removed right matches, newly matched from left-only, newly matched from right-only, newly unmatched to left-only, and newly unmatched to right-only.
What happens when a join key is updated and the joined row is simultaneously deleted?
This is a known edge case. When a join key column is updated in the same refresh cycle as the joined-side row is deleted, the delta may miss the required DELETE, potentially leaving a stale row in the stream table.
Mitigations:
- The adaptive FULL fallback (triggered when the change ratio exceeds pg_trickle.differential_max_change_ratio) catches most high-change-rate scenarios where this is likely.
- You can stagger changes across refresh cycles.
- Use FULL mode for tables where this pattern is common.
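For the last mitigation, assuming alter_stream_table accepts a refresh_mode argument (as create_stream_table does — check the function reference for the exact signature):
```sql
-- Opt this stream table out of delta maintenance entirely;
-- FULL mode re-runs the defining query each cycle.
SELECT pgtrickle.alter_stream_table('order_details',
    refresh_mode => 'FULL');
```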
How does NATURAL JOIN work?
NATURAL JOIN is fully supported. At parse time, pg_trickle resolves the common columns between the two tables and synthesizes explicit equi-join conditions. The internal __pgt_row_id column is excluded from common column resolution, so NATURAL JOINs between stream tables also work correctly.
CTEs & Recursive Queries
Recursive CTE support is a key differentiator for pg_trickle. This section explains the three maintenance strategies (semi-naive, DRed, recomputation) and when each is used.
Do recursive CTEs work in DIFFERENTIAL mode?
Yes. pg_trickle supports WITH RECURSIVE in DIFFERENTIAL mode with three auto-selected strategies:
| Strategy | When used | How it works |
|---|---|---|
| Semi-naive evaluation | INSERT-only changes to the base case | Iteratively evaluates new derivations from the inserted rows without touching existing rows. Fastest path. |
| Delete-and-Rederive (DRed) | Mixed changes (INSERT + DELETE/UPDATE) | Deletes potentially affected derived rows, then rederives them from scratch to determine the true delta. |
| Recomputation fallback | Column mismatch or non-monotone recursive terms | Falls back to full recomputation of the recursive CTE. Used when the recursive term contains EXCEPT, Aggregate, Window, DISTINCT, AntiJoin, or INTERSECT SET operators. |
The strategy is selected automatically based on the type of changes and the recursive term's structure.
What are the three strategies for recursive CTE maintenance?
See the table above. In brief:
- Semi-naive is the fast path for append-only workloads (e.g., adding nodes to a tree). It's O(new derivations) — much cheaper than a full re-evaluation.
- DRed handles deletions and updates correctly by first removing potentially invalidated rows and then rederiving them. More expensive than semi-naive, but still incremental.
- Recomputation is the safe fallback that re-executes the entire recursive CTE. Used when the recursive term's structure is too complex for incremental processing.
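For example, a transitive-closure stream table over an edge list (names illustrative):
```sql
-- Maintains reachability pairs incrementally.
SELECT pgtrickle.create_stream_table(
    name  => 'reachable',
    query => 'WITH RECURSIVE r(src, dst) AS (
                  SELECT src, dst FROM edges
                  UNION
                  SELECT r.src, e.dst FROM r JOIN edges e ON e.src = r.dst
              )
              SELECT src, dst FROM r',
    schedule     => '1m',
    refresh_mode => 'DIFFERENTIAL'
);
-- INSERTs into edges take the semi-naive path; DELETEs/UPDATEs use DRed.
```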
What triggers a fallback from semi-naive to recomputation?
A recomputation fallback is triggered when:
- The recursive term contains non-monotone operators — EXCEPT, Aggregate, Window, DISTINCT, AntiJoin, or INTERSECT SET. These operators can "un-derive" rows when inputs change, which semi-naive evaluation cannot handle.
- Column mismatch — the CTE's output columns don't match the stream table's storage schema (e.g., after a schema change).
- Mixed DML with non-monotone terms — DELETE or UPDATE changes combined with non-monotone recursive terms always trigger recomputation.
Check which strategy was used in the refresh history:
SELECT action, rows_inserted, rows_deleted
FROM pgtrickle.get_refresh_history('my_recursive_st', 5);
What happens when a CTE is referenced multiple times in the same query?
When a non-recursive CTE is referenced more than once, pg_trickle uses shared delta computation — the CTE's delta is computed once and cached, then reused by each reference. This is tracked via CteScan operator nodes that look up the shared delta from an internal CTE registry.
For single-reference CTEs, pg_trickle simply inlines them as subqueries (no overhead).
Window Functions & LATERAL
Window functions are maintained via partition-based recomputation rather than row-level deltas. This section covers what’s supported, the expression restriction, and LATERAL constructs.
How are window functions maintained incrementally?
pg_trickle uses partition-based recomputation for window functions. When source data changes, the DVM engine:
- Identifies which partitions are affected by the changes (based on the PARTITION BY key).
- Recomputes the window function for only the affected partitions.
- Replaces the old partition results with the new ones in the stream table.
This is more efficient than a full recomputation when changes affect a small number of partitions.
Why can't I use a window function inside a CASE or COALESCE expression?
Window functions like ROW_NUMBER() OVER (…) are supported as standalone columns but cannot be embedded in expressions (e.g., CASE WHEN ROW_NUMBER() OVER (...) = 1 THEN ...).
This restriction exists because the DVM engine handles window functions by recomputing entire partitions. When a window function is buried inside an expression, the engine cannot isolate the window computation from the surrounding expression.
Rewrite: Move the window function to a separate column in one stream table, then reference it in a second stream table:
-- ST1: compute the window function
SELECT id, dept, salary,
ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) AS rn
FROM employees
-- ST2: use it in an expression (references ST1)
SELECT id, CASE WHEN rn = 1 THEN 'top' ELSE 'other' END AS rank_label
FROM st1
What LATERAL constructs are supported?
pg_trickle supports three kinds of LATERAL constructs:
| Construct | Example | Delta strategy |
|---|---|---|
| Set-returning functions | LATERAL jsonb_array_elements(data) | Row-scoped recomputation — only affected parent rows are re-expanded |
| Correlated subqueries | LATERAL (SELECT ... WHERE t.id = s.id) | Row-scoped recomputation |
| JSON_TABLE (PG 17+) | JSON_TABLE(data, '$.items[*]' ...) | Modeled as LateralFunction |
Additional supported SRFs: jsonb_each, jsonb_each_text, unnest, generate_series, and others.
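For example, flattening a JSONB array column into one row per element (table and column names are illustrative):
```sql
-- Each order''s items array is expanded into individual rows.
SELECT pgtrickle.create_stream_table(
    name  => 'order_items',
    query => 'SELECT o.id AS order_id, item->>''sku'' AS sku
              FROM orders o,
                   LATERAL jsonb_array_elements(o.items) AS item',
    schedule => '30s'
);
-- Updating a single order re-expands only that parent row.
```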
What happens when a row moves between window partitions during a refresh?
When a row's PARTITION BY key changes (e.g., an employee moves departments), the DVM engine recomputes both the old partition (to remove the row) and the new partition (to add it). Both partitions are re-evaluated from the source data, ensuring window function results are correct.
TopK (ORDER BY … LIMIT)
TopK queries (ORDER BY ... LIMIT N, optionally with OFFSET M) are handled via a
specialized MERGE-based strategy that re-executes the bounded query each cycle.
This section explains how it works and its limitations.
How does ORDER BY … LIMIT N work in a stream table?
When a defining query has a top-level ORDER BY … LIMIT N (with a constant integer N), pg_trickle recognizes it as a TopK pattern. An optional OFFSET M (constant integer) selects a "page" within the ranked result. The stream table stores exactly N rows and is refreshed via a MERGE-based scoped-recomputation strategy:
- On each refresh, the full query (with ORDER BY + LIMIT, and OFFSET if present) is re-executed against the source tables.
- The result is merged into the stream table using MERGE with NOT MATCHED BY SOURCE for deletes.
- The catalog records topk_limit, topk_order_by, and optionally topk_offset for the stream table.
TopK bypasses the DVM delta pipeline — it always re-executes the bounded query. This is efficient because the result set is bounded by N.
SELECT pgtrickle.create_stream_table(
name => 'top_customers',
query => 'SELECT customer_id, total FROM order_totals ORDER BY total DESC LIMIT 100',
schedule => '1m',
refresh_mode => 'DIFFERENTIAL'
);
-- With OFFSET — "page 2" of the leaderboard (rows 101–200):
SELECT pgtrickle.create_stream_table(
name => 'next_customers',
query => 'SELECT customer_id, total FROM order_totals ORDER BY total DESC LIMIT 100 OFFSET 100',
schedule => '1m',
refresh_mode => 'DIFFERENTIAL'
);
Does OFFSET work with TopK?
Yes. ORDER BY … LIMIT N OFFSET M is fully supported. The stream table stores exactly N rows starting from position M+1 in the ranked result. This is useful for:
- Paginated dashboards: Each page is a separate stream table with a different OFFSET.
- Excluding outliers: OFFSET 5 LIMIT 50 skips the top 5 and shows the next 50.
- Windowed leaderboards: OFFSET 10 LIMIT 10 shows the "second tier."
Caveat: When source data changes, the "page" can shift — a row on page 3 may move to page 2 or 4. The stream table always reflects the current state of the page at the time of the last refresh.
OFFSET 0 is treated as no offset.
What happens when a row below the top-N cutoff rises above it?
On the next refresh, the full ORDER BY … LIMIT N query is re-executed. The newly qualifying row appears in the result, and the row that fell out of the top-N is removed. The MERGE operation handles this by:
- INSERT the newly qualifying row
- DELETE the row that fell below the cutoff
- UPDATE any rows whose values changed but remained in the top-N
Since TopK always re-executes the bounded query, it correctly detects all ranking changes.
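The merge has roughly the following shape. This is an illustrative sketch — the real statement is generated internally and also tracks hidden row-id columns:
```sql
-- Re-execute the bounded query and reconcile the stored top-N.
MERGE INTO top_customers t
USING (SELECT customer_id, total
       FROM order_totals
       ORDER BY total DESC LIMIT 100) AS src
ON t.customer_id = src.customer_id
WHEN MATCHED AND t.total IS DISTINCT FROM src.total THEN
    UPDATE SET total = src.total                    -- value changed, still top-N
WHEN NOT MATCHED THEN
    INSERT (customer_id, total)
    VALUES (src.customer_id, src.total)             -- newly qualifying row
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;                                         -- fell below the cutoff
```
NOT MATCHED BY SOURCE requires PostgreSQL 17 or later, which pg_trickle's PostgreSQL 18 target satisfies.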
Can I use TopK with aggregates or joins?
Yes. The defining query can contain any SQL that pg_trickle supports, plus ORDER BY … LIMIT N:
-- TopK over an aggregate
SELECT dept, SUM(salary) AS total_salary
FROM employees GROUP BY dept
ORDER BY total_salary DESC LIMIT 10
-- TopK over a join
SELECT e.name, d.name AS dept, e.salary
FROM employees e JOIN departments d ON e.dept_id = d.id
ORDER BY e.salary DESC LIMIT 20
The only restriction is that TopK cannot be combined with set operations (UNION/INTERSECT/EXCEPT) or GROUPING SETS/CUBE/ROLLUP.
Tables Without Primary Keys
While primary keys are not required, their absence changes how pg_trickle identifies rows. This section explains the content-based hashing fallback and its limitations with duplicate rows.
Do source tables need a primary key?
No, but it is strongly recommended. When a source table has a primary key, pg_trickle uses it to generate a deterministic __pgt_row_id for each row — this is the most reliable way to track row identity across refreshes.
Without a primary key, pg_trickle falls back to content-based hashing — an xxHash of all column values. This works correctly for tables where every row is unique, but has known issues with exact duplicate rows. See What are the risks of using tables without primary keys? for details.
What are the risks of using tables without primary keys?
Content-based row identity has known limitations with exact duplicate rows (rows where every column value is identical):
- INSERT as no-op: If a row identical to an existing one is inserted, both have the same __pgt_row_id hash, so the MERGE treats it as a no-op (the row already exists).
- DELETE removes all copies: Deleting one of N identical rows generates a DELETE delta, but the MERGE removes all rows with that __pgt_row_id.
- Aggregate drift: Over time, these mismatches can cause aggregate values to drift from the true result.
Recommendation: Add a primary key or unique constraint to source tables, or use FULL mode for tables with frequent exact-duplicate rows.
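A surrogate key is sufficient (table and column names are illustrative):
```sql
-- Give a keyless table a generated primary key so pg_trickle can use
-- key-based row identity instead of content hashing.
ALTER TABLE events
    ADD COLUMN event_id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY;
```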
How does content-based row identity work for duplicate rows?
For tables without a primary key, __pgt_row_id is computed as pg_trickle_hash_multi(ARRAY[col1::text, col2::text, ...]) — an xxHash of all column values. Rows with identical content produce identical hashes.
The hash uses \x1E (record separator) between values and \x00NULL\x00 for NULL values, minimizing collision risk for rows with different content. However, truly identical rows (same values in every column) will always hash to the same value — this is inherent to content-based identity.
Change Data Capture (CDC)
This section explains how pg_trickle captures changes to your source tables, the trade-offs between trigger-based and WAL-based CDC, and operational topics like backup/restore and buffer inspection.
How does pg_trickle capture changes to source tables?
pg_trickle installs AFTER INSERT/UPDATE/DELETE row-level PL/pgSQL triggers on each source table referenced by a stream table. Whenever a row in the source table is modified, the trigger writes a change record into a per-source buffer table in the pgtrickle_changes schema.
Each change record contains:
- Action — I (insert), U (update), D (delete), or T (truncate marker)
- Row data — old and/or new row values serialized as JSONB
- LSN — the current WAL log sequence number, used for frontier tracking
- Transaction ID — links the change to its originating transaction
The trigger fires within your transaction, so if you roll back, the change record is also rolled back. This guarantees that only committed changes appear in the buffer.
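The installed trigger is roughly equivalent to the following simplified sketch. The actual function, buffer table name (changes_16384 here), and serialization details are internal to pg_trickle:
```sql
-- Simplified analogue of the CDC capture trigger (not the real implementation).
CREATE OR REPLACE FUNCTION capture_orders_change() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO pgtrickle_changes.changes_16384 (action, new_data, lsn, txid)
        VALUES ('I', to_jsonb(NEW), pg_current_wal_lsn(), txid_current());
    ELSIF TG_OP = 'UPDATE' THEN
        INSERT INTO pgtrickle_changes.changes_16384 (action, old_data, new_data, lsn, txid)
        VALUES ('U', to_jsonb(OLD), to_jsonb(NEW), pg_current_wal_lsn(), txid_current());
    ELSE  -- DELETE
        INSERT INTO pgtrickle_changes.changes_16384 (action, old_data, lsn, txid)
        VALUES ('D', to_jsonb(OLD), pg_current_wal_lsn(), txid_current());
    END IF;
    RETURN NULL;  -- AFTER row trigger; the return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER pgt_capture
    AFTER INSERT OR UPDATE OR DELETE ON orders
    FOR EACH ROW EXECUTE FUNCTION capture_orders_change();
```
Because the INSERT into the buffer runs inside the same transaction as the source DML, a rollback discards both together.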
What is the overhead of CDC triggers?
The per-row overhead is approximately 20–55 μs, which covers the PL/pgSQL function dispatch, row_to_json() serialization, and the buffer table INSERT.
At typical write rates (fewer than 1,000 writes per second per source table), this adds less than 5% additional DML latency. For most OLTP workloads, the overhead is negligible — a single network round-trip to the database is usually 10–100× more expensive.
If you have very high-throughput source tables (>10K writes/sec), consider enabling the hybrid CDC mode (pg_trickle.cdc_mode = 'auto') which can automatically transition to WAL-based capture for lower per-row overhead (~5–15 μs).
What happens when I TRUNCATE a source table?
TRUNCATE is captured via a statement-level AFTER TRUNCATE trigger that writes a T marker row to the change buffer. When the differential refresh engine detects this marker, it automatically falls back to a full refresh for that cycle, ensuring the stream table stays consistent. Both FULL and DIFFERENTIAL mode stream tables handle TRUNCATE correctly.
Are CDC triggers automatically cleaned up?
Yes. pg_trickle tracks which source tables are referenced by which stream tables in the pgt_dependencies catalog. When the last stream table referencing a particular source table is dropped, pg_trickle automatically:
- Removes the CDC triggers from the source table.
- Drops the associated change buffer table (pgtrickle_changes.changes_<oid>).
You do not need to manually clean up triggers or buffer tables.
What happens if a source table is dropped or altered?
pg_trickle has DDL event triggers that listen for ALTER TABLE and DROP TABLE on source tables. When a change is detected, pg_trickle responds automatically:
- All stream tables that depend on the altered source are marked with needs_reinit = true in the catalog.
- On the next scheduler cycle, each affected stream table is reinitialized — the existing storage table is dropped, recreated from the current defining query schema, and re-populated with a full refresh.
- A reinitialize_needed NOTIFY alert is sent so your monitoring can detect the event.
If the DDL change breaks the defining query (e.g., a column referenced in the query was dropped), the reinitialization will fail and the stream table will enter ERROR status. In that case, you need to drop and recreate the stream table with an updated query.
How do I check if a source table has switched from trigger-based CDC to WAL-based CDC?
When you enable hybrid CDC (pg_trickle.cdc_mode = 'auto'), pg_trickle starts capturing changes with triggers and can automatically transition to WAL-based logical replication once conditions are met. There are several ways to check the current CDC mode for each source table:
1. Query the dependency catalog directly:
SELECT d.source_relid, c.relname AS source_table, d.cdc_mode,
d.slot_name, d.decoder_confirmed_lsn, d.transition_started_at
FROM pgtrickle.pgt_dependencies d
JOIN pg_class c ON c.oid = d.source_relid;
The cdc_mode column shows one of three values:
- TRIGGER — changes are captured via row-level triggers (the default)
- TRANSITIONING — the system is in the process of switching from triggers to WAL
- WAL — changes are captured via logical replication
2. Use the built-in health check function:
SELECT source_table, cdc_mode, slot_name, lag_bytes, alert
FROM pgtrickle.check_cdc_health();
This returns a row per source table with the current mode, replication slot lag (for WAL-mode sources), and any alert conditions such as slot_lag_exceeds_threshold or replication_slot_missing.
3. Listen for real-time transition notifications:
LISTEN pg_trickle_cdc_transition;
pg_trickle sends a NOTIFY with a JSON payload whenever a transition starts, completes, or is rolled back. Example payload:
{
"event": "transition_complete",
"source_table": "public.orders",
"old_mode": "TRANSITIONING",
"new_mode": "WAL",
"slot_name": "pg_trickle_slot_16384"
}
This lets you integrate CDC mode changes into your monitoring stack without polling.
4. Check the global GUC setting:
SHOW pg_trickle.cdc_mode;
This shows the desired global behavior (trigger, auto, or wal), not the per-table actual state. The per-table state lives in pgt_dependencies.cdc_mode as described above.
See CONFIGURATION.md for details on the pg_trickle.cdc_mode, pg_trickle.wal_transition_timeout, pg_trickle.slot_lag_warning_threshold_mb, and pg_trickle.slot_lag_critical_threshold_mb GUCs.
Is it safe to add triggers to a stream table while the source table is switching CDC modes?
Yes, this is completely safe. CDC mode transitions and user-defined triggers operate on different tables and do not interfere with each other:
- CDC transitions affect how changes are captured from source tables (e.g., orders). The transition switches the capture mechanism from row-level triggers on the source table to WAL-based logical replication.
- User-defined triggers live on stream tables (e.g., order_totals) and control how the refresh engine applies changes to the materialized output.
Because these are independent concerns, you can freely add, modify, or remove triggers on a stream table at any point — including during an active CDC transition on its source tables.
How it works in practice:
- The refresh engine checks for user-defined triggers on the stream table at the start of each refresh cycle (via a fast pg_trigger lookup, <0.1 ms).
- If user triggers are detected, the engine uses explicit DELETE/UPDATE/INSERT statements instead of MERGE, so your triggers fire with correct TG_OP, OLD, and NEW values.
- The change data consumed by the refresh engine has the same format regardless of whether it came from CDC triggers or WAL decoding — so the trigger detection and the CDC mode are fully decoupled.
A trigger added between two refresh cycles will simply be picked up on the next cycle. The only (theoretical) edge case is adding a trigger in the tiny window during a single refresh transaction, between the trigger-detection check and the MERGE execution — but since both happen within the same transaction, this is virtually impossible in practice.
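For example, an audit trigger on the stream table itself (log_total_change is a placeholder for your own trigger function):
```sql
-- Audits changes the refresh engine applies to the stream table.
-- log_total_change() is a hypothetical user-defined trigger function.
CREATE TRIGGER audit_totals
    AFTER INSERT OR UPDATE OR DELETE ON order_totals
    FOR EACH ROW EXECUTE FUNCTION log_total_change();
```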
Why does pg_trickle use triggers instead of logical replication for initial CDC?
pg_trickle always bootstraps CDC with row-level AFTER triggers because they provide single-transaction atomicity — the change record is written in the same transaction as the source DML, so:
- No commit-order ambiguity. The change buffer always reflects committed data; rolled-back transactions never produce partial change records.
- No replication slot management at creation time. Logical replication requires creating and monitoring replication slots, which can bloat WAL if the subscriber falls behind. Trigger-based bootstrap avoids this complexity.
- Works on all hosting providers. Some managed PostgreSQL services restrict wal_level = logical or limit the number of replication slots. Trigger bootstrap works everywhere, with no configuration changes.
- Simpler initial deployment. No need for wal_level = logical, no publication/subscription setup, and no extra connections for WAL senders.
With pg_trickle.cdc_mode = 'auto' (the default since v0.3.0), pg_trickle uses triggers initially and then transparently transitions to WAL-based CDC if wal_level = logical is available. If WAL is not available, triggers are kept permanently — no degradation, no errors. Set pg_trickle.cdc_mode = 'trigger' if you want to disable WAL transitions entirely. See ADR-001 and ADR-002 in the architecture documentation for the full rationale.
Why is auto the default pg_trickle.cdc_mode?
As of v0.3.0, auto is the default CDC mode. This was changed from trigger based on the following considerations:
1. Safe no-op on standard installs.
PostgreSQL ships with wal_level = replica by default. In this configuration, auto simply stays on trigger-based CDC permanently — it does not create replication slots, publications, or any WAL infrastructure. There is no error, warning, or user-visible difference from the old trigger default. auto only activates the WAL transition path when wal_level = logical is explicitly configured by the operator.
2. Automatic fallback hardening. The WAL transition and steady-state polling now include robust automatic fallback:
- Consecutive poll errors (5 failures) trigger automatic revert to triggers.
- check_decoder_health() validates slot existence, WAL lag, and wal_level on every tick.
- The TRANSITIONING phase has a progressive timeout with informative warnings.
- Post-restart health checks (check_cdc_transition_health()) automatically clean up stale transitions.
3. Zero overhead for trigger-only deployments.
When wal_level != logical, the auto scheduler branch takes a fast-path exit after a single GUC check and pg_replication_slots query. The overhead compared to trigger mode is negligible (<1 ms per scheduler tick).
4. Progressive optimisation without config changes.
When an operator later enables wal_level = logical (e.g., for other replication needs), pg_trickle automatically benefits from lower per-row CDC overhead (~5–15 μs vs ~20–55 μs) without any configuration change. This aligns with the principle of least surprise.
When to use trigger instead: Set pg_trickle.cdc_mode = 'trigger' if you want fully deterministic trigger-only behaviour, need to minimize any replication slot management, or are on a restricted managed PostgreSQL that caps replication slots. This reverts to the pre-v0.3.0 default.
Caveats to be aware of in auto mode:
- Keyless tables (no PRIMARY KEY) stay on triggers permanently — WAL mode requires a PK for pk_hash computation.
- Replication slots prevent WAL recycling: if the decoder falls behind, WAL accumulates. pg_trickle now warns at pg_trickle.slot_lag_warning_threshold_mb (default 100 MB) and marks per-source CDC health unhealthy at pg_trickle.slot_lag_critical_threshold_mb (default 1024 MB).
- The TRANSITIONING phase runs both trigger and WAL decoder simultaneously; LSN-based deduplication handles correctness. If anything goes wrong, the system rolls back to triggers.
How does the trigger-to-WAL automatic transition work?
When pg_trickle.cdc_mode = 'auto', pg_trickle monitors each source table's write rate. When the rate exceeds an internal threshold, the transition proceeds in three phases:
- Slot creation. A logical replication slot is created for the source table's OID (e.g., pg_trickle_slot_16384).
- Dual capture. For a brief period, both triggers and WAL decoding capture changes. The system uses LSN comparison to deduplicate, ensuring no changes are lost or double-counted.
- Trigger removal. Once the WAL decoder has confirmed it is caught up (its confirmed LSN ≥ the frontier LSN), the row-level triggers are dropped and the source transitions fully to WAL mode.
The transition is tracked in pgt_dependencies.cdc_mode (values: TRIGGER → TRANSITIONING → WAL). If the transition times out (pg_trickle.wal_transition_timeout, default 5 minutes), it is rolled back and triggers are kept.
What happens to CDC if I restore a database backup?
After restoring a backup (pg_dump, pg_basebackup, or PITR), the CDC state depends on the backup type:
| Backup type | Triggers | Change buffers | Frontier | Action needed |
|---|---|---|---|---|
| pg_dump (logical) | Preserved (in DDL) | Buffer rows included | Catalog restored | Usually none — next refresh detects stale frontier and does a full refresh |
| pg_basebackup (physical) | Preserved | Buffer rows preserved (committed at backup time) | Catalog restored | Replication slots may be invalid — WAL-mode sources may need manual transition back to TRIGGER mode |
| PITR (point-in-time) | Preserved | Only committed buffer rows at the recovery target | Catalog restored | Similar to pg_basebackup; frontier may point ahead of actual buffer content → first refresh does a full refresh to reconcile |
In all cases, the pg_trickle scheduler automatically detects frontier inconsistencies and falls back to a full refresh for the first cycle after restore. No manual intervention is required for trigger-mode sources.
For WAL-mode sources, replication slots created after the backup point will not exist in the restored state. Set pg_trickle.cdc_mode = 'trigger' temporarily, or let the auto transition recreate slots.
For full guidelines on disaster recovery strategies, see our dedicated Backup and Restore chapter.
Do CDC triggers fire for rows inserted via logical replication (subscribers)?
Yes. PostgreSQL fires row-level triggers on the subscriber side for rows applied via logical replication. This means if you have a subscriber database with pg_trickle installed, the CDC triggers will capture replicated changes into the local change buffers.
Implication: You can run stream tables on a subscriber database that tracks replicated tables — the change capture works transparently. However, be careful about:
- Double-counting. If the same table is tracked by pg_trickle on both the publisher and subscriber, changes are captured twice (once on each side). This is fine if the stream tables are independent, but confusing if you expect them to be identical.
- Replication lag. The stream table on the subscriber will be delayed by both the replication lag and the pg_trickle refresh schedule.
Can I inspect the change buffer tables directly?
Yes. Change buffers are ordinary tables in the pgtrickle_changes schema, named changes_<source_oid>:
-- List all change buffer tables
SELECT tablename FROM pg_tables WHERE schemaname = 'pgtrickle_changes';
-- Inspect recent changes for a source table (find OID first)
SELECT c.oid FROM pg_class c JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname = 'orders' AND n.nspname = 'public';
-- Then query the buffer
SELECT action, lsn, txid, old_data, new_data
FROM pgtrickle_changes.changes_16384
ORDER BY lsn DESC LIMIT 10;
The action column contains: I (insert), U (update), D (delete), or T (truncate).
Warning: Do not modify buffer tables directly. The refresh engine manages buffer cleanup (truncation) after each successful refresh. Manual changes will corrupt the frontier tracking.
How does pg_trickle prevent its own refresh writes from re-triggering CDC?
When the refresh engine writes to a stream table (via MERGE or explicit DML), it does not trigger CDC capture on that stream table, even if the stream table is itself a source for a downstream stream table. This is because:
- CDC triggers are only installed on source tables, not on stream tables. The refresh engine writes directly to the stream table's storage without going through any change-capture mechanism.
- Downstream change propagation uses a different path. When stream table A is a source for stream table B, changes to A are detected at B's refresh time by re-reading A's data (not via triggers on A). The topological ordering ensures A is refreshed before B.
This design prevents infinite loops (A triggers B triggers A) and avoids the overhead of capturing changes to materialized output that will be recomputed anyway.
Diamond Dependencies & DAG Scheduling
When multiple stream tables form a diamond-shaped dependency graph, careful coordination is needed to avoid inconsistent snapshots. This section covers atomic consistency, schedule policies, and topological ordering.
What is a diamond dependency and why does it matter?
A diamond dependency occurs when two (or more) intermediate stream tables both depend on the same source, and a downstream stream table depends on both of them:
        Source: orders
         /          \
   ST: totals   ST: counts
         \          /
     ST: combined_report
Without coordination, combined_report might be refreshed after totals is updated but before counts is updated (or vice versa), producing a temporarily inconsistent snapshot — totals reflects the latest data but counts is stale.
What does diamond_consistency = 'atomic' do?
When diamond_consistency = 'atomic' is set on the downstream stream table (e.g., combined_report), pg_trickle ensures that all upstream stream tables in the diamond are refreshed within the same scheduler cycle before the downstream table is refreshed. This guarantees a consistent point-in-time snapshot.
If any upstream refresh in the atomic group fails, the downstream refresh is skipped for that cycle to avoid inconsistency. The failed upstream will be retried on the next cycle.
SELECT pgtrickle.alter_stream_table('combined_report',
diamond_consistency => 'atomic');
What is the difference between 'fastest' and 'slowest' schedule policy?
When a stream table has multiple upstream dependencies with different schedules, pg_trickle needs a policy for when to refresh the downstream table:
| Policy | Behavior | Best for |
|---|---|---|
| fastest | Refresh downstream whenever any upstream refreshes | Low-latency dashboards where partial freshness is acceptable |
| slowest | Refresh downstream only after all upstreams have refreshed | Reports requiring all-or-nothing consistency |
The default is fastest. Use slowest with diamond_consistency = 'atomic' for the strongest consistency guarantees.
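Assuming the policy can be set per stream table via alter_stream_table — the schedule_policy parameter name below is hypothetical; consult the function reference for the exact signature:
```sql
-- Strongest consistency: wait for all upstreams, then refresh atomically.
SELECT pgtrickle.alter_stream_table('combined_report',
    schedule_policy     => 'slowest',   -- hypothetical parameter name
    diamond_consistency => 'atomic');
```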
What happens when an atomic diamond group partially fails?
When diamond_consistency = 'atomic' is set and one upstream stream table in the diamond fails to refresh:
- The downstream refresh is skipped for that cycle (it reads stale-but-consistent data from the previous successful cycle).
- The downstream refresh is skipped for that cycle (it reads stale-but-consistent data from the previous successful cycle).
- The failed upstream follows the normal retry logic (exponential backoff, up to max_consecutive_errors).
- Other non-failing upstreams in the diamond are still refreshed normally — their data is fresh, but the downstream won't consume it until all upstreams succeed.
- A NOTIFY pg_trickle_alert with event diamond_partial_failure is sent so your monitoring can detect the situation.
How does pg_trickle determine topological refresh order?
The scheduler builds a directed acyclic graph (DAG) of stream table dependencies at startup and after any create_stream_table / drop_stream_table call. The algorithm:
- Edge discovery. For each stream table, the defining query's source tables are extracted. If a source table is itself a stream table, a dependency edge is added.
- Cycle detection. The DAG is checked for cycles. If a cycle is detected, the offending create_stream_table call is rejected with a clear error message listing the cycle path.
- Topological sort. A topological sort (Kahn's algorithm) produces the refresh order — leaf nodes (no stream table dependencies) are refreshed first, then their dependents, and so on.
- Level assignment. Each stream table is assigned a "level" (0 for leaves, max(parent levels) + 1 for dependents). Stream tables at the same level are refreshed concurrently when pg_trickle.parallel_refresh_mode = 'on'.
The topological order is recalculated whenever the DAG changes. You can inspect it with:
SELECT pgt_name, depends_on, topo_level
FROM pgtrickle.stream_tables_info
ORDER BY topo_level, pgt_name;
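The sort and level assignment described above can be sketched in Python (an illustrative model of Kahn's algorithm with level tracking, not the extension's actual implementation):

```python
from collections import defaultdict, deque

def topo_levels(edges: dict[str, list[str]]) -> dict[str, int]:
    """Assign each stream table a refresh level: 0 for leaves (no
    stream-table dependencies), max(parent levels) + 1 otherwise.
    `edges` maps each stream table to the stream tables it depends on.
    Raises ValueError when a dependency cycle is detected."""
    nodes = set(edges) | {d for deps in edges.values() for d in deps}
    indegree = {n: 0 for n in nodes}   # unresolved dependencies per node
    dependents = defaultdict(list)     # reverse edges: dep -> dependents
    for st, deps in edges.items():
        indegree[st] = len(deps)
        for d in deps:
            dependents[d].append(st)
    queue = deque(n for n in nodes if indegree[n] == 0)  # leaves first
    level = {n: 0 for n in queue}
    resolved = 0
    while queue:
        n = queue.popleft()
        resolved += 1
        for dep in dependents[n]:
            level[dep] = max(level.get(dep, 0), level[n] + 1)
            indegree[dep] -= 1
            if indegree[dep] == 0:
                queue.append(dep)
    if resolved != len(nodes):   # some nodes never reached indegree 0
        raise ValueError("dependency cycle detected")
    return level

# daily_totals depends on order_totals, which depends on no stream table:
print(topo_levels({"order_totals": [], "daily_totals": ["order_totals"]}))
# {'order_totals': 0, 'daily_totals': 1}
```

Nodes sharing a level have no dependency between them, which is exactly the property that makes same-level parallel refresh safe.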
Schema Changes & DDL Events
pg_trickle detects source table schema changes via PostgreSQL’s DDL event trigger system and reacts automatically. This section explains what happens for various DDL operations and how to handle them.
What happens when I add a column to a source table?
Adding a column to a source table is safe and non-disruptive if the stream table's defining query does not use SELECT *:
- Named columns: If the defining query explicitly lists columns (e.g., `SELECT id, name, amount FROM orders`), the new column is simply not captured by CDC and has no effect on the stream table.
- `SELECT *`: If the defining query uses `SELECT *`, pg_trickle detects the schema mismatch at the next refresh and marks the stream table with `needs_reinit = true`. The next scheduler cycle performs a full reinitialization — drops the storage table, recreates it with the new column set, and does a full refresh.
CDC triggers capture the full row as JSONB regardless of which columns the stream table uses, so no trigger changes are needed.
What happens when I drop a column used in a stream table's query?
Dropping a column that is referenced in a stream table's defining query will cause the next refresh to fail because the column no longer exists in the source table. pg_trickle handles this via:
- The DDL event trigger detects the `ALTER TABLE ... DROP COLUMN` and marks all affected stream tables with `needs_reinit = true`.
- On the next refresh cycle, the scheduler attempts reinitialization — but the defining query will fail with a PostgreSQL error (e.g., `column "amount" does not exist`).
- The stream table moves to ERROR status after `max_consecutive_errors` failures.
- A `reinitialize_needed` NOTIFY alert is sent.
Resolution: Drop and recreate the stream table with an updated defining query:
SELECT pgtrickle.drop_stream_table('order_totals');
SELECT pgtrickle.create_stream_table(
name => 'order_totals',
query => 'SELECT id, name FROM orders', -- updated query without dropped column
schedule => '1m',
refresh_mode => 'DIFFERENTIAL'
);
What happens when I CREATE OR REPLACE a view used by a stream table?
PostgreSQL event triggers fire on CREATE OR REPLACE VIEW, so pg_trickle detects the change and marks dependent stream tables with needs_reinit = true. On the next refresh:
- If the new view definition is compatible (same output columns, same types), reinitialization succeeds transparently — the stream table is repopulated with the new query logic.
- If the new view definition changes the output schema (different columns or types), the delta query will fail and the stream table enters ERROR status.
Tip: To avoid disruption, use pgtrickle.alter_stream_table() to pause the stream table before replacing the view, then resume after verifying compatibility.
What happens when I alter or drop a function used in a stream table's query?
If a stream table's defining query calls a user-defined function (e.g., SELECT my_func(amount) FROM orders) and that function is altered or dropped:
- `ALTER FUNCTION` (changing the body): pg_trickle does not detect this automatically — PostgreSQL does not fire DDL event triggers for function body changes. The stream table continues refreshing with the new function behavior. If this is intentional, no action is needed. If you want a full rebase to the new logic, temporarily switch to FULL mode and refresh:
SELECT pgtrickle.alter_stream_table('my_st', refresh_mode => 'FULL');
SELECT pgtrickle.refresh_stream_table('my_st');
SELECT pgtrickle.alter_stream_table('my_st', refresh_mode => 'DIFFERENTIAL');
- `DROP FUNCTION`: The next refresh fails because the function no longer exists. The stream table enters ERROR status. Recreate the function or drop and recreate the stream table.
What is reinitialize and when does it trigger?
Reinitialize is pg_trickle's mechanism for handling structural changes to source tables. When a stream table is marked with needs_reinit = true, the next scheduler cycle performs:
- Drop the existing storage table (the physical heap table backing the stream table).
- Recreate the storage table from the defining query's current output schema.
- Full refresh — run the defining query against current source data and populate the new storage table.
- Reset the frontier to the current LSN.
- Clear the `needs_reinit` flag.
Reinitialize triggers automatically when DDL event triggers detect `ALTER TABLE`, `DROP TABLE`, or `CREATE OR REPLACE VIEW` on source tables or intermediate views. Whenever the flag is set, a `needs_reinit` NOTIFY alert is sent.
You can also trigger reinitialization manually:
UPDATE pgtrickle.pgt_stream_tables SET needs_reinit = true WHERE pgt_name = 'my_st';
Can I block DDL on tracked source tables?
pg_trickle does not block DDL on source tables by default — it only reacts to DDL changes via event triggers (see `pg_trickle.block_source_ddl` below for an opt-in guard). If you want to prevent accidental schema changes on critical source tables, use PostgreSQL's built-in mechanisms:
-- Revoke ALTER/DROP from application roles
REVOKE ALL ON TABLE orders FROM app_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON TABLE orders TO app_user;
-- Only the table owner (or superuser) can now ALTER/DROP
Alternatively, create a custom event trigger that raises an exception when DDL targets tracked source tables:
CREATE OR REPLACE FUNCTION prevent_source_ddl() RETURNS event_trigger AS $$
BEGIN
IF EXISTS (
SELECT 1 FROM pg_event_trigger_ddl_commands() cmd
JOIN pgtrickle.pgt_dependencies d ON d.source_relid = cmd.objid
) THEN
RAISE EXCEPTION 'Cannot ALTER/DROP a table tracked by pg_trickle';
END IF;
END;
$$ LANGUAGE plpgsql;
CREATE EVENT TRIGGER guard_source_ddl ON ddl_command_end
EXECUTE FUNCTION prevent_source_ddl();
What happens if I run DDL on a source table during an active refresh?
PostgreSQL's locking mechanism prevents most conflicts. The refresh transaction acquires a ShareLock on source tables before reading them. Since ALTER TABLE (including ADD COLUMN, DROP COLUMN, ALTER TYPE) requires an AccessExclusiveLock, the DDL statement blocks until the refresh transaction completes.
In practice:
- During a refresh: The ALTER TABLE waits for the refresh to finish, then proceeds. pg_trickle's DDL event trigger then detects the change and marks the stream table for reinitialization.
- Between refreshes: DDL proceeds immediately. The next refresh picks up the reinitialization flag.
There is a tiny theoretical window between lock acquisition and the first read where DDL could sneak in, but this is prevented by PostgreSQL's MVCC — the refresh's snapshot was taken before the DDL committed, so it reads the old schema regardless.
If `pg_trickle.block_source_ddl = true`: column-affecting DDL on tracked source tables is rejected entirely with an ERROR, regardless of whether a refresh is running.
Do stream tables work with logical replication?
Stream tables are replicated to standbys via physical (streaming) replication like any other heap table. However, they are not automatically maintained by pg_trickle on the subscriber:
| Aspect | Primary | Physical standby | Logical subscriber |
|---|---|---|---|
| Scheduler runs | Yes | No (read-only) | No (no pg_trickle catalog) |
| Stream tables readable | Yes | Yes (replicated) | Only if published |
| Refreshes occur | Yes | No (standby is read-only) | No |
| Change buffers | Managed by pg_trickle | Replicated but not consumed | Not available |
Key limitations:
- Change buffer tables (`pgtrickle_changes.*`) are not published through logical replication — they are internal transient data.
- The pg_trickle catalog (`pgtrickle.pgt_stream_tables`) is not replicated through logical replication.
- On a physical standby, stream tables receive updates through streaming replication with the usual replication lag.
Recommended pattern: Run pg_trickle on the primary only. Read stream tables from any physical standby.
Performance & Tuning
This section covers scheduler tuning, the adaptive FULL fallback, disk space management, and guidance on when to use DIFFERENTIAL vs. FULL mode.
How do I tune the scheduler interval?
The pg_trickle.scheduler_interval_ms GUC controls how often the scheduler checks for stale stream tables (default: 1000 ms).
| Workload | Recommended Value |
|---|---|
| Low-latency (near real-time) | 100–500 |
| Standard | 1000 (default) |
| Low-overhead (many STs, long schedules) | 5000–10000 |
Is there any risk in setting min_schedule_seconds very low?
Yes. pg_trickle.min_schedule_seconds (default: 60) is a safety guardrail, not an arbitrary limit. Setting it very low — especially in production — can cause several problems:
WAL amplification. Every differential refresh writes a MERGE to the WAL. At 1-second intervals across many stream tables, WAL generation rises sharply, increasing replication lag and storage costs.
Lock contention. Each refresh acquires locks on the change buffer table. With cleanup_use_truncate = true (the default), this is an AccessExclusiveLock. Sub-second schedules can starve concurrent INSERT/UPDATE/DELETE statements on the source tables.
Cascading refresh load. If a refresh takes longer than the schedule interval (e.g., an 800 ms refresh on a 1-second schedule), the next refresh fires almost immediately upon completion. With chained or diamond-shaped ST graphs, the entire topological chain must complete within the interval to avoid falling behind.
Autovacuum pressure. Rapid MERGE operations produce dead tuples in the stream table faster than autovacuum can clean them up, bloating the table and degrading query performance over time.
Adaptive fallback triggering. At high change rates, pg_trickle.differential_max_change_ratio may trigger a FULL refresh instead of DIFFERENTIAL. A FULL refresh at 1-second intervals is very expensive and defeats the purpose of differential maintenance.
Practical guidance:
| Environment | Recommended minimum |
|---|---|
| Development / testing | 1 s — fine for fast iteration |
| Lightly loaded production | 10–30 s |
| Standard production | 60 s (default) |
| High-throughput OLTP | 120+ s — let change buffers accumulate for efficient batch merging |
If you need near-real-time results, consider IMMEDIATE mode (refresh_mode => 'DIFFERENTIAL' with same-transaction refresh) instead of a very short schedule — it avoids the scheduler overhead entirely and updates the stream table within your transaction.
What is the adaptive fallback to FULL?
When the number of pending changes exceeds pg_trickle.differential_max_change_ratio (default: 15%) of the source table size, DIFFERENTIAL mode automatically falls back to FULL for that refresh cycle. This prevents pathological delta queries on bulk changes.
- Set to `1.0` to effectively always use DIFFERENTIAL (fallback occurs only if pending changes exceed the table size)
- Set to `0.0` to always fall back to FULL (any pending change exceeds the threshold)
- The default `0.15` (15%) is a good balance
How many concurrent refreshes can run?
By default (parallel_refresh_mode = 'off') refreshes are processed sequentially within the scheduler's single background worker. This is safe and efficient for most deployments.
Starting in v0.4.0, true parallel refresh is available via:
ALTER SYSTEM SET pg_trickle.parallel_refresh_mode = 'on';
ALTER SYSTEM SET pg_trickle.max_dynamic_refresh_workers = 4; -- cluster-wide cap
ALTER SYSTEM SET pg_trickle.max_concurrent_refreshes = 4; -- per-database cap
SELECT pg_reload_conf();
When enabled, independent stream tables at the same DAG level are refreshed concurrently in separate dynamic background workers. Each worker uses one `max_worker_processes` slot — see the worker-budget formula before enabling.
Monitor parallel refresh with:
SELECT * FROM pgtrickle.worker_pool_status();
SELECT * FROM pgtrickle.parallel_job_status(60);
For most deployments with fewer than 100 stream tables, sequential processing is still efficient (each differential refresh typically takes 5–50 ms).
How do I check if my stream tables are keeping up?
-- Quick overview
SELECT pgt_name, status, staleness, stale
FROM pgtrickle.stream_tables_info;
-- Detailed statistics
SELECT pgt_name, total_refreshes, avg_duration_ms, consecutive_errors, stale
FROM pgtrickle.pg_stat_stream_tables;
-- Recent refresh history for a specific ST
SELECT * FROM pgtrickle.get_refresh_history('order_totals', 10);
What is __pgt_row_id?
Every stream table has a __pgt_row_id BIGINT PRIMARY KEY column that stores a 64-bit xxHash of the row's identity key. The refresh engine uses it to match incoming deltas against existing rows during MERGE operations.
For a detailed explanation of how this column is computed and why it exists, see What is the __pgt_row_id column and why does it appear in my stream tables? in the General section.
You should ignore this column in your queries. It is an implementation detail.
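As a mental model, the row id is a stable 64-bit hash of the identity-key column values. The sketch below illustrates the idea in Python; it uses `hashlib.blake2b` as a stand-in hash, since the extension itself uses xxHash:

```python
import hashlib

def pgt_row_id(*key_values) -> int:
    """Illustrative model of a __pgt_row_id: derive a stable 64-bit
    integer from a row's identity-key columns. NULLs are encoded as
    empty strings and columns are joined with a separator so that
    ('a', 'b') and ('ab',) hash differently. blake2b stands in for
    the xxHash used by the real extension."""
    key = "\x1f".join("" if v is None else str(v) for v in key_values)
    digest = hashlib.blake2b(key.encode(), digest_size=8).digest()
    # Interpret the 8-byte digest as a signed 64-bit value (fits BIGINT).
    return int.from_bytes(digest, "big", signed=True)

# The same identity key always maps to the same row id, which is what
# lets MERGE match an incoming delta against the stored row:
assert pgt_row_id(42, "active") == pgt_row_id(42, "active")
assert pgt_row_id(42, "active") != pgt_row_id(42, "inactive")
```

The important property is determinism: a delta row recomputes the same id as the stored row it updates or deletes, so the MERGE can match on the primary key alone.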
How much disk space do change buffer tables consume?
Each change buffer table stores one row per source-table change (INSERT, UPDATE, DELETE, or TRUNCATE marker). The row size depends on the source table's column count and data types:
| Component | Approximate size |
|---|---|
| `action` column (char) | 1 byte |
| `old_data` / `new_data` (JSONB) | 1–10 KB per row (depends on source columns) |
| `lsn` (pg_lsn) | 8 bytes |
| `txid` (xid8) | 8 bytes |
| Index (on `lsn`) | ~40 bytes per row |
Rule of thumb: Buffer tables consume roughly 2–3× the raw row size of the source change, because both OLD and NEW values are stored as JSONB.
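The rule of thumb can be turned into a back-of-the-envelope estimator. The per-row constants below come from the table above; the 2 KB average JSONB row image is an assumption you should tune to your schema:

```python
def estimate_buffer_bytes(n_changes: int, avg_jsonb_bytes: int = 2048) -> int:
    """Rough size of a change buffer table. Per buffered change:
    1 byte action + OLD and NEW row images as JSONB + 8-byte lsn +
    8-byte txid + ~40 bytes of index entry. avg_jsonb_bytes is the
    typical size of ONE serialized row image (assumed, schema-dependent)."""
    per_row = 1 + 2 * avg_jsonb_bytes + 8 + 8 + 40
    return n_changes * per_row

# 100k buffered changes with ~2 KB row images is roughly 415 MB:
print(estimate_buffer_bytes(100_000) / 1e6)  # 415.3 (MB)
```

Because both OLD and NEW images are stored, the estimate grows with twice the row-image size, matching the 2–3x rule of thumb above.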
Buffer tables are cleaned up (truncated or deleted) after each successful refresh. If you suspect buffer bloat, check:
SELECT relname, pg_size_pretty(pg_total_relation_size(oid)) AS size
FROM pg_class
WHERE relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = 'pgtrickle_changes')
ORDER BY pg_total_relation_size(oid) DESC;
What determines whether DIFFERENTIAL or FULL is faster for a given workload?
The breakeven point depends on the change ratio — the number of changed rows relative to the total source table size:
| Change ratio | Recommended mode | Why |
|---|---|---|
| < 5% | DIFFERENTIAL | Delta query touches few rows; much cheaper than re-reading everything |
| 5–15% | DIFFERENTIAL (usually) | Still faster, but approaching the crossover |
| 15–50% | FULL | The delta query scans a large fraction of the source anyway; FULL avoids the overhead of delta computation |
| > 50% | FULL | Bulk load scenario — TRUNCATE + INSERT is simpler and faster |
Additional factors:
- Query complexity: Queries with many joins or window functions have more expensive delta computation. The crossover shifts lower.
- Source table size: For small tables (<10K rows), FULL is nearly always faster because the overhead is negligible.
- Index presence: DIFFERENTIAL uses indexes to look up changed rows. Missing indexes on join keys or GROUP BY columns can make delta queries slow.
The adaptive fallback (pg_trickle.differential_max_change_ratio, default 0.15) automates this decision per-cycle.
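The crossover logic in the table can be expressed as a small helper. This is a sketch using the thresholds quoted above (15% crossover, 10K-row small-table cutoff), not extension internals:

```python
def recommended_mode(changed_rows: int, total_rows: int,
                     crossover: float = 0.15) -> str:
    """Pick DIFFERENTIAL vs FULL from the change ratio, mirroring the
    guidance table: below the crossover, delta computation wins; above
    it, a full recompute is cheaper. Small tables always favor FULL
    because the fixed overhead of delta machinery dominates."""
    if total_rows < 10_000:
        return "FULL"
    ratio = changed_rows / total_rows
    return "DIFFERENTIAL" if ratio <= crossover else "FULL"

print(recommended_mode(500, 1_000_000))      # DIFFERENTIAL (0.05% changed)
print(recommended_mode(400_000, 1_000_000))  # FULL (40% changed)
```

In practice you rarely call anything like this yourself; the adaptive fallback makes the equivalent decision automatically each cycle.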
What are the planner hints and when should I disable them?
Before executing a delta query, pg_trickle sets several session-level planner parameters to guide PostgreSQL toward efficient delta plans:
SET LOCAL enable_seqscan = off; -- Prefer index scans for small deltas
SET LOCAL enable_nestloop = on; -- Nested loops are good for small delta × large table joins
SET LOCAL enable_mergejoin = off; -- Merge joins are worse for skewed delta sizes
These hints are active only during the refresh transaction and are reset afterward.
When to disable hints: If you notice that a particular stream table's refresh is slow (check avg_duration_ms in pg_stat_stream_tables), the planner hints may be suboptimal for that specific query. You can disable them by setting:
SET pg_trickle.planner_hints = off;
This allows PostgreSQL's planner to choose its own strategy. Test both settings and compare avg_duration_ms.
How do prepared statements help refresh performance?
The refresh engine uses PostgreSQL prepared statements (PREPARE / EXECUTE) for the delta and MERGE queries. On the first refresh, the statement is prepared; subsequent refreshes reuse the cached plan. Benefits:
- Reduced planning overhead. For complex delta queries with many joins and CTEs, planning can take 5–50 ms. Prepared statements skip this on subsequent refreshes.
- Stable plans. The planner uses generic plans after the 5th execution (PostgreSQL default), avoiding plan instability from statistic fluctuations.
Prepared statements are stored per-session and are invalidated when:
- The stream table is reinitialized (schema change)
- The shared cache generation advances after DDL or stream-table metadata changes
- The PostgreSQL connection is recycled
- The session ends
How does the adaptive FULL fallback threshold work in practice?
The pg_trickle.differential_max_change_ratio GUC (default: 0.15) is evaluated per source table, per refresh cycle:
- Before each differential refresh, the engine counts pending changes in the buffer table: `pending_changes = COUNT(*) FROM pgtrickle_changes.changes_<oid>`.
- It estimates the source table size from `pg_class.reltuples`.
- If `pending_changes / reltuples > differential_max_change_ratio`, the engine falls back to FULL for that cycle.
Edge cases:
- If the source table has `reltuples = 0` (freshly created, no ANALYZE yet), the engine always uses FULL until statistics are available.
- For multi-source stream tables (joins), each source is evaluated independently. If any source exceeds the threshold, the entire refresh falls back to FULL.
- The threshold applies to the current cycle only — the next cycle re-evaluates.
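Putting the steps and edge cases together, the per-cycle decision can be modeled roughly like this (a Python sketch of the documented behavior, not the C implementation):

```python
def falls_back_to_full(sources: dict[str, tuple[int, float]],
                       max_change_ratio: float = 0.15) -> bool:
    """sources maps each source table name to (pending_changes, reltuples).
    Returns True if this cycle should use FULL instead of DIFFERENTIAL,
    mirroring the documented rules: reltuples == 0 (no statistics yet)
    forces FULL, and any single source over the threshold forces FULL
    for the whole refresh."""
    for pending, reltuples in sources.values():
        if reltuples == 0:
            return True
        if pending / reltuples > max_change_ratio:
            return True
    return False

# One source over the 15% threshold forces FULL for the whole refresh:
print(falls_back_to_full({"orders": (200, 1000), "customers": (1, 1000)}))  # True
print(falls_back_to_full({"orders": (50, 1000)}))                           # False
```

Note the decision is memoryless: a FULL fallback this cycle says nothing about the next cycle, which re-evaluates from fresh counts.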
How many stream tables can a single PostgreSQL instance handle?
There is no hard limit. Practical limits depend on:
| Factor | Guideline |
|---|---|
| Scheduler overhead | Each cycle iterates all STs; at 1000 STs with 1ms overhead per check, the cycle takes ~1s |
| Background connections | 1 per database (the scheduler) + 1 per manual refresh call |
| Change buffer bloat | Each source table gets its own buffer table — many sources = many tables in pgtrickle_changes |
| Catalog size | pgt_stream_tables and pgt_dependencies grow linearly |
| Refresh throughput | Sequential processing means total cycle time = sum of individual refresh times |
Tested benchmarks: Up to 500 stream tables on a single instance with <2s total cycle time for DIFFERENTIAL refreshes averaging 3ms each.
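For sequential mode, the capacity guidance reduces to simple arithmetic: every stream table pays the per-check overhead each cycle, plus the cost of the refreshes actually performed. A sketch, assuming the 1 ms per-check figure quoted in the table:

```python
def sequential_cycle_seconds(refresh_ms: list[float],
                             per_check_overhead_ms: float = 1.0) -> float:
    """Total scheduler cycle time with sequential processing: each
    stream table costs the per-check overhead, plus the sum of the
    individual refresh durations run this cycle."""
    return (len(refresh_ms) * per_check_overhead_ms + sum(refresh_ms)) / 1000.0

# 500 stream tables averaging 3 ms each, as in the benchmark above:
print(sequential_cycle_seconds([3.0] * 500))  # 2.0 (seconds)
```

This is why the benchmark lands at about 2 s: 500 ms of check overhead plus 1.5 s of refresh work. Parallel mode shortens the second term by overlapping same-level refreshes.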
What is the TRUNCATE vs DELETE cleanup trade-off for change buffers?
After each successful refresh, the engine cleans up processed change records from the buffer table. The pg_trickle.cleanup_use_truncate GUC (default: true) controls the method:
| Method | Pros | Cons |
|---|---|---|
| `TRUNCATE` (default) | Instant — O(1) regardless of row count. Reclaims disk space immediately. | Takes an ACCESS EXCLUSIVE lock on the buffer table, briefly blocking concurrent INSERTs from CDC triggers (~0.1 ms typical). |
| `DELETE` | Row-level locks only — no blocking of concurrent CDC writes. | O(N) — proportional to the number of processed rows. Dead tuples require VACUUM to reclaim space. |
When to switch to DELETE: If your source table has extremely high write throughput (>10K writes/sec) and you observe brief stalls in DML latency during refresh cleanup, switch to DELETE:
ALTER SYSTEM SET pg_trickle.cleanup_use_truncate = false;
SELECT pg_reload_conf();
For most workloads, TRUNCATE is the better choice because buffer tables are typically emptied completely after each refresh.
Interoperability
Stream tables are standard PostgreSQL heap tables, which means they work with most PostgreSQL features. This section clarifies what’s compatible (views, replication, triggers) and what’s not (direct DML, foreign keys).
Can PostgreSQL views reference stream tables?
Yes. Since stream tables are standard PostgreSQL heap tables, you can create views on top of them just like any other table. The view will return whatever data is currently in the stream table, reflecting the most recent refresh:
CREATE VIEW high_value_customers AS
SELECT customer_id, total FROM pgtrickle.order_totals WHERE total > 1000;
This is a common pattern for adding per-user filters or formatting on top of a shared stream table.
Can materialized views reference stream tables?
Yes, though this is usually redundant — both materialized views and stream tables are physical snapshots of query results. The key difference is that the materialized view requires its own manual REFRESH MATERIALIZED VIEW call; it does not auto-refresh when the underlying stream table refreshes.
A more idiomatic approach is to create a second stream table that references the first one. This way, pg_trickle handles the dependency ordering and refresh scheduling for both automatically.
Can I replicate stream tables with logical replication?
Yes. Stream tables can be published like any ordinary table:
CREATE PUBLICATION my_pub FOR TABLE pgtrickle.order_totals;
Important caveats:
- The `__pgt_row_id` column is replicated (it is the primary key)
- Subscribers receive materialized data, not the defining query
- Do not install pg_trickle on the subscriber and attempt to refresh the replicated table — it will have no CDC triggers or catalog entries
- Internal change buffer tables are not published by default
Can I INSERT, UPDATE, or DELETE rows in a stream table directly?
No. Stream table contents are managed exclusively by the refresh engine, and direct DML will corrupt the internal state (row IDs, frontier tracking, and change buffer consistency). See Why can't I INSERT, UPDATE, or DELETE rows in a stream table? for a detailed explanation of what goes wrong.
If you need to post-process stream table data, create a view or a second stream table that references the first one.
Can I add foreign keys to or from stream tables?
No. Foreign key constraints are incompatible with how the refresh engine operates. The engine uses bulk MERGE operations that apply inserts and deletes atomically, without guaranteeing the row-by-row ordering that foreign key checks require. Full refreshes also use TRUNCATE + INSERT, which bypasses cascade logic entirely.
See Why can't I add foreign keys? for details. If you need referential integrity, enforce it in your application or in a view that joins the stream tables.
Can I add my own triggers to stream tables?
Yes, for DIFFERENTIAL mode stream tables. When user-defined row-level triggers are detected, the refresh engine automatically switches from MERGE to explicit DELETE + UPDATE + INSERT statements. This ensures triggers fire with the correct TG_OP, OLD, and NEW values. Legacy configs that still set pg_trickle.user_triggers = 'on' are treated the same as auto.
Limitations:
- Row-level triggers do not fire during FULL refresh (they are automatically suppressed via `DISABLE TRIGGER USER`). Use `REFRESH MODE DIFFERENTIAL` for stream tables with triggers.
- The `IS DISTINCT FROM` guard prevents no-op `UPDATE` triggers when the aggregate result is unchanged.
- `BEFORE` triggers that modify `NEW` will affect the stored value — the next refresh may "correct" it back, causing oscillation.
See the pg_trickle.user_triggers GUC in CONFIGURATION.md for control options.
Can I ALTER TABLE a stream table directly?
No. Direct ALTER TABLE would change the physical table without updating pg_trickle's catalog, causing column mismatches and __pgt_row_id invalidation on the next refresh. See Why can't I ALTER TABLE a stream table directly? for details.
Instead, use the pg_trickle API:
-- Change schedule, mode, or status:
SELECT pgtrickle.alter_stream_table('order_totals', schedule => '10m');
-- To change the defining query or column structure, drop and recreate:
SELECT pgtrickle.drop_stream_table('order_totals');
SELECT pgtrickle.create_stream_table(
name => 'order_totals',
query => '...',
schedule => '5m',
refresh_mode => 'DIFFERENTIAL'
);
Does pg_trickle work with PgBouncer or other connection poolers?
It depends on the pooling mode. pg_trickle's background scheduler uses session-level features that are incompatible with transaction-mode connection pooling:
| Feature | Issue with Transaction-Mode Pooling |
|---|---|
| `pg_advisory_lock()` | Session-level lock released when the connection returns to the pool — concurrent refreshes possible |
| `PREPARE` / `EXECUTE` | Prepared statements are session-scoped — "does not exist" errors on different connections |
| `LISTEN` / `NOTIFY` | Notifications lost when listeners change connections |
Recommended configurations:
- Session-mode pooling (`pool_mode = session`): Fully compatible. The scheduler holds a dedicated connection.
- Direct connection (no pooler for the scheduler): Fully compatible. Application queries can still go through a pooler.
- Transaction-mode pooling (`pool_mode = transaction`): Not supported. The scheduler requires a persistent session.
Tip: If your infrastructure requires transaction-mode pooling (e.g., AWS RDS Proxy, Supabase), route the pg_trickle background worker through a direct connection while keeping application traffic on the pooler. Most connection poolers support per-database or per-user routing rules.
Does pg_trickle work with pgvector?
Partially — it depends on the refresh mode and what the defining query does.
What works:
- Source tables with `vector` columns. CDC triggers are generated using PostgreSQL's `format_type()`, which returns the full type name (e.g. `vector(1536)`). Change buffer tables mirror the source schema correctly, so inserts, updates, and deletes on pgvector tables are captured and replayed without issue.
- Passing vector columns through in DIFFERENTIAL mode. Stream tables that select, filter (on non-vector columns), or join sources that happen to contain `vector` columns work correctly — the vector data is treated as an opaque value and copied through unchanged.
- FULL mode with any pgvector expression. Because FULL mode re-executes the entire defining query, all pgvector operators (`<->`, `<=>`, `<#>`) and functions (`cosine_distance`, `l2_normalize`, etc.) work exactly as they do in a regular query.
What does not work:
- DIFFERENTIAL mode with pgvector distance operators in the query. The DVM engine needs a differentiation rule for every SQL operator it encounters. Custom operators like `<->` (L2 distance) or `<=>` (cosine distance) are not in the built-in rule set. The engine will fall back automatically to FULL mode if such operators appear in the delta query path. Set `refresh_mode => 'FULL'` explicitly to make this intent clear.
- Incremental aggregation over vector columns. There is no meaningful incremental form for aggregates over `vector` values (e.g. averaging embeddings). Use FULL mode for any aggregate that involves vector arithmetic.
Recommended pattern for a nearest-neighbour cache or semantic search result set:
CREATE EXTENSION IF NOT EXISTS vector;
SELECT pgtrickle.create_stream_table(
name => 'top_similar_docs',
query => $$
SELECT d.id, d.title, d.embedding,
d.embedding <=> '[0.1, 0.2, 0.3]'::vector AS distance
FROM documents d
ORDER BY distance
LIMIT 100
$$,
schedule => '5m',
refresh_mode => 'FULL'
);
For use-cases that only carry vector columns through without computing on them, DIFFERENTIAL mode works fine:
-- Vectors are not used in the delta computation — DIFFERENTIAL is safe here
SELECT pgtrickle.create_stream_table(
name => 'active_doc_embeddings',
query => $$
SELECT id, embedding
FROM documents
WHERE status = 'published'
$$,
schedule => '1m',
refresh_mode => 'DIFFERENTIAL'
);
dbt Integration
The dbt-pgtrickle package provides a stream_table materialization that
lets you manage stream tables through dbt’s standard workflow. This section
covers setup, commands, freshness checks, and query change handling.
How do I use pg_trickle with dbt?
Install the dbt-pgtrickle package (a pure Jinja SQL macro package — no Python dependencies):
# packages.yml
packages:
- package: pg_trickle/dbt_pgtrickle
version: ">=0.2.0"
Then define a stream table model using the stream_table materialization:
-- models/order_totals.sql
{{ config(
materialized='stream_table',
schedule='1m',
refresh_mode='DIFFERENTIAL'
) }}
SELECT customer_id, SUM(amount) AS total
FROM {{ source('public', 'orders') }}
GROUP BY customer_id
The stream_table materialization calls pgtrickle.create_stream_table() on the first run and pgtrickle.alter_stream_table() on subsequent runs (if the schedule or mode changes).
What dbt commands work with stream tables?
| Command | Behavior |
|---|---|
| `dbt run` | Creates stream tables that don't exist; updates schedule/mode if changed; does not alter the defining query of existing STs |
| `dbt run --full-refresh` | Drops and recreates all stream tables from scratch (new defining query, fresh data) |
| `dbt test` | Works normally — tests query the stream table as a regular table |
| `dbt source freshness` | Works if you configure a freshness block on the stream table source |
| `dbt docs generate` | Documents stream tables like any other model |
How does dbt run --full-refresh work with stream tables?
When --full-refresh is passed, the stream_table materialization:
- Calls `pgtrickle.drop_stream_table('model_name')` to remove the existing stream table, CDC triggers, and change buffers.
- Calls `pgtrickle.create_stream_table(...)` with the current defining query from the model file.
- The new stream table starts in INITIALIZING status and performs its first full refresh.
This is the correct way to update a stream table's defining query in dbt. Without --full-refresh, dbt will not detect query changes (it only compares schedule and mode).
How do I check stream table freshness in dbt?
Use dbt's built-in source freshness feature by adding a freshness block to your source definition:
# models/sources.yml
sources:
- name: pgtrickle
schema: pgtrickle
tables:
- name: order_totals
loaded_at_field: "last_refreshed_at" # from stream_tables_info
freshness:
warn_after: {count: 5, period: minute}
error_after: {count: 15, period: minute}
Then run dbt source freshness to check.
Alternatively, query the pg_trickle monitoring views directly in a dbt test:
-- tests/check_freshness.sql
SELECT pgt_name FROM pgtrickle.stream_tables_info WHERE stale = true
What happens when the defining query changes in dbt?
If you modify the SQL in a stream table model file and run dbt run without --full-refresh:
- The `stream_table` materialization detects that the stream table already exists.
- It compares the schedule and refresh mode — if either changed, it calls `alter_stream_table()` to update them.
- It does not compare the defining query text. The existing defining query remains in effect.
To apply a new defining query, you must run dbt run --full-refresh. This drops and recreates the stream table with the new query.
Recommendation: After changing a model's SQL, always run dbt run --full-refresh -s model_name to apply the change.
Can I use dbt snapshot with stream tables?
Yes, with caveats. dbt snapshots work by tracking changes to a source table over time using updated_at or check strategies. You can snapshot a stream table like any other table.
However, keep in mind:
- Stream tables are refreshed periodically, not on every write. The snapshot will only capture changes at refresh boundaries, not at the granularity of individual source-table writes.
- The `__pgt_row_id` column will appear in the snapshot. You may want to exclude it with `check_cols` or a `select` in the snapshot configuration.
- FULL refresh mode replaces all rows each cycle, which will appear as "updates" to the snapshot strategy even if the data hasn't changed. Use DIFFERENTIAL mode for stream tables that are snapshotted.
What dbt versions are supported?
dbt-pgtrickle is a pure Jinja SQL macro package that works with:
- dbt-core 1.7+ (the `stream_table` materialization uses standard Jinja patterns)
- the dbt-postgres adapter (required for the PostgreSQL connection)
There are no Python dependencies beyond dbt-core and dbt-postgres. The package is tested against dbt 1.7.x and 1.8.x in CI.
Row-Level Security (RLS)
Does RLS on source tables affect stream table content?
No. Stream tables always materialize the full, unfiltered result set,
regardless of any RLS policies on source tables. This matches the behavior of
PostgreSQL's built-in REFRESH MATERIALIZED VIEW.
The scheduled refresh runs as a superuser background worker. Manual calls to
refresh_stream_table() and IMMEDIATE-mode IVM triggers also bypass RLS
internally (SET LOCAL row_security = off / SECURITY DEFINER trigger
functions), ensuring the stream table content is always complete and
deterministic.
Can I use RLS on a stream table to filter reads per role?
Yes. Stream tables are regular PostgreSQL tables, so ALTER TABLE … ENABLE ROW LEVEL SECURITY and CREATE POLICY work exactly as expected.
This is the recommended pattern for multi-tenant filtering:
ALTER TABLE pgtrickle.order_totals ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON pgtrickle.order_totals
USING (tenant_id = current_setting('app.tenant_id')::INT);
One stream table serves all tenants. Per-tenant filtering happens at query time with zero storage duplication.
What happens when I ENABLE or DISABLE RLS on a source table?
pg_trickle's DDL event trigger detects ALTER TABLE … ENABLE ROW LEVEL SECURITY, DISABLE ROW LEVEL SECURITY, FORCE ROW LEVEL SECURITY, and
NO FORCE ROW LEVEL SECURITY on source tables and marks all dependent
stream tables for reinitialisation. The same applies to CREATE POLICY,
ALTER POLICY, and DROP POLICY.
Why are IVM trigger functions SECURITY DEFINER?
In IMMEDIATE mode, the IVM trigger fires in the DML-issuing user's context.
If that user has restricted RLS visibility, the delta query could see only a
subset of the base table rows, producing a corrupt stream table. Making the
trigger function SECURITY DEFINER (owned by the extension installer, typically
a superuser) ensures the delta query always has full visibility. The DML itself
is still subject to the user's own RLS policies — only the stream table
maintenance runs with elevated privileges.
The trigger functions also set search_path = pg_catalog, pgtrickle, pgtrickle_changes, public to prevent search_path hijacking — a security best
practice for all SECURITY DEFINER functions. The public schema is included
because the delta SQL references user tables that typically reside there.
Deployment & Operations
This section covers the operational aspects of running pg_trickle in production: background workers, upgrades, restarts, replicas, Kubernetes, partitioned tables, and multi-database deployments.
How many background workers does pg_trickle use?
pg_trickle uses a two-tier background worker model:
- Launcher (`pg_trickle launcher`) — one per cluster, static. Scans `pg_database` every ~10 seconds and spawns a per-database scheduler for every database where pg_trickle is installed. Automatically re-spawns schedulers that exit.
- Per-database scheduler (`pg_trickle scheduler`) — one dynamic worker per database with pg_trickle installed.
| Component | Workers | Notes |
|---|---|---|
| Launcher | 1 (static) | Cluster-wide; connects to postgres database |
| Scheduler | 1 per database (dynamic) | Persistent per database; drives all refreshes |
| Parallel refresh workers | 0–N per database | Only when pg_trickle.parallel_refresh_mode = 'on' |
| WAL decoder | 0 (shared) | Shares the scheduler's SPI connection |
| Manual refresh | 0 | Runs in the caller's session |
How do I size max_worker_processes?
When max_worker_processes is too low, the launcher silently fails to spawn schedulers for some databases and retries every 5 minutes. Those databases stop refreshing with no error in the stream table itself — you only see it in the PostgreSQL log:
WARNING: pg_trickle launcher: could not spawn scheduler for database 'mydb'
The minimum formula:
max_worker_processes ≥
1 (pg_trickle launcher)
+ N (one scheduler per database with pg_trickle installed)
+ max_dynamic_refresh_workers (only if parallel_refresh_mode = 'on'; default 4)
+ autovacuum_max_workers (default 3)
+ parallel query workers (max_parallel_workers_per_gather × concurrent queries)
+ slots for other extensions (logical replication launcher, etc.)
A practical starting point for a cluster with a handful of databases:
max_worker_processes = 32
This value requires a full PostgreSQL restart (not just reload).
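The sizing formula above can be expressed as a small calculation. A minimal Python sketch (the function name and default slot counts are illustrative; substitute your cluster's actual settings):

```python
def min_worker_processes(
    pgt_databases,                    # databases with pg_trickle installed
    parallel_refresh=False,
    max_dynamic_refresh_workers=4,    # default when parallel_refresh_mode = 'on'
    autovacuum_max_workers=3,         # PostgreSQL default
    parallel_query_slots=8,           # max_parallel_workers_per_gather x concurrent queries
    other_extension_slots=1,          # e.g. logical replication launcher
):
    """Lower bound on max_worker_processes per the formula above."""
    total = 1                         # pg_trickle launcher (one per cluster)
    total += pgt_databases            # one scheduler per database
    if parallel_refresh:
        total += max_dynamic_refresh_workers
    total += autovacuum_max_workers
    total += parallel_query_slots
    total += other_extension_slots
    return total

# Example: 3 databases with pg_trickle, parallel refresh enabled
print(min_worker_processes(3, parallel_refresh=True))  # 20
```

Round up generously: the consequence of undersizing is silent (schedulers fail to spawn), while unused slots cost almost nothing.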
How do I upgrade pg_trickle to a new version?
1. Install the new shared library (replace the `.so`/`.dylib` file in PostgreSQL's `lib` directory).
2. Run the upgrade SQL: `ALTER EXTENSION pg_trickle UPDATE;` This applies migration scripts (e.g., `pg_trickle--0.2.1--0.2.2.sql`) that update catalog tables, add new functions, and migrate data as needed.
3. Restart PostgreSQL if the shared library changed (required for `shared_preload_libraries` changes).
4. Verify: `SELECT pgtrickle.version();`
Zero-downtime upgrades are possible for minor versions (patch releases) that don't change the shared library. Just run ALTER EXTENSION pg_trickle UPDATE — no restart needed.
For detailed instructions, version-specific notes, rollback procedures, and troubleshooting, see the full Upgrading Guide.
How do I know if my shared library and SQL extension versions match?
The background worker checks for version mismatches at startup and logs a
WARNING if the compiled .so version differs from the installed SQL extension
version. You can also check manually:
-- Compiled .so version:
SELECT pgtrickle.version();
-- Installed SQL extension version:
SELECT extversion FROM pg_extension WHERE extname = 'pg_trickle';
If these differ, run ALTER EXTENSION pg_trickle UPDATE; and restart
PostgreSQL if prompted.
Are stream tables preserved during an upgrade?
Yes. ALTER EXTENSION pg_trickle UPDATE applies only additive schema
migrations (new columns, updated function signatures). Existing stream tables,
their data, refresh history, and CDC infrastructure are preserved. The
scheduler resumes normal operation after the upgrade completes.
For version-specific migration notes, see the Upgrading Guide — Version-Specific Notes.
What happens to stream tables during a PostgreSQL restart?
During a restart:
- The scheduler stops. No refreshes occur while PostgreSQL is down.
- CDC triggers are inactive. Source table writes during the restart window are captured when PostgreSQL comes back up (triggers are persistent DDL objects).
- On startup, the scheduler background worker starts, reads the catalog, rebuilds the DAG, and resumes refresh cycles from where it left off.
- Frontier reconciliation. The scheduler detects any gap between the stored frontier LSN and the current WAL position. Source changes that occurred between the last successful refresh and the restart are in the change buffers (for trigger-mode CDC) and will be processed in the first refresh cycle.
Net effect: Stream tables may be stale for the duration of the downtime, but no data is lost. The first refresh cycle after restart catches up automatically.
Can I use pg_trickle on a read replica / standby?
The scheduler does not run on standby servers. When pg_trickle detects it is running in recovery mode (pg_is_in_recovery() = true), the background worker enters a sleep loop and does not attempt any refreshes.
However, stream tables replicated from the primary are readable on the standby — they are regular heap tables and are replicated via physical (streaming) replication like any other table.
Pattern for read-heavy workloads:
- Run pg_trickle on the primary — it performs all refreshes.
- Query stream tables on the standby — read replicas get the latest refreshed data via streaming replication, with replication lag as the only additional delay.
How does pg_trickle work with CloudNativePG / Kubernetes?
pg_trickle is compatible with CloudNativePG. The cnpg/ directory in the repository contains example manifests:
- Dockerfile.ext — builds a PostgreSQL image with pg_trickle pre-installed
- cluster-example.yaml — CloudNativePG Cluster manifest with
shared_preload_libraries = 'pg_trickle'
Key considerations:
- Include `pg_trickle` in `shared_preload_libraries` in the Cluster's `postgresql` configuration.
- The scheduler runs on the primary pod only. Replica pods detect recovery mode and sleep.
- Pod restarts are handled the same way as PostgreSQL restarts (see above).
- Persistent volume claims preserve catalog and change buffers across pod rescheduling.
Does pg_trickle work with partitioned source tables?
Yes. pg_trickle installs CDC triggers on the partitioned parent table, which PostgreSQL automatically propagates to all existing and future partitions. When a row is inserted into any partition, the trigger fires and writes the change to the buffer table.
Caveats:
- `TRUNCATE` on individual partitions fires the partition-level trigger, which is also captured.
- Attaching or detaching partitions (`ALTER TABLE ... ATTACH/DETACH PARTITION`) fires DDL event triggers, which may mark the stream table for reinitialization.
- Row movement between partitions (when the partition key is updated) is captured as a DELETE from the old partition and an INSERT into the new partition.
Can I run pg_trickle in multiple databases on the same cluster?
Yes. Each database gets its own independent scheduler background worker, its own catalog tables, and its own change buffers. Stream tables in different databases do not interact.
Resource planning: Each database with stream tables requires 1 background worker slot in max_worker_processes. If you have many databases, the default of 8 is easily exhausted.
Important: When `max_worker_processes` is exhausted, the launcher silently skips databases it cannot spawn a scheduler for and retries every 5 minutes. This means stream tables in those databases stop refreshing with no visible error — they just go stale. Check the PostgreSQL log for:

WARNING: pg_trickle launcher: could not spawn scheduler for database 'mydb'

If you see this, increase `max_worker_processes` and restart PostgreSQL.
See How do I size max_worker_processes? for the full formula.
-- On each database where you want pg_trickle:
CREATE EXTENSION pg_trickle;
The extension must be created separately in each database — shared_preload_libraries loads the shared library cluster-wide, but the SQL objects (catalog tables, functions) are per-database.
Monitoring & Alerting
pg_trickle provides built-in monitoring views and NOTIFY-based alerting. This section explains the available views, alert events, and failure handling.
How do I list all stream tables in my database?
Several options depending on how much detail you need:
-- Quickest: name + status + mode + staleness
SELECT name, status, refresh_mode, is_populated, staleness
FROM pgtrickle.stream_tables_info;
-- Full stats: refresh counts, rows inserted/deleted, avg duration, error streaks
SELECT * FROM pgtrickle.pg_stat_stream_tables;
-- Live status including consecutive_errors and data_timestamp
SELECT * FROM pgtrickle.pgt_status();
-- Raw catalog (all persisted properties, no computed fields)
SELECT * FROM pgtrickle.pgt_stream_tables;
How do I inspect what pg_trickle is doing right now?
Quick status snapshot:
SELECT name, status, refresh_mode, consecutive_errors, staleness
FROM pgtrickle.pgt_status();
Deep dive into a specific stream table — shows the defining query, DVM operator tree, source tables, generated delta SQL, and current WAL frontier:
SELECT * FROM pgtrickle.explain_st('my_table');
Key properties returned:
| Property | Description |
|---|---|
| `dvm_supported` | Whether differential maintenance is possible for this query |
| `operator_tree` | How the DVM engine has decomposed the query |
| `delta_query` | The actual SQL executed during a differential refresh |
| `frontier` | Per-source LSN positions flushed at last refresh |
Recent refresh activity:
-- Last 10 refreshes for a stream table (action, status, rows, duration):
SELECT * FROM pgtrickle.get_refresh_history('my_table', 10);
-- Aggregate refresh stats for all stream tables:
SELECT * FROM pgtrickle.st_refresh_stats();
CDC and slot health:
-- Per-source CDC mode, WAL lag, and alerts:
SELECT * FROM pgtrickle.check_cdc_health();
-- Replication slot health (slot_name, active, lag_bytes):
SELECT * FROM pgtrickle.slot_health();
Real-time event stream:
LISTEN pg_trickle_alert;
-- Receives JSON payloads for: stale_data, auto_suspended, resumed,
-- reinitialize_needed, buffer_growth_warning, refresh_completed, refresh_failed
Pending change buffers (rows not yet consumed by a differential refresh):
SELECT stream_table, source_table, cdc_mode, pending_rows, buffer_bytes
FROM pgtrickle.change_buffer_sizes()
ORDER BY pending_rows DESC;
Are there convenience functions for inspecting source tables and CDC buffers?
Yes. pg_trickle provides two convenience functions that complement the existing monitoring suite:
pgtrickle.list_sources(name) — shows every source table a stream table depends on, the CDC mode each uses, and any column-level usage metadata:
SELECT * FROM pgtrickle.list_sources('order_totals');
-- Returns: source_table, source_oid, source_type, cdc_mode, columns_used
pgtrickle.change_buffer_sizes() — shows, for every tracked source table, how many CDC rows are pending (not yet consumed by a differential refresh) and the estimated on-disk size of the change buffer:
SELECT * FROM pgtrickle.change_buffer_sizes()
ORDER BY pending_rows DESC;
-- Returns: stream_table, source_table, source_oid, cdc_mode, pending_rows, buffer_bytes
A large pending_rows value for a source table means a differential refresh is overdue or stalled — use pgtrickle.get_refresh_history() to investigate.
Can I see a tree view of all stream table dependencies?
Yes. pgtrickle.dependency_tree() walks the dependency DAG and renders it as an indented ASCII tree:
SELECT tree_line, status, refresh_mode
FROM pgtrickle.dependency_tree();
Example output:
tree_line | status | refresh_mode
------------------------------------------+--------+--------------
report_summary | ACTIVE | DIFFERENTIAL
├── orders_by_region | ACTIVE | DIFFERENTIAL
│ ├── public.orders [src] | |
│ └── public.customers [src] | |
└── revenue_totals | ACTIVE | DIFFERENTIAL
└── public.orders [src] | |
Each row has node (qualified name), node_type (stream_table or source_table), depth, status, and refresh_mode. Source tables are shown as leaves tagged with [src].
What monitoring views are available?
| View | Description |
|---|---|
| `pgtrickle.stream_tables_info` | Status overview with computed staleness |
| `pgtrickle.pg_stat_stream_tables` | Comprehensive stats (refresh counts, avg duration, error streaks) |
How do I get alerted when something goes wrong?
pg_trickle sends PostgreSQL NOTIFY messages on the pg_trickle_alert channel with JSON payloads:
| Event | When |
|---|---|
| `stale_data` | Staleness exceeds 2× the schedule |
| `auto_suspended` | Stream table suspended after max consecutive errors |
| `reinitialize_needed` | Upstream DDL change detected |
| `buffer_growth_warning` | Change buffer growing unexpectedly |
| `refresh_completed` | Refresh completed successfully |
| `refresh_failed` | Refresh failed |
Listen with:
LISTEN pg_trickle_alert;
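A listener typically routes these events by severity. A minimal Python sketch of such a dispatcher; note that the JSON field name `event` and the severity buckets are assumptions for illustration, so check the payloads your installation actually emits:

```python
import json

# Alert event names documented for the pg_trickle_alert channel
KNOWN_EVENTS = {
    "stale_data", "auto_suspended", "resumed", "reinitialize_needed",
    "buffer_growth_warning", "refresh_completed", "refresh_failed",
}

def classify_alert(payload):
    """Map a NOTIFY JSON payload to a severity bucket.

    Assumes the payload carries an 'event' field (hypothetical name).
    """
    event = json.loads(payload).get("event", "")
    if event not in KNOWN_EVENTS:
        return "unknown"
    if event in {"auto_suspended", "refresh_failed"}:
        return "page"   # needs human attention
    if event in {"stale_data", "buffer_growth_warning", "reinitialize_needed"}:
        return "warn"
    return "info"       # refresh_completed, resumed

print(classify_alert('{"event": "auto_suspended"}'))  # page
```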
What happens when a stream table keeps failing?
After pg_trickle.max_consecutive_errors (default: 3) consecutive failures, the stream table moves to ERROR status and automatic refreshes stop. An auto_suspended NOTIFY alert is sent.
To recover:
-- Fix the underlying issue (e.g., restore a dropped source table), then:
SELECT pgtrickle.alter_stream_table('my_table', status => 'ACTIVE');
Retries use exponential backoff (base 1s, max 60s, ±25% jitter, up to 5 retries before counting as a real failure).
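The backoff schedule above can be sketched as follows. This is an illustrative Python model of the documented policy (base 1 s, cap 60 s, ±25% jitter, up to 5 retries), not the extension's internal code:

```python
import random

def retry_delays(max_retries=5, base_s=1.0, cap_s=60.0, jitter=0.25, rng=None):
    """Per-attempt retry delays: exponential base, capped, with +/-25% jitter."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(max_retries):
        base = min(cap_s, base_s * (2 ** attempt))  # 1, 2, 4, 8, 16 ... capped at 60
        delays.append(base * rng.uniform(1 - jitter, 1 + jitter))
    return delays

# Deterministic example run (seeded jitter); real delays vary per attempt
for i, d in enumerate(retry_delays(rng=random.Random(0)), start=1):
    print(f"retry {i}: {d:.2f}s")
```

Jitter spreads retries out so that many stream tables failing at once (e.g., after a dropped source table) do not all retry in lockstep.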
Configuration Reference
All pg_trickle settings are configured via PostgreSQL GUC parameters. The table below lists every available parameter with its type, default, and description.
| GUC | Type | Default | Description |
|---|---|---|---|
| `pg_trickle.enabled` | bool | true | Enable/disable the scheduler. Manual refreshes still work when false. |
| `pg_trickle.scheduler_interval_ms` | int | 1000 | Scheduler wake interval in milliseconds (100–60000) |
| `pg_trickle.min_schedule_seconds` | int | 60 | Minimum allowed schedule duration (1–86400) |
| `pg_trickle.max_consecutive_errors` | int | 3 | Failures before auto-suspending (1–100) |
| `pg_trickle.change_buffer_schema` | text | pgtrickle_changes | Schema for CDC buffer tables |
| `pg_trickle.max_concurrent_refreshes` | int | 4 | Max parallel refresh workers (1–32) |
| `pg_trickle.user_triggers` | text | auto | User trigger handling: auto (detect), off (suppress), on (deprecated alias for auto) |
| `pg_trickle.differential_max_change_ratio` | float | 0.15 | Change ratio threshold for adaptive FULL fallback (0.0–1.0) |
| `pg_trickle.cleanup_use_truncate` | bool | true | Use TRUNCATE instead of DELETE for buffer cleanup |
All GUCs are SUSET context (superuser SET) and take effect without restart, except shared_preload_libraries which requires a PostgreSQL restart.
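The adaptive FULL fallback governed by `pg_trickle.differential_max_change_ratio` amounts to a simple decision rule. A minimal Python sketch of that rule; the function name is hypothetical and the extension's internal heuristics may weigh more factors:

```python
def choose_refresh_action(changed_rows, table_rows, max_change_ratio=0.15):
    """Pick DIFFERENTIAL vs FULL for one refresh cycle.

    Rationale: when a large fraction of the table changed, recomputing
    from scratch is cheaper than merging a huge delta. 0.15 is the
    documented default threshold.
    """
    if table_rows == 0:
        return "FULL"  # empty or unknown size: just recompute
    ratio = changed_rows / table_rows
    return "FULL" if ratio > max_change_ratio else "DIFFERENTIAL"

print(choose_refresh_action(1_000, 1_000_000))    # DIFFERENTIAL (ratio 0.001)
print(choose_refresh_action(200_000, 1_000_000))  # FULL (ratio 0.2 > 0.15)
```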
Troubleshooting
This section covers common problems and how to debug them. If your issue isn’t listed here, check the refresh history for error messages and the monitoring views for status information.
How do I diagnose stalled data flow through stream tables?
See also: Error Reference — comprehensive guide to all pg_trickle error variants with causes and fixes.
If data seems to have stopped flowing -- stream tables show stale results despite DML on the source tables -- follow this systematic diagnostic workflow. Each step narrows the problem from broad health checks down to specific root causes.
Step 0 -- Verify GUC configuration:
Misconfigured GUCs are a common and easy-to-miss cause of stalled or severely throttled data flow. Check all pg_trickle settings in one query:
SELECT name, setting, unit
FROM pg_settings
WHERE name LIKE 'pg_trickle.%'
OR name = 'max_worker_processes'
ORDER BY name;
Key values to check:
| GUC | Safe value | Problem if set to |
|---|---|---|
| `pg_trickle.enabled` | on | off -- stops all automatic refreshes |
| `pg_trickle.tiered_scheduling` | on (fine) | on with all STs at tier = 'frozen' -- silently skips them |
| `pg_trickle.max_consecutive_errors` | 3-10 | 1 -- one transient error suspends the ST permanently |
| `pg_trickle.scheduler_interval_ms` | 100-1000 | Very high (e.g. 60000) -- scheduler only wakes every 60 s |
| `pg_trickle.event_driven_wake` | on | off -- falls back to poll-only; latency equals scheduler_interval_ms |
| `pg_trickle.wake_debounce_ms` | 1-50 | Very high (e.g. 5000) -- coalesces notifications for 5 s before acting |
| `pg_trickle.auto_backoff` | on | Fine normally, but if refreshes take >95% of schedule it silently stretches intervals up to 8x |
| `pg_trickle.default_schedule_seconds` | 1-60 | Very high -- isolated CALCULATED tables refresh very infrequently |
| `max_worker_processes` | >= 16 (typical) | Too low -- workers cannot be spawned; parallel mode silently stalls |
Also check whether any stream tables are frozen:
SELECT pgt_name, refresh_tier
FROM pgtrickle.pgt_stream_tables
WHERE refresh_tier = 'frozen';
Step 1 -- Quick health overview:
SELECT * FROM pgtrickle.health_check() WHERE severity != 'OK';
This single call checks scheduler status, error tables, stale tables, buffer
growth, replication slot lag, and the worker pool. Any WARN or ERROR row
tells you where to look next.
Step 2 -- Check stream table status and staleness:
SELECT name, status, refresh_mode, consecutive_errors, staleness
FROM pgtrickle.pgt_status()
ORDER BY staleness DESC NULLS FIRST;
Look for SUSPENDED status (auto-suspended after repeated errors), high
consecutive_errors, or unexpectedly large staleness.
Step 3 -- Check recent refresh activity:
SELECT start_time, stream_table, action, status, duration_ms, error_message
FROM pgtrickle.refresh_timeline(20);
If no recent rows appear, the scheduler may not be running. If rows show
ERROR, the error messages explain why refreshes are failing.
Step 4 -- Inspect errors for a specific stream table:
SELECT * FROM pgtrickle.diagnose_errors('my_stream_table');
Returns the last 5 FAILED refresh events with error classification and
suggested remediation steps.
Step 5 -- Check the CDC pipeline (are changes being captured?):
SELECT stream_table, source_table, cdc_mode, pending_rows, buffer_bytes
FROM pgtrickle.change_buffer_sizes()
ORDER BY pending_rows DESC;
- `pending_rows = 0` everywhere: either no DML is happening on the source tables, or CDC triggers are missing.
- `pending_rows` growing but stream tables are not refreshing: scheduler or refresh problem (go back to Steps 1-3).
Step 6 -- Verify CDC triggers exist and are enabled:
SELECT source_table, trigger_type, trigger_name
FROM pgtrickle.trigger_inventory()
WHERE NOT present OR NOT enabled;
Any rows returned here mean change capture is broken for that source table -- DML changes are not being recorded.
Step 7 -- Check CDC slot health (WAL mode only):
SELECT * FROM pgtrickle.check_cdc_health();
Look for alert values like slot_lag_exceeds_threshold or
replication_slot_missing.
Step 8 -- Verify the dependency DAG:
SELECT tree_line, status, refresh_mode
FROM pgtrickle.dependency_tree();
Confirms the dependency graph is wired as expected. A missing edge means upstream changes will not propagate to downstream stream tables.
Step 9 -- Check the parallel worker pool (if using parallel mode):
SELECT * FROM pgtrickle.worker_pool_status();
SELECT job_id, unit_key, status, duration_ms
FROM pgtrickle.parallel_job_status(300)
WHERE status NOT IN ('SUCCEEDED');
Common root causes at a glance:
| Symptom | Diagnostic function | Likely root cause |
|---|---|---|
| No refreshes happening at all | health_check -> scheduler_running | Background worker not running or pg_trickle.enabled = off |
| Stream table in SUSPENDED status | pgt_status | Repeated errors hit max_consecutive_errors threshold |
| Zero pending changes despite DML | trigger_inventory | CDC trigger was dropped or disabled by DDL |
| WAL slot missing or lagging | check_cdc_health, slot_health | Replication slot dropped, or WAL retention exceeded |
| Buffers growing but no refreshes | change_buffer_sizes + refresh_timeline | Scheduler stalled, refresh failing, or lock contention |
| Upstream changes not propagating | dependency_tree | Upstream ST not connected in the DAG |
Unit tests crash with symbol not found in flat namespace on macOS 26+
macOS 26 (Tahoe) changed the dynamic linker (dyld) to eagerly resolve all
flat-namespace symbols at binary load time. pgrx extensions link PostgreSQL
server symbols (e.g. CacheMemoryContext, SPI_connect) with
-Wl,-undefined,dynamic_lookup, which previously resolved lazily. Since
cargo test --lib runs outside the postgres process, those symbols are
missing and the test binary aborts:
dyld[66617]: symbol not found in flat namespace '_CacheMemoryContext'
Use just test-unit — it automatically detects macOS 26+ and injects a
stub library (libpg_stub.dylib) via DYLD_INSERT_LIBRARIES. The stub
provides NULL/no-op definitions for the ~28 PostgreSQL symbols; they are never
called during unit tests (pure Rust logic only).
This does not affect integration tests, E2E tests, just lint,
just build, or the extension running inside PostgreSQL.
See the Installation Guide for details and manual usage.
My stream table is stuck in INITIALIZING status
The initial full refresh may have failed. Check:
SELECT * FROM pgtrickle.get_refresh_history('my_table', 5);
If the error is transient, retry with:
SELECT pgtrickle.refresh_stream_table('my_table');
My stream table shows stale data but the scheduler is running
Common causes:
- TRUNCATE on source table — bypasses CDC triggers. Manual refresh needed.
- Too many errors — check `consecutive_errors` in `pgtrickle.pg_stat_stream_tables`. Resume with `ALTER ... status => 'ACTIVE'`.
- Long-running refresh — check for lock contention or slow defining queries.
- Scheduler disabled — verify `SHOW pg_trickle.enabled;` returns `on`.
I get "cycle detected" when creating a stream table
Stream tables cannot have circular dependencies. If stream table A depends on stream table B and B depends on A (either directly or through a chain of intermediate stream tables), pg_trickle rejects the creation with a clear error message listing the cycle path.
To resolve this, restructure your queries to eliminate the circular reference. Common patterns:
- Extract the shared logic into a single base stream table that both A and B reference.
- Use a regular view instead of a stream table for one side of the dependency.
- Merge the two queries into a single stream table if possible.
A source table was altered and my stream table stopped refreshing
pg_trickle detects DDL changes (column additions, drops, type changes) via event triggers and marks affected stream tables with needs_reinit = true. The next scheduler cycle will attempt to reinitialize the stream table — drop the storage table, recreate it from the current defining query schema, and perform a full refresh.
If the schema change breaks the defining query (e.g., a column referenced in the query was dropped or renamed), the reinitialization will fail repeatedly until the stream table hits max_consecutive_errors and enters ERROR status.
To fix it: Update the defining query and recreate the stream table:
SELECT pgtrickle.drop_stream_table('order_totals');
SELECT pgtrickle.create_stream_table(
name => 'order_totals',
query => 'SELECT id, name FROM orders', -- updated query reflecting new schema
schedule => '1m',
refresh_mode => 'DIFFERENTIAL'
);
Check the refresh history for the specific error message:
SELECT * FROM pgtrickle.get_refresh_history('order_totals', 5);
How do I see the delta query generated for a stream table?
SELECT pgtrickle.explain_st('order_totals');
This shows the DVM operator tree, source tables, and the generated delta SQL.
How do I interpret the refresh history?
The pgtrickle.get_refresh_history() function returns the most recent refresh records for a stream table:
SELECT * FROM pgtrickle.get_refresh_history('order_totals', 10);
Key columns:
| Column | Meaning |
|---|---|
| `action` | Refresh type: FULL, DIFFERENTIAL, TOPK, IMMEDIATE, or REINITIALIZE |
| `rows_inserted` | Rows added to the stream table in this cycle |
| `rows_deleted` | Rows removed from the stream table in this cycle |
| `rows_updated` | Rows modified in the stream table (for explicit DML path) |
| `duration_ms` | Wall-clock time for the refresh |
| `error_message` | NULL for success; error text for failures |
| `source_changes` | Number of pending change records processed |
| `fallback_reason` | If DIFFERENTIAL fell back to FULL: change_ratio_exceeded, truncate_detected, or reinitialize |
Patterns to look for:
- High `rows_inserted` + `rows_deleted` with low `source_changes` → possible duplicate rows (keyless source tables)
- `fallback_reason = 'change_ratio_exceeded'` frequently → consider raising `pg_trickle.differential_max_change_ratio` or switching to FULL mode
- Increasing `duration_ms` over time → index maintenance or buffer bloat; consider VACUUM or checking for missing indexes
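These patterns can be checked mechanically against refresh-history rows. A minimal Python sketch: the field names mirror the table above, but the `flag_refresh_patterns` helper and its thresholds are illustrative, not part of pg_trickle:

```python
def flag_refresh_patterns(history):
    """Flag suspicious patterns in refresh-history rows (oldest first).

    Each row is a dict keyed by the column names described above.
    Thresholds (10x churn, 50% fallback rate, 2x duration growth) are
    arbitrary starting points; tune to your workload.
    """
    flags = []
    # Frequent differential -> FULL fallbacks
    fallbacks = [r for r in history
                 if r.get("fallback_reason") == "change_ratio_exceeded"]
    if len(history) >= 2 and len(fallbacks) * 2 >= len(history):
        flags.append("frequent_fallback")
    # Churn far above the number of processed source changes
    for r in history:
        churn = r.get("rows_inserted", 0) + r.get("rows_deleted", 0)
        if r.get("source_changes", 0) > 0 and churn > 10 * r["source_changes"]:
            flags.append("possible_duplicates")
            break
    # Refresh time creeping up over the window
    durs = [r["duration_ms"] for r in history if "duration_ms" in r]
    if len(durs) >= 3 and durs[-1] > 2 * durs[0] > 0:
        flags.append("duration_growth")
    return flags
```

Feeding it the output of `get_refresh_history()` (converted to dicts) gives a quick triage before digging into individual refresh records.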
How can I tell if the scheduler is running?
Several ways to verify:
1. Check the background worker:
SELECT pid, datname, backend_type, state, query
FROM pg_stat_activity
WHERE backend_type = 'pg_trickle scheduler';
If no rows are returned, the scheduler is not running. Common causes:
- `pg_trickle.enabled = false`
- Extension not in `shared_preload_libraries`
- `max_worker_processes` exhausted — the launcher silently skips databases it cannot accommodate and retries every 5 minutes. Check the PostgreSQL log for `WARNING: pg_trickle launcher: could not spawn scheduler for database '...'`.
2. Check recent refresh activity:
SELECT MAX(refreshed_at) AS last_refresh
FROM pgtrickle.pgt_stream_tables
WHERE status = 'ACTIVE';
If the last refresh was long ago relative to the shortest schedule, the scheduler may be stuck.
3. Check PostgreSQL logs:
The scheduler logs startup and shutdown messages at LOG level:
LOG: pg_trickle scheduler started for database "mydb"
LOG: pg_trickle scheduler shutting down (SIGTERM)
How do I debug a stream table that shows stale data?
Follow this diagnostic checklist:
- Is the scheduler running? (See above)
- Is the stream table active? `SELECT pgt_name, status, consecutive_errors FROM pgtrickle.pg_stat_stream_tables WHERE pgt_name = 'my_st';` If status is `ERROR` or `SUSPENDED`, the stream table has been auto-suspended after repeated failures.
- Are there pending changes? `SELECT COUNT(*) FROM pgtrickle_changes.changes_<source_oid>;` If zero, the source table may not have CDC triggers (check `SELECT tgname FROM pg_trigger WHERE tgrelid = '<source_oid>'`).
- Is the refresh failing silently? `SELECT * FROM pgtrickle.get_refresh_history('my_st', 5);` Check for error messages.
- Is there lock contention? Long-running transactions holding locks on the source or stream table can block refreshes. Check `pg_locks` and `pg_stat_activity`.
What does the needs_reinit flag mean and how do I clear it?
The needs_reinit flag in pgtrickle.pgt_stream_tables indicates that the stream table's physical storage needs to be rebuilt — typically because a source table's schema changed.
When needs_reinit = true:
- The scheduler skips normal differential/full refresh.
- Instead, it performs a reinitialize: drop the storage table, recreate it from the current defining query schema, and populate with a full refresh.
- If reinitialization succeeds, `needs_reinit` is cleared automatically.
If reinitialization keeps failing (e.g., the defining query references a dropped column):
-- Fix the underlying issue first, then clear manually:
UPDATE pgtrickle.pgt_stream_tables SET needs_reinit = false WHERE pgt_name = 'my_st';
-- Or drop and recreate:
SELECT pgtrickle.drop_stream_table('my_st');
SELECT pgtrickle.create_stream_table(
name => 'my_st',
query => 'SELECT ...',
schedule => '1m',
refresh_mode => 'DIFFERENTIAL'
);
Why Are These SQL Features Not Supported?
This section gives detailed technical explanations for each SQL limitation. pg_trickle follows the principle of "fail loudly rather than produce wrong data" — every unsupported feature is detected at stream-table creation time and rejected with a clear error message and a suggested rewrite.
For all of these, returning an explicit error is a deliberate design choice: the alternative would be silently producing incorrect results after a refresh, which is far harder to diagnose.
How does NATURAL JOIN work?
NATURAL JOIN is fully supported. At parse time, pg_trickle resolves the common columns between the two tables (using OpTree::output_columns()) and synthesizes explicit equi-join conditions. This supports INNER, LEFT, RIGHT, and FULL NATURAL JOIN variants.
Internally, NATURAL JOIN is converted to an explicit JOIN ... ON before the DVM engine builds its operator tree, so delta computation works identically to a manually specified equi-join.
Note: The internal __pgt_row_id column is excluded from common column resolution, so NATURAL JOINs between stream tables work correctly.
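The common-column resolution described above can be sketched in a few lines. A minimal Python illustration (the `natural_join_on_clause` helper and the `l`/`r` aliases are hypothetical; the real rewrite operates on the parsed operator tree, not on strings):

```python
def natural_join_on_clause(left_cols, right_cols, left_alias="l", right_alias="r"):
    """Synthesize the explicit equi-join condition a NATURAL JOIN implies.

    Common columns become `l.col = r.col` conjuncts; the internal
    __pgt_row_id column is excluded so NATURAL JOINs between stream
    tables resolve only user-visible columns.
    """
    right_set = set(right_cols)
    common = [c for c in left_cols if c in right_set and c != "__pgt_row_id"]
    if not common:
        return "TRUE"  # no common columns: NATURAL JOIN degenerates to a cross join
    return " AND ".join(f"{left_alias}.{c} = {right_alias}.{c}" for c in common)

print(natural_join_on_clause(
    ["id", "dept", "__pgt_row_id"],
    ["dept", "region", "__pgt_row_id"]))  # l.dept = r.dept
```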
How do GROUPING SETS, CUBE, and ROLLUP work?
GROUPING SETS, CUBE, and ROLLUP are fully supported via an automatic parse-time rewrite. pg_trickle decomposes these constructs into a UNION ALL of separate GROUP BY queries before the DVM engine processes the query.
Explosion guard: `CUBE(N)` generates $2^N$ branches. pg_trickle rejects CUBE/ROLLUP combinations that would produce more than 64 branches to prevent runaway memory usage. Use explicit `GROUPING SETS (...)` instead.
For example:
-- This defining query:
SELECT dept, region, SUM(amount) FROM sales GROUP BY CUBE(dept, region)
-- Is automatically rewritten to:
SELECT dept, region, SUM(amount) FROM sales GROUP BY dept, region
UNION ALL
SELECT dept, NULL::text, SUM(amount) FROM sales GROUP BY dept
UNION ALL
SELECT NULL::text, region, SUM(amount) FROM sales GROUP BY region
UNION ALL
SELECT NULL::text, NULL::text, SUM(amount) FROM sales
GROUPING() function calls are replaced with integer literal constants corresponding to the grouping level. The rewrite is transparent — the DVM engine sees only standard GROUP BY + UNION ALL operators and can apply incremental delta computation to each branch independently.
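The branch count behind the explosion guard follows standard SQL grouping-set arithmetic: `CUBE(n)` expands to 2^n grouping sets, `ROLLUP(n)` to n + 1, and multiple constructs in one GROUP BY multiply (cross product of the sets). A minimal Python sketch of the guard; the helper name and input shape are illustrative, not the extension's code:

```python
from math import prod

MAX_BRANCHES = 64  # documented explosion-guard limit

def grouping_branches(constructs):
    """Count UNION ALL branches after the grouping-sets rewrite.

    `constructs` is a list like [("cube", 2), ("rollup", 3), ("sets", 4)],
    where the integer is the column count (or, for explicit GROUPING SETS,
    the number of sets listed).
    """
    def count(kind, n):
        if kind == "cube":
            return 2 ** n     # every subset of n columns
        if kind == "rollup":
            return n + 1      # each prefix, plus the grand total
        return n              # explicit GROUPING SETS list of n sets
    total = prod(count(k, n) for k, n in constructs)
    if total > MAX_BRANCHES:
        raise ValueError(f"{total} branches exceeds guard of {MAX_BRANCHES}")
    return total

print(grouping_branches([("cube", 2)]))                 # 4
print(grouping_branches([("cube", 3), ("rollup", 1)]))  # 16
```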
How does DISTINCT ON (…) work?
DISTINCT ON is fully supported via an automatic parse-time rewrite. pg_trickle transparently transforms DISTINCT ON into a ROW_NUMBER() window function subquery:
-- This defining query:
SELECT DISTINCT ON (dept) dept, employee, salary
FROM employees ORDER BY dept, salary DESC
-- Is automatically rewritten to:
SELECT dept, employee, salary FROM (
SELECT dept, employee, salary,
ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) AS rn
FROM employees
) sub WHERE rn = 1
The rewrite happens before DVM parsing, so the operator tree sees a standard window function query and can apply partition-based recomputation for incremental delta maintenance.
Why is TABLESAMPLE rejected?
TABLESAMPLE returns a random subset of rows from a table (e.g., FROM orders TABLESAMPLE BERNOULLI(10) gives ~10% of rows).
Stream tables materialize the complete result set of the defining query and keep it up-to-date across refreshes. Baking a random sample into the defining query is not meaningful because:
- **Non-determinism.** Each refresh would sample different rows, making the stream table contents unstable and unpredictable. The delta between refreshes would be dominated by sampling noise, not actual data changes.
- **CDC incompatibility.** The trigger-based change-capture system tracks specific row-level changes (inserts, updates, deletes). A `TABLESAMPLE` defining query has no stable row identity — the "changed rows" concept doesn't apply when the entire sample shifts each cycle.
Rewrite:
-- Instead of sampling in the defining query:
SELECT * FROM orders TABLESAMPLE BERNOULLI(10)
-- Materialize the full result and sample when querying:
SELECT * FROM order_stream_table WHERE random() < 0.1
Why is LIMIT / OFFSET rejected?
Stream tables materialize the complete result set and keep it synchronized with source data. Bare LIMIT/OFFSET (without a recognized pattern) would truncate the result:
- **Undefined ordering.** `LIMIT` without `ORDER BY` returns an arbitrary subset.
- **Delta instability.** When source rows change, the boundary between "in the LIMIT" and "out of the LIMIT" shifts. A single INSERT could evict one row and admit another, requiring the refresh to track the full ordered position of every row.
- **Semantic mismatch.** Users who write `LIMIT 100` typically want to limit what they read, not what is stored.
Exception — TopK pattern: When the defining query has a top-level ORDER BY … LIMIT N (constant integer, optionally with OFFSET M), pg_trickle recognizes this as a TopK query and accepts it. The ORDER BY clause is required — bare LIMIT without ORDER BY is always rejected because it selects an arbitrary subset. With ORDER BY, the top-N boundary is well-defined and the stream table stores exactly those N rows (starting from position M+1 if OFFSET is specified). See the TopK section for details.
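For example, this defining query matches the TopK pattern and is accepted (a sketch using the `create_stream_table` signature shown in the introduction):

```sql
-- Accepted: top-level ORDER BY + constant LIMIT forms a TopK query
SELECT pgtrickle.create_stream_table(
    name     => 'latest_orders',
    query    => 'SELECT * FROM orders ORDER BY created_at DESC LIMIT 100',
    schedule => '30s'
);

-- Rejected: bare LIMIT without ORDER BY selects an arbitrary subset
-- query => 'SELECT * FROM orders LIMIT 100'
```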
Rewrite (when TopK doesn't apply):
-- Instead of:
'SELECT * FROM orders ORDER BY created_at DESC LIMIT 100'
-- Omit LIMIT from the defining query, apply when reading:
SELECT * FROM orders_stream_table ORDER BY created_at DESC LIMIT 100
Why are window functions in expressions rejected?
Window functions like ROW_NUMBER() OVER (…) are supported as standalone columns in stream tables. However, embedding a window function inside an expression — such as CASE WHEN ROW_NUMBER() OVER (...) = 1 THEN ... or SUM(x) OVER (...) + 1 — is rejected.
This restriction exists because:
- **Partition-based recomputation.** pg_trickle's differential mode handles window functions by recomputing entire partitions that were affected by changes. When a window function is buried inside an expression, the DVM engine cannot isolate the window computation from the surrounding expression, making it impossible to correctly identify which partitions to recompute.
- **Expression tree ambiguity.** The DVM parser would need to differentiate the outer expression (arithmetic, `CASE`, etc.) while treating the inner window function specially. This creates a combinatorial explosion of differentiation rules for every possible expression type × window function combination.
Rewrite:
-- Instead of:
SELECT id, CASE WHEN ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) = 1
THEN 'top' ELSE 'other' END AS rank_label
FROM employees
-- Move window function to a separate column, then use a wrapping stream table:
-- ST1:
SELECT id, dept, salary,
ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) AS rn
FROM employees
-- ST2 (references ST1):
SELECT id, CASE WHEN rn = 1 THEN 'top' ELSE 'other' END AS rank_label
FROM pgtrickle.employees_ranked
Why is FOR UPDATE / FOR SHARE rejected?
FOR UPDATE and related locking clauses (FOR SHARE, FOR NO KEY UPDATE, FOR KEY SHARE) acquire row-level locks on selected rows. This is incompatible with stream tables because:
- **Refresh semantics.** Stream table contents are managed by the refresh engine using bulk `MERGE` operations. Row-level locks taken during the defining query would conflict with the refresh engine's own locking strategy.
- **No direct DML.** Since users cannot directly modify stream table rows, there is no use case for locking rows inside the defining query. The locks would be held for the duration of the refresh transaction and then released, serving no purpose.
How does ALL (subquery) work?
ALL (subquery) comparisons (e.g., WHERE x > ALL (SELECT y FROM t)) are supported via an automatic rewrite to NOT EXISTS. For example, x > ALL (SELECT y FROM t) is rewritten to NOT EXISTS (SELECT 1 FROM t WHERE y >= x), which pg_trickle handles via its anti-join operator.
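Concretely, the rewrite looks like this (table names are hypothetical):

```sql
-- Defining query with an ALL comparison:
SELECT * FROM products p
WHERE p.price > ALL (SELECT price FROM discontinued);

-- Equivalent NOT EXISTS form handled by the anti-join operator:
SELECT * FROM products p
WHERE NOT EXISTS (
    SELECT 1 FROM discontinued d WHERE d.price >= p.price
);
```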
Why is ORDER BY silently discarded?
ORDER BY in the defining query is accepted but ignored. This is consistent with how PostgreSQL treats CREATE MATERIALIZED VIEW AS SELECT ... ORDER BY ... — the ordering is not preserved in the stored data.
Stream tables are heap tables with no guaranteed row order. The ORDER BY in the defining query would only affect the order of the initial INSERT, which has no lasting effect. Apply ordering when querying the stream table:
-- This ORDER BY is meaningless in the defining query:
'SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY total DESC'
-- Instead, order when reading:
SELECT * FROM regional_totals ORDER BY total DESC
Why are statistical aggregates (CORR, COVAR_*, REGR_*) limited to FULL mode?
Regression aggregates like CORR, COVAR_POP, COVAR_SAMP, and the REGR_* family require maintaining running sums of products and squares across the entire group. Unlike COUNT/SUM/AVG (where deltas can be computed from the change alone) or group-rescan aggregates (where only affected groups are re-read), regression aggregates:
- **Lack algebraic delta rules.** There is no closed-form way to update a correlation coefficient from a single row change without access to the full group's data.
- **Would degrade to group-rescan anyway.** Even if supported, the implementation would need to rescan the full group from source — identical to FULL mode for most practical group sizes.
These aggregates work fine in FULL refresh mode, which re-runs the entire query from scratch each cycle.
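A sketch of declaring such a stream table in FULL mode (the `refresh_mode` parameter name is an assumption, inferred from the refresh modes discussed elsewhere in this document):

```sql
-- CORR has no delta rule, so pin the stream table to FULL refresh
SELECT pgtrickle.create_stream_table(
    name         => 'price_qty_corr',
    query        => 'SELECT region, CORR(price, quantity) AS r
                     FROM sales GROUP BY region',
    schedule     => '5m',
    refresh_mode => 'full'  -- assumed parameter name
);
```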
Why Are These Stream Table Operations Restricted?
Stream tables are regular PostgreSQL heap tables under the hood, but their contents are managed exclusively by the refresh engine. This section explains why certain operations that work on ordinary tables are disallowed or unsupported on stream tables, and what to do instead.
Why can't I INSERT, UPDATE, or DELETE rows in a stream table?
Stream table contents are the output of the refresh engine — they represent the materialized result of the defining query at a specific point in time. Direct DML would corrupt this contract in several ways:
- **Row ID integrity.** Every row has a `__pgt_row_id` (a 64-bit xxHash of the group-by key or all columns). The refresh engine uses this for delta `MERGE` — matching incoming deltas against existing rows. A manually inserted row with an incorrect or duplicate `__pgt_row_id` would cause the next differential refresh to produce wrong results (double-counting, missed deletes, or merge conflicts).
- **Frontier inconsistency.** Each refresh records a frontier — a set of per-source LSN positions that represent "data up to this point has been materialized." A manual DML change is not tracked by any frontier. The next differential refresh would either overwrite the change (if the delta touches the same row) or leave the stream table in a state that doesn't match any consistent point-in-time snapshot of the source data.
- **Change buffer desync.** The CDC triggers on source tables write changes to buffer tables. The refresh engine reads these buffers and advances the frontier. Manual DML on the stream table bypasses this pipeline entirely — the buffer and frontier have no record of the change, so future refreshes cannot account for it.
If you need to post-process stream table data, create a view or a second stream table that references the first one.
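For example, instead of UPDATE-ing rows in a stream table directly, layer a second stream table on top of the first (names hypothetical):

```sql
-- ST2 consumes ST1; deltas cascade automatically through the DAG
SELECT pgtrickle.create_stream_table(
    name     => 'active_orders_flagged',
    query    => 'SELECT o.*, (o.amount > 1000) AS is_large
                 FROM active_orders o',
    schedule => '30s'
);
```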
Why can't I add foreign keys to or from a stream table?
Foreign key constraints require that referenced/referencing rows exist at the time of each DML statement. The refresh engine violates this assumption:
- **Bulk `MERGE` ordering.** A differential refresh executes a single `MERGE INTO` statement that applies all deltas (inserts and deletes) atomically. PostgreSQL evaluates FK constraints row-by-row within this `MERGE`. If a parent row is deleted and a new parent inserted in the same delta batch, the child FK check may fail because it sees the delete before the insert — even though the final state would be consistent.
- **Full refresh uses `TRUNCATE` + `INSERT`.** In FULL mode, the refresh engine truncates the stream table and re-inserts all rows. `TRUNCATE` does not fire individual `DELETE` triggers and bypasses FK cascade logic, which would leave referencing tables with dangling references.
- **Cross-table refresh ordering.** If stream table A has an FK referencing stream table B, both tables refresh independently (in topological order, but in separate transactions). There is no guarantee that A's refresh sees B's latest data — the FK constraint could transiently fail between refreshes.
Workaround: Enforce referential integrity in the consuming application or use a view that joins the stream tables and validates the relationship.
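One way to monitor referential integrity without an FK is a validation view over the two stream tables (table and column names are hypothetical):

```sql
-- Rows in the child stream table with no matching parent; should stay empty.
-- Alert if this view ever returns rows.
CREATE VIEW dangling_order_items AS
SELECT i.*
FROM order_items_st i
LEFT JOIN orders_st o ON o.id = i.order_id
WHERE o.id IS NULL;
```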
How do user-defined triggers work on stream tables?
When a DIFFERENTIAL mode stream table has user-defined row-level triggers, the refresh engine uses explicit DML decomposition instead of MERGE:
- **Delta materialized once.** The delta query result is stored in a temporary table (`__pgt_delta_<id>`) to avoid evaluating it three times.
- **DELETE removed rows.** Rows in the stream table whose `__pgt_row_id` is absent from the delta are deleted. `AFTER DELETE` triggers fire with correct `OLD` values.
- **UPDATE changed rows.** Rows whose `__pgt_row_id` exists in both the stream table and delta but whose values differ (checked via `IS DISTINCT FROM`) are updated. `AFTER UPDATE` triggers fire with correct `OLD` and `NEW`. No-op updates (where values are identical) are skipped, preventing spurious triggers.
- **INSERT new rows.** Rows in the delta whose `__pgt_row_id` is absent from the stream table are inserted. `AFTER INSERT` triggers fire with correct `NEW` values.
FULL refresh behavior: Row-level user triggers are automatically suppressed during FULL refresh via DISABLE TRIGGER USER / ENABLE TRIGGER USER. A NOTIFY pgtrickle_refresh is emitted so listeners know a FULL refresh occurred. Use REFRESH MODE DIFFERENTIAL for stream tables that need per-row trigger semantics.
Performance: The explicit DML path adds ~25–60% overhead compared to MERGE for triggered stream tables. Stream tables without user triggers have zero overhead (only a fast pg_trigger check, <0.1 ms).
Control: The pg_trickle.user_triggers GUC controls this behavior:
- `auto` (default): detect user triggers automatically
- `off`: always use MERGE, suppressing triggers
- `on`: deprecated compatibility alias for `auto`
Why can't I ALTER TABLE a stream table directly?
Stream table metadata (defining query, schedule, refresh mode) is stored in the pg_trickle catalog (pgtrickle.pgt_stream_tables). A direct ALTER TABLE would change the physical table without updating the catalog, causing:
- **Column mismatch.** If you add or remove columns, the refresh engine's cached delta query and `MERGE` statement would reference columns that no longer exist (or miss new ones), causing runtime errors.
- **`__pgt_row_id` invalidation.** The row ID hash is computed from the defining query's output columns. Altering the table schema without updating the defining query would make existing row IDs inconsistent with the new column set.
Use pgtrickle.alter_stream_table() to change schedule, refresh mode, or status. To change the defining query or column structure, drop and recreate the stream table.
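For example (parameter names beyond those already shown in this document are assumptions):

```sql
-- Change the schedule without touching the defining query
-- ('schedule' parameter name assumed)
SELECT pgtrickle.alter_stream_table('my_st', schedule => '1m');

-- Changing the column structure requires drop + recreate
SELECT pgtrickle.drop_stream_table('my_st');
SELECT pgtrickle.create_stream_table(
    'my_st',
    'SELECT region, SUM(amount) AS total FROM orders GROUP BY region',
    '1m'
);
```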
Why can't I TRUNCATE a stream table?
TRUNCATE removes all rows instantly but does not update the pg_trickle frontier or change buffers. After a TRUNCATE:
- **Differential refresh sees no changes.** The frontier still records the last-processed LSN. If no new source changes have occurred, the next differential refresh produces an empty delta — leaving the stream table empty even though the source still has data.
- **No recovery path for differential mode.** The refresh engine has no way to detect that the stream table was externally truncated. It assumes the current contents match the frontier.
Use pgtrickle.refresh_stream_table('my_table') to force a full re-materialization, or drop and recreate the stream table if you need a clean slate.
What are the memory limits for delta processing?
The differential refresh path executes the delta query as a single SQL statement. For large batches (e.g., a bulk UPDATE of 10M rows), PostgreSQL may attempt to materialize the entire delta result set in memory. If the delta exceeds work_mem, PostgreSQL will spill to temporary files on disk, which is slower but safe. In extreme cases, OOM (out of memory) can occur if work_mem is set very high and the delta is enormous.
Mitigations:
- **Adaptive fallback.** The `pg_trickle.differential_max_change_ratio` GUC (default 0.15) automatically triggers a FULL refresh when the ratio of pending changes to total rows exceeds the threshold. This prevents large deltas from consuming excessive memory.
- **`work_mem` tuning.** PostgreSQL's `work_mem` setting controls how much memory each sort/hash operation uses before spilling to disk. For pg_trickle workloads, 64–256 MB is typical. Monitor `temp_blks_written` in `pg_stat_statements` to detect spilling.
- **`pg_trickle.merge_work_mem_mb` GUC.** Sets a session-level `work_mem` override during the MERGE execution (default: 0 = use global `work_mem`). This allows higher memory for refresh without affecting other queries.
- **Monitoring.** If `pg_stat_statements` is installed, pg_trickle logs a warning when the MERGE query writes temporary blocks to disk.
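A typical tuning pass combining the GUCs above (values are illustrative, not recommendations):

```sql
-- Fall back to FULL refresh earlier for very large deltas
ALTER SYSTEM SET pg_trickle.differential_max_change_ratio = 0.10;

-- Give the MERGE step more memory without raising global work_mem
ALTER SYSTEM SET pg_trickle.merge_work_mem_mb = 256;

SELECT pg_reload_conf();
```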
Why are refreshes processed sequentially by default?
The default (parallel_refresh_mode = 'off') is sequential because it is simple, correct, and efficient for most workloads. Topological ordering guarantees upstream stream tables refresh before downstream ones with no coordination overhead.
When to consider enabling parallel refresh:
- Your database has many independent stream tables (no shared dependencies).
- Total cycle time equals the sum of all refresh durations, and some refreshes are visibly blocking unrelated ones.
- You have enough `max_worker_processes` headroom (each parallel worker uses one slot).
Enabling parallel refresh (v0.4.0+):
ALTER SYSTEM SET pg_trickle.parallel_refresh_mode = 'on';
SELECT pg_reload_conf();
With parallel_refresh_mode = 'on', the scheduler builds an execution-unit DAG and dispatches independent units to dynamic background workers. Atomic consistency groups and IMMEDIATE-trigger closures remain single-worker for correctness.
See CONFIGURATION.md — Parallel Refresh for tuning guidance including the worker-budget formula.
How many connections does pg_trickle use?
pg_trickle uses the following PostgreSQL connections:
| Component | Connections | When |
|---|---|---|
| Background scheduler | 1 | Always (per database with STs) |
| WAL decoder polling | 0 (shared) | Uses the scheduler's SPI connection |
| Manual refresh | 1 | Per-call (uses caller's session) |
Total: 1 persistent connection per database. WAL decoder polling shares the scheduler's SPI connection rather than opening separate connections.
max_worker_processes: pg_trickle registers 1 background worker per database during _PG_init(). Ensure max_worker_processes (default 8) has room for the pg_trickle worker plus any other extensions.
Advisory locks: The scheduler holds a session-level advisory lock per actively-refreshing ST. These are released immediately after each refresh completes.
Troubleshooting & Failure Mode Runbook
This document covers common failure scenarios, their symptoms, diagnosis steps, and resolution procedures. It is intended for operators and DBAs running pg_trickle in production.
Quick start: Run `SELECT * FROM pgtrickle.health_check() WHERE severity != 'OK';` for a single-call triage of your installation.
See also:
- Error Reference — all `PgTrickleError` variants with causes and fixes
- FAQ — Troubleshooting section — common user questions
- Pre-Deployment Checklist — configuration verification
- Configuration — GUC reference
Table of Contents
- Diagnostic Toolkit
- Failure Scenarios
- 1. Scheduler Not Running
- 2. Stream Table Stuck in SUSPENDED Status
- 3. CDC Triggers Missing or Disabled
- 4. WAL Replication Slot Lag or Missing
- 5. Stream Table Stuck in INITIALIZING
- 6. Change Buffers Growing Without Refresh
- 7. Lock Contention Blocking Refresh
- 8. Out-of-Memory During Refresh
- 9. Disk Full / WAL Retention Exceeded
- 10. Circular Pipeline Convergence Failure
- 11. Schema Change Broke Stream Table
- 12. Worker Pool Exhaustion
- 13. Fuse Tripped (Circuit Breaker)
Diagnostic Toolkit
These functions are your primary tools for diagnosing issues:
| Function | Purpose |
|---|---|
| `pgtrickle.health_check()` | Single-call overall health triage (OK/WARN/ERROR) |
| `pgtrickle.pgt_status()` | Status, staleness, error count for all stream tables |
| `pgtrickle.refresh_timeline(N)` | Last N refresh events across all stream tables |
| `pgtrickle.diagnose_errors('name')` | Last 5 failed events with classification and remediation |
| `pgtrickle.change_buffer_sizes()` | CDC pipeline: pending rows and buffer bytes per source |
| `pgtrickle.trigger_inventory()` | CDC trigger presence and enabled state |
| `pgtrickle.check_cdc_health()` | WAL replication slot health (WAL mode only) |
| `pgtrickle.dependency_tree()` | Dependency DAG visualization |
| `pgtrickle.worker_pool_status()` | Parallel refresh worker pool state |
| `pgtrickle.explain_st('name')` | DVM operator tree and generated delta SQL |
Quick health check script:
-- 1. Overall health
SELECT * FROM pgtrickle.health_check() WHERE severity != 'OK';
-- 2. Problem stream tables
SELECT name, status, refresh_mode, consecutive_errors, staleness
FROM pgtrickle.pgt_status()
WHERE status != 'ACTIVE' OR consecutive_errors > 0
ORDER BY consecutive_errors DESC;
-- 3. Recent failures
SELECT start_time, stream_table, action, status, duration_ms, error_message
FROM pgtrickle.refresh_timeline(20)
WHERE status = 'FAILED';
Failure Scenarios
1. Scheduler Not Running
Symptoms:
- No stream tables are refreshing
- `health_check()` reports `scheduler_running = false`
- No `pg_trickle scheduler` process in `pg_stat_activity`
Diagnosis:
-- Check for the scheduler process
SELECT pid, datname, backend_type, state
FROM pg_stat_activity
WHERE backend_type = 'pg_trickle scheduler';
-- Check GUC
SHOW pg_trickle.enabled;
-- Check shared_preload_libraries
SHOW shared_preload_libraries;
Resolution:
| Cause | Fix |
|---|---|
| `pg_trickle.enabled = off` | `ALTER SYSTEM SET pg_trickle.enabled = on; SELECT pg_reload_conf();` |
| Not in `shared_preload_libraries` | Add `pg_trickle` to `shared_preload_libraries` in `postgresql.conf` and restart PostgreSQL |
| `max_worker_processes` exhausted | Increase `max_worker_processes` and restart. The launcher retries every 5 minutes — check PostgreSQL logs for `WARNING: pg_trickle launcher: could not spawn scheduler` |
| Scheduler crashed | Check PostgreSQL logs for crash details. The launcher will auto-restart it. If recurring, check for OOM or resource limits |
2. Stream Table Stuck in SUSPENDED Status
Symptoms:
- Stream table status shows `SUSPENDED`
- `consecutive_errors` is at or above `pg_trickle.max_consecutive_errors`
- No refreshes happening for this stream table
Diagnosis:
-- Check the stream table status
SELECT pgt_name, status, consecutive_errors, last_error_message
FROM pgtrickle.pg_stat_stream_tables
WHERE pgt_name = 'my_stream_table';
-- Get detailed error history
SELECT * FROM pgtrickle.diagnose_errors('my_stream_table');
Resolution:
- Fix the underlying error (check `last_error_message` and `diagnose_errors`)
- Resume the stream table: `SELECT pgtrickle.alter_stream_table('my_stream_table', enabled => true);`
- Trigger a manual refresh to verify: `SELECT pgtrickle.refresh_stream_table('my_stream_table');`
Prevention: Increase pg_trickle.max_consecutive_errors (default 3) if
transient errors are common in your environment:
ALTER SYSTEM SET pg_trickle.max_consecutive_errors = 5;
SELECT pg_reload_conf();
3. CDC Triggers Missing or Disabled
Symptoms:
- Stream table refreshes succeed but show no changes
- `change_buffer_sizes()` shows `pending_rows = 0` despite active DML
- Source tables have no pg_trickle triggers
Diagnosis:
-- Check trigger inventory
SELECT source_table, trigger_type, trigger_name, present, enabled
FROM pgtrickle.trigger_inventory()
WHERE NOT present OR NOT enabled;
-- Manual check on a specific source table
SELECT tgname, tgenabled
FROM pg_trigger
WHERE tgrelid = 'public.orders'::regclass
AND tgname LIKE 'pgt_%';
Resolution:
| Cause | Fix |
|---|---|
| Triggers dropped by DDL (e.g., pg_dump + restore without triggers) | Drop and recreate the stream table, or reinitialize: `SELECT pgtrickle.refresh_stream_table('my_st');` |
| Triggers disabled (`ALTER TABLE ... DISABLE TRIGGER`) | `ALTER TABLE source_table ENABLE TRIGGER ALL;` |
| Source gating active | Check SELECT * FROM pgtrickle.source_gates(); and ungate: SELECT pgtrickle.ungate_source('source_table'); |
| WAL mode active but slot missing | See WAL Replication Slot Lag or Missing |
4. WAL Replication Slot Lag or Missing
Symptoms:
- `check_cdc_health()` shows `slot_lag_exceeds_threshold` or `replication_slot_missing`
- WAL disk usage growing unexpectedly
- Stream tables not receiving changes in WAL mode
Diagnosis:
-- Check CDC health
SELECT * FROM pgtrickle.check_cdc_health();
-- Check replication slots directly
SELECT slot_name, active, restart_lsn, confirmed_flush_lsn,
pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS lag
FROM pg_replication_slots
WHERE slot_name LIKE 'pgt_%';
Resolution:
| Cause | Fix |
|---|---|
| Slot dropped externally | pg_trickle will auto-fallback to trigger-based CDC. To recreate: drop and recreate the stream table |
| Slot lagging (WAL accumulation) | Check for long-running transactions: `SELECT pid, age(backend_xmin) FROM pg_stat_activity WHERE backend_xmin IS NOT NULL;`. Terminate idle-in-transaction sessions |
| `wal_level != logical` | WAL CDC requires `wal_level = logical`. Set it and restart PostgreSQL |
| `max_replication_slots` exhausted | Increase `max_replication_slots` and restart |
Fallback: Force trigger-based CDC mode if WAL mode is problematic:
ALTER SYSTEM SET pg_trickle.cdc_mode = 'trigger';
SELECT pg_reload_conf();
5. Stream Table Stuck in INITIALIZING
Symptoms:
- Stream table status is `INITIALIZING` for an extended period
- The initial full refresh may have failed or is still running
Diagnosis:
-- Check refresh history
SELECT * FROM pgtrickle.get_refresh_history('my_st', 5);
-- Check for active refresh
SELECT pid, state, query, now() - query_start AS running_for
FROM pg_stat_activity
WHERE query LIKE '%pgtrickle%' AND state = 'active';
Resolution:
| Cause | Fix |
|---|---|
| Initial refresh failed (check error in history) | Fix the error, then: SELECT pgtrickle.refresh_stream_table('my_st'); |
| Defining query is very slow | Optimize the query, add indexes on source tables, or increase work_mem |
| Lock contention during initial refresh | See Lock Contention |
6. Change Buffers Growing Without Refresh
Symptoms:
- `change_buffer_sizes()` shows large `pending_rows` and growing `buffer_bytes`
- Stream tables are stale
- Refreshes are not running or are failing
Diagnosis:
-- Check buffer sizes
SELECT stream_table, source_table, pending_rows, buffer_bytes
FROM pgtrickle.change_buffer_sizes()
ORDER BY pending_rows DESC;
-- Check if refreshes are happening
SELECT * FROM pgtrickle.refresh_timeline(10);
-- Check for blocked refresh processes
SELECT pid, wait_event_type, wait_event, state, query
FROM pg_stat_activity
WHERE query LIKE '%pgtrickle%';
Resolution:
| Cause | Fix |
|---|---|
| Scheduler not running | See Scheduler Not Running |
| All refreshes failing | Check diagnose_errors() for each affected stream table |
| Lock contention | See Lock Contention |
| Very large buffer causing slow MERGE | Consider lowering pg_trickle.differential_change_ratio_threshold to trigger FULL refresh for large batches |
Emergency: If buffers are dangerously large and you need immediate relief:
-- Force a full refresh (bypasses change buffers)
SELECT pgtrickle.refresh_stream_table('my_st', force_full => true);
7. Lock Contention Blocking Refresh
Symptoms:
- Refresh duration is much longer than usual
- `pg_stat_activity` shows refresh processes in `Lock` wait state
- Long-running transactions on source or stream tables
Diagnosis:
-- Find blocking locks
SELECT blocked.pid AS blocked_pid,
blocked.query AS blocked_query,
blocking.pid AS blocking_pid,
blocking.query AS blocking_query
FROM pg_stat_activity blocked
JOIN pg_locks bl ON bl.pid = blocked.pid AND NOT bl.granted
JOIN pg_locks gl ON gl.locktype = bl.locktype
AND gl.database IS NOT DISTINCT FROM bl.database
AND gl.relation IS NOT DISTINCT FROM bl.relation
AND gl.page IS NOT DISTINCT FROM bl.page
AND gl.tuple IS NOT DISTINCT FROM bl.tuple
AND gl.pid != bl.pid
AND gl.granted
JOIN pg_stat_activity blocking ON blocking.pid = gl.pid
WHERE blocked.query LIKE '%pgtrickle%';
Resolution:
- Identify and terminate the blocking session if appropriate: `SELECT pg_terminate_backend(<blocking_pid>);`
- Investigate why the blocking transaction is long-running (idle in transaction, slow query, etc.)
- Consider adding `statement_timeout` or `idle_in_transaction_session_timeout` to prevent future occurrences
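The timeouts mentioned above can be set like so (values are illustrative):

```sql
-- Abort statements that run longer than 30 seconds
ALTER SYSTEM SET statement_timeout = '30s';

-- Kill sessions idle inside a transaction for more than 5 minutes
ALTER SYSTEM SET idle_in_transaction_session_timeout = '5min';

SELECT pg_reload_conf();
```

In practice these are often scoped per role (`ALTER ROLE ... SET ...`) rather than system-wide, so batch jobs with legitimately long statements are not affected.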
8. Out-of-Memory During Refresh
Symptoms:
- Refresh processes killed by OS OOM killer
- PostgreSQL logs show `out of memory` errors
- Stream tables fail with system-category errors
Diagnosis:
# Check OS OOM killer logs
dmesg | grep -i "oom\|killed process" | tail -20
# Check PostgreSQL logs for memory errors
grep -i "out of memory\|oom" /var/log/postgresql/postgresql-*.log | tail -10
-- Check which stream tables have large source data
SELECT stream_table, source_table, pending_rows
FROM pgtrickle.change_buffer_sizes()
ORDER BY pending_rows DESC;
Resolution:
| Cause | Fix |
|---|---|
| Large FULL refresh on big table | Reduce work_mem or maintenance_work_mem to limit per-query memory |
| Large change buffer accumulation | Refresh more frequently (shorter schedule) to keep buffers small |
| Complex query with many joins | Simplify the defining query or break into cascading stream tables |
| Parallel refresh amplifies memory | Reduce pg_trickle.max_concurrent_refreshes |
Tuning:
-- Limit per-refresh memory
SET work_mem = '64MB';
-- Limit concurrent refreshes to reduce peak memory
ALTER SYSTEM SET pg_trickle.max_concurrent_refreshes = 2;
SELECT pg_reload_conf();
9. Disk Full / WAL Retention Exceeded
Symptoms:
- PostgreSQL logs `No space left on device` errors
- WAL directory consuming excessive disk
- Replication slots preventing WAL cleanup
Diagnosis:
# Check disk usage
df -h /var/lib/postgresql/data
du -sh /var/lib/postgresql/data/pg_wal/
-- Check replication slot WAL retention
SELECT slot_name, active,
pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC;
-- Check change buffer table sizes
SELECT stream_table, source_table,
pg_size_pretty(buffer_bytes::bigint) AS buffer_size
FROM pgtrickle.change_buffer_sizes()
ORDER BY buffer_bytes DESC;
Resolution:
| Cause | Fix |
|---|---|
| Inactive replication slot holding WAL | Drop the slot: SELECT pg_drop_replication_slot('pgt_...'); |
| Change buffer tables too large | Force full refresh to clear buffers, or refresh more frequently |
| WAL accumulation from long transactions | Terminate idle-in-transaction sessions |
| `max_wal_size` too low | Increase `max_wal_size` in `postgresql.conf` |
Emergency cleanup:
-- Drop inactive pg_trickle replication slots
SELECT pg_drop_replication_slot(slot_name)
FROM pg_replication_slots
WHERE slot_name LIKE 'pgt_%' AND NOT active;
10. Circular Pipeline Convergence Failure
Symptoms:
- Stream tables in a circular dependency hit the maximum iteration limit
- Refresh history shows repeated cycles without convergence
- Error messages mention `fixed_point_max_iterations`
Diagnosis:
-- Check for circular dependencies
SELECT * FROM pgtrickle.dependency_tree();
-- Check refresh history for iteration patterns
SELECT start_time, stream_table, action, status, error_message
FROM pgtrickle.refresh_timeline(50)
WHERE stream_table IN ('st_a', 'st_b') -- suspected cycle members
ORDER BY start_time DESC;
Resolution:
- Verify the cycle is intentional (see Circular Dependencies tutorial)
- Increase the iteration limit if convergence is slow: `ALTER SYSTEM SET pg_trickle.fixed_point_max_iterations = 20; SELECT pg_reload_conf();`
- If the cycle never converges, the defining queries may not be monotone. Restructure to eliminate the cycle or ensure monotonicity
11. Schema Change Broke Stream Table
Symptoms:
- Stream table has `needs_reinit = true`
- Reinitialization keeps failing
- Error messages reference dropped or renamed columns
Diagnosis:
-- Check for pending reinit
SELECT pgt_name, needs_reinit, status, last_error_message
FROM pgtrickle.pg_stat_stream_tables
WHERE needs_reinit;
-- Get error details
SELECT * FROM pgtrickle.diagnose_errors('my_st');
Resolution:
If the defining query is still valid after the DDL change, force a reinit:
SELECT pgtrickle.refresh_stream_table('my_st');
If the defining query needs to be updated:
-- Option 1: Alter the defining query
SELECT pgtrickle.alter_stream_table('my_st',
query => 'SELECT new_column, SUM(amount) FROM orders GROUP BY new_column'
);
-- Option 2: Drop and recreate
SELECT pgtrickle.drop_stream_table('my_st');
SELECT pgtrickle.create_stream_table(
'my_st',
'SELECT new_column, SUM(amount) FROM orders GROUP BY new_column',
'1m'
);
12. Worker Pool Exhaustion
Symptoms:
- Refresh latency increases across the board
- Some stream tables refresh while others queue indefinitely
- `worker_pool_status()` shows all workers busy
Diagnosis:
-- Check worker pool
SELECT * FROM pgtrickle.worker_pool_status();
-- Check for long-running parallel jobs
SELECT job_id, unit_key, status, duration_ms
FROM pgtrickle.parallel_job_status(300)
WHERE status = 'RUNNING'
ORDER BY duration_ms DESC;
Resolution:
| Cause | Fix |
|---|---|
| Too few workers for workload | Increase pg_trickle.max_concurrent_refreshes and/or max_worker_processes |
| One stream table monopolizing workers | Check if a single slow refresh is blocking the pool. Consider splitting into smaller stream tables |
| Global worker cap reached | Increase pg_trickle.max_dynamic_refresh_workers |
13. Fuse Tripped (Circuit Breaker)
Symptoms:
- Stream table shows `fuse_state = 'BLOWN'` or refresh is paused
- `fuse_status()` reports a tripped fuse
- No refreshes happening despite active scheduler
Diagnosis:
-- Check fuse status
SELECT * FROM pgtrickle.fuse_status();
Resolution:
Reset the fuse after investigating the root cause:
SELECT pgtrickle.reset_fuse('my_stream_table');
See the Fuse Circuit Breaker tutorial for details on fuse thresholds and configuration.
General Diagnostic Workflow
When investigating any issue, follow this sequence:
1. health_check() → identify which subsystem is unhealthy
2. pgt_status() → find specific affected stream tables
3. diagnose_errors('name') → get root cause for failures
4. refresh_timeline(20) → correlate with recent refresh events
5. change_buffer_sizes() → check CDC pipeline health
6. trigger_inventory() → verify change capture is working
7. dependency_tree() → confirm DAG wiring
8. PostgreSQL logs → low-level crash/resource details
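As a single psql session, steps 1–7 look like the following sketch. It assumes the diagnostic functions above live in the pgtrickle schema (as the other examples in this guide do); 'order_summary' is a placeholder stream table name.

```sql
-- Steps 1–2: overall health, then per-table status
SELECT * FROM pgtrickle.health_check();
SELECT * FROM pgtrickle.pgt_status();

-- Steps 3–4: root cause for one failing table, then recent refresh events
SELECT * FROM pgtrickle.diagnose_errors('order_summary');
SELECT * FROM pgtrickle.refresh_timeline(20);

-- Steps 5–7: CDC buffer health, trigger wiring, DAG structure
SELECT * FROM pgtrickle.change_buffer_sizes();
SELECT * FROM pgtrickle.trigger_inventory();
SELECT * FROM pgtrickle.dependency_tree();
```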
GUC Quick Reference for Troubleshooting
| GUC | Default | What to check |
|---|---|---|
| pg_trickle.enabled | on | Must be on for scheduler to run |
| pg_trickle.max_consecutive_errors | 3 | Stream tables suspend after this many failures |
| pg_trickle.scheduler_interval_ms | 100 | Very high values cause refresh lag |
| pg_trickle.event_driven_wake | on | off = poll-only, higher latency |
| pg_trickle.cdc_mode | auto | trigger for reliable fallback |
| pg_trickle.max_concurrent_refreshes | 4 | Per-database parallel refresh cap |
| pg_trickle.fixed_point_max_iterations | 10 | Circular pipeline iteration limit |
| pg_trickle.differential_change_ratio_threshold | 0.5 | Falls back to FULL above this ratio |
| pg_trickle.auto_backoff | on | Stretches intervals up to 8x under load |
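These GUCs can be adjusted with standard PostgreSQL configuration mechanics. A minimal sketch, assuming the setting takes effect on reload (whether a given pg_trickle GUC is reload-level or requires a restart is extension-specific — check the Configuration reference):

```sql
-- Raise the per-database refresh concurrency cap (GUC from the table above)
ALTER SYSTEM SET pg_trickle.max_concurrent_refreshes = 8;
SELECT pg_reload_conf();

-- Verify the value the running server sees
SHOW pg_trickle.max_concurrent_refreshes;
```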
pg_trickle Error Reference
This document lists all PgTrickleError variants with descriptions, common
causes, and suggested fixes. If you encounter an error not listed here, please
open an issue.
Tip: Most errors include context (table name, OID, or query fragment) in the message text. Use that context to narrow down the root cause.
SQLSTATE Code Reference
Every pg_trickle error includes a PostgreSQL SQLSTATE code for programmatic
error handling. Use SQLSTATE in PL/pgSQL EXCEPTION WHEN blocks or check
the error code in your client library.
| Error Variant | SQLSTATE | Code Name |
|---|---|---|
| QueryParseError | 42000 | SYNTAX_ERROR_OR_ACCESS_RULE_VIOLATION |
| TypeMismatch | 42804 | DATATYPE_MISMATCH |
| UnsupportedOperator | 0A000 | FEATURE_NOT_SUPPORTED |
| CycleDetected | 3F000 | INVALID_SCHEMA_DEFINITION |
| NotFound | 42P01 | UNDEFINED_TABLE |
| AlreadyExists | 42P07 | DUPLICATE_TABLE |
| InvalidArgument | 22023 | INVALID_PARAMETER_VALUE |
| QueryTooComplex | 54000 | PROGRAM_LIMIT_EXCEEDED |
| UpstreamTableDropped | 42P01 | UNDEFINED_TABLE |
| UpstreamSchemaChanged | 42P17 | INVALID_TABLE_DEFINITION |
| LockTimeout | 55P03 | LOCK_NOT_AVAILABLE |
| ReplicationSlotError | 55000 | OBJECT_NOT_IN_PREREQUISITE_STATE |
| WalTransitionError | 55000 | OBJECT_NOT_IN_PREREQUISITE_STATE |
| SpiError | XX000 | INTERNAL_ERROR |
| SpiPermissionError | 42501 | INSUFFICIENT_PRIVILEGE |
| WatermarkBackwardMovement | 22000 | DATA_EXCEPTION |
| WatermarkGroupNotFound | 42704 | UNDEFINED_OBJECT |
| WatermarkGroupAlreadyExists | 42710 | DUPLICATE_OBJECT |
| RefreshSkipped | 55000 | OBJECT_NOT_IN_PREREQUISITE_STATE |
| InternalError | XX000 | INTERNAL_ERROR |
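For example, a PL/pgSQL block can use the standard condition names for these SQLSTATEs to treat transient errors differently from user errors. A sketch — 'order_summary' is a placeholder, and the refresh call's exact signature is as documented in the SQL Reference:

```sql
-- Treat lock timeouts (55P03, LockTimeout) as transient; re-raise
-- unsupported-operator errors (0A000) since those need a query change.
DO $$
BEGIN
  PERFORM pgtrickle.refresh_stream_table('order_summary');
EXCEPTION
  WHEN lock_not_available THEN      -- SQLSTATE 55P03
    RAISE NOTICE 'refresh deferred: lock timeout, will retry later';
  WHEN feature_not_supported THEN   -- SQLSTATE 0A000
    RAISE;
END $$;
```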
Error Categories
pg_trickle classifies errors into four categories that determine retry behavior:
| Category | Retried by scheduler? | Description |
|---|---|---|
| User | No | Invalid queries, type mismatches, DAG cycles. Fix the input. |
| Schema | No (triggers reinitialize) | Upstream DDL changes. The stream table is reinitialized automatically. |
| System | Yes (with backoff) | Lock timeouts, replication slot problems, transient SPI failures. |
| Internal | No | Unexpected bugs. Please report these. |
User Errors
QueryParseError
Message: query parse error: <details>
Description: The defining query could not be parsed or validated by the pg_trickle query analyzer.
Common causes:
- Syntax error in the defining query
- Use of PostgreSQL syntax not yet supported by pgrx's query parser
- A CTE or subquery that cannot be analyzed
Suggested fix: Simplify the query. Check that it runs as a standalone
SELECT statement. Review SQL Reference — Expression Support
for supported syntax.
TypeMismatch
Message: type mismatch: <details>
Description: A type incompatibility was detected between the defining query output and the stream table schema, or between source columns and expected types.
Common causes:
- Column type changed on a source table after stream table creation
- Explicit cast to an incompatible type in the defining query
- UNION branches with mismatched column types
Suggested fix: Ensure column types match. Use explicit CAST() to align
types if needed. If the source table changed, use
pgtrickle.repair_stream_table() to reinitialize.
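As an illustration of the UNION case, explicit casts align the branches. The table and column names here are hypothetical:

```sql
-- Without the casts, mismatched branch types (e.g. numeric vs. integer)
-- raise TypeMismatch at creation time.
SELECT pgtrickle.create_stream_table(
  'all_amounts',
  'SELECT amount::numeric AS amount FROM orders
   UNION ALL
   SELECT refund_amount::numeric FROM refunds',
  '1m'
);
```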
UnsupportedOperator
Message: unsupported operator for DIFFERENTIAL mode: <operator>
Description: The defining query uses an SQL operator or construct that pg_trickle cannot maintain incrementally.
Common causes:
- TABLESAMPLE, GROUPING SETS beyond the branch limit, recursive CTEs with unsupported patterns, certain window function combinations
- Non-monotonic or volatile functions in positions that prevent differential maintenance
Suggested fix: Use refresh_mode => 'FULL' to fall back to full
recomputation:
SELECT pgtrickle.alter_stream_table('my_stream_table',
refresh_mode => 'FULL');
Or restructure the query to avoid the unsupported construct. See SQL Reference — Expression Support.
CycleDetected
Message: cycle detected in dependency graph: A -> B -> C -> A
Description: Adding or altering this stream table would create a circular dependency in the refresh DAG.
Common causes:
- Stream table A depends on stream table B, which depends on A
- Indirect cycles through chains of stream tables
Suggested fix: Restructure the stream table definitions to break the cycle.
Use pgtrickle.get_dependency_graph() to visualize the current DAG. If
circular dependencies are intentional, enable pg_trickle.allow_circular = true
(see Configuration).
NotFound
Message: stream table not found: <name>
Description: The specified stream table does not exist in the
pgtrickle.pgt_stream_tables catalog.
Common causes:
- Typo in the stream table name
- The stream table was already dropped
- Schema-qualified name required but not provided (e.g., myschema.my_st)
Suggested fix: Check the name with pgtrickle.list_stream_tables(). Use
the fully qualified name: schema.table_name.
AlreadyExists
Message: stream table already exists: <name>
Description: A create_stream_table() call was made for a stream table
name that is already registered.
Common causes:
- Re-running a migration or DDL script without IF NOT EXISTS
Suggested fix: Use pgtrickle.create_stream_table_if_not_exists() or
pgtrickle.create_or_replace_stream_table() for idempotent creation.
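A sketch of the idempotent form, assuming it accepts the same positional arguments as create_stream_table() shown earlier (name, query, schedule) — check the SQL Reference for the exact signature:

```sql
-- Safe to re-run in a migration: a no-op if 'my_st' already exists
SELECT pgtrickle.create_stream_table_if_not_exists(
  'my_st',
  'SELECT status, count(*) FROM orders GROUP BY status',
  '1m'
);
```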
InvalidArgument
Message: invalid argument: <details>
Description: An invalid value was passed to a pg_trickle API function.
Common causes:
- Invalid refresh_mode value (must be 'DIFFERENTIAL', 'FULL', or 'AUTO')
- Calling resume_stream_table() on a table that is not suspended
- Invalid schedule interval or threshold value
- Empty or malformed table name
Suggested fix: Check the function signature in the SQL Reference and correct the argument.
QueryTooComplex
Message: query too complex: <details>
Description: The defining query exceeds the maximum parse depth, which protects against stack overflow during query analysis.
Common causes:
- Deeply nested subqueries (> 64 levels by default)
- Large UNION ALL chains
Suggested fix: Simplify the query. If the depth limit is too restrictive,
increase pg_trickle.max_parse_depth (default: 64). See
Configuration.
Schema Errors
UpstreamTableDropped
Message: upstream table dropped: OID <oid>
Description: A source table referenced by the stream table's defining query was dropped.
Common causes:
- DROP TABLE on a source table
- Table replaced via DROP + CREATE (new OID)
Suggested fix: Either recreate the source table with the same schema or
drop the stream table and recreate it. If pg_trickle.block_source_ddl = true
(default), the DROP would have been blocked in the first place.
UpstreamSchemaChanged
Message: upstream table schema changed: OID <oid>
Description: A source table's schema was altered (e.g., column added, dropped, or type changed) in a way that affects the defining query.
Common causes:
- ALTER TABLE ... ADD/DROP/ALTER COLUMN on a source table
- Type change on a column used in the defining query
Suggested fix: The stream table will be automatically reinitialized on the
next scheduler tick. If pg_trickle.block_source_ddl = true (default), most
schema changes are blocked proactively. Use
pgtrickle.alter_stream_table(..., query => '...') to update the defining
query if needed.
System Errors
LockTimeout
Message: lock timeout: <details>
Description: A lock required for refresh could not be acquired within the configured timeout.
Common causes:
- Long-running transactions holding locks on the stream table or source tables
- Concurrent ALTER TABLE or VACUUM FULL operations
- High contention on the change buffer tables
Suggested fix: This error is automatically retried with exponential backoff.
If persistent, investigate long-running transactions with pg_stat_activity.
Consider increasing lock_timeout or reducing refresh frequency.
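The usual starting point is PostgreSQL's own pg_stat_activity view — for example, listing transactions that have been open for more than five minutes:

```sql
-- Long-running transactions that may be holding conflicting locks
SELECT pid, usename, state,
       now() - xact_start AS xact_age,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE state <> 'idle'
  AND xact_start < now() - interval '5 minutes'
ORDER BY xact_start;
```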
ReplicationSlotError
Message: replication slot error: <details>
Description: An error occurred with the logical replication slot used for WAL-based CDC.
Common causes:
- Replication slot dropped externally
- wal_level changed from logical to a lower level
- Slot lag exceeded max_slot_wal_keep_size
Suggested fix: Check replication slot status with
SELECT * FROM pg_replication_slots. Ensure wal_level = logical. If the slot
was dropped, pg_trickle will recreate it automatically. See
Configuration — WAL CDC.
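A more targeted inspection query against PostgreSQL's pg_replication_slots catalog (the wal_status and safe_wal_size columns exist on PostgreSQL 13 and later):

```sql
-- Is the slot active, and how close is it to being invalidated?
SELECT slot_name, plugin, active, restart_lsn, wal_status, safe_wal_size
FROM pg_replication_slots
WHERE slot_type = 'logical';

SHOW wal_level;  -- must be 'logical' for WAL-based CDC
```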
WalTransitionError
Message: WAL transition error: <details>
Description: An error occurred during the transition from trigger-based CDC to WAL-based CDC.
Common causes:
- wal_level is not logical when cdc_mode = 'auto'
- Transient connection issues during the transition
Suggested fix: Ensure wal_level = logical in postgresql.conf if you
want WAL-based CDC. Otherwise set pg_trickle.cdc_mode = 'trigger' to stay
on trigger-based CDC. This error is retried automatically.
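Both remedies as SQL — note that changing wal_level always requires a server restart, and whether pg_trickle.cdc_mode applies on reload is an assumption to verify against the Configuration reference:

```sql
-- Option A: enable logical WAL so auto mode can transition to WAL-based CDC
ALTER SYSTEM SET wal_level = 'logical';
-- ...restart PostgreSQL, then confirm:
SHOW wal_level;

-- Option B: pin trigger-based CDC if logical WAL is not an option
ALTER SYSTEM SET pg_trickle.cdc_mode = 'trigger';
SELECT pg_reload_conf();
```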
SpiError
Message: SPI error: <details>
Description: A PostgreSQL Server Programming Interface (SPI) error occurred during an internal query.
Common causes:
- Transient serialization failures under high concurrency
- Deadlocks between refresh and concurrent DML
- Connection issues in background workers
- Permanent errors: missing columns, syntax errors in generated SQL
Suggested fix: Transient SPI errors (deadlocks, serialization failures) are
retried automatically. Permanent errors (permission denied, missing objects)
will suspend the stream table after max_consecutive_errors failures. Check
pgtrickle.check_health() for details.
SpiPermissionError
Message: SPI permission error: <details>
Description: The background worker's role lacks required permissions.
Common causes:
- Missing SELECT privilege on a source table
- Missing INSERT/UPDATE/DELETE privilege on the stream table
- Role used by the background worker is not the table owner
Suggested fix: Grant the necessary privileges to the role running pg_trickle's background workers:
GRANT SELECT ON source_table TO pgtrickle_role;
GRANT ALL ON pgtrickle.my_stream_table TO pgtrickle_role;
This error does not count toward the consecutive error suspension limit.
Watermark Errors
WatermarkBackwardMovement
Message: watermark moved backward: <details>
Description: A watermark advancement was rejected because the new value is older than the current watermark, violating monotonicity.
Common causes:
- Clock skew in distributed systems
- Manual watermark manipulation with an incorrect value
- Bug in watermark tracking logic
Suggested fix: Ensure watermark values are monotonically increasing. Check
the current watermark with pgtrickle.get_watermark_groups().
WatermarkGroupNotFound
Message: watermark group not found: <details>
Description: The specified watermark group does not exist.
Common causes:
- Typo in the watermark group name
- The group was deleted or never created
Suggested fix: List existing groups with
pgtrickle.get_watermark_groups().
WatermarkGroupAlreadyExists
Message: watermark group already exists: <details>
Description: A watermark group with this name already exists.
Common causes:
- Re-running a setup script without idempotent guards
Suggested fix: Use a different name or delete the existing group first.
Transient Errors
RefreshSkipped
Message: refresh skipped: <details>
Description: A refresh was skipped because a previous refresh for the same stream table is still running.
Common causes:
- Slow refresh (large delta or complex query) overlapping with the next scheduled cycle
- Multiple manual refresh_stream_table() calls in parallel
Suggested fix: No action needed — the scheduler will retry on the next
cycle. If this happens frequently, increase the schedule interval or
investigate why refreshes are slow using pgtrickle.explain_st().
This error does not count toward the consecutive error suspension limit.
Internal Errors
InternalError
Message: internal error: <details>
Description: An unexpected internal error that indicates a bug in pg_trickle.
Common causes:
- This should not happen in normal operation
Suggested fix: Please report the issue
with the full error message, your PostgreSQL version, and pg_trickle version.
Include the output of pgtrickle.check_health() and the relevant PostgreSQL
log entries.
See Also
Changelog
What's new in pg_trickle — written for everyone, not just developers.
For future plans and upcoming features, see ROADMAP.md.
Table of Contents
- Unreleased
- 0.20.0 — Dog Feeding
- 0.19.0 — 2026-04-13
- 0.18.0 — 2026-04-12
- 0.17.0 — 2026-04-08
- 0.16.0 — 2026-04-06
- 0.15.0 — 2026-04-03
- 0.14.0 — 2026-04-02
- 0.13.0 — 2026-03-31
- 0.12.0 — 2026-03-28
- 0.11.0 — 2026-03-26
- 0.10.0 — 2026-03-25
- 0.9.0 — 2026-03-20
- 0.8.0 — 2026-03-17
- 0.7.0 — 2026-03-16
- 0.6.0 — 2026-03-14
- 0.5.0 — 2026-03-13
- 0.4.0 — 2026-03-12
- 0.3.0 — 2026-03-11
- 0.2.3 — 2026-03-09
- 0.2.2 — 2026-03-08
- 0.2.1 — 2026-03-05
- 0.2.0 — 2026-03-04
- 0.1.3 — 2026-03-02
- 0.1.2 — 2026-02-28
- 0.1.1 — 2026-02-26
- 0.1.0 — 2026-02-26
[Unreleased]
[0.20.0] — Dog Feeding
pg_trickle now monitors itself. Instead of you having to check on
pg_trickle's health manually, this release lets pg_trickle watch its own
performance, spot problems early, and even fix some of them on its own. Five
new stream tables sit in the pgtrickle schema and continuously analyse
refresh history — the same technology you use for your own data, pointed
inward. One SQL call sets everything up; one call tears it down.
We call this dog feeding — pg_trickle uses its own stream-table technology to keep an eye on itself, just like it keeps your data views up to date.
What's new
- One-click self-monitoring — run SELECT pgtrickle.setup_dog_feeding() and pg_trickle creates five monitoring stream tables that continuously track how well it is performing. Run teardown_dog_feeding() to remove them. Both are idempotent — safe to call as many times as you like, even during rolling upgrades.
- Health at a glance — the new dog_feeding_status() function shows the status of all five monitoring views in one query: whether each one exists, its refresh mode, and the last time it refreshed. Quick to run from a monitoring script or dashboard.
- Threshold recommendations — after enough refresh cycles accumulate (typically 10–20 minutes of activity), df_threshold_advice starts producing suggestions for each stream table. Each recommendation includes a confidence level (HIGH / MEDIUM / LOW) and a reason — for example, "DIFF is 73% faster — raise threshold to allow more DIFF". A sla_headroom_pct column shows exactly how much faster incremental refresh is versus full refresh for that table.
- Automatic tuning — set pg_trickle.dog_feeding_auto_apply = 'threshold_only' and pg_trickle will apply HIGH-confidence threshold recommendations automatically. Changes are rate-limited to once per 10 minutes per stream table, and every adjustment is logged to pgt_refresh_history with initiated_by = 'DOG_FEED' so you have a full audit trail.
- Real-time alerts — when pg_trickle detects an anomaly (duration spike exceeding 3× the baseline, or two or more recent failures), it sends a NOTIFY on the pgtrickle_alert channel with a JSON payload. Your application, Alertmanager webhook, or LISTEN client can act immediately without polling.
- Scheduling interference detection — df_scheduling_interference tracks pairs of stream tables that consistently overlap during refresh. When overlap is heavy, the scheduler automatically backs off its poll interval (up to 2× the configured base) to reduce contention.
- Visual dependency graph — the new explain_dag() function renders your full refresh pipeline as a Mermaid or Graphviz DOT diagram. User stream tables appear in blue, dog-feeding tables in green, suspended tables in red. Paste the output into any Mermaid renderer or dot to see exactly how your tables depend on each other.
- Scheduler overhead report — scheduler_overhead() returns metrics for the last hour: total refreshes, how many were dog-feeding, the fraction they represent, and average durations. Useful for confirming that self-monitoring adds negligible cost.
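Consuming the alerts is plain PostgreSQL LISTEN/NOTIFY. From psql, for instance (the channel name is pgtrickle_alert as above; the exact JSON payload fields are not specified here):

```sql
-- Subscribe for the lifetime of this session
LISTEN pgtrickle_alert;

-- Notifications arrive asynchronously; psql prints them after the next
-- command completes. The payload is a JSON document describing the anomaly.
```

A long-lived application connection (or an Alertmanager webhook bridge) would hold the LISTEN open and react to each payload as it arrives.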
What pg_trickle watches
| Monitoring view | What it tracks |
|---|---|
| df_efficiency_rolling | Rolling-window refresh speed, change ratio, DIFF vs FULL counts |
| df_anomaly_signals | Duration spikes (> 3× baseline), error bursts, mode oscillation |
| df_threshold_advice | Per-table threshold recommendations with confidence level and reasoning |
| df_cdc_buffer_trends | Change-capture buffer growth rate per source table; alerts on burst spikes |
| df_scheduling_interference | Refresh overlap patterns; pairs with 3+ concurrent refreshes in the last hour |
Faster and more reliable
- A new index on pgt_refresh_history(pgt_id, start_time) speeds up all dog-feeding queries and general history lookups. Applied automatically during the 0.19.0 → 0.20.0 upgrade.
- Old history records are now pruned in batches of 1,000 rows per transaction (previously one large DELETE), which avoids long lock holds on pgt_refresh_history during the nightly cleanup.
- check_cdc_health() is enriched with spill-risk alerts: if a source table's max burst delta exceeds 10× its average, you get an early warning before the buffer fills.
- explain_st() now shows two new properties: dog_feeding_coverage (none / partial / full) and recommended_refresh_mode, so diagnostics automatically surface self-monitoring data when it is available.
New documentation and tooling
- SQL Reference — a new "Dog Feeding — Self-Monitoring" section covers all five stream tables, setup_dog_feeding(), teardown_dog_feeding(), confidence levels, and the sla_headroom_pct column.
- Getting Started — a new "Day 2 Operations" section walks through enabling dog-feeding, reading recommendations, enabling auto-apply, and visualising the DAG.
- Configuration — pg_trickle.dog_feeding_auto_apply is fully documented with values, rate-limiting behaviour, and the audit trail.
- A ready-made Grafana dashboard (pg_trickle_dog_feeding.json) with five panels covers refresh throughput, anomaly heatmap, threshold calibration, CDC buffer growth, and the scheduling interference matrix.
- A dbt macro (pgtrickle_enable_monitoring) enables monitoring as a post-hook with one line in dbt_project.yml.
- A quick-start SQL script at sql/dog_feeding_setup.sql walks through setup, auto-apply, alert listening, and status verification in six steps.
[0.19.0] — 2026-04-13
Safer, faster, easier to operate. This release closes several security and correctness gaps, adds new conveniences for operators and developers, and significantly improves performance for deployments with many stream tables. The background scheduler finds the next table to refresh 10–15× faster. Four breaking changes are included — all easy to adapt to, each one correcting behaviour that was a source of subtle bugs in production.
Breaking changes
- Only owners can modify their own stream tables — other database users can no longer drop or alter a stream table they did not create. If shared access is intentional, grant superuser or explicitly add the user as owner. Superusers are unaffected.
- Dropping a stream table no longer cascades — drop_stream_table() now behaves like PostgreSQL's own DROP TABLE: it refuses to drop if dependent objects exist, unless you pass cascade => true explicitly. Previously it silently removed all dependents, which surprised operators after restructuring.
- The refresh notification channel was renamed — change LISTEN pgtrickle_refresh to LISTEN pg_trickle_refresh (note the added underscore). The old name was inconsistent with every other channel in the extension.
- The delete_insert refresh strategy was removed — this strategy could produce wrong results for queries containing aggregates or DISTINCT. If you had it configured, pg_trickle logs a warning and automatically switches to the safe auto strategy. No data is lost; the next refresh corrects any affected rows.
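Adapting to the first three changes is mechanical. A sketch ('daily_totals' is a placeholder name):

```sql
-- Cascading drops must now be requested explicitly
SELECT pgtrickle.drop_stream_table('daily_totals', cascade => true);

-- Update any LISTEN clients to the renamed channel
LISTEN pg_trickle_refresh;   -- was: LISTEN pgtrickle_refresh
```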
New features
- Installation health check — version_check() returns the installed extension version, the loaded library version, and the PostgreSQL server version in one row. If the extension was upgraded but the server has not been restarted, you get an explicit warning. Useful in deploy scripts and smoke tests.
- Write and refresh in one step — write_and_refresh(sql, st_name) executes an arbitrary SQL statement and immediately refreshes the named stream table in the same transaction. Downstream readers see consistent results as soon as the transaction commits — no polling loop needed.
- Better connection-pooler support — the new pg_trickle.connection_pooler_mode GUC configures pg_trickle for PgBouncer, pgcat, or Supavisor at the cluster level. Previously each stream table had to be configured individually, which was error-prone on large deployments.
- Automatic refresh history cleanup — pgt_refresh_history is now trimmed automatically after 90 days (configurable with pg_trickle.history_retention_days; set to 0 to disable). Without this, the history table could grow by thousands of rows per day on busy deployments.
- Schema migration tracking — pg_trickle now records which upgrade scripts have been applied in pgtrickle.pgt_schema_version. This makes it straightforward to verify that a deployment is fully up to date and simplifies the rollback story.
- Clearer skip messages — when a refresh is skipped because another refresh of the same stream table is already running, you now see a NOTICE: skipping refresh of <name> — already running message instead of silence. Reduces confusion when debugging slow or stuck schedulers.
- Deeper diagnostics — explain_st() gains a with_analyze parameter. When set to true, it runs EXPLAIN (ANALYZE, BUFFERS) on the refresh query and returns actual row counts, timing, and buffer hit/miss ratios — the same information PostgreSQL's query planner provides for any query, but surfaced inside the stream-table diagnostic tool.
- New deployment guides — step-by-step documentation for PgBouncer, pgcat, Supavisor, CNPG, and Kubernetes deployments, plus an operational runbook for common Kubernetes failure modes.
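The write-and-refresh feature composes with an explicit transaction. A sketch using the two-argument form from the release note ('order_stats' is a placeholder stream table):

```sql
-- Load a batch and expose the derived results atomically: 'order_stats'
-- is already fresh when the transaction commits.
BEGIN;
SELECT pgtrickle.write_and_refresh(
  'INSERT INTO orders (id, status) VALUES (43, ''active'')',
  'order_stats'
);
COMMIT;
```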
Bug fixes
- Fixed a constraint-validation inconsistency in databases upgraded from 0.11.0 or earlier where pgt_refresh_history had a duplicate check entry in the catalog. Affected databases could see spurious constraint errors on busy write paths.
- Error messages throughout the extension now show human-readable table names (e.g. public.orders) instead of raw PostgreSQL OIDs. This affects "source table was dropped", "schema changed", and several other error paths that were previously unreadable without a catalog lookup.
Performance
- 10–15× faster scheduler dispatch — the scheduler now finds the next stream table to process with a direct lookup instead of scanning the full list on every poll cycle. On a deployment with 500 stream tables this drops from ~650 µs to ~45 µs per poll, reducing background CPU overhead significantly at scale.
- Single-query change detection — when the scheduler checks whether any source tables have changed, it now issues one query covering all sources at once instead of one query per source table. On deployments with 50+ source tables this meaningfully reduces the overhead of each scheduler cycle, especially under PgBouncer transaction pooling.
[0.18.0] — 2026-04-12
Hardening & Delta Performance. This release focuses on correctness,
reliability, and giving operators better visibility into what pg_trickle is
doing. Stream tables that group by columns containing NULL values now refresh
correctly in all cases. A new memory safety net prevents runaway refreshes
from consuming too much RAM. Error messages across the board now explain what
went wrong and suggest how to fix it. Two new SQL functions —
health_summary() and cache_stats() — give you a single-query overview of
the entire system, and updated Grafana dashboards make monitoring plug-and-play.
The TPC-H industry benchmark now runs as a nightly regression guard, and
property-based tests mathematically verify the core delta engine's arithmetic.
Highlights
- NULL values in GROUP BY now handled correctly — previous versions could produce wrong results when a stream table grouped by a column that contained NULL values and rows were deleted. The root cause was that NULL group keys broke the internal row-matching logic. This is now fixed: NULL keys are matched correctly during both inserts and deletes, so aggregate stream tables always return the right answer regardless of NULLs in the data.
- Memory safety net for large deltas — if an unexpectedly large batch of changes arrives (for example, a bulk import into a source table), the incremental refresh could previously consume unbounded memory. A new configuration option (pg_trickle.delta_work_mem_cap_mb) lets you set a ceiling. When a refresh would exceed it, pg_trickle automatically falls back to a full refresh instead of risking an out-of-memory crash.
- Early warning when refreshes spill to disk — when the incremental refresh engine runs low on memory, PostgreSQL may spill intermediate data to temporary files on disk, which is much slower. pg_trickle now detects this and sends a notification so you can investigate before performance degrades. If spilling happens repeatedly, the scheduler automatically switches the affected stream table to full refresh.
- One-query system health check — the new pgtrickle.health_summary() function returns a single row with everything you need at a glance: how many stream tables are active, how many are in error or suspended state, the worst staleness across all tables, whether the scheduler is running, and the overall cache hit rate. Perfect for dashboards, alerting rules, or a quick manual check.
- Cache performance visibility — the new pgtrickle.cache_stats() function shows how effectively pg_trickle is reusing its internal query templates. You can see cache hit rates, eviction counts, and current cache size — useful for tuning pg_trickle.template_cache_size on busy systems.
- Better error messages — every error pg_trickle can raise now includes a standard PostgreSQL error code (SQLSTATE), a DETAIL line explaining the context, and a HINT suggesting what to do. Instead of a cryptic internal error, you get actionable guidance like "Table 'orders' was dropped while stream table 'order_summary' depends on it — recreate the source table or drop the stream table."
Monitoring & dashboards
- Updated Grafana dashboards — the bundled pg_trickle_overview.json dashboard now includes panels for template cache hit rate, P99 and average refresh latency, hourly refresh success/failure counts, and cache eviction trends. Import it into Grafana and point it at your Prometheus instance for instant visibility.
- Prometheus metric documentation — all 8 new metrics exposed by cache_stats() and health_summary() are now fully documented in the monitoring guide, with ready-to-use PromQL queries.
Correctness & testing
- TPC-H regression guard — all 22 queries from the TPC-H industry benchmark now run nightly against known-good expected output. If a code change causes any query to return different results, CI fails immediately. This catches subtle correctness regressions that targeted tests might miss.
- Property-based verification of delta arithmetic — 6 property-based tests (2,000 random cases each) verify that the core engine's insert/delete accounting is correct: operations compose in the right order, groups cancel out properly, and no phantom rows appear after mixed workloads. An additional 4 end-to-end property tests exercise the full pipeline from change capture through to the final merged result.
- CDC edge case coverage — new tests cover composite primary keys, generated (computed) columns, NULL values in non-key columns, and domain types — real-world schema patterns that were previously untested.
- dbt integration tests — the dbt adapter now has regression tests for AUTO refresh mode, stream table health checks, and refresh history lifecycle — ensuring the dbt workflow stays reliable across releases.
Scalability
- Scaling guide — a new docs/SCALING.md document covers how to configure pg_trickle for large deployments (200+ stream tables), including worker pool sizing, tiered scheduling, per-database quotas, and tuning profiles for different workload types.
- Buffer growth stress tests — new tests verify that the max_buffer_rows safety limit works correctly under sustained high write rates, including automatic recovery back to incremental refresh after a burst subsides.
Testing infrastructure
- Faster CI on pull requests — 19 additional test files (~197 tests) were moved to the lightweight test runner that does not require building a custom Docker image. Pull request CI is now faster without sacrificing coverage.
- Upgrade path tested — the full upgrade chain from version 0.1.3 through every release up to 0.18.0 is verified automatically in CI, including function availability, schema integrity, and data survival.
Fixed
- Upgrade script completeness — the 0.17.0 → 0.18.0 upgrade migration now includes all new and changed functions (pg_trickle_hash, cache_stats(), health_summary()), so ALTER EXTENSION pg_trickle UPDATE works correctly.
[0.17.0] — 2026-04-08
Query Intelligence & Stability. This release teaches pg_trickle to make smarter decisions about how to refresh each stream table, reduces unnecessary work when only a handful of columns actually changed, and proves correctness through 10,000 automated random mutations every night. Large deployments with hundreds of stream tables now handle schema changes much faster. Alongside these improvements, three new documentation resources make it easier to get started, troubleshoot problems, and migrate from pg_ivm.
Highlights
-
Query-aware refresh decisions — pg_trickle previously used a fixed threshold to decide between incremental and full refresh: if more than 50% of rows changed, switch to full. That works for simple queries but is poorly calibrated for joins or aggregates. The engine now classifies each query by its complexity (simple scan, filter, aggregate, join, or join+aggregate) and weights the cost estimate accordingly. Simple queries stay incremental even at high change rates; expensive join-heavy queries switch to full refresh sooner when the data is largely different. You can also pin a table to always use one strategy with the new
pg_trickle.refresh_strategysetting ('auto'/'differential'/'full'), or tune the aggressiveness withpg_trickle.cost_model_safety_margin. -
Skip columns that did not change — when a row is updated in a wide source table (say, 50 columns) but only 2 columns that the stream table actually uses are modified, pg_trickle previously processed the full change anyway. It now tracks exactly which columns were modified and skips updates that touch none of the relevant columns. For aggregate stream tables the savings go further: a value-only update that does not affect group membership is applied as a single lightweight correction instead of a delete-then-insert pair. On write-heavy workloads with wide tables, this reduces the volume of data flowing through the refresh pipeline by 50–90%.
-
- Faster schema changes on large deployments — every time you create, alter, or drop a stream table, pg_trickle previously rebuilt the entire internal dependency graph from scratch. With 100 stream tables that takes only a few milliseconds, but at 1,000 it becomes noticeable. The graph is now updated incrementally — only the affected edges are touched, leaving everything else in place. At 1,000 stream tables the rebuild time drops from ~600 µs to ~116 µs and no longer scales with the total number of tables in the database.
- Nightly correctness oracle — a new automated test runs 10,000 random data mutations every night against a broad set of query shapes. For each mutation it compares the result of incremental refresh against a full recompute and fails if they ever disagree. This catches subtle correctness bugs that only surface after unusual sequences of inserts, updates, and deletes — the kind that hand-written tests rarely reach.
- `ROWS FROM()` fully supported — queries that use `ROWS FROM()` to call multiple set-returning functions side-by-side are now fully supported in incremental mode, including updates and deletes. This was previously restricted to insert-only workloads.
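A stream table over a `ROWS FROM()` query might look like the following sketch — the table and column names are illustrative, and only the `create_stream_table()` parameters shown in the README are assumed:

```sql
-- Pair up two set-returning functions positionally with ROWS FROM();
-- updates and deletes on the source are now maintained incrementally.
-- The orders schema (jsonb columns items/quantities) is a made-up example.
SELECT pgtrickle.create_stream_table(
  name  => 'expanded_orders',
  query => $$
    SELECT o.id, r.item, r.qty
    FROM orders o,
         ROWS FROM (jsonb_array_elements_text(o.items),
                    jsonb_array_elements_text(o.quantities)) AS r(item, qty)
  $$,
  schedule => '30s'
);
```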
New documentation
- Try it in 60 seconds — a new `playground/` directory contains a `docker compose up` environment with PostgreSQL 18 + pg_trickle pre-wired, sample data loaded, and five stream tables ready to query. No installation required beyond Docker.
- Troubleshooting runbook — `docs/TROUBLESHOOTING.md` covers 13 real-world failure scenarios: scheduler not running, stream table stuck in SUSPENDED state, CDC triggers missing, WAL slot problems, out-of-memory, disk full, circular dependency convergence issues, unexpected schema changes, worker pool exhaustion, and blown fuses. Each scenario lists symptoms, diagnostic queries, and step-by-step resolution.
- Migrating from pg_ivm — `docs/tutorials/MIGRATING_FROM_PG_IVM.md` is a step-by-step guide for teams moving from the pg_ivm extension. It maps every pg_ivm API to its pg_trickle equivalent, explains behavioral differences, and includes ready-to-run SQL examples and a post-migration verification checklist.
- New user FAQ — the top 15 common questions are now answered at the top of `docs/FAQ.md` so new users find answers before scrolling through the full document.
- Post-install verification script — `scripts/verify_install.sql` walks through the complete setup: checks that pg_trickle is loaded, creates a test stream table, runs a refresh, verifies the result, and cleans up. Useful for confirming a fresh installation or diagnosing environment issues.
Stability & code quality
- Safer internal code — the number of `unsafe` Rust blocks in the query parser was reduced from 690 to 441 (a 36% drop) by introducing two helper macros that wrap the most common unsafe patterns. No behavior change; this makes the codebase easier to audit and maintain.
- Cleaner internal structure — the largest source file (`api.rs`, ~9,400 lines) was split into three focused modules. This has no user-visible effect but makes the codebase significantly easier to work with and reduces the risk of regressions from unrelated code being in the same file.
- Refresh logic extracted and tested — seven functions responsible for building the SQL used during refresh were extracted into standalone testable units and covered with 29 new unit tests. This catches regressions in generated SQL templates before they reach production.
[0.16.0] — 2026-04-06
Performance & Refresh Optimization. This release makes stream table
refreshes significantly faster across the board. Small changes to large
tables are now applied without expensive full-table scans. Tables that only
receive new rows (no updates or deletes) use a streamlined path that skips
unnecessary work. Aggregate queries like SUM and COUNT are refreshed
with pinpoint updates instead of recalculating entire groups. A new template
cache eliminates repeated startup work when database connections are recycled.
An automated benchmark system now prevents future changes from accidentally
slowing things down.
Highlights
- Smarter refresh for small changes — when only a handful of rows change in a large stream table (less than 1% of total rows), pg_trickle now uses a faster strategy that skips the full-table comparison. This can reduce refresh time by up to 40% for common workloads where most data stays the same between refreshes. The system picks the best strategy automatically, but you can override it via the `merge_strategy` setting.
- Insert-only fast path — stream tables backed by append-only data sources (like event logs or audit trails that never update or delete rows) are now detected automatically and refreshed using a much simpler, faster path. No configuration is needed — pg_trickle observes your data patterns and switches to the fast path on its own. If an update or delete is later detected, it safely falls back to the standard approach with a warning.
- Faster aggregate refreshes — stream tables that use `SUM`, `COUNT`, `AVG`, or `STDDEV` aggregates now update individual groups directly instead of re-joining against the entire table. For queries with many distinct groups, this can be 5–20× faster. Non-invertible aggregates like `MIN`, `MAX`, and `STRING_AGG` continue using the standard path.
- Template cache for faster cold starts — the first time a database connection refreshes a stream table, pg_trickle normally spends ~45 ms preparing the refresh query. A new cross-connection cache stores these prepared queries so that subsequent connections (including those from connection poolers like PgBouncer) start refreshing in about 1 ms instead.
- Automated performance regression checks — every code change to pg_trickle is now automatically benchmarked before it can be merged. If any operation slows down by more than 10%, the change is blocked until the regression is fixed. This protects users from accidental performance degradation in future releases.
New features
- Error reference guide — a new error reference page documents every error message pg_trickle can produce, explains what caused it, and suggests how to fix it. Useful when troubleshooting unexpected behavior in production.
- Change buffer growth protection — if a stream table's refresh keeps failing, the backlog of unprocessed changes could previously grow without limit, consuming disk space. A new `max_buffer_rows` setting (default: 1,000,000 rows) caps this growth. When the limit is reached, pg_trickle performs a full refresh to clear the backlog and warns you about the situation.
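A minimal sketch of tightening the cap — this assumes the setting lives under the extension's usual `pg_trickle.` GUC prefix, like the other settings in this changelog:

```sql
-- Cap the unprocessed-change backlog at 250k rows instead of the
-- 1,000,000-row default; beyond this, pg_trickle clears the backlog
-- with a full refresh and emits a warning.
SET pg_trickle.max_buffer_rows = 250000;
```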
- Automatic index creation control — pg_trickle has always created helpful indexes on stream tables automatically. A new `auto_index` setting lets you disable this behavior when you want full control over indexing. Stream tables using `SELECT DISTINCT` now also get an automatic index on their distinct columns.
- Compaction and predicate pushdown stats — the `explain_st()` diagnostics function now shows additional information about change buffer compaction thresholds, merge strategy selection, append-only mode, aggregate fast-path status, and template cache hit rates.
Improved
- Configuration guidance — the documentation now includes detailed tuning advice for the `planner_aggressive` and `cleanup_use_truncate` settings, especially for environments using connection poolers like PgBouncer or running under memory pressure.
- Terminal dashboard improvements — the `pgtrickle` TUI dashboard now shows the effective refresh mode for each stream table (e.g., when a table is temporarily downgraded from differential to full refresh). The Alerts tab has been restructured with a clearer table layout and better distinction between "stale data" and "no upstream changes" conditions.
Fixed
- Append-only detection with chained stream tables — stream tables that feed into other stream tables (cascading dependencies) now correctly skip the append-only fast path to avoid data inconsistencies. Previously, a chained stream table could incorrectly use the insert-only path even when downstream tables needed the full change set.
- Append-only heuristic accuracy — the automatic detection of insert-only data sources now also checks the stream table's own change buffer for non-insert operations, avoiding false positives.
- Full refresh fallback for mixed changes — when both a stream table and its source table have pending changes in the same refresh cycle, pg_trickle now correctly falls back to a full refresh to avoid inconsistencies.
- `resume_stream_table()` confirmed working — the function referenced in error messages when a stream table enters `SUSPENDED` state was verified to exist and work correctly (present since v0.2.0).
Testing & quality
- 13 new end-to-end tests covering JOIN correctness across update/delete cycles, window function differential behavior, differential-vs-full equivalence validation, and source table schema evolution resilience.
- 5 new benchmark scenarios covering semi-joins, anti-joins, multi-table join chains, and aggregate queries at varying group counts. Total: 22 benchmark functions.
- 1,700 unit tests pass (up from 1,630 in v0.15.0).
[0.15.0] — 2026-04-03
0.15.0 brings the terminal dashboard to full operational capability, adds safety features that protect against runaway refreshes, and broadens the ecosystem with guides for popular migration and ORM frameworks. It also includes a major internal refactoring of the query parser and a new streaming benchmark suite.
Highlights
- Interactive terminal dashboard — the `pgtrickle` TUI is no longer read-only. Refresh, pause, resume, and repair stream tables directly from the dashboard. A command palette (`:`) with fuzzy search makes common operations fast. The poller reconnects automatically after network interruptions.
- Bulk creation — `pgtrickle.bulk_create()` creates many stream tables in a single atomic transaction, ideal for CI/CD and dbt pipelines.
- Runaway-refresh protection — two new safety nets prevent expensive merges from spiralling: a pre-flight row-count estimate that downgrades to FULL refresh when deltas are too large (`max_delta_estimate_rows`), and a spill detector that forces FULL refresh after repeated temp-file writes (`spill_threshold_blocks`).
- Stuck-watermark alerting — if an upstream ETL pipeline stops advancing its watermark, pg_trickle now pauses affected stream tables and sends a `watermark_stuck` notification so the issue is surfaced immediately rather than silently producing stale data.
- Integration guides — new documentation for Flyway, Liquibase, SQLAlchemy, Django, and dbt Hub helps teams adopt pg_trickle alongside their existing tooling.
New Features
- Volatile function policy — a new `volatile_function_policy` setting lets you choose whether volatile functions (like `random()` or `clock_timestamp()`) should be rejected (the default), allowed with a warning, or allowed silently when creating stream tables.
- Bulk create API — `pgtrickle.bulk_create(definitions)` accepts a JSON array of stream table definitions and creates them all in one transaction. If any definition fails, the entire batch is rolled back.
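A sketch of a two-table batch — assuming each JSON object mirrors the `create_stream_table()` parameters shown in the README (`name`, `query`, `schedule`); the exact key names are defined by the API documentation:

```sql
-- Create two stream tables atomically; if either definition fails,
-- neither is created. Table names and queries are examples.
SELECT pgtrickle.bulk_create('[
  {"name": "active_orders",
   "query": "SELECT * FROM orders WHERE status = ''active''",
   "schedule": "30s"},
  {"name": "order_totals",
   "query": "SELECT customer_id, sum(amount) AS total FROM orders GROUP BY customer_id",
   "schedule": "1m"}
]');
```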
- Enhanced diagnostics — `pgtrickle.explain_st()` now shows refresh timing statistics (min/max/average duration), partition info for partitioned source tables, and a dependency graph you can render with Graphviz.
- Join strategy override — the `merge_join_strategy` setting lets you force a specific join method (`hash_join`, `nested_loop`, or `merge_join`) during delta merges, which can help when the automatic heuristic doesn't suit your workload.
- Pre-flight delta estimation — when `max_delta_estimate_rows` is set, pg_trickle counts the delta rows before merging. If the count exceeds the limit, it falls back to a FULL refresh and logs a notice, preventing out-of-memory conditions on unexpectedly large change sets.
- Spill-aware refresh — if differential merges spill to disk repeatedly (controlled by `spill_threshold_blocks` and `spill_consecutive_limit`), the scheduler switches to FULL refresh automatically.
- Stuck watermark hold-back — the `watermark_holdback_timeout` setting detects watermarks that have not advanced within a configurable window. Downstream stream tables are paused and a `watermark_stuck` notification is emitted until the watermark advances again.
- Cascade drop — `drop_stream_table()` now accepts an optional `cascade` parameter (default `true`). Setting it to `false` raises an error if dependent stream tables exist, matching PostgreSQL's RESTRICT behavior.
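For example (the stream table name is illustrative):

```sql
-- RESTRICT-style drop: raises an error if any downstream
-- stream table still depends on active_orders
SELECT pgtrickle.drop_stream_table('active_orders', cascade => false);
```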
- Nexmark benchmark suite — a 10-query streaming benchmark (modelled on an online auction system) validates correctness under sustained high-frequency inserts, updates, and deletes.
- 17 new end-to-end tests — 7 tests for multi-level stream-table chains (3- and 4-level cascades with mixed refresh modes) and 10 tests for diamond/fan-in topologies with IMMEDIATE mode. No deadlocks were found.
Terminal Dashboard (TUI)
- Write actions — refresh, pause, resume, repair, reset fuse, and gate/ungate operations can now be performed without leaving the dashboard.
- Command palette — press `:` for fuzzy-matched command entry with tab-completion.
- Automatic reconnection — the dashboard reconnects with exponential back-off (up to 15 s) after a connection loss, with a visual indicator.
- Richer views — all 14 views now show additional live data (diagnostics, CDC health, refresh history with row-delta counts, error remediation hints, dependency-graph annotations, worker queue status, and watermark alignment).
- Cross-view filtering — the `/` search filter now persists across all 10 list views.
- Navigation re-fetch — moving between rows in the Detail view immediately fetches fresh data for the selected table.
- Toast messages — write actions show confirmation and error toasts.
- Sort cycling — press `s`/`S` on the Dashboard to cycle through 6 sort modes.
- Mouse support — `--mouse` enables scroll-wheel navigation.
- Theme toggle — `t` or `--theme dark|light` switches colour themes.
- JSON export — `Ctrl+E` or `:export` writes the current view to a file.
- TLS support — `--sslmode` and `--sslrootcert` flags.
Documentation & Ecosystem
- Flyway / Liquibase guide — migration patterns for versioned and repeatable migrations, rollback blocks, and CI environments.
- SQLAlchemy / Django guide — read-only model patterns, write-blocking safeguards, DRF viewsets, and freshness checking.
- dbt Hub readiness — the `dbt-pgtrickle` package is version-synced and ready for dbt Hub submission.
- Kubernetes / CNPG — updated probe configuration and a new deployment section in the Getting Started guide.
- Full documentation review — configuration reference expanded from 23 to 40+ settings, missing SQL reference entries filled in, outdated FAQ answers corrected.
Internal Improvements
- Parser modularisation — the 21,000-line query parser has been split into 5 focused sub-modules (`types`, `validation`, `rewrites`, `sublinks`, and the main entry point). No behavior change — all 1,687 unit tests pass.
- Unsafe audit — every `unsafe` block in the codebase (~750 total) now has a `// SAFETY:` comment explaining why it is sound.
- Shared-memory cache RFC — an RFC for a DSM-based MERGE template cache has been written, informing the v0.16.0 implementation plan.
- TRUNCATE handling verified — TRUNCATE on source tables in trigger CDC mode already triggers a FULL refresh; this is now documented.
- JOIN key-change fix verified — the v0.14.0 correctness fix for simultaneous JOIN key updates and DELETEs has been verified working and the former known-limitation note replaced with a description of the fix.
Bug Fixes
- Fixed a panic in the TUI when deserializing health-check data that returned 64-bit integers where 32-bit was expected.
- Fixed spurious "Error: db error" toasts in the TUI Detail view — background queries now degrade silently instead of surfacing transient errors.
- Fixed incorrect integer type annotations in two E2E tests for IMMEDIATE mode diamond topologies.
[0.14.0] — 2026-04-02
0.14.0 is the Tiered Scheduling, Diagnostics & TUI release. It gives you fine-grained control over how often each stream table refreshes, adds tools that recommend the best refresh strategy for your workload, introduces a full-screen terminal dashboard for managing stream tables without SQL, and includes important security and reliability fixes.
Terminal Dashboard (TUI)
A new `pgtrickle` command-line tool lets you monitor and manage stream tables
from a terminal — no SQL required. Run it with no arguments to launch a
live-updating full-screen dashboard (think `htop` for stream tables), or use
one-shot subcommands like `pgtrickle list`, `pgtrickle status`, or
`pgtrickle refresh` for scripting and CI.
The interactive dashboard includes:
- Live overview — stream table statuses, refresh timing, and issue counts update every 2 seconds, with color-coded health indicators.
- Dependency graph — see how stream tables relate to each other in an ASCII tree view.
- Diagnostics — view refresh mode recommendations with confidence levels.
- CDC health — monitor change buffer sizes with warnings when they grow too large.
- Alert feed — real-time notification display with severity levels.
- Issue detection — automatically spots broken dependency chains, growing buffers, blown fuses, and stale data, with a persistent badge showing the issue count from any view.
- Watch mode — `pgtrickle watch` provides continuous non-interactive output suitable for log aggregation.
- Output formats — all CLI subcommands support `--format json`, `--format csv`, and human-readable table output.
See docs/TUI.md for the full user guide.
Tiered Refresh Scheduling
Stream tables can now be assigned to refresh tiers — hot, warm, cold, or frozen — to control how frequently they refresh:
- Hot (default) — refreshes at the configured interval.
- Warm — refreshes at 2× the interval.
- Cold — refreshes at 10× the interval, ideal for infrequently accessed reports.
- Frozen — pauses automatic refresh entirely until promoted back.
Assign a tier with `ALTER STREAM TABLE ... SET (tier = 'cold')`. A NOTICE is
emitted when demoting from Hot to Cold or Frozen so operators are aware of the
change in refresh frequency.
Smarter Refresh Recommendations
Two new diagnostic functions help you choose the most efficient refresh strategy for each stream table:
- `pgtrickle.recommend_refresh_mode(name)` — analyzes seven workload signals (including change frequency, timing history, query complexity, table size, index coverage, and latency patterns) and recommends FULL or DIFFERENTIAL mode with a confidence level and plain-language explanation. Useful when you're unsure which mode will be faster for a particular table.
- `pgtrickle.refresh_efficiency(name)` — shows per-table refresh performance: how many FULL vs. DIFFERENTIAL refreshes have run, average timing for each, and the speedup factor. Good for monitoring dashboards and alerting.
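Typical usage might look like this sketch — it assumes both functions return row sets (if either returns a single scalar, call it in the select list instead), and the stream table name is an example:

```sql
-- Ask the engine which refresh mode it would pick, and why
SELECT * FROM pgtrickle.recommend_refresh_mode('active_orders');

-- Then compare observed FULL vs. DIFFERENTIAL performance over time
SELECT * FROM pgtrickle.refresh_efficiency('active_orders');
```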
A new tutorial — Tuning Refresh Mode — walks through the process step by step.
Reduced Write Overhead with UNLOGGED Buffers
Enable `pg_trickle.unlogged_buffers = true` and newly created change buffer
tables will skip write-ahead logging, reducing WAL volume by roughly 30%.
This is ideal for workloads where you can tolerate a full re-sync after a
crash (the extension detects the crash and re-syncs automatically).
A utility function — `pgtrickle.convert_buffers_to_unlogged()` — converts
existing buffers in one call. Run it during a maintenance window since it
briefly locks each buffer table.
Instant Error Detection
Previously, when a stream table's refresh hit a permanent error (for example,
a function that doesn't exist for the column type), the extension would retry
several times before giving up. Now it recognizes permanent errors immediately,
sets the stream table status to ERROR with a clear error message, and
stops retrying. You can see the error at a glance in the stream_tables_info
view or the TUI dashboard, and fix it by altering the stream table's query.
Security Hardening
- CDC trigger functions now use `SECURITY DEFINER` — change-data-capture trigger functions run with the privileges of the extension owner rather than the current user, preventing privilege escalation through modified search paths.
- Explicit `SET search_path` — all CDC trigger functions now set `search_path` to `pgtrickle_changes, pg_catalog` to prevent search-path manipulation attacks.
Other Improvements
- Export definitions — `pgtrickle.export_definition(name)` exports a stream table's full configuration as reproducible SQL (`DROP` + `CREATE` + `ALTER` statements), making it easy to version-control or migrate stream table definitions between environments.
- Creation-time warnings — when creating a stream table with aggregates like `MIN`, `MAX`, or `STRING_AGG` in DIFFERENTIAL mode, a warning now suggests that FULL or AUTO mode may be more efficient. For algebraic aggregates (`SUM`/`COUNT`/`AVG`), the warning only appears when the estimated number of groups is below a configurable threshold.
- Simplified settings — the `merge_planner_hints` and `merge_work_mem_mb` settings have been consolidated into a single `planner_aggressive` switch. The old setting names are still accepted but have no effect; the new switch takes over.
- GHCR Docker image — a multi-architecture Docker image (`ghcr.io/grove/pg_trickle`) with PostgreSQL 18.3 and pg_trickle pre-installed is now published automatically on each release.
- Pre-deployment checklist — new PRE_DEPLOYMENT.md with a 10-point checklist for production deployments.
- Best-practice patterns guide — new PATTERNS.md with 6 common patterns: Bronze/Silver/Gold materialization, event sourcing, slowly-changing dimensions, high-fan-out topology, real-time dashboards, and tiered refresh strategies.
- Keyless dedup fix — replaced `MAX(col)` with `(array_agg(col))[1]` for deduplicating keyless scan results, which is more correct for non-orderable types.
Bug Fixes
- ST-on-ST differential refresh — manually refreshing a stream table that reads from another stream table now uses true incremental (DIFFERENTIAL) refresh instead of falling back to a full re-scan. This matches the behavior of the automatic scheduler and is significantly faster for large tables.
- Staleness tracking — the staleness indicator now uses the actual last refresh time instead of an internal data timestamp, making the `pg_stat_stream_tables` view more accurate.
Testing & Reliability
- Soak test — a new long-running stability test validates zero worker crashes, zero ERROR states, and stable memory usage under sustained mixed workload (configurable duration, default 10 minutes).
- Multi-database isolation test — verifies that two databases in the same PostgreSQL cluster run pg_trickle independently without interference.
- 140 TUI tests — comprehensive unit, snapshot, and interaction tests for the terminal dashboard.
- 23 mixed-object E2E tests — validates stream tables alongside regular PostgreSQL views, materialized views, and other objects.
- Scheduler race fixes — eliminated flaky test failures caused by scheduler timing races and a GUC leak between tests.
New SQL Functions
| Function | Purpose |
|---|---|
| `pgtrickle.recommend_refresh_mode(name)` | Workload-based refresh mode recommendation |
| `pgtrickle.refresh_efficiency(name)` | Per-table refresh performance metrics |
| `pgtrickle.export_definition(name)` | Export stream table as reproducible DDL |
| `pgtrickle.convert_buffers_to_unlogged()` | Convert logged change buffers to UNLOGGED |
New Settings
| Setting | Default | Purpose |
|---|---|---|
| `pg_trickle.planner_aggressive` | `true` | Consolidated switch for MERGE planner hints |
| `pg_trickle.unlogged_buffers` | `false` | Create new change buffers as UNLOGGED |
| `pg_trickle.agg_diff_cardinality_threshold` | `1000` | Warn about DIFFERENTIAL mode below this group count |
Deprecated
- `pg_trickle.merge_planner_hints` — use `pg_trickle.planner_aggressive` instead. Still accepted but ignored at runtime.
- `pg_trickle.merge_work_mem_mb` — same; use `planner_aggressive` instead.
Upgrading
Run `ALTER EXTENSION pg_trickle UPDATE;` after installing the new binaries.
The upgrade adds new catalog columns, functions, and the TUI workspace member.
No breaking changes — everything from v0.13.0 continues to work. See
UPGRADING.md for details.
[0.13.0] — 2026-03-31
0.13.0 is the Scalability Foundations release. It makes pg_trickle handle large tables, complex queries, and multi-tenant deployments much more efficiently — and it achieves a major milestone: all 22 TPC-H benchmark queries now run in incremental (DIFFERENTIAL) mode, meaning the engine no longer needs to fall back to slow full-refresh for any standard analytical query pattern.
Smarter Change Detection for Wide Tables
When you UPDATE a few columns in a large table — say, changing a status
column in a 60-column table — pg_trickle used to treat every column as
potentially changed, doing extra work to keep all downstream views up to date.
Now it knows the difference. Columns used in GROUP BY, JOIN, or WHERE clauses are "key columns"; everything else is a "value column." When only value columns change, the engine takes a shortcut: it sends a single correction row instead of a full delete-and-reinsert pair. For wide-table workloads, this can cut the volume of data processed by 50% or more.
Shared Change Buffers
If you have several stream tables watching the same source table, each one used to maintain its own private copy of the change log. That's wasteful. Now they share a single change buffer per source, and each consumer simply tracks how far it has read. The slowest reader protects the buffer for everyone.
You can see how this is working with the new `pgtrickle.shared_buffer_stats()`
function — it shows each buffer, who's reading from it, how many rows are
queued, and whether it's been automatically partitioned for performance.
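For example (assuming the function returns one row per shared buffer):

```sql
-- One row per shared change buffer: its consumers, queued row count,
-- and whether it has been auto-partitioned
SELECT * FROM pgtrickle.shared_buffer_stats();
```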
Automatic Buffer Partitioning
Set `pg_trickle.buffer_partitioning = 'auto'` and pg_trickle will start with
simple, unpartitioned change buffers. If a buffer starts accumulating a lot of
rows (high-throughput sources), it automatically converts to a partitioned
layout where old data can be removed almost instantly instead of deleting rows
one by one.
More Partitioning Options for Stream Tables
Building on the RANGE partitioning added in v0.11.0, you can now partition stream tables in three additional ways:
- Multi-column keys — partition by a combination of columns (`partition_by='region,year'`)
- LIST partitioning — for low-cardinality columns like `status` or `type` (`partition_by='LIST:status'`)
- HASH partitioning — for even distribution across a fixed number of partitions (`partition_by='HASH:customer_id:8'`)
You can also change the partition key of an existing stream table at runtime
with `alter_stream_table(partition_by => ...)` — data is preserved
automatically. If rows land in the default (catch-all) partition, a WARNING
is emitted to prompt you to add explicit partitions.
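A sketch of the new specs in use — the `partition_by` strings come from this release, but passing `partition_by` to `create_stream_table()`, the positional name argument to `alter_stream_table()`, and the table/column names are assumptions for illustration:

```sql
-- LIST partitioning on a low-cardinality column
SELECT pgtrickle.create_stream_table(
  name         => 'orders_by_status',
  query        => 'SELECT * FROM orders',
  schedule     => '1m',
  partition_by => 'LIST:status'
);

-- Re-key an existing stream table to HASH partitioning at runtime;
-- existing data is preserved automatically
SELECT pgtrickle.alter_stream_table('orders_by_status',
  partition_by => 'HASH:customer_id:8');
```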
All 22 TPC-H Queries Now Run Incrementally
The DVM (differential view maintenance) engine received its most significant set of improvements yet, targeting the complex multi-table join patterns found in standard analytical benchmarks:
- Smarter pre-image lookups — instead of reconstructing what the data looked like before a change by subtracting deltas (expensive for large tables), the engine now uses targeted index lookups that only touch the rows that actually changed.
- Predicate pushdown — WHERE conditions from the original query are now pushed into the delta computation, preventing unnecessary cross-products in multi-table joins.
- Deep-join optimizations — queries joining 5+ tables get automatic planner hints (more memory, smarter join strategies) to avoid spilling to disk.
- Scan-count-aware strategy selector — queries that exceed configurable join complexity or delta volume thresholds automatically fall back to full refresh on a per-query basis rather than failing.
The result: all 22 TPC-H queries pass at SF=0.01 in DIFFERENTIAL mode
with zero drift across 3 refresh cycles. The DIFFERENTIAL_SKIP_ALLOWLIST
(queries that previously required full refresh) is now empty.
Refresh Performance Inspection Tools
Two new functions help you understand what pg_trickle is doing under the hood:
- `pgtrickle.explain_delta(name, format)` — shows you the query plan for the auto-generated delta SQL, the same way `EXPLAIN` works for regular queries. Available in text, JSON, XML, or YAML format.
- `pgtrickle.dedup_stats()` — reports how often concurrent writes produce duplicate entries that need pre-processing before the MERGE step.
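For example (the stream table name is illustrative, and `dedup_stats()` is assumed to return a row set):

```sql
-- Inspect the plan for the auto-generated delta SQL, EXPLAIN-style
SELECT pgtrickle.explain_delta('active_orders', 'text');

-- How often concurrent writes force pre-MERGE deduplication
SELECT * FROM pgtrickle.dedup_stats();
```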
Multi-Tenant Worker Quotas
New setting: `pg_trickle.per_database_worker_quota` — if you run many
databases on one PostgreSQL cluster, this prevents a busy database from
monopolizing all the refresh workers. Workers are assigned by priority
(immediate-mode tables first, then hot, warm, and cold), with burst capacity
up to 150% when other databases are idle.
TPC-H Benchmark Harness
You can now measure refresh performance across all 22 TPC-H queries in a
structured way. Run `just bench-tpch` to get per-query timing, FULL vs.
DIFFERENTIAL comparison, and P95 latency numbers. Five synthetic benchmarks
(q01, q05, q08, q18, q21) also measure the pure Rust delta-SQL
generation time without needing a database.
Broader SQL Support
- `IS JSON` predicates (PG 16+) — expressions like `expr IS JSON OBJECT` now work in incremental mode.
- SQL/JSON constructors (PG 16+) — `JSON_OBJECT(...)`, `JSON_ARRAY(...)`, `JSON_OBJECTAGG(...)`, and `JSON_ARRAYAGG(...)` are now accepted.
- Recursive CTEs — recursive queries with non-monotone operators (like `EXCEPT`) correctly fall back to full refresh instead of producing wrong results.
dbt Integration Updates
If you use dbt-pgtrickle, you can now set partitioning and fuse options directly from dbt model config:
- `{{ config(partition_by='customer_id') }}` for partitioned stream tables
- `{{ config(fuse='auto', fuse_ceiling=100000, fuse_sensitivity=3) }}` for circuit-breaker protection
Bug Fixes
- Scheduler cascade fix — stream tables downstream of FULL-mode upstream tables now detect changes correctly via a `last_refresh_at` fallback, preventing stale data in chains where the upstream uses full refresh.
- SUM(CASE WHEN ...) drift fix — aggregate expressions using CASE were occasionally producing slightly wrong incremental results; these are now correctly detected and processed via a group rescan.
- Duplicate column DDL fix — removed a duplicate column definition in the `pgt_stream_tables` DDL that could cause issues on fresh installs.
Testing Improvements
- New regression test suite targeting 9 structural weaknesses: join multi-cycle correctness (7 tests), differential-equals-full equivalence (11 tests), DVM operator execution, failure recovery, and MERGE template unit tests.
- E2E test infrastructure now uses template databases, cutting per-test setup time significantly.
New SQL Functions
| Function | Purpose |
|---|---|
| `pgtrickle.explain_delta(name, format)` | Show the query plan for the delta SQL |
| `pgtrickle.dedup_stats()` | MERGE deduplication frequency counters |
| `pgtrickle.shared_buffer_stats()` | Per-source change buffer status |
| `pgtrickle.explain_refresh_mode(name)` | Why a stream table uses its current refresh mode |
| `pgtrickle.reset_fuse(name)` | Reset a blown circuit-breaker fuse |
| `pgtrickle.fuse_status()` | Fuse state across all stream tables |
New Catalog Columns
Ten new columns on `pgtrickle.pgt_stream_tables`:
| Column | Purpose |
|---|---|
| `effective_refresh_mode` | The actual refresh mode after AUTO resolution |
| `fuse_mode` | Circuit-breaker configuration (off / auto / manual) |
| `fuse_state` | Current fuse state (armed / blown) |
| `fuse_ceiling` | Maximum change count before fuse blows |
| `fuse_sensitivity` | Consecutive cycles above ceiling before triggering |
| `blown_at` | When the fuse last blew |
| `blow_reason` | Why the fuse blew |
| `st_partition_key` | Partition key specification |
| `max_differential_joins` | Maximum join count for differential mode |
| `max_delta_fraction` | Maximum delta-to-table ratio for differential mode |
Upgrading
Run `ALTER EXTENSION pg_trickle UPDATE;` after installing the new binaries.
All new columns and functions are added automatically. No breaking changes —
everything from v0.12.0 continues to work as before. See
UPGRADING.md for details.
[0.12.0] — 2026-03-28
0.12.0 is a correctness, reliability, and developer-experience release built on top of 0.11.0's major new features. It closes the last known wrong-answer bugs for complex join queries, adds tools to help you understand and debug stream table behavior, hardens the scheduler against several edge cases that could cause stale data or crashes, and backs it all with thousands of new automatically generated tests.
Stale Rows Fixed in Stream-Table Chains
What was the problem? When a stream table (B) reads from another stream table (A), each change in A is recorded as a small "what changed" entry — a row added or removed. But the identity key used for those entries was computed differently inside the change buffer than it was inside B's own storage. As a result, when A changed via an upstream UPDATE, B's refresh could silently fail to delete the old version of a row, leaving a stale duplicate.
What changed? The change buffer now computes row identity the same way B does — using a hash of all the data columns rather than the upstream source's primary key. Stale rows after UPDATE no longer appear in stream-table chains. This bug was found and confirmed by the new property-based test suite (see below).
Phantom Rows Fixed for Complex Joins (TPC-H Q7 / Q8 / Q9)
What was the problem? When a stream table's query joins three or more tables together and rows are deleted from more than one join side at the same time, the incremental engine could silently drop the correction — leaving rows in the stream table that should have been removed.
This affected TPC-H queries Q7, Q8, and Q9 (which all involve deep join trees), and any user query with a similar multi-table join structure. A temporary workaround (falling back to full refresh for wide joins) was in place since v0.11.0 and has now been lifted.
What changed? The incremental engine now takes an individual "before snapshot" for each leaf table in the join tree — each one cheaply computed from a single-table comparison — and re-joins them after the delete. This avoids writing multi-gigabyte temp files to disk (the root cause of the original workaround) and eliminates the phantom-row bug entirely. Q7, Q8, and Q9 now run in differential mode without any workarounds.
Type Errors Fixed in Parallel Refresh Chains
What was the problem? When a chain of stream tables is fused into a single execution unit for efficiency (the "bypass" optimisation added in v0.11.0), the internal bypass table used `text` for every column regardless of the actual column type. This caused an `operator does not exist: text > integer` error whenever a downstream stream table had a type-sensitive WHERE clause (e.g. `WHERE amount > 100`), making the parallel worker tests fail across all topologies that included a fused chain.
What changed? Bypass tables now use the real column types. The six parallel-worker benchmark tests now complete in 9–26 seconds rather than timing out after 120 seconds.
Scheduler Fixes for Diamond and ST-on-ST Topologies
Two scheduler bugs that caused incorrect refresh behavior with complex dependency graphs were fixed:
- Diamond timeout. In a diamond topology (A → B, A → C, B+C → D), the L1 arm stream tables (B and C) were created with a 1-minute fixed interval rather than a calculated schedule, so D never received updates within the test window. The scheduler also had a bug loading stream table records by ID that caused silent failures in parallel worker paths. Both are fixed.
- ST-on-ST parallel workers. When an upstream stream table changed, the parallel worker paths (singleton, atomic group, immediate closure, fused chain) were not forcing a full refresh on downstream stream tables the way the main scheduler loop did. This could leave downstream tables stale. The fix ensures all parallel paths treat upstream stream-table changes the same way.
Four New Diagnostic Functions
When stream table behavior is unexpected — wrong refresh mode, a query being rewritten in a surprising way, persistent errors — it previously required reading server logs or source code to understand why. Four new SQL functions expose that internal state directly in queries:
- `pgtrickle.explain_query_rewrite(query TEXT)` — shows exactly how pg_trickle rewrites your query for incremental refresh: which operators were applied, how delta keys are injected, and how aggregates are classified. Useful for understanding why a query got a particular refresh mode.
- `pgtrickle.diagnose_errors(name TEXT)` — shows the last 5 errors for a stream table, each classified by type (correctness, performance, configuration, infrastructure) with a suggested fix.
- `pgtrickle.list_auxiliary_columns(name TEXT)` — lists the internal `__pgt_*` columns that pg_trickle injects into a stream table's query plan, with an explanation of each one's purpose. Helpful when `SELECT *` returns unexpected extra columns.
- `pgtrickle.validate_query(query TEXT)` — analyses a SQL query and reports which refresh mode it would get, which SQL constructs were detected, and any warnings — all without creating a stream table.
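For example, a pre-flight check on a candidate query might look like this (a sketch using the function names above; the shape of each result set is defined by the extension):

```sql
-- Check how a query would be handled before creating a stream table
SELECT * FROM pgtrickle.validate_query(
  'SELECT region, SUM(amount) FROM orders GROUP BY region'
);

-- If an existing stream table misbehaves, ask for its recent errors
SELECT * FROM pgtrickle.diagnose_errors('active_orders');
```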
Multi-Column IN (subquery) Now Gives a Clear Error
What was the problem? A query like `WHERE (col_a, col_b) IN (SELECT x, y FROM …)` passed validation but produced silently wrong results — the engine was only matching on the first column and ignoring the second.
What changed? This construct is now detected at stream table creation time and rejected with a clear error message that recommends rewriting it as `EXISTS (SELECT 1 FROM … WHERE col_a = x AND col_b = y)`.
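The recommended rewrite, sketched on hypothetical table and column names:

```sql
-- Before (now rejected at creation time): multi-column IN (subquery)
-- WHERE (col_a, col_b) IN (SELECT x, y FROM other_table)

-- After: the equivalent EXISTS form that the engine supports
SELECT *
FROM base_table b
WHERE EXISTS (
  SELECT 1 FROM other_table o
  WHERE b.col_a = o.x AND b.col_b = o.y
);
```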
IMMEDIATE Mode Proven Correct Under High Concurrency
IMMEDIATE mode (where the stream table updates inside the same transaction as the source table change) now has a dedicated concurrency stress test: 100–120 concurrent transactions firing simultaneously against the same source table, across five scenarios (all inserts, all updates to distinct rows, all updates to the same row, all deletes, and a mixed workload). Zero lost updates, zero phantom rows, and no deadlocks were observed in any run.
Protection Against Pathological Queries
A new guard prevents a particularly deep or convoluted query from consuming all available stack space and crashing the database backend. When the query analyser recurses more than 64 levels deep (configurable via `pg_trickle.max_parse_depth`), it now returns a clear `QueryTooComplex` error instead of crashing.
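If you have legitimately deep queries, the limit can presumably be raised (a sketch, assuming the GUC accepts a plain integer and is settable per session):

```sql
-- Raise the recursion limit for the query analyser (default: 64)
SET pg_trickle.max_parse_depth = 128;
```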
Tiered Scheduling Now On By Default
The tiered scheduling feature — which automatically slows down cold (infrequently-read) stream tables and speeds up hot ones — is now enabled by default. In large deployments this reduces the scheduler's CPU usage significantly. Stream tables you query often continue refreshing at full speed. Stream tables that nobody has read recently back off gracefully.
If you rely on all stream tables refreshing at the same rate regardless of
read frequency, set pg_trickle.tiered_scheduling = off.
Thousands of Automatically Generated Tests
Two new automated testing systems were added to complement the hand-written test suite:
- Property-based tests — the test framework automatically generates thousands of random DAG shapes, schedule combinations, and edge cases, and checks that the scheduler's ordering guarantees hold for all of them. If any configuration would cause a table to refresh in the wrong order or be spuriously suspended, these tests catch it.
- SQLancer fuzzing — SQLancer generates random SQL queries and checks that pg_trickle's incremental result matches the result of running the same query directly in PostgreSQL. Any mismatch is automatically saved as a permanent regression test, and a weekly CI job runs the fuzzer continuously. At the time of release, zero mismatches have been found.
CDC Write-Side Benchmark Published
A new benchmark suite measures the overhead that pg_trickle's change capture triggers add to your write workload. Results across five scenarios (single-row INSERT, bulk INSERT, bulk UPDATE, bulk DELETE, concurrent writers) are published in docs/BENCHMARK.md. Use these numbers to estimate the impact before deploying pg_trickle on a write-heavy table.
MERGE Template Validation at Test Startup
The SQL templates that pg_trickle generates for applying incremental changes
(the MERGE statements) are now validated with an EXPLAIN dry-run at every
test startup. If a code change accidentally produces a malformed MERGE
template, the tests catch it before any data is processed — rather than
manifesting as a cryptic runtime error.
[0.11.0] — 2026-03-26
This is the biggest release since the initial launch. The headline features are 34× lower latency for real-time workloads, stream-table chains that now refresh incrementally (no more forced full recomputation when one stream table feeds another), declarative partitioning to cut I/O on large tables by up to 100×, a ready-to-use Prometheus and Grafana monitoring stack, and a circuit breaker to protect production databases from runaway change bursts.
34× Lower Latency — Changes Arrive Instantly
Previously, the background worker woke up on a fixed timer every ~500 ms to check for new data, even when nothing had changed. Every change had to wait up to half a second in the change buffer before being processed.
Now, when a source table is modified, the change capture trigger immediately wakes the background worker via a PostgreSQL notification channel. The worker starts processing within ~15 ms of the write committing — a 34× improvement for low-volume workloads. Under heavy DML, a 10 ms debounce window coalesces rapid notifications so the worker isn't flooded.
Event-driven wake is on by default. You can turn it off (`pg_trickle.event_driven_wake = off`) to revert to poll-based wake, and you can tune the debounce window with `pg_trickle.wake_debounce_ms` (default 10).
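The two settings side by side — a sketch showing the documented defaults as session-level `SET`s:

```sql
-- Event-driven wake: trigger notifies the worker, ~15 ms wake latency
SET pg_trickle.event_driven_wake = on;   -- default: on
-- Debounce window that coalesces rapid notifications under heavy DML
SET pg_trickle.wake_debounce_ms = 10;    -- default: 10
```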
Stream-Table-to-Stream-Table Chains Now Refresh Incrementally
Previously, when stream table B's query read from stream table A, pg_trickle had to do a full recomputation of B every time A changed — even if only a few rows in A actually changed. For long chains (A → B → C → D), every hop was a full re-scan.
Now, stream tables can read from other stream tables incrementally. When A refreshes, the rows it added and removed are recorded in a change buffer just like a base table. B wakes up, reads only the changed rows from A, and applies a delta — not a full recomputation. Even when A does a full refresh (e.g. because its query does not support differential mode), a before/after snapshot diff is captured automatically so downstream tables still receive a small insert/delete delta rather than cascading full refreshes through the chain.
Declaratively Partitioned Stream Tables
Stream tables can now be declared with a partition key:
SELECT create_stream_table(
'monthly_sales',
$$ SELECT month, region, SUM(amount) FROM orders GROUP BY 1, 2 $$,
partition_by => 'month'
);
pg_trickle creates a range-partitioned storage table and, when refreshing, automatically restricts the MERGE operation to only the partitions that contain changed rows. For large tables where changes touch only 2–3 out of 100 monthly partitions, this can reduce the MERGE I/O from 10 million rows to ~100,000 — a 100× improvement.
Ready-to-Use Prometheus and Grafana Monitoring
A complete observability stack is now included in the monitoring/ directory:
- `monitoring/prometheus/pg_trickle_queries.yml` — drop-in configuration for `postgres_exporter` that exports 14 metrics covering refresh performance, CDC buffer sizes, staleness, error rates, and per-table status.
- `monitoring/prometheus/alerts.yml` — 8 alerting rules that page you when a stream table goes stale (> 5 min), starts error-looping (≥ 3 consecutive failures), is suspended, or when the CDC buffer exceeds 1 GB.
- `monitoring/grafana/dashboards/pg_trickle_overview.json` — a pre-built Grafana dashboard with six sections: cluster overview, refresh latency time-series, staleness heatmap, CDC lag, per-table drill-down, and scheduler health.
- `monitoring/docker-compose.yml` — brings up PostgreSQL + pg_trickle + postgres_exporter + Prometheus + Grafana with one command (`docker compose up`). Grafana opens at http://localhost:3000; the dashboard shows live metrics generated by a seed workload of stream tables continuously refreshing synthetic order and product data (see `monitoring/init/01_demo.sql`).
No code changes are needed to use this stack with an existing pg_trickle installation.
Circuit Breaker (Fuse) — Protection Against Runaway Change Bursts
A new circuit breaker mechanism halts refresh for a stream table when its pending change count exceeds a configurable threshold. This protects your database from accidental mass-delete scripts, runaway migrations, or data imports that would otherwise trigger an unexpectedly large and expensive refresh operation.
When the fuse blows, pg_trickle sends a `pgtrickle_alert` PostgreSQL notification that you can subscribe to, and suspends the affected stream table. You then choose how to recover using `reset_fuse()`:
- `reset_fuse(name, action => 'apply')` — process the backlog normally (default).
- `reset_fuse(name, action => 'reinitialize')` — clear the change buffer and repopulate the stream table from scratch.
- `reset_fuse(name, action => 'skip_changes')` — discard the pending changes and resume without reprocessing them.
Configure per-table with `alter_stream_table(fuse => 'on', fuse_ceiling => 10000)` or set a global default with `pg_trickle.fuse_default_ceiling`. Use `fuse_status()` to inspect the blown/active state of all stream tables at once.
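Putting the pieces together — a hedged sketch of a recovery workflow after an intentional bulk import blows a fuse (the table name is illustrative; function names are as documented above):

```sql
-- See which fuses have blown and why
SELECT * FROM pgtrickle.fuse_status();

-- The import was intentional: repopulate from scratch rather than
-- replaying millions of buffered changes
SELECT pgtrickle.reset_fuse('daily_revenue', action => 'reinitialize');
```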
Wider Column Bitmask — No More 63-Column Limit
pg_trickle's change capture tracks which columns were actually modified in each row so that stream tables that reference only a subset of columns can ignore irrelevant updates. Previously, this optimization silently stopped working for source tables with more than 63 columns — all updates were treated as touching every column.
The bitmask has been extended from a 64-bit integer to an arbitrary-width
PostgreSQL VARBIT value, removing the column count cap entirely. Existing
deployments are migrated automatically (the old column value becomes NULL,
which the filter treats conservatively — no rows are silently dropped). Tables
with fewer than 64 columns are unaffected at the data level.
Per-Database Worker Quotas
In multi-tenant environments where multiple databases share a single PostgreSQL instance, all stream-table refresh workers previously competed for the same concurrency pool. A single busy database could crowd out others.
A new GUC `pg_trickle.per_database_worker_quota` sets a soft concurrency limit per database. When the rest of the cluster is lightly loaded (< 80% of available capacity in use), a database can burst to 150% of its quota. When the cluster is busy, each database is held to its base quota.
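A minimal sketch of setting the quota cluster-wide, assuming the GUC takes an integer worker count (not stated explicitly above):

```sql
-- Soft cap: 4 concurrent refresh workers per database;
-- bursting to 150% (6 workers) is allowed when the cluster is idle
ALTER SYSTEM SET pg_trickle.per_database_worker_quota = 4;
SELECT pg_reload_conf();
```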
Refresh work is also now dispatched in priority order: IMMEDIATE mode tables → atomic diamond groups → singleton tables.
DAG Scheduling Performance
For deployments with chains of stream tables (A → B → C), several improvements reduce end-to-end propagation latency:
- Fused single-consumer chains. When a stream table chain has exactly one downstream consumer at each hop, the scheduler fuses the chain into a single execution unit in one background worker. Intermediate deltas are stored in temporary in-memory tables instead of persistent change buffers, eliminating the WAL writes, index maintenance, and cleanup that would normally occur at each hop.
- Batch coalescing. Before a downstream table reads from an upstream change buffer, redundant insert/delete pairs for the same row are cancelled out. This prevents rapid-fire upstream refreshes from accumulating duplicate work for downstream tables.
- Adaptive dispatch polling. The parallel dispatch loop now backs off exponentially (20 ms → 200 ms) instead of using a fixed 200 ms poll, and resets to 20 ms as soon as any worker finishes. Cheap refreshes no longer wait a full 200 ms for the next tick.
- Delta amplification warnings. When a differential refresh produces many more output rows than input rows (default threshold: 100×), a `WARNING` is emitted with the table name, input and output counts, and a tuning hint. `explain_st()` now exposes `amplification_stats` from the last 20 refreshes.
Smarter Diagnostics and Warnings
Several improvements to make problems visible earlier and easier to diagnose:
- Know which refresh mode is actually running. When a stream table is set to `AUTO`, pg_trickle now records which mode it actually chose at each refresh (`DIFFERENTIAL`, `FULL`, etc.) in a new `effective_refresh_mode` column on `pgt_stream_tables`. A new `explain_refresh_mode(name)` function reports the configured mode, the actual mode used, and the reason for any downgrade — all in one query.
- Clearer warning when a stream table falls back to full refresh. If a stream table cannot use differential mode, pg_trickle now emits a `WARNING` message naming the affected table and the reason. Previously this happened silently.
- Warning when using aggregates that require full group rescans. Aggregate functions like `STRING_AGG`, `ARRAY_AGG`, and `JSON_AGG` require re-aggregating the entire group whenever any member changes. pg_trickle now warns at stream table creation time when such aggregates are used in `DIFFERENTIAL` mode, and `explain_st()` classifies each aggregate's maintenance strategy (incremental, auxiliary-state, or group-rescan) so you can understand the cost.
- Better error messages. Errors for unsupported query patterns, cycle detection, upstream schema changes, and query parse failures now include a `DETAIL` field explaining what went wrong and a `HINT` field suggesting how to fix it.
- Invalid parameter combinations are rejected at creation time. For example, using `diamond_schedule_policy='slowest'` without `diamond_consistency='atomic'` now produces a clear error at `create_stream_table`/`alter_stream_table` time rather than silently doing the wrong thing at refresh time.
- TopK queries validate their metadata on every refresh. Stream tables defined with `ORDER BY ... LIMIT N` now recheck that the stored LIMIT/OFFSET metadata still matches the actual query on each refresh. On mismatch, they fall back to a full refresh with a `WARNING` rather than silently producing wrong results.
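For instance, to see why a table configured as `AUTO` keeps downgrading (a sketch; the table name is taken from the partitioning example elsewhere in these notes):

```sql
-- Configured mode, actual mode used, and the reason for any downgrade
SELECT * FROM pgtrickle.explain_refresh_mode('monthly_sales');
```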
Safety and Reliability Improvements
- No more crashes from schema changes. If a source table's schema changes while a refresh is running (e.g. a column is dropped), pg_trickle now catches the error, emits a structured `WARNING` with the table name and error details, and continues refreshing all other stream tables. The scheduler never crashes due to an individual table's error.
- Failure injection tests. New end-to-end tests deliberately drop columns and tables mid-refresh to verify that the scheduler stays alive and other stream tables continue processing correctly.
- Safer defaults. Three default settings have been updated to reflect production-safe behavior:
  - `parallel_refresh_mode` now defaults to `'on'` (was `'off'`). Parallel refresh has been stable for several releases; serial mode is now opt-in.
  - `block_source_ddl` now defaults to `true`. An accidental `ALTER TABLE` on a source table while a stream table depends on it is now blocked by default, with clear instructions on how to temporarily disable the guard if needed.
  - The invalidation ring capacity has been doubled from 32 to 128 slots, reducing the risk of invalidation events being silently discarded under rapid DDL.
Getting Started Guide Restructured
docs/GETTING_STARTED.md has been reorganised into five progressive chapters:
- Hello World — create your first stream table and watch it update.
- Joins, Aggregates & Chains — multi-table dependencies and DAG patterns.
- Scheduling & Backpressure — controlling refresh frequency and auto-backoff.
- Monitoring In Depth — using the five key diagnostic functions and the Prometheus/Grafana stack.
- Advanced Topics — FUSE circuit breaker, partitioned stream tables, IMMEDIATE (in-transaction) IVM, and multi-tenant worker quotas.
TPC-H Correctness Gate Added to CI
Five queries derived from the TPC-H benchmark — covering single-table
GROUP BY, filter-aggregate, CASE WHEN inside SUM, a three-way join, and LEFT
OUTER JOIN with GROUP BY — now run in DIFFERENTIAL mode on every push to main
and daily. Any correctness mismatch between pg_trickle's incremental output and
plain PostgreSQL execution fails the CI build automatically.
Docker Hub Image Improvements
The Dockerfile.hub image that is published to Docker Hub has been expanded
with a comprehensive set of GUC defaults fine-tuned for production use. A new
just build-hub-image recipe builds the image locally for testing.
Bug Fixes
- Scheduler crash after event-driven wake was enabled. The background worker crashed immediately after startup when `event_driven_wake = on` (the default) because the `LISTEN` command was being issued outside of a transaction. Fixed by issuing `LISTEN` inside a short-lived SPI transaction at startup. (#296)
- Spurious full refresh for non-recursive CTEs. Stream tables containing `WITH` clauses that were not recursive (`WITH foo AS (SELECT ...)`) were being incorrectly forced to FULL refresh mode. Only truly recursive CTEs (`WITH RECURSIVE`) require this. Non-recursive CTEs now correctly use differential mode. (#298)
- `DISTINCT ON` inside a CTE body caused a parse error. When a stream table's defining query contained a `WITH` clause whose body used `DISTINCT ON (...)`, the DVM query analyser failed with a parse error. The `DISTINCT ON` clause is now rewritten before analysis so it no longer interferes. (#300)
- Full-refresh fallback warning now names the affected table. When pg_trickle falls back from differential to full refresh, the emitted `WARNING` now includes the stream table name and the reason, making it straightforward to identify which table you need to investigate. (#301)
[0.10.0] — 2026-03-25
The headline features of 0.10.0 are cloud deployment compatibility, query engine correctness fixes, refresh performance, and a friendlier `auto_backoff` developer experience. pg_trickle now works reliably behind PgBouncer — the connection pooler used by default on Supabase, Railway, Neon, and other managed PostgreSQL platforms. A broad set of correctness issues in the incremental query engine is fixed, and several performance optimizations cut refresh time for large tables and busy deployments.
auto_backoff Is Now Much Friendlier on Developer Machines
When `pg_trickle.auto_backoff = true` is enabled, the scheduler automatically slows down stream tables whose refresh cost exceeds their schedule budget — a good safeguard in production. This release makes the feature safe to use alongside short schedules (e.g. `'1s'`) in developer and CI environments:
- Trigger threshold raised from 80% → 95%. Backoff now only activates when a refresh consumes more than 95% of the schedule window. A 900 ms refresh on a 1-second schedule (90%) used to trigger backoff; it no longer does. EC-11 operator alerting continues to fire at 80% (unchanged), so you still get an early warning before the scheduler is actually stuck.
- Maximum slowdown reduced from 64× → 8×. In the worst case, a stream table's effective refresh interval is now capped at 8× its configured schedule (e.g. 8 seconds for a `'1s'` table) instead of 64 seconds. The cap self-heals immediately: a single on-time refresh resets the factor to 1×.
- Backoff events now emit `WARNING` instead of `INFO`. When the scheduler stretches or resets a stream table's effective interval, you will see a `WARNING` message in your PostgreSQL client, including the new effective interval — rather than a silent slowdown with no explanation.
- `auto_backoff` now defaults to on. With the above improvements in place, the feature is safe in all environments. New installations get CPU runaway protection out of the box. To restore the old opt-in behaviour, set `pg_trickle.auto_backoff = off`.
Works Behind PgBouncer
PgBouncer is the most popular PostgreSQL connection pooler. In "transaction mode" — the default setting on most cloud PostgreSQL platforms — it hands a fresh database connection to every transaction, which breaks anything that assumes the same connection stays open between calls (session locks, prepared statements). pg_trickle previously relied on both. This release makes pg_trickle work correctly in such deployments.
- Session locks replaced with row-level locking. The background scheduler now acquires a short-lived row-level lock on each stream table's catalog entry instead of a session-level advisory lock. Row-level locks are released automatically at transaction end — exactly what PgBouncer transaction mode requires. If a concurrent refresh is already running for a given stream table, the scheduler skips that cycle and retries, rather than blocking.
- New `pooler_compatibility_mode` option per stream table. Setting `pooler_compatibility_mode => true` when creating or altering a stream table disables prepared statements and NOTIFY emissions for that table. Leave it off (the default) if you're not behind a pooler — behaviour is unchanged from v0.9.0.
- PgBouncer tested end-to-end. A new automated test suite boots PgBouncer in transaction-pool mode alongside pg_trickle and exercises the full lifecycle — create, refresh, alter, drop — all through the pooler. Run with `just test-pgbouncer`.
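Enabling the option on an existing stream table might look like this (a sketch; assumes `alter_stream_table` accepts the parameter by name, mirroring `create_stream_table`):

```sql
-- Opt a stream table into pooler-safe behaviour
SELECT pgtrickle.alter_stream_table(
  name => 'active_orders',
  pooler_compatibility_mode => true
);
```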
Query Engine Correctness Fixes
Several SQL patterns that appeared to work correctly could produce wrong results silently under the incremental query engine. All of the following are now fixed:
- Recursive queries (`WITH RECURSIVE`) update correctly when rows are deleted. Recursive queries are used for organisation hierarchies, bill-of-materials roll-ups, graph traversals, and similar structures. In DIFFERENTIAL mode, deleting a row from the source previously caused a full recomputation (correct, but expensive — O(n)). Now pg_trickle uses the Delete-and-Rederive algorithm, updating only affected rows at O(delta) cost. Computed expressions like `ancestor.path || ' > ' || node.name` update correctly when any ancestor is renamed or moved.
- SUM over a FULL OUTER JOIN no longer returns 0 instead of NULL. When matched rows on both join sides transition to matched on one side only (creating null-padded rows), the incremental SUM formula previously returned 0 instead of NULL. pg_trickle now tracks how many non-null values exist in each group and produces the correct answer without any full-group rescan.
- Multi-source delta merging is now correct for diamond-shaped queries. A "diamond" topology is when two separate paths through the dependency graph both feed into the same stream table (e.g. table A → both B and C → D). Simultaneous changes on both paths could previously cause some corrections to be silently discarded, leaving D with wrong values. The engine now uses proper weight aggregation (Z-set algebra) so every correction is applied. Six property-based tests verify this for different diamond shapes.
- Statistical aggregates (CORR, COVAR, REGR_*) now update in constant time. All twelve SQL correlation and regression functions — `CORR`, `COVAR_POP`, `COVAR_SAMP`, and the nine `REGR_*` variants — now update incrementally using running totals (Welford-style accumulation) instead of rescanning the whole group. Each changed row is processed once regardless of group size.
- LATERAL subqueries only re-examine correlated rows. When data changes in the inner part of a LATERAL JOIN, pg_trickle previously re-ran the subquery for every row in the outer table. Now it re-runs it only for outer rows that actually correlate with the changed inner data, reducing work from proportional-to-table-size to proportional-to-changes.
- Materialized view sources now work in DIFFERENTIAL mode. Stream tables can use a PostgreSQL materialized view as their data source when `pg_trickle.matview_polling = on` is set. Changes are detected by comparing snapshots, the same mechanism used for foreign table sources.
- Six correctness bugs in the query rewriting engine fixed. These all involved edge cases in how the incremental engine translates SQL:
  - SQL comment fragments such as `/* unsupported ... */` that were being injected into generated SQL and causing runtime syntax errors are now replaced with clear extension-level errors.
  - When a column-rename step (e.g. `EXTRACT(year FROM orderdate) AS o_year`) sits between an aggregate and its source, GROUP BY and aggregate expressions now resolve correctly.
  - `EXCEPT` queries wrapped in a projection no longer silently lose their row multiplicity tracking.
  - A placeholder row identifier value of zero could collide with real row hashes; changed to a sentinel value (`i64::MIN`) outside the normal hash range.
  - Empty scalar subqueries now raise a clear error instead of silently emitting NULL.
- Change capture (CDC) fixes. The UPDATE trigger now correctly handles rows with NULL values in their primary key columns (previously those rows were silently dropped from the change buffer). WAL logical replication publications are automatically rebuilt when a source table is converted to partitioned after the publication was set up — previously this caused the stream table to silently stop updating. TRUNCATE followed by INSERT is handled atomically so post-TRUNCATE inserts are never lost.
Faster Refreshes
- Automatic covering index on stream table row IDs. Stream tables with eight or fewer output columns now automatically get a covering index with `INCLUDE (col1, col2, ...)` on the internal `__pgt_row_id` column. This lets the MERGE step use index-only scans — no heap lookups for matched rows — reducing refresh time by roughly 20–50% in small-delta / large-table scenarios.
- Change buffer compaction. When the pending change buffer grows beyond `pg_trickle.compact_threshold` (default 100,000 rows), pg_trickle compacts it before the next refresh cycle. INSERT→DELETE pairs that cancel each other out are eliminated; multiple sequential changes to the same row are collapsed to a single net change. This reduces delta scan overhead by 50–90% for high-churn tables, and uses `change_id` (not `ctid`) for safe operation under concurrent VACUUM.
- Tiered refresh scheduling. Large deployments can assign stream tables to one of four tiers: Hot (refresh at the configured interval), Warm (2× interval), Cold (10× interval), or Frozen (skip until manually promoted). Gate the feature with `pg_trickle.tiered_scheduling = on` (default off). Set per stream table via `ALTER STREAM TABLE ... SET (tier => 'warm')`. Frozen stream tables are entirely skipped by the scheduler until you promote them.
- Incremental dependency-graph updates. When a stream table is created, altered, or dropped, the internal dependency graph now updates only the affected entries instead of rebuilding the entire graph from scratch. This reduces the latency impact of DDL operations from roughly 50 ms to roughly 1 ms in deployments with 1,000+ stream tables.
- Smarter topo-sort caching inside a scheduler tick. The order in which stream tables are refreshed (topological order through the dependency graph) is now computed once per scheduler tick and reused across all internal callers, eliminating redundant work.
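A minimal sketch of opting into tiered scheduling (the table name is illustrative; the `SET (tier => ...)` form follows the syntax documented above):

```sql
-- Enable the feature, then demote a rarely-read table to the Cold tier
SET pg_trickle.tiered_scheduling = on;
ALTER STREAM TABLE quarterly_archive SET (tier => 'cold');
```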
Better Visibility Into What pg_trickle Is Doing
Several behaviours that previously happened silently now produce a short, actionable message at the moment they occur:
- `ORDER BY` without `LIMIT` warns you at creation time. Adding `ORDER BY` to a stream table's defining query without also adding `LIMIT` has no effect: stream table storage has no guaranteed row order. pg_trickle now emits a `WARNING` pointing you toward the TopK pattern or suggesting you remove the `ORDER BY`.
- `append_only` mode reversions are visible. When pg_trickle automatically exits append-only mode (because deletions or updates were detected in the source), the notice is now emitted at `WARNING` level (was `INFO`, normally suppressed) and also dispatched as a `pgtrickle_alert` notification.
- Cleanup failures escalate after 3 consecutive attempts. If the background worker fails to clean up a source table 3 times in a row, the message is promoted from `DEBUG1` (normally invisible) to `WARNING` so it appears in the server log.
- Diamond dependency with `diamond_consistency = 'none'` now advises you. When you create a stream table that forms a diamond in the dependency graph and explicitly set `diamond_consistency = 'none'`, a `NOTICE` advises you to consider `diamond_consistency = 'atomic'` for consistent cross-branch reads.
- `diamond_consistency` now defaults to `'atomic'`. New stream tables get atomic group semantics by default, meaning all branches of a diamond are refreshed together in a single savepoint before the convergence node is updated. This prevents a read from the convergence node seeing one branch partially updated and the other stale. To restore the old independent behavior, pass `diamond_consistency => 'none'` explicitly.
- Adaptive fallback is visible at the default log level. When a differential refresh falls back to a full refresh because the delta is too large, the message is now emitted at `NOTICE` level (the default `client_min_messages` threshold) instead of `INFO` (usually suppressed in the client session).
- `CALCULATED` schedule without downstream dependents warns you. When a stream table is created with `schedule = 'calculated'` but no existing stream table references it as a downstream dependent, a `NOTICE` explains that the schedule will fall back to `pg_trickle.default_schedule_seconds`.
- Internal `__pgt_*` auxiliary columns are now documented. The hidden columns that the refresh engine may add to stream table physical storage are described in a new section of SQL_REFERENCE.md. This covers all variants, from the always-present `__pgt_row_id` primary key through the aggregate-specific auxiliary columns for AVG, STDDEV, CORR, COVAR, REGR_*, window functions, and recursive CTE depth.
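With the new `'atomic'` default, the old independent behavior must be requested explicitly. A sketch of opting out (the table name and query are illustrative; the parameter names come from this changelog):

```sql
-- Hypothetical stream table; 'atomic' is now the default, so 'none'
-- must be passed explicitly to restore the pre-0.10 behavior.
SELECT pgtrickle.create_stream_table(
    name                => 'regional_totals',
    query               => 'SELECT region, sum(amount) AS total
                            FROM orders GROUP BY region',
    schedule            => '1m',
    diamond_consistency => 'none'
);
```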
Bug Fixes
- Scheduler no longer permanently misses stream tables created under a stale snapshot. `signal_dag_invalidation` is called inside the creating transaction before it commits. If the background scheduler happened to start a new tick and capture a catalog snapshot at that exact instant, the DAG rebuild query would not see the new stream table — yet the version counter was already advanced, so the scheduler would never rebuild again. The affected stream table would then never be scheduled for refresh. Fixed by verifying that every invalidated `pgt_id` is present in the rebuilt DAG after each rebuild. If any are missing, the scheduler signals a full rebuild for the next tick (which starts a fresh transaction that sees all committed data) rather than accepting the stale version. Fixes CI test `test_autorefresh_diamond_cascade`.
Upgrade Notes
- New catalog columns. The 0.9.0 → 0.10.0 upgrade migration adds `pooler_compatibility_mode BOOLEAN` and `refresh_tier TEXT` to `pgt_stream_tables`. Run `ALTER EXTENSION pg_trickle UPDATE TO '0.10.0'` after replacing the extension files. Verification script: `scripts/check_upgrade_completeness.sh`.
- Hidden auxiliary columns for statistical aggregates. Stream tables using `CORR`, `COVAR_POP`, `COVAR_SAMP`, or any `REGR_*` aggregate will get hidden `__pgt_aux_*` columns when created or altered under 0.10.0. These are invisible to normal queries (excluded by the `NOT LIKE '__pgt_%'` convention) and managed automatically.
- `pooler_compatibility_mode` is off by default. Existing stream tables are unaffected. Enable it only for stream tables accessed through PgBouncer transaction-mode pooling.
Additional Bug Fixes (2026-03-24)
Scheduler stability:
- Scheduler no longer crashes when concurrent refreshes compete. The internal function that decides whether to skip a refresh cycle was running a locking query outside a transaction boundary, which PostgreSQL strictly requires. It now runs inside a proper subtransaction, eliminating the crash.
- Auto-backoff no longer causes a transaction conflict in the background worker. When the auto-backoff feature stretches a stream table's refresh interval, it previously tried to open a new transaction inside the background worker's already-open transaction. PostgreSQL does not allow this nesting; the code path is now restructured to avoid it.
Query engine correctness:
- Queries that filter on hidden columns now produce correct results. For example, `SELECT name FROM users WHERE internal_id > 5` — where `internal_id` is not part of the output — could return wrong rows during incremental updates. Fixed.
- JOIN results are correct when both joined tables change at the same time. Simultaneous changes to two stream tables connected by a JOIN could leave the output with stale or duplicated rows. Fixed.
- `NULLIF(a, b)` expressions now work in incremental queries. `NULLIF` returns NULL when its two arguments are equal. It was not recognised by the incremental parser, causing a fallback error. Fixed.
- `LIKE` and `ILIKE` pattern matching now work in filter conditions. Filter expressions such as `WHERE name LIKE 'A%'` or `WHERE description ILIKE '%widget%'` were not handled by the incremental engine. Fixed.
- Subqueries with `ORDER BY`, `LIMIT`, or `OFFSET` are now preserved correctly. When the incremental engine reconstructed a subquery, those clauses were silently dropped. The incremental result no longer differs from a full refresh for such queries.
- Scalar subqueries using `LIMIT` or `OFFSET` are now handled gracefully. Rather than producing a runtime error, the engine falls back to a full refresh for those cases and continues.
SQL parser:
- Wildcard column references (`table.*`) now work for qualified names. A two- or three-part column reference such as `schema.table.*` or `alias.*` caused a parser crash. Fixed.
Change capture and WAL:
- State transitions no longer stall when the WAL replication slot is behind. When a stream table moves through the TRANSITIONING state, pg_trickle now advances the WAL replication slot up-front. This eliminates a lag-check stall that could cause the transition to hang indefinitely under write-heavy workloads.
Security:
- Several low-severity code quality and security scanner alerts from Semgrep and CodeQL are resolved. No user-visible behaviour changes.
[0.9.0] — 2026-03-20
The headline feature of 0.9.0 is incremental aggregate maintenance: when a single row changes inside a group of 100,000 rows, pg_trickle no longer has to re-scan all 100,000 rows to update COUNT, SUM, AVG, STDDEV, or VAR results. Instead it keeps running totals and adjusts them in constant time. Only MIN/MAX still needs a rescan — and only when the deleted value happens to be the current extreme.
Beyond aggregates, this release contains a broad set of performance optimizations that reduce wasted I/O during every refresh cycle, two new configuration knobs, a refresh-group management API, and several bug fixes.
Faster Aggregates
- Constant-time COUNT, SUM, AVG: Changed rows are now applied algebraically (`new_sum = old_sum + inserted − deleted`) instead of re-aggregating the whole group. AVG uses hidden auxiliary SUM and COUNT columns maintained automatically on the stream table.
- Constant-time STDDEV and VAR: Standard-deviation and variance aggregates (`STDDEV_POP`, `STDDEV_SAMP`, `VAR_POP`, `VAR_SAMP`) now use a sum-of-squares decomposition with a hidden auxiliary column, achieving the same constant-time update as COUNT/SUM/AVG.
- MIN/MAX safety guard: Deleting the row that currently holds the minimum (or maximum) value correctly triggers a rescan of that group. Property-based tests verify this boundary.
- Floating-point drift reset: A new setting (`pg_trickle.algebraic_drift_reset_cycles`) periodically forces a full recomputation to correct any floating-point rounding drift that accumulates over many incremental cycles.
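The sum-of-squares decomposition behind constant-time variance can be illustrated in plain SQL (illustrative only — the real auxiliary columns are hidden and maintained by the refresh engine):

```sql
-- VAR_SAMP(x) = (sum(x*x) - sum(x)^2 / n) / (n - 1), so keeping n,
-- sum(x), and sum(x*x) per group is enough to update the variance in
-- constant time when a single row is inserted or deleted.
SELECT (sum_sq - sum_x * sum_x / n) / (n - 1) AS var_samp
FROM (
    SELECT count(*)::numeric AS n,
           sum(x)            AS sum_x,
           sum(x * x)        AS sum_sq
    FROM (VALUES (1.0), (2.0), (3.0)) AS t(x)
) s;
-- equals var_samp(x) = 1.0 for these three rows
```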
Smarter Refresh Scheduling
- Automatic backoff for overloaded streams: The `pg_trickle.auto_backoff` GUC was introduced here (default off at the time). See the v0.10.0 entry for the improved thresholds, reduced cap, and the flip to `on` by default.
- Index-aware MERGE: A new threshold setting (`pg_trickle.merge_seqscan_threshold`, default 0.001) tells PostgreSQL to use an index lookup instead of a full table scan when only a tiny fraction of the stream table's rows are changing.
Less Wasted I/O
- Skip unchanged columns: The scan operator now checks the CDC trigger's per-row bitmask to skip UPDATE rows where none of the columns your query actually uses were modified. For wide tables where you only reference a few columns, most UPDATE processing is eliminated.
- Skip unchanged sources in joins: When a multi-source join query has three source tables but only one of them changed, the delta branches for the two unchanged sources are now replaced with `FALSE` at plan time. PostgreSQL's planner recognises those branches as empty and skips them entirely.
- Push WHERE filters into the change scan: If your stream table's defining query has a WHERE clause (e.g. `WHERE status = 'shipped'`), that filter is now applied immediately after reading the change buffer — before rows enter the join or aggregate pipeline. Rows that don't match the filter are discarded right away.
- Faster DISTINCT counting: The per-row multiplicity lookup for `SELECT DISTINCT` queries now uses an index-driven scalar subquery instead of a LEFT JOIN, guaranteeing I/O proportional to the number of changed rows regardless of stream table size.
- Scalar subquery short-circuit: When a scalar subquery's inner source has no changes in the current cycle, the expensive outer-table snapshot reconstruction is skipped entirely.
Refresh Group Management
- New SQL functions for grouping stream tables that should always be refreshed together (cross-source snapshot consistency):
  - `pgtrickle.create_refresh_group(name, members, isolation)`
  - `pgtrickle.drop_refresh_group(name)`
  - `pgtrickle.refresh_groups()` — lists all declared groups.
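A usage sketch (the group name, member names, and the isolation value are illustrative — only the function signatures above come from this release):

```sql
-- Group two stream tables so they always refresh together under one snapshot
SELECT pgtrickle.create_refresh_group(
    name      => 'daily_rollups',
    members   => ARRAY['orders_summary', 'revenue_by_region'],
    isolation => 'snapshot'
);

-- Inspect all declared groups
SELECT * FROM pgtrickle.refresh_groups();
```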
Bug Fixes
- Fixed a crash when internal status queries failed: The `source_gates()` and `watermarks()` SQL functions previously crashed the entire PostgreSQL backend process on any internal error. They now report a normal SQL error instead.
- Clearer handling of window functions in expressions: Queries like `CASE WHEN ROW_NUMBER() OVER (...) > 5 THEN ...` were silently accepted but failed at refresh time with a confusing error. pg_trickle now automatically falls back to full refresh mode (in AUTO mode) or warns you at creation time (in explicit DIFFERENTIAL mode).
Documentation
- Documented the known limitation that recursive CTE stream tables in DIFFERENTIAL mode fall back to full recomputation when rows are deleted or updated. Workaround: use `refresh_mode = 'IMMEDIATE'`.
- Documented the `pgt_refresh_groups` catalog table schema and usage.
- Documented the O(partition_size) cost of window function maintenance with mitigation strategies.
Deferred to v0.10.0
The following performance optimizations were evaluated and explicitly deferred. In every case the current behaviour is correct — these items would make certain workloads faster but carry enough implementation risk that they need more design work first:
- Recursive CTE incremental delete/update in DIFFERENTIAL mode (P2-1)
- SUM NULL-transition shortcut for FULL OUTER JOIN aggregates (P2-2)
- Materialized view sources in IMMEDIATE mode (P2-4)
- LATERAL subquery scoped re-execution (P2-6)
- Welford auxiliary columns for CORR/COVAR/REGR_* aggregates (P3-2)
- Merged-delta weight aggregation for multi-source deduplication (B3-2/B3-3)
Upgrade Notes
- New SQL objects: The 0.8.0 → 0.9.0 upgrade migration adds the `pgt_refresh_groups` table and the `restore_stream_tables` function. Run `ALTER EXTENSION pg_trickle UPDATE TO '0.9.0'` after replacing the extension files.
- Hidden auxiliary columns: Stream tables using AVG, STDDEV, or VAR aggregates will automatically get hidden `__pgt_aux_*` columns when created or altered. These columns are invisible to normal queries (filtered by the existing `NOT LIKE '__pgt_%'` convention) and are managed automatically.
- PGXN publishing: Release artifacts are now automatically uploaded to PGXN via GitHub Actions.
[0.8.0] — 2026-03-17
This release focuses on making your streams easier to back up and more reliable under complex scenarios, and on solidifying the core engine through a major expansion of the test suite.
Added
- Backup and Restore Support: You can now safely back up your database using standard `pg_dump` and `pg_restore` commands. The system automatically reconnects all streams and data queues to eliminate downtime during disaster recovery.
- Connection Pooler Opt-In: Replaced the global PgBouncer pooler compatibility setting with a per-stream option. You can now enable connection pooling optimizations selectively on a stream-by-stream basis.
Fixed
- Cyclic Stream Reliability: Fixed internal bugs that occasionally caused streams referencing each other in a loop to get stuck refreshing forever. Streams now accurately detect when row changes stop and naturally settle.
- Large Dependency Chains: Fixed a crash (stack overflow) that could happen if you attempted to drop an extremely large or heavily recursive chain of stream tables sequentially.
- Special Character Support in SQL: Handled an edge case causing errors when multi-byte characters or special non-ASCII symbols were parsed inside certain SQL commands.
- Mac Support for Developer Tooling: Addressed a minor internal tool error stopping test components from automatically building on Apple Silicon machines.
Under the Hood Code and Testing Enhancements
- Massive Testing Hardening: The internal test suite was fundamentally overhauled, adding tens of thousands of continuous automated checks that verify query results stay correct no matter how complex the joins or updates involved.
- Performance Migrations: Began adopting new tooling (`cargo nextest`) to speed up test runs and development iteration.
[0.7.0] — 2026-03-16
0.7.0 makes pg_trickle easier to trust in real-world data pipelines. The big theme of this release is fewer surprises: the scheduler can now wait for late-arriving source data, some circular pipelines can run safely instead of being blocked, more queries stay on incremental refresh, and the system does a better job of deciding when incremental work is no longer worth it.
Added
Multi-source data can wait until it is actually ready
pg_trickle can now delay a refresh until related source tables have all caught up to roughly the same point in time. This is useful for ETL jobs where, for example, `orders` arrives before `order_lines` and refreshing too early would produce a half-finished report.
- New watermark APIs: `advance_watermark(source, watermark)`, `create_watermark_group(name, sources[], tolerance_secs)`, and `drop_watermark_group(name)`.
- New status helpers: `watermarks()`, `watermark_groups()`, and `watermark_status()`.
- The scheduler now skips gated refreshes when grouped sources are too far apart and records the reason in refresh history.
- New catalog tables store per-source watermarks and watermark group definitions.
- 28 end-to-end tests cover normal operation, bad input, tolerance windows, and scheduler behavior.
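A sketch of the intended flow (table names, watermark values, and the tolerance are illustrative; the function names and argument order come from the list above):

```sql
-- Gate refreshes until both sources are within 60 seconds of each other
SELECT pgtrickle.create_watermark_group('orders_etl',
                                        ARRAY['orders', 'order_lines'], 60);

-- Each loader advances its source's watermark as batches land
SELECT pgtrickle.advance_watermark('orders',      '2026-03-16 12:00:00+00');
SELECT pgtrickle.advance_watermark('order_lines', '2026-03-16 11:59:30+00');

-- Check whether the group is close enough to refresh
SELECT * FROM pgtrickle.watermark_status();
```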
Some circular pipelines can now run safely
Stream tables that depend on each other in a loop are no longer always blocked. If the cycle is monotone and uses DIFFERENTIAL mode, pg_trickle can now keep refreshing the group until it stops changing.
- Circular refreshes run to a fixed point, with `pg_trickle.max_fixpoint_iterations` as a safety limit.
- Cycle creation and ALTER validation now check that every member is safe for convergence before allowing the loop.
- `pgtrickle.pgt_status()` now reports `scc_id`, and `pgtrickle.pgt_scc_status()` shows per-cycle-group status.
- `pgtrickle.pgt_stream_tables` now tracks `last_fixpoint_iterations`, so it is easier to spot slow or unstable cycles.
- 6 end-to-end tests cover convergence, rejection of unsafe cycles, non-convergence handling, and cleanup.
More queries stay on incremental refresh
Several query shapes that used to fall back to FULL refresh, or fail outright, now keep working in DIFFERENTIAL and AUTO mode.
- User-defined aggregates created with `CREATE AGGREGATE` now work through the existing group-rescan strategy, including common extension-provided aggregates.
- More complex `OR`-plus-subquery patterns are now rewritten correctly, including cases that need De Morgan normalization and multiple rewrite passes.
- The rewrite pipeline has a guardrail to stop runaway branch explosion.
- A dedicated 14-test end-to-end suite covers these previously missing cases.
Easier packaging ahead of 1.0
The release also adds infrastructure that makes evaluation and future distribution simpler.
- `Dockerfile.hub` and a dedicated CI workflow can build and smoke-test a ready-to-run PostgreSQL 18 image with pg_trickle preinstalled.
- `META.json` adds PGXN package metadata with `release_status: "testing"`.
- CNPG smoke testing is now part of the documented pre-1.0 packaging story.
Improved
Refresh strategy and performance decisions are smarter
The scheduler and refresh engine now make better choices when incremental work is likely to help and back off sooner when it is not.
- Wide tables now use xxh64-based change detection instead of slower MD5-based comparisons.
- Aggregate stream tables can skip expensive incremental work and jump straight to FULL refresh when the pending change set is obviously too large.
- Strategy selection now combines a change-ratio signal with recent refresh history, which helps on workloads with uneven batch sizes.
- DAG levels are extracted explicitly, enabling level-parallel refresh scheduling.
- Small internal hot paths such as column-list building and LSN comparison were tightened to remove avoidable allocations.
Benchmarking is much easier to use and compare
The performance toolchain was expanded so regressions are easier to spot and large-scale behavior is easier to study.
- Benchmarks now support per-cycle output, optional `EXPLAIN ANALYZE` capture, larger 1M-row runs, and more stable Criterion settings.
- New tooling covers cross-run comparison, concurrent writers, and extra query shapes such as window, lateral, CTE, and `UNION ALL` workloads.
- `just bench-docker` makes it easier to run Criterion inside the builder image when local linking is awkward.
Changed
Internal Code Quality: Integration Test Suite Hardening
Completed a full hardening pass of the integration test suite, bringing all items in PLAN_TEST_EVALS_INTEGRATION.md to done:
- Multiset validation — Extracted an `assert_sets_equal()` helper relying on EXCEPT/UNION ALL SQL logic and applied it to workflow tests to ensure storage table state correctly matches the defining query post-refresh.
- Round-trip notifications — `pg_trickle_alert` notifications now verify receipt end-to-end via `sqlx::PgListener`.
- DVM operators — Added unit coverage for complex semi/anti-join behaviors (multi-column, filtered, complementary), multi-table join chains for inner and full joins, and `proptest!` fuzz tests enforcing generated SQL invariants across INNER, SEMI, and ANTI joins.
- Resilience and edge cases — Test coverage for stream table drop cascades verifying dependent object removal, exact error escalation thresholds, and scheduler job lifecycles across queued mock states.
- Cleanups — Standardized naming practices (`test_workflow_*`, `test_infra_*`) and eliminated clock-bound flakes by widening staleness assertions.
Internal low-level code is much safer to audit
This release cuts the amount of low-level unsafe Rust in half without
changing behavior.
- Unsafe blocks were reduced by 51%, from 1,309 to 641.
- Repeated patterns were consolidated into a small set of documented helper functions.
- 37 internal functions no longer need to be marked `unsafe`.
- Existing unit tests continued to pass unchanged after the refactor.
[0.6.0] — 2026-03-14
Added
Idempotent DDL (create_or_replace)
New one-call function for deploying stream tables without worrying about whether they already exist. Replaces the old "check if it exists, then drop and recreate" pattern.
- `create_or_replace_stream_table()` — a single function that does the right thing automatically:
  - Creates the stream table if it doesn't exist yet.
  - Does nothing if the stream table already exists with the same query and settings (logs an INFO so you know it was a no-op).
  - Updates settings (schedule, refresh mode, etc.) if only config changed.
  - Replaces the query if the defining query changed — including automatic schema migration and a full refresh.
- dbt uses it automatically. The `stream_table` materialization now calls `create_or_replace_stream_table()` when running against pg_trickle 0.6.0+, with automatic fallback for older versions.
- Whitespace-insensitive. Cosmetic SQL differences (extra spaces, tabs, newlines) are correctly treated as no-ops — won't trigger unnecessary rebuilds.
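A deployment sketch, reusing the `active_orders` example from the introduction (any parameters beyond those shown in this changelog are assumptions):

```sql
-- Safe to run on every deploy: no-op, settings update, or full replace,
-- depending on what actually changed.
SELECT pgtrickle.create_or_replace_stream_table(
    name     => 'active_orders',
    query    => 'SELECT * FROM orders WHERE status = ''active''',
    schedule => '30s'
);
```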
dbt Integration Enhancements
- Check stream table health from dbt. The new `pgtrickle_stream_table_status()` macro returns whether a stream table is healthy, stale, erroring, or paused. Pair it with the new built-in `stream_table_healthy` test in your `schema.yml` to fail CI when a stream table is behind or broken.
- Refresh everything in the right order. The new `refresh_all_stream_tables` run-operation refreshes all dbt-managed stream tables in dependency order. Run it after `dbt run` and before `dbt test` in your CI pipeline.
Partitioned Source Tables
Stream tables now work with PostgreSQL's declarative table partitioning — RANGE, LIST, and HASH partitioned tables all work as sources out of the box.
- Changes in any partition are captured automatically. CDC triggers fire on the parent table so inserts, updates, and deletes in any child partition are picked up.
- ATTACH PARTITION triggers automatic rebuild. When you attach a new partition, pg_trickle detects the structural change and rebuilds affected stream tables to include the new partition's pre-existing data.
- WAL mode works with partitions. Publications are configured with `publish_via_partition_root = true`, so all partitions report changes under the parent table's identity.
- New tutorial covering partitioned source tables, ATTACH/DETACH behavior, and known caveats (`docs/tutorials/PARTITIONED_TABLES.md`).
Circular Dependency Foundation
Lays the groundwork for stream tables that reference each other in a cycle (A → B → A). The actual cyclic refresh execution is planned for v0.7.0 — this release adds the detection, validation, and safety infrastructure.
- Cycle detection. pg_trickle can now identify groups of stream tables that form circular dependencies.
- Safety checks at creation time. Queries that can't safely participate in a cycle (those using aggregates, EXCEPT, window functions, or NOT EXISTS) are rejected with a clear error explaining why.
- New settings:
  - `pg_trickle.allow_circular` (default: off) — master switch for circular dependencies.
  - `pg_trickle.max_fixpoint_iterations` (default: 100) — prevents runaway loops.
Source Gating Improvements
- `bootstrap_gate_status()` function. Shows which sources are currently gated, when they were gated, how long the gate has been active, and which stream tables are waiting. Useful for debugging "why isn't my stream table refreshing?"
- ETL coordination cookbook. The SQL Reference now includes five step-by-step recipes for common bulk-load patterns.
More SQL Patterns Supported
Two query patterns that previously required workarounds now just work:
- Window functions inside expressions. Queries like `CASE WHEN ROW_NUMBER() OVER (...) = 1 THEN 'top' ELSE 'other' END` or `COALESCE(SUM() OVER (...), 0)` are now accepted and produce correct results. Use FULL refresh mode for these queries — incremental (DIFFERENTIAL) refresh of window-in-expression patterns is not yet supported. Previously, the query was rejected entirely at creation time.
- `ALL (subquery)` comparisons. Queries like `WHERE price < ALL (SELECT price FROM competitors)` are now accepted in both FULL and DIFFERENTIAL modes. Supports all comparison operators (`>`, `>=`, `<`, `<=`, `=`, `<>`) and correctly handles NULL values per the SQL standard.
Operational Safety Improvements
- Function changes detected automatically. If a stream table's query calls a user-defined function and you update that function with `CREATE OR REPLACE FUNCTION`, pg_trickle detects the change and automatically rebuilds the stream table on the next cycle. No manual intervention needed.
- WAL mode explains why it isn't activating. When `cdc_mode = 'auto'` and the system stays on trigger-based tracking, the scheduler now periodically logs the exact reason (e.g., "`wal_level` is not `logical`") and `check_cdc_health()` reports the current mode so you can diagnose the issue.
- WAL + keyless tables rejected early. Creating a stream table with `cdc_mode = 'wal'` on a table that has no primary key and no `REPLICA IDENTITY FULL` is now rejected at creation time with a clear error — instead of silently producing incomplete results later.
- Automatic recovery after backup/restore. When a PostgreSQL server is restored from `pg_basebackup`, WAL replication slots are lost. pg_trickle now detects the missing slot, automatically falls back to trigger-based tracking, and logs a WARNING so you know what happened.
Documentation
- ALL (subquery) worked example in the SQL Reference with sample data and expected results.
- Window-in-expression documentation showing before/after examples of the automatic rewrite.
- Foreign table sources tutorial — step-by-step guide for using `postgres_fdw` foreign tables as stream table sources.
Fixed
- `create_or_replace` whitespace handling. Extra spaces, tabs, and newlines in queries no longer trigger unnecessary rebuilds.
- `create_or_replace` schema incompatibility detection. Incompatible column type changes (e.g., text → integer) are now properly detected and handled.
[0.5.0] — 2026-03-13
Added
Row-Level Security (RLS) Support
Stream tables now work correctly with PostgreSQL's Row-Level Security feature, which lets you control which rows different users can see.
- Refreshes always see all data. When a stream table is refreshed, it computes the full result regardless of RLS policies on the source tables. This matches how PostgreSQL's built-in materialized views work. You then add RLS policies directly on the stream table to control who can read what.
- Internal tables are protected. The internal change-tracking tables used by pg_trickle are shielded from RLS interference, so refreshes won't silently fail if you turn on RLS at the schema level.
- Real-time (IMMEDIATE) mode secured. Triggers that keep stream tables updated in real time now run with elevated privileges and a locked-down search path, preventing data corruption or security bypasses.
- RLS changes are detected automatically. If you enable, disable, or force RLS on a source table, pg_trickle detects the change and marks affected stream tables for a full rebuild.
- New tutorial. Step-by-step guide for setting up per-tenant RLS policies on stream tables (see `docs/tutorials/ROW_LEVEL_SECURITY.md`).
Source Gating for Bulk Loads
New pause/resume mechanism for large data imports. When you're loading a big batch of data into a source table, you can temporarily "gate" it to prevent the background scheduler from triggering refreshes mid-load. Once the load is done, ungate it and everything catches up in a single refresh.
- `gate_source('my_table')` — pauses automatic refreshes for any stream table that depends on `my_table`.
- `ungate_source('my_table')` — resumes automatic refreshes. All changes made during the gate are picked up in the next refresh cycle.
- `source_gates()` — shows which source tables are currently gated, when they were gated, and by whom.
- Manual refresh still works. Even while a source is gated, you can explicitly call `refresh_stream_table()` if needed.
- Gating is idempotent — calling `gate_source()` twice is safe, and gating a source that's already gated is a no-op.
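A bulk-load sketch using these functions (the table name, file path, and `pgtrickle` schema qualification are illustrative):

```sql
SELECT pgtrickle.gate_source('orders');      -- pause dependent refreshes

COPY orders FROM '/tmp/orders_backfill.csv' WITH (FORMAT csv);

SELECT pgtrickle.ungate_source('orders');    -- catch up in one refresh
SELECT * FROM pgtrickle.source_gates();      -- verify nothing is still gated
```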
Append-Only Fast Path
Significant performance improvement for tables that only receive INSERTs
(event logs, audit trails, time-series data, etc.). When you mark a stream
table as append_only, refreshes skip the expensive merge logic (checking
for deletes, updates, and row comparisons) and use a simple, fast insert.
- How to use: Pass `append_only => true` when creating or altering a stream table.
- Safe fallback. If a DELETE or UPDATE is detected on a source table, the extension automatically falls back to the standard refresh path and logs a warning. It won't silently produce wrong results.
- Restrictions. Append-only mode requires DIFFERENTIAL refresh mode and source tables with primary keys.
Usability Improvements
- Manual refresh history. When you manually call `refresh_stream_table()`, the result (success or failure, timing, rows affected) is now recorded in the refresh history, just like scheduled refreshes.
- `quick_health` view. A single-row health summary showing how many stream tables you have, how many are in error or stale, whether the scheduler is running, and an overall status (`OK`, `WARNING`, `CRITICAL`). Easy to plug into monitoring dashboards.
- `create_stream_table_if_not_exists()`. A convenience function that does nothing if the stream table already exists, instead of raising an error. Makes migration scripts and deployment automation simpler.
Smooth Upgrade from 0.4.0
- Existing installations can upgrade with `ALTER EXTENSION pg_trickle UPDATE TO '0.5.0'`. All new features (source gating, append-only mode, quick health view, and the new convenience functions) are included in the upgrade script.
- The upgrade has been verified with automated tests that confirm all 40 SQL objects survive the upgrade intact.
[0.4.0] — 2026-03-12
Added
Parallel Refresh (opt-in)
Stream tables can now be refreshed in parallel, using multiple background workers instead of processing them one at a time. This can dramatically reduce end-to-end refresh latency when you have many independent stream tables.
- Off by default. Set `pg_trickle.parallel_refresh_mode = 'on'` to enable. Use `'dry_run'` to preview what the scheduler would do without changing behavior.
- Automatic dependency awareness. The scheduler figures out which stream tables can safely refresh at the same time and which must wait for others. Stream tables connected by real-time (IMMEDIATE) triggers are always refreshed together to prevent race conditions.
- Atomic groups. When a group of stream tables must succeed or fail together (e.g. diamond dependencies), all members are wrapped in a single transaction — if one fails, the whole group rolls back cleanly.
- Worker pool controls:
  - `pg_trickle.max_dynamic_refresh_workers` (default 4) — cluster-wide cap on concurrent refresh workers.
  - `pg_trickle.max_concurrent_refreshes` — per-database dispatch cap.
- Monitoring:
  - `worker_pool_status()` — shows how many workers are active and the current limits.
  - `parallel_job_status(max_age_seconds)` — lists recent and active refresh jobs with timing and status.
  - `health_check()` now warns when the worker pool is saturated or the job queue is backing up.
- Self-healing. On startup, the scheduler automatically cleans up orphaned jobs and reclaims leaked worker slots from previous crashes.
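A rollout sketch using the GUC and monitoring functions above (assuming the functions live in the `pgtrickle` schema, as elsewhere in this changelog):

```sql
-- Preview first, then flip to 'on'; GUC changes need a config reload
ALTER SYSTEM SET pg_trickle.parallel_refresh_mode = 'dry_run';
SELECT pg_reload_conf();

SELECT * FROM pgtrickle.worker_pool_status();
SELECT * FROM pgtrickle.parallel_job_status(300);  -- jobs from the last 5 minutes
```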
Statement-Level CDC Triggers
Change tracking triggers have been upgraded from row-level to statement-level, reducing write-side overhead for bulk INSERT and UPDATE operations. This is now the default for all new and existing stream tables. A benchmark harness is included so you can measure the difference on your own hardware.
dbt Getting Started Example
New `examples/dbt_getting_started/` project with a complete, runnable dbt
example showing org-chart seed data, staging views, and three stream table
models. Includes an automated test script.
Fixed
Refresh Lock Not Released After Errors
Fixed a bug where refresh_stream_table() could get permanently stuck after
a PostgreSQL error (e.g. running out of temp file space). The internal lock
was session-level and survived transaction rollback, causing all future
refreshes for that stream table to report "another refresh is already in
progress". Refresh locks are now transaction-level, so they are automatically
released when the transaction ends — whether it succeeds or fails.
dbt Integration Fixes
- Fixed query quoting in dbt macros that broke when queries contained single quotes.
- Fixed `schedule = none` in dbt being incorrectly mapped to SQL NULL.
- Fixed view inlining when the same view was referenced with different aliases.
Changed
Updated to PostgreSQL 18.3 across CI and test infrastructure.
-
Dependency updates:
tokio1.49 → 1.50 and several GitHub Actions bumps.
Breaking Changes
These behavioural changes shipped in v0.4.0. They improve usability but may require action from users upgrading from v0.3.0.
- Schedule default changed from `'1m'` to `'calculated'`. `create_stream_table` now defaults to `schedule => 'calculated'`, which auto-computes the refresh interval from downstream dependents instead of refreshing every minute. If you relied on the implicit 1-minute default, explicitly pass `schedule => '1m'` to preserve the old behaviour.
- `NULL` schedule input rejected. Passing `schedule => NULL` to `create_stream_table` now returns an error. Use `schedule => 'calculated'` instead — it's explicit and self-documenting.
- Diamond GUCs removed. The cluster-wide GUCs `pg_trickle.diamond_consistency` and `pg_trickle.diamond_schedule_policy` have been removed. Diamond behaviour is now controlled per table via parameters on `create_stream_table()` / `alter_stream_table()`: `diamond_consistency => 'atomic'`, `diamond_schedule_policy => 'slowest'`.
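For users upgrading from v0.3.0, the adjustments might look like the following sketch (parameter names as listed above; the exact call shapes and the table name are illustrative):

```sql
-- Keep the old fixed 1-minute interval by making it explicit
SELECT pgtrickle.create_stream_table(
  name     => 'order_totals',
  query    => 'SELECT customer_id, SUM(amount) AS total
               FROM orders GROUP BY customer_id',
  schedule => '1m'  -- previously the implicit default
);

-- Diamond behaviour moves from cluster-wide GUCs to per-table parameters
SELECT pgtrickle.alter_stream_table(
  name                    => 'order_totals',
  diamond_consistency     => 'atomic',
  diamond_schedule_policy => 'slowest'
);
```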
[0.3.0] — 2026-03-11
This is a correctness and hardening release. No new SQL functions, tables, or
views were added — all changes are in the compiled extension code.
ALTER EXTENSION pg_trickle UPDATE is safe and a no-op for schema objects.
Fixed
Incremental Correctness Fixes
All 18 previously-disabled correctness tests have been re-enabled (0 remaining). The following query patterns now produce correct results during incremental (non-full) refreshes:
- HAVING clause threshold crossing. Queries with `HAVING` filters (e.g. `HAVING SUM(amount) > 100`) now produce correct totals when groups cross the threshold. Previously, a group gaining enough rows to meet the condition would show only the newly added values instead of the correct total.
- FULL OUTER JOIN. Five bugs affecting incremental updates for `FULL OUTER JOIN` queries are fixed: mismatched row identifiers, incorrect handling of compound GROUP BY expressions like `COALESCE(left.col, right.col)`, and wrong NULL handling for SUM aggregates.
- EXISTS with HAVING subqueries. Queries using `WHERE EXISTS (... GROUP BY ... HAVING ...)` now work correctly — the inner GROUP BY and HAVING were previously being silently discarded.
- Correlated scalar subqueries. Correlated subqueries in SELECT, like `(SELECT MAX(e.salary) FROM emp e WHERE e.dept_id = d.id)`, are now automatically rewritten into LEFT JOINs so the incremental engine can handle them correctly.
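The correlated-subquery rewrite can be pictured as follows; this is a conceptual sketch, not the engine's literal output:

```sql
-- Original: correlated scalar subquery evaluated per row of dept
SELECT d.name,
       (SELECT MAX(e.salary) FROM emp e WHERE e.dept_id = d.id) AS top_salary
FROM dept d;

-- Equivalent LEFT JOIN form the incremental engine can maintain
SELECT d.name, agg.top_salary
FROM dept d
LEFT JOIN (
  SELECT e.dept_id, MAX(e.salary) AS top_salary
  FROM emp e
  GROUP BY e.dept_id
) agg ON agg.dept_id = d.id;
```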
Background Worker Detection on PostgreSQL 18
Fixed a bug where health_check() and the scheduler reported zero active
workers on PostgreSQL 18 due to a column name change in system views.
Scheduler Stability
Fixed a loop where the scheduler launcher could get stuck retrying failed database probes indefinitely instead of backing off properly.
Added
Security Tooling
Added static security analysis to the CI pipeline:
- GitHub CodeQL — automated security scanning across all Rust source files. First scan: zero findings.
- `cargo deny` — enforces a license allow-list and flags unmaintained or yanked dependencies.
- Semgrep — custom rules that flag potentially dangerous patterns such as dynamic SQL construction and privilege escalation. Advisory-only (does not block merges).
- Unsafe block inventory — CI tracks the count of unsafe code blocks per file and fails if any file exceeds its baseline, preventing unreviewed growth of low-level code.
[0.2.3] — 2026-03-09
Added
- Unsafe function detection. Queries using non-deterministic functions like `random()` or `clock_timestamp()` are now rejected when creating incremental stream tables, because they can't produce reliable results. Functions like `now()` that return the same value within a transaction are allowed with a warning.
- Per-table change tracking mode. You can now choose how each stream table tracks changes (`'auto'`, `'trigger'`, or `'wal'`) via the `cdc_mode` parameter on `create_stream_table()` and `alter_stream_table()`, instead of relying only on the global setting.
- CDC status view. The new `pgtrickle.pgt_cdc_status` view shows the change tracking mode, replication slot, and transition status for every source table in one place.
- Configurable WAL lag thresholds. The warning and critical thresholds for replication slot lag are now configurable via `pg_trickle.slot_lag_warning_threshold_mb` (default 100 MB) and `pg_trickle.slot_lag_critical_threshold_mb` (default 1024 MB), instead of being hard-coded.
- `pg_trickle_dump` backup tool. A new standalone CLI that exports all your stream table definitions as replayable SQL, ordered by dependency. Useful for backups before upgrades or migrations.
- Upgrade path. `ALTER EXTENSION pg_trickle UPDATE` picks up all new features from this release.
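A sketch of the new per-table controls in use (the table name and threshold value are illustrative):

```sql
-- Pin one high-traffic source to WAL-based change tracking
SELECT pgtrickle.create_stream_table(
  name     => 'events_by_hour',
  query    => 'SELECT date_trunc(''hour'', ts) AS hour, count(*) AS n
               FROM events GROUP BY 1',
  cdc_mode => 'wal'
);

-- Inspect change tracking state for every source table
SELECT * FROM pgtrickle.pgt_cdc_status;

-- Raise the WAL lag warning threshold from the 100 MB default to 256 MB
SET pg_trickle.slot_lag_warning_threshold_mb = 256;
```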
Changed
- After a full refresh, WAL replication slots are now advanced to the current position, preventing unnecessary WAL accumulation and false lag alarms.
- Change buffers are now flushed after a full refresh, fixing a cycle where the scheduler would alternate endlessly between incremental and full refreshes on bulk-loaded tables.
- IMMEDIATE mode now correctly rejects explicit WAL CDC requests with a clear error, since real-time mode uses its own trigger mechanism.
- The `pg_trickle.user_triggers` setting is simplified to `auto` and `off`. The old `on` value still works as an alias for `auto`.
- CI pipelines are faster on PRs — only essential tests run; the full suite runs on merge and on a daily schedule.
[0.2.2] — 2026-03-08
Added
- Change a stream table's query. `alter_stream_table` now accepts a `query` parameter, so you can change what a stream table computes without dropping and recreating it. If the new query's columns are compatible, the underlying storage table is preserved — existing views, policies, and publications continue to work.
- AUTO refresh mode (new default). Stream tables now default to `AUTO` mode, which uses fast incremental updates when the query supports it and automatically falls back to a full recompute when it doesn't. You no longer need to think about whether your query is "incremental-compatible" — just create the stream table and it picks the best strategy.
- Version mismatch warning. The background scheduler now warns if the installed extension version doesn't match the compiled library, making it easier to spot a half-finished upgrade.
- ORDER BY + LIMIT + OFFSET. You can now page through top-N results, e.g. `ORDER BY revenue DESC LIMIT 10 OFFSET 20` to get the third page of top earners.
- Real-time mode: recursive queries. `WITH RECURSIVE` queries (e.g. org-chart hierarchies) now work in IMMEDIATE mode. A depth limit (default 100) prevents infinite loops.
- Real-time mode: top-N queries. `ORDER BY ... LIMIT N` queries now work in IMMEDIATE mode — the top-N rows are recomputed on every data change. Maximum N is controlled by `pg_trickle.ivm_topk_max_limit` (default 1000).
- Foreign table support. Stream tables can now use foreign tables as sources. Changes are detected by comparing snapshots, since foreign tables don't support triggers. Enable with `pg_trickle.foreign_table_polling = on`.
- Documentation reorganization. Configuration and SQL reference docs are reorganized around practical workflows. New sections cover DDL-during-refresh behavior, standby/replica limitations, and PgBouncer constraints.
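Changing a stream table's query in place might look like this (a column-compatible change; sketch only):

```sql
-- Swap the defining query without dropping the stream table; because the
-- output columns are unchanged, the storage table (and any dependent
-- views, policies, or publications) is preserved.
SELECT pgtrickle.alter_stream_table(
  name  => 'active_orders',
  query => 'SELECT * FROM orders WHERE status = ''active'' AND amount > 0'
);
```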
Changed
- Default refresh mode changed from `'DIFFERENTIAL'` to `'AUTO'`.
- Default schedule changed from `'1m'` to `'calculated'` (automatic).
- Default change tracking mode changed from `'trigger'` to `'auto'` — WAL-based tracking starts automatically when available, with trigger-based as fallback.
[0.2.1] — 2026-03-05
Added
- Safe upgrades. New upgrade infrastructure ensures that `ALTER EXTENSION pg_trickle UPDATE` works correctly. A CI check detects missing functions or views in upgrade scripts, and automated tests verify that stream tables survive version-to-version upgrades intact. See docs/UPGRADING.md for the upgrade guide.
- ORDER BY + LIMIT + OFFSET. You can now create stream tables over paged results, like "the second page of the top-100 products by revenue" (`ORDER BY revenue DESC LIMIT 100 OFFSET 100`).
- `'calculated'` schedule. Instead of passing SQL `NULL` to request automatic scheduling, you can now write `schedule => 'calculated'`. Passing `NULL` now gives a helpful error message.
- Documentation expansion. Six new pages in the online book covering dbt integration, contributing guidelines, security policy, release process, and research comparisons with other projects.
- Better warnings and safety checks:
  - Warning when a source table lacks a primary key (duplicate rows are handled safely but less efficiently).
  - Warning when using `SELECT *` (new columns added later can break incremental updates).
  - Alert when the refresh queue is falling behind (> 80% capacity).
  - Guard triggers prevent accidental direct writes to stream table storage.
  - Automatic fallback from WAL to trigger-based change tracking when the replication slot disappears.
  - Nested window functions and complex `WHERE` clauses with `EXISTS` are now handled automatically.
- Change buffer partitioning. For high-throughput tables, change buffers can now be partitioned so that processed data is dropped efficiently.
- Column pruning. The incremental engine now skips source columns not used in the query, reducing I/O for wide tables.
Changed
- Default `schedule` changed from `'1m'` to `'calculated'` (automatic).
- Minimum schedule interval lowered from 60 s to 1 s.
- Cluster-wide diamond consistency settings removed; per-table settings remain and now default to `'atomic'` / `'fastest'`.
Fixed
- The 0.1.3 → 0.2.0 upgrade script was accidentally a no-op, silently skipping 11 new functions. Fixed.
- Queries combining `WITH` (CTEs) and `UNION ALL` now parse correctly.
[0.2.0] — 2026-03-04
Added
- Monitoring & health checks. Six new functions for inspecting your stream tables at runtime (no superuser required):
  - `change_buffer_sizes()` — shows how much pending change data each stream table has queued up.
  - `list_sources(name)` — lists all base tables that feed a given stream table, with row counts and size estimates.
  - `dependency_tree()` — displays an ASCII tree of how your stream tables depend on each other.
  - `health_check()` — quick system triage that checks whether the scheduler is running, flags tables in error or stale states, and warns about large change buffers or WAL lag.
  - `refresh_timeline()` — recent refresh history across all stream tables, showing timing, row counts, and any errors.
  - `trigger_inventory()` — verifies that all required change-tracking triggers are in place and enabled.
- IMMEDIATE refresh mode (real-time updates). The new `'IMMEDIATE'` mode keeps stream tables updated within the same transaction as your data changes. There's no delay — the stream table reflects changes the instant they happen. Supports window functions, LATERAL joins, scalar subqueries, and aggregate queries. You can switch between IMMEDIATE and other modes at any time using `alter_stream_table`.
- Top-N queries (ORDER BY + LIMIT). Queries like `SELECT ... ORDER BY score DESC LIMIT 10` are now supported. The stream table stores only the top N rows and updates them efficiently.
- Diamond dependency consistency. When multiple stream tables share common sources and feed into the same downstream table (a "diamond" pattern), they can now be refreshed as an atomic group — either all succeed or all roll back. This prevents inconsistent reads at convergence points. Controlled via the `diamond_consistency` parameter (default: `'atomic'`).
- Multi-database auto-discovery. The background scheduler now automatically finds and services all databases on the server where pg_trickle is installed. No manual `pg_trickle.database` configuration is required — just install the extension and the scheduler discovers it.
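A quick triage session with these functions could look like the following (invocation style is illustrative; per the list above, none of them require superuser):

```sql
SELECT * FROM pgtrickle.health_check();         -- scheduler up? stale or errored tables?
SELECT * FROM pgtrickle.change_buffer_sizes();  -- pending change volume per stream table
SELECT * FROM pgtrickle.dependency_tree();      -- ASCII view of the stream table DAG
SELECT * FROM pgtrickle.refresh_timeline();     -- recent refreshes: timing, rows, errors
SELECT * FROM pgtrickle.trigger_inventory();    -- are all CDC triggers in place?
```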
Fixed
- Fixed IMMEDIATE mode incorrectly trying to read from change buffer tables (which don't exist in that mode) for certain aggregate queries.
- Fixed type mismatches when join queries had unchanged source tables producing empty change sets.
- Fixed join condition column order being swapped when the right-side table was written first in the `ON` clause (e.g. `ON r.id = l.id`).
- Fixed dbt macros silently rolling back stream table creation because dbt wraps statements in a `ROLLBACK` by default.
- Fixed `LIMIT ALL` being incorrectly rejected as an unsupported LIMIT clause.
- Fixed false "query may produce incorrect incremental results" warnings on simple arithmetic like `depth + 1` or `path || name`.
- Fixed auto-created indexes using the wrong column name when the query had a column alias (e.g. `SELECT id AS department_id`).
[0.1.3] — 2026-03-02
Major hardening release with 50 improvements across correctness, robustness, operational safety, and test coverage.
Added
- DDL change tracking expanded. `ALTER TYPE`, `ALTER POLICY`, and `ALTER DOMAIN` on source tables are now detected and trigger a rebuild of affected stream tables. Previously only column changes were tracked.
- Recursive query safety guard. Recursive CTEs (`WITH RECURSIVE`) are now checked for non-monotonic terms that could produce incorrect incremental results.
- Read replica awareness. The background scheduler detects when it's running on a read replica and skips refresh work, preventing errors.
- Range aggregates rejected. `RANGE_AGG` and `RANGE_INTERSECT_AGG` are now properly rejected in incremental mode with a clear error.
- Refresh history: row counts. Refresh history now records how many rows were inserted, updated, and deleted in each refresh cycle.
- Change buffer alerts. The new `pg_trickle.buffer_alert_threshold` setting lets you configure when to be warned about growing change buffers.
- `st_auto_threshold()` function. Shows the current adaptive threshold that decides when to switch between incremental and full refresh.
- Wide table optimization. Tables with more than 50 columns use a hash shortcut during refresh merges, improving performance.
- Change buffer security. Internal change buffer tables are no longer accessible to `PUBLIC`.
- Documentation. PgBouncer compatibility, keyless table limitations, delta memory bounds, sequential processing rationale, and connection overhead are all now documented in the FAQ.
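The new operational knobs in use (the threshold value is illustrative, and `st_auto_threshold()` taking the stream table name as its argument is an assumption):

```sql
-- Configure when to warn about growing change buffers (value illustrative)
SET pg_trickle.buffer_alert_threshold = 100000;

-- Inspect the adaptive incremental-vs-full refresh cutover
SELECT pgtrickle.st_auto_threshold('active_orders');
```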
TPC-H Correctness Suite: 22/22 Queries Passing
The TPC-H-derived correctness test suite (22 industry-standard analytical queries) now passes completely across multiple rounds of data changes. This validates that incremental refreshes produce identical results to full recomputation for complex real-world query patterns.
Fixed
Window Function Correctness
Fixed incremental maintenance of window functions (ROW_NUMBER, RANK, DENSE_RANK, NTILE, LAG/LEAD, SUM OVER, etc.) to correctly handle:
- Non-RANGE frame types
- Ranking functions over tied values
- Window functions wrapping aggregates (e.g. `RANK() OVER (ORDER BY SUM(x))`)
- Multiple window functions with different PARTITION BY clauses
INTERSECT / EXCEPT Correctness
Fixed incremental maintenance of INTERSECT and EXCEPT queries that
produced wrong results due to invalid SQL generation.
EXISTS / IN with OR Correctness
Fixed EXISTS and IN subqueries combined with OR in WHERE clauses that
produced wrong results.
Aggregate Correctness
- `MIN`/`MAX` now correctly rescan the source table when the current minimum or maximum value is deleted.
- `STRING_AGG(... ORDER BY ...)` and `ARRAY_AGG(... ORDER BY ...)` no longer silently drop the ORDER BY clause.
[0.1.2] — 2026-02-28
Changed
Project Renamed from pg_stream to pg_trickle
Renamed the entire project from pg_stream to pg_trickle to avoid a
naming collision with an unrelated project. If you were using the old name,
all configuration prefixes changed from pg_stream.* to pg_trickle.*, and
the SQL schemas changed from pgstream to pgtrickle. The "stream tables"
terminology is unchanged.
Fixed
Fixed numerous incremental computation bugs discovered while building a comprehensive correctness test suite based on all 22 TPC-H analytical queries:
- Inner join double-counting. When both sides of a join had changes in the same refresh cycle, some rows were counted twice.
- Shared source cleanup. Cleaning up processed changes for one stream table could accidentally delete entries still needed by another stream table sharing the same source.
- Scalar aggregate identity mismatch. Queries like `SELECT SUM(amount) FROM orders` could produce mismatched row identifiers between the incremental and merge phases. AVG also failed to recompute correctly after partial group changes.
- EXISTS / NOT EXISTS snapshots. Incremental maintenance of `EXISTS` and `NOT EXISTS` subqueries missed pre-change state, producing wrong results.
- Column resolution in complex joins. Several fixes for column name resolution in multi-table joins and nested subqueries.
- COUNT(*) rendering. `COUNT(*)` was sometimes rendered as `COUNT()` (missing the star), causing SQL errors.
- Subquery rewriting. Several subquery patterns (correlated vs. non-correlated scalar subqueries, derived tables in FROM) were incorrectly rewritten, blocking certain queries from being created.
- Cleanup worker crash. The background cleanup worker no longer crashes when it encounters entries for stream tables that were dropped mid-cycle.
Added
TPC-H Correctness Test Suite
Added a comprehensive correctness test suite based on all 22 TPC-H analytical queries. These tests verify that incremental refreshes produce identical results to a full recompute after INSERT, DELETE, and UPDATE mutations. 20 of 22 queries can be created as stream tables; 15 pass full correctness checks at this point (improved to 22/22 in v0.1.3).
[0.1.1] — 2026-02-26
Changed
CloudNativePG Extension Image
Replaced the full PostgreSQL Docker image (~400 MB) with a minimal extension-only image (< 10 MB) following the CloudNativePG Image Volume Extensions specification. This means faster pulls and less disk usage in Kubernetes deployments. The image contains just the extension files — no full PostgreSQL server.
[0.1.0] — 2026-02-26
Initial release of pg_trickle — a PostgreSQL extension that keeps query results automatically up to date as your data changes.
Core Concept
Define a SQL query and a schedule. pg_trickle creates a stream table that stores the query's results and keeps them fresh — either on a schedule (every N seconds) or in real time. When data in your source tables changes, only the affected rows are recomputed instead of re-running the entire query.
What You Can Do
- Create stream tables from `SELECT` queries — joins, aggregates, subqueries, CTEs, window functions, set operations, and more.
- Automatic refresh — a background scheduler refreshes stream tables in dependency order. You can also trigger refreshes manually.
- Incremental updates — the engine automatically figures out how to update only the rows that changed, instead of recomputing everything. This works for most query patterns, including multi-table joins and aggregates.
- Views as sources — views referenced in your query are automatically expanded so change tracking works on the underlying tables.
- Tables without primary keys — supported via content hashing. Tables with primary keys get better performance.
- Hybrid change tracking — starts with lightweight triggers (no special PostgreSQL configuration needed). Can automatically switch to WAL-based tracking for lower overhead when `wal_level = logical` is available.
- Multi-database support — the scheduler automatically discovers all databases on the server where the extension is installed.
- User triggers on stream tables — your own `AFTER` triggers on stream tables fire correctly during incremental refreshes.
- DDL awareness — `ALTER TABLE`, `DROP TABLE`, `CREATE OR REPLACE FUNCTION`, and other DDL on source tables or functions used in your query are detected and handled automatically.
SQL Support
Broad coverage of SQL features:
- Joins: INNER, LEFT, RIGHT, FULL OUTER, NATURAL, LATERAL subqueries, LATERAL set-returning functions (`unnest`, `jsonb_array_elements`, etc.)
- Aggregates: 39 functions including COUNT, SUM, AVG, MIN, MAX, STRING_AGG, ARRAY_AGG, JSON_ARRAYAGG, JSON_OBJECTAGG, statistical regression functions (CORR and the COVAR_ and REGR_ families), and ordered-set aggregates (MODE, PERCENTILE_CONT, PERCENTILE_DISC)
- Window functions: ROW_NUMBER, RANK, DENSE_RANK, NTILE, LAG, LEAD, SUM OVER, etc., with full frame clause support
- Set operations: UNION, UNION ALL, INTERSECT, EXCEPT
- Subqueries: in FROM, EXISTS/NOT EXISTS, IN/NOT IN, scalar subqueries
- CTEs: `WITH` and `WITH RECURSIVE`
- Special syntax: DISTINCT, DISTINCT ON, GROUPING SETS / CUBE / ROLLUP, CASE WHEN, COALESCE, JSON_TABLE (PostgreSQL 17+)
- Unsafe function detection: queries using non-deterministic functions like `random()` are rejected with a clear error
Monitoring
- `explain_st()` — shows the incremental computation plan
- `st_refresh_stats()`, `get_refresh_history()`, `get_staleness()` — refresh performance and status
- `slot_health()` — WAL replication slot health
- `check_cdc_health()` — change tracking health per source table
- `stream_tables_info` and `pg_stat_stream_tables` views
- NOTIFY alerts for stale data, errors, and refresh events
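For example (function names from the list above; exact signatures and return shapes are illustrative):

```sql
SELECT pgtrickle.explain_st('active_orders');  -- incremental computation plan
SELECT * FROM pgtrickle.get_staleness();       -- how far behind is each stream table?
SELECT * FROM pgtrickle.slot_health();         -- WAL replication slot status
LISTEN pg_trickle_alert;                       -- subscribe to stale/error notifications
```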
Documentation
- Architecture guide, SQL reference, configuration reference, FAQ, getting-started tutorial, and deep-dive tutorials.
Known Limitations
- `TABLESAMPLE`, `LIMIT`/`OFFSET`, `FOR UPDATE`/`FOR SHARE` — not yet supported (clear error messages).
- Window functions inside expressions (e.g. `CASE WHEN ROW_NUMBER() ...`) — not yet supported.
- Circular stream table dependencies — not yet supported.
pg_trickle — Project Roadmap
Last updated: 2026-04-13
Latest release: 0.19.0 (2026-04-13)
Current milestone: v0.21.0 — PostgreSQL 17 Support
For a concise description of what pg_trickle is and why it exists, read
ESSENCE.md — it explains the core problem (full REFRESH MATERIALIZED VIEW recomputation), how the differential dataflow approach
solves it, the hybrid trigger→WAL CDC architecture, and the broad SQL
coverage, all in plain language.
Table of Contents
- Overview
- v0.1.x Series — Released
- v0.2.0 — TopK, Diamond Consistency & Transactional IVM
- v0.2.1 — Upgrade Infrastructure & Documentation
- v0.2.2 — OFFSET, AUTO Mode, ALTER QUERY, Edge Cases & CDC Hardening
- v0.2.3 — Non-Determinism, CDC/Mode Gaps & Operational Polish
- v0.3.0 — DVM Correctness, SAST & Test Coverage
- v0.4.0 — Parallel Refresh & Performance Hardening
- v0.5.0 — Row-Level Security & Operational Controls
- v0.6.0 — Partitioning, Idempotent DDL, Edge Cases & Circular Dependency Foundation
- v0.7.0 — Performance, Watermarks, Circular DAG Execution, Observability & Infrastructure
- v0.8.0 — pg_dump Support & Test Hardening
- v0.9.0 — Incremental Aggregate Maintenance
- v0.10.0 — DVM Hardening, Connection Pooler Compatibility, Core Refresh Optimizations & Infrastructure Prep
- v0.11.0 — Partitioned Stream Tables, Prometheus & Grafana Observability, Safety Hardening & Correctness
- v0.12.0 — Correctness, Reliability & Developer Tooling
- v0.13.0 — Scalability Foundations, Partitioning Enhancements, MERGE Profiling & Multi-Tenant Scheduling
- v0.14.0 — Tiered Scheduling, UNLOGGED Buffers & Diagnostics
- v0.15.0 — External Test Suites & Integration
- v0.16.0 — Performance & Refresh Optimization
- v0.17.0 — Query Intelligence & Stability
- v0.18.0 — Hardening & Delta Performance
- v0.19.0 — Production Gap Closure & Distribution
- v0.20.0 — Dogfooding
- v0.21.0 — PostgreSQL 17 Support
- v0.22.0 — PGlite Proof of Concept
- v0.23.0 — Core Extraction (`pg_trickle_core`)
- v0.24.0 — PGlite WASM Extension
- v0.25.0 — PGlite Reactive Integration
- v1.0.0 — Stable Release
- Post-1.0 — Scale, Ecosystem & Platform Expansion
- Effort Summary
- References
Overview
pg_trickle is a PostgreSQL 18 extension that implements streaming tables with incremental view maintenance (IVM) via differential dataflow. The extension is designed for maximum performance, low latency, and high throughput — differential refresh is the default mode, and full refresh is a fallback of last resort. All 13 design phases are complete. This roadmap tracks the path from the v0.1.x series to 1.0 and beyond.
| Version | Theme | Status |
|---|---|---|
| v0.1.x | Core engine, DVM, CDC, scheduling, monitoring | ✅ Released |
| v0.2.0 | TopK, diamond consistency, transactional IVM | ✅ Released |
| v0.2.1 | Upgrade infrastructure & documentation | ✅ Released |
| v0.2.2 | OFFSET, AUTO mode, ALTER QUERY, CDC hardening | ✅ Released |
| v0.2.3 | Non-determinism, CDC/mode gaps, operational polish | ✅ Released |
| v0.3.0 | DVM correctness, SAST & test coverage | ✅ Released |
| v0.4.0 | Parallel refresh & performance hardening | ✅ Released |
| v0.5.0 | Row-level security & operational controls | ✅ Released |
| v0.6.0 | Partitioning, idempotent DDL, circular dependency foundation | ✅ Released |
| v0.7.0 | Performance, watermarks, circular DAG, observability | ✅ Released |
| v0.8.0 | pg_dump support & test hardening | ✅ Released |
| v0.9.0 | Incremental aggregate maintenance | ✅ Released |
| v0.10.0 | DVM hardening, connection pooler compat, refresh optimizations | ✅ Released |
| v0.11.0 | Partitioned stream tables, Prometheus/Grafana, safety hardening | ✅ Released |
| v0.12.0 | Correctness, reliability & developer tooling | ✅ Released |
| v0.13.0 | Scalability foundations, MERGE profiling, multi-tenant scheduling | ✅ Released |
| v0.14.0 | Tiered scheduling, UNLOGGED buffers & diagnostics | ✅ Released |
| v0.15.0 | External test suites & integration | ✅ Released |
| v0.16.0 | Performance & refresh optimization | ✅ Released |
| v0.17.0 | Query intelligence & stability | ✅ Released |
| v0.18.0 | Hardening & delta performance | ✅ Released |
| v0.19.0 | Production gap closure & distribution | ✅ Released |
| v0.20.0 | Dogfooding (pg_trickle monitors itself) | ✅ Released |
| v0.21.0 | PostgreSQL 17 support | Planned |
| v0.22.0 | PGlite proof of concept | Planned |
| v0.23.0 | Core extraction (pg_trickle_core) | Planned |
| v0.24.0 | PGlite WASM extension | Planned |
| v0.25.0 | PGlite reactive integration | Planned |
| v1.0.0 | Stable release (incl. PG 19 compatibility) | Planned |
v0.1.x Series — Released
Completed items (click to expand)
v0.1.0 — Released (2026-02-26)
Status: Released — all 13 design phases implemented.
Core engine, DVM with 21 OpTree operators, trigger-based CDC, DAG-aware scheduling, monitoring, dbt macro package, and 1,300+ tests.
Key additions over pre-release:
- WAL decoder pgoutput edge cases (F4)
- JOIN key column change limitation docs (F7)
- Keyless duplicate-row behavior documented (F11)
- CUBE explosion guard (F14)
v0.1.1 — Released (2026-02-27)
Patch release: WAL decoder keyless pk_hash fix (F2), old_* column population
for UPDATEs (F3), and delete_insert merge strategy removal (F1).
v0.1.2 — Released (2026-02-28)
Patch release: ALTER TYPE/POLICY DDL tracking (F6), window partition key E2E tests (F8), PgBouncer compatibility docs (F12), read replica detection (F16), SPI retry with SQLSTATE classification (F29), and 40+ additional E2E tests.
v0.1.3 — Released (2026-03-01)
Patch release: Completed 50/51 SQL_GAPS_7 items across all tiers. Highlights:
- Adaptive fallback threshold (F27), delta change metrics (F30)
- WAL decoder hardening: replay deduplication, slot lag alerting (F31–F38)
- TPC-H 22-query correctness baseline (22/22 pass, SF=0.01)
- 460 E2E tests (≥ 400 exit criterion met)
- CNPG extension image published to GHCR
See CHANGELOG.md for the full feature list.
v0.2.0 — TopK, Diamond Consistency & Transactional IVM
Status: Released (2026-03-04).
The 51-item SQL_GAPS_7 correctness plan was substantially completed in v0.1.x (50 of 51 items; the final item, F40, landed in v0.2.1). v0.2.0 delivers three major feature additions.
Completed items (click to expand)
| Tier | Items | Status |
|---|---|---|
| 0 — Critical | F1–F3, F5–F6 | ✅ Done in v0.1.1–v0.1.3 |
| 1 — Verification | F8–F10, F12 | ✅ Done in v0.1.2–v0.1.3 |
| 2 — Robustness | F13, F15–F16 | ✅ Done in v0.1.2–v0.1.3 |
| 3 — Test coverage | F17–F26 (62 E2E tests) | ✅ Done in v0.1.2–v0.1.3 |
| 4 — Operational hardening | F27–F39 | ✅ Done in v0.1.3 |
| 4 — Upgrade migrations | F40 | ✅ Done in v0.2.1 |
| 5 — Nice-to-have | F41–F51 | ✅ Done in v0.1.3 |
TPC-H baseline: 22/22 queries pass deterministic correctness checks across
multiple mutation cycles (just test-tpch, SF=0.01).
Queries are derived from the TPC-H Benchmark specification; results are not comparable to published TPC results. TPC Benchmark™ is a trademark of TPC.
ORDER BY / LIMIT / OFFSET — TopK Support ✅
In plain terms: Stream tables can now be defined with `ORDER BY ... LIMIT N` — for example, "keep the top 10 best-selling products". When the underlying data changes, only the top-N slot is updated incrementally rather than recomputing the entire sorted list from scratch every tick.
ORDER BY ... LIMIT N defining queries are accepted and refreshed correctly.
All 9 plan items (TK1–TK9) implemented, including 5 TPC-H queries with ORDER BY
restored (Q2, Q3, Q10, Q18, Q21).
| Item | Description | Status |
|---|---|---|
| TK1 | E2E tests for FETCH FIRST / FETCH NEXT rejection | ✅ Done |
| TK2 | OFFSET without ORDER BY warning in subqueries | ✅ Done |
| TK3 | detect_topk_pattern() + TopKInfo struct in parser.rs | ✅ Done |
| TK4 | Catalog columns: pgt_topk_limit, pgt_topk_order_by | ✅ Done |
| TK5 | TopK-aware refresh path (scoped recomputation via MERGE) | ✅ Done |
| TK6 | DVM pipeline bypass for TopK tables in api.rs | ✅ Done |
| TK7 | E2E + unit tests (e2e_topk_tests.rs, 18 tests) | ✅ Done |
| TK8 | Documentation (SQL Reference, FAQ, CHANGELOG) | ✅ Done |
| TK9 | TPC-H: restored ORDER BY + LIMIT in Q2, Q3, Q10, Q18, Q21 | ✅ Done |
See PLAN_ORDER_BY_LIMIT_OFFSET.md.
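A TopK defining query goes through the same `create_stream_table` API as any other view; a minimal sketch, assuming a hypothetical `products(id, name, units_sold)` table:

```sql
-- Maintain only the top-10 slot incrementally; no full re-sort per tick.
SELECT pgtrickle.create_stream_table(
    name     => 'top_products',
    query    => 'SELECT id, name, units_sold
                 FROM products
                 ORDER BY units_sold DESC
                 LIMIT 10',
    schedule => '30s'
);
```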
Diamond Dependency Consistency ✅
In plain terms: A "diamond" is when two stream tables share the same source (A → B, A → C) and a third (D) reads from both B and C. Without special handling, updating A could refresh B before C, leaving D briefly in an inconsistent state where it sees new-B but old-C. This groups B and C into an atomic refresh unit so D always sees them change together in a single step.
Atomic refresh groups eliminate the inconsistency window in diamond DAGs (A→B→D, A→C→D). All 8 plan items (D1–D8) implemented.
| Item | Description | Status |
|---|---|---|
| D1 | Data structures (Diamond, ConsistencyGroup) in dag.rs | ✅ Done |
| D2 | Diamond detection algorithm in dag.rs | ✅ Done |
| D3 | Consistency group computation in dag.rs | ✅ Done |
| D4 | Catalog columns + GUCs (diamond_consistency, diamond_schedule_policy) | ✅ Done |
| D5 | Scheduler wiring with SAVEPOINT loop | ✅ Done |
| D6 | Monitoring function pgtrickle.diamond_groups() | ✅ Done |
| D7 | E2E test suite (tests/e2e_diamond_tests.rs) | ✅ Done |
| D8 | Documentation (SQL_REFERENCE.md, CONFIGURATION.md, ARCHITECTURE.md) | ✅ Done |
See PLAN_DIAMOND_DEPENDENCY_CONSISTENCY.md.
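The detected groups can be inspected through the D6 monitoring function; `SELECT *` is used here because the exact column set is documented in SQL_REFERENCE.md:

```sql
-- List the consistency groups the scheduler refreshes atomically
SELECT * FROM pgtrickle.diamond_groups();
```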
Transactional IVM — IMMEDIATE Mode ✅
In plain terms: Normally stream tables refresh on a schedule (every N seconds). IMMEDIATE mode updates the stream table inside the same database transaction as the source table change — so by the time your INSERT/UPDATE/DELETE commits, the stream table is already up to date. Zero lag, at the cost of a slightly slower write.
New IMMEDIATE refresh mode that updates stream tables within the same
transaction as base table DML, using statement-level AFTER triggers with
transition tables. Phase 1 (core engine) and Phase 3 (extended SQL support)
are complete. Phase 2 (pg_ivm compatibility layer) is postponed. Phase 4
(performance optimizations) has partial completion (delta SQL template caching).
| Item | Description | Status |
|---|---|---|
| TI1 | RefreshMode::Immediate enum, catalog CHECK, API validation | ✅ Done |
| TI2 | Statement-level IVM trigger functions with transition tables | ✅ Done |
| TI3 | DeltaSource::TransitionTable — Scan operator dual-path | ✅ Done |
| TI4 | Delta application (DELETE + INSERT ON CONFLICT) | ✅ Done |
| TI5 | Advisory lock-based concurrency (IvmLockMode) | ✅ Done |
| TI6 | TRUNCATE handling (full refresh of stream table) | ✅ Done |
| TI7 | alter_stream_table mode switching (DIFFERENTIAL↔IMMEDIATE, FULL↔IMMEDIATE) | ✅ Done |
| TI8 | Query restriction validation (validate_immediate_mode_support) | ✅ Done |
| TI9 | Delta SQL template caching (thread-local IVM_DELTA_CACHE) | ✅ Done |
| TI10 | Window functions, LATERAL, scalar subqueries in IMMEDIATE mode | ✅ Done |
| TI11 | Cascading IMMEDIATE stream tables (ST_A → ST_B) | ✅ Done |
| TI12 | 29 E2E tests + 8 unit tests | ✅ Done |
| TI13 | Documentation (SQL Reference, Architecture, FAQ, CHANGELOG) | ✅ Done |
Remaining performance optimizations (ENR-based transition table access, aggregate fast-path, C-level trigger functions, prepared statement reuse) are tracked under post-1.0 A2.
See PLAN_TRANSACTIONAL_IVM.md.
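A minimal sketch, assuming a `refresh_mode` parameter on `create_stream_table` matching the modes described above (the `orders` table mirrors the README's opening example):

```sql
SELECT pgtrickle.create_stream_table(
    name         => 'active_orders_now',
    query        => 'SELECT * FROM orders WHERE status = ''active''',
    refresh_mode => 'IMMEDIATE'
);

-- The stream table is maintained inside the writing transaction,
-- so the read below already reflects the INSERT before COMMIT.
BEGIN;
INSERT INTO orders (id, status) VALUES (43, 'active');
SELECT count(*) FROM active_orders_now;
COMMIT;
```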
Exit criteria:
- `ORDER BY ... LIMIT N` (TopK) defining queries accepted and refreshed correctly
- TPC-H queries Q2, Q3, Q10, Q18, Q21 pass with original LIMIT restored
- Diamond dependency consistency (D1–D8) implemented and E2E-tested
- IMMEDIATE refresh mode: INSERT/UPDATE/DELETE on base table updates stream table within the same transaction
- Window functions, LATERAL, scalar subqueries work in IMMEDIATE mode
- Cascading IMMEDIATE stream tables (ST_A → ST_B) propagate correctly
- Concurrent transaction tests pass
v0.2.1 — Upgrade Infrastructure & Documentation
Status: Released (2026-03-05).
Patch release focused on upgrade safety, documentation, and three catalog
schema additions via sql/pg_trickle--0.2.0--0.2.1.sql:
Completed items (click to expand)
- `has_keyless_source BOOLEAN NOT NULL DEFAULT FALSE` — EC-06 keyless source flag; switches the apply strategy from MERGE to counted DELETE when set.
- `function_hashes TEXT` — EC-16 function-body hash map; forces a full refresh when a referenced function's body changes silently.
- `topk_offset INT` — OS2 catalog field for paged TopK OFFSET support, shipped and used in this release.
Upgrade Migration Infrastructure ✅
In plain terms: When you run `ALTER EXTENSION pg_trickle UPDATE`, all your stream tables should survive intact. This adds the safety net that makes that true: automated scripts that check every upgrade script covers all database objects, real end-to-end tests that actually perform the upgrade in a test container, and CI gates that catch regressions before they reach users.
Complete safety net for ALTER EXTENSION pg_trickle UPDATE:
| Item | Description | Status |
|---|---|---|
| U1 | scripts/check_upgrade_completeness.sh — CI completeness checker | ✅ Done |
| U2 | sql/archive/ with archived SQL baselines per version | ✅ Done |
| U3 | tests/Dockerfile.e2e-upgrade for real upgrade tests | ✅ Done |
| U4 | 6 upgrade E2E tests (function parity, stream table survival, etc.) | ✅ Done |
| U5 | CI: upgrade-check (every PR) + upgrade-e2e (push-to-main) | ✅ Done |
| U6 | docs/UPGRADING.md user-facing upgrade guide | ✅ Done |
| U7 | just check-upgrade, just build-upgrade-image, just test-upgrade | ✅ Done |
| U8 | Fixed 0.1.3→0.2.0 upgrade script (was no-op placeholder) | ✅ Done |
Documentation Expansion ✅
In plain terms: Added six new pages to the documentation book: a dbt integration guide, contributing guide, security policy, release process, a comparison with the pg_ivm extension, and a deep-dive explaining why row-level triggers were chosen over logical replication for CDC.
GitHub Pages book grew from 14 to 20 pages:
| Page | Section | Source |
|---|---|---|
| dbt Integration | Integrations | dbt-pgtrickle/README.md |
| Contributing | Reference | CONTRIBUTING.md |
| Security Policy | Reference | SECURITY.md |
| Release Process | Reference | docs/RELEASE.md |
| pg_ivm Comparison | Research | plans/ecosystem/GAP_PG_IVM_COMPARISON.md |
| Triggers vs Replication | Research | plans/sql/REPORT_TRIGGERS_VS_REPLICATION.md |
Exit criteria:
- `ALTER EXTENSION pg_trickle UPDATE` from 0.1.3→0.2.0 tested end-to-end
- Completeness check passes (upgrade script covers all pgrx-generated SQL objects)
- CI enforces upgrade script completeness on every PR
- All documentation pages build and render in mdBook
v0.2.2 — OFFSET, AUTO Mode, ALTER QUERY, Edge Cases & CDC Hardening
Status: Released (2026-03-08).
This milestone shipped paged TopK OFFSET support, AUTO-by-default refresh selection, ALTER QUERY, the remaining upgrade-tooling work, edge-case and WAL CDC hardening, IMMEDIATE-mode parity fixes, and the outstanding documentation sweep.
Completed items (click to expand)
ORDER BY + LIMIT + OFFSET (Paged TopK) — Finalization ✅
In plain terms: Extends TopK to support OFFSET — so you can define a stream table as "rows 11–20 of the top-20 best-selling products" (page 2 of a ranked list). Useful for paginated leaderboards, ranked feeds, or any use case where you want a specific window into a sorted result.
Core implementation is complete (parser, catalog, refresh path, docs, 9 E2E
tests). The topk_offset catalog column shipped in v0.2.1 and is exercised
by the paged TopK feature here.
| Item | Description | Status | Ref |
|---|---|---|---|
| OS1 | 9 OFFSET E2E tests in e2e_topk_tests.rs | ✅ Done | PLAN_OFFSET_SUPPORT.md §Step 6 |
| OS2 | sql/pg_trickle--0.2.1--0.2.2.sql — function signature updates (no schema DDL needed) | ✅ Done | PLAN_OFFSET_SUPPORT.md §Step 2 |
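Assuming the same hypothetical `products(id, name, units_sold)` table used earlier, a paged TopK definition is a sketch like:

```sql
-- Rows 11-20 of the ranked list (page 2 of a paginated leaderboard)
SELECT pgtrickle.create_stream_table(
    name  => 'products_page_2',
    query => 'SELECT id, name, units_sold
              FROM products
              ORDER BY units_sold DESC
              LIMIT 10 OFFSET 10'
);
```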
AUTO Refresh Mode ✅
In plain terms: Changes the default from "always try differential (incremental) refresh" to a smart automatic selection: use differential when the query supports it, fall back to a full re-scan when it doesn't. New stream tables also get a calculated schedule interval instead of a hardcoded 1-minute default.
| Item | Description | Status | Ref |
|---|---|---|---|
| AM1 | RefreshMode::Auto — uses DIFFERENTIAL when supported, falls back to FULL | ✅ Done | PLAN_REFRESH_MODE_DEFAULT.md |
| AM2 | create_stream_table default changed from 'DIFFERENTIAL' to 'AUTO' | ✅ Done | — |
| AM3 | create_stream_table schedule default changed from '1m' to 'calculated' | ✅ Done | — |
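With the new defaults, the minimal call is just a name and a query (table name illustrative); the explicit mode and schedule from earlier examples become optional:

```sql
-- refresh_mode defaults to 'AUTO' (DIFFERENTIAL when the query supports
-- it, FULL otherwise); schedule defaults to 'calculated'.
SELECT pgtrickle.create_stream_table(
    name  => 'signups_by_region',
    query => 'SELECT region, count(*) FROM users GROUP BY region'
);
```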
ALTER QUERY ✅
In plain terms: Lets you change the SQL query of an existing stream table without dropping and recreating it. pg_trickle inspects the old and new queries, determines what type of change was made (added a column, dropped a column, or fundamentally incompatible change), and performs the most minimal migration possible — updating in place where it can, rebuilding only when it must.
| Item | Description | Status | Ref |
|---|---|---|---|
| AQ1 | alter_stream_table(query => ...) — validate, classify schema change, migrate storage | ✅ Done | PLAN_ALTER_QUERY.md |
| AQ2 | Schema classification: same, compatible (ADD/DROP COLUMN), incompatible (full rebuild) | ✅ Done | — |
| AQ3 | ALTER-aware cycle detection (check_for_cycles_alter) | ✅ Done | — |
| AQ4 | CDC dependency migration (add/remove triggers for changed sources) | ✅ Done | — |
| AQ5 | SQL Reference & CHANGELOG documentation | ✅ Done | — |
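A sketch of the AQ1 entry point, assuming `alter_stream_table` takes the same named `name`/`query` parameters as `create_stream_table` (columns illustrative):

```sql
-- Adding a column is classified as a compatible change:
-- the storage is migrated in place, no full rebuild.
SELECT pgtrickle.alter_stream_table(
    name  => 'active_orders',
    query => 'SELECT id, status, created_at
              FROM orders
              WHERE status = ''active'''
);
```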
Upgrade Tooling ✅
In plain terms: If the compiled extension library (the `.so` file) is a different version than the SQL objects in the database, the scheduler now warns loudly at startup instead of failing in confusing ways later. Also adds FAQ entries and cross-links for common upgrade questions.
| Item | Description | Status | Ref |
|---|---|---|---|
| UG1 | Version mismatch check — scheduler warns if .so version ≠ SQL version | ✅ Done | PLAN_UPGRADE_MIGRATIONS.md §5.2 |
| UG2 | FAQ upgrade section — 3 new entries with UPGRADING.md cross-links | ✅ Done | PLAN_UPGRADE_MIGRATIONS.md §5.4 |
| UG3 | CI and local upgrade automation now target 0.2.2 (upgrade-check, upgrade-image defaults, upgrade E2E env) | ✅ Done | PLAN_UPGRADE_MIGRATIONS.md |
IMMEDIATE Mode Parity ✅
In plain terms: Closes two remaining SQL patterns that worked in DIFFERENTIAL mode but not in IMMEDIATE mode. Recursive CTEs (queries that reference themselves to compute e.g. graph reachability or org-chart hierarchies) now work in IMMEDIATE mode with a configurable depth guard. TopK (ORDER BY + LIMIT) queries also get a dedicated fast micro-refresh path in IMMEDIATE mode.
Close the gap between DIFFERENTIAL and IMMEDIATE mode SQL coverage for the two remaining high-risk patterns — recursive CTEs and TopK queries.
| Item | Description | Effort | Ref |
|---|---|---|---|
| IM1 | Validate recursive CTE semi-naive in IMMEDIATE mode; add stack-depth guard for deeply recursive defining queries | 2–3d | PLAN_EDGE_CASES_TIVM_IMPL_ORDER.md Stage 6 §5.1 |
| IM2 | TopK in IMMEDIATE mode: statement-level micro-refresh + ivm_topk_max_limit GUC | 2–3d | PLAN_EDGE_CASES_TIVM_IMPL_ORDER.md Stage 6 §5.2 |
IMMEDIATE parity subtotal: ✅ Complete (IM1 + IM2)
Edge Case Hardening ✅
In plain terms: Three targeted fixes for uncommon-but-real scenarios: a cap on CUBE/ROLLUP combinatorial explosion (which can generate thousands of grouping variants from a single query and crash the database); automatic recovery when CDC gets stuck in a "transitioning" state after a database restart; and polling-based change detection for foreign tables (tables in external databases) that can't use triggers or WAL.
Self-contained items from Stage 7 of the edge-cases/TIVM implementation plan.
| Item | Description | Effort | Ref |
|---|---|---|---|
| EC1 | pg_trickle.max_grouping_set_branches GUC — cap CUBE/ROLLUP branch-count explosion | 4h | PLAN_EDGE_CASES.md EC-02 |
| EC2 | Post-restart CDC TRANSITIONING health check — detect stuck CDC transitions after crash or restart | 1d | PLAN_EDGE_CASES.md EC-20 |
| EC3 | Foreign table support: polling-based change detection via periodic re-execution | 2–3d | PLAN_EDGE_CASES.md EC-05 |
Edge-case hardening subtotal: ✅ Complete (EC1 + EC2 + EC3)
Documentation Sweep
In plain terms: Filled three documentation gaps: what happens to an in-flight refresh if you run DDL (ALTER TABLE, DROP INDEX) at the same time; limitations when using pg_trickle on standby replicas; and a PgBouncer configuration guide explaining the session-mode requirement and incompatible settings.
Remaining documentation gaps identified in Stage 7 of the gap analysis.
| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| DS1 | DDL-during-refresh behaviour: document safe patterns and races | 2h | ✅ Done | PLAN_EDGE_CASES.md EC-17 |
| DS2 | Replication/standby limitations: document in FAQ and Architecture | 3h | ✅ Done | PLAN_EDGE_CASES.md EC-21/22/23 |
| DS3 | PgBouncer configuration guide: session-mode requirements and known incompatibilities | 2h | ✅ Done | PLAN_EDGE_CASES.md EC-28 |
Documentation sweep subtotal: ✅ Complete
WAL CDC Hardening
In plain terms: WAL (Write-Ahead Log) mode tracks changes by reading PostgreSQL's internal replication stream rather than using row-level triggers — which is more efficient and works across concurrent sessions. This work added a complete E2E test suite for WAL mode, hardened the automatic fallback from WAL to trigger mode when WAL isn't available, and promoted `cdc_mode = 'auto'` (try WAL first, fall back to triggers) to the default.
WAL decoder F2–F3 fixes (keyless pk_hash, `old_*` columns for UPDATE) landed in v0.1.3.
| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| W1 | WAL mode E2E test suite (parallel to trigger suite) | 8–12h | ✅ Done | PLAN_HYBRID_CDC.md |
| W2 | WAL→trigger automatic fallback hardening | 4–6h | ✅ Done | PLAN_HYBRID_CDC.md |
| W3 | Promote pg_trickle.cdc_mode = 'auto' to default | ~1h | ✅ Done | PLAN_HYBRID_CDC.md |
WAL CDC subtotal: ~13–19 hours
Exit criteria:
- `ORDER BY + LIMIT + OFFSET` defining queries accepted, refreshed, and E2E-tested
- `sql/pg_trickle--0.2.1--0.2.2.sql` exists (column pre-provisioned in 0.2.1; function signature updates)
- Upgrade completeness check passes for 0.2.1→0.2.2
- CI and local upgrade-E2E defaults target 0.2.2
- Version check fires at scheduler startup if `.so`/SQL versions diverge
- IMMEDIATE mode: recursive CTE semi-naive validated; `ivm_recursive_max_depth` depth guard added
- IMMEDIATE mode: TopK micro-refresh fully tested end-to-end (10 E2E tests)
- `max_grouping_set_branches` GUC guards CUBE/ROLLUP explosion (3 E2E tests)
- Post-restart CDC TRANSITIONING health check in place
- Foreign table polling-based CDC implemented (3 E2E tests)
- DDL-during-refresh and standby/replication limitations documented
- WAL CDC mode passes full E2E suite
- E2E tests pass (`just build-e2e-image && just test-e2e`)
v0.2.3 — Non-Determinism, CDC/Mode Gaps & Operational Polish
Status: Released (2026-03-09).
Completed items (click to expand)
Goal: Close a small set of high-leverage correctness and operational gaps that do not need to wait for the larger v0.3.0 parallel refresh, security, and partitioning work. This milestone tightens refresh-mode behavior, makes CDC transitions easier to observe, and removes one silent correctness hazard in DIFFERENTIAL mode.
Non-Deterministic Function Handling
In plain terms: Functions like `random()`, `gen_random_uuid()`, and `clock_timestamp()` return a different value every time they're called. In DIFFERENTIAL mode, pg_trickle computes what changed between the old and new result — but if a function changes on every call, the "change" is meaningless and produces phantom rows. This detects such functions at stream-table creation time and rejects them in DIFFERENTIAL mode (they still work fine in FULL or IMMEDIATE mode).
Status: Done. Volatility lookup, OpTree enforcement, E2E coverage, and documentation are complete.
Volatile functions (random(), gen_random_uuid(), clock_timestamp()) break
delta computation in DIFFERENTIAL mode — values change on each evaluation,
causing phantom changes and corrupted row identity hashes. This is a silent
correctness gap.
| Item | Description | Effort | Ref |
|---|---|---|---|
| ND1 | Volatility lookup via pg_proc.provolatile + recursive Expr scanner | Done | PLAN_NON_DETERMINISM.md §Part 1 |
| ND2 | OpTree volatility walker + enforcement policy (reject volatile in DIFFERENTIAL, warn for stable) | Done | PLAN_NON_DETERMINISM.md §Part 2 |
| ND3 | E2E tests (volatile rejected, stable warned, immutable allowed, nested volatile in WHERE) | Done | PLAN_NON_DETERMINISM.md §E2E Tests |
| ND4 | Documentation (SQL_REFERENCE.md, DVM_OPERATORS.md) | Done | PLAN_NON_DETERMINISM.md §Files |
Non-determinism subtotal: ~4–6 hours
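The enforcement shows up at creation time; a hedged sketch (error wording illustrative, not the extension's literal message):

```sql
-- Rejected: random() is volatile, so DIFFERENTIAL deltas would be
-- meaningless and produce phantom rows.
SELECT pgtrickle.create_stream_table(
    name         => 'sampled_orders',
    query        => 'SELECT * FROM orders WHERE random() < 0.1',
    refresh_mode => 'DIFFERENTIAL'
);  -- ERROR: volatile function in defining query

-- Accepted: FULL mode recomputes the whole result each refresh,
-- so volatility is harmless there.
SELECT pgtrickle.create_stream_table(
    name         => 'sampled_orders',
    query        => 'SELECT * FROM orders WHERE random() < 0.1',
    refresh_mode => 'FULL'
);
```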
CDC / Refresh Mode Interaction Gaps ✅
In plain terms: pg_trickle has four CDC modes (trigger, WAL, auto, per-table override) and four refresh modes (FULL, DIFFERENTIAL, IMMEDIATE, AUTO). Not every combination makes sense, and some had silent bugs. This fixed six specific gaps: stale change buffers not being flushed after FULL refreshes (so they got replayed again on the next tick), a missing error for the IMMEDIATE + WAL combination, a new `pgt_cdc_status` monitoring view, per-table CDC mode overrides, and a guard against refreshing stream tables that haven't been populated yet.
Six gaps between the four CDC modes and four refresh modes — missing
validations, resource leaks, and observability holes. Phased from quick wins
(pure Rust) to a larger feature (per-table cdc_mode override).
| Item | Description | Effort | Ref |
|---|---|---|---|
| G6 | Defensive is_populated + empty-frontier check in execute_differential_refresh() | Done | PLAN_CDC_MODE_REFRESH_MODE_GAPS.md §G6 |
| G2 | Validate IMMEDIATE + cdc_mode='wal' — global-GUC path logs INFO; explicit per-table override is rejected with a clear error | Done | PLAN_CDC_MODE_REFRESH_MODE_GAPS.md §G2 |
| G3 | Advance WAL replication slot after FULL refresh; flush change buffers | Done | PLAN_CDC_MODE_REFRESH_MODE_GAPS.md §G3 |
| G4 | Flush change buffers after AUTO→FULL adaptive fallback (prevents ping-pong) | Done | PLAN_CDC_MODE_REFRESH_MODE_GAPS.md §G4 |
| G5 | pgtrickle.pgt_cdc_status view + NOTIFY on CDC transitions | Done | PLAN_CDC_MODE_REFRESH_MODE_GAPS.md §G5 |
| G1 | Per-table cdc_mode override (SQL API, catalog, dbt, migration) | Done | PLAN_CDC_MODE_REFRESH_MODE_GAPS.md §G1 |
CDC/refresh mode gaps subtotal: ✅ Complete
Progress: G6 is now implemented in v0.2.3: the low-level differential executor rejects unpopulated stream tables and missing frontiers before it can scan from `0/0`, while the public manual-refresh path continues to fall back to FULL for `initialize => false` stream tables.
Progress: G1 and G2 are now complete: `create_stream_table()` and `alter_stream_table()` accept an optional per-table `cdc_mode` override, the requested value is stored in `pgt_stream_tables.requested_cdc_mode`, dbt forwards the setting, and shared-source WAL transition eligibility is now resolved conservatively from all dependent deferred stream tables. The cluster-wide `pg_trickle.cdc_mode = 'wal'` path still logs INFO for `refresh_mode = 'IMMEDIATE'`, while explicit per-table `cdc_mode => 'wal'` requests are rejected for IMMEDIATE mode with a clear error.
Progress: G3 and G4 are now implemented in v0.2.3: `advance_slot_to_current()` in `wal_decoder.rs` advances WAL slots after each FULL refresh; the shared `post_full_refresh_cleanup()` helper in `refresh.rs` advances all WAL/TRANSITIONING slots and flushes change buffers, called from `scheduler.rs` after every Full/Reinitialize execution and from the adaptive fallback path. This prevents change-buffer ping-pong on bulk-loaded tables.
Progress: G5 is now implemented in v0.2.3: the `pgtrickle.pgt_cdc_status` convenience view has been added, and a `cdc_modes` text-array column surfaces per-source CDC modes in `pgtrickle.pg_stat_stream_tables`. NOTIFY on CDC transitions (TRIGGER → TRANSITIONING → WAL) was already implemented via `emit_cdc_transition_notify()` in `wal_decoder.rs`.
Progress: The SQL upgrade path for these CDC and monitoring changes is in place via `sql/pg_trickle--0.2.2--0.2.3.sql`, which adds `requested_cdc_mode`, updates the `create_stream_table`/`alter_stream_table` signatures, recreates `pgtrickle.pg_stat_stream_tables`, and adds `pgtrickle.pgt_cdc_status` for `ALTER EXTENSION ... UPDATE` users.
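The G1 override rides on the existing API; a sketch using the per-table `cdc_mode` parameter described above (table names illustrative):

```sql
-- Force trigger-based CDC for one stream table even when the
-- cluster default is pg_trickle.cdc_mode = 'wal'.
SELECT pgtrickle.create_stream_table(
    name     => 'audit_summary',
    query    => 'SELECT actor, count(*) FROM audit_log GROUP BY actor',
    cdc_mode => 'trigger'
);

-- Rejected combination (G2): explicit per-table WAL CDC
-- together with IMMEDIATE refresh mode.
SELECT pgtrickle.create_stream_table(
    name         => 'live_summary',
    query        => 'SELECT actor, count(*) FROM audit_log GROUP BY actor',
    refresh_mode => 'IMMEDIATE',
    cdc_mode     => 'wal'
);  -- ERROR
```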
Operational
In plain terms: Four housekeeping improvements: clean up prepared statements when the database catalog changes (prevents stale caches after DDL); make WAL slot lag alert thresholds configurable rather than hardcoded; simplify a confusing GUC setting (`user_triggers`) with a deprecated alias; and add a `pg_trickle_dump` tool that exports all stream table definitions to a replayable SQL file — useful as a backup before running an upgrade.
| Item | Description | Effort | Ref |
|---|---|---|---|
| O1 | Prepared statement cleanup on cache invalidation | Done | GAP_SQL_PHASE_7.md G4.4 |
| O2 | Slot lag alerting thresholds configurable (slot_lag_warning_threshold_mb, slot_lag_critical_threshold_mb) | Done | PLAN_HYBRID_CDC.md §6.2 |
| O3 | Simplify pg_trickle.user_triggers GUC (canonical auto / off, deprecated on alias) | Done | PLAN_FEATURE_CLEANUP.md C5 |
| O4 | pg_trickle_dump: SQL export tool for manual backup before upgrade | Done | PLAN_UPGRADE_MIGRATIONS.md §5.3 |
Operational subtotal: Done
Progress: All four operational items are now shipped in v0.2.3. Warning-level and critical WAL slot lag thresholds are configurable, prepared `__pgt_merge_*` statements are cleaned up on shared cache invalidation, `pg_trickle.user_triggers` is simplified to canonical `auto`/`off` semantics with a deprecated `on` alias, and `pg_trickle_dump` provides a replayable SQL export for upgrade backups.
v0.2.3 total: ~45–66 hours
Exit criteria:
- Volatile functions rejected in DIFFERENTIAL mode; stable functions warned
- DIFFERENTIAL on unpopulated ST returns error (G6)
- IMMEDIATE + explicit `cdc_mode='wal'` rejected with clear error (G2)
- WAL slot advanced after FULL refresh; change buffers flushed (G3)
- Adaptive fallback flushes change buffers; no ping-pong cycles (G4)
- `pgtrickle.pgt_cdc_status` view available; NOTIFY on CDC transitions (G5)
- Prepared statement cache cleanup works after invalidation
- Per-table `cdc_mode` override functional in SQL API and dbt adapter (G1)
- Extension upgrade path tested (0.2.2 → 0.2.3)
v0.3.0 — DVM Correctness, SAST & Test Coverage
Status: Released (2026-03-11).
Completed items (click to expand)
Goal: Re-enable all 18 previously-ignored DVM correctness E2E tests by fixing HAVING, FULL OUTER JOIN, correlated EXISTS+HAVING, and correlated scalar subquery differential computation bugs. Harden the SAST toolchain with privilege-context rules and an unsafe-block baseline. Expand TPC-H coverage with rollback, mode-comparison, single-row, and DAG tests.
DVM Correctness Fixes
In plain terms: The Differential View Maintenance engine — the core algorithm that computes what changed incrementally — had four correctness bugs in specific SQL patterns. Queries using these patterns were silently producing wrong results and had their tests marked "ignored". This release fixes all four: HAVING clauses on aggregates, FULL OUTER JOINs, correlated EXISTS subqueries combined with HAVING, and correlated scalar subqueries in SELECT lists. All 18 previously-ignored E2E tests now pass.
| Item | Description | Status |
|---|---|---|
| DC1 | HAVING clause differential correctness — fix COUNT(*) rewrite and threshold-crossing upward rescan (5 tests un-ignored) | ✅ Done |
| DC2 | FULL OUTER JOIN differential correctness — fix row-id mismatch, compound GROUP BY expressions, SUM NULL semantics, and rescan CTE SELECT list (5 tests un-ignored) | ✅ Done |
| DC3 | Correlated EXISTS with HAVING differential correctness — fix EXISTS sublink parser discarding GROUP BY/HAVING, row-id mismatch for Project(SemiJoin), and diff_project row-id recomputation (1 test un-ignored) | ✅ Done |
| DC4 | Correlated scalar subquery differential correctness — rewrite_correlated_scalar_in_select rewrites correlated scalar subqueries to LEFT JOINs before DVM parsing (2 tests un-ignored) | ✅ Done |
DVM correctness subtotal: 18 previously-ignored E2E tests re-enabled (0 remaining)
SAST Program (Phases 1–3)
In plain terms: Adds formal static security analysis (SAST) to every build. CodeQL and Semgrep scan for known vulnerability patterns — for example, using SECURITY DEFINER functions without locking down `search_path`, or calling `SET ROLE` in ways that could be abused. Separately, every Rust `unsafe {}` block is inventoried and counted; any PR that adds new unsafe blocks beyond the committed baseline fails CI automatically.
| Item | Description | Status |
|---|---|---|
| S1 | CodeQL + cargo deny + initial Semgrep baseline — zero findings across 115 Rust source files | ✅ Done |
| S2 | Narrow rust.panic-in-sql-path scope — exclude src/dvm/** and src/bin/** to eliminate 351 false-positive alerts | ✅ Done |
| S3 | sql.row-security.disabled Semgrep rule — flag SET LOCAL row_security = off | ✅ Done |
| S4 | sql.set-role.present Semgrep rule — flag SET ROLE / RESET ROLE patterns | ✅ Done |
| S5 | Updated sql.security-definer.present message to require explicit SET search_path | ✅ Done |
| S6 | scripts/unsafe_inventory.sh + .unsafe-baseline — per-file unsafe { counter with committed baseline (1309 blocks across 6 files) | ✅ Done |
| S7 | .github/workflows/unsafe-inventory.yml — advisory CI workflow; fails if any file exceeds its baseline | ✅ Done |
| S8 | Remove pull_request trigger from CodeQL + Semgrep workflows (no inline PR annotations; runs on push-to-main + weekly schedule) | ✅ Done |
SAST subtotal: Phases 1–3 complete; Phase 4 rule promotion tracked as post-v0.3.0 cleanup
TPC-H Test Suite Enhancements (T1–T6)
In plain terms: TPC-H is an industry-standard analytical query benchmark — 22 queries against a simulated supply-chain database. This extends the pg_trickle TPC-H test suite to verify four additional scenarios that the basic correctness checks didn't cover: that ROLLBACK atomically undoes an IVM stream table update; that DIFFERENTIAL and IMMEDIATE mode produce identical answers for the same data; that single-row mutations work correctly (not just bulk changes); and that multi-level stream table DAGs refresh in the correct topological order.
| Item | Description | Status |
|---|---|---|
| T1 | __pgt_count < 0 guard in assert_tpch_invariant — over-retraction detector, applies to all existing TPC-H tests | ✅ Done |
| T2 | Skip-set regression guard in DIFFERENTIAL + IMMEDIATE tests — any newly skipped query not in the allowlist fails CI | ✅ Done |
| T3 | test_tpch_immediate_rollback — verify ROLLBACK restores IVM stream table atomically across RF mutations | ✅ Done |
| T4 | test_tpch_differential_vs_immediate — side-by-side comparison: both incremental modes produce identical results after shared mutations | ✅ Done |
| T5 | test_tpch_single_row_mutations + SQL fixtures — single-row INSERT/UPDATE/DELETE IVM trigger paths on Q01/Q06/Q03 | ✅ Done |
| T6a | test_tpch_dag_chain — two-level DAG (Q01 → filtered projection), refreshed in topological order | ✅ Done |
| T6b | test_tpch_dag_multi_parent — multi-parent fan-in (Q01 + Q06 → UNION ALL), DIFFERENTIAL mode | ✅ Done |
TPC-H subtotal: T1–T6 complete; 22/22 TPC-H queries passing
Exit criteria:
- All 18 previously-ignored DVM correctness E2E tests re-enabled
- SAST Phases 1–3 deployed; unsafe baseline committed; CodeQL zero findings
- TPC-H T1–T6 implemented; rollback, differential-vs-immediate, single-row, and DAG tests pass
- Extension upgrade path tested (0.2.3 → 0.3.0)
v0.4.0 — Parallel Refresh & Performance Hardening
Status: Released (2026-03-12).
Completed items (click to expand)
Goal: Deliver true parallel refresh, cut write-side CDC overhead with statement-level triggers, close a cross-source snapshot consistency gap, and ship quick ergonomic and infrastructure improvements. Together these close the main performance and operational gaps before the security and partitioning work begins.
Parallel Refresh
In plain terms: Right now the scheduler refreshes stream tables one at a time. This feature lets multiple stream tables refresh simultaneously — like running several errands at once instead of in a queue. When you have dozens of stream tables, this can cut total refresh latency dramatically.
Detailed implementation is tracked in PLAN_PARALLELISM.md. The older REPORT_PARALLELIZATION.md remains the options-analysis precursor.
| Item | Description | Effort | Ref |
|---|---|---|---|
| P1 | Phase 0–1: instrumentation, dry_run, and execution-unit DAG (atomic groups + IMMEDIATE closures) | 12–20h | PLAN_PARALLELISM.md §10 |
| P2 | Phase 2–4: job table, worker budget, dynamic refresh workers, and ready-queue dispatch | 16–28h | PLAN_PARALLELISM.md §10 |
| P3 | Phase 5–7: composite units, observability, rollout gating, and CI validation | 12–24h | PLAN_PARALLELISM.md §10 |
Progress:
- P1 — Phase 0 + Phase 1 (done): GUCs (`parallel_refresh_mode`, `max_dynamic_refresh_workers`), `ExecutionUnit`/`ExecutionUnitDag` types in `dag.rs`, IMMEDIATE-closure collapsing, dry-run logging in scheduler, 10 new unit tests (1211 total).
- P2 — Phase 2–4 (done): Job table (`pgt_scheduler_jobs`), catalog CRUD, shared-memory token pool (Phase 2). Dynamic worker entry point, spawn helper, reconciliation (Phase 3). Coordinator dispatch loop with ready-queue scheduling, per-db/cluster-wide budget enforcement, transaction-split spawning, dynamic poll interval, 8 new unit tests (Phase 4). 1233 unit tests total.
- P3a — Phase 5 (done): Composite unit execution — `execute_worker_atomic_group()` with C-level sub-transaction rollback, `execute_worker_immediate_closure()` with root-only refresh (IMMEDIATE triggers propagate downstream). Replaces the Phase 3 serial placeholder.
- P3b — Phase 6 (done): Observability — `worker_pool_status()` and `parallel_job_status()` SQL functions; `health_check()` extended with `worker_pool` and `job_queue` checks; docs updated.
- P3c — Phase 7 (done): Rollout — GUC documentation in `CONFIGURATION.md`, worker-budget guidance in `ARCHITECTURE.md`, CI E2E coverage with `PGT_PARALLEL_MODE=on`; the feature stays gated behind the `parallel_refresh_mode = 'off'` default.
Parallel refresh subtotal: ~40–72 hours
Statement-Level CDC Triggers
In plain terms: Previously, when you updated 1,000 rows in a source table, the database fired a "row changed" notification 1,000 times — once per row. Now it fires once per statement, handing off all 1,000 changed rows in a single batch. For bulk operations like data imports or batch updates this is 50–80% cheaper; for single-row changes you won't notice a difference.
Replace per-row AFTER triggers with statement-level triggers using
NEW TABLE AS __pgt_new / OLD TABLE AS __pgt_old. Expected write-side
trigger overhead reduction of 50–80% for bulk DML; neutral for single-row.
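As a rough sketch of the generated DDL's shape, the helper below builds per-event statement triggers with transition tables. The trigger and capture-function names are illustrative assumptions; the actual SQL emitted by build_stmt_trigger_fn_sql in cdc.rs may differ. Note that PostgreSQL only permits transition tables on single-event triggers, so one trigger per DML event is assumed here.

```python
# Hypothetical sketch of statement-level CDC trigger DDL. Trigger and
# function names are illustrative, not pg_trickle's actual output.
# PostgreSQL allows transition tables only on single-event triggers,
# hence one trigger per DML event.

def stmt_trigger_ddl(table: str, capture_fn: str) -> list[str]:
    events = {
        "insert": "REFERENCING NEW TABLE AS __pgt_new",
        "update": "REFERENCING OLD TABLE AS __pgt_old NEW TABLE AS __pgt_new",
        "delete": "REFERENCING OLD TABLE AS __pgt_old",
    }
    return [
        f"CREATE TRIGGER __pgt_{ev}_{table} AFTER {ev.upper()} ON {table} "
        f"{ref} FOR EACH STATEMENT EXECUTE FUNCTION {capture_fn}()"
        for ev, ref in events.items()
    ]

for ddl in stmt_trigger_ddl("orders", "pgtrickle.__pgt_capture"):
    print(ddl)
```

The key point is the `FOR EACH STATEMENT` clause: a 1,000-row `UPDATE` fires the capture function once, with all changed rows available in the `__pgt_new`/`__pgt_old` transition tables.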
| Item | Description | Effort | Ref |
|---|---|---|---|
| B1 | Statement-level trigger creation with transition tables | ✅ Done — build_stmt_trigger_fn_sql in cdc.rs; REFERENCING NEW TABLE AS __pgt_new OLD TABLE AS __pgt_old FOR EACH STATEMENT created by create_change_trigger | |
| B2 | pg_trickle.cdc_trigger_mode = 'statement' / 'row' GUC + migration to replace row-level triggers on ALTER EXTENSION UPDATE | ✅ Done — CdcTriggerMode enum in config.rs; rebuild_cdc_triggers() in api.rs; 0.3.0→0.4.0 upgrade script migrates existing triggers | |
| B3 | Benchmark harness comparing statement-level vs row-level trigger overhead | ✅ Done — bench_stmt_vs_row_cdc_matrix + bench_stmt_vs_row_cdc_quick in e2e_bench_tests.rs; runs via cargo test -- --ignored bench_stmt_vs_row_cdc_matrix | |
Statement-level CDC subtotal: ✅ All done (~14h)
Cross-Source Snapshot Consistency (Phase 1)
In plain terms: Imagine a stream table that joins `orders` and `customers`. If a single transaction updates both tables, the old scheduler could read the new `orders` data but the old `customers` data — a half-applied, internally inconsistent snapshot. This fix takes a "freeze frame" of the change log at the start of each scheduler tick and only processes changes up to that point, so all sources are always read from the same moment in time. Zero configuration required.
At start of each scheduler tick, snapshot pg_current_wal_lsn() as a
tick_watermark and cap all CDC consumption to that LSN. Zero user
configuration — prevents interleaved reads from two sources that were
updated in the same transaction from producing an inconsistent stream table.
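A minimal model of the watermark cap, with LSNs as plain integers rather than pg_lsn values:

```python
# Minimal model of tick-watermark capping: CDC consumption for every
# source stops at the LSN snapshotted when the tick began, so no source
# is read "ahead" of another. LSNs are plain ints for illustration.

def consume_up_to(changes_by_source: dict, tick_watermark: int) -> dict:
    return {
        src: [c for c in changes if c["lsn"] <= tick_watermark]
        for src, changes in changes_by_source.items()
    }

# One transaction touched both tables at LSN 205. A tick whose
# watermark was snapshotted at 200 sees neither half of it, never
# just one.
pending = {
    "orders":    [{"lsn": 100, "op": "I"}, {"lsn": 205, "op": "U"}],
    "customers": [{"lsn": 150, "op": "U"}, {"lsn": 205, "op": "U"}],
}
visible = consume_up_to(pending, 200)
```

The LSN-205 changes are simply picked up whole on the next tick, once the new watermark covers them.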
| Item | Description | Effort | Ref |
|---|---|---|---|
| | Snapshot pg_current_wal_lsn() per tick; cap frontier advance; log in pgt_refresh_history; pg_trickle.tick_watermark_enabled GUC (default on) | ✅ Done | |
Cross-source consistency subtotal: ✅ All done
Ergonomic Hardening
In plain terms: Added helpful warning messages for common mistakes: "your WAL level isn't configured for logical replication", "this source table has no primary key — duplicate rows may appear", "this change will trigger a full re-scan of all source data". Think of these as friendly guardrails that explain why something might not work as expected.
| Item | Description | Effort | Ref |
|---|---|---|---|
| | WARNING at _PG_init when cdc_mode='auto' but wal_level != 'logical' — prevents silent trigger-only operation | ✅ Done | |
| | WARNING at create_stream_table when source has no primary key — surfaces keyless duplicate-row risk | ✅ Done (pre-existing in warn_source_table_properties) | |
| | WARNING when alter_stream_table triggers an implicit full refresh | ✅ Done | |
Ergonomic hardening subtotal: ✅ All done
Code Coverage
In plain terms: Every pull request now automatically reports what percentage of the code is exercised by tests, and which specific lines are never touched. It's like a map that highlights the unlit corners — helpful for spotting blind spots before they become bugs.
| Item | Description | Effort | Ref |
|---|---|---|---|
| | Wire coverage upload into CI: add codecov.yml with patch targets for src/dvm/, add README badge, verify first upload | ✅ Done — reports live at app.codecov.io/github/grove/pg-trickle | |
v0.4.0 total: ~60–94 hours
Exit criteria:

- `max_concurrent_refreshes` drives real parallel refresh via coordinator + dynamic refresh workers
- Statement-level CDC triggers implemented (B1/B2/B3); benchmark harness in `bench_stmt_vs_row_cdc_matrix`
- LSN tick watermark active by default; no interleaved-source inconsistency in E2E tests
- Codecov badge on README; coverage report uploading
- Extension upgrade path tested (0.3.0 → 0.4.0)
v0.5.0 — Row-Level Security & Operational Controls
Status: Released (2026-03-13).
Completed items (click to expand)
Goal: Harden the security context for stream tables and IVM triggers, add source-level pause/resume gating for bulk-load coordination, and deliver small ergonomic improvements.
Row-Level Security (RLS) Support
In plain terms: Row-level security lets you write policies like "user Alice can only see rows where `tenant_id = 'alice'`". Stream tables already honour these policies when users query them. What this work fixes is the machinery behind the scenes — the triggers and refresh functions that build the stream table need to see all rows regardless of who is running them; otherwise they'd produce an incomplete result. This phase hardens those internal components so they always have full visibility, while end-users still see only their filtered slice.
Stream tables materialize the full result set (like MATERIALIZED VIEW). RLS
is applied on the stream table itself for read-side filtering. Phase 1
hardens the security context; Phase 2 adds a tutorial; Phase 3 completes DDL
tracking. Phase 4 (per-role security_invoker) is deferred to post-1.0.
| Item | Description | Effort | Ref |
|---|---|---|---|
| R1 | Document RLS semantics in SQL_REFERENCE.md and FAQ.md | 1h | PLAN_ROW_LEVEL_SECURITY.md §3.1 |
| R2 | Disable RLS on change buffer tables (ALTER TABLE ... DISABLE ROW LEVEL SECURITY) | 30min | PLAN_ROW_LEVEL_SECURITY.md §3.1 R2 |
| R3 | Force superuser context for manual refresh_stream_table() (prevent "who refreshed it?" hazard) | 2h | PLAN_ROW_LEVEL_SECURITY.md §3.1 R3 |
| R4 | Force SECURITY DEFINER on IVM trigger functions (IMMEDIATE mode delta queries must see all rows) | 2h | PLAN_ROW_LEVEL_SECURITY.md §3.1 R4 |
| R5 | E2E test: RLS on source table does not affect stream table content | 1h | PLAN_ROW_LEVEL_SECURITY.md §3.1 R5 |
| R6 | Tutorial: RLS on stream tables (enable RLS, per-tenant policies, verify filtering) | 1.5h | PLAN_ROW_LEVEL_SECURITY.md §3.2 R6 |
| R7 | E2E test: RLS on stream table filters reads per role | 1h | PLAN_ROW_LEVEL_SECURITY.md §3.2 R7 |
| R8 | E2E test: IMMEDIATE mode + RLS on stream table | 30min | PLAN_ROW_LEVEL_SECURITY.md §3.2 R8 |
| R9 | Track ENABLE/DISABLE RLS DDL on source tables (AT_EnableRowSecurity et al.) in hooks.rs | 2h | PLAN_ROW_LEVEL_SECURITY.md §3.3 R9 |
| R10 | E2E test: ENABLE RLS on source table triggers reinit | 1h | PLAN_ROW_LEVEL_SECURITY.md §3.3 R10 |
RLS subtotal: ~8–12 hours (Phase 4 `security_invoker` deferred to post-1.0)
Bootstrap Source Gating
In plain terms: A pause/resume switch for individual source tables. If you're bulk-loading 10 million rows into a source table (a nightly ETL import, for example), you can "gate" it first — the scheduler will skip refreshing any stream table that reads from it. Once the load is done you "ungate" it and a single clean refresh runs. Without gating, the CDC system would frantically process millions of intermediate changes during the load, most of which get immediately overwritten anyway.
Allow operators to pause CDC consumption for specific source tables (e.g. during bulk loads or ETL windows) without dropping and recreating stream tables. The scheduler skips any stream table whose transitive source set intersects the current gated set.
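The skip rule can be sketched in a few lines: resolve each stream table's transitive base-table set, then skip it if that set intersects the gated set. The dependency map below is illustrative.

```python
# Sketch of the gating skip rule: a stream table is skipped when any
# base table it (transitively) reads from is currently gated.

def transitive_sources(table: str, reads_from: dict) -> set:
    """Walk the dependency DAG down to every reachable source."""
    seen, stack = set(), [table]
    while stack:
        for dep in reads_from.get(stack.pop(), ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen

def should_skip(table: str, reads_from: dict, gated: set) -> bool:
    return bool(transitive_sources(table, reads_from) & gated)

reads_from = {
    "daily_rollup": ["active_orders"],   # stream table over a stream table
    "active_orders": ["orders"],
    "product_stats": ["products"],
}
gated = {"orders"}
```

With `orders` gated, both `active_orders` and the downstream `daily_rollup` are skipped, while `product_stats` keeps refreshing normally.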
| Item | Description | Effort | Ref |
|---|---|---|---|
| BOOT-1 | pgtrickle.pgt_source_gates catalog table (source_relid, gated, gated_at, gated_by) | 30min | PLAN_BOOTSTRAP_GATING.md |
| BOOT-2 | gate_source(source TEXT) SQL function — sets gate, pg_notify scheduler | 1h | PLAN_BOOTSTRAP_GATING.md |
| BOOT-3 | ungate_source(source TEXT) + source_gates() introspection view | 30min | PLAN_BOOTSTRAP_GATING.md |
| BOOT-4 | Scheduler integration: load gated-source set per tick; skip and log SKIP in pgt_refresh_history | 2–3h | PLAN_BOOTSTRAP_GATING.md |
| BOOT-5 | E2E tests: single-source gate, coordinated multi-source, partial DAG, bootstrap with initialize => false | 3–4h | PLAN_BOOTSTRAP_GATING.md |
Bootstrap source gating subtotal: ~7–9 hours
Ergonomics & API Polish
In plain terms: A handful of quality-of-life improvements: track when someone manually triggered a refresh and log it in the history table; a one-row `quick_health` view that tells you at a glance whether the extension is healthy (total tables, any errors, any stale tables, scheduler running); a `create_stream_table_if_not_exists()` helper so deployment scripts don't crash if the table was already created; and `CALL` syntax wrappers so the functions feel like native PostgreSQL commands rather than extension functions.
| Item | Description | Effort | Ref |
|---|---|---|---|
| ERG-D | Record manual refresh_stream_table() calls in pgt_refresh_history with initiated_by='MANUAL' | 2h | PLAN_ERGONOMICS.md §D |
| ERG-E | pgtrickle.quick_health view — single-row status summary (total_stream_tables, error_tables, stale_tables, scheduler_running, status) | 2h | PLAN_ERGONOMICS.md §E |
| COR-2 | create_stream_table_if_not_exists() convenience wrapper | 30min | PLAN_CREATE_OR_REPLACE.md §COR-2 |
| NAT-CALL | CREATE PROCEDURE wrappers for all four main SQL functions — enables CALL pgtrickle.create_stream_table(...) syntax | Deferred — PostgreSQL does not allow procedures and functions with the same name and argument types | |
Ergonomics subtotal: ~5–5.5 hours (NAT-CALL deferred)
Performance Foundations (Wave 1)
These quick-win items from PLAN_NEW_STUFF.md ship alongside the RLS and operational work. Read the risk analyses in that document before implementing any item.
| Item | Description | Effort | Ref |
|---|---|---|---|
| A-3a | MERGE bypass — Append-Only INSERT path: expose APPEND ONLY declaration on CREATE STREAM TABLE; CDC heuristic fallback (fast-path until first DELETE/UPDATE seen) | 1–2 wk | PLAN_NEW_STUFF.md §A-3 |
A-4, B-2, and C-4 deferred to v0.6.0 Performance Wave 2 (scope mismatch with the RLS/operational-controls theme; correctness risk warrants a dedicated wave).
Performance foundations subtotal: ~10–20h (A-3a only)
v0.5.0 total: ~51–97h
Exit criteria:
- RLS semantics documented; change buffers RLS-hardened; IVM triggers SECURITY DEFINER
- RLS on stream table E2E-tested (DIFFERENTIAL + IMMEDIATE)
- `gate_source` / `ungate_source` operational; scheduler skips gated sources correctly
- `quick_health` view and `create_stream_table_if_not_exists` available
- Manual refresh calls recorded in history with `initiated_by='MANUAL'`
- A-3a: Append-Only INSERT path eliminates MERGE for event-sourced stream tables
- Extension upgrade path tested (0.4.0 → 0.5.0)
v0.6.0 — Partitioning, Idempotent DDL, Edge Cases & Circular Dependency Foundation
Status: Released (2026-03-14).
Completed items (click to expand)
Goal: Validate partitioned source tables, add create_or_replace_stream_table
for idempotent deployments (critical for dbt and migration workflows), close all
remaining P0/P1 edge cases and two usability-tier gaps, harden ergonomics and
source gating, expand the dbt integration, fill SQL documentation gaps, and lay
the foundation for circular stream table DAGs.
Partitioning Support (Source Tables)
In plain terms: PostgreSQL lets you split large tables into smaller "partitions" — for example one partition per month for an `orders` table. This is a common technique for managing very large datasets. This work teaches pg_trickle to track all those partitions as a unit, so adding a new monthly partition doesn't silently break stream tables that depend on `orders`. It also handles the special case of foreign tables (tables that live in another database), restricting them to full-scan refresh since they can't be change-tracked the normal way.
| Item | Description | Effort | Ref |
|---|---|---|---|
| | | 8–12h | PLAN_PARTITIONING_SHARDING.md §7 |
| | When you run ALTER TABLE orders ATTACH PARTITION orders_2026_04 ..., pg_trickle notices and rebuilds affected stream tables so the new partition's data is included. Without this, the new partition would be silently ignored. | 4–8h | PLAN_PARTITIONING_SHARDING.md §3.3 |
| | | 2–4h | PLAN_PARTITIONING_SHARDING.md §3.4 |
| | Foreign tables (e.g. postgres_fdw) can't have triggers or WAL tracking. pg_trickle now detects them and automatically uses full-scan refresh mode instead of failing with a confusing error. | 2–4h | PLAN_PARTITIONING_SHARDING.md §6.3 |
| | | 2–4h | PLAN_PARTITIONING_SHARDING.md §8 |
Partitioning subtotal: ~18–32 hours
Idempotent DDL (create_or_replace) ✅
In plain terms: Right now if you run `create_stream_table()` twice with the same name it errors out, and changing the query means `drop_stream_table()` followed by `create_stream_table()` — which loses all the data in between. `create_or_replace_stream_table()` does the right thing automatically: if nothing changed it's a no-op, if only settings changed it updates in place, if the query changed it rebuilds. This is the same pattern as `CREATE OR REPLACE FUNCTION` in PostgreSQL — and it's exactly what the dbt materialization macro needs so every `dbt run` doesn't drop and recreate tables from scratch.
create_or_replace_stream_table() performs a smart diff: no-op if identical,
in-place alter for config-only changes, schema migration for ADD/DROP column,
full rebuild for incompatible changes. Eliminates the drop-and-recreate
pattern used by the dbt materialization macro.
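The decision ladder can be sketched as follows. The definition fields compared here (query, columns, settings) are assumptions about shape, not the extension's actual catalog layout.

```python
# Sketch of the create_or_replace decision ladder: pick the cheapest
# action for the difference between existing and new definitions.
# Field names are illustrative assumptions.

def plan_replace(existing: dict, new: dict) -> str:
    if existing == new:
        return "noop"                # identical definition: do nothing
    if existing["query"] == new["query"]:
        return "alter_settings"      # config-only change (schedule etc.)
    old_cols, new_cols = set(existing["columns"]), set(new["columns"])
    if old_cols <= new_cols or new_cols <= old_cols:
        return "migrate_columns"     # pure ADD COLUMN / DROP COLUMN
    return "rebuild"                 # incompatible query change

base = {
    "query": "SELECT id, total FROM orders",
    "columns": ["id", "total"],
    "settings": {"schedule": "30s"},
}
```

A deployment script can then call the function repeatedly: unchanged models resolve to `noop`, and only genuinely changed stream tables pay the rebuild cost.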
| Item | Description | Effort | Ref |
|---|---|---|---|
| | create_or_replace_stream_table() compares the new definition against the existing one and picks the cheapest path: no-op if identical, settings-only update if just config changed, column migration if columns were added/dropped, or full rebuild if the query is fundamentally different. One function call replaces the drop-and-recreate dance. | 4h | PLAN_CREATE_OR_REPLACE.md |
| | Update the stream_table dbt materialization macro to call create_or_replace instead of dropping and recreating on every dbt run. Existing data survives deployments; only genuinely changed stream tables get rebuilt. | 2h | PLAN_CREATE_OR_REPLACE.md |
| | Upgrade script for ALTER EXTENSION UPDATE. SQL Reference and FAQ updated with usage examples. | 2.5h | PLAN_CREATE_OR_REPLACE.md |
| | | 4h | PLAN_CREATE_OR_REPLACE.md |
Idempotent DDL subtotal: ~12–13 hours
Circular Dependency Foundation ✅
In plain terms: Normally stream tables form a one-way chain: A feeds B, B feeds C. A circular dependency means A feeds B which feeds A — usually a mistake, but occasionally useful for iterative computations like graph reachability or recursive aggregations. This lays the groundwork — the algorithms, catalog columns, and GUC settings — to eventually allow controlled circular stream tables. The actual live execution is completed in v0.7.0.
Forms the prerequisite for full SCC-based fixpoint refresh in v0.7.0.
| Item | Description | Effort | Ref |
|---|---|---|---|
| | | ~2h | PLAN_CIRCULAR_REFERENCES.md Part 1 |
| | | ~1h | PLAN_CIRCULAR_REFERENCES.md Part 2 |
| | | ~1h | PLAN_CIRCULAR_REFERENCES.md Part 3 |
| | max_fixpoint_iterations (default 100) prevents runaway loops, and allow_circular (default off) is the master switch — circular dependencies are rejected unless you explicitly opt in. | ~30min | PLAN_CIRCULAR_REFERENCES.md Part 4 |
Circular dependency foundation subtotal: ~4.5 hours
Edge Case Hardening
In plain terms: Six remaining edge cases from the PLAN_EDGE_CASES.md catalogue — one data correctness issue (P0), three operational-surprise items (P1), and two usability gaps (P2). Together they close every open edge case above "accepted trade-off" status.
P0 — Data Correctness
| Item | Description | Effort | Ref |
|---|---|---|---|
| EC-19 | WAL-mode CDC on a keyless source table requires REPLICA IDENTITY FULL to send complete row data. Without it, deltas are silently incomplete. This rejects the combination at creation time with a clear error instead of producing wrong results. | 0.5 day | PLAN_EDGE_CASES.md EC-19 |
P1 — Operational Safety
| Item | Description | Effort | Ref |
|---|---|---|---|
| EC-16 | If a stream table's query calls a function like calculate_discount() and someone does CREATE OR REPLACE FUNCTION calculate_discount(...) with new logic, the stream table's cached computation plan becomes stale. This checks function body hashes on each refresh and triggers a rebuild when a change is detected. | 2 days | PLAN_EDGE_CASES.md EC-16 |
| EC-18 | With cdc_mode = 'auto', pg_trickle is supposed to upgrade from trigger-based to WAL-based change tracking when possible. If it stays stuck on triggers (e.g. because wal_level isn't set to logical), there's no feedback. This adds a periodic log message explaining the reason and surfaces it in the health_check() output. | 1 day | PLAN_EDGE_CASES.md EC-18 |
| EC-34 | After restoring from pg_basebackup, replication slots are lost. pg_trickle's WAL decoder would fail trying to read from a slot that no longer exists. This detects the missing slot, automatically falls back to trigger-based tracking, and logs a WARNING so you know what happened. | 1 day | PLAN_EDGE_CASES.md EC-34 |
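The EC-16 detection can be modeled simply: hash each referenced function's body at stream-table creation, re-hash on each refresh, and rebuild on mismatch. Here a plain dict stands in for the system catalog that holds function bodies.

```python
import hashlib

# Model of EC-16 staleness detection: compare each referenced
# function's current body hash against the hash stored when the
# stream table was created. The `catalog` dict stands in for the
# real function-body catalog lookup.

def body_hash(src: str) -> str:
    return hashlib.sha256(src.encode()).hexdigest()

def stale_functions(stored: dict, catalog: dict) -> list:
    """Functions whose body changed since their hash was recorded."""
    return [fn for fn, h in stored.items() if body_hash(catalog[fn]) != h]

catalog = {"calculate_discount": "RETURN price * 0.9"}
stored = {"calculate_discount": body_hash(catalog["calculate_discount"])}

# Someone runs CREATE OR REPLACE FUNCTION with new logic:
catalog["calculate_discount"] = "RETURN price * 0.8"
```

Polling a hash per refresh is cheap, and a non-empty `stale_functions` result is the signal to rebuild the dependent stream table.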
P2 — Usability Gaps
| Item | Description | Effort | Ref |
|---|---|---|---|
| EC-03 | Expressions like CASE WHEN ROW_NUMBER() OVER (...) = 1 THEN 'first' ELSE 'other' END are currently rejected because the incremental engine can't handle a window function nested inside a CASE. This automatically extracts the window function into a preliminary step and rewrites the outer query to reference the precomputed result — so the query pattern just works. | 3–5 days | PLAN_EDGE_CASES.md EC-03 |
| EC-32 | Support ALL (subquery) comparisons. Queries like WHERE price > ALL (SELECT price FROM competitors) (meaning "greater than every row in the subquery") are currently rejected in incremental mode. This rewrites them into an equivalent form the engine can handle, removing a Known Limitation from the changelog. | 2–3 days | PLAN_EDGE_CASES.md EC-32 |
Edge case hardening subtotal: ~9.5–13.5 days
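Whatever form the EC-32 rewrite takes, it has to preserve SQL's three-valued `ALL` semantics — which is exactly where a naive `NOT EXISTS` anti-join goes wrong on NULLs. A minimal reference model of the behaviour the rewrite must reproduce:

```python
# Reference model of SQL's three-valued x > ALL(subquery) semantics.
# A rewrite to an anti-join must reproduce all of these cases,
# including the NULL ones -- hence "NULL-safe".

def gt_all(x, rows):
    result = True                 # vacuously TRUE for an empty subquery
    for r in rows:
        cmp = None if (x is None or r is None) else x > r
        if cmp is False:
            return False          # one refuting row decides the outcome
        if cmp is None:
            result = None         # an unknown comparison leaves it unknown
    return result
```

The tricky cases: an empty subquery yields TRUE, a NULL in the subquery can only downgrade TRUE to NULL (never to FALSE), and a single refuting row wins regardless of NULLs.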
Ergonomics Follow-Up ✅
In plain terms: Several test gaps and a documentation item were left over from the v0.5.0 ergonomics work. These are all small E2E tests that confirm existing features actually produce the warnings and errors they're supposed to — catching regressions before users hit them. The changelog entry documents breaking behavioural changes (the default schedule changed from a fixed "every 1 minute" to an auto-calculated interval, and `NULL` schedule input is now rejected).
| Item | Description | Effort | Ref |
|---|---|---|---|
| | E2E test: verify that 'calculated' as a schedule works (pg_trickle picks an interval based on table size) and that passing NULL gives a clear error instead of silently breaking. Catches regressions in the schedule parser. | 4h | PLAN_ERGONOMICS.md §Remaining follow-up |
| | The diamond_consistency GUC was removed in v0.4.0. Verify that SHOW pg_trickle.diamond_consistency returns an error — not a stale value from a previous installation that confuses users. | 2h | PLAN_ERGONOMICS.md §Remaining follow-up |
| | E2E test: when a user changes a query via alter_stream_table(query => ...), it may trigger an expensive full re-scan. Verify the WARNING appears so users aren't surprised by a sudden spike in load. | 3h | PLAN_ERGONOMICS.md §Remaining follow-up |
| | E2E test: when cdc_mode = 'auto' but PostgreSQL's wal_level isn't set to logical, pg_trickle can't use WAL-based tracking and silently falls back to triggers. Verify the startup WARNING appears so operators know they need to change wal_level. | 3h | PLAN_ERGONOMICS.md §Remaining follow-up |
| | CHANGELOG entry: the default schedule changed and NULL schedule input started being rejected. These behavioural changes need explicit CHANGELOG entries so upgrading users aren't caught off guard. | 2h | PLAN_ERGONOMICS.md §Remaining follow-up |
Ergonomics follow-up subtotal: ~14 hours
Bootstrap Source Gating Follow-Up ✅
In plain terms: Source gating (pause/resume for bulk loads) shipped in v0.5.0 with the core API and scheduler integration. This follow-up adds robustness tests for edge cases that real-world ETL pipelines will hit: What happens if you gate a source twice? What if you re-gate it after ungating? It also adds a dedicated introspection function that shows the full gate lifecycle (when gated, who gated it, how long it's been gated), and documentation showing common ETL coordination patterns like "gate → bulk load → ungate → single clean refresh."
| Item | Description | Effort | Ref |
|---|---|---|---|
| | E2E test: calling gate_source('orders') when orders is already gated is a harmless no-op — not an error. Important for ETL scripts that may retry on failure. | 3h | PLAN_BOOTSTRAP_GATING.md |
| | | 3h | PLAN_BOOTSTRAP_GATING.md |
| | A bootstrap_gate_status() function that shows which sources are gated, when they were gated, who gated them, and how long they've been paused. Useful for debugging when the scheduler seems to be "doing nothing" — it might just be waiting for a gate. | 3h | PLAN_BOOTSTRAP_GATING.md |
| | | 3h | PLAN_BOOTSTRAP_GATING.md |
Bootstrap gating follow-up subtotal: ~12 hours
dbt Integration Enhancements ✅
In plain terms: The dbt macro package (`dbt-pgtrickle`) shipped in v0.4.0 with the core `stream_table` materialization. This adds three improvements: a `stream_table_status` macro that lets dbt models query health information (stale? erroring? how many refreshes?) so you can build dbt tests that fail when a stream table is unhealthy; a bulk `refresh_all_stream_tables` operation for CI pipelines that need everything fresh before running tests; and expanded integration tests covering the `alter_stream_table` flow (which gets more important once `create_or_replace` lands in the same release).
| Item | Description | Effort | Ref |
|---|---|---|---|
| | A stream_table_status() macro that returns whether a stream table is healthy, stale, or erroring — so you can write dbt tests like "fail if the orders summary hasn't refreshed in the last 5 minutes." Makes pg_trickle a first-class citizen in dbt's testing framework. | 3h | PLAN_ECO_SYSTEM.md §Project 1 |
| | A dbt run-operation refresh_all_stream_tables command that refreshes all stream tables in the correct dependency order. Designed for CI pipelines: run it after dbt run and before dbt test to make sure all materialized data is current. | 2h | PLAN_ECO_SYSTEM.md §Project 1 |
| | Expanded integration tests for the alter_stream_table flow against the stream_table materialization. Especially important now that create_or_replace is landing in the same release. | 3h | PLAN_ECO_SYSTEM.md §Project 1 |
dbt integration subtotal: ~8 hours
SQL Documentation Gaps ✅
In plain terms: Once EC-03 (window functions in expressions) and EC-32 (`ALL (subquery)`) are implemented in this release, the documentation needs to explain the new patterns with examples. The foreign table polling CDC feature (shipped in v0.2.2) also needs a worked example showing common setups like `postgres_fdw` source tables with periodic polling.
| Item | Description | Effort | Ref |
|---|---|---|---|
| | Document what you can now write, e.g. WHERE price > ALL (SELECT ...), how pg_trickle rewrites it internally, and a complete worked example with sample data and expected output. | 2h | GAP_SQL_OVERVIEW.md |
| | Document window functions in expressions: "you can now write CASE WHEN ROW_NUMBER() ..., and here's what pg_trickle does under the hood to make it work incrementally." | 2h | PLAN_EDGE_CASES.md EC-03 |
| | Worked example: set up a postgres_fdw foreign table, use it as a stream table source with polling-based change detection, and what to expect in terms of refresh behaviour. This feature shipped in v0.2.2 but was never properly documented with an example. | 1h | Existing feature (v0.2.2) |
SQL documentation subtotal: ~5 hours
v0.6.0 total: ~77–92h
Exit criteria:
- Partitioned source tables E2E-tested; ATTACH PARTITION detected
- WAL mode works with `publish_via_partition_root = true`
- `create_or_replace_stream_table` deployed; dbt macro updated
- SCC algorithm in place; monotonicity checker rejects non-monotone cycles
- WAL + keyless without REPLICA IDENTITY FULL rejected at creation (EC-19)
- `ALTER FUNCTION` body changes detected via `pg_proc` hash polling (EC-16)
- Stuck `auto` CDC mode surfaces explanation in logs and health check (EC-18)
- Missing WAL slot after restore auto-detected with TRIGGER fallback (EC-34)
- Window functions in expressions supported via subquery-lift rewrite (EC-03)
- `ALL (subquery)` rewritten to NULL-safe anti-join (EC-32)
- Ergonomics E2E tests for calculated schedule, warnings, and removed GUCs pass
- `gate_source()` idempotency and re-gating tested; `bootstrap_gate_status()` available
- dbt `stream_table_status()` and `refresh_all_stream_tables` macros shipped
- SQL Reference updated for EC-03, EC-32, and foreign table polling patterns
- Extension upgrade path tested (0.5.0 → 0.6.0)
v0.7.0 — Performance, Watermarks, Circular DAG Execution, Observability & Infrastructure
Status: Released (2026-03-16).
Goal: Land Part 9 performance improvements (parallel refresh scheduling, MERGE strategy optimization, advanced benchmarks), add user-injected temporal watermark gating for batch-ETL coordination, complete the fixpoint scheduler for circular stream table DAGs, ship ready-made Prometheus/Grafana monitoring, and prepare the 1.0 packaging and deployment infrastructure.
Completed items (click to expand)
Watermark Gating
In plain terms: A scheduling control for ETL pipelines where multiple source tables are populated by separate jobs that finish at different times. For example, `orders` might be loaded by a job that finishes at 02:00 and `products` by one that finishes at 03:00. Without watermarks, the scheduler might refresh a stream table that joins the two at 02:30, producing a half-complete result. Watermarks let each ETL job declare "I'm done up to timestamp X", and the scheduler waits until all sources are caught up within a configurable tolerance before proceeding.
Let producers signal their progress so the scheduler only refreshes stream tables when all contributing sources are aligned within a configurable tolerance. The primary use case is nightly batch ETL pipelines where multiple source tables are populated on different schedules.
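The alignment test at the heart of this is small: a watermark group is refresh-ready when the spread between its slowest and fastest source watermark is within the group's tolerance. A sketch:

```python
from datetime import datetime, timedelta

# Sketch of the watermark alignment check: a group is refresh-ready
# when the spread between its slowest and fastest source watermark
# is within the configured tolerance.

def aligned(watermarks: dict, tolerance: timedelta) -> bool:
    ts = list(watermarks.values())
    return max(ts) - min(ts) <= tolerance

# Both nightly jobs have reported completion for the same day:
ready = {
    "orders":   datetime(2026, 3, 15, 2, 0),   # job finished 02:00
    "products": datetime(2026, 3, 15, 3, 0),   # job finished 03:00
}
# The products loader hasn't run yet -- its watermark is a day behind:
lagging = {
    "orders":   datetime(2026, 3, 15, 2, 0),
    "products": datetime(2026, 3, 14, 3, 0),
}
```

With a two-hour tolerance, the first group refreshes and the second is skipped with a misalignment reason until the lagging job calls `advance_watermark`.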
| Item | Description | Effort | Ref |
|---|---|---|---|
| | pgt_watermarks table (source_relid, current_watermark, updated_at, wal_lsn_at_advance); pgt_watermark_groups table (group_name, sources, tolerance) | ✅ Done | PLAN_WATERMARK_GATING.md |
| | advance_watermark(source, watermark) — monotonicity check, store LSN alongside watermark, lightweight scheduler signal | ✅ Done | PLAN_WATERMARK_GATING.md |
| | create_watermark_group(name, sources[], tolerance) / drop_watermark_group() | ✅ Done | PLAN_WATERMARK_GATING.md |
| | Scheduler gating: log SKIP(watermark_misaligned) if not aligned | ✅ Done | PLAN_WATERMARK_GATING.md |
| | watermarks(), watermark_groups(), watermark_status() introspection functions | ✅ Done | PLAN_WATERMARK_GATING.md |
| | | ✅ Done | PLAN_WATERMARK_GATING.md |
Watermark gating: ✅ Complete
Circular Dependencies — Scheduler Integration
In plain terms: Completes the circular DAG work started in v0.6.0. When stream tables reference each other in a cycle (A → B → A), the scheduler now runs them repeatedly until the result stabilises — no more changes flowing through the cycle. This is called "fixpoint iteration", like solving a system of equations by re-running it until the numbers stop moving. If it doesn't converge within a configurable number of rounds (default 100) it surfaces an error rather than looping forever.
Completes the SCC foundation from v0.6.0 with a working fixpoint iteration
loop. Stream tables in a monotone cycle are refreshed repeatedly until
convergence (zero net change) or max_fixpoint_iterations is exceeded.
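The iteration loop can be sketched with a classic monotone computation, transitive reachability over a graph, where each pass adds facts until nothing new appears:

```python
# Sketch of fixpoint refresh on a monotone cycle: re-run the cycle
# until a pass produces zero net change, or raise at the iteration
# cap (the doc's max_fixpoint_iterations, default 100).

def iterate_to_fixpoint(step, state, max_iterations=100):
    for i in range(1, max_iterations + 1):
        nxt = step(state)
        if nxt == state:           # zero net change: converged
            return nxt, i
        state = nxt
    raise RuntimeError("fixpoint did not converge")  # ERROR status

# Classic monotone example: transitive reachability over a graph.
edges = {("a", "b"), ("b", "c"), ("c", "d")}

def step(reach):
    # One refresh pass: join reach with itself and union the result in.
    return reach | {(x, z) for (x, y) in reach for (w, z) in reach if y == w}

closure, iterations = iterate_to_fixpoint(step, set(edges))
```

Monotonicity is what guarantees termination here: each pass can only add pairs, and the set of possible pairs is finite, so the loop must reach a pass with zero net change.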
| Item | Description | Effort | Ref |
|---|---|---|---|
| | iterate_to_fixpoint(), convergence detection from (rows_inserted, rows_deleted), non-convergence → ERROR status | ✅ Done | PLAN_CIRCULAR_REFERENCES.md Part 5 |
| | Cycle creation allowed with allow_circular=true; assign scc_id; recompute SCCs on drop_stream_table | ✅ Done | PLAN_CIRCULAR_REFERENCES.md Part 6 |
| | Expose scc_id + last_fixpoint_iterations in views; pgtrickle.pgt_scc_status() function | ✅ Done | PLAN_CIRCULAR_REFERENCES.md Part 7 |
| | E2E tests (e2e_circular_tests.rs): 6 scenarios (monotone cycle, non-monotone reject, convergence, non-convergence→ERROR, drop breaks cycle, allow_circular=false default) | ✅ Done | PLAN_CIRCULAR_REFERENCES.md Part 8 |
Circular dependencies subtotal: ~19 hours
Last Differential Mode Gaps
In plain terms: Three query patterns that previously fell back to `FULL` refresh in `AUTO` mode — or hard-errored in explicit `DIFFERENTIAL` mode — despite the DVM engine having the infrastructure to handle them. All three gaps are now closed.
| Item | Description | Effort | Ref |
|---|---|---|---|
| G1 | User-defined aggregates: PostGIS spatial aggregates (ST_Union, ST_Collect), pgvector vector averages, and any CREATE AGGREGATE function are rejected. Fix: classify unknown aggregates as AggFunc::UserDefined and route them through the existing group-rescan strategy — no new delta math required. | ✅ Done | PLAN_LAST_DIFFERENTIAL_GAPS.md §G1 |
| G2 | Window functions nested in expressions: RANK() OVER (...) + 1, CASE WHEN ROW_NUMBER() OVER (...) <= 10, COALESCE(LAG(v) OVER (...), 0) etc. are rejected. | ✅ Done (v0.6.0) | PLAN_LAST_DIFFERENTIAL_GAPS.md §G2 |
| G3 | OR+sublink handling covers EXISTS(...) OR … and AND(EXISTS OR …) but gives up on multiple OR+sublink conjuncts. Fix: expand all OR+sublink conjuncts in AND to a cartesian product of UNION branches with a 16-branch explosion guard. | ✅ Done | PLAN_LAST_DIFFERENTIAL_GAPS.md §G3 |
Last differential gaps: ✅ Complete
Pre-1.0 Infrastructure Prep
In plain terms: Three preparatory tasks that make the eventual 1.0 release smoother. A draft Docker Hub image workflow (tests the build but doesn't publish yet); a PGXN metadata file so the extension can eventually be installed with `pgxn install pg_trickle`; and a basic CNPG integration test that verifies the extension image loads correctly in a CloudNativePG cluster. None of these ship user-facing features — they're CI and packaging scaffolding.
| Item | Description | Effort | Ref |
|---|---|---|---|
| | | 5h | ✅ Done |
| | Author META.json and upload a release_status: "testing" package to PGXN so pgxn install pg_trickle works for early adopters now. PGXN explicitly supports pre-stable releases; this gets real-world install testing and establishes registry presence before 1.0. At 1.0 the only change is flipping release_status to "stable". | 2–3h | ✅ Done |
| | | 4h | ✅ Done |
Pre-1.0 infrastructure prep: ✅ Complete
Performance — Regression Fixes & Benchmark Infrastructure (Part 9 S1–S2) ✅ Done
Fixes Criterion benchmark regressions identified in Part 9 and ships five benchmark infrastructure improvements to support data-driven performance decisions.
| Item | Description | Status |
|---|---|---|
| A-3 | Fix prefixed_col_list/20 +34% regression — eliminate intermediate Vec allocation | ✅ Done |
| A-4 | Fix lsn_gt +22% regression — use split_once instead of split().collect() | ✅ Done |
| I-1c | just bench-docker target for running Criterion inside Docker builder image | ✅ Done |
| I-2 | Per-cycle [BENCH_CYCLE] CSV output in E2E benchmarks for external analysis | ✅ Done |
| I-3 | EXPLAIN ANALYZE capture mode (PGS_BENCH_EXPLAIN=true) for delta query plans | ✅ Done |
| I-6 | 1M-row benchmark tier (bench_*_1m_* + bench_large_matrix) | ✅ Done |
| I-8 | Criterion noise reduction (sample_size(200), measurement_time(10s)) | ✅ Done |
Performance — Parallel Refresh, MERGE Optimization & Advanced Benchmarks (Part 9 S4–S6) ✅ Done
DAG level-parallel scheduling, improved MERGE strategy selection (xxh64 hashing, aggregate saturation bypass, cost-based threshold), and expanded benchmark suite (JSON comparison, concurrent writers, window/lateral/CTE).
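Two of the strategy heuristics named above can be sketched as a small decision function. The thresholds and cost model here are illustrative assumptions, not the extension's actual numbers.

```python
# Sketch of two refresh-strategy heuristics: the aggregate saturation
# bypass and a cost-based threshold fed by refresh history. Numbers
# and cost model are illustrative assumptions.

def pick_strategy(changed_rows: int, group_count: int,
                  avg_full_ms: float, avg_delta_ms_per_row: float) -> str:
    # Aggregate saturation bypass: once the delta touches at least as
    # many rows as there are groups, every group is recomputed anyway,
    # so a FULL refresh is no more work and avoids MERGE overhead.
    if group_count and changed_rows >= group_count:
        return "FULL"
    # Cost-based threshold: if the projected delta cost exceeds the
    # historical average full-refresh cost, FULL is the cheaper path.
    if changed_rows * avg_delta_ms_per_row >= avg_full_ms:
        return "FULL"
    return "DIFFERENTIAL"
```

In the extension the inputs would come from CDC batch sizes and `pgt_refresh_history` timings; the point is that the differential path is abandoned exactly when it stops being cheaper.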
| Item | Description | Status |
|---|---|---|
| C-1 | DAG level extraction (topological_levels() on StDag and ExecutionUnitDag) | ✅ Done |
| C-2 | Level-parallel dispatch (existing parallel_dispatch_tick infrastructure sufficient) | ✅ Done |
| C-3 | Result communication (existing SchedulerJob + pgt_refresh_history sufficient) | ✅ Done |
| D-1 | xxh64 hash-based change detection for wide tables (≥50 cols) | ✅ Done |
| D-2 | Aggregate saturation FULL bypass (changes ≥ groups → FULL) | ✅ Done |
| D-3 | Cost-based strategy selection from pgt_refresh_history data | ✅ Done |
| I-4 | Cross-run comparison tool (just bench-compare, JSON output) | ✅ Done |
| I-5 | Concurrent writer benchmarks (1/2/4/8 writers) | ✅ Done |
| I-7 | Window / lateral / CTE / UNION ALL operator benchmarks | ✅ Done |
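D-1's hash-based change detection replaces dozens of per-column comparisons on wide tables with a single per-row digest comparison. A minimal illustrative sketch in Python (stdlib `blake2b` stands in for xxh64; all helper names here are hypothetical, not part of the extension):

```python
import hashlib

def row_digest(row):
    """Hash all column values into one 8-byte digest (stand-in for xxh64)."""
    h = hashlib.blake2b(digest_size=8)
    for value in row:
        # A separator prevents ("ab", "c") colliding with ("a", "bc").
        h.update(repr(value).encode())
        h.update(b"\x1f")
    return h.hexdigest()

def changed_rows(old_rows, new_rows):
    """Detect changed rows positionally by comparing one digest per row
    instead of comparing every column pairwise."""
    return [i for i, (o, n) in enumerate(zip(old_rows, new_rows))
            if row_digest(o) != row_digest(n)]

wide_old = [tuple(range(50)), tuple(range(50))]
wide_new = [tuple(range(50)), tuple(range(49)) + (99,)]  # last column changed
print(changed_rows(wide_old, wide_new))  # [1]
```

For a 50-column table this turns 50 value comparisons per row into one digest comparison, at the cost of computing the hash once per row version.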
v0.7.0 total: ~59–62h
Exit criteria:
- Part 9 performance: DAG levels, xxh64 hashing, aggregate saturation bypass, cost-based threshold, advanced benchmarks
- `advance_watermark` + scheduler gating operational; ETL E2E tests pass
- Monotone circular DAGs converge to fixpoint; non-convergence surfaces as `ERROR`
- UDAs, nested window expressions, and deeply nested OR+sublinks supported in DIFFERENTIAL mode
- Docker Hub image CI workflow builds and smoke-tests successfully
- PGXN `testing` release uploaded; `pgxn install pg_trickle` works
- CNPG integration smoke test passes in CI
- Extension upgrade path tested (`0.6.0 → 0.7.0`)
v0.8.0 — pg_dump Support & Test Hardening
Status: Released
Goal: Complete the pg_dump round-trip story so stream tables survive
pg_dump/pg_restore cycles, and comprehensively harden the
E2E test suites with multiset invariants to mathematically enforce DVM correctness.
Completed items (click to expand)
pg_dump / pg_restore Support
In plain terms: `pg_dump` is the standard PostgreSQL backup tool. Without this, a dump of a database containing stream tables may not capture them correctly — and restoring from that dump would require recreating them by hand. This work teaches `pg_dump` to emit valid SQL for every stream table, and adds logic to automatically re-link orphaned catalog entries when restoring an extension from a backup.
Complete the native DDL story: teach `pg_dump` to emit `CREATE MATERIALIZED VIEW … WITH (pgtrickle.stream = true)` for stream tables and add an event trigger that re-links orphaned catalog entries on extension restore.
| Item | Description | Effort | Ref |
|---|---|---|---|
| NAT-DUMP | generate_dump() + restore_stream_tables() companion functions (done); event trigger on extension load for orphaned catalog entries | 3–4d | PLAN_NATIVE_SYNTAX.md §pg_dump |
| NAT-TEST | E2E tests: pg_dump round-trip, restore from backup, orphaned-entry recovery | 2–3d | PLAN_NATIVE_SYNTAX.md §pg_dump |
pg_dump support subtotal: ~5–7 days
Test Suite Evaluation & Hardening
In plain terms: Replace legacy, row-count-based assertions with comprehensive, order-independent multiset evaluations (`assert_st_matches_query`) across all testing tiers. Proving these multiset invariants mathematically guarantees differential dataflow correctness under chaotic interleavings and edge cases.
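The multiset-evaluation idea behind `assert_st_matches_query` can be illustrated in a few lines of Python (the helper name below is hypothetical; the real assertion lives in the Rust test harness):

```python
from collections import Counter

def multiset_diff(stream_table_rows, query_rows):
    """Order-independent multiset comparison: returns (missing, extra)
    as the symmetric difference between the stream table contents and
    a fresh evaluation of the defining query."""
    st = Counter(stream_table_rows)
    q = Counter(query_rows)
    missing = q - st   # rows the query produces but the stream table lacks
    extra = st - q     # rows the stream table holds but the query does not
    return missing, extra

# Duplicates matter: a multiset check catches a dropped duplicate that a
# DISTINCT-based or row-count check would miss.
missing, extra = multiset_diff([("a",), ("b",)], [("a",), ("a",), ("b",)])
print(dict(missing), dict(extra))  # {('a',): 1} {}
```

Reporting the symmetric difference (rather than a bare pass/fail) is what makes failures debuggable under randomized DML loads.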
| Item | Description | Effort | Ref |
|---|---|---|---|
| TE1 | Unit Test Hardening: Full multiset equality testing for pure-Rust DVM operators | Done | PLAN_EVALS_UNIT |
| TE2 | Light E2E Migration: Expand speed-optimized E2E pipeline with rigorous symmetric difference checks | Done | PLAN_EVALS_LIGHT_E2E |
| TE3 | Integration Concurrency: Prove complex orchestration correctness under transaction delays | Done | PLAN_EVALS_INTEGRATION |
| TE4 | Full E2E Hardening: Validate cross-boundary, multi-DAG cascades, partition handling, and upgrade paths | Done | PLAN_EVALS_FULL_E2E |
| TE5 | TPC-H Smoke Test: Stateful invariant evaluations for heavily randomized DML loads over large matrices | Done | PLAN_EVALS_TPCH |
| TE6 | Property-Based Invariants: Chaotic property testing pipelines for topological boundaries and cyclic executions | Done | PLAN_PROPERTY_BASED_INVARIANTS |
| TE7 | cargo-nextest Migration: Move test suite execution to cargo-nextest to aggressively parallelize and isolate tests, solving wall-clock execution regressions | 1–2d | PLAN_CARGO_NEXTEST |
Test evaluation subtotal: ~11–14 days (mostly completed)
v0.8.0 total: ~16–21 days
Exit criteria:
- Test infrastructure hardened with exact mathematical multiset validation
- Test harness migrated to `cargo-nextest` to fix speed and CI flake regressions
- pg_dump round-trip produces valid, restorable SQL for stream tables (Done)
- Extension upgrade path tested (`0.7.0 → 0.8.0`)
v0.9.0 — Incremental Aggregate Maintenance
Status: Released (2026-03-20).
Goal: Implement algebraic incremental maintenance for decomposable aggregates (COUNT, SUM, AVG, MIN, MAX, STDDEV), reducing per-group refresh from O(group_size) to O(1) for the common case. This is the highest-potential-payoff item in the performance plan — benchmarks show aggregate scenarios going from 2.5 ms to sub-1 ms per group.
Completed items (click to expand)
Critical Bug Fixes
| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| G-1 | panic!() in SQL-callable source_gates() and watermarks() functions. Both functions reach panic!() on any SPI error, crashing the PostgreSQL backend process. AGENTS.md explicitly forbids panic!() in code reachable from SQL. Replace both .unwrap_or_else(|e| panic!(…)) calls with pgrx::error!(…) so any SPI failure surfaces as a PostgreSQL ERROR instead. | ~1h | ✅ Done | src/api.rs |
Critical bug fixes subtotal: ~1 hour
Algebraic Aggregate Shortcuts (B-1)
In plain terms: When only one row changes in a group of 100,000, today pg_trickle re-scans all 100,000 rows to recompute the aggregate. Algebraic maintenance keeps running totals instead: `new_sum = old_sum + Δsum`, `new_count = old_count + Δcount`. Only MIN/MAX ever need a rescan — and only when the deleted value was the current minimum or maximum.
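The running-total rule described above can be sketched in a few lines (illustrative Python, not the extension's SQL; the helper name and state layout are hypothetical):

```python
def merge_aggregates(state, delta_inserts, delta_deletes):
    """O(1) algebraic maintenance for one group: SUM and COUNT are
    updated from the delta alone; AVG is derived from the two running
    totals rather than stored independently."""
    state["sum"] += sum(delta_inserts) - sum(delta_deletes)
    state["count"] += len(delta_inserts) - len(delta_deletes)
    state["avg"] = state["sum"] / state["count"] if state["count"] else None
    return state

# One insert and one delete against a group of 4 rows: no group rescan,
# just constant-time arithmetic on the stored totals.
group = {"sum": 100.0, "count": 4, "avg": 25.0}
merge_aggregates(group, delta_inserts=[10.0], delta_deletes=[30.0])
print(group["sum"], group["count"], group["avg"])  # 80.0 4 20.0
```

The cost is independent of group size, which is exactly the O(group_size) → O(1) improvement this milestone targets.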
| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| B1-1 | Algebraic rules: COUNT, SUM (already algebraic), AVG (done — aux cols), STDDEV/VAR (done — sum-of-squares decomposition), MIN/MAX with rescan guard (already implemented) | 3–4 wk | ✅ Done | PLAN_NEW_STUFF.md §B-1 |
| B1-2 | Auxiliary column management (__pgt_aux_sum_*, __pgt_aux_count_*, __pgt_aux_sum2_* — done); hidden via __pgt_* naming convention (existing NOT LIKE '__pgt_%' filter) | 1–2 wk | ✅ Done | PLAN_NEW_STUFF.md §B-1 |
| B1-3 | Migration story for existing aggregate stream tables; periodic full-group recomputation to reset floating-point drift | 1 wk | ✅ Done | PLAN_NEW_STUFF.md §B-1 |
| B1-4 | Fallback to full-group recomputation for non-decomposable aggregates (mode, percentile, string_agg with ordering) | 1 wk | ✅ Done | PLAN_NEW_STUFF.md §B-1 |
| B1-5 | Property-based tests: MIN/MAX boundary case (deleting the exact current min or max value must trigger rescan) | 1 wk | ✅ Done | PLAN_NEW_STUFF.md §B-1 |
Implementation Progress
Completed:
- AVG algebraic maintenance (B1-1): AVG no longer triggers a full group-rescan. Classified as `is_algebraic_via_aux()` and tracked via `__pgt_aux_sum_*` / `__pgt_aux_count_*` columns. The merge expression computes `(old_sum + ins - del) / NULLIF(old_count + ins - del, 0)`.
- STDDEV/VAR algebraic maintenance (B1-1): `STDDEV_POP`, `STDDEV_SAMP`, `VAR_POP`, and `VAR_SAMP` are now algebraic using sum-of-squares decomposition. Auxiliary columns: `__pgt_aux_sum_*` (running SUM), `__pgt_aux_sum2_*` (running SUM(x²)), `__pgt_aux_count_*`. Merge formulas:
  - `VAR_POP = GREATEST(0, (n·sum2 − sum²) / n²)`
  - `VAR_SAMP = GREATEST(0, (n·sum2 − sum²) / (n·(n−1)))`
  - `STDDEV_POP = SQRT(VAR_POP)`, `STDDEV_SAMP = SQRT(VAR_SAMP)`
  - Null guards match PostgreSQL semantics (NULL when count ≤ threshold).
- Auxiliary column infrastructure (B1-2): `create_stream_table()` and `alter_stream_table()` detect AVG/STDDEV/VAR aggregates and automatically add `NUMERIC` sum/sum2 and `BIGINT` count columns. Full refresh and initialization paths inject `SUM(arg)`, `COUNT(arg)`, and `SUM(arg*arg)`. All `__pgt_aux_*` columns are automatically hidden by the existing `NOT LIKE '__pgt_%'` convention used throughout the codebase.
- Non-decomposable fallback (B1-4): Already existed as the group-rescan strategy — any aggregate not classified as algebraic or algebraic-via-aux falls back to full group recomputation.
- Property-based tests (B1-5): Seven proptest tests verify: (a) MIN merge uses `LEAST`, MAX merge uses `GREATEST`; (b) deleting the exact current extremum triggers rescan; (c) delta expressions use matching aggregate functions; (d) AVG is classified as algebraic-via-aux (not group-rescan); (e) STDDEV/VAR use the sum-of-squares algebraic path with GREATEST guard; (f) STDDEV wraps in SQRT, VAR does not; (g) DISTINCT STDDEV falls back (not algebraic).
- Migration story (B1-3): `ALTER QUERY` transitions are handled seamlessly by extending `migrate_aux_columns` to execute `ALTER TABLE ADD COLUMN` or `DROP COLUMN` exactly matching runtime changes in the `new_avg_aux` or `new_sum2_aux` definitions.
- Floating-point drift reset (B1-3): Implemented global GUC `pg_trickle.algebraic_drift_reset_cycles` (0 = disabled) that counts differential refresh attempts in scheduler memory per stream table. When the threshold fires, the action degrades to `RefreshAction::Reinitialize`.
- E2E integration tests: Multi-cycle inserts, updates, and deletes verified for correct handling without regression (added specifically for STDDEV/VAR).
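The sum-of-squares merge formula for `VAR_POP` can be checked against the direct definition in a short Python sketch (helper name hypothetical; this mirrors the SQL merge expression, not the extension's code):

```python
def var_pop_from_aux(n, s, s2):
    """VAR_POP from auxiliary running totals, as in the merge expression
    GREATEST(0, (n*sum2 - sum^2) / n^2). The max(0, ...) guard absorbs
    tiny negative values caused by floating-point rounding."""
    return max(0.0, (n * s2 - s * s) / (n * n))

data = [4.0, 7.0, 13.0, 16.0]
n, s, s2 = len(data), sum(data), sum(x * x for x in data)

# Direct textbook definition for comparison.
mean = s / n
direct = sum((x - mean) ** 2 for x in data) / n

assert abs(var_pop_from_aux(n, s, s2) - direct) < 1e-9
print(var_pop_from_aux(n, s, s2))  # 22.5
```

Because only `n`, `sum`, and `sum2` are stored, an insert or delete updates three totals in O(1) instead of re-reading the group.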
Remaining work:
- Extension upgrade path (`0.8.0 → 0.9.0`): Upgrade SQL stub created. Left as a final pre-release checklist item: generate the final `sql/archive/pg_trickle--0.9.0.sql` with `cargo pgrx package` once all CI checks pass.
- F15 — Selective CDC Column Capture: ✅ Complete. Column-selection pipeline, monitoring exposure via `check_cdc_health().selective_capture`, and 3 E2E integration tests done.
⚠️ Critical: the MIN/MAX maintenance rule is directionally tricky. The correct condition for triggering a rescan is: deleted value equals the current min/max (not when it differs). Getting this backwards silently produces stale aggregates on the most common OLTP delete pattern. See the corrected table and risk analysis in PLAN_NEW_STUFF.md §B-1.
Retraction consideration (B-1): Keep in v0.9.0, but item B1-5 (property-based tests covering the MIN/MAX boundary case) is a hard prerequisite for B1-1, not optional follow-on work. The MIN/MAX rule was stated backwards in the original spec; the corrected rule is now in PLAN_NEW_STUFF.md. Do not merge any MIN/MAX algebraic path until property-based tests confirm: (a) deleting the exact current min triggers a rescan and (b) deleting a non-min value does not. Floating-point drift reset (B1-3) is also required before enabling persistent auxiliary columns.
✅ B1-5 hard prerequisite satisfied. Property-based tests now cover both conditions — see `prop_min_max_rescan_guard_direction` in `tests/property_tests.rs`.
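The corrected rescan direction is small enough to state as executable pseudocode (illustrative Python; `rescan` stands in for the group-rescan query, and the helper name is hypothetical):

```python
def min_after_delete(current_min, deleted, rescan):
    """Rescan only when the deleted value *equals* the current minimum;
    deleting any other value cannot change MIN. Getting this backwards
    silently leaves a stale minimum behind."""
    if deleted == current_min:
        return rescan()          # extremum removed: must rescan the group
    return current_min           # cheap path: MIN is provably unchanged

# Deleting a non-minimum value: no rescan, MIN stays 3.
print(min_after_delete(3, deleted=9, rescan=lambda: min([3, 5])))  # 3
# Deleting the minimum itself: rescan of the remaining rows.
print(min_after_delete(3, deleted=3, rescan=lambda: min([5, 9])))  # 5
```

This is exactly the boundary the B1-5 property tests pin down: the rescan fires on equality, never on difference.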
Algebraic aggregates subtotal: ~7–9 weeks
Advanced SQL Syntax & DVM Capabilities (B-2)
These represent expansions of the DVM engine to handle richer SQL constructs and improve runtime execution consistency.
| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| B2-1 | LIMIT / OFFSET / ORDER BY. Top-K queries evaluated directly within the DVM engine. | 2–3 wk | ✅ Done | PLAN_ORDER_BY_LIMIT_OFFSET.md |
| B2-2 | LATERAL Joins. Expanding the parser and DVM diff engine to handle LATERAL subqueries. | 2 wk | ✅ Done | PLAN_LATERAL_JOINS.md |
| B2-3 | View Inlining. Allow stream tables to query standard PostgreSQL views natively. | 1-2 wk | ✅ Done | PLAN_VIEW_INLINING.md |
| B2-4 | Synchronous / Transactional IVM. Evaluating DVM diffs synchronously in the same transaction as the DML. | 3 wk | ✅ Done | PLAN_TRANSACTIONAL_IVM.md |
| B2-5 | Cross-Source Snapshot Consistency. Improving engine consistency models when joining multiple tables. | 2 wk | ✅ Done | PLAN_CROSS_SOURCE_SNAPSHOT_CONSISTENCY.md |
| B2-6 | Non-Determinism Guarding. Better handling or rejection of non-deterministic functions (random(), now()). | 1 wk | ✅ Done | PLAN_NON_DETERMINISM.md |
Multi-Table Delta Batching (B-3)
In plain terms: When a join query has three source tables and all three change in the same cycle, today pg_trickle makes three separate passes through the source tables. B-3 merges those passes into one and prunes UNION ALL branches for sources with no changes.
| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| B3-1 | Intra-query delta-branch pruning: skip UNION ALL branch entirely when a source has zero changes in this cycle | 1–2 wk | ✅ Done | PLAN_NEW_STUFF.md §B-3 |
| B3-2 | Merged-delta generation: weight aggregation (GROUP BY __pgt_row_id, SUM(weight)) for cross-source deduplication; remove zero-weight rows | 3–4 wk | ✅ Done (v0.10.0) | PLAN_NEW_STUFF.md §B-3 |
| B3-3 | Property-based correctness tests for simultaneous multi-source changes; diamond-flow scenarios | 1–2 wk | ✅ Done (v0.10.0) | PLAN_NEW_STUFF.md §B-3 |
✅ B3-2 correctly uses weight aggregation (`GROUP BY __pgt_row_id, SUM(weight)`) instead of `DISTINCT ON`. B3-3 property-based tests (6 diamond-flow scenarios) verify correctness.
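The weight-aggregation rule behind B3-2 can be sketched in Python (hypothetical helper; it mirrors the `GROUP BY __pgt_row_id, SUM(weight)` SQL, with weight +1 per insert and −1 per delete):

```python
from collections import defaultdict

def merge_deltas(*per_source_deltas):
    """Merge per-source deltas of (row_id, weight) pairs. Summing
    weights per row id deduplicates rows seen by multiple source
    branches; rows whose weights cancel to zero are dropped entirely."""
    weights = defaultdict(int)
    for delta in per_source_deltas:
        for row_id, w in delta:
            weights[row_id] += w
    return {rid: w for rid, w in weights.items() if w != 0}

left_delta = [("r1", +1), ("r2", +1)]
right_delta = [("r2", -1), ("r3", -1)]
# r2's insert and delete cancel out and never touch the MERGE.
print(merge_deltas(left_delta, right_delta))  # {'r1': 1, 'r3': -1}
```

Summed weights (unlike `DISTINCT ON`) preserve multiset semantics when the same row id appears in several source branches with different signs.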
Multi-source delta batching subtotal: ~5–8 weeks
Phase 7 Gap Resolutions (DVM Correctness, Syntax & Testing)
These items pull in the remaining correctness edge cases and syntax expansions identified in the Phase 7 SQL Gap Analysis, along with completing exhaustive differential E2E test maturation.
| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| G1.1 | JOIN Key Column Changes. Handle updates that simultaneously modify a JOIN key and right-side tracked columns. | 3-5d | ✅ Done | GAP_SQL_PHASE_7.md |
| G1.2 | Window Function Partition Drift. Explicit tracking for updates that cause rows to cross PARTITION BY ranges. | 4-6d | ✅ Done | GAP_SQL_PHASE_7.md |
| G1.5/G7.1 | Keyless Table Duplicate Identity. Resolve __pgt_row_id collisions for non-PK tables with exact duplicate rows. | 3-5d | ✅ Done | GAP_SQL_PHASE_7.md |
| G5.6 | Range Aggregates. Support and differentiate RANGE_AGG and RANGE_INTERSECT_AGG. | 1-2d | ✅ Done | GAP_SQL_PHASE_7.md |
| G5.3 | XML Expression Parsing. Native DVM handling for T_XmlExpr syntax trees. | 1-2d | ✅ Done | GAP_SQL_PHASE_7.md |
| G5.5 | NATURAL JOIN Drift Tracking. DVM tracking of schema shifts in NATURAL JOIN between refreshes. | 2-3d | ✅ Done | GAP_SQL_PHASE_7.md |
| F15 | Selective CDC Column Capture. Limit row I/O by only tracking columns referenced in query lineage. | 1-2 wk | ✅ Done | GAP_SQL_PHASE_6.md |
| F40 | Extension Upgrade Migrations. Robust versioned SQL schema migrations. | 1-2 wk | ✅ Done | REPORT_DB_SCHEMA_STABILITY.md |
Phase 7 Gaps subtotal: ~5-7 weeks
Additional Query Engine Improvements
| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| A1 | Circular dependency support (SCC fixpoint iteration) | ~40h | ✅ Done | CIRCULAR_REFERENCES.md |
| A7 | Skip-unchanged-column scanning in delta SQL (requires column-usage demand-propagation pass in DVM parser) | ~1–2d | ✅ Done | PLAN_EDGE_CASES_TIVM_IMPL_ORDER.md Stage 4 §3.4 |
| EC-03 | Window-in-expression DIFFERENTIAL fallback warning: emit a WARNING (and eventually an INFO hint) when a stream table with CASE WHEN window_fn() OVER (...) ... silently falls back from DIFFERENTIAL to FULL refresh mode; currently fails at runtime with column st.* does not exist — no user-visible signal exists | ~1d | ✅ Done | PLAN_EDGE_CASES.md §EC-03 |
| A8 | pgt_refresh_groups SQL API: companion functions (pgtrickle.create_refresh_group(), pgtrickle.drop_refresh_group(), pgtrickle.refresh_groups()) for the Cross-Source Snapshot Consistency catalog table introduced in the 0.8.0→0.9.0 upgrade script | ~2–3d | ✅ Done | PLAN_CROSS_SOURCE_SNAPSHOT_CONSISTENCY.md |
Advanced Capabilities subtotal: ~11–13 weeks
DVM Engine Correctness & Performance Hardening (P2)
These items address correctness gaps that silently degrade to full-recompute modes or cause excessive I/O on each differential cycle. All are observable in production workloads.
| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| P2-1 | Recursive CTE DRed in DIFFERENTIAL mode. Currently, any DELETE or UPDATE against a recursive CTE's source in DIFFERENTIAL mode falls back to O(n) full recompute + diff. The Delete-and-Rederive (DRed) algorithm exists for IMMEDIATE mode only. Implement DRed for DeltaSource::ChangeBuffer so recursive CTE stream tables in DIFFERENTIAL mode maintain O(delta) cost. | 2–3 wk | ⏭️ Deferred to v0.10.0 | src/dvm/operators/recursive_cte.rs |
| P2-2 | SUM NULL-transition rescan for FULL OUTER JOIN aggregates. When SUM sits above a FULL OUTER JOIN and rows transition between matched and unmatched states (matched→NULL), the algebraic formula gives 0 instead of NULL, triggering a child_has_full_join() full-group rescan on every cycle where rows cross that boundary. Implement a targeted correction that avoids full-group rescans in the common case. | 1–2 wk | ⏭️ Deferred to v0.10.0 | src/dvm/operators/aggregate.rs |
| P2-3 | DISTINCT multiplicity-count JOIN overhead. Every differential refresh for SELECT DISTINCT queries joins against the stream table's __pgt_count column for the full stream table, even when only a tiny delta is being processed. Replace with a per-affected-row lookup pattern to limit this to O(delta) I/O. | 1 wk | ✅ Done | src/dvm/operators/distinct.rs |
| P2-4 | Materialized view sources in IMMEDIATE mode (EC-09). Stream tables that use a PostgreSQL materialized view as a source are rejected at creation time when IMMEDIATE mode is requested. Implement a polling-change-detection wrapper (same approach as EC-05 for foreign tables) to support REFRESH MATERIALIZED VIEW-sourced queries in IMMEDIATE mode. | 2–3 wk | ⏭️ Deferred to v0.10.0 | plans/PLAN_EDGE_CASES.md §EC-09 |
| P2-5 | changed_cols bitmask captured but not consumed in delta scan SQL. Every CDC change buffer row stores a changed_cols BIGINT bitmask recording which source columns were modified by an UPDATE. The DVM delta scan CTE reads every UPDATE row regardless of whether any query-referenced column actually changed. Implement a demand-propagation pass to identify referenced columns per Scan, then inject a changed_cols & referenced_mask != 0 filter into the delta CTE WHERE clause. For wide source tables (50+ columns) where a typical UPDATE touches 1–3 columns, this eliminates ~98% of UPDATE rows entering the join/aggregate pipeline. | 2–3 wk | ✅ Done | src/dvm/operators/scan.rs · plans/PLAN_EDGE_CASES_TIVM_IMPL_ORDER.md §Task 3.1 |
| P2-6 | LATERAL subquery inner-source change triggers O(|outer table|) full re-execution. When any inner source has CDC entries in the current window, build_inner_change_branch() re-materializes the entire outer table snapshot and re-executes the lateral subquery for every outer row — O(|outer|) per affected cycle. Gate the outer-table scan behind a join to the inner delta rows so only outer rows correlated with changed inner rows are re-executed. (The analogous scalar subquery fix is P3-3; this is the lateral equivalent.) | 1–2 wk | ⏭️ Deferred to v0.10.0 | src/dvm/operators/lateral_subquery.rs |
| P2-7 | Delta predicate pushdown not implemented. WHERE predicates from the defining query are not pushed into the change buffer scan CTE. A stream table defined as SELECT … FROM orders WHERE status = 'shipped' reads all changes from pgtrickle_changes.changes_<oid> then filters — for 10K changes/cycle with 50 matching the predicate, 9,950 rows traverse the join/aggregate pipeline needlessly. Collect pushable predicates from the Filter node above the Scan; inject new_<col> / old_<col> predicate variants into the delta scan SQL. Care required: UPDATE rows need both old and new column values checked to avoid missing deletions that move rows out of the predicate window. | 2–3 wk | ✅ Done | src/dvm/operators/scan.rs · src/dvm/operators/filter.rs · plans/performance/PLAN_NEW_STUFF.md §B-2 |
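The `changed_cols & referenced_mask != 0` filter from P2-5 is plain bit arithmetic; a minimal Python sketch (all helper names hypothetical, standing in for the generated SQL):

```python
def column_mask(indices):
    """Bitmask with one bit per source column (the changed_cols idea)."""
    mask = 0
    for i in indices:
        mask |= 1 << i
    return mask

def relevant_updates(update_rows, referenced_cols):
    """Keep only UPDATE rows whose changed-column bitmask overlaps the
    columns the query actually references, i.e. the SQL filter
    changed_cols & referenced_mask != 0."""
    ref_mask = column_mask(referenced_cols)
    return [row for row in update_rows if row["changed_cols"] & ref_mask != 0]

updates = [
    {"id": 1, "changed_cols": column_mask([7])},     # only an untracked column
    {"id": 2, "changed_cols": column_mask([0, 7])},  # touches tracked column 0
]
print([r["id"] for r in relevant_updates(updates, referenced_cols=[0, 3])])  # [2]
```

On a wide table where a typical UPDATE touches 1–3 of 50+ columns, this single integer AND per row is what filters out the ~98% of UPDATE rows that cannot affect the query.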
DVM hardening (P2) subtotal: ~6–9 weeks
DVM Performance Trade-offs (P3)
These items are correct as implemented but scale with data size rather than delta size. They are lower priority than P2 but represent solid measurable wins for high-cardinality workloads.
| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| P3-1 | Window partition full recompute. Any single-row change in a window partition triggers recomputation of the entire partition. Add a partition-size heuristic: if the affected partition exceeds a configurable row threshold, downgrade to FULL refresh for that cycle and emit a pgrx::info!() message. At minimum, document the O(partition_size) cost prominently. | 1 wk | ✅ Done (documented) | src/dvm/operators/window.rs |
| P3-2 | Welford auxiliary columns for CORR/COVAR/REGR_* aggregates. CORR, COVAR_POP, COVAR_SAMP, REGR_* currently use O(group_size) group-rescan. Implement Welford-style auxiliary column accumulation (__pgt_aux_sumx_*, __pgt_aux_sumy_*, __pgt_aux_sumxy_*) to reach O(1) algebraic maintenance identical to the STDDEV/VAR path. | 2–3 wk | ⏭️ Deferred to v0.10.0 | src/dvm/operators/aggregate.rs |
| P3-3 | Scalar subquery C₀ EXCEPT ALL scan. Part 2 of the scalar subquery delta computes C₀ = C_current EXCEPT ALL Δ_inserts UNION ALL Δ_deletes by scanning the full outer snapshot. For large outer tables with an unstable inner source, this scan is proportional to the outer table size. Profile and gate the scan behind an existence check on inner-source stability to avoid it when possible; the WHERE EXISTS (SELECT 1 FROM delta_subquery) guard already handles the trivial case. | 1 wk | ✅ Done | src/dvm/operators/scalar_subquery.rs |
| P3-4 | Index-aware MERGE planning. For small deltas against large stream tables (e.g. 5 delta rows, 10M-row ST), the PostgreSQL planner often chooses a sequential scan of the stream table for the MERGE join on __pgt_row_id, yielding O(n) full-table I/O when an index lookup would be O(log n). Emit SET LOCAL enable_seqscan = off within the MERGE transaction when the delta row count is below a configurable threshold fraction of the ST row count (pg_trickle.merge_seqscan_threshold GUC, default 0.001). | 1–2 wk | ✅ Done | src/refresh.rs · src/config.rs · plans/performance/PLAN_NEW_STUFF.md §A-4 |
| P3-5 | auto_backoff GUC for falling-behind stream tables. EC-11 implemented the scheduler_falling_behind NOTIFY alert at 80% of the refresh budget. The companion auto_backoff GUC that automatically doubles the effective refresh interval when a stream table consistently runs behind was explicitly deferred. Add a pg_trickle.auto_backoff bool GUC (default off); when enabled, track a per-ST exponential backoff factor in scheduler shared state and reset it on the first on-time cycle. Saves CPU runaway when operators are offline to respond manually. | 1–2d | ✅ Done | src/scheduler.rs · src/config.rs · plans/PLAN_EDGE_CASES.md §EC-11 |
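The exponential backoff sketched for P3-5 amounts to a tiny state machine: double the effective interval while refreshes run behind, reset on the first on-time cycle. An illustrative Python sketch (the helper name and the factor cap of 64 are assumptions, not from the extension):

```python
def next_interval(base_interval, backoff_factor, on_time):
    """Per-stream-table auto_backoff: returns (effective_interval,
    new_backoff_factor). On-time cycles reset to the base interval;
    late cycles double the factor (capped — the cap is an assumed
    safeguard, not specified by the plan)."""
    if on_time:
        return base_interval, 1
    factor = min(backoff_factor * 2, 64)
    return base_interval * factor, factor

interval, factor = 30, 1
interval, factor = next_interval(30, factor, on_time=False)   # behind: 60s
interval, factor = next_interval(30, factor, on_time=False)   # behind: 120s
print(interval, factor)  # 120 4
interval, factor = next_interval(30, factor, on_time=True)    # recovered
print(interval, factor)  # 30 1
```

Resetting on the first on-time cycle keeps the degradation temporary: the stream table returns to its configured freshness as soon as it catches up.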
DVM performance trade-offs (P3) subtotal: ~4–7 weeks
Documentation Gaps (D)
| Item | Description | Effort | Status |
|---|---|---|---|
| D1 | Recursive CTE DIFFERENTIAL mode limitation. The O(n) fallback for mixed DELETE/UPDATE against a recursive CTE source is not documented in docs/SQL_REFERENCE.md or docs/DVM_OPERATORS.md. Users hitting DELETE/UPDATE-heavy workloads on recursive CTE stream tables will see unexpectedly slow refresh times with no explanation. Add a "Known Limitations" callout in both files. | ~2h | ✅ Done |
| D2 | pgt_refresh_groups catalog table undocumented. The catalog table added in the 0.8.0→0.9.0 upgrade script is not described in docs/SQL_REFERENCE.md. Even before the full A8 API lands, document the table schema, its purpose, and the manual INSERT/DELETE workflow users can use in the interim. | ~2h | ✅ Done |
v0.9.0 total: ~23–29 weeks
Exit criteria:
- AVG algebraic path implemented (SUM/COUNT auxiliary columns)
- STDDEV/VAR algebraic path implemented (sum-of-squares decomposition)
- MIN/MAX boundary case (delete-the-extremum) covered by property-based tests
- Non-decomposable fallback confirmed (group-rescan strategy)
- Auxiliary columns hidden from user queries via `__pgt_*` naming convention
- Migration path for existing aggregate stream tables tested
- Floating-point drift reset mechanism in place (periodic recompute)
- E2E integration tests for algebraic aggregate paths
- B2-1: Top-K queries (LIMIT/OFFSET/ORDER BY) support
- B2-2: LATERAL Joins support
- B2-3: View Inlining support
- B2-4: Synchronous / Transactional IVM mode
- B2-5: Cross-Source Snapshot Consistency models
- B2-6: Non-Determinism Guarding semantics implemented
- Extension upgrade path tested (`0.8.0 → 0.9.0`)
- G1 Correctness Gaps addressed (G1.1, G1.2, G1.5, G1.6)
- G5 Syntax Gaps addressed (G5.2, G5.3, G5.5, G5.6)
- G6 Test Coverage expanded (G6.1, G6.2, G6.3, G6.5)
- F15: Selective CDC Column Capture (optimize I/O by only tracking columns referenced in query lineage)
- F40: Extension Upgrade Migration Scripts (finalize versioned SQL schema migrations)
- B3-1: Delta-branch pruning for zero-change sources (skip UNION ALL branch when source has no changes)
- B3-2: Merged-delta weight aggregation — implemented in v0.10.0 (weight aggregation replaces DISTINCT ON; B3-3 property tests verify correctness)
- B3-3: Property-based correctness tests for B3-2 — implemented in v0.10.0 (6 diamond-flow E2E property tests)
- EC-03: WARNING emitted when window-in-expression query silently falls back from DIFFERENTIAL to FULL refresh mode
- A8: `pgt_refresh_groups` SQL API (`pgt_add_refresh_group`, `pgt_remove_refresh_group`, `pgt_list_refresh_groups`)
- P2-1: Recursive CTE DRed for DIFFERENTIAL mode — deferred to v0.10.0 (high risk; ChangeBuffer mode lacks old-state context for safe rederivation; recomputation fallback is correct)
- P2-2: SUM NULL-transition rescan optimization — deferred to v0.10.0 (requires auxiliary nonnull-count columns; current rescan approach is correct)
- P2-3: DISTINCT `__pgt_count` lookup scoped to O(delta) I/O per cycle
- P2-4: Materialized view sources in IMMEDIATE mode — deferred to v0.10.0 (requires external polling-change-detection wrapper; out of scope for v0.9.0)
- P3-1: Window partition O(partition_size) cost documented; heuristic downgrade implemented or explicitly deferred
- P3-2: CORR/COVAR_*/REGR_* Welford auxiliary columns — explicitly deferred to v0.10.0 (group-rescan strategy already works correctly for all regression/correlation aggregates)
- P3-3: Scalar subquery C₀ EXCEPT ALL scan gated behind inner-source stability check or explicitly deferred
- D1: Recursive CTE DIFFERENTIAL mode limitation documented in SQL_REFERENCE.md and DVM_OPERATORS.md
- D2: `pgt_refresh_groups` table schema and interim workflow documented in SQL_REFERENCE.md
- G-1: `panic!()` replaced with `pgrx::error!()` in `source_gates()` and `watermarks()` SQL functions
- G-2 (P2-5): `changed_cols` bitmask consumed in delta scan CTE — referenced-column mask filter injected
- G-3 (P2-6): LATERAL subquery inner-source scoping — deferred to v0.10.0 (requires correlation predicate extraction from raw SQL; full re-execution is correct)
- G-4 (P2-7): Delta predicate pushdown implemented (pushable predicates injected into change buffer scan CTE)
- G-5 (P3-4): Index-aware MERGE planning: `SET LOCAL enable_seqscan = off` for small deltas against large STs
- G-6 (P3-5): `auto_backoff` GUC implemented; scheduler doubles interval when stream table falls behind
v0.10.0 — DVM Hardening, Connection Pooler Compatibility, Core Refresh Optimizations & Infrastructure Prep
Status: Released (2026-03-23).
Goal: Land deferred DVM correctness and performance improvements (recursive CTE DRed, FULL OUTER JOIN aggregate fix, LATERAL scoping, Welford regression aggregates, multi-source delta merging); fix a class of post-audit DVM safety issues (SQL comment injection as FROM fragments, silent wrong aggregate results, EC-01 gap for complex join trees) and CDC correctness bugs (NULL-unsafe PK join, TRUNCATE+INSERT race, stale WAL publication after partitioning); deliver the first wave of refresh performance optimizations (index-aware MERGE, predicate pushdown, change buffer compaction, cost-based refresh strategy); enable cloud-native PgBouncer transaction-mode deployments via an opt-in compatibility mode; and complete the pre-1.0 packaging and deployment infrastructure.
Completed items (click to expand)
Connection Pooler Compatibility
In plain terms: PgBouncer is the most widely used PostgreSQL connection pooler — it sits in front of the database and reuses connections across many application threads. In its common "transaction mode" it hands a different physical connection to each transaction, which breaks anything that assumes the same connection persists between calls (session locks, prepared statements). This work introduces an opt-in compatibility mode for pg_trickle so it works correctly in cloud deployments — Supabase, Railway, Neon, and similar platforms that route through PgBouncer by default.
pg_trickle uses session-level advisory locks and PREPARE statements that are
incompatible with PgBouncer transaction-mode pooling. This section introduces an opt-in graceful degradation layer for connection pooler compatibility.
| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| PB1 | Replace pg_advisory_lock() with catalog row-level locking (FOR UPDATE SKIP LOCKED) | 3–4d | ✅ Done (0.10-adjustments) | PLAN_PG_BOUNCER.md |
| PB2 | Add pooler_compatibility_mode catalog column directly to pgt_stream_tables via CREATE STREAM TABLE ... WITH (...) or alter_stream_table() to bypass PREPARE statements and skip NOTIFY locally | 3–4d | ✅ Done (0.10-adjustments) | PLAN_PG_BOUNCER.md |
| PB3 | E2E validation against PgBouncer transaction-mode (Docker Compose with pooler sidecar) | 1–2d | ✅ Done (0.10-adjustments) | PLAN_EDGE_CASES.md EC-28 |
⚠️ PB1 — `SKIP LOCKED` fails silently, not safely. `pg_advisory_lock()` blocks until the lock is granted, guaranteeing mutual exclusion. `FOR UPDATE SKIP LOCKED` returns zero rows immediately if the row is already locked — meaning a second worker will simply not acquire the lock and proceed as if uncontested, potentially running a concurrent refresh on the same stream table. Before merging PB1, verify that every call site that previously relied on the blocking guarantee now explicitly handles the "lock not acquired" path (e.g. skip this cycle and retry) rather than silently proceeding. The E2E test in PB3 must include a concurrent-refresh scenario that would fail if the skip-and-proceed bug is present.
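The required call-site behavior is easy to model: a non-blocking acquire whose failure path is handled explicitly. An illustrative Python sketch (a `threading.Lock` stands in for the catalog row lock; names are hypothetical):

```python
import threading

refresh_lock = threading.Lock()   # stands in for the catalog row lock

def try_refresh(stream_table):
    """SKIP LOCKED semantics: a non-blocking acquire that returns
    immediately. The caller must treat 'not acquired' as 'skip this
    cycle and retry later', never proceed as if uncontested."""
    if not refresh_lock.acquire(blocking=False):
        return f"skipped {stream_table}: refresh already in progress"
    try:
        return f"refreshed {stream_table}"
    finally:
        refresh_lock.release()

print(try_refresh("active_orders"))   # refreshed active_orders
refresh_lock.acquire()                # simulate a concurrent holder
print(try_refresh("active_orders"))   # skipped active_orders: ...
refresh_lock.release()
```

The bug the warning describes is exactly the missing `if not ... acquire` branch: dropping it makes the second call run the refresh anyway.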
PgBouncer compatibility subtotal: ~7–10 days
DVM Correctness & Performance (deferred from v0.9.0)
In plain terms: These items were evaluated during v0.9.0 and deferred because the current implementations are correct — they just scale with data size rather than delta size in certain edge cases. All produce correct results today; this work makes them faster.
| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| P2-1 | Recursive CTE DRed in DIFFERENTIAL mode. DELETE/UPDATE against a recursive CTE source falls back to O(n) full recompute + diff. Implement DRed for DeltaSource::ChangeBuffer to maintain O(delta) cost. | 2–3 wk | ✅ Done (0.10-adjustments) | src/dvm/operators/recursive_cte.rs |
| P2-2 | SUM NULL-transition rescan for FULL OUTER JOIN aggregates. When SUM sits above a FULL OUTER JOIN and rows transition between matched/unmatched states, algebraic formula gives 0 instead of NULL, triggering full-group rescan. Implement targeted correction. | 1–2 wk | ✅ Done | src/dvm/operators/aggregate.rs |
| P2-4 | Materialized view sources in IMMEDIATE mode (EC-09). Implement polling-change-detection wrapper for REFRESH MATERIALIZED VIEW-sourced queries in IMMEDIATE mode. | 2–3 wk | ✅ Done | plans/PLAN_EDGE_CASES.md §EC-09 |
| P2-6 | LATERAL subquery inner-source scoped re-execution. Gate outer-table scan behind a join to inner delta rows so only correlated outer rows are re-executed, reducing O(|outer|) to O(delta). | 1–2 wk | ✅ Done | src/dvm/operators/lateral_subquery.rs |
| P3-2 | Welford auxiliary columns for CORR/COVAR/REGR_* aggregates. Implement Welford-style accumulation to reach O(1) algebraic maintenance identical to the STDDEV/VAR path. | 2–3 wk | ✅ Done | src/dvm/operators/aggregate.rs |
| B3-2 | Merged-delta weight aggregation. GROUP BY __pgt_row_id, SUM(weight) for cross-source deduplication; remove zero-weight rows. | 3–4 wk | ✅ Done | PLAN_NEW_STUFF.md §B-3 |
| B3-3 | Property-based correctness tests for simultaneous multi-source changes; diamond-flow scenarios. Hard prerequisite for B3-2. | 1–2 wk | ✅ Done | PLAN_NEW_STUFF.md §B-3 |
✅ B3-2 correctly uses weight aggregation (`GROUP BY __pgt_row_id, SUM(weight)`) instead of `DISTINCT ON`. B3-3 property-based tests verify correctness for 6 diamond-flow topologies (inner join, left join, full join, aggregate, multi-root, deep diamond).
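The B3-2 merge rule can be sketched in plain Rust. This is a minimal illustration of the Z-set-style semantics behind `GROUP BY __pgt_row_id, SUM(weight)` with zero-weight elimination, not the extension's actual code; the type names are hypothetical.

```rust
use std::collections::BTreeMap;

/// One delta from a single branch of a diamond: (content-hash row id, weight).
/// Weight +1 = insertion, -1 = deletion, as in DBSP-style Z-sets.
type Delta = Vec<(i64, i64)>;

/// Merge deltas arriving from multiple branches by summing weights per
/// __pgt_row_id and discarding rows that net to zero — the shape of
/// GROUP BY __pgt_row_id, SUM(weight) ... HAVING SUM(weight) <> 0.
fn merge_deltas(branches: &[Delta]) -> BTreeMap<i64, i64> {
    let mut merged = BTreeMap::new();
    for branch in branches {
        for &(row_id, weight) in branch {
            *merged.entry(row_id).or_insert(0) += weight;
        }
    }
    merged.retain(|_, w| *w != 0); // net-zero rows produce no change
    merged
}
```

A row inserted on one branch and deleted on another cancels out, which is exactly the case `DISTINCT ON` handled incorrectly.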
DVM deferred items subtotal: ~12–19 weeks
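The P3-2 Welford path can be illustrated with a covariance accumulator: each inserted pair updates running means and a co-moment in O(1), mirroring the auxiliary-column approach already used for STDDEV/VAR. A simplified sketch (the struct and field names are illustrative, not the extension's real auxiliary columns):

```rust
/// Welford-style accumulator for covariance: O(1) per inserted (x, y) pair.
#[derive(Default)]
struct CovarAcc {
    n: u64,
    mean_x: f64,
    mean_y: f64,
    c: f64, // co-moment: Σ (x - mean_x)(y - mean_y)
}

impl CovarAcc {
    fn insert(&mut self, x: f64, y: f64) {
        self.n += 1;
        let dx = x - self.mean_x;
        self.mean_x += dx / self.n as f64;
        self.mean_y += (y - self.mean_y) / self.n as f64;
        // dx uses the OLD mean_x, the second factor uses the UPDATED mean_y:
        // this ordering is what keeps the update numerically stable.
        self.c += dx * (y - self.mean_y);
    }

    fn covar_pop(&self) -> Option<f64> {
        (self.n > 0).then(|| self.c / self.n as f64)
    }
}
```

CORR and the REGR_* family are derived from the same sums, so one set of auxiliary counters serves the whole group.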
DVM Safety Fixes & CDC Correctness Hardening
These items were identified during a post-v0.9.0 audit of the DVM engine and CDC pipeline. P0 items produce runtime PostgreSQL syntax errors with no helpful extension-level error; P1 items produce silent wrong results. They affect uncommon query shapes, but users can reach them with no warning.
SQL Comment Injection (P0)
| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| SF-1 | build_snapshot_sql catch-all returns an SQL comment as a FROM clause fragment. The _ arm of build_snapshot_sql() returns /* unsupported snapshot for <node> */ which is injected directly into JOIN SQL, producing a PostgreSQL syntax error (syntax error at or near "/") instead of a clear extension error. Affects any RecursiveCte, Except, Intersect, UnionAll, LateralSubquery, LateralFunction, ScalarSubquery, Distinct, or RecursiveSelfRef node appearing as a direct JOIN child. Replace the catch-all arm with PgTrickleError::UnsupportedQuery. | 0.5 d | ✅ Done | src/dvm/operators/join_common.rs |
| SF-2 | Explicit /* unsupported snapshot for distinct */ string in join.rs. Hardcoded variant of SF-1 for the Distinct-child case in inner-join snapshot construction. Same fix: return PgTrickleError::UnsupportedQuery. | 0.5 d | ✅ Done | src/dvm/operators/join.rs |
| SF-3 | parser.rs FROM-clause deparser fallbacks inject SQL comments. /* unsupported RangeSubselect */ and /* unsupported FROM item */ are emitted as FROM clause fragments, causing PostgreSQL syntax errors when the generated SQL is executed. Replace with PgTrickleError::UnsupportedQuery. | 0.5 d | ✅ Done | src/dvm/parser.rs |
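The shape of the SF-1/SF-3 fix is the same everywhere: a deparser fallback must surface a structured error instead of injecting an SQL comment that PostgreSQL later rejects with a bare syntax error. A simplified stand-in (the `Node` and `PgTrickleError` types here are reduced mock-ups of the extension's real types):

```rust
#[derive(Debug, PartialEq)]
enum Node {
    Table(&'static str),
    RecursiveCte,
    Distinct,
}

#[derive(Debug, PartialEq)]
enum PgTrickleError {
    UnsupportedQuery(String),
}

/// Build the snapshot FROM fragment for a plan node. Before the fix the
/// catch-all arm returned "/* unsupported snapshot for <node> */", which was
/// spliced into JOIN SQL and failed with `syntax error at or near "/"`.
fn build_snapshot_sql(node: &Node) -> Result<String, PgTrickleError> {
    match node {
        Node::Table(name) => Ok(format!("SELECT * FROM {name}")),
        other => Err(PgTrickleError::UnsupportedQuery(format!(
            "snapshot not supported for {other:?} as a direct JOIN child"
        ))),
    }
}
```

The caller now fails at planning time with an actionable message instead of handing PostgreSQL malformed SQL at refresh time.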
DVM Correctness Bugs (P1)
| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| SF-4 | child_to_from_sql returns None for renamed-column Project nodes, silently skipping group rescan. When a Project with column renames (e.g. EXTRACT(year FROM orderdate) AS o_year) sits between an aggregate and its source, child_to_from_sql() returns None and the group-rescan CTE is omitted without error. Groups crossing COUNT 0→1 or MAX deletion thresholds produce permanently stale aggregate values. Distinct from tracked P2-2 (SUM/FULL OUTER JOIN specific); this affects any complex projection above an aggregate. | 1–2 wk | ✅ Done | src/dvm/operators/aggregate.rs |
| SF-5 | EC-01 fix is incomplete for right-side join subtrees with ≥3 scan nodes. use_pre_change_snapshot() applies a join_scan_count(child) <= 2 threshold to avoid cascading CTE materialization. For right-side join chains with ≥3 scan nodes (TPC-H Q7, Q8, Q9 all qualify), the original EC-01 phantom-row-after-DELETE bug is still present. The roadmap marks EC-01 as "Done" without noting this remaining boundary. Extend the fix to ≥3-scan right subtrees, or document the limitation explicitly with a test that asserts the boundary. | 2–3 wk | ✅ Done (boundary documented with 5 unit tests + DVM_OPERATORS.md limitation note) | src/dvm/operators/join_common.rs |
| SF-6 | EXCEPT __pgt_count columns not forwarded through Project nodes, causing silent wrong results. EXCEPT uses a "retain but mark invisible" design (never emits 'D' events). A Project above EXCEPT that does not propagate __pgt_count_l/__pgt_count_r prevents the MERGE step from distinguishing visible from invisible rows. Enforce count column propagation in the planner or raise PgTrickleError at planning time if a Project over Except drops these columns. | 1–2 wk | ✅ Done | src/dvm/operators/project.rs |
DVM Edge-Condition Correctness (P2)
| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| SF-7 | Empty subquery_cols silently emits (SELECT NULL FROM …) as scalar subquery result. When inner column detection fails (e.g. star-expansion from a view source), scalar_col is set to "NULL" and NULL values silently propagate into the stream table with no error raised. Detect empty subquery_cols at planning time and return PgTrickleError::UnsupportedQuery. | 0.5 d | ✅ Done | src/dvm/operators/scalar_subquery.rs |
| SF-8 | Dummy row_id = 0 in lateral inner-change branch can hash-collide with a real outer row. build_inner_change_branch() emits 0::BIGINT AS __pgt_row_id as a placeholder for re-executed outer rows. Since actual row hashes span the full BIGINT range, a real outer row could hash to 0, causing the DISTINCT/MERGE step to conflate it with the dummy entry. Use a sentinel outside the hash range (e.g. (-9223372036854775808)::BIGINT, i.e. MIN(BIGINT)) or add a separate __pgt_is_inner_dummy BOOLEAN discriminator column. | 1 wk | ✅ Done (sentinel changed to i64::MIN) | src/dvm/operators/lateral_subquery.rs |
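The SF-8 collision risk is worth making concrete: real row hashes span the full `i64` range, so `0` is a reachable value, and any reachable value is an unsafe dummy. `i64::MIN` only works as a sentinel if the hash function is constrained never to emit it, which is the assumption behind the fix. A sketch (the remapping function is hypothetical):

```rust
/// Reserved dummy __pgt_row_id for re-executed outer rows (SF-8 fix).
const INNER_DUMMY_ROW_ID: i64 = i64::MIN;

/// Hypothetical row-id hash post-processing: keep the sentinel unreachable
/// by remapping a real hash that happens to land on it.
fn row_id_hash(raw: i64) -> i64 {
    if raw == INNER_DUMMY_ROW_ID { raw + 1 } else { raw }
}

fn is_inner_dummy(row_id: i64) -> bool {
    row_id == INNER_DUMMY_ROW_ID
}
```

The alternative mentioned in the table, a separate `__pgt_is_inner_dummy BOOLEAN` column, removes the reserved-value assumption entirely at the cost of a wider row.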
CDC Correctness (P1–P2)
| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| SF-9 | UPDATE trigger uses = (not IS NOT DISTINCT FROM) on composite PK columns, silently dropping rows with NULL PK columns. The __pgt_new JOIN __pgt_old ON pk_a = pk_a AND pk_b = pk_b uses =, so NULL = NULL evaluates to false and those rows are silently dropped from the change buffer. The stream table permanently diverges from the source with no error. Change all PK join conditions in the UPDATE trigger to use IS NOT DISTINCT FROM. | 0.5 d | ✅ Done | src/cdc.rs |
| SF-10 | TRUNCATE marker + same-window INSERT ordering is untested; post-TRUNCATE rows may be missed. If INSERTs arrive after a TRUNCATE but before the scheduler ticks, the change buffer contains both a 'T' marker and 'I' rows. The "TRUNCATE → full refresh → discard buffer" path has no E2E test coverage for this sequencing. A race between the FULL refresh snapshot and in-flight inserts could drop post-TRUNCATE inserted rows. Add a targeted E2E test and verify atomicity of the discard-vs-snapshot sequence. | 0.5 d | ✅ Done (verified: TRUNCATE triggers full refresh which re-reads source; change buffer is discarded atomically within the same transaction) | src/cdc.rs |
| SF-11 | WAL publication goes stale after a source table is later converted to partitioned. create_publication() sets publish_via_partition_root = true only at creation time. If a source table is subsequently converted to partitioned, WAL events arrive with child-partition OIDs, causing lookup failures and a silent CDC stall for that table (no error, stream table silently freezes). Detect post-creation partitioning during publication health checks and rebuild the publication entry. | 1–2 wk | ✅ Done | src/wal_decoder.rs |
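The SF-9 fix hinges on SQL three-valued logic: `NULL = NULL` evaluates to unknown, so a plain `=` join drops any row with a NULL PK column. A sketch of how the NULL-safe trigger join condition might be assembled (the helper name is hypothetical; the generated SQL shape follows the table above):

```rust
/// Build the UPDATE-trigger join condition over composite PK columns.
/// IS NOT DISTINCT FROM treats two NULLs as equal, so rows with NULL PK
/// columns are still matched between the OLD and NEW transition tables.
fn pk_join_condition(pk_cols: &[&str]) -> String {
    pk_cols
        .iter()
        .map(|c| format!("__pgt_new.{c} IS NOT DISTINCT FROM __pgt_old.{c}"))
        .collect::<Vec<_>>()
        .join(" AND ")
}
```

With `=` the same rows silently vanish from the change buffer and the stream table permanently diverges, which is why this is a P1 despite the one-line fix.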
Operational & Documentation Gaps (P3)
| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| SF-12 | DiamondSchedulePolicy::Fastest CPU multiplication is undocumented. The default policy refreshes all members of a diamond consistency group whenever any member is due. In an asymmetric diamond (B every 1s, C every 5s, both feeding D), C refreshes 5× more often than scheduled, consuming unexplained CPU. Add a cost-implication warning to CONFIGURATION.md and ARCHITECTURE.md, and explain DiamondSchedulePolicy::Slowest as the low-CPU alternative. | 0.5 d | ✅ Done | src/dag.rs · docs/CONFIGURATION.md |
| SF-13 | ROADMAP inconsistency: B-2 (Delta Predicate Pushdown) listed as ⬜ Not started in v0.10.0 but G-4/P2-7 marked completed in v0.9.0. The v0.9.0 exit criteria mark [x] G-4 (P2-7): Delta predicate pushdown implemented, yet the v0.10.0 table lists B-2 | Delta Predicate Pushdown | ⬜ Not started. If B-2 has additional scope beyond G-4 (e.g. OR-branch handling for deletions, covering index creation, benchmark targets), document that scope explicitly. If B-2 is fully covered by G-4, remove or mark it done in the v0.10.0 table to avoid double-counting effort. | 0.5 d | ✅ Done (B-2 marked as completed by G-4/P2-7) | ROADMAP.md |
DVM safety & CDC hardening subtotal: ~3–4 days (SF-1–3, SF-7, SF-9–10, SF-12–13) + ~6–10 weeks (SF-4–6, SF-8, SF-11)
Core Refresh Optimizations (Wave 2)
Read the risk analyses in PLAN_NEW_STUFF.md before implementing. Implement in this order: A-4 (no schema change), B-2, C-4, then B-4.
| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| A-4 | Index-Aware MERGE Planning. Planner hint injection (enable_seqscan = off for small-delta / large-target); covering index auto-creation on __pgt_row_id. No schema changes required. | 1–2 wk | ✅ Done | PLAN_NEW_STUFF.md §A-4 |
| B-2 | Delta Predicate Pushdown. Push WHERE predicates from defining query into change-buffer delta_scan CTE; OR old_col handling for deletions; 5–10× delta-row-volume reduction for selective queries. | 2–3 wk | ✅ Done (v0.9.0 as G-4/P2-7) | PLAN_NEW_STUFF.md §B-2 |
| C-4 | Change Buffer Compaction. Net-change compaction (INSERT+DELETE=no-op; UPDATE+UPDATE=single row); run when buffer exceeds pg_trickle.compact_threshold; use advisory lock to serialise with refresh. | 2–3 wk | ✅ Done | PLAN_NEW_STUFF.md §C-4 |
| B-4 | Cost-Based Refresh Strategy. Replace fixed differential_max_change_ratio with a history-driven cost model fitted on pgt_refresh_history; cold-start fallback to fixed threshold. | 2–3 wk | ✅ Done (cost model + adaptive threshold already active) | PLAN_NEW_STUFF.md §B-4 |
⚠️ C-4: The compaction DELETE must use `seq` (the sequence primary key), not `ctid`, as the stable row identifier. `ctid` changes under VACUUM and will silently delete the wrong rows. See the corrected SQL and risk analysis in PLAN_NEW_STUFF.md §C-4.
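The C-4 net-change rules can be sketched per source row. The roadmap specifies INSERT+DELETE → no-op and UPDATE+UPDATE → single row; the remaining combinations below (I+U → I, U+D → D, D+I → U) follow standard CDC compaction semantics and are an assumption of this sketch, not a statement of pg_trickle's exact behaviour:

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Ev {
    Insert,
    Update,
    Delete,
}

/// Collapse an ordered event run for one source row into at most one event.
/// Real compaction keys on the PK and orders by the sequence column
/// (`change_id`-style), never by ctid.
fn compact(events: &[Ev]) -> Option<Ev> {
    events.iter().copied().fold(None, |acc, ev| match (acc, ev) {
        (None, e) => Some(e),
        (Some(Ev::Insert), Ev::Update) => Some(Ev::Insert), // newest values ride the 'I'
        (Some(Ev::Insert), Ev::Delete) => None,             // net no-op
        (Some(Ev::Update), Ev::Update) => Some(Ev::Update), // single row survives
        (Some(Ev::Update), Ev::Delete) => Some(Ev::Delete),
        (Some(Ev::Delete), Ev::Insert) => Some(Ev::Update), // row existed before window
        (prev, _) => prev, // other sequences are invalid; keep first for the sketch
    })
}
```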
⚠️ A-4 — Planner hint must be transaction-scoped (`SET LOCAL`), never session-scoped (`SET`). The existing P3-4 implementation (already shipped) uses `SET LOCAL enable_seqscan = off`, which PostgreSQL automatically reverts at transaction end. Any extension of A-4 (e.g. the covering index auto-creation path) must continue to use `SET LOCAL`. Using plain `SET` instead would permanently disable seq-scans for the remainder of the session, corrupting planner behaviour for all subsequent queries in that backend.
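The B-4 threshold choice can be sketched as a blend with a cold-start guard. The 60/40 blend and the fixed-GUC fallback come from the exit criteria; the sample cutoff and function signature here are assumptions for illustration:

```rust
/// Pick the effective change-ratio threshold for differential-vs-full
/// strategy selection (B-4 sketch): blend the history-fitted cost-model
/// estimate with the rolling adaptive value, or fall back to the fixed
/// `differential_max_change_ratio` GUC when refresh history is too thin.
fn effective_change_ratio_threshold(
    history_samples: usize,
    cost_based: f64, // from the refresh-history cost model
    adaptive: f64,   // rolling adjustment
    fixed_guc: f64,  // differential_max_change_ratio
) -> f64 {
    const MIN_SAMPLES: usize = 10; // assumed cold-start cutoff
    if history_samples < MIN_SAMPLES {
        fixed_guc
    } else {
        0.6 * cost_based + 0.4 * adaptive
    }
}
```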
Core refresh optimizations subtotal: ~7–11 weeks
Scheduler & DAG Scalability
These items address scheduler CPU efficiency and DAG maintenance overhead at scale. They were identified as C-1 and C-2 in plans/performance/PLAN_NEW_STUFF.md but were not included in earlier milestones.
| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| G-7 | Tiered refresh scheduling (Hot/Warm/Cold/Frozen). All stream tables currently refresh at their configured interval regardless of how often they are queried. In deployments with many STs, most Cold/Frozen tables consume full scheduler CPU unnecessarily. Introduce four tiers keyed by a per-ST pgtrickle access counter (not pg_stat_user_tables, which is polluted by pg_trickle's own MERGE scans): Hot (≥10 reads/min: refresh at configured interval), Warm (1–10 reads/min: ×2 interval), Cold (<1 read/min: ×10 interval), Frozen (0 reads since last N cycles: suspend until manually promoted). A single GUC pg_trickle.tiered_scheduling (default off) gates the feature. | 3–4 wk | ✅ Done | src/scheduler.rs · plans/performance/PLAN_NEW_STUFF.md §C-1 |
| G-8 | Incremental DAG rebuild on DDL changes. Any CREATE/ALTER/DROP STREAM TABLE currently triggers a full O(V+E) re-query of all pgt_dependencies rows to rebuild the entire DAG. For deployments with 100+ stream tables this adds per-DDL latency and has a race condition: if two DDL events arrive before the scheduler ticks, only the latest pgt_id stored in shared memory may be processed. Replace with a targeted edge-delta approach: the DDL hooks write affected stream table OIDs into a pending-changes queue; the scheduler applies only those edge insertions/deletions, leaving the rest of the graph intact. | 2–3 wk | ✅ Done | src/dag.rs · src/scheduler.rs · plans/performance/PLAN_NEW_STUFF.md §C-2 |
| C2-1 | Ring-buffer DAG invalidation. Replace single pgt_id scalar in shared memory with a bounded ring buffer of affected IDs; full-rebuild fallback on overflow. Hard prerequisite for correctness of G-8 under rapid DDL changes. | 1 wk | ✅ Done | PLAN_NEW_STUFF.md §C-2 |
| C2-2 | Incremental topo-sort. Incremental topo-sort on affected subgraph only; cache sorted schedule in shared memory. | 1–2 wk | ✅ Done | PLAN_NEW_STUFF.md §C-2 |
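The G-7 tier selection reduces to a small function of the per-ST access counter. The tiers and multipliers are exactly those in the table above; the function shape is a hypothetical sketch:

```rust
/// Effective refresh interval under tiered scheduling (G-7 sketch).
/// Hot: configured interval; Warm: ×2; Cold: ×10; Frozen: suspended
/// (None) until manually promoted.
fn effective_interval_secs(reads_per_min: f64, configured: u64, frozen: bool) -> Option<u64> {
    if frozen {
        None
    } else if reads_per_min >= 10.0 {
        Some(configured) // Hot
    } else if reads_per_min >= 1.0 {
        Some(configured * 2) // Warm
    } else {
        Some(configured * 10) // Cold
    }
}
```

Keying the counter on a pgtrickle-owned access tracker rather than `pg_stat_user_tables` matters because the extension's own MERGE scans would otherwise count as reads and keep every table artificially Hot.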
⚠️ A single `pgt_id` scalar in shared memory is vulnerable to overwrite when two DDL changes arrive between scheduler ticks — use a ring buffer (C2-1) or fall back to full rebuild. See PLAN_NEW_STUFF.md §C-2 risk analysis.
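The C2-1 design can be sketched as a bounded buffer with an overflow latch. This is a single-threaded illustration of the contract (push from DDL hooks, drain from the scheduler, full rebuild on overflow); the real structure lives in PostgreSQL shared memory with its own locking:

```rust
/// Bounded invalidation buffer (C2-1 sketch): DDL hooks push affected IDs;
/// on overflow a full-rebuild flag is latched instead of dropping entries.
struct InvalidationRing {
    buf: Vec<u32>,
    cap: usize,
    overflowed: bool,
}

impl InvalidationRing {
    fn new(cap: usize) -> Self {
        Self { buf: Vec::new(), cap, overflowed: false }
    }

    fn push(&mut self, pgt_id: u32) {
        if self.buf.len() == self.cap {
            self.overflowed = true; // scheduler must fall back to full rebuild
        } else {
            self.buf.push(pgt_id);
        }
    }

    /// Scheduler tick: drain pending IDs, or signal a full rebuild.
    fn drain(&mut self) -> Result<Vec<u32>, &'static str> {
        if self.overflowed {
            self.overflowed = false;
            self.buf.clear();
            Err("overflow: full DAG rebuild required")
        } else {
            Ok(std::mem::take(&mut self.buf))
        }
    }
}
```

The key property is that overflow degrades to the old full-rebuild behaviour rather than losing an invalidation, which is what makes G-8 safe under rapid DDL.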
Scheduler & DAG scalability subtotal: ~7–10 weeks
"No Surprises" — Principle of Least Astonishment
In plain terms: pg_trickle does a lot of work automatically — rewriting queries, managing auxiliary columns, transitioning CDC modes, falling back between refresh strategies. Most of this is exactly what users want, but several behaviors happen silently where a brief notification would prevent confusion. This section adds targeted warnings, notices, and documentation so that every implicit behavior is surfaced to the user at the moment it matters.
| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| NS-1 | Warn on ORDER BY without LIMIT. Emit WARNING at create_stream_table / alter_stream_table time when query contains ORDER BY without LIMIT: "ORDER BY without LIMIT has no effect on stream tables — storage row order is undefined." | 2–4h | ✅ Done | src/api.rs |
| NS-2 | Warn on append_only auto-revert. Upgrade the info!() to warning!() when append_only is automatically reverted due to DELETE/UPDATE. Add a pgtrickle_alert NOTIFY with category append_only_reverted. | 1–2h | ✅ Done | src/refresh.rs |
| NS-3 | Promote cleanup errors after consecutive failures. Track consecutive drain_pending_cleanups() error count in thread-local state; promote from debug1 to WARNING after 3 consecutive failures for the same source OID. | 2–4h | ✅ Done | src/refresh.rs |
| NS-4 | Document __pgt_* auxiliary columns in SQL_REFERENCE. Add a dedicated subsection listing all implicit columns (__pgt_row_id, __pgt_count, __pgt_sum, __pgt_sum2, __pgt_nonnull, __pgt_covar_*, __pgt_count_l, __pgt_count_r) with the aggregate functions that trigger each. | 2–4h | ✅ Done | docs/SQL_REFERENCE.md |
| NS-5 | NOTICE on diamond detection with diamond_consistency='none'. When create_stream_table detects a diamond dependency and the user hasn't explicitly set diamond_consistency, emit NOTICE: "Diamond dependency detected — consider setting diamond_consistency='atomic' for consistent cross-branch reads." | 2–4h | ✅ Done | src/api.rs · src/dag.rs |
| NS-6 | NOTICE on differential→full fallback. Upgrade the existing info!() in adaptive fallback to NOTICE so it appears at default client_min_messages level. | 0.5–1h | ✅ Done | src/refresh.rs |
| NS-7 | NOTICE on isolated CALCULATED schedule. When create_stream_table creates an ST with schedule='calculated' that has no downstream dependents, emit NOTICE: "No downstream dependents found — schedule will fall back to pg_trickle.default_schedule_seconds (currently Ns)." | 1–2h | ✅ Done | src/api.rs |
"No Surprises" subtotal: ~10–20 hours
v0.10.0 total: ~58–84 hours + ~32–50 weeks DVM, refresh & safety work + ~10–20 hours "No Surprises"
Exit criteria:
- `ALTER EXTENSION pg_trickle UPDATE` tested (0.9.0 → 0.10.0) — upgrade script verified complete via `scripts/check_upgrade_completeness.sh`; adds `spooler_compatibility_mode`, `refresh_tier`, `pgt_refresh_groups`, and updated API function signatures
- All public documentation current and reviewed — SQL_REFERENCE.md, CONFIGURATION.md, CHANGELOG.md, and ROADMAP.md updated for all v0.10.0 features
- G-7: Tiered scheduling (Hot/Warm/Cold/Frozen) implemented; `pg_trickle.tiered_scheduling` GUC gates the feature
- G-8: Incremental DAG rebuild implemented; DDL-triggered edge-delta replaces full O(V+E) re-query
- C2-1: Ring-buffer DAG invalidation safe under rapid consecutive DDL changes
- C2-2: Incremental topo-sort caches sorted schedule; verified by property-based test
- P2-1: Recursive CTE DRed for DIFFERENTIAL mode (O(delta) instead of O(n) recompute) — implemented in 0.10-adjustments
- P2-2: SUM NULL-transition correction for FULL OUTER JOIN aggregates — implemented; `__pgt_aux_nonnull_*` auxiliary column eliminates full-group rescan
- P2-4: Materialized view sources supported in IMMEDIATE mode
- P2-6: LATERAL subquery inner-source scoped re-execution (O(delta) instead of O(|outer|))
- P3-2: CORR/`COVAR_*`/`REGR_*` Welford auxiliary columns for O(1) algebraic maintenance
- B3-2: Merged-delta weight aggregation passes property-based correctness proofs — implemented; replaces DISTINCT ON with GROUP BY + SUM(weight) + HAVING
- B3-3: Property-based tests for simultaneous multi-source changes — implemented; 6 diamond-flow E2E property tests
- A-4: Covering index auto-created on `__pgt_row_id` with INCLUDE clause for ≤8-column schemas; planner hint prevents seq-scan on small delta; `SET LOCAL` confirmed (not `SET`) so hint reverts at transaction end
- B-2: Predicate pushdown reduces delta volume for selective queries — `bench_b2_predicate_pushdown` in `e2e_bench_tests.rs` measures median filtered vs unfiltered refresh time; asserts filtered ≤3× unfiltered (in practice typically faster)
- C-4: Compaction uses `change_id` PK (not `ctid`); correct under concurrent VACUUM; serialised with advisory lock; net-zero elimination + intermediate row collapse
- B-4: Cost model self-calibrates from refresh history (`estimate_cost_based_threshold` + `compute_adaptive_threshold` with 60/40 blend); cold-start fallback to fixed GUC threshold
- PB1: Concurrent-refresh scenario covered by `test_pb1_concurrent_refresh_skip_locked_no_corruption` in `e2e_concurrent_tests.rs`; two concurrent `refresh_stream_table()` calls verified to produce correct data without corruption; `SKIP LOCKED` path confirmed non-blocking
- SF-1: `build_snapshot_sql` catch-all arm uses `pgrx::error!()` instead of injecting an SQL comment as a FROM fragment
- SF-2: Explicit `/* unsupported snapshot for distinct */` string replaced with `PgTrickleError::UnsupportedQuery` in join.rs
- SF-3: `parser.rs` FROM-clause deparser fallbacks replaced with `PgTrickleError::UnsupportedQuery`
- SF-4: `child_to_from_sql` wraps Project in subquery with projected expressions; rescan CTE correctly resolves aliased column names
- SF-5: EC-01 ≤2-scan boundary documented with 5 unit tests asserting the boundary + DVM_OPERATORS.md limitation note explaining the CTE materialization trade-off
- SF-6: `diff_project` forwards `__pgt_count_l`/`__pgt_count_r` through projection when present in child result
- SF-7: Empty `subquery_cols` in scalar subquery returns `PgTrickleError::UnsupportedQuery` rather than emitting `NULL`
- SF-8: Lateral inner-change branch uses `i64::MIN` sentinel instead of `0::BIGINT` as dummy `__pgt_row_id`
- SF-9: UPDATE trigger PK join uses `IS NOT DISTINCT FROM` for all PK columns; NULL-PK rows captured correctly
- SF-10: TRUNCATE + same-window INSERT E2E test passes; post-TRUNCATE rows not dropped
- SF-11: `check_publication_health()` detects post-creation partitioning and rebuilds publication with `publish_via_partition_root = true`
- SF-12: `DiamondSchedulePolicy::Fastest` cost-multiplication documented in CONFIGURATION.md with `Slowest` explanation
- SF-13: B-2 / G-4 roadmap inconsistency resolved; entry reflects actual remaining scope (or marked done if fully completed)
- NS-1: `ORDER BY` without `LIMIT` emits `WARNING` at creation time; E2E test verifies message
- NS-2: `append_only` auto-revert uses `WARNING` (not `INFO`) and sends `pgtrickle_alert` NOTIFY
- NS-3: `drain_pending_cleanups` promotes to `WARNING` after 3 consecutive failures per source OID
- NS-4: `__pgt_*` auxiliary columns documented in SQL_REFERENCE with triggering aggregate functions
- NS-5: Diamond detection with `diamond_consistency='none'` emits `NOTICE` suggesting `'atomic'`
- NS-6: Differential→full adaptive fallback uses `NOTICE` (not `INFO`)
- NS-7: Isolated `CALCULATED` schedule emits `NOTICE` with effective fallback interval
- NS-8: `diamond_consistency` default changed to `'atomic'`; catalog DDL, API code comments, and all documentation updated to match actual runtime behavior (API already resolved `NULL` to `Atomic`)
v0.11.0 — Partitioned Stream Tables, Prometheus & Grafana Observability, Safety Hardening & Correctness
Status: Released 2026-03-26. See CHANGELOG.md §0.11.0 for the full feature list.
Highlights: 34× lower latency via event-driven scheduler wake · incremental ST-to-ST refresh chains · declaratively partitioned stream tables (100× I/O reduction) · ready-to-use Prometheus + Grafana monitoring stack · FUSE circuit breaker · VARBIT changed-column bitmask (no more 63-column cap) · per-database worker quotas · DAG scheduling performance improvements (fused chains, adaptive polling, amplification detection) · TPC-H correctness gate in CI · safer production defaults.
Completed items (click to expand)
Partitioned Stream Tables — Storage (A-1)
In plain terms: A 10M-row stream table partitioned into 100 ranges means only the 2–3 partitions that actually received changes are touched by MERGE — reducing the MERGE scan from 10M rows to ~100K. The partition key must be a user-visible column and the refresh path must inject a verified range predicate.
| Item | Description | Effort | Ref |
|---|---|---|---|
| A1-1 | DDL: CREATE STREAM TABLE … PARTITION BY declaration; catalog column for partition key | 1–2 wk | PLAN_NEW_STUFF.md §A-1 |
| A1-2 | Delta inspection: extract min/max of partition key from delta CTE per scheduler tick | 1 wk | PLAN_NEW_STUFF.md §A-1 |
| A1-3 | MERGE rewrite: inject validated partition-key range predicate or issue per-partition MERGEs via Rust loop | 2–3 wk | PLAN_NEW_STUFF.md §A-1 |
| A1-4 | E2E benchmarks: 10M-row partitioned ST, 0.1% change rate concentrated in 2–3 partitions | 1 wk | PLAN_NEW_STUFF.md §A-1 |
⚠️ MERGE joins on `__pgt_row_id` (a content hash unrelated to the partition key) — partition pruning will not activate automatically. A predicate injection step is mandatory. See PLAN_NEW_STUFF.md §A-1 risk analysis before starting.
Retraction consideration (A-1): The 5–7 week effort estimate is optimistic. The core assumption — that partition pruning can be activated via a `WHERE partition_key BETWEEN ? AND ?` predicate — requires the partition key to be a tracked catalog column (not currently the case) and a verified range derivation from the delta. The alternative (per-partition MERGE loop in Rust) is architecturally sound but requires significant catalog and refresh-path changes. A design spike (2–4 days) producing a written implementation plan must be completed before A1-1 is started. The milestone is at P3 / Very High risk and should not block the 1.0 release if the design spike reveals additional complexity.
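The A1-2/A1-3 predicate derivation can be sketched as: scan the delta for the partition key's min/max, then inject that range into the MERGE so PostgreSQL can prune. This assumes the partition key is a tracked catalog column, which is precisely the open design question noted in the retraction consideration; the function below is a hypothetical illustration with an integer key:

```rust
/// Derive a partition-pruning predicate from the delta's key range.
/// Returns None for an empty delta (nothing to merge, nothing to prune).
fn partition_range_predicate(key_col: &str, delta_keys: &[i64]) -> Option<String> {
    let min = delta_keys.iter().min()?;
    let max = delta_keys.iter().max()?;
    Some(format!("{key_col} BETWEEN {min} AND {max}"))
}
```

A sparse delta touching two distant partitions still produces one wide range; the per-partition MERGE loop alternative avoids that by issuing one MERGE per affected partition instead.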
Partitioned stream tables subtotal: ~5–7 weeks
Multi-Database Scheduler Isolation (C-3)
| Item | Description | Effort | Ref |
|---|---|---|---|
| C-3 | Per-database worker quota GUC (`pg_trickle.per_database_worker_quota`); priority ordering (IMMEDIATE > Hot > Warm > Cold); burst capacity up to 150% when other DBs are under budget. `compute_per_db_quota()` helper with burst threshold at 80% cluster utilisation; `sort_ready_queue_by_priority()` dispatches `ImmediateClosure` first; 7 unit tests. | — | src/scheduler.rs |
Multi-DB isolation subtotal: ✅ Complete
Prometheus & Grafana Observability
In plain terms: Most teams already run Prometheus and Grafana to monitor their databases. This ships ready-to-use configuration files — no custom code, no extension changes — that plug into the standard `postgres_exporter` and light up a Grafana dashboard showing refresh latency, staleness, error rates, CDC lag, and per-stream-table detail. Also includes Prometheus alerting rules so you get paged when a stream table goes stale or starts error-looping. A Docker Compose file lets you try the full observability stack with a single `docker compose up`.
Zero-code monitoring integration. All config files live in a new `monitoring/` directory in the main repo (or a separate `pgtrickle-monitoring` repo). Queries use existing views (`pg_stat_stream_tables`, `check_cdc_health()`, `quick_health`).
| Item | Description | Effort | Ref |
|---|---|---|---|
| — | `monitoring/prometheus/pg_trickle_queries.yml` exports 14 metrics (per-table refresh stats, health summary, CDC buffer sizes, status counts, recent error rate) via `postgres_exporter`. | — | monitoring/prometheus/pg_trickle_queries.yml |
| — | `monitoring/prometheus/alerts.yml` has 8 alerting rules: staleness > 5 min, ≥3 consecutive failures, table SUSPENDED, CDC buffer > 1 GB, scheduler down, high refresh duration, cluster WARNING/CRITICAL. | — | monitoring/prometheus/alerts.yml |
| — | `monitoring/grafana/dashboards/pg_trickle_overview.json` has 6 sections: cluster overview stat panels, refresh performance time-series, staleness heatmap, CDC health graphs, per-table drill-down table with schema/table variable filters. | — | monitoring/grafana/dashboards/pg_trickle_overview.json |
| — | `monitoring/docker-compose.yml` spins up PostgreSQL + pg_trickle + postgres_exporter + Prometheus + Grafana with pre-wired config and demo seed data (`monitoring/init/01_demo.sql`). `docker compose up` → Grafana at :3000. | — | monitoring/docker-compose.yml |
Observability subtotal: ~12 hours ✅
Default Tuning & Safety Defaults (from REPORT_OVERALL_STATUS.md)
These four changes flip conservative defaults to the behavior that is safe and correct in production. All underlying features are implemented and tested; only the default values change. Each keeps the original GUC so operators can revert if needed.
| Item | Description | Effort | Ref |
|---|---|---|---|
| DEF-1 | `parallel_refresh_mode` default to `'on'`. `normalize_parallel_refresh_mode` maps None/unknown → `On`; unit test renamed to `defaults_to_on`. | — | REPORT_OVERALL_STATUS.md §R1 |
| DEF-2 | `auto_backoff` default to `true`. Default flipped to `true`; trigger threshold raised to 95%, cap reduced to 8×, log level raised to WARNING. CONFIGURATION.md updated. | 1–2h | REPORT_OVERALL_STATUS.md §R10 |
| — | `left_snapshot_filtered` pre-filter with `WHERE left_key IN (SELECT DISTINCT right_key FROM delta)` was already present in semi_join.rs. | — | src/dvm/operators/semi_join.rs |
| — | `INVALIDATION_RING_CAPACITY` raised to 128 in shmem.rs. | — | REPORT_OVERALL_STATUS.md §R9 |
| — | `block_source_ddl` default to `true`. Default flipped to `true`; both error messages in hooks.rs include a step-by-step escape-hatch procedure. | — | REPORT_OVERALL_STATUS.md §R12 |
Default tuning subtotal: ~14–21 hours
Safety & Resilience Hardening (Must-Ship)
In plain terms: The background worker should never silently hang or leave a stream table in an undefined state when an internal operation fails. These items replace `panic!`/`unwrap()` in code paths reachable from the background worker with structured errors and graceful recovery.
| Item | Description | Effort | Ref |
|---|---|---|---|
| — | `scheduler.rs`, `refresh.rs`, `hooks.rs`: no `panic!`/`unwrap()` outside `#[cfg(test)]`. `check_skip_needed` now logs WARNING on SPI error with table name and error details. Audit finding documented in comment. | — | src/scheduler.rs |
| — | `tests/e2e_safety_tests.rs`: (1) column drop triggers `UpstreamSchemaChanged`, verifies scheduler stays alive and other STs continue; (2) source table drop, same verification. | — | tests/e2e_safety_tests.rs |
Safety hardening subtotal: ~7–12 hours
Correctness & Code Quality Quick Wins (from REPORT_OVERALL_STATUS.md §12–§15)
In plain terms: Six self-contained improvements identified in the deep gap analysis. Each takes under a day and substantially reduces silent failure modes, operator confusion, and diagnostic friction.
Quick Fixes (< 1 hour each)
| Item | Description | Effort | Ref |
|---|---|---|---|
| QF-1 | Stray `println!` output. `println!` replaced with `pgrx::log!()` guarded by new `pg_trickle.log_merge_sql` GUC (default off). | — | src/refresh.rs |
| QF-2 | Log level in `api.rs` raised from `pgrx::info!()` to `pgrx::warning!()`. | — | plans/performance/REPORT_OVERALL_STATUS.md §12 |
| QF-3 | Warn when `append_only` auto-reverts. `pgrx::warning!()` + `emit_alert(AppendOnlyReverted)` already present in refresh.rs. | — | plans/performance/REPORT_OVERALL_STATUS.md §15 |
| QF-4 | Document `unwrap()` invariants. `// INVARIANT:` comments added at four `unwrap()` sites in dvm/parser.rs (after `is_empty()` guard, `len()==1` guards, and non-empty `Err` return). | — | src/dvm/parser.rs |
Quick-fix subtotal: ~3–4 hours
Effective Refresh Mode Tracking (G12-ERM)
In plain terms: When a stream table is configured as `AUTO`, operators currently have no way to discover which mode is actually being used at runtime without reading warning logs. Storing the resolved mode in the catalog and exposing a diagnostic function closes this observability gap.
| Item | Description | Effort | Ref |
|---|---|---|---|
| — | `effective_refresh_mode` column added to `pgt_stream_tables`; extension upgrade script pg_trickle--0.10.0--0.11.0.sql created. | — | src/catalog.rs |
| — | `explain_refresh_mode(name TEXT)` SQL function. `pgtrickle.explain_refresh_mode()` returns configured mode, effective mode, and downgrade reason. | — | src/api.rs |
Effective refresh mode subtotal: ~4–7 hours
Correctness Guards (G12-2, G12-AGG)
| Item | Description | Effort | Ref |
|---|---|---|---|
| — | `validate_topk_metadata()` re-parses the reconstructed full query on each TopK refresh; `validate_topk_metadata_fields()` validates stored fields (pure logic, unit-testable). Falls back to FULL + WARNING on mismatch. 7 unit tests. | — | src/refresh.rs |
| — | `classify_agg_strategy()` classifies each aggregate as ALGEBRAIC_INVERTIBLE / ALGEBRAIC_VIA_AUX / SEMI_ALGEBRAIC / GROUP_RESCAN. Warning emitted at `create_stream_table` time for DIFFERENTIAL + group-rescan aggs. Strategy exposed in `explain_st()` as `aggregate_strategies` JSON. 18 unit tests. | — | src/dvm/parser.rs |
Correctness guards subtotal: ✅ Complete
Parameter & Error Hardening (G15-PV, G13-EH)
| Item | Description | Effort | Ref |
|---|---|---|---|
| — | (a) `cdc_mode='wal'` + `refresh_mode='IMMEDIATE'` rejection was already present; (b) `diamond_schedule_policy='slowest'` + `diamond_consistency='none'` now rejected in `create_stream_table_impl` and `alter_stream_table_impl` with structured error. | — | src/api.rs |
| — | `raise_error_with_context()` helper in api.rs uses `ErrorReport::new().set_detail().set_hint()` for UnsupportedOperator, CycleDetected, UpstreamSchemaChanged, and QueryParseError; all 8 API-boundary error sites updated. | — | src/api.rs |
Parameter & error hardening subtotal: ~6–12 hours
Testing: EC-01 Boundary Regression (G17-EC01B-NEG)
| Item | Description | Effort | Ref |
|---|---|---|---|
| — | Regression tests in `join_common.rs` covering 3-way join, 4-way join, right-subtree ≥3 scans, and 2-scan boundary. `// TODO: Remove when EC01B-1/EC01B-2 fixed in v0.12.0` | — | src/dvm/operators/join_common.rs |
EC-01 boundary regression subtotal: ✅ Complete
Documentation Quick Wins (G16-GS, G16-SM, G16-MQR, G15-GUC)
| Item | Description | Effort | Ref |
|---|---|---|---|
| G16-GS | Restructure GETTING_STARTED.md with progressive complexity. Five chapters: (1) Hello World — single-table ST with no join; (2) Multi-table join; (3) Scheduling & backpressure; (4) Monitoring — 5 key functions; (5) Advanced — FUSE, wide bitmask, partitions. Remove the current flat wall-of-SQL structure. ✅ Done in v0.11.0 Phase 11 — 5-chapter structure implemented; Chapter 1 Hello World example added; Chapter 5 Advanced Topics adds inline FUSE, partitioning, IMMEDIATE, and multi-tenant quota examples. | — | docs/GETTING_STARTED.md |
| G16-SM | Support matrix in `docs/DVM_OPERATORS.md` covering all operators × FULL/DIFFERENTIAL/IMMEDIATE modes with caveat footnotes. | — | docs/DVM_OPERATORS.md |
| G16-MQR | Monitoring quick reference in `docs/GETTING_STARTED.md` with `pgt_status()`, `health_check()`, `change_buffer_sizes()`, `dependency_tree()`, `fuse_status()`, Prometheus/Grafana stack, key metrics table, and alert summary. | — | docs/GETTING_STARTED.md |
| G15-GUC | GUC reference in `docs/CONFIGURATION.md`. | — | docs/CONFIGURATION.md |
Documentation subtotal: ~2–3 days
Correctness quick-wins & documentation subtotal: ~1–2 days code + ~2–3 days docs
Should-Ship Additions
Wider Changed-Column Bitmask (>63 columns)
In plain terms: Stream tables built on source tables with more than 63 columns fall back silently to tracking every column on every UPDATE, losing all CDC selectivity. Extending the `changed_cols` field from a `BIGINT` to a `BYTEA` vector removes this cliff without breaking existing deployments.
| Item | Description | Effort | Ref |
|---|---|---|---|
| WB-1 | Extend the CDC trigger changed_cols column from BIGINT to BYTEA; update bitmask encoding/decoding in cdc.rs; add schema migration for existing change buffer tables (tables with <64 columns are unaffected at the data level). | 1–2 wk | REPORT_OVERALL_STATUS.md §R13 |
| WB-2 | E2E test: wide (>63 column) source table; verify only referenced columns trigger delta propagation; benchmark UPDATE selectivity before/after. | 2–4h | tests/e2e_cdc_tests.rs |
Wider bitmask subtotal: ~1–2 weeks + ~4h testing
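To make the WB-1 cliff concrete, here is a minimal sketch of the difference between a 63-bit BIGINT bitmask and a growable byte-vector (BYTEA-style) bitmask. The helper names are illustrative, not pg_trickle's actual `cdc.rs` API.

```rust
/// Set the changed-bit for `col_idx` in a growable byte-vector bitmask.
/// Unlike a single u64 (BIGINT), this has no 63-column ceiling.
fn set_changed_col(mask: &mut Vec<u8>, col_idx: usize) {
    let byte = col_idx / 8;
    if byte >= mask.len() {
        mask.resize(byte + 1, 0); // grow on demand for wide tables
    }
    mask[byte] |= 1 << (col_idx % 8);
}

/// Test whether `col_idx` is marked as changed.
fn col_changed(mask: &[u8], col_idx: usize) -> bool {
    let byte = col_idx / 8;
    byte < mask.len() && mask[byte] & (1 << (col_idx % 8)) != 0
}

fn main() {
    let mut mask: Vec<u8> = Vec::new();
    set_changed_col(&mut mask, 5);
    set_changed_col(&mut mask, 100); // beyond the BIGINT limit of 63
    assert!(col_changed(&mask, 100));
    assert!(!col_changed(&mask, 64));
}
```

Tables with fewer than 64 columns still fit in the first 8 bytes, which is why existing change buffers need no data-level migration.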
Fuse — Anomalous Change Detection
In plain terms: A circuit breaker that stops a stream table from processing an unexpectedly large batch of changes (runaway script, mass delete, data migration) without operator review. A blown fuse halts refresh and emits a pgtrickle_alert NOTIFY; reset_fuse() resumes with a chosen recovery action (apply, reinitialize, or skip_changes).
| Item | Description | Effort | Ref |
|---|---|---|---|
| FUSE-1 | Catalog: fuse state columns on pgt_stream_tables (fuse_mode, fuse_state, fuse_ceiling, fuse_sensitivity, blown_at, blow_reason) | 1–2h | PLAN_FUSE.md |
| FUSE-2 | alter_stream_table() new params: fuse, fuse_ceiling, fuse_sensitivity | 1h | PLAN_FUSE.md |
| FUSE-3 | reset_fuse(name, action => 'apply'|'reinitialize'|'skip_changes') SQL function | 1h | PLAN_FUSE.md |
| FUSE-4 | fuse_status() introspection function | 1h | PLAN_FUSE.md |
| FUSE-5 | Scheduler pre-check: count change buffer rows; evaluate threshold; blow fuse + NOTIFY if exceeded | 2–3h | PLAN_FUSE.md |
| FUSE-6 | E2E tests: normal baseline, spike → blow, reset (apply/reinitialize/skip_changes), diamond/DAG interaction | 4–6h | PLAN_FUSE.md |
Fuse subtotal: ~10–14 hours — ✅ Complete
External Correctness Gate (TS1 or TS2)
In plain terms: Run an independent public query corpus through pg_trickle's DIFFERENTIAL mode and assert the results match a vanilla PostgreSQL execution. This catches blind spots that the extension's own test suite cannot, and provides an objective correctness baseline before v1.0.
| Item | Description | Effort | Ref |
|---|---|---|---|
| TS1 | sqllogictest suite. Run the PostgreSQL sqllogictest suite through pg_trickle DIFFERENTIAL mode; gate CI on zero correctness mismatches. Preferred choice: broadest query coverage. | 2–3d | PLAN_TESTING_GAPS.md §J |
| TS2 | JOB (Join Order Benchmark). Correctness baseline and refresh latency profiling on realistic multi-join analytical queries. Alternative if sqllogictest setup is too costly. | 1–2d | PLAN_TESTING_GAPS.md §J |
Deliver one of TS1 or TS2; whichever is completed first meets the exit criterion.
External correctness gate subtotal: ~1–3 days
Differential ST-to-ST Refresh (✅ Done)
In plain terms: When stream table B's defining query reads from stream table A, pg_trickle currently forces a FULL refresh of B every time A updates — re-executing B's entire query even when only a handful of rows changed. This feature gives ST-to-ST dependencies the same CDC change buffer that base tables already have, so B refreshes differentially (applying only the delta). Crucially, even when A itself does a FULL refresh, a pre/post snapshot diff is captured so B still receives a small I/D delta rather than cascading FULL through the chain.
| Item | Description | Status | Ref |
|---|---|---|---|
| ST-ST-1 | Change buffer infrastructure. create_st_change_buffer_table() / drop_st_change_buffer_table() in cdc.rs; lifecycle hooks in api.rs; idempotent ensure_st_change_buffer() | ✅ Done | PLAN_ST_TO_ST.md §Phase 1 |
| ST-ST-2 | Delta capture — DIFFERENTIAL path. Force explicit DML when ST has downstream consumers; capture delta from __pgt_delta_{id} to changes_pgt_{id} | ✅ Done | PLAN_ST_TO_ST.md §Phase 2 |
| ST-ST-3 | Delta capture — FULL path. Pre/post snapshot diff writes I/D pairs to changes_pgt_{id}; eliminates cascading FULL | ✅ Done | PLAN_ST_TO_ST.md §7 |
| ST-ST-4 | DVM scan operator for ST sources. Read from changes_pgt_{id}; pgt_-prefixed LSN tokens; extended frontier and placeholder resolver | ✅ Done | PLAN_ST_TO_ST.md §Phase 3 |
| ST-ST-5 | Scheduler integration. Buffer-based change detection in has_stream_table_source_changes(); removed FULL override; frontier augmented with ST source positions | ✅ Done | PLAN_ST_TO_ST.md §Phase 4 |
| ST-ST-6 | Cleanup & lifecycle. cleanup_st_change_buffers_by_frontier() for ST buffers; removed prewarm skip for ST sources; ST buffer cleanup in both differential and full refresh paths | ✅ Done | PLAN_ST_TO_ST.md §Phase 5–6 |
ST-to-ST differential subtotal: ~4.5–6.5 weeks
Adaptive/Event-Driven Scheduler Wake (Must-Ship)
In plain terms: The scheduler currently wakes on a fixed 1-second timer even when nothing has changed. This adds event-driven wake: CDC triggers notify the scheduler immediately when changes arrive. Median end-to-end latency drops from ~515 ms to ~15 ms for low-volume workloads — a 34× improvement. This is a must-ship item because low latency is a primary project goal.
| Item | Description | Effort | Ref |
|---|---|---|---|
| WAKE-1 | pg_notify('pgtrickle_wake', '') after each change buffer INSERT; scheduler issues LISTEN pgtrickle_wake at startup; 10 ms debounce coalesces rapid notifications; poll fallback preserved. New GUCs: event_driven_wake (default true), wake_debounce_ms (default 10). E2E tests in tests/e2e_wake_tests.rs. | — | REPORT_OVERALL_STATUS.md §R16 |
Event-driven wake subtotal: ✅ Complete
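The debounce described above can be sketched as a small pure-logic helper: a notification wakes the scheduler only if the previous wake is older than the debounce window. `Debouncer` and its API are illustrative names for this sketch, not pg_trickle's actual types.

```rust
use std::time::{Duration, Instant};

/// Coalesce rapid wake notifications: at most one wake per debounce window.
struct Debouncer {
    window: Duration,
    last_wake: Option<Instant>,
}

impl Debouncer {
    fn new(window_ms: u64) -> Self {
        Self { window: Duration::from_millis(window_ms), last_wake: None }
    }

    /// Returns true if this notification should wake the scheduler now;
    /// false if it is coalesced into an already-pending wake.
    fn should_wake(&mut self, now: Instant) -> bool {
        match self.last_wake {
            Some(prev) if now.duration_since(prev) < self.window => false,
            _ => {
                self.last_wake = Some(now);
                true
            }
        }
    }
}

fn main() {
    let mut d = Debouncer::new(10); // 10 ms window, the documented default
    let t0 = Instant::now();
    assert!(d.should_wake(t0));                               // first NOTIFY wakes
    assert!(!d.should_wake(t0 + Duration::from_millis(5)));   // coalesced
    assert!(d.should_wake(t0 + Duration::from_millis(15)));   // window elapsed
}
```

The poll fallback stays in place so a lost notification degrades to the old fixed-interval behavior rather than a stall.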
Stretch Goals (if capacity allows after Must-Ship)
| Item | Description | Effort | Ref |
|---|---|---|---|
| STRETCH-1 | — | 2–4d | PLAN_PARTITIONING_SPIKE.md |
| A1-1 | CREATE STREAM TABLE … PARTITION BY; st_partition_key catalog column. partition_by parameter added to all three create_stream_table* functions; st_partition_key TEXT column in catalog; validate_partition_key() validates column exists in output; build_create_table_sql emits PARTITION BY RANGE (key); setup_storage_table creates default catch-all partition and non-unique __pgt_row_id index. | 1–2 wk | PLAN_PARTITIONING_SPIKE.md |
| A1-2 | extract_partition_range() in refresh.rs runs SELECT MIN/MAX(key)::text on the resolved delta SQL; returns None on empty delta (MERGE skipped). | 1 wk | PLAN_PARTITIONING_SPIKE.md §8 |
| A1-3 | inject_partition_predicate() replaces __PGT_PART_PRED__ placeholder in MERGE ON clause with AND st."key" BETWEEN 'min' AND 'max'; CachedMergeTemplate stores delta_sql_template; D-2 prepared statements disabled for partitioned STs. | 2–3 wk | PLAN_PARTITIONING_SPIKE.md §8 |
| A1-4 | EXPLAIN (ANALYZE, BUFFERS) partition-scan verification. tests/e2e_partition_tests.rs covering: initial populate, differential inserts, updates/deletes, empty-delta fast path, EXPLAIN plan verification, invalid partition key rejection; added to light-E2E allowlist. | 1 wk | PLAN_PARTITIONING_SPIKE.md §9 |
Stretch subtotal: STRETCH-1 + A1-1 + A1-2 + A1-3 + A1-4 ✅ All complete
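The A1-2/A1-3 interplay can be sketched as a single placeholder substitution: when the delta is non-empty, the MIN/MAX range replaces `__PGT_PART_PRED__` in the cached MERGE template; when it is empty, no SQL is produced and the MERGE is skipped. This is a simplified sketch under assumed signatures, not the real `refresh.rs` implementation (which, among other things, handles quoting and type casts).

```rust
/// Inject a partition-key range predicate into a MERGE template.
/// `range` is the (min, max) text bounds extracted from the delta;
/// `None` models an empty delta, for which the caller skips the MERGE.
fn inject_partition_predicate(
    template: &str,
    key: &str,
    range: Option<(&str, &str)>,
) -> Option<String> {
    const PLACEHOLDER: &str = "__PGT_PART_PRED__";
    let (min, max) = range?; // empty delta → no MERGE at all
    Some(template.replace(
        PLACEHOLDER,
        &format!("AND st.\"{key}\" BETWEEN '{min}' AND '{max}'"),
    ))
}

fn main() {
    let t = "MERGE INTO st USING d ON st.id = d.id __PGT_PART_PRED__";
    let sql = inject_partition_predicate(t, "created_at",
        Some(("2026-01-01", "2026-01-31"))).unwrap();
    assert!(sql.contains("BETWEEN '2026-01-01' AND '2026-01-31'"));
    assert!(inject_partition_predicate(t, "created_at", None).is_none());
}
```

Because the predicate is constant-foldable text, the planner can prune partitions outside the delta's key range, which is where the ~100× I/O reduction in the A1-4 benchmark comes from.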
DAG Refresh Performance Improvements (from PLAN_DAG_PERFORMANCE.md §8)
In plain terms: Now that ST-to-ST differential refresh eliminates the "every hop is FULL" bottleneck, the next performance frontier is reducing per-hop overhead and exploiting DAG structure more aggressively. These items target the scheduling and dispatch layer — not the DVM engine — and collectively can reduce end-to-end propagation latency by 30–50% for heterogeneous DAGs.
| Item | Description | Effort | Ref |
|---|---|---|---|
| DAG-1 | Intra-tick pipelining: remaining_upstreams tracking with immediate downstream readiness propagation. No level barrier exists. 3 validation tests. | 2–3 wk | PLAN_DAG_PERFORMANCE.md §8.1 |
| DAG-2 | Adaptive poll interval; WaitLatch with shared-memory completion flags. compute_adaptive_poll_ms() pure-logic helper with exponential backoff (20ms → 200ms); ParallelDispatchState tracks adaptive_poll_ms + completions_this_tick; resets to 20ms on worker completion; 8 unit tests. | 1–2 wk | PLAN_DAG_PERFORMANCE.md §8.2 |
| DAG-3 | Delta amplification detection via pgt_refresh_history. When a join ST amplifies delta beyond a configurable threshold (e.g., output > 100× input), emit a performance WARNING and optionally fall back to FULL for that hop. Expose amplification metrics in explain_st(). pg_trickle.delta_amplification_threshold GUC (default 100×); compute_amplification_ratio + should_warn_amplification pure-logic helpers; WARNING emitted after MERGE with ratio, counts, and tuning hint; explain_st() exposes amplification_stats JSON from last 20 DIFFERENTIAL refreshes; 15 unit tests. | 3–5d | PLAN_DAG_PERFORMANCE.md §8.4 |
| DAG-4 | ST buffer bypass for single-consumer CALCULATED chains, skipping the changes_pgt_ buffer table. Eliminates 2× SPI DML per hop (~20 ms savings per hop for 10K-row deltas). FusedChain execution unit kind; find_fusable_chains() pure-logic detection; capture_delta_to_bypass_table() writes to temp table; DiffContext.st_bypass_tables threads bypass through DVM scan; delta SQL cache bypassed when active; 11+4 unit tests. | 3–4 wk | PLAN_DAG_PERFORMANCE.md §8.3 |
| DAG-5 | ST buffer batch coalescing: cancel redundant I/D pairs on the same __pgt_row_id that accumulate between reads during rapid-fire upstream refreshes. Adapts existing compute_net_effect() logic to the ST buffer schema. compact_st_change_buffer() with build_st_compact_sql() pure-logic helper; advisory lock namespace 0x5047_5500; integrated in execute_differential_refresh() after C-4 base-table compaction; 9 unit tests. | 1–2 wk | PLAN_DAG_PERFORMANCE.md §8.5 |
DAG refresh performance subtotal: ~8–12 weeks
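The DAG-2 backoff logic can be sketched as a pure function of the current interval and whether any worker completed this tick. The function name is taken from the table above; its exact signature here is an assumption for illustration.

```rust
/// Adaptive poll interval: exponential backoff from 20 ms to a 200 ms cap
/// while idle; snap back to 20 ms as soon as a worker completes.
fn compute_adaptive_poll_ms(current_ms: u64, completions_this_tick: u32) -> u64 {
    const MIN_MS: u64 = 20;
    const MAX_MS: u64 = 200;
    if completions_this_tick > 0 {
        MIN_MS // work is flowing: poll eagerly again
    } else {
        (current_ms * 2).min(MAX_MS) // idle: back off exponentially
    }
}

fn main() {
    assert_eq!(compute_adaptive_poll_ms(20, 0), 40);   // idle tick doubles
    assert_eq!(compute_adaptive_poll_ms(160, 0), 200); // capped at 200 ms
    assert_eq!(compute_adaptive_poll_ms(200, 0), 200); // stays at cap
    assert_eq!(compute_adaptive_poll_ms(200, 3), 20);  // completion resets
}
```

Keeping this as a pure helper is what makes the "8 unit tests" in the table cheap: no latch or shared memory is needed to verify the backoff curve.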
v0.11.0 total: ~7–10 weeks (partitioning + isolation) + ~12h observability + ~14–21h default tuning + ~7–12h safety hardening + ~2–4 weeks should-ship (bitmask + fuse + external corpus) + ~4.5–6.5 weeks ST-to-ST differential + ~2–3 weeks event-driven wake + ~1–2 days correctness quick-wins + ~2–3 days documentation + ~8–12 weeks DAG performance
Exit criteria: ✅ All met. Released 2026-03-26.
- Declaratively partitioned stream tables accepted; partition key tracked in catalog — ✅ Done in v0.11.0 Partitioning Spike (STRETCH-1 RFC + A1-1)
- Partitioned storage table created with PARTITION BY RANGE + default catch-all partition — ✅ Done (A1-1 physical DDL)
- Partition-key range predicate injected into MERGE ON clause; empty-delta fast-path skips MERGE — ✅ Done (A1-2 + A1-3)
- Partition-scoped MERGE benchmark: 10M-row ST, 0.1% change rate (expect ~100× I/O reduction) — ✅ Done (A1-4 E2E tests)
- Per-database worker quotas enforced; burst reclaimed within 1 scheduler cycle — ✅ Done in v0.11.0 Phase 11 (pg_trickle.per_database_worker_quota GUC; burst to 150% at < 80% cluster load)
- Prometheus queries + alerting rules + Grafana dashboard shipped — ✅ Done in v0.11.0 Phase 3 (monitoring/ directory)
- DEF-1: parallel_refresh_mode default is 'on'; unit test updated — ✅ Done in v0.11.0 Phase 1
- DEF-2: auto_backoff default is true; CONFIGURATION.md updated — ✅ Done in v0.10.0
- DEF-3: SemiJoin delta-key pre-filter verified already implemented — ✅ Done in v0.11.0 Phase 2 (pre-existing in semi_join.rs)
- DEF-4: Invalidation ring capacity is 128 slots — ✅ Done in v0.11.0 Phase 1
- DEF-5: block_source_ddl default is true; error message includes escape-hatch instructions — ✅ Done in v0.11.0 Phase 1
- SAF-1: No panic!/unwrap() in background worker hot paths; check_skip_needed logs SPI errors — ✅ Done in v0.11.0 Phase 1
- SAF-2: Failure-injection E2E tests in tests/e2e_safety_tests.rs — ✅ Done in v0.11.0 Phase 2
- WB-1+2: Changed-column bitmask supports >63 columns (VARBIT); wide-table CDC selectivity E2E passes; schema migration tested — ✅ Done in v0.11.0 Phase 5
- FUSE-1–6: Fuse blows on configurable change-count threshold; reset_fuse() recovers in all three action modes; diamond/DAG interaction tested — ✅ Done in v0.11.0 Phase 6
- TS2: TPC-H-derived 5-query DIFFERENTIAL correctness gate passes with zero mismatches; gated in CI — ✅ Done in v0.11.0 Phase 9
- QF-1–4: println! replaced with guarded pgrx::log!(); AUTO downgrades emit WARNING; append_only reversion verified already warns; parser invariant sites annotated — ✅ Done in v0.11.0 Phase 1
- G12-ERM: effective_refresh_mode column present in pgt_stream_tables; explain_refresh_mode() returns configured mode, effective mode, downgrade reason — ✅ Done in v0.11.0 Phase 2
- G12-2: TopK path validates assumptions at refresh time; triggers FULL fallback with WARNING on violation — ✅ Done in v0.11.0 Phase 4
- G12-AGG: Group-rescan aggregate warning fires at create_stream_table for DIFFERENTIAL mode; strategy visible in explain_st() — ✅ Done in v0.11.0 Phase 4
- G15-PV: Incompatible cdc_mode/refresh_mode and diamond_schedule_policy combinations rejected at creation time with structured HINT — ✅ Done in v0.11.0 Phase 2
- G13-EH: UnsupportedOperator, CycleDetected, UpstreamSchemaChanged, QueryParseError include DETAIL and HINT fields — ✅ Done in v0.11.0 Phase 2
- G17-EC01B-NEG: Negative regression test documents ≥3-scan fall-back behavior; linked to v0.12.0 EC01B fix — ✅ Done in v0.11.0 Phase 4
- G16-GS/SM/MQR/GUC: GETTING_STARTED restructured (5 chapters + Hello World + Advanced Topics); DVM_OPERATORS support matrix; monitoring quick reference; CONFIGURATION.md GUC matrix — ✅ Done in v0.11.0 Phase 11
- ST-ST-1–6: All ST-to-ST dependencies refresh differentially when upstream has a change buffer; FULL refreshes on upstream produce pre/post I/D diff; no cascading FULL — ✅ Done in v0.11.0 Phase 8
- WAKE-1: Event-driven scheduler wake; median latency ~15 ms (34× improvement); 10 ms debounce; poll fallback — ✅ Done in v0.11.0 Phase 7
- DAG-1: Intra-tick pipelining confirmed in Phase 4 architecture — ✅ Done
- DAG-2: Adaptive poll interval (20 ms → 200 ms exponential backoff) — ✅ Done in v0.11.0 Phase 10
- DAG-3: Delta amplification detection with pg_trickle.delta_amplification_threshold GUC — ✅ Done in v0.11.0 Phase 10
- DAG-4: ST buffer bypass (FusedChain) for single-consumer CALCULATED chains — ✅ Done in v0.11.0 Phase 10
- DAG-5: ST buffer batch coalescing cancels redundant I/D pairs — ✅ Done in v0.11.0 Phase 10
- Extension upgrade path tested (0.10.0 → 0.11.0) — ✅ upgrade SQL in sql/pg_trickle--0.10.0--0.11.0.sql
v0.12.0 — Correctness, Reliability & Developer Tooling
Goal: Close the last known wrong-answer bugs in the incremental query engine, add SQL-callable diagnostic functions for observability, harden the scheduler against edge cases uncovered with deeper topologies, and back the whole release with thousands of automatically generated property and fuzz tests.
Phases 5–8 from the original v0.12.0 scope (Scalability Foundations, Partitioning Enhancements, MERGE Profiling, and dbt Macro Updates) have been moved to v0.13.0 to keep this release tightly focused on correctness and reliability. See §v0.13.0 for those items.
Status: Released (2026-03-28).
Completed items (click to expand)
Anomalous Change Detection (Fuse)
In plain terms: Imagine a source table suddenly receives a million-row batch delete — a bug, runaway script, or intentional purge. Without a fuse, pg_trickle would try to process all of it and potentially overload the database. This adds a circuit breaker: you set a ceiling (e.g. "never process more than 50,000 changes at once"), and if that limit is hit the stream table pauses and sends a notification. You investigate, fix the root cause, then resume with reset_fuse() and choose how to recover (apply the changes, reinitialize from scratch, or skip them entirely).
Per-stream-table fuse that blows when the change buffer row count exceeds a
configurable fixed ceiling or an adaptive μ+kσ threshold derived from
pgt_refresh_history. A blown fuse halts refresh and emits a
pgtrickle_alert NOTIFY; reset_fuse() resumes with a chosen recovery
action.
| Item | Description | Effort | Ref |
|---|---|---|---|
| FUSE-1 | Catalog: fuse state columns on pgt_stream_tables (fuse_mode, fuse_state, fuse_ceiling, fuse_sensitivity, blown_at, blow_reason) | 1–2h | PLAN_FUSE.md |
| FUSE-2 | alter_stream_table() new params: fuse, fuse_ceiling, fuse_sensitivity | 1h | PLAN_FUSE.md |
| FUSE-3 | reset_fuse(name, action => 'apply'|'reinitialize'|'skip_changes') SQL function | 1h | PLAN_FUSE.md |
| FUSE-4 | fuse_status() introspection function | 1h | PLAN_FUSE.md |
| FUSE-5 | Scheduler pre-check: count change buffer rows; evaluate threshold; blow fuse + NOTIFY if exceeded | 2–3h | PLAN_FUSE.md |
| FUSE-6 | E2E tests: normal baseline, spike → blow, reset, diamond/DAG interaction | 4–6h | PLAN_FUSE.md |
Anomalous change detection subtotal: ~10–14 hours
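The FUSE-5 pre-check combines the two thresholds described above: a fixed ceiling, and the adaptive μ + kσ bound over recent refresh history. The sketch below is a pure-logic illustration with an assumed function name, not the scheduler's actual code.

```rust
/// Decide whether the fuse should blow for a pending batch of changes.
/// `history` holds recent per-refresh change counts (from pgt_refresh_history);
/// `sensitivity` is k in the adaptive μ + kσ threshold.
fn fuse_should_blow(
    change_count: u64,
    history: &[u64],
    sensitivity: f64,
    ceiling: Option<u64>,
) -> bool {
    // Fixed ceiling: an absolute, operator-configured limit.
    if let Some(c) = ceiling {
        if change_count > c {
            return true;
        }
    }
    // Adaptive threshold needs some history to estimate mean and stddev.
    if history.len() < 2 {
        return false;
    }
    let n = history.len() as f64;
    let mean = history.iter().sum::<u64>() as f64 / n;
    let var = history
        .iter()
        .map(|&x| {
            let d = x as f64 - mean;
            d * d
        })
        .sum::<f64>()
        / n;
    change_count as f64 > mean + sensitivity * var.sqrt()
}

fn main() {
    let history = [80, 100, 120, 100]; // μ = 100, σ ≈ 14.1
    assert!(fuse_should_blow(500, &history, 3.0, None)); // spike blows fuse
    assert!(!fuse_should_blow(110, &history, 3.0, None)); // within μ + 3σ
    assert!(fuse_should_blow(60, &[], 3.0, Some(50))); // fixed ceiling hit
}
```

On a blow, the scheduler would then mark the fuse state in the catalog and emit the pgtrickle_alert NOTIFY, leaving recovery to reset_fuse().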
Correctness — EC-01 Deep Fix (≥3-Scan Join Right Subtrees)
In plain terms: The phantom-row-after-DELETE bug (EC-01) was fixed for join children with ≤2 scan nodes on the right side. Wider join chains — TPC-H Q7, Q8, Q9 all qualify — are still silently affected: when both sides of a join are deleted in the same batch, the DELETE can be silently dropped. The existing EXCEPT ALL snapshot strategy causes PostgreSQL to spill multi-GB temp files for deep join trees, which is why the threshold exists. This work designs a fundamentally different per-subtree snapshot strategy that removes the cap.
| Item | Description | Effort | Ref |
|---|---|---|---|
| EC01B-1 | Remove the join_scan_count(child) <= 2 threshold in use_pre_change_snapshot | — | src/dvm/operators/join_common.rs · plans/PLAN_EDGE_CASES.md §EC-01 |
| EC01B-2 | TPC-H Q7/Q8/Q9 DELETE regression tests | — | tests/e2e_tpch_tests.rs |
EC-01 deep fix subtotal: ~3–4 weeks — ✅ Complete
CDC Write-Side Overhead Benchmark
In plain terms: Every INSERT/UPDATE/DELETE on a source table fires a PL/pgSQL trigger that writes to the change buffer. We have never measured how much write throughput this costs. These benchmarks quantify it across five scenarios (single-row, bulk INSERT, bulk UPDATE, bulk DELETE, concurrent writers) and gate the decision on whether to implement a change_buffer_unlogged GUC that could reduce WAL overhead by ~20–30%.
| Item | Description | Effort | Ref |
|---|---|---|---|
| — | tests/e2e_cdc_write_overhead_tests.rs: compare source-only vs. source + stream table DML throughput across five scenarios; report write amplification factor | — | tests/e2e_cdc_write_overhead_tests.rs |
| — | Write-side overhead results published in docs/BENCHMARK.md | — | docs/BENCHMARK.md |
CDC write-side benchmark subtotal: ~3–5 days — ✅ Complete
DAG Topology Benchmark Suite (from PLAN_DAG_BENCHMARK.md)
In plain terms: Production deployments form DAGs with 10–500+ stream tables arranged in chains, fan-outs, diamonds, and mixed topologies. This benchmark suite measures end-to-end propagation latency and throughput through these DAG shapes, validates the theoretical latency formulas from PLAN_DAG_PERFORMANCE.md, and provides regression detection for DAG propagation overhead.
| Item | Description | Effort | Ref |
|---|---|---|---|
| DAG-B1 | — | — | PLAN_DAG_BENCHMARK.md §11.1 |
| DAG-B2 | — | — | PLAN_DAG_BENCHMARK.md §11.2 |
| DAG-B3 | — | — | PLAN_DAG_BENCHMARK.md §11.3 |
| DAG-B4 | Results published in docs/BENCHMARK.md; full suite validation run | — | PLAN_DAG_BENCHMARK.md §11.4 |
DAG topology benchmark subtotal: ~3–5 days — ✅ Complete
Developer Tooling & Observability Functions (from REPORT_OVERALL_STATUS.md §15) ✅ Complete
In plain terms: pg_trickle's diagnostic toolbox today is limited to explain_st() and refresh_history(). Operators debugging unexpected mode changes, query rewrites, or error patterns must read source code or server logs. This section adds four SQL-callable diagnostic functions that surface internal state in a structured, queryable form.
| Item | Description | Effort | Status |
|---|---|---|---|
| DT-1 | explain_query_rewrite(query TEXT) — parse a query through the DVM pipeline and return the rewritten SQL plus a list of passes applied (operator rewrites, delta-key injections, TopK detection, group-rescan classification). Useful for debugging unexpected refresh behavior without creating a stream table. | ~1–2d | ✅ Done in v0.12.0 Phase 2 |
| DT-2 | diagnose_errors(name TEXT) — return the last 5 error events for a stream table, classified by type (correctness, performance, config, infrastructure), with a suggested remediation for each class. | ~2–3d | ✅ Done in v0.12.0 Phase 2 |
| DT-3 | list_auxiliary_columns(name TEXT) — list all __pgt_* internal columns injected into the stream table's query plan with their purpose (delta tracking, row identity, compaction key). Helps users understand unexpected columns in SELECT * output. | ~1d | ✅ Done in v0.12.0 Phase 2 |
| DT-4 | validate_query(query TEXT) — parse and run DVM validation on a query without creating a stream table; return the resolved refresh mode, detected SQL constructs (group-rescan aggregates, non-equijoins, multi-scan subtrees), and any warnings. | ~1–2d | ✅ Done in v0.12.0 Phase 2 |
Developer tooling subtotal: ~5–8 days
Parser Safety, Concurrency & Query Coverage (from REPORT_OVERALL_STATUS.md §13/§12/§17)
Additional correctness and robustness items from the deep gap analysis: a stack-overflow prevention guard for pathological queries, a concurrency stress test for IMMEDIATE mode, and two investigations into known under-documented query constructs.
| Item | Description | Effort | Ref |
|---|---|---|---|
| G13-SD | Recursion depth guard in dvm/parser.rs. Return PgTrickleError::QueryTooComplex if depth exceeds pg_trickle.max_parse_depth (GUC, default 64). Prevents stack-overflow crashes on pathological queries. | — | src/dvm/parser.rs · src/config.rs · src/error.rs |
| G17-IMS | Concurrency stress test for IMMEDIATE refresh mode; assert zero lost updates, zero phantom rows, and no deadlocks. | — | tests/e2e_immediate_concurrency_tests.rs |
| G12-SQL-IN | Multi-column IN (subquery) correctness investigation. Determine behavior when DVM encounters EXPR IN (subquery returning multiple columns). Add a correctness test; if the construct is broken, fix it or document as unsupported with a structured error. | — | tests/e2e_multi_column_in_tests.rs · src/dvm/parser.rs |
| G14-MDED | MERGE deduplication profiling. Profile how often concurrent-write scenarios produce duplicate key entries requiring pre-MERGE compaction. If ≥10% of refresh cycles need dedup, write an RFC for a two-pass MERGE strategy. | ~3–5d | plans/performance/REPORT_OVERALL_STATUS.md §14 |
| G17-MERGEEX | EXPLAIN (COSTS OFF) dry-run checks for generated MERGE SQL templates at E2E test startup. Catches malformed templates before any data is processed. | — | tests/e2e_merge_template_tests.rs |
Parser safety & coverage subtotal: ~9–15 days
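The G13-SD guard is conceptually simple: thread a depth counter through the recursive descent and bail out with a structured error once it exceeds the limit. The sketch below uses a stand-in AST type and error, not pg_trickle's real parse tree or PgTrickleError.

```rust
/// Stand-in for a deeply nestable parse-tree node.
#[derive(Debug)]
enum Expr {
    Leaf,
    Nested(Box<Expr>),
}

/// Stand-in for PgTrickleError::QueryTooComplex.
#[derive(Debug, PartialEq)]
struct QueryTooComplex;

/// Recursive walk that errors out once nesting exceeds `max_depth`
/// (the role pg_trickle.max_parse_depth plays, default 64).
fn check_depth(e: &Expr, depth: u32, max_depth: u32) -> Result<(), QueryTooComplex> {
    if depth > max_depth {
        return Err(QueryTooComplex);
    }
    match e {
        Expr::Leaf => Ok(()),
        Expr::Nested(inner) => check_depth(inner, depth + 1, max_depth),
    }
}

fn main() {
    // Build a pathologically nested expression 70 levels deep.
    let mut e = Expr::Leaf;
    for _ in 0..70 {
        e = Expr::Nested(Box::new(e));
    }
    assert_eq!(check_depth(&e, 0, 64), Err(QueryTooComplex));
    assert!(check_depth(&e, 0, 128).is_ok());
}
```

Failing early with a catchable error is the point: the alternative is blowing the backend's stack, which takes the whole session down.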
Differential Fuzzing (SQLancer)
In plain terms: SQLancer is a SQL fuzzer that generates thousands of syntactically valid but structurally unusual queries and uses mathematical oracles (NoREC, TLP) to prove our DVM engine produces exactly the same results as PostgreSQL's native executor. Unlike hand-written tests, it explores the long tail of NULL semantics, nested aggregations, and edge cases no human would write. Any backend crash or result mismatch becomes a permanent regression test seed.
| Item | Description | Effort | Ref |
|---|---|---|---|
| SQLANCER-1 | Docker-based harness: just sqlancer spins up E2E container; crash-test oracle verifies that no SQLancer-generated create_stream_table call crashes the backend | 3–4d | PLAN_SQLANCER.md §Steps 1–2 |
| SQLANCER-2 | Equivalence oracle: for each generated query Q, assert create_stream_table + refresh output equals native SELECT (multiset comparison); failures auto-committed as proptest regression seeds | 3–4d | PLAN_SQLANCER.md §Step 3 |
| SQLANCER-3 | CI weekly-sqlancer job (daily schedule + manual dispatch); new proptest seed files committed on any detected correctness failure | 1–2d | PLAN_SQLANCER.md |
SQLancer fuzzing subtotal: ~1–2 weeks
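The SQLANCER-2 oracle's "multiset comparison" means duplicates matter but row order does not. A minimal sketch (rows simplified to strings; the real harness compares full result rows):

```rust
use std::collections::HashMap;

/// Multiset equality: same rows with the same multiplicities,
/// ignoring order. This is the comparison the equivalence oracle
/// needs, since SQL results are bags, not sets or sequences.
fn multiset_eq(a: &[&str], b: &[&str]) -> bool {
    fn counts<'r>(rows: &[&'r str]) -> HashMap<&'r str, usize> {
        let mut m = HashMap::new();
        for r in rows {
            *m.entry(*r).or_insert(0) += 1;
        }
        m
    }
    counts(a) == counts(b)
}

fn main() {
    assert!(multiset_eq(&["a", "b", "a"], &["b", "a", "a"])); // order-insensitive
    assert!(!multiset_eq(&["a"], &["a", "a"]));               // multiplicity matters
}
```

Any mismatch between the stream table's contents and the native SELECT under this comparison is a correctness bug, and per SQLANCER-2 the failing query is committed as a proptest regression seed.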
Property-Based Invariant Tests (Items 5 & 6)
In plain terms: Items 1–4 of the property test plan are done. These two remaining items add topology/scheduler stress tests (random DAG shapes with multi-source branch interactions) and pure Rust unit-level properties (ordering monotonicity, SCC bookkeeping correctness). Both slot into the existing proptest harness and provide coverage that example-based tests cannot exhaustively explore.
| Item | Description | Effort | Ref |
|---|---|---|---|
| PROP-5 | Topology / scheduler stress: randomized DAG topologies with multi-source branch interactions; assert no incorrect refresh ordering or spurious suspension | 4–6d | PLAN_TEST_PROPERTY_BASED_INVARIANTS.md §Item 5 |
| PROP-6 | Pure Rust DAG / scheduler helper properties: ordering invariants, monotonic metadata helpers, SCC bookkeeping edge-cases | 2–4d | PLAN_TEST_PROPERTY_BASED_INVARIANTS.md §Item 6 |
Property testing subtotal: ~6–10 days
Async CDC — Research Spike (D-2)
In plain terms: A custom PostgreSQL logical decoding plugin could write changes directly to change buffers without the polling round-trip, cutting CDC latency by ~10× and WAL decoding CPU by 50–80%. This milestone scopes a research spike only — not a full implementation — to validate the key technical constraints.
| Item | Description | Effort | Ref |
|---|---|---|---|
| D2-R | Research spike: prototype in-memory row buffering inside pg_trickle_decoder; validate SPI flush in commit callback; document memory-safety constraints and feasibility; produce a written RFC before any full implementation is started | 2–3 wk | PLAN_NEW_STUFF.md §D-2 |
⚠️ SPI writes inside logical decoding change callbacks are not supported. All row buffering must occur in-memory within the plugin's memory context; flush only in the commit callback. In-memory buffers must handle arbitrarily large transactions. See PLAN_NEW_STUFF.md §D-2 risk analysis before writing any C code.
Retraction candidate (D-2): Even as a research spike, this item introduces C-level complexity (custom output plugin memory management, commit-callback SPI failure handling, arbitrarily large transaction buffering) that substantially exceeds the stated 2–3 week estimate once the architectural constraints are respected. The risk rating is Very High and the SPI-in-change-callback infeasibility makes the originally proposed design non-functional. Recommend moving D-2 to a post-1.0 research backlog entirely; do not include it in a numbered milestone until a separate feasibility study (outside the release cycle) produces a concrete RFC.
D-2 research spike subtotal: ~2–3 weeks
Scalability Foundations (pulled forward from v0.13.0)
In plain terms: These items directly serve the project's primary goal of world-class performance and scalability. Columnar change tracking eliminates wasted delta processing for wide tables, and shared change buffers reduce I/O multiplication in deployments with many stream tables reading from the same source.
| Item | Description | Effort | Ref |
|---|---|---|---|
| A-2 | Columnar Change Tracking. Per-column bitmask in CDC triggers; skip rows where no referenced column changed; lightweight UPDATE-only path when only projected columns changed; 50–90% delta-volume reduction for wide-table UPDATE workloads. | 3–4 wk | PLAN_NEW_STUFF.md §A-2 |
| D-4 | Shared Change Buffers. Single buffer per source shared across all dependent STs; multi-frontier cleanup coordination; static-superset column mode for initial implementation. | 3–4 wk | PLAN_NEW_STUFF.md §D-4 |
Scalability foundations subtotal: ~6–8 weeks
Partitioning Enhancements (A1 follow-ons from v0.11.0 spike)
In plain terms: The v0.11.0 spike delivered RANGE partitioning end-to-end. These follow-on items extend coverage to the use cases deliberately deferred from A1: multi-column keys, retrofitting existing stream tables, LIST-based partitions, HASH partitions (which need a different strategy than predicate injection), and operational quality-of-life improvements.
| Item | Description | Effort | Ref |
|---|---|---|---|
| — | Multi-column partition keys in partition_by; PARTITION BY RANGE (col_a, col_b); multi-column MIN/MAX extraction; ROW() comparison predicates for partition pruning. parse_partition_key_columns(), composite extract_partition_range(), ROW comparison in inject_partition_predicate(); 5 unit tests + 3 E2E tests | — | src/api.rs, src/refresh.rs |
| — | alter_stream_table(partition_by => …) support. Add/change/remove partition key on existing stream tables; alter_stream_table_partition_key() handles DROP + recreate + full refresh; update_partition_key() in catalog; SQL migration adds parameter; also fixed alter_stream_table_query to preserve partition key. | — | src/api.rs, src/catalog.rs |
| — | LIST partitioning: partition_by => 'LIST:col' creates PARTITION BY LIST storage; PartitionMethod enum dispatches LIST vs RANGE; extract_partition_bounds() uses SELECT DISTINCT for LIST; inject_partition_predicate() emits IN (…) predicate; single-column-only validation. | — | src/api.rs, src/refresh.rs |
| — | HASH partitioning: partition_by => 'HASH:col[:N]' creates PARTITION BY HASH storage with N auto-created child partitions; execute_hash_partitioned_merge() materializes delta → discovers children via pg_inherits → per-child MERGE filtered through satisfies_hash_partition(); build_hash_child_merge() rewrites MERGE targeting ONLY child_partition. | — | src/api.rs, src/refresh.rs |
| — | Default-partition growth warning: warn_default_partition_growth() emits pgrx::warning!() after FULL and DIFFERENTIAL refresh when the default partition has rows; includes example DDL. | — | src/refresh.rs |
Auto-partition creation (TimescaleDB-style automatic chunk management) remains a post-1.0 item as stated in PLAN_PARTITIONING_SPIKE.md §10.
Partitioning enhancements subtotal: ~5–8 weeks
Performance Defaults (from REPORT_OVERALL_STATUS.md)
Targeted improvements identified in the overall status report. None require large design changes; all build on existing infrastructure.
| Item | Description | Effort | Ref |
|---|---|---|---|
| — | Auto-promote buffer_partitioning for high-throughput sources. should_promote_inner() throughput-based heuristic; convert_buffer_to_partitioned() runtime migration; auto-promote hook in execute_differential_refresh(); docs/CONFIGURATION.md updated; 10 unit tests + 3 E2E tests | — | REPORT_OVERALL_STATUS.md §R7 |
| PERF-3 | tiered_scheduling default to true. The feature is implemented and tested since v0.10.0. | — | src/config.rs · docs/CONFIGURATION.md |
| — | — | — | REPORT_OVERALL_STATUS.md §R3/R16 |
| — | block_source_ddl default to true. | — | REPORT_OVERALL_STATUS.md §R12 |
| — | — | — | REPORT_OVERALL_STATUS.md §R13 |
Performance defaults subtotal: ~1–3 weeks
DAG Refresh Performance Improvements (from PLAN_DAG_PERFORMANCE.md §8)
➡️ Moved to v0.11.0 — these items build directly on the ST-to-ST differential infrastructure shipped in v0.11.0 Phase 8 and are most impactful while that work is fresh.
v0.12.0 total: ~18–27 weeks + ~6–8 weeks scalability + ~5–8 weeks partitioning enhancements + ~1–3 weeks defaults + ~3–5 weeks developer tooling & observability
Priority tiers: P0 = Phases 1–3 (must ship); P1 = Phases 4 + 7 (target); P2 = Phases 5, 6, 8 (can defer to v0.13.0 as a unit — never partially ship Phase 5/6).
dbt Macro Updates (Phase 8)
Priority P2 — Expose the v0.11.0 SQL API additions (partition_by, fuse, fuse_ceiling, fuse_sensitivity) in the dbt materialization macros so dbt users can configure them via config(...). No catalog changes; pure Jinja/SQL. Can defer to v0.13.0 as a unit.
| Item | Description | Effort |
|---|---|---|
| DBT-1 | partition_by config option wired through stream_table.sql, create_stream_table.sql, and alter_stream_table.sql | ~1d |
| DBT-2 | fuse, fuse_ceiling, fuse_sensitivity config options wired through the materialization and alter macro with change-detection logic | ~1–2d |
| DBT-3 | dbt docs update: README and SQL_REFERENCE.md dbt section | ~0.5d |
dbt macro updates subtotal: ~2–3.5 days
Exit criteria — all met (v0.12.0 Released 2026-03-28):
- EC01B-1/2: No phantom-row drop for ≥3-scan right-subtree joins; TPC-H Q7/Q8/Q9 DELETE regression tests pass ✅
- BENCH-W: Write-side overhead benchmarks published in docs/BENCHMARK.md ✅
- DAG-B1–B4: DAG topology benchmark suite complete ✅
- SQLANCER-1/2/3: Crash-test + equivalence oracles in weekly CI job; zero mismatches ✅
- PROP-5+6: Topology stress and DAG/scheduler helper property tests pass ✅
- DT-1–4: explain_query_rewrite(), diagnose_errors(), list_auxiliary_columns(), validate_query() callable from SQL ✅
- G13-SD: max_parse_depth guard active; pathological query returns QueryTooComplex ✅
- G17-IMS: IMMEDIATE mode concurrency stress test (5 scenarios × 100+ concurrent DML) passes ✅
- G12-SQL-IN: Multi-column IN subquery documented as unsupported with structured error + EXISTS hint ✅
- G17-MERGEEX: MERGE template EXPLAIN validation at E2E test startup ✅
- PERF-3: tiered_scheduling default is true; CONFIGURATION.md updated ✅
- ST-ST-9: Content-hash pk_hash in ST change buffers; stale-row-after-UPDATE bug fixed ✅
- DAG-4 bypass column types fixed; parallel worker tests complete without timeout ✅
- docs/UPGRADING.md updated with v0.11.0→v0.12.0 migration notes ✅
- scripts/check_upgrade_completeness.sh passes ✅
- Extension upgrade path tested (0.11.0 → 0.12.0) ✅
v0.13.0 — Scalability Foundations, Partitioning Enhancements, MERGE Profiling & Multi-Tenant Scheduling
Status: Released (2026-03-31).
Goal: Deliver the scalability foundations deferred from v0.12.0 — columnar change tracking and shared change buffers — alongside the partitioning enhancements that build on v0.11.0's RANGE partitioning spike, a MERGE deduplication profiling pass, the dbt macro updates, per-database worker quotas for multi-tenant deployments, the TPC-H-derived benchmarking harness for data-driven performance validation, and a small SQL coverage cleanup for PG 16+ expression types.
Completed items (click to expand)
Phases from PLAN_0_12_0.md: Phases 5 (Scalability), 6 (Partitioning), 7 (MERGE Profiling), and 8 (dbt Macro Updates). Plus three new phases: 9 (Multi-Tenant Scheduler Isolation), 10 (TPC-H Benchmark Harness), and 11 (SQL Coverage Cleanup).
Scalability Foundations (Phase 5)
In plain terms: These items directly serve the project's primary goal of world-class performance and scalability. Columnar change tracking eliminates wasted delta processing for wide tables, and shared change buffers reduce I/O multiplication in deployments with many stream tables reading from the same source.
| Item | Description | Effort | Ref |
|---|---|---|---|
| A-2 | Columnar Change Tracking. Per-column bitmask in CDC triggers; skip rows where no referenced column changed; lightweight UPDATE-only path when only projected columns changed; 50–90% delta-volume reduction for wide-table UPDATE workloads. | 3–4 wk | PLAN_NEW_STUFF.md §A-2 |
| D-4 | Shared Change Buffers. Single buffer per source shared across all dependent STs; multi-frontier cleanup coordination; static-superset column mode for initial implementation. | 3–4 wk | PLAN_NEW_STUFF.md §D-4 |
| PERF-2 | Buffer partitioning for high-throughput sources. A change buffer that exceeds `compact_threshold` in a single refresh cycle is converted to RANGE(lsn) partitioned mode at runtime. | — | REPORT_OVERALL_STATUS.md §R7 |
⚠️ D-4 multi-frontier cleanup correctness verified. `MIN(consumer_frontier)` used in all cleanup paths. Property-based tests with 5–10 consumers and 500 random frontier advancement cases pass.
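As a sketch, the D-4 cleanup invariant can be expressed as a tiny helper (names are illustrative, not pg_trickle's real API): a shared change buffer may only discard rows at or below the minimum frontier across all registered consumers.

```rust
/// Hypothetical sketch of D-4's cleanup rule: the highest LSN safe to
/// delete from a shared buffer is the *minimum* consumer frontier.
/// Returns None when no consumers are registered (keep everything).
fn safe_cleanup_lsn(consumer_frontiers: &[u64]) -> Option<u64> {
    consumer_frontiers.iter().copied().min()
}

fn main() {
    // Three stream tables consuming the same buffer at different frontiers:
    // only rows with lsn <= 900 may be deleted.
    assert_eq!(safe_cleanup_lsn(&[1500, 900, 4200]), Some(900));
    // With no consumers the helper declines to name a cleanup point.
    assert_eq!(safe_cleanup_lsn(&[]), None);
}
```

The property tests mentioned above exercise exactly this invariant under random frontier advancement.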
Scalability foundations subtotal: ~6–8 weeks
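The per-column bitmask described in A-2 can be sketched as follows (an illustrative model, not the CDC trigger implementation): bit *i* is set when column *i* changed, and a delta row is skipped when none of the columns the view references changed.

```rust
/// Illustrative A-2 sketch: compute a bitmask of changed columns by
/// comparing old and new row values (here modelled as string slices).
fn change_bitmask(old: &[&str], new: &[&str]) -> u64 {
    let mut mask = 0u64;
    for (i, (o, n)) in old.iter().zip(new.iter()).enumerate() {
        if o != n {
            mask |= 1u64 << i;
        }
    }
    mask
}

/// `referenced` is a bitmask of the columns the defining query reads;
/// a changed row only produces a delta when the masks intersect.
fn delta_needed(changed: u64, referenced: u64) -> bool {
    changed & referenced != 0
}

fn main() {
    // Four-column row; only column 3 changed.
    let changed = change_bitmask(&["a", "b", "c", "1"], &["a", "b", "c", "2"]);
    assert_eq!(changed, 0b1000);
    // View reads columns 0 and 1 only: the UPDATE is irrelevant, skip it.
    assert!(!delta_needed(changed, 0b0011));
    // View reads column 3: a delta row must be emitted.
    assert!(delta_needed(changed, 0b1000));
}
```

This is where the claimed 50–90% delta-volume reduction comes from: wide tables whose UPDATEs touch unreferenced columns produce no delta work at all.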
Partitioning Enhancements (Phase 6)
In plain terms: The v0.11.0 spike delivered RANGE partitioning end-to-end. These follow-on items extend coverage to the use cases deliberately deferred from A1: multi-column keys, retrofitting existing stream tables, LIST-based partitions, HASH partitions, and operational quality-of-life improvements.
| Item | Description | Effort | Ref |
|---|---|---|---|
| A1-1b | Multi-column RANGE partition keys via partition_by; ROW() predicate for composite keys. | — | src/api.rs, src/refresh.rs |
| A1-1c | alter_stream_table(partition_by => …) support. Add/change/remove partition key with full storage rebuild. | — | src/api.rs, src/catalog.rs |
| A1-1d | PARTITION BY LIST for low-cardinality columns; IN (…) predicate style from the delta. | — | src/api.rs, src/refresh.rs |
| A1-3b | HASH:col[:N] with auto-created child partitions; per-partition MERGE through satisfies_hash_partition(). | — | src/api.rs, src/refresh.rs |
| PART-WARN | warn_default_partition_growth() after FULL and DIFFERENTIAL refresh. | — | src/refresh.rs |
Partitioning enhancements subtotal: ~5–8 weeks
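The `IN (…)` predicate style that A1-1d derives from the delta can be sketched like this (a simplified model: quoting and type handling are naive, and the helper name is hypothetical):

```rust
use std::collections::BTreeSet;

/// Illustrative A1-1d sketch: collect the distinct partition-key values
/// present in a delta batch and emit a pruning predicate for the MERGE,
/// so only the touched LIST partitions are scanned.
fn in_list_predicate(key_col: &str, delta_keys: &[&str]) -> String {
    // BTreeSet deduplicates and gives deterministic (sorted) output.
    let distinct: BTreeSet<&str> = delta_keys.iter().copied().collect();
    let quoted: Vec<String> = distinct.iter().map(|k| format!("'{}'", k)).collect();
    format!("{} IN ({})", key_col, quoted.join(", "))
}

fn main() {
    let pred = in_list_predicate("region", &["emea", "apac", "emea"]);
    assert_eq!(pred, "region IN ('apac', 'emea')");
}
```

PostgreSQL's planner can then prune LIST partitions whose values do not appear in the predicate.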
MERGE Profiling (Phase 7)
| Item | Description | Effort | Ref |
|---|---|---|---|
| G14-MDED | MERGE deduplication profiling. Profile how often concurrent-write scenarios produce duplicate key entries requiring pre-MERGE compaction. If ≥10% of refresh cycles need dedup, write an RFC for a two-pass MERGE strategy. | 3–5d | plans/performance/REPORT_OVERALL_STATUS.md §14 |
| PROF-DLT | Delta SQL query plan profiling (explain_delta() function). Capture EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) for auto-generated delta SQL queries to identify PostgreSQL execution bottlenecks (join algorithms, scan types, sort spills). Add pgtrickle.explain_delta(st_name, format DEFAULT 'text') SQL function; optional PGS_PROFILE_DELTA=1 environment variable for E2E test auto-capture to /tmp/delta_plans/<st>.json. Enables identification of operator-level performance issues (semi-join full scans, deep join chains). Prerequisite for data-driven MERGE optimization. | 1–2w | PLAN_TPC_H_BENCHMARKING.md §1-5 |
MERGE profiling subtotal: ~1–3 weeks
dbt Macro Updates (Phase 8)
In plain terms: Expose the v0.11.0 SQL API additions (`partition_by`, `fuse`, `fuse_ceiling`, `fuse_sensitivity`) in the dbt materialization macros so dbt users can configure them via `config(...)`. No catalog changes; pure Jinja/SQL.
| Item | Description | Effort |
|---|---|---|
| DBT-1 | partition_by config option wired through stream_table.sql, create_stream_table.sql, and alter_stream_table.sql | ~1d |
| DBT-2 | fuse, fuse_ceiling, fuse_sensitivity config options wired through the materialization and alter macro with change-detection logic | ~1–2d |
| DBT-3 | dbt docs update: README and SQL_REFERENCE.md dbt section | ~0.5d |
dbt macro updates subtotal: ~2–3.5 days
Multi-Tenant Scheduler Isolation (Phase 9)
In plain terms: As deployments grow past 10 databases on a single cluster, all schedulers compete for the same global background-worker pool. One busy database can starve the others. Phase 9 gives operators per-database quotas and a priority queue so critical databases always get workers.
| Item | Description | Effort | Ref |
|---|---|---|---|
| C-3 | Per-database worker quotas. `pg_trickle.per_database_worker_quota` GUC; priority ordering: IMMEDIATE > Hot > Warm > Cold STs; burst capacity up to 150% when other databases are under quota. Implemented: `compute_per_db_quota()` with 80% burst; tier-aware `sort_ready_queue_by_priority`; 5 unit tests + 6 E2E tests. | — | src/scheduler.rs |
⚠️ C-3 depends on C-1 (tiered scheduling) for Hot/Warm/Cold classification. If C-1 is not ready, fall back to IMMEDIATE > all-other ordering with equal priority within each tier; add full tier-aware ordering as a follow-on when C-1 lands in v0.14.0.
Multi-tenant scheduler isolation subtotal: ~2–3 weeks
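The quota-with-burst idea can be sketched as follows. This is an illustrative model only: the real `compute_per_db_quota()` in `src/scheduler.rs` differs in detail, and the exact interaction of the 150% burst cap with the 80% load threshold is an assumption here.

```rust
/// Hypothetical per-database quota sketch: each database gets an even
/// share of the worker pool, and may burst to 150% of that share while
/// the cluster as a whole is under 80% load.
fn per_db_quota(total_workers: u32, num_dbs: u32, cluster_load: f64) -> u32 {
    if num_dbs == 0 {
        return 0;
    }
    let base = (total_workers / num_dbs).max(1);
    if cluster_load < 0.8 {
        // Other databases are leaving capacity on the table: allow burst.
        ((base as f64 * 1.5).floor() as u32).min(total_workers)
    } else {
        base
    }
}

fn main() {
    // 8 workers across 4 databases: base quota 2, burst quota 3.
    assert_eq!(per_db_quota(8, 4, 0.9), 2);
    assert_eq!(per_db_quota(8, 4, 0.5), 3);
}
```

The point of the burst is that quotas cap starvation, not utilization: idle capacity still gets used.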
TPC-H Benchmark Harness (Phase 10)
In plain terms: The existing TPC-H correctness suite (22/22 queries passing) has no timing infrastructure. Phase 10 adds benchmark mode so we can measure FULL vs DIFFERENTIAL speedups across all 22 queries — the only way to validate that A-2, D-4, and other v0.13.0 changes actually help on realistic analytical workloads, and to catch per-query regressions at larger scale factors.
| Item | Description | Effort | Ref |
|---|---|---|---|
| TPCH-1 | TPCH_BENCH=1 benchmark mode for Phase 3. Instrument test_tpch_full_vs_differential with warm-up cycles (WARMUP_CYCLES=2), reuse extract_last_profile() for [PGS_PROFILE] extraction, emit [TPCH_BENCH] structured output per cycle (query=q01 tier=2 cycle=1 mode=DIFF ms=12.7 decision=0.41 merge=11.3 …). Add print_tpch_summary() with per-query FULL/DIFF median, speedup, P95, and MERGE% table. | 4–5h | PLAN_TPC_H_BENCHMARKING.md §3 |
| TPCH-2 | just bench-tpch / bench-tpch-large / bench-tpch-fast justfile targets. bench-tpch: SF-0.01 with TPCH_BENCH=1; bench-tpch-large: SF-0.1 with 5 cycles; bench-tpch-fast: skip Docker image rebuild. Enables before/after measurement for every v0.13.0 optimization. | 15 min | PLAN_TPC_H_BENCHMARKING.md §3 |
| TPCH-3 | TPC-H OpTree Criterion micro-benchmarks. Add composite OpTree benchmarks to benches/diff_operators.rs representing TPC-H query shapes (diff_tpch_q01, diff_tpch_q05, diff_tpch_q08, diff_tpch_q18, diff_tpch_q21). Measures pure-Rust delta SQL generation time for complex multi-join/semi-join trees; catches DVM engine regressions without a running database. | 4h | PLAN_TPC_H_BENCHMARKING.md §4 |
TPC-H benchmark harness subtotal: ~1 day
SQL Coverage Cleanup (Phase 11)
In plain terms: Three small SQL expression gaps that are unscheduled anywhere. Two are PG 16+ standard SQL syntax currently rejected with errors; one is an audit-gated correctness check for recursive CTEs with non-monotone operators. All are low-effort items that round out DVM coverage without adding scope risk.
| Item | Description | Effort | Ref |
|---|---|---|---|
| SQL-RECUR | Recursive CTE non-monotone divergence audit. Write an E2E test for a recursive CTE with EXCEPT or aggregation in the recursive term (WITH RECURSIVE … SELECT … EXCEPT SELECT …). If the test passes → downgrade G1.3 to P4 (verified correct, no code change). If it fails → add a guard in diff_recursive_cte that detects non-monotone recursive terms and rejects them with ERROR: non-monotone recursive CTEs are not supported in DIFFERENTIAL mode — use FULL. | 6–8h | GAP_SQL_PHASE_7.md §G1.3 |
| SQL-PG16-1 | IS JSON predicate support (PG 16+). expr IS JSON, expr IS JSON OBJECT, expr IS JSON ARRAY, expr IS JSON SCALAR, expr IS JSON WITH UNIQUE KEYS — standard SQL/JSON predicates rejected today. Add a T_JsonIsPredicate arm in parser.rs; the predicate is treated opaquely (no delta decomposition); it passes through to the delta SQL unchanged where the PG executor evaluates it natively. | 2–3h | GAP_SQL_PHASE_6.md §G1.4 |
| SQL-PG16-2 | SQL/JSON constructor support (PG 16+). JSON_OBJECT(…), JSON_ARRAY(…), JSON_OBJECTAGG(…), JSON_ARRAYAGG(…) — standard SQL/JSON constructors (T_JsonConstructorExpr) currently rejected. Add opaque pass-through in parser.rs; treat as scalar expressions (no incremental maintenance of the JSON value itself); handle the aggregate variants the same way as other custom aggregates (full group rescan). | 4–6h | GAP_SQL_PHASE_6.md §G1.5 |
SQL coverage cleanup subtotal: ~1–2 days
DVM Engine Improvements
In plain terms: The delta SQL generated for deep multi-table joins (e.g., TPC-H Q05/Q09 with 6 joined tables) computes identical pre-change snapshots redundantly at every reference site, spilling multi-GB temporary files that exceed `temp_file_limit`. Nested semi-joins (Q20) exhibit an O(n²) blowup from fully materializing the right-side pre-change state. These improvements target the intermediate data volume directly in the delta SQL generator, with TPC-H 22/22 DIFFERENTIAL correctness as the measurable gate.
| Item | Description | Effort | Ref |
|---|---|---|---|
| DI-1 | Named CTE L₀ snapshots. Emit per-leaf pre-change snapshots as named CTEs (NOT MATERIALIZED default; MATERIALIZED when reference count ≥ 3); deduplicate 3–10× redundant EXCEPT ALL evaluations per leaf. Targets Q05/Q09 temp spill root cause. | 2–3d | PLAN_DVM_IMPROVEMENTS.md §DI-1 |
| DI-2 | Pre-image read from change buffer + aggregate UPDATE-split. Replace per-leaf EXCEPT ALL with a NOT EXISTS anti-join on pk_hash + direct old_* read. Per-leaf conditional fallback to EXCEPT ALL when delta exceeds max_delta_fraction for that leaf. Includes aggregate UPDATE-split: the 'D' side of SUM(CASE WHEN …) evaluates using old_* column values, superseding DI-8’s band-aid. | 3.5–5.5d | PLAN_DVM_IMPROVEMENTS.md §DI-2 |
| DI-3 | Group-key filtered aggregate old rescan. Restrict non-algebraic aggregate EXCEPT ALL rescans to affected groups via EXISTS (… IS NOT DISTINCT FROM …) filter. NULL-safe. Independent quick win. | 0.5–1d | PLAN_DVM_IMPROVEMENTS.md §DI-3 |
| DI-6 | Lazy semi-join R_old materialization. Skip EXCEPT ALL for unchanged semi-join right children; push down equi-join key as a filter when R_old is needed. Eliminates Q20-type O(n²) blowup. | 1–2d | PLAN_DVM_IMPROVEMENTS.md §DI-6 |
| DI-4 | Shared R₀ CTE cache. Cache pre-change snapshot SQL by OpTree node identity to avoid regenerating duplicate inline subqueries for shared subtrees. Depends on DI-1. | 1–2d | PLAN_DVM_IMPROVEMENTS.md §DI-4 |
| DI-5 | Part 3 correction consolidation. Consolidate per-node Part 3 correction CTEs for linear inner-join chains into a single term. | 2–3d | PLAN_DVM_IMPROVEMENTS.md §DI-5 |
| DI-7 | Scan-count-aware strategy selector. max_differential_joins and max_delta_fraction per-stream-table options; auto-fallback to FULL refresh when join count or delta-rate threshold is exceeded. Complements DI-2's per-leaf fallback with a coarser per-ST guard at scheduler decision time. | 1–2d | PLAN_DVM_IMPROVEMENTS.md §DI-7 |
| DI-8 | SUM(CASE WHEN …) algebraic drift fix. Detect Expr::Raw("CASE …") in is_algebraically_invertible() and fall back to GROUP_RESCAN. Q14 is unaffected (parsed as ComplexExpression, already GROUP_RESCAN). Correctness band-aid superseded by DI-2’s aggregate UPDATE-split. | ~0.5d | PLAN_DVM_IMPROVEMENTS.md §DI-8 |
| DI-9 | Scheduler skips IMMEDIATE-mode tables. Raise scheduler_interval_ms GUC cap to 600,000 ms; return early from refresh-due check for refresh_mode = IMMEDIATE (verified safe: IMMEDIATE drains TABLE-source buffers synchronously; downstream CALCULATED tables detected via has_stream_table_source_changes() independently). | 0.5d | PLAN_DVM_IMPROVEMENTS.md §DI-9 |
| DI-10 | SF=1 benchmark validation gate. Add bench-tpch-sf1 justfile target (TPCH_SF=1 TPCH_BENCH=1). Gate v0.13.0 release on 22/22 queries at SF=1. CI: manual dispatch only (60–180 min runtime, 4h timeout). | ~0.5d | PLAN_DVM_IMPROVEMENTS.md §DI-10 |
| DI-11 | Predicate pushdown + deep-join L₀ threshold + planner hints. (a) Enable push_filter_into_cross_joins() with scalar-subquery guard. (b) Deep-join L₀ threshold (4+ scans): skip L₀ reconstruction, use L₁ + Part 3 correction. (c) Deep-join planner hints (5+ scans): disable nestloop, raise work_mem, override temp_file_limit. Result: 22/22 TPC-H DIFFERENTIAL. | ~1d | — |
DI-2 promoted from v1.x: CDC `old_*` column capture was completed as part of the typed-column CDC rewrite (already in production). DI-2 scope includes both the join-level pre-image capture (`NOT EXISTS` anti-join) and an aggregate UPDATE-split that uses `old_*` values for the 'D' side of SUM(CASE WHEN …), superseding DI-8's GROUP_RESCAN band-aid.
Implementation order: DI-8 → DI-9 → DI-1 → DI-3 → DI-2 → DI-6 → DI-4 → DI-5 → DI-7 → DI-10 → DI-11
DVM improvements subtotal: ~2–3 weeks (DI-8/DI-9 are small independent fixes; DI-1–DI-7 are the core engine work; DI-10 is a validation run; DI-11 is predicate pushdown + deep-join optimization)
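DI-1's materialization rule can be sketched as a small helper (illustrative names, not the actual generator code): each pre-change leaf snapshot becomes a named CTE, and it is pinned with `MATERIALIZED` only when referenced often enough that computing it once clearly beats inlining.

```rust
/// Illustrative DI-1 sketch: emit a named CTE for a pre-change leaf
/// snapshot, choosing the materialization hint from its reference count
/// (NOT MATERIALIZED by default; MATERIALIZED when referenced >= 3 times).
fn snapshot_cte(name: &str, body: &str, ref_count: usize) -> String {
    let hint = if ref_count >= 3 { "MATERIALIZED" } else { "NOT MATERIALIZED" };
    format!("{} AS {} ({})", name, hint, body)
}

fn main() {
    // A leaf referenced at 4 sites: compute once, reuse.
    let hot = snapshot_cte("lineitem_l0", "SELECT * FROM lineitem_pre", 4);
    assert!(hot.contains("AS MATERIALIZED"));
    // A leaf referenced once: let the planner inline it.
    let cold = snapshot_cte("orders_l0", "SELECT * FROM orders_pre", 1);
    assert!(cold.contains("AS NOT MATERIALIZED"));
}
```

The threshold of 3 mirrors the table above; the dedup win comes from replacing 3–10 identical `EXCEPT ALL` evaluations with one CTE scan.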
Regression-Free Testing Initiative (Q2 2026)
Addresses 9 structural weaknesses identified in the regression risk analysis. Target: reduce regression escape rate from ~15% to <5%.
| Phase | Item | Status |
|---|---|---|
| P1 | Test infrastructure hardening: #[must_use] on poll helpers; wait_for_condition with exponential backoff; assert_column_types_match | ✅ Done (2026-03-28) |
| P2 | Join multi-cycle correctness: 7 tests — LEFT/RIGHT/FULL join, join-key update, both-sides DML, 4-table chain, NULL key | ✅ Done (2026-03-28) |
| P3 | Differential ≡ Full equivalence: 11 tests covering every major DVM operator class; effective_refresh_mode guard | ✅ Done (2026-03-28) |
| P4 | DVM operator execution: LATERAL MAX subquery multi-cycle (5 cycles) + recursive CTE org hierarchy multi-cycle (5 cycles) | ✅ Done (2026-03-28) |
| P5 | Failure recovery & schema evolution: 6 failure recovery tests (FR-1..6 in e2e_failure_recovery_tests.rs) + 5 schema evolution tests (SE-1..5 in e2e_ddl_event_tests.rs) | ✅ Done (2026-03-28) |
| P6 | MERGE template unit tests: 8 pure-Rust tests — determine_refresh_action (×5) + build_is_distinct_clause boundary (×3) in src/refresh.rs | ✅ Done (2026-03-28) |
v0.13.0 total: ~15–23 weeks (Scalability: 6–8w, Partitioning: 5–8w, MERGE Profiling: 1–3w, dbt: 2–3.5d, Multi-tenant: 2–3w, TPC-H harness: ~1d, SQL cleanup: ~1–2d, DVM improvements: ~2–3w)
Exit criteria:
- A-2: Columnar change tracking bitmask skips irrelevant rows; key column classification ✅, `__pgt_key_changed` annotation ✅, P5 value-only fast path ✅, `DiffResult.has_key_changed` signal propagation ✅, MERGE value-only UPDATE optimization ✅, upgrade script ✅ — ✅ Done
- D-4: Shared buffer serves multiple STs via per-source `changes_{oid}` naming; `pgt_change_tracking.tracked_by_pgt_ids` reference counting; `shared_buffer_stats()` observability; property-based test with 5–10 consumers (3 properties, 500 cases); 5 E2E fan-out tests ✅ Done
- PERF-2: `buffer_partitioning = 'auto'` activates RANGE(lsn) partitioned mode for high-throughput sources — throughput-based `should_promote_inner()` heuristic, `convert_buffer_to_partitioned()` runtime migration, 10 unit tests + 3 E2E tests, `docs/CONFIGURATION.md` updated ✅ Done
- A1-1b: Multi-column RANGE partition keys work end-to-end; composite ROW() predicate triggers partition pruning; 3 E2E tests + 5 unit tests ✅ Done
- A1-1c: `alter_stream_table(partition_by => …)` repartitions existing storage table without data loss; add/change/remove tested
- A1-1d: LIST partitioning creates `PARTITION BY LIST` storage; IN-list predicate injected; single-column-only validated; 4 E2E tests pass
- A1-3b: HASH partitioning uses per-partition MERGE loop; auto-creates N child partitions; `satisfies_hash_partition()` filter; 22 unit tests + 6 E2E tests ✅ Done
- PART-WARN: `WARNING` emitted when default partition has rows after refresh; `warn_default_partition_growth()` on both FULL and DIFFERENTIAL paths ✅ Done
- G14-MDED: Deduplication frequency profiling complete; `TOTAL_DIFF_REFRESHES` + `DEDUP_NEEDED_REFRESHES` shared-memory atomic counters; `pgtrickle.dedup_stats()` reports ratio; RFC threshold documented at ≥10% ✅ Done
- PROF-DLT: `pgtrickle.explain_delta(st_name, format)` function captures delta query plans in text/json/xml/yaml; `PGS_PROFILE_DELTA=1` auto-capture to `/tmp/delta_plans/`; documented in SQL_REFERENCE.md ✅ Done
- C-3: Per-database worker quota enforced; tier-aware priority sort (IMMEDIATE > Hot > Warm > Cold) implemented; GUC + E2E quota tests added; `compute_per_db_quota()` with burst at 80% cluster load ✅ Done
- TPCH-1/2: `TPCH_BENCH=1` mode emits `[TPCH_BENCH]` lines + summary table; `just bench-tpch` and `bench-tpch-large` targets functional ✅ Done
- TPCH-3: Five TPC-H OpTree Criterion benchmarks pass and run without a PostgreSQL backend ✅ Done
- DBT-1/2/3: `partition_by`, `fuse`, `fuse_ceiling`, `fuse_sensitivity` exposed in dbt macros; change detection wired; integration tests added; README and SQL_REFERENCE.md updated ✅ Done
- SQL-RECUR: Recursive CTE non-monotone audit complete; G1.3 downgraded to P4 — two Tier 3h E2E tests verify recomputation fallback is correct ✅ Done
- SQL-PG16-1: `IS JSON` predicate accepted in DIFFERENTIAL defining queries; E2E tests in `e2e_expression_tests.rs` confirm correct delta behaviour ✅ Done
- SQL-PG16-2: `JSON_OBJECT`, `JSON_ARRAY`, `JSON_OBJECTAGG`, `JSON_ARRAYAGG` accepted in DIFFERENTIAL defining queries; E2E tests in `e2e_expression_tests.rs` confirm correct delta behaviour ✅ Done
- `scripts/check_upgrade_completeness.sh` passes (all catalog changes in `sql/pg_trickle--0.12.0--0.13.0.sql`) ✅ Done — 58 functions, 8 new columns, all covered
- DI-8: `is_algebraically_invertible()` detects `Expr::Raw("CASE …")` and returns `false` for `SUM(CASE WHEN …)` (Q14 unaffected — `ComplexExpression`); Q12 removed from `DIFFERENTIAL_SKIP_ALLOWLIST`; 4 unit tests ✅ Done
- DI-9: `scheduler_interval_ms` cap raised to 600,000 ms; scheduler skips IMMEDIATE-mode tables in `check_schedule()`; verified safe for CALCULATED dependants ✅ Done
- DI-1: Named CTE L₀ snapshots implemented (`NOT MATERIALIZED` default, `MATERIALIZED` when ref ≥ 3); Q05/Q09 pass DIFFERENTIAL correctness ✅ Done
- DI-2: `NOT EXISTS` anti-join replaces `EXCEPT ALL` in `build_pre_change_snapshot_sql()`; per-leaf conditional `EXCEPT ALL` fallback when delta > `max_delta_fraction`; aggregate UPDATE-split blocked on Q12 drift root cause (DI-8 band-aid retained) ✅ Done
- DI-3: Already implemented — non-algebraic aggregate old rescan filtered via `EXISTS (… IS NOT DISTINCT FROM …)` to affected groups; NULL-safe ✅ Done
- DI-6: Semi-join R_old lazy materialization with key push-down; Q20 DIFF passes at SF=0.01 ✅ Done
- DI-4/5/7: R₀ cache (subset of DI-1), Part 3 threshold raised from 3→5, strategy selector + max_delta_fraction complete ✅ Done
- DI-10: `bench-tpch-sf1` target added; 22/22 queries pass at SF=0.01 (3 cycles, zero drift) ✅ Done
- DI-11: Predicate pushdown enabled with scalar-subquery guard; deep-join L₀ threshold (4 scans); deep-join planner hints (5+ total scans); 22/22 TPC-H DIFFERENTIAL ✅ Done
- Extension upgrade path tested (`0.12.0 → 0.13.0`) ✅ Done
v0.14.0 — Tiered Scheduling, UNLOGGED Buffers & Diagnostics
Status: Released (2026-04-02).
Tiered refresh scheduling, UNLOGGED change buffers, refresh mode diagnostics, error-state circuit breaker, a full-featured TUI dashboard, security hardening (SECURITY DEFINER triggers with explicit search_path), GHCR Docker image, pre-deployment checklist, best-practice patterns guide, and comprehensive E2E test coverage. See CHANGELOG.md for the full feature list.
Completed items (click to expand)
Quick Polish & Error State Circuit Breaker (Phase 1 + 1b) — ✅ Done
- C4: `pg_trickle.planner_aggressive` GUC consolidates `merge_planner_hints` + `merge_work_mem_mb`. Old GUCs deprecated.
- DIAG-2: Creation-time WARNING for group-rescan and low-cardinality algebraic aggregates. `agg_diff_cardinality_threshold` GUC added.
- DOC-OPM: Operator support matrix summary table linked from `SQL_REFERENCE.md`.
- ERR-1: Permanent failures immediately set `ERROR` status with `last_error_message`/`last_error_at`. API calls clear error state. E2E test pending.
Manual Tiered Scheduling (Phase 2 — C-1) — ✅ Done
Tiered scheduling infrastructure was already in place since v0.11/v0.12 (refresh_tier column, RefreshTier enum, ALTER ... SET (tier=...), scheduler multipliers). Phase 2 verified completeness and added:
- C-1b: NOTICE on tier demotion from Hot to Cold/Frozen, alerting operators to the effective interval change.
- C-1c: Scheduler tier-aware multipliers confirmed: Hot ×1, Warm ×2, Cold ×10, Frozen = skip. Gated by `pg_trickle.tiered_scheduling` (default `true` since v0.12.0).
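The C-1c multipliers translate a stream table's base schedule into an effective interval. A minimal sketch, with an illustrative enum rather than the real `RefreshTier` from `src/scheduler.rs`:

```rust
/// Illustrative tier model: Hot x1, Warm x2, Cold x10, Frozen skipped.
#[derive(Clone, Copy)]
enum Tier {
    Hot,
    Warm,
    Cold,
    Frozen,
}

/// Effective refresh interval in seconds, or None when the scheduler
/// skips the table entirely (Frozen tier).
fn effective_interval(base_secs: u64, tier: Tier) -> Option<u64> {
    match tier {
        Tier::Hot => Some(base_secs),
        Tier::Warm => Some(base_secs * 2),
        Tier::Cold => Some(base_secs * 10),
        Tier::Frozen => None, // never refreshed by the scheduler
    }
}

fn main() {
    assert_eq!(effective_interval(30, Tier::Hot), Some(30));
    assert_eq!(effective_interval(30, Tier::Warm), Some(60));
    assert_eq!(effective_interval(30, Tier::Cold), Some(300));
    assert_eq!(effective_interval(30, Tier::Frozen), None);
}
```

A 30-second schedule on a Cold table therefore behaves like a 5-minute one, which is the effective-interval change the C-1b NOTICE alerts operators about.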
UNLOGGED Change Buffers (Phase 3 — D-1) — ✅ Done
- D-1a: `pg_trickle.unlogged_buffers` GUC (default `false`). New change buffer tables created as `UNLOGGED` when enabled, reducing WAL amplification by ~30%.
- D-1b: Crash recovery detection — scheduler detects UNLOGGED buffers emptied by crash (postmaster restart after last refresh) and auto-enqueues FULL refresh.
- D-1c: `pgtrickle.convert_buffers_to_unlogged()` utility function for converting existing logged buffers. Documents lock-window warning.
- D-1e: Documentation in `CONFIGURATION.md` and `SQL_REFERENCE.md`.
Documentation: Best-Practice Patterns Guide (G16-PAT) — ✅ Done
| Item | Description | Effort | Ref |
|---|---|---|---|
| G16-PAT | docs/PATTERNS.md: 6 patterns (Bronze/Silver/Gold, event sourcing, SCD type-1/2, high-fan-out, real-time dashboards, tiered refresh) with SQL examples, anti-patterns, and refresh mode recommendations. | — | ✅ Done |
Patterns guide subtotal: ✅ Done
Long-Running Stability & Multi-Database Testing (G17-SOAK, G17-MDB) — ✅ Done
Soak test validates zero worker crashes, zero ERROR states, and stable RSS under sustained mixed DML. Multi-database test validates catalog isolation, shared-memory independence, and concurrent correctness.
| Item | Description | Effort | Ref |
|---|---|---|---|
| G17-SOAK | tests/e2e_soak_tests.rs with configurable duration, 5 source tables, mixed DML, health checks, RSS monitoring, correctness verification. just test-soak / just test-soak-short. CI job: schedule + manual dispatch. | — | ✅ Done |
| G17-MDB | tests/e2e_mdb_tests.rs with two databases, catalog isolation assertion, concurrent mutation cycles, correctness verification per database. just test-mdb. CI job: schedule + manual dispatch. | — | ✅ Done |
Stability & multi-database testing subtotal: ✅ Done
Container Infrastructure (INFRA-GHCR)
| Item | Description | Effort | Ref |
|---|---|---|---|
| INFRA-GHCR | GHCR Docker image. Dockerfile.ghcr (pinned to postgres:18.3-bookworm) + .github/workflows/ghcr.yml workflow that builds a multi-arch (linux/amd64 + linux/arm64) PostgreSQL 18.3 server image with pg_trickle pre-installed and all sensible GUC defaults baked in. Smoke-tests on amd64 before push. Published to ghcr.io/grove/pg_trickle on every v* tag with immutable (<version>-pg18.3), floating (pg18), and latest tags. Uses GITHUB_TOKEN — no extra secrets. | 4h | — |
Container infrastructure subtotal: ✅ Done
Refresh Mode Diagnostics (DIAG-1) — ✅ Done
Analyzes stream table workload characteristics and recommends the optimal refresh mode. Seven weighted signals (change ratio, empirical timing, query complexity, target size, index coverage, latency variance) produce a composite score with confidence level and human-readable explanation.
| Item | Description | Effort | Ref |
|---|---|---|---|
| — | src/diagnostics.rs — pure signal-scoring functions + unit tests | — | ✅ Done |
| — | pgtrickle.recommend_refresh_mode() SQL function | — | ✅ Done |
| — | pgtrickle.refresh_efficiency() function | — | ✅ Done |
The function synthesises 7 weighted signals (historical change ratio 0.30, empirical timing 0.35, current change ratio 0.25, query complexity 0.10, target size 0.10, index coverage 0.05, P95/P50 variance 0.05) into a composite score. Confidence degrades gracefully when history is sparse.
Diagnostics subtotal: ~3.5–7 days
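The weighted-signal synthesis above can be sketched as a dot product over the documented weights. This is an illustrative model: the signal scale, normalization, and confidence handling in `src/diagnostics.rs` are assumptions here, only the weights come from the text.

```rust
/// DIAG-1 sketch: seven signals, each scored in [-1.0, 1.0] where
/// positive favours DIFFERENTIAL, combined by the documented weights.
const WEIGHTS: [(&str, f64); 7] = [
    ("historical_change_ratio", 0.30),
    ("empirical_timing", 0.35),
    ("current_change_ratio", 0.25),
    ("query_complexity", 0.10),
    ("target_size", 0.10),
    ("index_coverage", 0.05),
    ("p95_p50_variance", 0.05),
];

/// Weighted mean of the signals, normalised so the score stays in [-1, 1].
fn composite_score(signals: &[f64; 7]) -> f64 {
    let total: f64 = WEIGHTS.iter().map(|(_, w)| w).sum();
    let dot: f64 = WEIGHTS.iter().zip(signals.iter()).map(|((_, w), s)| w * s).sum();
    dot / total
}

fn main() {
    // All signals strongly favour DIFFERENTIAL: score near +1.
    assert!((composite_score(&[1.0; 7]) - 1.0).abs() < 1e-9);
    // Mixed evidence lands near zero: low confidence either way.
    let mixed = [0.5, -0.5, 0.2, 0.0, -0.2, 0.1, 0.0];
    assert!(composite_score(&mixed).abs() < 0.5);
}
```

Note that the two timing-related signals carry the most weight (0.35 + 0.05), which matches the intuition that measured refresh cost beats any static heuristic.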
Export Definition API (G15-EX) — ✅ Done
| Item | Description | Effort | Ref |
|---|---|---|---|
| G15-EX | export_definition(name TEXT) — export a stream table configuration as reproducible DDL | — | ✅ Done |
G15-EX subtotal: ~1–2 days
TUI Tool (E3-TUI)
In plain terms: A full-featured terminal user interface (TUI) for managing, monitoring, and diagnosing pg_trickle stream tables without touching SQL. Built with ratatui in Rust, it provides a real-time dashboard (think `htop` for stream tables), interactive dependency graph visualization, live refresh log, diagnostics with signal breakdown charts, CDC health monitoring, a GUC configuration editor, and a real-time alert feed — all navigable with keyboard shortcuts and a command palette. It also supports every original CLI command as one-shot subcommands for scripting and CI.
| Item | Description | Effort | Ref |
|---|---|---|---|
| E3-TUI | TUI tool (pgtrickle) for interactive management and monitoring | 8–10d | PLAN_TUI.md |
E3-TUI subtotal: ~8–10 days (T1–T8 implemented: CLI skeleton with 18 subcommands, interactive dashboard with 15 views, watch mode with `--filter`, LISTEN/NOTIFY alerts with JSON parsing, async polling with force-poll, cascade staleness detection, DAG issue detection, sparklines, fuse detail panel, trigger inventory, context-sensitive help, `docs/TUI.md`)
GUC Surface Consolidation (C4)
| Item | Description | Effort | Ref |
|---|---|---|---|
| C4 | Consolidate merge_planner_hints + merge_work_mem_mb into single planner_aggressive boolean. Reduces GUC surface area; existing two GUCs become aliases that emit a deprecation notice. | ~1–2h | PLAN_FEATURE_CLEANUP.md §C4 |
C4 subtotal: ~1–2 hours
Documentation: Pre-Deployment Checklist (DOC-PDC) — ✅ Done
| Item | Description | Effort | Ref |
|---|---|---|---|
| DOC-PDC | docs/PRE_DEPLOYMENT.md: 10-point checklist covering PG version, shared_preload_libraries, WAL configuration, PgBouncer compatibility, recommended GUCs, resource planning, monitoring, validation script. Cross-linked from GETTING_STARTED.md and INSTALL.md. | — | ✅ Done |
DOC-PDC subtotal: ✅ Done
Documentation: Operator Mode Support Matrix Cross-Link (DOC-OPM)
| Item | Description | Effort | Ref |
|---|---|---|---|
| DOC-OPM | Cross-link operator support matrix from SQL_REFERENCE.md. The 60+ operator × FULL/DIFFERENTIAL/IMMEDIATE matrix in DVM_OPERATORS.md is not discoverable from the page users actually read. Add a summary table and prominent link in SQL_REFERENCE.md §Supported SQL Constructs. | ~2–4h | docs/DVM_OPERATORS.md · docs/SQL_REFERENCE.md |
DOC-OPM subtotal: ~2–4 hours
Aggregate Mode Warning at Creation Time (DIAG-2)
In plain terms: Queries with very few distinct GROUP BY groups (e.g. 5 regions from 100K rows) are always faster with FULL refresh — differential overhead exceeds the cost of re-aggregating a tiny result set. Today users discover this only after benchmarking. A creation-time WARNING with an explicit recommendation prevents the surprise. The classification logic is already present in the DVM parser (aggregate strategy classification from `is_algebraically_invertible`, `is_group_rescan`); this item exposes it at the SQL boundary.
| Item | Description | Effort | Ref |
|---|---|---|---|
| DIAG-2 | Aggregate mode warning at create_stream_table time. After parsing the defining query, inspect the top-level operator: if it is an Aggregate node containing non-algebraic (group-rescan) functions such as MIN, MAX, STRING_AGG, ARRAY_AGG, BOOL_AND/OR, emit a WARNING recommending refresh_mode='full' or 'auto' and citing the group-rescan cost. For algebraic aggregates (SUM/COUNT/AVG), emit the warning only when the estimated group cardinality (from pg_stats.n_distinct on the GROUP BY columns) is below pg_trickle.agg_diff_cardinality_threshold (default: 1000 distinct groups), since below this threshold FULL is reliably faster. No behavior change — warning only. | ~2–4h | plans/performance/REPORT_OVERALL_STATUS.md §12.3 |
DIAG-2 subtotal: ~2–4 hours
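The DIAG-2 decision rule described in the table can be sketched as a two-branch check (illustrative types, not pg_trickle's parser types): group-rescan aggregates always warn; algebraic aggregates warn only below the cardinality threshold.

```rust
/// Illustrative DIAG-2 sketch of the warn/no-warn decision.
enum AggClass {
    /// SUM/COUNT/AVG: incrementally maintainable.
    Algebraic,
    /// MIN/MAX/STRING_AGG/ARRAY_AGG/BOOL_AND/OR: needs group rescan.
    GroupRescan,
}

fn should_warn(class: AggClass, est_groups: f64, threshold: f64) -> bool {
    match class {
        AggClass::GroupRescan => true,
        AggClass::Algebraic => est_groups < threshold,
    }
}

fn main() {
    // STRING_AGG-style aggregate: always warn about group-rescan cost.
    assert!(should_warn(AggClass::GroupRescan, 1_000_000.0, 1000.0));
    // SUM over 5 regions: tiny result set, FULL is faster, so warn.
    assert!(should_warn(AggClass::Algebraic, 5.0, 1000.0));
    // SUM over 50k groups: differential pays off, no warning.
    assert!(!should_warn(AggClass::Algebraic, 50_000.0, 1000.0));
}
```

In the real extension `est_groups` would come from `pg_stats.n_distinct` on the GROUP BY columns and `threshold` from `pg_trickle.agg_diff_cardinality_threshold`.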
DIFFERENTIAL Refresh for Manual ST-on-ST Path (FIX-STST-DIFF)
Background: When a stream table reads from another stream table (`calculated` schedule), the scheduler propagates changes via a per-ST change buffer (`pgtrickle_changes.changes_pgt_{id}`) and performs a true DIFFERENTIAL DVM refresh against that buffer. The manual `pgtrickle.refresh_stream_table()` path does not: it currently falls back to an unconditional `TRUNCATE + INSERT` (FULL refresh) for every call. This was introduced as a correctness fix in v0.13.0 (PR #371) to close a scheduler race where the previous no-op guard could leave stale data in place. The FULL fallback is correct but inefficient — it pays a full table scan of all upstream STs even when only a small delta is present.
What needs to happen: Wire `execute_manual_differential_refresh` to use the same `changes_pgt_` change buffers the scheduler already writes. When a manual refresh is requested for a `calculated` ST that has a stored frontier, check each upstream ST's change buffer for rows with `lsn > frontier.get_st_lsn(upstream_pgt_id)`. If new rows exist, apply the DVM delta SQL (same as `execute_differential_refresh`). If no rows exist beyond the frontier, return a true no-op. This also fixes the pre-existing `test_st_on_st_uses_differential_not_full` E2E failure.
| Item | Description | Effort | Ref |
|---|---|---|---|
| FIX-STST-DIFF | In execute_manual_differential_refresh (src/api.rs), replace the unconditional FULL fallback for has_st_source with a proper change-buffer delta path: read rows from changes_pgt_{upstream_pgt_id} beyond the stored frontier LSN, run DVM differential SQL, advance the frontier. Matches the scheduler path exactly. Fixes test_st_on_st_uses_differential_not_full. | — | ✅ Done |
FIX-STST-DIFF subtotal: ~1–2 days
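The frontier check at the heart of FIX-STST-DIFF can be sketched as follows (illustrative names; the real code reads LSNs from the change buffer tables rather than taking them as arguments):

```rust
/// Sketch of the manual ST-on-ST refresh decision: compare each upstream
/// buffer's max buffered LSN against the stored frontier. Rows past the
/// frontier on any upstream mean a differential refresh is needed;
/// otherwise the call is a true no-op.
#[derive(Debug, PartialEq)]
enum RefreshAction {
    NoOp,
    Differential,
}

/// Each tuple is (stored frontier LSN, max LSN present in the buffer).
fn manual_refresh_action(upstreams: &[(u64, u64)]) -> RefreshAction {
    if upstreams.iter().any(|(frontier, max_lsn)| max_lsn > frontier) {
        RefreshAction::Differential
    } else {
        RefreshAction::NoOp
    }
}

fn main() {
    // Both upstreams fully consumed: no-op, no table scan at all.
    assert_eq!(manual_refresh_action(&[(100, 100), (250, 250)]), RefreshAction::NoOp);
    // One upstream has unconsumed rows: apply the DVM delta.
    assert_eq!(manual_refresh_action(&[(100, 180), (250, 250)]), RefreshAction::Differential);
}
```

This is exactly the property the `test_st_on_st_uses_differential_not_full` E2E test asserts: a manual refresh must not degrade to FULL when the buffers say a small delta suffices.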
v0.14.0 total: ~2–6 weeks + ~1wk patterns guide + ~2–4 days stability tests + ~3.5–7 days diagnostics + ~1–2d export API + ~8–10d TUI + ~0.5d docs + ~2–4h aggregate warning + ~1–2d ST-on-ST diff manual path
Exit criteria:
- C-1: Tier classification with manual assignment; Cold STs skip refresh correctly; E2E tested ✅ Done
- D-1: UNLOGGED change buffers opt-in (`unlogged_buffers = false` by default); crash-recovery FULL-refresh path tested; E2E tested ✅ Done
- G16-PAT: Patterns guide published in `docs/PATTERNS.md` covering 6 patterns ✅ Done
- G17-SOAK: Soak test passes with zero worker crashes, zero zombie stream tables, stable memory ✅ Done
- G17-MDB: Multi-database scheduler isolation verified ✅ Done
- DIAG-1: `recommend_refresh_mode()` + `refresh_efficiency()` implemented with 7 signals; E2E tested; tutorial published ✅ Done
- DIAG-2: WARNING emitted at creation time for group-rescan and low-cardinality aggregates; threshold configurable ✅ Done
- G15-EX: `export_definition(name TEXT)` returns valid reproducible DDL; round-trip tested ✅ Done
- E3-TUI: `pgtrickle` TUI binary builds as workspace member; one-shot CLI commands functional with `--format json`; interactive dashboard launches with no subcommand; 15 views with cascade staleness, issue detection, sparklines, force-poll, NOTIFY, and context-sensitive help; documented in `docs/TUI.md` ✅ Done
- C4: `merge_planner_hints` and `merge_work_mem_mb` consolidated into `planner_aggressive` ✅ Done
- DOC-PDC: Pre-deployment checklist published in `docs/PRE_DEPLOYMENT.md` ✅ Done
- DOC-OPM: Operator mode support matrix summary and link added to SQL_REFERENCE.md ✅ Done
- FIX-STST-DIFF: Manual DIFFERENTIAL refresh for ST-on-ST path ✅ Done
- INFRA-GHCR: `ghcr.io/grove/pg_trickle` multi-arch image builds, smoke-tests, and pushes on `v*` tags ✅ Done
- ERR-1: Error-state circuit breaker with E2E test coverage ✅ Done
- Extension upgrade path tested (`0.13.0 → 0.14.0`) ✅ Done
v0.15.0 — External Test Suites & Integration
Status: Released (2026-04-03). All 20 roadmap items complete.
Goal: Validate correctness against independent query corpora and ship the dbt integration as a formal release.
Completed items (click to expand)
External Test Suite Integration
In plain terms: pg_trickle's own tests were written by the pg_trickle team, which means they can have the same blind spots as the code. This adds validation against three independent public benchmarks: PostgreSQL's own SQL conformance suite (sqllogictest), the Join Order Benchmark (a realistic analytical query workload), and Nexmark (a streaming data benchmark). If pg_trickle produces a different answer than PostgreSQL does on the same query, these external suites will catch it.
Validate correctness against independent query corpora beyond TPC-H.
➡️ TS1 and TS2 pulled forward to v0.11.0. Delivering one of TS1 or TS2 is an exit criterion for 0.11.0. TS3 (Nexmark) remains in 0.15.0. If TS1/TS2 slip from 0.11.0, they land here.
| Item | Description | Effort | Ref |
|---|---|---|---|
| TS1 | sqllogictest: PostgreSQL's SQL conformance suite run against stream tables | 2–3d | PLAN_TESTING_GAPS.md §J |
| TS2 | Join Order Benchmark (JOB): realistic analytical query workload | 1–2d | PLAN_TESTING_GAPS.md §J |
| TS3 | Nexmark streaming benchmark: sustained high-frequency DML correctness | 1–2d | PLAN_TESTING_GAPS.md §J |
External test suites subtotal: ~1–2 days (TS3 only; TS1/TS2 in v0.11.0) — ✅ TS3 complete
Documentation Review
In plain terms: A full documentation review polishes everything so the product is ready to be announced to the wider PostgreSQL community.
| Item | Description | Effort | Ref |
|---|---|---|---|
| I2 | Complete documentation review & polish | 4–6h | docs/ |
Documentation subtotal: ✅ Done
Bulk Create API (G15-BC)
| Item | Description | Effort | Ref |
|---|---|---|---|
| G15-BC | bulk_create(definitions JSONB) — create multiple stream tables and their CDC triggers in a single transaction. Useful for dbt/CI pipelines that manage many STs programmatically. | ~2–3d | plans/performance/REPORT_OVERALL_STATUS.md §15 |
G15-BC subtotal: ✅ Completed
Parser Modularization (G13-PRF) — ✅ Done
In plain terms: At ~21,000 lines, `parser.rs` was too large to maintain safely. Split into 5 sub-modules by concern — zero behavior change.
| Item | Description | Effort | Ref |
|---|---|---|---|
| G13-PRF | Split `src/dvm/parser.rs` into `mod.rs`, `types.rs`, `validation.rs`, `rewrites.rs`, `sublinks.rs`. Added `// SAFETY:` comments to all ~750 unsafe blocks (~676 newly documented). | ~3–4wk | plans/performance/REPORT_OVERALL_STATUS.md §13 |
G13-PRF subtotal: ✅ Completed
Watermark Hold-Back Mode (WM-7) — ✅ Done
In plain terms: The watermark gating system (shipped in v0.7.0) lets ETL producers signal their progress. Hold-back mode adds stuck detection: when a watermark is not advanced within a configurable timeout, downstream stream tables are paused and operators are notified.
| Item | Description | Effort | Ref |
|---|---|---|---|
| WM-7 | Watermark hold-back mode. watermark_holdback_timeout GUC detects stuck watermarks; pauses downstream gated STs; emits pgtrickle_alert NOTIFY with watermark_stuck event; auto-resumes with watermark_resumed event when watermark advances. | ✅ Done | PLAN_WATERMARK_GATING.md §4.1 |
WM-7 subtotal: ✅ Done
Delta Cost Estimation (PH-E1) — ✅ Done
In plain terms: Before executing the MERGE, runs a capped COUNT on the delta subquery to estimate output cardinality. If the count exceeds `pg_trickle.max_delta_estimate_rows`, emits a NOTICE and falls back to FULL refresh to prevent OOM or excessive temp-file spills.
| Item | Description | Effort | Ref |
|---|---|---|---|
| PH-E1 | Delta cost estimation. Capped SELECT count(*) FROM (delta LIMIT N+1) before MERGE execution. max_delta_estimate_rows GUC (default: 0 = disabled). Falls back to FULL + NOTICE when exceeded. | — | PLAN_PERFORMANCE_PART_9.md §Phase E |
PH-E1 subtotal: ✅ Complete
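The "capped" part of PH-E1 is the key trick: a bare `count(*)` would scan the whole delta, but wrapping it in a `LIMIT N+1` bounds the work at N+1 rows no matter how large the delta is. A sketch, with `delta_query` as a placeholder for the generated delta SQL and N = 1,000,000 standing in for the GUC value:

```sql
-- Capped cardinality estimate: counting stops after N+1 rows, so even a
-- pathological delta costs at most N+1 rows of scan work.
SELECT count(*) > 1000000 AS too_big
FROM (SELECT 1 FROM delta_query LIMIT 1000001) capped;
-- too_big → emit NOTICE and downgrade this cycle to FULL refresh
```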
dbt Hub Publication (I3) — ✅ Done
In plain terms: `dbt-pgtrickle` is now prepared for dbt Hub publication. The `dbt_project.yml` is version-synced (0.15.0), the README documents both git and Hub install methods, and a submission guide documents the hubcap PR process. Actual Hub listing requires creating a standalone `grove/dbt-pgtrickle` repository and submitting a PR to `dbt-labs/hubcap`.
| Item | Description | Effort | Ref |
|---|---|---|---|
| I3 | Prepared dbt-pgtrickle for dbt Hub publication. Version synced to 0.15.0, README updated with Hub install snippet, submission guide written. Hub listing pending separate repo creation + hubcap PR. | 2–4h | dbt-pgtrickle/ · docs/integrations/dbt-hub-submission.md |
I3 subtotal: ~2–4 hours — ✅ Complete
Hash-Join Planner Hints (PH-D2) — ✅ Done
In plain terms: Added a `pg_trickle.merge_join_strategy` GUC that lets operators manually override the join strategy used during MERGE. Values: `auto` (default heuristic), `hash_join`, `nested_loop`, `merge_join`. The existing delta-size heuristics remain the default (`auto`).
| Item | Description | Effort | Ref |
|---|---|---|---|
| PH-D2 | Hash-join planner hints. Added merge_join_strategy GUC with manual override for join strategy during MERGE. auto preserves existing delta-size heuristics; hash_join/nested_loop/merge_join force specific strategies. | 3–5d | PLAN_PERFORMANCE_PART_9.md §Phase D |
PH-D2 subtotal: ~3–5 days — ✅ Complete
Shared-Memory Template Cache Research Spike (G14-SHC-SPIKE)
In plain terms: Every new database connection that triggers a refresh pays a 15–50ms cold-start cost to regenerate the MERGE SQL template. With PgBouncer in transaction mode, this happens on every refresh cycle. This milestone scopes a research spike only: write an RFC, build a prototype, measure whether DSM-based caching eliminates the cold-start. Full implementation stays in v0.16.0.
| Item | Description | Effort | Ref |
|---|---|---|---|
| G14-SHC-SPIKE | Shared-memory template cache research spike. Write an RFC for DSM + lwlock-based MERGE SQL template caching. Build a prototype benchmark to validate cold-start elimination. Full implementation deferred to v0.16.0. | 2–3d | plans/performance/REPORT_OVERALL_STATUS.md §14 |
G14-SHC-SPIKE subtotal: ~2–3 days — ✅ RFC complete (plans/performance/RFC_SHARED_TEMPLATE_CACHE.md)
TRUNCATE Capture for Trigger-Mode CDC (TRUNC-1)
In plain terms: WAL-mode CDC detects TRUNCATE on source tables and marks downstream stream tables for reinitialization. But trigger-mode CDC has no TRUNCATE handler — a `TRUNCATE` silently leaves the stream table stale. Adding a statement-level TRUNCATE trigger (TRUNCATE does not fire DDL event triggers) that flags affected STs closes this correctness gap.
| Item | Description | Effort | Ref |
|---|---|---|---|
| TRUNC-1 | Statement-level TRUNCATE trigger sets a `needs_reinit.action='T'` marker; refresh engine detects it and falls back to FULL. | 4–6h | plans/adrs/PLAN_ADRS.md ADR-070 |
TRUNC-1 subtotal: ✅ Completed
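PostgreSQL fires TRUNCATE triggers per statement, not per row, which is what makes this cheap to capture. A minimal sketch of the shape such a trigger could take — the function body, `needs_reinit`, and `pgt_sources` are hypothetical names for illustration, not the extension's actual catalog:

```sql
-- Sketch: flag downstream stream tables when a trigger-mode source is truncated.
CREATE OR REPLACE FUNCTION pgtrickle.on_source_truncate() RETURNS trigger AS $$
BEGIN
  -- mark every ST that reads from the truncated relation for reinitialization
  INSERT INTO pgtrickle.needs_reinit (pgt_id, action)
  SELECT pgt_id, 'T'
  FROM pgtrickle.pgt_sources
  WHERE source_relid = TG_RELID;   -- TG_RELID = OID of the truncated table
  RETURN NULL;                     -- return value ignored for AFTER triggers
END $$ LANGUAGE plpgsql;

CREATE TRIGGER pgt_truncate_capture
  AFTER TRUNCATE ON orders
  FOR EACH STATEMENT EXECUTE FUNCTION pgtrickle.on_source_truncate();
```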
Volatile Function Policy GUC (VOL-1)
In plain terms: Volatile functions (`random()`, `clock_timestamp()`, etc.) are correctly rejected at stream table creation time in DIFFERENTIAL and IMMEDIATE modes. But there's no way for users to override this — some want volatile functions in FULL mode. Adding a `volatile_function_policy` GUC with `reject`/`warn`/`allow` modes gives operators control.
| Item | Description | Effort | Ref |
|---|---|---|---|
| VOL-1 | pg_trickle.volatile_function_policy GUC. Add a GUC with values reject (default), warn, allow to control volatile function handling. reject preserves current behavior; warn emits WARNING but allows creation; allow silently permits (user accepts correctness risk). | 3–5h | plans/sql/PLAN_NON_DETERMINISM.md |
VOL-1 subtotal: ✅ Completed
Spill-Aware Refresh (PH-E2)
In plain terms: After PH-E1 adds pre-flight cost estimation, PH-E2 adds post-flight monitoring: track `temp_bytes` from `pg_stat_statements` after each refresh cycle and auto-adjust if spill is excessive.
| Item | Description | Effort | Ref |
|---|---|---|---|
| PH-E2 | Track `temp_bytes` from `pg_stat_statements` after each refresh cycle. If spill exceeds the threshold 3 consecutive times, automatically increase the per-ST `work_mem` override or switch to FULL. Expose in `explain_st()` as `spill_history`. | 1–2 wk | PLAN_PERFORMANCE_PART_9.md §Phase E |
PH-E2 subtotal: ✅ Completed
ORM Integration Guides (E5)
In plain terms: Documentation showing how popular ORMs (SQLAlchemy, Django, etc.) interact with stream tables — model definitions, migrations, and freshness checks. Documentation-only work.
| Item | Description | Effort | Ref |
|---|---|---|---|
| E5 | ORM integrations guide (SQLAlchemy, Django, etc.) | 8–12h | PLAN_ECO_SYSTEM.md §5 |
E5 subtotal: ✅ Done
Flyway / Liquibase Migration Support (E4)
In plain terms: Documentation showing how standard migration frameworks interact with stream tables — CREATE/ALTER/DROP patterns, handling CDC triggers across schema migrations. Documentation-only work.
| Item | Description | Effort | Ref |
|---|---|---|---|
| E4 | Flyway / Liquibase migration support | 8–12h | PLAN_ECO_SYSTEM.md §5 |
E4 subtotal: ✅ Done
JOIN Key Change + DELETE Correctness Fix (EC-01) — ✅ Done (pre-existing)
In plain terms: The phantom-row-after-DELETE bug was fixed in v0.14.0 via the R₀ pre-change snapshot strategy. Part 1 of the JOIN delta is split into 1a (inserts ⋈ R₁) + 1b (deletes ⋈ R₀), ensuring DELETE deltas always find the old join partner. The fix was extended to all join depths via the EC-01B-1 per-leaf CTE strategy, and regression tests (EC-01B-2) cover TPC-H Q07, Q08, Q09.
| Item | Description | Effort | Ref |
|---|---|---|---|
| EC-01 | R₀ pre-change snapshot for JOIN key change + DELETE. Part 1 split into 1a (inserts ⋈ R₁) + 1b (deletes ⋈ R₀). Applied to INNER/LEFT/FULL JOIN. Closes G1.1. | — | GAP_SQL_PHASE_7.md §G1.1 |
EC-01 subtotal: ✅ Complete (implemented in v0.14.0)
Multi-Level ST-on-ST Testing (STST-3)
In plain terms: FIX-STST-DIFF (v0.14.0) fixed 2-level stream-table-on-stream-table DIFFERENTIAL refresh. Some 3-level cascade tests exist, but systematic coverage for 3+ level chains — including mixed refresh modes, concurrent DML at multiple levels, and DELETE/UPDATE propagation through deep chains — is missing. This adds a dedicated test matrix to prevent regressions as cascade depth increases.
| Item | Description | Effort | Ref |
|---|---|---|---|
| STST-3 | Multi-level ST-on-ST test matrix (3+ levels). Systematic coverage: 3-level and 4-level chains, INSERT/UPDATE/DELETE propagation, mixed DIFFERENTIAL/FULL modes, concurrent DML at multiple levels, correctness comparison against materialized-view baseline. | 3–5d | e2e_cascade_regression_tests.rs |
STST-3 subtotal: ✅ Done
Circular Dependencies + IMMEDIATE Mode (CIRC-IMM)
In plain terms: Circular dependencies are rejected at creation time (EC-30), but the interaction between near-circular topologies (e.g. diamond dependencies with IMMEDIATE triggers on both sides) and IMMEDIATE mode is untested territory. This adds targeted testing and, if needed, hardening to ensure IMMEDIATE mode doesn't deadlock or produce incorrect results on complex dependency graphs. Conditional P1 — can slip to v0.16.0 if no issues surface during other IMMEDIATE-mode work.
| Item | Description | Effort | Ref |
|---|---|---|---|
| CIRC-IMM | Circular-dependency + IMMEDIATE mode hardening. Test: diamond deps with IMMEDIATE triggers, near-circular topologies, lock ordering under concurrent DML. Add deadlock detection / timeout guard if issues found. | 3–5d | PLAN_EDGE_CASES.md §EC-30 · PLAN_CIRCULAR_REFERENCES.md |
CIRC-IMM subtotal: ✅ Done
Cross-Session MERGE Cache Staleness Fix (G8.1)
In plain terms: When session A alters a stream table's defining query, session B's cached MERGE SQL template remains stale until B encounters a refresh error or reconnects. Adding a catalog version counter that is bumped on every ALTER QUERY and checked before each refresh closes this race window.
| Item | Description | Effort | Ref |
|---|---|---|---|
| G8.1 | Add a `catalog_version` counter to `pgt_stream_tables`; bump on ALTER QUERY / DROP / reinit. Before each refresh, compare the cached version to the catalog; regenerate the template on mismatch. Shipped as a `CACHE_GENERATION` counter + `defining_query_hash`, which provides cross-session + per-ST invalidation without a schema change. | 4–6h | — |
G8.1 subtotal: ✅ Completed
explain_st() Enhancements (EXPL-ENH) — ✅ Done
In plain terms: Small quality-of-life improvements to the diagnostic function: refresh timing statistics, partition source info, and a dependency-graph visualization snippet in DOT format.
| Item | Description | Effort | Ref |
|---|---|---|---|
| EXPL-ENH | explain_st() enhancements. Added: (a) refresh timing stats (min/max/avg/latest duration from last 20 refreshes), (b) source partition info for partitioned tables, (c) dependency sub-graph visualization in DOT format. | 4–8h | PLAN_FEATURE_CLEANUP.md |
EXPL-ENH subtotal: ~4–8 hours — ✅ Complete
CNPG Operator Hardening (R4)
In plain terms: Kubernetes-native improvements for the CloudNativePG integration: adopt K8s 1.33+ native ImageVolume (replacing the init-container workaround), add liveness/readiness probe integration for pg_trickle health, and test failover behavior with stream tables.
| Item | Description | Effort | Ref |
|---|---|---|---|
| R4 | CNPG operator hardening. Adopt K8s 1.33+ native ImageVolume, add pg_trickle health to CNPG liveness/readiness probes, test primary→replica failover with active stream tables. | 4–6h | PLAN_CLOUDNATIVEPG.md |
R4 subtotal: ~4–6 hours — ✅ Complete
v0.15.0 total: ~52–90h + ~2–3d bulk create + ~3–5d planner hints + ~2–3d cache spike + ~3–4wk parser + ~1–2wk watermark + ~2–4wk delta cost/spill + ~2–3d EC-01 + ~3–5d ST-on-ST + ~3–5d CIRC-IMM
Exit criteria:
- At least one external test corpus (sqllogictest, JOB, or Nexmark) passes
- Complete documentation review done
- G15-BC: `pgtrickle.bulk_create(definitions JSONB)` creates all STs and CDC triggers atomically; tested with 10+ definitions in a single call
- G13-PRF: `parser.rs` split into 5 sub-modules; all ~750 `unsafe` blocks have `// SAFETY:` comments; zero behavior change; all existing tests pass
- WM-7: Stuck watermarks detected and downstream STs paused; `watermark_stuck` alert emitted; auto-resume on watermark advance
- PH-E1: Delta cost estimation via capped COUNT on delta subquery; `max_delta_estimate_rows` GUC; FULL downgrade + NOTICE when threshold exceeded
- PH-E2: Spill-aware auto-adjustment triggers after 3 consecutive spills; `spill_info` exposed in `explain_st()`
- PH-D2: `merge_join_strategy` GUC with manual override (`auto`/`hash_join`/`nested_loop`/`merge_join`)
- G14-SHC-SPIKE: RFC written; prototype benchmark validates or invalidates DSM-based approach
- I2: Complete documentation review done — CONFIGURATION.md GUCs documented (40+), SQL_REFERENCE.md gaps filled, FAQ refs fixed
- TRUNC-1: TRUNCATE on trigger-mode CDC source marks downstream STs for reinit; tested end-to-end
- VOL-1: `volatile_function_policy` GUC controls volatile function handling; `reject`/`warn`/`allow` modes tested
- I3: `dbt-pgtrickle` prepared for dbt Hub; submission guide written; Hub listing pending separate repo + hubcap PR
- E4: Flyway / Liquibase integration guide published in `docs/integrations/flyway-liquibase.md`
- E5: ORM integration guides (SQLAlchemy, Django) published in `docs/integrations/orm.md`
- EC-01: R₀ pre-change snapshot ensures DELETE deltas find old join partners; unit + TPC-H regression tests confirm correctness
- STST-3: 3-level and 4-level ST-on-ST chains tested with INSERT/UPDATE/DELETE propagation; mixed modes covered
- CIRC-IMM: Diamond + near-circular IMMEDIATE topologies tested; no deadlocks or incorrect results
- G8.1: Cross-session MERGE cache invalidation via catalog version counter; tested with concurrent ALTER QUERY + refresh
- EXPL-ENH: `explain_st()` shows refresh timing stats, source partition info, and dependency sub-graph (DOT format)
- R4: CNPG operator hardening — ImageVolume, health probes, failover tested
- Extension upgrade path tested (`0.14.0 → 0.15.0`)
- `just check-version-sync` passes
v0.16.0 — Performance & Refresh Optimization
Status: Released (2026-04-06).
Faster refreshes across the board: sub-1% deltas use DELETE+INSERT instead of MERGE, insert-only stream tables auto-detect and skip the MERGE join, algebraic aggregates apply pinpoint updates, and a cross-backend template cache eliminates cold-start latency. Automated benchmark regression gating prevents future performance degradation.
Completed items (click to expand)
Goal: Attack the MERGE bottleneck from multiple angles — alternative merge strategies, algebraic aggregate shortcuts, append-only bypass, delta filtering, change buffer compaction, shared-memory template caching — and close critical test coverage gaps to validate these new paths.
MERGE Alternatives & Planner Control (Phase D)
In plain terms: MERGE dominates 70–97% of refresh time. This explores whether replacing MERGE with DELETE+INSERT (or INSERT ON CONFLICT + DELETE) is faster for specific patterns — particularly for small deltas against large stream tables where the MERGE join is the bottleneck.
| Item | Description | Effort | Ref |
|---|---|---|---|
| PH-D1 | DELETE+INSERT merge alternative: `DELETE WHERE __pgt_row_id IN (delta_deletes)` + `INSERT ... SELECT FROM delta_inserts`. Benchmark against MERGE for 1K/10K/100K deltas against 1M/10M targets. Gate behind `pg_trickle.merge_strategy = 'auto'\|'merge'\|'delete_insert'` GUC. | 1–2 wk | PLAN_PERFORMANCE_PART_9.md §Phase D |
MERGE alternatives subtotal: ~1–2 weeks
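The `delete_insert` strategy can be sketched as two plain statements in one transaction. `active_orders`, `delta_deletes`, and `delta_inserts` are placeholders for the stream table and the generated delta CTEs:

```sql
-- Sketch of the delete_insert strategy (PH-D1). Updated rows appear in both
-- delta sets: they are deleted first, then re-inserted with their new values,
-- which is why the two statements must run in one transaction.
BEGIN;

DELETE FROM active_orders t
WHERE t.__pgt_row_id IN (SELECT __pgt_row_id FROM delta_deletes);

INSERT INTO active_orders
SELECT * FROM delta_inserts;

COMMIT;
```

The bet is that for small deltas, two index-driven statements beat one MERGE whose join against a large target dominates the cost.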
Algebraic Aggregate UPDATE Fast-Path (B-1)
In plain terms: The current aggregate delta rule recomputes entire groups where the GROUP BY key appears in the delta. For a group with 100K rows where 1 row changed, the aggregate re-scans all 100K rows in that group. For decomposable aggregates (`SUM`/`COUNT`/`AVG`), a direct `UPDATE target SET col = col + Δ` replaces the full MERGE join — dropping aggregate refresh from O(group_size) to O(1) per group.
| Item | Description | Effort | Ref |
|---|---|---|---|
| B-1 | Algebraic aggregate UPDATE fast-path. For GROUP BY queries where all aggregates are algebraically invertible (SUM/COUNT/AVG), replace the MERGE with a direct UPDATE target SET col = col + Δ WHERE group_key = ? for existing groups, plus INSERT for newly-appearing groups and DELETE for groups whose count reaches zero. Eliminates the MERGE join overhead — the dominant cost for aggregate refresh when group cardinality is high. Requires adding __pgt_aux_count / __pgt_aux_sum auxiliary columns to the stream table. Fallback to existing MERGE path for non-algebraic aggregates (MIN, MAX, STRING_AGG, etc.). Gate behind pg_trickle.aggregate_fast_path GUC (default true). Expected impact: 5–20× apply-time reduction for high-cardinality GROUP BY (10K+ distinct groups); aggregate scenarios at 100K/1% projected to drop from ~50ms to sub-1ms apply time. | 4–6 wk | plans/performance/PLAN_NEW_STUFF.md §B-1 · plans/sql/PLAN_TRANSACTIONAL_IVM.md §Phase 4 |
B-1 subtotal: ~4–6 weeks
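The fast-path apply step can be sketched as plain DML per touched group. `daily_totals`, `delta_by_group`, and the column names are illustrative; `__pgt_aux_count` matches the auxiliary column named in the item description:

```sql
-- Sketch of the algebraic fast-path for a SUM-per-day stream table.
-- SUM and COUNT are invertible, so each touched group gets a pinpoint update.
UPDATE daily_totals t
SET total           = t.total + d.sum_delta,          -- add the net Δ
    __pgt_aux_count = t.__pgt_aux_count + d.count_delta
FROM delta_by_group d
WHERE t.day = d.day;                                  -- O(1) per touched group

DELETE FROM daily_totals WHERE __pgt_aux_count = 0;   -- group emptied out
-- Newly-appearing groups are handled by a separate INSERT; AVG is derived
-- as aux_sum / aux_count rather than maintained directly.
```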
Append-Only Stream Tables — MERGE Bypass (A-3-AO)
In plain terms: When a stream table's sources are insert-only (e.g. event logs, append-only tables where CDC never sees DELETE/UPDATE), the MERGE is pure overhead — every delta row is an INSERT, never a match. Bypassing MERGE entirely with a plain `INSERT INTO st SELECT ... FROM delta` removes the join against the target table, takes only `RowExclusiveLock`, and is the single highest-payoff optimization for event-sourced architectures.
| Item | Description | Effort | Ref |
|---|---|---|---|
| A-3-AO | `CREATE STREAM TABLE … APPEND ONLY` declaration. When set, refresh uses `INSERT INTO st SELECT ... FROM delta` instead of MERGE — no target-table join, `RowExclusiveLock` only. CDC-observed heuristic fallback: if no DELETE/UPDATE has been seen, use the fast path; fall back to MERGE on first non-insert. Benchmark against MERGE for 1K/10K/100K append deltas. | 1–2 wk | — |
A-3-AO subtotal: ~1–2 weeks
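The bypass itself is a one-liner; the engineering effort is in the safety gating around it. A sketch, with `event_counts_st` and `delta` as placeholder names:

```sql
-- Append-only fast path: a plain INSERT, no join against the target.
-- Takes only RowExclusiveLock, vs. MERGE's target-table join work.
INSERT INTO event_counts_st
SELECT * FROM delta;
-- Per the heuristic fallback, the first DELETE/UPDATE observed in CDC
-- demotes the stream table back to the MERGE path.
```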
Delta Predicate Pushdown (B-2)
In plain terms: For a query like `SELECT ... FROM orders WHERE status = 'shipped'`, if a CDC change row has `status = 'pending'`, the delta processes it through scan → filter → discard. All the scan and join work is wasted. Pushing the WHERE predicate down into the change buffer scan eliminates irrelevant rows before any join processing begins — a 5–10× reduction in delta row volume for selective queries.
| Item | Description | Effort | Ref |
|---|---|---|---|
| B-2 | Delta predicate pushdown. Identify Filter nodes whose predicates reference only columns from a single source table. Inject these predicates into the `delta_scan` CTE as additional WHERE clauses (including `OR old_col = 'value'` for DELETE correctness). Expected impact: 5–10× delta row reduction for queries with < 10% selectivity. | 2–3 wk | plans/performance/PLAN_NEW_STUFF.md §B-2 |
B-2 subtotal: ~2–3 weeks
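The DELETE-correctness wrinkle deserves a concrete shape. A sketch of the pushed-down scan for the `status = 'shipped'` example — the change-buffer name and its `old_`/`new_` column layout are illustrative assumptions:

```sql
-- Sketch of a pushed-down predicate inside the delta scan (B-2). The OR arm
-- keeps DELETE/UPDATE rows whose *old* image matched the filter — dropping
-- them would leave phantom rows behind in the stream table.
WITH delta_scan AS (
  SELECT *
  FROM pgtrickle.changes_pgt_orders
  WHERE new_status = 'shipped'     -- inserts / new images that qualify
     OR old_status = 'shipped'     -- deletes / old images that used to qualify
)
SELECT * FROM delta_scan;          -- irrelevant rows never reach the joins
```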
Shared-Memory Template Caching (G14-SHC)
In plain terms: Every new database connection that triggers a refresh pays a 15–50ms cold-start cost to regenerate the MERGE SQL template. With PgBouncer in transaction mode, this happens on every single refresh cycle. Shared-memory caching stores compiled templates in PostgreSQL DSM so they survive across connections — eliminating the cold-start entirely for steady-state workloads.
| Item | Description | Effort | Ref |
|---|---|---|---|
| G14-SHC | Shared-memory template caching (implementation). Full implementation of DSM + lwlock-based MERGE SQL template caching, building on the G14-SHC-SPIKE RFC from v0.15.0. | ~2–3wk | plans/performance/REPORT_OVERALL_STATUS.md §14 |
G14-SHC subtotal: ~2–3 weeks
PostgreSQL 19 Forward-Compatibility (A3) — Moved to v1.0.0
PG 19 beta not available in time. Items A3-1 through A3-4 deferred to v1.0.0 milestone.
Change Buffer Compaction (C-4)
In plain terms: A high-churn source table can accumulate thousands of changes to the same row between refresh cycles — an INSERT followed by 10 UPDATEs followed by a DELETE is really just "nothing happened." Compaction merges multiple changes to the same row ID into a single net change before the delta query runs, reducing change buffer size by 50–90% for high-churn tables. This directly reduces work for every downstream path (MERGE, DELETE+INSERT, append-only INSERT, predicate pushdown).
| Item | Description | Effort | Ref |
|---|---|---|---|
| C-4 | Change buffer compaction. Merge multiple changes to the same `__pgt_row_id` into a single net change: INSERT+DELETE cancel out; consecutive UPDATEs collapse to one. Trigger on buffer exceeding `pg_trickle.compact_threshold` rows (default: 100K). Expected impact: 50–90% reduction in change buffer size for high-churn tables. | 2–3 wk | — |
C-4 subtotal: ~2–3 weeks
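The net-change rule ("INSERT then 10 UPDATEs then DELETE = nothing happened") reduces to looking at the first and last operation per row id. A sketch, assuming illustrative `op` ('I'/'U'/'D') and `lsn` columns on the change buffer:

```sql
-- Sketch of net-change compaction (C-4): collapse each row's change history
-- to a single net operation before the delta query runs.
WITH per_row AS (
  SELECT __pgt_row_id,
         (array_agg(op ORDER BY lsn))[1]       AS first_op,
         (array_agg(op ORDER BY lsn DESC))[1]  AS last_op
  FROM changes_pgt_orders
  GROUP BY __pgt_row_id
)
SELECT __pgt_row_id,
       CASE
         WHEN first_op = 'I' AND last_op = 'D' THEN NULL  -- cancels out entirely
         WHEN first_op = 'I'                   THEN 'I'   -- net insert
         WHEN last_op  = 'D'                   THEN 'D'   -- net delete
         ELSE 'U'                                         -- net update
       END AS net_op
FROM per_row;
```

Rows whose `net_op` is NULL can be dropped outright, which is where the 50–90% buffer reduction for high-churn tables comes from.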
Test Coverage Hardening (TG2)
In plain terms: The performance optimizations in this release change core refresh paths (MERGE alternatives, aggregate fast-path, append-only bypass, predicate pushdown). Before and alongside these changes, critical test coverage gaps need closing — particularly around operators and scenarios where bugs could hide silently. These gaps were identified in the TESTING_GAPS_2 audit.
High-Priority Gaps
| Item | Description | Effort | Ref |
|---|---|---|---|
| TG2-WIN | Window function DVM execution tests: ROW_NUMBER, RANK, DENSE_RANK, LAG/LEAD across INSERT/UPDATE/DELETE. | — | TESTING_GAPS_2.md |
| TG2-JOIN | Join multi-cycle tests: INNER/LEFT/FULL JOIN with UPDATE and DELETE propagation. | — | TESTING_GAPS_2.md |
| TG2-EQUIV | Differential ≡ Full equivalence validation for joins, aggregates, and window functions. | — | TESTING_GAPS_2.md |
Medium-Priority Gaps
| Item | Description | Effort | Ref |
|---|---|---|---|
| TG2-MERGE | refresh.rs MERGE template unit tests. Only helpers/enums tested; the core MERGE SQL template generation is untested at the unit level. | 2–3d | TESTING_GAPS_2.md |
| TG2-CANCEL | Timeout/cancellation during refresh. Zero tests for statement_timeout, pg_cancel_backend() during active refresh. Risk: silent failures or resource leaks under production load. | 1–2d | TESTING_GAPS_2.md |
| TG2-SCHEMA | Source table schema evolution. Partial DDL tests exist; type changes and column renames are thin. Risk: silent data corruption on schema change. | 2–3d | TESTING_GAPS_2.md |
TG2 subtotal: ~2–4 weeks (high-priority) + ~1–2 weeks (medium-priority)
Performance Regression CI (BENCH-CI)
In plain terms: v0.16.0 changes core refresh paths (MERGE alternatives, aggregate fast-path, append-only bypass, predicate pushdown, buffer compaction). Without automated benchmarks in CI, performance regressions will slip through silently. This adds a benchmark suite that runs on every PR and compares against a committed baseline — any statistically significant regression blocks the merge.
| Item | Description | Effort | Ref |
|---|---|---|---|
| BENCH-CI-1 | Benchmark harness in CI. Run just bench (Criterion-based) on a fixed hardware profile (GitHub Actions large runner or self-hosted). Capture results as JSON artifacts. Compare against committed baseline using Criterion's --save-baseline / --baseline. | 2–3d | plans/performance/PLAN_PERFORMANCE_PART_9.md §I |
| BENCH-CI-2 | Regression gate. Parse Criterion JSON output; fail CI if any benchmark regresses by more than 10% (configurable threshold). Report regressions as PR comment with before/after numbers. | 1–2d | plans/performance/PLAN_PERFORMANCE_PART_9.md §I |
| BENCH-CI-3 | Scenario coverage. Ensure benchmark suite covers: scan, filter, aggregate (algebraic + non-algebraic), join (2-table, 3-table), window function, CTE, TopK, append-only, and mixed workloads. At minimum 1K/10K/100K row scales. | 2–3d | plans/performance/PLAN_PERFORMANCE_PART_9.md §I |
BENCH-CI subtotal: ~1–2 weeks
Auto-Indexing on Stream Table Creation (AUTO-IDX)
In plain terms: pg_ivm automatically creates indexes on GROUP BY columns and primary key columns when creating an incrementally maintained view. pg_trickle currently requires manual index creation, which is a friction point for new users. Auto-indexing creates appropriate indexes at stream table creation time — GROUP BY keys, DISTINCT columns, and the `__pgt_row_id` covering index for MERGE performance.
| Item | Description | Effort | Ref |
|---|---|---|---|
| AUTO-IDX-1 | Auto-create indexes on GROUP BY / DISTINCT columns at `create_stream_table()` time. Gated behind the `pg_trickle.auto_index` GUC. | — | src/api.rs |
| AUTO-IDX-2 | Covering index on `__pgt_row_id`; `pg_trickle.auto_index` GUC (default true). | — | src/api.rs |
AUTO-IDX: ✅ Done
Quick Wins
| Item | Description | Effort | Ref |
|---|---|---|---|
| C2-BUG | `resume_stream_table()` verified operational (present since v0.2.0). | — | — |
| ERR-REF | Error reference published in `docs/ERRORS.md` with all 20 variants documented. Cross-linked from FAQ. | — | docs/ERRORS.md |
| GUC-DEFAULTS | `planner_aggressive` and `cleanup_use_truncate` defaults reviewed; both kept at `true` (correct for most workloads). Added detailed tuning guidance for memory-constrained and PgBouncer environments in CONFIGURATION.md. | — | docs/CONFIGURATION.md |
| BUF-LIMIT | `pg_trickle.max_buffer_rows` GUC added (default: 1M). Forces FULL refresh + truncation when exceeded. | — | src/config.rs · src/refresh.rs |
Quick wins: ✅ Done
v0.16.0 total: ~1–2 weeks (MERGE alts) + ~4–6 weeks (aggregate fast-path) + ~1–2 weeks (append-only) + ~2–3 weeks (predicate pushdown) + ~2–3 weeks (template cache) + ~2–3 weeks (buffer compaction) + ~3–6 weeks (test coverage) + ~1–2 weeks (bench CI) + ~2–3 days (auto-indexing) + ~2–4 hours (quick wins). Note: PG 19 compatibility (A3, ~18–36h) moved to v1.0.0.
Exit criteria:
- PH-D1: DELETE+INSERT strategy implemented and gated behind `merge_strategy` GUC; correctness verified for INSERT/UPDATE/DELETE deltas
- B-1: Algebraic aggregate fast-path replaces MERGE for `SUM`/`COUNT`/`AVG` GROUP BY queries; `aggregate_fast_path` GUC respected; explicit DML path (DELETE+UPDATE+INSERT) used instead of MERGE for all-algebraic aggregates; `explain_st()` exposes `aggregate_path`; existing tests pass — ✅ Done in v0.16.0 Phase 8
- A-3-AO: `CREATE STREAM TABLE … APPEND ONLY` accepted; refresh uses INSERT path; heuristic auto-promotion on insert-only buffers; falls back to MERGE on first non-insert CDC event
- B-2: Delta predicate pushdown implemented for single-source Filter nodes (P2-7); DELETE correctness verified (OR old_col predicate); selective-query benchmarks show delta row reduction
- G14-SHC: Cross-backend template cache eliminates cold-start; catalog-backed L2 cache with `template_cache` GUC; invalidation on DDL; `explain_st()` exposes stats
- A3: PG 19 builds and passes full E2E suite — moved to v1.0.0
- C-4: Change buffer compaction reduces buffer size by ≥50% for high-churn workloads; `compact_threshold` GUC respected; no correctness regressions
- TG2-WIN: Window function DVM execution tests cover ROW_NUMBER, RANK, DENSE_RANK, LAG/LEAD across INSERT/UPDATE/DELETE
- TG2-JOIN: Join multi-cycle tests cover INNER/LEFT/FULL JOIN with UPDATE and DELETE propagation; no silent data loss
- TG2-EQUIV: Differential ≡ Full equivalence validated for joins, aggregates, and window functions
- TG2-MERGE: refresh.rs MERGE template generation has unit test coverage (completed in v0.17.0)
- TG2-CANCEL: Timeout and cancellation during refresh tested; no resource leaks (completed in v0.17.0)
- TG2-SCHEMA: Source table type changes and column renames tested end-to-end
- BENCH-CI: Performance regression CI runs on every PR; 10% regression threshold blocks merge; scenario coverage includes scan/filter/aggregate/join/window/CTE/TopK/SemiJoin/AntiJoin
- AUTO-IDX: Stream tables auto-create indexes on GROUP BY / DISTINCT columns; `__pgt_row_id` covering index for ≤ 8-column tables; `auto_index` GUC respected
- C2-BUG: `resume_stream_table()` verified operational (present since v0.2.0)
- ERR-REF: Error reference doc published with all 20 PgTrickleError variants, common causes, and suggested fixes
- GUC-DEFAULTS: `planner_aggressive` and `cleanup_use_truncate` defaults reviewed; trade-offs documented in CONFIGURATION.md
- BUF-LIMIT: `max_buffer_rows` GUC prevents unbounded change buffer growth; triggers FULL + truncation when exceeded
- Extension upgrade path tested (`0.15.0 → 0.16.0`)
- `just check-version-sync` passes
v0.17.0 — Query Intelligence & Stability
Status: Released (2026-04-08).
Goal: Make the refresh engine smarter, prove correctness through automated
fuzzing, harden for scale, and prepare for adoption. Cost-based strategy
selection replaces the fixed DIFF/FULL threshold, columnar change tracking
skips irrelevant columns in wide-table UPDATEs, SQLancer integration provides
automated semantic proving, incremental DAG rebuild supports 1000+ stream table
deployments, and unsafe block reduction continues the safety hardening toward
1.0. On the adoption side: api.rs modularization improves code maintainability,
a pg_ivm migration guide targets the largest potential adopter audience, a
failure mode runbook equips production teams, and a Docker Compose playground
provides a 60-second tryout experience.
Completed items (click to expand)
Cost-Based Refresh Strategy Selection (B-4)
In plain terms: The current adaptive FULL/DIFFERENTIAL threshold is a fixed ratio (`differential_max_change_ratio`, default 0.5). A join-heavy query may be better off with FULL at a 5% change rate, while a scan-only query benefits from DIFFERENTIAL up to 80%. This replaces the fixed threshold with a cost model trained on each stream table's own refresh history — selecting the cheapest strategy per cycle automatically.
| Item | Description | Effort | Ref |
|---|---|---|---|
| B-4 | Cost-based refresh strategy selection. Collect per-ST statistics (delta_row_count, merge_duration_ms, full_refresh_duration_ms, query_complexity_class) from pgt_refresh_history. Fit a simple linear cost model. Before each refresh, compare estimated_diff_cost(Δ) vs estimated_full_cost × safety_margin and select the cheaper path. Cold-start heuristic (< 10 refreshes) falls back to existing fixed threshold. Gate behind pg_trickle.refresh_strategy = 'auto'|'differential'|'full' GUC. | 2–3 wk | plans/performance/PLAN_NEW_STUFF.md §B-4 |
B-4 subtotal: ~2–3 weeks
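The crudest version of the comparison can be expressed directly in SQL against the refresh history. The table and column names below (`pgt_refresh_history`, `strategy`, `duration_ms`, `delta_row_count`) are illustrative assumptions about the catalog shape, and the "model" here is just two averages:

```sql
-- Sketch of the per-ST cost comparison (B-4): average cost per delta row on
-- the differential path vs. average FULL refresh duration, for pgt_id = 42.
SELECT
  avg(duration_ms::float8 / nullif(delta_row_count, 0))
    FILTER (WHERE strategy = 'differential')  AS ms_per_delta_row,
  avg(duration_ms)
    FILTER (WHERE strategy = 'full')          AS full_ms
FROM pgtrickle.pgt_refresh_history
WHERE pgt_id = 42;
-- Choose DIFFERENTIAL when ms_per_delta_row * pending_delta_rows
-- < full_ms * safety_margin; otherwise FULL. Fewer than 10 history rows
-- → fall back to the fixed-threshold heuristic.
```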
Columnar Change Tracking (A-2-COL)
In plain terms: When a source table UPDATE changes only 1 of 50 columns, the current CDC captures the entire row (old + new) and the delta query processes all columns. If the changed column is not referenced by the stream table's defining query, the entire refresh is wasted work. Columnar change tracking adds a per-column bitmask to CDC events so the delta query can skip irrelevant rows at scan time — a 50–90% reduction in delta volume for wide-table OLTP workloads.
| Item | Description | Effort | Ref |
|---|---|---|---|
| A-2-COL-1 | CDC trigger bitmask. Compute changed_columns bitmask (old.col IS DISTINCT FROM new.col) in the CDC trigger; store as int8 or bit(n) alongside the change row. | 1–2 wk | plans/performance/PLAN_NEW_STUFF.md §A-2 |
| A-2-COL-2 | Delta-scan column filtering. At delta-query build time, consult the bitmask: skip rows where no referenced column changed; use lightweight UPDATE-only path when only projected columns changed (no join keys, no filter predicates, no aggregate keys). | 1–2 wk | plans/performance/PLAN_NEW_STUFF.md §A-2 |
| A-2-COL-3 | Aggregate correction optimization. For aggregates where only the aggregated value column changed (not GROUP BY key), emit a single correction row instead of delete-old + insert-new. | 3–5d | plans/performance/PLAN_NEW_STUFF.md §A-2 |
A-2-COL subtotal: ~3–4 weeks
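The bitmask mechanics can be illustrated concretely. The trigger expression follows the A-2-COL-1 description; the change-buffer table name, column names, and bit assignments here are assumptions for the sketch.

```sql
-- Illustrative bitmask (bit i set when column i changed) and the scan-time
-- filter it enables, for a source table with columns (status, amount).
-- Inside the CDC trigger:
--   changed_columns := (CASE WHEN OLD.status IS DISTINCT FROM NEW.status THEN 1 ELSE 0 END)
--                    | (CASE WHEN OLD.amount IS DISTINCT FROM NEW.amount THEN 2 ELSE 0 END);
-- A stream table whose defining query references only "amount" (mask = 2)
-- can then skip irrelevant UPDATE rows at delta-scan time:
SELECT *
FROM pgt_change_buffer
WHERE change_kind <> 'U'          -- inserts/deletes always matter
   OR changed_columns & 2 <> 0;   -- updates matter only if a referenced column changed
```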
Transactional IVM Phase 4 Remaining (A2)
In plain terms: IMMEDIATE mode (same-transaction refresh) shipped in v0.2.0 using SQL-level statement triggers. Phase 4 completes the transition to lower-overhead C-level triggers and ENR-based transition tables — sharing the transition tuplestore directly between the trigger and the refresh engine instead of copying through a temp table. Also adds prepared statement reuse to eliminate repeated parse/plan overhead for the delta query.
| Item | Description | Effort | Ref |
|---|---|---|---|
| A2-ENR | Deferred post-1.0: pg_sys ENR tuplestore FFI not surfaced by pgrx; carries memory-corruption and pg_upgrade compatibility risk. Revisit after 1.0 stabilisation. | — | PLAN_TRANSACTIONAL_IVM.md §Phase 4 |
| A2-CTR | Deferred post-1.0: CreateTrigger() FFI not surfaced by pgrx; carries memory-corruption and pg_upgrade compatibility risk. Revisit after 1.0 stabilisation. | — | PLAN_TRANSACTIONAL_IVM.md §Phase 4 |
| A2-PS | Shipped: pg_trickle.use_prepared_statements GUC (default true) implemented and wired in refresh.rs; parse/plan overhead eliminated on steady-state workloads. | — | PLAN_TRANSACTIONAL_IVM.md §Phase 4 |
A2 subtotal: 0h remaining (A2-PS shipped; A2-ENR + A2-CTR deferred post-1.0)
ROWS FROM() Support (A8)
In plain terms: ROWS FROM() with multiple set-returning functions is a rarely used SQL feature, but supporting it closes a coverage gap in the parser and DVM pipeline.
| Item | Description | Effort | Ref |
|---|---|---|---|
| A8 | ROWS FROM() with multiple SRF functions. Parser + DVM support for ROWS FROM(generate_series(...), unnest(...)) in defining queries. Very low demand. | ~1–2d | PLAN_TRANSACTIONAL_IVM_PART_2.md Task 2.3 |
A8 subtotal: ~1–2 days
SQLancer Fuzzing Integration (SQLANCER)
In plain terms: pg_trickle's tests were written by the pg_trickle team, which means they share the same assumptions as the code. SQLancer is an automated database testing tool that generates random SQL queries and checks whether the results are correct — it has found hundreds of bugs in PostgreSQL, SQLite, CockroachDB, and TiDB. Integrating SQLancer gives pg_trickle a crash-test oracle (does the parser panic on fuzzed input?), an equivalence oracle (does DIFFERENTIAL mode produce the same answer as FULL?), and stateful DML fuzzing (do random INSERT/UPDATE/DELETE sequences corrupt stream table data?). This is the single highest-value testing investment for finding unknown correctness bugs.
| Item | Description | Effort | Ref |
|---|---|---|---|
| SQLANCER-1 | Fuzzing harness (just sqlancer), Rust LCG query generator, SQLANCER_CASES/SQLANCER_SEED controls, weekly-sqlancer CI job. | — | PLAN_SQLANCER.md §1 |
| SQLANCER-2 | test_sqlancer_crash_oracle / run_crash_oracle() verifies zero backend crashes over 200–2000 fuzzed queries. | — | PLAN_SQLANCER.md §2 |
| SQLANCER-3 | test_sqlancer_diff_vs_full_oracle / run_diff_vs_full_oracle() creates DIFFERENTIAL + FULL stream tables, applies 4 DML mutations, and asserts count parity. Integrated into test_sqlancer_ci_combined. | — | PLAN_SQLANCER.md §3 |
| SQLANCER-4 | test_sqlancer_stateful_dml / run_stateful_dml_fuzzing() runs SQLANCER_MUTATIONS (default 100, nightly 10 000) random INSERT/UPDATE/DELETE mutations with checkpoints every 50. CI: weekly-sqlancer-stateful job (SQLANCER_MUTATIONS=10000). | — | PLAN_SQLANCER.md §4 |
SQLANCER subtotal: 0 remaining (all four items shipped in v0.17.0)
Incremental DAG Rebuild (C-2)
In plain terms: When any DDL change occurs (e.g. ALTER STREAM TABLE, DROP STREAM TABLE), the entire dependency graph is rebuilt from scratch by querying pgt_dependencies. For 1000+ stream tables this becomes expensive — O(V+E) SPI queries. Incremental DAG maintenance records which specific stream table was affected and re-sorts only the affected subgraph, reducing the scheduler latency spike from ~50ms to ~1ms at scale.
| Item | Description | Effort | Ref |
|---|---|---|---|
| C-2-1 | Delta-based rebuild. Record affected pgt_id in a bounded ring buffer in shared memory alongside DAG_REBUILD_SIGNAL. On overflow, fall back to full rebuild. | 1 wk | plans/performance/PLAN_NEW_STUFF.md §C-2 |
| C-2-2 | Incremental topological sort. Add/remove only affected edges and vertices; re-run topological sort on the affected subgraph only. Cache the sorted schedule in shared memory. | 1–2 wk | plans/performance/PLAN_NEW_STUFF.md §C-2 |
C-2 subtotal: ~2–3 weeks
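The "affected subgraph" idea can be sketched as a recursive walk over the dependency catalog: after DDL touches one stream table, only that table and its downstream dependents need re-sorting. The pgt_dependencies column names below are assumptions for illustration.

```sql
-- Hypothetical sketch: collect the vertices that need re-sorting after DDL
-- on stream table 42; everything outside this set keeps its cached order.
WITH RECURSIVE affected AS (
  SELECT 42 AS pgt_id                       -- the stream table the DDL touched
  UNION
  SELECT d.dependent_pgt_id
  FROM pgt_dependencies d
  JOIN affected a ON d.source_pgt_id = a.pgt_id
)
SELECT pgt_id FROM affected;                -- re-run topological sort on these only
```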
Unsafe Block Reduction — Phase 6 (UNSAFE-R1/R2)
In plain terms: pg_trickle achieved a 51% reduction in unsafe blocks (from ~1,300 to 641) in earlier releases. The remaining blocks are concentrated in well-documented field-accessor macros and standalone is_a type checks. Converting these to safe wrappers removes another 150–250 unsafe blocks with minimal risk — a meaningful safety improvement before 1.0.
| Item | Description | Effort | Ref |
|---|---|---|---|
| UNSAFE-R1 | Safe field-accessor macros. Replace unsafe { (*node).field } patterns with safe accessor functions. Estimated reduction: ~100–150 unsafe blocks. | 2–4h | PLAN_REDUCED_UNSAFE.md §R1 |
| UNSAFE-R2 | Safe is_a checks. Convert standalone unsafe { is_a(node, T_Foo) } calls to safe wrapper functions. Estimated reduction: ~50–99 unsafe blocks. | 2–4h | PLAN_REDUCED_UNSAFE.md §R2 |
UNSAFE-R1/R2 subtotal: ~4–8 hours
api.rs Modularization (API-MOD)
In plain terms: api.rs is 9,413 lines — the largest file in the codebase. It contains stream table CRUD, ALTER QUERY, CDC management, bulk operations, diagnostics, and monitoring functions all in one file. It needs the same treatment that parser.rs received in v0.15.0 (split from 21K lines into 5 sub-modules). Zero behavior change — purely structural.
| Item | Description | Effort | Ref |
|---|---|---|---|
| API-MOD | Split src/api.rs into sub-modules. Proposed split: api/create.rs (create/drop/alter), api/refresh.rs (refresh entry points), api/cdc.rs (CDC management), api/diagnostics.rs (explain_st, health_check), api/bulk.rs (bulk_create), api/mod.rs (re-exports). Zero behavior change. | 1–2 wk | — |
API-MOD subtotal: ~1–2 weeks
pg_ivm Migration Guide (MIG-IVM)
In plain terms: pg_ivm is the incumbent IVM extension with 1,400+ GitHub stars and 4 years of production use. Many potential pg_trickle adopters are currently using pg_ivm. A step-by-step migration guide — mapping pg_ivm concepts to pg_trickle equivalents, with concrete SQL examples — removes the biggest adoption friction for this audience.
| Item | Description | Effort | Ref |
|---|---|---|---|
| MIG-IVM | pg_ivm → pg_trickle migration guide. Map: create_immv() → create_stream_table(); refresh_immv() → refresh_stream_table(); IMMEDIATE mode equivalence; aggregate coverage differences (5 vs 60+); GUC mapping; worked example migrating a real pg_ivm deployment. Publish as docs/tutorials/MIGRATING_FROM_PG_IVM.md. | 2–3d | docs/research/PG_IVM_COMPARISON.md |
MIG-IVM subtotal: ~2–3 days
Failure Mode Runbook (RUNBOOK)
In plain terms: Production teams need to know what happens when things go wrong — and what to do about it. This documents every failure mode pg_trickle can encounter (scheduler crash, WAL slot lag, OOM during refresh, disk full, replication slot conflict, stuck watermarks, circular convergence failure) with symptoms, diagnosis steps, and resolution procedures. Essential for on-call engineers.
| Item | Description | Effort | Ref |
|---|---|---|---|
| RUNBOOK | Failure mode runbook. Document: scheduler crash recovery, WAL decoder failures, OOM during refresh, disk-full behavior, replication slot conflicts, stuck watermarks, circular convergence timeout, CDC trigger failures, SUSPENDED state recovery, lock contention diagnosis. Include health_check() output interpretation and explain_st() troubleshooting. Publish as docs/TROUBLESHOOTING.md. | 3–5d | docs/PRE_DEPLOYMENT.md |
RUNBOOK subtotal: ~3–5 days
Docker Quickstart Playground (PLAYGROUND)
In plain terms: The fastest way to evaluate any database extension is to run it locally in 60 seconds. A docker-compose.yml with PostgreSQL + pg_trickle pre-installed, sample data (e.g. the org-chart from GETTING_STARTED.md), and a Jupyter notebook or pgAdmin web UI gives potential users a zero-friction tryout experience. This is the single most impactful item for driving initial adoption.
| Item | Description | Effort | Ref |
|---|---|---|---|
| PLAYGROUND | Docker Compose quickstart. docker-compose.yml with: PG 18 + pg_trickle, seed SQL script (org-chart example from GETTING_STARTED.md + TPC-H SF=0.01), pgAdmin web UI (optional). Single docker compose up command. README with guided walkthrough. | 2–3d | docs/GETTING_STARTED.md |
PLAYGROUND subtotal: ~2–3 days
Documentation Polish (DOC-POLISH)
In plain terms: The existing documentation is comprehensive and technically excellent, but it's optimized for users already familiar with IVM and PostgreSQL internals. These items restructure the docs for a better "first hour" experience — simpler getting-started examples, a refresh mode decision guide, a condensed new-user FAQ, and a setup verification checklist. The goal is to reduce cognitive overload for new users without losing the depth that experienced users need.
| Item | Description | Effort | Ref |
|---|---|---|---|
| DOC-HELLO | Simplified "Hello Stream Table" in GETTING_STARTED. Add a Chapter 0 with a single-table, single-aggregate stream table (e.g. SELECT department, count(*) FROM employees GROUP BY department). Create it, insert a row, verify the refresh. Build confidence before the multi-table org-chart example. | 2–4h | docs/GETTING_STARTED.md |
| DOC-DECIDE | Refresh mode decision guide. Flowchart: "Need transactional consistency? → IMMEDIATE. Volatile functions? → FULL. Otherwise → AUTO (DIFFERENTIAL with FULL fallback)." Include when-to-use guidance for each mode with concrete examples. Publish as a section in GETTING_STARTED or as a standalone tutorial. | 2–4h | docs/tutorials/tuning-refresh-mode.md |
| DOC-FAQ-NEW | New User FAQ (top 15 questions). Extract the 15 most common new-user questions from the 3,000-line FAQ into a prominent "New User FAQ" section at the top. Keyword-rich headings for searchability. Link to deep FAQ for details. | 2–3h | docs/FAQ.md |
| DOC-VERIFY | Post-install verification checklist. SQL script that verifies: extension loaded, shared_preload_libraries configured, GUCs set, CDC triggers installable, first stream table creates and refreshes successfully. Runnable as psql -f verify_install.sql. | 2–4h | docs/GETTING_STARTED.md |
| DOC-STUBS | Fill or remove research stubs. PG_IVM_COMPARISON.md (60 bytes) and CUSTOM_SQL_SYNTAX.md (57 bytes) are empty stubs. Either flesh them out (PG_IVM_COMPARISON can draw from the existing comparison data) or remove from SUMMARY.md. | 2–4h | docs/research/ |
DOC-POLISH subtotal: ~2–3 days
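A few of the checks the DOC-VERIFY script might run can be sketched directly; each query should return true on a healthy install. The exact check set is an assumption based on the item description above.

```sql
-- Illustrative verify_install.sql fragment (run via psql -f verify_install.sql)
SELECT current_setting('shared_preload_libraries') LIKE '%pg_trickle%' AS preload_ok;
SELECT count(*) = 1 AS extension_ok
FROM pg_extension WHERE extname = 'pg_trickle';
-- second argument "true" returns NULL instead of erroring if the GUC is absent
SELECT current_setting('pg_trickle.refresh_strategy', true) IS NOT NULL AS gucs_ok;
```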
v0.17.0 total: ~2–3 weeks (cost-based strategy) + ~3–4 weeks (columnar tracking) + ~32–48 hours (TIVM Phase 4) + ~1–2 days (ROWS FROM) + ~2–3 weeks (SQLancer) + ~2–3 weeks (incremental DAG) + ~4–8 hours (unsafe reduction) + ~1–2 weeks (api.rs modularization) + ~2–3 days (pg_ivm migration) + ~3–5 days (failure runbook) + ~2–3 days (Docker playground) + ~2–3 days (doc polish)
Exit criteria:
- B-4: Cost-based strategy selector trained on per-ST history; cold-start fallback to fixed threshold; QueryComplexityClass cost model (scan/filter/aggregate/join/join_agg); refresh_strategy + cost_model_safety_margin GUCs; pre-refresh predictive comparison; 10 unit tests
- A-2-COL: CDC trigger emits changed_cols VARBIT bitmask (COL-1); delta-scan filters irrelevant rows via changed_cols & mask (COL-2); aggregate value-only correction 'V' path halves row volume (COL-3)
- [ ] A2-ENR: 🚫 Deferred post-1.0 — requires raw pg_sys ENR tuplestore FFI (memory-corruption risk); revisit after 1.0 stabilisation
- [ ] A2-CTR: 🚫 Deferred post-1.0 — requires raw CreateTrigger() C FFI (memory-corruption risk); revisit after 1.0 stabilisation
- A2-PS: ✅ Already shipped — pg_trickle.use_prepared_statements GUC (default true) wired in refresh.rs; parse/plan overhead eliminated on steady-state workloads
- A8: ROWS FROM() with multiple SRFs accepted in defining queries; E2E tests cover INSERT/UPDATE/DELETE propagation
- SQLANCER: ✅ SQLANCER-1/2 crash + equivalence oracles shipped in v0.12.0; SQLANCER-3 diff-vs-full oracle and SQLANCER-4 stateful DML soak (10K mutations) added in v0.17.0; weekly-sqlancer-stateful CI job wired
- C-2: Incremental DAG rebuild reduces DDL-triggered latency spike to < 5ms at 100+ STs; ring buffer overflow falls back to full rebuild; no correctness regressions
- UNSAFE-R1/R2: Unsafe block count reduced by 249 (690→441 in parser); is_node_type! and pg_deref! macros; all 1,700 unit tests pass
- API-MOD: api.rs split into 3 sub-modules (mod.rs 5,624 + diagnostics.rs 1,377 + helpers.rs 2,461); zero behavior change; all 1,700 unit tests pass
- MIG-IVM: docs/tutorials/MIGRATING_FROM_PG_IVM.md published with step-by-step migration, API mapping, behavioral differences, SQL upgrade examples, and verification checklist
- RUNBOOK: docs/TROUBLESHOOTING.md covers 13 failure scenarios (scheduler, SUSPENDED, CDC triggers, WAL slots, INITIALIZING, buffer growth, lock contention, OOM, disk full, circular convergence, schema changes, worker pool, fuse) with symptoms, diagnosis, and resolution
- PLAYGROUND: playground/ with docker-compose.yml, seed.sql (3 base tables, 5 stream tables), and README walkthrough
- DOC-HELLO: Chapter 1 "Hello World" in GETTING_STARTED already provides the single-table aggregate example (products/category_summary)
- DOC-DECIDE: Refresh mode decision guide already published as tutorials/tuning-refresh-mode.md with recommend_refresh_mode() and signal breakdown
- DOC-FAQ-NEW: New User FAQ section with 15 keyword-rich entries added at top of FAQ.md
- DOC-VERIFY: scripts/verify_install.sql checks shared_preload_libraries, extension, scheduler, GUCs, and runs an end-to-end stream table cycle
- DOC-STUBS: Research stubs already use {{#include}} directives pointing to substantial content (923 + 1232 lines)
- Extension upgrade path tested (0.16.0 → 0.17.0)
v0.18.0 — Hardening & Delta Performance
Status: Released (2026-04-12).
Release Theme: This release hardens pg_trickle for production at scale and delivers the biggest remaining performance win in the differential refresh path. The Z-set multi-source delta engine merges per-source delta branches into a single GROUP BY + SUM(weight) query, eliminating redundant join evaluation when multiple source tables change in the same cycle. Cross-source snapshot consistency guarantees that multi-source stream tables always read all upstream tables at the same transaction boundary — closing the last known correctness gap. Every production-path .unwrap() is replaced with graceful error propagation, another ~69 unsafe blocks are eliminated, and a populated TPC-H baseline turns the 22-query suite into a true regression canary. SQLancer fuzzing integration provides an external, assumption-free correctness oracle. Together, these changes build the confidence foundation for 1.0.
Completed items (click to expand)
Correctness
| ID | Title | Effort | Priority |
|---|---|---|---|
| CORR-1 | Enforce cross-source snapshot consistency | L | P0 |
| CORR-2 | Populate TPC-H expected-output regression guard | XS | P0 |
| CORR-3 | NULL-safe GROUP BY elimination under deletes | S | P1 |
| CORR-4 | Z-set merged-delta weight accounting proof | M | P0 |
| CORR-5 | HAVING-filtered aggregate correction under group depletion | S | P1 |
CORR-1 — Enforce cross-source snapshot consistency (CSS-3)
In plain terms: When a stream table reads from two different source tables, there is a window where it can see source A at a newer point in time than source B — for example, seeing a new order but the old inventory count. Phase 3 completes the tick-watermark enforcement so both sources are always read at the same consistent LSN before any refresh proceeds. Phases 1 and 2 are already complete.
| Item | Description | Effort | Ref |
|---|---|---|---|
| CSS-3-1 | LSN watermark enforcement in the scheduler — hold refresh until all upstream sources reach the same tick boundary | 4–6h | PLAN_CROSS_SOURCE_SNAPSHOT_CONSISTENCY.md §Phase 3 |
| CSS-3-2 | Catalog column pgt_css_watermark_lsn + GUC pg_trickle.cross_source_consistency (default off) | 2–3h | — |
| CSS-3-3 | E2E test: concurrent writes to two sources, assert stream table never sees a split snapshot | 2–3h | — |
CSS-3 subtotal: ~8–12 hours Dependencies: None. Schema change: Yes.
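The watermark gate CSS-3-1 describes could take roughly this shape: before a multi-source refresh, the scheduler checks that every upstream source has been decoded up to the same tick boundary. The table and column names here are purely illustrative assumptions.

```sql
-- Hypothetical pre-refresh gate for stream table 42: refresh proceeds only
-- when all upstream sources have reached the shared tick LSN.
SELECT bool_and(s.decoded_upto_lsn >= t.tick_boundary_lsn) AS safe_to_refresh
FROM pgt_source_progress s
JOIN pgt_tick t ON t.pgt_id = s.pgt_id
WHERE s.pgt_id = 42;
```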
CORR-2 — Populate TPC-H expected-output regression guard (TPCH-BASE)
In plain terms: The TPC-H correctness tests run all 22 queries but the expected-output comparison guard was never populated — so the tests catch structural failures but not quiet result regressions. Populating the baseline turns the suite into a true correctness canary.
| Item | Description | Effort |
|---|---|---|
| TPCH-BASE-1 | Run TPC-H suite once at known-good state; capture output | 30min |
| TPCH-BASE-2 | Populate comparison baseline in e2e_tpch_tests.rs line 89 (remove TODO); verify guard fires on a deliberate regression | 1h |
TPCH-BASE subtotal: ~1–2 hours Dependencies: None. Schema change: No.
CORR-3 — NULL-safe GROUP BY elimination under deletes
In plain terms: When all rows in a GROUP BY group are deleted and the grouping key contains NULLs, the differential engine must correctly remove the group. SQL's three-valued logic in IS DISTINCT FROM may cause delta weight miscounting for NULL keys.
Verify: E2E test with GROUP BY nullable_col, delete all group members,
assert zero rows remain in the stream table.
Dependencies: None. Schema change: No.
CORR-4 — Z-set merged-delta weight accounting proof
In plain terms: Companion correctness gate for PERF-1 (B3-MERGE). The Z-set algebra requires that SUM(weight) across all merged branches for every primary key never produces a spurious net positive or net negative for a single join path.
Verify: property-based tests (proptest) asserting merged_weights == individual_branch_sums for randomly generated multi-source DAGs. All
existing B3-3 diamond-flow tests must pass unchanged.
Dependencies: PERF-1. Schema change: No.
CORR-5 — HAVING-filtered aggregate correction under group depletion
In plain terms: When a HAVING-qualified group loses enough rows to no longer satisfy the predicate (e.g., HAVING count(*) > 5 and 3 of 6 rows are deleted), the differential aggregate path must delete the stream table row rather than leaving a stale row matching the old HAVING predicate.
Verify: E2E test with HAVING + selective deletes crossing the threshold. Dependencies: None. Schema change: No.
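The threshold-crossing scenario, assuming a source table events(grp, id) and a stream table defined as SELECT grp, count(*) AS n FROM events GROUP BY grp HAVING count(*) > 5 (names illustrative):

```sql
INSERT INTO events (grp, id) SELECT 1, g FROM generate_series(1, 6) g;
-- after refresh: group 1 qualifies (n = 6) and appears in the stream table
DELETE FROM events WHERE grp = 1 AND id <= 3;
-- after refresh: n = 3 no longer satisfies HAVING; the differential path
-- must emit a delete for group 1, not an update to n = 3
```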
Stability
| ID | Title | Effort | Priority |
|---|---|---|---|
| STAB-1 | Eliminate production-path .unwrap() calls | S | P0 |
| STAB-2 | unsafe block reduction Phase 1 | M | P1 |
| STAB-3 | Spill detection alerting | S | P1 |
| STAB-4 | Parallel worker orphaned resource cleanup | M | P1 |
| STAB-5 | Upgrade migration test (0.17→0.18) | S | P0 |
| STAB-6 | Error SQLSTATE coverage audit | S | P2 |
STAB-1 — Eliminate production-path .unwrap() calls (SAFE-1)
In plain terms: A small number of SQL-parsing code paths in production (non-test) code call .unwrap() directly — if they encounter unexpected input they will panic the backend process and disconnect all clients. These should propagate errors gracefully instead.
| Item | Description | Effort |
|---|---|---|
| SAFE-1-1 | detect_and_strip_distinct() call in api.rs (L8163) → propagate PgTrickleError | 1h |
| SAFE-1-2 | find_top_level_keyword(sql, "FROM") calls in api.rs (L8229–8258, 3×) → propagate error | 1h |
| SAFE-1-3 | merge_sql[using_start.unwrap()..using_end.unwrap()] in refresh.rs (L6236) → bounds-check | 1h |
| SAFE-1-4 | entry.unwrap() in delta computation loop in refresh.rs (L5992) → return Err | 1h |
| SAFE-1-5 | Chained .unwrap().unwrap() in refresh.rs (L6556–6557) → propagate | 1h |
SAFE-1 subtotal: ~4–6 hours Dependencies: None. Schema change: No.
STAB-2 — unsafe block reduction Phase 1 (UNSAFE-P1)
In plain terms: The DVM parser has 1,286 unsafe blocks — 98% of the total. Phase 1 introduces a single pg_cstr_to_str() safe helper that eliminates ~69 of the most mechanical ones: C-string-to-Rust conversions. No API or behavior change; pure safety improvement.
| Item | Description | Effort | Ref |
|---|---|---|---|
| UNSAFE-P1-1 | Implement pg_cstr_to_str(ptr: *const c_char) -> &str safe wrapper in src/dvm/parser/mod.rs | 1h | PLAN_REDUCED_UNSAFE.md §Phase 1 |
| UNSAFE-P1-2 | Replace ~69 unsafe { CStr::from_ptr(...).to_str()... } call-sites with the safe helper | 4–6h | — |
| UNSAFE-P1-3 | unsafe_inventory.sh baseline update + CI check | 1h | scripts/unsafe_inventory.sh |
UNSAFE-P1 subtotal: ~6–8 hours Dependencies: None. Schema change: No.
STAB-3 — Spill detection alerting (PH-E2)
In plain terms: The GUCs pg_trickle.spill_threshold_blocks and pg_trickle.spill_consecutive_limit already exist to configure spill budgets, but no alert fires when a refresh actually spills to disk. This adds an AlertEvent::SpillThresholdExceeded notification so operators know when large delta queries are hitting disk.
| Item | Description | Effort |
|---|---|---|
| PH-E2-1 | Add AlertEvent::SpillThresholdExceeded variant to src/monitor.rs | 1h |
| PH-E2-2 | Detect spill after MERGE execution; emit alert when consecutive count exceeds limit | 2–3h |
| PH-E2-3 | E2E test: configure low spill threshold, trigger spill, assert alert fires | 1–2h |
PH-E2 subtotal: ~4–6 hours Dependencies: None. Schema change: No.
STAB-4 — Parallel worker orphaned resource cleanup
In plain terms: After a parallel worker panics mid-refresh, advisory locks, __pgt_delta_* temp tables, and partially written change buffer rows may be left behind. The scheduler recovery path must clean these up.
Audit the recovery path to ensure: (a) advisory locks are released on next
scheduler tick, (b) temp tables are cleaned up, (c) change buffer rows are
not double-counted on retry. Verify: E2E test simulating worker crash via
pg_terminate_backend() followed by successful recovery.
Dependencies: None. Schema change: No.
STAB-5 — Upgrade migration test (0.17→0.18)
Extend the upgrade E2E test framework to cover the 0.17.0→0.18.0 migration path and the three-version chain 0.16→0.17→0.18. Verify: catalog column additions, new function signatures, existing stream tables survive, refresh continues working post-upgrade. Dependencies: all schema-changing items (CORR-1). Schema change: No.
STAB-6 — Error SQLSTATE coverage audit
Audit all ereport!() and error!() calls for SQLSTATE classification.
Ensure every user-facing error has a unique, documented SQLSTATE code that
connection poolers and application retry logic can pattern-match. Cross-
reference with docs/ERRORS.md for completeness.
Dependencies: None. Schema change: No.
Performance
| ID | Title | Effort | Priority |
|---|---|---|---|
| PERF-1 | Z-set multi-source delta engine | L | P0 |
| PERF-2 | Cost-based refresh strategy completion | L | P1 |
| PERF-3 | Zero-change source branch elision | M | P1 |
| PERF-4 | Columnar change tracking Phase 1 — CDC bitmask | L | P1 |
| PERF-5 | Index hint generation for MERGE target | S | P2 |
PERF-1 — Z-set multi-source delta engine (B3-MERGE)
In plain terms: When a stream table joins multiple tables and more than one of those tables receives changes in the same scheduler cycle, the current engine generates one delta branch per source and stacks them in a UNION ALL. With this change those branches are merged into a single GROUP BY + SUM(weight) query using Z-set algebra, eliminating duplicate evaluation of shared join paths. B3-1 (branch pruning) and B3-3 (correctness proofs) are already done; this is the final payoff.
| Item | Description | Effort | Ref |
|---|---|---|---|
| B3-2-1 | Z-set merged-delta generation in src/dvm/diff.rs (DiffEngine::diff_node()) | 8–10h | PLAN_MULTI_TABLE_DELTA_BATCHING.md |
| B3-2-2 | Unit + property-based tests (existing B3-3 diamond-flow tests must pass unchanged) | 2–4h | — |
| B3-2-3 | Benchmark regression check against Part-8 baseline | 2h | — |
B3-MERGE subtotal: ~12–16 hours Dependencies: CORR-4 (property tests must accompany). Schema change: No.
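A schematic of the merged delta for a two-source join stream table over orders ⋈ items. Each branch produces (key, weight) rows in the classic bilinear expansion; merging collapses them into one Z-set aggregation before the MERGE step. All table and column names are illustrative, and weight propagation through the join is simplified.

```sql
-- After B3-MERGE: one aggregation instead of executing each UNION ALL
-- branch independently against the target.
SELECT o_id, item, SUM(weight) AS weight
FROM (
  SELECT d.o_id, i.item, d.weight            -- Δorders ⋈ items
  FROM delta_orders d JOIN items i ON i.id = d.item_id
  UNION ALL
  SELECT o.o_id, di.item, di.weight          -- orders' ⋈ Δitems
  FROM orders o JOIN delta_items di ON di.id = o.item_id
) branches
GROUP BY o_id, item
HAVING SUM(weight) <> 0;   -- rows whose net weight cancels never reach the MERGE
```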
PERF-2 — Cost-based refresh strategy completion (B-4 remainder)
In plain terms: Deferred from v0.17.0. The refresh_strategy GUC landed in the current cycle. The remaining work is the per-ST cost model: collect delta_row_count, merge_duration_ms, full_refresh_duration_ms from pgt_refresh_history; fit a simple linear cost model; a cold-start heuristic (<10 refreshes) falls back to the fixed threshold.
Verify: mixed-workload benchmark showing the model picks the cheaper strategy ≥80% of the time. Dependencies: B-4 Phase 1 (shipped). Schema change: No.
PERF-3 — Zero-change source branch elision
In plain terms: When building a multi-source delta query, skip branches entirely for sources with empty change buffers. Currently all branches are generated and executed regardless of whether a source has changes.
Verify: benchmark showing latency reduction when 1-of-3 sources changes vs. all 3 changing. Dependencies: PERF-1 (applies to the merged delta builder). Schema change: No.
PERF-4 — Columnar change tracking Phase 1 — CDC bitmask (A-2-COL-1)
In plain terms: Deferred from v0.17.0. Compute a changed_columns bitmask (old.col IS DISTINCT FROM new.col) in the CDC trigger; store as int8 or bit(n) alongside the change row. Phase 1 only: bitmask computation + storage. Phase 2 (delta-scan filtering using the bitmask) is deferred to v0.22.0. Provides the foundation for a 50–90% delta volume reduction on wide-table UPDATE workloads.
Gate behind pg_trickle.columnar_tracking GUC (default off).
Dependencies: None. Schema change: Yes (change buffer schema addition).
PERF-5 — Index hint generation for MERGE target
In plain terms: When the stream table has a covering index on the MERGE join keys, bias the planner toward the index to avoid expensive sequential scans during delta application on large stream tables.
Emit SET enable_seqscan = off within the MERGE statement's session.
Verify: EXPLAIN ANALYZE shows index scan on MERGE for tables with PK index.
Dependencies: None. Schema change: No.
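The planner bias can be scoped tightly with SET LOCAL, which confines the setting to the surrounding transaction rather than the whole session. The delta-table name and column set below are illustrative assumptions.

```sql
BEGIN;
SET LOCAL enable_seqscan = off;   -- bias toward the PK/covering index; reverts at COMMIT
MERGE INTO active_orders t
USING __pgt_delta d ON t.id = d.id
WHEN MATCHED AND d.weight < 0 THEN DELETE
WHEN MATCHED THEN UPDATE SET status = d.status
WHEN NOT MATCHED THEN INSERT (id, status) VALUES (d.id, d.status);
COMMIT;
```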
Scalability
| ID | Title | Effort | Priority |
|---|---|---|---|
| SCAL-1 | Change buffer growth stress test at 10× write rate | M | P1 |
| SCAL-2 | Parallel worker utilization profiling at 200+ STs | M | P2 |
| SCAL-3 | Delta working-set memory cap | M | P2 |
SCAL-1 — Change buffer growth stress test at 10× write rate
Run a sustained write load at 10× normal throughput for 30+ minutes with
intentionally slow refresh intervals. Verify the max_buffer_rows cap
triggers correctly, FULL refresh clears the backlog, no disk exhaustion
occurs, and the extension recovers cleanly once write rate normalizes. This
validates the v0.16.0 buffer growth protection under extreme conditions.
Dependencies: None. Schema change: No.
SCAL-2 — Parallel worker utilization profiling at 200+ STs
Profile the scheduler with 200+ stream tables across
pg_trickle.max_workers = 4/8/16 settings. Measure: CPU utilization per
worker, scheduling queue depth, per-ST refresh latency P50/P99. Identify
whether the scheduling loop itself becomes a bottleneck before worker
saturation. Document findings as a scaling guide section.
Dependencies: None. Schema change: No.
SCAL-3 — Delta working-set memory cap
The current delta merge can allocate unbounded work_mem for hash joins. Add
a configurable cap (pg_trickle.delta_work_mem_mb, default: 256 MB) that
triggers FULL refresh fallback when the delta working set would exceed the
limit, preventing OOM on unexpectedly large deltas. Verify: E2E test with low
cap triggers fallback and logs a warning.
Dependencies: None. Schema change: No.
Ease of Use
| ID | Title | Effort | Priority |
|---|---|---|---|
| UX-1 | Template cache observability | S | P1 |
| UX-2 | Pre-built Grafana dashboard panels | M | P1 |
| UX-3 | Error message actionability audit | S | P1 |
| UX-4 | Single-endpoint health summary function | S | P2 |
| UX-5 | Prometheus metric completeness audit | XS | P2 |
| UX-6 | TUI surfaces for cache_stats and health_summary | XS | P2 |
UX-1 — Template cache observability (CACHE-OBS)
In plain terms: The delta SQL template cache (IVM_DELTA_CACHE) saves regenerating delta queries on every refresh cycle, but its hit rate is invisible to operators. Adding pgtrickle.cache_stats() lets you see whether the cache is effective and tune pg_trickle.ivm_cache_size accordingly.
| Item | Description | Effort |
|---|---|---|
| CACHE-OBS-1 | Add hit/miss/eviction counters to IVM_DELTA_CACHE | 1h |
| CACHE-OBS-2 | Expose via pgtrickle.cache_stats() returning (hits BIGINT, misses BIGINT, evictions BIGINT, size INT) | 1–2h |
| CACHE-OBS-3 | Documentation and E2E smoke test | 1h |
CACHE-OBS subtotal: ~3–4 hours Dependencies: None. Schema change: No.
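Given the signature proposed in CACHE-OBS-2, an operator could derive a hit rate directly; the 0.9 tuning threshold in the comment is an illustrative rule of thumb, not a shipped default.

```sql
SELECT hits::numeric / NULLIF(hits + misses, 0) AS hit_rate,
       evictions,
       size
FROM pgtrickle.cache_stats();
-- a hit_rate well below 0.9 with nonzero evictions suggests raising
-- pg_trickle.ivm_cache_size
```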
UX-2 — Pre-built Grafana dashboard panels
Extend monitoring/grafana/ with import-ready JSON panels for: refresh
latency P50/P99 histogram, differential vs. FULL refresh ratio over time,
change buffer backlog per stream table, spill event count, template cache hit
rate, and worker utilization gauge. Document import instructions in
monitoring/README.md.
Dependencies: UX-1 (cache stats metric), STAB-3 (spill events). Schema change: No.
UX-3 — Error message actionability audit
Audit all PgTrickleError variants and ereport!()/error!() calls. Ensure
every user-facing error includes: the stream table name (when applicable), the
operation that failed, and a 1-sentence remediation hint. Cross-reference with
docs/ERRORS.md; add missing entries.
Dependencies: None. Schema change: No.
UX-4 — Single-endpoint health summary function
New pgtrickle.health_summary() function returning a single-row JSONB:
total STs, healthy/degraded/error counts, oldest un-refreshed age, largest
buffer backlog, fuse status, scheduler state. Useful for monitoring
integrations (Nagios, Datadog) without parsing multiple views.
Dependencies: None. Schema change: No.
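A monitoring probe against the proposed single-row JSONB might look like this; the key names are assumptions derived from the field list above, not a finalized schema.

```sql
SELECT s ->> 'scheduler_state'    AS scheduler_state,
       (s ->> 'error_count')::int AS error_sts
FROM pgtrickle.health_summary() AS s;
-- a Nagios/Datadog check can alert on error_sts > 0
-- or scheduler_state <> 'running'
```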
UX-5 — Prometheus metric completeness audit
Verify every metric emitted by the extension matches the documented name in
docs/CONFIGURATION.md §Prometheus. Remove undocumented metrics or add
documentation. Ensure metric names follow Prometheus naming conventions
(pgtrickle_* prefix, snake_case, unit suffix).
Dependencies: None. Schema change: No.
UX-6 — TUI surfaces for cache_stats and health_summary
In plain terms: The new pgtrickle.cache_stats() (UX-1) and pgtrickle.health_summary() (UX-4) functions are useful in isolation but are most discoverable when surfaced in the TUI. Even a read-only status panel showing total STs, healthy/degraded/error counts, cache hit rate, and scheduler state would make these endpoints visible to users who reach the extension through pgtrickle-tui rather than raw SQL. Audit pgtrickle-tui/src/ to identify the lightest-weight integration point (likely a new "Health" tab or an expanded "Status" panel). If TUI changes are out of scope for this release, document the gap in docs/TUI.md so it is not silently deferred.
Verify: TUI displays non-zero cache stats and a valid health JSONB row after at least one refresh cycle in the E2E playground environment. Dependencies: UX-1, UX-4. Schema change: No.
Test Coverage
| ID | Title | Effort | Priority |
|---|---|---|---|
| TEST-1 | TPC-H regression baseline | XS | P0 |
| TEST-2 | SQLancer fuzzing — crash-test oracle | L | P1 |
| TEST-3 | CDC edge cases: NULL PKs, composite PKs, generated columns | M | P1 |
| TEST-4 | Property-based tests for Z-set merged delta | M | P0 |
| TEST-5 | Light E2E eligibility audit | S | P2 |
| TEST-6 | Three-version upgrade chain test (0.16→0.17→0.18) | S | P0 |
| TEST-7 | dbt integration regression coverage | S | P1 |
TEST-1 — TPC-H regression baseline (TPCH-BASE) Same as CORR-2. Capture known-good outputs; verify guard fires on deliberate regression. Dependencies: None. Schema change: No.
TEST-2 — SQLancer fuzzing — crash-test oracle
In plain terms: Deferred from v0.17.0 (second time). Scope reduced to crash-test oracle only for v0.18.0: SQLancer in Docker, configured to feed randomized SQL to the parser and DVM pipeline. Zero-panic guarantee — any input that crashes the extension is a bug. Equivalence oracle (DIFFERENTIAL ≡ FULL) and stateful DML fuzzing deferred to v0.22.0.
Verify: 10K+ fuzzed queries with zero panics. Dependencies: None. Schema change: No.
TEST-3 — CDC edge cases: NULL PKs, composite PKs, generated columns
Create E2E tests covering: (a) tables with nullable PK columns in
differential mode, (b) composite PKs with 3+ columns, (c) GENERATED ALWAYS AS stored columns as source columns, (d) domain-typed columns, (e)
array-typed columns referenced in defining queries.
Dependencies: None. Schema change: No.
TEST-4 — Property-based tests for Z-set merged delta Required companion to PERF-1. proptest-based tests generating random multi-source DAGs (2–5 sources, 1–3 join levels) with random DML sequences. Assert merged delta produces identical stream table state as sequential per-branch application. Detect weight-accounting bugs before they ship. Dependencies: PERF-1. Schema change: No.
TEST-5 — Light E2E eligibility audit
Review all 10 full E2E test files (~90 tests). Identify tests that don't
require custom Docker image features (custom extensions, special
configurations) and can run on the stock postgres:18.3 image. Migrate
eligible tests to reduce CI wall-clock time on PRs.
Dependencies: None. Schema change: No.
TEST-6 — Three-version upgrade chain test (0.16→0.17→0.18) Extend upgrade E2E tests to cover: fresh install of 0.16.0, create stream tables, upgrade to 0.17.0, verify survival, upgrade to 0.18.0, verify survival + new features functional. Dependencies: All schema-changing items. Schema change: No.
TEST-7 — dbt integration regression coverage
In plain terms: The dbt-pgtrickle macro package is the primary adoption vector for teams using dbt, but the integration test suite in dbt-pgtrickle/integration_tests/ currently verifies only happy-path macro expansion. Add regression tests covering: (a) the pgtrickle_stream_table macro with all supported materialisation strategies (differential, full, auto), (b) incremental model compatibility, (c) the pgtrickle_status test macro, (d) teardown and recreation idempotency (drop + re-run produces identical output). Run as part of just test-dbt.
Verify: just test-dbt passes all new cases; idempotency test confirms
identical stream table contents after a full dbt run --full-refresh cycle.
Dependencies: None. Schema change: No.
Conflicts & Risks
-
PERF-1 + CORR-4 + TEST-4 form a mandatory cluster. The Z-set multi-source delta engine (B3-MERGE) is the highest-impact performance item but also touches the DVM engine core. Property-based tests (TEST-4) and the weight accounting proof (CORR-4) are not optional — they must ship alongside PERF-1 to prevent correctness regressions.
-
Two schema changes. CORR-1 (CSS-3) adds pgt_css_watermark_lsn to the catalog. PERF-4 (A-2-COL-1) adds changed_columns to change buffer tables. Both require upgrade migration scripts and freeze-risk coordination. Consider batching both into a single migration file.
-
PERF-3 depends on PERF-1. Zero-change branch elision modifies the same delta query builder as B3-MERGE. Sequence PERF-3 strictly after PERF-1 to avoid merge conflicts and compound risk.
-
TEST-2 (SQLancer) is deferred for the second time. Originally planned for v0.17.0, it remains unstarted. v0.18.0 scopes it to crash-test oracle only (L effort instead of XL), but there is a risk of perpetual deferral. If capacity is tight, prioritize the crash-test oracle as a standalone deliverable rather than deferring the full suite again.
-
PERF-2 (cost model) requires production history data. The per-ST cost model trains on pgt_refresh_history. Users upgrading from v0.17.0 will have a cold history cache. The cold-start heuristic (< 10 refreshes) is critical — test it explicitly.
-
PERF-4 (columnar tracking) changes CDC trigger output. The changed_columns bitmask adds overhead to every trigger invocation. Gate behind a GUC (default off) and benchmark the per-row overhead (< 1μs target) before enabling by default in a later release.
-
B-4 and A-2-COL are carry-overs from v0.17.0. Both were originally scoped for v0.17.0 but not started. They are re-proposed here with reduced scope (B-4 cost model only, A-2-COL Phase 1 bitmask only). If v0.17.0 ships B-4 partially, adjust PERF-2 scope accordingly.
v0.18.0 total: ~70–100 hours
Exit criteria:
- CORR-1: Split-snapshot E2E test passes under concurrent writes; pgt_css_watermark_lsn column added
- CORR-2 / TEST-1: TPC-H baseline populated; deliberate regression detected by the guard
- CORR-3: NULL-keyed GROUP BY group fully removed after all-row delete
- CORR-4 / TEST-4: Property-based Z-set weight tests pass for randomly generated multi-source DAGs
- CORR-5: HAVING-qualified group deleted from stream table when row count drops below threshold
- STAB-1: All production-path unwrap() calls in api.rs and refresh.rs replaced with proper error propagation
- STAB-2: unsafe_inventory.sh reports ≥69 fewer unsafe blocks; CI baseline updated
- STAB-3: Spill alert fires in E2E test with artificially low threshold
- STAB-4: Worker crash recovery E2E test cleans up advisory locks, temp tables, and buffer rows
- STAB-5 / TEST-6: Three-version upgrade chain (0.16→0.17→0.18) passes
- STAB-6: All user-facing errors have documented SQLSTATE codes in docs/ERRORS.md
- PERF-1: Merged multi-source delta implemented; all B3-3 diamond-flow property tests pass unchanged
- PERF-2: Cost model picks cheaper strategy ≥80% of the time on mixed workload benchmark
- PERF-3: Zero-change branch elision shows measurable latency reduction in multi-source benchmark
- PERF-4: changed_columns bitmask stored in change buffer; per-row overhead < 1μs
- PERF-5: Index scan confirmed via EXPLAIN ANALYZE for MERGE on tables with PK covering index
- SCAL-1: Buffer growth stress test at 10× rate completes without disk exhaustion or data loss
- SCAL-2: Profiling report for 200+ STs documented
- SCAL-3: Delta work_mem cap triggers FULL fallback in E2E test
- UX-1: pgtrickle.cache_stats() returns correct counters in smoke test
- UX-2: Grafana dashboard JSON importable; documents refresh latency, buffer backlog, spill events
- UX-3: Error message audit complete; all errors include table name and remediation hint
- UX-4: pgtrickle.health_summary() returns single-row JSONB with correct counts
- UX-5: Prometheus metric names match documentation; no undocumented metrics
- TEST-2: SQLancer crash-test oracle runs 10K+ fuzzed queries with zero panics
- TEST-3: CDC edge case tests cover NULL PKs, composite PKs, generated columns, domain types, arrays
- TEST-5: At least 10 tests migrated from full E2E to light E2E
- TEST-7: dbt regression suite covers all macro strategies and teardown idempotency; just test-dbt passes
- UX-6: TUI (or docs/TUI.md gap note) reflects cache_stats() and health_summary() availability
- Extension upgrade path tested (0.17.0 → 0.18.0)
- just check-version-sync passes
v0.19.0 — Production Gap Closure & Distribution
Status: Released (2026-04-13).
Release Theme
This release closes the most impactful correctness, security, stability, and performance gaps identified in the Phase 7 deep-dive and subsequent audits that v0.18.0 did not address. It removes the unsafe delete_insert merge strategy, adds ownership checks to all DDL-like API functions, hardens the WAL decoder path before it is promoted to production-ready, eliminates O(n²) scheduler dispatch overhead, and ships pg_trickle on standard package registries for the first time. The JOIN delta R₀ fix for simultaneous key-change + right-side delete is the highest-value correctness improvement remaining before 1.0. CDC ordering guarantees, parallel worker crash recovery, delta branch pruning for zero-change sources, and an index-aware MERGE path round out a release that strengthens every layer of the stack. Four to five weeks of focused work delivers measurable correctness improvements, privilege enforcement, catalog index optimizations, a PgBouncer transaction-mode compatibility fix, read-replica safety, and PGXN/apt/rpm distribution.
Completed items (click to expand)
Correctness
| ID | Title | Effort | Priority |
|---|---|---|---|
| CORR-1 | Remove unsafe delete_insert merge strategy | XS | P0 |
| CORR-2 | JOIN delta R₀ fix — key change + right-side delete | M | P1 |
| CORR-3 | Track ALTER TYPE / ALTER DOMAIN DDL events | S | P1 |
| CORR-4 | Track ALTER POLICY DDL events for RLS source tables | S | P1 |
| CORR-5 | Fix keyless content-hash collision on identical-content rows | S | P1 |
| CORR-6 | Harden guarded .unwrap() calls in DVM operators | XS | P2 |
| CORR-7 | TRUNCATE + INSERT CDC ordering guarantee | S | P1 |
| CORR-8 | NULL join-key delta handling for INNER/OUTER joins | S | P1 |
CORR-1 — Remove unsafe delete_insert merge strategy
In plain terms: The delete_insert strategy (set via pg_trickle.merge_join_strategy = 'delete_insert') is semantically unsafe for aggregate and DISTINCT queries because the DELETE half executes against already-mutated state, producing phantom deletes. It is slower than standard MERGE for small deltas and incompatible with prepared statements. The auto strategy already covers its only legitimate use case.
| Item | Description | Effort |
|---|---|---|
| CORR-1-1 | Remove delete_insert as a valid enum value; emit ERROR if set with hint to use 'auto'. | XS |
| CORR-1-2 | Add upgrade SQL to detect old GUC value and log a NOTICE. | XS |
Verify: SET pg_trickle.merge_join_strategy = 'delete_insert' raises ERROR
with actionable hint. All existing benchmarks pass.
Dependencies: None. Schema change: No.
CORR-2 — JOIN delta R₀ fix for simultaneous key-change + right-side delete
In plain terms: When a row's join key column is updated (UPDATE orders SET cust_id = 5 WHERE cust_id = 3) in the same refresh cycle as the old join partner (customer 3) is deleted, the DELETE half of the delta finds no match in current_right and is silently dropped, leaving a stale row in the stream table until the next full refresh. The fix applies the R₀ snapshot technique (pre-change right-side state via EXCEPT ALL) symmetrically with the existing L₀ already implemented for Part 2 of the delta. build_snapshot_sql() in join_common.rs already exists.
| Item | Description | Effort |
|---|---|---|
| CORR-2-1 | Add right_part1_source / use_r0 logic mirroring use_l0 in diff_inner_join, diff_left_join, diff_full_join. | M |
| CORR-2-2 | Split Part 1 SQL into two UNION ALL arms for the use_r0 case; update row ID hashing for Part 1b. | M |
| CORR-2-3 | Integration tests: co-delete scenario, UPDATE-then-delete, multi-cycle correctness, TPC-H Q07 regression. | M |
Verify: E2E test where UPDATE orders SET cust_id = new_id and
DELETE FROM customers WHERE id = old_id land in the same refresh cycle produces
correct stream table result without a forced full refresh.
Dependencies: EC-01 R₀ EXCEPT ALL pattern (shipped in v0.15.0). Schema change: No.
CORR-3 — Track ALTER TYPE / ALTER DOMAIN DDL events
In plain terms: When a user-defined type or domain used by a source table column is altered (e.g., extending an enum, changing a domain constraint), the DDL event trigger fires but hooks.rs does not classify it as requiring downstream stream table invalidation. Fix: extend the DDL classifier to catch ALTER TYPE and ALTER DOMAIN and trigger cascade invalidation.
Verify: ALTER TYPE my_enum ADD VALUE 'new_val' on a type used by a source
column triggers the marked-for-reinit flag on dependent stream tables.
Dependencies: None. Schema change: No.
CORR-4 — Track ALTER POLICY DDL events for RLS source tables
In plain terms: If an ALTER POLICY changes the USING expression on a source table, stream tables may silently return wrong results for sessions with active RLS. Fix: detect ALTER POLICY in the DDL classifier and mark dependent stream tables for conservative reinit.
Verify: ALTER POLICY on a source table with dependent stream tables triggers
invalidation. E2E test with RLS policy change confirms correct reinitialization.
Dependencies: None. Schema change: No.
CORR-5 — Fix keyless content-hash collision on identical-content rows
In plain terms: The keyless table path uses a content hash to identify rows. If two rows have completely identical content, they hash to the same bucket. Under concurrent INSERT + DELETE of identical rows, the net-counting approach may attribute a delete to the wrong "copy" of the row, leaving incorrect counts. Fix: incorporate the change buffer's (lsn, op_index) pair into the hash to break ties between otherwise-identical rows.
Verify: E2E test with two identical rows — insert 2, delete 1 in same cycle; stream table retains exactly 1 row. Dependencies: EC-06 keyless path (shipped in prior release). Schema change: No.
CORR-6 — Harden guarded .unwrap() calls in DVM operators
In plain terms: Several DVM operators use .unwrap() on values that are logically guaranteed by a prior is_some() guard, but the coupling is implicit and fragile — a refactor could silently break the invariant, causing a panic in SQL-reachable code. The most fragile instance is ctx.st_qualified_name.as_deref().unwrap() in filter.rs (line ~130), guarded by has_st, which is derived from is_some() several lines earlier. Replace these patterns with if let Some(…) or .unwrap_or_else(|| …) to make the invariant structurally enforced rather than comment-documented.
Verify: grep -rn '\.unwrap()' src/dvm/operators/ returns zero hits outside
test modules. All existing unit tests pass.
Dependencies: None. Schema change: No.
CORR-7 — TRUNCATE + INSERT CDC ordering guarantee
In plain terms: When a TRUNCATE and a subsequent INSERT occur within the same transaction on a source table, the change buffer must preserve their ordering. If the refresh engine processes the INSERT before the TRUNCATE, the stream table loses all rows including the newly inserted ones. The trigger-based CDC path records operations in ctid order within a statement, but cross-statement ordering within a single transaction relies on the change buffer's op_seq column. Verify that op_seq is monotonically increasing across statements and that the refresh engine applies TRUNCATE before INSERT.
Verify: E2E test: BEGIN; TRUNCATE src; INSERT INTO src VALUES (1); COMMIT;
followed by refresh — stream table contains exactly 1 row.
Dependencies: None. Schema change: No.
CORR-8 — NULL join-key delta handling for INNER/OUTER joins
In plain terms: When a join key column contains NULL, the INNER JOIN delta should produce zero matching rows (NULL ≠ NULL in SQL), and LEFT/FULL OUTER JOIN deltas should produce NULL-extended rows. The v0.18.0 NULL GROUP BY fix addressed aggregate grouping but the JOIN delta path’s NULL-key behavior is exercised only indirectly by existing tests. Add explicit coverage: INSERT a row with NULL join key, UPDATE it to a non-NULL key, DELETE it — verify each delta cycle produces correct results under both INNER and LEFT JOIN.
Verify: E2E tests with NULL join keys for INNER JOIN, LEFT JOIN, and FULL JOIN — all delta cycles produce correct results matching a full recompute. Dependencies: None. Schema change: No.
Security
| ID | Title | Effort | Priority |
|---|---|---|---|
| SEC-1 | Add ownership checks to drop_stream_table / alter_stream_table | S | P0 |
| SEC-2 | SQL injection audit for dynamic refresh SQL | XS | P1 |
SEC-1 — Add ownership checks to drop_stream_table / alter_stream_table
In plain terms: Currently, any role with EXECUTE privilege on pgtrickle.drop_stream_table() or pgtrickle.alter_stream_table() can modify or drop any stream table, regardless of who created it. PostgreSQL convention requires that only the owner (or a superuser) can DROP or ALTER an object. Fix: call pg_class_ownercheck(stream_table_oid, GetUserId()) (or the pgrx-safe equivalent) at the top of both functions and raise ERROR: must be owner of stream table "name" if the check fails. create_stream_table already records the creating role as the table owner in pg_class.
Verify: Non-owner role calling pgtrickle.drop_stream_table('other_users_st')
receives ERROR: must be owner of stream table "other_users_st". Superuser
can still drop any stream table. E2E test with two roles confirms.
Dependencies: None. Schema change: No.
SEC-2 — SQL injection audit for dynamic refresh SQL
In plain terms: The refresh engine builds SQL strings dynamically using format!() with user-provided table names, column names, and schema names. While pgrx's quote_identifier() and quote_literal() are used in most places, a focused audit of every format!() call site in refresh.rs, diff.rs, and the operators/ directory ensures no path allows unquoted user input into executable SQL. This is a review-only item — fix any findings immediately as P0.
Verify: Audit checklist signed off — every format!() that incorporates
catalog-derived names uses quote_identifier() or parameterised SPI queries.
Zero unquoted interpolations outside test code.
Dependencies: None. Schema change: No.
Stability
| ID | Title | Effort | Priority |
|---|---|---|---|
| STAB-1 | PgBouncer transaction-mode compatibility guard | M | P1 |
| STAB-2 | Read-replica / hot-standby safety guard | S | P1 |
| STAB-3 | Elevate Semgrep to blocking in CI | XS | P1 |
| STAB-4 | auto_backoff GUC — double interval after 3 falling-behind cycles | S | P2 |
| STAB-5 | Harden unwrap() in scheduler hot path | XS | P2 |
| STAB-6 | Parallel worker crash recovery sweep | M | P1 |
| STAB-7 | Extension version mismatch detection at load | XS | P2 |
STAB-1 — PgBouncer transaction-mode compatibility guard
In plain terms: In PgBouncer transaction mode, session-level state is lost between transactions because different backend connections may serve the same session. pg_trickle uses transaction-scoped advisory locks, which are safe, but also uses prepared statements and SET LOCAL — both of which fail silently in transaction mode, causing incorrect refresh behavior. Adding a pg_trickle.connection_pooler_mode GUC (none/session/transaction) and disabling prepared statements in transaction mode prevents silent misbehavior.
Verify: integration test with PgBouncer transaction mode confirms refreshes
complete correctly without prepared statement errors.
pg_trickle.connection_pooler_mode = 'transaction' documented in
docs/PRE_DEPLOYMENT.md.
Dependencies: None. Schema change: No.
STAB-2 — Read-replica / hot-standby safety guard
In plain terms: If pg_trickle's background worker accidentally starts on a streaming replica (hot standby), it attempts writes to the catalog and crash-loops. Fix: detect pg_is_in_recovery() at worker startup and exit gracefully with LOG: pg_trickle background worker skipped: server is in recovery mode.
Verify: integration test that simulates a replica environment; background worker exits cleanly with the correct log message. No crash loop. Dependencies: None. Schema change: No.
STAB-3 — Elevate Semgrep to blocking in CI
In plain terms: CodeQL and cargo-deny are already blocking in CI; Semgrep runs as advisory-only. Before v1.0.0, all SAST tooling should be blocking. Verify zero findings across all current rules, then flip the CI step from continue-on-error: true to blocking.
Verify: CI step passes in blocking mode. Zero advisory-only bypasses remain. Dependencies: None. Schema change: No.
STAB-4 — auto_backoff GUC for scheduler overload
In plain terms: EC-11 shipped the scheduler_falling_behind alert but deferred auto-remediation. When a stream table has triggered the alert for 3 consecutive cycles, automatically double the effective refresh interval for that table until the next successful on-time cycle. This prevents a single heavy stream table from starving the rest of the queue.
Verify: E2E test with artificially slow stream table; effective interval
doubles after 3 consecutive falling-behind alerts; returns to original
interval after catching up.
Dependencies: EC-11 scheduler_falling_behind (shipped in v0.18.0). Schema change: No.
STAB-5 — Harden unwrap() in scheduler hot path
In plain terms: The scheduler dispatch loop in scheduler.rs uses eu_dag.units().find(|u| u.id == uid).unwrap() at several call sites (lines ~1522, ~1680, ~1751, ~1811, ~1859, ~1885). While the IDs come from the same DAG and are expected to always match, a stale topo-order after a concurrent DDL change could cause a panic inside the background worker. Fix: replace with .ok_or(PgTrickleError::InternalError("unit not found in DAG"))? or use the HashMap introduced by PERF-5. This eliminates the last unwrap() cluster in the scheduler hot path.
Verify: grep -n '\.unwrap()' src/scheduler.rs returns zero hits outside
test-only code. All scheduler integration tests pass.
Dependencies: PERF-5 (HashMap replaces .find().unwrap() pattern). Schema change: No.
STAB-6 — Parallel worker crash recovery sweep
In plain terms: If a background worker is killed (OOM, SIGKILL) or crashes mid-refresh, it may leave behind: (a) orphaned advisory locks that block the next refresh of that stream table, (b) partially consumed rows in the change buffer (consumed but not committed), or (c) incomplete catalog state. Add a startup recovery sweep to the scheduler: on launch, scan for advisory locks held by PIDs that no longer exist (per pg_stat_activity), roll back any xact_status = 'in progress' from dead backends, and reset stream tables stuck in REFRESHING state with no active backend.
Verify: Integration test: kill a worker PID mid-refresh via
pg_terminate_backend(); restart the scheduler; the affected stream table
recovers without manual intervention within one scheduler cycle.
Dependencies: None. Schema change: No.
STAB-7 — Extension version mismatch detection at load
In plain terms: Running ALTER EXTENSION pg_trickle UPDATE updates the SQL objects, but the shared library (pg_trickle.so) remains loaded from the previous version until the server is restarted. This mismatch can cause subtle failures (wrong function signatures, missing struct fields). Add a version check in _PG_init() that compares the compiled-in version string against the SQL-level extversion from pg_extension. Emit a WARNING if they differ and refuse to start background workers until the server is reloaded.
Verify: After ALTER EXTENSION pg_trickle UPDATE without server restart,
the extension log shows WARNING: pg_trickle shared library version (X) does not match installed extension version (Y) — restart PostgreSQL.
Background workers do not start.
Dependencies: None. Schema change: No.
Performance
| ID | Title | Effort | Priority |
|---|---|---|---|
| PERF-1 | Fix WAL decoder: old_* columns always NULL on UPDATE | S | P1 |
| PERF-2 | Fix WAL decoder: naive pgoutput action string parsing | S | P1 |
| PERF-3 | EXPLAIN (ANALYZE, BUFFERS) surface for delta SQL in explain_st() | S | P2 |
| PERF-4 | Add catalog indexes on pgt_relid and pgt_dependencies(pgt_id) | XS | P1 |
| PERF-5 | Eliminate O(n²) units().find() in scheduler dispatch | S | P1 |
| PERF-6 | Batch has_table_source_changes() into single query | S | P2 |
| PERF-7 | Delta branch pruning for zero-change sources | S | P1 |
| PERF-8 | Index-aware MERGE path selection | S | P2 |
PERF-1 — Fix WAL decoder: old_* columns always NULL on UPDATE
In plain terms: In WAL-based CDC (pg_trickle.wal_enabled = true), the old_col_* values for UPDATE rows are always NULL because the decoder reads new_tuple for both old and new field positions. This breaks R₀ snapshot construction for the WAL path. Fix: correctly write old_tuple fields to the old_col_* buffer columns for UPDATE events. Currently dormant (only manifests with wal_enabled = true).
Verify: WAL decoder integration test: UPDATE source SET pk = new_pk; assert
old_col_pk IS NOT NULL in the change buffer and equals the pre-update value.
Dependencies: None. Schema change: No.
PERF-2 — Fix WAL decoder: naive pgoutput action string parsing
In plain terms: The WAL decoder parses the action type with starts_with("I"), which incorrectly matches any string beginning with "I" (e.g., "INSERT"). Fix: use an exact single-character comparison (== "I") or parse the action byte directly from the pgoutput message buffer. Currently dormant (only manifests with wal_enabled = true).
Verify: WAL decoder unit tests for each action type using exact-match assertion. Fuzz test with action strings longer than 1 character. Dependencies: None. Schema change: No.
PERF-3 — EXPLAIN (ANALYZE, BUFFERS) in explain_st()
In plain terms: pgtrickle.explain_st(name) returns the delta SQL template without execution statistics. Adding a with_analyze BOOLEAN parameter that runs EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) on the delta SQL gives operators the plan plus actual row counts and buffer hit/miss data — making slow refresh diagnosis much easier.
Verify: pgtrickle.explain_st('my_st', with_analyze => true) returns JSONB
with Plan, Actual Rows, and Shared Hit Blocks fields. Documented in
docs/SQL_REFERENCE.md.
Dependencies: None. Schema change: No.
PERF-4 — Add catalog indexes on pgt_relid and pgt_dependencies(pgt_id)
In plain terms: pgt_stream_tables has an index on status but not on pgt_relid, which is used in hot-path lookups (WHERE pgt_relid = $1) by DDL hooks, CDC trigger installation, and refresh dependency resolution. pgt_dependencies has an index on source_relid but not on pgt_id, which is used when rebuilding a single stream table's dependency set. Adding these two B-tree indexes eliminates sequential scans on these catalog tables at scale.
Verify: \di pgtrickle.idx_pgt_relid and \di pgtrickle.idx_deps_pgt_id
exist after upgrade. EXPLAIN of SELECT * FROM pgtrickle.pgt_stream_tables WHERE pgt_relid = 12345 shows Index Scan.
Dependencies: None. Schema change: Yes (upgrade SQL adds CREATE INDEX).
PERF-5 — Eliminate O(n²) units().find() in scheduler dispatch
In plain terms: The scheduler dispatch loop calls eu_dag.units().find(|u| u.id == uid) inside iteration over topo_order and ready_queue, causing O(n²) behavior per tick. At 500+ stream tables this adds measurable overhead. Fix: build a HashMap<UnitId, &Unit> once per tick and replace all .find() lookups with O(1) map access.
Verify: Benchmark with 500 stream tables shows tick latency < 1ms (currently
~5–10ms). grep -n 'units().find' src/scheduler.rs returns zero hits.
Dependencies: None. Schema change: No.
PERF-6 — Batch has_table_source_changes() into single query
In plain terms: has_table_source_changes() executes N separate SELECT EXISTS(SELECT 1 FROM changes_<oid> LIMIT 1) SPI queries — one per source table per stream table per scheduler tick. For a stream table with 5 sources, this is 5 SPI round-trips. Batching into a single SELECT unnest(ARRAY[oid1, oid2, ...]) AS oid WHERE EXISTS(...) query, or a single UNION ALL subquery, reduces this to 1 SPI call regardless of source count.
Verify: SPI call count for has_table_source_changes() is 1 regardless of
source table count. Scheduler integration tests pass.
Dependencies: None. Schema change: No.
PERF-7 — Delta branch pruning for zero-change sources
In plain terms: In a multi-source JOIN stream table (SELECT * FROM a JOIN b ON ...), the delta has two arms: Δ_a ⋈ b and a ⋈ Δ_b. If only source a has changes, the second arm (a ⋈ Δ_b) reads an empty change buffer and produces zero rows — but the engine still executes the full SQL, including the join against a. Short-circuit: check has_table_source_changes() per source before building each delta arm, and skip arms where the source has zero changes. For a 5-source star join with only 1 changing source, this eliminates 4 of 5 delta arms entirely.
Verify: Benchmark with 5-source JOIN where only 1 source changes; observe
4 of 5 delta arms skipped in explain_st() output. Refresh latency drops
proportionally.
Dependencies: PERF-6 (batched source-change check). Schema change: No.
PERF-8 — Index-aware MERGE path selection
In plain terms: The MERGE statement used during differential refresh joins the delta against the stream table on __pgt_row_id. If the stream table has a covering index on the row ID column (which pg_trickle creates by default), the planner should use an index nested-loop join. However, PostgreSQL's cost model sometimes prefers a hash join for large deltas. Add a targeted SET LOCAL enable_hashjoin = off within the refresh transaction when the delta cardinality is below a configurable threshold (pg_trickle.merge_index_threshold, default 10,000 rows) to steer the planner toward the index path for small deltas.
Verify: EXPLAIN of the MERGE with delta < 10,000 rows shows Index Nested
Loop instead of Hash Join. Benchmark shows improved P99 latency for small
deltas on large stream tables.
Dependencies: None. Schema change: No.
Scalability
| ID | Title | Effort | Priority |
|---|---|---|---|
| SCAL-1 | Read replica compatibility section in docs/SCALING.md | S | P1 |
| SCAL-2 | Multi-database GUC stub (pg_trickle.database_list) | S | P2 |
| SCAL-3 | CNPG operational runbook in docs/SCALING.md | S | P2 |
| SCAL-4 | Partitioned source table impact assessment | M | P2 |
SCAL-1 — Read replica compatibility documentation
In plain terms: The background worker now safely skips on replicas (STAB-2), but the interaction with read replicas for query offloading deserves its own documentation section. Add docs/SCALING.md §Read Replicas covering: which queries are safe on a replica, how pg_is_in_recovery() is used by the extension, and the recommended architecture for OLAP read-offload alongside pg_trickle stream tables.
Verify: docs/SCALING.md has a dedicated replica section.
Dependencies: STAB-2. Schema change: No.
SCAL-2 — Multi-database GUC stub
In plain terms: Post-1.0 multi-database support requires catalog changes. This item adds only the pg_trickle.database_list TEXT GUC declaration, with a default of '' (current database only) and a startup WARNING if set. This reserves the configuration namespace and lets operators test the GUC surface before the full feature ships.
Verify: SHOW pg_trickle.database_list returns ''. Setting a non-empty
value emits a WARNING: "pg_trickle.database_list is not yet implemented."
Dependencies: None. Schema change: No.
SCAL-3 — CNPG operational runbook in docs/SCALING.md
In plain terms: The CNPG (CloudNativePG) smoke test in CI validates that pg_trickle loads and functions on a CNPG-managed cluster, but the operational patterns are not documented. Add a §CNPG / Kubernetes section to docs/SCALING.md covering: cluster-example.yaml annotations for loading the extension, pod restart behavior when the background worker crashes, WAL volume sizing for CDC, recommended shared_preload_libraries configuration, and health check integration with Kubernetes liveness/readiness probes.
Verify: docs/SCALING.md has a CNPG/Kubernetes section. Content reviewed
against actual CNPG deployment behavior.
Dependencies: None. Schema change: No.
SCAL-4 — Partitioned source table impact assessment
In plain terms: Stream tables backed by partitioned source tables (inheritance or declarative partitioning) are untested and likely broken: CDC triggers may be installed only on the parent, change buffers may miss partition-routed inserts, and ALTER TABLE ... ATTACH/DETACH PARTITION DDL events are unhandled. This item is a time-boxed spike (2 days): create a partitioned source, attach a stream table, run INSERT/UPDATE/DELETE through various partitions, and document what works, what breaks, and what the fix scope is. Output: a plans/PLAN_PARTITIONING_SPIKE.md update.
Verify: Spike report documents concrete findings. At minimum: which operations work, which fail, and a rough estimate for full partitioning support. Dependencies: None. Schema change: No.
Ease of Use
| ID | Title | Effort | Priority |
|---|---|---|---|
| UX-1 | PGXN release_status → "stable" | XS | P1 |
| UX-2 | Automated Docker Hub release pipeline | S | P1 |
| UX-3 | apt/rpm packaging via PGDG | M | P1 |
| UX-4 | Connection pooler compatibility guide in docs/PRE_DEPLOYMENT.md | S | P1 |
| UX-5 | pgtrickle.write_and_refresh(dml_sql TEXT, st_name TEXT) | S | P2 |
| UX-6 | Change drop_stream_table cascade default to false | XS | P1 |
| UX-7 | Resolve OIDs to table names in error messages | S | P1 |
| UX-8 | Emit NOTICE when refresh_stream_table is skipped | XS | P1 |
| UX-9 | Fix CONFIGURATION.md TOC gaps for 3 undocumented GUCs | XS | P2 |
| UX-10 | TUI per-table refresh latency sparkline | S | P2 |
| UX-11 | pgtrickle.version() diagnostic function | XS | P2 |
UX-1 — PGXN release_status → "stable"
In plain terms: pg_trickle's `META.json` uses `release_status: "testing"`. Flipping to `"stable"` signals production-readiness, enabling the extension to appear in the main PGXN package listing and in downstream package managers that consume the PGXN stable feed. One field change in `META.json`.
Verify: META.json "release_status": "stable". Published PGXN listing
reflects the change after the next PGXN sync.
Dependencies: None. Schema change: No.
UX-2 — Automated Docker Hub release pipeline
In plain terms: Automate publishing `pgtrickle/pg_trickle:<ver>-pg18` and `pgtrickle/pg_trickle:latest` on every tagged release. Wire the existing `Dockerfile.hub` into the GitHub Actions release workflow via `docker/build-push-action`. The `latest` tag tracks the highest non-prerelease version.
Verify: After a test release tag, Docker Hub shows the correct image.
docker pull pgtrickle/pg_trickle:0.19.0-pg18 succeeds and passes the
smoke test.
Dependencies: Dockerfile.hub (already exists). Schema change: No.
UX-3 — apt/rpm packaging via PGDG
In plain terms: PostgreSQL users install extensions via `apt install postgresql-18-pg-trickle` or `dnf install pg_trickle_18`. Submit package specs to pgrpms.org (rpm) and the PGDG apt repository (deb). Generate packages from the GitHub release tarball. This is the highest-impact distribution improvement available.
Verify: apt install postgresql-18-pg-trickle works on Ubuntu 24.04.
dnf install pg_trickle_18 works on RHEL 9. Both pass verify_install.sql.
Dependencies: None. Schema change: No.
UX-4 — Connection pooler compatibility guide
In plain terms: Add a dedicated section to `docs/PRE_DEPLOYMENT.md` covering: PgBouncer session mode (fully compatible), PgBouncer transaction mode (set `pg_trickle.connection_pooler_mode = 'transaction'`), pgpool-II (session mode only), PgCat (session mode only). Include a compatibility matrix and `postgresql.conf` + PgBouncer config snippets.
Verify: PRE_DEPLOYMENT.md pooler section reviewed by a DBA familiar with PgBouncer. All described modes are tested or explicitly marked "untested." Dependencies: STAB-1. Schema change: No.
UX-5 — pgtrickle.write_and_refresh() convenience function
In plain terms: In DIFFERENTIAL mode, a write followed by `refresh_stream_table()` requires two API calls. A single function that executes the DML and triggers a refresh atomically simplifies read-your-writes patterns for applications that need immediate consistency without the overhead of IMMEDIATE mode.
Verify: SELECT pgtrickle.write_and_refresh('INSERT INTO src VALUES (1)', 'my_st')
executes the INSERT and refreshes the stream table. Documented in
docs/SQL_REFERENCE.md.
Dependencies: None. Schema change: No.
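One possible shape for the function, assuming the existing `pgtrickle.refresh_stream_table()` API. This is a sketch, not the planned implementation; the PL/pgSQL body and function name come from the item description above:

```sql
-- Sketch only: a PL/pgSQL shape for the proposed convenience function.
CREATE FUNCTION pgtrickle.write_and_refresh(dml_sql TEXT, st_name TEXT)
RETURNS void
LANGUAGE plpgsql
AS $$
BEGIN
    EXECUTE dml_sql;                                  -- run the caller's DML
    PERFORM pgtrickle.refresh_stream_table(st_name);  -- then refresh the ST
END;
$$;

-- Read-your-writes in one call:
-- SELECT pgtrickle.write_and_refresh('INSERT INTO src VALUES (1)', 'my_st');
```

Because a PL/pgSQL function runs inside the caller's transaction, the DML and the refresh commit or roll back together. Note that `EXECUTE` of caller-supplied SQL would need the same injection scrutiny as SEC-2.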
UX-6 — Change drop_stream_table cascade default to false
In plain terms: `pgtrickle.drop_stream_table(name, cascade)` currently defaults `cascade` to `true`. This violates the PostgreSQL convention where `DROP` defaults to `RESTRICT` and `CASCADE` must be explicit. A user calling `SELECT pgtrickle.drop_stream_table('my_st')` may inadvertently cascade-drop dependent stream tables. Fix: change the default to `false` (RESTRICT). This is a behavior change — existing scripts that rely on the implicit cascade must add `cascade => true` explicitly.
Verify: SELECT pgtrickle.drop_stream_table('parent_st') returns an error
when parent_st has dependents. SELECT pgtrickle.drop_stream_table('parent_st', cascade => true) succeeds. Documented in CHANGELOG as a breaking change.
Dependencies: None. Schema change: No (function signature change only).
UX-7 — Resolve OIDs to table names in error messages
In plain terms: `UpstreamTableDropped(u32)` and `UpstreamSchemaChanged(u32)` display raw PostgreSQL OIDs (e.g., `"upstream table dropped: OID 16384"`). Users cannot easily map OIDs to table names. Fix: resolve the OID to `schema.table` via `pg_class` at error-construction time, or store the name alongside the OID. If the table is already dropped, fall back to `"OID <oid> (table no longer exists)"`.
Verify: UpstreamTableDropped error message shows "upstream table dropped: public.orders" instead of raw OID. Fallback tested with a pre-dropped table.
Dependencies: None. Schema change: No.
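The lookup itself is a straightforward catalog join; a minimal sketch (the literal OID stands in for the value carried by the error):

```sql
-- Resolve an OID to schema.table at error-construction time.
-- Returns zero rows if the table is already dropped, which is the
-- signal to use the "OID <oid> (table no longer exists)" fallback.
SELECT n.nspname || '.' || c.relname AS qualified_name
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.oid = 16384;  -- OID from the error
```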
UX-8 — Emit NOTICE when refresh_stream_table is skipped
In plain terms: When `refresh_stream_table()` encounters a `RefreshSkipped` condition (e.g., no changes detected, another refresh already in progress), it currently logs at `debug1` level and returns success — invisible to the caller at default log levels. Fix: emit a PostgreSQL `NOTICE` (visible to the calling session) in addition to the `debug1` log, so the caller knows the refresh did not execute.
Verify: SELECT pgtrickle.refresh_stream_table('my_st') with no pending
changes emits NOTICE: refresh skipped for "my_st": no changes detected.
Visible in psql output.
Dependencies: None. Schema change: No.
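The visibility difference is easy to check from psql; a minimal sketch, where the message wording mirrors the Verify step and is illustrative:

```sql
-- A NOTICE reaches the calling session at default client_min_messages,
-- unlike a debug1 log line, which stays in the server log.
DO $$
BEGIN
    RAISE NOTICE 'refresh skipped for "%": no changes detected', 'my_st';
END;
$$;
```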
UX-9 — Fix CONFIGURATION.md TOC gaps
In plain terms: Three GUCs (`delta_work_mem_cap_mb`, `volatile_function_policy`, `unlogged_buffers`) have full documentation sections in `docs/CONFIGURATION.md` but are missing from the table of contents navigation at the top of the file. Additionally, there is a duplicate "Guardrails" entry in the TOC. Fix: add the missing TOC entries and remove the duplicate.
Verify: All ### pg_trickle.* headings in CONFIGURATION.md have a
corresponding TOC link. No duplicate entries.
Dependencies: None. Schema change: No.
UX-10 — TUI per-table refresh latency sparkline
In plain terms: The `pgtrickle` TUI dashboard shows each stream table's current status and last refresh duration, but operators cannot see at a glance whether latency is trending up or down. Add a sparkline column (last 20 refresh latencies, ~80 chars wide) to the stream table list view. The data is already available in `pgt_refresh_history`; the TUI polls it on each tick. This makes performance degradation and recovery immediately visible without switching to Grafana.
Verify: TUI stream table view shows a sparkline column. Sparkline updates
after each refresh cycle. Values match pgt_refresh_history entries.
Dependencies: None. Schema change: No.
UX-11 — pgtrickle.version() diagnostic function
In plain terms: A `SELECT pgtrickle.version()` function that returns the installed extension version, the shared library version, and the target PostgreSQL major version as a composite record. This is standard practice for PostgreSQL extensions (cf. `postgis_full_version()`) and simplifies remote diagnostics — support can ask a user to run one query instead of checking `pg_available_extensions`, `pg_config`, and `SHOW server_version` separately.
Verify: SELECT * FROM pgtrickle.version() returns three fields:
extension_version, library_version, pg_major_version. Values match the
installed state.
Dependencies: None. Schema change: No.
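A sketch of the SQL surface, with the column names taken from the Verify step above. The sources for each field are assumptions — in practice the library version would be compiled into the shared object rather than selected from a catalog:

```sql
-- Sketch: composite-returning diagnostic function.
CREATE FUNCTION pgtrickle.version()
RETURNS TABLE (extension_version TEXT, library_version TEXT, pg_major_version TEXT)
LANGUAGE sql STABLE
AS $$
    SELECT
        (SELECT extversion FROM pg_extension WHERE extname = 'pg_trickle'),
        '<reported by the shared library>',  -- placeholder; baked into the .so
        split_part(current_setting('server_version'), '.', 1);
$$;
```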
Test Coverage
| ID | Title | Effort | Priority |
|---|---|---|---|
| TEST-1 | E2E tests for CORR-2 (JOIN delta R₀ fix) | S | P1 |
| TEST-2 | E2E tests for DDL tracking gaps (CORR-3 / CORR-4) | S | P1 |
| TEST-3 | WAL decoder unit tests for PERF-1 / PERF-2 | S | P1 |
| TEST-4 | PgBouncer transaction-mode integration smoke test | M | P1 |
| TEST-5 | Read-replica guard integration test | S | P1 |
| TEST-6 | Ownership-check privilege tests for SEC-1 | S | P1 |
| TEST-7 | Scheduler dispatch benchmark (500+ STs) | S | P1 |
| TEST-8 | Upgrade E2E tests (e2e_migration_tests.rs) | M | P1 |
| TEST-9 | Extract unit-testable logic from E2E-only paths | M | P1 |
| TEST-10 | TPC-H scale factor coverage (SF-1, SF-10) | S | P2 |
TEST-1 — E2E tests for CORR-2 (JOIN delta R₀ fix)
In plain terms: The co-delete scenario (UPDATE join key + DELETE join partner in same cycle) is currently untested. Add three E2E tests: (a) simultaneous key change + right-side delete; (b) UPDATE key + DELETE multiple right-side rows; (c) multi-cycle correctness after the scenario.
Verify: 3 E2E tests in e2e_join_tests.rs. All pass; intermediate full
refresh not required for correctness.
Dependencies: CORR-2. Schema change: No.
TEST-2 — E2E tests for DDL tracking (CORR-3 / CORR-4)
In plain terms: Add E2E tests verifying that `ALTER TYPE`, `ALTER DOMAIN`, and `ALTER POLICY` DDL events correctly trigger stream table invalidation.
Verify: 3 E2E tests (one per DDL type). Stream table state after reinit is correct. Dependencies: CORR-3, CORR-4. Schema change: No.
TEST-3 — WAL decoder unit tests
In plain terms: Add WAL decoder unit tests that explicitly enable `wal_enabled = true` and verify: (a) `old_col_*` values are non-NULL for UPDATE rows; (b) `pk_hash` is non-zero for keyless tables; (c) action string parsing uses exact comparison.
Verify: 5+ unit tests in tests/wal_decoder_tests.rs using Testcontainers
with WAL mode enabled.
Dependencies: PERF-1, PERF-2. Schema change: No.
TEST-4 — PgBouncer transaction-mode smoke test
In plain terms: Start PgBouncer in transaction mode via Testcontainers, connect pg_trickle through it, and run a basic refresh cycle. Verifies that `connection_pooler_mode = 'transaction'` correctly disables prepared statements and refreshes complete without errors.
Verify: integration test passes with PgBouncer transaction mode container. Dependencies: STAB-1. Schema change: No.
TEST-5 — Read-replica guard integration test
In plain terms: Start a streaming replica via Testcontainers, install pg_trickle on the replica, and verify the background worker exits cleanly with the correct log message rather than crash-looping.
Verify: worker log contains "pg_trickle background worker skipped: server is in recovery mode." No ERROR or FATAL in replica logs. Dependencies: STAB-2. Schema change: No.
TEST-6 — Ownership-check privilege tests for SEC-1
In plain terms: Add E2E tests with two PostgreSQL roles: role A creates a stream table, role B (non-superuser, non-owner) attempts to drop and alter it. Verify that role B receives `ERROR: must be owner of stream table`. Also verify that a superuser can drop/alter any stream table regardless of ownership.
Verify: 3 E2E tests (non-owner drop, non-owner alter, superuser override). Dependencies: SEC-1. Schema change: No.
TEST-7 — Scheduler dispatch benchmark (500+ STs)
In plain terms: Add a Criterion benchmark that creates a mock DAG with 500+ stream tables and measures per-tick dispatch latency. This gates PERF-5 (HashMap optimization) and provides a regression baseline for future scheduler changes. The benchmark should run in the existing `benches/` framework.
Verify: cargo bench --bench scheduler_bench runs and reports P50/P99 tick
latency. Baseline saved for Criterion regression gate.
Dependencies: PERF-5. Schema change: No.
TEST-8 — Upgrade E2E tests (e2e_migration_tests.rs)
In plain terms: The upgrade path from 0.18.0 → 0.19.0 is currently tested only by verifying that `ALTER EXTENSION pg_trickle UPDATE` runs without error. There are no tests that verify (a) existing stream tables continue to function after upgrade, (b) the new catalog schema items (DB-2 FK, DB-3 version table, DB-5 history retention) are present and correct, or (c) stream table data is preserved. Add a Testcontainers-based upgrade E2E test.
Verify: tests/e2e_migration_tests.rs tests: fresh install, upgrade from
previous version with populated stream tables, catalog integrity check,
post-upgrade refresh cycle. All pass.
Dependencies: DB-1, DB-2, DB-3. Schema change: No (tests existing schema).
TEST-9 — Extract unit-testable logic from E2E-only paths
In plain terms: Several core functions in `refresh.rs` and `scheduler.rs` are currently exercised only through end-to-end tests that require a PostgreSQL container. Extracting pure logic from SPI-dependent code and adding direct unit tests makes regressions detectable in seconds instead of minutes. Target: identify 5+ functions (refresh strategy selection, delta cardinality estimation, backoff calculation, topo-sort cycle detection, merge strategy costing) that operate on plain Rust data structures and can be tested with `#[cfg(test)]` modules.
Verify: 5+ new #[cfg(test)] unit tests in src/refresh.rs or
src/scheduler.rs. just test-unit runs them in < 5 seconds.
Dependencies: None. Schema change: No.
TEST-10 — TPC-H scale factor coverage (SF-1, SF-10)
In plain terms: The v0.18.0 TPC-H regression guard runs all 22 queries at a single scale factor. Real-world correctness bugs sometimes only manifest at higher cardinalities where hash collisions, sort spill, and parallel execution change the code path. Add nightly runs at SF-1 (6M rows) and SF-10 (60M rows) alongside the existing default. The SF-10 run doubles as a performance soak test — flag any query whose refresh time regresses by more than 20% compared to the previous nightly.
Verify: CI nightly job runs TPC-H at SF-1 and SF-10. All 22 queries produce correct results at both scales. SF-10 timing baseline saved for regression detection. Dependencies: None. Schema change: No.
Schema Stability
| ID | Title | Effort | Priority |
|---|---|---|---|
| DB-1 | Fix duplicate 'DIFFERENTIAL' in two CHECK constraints | XS | P0 |
| DB-2 | Add ON DELETE CASCADE FK on pgt_refresh_history.pgt_id | XS | P0 |
| DB-3 | Add pgtrickle.pgt_schema_version version tracking table | XS | P0 |
| DB-4 | Rename pgtrickle_refresh NOTIFY channel → pg_trickle_refresh | XS | P0 |
| DB-5 | pg_trickle.history_retention_days GUC + scheduler daily cleanup | S | P1 |
| DB-6 | Document public API stability contract in docs/SQL_REFERENCE.md | XS | P1 |
| DB-7 | Add migration script template to sql/ | XS | P1 |
| DB-8 | Validate orphan cleanup in drop_stream_table | XS | P1 |
| DB-9 | pgtrickle.migrate() utility function | S | P2 |
DB-1 — Fix duplicate 'DIFFERENTIAL' in CHECK constraints
In plain terms: Both `pgt_stream_tables.refresh_mode` and `pgt_refresh_history.action` have `'DIFFERENTIAL'` listed twice in their CHECK constraints. While logically harmless, it signals sloppiness and produces confusing output in dumps. Both from `REPORT_DB_SCHEMA_STABILITY.md` §3.1.
Verify: \d+ pgtrickle.pgt_stream_tables and \d+ pgtrickle.pgt_refresh_history
show their CHECK constraints with no duplicate values.
Dependencies: None. Schema change: Yes (upgrade SQL drops/recreates constraints).
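The upgrade SQL amounts to a drop-and-recreate per constraint; a sketch for one of the two tables, where the constraint name and the allowed value list are assumptions for illustration:

```sql
-- Upgrade-SQL sketch: recreate the CHECK constraint with each value once.
ALTER TABLE pgtrickle.pgt_stream_tables
    DROP CONSTRAINT IF EXISTS pgt_stream_tables_refresh_mode_check;

ALTER TABLE pgtrickle.pgt_stream_tables
    ADD CONSTRAINT pgt_stream_tables_refresh_mode_check
    CHECK (refresh_mode IN ('FULL', 'DIFFERENTIAL', 'AUTO'));  -- no duplicates
```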
DB-2 — Add ON DELETE CASCADE FK on pgt_refresh_history.pgt_id
In plain terms: `pgt_refresh_history.pgt_id` references `pgt_stream_tables.pgt_id` logically but has no formal FK. When a stream table is dropped, orphan history rows accumulate indefinitely. Adding `FOREIGN KEY (pgt_id) REFERENCES pgtrickle.pgt_stream_tables(pgt_id) ON DELETE CASCADE` cleans up automatically.
Verify: Drop a stream table; SELECT count(*) FROM pgtrickle.pgt_refresh_history WHERE pgt_id = <dropped_id> returns 0.
Dependencies: None. Schema change: Yes.
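A sketch of the upgrade SQL. Pre-existing orphan rows would make `ADD CONSTRAINT` fail validation, so the migration has to sweep them first; the constraint name is an assumption:

```sql
-- Remove orphans left behind by earlier drops, so FK validation succeeds.
DELETE FROM pgtrickle.pgt_refresh_history h
WHERE NOT EXISTS (
    SELECT 1 FROM pgtrickle.pgt_stream_tables s WHERE s.pgt_id = h.pgt_id
);

-- Then add the cascading FK.
ALTER TABLE pgtrickle.pgt_refresh_history
    ADD CONSTRAINT pgt_refresh_history_pgt_id_fkey
    FOREIGN KEY (pgt_id)
    REFERENCES pgtrickle.pgt_stream_tables (pgt_id)
    ON DELETE CASCADE;
```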
DB-3 — Add pgtrickle.pgt_schema_version version tracking table
In plain terms: There is currently no way for migration scripts to verify which schema version is installed before applying changes. Add a `pgt_schema_version(version TEXT PRIMARY KEY, applied_at TIMESTAMPTZ, description TEXT)` table seeded with the current version. Every future migration script will check this table and insert its target version.
Verify: SELECT version FROM pgtrickle.pgt_schema_version ORDER BY applied_at DESC LIMIT 1 returns the current extension version after upgrade.
Dependencies: None. Schema change: Yes.
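A sketch of the table DDL and seed row, following the column list in the item description; the guard pattern in the trailing comment shows how a future migration would use it:

```sql
-- Version-tracking table, per the DB-3 description.
CREATE TABLE pgtrickle.pgt_schema_version (
    version     TEXT PRIMARY KEY,
    applied_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    description TEXT
);

INSERT INTO pgtrickle.pgt_schema_version (version, description)
VALUES ('0.19.0', 'initial version-tracking seed');

-- Future migrations guard themselves like so (illustrative pattern):
-- DO $$ BEGIN
--     IF NOT EXISTS (SELECT 1 FROM pgtrickle.pgt_schema_version
--                    WHERE version = '0.19.0') THEN
--         RAISE EXCEPTION 'expected schema version 0.19.0 before upgrading';
--     END IF;
-- END $$;
```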
DB-4 — Rename pgtrickle_refresh NOTIFY channel → pg_trickle_refresh
In plain terms: Two existing NOTIFY channels use `pg_trickle_*` naming (`pg_trickle_alert`, `pg_trickle_cdc_transition`). The third uses the inconsistent `pgtrickle_refresh` (no separator). Rename it now, while the project is still pre-1.0. Any external `LISTEN pgtrickle_refresh` in application code must be updated. Document as a breaking change in CHANGELOG.
Verify: LISTEN pg_trickle_refresh receives notifications on refresh events.
LISTEN pgtrickle_refresh receives none.
Dependencies: None. Schema change: No (code change only).
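On the application side, the migration is a one-line swap in every listener session:

```sql
-- Application-side migration after the rename.
UNLISTEN pgtrickle_refresh;   -- old channel; no longer notified after upgrade
LISTEN pg_trickle_refresh;    -- new channel, consistent with pg_trickle_*
```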
DB-5 — pg_trickle.history_retention_days GUC + scheduler cleanup
In plain terms: `pgt_refresh_history` has no retention policy. Production deployments refreshing 100+ stream tables on tight schedules will accumulate millions of rows within months. Add a GUC (default: 30 days) and a daily cleanup step in the scheduler: `DELETE FROM pgtrickle.pgt_refresh_history WHERE start_time < now() - make_interval(...)`.
Verify: SET pg_trickle.history_retention_days = 1 and run the cleanup;
rows older than 1 day are removed. Default retains 30 days.
Dependencies: None. Schema change: No (new GUC + cleanup logic only).
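The cleanup statement the scheduler would run daily, sketched in full; reading the retention window via `current_setting()` is illustrative (the worker could equally read the GUC natively):

```sql
-- Daily retention sweep over the refresh history.
DELETE FROM pgtrickle.pgt_refresh_history
WHERE start_time < now() - make_interval(
    days => current_setting('pg_trickle.history_retention_days')::int
);
```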
DB-6 — Document public API stability contract
In plain terms: The stability contract defined in `REPORT_DB_SCHEMA_STABILITY.md` §5 (Tier 1/2/3 surfaces) is not yet published anywhere users can find it. Add a "Stability Guarantees" section to `docs/SQL_REFERENCE.md` covering: which function signatures are stable, which view columns can be added without a major version, and which internal objects may change with migration scripts.
Verify: docs/SQL_REFERENCE.md has a §Stability Guarantees section linked
from the TOC.
Dependencies: None. Schema change: No.
DB-7 — Add migration script template to sql/
In plain terms: The `sql/pg_trickle--0.18.0--0.19.0.sql` file is currently an empty stub. Populate it with: (a) the DB-1 CHECK constraint fixes, (b) the DB-2 FK addition, (c) the DB-3 schema version table creation, and (d) the DB-4 NOTIFY channel rename notice. Also create a reusable migration script template comment header for future versions.
Verify: ALTER EXTENSION pg_trickle UPDATE on a 0.18.0 instance applies
all schema changes correctly. check_upgrade_completeness.sh passes.
Dependencies: DB-1, DB-2, DB-3, DB-4. Schema change: Yes (this IS the migration script).
DB-8 — Validate orphan cleanup in drop_stream_table
In plain terms: When a stream table is dropped, `pgt_change_tracking` rows with the dropped `pgt_id` in `tracked_by_pgt_ids` (a `BIGINT[]` column) may not be cleaned up if the array contains other IDs. Add an explicit sweep: remove the dropped `pgt_id` from all `tracked_by_pgt_ids` arrays; delete rows where the array becomes empty.
Verify: Create a shared-source ST pair, drop one; SELECT * FROM pgtrickle.pgt_change_tracking shows correct state.
Dependencies: None. Schema change: No.
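The sweep in SQL terms, assuming `:dropped_id` stands for the dropped stream table's `pgt_id` (psql-style parameter, for illustration only):

```sql
-- Pull the dropped ID out of every tracking array...
UPDATE pgtrickle.pgt_change_tracking
SET tracked_by_pgt_ids = array_remove(tracked_by_pgt_ids, :dropped_id)
WHERE tracked_by_pgt_ids @> ARRAY[:dropped_id]::BIGINT[];

-- ...then drop rows no stream table tracks anymore.
DELETE FROM pgtrickle.pgt_change_tracking
WHERE tracked_by_pgt_ids = '{}'::BIGINT[];
```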
DB-9 — pgtrickle.migrate() utility function
In plain terms: Add a `pgtrickle.migrate()` SQL function that iterates over all registered stream tables and applies any pending dynamic object migrations (change buffer schema updates, CDC trigger function regeneration). It is called automatically at the end of `ALTER EXTENSION UPDATE` and can also be called manually after an upgrade to repair STs that were being refreshed during the upgrade window.
Verify: SELECT pgtrickle.migrate() completes without error on a fresh
install and after a version upgrade. Returns a summary of migrated objects.
Dependencies: DB-3 (uses schema version to determine needed migrations). Schema change: No.
v0.19.0 total: ~4–5 weeks
Exit criteria:
- CORR-1: `delete_insert` strategy removed; `ERROR` raised on old GUC value
- CORR-2: JOIN delta R₀ fix: `UPDATE key + DELETE partner` in same cycle produces correct stream table result
- CORR-3: `ALTER TYPE` / `ALTER DOMAIN` DDL events trigger stream table invalidation
- CORR-4: `ALTER POLICY` DDL events trigger stream table invalidation
- CORR-5: Keyless content-hash collision test passes with two identical-content rows
- CORR-6: Zero `.unwrap()` in `src/dvm/operators/` outside test modules
- SEC-1: Non-owner `drop_stream_table` / `alter_stream_table` raises `ERROR: must be owner`
- STAB-1: `pg_trickle.connection_pooler_mode` GUC added; transaction mode disables prepared statements
- STAB-2: Background worker exits cleanly on hot standby with correct log message
- STAB-3: Semgrep elevated to blocking; zero findings verified
- STAB-4: `auto_backoff` GUC: interval doubles after 3 consecutive falling-behind alerts
- STAB-5: Zero `.unwrap()` in scheduler hot path outside test modules
- PERF-1: WAL decoder writes correct `old_col_*` values for UPDATE rows
- PERF-2: WAL decoder uses exact action string comparison
- PERF-4: Catalog indexes on `pgt_relid` and `pgt_dependencies(pgt_id)` exist after upgrade
- PERF-5: Zero `units().find()` in scheduler; HashMap-based O(1) lookup
- PERF-6: `has_table_source_changes()` executes a single SPI query regardless of source count
- SCAL-1: `docs/SCALING.md` replica section added
- UX-1: `META.json` `release_status` → `"stable"`; PGXN listing updated
- UX-2: Docker Hub release automation wired in GitHub Actions
- UX-3: apt/rpm packages available via PGDG
- UX-4: `docs/PRE_DEPLOYMENT.md` connection pooler compatibility guide added
- UX-6: `drop_stream_table` defaults to `cascade => false`
- UX-7: `UpstreamTableDropped` / `UpstreamSchemaChanged` show table name instead of raw OID
- UX-8: `refresh_stream_table` emits NOTICE when refresh is skipped
- UX-9: CONFIGURATION.md TOC complete; no duplicate entries
- TEST-1: 3 JOIN delta R₀ E2E tests pass
- TEST-2: 3 DDL tracking E2E tests pass
- TEST-3: 5+ WAL decoder unit tests pass with `wal_enabled = true`
- TEST-4: PgBouncer transaction-mode integration test passes
- TEST-5: Read-replica guard integration test passes
- TEST-6: 3 ownership-check privilege E2E tests pass
- TEST-7: Scheduler dispatch benchmark baseline saved
- TEST-8: Upgrade E2E tests pass (pre- and post-upgrade stream table correctness)
- DB-1: No duplicate `'DIFFERENTIAL'` in CHECK constraints
- DB-2: `pgt_refresh_history.pgt_id` FK with `ON DELETE CASCADE` added
- DB-3: `pgtrickle.pgt_schema_version` table present and seeded
- DB-4: `pgtrickle_refresh` channel renamed to `pg_trickle_refresh`
- DB-5: `pg_trickle.history_retention_days` GUC active; daily cleanup deletes old rows
- DB-6: `docs/SQL_REFERENCE.md` stability contract section published
- DB-7: `sql/pg_trickle--0.18.0--0.19.0.sql` applies DB-1 through DB-4 changes
- DB-8: `drop_stream_table` leaves no orphan rows in `pgt_change_tracking`
- CORR-7: TRUNCATE + INSERT in same transaction — stream table correct after refresh
- CORR-8: NULL join-key delta correct for INNER, LEFT, and FULL JOIN
- SEC-2: SQL injection audit complete — zero unquoted interpolations in refresh SQL
- STAB-6: Worker crash recovery sweep cleans orphaned locks and stuck REFRESHING state
- STAB-7: Version mismatch WARNING emitted after `ALTER EXTENSION` without restart
- PERF-7: Delta branch pruning skips zero-change source arms in multi-JOIN
- PERF-8: Index-aware MERGE uses nested loop for small deltas on indexed tables
- SCAL-3: `docs/SCALING.md` CNPG/Kubernetes section published
- SCAL-4: Partitioning spike report written with concrete findings
- UX-10: TUI sparkline column visible for refresh latency trend
- UX-11: `pgtrickle.version()` returns extension, library, and PG versions
- TEST-9: 5+ unit tests extracted from E2E-only refresh/scheduler logic
- TEST-10: TPC-H nightly runs at SF-1 and SF-10 with correct results
- Extension upgrade path tested (`0.18.0 → 0.19.0`)
- `just check-version-sync` passes
Conflicts & Risks
- CORR-1 is a user-visible breaking change. Any deployment with `merge_join_strategy = 'delete_insert'` in `postgresql.conf` will error at startup after upgrade. Requires a prominent CHANGELOG entry and a NOTICE during the upgrade migration.
- CORR-2 touches high-traffic diff operators. `diff_inner_join` and `diff_left_join` are the most commonly used operators. Gate the merge behind the TPC-H regression suite + TEST-1. Do not merge without both passing.
- STAB-1 introduces a new GUC. The `pg_trickle.connection_pooler_mode` GUC must be mirrored in upgrade migration SQL, `CONFIGURATION.md`, and `check-version-sync` validation.
- PERF-1/PERF-2 are currently dormant. Changes to `wal_decoder.rs` must be tested with `wal_enabled = true` explicitly. The default trigger-based CDC is unaffected — keep WAL tests behind an explicit env var to avoid slowing down the default test run.
- UX-3 (apt/rpm packaging) depends on PGDG maintainer availability (~8–12h) and can be cut without impacting correctness if it risks delaying the release.
- SEC-1 changes privilege semantics. Existing deployments where non-owner roles call `drop_stream_table` or `alter_stream_table` will break. Requires a CHANGELOG entry and, optionally, a `pg_trickle.skip_ownership_check` GUC (default `false`) for a transition period.
- UX-6 changes the cascade default. Scripts relying on the implicit `cascade => true` will change behavior: DROP will error instead of cascading. Ship alongside SEC-1 and document both breaking changes together.
- PERF-4 requires upgrade SQL. The two `CREATE INDEX` statements must be added to `sql/pg_trickle--0.18.0--0.19.0.sql`. Index creation on a busy system may briefly lock the catalog tables (millisecond-range for small catalogs; document in upgrade notes).
- DB-4 renames the `pgtrickle_refresh` NOTIFY channel. Any application code using `LISTEN pgtrickle_refresh` will stop receiving notifications after upgrade. The old channel name ceases to exist. Document prominently in CHANGELOG and UPGRADING.md.
- DB-2 adds a CASCADE FK. If any external tooling holds open transactions when a stream table is dropped, the cascade may fail under lock. Test in the upgrade E2E (TEST-8) before shipping.
- STAB-6 touches the scheduler startup path. A bug in the recovery sweep could incorrectly reset a stream table that is still being refreshed on a live backend. The sweep must verify that the PID is truly dead via `pg_stat_activity` before taking corrective action.
- PERF-8 disables `hashjoin` within the refresh transaction. If the threshold is set too high, large deltas will use a slower nested-loop path. Make the `merge_index_threshold` GUC tunable and document clearly that it only affects the MERGE step, not the delta SQL.
- SCAL-4 (partitioning spike) may uncover scope too large for v0.19.0. If the spike reveals that full partitioning support requires CDC architectural changes, defer the implementation to a later release and document findings in the spike report.
v0.20.0 — Dog-Feeding (pg_trickle Monitors Itself)
Status: Released (2026-04-15). All 62 items implemented, 1 skipped
(PERF-6 already shipped in v0.19.0). See plans/PLAN_0_20_0.md.
Release Theme: This release implements dog-feeding: pg_trickle uses its own stream tables to maintain reactive analytics over its internal catalog and refresh-history tables. Five dog-feeding stream tables (`df_efficiency_rolling`, `df_anomaly_signals`, `df_threshold_advice`, `df_cdc_buffer_trends`, `df_scheduling_interference`) replace repeated full-scan diagnostic functions with continuously maintained incremental views, enable multi-cycle trend detection for threshold tuning, and surface anomalies reactively. An optional auto-apply policy layer can automatically adjust `auto_threshold` when confidence is high. This validates pg_trickle on its own non-trivial workload and demonstrates the incremental analytics value proposition to users. See plans/PLAN_DOG_FEEDING.md for the full design, architecture, and risk analysis.
Phase 1 — Foundation
| Item | Description | Effort | Ref |
|---|---|---|---|
| DF-F1 | Verify CDC on pgt_refresh_history. Confirm that create_stream_table() installs INSERT triggers on pgt_refresh_history. Fix schema-exclusion logic if the pgtrickle schema is skipped. | 2–4h | PLAN_DOG_FEEDING.md §7 Phase 1 |
| DF-F2 | Create df_efficiency_rolling (DF-1). Maintained rolling-window aggregates over pgt_refresh_history. Replaces refresh_efficiency() full scans. | 2–4h | PLAN_DOG_FEEDING.md §5 DF-1 |
| DF-F3 | E2E test: DF-1 output matches refresh_efficiency(). Insert synthetic history rows, refresh DF-1, assert aggregates agree. | 2–4h | PLAN_DOG_FEEDING.md §8 |
| DF-F4 | pgtrickle.setup_dog_feeding() helper. Single SQL call that creates all five df_* stream tables. | 2–4h | PLAN_DOG_FEEDING.md §7 Phase 4 |
| DF-F5 | pgtrickle.teardown_dog_feeding() helper. Drops all df_* stream tables cleanly. | 1h | PLAN_DOG_FEEDING.md §7 Phase 4 |
Phase 2 — Anomaly Detection
| Item | Description | Effort | Ref |
|---|---|---|---|
| DF-A1 | Create df_anomaly_signals (DF-2). Detects duration spikes, error bursts, and mode oscillation by comparing recent behavior against DF-1 baselines. | 3–5h | PLAN_DOG_FEEDING.md §5 DF-2 |
| DF-A2 | Create df_threshold_advice (DF-3). Multi-cycle threshold recommendation replacing the single-step compute_adaptive_threshold() convergence. | 3–5h | PLAN_DOG_FEEDING.md §5 DF-3 |
| DF-A3 | Verify DAG ordering. DF-1 refreshes before DF-2 and DF-3. | 1–2h | PLAN_DOG_FEEDING.md §7 Phase 2 |
| DF-A4 | E2E test: threshold spike detection. Inject synthetic history making DIFF consistently fast; assert DF-3 recommends raising the threshold. | 2–4h | PLAN_DOG_FEEDING.md §8 |
| DF-A5 | E2E test: anomaly duration spike. Inject a 3× duration spike; assert DF-2 detects it. | 2–4h | PLAN_DOG_FEEDING.md §8 |
Phase 3 — CDC Buffer & Interference
| Item | Description | Effort | Ref |
|---|---|---|---|
| DF-C1 | Create df_cdc_buffer_trends (DF-4). Tracks change-buffer growth rates per source table. May require pgtrickle.cdc_buffer_row_counts() helper for dynamic table names. | 4–8h | PLAN_DOG_FEEDING.md §5 DF-4 |
| DF-C2 | Create df_scheduling_interference (DF-5). Detects concurrent refresh overlap. FULL-refresh mode initially (bounded 1-hour window). | 3–5h | PLAN_DOG_FEEDING.md §5 DF-5 |
| DF-C3 | E2E test: scheduling overlap detection. Create 3 STs with overlapping schedules; verify DF-5 detects overlap. | 2–4h | PLAN_DOG_FEEDING.md §8 |
Phase 4 — GUC & Auto-Apply
| Item | Description | Effort | Ref |
|---|---|---|---|
| DF-G1 | pg_trickle.dog_feeding_auto_apply GUC. Values: off (default) / threshold_only / full. Registered in src/config.rs. | 1–2h | PLAN_DOG_FEEDING.md §6.2 |
| DF-G2 | Auto-apply worker (threshold_only). Post-tick hook reads df_threshold_advice; applies ALTER STREAM TABLE ... SET auto_threshold = <recommended> when confidence is HIGH and delta > 5%. Rate-limited to 1 change per ST per 10 minutes. | 4–8h | PLAN_DOG_FEEDING.md §7 Phase 5 |
| DF-G3 | initiated_by = 'DOG_FEED' audit trail. Log auto-apply changes to pgt_refresh_history. | 1–2h | PLAN_DOG_FEEDING.md §7 Phase 5 |
| DF-G4 | E2E test: auto-apply threshold. Enable threshold_only, inject history making DIFF consistently faster, verify threshold increases automatically. | 2–4h | PLAN_DOG_FEEDING.md §8 |
| DF-G5 | E2E test: rate limiting. Verify no more than 1 threshold change per ST per 10 minutes. | 1–2h | PLAN_DOG_FEEDING.md §8 |
Phase 5 — Operational Diagnostics
| Item | Description | Effort | Ref |
|---|---|---|---|
| OPS-1 | pgtrickle.recommend_refresh_mode(st_name) Reads df_threshold_advice to return a structured recommendation { mode, confidence, reason } rather than computing on demand. | 2–4h | PLAN_DOG_FEEDING.md §10.6 |
| OPS-2 | check_cdc_health() spill-risk enrichment. Query df_cdc_buffer_trends growth rate; emit a spill_risk alert when buffer growth will breach spill_threshold_blocks within 2 cycles. | 2–4h | PLAN_DOG_FEEDING.md §10.3 |
| OPS-3 | pgtrickle.scheduler_overhead() diagnostic function. Returns busy-time ratio, queue depth, avg dispatch latency, and fraction of CPU spent on DF STs vs user STs. | 2–4h | — |
| OPS-4 | pgtrickle.explain_dag() — Mermaid/DOT output. Returns DAG as Mermaid markdown with node colours: user=blue, dog-feeding=green, suspended=red. | 3–4h | — |
| OPS-5 | sql/dog_feeding_setup.sql quick-start template. Runnable script: call setup_dog_feeding(), set dog_feeding_auto_apply = 'threshold_only', configure LISTEN, query initial recommendations. | 1h | — |
| OPS-6 | Workload-aware poll intervals via DF-5 signal. Replace compute_adaptive_poll_ms() exponential backoff with pre-emptive dispatch interval widening when df_scheduling_interference detects contention. | 2–4h | PLAN_DOG_FEEDING.md §10.2 |
| DASH-1 | Grafana Dog-Feeding Dashboard. New monitoring/grafana/dashboards/pg_trickle_dog_feeding.json — 5 panels reading from DF-1 through DF-5. | 4–6h | PLAN_DOG_FEEDING.md §10.5 |
| DBT-1 | dbt pgtrickle_enable_monitoring post-hook macro. Calls setup_dog_feeding() automatically after a successful dbt run; documented in dbt-pgtrickle/. | 2h | — |
OPS-1 — pgtrickle.recommend_refresh_mode(st_name text)
Reads directly from `df_threshold_advice` instead of computing a single-cycle cost comparison on demand (PLAN_DOG_FEEDING.md §10.6). Returns `TABLE(mode text, confidence text, reason text)`. When confidence is LOW (< 10 history rows), emits a fallback with `mode = 'AUTO'` and a reason explaining insufficient data. Integrates with `explain_st()` output.
Verify: call on an ST with ≥ 20 history cycles; assert `mode ∈ {'DIFFERENTIAL','FULL','AUTO'}` and `confidence ∈ {'HIGH','MEDIUM','LOW'}`. Dependencies: DF-A2. Schema change: No.
OPS-2 — check_cdc_health() spill-risk enrichment
Currently `check_cdc_health()` performs full-table scans to detect anomalies. When DF-C1 is active, query the `df_cdc_buffer_trends` growth rate instead. Emit a `spill_risk = 'IMMINENT'` row when the 1-cycle growth rate, extrapolated 2 cycles ahead, exceeds `spill_threshold_blocks`. Falls back to the full scan when dog-feeding is not set up.
Verify: inject 80% of `spill_threshold_blocks` worth of buffer rows with a steep growth rate; assert `check_cdc_health()` returns a spill-risk alert. Dependencies: DF-C1. Schema change: No.
OPS-3 — pgtrickle.scheduler_overhead() diagnostic function
Returns a snapshot of scheduler efficiency: `scheduler_busy_ratio` (fraction of wall-clock time spent executing refreshes), `queue_depth` (STs waiting to be dispatched), `avg_dispatch_latency_ms`, and `df_refresh_fraction` (fraction of busy time attributable to DF STs). This makes PERF-3's < 1% CPU target observable in production without custom monitoring.
Verify: function returns non-NULL values after 5+ refresh cycles; assert `df_refresh_fraction < 0.01` in the soak-test context. Dependencies: DF-D4. Schema change: No (new function only).
OPS-4 — pgtrickle.explain_dag() — Mermaid / DOT graph output
Returns the full refresh DAG as a Mermaid markdown string (default) or Graphviz DOT (via a `format => 'dot'` argument). Node labels show ST name, current mode, and refresh interval. Node colours: user STs = blue, dog-feeding STs = green, suspended = red, fused = orange. Edges show dependency direction. Validates that the DF-1 → DF-2 → DF-3 ordering is correct post-setup.
Verify: `SELECT pgtrickle.explain_dag()` after `setup_dog_feeding()` returns a string containing all five `df_` nodes in green with correct edges. Dependencies: None. Schema change: No (new function only).
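A sketch of the Mermaid rendering OPS-4 describes — the `dag_to_mermaid` helper and the node/edge data are hypothetical; the real function would read the catalog:

```rust
/// Hypothetical sketch of explain_dag()'s Mermaid output. Nodes carry
/// a (name, class) pair; classes mirror the OPS-4 colour scheme.
fn dag_to_mermaid(nodes: &[(&str, &str)], edges: &[(&str, &str)]) -> String {
    let mut out = String::from("graph TD\n");
    for (name, class) in nodes {
        // `name[name]:::class` attaches a classDef-defined style to the node
        out.push_str(&format!("    {name}[{name}]:::{class}\n"));
    }
    for (from, to) in edges {
        out.push_str(&format!("    {from} --> {to}\n"));
    }
    // classDef lines mirror the colour scheme in OPS-4 (hex values illustrative)
    out.push_str("    classDef user fill:#4a90d9\n");
    out.push_str("    classDef dogfeed fill:#3cb371\n");
    out.push_str("    classDef suspended fill:#d9534f\n");
    out
}

fn main() {
    let nodes = [
        ("df_efficiency_rolling", "dogfeed"),
        ("df_anomaly_signals", "dogfeed"),
    ];
    let edges = [("df_efficiency_rolling", "df_anomaly_signals")];
    let mermaid = dag_to_mermaid(&nodes, &edges);
    assert!(mermaid.contains("df_efficiency_rolling --> df_anomaly_signals"));
    println!("{mermaid}");
}
```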
OPS-5 — sql/dog_feeding_setup.sql quick-start template
A standalone SQL script in `sql/` that an operator can run with `psql -f sql/dog_feeding_setup.sql`. Contents: calls `setup_dog_feeding()`, sets `pg_trickle.dog_feeding_auto_apply = 'threshold_only'`, runs `LISTEN pg_trickle_alert`, queries `dog_feeding_status()` for a status summary, and queries `df_threshold_advice` for initial recommendations with a warm-up note. Referenced from the GETTING_STARTED.md Day 2 operations section (UX-4).
Verify: script executes without errors on a fresh install; produces visible output showing 5 active DF STs. Dependencies: DF-F4, DF-G1, UX-4. Schema change: No.
OPS-6 — Workload-aware poll intervals via DF-5 signal
Currently `compute_adaptive_poll_ms()` uses pure exponential backoff that reacts to contention only after it occurs. Replace this with a pre-emptive signal: after each scheduler tick, read the latest `overlap_count` from `df_scheduling_interference`; if `overlap_count >= 2`, increase the dispatch interval for the next tick by 20% before dispatching (capped at `pg_trickle.max_poll_interval_ms`). This closes the dog-feeding feedback loop by letting the analytics directly influence scheduling policy, reducing contention on write-heavy deployments without waiting for timeouts.
Verify: soak test with known-contending STs shows lower `overlap_count` in DF-5 with the signal enabled vs disabled; `scheduler_overhead()` shows a reduced busy-time ratio. Dependencies: DF-C2, OPS-3. Schema change: No.
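The widening rule can be sketched as a pure function — the name `widen_poll_interval` and its signature are assumptions; the real logic would live inside `compute_adaptive_poll_ms()`:

```rust
/// Sketch of the OPS-6 pre-emptive widening rule. `overlap_count`
/// comes from df_scheduling_interference; the cap mirrors
/// pg_trickle.max_poll_interval_ms.
fn widen_poll_interval(current_ms: u64, overlap_count: u32, max_poll_interval_ms: u64) -> u64 {
    if overlap_count >= 2 {
        // contention detected by DF-5: widen the next dispatch interval by 20%
        (current_ms + current_ms / 5).min(max_poll_interval_ms)
    } else {
        current_ms
    }
}

fn main() {
    assert_eq!(widen_poll_interval(1000, 0, 10_000), 1000); // no contention
    assert_eq!(widen_poll_interval(1000, 3, 10_000), 1200); // +20%
    assert_eq!(widen_poll_interval(9000, 3, 10_000), 10_000); // capped
    println!("ok");
}
```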
DASH-1 — Grafana Dog-Feeding Dashboard
Add `monitoring/grafana/dashboards/pg_trickle_dog_feeding.json` alongside the existing `pg_trickle_overview.json`. Five panels: (1) refresh throughput timeline (DF-1 `avg_diff_ms` over time), (2) anomaly heatmap (DF-2 per-ST anomaly type grid), (3) threshold calibration scatter (DF-3 current vs recommended threshold), (4) CDC buffer growth sparklines (DF-4 per-source growth rate), (5) interference matrix (DF-5 overlap heatmap). Provisioned automatically in `monitoring/grafana/provisioning/`.
Verify: `docker compose up` in `monitoring/` loads both dashboards; all five panels resolve without `No data` errors using the postgres-exporter queries. Dependencies: DF-F2, DF-A1, DF-A2, DF-C1, DF-C2. Schema change: No.
DBT-1 — pgtrickle_enable_monitoring dbt post-hook macro
Add a `pgtrickle_enable_monitoring` macro to `dbt-pgtrickle/macros/` that calls `{{ pgtrickle.setup_dog_feeding() }}` and emits a `log()` message confirming activation. Documented in `dbt-pgtrickle/README.md`. Users add `+post-hook: "{{ pgtrickle_enable_monitoring() }}"` to `dbt_project.yml` to auto-enable monitoring after any `dbt run`. Idempotent — safe to call on every run because `setup_dog_feeding()` is already idempotent (STAB-1).
Verify: `just test-dbt` includes a test case that runs the macro twice; asserts `dog_feeding_status()` shows 5 active STs after both calls. Dependencies: DF-F4, STAB-1. Schema change: No.
Documentation & Safety
| Item | Description | Effort | Ref |
|---|---|---|---|
| DF-D1 | SQL_REFERENCE.md: dog-feeding quick start. Document setup_dog_feeding(), teardown_dog_feeding(), all five df_* stream tables, and the auto-apply GUC. | 2–4h | — |
| DF-D2 | CONFIGURATION.md: pg_trickle.dog_feeding_auto_apply GUC. | 1h | — |
| DF-D3 | E2E test: control plane survives DF ST suspension. Drop or suspend all df_* STs; verify the scheduler and refresh logic operate identically. | 2–4h | PLAN_DOG_FEEDING.md §8 |
| DF-D4 | Soak test addition. Add dog-feeding STs to the existing soak test; verify no memory growth or scheduler stalls under 1-hour sustained load. | 2–4h | PLAN_DOG_FEEDING.md §8 |
Correctness
| ID | Title | Effort | Priority |
|---|---|---|---|
| CORR-1 | df_threshold_advice output always within [0.01, 0.80] | S | P0 |
| CORR-2 | DF-2 suppresses false-positive spike on first-ever refresh | S | P0 |
| CORR-3 | avg_change_ratio never NaN/Inf on zero-delta streams | S | P0 |
| CORR-4 | CDC INSERT-only invariant verified on pgt_refresh_history | XS | P1 |
| CORR-5 | DF-1 historical window boundary is exclusive, not inclusive | XS | P1 |
CORR-1 — df_threshold_advice output always within [0.01, 0.80]
The `LEAST(0.80, GREATEST(0.01, …))` expression in DF-3 must hold for all input combinations, including NULL `avg_diff_ms`, zero `avg_full_ms`, and extreme ratios. Add a property-based test (proptest) that generates random `(avg_diff_ms, avg_full_ms, current_threshold)` triples and asserts the output is always in the valid range. Any value outside [0.01, 0.80] that reaches auto-apply would corrupt stream table configuration.
Verify: proptest with 10,000 iterations; zero out-of-range results. Dependencies: DF-A2. Schema change: No.
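A Rust model of the clamp, usable as the seed for the CORR-1 proptest; the `clamp_threshold` helper and its ratio formula are simplifications of the real DF-3 SQL expression:

```rust
/// Rust mirror of the DF-3 LEAST(0.80, GREATEST(0.01, …)) clamp, with
/// the SQL NULL cases modelled as Option. Illustrative only.
fn clamp_threshold(avg_diff_ms: Option<f64>, avg_full_ms: Option<f64>, current: f64) -> f64 {
    let raw = match (avg_diff_ms, avg_full_ms) {
        // a NULL input or zero denominator yields no new evidence: keep current
        (Some(d), Some(f)) if f > 0.0 => d / f,
        _ => current,
    };
    // guard against non-finite ratios before clamping (clamp passes NaN through)
    if raw.is_finite() {
        raw.clamp(0.01, 0.80)
    } else {
        current.clamp(0.01, 0.80)
    }
}

fn main() {
    assert_eq!(clamp_threshold(Some(50.0), Some(100.0), 0.3), 0.5);
    assert_eq!(clamp_threshold(None, Some(100.0), 0.3), 0.3); // NULL diff: keep current
    assert_eq!(clamp_threshold(Some(500.0), Some(100.0), 0.3), 0.80); // clamped high
    assert_eq!(clamp_threshold(Some(0.0), Some(100.0), 0.3), 0.01); // clamped low
    println!("ok");
}
```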
CORR-2 — DF-2 suppresses false-positive spike on first-ever refresh
`df_anomaly_signals` compares `latest.duration_ms` against `eff.avg_diff_ms`. On the very first refresh of a stream table there is no rolling average yet (`eff.avg_diff_ms IS NULL`), so the `CASE WHEN` would produce no anomaly. Confirm the LATERAL subquery returns NULL (not 0) when history is empty, and that the `CASE` guard is `> 3.0 * NULLIF(eff.avg_diff_ms, 0)` so a NULL baseline never triggers a spike.
Verify: E2E test creating a brand-new ST; assert `duration_anomaly IS NULL` on the first DF-2 refresh. Dependencies: DF-A1. Schema change: No.
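The NULL-safe guard, modelled with `Option` — an illustrative sketch; the real check is the SQL `CASE` in DF-2:

```rust
/// NULL-safe spike detection mirroring the DF-2 guard
/// `duration > 3.0 * NULLIF(avg_diff_ms, 0)`: a missing or zero
/// baseline (first-ever refresh) must never report a spike.
fn duration_anomaly(latest_ms: f64, baseline_avg_ms: Option<f64>) -> Option<&'static str> {
    match baseline_avg_ms {
        // NULLIF(avg, 0): a zero baseline is treated the same as NULL
        Some(avg) if avg > 0.0 && latest_ms > 3.0 * avg => Some("DURATION_SPIKE"),
        _ => None,
    }
}

fn main() {
    assert_eq!(duration_anomaly(900.0, Some(100.0)), Some("DURATION_SPIKE"));
    assert_eq!(duration_anomaly(900.0, None), None); // first refresh: no baseline
    assert_eq!(duration_anomaly(900.0, Some(0.0)), None); // NULLIF guard
    assert_eq!(duration_anomaly(250.0, Some(100.0)), None); // under 3x: no anomaly
    println!("ok");
}
```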
CORR-3 — avg_change_ratio never NaN/Inf on zero-delta streams
DF-1 computes `avg(h.delta_row_count::float / NULLIF(h.rows_inserted + h.rows_deleted, 0))`. If a stream table runs only FULL refreshes (no DIFF cycles) the divisor is always NULL and `avg()` returns NULL — correct. But if DIFF runs with exactly zero rows inserted and zero deleted (the CDC buffer was empty), `NULLIF` must prevent a divide-by-zero NaN. Verify the guard holds and that `avg_change_ratio` is either a valid float in [0, 1] or NULL.
Verify: E2E test triggering a DIFF refresh on a quiescent source; assert `avg_change_ratio IS NULL OR avg_change_ratio BETWEEN 0 AND 1`. Dependencies: DF-F2. Schema change: No.
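The same guard in `Option` form — a sketch of the semantics, not the DF-1 SQL itself:

```rust
/// Option-based model of `delta_row_count::float / NULLIF(ins + del, 0)`:
/// a zero-change DIFF cycle contributes NULL (None), never NaN.
fn change_ratio(delta_rows: i64, rows_inserted: i64, rows_deleted: i64) -> Option<f64> {
    let denom = rows_inserted + rows_deleted;
    if denom == 0 {
        None // NULLIF(…, 0): the row is excluded from avg() instead of dividing by zero
    } else {
        Some(delta_rows as f64 / denom as f64)
    }
}

fn main() {
    assert_eq!(change_ratio(5, 8, 2), Some(0.5));
    assert_eq!(change_ratio(0, 0, 0), None); // quiescent source: NULL, not NaN
    println!("ok");
}
```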
CORR-4 — CDC INSERT-only invariant verified on pgt_refresh_history
`pgt_refresh_history` is semantically append-only: rows are only ever INSERTed (one per refresh). The CDC trigger installed by DF-F1 must be an INSERT-only trigger (no UPDATE/DELETE triggers). If the trigger were registered as `FOR EACH ROW AFTER INSERT OR UPDATE`, a future catalog UPDATE would generate spurious change-buffer rows and corrupt DF-1 aggregates. Inspect `pg_trigger` to confirm only an `INSERT` trigger exists.
Verify: `SELECT tgtype FROM pg_trigger WHERE tgrelid = 'pgtrickle.pgt_refresh_history'::regclass` returns only INSERT-event triggers. Dependencies: DF-F1. Schema change: No.
CORR-5 — DF-1 historical window boundary is exclusive, not inclusive
The `WHERE h.start_time > now() - interval '1 hour'` clause uses a strict `>` comparison, so a row with `start_time` exactly equal to the boundary is excluded on each pass, preventing double-counting in rolling aggregates. Confirm the query plan uses the index on `(pgt_id, start_time)` (see PERF-1) and that the boundary is consistent across DF-1, DF-2, and DF-4 (all use the same 1-hour lookback).
Verify: unit test comparing aggregate output with a row at the exact boundary; assert it is excluded. Dependencies: DF-F2. Schema change: No.
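The strict boundary, modelled on epoch seconds — a sketch of the semantics, not the SQL:

```rust
/// The strict `>` lookback boundary from DF-1: a row whose start_time
/// lands exactly at (now - lookback) is excluded from the window.
fn in_window(start_time: u64, now: u64, lookback_secs: u64) -> bool {
    start_time > now.saturating_sub(lookback_secs) // strict: boundary row excluded
}

fn main() {
    let now = 10_000;
    assert!(in_window(9_500, now, 3_600)); // well inside the hour
    assert!(!in_window(6_400, now, 3_600)); // exactly now - 1 hour: excluded
    assert!(!in_window(6_000, now, 3_600)); // older than the window
    println!("ok");
}
```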
Stability
| ID | Title | Effort | Priority |
|---|---|---|---|
| STAB-1 | setup_dog_feeding() is fully idempotent | S | P0 |
| STAB-2 | Auto-apply handles ALTER STREAM TABLE failure gracefully | S | P0 |
| STAB-3 | DF STs survive DROP EXTENSION + CREATE EXTENSION cycle | S | P1 |
| STAB-4 | Auto-apply worker checks ST still exists before applying | XS | P1 |
| STAB-5 | teardown_dog_feeding() is safe when some DF STs already removed | XS | P1 |
STAB-1 — setup_dog_feeding() is fully idempotent
Calling `setup_dog_feeding()` a second time while DF STs already exist must not raise an error. Use `IF NOT EXISTS` semantics internally (or check the catalog before creating). The function must also be safe to call concurrently from two sessions. Idempotency is critical for upgrade scripts and Terraform-style declarative deployment workflows.
Verify: call `setup_dog_feeding()` three times in a row; no errors, no duplicate stream tables. Dependencies: DF-F4. Schema change: No.
STAB-2 — Auto-apply handles ALTER STREAM TABLE failure gracefully
The auto-apply post-tick hook reads `df_threshold_advice` and issues `ALTER STREAM TABLE … SET auto_threshold = <recommended>`. If the stream table was dropped between the advice read and the apply (a TOCTOU race), the ALTER will error. Catch SQL errors in the post-tick hook with an appropriate `match` on `PgTrickleError` and log a WARNING rather than crashing the background worker.
Verify: unit test with a mocked `ALTER` that returns `ERROR: relation does not exist`; assert the worker logs a warning and continues to the next advice row. Dependencies: DF-G2. Schema change: No.
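The catch-log-continue shape of the hook, sketched with hypothetical stand-in types (`AdviceRow`, a plain string error) rather than the real `PgTrickleError`:

```rust
/// Sketch of the STAB-2 guard: a failed ALTER (e.g. a TOCTOU-dropped
/// ST) is logged as a WARNING and the loop continues.
struct AdviceRow {
    st_name: &'static str,
    recommended: f64,
}

fn apply_threshold(row: &AdviceRow) -> Result<(), String> {
    // stand-in for `ALTER STREAM TABLE … SET auto_threshold = …`
    if row.st_name == "dropped_st" {
        return Err(format!("relation \"{}\" does not exist", row.st_name));
    }
    println!("ALTER STREAM TABLE {} SET auto_threshold = {}", row.st_name, row.recommended);
    Ok(())
}

fn run_auto_apply(advice: &[AdviceRow]) -> usize {
    let mut applied = 0;
    for row in advice {
        match apply_threshold(row) {
            Ok(()) => applied += 1,
            // WARNING, then continue: never crash the background worker
            Err(e) => eprintln!("WARNING: auto-apply skipped {}: {e}", row.st_name),
        }
    }
    applied
}

fn main() {
    let advice = [
        AdviceRow { st_name: "orders_st", recommended: 0.25 },
        AdviceRow { st_name: "dropped_st", recommended: 0.40 },
        AdviceRow { st_name: "users_st", recommended: 0.10 },
    ];
    assert_eq!(run_auto_apply(&advice), 2); // one failure logged, loop continues
    println!("ok");
}
```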
STAB-3 — DF STs survive DROP EXTENSION + CREATE EXTENSION cycle
`DROP EXTENSION pg_trickle CASCADE` drops all extension-owned objects. After `CREATE EXTENSION pg_trickle`, `setup_dog_feeding()` should recreate the DF STs cleanly. There must be no leftover triggers, orphaned change-buffer tables, or stale catalog rows from the previous installation. This is the most likely failure mode after an emergency rollback + reinstall.
Verify: E2E test: `setup_dog_feeding()` → `DROP EXTENSION CASCADE` → `CREATE EXTENSION` → `setup_dog_feeding()` → insert history → refresh DF-1; assert correct aggregates. Dependencies: DF-F4, DF-F5. Schema change: No.
STAB-4 — Auto-apply worker checks ST still exists before applying
Before issuing `ALTER STREAM TABLE`, the worker should confirm the ST is still in `pgt_stream_tables` and is not in SUSPENDED or FUSED state. Applying a threshold change to a SUSPENDED ST is harmless but wasteful; applying to a FUSED ST is wrong (the fuse exists for a reason). Add a pre-apply guard in the Rust post-tick hook.
Verify: E2E test suspending an ST manually while auto-apply is enabled; assert no threshold change is applied to the suspended stream table. Dependencies: DF-G2. Schema change: No.
STAB-5 — teardown_dog_feeding() is safe when some DF STs already removed
If a user manually drops `df_anomaly_signals` before calling `teardown_dog_feeding()`, the teardown function must not error on `DROP STREAM TABLE df_anomaly_signals`. Use `drop_stream_table(name, if_exists => true)` semantics for each DF table in the teardown. Otherwise a partial teardown leaves the system in an inconsistent state.
Verify: drop two DF STs manually, then call `teardown_dog_feeding()`; assert no errors and the remaining DF STs are gone. Dependencies: DF-F5. Schema change: No.
Performance
| ID | Title | Effort | Priority |
|---|---|---|---|
| PERF-1 | Index on pgt_refresh_history(pgt_id, start_time) for DF queries | XS | P0 |
| PERF-2 | Benchmark DF-1 vs refresh_efficiency() on 10 K history rows | S | P0 |
| PERF-3 | Dog-feeding scheduler overhead target: < 1% of total CPU | S | P1 |
| PERF-4 | DF-5 self-join uses bounded index scan, not seq-scan | S | P1 |
| PERF-5 | History pruning batch-DELETE with short transactions (no CDC lock contention) | S | P1 |
| PERF-6 | Columnar change tracking Phase 1 — CDC bitmask (deferred from v0.17/v0.18) | M | P1 |
PERF-1 — Index on pgt_refresh_history(pgt_id, start_time) for DF queries
All five DF stream tables filter `pgt_refresh_history` on `(pgt_id, start_time)`. Without a composite index on these columns, the rolling-window WHERE clause forces a sequential scan of the growing history table. Verify the index was created during extension install (check the upgrade migration); if missing, add it as part of the 0.19.0 → 0.20.0 migration script.
Verify: `EXPLAIN (FORMAT TEXT) SELECT … FROM pgtrickle.pgt_refresh_history WHERE pgt_id = 1 AND start_time > now() - interval '1 hour'` shows an index scan. Schema change: Yes (index addition in migration script).
PERF-2 — Benchmark DF-1 vs refresh_efficiency() on 10 K history rows
The primary performance claim of dog-feeding is that a maintained DIFFERENTIAL stream table is cheaper than scanning the full history table on every diagnostic call. Establish a Criterion micro-benchmark that seeds 10 K history rows, then compares (a) a full `SELECT * FROM pgtrickle.refresh_efficiency()` call vs (b) a `SELECT * FROM pgtrickle.df_efficiency_rolling` read after one incremental refresh. The benchmark documents the win concretely.
Verify: Criterion benchmark shows the DF-1 read is at least 5× faster than `refresh_efficiency()` at 10 K rows. Included in `benches/` and run in CI. Dependencies: DF-F2. Schema change: No.
PERF-3 — Dog-feeding scheduler overhead target: < 1% of total CPU
Five DF STs at 48–96 s schedules add background refresh work. Under a realistic load (20 user STs, 10 K history rows), the total time spent refreshing DF STs should be < 1% of total scheduler CPU. Measure in the E2E soak test by comparing scheduler-loop busy time with and without DF STs. If overhead exceeds 1%, relax schedules to 120 s or move DF STs to `refresh_tier = 'cold'`.
Verify: soak test reports DF refresh overhead as a fraction of total scheduler CPU; assert < 1%. Dependencies: DF-D4. Schema change: No.
PERF-4 — DF-5 self-join uses bounded index scan, not seq-scan
`df_scheduling_interference` joins `pgt_refresh_history` to itself on an overlap condition with a 1-hour bound. Without the index from PERF-1 this double scan is O(N²) in history rows. Verify EXPLAIN shows nested-loop index scans (not a hash or merge join over the full table) for both sides of the self-join. If the planner chooses a seq-scan, add `enable_seqscan = off` for the DF-5 query or restructure with a CTE.
Verify: EXPLAIN of the DF-5 query shows index scans on both sides of the JOIN. Dependencies: PERF-1, DF-C2. Schema change: No.
PERF-5 — History pruning batch-DELETE with short transactions
`pg_trickle.history_retention_days` cleanup (shipped in v0.19.0) currently deletes rows in a single long transaction. Under dog-feeding, that transaction holds a lock on `pgt_refresh_history` that can delay CDC trigger INSERTs. Rewrite the purge as batched DELETEs: delete at most 500 rows per transaction, commit between batches, and sleep 50 ms between batches. The index from PERF-1 ensures each batch is an index-range scan, not a seq-scan.
Verify: soak test running the history purge concurrently with DF CDC trigger INSERTs; no lock-wait timeout observed. Batch size configurable via the `pg_trickle.history_purge_batch_size` GUC (default 500). Dependencies: PERF-1. Schema change: No.
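The batching arithmetic, as a quick sanity check — batch size 500 and the 50 ms pause are the plan's defaults, not measured values:

```rust
/// Number of short purge transactions needed for a given backlog
/// (ceiling division: the last partial batch still counts).
fn purge_batches(total_rows: u64, batch_size: u64) -> u64 {
    (total_rows + batch_size - 1) / batch_size
}

fn main() {
    assert_eq!(purge_batches(10_000, 500), 20);
    assert_eq!(purge_batches(10_001, 500), 21); // partial final batch
    assert_eq!(purge_batches(0, 500), 0); // nothing to purge
    // total pause added between batches: (n - 1) * 50 ms
    let pauses_ms = purge_batches(10_000, 500).saturating_sub(1) * 50;
    assert_eq!(pauses_ms, 950);
    println!("ok");
}
```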
PERF-6 — Columnar change tracking Phase 1 — CDC bitmask
Deferred from v0.17.0 (twice) and v0.18.0. Dog-feeding now provides concrete internal workload data that justifies the schema change. Phase 1 only: compute the `changed_columns` bitmask (`old.col IS DISTINCT FROM new.col`) in the CDC trigger for UPDATE rows; store as `int8` in the change buffer. Phase 2 (delta-scan filtering using the bitmask) is deferred to v0.22.0. Gate behind the `pg_trickle.columnar_tracking` GUC (default `off`). This is the foundation for 50–90% delta-volume reduction on wide-table UPDATE workloads.
Verify: UPDATE a 20-column row, changing 2 columns; assert the `changed_columns` bitmask has exactly 2 bits set. `just check-upgrade-all` passes. Dependencies: None. Schema change: Yes (change-buffer schema addition + migration script).
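A Rust model of the bitmask computation the CDC trigger would perform; the `Option<&str>` column model and the `changed_columns` helper are illustrative, not the trigger code:

```rust
/// Sketch of the Phase 1 changed_columns bitmask: bit i is set when
/// column i differs between old and new. Option equality mirrors
/// IS DISTINCT FROM (NULLs compare as values, not as "unknown").
fn changed_columns(old: &[Option<&str>], new: &[Option<&str>]) -> i64 {
    let mut mask: i64 = 0;
    for (i, (o, n)) in old.iter().zip(new.iter()).enumerate() {
        if o != n {
            mask |= 1 << i;
        }
    }
    mask
}

fn main() {
    let old = [Some("a"), Some("b"), None, Some("d")];
    let new = [Some("a"), Some("x"), Some("c"), Some("d")];
    let mask = changed_columns(&old, &new);
    assert_eq!(mask, 0b0110); // columns 1 and 2 changed
    assert_eq!(mask.count_ones(), 2); // exactly 2 bits set, as in the Verify step
    println!("ok");
}
```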
Scalability
| ID | Title | Effort | Priority |
|---|---|---|---|
| SCAL-1 | DF STs refresh within window at 100 user stream tables | S | P1 |
| SCAL-2 | pgt_refresh_history retention interacts correctly with dog-feeding | S | P1 |
| SCAL-3 | 1-hour rolling window doesn't over-aggregate when history is sparse | XS | P2 |
SCAL-1 — DF STs refresh within window at 100 user stream tables
With 100 user STs generating up to 100 history rows per 48 s window, DF-1 processes up to ~7,500 rows/hour. Verify that the DIFFERENTIAL refresh of DF-1 completes within its 48 s schedule interval at this load, leaving margin for DF-2 and DF-3. If DF-1 duration exceeds 10 s, investigate query plan and index usage. Run as part of the soak-test at high table count.
Verify: soak test with 100 STs; DF-1 refresh duration < 10 s throughout. Dependencies: PERF-1. Schema change: No.
SCAL-2 — pgt_refresh_history retention interacts correctly with dog-feeding
`pg_trickle.history_retention_days` (shipped in v0.19.0, default 90 days) purges old history rows. DF-1 only looks back 1 hour, so retention does not affect correctness. However, the purge job must not hold a long-running lock that delays CDC trigger firing on concurrent INSERTs into the history table. Verify that the cleanup job uses a `DELETE … RETURNING` batch strategy with short transactions to avoid blocking DF CDC triggers.
Verify: E2E test running the history purge job while DF-1 is being refreshed; no lock-wait timeout, no CDC trigger delay. Dependencies: DF-F1. Schema change: No.
SCAL-3 — 1-hour rolling window doesn't over-aggregate when history is sparse
For a stream table that refreshes every 30 minutes (2 refreshes/hour), the DF-1 1-hour window contains at most 2 rows. The `AVG()` aggregate is still meaningful, but `percentile_cont(0.95)` over 2 rows is misleading. Document the minimum sample size (in the `confidence` column of DF-3) and add a note in SQL_REFERENCE.md that DF stats are most meaningful for STs refreshing every 60 s or faster.
Verify: SQL_REFERENCE.md updated; `confidence = 'LOW'` for STs with `total_refreshes < 10`. Dependencies: DF-A2. Schema change: No.
Ease of Use
| ID | Title | Effort | Priority |
|---|---|---|---|
| UX-1 | pgtrickle.dog_feeding_status() diagnostic function | S | P0 |
| UX-2 | setup_dog_feeding() warm-up hint when history is sparse | XS | P1 |
| UX-3 | NOTIFY on anomaly via pg_trickle_alert channel | S | P1 |
| UX-4 | GETTING_STARTED.md: "Day 2 operations" section | S | P1 |
| UX-5 | explain_st() shows if a DF ST covers the queried stream table | XS | P2 |
| UX-6 | recommend_refresh_mode() exposed in explain_st() JSON output | XS | P2 |
| UX-7 | scheduler_overhead() output included in TUI diagnostics panel | XS | P2 |
| UX-8 | df_threshold_advice extended with SLA headroom column | S | P2 |
UX-1 — pgtrickle.dog_feeding_status() diagnostic function
A single-query overview of the dog-feeding analytics plane: name, last refresh timestamp, row count, and whether the DF ST is ACTIVE / SUSPENDED / NOT_CREATED. Calling this function is the first thing an operator should run to check that dog-feeding is working. Return type:
`TABLE(df_name text, status text, last_refresh timestamptz, row_count bigint, note text)`.
Verify: function returns 5 rows when all DF STs are active; returns rows with `status = 'NOT_CREATED'` when `setup_dog_feeding()` has not been called. Schema change: No (new function only).
UX-2 — setup_dog_feeding() warm-up hint when history is sparse
If `pgt_refresh_history` has fewer than 50 rows when `setup_dog_feeding()` is called, emit a NOTICE: "Dog-feeding stream tables created. DF analytics will populate as refresh history accumulates (currently N rows; recommend ≥ 50 before consulting df_threshold_advice)." This prevents operators from acting on meaningless LOW-confidence advice immediately after setup.
Verify: call `setup_dog_feeding()` on a fresh install; assert the NOTICE contains the row count and the ≥ 50 recommendation. Dependencies: DF-F4. Schema change: No.
UX-3 — NOTIFY on anomaly via pg_trickle_alert channel
When `df_anomaly_signals` detects a `duration_anomaly IS NOT NULL` or `recent_failures >= 2` after a refresh, emit a `pg_notify('pg_trickle_alert', payload::text)` with `event = 'dog_feed_anomaly'`, the stream table name, anomaly type, last duration, baseline, and a plain-English recommendation. This integrates with existing alert pipelines without requiring a new channel. Fires from a post-refresh trigger on `df_anomaly_signals` or from the auto-apply post-tick hook.
Verify: E2E test LISTENs on `pg_trickle_alert`; inject a 3× duration spike; assert the NOTIFY payload arrives with the correct anomaly type. Dependencies: DF-A1. Schema change: No.
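An illustrative payload shape — field names beyond `event` are assumptions, and the real payload is built by the trigger/hook, not this helper:

```rust
/// Hypothetical shape of the `pg_trickle_alert` payload for a
/// dog_feed_anomaly event, encoded by hand to keep the sketch
/// dependency-free (a real implementation would use a JSON library).
fn anomaly_payload(st: &str, anomaly: &str, last_ms: f64, baseline_ms: f64) -> String {
    format!(
        "{{\"event\":\"dog_feed_anomaly\",\"stream_table\":\"{st}\",\
         \"anomaly\":\"{anomaly}\",\"last_duration_ms\":{last_ms},\
         \"baseline_ms\":{baseline_ms},\
         \"recommendation\":\"inspect recent source churn or raise auto_threshold\"}}"
    )
}

fn main() {
    let p = anomaly_payload("active_orders", "DURATION_SPIKE", 930.0, 210.0);
    assert!(p.contains("\"event\":\"dog_feed_anomaly\""));
    assert!(p.contains("\"anomaly\":\"DURATION_SPIKE\""));
    println!("{p}");
}
```

One constraint worth keeping in mind: NOTIFY payloads are limited to roughly 8000 bytes in a default build, which is part of why the risk list requires a failed notify not to roll back the refresh.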
UX-4 — GETTING_STARTED.md: "Day 2 operations" section
Add a new section to `docs/GETTING_STARTED.md` covering the first steps after initial deployment: (1) enable dog-feeding with `setup_dog_feeding()`, (2) check status with `dog_feeding_status()`, (3) query `df_threshold_advice` to tune thresholds, (4) set up anomaly alerting via LISTEN. This gives new users a clear post-install checklist and demonstrates the dog-feeding value proposition immediately.
Verify: documentation PR reviewed; code examples in GETTING_STARTED.md execute without modification. Dependencies: UX-1, UX-2. Schema change: No.
UX-5 — explain_st() shows if a DF ST covers the queried stream table
When a user calls `pgtrickle.explain_st('my_table')`, append a line `"Dog-feeding coverage: df_efficiency_rolling ✓, df_threshold_advice ✓"` (or `"Not set up — run setup_dog_feeding()"`) to the output. This surfaces the analytics plane to users who might not know dog-feeding exists, without requiring a separate function call.
Verify: `SELECT explain_st('any_table')` output includes a `dog_feeding` field in the JSON output. Dependencies: UX-1. Schema change: No.
UX-8 — df_threshold_advice extended with SLA headroom column
Extend the DF-3 defining query to include a computed `sla_headroom_ms` column: `freshness_deadline_ms - avg_diff_ms` from `pgt_refresh_history`. When `sla_headroom_ms < 0`, add a boolean `sla_breach_risk = true` flag so operators can see at a glance which STs risk missing their freshness SLA on the next DIFFERENTIAL cycle. The `freshness_deadline` column already exists in `pgt_refresh_history` (since v0.2.3). No schema change required.
Verify: create an ST with a tight `freshness_deadline`; run slow synthetic refreshes; assert `df_threshold_advice.sla_breach_risk = true`. Dependencies: DF-A2. Schema change: No (view column addition only).
UX-6 — recommend_refresh_mode() exposed in explain_st() JSON output
`explain_st()` already shows dog-feeding coverage (UX-5). Extend its JSON output with a `recommended_mode` field reading from `df_threshold_advice` (OPS-1). If OPS-1 is not available (no DF setup), fall back to `null` with a `setup_dog_feeding()` hint. Keeps the single-function diagnostic surface comprehensive without requiring separate calls.
Verify: `SELECT explain_st('any_table')` JSON includes `recommended_mode` and `mode_confidence` fields. Dependencies: OPS-1. Schema change: No.
UX-7 — scheduler_overhead() output included in TUI diagnostics panel
The TUI (`pgtrickle-tui`) already shows refresh latency sparklines and ST status. Add a diagnostics panel (toggle key `D`) showing the fields from `scheduler_overhead()`: busy ratio, queue depth, and DF fraction as a percentage. Gives operators hands-on observability without needing psql.
Verify: TUI diagnostics panel shows all three scheduler-overhead fields; `df_refresh_fraction` updates after each DF refresh cycle. Dependencies: OPS-3. Schema change: No.
Test Coverage
| ID | Title | Effort | Priority |
|---|---|---|---|
| TEST-1 | Property test: DF-3 recommended threshold always ∈ [0.01, 0.80] | S | P0 |
| TEST-2 | Light E2E: dog-feeding create/refresh/teardown full cycle | S | P0 |
| TEST-3 | Upgrade test: pgt_refresh_history rows survive 0.19.0 → 0.20.0 | S | P0 |
| TEST-4 | Regression test: DF STs absent from check_cdc_health() anomaly list | XS | P1 |
| TEST-5 | Stability test: dog-feeding under 1-h soak with 50 user STs | M | P1 |
| TEST-6 | Light E2E: setup_dog_feeding() idempotency (3× call) | XS | P1 |
TEST-1 — Property test: DF-3 recommended threshold always ∈ [0.01, 0.80]
Implements CORR-1 as a `proptest` unit test. Generate random `(avg_diff_ms: 0.0–100_000.0, avg_full_ms: 0.0–100_000.0, current: 0.01–0.80)` triples, compute the DF-3 CASE expression in Rust, and assert the output ∈ [0.01, 0.80]. Can be a pure Rust unit test in `src/refresh.rs` alongside the existing `compute_adaptive_threshold` tests — no database required.
Verify: `just test-unit` passes; 10,000 proptest iterations with zero failures. Dependencies: CORR-1. Schema change: No.
TEST-2 — Light E2E: dog-feeding create/refresh/teardown full cycle
A light E2E test (stock `postgres:18.3` container) that: (1) installs the extension, (2) creates 3 user STs, (3) runs 5 refresh cycles to populate history, (4) calls `setup_dog_feeding()`, (5) refreshes all DF STs once, (6) asserts `dog_feeding_status()` shows 5 active STs, (7) calls `teardown_dog_feeding()`, (8) asserts all DF STs are gone.
Verify: test passes in `just test-light-e2e` with zero assertions failed. Schema change: No.
TEST-3 — Upgrade test: pgt_refresh_history rows survive 0.19.0 → 0.20.0
The 0.19.0 → 0.20.0 migration adds an index to `pgt_refresh_history` (PERF-1). The upgrade must not truncate, reorder, or modify existing history rows. Write an upgrade E2E test: deploy 0.19.0, run 10 refreshes, `ALTER EXTENSION pg_trickle UPDATE`, then assert all 10 history rows are intact and the new index exists.
Verify: upgrade E2E test passes; `SELECT count(*) FROM pgt_refresh_history` unchanged after upgrade. Schema change: Yes (index).
TEST-4 — Regression test: DF STs absent from check_cdc_health() anomaly list
`pgtrickle.check_cdc_health()` scans all stream tables for CDC anomalies. After `setup_dog_feeding()`, DF STs must not appear in the anomaly list just because they are refreshed at longer intervals (48–96 s). Their schedules must be recognised as intentionally relaxed, not "falling behind".
Verify: E2E test: `setup_dog_feeding()` → wait one full DF cycle → assert `check_cdc_health()` returns no anomalies for any `df_` table. Dependencies: DF-F4. Schema change: No.
TEST-5 — Stability test: dog-feeding under 1-h soak with 50 user STs
Extends DF-D4. Runs 50 user STs + 5 DF STs for 1 hour under steady insert load (1 000 rows/min across all sources). Assertions: (a) all DF STs remain ACTIVE, (b) no OOM or background-worker crash, (c) DF-1 avg refresh duration < 5 s throughout, (d) `pgtrickle.dog_feeding_status()` shows 5 active STs at the end of the run.
Verify: soak test passes with all four assertions. Dependencies: DF-D4, SCAL-1. Schema change: No.
TEST-6 — Light E2E: setup_dog_feeding() idempotency (3× call)
Implements STAB-1 as a light E2E test. Call `setup_dog_feeding()` three consecutive times in the same session. Assert: no errors, exactly five `df_` stream tables in `pgt_stream_tables`, and no duplicate triggers in `pg_trigger` for the history table.
Verify: test passes in `just test-light-e2e`; `SELECT count(*) FROM pgtrickle.pgt_stream_tables WHERE pgt_name LIKE 'df_%'` returns 5 after all three calls. Dependencies: STAB-1. Schema change: No.
Conflicts & Risks
- PERF-1 (index addition) requires a migration script change. Adding `CREATE INDEX CONCURRENTLY` to the 0.19.0 → 0.20.0 migration must be tested with `just check-upgrade-all`. `CONCURRENTLY` cannot run inside a transaction block — the migration must issue it outside the default single-transaction DDL wrapper.
- UX-3 (NOTIFY on anomaly) fires from a post-refresh path. If the `pg_notify()` call fails (e.g., payload too large), it must not roll back the DF-2 refresh. Wrap the notify in a `BEGIN … EXCEPTION WHEN OTHERS THEN NULL END` block, or fire it from a deferred trigger.
- STAB-3 (DROP EXTENSION cycle) requires DF STs to be extension-owned or cleanly unregistered. If DF STs are not extension-owned objects, `DROP EXTENSION CASCADE` will not drop them. Either register them as extension members or document that `teardown_dog_feeding()` must be called before `DROP EXTENSION`.
- TEST-5 (soak test) overlaps with the existing soak test in CI. Add it to the daily `stability-tests.yml` workflow rather than `ci.yml` to avoid extending PR CI time. Mark with `#[ignore]` and trigger via `just test-soak`.
- CORR-5 / PERF-4 interaction. The `start_time > now() - interval '1 hour'` boundary and the index depend on the planner choosing an index-range scan. On very busy deployments where the cardinality estimate is off, the planner may prefer a seq-scan. Consider adding `SET enable_seqscan = off` inside the DF stream table queries if plan stability is a concern.
- PERF-6 (columnar tracking) is a schema change — deferred twice already. The `changed_columns` column addition to all change-buffer tables requires a migration script. Gate strictly behind the `pg_trickle.columnar_tracking = off` default. If capacity is tight, PERF-6 can be cut from v0.20.0 without affecting any other item — it shares no code paths with the DF pipeline.
- OPS-2 (`check_cdc_health()` enrichment) has a fallback requirement. When `setup_dog_feeding()` has not been called, the function must fall back to the old full-scan path without error. Guard with a catalog check for `df_cdc_buffer_trends` existence before querying it.
- OPS-4 (`explain_dag()`) output size. At 100+ user STs the Mermaid output may exceed typical terminal width. Offer `format => 'dot'` and `limit => N` arguments to constrain output. Default `format => 'mermaid'` with a NOTICE when the DAG has > 20 nodes.
- OPS-6 (workload-aware poll) writes to the scheduler hot path. The `compute_adaptive_poll_ms()` function is called on every scheduler tick. The DF-5 read must be a single O(1) catalog lookup (latest row only), not a full table scan. Guard with `ORDER BY collected_at DESC LIMIT 1`. If the DF-5 table does not exist (dog-feeding not set up), fall back to the old backoff logic without error.
- DASH-1 (Grafana) depends on postgres-exporter SQL queries. The dashboard panels use custom SQL collectors in the postgres-exporter config. Verify that the `monitoring/` docker-compose already mounts the query config; if not, add a `pg_trickle_df_queries.yaml` collector file alongside the existing exporter config.
- DBT-1 macro idempotency. The `pgtrickle_enable_monitoring` macro calls `setup_dog_feeding()` on every `dbt run`. Document that this is intentionally safe (STAB-1) and adds < 5 ms overhead per run.
v0.20.0 total: ~3–4 weeks
Exit criteria:
- DF-F1: `pgt_refresh_history` receives CDC INSERT triggers when `create_stream_table()` is called
- DF-F2: `df_efficiency_rolling` created and refreshes correctly in DIFFERENTIAL mode
- DF-F3: DF-1 output matches `refresh_efficiency()` results on synthetic history
- DF-F4: `setup_dog_feeding()` creates all five `df_*` stream tables in one call
- DF-F5: `teardown_dog_feeding()` drops all `df_*` tables cleanly with no orphaned triggers
- DF-A1: `df_anomaly_signals` created and detects 3× duration spikes
- DF-A2: `df_threshold_advice` provides HIGH-confidence recommendations after ≥ 20 refresh cycles
- DF-A3: DAG ensures DF-1 refreshes before DF-2 and DF-3 in every scheduler tick
- DF-C1: `df_cdc_buffer_trends` created (FULL or DIFFERENTIAL mode)
- DF-C2: `df_scheduling_interference` detects overlapping concurrent refreshes
- DF-G1: `pg_trickle.dog_feeding_auto_apply` GUC registered with default `off`
- DF-G2: Auto-apply adjusts threshold with ≥ 1 confirmed change in E2E test
- DF-G5: Rate limiting verified — no more than 1 change per ST per 10 minutes
- DF-D3: Suspending all `df_*` STs does not affect control-plane operation
- CORR-1: `df_threshold_advice` output always within [0.01, 0.80] (property test)
- CORR-2: No false-positive DURATION_SPIKE on first-ever refresh of a new ST
- CORR-3: `avg_change_ratio` is NULL or in [0, 1] for zero-delta sources
- CORR-4: Only INSERT triggers (no UPDATE/DELETE) on `pgt_refresh_history`
- STAB-1: `setup_dog_feeding()` called 3× produces no errors and no duplicates
- STAB-2: Auto-apply worker logs WARNING (not panic) when ALTER target disappears
- STAB-3: DROP EXTENSION + CREATE EXTENSION + `setup_dog_feeding()` cycle works cleanly
- PERF-1: `pgt_refresh_history(pgt_id, start_time)` index exists and is used by DF queries
- PERF-2: DF-1 read ≥ 5× faster than `refresh_efficiency()` at 10 K history rows
- UX-1: `pgtrickle.dog_feeding_status()` returns correct status for all five DF STs
- UX-2: `setup_dog_feeding()` emits warm-up NOTICE when history has < 50 rows
- UX-3: `pg_trickle_alert` NOTIFY received within one DF cycle after a 3× duration spike
- TEST-1: Proptest for DF-3 threshold bounds passes 10,000 iterations
- TEST-2: Light E2E full cycle test passes
- TEST-3: Upgrade E2E: history rows intact and index present after 0.19.0 → 0.20.0
- TEST-4: `check_cdc_health()` reports no anomalies for `df_*` tables after setup
- OPS-1: `recommend_refresh_mode()` returns `mode` ∈ {'DIFFERENTIAL','FULL','AUTO'} and `confidence` ∈ {'HIGH','MEDIUM','LOW'}
- OPS-2: `check_cdc_health()` returns spill-risk alert when buffer growth rate extrapolates to breach threshold within 2 cycles
- OPS-3: `scheduler_overhead()` returns non-NULL fields after ≥ 5 refresh cycles; `df_refresh_fraction < 0.01` in soak test
- OPS-4: `explain_dag()` output contains all five `df_*` nodes after `setup_dog_feeding()`
- OPS-5: `sql/dog_feeding_setup.sql` executes without errors on a fresh install
- PERF-5: Concurrent history purge + DF CDC INSERT produces no lock wait timeouts in soak test
- PERF-6: `changed_columns` bitmask stored in change buffer for UPDATE rows when `columnar_tracking = on` (if included)
- OPS-6: Soak test shows lower `overlap_count` in DF-5 with workload-aware poll enabled vs disabled
- DASH-1: `docker compose up` in `monitoring/` loads the pg_trickle_dog_feeding dashboard; all 5 panels show data
- DBT-1: `pgtrickle_enable_monitoring` macro runs twice without error; `dog_feeding_status()` shows 5 active STs after both calls
- UX-8: `df_threshold_advice.sla_breach_risk = true` when `avg_diff_ms > freshness_deadline_ms` on synthetic data
- Extension upgrade path tested (0.19.0 → 0.20.0)
- `just check-version-sync` passes
v0.21.0 — PostgreSQL 17 Support
Release Theme: This release adds PostgreSQL 17 as a supported target alongside PostgreSQL 18. PGlite is built on PostgreSQL 17, so this is a hard prerequisite for the PGlite proof of concept (v0.22.0). The pgrx 0.17.x framework already supports PG 17 — the work is enabling the feature flag, adapting version-sensitive code paths, expanding the CI matrix, and validating the full test suite against a PG 17 instance.
Cargo & Build System
| Item | Description | Effort | Ref |
|---|---|---|---|
| PG17-1 | Add pg17 feature to Cargo.toml. Define pg17 = ["pgrx/pg17", "pgrx-tests/pg17"] feature. Keep default = ["pg18"]. | 1h | — |
| PG17-2 | Broaden #[cfg] guards in src/dag.rs. Three #[cfg(feature = "pg18")] blocks must become #[cfg(any(feature = "pg17", feature = "pg18"))]. | 1–2h | — |
| PG17-3 | Guard NodeTag numeric assertions. src/dvm/parser/mod.rs asserts specific NodeTag integer values (e.g., T_GroupingSet = 107) that shift between PG versions. Gate behind #[cfg(feature = "pg18")] or use per-version value tables. | 2–4h | — |
| PG17-4 | Audit pg_sys::* API surface. Verify that every pg_sys call compiles and behaves correctly on PG 17 bindings. Focus on catalog struct field names, WAL decoder types, and any PG 18-only additions. | 4–8h | — |
CI & Infrastructure
| Item | Description | Effort | Ref |
|---|---|---|---|
| PG17-5 | CI matrix expansion. Add PG 17 build + unit test job to ci.yml. Use postgres:17 Docker image for integration and light E2E tests. | 4–8h | — |
| PG17-6 | justfile parameterisation. Add pg17 variants for build, test, and package recipes (e.g., just build-pg17, just test-e2e-pg17). | 2–4h | — |
| PG17-7 | tests/Dockerfile.e2e PG version parameter. Accept a build arg for the base PostgreSQL image version so the same Dockerfile works for PG 17 and PG 18. | 2–4h | — |
| PG17-8 | Scripts parameterisation. Update run_unit_tests.sh, run_light_e2e_tests.sh, run_e2e_tests.sh to accept a PG version argument instead of hardcoding pg18. | 2–4h | — |
Testing & Validation
| Item | Description | Effort | Ref |
|---|---|---|---|
| PG17-9 | Full E2E suite against PG 17. Run the complete E2E test suite against a PG 17 instance. Fix any parser or catalog incompatibilities that surface. | 1–2d | — |
| PG17-10 | TPC-H validation on PG 17. Run TPC-H benchmark queries on PG 17 to verify differential refresh correctness for complex queries. | 4–8h | — |
| PG17-11 | Upgrade path test. Verify ALTER EXTENSION pg_trickle UPDATE from 0.20.0 to 0.21.0 works on both PG 17 and PG 18. | 2–4h | — |
Documentation
| Item | Description | Effort | Ref |
|---|---|---|---|
| PG17-12 | Update docs and README. Change "PostgreSQL 18 extension" to "PostgreSQL 17/18 extension" in README.md, INSTALL.md, src/lib.rs doc comments, and ARCHITECTURE.md. | 1–2h | — |
| PG17-13 | Docker Hub image variants. Publish images tagged with both PG versions (e.g., :0.20.0-pg17, :0.20.0-pg18). | 2–4h | — |
v0.21.0 total: ~2–4 days
Exit criteria:
- PG17-1: `cargo build --features pg17 --no-default-features` compiles cleanly
- PG17-2/PG17-3: `cargo clippy --features pg17 --no-default-features` passes with zero warnings
- PG17-4: No `pg_sys` compile errors on PG 17 bindings
- PG17-5: CI runs unit + integration + light E2E tests on PG 17
- PG17-9: Full E2E suite passes on PG 17 with zero failures
- PG17-10: TPC-H differential refresh matches full refresh on PG 17
- PG17-11: Extension upgrade path works on both PG 17 and PG 18
- PG17-12: Documentation reflects PG 17/18 dual support
- Extension upgrade path tested (0.20.0 → 0.21.0)
- `just check-version-sync` passes
v0.22.0 — PGlite Proof of Concept
Release Theme: This release validates whether PGlite users want real incremental view maintenance by shipping a lightweight TypeScript plugin with zero core changes. The plugin (`@pgtrickle/pglite-lite`) intercepts DML via statement-level AFTER triggers and applies pre-computed delta SQL for simple patterns — single-table aggregates, two-table inner joins, and filtered scans. It deliberately limits scope to 3–5 SQL patterns to keep effort low while generating a concrete demand signal. If adoption materialises, the full core extraction (v0.23.0) and WASM build (v0.24.0) proceed. The main pg_trickle PostgreSQL extension ships no functional changes in this release — only version bumps and upgrade migration plumbing.
See PLAN_PGLITE.md for the full feasibility report.
PGlite JS Plugin PoC (Strategy C — Phase 0)
In plain terms: PGlite's built-in `live.incrementalQuery()` re-runs the full query on every change and diffs at the JavaScript layer. This proof of concept ships a PGlite plugin (`@pgtrickle/pglite-lite`) that intercepts DML via statement-level AFTER triggers and applies pre-computed delta SQL for simple cases — single-table aggregates and two-table inner joins. It validates whether PGlite users want real IVM and whether the trigger infrastructure works correctly in PGlite's single-user WASM mode. No WASM compilation, no pgrx changes, no core refactoring required.
| Item | Description | Effort | Ref |
|---|---|---|---|
| PGL-0-1 | PGlite trigger infrastructure validation. Empirically verify that statement-level triggers with REFERENCING NEW TABLE AS ... OLD TABLE AS ... work in PGlite's single-user mode. Document any limitations. | 4–8h | PLAN_PGLITE.md §8 Q1 |
| PGL-0-2 | Delta SQL templates for simple patterns. Implement delta SQL generation in TypeScript for: (a) single-table GROUP BY with COUNT/SUM/AVG, (b) two-table INNER JOIN, (c) simple WHERE filter. Pre-compute at createStreamTable() time. | 2–3d | PLAN_PGLITE.md §5 Strategy C |
| PGL-0-3 | PGlite plugin skeleton. TypeScript plugin implementing createStreamTable(), dropStreamTable(), trigger registration, and delta application via PGlite's plugin API. | 2–3d | PLAN_PGLITE.md §5 Strategy C |
| PGL-0-4 | npm package @pgtrickle/pglite-lite. Package, publish, README with usage examples, and 3–5 supported SQL patterns documented. | 1–2d | — |
| PGL-0-5 | Benchmark vs live.incrementalQuery(). Compare latency and throughput for a 10K-row table with single-row inserts. Quantify the IVM advantage. | 1d | PLAN_PGLITE.md §4.2 |
Phase 0 subtotal: ~2–3 weeks
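To make the PGL-0-2 idea concrete, the delta templates are pre-computed SQL strings generated from a parsed pattern. The sketch below is illustrative only — `aggregateDeltaSql`, the `new_rows`/`old_rows` transition-table aliases, and the `k`/`cnt`/`sum` target columns are assumptions, not the plugin's actual API. It covers the single-table GROUP BY + COUNT/SUM pattern:

```typescript
// Shape of a pre-computed delta template for single-table
// GROUP BY + COUNT/SUM. Assumes the stream table has columns
// (k, cnt, sum) with a unique index on k, and that the
// statement-level trigger exposes transition tables aliased
// as new_rows / old_rows.
interface AggregatePattern {
  target: string;   // stream table name
  groupKey: string; // GROUP BY column on the source table
  sumCol: string;   // SUM() argument
}

function aggregateDeltaSql(p: AggregatePattern): string[] {
  const merge = `
WITH raw_delta AS (
  SELECT ${p.groupKey} AS k, 1 AS dcnt, ${p.sumCol} AS dsum FROM new_rows
  UNION ALL
  SELECT ${p.groupKey}, -1, -${p.sumCol} FROM old_rows
),
delta AS (
  -- collapse to one row per key so ON CONFLICT fires at most once
  SELECT k, sum(dcnt) AS dcnt, sum(dsum) AS dsum
    FROM raw_delta GROUP BY k
)
INSERT INTO ${p.target} AS st (k, cnt, sum)
SELECT k, dcnt, dsum FROM delta
ON CONFLICT (k) DO UPDATE
  SET cnt = st.cnt + EXCLUDED.cnt,
      sum = st.sum + EXCLUDED.sum`.trim();
  // Groups whose count reached zero are pruned in a second statement:
  // folding this into the merge via a data-modifying CTE is unsafe,
  // because an outer DELETE cannot see rows updated by its own CTE.
  const prune = `DELETE FROM ${p.target} WHERE cnt = 0`;
  return [merge, prune];
}
```

AVG falls out of the same state: stored as (sum, cnt), reconstructed as sum / cnt at read time. NULL group keys need extra care — that is exactly the CORR-2 pitfall below.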
Correctness
| ID | Title | Effort | Priority |
|---|---|---|---|
| CORR-1 | Delta SQL equivalence for supported patterns | M | P0 |
| CORR-2 | NULL-key aggregate correctness in JS delta | S | P0 |
| CORR-3 | Multi-DML transaction atomicity | S | P1 |
CORR-1 — Delta SQL equivalence for supported patterns
In plain terms: The TypeScript delta SQL templates must produce the exact same stream table state as a full query re-evaluation, for every combination of INSERT, UPDATE, and DELETE on the supported patterns (single-table GROUP BY + COUNT/SUM/AVG, two-table INNER JOIN, simple WHERE filter). Correctness is proven by running each DML operation, comparing the delta-maintained result against a fresh `SELECT`, and asserting row-for-row equivalence.
Verify: automated test suite runs 100+ randomised DML sequences per pattern; zero divergence from full re-evaluation. Dependencies: PGL-0-2, PGL-0-3. Schema change: No.
CORR-2 — NULL-key aggregate correctness in JS delta
In plain terms: When a GROUP BY key is NULL, grouping semantics put all NULL keys into a single group of their own — NULLs are treated as "not distinct" for grouping, even though `NULL = NULL` is unknown under three-valued logic. The TypeScript delta templates must handle NULL group keys correctly — insertions into the NULL group, deletions that empty it, and updates that move rows in/out of the NULL group. This is the most common correctness pitfall in hand-rolled IVM.
Verify: E2E test with nullable GROUP BY column; assert NULL group appears, grows, shrinks, and disappears correctly. Dependencies: CORR-1. Schema change: No.
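One concrete guard worth noting (a sketch under assumptions — `streamTableDdl` and the `k`/`cnt`/`sum` columns are illustrative, not the plugin's actual schema): a plain UNIQUE index treats NULLs as distinct, so an ON CONFLICT-based delta merge would insert a fresh row for every NULL-key delta instead of updating the NULL group. PostgreSQL 15+ (and therefore PGlite's PG 17) supports `UNIQUE NULLS NOT DISTINCT`, which makes the NULL group collide like any other key:

```typescript
// Illustrative DDL helper for the stream table behind an aggregate
// pattern. The NULLS NOT DISTINCT clause makes all NULL keys map to
// one row, matching GROUP BY semantics under upsert-based merges.
function streamTableDdl(target: string, keyType: string): string {
  return [
    `CREATE TABLE ${target} (`,
    `  k ${keyType},`,
    `  cnt bigint NOT NULL,`,
    `  sum numeric,`,
    `  UNIQUE NULLS NOT DISTINCT (k)  -- one row for the NULL group`,
    `)`,
  ].join("\n");
}
```

Without this clause, the merge SQL would need `IS NOT DISTINCT FROM` key matching instead of `ON CONFLICT`, which loses the index-arbitrated upsert path.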
CORR-3 — Multi-DML transaction atomicity
In plain terms: PGlite runs in single-connection mode, so a `BEGIN; INSERT ...; DELETE ...; COMMIT` sequence fires two separate statement-level triggers. The plugin must ensure the stream table reflects the net effect of the entire transaction, not an intermediate state. If trigger ordering produces incorrect intermediate results, a post-transaction reconciliation pass is needed.
Verify: test with BEGIN; INSERT; UPDATE; DELETE; COMMIT on a single base
table; stream table matches full re-evaluation after commit.
Dependencies: PGL-0-3. Schema change: No.
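The reconciliation pass mentioned above can be a single atomic statement. A minimal sketch (illustrative names; assumes the defining query does not read the stream table itself): a data-modifying CTE clears the table while the outer INSERT rebuilds it, so readers never observe a half-applied state — the DELETE's snapshot sees only pre-statement rows, and the inserted rows survive.

```typescript
// Rebuild a stream table from its defining query in one statement.
// Used as a correctness fallback when per-statement deltas cannot be
// proven to compose across a multi-statement transaction.
function reconcileSql(target: string, definingQuery: string): string {
  return [
    `WITH cleared AS (DELETE FROM ${target})`,
    `INSERT INTO ${target} ${definingQuery}`,
  ].join("\n");
}
```

This trades the delta speedup for guaranteed convergence, so it should only run when a multi-DML transaction is detected.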
Stability
| ID | Title | Effort | Priority |
|---|---|---|---|
| STAB-1 | Trigger cleanup on dropStreamTable | S | P0 |
| STAB-2 | Graceful error on unsupported SQL | S | P0 |
| STAB-3 | Plugin idempotency (create-drop-create cycle) | S | P1 |
STAB-1 — Trigger cleanup on dropStreamTable
In plain terms: When a user calls `dropStreamTable()`, all statement-level AFTER triggers registered on source tables must be removed. Orphaned triggers would fire on every subsequent DML and attempt to write to a non-existent stream table, causing errors.
Verify: after `dropStreamTable()`, no pg_trickle-related triggers remain in `pg_trigger` for the source tables.
Dependencies: PGL-0-3. Schema change: No.
STAB-2 — Graceful error on unsupported SQL
In plain terms: The PoC supports only 3–5 SQL patterns. If a user passes an unsupported query (e.g., a LEFT JOIN, window function, or recursive CTE), the plugin must throw a clear, actionable error message listing what is supported — not silently produce wrong results or crash.
Verify: createStreamTable() with an unsupported query throws an error
whose message names the unsupported feature and lists supported alternatives.
Dependencies: PGL-0-2. Schema change: No.
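The gate can be sketched as follows (all names and the regex-based checks are illustrative assumptions — a real implementation would inspect the parse tree, not the SQL text):

```typescript
// Reject queries outside the supported patterns with an error that
// names the offending feature and lists the supported alternatives,
// instead of silently producing wrong results.
const SUPPORTED = [
  "single-table GROUP BY with COUNT/SUM/AVG",
  "two-table INNER JOIN",
  "single-table WHERE filter",
];

// Deliberately shallow textual checks for the PoC.
const UNSUPPORTED: Array<[RegExp, string]> = [
  [/\bleft\s+join\b/i, "LEFT JOIN"],
  [/\bover\s*\(/i, "window functions"],
  [/\bwith\s+recursive\b/i, "recursive CTEs"],
];

function assertSupported(query: string): void {
  for (const [pattern, feature] of UNSUPPORTED) {
    if (pattern.test(query)) {
      throw new Error(
        `${feature} is not supported in pglite-lite. ` +
          `Supported patterns: ${SUPPORTED.join("; ")}.`,
      );
    }
  }
}
```

Running the check inside `createStreamTable()` fails fast, before any triggers or state are created.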
STAB-3 — Plugin idempotency (create-drop-create cycle)
In plain terms: Creating a stream table, dropping it, and creating it again with the same name must work without leftover state. Leftover catalog rows, triggers, or temp tables from the first creation must not interfere with the second.
Verify: create-drop-create cycle produces correct results; no duplicate triggers or stale catalog entries. Dependencies: STAB-1. Schema change: No.
Performance
| ID | Title | Effort | Priority |
|---|---|---|---|
| PERF-1 | Benchmark vs live.incrementalQuery() | M | P0 |
| PERF-2 | Delta overhead profiling per DML | S | P1 |
| PERF-3 | Large result set scalability (10K/100K rows) | S | P1 |
PERF-1 — Benchmark vs live.incrementalQuery() (= PGL-0-5)
In plain terms: The entire value proposition of this PoC depends on being faster than PGlite's built-in `live.incrementalQuery()` for the supported patterns. Produce a public benchmark comparing latency and throughput for single-row inserts into a 10K-row base table across all three supported patterns (aggregate, join, filter).
Verify: delta-maintained stream table refresh latency < 50% of
live.incrementalQuery() latency for all supported patterns at 10K rows.
Dependencies: PGL-0-3, PGL-0-4. Schema change: No.
PERF-2 — Delta overhead profiling per DML
In plain terms: Measure the per-DML overhead added by the statement-level triggers. INSERT-heavy workloads should not suffer more than a 2× latency increase compared to the same INSERT without pg_trickle triggers installed. Profile trigger function execution time, temp table creation, and delta DML.
Verify: microbenchmark shows per-DML overhead < 2 ms for aggregate pattern; < 5 ms for join pattern at 10K source rows. Dependencies: PGL-0-3. Schema change: No.
PERF-3 — Large result set scalability (10K/100K rows)
In plain terms: Verify that the delta approach maintains its advantage over full re-evaluation as base table size grows. At 100K rows, the delta path should be significantly faster than full re-evaluation for single-row changes.
Verify: at 100K base table rows, single-row insert refresh latency is < 10% of full query re-evaluation latency. Dependencies: PERF-1. Schema change: No.
Scalability
| ID | Title | Effort | Priority |
|---|---|---|---|
| SCAL-1 | Multiple stream tables on same source | S | P1 |
| SCAL-2 | Cascading stream table triggers | M | P2 |
| SCAL-3 | Concurrent DML with multiple stream tables | S | P2 |
SCAL-1 — Multiple stream tables on same source
In plain terms: Verify that 3+ stream tables can be maintained from the same base table simultaneously. Each DML fires one trigger per stream table; ensure triggers do not interfere with each other.
Verify: 3 stream tables on the same source; INSERT + UPDATE + DELETE cycle; all 3 produce correct results. Dependencies: PGL-0-3. Schema change: No.
SCAL-2 — Cascading stream table triggers
In plain terms: If stream table B reads from stream table A's underlying storage, an INSERT into A's source should propagate through A's trigger, update A, and then fire B's trigger to update B — all within the same PGlite transaction. Verify this works in PGlite's single-connection environment without deadlocks or infinite trigger loops.
Verify: A->B cascade produces correct results for INSERT/DELETE on A's source. No infinite loops detected. Dependencies: SCAL-1. Schema change: No.
SCAL-3 — Concurrent DML with multiple stream tables
In plain terms: PGlite is single-connection, but a user could issue rapid sequential DML (`INSERT; INSERT; INSERT`) without explicit transactions. Verify all stream tables converge to the correct state.
Verify: 100 sequential INSERTs with 3 stream tables; final state matches full re-evaluation. Dependencies: SCAL-1. Schema change: No.
Ease of Use
| ID | Title | Effort | Priority |
|---|---|---|---|
| UX-1 | Getting-started README with copy-paste examples | S | P0 |
| UX-2 | Supported patterns decision table | XS | P0 |
| UX-3 | Error messages include remediation hints | S | P1 |
| UX-4 | TypeScript type definitions | S | P1 |
| UX-5 | ElectricSQL outreach and collaboration | S | P1 |
UX-1 — Getting-started README with copy-paste examples
In plain terms: The npm package README must include 3 complete, copy-pasteable examples — one per supported pattern — that a developer can run in under 2 minutes. Include Node.js and browser (Vite) examples.
Verify: all README examples execute without modification on a fresh PGlite instance. Dependencies: PGL-0-4. Schema change: No.
UX-2 — Supported patterns decision table
In plain terms: A clear table showing which SQL patterns are and are not supported, what error you get for unsupported patterns, and when full support is expected (v0.24.0). This prevents user frustration and sets expectations.
Verify: decision table in README and npm page lists all tested patterns with status (supported / unsupported / planned). Dependencies: None. Schema change: No.
UX-3 — Error messages include remediation hints
In plain terms: Every error thrown by the plugin must include the table name, the failing operation, and a one-sentence hint. Example:
"LEFT JOIN is not supported in pglite-lite. Use @pgtrickle/pglite (v0.24.0+) for full SQL support, or rewrite as INNER JOIN."
Verify: all error paths tested; every error message includes a remediation sentence. Dependencies: STAB-2. Schema change: No.
UX-4 — TypeScript type definitions
In plain terms: Ship `.d.ts` type definitions so TypeScript users get autocomplete and type checking for `createStreamTable()`, `dropStreamTable()`, and configuration options.
Verify: a TypeScript project consumes the plugin with strict mode enabled; no `any` types leak.
Dependencies: PGL-0-4. Schema change: No.
UX-5 — ElectricSQL outreach and collaboration
In plain terms: PGlite is developed by ElectricSQL. Their cooperation is essential for Phase 2 (WASM build). Initiate contact before shipping Phase 0 to gauge interest, validate assumptions about PGlite's trigger infrastructure, and explore potential co-marketing.
Verify: documented exchange with ElectricSQL team (GitHub issue, email, or meeting notes). Dependencies: None. Schema change: No.
Test Coverage
| ID | Title | Effort | Priority |
|---|---|---|---|
| TEST-1 | Automated correctness suite (all patterns x DML types) | M | P0 |
| TEST-2 | PGlite version compatibility matrix | S | P1 |
| TEST-3 | Regression test: trigger firing order | S | P1 |
| TEST-4 | Bundle size monitoring | XS | P2 |
| TEST-5 | Extension upgrade path (0.21 to 0.22) | S | P0 |
TEST-1 — Automated correctness suite (all patterns x DML types)
In plain terms: For each supported pattern (aggregate, join, filter), run every DML type (INSERT, UPDATE, DELETE, multi-row, TRUNCATE) and assert the stream table matches a fresh full evaluation. This is the primary quality gate.
Verify: Jest/Vitest test suite with > 50 test cases; all pass on PGlite latest. Dependencies: PGL-0-2, PGL-0-3. Schema change: No.
TEST-2 — PGlite version compatibility matrix
In plain terms: PGlite updates frequently. Test the plugin against the last 3 PGlite releases to ensure trigger behavior hasn't changed. Document the minimum supported PGlite version.
Verify: CI matrix runs tests against PGlite N, N-1, N-2. Dependencies: TEST-1. Schema change: No.
TEST-3 — Regression test: trigger firing order
In plain terms: When multiple triggers exist on the same table, PostgreSQL fires them in alphabetical order by trigger name. Verify that trigger naming conventions prevent ordering conflicts with user-defined triggers.
Verify: test with a user-defined AFTER trigger alongside the plugin's trigger; both fire correctly; stream table produces correct results. Dependencies: PGL-0-3. Schema change: No.
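One naming convention that would address the ordering concern (a sketch — the `zzz_pgt_` prefix is an assumption, not the plugin's actual scheme): since PostgreSQL fires same-event triggers in alphabetical order by name, prefixing plugin triggers so they sort last lets user-defined triggers that maintain base-table invariants run before the delta is captured.

```typescript
// Generate a plugin trigger name that sorts after typical
// user-defined trigger names, so it fires last among AFTER
// triggers on the same table and event.
function triggerName(
  streamTable: string,
  sourceTable: string,
  op: "ins" | "upd" | "del",
): string {
  return `zzz_pgt_${streamTable}_${sourceTable}_${op}`;
}
```

The regression test then only needs to assert that no user trigger in the test fixture sorts after the prefix.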
TEST-4 — Bundle size monitoring
In plain terms: The npm package should be small (< 50 KB minified + gzipped) since this is a pure-JS plugin with no WASM. Add a CI check that fails if bundle size exceeds the threshold.
Verify: npm pack --dry-run reports < 50 KB gzipped.
Dependencies: PGL-0-4. Schema change: No.
TEST-5 — Extension upgrade path (0.21 to 0.22)
In plain terms: The main pg_trickle PostgreSQL extension ships no functional changes in v0.22.0, but the upgrade migration path must still be tested. `ALTER EXTENSION pg_trickle UPDATE` from 0.21.0 to 0.22.0 must leave existing stream tables intact.
Verify: upgrade E2E test confirms all existing stream tables survive and refresh correctly after the 0.21.0 -> 0.22.0 upgrade.
Dependencies: None. Schema change: No (PG extension unchanged).
Conflicts & Risks
- Demand uncertainty is the primary risk. This entire milestone is a bet that PGlite users want IVM beyond what pg_ivm provides. If Phase 0 generates no adoption signal, v0.23.0–v0.25.0 should be deprioritised and v1.0.0 proceeds without PGlite. Define a concrete adoption threshold (e.g., > 100 npm weekly downloads within 60 days of publication) as a go/no-go gate for v0.23.0.
- PGlite trigger infrastructure is unverified. PGL-0-1 (trigger validation) is a hard prerequisite for everything else. If statement-level triggers with transition tables do not work in PGlite's single-user mode, the entire Strategy C approach fails and the PoC must pivot to a pure JS diff approach (lower value).
- PGlite version mismatch. PGlite tracks PostgreSQL 17; pg_trickle targets PG 18. The PoC operates at the SQL level and should be unaffected, but if PGlite upgrades to PG 18 mid-cycle, trigger behavior may change. Pin the minimum PGlite version in `package.json`.
- No core Rust changes, but version bump required. The main pg_trickle extension needs a v0.22.0 version bump, upgrade migration SQL, and passing CI even though no functional code changes land. This is low-risk but must not be forgotten.
- ElectricSQL collaboration timing. UX-5 (outreach) should happen early — before v0.22.0 ships — to avoid building something ElectricSQL is already working on or would actively resist. If they signal interest in co-development, Phase 2 scope and timeline may shift.
- TypeScript delta SQL correctness is harder to prove than in Rust. The main extension uses property-based testing and SQLancer for correctness. The TS plugin lacks these tools. TEST-1 must be rigorously designed to compensate — consider porting the proptest approach to a JS property-testing library (e.g., fast-check).
v0.22.0 total: ~2–3 weeks (PGlite plugin) + ~1–2 days (PG extension version bump)
Exit criteria:
- PGL-0-1: Statement-level triggers with transition tables confirmed working in PGlite
- PGL-0-2: Delta SQL correct for single-table aggregate, two-table join, and filtered query
- PGL-0-3: `@pgtrickle/pglite-lite` plugin creates and maintains stream tables in PGlite
- PGL-0-4: npm package published with README and usage examples
- PGL-0-5: Benchmark shows measurable latency improvement over `live.incrementalQuery()` for supported patterns
- CORR-1: Automated delta SQL equivalence tests pass (100+ DML sequences per pattern)
- CORR-2: NULL-key aggregate groups correctly created, updated, and removed
- CORR-3: Multi-DML transaction produces correct net result
- STAB-1: No orphaned triggers after `dropStreamTable()`
- STAB-2: Unsupported SQL patterns produce clear, actionable errors
- STAB-3: Create-drop-create cycle produces correct results
- PERF-1: Delta refresh latency < 50% of `live.incrementalQuery()` at 10K rows
- PERF-3: Delta advantage holds at 100K rows (< 10% of full re-evaluation latency)
- SCAL-1: 3+ stream tables on same source produce correct results
- UX-1: README examples run unmodified on fresh PGlite instance
- UX-2: Supported patterns decision table published
- UX-4: TypeScript type definitions ship with strict-mode compatibility
- TEST-1: > 50 correctness test cases pass on PGlite latest
- TEST-2: CI tests pass against PGlite N, N-1, N-2
- TEST-5: Extension upgrade path tested (0.21.0 -> 0.22.0)
- `just check-version-sync` passes
v0.23.0 — Core Extraction (pg_trickle_core)
Release Theme: This release surgically separates pg_trickle's "brain" — the DVM engine, operator delta SQL generation, query rewrite passes, and DAG computation — into a standalone Rust crate (`pg_trickle_core`) with zero pgrx dependency. The extraction touches ~51,000 lines of code across 30+ source files but produces zero user-visible behavior change: every existing test must pass unchanged. The payoff is threefold: the core crate compiles to WASM (enabling the PGlite extension in v0.24.0), pure-logic unit tests run without a PostgreSQL instance (10× faster CI), and the main extension gains a cleaner internal architecture. Approximately 500 unsafe blocks in the parser require an abstraction layer over raw `pg_sys` node traversal, making this the most technically demanding refactoring in the project's history.
See PLAN_PGLITE.md §5 Strategy A for the full extraction architecture.
Core Crate Extraction (Phase 1)
In plain terms: pg_trickle's "brain" — the code that analyses SQL queries, builds operator trees, and generates delta SQL — is currently tangled with pgrx (the Rust-to-PostgreSQL bridge). This milestone surgically separates the pure logic into its own crate so it can be compiled independently. The existing extension continues to work unchanged; it just imports from `pg_trickle_core` instead of having the code inline. A `trait DatabaseBackend` abstracts SPI and parser access so the core logic can be tested without a running PostgreSQL instance.
| Item | Description | Effort | Ref |
|---|---|---|---|
| PGL-1-1 | Create pg_trickle_core crate. Workspace member with [lib] target, no pgrx dependency. Move OpTree, Expr, Column, AggExpr, and all shared types. | 1–2d | PLAN_PGLITE.md §5 Strategy A |
| PGL-1-2 | Extract operator delta SQL generation. Move all src/dvm/operators/ logic (~24K lines, 23 files) into the core crate. Each operator's generate_delta_sql() becomes a pure function taking abstract types. | 3–5d | PLAN_PGLITE.md §5 Strategy A |
| PGL-1-3 | Extract auto-rewrite passes. Move view inlining, DISTINCT ON rewrite, GROUPING SETS expansion, and SubLink extraction into pg_trickle_core::rewrites. | 2–3d | PLAN_PGLITE.md §5 Strategy A |
| PGL-1-4 | Extract DAG computation. Move dependency graph, topological sort, cycle detection, diamond detection into pg_trickle_core::dag. | 1–2d | PLAN_PGLITE.md §5 Strategy A |
| PGL-1-5 | Define trait DatabaseBackend. Abstract trait for SPI queries and raw_parser access. Implement for pgrx in the main extension crate. | 2–3d | PLAN_PGLITE.md §5 Strategy A |
| PGL-1-6 | WASM compilation gate. Verify pg_trickle_core compiles to wasm32-unknown-emscripten target. CI check for WASM build. | 1–2d | PLAN_PGLITE.md §5 Strategy A |
| PGL-1-7 | Existing test suite passes. All unit, integration, and E2E tests pass with the refactored crate structure. Zero behavior change. | 2–3d | — |
Phase 1 subtotal: ~3–4 weeks
Correctness
| ID | Title | Effort | Priority |
|---|---|---|---|
| CORR-1 | Delta SQL output byte-for-byte equivalence | M | P0 |
| CORR-2 | OpTree serialization round-trip fidelity | S | P0 |
| CORR-3 | Rewrite pass ordering preservation | S | P1 |
| CORR-4 | DAG cycle detection parity after extraction | S | P1 |
CORR-1 — Delta SQL output byte-for-byte equivalence
In plain terms: After the extraction, every operator's `generate_delta_sql()` must produce the exact same SQL string as it did before the refactoring. Any byte-level difference — even whitespace — indicates a semantic shift that could change query plans or correctness. Capture the SQL output for all 22 TPC-H stream tables before and after the extraction and assert bit-for-bit equality.
Verify: snapshot test comparing delta SQL for all TPC-H queries + the full E2E test suite. Any diff fails the build. Dependencies: PGL-1-2. Schema change: No.
CORR-2 — OpTree serialization round-trip fidelity
In plain terms: The `OpTree` types are moving to a new crate. If any field is accidentally dropped or retyped during the move, the delta SQL generator will silently produce wrong output. Add a round-trip test: serialize an `OpTree` to JSON, deserialize it back, and assert structural equality. This catches missing `#[derive]` attributes and field ordering issues.
Verify: proptest generating random OpTrees; serialize-deserialize round-trip produces identical trees. Dependencies: PGL-1-1. Schema change: No.
CORR-3 — Rewrite pass ordering preservation
In plain terms: The auto-rewrite passes (view inlining, DISTINCT ON, GROUPING SETS, SubLink extraction) must execute in the same order after extraction. Reordering could change the resulting OpTree and thereby the delta SQL. Add an integration test that runs all rewrite passes on a complex query (joining 3 tables with DISTINCT ON + GROUPING SETS) and asserts the final OpTree matches a golden snapshot.
Verify: golden-snapshot test for rewrite pass output on complex query. Dependencies: PGL-1-3. Schema change: No.
CORR-4 — DAG cycle detection parity after extraction
In plain terms: The cycle detection algorithm in `dag.rs` has subtleties around self-referencing views and diamond patterns. After moving to the core crate, the algorithm must detect the same cycles. Run the existing cycle-detection unit tests and add 3 new edge cases: self-referencing CTE, diamond with mixed IMMEDIATE/DIFFERENTIAL, and 4-level cascade.
Verify: all existing DAG unit tests pass + 3 new edge-case tests. Dependencies: PGL-1-4. Schema change: No.
Stability
| ID | Title | Effort | Priority |
|---|---|---|---|
| STAB-1 | pg_sys node abstraction layer (~500 unsafe blocks) | L | P0 |
| STAB-2 | Compile-time pgrx dependency leak detection | S | P0 |
| STAB-3 | Cargo workspace configuration correctness | S | P0 |
| STAB-4 | Extension upgrade path (0.22 to 0.23) | S | P0 |
| STAB-5 | Feature-flag isolation for WASM target | S | P1 |
STAB-1 — pg_sys node abstraction layer (~500 unsafe blocks)
In plain terms: `rewrites.rs` (118 unsafe blocks, 295 `pg_sys` refs) and `sublinks.rs` (367 unsafe blocks, 492 `pg_sys` refs) are the most deeply coupled to pgrx. The core crate cannot contain raw `pg_sys` calls. Define a `trait NodeVisitor` (or equivalent) that wraps pg_sys node traversal behind safe method calls. The pgrx backend implements the trait using actual pg_sys pointers; a mock backend can be used for unit tests. This is the single highest-effort item in the release.
Verify: zero `pg_sys::` references in `pg_trickle_core/`; `grep -r pg_sys pg_trickle_core/src/` returns empty.
Dependencies: PGL-1-1, PGL-1-5. Schema change: No.
STAB-2 — Compile-time pgrx dependency leak detection
In plain terms: After extraction, any accidental `use pgrx::*` in the core crate would break the WASM build. Add a CI job that compiles `pg_trickle_core` in isolation (without the pgrx feature) and fails if any pgrx symbol is referenced. This catches leaks immediately rather than at WASM build time.
Verify: cargo build -p pg_trickle_core --no-default-features succeeds in
CI.
Dependencies: PGL-1-1. Schema change: No.
STAB-3 — Cargo workspace configuration correctness
In plain terms: Adding a workspace member changes `Cargo.lock` resolution, feature unification, and `cargo pgrx` behavior. Verify: `cargo pgrx package` still produces a valid `.so`, `cargo test` runs all workspace tests, and `cargo pgrx test` works for the extension crate. The pgrx version must remain pinned at 0.17.x.
Verify: cargo pgrx package, cargo test --workspace, cargo pgrx test
all succeed.
Dependencies: PGL-1-1. Schema change: No.
STAB-4 — Extension upgrade path (0.22 → 0.23)
In plain terms: v0.23.0 makes no SQL-visible changes (same functions, same catalog schema), but the upgrade migration must still be tested. `ALTER EXTENSION pg_trickle UPDATE` from 0.22.0 to 0.23.0 must leave existing stream tables intact and refreshable.
Verify: upgrade E2E test confirms stream tables survive and refresh
correctly after 0.22.0 -> 0.23.0.
STAB-5 — Feature-flag isolation for WASM target
In plain terms: The core crate must compile on both native and WASM. Any platform-specific code (e.g., `std::time::Instant` being unavailable on `wasm32-unknown-emscripten`) must be gated behind `#[cfg]` attributes. Add a CI matrix entry for the WASM target that catches platform leaks.
Verify: cargo build --target wasm32-unknown-emscripten -p pg_trickle_core
succeeds in CI.
Dependencies: PGL-1-6. Schema change: No.
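One common pattern for this kind of gating is to route platform-sensitive calls through a small trait so only the implementation is `#[cfg]`-gated — a sketch under assumed names (`Clock`, `NativeClock` are not the extension's real types):

```rust
use std::time::Duration;

/// Core code asks for elapsed time through a trait instead of calling
/// `std::time::Instant` directly, so the WASM build can supply a
/// different source without touching core logic.
pub trait Clock {
    fn elapsed(&self) -> Duration;
}

/// Native implementation; compiled out entirely on wasm32 targets,
/// where a cfg(target_arch = "wasm32") counterpart would live instead.
#[cfg(not(target_arch = "wasm32"))]
pub struct NativeClock {
    start: std::time::Instant,
}

#[cfg(not(target_arch = "wasm32"))]
impl NativeClock {
    pub fn start() -> Self {
        Self { start: std::time::Instant::now() }
    }
}

#[cfg(not(target_arch = "wasm32"))]
impl Clock for NativeClock {
    fn elapsed(&self) -> Duration {
        self.start.elapsed()
    }
}
```

Because the trait itself is unconditional, call sites stay identical on both targets; only the constructor chosen at the edge differs.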
Performance
| ID | Title | Effort | Priority |
|---|---|---|---|
| PERF-1 | Zero-overhead abstraction for DatabaseBackend | M | P0 |
| PERF-2 | Benchmark regression gate across extraction | S | P0 |
| PERF-3 | Core-only unit test speedup measurement | S | P1 |
PERF-1 — Zero-overhead abstraction for DatabaseBackend
In plain terms: The `trait DatabaseBackend` abstraction can be wired up with dynamic dispatch (`dyn DatabaseBackend`) or static dispatch (generics). For the native extension, the abstraction must add zero measurable overhead. Use monomorphization (generics, not trait objects) for the hot path — delta SQL generation is called on every refresh cycle and must not regress. Measure with Criterion before/after on the `diff_operators` benchmark suite.
Verify: Criterion benchmark shows < 1% regression on diff_operators suite
after extraction.
Dependencies: PGL-1-5. Schema change: No.
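The monomorphization point can be illustrated with a toy slice of the trait — `quote_ident` and `delta_table_ref` are invented for this sketch, not the extension's real API:

```rust
/// Illustrative fragment of a backend trait.
pub trait DatabaseBackend {
    fn quote_ident(&self, ident: &str) -> String;
}

pub struct NativeBackend;

impl DatabaseBackend for NativeBackend {
    fn quote_ident(&self, ident: &str) -> String {
        // Double embedded quotes, PostgreSQL-style.
        format!("\"{}\"", ident.replace('"', "\"\""))
    }
}

/// Generic over `B`: the compiler emits a specialized copy of this
/// function per concrete backend type, so the `quote_ident` call is
/// statically dispatched — no vtable lookup on the refresh hot path.
/// A `&dyn DatabaseBackend` parameter would force dynamic dispatch.
pub fn delta_table_ref<B: DatabaseBackend>(backend: &B, table: &str) -> String {
    format!("SELECT * FROM {}", backend.quote_ident(table))
}
```

With this shape, only cold entry points (extension init, SQL-callable functions) need to pick a concrete `B`; everything downstream inlines as if the trait did not exist.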
PERF-2 — Benchmark regression gate across extraction
In plain terms: The extraction touches 51K lines of code. Even without functional changes, module restructuring can alter inlining, cache locality, and link-time optimization. Run the full Criterion benchmark suite before and after and assert no regression > 5%.
Verify: scripts/criterion_regression_check.py passes with 5% threshold on
all existing benchmarks.
Dependencies: PGL-1-7. Schema change: No.
PERF-3 — Core-only unit test speedup measurement
In plain terms: One of the key benefits of extraction is that `pg_trickle_core` unit tests run without starting PostgreSQL. Measure the wall-clock time for `cargo test -p pg_trickle_core` vs the old in-tree unit tests. Document the speedup in the CHANGELOG — expect 5–10× faster CI for unit-level tests.
Verify: document test execution times before/after in PR description. Dependencies: PGL-1-7. Schema change: No.
Scalability
| ID | Title | Effort | Priority |
|---|---|---|---|
| SCAL-1 | Workspace build parallelism verification | S | P1 |
| SCAL-2 | Core crate binary size for WASM budget | S | P1 |
| SCAL-3 | Incremental compilation impact assessment | S | P2 |
SCAL-1 — Workspace build parallelism verification
In plain terms: With two crates, `cargo build` can compile `pg_trickle_core` and other non-dependent crates in parallel. Verify that the workspace DAG allows parallel compilation, and measure the incremental rebuild time for a change in `pg_trickle_core` only.
Verify: cargo build --timings shows parallel compilation of core crate.
Dependencies: PGL-1-1. Schema change: No.
SCAL-2 — Core crate binary size for WASM budget
In plain terms: v0.24.0 targets a WASM bundle under 2 MB. Measure the compiled size of `pg_trickle_core` for the WASM target now so the budget is known before Phase 2. If it exceeds 5 MB, investigate `wasm-opt` stripping and feature-gating large operator modules.
Verify: wasm32-unknown-emscripten build of pg_trickle_core produces < 5
MB unoptimized. Document size in tracking issue.
Dependencies: PGL-1-6. Schema change: No.
SCAL-3 — Incremental compilation impact assessment
In plain terms: Splitting into two crates changes the incremental compilation boundary. A change in `pg_trickle_core` now forces a recompile of the extension crate. Measure incremental compile time for common edit patterns (add a test, modify an operator, change a rewrite pass) and ensure developer-experience compile times remain under 30 s.
Verify: document incremental compile times for 3 edit patterns. Dependencies: PGL-1-1. Schema change: No.
Ease of Use
| ID | Title | Effort | Priority |
|---|---|---|---|
| UX-1 | Workspace-aware justfile targets | S | P0 |
| UX-2 | Developer guide for core crate contributions | S | P1 |
| UX-3 | ARCHITECTURE.md update for two-crate layout | S | P1 |
UX-1 — Workspace-aware justfile targets
In plain terms: Existing `just` targets (`just test-unit`, `just lint`, `just fmt`) must work seamlessly with the new workspace layout. Update the justfile so `just test-unit` runs both `pg_trickle_core` unit tests and extension unit tests. Add `just test-core` for core-only tests.
Verify: all existing just targets pass; just test-core runs core-only
tests in < 5 seconds.
Dependencies: PGL-1-1. Schema change: No.
UX-2 — Developer guide for core crate contributions
In plain terms: Contributors need to know the rules: what goes in `pg_trickle_core` (pure logic, no pgrx) vs the extension crate (SPI, FFI, SQL functions). Add a section to `CONTRIBUTING.md` explaining the crate boundary, the `DatabaseBackend` trait contract, and how to add a new operator to the core crate.
Verify: CONTRIBUTING.md updated with crate boundary rules. Dependencies: PGL-1-5. Schema change: No.
UX-3 — ARCHITECTURE.md update for two-crate layout
In plain terms: The module layout diagrams in `docs/ARCHITECTURE.md` and `AGENTS.md` must reflect the new two-crate structure. Update both files so new contributors see the correct layout.
Verify: docs/ARCHITECTURE.md and AGENTS.md module diagrams show
pg_trickle_core/ and pg_trickle/ crates.
Dependencies: PGL-1-7. Schema change: No.
Test Coverage
| ID | Title | Effort | Priority |
|---|---|---|---|
| TEST-1 | Delta SQL snapshot tests for all 22 TPC-H queries | M | P0 |
| TEST-2 | Pure-Rust unit tests for extracted operators | L | P0 |
| TEST-3 | Mock DatabaseBackend for in-memory testing | M | P1 |
| TEST-4 | WASM build smoke test in CI | S | P0 |
| TEST-5 | Cargo deny / audit for new crate | XS | P0 |
TEST-1 — Delta SQL snapshot tests for all 22 TPC-H queries
In plain terms: Before extraction, capture the exact delta SQL output for each of the 22 TPC-H stream table definitions. After extraction, run the same generator and diff. Any change is a hard failure. This is the primary correctness gate for the refactoring.
Verify: cargo test -p pg_trickle_core -- snapshot passes with zero diffs.
Dependencies: CORR-1. Schema change: No.
TEST-2 — Pure-Rust unit tests for extracted operators
In plain terms: The 23 operator files currently have ~1,700 unit tests that run inside `cargo pgrx test` (which requires PostgreSQL). After extraction, all pure-logic tests should run via `cargo test -p pg_trickle_core` without a database. Tests that require SPI (e.g., catalog lookups) stay in the extension crate. Audit and migrate every test that can run without PostgreSQL.
Verify: > 80% of existing operator unit tests run in pg_trickle_core
without PostgreSQL.
Dependencies: PGL-1-2, TEST-3. Schema change: No.
TEST-3 — Mock DatabaseBackend for in-memory testing
In plain terms: For core crate tests that need to call the parser or SPI, provide a `MockBackend` that returns canned parse trees and query results. This allows testing the full pipeline (parse → rewrite → operator tree → delta SQL) without PostgreSQL.
Verify: MockBackend supports at least: raw_parser() returning a canned
OpTree, and spi_query() returning a canned result set. 10+ tests use it.
Dependencies: PGL-1-5. Schema change: No.
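A minimal sketch of the canned-result half of such a mock, keyed by exact query text — the trait fragment and method names here are assumptions, not the extension's real `DatabaseBackend` contract:

```rust
use std::collections::HashMap;

/// Illustrative fragment of the backend trait used by core tests.
pub trait DatabaseBackend {
    /// Rows as vectors of stringified column values (simplified).
    fn spi_query(&self, sql: &str) -> Vec<Vec<String>>;
}

/// Returns pre-registered result sets — no PostgreSQL process involved.
pub struct MockBackend {
    canned: HashMap<String, Vec<Vec<String>>>,
}

impl MockBackend {
    pub fn new() -> Self {
        Self { canned: HashMap::new() }
    }

    /// Register the rows to return for a given query text.
    pub fn expect(&mut self, sql: &str, rows: Vec<Vec<String>>) {
        self.canned.insert(sql.to_string(), rows);
    }
}

impl DatabaseBackend for MockBackend {
    fn spi_query(&self, sql: &str) -> Vec<Vec<String>> {
        // Unknown queries return an empty result set; a stricter mock
        // could panic here to catch unexpected SPI calls in tests.
        self.canned.get(sql).cloned().unwrap_or_default()
    }
}
```

The same pattern extends to `raw_parser()`: the mock returns a pre-built OpTree instead of rows, letting the parse → rewrite → delta pipeline run end-to-end in plain `cargo test`.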
TEST-4 — WASM build smoke test in CI
In plain terms: Add a CI job that compiles `pg_trickle_core` to `wasm32-unknown-emscripten` on every PR. This catches platform-specific code leaks before they accumulate. The job does not need to run the WASM binary — just compile it.
Verify: CI job build-wasm passes on every PR targeting the core crate.
Dependencies: PGL-1-6, STAB-5. Schema change: No.
TEST-5 — Cargo deny / audit for new crate
In plain terms: The new `pg_trickle_core` crate may introduce new transitive dependencies. Ensure `cargo deny check` and `cargo audit` cover the new crate and report no advisories.
Verify: cargo deny check and cargo audit pass for the full workspace.
Dependencies: PGL-1-1. Schema change: No.
Conflicts & Risks
- STAB-1 is the critical path. The ~500 unsafe blocks in `rewrites.rs` and `sublinks.rs` require a `NodeVisitor` abstraction over raw `pg_sys` pointer traversal. This is the highest-effort, highest-risk item. If the abstraction proves too leaky (e.g., too many pg_sys node types to wrap), consider leaving `rewrites.rs` and `sublinks.rs` in the extension crate and extracting only operators + DAG + types to the core crate. This reduces v0.23.0 scope but still delivers the WASM-compilable operator engine for v0.24.0.
- PERF-1 must be validated before merging. Introducing a `trait DatabaseBackend` could add vtable-dispatch overhead on the hot refresh path. Use monomorphization (generics) rather than `dyn Trait` for the extension-side implementation. If Criterion shows a regression above 1%, investigate `#[inline]` annotations and LTO settings.
- No schema changes, but workspace restructuring can break `cargo pgrx`. The `cargo-pgrx` tool makes assumptions about workspace layout (e.g., expecting a single `lib.rs` entry point). Test `cargo pgrx package`, `cargo pgrx test`, and `cargo pgrx run` early. If `cargo-pgrx` 0.17.x cannot handle the workspace, consider upgrading to a newer pgrx that supports workspaces, or use a `[patch]` section in `Cargo.toml`.
- TEST-2 depends on TEST-3 (MockBackend). Pure-Rust operator tests need a way to feed canned parse trees. Build the MockBackend early so TEST-2 can proceed.
- WASM target may not be available in standard CI runners. The `wasm32-unknown-emscripten` target requires the Emscripten SDK. Either install it in CI (adds ~2 min of setup) or use a pre-built Docker image with the SDK. Budget for CI setup time.
- Extraction is all-or-nothing per module. Partially extracting a module (e.g., moving half of `rewrites.rs`) creates circular dependencies. Each module must move completely or stay. Plan the extraction order: types → operators → DAG → diff → rewrites → sublinks.
v0.23.0 total: ~3–4 weeks (extraction) + ~1–2 weeks (abstraction layer + testing)
Exit criteria:
- PGL-1-1: `pg_trickle_core` crate exists as a workspace member with zero pgrx dependencies
- PGL-1-2: All operator delta SQL generation lives in the core crate
- PGL-1-3: All auto-rewrite passes live in the core crate
- PGL-1-4: DAG computation lives in the core crate
- PGL-1-5: `trait DatabaseBackend` defined; pgrx implementation passes all existing tests
- PGL-1-6: `cargo build --target wasm32-unknown-emscripten -p pg_trickle_core` succeeds
- PGL-1-7: `just test-all` passes with zero regressions
- CORR-1: Delta SQL snapshot tests pass for all 22 TPC-H queries (byte-for-byte match)
- CORR-2: OpTree serialize-deserialize round-trip passes proptest
- CORR-3: Rewrite pass ordering golden snapshot matches
- CORR-4: DAG cycle detection passes with 3 new edge-case tests
- STAB-1: Zero `pg_sys::` references in `pg_trickle_core/src/`
- STAB-2: `cargo build -p pg_trickle_core --no-default-features` passes in CI
- STAB-3: `cargo pgrx package` and `cargo pgrx test` succeed with workspace layout
- STAB-4: Extension upgrade path tested (0.22.0 -> 0.23.0)
- STAB-5: WASM target builds in CI
- PERF-1: Criterion shows < 1% regression on `diff_operators` benchmark
- PERF-2: Full benchmark suite passes with < 5% regression threshold
- TEST-1: TPC-H delta SQL snapshot tests pass
- TEST-2: > 80% of operator unit tests run without PostgreSQL
- TEST-3: MockBackend used by 10+ core crate tests
- TEST-4: CI `build-wasm` job passes on every PR
- TEST-5: `cargo deny check` and `cargo audit` pass for workspace
- UX-1: All existing `just` targets pass; `just test-core` added
- UX-3: ARCHITECTURE.md and AGENTS.md updated with two-crate layout
- `just check-version-sync` passes
v0.24.0 — PGlite WASM Extension
Release Theme: This release delivers the first working PGlite extension — the moment pg_trickle's incremental view maintenance runs in the browser. By wrapping `pg_trickle_core` (extracted in v0.23.0) in a thin C/FFI shim and compiling to WASM via PGlite's Emscripten toolchain, we ship an npm package (`@pgtrickle/pglite`) that gives PGlite users the full DVM operator vocabulary — outer joins, window functions, subqueries, recursive CTEs — in IMMEDIATE mode. This dramatically exceeds pg_ivm's PGlite offering (INNER joins + basic aggregates only). The release also establishes the cross-platform correctness and performance baselines that all future PGlite work builds on.
See PLAN_PGLITE.md §5 Strategy A and §7 Phase 2 for the full architecture.
PGlite WASM Build (Phase 2)
In plain terms: This phase takes the `pg_trickle_core` crate extracted in v0.23.0 and wraps it in a thin C shim that PGlite's Emscripten-based extension build system can compile to WASM. The result is a PGlite extension package (`@pgtrickle/pglite`) that provides `create_stream_table()`, `drop_stream_table()`, and `alter_stream_table()` — all running in IMMEDIATE mode inside the WASM PostgreSQL engine with the full DVM operator set.
| Item | Description | Effort | Ref |
|---|---|---|---|
| PGL-2-1 | C shim for PGlite. Thin C wrapper bridging PGlite's Emscripten environment to pg_trickle_core via Rust FFI. Handles raw_parser calls through PGlite's built-in PostgreSQL parser. | 1–2wk | PLAN_PGLITE.md §5 Strategy A |
| PGL-2-2 | DatabaseBackend for PGlite. Implement the trait for PGlite's single-connection SPI and built-in parser. Remove advisory lock acquisition (trivial in single-connection). | 3–5d | PLAN_PGLITE.md §5 Strategy A |
| PGL-2-3 | WASM bundle build. Integrate with PGlite's extension toolchain (postgres-pglite). Produce .tar.gz WASM bundle. Target bundle size < 2 MB. | 3–5d | PLAN_PGLITE.md §8 |
| PGL-2-4 | TypeScript wrapper. @pgtrickle/pglite npm package with PGlite plugin API. createStreamTable(), dropStreamTable(), alterStreamTable() with full IMMEDIATE mode support. | 2–3d | PLAN_PGLITE.md §7 Phase 2 |
| PGL-2-5 | IMMEDIATE mode E2E tests on PGlite. Verify inner joins, outer joins, aggregates, DISTINCT, UNION ALL, window functions, subqueries, CTEs (non-recursive + recursive), LATERAL, view inlining, DISTINCT ON, GROUPING SETS. | 1–2wk | PLAN_PGLITE.md §4.1 |
| PGL-2-6 | PG 17 vs PG 18 parse tree compatibility. PGlite tracks PG 17; pg_trickle targets PG 18. Audit and gate any node struct differences with conditional compilation. | 3–5d | PLAN_PGLITE.md §8 |
Phase 2 subtotal: ~5–7 weeks
Correctness
| ID | Title | Effort | Priority |
|---|---|---|---|
| CORR-1 | PG 17/18 parse tree node divergence audit | M | P0 |
| CORR-2 | Delta SQL cross-platform equivalence | M | P0 |
| CORR-3 | Advisory lock no-op safety proof | S | P1 |
| CORR-4 | IMMEDIATE trigger ordering in single-connection | S | P1 |
CORR-1 — PG 17/18 parse tree node divergence audit
In plain terms: PGlite embeds PostgreSQL 17's parser; pg_trickle's `OpTree` construction targets PostgreSQL 18 node structs. Any struct layout difference (added fields, renamed members, changed enum values) would cause the C shim to misinterpret parse trees, producing silently wrong delta SQL. Systematically diff the PG 17 and PG 18 parse tree headers (`nodes/parsenodes.h`, `nodes/primnodes.h`) and catalog every node type that pg_trickle traverses. Gate incompatible nodes behind `#[cfg(pg17)]` / `#[cfg(pg18)]` conditional compilation.
Verify: a CI job compiles pg_trickle_core against both PG 17 and PG 18
parse tree headers. A test generates OpTrees from the same SQL on both
versions and asserts structural equality.
Dependencies: PGL-2-6. Schema change: No.
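The gating could look roughly like the following — here `pg17` is assumed to be a cargo feature selecting the PGlite parser headers, and the field lists are purely illustrative (the real divergences come out of the audit):

```rust
/// Node fields traversed under the PG 17 layout (illustrative only).
#[cfg(feature = "pg17")]
pub fn tracked_join_fields() -> &'static [&'static str] {
    &["jointype", "larg", "rarg", "quals"]
}

/// Node fields traversed under the PG 18 layout (illustrative only).
/// If PG 18 added or renamed a member, only this branch changes; core
/// logic downstream consumes the same function signature either way.
#[cfg(not(feature = "pg17"))]
pub fn tracked_join_fields() -> &'static [&'static str] {
    &["jointype", "larg", "rarg", "quals", "usingClause"]
}
```

Keeping the version split behind one function per node type means the compatibility test in the Verify step can simply call the same accessor on both builds and diff the results.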
CORR-2 — Delta SQL cross-platform equivalence
In plain terms: The same SQL view definition must produce exactly the same delta SQL on native PostgreSQL 18 and on PGlite (WASM + PG 17 parser). Any divergence means one platform gets wrong incremental results. Create a snapshot test suite that runs all 22 TPC-H stream table definitions through both the native and WASM `DatabaseBackend` implementations and asserts byte-for-byte identical delta SQL output.
Verify: snapshot comparison test passes for all 22 TPC-H queries on both platforms. Any diff is a hard failure. Dependencies: PGL-2-2, CORR-1. Schema change: No.
CORR-3 — Advisory lock no-op safety proof
In plain terms: The native extension uses `pg_advisory_xact_lock()` to prevent concurrent refresh of the same stream table. PGlite is single-connection, so the lock acquisition is a no-op. Verify that removing the lock cannot cause re-entrancy (a trigger firing `create_stream_table()` from within a refresh) by auditing all SPI call paths from the PGlite `DatabaseBackend` for re-entrant calls.
Verify: code review + integration test that attempts re-entrant refresh from within a trigger. Must error cleanly, not corrupt state. Dependencies: PGL-2-2. Schema change: No.
CORR-4 — IMMEDIATE trigger ordering in single-connection
In plain terms: IMMEDIATE mode relies on AFTER triggers firing in a specific order when multiple source tables are modified in the same statement (e.g., a CTE with multiple INSERTs). Verify that PGlite's trigger execution order matches native PostgreSQL's for the trigger configurations pg_trickle creates.
Verify: integration test with multi-table CTE INSERT on PGlite; assert stream table state matches native. Dependencies: PGL-2-5. Schema change: No.
Stability
| ID | Title | Effort | Priority |
|---|---|---|---|
| STAB-1 | WASM heap OOM graceful degradation | M | P0 |
| STAB-2 | C shim panic/unwind boundary safety | S | P0 |
| STAB-3 | Extension load/unload lifecycle correctness | S | P0 |
| STAB-4 | Native extension upgrade path (0.23 → 0.24) | S | P0 |
| STAB-5 | npm package version synchronization | XS | P1 |
STAB-1 — WASM heap OOM graceful degradation
In plain terms: WASM environments have a finite heap (typically 256 MB in browsers, configurable in Node). A large stream table with many operators could exhaust WASM memory during OpTree construction or delta SQL generation. The extension must detect allocation failures and return a clear PostgreSQL error rather than crashing the WASM instance (which would kill all PGlite state). Implement a memory-aware allocator wrapper or check `emscripten_get_heap_size()` at entry points.
Verify: stress test creating stream tables over increasingly complex views until OOM; assert PGlite remains functional and returns an actionable error. Dependencies: PGL-2-1. Schema change: No.
STAB-2 — C shim panic/unwind boundary safety
In plain terms: Rust panics must not cross the FFI boundary into C. The C shim must catch panics via `std::panic::catch_unwind()` and convert them to PostgreSQL `ereport(ERROR)` calls. Any uncaught panic in WASM would abort the entire PGlite instance. Audit every `#[no_mangle] extern "C"` entry point in the shim for panic safety.
Verify: test that triggers a panic path (e.g., invalid SQL) from TypeScript; assert PGlite returns a SQL error, not a WASM trap. Dependencies: PGL-2-1. Schema change: No.
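The entry-point pattern can be sketched as follows — `pgt_demo_entry` and its status-code convention are invented for illustration; the real shim would convert the error arm into an `ereport(ERROR)` call rather than returning a code:

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Hypothetical FFI entry point shape: 0 on success, nonzero on error.
/// Every exported function wraps its body in `catch_unwind` so a Rust
/// panic is contained instead of unwinding into C (which would abort
/// the whole WASM instance).
#[no_mangle]
pub extern "C" fn pgt_demo_entry(valid_input: bool) -> i32 {
    let outcome = catch_unwind(AssertUnwindSafe(|| {
        if !valid_input {
            // Stand-in for any internal panic path (bug, failed
            // assertion, allocation failure).
            panic!("invalid input reached the shim");
        }
    }));
    match outcome {
        Ok(()) => 0,
        // Panic contained: report an error; the WASM instance lives on.
        Err(_) => 1,
    }
}
```

Note that `catch_unwind` only helps when the panic strategy is `unwind`; a `panic = "abort"` profile would defeat it, so the WASM build profile needs checking as part of this item.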
STAB-3 — Extension load/unload lifecycle correctness
In plain terms: PGlite extensions can be loaded and unloaded. The C shim must free all Rust-allocated memory on unload and not leave dangling pointers or leaked state. Test the full lifecycle: load extension → create stream tables → drop stream tables → unload extension → reload extension → create new stream tables.
Verify: lifecycle test with memory profiling shows zero leaked allocations after unload/reload cycle. Dependencies: PGL-2-1, PGL-2-4. Schema change: No.
STAB-4 — Native extension upgrade path (0.23 → 0.24)
In plain terms: v0.24.0 adds PGlite support but makes no SQL-visible changes to the native extension. The upgrade migration from 0.23.0 to 0.24.0 must leave existing stream tables intact and refreshable.
Verify: upgrade E2E test confirms stream tables survive and refresh
correctly after 0.23.0 -> 0.24.0.
STAB-5 — npm package version synchronization
In plain terms: The `@pgtrickle/pglite` npm package version must match the extension version (0.24.0). Add a CI check that verifies the `package.json` version matches the `pg_trickle.control` version, similar to the existing `just check-version-sync` target.
Verify: just check-version-sync also validates npm package version.
Dependencies: PGL-2-4. Schema change: No.
Performance
| ID | Title | Effort | Priority |
|---|---|---|---|
| PERF-1 | WASM vs native refresh latency benchmark | M | P0 |
| PERF-2 | WASM bundle size optimization (< 2 MB target) | M | P0 |
| PERF-3 | PGlite cold-start extension load time | S | P1 |
PERF-1 — WASM vs native refresh latency benchmark
In plain terms: WASM is expected to be 1.5–3× slower than native (per PLAN_PGLITE.md §8). Quantify the actual overhead by benchmarking IMMEDIATE-mode refresh on both platforms using the same schema + data. The overhead must stay below the threshold where IMMEDIATE mode is still faster than full re-evaluation — otherwise PGlite users would be better off just re-running the query. Establish a Criterion-like benchmark suite for PGlite (potentially using Node.js + `@electric-sql/pglite`).
Verify: benchmark report showing WASM refresh latency for 5 representative stream tables (scan, join, aggregate, window, recursive CTE). Document native-to-WASM overhead ratio. Dependencies: PGL-2-5. Schema change: No.
PERF-2 — WASM bundle size optimization (< 2 MB target)
In plain terms: The WASM bundle must be under 2 MB for acceptable download times in browser environments (PostGIS is 8.2 MB, pgcrypto is 1.1 MB — pg_trickle should be closer to pgcrypto). Apply `wasm-opt -Oz`, LTO, `codegen-units = 1`, strip debug info, and feature-gate large operator modules (e.g., recursive CTE, window functions) behind optional features if needed to meet the target.
Verify: CI job measures WASM bundle size after wasm-opt and fails if > 2
MB. Document size breakdown by operator module.
Dependencies: PGL-2-3. Schema change: No.
PERF-3 — PGlite cold-start extension load time
In plain terms: The first `CREATE EXTENSION pg_trickle` in a PGlite session compiles and loads the WASM module. This must complete in < 500 ms in a browser and < 200 ms in Node.js. Measure and optimize by using streaming WASM compilation (`WebAssembly.compileStreaming()`) and ensuring the extension's `_PG_init()` function does minimal work.
Verify: benchmark measuring time from CREATE EXTENSION to first
create_stream_table() on fresh PGlite instance. Document cold-start time.
Dependencies: PGL-2-1, PGL-2-3. Schema change: No.
Scalability
| ID | Title | Effort | Priority |
|---|---|---|---|
| SCAL-1 | Stream table count ceiling in WASM | S | P1 |
| SCAL-2 | Wide-table OpTree memory footprint | S | P1 |
| SCAL-3 | Dataset size practical limit for IMMEDIATE mode | S | P2 |
SCAL-1 — Stream table count ceiling in WASM
In plain terms: Each stream table consumes memory for its OpTree, delta SQL templates, and trigger metadata. In native PostgreSQL with gigabytes of RAM this is trivial, but in a 256 MB WASM heap it matters. Determine the practical limit by creating stream tables in a loop until OOM, then document the ceiling and add a guard that errors at 80% capacity with an actionable message.
Verify: stress test documents the ceiling (e.g., "~200 stream tables with average 3-table join in 256 MB heap"). Guard errors at 80%. Dependencies: STAB-1. Schema change: No.
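The 80% guard itself is simple arithmetic; a minimal sketch, assuming the heap size and a running usage estimate are obtainable at the call site (`check_stream_table_budget` is an invented name):

```rust
/// Hypothetical guard: refuse to create another stream table once the
/// estimated usage crosses 80% of the WASM heap, with an actionable
/// message instead of a later, fatal OOM trap.
pub fn check_stream_table_budget(
    used_bytes: usize,
    heap_bytes: usize,
) -> Result<(), String> {
    // 80% threshold, computed without floating point.
    let threshold = heap_bytes / 5 * 4;
    if used_bytes >= threshold {
        Err(format!(
            "stream table memory budget exhausted ({used_bytes} of \
             {heap_bytes} bytes); drop unused stream tables or run \
             PGlite with a larger heap"
        ))
    } else {
        Ok(())
    }
}
```

The hard part of this item is the `used_bytes` estimate (OpTree + delta SQL templates + trigger metadata per table), not the comparison; the stress test is what calibrates that estimate against real allocations.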
SCAL-2 — Wide-table OpTree memory footprint
In plain terms: A stream table over a 100-column source table produces a large OpTree and long delta SQL strings. Profile the memory consumption of OpTree construction for wide tables and ensure it fits within the WASM heap budget alongside typical stream table counts.
Verify: profile OpTree allocation for 10, 50, 100-column source tables. Document memory per stream table as a function of column count. Dependencies: PGL-2-5. Schema change: No.
SCAL-3 — Dataset size practical limit for IMMEDIATE mode
In plain terms: IMMEDIATE mode fires triggers on every DML, so overhead scales with write frequency. In a WASM environment with ~2× slower execution, determine at what dataset size (rows × columns × writes/second) IMMEDIATE mode becomes impractical. Document the breakpoint so PGlite users know when their use case has outgrown the browser and should migrate to native pg_trickle with DIFFERENTIAL mode.
Verify: benchmark with increasing write rates; document the throughput ceiling (e.g., "> 10K rows/sec INSERT rate degrades stream table latency past 100 ms"). Dependencies: PERF-1. Schema change: No.
Ease of Use
| ID | Title | Effort | Priority |
|---|---|---|---|
| UX-1 | TypeScript API ergonomics and type safety | S | P0 |
| UX-2 | PGlite getting-started guide | M | P0 |
| UX-3 | WASM-context error message quality | S | P1 |
| UX-4 | npm package README with runnable examples | S | P1 |
UX-1 — TypeScript API ergonomics and type safety
In plain terms: The `@pgtrickle/pglite` TypeScript API must follow PGlite plugin conventions (`PGlitePlugin` interface, `init()` lifecycle). All methods must be fully typed — no `any` types. The API surface must be minimal: `createStreamTable(sql)`, `dropStreamTable(name)`, `alterStreamTable(name, sql)`, `listStreamTables()`, and `refreshStreamTable(name)`. Review against existing PGlite plugins (`@electric-sql/pglite-repl`, `pglite-vector`) for consistency.
Verify: TypeScript strict mode compilation with no errors. API review against PGlite plugin conventions checklist. Dependencies: PGL-2-4. Schema change: No.
UX-2 — PGlite getting-started guide
In plain terms: A `docs/tutorials/PGLITE_QUICKSTART.md` guide walking a user from `npm install` to a working React app with live stream tables in under 10 minutes. Include: install, create a PGlite instance with the extension, define a source table + stream table, insert data, observe the stream table update. Provide a CodeSandbox / StackBlitz link for a zero-install try-it-now experience.
Verify: a new developer can follow the guide and see a working stream table in PGlite in a browser within 10 minutes. Dependencies: PGL-2-4, UX-1. Schema change: No.
UX-3 — WASM-context error message quality
In plain terms: Error messages from the Rust/C shim must be JavaScript-friendly: no raw pg_sys error codes, no memory addresses. Every error must include the stream table name, the failing SQL fragment, and a remediation hint. Unsupported features (DIFFERENTIAL mode, scheduled refresh, parallel workers) must error with "Not supported in PGlite: <feature>. Use IMMEDIATE mode." rather than cryptic internal errors.
Verify: audit all error paths in the C shim + PGlite DatabaseBackend.
Every error message includes table name + remediation hint.
Dependencies: PGL-2-1, PGL-2-2. Schema change: No.
UX-4 — npm package README with runnable examples
In plain terms: The npm package must have a README with: badge for PGlite compatibility, install command, 3 runnable examples (basic aggregate, join, window function), API reference, link to the full PGlite quickstart guide, and a "Limitations vs native pg_trickle" section clearly stating: no DIFFERENTIAL mode, no scheduled refresh, no parallel workers, PG 17 parser only.
Verify: README renders correctly on npmjs.com; examples are copy-pasteable into a Node.js REPL. Dependencies: PGL-2-4, UX-2. Schema change: No.
Test Coverage
| ID | Title | Effort | Priority |
|---|---|---|---|
| TEST-1 | Full DVM operator E2E suite on PGlite | L | P0 |
| TEST-2 | PG 17/18 parse tree compatibility tests | M | P0 |
| TEST-3 | WASM memory stress tests | M | P1 |
| TEST-4 | TypeScript integration tests | M | P0 |
| TEST-5 | Bundle size regression gate in CI | S | P0 |
TEST-1 — Full DVM operator E2E suite on PGlite
In plain terms: Run every DVM operator (23 operators across inner join, outer join, full join, semi-join, anti-join, aggregate, distinct, union/intersect/except, subquery, scalar subquery, CTE scan, recursive CTE, lateral function, lateral subquery, window function, scan, filter, project) through IMMEDIATE mode in PGlite. This is the primary correctness gate for the WASM extension. Use a Node.js test harness with `@electric-sql/pglite` to run the tests headlessly.
Verify: test suite with ≥ 1 test per operator (23+ tests) passes in CI using PGlite Node.js. Test matrix: INSERT, UPDATE, DELETE for each operator. Dependencies: PGL-2-5. Schema change: No.
TEST-2 — PG 17/18 parse tree compatibility tests
In plain terms: For every parse tree node type that pg_trickle traverses, generate a test query that exercises that node, parse it on both PG 17 (PGlite) and PG 18 (native), and assert that the resulting `OpTree` is structurally identical. This catches version-specific divergences before they reach users.
Verify: compatibility test suite covers all node types referenced in
pg_trickle_core. Any divergence is a hard failure with clear diagnostic.
Dependencies: CORR-1. Schema change: No.
TEST-3 — WASM memory stress tests
In plain terms: Create increasing numbers of stream tables with increasing complexity until OOM. Verify that: (a) the guard from SCAL-1 fires at 80% capacity, (b) PGlite remains functional after the guard fires, (c) dropping stream tables actually frees memory. Run under different heap sizes (64 MB, 128 MB, 256 MB) to validate the guard thresholds.
Verify: stress test with 3 heap sizes completes without WASM trap. Guard fires at documented threshold. Memory reclaimed after DROP. Dependencies: STAB-1, SCAL-1. Schema change: No.
TEST-4 — TypeScript integration tests
In plain terms: Test the `@pgtrickle/pglite` TypeScript API end-to-end using Jest or Vitest in Node.js. Cover: create/drop/alter stream table, error handling (invalid SQL, unsupported features), plugin lifecycle (init/cleanup), and concurrent operations on different stream tables. Run as part of CI on every PR that touches `pg_trickle_pglite/`.
Verify: ≥ 20 TypeScript integration tests pass in CI. Test coverage report for the TypeScript wrapper shows > 90% line coverage. Dependencies: PGL-2-4, UX-1. Schema change: No.
TEST-5 — Bundle size regression gate in CI
In plain terms: Add a CI job that builds the WASM bundle, runs `wasm-opt`, measures the final `.wasm` file size, and fails if it exceeds 2 MB. Store the current size as a baseline and alert on any increase over 10%. This prevents bundle bloat as features are added.
Verify: CI job check-wasm-size runs on every PR touching
pg_trickle_core/ or pg_trickle_pglite/. Fails at > 2 MB.
Dependencies: PGL-2-3, PERF-2. Schema change: No.
Conflicts & Risks
- CORR-1 (PG 17/18 parse tree compatibility) is the highest risk. PGlite embeds PG 17; pg_trickle targets PG 18. If node struct layouts diverged significantly between versions (e.g., `JoinExpr` gained a field, `RangeTblEntry` changed a flag), the C shim must handle both layouts via conditional compilation. In the worst case, some operators may need version-specific code paths. Start this audit early — it blocks PGL-2-1 and PGL-2-2.
- PERF-2 (bundle size < 2 MB) may conflict with full operator coverage. If the 23-operator delta SQL generator compiles to more than 2 MB, we may need to feature-gate rarely used operators (recursive CTE, GROUPING SETS) behind cargo features. This would weaken the "full DVM vocabulary" claim and require documenting which operators are available by default. Measure early with a minimal build to establish a baseline.
- PGlite's Emscripten toolchain is a moving target. PGlite's extension build system (`postgres-pglite`) is not yet stable. Breaking changes in the toolchain could block PGL-2-3. Pin the PGlite version and track upstream releases. Have a fallback plan: manual Emscripten compilation without the PGlite toolchain.
- STAB-2 (panic boundary) and STAB-1 (OOM handling) interact. A Rust OOM in WASM triggers a panic, which must not cross the FFI boundary. Both items must be implemented together: the OOM guard (STAB-1) sets a pre-panic threshold, and the `catch_unwind` wrapper (STAB-2) is the last-resort safety net.
- No prior C FFI in the codebase. The only C code is `scripts/pg_stub.c` (a test helper). The C shim (PGL-2-1) introduces a new language and toolchain requirement. Keep the C code minimal (< 500 lines), well documented, and covered by the TypeScript integration tests.
- TEST-1 and TEST-4 require a PGlite-based CI runner. This needs Node.js 18+ with `@electric-sql/pglite` in CI — a new CI dependency. Add it to the existing CI matrix as a separate job that only runs when `pg_trickle_pglite/` or `pg_trickle_core/` files are modified.
v0.24.0 total: ~5–7 weeks (WASM build) + ~2–3 weeks (testing + polish)
Exit criteria:
- PGL-2-1: C shim compiles and links against PGlite's WASM PostgreSQL headers
- PGL-2-2: PGlite `DatabaseBackend` passes all IMMEDIATE-mode operator tests
- PGL-2-3: WASM bundle size < 2 MB after `wasm-opt`
- PGL-2-4: `@pgtrickle/pglite` npm package published to npmjs.com
- PGL-2-5: All 23 DVM operators pass E2E tests on PGlite
- PGL-2-6: PG 17 parse tree differences documented and handled with `#[cfg]`
- CORR-1: PG 17/18 parse tree audit complete; compatibility tests pass
- CORR-2: Delta SQL cross-platform snapshot tests pass for all 22 TPC-H queries
- CORR-3: Re-entrant refresh test passes on PGlite
- CORR-4: Multi-table CTE trigger ordering matches native
- STAB-1: OOM stress test: PGlite survives with actionable error
- STAB-2: Panic from invalid SQL returns SQL error, not WASM trap
- STAB-3: Load/unload/reload lifecycle test: zero leaked allocations
- STAB-4: Extension upgrade path tested (`0.23.0 -> 0.24.0`)
- PERF-1: WASM vs native benchmark report published (≤ 3× overhead)
- PERF-2: WASM bundle ≤ 2 MB (CI gated)
- PERF-3: Cold-start load time < 500 ms browser, < 200 ms Node.js
- TEST-1: ≥ 23 operator E2E tests pass on PGlite in CI
- TEST-2: Parse tree compatibility tests cover all traversed node types
- TEST-3: Memory stress tests pass under 64/128/256 MB heap sizes
- TEST-4: ≥ 20 TypeScript integration tests with > 90% line coverage
- TEST-5: CI `check-wasm-size` job passes on every PR
- UX-1: TypeScript strict mode compilation: zero errors
- UX-2: PGlite getting-started guide published with CodeSandbox link
- UX-4: npm README renders correctly on npmjs.com
- `just check-version-sync` passes (incl. npm package version)
v0.25.0 — PGlite Reactive Integration
Release Theme: This release completes the PGlite story by bridging the gap between database-side incremental view maintenance and front-end UI reactivity. By connecting stream table deltas to PGlite's `live.changes()` API and providing framework-specific hooks (`useStreamTable()` for React and Vue), pg_trickle becomes the first IVM engine to offer truly reactive UI bindings — where DOM updates are proportional to changed rows, not result set size. This is the local-first developer's final mile: from `INSERT` to re-render in single-digit milliseconds, with no polling, no diffing, and no full query re-execution.
See PLAN_PGLITE.md §7 Phase 3 for the full reactive integration design.
Reactive Bindings (Phase 3)
In plain terms: Phase 2 gave PGlite users in-engine IVM. This phase connects stream table changes to PGlite's `live.changes()` API and provides framework-specific hooks — `useStreamTable()` for React and a `useStreamTable()` composable for Vue — so UI components automatically re-render when the underlying data changes. For local-first apps like collaborative editors, dashboards, and offline-capable tools, this is the last mile between incremental SQL and reactive UI.
| Item | Description | Effort | Ref |
|---|---|---|---|
| PGL-3-1 | live.changes() bridge. Emit INSERT/UPDATE/DELETE change events from stream table delta application to PGlite's live query system. Keyed by __pgt_row_id. | 3–5d | PLAN_PGLITE.md §7 Phase 3 |
| PGL-3-2 | React hooks. useStreamTable(query) hook that subscribes to stream table changes and returns reactive state. Handles mount/unmount lifecycle. | 3–5d | — |
| PGL-3-3 | Vue composable. useStreamTable(query) composable with equivalent functionality. | 2–3d | — |
| PGL-3-4 | Documentation and examples. Local-first app patterns: collaborative todo list, real-time dashboard, offline-first inventory tracker. Published as @pgtrickle/pglite docs. | 2–3d | — |
| PGL-3-5 | Performance benchmarks. End-to-end latency from INSERT to React re-render. Compare against live.incrementalQuery() for complex queries (3-table join + aggregate). | 1–2d | — |
Phase 3 subtotal: ~2–3 weeks
Correctness
| ID | Title | Effort | Priority |
|---|---|---|---|
| CORR-1 | Change event fidelity vs stream table state | M | P0 |
| CORR-2 | Multi-row DML atomicity in reactive stream | S | P0 |
| CORR-3 | Hook state consistency after rapid mutations | M | P1 |
| CORR-4 | DELETE/re-INSERT identity stability | S | P1 |
CORR-1 — Change event fidelity vs stream table state
In plain terms: The `live.changes()` bridge emits INSERT/UPDATE/DELETE events derived from the IMMEDIATE-mode delta application. If an event is missed, duplicated, or misclassified (e.g., an UPDATE emitted as DELETE + INSERT), the React/Vue state will diverge from the actual stream table contents. For every DML operation on every DVM operator type, assert that the sequence of change events, when applied to an empty accumulator, produces a set identical to `SELECT * FROM stream_table`.
Verify: integration test replaying 1,000 random DML operations across all
operator types; final accumulator state matches SELECT *. Any divergence
is a hard failure.
Dependencies: PGL-3-1. Schema change: No.
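The replay check above can be stated in a few lines of TypeScript. This is a sketch under stated assumptions — the `ChangeEvent` shape and the `__pgt_row_id` keying are inferred from this section, not the actual bridge API:

```typescript
// Sketch of the CORR-1 invariant: applying the emitted change events to an
// empty accumulator must reproduce the stream table's contents exactly.
type Row = Record<string, unknown> & { __pgt_row_id: string };

type ChangeEvent =
  | { kind: "insert"; row: Row }
  | { kind: "update"; row: Row }
  | { kind: "delete"; rowId: string };

function replay(events: ChangeEvent[]): Map<string, Row> {
  const acc = new Map<string, Row>();
  for (const ev of events) {
    switch (ev.kind) {
      case "insert":
      case "update":
        acc.set(ev.row.__pgt_row_id, ev.row); // upsert keyed by row id
        break;
      case "delete":
        acc.delete(ev.rowId);
        break;
    }
  }
  return acc;
}
```

A misclassified event (say, an UPDATE emitted as DELETE + INSERT under a new row id) leaves a different key set than `SELECT *` — exactly the divergence the 1,000-operation replay test is meant to catch.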
CORR-2 — Multi-row DML atomicity in reactive stream
In plain terms: A single `INSERT INTO source SELECT ... FROM generate_series(1, 100)` inserts 100 rows and triggers IMMEDIATE-mode delta application. The `live.changes()` bridge must emit all 100 change events as a single batch — not trickle them one by one — so that React performs a single re-render, not 100. If events leak across batch boundaries, the UI shows intermediate states that never existed in the database.
Verify: test with 100-row INSERT; assert useStreamTable() callback fires
exactly once with all 100 rows. Intermediate renders counted via React
profiler must be ≤ 1.
Dependencies: PGL-3-1, PGL-3-2. Schema change: No.
CORR-3 — Hook state consistency after rapid mutations
In plain terms: If a user performs INSERT → DELETE → INSERT on the same row within 10 ms (e.g., optimistic UI with undo), the hook must resolve to the correct final state. Race conditions between the `live.changes()` event stream and React's asynchronous render cycle could show stale data. The hook must use a monotonic sequence number (from the bridge's event stream) to discard stale updates.
Verify: stress test with 50 rapid mutations on the same row at 1 ms
intervals; final hook state matches SELECT *. Test on both React 18
(concurrent mode) and React 19.
Dependencies: PGL-3-1, PGL-3-2. Schema change: No.
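The stale-update guard described above could look like the following. This is a minimal sketch; the event shape and the `makeLatestWins` name are our own, not part of the planned API:

```typescript
// Sketch of CORR-3's monotonic-sequence guard: each bridge event carries a
// sequence number, and the hook discards anything at or below the newest
// sequence it has already applied.
type SeqUpdate<T> = { seq: number; state: T };

function makeLatestWins<T>(initial: T) {
  let current = initial;
  let lastSeq = -1;
  return {
    apply(u: SeqUpdate<T>): boolean {
      if (u.seq <= lastSeq) return false; // stale or duplicate: discard
      lastSeq = u.seq;
      current = u.state;
      return true;
    },
    get(): T {
      return current;
    },
  };
}
```

If React's render cycle delivers an older snapshot after a newer one, the guard rejects it, so the hook's state can only move forward.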
CORR-4 — DELETE/re-INSERT identity stability
In plain terms: When a row is deleted and a new row with the same PK is inserted, the `__pgt_row_id` changes but the PK doesn't. The change bridge must emit a DELETE for the old `__pgt_row_id` and an INSERT for the new one — not an UPDATE — so that React's reconciler correctly unmounts and remounts the component (not just re-renders it). Wrong identity semantics cause stale closures and event handler leaks.
Verify: test DELETE + INSERT with same PK; verify React component lifecycle (unmount + mount, not just update). Use React DevTools profiler. Dependencies: PGL-3-1, PGL-3-2. Schema change: No.
Stability
| ID | Title | Effort | Priority |
|---|---|---|---|
| STAB-1 | Memory leak prevention in long-lived hooks | M | P0 |
| STAB-2 | Subscription cleanup on component unmount | S | P0 |
| STAB-3 | Error boundary integration for hook failures | S | P0 |
| STAB-4 | Native extension upgrade path (0.24 → 0.25) | S | P0 |
| STAB-5 | Framework version compatibility matrix | S | P1 |
STAB-1 — Memory leak prevention in long-lived hooks
In plain terms: A `useStreamTable()` hook in a long-lived component (e.g., a dashboard that runs for hours) accumulates change events via the `live.changes()` subscription. If the bridge or hook retains references to processed events, memory grows without bound. Implement a bounded event buffer (configurable, default 1,000 events) that discards processed events after they are applied to the hook's state snapshot. Once the buffer fills, the oldest entries are evicted and garbage-collected.
Verify: 4-hour soak test with continuous 1 row/sec mutations. Heap snapshot at 1h and 4h shows < 10% growth. No detached DOM nodes or leaked closures. Dependencies: PGL-3-1, PGL-3-2. Schema change: No.
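A bounded buffer of the kind STAB-1 calls for is a few lines. Sketch only — the class name is ours; the default capacity of 1,000 comes from this section:

```typescript
// Sketch of STAB-1's bounded event buffer: a fixed-capacity queue that
// evicts the oldest entry once full, so processed events cannot accumulate
// for the lifetime of a long-lived hook.
class BoundedBuffer<T> {
  private items: T[] = [];
  constructor(private capacity = 1000) {}

  push(item: T): void {
    this.items.push(item);
    if (this.items.length > this.capacity) this.items.shift(); // evict oldest
  }

  size(): number {
    return this.items.length;
  }
}
```

Because eviction happens on every push past capacity, heap usage is O(capacity) regardless of how many mutations the dashboard sees over hours.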
STAB-2 — Subscription cleanup on component unmount
In plain terms: When a React component using `useStreamTable()` is unmounted (e.g., on route change), the `live.changes()` subscription must be cancelled immediately. Failing to clean up causes: (a) memory leaks from the change listener, (b) "setState on unmounted component" warnings, (c) stale event processing after the component is gone. Use a `useEffect()` cleanup function with an AbortController pattern.
Verify: mount/unmount cycle test (100 cycles); zero console warnings, zero leaked subscriptions (verified via PGlite connection subscription count). Dependencies: PGL-3-2. Schema change: No.
STAB-3 — Error boundary integration for hook failures
In plain terms: If the `live.changes()` bridge throws (e.g., the stream table was dropped while the hook is active), the hook must propagate the error to React's error boundary / Vue's `onErrorCaptured` — not swallow it silently or crash the app. Provide an `onError` callback option and a default that throws to the nearest error boundary.
Verify: test dropping a stream table while useStreamTable() is active;
assert error boundary catches the error with an actionable message.
Dependencies: PGL-3-2, PGL-3-3. Schema change: No.
STAB-4 — Native extension upgrade path (0.24 → 0.25)
In plain terms: v0.25.0 adds reactive bindings at the TypeScript/npm layer only. The native PostgreSQL extension and the PGlite WASM extension must continue to work unchanged. The upgrade migration from 0.24.0 to 0.25.0 must leave existing stream tables and the `@pgtrickle/pglite` WASM extension intact.
Verify: upgrade E2E test confirms stream tables survive and refresh
correctly after 0.24.0 -> 0.25.0. TypeScript API backward compatibility
verified.
Dependencies: None. Schema change: No.
STAB-5 — Framework version compatibility matrix
In plain terms: Test `useStreamTable()` against React 18.x, React 19.x, and Vue 3.4+. Document which framework versions are supported. Future consideration: Svelte 5 (runes), SolidJS, Angular signals — document these as community-contributed integration points, not first-party.
Verify: CI matrix testing React 18, React 19, Vue 3.4. Published compatibility table in npm README. Dependencies: PGL-3-2, PGL-3-3. Schema change: No.
Performance
| ID | Title | Effort | Priority |
|---|---|---|---|
| PERF-1 | INSERT-to-render latency benchmark | M | P0 |
| PERF-2 | Batch rendering efficiency (single re-render) | S | P0 |
| PERF-3 | Bridge overhead vs raw live.changes() | S | P1 |
PERF-1 — INSERT-to-render latency benchmark
In plain terms: Measure the end-to-end latency from `INSERT INTO source_table` to the React component's DOM update. The target is < 50% of `live.incrementalQuery()` latency for a 3-table join + aggregate at 10K rows (per PLAN_PGLITE.md). This is the headline metric: if pg_trickle's reactive path is not significantly faster than PGlite's built-in incremental query, the value proposition collapses.
Verify: benchmark suite with 5 complexity levels (scan, filter, join,
aggregate, window). Publish results as a comparison table against
live.incrementalQuery(). Target: < 50% latency at 10K rows.
Dependencies: PGL-3-1, PGL-3-2, PGL-3-5. Schema change: No.
PERF-2 — Batch rendering efficiency (single re-render)
In plain terms: A bulk INSERT (100 rows) must produce exactly one React re-render, not 100. The change bridge must batch events emitted within the same transaction into a single `live.changes()` notification. Use `queueMicrotask()` or `requestAnimationFrame()` batching in the TypeScript wrapper to coalesce rapid-fire events.
Verify: React profiler shows ≤ 1 render per bulk DML. Test with 1, 10, 100, 1000-row INSERTs; render count is always 1. Dependencies: PGL-3-1, PGL-3-2, CORR-2. Schema change: No.
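Microtask coalescing of the kind PERF-2 describes can be sketched as follows. The `makeBatcher` name and flush callback are our own illustration, not the planned wrapper API:

```typescript
// Sketch of PERF-2's coalescing: events emitted synchronously within the
// same microtask window are flushed as one batch, so a 100-row INSERT
// yields one notification — and hence one re-render.
function makeBatcher<T>(flush: (batch: T[]) => void) {
  let pending: T[] = [];
  let scheduled = false;
  return (event: T) => {
    pending.push(event);
    if (!scheduled) {
      scheduled = true;
      queueMicrotask(() => {
        const batch = pending;
        pending = [];
        scheduled = false;
        flush(batch); // one flush per synchronous burst, not per event
      });
    }
  };
}
```

Note the risk flagged later in this section: if a transaction's events straddle a microtask boundary, this scheme splits them into two batches — which is why explicit transaction-boundary markers in the event protocol are worth considering.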
PERF-3 — Bridge overhead vs raw live.changes()
In plain terms: The change bridge adds a translation layer between the IMMEDIATE-mode delta application and PGlite's `live.changes()` API. Measure the overhead of this translation (serialization, event construction, key mapping) and ensure it is < 5% of total refresh latency. If overhead is higher, optimize the bridge's change event construction (e.g., avoid JSON round-trips; use structured clones).
Verify: micro-benchmark isolating bridge overhead from WASM refresh time. Document overhead as percentage of total INSERT-to-event latency. Dependencies: PGL-3-1. Schema change: No.
Scalability
| ID | Title | Effort | Priority |
|---|---|---|---|
| SCAL-1 | Multiple concurrent subscriptions | S | P1 |
| SCAL-2 | Large result set rendering (10K+ rows) | M | P1 |
| SCAL-3 | Multi-tab / SharedWorker isolation | S | P2 |
SCAL-1 — Multiple concurrent subscriptions
In plain terms: A dashboard page may render 5–10 `useStreamTable()` hooks simultaneously, each watching a different stream table. The bridge must not create per-hook subscriptions to `live.changes()` — instead, use a single multiplexed subscription that fans out to registered hooks. Measure performance with 1, 5, 10, and 20 concurrent hooks.
Verify: benchmark with 20 concurrent useStreamTable() hooks; latency
degradation < 20% vs single hook. Memory growth linear (not quadratic).
Dependencies: PGL-3-1, PGL-3-2. Schema change: No.
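The fan-out structure SCAL-1 calls for is essentially a multiplexer: one upstream listener, many downstream handlers. A sketch (class and method names are ours):

```typescript
// Sketch of SCAL-1's single multiplexed subscription: one upstream
// live.changes() listener dispatches to however many hooks are mounted,
// instead of one subscription per hook.
type Handler<T> = (event: T) => void;

class Multiplexer<T> {
  private handlers = new Set<Handler<T>>();

  /** Register a hook; returns an unsubscribe function for its cleanup. */
  register(h: Handler<T>): () => void {
    this.handlers.add(h);
    return () => {
      this.handlers.delete(h);
    };
  }

  /** Called once per event by the single upstream subscription. */
  dispatch(event: T): void {
    for (const h of this.handlers) h(event);
  }

  count(): number {
    return this.handlers.size;
  }
}
```

With this shape, adding a 20th hook costs one `Set` entry rather than a 20th database-side subscription, which is what keeps memory growth linear rather than quadratic.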
SCAL-2 — Large result set rendering (10K+ rows)
In plain terms: A stream table with 10K+ rows produces a large initial snapshot when `useStreamTable()` mounts. The hook must support virtualized rendering (integrating with libraries like `react-virtual` or `tanstack-virtual`) by providing a stable row identity key (`__pgt_row_id`) and fine-grained change signals (which rows changed, not just "something changed"). Without this, mounting a 10K-row stream table would freeze the UI for seconds.
Verify: demo app with 10K-row stream table using @tanstack/react-virtual.
Mount time < 200 ms. Single-row INSERT re-renders only the affected row,
not the full list.
Dependencies: PGL-3-2, PGL-3-4. Schema change: No.
SCAL-3 — Multi-tab / SharedWorker isolation
In plain terms: In multi-tab apps using PGlite with SharedWorker, each tab gets its own `useStreamTable()` hooks but shares a single PGlite instance. The bridge must correctly fan out change events to all tabs without cross-tab interference or duplicate processing. Document the SharedWorker architecture and test with 3 concurrent tabs.
Verify: 3-tab test with shared PGlite instance via SharedWorker. INSERT in tab 1 causes re-render in all 3 tabs. No duplicate events. No memory leaks across tabs. Dependencies: PGL-3-1. Schema change: No.
Ease of Use
| ID | Title | Effort | Priority |
|---|---|---|---|
| UX-1 | Local-first app example: collaborative todo | M | P0 |
| UX-2 | Real-time dashboard example | M | P0 |
| UX-3 | API reference with interactive playground | S | P1 |
| UX-4 | Migration guide from live.incrementalQuery() | S | P1 |
UX-1 — Local-first app example: collaborative todo
In plain terms: A complete, runnable React app demonstrating pg_trickle + PGlite for a collaborative todo list: multiple "users" (simulated in separate components) INSERT/UPDATE/DELETE todos, and each user's view updates reactively via `useStreamTable()`. Published in the monorepo under `examples/pglite-todo/` with a CodeSandbox link. This is the primary "show, don't tell" marketing asset.
Verify: example app runs in CodeSandbox with zero local setup. README explains every code section. A non-pg_trickle developer can understand it in 5 minutes. Dependencies: PGL-3-2, PGL-3-4. Schema change: No.
UX-2 — Real-time dashboard example
In plain terms: A React dashboard with 3 stream tables: (a) live order count (aggregate), (b) revenue by region (join + aggregate), (c) top products (window function + LIMIT). Data is inserted via a simulated event stream. Each panel updates reactively. Demonstrates the breadth of SQL operators supported in PGlite, beyond what `live.incrementalQuery()` can efficiently handle.
Verify: example app with 3 panels. INSERT 100 orders; all 3 panels update with a single render each. Published to CodeSandbox. Dependencies: PGL-3-2, PGL-3-4. Schema change: No.
UX-3 — API reference with interactive playground
In plain terms: An interactive documentation page (MDX or Storybook) where users can type SQL, create a stream table, insert data, and see the `useStreamTable()` hook update live — all in the browser via PGlite. This replaces the need for a local install for initial exploration.
Verify: playground page loads in < 3 seconds. Users can create a stream table and see reactive updates within 30 seconds of page load. Dependencies: PGL-3-2, UX-1. Schema change: No.
UX-4 — Migration guide from live.incrementalQuery()
In plain terms: Users already on PGlite's `live.incrementalQuery()` need a clear guide showing: (a) when to switch to pg_trickle (complex queries, high-throughput writes, large result sets), (b) how to migrate step by step (replace `live.incrementalQuery(q)` with `createStreamTable(q)` + `useStreamTable(name)`), and (c) what to expect (latency improvement, memory trade-off, SQL surface differences).
Verify: migration guide published in docs. Includes a before/after code diff and a decision flowchart. Dependencies: PGL-3-4, PERF-1. Schema change: No.
Test Coverage
| ID | Title | Effort | Priority |
|---|---|---|---|
| TEST-1 | Change event fidelity suite (all operators) | L | P0 |
| TEST-2 | React hook lifecycle tests | M | P0 |
| TEST-3 | Vue composable lifecycle tests | M | P0 |
| TEST-4 | Cross-framework render count assertions | S | P0 |
| TEST-5 | Long-running soak test for memory leaks | M | P1 |
TEST-1 — Change event fidelity suite (all operators)
In plain terms: For each of the 23 DVM operators, test that the `live.changes()` bridge emits the correct change events for INSERT, UPDATE, and DELETE on the source table. Replay the events into an accumulator and assert that it matches `SELECT * FROM stream_table`. This extends v0.24.0 TEST-1 (operator E2E) by adding the reactive layer.
Verify: ≥ 69 tests (23 operators × 3 DML types). Accumulator matches
SELECT * for every test case.
Dependencies: PGL-3-1, v0.24.0 TEST-1. Schema change: No.
TEST-2 — React hook lifecycle tests
In plain terms: Test the full lifecycle of `useStreamTable()`: (a) initial mount returns current stream table state, (b) INSERT on the source triggers a re-render with new data, (c) unmount cancels the subscription, (d) remount re-subscribes and returns current state, (e) rapid mount/unmount (100 cycles) has no leaks. Use React Testing Library with `renderHook()`.
Verify: ≥ 15 tests covering mount, update, unmount, remount, error, and stress scenarios. Zero console warnings in test output. Dependencies: PGL-3-2. Schema change: No.
TEST-3 — Vue composable lifecycle tests
In plain terms: Equivalent of TEST-2 for Vue: mount, update, unmount, remount, error handling. Use Vue Test Utils with `mount()` and `wrapper.unmount()`. Test with both Options API and Composition API usage patterns.
Verify: ≥ 10 tests covering Vue lifecycle. Zero console warnings. Dependencies: PGL-3-3. Schema change: No.
TEST-4 — Cross-framework render count assertions
In plain terms: For each framework (React, Vue), verify that a bulk INSERT (100 rows) triggers exactly 1 render, not 100. This is the batching correctness test. Use framework-specific profiling APIs (React Profiler, Vue DevTools perf hooks) to count renders.
Verify: render count = 1 for 100-row bulk INSERT in both React and Vue. CI assertion. Dependencies: PGL-3-2, PGL-3-3, PERF-2. Schema change: No.
TEST-5 — Long-running soak test for memory leaks
In plain terms: Run a React app with `useStreamTable()` for 4 hours with 1 mutation/second. Take heap snapshots at 0h, 1h, 2h, and 4h. Assert heap growth < 10%. Check for detached DOM nodes, leaked event listeners, and orphaned closures. This validates STAB-1 under real conditions.
Verify: soak test runs in CI (with a 30-min abbreviated version for PR CI). Full 4-hour version runs in nightly CI. Heap growth < 10%. Dependencies: STAB-1, PGL-3-2. Schema change: No.
Conflicts & Risks
- `live.changes()` API stability. PGlite's `live.changes()` is relatively new and its event format may change between PGlite releases. Pin the PGlite version and add an adapter layer so the bridge can accommodate event format changes without rewriting the React/Vue hooks. If PGlite deprecates `live.changes()` before v0.25.0 ships, fall back to `LISTEN`/`NOTIFY` with a custom channel.
- CORR-2 (batch atomicity) and PERF-2 (single re-render) are coupled. The batching mechanism must ensure correctness (all-or-nothing event delivery) AND performance (a single render). Using `queueMicrotask()` for batching risks splitting a transaction's events across two microtasks if the event stream straddles a microtask boundary. Consider explicit transaction-boundary markers in the bridge's event protocol.
- React concurrent mode complicates CORR-3 (rapid mutations). React 18/19 concurrent features (`startTransition`, `useDeferredValue`) may delay or re-order state updates from `useStreamTable()`. The hook must use `useSyncExternalStore()` (React 18+) to ensure tearing-free reads. This is non-negotiable for correctness.
- SCAL-2 (large result set rendering) requires external library integration. The `useStreamTable()` hook should not bundle a virtualization library — instead, expose stable row keys and fine-grained change signals that integrate with `@tanstack/react-virtual` or similar. Document the pattern but do not create a hard dependency.
- SCAL-3 (SharedWorker) is exploratory. PGlite's SharedWorker support has known limitations (no concurrent transactions). Mark SCAL-3 as P2 and scope it to documentation plus a proof of concept, not production-grade support.
- No native extension changes in v0.25.0. This release lives entirely in the TypeScript/npm layer. Any temptation to add native features (e.g., a `LISTEN`/`NOTIFY` bridge or WebSocket push) should be deferred to post-1.0. Keep the scope tight: reactive bindings + examples + docs.
v0.25.0 total: ~2–3 weeks (bridge + hooks) + ~1–2 weeks (examples + testing + polish)
Exit criteria:
- PGL-3-1: Stream table changes appear in the `live.changes()` event stream
- PGL-3-2: React `useStreamTable()` hook re-renders on stream table changes
- PGL-3-3: Vue `useStreamTable()` composable re-renders on stream table changes
- PGL-3-4: At least 2 example apps published with documentation and CodeSandbox links
- PGL-3-5: End-to-end latency benchmarked and published
- CORR-1: 1,000-operation replay test: accumulator matches `SELECT *` for all operators
- CORR-2: 100-row bulk INSERT triggers exactly 1 re-render
- CORR-3: 50 rapid same-row mutations: final hook state matches `SELECT *`
- CORR-4: DELETE + re-INSERT with same PK: correct unmount/mount lifecycle
- STAB-1: 4-hour soak test: heap growth < 10%
- STAB-2: 100 mount/unmount cycles: zero leaked subscriptions
- STAB-3: Stream table dropped while hook active: error boundary catches
- STAB-4: Extension upgrade path tested (`0.24.0 -> 0.25.0`)
- STAB-5: CI matrix passes for React 18, React 19, Vue 3.4+
- PERF-1: INSERT-to-render latency < 50% of `live.incrementalQuery()` at 10K rows
- PERF-2: Render count = 1 for bulk DML (1, 10, 100, 1000 rows)
- TEST-1: ≥ 69 change event fidelity tests pass (23 operators × 3 DML types)
- TEST-2: ≥ 15 React hook lifecycle tests pass
- TEST-3: ≥ 10 Vue composable lifecycle tests pass
- TEST-4: Cross-framework render count = 1 for bulk DML
- TEST-5: 30-min abbreviated soak test passes in PR CI
- UX-1: Collaborative todo example published to CodeSandbox
- UX-2: Real-time dashboard example published to CodeSandbox
- UX-4: Migration guide from `live.incrementalQuery()` published
- `just check-version-sync` passes (incl. npm package version)
v1.0.0 — Stable Release
Goal: First officially supported release. Semantic versioning locks in. API, catalog schema, and GUC names are considered stable. Focus is distribution — getting pg_trickle onto package registries — and PostgreSQL 19 forward-compatibility.
PostgreSQL 19 Forward-Compatibility (A3)
In plain terms: When PostgreSQL 19 beta stabilises and pgrx 0.18.x ships with PG 19 support, this milestone bumps the pgrx dependency, audits every internal `pg_sys::*` API call for breaking changes, adds conditional compilation gates, and validates the WAL decoder against any pgoutput format changes introduced in PG 19. Moved here from the earlier v0.22.0 milestone because PG 19 beta availability is uncertain.
| Item | Description | Effort | Ref |
|---|---|---|---|
| A3-1 | pgrx version bump to 0.18.x (PG 19 support) + cargo pgrx init --pg19 | 2–4h | PLAN_PG19_COMPAT.md §2 |
| A3-2 | pg_sys::* API audit: heap access, catalog structs, WAL decoder LogicalDecodingContext | 8–16h | PLAN_PG19_COMPAT.md §3 |
| A3-3 | Conditional compilation (#[cfg(feature = "pg19")]) for changed APIs | 4–8h | PLAN_PG19_COMPAT.md §4 |
| A3-4 | CI matrix expansion for PG 19 + full E2E suite run | 4–8h | PLAN_PG19_COMPAT.md |
A3 subtotal: ~18–36 hours
Release engineering
In plain terms: The 1.0 release is the official "we stand behind this API" declaration — from this point on the function names, catalog schema, and configuration settings won't change without a major version bump. The practical work is getting pg_trickle onto standard package registries (PGXN, apt, rpm) so it can be installed with the same commands as any other PostgreSQL extension, and hardening the CloudNativePG integration for Kubernetes deployments.
| Item | Description | Effort | Ref |
|---|---|---|---|
| R1 | Semantic versioning policy + compatibility guarantees | 2–3h | PLAN_VERSIONING.md |
| R2 | apt / rpm packaging (Debian/Ubuntu .deb + RHEL .rpm via PGDG) | 8–12h | PLAN_PACKAGING.md |
| R2b | PGXN release_status → "stable" (flip one field; PGXN testing release ships in v0.7.0) | 30min | PLAN_PACKAGING.md |
| R3 | — | ✅ Done | PLAN_CLOUDNATIVEPG.md |
| R4 | — | 4–6h | PLAN_CLOUDNATIVEPG.md |
| R5 | Docker Hub official image. Publish pgtrickle/pg_trickle:1.0.0-pg18 and :latest to Docker Hub. Sync Dockerfile.hub version tag with release. Automate via GitHub Actions release workflow. | 2–4h | — |
| R6 | Version sync automation. Ensure just check-version-sync covers all version references (Cargo.toml, extension control files, Dockerfile.hub, dbt_project.yml, CNPG manifests). Add to CI as a blocking check. | 2–3h | — |
| SAST-SEMGREP | Elevate Semgrep to blocking in CI. CodeQL and cargo-deny already block; Semgrep is advisory-only. Flip to blocking for consistent safety gating. Before flipping, verify zero findings across all current rules. | 1–2h | PLAN_SAST.md |
v1.0.0 total: ~36–66 hours (incl. PG 19 compat ~18–36h + release engineering ~18–30h)
Exit criteria:
- A3: PG 19 builds and passes full E2E suite
- CI matrix includes PG 19
- Published on PGXN (stable) and apt/rpm via PGDG
- Docker Hub image published (`pgtrickle/pg_trickle:1.0.0-pg18` and `:latest`)
- CNPG extension image published to GHCR (`pg_trickle-ext`)
- CNPG cluster-example.yaml validated (Image Volume approach)
- `just check-version-sync` passes and blocks CI on mismatch
- SAST-SEMGREP: Semgrep elevated to blocking in CI; zero findings verified
- Upgrade path from v0.17.0 tested
- Semantic versioning policy in effect
Post-1.0 — Scale, Ecosystem & Platform Expansion
These are not gated on 1.0 but represent the longer-term horizon. PG backward compatibility (PG 16–18) and native DDL syntax were moved here from v0.16.0 to keep the pre-1.0 milestones focused on performance and correctness.
Ecosystem expansion
In plain terms: Building first-class integrations with the tools most data teams already use — a proper dbt adapter (beyond just a materialization macro), an Airflow provider so you can trigger stream table refreshes from Airflow DAGs, a `pgtrickle` TUI for managing and monitoring stream tables without writing SQL (shipped in v0.14.0), and integration guides for popular ORMs and migration frameworks like Django, SQLAlchemy, Flyway, and Liquibase.
| Item | Description | Effort | Ref |
|---|---|---|---|
| E1 | dbt full adapter (dbt-pgtrickle extending dbt-postgres) | 20–30h | PLAN_DBT_ADAPTER.md |
| E2 | Airflow provider (apache-airflow-providers-pgtrickle) | 16–20h | PLAN_ECO_SYSTEM.md §4 |
| E3 | TUI (`pgtrickle`) for management outside SQL | 4–6d | PLAN_TUI.md |
| E4 | — | 8–12h | PLAN_ECO_SYSTEM.md §5 |
| E5 | — | 8–12h | PLAN_ECO_SYSTEM.md §5 |
Scale
In plain terms: When you have hundreds of stream tables or a very large cluster, the single background worker that drives pg_trickle today can become a bottleneck. These items explore running the scheduler as an external sidecar process (outside the database itself), distributing stream tables across Citus shards for horizontal scale-out, and managing stream tables that span multiple databases in the same PostgreSQL cluster.
| Item | Description | Effort | Ref |
|---|---|---|---|
| S1 | External orchestrator sidecar for 100+ STs | 20–40h | REPORT_PARALLELIZATION.md §D |
| S2 | Citus / distributed PostgreSQL compatibility | ~6 months | plans/infra/CITUS.md |
| S3 | Multi-database support (beyond postgres DB) | TBD | PLAN_MULTI_DATABASE.md |
PG Backward Compatibility (PG 16–18)
In plain terms: pg_trickle currently only targets PostgreSQL 18. This work adds support for PG 16 and PG 17 so teams that haven't yet upgraded can still use the extension. Each PostgreSQL major version has subtly different internal APIs — especially around query parsing and the WAL format used for change-data-capture — so each version needs its own feature flags, build path, and CI test run.
| Item | Description | Effort | Ref |
|---|---|---|---|
| BC1 | Cargo.toml feature flags (pg16, pg17, pg18) + cfg_aliases | 4–8h | PLAN_PG_BACKCOMPAT.md §5.2 Phase 1 |
| BC2 | #[cfg] gate JSON_TABLE nodes in parser.rs (~250 lines, PG 17+) | 12–16h | PLAN_PG_BACKCOMPAT.md §5.2 Phase 2 |
| BC3 | pg_get_viewdef() trailing-semicolon behavior verification | 2–4h | PLAN_PG_BACKCOMPAT.md §5.2 Phase 3 |
| BC4 | CI matrix expansion (PG 16, 17, 18) + parameterized Dockerfiles | 12–16h | PLAN_PG_BACKCOMPAT.md §5.2 Phases 4–5 |
| BC5 | WAL decoder validation against PG 16–17 pgoutput format | 8–12h | PLAN_PG_BACKCOMPAT.md §6A |
Backward compatibility subtotal: ~38–56 hours
Native DDL Syntax
In plain terms: Currently you create stream tables by calling a function: `SELECT pgtrickle.create_stream_table(...)`. This adds support for standard PostgreSQL DDL syntax: `CREATE MATERIALIZED VIEW my_view WITH (pgtrickle.stream = true) AS SELECT ...`. That single change means `pg_dump` can back them up properly, `\dm` in psql lists them, ORMs can introspect them, and migration tools like Flyway treat them like ordinary database objects. Stream tables finally look native to PostgreSQL tooling.
| Item | Description | Effort | Ref |
|---|---|---|---|
| NAT-1 | ProcessUtility_hook infrastructure: register in _PG_init(), dispatch+passthrough, hook chaining with TimescaleDB/pg_stat_statements | 3–5d | PLAN_NATIVE_SYNTAX.md §Tier 2 |
| NAT-2 | CREATE/DROP/REFRESH interception: parse CreateTableAsStmt reloptions, route to internal impls, IF EXISTS handling, CONCURRENTLY no-op | 8–13d | PLAN_NATIVE_SYNTAX.md §Tier 2 |
| NAT-3 | E2E tests: CREATE/DROP/REFRESH via DDL syntax, hook chaining, non-pg_trickle matview passthrough | 2–3d | PLAN_NATIVE_SYNTAX.md §Tier 2 |
Native DDL syntax subtotal: ~13–21 days
Advanced SQL
In plain terms: Longer-horizon features requiring significant research — backward-compatibility to PG 14/15, partitioned stream table storage, and remaining SQL coverage gaps. Several items have been pulled forward to v0.16.0 and v0.17.0.
| Item | Description | Effort | Ref |
|---|---|---|---|
| — | Transactional IVM (pulled forward) | ~36–54h | PLAN_TRANSACTIONAL_IVM.md |
| A3 | PostgreSQL 19 forward-compatibility ➡️ v1.0.0 | ~18–36h | PLAN_PG19_COMPAT.md |
| A4 | PostgreSQL 14–15 backward compatibility | ~40h | PLAN_PG_BACKCOMPAT.md |
| A5 | Partitioned stream table storage (opt-in) | ~60–80h | PLAN_PARTITIONING_SHARDING.md §4 |
| — | Buffer partitioning (`pg_trickle.buffer_partitioning` GUC) | ✅ Done | PLAN_EDGE_CASES_TIVM_IMPL_ORDER.md Stage 4 §3.3 |
| — | `ROWS FROM()` with multiple SRF functions | ~1–2d | PLAN_TRANSACTIONAL_IVM_PART_2.md Task 2.3 |
Parser Modularization & Shared Template Cache (G13-PRF, G14-SHC)
In plain terms: Two large-effort research items identified in the deep gap analysis. Parser modularization is a prerequisite for native DDL syntax (BC2); shared template caching eliminates per-connection cold-start overhead.
| Item | Description | Effort | Ref |
|---|---|---|---|
| G13-PRF | Parser modularization of `src/dvm/parser.rs` | ~3–4wk | plans/performance/REPORT_OVERALL_STATUS.md §13 |
| G14-SHC | Shared template cache | ~2–3wk | plans/performance/REPORT_OVERALL_STATUS.md §14 |
Parser modularization: ✅ Done in v0.15.0. Template caching: ➡️ v0.16.0
Convenience API Functions (G15-BC, G15-EX)
In plain terms: Two quality-of-life API additions that simplify programmatic stream table management, useful for dbt/CI pipelines.
| Item | Description | Effort | Ref |
|---|---|---|---|
| G15-BC | bulk_create(definitions JSONB) — create multiple stream tables and their CDC triggers in a single transaction. Useful for dbt/CI pipelines that manage many STs programmatically. ➡️ Pulled to v0.15.0 | ~2–3d | plans/performance/REPORT_OVERALL_STATUS.md §15 |
| G15-EX | `export_definition(name TEXT)` — export a stream table configuration as reproducible `CREATE STREAM TABLE … WITH (…)` DDL. ➡️ Pulled to v0.14.0 | ~1–2d | plans/performance/REPORT_OVERALL_STATUS.md §15 |
Convenience API subtotal: ~2–3 days (G15-EX pulled to v0.14.0; G15-BC pulled to v0.15.0)
Effort Summary
| Milestone | Effort estimate | Cumulative | Status |
|---|---|---|---|
| v0.1.x — Core engine + correctness | ~30h actual | 30h | ✅ Released |
| v0.2.0 — TopK, Diamond & Transactional IVM | ✔️ Complete | 62–78h | ✅ Released |
| v0.2.1 — Upgrade Infrastructure & Documentation | ~8h | 70–86h | ✅ Released |
| v0.2.2 — OFFSET Support, ALTER QUERY & Upgrade Tooling | ~50–70h | 120–156h | ✅ Released |
| v0.2.3 — Non-Determinism, CDC/Mode Gaps & Operational Polish | 45–66h | 165–222h | ✅ Released |
| v0.3.0 — DVM Correctness, SAST & Test Coverage | ~20–30h | 185–252h | ✅ Released |
| v0.4.0 — Parallel Refresh & Performance Hardening | ~60–94h | 245–346h | ✅ Released |
| v0.5.0 — RLS, Operational Controls + Perf Wave 1 (A-3a only) | ~51–97h | 296–443h | ✅ Released |
| v0.6.0 — Partitioning, Idempotent DDL & Circular Dependency Foundation | ~35–50h | 331–493h | ✅ Released |
| v0.7.0 — Performance, Watermarks, Circular DAG Execution, Observability & Infrastructure | ~59–62h | 390–555h | |
| v0.8.0 — pg_dump Support & Test Hardening | ~16–21d | — | |
| v0.9.0 — Incremental Aggregate Maintenance (B-1) | ~7–9 wk | — | |
| v0.10.0 — DVM Hardening, Connection Pooler Compat, Core Refresh Opts & Infra Prep | ~7–10d + ~26–40 wk | — | |
| v0.11.0 — Partitioned Stream Tables, Prometheus & Grafana, Safety Hardening & Correctness | ~7–10 wk + ~12h obs + ~14–21h defaults + ~7–12h safety + ~2–4 wk should-ship | — | |
| v0.12.0 — Scalability Foundations, Partitioning Enhancements & Correctness | ~18–27 wk + ~6–8 wk scalability + ~5–8 wk partitioning + ~1–3 wk defaults | — | |
| v0.13.0 — Scalability Foundations, Partitioning Enhancements, MERGE Profiling & Multi-Tenant Scheduling | ~15–23 wk | — | |
| v0.14.0 — Tiered Scheduling, UNLOGGED Buffers & Diagnostics | ~2–6 wk + ~1 wk patterns + ~2–4d stability + ~3.5–7d diagnostics + ~1–2d export + ~4–6d TUI + ~0.5d docs | — | |
| v0.15.0 — External Test Suites & Integration | ~40–70h + ~2–3d bulk create + ~3–5d planner hints + ~2–3d cache spike + ~3–4wk parser + ~1–2wk watermark + ~2–4wk delta cost/spill | — | ✅ Released |
| v0.16.0 — Performance & Refresh Optimization | ~1–2wk MERGE alts + ~4–6wk aggregate fast-path + ~1–2wk append-only + ~2–3wk predicate pushdown + ~2–3wk template cache + ~2–3wk buffer compaction + ~3–6wk test coverage + ~1–2wk bench CI + ~2–3d auto-indexing + ~12–22h quick wins | — | |
| v0.17.0 — Query Intelligence & Stability | ~2–3wk cost-based strategy + ~3–4wk columnar tracking + ~32–48h TIVM Phase 4 + ~1–2d ROWS FROM + ~2–3wk SQLancer + ~2–3wk incremental DAG + ~4–8h unsafe reduction + ~1–2wk api.rs mod + ~2–3d migration guide + ~3–5d runbook + ~2–3d playground + ~2–3d doc polish | — | |
| v0.18.0 — Hardening & Delta Performance | ~70–100h | — | |
| v0.19.0 — Production Gap Closure & Distribution | ~4–5 weeks | — | |
| v0.20.0 — Dog-Feeding (pg_trickle monitors itself) | ~3–4wk | — | |
| v0.21.0 — PostgreSQL 17 Support | ~2–4d | — | |
| v0.22.0 — PGlite Proof of Concept | ~2–3wk (plugin) + ~1–2d (version bump) | — | |
| v0.23.0 — Core Extraction (pg_trickle_core) | ~3–4wk (extraction) + ~1–2wk (abstraction + testing) | — |
| v0.24.0 — PGlite WASM Extension | ~5–7wk (WASM build) + ~2–3wk (testing + polish) | — | |
| v0.25.0 — PGlite Reactive Integration | ~2–3wk (bridge + hooks) + ~1–2wk (examples + testing + polish) | — | |
| v1.0.0 — Stable release (incl. PG 19 compat) | ~36–66h | — | |
| Post-1.0 (PG compat + Native DDL) | ~38–56h (PG 16–18) + ~13–21d (Native DDL) | — | |
| Post-1.0 (ecosystem) | 88–134h | — | |
| Post-1.0 (scale) | 6+ months | — |
References
| Document | Purpose |
|---|---|
| CHANGELOG.md | What's been built |
| plans/PLAN.md | Original 13-phase design plan |
| plans/sql/SQL_GAPS_7.md | 53 known gaps, prioritized |
| plans/sql/PLAN_PARALLELISM.md | Detailed implementation plan for true parallel refresh |
| plans/performance/REPORT_PARALLELIZATION.md | Parallelization options analysis |
| plans/performance/STATUS_PERFORMANCE.md | Benchmark results |
| plans/ecosystem/PLAN_ECO_SYSTEM.md | Ecosystem project catalog |
| plans/dbt/PLAN_DBT_ADAPTER.md | Full dbt adapter plan |
| plans/infra/CITUS.md | Citus compatibility plan |
| plans/infra/PLAN_VERSIONING.md | Versioning & compatibility policy |
| plans/infra/PLAN_PACKAGING.md | PGXN / deb / rpm packaging |
| plans/infra/PLAN_DOCKER_IMAGE.md | Official Docker image (superseded by CNPG extension image) |
| plans/ecosystem/PLAN_CLOUDNATIVEPG.md | CNPG Image Volume extension image |
| plans/infra/PLAN_MULTI_DATABASE.md | Multi-database support |
| plans/infra/PLAN_PG19_COMPAT.md | PostgreSQL 19 forward-compatibility |
| plans/sql/PLAN_UPGRADE_MIGRATIONS.md | Extension upgrade migrations |
| plans/sql/PLAN_TRANSACTIONAL_IVM.md | Transactional IVM (immediate, same-transaction refresh) |
| plans/sql/PLAN_ORDER_BY_LIMIT_OFFSET.md | ORDER BY / LIMIT / OFFSET gaps & TopK support |
| plans/sql/PLAN_NON_DETERMINISM.md | Non-deterministic function handling |
| plans/sql/PLAN_ROW_LEVEL_SECURITY.md | Row-Level Security support plan (Phases 1–4) |
| plans/infra/PLAN_PARTITIONING_SHARDING.md | PostgreSQL partitioning & sharding compatibility |
| plans/infra/PLAN_PG_BACKCOMPAT.md | Supporting older PostgreSQL versions (13–17) |
| plans/sql/PLAN_DIAMOND_DEPENDENCY_CONSISTENCY.md | Diamond dependency consistency (multi-path refresh atomicity) |
| plans/adrs/PLAN_ADRS.md | Architectural decisions |
| docs/ARCHITECTURE.md | System architecture |
Release Process
This document describes how to create a release of pg_trickle.
Overview
Releases are fully automated via GitHub Actions. Pushing a version tag (v*)
triggers the Release workflow, which:
- Runs a preflight version-sync check to ensure all version references match the tag
- Builds extension packages for Linux (amd64), macOS (arm64), and Windows (amd64)
- Smoke-tests the Linux artifact against a live PostgreSQL 18 instance
- Creates a GitHub Release with archives and SHA256 checksums
- Builds and pushes a multi-arch extension image to GHCR (for CNPG Image Volumes)
A separate PGXN workflow also fires on the same
v* tag and publishes the source archive to the PostgreSQL Extension Network.
Prerequisites
- Push access to the repository (or a PR merged by a maintainer)
- All CI checks passing on `main` (verify the last run on the version-bump commit succeeded)
- The version in `Cargo.toml` matches the tag you intend to push
- Required GitHub secrets configured (see Required GitHub Secrets below)
Required GitHub Secrets
The release automation uses the following GitHub Actions secrets. Set them under Settings → Secrets and variables → Actions → New repository secret.
| Secret | Used by | Description |
|---|---|---|
| `PGXN_USERNAME` | pgxn.yml | Your PGXN account username. Used to authenticate the curl upload to PGXN Manager when publishing source archives to the PostgreSQL Extension Network. Register at pgxn.org. |
| `PGXN_PASSWORD` | pgxn.yml | Password for the PGXN account above. Never hardcode this — it must be stored as a secret so it is never exposed in logs or committed to the repository. |
| `CODECOV_TOKEN` | coverage.yml | Upload token for Codecov. Used to publish unit and E2E coverage reports. Obtain it from the Codecov dashboard after linking the repository. The workflow degrades gracefully (`fail_ci_if_error: false`) if absent. |
| `BENCHER_API_TOKEN` | benchmarks.yml | API token for Bencher, the continuous benchmarking platform. Used to track Criterion benchmark results on main and detect regressions on pull requests. The benchmark steps are skipped entirely when this secret is absent, so CI still passes without it. Create a project at bencher.dev and copy the token from the project settings. |
Note: The `GITHUB_TOKEN` secret is provided automatically by GitHub Actions and does not need to be configured manually. It is used by the release workflow to create GitHub Releases, by the Docker workflow to push images to GHCR, and by Bencher to post PR comments.
Step-by-Step
1. Decide the version number
Follow Semantic Versioning:
| Change type | Bump | Example |
|---|---|---|
| Breaking SQL API or config change | Major | 1.0.0 → 2.0.0 |
| New feature, backward-compatible | Minor | 0.1.0 → 0.2.0 |
| Bug fix, no API change | Patch | 0.2.0 → 0.2.1 |
| Pre-release / release candidate | Suffix | 0.3.0-rc.1 |
2. Update the version
Four files must have their version bumped together:
# 1. Cargo.toml — the canonical version source for the extension
# Change: version = "0.7.0" → version = "0.8.0"
# 2. pgtrickle-tui/Cargo.toml — the TUI binary; must always match Cargo.toml
# Change: version = "0.7.0" → version = "0.8.0"
# 3. META.json — the PGXN package metadata
# Change both top-level "version" and the nested "provides" version
# 4. CHANGELOG.md
# Rename ## [Unreleased] → ## [0.8.0] — YYYY-MM-DD
# Add a new empty ## [Unreleased] section at the top
Important: `Cargo.toml` (extension) and `pgtrickle-tui/Cargo.toml` (TUI) must always carry the same version. They are built and released together, and a mismatch causes `cargo install --path pgtrickle-tui` to report the wrong version. The `just check-version-sync` script does not currently enforce this, so it must be checked manually.
The extension control file (`pg_trickle.control`) uses `default_version = '@CARGO_VERSION@'`, which pgrx substitutes automatically at build time — no manual edit needed there.
After editing, verify all version-related files are in sync:
just check-version-sync
3. Commit the version bump
git add Cargo.toml pgtrickle-tui/Cargo.toml META.json CHANGELOG.md
git commit -m "release: v0.8.0"
git push origin main
4. Wait for CI to pass and verify upgrade completeness
Ensure the CI workflow passes on main with
the version bump commit. All unit, integration, E2E, and pgrx tests must be
green.
Critical: Before tagging, verify that the upgrade script covers all SQL schema changes:
# Run comprehensive upgrade completeness checks
just check-upgrade-all
# If any check fails (e.g. "ERROR: X new function(s) missing from upgrade script"),
# fix the issue by adding the missing SQL objects to:
# sql/pg_trickle--<prev>--<new>.sql
#
# Then re-run until all checks pass:
just check-upgrade-all # Should print "All 15 upgrade step(s) passed completeness checks."
Why this matters: New SQL functions, views, tables, and columns added in any prior
release must be carried forward in the upgrade script, even if the current release
doesn't change them. The upgrade script is the source of truth for what PostgreSQL
applies when users run ALTER EXTENSION pg_trickle UPDATE.
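A hypothetical fragment showing the carry-forward rule in practice. The function names below are invented for illustration and are not real pg_trickle objects:

```sql
-- sql/pg_trickle--<prev>--<new>.sql (illustrative shape only)

-- Carried forward from the previous release: unchanged objects must still
-- be listed, or ALTER EXTENSION pg_trickle UPDATE will not install them.
CREATE OR REPLACE FUNCTION pgtrickle.prior_helper()  -- hypothetical name
RETURNS integer LANGUAGE sql AS $$ SELECT 1 $$;

-- New in this release
CREATE OR REPLACE FUNCTION pgtrickle.new_helper()    -- hypothetical name
RETURNS integer LANGUAGE sql AS $$ SELECT 2 $$;
```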
Confirm the local and CI upgrade-E2E defaults were advanced to the new release:
just check-version-sync # Verifies ci.yml, justfile, and test defaults
5. Create and push the tag
git tag -a v0.2.0 -m "Release v0.2.0"
git push origin v0.2.0
This triggers the Release workflow automatically.
6. Monitor the release
Watch the Actions tab for progress. The release workflow runs these jobs in order:
preflight ──► build-release (linux, macos, windows)
│
▼
test-release ──► publish-release
──► publish-docker-arch (linux/amd64 + linux/arm64)
│
▼
publish-docker (merge manifest + push :latest)
The PGXN workflow (pgxn.yml) runs independently and publishes the source
archive to pgxn.org in parallel with the release workflow.
7. Make the GHCR package public (first release only)
When a package is pushed to GHCR for the first time it is private by default, even when it is linked to a public open-source repository. You must change the visibility to public once; every later push then keeps it public:
- Go to github.com/⟨owner⟩ → Packages → pg_trickle-ext
- Click Package settings
- Scroll to Danger Zone → Change package visibility → set to Public
After that first change:
- All future pushes keep the package public automatically
- Unauthenticated `docker pull ghcr.io/grove/pg_trickle-ext:...` works
- Storage and bandwidth are free (GHCR open-source advantage)
- The package page shows the README, linked repository, license, and description from the OCI labels
8. Verify the release
Once both workflows complete:
- Check the GitHub Releases page for the new release
- Verify all three platform archives are attached (`.tar.gz` for Linux/macOS, `.zip` for Windows)
- Verify `SHA256SUMS.txt` is present
- Verify the extension image is available at `ghcr.io/grove/pg_trickle-ext:<version>`
- Verify the PGXN upload succeeded: `pgxn info pg_trickle` should show the new version
- Optionally verify the extension image layout:
docker pull ghcr.io/grove/pg_trickle-ext:<version>
ID=$(docker create ghcr.io/grove/pg_trickle-ext:<version>)
docker cp "$ID:/lib/" /tmp/ext-lib/
docker cp "$ID:/share/" /tmp/ext-share/
docker rm "$ID"
ls -la /tmp/ext-lib/ /tmp/ext-share/extension/
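For the checksum step, `sha256sum -c` is the usual tool. A self-contained sketch with a throwaway file standing in for a real release archive (the file name here is illustrative):

```shell
# Simulate a downloaded release archive plus its checksum manifest
workdir=$(mktemp -d)
cd "$workdir"
echo "example artifact" > pg_trickle-example.tar.gz
sha256sum pg_trickle-example.tar.gz > SHA256SUMS.txt

# Verification: prints "<name>: OK" per file, exits non-zero on any mismatch
sha256sum -c SHA256SUMS.txt
```

Run the same `sha256sum -c SHA256SUMS.txt` in the directory holding the downloaded release archives to confirm they match the published manifest.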
Post-Release Checklist
Complete these steps immediately after a release tag has been pushed and both the Release and PGXN workflows have finished successfully.
- Create a post-release branch from `main` (e.g. `post-release-<ver>-a`)
- Bump `Cargo.toml` `version` to the next development version (e.g. `0.12.0` → `0.13.0`)
- Bump `pgtrickle-tui/Cargo.toml` `version` to the same next development version — must always match `Cargo.toml`
- Bump `META.json` — both the top-level `"version"` and the nested `"provides" → "pg_trickle" → "version"` to match
- Write `plans/PLAN_0_<next>_0.md` — initial planning document for the next milestone
- Delete `plans/PLAN_0_<released>_0.md` — remove the now-completed plan
- Wrap roadmap items — in `ROADMAP.md`, wrap all completed items from the old release with `<details>` tags to archive them
- Add `## [Unreleased]` stub to `CHANGELOG.md` above the just-released entry
- Create `sql/pg_trickle--<released>--<next>.sql` — empty upgrade script stub for the next migration hop
- Copy `sql/archive/pg_trickle--<released>.sql` → `sql/archive/pg_trickle--<next>.sql` — placeholder archive baseline for the next version
- Update `justfile` — advance `build-upgrade-image` and `test-upgrade` `to` defaults to `<next>`; update the `build-hub` Docker image tag
- Update `tests/e2e_upgrade_tests.rs` — advance all `unwrap_or("<released>".into())` fallback strings to `<next>`
- Update version numbers in `README.md` — search for occurrences of the released version (e.g. `0.17.0`) and advance them to `<next>`: CNPG image reference (`ghcr.io/grove/pg_trickle-ext:<version>`), dbt `revision` tag, and any other hardcoded version strings. A quick check: `grep -n '<released>' README.md`
- Run `just check-version-sync` — must exit 0 before opening the PR
- Open a PR against `main` with the commit title `chore: start v<next> development cycle`
Preparing for the Next Release (Pre-Work Checklist)
Use this checklist at the start of each new release milestone to ensure the repository is properly set up before development begins. This maps directly to what just check-version-sync verifies.
| File / target | Action | check-version-sync check |
|---|---|---|
| `Cargo.toml` | `version = "<next>"` | canonical version source |
| `META.json` | both `"version"` fields set to `<next>` | PGXN manifest |
| `CHANGELOG.md` | `## [Unreleased]` section present | (manual hygiene) |
| `sql/pg_trickle--<prev>--<next>.sql` | stub file exists | upgrade SQL exists |
| `sql/archive/pg_trickle--<next>.sql` | placeholder file exists (copy of `<prev>`) | archive SQL exists |
| `.github/workflows/ci.yml` | upgrade matrix and chain end at `<next>` | CI matrix up to date |
| `justfile` | `build-upgrade-image` and `test-upgrade` `to` defaults = `<next>` | justfile defaults |
| `tests/e2e_upgrade_tests.rs` | all `unwrap_or` fallbacks = `"<next>"` | e2e fallback strings |
Quick-verify with:
just check-version-sync
# Should print: All version references are in sync.
Release Artifacts
Each release produces:
| Artifact | Description |
|---|---|
| `pg_trickle-<ver>-pg18-linux-amd64.tar.gz` | Extension files for Linux x86_64 |
| `pg_trickle-<ver>-pg18-macos-arm64.tar.gz` | Extension files for macOS Apple Silicon |
| `pg_trickle-<ver>-pg18-windows-amd64.zip` | Extension files for Windows x64 |
| `SHA256SUMS.txt` | SHA-256 checksums for all archives |
| `ghcr.io/grove/pg_trickle-ext:<ver>` | CNPG extension image for Image Volumes (amd64 + arm64) |
Installing from an archive
tar xzf pg_trickle-<version>-pg18-linux-amd64.tar.gz
cd pg_trickle-<version>-pg18-linux-amd64
sudo cp lib/*.so "$(pg_config --pkglibdir)/"
sudo cp extension/*.control extension/*.sql "$(pg_config --sharedir)/extension/"
Then add to postgresql.conf and restart:
shared_preload_libraries = 'pg_trickle'
See INSTALL.md for full installation details.
Pre-releases
Tags containing -rc, -beta, or -alpha (e.g., v0.3.0-rc.1) are
automatically marked as pre-releases on GitHub. Pre-release extension images are
tagged but do not update the latest tag.
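The gating can be reproduced with a small shell predicate. This is assumed logic sketched from the tag patterns above; the actual condition lives in the release workflow YAML and may be written differently:

```shell
# Classify a tag: pre-release if it contains -rc, -beta, or -alpha
is_prerelease() {
  case "$1" in
    *-rc*|*-beta*|*-alpha*) echo "prerelease" ;;
    *)                      echo "stable" ;;
  esac
}

is_prerelease "v0.3.0-rc.1"   # pre-release: GitHub Release marked, :latest untouched
is_prerelease "v0.3.0"        # stable: :latest image tag is also updated
```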
Hotfix Releases
For urgent fixes on an older release:
# Branch from the tag
git checkout -b hotfix/v0.2.1 v0.2.0
# Apply fix, bump version to 0.2.1
git commit -am "fix: ..."
git push origin hotfix/v0.2.1
# Tag from the branch (CI will still run the release workflow)
git tag -a v0.2.1 -m "Release v0.2.1"
git push origin v0.2.1
Files to Update for Each Release
Every release requires manual updates to the files below. Missing any of them leads to version skew between the code, the docs, and the packages.
| File | What to change | Why |
|---|---|---|
| `Cargo.toml` | `version = "x.y.z"` field | The canonical version source. pgrx reads this at build time and substitutes it into `pg_trickle.control` via `@CARGO_VERSION@`. The git tag must match. |
| `META.json` | Both `"version"` fields (top-level and inside `"provides"`) | The PGXN package manifest. The pgxn.yml workflow uploads this file as part of the source archive; a stale version here means the wrong version appears on pgxn.org. |
| `CHANGELOG.md` | Rename `## [Unreleased]` → `## [x.y.z] — YYYY-MM-DD`; add a new empty `## [Unreleased]` at the top | Keeps the public changelog accurate and gives downstream users a dated record of changes. |
| `ROADMAP.md` | Update the preamble's latest-release/current-milestone lines; mark the released milestone done; advance the "We are here" pointer to the next milestone | Keeps the forward-looking plan aligned with reality. Leaves no confusion about what just shipped versus what is next. |
| `README.md` | Update test-count line (`~N unit tests + M E2E tests`) if test counts changed significantly | The README is the first thing users read; stale numbers erode trust. |
| `INSTALL.md` | Update any version numbers in install commands or example URLs | Users copy-paste installation commands; stale versions cause failures. |
| `docs/UPGRADING.md` | Add the new version-specific migration notes and extend the supported upgrade-path table | Documents exactly what `ALTER EXTENSION ... UPDATE` will do and which chains are supported. |
| `sql/pg_trickle--<old>--<new>.sql` | Add or update the hand-authored upgrade script for every SQL-surface change (new objects, changed signatures, changed defaults, view changes). Also carry forward all functions/views/tables added in previous releases — the upgrade script is cumulative. | `ALTER EXTENSION ... UPDATE` only applies what is explicitly scripted; function defaults and signatures stored in `pg_proc` do not update themselves. Omitting a function that existed in `<old>` but is expected in `<new>` will break user upgrades. |
| `sql/archive/pg_trickle--<new>.sql` | Regenerate and commit the full-install SQL baseline for the new version. This file was created as a placeholder copy of `<prev>` at the start of the development cycle — it must be replaced with the actual generated SQL before tagging. Run `cargo pgrx schema` (or the equivalent just target) to produce the final schema, then overwrite the placeholder. | Future upgrade-completeness checks and upgrade E2E tests need an exact baseline for the released version. A stale placeholder from the start of the cycle will cause spurious failures. |
| `.github/workflows/ci.yml`, `justfile`, `tests/build_e2e_upgrade_image.sh`, `tests/Dockerfile.e2e-upgrade` | Advance the upgrade-check chain and default upgrade-E2E target version to the new release | Prevents release automation and local upgrade validation from getting stuck on the previous version after a new migration hop is added. |
| `pg_trickle.control` | No manual edit needed — `default_version` is set to `'@CARGO_VERSION@'` and pgrx substitutes it at build time. Verify the substitution in the built artifact. | Ensures the SQL `CREATE EXTENSION` command installs the right version. |
CRITICAL: After updating `sql/pg_trickle--<old>--<new>.sql`, always run `just check-upgrade-all` to verify that the upgrade script is complete. This checks not just the immediate hop to the new version, but the entire upgrade chain from v0.1.3 onwards. If the check fails (e.g. "ERROR: 3 new function(s) missing"), it means the upgrade script is missing one or more SQL objects that users will expect to have after upgrading. Fix all failures before tagging.
Checklist summary
[ ] Cargo.toml — version bumped
[ ] META.json — both "version" fields updated to match
[ ] CHANGELOG.md — [Unreleased] renamed to [x.y.z] with date; new empty [Unreleased] added
[ ] ROADMAP.md — preamble updated; released milestone marked done
[ ] README.md — test counts current (if materially changed)
[ ] INSTALL.md — version references current
[ ] docs/UPGRADING.md — latest migration notes and supported chains added
[ ] sql/pg_trickle--<old>--<new>.sql — covers every SQL-surface change AND carries forward all previous release functions
[ ] sql/archive/pg_trickle--<new>.sql — regenerated from final schema and committed (replaces the dev-cycle placeholder)
[ ] just check-upgrade-all — all upgrade steps pass completeness checks (not just the one-step hop)
[ ] Upgrade automation defaults — CI/local upgrade checks and E2E target the new version
[ ] just check-version-sync — all version references in sync
[ ] All CI checks on main have passed (verify the last run on the version-bump commit succeeded)
[ ] git tag matches Cargo.toml version
Troubleshooting
Release workflow failed
Go to the Actions tab and identify which job failed. Then follow the appropriate recovery path below.
Option A: Re-run (transient failure)
If the failure is transient — network timeout, registry hiccup, runner issue — you can re-run without changing anything:
- Open the failed workflow run in the Actions tab
- Click Re-run all jobs (or re-run just the failed job)
This works because the v* tag still points to the same commit, and the
workflow uses cancel-in-progress: false so a re-run won't be cancelled.
Option B: Fix code and re-tag
If the failure is a real build or code issue:
# 1. Delete the remote tag
git push origin :refs/tags/v0.2.0
# 2. Delete the local tag
git tag -d v0.2.0
# 3. Fix the issue, commit, and push
git add <files>
git commit -m "fix: ..."
git push origin main
# 4. Re-tag on the new commit and push
git tag -a v0.2.0 -m "Release v0.2.0"
git push origin v0.2.0
This triggers a fresh release workflow run.
Option C: Clean up a partial GitHub Release
If the workflow created a draft or partial Release before failing:
- Go to Releases in the repository
- Delete the broken release (this does not delete the tag)
- Then follow Option A or Option B above
Upgrade script completeness check failed
If just check-upgrade-all reports errors like "ERROR: X new function(s) missing from upgrade script", it means the upgrade SQL script is incomplete:
# 1. Look at the error — it tells you exactly what's missing
just check-upgrade-all # e.g. "ERROR: 3 new function(s) missing from upgrade script:
# - pgtrickle.\"explain_refresh_mode\"
# - pgtrickle.\"fuse_status\"
# - pgtrickle.\"reset_fuse\""
# 2. Find where those objects are defined in the previous release
# (they should already exist in sql/archive/pg_trickle--<prev>.sql)
grep -n "CREATE.*FUNCTION.*explain_refresh_mode" sql/archive/pg_trickle--*.sql
# 3. Copy the function definitions (CREATE OR REPLACE FUNCTION) to the
# upgrade script you're fixing. They should go into:
# sql/pg_trickle--<old>--<new>.sql
#
# Typically, carry-forward functions are grouped in their own section
# at the top of the upgrade script with a comment explaining they're
# from a prior release.
# 4. Re-run the check to verify it passes
just check-upgrade-all
Why this happens: When a new release (e.g. v0.11.0) adds SQL functions, those
functions must be explicitly included in all subsequent upgrade scripts. The upgrade
script is the ground truth — PostgreSQL only applies what is listed in the .sql file.
If you skip a function that users expect, their upgraded extension will be missing
that object.
Common failure causes
| Symptom | Cause | Fix |
|---|---|---|
| Version mismatch error | Cargo.toml version doesn't match the git tag | Run just check-version-sync, fix any skew, commit, delete tag, re-tag (Option B) |
| Build failure | Compilation error in release profile | Fix on main, re-tag (Option B) |
| Docker push failed | Missing permissions | Verify packages: write is in the workflow and GITHUB_TOKEN has GHCR access, then re-run (Option A) |
| Smoke test failed | Extension doesn't load in PostgreSQL | Fix the issue, re-tag (Option B) |
| PGXN upload failed | Missing PGXN_USERNAME / PGXN_PASSWORD secrets, or META.json version not updated | Add the secrets in repository settings; verify META.json version matches the tag; re-run the pgxn.yml workflow from the Actions tab |
| `just check-upgrade-all` reports missing functions/views | Upgrade script is incomplete — new objects from prior releases not carried forward | See "Upgrade script completeness check failed" above for recovery steps |
| Rate limited | GitHub API or GHCR throttling | Wait a few minutes, then re-run (Option A) |
Yanking a release
If a release has a critical issue:
- Mark it as pre-release on the GitHub Releases page (uncheck "Set as the latest release")
- Add a warning to the release notes
- Publish a patch release with the fix
Security Policy
Supported Versions
| Version | Supported |
|---|---|
| 0.13.x (current pre-release) | ✅ |
During pre-1.0 development, only the latest minor version receives security fixes. Once v1.0.0 is released, the two most recent minor versions will receive security fixes.
Reporting a Vulnerability
Please do not report security vulnerabilities via public GitHub Issues.
Use GitHub's built-in private vulnerability reporting:
- Go to the Security tab of this repository
- Click "Report a vulnerability"
- Fill in the details — affected version, description, reproduction steps, and potential impact
We aim to acknowledge reports within 48 hours and provide a fix or mitigation within 14 days for critical issues.
What to Include
A useful report includes:
- PostgreSQL version and `pg_trickle` version
- Minimal reproduction SQL or Rust code
- Description of the unintended behaviour and its security impact
- Whether the vulnerability requires a trusted (superuser) or untrusted role to trigger
Scope
In-scope:
- SQL injection or privilege escalation via `pgtrickle.*` functions
- Memory safety issues in the Rust extension code (buffer overflows, use-after-free, etc.)
- Denial-of-service caused by a low-privilege user triggering runaway resource usage
- Information disclosure through change buffers (`pgtrickle_changes.*`) or monitoring views
Out-of-scope:
- Vulnerabilities in PostgreSQL itself (report to the PostgreSQL security team)
- Vulnerabilities in pgrx (report to pgcentralfoundation/pgrx)
- Issues requiring physical access to the database host
Disclosure Policy
We follow coordinated disclosure. Once a fix is released we will publish a security advisory on GitHub with a CVE if applicable.
pg_trickle vs. DBSP: Similarities and Differences
What They Share (Conceptual Foundation)
pg_trickle explicitly cites DBSP as its theoretical foundation (see PRIOR_ART.md). The key overlap:
| Concept | DBSP (paper) | pg_trickle (implementation) |
|---|---|---|
| Z-set / delta model | Rows annotated with weights (+1/−1) in an abelian group | __pgt_action = 'I'/'D' column on every delta row — effectively Z-sets restricted to {+1, −1} |
| Per-operator differentiation | Recursive Algorithm 4.6: Q^Δ = D ∘ Q ∘ I, decomposed per-operator via the chain rule (Q₁ ∘ Q₂)^Δ = Q₁^Δ ∘ Q₂^Δ | DiffContext::diff_node() walks the OpTree and calls per-operator differentiators (scan, filter, project, join, aggregate, distinct, union, etc.) — same recursive structural decomposition |
| Linear operators are self-incremental | Theorem 3.3: for LTI operator Q, Q^Δ = Q | Filter and Project pass deltas through unchanged (just apply predicate/projection to the delta stream) |
| Bilinear join rule | Theorem 3.4: Δ(a × b) = Δa × Δb + a × Δb + Δa × b | diff_inner_join generates exactly 3 UNION ALL parts: (delta_left ⋈ current_right), (current_left ⋈ delta_right), and optionally (delta_left ⋈ delta_right) |
| Aggregate auxiliary counters | §4.2: counting algorithm for maintaining aggregates with deletions | __pgt_count auxiliary column, LEFT JOIN back to stream table to read old counts and compute new counts |
| Recursive queries | §6: fixed-point iteration with z⁻¹ delay operator, semi-naive evaluation | diff_recursive_cte uses recomputation-diff (DRed-style), not DBSP's native fixed-point circuit |
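Written out as SQL, the bilinear rule produces a delta query of roughly this shape. The table and column names are illustrative; the extension's actual generated CTE chain will differ in detail:

```sql
-- Δ(orders ⋈ customers), sketched as the three UNION ALL branches above
SELECT d.*, c.*                 -- delta_left ⋈ current_right
FROM   delta_orders d
JOIN   customers c USING (customer_id)
UNION ALL
SELECT o.*, d.*                 -- current_left ⋈ delta_right
FROM   orders o
JOIN   delta_customers d USING (customer_id)
UNION ALL
SELECT dl.*, dr.*               -- delta_left ⋈ delta_right (correction term;
FROM   delta_orders dl          -- needed or not depending on whether the
JOIN   delta_customers dr       -- non-delta sides read the old or new snapshot)
       USING (customer_id);
```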
Key Differences
1. Execution model — standalone engine vs. embedded in PostgreSQL
DBSP is a standalone streaming runtime (Rust library, now Feldera). It compiles query plans into dataflow graphs that maintain in-memory state and process continuous micro-batches. Operators are long-lived stateful actors with their own memory.
pg_trickle is an extension inside PostgreSQL. It has no persistent dataflow graph. On each refresh, it generates a single SQL query (CTE chain) that PostgreSQL's own planner/executor evaluates. After execution, no operator state persists — auxiliary state lives in the stream table itself (__pgt_count columns) and change buffer tables.
2. Streams vs. periodic batches
DBSP operates on true infinite streams indexed by logical time t ∈ ℕ. Each "step" processes one micro-batch of changes, and operators carry integration state (I operator = running sum from t=0).
pg_trickle operates in discrete refresh cycles triggered by a lag-based scheduler. There is no integration operator — the "current state" is just the stream table's contents, and changes are consumed from CDC buffer tables between LSN boundaries. Each refresh is a self-contained transaction.
3. Z-set weights vs. binary actions
DBSP uses integer weights in ℤ — rows can have weights > 1 (bags) or < −1 (multiple deletions). This enables correct multiset semantics and composable group algebra.
pg_trickle uses binary actions ('I' insert, 'D' delete, sometimes 'U' update). It doesn't maintain true Z-set weights. For aggregates, the __pgt_count auxiliary column serves a similar purpose but is specific to the aggregate operator — it's not a general weight propagated through the operator tree.
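The gap is easiest to see by collapsing a change buffer to net weights: a true Z-set carries an arbitrary integer per row, while binary actions only encode ±1 per buffered change. An illustrative query (not the extension's internal SQL, and the buffer table name is invented):

```sql
-- Net weight of each key, reconstructed from binary actions
SELECT order_id,
       sum(CASE __pgt_action WHEN 'I' THEN  1
                             WHEN 'D' THEN -1
                             ELSE 0 END) AS net_weight
FROM   change_buffer            -- illustrative buffer table name
GROUP  BY order_id;
```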
4. Integration operator (I)
DBSP: The integration operator I(s)[t] = Σᵢ≤ₜ s[i] is an explicit first-class circuit element. It maintains running sums of changes and is the key mechanism for computing incremental joins (z⁻¹(I(a)) = "accumulated left side up to previous step").
pg_trickle: No explicit integration. The equivalent of I is just "read the current contents of the source/stream table." Join differentiation directly reads the current snapshot of the non-delta side (build_snapshot_sql() generates FROM "public"."orders" r), which implicitly includes all historical changes.
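The bilinear join rule this section leans on can be sanity-checked on toy Z-sets. The sketch below is hypothetical Python (pg_trickle emits the same three terms as SQL CTEs instead); it verifies that Δ(A ⋈ B) = ΔA⋈B + A⋈ΔB + ΔA⋈ΔB agrees with full recomputation:

```python
# Toy check of the bilinear join expansion, with Z-sets as
# {(key, value): weight} dicts. Illustrative only.

def join(a, b):
    """Join on the first tuple field; weights multiply (bilinearity)."""
    out = {}
    for (ka, va), wa in a.items():
        for (kb, vb), wb in b.items():
            if ka == kb:
                row = (ka, va, vb)
                out[row] = out.get(row, 0) + wa * wb
    return {r: w for r, w in out.items() if w != 0}

def add(x, y):
    """Z-set addition; zero-weight rows vanish."""
    out = dict(x)
    for r, w in y.items():
        out[r] = out.get(r, 0) + w
        if out[r] == 0:
            del out[r]
    return out

A  = {(1, 'a'): 1}
dA = {(1, 'a2'): 1}
B  = {(1, 'x'): 1}
dB = {(1, 'y'): 1, (1, 'x'): -1}   # replace x with y

delta = add(add(join(dA, B), join(A, dB)), join(dA, dB))
# Recomputing the join on the updated inputs agrees with the incremental delta:
assert join(add(A, dA), add(B, dB)) == add(join(A, B), delta)
```

The `join(A, dB)` and `join(dA, B)` terms are where the "current snapshot of the non-delta side" enters, which is exactly what `build_snapshot_sql()` supplies in pg_trickle.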
5. Recursion
DBSP: Native fixed-point circuits with z⁻¹ delay. Can incrementally maintain recursive queries (e.g., transitive closure) by iterating only on new changes within each step — semi-naive evaluation generalized to arbitrary recursion.
pg_trickle: Uses recomputation-diff for recursive CTEs — re-executes the full recursive query and anti-joins against current storage to compute the delta. This is correct but not truly incremental for the recursive part.
6. Correctness guarantees
DBSP: Proven correct in Lean. All theorems are machine-checked. The chain rule, cycle rule, and bilinear decomposition are formally verified.
pg_trickle: Verified empirically via property-based tests (the assert_invariant checks that Contents(ST) = Q(DB) after each mutation cycle). No formal proof, but the per-operator rules are direct translations of DBSP's rules.
7. Scope
DBSP: A general-purpose theory and streaming engine. Handles nested relations, streaming aggregation over windows, arbitrary compositions. The Feldera implementation supports a full SQL frontend.
pg_trickle: Focused on materialized views inside PostgreSQL. Supports a specific subset of SQL (scan, filter, project, inner/left/full join, aggregates, DISTINCT, UNION ALL, INTERSECT, EXCEPT, CTEs, window functions, lateral joins). It is not a general streaming engine — it leverages PostgreSQL's own query planner and executor.
Summary
pg_trickle applies DBSP's differentiation rules to generate delta queries, but it is not a DBSP implementation. It borrows the mathematical framework (per-operator differentiation, Z-set-like deltas, bilinear join decomposition) while making fundamentally different architectural choices: embedded in PostgreSQL, no persistent dataflow state, periodic batch execution, and PostgreSQL's planner as the optimizer. Think of it as "DBSP's differentiation algebra, compiled down to SQL CTEs and executed by PostgreSQL."
Prior Art
This document lists the academic papers, PostgreSQL commits, open-source tools,
and standard algorithms whose techniques are reused in pg_trickle.
Maintaining this record serves two purposes:
- Attribution — credit the research and engineering work this project builds upon.
- Independent derivation — demonstrate that every core technique predates and is independent of any single vendor's commercial product.
Differential View Maintenance (DVM)
DBSP — Automatic Incremental View Maintenance
Budiu, M., Ryzhyk, L., McSherry, F., & Tannen, V. (2023). "DBSP: Automatic Incremental View Maintenance for Rich Query Languages." Proceedings of the VLDB Endowment (PVLDB), 16(7), 1601–1614. https://arxiv.org/abs/2203.16684
The Z-set abstraction (rows annotated with +1/−1 multiplicity) is the
theoretical foundation for the __pgt_action column produced by the delta
operators in src/dvm/operators/. The per-operator differentiation rules
(scan, filter, project, join, aggregate, union) are direct applications of
the DBSP lifting operator (D) described in this paper.
See DBSP_COMPARISON.md for a detailed comparison of pg_trickle's architecture with the DBSP model.
Gupta & Mumick — Materialized Views Survey
Gupta, A. & Mumick, I.S. (1995). "Maintenance of Materialized Views: Problems, Techniques, and Applications." IEEE Data Engineering Bulletin, 18(2), 3–18.
Gupta, A. & Mumick, I.S. (1999). Materialized Views: Techniques, Implementations, and Applications. MIT Press. ISBN 978-0-262-57122-7.
The per-operator differentiation rules in src/dvm/operators/ follow the
derivation given in section 3 of the 1995 survey. The counting algorithm
for maintaining aggregates with deletions uses the approach described in
the MIT Press book.
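For intuition, the counting approach can be sketched in a few lines. This is hypothetical Python (pg_trickle's real version is the `__pgt_count` auxiliary column maintained in SQL): a per-group count lets a deletion distinguish "the group shrank" from "the group's last contributor is gone":

```python
# Sketch of count-based aggregate maintenance under deletions, in the
# spirit of Gupta & Mumick's counting algorithm. Illustrative only.

def apply_delta(state, delta):
    """state: {group: (sum, count)}; delta: [(action, group, value)]."""
    for action, grp, val in delta:
        s, c = state.get(grp, (0, 0))
        if action == 'I':
            state[grp] = (s + val, c + 1)
        else:  # 'D'
            s, c = s - val, c - 1
            if c == 0:
                state.pop(grp)       # no contributors left: group vanishes
            else:
                state[grp] = (s, c)
    return state

st = apply_delta({}, [('I', 'east', 10), ('I', 'east', 5), ('I', 'west', 7)])
st = apply_delta(st, [('D', 'east', 10), ('D', 'west', 7)])
print(st)   # {'east': (5, 1)}, and 'west' disappears when its count hits 0
```

Without the count, a delete could drive a SUM to zero with no way to tell whether the group row itself should be removed.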
DBToaster — Higher-order Delta Processing
Koch, C., Ahmad, Y., Kennedy, O., Nikolic, M., Nötzli, A., Olteanu, D., & Zavodny, J. (2014). "DBToaster: Higher-order Delta Processing for Dynamic, Frequently Fresh Views." The VLDB Journal, 23(2), 253–278. https://doi.org/10.1007/s00778-013-0348-4
Inspiration for the recursive delta compilation strategy where the delta of a complex query is itself a query that can be differentiated.
DRed — Deletion and Re-derivation
Gupta, A., Mumick, I.S., & Subrahmanian, V.S. (1993). "Maintaining Views Incrementally." Proceedings of the 1993 ACM SIGMOD International Conference, 157–166.
The DRed algorithm for handling deletions in recursive views is the basis for
the recursive CTE differential refresh strategy in src/dvm/operators/recursive_cte.rs.
Scheduling
Earliest-Deadline-First (EDF)
Liu, C.L. & Layland, J.W. (1973). "Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment." Journal of the ACM, 20(1), 46–61. https://doi.org/10.1145/321738.321743
The schedule-based scheduling in src/scheduler.rs applies the classic
EDF principle: the stream table whose freshness deadline expires soonest is
refreshed first. EDF is optimal for uniprocessor preemptive scheduling and is
a standard technique in operating systems and real-time databases.
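Applied to stream tables, the EDF rule is small. A hedged sketch in hypothetical Python (the actual scheduler in src/scheduler.rs is Rust and tracks more state):

```python
# Minimal EDF pick: deadline = last refresh time + freshness interval;
# among tables that are due, refresh the one whose deadline expired first.
# Illustrative sketch only.

def next_to_refresh(tables, now):
    """tables: {name: (last_refresh, interval_s)}.
    Returns the name with the earliest expired deadline, or None."""
    due = [(last + iv, name) for name, (last, iv) in tables.items()
           if last + iv <= now]
    return min(due)[1] if due else None

tables = {
    'active_orders': (100.0, 30.0),   # deadline 130
    'daily_totals':  (60.0, 60.0),    # deadline 120
}
print(next_to_refresh(tables, now=135.0))  # 'daily_totals' (earliest deadline)
print(next_to_refresh(tables, now=110.0))  # None (nothing due yet)
```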
Topological Sort — Kahn's Algorithm
Kahn, A.B. (1962). "Topological sorting of large networks." Communications of the ACM, 5(11), 558–562. https://doi.org/10.1145/368996.369025
The dependency DAG in src/dag.rs uses Kahn's algorithm for topological
ordering and cycle detection. This is standard computer science curriculum
and appears in every major algorithms textbook (Cormen et al., Sedgewick,
Kleinberg & Tardos).
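Kahn's algorithm itself is short. A hedged Python sketch (illustrative; src/dag.rs is the real Rust implementation):

```python
from collections import deque

# Kahn's algorithm for dependency ordering: repeatedly emit nodes with no
# remaining incoming edges; any leftovers indicate a cycle.

def topo_sort(edges, nodes):
    """edges: (upstream, downstream) pairs. Returns a refresh order,
    or raises ValueError if the dependency graph contains a cycle."""
    indeg = {n: 0 for n in nodes}
    adj = {n: [] for n in nodes}
    for u, v in edges:
        adj[u].append(v)
        indeg[v] += 1
    queue = deque(sorted(n for n in nodes if indeg[n] == 0))
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for m in adj[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                queue.append(m)
    if len(order) != len(nodes):
        raise ValueError("cycle detected among stream tables")
    return order

# orders feeds two stream tables; one of those feeds a third.
print(topo_sort([('orders', 'totals'), ('orders', 'recent'),
                 ('totals', 'rollup')],
                ['orders', 'totals', 'recent', 'rollup']))
# ['orders', 'totals', 'recent', 'rollup']
```

Refreshing in this order guarantees every stream table sees its upstream deltas before it runs.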
Change Data Capture (CDC)
PostgreSQL Row-Level Triggers
Row-level AFTER INSERT/UPDATE/DELETE triggers have been available in
PostgreSQL since version 6.x (late 1990s). The trigger-based change capture
pattern used in src/cdc.rs is a well-established PostgreSQL technique:
- PostgreSQL documentation: CREATE TRIGGER — trigger-based CDC has been a standard pattern for decades.
- PostgreSQL wiki: "Trigger-based Change Data Capture in PostgreSQL."
Debezium
Debezium project (Red Hat, open source since 2016). https://debezium.io/
Debezium implements trigger-based and WAL-based CDC for PostgreSQL and other
databases. The change buffer table pattern (pg_trickle_changes.changes_<oid>)
follows a similar approach, modified for single-process consumption within
the PostgreSQL backend.
pgaudit
pgaudit extension (2015). https://github.com/pgaudit/pgaudit
Captures DML via AFTER row-level triggers for audit logging, demonstrating
the same trigger-based change-capture technique in production since 2015.
Materialized View Refresh
PostgreSQL REFRESH MATERIALIZED VIEW CONCURRENTLY
PostgreSQL 9.4 (December 2014, commit 96ef3b8), implemented in src/backend/commands/matview.c.
The snapshot-diff strategy used for recomputation-diff refreshes (where the
full query is re-executed and anti-joined against current storage to compute
inserts and deletes) mirrors the algorithm implemented in PostgreSQL's
REFRESH MATERIALIZED VIEW CONCURRENTLY. This PostgreSQL feature predates
all relevant patents and is publicly documented.
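In miniature, the snapshot-diff strategy reduces to two anti-joins (here, set differences). A hedged Python sketch, illustrative only; the real implementation is SQL, as in PostgreSQL's matview.c:

```python
# Snapshot-diff: re-run the full query, then compute the delta against
# current storage. Inserts are rows only in the recomputed result;
# deletes are rows only in current storage.

def snapshot_diff(stored, recomputed):
    """Return (to_insert, to_delete) that turn `stored` into `recomputed`."""
    to_insert = recomputed - stored     # rows the query now produces
    to_delete = stored - recomputed     # rows it no longer produces
    return to_insert, to_delete

stored = {('widget', 3), ('gadget', 1)}
recomputed = {('widget', 4), ('gadget', 1)}   # widget count changed
ins, dele = snapshot_diff(stored, recomputed)
print(ins)   # {('widget', 4)}
print(dele)  # {('widget', 3)}
```

An update thus appears as a delete/insert pair, which is exactly what applying the delta to the stored table needs.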
SQL MERGE Statement
ISO/IEC 9075:2003 (SQL:2003 standard) — MERGE statement. PostgreSQL 15 (October 2022, commit 7103eba).
The MERGE-based delta application in src/refresh.rs uses the
ISO-standard MERGE statement, independently implemented by Oracle, SQL
Server, DB2, and PostgreSQL. This is not derived from any vendor-specific
implementation.
General Database Theory
Relational Algebra
Codd, E.F. (1970). "A Relational Model of Data for Large Shared Data Banks." Communications of the ACM, 13(6), 377–387.
The operator tree in src/dvm/parser.rs models standard relational algebra
operators (select, project, join, aggregate, union). These are foundational
database theory from 1970.
Semi-Naive Evaluation
Bancilhon, F. & Ramakrishnan, R. (1986). "An Amateur's Introduction to Recursive Query Processing Strategies." Proceedings ACM SIGMOD, 16–52.
General background for recursive CTE evaluation strategies. PostgreSQL's own
WITH RECURSIVE implementation uses iterative fixpoint evaluation based on
these principles.
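The iteration is easy to demonstrate. A hedged Python sketch of semi-naive transitive closure (illustrative only; PostgreSQL performs the same frontier-only iteration inside `WITH RECURSIVE`):

```python
# Semi-naive fixpoint: each round joins only the newly derived pairs
# (the delta) against the base edges, never the whole closure.

def transitive_closure(edges):
    closure = set(edges)
    delta = set(edges)
    while delta:
        # Extend only the frontier; already-known pairs are not re-joined.
        new = {(a, c) for (a, b) in delta for (b2, c) in edges if b == b2}
        delta = new - closure
        closure |= delta
    return closure

edges = {(1, 2), (2, 3), (3, 4)}
print(sorted(transitive_closure(edges)))
# [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```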
This document is maintained for attribution and independent-derivation documentation purposes. It does not constitute legal advice.
Custom SQL Syntax for PostgreSQL Extensions
Comprehensive Technical Research Report
Date: 2026-02-25
Context: pg_trickle extension — evaluating approaches to support CREATE STREAM TABLE syntax or equivalent native-feeling DDL.
Table of Contents
- Executive Summary
- PostgreSQL Parser Hooks / Utility Hooks
- The ProcessUtility_hook Approach
- Raw Parser Extension (gram.y)
- The Utility Command Approach
- Custom Access Methods (CREATE ACCESS METHOD)
- Table Access Method API (PostgreSQL 12+)
- Foreign Data Wrapper Approach
- Event Triggers
- TimescaleDB Continuous Aggregates Pattern
- Citus Distributed DDL Pattern
- PostgreSQL 18 New Features
- COMMENT / OPTIONS Abuse Pattern
- pg_ivm (Incremental View Maintenance) Pattern
- CREATE TABLE ... USING (Table Access Methods) Deep Dive
- Comparison Matrix
- Recommendations for pg_trickle
1. Executive Summary
PostgreSQL's parser is not extensible — there is no parser hook that allows extensions to add new grammar rules. This is a fundamental design constraint. Every approach to "custom DDL syntax" in extensions falls into one of two categories:
- Intercept existing syntax — Use `ProcessUtility_hook` or event triggers to intercept standard DDL (e.g., `CREATE TABLE`, `CREATE VIEW`) and augment its behavior.
- Use a SQL function as the DDL interface — Define `SELECT my_extension.create_thing(...)` as the user-facing API (this is what pg_trickle currently does).
No production PostgreSQL extension ships truly new SQL grammar without forking the PostgreSQL parser. TimescaleDB, Citus, pg_ivm, and others all work within existing syntax boundaries.
2. PostgreSQL Parser Hooks / Utility Hooks
Available Hook Points
PostgreSQL provides several hook function pointers that extensions can override in _PG_init():
| Hook | Header | Purpose |
|---|---|---|
| ProcessUtility_hook | tcop/utility.h | Intercept utility (DDL) statement execution |
| post_parse_analyze_hook | parser/analyze.h | Inspect/modify the analyzed parse tree after semantic analysis |
| planner_hook | optimizer/planner.h | Replace or augment the query planner |
| ExecutorStart_hook | executor/executor.h | Intercept executor startup |
| ExecutorRun_hook | executor/executor.h | Intercept executor row processing |
| ExecutorFinish_hook | executor/executor.h | Intercept executor finish |
| ExecutorEnd_hook | executor/executor.h | Intercept executor cleanup |
| object_access_hook | catalog/objectaccess.h | Notifications when objects are created/modified/dropped |
| emit_log_hook | utils/elog.h | Intercept log messages |
What's Missing: No Parser Hook
There is no parser_hook or raw_parser_hook. The raw parser (the scan.l lexer feeding the gram.y Bison grammar) is compiled into the PostgreSQL server binary. Extensions cannot:
- Add new keywords (e.g., `STREAM`)
- Add new grammar productions (e.g., `CREATE STREAM TABLE`)
- Modify the tokenizer/lexer
- Intercept raw SQL text before parsing
The closest hook is post_parse_analyze_hook, which fires after the SQL has already been parsed and analyzed. By this point:
- The SQL string has already been tokenized and parsed by gram.y
- A parse tree (`Query` node) has been produced
- If the SQL contains unknown syntax, a syntax error has already been raised
Technical Details of post_parse_analyze_hook
/* In src/backend/parser/analyze.c */
typedef void (*post_parse_analyze_hook_type)(ParseState *pstate,
Query *query,
JumbleState *jstate);
post_parse_analyze_hook_type post_parse_analyze_hook = NULL;
Extensions can set this in _PG_init():
static post_parse_analyze_hook_type prev_post_parse_analyze_hook = NULL;
void _PG_init(void) {
prev_post_parse_analyze_hook = post_parse_analyze_hook;
post_parse_analyze_hook = my_post_parse_analyze;
}
Use cases: Query rewriting after parsing (e.g., adding security predicates, row-level security), statistics collection, plan caching invalidation. Not usable for new syntax because parsing has already completed.
Pros/Cons
| Aspect | Assessment |
|---|---|
| Native syntax | Impossible — cannot add new grammar |
| Intercept existing DDL | Yes via ProcessUtility_hook |
| Modify parsed queries | Yes via post_parse_analyze_hook |
| Complexity | Low for hooking, but limited in capability |
| PG version | All modern versions (hooks stable since PG 9.x) |
| Maintenance | Very low — hook signatures rarely change |
3. The ProcessUtility_hook Approach
How It Works
ProcessUtility_hook is the most powerful DDL interception point. It fires for every "utility statement" (DDL, COPY, EXPLAIN, etc.) after parsing but before execution.
typedef void (*ProcessUtility_hook_type)(PlannedStmt *pstmt,
const char *queryString,
bool readOnlyTree,
ProcessUtilityContext context,
ParamListInfo params,
QueryEnvironment *queryEnv,
DestReceiver *dest,
QueryCompletion *qc);
An extension can:
- Inspect the parse tree node — The `PlannedStmt->utilityStmt` field contains the parsed DDL node (e.g., `CreateStmt`, `AlterTableStmt`, `ViewStmt`).
- Modify the parse tree — Change fields before passing to the standard handler.
- Replace execution entirely — Skip calling the standard handler and do something else.
- Post-process — Call the standard handler first, then do additional work.
- Block execution — Raise an error to prevent the DDL.
What Extensions Use This
| Extension | What they intercept | Purpose |
|---|---|---|
| TimescaleDB | CREATE TABLE, ALTER TABLE, DROP TABLE, CREATE INDEX, etc. | Convert regular tables to hypertables, distribute DDL |
| Citus | Most DDL statements | Propagate DDL to worker nodes |
| pg_partman | CREATE TABLE, partition DDL | Auto-manage partitioning |
| pg_stat_statements | All utility statements | Track DDL execution statistics |
| pgAudit | All utility statements | Audit logging |
| pg_hint_plan | — | Uses post_parse_analyze_hook instead |
| sepgsql | Object creation/modification | Security label enforcement |
Can It Handle New Syntax?
No. It can only intercept DDL that PostgreSQL's parser already understands. You cannot use ProcessUtility_hook to handle CREATE STREAM TABLE because the parser will reject that syntax before the hook is ever called.
However, it can intercept and augment existing syntax:
- `CREATE TABLE ... (some_option)` → Intercept `CreateStmt`, check for special markers, do extra work
- `CREATE VIEW ... WITH (custom_option = true)` → Intercept `ViewStmt`, check `reloptions`
- `CREATE MATERIALIZED VIEW ... WITH (custom = true)` → Same approach
Pattern: Intercepting CREATE TABLE
static void my_process_utility(PlannedStmt *pstmt, const char *queryString, ...) {
Node *parsetree = pstmt->utilityStmt;
if (IsA(parsetree, CreateStmt)) {
CreateStmt *stmt = (CreateStmt *) parsetree;
// Check for a special reloption or table name pattern
ListCell *lc;
foreach(lc, stmt->options) {
DefElem *opt = (DefElem *) lfirst(lc);
if (strcmp(opt->defname, "stream") == 0) {
// This is a stream table! Do custom logic.
create_stream_table_from_ddl(stmt, queryString);
return; // Don't call standard handler
}
}
}
// Pass through to standard handler
if (prev_ProcessUtility)
prev_ProcessUtility(pstmt, ...);
else
standard_ProcessUtility(pstmt, ...);
}
Pros/Cons
| Aspect | Assessment |
|---|---|
| Native CREATE STREAM TABLE | No — parser rejects unknown syntax |
| CREATE TABLE ... WITH (stream=true) | Yes — feasible via reloptions |
| Complexity | Medium — must carefully chain with other extensions |
| PG version | All modern versions |
| Maintenance | Low — hook signature changes rarely (changed in PG14, PG15) |
| Risk | Must always chain prev_ProcessUtility — misbehaving can break other extensions |
4. Raw Parser Extension (gram.y)
How It Works
PostgreSQL's SQL parser is a Bison-generated LALR(1) parser defined in:
src/backend/parser/gram.y— Grammar rules (~18,000 lines)src/backend/parser/scan.l— Flex lexer (tokenizer)src/include/parser/kwlist.h— Reserved/unreserved keyword list
To add CREATE STREAM TABLE, you would:
- Add `STREAM` to the keyword list (unreserved or reserved)
- Add grammar rules to `gram.y`:

      CreateStreamTableStmt:
          CREATE STREAM TABLE qualified_name '(' OptTableElementList ')' OptWith AS SelectStmt
              {
                  CreateStreamTableStmt *n = makeNode(CreateStreamTableStmt);
                  n->relation = $4;
                  n->query = $10;
                  /* ... */
                  $$ = (Node *) n;
              }
          ;

- Add a new `NodeTag` for `CreateStreamTableStmt`
- Handle it in `ProcessUtility`
- Rebuild the PostgreSQL server
This requires forking PostgreSQL. The modified parser is compiled into postgres binary. You cannot ship a grammar modification as a loadable extension (.so/.dylib).
Who Does This?
- YugabyteDB — Fork of PG with custom grammar for distributed features
- CockroachDB — Entirely custom parser (Go, not PG's Bison grammar)
- Amazon Aurora (partially) — Custom grammar additions for Aurora-specific features
- Greenplum — Fork of PG with added grammar for `DISTRIBUTED BY`, `PARTITION BY`, etc.
- ParadeDB — Fork of PG with some custom syntax additions
Pros/Cons
| Aspect | Assessment |
|---|---|
| Native CREATE STREAM TABLE | Yes — full parser-level support |
| Complexity | Very high — must maintain a PG fork |
| PG version | Tied to a single PG version |
| Maintenance | Extremely high — must rebase on every PG release (gram.y changes significantly between major versions) |
| Distribution | Cannot use CREATE EXTENSION; must ship entire modified PostgreSQL |
| User adoption | Very low — users must replace their PostgreSQL installation |
| psql autocomplete | Would work with matching psql modifications |
| pg_dump/pg_restore | Broken unless you also modify those tools |
Verdict: Not viable for an extension. Only viable for a PostgreSQL fork/distribution.
5. The Utility Command Approach
How It Works
Some sources reference a "custom utility command" mechanism. In practice, this does not exist as a formal PostgreSQL extension point. What people sometimes mean is one of:
5a. Using DO Blocks as Custom Commands
DO $$ BEGIN PERFORM pgtrickle.create_stream_table('my_st', 'SELECT ...'); END $$;
This is just a wrapped function call — not a real custom command.
5b. Abusing COMMENT or SET for Command Dispatch
Some extensions parse custom commands from strings:
-- Using SET to pass commands
SET myext.command = 'CREATE STREAM TABLE my_st AS SELECT ...';
SELECT myext.execute_pending_command();
Or using post_parse_analyze_hook to intercept a specially-formatted query:
-- Extension intercepts this via post_parse_analyze_hook
SELECT * FROM myext.dispatch('CREATE STREAM TABLE ...');
5c. Overloading Existing Syntax
Some extensions overload SELECT or CALL:
CALL pgtrickle.create_stream_table('my_st', $$SELECT ...$$);
CALL was introduced in PostgreSQL 11 for stored procedures. With create_stream_table defined as a procedure rather than a function, CALL makes the DDL feel more "command-like" than SELECT function().
Pros/Cons
| Aspect | Assessment |
|---|---|
| Native syntax | No — still a function call in disguise |
| User experience | Moderate — CALL is better than SELECT |
| Complexity | Low |
| PG version | PG11+ for CALL |
| Maintenance | Very low |
6. Custom Access Methods (CREATE ACCESS METHOD)
How It Works
PostgreSQL supports extension-defined access methods (index AMs and table AMs):
CREATE ACCESS METHOD my_am TYPE TABLE HANDLER my_am_handler;
This was introduced in PostgreSQL 9.6 for index AMs and extended to table AMs in PostgreSQL 12. The CREATE ACCESS METHOD statement shows PostgreSQL's philosophy: extensions can define new implementations of existing concepts (tables, indexes) but not new concepts (stream tables).
Table AM vs. Index AM
| Type | Since | Handler Signature | Example |
|---|---|---|---|
| Index AM | PG 9.6 | IndexAmRoutine with scan/insert/delete callbacks | bloom, brin, GiST |
| Table AM | PG 12 | TableAmRoutine with 60+ callbacks | heap (default), columnar (Citus), zedstore (experimental) |
Can We Use This for Stream Tables?
The table AM API defines how tuples are stored and retrieved, not how tables are created or maintained. A stream table's key features are:
- Defining query — Not part of the table AM concept
- Automatic refresh — Not part of the table AM concept
- Change tracking — Could partially overlap with table AM's tuple modification callbacks
- Storage — The actual storage could use heap (default) AM
You could theoretically create a custom table AM that:
- Uses heap storage underneath
- Intercepts INSERT/UPDATE/DELETE to maintain change buffers
- Adds custom metadata
But this would be an extreme abuse of the API. Table AMs are meant for storage engines, not for implementing materialized view semantics.
Pros/Cons
| Aspect | Assessment |
|---|---|
| Native syntax | No — CREATE TABLE ... USING my_am is the closest |
| Complexity | Extremely high — 60+ callbacks to implement |
| Fitness | Poor — table AM is about storage, not view maintenance |
| PG version | PG 12+ |
| Maintenance | High — AM API evolves between major versions |
7. Table Access Method API (PostgreSQL 12+)
Deep Technical Details
The Table Access Method (AM) API was introduced in PostgreSQL 12 via commit c2fe139c20 by Andres Freund. It abstracts the storage layer, allowing extensions to replace the default heap storage with custom implementations.
The CREATE TABLE ... USING Syntax
-- Use default AM (heap)
CREATE TABLE normal_table (id int, data text);
-- Use custom AM
CREATE TABLE my_table (id int, data text) USING my_custom_am;
-- Set default for a database
SET default_table_access_method = 'my_custom_am';
TableAmRoutine Structure
The handler function must return a TableAmRoutine struct with callbacks:
typedef struct TableAmRoutine {
NodeTag type;
/* Slot callbacks */
const TupleTableSlotOps *(*slot_callbacks)(Relation rel);
/* Scan callbacks */
TableScanDesc (*scan_begin)(Relation rel, Snapshot snap, int nkeys, ...);
void (*scan_end)(TableScanDesc scan);
void (*scan_rescan)(TableScanDesc scan, ...);
bool (*scan_getnextslot)(TableScanDesc scan, ...);
/* Parallel scan */
Size (*parallelscan_estimate)(Relation rel);
Size (*parallelscan_initialize)(Relation rel, ...);
void (*parallelscan_reinitialize)(Relation rel, ...);
/* Index fetch */
IndexFetchTableData *(*index_fetch_begin)(Relation rel);
void (*index_fetch_reset)(IndexFetchTableData *data);
void (*index_fetch_end)(IndexFetchTableData *data);
bool (*index_fetch_tuple)(IndexFetchTableData *data, ...);
/* Tuple modification */
void (*tuple_insert)(Relation rel, TupleTableSlot *slot, ...);
void (*tuple_insert_speculative)(Relation rel, ...);
void (*tuple_complete_speculative)(Relation rel, ...);
void (*multi_insert)(Relation rel, TupleTableSlot **slots, int nslots, ...);
TM_Result (*tuple_delete)(Relation rel, ItemPointer tid, ...);
TM_Result (*tuple_update)(Relation rel, ItemPointer otid, ...);
TM_Result (*tuple_lock)(Relation rel, ItemPointer tid, ...);
/* DDL callbacks */
void (*relation_set_new_filelocator)(Relation rel, ...);
void (*relation_nontransactional_truncate)(Relation rel);
void (*relation_copy_data)(Relation rel, const RelFileLocator *newrlocator);
void (*relation_copy_for_cluster)(Relation rel, ...);
void (*relation_vacuum)(Relation rel, VacuumParams *params, ...);
bool (*scan_analyze_next_block)(TableScanDesc scan, ...);
bool (*scan_analyze_next_tuple)(TableScanDesc scan, ...);
/* Planner support */
void (*relation_estimate_size)(Relation rel, int32 *attr_widths, ...);
/* ... more callbacks */
} TableAmRoutine;
Hybrid Approach: Table AM + ProcessUtility_hook
A more practical pattern:
- Register a custom table AM (e.g., `stream_am`) that wraps heap
- Use `ProcessUtility_hook` to intercept `CREATE TABLE ... USING stream_am`
- The actual storage uses standard heap via delegation
-- User writes:
CREATE TABLE order_totals (region text, total numeric)
USING stream_am
WITH (query = 'SELECT region, SUM(amount) FROM orders GROUP BY region',
schedule = '1m',
refresh_mode = 'DIFFERENTIAL');
Problems with This Approach
- Column list is mandatory — `CREATE TABLE ... USING` requires explicit column definitions. Stream tables should derive columns from the query.
- Query in WITH clause — Storing a full SQL query in `reloptions` is hacky and has length limits.
- No AS SELECT — Table AMs don't support `CREATE TABLE ... AS SELECT` with a USING clause in the standard grammar.
- VACUUM, ANALYZE complexity — Must implement or delegate all maintenance callbacks.
- pg_dump compatibility — pg_dump would dump `CREATE TABLE ... USING stream_am` but not the associated metadata (query, schedule, etc.)
Pros/Cons
| Aspect | Assessment |
|---|---|
| Native syntax | Partial — CREATE TABLE ... USING stream_am |
| Feels like a stream table | No — still looks like a regular table with options |
| Complexity | Very high |
| pg_dump | Broken — metadata in catalog tables won't be dumped |
| PG version | PG 12+ |
| Maintenance | High — table AM API changes between versions |
8. Foreign Data Wrapper Approach
How It Works
Foreign Data Wrappers (FDW) allow PostgreSQL to access external data sources via CREATE FOREIGN TABLE. An extension can register a custom FDW:
CREATE EXTENSION pg_trickle;
CREATE SERVER stream_server FOREIGN DATA WRAPPER pgtrickle_fdw;
CREATE FOREIGN TABLE order_totals (region text, total numeric)
SERVER stream_server
OPTIONS (
query 'SELECT region, SUM(amount) FROM orders GROUP BY region',
schedule '1m',
refresh_mode 'DIFFERENTIAL'
);
FDW API
The FDW API provides callbacks for:
- `GetForeignRelSize` — Estimate relation size for planning
- `GetForeignPaths` — Generate access paths
- `GetForeignPlan` — Create a plan node
- `BeginForeignScan` — Start scan
- `IterateForeignScan` — Get next tuple
- `EndForeignScan` — End scan
- `AddForeignUpdateTargets` and the ForeignModify callbacks — Support INSERT/UPDATE/DELETE (optional)
How It Could Work for Stream Tables
- Define a custom FDW (`pgtrickle_fdw`)
- The FDW's scan callbacks read from the underlying storage table
- `ProcessUtility_hook` intercepts `CREATE FOREIGN TABLE ... SERVER stream_server` to set up CDC, catalog entries, etc.
- A background worker handles refresh scheduling
Problems
- Foreign tables have restrictions — Cannot have indexes, and declared constraints are assumed rather than enforced. This severely limits usability.
- Query planner limitations — Foreign tables use a separate planning path with potentially worse plan quality.
- No MVCC — Foreign tables typically don't provide snapshot isolation semantics.
- User model confusion — "Foreign table" implies external data, not a derived view.
- EXPLAIN output — Shows "Foreign Scan" instead of "Seq Scan", confusing users.
- pg_dump — Foreign tables are dumped, but server/FDW setup may not transfer correctly.
- Two-step creation — Requires `CREATE SERVER` before `CREATE FOREIGN TABLE`.
Pros/Cons
| Aspect | Assessment |
|---|---|
| Native syntax | Partial — CREATE FOREIGN TABLE with options |
| Feels like a stream table | No — foreign tables have different semantics |
| Index support | No — major limitation |
| Trigger support | Partial — row-level triggers work on foreign tables (PG 9.4+) |
| Complexity | Medium |
| PG version | PG 9.1+ |
| Maintenance | Low — FDW API is very stable |
Verdict: Not suitable. The restrictions on foreign tables (no indexes, unenforced constraints) make this impractical for stream tables that need to behave like regular tables.
9. Event Triggers
How It Works
Event triggers fire on DDL events at the database level:
CREATE EVENT TRIGGER my_trigger ON ddl_command_end
WHEN TAG IN ('CREATE TABLE', 'ALTER TABLE', 'DROP TABLE')
EXECUTE FUNCTION my_handler();
Available events:
- `ddl_command_start` — Before DDL execution (PG 9.3+)
- `ddl_command_end` — After DDL execution (PG 9.3+)
- `sql_drop` — When objects are dropped (PG 9.3+)
- `table_rewrite` — When a table is rewritten (PG 9.5+)
Inside the Handler
CREATE FUNCTION my_handler() RETURNS event_trigger AS $$
DECLARE
obj record;
BEGIN
FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands()
LOOP
-- obj.objid, obj.object_type, obj.command_tag, etc.
IF obj.command_tag = 'CREATE TABLE' AND obj.object_type = 'table' THEN
-- Check if this table has a special marker
-- (e.g., a specific reloption or comment)
END IF;
END LOOP;
END;
$$ LANGUAGE plpgsql;
Pattern: CREATE TABLE + Event Trigger
- User creates a table with a special comment or option:
CREATE TABLE order_totals (region text, total numeric);
COMMENT ON TABLE order_totals IS 'pgtrickle:query=SELECT region...;schedule=1m';
- Event trigger on `ddl_command_end` fires
- Handler registers the stream table in the catalog
Limitations
- Cannot modify the DDL — Event triggers observe DDL, they can't change what happened. On `ddl_command_end`, the table already exists.
- Cannot prevent DDL — On `ddl_command_start`, you can raise an error to prevent it, but you can't redirect it.
- Two-step process — User must `CREATE TABLE` and then mark it somehow (comment, option, separate function call).
- No custom syntax — Event triggers watch existing DDL commands.
- pg_trickle already uses this — For DDL tracking on upstream tables (see `hooks.rs`).
Pros/Cons
| Aspect | Assessment |
|---|---|
| Native syntax | No — watches existing DDL only |
| Complexity | Low |
| Can transform DDL | No — observe only |
| PG version | PG 9.3+ |
| Maintenance | Very low |
| pg_trickle usage | Already used for upstream DDL tracking |
10. TimescaleDB Continuous Aggregates Pattern
How It Works
TimescaleDB continuous aggregates (caggs) demonstrate the most sophisticated approach to custom DDL-like syntax in a PostgreSQL extension. Their evolution is instructive.
Phase 1: Pure Function API (early versions)
-- Create a view, then register it
CREATE VIEW daily_temps AS
SELECT time_bucket('1 day', time) AS day, AVG(temp)
FROM conditions GROUP BY 1;
SELECT add_continuous_aggregate_policy('daily_temps', ...);
Phase 2: CREATE MATERIALIZED VIEW WITH (introduced in TimescaleDB 2.0)
CREATE MATERIALIZED VIEW daily_temps
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 day', time) AS day, device_id, AVG(temp)
FROM conditions
GROUP BY 1, 2;
How the Hook Chain Works
TimescaleDB's approach uses layered hooks:
- `ProcessUtility_hook` intercepts `CREATE MATERIALIZED VIEW`
- Checks `reloptions` for `timescaledb.continuous` in the `WithClause`
- If found:
  - Does NOT call standard ProcessUtility for the matview
  - Instead creates a regular hypertable (the materialization)
  - Creates an internal view (the user-facing query interface)
  - Registers refresh policies in the catalog
  - Sets up continuous aggregate metadata
- For `REFRESH MATERIALIZED VIEW`, intercepts and routes to their refresh engine
- For `DROP MATERIALIZED VIEW`, intercepts and cleans up all artifacts
The Magic: Reloptions as Extension Point
PostgreSQL's CREATE MATERIALIZED VIEW ... WITH (option = value) passes options as DefElem nodes in the parse tree. The parser treats these as generic key-value pairs — it does NOT validate the option names. This is the key insight: PostgreSQL's parser accepts arbitrary options in WITH clauses.
// In ProcessUtility_hook:
if (IsA(parsetree, CreateTableAsStmt)) {
CreateTableAsStmt *stmt = (CreateTableAsStmt *) parsetree;
if (stmt->objtype == OBJECT_MATVIEW) {
// Check for our custom option in stmt->into->options
bool is_continuous = false;
ListCell *lc;
foreach(lc, stmt->into->options) {
DefElem *opt = (DefElem *) lfirst(lc);
if (strcmp(opt->defname, "timescaledb.continuous") == 0) {
is_continuous = true;
break;
}
}
if (is_continuous) {
// Handle as continuous aggregate
return;
}
}
}
Refresh Policies
-- Add a refresh policy (function call, not DDL)
SELECT add_continuous_aggregate_policy('daily_temps',
start_offset => INTERVAL '1 month',
end_offset => INTERVAL '1 day',
schedule_interval => INTERVAL '1 hour');
What pg_trickle Could Learn
The TimescaleDB pattern for pg_trickle would look like:
-- Option A: CREATE MATERIALIZED VIEW with custom option
CREATE MATERIALIZED VIEW order_totals
WITH (pgtrickle.stream = true, pgtrickle.schedule = '1m', pgtrickle.mode = 'DIFFERENTIAL')
AS SELECT region, SUM(amount) FROM orders GROUP BY region;
-- Option B: CREATE TABLE with custom option (less natural)
CREATE TABLE order_totals (region text, total numeric)
WITH (pgtrickle.stream = true);
-- Then separately: SELECT pgtrickle.set_query('order_totals', 'SELECT ...');
Pros/Cons
| Aspect | Assessment |
|---|---|
| Native syntax | Good — CREATE MATERIALIZED VIEW ... WITH (pgtrickle.stream) looks natural |
| User experience | Very good — familiar DDL syntax with extension options |
| Complexity | High — must implement full ProcessUtility_hook chain |
| pg_dump | Partial — matview DDL is dumped, but custom metadata needs pg_dump extension or config tables |
| PG version | PG 9.3+ (matviews), PG 12+ (better option handling) |
| Maintenance | Medium — must track changes to matview creation internals |
| Shared preload | Required — ProcessUtility_hook needs shared_preload_libraries |
11. Citus Distributed DDL Pattern
How It Works
Citus (now part of Microsoft) demonstrates another approach to extending DDL behavior:
ProcessUtility_hook Chain
Citus has one of the most comprehensive ProcessUtility_hook implementations:
void multi_ProcessUtility(PlannedStmt *pstmt, ...) {
    // 1. Classify the DDL
    Node *parsetree = pstmt->utilityStmt;

    // 2. Check if it affects distributed tables
    if (IsA(parsetree, AlterTableStmt)) {
        // Propagate ALTER TABLE to all worker nodes
        PropagateAlterTable((AlterTableStmt *) parsetree, queryString);
    }

    // 3. Call standard handler (or skip for intercepted commands)
    if (prev_ProcessUtility)
        prev_ProcessUtility(pstmt, ...);
    else
        standard_ProcessUtility(pstmt, ...);

    // 4. Post-processing
    if (IsA(parsetree, CreateStmt)) {
        // Check if we should auto-distribute this table
    }
}
Table Distribution via Function Calls
Citus does NOT add custom DDL syntax. Distribution is done via function calls:
-- Create a regular table
CREATE TABLE events (id bigint, data jsonb, created_at timestamptz);
-- Distribute it (function call, not DDL)
SELECT create_distributed_table('events', 'id');
-- Or create a reference table
SELECT create_reference_table('lookups');
Columnar Storage via Table AM
Citus also provides columnar storage as a table AM:
CREATE TABLE analytics_data (...)
USING columnar;
This uses the table AM API (PostgreSQL 12+) — see Section 7.
What Citus Teaches Us
- Function calls for complex operations — create_distributed_table() is analogous to pgtrickle.create_stream_table().
- ProcessUtility_hook for DDL propagation — Intercept standard DDL and add behavior.
- Table AM for storage — Separate concern from distribution logic.
- No custom syntax — Even with Microsoft's resources, Citus doesn't fork the parser.
Pros/Cons
| Aspect | Assessment |
|---|---|
| Native syntax | No — uses function calls like pg_trickle |
| Approach validated | Yes — Citus is used at massive scale with this pattern |
| Complexity | Medium (function API) to High (ProcessUtility_hook) |
| User adoption | Proven successful |
| Maintenance | Low for function API |
12. PostgreSQL 18 New Features
Relevant Extension Points in PG 18
PostgreSQL 18 (released 2025) includes several features relevant to this analysis:
12a. Virtual Generated Columns
PG 18 adds GENERATED ALWAYS AS (expr) VIRTUAL columns. Not directly relevant to stream tables, but shows PostgreSQL's willingness to expand CREATE TABLE syntax incrementally.
12b. Improved Table AM API
PG 18 refines the table AM API with better TOAST handling and improved parallel scan support. This makes custom table AMs slightly more practical.
12c. Enhanced Event Trigger Information
PG 18 expands pg_event_trigger_ddl_commands() with additional metadata fields, making event-trigger-based approaches more capable.
12d. pg_stat_io Improvements
Enhanced I/O statistics infrastructure that could benefit monitoring of stream table refresh operations.
12e. No New Parser Extension Points
PostgreSQL 18 does not add any parser extension mechanism. The parser remains monolithic and non-extensible. There have been occasional discussions on pgsql-hackers about parser hooks, but no concrete proposals have been accepted.
12f. No Custom DDL Extension Points
No new general-purpose DDL extension points beyond the existing hook system.
Looking Forward: Discussion on pgsql-hackers
There have been recurring threads on pgsql-hackers about:
- Extension-defined SQL syntax — Rejected due to complexity and parser architecture
- Loadable parser modules — Theoretical discussions, no implementation
- Extension catalogs — Some interest in allowing extensions to register custom catalogs
None of these are implemented in PG 18.
Pros/Cons
| Aspect | Assessment |
|---|---|
| New syntax extension points | None in PG 18 |
| Table AM improvements | Minor — slightly easier to implement |
| Event trigger improvements | Minor — more metadata available |
| Parser extensibility | Not planned for any upcoming PG release |
13. COMMENT / OPTIONS Abuse Pattern
How It Works
Several extensions use table comments or reloptions as a "poor man's metadata" to tag tables with custom semantics.
Pattern 1: COMMENT-based
CREATE TABLE order_totals (region text, total numeric);
COMMENT ON TABLE order_totals IS '@pgtrickle {"query": "SELECT ...", "schedule": "1m"}';
An event trigger or background worker scans pg_description for tables with the @pgtrickle prefix and processes them.
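A minimal version of that catalog scan could look like the following sketch (the @pgtrickle tag format is the hypothetical one from the example above):

```sql
-- Find tables whose comment carries an @pgtrickle configuration payload
SELECT c.relname,
       substr(d.description, length('@pgtrickle ') + 1) AS config_json
FROM pg_description d
JOIN pg_class c ON c.oid = d.objoid
WHERE d.classoid = 'pg_class'::regclass
  AND d.objsubid = 0                      -- the table itself, not a column
  AND d.description LIKE '@pgtrickle %';
```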
Pattern 2: Reloptions-based
CREATE TABLE order_totals (region text, total numeric)
WITH (fillfactor = 70, pgtrickle.stream = true);
Problem: PostgreSQL validates reloptions against a known list. You cannot add arbitrary options to WITH (...) without registering them. Extensions can register custom reloptions via the add_*_reloption() family (add_bool_reloption(), add_string_reloption(), etc. in access/reloptions.h), but this is a relatively obscure API.
Pattern 3: GUC-based Tagging
-- Set a GUC that our ProcessUtility_hook reads
SET pgtrickle.next_create_is_stream = true;
SET pgtrickle.stream_query = 'SELECT region, SUM(amount) FROM orders GROUP BY region';
-- Hook intercepts this CREATE TABLE and registers it
CREATE TABLE order_totals (region text, total numeric);
-- Reset
RESET pgtrickle.next_create_is_stream;
This is extremely hacky but has been used in practice (some partitioning extensions used similar patterns before native partitioning).
Who Uses This?
- pgmemcache — Uses comments to configure caching behavior
- Some row-level security extensions — Comments to define policies
- pg_partman — Uses a configuration table (not comments) but similar concept
Pros/Cons
| Aspect | Assessment |
|---|---|
| Native syntax | No — abuses existing mechanisms |
| User experience | Poor — fragile, easy to break by editing comments |
| Complexity | Low |
| pg_dump | COMMENT is dumped — metadata survives pg_dump/restore |
| Robustness | Low — comments can be accidentally changed |
| PG version | All versions |
14. pg_ivm (Incremental View Maintenance) Pattern
How It Works
pg_ivm is the most directly comparable extension to pg_trickle. It implements incremental view maintenance for PostgreSQL.
API Design
pg_ivm uses a pure function-call API:
-- Create an incrementally maintainable materialized view
SELECT create_immv('order_totals', 'SELECT region, SUM(amount) FROM orders GROUP BY region');
-- Refresh
SELECT refresh_immv('order_totals');
-- Drop
DROP TABLE order_totals; -- Just drop the underlying table
Key function: create_immv(name, query) — Creates an "Incrementally Maintainable Materialized View" (IMMV).
Internal Implementation
- create_immv() is a SQL function (not a hook)
- It parses the query, creates a storage table, and sets up triggers on source tables
- IMMVs are stored as regular tables with metadata in a custom catalog (pg_ivm_immv)
- Triggers on source tables automatically update the IMMV on DML
No ProcessUtility_hook
pg_ivm does not use ProcessUtility_hook. It operates entirely through:
- SQL functions (create_immv, refresh_immv)
- Row-level triggers for automatic maintenance
- A custom catalog table for metadata
Why No Custom Syntax?
pg_ivm was developed as a proof-of-concept for PostgreSQL core IVM support. The authors explicitly chose function-call syntax to:
- Avoid the shared_preload_libraries requirement (hooks need it)
- Keep the extension simple and portable
- Focus on the IVM algorithm, not the user interface
Eventually Merged to Core?
There was discussion about upstreaming IVM to PostgreSQL core. If merged, it would get proper syntax (CREATE INCREMENTAL MATERIALIZED VIEW). As an extension, it stays with function calls.
Relevance to pg_trickle
pg_trickle's current API (pgtrickle.create_stream_table()) follows the exact same pattern as pg_ivm. This is the established approach for IVM extensions.
Pros/Cons
| Aspect | Assessment |
|---|---|
| Native syntax | No — function calls |
| Complexity | Low — simple function API |
| shared_preload_libraries | Not required for basic function API |
| pg_dump | No — function calls are not dumped; must use custom dump/restore |
| User experience | Moderate — familiar to pg_ivm users |
| Community acceptance | Established pattern for IVM extensions |
15. CREATE TABLE ... USING (Table Access Methods) Deep Dive
Full Syntax
CREATE TABLE tablename (
column1 datatype,
column2 datatype,
...
) USING access_method_name
WITH (storage_parameter = value, ...);
How the Parser Handles USING
In gram.y:
CreateStmt: CREATE OptTemp TABLE ...
            OptTableAccessMethod OptWith ...

OptTableAccessMethod:
    USING name      { $$ = $2; }
    | /* empty */   { $$ = NULL; }
    ;
The USING clause sets CreateStmt->accessMethod to the access method name string.
How ProcessUtility Handles It
In DefineRelation() (src/backend/commands/tablecmds.c):
- If accessMethod is specified, look it up in pg_am
- Verify it's a table AM (not an index AM)
- Store the AM OID in pg_class.relam
- Use the AM's callbacks for all subsequent operations
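The resulting catalog state is easy to inspect; for example, to see which access method a given relation ended up with (table name here is illustrative):

```sql
-- Which table access method does a relation use? (pg_class.relam → pg_am)
SELECT c.relname, am.amname
FROM pg_class c
JOIN pg_am am ON am.oid = c.relam
WHERE c.relname = 'order_totals';
```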
Custom Reloptions with Table AMs
Table AMs can define custom reloptions via:
static relopt_parse_elt stream_relopt_tab[] = {
    {"query", RELOPT_TYPE_STRING, offsetof(StreamOptions, query)},
    {"schedule", RELOPT_TYPE_STRING, offsetof(StreamOptions, schedule)},
    {"refresh_mode", RELOPT_TYPE_STRING, offsetof(StreamOptions, refresh_mode)},
};
This would allow:
CREATE TABLE order_totals (region text, total numeric)
USING stream_heap
WITH (query = 'SELECT ...', schedule = '1m', refresh_mode = 'DIFFERENTIAL');
Problems Specific to Stream Tables
- Column derivation — Stream tables derive columns from the query, but the plain CREATE TABLE ... USING form requires explicit column definitions, creating redundancy and potential inconsistency.
- AS SELECT caveat — The grammar does accept CREATE TABLE ... USING method AS SELECT (PG 12+):
  CREATE TABLE order_totals
  USING stream_heap
  AS SELECT region, SUM(amount) FROM orders GROUP BY region;
  but this populates the table exactly once at creation; the AM has no hook to re-run the query, so self-maintenance still requires a ProcessUtility_hook or background worker.
- Full AM implementation required — Even if you delegate to heap, you must implement all callbacks and handle edge cases.
- VACUUM/ANALYZE — Must properly delegate to heap for these to work.
- Replication — Logical replication assumes heap tuples; custom AMs may break.
Hybrid Practical Approach
If pursuing this route:
-- Step 1: Set default AM
SET default_table_access_method = 'stream_heap';
-- Step 2: Create with query in options
CREATE TABLE order_totals ()
WITH (pgtrickle.query = 'SELECT region, SUM(amount) FROM orders GROUP BY region',
pgtrickle.schedule = '1m');
-- ProcessUtility_hook would:
-- 1. Detect USING stream_heap (or detect our custom reloptions)
-- 2. Parse the query from options
-- 3. Derive columns from the query
-- 4. Create the actual table with proper columns using heap AM
-- 5. Register in pgtrickle catalog
-- 6. Set up CDC
Pros/Cons
| Aspect | Assessment |
|---|---|
| Native syntax | Partial — CREATE TABLE ... USING stream_heap WITH (...) |
| Column derivation | Not supported — must specify columns or use hook magic |
| Complexity | Very high |
| pg_dump | Good — CREATE TABLE ... USING is properly dumped |
| PG version | PG 12+ |
| Maintenance | High — AM API changes between versions |
16. Comparison Matrix
| Approach | Native Syntax | Complexity | pg_dump | PG Version | Maintenance | Recommended |
|---|---|---|---|---|---|---|
| Function API (current) | No | Low | No* | Any | Very Low | Yes |
| ProcessUtility_hook + MATVIEW WITH | Good | High | Partial | 9.3+ | Medium | Maybe |
| Raw parser fork | Perfect | Very High | No | Fork only | Very High | No |
| Table AM USING | Partial | Very High | Yes | 12+ | High | No |
| FDW FOREIGN TABLE | Partial | Medium | Yes | 9.1+ | Low | No |
| Event triggers alone | No | Low | No | 9.3+ | Low | No |
| COMMENT abuse | No | Low | Yes | Any | Low | No |
| GUC + CREATE TABLE hack | No | Medium | Partial | Any | Medium | No |
| TimescaleDB pattern (MATVIEW + WITH) | Good | High | Partial | 9.3+ | Medium | Best option |
* Custom pg_dump support can be added via pg_dump hook or wrapper script.
17. Recommendations for pg_trickle
Current Approach: Function API (Keep and Enhance)
pg_trickle's current approach (pgtrickle.create_stream_table('name', 'query', ...)) is:
- Proven — Same pattern as pg_ivm, Citus, and many other extensions
- Simple — No shared_preload_libraries required for basic usage
- Maintainable — No hook chains to debug
- Portable — Works on any PG version that supports pgrx
Enhancement opportunities:
-- Current
SELECT pgtrickle.create_stream_table('order_totals',
'SELECT region, SUM(amount) FROM orders GROUP BY region', '1m');
-- Enhanced: CALL syntax for more DDL-like feel (PG 11+)
CALL pgtrickle.create_stream_table('order_totals',
$$SELECT region, SUM(amount) FROM orders GROUP BY region$$, '1m');
Future Option: TimescaleDB-style Materialized View Integration
If user demand justifies the complexity, pg_trickle could add a second creation path via ProcessUtility_hook:
-- New native-feeling syntax (requires shared_preload_libraries)
CREATE MATERIALIZED VIEW order_totals
WITH (pgtrickle.stream = true, pgtrickle.schedule = '1m')
AS SELECT region, SUM(amount) FROM orders GROUP BY region
WITH NO DATA;
-- Original function API still works (no hook needed)
SELECT pgtrickle.create_stream_table('order_totals',
'SELECT region, SUM(amount) FROM orders GROUP BY region', '1m');
Implementation plan for hook-based approach:
- Register ProcessUtility_hook in _PG_init() (already needed for shared_preload_libraries)
- Intercept CREATE MATERIALIZED VIEW → check for the pgtrickle.stream option
- If found: parse options, call create_stream_table_impl() internally, create a standard storage table instead of a matview
- Intercept DROP MATERIALIZED VIEW → check if the target is a stream table → clean up
- Intercept REFRESH MATERIALIZED VIEW → route to the stream table refresh engine
- Intercept ALTER MATERIALIZED VIEW → route to the stream table alter logic
Estimated complexity: ~800-1200 lines of Rust hook code + tests.
Not Recommended
- Forking PostgreSQL for custom grammar — Maintenance cost is prohibitive
- Table AM approach — Complexity without proportional benefit
- FDW approach — Too many restrictions on foreign tables
- COMMENT abuse — Fragile and poor UX
pg_dump / pg_restore Strategy
Regardless of approach, pg_dump is a challenge. Options:
- Custom dump/restore functions — pgtrickle.dump_config() and pgtrickle.restore_config()
- Migration script generation — pgtrickle.generate_migration() outputs SQL to recreate all stream tables
- Event trigger on restore — Detect when tables are restored and re-register them
- Sidecar file — Generate a companion SQL file alongside pg_dump
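As a sketch of the migration-script option (generate_migration() is a proposed function name from the list above, not part of the current API):

```sql
-- Proposed: emit SQL that recreates every stream table, to be replayed
-- after a plain pg_dump/pg_restore cycle
SELECT pgtrickle.generate_migration();
-- Illustrative output:
--   SELECT pgtrickle.create_stream_table('order_totals',
--     'SELECT region, SUM(amount) FROM orders GROUP BY region', '1m');
```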
Appendix A: Hook Registration in pgrx (Rust)
For reference, here's how ProcessUtility_hook registration works in pgrx:
use pgrx::prelude::*;
use pgrx::pg_sys;

static mut PREV_PROCESS_UTILITY_HOOK: pg_sys::ProcessUtility_hook_type = None;

#[pg_guard]
pub extern "C-unwind" fn my_process_utility(
    pstmt: *mut pg_sys::PlannedStmt,
    query_string: *const std::os::raw::c_char,
    read_only_tree: bool,
    context: pg_sys::ProcessUtilityContext,
    params: pg_sys::ParamListInfo,
    query_env: *mut pg_sys::QueryEnvironment,
    dest: *mut pg_sys::DestReceiver,
    qc: *mut pg_sys::QueryCompletion,
) {
    // SAFETY: pstmt is a valid pointer provided by PostgreSQL
    let stmt = unsafe { (*pstmt).utilityStmt };

    // Check if this is a CreateTableAsStmt (materialized view)
    if unsafe { pgrx::is_a(stmt, pg_sys::NodeTag::T_CreateTableAsStmt) } {
        // Check for our custom options...
    }

    // Chain to previous hook or standard handler
    unsafe {
        if let Some(prev) = PREV_PROCESS_UTILITY_HOOK {
            prev(pstmt, query_string, read_only_tree, context, params,
                 query_env, dest, qc);
        } else {
            pg_sys::standard_ProcessUtility(pstmt, query_string, read_only_tree,
                                            context, params, query_env, dest, qc);
        }
    }
}

pub fn register_hooks() {
    unsafe {
        PREV_PROCESS_UTILITY_HOOK = pg_sys::ProcessUtility_hook;
        pg_sys::ProcessUtility_hook = Some(my_process_utility);
    }
}
Appendix B: Key Source Files in PostgreSQL
| File | Purpose |
|---|---|
| src/backend/parser/gram.y | SQL grammar (~18,000 lines) |
| src/backend/parser/scan.l | Lexer/tokenizer |
| src/include/parser/kwlist.h | Keyword definitions |
| src/backend/tcop/utility.c | ProcessUtility() — DDL dispatcher |
| src/backend/commands/tablecmds.c | CREATE/ALTER/DROP TABLE implementation |
| src/backend/commands/createas.c | CREATE TABLE AS / CREATE MATVIEW AS |
| src/include/access/tableam.h | Table Access Method API |
| src/include/foreign/fdwapi.h | FDW API |
| src/backend/commands/event_trigger.c | Event trigger infrastructure |
Appendix C: References
- PostgreSQL Documentation — Table Access Method Interface
- PostgreSQL Documentation — Event Triggers
- PostgreSQL Documentation — Writing A Foreign Data Wrapper
- TimescaleDB Source — process_utility.c
- Citus Source — multi_utility.c
- pg_ivm Source — createas.c
- pgrx Documentation — Hooks
- PostgreSQL Wiki — CustomScanProviders
pg_trickle vs pg_ivm — Comparison Report & Gap Analysis
Date: 2026-02-28 (merged 2026-03-01, updated 2026-03-20)
Author: Internal research
Status: Reference document
1. Executive Summary
Both pg_trickle and pg_ivm implement Incremental View Maintenance (IVM) as
PostgreSQL extensions — the goal of keeping materialized query results up-to-date
without full recomputation. Despite the shared objective they differ fundamentally
in design philosophy, maintenance model, SQL coverage, operational model, and
target audience.
pg_ivm is a mature, widely-deployed C extension (1.4k GitHub stars, 17 releases)
focused on immediate, synchronous IVM that runs inside the same transaction as
the base-table write. pg_trickle is a Rust extension (v0.9.0) offering
both deferred (scheduled) and immediate (transactional) IVM with a richer SQL
dialect, a dependency DAG, and built-in operational tooling.
pg_trickle is significantly ahead of pg_ivm in SQL coverage, operator support,
aggregate support, and operational features. As of v0.2.1, pg_trickle also
matches pg_ivm's core strength — immediate, in-transaction maintenance — via
the IMMEDIATE refresh mode (all phases complete). pg_ivm's one remaining
structural advantage is broader PostgreSQL version support (PG 13–18):
- IMMEDIATE mode — fully implemented. Statement-level AFTER triggers with transition tables update stream tables within the same transaction as base-table DML. Window functions, LATERAL, scalar subqueries, cascading IMMEDIATE stream tables, WITH RECURSIVE (with a stack-depth warning), and TopK micro-refresh are all supported. See PLAN_TRANSACTIONAL_IVM.md.
- AUTO refresh mode — new default for create_stream_table. Selects DIFFERENTIAL when the query supports it and transparently falls back to FULL otherwise, eliminating the need to choose a mode at creation time.
- pg_ivm compatibility layer — postponed. The pgivm.create_immv() / pgivm.refresh_immv() / pgivm.pg_ivm_immv wrappers (Phase 2) are deferred to post-1.0.
- PLAN_PG_BACKCOMPAT.md details backporting pg_trickle to PG 14–18 (recommended) or PG 16–18 (minimum viable), requiring ~2.5–3 weeks of effort, primarily in #[cfg]-gating ~435 lines of JSON/SQL-standard parse-tree handling.
With IMMEDIATE mode fully implemented, Row Level Security support (v0.5.0), pg_dump/restore support (v0.8.0), algebraic aggregate maintenance (v0.9.0), parallel refresh (v0.4.0), circular pipeline support (v0.7.0), watermark APIs (v0.7.0), and 40+ unique features, pg_ivm's only remaining advantages are PG version breadth and production maturity.
2. Project Overview
| Attribute | pg_ivm | pg_trickle |
|---|---|---|
| Repository | sraoss/pg_ivm | grove/pg-trickle |
| Language | C | Rust (pgrx 0.17) |
| Latest release | 1.13 (2025-10-20) | 0.9.0 (2026-03-20) |
| Stars | ~1,400 | early stage |
| License | PostgreSQL License | Apache 2.0 |
| PG versions | 13 – 18 | 18 only; PG 14–18 planned |
| Schema | pgivm | pgtrickle / pgtrickle_changes |
| Shared library required | Yes (shared_preload_libraries or session_preload_libraries) | Yes (shared_preload_libraries, required for background worker) |
| Background worker | No | Yes (scheduler + optional WAL decoder) |
3. Maintenance Model
This is the most important design difference between the two extensions.
pg_ivm — Immediate Maintenance
pg_ivm updates its views synchronously inside the same transaction that
modified the base table. When a row is inserted/updated/deleted, AFTER row
triggers fire and update the IMMV before the transaction commits.
BEGIN;
UPDATE base_table ...; -- triggers fire here
-- IMMV is updated before COMMIT
COMMIT;
Consequences:
- The IMMV is always exactly consistent with the committed state of the base table — zero staleness.
- Write latency increases by the cost of view maintenance. For large joins or aggregates on popular tables this can be significant.
- Locking: ExclusiveLock is held on the IMMV during maintenance to prevent concurrent anomalies. In REPEATABLE READ or SERIALIZABLE isolation, errors are raised when conflicts are detected.
- TRUNCATE on a base table triggers a full IMMV refresh (for most view types).
- Not compatible with logical replication (subscriber nodes are not updated).
pg_trickle — Deferred, Scheduled Maintenance
pg_trickle updates its stream tables asynchronously, driven by a background worker scheduler. Changes are captured by row-level triggers (or optionally by WAL decoding) into change-buffer tables and are applied in batch on the next refresh cycle.
-- Write path: only a trigger INSERT into change buffer
BEGIN;
UPDATE base_table ...; -- trigger captures delta into pgtrickle_changes.*
COMMIT;
-- Separate refresh cycle (background worker):
apply_delta_to_stream_table(...)
Consequences:
- Write latency is minimized — the trigger write into the change buffer is ~2–50 μs regardless of view complexity.
- Stream tables are stale between refresh cycles. The staleness bound is configurable (e.g. '30s', '5m', '@hourly', or cron expressions).
- Refresh can be triggered manually: pgtrickle.refresh_stream_table(...).
- Multiple stream tables can share a refresh pipeline ordered by dependency (topological DAG scheduling).
- The WAL-based CDC mode (pg_trickle.cdc_mode = 'wal') eliminates trigger overhead entirely when wal_level = logical is available.
- Append-only fast path (v0.5.0): append_only => true skips merge for INSERT-only tables, with auto-fallback if a DELETE/UPDATE is detected.
- Source gating (v0.5.0): pause CDC during bulk loads via gate_source() and ungate_source() to avoid trigger overhead during large batch inserts.
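Taken together, a bulk load might be handled as in this sketch (the gate_source()/ungate_source() call signatures are assumed from the feature descriptions above, and the post-load refresh semantics are an assumption):

```sql
-- Pause CDC triggers on the source table, bulk-load, then resume
SELECT pgtrickle.gate_source('orders');
COPY orders FROM '/data/orders.csv' WITH (FORMAT csv);
SELECT pgtrickle.ungate_source('orders');

-- A manual refresh afterward brings the stream table up to date
SELECT pgtrickle.refresh_stream_table('order_totals');
```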
Implemented: pg_trickle IMMEDIATE Mode
pg_trickle now offers an IMMEDIATE refresh mode (Phase 1 + Phase 3 complete)
that uses statement-level AFTER triggers with transition tables — the same
mechanism pg_ivm uses. Key implementation details:
- Reuses the DVM engine — the Scan operator reads from transition tables (via temporary views) instead of change buffer tables.
- Phase 1 (complete): core IMMEDIATE engine — INSERT/UPDATE/DELETE/TRUNCATE handling, advisory lock-based concurrency (IvmLockMode), mode switching via alter_stream_table, query restriction validation.
- Phase 2 (postponed): pgivm.* compatibility layer for drop-in migration.
- Phase 3 (complete): extended SQL support — window functions, LATERAL, scalar subqueries, cascading IMMEDIATE stream tables, WITH RECURSIVE (IM1: supported with a stack-depth warning), and TopK micro-refresh (IM2: recomputes top-K on every DML, gated by pg_trickle.ivm_topk_max_limit).
- Phase 4 (complete): delta SQL template caching (IVM_DELTA_CACHE); ENR-based transition tables and C-level triggers deferred to post-1.0 as optimizations only.
-- Create an IMMEDIATE stream table (zero staleness)
SELECT pgtrickle.create_stream_table(
'live_totals',
'SELECT region, SUM(amount) AS total FROM orders GROUP BY region',
NULL, -- no schedule needed
'IMMEDIATE'
);
-- Updates propagate within the same transaction
BEGIN;
INSERT INTO orders (region, amount) VALUES ('EU', 100);
SELECT * FROM live_totals; -- already includes the new row
COMMIT;
4. SQL Feature Coverage — Summary
| Dimension | pg_ivm | pg_trickle | Winner |
|---|---|---|---|
| Maintenance timing | Immediate (in-transaction triggers) | Deferred (scheduler/manual) and IMMEDIATE (in-transaction) | pg_trickle (offers both models) |
| PostgreSQL versions | 13–18 | 18 only; PG 14–18 planned | pg_ivm (today); planned parity |
| Aggregate functions | 5 (COUNT, SUM, AVG, MIN, MAX) | 60+ (all built-in aggregates incl. algebraic O(1) for COUNT/SUM/AVG/STDDEV/VAR) | pg_trickle |
| FILTER clause on aggregates | No | Yes | pg_trickle |
| HAVING clause | No | Yes | pg_trickle |
| Inner joins | Yes (including self-join) | Yes (including self-join, NATURAL, nested) | pg_trickle |
| Outer joins | Yes (limited — equijoin, single condition, many restrictions) | Yes (LEFT/RIGHT/FULL, nested, complex conditions) | pg_trickle |
| DISTINCT | Yes (reference-counted) | Yes (reference-counted) | Tie |
| DISTINCT ON | No | Yes (auto-rewritten to ROW_NUMBER) | pg_trickle |
| UNION / INTERSECT / EXCEPT | No | Yes (all 6 variants, bag + set) | pg_trickle |
| Window functions | No | Yes (partition recomputation) | pg_trickle |
| CTEs (non-recursive) | Simple only (no aggregates, no DISTINCT inside) | Full (aggregates, DISTINCT, multi-reference shared delta) | pg_trickle |
| CTEs (recursive) | No | Yes (semi-naive, DRed, recomputation; IMMEDIATE mode with stack-depth warning) | pg_trickle |
| Subqueries in FROM | Simple only (no aggregates/DISTINCT inside) | Full support | pg_trickle |
| EXISTS subqueries | Yes (WHERE only, AND only, no agg/DISTINCT) | Yes (WHERE + targetlist, AND/OR, agg/DISTINCT inside) | pg_trickle |
| NOT EXISTS / NOT IN | No | Yes (anti-join operator) | pg_trickle |
| IN (subquery) | No | Yes (semi-join operator) | pg_trickle |
| Scalar subquery in SELECT | No | Yes (scalar subquery operator) | pg_trickle |
| LATERAL subqueries | No | Yes (row-scoped recomputation) | pg_trickle |
| LATERAL SRFs | No | Yes (jsonb_array_elements, unnest, etc.) | pg_trickle |
| JSON_TABLE (PG 17+) | No | Yes | pg_trickle |
| GROUPING SETS / CUBE / ROLLUP | No | Yes (auto-rewritten to UNION ALL) | pg_trickle |
| Views as sources | No (simple tables only) | Yes (auto-inlined, nested) | pg_trickle |
| Partitioned tables | No | Yes | pg_trickle |
| Foreign tables | No | FULL mode only | pg_trickle |
| Cascading (view-on-view) | No | Yes (DAG-aware scheduling) | pg_trickle |
| Background scheduling | No (user must trigger) | Yes (cron + duration, background worker) | pg_trickle |
| Monitoring / observability | 1 catalog table | Extensive (stats, history, staleness, CDC health, NOTIFY) | pg_trickle |
| CDC mechanism | Triggers only | Hybrid (triggers + optional WAL) | pg_trickle |
| DDL tracking | No automatic handling | Yes (event triggers, auto-reinit) | pg_trickle |
| TRUNCATE handling | Yes (auto-truncate IMMV) | IMMEDIATE mode: full refresh in same txn; DEFERRED: queued full refresh | Tie (functionally equivalent in IMMEDIATE mode) |
| Auto-indexing | Yes (on GROUP BY / DISTINCT / PK columns) | No (user creates indexes) | pg_ivm |
| Row Level Security | Yes (with limitations) | Yes (refreshes see all data; RLS on stream table; IMMEDIATE mode secured) | pg_trickle (richer model) |
| Concurrency model | ExclusiveLock on IMMV during maintenance | Advisory locks, non-blocking reads, parallel refresh | pg_trickle |
| Data type restrictions | Must have btree opclass (no json, xml, point) | No documented type restrictions | pg_trickle |
| Maturity / ecosystem | 4 years, 1.4k stars, PGXN, yum packages | v0.9.0 released, 1,100+ unit tests + 900+ E2E tests, 22 TPC-H benchmarks, dbt integration | pg_ivm |
4.1 Areas Where pg_ivm Wins
Of the ~35 dimensions in the summary table above, pg_ivm holds an advantage in only 3 (down from 6 before IMMEDIATE mode and RLS were implemented). One is substantive, two are temporary gaps with existing plans.
1. PostgreSQL Version Support (substantive, planned resolution)
pg_ivm ships pre-built packages for PostgreSQL 13–18 across all major Linux distros via yum.postgresql.org and PGXN. pg_trickle currently targets PG 18 only.
This is the single largest remaining structural gap. PG 13 is EOL (Nov 2025), but PG 14–17 are widely deployed in production environments. Users on those versions simply cannot use pg_trickle today.
Planned resolution: PLAN_PG_BACKCOMPAT.md
details backporting to PG 14–18 (~2.5–3 weeks). pgrx 0.17 already supports
PG 14–18 via feature flags; ~435 lines in parser.rs need #[cfg] gating
for JSON/SQL-standard parse-tree handling.
2. Auto-Indexing (substantive, low priority)
When pg_ivm creates an IMMV, it automatically adds indexes on columns used in
GROUP BY, DISTINCT, and primary keys. This is a genuine usability advantage
— new users get reasonable read performance without manual intervention.
pg_trickle leaves index creation entirely to the user. For DIFFERENTIAL mode
stream tables, the DVM engine's MERGE-based delta application already uses the
stream table's primary key (which is auto-created), and index-aware MERGE
(pg_trickle.merge_seqscan_threshold, added v0.9.0) uses index lookups for
tiny change ratios, but secondary indexes for read-side query patterns must
be added manually.
Impact: Low — experienced users always create application-specific indexes anyway. Auto-indexing mostly helps onboarding and simple use-cases.
Planned resolution: Tracked as part of the pg_ivm compatibility layer
(Phase 2, postponed to post-1.0). Could also be implemented independently as
a CREATE INDEX IF NOT EXISTS step in create_stream_table.
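Until then, the equivalent of pg_ivm's auto-indexing is a one-liner the user runs after creation (the index name and column choice here are illustrative, for the running order_totals example):

```sql
-- Secondary index on the GROUP BY key for read-side queries
CREATE INDEX IF NOT EXISTS order_totals_region_idx
    ON order_totals (region);
```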
3. Maturity / Ecosystem (temporary, closing over time)
pg_ivm has 4 years of production use, ~1,400 GitHub stars, 17 releases, and is distributed via PGXN, yum, and apt package repositories. It has a track record of stability and a community of users.
pg_trickle is a v0.9.0 series release with 1,100+ unit tests, 200+ integration tests, 570+ light E2E tests, 90+ full E2E tests, and 22 TPC-H correctness benchmarks—but no wide production deployments yet. It lacks the battle-testing that comes from years of real-world usage.
Impact: High for risk-averse organizations considering production adoption. Low for greenfield projects or teams willing to adopt early.
Resolution: This gap closes naturally with time, releases, and adoption.
The dbt integration (dbt-pgtrickle) and CNPG/Kubernetes deployment support
accelerate ecosystem development.
5. Detailed SQL Comparison
5.1 Aggregate Functions
| Function | pg_ivm | pg_trickle |
|---|---|---|
| COUNT(*) / COUNT(expr) | ✅ Algebraic | ✅ Algebraic (O(1) running total, v0.9.0) |
| SUM | ✅ Algebraic | ✅ Algebraic (O(1) running total, v0.9.0) |
| AVG | ✅ Algebraic (via SUM/COUNT) | ✅ Algebraic (O(1) via SUM/COUNT decomposition, v0.9.0) |
| MIN | ✅ Semi-algebraic (rescan on extremum delete) | ✅ Semi-algebraic (O(1) unless extremum deleted, v0.9.0 safety guard) |
| MAX | ✅ Semi-algebraic (rescan on extremum delete) | ✅ Semi-algebraic (O(1) unless extremum deleted, v0.9.0 safety guard) |
| BOOL_AND / BOOL_OR | ❌ | ✅ Group-rescan |
| STRING_AGG | ❌ | ✅ Group-rescan |
| ARRAY_AGG | ❌ | ✅ Group-rescan |
| JSON_AGG / JSONB_AGG | ❌ | ✅ Group-rescan |
| BIT_AND / BIT_OR / BIT_XOR | ❌ | ✅ Group-rescan |
| JSON_OBJECT_AGG / JSONB_OBJECT_AGG | ❌ | ✅ Group-rescan |
| STDDEV / VARIANCE (all variants) | ❌ | ✅ Algebraic (O(1) sum-of-squares decomposition, v0.9.0) |
| MODE / PERCENTILE_CONT / PERCENTILE_DISC | ❌ | ✅ Group-rescan |
| CORR / COVAR / REGR_* (11 functions) | ❌ | ✅ Group-rescan |
| ANY_VALUE (PG 16+) | ❌ | ✅ Group-rescan |
| JSON_ARRAYAGG / JSON_OBJECTAGG (PG 16+) | ❌ | ✅ Group-rescan |
| User-defined aggregates (CREATE AGGREGATE) | ❌ | ✅ Group-rescan |
| FILTER (WHERE) clause | ❌ | ✅ |
| WITHIN GROUP (ORDER BY) | ❌ | ✅ |
| COUNT(DISTINCT expr) / SUM(DISTINCT expr) | ❌ | ✅ |
| Total | 5 | 60+ |
Gap for pg_ivm: Massive. Only 5 of ~60 built-in aggregate functions are supported.
pg_trickle v0.9.0 also introduced algebraic (O(1)) maintenance for COUNT,
SUM, AVG, STDDEV, and VARIANCE — meaning these aggregates update in constant
time per changed row via running totals, whereas pg_ivm’s algebraic support
is limited to COUNT, SUM, and AVG. pg_trickle additionally supports user-defined
aggregates via group-rescan, and applies floating-point drift correction to
algebraic aggregates (pg_trickle.algebraic_drift_reset_cycles).
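As a sketch of the difference between the two maintenance strategies (the `readings` table and its columns are illustrative, not from either project's docs):

```sql
-- VARIANCE/STDDEV are maintained algebraically via the sum-of-squares
-- decomposition: var = (sum_x2 - sum_x^2 / n) / (n - 1),
-- so each changed row adjusts the running totals n, sum_x, sum_x2 in O(1).
SELECT pgtrickle.create_stream_table(
  'sensor_stats',
  'SELECT sensor_id,
          COUNT(*)        AS n,
          AVG(reading)    AS mean,
          STDDEV(reading) AS spread
   FROM readings
   GROUP BY sensor_id',
  schedule => '30s'
);

-- STRING_AGG has no algebraic form, so maintenance falls back to
-- group-rescan: only the groups that actually changed are recomputed.
SELECT pgtrickle.create_stream_table(
  'sensor_labels',
  'SELECT sensor_id, STRING_AGG(label, '','' ORDER BY label) AS labels
   FROM readings GROUP BY sensor_id'
);
```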
5.2 Joins
| Feature | pg_ivm | pg_trickle |
|---|---|---|
| Inner join | ✅ | ✅ |
| Self-join | ✅ | ✅ |
| LEFT JOIN | ✅ (restricted) | ✅ (full) |
| RIGHT JOIN | ✅ (restricted) | ✅ (normalized to LEFT) |
| FULL OUTER JOIN | ✅ (restricted) | ✅ (8-part delta) |
| NATURAL JOIN | ? | ✅ |
| Cross join | ? | ✅ |
| Nested joins (3+ tables) | ✅ | ✅ |
| Non-equi joins (theta) | ? | ✅ |
| Outer join + aggregates | ❌ | ✅ |
| Outer join + subqueries | ❌ | ✅ |
| Outer join + CASE/non-strict | ❌ | ✅ |
| Outer join multi-condition | ❌ (single equality only) | ✅ |
Gap for pg_ivm: Outer joins are heavily restricted — single equijoin condition, no aggregates, no subqueries, no CASE expressions, no IS NULL in WHERE.
5.3 Subqueries
| Feature | pg_ivm | pg_trickle |
|---|---|---|
| Simple subquery in FROM | ✅ (no aggregates/DISTINCT inside) | ✅ (full support) |
| EXISTS in WHERE | ✅ (AND only, no agg/DISTINCT inside) | ✅ (AND + OR, full SQL inside) |
| NOT EXISTS in WHERE | ❌ | ✅ (anti-join operator) |
| IN (subquery) | ❌ | ✅ (rewritten to semi-join) |
| NOT IN (subquery) | ❌ | ✅ (rewritten to anti-join) |
| ALL (subquery) | ❌ | ✅ (rewritten to anti-join) |
| Scalar subquery in SELECT | ❌ | ✅ (scalar subquery operator) |
| Scalar subquery in WHERE | ❌ | ✅ (auto-rewritten to CROSS JOIN) |
| LATERAL subquery in FROM | ❌ | ✅ (row-scoped recomputation) |
| LATERAL SRF in FROM | ❌ | ✅ (jsonb_array_elements, unnest, etc.) |
| Subqueries in OR | ❌ | ✅ (auto-rewritten to UNION) |
Gap for pg_ivm: Severely limited subquery support. No anti-joins, no scalar subqueries, no LATERAL, no SRFs.
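Two of these rewrites, sketched with an assumed `customers`/`orders` schema (names are ours, not from the docs):

```sql
-- NOT EXISTS is maintained incrementally as an anti-join:
SELECT pgtrickle.create_stream_table(
  'customers_without_orders',
  'SELECT c.* FROM customers c
   WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)'
);

-- A scalar subquery in WHERE is auto-rewritten internally to a CROSS JOIN
-- against the (incrementally maintained) scalar value:
SELECT pgtrickle.create_stream_table(
  'above_average_orders',
  'SELECT * FROM orders
   WHERE amount > (SELECT AVG(amount) FROM orders)'
);
```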
5.4 CTEs
| Feature | pg_ivm | pg_trickle |
|---|---|---|
| Simple non-recursive CTE | ✅ (no aggregates/DISTINCT inside) | ✅ (full SQL inside) |
| Multi-reference CTE | ? | ✅ (shared delta optimization) |
| Chained CTEs | ? | ✅ |
| WITH RECURSIVE | ❌ | ✅ (semi-naive, DRed, recomputation; IMMEDIATE mode with stack-depth warning) |
Gap for pg_ivm: No recursive CTEs, no aggregates/DISTINCT inside CTEs.
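A recursive stream table looks like any other; under semi-naive evaluation, an insert propagates only the newly derivable rows rather than re-running the whole fixpoint. Illustrative sketch assuming an `employees(id, manager_id, name)` table:

```sql
SELECT pgtrickle.create_stream_table(
  'org_chart',
  'WITH RECURSIVE chain AS (
     SELECT id, name, 0 AS depth FROM employees WHERE manager_id IS NULL
     UNION ALL
     SELECT e.id, e.name, c.depth + 1
     FROM employees e JOIN chain c ON e.manager_id = c.id
   )
   SELECT * FROM chain'
);
```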
5.5 Set Operations
| Feature | pg_ivm | pg_trickle |
|---|---|---|
| UNION ALL | ❌ | ✅ |
| UNION (set) | ❌ | ✅ (via DISTINCT + UNION ALL) |
| INTERSECT | ❌ | ✅ (dual-count multiplicity) |
| INTERSECT ALL | ❌ | ✅ |
| EXCEPT | ❌ | ✅ (dual-count multiplicity) |
| EXCEPT ALL | ❌ | ✅ |
Gap for pg_ivm: No set operations at all.
5.6 Window Functions
| Feature | pg_ivm | pg_trickle |
|---|---|---|
| ROW_NUMBER, RANK, DENSE_RANK | ❌ | ✅ |
| SUM/AVG/COUNT OVER () | ❌ | ✅ |
| Frame clauses (ROWS/RANGE/GROUPS) | ❌ | ✅ |
| Named WINDOW clauses | ❌ | ✅ |
| PARTITION BY recomputation | ❌ | ✅ |
Gap for pg_ivm: Window functions are completely unsupported.
5.7 DISTINCT & Grouping
| Feature | pg_ivm | pg_trickle |
|---|---|---|
| SELECT DISTINCT | ✅ | ✅ |
| DISTINCT ON (expr, ...) | ❌ | ✅ (auto-rewritten to ROW_NUMBER) |
| GROUP BY | ✅ | ✅ |
| GROUPING SETS | ❌ | ✅ (auto-rewritten to UNION ALL) |
| CUBE | ❌ | ✅ (auto-rewritten via GROUPING SETS) |
| ROLLUP | ❌ | ✅ (auto-rewritten via GROUPING SETS) |
| GROUPING() function | ❌ | ✅ |
| HAVING | ❌ | ✅ |
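The DISTINCT ON rewrite can be pictured as follows — conceptually (the exact generated SQL may differ), a query like `SELECT DISTINCT ON (customer_id) * FROM orders ORDER BY customer_id, created_at DESC` becomes a maintainable ROW_NUMBER form:

```sql
SELECT * FROM (
  SELECT o.*,
         ROW_NUMBER() OVER (PARTITION BY customer_id
                            ORDER BY created_at DESC) AS rn
  FROM orders o
) t
WHERE rn = 1;  -- keep the first row per customer_id
```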
5.8 Source Table Types
| Source type | pg_ivm | pg_trickle |
|---|---|---|
| Simple heap tables | ✅ | ✅ |
| Views | ❌ | ✅ (auto-inlined) |
| Materialized views | ❌ | FULL mode only |
| Partitioned tables | ❌ | ✅ |
| Partitions | ❌ | ✅ (via parent) |
| Foreign tables | ❌ | FULL mode only |
| Other IMMVs / stream tables | ❌ | ✅ (DAG cascading) |
Gap for pg_ivm: Only simple heap tables. No views, no partitioned tables, no cascading.
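DAG cascading in practice — a stream table defined over another stream table, so deltas flow through both layers (table names illustrative):

```sql
SELECT pgtrickle.create_stream_table(
  'daily_totals',
  'SELECT region, day, SUM(amount) AS total
   FROM orders GROUP BY region, day'
);

-- The downstream ST reads the upstream ST; on each cycle the scheduler
-- refreshes daily_totals first, then propagates its delta here.
SELECT pgtrickle.create_stream_table(
  'region_totals',
  'SELECT region, SUM(total) AS total FROM daily_totals GROUP BY region'
);
```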
6. API Comparison
pg_ivm API
-- Create an IMMV
SELECT pgivm.create_immv('myview', 'SELECT * FROM mytab');
-- Full refresh (emergency)
SELECT pgivm.refresh_immv('myview', true); -- with data
SELECT pgivm.refresh_immv('myview', false); -- disable maintenance
-- Inspect
SELECT immvrelid, pgivm.get_immv_def(immvrelid)
FROM pgivm.pg_ivm_immv;
-- Drop
DROP TABLE myview;
-- Rename
ALTER TABLE myview RENAME TO myview2;
pg_ivm IMMVs are standard PostgreSQL tables. They can be dropped with
DROP TABLE and renamed with ALTER TABLE.
pg_trickle API
-- Create a stream table (AUTO mode: DIFFERENTIAL when possible, FULL fallback)
SELECT pgtrickle.create_stream_table(
'order_totals',
'SELECT region, SUM(amount) AS total FROM orders GROUP BY region'
-- refresh_mode defaults to 'AUTO', schedule defaults to 'calculated'
);
-- Create a stream table (explicit deferred, scheduled)
SELECT pgtrickle.create_stream_table(
'order_totals',
'SELECT region, SUM(amount) AS total FROM orders GROUP BY region',
schedule => '2m',
refresh_mode => 'DIFFERENTIAL'
);
-- Create a stream table (immediate, in-transaction)
SELECT pgtrickle.create_stream_table(
'live_totals',
'SELECT region, SUM(amount) AS total FROM orders GROUP BY region',
schedule => NULL,
refresh_mode => 'IMMEDIATE'
);
-- Manual refresh
SELECT pgtrickle.refresh_stream_table('order_totals');
-- Alter schedule, mode, or defining query
SELECT pgtrickle.alter_stream_table('order_totals', schedule => '5m');
SELECT pgtrickle.alter_stream_table(
'order_totals',
query => 'SELECT region, SUM(amount) AS total FROM orders WHERE active GROUP BY region'
);
-- Drop
SELECT pgtrickle.drop_stream_table('order_totals');
-- Status and monitoring
SELECT * FROM pgtrickle.pgt_status();
SELECT * FROM pgtrickle.pg_stat_stream_tables;
SELECT * FROM pgtrickle.pgt_stream_tables;
-- DAG inspection
SELECT * FROM pgtrickle.pgt_dependencies;
-- Extended observability (added v0.2.0+)
SELECT * FROM pgtrickle.change_buffer_sizes(); -- CDC buffer health
SELECT * FROM pgtrickle.list_sources('order_totals'); -- source table stats
SELECT * FROM pgtrickle.dependency_tree(); -- ASCII DAG view
SELECT * FROM pgtrickle.health_check(); -- OK/WARN/ERROR triage
SELECT * FROM pgtrickle.refresh_timeline(); -- cross-stream history
SELECT * FROM pgtrickle.trigger_inventory(); -- CDC trigger audit
SELECT * FROM pgtrickle.diamond_groups(); -- diamond consistency groups
-- Source gating (v0.5.0)
SELECT pgtrickle.gate_source('orders'); -- pause CDC
SELECT pgtrickle.ungate_source('orders'); -- resume CDC
SELECT * FROM pgtrickle.source_gates(); -- gate status
-- Watermarks (v0.7.0)
SELECT pgtrickle.advance_watermark('orders', '2026-03-20 12:00:00');
SELECT pgtrickle.create_watermark_group('sync', ARRAY['orders','products'], 30);
SELECT * FROM pgtrickle.watermarks();
SELECT * FROM pgtrickle.watermark_status();
-- Parallel refresh monitoring (v0.4.0)
SELECT * FROM pgtrickle.worker_pool_status();
SELECT * FROM pgtrickle.parallel_job_status();
-- Refresh groups (v0.9.0)
SELECT pgtrickle.create_refresh_group('my_group', ARRAY['st1','st2']);
SELECT pgtrickle.drop_refresh_group('my_group');
-- Idempotent DDL (v0.6.0)
SELECT pgtrickle.create_or_replace_stream_table(
'order_totals',
'SELECT region, SUM(amount) AS total FROM orders GROUP BY region'
);
pg_trickle stream tables are regular PostgreSQL tables but managed through the
pgtrickle schema's API functions. They cannot be renamed with ALTER TABLE
(use alter_stream_table).
7. Scheduling and Dependency Management
| Capability | pg_ivm | pg_trickle |
|---|---|---|
| Automatic scheduling | ❌ (immediate only, no scheduler) | ✅ background worker |
| Manual refresh | ✅ refresh_immv() | ✅ refresh_stream_table() |
| Cron schedules | ❌ | ✅ (standard 5/6-field cron + aliases) |
| Duration-based staleness bounds | ❌ | ✅ ('30s', '5m', '1h', …) |
| Dependency DAG | ❌ | ✅ (stream tables can reference other stream tables) |
| Topological refresh ordering | ❌ | ✅ (upstream refreshes before downstream) |
| CALCULATED schedule propagation | ❌ | ✅ (consumers drive upstream schedules) |
| Parallel refresh | ❌ | ✅ (worker pool with database + cluster caps, v0.4.0) |
| Circular pipeline support | ❌ | ✅ (monotone cycles with fixed-point iteration, v0.7.0) |
| Watermark coordination | ❌ | ✅ (multi-source readiness gates, v0.7.0) |
| Refresh group management | ❌ | ✅ (atomic multi-ST refresh, v0.9.0) |
pg_trickle's DAG scheduling is a significant differentiator: you can build multi-layer pipelines where each downstream stream table is automatically refreshed after its upstream dependencies.
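For example, using the `order_totals` stream table from §6 and assuming any upstream stream tables are left at the default `schedule => 'calculated'` (the cron string below is ours):

```sql
-- Give the table the app actually queries a tight freshness bound; upstream
-- stream tables on 'calculated' inherit the tightest consumer schedule.
SELECT pgtrickle.alter_stream_table('order_totals', schedule => '1m');

-- Standard 5/6-field cron schedules are also accepted, e.g. daily at 02:30:
SELECT pgtrickle.alter_stream_table('order_totals', schedule => '30 2 * * *');
```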
8. Change Data Capture
| Attribute | pg_ivm | pg_trickle |
|---|---|---|
| Mechanism | AFTER row triggers (inline, same txn) | AFTER row/statement triggers → change buffer |
| WAL-based CDC | ❌ | ✅ optional (pg_trickle.cdc_mode = 'wal') |
| Statement-level triggers | ❌ | ✅ (v0.4.0, reduced overhead for bulk operations) |
| Logical replication slots | Not used | Used in WAL mode only |
| Write-side overhead | Higher (view maintenance in txn) | Lower (small trigger insert only) |
| Change buffer tables | None (applied immediately) | pgtrickle_changes.changes_<oid> |
| TRUNCATE handling | IMMV truncated/refreshed synchronously | Change buffer cleared; full refresh queued |
9. Concurrency and Isolation
pg_ivm
- Holds `ExclusiveLock` on the IMMV during incremental update.
- In `READ COMMITTED`: serializes concurrent updates to the same IMMV.
- In `REPEATABLE READ`/`SERIALIZABLE`: raises an error when a concurrent transaction has already updated the IMMV.
- Single-table INSERT-only IMMVs use the lighter `RowExclusiveLock`.
pg_trickle
- Refresh operations acquire an advisory lock per stream table so only one refresh can run at a time.
- Base table writes are never blocked by refresh operations.
- Parallel refresh (v0.4.0): `pg_trickle.parallel_refresh_mode = 'on'` enables a worker pool with per-database (`max_concurrent_refreshes`, default 4) and cluster-wide (`max_dynamic_refresh_workers`) caps.
- Atomic refresh groups for diamond dependencies.
- Crash recovery: in-flight refreshes are marked failed on restart; the scheduler retries on the next cycle.
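The single-refresh guarantee can be sketched with PostgreSQL's built-in advisory locks; the lock-key derivation here is ours, and pg_trickle's actual scheme may differ:

```sql
-- Transaction-scoped advisory lock keyed on the stream table name.
SELECT pg_try_advisory_xact_lock(hashtext('pgtrickle:order_totals')::bigint)
  AS got_lock;
-- If got_lock is false, another backend is already refreshing this stream
-- table and the scheduler skips it this cycle. Base-table writers never
-- take this lock, so their transactions are unaffected.
```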
10. Observability
| Feature | pg_ivm | pg_trickle |
|---|---|---|
| Catalog of managed views | pgivm.pg_ivm_immv | pgtrickle.pgt_stream_tables |
| Per-refresh timing/history | ❌ | ✅ pgtrickle.pgt_refresh_history |
| Staleness reporting | ❌ | ✅ stale column + get_staleness() |
| Scheduler status | ❌ | ✅ pgtrickle.pgt_status() |
| NOTIFY-based alerting | ❌ | ✅ pgtrickle_refresh channel (10+ alert types) |
| Error tracking | ❌ | ✅ consecutive error counter, last error message |
| dbt integration | ❌ | ✅ dbt-pgtrickle macro package |
| Explain/introspection | ❌ | ✅ explain_st |
| CDC buffer health | ❌ | ✅ pgtrickle.change_buffer_sizes() (v0.2.0) |
| Source table stats | ❌ | ✅ pgtrickle.list_sources() (v0.2.0) |
| Dependency tree view | ❌ | ✅ pgtrickle.dependency_tree() (v0.2.0) |
| Health triage | ❌ | ✅ pgtrickle.health_check() (v0.2.0) |
| Cross-stream refresh history | ❌ | ✅ pgtrickle.refresh_timeline() (v0.2.0) |
| CDC trigger audit | ❌ | ✅ pgtrickle.trigger_inventory() (v0.2.0) |
| Diamond group inspection | ❌ | ✅ pgtrickle.diamond_groups() (v0.2.0) |
| Quick health summary | ❌ | ✅ pgtrickle.quick_health view (v0.5.0) |
| Source gating status | ❌ | ✅ pgtrickle.source_gates() (v0.5.0) |
| Watermark monitoring | ❌ | ✅ pgtrickle.watermarks() / watermark_status() (v0.7.0) |
| Parallel worker status | ❌ | ✅ pgtrickle.worker_pool_status() / parallel_job_status() (v0.4.0) |
| SCC cycle status | ❌ | ✅ pgtrickle.pgt_scc_status() (v0.7.0) |
| Replication slot health | ❌ | ✅ pgtrickle.slot_health() |
| CDC mode per-source | ❌ | ✅ pgtrickle.pgt_cdc_status view |
11. Installation and Deployment
| Attribute | pg_ivm | pg_trickle |
|---|---|---|
| Pre-built packages | RPM via yum.postgresql.org | OCI image, tarball |
| CNPG / Kubernetes | ❌ (no OCI image) | ✅ OCI extension image + CNPG smoke tests |
| Docker local dev | Manual | ✅ documented + Docker Hub image |
| shared_preload_libraries | Required (or session_preload_libraries) | Required |
| Extension upgrade scripts | ✅ (1.0 → 1.1 → … → 1.13) | ✅ (0.1.3 → … → 0.9.0, CI completeness check, upgrade E2E tests) |
| pg_dump / restore | Manual IMMV recreation required | ✅ Standard pg_dump supported (v0.8.0) |
12. Performance Characteristics
pg_ivm
- Write path: slower — every DML statement triggers inline view maintenance. From the README example: a single row update on a 10M-row join IMMV takes ~15 ms vs ~9 ms for a plain table update.
- Read path: instant — IMMV is always current, no refresh needed on read.
- Refresh (full): comparable to `REFRESH MATERIALIZED VIEW` (~20 seconds for a 10M-row join in the example).
pg_trickle
- Write path: minimal overhead — only a small trigger INSERT into the change buffer (~2–50 μs per row). In WAL mode, zero trigger overhead. Statement-level CDC triggers (v0.4.0) further reduce overhead for bulk ops.
- Read path: instant from the materialized table (potentially stale).
- Refresh (differential): proportional to the number of changed rows, not the total table size. A single-row change on a million-row aggregate touches one row's worth of computation. Algebraic aggregates (v0.9.0) like COUNT/SUM/AVG/STDDEV/VAR update in O(1) constant time per changed row.
- Refresh (full): re-runs the entire query; comparable to `REFRESH MATERIALIZED VIEW`.
- Parallel refresh (v0.4.0): linear speedup with worker pool size.
- I/O optimizations (v0.9.0): column skipping, source skipping in joins, WHERE filter push-down, index-aware MERGE for tiny change ratios, scalar subquery short-circuit.
13. Known Limitations
pg_ivm Limitations
- Adds latency to every write on tracked base tables.
- Cannot track tables modified via logical replication (subscriber nodes are not updated).
- `pg_dump`/`pg_upgrade` require manual recreation of all IMMVs.
- Limited aggregate support (no user-defined aggregates, no window functions).
- Column type restrictions (btree operator class required in target list).
- No scheduler or background worker — refresh is immediate only.
- On high-churn tables, `min`/`max` aggregates can trigger expensive rescans.
pg_trickle Limitations
- In DIFFERENTIAL/FULL mode, data is stale between refresh cycles. Use IMMEDIATE mode for zero-staleness, in-transaction consistency.
- Recursive CTEs in IMMEDIATE mode emit a stack-depth warning; very deep recursion may hit PostgreSQL's stack limit.
- Recursive CTEs in DIFFERENTIAL mode fall back to full recomputation for mixed DELETE/UPDATE changes (DRed scheduled for v0.10.0+).
- `LIMIT` without `ORDER BY` is not supported in defining queries. `OFFSET` without `ORDER BY … LIMIT` is not supported.
- Paged TopK (`ORDER BY … LIMIT N OFFSET M`) is fully supported; `ORDER BY` + `LIMIT` (TopK) without OFFSET uses scoped recomputation (MERGE).
- Volatile SQL functions are rejected in DIFFERENTIAL mode.
- Materialized views as sources not supported in DIFFERENTIAL mode.
- Window functions in expressions (e.g. `CASE WHEN ROW_NUMBER() OVER (...) > 5`) require FULL mode.
- Foreign tables as sources require FULL mode.
- `ALTER EXTENSION pg_trickle UPDATE` migration scripts ship from v0.2.1; continuous upgrade path through v0.9.0.
- Targets PostgreSQL 18 only; no backport to PG 13–17 yet (PG 14–18 planned).
- v0.9.x series — extensive testing but not yet production-hardened at scale.
14. PostgreSQL Version Support
| Version | pg_ivm | pg_trickle (current) | pg_trickle (planned) |
|---|---|---|---|
| PG 13 | ✅ | ❌ | ❌ (EOL Nov 2025) |
| PG 14 | ✅ | ❌ | ✅ (full plan) |
| PG 15 | ✅ | ❌ | ✅ (full plan) |
| PG 16 | ✅ | ❌ | ✅ (MVP target) |
| PG 17 | ✅ | ❌ | ✅ (MVP target) |
| PG 18 | ✅ | ✅ | ✅ |
Planned resolution: PLAN_PG_BACKCOMPAT.md:
- Minimum viable (PG 16–18): ~1.5 weeks effort.
- Full target (PG 14–18): ~2.5–3 weeks effort.
- pgrx 0.17.0 already supports PG 14–18 via feature flags.
- ~435 lines in `src/dvm/parser.rs` need `#[cfg]` gating (all in JSON/SQL-standard sections); the remaining ~13,500 lines compile unchanged.
Feature degradation matrix:
| Feature | PG 14 | PG 15 | PG 16 | PG 17 | PG 18 |
|---|---|---|---|---|---|
| Core streaming tables | ✅ | ✅ | ✅ | ✅ | ✅ |
| Trigger-based CDC | ✅ | ✅ | ✅ | ✅ | ✅ |
| Differential refresh | ✅ | ✅ | ✅ | ✅ | ✅ |
| SQL/JSON constructors | — | — | ✅ | ✅ | ✅ |
| JSON_TABLE | — | — | — | ✅ | ✅ |
| WAL-based CDC | Needs test | Needs test | Likely | Likely | ✅ |
15. Features Unique to Each System
Features Unique to pg_trickle (42 items, no pg_ivm equivalent)
- IMMEDIATE + deferred modes (pg_ivm is immediate-only; pg_trickle offers both)
- 60+ aggregate functions (vs 5), including algebraic O(1) for COUNT/SUM/AVG/STDDEV/VAR
- FILTER / HAVING / WITHIN GROUP on aggregates
- Window functions (partition recomputation)
- Set operations (UNION ALL, UNION, INTERSECT, EXCEPT — all 6 variants)
- Recursive CTEs (semi-naive, DRed, recomputation; including IMMEDIATE mode with stack-depth warning)
- LATERAL subqueries and SRFs (jsonb_array_elements, unnest, JSON_TABLE)
- Anti-join / semi-join operators (NOT EXISTS, NOT IN, IN, EXISTS with full SQL)
- Scalar subqueries in SELECT list
- Views as sources (auto-inlined with nested expansion)
- Partitioned table support (RANGE, LIST, HASH with auto-rebuild on ATTACH PARTITION)
- Cascading stream tables (ST referencing other STs via DAG)
- Background scheduler (cron + duration + canonical periods) with multi-database auto-discovery
- GROUPING SETS / CUBE / ROLLUP (auto-rewritten)
- DISTINCT ON (auto-rewritten to ROW_NUMBER)
- Hybrid CDC (trigger → WAL transition)
- DDL change detection and automatic reinitialization (including ALTER FUNCTION body changes)
- Monitoring suite (15+ observability functions: `change_buffer_sizes`, `list_sources`, `dependency_tree`, `health_check`, `refresh_timeline`, `trigger_inventory`, `diamond_groups`, `source_gates`, `watermarks`, `watermark_groups`, `watermark_status`, `worker_pool_status`, `parallel_job_status`, `pgt_scc_status`, `slot_health`, `check_cdc_health`)
- Auto-rewrite pipeline (6 transparent SQL rewrites)
- Volatile function detection
- AUTO refresh mode (smart DIFFERENTIAL/FULL selection with transparent fallback)
- ALTER QUERY — change the defining query of an existing stream table online, with schema-change classification and OID-preserving migration
- dbt macro package (materialization, status macro, health test, refresh operation)
- CNPG / Kubernetes deployment
- SQL/JSON constructors (JSON_OBJECT, JSON_ARRAY, etc.)
- JSON_TABLE support (PG 17+)
- TopK stream tables (ORDER BY + LIMIT, including IMMEDIATE mode via micro-refresh)
- Paged TopK (ORDER BY + LIMIT + OFFSET for server-side pagination)
- Diamond dependency consistency (multi-path refresh atomicity with SAVEPOINT)
- Extension upgrade infrastructure (SQL migration scripts, CI completeness check, upgrade E2E tests, per-release SQL baselines)
- Row Level Security (refreshes see all data; RLS policies on ST itself; IMMEDIATE mode secured; internal change buffers shielded from RLS interference) (v0.5.0)
- Source gating (pause/resume CDC for bulk loads: `gate_source`, `ungate_source`) (v0.5.0)
- Append-only fast path (`append_only => true` skips merge for INSERT-only tables) (v0.5.0)
- Parallel refresh (background worker pool with per-database and cluster-wide caps, atomic groups for diamond dependencies) (v0.4.0)
- Statement-level CDC triggers (reduced write-side overhead for bulk operations) (v0.4.0)
- Circular pipeline support (monotone cycles with fixed-point iteration, `max_fixpoint_iterations` safety limit, SCC status monitoring) (v0.7.0)
- Watermark APIs (delay refresh until multi-source data is ready: `advance_watermark`, `create_watermark_group`, tolerance-based readiness) (v0.7.0)
- pg_dump / pg_restore support (safe backup with auto-reconnect of streams) (v0.8.0)
- Algebraic aggregate maintenance (O(1) constant-time updates for COUNT/SUM/AVG/STDDEV/VAR with floating-point drift correction) (v0.9.0)
- Refresh group management (`create_refresh_group`, `drop_refresh_group` for atomic multi-ST refresh) (v0.9.0)
- Automatic backoff (exponential slowdown for overloaded streams) (v0.9.0)
- Index-aware MERGE (use index lookups for tiny change ratios) (v0.9.0)
Features Unique to pg_ivm (with planned resolutions)
| # | Feature | Status | Ref |
|---|---|---|---|
| 1 | Immediate (synchronous) maintenance | ✅ Closed — IMMEDIATE refresh mode fully implemented (all phases) | PLAN_TRANSACTIONAL_IVM |
| 2 | Auto-index creation on GROUP BY / DISTINCT / PK | Postponed (Phase 2 of transactional IVM) | PLAN_TRANSACTIONAL_IVM §5.2 |
| 3 | TRUNCATE propagation (auto-truncate IMMV) | ✅ Closed — IMMEDIATE mode fires full refresh on TRUNCATE | PLAN_TRANSACTIONAL_IVM §3.2 |
| 4 | Row Level Security respect | ✅ Closed — v0.5.0: refreshes see all data; RLS on ST itself; IMMEDIATE mode secured; change buffers shielded | ROW_LEVEL_SECURITY.md |
| 5 | PostgreSQL 13–17 support | PG 14–18 backcompat planned (~2.5–3 weeks) | PLAN_PG_BACKCOMPAT |
| 6 | session_preload_libraries | Not applicable (background worker needs shared_preload) | — |
| 7 | Rename via ALTER TABLE | Event trigger support (low effort) | — |
| 8 | Drop via DROP TABLE | Postponed (Phase 2 of transactional IVM) | PLAN_TRANSACTIONAL_IVM §4.3 |
| 9 | Extension upgrade scripts | ✅ Closed — Scripts ship from v0.2.1; CI completeness check and upgrade E2E tests in place | — |
| 10 | pg_dump / pg_restore | ✅ Closed — v0.8.0: safe backup with pg_dump and pg_restore, auto-reconnect streams | — |
Of the 10 items, 5 are now closed (immediate maintenance, TRUNCATE, RLS, upgrade scripts, pg_dump), 3 have concrete implementation plans, and 2 are low-priority or not applicable.
16. Use-Case Fit
| Scenario | Recommended |
|---|---|
| Need views consistent within the same transaction | Either (pg_trickle IMMEDIATE mode or pg_ivm) |
| Application cannot tolerate any view staleness | Either (pg_trickle IMMEDIATE mode or pg_ivm) |
| High write throughput, views can be slightly stale | pg_trickle (DIFFERENTIAL mode) |
| Multi-layer summary pipelines with dependencies | pg_trickle |
| Time-based or cron-driven refresh schedules | pg_trickle |
| Views with complex SQL (window functions, CTEs, UNION) | pg_trickle |
| Simple aggregation with zero-staleness requirement | Either (pg_trickle has richer SQL coverage) |
| Kubernetes / CloudNativePG deployment | pg_trickle |
| dbt integration | pg_trickle |
| Circular / self-referencing pipelines | pg_trickle |
| Multi-source watermark coordination | pg_trickle |
| High-throughput bulk loading (append-only) | pg_trickle (append-only fast path) |
| Row Level Security on analytical summaries | pg_trickle (richer RLS model) |
| pg_dump / pg_restore workflow | pg_trickle |
| PostgreSQL 13–17 | pg_ivm |
| PostgreSQL 18 | pg_trickle (superset of pg_ivm) |
| Production-hardened, stable API | pg_ivm |
| Early adopter, rich SQL coverage needed | pg_trickle |
17. Coexistence
The two extensions can be installed in the same database simultaneously — they
use different schemas (pgivm vs pgtrickle/pgtrickle_changes) and do not
interfere with each other. However, with pg_trickle's IMMEDIATE mode now
available and its dramatically broader feature set (v0.9.0), there is little
reason to use both:
- Use pg_trickle IMMEDIATE for small, critical lookup tables that must be perfectly consistent within transactions (the use-case that previously required pg_ivm).
- Use pg_trickle DIFFERENTIAL/FULL for large analytical summary tables, multi-layer aggregation pipelines, circular pipelines, or views where slight staleness is acceptable.
- Use pg_trickle AUTO (default) to let the system choose the best strategy.
- Use pg_ivm only if you need PostgreSQL 13–17 support or prefer its mature, battle-tested codebase.
18. Recommendations
Planned work that closes pg_ivm gaps
| Priority | Item | Plan | Effort | Closes Gaps |
|---|---|---|---|---|
| ✅ Done | IMMEDIATE refresh mode (all phases) | PLAN_TRANSACTIONAL_IVM | Complete | #1 (immediate maintenance), #3 (TRUNCATE) |
| ✅ Done | Extension upgrade scripts | v0.2.1 release | Complete | #9 (upgrade scripts) |
| ✅ Done | Row Level Security | v0.5.0 release | Complete | #4 (RLS) |
| ✅ Done | pg_dump / pg_restore | v0.8.0 release | Complete | #10 (backup/restore) |
| Postponed | pg_ivm compatibility layer | PLAN_TRANSACTIONAL_IVM Phase 2 | Deferred to post-1.0 | #2 (auto-indexing), #7 (rename), #8 (DROP TABLE) |
| High | PG 16–18 backcompat (MVP) | PLAN_PG_BACKCOMPAT §11 | ~1.5 weeks | #5 (PG version support) |
| Medium | PG 14–18 backcompat (full) | PLAN_PG_BACKCOMPAT §5 | ~2.5–3 weeks | #5 (PG version support) |
Remaining small gaps (no existing plan)
| Priority | Item | Description | Effort |
|---|---|---|---|
| Low | ALTER TABLE RENAME | Detect rename via event trigger, update catalog | 2–4h |
Not worth pursuing
| Item | Reason |
|---|---|
| PG 13 support | EOL since November 2025. Incompatible raw_parser() API. |
| session_preload_libraries | Requires background worker, which needs shared_preload_libraries. |
19. Conclusion
pg_trickle covers all of pg_ivm's SQL surface and extends it dramatically with 55+ additional aggregate functions (including algebraic O(1) maintenance for COUNT/SUM/AVG/STDDEV/VAR), window functions, set operations, recursive CTEs, LATERAL support, anti/semi-joins, circular pipeline support, watermark coordination, parallel refresh, Row Level Security, and a comprehensive operational layer.
The immediate maintenance gap is now fully closed: pg_trickle's IMMEDIATE
refresh mode provides the same in-transaction consistency as pg_ivm, while also
supporting window functions, LATERAL, scalar subqueries, WITH RECURSIVE (IM1),
TopK micro-refresh (IM2), and cascading stream tables in IMMEDIATE mode — all
of which pg_ivm cannot do.
The upgrade infrastructure gap is also closed: v0.2.1 ships SQL migration scripts with continuous upgrade path through v0.9.0, a CI completeness checker, and upgrade E2E tests, matching pg_ivm's upgrade path story.
The Row Level Security gap is closed (v0.5.0): refreshes see all data, RLS policies on the stream table itself control access, and IMMEDIATE mode is secured with shielded change buffers.
The pg_dump/restore gap is closed (v0.8.0): safe backup with standard PostgreSQL tools and automatic stream reconnection on restore.
The one remaining structural gap is PG version support:
- PLAN_PG_BACKCOMPAT details backporting to PG 14–18 (or PG 16–18 as MVP) in ~2.5–3 weeks, primarily by `#[cfg]`-gating ~435 lines of JSON/SQL-standard parse-tree code.
Once backcompat is implemented, pg_trickle will be a strict superset of pg_ivm in every dimension: same immediate maintenance model, comparable PG version support (14–18 vs 13–18, with PG 13 EOL), dramatically wider SQL coverage (60+ aggregates vs 5, 21 DVM operators, 42 unique features), and a complete operational layer that pg_ivm entirely lacks.
For users migrating from pg_ivm, the IMMEDIATE refresh mode already provides
the same zero-staleness guarantee. A full compatibility layer (pgivm.create_immv,
pgivm.refresh_immv, pgivm.pg_ivm_immv) is planned for post-1.0 to enable
zero-change migration.
References
- pg_ivm repository: https://github.com/sraoss/pg_ivm
- pg_trickle repository: https://github.com/grove/pg-trickle
- DBSP differential dataflow paper: https://arxiv.org/abs/2203.16684
- pg_trickle ESSENCE.md: ../../ESSENCE.md
- pg_trickle DVM operators: ../../docs/DVM_OPERATORS.md
- pg_trickle architecture: ../../docs/ARCHITECTURE.md
Triggers vs Logical Replication for CDC in pg_trickle
Status: Evaluation Report (updated with implementation status)
Date: 2026-02-24
Context: ADR-001/ADR-002 in PLAN_ADRS.md · PLAN_USER_TRIGGERS_EXPLICIT_DML.md
Executive Summary
pg_trickle uses row-level AFTER triggers to capture changes on source tables. This report evaluates the trigger-based approach against logical replication (WAL-based CDC) across five dimensions: correctness, performance, operations, and two end-user features — user-defined triggers on stream tables and logical replication subscriptions from stream tables.
Conclusion: Triggers remain the correct choice for the current scope given
operational simplicity and zero-config deployment. The hybrid approach —
trigger bootstrap at creation with automatic WAL transition for steady state —
is now implemented (pg_trickle.cdc_mode GUC, src/wal_decoder.rs). User-defined
triggers on stream tables are also implemented (pg_trickle.user_triggers GUC,
DISABLE TRIGGER USER during refresh). Both were previously recommendations
(§6.2, §6.6) and are now shipped.
However, the atomicity constraint — the original reason for choosing triggers — is primarily a creation-time inconvenience, not a steady-state limitation. Once a stream table exists, logical replication has three significant runtime advantages:
- No write-side overhead — With triggers, every INSERT/UPDATE/DELETE on a tracked source table does extra work before the application's transaction can commit: it runs a PL/pgSQL function, writes a row into a buffer table, and updates an index. This slows down the application. With logical replication, PostgreSQL already writes every change to its internal transaction log (WAL) regardless — the CDC layer simply reads that log after the fact, so the application's writes are not slowed down at all.
- TRUNCATE capture — When someone runs `TRUNCATE` on a source table, row-level triggers do not fire (TRUNCATE replaces the entire file rather than deleting rows one by one). This leaves stream tables silently stale until a manual refresh. Logical replication captures TRUNCATE natively from the WAL, so pg_trickle would know immediately that all rows were removed.
- Change ordering from the transaction log — With triggers, each trigger independently calls `pg_current_wal_lsn()` to timestamp its change. With logical replication, the ordering comes directly from the WAL — the authoritative, global record of all database changes — so change ordering is guaranteed to match commit order, even across concurrent transactions.
The two end-user features (user triggers and logical replication FROM stream tables) are both achievable without changing the CDC mechanism. A hybrid approach (triggers for creation, logical replication for steady-state) deserves serious consideration. See §3 for the full analysis.
1. Background
Current Architecture
CDC triggers on each tracked source table write typed per-column rows into
per-table buffer tables (pgtrickle_changes.changes_<oid>). Each buffer row
captures:
| Column | Purpose |
|---|---|
| `change_id` | BIGSERIAL ordering within a source |
| `lsn` | `pg_current_wal_lsn()` at trigger time |
| `action` | 'I' / 'U' / 'D' |
| `pk_hash` | Content hash of PK columns (optional) |
| `new_<col>` | Per-column NEW values (INSERT/UPDATE) |
| `old_<col>` | Per-column OLD values (UPDATE/DELETE) |
A covering B-tree index `(lsn, pk_hash, change_id) INCLUDE (action)` supports
the differential refresh's LSN-range scan.
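The buffer shape can be sketched as follows; the table name, column list, and LSN value are illustrative (the real generated DDL may differ):

```sql
CREATE INDEX ON pgtrickle_changes.changes_12345
  USING btree (lsn, pk_hash, change_id) INCLUDE (action);

-- A differential refresh then reads exactly one LSN window, in order:
SELECT action, new_id, new_status, old_id, old_status
FROM pgtrickle_changes.changes_12345
WHERE lsn > '0/1A2B3C4D'::pg_lsn       -- LSN of the previous refresh
  AND lsn <= pg_current_wal_lsn()
ORDER BY lsn, change_id;
```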
The Atomicity Constraint
create_stream_table() performs DDL (CREATE TABLE) and DML (catalog inserts)
before setting up CDC. pg_create_logical_replication_slot() cannot execute
inside a transaction that has already performed writes. This makes
single-transaction atomic creation impossible with logical replication — the
decisive factor in the original ADR.
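This is stock PostgreSQL behavior, reproducible without pg_trickle:

```sql
BEGIN;
CREATE TABLE demo (id int);   -- the transaction has now performed writes
SELECT pg_create_logical_replication_slot('demo_slot', 'pgoutput');
-- ERROR: cannot create logical replication slot in transaction
--        that has performed writes
ROLLBACK;
```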
2. Comparison Matrix
2.1 Correctness & Transactional Safety
| Aspect | Triggers | Logical Replication |
|---|---|---|
| Atomic creation | ✅ Same transaction as DDL+catalog | ❌ Slot creation requires separate transaction |
| Change visibility | ✅ Immediate (same transaction) | ⚠️ Asynchronous (after COMMIT + WAL decode) |
| TRUNCATE capture | ❌ Row-level triggers not fired | ✅ WAL emits TRUNCATE since PG 11 |
| Transaction ordering | ✅ Change buffer rows ordered by LSN | ✅ WAL stream preserves commit order |
| Crash recovery | ✅ Buffer tables are WAL-logged; no orphan state | ⚠️ Slot survives crash but may need re-sync |
| Schema change handling | ✅ DDL event hooks rebuild trigger in-place | ⚠️ Requires slot re-creation or output plugin awareness |
Key insight: The TRUNCATE gap is the most significant correctness
limitation of the trigger approach. A statement-level AFTER TRUNCATE trigger
that marks downstream STs for automatic FULL refresh would close this gap
without changing the CDC architecture (see §6 Recommendation 3).
2.2 Performance
| Metric | Triggers | Logical Replication |
|---|---|---|
| Per-row write overhead | ~2–4 μs (narrow INSERT) to ~5–15 μs (wide UPDATE) | ~0 (WAL writes happen regardless) |
| Expected throughput reduction | 1.5–5× on tracked source tables | None on source tables |
| Write amplification | 2× (source WAL + buffer table WAL + index) | 1× (only source WAL) |
| Change buffer storage | Heap table + index per source | WAL segments (shared, recycled) |
| Sequence contention | BIGSERIAL per buffer (lightweight) | N/A |
| Throughput ceiling | ~5,000 writes/sec (estimated) | WAL throughput (much higher) |
| Decoding CPU cost | N/A | Non-trivial; output plugin runs in WAL sender |
| Zero-change refresh | ~3 ms (EXISTS check on empty buffer) | ~3 ms (no pending WAL changes) |
Key insight: Trigger overhead is synchronous — every committing transaction pays the cost. For applications with moderate write rates (<5,000 writes/sec) this is acceptable. For high-throughput OLTP workloads, logical replication's zero write-side overhead is a significant advantage.
2.3 Operational Complexity
| Aspect | Triggers | Logical Replication |
|---|---|---|
| PostgreSQL configuration | None required | wal_level = logical + restart |
| Managed PG compatibility | ✅ Works everywhere | ⚠️ Some providers restrict wal_level |
| WAL retention risk | None (buffer tables are independent) | Slots prevent WAL cleanup; disk exhaustion risk |
| Slot management | N/A | Create, monitor, drop; orphan detection |
| `max_replication_slots` | N/A | Must be sized for number of tracked sources |
| `REPLICA IDENTITY` config | N/A | Required on all tracked source tables |
| Monitoring | Buffer table row counts | Slot lag, WAL retention, decode rate |
| Extension dependencies | None | Output plugin (pgoutput, wal2json, or custom) |
| Upgrade path | CREATE OR REPLACE FUNCTION | Slot protocol version compatibility |
Key insight: Triggers are operationally simpler by a wide margin. Logical replication introduces a class of failure modes (stuck slots, WAL bloat, replica identity misconfiguration) that require dedicated monitoring and operational runbooks.
2.4 Feature: User Triggers on Stream Tables
This addresses end-user triggers on the output stream tables, not CDC triggers on source tables.
| Aspect | Current (Trigger CDC) | With Logical Replication CDC |
|---|---|---|
| Feasibility | ✅ Achievable via session_replication_role | ✅ Same mechanism applies |
| Refresh suppression | SET LOCAL session_replication_role = 'replica' | Same |
| Post-refresh notification | NOTIFY pg_trickle_refresh with metadata | Same |
| MERGE firing pattern | DELETE+INSERT (not UPDATE); must be suppressed | Same — refresh mechanism is independent of CDC |
Key insight: User trigger support on stream tables is orthogonal to the
CDC mechanism and is now implemented. The solution uses ALTER TABLE ... DISABLE TRIGGER USER / ENABLE TRIGGER USER around FULL refresh (avoiding
the session_replication_role conflict with logical replication publishing).
In DIFFERENTIAL mode, explicit per-row DML (INSERT/UPDATE/DELETE) is used
instead of MERGE so that user-defined AFTER triggers fire correctly. The
implementation is controlled by the `pg_trickle.user_triggers` GUC (`auto`/`on`/`off`). See PLAN_USER_TRIGGERS_EXPLICIT_DML.md
for the full design.
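As a usage sketch, an end-user audit trigger on the README's `active_orders` stream table might look like the following. The trigger, function, and `audit_log` table are hypothetical examples, not part of pg_trickle:

```sql
SET pg_trickle.user_triggers = 'on';   -- force-enable the feature

CREATE TABLE audit_log (tbl text, op text, at timestamptz);

CREATE FUNCTION audit_active_orders() RETURNS trigger AS $$
BEGIN
    INSERT INTO audit_log VALUES ('active_orders', TG_OP, now());
    RETURN NULL;   -- AFTER ROW trigger; return value is ignored
END $$ LANGUAGE plpgsql;

CREATE TRIGGER trg_audit
    AFTER INSERT OR UPDATE OR DELETE ON active_orders
    FOR EACH ROW EXECUTE FUNCTION audit_active_orders();
```

With explicit per-row DML in DIFFERENTIAL mode, each refresh-applied change fires this trigger once per row, as ordinary DML would.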
Note: Sections 2.1–2.5 compare creation-time and operational aspects. For a focused steady-state comparison (what matters once the ST exists), see §3.
2.5 Feature: Logical Replication FROM Stream Tables
This addresses end-users subscribing to stream table changes via PostgreSQL's built-in logical replication.
| Aspect | Status | Notes |
|---|---|---|
| Basic publishing | ✅ Works today | STs are regular heap tables; CREATE PUBLICATION works |
| `__pgt_row_id` column | ⚠️ Replicated by default | Use column list in PUBLICATION to exclude, or document as usable PK |
| Differential refresh | ✅ DELETE+INSERT via MERGE are replicated | Subscriber sees individual DELETEs and INSERTs, not UPDATEs |
| Full refresh | ✅ TRUNCATE + INSERT replicated | Published table needs `REPLICA IDENTITY` set; subscriber receives TRUNCATE + mass INSERT |
| `REPLICA IDENTITY` | Needs configuration | `__pgt_row_id` could serve as unique index for identity |
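A sketch of both configurations from the table above, using the README's `active_orders` stream table (publication and index names are illustrative; column lists in publications require PostgreSQL 15+, and `REPLICA IDENTITY USING INDEX` assumes `__pgt_row_id` is a NOT NULL column):

```sql
-- Option 1: exclude the internal row-id column via a column list.
CREATE PUBLICATION pub_active_orders FOR TABLE active_orders (id, status);

-- Option 2: keep __pgt_row_id and use it as the replica identity.
CREATE UNIQUE INDEX active_orders_row_id ON active_orders (__pgt_row_id);
ALTER TABLE active_orders REPLICA IDENTITY USING INDEX active_orders_row_id;
```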
The session_replication_role Conflict
If the refresh engine sets session_replication_role = 'replica' to suppress
user triggers (Phase 1 of the user-trigger plan), this may also suppress
publication of the DML to logical replication subscribers. When a session
is in replica mode, PostgreSQL treats it as a replication subscriber — DML
performed in that session may not be forwarded to downstream subscribers
(depending on the publication's publish_via_partition_root and the
subscriber's origin setting).
This is a potential conflict between the two features. Options:
| Option | User Triggers Suppressed? | Replication Published? | Drawback |
|---|---|---|---|
| `session_replication_role = 'replica'` | ✅ Yes | ❌ May not be published | Breaks logical replication from STs |
| `ALTER TABLE ... DISABLE TRIGGER USER` | ✅ Yes | ✅ Yes | Requires ACCESS EXCLUSIVE lock |
| `pg_trickle.suppress_user_triggers` GUC → `DISABLE TRIGGER USER` only when needed | ✅ Configurable | ✅ Yes | Lock overhead; crash-safety concern (ENABLE on recovery) |
| `tgisinternal` flag manipulation | ✅ Yes | ✅ Yes | Non-portable; catalog-level hack |
Recommended resolution: Use ALTER TABLE ... DISABLE TRIGGER USER within
a SAVEPOINT, restoring on error. The ACCESS EXCLUSIVE lock is brief (only
held for the catalog update, not the entire refresh). If the user has enabled
both user triggers AND logical replication on a stream table, this is the only
approach that supports both simultaneously. If neither feature is in use, skip
the overhead entirely.
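The recommended pattern can be sketched as follows (table name from the README; since DDL is transactional in PostgreSQL, rolling back to the savepoint also restores the trigger state):

```sql
BEGIN;
SAVEPOINT pgt_refresh;
ALTER TABLE active_orders DISABLE TRIGGER USER;  -- brief ACCESS EXCLUSIVE lock
-- ... FULL refresh DML runs here: published to subscribers,
--     user triggers silent ...
ALTER TABLE active_orders ENABLE TRIGGER USER;
RELEASE SAVEPOINT pgt_refresh;
-- On error instead: ROLLBACK TO SAVEPOINT pgt_refresh;
-- which re-enables the triggers automatically.
COMMIT;
```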
3. Separating Creation-Time from Steady-State
The original ADR chose triggers because pg_create_logical_replication_slot()
cannot execute inside a transaction that has already performed writes. This
report initially treated that constraint as "decisive." But it deserves
scrutiny: the atomicity constraint only affects the create_stream_table()
call — a one-time event. Once a stream table exists, CDC runs for hours,
days, or months. The steady-state characteristics are what actually matter for
performance, correctness, and user experience.
3.1 The Atomicity Constraint Is a Solvable Engineering Problem
The constraint is real but workable. Three approaches exist, all with well-understood trade-offs:
| Approach | How It Works | Downside |
|---|---|---|
| Two-phase creation | Phase 1: DDL + catalog in one transaction. Phase 2: slot creation in a separate transaction. Rollback Phase 1 artifacts on Phase 2 failure. | Brief window where catalog entry exists without CDC. Cleanup on failure adds ~50 lines of code. |
| Background worker handoff | Main transaction creates DDL + catalog + temporary trigger. Background worker creates slot asynchronously, then drops trigger. | No data-loss window (changes between COMMIT and slot creation are captured by the temporary trigger), but the handoff adds complexity (~100 lines). |
| Trigger bootstrap → slot transition | Create with triggers (current approach). After first successful refresh, migrate to logical replication in the background. | Trigger overhead during bootstrap period (minutes). Most natural hybrid approach. |
None of these are architecturally difficult. The two-phase approach is straightforward — if slot creation fails, drop the storage table and catalog entry. The temporary-trigger approach eliminates even the theoretical data-loss window. These are engineering inconveniences, not fundamental blockers.
3.2 Steady-State: Triggers vs Logical Replication (Honest Comparison)
Once the stream table exists and CDC is running, here is how the two approaches compare on their actual runtime merits.
In plain terms: With triggers, every time the application writes a row to a tracked source table, the database does extra work right then and there — calling a function, writing to a buffer table, updating an index — all before the application's transaction can finish. This is like a toll booth on a highway: every car (write) must stop and pay (trigger overhead) before continuing.
With logical replication, the database already writes every change to its internal transaction log (the WAL) as part of normal operation. CDC simply reads that log after the fact, in a separate background process. The application's writes pass through without stopping — there is no toll booth. The cost of reading the log is paid by the database server, but it happens asynchronously and never slows down the application.
Where Logical Replication Wins (Steady-State)
| Dimension | Trigger Impact | Logical Replication Advantage |
|---|---|---|
| Write-path latency | Every INSERT/UPDATE/DELETE on a tracked source pays ~2–15 μs synchronous overhead (PL/pgSQL dispatch, buffer INSERT, index update). This is inside the committing transaction's critical path. | Zero additional write-path cost. WAL writes happen regardless; decoding is asynchronous. Source table DML performance is completely unaffected. |
| Write amplification | Each source row change produces: (1) source table WAL, (2) buffer table heap write, (3) buffer table WAL, (4) buffer index update, (5) index WAL. ~2–3× total write amplification. | 1× — only the source table's normal WAL. No additional heap writes, no secondary indexes. |
| TRUNCATE capture | Cannot capture. Row-level triggers don't fire. Requires a separate statement-level AFTER TRUNCATE workaround (§4) that only marks for reinit — the actual row deletions are invisible to differential mode. | Native. WAL emits TRUNCATE events since PG 11. The decoder receives a clean signal that all rows were removed. |
| Throughput ceiling | Estimated ~5,000 writes/sec on tracked sources before trigger overhead dominates. PL/pgSQL function dispatch is the bottleneck. | Bounded by WAL throughput — typically 50,000–200,000+ writes/sec depending on hardware and wal_buffers. |
| Connection-pool pressure | Trigger executes in the application's connection. Long-running trigger INSERTs can increase connection hold time under load. | Decoding runs in a dedicated WAL sender process. Application connections are unaffected. |
| Vacuum pressure | Buffer tables accumulate dead tuples between cleanups. Each refresh cycle creates bloat that autovacuum must reclaim. | No buffer tables to vacuum. WAL segments are recycled by the WAL management subsystem. |
| Transaction ID consumption | Each trigger INSERT consumes sub-transaction resources within the outer transaction. High-volume batch operations can cause excessive subtransaction overhead. | No additional transaction work. |
Where Triggers Win (Steady-State)
| Dimension | Trigger Advantage | Logical Replication Impact |
|---|---|---|
| Operational simplicity | No external state to manage. Buffer tables are regular heap tables — queryable, monitorable, backed up normally. Drop the trigger and it's gone. | Replication slots are persistent server-side state. A stuck or crashed consumer prevents WAL recycling, potentially filling the disk. Requires monitoring, max_slot_wal_keep_size guards, and orphan-slot cleanup. |
| Zero configuration | Works with any wal_level (minimal, replica, logical). No restart required. No REPLICA IDENTITY configuration. | Requires wal_level = logical (server restart), max_replication_slots sizing, and REPLICA IDENTITY on every tracked source table. Many managed PostgreSQL providers default to wal_level = replica. |
| Schema evolution | DDL event hooks rebuild the trigger function via CREATE OR REPLACE FUNCTION. New columns are added to the buffer table with ADD COLUMN IF NOT EXISTS. Simple, same-transaction, no coordination. | Schema changes on tracked tables require careful handling. The output plugin must be aware of column additions/removals. Slot may need to be recreated. ALTER TABLE during active decoding can cause protocol errors. |
| Debugging & visibility | Change buffers are queryable tables: SELECT * FROM pgtrickle_changes.changes_12345 ORDER BY change_id DESC LIMIT 10. Immediate visibility into what was captured. | WAL is binary and opaque. Inspecting captured changes requires pg_logical_slot_peek_changes() which advances or peeks the slot — disruptive in production. |
| Crash recovery | Buffer tables are WAL-logged and survive crashes. No special recovery needed — the refresh engine picks up from the last frontier LSN. | Slots survive crashes, but the decoding position may be ahead of what pg_trickle has consumed. Requires careful bookkeeping to avoid replaying or losing changes. |
| Multi-source coordination | Each source has an independent buffer table. The refresh engine reads from multiple buffers with independent LSN ranges. No coordination between sources. | Multiple sources could share a single slot (decoding all tables) or use per-source slots. Shared slots require demultiplexing; per-source slots multiply the slot management burden. |
| Isolation | Trigger failure (e.g., buffer table full) raises an error in the application transaction — visible and immediate. | Decoding failure is asynchronous. The application commits successfully, but changes may never reach the buffer. Silent data loss is possible unless monitored. |
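The debugging asymmetry from the table above, side by side (slot and buffer names are illustrative):

```sql
-- Trigger CDC: the change buffer is an ordinary, queryable table.
SELECT change_id, lsn, action
  FROM pgtrickle_changes.changes_12345
 ORDER BY change_id DESC LIMIT 10;

-- Logical replication: inspect pending changes without consuming them.
-- Works only with text-output plugins (e.g. test_decoding); pgoutput
-- produces binary output and needs pg_logical_slot_peek_binary_changes().
SELECT lsn, xid, data
  FROM pg_logical_slot_peek_changes('pgt_slot_orders', NULL, 10);
```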
Neutral (Roughly Equivalent)
| Dimension | Notes |
|---|---|
| Refresh-path performance | Both approaches populate the same buffer table schema. The MERGE/DVM pipeline is identical regardless of how buffers were filled. |
| Zero-change detection | Triggers: EXISTS check on empty buffer (~3 ms). Logical replication: check slot position vs current WAL LSN (~3 ms). Equivalent. |
| Memory footprint | Triggers: PL/pgSQL function cache per backend. Logical replication: WAL sender process + decoding context. Both are modest. |
3.3 When Does Logical Replication Become the Better Choice?
The crossover point depends on workload characteristics:
| Scenario | Better Choice | Why |
|---|---|---|
| < 1,000 writes/sec on tracked sources | Triggers | Overhead is negligible; operational simplicity dominates |
| 1,000–5,000 writes/sec | Either / Triggers still acceptable | Trigger overhead is measurable but unlikely to be the bottleneck |
| > 5,000 writes/sec | Logical Replication | Write-path overhead starts to matter; 2–3× write amplification compounds |
| ETL patterns (TRUNCATE + bulk INSERT) | Logical Replication | Native TRUNCATE capture; no stale-data gap |
| Wide tables (20+ columns) | Logical Replication | Trigger overhead scales with column count (~5–15 μs); WAL overhead does not |
| Managed PostgreSQL with `wal_level` restrictions | Triggers | No choice — logical replication may not be available |
| Many tracked sources (50+) | Logical Replication | Fewer moving parts than 50 triggers + 50 buffer tables + 50 indexes |
| Need logical replication FROM stream tables | Triggers (with caveats) | see §2.5 — session_replication_role conflict with DISABLE TRIGGER USER as workaround |
3.4 Reassessing the Decision
With the atomicity constraint properly scoped as a creation-time concern, the decision to use triggers rests on three remaining pillars:
1. Operational simplicity — no `wal_level` change, no slot management, no `REPLICA IDENTITY` configuration. This is genuinely valuable for an early-stage extension that needs frictionless adoption.
2. Debugging visibility — queryable buffer tables are a major developer-experience advantage. Being able to `SELECT * FROM changes_<oid>` during debugging is invaluable.
3. Zero-config deployment — works on any PostgreSQL 18 instance without server restarts or configuration changes. Critical for managed PostgreSQL environments.
However, these advantages are primarily about developer and operator experience, not about the fundamental capability of the system. A mature pg_trickle deployment that needs high write throughput, TRUNCATE support, or minimal source-table impact would be better served by logical replication in steady-state.
The honest assessment: Triggers are the right choice today for pragmatic reasons (simplicity, early-stage adoption, managed PG compatibility). But the report should not overstate the atomicity constraint as a fundamental blocker — it is a solvable problem. If pg_trickle grows to serve high-throughput production workloads, the migration to logical replication for steady-state CDC should be treated as a planned evolution, not a theoretical future.
4. TRUNCATE: The Gap and How to Close It
This limitation is one of the strongest arguments for logical replication in steady-state — see §3.2 for the comparison.
The TRUNCATE limitation is the most commonly cited drawback of trigger-based CDC. PostgreSQL does not fire row-level triggers for TRUNCATE because TRUNCATE operates at the file level (O(1)) — there are no individual rows to enumerate.
Current Behavior
1. User runs `TRUNCATE source_table`
2. CDC trigger does not fire — change buffer remains empty
3. Scheduler sees zero changes → `NO_DATA` → stream table is stale
4. Stream table shows data from rows that no longer exist
Proposed Fix: Statement-Level AFTER TRUNCATE Trigger
PostgreSQL supports statement-level AFTER TRUNCATE triggers. While they
provide no OLD row data, they can mark downstream stream tables for
reinitialization:
```sql
CREATE TRIGGER pg_trickle_truncate_<oid>
AFTER TRUNCATE ON <source_table>
FOR EACH STATEMENT
EXECUTE FUNCTION pgtrickle.on_source_truncated('<source_oid>');
```
The trigger function would:
- Look up all stream tables that depend on this source
- Mark them `needs_reinit = true` in the catalog
- Cascade transitively to downstream STs
This closes the TRUNCATE gap without changing the CDC architecture. The next scheduler cycle would trigger a FULL refresh automatically.
Effort estimate: ~2–4 hours (trigger creation in `cdc.rs`, PL/pgSQL or
Rust function for `on_source_truncated`, cascade logic reuse from `hooks.rs`).
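The three steps above can be sketched in PL/pgSQL. The catalog table and column names (`pgt_dependencies`, `pgt_stream_tables`, `source_relid`, `st_relid`, `needs_reinit`) are hypothetical stand-ins for pg_trickle's actual catalog, not its real schema:

```sql
CREATE FUNCTION pgtrickle.on_source_truncated() RETURNS trigger AS $$
BEGIN
    -- Mark every ST that depends, directly or transitively,
    -- on the truncated source (OID passed as the trigger argument).
    WITH RECURSIVE affected(relid) AS (
        SELECT TG_ARGV[0]::oid
        UNION
        SELECT d.st_relid
          FROM pgtrickle.pgt_dependencies d   -- hypothetical source → ST edges
          JOIN affected a ON d.source_relid = a.relid
    )
    UPDATE pgtrickle.pgt_stream_tables
       SET needs_reinit = true
     WHERE st_relid IN (SELECT relid FROM affected);
    RETURN NULL;   -- statement-level AFTER triggers return NULL
END $$ LANGUAGE plpgsql;
```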
5. Migration Path: Trigger → Logical Replication (Now Implemented)
Status: Phase A (Hybrid Creation) is now implemented in
`src/wal_decoder.rs`. The `pg_trickle.cdc_mode` GUC controls the behavior (`trigger`/`auto`/`wal`).
As discussed in §3, the atomicity constraint is a creation-time problem with known solutions. The buffer table schema and downstream IVM pipeline are decoupled from the capture mechanism, so migration is isolated to the CDC layer. This should be treated as a planned evolution for high-throughput deployments, not a theoretical future:
Phase A: Hybrid Creation
- `create_stream_table()` continues using triggers for atomic creation
- After first successful full refresh, a background worker creates a replication slot and transitions to WAL-based capture
- Trigger is dropped; buffer table continues to be populated from WAL decode
Phase B: Steady-State WAL Capture
- Background worker runs a logical decoding consumer per tracked source
- WAL changes are decoded and written to the same buffer table schema
- Downstream pipeline (DVM, MERGE, frontier) is unchanged
- TRUNCATE events are captured natively from WAL
Prerequisites
- `wal_level = logical` (must be documented as optional upgrade path)
- `REPLICA IDENTITY` on tracked sources (auto-configured or user-managed)
- Custom output plugin or `pgoutput` + column mapping
- Slot health monitoring (WAL retention alerts, orphan cleanup)
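The server-side prerequisites amount to a few standard commands (values are illustrative; the README's `orders` table stands in for a tracked source):

```sql
-- Requires a server restart to take effect.
ALTER SYSTEM SET wal_level = 'logical';
-- Size for the number of tracked sources plus existing replication uses.
ALTER SYSTEM SET max_replication_slots = 20;

-- Per-source replica identity: DEFAULT uses the primary key;
-- use FULL (or USING INDEX) for tables without a suitable PK.
ALTER TABLE orders REPLICA IDENTITY DEFAULT;
```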
Effort estimate: 3–5 weeks for a production-quality implementation.
6. Recommendations
Recommendation 1: Keep Trigger-Based CDC (For Now)
Operational simplicity and zero-config deployment are strong advantages for an early-stage extension. The performance ceiling (~5,000 writes/sec) is adequate for current target use cases. The atomicity constraint, while solvable (see §3.1), adds creation-time complexity that is not yet justified.
However: This decision should be revisited when any of these triggers are
hit: (a) users report write-path latency from CDC triggers, (b) TRUNCATE-based
ETL patterns become a common pain point, (c) pg_trickle targets environments
where wal_level = logical is already the norm. The steady-state advantages of
logical replication (§3.2) are substantial and should not be dismissed.
Recommendation 2: ✅ IMPLEMENTED — User Trigger Suppression
User-defined triggers on stream tables are now fully supported. The
implementation uses `ALTER TABLE ... DISABLE TRIGGER USER` / `ENABLE TRIGGER USER` around FULL refresh, and explicit per-row DML (INSERT/UPDATE/DELETE)
instead of MERGE during DIFFERENTIAL refresh so user AFTER triggers fire
correctly. Controlled by the `pg_trickle.user_triggers` GUC (`auto`/`on`/`off`).
The session_replication_role approach from the original plan was rejected to
avoid conflict with logical replication publishing (see §2.5).
Recommendation 3: Add TRUNCATE Capture Trigger
Add a statement-level AFTER TRUNCATE trigger on each tracked source table
that marks downstream STs for reinitialization. This closes the most
significant usability gap without changing the CDC architecture.
Recommendation 4: Document Logical Replication FROM Stream Tables
Add documentation and examples for CREATE PUBLICATION on stream tables,
including:
- Column filtering to exclude
__pgt_row_id REPLICA IDENTITYconfiguration using__pgt_row_idas unique index- Behavior during FULL vs DIFFERENTIAL refresh
- Interaction with user trigger suppression
Recommendation 5: Benchmark Trigger Overhead
Execute the benchmark plan in PLAN_TRIGGERS_OVERHEAD.md to establish data-driven thresholds for the logical replication migration crossover point. The results should feed directly into the §3.3 crossover analysis.
Recommendation 6: ✅ IMPLEMENTED — Hybrid CDC Approach
The "trigger bootstrap → slot transition" pattern is now implemented in
`src/wal_decoder.rs` (1152 lines). The implementation includes:
- Automatic transition: After stream table creation with triggers, a background worker creates a logical replication slot and transitions to WAL-based capture.
- GUC control: `pg_trickle.cdc_mode` (`trigger`/`auto`/`wal`) and `pg_trickle.wal_transition_timeout` control the behavior.
- Transition orchestration: Create slot → wait for catch-up → drop trigger. Automatic fallback to triggers if slot creation fails.
- Catalog extension: `pgt_dependencies` gains `cdc_mode`, `slot_name`, `decoder_confirmed_lsn`, `transition_started_at` columns.
- Health monitoring: `pgtrickle.check_cdc_health()` function and `NOTIFY pg_trickle_cdc_transition` notifications.
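Putting the pieces together, a deployment opting into the hybrid mode might run (the result columns of `check_cdc_health()` are not specified in this report, so only the call shape is shown):

```sql
-- Triggers at creation, automatic transition to WAL capture afterwards.
SET pg_trickle.cdc_mode = 'auto';

-- Inspect per-source CDC state (mode, slot, lag).
SELECT * FROM pgtrickle.check_cdc_health();

-- Receive a notification when a source finishes its trigger → WAL transition.
LISTEN pg_trickle_cdc_transition;
```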
7. Decision Log
| # | Decision | Rationale |
|---|---|---|
| D1 | Keep triggers for CDC on source tables — for now | Zero-config, operational simplicity, adequate for current scale |
| D2 | Atomicity constraint is solvable, not fundamental | Two-phase creation and hybrid bootstrap are proven patterns (§3.1) |
| D3 | Logical replication is superior in steady-state | Zero write overhead, TRUNCATE capture, higher throughput ceiling (§3.2) |
| D4 | User triggers on STs are orthogonal to CDC choice | session_replication_role / DISABLE TRIGGER USER works with either approach |
| D5 | Logical replication FROM STs works today | Regular heap tables; needs documentation, not code |
| D6 | TRUNCATE gap is closable with statement-level trigger | Low effort, high impact — but logical replication handles it natively |
| D7 | Hybrid approach is the optimal long-term target | Trigger bootstrap for creation + logical replication for steady-state |
| D8 | User trigger suppression uses DISABLE TRIGGER USER | Avoids session_replication_role conflict with logical replication publishing (§2.5) |
| D9 | Hybrid CDC implemented with auto-transition | pg_trickle.cdc_mode = 'auto' triggers → WAL transition after creation |
| D10 | Explicit DML for DIFFERENTIAL refresh with user triggers | INSERT/UPDATE/DELETE instead of MERGE so AFTER triggers fire correctly |