pg_trickle

pg_trickle is a PostgreSQL 18 extension that turns ordinary SQL views into self-maintaining stream tables — no external processes, no sidecars, no bespoke refresh pipelines. Just CREATE EXTENSION pg_trickle and your views stay fresh.

-- Declare a stream table — a view that maintains itself
SELECT pgtrickle.create_stream_table(
    name     => 'active_orders',
    query    => 'SELECT * FROM orders WHERE status = ''active''',
    schedule => '30s'
);

-- Insert a row — the stream table updates automatically on the next refresh
INSERT INTO orders (id, status) VALUES (42, 'active');
SELECT count(*) FROM active_orders;  -- 1

The problem with materialized views

PostgreSQL's materialized views are powerful but frustrating. REFRESH MATERIALIZED VIEW re-runs the entire query from scratch, even if only one row changed in a million-row table. Your choices are: burn CPU on full recomputation, or accept stale data. Most teams end up building bespoke refresh pipelines just to keep summary tables current.

What pg_trickle does differently

pg_trickle captures changes to your source tables and — on each refresh cycle — derives a delta query that processes only the changed rows and merges the result into the materialized table. One insert into a million-row source table? pg_trickle touches exactly one row's worth of computation.

The approach is grounded in the DBSP differential dataflow framework (Budiu et al., 2022). Delta queries are derived automatically from your SQL's operator tree: joins produce the classic bilinear expansion, aggregates maintain auxiliary counters, and linear operators like filters pass deltas through unchanged.
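The bilinear join rule is easy to see in miniature. Below is a hedged Python sketch of Δ(A ⋈ B) = ΔA ⋈ B + A ⋈ ΔB + ΔA ⋈ ΔB over signed multisets; it illustrates the DBSP rule itself, not pg_trickle's actual SQL implementation:

```python
from collections import Counter

def join(r, s, kr=0, ks=0):
    # Multiset equi-join; counts may be negative (deletions).
    out = Counter()
    for a, ca in r.items():
        for b, cb in s.items():
            if a[kr] == b[ks]:
                out[a + b] += ca * cb
    return out

def normalize(c):
    return Counter({k: v for k, v in c.items() if v != 0})

def delta_join(A, dA, B, dB):
    # Bilinear rule: Delta(A join B) = dA*B + A*dB + dA*dB.
    out = join(dA, B)
    out.update(join(A, dB))
    out.update(join(dA, dB))
    return normalize(out)

A  = Counter({(1, 'a'): 1, (2, 'b'): 1})   # existing left rows
B  = Counter({(1, 'x'): 1})                # existing right rows
dA = Counter({(3, 'c'): 1})                # one left insert
dB = Counter({(2, 'y'): 1})                # one right insert
# Only the newly matching pair appears in the delta:
assert delta_join(A, dA, B, dB) == Counter({(2, 'b', 2, 'y'): 1})
```

Note that the delta terms only touch changed rows and their join partners; the bulk of A ⋈ B is never recomputed.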

Key capabilities

  • Incremental refresh: only changed rows are recomputed, never a full table scan
  • Cascading DAG: stream tables that depend on stream tables propagate deltas downstream automatically
  • Demand-driven scheduling: set a freshness interval on the views your app queries; upstream layers inherit the tightest schedule automatically
  • Hybrid CDC: starts with lightweight row-level triggers; seamlessly transitions to WAL-based logical replication once available
  • Broad SQL support: JOINs, GROUP BY, DISTINCT, UNION/INTERSECT/EXCEPT, subqueries, CTEs (including WITH RECURSIVE), window functions, LATERAL, and more
  • Built-in observability: monitoring views, refresh history, NOTIFY-based alerting
  • CloudNativePG-ready: ships as an Image Volume extension image for Kubernetes deployments

Demand-driven scheduling

With the default CALCULATED schedule mode, you only set an explicit refresh interval on the stream tables your application actually queries. The system propagates that cadence upward through the dependency graph: each upstream stream table inherits the tightest schedule among its downstream dependents. You declare freshness requirements where they matter — at the consumer — and the entire pipeline adjusts without manual coordination.
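This propagation can be sketched as a fixpoint over the dependency graph. The following Python is illustrative only; the function name and min-wins tie-breaking are assumptions, not pg_trickle's catalog logic:

```python
# Hedged sketch of CALCULATED schedule propagation.
def effective_schedules(upstreams, explicit):
    """upstreams: table -> list of tables it reads from.
    explicit: table -> refresh interval in seconds (omit for CALCULATED)."""
    eff = {t: explicit.get(t) for t in upstreams}
    changed = True
    while changed:                      # fixpoint over the (acyclic) graph
        changed = False
        for table, ups in upstreams.items():
            if eff[table] is None:
                continue
            for up in ups:              # upstream inherits tightest cadence
                if eff[up] is None or eff[table] < eff[up]:
                    eff[up] = eff[table]
                    changed = True
    return eff

dag = {
    "department_tree": [],
    "department_stats": ["department_tree"],
    "department_report": ["department_stats"],
}
# One explicit 1s schedule at the consumer pulls the whole chain to 1s.
assert effective_schedules(dag, {"department_report": 1.0}) == {
    "department_tree": 1.0,
    "department_stats": 1.0,
    "department_report": 1.0,
}
```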

Hybrid change capture

pg_trickle bootstraps with lightweight row-level triggers — no configuration needed, works out of the box. Once the first refresh succeeds and wal_level = logical is available, the system automatically transitions to WAL-based logical replication for lower write-side overhead. The transition is seamless: trigger → transitioning → WAL-only. If anything goes wrong, it falls back to triggers.
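The progression behaves like a small state machine. Here is a hedged sketch in Python; the state and event names are hypothetical illustrations, not pg_trickle identifiers:

```python
# trigger -> transitioning -> wal_only, with fallback to triggers on error.
TRANSITIONS = {
    ("trigger", "refresh_ok_and_wal_logical"): "transitioning",
    ("transitioning", "slot_caught_up"): "wal_only",
    ("transitioning", "error"): "trigger",   # fall back
    ("wal_only", "error"): "trigger",        # fall back
}

def step(state, event):
    # Unknown events leave the state unchanged.
    return TRANSITIONS.get((state, event), state)

state = "trigger"
for event in ("refresh_ok_and_wal_logical", "slot_caught_up"):
    state = step(state, event)
assert state == "wal_only"
assert step(state, "error") == "trigger"  # any failure returns to triggers
```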


Explore this documentation

Tutorials

Integrations


Source & releases

Written in Rust using pgrx. Targets PostgreSQL 18. Apache 2.0 licensed.

Getting Started with pg_trickle

What is pg_trickle?

pg_trickle adds stream tables to PostgreSQL — tables that are defined by a SQL query and kept automatically up to date as the underlying data changes. Think of them as materialized views that refresh themselves, but smarter: instead of re-running the entire query on every refresh, pg_trickle uses Incremental View Maintenance (IVM) to process only the rows that changed.

Traditional materialized views force a choice: either re-run the full query (expensive) or accept stale data. pg_trickle eliminates this trade-off. When you insert a single row into a million-row table, pg_trickle computes the effect of that one row on the query result — it doesn't touch the other 999,999.

How data flows

The key concept is that data flows downstream automatically — from your base tables through any chain of stream tables, without you writing a single line of orchestration code:

  You write to base tables
         │
         ▼
  ┌─────────────┐   triggers (or WAL)    ┌───────────────────────┐
  │ Base Tables │ ──────────────────────▶│    Change Buffers     │
  │ (you write) │                        │ (pgtrickle_changes.*) │
  └─────────────┘                        └───────────┬───────────┘
                                                     │
                                        delta query (ΔQ) on refresh
                                                     │
                                                     ▼
  ┌──────────────────────────────────────────────────────────────┐
  │  Stream Table A  ◀── depends on base tables                  │
  └──────────────────────────┬───────────────────────────────────┘
                             │  change captured, buffer written
                             ▼
  ┌──────────────────────────────────────────────────────────────┐
  │  Stream Table B  ◀── depends on Stream Table A               │
  └──────────────────────────────────────────────────────────────┘

One write to a base table can ripple through an entire DAG of stream tables — each layer refreshed in the correct topological order, each doing only the work proportional to what actually changed.

  1. You write to your base tables normally — INSERT, UPDATE, DELETE
  2. Lightweight AFTER row-level triggers capture each change into a buffer, atomically in the same transaction. No polling, no logical replication slots required by default.
  3. On each refresh cycle, pg_trickle derives a delta query (ΔQ) that reads only the buffered changes since the last refresh frontier
  4. The delta is merged into the stream table — only the affected rows are written
  5. If other stream tables depend on this one, they are scheduled next (topological order)
  6. Optionally: once wal_level = logical is available and the first refresh succeeds, pg_trickle automatically transitions from triggers to WAL-based CDC (near-zero write-path overhead compared to ~2–15 μs for triggers). The transition is seamless and transparent.
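The steps above can be sketched end to end for the simplest case, a filter view. This is a toy Python illustration with hypothetical shapes; filters are linear operators, so the delta passes straight through the predicate:

```python
def refresh(stream_table, change_buffer, predicate):
    """stream_table: dict id -> row. change_buffer: list of (action, row),
    where action is 'I' or 'D'; an UPDATE arrives as a 'D' of the old row
    followed by an 'I' of the new row."""
    for action, row in change_buffer:
        if not predicate(row):
            continue                           # filtered out: view unaffected
        if action == "I":
            stream_table[row["id"]] = row      # merge: upsert the new row
        else:
            stream_table.pop(row["id"], None)  # merge: remove the old row
    change_buffer.clear()                      # advance the refresh frontier
    return stream_table

view = {}
buf = [("I", {"id": 42, "status": "active"}),
       ("I", {"id": 43, "status": "cancelled"})]
refresh(view, buf, lambda r: r["status"] == "active")
assert list(view) == [42] and buf == []
```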

This tutorial walks through a concrete org-chart example so you can see this flow end to end, including a chain of stream tables that propagates changes automatically.


Prerequisites

  • PostgreSQL 18.x with pg_trickle installed (see INSTALL.md)
  • shared_preload_libraries = 'pg_trickle' in postgresql.conf
  • max_worker_processes raised to at least 32 (see INSTALL.md); the PostgreSQL default of 8 is often exhausted if you have several databases, causing stream tables to silently stop refreshing
  • psql or any SQL client

Deploying to production? See the Pre-Deployment Checklist for a complete list of requirements, pooler compatibility, and recommended GUC values.

Playground: The fastest way to experiment is the playground — a Docker Compose environment with sample tables and stream tables pre-loaded. cd playground && docker compose up -d and you're running.

Quick start with Docker: Pull the pre-built GHCR image — PostgreSQL 18.3 + pg_trickle ready to run, no configuration needed:

docker run --rm -e POSTGRES_PASSWORD=secret -p 5432:5432 ghcr.io/grove/pg_trickle:latest

All GUC defaults (wal_level, shared_preload_libraries, scheduler settings) are pre-configured. See INSTALL.md for tag details and volume mounting.

Connect to the database you want to use and enable the extension:

CREATE EXTENSION pg_trickle;

No additional configuration is needed. pg_trickle automatically discovers all databases on the server and starts a scheduler for each one where the extension is installed.


Chapter 1: Hello World — Your First Stream Table

Before diving into multi-table joins and recursive CTEs, start with the simplest possible stream table: a single-source aggregate with no joins.

1.1 Setup

Create one table and enable the extension:

CREATE EXTENSION IF NOT EXISTS pg_trickle;

CREATE TABLE products (
    id       SERIAL PRIMARY KEY,
    category TEXT           NOT NULL,
    price    NUMERIC(10,2)  NOT NULL,
    in_stock BOOLEAN        NOT NULL DEFAULT true
);

INSERT INTO products (category, price) VALUES
    ('Electronics', 299.99),
    ('Electronics', 49.99),
    ('Books',       14.99),
    ('Books',       24.99),
    ('Books',        9.99);

1.2 Create the stream table

SELECT pgtrickle.create_stream_table(
    name     => 'category_summary',
    query    => $$
        SELECT
            category,
            COUNT(*)                    AS product_count,
            ROUND(AVG(price), 2)        AS avg_price,
            MIN(price)                  AS min_price,
            MAX(price)                  AS max_price,
            COUNT(*) FILTER (WHERE in_stock) AS in_stock_count
        FROM products
        GROUP BY category
    $$,
    schedule => '1s'
);

Query it immediately — it was populated by the initial full refresh:

SELECT category, product_count, avg_price, min_price, max_price, in_stock_count
FROM category_summary ORDER BY category;
  category   | product_count | avg_price | min_price | max_price | in_stock_count
-------------+---------------+-----------+-----------+-----------+----------------
 Books       |             3 |     16.66 |      9.99 |     24.99 |              3
 Electronics |             2 |    174.99 |     49.99 |    299.99 |              2
(2 rows)

1.3 Watch an INSERT update one group

INSERT INTO products (category, price) VALUES ('Books', 39.99);

Within ~1 second (or call SELECT pgtrickle.refresh_stream_table('category_summary') to force it):

SELECT category, product_count, avg_price, min_price, max_price, in_stock_count
FROM category_summary WHERE category = 'Books';
 category | product_count | avg_price | min_price | max_price | in_stock_count
----------+---------------+-----------+-----------+-----------+----------------
 Books    |             4 |     22.49 |      9.99 |     39.99 |              4
(1 row)

The Electronics row was not touched at all — pg_trickle read exactly one row from the change buffer and adjusted only the Books group.

1.4 Watch an UPDATE propagate

UPDATE products SET price = 19.99 WHERE price = 299.99;

After the next refresh:

SELECT category, product_count, avg_price, min_price, max_price, in_stock_count
FROM category_summary WHERE category = 'Electronics';
  category   | product_count | avg_price | min_price | max_price | in_stock_count
-------------+---------------+-----------+-----------+-----------+----------------
 Electronics |             2 |     34.99 |     19.99 |     49.99 |              2
(1 row)

For AVG, pg_trickle maintains running sum and count columns internally, so re-aggregating a group is O(1) regardless of group size.
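That auxiliary-counter idea can be sketched in a few lines of Python. This is an illustration of the technique, not pg_trickle's internal storage:

```python
# Keep running SUM and COUNT per group so AVG updates in O(1) per change.
class AvgState:
    def __init__(self):
        self.total = 0.0
        self.count = 0

    def apply(self, value, sign):
        """sign is +1 for an inserted value, -1 for a deleted one; an
        UPDATE is a delete of the old value plus an insert of the new."""
        self.total += sign * value
        self.count += sign

    @property
    def avg(self):
        return self.total / self.count if self.count else None

g = AvgState()
for price in (299.99, 49.99):    # the two Electronics rows
    g.apply(price, +1)
g.apply(299.99, -1)              # UPDATE: remove the old price...
g.apply(19.99, +1)               # ...add the new one
assert round(g.avg, 2) == 34.99  # matches the refreshed avg_price
```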

1.5 What you just saw

  • A single function call created the storage table, installed CDC triggers, ran the initial full refresh, and registered a 1-second schedule.
  • Every subsequent DML on products was captured in an AFTER trigger — no polling, no logical replication.
  • Each refresh touched only the rows and groups that changed.
  • The stream table is a real PostgreSQL table — you can SELECT, index, and join against category_summary like any other table.

Clean up: SELECT pgtrickle.drop_stream_table('category_summary'); DROP TABLE products;


Chapter 2: Joins, Aggregates & Chains

What you'll build

An employee org-chart system with three stream tables:

  • department_tree — a recursive CTE that flattens a department hierarchy into paths like Company > Engineering > Backend
  • department_stats — a join + aggregation over department_tree (a stream table!) that computes headcount and salary budget, with the full path included
  • department_report — a further aggregation that rolls up stats to top-level departments

The chain departments → department_tree → department_stats → department_report demonstrates automatic downstream propagation: modify a department name in the base table and all three stream tables update automatically, in the right order, without any manual orchestration.

By the end you will have:

  • Seen how stream tables are created, queried, and refreshed
  • Watched a single UPDATE in a base table cascade through three layers of stream tables automatically
  • Understood the four refresh modes and IVM strategies

Prefer dbt? A runnable dbt companion project mirrors every step below. Clone the repo and run:

./examples/dbt_getting_started/scripts/run_example.sh

See examples/dbt_getting_started/ for full details.


2.1 Create the Base Tables

These are ordinary PostgreSQL tables — pg_trickle doesn't require any special column types, annotations, or schema conventions.

Tables without a primary key work, but pg_trickle will emit a WARNING at stream table creation time: change detection falls back to a content-based hash across all columns, which is slower for wide tables and cannot distinguish between identical duplicate rows. Adding a primary key gives the best performance and most reliable change detection. A primary key is also required for automatic transition to WAL-based CDC (cdc_mode = 'auto'); without one the source table stays on trigger-based CDC.
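In miniature, content-based row identity might look like the following. The separator and hash choice are illustrative assumptions; pg_trickle's internal format may differ:

```python
import hashlib

# Hash every column value in order to identify a row without a primary key.
def row_hash(row):
    payload = "\x1f".join("" if v is None else repr(v) for v in row)
    return hashlib.sha256(payload.encode()).hexdigest()

a = row_hash((1, "Books", 14.99))
b = row_hash((1, "Books", 14.99))
c = row_hash((1, "Books", 15.99))
assert a == b   # identical duplicate rows collide: they can't be told apart
assert a != c   # any column change produces a different hash
```

The collision between identical duplicate rows is exactly why the WARNING recommends a primary key.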

-- Department hierarchy (self-referencing tree)
CREATE TABLE departments (
    id         SERIAL PRIMARY KEY,
    name       TEXT NOT NULL,
    parent_id  INT REFERENCES departments(id)
);

-- Employees belong to a department
CREATE TABLE employees (
    id            SERIAL PRIMARY KEY,
    name          TEXT NOT NULL,
    department_id INT NOT NULL REFERENCES departments(id),
    salary        NUMERIC(10,2) NOT NULL
);

Now insert some data — a three-level department tree and a handful of employees:

-- Top-level
INSERT INTO departments (id, name, parent_id) VALUES
    (1, 'Company',     NULL);

-- Second level
INSERT INTO departments (id, name, parent_id) VALUES
    (2, 'Engineering', 1),
    (3, 'Sales',       1),
    (4, 'Operations',  1);

-- Third level (under Engineering)
INSERT INTO departments (id, name, parent_id) VALUES
    (5, 'Backend',     2),
    (6, 'Frontend',    2),
    (7, 'Platform',    2);

-- Employees
INSERT INTO employees (name, department_id, salary) VALUES
    ('Alice',   5, 120000),   -- Backend
    ('Bob',     5, 115000),   -- Backend
    ('Charlie', 6, 110000),   -- Frontend
    ('Diana',   7, 130000),   -- Platform
    ('Eve',     3, 95000),    -- Sales
    ('Frank',   3, 90000),    -- Sales
    ('Grace',   4, 100000);   -- Operations

At this point these are plain tables with no triggers, no change tracking, nothing special. The department tree looks like this:

Company (1)
├── Engineering (2)
│   ├── Backend (5)     — Alice, Bob
│   ├── Frontend (6)    — Charlie
│   └── Platform (7)    — Diana
├── Sales (3)           — Eve, Frank
└── Operations (4)      — Grace

2.2 Create the First Stream Table — Recursive Hierarchy

Our first stream table flattens the department tree. For every department, it computes the full path from the root and the depth level. This uses WITH RECURSIVE — a SQL construct that can't be differentiated with simple algebraic rules (the recursion depends on itself), but pg_trickle handles it using incremental strategies (semi-naive evaluation for inserts, Delete-and-Rederive for mixed changes) that we'll explain later.

SELECT pgtrickle.create_stream_table(
    name         => 'department_tree',
    query        => $$
    WITH RECURSIVE tree AS (
        -- Base case: root departments (no parent)
        SELECT id, name, parent_id, name AS path, 0 AS depth
        FROM departments
        WHERE parent_id IS NULL

        UNION ALL

        -- Recursive step: children join back to the tree
        SELECT d.id, d.name, d.parent_id,
               tree.path || ' > ' || d.name AS path,
               tree.depth + 1
        FROM departments d
        JOIN tree ON d.parent_id = tree.id
    )
    SELECT id, name, parent_id, path, depth FROM tree
    $$,
    schedule     => '1s'
);

Note on short schedules: A 1-second schedule is safe for development and production thanks to auto_backoff (on by default since v0.10.0). If a refresh takes more than 95% of the schedule window, the scheduler automatically stretches the effective interval (up to 8× the configured schedule) to prevent CPU runaway, then resets to 1× as soon as a refresh completes on time. You will see a WARNING message when backoff activates.

v0.2.0+: create_stream_table also accepts diamond_consistency ('none' or 'atomic') and diamond_schedule_policy ('fastest' or 'slowest') for diamond-shaped dependency graphs. Schedules can be cron expressions (e.g., '*/5 * * * *', '@hourly'). Set pooler_compatibility_mode => true if you're connecting through PgBouncer or another transaction-mode connection pooler. See SQL_REFERENCE.md for the full parameter list.

What just happened?

That single function call did a lot of work atomically (all in one transaction):

  1. Parsed the defining query into an operator tree — identifying the recursive CTE, the scan on departments, the join, the union
  2. Created a storage table called department_tree in the public schema — a real PostgreSQL heap table with columns matching the SELECT output, plus an internal __pgt_row_id column (a hash used to track individual rows)
  3. Installed CDC triggers on the departments table — lightweight AFTER INSERT OR UPDATE OR DELETE row-level triggers that will capture every future change
  4. Created a change buffer table in the pgtrickle_changes schema — this is where the triggers write captured changes
  5. Ran an initial full refresh — executed the recursive query against the current data and populated the storage table
  6. Registered the stream table in pg_trickle's catalog with a 1-second refresh schedule

TRUNCATE caveat: Row-level triggers do not fire on TRUNCATE. If you TRUNCATE a base table, the change is not captured incrementally — the stream table will become stale. Use DELETE FROM table instead, or call pgtrickle.refresh_stream_table('department_tree') after a TRUNCATE. If the stream table uses DIFFERENTIAL mode, temporarily switch to FULL for a full recompute: pgtrickle.alter_stream_table('department_tree', refresh_mode => 'FULL'), refresh, then switch back.

Query it immediately — it's already populated:

SELECT id, name, parent_id, path, depth FROM department_tree ORDER BY path;

Expected output:

 id |    name     | parent_id |               path               | depth
----+-------------+-----------+----------------------------------+-------
  1 | Company     |           | Company                          |     0
  2 | Engineering |         1 | Company > Engineering            |     1
  5 | Backend     |         2 | Company > Engineering > Backend  |     2
  6 | Frontend    |         2 | Company > Engineering > Frontend |     2
  7 | Platform    |         2 | Company > Engineering > Platform |     2
  4 | Operations  |         1 | Company > Operations             |     1
  3 | Sales       |         1 | Company > Sales                  |     1
(7 rows)

This is a real PostgreSQL table — you can create indexes on it, join it in other queries, reference it in views, or even use it as a source for other stream tables. pg_trickle keeps it in sync automatically.

Key insight: The recursive query that computes paths and depths would normally need to be re-run manually (or via REFRESH MATERIALIZED VIEW). With pg_trickle, it stays fresh — any change to the departments table is automatically reflected within the schedule bound (1 second here).


2.3 Chain Stream Tables — Build the Downstream Layers

Now create department_stats. The twist: instead of joining directly against departments, it joins against department_tree — the stream table we just created. This creates a chain: changes to departments update department_tree, whose changes then trigger department_stats to update.

This demonstrates how pg_trickle builds a DAG — a directed acyclic graph of stream tables — and automatically schedules refreshes in topological order.

SELECT pgtrickle.create_stream_table(
    name         => 'department_stats',
    query        => $$
    SELECT
        t.id          AS department_id,
        t.name        AS department_name,
        t.path        AS full_path,
        t.depth,
        COUNT(e.id)                    AS headcount,
        COALESCE(SUM(e.salary), 0)     AS total_salary,
        COALESCE(AVG(e.salary), 0)     AS avg_salary
    FROM department_tree t
    LEFT JOIN employees e ON e.department_id = t.id
    GROUP BY t.id, t.name, t.path, t.depth
    $$,
    schedule     => 'calculated'      -- CALCULATED: inherit schedule from downstream; see explanation below
);

What just happened — and why this one is different?

Like before, pg_trickle parsed the query, created a storage table, and set up CDC. But department_stats depends on department_tree, not a base table — so no new triggers were installed. Instead, pg_trickle registered department_tree as an upstream dependency in the DAG.

The schedule is 'calculated' (CALCULATED mode), which means: "don't give this table its own schedule — inherit the tightest schedule of any downstream table that queries it". Internally this stores NULL in the catalog, but you must pass the string 'calculated' — passing SQL NULL is an error. Since no other stream table has been created yet, it will be refreshed on demand or when a downstream dependent triggers it.

The query has no recursive CTE, so pg_trickle uses algebraic differentiation:

  1. Decomposed into operators: Scan(department_tree) and Scan(employees) feed LEFT JOIN → Aggregate(GROUP BY + COUNT/SUM/AVG) → Project
  2. Derived a differentiation rule for each:
    • Δ(Scan) = read only change buffer rows (not the full table)
    • Δ(LEFT JOIN) = join change rows from one side against the full other side
    • Δ(Aggregate) = for COUNT/SUM/AVG, add or subtract per group — no rescan needed
  3. Composed these into a single delta query (ΔQ) that never touches unchanged rows

When one employee is inserted, the refresh reads one change buffer row, joins to find the department, and adjusts only that group's count and sum.
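That per-group adjustment can be sketched in Python. The shapes below are hypothetical illustrations; the real delta is executed as SQL against the change buffer:

```python
# Each buffered change adjusts only its own group's COUNT and SUM.
def apply_delta(groups, changes):
    """groups: dept_id -> [headcount, total_salary].
    changes: (sign, dept_id, salary), sign +1 for insert, -1 for delete."""
    touched = set()
    for sign, dept, salary in changes:
        g = groups.setdefault(dept, [0, 0.0])
        g[0] += sign            # COUNT adjustment
        g[1] += sign * salary   # SUM adjustment
        touched.add(dept)
    return touched

stats = {5: [2, 235000.0], 6: [1, 110000.0]}       # Backend, Frontend
touched = apply_delta(stats, [(+1, 6, 105000.0)])  # one insert into Frontend
assert touched == {6}
assert stats[6] == [2, 215000.0]   # only Frontend's counters moved
assert stats[5] == [2, 235000.0]   # Backend untouched
```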

Query it:

SELECT department_name, full_path, headcount, total_salary
FROM department_stats
ORDER BY full_path;

Expected output:

 department_name |            full_path             | headcount | total_salary
-----------------+----------------------------------+-----------+--------------
 Company         | Company                          |         0 |            0
 Engineering     | Company > Engineering            |         0 |            0
 Backend         | Company > Engineering > Backend  |         2 |    235000.00
 Frontend        | Company > Engineering > Frontend |         1 |    110000.00
 Platform        | Company > Engineering > Platform |         1 |    130000.00
 Operations      | Company > Operations             |         1 |    100000.00
 Sales           | Company > Sales                  |         2 |    185000.00
(7 rows)

Notice that the full_path column comes from department_tree — this data already went through one layer of incremental maintenance before landing here.

Add a third layer: department_report

Now add a rollup that aggregates department_stats by top-level division (the depth-1 name in each path):

SELECT pgtrickle.create_stream_table(
    name         => 'department_report',
    query        => $$
    SELECT
        split_part(full_path, ' > ', 2) AS division,
        SUM(headcount)                  AS total_headcount,
        SUM(total_salary)               AS total_payroll
    FROM department_stats
    WHERE depth >= 1
    GROUP BY 1
    $$,
    schedule     => '1s'              -- department_stats (CALCULATED) inherits this cadence automatically
);

The DAG is now:

departments (base)  employees (base)
      │                   │
      ▼                   │
department_tree ──────────┤
     (DIFF, 1s)           │
      │                   ▼
      └──────▶ department_stats
                 (DIFF, CALCULATED)
                      │
                      ▼
               department_report
                  (DIFF, 1s)   ◀── only explicit schedule

department_report drives the downstream schedule. Because it has a 1-second schedule, pg_trickle propagates that cadence upstream: department_stats (CALCULATED) inherits it automatically, and department_tree already runs on its own explicit 1-second schedule. A base table change therefore reaches all three layers within about a second, in topological order, with no manual configuration.

Query the report:

SELECT division, total_headcount, total_payroll FROM department_report ORDER BY division;
  division   | total_headcount | total_payroll
-------------+-----------------+---------------
 Engineering |               4 |    475000.00
 Operations  |               1 |    100000.00
 Sales       |               2 |    185000.00
(3 rows)

2.4 Watch a Change Cascade Through All Three Layers

This is the heart of pg_trickle. We'll make four changes to the base tables and watch them propagate automatically through the three-layer DAG — each layer doing only the minimum work.

The data flow pipeline (three layers)

  Your SQL statement
       │
       ▼
  CDC trigger fires (same transaction)
  Change buffer receives one row
       │
       ▼
  Background scheduler fires (within ~1 second)
       │
       ├──▶ [Layer 1] Refresh department_tree
       │         delta query reads change buffer
       │         MERGE touches only affected rows in department_tree
       │         department_tree's own change buffer is updated
       │
       ├──▶ [Layer 2] Refresh department_stats
       │         delta query reads department_tree's change buffer
       │         MERGE touches only affected department groups
       │
       └──▶ [Layer 3] Refresh department_report
                 delta query reads department_stats' change buffer
                 MERGE touches only affected division rows
                 All change buffers cleaned up ✓

All three layers run in a single scheduled pass, in topological order.

2.4a: INSERT ripples through all three layers

INSERT INTO employees (name, department_id, salary) VALUES
    ('Heidi', 6, 105000);  -- New Frontend engineer

What happened immediately (in your transaction): The AFTER INSERT trigger on employees fired and wrote one row to pgtrickle_changes.changes_<employees_oid>. The row contains the new values, action type I, and the LSN at the time of insert. Your transaction committed normally — no blocking.

The stream tables don't know about Heidi yet. The change is in the buffer, waiting for the next refresh.

The background scheduler handles this automatically. With a 1-second schedule, department_stats and department_report refresh within about a second.

To confirm a refresh has happened, check data_timestamp in the monitoring view:

SELECT name, data_timestamp, staleness FROM pgtrickle.pgt_status();

To force an immediate synchronous refresh, wait a moment first (so the scheduler can finish its current tick), then call in topological order. Note that refresh_stream_table only refreshes the named table — it does not cascade upstream:

SELECT pg_sleep(2);  -- let the scheduler finish any in-progress tick
SELECT pgtrickle.refresh_stream_table('department_stats');
SELECT pgtrickle.refresh_stream_table('department_report');

What happened across the three layers:

  • department_tree: no change (employees is not a source for this stream table); 0 rows touched
  • department_stats: delta query read 1 buffer row, joined to Frontend, COUNT+1, SUM+105000; 1 row touched (the Frontend group only)
  • department_report: delta query read 1 change from department_stats, headcount +1, payroll +105000; 1 row touched (the Engineering row only)

Check the result:

SELECT department_name, headcount, total_salary FROM department_stats
WHERE department_name = 'Frontend';
 department_name | headcount | total_salary
-----------------+-----------+--------------
 Frontend        |         2 |    215000.00

The 6 other groups in department_stats were not touched at all.

Contrast with a standard materialized view: REFRESH MATERIALIZED VIEW would re-scan all 8 employees, re-join with all 7 departments, re-aggregate, and update all 7 rows. With pg_trickle, the work was proportional to the 1 changed row — across all three layers.

2.4b: A department change cascades through the whole DAG

Now change the departments table — the root of the entire chain:

INSERT INTO departments (id, name, parent_id) VALUES
    (8, 'DevOps', 2);  -- New team under Engineering

What happened: The CDC trigger on departments fired. The change buffer for departments has one new row. None of the stream tables know about it yet.

The scheduler handles this automatically — all three tables will refresh within a second in the correct dependency order (upstream first). To force it synchronously, wait a moment first then refresh each table in topological order (refresh_stream_table does not cascade upstream):

SELECT pg_sleep(2);
SELECT pgtrickle.refresh_stream_table('department_tree');
SELECT pgtrickle.refresh_stream_table('department_stats');
SELECT pgtrickle.refresh_stream_table('department_report');

What happened across all three layers:

  • department_tree: semi-naive evaluation; the base case finds the new department and the recursive term computes its path; 1 row inserted
  • department_stats: delta query reads the new row from department_tree's change buffer; DevOps has 0 employees, so the delta is minimal; 1 row inserted (headcount = 0)
  • department_report: delta on the Engineering row; headcount stays the same because DevOps has 0 employees; 0 effective changes

How the recursive CTE refresh works — unlike department_stats, recursive CTEs can't be algebraically differentiated (the recursion references itself). pg_trickle uses incremental fixpoint strategies:

  • INSERT → semi-naive evaluation: differentiate the base case, propagate the delta through the recursive term, stopping when no new rows are produced. Only new rows inserted.
  • DELETE or UPDATE → Delete-and-Rederive (DRed): remove rows derived from deleted facts, re-derive rows that may have alternative derivation paths, handle cascades cleanly.
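In miniature, semi-naive evaluation looks like this: the base case seeds the initial delta, and each round applies the recursive rule only to facts produced in the previous round. A toy Python fixpoint over the department recursion (illustrative only; the real engine differentiates the base case and runs in SQL):

```python
def paths(departments):
    """departments: id -> (name, parent_id). Returns id -> full path."""
    out = {i: n for i, (n, p) in departments.items() if p is None}
    delta = dict(out)                 # base case seeds the first delta
    while delta:                      # stop when a round produces nothing
        new = {}
        for i, (n, p) in departments.items():
            if p in delta and i not in out:   # rule fires on new facts only
                new[i] = delta[p] + " > " + n
        out.update(new)
        delta = new                   # next round sees only fresh rows
    return out

depts = {1: ("Company", None), 2: ("Engineering", 1), 8: ("DevOps", 2)}
assert paths(depts)[8] == "Company > Engineering > DevOps"
```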
Verify the new row:

SELECT id, name, depth, path FROM department_tree WHERE name = 'DevOps';
 id |  name  | depth |              path
----+--------+-------+--------------------------------
  8 | DevOps |     2 | Company > Engineering > DevOps
(1 row)

The recursive CTE automatically expanded to include the new department at the correct depth and path. One inserted row in departments produced one new row in the stream table.

2.4c: UPDATE — A single rename that cascades everywhere

Rename "Engineering" to "R&D":

UPDATE departments SET name = 'R&D' WHERE id = 2;

What happened in the change buffer: The CDC trigger captured the old row (name='Engineering') and the new row (name='R&D'). Both old and new values are stored so the delta can compute what to remove and what to add.

Wait a moment for the scheduler to propagate the rename through all layers. To force it synchronously, wait then refresh each table in topological order (refresh_stream_table does not cascade upstream):

SELECT pg_sleep(2);
SELECT pgtrickle.refresh_stream_table('department_tree');
SELECT pgtrickle.refresh_stream_table('department_stats');
SELECT pgtrickle.refresh_stream_table('department_report');

What happened across all three layers:

  • department_tree: DRed strategy deletes rows derived with the old name and re-derives them with the new name; 5 rows updated (Engineering + 4 sub-teams), paths now read Company > R&D > …
  • department_stats: delta reads 5 changed rows from department_tree's buffer and updates the full_path column for those 5 departments; 5 rows updated
  • department_report: the division name changed, so the "Engineering" row is replaced by an "R&D" row; 1 DELETE + 1 INSERT

Query to verify the cascade:

SELECT name, path FROM department_tree WHERE path LIKE '%R&D%' ORDER BY depth, name;
   name   |           path           
----------+--------------------------
 R&D      | Company > R&D
 Backend  | Company > R&D > Backend
 DevOps   | Company > R&D > DevOps
 Frontend | Company > R&D > Frontend
 Platform | Company > R&D > Platform
(5 rows)

One UPDATE to a department name flowed through all three layers automatically — updating 5 + 5 + 2 rows across the chain.

2.4d: DELETE — Remove an employee

DELETE FROM employees WHERE name = 'Bob';

What happened: The AFTER DELETE trigger on employees fired, writing a change buffer row with action type D and Bob's old values (department_id=5, salary=115000). The delta query will use these old values to compute the correct aggregate adjustment — it knows to subtract 115000 from Backend's salary sum and decrement the count.
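The adjustment described above can be sketched in a few lines of plain Python (illustrative only — this is not pg_trickle's internal code): the OLD values captured by the trigger are subtracted from the group's running totals, and no other group is read.

```python
# Toy sketch of the aggregate adjustment (not pg_trickle internals):
# subtract the deleted row's OLD values from its group's running totals.

def apply_delete(totals, group, old_salary):
    """totals: {group: (count, sum)} — adjust one group for a deleted row."""
    count, salary_sum = totals[group]
    totals[group] = (count - 1, salary_sum - old_salary)

totals = {5: (2, 235000)}           # Backend before the DELETE
apply_delete(totals, 5, 115000)     # Bob's old values from the change buffer

count, salary_sum = totals[5]
print(count, salary_sum, salary_sum / count)   # 1 120000 120000.0
```

Only the Backend group's counters are touched; the delta never scans the employees table.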

Important — refresh before querying: The background scheduler refreshes all three tables within ~1 second, in topological order. To see the result immediately, wait a moment then explicitly refresh in upstream-first order:

SELECT pg_sleep(2);
SELECT pgtrickle.refresh_stream_table('department_stats');
SELECT pgtrickle.refresh_stream_table('department_report');

Why call department_stats first? department_stats sources from both employees and department_tree. Refreshing in topological order ensures each layer processes its upstream changes before computing its own deltas. Even when department_tree has unprocessed changes from step 2.4c and a new employee change arrives at the same time, pg_trickle's differential engine handles both correctly, using the pre-change left snapshot (L₀) to avoid double-counting.

Then verify the result:

SELECT department_name, headcount, total_salary, avg_salary
FROM department_stats WHERE department_name = 'Backend';
 department_name | headcount | total_salary |     avg_salary
-----------------+-----------+--------------+---------------------
 Backend         |         1 |    120000.00 | 120000.000000000000
(1 row)

Headcount dropped from 2 to 1 and the salary aggregates updated. Again, only the Backend group was recomputed; the other 6 department rows were left untouched.


Chapter 3: Scheduling & Backpressure

Automatic Scheduling — Let the DAG Drive Itself

pg_trickle runs a background scheduler that automatically refreshes stale tables in topological order. In the section 2.4 examples above, the scheduler handled every change within about a second. You can also call refresh_stream_table() directly when needed (e.g. in scripts or tests), but in normal operation the scheduler takes care of everything.

How schedules propagate

We gave department_report a '1s' schedule and the two upstream tables a NULL schedule (CALCULATED mode). This is the recommended pattern:

 department_tree    (CALCULATED → inherits 1s from downstream)
       │
 department_stats   (CALCULATED → inherits 1s from downstream)
       │
 department_report  (1s — the only explicit schedule)

CALCULATED mode (a NULL schedule, or schedule => 'calculated' explicitly) means: compute the tightest schedule across all downstream dependents. You declare freshness requirements at the tables your application queries — the system figures out how often each upstream table needs to refresh.
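The inheritance rule can be sketched in a few lines (illustrative only — not the extension's implementation; intervals are modeled as plain seconds here): a table with no explicit schedule takes the smallest interval found anywhere downstream of it.

```python
# Toy sketch of CALCULATED schedule inheritance: a table with no explicit
# schedule inherits the tightest (smallest) interval among its dependents,
# recursively.

def effective_schedule(table, explicit, downstream):
    """explicit: {table: seconds or None}; downstream: {table: [dependents]}."""
    if explicit.get(table) is not None:
        return explicit[table]
    inherited = [effective_schedule(d, explicit, downstream)
                 for d in downstream.get(table, [])]
    inherited = [s for s in inherited if s is not None]
    return min(inherited) if inherited else None

explicit = {"department_tree": None, "department_stats": None,
            "department_report": 1}
downstream = {"department_tree": ["department_stats"],
              "department_stats": ["department_report"]}

print(effective_schedule("department_tree", explicit, downstream))   # 1
```

Both upstream tables resolve to the report's 1-second requirement, matching the diagram above.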

What the scheduler does every second

  1. Queries the catalog for stream tables past their freshness bound
  2. Sorts them topologically (upstream first) — department_tree refreshes before department_stats, which refreshes before department_report
  3. Runs each refresh (respecting pg_trickle.max_concurrent_refreshes)
  4. Updates the last-refresh frontier
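Step 2's upstream-first ordering is a standard topological sort. A minimal sketch (Kahn's algorithm; illustrative only — the real scheduler reads the catalog, this does not):

```python
# Toy sketch of the scheduler's upstream-first ordering via Kahn's algorithm.
from collections import deque

def topo_order(deps):
    """deps: {table: [tables it reads from]} -> upstream-first refresh order."""
    indegree = {t: len(sources) for t, sources in deps.items()}
    dependents = {t: [] for t in deps}
    for table, sources in deps.items():
        for s in sources:
            dependents[s].append(table)
    queue = deque(t for t, d in indegree.items() if d == 0)
    order = []
    while queue:
        t = queue.popleft()
        order.append(t)                     # all of t's sources are done
        for d in dependents[t]:
            indegree[d] -= 1
            if indegree[d] == 0:
                queue.append(d)
    return order

deps = {"department_tree": [],
        "department_stats": ["department_tree"],
        "department_report": ["department_stats"]}
print(topo_order(deps))
# ['department_tree', 'department_stats', 'department_report']
```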

Monitoring

-- Current status of all stream tables
SELECT name, status, refresh_mode, schedule, data_timestamp, staleness
FROM pgtrickle.pgt_status();
           name           | status | refresh_mode | schedule |       data_timestamp       |  staleness
--------------------------+--------+--------------+----------+----------------------------+--------------
 public.department_tree   | ACTIVE | DIFFERENTIAL |          | 2026-02-26 10:30:00.123+01 | 00:00:00.877
 public.department_stats  | ACTIVE | DIFFERENTIAL |          | 2026-02-26 10:30:00.456+01 | 00:00:00.544
 public.department_report | ACTIVE | DIFFERENTIAL | 1s       | 2026-02-26 10:30:00.789+01 | 00:00:00.211
-- Detailed performance stats
SELECT pgt_name, total_refreshes, avg_duration_ms, successful_refreshes
FROM pgtrickle.pg_stat_stream_tables;
-- Health check: quick triage of common issues
SELECT check_name, severity, detail FROM pgtrickle.health_check();
-- Visualize the dependency DAG
SELECT * FROM pgtrickle.dependency_tree();
-- Recent refresh timeline across all stream tables
SELECT * FROM pgtrickle.refresh_timeline(10);
-- Check CDC change buffer sizes (spotting buffer build-up)
SELECT * FROM pgtrickle.change_buffer_sizes();

See SQL_REFERENCE.md for the full list of monitoring functions including list_sources(), trigger_inventory(), and diamond_groups().


Chapter 4: Monitoring In Depth

All the monitoring capabilities from the Chapter 3 monitoring overview, expanded. For the five most important day-to-day introspection queries, see the Monitoring Quick Reference near the end of this guide.

Optional: WAL-based CDC

By default pg_trickle uses triggers. If wal_level = logical is configured, set:

ALTER SYSTEM SET pg_trickle.cdc_mode = 'auto';
SELECT pg_reload_conf();

pg_trickle will automatically transition each stream table from trigger-based to WAL-based capture after the first successful refresh — reducing per-write overhead from ~2–15 μs (triggers) to near-zero (WAL-based capture adds no synchronous overhead to your DML). The transition is transparent; your queries and the refresh schedule are unaffected.

Optional: Parallel Refresh (v0.4.0+)

By default the scheduler refreshes stream tables sequentially in topological order within a single background worker. This is correct and efficient for most workloads.

For deployments with many independent stream tables, enable parallel refresh:

ALTER SYSTEM SET pg_trickle.parallel_refresh_mode = 'on';
ALTER SYSTEM SET pg_trickle.max_dynamic_refresh_workers = 4;  -- cluster-wide cap
SELECT pg_reload_conf();

Independent stream tables at the same DAG level will then refresh concurrently in separate dynamic background workers. Refresh pairs with IMMEDIATE-trigger connections and atomic consistency groups still run in a single worker for correctness.

Before enabling, ensure max_worker_processes has enough room:

max_worker_processes >= 1 (launcher)
                      + number of databases with stream tables
                      + max_dynamic_refresh_workers (default 4)
                      + autovacuum and other extension workers

Monitor parallel refresh:

SELECT * FROM pgtrickle.worker_pool_status();        -- live worker budget
SELECT * FROM pgtrickle.parallel_job_status(60);     -- recent jobs

See CONFIGURATION.md — Parallel Refresh for the complete tuning reference.

Optional: PgBouncer / Connection Pooler Compatibility (v0.10.0+)

If you're connecting through PgBouncer or another connection pooler in transaction mode (the default on Supabase, Railway, Neon, and most managed PostgreSQL platforms), set pooler_compatibility_mode when creating or altering a stream table:

SELECT pgtrickle.create_stream_table(
    name                    => 'live_headcount',
    query                   => 'SELECT department_id, COUNT(*) FROM employees GROUP BY 1',
    schedule                => '1s',
    pooler_compatibility_mode => true
);

This disables prepared statements and NOTIFY emissions for that table — the two features that break in transaction-pool mode. Leave it off (the default) if you connect directly to PostgreSQL.

Optional: Change Buffer Compaction (v0.10.0+)

For high-churn tables, pg_trickle automatically compacts the pending change buffer before each refresh cycle when it exceeds pg_trickle.compact_threshold (default 100,000 rows). INSERT→DELETE pairs that cancel each other out are eliminated, and multiple changes to the same row are collapsed to a single net change, reducing delta scan overhead by 50–90%.
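The net effect of compaction can be sketched in plain Python (illustrative only — not the extension's implementation; changes are modeled as (key, old, new) tuples in commit order):

```python
# Toy sketch of change-buffer compaction: collapse multiple changes to the
# same key into one net change, and drop pairs that cancel out.

def compact(changes):
    """changes: list of (key, old_row, new_row), None meaning 'absent'.
    Returns at most one net (key, old, new) per key, dropping no-ops."""
    first_old, last_new = {}, {}
    for key, old, new in changes:
        if key not in first_old:
            first_old[key] = old        # oldest pre-image wins
        last_new[key] = new             # newest post-image wins
    result = []
    for key in first_old:
        old, new = first_old[key], last_new[key]
        if old != new:                  # INSERT→DELETE pairs vanish here
            result.append((key, old, new))
    return result

buffer = [
    (42, None, {"qty": 1}),            # INSERT
    (42, {"qty": 1}, {"qty": 5}),      # UPDATE — collapses with the INSERT
    (7,  None, {"qty": 3}),            # INSERT
    (7,  {"qty": 3}, None),            # DELETE — cancels the INSERT
]
print(compact(buffer))                 # [(42, None, {'qty': 5})]
```

Four buffered changes reduce to a single net insert, which is the kind of shrinkage behind the quoted 50–90% reduction.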


Chapter 5: Advanced Topics

Refresh Modes and IVM Strategies

You've now seen the strategies pg_trickle uses for incremental view maintenance (IVM). Understanding the four refresh modes and when each strategy applies helps you write efficient stream table queries.

The Four Refresh Modes

| Mode | When it refreshes | Use case |
|------|-------------------|----------|
| AUTO (default) | On a schedule (background) | Most use cases — uses DIFFERENTIAL when possible, falls back to FULL automatically |
| DIFFERENTIAL | On a schedule (background) | Like AUTO but errors if the query can't be differentiated |
| FULL | On a schedule (background) | Forces a full recompute every cycle |
| IMMEDIATE | Synchronously, in the same transaction as the DML | Real-time dashboards, audit tables — the stream table is always up-to-date |

When you omit refresh_mode, the default is 'AUTO' — it uses differential (delta-only) maintenance when the query supports it, and automatically falls back to full recomputation when it doesn't. You only need to specify a mode explicitly for advanced cases.

IMMEDIATE mode (new in v0.2.0) maintains stream tables synchronously within the same transaction as the base table DML. It uses statement-level AFTER triggers with transition tables — no change buffers, no scheduler. The stream table is always consistent with the current transaction.

-- Create a stream table that updates in real-time
SELECT pgtrickle.create_stream_table(
    name         => 'live_headcount',
    query        => $$
    SELECT department_id, COUNT(*) AS headcount
    FROM employees
    GROUP BY department_id
    $$,
    refresh_mode => 'IMMEDIATE'
);

-- After any INSERT/UPDATE/DELETE on employees,
-- live_headcount is already up-to-date — no refresh needed!

IMMEDIATE mode supports joins, aggregates, window functions, LATERAL subqueries, and cascading IMMEDIATE stream tables. Recursive CTEs are not supported in IMMEDIATE mode (use DIFFERENTIAL instead).

You can switch between modes at any time:

-- Switch from DIFFERENTIAL to IMMEDIATE
SELECT pgtrickle.alter_stream_table('department_stats', refresh_mode => 'IMMEDIATE');

-- Switch back to DIFFERENTIAL with a schedule
SELECT pgtrickle.alter_stream_table('department_stats', refresh_mode => 'DIFFERENTIAL', schedule => '1s');

Algebraic Differentiation (used by department_stats)

For queries composed of scans, filters, joins, and algebraic aggregates (COUNT, SUM, AVG), pg_trickle derives the IVM delta mathematically. The rules come from the DBSP differential dataflow framework (Budiu et al., 2022):

| Operator | Delta rule | Cost |
|----------|------------|------|
| Scan | Read only change buffer rows (not the full table) | O(changes) |
| Filter (WHERE) | Apply the predicate to change rows | O(changes) |
| Join | Join change rows from one side against the full other side | O(changes × lookup) |
| Aggregate (COUNT/SUM/AVG) | Add or subtract deltas per group — no rescan | O(affected groups) |
| Project | Pass through | O(changes) |

The total cost is proportional to the number of changes, not the table size. For a million-row table with 10 changes, the delta query touches ~10 rows.
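The join rule in the table is the classic bilinear expansion. A minimal sketch of the identity Δ(A ⋈ B) = (ΔA ⋈ B_old) ∪ (A_new ⋈ ΔB) in plain Python (illustrative only — this is the algebra, not the SQL pg_trickle generates; rows are dicts, deltas carry a ±1 sign):

```python
# Toy sketch of the bilinear join delta: changed rows from one side are
# joined against a snapshot of the other side, so work scales with the
# number of changes, not the table sizes.

def join_delta(delta_a, b_old, a_new, delta_b, on):
    """Deltas are (sign, row) pairs with sign = +1 (insert) or -1 (delete)."""
    out = []
    for sign, ra in delta_a:               # changed A rows vs. old B snapshot
        out += [(sign, {**ra, **rb}) for rb in b_old if ra[on] == rb[on]]
    for sign, rb in delta_b:               # changed B rows vs. new A snapshot
        out += [(sign, {**ra, **rb}) for ra in a_new if ra[on] == rb[on]]
    return out

departments = [{"department_id": 5, "dept": "Backend"}]
new_hire = [(+1, {"department_id": 5, "name": "Zara"})]

# One inserted employee produces exactly one delta row:
print(join_delta(new_hire, departments, [], [], "department_id"))
```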

Incremental Strategies for Recursive CTEs (used by department_tree)

For recursive CTEs, pg_trickle can't derive an algebraic delta because the recursion references itself. Instead it uses two complementary strategies, chosen automatically based on what changed:

Semi-naive evaluation (for INSERT-only changes):

  1. Differentiate the base case — find the new seed rows
  2. Propagate the delta through the recursive term, iterating until no new rows are produced
  3. The result is only the new rows created by the change — not the whole tree
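The three steps above can be sketched on a toy transitive-closure problem (illustrative only — not pg_trickle's implementation): seed the delta with the newly inserted edges, then extend it one edge at a time until nothing new appears.

```python
# Toy sketch of semi-naive evaluation for an INSERT into an edge table:
# only paths that use the new edges are ever derived.

def semi_naive_insert(old_closure, edges_after, new_edges):
    """Return only the reachability pairs created by new_edges."""
    known = set(old_closure)
    delta = set(new_edges) - known          # step 1: differentiated base case
    added = set()
    while delta:                            # step 2: iterate the recursive term
        added |= delta
        known |= delta
        frontier = set()
        for (a, b) in delta:
            for (x, y) in edges_after:
                if y == a:                  # prepend an edge x -> a
                    frontier.add((x, b))
                if x == b:                  # append an edge b -> y
                    frontier.add((a, y))
        delta = frontier - known            # stop when no new rows appear
    return added                            # step 3: only the new rows

edges_after = {(1, 2), (2, 3), (3, 4)}      # after INSERT of edge (3, 4)
old_closure = {(1, 2), (2, 3), (1, 3)}      # closure before the INSERT
print(semi_naive_insert(old_closure, edges_after, {(3, 4)}))
```

Inserting one edge yields exactly the three new ancestor/descendant pairs, without revisiting the pre-existing closure.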

Delete-and-Rederive (DRed) (for DELETE or UPDATE):

  1. Remove all rows derived from the old fact
  2. Re-derive rows that had the old fact as one of their derivation paths (they may still be reachable via other paths)
  3. Insert the newly derived rows under the new fact
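The DRed phases can be sketched on the same toy closure problem (illustrative only — the sketch re-derives globally, whereas the real strategy restricts re-derivation to the affected subgraph):

```python
# Toy sketch of Delete-and-Rederive: phase 1 over-deletes every pair whose
# derivation may route through the removed edge; phase 2 brings back the
# ones that survive via an alternative path.

def transitive_closure(edges):
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (x, y) in edges:
                if x == b and (a, y) not in closure:
                    closure.add((a, y))
                    changed = True
    return closure

def dred(old_closure, edges_after, removed):
    ra, rb = removed
    # Phase 1: over-delete anything that could pass through the removed edge.
    suspect = {(a, b) for (a, b) in old_closure
               if (a == ra or (a, ra) in old_closure)
               and (b == rb or (rb, b) in old_closure)}
    surviving = old_closure - suspect
    # Phase 2: re-derive suspects that still hold without the removed edge.
    rederived = suspect & transitive_closure(edges_after)
    return surviving | rederived

# Diamond: 1->2->4 and 1->3->4. Removing 2->4 keeps (1, 4) reachable via 3.
edges_after = {(1, 2), (1, 3), (3, 4)}
old_closure = {(1, 2), (1, 3), (2, 4), (3, 4), (1, 4)}
print(dred(old_closure, edges_after, (2, 4)))
```

The diamond shows why re-derivation matters: (1, 4) is over-deleted in phase 1 but restored in phase 2 because an alternative path through node 3 still exists.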

Both strategies are more efficient than full recomputation — they work on the affected portion of the result set, not the entire recursive query. The MERGE only modifies rows that actually changed.

When to use which strategy?

You don't choose — pg_trickle detects the strategy automatically based on the query structure:

| Query Pattern | Strategy | Performance |
|---------------|----------|-------------|
| Scan + Filter + Join + algebraic Aggregate (COUNT/SUM/AVG) | Algebraic | Excellent — O(changes) |
| CORR, COVAR_POP/SAMP, REGR_* (12 functions) | Algebraic (Welford running totals) | O(changes) — running totals updated per changed row, no group rescan (v0.10.0+) |
| Non-recursive CTEs | Algebraic (inlined) | CTE body is differentiated inline |
| MIN / MAX aggregates | Semi-algebraic | Uses LEAST/GREATEST merge; per-group rescan only when an extremum is deleted |
| STRING_AGG, ARRAY_AGG, ordered-set aggregates | Group-rescan | Affected groups fully re-aggregated from source |
| GROUPING SETS / CUBE / ROLLUP | Algebraic (rewritten) | Auto-expanded to UNION ALL of GROUP BY queries; CUBE capped at 64 branches |
| Recursive CTEs (WITH RECURSIVE), INSERT | Semi-naive evaluation | O(new rows derived from the change) |
| Recursive CTEs (WITH RECURSIVE), DELETE/UPDATE | Delete-and-Rederive | Re-derives rows with alternative paths; O(affected subgraph) (v0.10.0+) |
| LATERAL subqueries | Correlated re-evaluation | Only outer rows correlated with changed inner data are re-evaluated — O(correlated rows) (v0.10.0+) |
| Window functions | Partition recompute | Only affected partitions recomputed |
| ORDER BY … LIMIT N (TopK) | Scoped recomputation | Re-evaluates the top N via MERGE; stores exactly N rows |
| IMMEDIATE mode queries | In-transaction delta | Same algebraic strategies, applied synchronously via transition tables |

FUSE Circuit Breaker (v0.11.0+)

The fuse is a circuit breaker that stops a stream table from processing an unexpectedly large batch of changes — for example from a runaway script or mass-delete migration — without operator review.

-- Arm a fuse: blow when pending changes exceed 50,000 rows
SELECT pgtrickle.alter_stream_table(
    'category_summary',
    fuse           => 'on',
    fuse_ceiling   => 50000
);

-- Check fuse status across all stream tables
SELECT name, fuse_mode, fuse_state, fuse_ceiling, blown_at
FROM pgtrickle.fuse_status();

-- After investigating and deciding to apply the batch:
SELECT pgtrickle.reset_fuse('category_summary', action => 'apply');

-- Or skip the oversized batch entirely and resume from current state:
SELECT pgtrickle.reset_fuse('category_summary', action => 'skip_changes');

reset_fuse supports three actions:

  • 'apply' — process all pending changes and resume normal scheduling.
  • 'reinitialize' — drop and repopulate the stream table from scratch.
  • 'skip_changes' — discard pending changes and resume from the current frontier.

A pgtrickle_alert NOTIFY is emitted when the fuse blows, making it easy to hook into alerting pipelines or LISTEN from application code.

Partitioned Stream Tables (v0.11.0+)

For large stream tables, declare a partition key at creation time so MERGE operations are scoped to only the relevant partitions:

SELECT pgtrickle.create_stream_table(
    name         => 'sales_by_month',
    query        => $$
        SELECT
            DATE_TRUNC('month', sale_date) AS month,
            product_id,
            SUM(amount) AS total_sales
        FROM sales
        GROUP BY 1, 2
    $$,
    schedule     => '1m',
    partition_by => 'month'    -- partition key must be in the SELECT output
);

pg_trickle creates the storage table as PARTITION BY RANGE (month) with a catch-all partition, then on each refresh:

  1. Inspects the delta to find the MIN and MAX of the partition key.
  2. Injects AND st.month BETWEEN min AND max into the MERGE ON clause.
  3. PostgreSQL prunes all partitions outside the range — giving ~100× I/O reduction for a 0.1% change rate on a 10M-row table.
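Steps 1–2 amount to computing the delta's key range and splicing it into the MERGE predicate. A minimal sketch (illustrative only — delta rows, the merge_scope helper, and the naive quoting are all made up for this example; the real extension builds the clause internally):

```python
# Toy sketch of the partition-scoped MERGE predicate: find the delta's
# partition-key range, then constrain the MERGE ON clause so PostgreSQL
# can prune every partition outside it.

def merge_scope(delta_rows, key):
    """Return the BETWEEN predicate injected into the MERGE ON clause."""
    values = [r[key] for r in delta_rows]
    lo, hi = min(values), max(values)
    return f"AND st.{key} BETWEEN '{lo}' AND '{hi}'"

delta = [{"month": "2026-01-01"}, {"month": "2026-02-01"}]
print(merge_scope(delta, "month"))
# AND st.month BETWEEN '2026-01-01' AND '2026-02-01'
```

With the delta confined to two months, a MERGE against a multi-year table only ever touches two partitions.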

See SQL_REFERENCE.md for full partitioning options.

IMMEDIATE Mode — Real-Time In-Transaction IVM

-- Create a stream table that updates in the same transaction as its source
SELECT pgtrickle.create_stream_table(
    name         => 'live_headcount',
    query        => $$
    SELECT department_id, COUNT(*) AS headcount
    FROM employees
    GROUP BY department_id
    $$,
    refresh_mode => 'IMMEDIATE'
);

-- After any INSERT/UPDATE/DELETE on employees, live_headcount is already up-to-date:
INSERT INTO employees (name, department_id, salary) VALUES ('Zara', 2, 95000);
SELECT * FROM live_headcount WHERE department_id = 2;  -- already includes Zara, immediately

IMMEDIATE mode uses statement-level AFTER triggers with transition tables — no change buffers, no scheduler, no background workers. The stream table is always consistent with the current transaction. Ideal for audit tables, real-time dashboards, and applications that need zero-latency reads.

Multi-Tenant Worker Quotas (v0.11.0+)

In deployments with multiple databases, one busy database can starve others if all dynamic refresh workers are claimed. The per_database_worker_quota GUC prevents this:

-- Limit one performance-critical database to 4 workers (with burst to 6)
ALTER DATABASE analytics  SET pg_trickle.per_database_worker_quota = 4;
-- Allow a reporting database only 2 base workers
ALTER DATABASE reporting  SET pg_trickle.per_database_worker_quota = 2;
-- Apply changes
SELECT pg_reload_conf();

When the cluster has spare capacity (active workers < 80% of max_dynamic_refresh_workers), a database may temporarily burst to 150% of its quota. Burst is reclaimed within 1 scheduler cycle once load rises. Within each dispatch tick, IMMEDIATE-trigger closures are always dispatched first, followed by atomic groups, singletons, and cyclic SCCs.
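The burst rule reduces to simple arithmetic. A sketch (illustrative only — the function name and signature are made up; the 80% and 150% figures are from the paragraph above):

```python
# Toy sketch of the per-database burst rule: a database may exceed its
# quota by 50% while the cluster is below 80% worker utilization.

def allowed_workers(quota, active_cluster_workers, max_workers):
    spare = active_cluster_workers < 0.8 * max_workers
    return int(quota * 1.5) if spare else quota

print(allowed_workers(4, 2, 8))   # spare capacity -> burst to 6
print(allowed_workers(4, 7, 8))   # loaded cluster -> base quota of 4
```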

See CONFIGURATION.md for full quota tuning options.


Clean Up

When you're done experimenting, drop the stream tables. Drop dependents before their sources:

SELECT pgtrickle.drop_stream_table('department_report');
SELECT pgtrickle.drop_stream_table('department_stats');
SELECT pgtrickle.drop_stream_table('department_tree');

DROP TABLE employees;
DROP TABLE departments;

drop_stream_table atomically removes in a single transaction:

  • The storage table (e.g., public.department_stats)
  • CDC triggers on source tables (removed only if no other stream table references the same source)
  • Change buffer tables in pgtrickle_changes
  • Catalog entries in pgtrickle.pgt_stream_tables


Monitoring Quick Reference

pg_trickle ships several built-in monitoring functions and a ready-made Prometheus/Grafana stack. Here are the five most useful functions for day-to-day operations.

Stream Table Status

-- Overview of all stream tables: status, staleness, last refresh time, errors
SELECT name, status, staleness, last_refresh_at, last_error
FROM pgtrickle.pgt_status();

Health Check

-- Run all built-in health checks; returns severity (OK/WARNING/CRITICAL) per check
SELECT check_name, severity, detail FROM pgtrickle.health_check();

Change Buffer Sizes

-- Show CDC buffer row counts per source table — useful for spotting backlogs
SELECT * FROM pgtrickle.change_buffer_sizes();

Dependency Tree

-- Visualize the DAG: which stream tables depend on what
SELECT * FROM pgtrickle.dependency_tree();

Fuse Status

-- Check circuit breaker state for all stream tables (v0.11.0+)
SELECT * FROM pgtrickle.fuse_status();

Prometheus & Grafana

For production monitoring, pg_trickle ships a ready-made observability stack in the monitoring/ directory:

cd monitoring && docker compose up

This starts PostgreSQL + postgres_exporter + Prometheus + Grafana with pre-configured dashboards and alerting rules. Grafana is available at http://localhost:3000 (admin/admin). See monitoring/README.md for the full list of exported metrics and alert conditions.

Key Prometheus metrics:

| Metric | Description |
|--------|-------------|
| pgtrickle_refresh_total | Cumulative refresh count per table |
| pgtrickle_refresh_duration_seconds | Last refresh duration per table |
| pgtrickle_staleness_seconds | Seconds since last successful refresh |
| pgtrickle_consecutive_errors | Current error streak per table |
| pgtrickle_cdc_buffer_rows | Pending change buffer rows per source table |

Pre-configured alerts: staleness > 5 min, ≥3 consecutive failures, table SUSPENDED, CDC buffer > 1 GB, scheduler down, high refresh duration.


Summary: What You Learned

| Concept | What you saw |
|---------|--------------|
| Stream tables | Tables defined by a SQL query that stay automatically up to date |
| CDC triggers | Lightweight change capture in the same transaction — no logical replication or polling required |
| DAG scheduling | Stream tables can depend on other stream tables; refreshes run in topological order, schedules propagate upstream via CALCULATED mode |
| Algebraic IVM | Delta queries that process only changed rows — O(changes) regardless of table size |
| Semi-naive / DRed | Incremental strategies for WITH RECURSIVE — INSERT uses semi-naive, DELETE/UPDATE uses Delete-and-Rederive (v0.10.0+) |
| IMMEDIATE mode | Synchronous in-transaction IVM — stream tables updated within the same transaction as your DML, always consistent |
| TopK | ORDER BY … LIMIT N queries store exactly N rows, refreshed via scoped recomputation |
| Diamond consistency | Atomic refresh groups for diamond-shaped dependency graphs via diamond_consistency = 'atomic' |
| Downstream propagation | A single base table write cascades through an entire chain of stream tables, automatically, in the right order |
| Trigger-based CDC | Lightweight row-level triggers by default (no WAL configuration needed); optional transition to WAL-based capture via pg_trickle.cdc_mode = 'auto' |
| Parallel refresh | Independent stream tables refresh concurrently in dynamic background workers via pg_trickle.parallel_refresh_mode = 'on' (v0.4.0+, default off) |
| auto_backoff | Scheduler automatically stretches the effective interval when refresh cost exceeds 95% of the schedule window, capped at 8× (on by default, v0.10.0+) |
| PgBouncer compatibility | Set pooler_compatibility_mode => true per stream table to work behind transaction-mode connection poolers (v0.10.0+) |
| Monitoring | pgt_status(), health_check(), dependency_tree(), pg_stat_stream_tables, and more for freshness, timing, and error history |

The key takeaway: you write to base tables — pg_trickle does the rest. Data flows downstream automatically, each layer doing the minimum work proportional to what changed, in dependency order.


Troubleshooting

Stream table is stale / not refreshing

Check the status view first:

SELECT name, status, last_error, last_refresh_at, staleness FROM pgtrickle.pgt_status();

A status of ERROR means the last refresh failed. last_error contains the message. Fix the underlying issue (e.g., a dropped column referenced in the query) then call:

SELECT pgtrickle.refresh_stream_table('your_table');

For a broader health check:

SELECT check_name, severity, detail FROM pgtrickle.health_check();

Change buffer growing large

If a stream table has status = 'PAUSED' or refreshes are falling behind:

SELECT * FROM pgtrickle.change_buffer_sizes();  -- find large buffers

Large buffers are normal under heavy load — auto_backoff slows the schedule to avoid CPU runaway and self-corrects once throughput stabilizes. If a buffer stays large indefinitely, check last_error in pgt_status() for a blocked refresh.

CDC triggers missing after restore / point-in-time recovery

PITR restores the heap table but not the triggers if the extension was installed after the base backup. Verify:

SELECT * FROM pgtrickle.trigger_inventory();  -- expected vs installed triggers

Any missing trigger can be reinstalled with:

SELECT pgtrickle.repair_stream_table('your_table');

Deployment Best Practices

Once you've built your stream tables interactively, you'll want to deploy them reliably — via SQL migration scripts, dbt, or GitOps pipelines.

Kubernetes Deployment (CloudNativePG)

pg_trickle integrates natively with CloudNativePG using Image Volume Extensions (Kubernetes 1.33+). The extension is packaged as a scratch-based OCI image containing only the .so, .control, and .sql files — no custom PostgreSQL image required.

Prerequisites

  • Kubernetes 1.33+ with the ImageVolume feature gate enabled
  • CloudNativePG operator 1.28+
  • pg_trickle extension image pushed to your cluster registry

Quick Start

  1. Deploy the Cluster with the extension mounted as an Image Volume:
# cnpg/cluster-example.yaml (abridged)
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-trickle-demo
spec:
  instances: 3
  imageName: ghcr.io/cloudnative-pg/postgresql:18
  postgresql:
    shared_preload_libraries:
      - pg_trickle
    extensions:
      - name: pg-trickle
        image:
          reference: ghcr.io/<owner>/pg_trickle-ext:<version>
    parameters:
      max_worker_processes: "8"
  1. Create the extension declaratively with a CNPG Database resource:
# cnpg/database-example.yaml
apiVersion: postgresql.cnpg.io/v1
kind: Database
metadata:
  name: pg-trickle-app
spec:
  name: app
  owner: app
  cluster:
    name: pg-trickle-demo
  extensions:
    - name: pg_trickle
  1. Apply both resources:
kubectl apply -f cnpg/cluster-example.yaml
kubectl apply -f cnpg/database-example.yaml

Full example manifests are in the cnpg/ directory.

Health Monitoring

CNPG manages PostgreSQL liveness/readiness probes via its instance manager. For pg_trickle-specific health, use the built-in health check function:

-- Run against the primary or any replica:
SELECT * FROM pgtrickle.health_check();

This returns rows for scheduler status, error/suspended tables, stale tables, CDC buffer growth, WAL slot lag, and worker pool utilization. Integrate it into your monitoring stack:

  • Prometheus: Use the CNPG monitoring integration to expose pgtrickle.health_check() results as custom metrics
  • Kubernetes CronJob: Schedule periodic health checks and alert via your existing alerting pipeline
  • pgtrickle-tui: The TUI tool has a dedicated Health view that polls health_check() continuously

Probe Configuration

The example manifests include probe settings tuned for pg_trickle workloads:

probes:
  startup:
    periodSeconds: 10
    failureThreshold: 60     # 10 min for shared_preload_libraries init
  liveness:
    periodSeconds: 10
    failureThreshold: 6      # 60s before restart
  readiness:
    type: streaming
    maximumLag: 64Mi         # replicas must be streaming before serving reads

Why readiness: streaming? Stream tables are readable on replicas, but a lagging replica serves stale stream table data. The maximumLag setting ensures replicas are caught up before receiving traffic.

Failover Behavior

When the primary pod fails and CNPG promotes a replica:

  • Scheduler: The new primary starts the pg_trickle scheduler background worker automatically (registered via shared_preload_libraries)
  • Stream tables: All stream table definitions are stored in the pgtrickle.pgt_stream_tables catalog table, which is replicated to all replicas. The promoted replica has the complete catalog.
  • CDC triggers: Trigger definitions are replicated as part of the WAL stream. The new primary's triggers fire normally on new writes.
  • Change buffers: Uncommitted change buffer rows from in-flight transactions on the old primary are lost (standard PostgreSQL behavior). The next refresh cycle detects the gap and performs a FULL refresh to resynchronize.
  • Refresh frontiers: Each stream table's last-refresh frontier is stored in the catalog. If the frontier is ahead of the available change buffer data (due to WAL replay lag), the scheduler falls back to FULL refresh once and then resumes DIFFERENTIAL.
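The frontier fallback decision in the last bullet can be sketched as a one-line comparison (illustrative only — positions are modeled as plain integers, and the function name is made up):

```python
# Toy sketch of the post-failover refresh decision: if the stored frontier
# is older than the oldest change the buffer can still replay, there is a
# gap, so fall back to one FULL refresh before resuming DIFFERENTIAL.

def choose_refresh(frontier, buffer_oldest_available):
    if buffer_oldest_available is None or frontier < buffer_oldest_available:
        return "FULL"          # gap detected: resynchronize from scratch
    return "DIFFERENTIAL"      # buffer covers the frontier: delta refresh

print(choose_refresh(100, 120))   # FULL
print(choose_refresh(100, 90))    # DIFFERENTIAL
```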

No manual intervention is required after failover.

Idempotent SQL Migrations

Use create_or_replace_stream_table() in your migration scripts. It's safe to run on every deploy:

-- migrations/V003__stream_tables.sql
-- Creates if absent, updates if definition changed, no-op if identical.

SELECT pgtrickle.create_or_replace_stream_table(
    name         => 'employee_salaries',
    query        => 'SELECT e.id, e.name, d.name AS department, e.salary
                     FROM employees e JOIN departments d ON e.department_id = d.id',
    schedule     => '30s',
    refresh_mode => 'DIFFERENTIAL'
);

SELECT pgtrickle.create_or_replace_stream_table(
    name         => 'department_stats',
    query        => 'SELECT department, COUNT(*) AS headcount, AVG(salary) AS avg_salary
                     FROM employee_salaries GROUP BY department',
    schedule     => '30s',
    refresh_mode => 'DIFFERENTIAL'
);

If someone changes the query in a later migration, create_or_replace detects the difference and migrates the storage table in place — no need to drop and recreate.

dbt Integration

With the dbt-pgtrickle package, stream tables are just dbt models with materialized='stream_table':

-- models/department_stats.sql
{{ config(
    materialized='stream_table',
    schedule='30s',
    refresh_mode='DIFFERENTIAL'
) }}

SELECT department, COUNT(*) AS headcount, AVG(salary) AS avg_salary
FROM {{ ref('employee_salaries') }}
GROUP BY department

Every dbt run calls create_or_replace_stream_table() under the hood, so deployments are always idempotent.


Day 2 Operations

Added in v0.20.0 (UX-4).

Once your stream tables are running in production, pg_trickle can monitor itself using its own stream tables — a technique called dog-feeding.

Enabling Dog-Feeding

-- Create all five monitoring stream tables (idempotent, safe to repeat).
SELECT pgtrickle.setup_dog_feeding();

-- Check what was created.
SELECT * FROM pgtrickle.dog_feeding_status();

This creates five stream tables in the pgtrickle schema:

| Stream Table | Purpose |
|--------------|---------|
| df_efficiency_rolling | Rolling-window refresh statistics (replaces manual refresh_efficiency() calls) |
| df_anomaly_signals | Detects duration spikes, error bursts, mode oscillation |
| df_threshold_advice | Recommends threshold adjustments based on multi-cycle analysis |
| df_cdc_buffer_trends | Tracks CDC buffer growth rates per source table |
| df_scheduling_interference | Detects concurrent refresh overlap patterns |

Checking Recommendations

After at least 10–20 refresh cycles have accumulated:

-- Which stream tables have poorly calibrated thresholds?
SELECT pgt_name, current_threshold, recommended_threshold, confidence, reason
FROM pgtrickle.df_threshold_advice
WHERE confidence IN ('HIGH', 'MEDIUM')
  AND abs(recommended_threshold - current_threshold) > 0.05;

-- Are any stream tables experiencing anomalies?
SELECT pgt_name, duration_anomaly, recent_failures
FROM pgtrickle.df_anomaly_signals
WHERE duration_anomaly IS NOT NULL OR recent_failures >= 2;

Automatic Threshold Tuning

To let pg_trickle automatically apply threshold recommendations:

SET pg_trickle.dog_feeding_auto_apply = 'threshold_only';

This applies changes only when confidence is HIGH and the recommended threshold differs by more than 5%. Changes are rate-limited to once per 10 minutes per stream table and logged with initiated_by = 'DOG_FEED'.
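The gate can be sketched as a three-condition check (illustrative only — the function is made up, and the 5% is treated as an absolute threshold difference for the sketch):

```python
# Toy sketch of the dog-feeding auto-apply gate: HIGH confidence, a >5%
# threshold difference, and at most one change per 10 minutes per table.

def should_apply(confidence, current, recommended, minutes_since_last):
    return (confidence == "HIGH"
            and abs(recommended - current) > 0.05
            and minutes_since_last >= 10)

print(should_apply("HIGH", 0.30, 0.40, 30))    # True
print(should_apply("MEDIUM", 0.30, 0.40, 30))  # False: confidence too low
print(should_apply("HIGH", 0.30, 0.33, 30))    # False: within 5%
print(should_apply("HIGH", 0.30, 0.40, 5))     # False: rate-limited
```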

Visualizing the DAG

-- See the full refresh graph (Mermaid format, paste into any Mermaid renderer).
SELECT pgtrickle.explain_dag();

Dog-feeding stream tables appear in green, user stream tables in blue, and suspended ones in red.

Disabling Dog-Feeding

SELECT pgtrickle.teardown_dog_feeding();

This drops all monitoring stream tables. User stream tables are never affected. The control plane continues operating identically without dog-feeding.


What's Next?

Playground

The quickest way to explore pg_trickle is the playground — a pre-configured Docker environment with sample data and stream tables ready to query. No installation, no configuration. One command and you're running.

Quick Start

git clone https://github.com/grove/pg-trickle.git
cd pg-trickle/playground
docker compose up -d

Then connect:

psql postgresql://postgres:playground@localhost:5432/playground

PostgreSQL 18+ note: The Docker image stores data in a versioned subdirectory (/var/lib/postgresql/18/main). The compose file mounts /var/lib/postgresql (not .../data) — this is intentional.


What's Pre-Loaded

The seed script creates three base tables and five stream tables that cover the most common pg_trickle patterns.

Base Tables

| Table | Description |
|-------|-------------|
| products | Product catalog with categories and prices |
| orders | Order line items with quantities and timestamps |
| customers | Customer profiles with regions |

Stream Tables

| Stream Table | Query | Pattern demonstrated |
|--------------|-------|----------------------|
| sales_by_region | SUM(total) grouped by region | Basic aggregate, DIFFERENTIAL mode |
| top_products | SUM(quantity) ranked by category | Window function (RANK()) |
| customer_lifetime_value | Revenue + order count per customer | Multi-table join + aggregates |
| daily_revenue | Revenue per day | Time-series aggregation |
| active_products | Products with orders | EXISTS subquery |

Exercises

1. Watch an INSERT propagate

-- Current state
SELECT * FROM sales_by_region ORDER BY region;

-- Insert a new order
INSERT INTO orders (customer_id, product_id, quantity, order_date)
VALUES (1, 1, 10, CURRENT_DATE);

-- After ~1 s the stream table refreshes
SELECT * FROM sales_by_region ORDER BY region;

2. Inspect pg_trickle internals

-- Overall health
SELECT * FROM pgtrickle.health_check();

-- Status of all stream tables
SELECT name, status, refresh_mode, staleness
FROM pgtrickle.pgt_status()
ORDER BY name;

-- Recent refresh activity
SELECT start_time, stream_table, action, status, duration_ms
FROM pgtrickle.refresh_timeline(10);

-- Delta SQL for a stream table
SELECT pgtrickle.explain_st('sales_by_region');

-- Change buffer sizes
SELECT * FROM pgtrickle.change_buffer_sizes();

3. Update and Delete

-- Update a product price
UPDATE products SET price = 99.99 WHERE name = 'Widget';

-- customer_lifetime_value re-calculates
SELECT * FROM customer_lifetime_value ORDER BY total_revenue DESC LIMIT 5;

-- Delete a customer's orders
DELETE FROM orders WHERE customer_id = 3;

-- Stream tables reflect the removal
SELECT * FROM sales_by_region ORDER BY region;

4. Create your own stream table

SELECT pgtrickle.create_stream_table(
    name     => 'my_experiment',
    query    => $$
        SELECT p.category,
               COUNT(DISTINCT o.customer_id) AS unique_buyers,
               SUM(o.quantity)               AS total_units
        FROM orders o
        JOIN products p ON p.id = o.product_id
        GROUP BY p.category
        HAVING SUM(o.quantity) > 5
    $$,
    schedule => '2s'
);

SELECT * FROM my_experiment;

Tear Down

docker compose down -v

The -v flag removes the data volume. Omit it if you want to keep your changes.


Next Steps

Best-Practice Patterns for pg_trickle

This guide covers common data modeling patterns and recommended configurations for pg_trickle stream tables. Each pattern includes worked SQL examples, anti-patterns to avoid, and refresh mode recommendations.

Version: v0.14.0+. Some features require recent versions — check SQL_REFERENCE.md for per-feature availability.


Pattern 1: Bronze / Silver / Gold Materialization

A multi-layer approach where raw data flows through progressively refined stream tables, similar to a medallion architecture.

Architecture

  [raw_events]          ← Bronze: raw ingest table (regular table)
       ↓
  [events_cleaned]      ← Silver: filtered, deduplicated, typed
       ↓
  [events_aggregated]   ← Gold: business-level aggregates

SQL Example

-- Bronze: regular PostgreSQL table (source of truth)
CREATE TABLE raw_events (
    event_id    BIGSERIAL PRIMARY KEY,
    user_id     INT NOT NULL,
    event_type  TEXT NOT NULL,
    payload     JSONB,
    received_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Silver: cleaned and deduplicated events
SELECT pgtrickle.create_stream_table(
    'events_cleaned',
    $$SELECT DISTINCT ON (event_id)
        event_id,
        user_id,
        event_type,
        (payload->>'amount')::numeric AS amount,
        received_at
      FROM raw_events
      WHERE event_type IN ('purchase', 'refund', 'subscription')$$,
    schedule => '5s',
    refresh_mode => 'DIFFERENTIAL'
);

-- Gold: per-user purchase summary
SELECT pgtrickle.create_stream_table(
    'user_purchase_summary',
    $$SELECT user_id,
             COUNT(*) AS total_purchases,
             SUM(amount) AS total_spent,
             AVG(amount) AS avg_order
      FROM events_cleaned
      WHERE event_type = 'purchase'
      GROUP BY user_id$$,
    schedule => 'calculated',
    refresh_mode => 'DIFFERENTIAL'
);

| Layer | Refresh Mode | Schedule | Tier |
|-------|--------------|----------|------|
| Silver | DIFFERENTIAL | 5s – 30s | hot |
| Gold | DIFFERENTIAL | calculated | hot |

Anti-Patterns

  • Don't use FULL refresh for Silver. With frequent small inserts, DIFFERENTIAL is 10–100x faster.
  • Don't skip the Silver layer. Joining raw tables directly in Gold queries produces wider joins and slower deltas.
  • Don't use IMMEDIATE mode for Gold. Aggregate maintenance on every DML row is expensive — batched DIFFERENTIAL is more efficient.
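
To confirm the Silver layer is actually benefiting from DIFFERENTIAL refresh, you can check the measured speedup with pgtrickle.refresh_efficiency() (the same view used in the Monitoring Checklist later in this guide; filtering by the layer names above is illustrative):

```sql
-- Sketch: compare differential speedup for the medallion layers.
SELECT pgt_name, refresh_mode, diff_speedup, avg_change_ratio
FROM pgtrickle.refresh_efficiency()
WHERE pgt_name IN ('events_cleaned', 'user_purchase_summary');
```

A diff_speedup well above 1 confirms DIFFERENTIAL is paying off; values near 1 suggest the change ratio per cycle is too high for incremental maintenance to help.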

Pattern 2: Event Sourcing with Stream Tables

Use stream tables as projections of an append-only event log. The source table is the event store; stream tables materialize different read models.

SQL Example

-- Event store (append-only source)
CREATE TABLE events (
    event_id    BIGSERIAL PRIMARY KEY,
    aggregate_id UUID NOT NULL,
    event_type   TEXT NOT NULL,
    payload      JSONB NOT NULL,
    created_at   TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Projection 1: Current state per aggregate
SELECT pgtrickle.create_stream_table(
    'aggregate_state',
    $$SELECT DISTINCT ON (aggregate_id)
        aggregate_id,
        event_type AS last_event,
        payload AS current_state,
        created_at AS last_updated
      FROM events
      ORDER BY aggregate_id, created_at DESC$$,
    schedule => '2s',
    refresh_mode => 'DIFFERENTIAL'
);

-- Projection 2: Event counts by type per hour
SELECT pgtrickle.create_stream_table(
    'hourly_event_counts',
    $$SELECT date_trunc('hour', created_at) AS hour,
             event_type,
             COUNT(*) AS event_count
      FROM events
      GROUP BY 1, 2$$,
    schedule => '10s',
    refresh_mode => 'DIFFERENTIAL'
);

| Projection | Refresh Mode | Why |
|------------|--------------|-----|
| Current state | DIFFERENTIAL | Small delta per cycle; DISTINCT ON supported |
| Hourly counts | DIFFERENTIAL | Algebraic aggregate (COUNT), efficient delta |
| String aggregations | AUTO | GROUP_RESCAN aggs may benefit from FULL |

Anti-Patterns

  • Don't DELETE from the event store. pg_trickle tracks changes via triggers; mixing append and delete on the source creates unnecessary delta complexity. Archive old events to a separate table.
  • Don't use append_only => true with UPDATE/DELETE patterns. The append_only flag skips DELETE tracking in the change buffer — only use it when the source truly never updates or deletes.
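
One way to retire old events without issuing DELETEs against the tracked source is native range partitioning: detach old partitions instead of deleting rows. A minimal sketch — the partitioned layout is an assumption on top of the event store above, and you should verify how your pg_trickle version's CDC handles a partition detach before relying on it:

```sql
-- Sketch: range-partition the event store by month so old data can be
-- detached (and archived) without DELETE traffic on the tracked source.
CREATE TABLE events_partitioned (
    event_id     BIGINT GENERATED ALWAYS AS IDENTITY,
    aggregate_id UUID NOT NULL,
    event_type   TEXT NOT NULL,
    payload      JSONB NOT NULL,
    created_at   TIMESTAMPTZ NOT NULL DEFAULT now()
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2025_01 PARTITION OF events_partitioned
    FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');

-- Later: retire a whole month without row deletes.
ALTER TABLE events_partitioned DETACH PARTITION events_2025_01;
```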

Pattern 3: Slowly Changing Dimensions (SCD)

SCD Type 1: Overwrite

The stream table always reflects the current state. Source updates overwrite previous values.

-- Source: customer dimension table (updated in place)
CREATE TABLE customers (
    customer_id INT PRIMARY KEY,
    name        TEXT NOT NULL,
    email       TEXT,
    tier        TEXT DEFAULT 'standard',
    updated_at  TIMESTAMPTZ DEFAULT now()
);

-- SCD-1: current customer state enriched with order stats
SELECT pgtrickle.create_stream_table(
    'customer_360',
    $$SELECT c.customer_id,
             c.name,
             c.email,
             c.tier,
             COUNT(o.id) AS total_orders,
             COALESCE(SUM(o.amount), 0) AS lifetime_value
      FROM customers c
      LEFT JOIN orders o ON o.customer_id = c.customer_id
      GROUP BY c.customer_id, c.name, c.email, c.tier$$,
    schedule => '30s',
    refresh_mode => 'DIFFERENTIAL'
);

SCD Type 2: History Tracking

For SCD-2, maintain a history table with valid-from/valid-to ranges. The stream table provides the current snapshot.

-- Source: customer history with validity ranges
CREATE TABLE customer_history (
    customer_id INT NOT NULL,
    name        TEXT NOT NULL,
    tier        TEXT NOT NULL,
    valid_from  TIMESTAMPTZ NOT NULL,
    valid_to    TIMESTAMPTZ,  -- NULL = current
    PRIMARY KEY (customer_id, valid_from)
);

-- Current active records only
SELECT pgtrickle.create_stream_table(
    'customers_current',
    $$SELECT customer_id, name, tier, valid_from
      FROM customer_history
      WHERE valid_to IS NULL$$,
    schedule => '10s',
    refresh_mode => 'DIFFERENTIAL'
);
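
The SCD-2 write path — closing the current row and opening a new version — is plain SQL on the source table; pg_trickle sees it as one UPDATE plus one INSERT. A sketch against the customer_history table above (customer id and values are illustrative):

```sql
-- Record a tier change for customer 42 as a new SCD-2 version.
BEGIN;

UPDATE customer_history
SET    valid_to = now()
WHERE  customer_id = 42
  AND  valid_to IS NULL;

INSERT INTO customer_history (customer_id, name, tier, valid_from, valid_to)
VALUES (42, 'Alice Example', 'premium', now(), NULL);

COMMIT;
```

On the next refresh cycle, customers_current drops the closed-out row and picks up the new version.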

Anti-Patterns

  • Don't use FULL refresh for SCD-1 with large dimension tables. Customer tables with millions of rows but few changes per cycle are ideal for DIFFERENTIAL.
  • Don't forget to index valid_to IS NULL for SCD-2 sources. Without it, the delta scan touches all historical rows.
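
The index the second anti-pattern calls for is a standard PostgreSQL partial index:

```sql
-- Only current rows (valid_to IS NULL) are indexed, so delta scans
-- for customers_current skip all historical versions.
CREATE INDEX customer_history_current_idx
    ON customer_history (customer_id)
    WHERE valid_to IS NULL;
```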

Pattern 4: High-Fan-Out Topology

When a single source table feeds many downstream stream tables.

Architecture

                    [orders]
                   ↙  ↓  ↓  ↘
  [daily_totals] [by_region] [by_product] [top_customers]

SQL Example

-- Single source feeding multiple views
CREATE TABLE orders (
    id          SERIAL PRIMARY KEY,
    customer_id INT NOT NULL,
    region      TEXT NOT NULL,
    product_id  INT NOT NULL,
    amount      NUMERIC(10,2) NOT NULL,
    order_date  DATE NOT NULL DEFAULT CURRENT_DATE
);

-- Fan-out: 4 stream tables on 1 source
SELECT pgtrickle.create_stream_table('daily_totals',
    'SELECT order_date, SUM(amount) AS daily_total, COUNT(*) AS order_count
     FROM orders GROUP BY order_date',
    schedule => '5s', refresh_mode => 'DIFFERENTIAL');

SELECT pgtrickle.create_stream_table('by_region',
    'SELECT region, SUM(amount) AS total, COUNT(*) AS cnt
     FROM orders GROUP BY region',
    schedule => '5s', refresh_mode => 'DIFFERENTIAL');

SELECT pgtrickle.create_stream_table('by_product',
    'SELECT product_id, SUM(amount) AS total, COUNT(*) AS cnt
     FROM orders GROUP BY product_id',
    schedule => '5s', refresh_mode => 'DIFFERENTIAL');

SELECT pgtrickle.create_stream_table('top_customers',
    'SELECT customer_id, SUM(amount) AS lifetime_value, COUNT(*) AS order_count
     FROM orders GROUP BY customer_id',
    schedule => '10s', refresh_mode => 'DIFFERENTIAL');

  • All fan-out targets share the same source change buffer — CDC overhead is paid once regardless of how many stream tables read from orders.
  • Use schedule => 'calculated' on downstream STs when they chain from other stream tables.
  • Consider raising pg_trickle.max_workers if fan-out exceeds 8 (default: 4 workers).
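
Raising the worker pool for wide fan-out is a postgresql.conf change. Whether a restart is required for this setting may vary by version, since the workers are background processes — check CONFIGURATION.md:

```
# postgresql.conf — sketch for a fan-out wider than the default 4 workers
pg_trickle.max_workers = 8
```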

Anti-Patterns

  • Don't use IMMEDIATE mode on high-fan-out sources. Each DML row triggers N refreshes (one per downstream ST). Use DIFFERENTIAL with a batched schedule instead.
  • Don't set different schedules on STs that should be consistent. If daily_totals and by_region must agree, give them the same schedule or use diamond_consistency => 'atomic'.
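
When two branches must agree, the atomic option can be set at creation time on a convergence node that reads both. A sketch reusing the fan-out STs from this pattern (the query itself is illustrative — confirm it uses operators your version supports):

```sql
-- diamond_consistency => 'atomic' refreshes the whole diamond group in one
-- SAVEPOINT-protected step, so readers never observe the branches out of sync.
SELECT pgtrickle.create_stream_table(
    name                => 'region_daily_check',
    query               => $$SELECT d.order_date, d.daily_total, r.total AS region_total
                             FROM daily_totals d
                             JOIN by_region r ON r.region = 'EMEA'$$,
    schedule            => 'calculated',
    diamond_consistency => 'atomic'
);
```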

Pattern 5: Real-Time Dashboards

For dashboards that need sub-second refresh latency.

SQL Example

-- Live order monitor (sub-second freshness)
SELECT pgtrickle.create_stream_table(
    'order_monitor',
    $$SELECT
        date_trunc('minute', order_date) AS minute,
        region,
        COUNT(*) AS orders,
        SUM(amount) AS revenue
      FROM orders
      WHERE order_date >= CURRENT_DATE
      GROUP BY 1, 2$$,
    schedule => '1s',
    refresh_mode => 'DIFFERENTIAL'
);

-- For truly real-time needs, use IMMEDIATE mode (triggers on each DML)
SELECT pgtrickle.create_stream_table(
    'live_counter',
    $$SELECT region, COUNT(*) AS cnt, SUM(amount) AS total
      FROM orders GROUP BY region$$,
    schedule => 'IMMEDIATE',
    refresh_mode => 'DIFFERENTIAL'
);

When to Use IMMEDIATE vs Scheduled DIFFERENTIAL

| Scenario | Mode | Why |
|----------|------|-----|
| Dashboard polls every 1s | 1s | Batched delta amortizes overhead |
| GraphQL subscription, < 100ms | IMMEDIATE | Triggers fire synchronously per DML |
| Aggregate with GROUP_RESCAN | 5s+ | Avoid per-row full rescans |
| High write throughput (>1K/s) | 2s–5s | IMMEDIATE adds latency to each INSERT |

Anti-Patterns

  • Don't use IMMEDIATE for complex joins. Each INSERT/UPDATE/DELETE fires the full DVM delta SQL synchronously — multi-table joins in IMMEDIATE mode add significant latency to writes.
  • Don't forget pooler_compatibility_mode with PgBouncer. Transaction pooling drops session state (temp tables, prepared statements) between transactions; enable this flag to avoid stale prepared statements.

Pattern 6: Tiered Refresh Strategy

Assign refresh importance tiers to control scheduling priority.

-- Hot: real-time operational dashboard
SELECT pgtrickle.create_stream_table('live_metrics', ...);
SELECT pgtrickle.alter_stream_table('live_metrics', tier => 'hot');

-- Warm: hourly business reports (2x interval multiplier)
SELECT pgtrickle.create_stream_table('hourly_report', ...,
    schedule => '1m');
SELECT pgtrickle.alter_stream_table('hourly_report', tier => 'warm');

-- Cold: daily analytics (10x interval multiplier)
SELECT pgtrickle.create_stream_table('daily_analytics', ...,
    schedule => '5m');
SELECT pgtrickle.alter_stream_table('daily_analytics', tier => 'cold');

-- Frozen: archive/audit (skip refresh entirely)
SELECT pgtrickle.alter_stream_table('audit_log_summary', tier => 'frozen');

Tier Multipliers

| Tier | Schedule Multiplier | Use Case |
|------|---------------------|----------|
| hot | 1x | Operational dashboards, alerts |
| warm | 2x | Hourly reports, batch pipelines |
| cold | 10x | Daily analytics, low-priority STs |
| frozen | skip | Paused/archived, manual refresh |
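
Frozen STs skip the scheduler entirely; refresh them on demand with the manual refresh function used elsewhere in this guide:

```sql
-- A frozen ST refreshes only when explicitly asked.
SELECT pgtrickle.refresh_stream_table('audit_log_summary');
```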

General Guidelines

Choosing a Refresh Mode

| Scenario | Recommended Mode |
|----------|------------------|
| Source has < 5% change ratio per cycle | DIFFERENTIAL |
| Source changes > 50% per cycle | FULL |
| Query is a simple filter/projection | DIFFERENTIAL |
| Query has GROUP_RESCAN aggregates (MIN, MAX) | AUTO |
| Query joins 4+ tables | DIFFERENTIAL |
| Target table < 1000 rows | FULL |
| Need per-row latency guarantee | IMMEDIATE |

Use pgtrickle.recommend_refresh_mode() (v0.14.0+) for automated analysis:

SELECT pgt_name, recommended_mode, confidence, reason
FROM pgtrickle.recommend_refresh_mode();

Monitoring Checklist

-- Check refresh efficiency across all stream tables
SELECT pgt_name, refresh_mode, diff_speedup, avg_change_ratio
FROM pgtrickle.refresh_efficiency()
ORDER BY total_refreshes DESC;

-- Find stream tables that might benefit from mode change
SELECT pgt_name, current_mode, recommended_mode, reason
FROM pgtrickle.recommend_refresh_mode()
WHERE recommended_mode != 'KEEP';

-- Check for error states
SELECT pgt_name, status, last_error_message
FROM pgtrickle.stream_tables_info
WHERE status IN ('ERROR', 'SUSPENDED');

-- Export definitions for backup
SELECT pgtrickle.export_definition(pgt_schema || '.' || pgt_name)
FROM pgtrickle.pgt_stream_tables;

Common Mistakes

  1. Using FULL refresh by default. Start with DIFFERENTIAL — it's correct for 80%+ of workloads. Switch to FULL only when recommend_refresh_mode() suggests it.

  2. Over-scheduling. A 1-second schedule on a table with 1-hour change cycles wastes CPU. Match the schedule to actual data arrival rate.

  3. Ignoring append_only. If the source table is truly append-only (no UPDATEs, no DELETEs), set append_only => true to halve change buffer writes.

  4. Not using calculated schedule for chained STs. When ST-B reads from ST-A, use schedule => 'calculated' on ST-B to avoid unnecessary refreshes. The scheduler automatically propagates ST-A changes downstream.

  5. Mixing IMMEDIATE and complex joins. IMMEDIATE mode fires delta SQL on every DML — an 8-table join in IMMEDIATE mode adds 50–200ms to each INSERT. Use scheduled DIFFERENTIAL for complex queries.
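
Mistake 3 above (ignoring append_only) is a one-line fix at creation time for genuinely insert-only sources. A sketch reusing the raw_events table from Pattern 1:

```sql
-- An insert-only event feed: append_only halves change-buffer writes
-- and uses the fast INSERT path instead of MERGE on refresh.
SELECT pgtrickle.create_stream_table(
    name        => 'events_per_minute',
    query       => $$SELECT date_trunc('minute', received_at) AS minute, COUNT(*) AS n
                     FROM raw_events GROUP BY 1$$,
    schedule    => '10s',
    append_only => true
);
```

If an UPDATE or DELETE later appears in the change buffer, pg_trickle reverts the flag automatically (see the create_stream_table reference).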

Pre-Deployment Checklist

Complete this checklist before deploying pg_trickle to a new environment. Each item links to the relevant documentation for details.

Version: v0.14.0+. Earlier versions may have different requirements.


1. PostgreSQL Version

  • PostgreSQL 18.x is required (pg_trickle is compiled against PG 18)
  • Extension binary matches your exact PostgreSQL major version
SELECT version();  -- Must show PostgreSQL 18.x

2. shared_preload_libraries

pg_trickle must be loaded at server startup via shared_preload_libraries. Without this, GUC variables and the background scheduler are not available.

# postgresql.conf
shared_preload_libraries = 'pg_trickle'
  • shared_preload_libraries includes pg_trickle
  • PostgreSQL has been restarted after changing this setting (reload is not sufficient)
SHOW shared_preload_libraries;  -- Must include pg_trickle

Managed PostgreSQL: Some providers (Supabase, Neon) do not support custom shared_preload_libraries. Check your provider's extension compatibility list. AWS RDS and Google Cloud SQL support custom shared libraries via parameter groups.


3. WAL Level and CDC Mode

pg_trickle works without wal_level = logical — it uses trigger-based CDC by default. However, WAL-based CDC provides lower overhead on write-heavy workloads.

# postgresql.conf (optional — for WAL-based CDC)
wal_level = logical
max_replication_slots = 10   # At least 1 per tracked source table
  • Decide: trigger-based CDC (default) or WAL-based CDC
  • If WAL: wal_level = logical and server restarted
  • If WAL: max_replication_slots is sufficient for your source table count

Note: CDC mode is configurable per stream table. The default cdc_mode = 'auto' starts with triggers and transitions to WAL automatically when wal_level = logical is detected. See CONFIGURATION.md for details.


4. Extension Installation

CREATE EXTENSION pg_trickle;

-- Verify installation
SELECT extname, extversion FROM pg_extension WHERE extname = 'pg_trickle';
  • Extension created successfully
  • Version matches expected release

5. Background Scheduler

The scheduler runs as a background worker and manages automatic refresh. Verify it's running:

SELECT pid, backend_type, state
FROM pg_stat_activity
WHERE backend_type = 'pg_trickle scheduler';
  • Scheduler process is visible in pg_stat_activity
  • pg_trickle.enabled = true (default; set to false to disable)

6. Connection Pooler Compatibility

PgBouncer (Transaction Mode)

PgBouncer in transaction pooling mode drops session state between transactions. pg_trickle needs special handling:

  • Enable pooler_compatibility_mode on affected stream tables:
SELECT pgtrickle.alter_stream_table('my_st',
    pooler_compatibility_mode => true);
  • Or set globally via GUC:
pg_trickle.pooler_compatibility_mode = true

PgBouncer (Session Mode)

Session mode preserves session state — no special configuration needed.

Supavisor / Other Poolers

Some poolers (Supavisor, pgcat) have their own compatibility characteristics. Test with pgtrickle.validate_query() before deploying.


7. Configuration (GUCs)

These are sensible defaults for most workloads. Adjust based on monitoring data.

# Core settings (usually fine as defaults)
pg_trickle.enabled = true                    # Enable scheduler
pg_trickle.schedule_interval = '5s'          # Global default refresh interval
pg_trickle.max_workers = 4                   # Parallel refresh workers

# Performance tuning
pg_trickle.planner_aggressive = true         # Enable MERGE planner hints
pg_trickle.tiered_scheduling = true          # Tier-aware scheduling

# CDC mode
pg_trickle.cdc_mode = 'auto'                # auto | trigger | wal

# Safety
pg_trickle.unlogged_buffers = false          # true = faster but not crash-safe
pg_trickle.fuse_default_ceiling = 10000      # Auto-fuse change threshold
  • Review GUC values for your workload
  • See CONFIGURATION.md for the full reference

8. Resource Planning

Memory

  • Each background worker uses a separate PostgreSQL backend
  • work_mem applies to each worker's delta SQL execution
  • Monitor RSS growth via pg_stat_activity or OS-level tools

Storage

  • Change buffer tables (pgtrickle_changes.changes_*) grow between refreshes
  • Buffer size depends on DML rate × refresh interval
  • Monitor via pgtrickle.shared_buffer_stats()
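
Buffer growth between refreshes can be sanity-checked directly with the two functions named in this guide. If buffered row counts keep climbing while refreshes succeed, the schedule is likely too slow for the DML rate:

```sql
-- Change-buffer occupancy per tracked source (columns vary by version).
SELECT * FROM pgtrickle.shared_buffer_stats();

-- Per-stream-table buffer sizes (also used in the playground exercises).
SELECT * FROM pgtrickle.change_buffer_sizes();
```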

Connections

  • The scheduler uses pg_trickle.max_workers backend connections
  • Ensure max_connections has headroom for workers + application
  • max_connections is at least application connections + pg_trickle.max_workers + 5


9. Monitoring Setup

Essential Queries

-- Stream table health overview
SELECT pgt_name, status, staleness, refresh_mode
FROM pgtrickle.stream_tables_info
ORDER BY staleness DESC NULLS LAST;

-- Refresh efficiency
SELECT pgt_name, diff_speedup, avg_change_ratio
FROM pgtrickle.refresh_efficiency();

-- Error states
SELECT pgt_name, status, last_error_message, last_error_at
FROM pgtrickle.pgt_stream_tables
WHERE status IN ('ERROR', 'SUSPENDED');

Grafana / Prometheus

See the monitoring/ directory for ready-to-use Grafana dashboards and Prometheus configuration.

  • Monitoring configured for stream table health
  • Alerting on ERROR/SUSPENDED status

10. Backup & Restore

pg_trickle stream tables are standard PostgreSQL tables and are included in pg_dump / pg_restore. See BACKUP_AND_RESTORE.md for details.

  • Backup strategy accounts for both source tables and stream tables
  • Restore procedure tested (stream tables may need re-initialization)

Quick Validation Script

Run this after deployment to verify everything is working:

-- 1. Extension loaded
SELECT extname, extversion FROM pg_extension WHERE extname = 'pg_trickle';

-- 2. Scheduler running
SELECT COUNT(*) > 0 AS scheduler_alive
FROM pg_stat_activity
WHERE backend_type = 'pg_trickle scheduler';

-- 3. Create a test stream table
CREATE TABLE _deploy_test_src (id INT PRIMARY KEY, val INT);
INSERT INTO _deploy_test_src VALUES (1, 100), (2, 200);

SELECT pgtrickle.create_stream_table(
    '_deploy_test_st',
    'SELECT id, val FROM _deploy_test_src',
    refresh_mode => 'FULL'
);

SELECT pgtrickle.refresh_stream_table('_deploy_test_st');

-- 4. Verify data
SELECT * FROM _deploy_test_st ORDER BY id;
-- Expected: (1, 100), (2, 200)

-- 5. Cleanup
SELECT pgtrickle.drop_stream_table('_deploy_test_st');
DROP TABLE _deploy_test_src;

Connection Pooler Compatibility

Added in v0.19.0 (UX-4 / STAB-1).

pg_trickle uses prepared statements and NOTIFY internally. These features require special handling when a connection pooler sits between the application and PostgreSQL.

PgBouncer Transaction Mode

In PgBouncer transaction pooling mode, each transaction may land on a different server-side connection. Prepared statements and LISTEN/NOTIFY do not survive across transactions.

Recommended configuration:

# postgresql.conf
pg_trickle.connection_pooler_mode = 'transaction'

This cluster-wide GUC:

  • Disables prepared-statement reuse for all stream tables.
  • Suppresses NOTIFY pg_trickle_refresh emissions (listeners on other connections will not receive them anyway in transaction mode).

Alternatively, enable pooler compatibility per stream table:

SELECT pgtrickle.alter_stream_table('my_stream_table',
    pooler_compatibility_mode => true);

PgBouncer Session Mode

Session pooling is fully compatible — no special configuration needed.

pgcat / Supavisor

These poolers generally support prepared statements and NOTIFY. Set pg_trickle.connection_pooler_mode = 'off' (the default).

Kubernetes / CNPG

See Scaling — CNPG for connection pooler configuration in Kubernetes environments.


SQL Reference

Complete reference for all SQL functions, views, and catalog tables provided by pg_trickle in the pgtrickle schema.



Functions

Core Lifecycle

Create, modify, and manage the lifecycle of stream tables.


pgtrickle.create_stream_table

Create a new stream table.

pgtrickle.create_stream_table(
    name                  text,
    query                 text,
    schedule              text      DEFAULT 'calculated',
    refresh_mode          text      DEFAULT 'AUTO',
    initialize            bool      DEFAULT true,
    diamond_consistency   text      DEFAULT NULL,
    diamond_schedule_policy text    DEFAULT NULL,
    cdc_mode              text      DEFAULT NULL,
    append_only           bool      DEFAULT false,
    pooler_compatibility_mode bool  DEFAULT false
) → void

Parameters:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| name | text | (required) | Name of the stream table. May be schema-qualified (myschema.my_st). Defaults to public schema. |
| query | text | (required) | The defining SQL query. Must be a valid SELECT statement using supported operators. |
| schedule | text | 'calculated' | Refresh schedule as a Prometheus/GNU-style duration string (e.g., '30s', '5m', '1h', '1h30m', '1d') or a cron expression (e.g., '*/5 * * * *', '@hourly'). Use 'calculated' for CALCULATED mode (inherits schedule from downstream dependents). |
| refresh_mode | text | 'AUTO' | 'AUTO' (adaptive — uses DIFFERENTIAL when possible, falls back to FULL if the query is not differentiable), 'FULL' (truncate and reload), 'DIFFERENTIAL' (apply delta only — errors if the query is not differentiable), or 'IMMEDIATE' (synchronous in-transaction maintenance via statement-level triggers). |
| initialize | bool | true | If true, populates the table immediately via a full refresh. If false, creates the table empty. |
| diamond_consistency | text | NULL (defaults to 'atomic') | Diamond dependency consistency mode: 'atomic' (SAVEPOINT-based atomic group refresh) or 'none' (independent refresh). |
| diamond_schedule_policy | text | NULL (defaults to 'fastest') | Schedule policy for atomic diamond groups: 'fastest' (fire when any member is due) or 'slowest' (fire when all are due). Set on the convergence node. |
| cdc_mode | text | NULL (use pg_trickle.cdc_mode) | Optional per-stream-table CDC override: 'auto', 'trigger', or 'wal'. This affects all deferred TABLE sources of the stream table. |
| append_only | bool | false | When true, differential refreshes use a fast INSERT path instead of MERGE. Skips DELETE/UPDATE/IS DISTINCT FROM checks. If a DELETE or UPDATE is later detected in the change buffer, the flag is automatically reverted to false. Not compatible with FULL, IMMEDIATE, or keyless sources. |
| pooler_compatibility_mode | bool | false | When true, the refresh engine uses inline SQL instead of PREPARE/EXECUTE and suppresses all NOTIFY emissions for this stream table. Enable this when the stream table is accessed through a transaction-mode connection pooler (e.g. PgBouncer). |

When refresh_mode => 'IMMEDIATE', the cluster-wide pg_trickle.cdc_mode setting is ignored. IMMEDIATE mode always uses statement-level IVM triggers instead of CDC triggers or WAL replication slots. If you explicitly pass cdc_mode => 'wal' together with refresh_mode => 'IMMEDIATE', pg_trickle rejects the call because WAL CDC is asynchronous and incompatible with in-transaction maintenance.
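
The incompatible combination fails fast at creation time. A sketch of the rejected call — the exact error wording varies by version:

```sql
-- Rejected: WAL CDC is asynchronous; IMMEDIATE needs in-transaction triggers.
SELECT pgtrickle.create_stream_table(
    name         => 'bad_combo',
    query        => 'SELECT region, COUNT(*) FROM orders GROUP BY region',
    refresh_mode => 'IMMEDIATE',
    cdc_mode     => 'wal'
);
-- ERROR (wording version-dependent): cdc_mode 'wal' cannot be combined
-- with refresh_mode 'IMMEDIATE'
```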

Duration format:

| Unit | Suffix | Example |
|------|--------|---------|
| Seconds | s | '30s' |
| Minutes | m | '5m' |
| Hours | h | '2h' |
| Days | d | '1d' |
| Weeks | w | '1w' |
| Compound | | '1h30m', '2m30s' |

Cron expression format:

schedule also accepts standard cron expressions for time-based scheduling. The scheduler refreshes the stream table when the cron schedule fires, rather than checking staleness.

| Format | Fields | Example | Description |
|--------|--------|---------|-------------|
| 5-field | min hour dom mon dow | '*/5 * * * *' | Every 5 minutes |
| 6-field | sec min hour dom mon dow | '0 */5 * * * *' | Every 5 minutes at :00 seconds |
| Alias | | '@hourly' | Every hour |
| Alias | | '@daily' | Every day at midnight |
| Alias | | '@weekly' | Every Sunday at midnight |
| Alias | | '@monthly' | First of every month |
| Weekday range | | '0 6 * * 1-5' | 6 AM on weekdays |

Note: Cron-scheduled stream tables do not participate in CALCULATED schedule resolution. The stale column in monitoring views returns NULL for cron-scheduled tables.

Example:

-- Duration-based: refresh when data is staler than 2 minutes (refresh_mode defaults to 'AUTO')
SELECT pgtrickle.create_stream_table(
    name     => 'order_totals',
    query    => 'SELECT region, SUM(amount) AS total FROM orders GROUP BY region',
    schedule => '2m'
);

-- Cron-based: refresh every hour
SELECT pgtrickle.create_stream_table(
    name         => 'hourly_summary',
    query        => 'SELECT date_trunc(''hour'', ts), COUNT(*) FROM events GROUP BY 1',
    schedule     => '@hourly',
    refresh_mode => 'FULL'
);

-- Cron-based: refresh at 6 AM on weekdays
SELECT pgtrickle.create_stream_table(
    name         => 'daily_report',
    query        => 'SELECT region, SUM(revenue) AS total FROM sales GROUP BY region',
    schedule     => '0 6 * * 1-5',
    refresh_mode => 'FULL'
);

-- Immediate mode: maintained synchronously within the same transaction
-- No schedule needed — updates happen automatically when base table changes
SELECT pgtrickle.create_stream_table(
    name         => 'live_totals',
    query        => 'SELECT region, SUM(amount) AS total FROM orders GROUP BY region',
    refresh_mode => 'IMMEDIATE'
);

-- Force WAL CDC for this stream table even if the global GUC is 'trigger'
SELECT pgtrickle.create_stream_table(
    name         => 'wal_orders',
    query        => 'SELECT id, amount FROM orders',
    schedule     => '1s',
    refresh_mode => 'DIFFERENTIAL',
    cdc_mode     => 'wal'
);

Aggregate Examples:

All supported aggregate functions work in AUTO mode (and all other modes). Examples below omit refresh_mode — the default 'AUTO' selects DIFFERENTIAL automatically. Explicit modes are shown only when the mode itself is being demonstrated.

-- Algebraic aggregates (fully differential — no rescan needed)
SELECT pgtrickle.create_stream_table(
    name     => 'sales_summary',
    query    => 'SELECT region, COUNT(*) AS cnt, SUM(amount) AS total, AVG(amount) AS avg_amount
     FROM orders GROUP BY region',
    schedule => '1m'
);

-- Semi-algebraic aggregates (MIN/MAX)
SELECT pgtrickle.create_stream_table(
    name     => 'salary_ranges',
    query    => 'SELECT department, MIN(salary) AS min_sal, MAX(salary) AS max_sal
     FROM employees GROUP BY department',
    schedule => '2m'
);

-- Group-rescan aggregates (BOOL_AND/OR, STRING_AGG, ARRAY_AGG, JSON_AGG, JSONB_AGG,
--                          BIT_AND, BIT_OR, BIT_XOR, JSON_OBJECT_AGG, JSONB_OBJECT_AGG,
--                          STDDEV, STDDEV_POP, STDDEV_SAMP, VARIANCE, VAR_POP, VAR_SAMP,
--                          MODE, PERCENTILE_CONT, PERCENTILE_DISC,
--                          CORR, COVAR_POP, COVAR_SAMP, REGR_AVGX, REGR_AVGY,
--                          REGR_COUNT, REGR_INTERCEPT, REGR_R2, REGR_SLOPE,
--                          REGR_SXX, REGR_SXY, REGR_SYY, ANY_VALUE)
SELECT pgtrickle.create_stream_table(
    name     => 'team_members',
    query    => 'SELECT department,
            STRING_AGG(name, '', '' ORDER BY name) AS members,
            ARRAY_AGG(employee_id) AS member_ids,
            BOOL_AND(active) AS all_active,
            JSON_AGG(name) AS members_json
     FROM employees
     GROUP BY department',
    schedule => '1m'
);

-- Bitwise aggregates
SELECT pgtrickle.create_stream_table(
    name     => 'permission_summary',
    query    => 'SELECT department,
            BIT_OR(permissions) AS combined_perms,
            BIT_AND(permissions) AS common_perms,
            BIT_XOR(flags) AS xor_flags
     FROM employees
     GROUP BY department',
    schedule => '1m'
);

-- JSON object aggregates
SELECT pgtrickle.create_stream_table(
    name     => 'config_map',
    query    => 'SELECT department,
            JSON_OBJECT_AGG(setting_name, setting_value) AS settings,
            JSONB_OBJECT_AGG(key, value) AS metadata
     FROM config
     GROUP BY department',
    schedule => '1m'
);

-- Statistical aggregates
SELECT pgtrickle.create_stream_table(
    name     => 'salary_stats',
    query    => 'SELECT department,
            STDDEV_POP(salary) AS sd_pop,
            STDDEV_SAMP(salary) AS sd_samp,
            VAR_POP(salary) AS var_pop,
            VAR_SAMP(salary) AS var_samp
     FROM employees
     GROUP BY department',
    schedule => '1m'
);

-- Ordered-set aggregates (MODE, PERCENTILE_CONT, PERCENTILE_DISC)
SELECT pgtrickle.create_stream_table(
    name     => 'salary_percentiles',
    query    => 'SELECT department,
            MODE() WITHIN GROUP (ORDER BY grade) AS most_common_grade,
            PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary) AS median_salary,
            PERCENTILE_DISC(0.9) WITHIN GROUP (ORDER BY salary) AS p90_salary
     FROM employees
     GROUP BY department',
    schedule => '1m'
);

-- Regression / correlation aggregates (CORR, COVAR_*, REGR_*)
SELECT pgtrickle.create_stream_table(
    name     => 'regression_stats',
    query    => 'SELECT department,
            CORR(salary, experience) AS sal_exp_corr,
            COVAR_POP(salary, experience) AS covar_pop,
            COVAR_SAMP(salary, experience) AS covar_samp,
            REGR_SLOPE(salary, experience) AS slope,
            REGR_INTERCEPT(salary, experience) AS intercept,
            REGR_R2(salary, experience) AS r_squared,
            REGR_COUNT(salary, experience) AS regr_n
     FROM employees
     GROUP BY department',
    schedule => '1m'
);

-- ANY_VALUE aggregate (PostgreSQL 16+)
SELECT pgtrickle.create_stream_table(
    name     => 'dept_sample',
    query    => 'SELECT department, ANY_VALUE(office_location) AS sample_office
     FROM employees GROUP BY department',
    schedule => '1m'
);

-- FILTER clause on aggregates
SELECT pgtrickle.create_stream_table(
    name     => 'order_metrics',
    query    => 'SELECT region,
            COUNT(*) AS total,
            COUNT(*) FILTER (WHERE status = ''active'') AS active_count,
            SUM(amount) FILTER (WHERE status = ''shipped'') AS shipped_total
     FROM orders
     GROUP BY region',
    schedule => '1m'
);

-- PgBouncer compatibility (transaction-mode pooler)
SELECT pgtrickle.create_stream_table(
    name                      => 'pooled_orders',
    query                     => 'SELECT id, amount FROM orders',
    schedule                  => '5m',
    pooler_compatibility_mode => true
);

CTE Examples:

Non-recursive CTEs are fully supported in both FULL and DIFFERENTIAL modes:

-- Simple CTE
SELECT pgtrickle.create_stream_table(
    name     => 'active_order_totals',
    query    => 'WITH active_users AS (
        SELECT id, name FROM users WHERE active = true
    )
    SELECT a.id, a.name, SUM(o.amount) AS total
    FROM active_users a
    JOIN orders o ON o.user_id = a.id
    GROUP BY a.id, a.name',
    schedule => '1m'
);

-- Chained CTEs (CTE referencing another CTE)
SELECT pgtrickle.create_stream_table(
    name     => 'top_regions',
    query    => 'WITH regional AS (
        SELECT region, SUM(amount) AS total FROM orders GROUP BY region
    ),
    ranked AS (
        SELECT region, total FROM regional WHERE total > 1000
    )
    SELECT * FROM ranked',
    schedule => '2m'
);

-- Multi-reference CTE (referenced twice in FROM — shared delta optimization)
SELECT pgtrickle.create_stream_table(
    name     => 'self_compare',
    query    => 'WITH totals AS (
        SELECT user_id, SUM(amount) AS total FROM orders GROUP BY user_id
    )
    SELECT t1.user_id, t1.total, t2.total AS next_total
    FROM totals t1
    JOIN totals t2 ON t1.user_id = t2.user_id + 1',
    schedule => '1m'
);

-- Append-only stream table (INSERT-only fast path)
SELECT pgtrickle.create_stream_table(
    name        => 'event_log_st',
    query       => 'SELECT id, event_type, payload, created_at FROM events',
    schedule    => '30s',
    append_only => true
);

Recursive CTEs work with FULL, DIFFERENTIAL, and IMMEDIATE modes:

-- Recursive CTE (hierarchy traversal)
SELECT pgtrickle.create_stream_table(
    name         => 'category_tree',
    query        => 'WITH RECURSIVE cat_tree AS (
        SELECT id, name, parent_id, 0 AS depth
        FROM categories WHERE parent_id IS NULL
        UNION ALL
        SELECT c.id, c.name, c.parent_id, ct.depth + 1
        FROM categories c
        JOIN cat_tree ct ON c.parent_id = ct.id
    )
    SELECT * FROM cat_tree',
    schedule     => '5m',
    refresh_mode => 'FULL'  -- FULL mode: standard re-execution
);

-- Recursive CTE with DIFFERENTIAL mode (incremental semi-naive / DRed)
SELECT pgtrickle.create_stream_table(
    name         => 'org_chart',
    query        => 'WITH RECURSIVE reports AS (
        SELECT id, name, manager_id FROM employees WHERE manager_id IS NULL
        UNION ALL
        SELECT e.id, e.name, e.manager_id
        FROM employees e JOIN reports r ON e.manager_id = r.id
    )
    SELECT * FROM reports',
    schedule     => '2m',
    refresh_mode => 'DIFFERENTIAL'  -- Uses semi-naive, DRed, or recomputation (auto-selected)
);

-- Recursive CTE with IMMEDIATE mode (same-transaction maintenance)
SELECT pgtrickle.create_stream_table(
    name         => 'org_chart_live',
    query        => 'WITH RECURSIVE reports AS (
        SELECT id, name, manager_id FROM employees WHERE manager_id IS NULL
        UNION ALL
        SELECT e.id, e.name, e.manager_id
        FROM employees e JOIN reports r ON e.manager_id = r.id
    )
    SELECT * FROM reports',
    refresh_mode => 'IMMEDIATE'  -- Uses transition tables with semi-naive / DRed maintenance
);

Non-monotone recursive terms: If the recursive term contains operators like EXCEPT, aggregate functions, window functions, DISTINCT, INTERSECT (set), or anti-joins, the system automatically falls back to recomputation to guarantee correctness. Semi-naive and DRed strategies require monotone recursive terms (JOIN, UNION ALL, filter/project only).
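
As an illustration (the edges table is assumed), a DISTINCT inside the recursive term makes it non-monotone, so DIFFERENTIAL refreshes of this stream table fall back to recomputation:

```sql
-- Non-monotone recursive term: DISTINCT triggers the recomputation fallback
SELECT pgtrickle.create_stream_table(
    name         => 'reachable_pairs',
    query        => 'WITH RECURSIVE reachable AS (
        SELECT src, dst FROM edges
        UNION ALL
        SELECT DISTINCT r.src, e.dst
        FROM reachable r JOIN edges e ON e.src = r.dst
    )
    SELECT * FROM reachable',
    schedule     => '5m',
    refresh_mode => 'DIFFERENTIAL'
);
```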

Set Operation Examples:

INTERSECT, INTERSECT ALL, EXCEPT, EXCEPT ALL, UNION, and UNION ALL are supported:

-- INTERSECT: customers who placed orders in BOTH regions
SELECT pgtrickle.create_stream_table(
    name     => 'bi_region_customers',
    query    => 'SELECT customer_id FROM orders_east
     INTERSECT
     SELECT customer_id FROM orders_west',
    schedule => '2m'
);

-- INTERSECT ALL: preserves duplicates (bag semantics)
SELECT pgtrickle.create_stream_table(
    name     => 'common_items',
    query    => 'SELECT item_name FROM warehouse_a
     INTERSECT ALL
     SELECT item_name FROM warehouse_b',
    schedule => '1m'
);

-- EXCEPT: orders not yet shipped
SELECT pgtrickle.create_stream_table(
    name     => 'unshipped_orders',
    query    => 'SELECT order_id FROM orders
     EXCEPT
     SELECT order_id FROM shipments',
    schedule => '1m'
);

-- EXCEPT ALL: preserves duplicate counts (bag subtraction)
SELECT pgtrickle.create_stream_table(
    name     => 'excess_inventory',
    query    => 'SELECT sku FROM stock_received
     EXCEPT ALL
     SELECT sku FROM stock_shipped',
    schedule => '5m'
);

-- UNION: deduplicated merge of two sources
SELECT pgtrickle.create_stream_table(
    name     => 'all_contacts',
    query    => 'SELECT email FROM customers
     UNION
     SELECT email FROM newsletter_subscribers',
    schedule => '5m'
);

LATERAL Set-Returning Function Examples:

Set-returning functions (SRFs) in the FROM clause are supported in both FULL and DIFFERENTIAL modes. Common SRFs include jsonb_array_elements, jsonb_each, jsonb_each_text, and unnest:

-- Flatten JSONB arrays into rows
SELECT pgtrickle.create_stream_table(
    name     => 'flat_children',
    query    => 'SELECT p.id, child.value AS val
     FROM parent_data p,
     jsonb_array_elements(p.data->''children'') AS child',
    schedule => '1m'
);

-- Expand JSONB key-value pairs (multi-column SRF)
SELECT pgtrickle.create_stream_table(
    name     => 'flat_properties',
    query    => 'SELECT d.id, kv.key, kv.value
     FROM documents d,
     jsonb_each(d.metadata) AS kv',
    schedule => '2m'
);

-- Unnest arrays
SELECT pgtrickle.create_stream_table(
    name     => 'flat_tags',
    query    => 'SELECT t.id, tag.tag
     FROM tagged_items t,
     unnest(t.tags) AS tag(tag)',
    schedule => '1m'
);

-- SRF with WHERE filter
SELECT pgtrickle.create_stream_table(
    name     => 'high_value_items',
    query    => 'SELECT p.id, (e.value)::int AS amount
     FROM products p,
     jsonb_array_elements(p.prices) AS e
     WHERE (e.value)::int > 100',
    schedule => '5m'
);

-- SRF combined with aggregation
SELECT pgtrickle.create_stream_table(
    name         => 'element_counts',
    query        => 'SELECT a.id, count(*) AS cnt
     FROM arrays a,
     jsonb_array_elements(a.data) AS e
     GROUP BY a.id',
    schedule     => '1m',
    refresh_mode => 'FULL'
);

LATERAL Subquery Examples:

LATERAL subqueries in the FROM clause are supported in both FULL and DIFFERENTIAL modes. Use them for top-N per group, correlated aggregation, and conditional expansion:

-- Top-N per group: latest item per order
SELECT pgtrickle.create_stream_table(
    name     => 'latest_items',
    query    => 'SELECT o.id, o.customer, latest.amount
     FROM orders o,
     LATERAL (
         SELECT li.amount
         FROM line_items li
         WHERE li.order_id = o.id
         ORDER BY li.created_at DESC
         LIMIT 1
     ) AS latest',
    schedule => '1m'
);

-- Correlated aggregate
SELECT pgtrickle.create_stream_table(
    name     => 'dept_summaries',
    query    => 'SELECT d.id, d.name, stats.total, stats.cnt
     FROM departments d,
     LATERAL (
         SELECT SUM(e.salary) AS total, COUNT(*) AS cnt
         FROM employees e
         WHERE e.dept_id = d.id
     ) AS stats',
    schedule => '1m'
);

-- LEFT JOIN LATERAL: preserve outer rows with NULLs when subquery returns no rows
SELECT pgtrickle.create_stream_table(
    name     => 'dept_stats_all',
    query    => 'SELECT d.id, d.name, stats.total
     FROM departments d
     LEFT JOIN LATERAL (
         SELECT SUM(e.salary) AS total
         FROM employees e
         WHERE e.dept_id = d.id
     ) AS stats ON true',
    schedule => '1m'
);

WHERE Subquery Examples:

Subqueries in the WHERE clause are automatically transformed into semi-join, anti-join, or scalar subquery operators in the DVM operator tree:

-- EXISTS subquery: customers who have placed orders
SELECT pgtrickle.create_stream_table(
    name     => 'active_customers',
    query    => 'SELECT c.id, c.name
     FROM customers c
     WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)',
    schedule => '1m'
);

-- NOT EXISTS: customers with no orders
SELECT pgtrickle.create_stream_table(
    name     => 'inactive_customers',
    query    => 'SELECT c.id, c.name
     FROM customers c
     WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)',
    schedule => '1m'
);

-- IN subquery: products that have been ordered
SELECT pgtrickle.create_stream_table(
    name     => 'ordered_products',
    query    => 'SELECT p.id, p.name
     FROM products p
     WHERE p.id IN (SELECT product_id FROM order_items)',
    schedule => '1m'
);

-- NOT IN subquery: products never ordered
SELECT pgtrickle.create_stream_table(
    name     => 'unordered_products',
    query    => 'SELECT p.id, p.name
     FROM products p
     WHERE p.id NOT IN (SELECT product_id FROM order_items)',
    schedule => '1m'
);

-- Scalar subquery in SELECT list
SELECT pgtrickle.create_stream_table(
    name     => 'products_with_max_price',
    query    => 'SELECT p.id, p.name, (SELECT max(price) FROM products) AS max_price
     FROM products p',
    schedule => '1m'
);

Notes:

  • The defining query is parsed into an operator tree and validated for DVM support.
  • Views as sources — views referenced in the defining query are automatically inlined as subqueries (auto-rewrite pass #0). CDC triggers are created on the underlying base tables. Nested views (view → view → table) are fully expanded. The user's original query is preserved in original_query for reinit and introspection. Materialized views are rejected in DIFFERENTIAL mode (use FULL mode or the underlying query directly). Foreign tables are also rejected in DIFFERENTIAL mode.
  • CDC triggers and change buffer tables are created automatically for each source table.
  • TRUNCATE on source tables — when a source table is TRUNCATEd, a CDC trigger writes a marker row (action='T') into the change buffer. On the next refresh cycle, pg_trickle detects the marker and automatically falls back to a FULL refresh. For single-source stream tables where no subsequent DML occurred after the TRUNCATE, an optimized fast path deletes all ST rows directly without re-running the full defining query.
  • The ST is registered in the dependency DAG; cycles are rejected.
  • Non-recursive CTEs are inlined as subqueries during parsing (Tier 1). Multi-reference CTEs share delta computation (Tier 2).
  • Recursive CTEs in DIFFERENTIAL mode use three strategies, auto-selected per refresh: semi-naive evaluation for INSERT-only changes, DRed (Delete-and-Rederive) for mixed DELETE/UPDATE changes, and recomputation fallback when CTE columns do not match ST storage columns. Non-monotone recursive terms (containing EXCEPT, Aggregate, Window, DISTINCT, AntiJoin, or INTERSECT SET) automatically fall back to recomputation to ensure correctness.

  • DRed algorithm detail (P2-1, implemented in v0.10.0) — in DIFFERENTIAL mode, mixed DELETE/UPDATE changes use the DRed (Delete-and-Rederive) algorithm: (1) semi-naive INSERT propagation; (2) over-deletion cascade from ST storage; (3) rederivation from the current source tables; (4) combination into net deletions. DRed correctly handles derived-column changes such as path rebuilds under a renamed ancestor node. When CTE output columns do not match the ST storage columns, recomputation is used instead.

  • LATERAL SRFs in DIFFERENTIAL mode use row-scoped recomputation: when a source row changes, only the SRF expansions for that row are re-evaluated.
  • LATERAL subqueries in DIFFERENTIAL mode also use row-scoped recomputation: when an outer row changes, the correlated subquery is re-executed only for that row.
  • WHERE subqueries (EXISTS, IN, scalar) are parsed into dedicated semi-join, anti-join, and scalar subquery operators with specialized delta computation.
  • ALL (subquery) is the only subquery form that is currently rejected.
  • ORDER BY is accepted but silently discarded — row order in the storage table is undefined (consistent with PostgreSQL's CREATE MATERIALIZED VIEW behavior). Apply ORDER BY when querying the stream table.
  • TopK (ORDER BY + LIMIT) — When a top-level ORDER BY … LIMIT N is present (with a constant integer limit, optionally with OFFSET M), the query is recognized as a "TopK" pattern and accepted. TopK stream tables store exactly N rows (starting from position M+1 if OFFSET is specified) and are refreshed via a scoped-recomputation MERGE strategy. The DVM delta pipeline is bypassed; instead, each refresh re-evaluates the full ORDER BY + LIMIT [+ OFFSET] query and merges the result into the storage table. The catalog records topk_limit, topk_order_by, and optionally topk_offset for the stream table. TopK is not supported with set operations (UNION/INTERSECT/EXCEPT) or GROUP BY ROLLUP/CUBE/GROUPING SETS.
  • LIMIT / OFFSET without ORDER BY are rejected — stream tables materialize the full result set. Apply LIMIT when querying the stream table.
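
For instance, a TopK stream table versus a read-time ORDER BY (table and column names here are illustrative):

```sql
-- TopK pattern: accepted — stores exactly the 10 largest orders
SELECT pgtrickle.create_stream_table(
    name     => 'top_orders',
    query    => 'SELECT id, amount FROM orders ORDER BY amount DESC LIMIT 10',
    schedule => '1m'
);

-- Plain ORDER BY / LIMIT belongs in the query against the stream table
SELECT * FROM top_orders ORDER BY amount DESC LIMIT 5;
```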

pgtrickle.create_stream_table_if_not_exists

Create a stream table if it does not already exist. If a stream table with the given name already exists, the call is a no-op (an INFO message is logged) and the existing definition is left unmodified.

pgtrickle.create_stream_table_if_not_exists(
    name                    text,
    query                   text,
    schedule                text      DEFAULT 'calculated',
    refresh_mode            text      DEFAULT 'AUTO',
    initialize              bool      DEFAULT true,
    diamond_consistency     text      DEFAULT NULL,
    diamond_schedule_policy text      DEFAULT NULL,
    cdc_mode                text      DEFAULT NULL,
    append_only             bool      DEFAULT false,
    pooler_compatibility_mode bool    DEFAULT false
) → void

Parameters: Same as create_stream_table.

Example:

-- Safe to re-run in migrations:
SELECT pgtrickle.create_stream_table_if_not_exists(
    'order_totals',
    'SELECT customer_id, sum(amount) AS total FROM orders GROUP BY customer_id',
    '1m',
    'DIFFERENTIAL'
);

Notes:

  • Useful for deployment / migration scripts that should be safe to re-run.
  • If the stream table already exists, the provided query, schedule, and other parameters are ignored — the existing definition is preserved.

pgtrickle.create_or_replace_stream_table

Create a stream table if it does not exist, or replace the existing one if the definition changed. This is the declarative, idempotent API for deployment workflows (dbt, SQL migrations, GitOps).

pgtrickle.create_or_replace_stream_table(
    name                    text,
    query                   text,
    schedule                text      DEFAULT 'calculated',
    refresh_mode            text      DEFAULT 'AUTO',
    initialize              bool      DEFAULT true,
    diamond_consistency     text      DEFAULT NULL,
    diamond_schedule_policy text      DEFAULT NULL,
    cdc_mode                text      DEFAULT NULL,
    append_only             bool      DEFAULT false,
    pooler_compatibility_mode bool    DEFAULT false
) → void

Parameters: Same as create_stream_table.

Behavior:

| Current state | Action taken |
|---|---|
| Stream table does not exist | Create — identical to create_stream_table(...) |
| Stream table exists, query and all config identical | No-op — logs INFO, returns immediately |
| Stream table exists, query identical but config differs | Alter config — delegates to alter_stream_table(...) for schedule, refresh_mode, diamond settings, cdc_mode, append_only, pooler_compatibility_mode |
| Stream table exists, query differs | Replace query — in-place ALTER QUERY migration plus any config changes; a full refresh is applied |

The initialize parameter is honored on create only. On replace, the stream table is always repopulated via a full refresh.

Query comparison uses the post-rewrite (normalized) form of the SQL. Cosmetic differences such as whitespace, casing, and extra parentheses are ignored.

Example:

-- Idempotent deployment — safe to run on every deploy:
SELECT pgtrickle.create_or_replace_stream_table(
    name         => 'order_totals',
    query        => 'SELECT region, SUM(amount) AS total FROM orders GROUP BY region',
    schedule     => '2m',
    refresh_mode => 'DIFFERENTIAL'
);

-- If the query changed since last deploy, the stream table is
-- migrated in place (no data gap). If nothing changed, it's a no-op.
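
Because comparison uses the normalized query, redeploying the same definition with different whitespace and casing is detected as unchanged:

```sql
-- Same definition as the deploy above, cosmetically different — a no-op
SELECT pgtrickle.create_or_replace_stream_table(
    name         => 'order_totals',
    query        => 'select region, sum(amount) as total
                     from orders group by region',
    schedule     => '2m',
    refresh_mode => 'DIFFERENTIAL'
);
```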

Notes:

  • Mirrors PostgreSQL's CREATE OR REPLACE convention (CREATE OR REPLACE VIEW, CREATE OR REPLACE FUNCTION).
  • Never drops the stream table — even for incompatible schema changes, the ALTER QUERY path rebuilds storage in place while preserving the catalog entry (pgt_id).
  • For migration scripts that should not modify an existing definition, use create_stream_table_if_not_exists instead.

pgtrickle.bulk_create

Create multiple stream tables in a single transaction.

pgtrickle.bulk_create(
    definitions  jsonb     -- Array of stream table definitions
) → jsonb                  -- Array of result objects

Each element in the definitions array must be a JSON object with at least name and query keys. All other keys match the parameters of create_stream_table (snake_case):

| Key | Type | Default | Description |
|---|---|---|---|
| name | string | (required) | Stream table name (optionally schema-qualified). |
| query | string | (required) | Defining SQL query. |
| schedule | string | 'calculated' | Refresh schedule. |
| refresh_mode | string | 'AUTO' | 'AUTO', 'FULL', 'DIFFERENTIAL', or 'IMMEDIATE'. |
| initialize | boolean | true | Whether to populate immediately. |
| diamond_consistency | string | NULL | 'atomic' or 'none'. |
| diamond_schedule_policy | string | NULL | 'fastest' or 'slowest'. |
| cdc_mode | string | NULL | 'auto', 'trigger', or 'wal'. |
| append_only | boolean | false | Enable append-only fast path. |
| pooler_compatibility_mode | boolean | false | PgBouncer compatibility. |
| partition_by | string | NULL | Partition key. |
| max_differential_joins | integer | NULL | Max join scan limit. |
| max_delta_fraction | number | NULL | Max delta fraction (0.0–1.0). |

Returns a JSONB array of result objects:

[
  {"name": "st1", "status": "created", "pgt_id": 42},
  {"name": "st2", "status": "created", "pgt_id": 43}
]

On any error, the entire transaction is rolled back (standard PostgreSQL transactional semantics). The error message includes the index and name of the failing definition.

Example:

SELECT pgtrickle.bulk_create('[
  {"name": "order_totals", "query": "SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id", "schedule": "30s"},
  {"name": "product_stats", "query": "SELECT product_id, COUNT(*) AS cnt FROM order_items GROUP BY product_id", "schedule": "1m"}
]'::jsonb);

pgtrickle.alter_stream_table

Alter properties of an existing stream table.

pgtrickle.alter_stream_table(
    name                  text,
    query                 text      DEFAULT NULL,
    schedule              text      DEFAULT NULL,
    refresh_mode          text      DEFAULT NULL,
    status                text      DEFAULT NULL,
    diamond_consistency   text      DEFAULT NULL,
    diamond_schedule_policy text    DEFAULT NULL,
    cdc_mode              text      DEFAULT NULL,
    append_only           bool      DEFAULT NULL,
    pooler_compatibility_mode bool  DEFAULT NULL,
    tier                  text      DEFAULT NULL
) → void

Parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| name | text | (required) | Name of the stream table (schema-qualified or unqualified). |
| query | text | NULL | New defining query. Pass NULL to leave unchanged. When set, the function validates the new query, migrates the storage table schema if needed, updates catalog entries and dependencies, and runs a full refresh. Schema changes are classified as same (no DDL), compatible (ALTER TABLE ADD/DROP COLUMN), or incompatible (full storage rebuild with OID change). |
| schedule | text | NULL | New schedule as a duration string (e.g., '5m'). Pass NULL to leave unchanged. Pass 'calculated' to switch to CALCULATED mode. |
| refresh_mode | text | NULL | New refresh mode ('AUTO', 'FULL', 'DIFFERENTIAL', or 'IMMEDIATE'). Pass NULL to leave unchanged. Switching to/from 'IMMEDIATE' migrates trigger infrastructure (IVM triggers ↔ CDC triggers), clears or restores the schedule, and runs a full refresh. |
| status | text | NULL | New status ('ACTIVE', 'SUSPENDED'). Pass NULL to leave unchanged. Resuming resets consecutive errors to 0. |
| diamond_consistency | text | NULL | New diamond consistency mode ('none' or 'atomic'). Pass NULL to leave unchanged. |
| diamond_schedule_policy | text | NULL | New schedule policy for atomic diamond groups ('fastest' or 'slowest'). Pass NULL to leave unchanged. |
| cdc_mode | text | NULL | New requested CDC mode override ('auto', 'trigger', or 'wal'). Pass NULL to leave unchanged. |
| append_only | bool | NULL | Enable or disable the append-only INSERT fast path. Pass NULL to leave unchanged. When true, rejected for FULL, IMMEDIATE, or keyless source stream tables. |
| pooler_compatibility_mode | bool | NULL | Enable or disable pooler-safe mode. When true, prepared statements are bypassed and NOTIFY emissions are suppressed. Pass NULL to leave unchanged. |
| tier | text | NULL | Refresh tier for tiered scheduling ('hot', 'warm', 'cold', or 'frozen'). Only effective when the pg_trickle.tiered_scheduling GUC is enabled. Hot (1×), Warm (2×), Cold (10×), Frozen (skip). Pass NULL to leave unchanged. |

If you switch a stream table to refresh_mode => 'IMMEDIATE' while the cluster-wide pg_trickle.cdc_mode GUC is set to 'wal', pg_trickle logs an INFO and proceeds with IVM triggers. WAL CDC does not apply to IMMEDIATE mode. If the stream table has an explicit cdc_mode => 'wal' override, switching to IMMEDIATE is rejected until you change the requested CDC mode back to 'auto' or 'trigger'.
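
A sketch of that interaction (pg_trickle.cdc_mode is shown with SET for brevity; depending on the GUC's context it may need to be set in postgresql.conf):

```sql
-- Cluster-wide WAL CDC: the switch proceeds with IVM triggers (INFO logged)
SET pg_trickle.cdc_mode = 'wal';
SELECT pgtrickle.alter_stream_table('order_totals', refresh_mode => 'IMMEDIATE');

-- Per-ST WAL override: clear it first, then switch to IMMEDIATE
SELECT pgtrickle.alter_stream_table('order_totals', cdc_mode => 'auto');
SELECT pgtrickle.alter_stream_table('order_totals', refresh_mode => 'IMMEDIATE');
```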

Examples:

-- Change the defining query (same output schema — fast path)
SELECT pgtrickle.alter_stream_table('order_totals',
    query => 'SELECT customer_id, SUM(amount) AS total FROM orders WHERE status = ''active'' GROUP BY customer_id');

-- Change query and add a column (compatible schema migration)
SELECT pgtrickle.alter_stream_table('order_totals',
    query => 'SELECT customer_id, SUM(amount) AS total, COUNT(*) AS cnt FROM orders GROUP BY customer_id');

-- Change query and mode simultaneously
SELECT pgtrickle.alter_stream_table('order_totals',
    query => 'SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id',
    refresh_mode => 'FULL');

-- Change schedule
SELECT pgtrickle.alter_stream_table('order_totals', schedule => '5m');

-- Switch to full refresh mode
SELECT pgtrickle.alter_stream_table('order_totals', refresh_mode => 'FULL');

-- Switch to immediate (transactional) mode — installs IVM triggers, clears schedule
SELECT pgtrickle.alter_stream_table('order_totals', refresh_mode => 'IMMEDIATE');

-- Switch from immediate back to differential — re-creates CDC triggers, restores schedule
SELECT pgtrickle.alter_stream_table('order_totals',
    refresh_mode => 'DIFFERENTIAL', schedule => '5m');

-- Pin a deferred stream table to trigger CDC even when the global GUC is 'auto'
SELECT pgtrickle.alter_stream_table('order_totals', cdc_mode => 'trigger');

-- Enable append-only INSERT fast path
SELECT pgtrickle.alter_stream_table('event_log_st', append_only => true);

-- Enable pooler compatibility mode (for PgBouncer transaction mode)
SELECT pgtrickle.alter_stream_table('order_totals', pooler_compatibility_mode => true);

-- Set refresh tier (requires pg_trickle.tiered_scheduling = on)
SELECT pgtrickle.alter_stream_table('order_totals', tier => 'warm');
SELECT pgtrickle.alter_stream_table('archive_stats', tier => 'frozen');

-- Suspend a stream table
SELECT pgtrickle.alter_stream_table('order_totals', status => 'SUSPENDED');

-- Resume a suspended stream table
SELECT pgtrickle.resume_stream_table('order_totals');
-- Or via alter_stream_table
SELECT pgtrickle.alter_stream_table('order_totals', status => 'ACTIVE');

Notes:

  • When query is provided, the function runs the full query rewrite pipeline (view inlining, DISTINCT ON, GROUPING SETS, etc.) and validates the new query before applying changes.
  • The entire ALTER QUERY operation runs within a single transaction. If any step fails, the stream table is left unchanged.
  • For same-schema and compatible-schema changes, the storage table OID is preserved — views, policies, and publications referencing the stream table remain valid.
  • For incompatible schema changes (e.g., changing a column from integer to text), the storage table is rebuilt and the OID changes. A WARNING is emitted.
  • The stream table is temporarily suspended during query migration to prevent concurrent scheduler refreshes.
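
Continuing the order_totals examples above, an incompatible column-type change exercises the rebuild path:

```sql
-- Incompatible schema change: total switches from numeric to text, so the
-- storage table is rebuilt, the OID changes, and a WARNING is emitted
SELECT pgtrickle.alter_stream_table('order_totals',
    query => 'SELECT customer_id, SUM(amount)::text AS total
              FROM orders GROUP BY customer_id');
```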

pgtrickle.drop_stream_table

Drop a stream table, removing the storage table and all catalog entries.

pgtrickle.drop_stream_table(name text) → void

Parameters:

| Parameter | Type | Description |
|---|---|---|
| name | text | Name of the stream table to drop. |

Example:

SELECT pgtrickle.drop_stream_table('order_totals');

Notes:

  • Drops the underlying storage table with CASCADE.
  • Removes all catalog entries (metadata, dependencies, refresh history).
  • Cleans up CDC triggers and change buffer tables for source tables that are no longer tracked by any ST.

pgtrickle.resume_stream_table

Resume a suspended stream table, clearing its consecutive error count and re-enabling automated and manual refreshes.

pgtrickle.resume_stream_table(name text) → void

Parameters:

| Parameter | Type | Description |
|---|---|---|
| name | text | Name of the stream table to resume (schema-qualified or unqualified). |

Example:

-- Resume a stream table that was auto-suspended due to repeated errors
SELECT pgtrickle.resume_stream_table('order_totals');

Notes:

  • Errors if the ST is not in SUSPENDED state.
  • Resets consecutive_errors to 0 and sets status = 'ACTIVE'.
  • Emits a resumed event on the pg_trickle_alert NOTIFY channel.
  • After resuming, the scheduler will include the ST in its next cycle.

pgtrickle.refresh_stream_table

Manually trigger a synchronous refresh of a stream table.

pgtrickle.refresh_stream_table(name text) → void

Parameters:

| Parameter | Type | Description |
|---|---|---|
| name | text | Name of the stream table to refresh. |

Example:

SELECT pgtrickle.refresh_stream_table('order_totals');

Notes:

  • Blocked if the ST is SUSPENDED — use pgtrickle.resume_stream_table(name) first.
  • Uses an advisory lock to prevent concurrent refreshes of the same ST.
  • For DIFFERENTIAL mode, generates and applies a delta query. For FULL mode, truncates and reloads.
  • Records the refresh in pgtrickle.pgt_refresh_history with initiated_by = 'MANUAL'.

pgtrickle.repair_stream_table

Repair a stream table by reinstalling any missing CDC triggers, validating catalog entries, and reconciling change buffer state.

pgtrickle.repair_stream_table(name text) → void

Parameters:

| Parameter | Type | Description |
|---|---|---|
| name | text | Name of the stream table to repair. |

Example:

-- Reinstall missing CDC triggers after a point-in-time recovery
SELECT pgtrickle.repair_stream_table('order_totals');

Notes:

  • Inspects all source tables in the stream table's dependency graph and reinstalls any missing or disabled CDC triggers.
  • Validates that the stream table's catalog entry, storage table, and change buffer tables are consistent.
  • Useful after pg_basebackup or PITR restores where triggers may not have been captured in the backup.
  • Use pgtrickle.trigger_inventory() first to identify which triggers are missing.
  • Safe to call on a healthy stream table — it is a no-op if everything is intact.
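
A typical recovery sequence after a restore (trigger_inventory's output columns are not documented here, so SELECT * is used):

```sql
-- 1. Inspect which CDC triggers are missing or disabled
SELECT * FROM pgtrickle.trigger_inventory();

-- 2. Reinstall them for the affected stream table
SELECT pgtrickle.repair_stream_table('order_totals');
```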

Status & Monitoring

Query the state of stream tables, view refresh statistics, and diagnose problems.


pgtrickle.pgt_status

Get the status of all stream tables.

pgtrickle.pgt_status() → SETOF record(
    name                text,
    status              text,
    refresh_mode        text,
    is_populated        bool,
    consecutive_errors  int,
    schedule            text,
    data_timestamp      timestamptz,
    staleness           interval
)

Example:

SELECT * FROM pgtrickle.pgt_status();
| name | status | refresh_mode | is_populated | consecutive_errors | schedule | data_timestamp | staleness |
|---|---|---|---|---|---|---|---|
| public.order_totals | ACTIVE | DIFFERENTIAL | true | 0 | 5m | 2026-02-21 12:00:00+00 | 00:02:30 |

pgtrickle.health_check

Run a set of health checks against the pg_trickle installation and return one row per check.

pgtrickle.health_check() → SETOF record(
    check_name  text,   -- identifier for the check
    severity    text,   -- 'OK', 'WARN', or 'ERROR'
    detail      text    -- human-readable explanation
)

Filter to problems only:

SELECT check_name, severity, detail
FROM pgtrickle.health_check()
WHERE severity != 'OK';

Checks:

  • scheduler_running
  • error_tables
  • stale_tables
  • needs_reinit
  • consecutive_errors
  • buffer_growth (> 10 000 pending rows)
  • slot_lag (retained WAL above pg_trickle.slot_lag_warning_threshold_mb, default 100 MB)
  • worker_pool (all worker tokens in use — parallel mode only)
  • job_queue (> 10 jobs queued — parallel mode only)
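
The slot_lag threshold is an ordinary GUC and can be tuned; for example (value illustrative, and depending on the GUC's context it may need to be set in postgresql.conf rather than per session):

```sql
-- Raise the retained-WAL warning threshold from the 100 MB default to 500 MB
SET pg_trickle.slot_lag_warning_threshold_mb = 500;
```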


pgtrickle.health_summary

Single-row summary of the entire pg_trickle deployment's health. Designed for monitoring dashboards that want one endpoint to poll instead of joining multiple views.

pgtrickle.health_summary() → SETOF record(
    total_stream_tables   int,
    active_count          int,
    error_count           int,
    suspended_count       int,
    stale_count           int,
    reinit_pending        int,
    max_staleness_seconds float8,    -- NULL if no stream tables
    scheduler_status      text,      -- 'ACTIVE', 'STOPPED', or 'NOT_LOADED'
    cache_hit_rate        float8     -- NULL if no cache lookups yet
)

Example:

SELECT * FROM pgtrickle.health_summary();
| total_stream_tables | active_count | error_count | suspended_count | stale_count | reinit_pending | max_staleness_seconds | scheduler_status | cache_hit_rate |
|---|---|---|---|---|---|---|---|---|
| 12 | 11 | 0 | 1 | 0 | 0 | 45.2 | ACTIVE | 0.94 |

Tip: Use this in a Grafana single-stat panel or a Prometheus exporter to surface fleet-level health at a glance.


pgtrickle.refresh_timeline

Return recent refresh records across all stream tables in a single chronological view.

pgtrickle.refresh_timeline(
    max_rows int  DEFAULT 50
) → SETOF record(
    start_time      timestamptz,
    stream_table    text,
    action          text,
    status          text,
    rows_inserted   bigint,
    rows_deleted    bigint,
    duration_ms     float8,
    error_message   text
)

Example:

-- Most recent 20 events across all stream tables:
SELECT start_time, stream_table, action, status, round(duration_ms::numeric,1) AS ms
FROM pgtrickle.refresh_timeline(20);

-- Just failures in the last 100 events:
SELECT * FROM pgtrickle.refresh_timeline(100) WHERE status = 'ERROR';

pgtrickle.st_refresh_stats

Return per-ST refresh statistics aggregated from the refresh history.

pgtrickle.st_refresh_stats() → SETOF record(
    pgt_name                text,
    pgt_schema              text,
    status                 text,
    refresh_mode           text,
    is_populated           bool,
    total_refreshes        bigint,
    successful_refreshes   bigint,
    failed_refreshes       bigint,
    total_rows_inserted    bigint,
    total_rows_deleted     bigint,
    avg_duration_ms        float8,
    last_refresh_action    text,
    last_refresh_status    text,
    last_refresh_at        timestamptz,
    staleness_secs         float8,
    stale                  bool
)

Example:

SELECT pgt_name, status, total_refreshes, avg_duration_ms, stale
FROM pgtrickle.st_refresh_stats();

pgtrickle.get_refresh_history

Return refresh history for a specific stream table.

pgtrickle.get_refresh_history(
    name      text,
    max_rows  int  DEFAULT 20
) → SETOF record(
    refresh_id       bigint,
    data_timestamp   timestamptz,
    start_time       timestamptz,
    end_time         timestamptz,
    action           text,
    status           text,
    rows_inserted    bigint,
    rows_deleted     bigint,
    duration_ms      float8,
    error_message    text
)

Example:

SELECT action, status, rows_inserted, duration_ms
FROM pgtrickle.get_refresh_history('order_totals', 5);

pgtrickle.get_staleness

Get the current staleness in seconds for a specific stream table.

pgtrickle.get_staleness(name text) → float8

Returns NULL if the ST has never been refreshed.

Example:

SELECT pgtrickle.get_staleness('order_totals');
-- Returns: 12.345  (seconds since last refresh)

pgtrickle.explain_refresh_mode

Added in v0.11.0

Explain the configured vs. effective refresh mode for a stream table, including the reason for any downgrade (e.g., AUTO choosing FULL).

pgtrickle.explain_refresh_mode(name text) → TABLE(
    configured_mode  text,
    effective_mode   text,
    downgrade_reason text
)

Columns:

| Column | Type | Description |
|---|---|---|
| configured_mode | text | The refresh mode set on the stream table (e.g., DIFFERENTIAL, AUTO, FULL, IMMEDIATE) |
| effective_mode | text | The mode actually used on the most recent refresh. NULL for IMMEDIATE mode (handled by triggers) |
| downgrade_reason | text | Human-readable explanation when effective_mode differs from configured_mode, or informational note for IMMEDIATE / APPEND_ONLY |

Example:

SELECT * FROM pgtrickle.explain_refresh_mode('public.orders_summary');
 configured_mode | effective_mode | downgrade_reason
-----------------+----------------+------------------
 AUTO            | FULL           | The most recent refresh used FULL mode. Possible causes: defining query contains a CTE or unsupported operator, adaptive change-ratio threshold was exceeded, or aggregate saturation occurred. Check pgtrickle.pgt_refresh_history for details.

pgtrickle.cache_stats

Return template cache statistics from shared memory.

Reports L1 (thread-local) hits, L2 (catalog table) hits, full misses (DVM re-parse), evictions (generation flushes), and the current L1 cache size for this backend.

pgtrickle.cache_stats() → SETOF record(
    l1_hits    bigint,
    l2_hits    bigint,
    misses     bigint,
    evictions  bigint,
    l1_size    integer
)

| Column | Description |
|---|---|
| l1_hits | Number of delta template cache hits in the thread-local (L1) cache. ~0 ns lookup. |
| l2_hits | Number of delta template cache hits in the catalog table (L2) cache. ~1 ms SPI lookup. |
| misses | Number of full cache misses requiring a DVM re-parse (~45 ms). |
| evictions | Number of entries evicted from L1 due to DDL-triggered generation flushes. |
| l1_size | Current number of entries in this backend's L1 cache. |

Example:

SELECT * FROM pgtrickle.cache_stats();
 l1_hits | l2_hits | misses | evictions | l1_size
---------+---------+--------+-----------+---------
    1423 |       5 |      1 |         0 |       8

Note: Counters are cluster-wide (shared memory) except l1_size, which is per-backend. Requires shared_preload_libraries = 'pg_trickle'; returns zeros when the extension is loaded dynamically.
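Assuming the counters above, an overall cache hit rate can be derived directly from this function (guarding against division by zero before any lookups have occurred):

```sql
SELECT (l1_hits + l2_hits)::float8
       / nullif(l1_hits + l2_hits + misses, 0) AS hit_rate
FROM pgtrickle.cache_stats();
```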


CDC Diagnostics

Inspect CDC pipeline health, replication slots, change buffers, and trigger coverage.


pgtrickle.slot_health

Check replication slot health for all tracked CDC slots.

pgtrickle.slot_health() → SETOF record(
    slot_name          text,
    source_relid       bigint,
    active             bool,
    retained_wal_bytes bigint,
    wal_status         text
)

Example:

SELECT * FROM pgtrickle.slot_health();
 slot_name             | source_relid | active | retained_wal_bytes | wal_status
-----------------------+--------------+--------+--------------------+------------
 pg_trickle_slot_16384 |        16384 | false  |            1048576 | reserved

pgtrickle.check_cdc_health

Check CDC health for all tracked source tables. Returns per-source health status including the current CDC mode, replication slot details, estimated lag, and any alerts.

The alert column uses the critical threshold configured by pg_trickle.slot_lag_critical_threshold_mb (default 1024 MB).

pgtrickle.check_cdc_health() → SETOF record(
    source_relid   bigint,
    source_table   text,
    cdc_mode       text,
    slot_name      text,
    lag_bytes      bigint,
    confirmed_lsn  text,
    alert          text
)

Columns:

| Column | Type | Description |
|---|---|---|
| source_relid | bigint | OID of the tracked source table |
| source_table | text | Resolved name of the source table (e.g., public.orders) |
| cdc_mode | text | Current CDC mode: TRIGGER, TRANSITIONING, or WAL |
| slot_name | text | Replication slot name (NULL for TRIGGER mode) |
| lag_bytes | bigint | Replication slot lag in bytes (NULL for TRIGGER mode) |
| confirmed_lsn | text | Last confirmed WAL position (NULL for TRIGGER mode) |
| alert | text | Alert message if unhealthy (e.g., slot_lag_exceeds_threshold, replication_slot_missing) |

Example:

SELECT * FROM pgtrickle.check_cdc_health();
 source_relid | source_table  | cdc_mode | slot_name             | lag_bytes | confirmed_lsn | alert
--------------+---------------+----------+-----------------------+-----------+---------------+-------
        16384 | public.orders | TRIGGER  |                       |           |               |
        16390 | public.events | WAL      | pg_trickle_slot_16390 |    524288 | 0/1A8B000     |
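To surface only unhealthy sources, filter on the alert column (assuming, per the column description above, that alert is NULL for healthy sources):

```sql
-- Only rows with an active alert
SELECT source_table, cdc_mode, lag_bytes, alert
FROM pgtrickle.check_cdc_health()
WHERE alert IS NOT NULL;
```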

pgtrickle.change_buffer_sizes

Show pending change counts and estimated on-disk sizes for all CDC-tracked source tables.

Returns one row per (stream_table, source_table) pair.

pgtrickle.change_buffer_sizes() → SETOF record(
    stream_table  text,     -- qualified stream table name
    source_table  text,     -- qualified source table name
    source_oid    bigint,
    cdc_mode      text,     -- 'trigger', 'wal', or 'transitioning'
    pending_rows  bigint,   -- rows in buffer not yet consumed
    buffer_bytes  bigint    -- estimated buffer table size in bytes
)

Example:

SELECT * FROM pgtrickle.change_buffer_sizes()
ORDER BY pending_rows DESC;

Useful for spotting a source table whose CDC buffer is growing unexpectedly (which may indicate a stalled differential refresh or a high-write source that has outpaced the schedule).


pgtrickle.worker_pool_status

Return a snapshot of the parallel refresh worker pool as a single row.

pgtrickle.worker_pool_status() → SETOF record(
    active_workers  int,   -- workers currently executing refresh jobs
    max_workers     int,   -- cluster-wide worker budget (GUC)
    per_db_cap      int,   -- per-database dispatch cap (GUC)
    parallel_mode   text   -- current parallel_refresh_mode value
)

Example:

SELECT * FROM pgtrickle.worker_pool_status();

Returns 0 active workers when parallel_refresh_mode = 'off'.


pgtrickle.parallel_job_status

Return active and recently completed scheduler jobs from the pgt_scheduler_jobs table. Shows jobs that are currently queued or running, plus jobs that finished within the last max_age_seconds (default 300).

pgtrickle.parallel_job_status(
    max_age_seconds int  DEFAULT 300
) → SETOF record(
    job_id         bigint,
    unit_key       text,        -- stable unit identifier (s:42, a:1,2, etc.)
    unit_kind      text,        -- 'singleton', 'atomic_group', 'immediate_closure'
    status         text,        -- 'QUEUED', 'RUNNING', 'SUCCEEDED', etc.
    member_count   int,
    attempt_no     int,
    scheduler_pid  int,
    worker_pid     int,         -- NULL if not yet claimed
    enqueued_at    timestamptz,
    started_at     timestamptz, -- NULL if still queued
    finished_at    timestamptz, -- NULL if not finished
    duration_ms    float8       -- NULL if not finished
)

Example — show queued, running, and recently failed jobs:

SELECT job_id, unit_key, status, duration_ms
FROM pgtrickle.parallel_job_status(60)
WHERE status NOT IN ('SUCCEEDED');

pgtrickle.trigger_inventory

List all CDC triggers that pg_trickle should have installed, and verify each one exists and is enabled in pg_catalog.

pgtrickle.trigger_inventory() → SETOF record(
    source_table  text,    -- qualified source table name
    source_oid    bigint,
    trigger_name  text,    -- expected trigger name
    trigger_type  text,    -- 'DML' or 'TRUNCATE'
    present       bool,    -- trigger exists in pg_catalog
    enabled       bool     -- trigger is not disabled
)

A present = false row means change capture is broken for that source.

Example:

-- Show only missing or disabled triggers:
SELECT source_table, trigger_type, trigger_name
FROM pgtrickle.trigger_inventory()
WHERE NOT present OR NOT enabled;

pgtrickle.fuse_status

Return the circuit-breaker (fuse) state for every stream table that has a fuse configured.

pgtrickle.fuse_status() → SETOF record(
    name           text,         -- stream table name
    fuse_mode      text,         -- 'off', 'on', or 'auto'
    fuse_state     text,         -- 'armed' or 'blown'
    fuse_ceiling   bigint,       -- change-count threshold
    fuse_sensitivity int,        -- consecutive over-ceiling cycles before blow
    blown_at       timestamptz,  -- when the fuse last blew (NULL if armed)
    blow_reason    text          -- reason the fuse blew (NULL if armed)
)

Example:

-- Check all fuse-enabled stream tables
SELECT name, fuse_mode, fuse_state, fuse_ceiling, blown_at
FROM pgtrickle.fuse_status();

-- Find blown fuses
SELECT name, blow_reason, blown_at
FROM pgtrickle.fuse_status()
WHERE fuse_state = 'blown';

Notes:

  • Returns one row per stream table where fuse_mode != 'off'.
  • A blown fuse suspends differential refreshes until cleared with pgtrickle.reset_fuse().
  • A pgtrickle_alert NOTIFY with event fuse_blown is emitted when the fuse trips.
  • See Configuration — fuse_default_ceiling for global defaults.
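Since a pgtrickle_alert notification is emitted when a fuse trips, a long-lived monitoring session can subscribe to it with standard LISTEN/NOTIFY; a minimal sketch (the payload format is not specified here, so inspect it before parsing):

```sql
-- In a persistent monitoring connection:
LISTEN pgtrickle_alert;
-- Notifications arrive asynchronously; watch for the fuse_blown event
-- and the affected stream table in the notification payload.
```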

pgtrickle.reset_fuse

Clear a blown circuit-breaker fuse and resume scheduling for the stream table.

pgtrickle.reset_fuse(name text, action text DEFAULT 'apply') → void

Parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| name | text | — | Name of the stream table whose fuse to reset. |
| action | text | 'apply' | How to handle the pending changes that caused the fuse to blow. |

Actions:

| Action | Behavior |
|---|---|
| 'apply' | Process all pending changes normally and resume scheduling. |
| 'reinitialize' | Drop and repopulate the stream table from scratch (full refresh from the defining query). |
| 'skip_changes' | Discard the pending changes that triggered the fuse and resume from the current frontier. |

Example:

-- After investigating a bulk load, apply the changes:
SELECT pgtrickle.reset_fuse('category_summary', action => 'apply');

-- Or skip the oversized batch entirely:
SELECT pgtrickle.reset_fuse('category_summary', action => 'skip_changes');

-- Or rebuild from scratch:
SELECT pgtrickle.reset_fuse('category_summary', action => 'reinitialize');

Notes:

  • Errors if the stream table's fuse is not in 'blown' state.
  • After reset, the fuse returns to 'armed' state and the scheduler resumes normal operation.
  • Use pgtrickle.fuse_status() to inspect the fuse state before resetting.
  • The 'skip_changes' action advances the frontier past the pending changes without applying them — use only when you are certain the changes should be discarded.

Dependency & Inspection

Visualize dependencies, understand query plans, and audit source table relationships.


pgtrickle.dependency_tree

Render all stream table dependencies as an indented ASCII tree.

pgtrickle.dependency_tree() → SETOF record(
    tree_line    text,    -- indented visual line (├──, └──, │ characters)
    node         text,    -- qualified name (schema.table)
    node_type    text,    -- 'stream_table' or 'source_table'
    depth        int,
    status       text,    -- NULL for source_table nodes
    refresh_mode text     -- NULL for source_table nodes
)

Roots (stream tables with no stream-table parents) appear at depth 0. Each dependent is indented beneath its parent. Plain source tables are rendered as leaf nodes tagged [src].

Example:

SELECT tree_line, status, refresh_mode
FROM pgtrickle.dependency_tree();
tree_line                               status   refresh_mode
----------------------------------------+---------+--------------
report_summary                          ACTIVE   DIFFERENTIAL
├── orders_by_region                    ACTIVE   DIFFERENTIAL
│   ├── public.orders [src]
│   └── public.customers [src]
└── revenue_totals                      ACTIVE   DIFFERENTIAL
    └── public.orders [src]

pgtrickle.diamond_groups

List all detected diamond dependency groups and their members.

When stream tables form diamond-shaped dependency graphs (multiple paths converge at a single fan-in node), the scheduler groups them for coordinated refresh. This function exposes those groups for monitoring and debugging.

pgtrickle.diamond_groups() → SETOF record(
    group_id        int4,
    member_name     text,
    member_schema   text,
    is_convergence  bool,
    epoch           int8,
    schedule_policy text
)

Return columns:

| Column | Type | Description |
|---|---|---|
| group_id | int4 | Numeric identifier for the consistency group (1-based). |
| member_name | text | Name of the stream table in this group. |
| member_schema | text | Schema of the stream table. |
| is_convergence | bool | true if this member is a convergence (fan-in) node where multiple paths meet. |
| epoch | int8 | Group epoch counter — advances on each successful atomic refresh of the group. |
| schedule_policy | text | Effective schedule policy for this group ('fastest' or 'slowest'), computed from the convergence nodes' settings (strictest wins). |

Example:

SELECT * FROM pgtrickle.diamond_groups();
 group_id | member_name | member_schema | is_convergence | epoch | schedule_policy
----------+-------------+---------------+----------------+-------+-----------------
        1 | st_b        | public        | false          |     0 | fastest
        1 | st_c        | public        | false          |     0 | fastest
        1 | st_d        | public        | true           |     0 | fastest

Notes:

  • Singleton stream tables (not part of any diamond) are omitted.
  • The DAG is rebuilt on each call from the catalog — results reflect the current dependency graph.
  • Groups are only relevant when diamond_consistency = 'atomic' is set on the convergence node or globally via the pg_trickle.diamond_consistency GUC.

pgtrickle.pgt_scc_status

List all cyclic strongly connected components (SCCs) and their convergence status.

When stream tables form circular dependencies (with pg_trickle.allow_circular = true), they are grouped into SCCs and iterated to a fixed point. This function exposes those groups for monitoring and debugging.

pgtrickle.pgt_scc_status() → SETOF record(
    scc_id              int4,
    member_count        int4,
    members             text[],
    last_iterations     int4,
    last_converged_at   timestamptz
)

Return columns:

| Column | Type | Description |
|---|---|---|
| scc_id | int4 | SCC group identifier (1-based). |
| member_count | int4 | Number of stream tables in this SCC. |
| members | text[] | Array of schema.name for each member. |
| last_iterations | int4 | Number of fixpoint iterations in the last convergence (NULL if never iterated). |
| last_converged_at | timestamptz | Timestamp of the most recent refresh among SCC members (NULL if never refreshed). |

Example:

SELECT * FROM pgtrickle.pgt_scc_status();
 scc_id | member_count | members                         | last_iterations | last_converged_at
--------+--------------+---------------------------------+-----------------+------------------------
      1 |            2 | {public.reach_a,public.reach_b} |               3 | 2026-03-15 12:00:00+00

Notes:

  • Only cyclic SCCs (with scc_id IS NOT NULL) are returned. Acyclic stream tables are omitted.
  • last_iterations reflects the maximum last_fixpoint_iterations across SCC members.
  • Results are queried from the catalog on each call.

pgtrickle.explain_st

Explain the DVM plan for a stream table's defining query.

pgtrickle.explain_st(name text) → SETOF record(
    property  text,
    value     text
)

Example:

SELECT * FROM pgtrickle.explain_st('order_totals');
 property             | value
----------------------+---------------------------------------------------
 pgt_name             | public.order_totals
 defining_query       | SELECT region, SUM(amount) ...
 refresh_mode         | DIFFERENTIAL
 status               | active
 is_populated         | true
 dvm_supported        | true
 operator_tree        | Aggregate → Scan(orders)
 output_columns       | region, total
 source_oids          | 16384
 delta_query          | WITH ... SELECT ...
 frontier             | {"orders": "0/15A3B80"}
 amplification_stats  | {"samples":10,"min":1.0,...}
 refresh_timing_stats | {"samples":10,"min_ms":12.3,...}
 source_partitions    | [{"source":"public.orders",...}]
 dependency_graph_dot | digraph dependency_subgraph { ... }
 spill_info           | {"temp_blks_read":0,"temp_blks_written":1234,...}

Output Fields

| Property | Description |
|---|---|
| pgt_name | Fully-qualified stream table name |
| defining_query | The SQL query that defines the stream table |
| refresh_mode | DIFFERENTIAL, FULL, or IMMEDIATE |
| status | Current status (active, suspended, etc.) |
| is_populated | Whether the stream table has been initially populated |
| dvm_supported | Whether the defining query supports differential view maintenance |
| operator_tree | Debug representation of the DVM operator tree |
| output_columns | Comma-separated list of output column names |
| source_oids | Comma-separated list of source table OIDs |
| aggregate_strategies | Per-aggregate maintenance strategies (JSON, if aggregates present) |
| delta_query | The generated delta SQL used for DIFFERENTIAL refresh |
| frontier | Current LSN/watermark frontier (JSON) |
| amplification_stats | Delta amplification ratio statistics over the last 20 refreshes (JSON) |
| refresh_timing_stats | Refresh duration statistics over the last 20 completed refreshes (JSON). Fields: samples, min_ms, max_ms, avg_ms, latest_ms, latest_action |
| source_partitions | Partition info for partitioned source tables (JSON array). Fields per entry: source, partition_key, partitions |
| dependency_graph_dot | Dependency sub-graph in DOT format. Shows immediate upstream sources (ellipses for base tables, boxes for stream tables) and downstream dependents. Paste into a Graphviz renderer to visualize. |
| spill_info | Temp file spill metrics from pg_stat_statements (JSON). Fields: temp_blks_read, temp_blks_written, threshold, exceeds_threshold. Only present when pg_trickle.spill_threshold_blocks > 0. |

Note: Properties are only included when data is available. For example, source_partitions only appears when at least one source table is partitioned, and refresh_timing_stats only appears after at least one completed refresh.
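One practical use of the dependency_graph_dot property is to extract just the DOT text for rendering:

```sql
-- Pull only the DOT graph; paste the result into any Graphviz renderer
SELECT value
FROM pgtrickle.explain_st('order_totals')
WHERE property = 'dependency_graph_dot';
```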


pgtrickle.list_sources

List the source tables that a stream table depends on.

pgtrickle.list_sources(name text) → SETOF record(
    source_table   text,         -- qualified source table name
    source_oid     bigint,
    source_type    text,         -- 'table', 'stream_table', etc.
    cdc_mode       text,         -- 'trigger', 'wal', or 'transitioning'
    columns_used   text          -- column-level dependency info (if available)
)

Example:

SELECT * FROM pgtrickle.list_sources('order_totals');

Returns the tables tracked by CDC for the given stream table, along with how they are being tracked. Useful when diagnosing why a stream table is not refreshing or to audit which source tables are being trigger-tracked.


Utilities

Utility functions for CDC management and row identity hashing.


pgtrickle.rebuild_cdc_triggers

Rebuild all CDC triggers (function body + trigger DDL) for every source table tracked by pg_trickle. This recreates trigger functions and re-attaches the trigger to each source table.

pgtrickle.rebuild_cdc_triggers() → text

Returns 'done' on success. Emits a WARNING per table on error and continues processing remaining sources.

When to use:

  • After changing pg_trickle.cdc_trigger_mode from row to statement (or vice versa).
  • After ALTER EXTENSION pg_trickle UPDATE when the CDC trigger function body has changed.
  • After restoring from a backup where triggers may have been lost.

Example:

-- Switch to statement-level triggers and rebuild
SET pg_trickle.cdc_trigger_mode = 'statement';
SELECT pgtrickle.rebuild_cdc_triggers();

Notes:

  • Called automatically during ALTER EXTENSION pg_trickle UPDATE (0.3.0 → 0.4.0) migration.
  • Safe to call at any time — existing triggers are dropped and recreated.
  • On error for a specific table, a WARNING is logged and processing continues with remaining sources.

pgtrickle.pg_trickle_hash

Compute a 64-bit xxHash row ID from a text value.

pgtrickle.pg_trickle_hash(input text) → bigint

Marked IMMUTABLE, PARALLEL SAFE.

Example:

SELECT pgtrickle.pg_trickle_hash('some_key');
-- Returns: 1234567890123456789

pgtrickle.pg_trickle_hash_multi

Compute a row ID by hashing multiple text values (composite keys).

pgtrickle.pg_trickle_hash_multi(inputs text[]) → bigint

Marked IMMUTABLE, PARALLEL SAFE. Uses \x1E (record separator) between values and \x00NULL\x00 for NULL entries.

Example:

SELECT pgtrickle.pg_trickle_hash_multi(ARRAY['key1', 'key2']);
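Because NULL entries are encoded with a distinct sentinel (\x00NULL\x00), a NULL element hashes differently from an empty string:

```sql
SELECT pgtrickle.pg_trickle_hash_multi(ARRAY['key1', NULL]) AS with_null,
       pgtrickle.pg_trickle_hash_multi(ARRAY['key1', ''])   AS with_empty;
-- The two values differ
```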

Operator Support Matrix — Summary

pg_trickle supports 60+ SQL constructs across three refresh modes. The table below summarises broad categories. For the complete per-operator matrix (including notes on caveats, auxiliary columns and strategies), see DVM_OPERATORS.md.

CategoryFULLDIFFERENTIALIMMEDIATENotes
Basic SELECT / WHERE / DISTINCT
Joins (INNER, LEFT, RIGHT, FULL, CROSS, LATERAL)Hybrid delta strategy
Subqueries (EXISTS, IN, NOT EXISTS, NOT IN, scalar)
Set operations (UNION ALL, INTERSECT, EXCEPT)
Algebraic aggregates (COUNT, SUM, AVG, STDDEV, …)Fully invertible delta
Semi-algebraic aggregates (MIN, MAX)Group rescan on ambiguous delete
Group-rescan aggregates (STRING_AGG, ARRAY_AGG, …)⚠️⚠️Warning emitted at creation time
Window functions (ROW_NUMBER, RANK, LAG, LEAD, …)Partition-scoped recompute
CTEs (non-recursive and WITH RECURSIVE)Semi-naive / DRed strategies
TopK (ORDER BY … LIMIT)Scoped recomputation
LATERAL / set-returning functions / JSON_TABLERow-scoped re-execution
ST-to-ST dependenciesDifferential via change buffers
VOLATILE functionsRejected at creation time

Legend: ✅ fully supported — ⚠️ supported with caveats — ❌ not supported

For details on each operator's delta strategy, auxiliary columns, and known limitations, see the full Operator Support Matrix.


Expression Support

pg_trickle's DVM parser supports a wide range of SQL expressions in defining queries. All expressions work in both FULL and DIFFERENTIAL modes.

Conditional Expressions

| Expression | Example | Notes |
|---|---|---|
| CASE WHEN … THEN … ELSE … END | CASE WHEN amount > 100 THEN 'high' ELSE 'low' END | Searched CASE |
| CASE <expr> WHEN … THEN … END | CASE status WHEN 1 THEN 'active' WHEN 2 THEN 'inactive' END | Simple CASE |
| COALESCE(a, b, …) | COALESCE(phone, email, 'unknown') | Returns first non-NULL argument |
| NULLIF(a, b) | NULLIF(divisor, 0) | Returns NULL if a = b |
| GREATEST(a, b, …) | GREATEST(score1, score2, score3) | Returns the largest value |
| LEAST(a, b, …) | LEAST(price, max_price) | Returns the smallest value |

Comparison Operators

| Expression | Example | Notes |
|---|---|---|
| IN (list) | category IN ('A', 'B', 'C') | Also supports NOT IN |
| BETWEEN a AND b | price BETWEEN 10 AND 100 | Also supports NOT BETWEEN |
| IS DISTINCT FROM | a IS DISTINCT FROM b | NULL-safe inequality |
| IS NOT DISTINCT FROM | a IS NOT DISTINCT FROM b | NULL-safe equality |
| SIMILAR TO | name SIMILAR TO '%pattern%' | SQL regex matching |
| op ANY(array) | id = ANY(ARRAY[1,2,3]) | Array comparison |
| op ALL(array) | score > ALL(ARRAY[50,60]) | Array comparison |

Boolean Tests

| Expression | Example |
|---|---|
| IS TRUE | active IS TRUE |
| IS NOT TRUE | flag IS NOT TRUE |
| IS FALSE | completed IS FALSE |
| IS NOT FALSE | valid IS NOT FALSE |
| IS UNKNOWN | result IS UNKNOWN |
| IS NOT UNKNOWN | flag IS NOT UNKNOWN |

SQL Value Functions

| Function | Description |
|---|---|
| CURRENT_DATE | Current date |
| CURRENT_TIME | Current time with time zone |
| CURRENT_TIMESTAMP | Current date and time with time zone |
| LOCALTIME | Current time without time zone |
| LOCALTIMESTAMP | Current date and time without time zone |
| CURRENT_ROLE | Current role name |
| CURRENT_USER | Current user name |
| SESSION_USER | Session user name |
| CURRENT_CATALOG | Current database name |
| CURRENT_SCHEMA | Current schema name |

Array and Row Expressions

| Expression | Example | Notes |
|---|---|---|
| ARRAY[…] | ARRAY[1, 2, 3] | Array constructor |
| ROW(…) | ROW(a, b, c) | Row constructor |
| Array subscript | arr[1] | Array element access |
| Field access | (rec).field | Composite type field access |
| Star indirection | (data).* | Expand all fields |

Subquery Expressions

Subqueries are supported in the WHERE clause and SELECT list. They are parsed into dedicated DVM operators with specialized delta computation for incremental maintenance.

| Expression | Example | DVM Operator |
|---|---|---|
| EXISTS (subquery) | WHERE EXISTS (SELECT 1 FROM orders WHERE orders.cid = c.id) | Semi-Join |
| NOT EXISTS (subquery) | WHERE NOT EXISTS (SELECT 1 FROM orders WHERE orders.cid = c.id) | Anti-Join |
| IN (subquery) | WHERE id IN (SELECT product_id FROM order_items) | Semi-Join (rewritten as equality) |
| NOT IN (subquery) | WHERE id NOT IN (SELECT product_id FROM order_items) | Anti-Join |
| ALL (subquery) | WHERE price > ALL (SELECT price FROM competitors) | Anti-Join (NULL-safe) |
| Scalar subquery (SELECT) | SELECT (SELECT max(price) FROM products) AS max_p | Scalar Subquery |

Notes:

  • EXISTS and IN (subquery) in the WHERE clause are transformed into semi-join operators. NOT EXISTS and NOT IN (subquery) become anti-join operators.
  • Multi-column IN (subquery) is not supported (e.g., WHERE (a, b) IN (SELECT x, y FROM ...)). Rewrite as WHERE EXISTS (SELECT 1 FROM ... WHERE a = x AND b = y) for equivalent semantics.
  • Multiple subqueries in the same WHERE clause are supported when combined with AND. Subqueries combined with OR are also supported — they are automatically rewritten into UNION of separate filtered queries.
  • Scalar subqueries in the SELECT list are supported as long as they return exactly one row and one column.
  • ALL (subquery) is supported — see the worked example below.
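The multi-column IN rewrite mentioned above looks like this in practice (table and column names are illustrative):

```sql
-- Not supported:
--   WHERE (a, b) IN (SELECT x, y FROM t)
-- Equivalent supported form:
SELECT *
FROM pairs p
WHERE EXISTS (
    SELECT 1 FROM t
    WHERE t.x = p.a AND t.y = p.b
);
```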

ALL (subquery) — Worked Example

ALL (subquery) tests whether a comparison holds against every row returned by the subquery. pg_trickle rewrites it to a NULL-safe anti-join so it can be maintained incrementally.

Comparison operators supported: >, >=, <, <=, =, <>

Example — products cheaper than all competitors:

-- Source tables
CREATE TABLE products (
    id    INT PRIMARY KEY,
    name  TEXT,
    price NUMERIC
);
CREATE TABLE competitor_prices (
    id          INT PRIMARY KEY,
    product_id  INT,
    price       NUMERIC
);

-- Sample data
INSERT INTO products VALUES (1, 'Widget', 9.99), (2, 'Gadget', 24.99), (3, 'Gizmo', 14.99);
INSERT INTO competitor_prices VALUES (1, 1, 12.99), (2, 1, 11.50), (3, 2, 19.99), (4, 3, 14.99);

-- Stream table: find products priced below ALL competitor prices
SELECT pgtrickle.create_stream_table(
    name  => 'cheapest_products',
    query => $$
        SELECT p.id, p.name, p.price
        FROM products p
        WHERE p.price < ALL (
            SELECT cp.price
            FROM competitor_prices cp
            WHERE cp.product_id = p.id
        )
    $$,
    schedule => '1m'
);

Result: Widget (9.99 < all of [12.99, 11.50]) is included. Gadget (24.99 ≮ 19.99) is excluded. Gizmo (14.99 ≮ 14.99) is excluded.

How pg_trickle handles it internally:

  1. WHERE price < ALL (SELECT ...) is parsed into an anti-join with a NULL-safe condition.
  2. The condition NOT (x op col) is wrapped as (col IS NULL OR NOT (x op col)) to correctly handle NULL values in the subquery — if any subquery row is NULL, the ALL comparison fails (standard SQL semantics).
  3. The anti-join uses the same incremental delta computation as NOT EXISTS, so changes to either products or competitor_prices are propagated efficiently.

Other common patterns:

-- Employees whose salary meets or exceeds all department maximums
WHERE salary >= ALL (SELECT max_salary FROM department_caps)

-- Orders with ratings better than all thresholds
WHERE rating > ALL (SELECT min_rating FROM quality_thresholds)

Auto-Rewrite Pipeline

pg_trickle transparently rewrites certain SQL constructs before parsing. These rewrites are applied automatically and require no user action:

| Order | Trigger | Rewrite |
|---|---|---|
| #0 | View references in FROM | Inline view body as subquery |
| #1 | DISTINCT ON (expr) | Convert to ROW_NUMBER() OVER (PARTITION BY expr ORDER BY ...) = 1 subquery |
| #2 | GROUPING SETS / CUBE / ROLLUP | Decompose into UNION ALL of separate GROUP BY queries |
| #3 | Scalar subquery in WHERE | Convert to CROSS JOIN with inline view |
| #4 | Correlated scalar subquery in SELECT | Convert to LEFT JOIN with grouped inline view |
| #5 | EXISTS/IN inside OR | Split into UNION of separate filtered queries |
| #6 | Multiple PARTITION BY clauses | Split into joined subqueries, one per distinct partitioning |
| #7 | Window functions inside expressions | Lift to inner subquery with synthetic __pgt_wf_N columns (see below) |
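As an illustration of rewrite #1, here is a DISTINCT ON query alongside the shape of its rewritten form (the rewritten SQL is a sketch of the transformation, not the literal generated text):

```sql
-- What you write:
SELECT DISTINCT ON (dept) dept, name, salary
FROM employees
ORDER BY dept, salary DESC;

-- Roughly what the rewrite produces:
SELECT dept, name, salary
FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) AS rn
    FROM employees
) sub
WHERE rn = 1;
```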

Window Functions in Expressions (Auto-Rewrite)

Window functions nested inside expressions (e.g., CASE WHEN ROW_NUMBER() ..., ABS(RANK() OVER (...) - 5)) are automatically rewritten. pg_trickle lifts each window function call into a synthetic column in an inner subquery, then applies the original expression in the outer SELECT.

This rewrite is transparent — you write your query naturally and pg_trickle handles it:

Your query:

SELECT
    id,
    name,
    CASE WHEN ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) = 1
         THEN 'top earner'
         ELSE 'other'
    END AS rank_label
FROM employees

What pg_trickle generates internally:

SELECT
    "__pgt_wf_inner".id,
    "__pgt_wf_inner".name,
    CASE WHEN "__pgt_wf_inner"."__pgt_wf_1" = 1
         THEN 'top earner'
         ELSE 'other'
    END AS "rank_label"
FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) AS "__pgt_wf_1"
    FROM employees
) "__pgt_wf_inner"

The inner subquery produces the window function result as a plain column (__pgt_wf_1), which the DVM engine can maintain incrementally using its existing window function support. The outer expression is then a simple column reference.

More examples:

-- Arithmetic with window functions
SELECT id, ABS(RANK() OVER (ORDER BY score) - 5) AS adjusted_rank
FROM players

-- COALESCE with window function
SELECT id, COALESCE(LAG(value) OVER (ORDER BY ts), 0) AS prev_value
FROM sensor_readings

-- Multiple window functions in expressions
SELECT id,
       ROW_NUMBER() OVER (ORDER BY created_at) * 100 AS seq,
       SUM(amount) OVER (ORDER BY created_at) / COUNT(*) OVER (ORDER BY created_at) AS running_avg
FROM transactions

All of these are handled automatically — each distinct window function call is extracted to its own __pgt_wf_N synthetic column.

HAVING Clause

HAVING is fully supported. The filter predicate is applied on top of the aggregate delta computation — groups that pass the HAVING condition are included in the stream table.

SELECT pgtrickle.create_stream_table(
    name     => 'big_departments',
    query    => 'SELECT department, COUNT(*) AS cnt FROM employees GROUP BY department HAVING COUNT(*) > 10',
    schedule => '1m'
);

Tables Without Primary Keys (Keyless Tables)

Tables without a primary key can be used as sources. pg_trickle generates a content-based row identity by hashing all column values using pg_trickle_hash_multi(). This allows DIFFERENTIAL mode to work, though at the cost of being unable to distinguish truly duplicate rows (rows with identical values in all columns).

-- No primary key — pg_trickle uses content hashing for row identity
CREATE TABLE events (ts TIMESTAMPTZ, payload JSONB);
SELECT pgtrickle.create_stream_table(
    name     => 'event_summary',
    query    => 'SELECT payload->>''type'' AS event_type, COUNT(*) FROM events GROUP BY 1',
    schedule => '1m'
);

Known Limitation — Duplicate Rows in Keyless Tables (G7.1)

When a keyless table contains exact duplicate rows (identical values in every column), content-based hashing produces the same __pgt_row_id for each copy. Consequences:

  • INSERT of a duplicate row may appear as a no-op (the hash already exists in the stream table).
  • DELETE of one copy may delete all copies (the MERGE matches on __pgt_row_id, hitting every duplicate).
  • Aggregate counts over keyless tables with duplicates may drift from the true query result.

Recommendation: Add a PRIMARY KEY or at least a UNIQUE constraint to source tables used in DIFFERENTIAL mode. This eliminates the ambiguity entirely. If duplicates are expected and correctness matters, use FULL refresh mode, which always recomputes from scratch.
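If the source has no natural key, a generated identity column is usually enough to restore unambiguous row identity for the events table from the example above:

```sql
-- Give the keyless events table a synthetic primary key
ALTER TABLE events
    ADD COLUMN event_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY;
```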

Volatile Function Detection

pg_trickle checks all functions and operators in the defining query against pg_proc.provolatile:

  • VOLATILE functions (e.g., random(), clock_timestamp(), gen_random_uuid()) are rejected in DIFFERENTIAL and IMMEDIATE modes because they produce different results on each evaluation, breaking delta correctness.
  • VOLATILE operators — custom operators backed by volatile functions are also detected. The check resolves the operator’s implementation function via pg_operator.oprcode and checks its volatility in pg_proc.
  • STABLE functions (e.g., now(), current_timestamp, current_setting()) produce a warning in DIFFERENTIAL and IMMEDIATE modes — they are consistent within a single refresh but may differ between refreshes.
  • IMMUTABLE functions are always safe and produce no warnings.

FULL mode accepts all volatility classes since it re-evaluates the entire query each time.
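To illustrate the classification (the exact error and warning wording is approximate):

```sql
-- Rejected in DIFFERENTIAL mode: random() is VOLATILE
SELECT pgtrickle.create_stream_table(
    name     => 'sampled_orders',
    query    => 'SELECT * FROM orders WHERE random() < 0.1',
    schedule => '1m'
);
-- ERROR: volatile function random() not allowed in DIFFERENTIAL mode

-- Proceeds with a WARNING: now() is STABLE
SELECT pgtrickle.create_stream_table(
    name     => 'recent_orders',
    query    => 'SELECT * FROM orders WHERE created_at > now() - interval ''1 day''',
    schedule => '1m'
);
```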

Volatile Function Policy (VOL-1)

The pg_trickle.volatile_function_policy GUC controls how volatile functions are handled:

| Value | Behavior |
|---|---|
| reject (default) | ERROR — volatile functions are rejected at creation time. |
| warn | WARNING emitted but creation proceeds. Delta correctness is not guaranteed. |
| allow | Silent — no warning or error. Use when you understand the implications. |

-- Allow volatile functions with a warning
SET pg_trickle.volatile_function_policy = 'warn';

-- Allow volatile functions silently
SET pg_trickle.volatile_function_policy = 'allow';

-- Restore default (reject volatile functions)
SET pg_trickle.volatile_function_policy = 'reject';

COLLATE Expressions

COLLATE clauses on expressions are supported:

SELECT pgtrickle.create_stream_table(
    name     => 'sorted_names',
    query    => 'SELECT name COLLATE "C" AS c_name FROM users',
    schedule => '1m'
);

IS JSON Predicate (PostgreSQL 16+)

The IS JSON predicate validates whether a value is valid JSON. All variants are supported:

-- Filter rows with valid JSON
SELECT pgtrickle.create_stream_table(
    name     => 'valid_json_events',
    query    => 'SELECT id, payload FROM events WHERE payload::text IS JSON',
    schedule => '1m'
);

-- Type-specific checks
SELECT pgtrickle.create_stream_table(
    name         => 'json_objects_only',
    query        => 'SELECT id, data IS JSON OBJECT AS is_obj,
          data IS JSON ARRAY AS is_arr,
          data IS JSON SCALAR AS is_scalar
   FROM json_data',
    schedule     => '1m',
    refresh_mode => 'FULL'
);

Supported variants: IS JSON, IS JSON OBJECT, IS JSON ARRAY, IS JSON SCALAR, IS NOT JSON (all forms), WITH UNIQUE KEYS.

SQL/JSON Constructors (PostgreSQL 16+)

SQL-standard JSON constructor functions are supported in both FULL and DIFFERENTIAL modes:

-- JSON_OBJECT: construct a JSON object from key-value pairs
SELECT pgtrickle.create_stream_table(
    name     => 'user_json',
    query    => 'SELECT id, JSON_OBJECT(''name'' : name, ''age'' : age) AS data FROM users',
    schedule => '1m'
);

-- JSON_ARRAY: construct a JSON array from values
SELECT pgtrickle.create_stream_table(
    name         => 'value_arrays',
    query        => 'SELECT id, JSON_ARRAY(a, b, c) AS arr FROM measurements',
    schedule     => '1m',
    refresh_mode => 'FULL'
);

-- JSON(): parse a text value as JSON
-- JSON_SCALAR(): wrap a scalar value as JSON
-- JSON_SERIALIZE(): serialize a JSON value to text

Note: JSON_ARRAYAGG() and JSON_OBJECTAGG() are SQL-standard aggregate functions fully recognized by the DVM engine. In DIFFERENTIAL mode, they use the group-rescan strategy (affected groups are re-aggregated from source data). The full deparsed SQL is preserved to handle the special key: value, ABSENT ON NULL, ORDER BY, and RETURNING clause syntax.
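A sketch of JSON_OBJECTAGG in a stream table (table and column names are illustrative; in DIFFERENTIAL mode the affected groups are re-aggregated from source data):

```sql
-- Aggregate key/value settings into one JSON object per user
SELECT pgtrickle.create_stream_table(
    name     => 'settings_by_user',
    query    => 'SELECT user_id,
                        JSON_OBJECTAGG(key : value ABSENT ON NULL) AS settings
                 FROM user_settings
                 GROUP BY user_id',
    schedule => '1m'
);
```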

JSON_TABLE (PostgreSQL 17+)

JSON_TABLE() generates a relational table from JSON data. It is supported in the FROM clause in both FULL and DIFFERENTIAL modes. Internally, it is modeled as a LateralFunction.

-- Extract structured data from a JSON column
SELECT pgtrickle.create_stream_table(
    name     => 'user_phones',
    query    => $$SELECT u.id, j.phone_type, j.phone_number
    FROM users u,
         JSON_TABLE(u.contact_info, '$.phones[*]'
           COLUMNS (
             phone_type TEXT PATH '$.type',
             phone_number TEXT PATH '$.number'
           )
         ) AS j$$,
    schedule => '1m'
);

Supported column types:

  • Regular columns: name TYPE PATH '$.path' (with optional ON ERROR/ON EMPTY behaviors)
  • EXISTS columns: name TYPE EXISTS PATH '$.path'
  • Formatted columns: name TYPE FORMAT JSON PATH '$.path'
  • Nested columns: NESTED PATH '$.path' COLUMNS (...)

The PASSING clause is also supported for passing named variables to path expressions.

Unsupported Expression Types

The following are rejected with clear error messages rather than producing broken SQL:

| Expression | Error Behavior | Suggested Rewrite |
|---|---|---|
| TABLESAMPLE | Rejected — stream tables materialize the complete result set | Use WHERE random() < 0.1 if sampling is needed (FULL mode only, since random() is volatile) |
| FOR UPDATE / FOR SHARE | Rejected — stream tables do not support row-level locking | Remove the locking clause |
| Unknown node types | Rejected with type information | |

Note: Window functions inside expressions (e.g., CASE WHEN ROW_NUMBER() OVER (...) ...) were unsupported in earlier versions but are now automatically rewritten — see Auto-Rewrite Pipeline § Window Functions in Expressions.


Restrictions & Interoperability

Stream tables are standard PostgreSQL heap tables stored in the pgtrickle schema with an additional __pgt_row_id BIGINT PRIMARY KEY column managed by the refresh engine. This section describes what you can and cannot do with them.

Referencing Other Stream Tables

Stream tables can reference other stream tables in their defining query. This creates a dependency edge in the internal DAG, and the scheduler refreshes upstream tables before downstream ones. By default, cycles are detected and rejected at creation time.

When pg_trickle.allow_circular = true, circular dependencies are allowed for stream tables that use DIFFERENTIAL refresh mode and have monotone defining queries (no aggregates, EXCEPT, window functions, or NOT EXISTS/NOT IN). Cycle members are assigned an scc_id and the scheduler iterates them to a fixed point. Non-monotone operators are rejected because they prevent convergence.

-- ST1 reads from a base table
SELECT pgtrickle.create_stream_table(
    name     => 'order_totals',
    query    => 'SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id',
    schedule => '1m'
);

-- ST2 reads from ST1
SELECT pgtrickle.create_stream_table(
    name     => 'big_customers',
    query    => 'SELECT customer_id, total FROM pgtrickle.order_totals WHERE total > 1000',
    schedule => '1m'
);

Views as Sources in Defining Queries

PostgreSQL views can be used as source tables in a stream table's defining query. Views are automatically inlined — replaced with their underlying SELECT definition as subqueries — so CDC triggers land on the actual base tables.

CREATE VIEW active_orders AS
  SELECT * FROM orders WHERE status = 'active';

-- This works (views are auto-inlined):
SELECT pgtrickle.create_stream_table(
    name     => 'order_summary',
    query    => 'SELECT customer_id, COUNT(*) FROM active_orders GROUP BY customer_id',
    schedule => '1m'
);
-- Internally, 'active_orders' is replaced with:
--   (SELECT ... FROM orders WHERE status = 'active') AS active_orders

Nested views (view → view → table) are fully expanded via a fixpoint loop. Column-renaming views (CREATE VIEW v(a, b) AS ...) work correctly — pg_get_viewdef() produces the proper column aliases.

When a view is inlined, the user's original SQL is stored in the original_query catalog column for reinit and introspection. The defining_query column contains the expanded (post-inlining) form.
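Both columns can be inspected directly, for example:

```sql
-- Compare the user-supplied SQL with the post-inlining form
SELECT original_query, defining_query
FROM pgtrickle.pgt_stream_tables
WHERE pgt_name = 'order_summary';
```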

DDL hooks: CREATE OR REPLACE VIEW on a view that was inlined into a stream table marks that ST for reinit. DROP VIEW sets affected STs to ERROR status.

Materialized views are rejected in DIFFERENTIAL mode — their stale-snapshot semantics prevent CDC triggers from tracking changes. Use the underlying query directly, or switch to FULL mode. In FULL mode, materialized views are allowed (no CDC needed).

Foreign tables are rejected in DIFFERENTIAL mode — row-level triggers cannot be created on foreign tables. Use FULL mode instead.

Partitioned Tables as Sources

Partitioned tables are fully supported as source tables in both FULL and DIFFERENTIAL modes. CDC triggers are installed on the partitioned parent table, and PostgreSQL 13+ ensures the trigger fires for all DML routed to child partitions. The change buffer uses the parent table's OID (pgtrickle_changes.changes_<parent_oid>).

CREATE TABLE orders (
    id INT, region TEXT, amount NUMERIC
) PARTITION BY LIST (region);
CREATE TABLE orders_us PARTITION OF orders FOR VALUES IN ('US');
CREATE TABLE orders_eu PARTITION OF orders FOR VALUES IN ('EU');

-- Works — inserts into any partition are captured:
SELECT pgtrickle.create_stream_table(
    name     => 'order_summary',
    query    => 'SELECT region, SUM(amount) FROM orders GROUP BY region',
    schedule => '1m'
);

ATTACH PARTITION detection: When a new partition is attached to a tracked source table via ALTER TABLE parent ATTACH PARTITION child ..., pg_trickle's DDL event trigger detects the change in partition structure and automatically marks affected stream tables for reinitialize. This ensures pre-existing rows in the newly attached partition are included on the next refresh. DETACH PARTITION is also detected and triggers reinitialization.

WAL mode: When using WAL-based CDC (cdc_mode = 'wal'), publications for partitioned source tables are created with publish_via_partition_root = true. This ensures changes from child partitions are published under the parent table's identity, matching trigger-mode CDC behavior.

Note: pg_trickle targets PostgreSQL 18. On PostgreSQL 12 or earlier (not supported), parent triggers do not fire for partition-routed rows, which would cause silent data loss.

Foreign Tables as Sources

Foreign tables (via postgres_fdw or other FDWs) can be used as stream table sources with these constraints:

| CDC Method | Supported? | Why |
|---|---|---|
| Trigger-based | ❌ No | Foreign tables don't support row-level triggers |
| WAL-based | ❌ No | Foreign tables don't generate local WAL entries |
| FULL refresh | ✅ Yes | Re-executes the remote query each cycle |
| Polling-based | ✅ Yes | When pg_trickle.foreign_table_polling = on |

-- Foreign table source — FULL refresh only
SELECT pgtrickle.create_stream_table(
    name         => 'remote_summary',
    query        => 'SELECT region, SUM(amount) FROM remote_orders GROUP BY region',
    schedule     => '5m',
    refresh_mode => 'FULL'
);

When pg_trickle detects a foreign table source, it emits an INFO message explaining the constraints. If you attempt to use DIFFERENTIAL mode without polling enabled, the creation will succeed but the refresh falls back to FULL.

Polling-based CDC creates a local snapshot table and computes EXCEPT ALL differences on each refresh. Enable with:

SET pg_trickle.foreign_table_polling = on;

For a complete step-by-step setup guide, see the Foreign Table Sources tutorial.

IMMEDIATE Mode Query Restrictions

The IMMEDIATE refresh mode supports nearly all SQL constructs available in DIFFERENTIAL and FULL modes. Queries are validated at stream table creation and when switching to IMMEDIATE mode via alter_stream_table.

Supported in IMMEDIATE mode:

  • Simple SELECT ... FROM table scans, filters, projections
  • JOIN (INNER, LEFT, FULL OUTER)
  • GROUP BY with standard aggregates (COUNT, SUM, AVG, MIN, MAX, etc.)
  • DISTINCT
  • Non-recursive WITH (CTEs)
  • UNION ALL, INTERSECT, EXCEPT
  • EXISTS / IN subqueries (SemiJoin, AntiJoin)
  • Subqueries in FROM
  • Window functions (ROW_NUMBER, RANK, DENSE_RANK, etc.)
  • LATERAL subqueries
  • LATERAL set-returning functions (unnest(), jsonb_array_elements(), etc.)
  • Scalar subqueries in SELECT
  • Cascading IMMEDIATE stream tables (ST depending on another IMMEDIATE ST)
  • Recursive CTEs (WITH RECURSIVE) — uses semi-naive evaluation (INSERT-only) or Delete-and-Rederive (DELETE/UPDATE); bounded by pg_trickle.ivm_recursive_max_depth (default 100) to guard against infinite loops from cyclic data

Not yet supported in IMMEDIATE mode:

None — all constructs that work in DIFFERENTIAL mode are now also available in IMMEDIATE mode.

Notes on WITH RECURSIVE in IMMEDIATE mode:

  • A __pgt_depth counter is injected into the generated semi-naive SQL. Propagation stops when the counter reaches ivm_recursive_max_depth (default 100). Raise this GUC for deeper hierarchies or set it to 0 to disable the guard.
  • A WARNING is emitted at stream table creation time reminding operators to monitor for stack depth limit exceeded errors on very deep hierarchies.
  • Non-linear recursion (multiple self-references) is rejected — PostgreSQL itself enforces this restriction.
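Putting the pieces together, an IMMEDIATE-mode recursive hierarchy might look like this (the employees schema is illustrative):

```sql
SELECT pgtrickle.create_stream_table(
    name         => 'org_chart',
    query        => $$WITH RECURSIVE tree AS (
        SELECT id, manager_id, name, 1 AS depth
        FROM employees WHERE manager_id IS NULL
        UNION ALL
        SELECT e.id, e.manager_id, e.name, t.depth + 1
        FROM employees e JOIN tree t ON e.manager_id = t.id
    )
    SELECT * FROM tree$$,
    schedule     => '1m',
    refresh_mode => 'IMMEDIATE'
);

-- For deeper hierarchies, raise the recursion guard
SET pg_trickle.ivm_recursive_max_depth = 500;
```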

Attempting to create a stream table with an unsupported construct produces a clear error message.

Logical Replication Targets

Tables that receive data via logical replication require special consideration. Changes arriving via replication do not fire normal row-level triggers, which means CDC triggers will miss those changes.

pg_trickle emits a WARNING at stream table creation time if any source table is detected as a logical replication target (via pg_subscription_rel).

Workarounds:

  • Use cdc_mode = 'wal' for WAL-based CDC that captures all changes regardless of origin.
  • Use FULL refresh mode, which recomputes entirely from the current table state.
  • Set a frequent refresh schedule with FULL mode to limit staleness.
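Assuming cdc_mode is exposed as a create_stream_table parameter (the option name follows the GUC-style spelling used above), the WAL-based workaround might look like:

```sql
-- WAL-based CDC also captures rows applied by logical replication
SELECT pgtrickle.create_stream_table(
    name     => 'replicated_summary',
    query    => 'SELECT region, COUNT(*) FROM replicated_orders GROUP BY region',
    schedule => '1m',
    cdc_mode => 'wal'
);
```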

Views on Stream Tables

PostgreSQL views can reference stream tables. The view reflects the data as of the most recent refresh.

CREATE VIEW top_customers AS
SELECT customer_id, total
FROM pgtrickle.order_totals
WHERE total > 500
ORDER BY total DESC;

Materialized Views on Stream Tables

Materialized views can reference stream tables, though this is typically redundant (both are physical snapshots of a query). The materialized view requires its own REFRESH MATERIALIZED VIEW — it does not auto-refresh when the stream table refreshes.

Logical Replication of Stream Tables

Stream tables can be published for logical replication like any ordinary table:

-- On publisher
CREATE PUBLICATION my_pub FOR TABLE pgtrickle.order_totals;

-- On subscriber
CREATE SUBSCRIPTION my_sub
  CONNECTION 'host=... dbname=...'
  PUBLICATION my_pub;

Caveats:

  • The __pgt_row_id column is replicated (it is the primary key), which is an internal implementation detail.
  • The subscriber receives materialized data, not the defining query. Refreshes on the publisher propagate as normal DML via logical replication.
  • Do not install pg_trickle on the subscriber and attempt to refresh the replicated table — it will have no CDC triggers or catalog entries.
  • The internal change buffer tables (pgtrickle_changes.changes_<oid>) and catalog tables are not published by default; subscribers only receive the final output.

Known Delta Computation Limitations

The following edge cases affect delta results in DIFFERENTIAL mode under specific data mutation patterns; entries marked as fixed have since been resolved. They have no effect on FULL mode.

JOIN Key Column Change + Simultaneous Right-Side Delete — Fixed (EC-01)

Resolved in v0.14.0. This limitation no longer exists — the delta query now uses a pre-change right snapshot (R₀) for DELETE deltas, ensuring stale rows are correctly removed even when the join partner is simultaneously deleted.

The fix splits Part 1 of the JOIN delta into two arms:

  • Part 1a (inserts): ΔL_inserts ⋈ R₁ — uses current right state
  • Part 1b (deletes): ΔL_deletes ⋈ R₀ — uses pre-change right state

R₀ is reconstructed as R_current EXCEPT ALL ΔR_inserts UNION ALL ΔR_deletes (or via NOT EXISTS anti-join for simple Scan nodes). This ensures the DELETE half always finds the old join partner, even if that partner was deleted in the same cycle.
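The reconstruction can be sketched in plain SQL (relation and column names are illustrative, not the engine's generated query):

```sql
-- R0 = current right state, minus this cycle's inserts, plus its deletes
WITH r0 AS (
    (SELECT k, v FROM r
     EXCEPT ALL
     SELECT k, v FROM delta_r WHERE op = 'I')
    UNION ALL
    SELECT k, v FROM delta_r WHERE op = 'D'
)
-- Part 1b: ΔL_deletes ⋈ R₀ — join deleted left rows against the old right state
SELECT dl.k, r0.v
FROM delta_l dl
JOIN r0 USING (k)
WHERE dl.op = 'D';
```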

The fix applies to INNER JOIN, LEFT JOIN, and FULL OUTER JOIN delta operators. See DVM_OPERATORS.md for implementation details.

CUBE/ROLLUP Expansion Limit

CUBE(a, b, c, ...) on N columns generates 2^N grouping-set branches (a UNION ALL of 2^N queries). pg_trickle rejects CUBE/ROLLUP combinations that would produce more than 64 branches, to prevent runaway memory usage during query generation. Use explicit GROUPING SETS(...) instead:

-- Rejected: CUBE(a, b, c, d, e, f, g) would generate 128 branches
-- Use instead:
SELECT pgtrickle.create_stream_table(
    name     => 'multi_dim',
    query    => 'SELECT a, b, c, SUM(v) FROM t
   GROUP BY GROUPING SETS ((a, b, c), (a, b), (a), ())',
    schedule => '5m'
);

What Is NOT Allowed

| Operation | Restriction | Reason |
|---|---|---|
| Direct DML (INSERT, UPDATE, DELETE) | ❌ Not supported | Stream table contents are managed exclusively by the refresh engine. |
| Direct DDL (ALTER TABLE) | ❌ Not supported | Use pgtrickle.alter_stream_table() to change the defining query or schedule. |
| Foreign keys referencing or from a stream table | ❌ Not supported | The refresh engine performs bulk MERGE operations that do not respect FK ordering. |
| User-defined triggers on stream tables | ✅ Supported (DIFFERENTIAL) | In DIFFERENTIAL mode, the refresh engine decomposes changes into explicit DELETE + UPDATE + INSERT statements so triggers fire with correct TG_OP, OLD, and NEW. Row-level triggers are suppressed during FULL refresh. Controlled by pg_trickle.user_triggers GUC (default: auto). |
| TRUNCATE on a stream table | ❌ Not supported | Use pgtrickle.refresh_stream_table() to reset data. |

Tip: The __pgt_row_id column is visible but should be ignored by consuming queries — it is an implementation detail used for delta MERGE operations.

Internal __pgt_* Auxiliary Columns

Stream tables may contain additional hidden columns whose names begin with __pgt_. These are managed exclusively by the refresh engine — they are not part of the user-visible schema and should never be read or written by application queries.

__pgt_row_id — Row identity (always present)

Every stream table has a BIGINT PRIMARY KEY column named __pgt_row_id. It is a content hash of all output columns (xxHash3-128 with Fibonacci-mixing of multiple column hashes), updated by the refresh engine on every MERGE. It is used as the MERGE join key to detect inserts/updates/deletes.

__pgt_count — Group multiplicity (aggregates & DISTINCT)

Added when the defining query contains GROUP BY, DISTINCT, UNION ALL ... GROUP BY, or any aggregate expression that requires tracking how many source rows contribute to each output row.

| Type | Triggers |
|---|---|
| BIGINT NOT NULL DEFAULT 0 | GROUP BY, DISTINCT, COUNT(*), SUM(...), AVG(...), STDDEV(...), VAR(...), UNION deduplication |

__pgt_count_l / __pgt_count_r — Dual multiplicity (INTERSECT / EXCEPT)

Added when the defining query contains INTERSECT or EXCEPT. Stores independently the left-branch and right-branch row counts for Z-set delta algebra.

| Type | Triggers |
|---|---|
| BIGINT NOT NULL DEFAULT 0 each | INTERSECT, INTERSECT ALL, EXCEPT, EXCEPT ALL |

__pgt_aux_sum_<alias> / __pgt_aux_count_<alias> — Running totals for AVG

Pairs of auxiliary columns added for each AVG(expr) in the query. Instead of recomputing the average from scratch on each delta, the refresh engine maintains a running sum and count and derives the average algebraically.

| Type | Triggers |
|---|---|
| NUMERIC NOT NULL DEFAULT 0 (sum), BIGINT NOT NULL DEFAULT 0 (count) | Any AVG(expr) in a GROUP BY query |

Named __pgt_aux_sum_<output_alias> and __pgt_aux_count_<output_alias>, where <output_alias> is the column alias for the AVG expression in the SELECT list.

__pgt_aux_sum2_<alias> — Sum-of-squares for STDDEV / VARIANCE

Added alongside the sum/count pair when the query contains STDDEV, STDDEV_POP, STDDEV_SAMP, VARIANCE, VAR_POP, or VAR_SAMP. Enables O(1) algebraic computation of variance from the standard sum-of-squares identity.

| Type | Triggers |
|---|---|
| NUMERIC NOT NULL DEFAULT 0 | STDDEV(...), STDDEV_POP(...), STDDEV_SAMP(...), VARIANCE(...), VAR_POP(...), VAR_SAMP(...) |
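For reference, the identity this enables, with S1 = Σxᵢ (__pgt_aux_sum), S2 = Σxᵢ² (__pgt_aux_sum2), and n (__pgt_aux_count):

    VAR_POP  = S2/n − (S1/n)²
    VAR_SAMP = (S2 − S1²/n) / (n − 1)

Updating S1, S2, and n from a delta is a constant-time addition per changed row, so no group rescan is needed.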

__pgt_aux_sumx_* / __pgt_aux_sumy_* / __pgt_aux_sumxy_* / __pgt_aux_sumx2_* / __pgt_aux_sumy2_* — Cross-product accumulators for regression aggregates

Five auxiliary columns per aggregate, used for O(1) algebraic maintenance of the twelve PostgreSQL regression and correlation aggregates.

| Type | Triggers |
|---|---|
| NUMERIC NOT NULL DEFAULT 0 (five columns per aggregate) | CORR(Y,X), COVAR_POP(Y,X), COVAR_SAMP(Y,X), REGR_AVGX(Y,X), REGR_AVGY(Y,X), REGR_COUNT(Y,X), REGR_INTERCEPT(Y,X), REGR_R2(Y,X), REGR_SLOPE(Y,X), REGR_SXX(Y,X), REGR_SXY(Y,X), REGR_SYY(Y,X) |

The five columns are named with base prefix __pgt_aux_<kind>_<output_alias> where <kind> is sumx, sumy, sumxy, sumx2, or sumy2. The shared group count is stored in the companion __pgt_aux_count_<output_alias> column.

__pgt_aux_nonnull_<alias> — Non-NULL count for SUM + FULL OUTER JOIN

Added when the query contains SUM(expr) inside a FULL OUTER JOIN aggregate. When matched rows transition to unmatched (null-padded), standard algebraic SUM would produce 0 instead of NULL. This counter tracks how many non-NULL argument values exist in each group; when it reaches zero the SUM is definitively NULL without a full rescan.

| Type | Triggers |
|---|---|
| BIGINT NOT NULL DEFAULT 0 | SUM(expr) in a query with FULL OUTER JOIN at the top level |

__pgt_wf_<N> — Window function lift-out (query rewrite)

Added at query-rewrite time (before storage table creation) when the defining query contains window functions embedded inside larger expressions (e.g. CASE WHEN ROW_NUMBER() OVER (...) = 1 THEN ...). The engine lifts the window function to a synthetic inner-subquery column so the outer SELECT can reference it by alias.

| Type | Triggers |
|---|---|
| Inherits the window-function return type | Window function inside an expression — e.g. RANK(), ROW_NUMBER(), DENSE_RANK(), LAG(), LEAD(), etc. |

__pgt_depth — Recursion depth counter (recursive CTE)

Present only inside the DVM-generated SQL for recursive CTE queries. Used to limit unbounded recursion in semi-naive evaluation. Not added as a permanent column to the storage table.


Rule of thumb: these columns are transparent to consuming queries and only surface in engine-generated DDL. Never SELECT __pgt_* columns in application code — their names, types, and presence may change across minor versions.

Row-Level Security (RLS)

Stream tables follow the same RLS model as PostgreSQL's built-in MATERIALIZED VIEW: the refresh always materializes the full, unfiltered result set. Access control is applied at read time via RLS policies on the stream table itself.

How It Works

| Area | Behavior |
|---|---|
| RLS on source tables | Ignored during refresh. The scheduler runs as superuser; manual refresh_stream_table() and IMMEDIATE-mode triggers bypass RLS via SET LOCAL row_security = off / SECURITY DEFINER. The stream table always contains all rows. |
| RLS on the stream table | Works naturally. Enable RLS and create policies on the stream table to filter reads per role — exactly as you would on any regular table. |
| RLS policy changes on source tables | CREATE POLICY, ALTER POLICY, and DROP POLICY on a source table are detected by pg_trickle's DDL event trigger and mark the stream table for reinitialization. |
| ENABLE/DISABLE RLS on source tables | ALTER TABLE … ENABLE ROW LEVEL SECURITY and DISABLE ROW LEVEL SECURITY on a source table mark the stream table for reinitialization. |
| Change buffer tables | RLS is explicitly disabled on all change buffer tables (pgtrickle_changes.changes_*) so CDC trigger inserts always succeed regardless of schema-level RLS settings. |
| IMMEDIATE mode | IVM trigger functions are SECURITY DEFINER with a locked search_path, so the delta query always sees all rows. The DML issued by the calling user is still filtered by that user's RLS policies on the source table — only the stream table maintenance runs with elevated privileges. |

-- 1. Create a stream table (materializes all rows)
SELECT pgtrickle.create_stream_table(
    name  => 'order_totals',
    query => 'SELECT tenant_id, SUM(amount) AS total FROM orders GROUP BY tenant_id'
);

-- 2. Enable RLS on the stream table
ALTER TABLE pgtrickle.order_totals ENABLE ROW LEVEL SECURITY;

-- 3. Create per-tenant policies
CREATE POLICY tenant_isolation ON pgtrickle.order_totals
    USING (tenant_id = current_setting('app.tenant_id')::INT);

-- 4. Each role sees only its own rows
SET app.tenant_id = '42';
SELECT * FROM pgtrickle.order_totals;  -- only tenant 42's rows

Note: This is identical to how you would apply RLS to a regular MATERIALIZED VIEW. One stream table serves all tenants; per-tenant filtering happens at query time with zero storage duplication.


Views

pgtrickle.stream_tables_info

Status overview with computed staleness information.

SELECT * FROM pgtrickle.stream_tables_info;

Columns include all pgtrickle.pgt_stream_tables columns plus:

| Column | Type | Description |
|---|---|---|
| staleness | interval | now() - last_refresh_at |
| stale | bool | true when the scheduler itself is behind (last_refresh_at age exceeds schedule); false when the scheduler is healthy even if source tables have had no writes |
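For example, to surface lagging tables:

```sql
-- Stream tables that have fallen behind their schedule, worst first
SELECT pgt_schema, pgt_name, staleness
FROM pgtrickle.stream_tables_info
WHERE stale
ORDER BY staleness DESC;
```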

pgtrickle.pg_stat_stream_tables

Comprehensive monitoring view combining catalog metadata with aggregate refresh statistics.

SELECT * FROM pgtrickle.pg_stat_stream_tables;

Key columns:

| Column | Type | Description |
|---|---|---|
| pgt_id | bigint | Stream table ID |
| pgt_schema / pgt_name | text | Schema and name |
| status | text | INITIALIZING, ACTIVE, SUSPENDED, ERROR |
| refresh_mode | text | FULL or DIFFERENTIAL |
| data_timestamp | timestamptz | Timestamp of last refresh |
| staleness | interval | now() - last_refresh_at |
| stale | bool | true when the scheduler is behind its schedule; false when the scheduler is healthy (quiet source tables do not count as stale) |
| total_refreshes | bigint | Total refresh count |
| successful_refreshes | bigint | Successful refresh count |
| failed_refreshes | bigint | Failed refresh count |
| avg_duration_ms | float8 | Average refresh duration |
| consecutive_errors | int | Current error streak |
| cdc_modes | text[] | Distinct CDC modes across TABLE-type sources (e.g. {wal}, {trigger,wal}, {transitioning,wal}) |
| scc_id | int | SCC group identifier for circular dependencies (NULL if not in a cycle) |
| last_fixpoint_iterations | int | Number of fixpoint iterations in the last SCC convergence (NULL if not cyclic) |

pgtrickle.quick_health

Single-row health summary for dashboards and alerting. Returns the overall health status of the pg_trickle extension at a glance.

SELECT * FROM pgtrickle.quick_health;

| Column | Type | Description |
|---|---|---|
| total_stream_tables | bigint | Total number of stream tables |
| error_tables | bigint | Stream tables with status = 'ERROR' or consecutive_errors > 0 |
| stale_tables | bigint | Stream tables whose data is older than their schedule interval |
| scheduler_running | boolean | Whether a pg_trickle scheduler backend is detected in pg_stat_activity |
| status | text | Overall status: EMPTY, OK, WARNING, or CRITICAL |

Status values:

  • EMPTY — No stream tables exist.
  • OK — All stream tables are healthy and up-to-date.
  • WARNING — Some tables have errors or are stale.
  • CRITICAL — At least one stream table is SUSPENDED.
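A minimal alerting query:

```sql
-- Returns a row only when attention is needed; no rows means healthy
SELECT status, error_tables, stale_tables
FROM pgtrickle.quick_health
WHERE status IN ('WARNING', 'CRITICAL');
```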

pgtrickle.pgt_cdc_status

Convenience view for inspecting the CDC mode and WAL slot state of every TABLE-type source for all stream tables. Useful for monitoring in-progress TRIGGER→WAL transitions.

SELECT * FROM pgtrickle.pgt_cdc_status;

| Column | Type | Description |
|---|---|---|
| pgt_schema | text | Schema of the stream table |
| pgt_name | text | Name of the stream table |
| source_relid | oid | OID of the source table |
| source_name | text | Name of the source table |
| source_schema | text | Schema of the source table |
| cdc_mode | text | Current CDC mode: trigger, transitioning, or wal |
| slot_name | text | Replication slot name (NULL for trigger mode) |
| decoder_confirmed_lsn | pg_lsn | Last WAL position decoded (NULL for trigger mode) |
| transition_started_at | timestamptz | When the trigger→WAL transition began (NULL if not transitioning) |

Subscribe to the pgtrickle_cdc_transition NOTIFY channel to receive real-time events when a source moves between CDC modes (payload is a JSON object with source_oid, from, and to fields).
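From psql, transitions can be observed with:

```sql
LISTEN pgtrickle_cdc_transition;
-- Notifications arrive asynchronously with a JSON payload such as
--   {"source_oid": ..., "from": "trigger", "to": "wal"}
```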


Catalog Tables

pgtrickle.pgt_stream_tables

Core metadata for each stream table.

| Column | Type | Description |
|---|---|---|
| pgt_id | bigserial | Primary key |
| pgt_relid | oid | OID of the storage table |
| pgt_name | text | Table name |
| pgt_schema | text | Schema name |
| defining_query | text | The SQL query that defines the ST |
| original_query | text | The user-supplied query before normalization |
| schedule | text | Refresh schedule (duration or cron expression) |
| refresh_mode | text | FULL, DIFFERENTIAL, or IMMEDIATE |
| status | text | INITIALIZING, ACTIVE, SUSPENDED, ERROR |
| is_populated | bool | Whether the table has been populated |
| data_timestamp | timestamptz | Timestamp of the data in the ST |
| frontier | jsonb | Per-source LSN positions (version tracking) |
| last_refresh_at | timestamptz | When last refreshed |
| consecutive_errors | int | Current error streak count |
| needs_reinit | bool | Whether upstream DDL requires reinitialization |
| auto_threshold | double precision | Per-ST adaptive fallback threshold (overrides GUC) |
| last_full_ms | double precision | Last FULL refresh duration in milliseconds |
| functions_used | text[] | Function names used in the defining query (for DDL tracking) |
| topk_limit | int | LIMIT value for TopK stream tables (NULL if not TopK) |
| topk_order_by | text | ORDER BY clause SQL for TopK stream tables |
| topk_offset | int | OFFSET value for paged TopK queries (NULL if not paged) |
| diamond_consistency | text | Diamond consistency mode: none or atomic |
| diamond_schedule_policy | text | Diamond schedule policy: fastest or slowest |
| has_keyless_source | bool | Whether any source table lacks a PRIMARY KEY (EC-06) |
| function_hashes | text | MD5 hashes of referenced function bodies for change detection (EC-16) |
| scc_id | int | SCC group identifier for circular dependencies (NULL if not in a cycle) |
| last_fixpoint_iterations | int | Number of iterations in the last SCC fixpoint convergence (NULL if never iterated) |
| created_at | timestamptz | Creation timestamp |
| updated_at | timestamptz | Last modification timestamp |

pgtrickle.pgt_dependencies

DAG edges — records which source tables each ST depends on, including CDC mode metadata.

| Column | Type | Description |
|---|---|---|
| pgt_id | bigint | FK to pgt_stream_tables |
| source_relid | oid | OID of the source table |
| source_type | text | TABLE, STREAM_TABLE, VIEW, MATVIEW, or FOREIGN_TABLE |
| columns_used | text[] | Which columns are referenced |
| column_snapshot | jsonb | Snapshot of source column metadata at creation time |
| schema_fingerprint | text | SHA-256 fingerprint of column snapshot for fast equality checks |
| cdc_mode | text | Current CDC mode: TRIGGER, TRANSITIONING, or WAL |
| slot_name | text | Replication slot name (WAL/TRANSITIONING modes) |
| decoder_confirmed_lsn | pg_lsn | WAL decoder's last confirmed position |
| transition_started_at | timestamptz | When the trigger→WAL transition started |

pgtrickle.pgt_refresh_history

Audit log of all refresh operations.

| Column | Type | Description |
|---|---|---|
| refresh_id | bigserial | Primary key |
| pgt_id | bigint | FK to pgt_stream_tables |
| data_timestamp | timestamptz | Data timestamp of the refresh |
| start_time | timestamptz | When the refresh started |
| end_time | timestamptz | When it completed |
| action | text | NO_DATA, FULL, DIFFERENTIAL, REINITIALIZE, SKIP |
| rows_inserted | bigint | Rows inserted |
| rows_deleted | bigint | Rows deleted |
| delta_row_count | bigint | Number of delta rows processed from change buffers |
| merge_strategy_used | text | Which merge strategy was used (e.g. MERGE, DELETE+INSERT) |
| was_full_fallback | bool | Whether the refresh fell back to FULL from DIFFERENTIAL |
| error_message | text | Error message if failed |
| status | text | RUNNING, COMPLETED, FAILED, SKIPPED |
| initiated_by | text | What triggered it: SCHEDULER, MANUAL, or INITIAL |
| freshness_deadline | timestamptz | SLA deadline (duration schedules only; NULL for cron) |
| fixpoint_iteration | int | Iteration of the fixed-point loop (NULL for non-cyclic refreshes) |

pgtrickle.pgt_change_tracking

CDC slot tracking per source table.

| Column | Type | Description |
|---|---|---|
| source_relid | oid | OID of the tracked source table |
| slot_name | text | Logical replication slot name |
| last_consumed_lsn | pg_lsn | Last consumed WAL position |
| tracked_by_pgt_ids | bigint[] | Array of ST IDs depending on this source |

pgtrickle.pgt_source_gates

Bootstrap source gate registry. One row per source table that has ever been gated. Only sources with gated = true are actively blocking scheduler refreshes.

| Column | Type | Description |
|---|---|---|
| source_relid | oid | OID of the gated source table (PK) |
| gated | boolean | true while the source is gated; false after ungate_source() |
| gated_at | timestamptz | When the gate was most recently set |
| ungated_at | timestamptz | When the gate was cleared (NULL if still active) |
| gated_by | text | Actor that set the gate (e.g. 'gate_source') |

pgtrickle.pgt_refresh_groups

User-declared Cross-Source Snapshot Consistency groups (v0.9.0). A refresh group guarantees that all member stream tables are refreshed against a snapshot taken at the same point in time, preventing partial-update visibility (e.g. orders and order_lines both reflecting the same transaction boundary).

| Column | Type | Description |
|---|---|---|
| group_id | serial | Primary key |
| group_name | text | Unique human-readable group name |
| member_oids | oid[] | OIDs of the stream table storage relations that participate in this group |
| isolation | text | Snapshot isolation level for the group: 'read_committed' (default) or 'repeatable_read' |
| created_at | timestamptz | When the group was created |

Management API

-- Create a refresh group
SELECT pgtrickle.create_refresh_group(
    'orders_snapshot',
    ARRAY['public.orders_summary', 'public.order_lines_summary'],
    'repeatable_read'   -- or 'read_committed' (default)
);

-- List all groups:
SELECT * FROM pgtrickle.refresh_groups();

-- Remove a group:
SELECT pgtrickle.drop_refresh_group('orders_snapshot');

Validation rules:

  • At least 2 member stream tables are required.
  • All members must exist in pgt_stream_tables.
  • No member can appear in more than one refresh group.
  • Valid isolation levels: 'read_committed' (default), 'repeatable_read'.
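
Violating these rules raises an error at creation time. For example, a single-member group is rejected (the exact error wording is illustrative, not quoted from the implementation):

-- Fails: refresh groups require at least 2 member stream tables.
SELECT pgtrickle.create_refresh_group(
    'solo_group',
    ARRAY['public.orders_summary']
);
-- ERROR: refresh group requires at least 2 member stream tables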

Bootstrap Source Gating (v0.5.0)

These functions let operators pause and resume scheduler-driven refreshes for individual source tables — useful during large bulk loads or ETL windows.

pgtrickle.gate_source(source TEXT)

Mark a source table as gated. The scheduler will skip any stream table that reads from this source until ungate_source() is called.

SELECT pgtrickle.gate_source('my_schema.big_source');

Manual refresh_stream_table() calls are not affected by gates.
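
For example, an operator can still force a refresh mid-load (the stream table name here is hypothetical):

-- The scheduler skips gated sources, but a manual call runs normally.
SELECT pgtrickle.gate_source('my_schema.big_source');
SELECT pgtrickle.refresh_stream_table('big_source_summary');  -- not skipped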

pgtrickle.ungate_source(source TEXT)

Clear a gate set by gate_source(). After this call the scheduler resumes normal refresh scheduling for dependent stream tables.

SELECT pgtrickle.ungate_source('my_schema.big_source');

pgtrickle.source_gates()

Table function returning the current gate status for all registered sources.

SELECT * FROM pgtrickle.source_gates();
-- source_table | schema_name | gated | gated_at | ungated_at | gated_by
ColumnTypeDescription
source_tabletextRelation name
schema_nametextSchema name
gatedbooleanWhether the source is currently gated
gated_attimestamptzWhen the gate was set
ungated_attimestamptzWhen the gate was cleared (NULL if active)
gated_bytextWhich function set the gate

Typical workflow

-- 1. Gate the source before a bulk load.
SELECT pgtrickle.gate_source('orders');

-- 2. Load historical data (scheduler sits idle for orders-based STs).
COPY orders FROM '/data/historical_orders.csv';

-- 3. Ungate — the next scheduler tick refreshes everything cleanly.
SELECT pgtrickle.ungate_source('orders');

pgtrickle.bootstrap_gate_status() (v0.6.0)

Rich introspection of the bootstrap gate lifecycle. Returns the same columns as source_gates() plus computed fields for debugging.

SELECT * FROM pgtrickle.bootstrap_gate_status();
-- source_table | schema_name | gated | gated_at | ungated_at | gated_by | gate_duration | affected_stream_tables
ColumnTypeDescription
source_tabletextRelation name
schema_nametextSchema name
gatedbooleanWhether the source is currently gated
gated_attimestamptzWhen the gate was set (updated on re-gate)
ungated_attimestamptzWhen the gate was cleared (NULL if active)
gated_bytextWhich function set the gate
gate_durationintervalHow long the gate has been active (gated: now() - gated_at; ungated: ungated_at - gated_at)
affected_stream_tablestextComma-separated list of stream tables whose scheduler refreshes are blocked by this gate

Rows are sorted with currently-gated sources first, then alphabetically.

ETL Coordination Cookbook (v0.6.0)

Step-by-step recipes for common bulk-load patterns using source gating.

Recipe 1 — Single Source Bulk Load

Gate one source table during a large data import. The scheduler pauses refreshes for all stream tables that depend on this source.

-- 1. Gate the source before loading.
SELECT pgtrickle.gate_source('orders');

-- 2. Load the data.  The scheduler sits idle for orders-dependent STs.
COPY orders FROM '/data/orders_2026.csv' WITH (FORMAT csv, HEADER);

-- 3. Ungate.  On the next tick the scheduler refreshes everything cleanly.
SELECT pgtrickle.ungate_source('orders');

Recipe 2 — Coordinated Multi-Source Load

When multiple sources feed into a shared downstream stream table, gate them all before loading so no intermediate refreshes occur.

-- 1. Gate all sources that will be loaded.
SELECT pgtrickle.gate_source('orders');
SELECT pgtrickle.gate_source('order_lines');

-- 2. Load each source (can be parallel, any order).
COPY orders FROM '/data/orders.csv' WITH (FORMAT csv, HEADER);
COPY order_lines FROM '/data/lines.csv' WITH (FORMAT csv, HEADER);

-- 3. Ungate all sources.  The scheduler refreshes downstream STs once.
SELECT pgtrickle.ungate_source('orders');
SELECT pgtrickle.ungate_source('order_lines');

Recipe 3 — Gate + Deferred Initialization

Combine gating with initialize => false to prevent incomplete initial population when sources are loaded asynchronously.

-- 1. Gate sources before creating any stream tables.
SELECT pgtrickle.gate_source('orders');
SELECT pgtrickle.gate_source('order_lines');

-- 2. Create stream tables without initial population.
SELECT pgtrickle.create_stream_table(
    'order_summary',
    'SELECT region, SUM(amount) FROM orders GROUP BY region',
    '1m', initialize => false
);
SELECT pgtrickle.create_stream_table(
    'order_report',
    'SELECT s.region, s.total, l.line_count
     FROM order_summary s
     JOIN (SELECT region, COUNT(*) AS line_count FROM order_lines GROUP BY region) l
       USING (region)',
    '1m', initialize => false
);

-- 3. Run ETL processes (can be in separate transactions).
BEGIN;
  COPY orders FROM '/data/warehouse/orders.csv' WITH (FORMAT csv, HEADER);
  SELECT pgtrickle.ungate_source('orders');
COMMIT;

BEGIN;
  COPY order_lines FROM '/data/warehouse/lines.csv' WITH (FORMAT csv, HEADER);
  SELECT pgtrickle.ungate_source('order_lines');
COMMIT;

-- 4. Once all sources are ungated, the scheduler initializes and refreshes
--    all stream tables in dependency order.

Recipe 4 — Nightly Batch Pattern

For scheduled ETL that runs overnight, gate sources before the batch starts and ungate after the batch completes.

-- Nightly ETL script:

-- Gate all sources that will be refreshed.
SELECT pgtrickle.gate_source('sales');
SELECT pgtrickle.gate_source('inventory');

-- Truncate and reload (or use COPY, INSERT...SELECT, etc.).
TRUNCATE sales;
COPY sales FROM '/data/nightly/sales.csv' WITH (FORMAT csv, HEADER);

TRUNCATE inventory;
COPY inventory FROM '/data/nightly/inventory.csv' WITH (FORMAT csv, HEADER);

-- All data loaded — ungate and let the scheduler handle the rest.
SELECT pgtrickle.ungate_source('sales');
SELECT pgtrickle.ungate_source('inventory');

-- Verify: check the gate status to confirm everything is ungated.
SELECT * FROM pgtrickle.bootstrap_gate_status();

Recipe 5 — Monitoring During a Gated Load

Use bootstrap_gate_status() to monitor progress when streams appear stalled.

-- Check which sources are currently gated and how long they've been paused.
SELECT source_table, gate_duration, affected_stream_tables
FROM pgtrickle.bootstrap_gate_status()
WHERE gated = true;

-- If a gate has been active too long (e.g. ETL failed), ungate manually.
SELECT pgtrickle.ungate_source('stale_source');

Watermark Gating (v0.7.0)

Watermark gating is a scheduling control for ETL pipelines where multiple source tables are populated by separate jobs that finish at different times. Each ETL job declares "I'm done up to timestamp X", and the scheduler waits until all sources in a group are caught up within a configurable tolerance before refreshing downstream stream tables.

Catalog Tables

pgtrickle.pgt_watermarks

Per-source watermark state. One row per source table that has had a watermark advanced.

ColumnTypeDescription
source_relidoidSource table OID (primary key)
watermarktimestamptzCurrent watermark value
updated_attimestamptzWhen the watermark was last advanced
advanced_bytextUser/role that advanced the watermark
wal_lsn_at_advancetextWAL LSN at the time of advancement

pgtrickle.pgt_watermark_groups

Watermark group definitions. Each group declares that a set of sources must be temporally aligned.

ColumnTypeDescription
group_idserialAuto-generated group ID (primary key)
group_nametextUnique group name
source_relidsoid[]Array of source table OIDs in the group
tolerance_secsfloat8Maximum allowed lag in seconds (default 0)
created_attimestamptzWhen the group was created
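
The alignment semantics follow directly from these two catalogs: a group is aligned when the spread between its most- and least-advanced watermarks is within tolerance_secs. A sketch that recomputes the check by hand (sources that have never advanced a watermark drop out of the join; watermark_status() additionally counts those via sources_with_watermark):

-- Recompute group alignment from the raw catalog state.
SELECT g.group_name,
       max(w.watermark) - min(w.watermark)  AS spread,
       extract(epoch FROM max(w.watermark) - min(w.watermark))
         <= g.tolerance_secs                AS aligned
FROM pgtrickle.pgt_watermark_groups g
JOIN pgtrickle.pgt_watermarks w
  ON w.source_relid = ANY (g.source_relids)
GROUP BY g.group_name, g.tolerance_secs;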

pgtrickle.pgt_template_cache

Added in v0.16.0. Cross-backend delta SQL template cache (UNLOGGED). Stores compiled delta query templates so new backends skip the ~45 ms DVM parse+differentiate step. Managed automatically — no user interaction required.

ColumnTypeDescription
pgt_idbigintStream table ID (PK, FK → pgt_stream_tables)
query_hashbigintHash of the defining query (staleness detection)
delta_sqltextDelta SQL template with LSN placeholder tokens
columnstext[]Output column names
source_oidsinteger[]Source table OIDs
is_dedupbooleanWhether the delta is deduplicated per row ID
key_changedbooleanWhether __pgt_key_changed column is present
all_algebraicbooleanWhether all aggregates are algebraically invertible
cached_attimestamptzWhen the entry was last populated
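
Since the cache is UNLOGGED, it empties on crash recovery; affected stream tables simply pay the compile cost again on first refresh. A sketch for spotting uncached stream tables, assuming pgt_stream_tables exposes pgt_id and pgt_name as the FK above suggests:

-- Stream tables whose next refresh will pay the full ~45 ms DVM
-- parse+differentiate cost because no template is cached.
SELECT st.pgt_id, st.pgt_name
FROM pgtrickle.pgt_stream_tables st
LEFT JOIN pgtrickle.pgt_template_cache tc USING (pgt_id)
WHERE tc.pgt_id IS NULL;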

Functions

pgtrickle.advance_watermark(source TEXT, watermark TIMESTAMPTZ)

Signal that a source table's data is complete through the given timestamp.

  • Monotonic: rejects watermarks that go backward (raises error).
  • Idempotent: advancing to the same value is a silent no-op.
  • Transactional: the watermark is part of the caller's transaction.
SELECT pgtrickle.advance_watermark('orders', '2026-03-01 12:05:00+00');

pgtrickle.create_watermark_group(group_name TEXT, sources TEXT[], tolerance_secs FLOAT8 DEFAULT 0)

Create a watermark group. Requires at least 2 sources.

  • tolerance_secs: maximum allowed lag between the most-advanced and least-advanced watermarks. Default 0 means strict alignment.
SELECT pgtrickle.create_watermark_group(
    'order_pipeline',
    ARRAY['orders', 'order_lines'],
    0    -- strict alignment (default)
);

pgtrickle.drop_watermark_group(group_name TEXT)

Remove a watermark group by name.

SELECT pgtrickle.drop_watermark_group('order_pipeline');

pgtrickle.watermarks()

Return the current watermark state for all registered sources.

SELECT * FROM pgtrickle.watermarks();
ColumnTypeDescription
source_tabletextSource table name
schema_nametextSchema name
watermarktimestamptzCurrent watermark value
updated_attimestamptzLast advancement time
advanced_bytextUser that advanced it
wal_lsntextWAL LSN at advancement

pgtrickle.watermark_groups()

Return all watermark group definitions.

SELECT * FROM pgtrickle.watermark_groups();

pgtrickle.watermark_status()

Return live alignment status for each watermark group.

SELECT * FROM pgtrickle.watermark_status();
ColumnTypeDescription
group_nametextGroup name
min_watermarktimestamptzLeast-advanced watermark
max_watermarktimestamptzMost-advanced watermark
lag_secsfloat8Lag in seconds between max and min
alignedbooleanWhether lag is within tolerance
sources_with_watermarkint4Number of sources that have a watermark
sources_totalint4Total sources in the group

Recipes

Recipe 6 — Nightly ETL with Watermarks

-- Create a watermark group for the order pipeline.
SELECT pgtrickle.create_watermark_group(
    'order_pipeline',
    ARRAY['orders', 'order_lines']
);

-- Nightly ETL job 1: Load orders
BEGIN;
  COPY orders FROM '/data/orders_20260301.csv';
  SELECT pgtrickle.advance_watermark('orders', '2026-03-01');
COMMIT;

-- Nightly ETL job 2: Load order lines (may run later)
BEGIN;
  COPY order_lines FROM '/data/lines_20260301.csv';
  SELECT pgtrickle.advance_watermark('order_lines', '2026-03-01');
COMMIT;

-- order_report refreshes on the next tick after both watermarks align.

Recipe 7 — Micro-Batch Tolerance

-- Allow up to 30 seconds of skew between trades and quotes.
SELECT pgtrickle.create_watermark_group(
    'realtime_pipeline',
    ARRAY['trades', 'quotes'],
    30   -- 30-second tolerance
);

-- External process advances watermarks every few seconds.
SELECT pgtrickle.advance_watermark('trades', '2026-03-01 12:00:05+00');
SELECT pgtrickle.advance_watermark('quotes', '2026-03-01 12:00:02+00');
-- Lag is 3s, within 30s tolerance → stream tables refresh normally.

Recipe 8 — Monitoring Watermark Alignment

-- Check which groups are currently misaligned.
SELECT group_name, lag_secs, aligned
FROM pgtrickle.watermark_status()
WHERE NOT aligned;

-- Check individual source watermarks.
SELECT source_table, watermark, updated_at
FROM pgtrickle.watermarks()
ORDER BY watermark;

Stuck Watermark Detection (WM-7, v0.15.0)

When pg_trickle.watermark_holdback_timeout is set to a positive value (seconds), the scheduler periodically checks all watermark sources. If any source in a watermark group has not been advanced within the timeout, downstream stream tables in that group are paused (refresh is skipped) and a pgtrickle_alert NOTIFY is emitted.

This protects against silent data staleness when an ETL pipeline breaks and stops advancing watermarks; without this guard, stream tables would continue refreshing with stale external data.

Behavior:

  • Stuck detection: Every ~60 seconds, the scheduler checks updated_at for all watermark sources. If now() - updated_at > watermark_holdback_timeout, the source is stuck.
  • Pause: Any stream table whose source set overlaps a group containing a stuck source is skipped. A SKIP record with "stuck" in the reason is logged to pgt_refresh_history.
  • Alert: A pgtrickle_alert NOTIFY with event watermark_stuck is emitted (once per newly-stuck source, not repeated every check cycle).
  • Auto-resume: When the stuck watermark is advanced via advance_watermark(), the next scheduler check detects the advancement, lifts the pause, and emits a watermark_resumed event.

Recipe 9 — Stuck Watermark Protection

-- Enable stuck-watermark detection with a 10-minute timeout.
ALTER SYSTEM SET pg_trickle.watermark_holdback_timeout = 600;
SELECT pg_reload_conf();

-- Listen for alerts in a monitoring process.
LISTEN pgtrickle_alert;

-- When the ETL pipeline breaks and stops calling advance_watermark(),
-- the scheduler will start skipping downstream STs after 10 minutes.
-- You'll receive a NOTIFY payload like:
--   {"event":"watermark_stuck","group":"order_pipeline","source_oid":16385,"age_secs":620}

-- When the ETL pipeline recovers and advances the watermark:
SELECT pgtrickle.advance_watermark('orders', '2026-03-02 00:00:00+00');
-- The scheduler automatically resumes, and you'll receive:
--   {"event":"watermark_resumed","source_oid":16385}

Developer Diagnostics (v0.12.0)

Four SQL-callable introspection functions that surface internal DVM state without side-effects. All functions are read-only — they never modify catalog tables or trigger refreshes.

pgtrickle.explain_query_rewrite(query TEXT)

Walk a query through the full DVM rewrite pipeline and report each pass.

Returns one row per rewrite pass. When a pass changes the query, changed = true and sql_after contains the SQL after the transformation. Two synthetic rows are appended: topk_detection (detects ORDER BY … LIMIT) and dvm_patterns (lists detected DVM constructs such as aggregation strategy, join types, and volatility).

SELECT pass_name, changed, sql_after
FROM pgtrickle.explain_query_rewrite(
  'SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id'
);

Return columns:

ColumnTypeDescription
pass_nametextRewrite pass name (e.g. view_inlining, distinct_on, grouping_sets)
changedboolWhether this pass modified the query
sql_aftertextSQL text after this pass (NULL if unchanged)

Rewrite passes (in order):

PassDescription
view_inliningExpand view references to their defining SQL
nested_window_liftLift window functions out of expressions (e.g. CASE WHEN ROW_NUMBER() OVER (...) ...)
distinct_onRewrite DISTINCT ON to a ROW_NUMBER() window
grouping_setsExpand GROUPING SETS / CUBE / ROLLUP to UNION ALL of GROUP BY
scalar_subquery_in_whereRewrite scalar subqueries in WHERE to CROSS JOIN
correlated_scalar_in_selectRewrite correlated scalar subqueries in SELECT to LEFT JOIN
sublinks_in_or_demorganApply De Morgan normalization and expand SubLinks inside OR
rows_fromRewrite ROWS FROM() multi-function expressions
topk_detectionDetect ORDER BY … LIMIT n TopK pattern
dvm_patternsDetected DVM constructs: join types, aggregate strategies, volatility

pgtrickle.diagnose_errors(name TEXT)

Return the last 5 FAILED refresh events for a stream table, with each error classified by type and supplied with a remediation hint.

SELECT event_time, error_type, error_message, remediation
FROM pgtrickle.diagnose_errors('my_stream_table');

Return columns:

ColumnTypeDescription
event_timetimestamptzWhen the failed refresh started
error_typetextClassification: user, schema, correctness, performance, infrastructure
error_messagetextRaw error text from pgt_refresh_history
remediationtextSuggested next step

Error types:

TypeTrigger patternsTypical action
userquery parse error, unsupported operator, type mismatchCheck query; run validate_query()
schemaupstream table schema changed, upstream table droppedReinitialize; check pgt_dependencies
correctnessphantom, EXCEPT ALL, row count mismatchSwitch to refresh_mode='FULL'; report bug
performancelock timeout, deadlock, serialization failure, spillTune lock_timeout; enable buffer_partitioning
infrastructurepermission denied, SPI error, replication slotCheck role grants; verify slot config

pgtrickle.list_auxiliary_columns(name TEXT)

List all __pgt_* internal columns on a stream table's storage relation, with an explanation of each column's role.

These columns are normally hidden from SELECT * output. This function surfaces them for debugging and operator visibility.

SELECT column_name, data_type, purpose
FROM pgtrickle.list_auxiliary_columns('my_stream_table');

Return columns:

ColumnTypeDescription
column_nametextInternal column name (e.g. __pgt_row_id)
data_typetextPostgreSQL type (e.g. bigint, text)
purposetextHuman-readable description of the column's role

Common auxiliary columns:

ColumnPurpose
__pgt_row_idRow identity hash — MERGE join key for delta application
__pgt_countMultiplicity counter for DISTINCT / aggregation / UNION dedup
__pgt_count_lLeft-side multiplicity for INTERSECT / EXCEPT
__pgt_count_rRight-side multiplicity for INTERSECT / EXCEPT
__pgt_aux_sum_<col>Running SUM for algebraic AVG maintenance
__pgt_aux_count_<col>Running COUNT for algebraic AVG maintenance
__pgt_aux_sum2_<col>Sum-of-squares for STDDEV / VAR maintenance
__pgt_aux_sum{x,y,xy,x2,y2}_<col>Five-column set for CORR / COVAR / REGR_*
__pgt_aux_nonnull_<col>Non-null count for SUM-above-FULL-JOIN maintenance
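
As an illustration of how the __pgt_aux_* pairs are used, algebraic AVG maintenance derives the average from the running sum and count rather than rescanning the group. A sketch, assuming a source column named amount and a hypothetical storage relation:

-- AVG(amount) is never stored directly; it is derived from two auxiliary
-- columns that deltas can adjust incrementally:
--   sum'   = sum   + SUM(delta.amount * weight)
--   count' = count + SUM(weight)
SELECT __pgt_aux_sum_amount::numeric
       / NULLIF(__pgt_aux_count_amount, 0) AS avg_amount
FROM order_summary_storage;  -- hypothetical storage relation name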

pgtrickle.validate_query(query TEXT)

Parse and validate a query through the DVM pipeline without creating a stream table. Returns detected SQL constructs, warnings, and the resolved refresh mode.

SELECT check_name, result, severity
FROM pgtrickle.validate_query(
  'SELECT customer_id, COUNT(*) FROM orders GROUP BY customer_id'
);

Return columns:

ColumnTypeDescription
check_nametextName of the check or detected construct
resulttextResolved value or construct description
severitytextINFO, WARNING, or ERROR

The first row always has check_name = 'resolved_refresh_mode' with the mode that would be assigned under refresh_mode = 'AUTO': DIFFERENTIAL, FULL, or TOPK.

Common check names:

CheckDescription
resolved_refresh_modeDIFFERENTIAL, FULL, or TOPK
topk_patternDetected LIMIT + ORDER BY values
unsupported_constructFeature not supported for DIFFERENTIAL mode (→ WARNING)
matview_or_foreign_tableQuery references matview/foreign table (→ WARNING, FULL)
ivm_support_checkDVM parse result (→ WARNING if DIFFERENTIAL not possible)
aggregateAggregate with strategy: ALGEBRAIC_INVERTIBLE, ALGEBRAIC_VIA_AUX, SEMI_ALGEBRAIC, or GROUP_RESCAN
joinDetected join type: INNER, LEFT_OUTER, FULL_OUTER, SEMI, ANTI
set_opSet operation: DISTINCT, UNION_ALL, INTERSECT, EXCEPT, EXCEPT_ALL
window_functionQuery contains window functions
scalar_subqueryQuery contains scalar subqueries
lateralQuery contains LATERAL functions or subqueries
recursive_cteQuery uses WITH RECURSIVE
volatilityWorst-case volatility of functions used: immutable, stable, volatile
needs_pgt_countMultiplicity counter column will be added
needs_dual_countLeft/right multiplicity counters will be added
parse_warningAdvisory warning from the DVM parse phase

Example output for a GROUP_RESCAN query:

SELECT check_name, result, severity
FROM pgtrickle.validate_query(
  'SELECT grp, STRING_AGG(tag, '','') FROM events GROUP BY grp'
);
check_nameresultseverity
resolved_refresh_modeDIFFERENTIALINFO
aggregateSTRING_AGG(GROUP_RESCAN)WARNING
needs_pgt_counttrue — multiplicity counter column requiredINFO
volatilityimmutableINFO

Note on GROUP_RESCAN: STRING_AGG, ARRAY_AGG, BOOL_AND, and other non-algebraic aggregates use a group-rescan strategy — any change in a group triggers full re-aggregation from the source data for that group. This is still DIFFERENTIAL (only changed groups are rescanned), but has higher per-group cost than algebraic strategies. If this is performance-sensitive, consider pre-aggregating with a simpler aggregate and post-processing.
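
One shape the pre-aggregation pattern can take (all names hypothetical): keep the stream table fully algebraic and defer the string assembly to an ordinary view. Note that this collapses duplicate tags within a group, which is often acceptable.

-- Algebraic stream table: only COUNT, so deltas stay cheap.
SELECT pgtrickle.create_stream_table(
    'tag_counts',
    'SELECT grp, tag, COUNT(*) AS n FROM events GROUP BY grp, tag',
    '1m'
);

-- Assemble the string at query time over the small pre-aggregated table
-- instead of rescanning raw events on every group change.
CREATE VIEW tags_per_group AS
SELECT grp, STRING_AGG(tag, ',') AS tags
FROM tag_counts
GROUP BY grp;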


Delta SQL Profiling (v0.13.0)

pgtrickle.explain_delta(st_name text, format text DEFAULT 'text')

Generate the delta SQL query plan for a stream table without executing a refresh.

explain_delta produces the differential delta SQL that would be used on the next DIFFERENTIAL refresh, then runs EXPLAIN (ANALYZE false, FORMAT <format>) on it and returns the plan lines. This function is useful for:

  • Identifying slow joins or missing indexes in auto-generated delta SQL.
  • Comparing plan complexity between different query forms.
  • Monitoring how the size of change buffers affects plan shape.

The delta SQL is generated against a hypothetical "scan all changes" window (LSN 0/0 → FF/FFFFFFFF) so the plan shows the full join/filter structure even when the change buffer is currently empty.

Parameters:

NameTypeDescription
st_nametextQualified stream table name (e.g. 'public.orders_summary').
formattextPlan format: 'text' (default), 'json', 'xml', or 'yaml'.

Returns: SETOF text — one row per plan line (text format) or one row containing the full JSON/XML/YAML plan.

Example:

-- Show the text plan for the delta query
SELECT line FROM pgtrickle.explain_delta('public.orders_summary');

-- Get the JSON plan for programmatic analysis
SELECT line FROM pgtrickle.explain_delta('public.orders_summary', 'json');

Environment variable (PGS_PROFILE_DELTA=1): When the environment variable PGS_PROFILE_DELTA=1 is set in the PostgreSQL server process, every DIFFERENTIAL refresh automatically captures EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) for the resolved delta SQL and writes the plan to /tmp/delta_plans/<schema>_<table>.json. This is intended for E2E test diagnostics and local profiling sessions.


pgtrickle.dedup_stats()

Show MERGE deduplication profiling counters accumulated since server start.

When the delta cannot be guaranteed to contain at most one row per __pgt_row_id (e.g. for aggregate queries or keyless sources), the MERGE must group and aggregate the delta before merging. This is tracked as dedup needed. A consistently high ratio indicates that pre-MERGE compaction in the change buffer would reduce refresh latency.

Returns: one row with:

ColumnTypeDescription
total_diff_refreshesbigintTotal DIFFERENTIAL refreshes executed since server start that processed at least one change. Resets on server restart.
dedup_neededbigintNumber of those refreshes where the delta required weight aggregation / deduplication in the MERGE USING clause.
dedup_ratio_pctfloat8dedup_needed / total_diff_refreshes × 100. 0 when total_diff_refreshes = 0.

Example:

SELECT * FROM pgtrickle.dedup_stats();
-- total_diff_refreshes | dedup_needed | dedup_ratio_pct
-- ----------------------+--------------+-----------------
--                  1234 |           87 |            7.05

A dedup_ratio_pct of 10 or higher is the recommended threshold for investigating a two-pass MERGE strategy. See plans/performance/REPORT_OVERALL_STATUS.md §14 for background.

pgtrickle.shared_buffer_stats()

Added in v0.13.0

D-4 observability function. Returns one row per shared change buffer (one per tracked source table), showing how many stream tables share the buffer, which columns are tracked, the safe cleanup frontier, and the current buffer size.

Return columns:

ColumnTypeDescription
source_oidbigintPostgreSQL OID of the source table
source_tabletextFully qualified source table name
consumer_countintegerNumber of stream tables sharing this buffer
consumerstextComma-separated list of consumer stream table names
columns_trackedintegerNumber of new_* columns in the buffer (column superset)
safe_frontier_lsntextMIN(frontier LSN) across all consumers — rows at or below this are safe to clean up
buffer_rowsbigintCurrent number of rows in the change buffer
is_partitionedbooleanWhether the buffer uses LSN-range partitioning

Example:

SELECT * FROM pgtrickle.shared_buffer_stats();
-- source_oid | source_table       | consumer_count | consumers                          | columns_tracked | safe_frontier_lsn | buffer_rows | is_partitioned
-- -----------+--------------------+----------------+------------------------------------+-----------------+-------------------+-------------+----------------
--      16456 | public.orders      |              3 | public.orders_by_region, public... |               5 | 0/1A2B3C4D        |         142 | f

UNLOGGED Change Buffers (v0.14.0)

pgtrickle.convert_buffers_to_unlogged()

Converts all existing logged change buffer tables to UNLOGGED. This eliminates WAL writes for trigger-inserted CDC rows, reducing WAL amplification by ~30%.

Returns: bigint — the number of buffer tables converted.

SELECT pgtrickle.convert_buffers_to_unlogged();
-- convert_buffers_to_unlogged
-- ----------------------------
--                            5

Warning: Each conversion acquires ACCESS EXCLUSIVE lock on the buffer table. Run this function during a low-traffic maintenance window to minimize lock contention.

After conversion: Buffer contents will be lost on crash recovery. The scheduler automatically detects this and enqueues a FULL refresh for affected stream tables. See pg_trickle.unlogged_buffers for the full trade-off discussion.
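
The conversion can be verified through pg_class, where UNLOGGED relations carry relpersistence = 'u'. This sketch assumes the buffer tables live in the pgtrickle schema (their naming is internal):

-- List UNLOGGED tables in the extension schema after conversion.
SELECT c.relname, c.relpersistence
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'pgtrickle'
  AND c.relkind = 'r'
  AND c.relpersistence = 'u';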


Refresh Mode Diagnostics (v0.14.0)

pgtrickle.recommend_refresh_mode(st_name TEXT DEFAULT NULL)

Analyze stream table workload characteristics and recommend the optimal refresh mode (FULL vs DIFFERENTIAL). When st_name is NULL, returns one row per stream table. When provided, returns a single row for the named stream table.

The function evaluates seven weighted signals — current and historical change ratio, empirical timing, query complexity, target size, index coverage, and latency variance — and computes a composite score. Scores above +0.15 recommend DIFFERENTIAL; below −0.15 recommend FULL; in between, the function recommends KEEP (the current mode is near-optimal).

Parameters:

NameTypeDefaultDescription
st_nametextNULLOptional stream table name. NULL = all stream tables.

Return columns:

ColumnTypeDescription
pgt_schematextStream table schema
pgt_nametextStream table name
current_modetextCurrently configured refresh mode
effective_modetextMode actually used in the last refresh
recommended_modetextDIFFERENTIAL, FULL, or KEEP
confidencetexthigh, medium, or low
reasontextHuman-readable explanation of the recommendation
signalsjsonbDetailed signal breakdown with scores and weights

Example:

-- Check all stream tables
SELECT pgt_name, current_mode, recommended_mode, confidence, reason
FROM pgtrickle.recommend_refresh_mode();

-- Check a specific stream table
SELECT recommended_mode, confidence, reason, signals
FROM pgtrickle.recommend_refresh_mode('public.orders_summary');

Signal weights:

SignalBase WeightDescription
change_ratio_current0.25Current pending changes / source rows
change_ratio_avg0.30Historical average change ratio
empirical_timing0.35Observed DIFF vs FULL speed ratio
query_complexity0.10JOIN/aggregate/window count
target_size0.10Target relation + index size
index_coverage0.05Whether __pgt_row_id index exists
latency_variance0.05DIFF latency p95/p50 ratio
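
The signals column can be unpacked with jsonb_each() to see the per-signal breakdown behind a recommendation. A sketch, assuming the jsonb maps each signal name to an object carrying its score and weight:

-- Inspect the individual signals behind one recommendation.
SELECT r.pgt_name,
       s.key               AS signal,
       s.value ->> 'score' AS score,
       s.value ->> 'weight' AS weight
FROM pgtrickle.recommend_refresh_mode('public.orders_summary') r,
     jsonb_each(r.signals) s;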

pgtrickle.refresh_efficiency()

Per-table refresh efficiency metrics. Returns operational statistics for every stream table — useful for monitoring dashboards and Grafana alerts.

Return columns:

ColumnTypeDescription
pgt_schematextStream table schema
pgt_nametextStream table name
refresh_modetextCurrent refresh mode
total_refreshesbigintTotal completed refresh count
diff_countbigintDIFFERENTIAL refresh count
full_countbigintFULL refresh count
avg_diff_msfloat8Average DIFFERENTIAL duration (ms)
avg_full_msfloat8Average FULL duration (ms)
avg_change_ratiofloat8Average change ratio from history
diff_speeduptextSpeedup factor (e.g. 12.3x) of FULL / DIFF timing
last_refresh_attextTimestamp of last data refresh

Example:

SELECT pgt_name, refresh_mode, diff_count, full_count,
       avg_diff_ms, avg_full_ms, diff_speedup
FROM pgtrickle.refresh_efficiency()
ORDER BY total_refreshes DESC;

Export API (v0.14.0)

pgtrickle.export_definition(st_name TEXT)

Export a stream table's configuration as reproducible DDL. Returns a SQL script containing DROP STREAM TABLE IF EXISTS followed by SELECT pgtrickle.create_stream_table(...) with all configured options, plus any ALTER STREAM TABLE calls for post-creation settings (tier, fuse mode, etc.).

Parameters:

NameTypeDescription
st_nametextFully qualified or search-path-resolved stream table name.

Returns: text — SQL script that recreates the stream table.

Example:

-- Export a single definition
SELECT pgtrickle.export_definition('public.orders_summary');

-- Export all definitions
SELECT pgtrickle.export_definition(pgt_schema || '.' || pgt_name)
FROM pgtrickle.pgt_stream_tables;

dbt Integration (v0.13.0)

The dbt-pgtrickle package exposes the new config(...) keys added in v0.13.0: partition_by and the fuse circuit-breaker options (fuse, fuse_ceiling, fuse_sensitivity). Use them directly in any stream_table materialization model.

For full dbt documentation see dbt-pgtrickle/README.md.


partition_by config

Partition the stream table's underlying storage table using PostgreSQL PARTITION BY RANGE. Only applied at creation time — changing it after the stream table exists has no effect (use --full-refresh to recreate).

-- models/marts/events_by_day.sql
{{ config(
    materialized='stream_table',
    schedule='1m',
    refresh_mode='DIFFERENTIAL',
    partition_by='event_day'
) }}

SELECT
    event_day,
    user_id,
    COUNT(*) AS event_count
FROM {{ source('raw', 'events') }}
GROUP BY event_day, user_id

pg_trickle creates a PARTITION BY RANGE (event_day) storage table with an automatic default catch-all partition. Add named partitions via standard DDL:

CREATE TABLE analytics.events_by_day_2026
  PARTITION OF analytics.events_by_day
  FOR VALUES FROM ('2026-01-01') TO ('2027-01-01');

The partition_by value is stored in pgtrickle.pgt_stream_tables.st_partition_key and visible via pgtrickle.stream_tables_info.


fuse config

The fuse circuit breaker suspends differential refreshes when the incoming change volume exceeds a threshold, preventing runaway refresh cycles during bulk ingestion. Fuse parameters are applied via alter_stream_table() on every dbt run; they are a no-op if the values have not changed.

-- models/marts/order_totals.sql
{{ config(
    materialized='stream_table',
    schedule='5m',
    refresh_mode='DIFFERENTIAL',
    fuse='auto',
    fuse_ceiling=50000,
    fuse_sensitivity=3
) }}

SELECT customer_id, SUM(amount) AS total
FROM {{ source('raw', 'orders') }}
GROUP BY customer_id
Config keyTypeDefaultDescription
fuse'off'|'on'|'auto'null (no-op)Fuse mode. 'auto' activates only when FULL refresh would be cheaper than DIFFERENTIAL.
fuse_ceilingintegernullChange-count threshold (number of changed rows) that triggers the fuse. null uses the global pg_trickle.fuse_default_ceiling GUC.
fuse_sensitivityintegernullNumber of consecutive over-ceiling observations required before the fuse blows. null means 1 (blow immediately).

Monitor fuse state via pgtrickle.dedup_stats() or check pgtrickle.pgt_stream_tables.fuse_state directly:

SELECT pgt_name, fuse_mode, fuse_state, fuse_ceiling, fuse_sensitivity
FROM pgtrickle.pgt_stream_tables
WHERE fuse_mode != 'off';

Dog Feeding — Self-Monitoring (v0.20.0)

Added in v0.20.0.

pg_trickle can monitor itself using its own stream tables. Five dog-feeding stream tables maintain reactive analytics over the internal catalog, replacing repeated full-scan diagnostic queries with continuously-maintained incremental views.

Quick Start

-- Create all five dog-feeding stream tables (idempotent).
SELECT pgtrickle.setup_dog_feeding();

-- Check status.
SELECT * FROM pgtrickle.dog_feeding_status();

-- View threshold recommendations (after 10+ refresh cycles).
SELECT * FROM pgtrickle.df_threshold_advice
WHERE confidence IN ('HIGH', 'MEDIUM');

-- View anomalies.
SELECT * FROM pgtrickle.df_anomaly_signals
WHERE duration_anomaly IS NOT NULL;

-- Enable auto-apply (optional).
SET pg_trickle.dog_feeding_auto_apply = 'threshold_only';

-- Clean up.
SELECT pgtrickle.teardown_dog_feeding();

pgtrickle.setup_dog_feeding()

Creates all five dog-feeding stream tables. Idempotent — safe to call multiple times. Emits a warm-up warning if pgt_refresh_history has fewer than 50 rows.

Stream tables created:

| Name | Schedule | Mode | Purpose |
|---|---|---|---|
| pgtrickle.df_efficiency_rolling | 48s | AUTO | Rolling-window refresh statistics |
| pgtrickle.df_anomaly_signals | 48s | AUTO | Duration spikes, error bursts, mode oscillation |
| pgtrickle.df_threshold_advice | 96s | AUTO | Multi-cycle threshold recommendations |
| pgtrickle.df_cdc_buffer_trends | 48s | AUTO | CDC buffer growth rates per source |
| pgtrickle.df_scheduling_interference | 96s | FULL | Concurrent refresh overlap detection |

pgtrickle.teardown_dog_feeding()

Drops all dog-feeding stream tables. Safe with partial setups — missing tables are silently skipped. User stream tables are never affected.

pgtrickle.dog_feeding_status()

Returns the status of all five expected dog-feeding stream tables:

| Column | Type | Description |
|---|---|---|
| st_name | text | Stream table name |
| exists | bool | Whether the ST exists |
| status | text | Current status (ACTIVE, SUSPENDED, etc.) |
| refresh_mode | text | Effective refresh mode |
| last_refresh_at | text | Last successful refresh timestamp |
| total_refreshes | bigint | Total completed refreshes |

pgtrickle.scheduler_overhead()

Returns scheduler efficiency metrics for the last hour:

| Column | Type | Description |
|---|---|---|
| total_refreshes_1h | bigint | Total refreshes in the last hour |
| df_refreshes_1h | bigint | Dog-feeding refreshes in the last hour |
| df_refresh_fraction | float | Fraction of refreshes that are dog-feeding |
| avg_refresh_ms | float | Average refresh duration (ms) |
| avg_df_refresh_ms | float | Average DF refresh duration (ms) |
| total_refresh_time_s | float | Total time spent refreshing (seconds) |
| df_refresh_time_s | float | Time spent on DF refreshes (seconds) |

pgtrickle.explain_dag(format)

Returns the full refresh DAG as Mermaid markdown (default) or a Graphviz DOT string. Node colours: user STs = blue, dog-feeding STs = green, suspended = red, fused = orange.

-- Mermaid format (default).
SELECT pgtrickle.explain_dag();

-- Graphviz DOT format.
SELECT pgtrickle.explain_dag('dot');

Auto-Apply Policy

The pg_trickle.dog_feeding_auto_apply GUC controls whether analytics can automatically adjust stream table configuration:

| Value | Behaviour |
|---|---|
| off (default) | Advisory only — no automatic changes |
| threshold_only | Apply threshold recommendations when confidence is HIGH and delta > 5% |
| full | Also apply scheduling hints from interference analysis |

Auto-apply is rate-limited to at most one threshold change per stream table per 10 minutes. Changes are logged to pgt_refresh_history with initiated_by = 'DOG_FEED'.
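Because auto-applied changes are logged with initiated_by = 'DOG_FEED', you can audit them after the fact; a minimal sketch:

```sql
-- Configuration changes made automatically by the auto-apply policy
SELECT *
FROM pgtrickle.pgt_refresh_history
WHERE initiated_by = 'DOG_FEED';
```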

Confidence Levels and Sparse History

df_threshold_advice assigns a confidence level to each recommendation:

| Confidence | Criteria | What to expect |
|---|---|---|
| HIGH | ≥ 20 total refreshes, ≥ 5 DIFFERENTIAL, ≥ 2 FULL | Reliable recommendation — auto-apply will act on this |
| MEDIUM | ≥ 10 total refreshes | Directionally useful, but may lack enough FULL/DIFF mix |
| LOW | < 10 total refreshes | Insufficient data — recommendation equals the current threshold |

When you see LOW confidence: This is normal during the first minutes after setup_dog_feeding(). The stream tables need time to accumulate refresh history. In typical deployments with a 1-minute schedule, expect:

  • LOW for the first ~10 minutes
  • MEDIUM after ~10 minutes
  • HIGH after ~20 minutes (requires at least 2 FULL refreshes — these happen naturally when the auto-threshold triggers a mode switch)

If a stream table uses FULL mode exclusively, the advice will remain at MEDIUM because no DIFFERENTIAL observations exist for comparison.

The sla_headroom_pct column shows how much faster DIFFERENTIAL is compared to FULL as a percentage. A value of 70% means "DIFF is 70% faster than FULL". This column is NULL when either FULL or DIFF observations are missing.
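Combining the two columns gives a quick view of where DIFFERENTIAL has a clear, well-supported advantage:

```sql
-- High-confidence recommendations with a measured DIFF-vs-FULL comparison
SELECT *
FROM pgtrickle.df_threshold_advice
WHERE confidence = 'HIGH'
  AND sla_headroom_pct IS NOT NULL;
```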


Public API Stability Contract

Added in v0.19.0 (DB-6).

Stable (will not break without a major version bump)

| Surface | Guarantee |
|---|---|
| All functions in the pgtrickle schema documented in this reference | Signature and return type preserved across minor releases. New optional parameters may be added with defaults that preserve existing behaviour. |
| Catalog tables pgtrickle.pgt_stream_tables, pgtrickle.pgt_dependencies, pgtrickle.pgt_refresh_history | Existing columns are not renamed or removed. New columns may be added. |
| NOTIFY channels pg_trickle_refresh, pgtrickle_alert, pgtrickle_wake | Channel names and JSON payload structure preserved. New keys may be added to JSON payloads. |
| GUC names listed in docs/CONFIGURATION.md | Names preserved; default values may change between minor releases (documented in CHANGELOG). |

Unstable (may change in any release)

| Surface | Notes |
|---|---|
| Functions prefixed with _ (e.g. _signal_launcher_rescan) | Internal use only. |
| Catalog tables not listed above (e.g. pgt_scheduler_jobs, pgt_source_gates, pgt_watermarks) | Schema may change. |
| The pgtrickle_changes schema and its changes_* tables | CDC implementation detail; format may change. |
| SQL generated by the DVM engine (MERGE, delta CTEs) | Internal query structure is not an API. |
| The pgtrickle.pgt_schema_version table | Migration infrastructure; rows and schema may change. |

Versioning Policy

  • Patch releases (0.x.Y): Bug fixes only. No breaking changes.
  • Minor releases (0.X.0): New features. Stable API preserved; unstable surfaces may change. Breaking changes to stable API only with a deprecation cycle (WARNING for one release, removal in the next).
  • Major release (1.0.0): Stable API locked. Breaking changes require a major version bump.

Configuration

Complete reference for all pg_trickle GUC (Grand Unified Configuration) variables.


Table of Contents


Overview

pg_trickle exposes over forty configuration variables in the pg_trickle namespace. All can be set in postgresql.conf or at runtime via SET / ALTER SYSTEM.

Required postgresql.conf settings:

shared_preload_libraries = 'pg_trickle'

The extension must be loaded via shared_preload_libraries because it registers GUC variables and a background worker at startup.

Note: wal_level = logical and max_replication_slots are recommended but not required. The default CDC mode (auto) uses lightweight row-level triggers initially and transparently transitions to WAL-based capture if wal_level = logical is available. If wal_level is not logical, pg_trickle stays on triggers permanently — no degradation, no errors. Set pg_trickle.cdc_mode = 'trigger' to disable WAL transitions entirely (see pg_trickle.cdc_mode).
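For operators who want the WAL transition available, a minimal postgresql.conf sketch (the slot count shown is illustrative, not a recommendation):

```ini
shared_preload_libraries = 'pg_trickle'   # required
wal_level = logical                       # recommended: enables trigger-to-WAL transition
max_replication_slots = 10                # recommended: headroom for pg_trickle slots
```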


GUC Variables

Essential

The settings most users configure at install time.


pg_trickle.enabled

Enable or disable the pg_trickle extension.

| Property | Value |
|---|---|
| Type | bool |
| Default | true |
| Context | SUSET (superuser) |
| Restart Required | No |

When set to false, the background scheduler stops processing refreshes. Existing stream tables remain in the catalog but are not refreshed. Manual pgtrickle.refresh_stream_table() calls still work.

-- Disable automatic refreshes
SET pg_trickle.enabled = false;

-- Re-enable
SET pg_trickle.enabled = true;

pg_trickle.cdc_mode

CDC (Change Data Capture) mechanism selection.

| Value | Description |
|---|---|
| 'auto' | (default) Use triggers at creation; transition to WAL-based CDC if wal_level = logical. Falls back to triggers automatically on error. |
| 'trigger' | Always use row-level triggers for change capture |
| 'wal' | Require WAL-based CDC (fails if wal_level != logical) |

Default: 'auto'

pg_trickle.cdc_mode only affects deferred refresh modes ('AUTO', 'FULL', and 'DIFFERENTIAL'). refresh_mode = 'IMMEDIATE' bypasses CDC entirely and always uses statement-level IVM triggers. If the GUC is set to 'wal' when a stream table is created or altered to IMMEDIATE, pg_trickle logs an INFO and continues with IVM triggers instead of creating CDC triggers or WAL slots.

Per-stream-table overrides take precedence over the GUC when you pass cdc_mode => 'auto' | 'trigger' | 'wal' to pgtrickle.create_stream_table(...) or pgtrickle.alter_stream_table(...). The override is stored in pgtrickle.pgt_stream_tables.requested_cdc_mode. For shared source tables, pg_trickle resolves the effective source-level CDC mechanism conservatively: any dependent stream table that requests 'trigger' keeps the source on trigger CDC; otherwise 'wal' wins over 'auto'.
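For example, pinning a single stream table to trigger CDC while the cluster default stays 'auto' (table name from the earlier examples):

```sql
-- Per-table override; stored in requested_cdc_mode
SELECT pgtrickle.alter_stream_table('active_orders', cdc_mode => 'trigger');

-- Inspect the stored overrides
SELECT pgt_name, requested_cdc_mode
FROM pgtrickle.pgt_stream_tables;
```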

-- Enable automatic trigger → WAL transition (default)
SET pg_trickle.cdc_mode = 'auto';

-- Force trigger-only CDC (disable WAL transitions)
SET pg_trickle.cdc_mode = 'trigger';

-- Require WAL-based CDC (error if wal_level != logical)
SET pg_trickle.cdc_mode = 'wal';

pg_trickle.scheduler_interval_ms

How often the background scheduler checks for stream tables that need refreshing.

| Property | Value |
|---|---|
| Type | int |
| Default | 1000 (1 second) |
| Range | 100–60000 (100ms to 60s) |
| Context | SUSET |
| Restart Required | No |

Tuning Guidance:

  • Low-latency workloads (sub-second schedule): Set to 100–500.
  • Standard workloads (minutes of schedule): Default 1000 is appropriate.
  • Low-overhead workloads (many STs with long schedules): Increase to 5000–10000 to reduce scheduler overhead.

The scheduler interval does not determine refresh frequency — it determines how often the scheduler checks whether any ST's staleness exceeds its schedule (or whether a cron expression has fired). The actual refresh frequency is governed by schedule (duration or cron) and canonical period alignment.

SET pg_trickle.scheduler_interval_ms = 500;

pg_trickle.event_driven_wake

Enable event-driven scheduler wake via LISTEN/NOTIFY. When enabled, CDC triggers emit pg_notify('pgtrickle_wake', '') after writing to the change buffer, and the scheduler LISTENs on that channel, waking immediately instead of waiting for the full scheduler_interval_ms poll. This reduces median end-to-end latency from ~500 ms to ~15 ms for low-volume workloads.

| Property | Value |
|---|---|
| Type | bool |
| Default | true |
| Context | SUSET |
| Restart Required | No |

Tuning Guidance:

  • Low-latency workloads: Leave enabled (default) for the best latency.
  • Extreme write throughput (>100K DML/s): Consider disabling if the per-statement NOTIFY overhead is measurable. The NOTIFY is coalesced by PostgreSQL (one notification per transaction), so the actual overhead is negligible for most workloads.
-- Disable event-driven wake (fall back to poll-only)
SET pg_trickle.event_driven_wake = off;

pg_trickle.wake_debounce_ms

After the scheduler receives the first pgtrickle_wake notification, it waits this many milliseconds to coalesce rapidly arriving notifications before starting a refresh tick. Lower values reduce latency; higher values reduce wake overhead during bulk DML.

| Property | Value |
|---|---|
| Type | int |
| Default | 10 (10 milliseconds) |
| Range | 1–5000 |
| Context | SUSET |
| Restart Required | No |

Tuning Guidance:

  • Single-statement latency-sensitive: Use 1–5 ms.
  • Bulk DML workloads: Use 50–200 ms to coalesce more notifications per tick.
  • Default (10 ms) balances sub-20 ms latency with reasonable coalescing.
SET pg_trickle.wake_debounce_ms = 50;

pg_trickle.min_schedule_seconds

Minimum allowed schedule value (in seconds) when creating or altering a stream table with a duration-based schedule. This limit does not apply to cron expressions.

| Property | Value |
|---|---|
| Type | int |
| Default | 1 (1 second) |
| Range | 1–86400 (1 second to 24 hours) |
| Context | SUSET |
| Restart Required | No |

This acts as a safety guardrail to prevent users from setting impractically small schedules that would cause excessive refresh overhead.

Tuning Guidance:

  • Development/testing: Default 1 allows sub-second testing.
  • Production: Raise to 60 or higher to prevent excessive WAL consumption and CPU usage.
-- Restrict to 10-second minimum schedules
SET pg_trickle.min_schedule_seconds = 10;

pg_trickle.default_schedule_seconds

Default effective schedule (in seconds) for isolated CALCULATED stream tables that have no downstream dependents.

| Property | Value |
|---|---|
| Type | int |
| Default | 1 (1 second) |
| Range | 1–86400 (1 second to 24 hours) |
| Context | SUSET |
| Restart Required | No |

When a CALCULATED stream table (scheduled with 'calculated') has no downstream dependents to derive a schedule from, this value is used as its effective refresh interval. This is distinct from min_schedule_seconds, which is the validation floor for duration-based schedules.

Tuning Guidance:

  • Development/testing: Default 1 allows rapid iteration.
  • Production standalone CALCULATED tables: Raise to match your desired update cadence (e.g., 60 for once-per-minute).
-- Set default for isolated CALCULATED tables to 30 seconds
SET pg_trickle.default_schedule_seconds = 30;
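
A minimal sketch of an isolated CALCULATED stream table that would pick up this default (query contents are illustrative):

```sql
SELECT pgtrickle.create_stream_table(
    name     => 'order_counts',
    query    => 'SELECT status, count(*) FROM orders GROUP BY status',
    schedule => 'calculated'
);
```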

pg_trickle.max_consecutive_errors

Maximum consecutive refresh failures before a stream table is moved to ERROR status.

| Property | Value |
|---|---|
| Type | int |
| Default | 3 |
| Range | 1–100 |
| Context | SUSET |
| Restart Required | No |

When a ST's consecutive_errors reaches this threshold:

  1. The ST status changes to ERROR.
  2. Automatic refreshes stop for this ST.
  3. Manual intervention is required: SELECT pgtrickle.alter_stream_table('...', status => 'ACTIVE').

Tuning Guidance:

  • Strict (production): 3 — fail fast to surface issues.
  • Lenient (development): 10–20 — tolerate transient errors.
SET pg_trickle.max_consecutive_errors = 5;

WAL CDC

Settings specific to WAL-based CDC. Only relevant when pg_trickle.cdc_mode = 'auto' or 'wal'.


pg_trickle.wal_transition_timeout

Note: This setting is only relevant when pg_trickle.cdc_mode = 'auto' or 'wal'. See ARCHITECTURE.md for the full CDC transition lifecycle.

Maximum time (seconds) to wait for the WAL decoder to catch up during the transition from trigger-based to WAL-based CDC. If the decoder has not caught up within this timeout, the system falls back to triggers.

Default: 300 (5 minutes)
Range: 10–3600

SET pg_trickle.wal_transition_timeout = 300;

pg_trickle.slot_lag_warning_threshold_mb

Warning threshold for retained WAL on pg_trickle replication slots.

| Property | Value |
|---|---|
| Type | int |
| Default | 100 (MB) |
| Range | 1–1048576 |
| Context | SUSET |
| Restart Required | No |

When retained WAL for a pg_trickle replication slot exceeds this threshold:

  • The scheduler emits a slot_lag_warning event on LISTEN pg_trickle_alert
  • pgtrickle.health_check() reports WARN for the slot_lag check

Raise this on high-throughput systems that intentionally tolerate larger WAL retention. Lower it if you want earlier warning before slots risk invalidation.

SET pg_trickle.slot_lag_warning_threshold_mb = 256;

pg_trickle.slot_lag_critical_threshold_mb

Critical threshold for retained WAL on pg_trickle replication slots.

| Property | Value |
|---|---|
| Type | int |
| Default | 1024 (MB) |
| Range | 1–1048576 |
| Context | SUSET |
| Restart Required | No |

When retained WAL for a pg_trickle replication slot exceeds this threshold, pgtrickle.check_cdc_health() returns a per-source slot_lag_exceeds_threshold alert.

This threshold is intentionally higher than the warning threshold so operators can separate early warning from source-level unhealthy state.

SET pg_trickle.slot_lag_critical_threshold_mb = 2048;

Refresh Performance

Fine-grained tuning for the differential refresh engine.


pg_trickle.differential_max_change_ratio

Maximum change-to-table ratio before DIFFERENTIAL refresh falls back to FULL refresh.

| Property | Value |
|---|---|
| Type | float |
| Default | 0.15 (15%) |
| Range | 0.0–1.0 |
| Context | SUSET |
| Restart Required | No |

When the number of pending change buffer rows exceeds this fraction of the source table's estimated row count, the refresh engine switches from DIFFERENTIAL (which uses JSONB parsing and window functions) to FULL refresh. At high change rates FULL refresh is cheaper because it avoids the per-row JSONB overhead.

Special Values:

  • 0.0: Disable adaptive fallback — always use DIFFERENTIAL.
  • 1.0: Always fall back to FULL (effectively forces FULL mode).

Tuning Guidance:

  • OLTP with low change rates (< 5%): Default 0.15 is appropriate.
  • Batch-load workloads (bulk inserts): Lower to 0.05–0.10 so large batches trigger FULL refresh sooner.
  • Latency-sensitive (want deterministic refresh time): Set to 0.0 to always use DIFFERENTIAL.
-- Lower threshold for batch-heavy workloads
SET pg_trickle.differential_max_change_ratio = 0.10;

-- Disable adaptive fallback
SET pg_trickle.differential_max_change_ratio = 0.0;

pg_trickle.refresh_strategy

Cluster-wide refresh strategy override.

| Property | Value |
|---|---|
| Type | string |
| Default | 'auto' |
| Values | 'auto', 'differential', 'full' |
| Context | SUSET |
| Restart Required | No |

Controls the FULL vs. DIFFERENTIAL decision for all stream tables whose refresh_mode is DIFFERENTIAL:

  • 'auto' (default): Use the adaptive cost-based heuristic that considers differential_max_change_ratio, per-ST auto_threshold, refresh history, and spill detection to pick the optimal strategy per refresh cycle.
  • 'differential': Always use DIFFERENTIAL refresh — skip the adaptive ratio check entirely. The BUF-LIMIT safety check (max_buffer_rows) still applies.
  • 'full': Always use FULL refresh regardless of change volume. Useful for debugging or when you know DIFFERENTIAL is consistently slower for your workload.

Important: Per-ST refresh_mode in the catalog takes precedence. Stream tables explicitly configured as refresh_mode = 'FULL' always use FULL regardless of this GUC.

Tuning Guidance:

  • Most workloads: Leave at 'auto' — the adaptive heuristic learns from refresh history.
  • Known-low-churn workloads: Set to 'differential' to eliminate the per-source capped-count query overhead.
  • Debugging delta issues: Temporarily set to 'full' to compare behavior.
-- Force DIFFERENTIAL for all stream tables (skip ratio check)
SET pg_trickle.refresh_strategy = 'differential';

-- Force FULL for all stream tables (debugging)
SET pg_trickle.refresh_strategy = 'full';

-- Reset to adaptive heuristic
SET pg_trickle.refresh_strategy = 'auto';

pg_trickle.cost_model_safety_margin

Added in v0.17.0. Safety margin for the predictive cost model that decides FULL vs. DIFFERENTIAL.

| Property | Value |
|---|---|
| Type | float |
| Default | 0.8 |
| Range | 0.1–2.0 |
| Context | SUSET |
| Restart Required | No |

When refresh_strategy = 'auto', the cost model estimates DIFFERENTIAL and FULL costs from recent refresh history. DIFFERENTIAL is chosen when:

estimated_diff_cost < estimated_full_cost × safety_margin

A value below 1.0 biases toward DIFFERENTIAL (which has lower lock contention and is generally preferred). A value above 1.0 biases toward FULL.
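A worked example with the default margin: if recent history estimates FULL at 200 ms, DIFFERENTIAL is chosen only when its own estimate falls below the scaled ceiling.

```sql
-- FULL estimated at 200 ms, safety margin 0.8:
-- DIFFERENTIAL wins only if its estimate is below this ceiling.
SELECT 200 * 0.8 AS diff_cost_ceiling_ms;  -- 160
```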

The cost model also classifies each stream table's query complexity (scan, filter, aggregate, join, or join+aggregate) and uses per-class coefficients learned from historical data.

Tuning Guidance:

  • 0.8 (default): Prefer DIFFERENTIAL unless it's nearly as expensive as FULL.
  • 0.5: Strongly prefer DIFFERENTIAL — only fall back when it's clearly more expensive.
  • 1.0: Neutral — pick whichever is estimated to be cheaper.
  • 1.2: Slightly prefer FULL — useful when FULL is very fast and DIFFERENTIAL lock contention is a concern.
-- Strongly prefer DIFFERENTIAL
SET pg_trickle.cost_model_safety_margin = 0.5;

-- Neutral (pick the estimated cheapest)
SET pg_trickle.cost_model_safety_margin = 1.0;

pg_trickle.max_delta_estimate_rows

Added in v0.15.0. Maximum estimated delta output cardinality before falling back to FULL refresh.

| Property | Value |
|---|---|
| Type | int |
| Default | 0 (disabled) |
| Range | 0–10,000,000 |
| Context | SUSET |
| Restart Required | No |

Before executing the MERGE, the refresh executor extracts the delta subquery and runs a capped SELECT count(*) FROM (delta LIMIT N+1). If the count reaches the configured limit, the refresh emits a NOTICE and falls back to FULL refresh to prevent OOM or excessive temp-file spills from unexpectedly large delta output.

This is complementary to differential_max_change_ratio which checks input change buffer size as a ratio of source table size. max_delta_estimate_rows checks output cardinality — catching cases where a small number of input changes produce a large delta output after JOINs.
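The probe has roughly this shape — illustrative only, since the SQL generated internally is not an API; the inner subquery stands in for the extracted delta:

```sql
-- With max_delta_estimate_rows = 100000, the executor runs a capped count:
SELECT count(*)
FROM (
    SELECT 1
    FROM (/* extracted delta subquery */ SELECT 1 WHERE false) d
    LIMIT 100001  -- N + 1: reaching the limit means "delta too large, go FULL"
) capped;
```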

Special Values:

  • 0 (default): Disable the estimation check entirely.

Tuning Guidance:

  • Servers with 8–16 GB RAM: Start with 100000 and adjust based on observed refresh behavior.
  • Large-memory servers (32+ GB): 500000 or higher.
  • Complex multi-join queries: Lower to 50000 since join fan-out can amplify small changes.
-- Enable delta output estimation with 100K row limit
SET pg_trickle.max_delta_estimate_rows = 100000;

-- Disable estimation (default)
SET pg_trickle.max_delta_estimate_rows = 0;

pg_trickle.cleanup_use_truncate

Use TRUNCATE instead of per-row DELETE for change buffer cleanup when the entire buffer is consumed by a refresh.

| Property | Value |
|---|---|
| Type | bool |
| Default | true |
| Context | SUSET |
| Restart Required | No |

After a differential refresh consumes all rows from the change buffer, the engine must clean up the buffer table. TRUNCATE is O(1) regardless of row count, versus DELETE which must update indexes row-by-row. This saves 3–5 ms per refresh at 10%+ change rates.

Trade-off: TRUNCATE acquires an AccessExclusiveLock on the change buffer table. If concurrent DML on the source table is actively inserting into the same change buffer via triggers, this lock can cause brief contention.

Tuning Guidance:

  • Most workloads: Leave at true — the performance benefit outweighs the brief lock.
  • High-concurrency OLTP with continuous writes during refresh: Set to false if you observe lock-wait timeouts on the change buffer.
  • PgBouncer / connection poolers: The AccessExclusiveLock acquired by TRUNCATE is held only on the change buffer table (not the source table), but in transaction-pooling mode with frequent refreshes, even brief exclusive locks can cause connection queuing. If you observe elevated pg_stat_activity wait events on change buffer tables, switch to false.
-- Use per-row DELETE for change buffer cleanup
SET pg_trickle.cleanup_use_truncate = false;
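
If you suspect TRUNCATE contention, you can look for blocked sessions on the buffer tables; a sketch assuming the change buffers live in the pgtrickle_changes schema (per the API stability notes):

```sql
-- Sessions currently waiting on locks over change buffer tables
SELECT a.pid, a.wait_event_type, a.wait_event, c.relname
FROM pg_stat_activity a
JOIN pg_locks     l ON l.pid = a.pid AND NOT l.granted
JOIN pg_class     c ON c.oid = l.relation
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'pgtrickle_changes';
```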

pg_trickle.planner_aggressive

Added in v0.14.0. Consolidated switch for all MERGE planner hints. Replaces the deprecated merge_planner_hints and merge_work_mem_mb GUCs.

| Property | Value |
|---|---|
| Type | bool |
| Default | true |
| Context | SUSET |
| Restart Required | No |

When enabled, the refresh executor estimates the delta size and applies optimizer hints within the transaction:

  • Delta ≥ 100 rows: SET LOCAL enable_nestloop = off — forces hash joins instead of nested-loop joins.
  • Delta ≥ 10,000 rows: additionally SET LOCAL work_mem = '<N>MB' (see pg_trickle.merge_work_mem_mb).
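
The effect is equivalent to the refresh transaction doing the following (illustrative — the actual statements are generated internally):

```sql
BEGIN;
-- Delta estimated at 15,000 rows: both hints apply
SET LOCAL enable_nestloop = off;
SET LOCAL work_mem = '64MB';
-- MERGE INTO <stream table> ... (generated by the refresh executor)
COMMIT;
```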

Tuning Guidance:

  • Most workloads: Leave at true — the hints improve tail latency without affecting small deltas.
  • Custom plan overrides: Set to false if you manage planner settings yourself or if the hints conflict with your pg_hint_plan configuration.
  • Memory-constrained environments: When enabled, large deltas (≥ 10,000 rows) raise work_mem to 64 MB (configurable via merge_work_mem_mb). If your server has limited RAM and runs many concurrent refreshes, this can cause unexpected memory pressure or temp-file spills. Monitor temp_blks_written in pg_stat_statements and consider lowering merge_work_mem_mb or disabling this GUC if spills are frequent.
-- Disable all planner hints
SET pg_trickle.planner_aggressive = false;

pg_trickle.merge_join_strategy

Added in v0.15.0. Manual override for the join strategy used during MERGE execution.

| Property | Value |
|---|---|
| Type | text |
| Default | 'auto' |
| Values | auto, hash_join, nested_loop, merge_join |
| Context | SUSET |
| Restart Required | No |

Controls which join strategy the refresh executor hints to PostgreSQL via SET LOCAL during differential refresh. Requires planner_aggressive to be enabled.

| Value | Behaviour |
|---|---|
| auto (default) | Delta-size heuristics choose: nested-loop for tiny deltas, hash-join for larger ones |
| hash_join | Always disable nested-loop joins and raise work_mem — best for medium-to-large deltas |
| nested_loop | Always disable hash-join and merge-join — best for very small deltas against indexed tables |
| merge_join | Always disable hash-join and nested-loop — useful if data is pre-sorted |

Tuning Guidance:

  • Most workloads: Leave at auto — the built-in heuristic performs well.
  • Consistently large deltas (1K+ rows): Setting to hash_join avoids heuristic overhead.
  • Troubleshooting: If refresh is slow, try different strategies and compare with explain_st().
-- Force hash joins for all MERGE operations
SET pg_trickle.merge_join_strategy = 'hash_join';

-- Revert to automatic heuristics
SET pg_trickle.merge_join_strategy = 'auto';

pg_trickle.merge_strategy

Added in v0.16.0. Controls how differential refresh applies deltas to stream tables.

| Property | Value |
|---|---|
| Type | text |
| Default | 'auto' |
| Values | auto, merge |
| Context | SUSET |
| Restart Required | No |

| Value | Behaviour |
|---|---|
| auto (default) | Use DELETE+INSERT when delta_rows / target_rows is below merge_strategy_threshold; MERGE otherwise |
| merge | Always use the PostgreSQL MERGE statement |

Breaking change (v0.19.0): The delete_insert value was removed in v0.19.0 (CORR-1) because it was semantically unsafe for aggregate and DISTINCT queries. Setting it now logs a WARNING and falls back to auto.

The DELETE+INSERT strategy avoids the MERGE join cost by executing two targeted statements: a DELETE for removed rows (matched by __pgt_row_id), then an INSERT for new rows. This is significantly cheaper for sub-1% deltas against large tables because it avoids scanning the entire target for the MERGE join.

Tuning Guidance:

  • Most workloads: Leave at auto — the heuristic picks DELETE+INSERT for small deltas automatically.
  • Correctness concerns: The merge setting preserves the pre-v0.16.0 behaviour.
-- Force MERGE for all differential refreshes
SET pg_trickle.merge_strategy = 'merge';

-- Revert to automatic heuristics
SET pg_trickle.merge_strategy = 'auto';

pg_trickle.merge_strategy_threshold

Added in v0.16.0. Delta ratio threshold for the auto merge strategy.

| Property | Value |
|---|---|
| Type | float |
| Default | 0.01 (1%) |
| Range | 0.001–1.0 |
| Context | SUSET |
| Restart Required | No |

When merge_strategy is auto, DELETE+INSERT is used instead of MERGE when delta_rows / target_rows is below this threshold. The target row count is estimated from pg_class.reltuples.
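Since the target row count comes from pg_class.reltuples, you can preview the ceiling yourself (the table name here is illustrative):

```sql
-- Deltas below this row count use DELETE+INSERT at the default 1% threshold
SELECT reltuples::bigint AS target_rows,
       ceil(reltuples * 0.01)::bigint AS delete_insert_ceiling
FROM pg_class
WHERE relname = 'order_totals';
```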

Tuning Guidance:

  • Default (0.01): DELETE+INSERT for deltas under 1% of the target table size.
  • Higher values (0.05–0.10): More aggressive use of DELETE+INSERT; useful for wide tables where MERGE join overhead is high.
  • Lower values (0.001): Only use DELETE+INSERT for very tiny deltas.
-- Use DELETE+INSERT for deltas under 5% of target size
SET pg_trickle.merge_strategy_threshold = 0.05;

pg_trickle.merge_planner_hints

Deprecated in v0.14.0. Use pg_trickle.planner_aggressive instead. This GUC is still accepted for backward compatibility but is ignored at runtime.

Inject SET LOCAL planner hints before MERGE execution during differential refresh.

| Property | Value |
|---|---|
| Type | bool |
| Default | true |
| Context | SUSET |
| Restart Required | No |

When enabled, the refresh executor estimates the delta size and applies optimizer hints within the transaction:

  • Delta ≥ 100 rows: SET LOCAL enable_nestloop = off — forces hash joins instead of nested-loop joins.
  • Delta ≥ 10,000 rows: additionally SET LOCAL work_mem = '<N>MB' (see pg_trickle.merge_work_mem_mb).

This reduces P95 latency spikes caused by PostgreSQL choosing nested-loop plans for medium/large delta sizes.

Tuning Guidance:

  • Most workloads: Leave at true — the hints improve tail latency without affecting small deltas.
  • Custom plan overrides: Set to false if you manage planner settings yourself or if the hints conflict with your pg_hint_plan configuration.
-- Disable planner hints
SET pg_trickle.merge_planner_hints = false;

pg_trickle.merge_work_mem_mb

work_mem value (in MB) applied via SET LOCAL when the delta exceeds 10,000 rows and planner hints are enabled.

| Property | Value |
|---|---|
| Type | int |
| Default | 64 (64 MB) |
| Range | 8–4096 (8 MB to 4 GB) |
| Context | SUSET |
| Restart Required | No |

A higher value lets PostgreSQL use larger in-memory hash tables for the MERGE join, avoiding disk-spilling sort/merge strategies on large deltas. This setting is applied only when planner hints are enabled (pg_trickle.planner_aggressive = true) and the delta exceeds 10,000 rows.

Tuning Guidance:

  • Servers with ample RAM (32+ GB): Increase to 128–256 for faster large-delta refreshes.
  • Memory-constrained: Lower to 16–32 or disable planner hints entirely.
  • Very large deltas (100K+ rows): Consider 256–512 if refresh latency matters.
SET pg_trickle.merge_work_mem_mb = 128;

pg_trickle.delta_work_mem_cap_mb

Maximum work_mem (in MB) that planner hints are allowed to set during delta MERGE execution. When the deep-join or large-delta path would set work_mem above this cap, the refresh falls back to FULL instead of risking OOM.

| Property | Value |
|---|---|
| Type | int |
| Default | 0 (disabled — no cap) |
| Range | 0–8192 (0 to 8 GB) |
| Context | SUSET |
| Restart Required | No |

Set to 0 to disable the cap entirely (default). When enabled, the cap is checked before any SET LOCAL work_mem in apply_planner_hints(). If the configured or computed work_mem exceeds the cap, the refresh emits a NOTICE and falls back to FULL refresh.

Tuning Guidance:

  • Production servers with tight memory budgets: Set to 256–512 to prevent runaway hash joins.
  • Servers with ample RAM (64+ GB): Leave at 0 (disabled) or set high (2048+).
  • If you see SCAL-3 fallback notices: Either raise the cap or investigate why delta sizes are unexpectedly large.
SET pg_trickle.delta_work_mem_cap_mb = 512;

pg_trickle.merge_seqscan_threshold

Delta-to-ST row ratio below which sequential scans are disabled for the MERGE transaction. Requires planner hints to be enabled.

| Property | Value |
|---|---|
| Type | real |
| Default | 0.001 |
| Range | 0.0–1.0 |
| Context | SUSET |
| Restart Required | No |

When the estimated delta row count divided by the stream table's reltuples falls below this threshold, the refresh executor issues SET LOCAL enable_seqscan = off, coercing PostgreSQL into using the __pgt_row_id B-tree index instead of a full sequential scan.

Set to 0.0 to disable the feature entirely.

Tuning Guidance:

  • Default (0.001): Suitable for most workloads. A 10M-row ST with fewer than 10K delta rows triggers the hint.
  • High-throughput / small STs: Increase to 0.01 if your STs are small and you want more aggressive index usage.
  • Disable: Set to 0.0 if index-only scans are not beneficial for your access pattern.
SET pg_trickle.merge_seqscan_threshold = 0.01;

pg_trickle.auto_backoff

Automatically back off the refresh schedule when a stream table is consistently falling behind.

| Property | Value |
|---|---|
| Type | bool |
| Default | on |
| Context | SUSET |
| Restart Required | No |

When enabled (the default), the scheduler tracks a per-stream-table backoff factor. If a refresh cycle takes more than 95% of the scheduled interval, the backoff factor doubles (up to a cap), effectively stretching the schedule to avoid runaway refresh storms. The factor resets to 1× on the first on-time completion, and a WARNING is emitted whenever the factor changes, so you always know why a stream table is refreshing more slowly than expected.

The 95% trigger threshold means that brief jitter on developer machines (e.g. a 950 ms refresh on a 1-second schedule) will correctly engage backoff, while a 900 ms refresh on the same schedule will not. The EC-11 operator alert (scheduler_falling_behind NOTIFY) continues to fire at the lower 80% threshold, giving you advance warning before the scheduler is actually stuck.

This is a safety net for overloaded systems — it prevents a single slow stream table from monopolizing the background worker when operators are not available to intervene.

Tuning Guidance:

  • Leave on (the default) for both production and development environments.
  • Disable only if you are deliberately running stream tables at the limit of their schedule budget and want the scheduler to keep trying at full speed regardless.
-- Disable if you want no backoff (not recommended for production)
SET pg_trickle.auto_backoff = off;

pg_trickle.tiered_scheduling

Enable tiered refresh scheduling (Hot/Warm/Cold/Frozen) for stream tables.

Type: bool
Default: on
Context: SUSET
Restart Required: No

When enabled, the scheduler applies a per-stream-table refresh tier multiplier to duration-based schedules. Each stream table has a refresh_tier column (default 'hot') that controls how often it is refreshed relative to its configured schedule:

Tier     Multiplier   Effect
hot      1×           Refresh at the configured schedule (default)
warm     2×           Refresh at 2× the configured interval
cold     10×          Refresh at 10× the configured interval
frozen   skip         Never refreshed until manually promoted

Cron-based schedules are not affected by the tier multiplier.
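The multiplier logic amounts to the following (illustrative Python sketch; the names are made up):

```python
# Tier multipliers as documented; None marks the frozen "skip" tier.
TIER_MULTIPLIER = {"hot": 1, "warm": 2, "cold": 10, "frozen": None}

def effective_interval(base_secs: float, tier: str):
    """Effective refresh interval for a duration-based schedule.
    Returns None for frozen tables (never refreshed until promoted).
    Cron schedules bypass this multiplier entirely."""
    m = TIER_MULTIPLIER[tier]
    return None if m is None else base_secs * m

assert effective_interval(1.0, "hot") == 1.0
assert effective_interval(1.0, "warm") == 2.0
assert effective_interval(1.0, "cold") == 10.0
assert effective_interval(1.0, "frozen") is None
```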

Set the tier via:

SELECT pgtrickle.alter_stream_table('my_table', tier => 'warm');
SELECT pgtrickle.alter_stream_table('my_table', tier => 'frozen');

Design note: Tiers are user-assigned only. Automatic classification from pg_stat_user_tables was rejected because pg_trickle's own MERGE scans pollute the read counters, making auto-classification unreliable.

Tier Thresholds Reference

The following table summarizes the effective refresh behavior for each tier. All multipliers apply to duration-based schedules only — cron-based schedules are always honored as-is. New stream tables default to hot.

Tier     Multiplier   Effective Schedule (1 s base)   Use Case
hot      1×           1 s                             Real-time dashboards, alerting tables, SLA-bound queries
warm     2×           2 s                             Important but not latency-critical tables; reduces CPU by 50%
cold     10×          10 s                            Reporting tables queried infrequently; saves significant CPU
frozen   skip         never (until promoted)          Archival tables, tables under maintenance, or seasonal reports

When to use each tier:

  • Hot — default for all new stream tables. Appropriate when downstream consumers expect near-real-time freshness.
  • Warm — set for tables where a few seconds of staleness is acceptable. Halves the refresh CPU cost compared to Hot.
  • Cold — set for tables queried only by batch jobs or low-frequency dashboards. 10× reduction in refresh overhead.
  • Frozen — set when a table should not be refreshed at all (e.g., during a maintenance window or when the upstream source is being migrated). Promote back to Hot/Warm/Cold when ready.
-- Promote a frozen table back to warm
SELECT pgtrickle.alter_stream_table('seasonal_report', tier => 'warm');

-- Freeze a table during maintenance
SELECT pgtrickle.alter_stream_table('my_table', tier => 'frozen');

Changed in v0.12.0: The default for pg_trickle.tiered_scheduling changed from off to on. Set pg_trickle.tiered_scheduling = off in postgresql.conf to restore pre-v0.12.0 behavior (all STs refresh at full speed regardless of tier assignment).


Diamond Schedule Policy (per-stream-table)

Controls how the scheduler fires diamond consistency groups — sets of stream tables that share upstream sources through a diamond-shaped DAG topology.

Column: diamond_schedule_policy in pgt_stream_tables
Values: 'fastest' (default), 'slowest'
Set via: create_stream_table(..., diamond_schedule_policy => 'slowest')
Alter via: alter_stream_table('name', diamond_schedule_policy => 'slowest')

Only meaningful when diamond_consistency = 'atomic' is also set.

fastest (default): The atomic group fires when any member is due. This maximizes freshness but can cause CPU multiplication. In an asymmetric diamond where stream table B refreshes every 1 s and stream table C every 5 s, both feeding D with diamond_consistency = 'atomic': C refreshes 5× more often than its schedule because B triggers the group every second. For N members with schedules S₁ < S₂ < … < Sₙ, the total refresh count is N × (cycle_time / S₁), meaning slower members do up to Sₙ/S₁ times more work than their schedule implies.

slowest: The atomic group fires only when all members are due. This minimizes CPU cost at the expense of freshness — faster members are held back until the slowest member's schedule fires.
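A quick model of the CPU-multiplication trade-off (illustrative Python; it assumes all members are in phase, which matches the worst case described above):

```python
def group_refreshes(schedules_s, policy, cycle_s=60):
    """Total member refreshes over cycle_s seconds for an atomic diamond
    group (illustrative model). 'fastest': the group fires whenever the
    fastest member is due; 'slowest': only when every member is due."""
    trigger = min(schedules_s) if policy == "fastest" else max(schedules_s)
    fires = cycle_s // trigger          # how often the whole group fires
    return fires * len(schedules_s)     # every fire refreshes all members

# B every 1 s, C every 5 s, over a 60 s window:
assert group_refreshes([1, 5], "fastest") == 120  # C refreshes 60x instead of 12x
assert group_refreshes([1, 5], "slowest") == 24   # both held to C's 5 s cadence
```

This matches the formula in the text: with 'fastest', total refreshes are N × (cycle_time / S₁), so the slow member does Sₙ/S₁ = 5× more work than its own schedule implies.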

Tuning Guidance:

  • Use 'fastest' when freshness of the diamond tip matters and the cost of extra refreshes is acceptable.
  • Use 'slowest' when CPU budget is tight or members have very different schedules (e.g., 1 s vs 60 s) and the multiplication would be excessive.
-- Create with slowest policy to avoid CPU multiplication
SELECT pgtrickle.create_stream_table(
    'my_diamond_tip',
    'SELECT ... FROM a JOIN b ...',
    diamond_consistency => 'atomic',
    diamond_schedule_policy => 'slowest'
);

pg_trickle.use_prepared_statements

Use SQL PREPARE / EXECUTE for MERGE statements during differential refresh.

Type: bool
Default: true
Context: SUSET
Restart Required: No

When enabled, the refresh executor issues PREPARE __pgt_merge_{id} on the first cache-hit cycle, then uses EXECUTE on subsequent cycles. After approximately 5 executions, PostgreSQL switches from a custom plan to a generic plan, saving 1–2 ms of parse/plan overhead per refresh.

Tuning Guidance:

  • Most workloads: Leave at true — the cumulative parse/plan savings are significant for frequently-refreshed stream tables.
  • Highly skewed data: Set to false if prepared-statement parameter sniffing produces poor plans (e.g., highly skewed LSN distributions causing bad join estimates).
-- Disable prepared statements
SET pg_trickle.use_prepared_statements = false;

pg_trickle.user_triggers

Control how user-defined triggers on stream tables are handled during refresh.

Type: text
Default: 'auto'
Values: 'auto', 'off' ('on' accepted as deprecated alias for 'auto')
Context: SUSET
Restart Required: No

When a stream table has user-defined row-level triggers, the refresh engine can decompose the MERGE into explicit DELETE + UPDATE + INSERT statements so triggers fire with correct TG_OP, OLD, and NEW values.

Values:

  • auto (default): Automatically detect user triggers on the stream table. If present, use the explicit DML path; otherwise use MERGE.
  • off: Always use MERGE. User triggers are suppressed during refresh. This is the escape hatch if explicit DML causes issues.
  • on: Deprecated compatibility alias for auto. Existing configs continue to work, but new configs should use auto.

Notes:

  • Row-level triggers do not fire during FULL refresh regardless of this setting. FULL refresh uses DISABLE TRIGGER USER / ENABLE TRIGGER USER to suppress them.
  • The explicit DML path adds ~25–60% overhead compared to MERGE for affected stream tables.
  • Stream tables without user triggers have zero overhead when using auto (only a fast pg_trigger check).
-- Auto-detect (default)
SET pg_trickle.user_triggers = 'auto';

-- Suppress triggers, use MERGE
SET pg_trickle.user_triggers = 'off';

-- Backward-compatible legacy setting (treated the same as 'auto')
SET pg_trickle.user_triggers = 'on';

Guardrails & Limits

Safety controls and hard limits.


pg_trickle.block_source_ddl

When enabled, column-affecting DDL (e.g., ALTER TABLE ... DROP COLUMN, ALTER TABLE ... ALTER COLUMN ... TYPE) on source tables tracked by stream tables is blocked with an ERROR instead of silently marking stream tables for reinitialization.

This is useful in production environments where you want to prevent accidental schema changes that would trigger expensive full recomputation of downstream stream tables.

Default: false
Context: Superuser

-- Block column-affecting DDL on tracked source tables
SET pg_trickle.block_source_ddl = true;

-- Allow DDL (stream tables will be marked for reinit instead)
SET pg_trickle.block_source_ddl = false;

Note: Only column-affecting changes are blocked. Benign DDL (adding indexes, comments, constraints) is always allowed regardless of this setting.


pg_trickle.buffer_alert_threshold

When any source table's change buffer exceeds this number of rows, a BufferGrowthWarning alert is emitted. Raise for high-throughput workloads, lower for small tables.

Default: 1000000 (1 million rows)
Range: 1000 – 100000000

SET pg_trickle.buffer_alert_threshold = 500000;

pg_trickle.compact_threshold

When a source table's pending change buffer exceeds this many rows, compaction is triggered before the next refresh cycle. Compaction eliminates net-zero INSERT+DELETE pairs (rows inserted then deleted within the same refresh window) and collapses multi-change groups to first+last rows per pk_hash, reducing delta scan overhead by 50–90% for high-churn tables.

Set to 0 to disable compaction.

Default: 100000 (100K rows)
Range: 0 – 100000000

-- Trigger compaction at 50K pending rows
SET pg_trickle.compact_threshold = 50000;

-- Disable compaction
SET pg_trickle.compact_threshold = 0;
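The first+last collapse and net-zero elimination can be sketched like this (illustrative Python; the (pk_hash, op) tuple shape and the 'I'/'U'/'D' op codes are assumptions for the example, not the real buffer layout):

```python
from collections import defaultdict

def compact(changes):
    """Illustrative model of change-buffer compaction. Each change is
    (pk_hash, op) in arrival order. Net-zero INSERT..DELETE pairs vanish;
    longer change runs collapse to first + last per pk_hash."""
    groups = defaultdict(list)
    for pk, op in changes:
        groups[pk].append(op)
    out = []
    for pk, ops in groups.items():
        if ops[0] == "I" and ops[-1] == "D":
            continue                     # inserted and deleted in-window: net zero
        if len(ops) <= 2:
            out.extend((pk, op) for op in ops)
        else:
            out.append((pk, ops[0]))     # keep only the first change...
            out.append((pk, ops[-1]))    # ...and the last change
    return out

# Row 1: inserted then deleted -> dropped entirely.
# Row 2: inserted then updated 3x -> collapsed to first + last.
compacted = compact([(1, "I"), (1, "D"), (2, "I"), (2, "U"), (2, "U"), (2, "U")])
assert compacted == [(2, "I"), (2, "U")]
```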

pg_trickle.max_buffer_rows

Added in v0.16.0. Hard limit on change buffer rows per source table. When a source table's change buffer exceeds this limit at refresh time, pg_trickle forces a FULL refresh and truncates the buffer, preventing unbounded disk growth when differential refresh fails repeatedly.

Type: integer
Default: 1000000 (1 million rows)
Range: 0 – 100000000
Context: SUSET
Restart Required: No

Set to 0 to disable the limit (not recommended for production).

Tuning Guidance:

  • Most workloads: Leave at 1000000. This accommodates high-throughput tables while preventing runaway growth.
  • High-throughput event tables: Raise to 5000000 – 10000000 if your source tables regularly accumulate large change buffers between refresh cycles.
  • Small databases / tight disk budgets: Lower to 100000 – 500000 to limit change buffer disk usage.
-- Set buffer limit to 5 million rows
SET pg_trickle.max_buffer_rows = 5000000;

-- Disable the limit (not recommended)
SET pg_trickle.max_buffer_rows = 0;

pg_trickle.auto_index

Added in v0.16.0. Controls whether create_stream_table() automatically creates performance indexes on stream tables.

Type: bool
Default: true
Context: SUSET
Restart Required: No

When enabled, the following indexes are created automatically:

  1. GROUP BY composite index — for aggregate queries in DIFFERENTIAL mode, a composite index on the GROUP BY columns is created to speed up group lookups during MERGE.

  2. DISTINCT composite index — for DISTINCT queries with ≤ 8 output columns, a composite index on all output columns is created.

  3. Covering __pgt_row_id index — for stream tables with ≤ 8 output columns, the __pgt_row_id index includes all user columns via INCLUDE, enabling index-only scans during MERGE (20–50% faster for small deltas against large targets).

The __pgt_row_id index itself is always created regardless of this setting (it is required for correctness).

Tuning Guidance:

  • Most workloads: Leave at true.
  • Custom index strategies: Set to false if you prefer to manage indexes manually or if the auto-created indexes conflict with your workload patterns.
-- Disable automatic index creation
SET pg_trickle.auto_index = false;

pg_trickle.aggregate_fast_path

Added in v0.16.0. Controls whether stream tables with all-algebraic aggregates use the explicit DML fast-path instead of MERGE.

Type: bool
Default: true
Context: SUSET
Restart Required: No

When enabled, stream tables whose aggregates are all algebraically invertible (COUNT, SUM, AVG, STDDEV, VAR, CORR, REGR_*, etc.) use the explicit DML path (DELETE + UPDATE + INSERT via a materialized temp table) instead of the generic MERGE statement. This avoids the MERGE hash-join cost, which dominates for aggregate queries with high group cardinality.

Not eligible:

  • Queries with SEMI_ALGEBRAIC aggregates (MIN, MAX) — these may require group-rescan on extremum deletion
  • Queries with GROUP_RESCAN aggregates (STRING_AGG, ARRAY_AGG, JSON_AGG, etc.)
  • Queries with user-defined triggers on the stream table (already use explicit DML via the user-trigger path)

The explain_st() output shows the aggregate_path field:

  • explicit_dml — fast-path is active
  • merge — using the default MERGE path
  • merge (fast-path disabled) — eligible but GUC is off
-- Disable aggregate fast-path
SET pg_trickle.aggregate_fast_path = false;

-- Check the current aggregate path for a stream table
SELECT * FROM pgtrickle.explain_st('my_agg_st');

pg_trickle.template_cache

Added in v0.16.0. Controls the cross-backend delta template cache backed by an UNLOGGED catalog table.

Type: bool
Default: true
Context: SUSET
Restart Required: No

When enabled, delta SQL templates generated by the DVM engine are persisted in pgtrickle.pgt_template_cache so that new backends skip the ~45 ms parse+differentiate step on their first refresh of each stream table (down to ~1 ms SPI lookup).

Templates are automatically invalidated when:

  • A stream table's defining query changes (ALTER STREAM TABLE ... SET QUERY)
  • A stream table is dropped
  • A stream table is reinitialized

The explain_st() output includes template_cache (enabled/disabled) and template_cache_stats with L2 hit and full miss counters.

-- Disable the template cache for debugging
SET pg_trickle.template_cache = false;

-- Check template cache stats
SELECT * FROM pgtrickle.explain_st('my_st')
WHERE property IN ('template_cache', 'template_cache_stats');

pg_trickle.buffer_partitioning

Controls whether change buffer tables use PARTITION BY RANGE (lsn) for O(1) cleanup via partition detach instead of O(n) DELETE.

  • 'off' (default): Unpartitioned heap tables. Cleanup uses DELETE or TRUNCATE. Lowest DDL overhead per cycle.
  • 'on': Always create partitioned change buffers. Old partitions are detached and dropped after consumption — O(1) cleanup regardless of buffer size. Best for high-throughput sources where buffers routinely exceed compact_threshold.
  • 'auto': Start with unpartitioned buffers. If a buffer accumulates more rows than compact_threshold within a single refresh cycle, automatically promote it to RANGE(lsn) partitioned mode. Once promoted, the buffer stays partitioned. Combines low overhead for quiet sources with O(1) cleanup for hot ones.

Default: 'off' Context: SUSET (superuser session-level)

-- Always partition change buffers
SET pg_trickle.buffer_partitioning = 'on';

-- Auto-promote based on throughput
SET pg_trickle.buffer_partitioning = 'auto';

-- Disable partitioning (default)
SET pg_trickle.buffer_partitioning = 'off';

Interaction with compact_threshold: In 'auto' mode, the compact_threshold value serves double duty — it triggers both compaction and the auto-promotion decision. Lowering compact_threshold makes auto-promotion more sensitive.


pg_trickle.max_grouping_set_branches

Maximum allowed grouping set branches in CUBE/ROLLUP queries. CUBE(n) produces 2^n branches — without a limit, large cubes cause memory exhaustion during parsing. Users who genuinely need more than 64 branches can raise this GUC.

Default: 64
Range: 1 – 65536

-- Allow up to 128 grouping set branches
SET pg_trickle.max_grouping_set_branches = 128;

pg_trickle.volatile_function_policy

Controls how volatile functions in defining queries are handled for DIFFERENTIAL and IMMEDIATE modes.

  • reject (default): Volatile functions cause an ERROR at stream table creation time.
  • warn: Volatile functions emit a WARNING but creation proceeds. Delta correctness is not guaranteed.
  • allow: Volatile functions are permitted silently. Use only when you understand that delta computation may produce incorrect results.

Default: reject Context: SUSET (superuser session-level)

-- Allow volatile functions with a warning
SET pg_trickle.volatile_function_policy = 'warn';

-- Allow volatile functions silently
SET pg_trickle.volatile_function_policy = 'allow';

Note: Volatile functions (e.g., random(), clock_timestamp()) produce different values on each evaluation. In DIFFERENTIAL/IMMEDIATE modes, the delta computation assumes deterministic functions — volatile functions may cause stale or incorrect rows. FULL mode is unaffected since it recomputes from scratch every time.


pg_trickle.unlogged_buffers

Create new change buffer tables as UNLOGGED to reduce WAL amplification from CDC trigger inserts.

  • false (default): Change buffers are WAL-logged. Crash-safe — no data loss on crash recovery.
  • true: New change buffers are created as UNLOGGED. Eliminates WAL writes for trigger-inserted rows, reducing WAL amplification by ~30%. Trade-off: buffers are truncated on crash recovery; affected stream tables automatically receive a FULL refresh on the next scheduler cycle.

Default: false Context: SUSET (superuser session-level)

-- Enable UNLOGGED buffers for new stream tables
SET pg_trickle.unlogged_buffers = true;

Crash recovery: After a PostgreSQL crash or standby restart, UNLOGGED buffer tables are automatically truncated by PostgreSQL. The pg_trickle scheduler detects this condition and enqueues a FULL refresh for each affected stream table on the next tick. During the window between crash recovery and FULL refresh completion, stream table data may be stale.

Standby replicas: UNLOGGED tables are not replicated to standbys. Stream tables on read replicas will be stale after any standby restart until the next FULL refresh completes on the primary.

Converting existing buffers: This GUC only affects newly created change buffer tables. To convert existing logged buffers, use:

SELECT pgtrickle.convert_buffers_to_unlogged();

This function acquires ACCESS EXCLUSIVE lock on each buffer table. Run it during a low-traffic maintenance window.


pg_trickle.max_parse_depth

Maximum recursion depth for the query parser's tree visitors (G13-SD). Prevents stack-overflow crashes on pathological queries with deeply nested subqueries, CTEs, or set operations. When the limit is exceeded, the parser returns a QueryTooComplex error instead of crashing.

Default: 64
Range: 1 – 10000

-- Raise the limit for deeply nested queries
SET pg_trickle.max_parse_depth = 128;

pg_trickle.ivm_topk_max_limit

Maximum LIMIT value for TopK stream tables in IMMEDIATE mode. TopK queries exceeding this threshold are rejected because the inline micro-refresh (recomputing top-K rows on every DML statement) adds latency proportional to LIMIT. Set to 0 to disable TopK in IMMEDIATE mode entirely.

Default: 1000
Range: 0 – 1000000

-- Allow TopK up to LIMIT 5000 in IMMEDIATE mode
SET pg_trickle.ivm_topk_max_limit = 5000;

pg_trickle.ivm_recursive_max_depth

Maximum recursion depth for WITH RECURSIVE queries in IMMEDIATE mode. The semi-naive evaluation injects a __pgt_depth counter column into the recursive SQL; iteration stops when the counter reaches this limit. Protects against infinite recursion in pathological graphs.

Default: 100
Range: 1 – 10000

-- Allow deeper recursion for large hierarchies
SET pg_trickle.ivm_recursive_max_depth = 500;
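Semi-naive evaluation with a depth cap can be sketched as follows (illustrative Python; pg_trickle implements this in SQL via the injected __pgt_depth column, and the function names here are made up):

```python
def seminaive_fixpoint(base, step, max_depth=100):
    """Illustrative semi-naive evaluation with a depth cap. `step` maps a
    frontier set to the next frontier; iteration stops at the fixpoint
    or when max_depth iterations have run."""
    result = set(base)
    frontier = set(base)
    depth = 0
    while frontier and depth < max_depth:
        frontier = step(frontier) - result   # keep only genuinely new rows
        result |= frontier
        depth += 1
    return result

# Reachability over edges 1->2->3->4, starting from {1}:
edges = {1: {2}, 2: {3}, 3: {4}, 4: set()}
step = lambda rows: {n for r in rows for n in edges.get(r, set())}
assert seminaive_fixpoint({1}, step) == {1, 2, 3, 4}
```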

Parallel Refresh

These settings control whether and how the scheduler dispatches refresh work to multiple dynamic background workers instead of processing stream tables sequentially. See PLAN_PARALLELISM.md for the design.

Note: Parallel refresh is new in v0.4.0 and defaults to off. Enable it via pg_trickle.parallel_refresh_mode after validating your workload.

pg_trickle.parallel_refresh_mode

Controls whether the scheduler dispatches refresh work to dynamic background workers.

Type: text
Default: 'off'
Values: 'off', 'dry_run', 'on'
Context: SUSET
Restart Required: No

  • off (default): Sequential execution. All stream tables are refreshed one at a time in topological order by the single scheduler background worker. This is the proven, stable default.
  • dry_run: The scheduler computes execution units, logs dispatch decisions (unit keys, ready-queue contents, budget), but still executes refreshes inline. Useful for previewing parallel behaviour without actually spawning workers.
  • on: True parallel refresh. The coordinator builds an execution-unit DAG, dispatches ready units to dynamic background workers, and respects both the per-database cap (max_concurrent_refreshes) and the cluster-wide cap (max_dynamic_refresh_workers).
-- Preview parallel dispatch decisions without changing runtime behaviour
SET pg_trickle.parallel_refresh_mode = 'dry_run';

-- Enable parallel refresh
SET pg_trickle.parallel_refresh_mode = 'on';

pg_trickle.max_dynamic_refresh_workers

Cluster-wide cap on concurrently active pg_trickle dynamic refresh workers.

Type: int
Default: 4
Range: 0 – 64
Context: SUSET
Restart Required: No

This is distinct from pg_trickle.max_concurrent_refreshes (per-database cap). When multiple databases each have their own scheduler, this GUC prevents them from overcommitting the shared PostgreSQL max_worker_processes budget.

Worker-budget planning: Each dynamic refresh worker consumes one max_worker_processes slot. In addition, pg_trickle uses one slot for the launcher and one per-database scheduler. Ensure:

max_worker_processes >= pg_trickle launchers (1)
                      + pg_trickle schedulers (1 per database)
                      + max_dynamic_refresh_workers
                      + autovacuum workers
                      + parallel query workers
                      + other extensions

A typical small deployment (1–2 databases, 4 parallel workers) needs at least max_worker_processes = 16. The E2E test Docker image uses 128.
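The budget arithmetic above, as a checkable sketch (illustrative Python; the autovacuum and parallel-worker counts used in the example are assumptions, not PostgreSQL defaults):

```python
def required_worker_processes(databases: int, dynamic_workers: int,
                              autovacuum: int = 3, parallel: int = 4,
                              other_ext: int = 0) -> int:
    """Minimum max_worker_processes for a pg_trickle deployment
    (illustrative arithmetic following the budget formula above)."""
    launcher = 1
    schedulers = databases   # one pg_trickle scheduler per database
    return (launcher + schedulers + dynamic_workers
            + autovacuum + parallel + other_ext)

# 2 databases, 4 dynamic workers, modest autovacuum/parallel settings:
assert required_worker_processes(2, 4) == 14   # comfortably under 16
```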

-- Allow up to 8 concurrent refresh workers cluster-wide
SET pg_trickle.max_dynamic_refresh_workers = 8;

pg_trickle.max_concurrent_refreshes

Per-database dispatch cap for parallel refresh workers.

Type: int
Default: 4
Range: 1 – 32
Context: SUSET
Restart Required: No

When parallel_refresh_mode = 'on', this limits how many execution units a single database coordinator may have in-flight at the same time. In sequential mode (parallel_refresh_mode = 'off'), this setting has no effect.

The effective concurrent refreshes for a database is:

min(max_concurrent_refreshes, max_dynamic_refresh_workers - workers_used_by_other_dbs)
-- Allow up to 8 concurrent refreshes in this database
SET pg_trickle.max_concurrent_refreshes = 8;

pg_trickle.per_database_worker_quota

Per-database dynamic refresh worker quota for multi-tenant cluster isolation.

Type: int
Default: 0 (disabled)
Range: 0 – 64
Context: SUSET
Restart Required: No

When greater than 0, each per-database scheduler limits itself to this many concurrently active dynamic refresh workers drawn from the shared max_dynamic_refresh_workers pool. This prevents a single busy database from starving others in multi-tenant clusters.

Burst capacity: when the cluster is lightly loaded (active workers < 80% of max_dynamic_refresh_workers), a database may temporarily exceed its quota by up to 50% to absorb sudden change backlogs. The burst is reclaimed automatically within 1 scheduler cycle once global load rises back above the 80% threshold.
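The quota-with-burst rule can be modeled as follows (illustrative Python; the rounding behavior and function name are assumptions):

```python
def allowed_workers(quota: int, active_cluster: int, pool_max: int) -> int:
    """Illustrative model of per_database_worker_quota with burst.
    Under light cluster load (< 80% of the pool in use), a database
    may exceed its quota by up to 50%."""
    if quota == 0:
        return pool_max             # feature disabled: first-come-first-served
    if active_cluster < 0.8 * pool_max:
        return quota + quota // 2   # burst: quota x 1.5, rounded down
    return quota

assert allowed_workers(4, active_cluster=2, pool_max=16) == 6   # idle: burst to 6
assert allowed_workers(4, active_cluster=14, pool_max=16) == 4  # loaded: base quota
```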

Priority dispatch: within each dispatch tick, IMMEDIATE-trigger closures are dispatched before all other unit kinds, ensuring transactional consistency requirements are always met first, even under quota pressure.

-- Limit the analytics DB to 4 base workers (bursts to 6 when cluster is idle)
ALTER DATABASE analytics SET pg_trickle.per_database_worker_quota = 4;
-- Give the reporting DB only 2 base workers
ALTER DATABASE reporting  SET pg_trickle.per_database_worker_quota = 2;
SELECT pg_reload_conf();

When per_database_worker_quota = 0 (the default), this feature is disabled and all databases share the max_dynamic_refresh_workers pool on a first-come-first-served basis, bounded per coordinator by max_concurrent_refreshes.

Note: Set this GUC per-database with ALTER DATABASE rather than globally with ALTER SYSTEM, so different databases can have different quotas.


Advanced / Internal

pg_trickle.change_buffer_schema

Schema name for change-buffer tables created by the trigger-based CDC pipeline.

Default: 'pgtrickle_changes'

Change buffer tables are named <schema>.changes_<oid> where <oid> is the source table's OID. Placing them in a dedicated schema keeps them out of the public namespace.

SET pg_trickle.change_buffer_schema = 'my_change_buffers';

pg_trickle.foreign_table_polling

Enable polling-based change detection for foreign table sources. When enabled, the scheduler periodically re-executes the foreign table query and computes deltas via snapshot comparison (EXCEPT ALL). Foreign tables cannot use trigger or WAL-based CDC, so this is the only mechanism for incremental maintenance.

Default: false

SET pg_trickle.foreign_table_polling = true;

pg_trickle.matview_polling

Enable polling-based CDC for materialized views. When enabled, materialized views referenced in defining queries are supported via snapshot-comparison (the same mechanism as foreign table polling). A local shadow table stores the previous state; EXCEPT ALL computes the delta on each refresh cycle.

Type: boolean
Default: false
Context: SUSET (superuser)
Restart required: No
SET pg_trickle.matview_polling = true;

pg_trickle.cdc_trigger_mode

Controls the CDC trigger granularity: statement (default) or row.

statement uses statement-level AFTER triggers with transition tables (NEW TABLE / OLD TABLE). A single invocation per DML statement processes all affected rows in one bulk INSERT ... SELECT, giving 50-80% less write-side overhead for bulk UPDATE/DELETE. Single-row DML is unaffected.

row uses the legacy per-row trigger approach (pg_trickle < 0.4.0 behavior).

Changing this setting takes effect for newly installed CDC triggers. Call pgtrickle.rebuild_cdc_triggers() to migrate existing stream tables.

Type: string
Default: 'statement'
Valid values: statement, row
Context: SUSET (superuser)
Restart required: No
-- Switch to statement-level triggers (default, recommended)
SET pg_trickle.cdc_trigger_mode = 'statement';

-- After changing, rebuild existing triggers:
SELECT pgtrickle.rebuild_cdc_triggers();

pg_trickle.tick_watermark_enabled

Cap CDC consumption to the WAL LSN at scheduler tick start. When enabled (default), each scheduler tick captures pg_current_wal_lsn() at its start and prevents any refresh from consuming WAL changes beyond that LSN. This bounds cross-source staleness without requiring user configuration.

Disable only if you need stream tables to always advance to the latest available LSN.

Type: boolean
Default: true
Context: SUSET (superuser)
Restart required: No
-- Disable tick watermark bounding
SET pg_trickle.tick_watermark_enabled = false;

pg_trickle.watermark_holdback_timeout

Maximum seconds a user-provided watermark may remain un-advanced before being considered stuck. When a watermark group contains a source whose watermark has not been advanced within this timeout, downstream stream tables in that group are paused (refresh is skipped) and a pgtrickle_alert NOTIFY with watermark_stuck event is emitted.

When the stuck watermark is advanced again (via advance_watermark()), the pause is automatically lifted and a watermark_resumed event is emitted.

Set to 0 to disable stuck-watermark detection (the default). Useful values depend on your ETL pipeline cadence: for a pipeline that loads every 5 minutes, a timeout of 600 (10 min) gives a safety margin.

Type: integer
Default: 0 (disabled)
Min: 0
Max: 86400 (24 hours)
Context: SUSET (superuser)
Restart required: No
-- Set stuck-watermark timeout to 10 minutes
ALTER SYSTEM SET pg_trickle.watermark_holdback_timeout = 600;
SELECT pg_reload_conf();

NOTIFY payloads:

{"event":"watermark_stuck","group":"order_pipeline","source_oid":16385,"age_secs":620}
{"event":"watermark_resumed","source_oid":16385}

pg_trickle.spill_threshold_blocks

Temp blocks written threshold for spill detection. After each differential MERGE, pg_trickle queries pg_stat_statements for the temp_blks_written metric. If the value exceeds this threshold, the refresh is considered a spill.

After spill_consecutive_limit consecutive spills, the scheduler forces a FULL refresh for that stream table to prevent repeated expensive differential merges.

Requires the pg_stat_statements extension to be installed. Set to 0 to disable spill detection (default).

Type: integer
Default: 0 (disabled)
Min: 0
Max: 100000000
Context: SUSET (superuser)
Restart required: No
-- Enable spill detection: flag > 1000 temp blocks as a spill
ALTER SYSTEM SET pg_trickle.spill_threshold_blocks = 1000;
SELECT pg_reload_conf();

pg_trickle.spill_consecutive_limit

Number of consecutive spilling differential refreshes before the scheduler automatically forces a FULL refresh. Resets after any non-spilling refresh.

Only effective when spill_threshold_blocks > 0.

Type: integer
Default: 3
Min: 1
Max: 100
Context: SUSET (superuser)
Restart required: No
-- Force FULL after 5 consecutive spills (default: 3)
ALTER SYSTEM SET pg_trickle.spill_consecutive_limit = 5;
SELECT pg_reload_conf();
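The two spill GUCs combine as follows (illustrative Python model of the escalation logic; names are made up):

```python
def spill_tracker(temp_blks_per_refresh, threshold=1000, limit=3):
    """Illustrative model of spill escalation: after `limit` consecutive
    refreshes writing more than `threshold` temp blocks, force a FULL
    refresh. Any non-spilling refresh resets the counter."""
    consecutive = 0
    actions = []
    for blks in temp_blks_per_refresh:
        spilled = threshold > 0 and blks > threshold
        consecutive = consecutive + 1 if spilled else 0
        if consecutive >= limit:
            actions.append("FULL")
            consecutive = 0          # a full refresh clears the counter
        else:
            actions.append("DIFFERENTIAL")
    return actions

# Two spills, a clean cycle, then three spills in a row -> forced FULL
assert spill_tracker([1500, 1200, 10, 1500, 1500, 1500]) == \
    ["DIFFERENTIAL"] * 5 + ["FULL"]
```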

pg_trickle.log_merge_sql

Log the generated MERGE SQL template on every refresh cycle. When enabled, the MERGE SQL template built during differential refresh is emitted to the PostgreSQL server log at LOG level.

Intended for debugging MERGE query generation only. Do not enable in production — the output is verbose and includes the full SQL for every refresh.

Type: boolean
Default: false
Context: SUSET (superuser)
Restart required: No
SET pg_trickle.log_merge_sql = true;

Guardrails & Diagnostics

These GUCs control safety thresholds and diagnostic warnings.

pg_trickle.fuse_default_ceiling

Global default change-count ceiling for the fuse circuit breaker. When a stream table has fuse_mode = 'on' or 'auto' and no per-ST fuse_ceiling, this value is used. If pending changes exceed this count, the fuse blows and the stream table is suspended (status = SUSPENDED).

Set to 0 to disable the global default (per-ST ceilings still apply).

Type: integer
Default: 0 (disabled)
Range: 0 – 2,000,000,000
Context: SUSET (superuser)
Restart required: No
-- Set global fuse ceiling to 1 million rows
SET pg_trickle.fuse_default_ceiling = 1000000;

pg_trickle.delta_amplification_threshold

Delta amplification detection threshold (output/input ratio). When a DIFFERENTIAL refresh produces more than this multiple of the input delta rows, a WARNING is emitted so operators can identify pathological join fan-out or many-to-many amplification.

Set to 0.0 to disable.

Type: float
Default: 0.0 (disabled)
Range: 0.0 – 100,000.0
Context: SUSET (superuser)
Restart required: No
-- Warn when delta output is 10x the input
SET pg_trickle.delta_amplification_threshold = 10.0;
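The warning condition is a simple ratio check (illustrative Python; the function name is made up):

```python
def amplification_warning(input_rows: int, output_rows: int,
                          threshold: float = 10.0) -> bool:
    """Illustrative model: True when a differential refresh emits more
    than `threshold` times the delta rows it consumed."""
    if threshold == 0.0 or input_rows == 0:
        return False  # detection disabled, or nothing consumed
    return output_rows / input_rows > threshold

assert amplification_warning(100, 1500) is True   # 15x fan-out: pathological join
assert amplification_warning(100, 500) is False   # 5x: within tolerance
```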

pg_trickle.algebraic_drift_reset_cycles

Differential cycles between automatic full recomputes for algebraic aggregates. After this many differential refresh cycles, stream tables with algebraic aggregates (AVG, STDDEV, VAR) are automatically reinitialized to reset accumulated floating-point drift in auxiliary columns.

Set to 0 to disable automatic resets.

Type: integer
Default: 0 (disabled)
Range: 0 – 100,000
Context: SUSET (superuser)
Restart required: No
-- Reset algebraic aggregates every 10,000 cycles
SET pg_trickle.algebraic_drift_reset_cycles = 10000;

pg_trickle.agg_diff_cardinality_threshold

Estimated GROUP BY cardinality threshold for algebraic aggregate warnings. At create_stream_table time, if the defining query uses algebraic aggregates (SUM, COUNT, AVG) in DIFFERENTIAL mode and the estimated group cardinality is below this threshold, a WARNING is emitted suggesting FULL or AUTO mode.

Set to 0 to disable the warning.

Type: integer
Default: 0 (disabled)
Range: 0 – 100,000,000
Context: SUSET (superuser)
Restart required: No
-- Warn when GROUP BY cardinality is below 100
SET pg_trickle.agg_diff_cardinality_threshold = 100;

Connection Pooler

v0.19.0+ (STAB-1).

pg_trickle.connection_pooler_mode

Cluster-wide connection pooler compatibility override.

Type: string
Default: 'off'
Valid values: 'off', 'transaction', 'session'
Context: SUSET

  • off (default): Per-ST pooler_compatibility_mode governs.
  • transaction: Globally disable prepared-statement reuse and suppress NOTIFY emissions (PgBouncer transaction-pool compatibility).
  • session: Explicit opt-in to session mode (same as off today, reserved for future use).

See Connection Pooler Compatibility for deployment guidance.

-- Enable transaction-mode pooler compatibility globally
SET pg_trickle.connection_pooler_mode = 'transaction';

History & Retention

v0.19.0+ (DB-5).

pg_trickle.history_retention_days

Number of days to retain rows in pgtrickle.pgt_refresh_history.

| Property | Value |
| --- | --- |
| Type | integer |
| Default | 90 |
| Min | 0 (disabled) |
| Max | 36500 (~100 years) |
| Context | SUSET |

The scheduler runs a daily background cleanup that deletes rows older than this many days. Set to 0 to disable automatic cleanup (history grows unbounded — monitor disk usage).

-- Keep 30 days of refresh history
SET pg_trickle.history_retention_days = 30;

Circular Dependencies

v0.7.0+ — Circular dependency support is now fully available for safe monotone cycles in DIFFERENTIAL mode. These settings control whether cycles are allowed at all and how many fixpoint iterations the scheduler will try before surfacing a non-convergence error.

pg_trickle.allow_circular

Master switch for circular (cyclic) stream table dependencies. When false (default), creating a stream table that would introduce a cycle in the dependency graph is rejected with a CycleDetected error. When true, monotone cycles — those containing only safe operators (joins, filters, projections, UNION ALL, INTERSECT, EXISTS) — are allowed.

Non-monotone operators (Aggregate, EXCEPT, Window functions, NOT EXISTS) always block cycle creation regardless of this setting, because they cannot guarantee convergence to a fixed point.

Default: false

SET pg_trickle.allow_circular = true;

pg_trickle.max_fixpoint_iterations

Maximum number of iterations per strongly connected component (SCC) before the scheduler declares non-convergence and marks all SCC members as ERROR. Prevents runaway loops in circular dependency chains.

For most practical use cases (transitive closure, graph reachability), convergence happens in 2–5 iterations. The default of 100 provides ample headroom.

Default: 100
Range: 1 – 10,000

SET pg_trickle.max_fixpoint_iterations = 50;
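For illustration, a monotone cycle is declared the same way as any other stream table once the switch is on. The sketch below is hypothetical — it assumes a base table edges(src, dst) and uses only cycle-safe operators (join, UNION ALL); the exact query shape for your schema will differ:

```sql
SET pg_trickle.allow_circular = true;

-- Transitive closure: 'reach' depends on itself through a join,
-- so the scheduler iterates to a fixed point (typically 2-5 rounds).
SELECT pgtrickle.create_stream_table(
    name     => 'reach',
    query    => 'SELECT src, dst FROM edges
                 UNION ALL
                 SELECT e.src, r.dst FROM edges e JOIN reach r ON r.src = e.dst',
    schedule => '30s'
);
```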

pg_trickle.dog_feeding_auto_apply

Added in v0.20.0 (DF-G1).

Controls whether the dog-feeding analytics stream tables can automatically adjust stream table configuration.

| Value | Behaviour |
| --- | --- |
| off (default) | Advisory only — no automatic changes. Dog-feeding stream tables produce analytics that operators and dashboards can read, but nothing is applied automatically. |
| threshold_only | After each 10-minute auto-apply cycle, reads df_threshold_advice. If a recommendation has HIGH confidence and the recommended threshold differs from the current threshold by more than 5%, applies ALTER STREAM TABLE ... SET auto_threshold = <recommended>. Changes are logged with initiated_by = 'DOG_FEED'. |
| full | Same as threshold_only, plus applies scheduling hints from df_scheduling_interference (future enhancement). |

Default: off

-- Enable threshold auto-apply.
SET pg_trickle.dog_feeding_auto_apply = 'threshold_only';

-- Check current setting.
SHOW pg_trickle.dog_feeding_auto_apply;

Prerequisites: Dog-feeding stream tables must be created first via SELECT pgtrickle.setup_dog_feeding(). If the stream tables do not exist, the auto-apply worker is a no-op.

Rate limiting: At most one threshold change per stream table per 10 minutes.

Audit trail: All auto-apply changes are recorded in pgt_refresh_history with initiated_by = 'DOG_FEED' and a SKIP action describing the old and new threshold values.
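To review what the worker has changed, filter the history table on the initiated_by marker described above (table and column names as used elsewhere in this document):

```sql
-- Recent automatic threshold changes, newest first
SELECT *
FROM pgtrickle.pgt_refresh_history
WHERE initiated_by = 'DOG_FEED'
ORDER BY start_time DESC
LIMIT 20;
```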


GUC Interaction Matrix

Some GUC variables interact with or depend on each other. The table below documents these cross-dependencies to help avoid misconfiguration.

| GUC A | GUC B | Interaction |
| --- | --- | --- |
| event_driven_wake | scheduler_interval_ms | When event_driven_wake = true, the scheduler wakes on NOTIFY and scheduler_interval_ms serves only as the poll-based fallback interval. Lowering scheduler_interval_ms below 100 ms with event-driven wake enabled adds little value and wastes CPU. |
| event_driven_wake | wake_debounce_ms | wake_debounce_ms only takes effect when event_driven_wake = true. It coalesces rapid-fire notifications during bulk DML. Set higher (50–100 ms) for write-heavy workloads, lower (5–10 ms) for latency-sensitive workloads. |
| auto_backoff | min_schedule_seconds | auto_backoff stretches the effective interval up to 8× the configured schedule, but never below min_schedule_seconds. If min_schedule_seconds is high, backoff has limited room to operate. |
| auto_backoff | default_schedule_seconds | The backoff multiplier is applied to default_schedule_seconds (or the per-ST override); raising this value gives backoff a wider range. |
| parallel_refresh_mode | max_concurrent_refreshes | parallel_refresh_mode = 'on' dispatches independent STs to parallel workers, up to max_concurrent_refreshes per database. Setting max_concurrent_refreshes = 1 effectively disables parallelism even when the mode is 'on'. |
| parallel_refresh_mode | max_dynamic_refresh_workers | max_dynamic_refresh_workers is a cluster-wide cap across all databases. If you have 4 databases each wanting 4 concurrent refreshes, set this to ≥ 16 (or accept queuing). |
| max_dynamic_refresh_workers | per_database_worker_quota | When per_database_worker_quota > 0, each database claims at most that many workers from the shared max_dynamic_refresh_workers pool. Set per_database_worker_quota to max_dynamic_refresh_workers / n_databases for equal sharing. Burst to 150% is allowed when the cluster is < 80% loaded. |
| differential_max_change_ratio | fuse_default_ceiling | Both guard against large change batches but at different levels: differential_max_change_ratio triggers a FULL refresh fallback (proportional to table size), while fuse_default_ceiling halts refresh entirely (absolute row count). The fuse fires first if the change count exceeds it, regardless of the ratio. |
| block_source_ddl | DDL operations | When true, DDL on source tables (ALTER TABLE, DROP COLUMN) is blocked by an event trigger. Disable temporarily with SET pg_trickle.block_source_ddl = false before schema migrations, then re-enable. |
| cdc_mode | cdc_trigger_mode | cdc_trigger_mode ('statement' / 'row') only applies when CDC is trigger-based. When cdc_mode = 'wal' (or after auto-transition to WAL), cdc_trigger_mode is irrelevant. |
| cdc_mode | wal_transition_timeout | wal_transition_timeout only applies when cdc_mode = 'auto'. It controls how many seconds to wait for the first WAL-based refresh to succeed before falling back to triggers. |
| cleanup_use_truncate | compact_threshold | cleanup_use_truncate = true uses TRUNCATE to clear consumed change buffers (fastest, acquires AccessExclusiveLock briefly). compact_threshold controls when fully-consumed buffers are compacted via DELETE — only relevant when TRUNCATE is disabled. |
| buffer_partitioning | compact_threshold | In 'auto' mode, compact_threshold serves as the promotion trigger: if a buffer exceeds this many rows in a single refresh cycle, it is promoted to RANGE(lsn) partitioned mode. Lowering compact_threshold makes auto-promotion more sensitive. |
| allow_circular | max_fixpoint_iterations | max_fixpoint_iterations is only evaluated when allow_circular = true. It caps the number of convergence iterations for circular dependency chains. |
| ivm_topk_max_limit | TopK queries | Queries with LIMIT > ivm_topk_max_limit fall back to FULL refresh instead of the optimized TopK path. Raise this if you have legitimate large TopK queries. |
| ivm_recursive_max_depth | Recursive CTEs | Recursive expansion beyond ivm_recursive_max_depth iterations is terminated with a warning and falls back to FULL refresh. Set to 0 to disable the guard (not recommended). |
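When diagnosing interactions, it helps to see every pg_trickle setting in one place. Custom GUCs appear in PostgreSQL's standard pg_settings view, so a single query covers them all:

```sql
-- All pg_trickle settings, with the context that governs who may change them
SELECT name, setting, context
FROM pg_settings
WHERE name LIKE 'pg_trickle.%'
ORDER BY name;
```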

Tuning Profiles

Three named profiles for common deployment patterns. Copy the relevant settings into your postgresql.conf and adjust to taste.

Low-Latency Profile

Goal: Minimize end-to-end latency from base table write to stream table update. Best for dashboards, real-time analytics, and operational monitoring.

# Event-driven wake — sub-50ms median latency
pg_trickle.event_driven_wake = true
pg_trickle.wake_debounce_ms = 5              # aggressive: 5ms coalesce

# Fast scheduling
pg_trickle.scheduler_interval_ms = 200       # poll fallback (rarely used)
pg_trickle.min_schedule_seconds = 1
pg_trickle.default_schedule_seconds = 1

# Parallel refresh for independent STs
pg_trickle.parallel_refresh_mode = 'on'
pg_trickle.max_concurrent_refreshes = 4

# Lean merge
pg_trickle.merge_planner_hints = true
pg_trickle.merge_work_mem_mb = 128           # more memory = fewer disk sorts
pg_trickle.cleanup_use_truncate = true
pg_trickle.use_prepared_statements = true

# Guardrails
pg_trickle.auto_backoff = true               # prevent CPU runaway
pg_trickle.fuse_default_ceiling = 0          # disabled — latency over safety
pg_trickle.block_source_ddl = true

High-Throughput Profile

Goal: Maximize rows-per-second processed across many stream tables under heavy write load. Accepts slightly higher latency in exchange for better batching and resource efficiency.

# Batched wake — coalesce writes into larger deltas
pg_trickle.event_driven_wake = true
pg_trickle.wake_debounce_ms = 50             # 50ms coalesce window

# Relaxed scheduling
pg_trickle.scheduler_interval_ms = 2000      # 2-second poll fallback
pg_trickle.min_schedule_seconds = 2
pg_trickle.default_schedule_seconds = 5

# Heavy parallelism
pg_trickle.parallel_refresh_mode = 'on'
pg_trickle.max_concurrent_refreshes = 8
pg_trickle.max_dynamic_refresh_workers = 8

# Aggressive performance
pg_trickle.merge_planner_hints = true
pg_trickle.merge_work_mem_mb = 256           # large work_mem for big deltas
pg_trickle.merge_seqscan_threshold = 0.01    # allow seq scans for >1% changes
pg_trickle.cleanup_use_truncate = true
pg_trickle.use_prepared_statements = true
pg_trickle.auto_backoff = true
pg_trickle.buffer_partitioning = 'auto'      # O(1) cleanup for hot buffers

# Safety for bulk workloads
pg_trickle.fuse_default_ceiling = 500000     # pause on >500K changes
pg_trickle.differential_max_change_ratio = 0.25  # FULL fallback at 25%
pg_trickle.block_source_ddl = true

Resource-Constrained Profile

Goal: Minimize CPU and memory footprint for small instances, shared hosting, or development environments. Accepts higher latency and slower throughput.

# Poll-based only — no NOTIFY overhead
pg_trickle.event_driven_wake = false
pg_trickle.scheduler_interval_ms = 5000      # 5-second poll

# Conservative scheduling
pg_trickle.min_schedule_seconds = 5
pg_trickle.default_schedule_seconds = 10

# Minimal parallelism
pg_trickle.parallel_refresh_mode = 'off'     # single-threaded refresh
pg_trickle.max_concurrent_refreshes = 1
pg_trickle.max_dynamic_refresh_workers = 1

# Conservative memory
pg_trickle.merge_work_mem_mb = 32
pg_trickle.merge_planner_hints = true
pg_trickle.cleanup_use_truncate = true

# Tight guardrails
pg_trickle.auto_backoff = true
pg_trickle.fuse_default_ceiling = 100000
pg_trickle.differential_max_change_ratio = 0.10
pg_trickle.block_source_ddl = true
pg_trickle.buffer_alert_threshold = 500000

Complete postgresql.conf Example

# Required
shared_preload_libraries = 'pg_trickle'

# Essential
pg_trickle.enabled = true
pg_trickle.cdc_mode = 'auto'
pg_trickle.scheduler_interval_ms = 1000
pg_trickle.min_schedule_seconds = 1
pg_trickle.default_schedule_seconds = 1
pg_trickle.max_consecutive_errors = 3

# WAL CDC
pg_trickle.wal_transition_timeout = 300
pg_trickle.slot_lag_warning_threshold_mb = 100
pg_trickle.slot_lag_critical_threshold_mb = 1024

# Refresh performance
pg_trickle.differential_max_change_ratio = 0.15
pg_trickle.merge_planner_hints = true
pg_trickle.merge_work_mem_mb = 64
pg_trickle.cleanup_use_truncate = true
pg_trickle.use_prepared_statements = true
pg_trickle.user_triggers = 'auto'

# Guardrails & limits
pg_trickle.block_source_ddl = false
pg_trickle.buffer_alert_threshold = 1000000
pg_trickle.compact_threshold = 100000
pg_trickle.buffer_partitioning = 'off'
pg_trickle.max_grouping_set_branches = 64
pg_trickle.max_parse_depth = 64
pg_trickle.ivm_topk_max_limit = 1000
pg_trickle.ivm_recursive_max_depth = 100

# Circular dependencies (v0.7.0+)
pg_trickle.allow_circular = false                # master switch
pg_trickle.max_fixpoint_iterations = 100         # convergence limit

# Parallel refresh (v0.4.0+, default off)
pg_trickle.parallel_refresh_mode = 'off'        # 'off' | 'dry_run' | 'on'
pg_trickle.max_dynamic_refresh_workers = 4       # cluster-wide worker cap
pg_trickle.max_concurrent_refreshes = 4          # per-database dispatch cap

# Advanced / internal
pg_trickle.change_buffer_schema = 'pgtrickle_changes'
pg_trickle.foreign_table_polling = false

Runtime Configuration

All GUC variables can be changed at runtime by a superuser:

-- View current settings
SHOW pg_trickle.enabled;
SHOW pg_trickle.parallel_refresh_mode;

-- Enable parallel refresh for current session
SET pg_trickle.parallel_refresh_mode = 'on';

-- Change persistently (requires reload)
ALTER SYSTEM SET pg_trickle.scheduler_interval_ms = 500;
SELECT pg_reload_conf();

Further Reading

Scaling Guide

This document provides guidance for scaling pg_trickle to hundreds of stream tables and beyond. It covers worker pool sizing, scheduler tuning, and diagnostic queries for identifying bottlenecks.

Architecture Overview

pg_trickle uses a two-tier background worker model:

  1. Launcher — one per server. Scans pg_database every 10 seconds, spawns per-database schedulers, and auto-restarts crashed workers.
  2. Per-database scheduler — one per database. Wakes every scheduler_interval_ms (default: 1 s), reads DAG changes from shared memory, consumes CDC buffers, and dispatches refreshes.

When parallel_refresh_mode = 'on', the scheduler dispatches refresh work to a pool of dynamic background workers instead of running refreshes inline.

Worker Pool Sizing

| Deployment Size | Stream Tables | Recommended max_dynamic_refresh_workers | Notes |
| --- | --- | --- | --- |
| Small | 1–20 | 2–4 | Default (4) is usually sufficient |
| Medium | 20–100 | 4–8 | Monitor worker saturation |
| Large | 100–200 | 8–16 | Enable tiered scheduling |
| Very Large | 200+ | 16–32 | Tune per-database quotas |

Budget Formula

Worker slots are drawn from max_worker_processes, which is shared with autovacuum, parallel queries, and other extensions:

max_worker_processes >= launchers(1)
                      + schedulers(N_databases)
                      + max_dynamic_refresh_workers
                      + autovacuum_max_workers
                      + max_parallel_workers
                      + other_extensions
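As a worked example (assumed values: 2 databases, 16 dynamic refresh workers, 3 autovacuum workers, 8 parallel query workers, no other extensions), the minimum budget comes out to 30 slots, so a max_worker_processes of 40 leaves comfortable headroom:

```shell
# 1 launcher + 2 schedulers + 16 dynamic refresh workers
# + 3 autovacuum workers + 8 parallel query workers
echo $((1 + 2 + 16 + 3 + 8))   # prints 30
```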

Example for 200 STs across 2 databases with 16 workers:

# postgresql.conf
max_worker_processes = 40
pg_trickle.max_dynamic_refresh_workers = 16
pg_trickle.max_concurrent_refreshes = 8
pg_trickle.per_database_worker_quota = 8
pg_trickle.parallel_refresh_mode = 'on'

Tiered Scheduling

For deployments with 50+ stream tables, enable tiered scheduling to reduce scheduler overhead:

pg_trickle.tiered_scheduling = on   # default since v0.12.0

The scheduler classifies stream tables into tiers based on change frequency:

| Tier | Schedule Multiplier | Behavior |
| --- | --- | --- |
| Hot | 1× (base interval) | Tables with frequent changes |
| Warm | | Tables with moderate changes |
| Cold | 10× | Tables with rare changes |
| Frozen | skip | Tables with no recent changes |

This reduces the CPU cost of the scheduling loop itself, which can become a bottleneck at 200+ STs when every table is polled every cycle.

Dispatch Priority

When multiple stream tables are ready simultaneously, the scheduler dispatches in priority order:

  1. IMMEDIATE closures — time-critical refresh requests
  2. Atomic groups / Repeatable-read groups / Fused chains — multi-ST units
  3. Singletons — individual stream tables
  4. Cyclic SCCs — strongly-connected components

Within each priority band, the tier sort applies (Hot > Warm > Cold).

Per-Database Quotas and Burst

When per_database_worker_quota > 0, each database gets a guaranteed slice of the worker pool:

  • Normal load (cluster < 80% capacity): database can burst to 150% of its quota using idle capacity from other databases.
  • High load (cluster ≥ 80% capacity): strict quota enforcement.

This prevents a single high-traffic database from starving others.
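A minimal postgresql.conf sketch for a 16-worker pool shared equally by two databases (values are illustrative, not recommendations):

```
# postgresql.conf
pg_trickle.max_dynamic_refresh_workers = 16
pg_trickle.per_database_worker_quota = 8    # 16 workers / 2 databases
# Under light cluster load (< 80%), each database may burst to 12 workers (150% of 8).
```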

Monitoring

Worker Pool Status

SELECT * FROM pgtrickle.worker_pool_status();
-- Returns: active_workers, max_workers, per_db_cap, parallel_mode

Active Job Details

SELECT * FROM pgtrickle.parallel_job_status(300);
-- Returns recent jobs (last 300s): status, duration, worker PID, etc.

Health Summary

SELECT * FROM pgtrickle.health_summary();
-- Returns: total/active/error/suspended/stale counts, scheduler status, cache hit rate

Buffer Backlog Check

SELECT * FROM pgtrickle.change_buffer_sizes()
ORDER BY row_count DESC
LIMIT 20;

Identifying Bottlenecks

Is the scheduler loop the bottleneck?

-- If queue depth is consistently > 10 while workers are not saturated,
-- the scheduler loop is the bottleneck. Reduce scheduler_interval_ms.
SELECT (SELECT count(*) FROM pgtrickle.parallel_job_status(5)
        WHERE status = 'QUEUED') AS queue_depth,
       active_workers, max_workers
FROM pgtrickle.worker_pool_status();

Are workers saturated?

-- If active_workers == max_workers consistently, increase the pool.
SELECT active_workers >= max_workers AS saturated
FROM pgtrickle.worker_pool_status();

Which STs take the longest?

SELECT st.pgt_schema, st.pgt_name,
       AVG(EXTRACT(EPOCH FROM (h.end_time - h.start_time))) AS avg_sec,
       MAX(EXTRACT(EPOCH FROM (h.end_time - h.start_time))) AS max_sec,
       COUNT(*) AS refreshes
FROM pgtrickle.pgt_refresh_history h
JOIN pgtrickle.pgt_stream_tables st ON st.pgt_id = h.pgt_id
WHERE h.start_time > now() - interval '1 hour'
  AND h.status = 'COMPLETED'
GROUP BY st.pgt_schema, st.pgt_name
ORDER BY avg_sec DESC
LIMIT 20;

Tuning Profiles

Low-Latency (< 50 ms P99)

pg_trickle.scheduler_interval_ms = 200
pg_trickle.event_driven_wake = on
pg_trickle.parallel_refresh_mode = 'on'
pg_trickle.max_dynamic_refresh_workers = 8
pg_trickle.tiered_scheduling = on

High-Throughput (200+ STs)

pg_trickle.scheduler_interval_ms = 500
pg_trickle.parallel_refresh_mode = 'on'
pg_trickle.max_dynamic_refresh_workers = 16
pg_trickle.max_concurrent_refreshes = 8
pg_trickle.per_database_worker_quota = 8
pg_trickle.tiered_scheduling = on
pg_trickle.merge_work_mem_mb = 128

Resource-Constrained (4 CPU / 8 GB RAM)

pg_trickle.scheduler_interval_ms = 2000
pg_trickle.parallel_refresh_mode = 'on'
pg_trickle.max_dynamic_refresh_workers = 2
pg_trickle.max_concurrent_refreshes = 2
pg_trickle.tiered_scheduling = on
pg_trickle.delta_work_mem_cap_mb = 256
pg_trickle.merge_work_mem_mb = 32

Profiling Methodology

To profile worker utilization at scale, run a test with 200+ stream tables and max_workers set to 4, 8, and 16 in turn. Collect the following metrics at 1-second intervals:

-- Worker pool utilization over time
SELECT now() AS ts,
       (SELECT active_workers FROM pgtrickle.worker_pool_status()) AS active,
       (SELECT max_workers FROM pgtrickle.worker_pool_status()) AS pool_size,
       (SELECT COUNT(*) FROM pgtrickle.parallel_job_status(5)
        WHERE status = 'QUEUED') AS queue_depth;

Plot active / pool_size (utilization) and queue_depth over time. If utilization is consistently > 90% with non-zero queue depth, the pool is undersized. If utilization is < 50%, the pool is oversized and consuming max_worker_processes slots unnecessarily.

Known Scaling Limits

| Resource | Practical Limit | Bottleneck |
| --- | --- | --- |
| Stream tables per DB | ~500 | Scheduler loop CPU |
| Worker pool size | 64 | GUC max |
| Change buffer rows | max_buffer_rows (default 1M) | Disk I/O |
| Template cache size | 128 entries (L1) | Evictions increase at >128 STs |
| DAG depth | ~20 levels | Topological sort + cascade latency |

Read Replicas & Hot Standby

Added in v0.19.0 (SCAL-1 / STAB-2).

pg_trickle is a primary-only extension. Stream tables are maintained by the background scheduler through DML (INSERT, DELETE, MERGE), which is only possible on the primary server.

Behaviour on Replicas

When the pg_trickle shared library is loaded on a read replica (physical standby or streaming replica):

  1. The launcher worker detects pg_is_in_recovery() = true and enters a sleep loop, checking every 30 seconds for promotion.
  2. Upon promotion (e.g. pg_promote()), the launcher resumes normal operation and spawns per-database schedulers.
  3. Manual refresh calls (pgtrickle.refresh_stream_table()) on a replica are rejected with a clear error message.

Deployment recommendations:

  • Include pg_trickle in shared_preload_libraries on both primary and replicas. This ensures immediate availability after failover without a restart.
  • Stream tables are read-queryable on replicas via physical replication — the storage tables are regular PostgreSQL tables that replicate normally.
  • Monitor replication lag to estimate stream table staleness on replicas.
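A quick way to confirm which role a node is playing before invoking a manual refresh (pg_is_in_recovery() is a standard PostgreSQL function):

```sql
SELECT pg_is_in_recovery();
-- false on the primary (refreshes run here), true on a standby (read-only)
```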

CNPG & Kubernetes Operations

Added in v0.19.0 (SCAL-3).

CloudNativePG (CNPG) is the recommended Kubernetes operator for running pg_trickle. The extension is packaged as a custom container image that extends the official PostgreSQL image.

Container Image

Build the pg_trickle image using the provided Dockerfiles:

# GHCR image (multi-stage build)
docker build -f Dockerfile.ghcr -t pg-trickle:latest .

# Or use the CNPG-specific Dockerfile
docker build -f cnpg/Dockerfile.ext -t pg-trickle-cnpg:latest .

CNPG Cluster Configuration

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-trickle-cluster
spec:
  instances: 3
  imageName: your-registry/pg-trickle:0.19.0
  postgresql:
    shared_preload_libraries:
      - pg_trickle
    parameters:
      pg_trickle.enabled: "true"
      pg_trickle.scheduler_interval_ms: "1000"
      pg_trickle.max_concurrent_refreshes: "4"
      # STAB-1: If using PgBouncer sidecar in transaction mode:
      # pg_trickle.connection_pooler_mode: "transaction"

Operational Notes

  • Failover: pg_trickle detects promotion automatically (see Read Replicas above). After CNPG promotes a replica, the launcher starts within 30 seconds.
  • Scaling replicas: Stream table data replicates to all replicas via physical replication. No pg_trickle-specific configuration needed on replicas.
  • Backup: Use CNPG's built-in Barman backup. pg_trickle's catalog tables are included automatically. See Backup & Restore.
  • Monitoring: The Prometheus endpoint (pgtrickle.health_summary()) is compatible with CNPG's monitoring sidecar. See the Grafana dashboards in monitoring/grafana/.

Installation Guide

Prerequisites

| Requirement | Version |
| --- | --- |
| PostgreSQL | 18.x |

Building from source additionally requires Rust 1.85+ (edition 2024) and pgrx 0.17.x. Pre-built release artifacts only need a running PostgreSQL 18.x instance.


Installing from a Pre-built Release

1. Download the release archive

Download the archive for your platform from the GitHub Releases page:

| Platform | Archive |
| --- | --- |
| Linux x86_64 | pg_trickle-<ver>-pg18-linux-amd64.tar.gz |
| macOS Apple Silicon | pg_trickle-<ver>-pg18-macos-arm64.tar.gz |
| Windows x64 | pg_trickle-<ver>-pg18-windows-amd64.zip |

Optionally verify the checksum against SHA256SUMS.txt from the same release:

sha256sum -c SHA256SUMS.txt

2. Extract and install

Linux / macOS:

tar xzf pg_trickle-<ver>-pg18-linux-amd64.tar.gz
cd pg_trickle-<ver>-pg18-linux-amd64

sudo cp lib/*.so  "$(pg_config --pkglibdir)/"
sudo cp extension/*.control extension/*.sql "$(pg_config --sharedir)/extension/"

Windows (PowerShell):

Expand-Archive pg_trickle-<ver>-pg18-windows-amd64.zip -DestinationPath .
cd pg_trickle-<ver>-pg18-windows-amd64

Copy-Item lib\*.dll  "$(pg_config --pkglibdir)\"
Copy-Item extension\* "$(pg_config --sharedir)\extension\"

3. Using with CloudNativePG (Kubernetes)

pg_trickle is distributed as an OCI extension image for use with CloudNativePG Image Volume Extensions.

Requirements: Kubernetes 1.33+, CNPG 1.28+, PostgreSQL 18.

# Pull the extension image
docker pull ghcr.io/grove/pg_trickle-ext:<ver>

See cnpg/cluster-example.yaml and cnpg/database-example.yaml for complete Cluster and Database deployment examples.

Running with Docker

pg_trickle is published as a ready-to-run Docker image on the GitHub Container Registry. PostgreSQL 18.3 and pg_trickle are pre-installed, and all sensible GUC defaults (wal_level, shared_preload_libraries, memory, scheduler settings) are baked in — no configuration file editing needed.

docker pull ghcr.io/grove/pg_trickle:latest

docker run --rm \
  -e POSTGRES_PASSWORD=secret \
  -p 5432:5432 \
  ghcr.io/grove/pg_trickle:latest

CREATE EXTENSION pg_trickle; runs automatically on the default postgres database at first startup.

Available tags:

| Tag | Meaning |
| --- | --- |
| latest | Most recent release |
| pg18 | Floating alias for the latest PostgreSQL 18 build |
| <version>-pg18.3 | Immutable tag, e.g. 0.13.0-pg18.3 |

Override any GUC at runtime without rebuilding:

docker run --rm \
  -e POSTGRES_PASSWORD=secret \
  -p 5432:5432 \
  ghcr.io/grove/pg_trickle:latest \
  -c shared_buffers=2GB -c work_mem=64MB -c effective_cache_size=6GB

For persistent data, mount a volume:

docker run -d \
  --name pg_trickle \
  -e POSTGRES_PASSWORD=secret \
  -p 5432:5432 \
  -v pg_trickle_data:/var/lib/postgresql/data \
  ghcr.io/grove/pg_trickle:latest

Alternative — manual mount from a release archive: If you prefer to use the stock postgres:18.3 image rather than the pre-built image, extract the extension files from a release archive and mount them:

tar xzf pg_trickle-<ver>-pg18-linux-amd64.tar.gz
cd pg_trickle-<ver>-pg18-linux-amd64

docker run --rm \
  -v $PWD/lib/pg_trickle.so:/usr/lib/postgresql/18/lib/pg_trickle.so:ro \
  -v $PWD/extension/:/tmp/ext/:ro \
  -e POSTGRES_PASSWORD=postgres \
  postgres:18.3 \
  sh -c 'cp /tmp/ext/* /usr/share/postgresql/18/extension/ && \
         exec postgres -c shared_preload_libraries=pg_trickle'

Installing from PGXN

pg_trickle is published on the PostgreSQL Extension Network (PGXN). Installing via PGXN compiles the extension from source, so the Rust toolchain and pgrx are required.

1. Install prerequisites

# Rust toolchain
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source "$HOME/.cargo/env"

# pgrx build tool
cargo install --locked cargo-pgrx --version 0.17.0
cargo pgrx init --pg18 "$(pg_config --bindir)/pg_config"

2. Install the pgxn client

pip install pgxnclient

3. Install pg_trickle

pgxn install pg_trickle

To install a specific version:

pgxn install pg_trickle=0.10.0

Note: After installation, follow the PostgreSQL Configuration and Extension Installation steps below.


Building from Source

1. Install Rust

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

2. Install pgrx

cargo install --locked cargo-pgrx --version 0.17.0
cargo pgrx init --pg18 $(pg_config --bindir)/pg_config

3. Build the Extension

# Development build (faster compilation)
cargo pgrx install --pg-config $(pg_config --bindir)/pg_config

# Release build (optimized, for production)
cargo pgrx install --release --pg-config $(pg_config --bindir)/pg_config

# Package for deployment (creates installable artifacts)
cargo pgrx package --pg-config $(pg_config --bindir)/pg_config

PostgreSQL Configuration

Add the following to postgresql.conf before starting PostgreSQL:

# Required — loads the extension shared library at server start
shared_preload_libraries = 'pg_trickle'

# Must accommodate the pg_trickle launcher + one scheduler per database
# with pg_trickle installed + optional parallel refresh workers.
#
# WARNING: when this limit is reached, the launcher silently skips
# databases it cannot spawn a scheduler for and retries every 5 minutes.
# Those databases stop refreshing without any visible error.
# Check PostgreSQL logs for:
#   WARNING:  pg_trickle launcher: could not spawn scheduler for database '...'
#
# Formula:
#   1 (launcher) + N (one scheduler per DB) + max_dynamic_refresh_workers
#   + autovacuum_max_workers + parallel query workers + other extensions
#
# 32 is a safe starting point for most clusters:
max_worker_processes = 32

Note: With trigger-based CDC (the default starting mode), wal_level = logical and extra max_replication_slots are not required. WAL-based CDC (cdc_mode = 'wal', or 'auto' after the transition to WAL) does use logical decoding and needs a free replication slot — see the WAL CDC settings in the configuration reference.

Restart PostgreSQL after modifying these settings:

pg_ctl restart -D /path/to/data
# or
systemctl restart postgresql

Extension Installation

Connect to the target database and run:

CREATE EXTENSION pg_trickle;

This creates:

  • The pgtrickle schema with catalog tables and SQL functions
  • The pgtrickle_changes schema for change buffer tables
  • Event triggers for DDL tracking
  • The pgtrickle.pg_stat_stream_tables monitoring view

Verification

After installation, verify everything is working:

-- Check the extension version
SELECT extname, extversion FROM pg_extension WHERE extname = 'pg_trickle';

-- Or get a full status overview (includes version, scheduler state, stream table count)
SELECT * FROM pgtrickle.pgt_status();

Quick functional test

CREATE TABLE test_source (id INT PRIMARY KEY, val TEXT);
INSERT INTO test_source VALUES (1, 'hello');

SELECT pgtrickle.create_stream_table(
    'test_st',
    'SELECT id, val FROM test_source',
    '1m',
    'FULL'
);

SELECT * FROM test_st;
-- Should return: 1 | hello

-- Clean up
SELECT pgtrickle.drop_stream_table('test_st');
DROP TABLE test_source;

Upgrading

To upgrade pg_trickle to a newer version without losing data:

For comprehensive upgrade instructions, version-specific notes, troubleshooting, and rollback procedures, see docs/UPGRADING.md.

1. Install the new extension files

Follow the same steps as Installing from a Pre-built Release to overwrite the shared library and SQL files with the new version. You do not need to drop the extension from your databases first.

Linux / macOS:

tar xzf pg_trickle-<new-ver>-pg18-linux-amd64.tar.gz
cd pg_trickle-<new-ver>-pg18-linux-amd64

sudo cp lib/*.so  "$(pg_config --pkglibdir)/"
sudo cp extension/*.control extension/*.sql "$(pg_config --sharedir)/extension/"

2. Restart PostgreSQL (when required)

If the shared library ABI has changed, restart PostgreSQL before proceeding so the new .so/.dll is loaded. The release notes for each version will call this out explicitly when a restart is required.

pg_ctl restart -D /path/to/data
# or
systemctl restart postgresql

3. Apply the schema migration in each database

Connect to every database where pg_trickle is installed and run:

-- Upgrade to the latest bundled version
ALTER EXTENSION pg_trickle UPDATE;

-- Or upgrade to a specific version
ALTER EXTENSION pg_trickle UPDATE TO '<new-version>';

PostgreSQL uses the versioned SQL migration scripts bundled with the release (e.g. pg_trickle--0.2.3--0.3.0.sql, pg_trickle--0.3.0--0.4.0.sql) to apply catalog and SQL-surface changes. PostgreSQL automatically chains these scripts when you run ALTER EXTENSION pg_trickle UPDATE. The command is a no-op when no migration script is needed for a given release.
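To see which migration paths are bundled before upgrading, PostgreSQL's built-in pg_extension_update_paths() function lists every source/target pair and the script chain between them:

```sql
-- Available upgrade paths to the target version
SELECT source, target, path
FROM pg_extension_update_paths('pg_trickle')
WHERE target = '<new-version>';
```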

You can confirm the active version afterwards:

SELECT extversion FROM pg_extension WHERE extname = 'pg_trickle';

Coming soon: A future release will include a helper function (pgtrickle.upgrade()) that automates steps 2–3 across all databases in the cluster and validates catalog integrity after the migration. Until then, the manual steps above are the supported upgrade path.


Uninstallation

-- Drop all stream tables first
SELECT pgtrickle.drop_stream_table(pgt_schema || '.' || pgt_name)
FROM pgtrickle.pgt_stream_tables;

-- Drop the extension
DROP EXTENSION pg_trickle CASCADE;

Remove pg_trickle from shared_preload_libraries in postgresql.conf and restart PostgreSQL.

Troubleshooting

Unit tests crash on macOS 26+ (symbol not found in flat namespace)

macOS 26 (Tahoe) changed dyld to eagerly resolve all flat-namespace symbols at binary load time. pgrx extensions reference PostgreSQL server-internal symbols (e.g. CacheMemoryContext, SPI_connect) via the -Wl,-undefined,dynamic_lookup linker flag. These symbols are normally provided by the postgres executable when the extension is loaded as a shared library — but for cargo test --lib there is no postgres process, so the test binary aborts immediately:

dyld[66617]: symbol not found in flat namespace '_CacheMemoryContext'

This affects local development only — integration tests, E2E tests, and the extension itself running inside PostgreSQL are unaffected.

The fix is built into the just test-unit recipe. It automatically:

  1. Compiles a tiny C stub library (scripts/pg_stub.c → target/libpg_stub.dylib) that provides NULL/no-op definitions for the ~28 PostgreSQL symbols.
  2. Compiles the test binary with --no-run.
  3. Runs the binary with DYLD_INSERT_LIBRARIES pointing to the stub.

The stub is only built on macOS 26+. On Linux or older macOS, just test-unit runs cargo test --lib directly with no changes.

Note: The stub symbols are never called — unit tests exercise pure Rust logic only. If a test accidentally calls a PostgreSQL function it will crash with a NULL dereference (the desired fail-fast behavior).
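The manual equivalent of those three steps looks roughly like this — compiler flags and paths are assumptions; the justfile recipe is authoritative:

```shell
# 1. Build the stub dylib (macOS only; flags are illustrative)
cc -dynamiclib -o target/libpg_stub.dylib scripts/pg_stub.c

# 2. Compile the unit-test binary without running it
cargo test --lib --no-run

# 3. Run the compiled test binary with the stub injected so dyld
#    can resolve the flat-namespace symbols at load time
DYLD_INSERT_LIBRARIES=target/libpg_stub.dylib \
  target/debug/deps/<test-binary>
```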

If you run unit tests without just (e.g. directly via cargo test --lib), you can use the wrapper script instead:

./scripts/run_unit_tests.sh pg18

# With test name filter:
./scripts/run_unit_tests.sh pg18 -- test_parse_basic

Extension fails to load

Ensure shared_preload_libraries = 'pg_trickle' is set and PostgreSQL has been restarted (not just reloaded). The extension requires shared memory initialization at startup.

Background worker not starting

Check that max_worker_processes is high enough. In sequential mode (default) pg_trickle needs one slot per database with stream tables. With parallel refresh enabled (pg_trickle.parallel_refresh_mode = 'on') it additionally needs max_dynamic_refresh_workers slots (default 4) shared across all databases.

See the worker-budget formula in CONFIGURATION.md for sizing guidance.
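As a rough sanity check of the sizing described above, the slot demand can be sketched in a few lines — this mirrors the prose, not the authoritative formula in CONFIGURATION.md:

```python
def required_worker_slots(dbs_with_stream_tables: int,
                          parallel_refresh: bool = False,
                          max_dynamic_refresh_workers: int = 4) -> int:
    """Approximate pg_trickle worker-slot demand as described above.

    Sequential mode: one scheduler slot per database with stream tables.
    Parallel mode: plus a shared pool of dynamic refresh workers
    (pg_trickle.max_dynamic_refresh_workers, default 4).
    """
    slots = dbs_with_stream_tables
    if parallel_refresh:
        slots += max_dynamic_refresh_workers
    return slots

# Three databases with stream tables, parallel refresh with the default pool:
print(required_worker_slots(3, parallel_refresh=True))  # 7
```

Compare the result against max_worker_processes, remembering that other extensions and PostgreSQL itself also consume slots.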

Check logs for details

The extension logs at various levels. Enable debug logging for more detail:

SET client_min_messages TO debug1;

Next Steps

Upgrading pg_trickle

This guide covers upgrading pg_trickle from one version to another.


-- 1. Check current version
SELECT extversion FROM pg_extension WHERE extname = 'pg_trickle';

-- 2. Replace the binary files (.so/.dylib, .control, .sql)
--    See the installation method below for your platform.

-- 3. Restart PostgreSQL (required for shared library changes)
--    sudo systemctl restart postgresql

-- 4. Run the upgrade in each database that has pg_trickle installed
ALTER EXTENSION pg_trickle UPDATE;

-- 5. Verify the upgrade
SELECT pgtrickle.version();
SELECT * FROM pgtrickle.health_check();

Step-by-Step Instructions

1. Check Current Version

SELECT extversion FROM pg_extension WHERE extname = 'pg_trickle';
-- Returns your current installed version, e.g. '0.9.0'

2. Install New Binary Files

Replace the extension files in your PostgreSQL installation directory. The method depends on how you originally installed pg_trickle.

From release tarball:

# Replace <new-version> with the target release, for example 0.2.3
curl -LO https://github.com/getretake/pg_trickle/releases/download/v<new-version>/pg_trickle-<new-version>-pg18-linux-amd64.tar.gz
tar xzf pg_trickle-<new-version>-pg18-linux-amd64.tar.gz

# Copy files to PostgreSQL directories
sudo cp pg_trickle-<new-version>-pg18-linux-amd64/lib/* $(pg_config --pkglibdir)/
sudo cp pg_trickle-<new-version>-pg18-linux-amd64/extension/* $(pg_config --sharedir)/extension/

From source (cargo-pgrx):

cargo pgrx install --release

3. Restart PostgreSQL

The shared library (.so / .dylib) is loaded at server start via shared_preload_libraries. A restart is required for the new binary to take effect.

sudo systemctl restart postgresql
# or on macOS with Homebrew:
brew services restart postgresql@18

4. Run ALTER EXTENSION UPDATE

Connect to each database where pg_trickle is installed and run:

ALTER EXTENSION pg_trickle UPDATE;

This executes the upgrade migration scripts in order (for example, pg_trickle--0.5.0--0.6.0.sql, then pg_trickle--0.6.0--0.7.0.sql). PostgreSQL automatically determines the full upgrade chain from your current version to the new default_version.
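You can preview the chain PostgreSQL will follow using the built-in pg_extension_update_paths() function:

```sql
-- Show every available upgrade path between installed script versions
SELECT source, target, path
FROM pg_extension_update_paths('pg_trickle')
WHERE path IS NOT NULL;
```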

5. Verify the Upgrade

-- Check version
SELECT pgtrickle.version();

-- Run health check
SELECT * FROM pgtrickle.health_check();

-- Verify stream tables are intact
SELECT * FROM pgtrickle.stream_tables_info;

-- Test a refresh
SELECT pgtrickle.refresh_stream_table('your_stream_table');

Version-Specific Notes

0.1.3 → 0.2.0

New functions added:

  • pgtrickle.list_sources(name) — list source tables for a stream table
  • pgtrickle.change_buffer_sizes() — inspect CDC change buffer sizes
  • pgtrickle.health_check() — diagnostic health checks
  • pgtrickle.dependency_tree() — visualize the dependency DAG
  • pgtrickle.trigger_inventory() — audit CDC triggers
  • pgtrickle.refresh_timeline(max_rows) — refresh history
  • pgtrickle.diamond_groups() — diamond dependency group info
  • pgtrickle.version() — extension version string
  • pgtrickle.pgt_ivm_apply_delta(...) — internal IVM delta application
  • pgtrickle.pgt_ivm_handle_truncate(...) — internal TRUNCATE handler
  • pgtrickle._signal_launcher_rescan() — internal launcher signal

No schema changes to pgtrickle.pgt_stream_tables or pgtrickle.pgt_dependencies catalog tables.

No breaking changes. All v0.1.3 functions and views continue to work as before.

0.2.0 → 0.2.1

Three new catalog columns added to pgtrickle.pgt_stream_tables:

| Column | Type | Default | Purpose |
| --- | --- | --- | --- |
| topk_offset | INT | NULL | Pre-provisioned for paged TopK OFFSET (activated in v0.2.2) |
| has_keyless_source | BOOLEAN NOT NULL | FALSE | EC-06: keyless source flag; switches apply strategy from MERGE to counted DELETE |
| function_hashes | TEXT | NULL | EC-16: stores MD5 hashes of referenced function bodies for change detection |

The migration script (pg_trickle--0.2.0--0.2.1.sql) adds these columns via ALTER TABLE … ADD COLUMN IF NOT EXISTS.

No breaking changes. All v0.2.0 functions, views, and event triggers continue to work as before.
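The counted-DELETE strategy flagged by has_keyless_source can be pictured as multiset maintenance: without a key, MERGE cannot address one specific duplicate row, so deletions must be applied by count. A minimal Python sketch of the idea (not pg_trickle's actual implementation):

```python
from collections import Counter

def apply_delta(table: Counter, delta: list[tuple[tuple, int]]) -> None:
    """Apply (row, count) delta pairs to a keyless table held as a multiset.

    Positive counts insert duplicate rows; negative counts delete exactly
    that many copies -- the 'counted DELETE' idea.
    """
    for row, count in delta:
        table[row] += count
        if table[row] <= 0:
            del table[row]

orders = Counter({("widget", 5): 2})        # two identical keyless rows
apply_delta(orders, [(("widget", 5), -1)])  # delete just one copy
print(orders[("widget", 5)])  # 1
```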

What's also new:

  • Upgrade migration safety infrastructure (scripts, CI, E2E tests)
  • GitHub Pages book expansion (6 new documentation pages)
  • User-facing upgrade guide (this document)

0.2.1 → 0.2.2

No catalog table DDL changes. The topk_offset column needed for paged TopK was already added in v0.2.1.

Two SQL function updates are applied by pg_trickle--0.2.1--0.2.2.sql:

  • pgtrickle.create_stream_table(...)
    • default schedule changes from '1m' to 'calculated'
    • default refresh_mode changes from 'DIFFERENTIAL' to 'AUTO'
  • pgtrickle.alter_stream_table(...)
    • adds the optional query parameter used by ALTER QUERY support

Because PostgreSQL stores argument defaults and function signatures in pg_proc, the migration script must DROP FUNCTION and recreate both signatures during ALTER EXTENSION ... UPDATE.

Behavioral notes:

  • Existing stream tables keep their current catalog values. The migration only changes the defaults used by future create_stream_table(...) calls.
  • Existing applications can opt a table into the new defaults explicitly via pgtrickle.alter_stream_table(...) after the upgrade.
  • After installing the new binary and restarting PostgreSQL, the scheduler now warns if the shared library version and SQL-installed extension version do not match. This helps detect stale .so/.dylib files after partial upgrades.

0.2.2 → 0.2.3

One new catalog column is added to pgtrickle.pgt_stream_tables:

| Column | Type | Default | Purpose |
| --- | --- | --- | --- |
| requested_cdc_mode | TEXT | NULL | Optional per-stream-table CDC override ('auto', 'trigger', 'wal') |

The upgrade script also recreates two SQL functions:

  • pgtrickle.create_stream_table(...)
    • adds the optional cdc_mode parameter
  • pgtrickle.alter_stream_table(...)
    • adds the optional cdc_mode parameter

Monitoring view updates:

  • pgtrickle.pg_stat_stream_tables gains the cdc_modes column
  • pgtrickle.pgt_cdc_status is added for per-source CDC visibility

Because PostgreSQL stores function signatures and defaults in pg_proc, the upgrade script drops and recreates both lifecycle functions during ALTER EXTENSION ... UPDATE.

0.6.0 → 0.7.0

One new catalog column is added to pgtrickle.pgt_stream_tables:

| Column | Type | Default | Purpose |
| --- | --- | --- | --- |
| last_fixpoint_iterations | INT | NULL | Records how many rounds the last circular-dependency fixpoint run required |

Two new catalog tables are added:

| Table | Purpose |
| --- | --- |
| pgtrickle.pgt_watermarks | Stores per-source watermark progress reported by external loaders |
| pgtrickle.pgt_watermark_groups | Stores groups of sources that must stay temporally aligned before refresh |

The upgrade script also updates and adds SQL functions:

  • Recreates pgtrickle.pgt_status() so the result includes scc_id
  • Adds pgtrickle.pgt_scc_status() for circular-dependency monitoring
  • Adds pgtrickle.advance_watermark(source, watermark)
  • Adds pgtrickle.create_watermark_group(name, sources[], tolerance_secs)
  • Adds pgtrickle.drop_watermark_group(name)
  • Adds pgtrickle.watermarks()
  • Adds pgtrickle.watermark_groups()
  • Adds pgtrickle.watermark_status()

Behavioral notes:

  • Circular stream table dependencies can now run to convergence when pg_trickle.allow_circular = true and every member of the cycle is safe for monotone DIFFERENTIAL refresh.
  • The scheduler can now hold back refreshes until related source tables are aligned within a configured watermark tolerance.
  • Existing non-circular stream tables continue to work as before. The new catalog objects are additive.

0.7.0 → 0.8.0

No catalog schema changes. The upgrade migration script contains no DDL.

New operational features:

  • pg_dump / pg_restore support: stream tables are now safely exported and re-connected after restore without manual intervention.
  • Connection pooler opt-in was introduced at the per-stream level (superseded by the more comprehensive pooler_compatibility_mode added in v0.10.0).

No breaking changes. All v0.7.0 functions, views, and event triggers continue to work as before.

0.8.0 → 0.9.0

No catalog schema DDL changes to pgtrickle.pgt_stream_tables or the dependency catalog.

New API function added:

  • pgtrickle.restore_stream_tables() — re-installs CDC triggers and re-registers stream tables after a pg_restore from a pg_dump.

Hidden auxiliary columns for AVG / STDDEV / VAR aggregates. Stream tables using these aggregates will automatically receive hidden __pgt_aux_* columns on the next refresh after upgrading. No manual action is needed — pg_trickle detects missing auxiliary columns and performs a single full reinitialise to add them.

Behavioral notes:

  • COUNT, SUM, and AVG now update in constant time (O(changed rows)) instead of rescanning the whole group.
  • STDDEV and VAR variants likewise update in O(changed rows) via hidden sum-of-squares auxiliary columns.
  • MIN/MAX still requires a group rescan only when the deleted value is the current extreme.
  • Refresh groups (create_refresh_group, drop_refresh_group, refresh_groups()) are available starting from this version.
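The hidden-auxiliary-column idea behind these constant-time updates can be sketched in a few lines of Python — the state names here are illustrative, not the extension's actual __pgt_aux_* layout:

```python
class AvgVarState:
    """Per-group auxiliary state for incremental AVG / VAR_POP.

    Keeping count, sum, and sum of squares lets each delta be applied
    in O(changed rows) with no group rescan.
    """
    def __init__(self):
        self.n = 0      # row count            (cf. a hidden __pgt_aux_* column)
        self.s = 0.0    # running sum
        self.ss = 0.0   # running sum of squares

    def apply(self, value: float, sign: int) -> None:
        """sign = +1 for an inserted row, -1 for a deleted row."""
        self.n += sign
        self.s += sign * value
        self.ss += sign * value * value

    def avg(self) -> float:
        return self.s / self.n

    def var_pop(self) -> float:
        return self.ss / self.n - (self.s / self.n) ** 2

g = AvgVarState()
for v in (2.0, 4.0, 6.0):
    g.apply(v, +1)
g.apply(4.0, -1)   # a source row is deleted; no rescan needed
print(g.avg())     # 4.0
```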

0.9.0 → 0.10.0

Two new catalog columns added to pgtrickle.pgt_stream_tables:

| Column | Type | Default | Purpose |
| --- | --- | --- | --- |
| pooler_compatibility_mode | BOOLEAN NOT NULL | FALSE | Disables prepared statements and NOTIFY for this stream table — required when accessed through PgBouncer in transaction-pool mode |
| refresh_tier | TEXT NOT NULL | 'hot' | Tiered scheduling tier: hot, warm, cold, or frozen |

One new catalog table is added:

| Table | Purpose |
| --- | --- |
| pgtrickle.pgt_refresh_groups | Stores refresh groups for snapshot-consistent multi-table refresh |

The upgrade script also updates and adds SQL functions:

  • pgtrickle.create_stream_table(...) gains the pooler_compatibility_mode parameter
  • pgtrickle.create_stream_table_if_not_exists(...) likewise
  • pgtrickle.create_or_replace_stream_table(...) likewise
  • pgtrickle.alter_stream_table(...) likewise
  • Adds pgtrickle.create_refresh_group(name, members, isolation)
  • Adds pgtrickle.drop_refresh_group(name)
  • Adds pgtrickle.refresh_groups() — lists all declared groups

Behavioral notes:

  • pooler_compatibility_mode defaults to false. Existing stream tables are unaffected. Enable it only for stream tables accessed through PgBouncer transaction-mode pooling.
  • pg_trickle.auto_backoff now defaults to on (was off). The backoff threshold is raised from 80 % → 95 % and the maximum slowdown is capped at 8× (was 64×). If you relied on the old opt-in behaviour, set pg_trickle.auto_backoff = off explicitly.
  • diamond_consistency now defaults to 'atomic' for new stream tables (was 'none'). Existing stream tables keep their current setting.
  • The scheduler now uses row-level locking for concurrency control instead of session-level advisory locks, making pg_trickle compatible with PgBouncer transaction-pool and similar connection poolers.
  • Statistical aggregates (CORR, COVAR_*, REGR_*) now update incrementally using Welford-style accumulation, no longer requiring a group rescan.
  • Materialized view sources can now be used in DIFFERENTIAL mode when pg_trickle.matview_polling = on is set.
  • Recursive CTE stream tables with DELETE/UPDATE now use the Delete-and-Rederive algorithm (O(delta) instead of O(n)).

0.10.0 → 0.11.0

New catalog columns added to pgtrickle.pgt_stream_tables:

| Column | Type | Default | Purpose |
| --- | --- | --- | --- |
| effective_refresh_mode | TEXT | NULL | Actual refresh mode used in the last cycle (FULL / DIFFERENTIAL / APPEND_ONLY / TOP_K / NO_DATA); populated by the scheduler after each completed refresh |
| fuse_mode | TEXT NOT NULL | 'off' | Circuit-breaker mode: off, on, or auto |
| fuse_state | TEXT NOT NULL | 'armed' | Circuit-breaker state: armed, blown, or disabled |
| fuse_ceiling | BIGINT | NULL | Maximum change-row count that can pass through in one refresh before the fuse blows; NULL = unlimited |
| fuse_sensitivity | INT | NULL | Sensitivity multiplier for auto-fuse detection |
| blown_at | TIMESTAMPTZ | NULL | Timestamp when the fuse last triggered |
| blow_reason | TEXT | NULL | Human-readable reason the fuse blew |
| st_partition_key | TEXT | NULL | Partition key column for declaratively partitioned stream tables; NULL = not partitioned |

Updated function signatures — existing calls continue to work because new parameters all have defaults:

  • pgtrickle.create_stream_table(...) gains partition_by TEXT DEFAULT NULL
  • pgtrickle.create_stream_table_if_not_exists(...) likewise
  • pgtrickle.create_or_replace_stream_table(...) likewise
  • pgtrickle.alter_stream_table(...) gains fuse TEXT DEFAULT NULL, fuse_ceiling BIGINT DEFAULT NULL, fuse_sensitivity INT DEFAULT NULL

New functions:

  • pgtrickle.reset_fuse(name TEXT, action TEXT DEFAULT 'apply') — clear a blown fuse and resume scheduling
  • pgtrickle.fuse_status() — returns circuit-breaker state for every stream table
  • pgtrickle.explain_refresh_mode(name TEXT) — shows configured mode, effective mode, and the reason for any downgrade

Behavioral notes:

  • Event-driven wake (pg_trickle.event_driven_wake) is on by default — the background worker now wakes within ~15 ms of a source-table write instead of waiting up to 500 ms.
  • Stream-table-to-stream-table chains now refresh incrementally — downstream tables receive a small insert/delete delta rather than cascading full refreshes.
  • pg_trickle.tiered_scheduling now defaults to on.
  • Declaratively partitioned stream tables are supported via partition_by — the refresh MERGE is automatically restricted to only the changed partitions.
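As a sketch, a partitioned stream table might be declared like this (table and column names are illustrative):

```sql
SELECT pgtrickle.create_stream_table(
    name         => 'daily_totals',
    query        => 'SELECT day, sum(amount) AS total FROM payments GROUP BY day',
    schedule     => '1m',
    partition_by => 'day'   -- refresh MERGE touches only changed partitions
);
```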

0.11.0 → 0.12.0

No schema changes. This release adds four new diagnostic SQL functions only:

| Function | Returns | Purpose |
| --- | --- | --- |
| pgtrickle.explain_query_rewrite(query TEXT) | TABLE(pass_name TEXT, changed BOOL, sql_after TEXT) | Walk a query through every DVM rewrite pass to see how pg_trickle transforms it |
| pgtrickle.diagnose_errors(name TEXT) | TABLE(event_time TIMESTAMPTZ, error_type TEXT, error_message TEXT, remediation TEXT) | Last 5 FAILED refresh events with error classification and suggested fixes |
| pgtrickle.list_auxiliary_columns(name TEXT) | TABLE(column_name TEXT, data_type TEXT, purpose TEXT) | List all hidden __pgt_* auxiliary columns on a stream table's storage relation |
| pgtrickle.validate_query(query TEXT) | TABLE(valid BOOL, mode TEXT, reason TEXT) | Parse and validate a query for stream-table compatibility without creating one |

Behavioral notes:

  • The incremental engine now handles multi-table join deletes correctly — phantom rows after simultaneous deletes from multiple join sides no longer occur.
  • Stream-table-to-stream-table row identity is now computed consistently between the change buffer and the downstream table, eliminating stale duplicate rows after upstream UPDATEs.
  • pg_trickle.tiered_scheduling defaults to on (same as 0.11.0 runtime behaviour; this release makes it the explicit default).

0.12.0 → 0.13.0

Ten new catalog columns added to pgtrickle.pgt_stream_tables:

| Column | Type | Default | Purpose |
| --- | --- | --- | --- |
| effective_refresh_mode | TEXT | NULL | Computed refresh mode after AUTO resolution |
| fuse_mode | TEXT NOT NULL | 'off' | Fuse configuration: off, auto, or manual |
| fuse_state | TEXT NOT NULL | 'armed' | Current fuse state: armed or blown |
| fuse_ceiling | BIGINT | NULL | Maximum change count before fuse blows |
| fuse_sensitivity | INT | NULL | Consecutive cycles above ceiling before triggering |
| blown_at | TIMESTAMPTZ | NULL | Timestamp when the fuse last blew |
| blow_reason | TEXT | NULL | Reason the fuse blew |
| st_partition_key | TEXT | NULL | Partition key specification (RANGE, LIST, or HASH) |
| max_differential_joins | INT | NULL | Maximum join count for differential mode (auto-fallback to FULL when exceeded) |
| max_delta_fraction | DOUBLE PRECISION | NULL | Maximum delta-to-table ratio for differential mode (auto-fallback to FULL when exceeded) |

All columns use ADD COLUMN IF NOT EXISTS for idempotent upgrades.

Nine new SQL functions (plus one replacement with new signature):

| Function | Purpose |
| --- | --- |
| pgtrickle.explain_delta(name, format) | Delta SQL query plan inspection |
| pgtrickle.dedup_stats() | MERGE deduplication frequency counters |
| pgtrickle.shared_buffer_stats() | Per-source-buffer observability |
| pgtrickle.explain_refresh_mode(name) | Refresh mode decision explanation |
| pgtrickle.reset_fuse(name) | Reset a blown fuse |
| pgtrickle.fuse_status() | Fuse state across all stream tables |
| pgtrickle.explain_query_rewrite(query) | DVM rewrite pass inspection |
| pgtrickle.diagnose_errors(name) | Error classification and remediation |
| pgtrickle.list_auxiliary_columns(name) | Hidden __pgt_* column listing |
| pgtrickle.validate_query(query) | Query compatibility validation |
| pgtrickle.alter_stream_table(...) | (replaced) — new partition_by parameter |

New GUC variables:

| GUC | Default | Purpose |
| --- | --- | --- |
| pg_trickle.per_database_worker_quota | 0 (auto) | Per-database parallel worker limit |

Behavioral notes:

  • Shared change buffers: Multiple stream tables reading from the same source now automatically share a single change buffer. No migration action required — existing per-source buffers continue to work.
  • Columnar change tracking: Wide-table UPDATEs that touch only value columns (not GROUP BY / JOIN / WHERE columns) now generate significantly less delta volume. This is fully automatic.
  • Auto buffer partitioning: Set pg_trickle.buffer_partitioning = 'auto' to let high-throughput buffers self-promote to partitioned mode for O(1) cleanup.
  • dbt macros: If you use dbt-pgtrickle, update your macros to the matching v0.13.0 version. New config options: partition_by, fuse, fuse_ceiling, fuse_sensitivity.

No breaking changes. All v0.12.0 functions, views, and event triggers continue to work as before.

0.13.0 → 0.14.0

Two new catalog columns added to pgtrickle.pgt_stream_tables:

ColumnTypeDefaultPurpose
last_error_messageTEXTNULLError message from the last permanent refresh failure
last_error_atTIMESTAMPTZNULLTimestamp of the last permanent refresh failure

Updated function signature (return type gained new columns):

  • pgtrickle.st_refresh_stats() — gains consecutive_errors, schedule, refresh_tier, and last_error_message columns. The upgrade script drops and recreates the function. No behavior change for existing callers that ignore unknown columns.

New SQL functions (available immediately after ALTER EXTENSION ... UPDATE):

| Function | Purpose |
| --- | --- |
| pgtrickle.recommend_refresh_mode(name) | Workload-based refresh mode recommendation with confidence level |
| pgtrickle.refresh_efficiency(name) | Per-table FULL vs. DIFFERENTIAL performance metrics |
| pgtrickle.export_definition(name) | Export stream table as reproducible DROP+CREATE+ALTER DDL |
| pgtrickle.convert_buffers_to_unlogged() | Convert logged change buffers to UNLOGGED |

New GUC variables:

| GUC | Default | Purpose |
| --- | --- | --- |
| pg_trickle.planner_aggressive | true | Consolidated switch replacing merge_planner_hints + merge_work_mem_mb |
| pg_trickle.unlogged_buffers | false | Create new change buffers as UNLOGGED (reduces WAL by ~30%) |
| pg_trickle.agg_diff_cardinality_threshold | 1000 | Warn at creation time when GROUP BY cardinality is below this |

Deprecated GUCs (still accepted but ignored at runtime):

  • pg_trickle.merge_planner_hints → use pg_trickle.planner_aggressive
  • pg_trickle.merge_work_mem_mb → use pg_trickle.planner_aggressive

Behavioral notes:

  • Error-state circuit breaker: A single permanent refresh failure (e.g. a function that doesn't exist for the column type) now immediately sets the stream table status to ERROR with a message stored in last_error_message. The scheduler skips ERROR tables. Use pgtrickle.resume_stream_table(name) followed by pgtrickle.alter_stream_table(name, query => ...) to recover.
  • Tiered scheduling NOTICE: Demoting a stream table from hot to cold or frozen now emits a NOTICE so operators are aware the effective refresh interval has changed (10× for cold, suspended for frozen).
  • SECURITY DEFINER triggers: All CDC trigger functions now run with SECURITY DEFINER and an explicit SET search_path, hardening against privilege-escalation attacks. This is applied automatically on upgrade — no manual action needed.
  • TUI binary: A pgtrickle command-line tool is now included in the package. See TUI.md for usage.

No breaking changes. All v0.13.0 functions, views, and event triggers continue to work as before.


Supported Upgrade Paths

The following migration hops are available. PostgreSQL chains them automatically when you run ALTER EXTENSION pg_trickle UPDATE.

| From | To | Script |
| --- | --- | --- |
| 0.1.3 | 0.2.0 | pg_trickle--0.1.3--0.2.0.sql |
| 0.2.0 | 0.2.1 | pg_trickle--0.2.0--0.2.1.sql |
| 0.2.1 | 0.2.2 | pg_trickle--0.2.1--0.2.2.sql |
| 0.2.2 | 0.2.3 | pg_trickle--0.2.2--0.2.3.sql |
| 0.2.3 | 0.3.0 | pg_trickle--0.2.3--0.3.0.sql |
| 0.3.0 | 0.4.0 | pg_trickle--0.3.0--0.4.0.sql |
| 0.4.0 | 0.5.0 | pg_trickle--0.4.0--0.5.0.sql |
| 0.5.0 | 0.6.0 | pg_trickle--0.5.0--0.6.0.sql |
| 0.6.0 | 0.7.0 | pg_trickle--0.6.0--0.7.0.sql |
| 0.7.0 | 0.8.0 | pg_trickle--0.7.0--0.8.0.sql |
| 0.8.0 | 0.9.0 | pg_trickle--0.8.0--0.9.0.sql |
| 0.9.0 | 0.10.0 | pg_trickle--0.9.0--0.10.0.sql |
| 0.10.0 | 0.11.0 | pg_trickle--0.10.0--0.11.0.sql |
| 0.11.0 | 0.12.0 | pg_trickle--0.11.0--0.12.0.sql |
| 0.12.0 | 0.13.0 | pg_trickle--0.12.0--0.13.0.sql |
| 0.13.0 | 0.14.0 | pg_trickle--0.13.0--0.14.0.sql |

That means any installation currently on 0.1.3 through 0.13.0 can upgrade straight to 0.14.0 with a single ALTER EXTENSION pg_trickle UPDATE, once the new binaries are installed and PostgreSQL has been restarted.


Rollback / Downgrade

PostgreSQL does not support automatic extension downgrades. To roll back:

  1. Export stream table definitions (if you want to recreate them later):

     cargo run --bin pg_trickle_dump -- --output backup.sql

     Or, if the binary is already installed in your PATH:

     pg_trickle_dump --output backup.sql

     Use --dsn '<connection string>' or standard PG* / DATABASE_URL environment variables when the default local connection parameters are not sufficient.

  2. Drop the extension (destroys all stream tables):

     DROP EXTENSION pg_trickle CASCADE;

  3. Install the old version and restart PostgreSQL.

  4. Recreate the extension at the old version:

     CREATE EXTENSION pg_trickle VERSION '0.1.3';

  5. Recreate stream tables from your backup.


Troubleshooting

"function pgtrickle.xxx does not exist" after upgrade

This means the upgrade script is missing a function. Workaround:

-- Check what version PostgreSQL thinks is installed
SELECT extversion FROM pg_extension WHERE extname = 'pg_trickle';

-- If the version looks correct but functions are missing,
-- the upgrade script may be incomplete. Try a clean reinstall:
DROP EXTENSION pg_trickle CASCADE;
CREATE EXTENSION pg_trickle CASCADE;
-- Warning: this destroys all stream tables!

Report this as a bug — upgrade scripts should never silently drop functions.

"could not access file pg_trickle" after restart

The new shared library file was not installed correctly. Verify:

ls -la $(pg_config --pkglibdir)/pg_trickle*

ALTER EXTENSION UPDATE says "already at version X"

This usually means the new .control file was not installed, so its default_version still matches the version already recorded in the catalog and there is no upgrade path to follow. Check:

cat $(pg_config --sharedir)/extension/pg_trickle.control

Multi-Database Environments

ALTER EXTENSION UPDATE must be run in each database where pg_trickle is installed. A common pattern:

for db in $(psql -t -c "SELECT datname FROM pg_database WHERE datname NOT IN ('template0', 'template1')"); do
  psql -d "$db" -c "ALTER EXTENSION pg_trickle UPDATE;" 2>/dev/null || true
done

CloudNativePG (CNPG)

For CNPG deployments, see cnpg/README.md for upgrade instructions specific to the Kubernetes operator.

Backup and Restore

Like any standard PostgreSQL extension, pg_trickle supports logical backups via pg_dump and physical backups (via tools like pgBackRest or pg_basebackup).

Because pg_trickle maintains internal state automatically (Change Data Capture buffers and DDL event triggers), follow the workflows below to ensure a smooth recovery.

Physical Backups (pgBackRest / pg_basebackup)

Physical backups copy the underlying data blocks. These are the most robust backups.

No special steps are needed during restore. When the database comes online, pg_trickle's catalogs, CDC buffers, and internal dependencies exist precisely as they did at the moment the snapshot was taken.

Note for WAL-Mode Users: Physical backups do not export replication slot data by default. If your CDC pipeline was in wal mode, logical slots might not survive the recreation. The pg_trickle scheduler handles missing slots gracefully by temporarily re-enabling table triggers.

Logical Backups (pg_dump / pg_restore)

Logical backups dump your database as portable, cross-version SQL (CREATE TABLE, INSERT, CREATE INDEX).

pg_trickle integrates with pg_dump natively. Restoring such a backup typically recreates schemas first, then inserts data, and lastly applies indexes and triggers. That ordering matters: the restore must proceed in a controlled sequence so the extension can re-create its own internal triggers without conflicting with the plain SQL statements in the dump.

The most reliable approach is to use pg_restore's --section arguments. Breaking the restore into phases guarantees that by the time the schema, data, and constraints are created, the extension's settings and catalogs are live in the database, and our custom DdlEventKind::ExtensionChange hook intercepts the restore and automatically invokes pgtrickle.restore_stream_tables().
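A sectioned restore might look like this (database and archive names are placeholders):

```shell
# Restore in explicit phases so the extension's hooks fire in the right order
pg_restore --section=pre-data  -d mydb backup.dump
pg_restore --section=data      -d mydb backup.dump
pg_restore --section=post-data -d mydb backup.dump
```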

pg_trickle TUI — User Guide

pgtrickle is a terminal tool for managing and monitoring pg_trickle stream tables. It works in two modes:

  • Interactive dashboard — run pgtrickle with no arguments to launch a live-updating TUI that shows all your stream tables, their health, dependencies, and configuration.
  • One-shot CLI — run pgtrickle <command> to perform a single operation and exit. Output goes to stdout in table, JSON, or CSV format. Designed for scripts, CI pipelines, and automation.

Building

The TUI is a standalone Rust binary in the pgtrickle-tui workspace member. It does not require the PostgreSQL extension to compile — only a Rust toolchain.

# Build (debug)
cargo build -p pgtrickle-tui

# Build (release, optimized)
cargo build --release -p pgtrickle-tui

# The binary is at:
#   target/debug/pgtrickle       (debug)
#   target/release/pgtrickle     (release)

To install it on your PATH:

cargo install --path pgtrickle-tui

Verify:

pgtrickle --version
pgtrickle --help

Requirements

  • Rust 2024 edition (1.85+)
  • A running PostgreSQL 18 server with the pg_trickle extension installed
  • Network access to the database (no local socket required)

Connecting to a Database

pgtrickle resolves connection parameters in this order (first match wins):

| Priority | Method | Example |
| --- | --- | --- |
| 1 | --url flag | pgtrickle --url postgres://user:pass@host:5432/mydb list |
| 2 | PGTRICKLE_URL env var | export PGTRICKLE_URL=postgres://... |
| 3 | Individual flags | --host, --port, --dbname, --user, --password |
| 4 | Standard libpq env vars | PGHOST, PGPORT, PGDATABASE, PGUSER, PGPASSWORD |
| 5 | Defaults | localhost:5432/postgres as user postgres |

Connection flags work with every subcommand and with the interactive dashboard:

# URL-style connection
pgtrickle --url postgres://admin:secret@db.example.com:5432/analytics

# Environment variables (most common in production)
export PGHOST=db.example.com
export PGPORT=5432
export PGDATABASE=analytics
export PGUSER=admin
export PGPASSWORD=secret
pgtrickle list

# Explicit flags
pgtrickle --host db.example.com --dbname analytics --user admin list

Interactive Dashboard

Run pgtrickle with no subcommand:

pgtrickle

This opens a full-screen terminal UI that auto-refreshes every 2 seconds. The screen has three areas:

  • Header — application name, current view, connection status (● connected / ✗ disconnected), and time since last poll.
  • Body — the active view (see below).
  • Footer — keyboard shortcuts for switching views and a filter indicator.

Press q or Ctrl+C to exit.

Views

There are 14 views. Switch between them by pressing the key shown:

| Key | View | What it shows |
| --- | --- | --- |
| 1 | Dashboard | All stream tables in a sortable list with status, mode, staleness, and last refresh time. A status ribbon at the top summarizes active/error/stale counts. |
| 2 | Detail | Deep dive into the selected stream table: properties (schema, status, mode, schedule, tier, refresh mode explanation), source tables, refresh history, CDC health, and diagnosed errors for error-state tables. |
| 3 | Dependencies | The stream table dependency graph rendered as an ASCII tree. Edges are color-coded by status (green = active, red = error). |
| 4 | Refresh Log | A scrollable timeline of recent refreshes across all tables — timestamp, mode (DIFF/FULL), table name, status, duration, and rows affected. |
| 5 | Diagnostics | Output of recommend_refresh_mode() — shows each table's current mode vs. recommended mode with confidence level and reasoning. |
| 6 | CDC Health | Change buffer sizes and byte counts per source table, plus the CDC mode (trigger/WAL). Large buffers are highlighted as warnings. |
| 7 | Configuration | All pg_trickle.* GUC parameters: current value, unit, category, and description. |
| 8 | Health Checks | Results of health_check() — each check displays a name, severity (OK/WARN/CRITICAL), and detail message. Critical items are shown in red. |
| 9 | Alerts | Real-time alert feed from LISTEN pg_trickle_alert. Shows timestamp, severity icon, and message for each event. |
| w | Workers | Background scheduler worker pool: each worker's state (running/idle), the table it's refreshing, and duration. Below that, the pending job queue with priority and wait time. |
| f | Fuse | Circuit breaker status for each stream table: fuse state (ARMED/TRIPPED/BLOWN), consecutive error count, and last error message. |
| m | Watermarks | Watermark group alignment: group name, member count, min/max watermarks, and whether the group is gated. Two tabs: Groups and Gates. |
| d | Delta Inspector | Fetches and displays the auto-generated delta SQL for the selected stream table (two tabs: Delta SQL and Auxiliary Columns). Press e to show the table's CREATE DDL. |
| i | Issues | All detected DAG issues (cycles, orphans, missing sources) sorted by severity and blast radius. |

Keyboard Shortcuts

Navigation — works in all views:

| Key | Action |
| --- | --- |
| j or ↓ | Move selection down |
| k or ↑ | Move selection up |
| Page Down / Page Up | Scroll 20 rows |
| Home | Jump to first row |
| End | Jump to last row |
| Enter | Drill into detail (Dashboard → Detail view; Delta Inspector → reload delta SQL) |
| Esc | Go back to Dashboard / close overlay / clear filter |
| Tab | Switch sub-tabs (Delta Inspector: SQL ↔ Auxiliary Columns; Watermarks: Groups ↔ Gates) |

Write actions (view-specific):

| Key | View | Action |
| --- | --- | --- |
| r | Dashboard, Detail | Refresh selected stream table |
| R | Dashboard | Refresh all active tables (with confirmation) |
| p | Dashboard, Detail | Pause selected (with confirmation) |
| P | Dashboard, Detail | Resume selected |
| e | Detail, Delta Inspector | Show CREATE DDL overlay for selected table |
| A | Fuse | Re-arm fuse for selected (with confirmation) |
| g | Watermarks (Gates tab) | Gate / ungate selected source (confirmation for gate) |

Global actions:

| Key | Action |
|---|---|
| `/` | Open filter — type to search, Enter to apply, Esc to cancel |
| `:` | Open command palette |
| `s` / `S` | Cycle sort field / reverse sort direction (Dashboard) |
| `t` | Toggle light/dark theme |
| Ctrl+R | Force an immediate poll |
| Ctrl+E | Export current view to JSON file (`/tmp/pgtrickle_export_*.json`) |
| `?` | Toggle help overlay |
| `q` or Ctrl+C | Quit |

View switching:

Press 1–9, w, f, m, d, g, or i to jump directly to any view. The active view and selected table are shown in both the header bar and the footer nav bar.

Command Palette

Press : to open the command palette. Tab-completion works on stream table names. Available commands:

| Command | Description |
|---|---|
| `refresh <name>` | Refresh a stream table (or `refresh all`) |
| `pause <name>` | Pause a stream table |
| `resume <name>` | Resume a paused stream table |
| `repair <name>` | Re-install CDC triggers |
| `export <name>` | Show CREATE DDL overlay |
| `explain <name>` | Fetch and display delta SQL for a stream table |
| `validate <SQL>` | Validate a SQL query against the extension |
| `fuse reset <name>` | Reset the circuit breaker fuse |
| `quit` | Exit the TUI |

LISTEN/NOTIFY

The TUI opens a second, dedicated database connection that runs LISTEN pg_trickle_alert. Alerts (refresh failures, auto-suspension events, etc.) appear in the Alerts view (9) in real time, without waiting for the next poll cycle.


CLI Subcommands

Every subcommand runs non-interactively: it connects, executes one query, prints the result, and exits. This makes them suitable for shell scripts, cron jobs, CI pipelines, and monitoring probes.
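
For example, a cron entry could run the health probe hourly (the schedule and log path are illustrative, not a project convention):

```
# m h dom mon dow  command
0 * * * *  pgtrickle health --format json >> /var/log/pgtrickle_health.log 2>&1
```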

Output Formats

All subcommands that produce tabular output accept --format / -f:

| Format | Flag | Description |
|---|---|---|
| Table | `--format table` (default) | Human-readable aligned columns |
| JSON | `--format json` | Array of objects on stdout |
| CSV | `--format csv` | Comma-separated values |

Command Reference

pgtrickle list

List all stream tables with status, mode, schedule, tier, and refresh stats.

pgtrickle list
pgtrickle list --format json

pgtrickle status <name>

Show detailed status for a single stream table.

pgtrickle status order_totals
pgtrickle status order_totals --format json

pgtrickle refresh <name>

Trigger a manual refresh of one stream table, or all of them.

pgtrickle refresh order_totals
pgtrickle refresh --all

pgtrickle create <name> <query>

Create a new stream table with the given defining query.

pgtrickle create my_totals "SELECT region, SUM(amount) FROM orders GROUP BY region"
pgtrickle create my_totals "SELECT ..." --schedule 5m --mode differential
pgtrickle create my_totals "SELECT ..." --no-initialize

| Flag | Description |
|---|---|
| `--schedule` | Refresh schedule (e.g. `5m`, `@hourly`) |
| `--mode` | Refresh mode: `auto`, `differential`, `full`, `immediate` |
| `--no-initialize` | Skip the initial refresh after creation |

pgtrickle drop <name>

Drop a stream table.

pgtrickle drop my_totals

pgtrickle alter <name>

Change a stream table's settings.

pgtrickle alter order_totals --mode full
pgtrickle alter order_totals --schedule 10m
pgtrickle alter order_totals --tier cold
pgtrickle alter order_totals --status paused
pgtrickle alter order_totals --query "SELECT ..."

| Flag | Description |
|---|---|
| `--mode` | New refresh mode |
| `--schedule` | New refresh schedule |
| `--tier` | New scheduling tier (`hot`, `warm`, `cold`, `frozen`) |
| `--status` | New status (`active`, `paused`, `suspended`) |
| `--query` | New defining query (ALTER QUERY) |

pgtrickle export <name>

Print the DDL (SQL definition) for a stream table.

pgtrickle export order_totals

pgtrickle diag [name]

Show refresh mode diagnostics and recommendations. Without a name, shows all tables. With a name, shows diagnostics for that table only.

pgtrickle diag
pgtrickle diag order_totals
pgtrickle diag --format json

pgtrickle cdc

Show CDC change buffer sizes and health.

pgtrickle cdc
pgtrickle cdc --format json

pgtrickle graph

Print the stream table dependency graph as an ASCII tree.

pgtrickle graph
pgtrickle graph --format json

pgtrickle config

Show all pg_trickle.* GUC parameters, or set one.

pgtrickle config
pgtrickle config --set pg_trickle.unlogged_buffers=true
pgtrickle config --format json

The --set flag runs ALTER SYSTEM SET followed by pg_reload_conf().
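
Under the hood, this is equivalent to running the following from psql (shown for the `pg_trickle.unlogged_buffers` example above):

```sql
ALTER SYSTEM SET pg_trickle.unlogged_buffers = 'true';
SELECT pg_reload_conf();
```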

pgtrickle health

Run system health checks. Returns exit code 1 if any check is CRITICAL.

pgtrickle health
pgtrickle health --format json

# Use in CI/monitoring:
pgtrickle health || echo "Health check failed"

pgtrickle workers

Show the background worker pool status and pending job queue.

pgtrickle workers
pgtrickle workers --format json

pgtrickle fuse

Show fuse (circuit breaker) status for all stream tables.

pgtrickle fuse
pgtrickle fuse --format json

pgtrickle watermarks

Show watermark groups and source gating status.

pgtrickle watermarks
pgtrickle watermarks --format json

pgtrickle explain <name>

Inspect the generated delta SQL, DVM operator tree, or deduplication stats for a stream table. By default shows the delta SQL.

pgtrickle explain order_totals                  # Delta SQL
pgtrickle explain order_totals --analyze        # EXPLAIN ANALYZE on the delta
pgtrickle explain order_totals --operators      # DVM operator tree
pgtrickle explain order_totals --dedup          # Dedup stats per source
pgtrickle explain order_totals --format json

| Flag | Description |
|---|---|
| `--analyze` | Run EXPLAIN ANALYZE on the delta query |
| `--operators` | Show the DVM operator tree instead of raw SQL |
| `--dedup` | Show change buffer deduplication statistics |

pgtrickle watch

Non-interactive continuous output mode. Polls the database and prints a status table at regular intervals. Useful for CI logs, monitoring, and terminals without TUI support.

pgtrickle watch                     # Default: every 2 seconds
pgtrickle watch -n 10               # Every 10 seconds
pgtrickle watch --compact           # One line per table
pgtrickle watch --no-color          # No ANSI color codes
pgtrickle watch --append            # Append mode (don't clear screen)

# Log to a file
pgtrickle watch --compact --no-color --append >> /var/log/pgtrickle.log

| Flag | Short | Description |
|---|---|---|
| `--interval` | `-n` | Poll interval in seconds (default: 2) |
| `--compact` | | One-line-per-table output |
| `--no-color` | | Disable ANSI color codes |
| `--append` | | Append to stdout instead of clearing the screen |

pgtrickle completions <shell>

Generate shell completion scripts. Install them once and get tab-completion for all subcommands and flags.

# Bash
pgtrickle completions bash > /etc/bash_completion.d/pgtrickle
# or for the current user:
pgtrickle completions bash > ~/.local/share/bash-completion/completions/pgtrickle

# Zsh
pgtrickle completions zsh > ~/.zfunc/_pgtrickle

# Fish
pgtrickle completions fish > ~/.config/fish/completions/pgtrickle.fish

# PowerShell
pgtrickle completions powershell > pgtrickle.ps1

Examples

Quick health check in CI

#!/bin/bash
set -e
export PGHOST=db.example.com PGDATABASE=analytics PGUSER=monitor

pgtrickle health || { echo "pg_trickle health check failed"; exit 1; }
pgtrickle list --format json | jq '.[] | select(.status != "ACTIVE")'

Monitor stream tables in a tmux pane

pgtrickle watch -n 5

Export all definitions for version control

for name in $(pgtrickle list --format json | jq -r '.[].name'); do
  pgtrickle export "$name" > "sql/stream_tables/${name}.sql"
done

Debug a slow differential refresh

pgtrickle explain order_totals --analyze
pgtrickle explain order_totals --operators
pgtrickle explain order_totals --dedup

How It Works

The TUI connects to PostgreSQL using tokio-postgres (async, no TLS by default) and queries pg_trickle's built-in SQL API functions:

| View | SQL function(s) |
|---|---|
| Dashboard | `pgtrickle.st_refresh_stats()` |
| Detail | `pgtrickle.explain_refresh_mode()`, `pgtrickle.list_sources()`, `pgtrickle.get_refresh_history()`, `pgtrickle.diagnose_errors()` |
| Dependencies | `pgtrickle.dependency_tree()` |
| Refresh Log | `pgtrickle.refresh_timeline()` |
| Diagnostics | `pgtrickle.recommend_refresh_mode()` |
| CDC Health | `pgtrickle.change_buffer_sizes()`, `pgtrickle.check_cdc_health()` |
| Configuration | `pg_settings WHERE name LIKE 'pg_trickle.%'` |
| Health Checks | `pgtrickle.health_check()` |
| Alerts | `LISTEN pg_trickle_alert` (real-time) |
| Workers | `pgtrickle.worker_pool_status()`, `pgtrickle.parallel_job_status()` |
| Fuse | `pgtrickle.fuse_status()` |
| Watermarks | `pgtrickle.watermark_groups()`, `pgtrickle.source_gate_status()` |
| Delta Inspector | `pgtrickle.explain_delta()`, `pgtrickle.list_auxiliary_columns()`, `pgtrickle.pgt_stream_tables` (DDL) |
| Issues | `pgtrickle.dag_issues()` |

In interactive mode, a background task polls all of these every 2 seconds and pushes state updates to the rendering loop. A second connection runs LISTEN pg_trickle_alert for real-time notifications.

The TUI is purely a client — it reads from pg_trickle's monitoring API and sends commands (refresh, create, drop, alter) through the same SQL functions you would call from psql. It does not require any special privileges beyond what the pg_trickle SQL API requires.

Planned: cache_stats() and health_summary() Integration

Status: Not yet surfaced in the TUI (v0.18.0 gap).

The following SQL functions are available but not yet integrated into the TUI:

  • pgtrickle.cache_stats() — template cache hit rate, L1 hits, evictions, delta cache entries. Useful for monitoring cache effectiveness.
  • pgtrickle.health_summary() — single-row deployment overview with total/active/error/stale stream table counts, P99 refresh latency, scheduler status, and cache hit rate.

Lightest integration path: Add cache hit rate to the Dashboard status ribbon (currently shows scheduler status from quick_health). The Health Checks view (8) could display health_summary() fields alongside the existing health_check() results. Both functions are already available via raw SQL (psql, Grafana, or the Prometheus exporter).
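
In the meantime, both functions can be queried directly; for example (exact output columns may vary by version):

```sql
-- Template/delta cache effectiveness
SELECT * FROM pgtrickle.cache_stats();

-- Single-row deployment overview
SELECT * FROM pgtrickle.health_summary();
```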

Tech Stack

| Component | Crate | Purpose |
|---|---|---|
| Terminal rendering | ratatui 0.29 + crossterm 0.28 | Full-screen TUI with color, layout, widgets |
| Async runtime | tokio 1.x | Background polling, LISTEN/NOTIFY, signals |
| PostgreSQL | tokio-postgres 0.7 | Async database queries |
| CLI parsing | clap 4.x | Subcommands, flags, env var integration |
| Table output | comfy-table 7.x | Aligned text tables for CLI mode |
| Serialization | serde + serde_json | JSON and CSV output formats |
| Shell completions | clap_complete 4.x | bash/zsh/fish/PowerShell completions |

Contributing to pg_trickle

Thank you for your interest in contributing! pg_trickle is an Apache 2.0-licensed open-source project and welcomes contributions of all kinds.

Before You Start

  • Check the open issues and discussions to avoid duplicating work.
  • For non-trivial changes, open an issue first to discuss the approach.
  • Read AGENTS.md — it is the authoritative guide for all coding conventions, error handling rules, module layout, and test requirements.
  • Read docs/ARCHITECTURE.md to understand the system.
  • Read ROADMAP.md to see what work is planned.

Ways to Contribute

| Type | Where to start |
|---|---|
| Bug report | Open an issue |
| Feature request | Open an issue or start a discussion |
| Documentation fix | Open a PR directly — no issue needed for typos/clarity |
| Code fix or feature | Open an issue first, then a PR |
| Performance improvement | Include benchmark numbers (see `just bench`) |

Development Setup

# Install pgrx
cargo install cargo-pgrx --version "=0.17.0"
cargo pgrx init --pg18 /usr/lib/postgresql/18/bin/pg_config

# Build
cargo build

# Format + lint (required before every PR)
just fmt
just lint

# Run tests
just test-unit          # fast, no DB
just test-integration   # Testcontainers
just test-light-e2e     # PR-equivalent Light E2E tier (stock postgres)
just test-e2e           # full E2E (builds Docker image)
just test-pgbouncer     # PgBouncer transaction-pool compatibility tests

Full setup instructions are in INSTALL.md.

Devcontainer / Containerized Development

If you are developing in a devcontainer, use the default non-root vscode user and run the normal commands from the workspace root:

just fmt
just lint
just test-unit

just test-unit uses scripts/run_unit_tests.sh, which now selects a writable and cache-friendly target directory in this order:

  1. target/ (preferred)
  2. .cargo-target/ (project-local fallback)
  3. $HOME/.cache/pg_trickle-target
  4. ${TMPDIR:-/tmp}/pg_trickle-target (last resort)

This avoids permission failures on bind mounts and preserves incremental builds when source or test files change.

If you see permission errors in containerized runs, verify you are not forcing a different container user/UID than expected by your workspace mount.

Run E2E tests in devcontainer

E2E tests use Testcontainers and require Docker access from inside the devcontainer (provided by the Docker-in-Docker feature in .devcontainer/devcontainer.json).

Run from the workspace root inside the devcontainer:

just build-e2e-image
just test-e2e

Notes:

  • The E2E harness starts containers via testcontainers (tests/e2e/mod.rs).
  • The default E2E image is pg_trickle_e2e:latest (built by tests/build_e2e_image.sh).
  • A plain docker run of the dev image is not equivalent to a full VS Code devcontainer session with features/lifecycle hooks enabled.

Making a Pull Request

  1. Fork the repository and create a branch: git checkout -b fix/my-fix
  2. Make your changes following the conventions in AGENTS.md
  3. Run just fmt && just lint — both must pass with zero warnings
  4. Add or update tests — see AGENTS.md § Testing
  5. Open a PR against main

The PR template will walk you through the checklist.

CI Coverage on PRs

PR CI runs a three-tier gate:

  • Unit tests (Linux only)
  • Integration tests
  • Light E2E — curated PR-friendly end-to-end coverage split across three shards and executed against stock postgres:18.3

Full E2E, TPC-H tests, benchmarks, dbt, CNPG smoke, and the extra macOS / Windows unit jobs stay off the PR critical path and run on push-to-main, schedule, or manual dispatch. This keeps typical PR feedback closer to the single-digit-minute range while preserving broader scheduled coverage.

To trigger the full CI matrix on your PR branch (recommended for DVM engine, refresh, or CDC changes):

gh workflow run ci.yml --ref <your-branch>

To run all tests locally before pushing:

just test-all          # unit + integration + e2e

# PR-equivalent fast path:
just test-unit
just test-integration
just test-light-e2e

# TPC-H correctness tests (requires e2e Docker image):
cargo test --test e2e_tpch_tests -- --ignored --test-threads=1 --nocapture

See AGENTS.md § Testing for the full CI coverage matrix.

Coding Conventions (summary)

  • No unwrap() or panic!() in non-test code
  • All unsafe blocks require a // SAFETY: comment
  • Errors go through PgTrickleError in src/error.rs
  • New SQL functions use #[pg_extern(schema = "pgtrickle")]
  • Tests use Testcontainers — never a local PostgreSQL instance

Full details are in AGENTS.md.

Commit Messages

Use Conventional Commits:

fix: correct pgoutput action parsing for tables named INSERT_LOG
feat: add CUBE explosion guard (max 64 UNION ALL branches)
docs: document JOIN key change limitation in SQL_REFERENCE
test: add E2E test for keyless table duplicate-row behaviour

License

By contributing you agree that your contributions will be licensed under the Apache License 2.0.

dbt-pgtrickle

A dbt package that integrates pg_trickle stream tables into your dbt project via a custom stream_table materialization.

No custom Python adapter required — works with the standard dbt-postgres adapter. Just Jinja SQL macros that call pg_trickle's SQL API.

Prerequisites

| Requirement | Minimum Version |
|---|---|
| dbt Core | ≥ 1.9 |
| dbt-postgres adapter | Matching dbt Core version |
| PostgreSQL | 18.x |
| pg_trickle extension | ≥ 0.1.0 (`CREATE EXTENSION pg_trickle;`) |

Installation

Add to your packages.yml:

packages:
  - git: "https://github.com/grove/pg-trickle.git"
    revision: v0.15.0
    subdirectory: "dbt-pgtrickle"

From dbt Hub (once published)

After the package is listed on dbt Hub, you can install by package name:

packages:
  - package: grove/dbt_pgtrickle
    version: [">=0.15.0", "<1.0.0"]

Note: dbt Hub listing requires a separate GitHub repository for the package. See docs/integrations/dbt-hub-submission.md for the submission checklist and steps.

Then run:

dbt deps

Quick Start

Create a model with materialized='stream_table':

-- models/marts/order_totals.sql
{{
  config(
    materialized='stream_table',
    schedule='5m',
    refresh_mode='DIFFERENTIAL'
  )
}}

SELECT
    customer_id,
    SUM(amount) AS total_amount,
    COUNT(*) AS order_count
FROM {{ source('raw', 'orders') }}
GROUP BY customer_id
dbt run --select order_totals   # Creates the stream table
dbt test --select order_totals  # Tests work normally (it's a real table)

Configuration Reference

| Key | Type | Default | Description |
|---|---|---|---|
| `materialized` | string | | Must be `'stream_table'` |
| `schedule` | string/null | `'1m'` | Refresh schedule (e.g., `'5m'`, `'1h'`, cron). `null` for pg_trickle's CALCULATED schedule. |
| `refresh_mode` | string | `'DIFFERENTIAL'` | `'FULL'`, `'DIFFERENTIAL'`, `'AUTO'`, or `'IMMEDIATE'` |
| `initialize` | bool | `true` | Populate on creation |
| `status` | string/null | `null` | `'ACTIVE'` or `'PAUSED'`. When set, applies on subsequent runs via `alter_stream_table()`. |
| `stream_table_name` | string | model name | Override stream table name |
| `stream_table_schema` | string | target schema | Override schema |
| `cdc_mode` | string/null | `null` | CDC mode override: `'auto'`, `'trigger'`, or `'wal'`. `null` uses the GUC default. |
| `partition_by` | string/null | `null` | Column name for RANGE partitioning of the storage table (v0.13.0+). Cannot be changed after creation. |
| `fuse` | string/null | `null` | Fuse circuit-breaker mode: `'off'`, `'on'`, or `'auto'` (v0.13.0+). Applied via `alter_stream_table()` on every run; no-op if unchanged. |
| `fuse_ceiling` | int/null | `null` | Change-count threshold that triggers the fuse (v0.13.0+). `null` uses the global GUC default. |
| `fuse_sensitivity` | int/null | `null` | Number of consecutive over-ceiling observations before the fuse blows (v0.13.0+). `null` means 1 (immediate). |

partition_by — RANGE partitioning

Partition the stream table's storage table by a column value. pg_trickle creates a PARTITION BY RANGE (<col>) storage table with a default catch-all partition. Add your own date/integer range partitions via standard PostgreSQL DDL after dbt run.

-- models/marts/events_by_day.sql
{{ config(
    materialized='stream_table',
    schedule='1m',
    refresh_mode='DIFFERENTIAL',
    partition_by='event_day'
) }}

SELECT
    event_day,
    user_id,
    COUNT(*) AS event_count
FROM {{ source('raw', 'events') }}
GROUP BY event_day, user_id

Note: partition_by is applied only at creation time. Changing it after the stream table exists has no effect. Use dbt run --full-refresh to recreate with a new partition key.
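
Attaching a concrete range partition after `dbt run` uses standard PostgreSQL DDL. For example, a monthly partition for the model above might look like this (the partition and parent table names are illustrative; check the actual storage table name in your schema):

```sql
CREATE TABLE events_by_day_2025_06
    PARTITION OF events_by_day
    FOR VALUES FROM ('2025-06-01') TO ('2025-07-01');
```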

fuse — Circuit breaker

The fuse circuit breaker suspends refreshes when the change volume exceeds a threshold, protecting against runaway refresh cycles during bulk ingestion.

-- models/marts/order_totals.sql
{{ config(
    materialized='stream_table',
    schedule='5m',
    refresh_mode='DIFFERENTIAL',
    fuse='auto',
    fuse_ceiling=50000,
    fuse_sensitivity=3
) }}

SELECT customer_id, SUM(amount) AS total
FROM {{ source('raw', 'orders') }}
GROUP BY customer_id

| `fuse` value | Behaviour |
|---|---|
| `'off'` | Fuse disabled (default) |
| `'on'` | Fuse always active; blows when ceiling is exceeded |
| `'auto'` | Fuse activates only when the delta is large enough to make FULL refresh cheaper than DIFFERENTIAL |

Fuse parameters are applied on every dbt run via alter_stream_table(); the SQL function is only called when the values have genuinely changed from the catalog state.

Project-level defaults

# dbt_project.yml
models:
  my_project:
    marts:
      +materialized: stream_table
      +schedule: '5m'
      +refresh_mode: DIFFERENTIAL

Operations

pgtrickle_refresh — Manual refresh

dbt run-operation pgtrickle_refresh --args '{"model_name": "order_totals"}'

refresh_all_stream_tables — Refresh all in dependency order

Refreshes all dbt-managed stream tables in topological (dependency) order. Upstream tables are refreshed before downstream ones. Designed for CI pipelines: run after dbt run and before dbt test to ensure all data is current.

# Refresh all dbt-managed stream tables
dbt run-operation refresh_all_stream_tables

# Refresh only stream tables in a specific schema
dbt run-operation refresh_all_stream_tables --args '{"schema": "analytics"}'
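
In a CI pipeline, the ordering described above might be wired up like this (hypothetical GitHub Actions steps; adapt to your runner):

```yaml
- name: Build models
  run: dbt run
- name: Bring stream tables up to date
  run: dbt run-operation refresh_all_stream_tables
- name: Test against fresh data
  run: dbt test
```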

drop_all_stream_tables — Drop dbt-managed stream tables

Drops only stream tables defined as dbt models (safe in shared environments):

dbt run-operation drop_all_stream_tables

drop_all_stream_tables_force — Drop ALL stream tables

Drops everything from the pg_trickle catalog, including non-dbt stream tables:

dbt run-operation drop_all_stream_tables_force

pgtrickle_check_cdc_health — CDC pipeline health

dbt run-operation pgtrickle_check_cdc_health

Raises an error (non-zero exit) if any CDC source is unhealthy.

Freshness Monitoring

Native dbt source freshness is not supported (the last_refresh_at column lives in the catalog, not on the stream table). Use the pgtrickle_check_freshness run-operation instead:

# Check all active stream tables (defaults: warn=600s, error=1800s)
dbt run-operation pgtrickle_check_freshness

# Custom thresholds
dbt run-operation pgtrickle_check_freshness \
  --args '{model_name: order_totals, warn_seconds: 300, error_seconds: 900}'

Exits non-zero when any stream table exceeds the error threshold — safe for CI.

Useful dbt Commands

# List all stream table models
dbt ls --select config.materialized:stream_table

# Full refresh (drop + recreate)
dbt run --select order_totals --full-refresh

# Build models + tests in DAG order
dbt build --select order_totals

Note: dbt build runs stream table models early in the DAG. If downstream models depend on a stream table with initialize: false, the table may not be populated yet.

Testing

Stream tables are standard PostgreSQL heap tables — all dbt tests work normally:

models:
  - name: order_totals
    columns:
      - name: customer_id
        tests:
          - not_null
          - unique

Stream Table Health Test

Use the built-in stream_table_healthy generic test to fail your dbt test suite when a stream table is stale, erroring, or paused:

models:
  - name: order_totals
    tests:
      - dbt_pgtrickle.stream_table_healthy:
          warn_seconds: 300  # fail if stale for more than 5 minutes

The test queries pgtrickle.pg_stat_stream_tables and returns rows for any unhealthy condition. An empty result set means the stream table is healthy.
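
Conceptually, the generated test boils down to a query of this shape (the column names here are assumptions for illustration; consult the view's actual definition):

```sql
SELECT name, status
FROM pgtrickle.pg_stat_stream_tables
WHERE status IN ('PAUSED', 'SUSPENDED')
   OR consecutive_errors > 0
   OR staleness_seconds > 300;  -- warn_seconds from the test config
```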

Stream Table Status Macro

For more programmatic control, use the pgtrickle_stream_table_status() macro directly in custom tests or run-operations:

{%- set st = dbt_pgtrickle.pgtrickle_stream_table_status('order_totals', warn_seconds=300) -%}
{# st.status is one of: 'healthy', 'stale', 'erroring', 'paused', 'not_found' #}
{# st.staleness_seconds, st.consecutive_errors, st.total_refreshes, etc. #}

__pgt_row_id Column

pg_trickle adds an internal __pgt_row_id column to stream tables for row identity tracking. This column:

  • Appears in SELECT * and dbt docs generate
  • Does not affect dbt test unless you check column counts
  • Can be documented to reduce confusion:
columns:
  - name: __pgt_row_id
    description: "Internal pg_trickle row identity hash. Ignore this column."

Limitations

| Limitation | Workaround |
|---|---|
| No in-place query alteration | Materialization auto-drops and recreates when the query changes |
| `__pgt_row_id` visible | Document it; exclude it in downstream SELECTs |
| No native dbt source freshness | Use the `pgtrickle_check_freshness` run-operation |
| No dbt snapshot support | Snapshot the stream table as a regular table |
| Query change detection is whitespace-sensitive | dbt compiles deterministically; unnecessary recreations are safe |
| PostgreSQL 18 required | Extension requirement |
| Version tags shared with the pg_trickle extension | Pin to a specific git revision |

Contributing

See AGENTS.md for development guidelines and the implementation plan for design rationale.

Running tests locally

The quickest way (requires Docker and dbt installed):

# Full run — builds Docker image, starts container, runs tests, cleans up
just test-dbt

# Fast run — reuses existing Docker image (run after first build)
just test-dbt-fast

Or use the script directly with options:

cd dbt-pgtrickle/integration_tests/scripts

# Default: builds image, runs tests with dbt 1.9, cleans up
./run_dbt_tests.sh

# Skip image rebuild (faster iteration)
./run_dbt_tests.sh --skip-build

# Keep the container running after tests (for debugging)
./run_dbt_tests.sh --skip-build --keep-container

# Use a custom port (avoids conflicts with local PostgreSQL)
PGPORT=25432 ./run_dbt_tests.sh

Manual testing against an existing pg_trickle instance

If you already have PostgreSQL 18 + pg_trickle running locally:

export PGHOST=localhost PGPORT=5432 PGUSER=postgres PGPASSWORD=postgres PGDATABASE=postgres
cd dbt-pgtrickle/integration_tests
dbt deps
dbt seed
dbt run
./scripts/wait_for_populated.sh order_totals 30
dbt test
dbt run-operation drop_all_stream_tables

License

Apache 2.0 — see LICENSE.

CloudNativePG / Kubernetes

pg_trickle is designed to work with CloudNativePG (CNPG) — the Kubernetes operator for PostgreSQL. The extension is loaded via Image Volume Extensions, meaning no custom PostgreSQL image is needed.

Prerequisites

  • Kubernetes 1.33+ with the ImageVolume feature gate enabled
  • CloudNativePG operator 1.28+
  • The pg_trickle-ext OCI image available in your cluster registry

Architecture

┌─────────────────────────────────────┐
│  CNPG Cluster (3 pods)              │
│                                     │
│  ┌──────────┐  ┌──────────────────┐ │
│  │ Primary  │  │ pg_trickle-ext   │ │
│  │ PG 18    │◄─┤ (ImageVolume)    │ │
│  │          │  │ .so + .sql only  │ │
│  └──────────┘  └──────────────────┘ │
│  ┌──────────┐  ┌──────────┐        │
│  │ Replica 1│  │ Replica 2│        │
│  │ (standby)│  │ (standby)│        │
│  └──────────┘  └──────────┘        │
└─────────────────────────────────────┘
  • The scheduler runs on the primary pod only. Replica pods detect recovery mode (pg_is_in_recovery() = true) and sleep.
  • Stream tables are replicated to standbys via physical streaming replication like any other heap table.
  • Pod restarts are safe — the scheduler resumes from the stored frontier with no data loss.

Deploying pg_trickle on CNPG

1. Build the extension image

The cnpg/Dockerfile.ext builds a scratch-based OCI image containing only the shared library, control file, and SQL migrations:

# From the dist/ directory with pre-built artifacts:
docker build -t ghcr.io/<owner>/pg_trickle-ext:0.13.0 -f cnpg/Dockerfile.ext dist/
docker push ghcr.io/<owner>/pg_trickle-ext:0.13.0

2. Deploy the Cluster

Apply the Cluster manifest with pg_trickle configured as an Image Volume extension:

# cnpg/cluster-example.yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-trickle-demo
spec:
  instances: 3
  imageName: ghcr.io/cloudnative-pg/postgresql:18

  postgresql:
    shared_preload_libraries:
      - pg_trickle
    extensions:
      - name: pg-trickle
        image:
          reference: ghcr.io/<owner>/pg_trickle-ext:0.13.0
    parameters:
      max_worker_processes: "8"

  bootstrap:
    initdb:
      database: app
      owner: app

  storage:
    size: 10Gi
    storageClass: standard
kubectl apply -f cnpg/cluster-example.yaml

3. Enable the extension

Use the CNPG Database resource for declarative extension management:

# cnpg/database-example.yaml
apiVersion: postgresql.cnpg.io/v1
kind: Database
metadata:
  name: app
spec:
  cluster:
    name: pg-trickle-demo
  name: app
  owner: app
  extensions:
    - name: pg_trickle
kubectl apply -f cnpg/database-example.yaml

4. Verify

kubectl exec -it pg-trickle-demo-1 -- psql -U postgres -d app -c \
  "SELECT pgtrickle.version();"

Key Considerations

Worker processes

Each database with pg_trickle needs one background worker slot. Set max_worker_processes in the Cluster manifest to accommodate the launcher (1) + one scheduler per database + any parallel refresh workers:

parameters:
  max_worker_processes: "16"
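
As a worked example (the counts below are illustrative, not defaults): three pg_trickle-enabled databases with four parallel refresh workers, plus headroom for PostgreSQL's own background workers, adds up as follows:

```shell
launcher=1      # pg_trickle launcher worker
schedulers=3    # one scheduler per pg_trickle-enabled database
parallel=4      # parallel refresh workers
headroom=8      # autovacuum, logical replication, other extensions
echo $(( launcher + schedulers + parallel + headroom ))   # 16
```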

Persistent volumes

Catalog tables (pgtrickle.pgt_stream_tables) and change buffers (pgtrickle_changes.*) are stored in regular PostgreSQL tablespaces. Persistent volume claims preserve them across pod rescheduling.

Backups

pg_trickle state (catalog, change buffers, stream table data) is included in CNPG's Barman object-store backups automatically. After a restore, the scheduler detects frontier inconsistencies and performs a full refresh on the first cycle. See Backup and Restore for details.

Failover

When the primary pod fails and a replica is promoted, the new primary's scheduler starts automatically. Since stream tables were replicated via streaming replication, they are already up-to-date (minus replication lag). The scheduler resumes refreshing from the stored frontier.

Resource limits

For production deployments, set resource requests and limits in the Cluster manifest to prevent the scheduler from starving other workloads:

resources:
  requests:
    memory: 512Mi
    cpu: 500m
  limits:
    memory: 2Gi
    cpu: 2000m

Example manifests

The repository includes ready-to-use manifests in the cnpg/ directory:

| File | Purpose |
|---|---|
| `cnpg/Dockerfile.ext` | Build the scratch-based extension image |
| `cnpg/Dockerfile.ext-build` | Multi-stage build for CI/CD pipelines |
| `cnpg/cluster-example.yaml` | Complete Cluster manifest with pg_trickle |
| `cnpg/database-example.yaml` | Database resource with declarative extension management |

Further reading

Prometheus & Grafana Monitoring

pg_trickle ships with a complete observability stack based on postgres_exporter, Prometheus, and Grafana. The monitoring/ directory in the repository contains everything you need.

Quick Start

cd monitoring/
docker compose up -d

Open Grafana at http://localhost:3000 (default: admin / admin). The pg_trickle Overview dashboard is pre-provisioned.

Architecture

PostgreSQL + pg_trickle
        │
        │  custom SQL queries
        ▼
postgres_exporter (:9187)
        │
        │  /metrics (Prometheus format)
        ▼
   Prometheus (:9090)
        │
        │  data source
        ▼
    Grafana (:3000)

postgres_exporter runs custom SQL queries defined in prometheus/pg_trickle_queries.yml against the pg_trickle monitoring views (pgtrickle.stream_tables_info, pgtrickle.pg_stat_stream_tables, etc.) and exposes them as Prometheus metrics.

Connecting to an Existing Database

If you already have PostgreSQL + pg_trickle running, configure the exporter to point at your instance:

export PG_HOST=your-pg-host
export PG_PORT=5432
export PG_USER=postgres
export PG_PASSWORD=yourpassword
export PG_DATABASE=yourdb
docker compose up -d

Or edit the DATA_SOURCE_NAME in docker-compose.yml directly.

Metrics Exposed

All metrics are prefixed pg_trickle_.

| Metric | Type | Description |
|---|---|---|
| `pg_trickle_stream_tables_total` | gauge | Total stream tables by status |
| `pg_trickle_stale_tables_total` | gauge | Tables with data older than schedule |
| `pg_trickle_consecutive_errors` | gauge | Per-table consecutive error count |
| `pg_trickle_refresh_duration_ms` | gauge | Average refresh duration (ms) |
| `pg_trickle_total_refreshes` | counter | Total refresh count per table |
| `pg_trickle_failed_refreshes` | counter | Failed refresh count per table |
| `pg_trickle_rows_inserted_total` | counter | Rows inserted per table |
| `pg_trickle_rows_deleted_total` | counter | Rows deleted per table |
| `pg_trickle_staleness_seconds` | gauge | Seconds since last successful refresh |
| `pg_trickle_cdc_pending_rows` | gauge | Pending rows in CDC change buffer |
| `pg_trickle_cdc_buffer_bytes` | gauge | CDC change buffer size in bytes |
| `pg_trickle_scheduler_running` | gauge | 1 if scheduler background worker is alive |
| `pg_trickle_health_status` | gauge | Overall health: 0=OK, 1=WARNING, 2=CRITICAL |

Pre-configured Alerts

Alerting rules are defined in prometheus/alerts.yml:

| Alert | Condition | Severity |
|---|---|---|
| PgTrickleTableStale | Staleness > 5 min past schedule | warning |
| PgTrickleConsecutiveErrors | ≥ 3 consecutive refresh failures | warning |
| PgTrickleTableSuspended | Any table in SUSPENDED status | critical |
| PgTrickleCdcBufferLarge | CDC buffer > 1 GB | warning |
| PgTrickleSchedulerDown | Scheduler not running for > 2 min | critical |
| PgTrickleHighRefreshDuration | Avg refresh > 30 s | warning |

NOTIFY-Based Alerting

In addition to Prometheus alerts, pg_trickle emits real-time PostgreSQL NOTIFY events on the pg_trickle_alert channel:

LISTEN pg_trickle_alert;

Events include stale_data, auto_suspended, reinitialize_needed, buffer_growth_warning, fuse_blown, refresh_completed, and refresh_failed. Each notification carries a JSON payload with the stream table name and relevant details.

You can bridge NOTIFY events to external alerting systems (PagerDuty, Slack, etc.) using tools like pgnotify or a simple LISTEN loop in your application.
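As an example of such a bridge, here is a minimal sketch of a LISTEN loop in Python using psycopg2. The payload field names ("event", "stream_table") and the severity mapping are assumptions, not part of pg_trickle's documented contract; check the JSON your version actually emits before wiring this to PagerDuty or Slack.

```python
import json
import select

# Event names come from the list above; the severity mapping and the
# payload field names are assumptions for this sketch.
SEVERITY = {
    "refresh_failed": "critical",
    "fuse_blown": "critical",
    "auto_suspended": "critical",
    "stale_data": "warning",
    "reinitialize_needed": "warning",
    "buffer_growth_warning": "warning",
    "refresh_completed": "info",
}

def parse_alert(payload: str) -> dict:
    """Turn a pg_trickle_alert JSON payload into a routable alert dict."""
    data = json.loads(payload)
    event = data.get("event", "unknown")
    return {
        "event": event,
        "severity": SEVERITY.get(event, "info"),
        "table": data.get("stream_table"),
        "detail": data,
    }

def bridge_alerts(dsn: str) -> None:
    """LISTEN loop: print non-info alerts (swap in Slack/PagerDuty here)."""
    import psycopg2  # requires psycopg2 or psycopg2-binary

    conn = psycopg2.connect(dsn)
    conn.autocommit = True  # LISTEN takes effect immediately
    with conn.cursor() as cur:
        cur.execute("LISTEN pg_trickle_alert;")
    while True:
        if select.select([conn], [], [], 5) == ([], [], []):
            continue  # poll timeout, loop again
        conn.poll()
        while conn.notifies:
            notify = conn.notifies.pop(0)
            alert = parse_alert(notify.payload)
            if alert["severity"] != "info":
                print(f"[{alert['severity']}] {alert['event']}: {alert['table']}")
```

Call `bridge_alerts("dbname=yourdb")` from a long-running process; the routing logic in `parse_alert()` is the part worth adapting.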

Grafana Dashboard

The pre-provisioned pg_trickle Overview dashboard (grafana/dashboards/pg_trickle_overview.json) includes panels for:

  • Stream table status distribution (active / suspended / error)
  • Refresh rate and duration over time
  • Staleness heatmap
  • CDC buffer sizes
  • Consecutive error counts
  • Scheduler uptime

Built-in SQL Monitoring Views

pg_trickle also provides built-in monitoring accessible without Prometheus:

-- Quick health overview (returns warnings and errors)
SELECT * FROM pgtrickle.health_check() WHERE severity != 'OK';

-- Stream table status and staleness
SELECT name, status, refresh_mode, staleness
FROM pgtrickle.stream_tables_info;

-- Detailed refresh statistics
SELECT * FROM pgtrickle.pg_stat_stream_tables;

-- CDC health per source table
SELECT * FROM pgtrickle.check_cdc_health();

-- Change buffer sizes
SELECT * FROM pgtrickle.change_buffer_sizes()
ORDER BY pending_rows DESC;

See the SQL Reference for the complete list of monitoring functions.
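For cron- or shell-driven checks, the health_check() call above can be wrapped in a small script that maps severities to process exit codes (Nagios plugin convention). The severity and message column names are assumed from the snippet above; verify them against your pg_trickle version.

```python
# Exit codes follow the Nagios plugin convention: 0=OK, 1=WARNING,
# 2=CRITICAL, 3=UNKNOWN.
EXIT_CODES = {"OK": 0, "WARNING": 1, "CRITICAL": 2}

def worst_exit_code(severities) -> int:
    """Reduce the severity labels from health_check() to one exit code."""
    if not severities:
        return 0  # no rows: everything reported OK
    return max(EXIT_CODES.get(s, 3) for s in severities)

def main(dsn: str) -> int:
    """Run pgtrickle.health_check() and return a monitoring exit code.

    Assumes health_check() exposes severity and message columns.
    """
    import psycopg2  # requires psycopg2 or psycopg2-binary

    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute("SELECT severity, message FROM pgtrickle.health_check()")
        rows = cur.fetchall()
    for severity, message in rows:
        if severity != "OK":
            print(f"{severity}: {message}")
    return worst_exit_code([r[0] for r in rows])
```

A wrapper would call `sys.exit(main("dbname=yourdb"))` so the exit code is visible to the scheduler or monitoring agent.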

Files Reference

| File | Purpose |
|------|---------|
| monitoring/docker-compose.yml | Demo stack: PG + exporter + Prometheus + Grafana |
| monitoring/prometheus/prometheus.yml | Prometheus scrape configuration |
| monitoring/prometheus/pg_trickle_queries.yml | Custom SQL queries for postgres_exporter |
| monitoring/prometheus/alerts.yml | Alerting rules |
| monitoring/grafana/provisioning/ | Auto-provisioned data source + dashboard |
| monitoring/grafana/dashboards/pg_trickle_overview.json | Overview dashboard |

Requirements

  • Docker 24+ with Compose v2
  • pg_trickle 0.10.0+ installed in the target database
  • PostgreSQL user with SELECT on the pgtrickle.* schema

PgBouncer & Connection Poolers

pg_trickle's background scheduler uses session-level PostgreSQL features. This page explains how to configure pg_trickle alongside connection poolers like PgBouncer, Supavisor (Supabase), and PgCat.

Compatibility Matrix

| Pooling Mode | Compatible? | Notes |
|--------------|-------------|-------|
| Session mode (pool_mode = session) | ✅ Fully | All features work. |
| Direct connection (no pooler for scheduler) | ✅ Fully | Application queries can still go through a pooler. |
| Transaction mode (pool_mode = transaction) | ❌ Not supported | Advisory locks, prepared statements, and LISTEN/NOTIFY are session-scoped. |
| Statement mode (pool_mode = statement) | ❌ Not supported | Same session-scoped limitations. |

Why Transaction Mode Breaks

The pg_trickle scheduler relies on three session-level features:

| Feature | Problem in Transaction Mode |
|---------|------------------------------|
| pg_advisory_lock() | Session lock released when connection returns to pool — concurrent refreshes become possible |
| PREPARE / EXECUTE | Prepared statements vanish on connection hop — "prepared statement does not exist" errors |
| LISTEN / NOTIFY | Listener loses notifications when assigned a different backend connection |
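The prepared-statement failure can be made concrete with a toy model of transaction-mode pooling. Nothing here is PgBouncer's actual implementation; it only illustrates why session-scoped state breaks when consecutive transactions land on different backend connections.

```python
class Backend:
    """A toy server connection holding session-scoped state."""
    def __init__(self, name: str):
        self.name = name
        self.prepared: dict[str, str] = {}

class TransactionPooler:
    """Toy pooler: each transaction may land on a different backend."""
    def __init__(self, backends):
        self.backends = backends
        self.turn = 0

    def next_backend(self) -> Backend:
        b = self.backends[self.turn % len(self.backends)]
        self.turn += 1
        return b

def run(pooler: TransactionPooler) -> None:
    # Txn 1: PREPARE lands on backend A and stores session state there
    b1 = pooler.next_backend()
    b1.prepared["get_totals"] = "SELECT * FROM order_totals WHERE region = $1"

    # Txn 2: EXECUTE lands on backend B, which never saw the PREPARE
    b2 = pooler.next_backend()
    if "get_totals" not in b2.prepared:
        raise LookupError('prepared statement "get_totals" does not exist')

pooler = TransactionPooler([Backend("A"), Backend("B")])
try:
    run(pooler)
except LookupError as e:
    print(e)  # prints: prepared statement "get_totals" does not exist
```

With a single pinned backend (session mode), both statements hit the same connection and the EXECUTE succeeds; the same reasoning applies to advisory locks and LISTEN registrations.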

Route the pg_trickle background worker through a direct connection while keeping application traffic on the pooler:

┌─────────────────┐     ┌──────────────┐
│  Application    │────▶│  PgBouncer   │──┐
│  (transaction   │     │  (txn mode)  │  │
│   mode OK)      │     └──────────────┘  │
└─────────────────┘                       │
                                          ▼
┌─────────────────┐                ┌─────────────┐
│  pg_trickle     │───────────────▶│ PostgreSQL  │
│  scheduler      │  direct conn   │             │
│  (session mode) │                └─────────────┘
└─────────────────┘

The scheduler connects directly to PostgreSQL as a background worker — it does not go through the pooler at all. No special configuration is needed for this; the scheduler always uses an internal SPI connection.

The pooler only matters for application queries that read from stream tables or call pg_trickle functions (e.g., refresh_stream_table()).

Platform-Specific Notes

Supabase

Supabase uses Supavisor in transaction mode by default. pg_trickle's scheduler works because it runs as a background worker (bypasses the pooler). Application queries against stream tables work normally through the pooler since they are regular SELECT statements.

If you call pgtrickle.refresh_stream_table() from application code, use the direct connection string (port 5432) rather than the pooled connection (port 6543).

Neon

Neon uses a custom proxy that supports both session and transaction modes. Use the session-mode connection string for any pg_trickle management calls. The scheduler runs as a background worker and is unaffected by the proxy.

AWS RDS Proxy

RDS Proxy only supports transaction-mode pooling. The pg_trickle scheduler runs as a background worker inside the RDS instance and is unaffected. Application queries reading stream tables work normally through the proxy.

Manual refresh_stream_table() calls through the proxy may fail due to advisory lock issues. Use a direct connection for management operations.

Pooler Compatibility Mode

pg_trickle includes a pooler_compatibility_mode setting (v0.10.0+) that adjusts internal behavior for environments where the scheduler's SPI connection may be affected by pooler-like middleware:

-- Usually not needed — the scheduler bypasses external poolers
SHOW pg_trickle.pooler_compatibility_mode;

This GUC is primarily for edge cases in managed PostgreSQL services. For standard deployments, the default setting works correctly.

Further Reading

Flyway & Liquibase Migration Frameworks

pg_trickle stream tables are managed through SQL function calls, not standard DDL (CREATE TABLE / ALTER TABLE). This page documents patterns for integrating pg_trickle with Flyway and Liquibase migration frameworks.

Key Principle

Stream tables are created and managed via pgtrickle.create_stream_table(), pgtrickle.alter_stream_table(), and pgtrickle.drop_stream_table(). These are regular SQL function calls that can be embedded in any migration script.

CDC triggers are automatically installed on source tables during stream table creation — no manual trigger management is needed.


Flyway

Creating Stream Tables in Migrations

Place stream table definitions in versioned migration files alongside your regular schema changes:

-- V3__create_order_stream_tables.sql

-- 1. Create the source tables first (standard DDL)
CREATE TABLE IF NOT EXISTS orders (
    id         SERIAL PRIMARY KEY,
    region     TEXT NOT NULL,
    amount     NUMERIC(10,2) NOT NULL,
    created_at TIMESTAMPTZ DEFAULT now()
);

-- 2. Create stream tables via pg_trickle API
SELECT pgtrickle.create_stream_table(
    'order_totals',
    $$SELECT region, COUNT(*) AS order_count, SUM(amount) AS total
      FROM orders GROUP BY region$$,
    schedule     => '5s',
    refresh_mode => 'DIFFERENTIAL'
);

Altering Stream Tables

Use pgtrickle.alter_stream_table() in a new migration:

-- V5__update_order_totals_schedule.sql
SELECT pgtrickle.alter_stream_table(
    'order_totals',
    schedule => '10s'
);

Altering the Defining Query

Use alter_query to change the SQL without dropping and recreating:

-- V7__add_avg_to_order_totals.sql
SELECT pgtrickle.alter_stream_table(
    'order_totals',
    alter_query => $$SELECT region,
                            COUNT(*) AS order_count,
                            SUM(amount) AS total,
                            AVG(amount) AS avg_amount
                     FROM orders GROUP BY region$$
);

Dropping Stream Tables

-- V9__remove_legacy_stream_tables.sql
SELECT pgtrickle.drop_stream_table('legacy_report');

Bulk Creation

For environments with many stream tables, use bulk_create to create them atomically:

-- V4__create_all_stream_tables.sql
SELECT pgtrickle.bulk_create('[
    {
        "name": "order_totals",
        "query": "SELECT region, COUNT(*) AS cnt, SUM(amount) AS total FROM orders GROUP BY region",
        "schedule": "5s",
        "refresh_mode": "DIFFERENTIAL"
    },
    {
        "name": "daily_revenue",
        "query": "SELECT date_trunc(''day'', created_at) AS day, SUM(amount) AS revenue FROM orders GROUP BY 1",
        "schedule": "30s",
        "refresh_mode": "DIFFERENTIAL"
    }
]'::jsonb);

Ordering: Source Tables Before Stream Tables

Flyway executes migrations in version order. Ensure source tables are created in an earlier migration than their dependent stream tables:

V1__create_schema.sql           -- CREATE TABLE orders, products, ...
V2__create_indexes.sql          -- CREATE INDEX ...
V3__create_stream_tables.sql    -- SELECT pgtrickle.create_stream_table(...)

Repeatable Migrations

If you want stream table definitions to be re-applied by Flyway (useful in development environments), use repeatable migrations — Flyway re-runs an R__ migration whenever its checksum changes:

-- R__stream_tables.sql
-- Drop and recreate all stream tables
SELECT pgtrickle.drop_stream_table('order_totals') 
WHERE EXISTS (
    SELECT 1 FROM pgtrickle.pgt_stream_tables 
    WHERE pgt_name = 'order_totals'
);

SELECT pgtrickle.create_stream_table(
    'order_totals',
    $$SELECT region, COUNT(*) AS cnt FROM orders GROUP BY region$$,
    schedule => '5s',
    refresh_mode => 'DIFFERENTIAL'
);

Or use create_or_replace_stream_table for idempotent definitions:

-- R__stream_tables.sql (idempotent)
SELECT pgtrickle.create_or_replace_stream_table(
    'order_totals',
    $$SELECT region, COUNT(*) AS cnt FROM orders GROUP BY region$$,
    schedule => '5s',
    refresh_mode => 'DIFFERENTIAL'
);

Handling ALTER TABLE on Source Tables

When a Flyway migration alters a source table (e.g., adding a column), pg_trickle's DDL event trigger detects the change and suspends affected stream tables. After the schema change, stream tables resume automatically on the next refresh cycle.

If the source table change invalidates the stream table's defining query (e.g., removing a referenced column), you must update or drop the stream table in the same or a subsequent migration.


Liquibase

Creating Stream Tables in Changesets

Use Liquibase's <sql> tag to call pg_trickle functions:

<!-- changelog-3.0.xml -->
<changeSet id="create-order-stream-tables" author="dev">
    <sql>
        SELECT pgtrickle.create_stream_table(
            'order_totals',
            $pgt$SELECT region, COUNT(*) AS order_count, SUM(amount) AS total
                  FROM orders GROUP BY region$pgt$,
            schedule     => '5s',
            refresh_mode => 'DIFFERENTIAL'
        );
    </sql>
    <rollback>
        <sql>SELECT pgtrickle.drop_stream_table('order_totals');</sql>
    </rollback>
</changeSet>

Rollback Support

Always include <rollback> blocks that drop the stream table:

<changeSet id="add-daily-revenue-st" author="dev">
    <sql>
        SELECT pgtrickle.create_stream_table(
            'daily_revenue',
            $pgt$SELECT date_trunc('day', created_at) AS day,
                        SUM(amount) AS revenue
                 FROM orders GROUP BY 1$pgt$,
            schedule => '30s',
            refresh_mode => 'DIFFERENTIAL'
        );
    </sql>
    <rollback>
        <sql>SELECT pgtrickle.drop_stream_table('daily_revenue');</sql>
    </rollback>
</changeSet>

Altering Stream Tables

<changeSet id="update-order-totals-schedule" author="dev">
    <sql>
        SELECT pgtrickle.alter_stream_table(
            'order_totals',
            schedule => '10s'
        );
    </sql>
    <rollback>
        <sql>
            SELECT pgtrickle.alter_stream_table(
                'order_totals',
                schedule => '5s'
            );
        </sql>
    </rollback>
</changeSet>

Preconditions

Use Liquibase preconditions to check whether pg_trickle is available:

<changeSet id="create-stream-tables" author="dev">
    <preConditions onFail="MARK_RAN">
        <sqlCheck expectedResult="1">
            SELECT COUNT(*) FROM pg_extension WHERE extname = 'pg_trickle'
        </sqlCheck>
    </preConditions>
    <sql>
        SELECT pgtrickle.create_stream_table(...);
    </sql>
</changeSet>

Common Patterns

Environment-Specific Schedules

Use different schedules for development vs. production:

-- Use a CASE expression to parameterize the schedule
SELECT pgtrickle.create_stream_table(
    'order_totals',
    $$SELECT region, COUNT(*) AS cnt FROM orders GROUP BY region$$,
    schedule => CASE 
        WHEN current_setting('pg_trickle.enabled', true) = 'on' 
        THEN '5s' 
        ELSE '1m' 
    END,
    refresh_mode => 'DIFFERENTIAL'
);

CI/Test Environments

In CI, set pg_trickle.enabled = off in postgresql.conf to prevent the background scheduler from running during schema migrations. Stream tables will still be created correctly — they just won't auto-refresh until the scheduler is enabled.

Extension Dependency

Ensure CREATE EXTENSION pg_trickle runs before any stream table migration. In Flyway, use an early versioned migration:

-- V0__extensions.sql
CREATE EXTENSION IF NOT EXISTS pg_trickle;

In Liquibase:

<changeSet id="install-extensions" author="dev" runOnChange="true">
    <sql>CREATE EXTENSION IF NOT EXISTS pg_trickle;</sql>
</changeSet>

Further Reading

ORM Integration Guides

pg_trickle stream tables are read-only materialized views that refresh automatically. This page documents how to use stream tables from popular Python ORMs — SQLAlchemy and Django ORM.

Key Principles

  1. Stream tables are read-only. All writes go to the source tables; pg_trickle refreshes stream tables in the background.
  2. Model stream tables as views, not regular tables. ORMs should never attempt INSERT, UPDATE, or DELETE on a stream table.
  3. Internal columns are hidden. The __pgt_row_id column used for incremental maintenance is excluded from SELECT * queries.

SQLAlchemy

Read-Only Model Definition

Map a stream table as a read-only model using __table_args__:

from sqlalchemy import Column, Numeric, String, BigInteger
from sqlalchemy.orm import DeclarativeBase

class Base(DeclarativeBase):
    pass

class OrderTotals(Base):
    """Read-only model backed by a pg_trickle stream table."""
    __tablename__ = "order_totals"

    # Map the stream table's internal row ID as the primary key for ORM
    # identity. Use a single-underscore attribute name: a leading double
    # underscore would be name-mangled by Python.
    pgt_row_id = Column("__pgt_row_id", BigInteger, primary_key=True)

    region = Column(String, nullable=False)
    order_count = Column(BigInteger, nullable=False)
    total = Column(Numeric(10, 2), nullable=False)

    __table_args__ = {
        "info": {"readonly": True},  # Convention marker
    }

Querying

Query stream tables like any other SQLAlchemy model:

from sqlalchemy import select

# All regions
stmt = select(OrderTotals).order_by(OrderTotals.total.desc())
results = session.execute(stmt).scalars().all()

# Filtered
stmt = (
    select(OrderTotals)
    .where(OrderTotals.order_count > 10)
    .where(OrderTotals.region == "East")
)
row = session.execute(stmt).scalar_one_or_none()

Preventing Accidental Writes

Use SQLAlchemy events to block write operations:

from sqlalchemy import event
from sqlalchemy.orm import Session

READONLY_TABLES = {"order_totals", "daily_revenue", "customer_stats"}

# Listening on the Session class applies the guard to every session
@event.listens_for(Session, "before_flush")
def block_stream_table_writes(session, flush_context, instances):
    for obj in session.new | session.dirty | session.deleted:
        table_name = obj.__class__.__tablename__
        if table_name in READONLY_TABLES:
            raise RuntimeError(
                f"Cannot write to stream table '{table_name}'. "
                f"Write to the source table instead."
            )

Reflecting Stream Tables

If you prefer reflection over explicit models:

from sqlalchemy import MetaData, Table, create_engine

engine = create_engine("postgresql://...")
metadata = MetaData()

# Reflect the stream table (treated as a regular table by PostgreSQL)
order_totals = Table("order_totals", metadata, autoload_with=engine)

# Query
with engine.connect() as conn:
    result = conn.execute(order_totals.select().limit(10))
    for row in result:
        print(row)

Checking Freshness

Query the stream table's metadata to check when it was last refreshed:

from sqlalchemy import text

def get_staleness(session, st_name: str) -> dict:
    """Return freshness info for a stream table."""
    result = session.execute(
        text("SELECT * FROM pgtrickle.get_staleness(:name)"),
        {"name": st_name},
    ).mappings().one()
    return dict(result)

# Usage
staleness = get_staleness(session, "order_totals")
print(f"Last refresh: {staleness['data_timestamp']}")
print(f"Stale for: {staleness['staleness_seconds']}s")

Async SQLAlchemy (2.0+)

Works identically with async_session:

from sqlalchemy.ext.asyncio import AsyncSession

async def get_top_regions(session: AsyncSession, limit: int = 10):
    stmt = (
        select(OrderTotals)
        .order_by(OrderTotals.total.desc())
        .limit(limit)
    )
    result = await session.execute(stmt)
    return result.scalars().all()

Django ORM

Read-Only Model Definition

Use managed = False so Django never creates, alters, or drops the table:

# models.py
from django.db import models

class OrderTotals(models.Model):
    """Read-only model backed by pg_trickle stream table."""
    
    region = models.CharField(max_length=255)
    order_count = models.BigIntegerField()
    total = models.DecimalField(max_digits=10, decimal_places=2)
    
    class Meta:
        managed = False        # Django will not create/alter this table
        db_table = "order_totals"
    
    def save(self, *args, **kwargs):
        raise NotImplementedError("Stream tables are read-only")
    
    def delete(self, *args, **kwargs):
        raise NotImplementedError("Stream tables are read-only")

Querying

Standard Django QuerySet operations work:

# All regions sorted by total
OrderTotals.objects.all().order_by("-total")

# Filtered
OrderTotals.objects.filter(
    order_count__gt=10,
    region="East"
).first()

# Aggregation (on the stream table itself)
from django.db.models import Sum, Avg
OrderTotals.objects.aggregate(
    total_revenue=Sum("total"),
    avg_orders=Avg("order_count"),
)

Django Migrations

Since managed = False, Django migrations won't touch stream tables. Create stream tables in a custom migration using RunSQL:

# migrations/0003_create_stream_tables.py
from django.db import migrations

class Migration(migrations.Migration):
    dependencies = [
        ("myapp", "0002_create_orders_table"),
    ]

    operations = [
        migrations.RunSQL(
            sql="""
                SELECT pgtrickle.create_stream_table(
                    'order_totals',
                    $pgt$SELECT region,
                                COUNT(*) AS order_count,
                                SUM(amount) AS total
                         FROM orders GROUP BY region$pgt$,
                    schedule     => '5s',
                    refresh_mode => 'DIFFERENTIAL'
                );
            """,
            reverse_sql="""
                SELECT pgtrickle.drop_stream_table('order_totals');
            """,
        ),
    ]

Read-Only Mixin

Create a reusable mixin for all stream table models:

class StreamTableMixin(models.Model):
    """Base class for pg_trickle stream table models."""
    
    class Meta:
        abstract = True
        managed = False
    
    def save(self, *args, **kwargs):
        raise NotImplementedError(
            f"{self.__class__.__name__} is a read-only stream table. "
            f"Write to the source table instead."
        )
    
    def delete(self, *args, **kwargs):
        raise NotImplementedError(
            f"{self.__class__.__name__} is a read-only stream table."
        )

# Usage
class OrderTotals(StreamTableMixin):
    region = models.CharField(max_length=255)
    order_count = models.BigIntegerField()
    total = models.DecimalField(max_digits=10, decimal_places=2)
    
    class Meta(StreamTableMixin.Meta):
        db_table = "order_totals"

class DailyRevenue(StreamTableMixin):
    day = models.DateField()
    revenue = models.DecimalField(max_digits=12, decimal_places=2)
    
    class Meta(StreamTableMixin.Meta):
        db_table = "daily_revenue"

Checking Freshness

Use raw SQL to query pg_trickle diagnostics:

from django.db import connection

def get_staleness(st_name: str) -> dict:
    """Return freshness info for a stream table."""
    with connection.cursor() as cursor:
        cursor.execute(
            "SELECT * FROM pgtrickle.get_staleness(%s)", [st_name]
        )
        columns = [col[0] for col in cursor.description]
        row = cursor.fetchone()
        return dict(zip(columns, row)) if row else {}

Django REST Framework

Stream table models work with DRF serializers and viewsets:

from rest_framework import serializers, viewsets

class OrderTotalsSerializer(serializers.ModelSerializer):
    class Meta:
        model = OrderTotals
        fields = ["region", "order_count", "total"]

class OrderTotalsViewSet(viewsets.ReadOnlyModelViewSet):
    """Read-only API endpoint for order totals stream table."""
    queryset = OrderTotals.objects.all()
    serializer_class = OrderTotalsSerializer

Common Patterns

Write to Source, Read from Stream

The fundamental pattern: all writes go to source tables (normal ORM models), reads come from stream tables (read-only models).

# Write to source table (normal ORM)
order = Order(region="East", amount=Decimal("99.99"))
session.add(order)
session.commit()

# Read from stream table (auto-refreshed by pg_trickle)
totals = session.execute(
    select(OrderTotals).where(OrderTotals.region == "East")
).scalar_one()
print(f"East: {totals.order_count} orders, ${totals.total}")

Handling Eventual Consistency

Stream tables refresh on a schedule (e.g., every 5 seconds). After writing to a source table, the stream table may be briefly stale. Options:

  1. Accept staleness — suitable for dashboards and reports.
  2. Force refresh — call pgtrickle.refresh_stream_table() after critical writes.
  3. Use IMMEDIATE mode — stream table refreshes within the same transaction.

# Option 2: Force refresh after a critical write
session.execute(text(
    "SELECT pgtrickle.refresh_stream_table('order_totals')"
))
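When a caller accepts option 1 but still needs a bound on staleness after a known write, a small polling helper can be sketched. This is hypothetical, not part of pg_trickle; the freshness source is injected as a callable, for example one that queries pgtrickle.get_staleness() and returns the data_timestamp as a Unix timestamp.

```python
import time

def wait_until_fresh(get_data_timestamp, written_at: float,
                     timeout: float = 10.0, poll_interval: float = 0.25) -> bool:
    """Poll until the stream table's freshness watermark covers our write.

    get_data_timestamp: callable returning the stream table's
    data_timestamp as a Unix timestamp (e.g. wrapping the
    pgtrickle.get_staleness() query).
    Returns True once fresh, False if the timeout elapses first.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if get_data_timestamp() >= written_at:
            return True
        time.sleep(poll_interval)
    return False
```

Because the data source is injected, the same helper works with SQLAlchemy, Django, or a raw driver connection.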

Further Reading

Architecture

This document describes the internal architecture of pg_trickle — a PostgreSQL 18 extension that implements stream tables with differential view maintenance. For a high-level description of what pg_trickle does and why, read ESSENCE.md. For release milestones and future plans, see ROADMAP.md.


High-Level Overview

┌─────────────────────────────────────────────────────────────────┐
│                     PostgreSQL 18 Backend                       │
│                                                                 │
│  ┌──────────┐   ┌──────────┐   ┌──────────┐   ┌─────────────┐   │
│  │  Source  │   │  Source  │   │  Storage │   │  Storage    │   │
│  │  Table A │   │  Table B │   │  Table X │   │  Table Y    │   │
│  └────┬─────┘   └────┬─────┘   └────▲─────┘   └────▲────────┘   │
│       │              │              │              │            │
│  ═════╪══════════════╪══════════════╪══════════════╪════════    │
│       │              │              │              │            │
│  ┌────▼──────────────▼────┐   ┌────┴──────────────┴────┐        │
│  │  Hybrid CDC Layer      │   │  Delta Application     │        │
│  │  Triggers ──or── WAL   │   │  (INSERT/DELETE diffs) │        │
│  └────────────┬───────────┘   └────────────▲───────────┘        │
│               │                            │                    │
│  ┌────────────▼───────────┐   ┌────────────┴───────────┐        │
│  │   Change Buffer        │   │   DVM Engine           │        │
│  │  (pgtrickle_changes.*) │   │   (Operator Tree)      │        │
│  └────────────┬───────────┘   └────────────▲───────────┘        │
│               │                            │                    │
│               └────────────┬───────────────┘                    │
│                            │                                    │
│  ┌─────────────────────────▼─────────────────────────────┐      │
│  │              Refresh Engine                           │      │
│  │  ┌──────────┐  ┌──────────┐  ┌─────────────────────┐  │      │
│  │  │ Frontier │  │ DAG      │  │ Scheduler           │  │      │
│  │  │ Tracker  │  │ Resolver │  │ (canonical schedule)│  │      │
│  │  └──────────┘  └──────────┘  └─────────────────────┘  │      │
│  └───────────────────────────────────────────────────────┘      │
│                                                                 │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │                   Catalog (pgtrickle.*)                    │ │
│  │ pgt_stream_tables │ pgt_dependencies │ pgt_refresh_history │ │
│  └────────────────────────────────────────────────────────────┘ │
│                                                                 │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │                      Monitoring Layer                      │ │
│  │ st_refresh_stats │ slot_health │ check_cdc_health          │ │
│  │ explain_st │ views │ NOTIFY alerting                       │ │
│  └────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘

Component Details

1. SQL API Layer (src/api.rs)

The public entry point for users. All operations are exposed as #[pg_extern] functions in the pgtrickle schema:

  • create_stream_table — Applies a chain of auto-rewrite passes (view inlining → DISTINCT ON → GROUPING SETS → scalar subquery in WHERE → correlated scalar subquery in SELECT → SubLinks in OR → multi-PARTITION BY windows), parses the defining query, builds an operator tree, creates the storage table, registers CDC slots, populates the catalog, and optionally performs an initial full refresh.
  • alter_stream_table — Modifies schedule, refresh mode, status (ACTIVE/SUSPENDED), or defining query. Query changes trigger schema migration, dependency updates, and a full refresh within a single transaction.
  • drop_stream_table — Removes the storage table, catalog entries, and cleans up CDC slots.
  • refresh_stream_table — Triggers a manual refresh (same path as automatic scheduling).
  • pgt_status — Returns a summary of all registered stream tables.

2. Catalog (src/catalog.rs)

The catalog manages persistent metadata stored in PostgreSQL tables within the pgtrickle schema:

| Table | Purpose |
|-------|---------|
| pgtrickle.pgt_stream_tables | Core metadata: name, query, schedule, status, frontier, etc. |
| pgtrickle.pgt_dependencies | DAG edges from ST to source tables |
| pgtrickle.pgt_refresh_history | Audit log of every refresh operation |
| pgtrickle.pgt_change_tracking | Per-source CDC slot metadata |

Schema creation is handled by extension_sql!() macros that run at CREATE EXTENSION time.

Entity-Relationship Diagram

erDiagram
    pgt_stream_tables {
        bigserial pgt_id PK
        oid pgt_relid UK "OID of materialized storage table"
        text pgt_name
        text pgt_schema
        text defining_query
        text original_query "User's original SQL (pre-inlining)"
        text schedule "Duration or cron expression"
        text refresh_mode "FULL | DIFFERENTIAL | IMMEDIATE"
        text status "INITIALIZING | ACTIVE | SUSPENDED | ERROR"
        boolean is_populated
        timestamptz data_timestamp "Freshness watermark"
        jsonb frontier "DBSP-style version frontier"
        timestamptz last_refresh_at
        int consecutive_errors
        boolean needs_reinit
        float8 auto_threshold
        float8 last_full_ms
        timestamptz created_at
        timestamptz updated_at
    }

    pgt_dependencies {
        bigint pgt_id PK,FK "References pgt_stream_tables.pgt_id"
        oid source_relid PK "OID of source table"
        text source_type "TABLE | STREAM_TABLE | VIEW"
        text_arr columns_used "Column-level lineage"
        text cdc_mode "TRIGGER | TRANSITIONING | WAL"
        text slot_name "Replication slot (WAL mode)"
        pg_lsn decoder_confirmed_lsn "WAL decoder progress"
        timestamptz transition_started_at "Trigger→WAL transition start"
    }

    pgt_refresh_history {
        bigserial refresh_id PK
        bigint pgt_id FK "References pgt_stream_tables.pgt_id"
        timestamptz data_timestamp
        timestamptz start_time
        timestamptz end_time
        text action "NO_DATA | FULL | DIFFERENTIAL | REINITIALIZE | SKIP"
        bigint rows_inserted
        bigint rows_deleted
        text error_message
        text status "RUNNING | COMPLETED | FAILED | SKIPPED"
        text initiated_by "SCHEDULER | MANUAL | INITIAL"
        timestamptz freshness_deadline
    }

    pgt_change_tracking {
        oid source_relid PK "OID of tracked source table"
        text slot_name "Trigger function name"
        pg_lsn last_consumed_lsn
        bigint_arr tracked_by_pgt_ids "ST IDs sharing this source"
    }

    pgt_stream_tables ||--o{ pgt_dependencies : "has sources"
    pgt_stream_tables ||--o{ pgt_refresh_history : "has refresh history"
    pgt_stream_tables }o--o{ pgt_change_tracking : "tracks via pgt_ids array"

Note: Change buffer tables (pgtrickle_changes.changes_<oid>) are created dynamically per source table OID and live in the separate pgtrickle_changes schema.

3. CDC / Change Data Capture (src/cdc.rs, src/wal_decoder.rs)

pg_trickle uses a hybrid CDC architecture that starts with triggers and optionally transitions to WAL-based (logical replication) capture for lower write-side overhead.

Trigger Mode (default)

  1. Trigger Management — Creates AFTER INSERT OR UPDATE OR DELETE row-level triggers (pg_trickle_cdc_<oid>) on each tracked source table. Each trigger fires a PL/pgSQL function (pg_trickle_cdc_fn_<oid>()) that writes changes to the buffer table.
  2. Change Buffering — Captured changes are written to per-source change buffer tables in the pgtrickle_changes schema. Each row captures the LSN (pg_current_wal_lsn()), transaction ID, action type (I/U/D), and the new/old row data as typed columns (new_<col> TYPE, old_<col> TYPE) — native PostgreSQL types, not JSONB.
  3. Cleanup — Consumed changes are deleted after each successful refresh via delete_consumed_changes(), bounded by the upper LSN to prevent unbounded scans.
  4. Lifecycle — Triggers and trigger functions are automatically created when a source table is first tracked and dropped when the last stream table referencing a source is removed.

The trigger approach was chosen as the default for transaction safety (triggers can be created in the same transaction as DDL), simplicity (no slot management, no wal_level = logical requirement), and immediate visibility (changes are visible in buffer tables as soon as the source transaction commits).
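As a rough illustration of what a differential refresh does with a buffered run of changes, here is a sketch of net-effect computation: collapsing an ordered run of I/U/D changes per key into DELETE/INSERT diffs. This is a minimal model of the idea, not the extension's Rust implementation.

```python
def net_effect(changes):
    """Collapse an ordered run of CDC changes into a net delta.

    Each change is (action, old_row, new_row) with action in 'I'/'U'/'D';
    rows are (key, value) tuples, None when absent. Returns the rows to
    delete and the rows to insert. A sketch, not pg_trickle's actual code.
    """
    first_old = {}   # key -> row state before the run (None = absent)
    last_new = {}    # key -> row state after the run (None = deleted)
    for action, old, new in changes:
        key = (new or old)[0]
        if key not in first_old:
            first_old[key] = old  # None when the run begins with an insert
        last_new[key] = new       # None when the run ends with a delete
    deletes = [row for row in first_old.values() if row is not None]
    inserts = [row for row in last_new.values() if row is not None]
    # Drop no-op pairs where the row is unchanged end-to-end
    unchanged = set(deletes) & set(inserts)
    deletes = [r for r in deletes if r not in unchanged]
    inserts = [r for r in inserts if r not in unchanged]
    return deletes, inserts

# An insert followed by two updates nets out to one insert of the final row
changes = [
    ("I", None, (42, "active")),
    ("U", (42, "active"), (42, "pending")),
    ("U", (42, "pending"), (42, "shipped")),
]
print(net_effect(changes))  # ([], [(42, 'shipped')])
```

The payoff is that a row touched many times between refreshes costs only one delete/insert pair in the stream table, and rows that end up unchanged cost nothing.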

WAL Mode (optional, automatic transition)

When pg_trickle.cdc_mode is set to 'auto' or 'wal' and wal_level = logical is available, the system transitions from trigger-based to WAL-based CDC after the first successful refresh:

  1. WAL Availability Detection — At stream table creation, checks whether wal_level = logical is configured. If so, the source dependency is marked for WAL transition.
  2. WAL Decoder Background Worker — A dedicated background worker (src/wal_decoder.rs) polls logical replication slots and writes decoded changes into the same change buffer tables used by triggers, ensuring a uniform format for the DVM engine.
  3. Transition Orchestration — The transition is a three-step process: (a) create a replication slot, (b) wait for the decoder to catch up to the trigger's last confirmed LSN, (c) drop the trigger and switch the dependency to WAL mode. If the decoder doesn't catch up within pg_trickle.wal_transition_timeout (default 300s), the system falls back to triggers.
  4. CDC Mode Tracking — Each source dependency in pgt_dependencies carries a cdc_mode column (TRIGGER / TRANSITIONING / WAL) and WAL-specific metadata (slot_name, decoder_confirmed_lsn, transition_started_at).

See ADR-001 and ADR-002 in plans/adrs/PLAN_ADRS.md for the original design rationale and plans/sql/PLAN_HYBRID_CDC.md for the full implementation plan.

Immediate Mode / Transactional IVM (src/ivm.rs)

When refresh_mode = 'IMMEDIATE', pg_trickle uses statement-level AFTER triggers with transition tables instead of row-level CDC triggers. The stream table is maintained synchronously within the same transaction as the base table DML.

  1. BEFORE Triggers — Statement-level BEFORE triggers on each base table acquire an advisory lock on the stream table to prevent concurrent conflicting updates.
  2. AFTER Triggers — Statement-level AFTER triggers with REFERENCING NEW TABLE AS ... OLD TABLE AS ... copy the transition table data to temp tables, then call the Rust pgt_ivm_apply_delta() function.
  3. Delta Computation — The DVM engine's Scan operator reads from the temp tables (via DeltaSource::TransitionTable) instead of change buffer tables. No LSN filtering or net-effect computation is needed — each trigger invocation represents a single atomic statement.
  4. Delta Application — The computed delta is applied via explicit DML (DELETE + INSERT ON CONFLICT) to the stream table.
  5. TRUNCATE — A separate AFTER TRUNCATE trigger calls pgt_ivm_handle_truncate(), which truncates the stream table and re-populates from the defining query.

No change buffer tables, no scheduler involvement, and no WAL infrastructure is needed for IMMEDIATE mode. See plans/sql/PLAN_TRANSACTIONAL_IVM.md for the design plan.

ST-to-ST Change Capture (v0.11.0+)

When a stream table's defining query references another stream table (rather than a base table), neither triggers nor WAL capture apply — the upstream source is itself maintained by pg_trickle. A dedicated ST change buffer mechanism enables downstream stream tables to refresh differentially even when their source is another stream table.

  Base Table  ──trigger/WAL──▶  changes_<oid>       (base-table buffer)
  Stream Table A  ──refresh──▶  changes_pgt_<pgt_id>  (ST buffer for A's consumers)
  Stream Table B  reads from    changes_pgt_<pgt_id>  (B depends on A)

Buffer schema. ST change buffers are named pgtrickle_changes.changes_pgt_<pgt_id> (using the internal pgt_id rather than the OID). Unlike base-table buffers, they store only new_* columns — no old_* columns — because ST deltas are expressed as INSERT/DELETE pairs, not UPDATE rows.

Delta capture — DIFFERENTIAL path. When an upstream stream table refreshes in DIFFERENTIAL mode and has downstream consumers, the refresh engine captures the computed delta (the INSERT and DELETE rows applied to the upstream ST) into the ST change buffer via explicit DML. Downstream stream tables then read from this buffer exactly as they would read from a base-table change buffer.

Delta capture — FULL path. When an upstream stream table refreshes in FULL mode (e.g., due to a mode downgrade or full => true), the engine takes a pre-refresh snapshot, executes the full refresh, then computes an EXCEPT ALL diff between the old and new contents. The resulting INSERT/DELETE pairs are written to the ST change buffer. This prevents FULL refreshes from cascading through the entire dependency chain — downstream STs always receive a minimal delta regardless of how the upstream was refreshed.
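
The FULL-path capture is, in effect, a multiset difference between two snapshots. A minimal Python sketch of that semantics (illustrative only; the extension performs this in SQL with EXCEPT ALL):

```python
from collections import Counter

def snapshot_diff(old_rows, new_rows):
    """Model the EXCEPT ALL diff between pre- and post-refresh snapshots.

    Returns (inserts, deletes): rows present in the new snapshot but not
    the old one, and vice versa, with multiset (ALL) semantics.
    """
    old, new = Counter(old_rows), Counter(new_rows)
    inserts = list((new - old).elements())  # new EXCEPT ALL old
    deletes = list((old - new).elements())  # old EXCEPT ALL new
    return inserts, deletes

# one ("b", 2) duplicate disappears, ("c", 3) appears
ins, dels = snapshot_diff([("a", 1), ("b", 2), ("b", 2)],
                          [("a", 1), ("b", 2), ("c", 3)])
```

Only the INSERT/DELETE pairs, never the full new contents, reach the ST change buffer, which is what keeps downstream refreshes differential.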

Frontier tracking. ST source positions are tracked in the same frontier JSONB structure as base-table sources, using pgt_<upstream_pgt_id> as the key (e.g., {"pgt_42": 157}) rather than the OID-based keys used for base tables. The scheduler's has_stream_table_source_changes() function compares the downstream's last-consumed frontier position against the upstream buffer's current maximum LSN to decide whether a refresh is needed.

Lifecycle. ST change buffers are created automatically when a stream table gains its first downstream consumer (create_st_change_buffer_table()), and dropped when the last downstream consumer is removed (drop_st_change_buffer_table()). On upgrade from pre-v0.11.0, existing ST-to-ST dependencies have their buffers auto-created on the first scheduler tick. Consumed rows are cleaned up by cleanup_st_change_buffers_by_frontier() after each successful downstream refresh.

4. DVM Engine (src/dvm/)

The Differential View Maintenance engine is the core of the system. It transforms the defining SQL query into an executable operator tree that can compute deltas efficiently.

Auto-Rewrite Pipeline (src/dvm/parser.rs)

Before the defining query is parsed into an operator tree, it passes through a chain of auto-rewrite passes that normalize SQL constructs the DVM parser doesn't handle directly:

| Pass | Function | Purpose |
| --- | --- | --- |
| #0 | rewrite_views_inline() | Replace view references with (view_definition) AS alias subqueries |
| #1 | rewrite_distinct_on() | Convert DISTINCT ON to ROW_NUMBER() OVER (…) = 1 window subquery |
| #2 | rewrite_grouping_sets() | Decompose GROUPING SETS / CUBE / ROLLUP into UNION ALL of GROUP BY |
| #3 | rewrite_scalar_subquery_in_where() | Convert WHERE col > (SELECT …) to CROSS JOIN |
| #4 | rewrite_sublinks_in_or() | Split WHERE a OR EXISTS (…) into UNION branches |
| #5 | rewrite_multi_partition_windows() | Split multiple PARTITION BY clauses into joined subqueries |

The view inlining pass (#0) runs first so that view definitions containing DISTINCT ON, GROUPING SETS, etc. are further rewritten by downstream passes. Nested views are expanded via a fixpoint loop (max depth 10).
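
The fixpoint loop can be sketched as follows. The toy passes and the VIEW_v / VIEW_w names are hypothetical stand-ins for the real rewrite passes:

```python
def rewrite_fixpoint(sql, passes, max_depth=10):
    """Apply rewrite passes repeatedly until the query stops changing
    (or the depth bound is hit), mirroring nested-view expansion."""
    for _ in range(max_depth):
        out = sql
        for rewrite in passes:
            out = rewrite(out)
        if out == sql:          # fixpoint reached: nothing left to rewrite
            return out
        sql = out
    return sql                  # depth bound hit; return best effort

# toy passes: each expands one (hypothetical) view reference per application
def expand_v(q):
    return q.replace("VIEW_v", "(SELECT * FROM VIEW_w) v", 1)

def expand_w(q):
    return q.replace("VIEW_w", "(SELECT * FROM t) w", 1)

result = rewrite_fixpoint("SELECT * FROM VIEW_v", [expand_v, expand_w])
```

Because expansion of VIEW_v exposes a reference to VIEW_w, a second iteration is needed before the query stabilizes, which is exactly why a fixpoint loop (rather than a single pass) is required for nested views.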

Query Parser (src/dvm/parser.rs)

Parses the defining query using PostgreSQL's internal parser (via pgrx raw_parser) and extracts:

  • WITH clause — CTE definitions (non-recursive: inline expansion or shared delta; recursive: detected for mode gating)
  • Target list — output columns
  • FROM clause — source tables, joins, subqueries, and CTE references
  • WHERE clause — filters
  • GROUP BY / aggregate functions
  • DISTINCT / UNION ALL / INTERSECT / EXCEPT

The parser produces an OpTree — a tree of operator nodes. CTE handling follows a tiered approach:

  1. Tier 1 (Inline Expansion) — Non-recursive CTEs referenced once are expanded into Subquery nodes, equivalent to subqueries in FROM.
  2. Tier 2 (Shared Delta) — Non-recursive CTEs referenced multiple times produce CteScan nodes that share a single delta computation via a CTE registry and delta cache.
  3. Tier 3a/3b/3c (Recursive) — Recursive CTEs (WITH RECURSIVE) are detected via query_has_recursive_cte(). In FULL mode, the query executes as-is. In DIFFERENTIAL mode, the strategy is auto-selected: semi-naive evaluation for INSERT-only changes, Delete-and-Rederive (DRed) for mixed changes, or recomputation fallback when CTE columns don't match ST storage or when the recursive term contains non-monotone operators (EXCEPT, Aggregate, Window, DISTINCT, AntiJoin, INTERSECT SET). In IMMEDIATE mode, the same semi-naive / DRed machinery runs against statement transition tables and is bounded by pg_trickle.ivm_recursive_max_depth to guard against unbounded recursion.
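
Semi-naive evaluation, the INSERT-only strategy named above, can be illustrated with a transitive-closure toy in Python (a sketch of the algorithm, not the engine's SQL):

```python
def seminaive_closure(edges):
    """Semi-naive evaluation of transitive closure: each round joins only
    the newly derived facts (the delta) against the base edges, instead of
    re-deriving everything from scratch."""
    total = set(edges)
    delta = set(edges)
    while delta:
        derived = {(a, c) for (a, b) in delta for (b2, c) in edges if b == b2}
        delta = derived - total   # only genuinely new facts feed the next round
        total |= delta
    return total

closure = seminaive_closure({(1, 2), (2, 3), (3, 4)})
```

Delete-and-Rederive (DRed) extends this by first over-deleting everything derivable from deleted facts, then re-deriving what still has support; the recomputation fallback simply re-runs the recursive query.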

Operators (src/dvm/operators/)

Each operator knows how to generate a delta query — given a set of changes to its inputs, it produces the corresponding changes to its output:

| Operator | Delta Strategy |
| --- | --- |
| Scan | Direct passthrough of CDC changes |
| Filter | Apply WHERE predicate to deltas |
| Project | Apply column projection to deltas |
| Join | Join deltas against the other side's current state |
| OuterJoin | LEFT/RIGHT outer join with NULL padding |
| FullJoin | FULL OUTER JOIN with 8-part delta (both sides may produce NULLs) |
| Aggregate | Recompute group values where affected keys changed |
| Distinct | COUNT-based duplicate tracking |
| UnionAll | Merge deltas from both branches |
| Intersect | Dual-count multiplicity with LEAST boundary crossing |
| Except | Dual-count multiplicity with GREATEST(0, L-R) boundary crossing |
| Subquery | Transparent delegation + optional column renaming (CTEs, subselects) |
| CteScan | Shared delta lookup from CTE cache (multi-reference CTEs) |
| RecursiveCte | Semi-naive / DRed / recomputation for WITH RECURSIVE |
| Window | Partition-based recomputation for window functions |
| LateralFunction | Row-scoped recomputation for SRFs in FROM (jsonb_array_elements, unnest, etc.) |
| LateralSubquery | Row-scoped recomputation for correlated subqueries in LATERAL FROM |
| SemiJoin | EXISTS / IN subquery delta via semi-join |
| AntiJoin | NOT EXISTS / NOT IN subquery delta via anti-join |
| ScalarSubquery | Correlated scalar subquery in SELECT list |

See DVM_OPERATORS.md for detailed descriptions.
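
The Join row in the table follows the bilinear delta rule from DBSP. A Python sketch over plain sets (the engine works over signed multisets in generated SQL):

```python
def join(r, s):
    """Natural join on the first column of r against the first column of s."""
    return {(a, b, c) for (a, b) in r for (a2, c) in s if a == a2}

def delta_join(r_old, dr, s_old, ds):
    """Bilinear join delta: d(R join S) = dR join S + R join dS + dR join dS.
    Insert-only sets here; the real engine handles signed multisets."""
    return join(dr, s_old) | join(r_old, ds) | join(dr, ds)

r_old, s_old = {("k1", "x")}, {("k1", "u")}
dr, ds = {("k2", "y")}, {("k2", "v")}

d = delta_join(r_old, dr, s_old, ds)
# sanity check: old result plus delta equals the join of the new states
assert join(r_old, s_old) | d == join(r_old | dr, s_old | ds)
```

The dR-join-dS term is why both sides' deltas must be visible in the same refresh window: dropping it silently loses rows when both inputs change in the same cycle.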

Diff Engine (src/dvm/diff.rs)

Generates the final diff SQL that:

  1. Computes the delta from the operator tree
  2. Produces ('+', row) for inserts and ('-', row) for deletes
  3. Applies the diff via DELETE matching old rows and INSERT for new rows
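
A minimal Python model of steps 2 and 3, using hypothetical integer row ids in place of the internal __pgt_row_id column:

```python
def apply_diff(storage, diff):
    """Apply ('+', row_id, row) / ('-', row_id, row) diff entries:
    '-' removes the matching stored row by id (the DELETE phase),
    '+' adds the new row (the INSERT phase)."""
    by_id = {row_id: row for (row_id, row) in storage}
    for sign, row_id, row in diff:
        if sign == "-":
            by_id.pop(row_id, None)
        else:
            by_id[row_id] = row
    return sorted(by_id.items())

state = [(1, "stale"), (2, "keep")]
diff = [("-", 1, "stale"), ("+", 3, "fresh")]
new_state = apply_diff(state, diff)
```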

5. DAG / Dependency Graph (src/dag.rs)

Stream tables can depend on other stream tables (cascading), forming a Directed Acyclic Graph:

  • Cycle detection — Detects circular dependencies at creation time using Kahn's algorithm (BFS topological sort). When pg_trickle.allow_circular = true, monotone cycles (queries using only safe operators — joins, filters, UNION ALL, etc.) are allowed; non-monotone cycles (aggregates, EXCEPT, window functions, anti-joins) are rejected. SCC IDs are automatically assigned to cycle members and recomputed on drop/alter.
  • SCC decomposition — Tarjan's algorithm decomposes the graph into strongly connected components. Singleton SCCs are acyclic; multi-node SCCs contain cycles that are handled by fixed-point iteration in the scheduler.
  • Monotonicity analysis — Static check (check_monotonicity() in src/dvm/parser.rs) determines whether a query's operators are safe for cyclic fixed-point iteration. Non-monotone operators (Aggregate, EXCEPT, Window, NOT EXISTS) block cycle creation.
  • Topological ordering — Determines refresh order: upstream STs must be refreshed before downstream STs.
  • Condensation order — condensation_order() returns SCCs in topological order, grouping cyclic STs for fixed-point iteration. The scheduler's iterate_to_fixpoint() processes multi-node SCCs by refreshing all members repeatedly until convergence (zero net changes) or max_fixpoint_iterations is exceeded.
  • Cascade operations — When a source table changes, all transitive dependents are identified for refresh.
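
Kahn's algorithm, as used for cycle detection and refresh ordering, in sketch form. Here deps maps each ST to the set of its direct upstreams:

```python
from collections import deque

def topo_order(nodes, deps):
    """Kahn's BFS topological sort. Returns a refresh order (upstream
    before downstream), or None if a cycle prevents a complete ordering."""
    indegree = {n: len(deps.get(n, ())) for n in nodes}
    downstream = {n: [] for n in nodes}
    for n, ups in deps.items():
        for u in ups:
            downstream[u].append(n)
    ready = deque(n for n, d in indegree.items() if d == 0)
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for d in downstream[n]:          # releasing n unblocks its dependents
            indegree[d] -= 1
            if indegree[d] == 0:
                ready.append(d)
    return order if len(order) == len(nodes) else None  # None => cycle

order = topo_order(["a", "b", "c"], {"b": {"a"}, "c": {"a", "b"}})
```

Nodes left over when the ready queue drains are exactly the cycle members, which is how creation-time cycle detection identifies them.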

6. Version / Frontier Tracking (src/version.rs)

Implements a per-source frontier (JSONB map of source_oid → LSN) to track exactly how far each stream table has consumed changes:

  • Read frontier — Before refresh, read the frontier to know where to start consuming changes.
  • Advance frontier — After a successful refresh, the frontier is updated to the latest consumed LSN.
  • Consistent snapshots — The frontier ensures that each refresh processes a contiguous, non-overlapping window of changes.
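
A sketch of the read/advance cycle for a single source, using toy integer LSNs; consume_window is a hypothetical name for illustration:

```python
def consume_window(frontier, buffer_rows, source):
    """Select the change window (old_lsn, new_lsn] for one source and
    return the advanced frontier. Re-running with the advanced frontier
    yields an empty window, so no change is consumed twice."""
    lo = frontier.get(source, 0)
    hi = max((lsn for (lsn, _) in buffer_rows), default=lo)
    window = [row for (lsn, row) in buffer_rows if lo < lsn <= hi]
    new_frontier = dict(frontier)
    new_frontier[source] = hi
    return window, new_frontier

buf = [(10, "r1"), (20, "r2"), (30, "r3")]
w1, f1 = consume_window({"t": 10}, buf, "t")   # consumes r2, r3
w2, f2 = consume_window(f1, buf, "t")          # nothing new: empty window
```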

Delayed View Semantics (DVS) Guarantee

The contents of every stream table are logically equivalent to evaluating its defining query at some past point in time — the data_timestamp. The scheduler refreshes STs in topological order so that when ST B references upstream ST A, A has already been refreshed to the target data_timestamp before B runs its delta query against A's contents. The frontier lifecycle is:

  1. Created — on first full refresh; records the LSN of each source at that moment.
  2. Advanced — on each differential refresh; the old frontier becomes the lower bound and the new frontier (with fresh LSNs) the upper bound. The DVM engine reads changes in the half-open window (old, new], so consecutive refreshes never overlap.
  3. Reset — on reinitialize; a fresh frontier is created from scratch.

7. Refresh Engine (src/refresh.rs)

Orchestrates the complete refresh cycle:

┌──────────────┐
│  Check State │ → Is ST active? Has it been populated?
└──────┬───────┘
       │
 ┌─────▼──────┐
 │ Drain CDC  │ → Read WAL changes into change buffer tables
 └─────┬──────┘
       │
 ┌─────▼──────────────┐
 │ Determine Action   │ → FULL, DIFFERENTIAL, NO_DATA, REINITIALIZE, or SKIP?
 │                    │   (adaptive: if change ratio > pg_trickle.differential_max_change_ratio,
 │                    │    downgrade DIFFERENTIAL → FULL automatically)
 └─────┬──────────────┘
       │
 ┌─────▼──────┐
 │ Execute    │ → Full: TRUNCATE + INSERT ... SELECT
 │            │   Differential: Generate & apply delta SQL
 └─────┬──────┘
       │
 ┌─────▼──────────────┐
 │ Record History     │ → Write to pgtrickle.pgt_refresh_history
 └─────┬──────────────┘
       │
 ┌─────▼──────────────┐
 │ Advance Frontier   │ → Update JSONB frontier in catalog
 └─────┬──────────────┘
       │
 ┌─────▼──────────────┐
 │ Reset Error Count  │ → On success, reset consecutive_errors to 0
 └──────────────────────┘

8. Background Worker & Scheduling (src/scheduler.rs)

Registration & Lifecycle

pg_trickle registers one PostgreSQL background worker — the scheduler — during _PG_init() (extension load). Because it is registered at startup, pg_trickle must appear in shared_preload_libraries, which requires a server restart.

┌──────────────────────────────────────────────────────────────────┐
│                  PostgreSQL postmaster                           │
│                                                                  │
│  shared_preload_libraries = 'pg_trickle'                          │
│       │                                                          │
│       ▼                                                          │
│  _PG_init()                                                      │
│    ├─ Register GUCs (pg_trickle.enabled, scheduler_interval_ms …) │
│    ├─ Register shared memory (PgTrickleSharedState, atomics)      │
│    └─ BackgroundWorkerBuilder::new("pg_trickle scheduler")        │
│         .set_start_time(RecoveryFinished)                        │
│         .set_restart_time(5s)       ← auto-restart on crash      │
│         .load()                                                  │
│                                                                  │
│  After recovery finishes:                                        │
│       │                                                          │
│       ▼                                                          │
│  pg_trickle_scheduler_main()         ← background worker starts   │
│    ├─ Attach SIGHUP + SIGTERM handlers                           │
│    ├─ Connect to SPI (database = "postgres")                     │
│    ├─ Crash recovery: mark stale RUNNING records as FAILED       │
│    └─ Enter main loop ─────────────────────────┐                 │
│         │                                      │                 │
│         ▼                                      │                 │
│     wait_latch(scheduler_interval_ms)          │                 │
│         │                                      │                 │
│     ┌───▼───────────────────────────────┐      │                 │
│     │ SIGTERM? → log + break            │      │                 │
│     │ pg_trickle.enabled = false? → skip │      │                 │
│     │ Otherwise → scheduler tick        │      │                 │
│     └───┬───────────────────────────────┘      │                 │
│         │                                      │                 │
│         └──────────── loop ────────────────────┘                 │
└──────────────────────────────────────────────────────────────────┘

Key lifecycle properties:

| Property | Behaviour |
| --- | --- |
| Start condition | After PostgreSQL recovery finishes (RecoveryFinished) |
| Auto-restart | 5-second delay after an unexpected crash |
| Graceful shutdown | Handles SIGTERM — breaks the main loop and exits cleanly |
| Config reload | Handles SIGHUP — re-reads GUC values on the next latch wake |
| Crash recovery | On startup, any pgt_refresh_history rows stuck in RUNNING status are marked FAILED (the transaction that wrote them was rolled back by PostgreSQL, but the status row may have been committed in a prior transaction) |
| Database | Connects to the postgres database via SPI |
| Standby / replica | On standby servers (pg_is_in_recovery() = true), the worker enters a sleep loop and does not attempt refreshes. Stream tables are still readable on standbys — they are regular heap tables replicated via physical streaming replication. After promotion the scheduler resumes automatically. See the FAQ § Replication for details on logical replication and subscriber limitations. |

Scheduler Tick

Each tick of the main loop performs the following steps inside a single transaction:

  1. DAG rebuild — Compare the shared-memory DAG_REBUILD_SIGNAL counter against the local copy. If it advanced (a CREATE, ALTER, or DROP stream table occurred), rebuild the in-memory dependency graph (StDag) from the catalog.
  2. Topological traversal — Walk stream tables in dependency order (upstream before downstream). This ensures that when ST B references ST A, A is refreshed first.
  3. Per-ST evaluation — For each active ST:
    • Skip if in retry backoff (exponential, per-ST).
    • Skip if schedule/cron says not yet due.
    • Skip if a row-level lock on the catalog entry indicates a concurrent refresh.
    • Check upstream change buffers for pending rows.
  4. Execute refresh — Acquire a row-level lock on the catalog entry → record RUNNING in history → run FULL / DIFFERENTIAL / REINITIALIZE → store new frontier → release lock → record completion.
  5. WAL transitions — Advance any trigger→WAL CDC mode transitions (src/wal_decoder.rs).
  6. Slot health — Check replication slot health and emit NOTIFY alerts.
  7. Prune retry state — Remove backoff entries for STs that no longer exist.

Sequential Processing (Default)

By default (parallel_refresh_mode = 'off'), the scheduler processes stream tables sequentially within a single background worker. All STs are refreshed one at a time in topological order. pg_trickle.max_concurrent_refreshes (default 4) only prevents a manual pgtrickle.refresh_stream_table() call from overlapping with the scheduler on the same ST — it does not spawn additional workers.

The PostgreSQL GUC max_worker_processes (default 8) sets the server-wide budget for all background workers (autovacuum, parallel query, logical replication, extensions). In sequential mode pg_trickle consumes one slot from that budget.

Parallel Refresh (parallel_refresh_mode = 'on')

When enabled, the scheduler builds an execution-unit DAG from the stream-table dependency graph and dispatches independent units to dynamic background workers:

  1. Execution units — Each independent stream table becomes a singleton unit. Atomic consistency groups and IMMEDIATE-trigger closures are collapsed into composite units that run in a single worker for correctness.
  2. Ready queue — Units whose upstream dependencies have all completed enter the ready queue. The coordinator dispatches them subject to a per-database cap (max_concurrent_refreshes) and a cluster-wide cap (max_dynamic_refresh_workers).
  3. Dynamic workers — Each dispatched unit spawns a short-lived background worker via BackgroundWorkerBuilder::load_dynamic(). Workers claim a job from the pgtrickle.pgt_scheduler_jobs catalog table, execute the refresh, and exit.

The parallel path respects the same topological ordering as the sequential path — downstream units only become ready after all upstream units succeed. The worker-budget caps ensure pg_trickle does not exhaust max_worker_processes.
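
The ready-queue dispatch can be modeled as waves of independent units capped by the worker budget. A Python sketch (dispatch_waves is a hypothetical name, not the scheduler's API):

```python
def dispatch_waves(units, deps, max_workers):
    """Group execution units into dispatch waves: a unit becomes ready once
    all of its upstream units have completed; each wave is capped at the
    worker budget. deps maps unit -> set of upstream units."""
    done, waves = set(), []
    pending = set(units)
    while pending:
        ready = sorted(u for u in pending if deps.get(u, set()) <= done)
        if not ready:
            raise ValueError("cycle or unsatisfiable dependency")
        wave = ready[:max_workers]      # worker-budget cap per wave
        waves.append(wave)
        done |= set(wave)
        pending -= set(wave)
    return waves

# a and b are independent; c needs a; d needs a and b
waves = dispatch_waves(["a", "b", "c", "d"],
                       {"c": {"a"}, "d": {"a", "b"}}, max_workers=2)
```

With max_workers=2 the independent units a and b run in the first wave, then c and d in the second, preserving topological order while exploiting parallelism.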

See PLAN_PARALLELISM.md for the full design and CONFIGURATION.md for tuning guidance.

Retry & Error Handling

Each ST maintains an in-memory RetryState (reset on scheduler restart):

  • Retryable errors (SPI failures, lock contention, slot issues) trigger exponential backoff.
  • Permanent errors (schema mismatch, user errors) skip backoff but increment consecutive_errors.
  • When consecutive_errors reaches pg_trickle.max_consecutive_errors (default 3), the ST is auto-suspended and a NOTIFY alert is emitted.
  • Schema errors additionally set needs_reinit, triggering a REINITIALIZE on the next successful cycle.
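
A sketch of the retry-state update in Python. The 300-second backoff cap here is an assumed illustration, not a documented limit:

```python
def next_action(state, error_retryable, max_consecutive_errors=3):
    """Update per-ST retry state after a failed refresh: retryable errors
    double the backoff; every failure bumps consecutive_errors; hitting the
    limit auto-suspends the ST (which also emits a NOTIFY in the real system)."""
    state = dict(state)
    state["consecutive_errors"] += 1
    if error_retryable:
        state["backoff_s"] = min(state["backoff_s"] * 2, 300)  # assumed cap
    if state["consecutive_errors"] >= max_consecutive_errors:
        state["status"] = "SUSPENDED"
    return state

s = {"consecutive_errors": 0, "backoff_s": 1, "status": "ACTIVE"}
for _ in range(3):
    s = next_action(s, error_retryable=True)
# three retryable failures: backoff doubled to 8s, ST auto-suspended
```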

Scheduling Policy

Automatic refresh scheduling uses canonical periods (48·2ⁿ seconds, n = 0, 1, 2, …) snapped to the user's schedule:

  • Picks the largest canonical period that does not exceed the schedule, so refreshes happen at least as often as requested.
  • For DOWNSTREAM schedule (NULL schedule), the ST refreshes only when explicitly triggered or when a downstream ST needs it.
  • Advisory locks prevent concurrent refreshes of the same ST.
  • The scheduler is driven by the background worker polling at the pg_trickle.scheduler_interval_ms GUC interval.
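
Snapping a schedule onto the canonical period grid can be sketched as follows; the behaviour for schedules below the 48-second base is an assumption for illustration:

```python
def canonical_period(schedule_s, base=48):
    """Snap a schedule to the largest canonical period base * 2**n
    (n = 0, 1, 2, ...) that does not exceed it; schedules below the
    base snap up to the base period (assumed behaviour)."""
    if schedule_s <= base:
        return base
    period = base
    while period * 2 <= schedule_s:
        period *= 2
    return period
```

For example, a 200-second schedule snaps to the 192-second canonical period (48 · 2²), so the ST refreshes slightly more often than requested rather than less.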

Shared Memory (src/shmem.rs)

The scheduler background worker and user sessions share a PgTrickleSharedState structure protected by a PgLwLock. Key fields:

| Field | Type | Purpose |
| --- | --- | --- |
| dag_version | u64 | Incremented when the ST catalog changes; used by the scheduler to detect when the DAG needs rebuilding. |
| scheduler_pid | i32 | PID of the scheduler background worker (0 if not running). |
| scheduler_running | bool | Whether the scheduler is active. |
| last_scheduler_wake | i64 | Unix timestamp of the last scheduler wake cycle (for monitoring). |

A separate PgAtomic<AtomicU64> named DAG_REBUILD_SIGNAL is incremented by API functions (create, alter, drop) after catalog mutations. The scheduler compares its local copy against the atomic counter to detect when to rebuild its in-memory DAG without holding a lock.

A second PgAtomic<AtomicU64> named CACHE_GENERATION tracks DDL events that may invalidate cached delta or MERGE templates across backends. When DDL hooks fire (view change, ALTER TABLE, function change) or API functions mutate the catalog, CACHE_GENERATION is bumped. Each backend maintains a thread-local generation counter; on the next refresh, if the shared generation has advanced, the backend flushes its delta template cache, MERGE template cache, and explicitly DEALLOCATEs tracked __pgt_merge_* prepared statements before rebuilding local state.
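
The generation-counter protocol can be modeled in a few lines of Python; the shared atomic is stood in by a module-level variable, and the class and function names are illustrative:

```python
import itertools

_counter = itertools.count(1)   # stand-in for the shared atomic counter
shared_gen = 0

def bump_generation():
    """DDL hook / API mutation path: advance the shared counter."""
    global shared_gen
    shared_gen = next(_counter)

class Backend:
    """Each backend remembers the generation its caches were built against
    and flushes them lazily when the shared counter has advanced."""
    def __init__(self):
        self.local_gen = shared_gen
        self.template_cache = {}

    def get_template(self, key, build):
        if self.local_gen != shared_gen:   # stale: flush before rebuilding
            self.template_cache.clear()
            self.local_gen = shared_gen
        if key not in self.template_cache:
            self.template_cache[key] = build()
        return self.template_cache[key]

b = Backend()
b.get_template("merge_t1", lambda: "MERGE v1")
bump_generation()                           # e.g. an ALTER TABLE fired
v = b.get_template("merge_t1", lambda: "MERGE v2")   # rebuilt after flush
```

The key property is that invalidation costs one integer comparison per refresh; no backend ever takes a lock just to check whether its caches are current.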

9. DDL Tracking (src/hooks.rs)

Event triggers monitor DDL changes to source tables and functions:

  • _on_ddl_end — Fires on ALTER TABLE to detect column adds/drops/type changes. If a source table used by a ST is altered, the ST's needs_reinit flag is set. Also detects CREATE OR REPLACE FUNCTION / ALTER FUNCTION — if the function appears in a ST's functions_used catalog column, the ST is marked for reinit.
  • _on_sql_drop — Fires on DROP TABLE to set needs_reinit for affected STs. Also detects DROP FUNCTION and marks affected STs for reinit.
  • Function name extraction — object_identity strings (e.g., public.my_func(integer, text)) are parsed to extract the bare function name, which is matched against the functions_used TEXT[] column in pgt_stream_tables.

Reinitialization is deferred until the next refresh cycle, which then performs a REINITIALIZE action (drop and recreate the storage table from the updated query).

10. Error Handling (src/error.rs)

Centralized error types using thiserror:

  • PgTrickleError variants cover catalog access, SQL execution, CDC, DVM, DAG, and config errors.
  • Each refresh failure increments consecutive_errors.
  • When consecutive_errors reaches pg_trickle.max_consecutive_errors (default 3), the ST is moved to ERROR status and suspended from automatic refresh.
  • Manual intervention (ALTER ... status => 'ACTIVE') resets the counter.

11. Monitoring (src/monitor.rs)

Provides observability functions:

  • st_refresh_stats — Aggregate statistics (total/successful/failed refreshes, avg duration, staleness status).
  • get_refresh_history — Per-ST audit trail.
  • get_staleness — Current staleness in seconds.
  • slot_health — Checks replication slot state and WAL retention.
  • check_cdc_health — Per-source CDC health status including mode, slot lag, confirmed LSN, and alerts.
  • explain_st — Describes the DVM plan for a given ST.
  • diamond_groups — Lists detected diamond dependency groups, their members, convergence points, and epoch counters.
  • Views — pgtrickle.stream_tables_info (computed staleness) and pgtrickle.pg_stat_stream_tables (combined stats).

NOTIFY Alerting

Operational events are broadcast via PostgreSQL NOTIFY on the pg_trickle_alert channel. Clients can subscribe with LISTEN pg_trickle_alert; and receive JSON-formatted events:

| Event | Condition |
| --- | --- |
| stale | Data staleness exceeds 2× schedule |
| auto_suspended | ST suspended after pg_trickle.max_consecutive_errors failures |
| reinitialize_needed | Upstream DDL change detected |
| slot_lag_warning | Replication slot WAL retention exceeded pg_trickle.slot_lag_warning_threshold_mb |
| cdc_transition_complete | Source transitioned from trigger to WAL-based CDC |
| cdc_transition_failed | Trigger→WAL transition failed (fell back to triggers) |
| refresh_completed | Refresh completed successfully |
| refresh_failed | Refresh failed with an error |

12. Row ID Hashing (src/hash.rs)

Provides deterministic 64-bit row identifiers using xxHash (xxh64) with a fixed seed. Two SQL functions are exposed:

  • pgtrickle.pg_trickle_hash(text) — Hash a single text value; used for simple single-column row IDs.
  • pgtrickle.pg_trickle_hash_multi(text[]) — Hash multiple values (separated by a record-separator byte \x1E) for composite keys (join row IDs, GROUP BY keys).

Row IDs are written into every stream table's storage as an internal __pgt_row_id BIGINT column and are used by the delta application phase to match DELETE candidates precisely.
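
The composite-key scheme can be sketched in Python. Note the stand-in digest: the Python standard library has no xxh64, so this uses blake2b truncated to 8 bytes purely for illustration; the extension itself uses xxHash with a fixed seed:

```python
import hashlib

RS = b"\x1e"   # record separator between composite key parts

def row_id_multi(values):
    """Composite row id: join the parts with \\x1e and hash to a signed
    64-bit integer that fits a BIGINT column. (Stand-in digest here;
    the extension uses xxh64 with a fixed seed.)"""
    payload = RS.join(v.encode() for v in values)
    digest = hashlib.blake2b(payload, digest_size=8).digest()
    return int.from_bytes(digest, "big", signed=True)

a = row_id_multi(["42", "active"])
b = row_id_multi(["42", "active"])    # deterministic: same id
c = row_id_multi(["42a", "ctive"])    # separator prevents this collision
```

The record separator is what makes the hash injective over the concatenation boundary: without it, ("42a", "ctive") and ("42", "active") would hash identical payloads.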

13. Diamond Dependency Consistency (src/dag.rs)

When stream tables form diamond-shaped dependency graphs, a convergence (fan-in) node may read from multiple upstream STs that share a common ancestor:

        A (source table)
       / \
      B   C   (intermediate STs)
       \ /
        D     (convergence / fan-in ST)

If B refreshes successfully but C fails, D would read a fresh version of B's data alongside stale data from C — a split-version inconsistency.

Detection

StDag::detect_diamonds() walks all fan-in nodes (STs with multiple upstream ST dependencies) and computes transitive ancestor sets per branch. If two or more branches share ancestors, a diamond is detected. Overlapping diamonds are merged.
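
Diamond detection reduces to computing per-branch ancestor sets and intersecting them. A Python sketch (detect_diamond is a hypothetical name; deps maps node to direct upstreams):

```python
def detect_diamond(fanin, deps):
    """A fan-in ST forms a diamond when two of its upstream branches share
    a transitive ancestor. Returns the set of shared ancestors (non-empty
    means a diamond was detected)."""
    def ancestors(n):
        seen, stack = set(), [n]
        while stack:
            for up in deps.get(stack.pop(), ()):
                if up not in seen:
                    seen.add(up)
                    stack.append(up)
        return seen | {n}
    branches = [ancestors(up) for up in deps.get(fanin, ())]
    shared = set()
    for i, x in enumerate(branches):
        for y in branches[i + 1:]:
            shared |= x & y      # ancestors reachable from two branches
    return shared

# A -> B -> D and A -> C -> D: the branches of D share ancestor A
shared = detect_diamond("D", {"D": {"B", "C"}, "B": {"A"}, "C": {"A"}})
```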

Consistency Groups

StDag::compute_consistency_groups() converts detected diamonds into consistency groups — topologically ordered sets of STs that must be refreshed atomically. Each group contains:

  • Members — All intermediate STs plus the convergence node, in refresh order.
  • Convergence points — The fan-in nodes where multiple paths meet.
  • Epoch counter — Advances on each successful atomic refresh.

STs not involved in any diamond are placed in singleton groups (no overhead).

Scheduler Wiring

When diamond_consistency = 'atomic' (per-ST or via the pg_trickle.diamond_consistency GUC):

  1. The scheduler wraps each multi-member group in a SAVEPOINT pgt_consistency_group.
  2. Each member is refreshed in topological order within the savepoint.
  3. If all succeed — RELEASE SAVEPOINT and advance the group epoch.
  4. If any member fails — ROLLBACK TO SAVEPOINT undoes all members' changes. The failure is logged and the group retries on the next scheduler tick.

With diamond_consistency = 'none', members refresh independently in topological order — matching pre-feature behavior.

Schedule Policy

The diamond_schedule_policy setting (per-convergence-node or via the pg_trickle.diamond_schedule_policy GUC) controls when an atomic group fires:

| Policy | Trigger condition | Trade-off |
| --- | --- | --- |
| 'fastest' (default) | Any member is due | Higher freshness, more refreshes |
| 'slowest' | All members are due | Lower resource cost, staler data |

The policy is set on the convergence (fan-in) node. When multiple convergence nodes exist in the same group (nested diamonds), the strictest policy wins (slowest > fastest). The GUC serves as a cluster-wide fallback for nodes without an explicit per-node setting.

Monitoring

The pgtrickle.diamond_groups() SQL function exposes detected groups for operational visibility. See SQL_REFERENCE.md for details.

14. Configuration (src/config.rs)

Runtime behavior is controlled by a growing set of GUC (Grand Unified Configuration) variables. See CONFIGURATION.md for the complete, current list.

| GUC | Default | Purpose |
| --- | --- | --- |
| pg_trickle.enabled | true | Master on/off switch for the scheduler |
| pg_trickle.scheduler_interval_ms | 1000 | Scheduler background worker wake interval (ms) |
| pg_trickle.min_schedule_seconds | 60 | Minimum allowed schedule |
| pg_trickle.max_consecutive_errors | 3 | Errors before auto-suspending a ST |
| pg_trickle.change_buffer_schema | pgtrickle_changes | Schema for change buffer tables |
| pg_trickle.max_concurrent_refreshes | 4 | Maximum parallel refresh workers |
| pg_trickle.differential_max_change_ratio | 0.15 | Change-to-table-size ratio above which DIFFERENTIAL falls back to FULL |
| pg_trickle.cleanup_use_truncate | true | Use TRUNCATE instead of DELETE for change buffer cleanup when the entire buffer is consumed |
| pg_trickle.user_triggers | 'auto' | User-defined trigger handling: auto / off (on accepted as deprecated alias for auto) |
| pg_trickle.block_source_ddl | false | Block column-affecting DDL on tracked source tables instead of reinit |
| pg_trickle.cdc_mode | 'auto' | CDC mechanism: auto / trigger / wal |
| pg_trickle.wal_transition_timeout | 300 | Max seconds to wait for WAL decoder catch-up during transition |
| pg_trickle.slot_lag_warning_threshold_mb | 100 | Warning threshold for WAL slot retention used by slot_lag_warning and health_check() |
| pg_trickle.slot_lag_critical_threshold_mb | 1024 | Critical threshold for WAL slot retention used by check_cdc_health() alerts |
| pg_trickle.diamond_consistency | 'atomic' | Diamond dependency consistency mode: atomic or none |
| pg_trickle.diamond_schedule_policy | 'fastest' | Schedule policy for atomic diamond groups: fastest or slowest |
| pg_trickle.merge_planner_hints | true | Inject SET LOCAL planner hints (disable nestloop, raise work_mem) before MERGE |
| pg_trickle.merge_work_mem_mb | 64 | work_mem (MB) applied when delta exceeds 10 000 rows and planner hints enabled |
| pg_trickle.use_prepared_statements | true | Use SQL PREPARE/EXECUTE for cached MERGE templates |

Data Flow: End-to-End Refresh

 Source Table INSERT/UPDATE/DELETE
           │
           ▼
 Hybrid CDC Layer:
   ┌─────────────────────────────────────────────┐
   │ TRIGGER mode: Row-Level AFTER Trigger        │
   │   pg_trickle_cdc_fn_<oid>() → buffer table    │
   │                                              │
   │ WAL mode: Logical Replication Slot           │
   │   wal_decoder bgworker → same buffer table   │
   │                                              │
   │ ST-to-ST: Refresh engine captures delta      │
   │   → changes_pgt_<pgt_id> buffer table        │
   └─────────────────────────────────────────────┘
           │
           ▼
 Change Buffer Table
   Base tables:   pgtrickle_changes.changes_<oid>
   ST sources:    pgtrickle_changes.changes_pgt_<pgt_id>
   Columns: change_id, lsn, action (I/U/D), pk_hash, new_<col>, old_<col> (typed)
           │
           ▼
 DVM Engine: generate delta SQL from operator tree
   - Scan operator reads from changes_<oid> or changes_pgt_<id>
   - Filter/Project/Join transform the deltas
   - Aggregate recomputes affected groups
           │
           ▼
 Diff Engine: produce (+/-) diff rows
           │
           ▼
 Delta Application:
   DELETE FROM storage WHERE __pgt_row_id IN (removed)
   INSERT INTO storage SELECT ... FROM (added)
           │
           ▼
 Frontier Update: advance per-source LSN
           │
           ▼
 History Record: log to pgtrickle.pgt_refresh_history

Module Map

src/
├── lib.rs           # Extension entry, module declarations, _PG_init
├── bin/
│   └── pgrx_embed.rs  # pgrx SQL entity embedding (generated)
├── api.rs           # SQL API functions (create/alter/drop/refresh/status)
├── catalog.rs       # Catalog CRUD operations
├── cdc.rs           # Change data capture (triggers + WAL transition)
├── config.rs        # GUC variable registration
├── dag.rs           # Dependency graph (cycle detection, SCC decomposition, topo sort)
├── error.rs         # Centralized error types
├── hash.rs          # xxHash row ID generation (pg_trickle_hash / pg_trickle_hash_multi)
├── hooks.rs         # DDL event trigger handlers (_on_ddl_end, _on_sql_drop)
├── ivm.rs           # Transactional IVM (IMMEDIATE mode: statement-level triggers)
├── shmem.rs         # Shared memory state (PgTrickleSharedState, DAG_REBUILD_SIGNAL, CACHE_GENERATION)
├── dvm/
│   ├── mod.rs       # DVM module root + recursive CTE orchestration
│   ├── parser.rs    # Query → OpTree converter (CTE extraction, subquery, window support)
│   ├── diff.rs      # Delta SQL generation (CTE delta cache)
│   ├── row_id.rs    # Row ID generation
│   └── operators/
│       ├── mod.rs           # Operator trait + registry
│       ├── scan.rs          # Table scan (CDC passthrough)
│       ├── filter.rs        # WHERE clause filtering
│       ├── project.rs       # Column projection
│       ├── join.rs          # Inner join
│       ├── join_common.rs   # Shared join utilities (snapshot subqueries, column disambiguation)
│       ├── outer_join.rs    # LEFT/RIGHT outer join
│       ├── full_join.rs     # FULL OUTER JOIN (8-part delta)
│       ├── aggregate.rs     # GROUP BY + aggregate functions (39 AggFunc variants)
│       ├── distinct.rs      # DISTINCT deduplication
│       ├── union_all.rs     # UNION ALL merging
│       ├── intersect.rs     # INTERSECT / INTERSECT ALL (dual-count LEAST)
│       ├── except.rs        # EXCEPT / EXCEPT ALL (dual-count GREATEST)
│       ├── subquery.rs      # Subquery / inlined CTE delegation
│       ├── cte_scan.rs      # Shared CTE delta (multi-reference)
│       ├── recursive_cte.rs # Recursive CTE (semi-naive + DRed + recomputation)
│       ├── window.rs        # Window function (partition recomputation)
│       ├── lateral_function.rs  # LATERAL SRF (row-scoped recomputation)
│       ├── lateral_subquery.rs  # LATERAL correlated subquery
│       ├── semi_join.rs     # EXISTS / IN subquery (semi-join delta)
│       ├── anti_join.rs     # NOT EXISTS / NOT IN subquery (anti-join delta)
│       └── scalar_subquery.rs   # Correlated scalar subquery in SELECT
├── monitor.rs       # Monitoring & observability functions
├── refresh.rs       # Refresh orchestration
├── scheduler.rs     # Automatic scheduling with canonical periods
├── version.rs       # Frontier / LSN tracking
└── wal_decoder.rs   # WAL-based CDC (logical replication slot polling, transitions)

Extension Control File (pg_trickle.control)

The pg_trickle.control file in the repository root is required by PostgreSQL's extension infrastructure. It declares the extension's description, default version, shared-library path, and privilege requirements. PostgreSQL reads this file when CREATE EXTENSION pg_trickle; is executed.

During packaging (cargo pgrx package), pgrx replaces the @CARGO_VERSION@ placeholder with the version from Cargo.toml and copies the file into the target's share/extension/ directory alongside the SQL migration scripts.

DVM Operators

This document describes the Differential View Maintenance (DVM) operators implemented by pgtrickle. Each operator transforms a stream of row-level changes (deltas) propagated from source tables through the operator tree.

Prior Art

  • Budiu, M. et al. (2023). "DBSP: Automatic Incremental View Maintenance." VLDB 2023. (comparison)
  • Gupta, A. & Mumick, I.S. (1999). Materialized Views: Techniques, Implementations, and Applications. MIT Press.
  • Koch, C. et al. (2014). "DBToaster: Higher-order Delta Processing for Dynamic, Frequently Fresh Views." VLDB Journal.
  • PostgreSQL 9.4+ — Materialized views with REFRESH MATERIALIZED VIEW CONCURRENTLY.

Overview

When a stream table is created, the defining SQL query is parsed into a tree of DVM operators. During a differential refresh, changes flow bottom-up through this tree:

         Aggregate
            │
         Project
            │
          Filter
            │
    ┌───────┴───────┐
   Join             │
  ┌─┴─┐            │
Scan(A) Scan(B)   Scan(C)

Each operator implements a differentiation rule: given the delta (Δ) to its input(s), it produces the corresponding delta to its output. This is conceptually similar to automatic differentiation in calculus.

The general contract:

  • Input: a set of ('+', row) and ('-', row) tuples (inserts and deletes)
  • Output: a set of ('+', row) and ('-', row) tuples

Updates are modeled as a delete of the old row followed by an insert of the new row.
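The contract above can be sketched as a toy Python model (illustrative only, not the extension's code): a delta is a list of ('+', row) / ('-', row) tuples, and an UPDATE is represented as a delete of the old row plus an insert of the new one.

```python
def update_as_delta(old_row, new_row):
    """Model an UPDATE as the pair of delta tuples the contract prescribes."""
    return [('-', old_row), ('+', new_row)]

def apply_delta(state, delta):
    """Apply a delta to a bag of rows (list), preserving duplicates."""
    state = list(state)
    for op, row in delta:
        if op == '+':
            state.append(row)
        else:
            state.remove(row)  # removes one occurrence (bag semantics)
    return state

rows = [(1, 'pending'), (2, 'shipped')]
delta = update_as_delta((1, 'pending'), (1, 'shipped'))
print(apply_delta(rows, delta))  # [(2, 'shipped'), (1, 'shipped')]
```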

DIFFERENTIAL and IMMEDIATE maintenance require deterministic expressions. VOLATILE functions such as random() or clock_timestamp() (and operators backed by them) are rejected during stream table creation, because re-evaluating them would corrupt delta semantics. STABLE functions such as now() and current_timestamp are allowed with a warning; FULL mode accepts all volatility classes because it recomputes the full result on each refresh.


Operator Support Matrix

The following table shows which SQL constructs are supported under each refresh mode.

SQL ConstructFULLDIFFERENTIALIMMEDIATENotes
Basic
Simple SELECT / projection
WHERE filter
Column expressions / aliases
DISTINCTUses __pgt_dup_count reference counting
DISTINCT ON
Joins
INNER JOINHybrid delta strategy
LEFT OUTER JOINNULL-padding transitions tracked
RIGHT OUTER JOIN
FULL OUTER JOIN8-part UNION ALL delta
CROSS JOIN
LATERAL JOINRow-scoped re-execution
Multi-table join (≤2 right scans)Full phantom-row-after-DELETE fix
Multi-table join (≥3 right scans)⚠️⚠️Falls back to post-change snapshot for right subtree (EC-01 boundary, fix planned for v0.12.0)
Subqueries
EXISTS / IN (semi-join)Delta-key pre-filter on left side
NOT EXISTS / NOT IN (anti-join)Inverted semantics; two-part delta
Scalar subquery (SELECT-list)Pre/post snapshot EXCEPT ALL diff
Correlated LATERAL subquery
Set Operations
UNION ALLDual-branch merge
INTERSECT / INTERSECT ALLDual-count tracking
EXCEPT / EXCEPT ALL
Aggregates
COUNT, SUM, AVGAlgebraic — fully invertible delta
MIN, MAXSemi-algebraic — group rescan on ambiguous delete
COUNT(DISTINCT), SUM(DISTINCT)Algebraic via auxiliary columns
BOOL_AND, BOOL_OR, BIT_AND, BIT_ORAlgebraic via auxiliary columns
EVERYAlgebraic via auxiliary columns
STRING_AGG, ARRAY_AGG⚠️⚠️Group-rescan strategy — warning emitted at creation time in DIFFERENTIAL mode
STDDEV, VARIANCE, STDDEV_POP, VAR_POPAlgebraic via auxiliary M2/sum/count columns
COVAR_SAMP, COVAR_POP, CORRAlgebraic via auxiliary columns
REGR_* (all 9 regression functions)Algebraic via auxiliary columns
PERCENTILE_CONT, PERCENTILE_DISC⚠️⚠️Group-rescan strategy
MODE⚠️⚠️Group-rescan strategy
XMLAGG, JSON_AGG, JSONB_AGG⚠️⚠️Group-rescan strategy
JSON_OBJECT_AGG, JSONB_OBJECT_AGG⚠️⚠️Group-rescan strategy
GROUP BY / HAVING
GROUP BY ROLLUP / CUBE / GROUPING SETSBranch count capped by max_grouping_set_branches (default 64)
Window Functions
ROW_NUMBER, RANK, DENSE_RANKPartition-scoped recompute
LAG, LEAD, FIRST_VALUE, LAST_VALUEPartition-scoped recompute
NTILE, CUME_DIST, PERCENT_RANKPartition-scoped recompute
Window frame clauses (ROWS, RANGE, GROUPS)
CTEs
Non-recursive WITHInlined or delta-cached (multi-ref)
WITH RECURSIVE (INSERT-only workload)Semi-naive evaluation
WITH RECURSIVE (mixed INSERT/DELETE/UPDATE)Delete-and-Rederive (DRed) strategy
TopK
ORDER BY … LIMIT NScoped recomputation; metadata validated each refresh
ORDER BY … LIMIT N OFFSET M
Lateral / SRF
LATERAL with set-returning functionRow-scoped re-execution
JSON_TABLEVia lateral function operator
generate_series()
unnest()
ST-to-ST Dependencies
Stream table reading from another stream tableDifferential via changes_pgt_ buffers (v0.11.0); FULL upstream produces I/D diff so downstream stays differential
Multi-level ST chainsTopological order; per-level delta propagation
Function Volatility
IMMUTABLE functions
STABLE functions (now(), current_timestamp)⚠️⚠️Allowed with warning — value may differ between initial load and delta evaluation
VOLATILE functions (random(), clock_timestamp())Rejected at creation time — re-evaluation corrupts delta semantics

Legend: ✅ = fully supported — ⚠️ = supported with caveats (see Notes column) — ❌ = not supported (blocked at creation time)


Operators

Scan

Module: src/dvm/operators/scan.rs

The leaf operator. Reads CDC changes from a source table's change buffer.

Delta Rule:

$$\Delta(\text{Scan}(R)) = \Delta R$$

The scan operator is a direct passthrough — inserts in the source become inserts in the output, deletes become deletes.

SQL Generation:

SELECT op, row_data FROM pgtrickle_changes.changes_<oid>
WHERE xid >= <last_consumed_xid>

Notes:

  • Each source table has a dedicated change buffer table created by the CDC module.
  • Row data is stored as JSONB with column names as keys.
  • The __pgt_row_id column (xxHash of primary key) is included for deduplication.

Filter

Module: src/dvm/operators/filter.rs

Applies a WHERE clause predicate to the delta stream.

Delta Rule:

$$\Delta(\sigma_p(R)) = \sigma_p(\Delta R)$$

Filtering is applied to the deltas in the same way as to the base data — only rows satisfying the predicate pass through.

SQL Generation:

SELECT * FROM (<input_delta>) AS d
WHERE <predicate>

Example:

If the defining query is:

SELECT * FROM orders WHERE status = 'shipped'

And a new row (id=5, status='pending') is inserted, it does not appear in the delta output. If (id=3, status='shipped') is inserted, it passes through.

Edge Cases:

  • For updates that change the predicate column (e.g., status moving from 'pending' to 'shipped'), CDC produces a delete of the old row and an insert of the new row. The filter passes the insert (the new row matches) and drops the delete (the old row does not match), correctly yielding a net insert.
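This edge case can be checked with a toy Python model of the rule Δ(σ_p(R)) = σ_p(ΔR) (illustrative only, not extension code): an update that moves a row across the predicate boundary arrives as delete(old) + insert(new), and filtering the delta yields the correct net effect.

```python
def filter_delta(delta, predicate):
    # Δ(σ_p(R)) = σ_p(ΔR): keep only delta tuples whose row satisfies p
    return [(op, row) for op, row in delta if predicate(row)]

shipped = lambda row: row['status'] == 'shipped'

# UPDATE: status 'pending' -> 'shipped', modeled as delete + insert
delta = [('-', {'id': 5, 'status': 'pending'}),
         ('+', {'id': 5, 'status': 'shipped'})]

print(filter_delta(delta, shipped))
# [('+', {'id': 5, 'status': 'shipped'})]  (a net insert into the view)
```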

Project

Module: src/dvm/operators/project.rs

Applies column projection from the target list.

Delta Rule:

$$\Delta(\pi_L(R)) = \pi_L(\Delta R)$$

Projects the same columns from the delta that the query projects from the base data.

SQL Generation:

SELECT <target_columns> FROM (<input_delta>) AS d

Notes:

  • Projection is applied after filtering for efficiency.
  • Computed expressions in the target list (e.g., price * quantity AS total) are evaluated on the delta rows.

Join (Inner)

Module: src/dvm/operators/join.rs

Implements inner join between two inputs.

Delta Rule:

For $R \bowtie S$:

$$\Delta(R \bowtie S) = (\Delta R \bowtie S) \cup (R' \bowtie \Delta S)$$

Where $R' = R \cup \Delta R$ (the new state of R after applying deltas).

In practice, when only one side has changes (common case), the delta join simplifies to joining the changed rows against the current state of the other side.
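The bilinear rule can be sanity-checked with a toy, insert-only Python model (illustrative only, not extension code): applying Δ(R ⋈ S) = (ΔR ⋈ S) ∪ (R' ⋈ ΔS) matches recomputing the join from scratch and diffing.

```python
def join(R, S):
    # rows are (key, payload); equi-join on the key
    return sorted((k, a, b) for k, a in R for k2, b in S if k == k2)

R, dR = [(1, 'r1')], [(2, 'r2')]            # ΔR: new rows in R
S, dS = [(1, 's1'), (2, 's2')], [(1, 's3')]  # ΔS: new rows in S
R2, S2 = R + dR, S + dS                      # post-change states

# bilinear rule: ΔR joined with old S, plus new R' joined with ΔS
delta = sorted(join(dR, S) + join(R2, dS))

# ground truth: diff of full recomputation (no duplicates in this example)
truth = sorted(set(join(R2, S2)) - set(join(R, S)))
assert delta == truth
print(delta)  # [(1, 'r1', 's3'), (2, 'r2', 's2')]
```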

SQL Generation:

-- Changes to left side joined with current right side
SELECT '+' AS op, l.*, r.*
FROM (<left_delta> WHERE op = '+') AS l
JOIN <right_table> AS r ON <join_condition>

UNION ALL

-- Current left side joined with changes to right side
SELECT '+' AS op, l.*, r.*
FROM <left_table> AS l
JOIN (<right_delta> WHERE op = '+') AS r ON <join_condition>

(And corresponding DELETE queries for op = '-'.)

Notes:

  • The join uses the current state of the non-changed side, not the change buffer.
  • For equi-joins, this is efficient — the join key narrows the scan.
  • Non-equi joins (theta joins) may require broader scans.

Outer Join

Module: src/dvm/operators/outer_join.rs (LEFT JOIN), src/dvm/operators/full_join.rs (FULL JOIN)

Implements LEFT, RIGHT, and FULL OUTER JOIN.

RIGHT JOIN Handling:

RIGHT JOIN is automatically converted to a LEFT JOIN with swapped left/right operands during query parsing. This normalization happens transparently — the user can write RIGHT JOIN and the parser rewrites it to an equivalent LEFT JOIN before the operator tree is constructed.

Delta Rule:

Similar to inner join, but additionally handles NULL-padded rows:

$$\Delta(R \text{ LEFT JOIN } S) = (\Delta R \bowtie_L S) \cup (R' \bowtie_L \Delta S)$$

With special handling for:

  • Rows in ΔR that have no match in S → emit ('+', row, NULLs)
  • Rows in ΔS that create a first match for an R row → emit ('-', row, NULLs) and ('+', row, s_data)
  • Rows in ΔS that remove the last match for an R row → emit ('-', row, s_data) and ('+', row, NULLs)

SQL Generation (LEFT JOIN):

Uses anti-join detection (via NOT EXISTS) to correctly handle the NULL padding transitions.
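The first-match transition can be shown with a toy Python model (illustrative only, not extension code): when ΔS adds the first match for a left row, the delta retracts the NULL-padded row and inserts the matched one.

```python
def left_join(R, S):
    # rows are (key, payload); unmatched left rows get a None (NULL) pad
    out = []
    for k, a in R:
        matches = [b for k2, b in S if k2 == k]
        if matches:
            out.extend((k, a, b) for b in matches)
        else:
            out.append((k, a, None))
    return sorted(out, key=str)

def diff(old, new):
    # symmetric diff as ('+'/'-', row) tuples (no duplicates here)
    return sorted([('-', r) for r in old if r not in new] +
                  [('+', r) for r in new if r not in old], key=str)

R = [(1, 'r1'), (2, 'r2')]
S = [(1, 's1')]
S2 = S + [(2, 's2')]   # ΔS creates the first match for left row 2

print(diff(left_join(R, S), left_join(R, S2)))
# [('+', (2, 'r2', 's2')), ('-', (2, 'r2', None))]
```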

FULL OUTER JOIN Delta Rule:

FULL OUTER JOIN extends the LEFT JOIN delta with symmetric right-side handling. The delta is computed as an 8-part UNION ALL:

  1. Parts 1–5: Same as LEFT JOIN delta (inserted/deleted rows from both sides, with NULL-padding transitions)
  2. Parts 6–7: Symmetric anti-join transitions for the right side (rows in ΔL that remove/create the last/first match for an S row)
  3. Part 8: Right-side insertions that have no match in the left side → emit ('+', NULLs, s_data)

Each part uses pre-computed delta flags (__has_ins_*, __has_del_*) to efficiently detect first-match/last-match transitions without redundant subqueries.

Nested Join Support:

Module: src/dvm/operators/join_common.rs

All join operators (inner, left, full) support nested children — i.e., a join whose left or right operand is itself another join. The join_common module provides shared helpers:

  • build_snapshot_sql() — returns the table reference for simple (Scan) operands, or a parenthesized subquery with disambiguated columns for nested join operands
  • rewrite_join_condition() — rewrites column references in ON conditions to use the correct alias prefixes for nested children (e.g., o.cust_id → dl.o__cust_id)

This enables queries with 3 or more joined tables, e.g.:

SELECT o.id, c.name, p.title
FROM orders o
JOIN customers c ON o.cust_id = c.id
JOIN products p ON o.prod_id = p.id

Limitations:

  • FULL OUTER JOIN delta computation can be expensive due to dual-side NULL tracking (8 UNION ALL parts).
  • Performance degrades with high-cardinality join keys.
  • NATURAL JOIN is supported — common columns are resolved automatically and synthesized into an explicit equi-join condition.
  • EC-01 pre-change snapshot boundary (SF-5): The phantom-row-after-DELETE fix (EC-01) uses EXCEPT ALL to reconstruct the pre-change state of a join side. This is limited to join subtrees with ≤ 2 scan nodes to avoid PostgreSQL temporary file exhaustion on wide join chains. For queries with ≥ 3 base tables on one side of a join (e.g. TPC-H Q7/Q8/Q9), a simultaneous DELETE on both join sides may leave a phantom row in the stream table until the next full refresh. See use_pre_change_snapshot() in join_common.rs for the full rationale.

Aggregate

Module: src/dvm/operators/aggregate.rs

Handles GROUP BY with aggregate functions (COUNT, SUM, AVG, MIN, MAX, BOOL_AND, BOOL_OR, STRING_AGG, ARRAY_AGG, JSON_AGG, JSONB_AGG, BIT_AND, BIT_OR, BIT_XOR, JSON_OBJECT_AGG, JSONB_OBJECT_AGG, STDDEV_POP, STDDEV_SAMP, VAR_POP, VAR_SAMP, MODE, PERCENTILE_CONT, PERCENTILE_DISC, JSON_ARRAYAGG, JSON_OBJECTAGG) and the FILTER (WHERE …) and WITHIN GROUP (ORDER BY …) clauses.

Delta Rule:

$$\Delta(\gamma_{G, \text{agg}}(R)) = \gamma_{G, \text{agg}}(R' \text{ WHERE } G \in \text{affected\_keys}) - \gamma_{G, \text{agg}}(R \text{ WHERE } G \in \text{affected\_keys})$$

Where:

  • $G$ = grouping columns
  • affected_keys = the set of group key values that appear in ΔR
  • $R'$ = $R \cup \Delta R$ (the new state)

Strategy:

  1. Identify affected groups — Collect all group key values that appear in the delta (either inserted or deleted rows).
  2. Recompute old values — Query the storage table for current aggregate values of affected groups.
  3. Recompute new values — Query the updated source for new aggregate values of affected groups.
  4. Diff — For each affected group:
    • If old exists and new differs → emit ('-', old) and ('+', new)
    • If old exists and new is gone → emit ('-', old) (group eliminated)
    • If no old and new exists → emit ('+', new) (new group appeared)
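The four steps above can be walked through with a toy Python model of SELECT key, SUM(val) ... GROUP BY key (illustrative only, not extension code): only groups whose key appears in the delta are re-aggregated and diffed.

```python
from collections import defaultdict

def aggregate(rows):
    # SELECT key, SUM(val) FROM rows GROUP BY key
    out = defaultdict(int)
    for k, v in rows:
        out[k] += v
    return dict(out)

def aggregate_delta(old_rows, new_rows, delta):
    affected = {k for _, (k, _) in delta}                 # step 1: affected keys
    old = {k: v for k, v in aggregate(old_rows).items() if k in affected}
    new = {k: v for k, v in aggregate(new_rows).items() if k in affected}
    out = []
    for k in affected:                                    # step 4: diff per group
        if k in old and k in new and old[k] != new[k]:
            out += [('-', (k, old[k])), ('+', (k, new[k]))]
        elif k in old and k not in new:
            out.append(('-', (k, old[k])))                # group eliminated
        elif k not in old and k in new:
            out.append(('+', (k, new[k])))                # new group appeared
    return sorted(out, key=str)

old_rows = [('a', 1), ('a', 2), ('b', 5)]
delta = [('+', ('a', 4)), ('-', ('b', 5))]
new_rows = [('a', 1), ('a', 2), ('a', 4)]
print(aggregate_delta(old_rows, new_rows, delta))
# [('+', ('a', 7)), ('-', ('a', 3)), ('-', ('b', 5))]
```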

Supported Aggregate Functions:

| Function | DVM Strategy | Notes |
| --- | --- | --- |
| COUNT(*) | Algebraic | Fully differential |
| COUNT(expr) | Algebraic | Fully differential |
| SUM(expr) | Algebraic | Fully differential |
| AVG(expr) | Algebraic | Decomposed to SUM/COUNT internally |
| MIN(expr) | Semi-algebraic | Uses LEAST merge; falls back to per-group rescan when min row is deleted |
| MAX(expr) | Semi-algebraic | Uses GREATEST merge; falls back to per-group rescan when max row is deleted |
| BOOL_AND(expr) | Group-rescan | Affected groups are re-aggregated from source data |
| BOOL_OR(expr) | Group-rescan | Affected groups are re-aggregated from source data |
| STRING_AGG(expr, sep) | Group-rescan | Affected groups are re-aggregated from source data |
| ARRAY_AGG(expr) | Group-rescan | Affected groups are re-aggregated from source data |
| JSON_AGG(expr) | Group-rescan | Affected groups are re-aggregated from source data |
| JSONB_AGG(expr) | Group-rescan | Affected groups are re-aggregated from source data |
| BIT_AND(expr) | Group-rescan | Affected groups are re-aggregated from source data |
| BIT_OR(expr) | Group-rescan | Affected groups are re-aggregated from source data |
| BIT_XOR(expr) | Group-rescan | Affected groups are re-aggregated from source data |
| JSON_OBJECT_AGG(key, value) | Group-rescan | Affected groups are re-aggregated from source data |
| JSONB_OBJECT_AGG(key, value) | Group-rescan | Affected groups are re-aggregated from source data |
| STDDEV_POP(expr) / STDDEV(expr) | Group-rescan | Affected groups are re-aggregated from source data |
| STDDEV_SAMP(expr) | Group-rescan | Affected groups are re-aggregated from source data |
| VAR_POP(expr) | Group-rescan | Affected groups are re-aggregated from source data |
| VAR_SAMP(expr) / VARIANCE(expr) | Group-rescan | Affected groups are re-aggregated from source data |
| MODE() WITHIN GROUP (ORDER BY expr) | Group-rescan | Ordered-set aggregate; affected groups re-aggregated |
| PERCENTILE_CONT(frac) WITHIN GROUP (ORDER BY expr) | Group-rescan | Ordered-set aggregate; affected groups re-aggregated |
| PERCENTILE_DISC(frac) WITHIN GROUP (ORDER BY expr) | Group-rescan | Ordered-set aggregate; affected groups re-aggregated |
| CORR(Y, X) | Group-rescan | Regression aggregate; affected groups re-aggregated |
| COVAR_POP(Y, X) | Group-rescan | Regression aggregate; affected groups re-aggregated |
| COVAR_SAMP(Y, X) | Group-rescan | Regression aggregate; affected groups re-aggregated |
| REGR_AVGX(Y, X) | Group-rescan | Regression aggregate; affected groups re-aggregated |
| REGR_AVGY(Y, X) | Group-rescan | Regression aggregate; affected groups re-aggregated |
| REGR_COUNT(Y, X) | Group-rescan | Regression aggregate; affected groups re-aggregated |
| REGR_INTERCEPT(Y, X) | Group-rescan | Regression aggregate; affected groups re-aggregated |
| REGR_R2(Y, X) | Group-rescan | Regression aggregate; affected groups re-aggregated |
| REGR_SLOPE(Y, X) | Group-rescan | Regression aggregate; affected groups re-aggregated |
| REGR_SXX(Y, X) | Group-rescan | Regression aggregate; affected groups re-aggregated |
| REGR_SXY(Y, X) | Group-rescan | Regression aggregate; affected groups re-aggregated |
| REGR_SYY(Y, X) | Group-rescan | Regression aggregate; affected groups re-aggregated |
| ANY_VALUE(expr) | Group-rescan | PostgreSQL 16+; affected groups re-aggregated |
| JSON_ARRAYAGG(expr ...) | Group-rescan | SQL-standard JSON aggregation (PostgreSQL 16+); full deparsed SQL preserved |
| JSON_OBJECTAGG(key: value ...) | Group-rescan | SQL-standard JSON aggregation (PostgreSQL 16+); full deparsed SQL preserved |
| User-defined aggregates (CREATE AGGREGATE) | Group-rescan | Any custom aggregate is supported via group-rescan; full aggregate call SQL preserved verbatim |

FILTER Clause:

All aggregate functions support the FILTER (WHERE …) clause:

SELECT COUNT(*) FILTER (WHERE status = 'active') AS active_count FROM orders GROUP BY region

The filter predicate is applied within the delta computation — only rows matching the filter contribute to the aggregate delta. Filtered aggregates are excluded from the P5 direct-bypass optimization.

SQL Generation:

The aggregate operator uses a 3-CTE pipeline:

  1. Merge CTE — Joins affected group keys against old (storage) and new (source) aggregate values, producing __pgt_meta_action ('I' for new-only groups, 'D' for disappeared groups, 'U' for changed groups).
  2. LATERAL VALUES expansion — A single-pass LATERAL (VALUES ...) clause expands each merge row into insert and delete actions, avoiding a 4-branch UNION ALL:
FROM merge_cte m,
LATERAL (VALUES
    ('I', m.new_count, m.new_total),
    ('D', m.old_count, m.old_total)
) v(action, count_val, val_total)
WHERE (m.__pgt_meta_action = 'I' AND v.action = 'I')
   OR (m.__pgt_meta_action = 'D' AND v.action = 'D')
   OR (m.__pgt_meta_action = 'U')
  3. Final projection — Emits ('+', row) and ('-', row) tuples for the refresh engine.

MIN/MAX Merge Strategy:

MIN and MAX use a semi-algebraic strategy with two cases:

  1. Non-extremum deletion — When the deleted row is NOT the current minimum (or maximum), the merge uses LEAST(old_value, new_inserts) for MIN or GREATEST(old_value, new_inserts) for MAX. This is fully algebraic and requires no rescan.

  2. Extremum deletion — When the row holding the current minimum (or maximum) IS deleted, the new value cannot be computed from the delta alone. The merge expression returns NULL as a sentinel, which triggers the change-detection guard (IS DISTINCT FROM) to emit the group for re-aggregation. The MERGE layer treats this as a DELETE + INSERT pair, recomputing the group from source data. This is still more efficient than a full table refresh since only affected groups are rescanned.
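Both cases can be modeled with a small Python sketch (illustrative only, not extension code): non-extremum deletes merge algebraically via min(), while deleting the current minimum returns None as a sentinel that signals a group rescan.

```python
def merge_min(old_min, deleted_vals, inserted_vals):
    """Toy model of the semi-algebraic MIN merge for one group."""
    if old_min is not None and old_min in deleted_vals:
        return None                      # extremum deleted: rescan the group
    candidates = [old_min] + list(inserted_vals)
    return min(v for v in candidates if v is not None)

print(merge_min(3, [7], [5]))   # 3 (algebraic path: LEAST-style merge)
print(merge_min(3, [3], [5]))   # None (sentinel: re-aggregate the group)
```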


Distinct

Module: src/dvm/operators/distinct.rs

Implements SELECT DISTINCT using reference counting.

Delta Rule:

$$\Delta(\delta(R)) = \{ r \in \Delta R : \text{count}(r, R) = 0 \land \text{count}(r, R') > 0 \} - \{ r \in \Delta R : \text{count}(r, R) > 0 \land \text{count}(r, R') = 0 \}$$

In other words:

  • A row enters the output when its count transitions from 0 to ≥1
  • A row leaves the output when its count transitions from ≥1 to 0

Strategy:

Maintains a hidden __pgt_dup_count column in the storage table to track how many times each distinct row appears in the pre-distinct input.

  1. On insert: increment count. If count was 0, emit ('+', row).
  2. On delete: decrement count. If count becomes 0, emit ('-', row).
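The counting scheme can be sketched in Python (illustrative only, not extension code; the counts dict stands in for the hidden __pgt_dup_count column): a row surfaces in the DISTINCT output only when its count crosses 0.

```python
def distinct_delta(counts, delta):
    """Toy reference-counted DISTINCT: emit only 0-boundary transitions."""
    out = []
    for op, row in delta:
        before = counts.get(row, 0)
        counts[row] = before + (1 if op == '+' else -1)
        if op == '+' and before == 0:
            out.append(('+', row))       # count 0 -> 1: row enters output
        if op == '-' and counts[row] == 0:
            out.append(('-', row))       # count 1 -> 0: row leaves output
    return out

counts = {}
print(distinct_delta(counts, [('+', 'x'), ('+', 'x'), ('-', 'x')]))
# [('+', 'x')]  (the second insert and one delete cancel out)
print(distinct_delta(counts, [('-', 'x')]))
# [('-', 'x')]  (count hits 0, row leaves)
```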

Notes:

  • The duplicate count is not visible in user queries against the storage table (projected away by the view layer).
  • Duplicate counting uses __pgt_row_id (xxHash) for efficient lookups.

Union All

Module: src/dvm/operators/union_all.rs

Merges deltas from two branches.

Delta Rule:

$$\Delta(R \cup_{\text{all}} S) = \Delta R \cup_{\text{all}} \Delta S$$

Simply concatenates the delta streams from both branches.

SQL Generation:

SELECT * FROM (<left_delta>) AS l
UNION ALL
SELECT * FROM (<right_delta>) AS r

Notes:

  • Column count and types must match between branches.
  • Each branch is independently processed through its own operator sub-tree.
  • This is the simplest operator since UNION ALL preserves all duplicates.

Intersect

Module: src/dvm/operators/intersect.rs

Implements INTERSECT and INTERSECT ALL using dual-count per-branch multiplicity tracking.

Delta Rule:

$$\Delta(R \cap S): \text{emit rows where } \min(\text{count}_L, \text{count}_R) \text{ crosses the 0 boundary}$$

  • INTERSECT (set): a row is present when both branches contain it.
  • INTERSECT ALL (bag): a row appears $\min(\text{count}_L, \text{count}_R)$ times.

SQL Generation (3-CTE chain):

  1. Delta CTE — tags rows from left/right child deltas with branch indicator ('L'/'R') and computes per-row net_count.
  2. Merge CTE — joins with the storage table to compute old and new per-branch counts (__pgt_count_l, __pgt_count_r).
  3. Final CTE — detects boundary crossings using LEAST(old_count_l, old_count_r) vs LEAST(new_count_l, new_count_r).

Notes:

  • Storage table requires hidden columns __pgt_count_l and __pgt_count_r for multiplicity tracking.
  • Both set and bag variants share the same 3-CTE structure and the same LEAST-based effective count; they differ only in the multiplicity emitted when the effective count changes.

Except

Module: src/dvm/operators/except.rs

Implements EXCEPT and EXCEPT ALL using dual-count per-branch multiplicity tracking.

Delta Rule:

$$\Delta(R - S): \text{emit rows where } \max(0, \text{count}_L - \text{count}_R) \text{ crosses the 0 boundary}$$

  • EXCEPT (set): a row is present when it exists in the left but not the right branch.
  • EXCEPT ALL (bag): a row appears $\max(0, \text{count}_L - \text{count}_R)$ times.
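The dual-count rule for both set operators reduces to two effective-count functions, sketched here in Python (illustrative only, not extension code): INTERSECT ALL uses LEAST(count_l, count_r), EXCEPT ALL uses GREATEST(0, count_l - count_r), and a row's membership changes when the effective count crosses 0 (for ALL variants, whenever it changes).

```python
def intersect_all(cl, cr):
    # effective multiplicity under INTERSECT ALL: LEAST(count_l, count_r)
    return min(cl, cr)

def except_all(cl, cr):
    # effective multiplicity under EXCEPT ALL: GREATEST(0, count_l - count_r)
    return max(0, cl - cr)

# a row counted twice on the left, once on the right
print(intersect_all(2, 1))   # 1 copy survives INTERSECT ALL
print(except_all(2, 1))      # 1 copy survives EXCEPT ALL

# a right-side insert (count_r 1 -> 2) drives EXCEPT membership to 0:
# the boundary crossing means the refresh emits a delete for that row
print(except_all(2, 1), except_all(2, 2))  # 1 0
```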

SQL Generation (3-CTE chain):

  1. Delta CTE — same as Intersect: tags rows from both child deltas with branch indicator.
  2. Merge CTE — joins with storage table for old/new per-branch counts.
  3. Final CTE — detects boundary crossings using GREATEST(0, old_count_l - old_count_r) vs GREATEST(0, new_count_l - new_count_r).

Notes:

  • EXCEPT is not commutative — left branch is the positive input, right is subtracted.
  • Storage table requires hidden columns __pgt_count_l and __pgt_count_r.
  • Same 3-CTE structure as Intersect with different effective-count function.

Subquery

Module: src/dvm/operators/subquery.rs

Handles both inlined CTEs and explicit subqueries in the FROM clause (FROM (SELECT ...) AS alias).

Delta Rule:

$$\Delta(\rho_{\text{alias}}(Q)) = \rho_{\text{alias}}(\Delta Q)$$

A subquery wrapper is transparent for differentiation — it delegates to its child's delta and optionally renames output columns to match the subquery's column aliases.

SQL Generation:

-- If column aliases differ from child output columns:
SELECT __pgt_row_id, __pgt_action, child_col1 AS alias_col1, child_col2 AS alias_col2
FROM (<child_delta>)

If the child columns already match the aliases, the subquery is a pure passthrough — no additional CTE is emitted.

Notes:

  • This operator enables both CTE support (Tier 1) and standalone subqueries in FROM.
  • Column aliases on subqueries (FROM (...) AS x(a, b)) are handled by emitting a thin renaming CTE.
  • The subquery body is fully differentiated as a normal operator sub-tree.

CTE Scan (Shared Delta)

Module: src/dvm/operators/cte_scan.rs

Handles multi-reference CTEs by computing the CTE body's delta once and reusing it across all references (Tier 2).

Delta Rule:

$$\Delta(\text{CteScan}(\text{id}, Q)) = \text{cache}[\text{id}] \quad \text{(computed once, reused)}$$

When a CTE is referenced multiple times in a query, each reference produces a CteScan node with the same cte_id. The diff engine differentiates the CTE body once and caches the result. Subsequent CteScan nodes for the same CTE reuse the cached delta.

SQL Generation:

-- First reference: differentiates the CTE body and stores result in cache
-- Subsequent references: point to the same system CTE name
SELECT __pgt_row_id, __pgt_action, <columns>
FROM __pgt_cte_<cte_name>_delta  -- shared across all references

If column aliases are present, a thin renaming CTE is added on top of the cached delta.

Notes:

  • Without CteScan (Tier 1), multi-reference CTEs are inlined: each reference duplicates the full operator sub-tree. CteScan (Tier 2) eliminates this duplication.
  • The CTE body is pre-differentiated in dependency order (earlier CTEs before later ones that reference them).
  • Column alias support follows the same pattern as the Subquery operator.

Recursive CTEs

Recursive CTEs (WITH RECURSIVE) are supported in FULL, DIFFERENTIAL, and IMMEDIATE modes, with different execution paths depending on the refresh mode:

FULL Mode

Recursive CTEs work out-of-the-box with refresh_mode = 'FULL'. The defining query is executed as-is via INSERT INTO ... SELECT ..., and PostgreSQL handles the iterative evaluation internally.

DIFFERENTIAL Mode (Three-Strategy Incremental Maintenance)

Recursive CTEs with refresh_mode = 'DIFFERENTIAL' use an automatic three-strategy approach, selected based on column compatibility and change type:

Strategy 1: Semi-Naive Evaluation (INSERT-only changes)

When only INSERT changes are present in the change buffer, pg_trickle uses semi-naive evaluation — the standard technique for incremental fixpoint computation. The base case is differentiated normally through the DVM operator tree, then the resulting delta is propagated through the recursive term using a nested WITH RECURSIVE:

WITH RECURSIVE
  __pgt_base_delta AS (
    -- Normal DVM differentiation of the base case (INSERT rows only)
    <differentiated base case>
  ),
  __pgt_rec_delta AS (
    -- Seed: base case delta rows
    SELECT cols FROM __pgt_base_delta WHERE __pgt_action = 'I'
    UNION ALL
    -- Seed: new base rows joining existing ST storage
    SELECT cols FROM <recursive term with self_ref = ST_storage, base = change_buffer>
    UNION ALL
    -- Propagation: recursive term applied to growing delta
    SELECT cols FROM <recursive term with self_ref = __pgt_rec_delta, base = full>
  )
SELECT pgtrickle.pg_trickle_hash(...) AS __pgt_row_id, 'I' AS __pgt_action, cols
FROM __pgt_rec_delta

The cost is proportional to the number of new rows produced by the change, not the full result set.
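The seed/propagation structure above can be mirrored in a toy Python model of reachability tc = edges ∪ (tc ∘ edges) (illustrative only, not extension code): inserted edges are seeded into a frontier, and each round joins only the frontier against the edge relation.

```python
def seminaive_insert(tc, edges, new_edges):
    """Toy semi-naive propagation of inserted edges into a closure."""
    tc, edges = set(tc), set(edges) | set(new_edges)
    # seed: the new edges themselves, plus existing paths extended by them
    frontier = set(new_edges) | {(a, d) for a, b in tc
                                 for c, d in new_edges if b == c}
    while frontier:
        tc |= frontier
        # propagation: join only the frontier against the edge relation
        frontier = {(a, c) for a, b in frontier
                    for b2, c in edges if b == b2} - tc
    return tc

edges = {(1, 2), (2, 3)}
tc = seminaive_insert(set(), set(), edges)      # initial load
print(sorted(tc))        # [(1, 2), (1, 3), (2, 3)]

tc2 = seminaive_insert(tc, edges, {(3, 4)})     # incremental insert of (3, 4)
print(sorted(tc2 - tc))  # [(1, 4), (2, 4), (3, 4)]
```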

Strategy 2: Delete-and-Rederive / DRed (mixed INSERT/DELETE/UPDATE changes)

When the change buffer contains DELETE or UPDATE changes, simple propagation is insufficient — a deleted base row may have transitively derived many recursive rows, some of which may still be derivable from alternative paths. DRed handles this in four phases:

  1. Insert propagation — semi-naive evaluation for the INSERT portion (same as Strategy 1)
  2. Over-deletion cascade — propagate base-case deletions through the recursive term against ST storage to find all transitively-derived rows that might be invalidated
  3. Rederivation — re-execute the recursive CTE from the remaining (non-deleted) base rows to restore any over-deleted rows that have alternative derivations
  4. Combine — final delta = inserts + (over-deletions − rederived rows)

This avoids full recomputation while correctly handling deletions with alternative derivation paths.
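The four phases can be traced on a tiny Python reachability example (illustrative only, not extension code) where one fact has two derivations: over-deletion removes it, rederivation restores it.

```python
def derive(edges):
    """Naive full fixpoint: transitive closure of an edge set."""
    tc = set(edges)
    while True:
        new = {(a, c) for a, b in tc for b2, c in tc if b == b2} - tc
        if not new:
            return tc
        tc |= new

edges = {(1, 2), (2, 3), (1, 3)}   # (1,3) is both a base edge and a derived path
tc = derive(edges)
deleted = {(2, 3)}
remaining = edges - deleted

# phase 2: over-deletion cascade; any closure fact with a derivation
# touching an over-deleted fact is provisionally removed
od = set(deleted)
while True:
    suspects = ({(a, c) for a, b in od for b2, c in edges if b == b2} |
                {(a, c) for a, b in edges for b2, c in od if b == b2}) & tc
    if suspects <= od:
        break
    od |= suspects

# phase 3: rederivation restores over-deleted facts still derivable
rederived = derive(remaining) & od

# phase 4: combine
new_tc = (tc - od) | rederived
print(sorted(od))       # [(1, 3), (2, 3)]  (1,3) was over-deleted
print(sorted(new_tc))   # [(1, 2), (1, 3)]  (1,3) survives via its base edge
```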

IMMEDIATE Mode

Recursive CTEs with refresh_mode = 'IMMEDIATE' use the same semi-naive and Delete-and-Rederive machinery as DIFFERENTIAL mode, but the base changes come from PostgreSQL statement transition tables instead of the background change buffer. This keeps the stream table transactionally up to date within the same statement. To guard against cyclic data or unexpectedly deep recursion, the semi-naive SQL injects a depth counter capped by pg_trickle.ivm_recursive_max_depth (default 100; set to 0 to disable the guard).

Strategy 3: Recomputation Fallback

When the CTE defines more columns than the outer SELECT projects (column mismatch), the incremental strategies cannot be used because the ST storage table lacks columns needed for recursive self-joins. In this case, the full defining query is re-executed and anti-joined against current storage:

WITH __pgt_recomp_new AS (
    SELECT pgtrickle.pg_trickle_hash(row_to_json(sub)::text) AS __pgt_row_id, col1, col2, ...
    FROM (<defining_query>) sub
),
__pgt_recomp_ins AS (
    SELECT n.__pgt_row_id, 'I'::text AS __pgt_action, n.col1, n.col2, ...
    FROM __pgt_recomp_new n
    LEFT JOIN <storage_table> s ON s.__pgt_row_id = n.__pgt_row_id
    WHERE s.__pgt_row_id IS NULL
),
__pgt_recomp_del AS (
    SELECT s.__pgt_row_id, 'D'::text AS __pgt_action, s.col1, s.col2, ...
    FROM <storage_table> s
    LEFT JOIN __pgt_recomp_new n ON n.__pgt_row_id = s.__pgt_row_id
    WHERE n.__pgt_row_id IS NULL
)
SELECT * FROM __pgt_recomp_ins
UNION ALL
SELECT * FROM __pgt_recomp_del

The cost is proportional to the full result set size.

Strategy Selection
| CTE columns match ST? | Change type | refresh_mode / DeltaSource | Strategy |
| --- | --- | --- | --- |
| ✅ Match | INSERT-only | DIFFERENTIAL (ChangeBuffer) | Semi-naive (Strategy 1) |
| ✅ Match | Mixed (INSERT+DELETE/UPDATE) | DIFFERENTIAL (ChangeBuffer) | DRed (Strategy 2) |
| ✅ Match | INSERT-only | IMMEDIATE (TransitionTable) | Semi-naive (Strategy 1) |
| ✅ Match | Mixed (INSERT+DELETE/UPDATE) | IMMEDIATE (TransitionTable) | DRed (Strategy 2) |
| ❌ Mismatch | Any | Any | Recomputation (Strategy 3) |

DRed in DIFFERENTIAL mode (P2-1 -- implemented in v0.10.0)

DRed is now active in both DIFFERENTIAL and IMMEDIATE modes when CTE output columns match ST storage columns. Phase 1 propagates inserts via semi-naive evaluation; Phase 2 cascades deletions through ST storage; Phase 3 rederives over-deleted rows that have alternative derivation paths; Phase 4 combines the results. DRed correctly handles derived-column changes such as path rebuilds under a renamed ancestor node. Column-mismatch cases still use recomputation fallback.

Notes:

  • Non-linear recursion (multiple self-references in the recursive term) is rejected — PostgreSQL restricts the recursive term to reference the CTE at most once.
  • The __pgt_row_id column (xxHash of the JSON-serialized row) is used for row identity.
  • For write-heavy workloads on very large recursive result sets with frequent mixed changes, refresh_mode = 'FULL' may still be more efficient than DRed.

Window Functions

Module: src/dvm/operators/window.rs

Handles window functions (ROW_NUMBER, RANK, DENSE_RANK, SUM() OVER, etc.) using partition-based recomputation.

Delta Rule:

When any row in a partition changes (insert, update, or delete), the entire partition's window function output is recomputed:

$$\Delta(\omega_{f,P}(R)) = \omega_{f,P}(R'|_{\text{affected partitions}}) - \omega_{f,P}(R|_{\text{affected partitions}})$$

Where $P$ is the PARTITION BY key and $f$ is the window function.

Strategy:

  1. Identify affected partition keys from the child delta.
  2. Delete old window function results for affected partitions from storage.
  3. Build the current input for affected partitions by excluding changed rows via NOT EXISTS on pass-through columns.
  4. Recompute the window function on the current input for affected partitions.
  5. Compute unique row IDs via row_to_json + row_number (handles tied values in ranking functions).
  6. Emit the recomputed rows as inserts.
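The partition-scoped delta can be sketched in Python, using RANK as the example window function. This simulates the multiset semantics only; the real operator emits the CTE chain shown below:

```python
from collections import defaultdict

def rank_partition(rows, key, order):
    """Recompute RANK() OVER (PARTITION BY key ORDER BY order) for an input."""
    out = []
    parts = defaultdict(list)
    for r in rows:
        parts[r[key]].append(r)
    for part in parts.values():
        part.sort(key=lambda r: r[order])
        rank, prev = 0, object()
        for i, r in enumerate(part, 1):
            if r[order] != prev:           # ties share a rank
                rank, prev = i, r[order]
            out.append({**r, 'rank': rank})
    return out

def window_delta(old_rows, new_rows, key, order, changed_keys):
    """Partition-scoped delta: delete every old output row in an affected
    partition, re-run the window function over the partition's new input,
    and emit the recomputed rows as inserts."""
    deletes = [('D', r) for r in rank_partition(old_rows, key, order)
               if r[key] in changed_keys]
    affected = [r for r in new_rows if r[key] in changed_keys]
    inserts = [('I', r) for r in rank_partition(affected, key, order)]
    return deletes + inserts
```

A single insert into partition `'a'` replaces only that partition's output; partition `'b'` contributes nothing to the delta.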

SQL Generation:

-- CTE 1: Affected partition keys from delta
WITH affected_partitions AS (
    SELECT DISTINCT <partition_cols> FROM (<child_delta>)
),
-- CTE 2: Current input (surviving rows not in delta) for affected partitions
current_input AS (
    SELECT * FROM <child_snapshot>
    WHERE (<partition_cols>) IN (SELECT * FROM affected_partitions)
    AND NOT EXISTS (
        SELECT 1 FROM (<child_delta>) d
        WHERE d.<col1> IS NOT DISTINCT FROM <child_alias>.<col1>
        AND   d.<col2> IS NOT DISTINCT FROM <child_alias>.<col2> ...
    )
),
-- CTE 3: Recompute window function with unique row IDs
recomputed AS (
    SELECT *, pgtrickle.pg_trickle_hash(
        row_to_json(w)::text || '/' || row_number() OVER ()::text
    ) AS __pgt_row_id
    FROM (
        SELECT *, <window_func> OVER (PARTITION BY <partition_cols> ORDER BY <order_cols>) AS <alias>
        FROM current_input
    ) w
)
-- Delete old results + insert recomputed results
SELECT 'D' AS __pgt_action, ...  -- old rows from affected partitions
UNION ALL
SELECT 'I' AS __pgt_action, ...  -- recomputed rows

Notes:

  • The cost is proportional to the size of affected partitions, not the full table. For workloads where changes spread across few partitions, this is efficient.
  • When multiple window functions use different PARTITION BY clauses, the parser accepts all of them. If they share the same partition key it is used directly; otherwise the operator falls back to un-partitioned (full) recomputation.
  • Without PARTITION BY, the entire table is treated as a single partition — any change triggers a full recomputation.
  • Window functions wrapping aggregates (e.g., RANK() OVER (ORDER BY SUM(x))) are supported: the window diff rewrites ORDER BY / PARTITION BY expressions to reference aggregate output aliases via build_agg_alias_map.
  • Row IDs are computed from the full row content (row_to_json) plus a positional disambiguator (row_number) to avoid hash collisions with tied ranking values (DENSE_RANK, RANK).

Known Limitation: O(partition_size) Recomputation Cost

Any single-row change within a window partition triggers recomputation of the entire partition. For queries with large partitions (e.g., PARTITION BY region where a region has 500K rows), a single INSERT into that partition causes all 500K rows to be recomputed and diffed. This is inherent to the partition-based delta strategy — window functions cannot be incrementally maintained at sub-partition granularity because a single row insertion can shift the rank, row number, or running aggregate of every other row in the same partition.

Mitigation strategies:

  • Use more granular PARTITION BY keys to keep partition sizes small.
  • For queries without PARTITION BY, consider restructuring as a GROUP BY aggregate if the window function is equivalent (e.g., SUM(x) OVER () → SUM(x) as a scalar subquery).
  • Accept the cost for low-change-frequency partitions; the recomputation is still cheaper than a full table refresh since only affected partitions are touched.
  • If partition sizes routinely exceed 100K rows and changes are frequent, consider the FULL refresh mode which bypasses the per-partition delta entirely.

Window Frame Clauses:

Window frame specifications are fully supported:

  • Modes: ROWS, RANGE, GROUPS
  • Bounds: UNBOUNDED PRECEDING, N PRECEDING, CURRENT ROW, N FOLLOWING, UNBOUNDED FOLLOWING
  • Between syntax: BETWEEN <start> AND <end>
  • Exclusion: EXCLUDE CURRENT ROW, EXCLUDE GROUP, EXCLUDE TIES, EXCLUDE NO OTHERS

Example: SUM(val) OVER (ORDER BY ts ROWS BETWEEN 3 PRECEDING AND CURRENT ROW)

Named WINDOW Clauses:

Named window definitions are resolved from the query-level WINDOW clause:

SELECT id, SUM(val) OVER w, AVG(val) OVER w
FROM data
WINDOW w AS (PARTITION BY category ORDER BY ts)

The parser resolves OVER w by looking up the window definition from the WINDOW clause and merging partition, order, and frame specifications.


Lateral Function (Set-Returning Functions in FROM)

Module: src/dvm/operators/lateral_function.rs

Handles set-returning functions (SRFs) used in the FROM clause with implicit LATERAL semantics: jsonb_array_elements, jsonb_each, jsonb_each_text, unnest, etc.

Delta Rule:

When a source row changes (insert, update, or delete), the SRF expansion is re-evaluated only for that source row:

$$\Delta(R \ltimes f(R.\text{col})) = (R' \ltimes f(R'.\text{col}))|_{\text{changed rows}} - (R \ltimes f(R.\text{col}))|_{\text{changed rows}}$$

Where $R$ is the source table, $f$ is the SRF, and changed rows are identified via the child delta.

Strategy (Row-Scoped Recomputation):

  1. Propagate the child delta to identify changed source rows.
  2. Find all existing ST rows derived from changed source rows (via column matching).
  3. Delete old SRF expansions for those source rows.
  4. Re-expand the SRF for inserted/updated source rows.
  5. Emit deletes + inserts as the final delta.
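The five steps above can be simulated in Python with a miniature `unnest`. The shapes (`id`, `tags` columns, a flat list as storage) are illustrative, not pg_trickle's internals:

```python
def srf_delta(child_delta, storage):
    """Row-scoped recomputation for `FROM t, LATERAL unnest(t.tags) AS tag`.
    `child_delta` is a list of (action, source_row) pairs; `storage` holds
    the currently materialized expansions."""
    changed_ids = {row['id'] for _, row in child_delta}
    # CTE 2 analogue: delete every stored expansion for a changed source row.
    deletes = [('D', r) for r in storage if r['id'] in changed_ids]
    # CTE 3 analogue: re-expand the SRF for inserted/updated source rows only.
    inserts = [('I', {'id': row['id'], 'tag': t})
               for action, row in child_delta if action == 'I'
               for t in row['tags']]
    return deletes + inserts

storage = [{'id': 1, 'tag': 'x'}, {'id': 1, 'tag': 'y'}, {'id': 2, 'tag': 'z'}]
# Source row 1 was updated: the delta carries D(old) then I(new).
delta = srf_delta([('D', {'id': 1, 'tags': ['x', 'y']}),
                   ('I', {'id': 1, 'tags': ['x', 'q']})], storage)
```

Only row 1's expansions are touched; row 2's expansion survives untouched in storage.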

SQL Generation (4-CTE chain):

-- CTE 1: Changed source rows from child delta
WITH lat_changed AS (
    SELECT DISTINCT "__pgt_row_id", "__pgt_action", <child_cols>
    FROM <child_delta>
),
-- CTE 2: Old ST rows for changed source rows (to be deleted)
lat_old AS (
    SELECT st."__pgt_row_id", st.<all_output_cols>
    FROM <st_table> st
    WHERE EXISTS (
        SELECT 1 FROM lat_changed cs
        WHERE st.<col1> IS NOT DISTINCT FROM cs.<col1>
          AND st.<col2> IS NOT DISTINCT FROM cs.<col2>
          ...
    )
),
-- CTE 3: Re-expand SRF for inserted/updated source rows
lat_expand AS (
    SELECT pg_trickle_hash(<all_cols>::text) AS "__pgt_row_id",
           cs.<child_cols>, <srf_alias>.<srf_cols>
    FROM lat_changed cs,
         LATERAL <srf_function>(cs.<arg>) AS <srf_alias>
    WHERE cs."__pgt_action" = 'I'
),
-- CTE 4: Final delta
lat_final AS (
    SELECT "__pgt_row_id", 'D' AS "__pgt_action", <cols> FROM lat_old
    UNION ALL
    SELECT "__pgt_row_id", 'I' AS "__pgt_action", <cols> FROM lat_expand
)

Row Identity:

Content-based: hash(child_columns || srf_result_columns). This is stable as long as the same source row produces the same expanded values.

Supported SRFs:

| Function | Output Columns | Notes |
|---|---|---|
| jsonb_array_elements(jsonb) | value (jsonb) | Expands JSONB array to rows |
| jsonb_array_elements_text(jsonb) | value (text) | Text variant |
| jsonb_each(jsonb) | key (text), value (jsonb) | Expands JSONB object to key-value pairs |
| jsonb_each_text(jsonb) | key (text), value (text) | Text variant |
| unnest(anyarray) | Element type | Unnests PostgreSQL arrays |
| Custom SRFs | User-provided column aliases | AS alias(col1, col2) |

Notes:

  • The cost is proportional to the number of changed source rows × average SRF expansion size, not the full table.
  • WITH ORDINALITY is supported — adds a bigint ordinality column to the output.
  • ROWS FROM() with multiple functions is not supported (rejected at parse time).
  • Column aliases (e.g., AS child(value)) are used to determine output column names; for known SRFs without aliases, the alias name becomes the column name.
  • JSON_TABLE (PostgreSQL 17+) — JSON_TABLE(expr, path COLUMNS (...)) is modeled as a LateralFunction and uses the same row-scoped recomputation strategy. Supported column types: regular, EXISTS, formatted, and nested columns with ON ERROR/ON EMPTY behaviors and PASSING clauses.

Lateral Subquery (Correlated Subqueries in FROM)

Module: src/dvm/operators/lateral_subquery.rs

Handles correlated subqueries used in the FROM clause with explicit or implicit LATERAL semantics: FROM t, LATERAL (SELECT ... WHERE ref = t.col) AS alias or FROM t LEFT JOIN LATERAL (...) AS alias ON true.

Delta Rule:

When an outer row changes, the correlated subquery is re-executed only for that row:

$$\Delta(R \ltimes Q(R)) = (R' \ltimes Q(R'))|_{\text{changed rows}} - (R \ltimes Q(R))|_{\text{changed rows}}$$

Where $R$ is the outer table, $Q(R)$ is the correlated subquery, and changed rows are identified via the child delta.

Strategy (Row-Scoped Recomputation):

  1. Propagate the child delta to identify changed outer rows.
  2. Find all existing ST rows derived from changed outer rows (via column matching with IS NOT DISTINCT FROM).
  3. Delete old subquery expansions for those outer rows.
  4. Re-execute the subquery for inserted/updated outer rows using the original outer alias.
  5. Emit deletes + inserts as the final delta.
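For the common Top-N-per-group pattern, the row-scoped strategy looks like this in Python. All names (`orders`, `line_items`, `top_price`) are hypothetical, chosen only to mirror a `LATERAL ... ORDER BY price DESC LIMIT 1` subquery:

```python
def top1_delta(outer_delta, storage, line_items):
    """Row-scoped recomputation for:
    FROM orders o, LATERAL (SELECT li.price FROM line_items li
         WHERE li.order_id = o.id ORDER BY li.price DESC LIMIT 1) t"""
    changed = {o['id'] for _, o in outer_delta}
    deletes = [('D', r) for r in storage if r['order_id'] in changed]
    inserts = []
    for action, o in outer_delta:
        if action != 'I':
            continue
        # Re-execute the correlated subquery for this outer row only;
        # ORDER BY ... LIMIT 1 applies per outer row.
        best = sorted((li for li in line_items if li['order_id'] == o['id']),
                      key=lambda li: -li['price'])[:1]
        inserts += [('I', {'order_id': o['id'], 'top_price': li['price']})
                    for li in best]
    return deletes + inserts
```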

SQL Generation (4-CTE chain):

-- CTE 1: Changed outer rows from child delta
WITH lat_sq_changed AS (
    SELECT DISTINCT "__pgt_row_id", "__pgt_action", <child_cols>
    FROM <child_delta>
),
-- CTE 2: Old ST rows for changed outer rows (to be deleted)
lat_sq_old AS (
    SELECT st."__pgt_row_id", st.<all_output_cols>
    FROM <st_table> st
    WHERE EXISTS (
        SELECT 1 FROM lat_sq_changed cs
        WHERE st.<col1> IS NOT DISTINCT FROM cs.<col1>
          AND st.<col2> IS NOT DISTINCT FROM cs.<col2>
          ...
    )
),
-- CTE 3: Re-execute subquery for inserted/updated outer rows
lat_sq_expand AS (
    SELECT pg_trickle_hash(<all_cols>::text) AS "__pgt_row_id",
           <outer_alias>.<child_cols>, <sub_alias>.<sub_cols>
    FROM lat_sq_changed AS <outer_alias>,      -- Original outer alias!
         LATERAL (<subquery_sql>) AS <sub_alias>
    WHERE <outer_alias>."__pgt_action" = 'I'
),
-- CTE 4: Final delta
lat_sq_final AS (
    SELECT "__pgt_row_id", 'D' AS "__pgt_action", <cols> FROM lat_sq_old
    UNION ALL
    SELECT "__pgt_row_id", 'I' AS "__pgt_action", <cols> FROM lat_sq_expand
)

LEFT JOIN LATERAL Handling:

For queries using LEFT JOIN LATERAL (...) ON true, the expand CTE uses LEFT JOIN LATERAL instead of comma syntax and wraps subquery columns in COALESCE for hash stability:

lat_sq_expand AS (
    SELECT pg_trickle_hash(<outer_cols>::text || '/' || COALESCE(<sub_cols>::text, '')) AS "__pgt_row_id",
           <outer_alias>.<child_cols>, <sub_alias>.<sub_cols>
    FROM lat_sq_changed AS <outer_alias>
    LEFT JOIN LATERAL (<subquery_sql>) AS <sub_alias> ON true
    WHERE <outer_alias>."__pgt_action" = 'I'
)

Row Identity:

Content-based: hash(outer_columns || '/' || subquery_result_columns). For LEFT JOIN with NULL results, COALESCE ensures a stable hash.

Supported Patterns:

| Pattern | Syntax | Notes |
|---|---|---|
| Top-N per group | LATERAL (SELECT ... ORDER BY ... LIMIT N) | Most common use case |
| Correlated aggregate | LATERAL (SELECT SUM(x) FROM t WHERE t.fk = p.pk) | Returns single row per outer row |
| Existence with data | LEFT JOIN LATERAL (...) ON true | Preserves outer rows with NULLs |
| Multi-column lookup | LATERAL (SELECT a, b FROM t WHERE t.fk = p.pk LIMIT 1) | Multiple derived values |
| GROUP BY inside subquery | LATERAL (SELECT type, COUNT(*) FROM t WHERE t.fk = p.pk GROUP BY type) | Multiple rows per outer row |

Key Design Decision: Outer Alias Rewriting

The subquery body contains column references to the outer table (e.g., WHERE li.order_id = o.id). In the expansion CTE, the changed-sources CTE is aliased with the original outer table alias (e.g., lat_sq_changed AS o) so that the subquery's column references resolve naturally without rewriting.

Notes:

  • The cost is proportional to the number of changed outer rows × average subquery result size, not the full table.
  • The subquery is stored as raw SQL (like LateralFunction) because it cannot be independently differentiated — it depends on outer row context.
  • Source table OIDs referenced by the subquery body are extracted at parse time for CDC trigger setup.
  • ORDER BY + LIMIT inside the subquery are valid (they apply per-outer-row, not to the stream table).

Semi-Join (EXISTS / IN Subquery)

Module: src/dvm/operators/semi_join.rs

Handles WHERE EXISTS (SELECT ... FROM ...) and WHERE col IN (SELECT ...) patterns. The parser transforms these into a SemiJoin operator with a left (outer) child, a right (inner) child, and a join condition.

Delta Rule:

$$\Delta(L \ltimes R) = \Delta L|_{R} + L|_{\Delta R \text{ causes existence change}}$$

  • Part 1: Outer rows that changed and still satisfy the semi-join condition.
  • Part 2: Existing outer rows whose semi-join result flipped due to inner changes (a matching inner row was inserted or deleted).

Strategy (Two-Part Delta):

  1. Part 1 (outer delta): Filter delta_left to rows that have at least one match in the current right-hand snapshot.
  2. Part 2 (inner delta): For each row in the left snapshot, check whether the existence of matching right-hand rows changed between the old and current state. Emit 'I' if a match appeared, 'D' if all matches disappeared.

The "old" right-hand state is reconstructed from the current state by reversing the delta: R_old = (R_current EXCEPT ALL delta_right(action='I')) UNION ALL delta_right(action='D').

Row Identity:

  • Part 1: Uses __pgt_row_id from the left delta.
  • Part 2: Content-based hash via pg_trickle_hash_multi on left-side columns.

Supported Patterns:

| Pattern | SQL | Notes |
|---|---|---|
| EXISTS | WHERE EXISTS (SELECT 1 FROM t WHERE t.fk = s.pk) | Direct semi-join |
| IN (subquery) | WHERE id IN (SELECT fk FROM t) | Rewritten to EXISTS with equality |
| Multiple conditions | WHERE EXISTS (... AND ...) | Additional predicates in subquery WHERE |

Anti-Join (NOT EXISTS / NOT IN Subquery)

Module: src/dvm/operators/anti_join.rs

Handles WHERE NOT EXISTS (SELECT ... FROM ...) and WHERE col NOT IN (SELECT ...) patterns. The inverse of the semi-join operator.

Delta Rule:

$$\Delta(L \triangleright R) = \Delta L|_{\neg R} + L|_{\Delta R \text{ causes existence change}}$$

  • Part 1: Outer rows that changed and have no match in the right-hand snapshot.
  • Part 2: Existing outer rows whose anti-join result flipped due to inner changes.

Strategy (Two-Part Delta):

  1. Part 1 (outer delta): Filter delta_left to rows with NOT EXISTS in the current right snapshot.
  2. Part 2 (inner delta): For each row in the left snapshot, detect existence changes. Emit 'D' if a match appeared (row no longer qualifies), 'I' if all matches disappeared (row now qualifies).

Note the inverted semantics compared to semi-join: a new match means deletion, losing all matches means insertion.

Row Identity: Same as semi-join.

Supported Patterns:

| Pattern | SQL | Notes |
|---|---|---|
| NOT EXISTS | WHERE NOT EXISTS (SELECT 1 FROM t WHERE t.fk = s.pk) | Direct anti-join |
| NOT IN (subquery) | WHERE id NOT IN (SELECT fk FROM t) | Rewritten to NOT EXISTS with equality |

Scalar Subquery (Correlated SELECT Subquery)

Module: src/dvm/operators/scalar_subquery.rs

Handles scalar subqueries appearing in the SELECT list, e.g., SELECT a, (SELECT max(x) FROM t) AS mx FROM s. The subquery must return exactly one row and one column.

Delta Rule:

$$\Delta(L \times q) = \Delta L \times q' + L \times (q' - q)$$

Where $q$ is the scalar subquery value and $q'$ is the updated value.

Strategy (Two-Part Delta):

  1. Part 1 (outer delta): Propagate the child delta, appending the current scalar subquery value to each row.
  2. Part 2 (scalar value change): When the scalar subquery's result changes, emit deletes for all existing outer rows (with the old scalar value) and re-inserts for all outer rows (with the new value). The old scalar value is reconstructed by reversing the inner delta.
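The two parts can be sketched as a Python function. This is a simplification (here `source_rows` plays the role of the source snapshot without the scalar column; the column name `mx` is illustrative):

```python
def scalar_subquery_delta(child_delta, source_rows, old_val, new_val):
    """Two-part delta for `SELECT a, (SELECT max(x) FROM t) AS mx FROM s`.
    Part 1: tag child-delta rows with the current scalar value.
    Part 2: if the scalar changed, delete every outer row with the old
    value and re-insert it with the new one."""
    out = [(a, {**row, 'mx': new_val}) for a, row in child_delta]
    if old_val != new_val:  # IS DISTINCT FROM in the generated SQL
        out += [('D', {**row, 'mx': old_val}) for row in source_rows]
        out += [('I', {**row, 'mx': new_val}) for row in source_rows]
    return out
```

When the scalar value is unchanged, Part 2 contributes nothing and only the child delta propagates; when it changes, every outer row is rewritten, which is why this delta can be large.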

SQL Generation (3 or 4 CTEs):

-- Part 1: child delta + current scalar value
WITH sq_outer AS (
    SELECT *, (<scalar_subquery>) AS "<alias>"
    FROM <child_delta>
),
-- Part 2a: DELETE all outer rows when scalar changed
sq_del AS (
    SELECT "__pgt_row_id", 'D' AS "__pgt_action", <cols>
    FROM <st_table>
    WHERE (<scalar_old>) IS DISTINCT FROM (<scalar_current>)
),
-- Part 2b: INSERT all outer rows with new scalar value
sq_ins AS (
    SELECT pg_trickle_hash_multi(...) AS "__pgt_row_id",
           'I' AS "__pgt_action", <cols>, (<scalar_current>) AS "<alias>"
    FROM <source_snapshot>
    WHERE (<scalar_old>) IS DISTINCT FROM (<scalar_current>)
)
-- Final: UNION ALL of all parts
SELECT * FROM sq_outer
UNION ALL SELECT * FROM sq_del
UNION ALL SELECT * FROM sq_ins

Row Identity:

  • Part 1: __pgt_row_id from the child delta.
  • Part 2: Content-based hash via pg_trickle_hash_multi on all output columns.

Notes:

  • The scalar subquery is stored as raw SQL (deparsed from the parse tree).
  • The old scalar value is approximated using the same EXCEPT ALL / UNION ALL reversal technique as semi/anti-join.
  • If the scalar subquery references a table that changes, all outer rows must be re-evaluated — the delta can be large.
  • Source OIDs used by the scalar subquery are captured at parse time for CDC trigger registration.

Operator Tree Construction

The DVM engine builds the operator tree by analyzing the parsed query:

  1. WITH clause → CTE definitions extracted into a name→body map (non-recursive) or CTE registry (multi-reference)
  2. FROM clause → Scan nodes for physical tables; Subquery nodes for inlined CTEs and subqueries in FROM; CteScan nodes for multi-reference CTEs; LateralFunction nodes for SRFs and JSON_TABLE in FROM; LateralSubquery nodes for correlated subqueries in FROM
  3. JOIN → Join or OuterJoin wrapping two sub-trees
  4. LATERAL SRFs → LateralFunction wrapping the left-hand FROM item as its child
  5. LATERAL subqueries → LateralSubquery wrapping the left-hand FROM item as its child (comma syntax or JOIN LATERAL)
  6. WHERE subqueries → SemiJoin for EXISTS/IN (subquery), AntiJoin for NOT EXISTS/NOT IN (subquery), extracted from the WHERE clause
  7. Scalar subqueries → ScalarSubquery for (SELECT ...) in the SELECT list, wrapping the child tree
  8. WHERE → Filter wrapping the scan/join tree (remaining non-subquery predicates)
  9. SELECT list → Project for column selection and expressions
  10. GROUP BY → Aggregate wrapping the filtered/projected tree
  11. DISTINCT → Distinct on top
  12. UNION ALL → UnionAll combining two complete sub-trees
  13. INTERSECT / EXCEPT → Intersect or Except combining two sub-trees with dual-count tracking
  14. Window functions → Window wrapping the sub-tree with PARTITION BY / ORDER BY metadata
  15. ORDER BY → silently discarded (storage row order is undefined)
  16. LIMIT / OFFSET → ORDER BY + LIMIT [+ OFFSET] is accepted as TopK (scoped recomputation); standalone LIMIT or OFFSET without ORDER BY is rejected

For recursive CTEs (WITH RECURSIVE), the query is parsed into an OpTree with RecursiveCte operator nodes. In DIFFERENTIAL mode, the strategy (semi-naive, DRed, or recomputation) is selected automatically based on column compatibility and change type — see the Recursive CTEs section above for details.

The tree is then traversed bottom-up during delta generation: each operator's generate_delta_sql() method composes its SQL fragment around the output of its child operator(s).
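The bottom-up composition can be illustrated with a toy operator tree in Python. The class and table names (`pgt_changes_<table>`) are hypothetical; the point is how each node wraps its child's delta SQL:

```python
class Scan:
    """Leaf operator: its delta is the change buffer for the table."""
    def __init__(self, table):
        self.table = table
    def generate_delta_sql(self):
        return f"SELECT * FROM pgt_changes_{self.table}"

class Filter:
    """Linear operator: the delta passes through the predicate unchanged."""
    def __init__(self, child, pred):
        self.child, self.pred = child, pred
    def generate_delta_sql(self):
        return f"SELECT * FROM ({self.child.generate_delta_sql()}) f WHERE {self.pred}"

class Project:
    """Projects the child's delta onto the output column list."""
    def __init__(self, child, cols):
        self.child, self.cols = child, cols
    def generate_delta_sql(self):
        return f"SELECT {self.cols} FROM ({self.child.generate_delta_sql()}) p"

# SELECT id, status FROM orders WHERE status = 'active'
tree = Project(Filter(Scan("orders"), "status = 'active'"), "id, status")
sql = tree.generate_delta_sql()
```

Each `generate_delta_sql()` call nests its child's fragment as a derived table, so the final string mirrors the operator tree's shape.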


Further Reading

pg_trickle — Benchmark Guide

This document explains how the database-level refresh benchmarks work and how to interpret their output.


Overview

The benchmark suite in tests/e2e_bench_tests.rs measures wall-clock refresh time for FULL vs DIFFERENTIAL mode across a matrix of table sizes, change rates, and query complexities. Each benchmark spawns an isolated PostgreSQL 18.x container via Testcontainers, ensuring reproducible and interference-free measurements.

The core question the benchmarks answer:

How much faster is a DIFFERENTIAL refresh compared to a FULL refresh, given a specific workload?


Prerequisites

Build the E2E test Docker image before running any benchmarks:

./tests/build_e2e_image.sh

Docker must be running on the host.


Running Benchmarks

All benchmark tests are tagged #[ignore] so they are skipped during normal CI. The --nocapture flag is required to see the printed output tables.

Quick Spot Checks (~5–10 seconds each)

# Simple scan, 10K rows, 1% change rate
cargo test --test e2e_bench_tests --features pg18 -- --ignored --nocapture bench_scan_10k_1pct

# Aggregate query, 100K rows, 1% change rate
cargo test --test e2e_bench_tests --features pg18 -- --ignored --nocapture bench_aggregate_100k_1pct

# Join + aggregate, 100K rows, 10% change rate
cargo test --test e2e_bench_tests --features pg18 -- --ignored --nocapture bench_join_agg_100k_10pct

Zero-Change Latency (~5 seconds)

cargo test --test e2e_bench_tests --features pg18 -- --ignored --nocapture bench_no_data_refresh_latency

Full Matrix (~15–30 minutes)

Runs all 30 combinations and prints a consolidated summary:

cargo test --test e2e_bench_tests --features pg18 -- --ignored --nocapture bench_full_matrix

Run All Benchmarks in Parallel

cargo test --test e2e_bench_tests --features pg18 -- --ignored --nocapture

Note: each test starts its own container, so parallel execution requires sufficient Docker resources.


Benchmark Dimensions

Table Sizes

| Size | Rows | Purpose |
|---|---|---|
| Small | 10,000 | Fast iteration; measures per-row overhead |
| Medium | 100,000 | More realistic; reveals scaling characteristics |

Change Rates

| Rate | Description |
|---|---|
| 1% | Low churn — the sweet spot for incremental refresh |
| 10% | Moderate churn — tests delta query scalability |
| 50% | High churn — stress test; approaches full-refresh cost |

Query Complexities

| Scenario | Defining Query | Operators Tested |
|---|---|---|
| scan | SELECT id, region, category, amount, score FROM src | Table scan only |
| filter | SELECT id, region, amount FROM src WHERE amount > 5000 | Scan + filter (WHERE) |
| aggregate | SELECT region, SUM(amount), COUNT(*) FROM src GROUP BY region | Scan + group-by aggregate |
| join | SELECT s.id, s.region, s.amount, d.region_name FROM src s JOIN dim d ON ... | Scan + inner join |
| join_agg | SELECT d.region_name, SUM(s.amount), COUNT(*) FROM src s JOIN dim d ON ... GROUP BY ... | Scan + join + aggregate |

DML Mix per Cycle

Each change cycle applies a realistic mix of operations:

| Operation | Fraction | Example at 10K rows, 10% rate |
|---|---|---|
| UPDATE | 70% | 700 rows have amount incremented |
| DELETE | 15% | 150 rows removed |
| INSERT | 15% | 150 new rows added |

What Each Benchmark Does

1. Start a fresh PostgreSQL 18.x container
2. Install the pg_trickle extension
3. Create and populate the source table (10K or 100K rows)
4. Create dimension table if needed (for join scenarios)
5. ANALYZE for stable query plans

── FULL mode ──
6. Create a Stream Table in FULL refresh mode
7. For each of 3 cycles:
   a. Apply random DML (updates + deletes + inserts)
   b. ANALYZE
   c. Time the FULL refresh (TRUNCATE + re-execute entire query)
   d. Record refresh_ms and ST row count
8. Drop the FULL-mode ST

── DIFFERENTIAL mode ──
9. Reset source table to same starting state
10. Create a Stream Table in DIFFERENTIAL refresh mode
11. For each of 3 cycles:
    a. Apply random DML (same parameters)
    b. ANALYZE
    c. Time the DIFFERENTIAL refresh (delta query + MERGE)
    d. Record refresh_ms and ST row count

12. Print results table and summary

Both modes start from the same data to ensure a fair comparison. The 3-cycle design captures warm-up effects (cycle 1 may be slower due to plan caching).


Reading the Output

Detail Table

╔══════════════════════════════════════════════════════════════════════════════════════╗
║                    pg_trickle Refresh Benchmark Results                      ║
╠════════════╤══════════╤════════╤═════════════╤═══════╤════════════╤═════════════════╣
║ Scenario   │ Rows     │ Chg %  │ Mode        │ Cycle │ Refresh ms │ ST Rows         ║
╠════════════╪══════════╪════════╪═════════════╪═══════╪════════════╪═════════════════╣
║ aggregate  │    10000 │     1% │ FULL        │     1 │       22.1 │               5 ║
║ aggregate  │    10000 │     1% │ FULL        │     2 │        4.8 │               5 ║
║ aggregate  │    10000 │     1% │ FULL        │     3 │        5.3 │               5 ║
║ aggregate  │    10000 │     1% │ DIFFERENTIAL │     1 │        8.4 │               5 ║
║ aggregate  │    10000 │     1% │ DIFFERENTIAL │     2 │        4.4 │               5 ║
║ aggregate  │    10000 │     1% │ DIFFERENTIAL │     3 │        4.6 │               5 ║
╚════════════╧══════════╧════════╧═════════════╧═══════╧════════════╧═════════════════╝
| Column | Meaning |
|---|---|
| Scenario | Query complexity level (scan, filter, aggregate, join, join_agg) |
| Rows | Number of rows in the base table |
| Chg % | Percentage of rows changed per cycle |
| Mode | FULL (truncate + recompute) or DIFFERENTIAL (delta + merge) |
| Cycle | Which of the 3 measurement rounds (cycle 1 often includes warm-up) |
| Refresh ms | Wall-clock time for the refresh operation |
| ST Rows | Row count in the Stream Table after refresh (sanity check) |

Summary Table

┌─────────────────────────────────────────────────────────────────────────┐
│                        Summary (avg ms per cycle)                       │
├────────────┬──────────┬────────┬─────────────────┬──────────────────────┤
│ Scenario   │ Rows     │ Chg %  │ FULL avg ms     │ DIFFERENTIAL avg ms   │
├────────────┼──────────┼────────┼─────────────────┼──────────────────────┤
│ aggregate  │    10000 │     1% │       10.7       │        5.8 (  1.8x) │
└────────────┴──────────┴────────┴─────────────────┴──────────────────────┘

The Speedup value in parentheses is FULL avg / DIFFERENTIAL avg — how many times faster the incremental refresh is compared to a full refresh.


Interpreting the Speedup

What to Expect

| Change Rate | Table Size | Expected Speedup | Explanation |
|---|---|---|---|
| 1% | 10K | 1.5–5x | Small table; overhead is similar, delta is tiny |
| 1% | 100K | 5–50x | Larger table amplifies full-refresh cost |
| 10% | 100K | 2–10x | Moderate delta; still significantly faster |
| 50% | any | 1–2x | Delta is nearly as large as full table |

Rules of Thumb

| Speedup | Interpretation |
|---|---|
| > 10x | Strong win for DIFFERENTIAL — typical at low change rates on larger tables |
| 5–10x | Clear advantage for DIFFERENTIAL |
| 2–5x | Moderate advantage — DIFFERENTIAL is the right choice |
| 1–2x | Marginal gain — either mode is acceptable |
| ~1x | Break-even — change rate is too high for incremental to help |
| < 1x | DIFFERENTIAL is slower — would indicate overhead exceeds savings (investigate) |

Key Patterns to Look For

  1. Scaling with table size: For the same change rate, speedup should increase with table size. FULL must re-process all rows; DIFFERENTIAL processes only the delta.

  2. Degradation with change rate: As the change rate rises from 1% → 50%, speedup should decrease. At 50%, DIFFERENTIAL processes half the table, which approaches FULL cost.

  3. Query complexity amplifies speedup: Aggregate and join queries benefit more from DIFFERENTIAL because they avoid expensive re-computation. A join_agg at 1% changes should show higher speedup than a simple scan at the same parameters.

  4. Cycle 1 warm-up: The first cycle in each mode may be slower due to PostgreSQL plan cache population. Use cycles 2–3 for the steadiest numbers.

  5. ST Rows consistency: The ST row count should be similar between FULL and DIFFERENTIAL for the same scenario (accounting for random DML). Large discrepancies indicate a correctness issue.


Zero-Change Latency

The bench_no_data_refresh_latency test measures the overhead of a refresh when no data has changed — the NO_DATA code path.

┌──────────────────────────────────────────────┐
│ NO_DATA Refresh Latency (10 iterations)      │
├──────────────────────────────────────────────┤
│ Avg:     3.21 ms                             │
│ Max:     5.10 ms                             │
│ Target: < 10 ms                              │
│ Status: ✅ PASS                              │
└──────────────────────────────────────────────┘
| Metric | Meaning |
|---|---|
| Avg | Average wall-clock time across 10 no-op refreshes |
| Max | Worst-case single iteration |
| Target | The PLAN.md goal: < 10 ms per no-op refresh |
| Status | PASS if avg < 10 ms, SLOW otherwise |

A passing result confirms the scheduler's per-cycle overhead is negligible. Values > 10 ms in containerized environments may be acceptable due to Docker overhead; bare-metal PostgreSQL should comfortably meet the target.


Available Tests

Individual Tests (10K rows)

| Test Name | Scenario | Change Rate |
|---|---|---|
| bench_scan_10k_1pct | scan | 1% |
| bench_scan_10k_10pct | scan | 10% |
| bench_scan_10k_50pct | scan | 50% |
| bench_filter_10k_1pct | filter | 1% |
| bench_aggregate_10k_1pct | aggregate | 1% |
| bench_join_10k_1pct | join | 1% |
| bench_join_agg_10k_1pct | join_agg | 1% |

Individual Tests (100K rows)

| Test Name | Scenario | Change Rate |
|---|---|---|
| bench_scan_100k_1pct | scan | 1% |
| bench_scan_100k_10pct | scan | 10% |
| bench_scan_100k_50pct | scan | 50% |
| bench_aggregate_100k_1pct | aggregate | 1% |
| bench_aggregate_100k_10pct | aggregate | 10% |
| bench_join_agg_100k_1pct | join_agg | 1% |
| bench_join_agg_100k_10pct | join_agg | 10% |

Special Tests

| Test Name | Description |
|---|---|
| bench_full_matrix | All 30 combinations (5 queries × 2 sizes × 3 rates) |
| bench_no_data_refresh_latency | Zero-change overhead (10 iterations) |

Nexmark Streaming Benchmark

The Nexmark benchmark validates correctness against a sustained high-frequency DML workload modelling an online auction system. It is adapted from the Nexmark benchmark specification used by streaming systems like Flink, Feldera, and Materialize.

Data Model

| Table | Description | Default Size |
|---|---|---|
| person | Registered users (sellers/bidders) | 100 rows |
| auction | Items listed for sale | 500 rows |
| bid | Bids placed on auctions | 2,000 rows |

Queries

| Query | Features | Description |
|---|---|---|
| Q0 | Passthrough | Identity projection of all bids |
| Q1 | Projection + arithmetic | Currency conversion |
| Q2 | Filter | Bids on specific auctions |
| Q3 | JOIN + filter | Local item suggestion (person-auction join) |
| Q4 | JOIN + GROUP BY + AVG | Average selling price by category |
| Q5 | GROUP BY + COUNT | Hot items (bid count per auction) |
| Q6 | JOIN + GROUP BY + AVG | Average bid price per seller |
| Q7 | Aggregate (MAX) | Highest bid price |
| Q8 | JOIN | Person-auction join (new users monitoring) |
| Q9 | JOIN + DISTINCT ON | Winning bid per auction with bidder info |

Running Nexmark Tests

# Default scale (100 persons, 500 auctions, 2000 bids, 3 cycles)
cargo test --test e2e_nexmark_tests -- --ignored --test-threads=1 --nocapture

# Larger scale
NEXMARK_PERSONS=1000 NEXMARK_AUCTIONS=5000 NEXMARK_BIDS=50000 NEXMARK_CYCLES=5 \
  cargo test --test e2e_nexmark_tests -- --ignored --test-threads=1 --nocapture

What Each Cycle Does

Each refresh cycle applies three mutation functions (RF1-RF3) then refreshes all stream tables and asserts multiset equality:

  1. RF1 (INSERT): New persons, auctions, and bids
  2. RF2 (DELETE): Remove oldest bids, orphaned auctions, orphaned persons
  3. RF3 (UPDATE): Price changes, reserve adjustments, city moves
  4. Refresh + Assert: Differential refresh → EXCEPT ALL correctness check

Correctness Validation

The test uses the same DBSP invariant as TPC-H: after every differential refresh, the stream table must be multiset-equal to re-executing the defining query from scratch (symmetric EXCEPT ALL). Additionally, negative __pgt_count values (over-retraction bugs) are detected.
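In Python terms, the invariant is a multiset comparison; a minimal sketch of the check (the real test runs a symmetric EXCEPT ALL in SQL):

```python
from collections import Counter

def multiset_equal(st_rows, recomputed_rows):
    """Analogue of the symmetric EXCEPT ALL check: the stream table must
    match a from-scratch re-execution as a multiset, i.e. order-insensitive
    but duplicate-sensitive."""
    return Counter(map(tuple, st_rows)) == Counter(map(tuple, recomputed_rows))
```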


DAG Topology Benchmarks

The DAG topology benchmark suite in tests/e2e_dag_bench_tests.rs measures end-to-end propagation latency and throughput through multi-level DAG topologies. While the single-ST benchmarks above measure per-operator refresh speed, these benchmarks measure how efficiently changes propagate through chains, fan-outs, diamonds, and mixed topologies with 5–100+ stream tables.

The core questions these benchmarks answer:

How long does it take for a source-table INSERT to propagate through an entire DAG to the leaf stream tables?

How does PARALLEL refresh mode compare to CALCULATED mode across different topology shapes?

Running DAG Benchmarks

# Full suite (rebuilds Docker image)
just test-dag-bench

# Skip Docker image rebuild
just test-dag-bench-fast

# Individual topology tests
cargo test --test e2e_dag_bench_tests --features pg18 -- --ignored bench_latency_linear_5 --test-threads=1 --nocapture
cargo test --test e2e_dag_bench_tests --features pg18 -- --ignored bench_throughput_diamond --test-threads=1 --nocapture

Topology Patterns

Topology | Shape | Description
Linear Chain | src → st_1 → st_2 → ... → st_N | Sequential pipeline; L1 aggregate, L2+ alternating project/filter
Wide DAG | src → [W parallel chains × D deep] | W independent chains of depth D from a shared source; tests parallel refresh mode
Fan-Out Tree | src → root → [b children] → [b² grandchildren] → ... | Exponential fan-out; each parent spawns b children with filter/project variants
Diamond | src → [fan-out aggregates] → JOIN → [extension] | Fan-out to independent aggregates (SUM/COUNT/MAX/MIN/AVG) then converge via JOIN
Mixed | Two sources, 4 layers, ~15 STs | Realistic e-commerce scenario with chains, fan-out, cross-source joins, and alerts

Measurement Modes

Latency benchmarks (auto-refresh): The scheduler is enabled with a 200 ms interval. The test INSERTs into the source table and polls pgt_refresh_history until the leaf stream table has a new COMPLETED entry. This measures the full propagation latency including scheduler overhead.

Throughput benchmarks (manual refresh): The scheduler is disabled. The test applies mixed DML (70% UPDATE, 15% DELETE, 15% INSERT) then manually refreshes all STs in topological order. This isolates pure refresh cost from scheduler overhead.

Theoretical Comparison

Each latency benchmark computes the theoretical prediction from PLAN_DAG_PERFORMANCE.md and reports the delta:

Mode | Formula
CALCULATED | L = I_s + N × T_r
PARALLEL(C) | L = Σ ⌈W_l / C⌉ × max(I_p, T_r) per level

Where T_r is the measured average per-ST refresh time, I_s = 200 ms (scheduler interval), W_l is the number of stream tables at level l, and C is the concurrency limit.
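Evaluated literally, the formulas can be checked in a few lines of Python. This is a sketch of my reading of the formulas; treating I_p as equal to the 200 ms scheduler interval is an assumption:

```python
import math

def calculated_latency(i_s, n, t_r):
    # CALCULATED: one scheduler wake-up, then N sequential per-ST refreshes
    return i_s + n * t_r

def parallel_latency(i_p, t_r, level_widths, c):
    # PARALLEL(C): each level runs ceil(W_l / C) batches of up to C
    # concurrent refreshes; each batch costs at least one poll interval
    return sum(math.ceil(w / c) * max(i_p, t_r) for w in level_widths)

# Linear chain, N=10, 200 ms interval, 50 ms per refresh -> 700 ms
assert calculated_latency(200.0, 10, 50.0) == 700.0

# Wide DAG, 3 levels of width 20, C=8: ceil(20/8)=3 batches per level
assert parallel_latency(200.0, 50.0, [20, 20, 20], 8) == 1800.0
```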

Reading the Output

Per-Cycle Machine-Parseable Lines (stderr)

[DAG_BENCH] topology=linear_chain mode=CALCULATED sts=10 depth=10 width=1 cycle=1 actual_ms=820.3 theory_ms=700.0 overhead_pct=17.2 per_hop_ms=82.0

ASCII Summary Table (stdout)

╔══════════════════════════════════════════════════════════════════════════════════════════════════════╗
║                         pg_trickle DAG Topology Benchmark Results                                 ║
╠═══════════════╤═══════════════╤══════╤═══════╤═══════╤════════════╤════════════╤═══════════════════╣
║ Topology      │ Mode          │ STs  │ Depth │ Width │ Actual ms  │ Theory ms  │ Overhead          ║
╠═══════════════╪═══════════════╪══════╪═══════╪═══════╪════════════╪════════════╪═══════════════════╣
║ linear_chain  │ CALCULATED    │   10 │    10 │     1 │      820.3 │      700.0 │ +17.2%            ║
║ wide_dag      │ PARALLEL_C8   │   60 │     3 │    20 │     2430.1 │     1800.0 │ +35.0%            ║
╚═══════════════╧═══════════════╧══════╧═══════╧═══════╧════════════╧════════════╧═══════════════════╝

Per-Level Breakdown

  Per-Level Breakdown (linear_chain D=10, CALCULATED):
  Level  1: avg  52.3ms  [st_lc_1]
  Level  2: avg  48.7ms  [st_lc_2]
  ...
  Level 10: avg  51.2ms  [st_lc_10]
  Total:       513.5ms  (scheduler overhead: 306.8ms)

JSON Export

Results are written to target/dag_bench_results/<timestamp>.json (overridable via PGS_DAG_BENCH_JSON_DIR env var) for cross-run comparison.

Available DAG Benchmark Tests

Latency Tests (Auto-Refresh)

Test Name | Topology | Mode | STs
bench_latency_linear_5_calc | Linear, D=5 | CALCULATED | 5
bench_latency_linear_10_calc | Linear, D=10 | CALCULATED | 10
bench_latency_linear_20_calc | Linear, D=20 | CALCULATED | 20
bench_latency_linear_10_par4 | Linear, D=10 | PARALLEL(4) | 10
bench_latency_wide_3x20_calc | Wide, D=3 W=20 | CALCULATED | 60
bench_latency_wide_3x20_par4 | Wide, D=3 W=20 | PARALLEL(4) | 60
bench_latency_wide_3x20_par8 | Wide, D=3 W=20 | PARALLEL(8) | 60
bench_latency_wide_5x20_calc | Wide, D=5 W=20 | CALCULATED | 100
bench_latency_wide_5x20_par8 | Wide, D=5 W=20 | PARALLEL(8) | 100
bench_latency_fanout_b2d5_calc | Fan-out, b=2 d=5 | CALCULATED | 31
bench_latency_fanout_b2d5_par8 | Fan-out, b=2 d=5 | PARALLEL(8) | 31
bench_latency_diamond_4_calc | Diamond, fan=4 | CALCULATED | 5
bench_latency_mixed_calc | Mixed, ~15 STs | CALCULATED | ~15
bench_latency_mixed_par8 | Mixed, ~15 STs | PARALLEL(8) | ~15

Throughput Tests (Manual Refresh)

Test Name | Topology | STs | Delta Sizes
bench_throughput_linear_5 | Linear, D=5 | 5 | 10, 100, 1000
bench_throughput_linear_10 | Linear, D=10 | 10 | 10, 100, 1000
bench_throughput_linear_20 | Linear, D=20 | 20 | 10, 100, 1000
bench_throughput_wide_3x20 | Wide, D=3 W=20 | 60 | 10, 100, 1000
bench_throughput_fanout_b2d5 | Fan-out, b=2 d=5 | 31 | 10, 100, 1000
bench_throughput_diamond_4 | Diamond, fan=4 | 5 | 10, 100, 1000
bench_throughput_mixed | Mixed, ~15 STs | ~15 | 10, 100, 1000

What to Look For

  1. Linear chain: CALCULATED should be faster than PARALLEL. For width=1 DAGs, PARALLEL adds polling overhead without any parallelism benefit.

  2. Wide DAG: PARALLEL(C=8) speedup over CALCULATED. For width ≥ 20, PARALLEL should show measurable improvement — it refreshes up to C STs concurrently per level instead of sequentially.

  3. Overhead < 100%. Actual latency should exceed the theoretical prediction by less than 100% across all topologies — i.e., the formulas should be in the right ballpark.

  4. DIFFERENTIAL action in per-ST breakdown. ST-on-ST hops should show DIFFERENTIAL rather than FULL, confirming differential propagation is working.

  5. Throughput scaling with delta size. Smaller deltas (10 rows) should yield lower per-cycle wall-clock time than larger deltas (1000 rows).


In-Process Micro-Benchmarks (Criterion.rs)

In addition to the E2E database benchmarks, the project includes two Criterion.rs benchmark suites that measure pure Rust computation time without database overhead. These are useful for tracking performance regressions in the internal query-building and IVM differentiation logic.

Benchmark Suites

refresh_bench — Utility Functions

benches/refresh_bench.rs benchmarks the low-level helper functions used during refresh operations:

Benchmark Group | What It Measures
quote_ident | PostgreSQL identifier quoting speed
col_list | Column list SQL generation
prefixed_col_list | Prefixed column list generation (e.g., NEW.col)
expr_to_sql | AST expression → SQL string conversion
output_columns | Output column extraction from parsed queries
source_oids | Source table OID resolution
lsn_gt | LSN comparison expression generation
frontier_json | Frontier state JSON serialization
canonical_period | Interval parsing and canonicalization
dag_operations | DAG topological sort and cycle detection
xxh64 | xxHash-64 hashing throughput

diff_operators — IVM Operator Differentiation

benches/diff_operators.rs benchmarks the delta SQL generation for every IVM operator. Each benchmark creates a realistic operator tree and measures differentiate() throughput:

Benchmark Group | What It Measures
diff_scan | Table scan differentiation (3, 10, 20 columns)
diff_filter | Filter (WHERE) differentiation
diff_project | Projection (SELECT subset) differentiation
diff_aggregate | GROUP BY aggregate differentiation (simple + complex)
diff_inner_join | Inner join differentiation
diff_left_join | Left outer join differentiation
diff_distinct | DISTINCT differentiation
diff_union_all | UNION ALL differentiation (2, 5, 10 children)
diff_window | Window function differentiation
diff_join_aggregate | Composite join + aggregate pipeline
differentiate_full | Full differentiate() call for scan-only and filter+scan trees

Running Micro-Benchmarks

# Run all Criterion benchmarks
just bench

# Run only refresh utility benchmarks
cargo bench --bench refresh_bench --features pg18

# Run only IVM diff operator benchmarks
just bench-diff
# or equivalently:
cargo bench --bench diff_operators --features pg18

# Output in Bencher-compatible format (for CI integration)
just bench-bencher

Output and Reports

Criterion produces statistical analysis for each benchmark including:

  • Mean and standard deviation of execution time
  • Throughput (iterations/sec)
  • Comparison with previous run — reports improvements/regressions with confidence intervals

HTML reports are generated in target/criterion/ with interactive charts showing distributions and regression history. Open target/criterion/report/index.html to browse all results.

Sample output:

diff_scan/3_columns   time:   [11.834 µs 12.074 µs 12.329 µs]
diff_scan/10_columns  time:   [16.203 µs 16.525 µs 16.869 µs]
diff_aggregate/simple time:   [21.447 µs 21.862 µs 22.301 µs]
diff_inner_join       time:   [25.919 µs 26.421 µs 26.952 µs]

Continuous Benchmarking with Bencher

Bencher provides continuous benchmark tracking in CI, detecting performance regressions on pull requests before they merge.

How It Works

The .github/workflows/benchmarks.yml workflow:

  1. On main pushes — runs both Criterion suites and uploads results to Bencher as the baseline. This establishes the expected performance for each benchmark.

  2. On pull requests — runs the same benchmarks and compares against the main baseline using a Student's t-test with a 99% upper confidence boundary. If any benchmark regresses beyond the threshold, the PR check fails.

Setup

To enable Bencher for your fork or deployment:

  1. Create a Bencher account at bencher.dev and create a project.

  2. Add the API token as a GitHub Actions secret:

    • Go to Settings → Secrets and variables → Actions
    • Add BENCHER_API_TOKEN with your Bencher API token
  3. Update the project slug in .github/workflows/benchmarks.yml if your Bencher project name differs from pg-trickle.

The workflow gracefully degrades — if BENCHER_API_TOKEN is not set, benchmarks still run and upload artifacts but skip Bencher tracking.

Local Bencher-Format Output

To see what Bencher would receive from CI:

just bench-bencher

This runs both suites with --output-format bencher, producing JSON output compatible with bencher run.

Dashboard

Once configured, the Bencher dashboard shows:

  • Historical trends for every benchmark across commits
  • Statistical thresholds with configurable alerting
  • PR annotations highlighting which benchmarks regressed and by how much

Troubleshooting

Issue | Resolution
docker: command not found | Install Docker Desktop and ensure it is running
Container startup timeout | Increase Docker memory allocation (≥ 4 GB recommended)
image not found | Run ./tests/build_e2e_image.sh to build the test image
Highly variable timings | Close other workloads; use --test-threads=1 to avoid container contention
SLOW status on latency test | Expected in Docker; bare metal should pass in < 10 ms

CDC Write-Side Overhead Benchmarks

The CDC write-overhead benchmark suite in tests/e2e_cdc_write_overhead_tests.rs measures the DML throughput cost of pg_trickle's CDC triggers on source tables. This quantifies the "write amplification factor" — how much slower DML becomes when a stream table is attached.

The core question this benchmark answers:

How much write throughput do you sacrifice by attaching a stream table to a source table?

Running CDC Write Overhead Benchmarks

# Full suite (all 5 scenarios)
cargo test --test e2e_cdc_write_overhead_tests --features pg18 -- --ignored --nocapture bench_cdc_write_overhead_full

# Individual scenarios
cargo test --test e2e_cdc_write_overhead_tests --features pg18 -- --ignored --nocapture bench_cdc_single_row_insert
cargo test --test e2e_cdc_write_overhead_tests --features pg18 -- --ignored --nocapture bench_cdc_bulk_insert
cargo test --test e2e_cdc_write_overhead_tests --features pg18 -- --ignored --nocapture bench_cdc_bulk_update
cargo test --test e2e_cdc_write_overhead_tests --features pg18 -- --ignored --nocapture bench_cdc_bulk_delete
cargo test --test e2e_cdc_write_overhead_tests --features pg18 -- --ignored --nocapture bench_cdc_concurrent_writers

Scenarios

Scenario | Description | Rows per Cycle
Single-row INSERT | One INSERT statement per row, 1,000 rows total | 1,000
Bulk INSERT | Single INSERT ... SELECT generate_series(...) | 10,000
Bulk UPDATE | Single UPDATE ... WHERE id <= N | 10,000
Bulk DELETE | Single DELETE ... WHERE id <= N | 10,000
Concurrent writers | 4 parallel sessions each inserting 5,000 rows | 20,000 total

Reading the Output

╔═══════════════════════════════════════════════════════════════════════════════════╗
║               pg_trickle CDC Write-Side Overhead Benchmark                       ║
╠═══════════════════════╤═══════════════╤═══════════════╤═════════════════════════╣
║ Scenario              │ Baseline (ms) │ With CDC (ms) │ Write Amplification     ║
╠═══════════════════════╪═══════════════╪═══════════════╪═════════════════════════╣
║ single-row INSERT     │         450.2 │         890.5 │       1.98×             ║
║ bulk INSERT (10K)     │          35.1 │          72.3 │       2.06×             ║
║ bulk UPDATE (10K)     │          48.7 │         105.2 │       2.16×             ║
║ bulk DELETE (10K)     │          22.4 │          51.8 │       2.31×             ║
║ concurrent (4×5K)     │          65.3 │         142.1 │       2.18×             ║
╚═══════════════════════╧═══════════════╧═══════════════╧═════════════════════════╝

Column | Meaning
Scenario | DML pattern being measured
Baseline | Average wall-clock time with no stream table (no CDC trigger)
With CDC | Average wall-clock time with an active stream table (CDC trigger fires)
Write Amplification | With CDC / Baseline — how many times slower the write path becomes

Machine-Readable Output

[CDC_BENCH] scenario=single-row_INSERT baseline_avg_ms=450.2 cdc_avg_ms=890.5 write_amplification=1.98
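These lines are designed for scripting. A small sketch of how one might parse them (illustrative; `parse_bench_line` is a hypothetical helper, not part of the test suite):

```python
def parse_bench_line(line):
    """Parse a '[CDC_BENCH] key=value ...' line into a dict,
    converting numeric values to floats."""
    fields = dict(tok.split("=", 1) for tok in line.split()[1:])
    return {k: float(v) if v.replace(".", "", 1).lstrip("-").isdigit() else v
            for k, v in fields.items()}

line = ("[CDC_BENCH] scenario=single-row_INSERT "
        "baseline_avg_ms=450.2 cdc_avg_ms=890.5 write_amplification=1.98")
m = parse_bench_line(line)

# write_amplification is simply the ratio of the two timings
assert m["write_amplification"] == round(m["cdc_avg_ms"] / m["baseline_avg_ms"], 2)
```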

Interpreting Write Amplification

Write Amplification | Interpretation
1.0–1.5× | Minimal overhead — triggers add negligible cost. Typical for bulk DML with statement-level triggers.
1.5–2.5× | Expected range for statement-level CDC triggers. Each DML statement incurs one additional INSERT into the change buffer.
2.5–4.0× | Moderate overhead — acceptable for most workloads. Common with row-level triggers or single-row DML.
4.0–10× | High overhead — consider pg_trickle.cdc_trigger_mode = 'statement' if using row-level triggers, or reduce DML frequency.
> 10× | Investigate — may indicate lock contention on the change buffer or pathological trigger interaction.

Key Patterns to Look For

  1. Statement-level triggers vs row-level: Statement-level triggers (default since v0.11.0) should show significantly lower overhead for bulk DML compared to row-level triggers.

  2. Bulk DML advantage: Bulk INSERT/UPDATE/DELETE should show lower write amplification than single-row INSERT because the trigger fires once per statement, not once per row.

  3. Concurrent writer safety: The concurrent scenario should complete without deadlocks or errors, and the write amplification should be similar to the serial bulk INSERT case.

  4. DELETE overhead: DELETE triggers tend to be slightly more expensive than INSERT triggers because the trigger must capture the OLD row values.


CI Benchmark Workflows

All benchmark jobs run only on weekly schedule and workflow_dispatch — never on PR or push — to avoid blocking the merge gate with long-running tests.

e2e-benchmarks.yml — E2E Benchmark Tracking

Produces the numbers in README.md and this document. Each job posts a summary table to the GitHub Actions run page and uploads artifacts at 90-day retention. Manual dispatch accepts a job input (refresh | latency | cdc | tpch | all) to re-run a single job.

Job | Test(s) | README Section | Timeout | just command
bench-refresh | bench_full_matrix | Differential vs Full Refresh | 60 min | just test-bench-e2e-fast
bench-latency | bench_no_data_refresh_latency | Zero-Change Latency | 20 min | just test-bench-e2e-fast
bench-cdc | bench_cdc_trigger_overhead | Write-Path Overhead | 30 min | just test-bench-e2e-fast
bench-tpch | test_tpch_performance_comparison | TPC-H per-query table | 30 min | just bench-tpch-fast

ci.yml — Benchmark Jobs

Criterion micro-benchmarks and DAG topology benchmarks. Run on the daily schedule and workflow_dispatch.

Job | Test Suite | What It Measures | Timeout | just command
benchmarks | benches/refresh_bench.rs, benches/diff_operators.rs | In-process Rust: query building, delta SQL generation (sub-µs) | 20 min | just bench
dag-bench-calc | e2e_dag_bench_tests (excl. par*) | DAG propagation latency + throughput, CALCULATED mode | 30 min | just test-dag-bench-fast
dag-bench-parallel | e2e_dag_bench_tests (par*) | DAG propagation with 4–8 parallel workers | 120 min | just test-dag-bench-fast

benchmarks.yml — Bencher Integration (opt-in)

Disabled by default (no scheduled trigger). Re-enable by restoring push/pull_request triggers and adding a BENCHER_API_TOKEN secret. When active, it annotates PRs with regressions detected via Student’s t-test at a 99% upper confidence boundary.

Job | Test Suite | What It Measures | Tracking
benchmark | benches/refresh_bench.rs, benches/diff_operators.rs | Same as ci.yml benchmarks job | Bencher (regression alert on PR)

Artifact Retention Summary

Workflow | Artifact | Retention
e2e-benchmarks.yml | bench-{refresh,latency,cdc,tpch}-results (stdout + JSON) | 90 days
ci.yml benchmarks | benchmark-results (Criterion HTML + JSON) | 7 days
benchmarks.yml | criterion-results (Criterion HTML + JSON) | 7 days

What Happens When You INSERT a Row?

This tutorial traces the complete lifecycle of a single INSERT statement on a base table that is referenced by a stream table — from the moment the row is written to the moment the stream table reflects the change.

Setup: A Real-World Example

Suppose you run an e-commerce platform. You have an orders table and a stream table that maintains a running total per customer:

-- Base table
CREATE TABLE orders (
    id    SERIAL PRIMARY KEY,
    customer TEXT NOT NULL,
    amount   NUMERIC(10,2) NOT NULL
);

-- Stream table: always-fresh customer totals
SELECT pgtrickle.create_stream_table(
    name     => 'customer_totals',
    query    => $$
      SELECT customer, SUM(amount) AS total, COUNT(*) AS order_count
      FROM orders GROUP BY customer
    $$,
    schedule => '1m'  -- refresh when data is staler than 1 minute
    -- refresh_mode defaults to 'AUTO' (differential with full-refresh fallback)
);

After creation, customer_totals is a real PostgreSQL table:

SELECT * FROM customer_totals;
-- (empty — no orders yet)

Phase 1: The INSERT

A new order arrives:

INSERT INTO orders (customer, amount) VALUES ('alice', 49.99);

What happens inside PostgreSQL

When create_stream_table() was called, pg_trickle installed an AFTER INSERT OR UPDATE OR DELETE trigger on the orders table. The trigger fires transparently as part of the user's INSERT statement — no application changes are needed.

The trigger function (pgtrickle_changes.pg_trickle_cdc_fn_<oid>()) executes inside the same transaction as the INSERT and writes a single row into the change buffer table:

pgtrickle_changes.changes_16384    (where 16384 = orders table OID)
┌───────────┬─────────────┬────────┬─────────┬──────────┬──────────┬────────────┐
│ change_id │ lsn         │ action │ pk_hash  │ new_id   │ new_cust │ new_amount │
├───────────┼─────────────┼────────┼─────────┼──────────┼──────────┼────────────┤
│ 1         │ 0/1A3F2B80  │ I      │ -837291 │ 1        │ alice    │ 49.99      │
└───────────┴─────────────┴────────┴─────────┴──────────┴──────────┴────────────┘

Key details:

  • lsn: The current WAL Log Sequence Number (pg_current_wal_lsn()), used to bound which changes belong to which refresh cycle.
  • action: 'I' for INSERT, 'U' for UPDATE, 'D' for DELETE.
  • pk_hash: A pre-computed hash of the primary key (orders.id), used later for efficient row matching.
  • new_* columns: The actual column values from NEW, stored as native PostgreSQL types (not JSONB). There are no old_* values for INSERTs.

The only overhead the trigger adds to the user's transaction is this single INSERT into the buffer table. There is no JSONB serialization, no logical replication slot, and no external process involved.

Phase 2: The Scheduler Wakes Up

A background worker called the scheduler runs inside PostgreSQL (registered via shared_preload_libraries). It wakes up every pg_trickle.scheduler_interval_ms milliseconds (default: 1000ms) and performs a tick:

  1. Rebuild the DAG (if any stream tables were created/dropped since last tick) — a dependency graph of all stream tables and their source tables.
  2. Topological sort — determine the refresh order so that stream tables depending on other stream tables are refreshed after their dependencies.
  3. For each stream table, check: has its staleness exceeded its schedule?

For customer_totals with a '1m' schedule, the scheduler compares:

  • now() minus data_timestamp (the freshness watermark from the last refresh)
  • Against the schedule: 60 seconds

If more than 60 seconds have elapsed and the stream table isn't already being refreshed, the scheduler begins a refresh.
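The per-table decision reduces to a simple staleness comparison. A sketch in Python (illustrative; the real scheduler is a Rust background worker):

```python
from datetime import datetime, timedelta, timezone

def needs_refresh(data_timestamp, schedule, refresh_in_progress, now=None):
    """Scheduler tick decision for one stream table: refresh when the data
    is staler than its schedule and no refresh is already running."""
    now = now or datetime.now(timezone.utc)
    return not refresh_in_progress and (now - data_timestamp) > schedule

last_refresh = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
tick         = datetime(2025, 1, 1, 12, 1, 5, tzinfo=timezone.utc)  # 65 s later

assert needs_refresh(last_refresh, timedelta(minutes=1), False, tick)
assert not needs_refresh(last_refresh, timedelta(minutes=1), True, tick)
```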

Phase 3: Frontier Advancement

Before executing the refresh, the scheduler creates a new frontier — a snapshot of how far to read changes from each source table:

Previous frontier: { orders(16384): lsn = 0/1A3F2A00 }
New frontier:      { orders(16384): lsn = 0/1A3F2C00 }

The frontier is a DBSP-inspired version vector. Each source table has its own LSN cursor. The refresh will process all changes in the buffer table where lsn > previous_frontier_lsn AND lsn <= new_frontier_lsn.

This means:

  • Changes committed before the previous refresh are already reflected.
  • Changes committed after the new frontier will be picked up in the next cycle.
  • The INSERT we made (lsn = 0/1A3F2B80) falls within this window.
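The window arithmetic is easy to verify by hand. A sketch of the half-open LSN window check in Python (pg_lsn's 'hi/lo' notation is two hex halves of a 64-bit position):

```python
def parse_lsn(text):
    """Convert PostgreSQL's 'hi/lo' pg_lsn notation to a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

def in_window(lsn, prev_frontier, new_frontier):
    # Half-open window: strictly after the previous frontier,
    # up to and including the new one
    return parse_lsn(prev_frontier) < parse_lsn(lsn) <= parse_lsn(new_frontier)

assert in_window("0/1A3F2B80", "0/1A3F2A00", "0/1A3F2C00")      # our INSERT
assert not in_window("0/1A3F2A00", "0/1A3F2A00", "0/1A3F2C00")  # already applied
assert not in_window("0/1A3F2D00", "0/1A3F2A00", "0/1A3F2C00")  # next cycle
```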

Phase 4: Change Detection — Is There Anything to Do?

Before running the full delta query, the scheduler runs a short-circuit check: does the change buffer actually have any rows in the LSN window?

SELECT count(*)::bigint FROM (
    SELECT 1 FROM pgtrickle_changes.changes_16384
    WHERE lsn > '0/1A3F2A00'::pg_lsn
    AND lsn <= '0/1A3F2C00'::pg_lsn
    LIMIT <threshold>
) __pgt_capped

This query also checks the adaptive threshold: if the number of changes exceeds a percentage of the source table size (default: 10%), the scheduler falls back to a FULL refresh instead of DIFFERENTIAL, because applying thousands of individual deltas would be slower than a bulk reload.

For our single INSERT, the count is 1 — well below the threshold. The scheduler proceeds with a DIFFERENTIAL refresh.
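The decision logic can be sketched as follows (illustrative Python; the "SKIP" label for an empty window is my naming, not the extension's):

```python
def choose_refresh_action(change_count, table_rows, threshold_pct=10.0):
    """Adaptive fallback: differential refresh is only a win while the
    delta is small relative to the source table (default cutoff: 10%)."""
    if change_count == 0:
        return "SKIP"          # nothing in the LSN window -- no work to do
    if change_count > table_rows * threshold_pct / 100.0:
        return "FULL"          # bulk reload beats applying a huge delta
    return "DIFFERENTIAL"

assert choose_refresh_action(1, 1_000_000) == "DIFFERENTIAL"
assert choose_refresh_action(200_000, 1_000_000) == "FULL"
assert choose_refresh_action(0, 1_000_000) == "SKIP"
```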

Phase 5: Delta Query Generation (DVM Engine)

This is where the Differential View Maintenance (DVM) engine does its work. The defining query:

SELECT customer, SUM(amount) AS total, COUNT(*) AS order_count
FROM orders GROUP BY customer

is parsed into an operator tree:

Aggregate(GROUP BY customer, SUM(amount), COUNT(*))
  └── Scan(orders)

The DVM engine differentiates each operator — converting it from "compute the full result" to "compute only what changed":

Step 1: Differentiate the Scan

The Scan(orders) operator becomes a read from the change buffer:

-- Reads only changes in the LSN window, splitting UPDATEs into DELETE+INSERT
WITH __pgt_raw AS (
    SELECT c.pk_hash, c.action,
           c."new_customer", c."old_customer",
           c."new_amount", c."old_amount"
    FROM pgtrickle_changes.changes_16384 c
    WHERE c.lsn > '0/1A3F2A00'::pg_lsn
    AND   c.lsn <= '0/1A3F2C00'::pg_lsn
)
-- INSERT rows: take new_* values
SELECT pk_hash AS __pgt_row_id, 'I' AS __pgt_action,
       "new_customer" AS customer, "new_amount" AS amount
FROM __pgt_raw WHERE action IN ('I', 'U')
UNION ALL
-- DELETE rows: take old_* values
SELECT pk_hash AS __pgt_row_id, 'D' AS __pgt_action,
       "old_customer" AS customer, "old_amount" AS amount
FROM __pgt_raw WHERE action IN ('D', 'U')

For our single INSERT, this produces:

__pgt_row_id | __pgt_action | customer | amount
-------------|--------------|----------|-------
-837291      | I            | alice    | 49.99

Step 2: Differentiate the Aggregate

The Aggregate differentiation is the heart of incremental maintenance. Instead of re-computing SUM(amount) over the entire orders table, it computes:

-- Delta for SUM: add new values, subtract deleted values
SELECT customer,
       SUM(CASE WHEN __pgt_action = 'I' THEN amount
                WHEN __pgt_action = 'D' THEN -amount END) AS total,
       SUM(CASE WHEN __pgt_action = 'I' THEN 1
                WHEN __pgt_action = 'D' THEN -1 END) AS order_count,
       pgtrickle.pg_trickle_hash(customer::text) AS __pgt_row_id,
       'I' AS __pgt_action
FROM <scan_delta>
GROUP BY customer

For our INSERT of ('alice', 49.99), this yields:

customer | total  | order_count | __pgt_row_id | __pgt_action
---------|--------|-------------|--------------|-------------
alice    | +49.99 | +1          | 7283194      | I

The stream table uses reference counting: it tracks __pgt_count (how many source rows contribute to each group). When __pgt_count reaches 0, the group row is deleted.
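Reference counting is what makes group deletion safe under retraction. A minimal sketch of the mechanism (illustrative Python, not the generated SQL):

```python
def apply_group_delta(groups, key, d_total, d_count):
    """Reference-counted aggregate maintenance: each group carries a count
    of contributing source rows (__pgt_count). The group row is created on
    first contribution and deleted when the count returns to 0."""
    total, count = groups.get(key, (0.0, 0))
    total, count = total + d_total, count + d_count
    if count == 0:
        groups.pop(key, None)      # last contributing row retracted
    else:
        groups[key] = (total, count)

groups = {}
apply_group_delta(groups, "alice", +49.99, +1)   # INSERT
assert groups["alice"] == (49.99, 1)
apply_group_delta(groups, "alice", -49.99, -1)   # DELETE of the same row
assert "alice" not in groups                     # group row removed
```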

Phase 6: MERGE Into the Stream Table

The delta is applied to the customer_totals storage table using a single SQL MERGE statement:

MERGE INTO public.customer_totals AS st
USING (<delta_query>) AS d
ON st.__pgt_row_id = d.__pgt_row_id
WHEN MATCHED AND d.__pgt_action = 'D' THEN DELETE
WHEN MATCHED AND d.__pgt_action = 'I' THEN
    UPDATE SET customer = d.customer, total = d.total, order_count = d.order_count
WHEN NOT MATCHED AND d.__pgt_action = 'I' THEN
    INSERT (__pgt_row_id, customer, total, order_count)
    VALUES (d.__pgt_row_id, d.customer, d.total, d.order_count)

Since alice didn't exist before, this takes the NOT MATCHED → INSERT branch. The stream table now contains:

SELECT * FROM customer_totals;
 customer | total | order_count
----------|-------|------------
 alice    | 49.99 | 1

Phase 7: Cleanup and Bookkeeping

After the MERGE succeeds:

  1. Consumed changes are deleted from the buffer table:

    DELETE FROM pgtrickle_changes.changes_16384
    WHERE lsn > '0/1A3F2A00'::pg_lsn
    AND lsn <= '0/1A3F2C00'::pg_lsn
    
  2. The frontier is saved to the catalog as JSONB, so the next refresh knows where to start.

  3. The refresh is recorded in pgtrickle.pgt_refresh_history:

    refresh_id | pgt_id | action       | rows_inserted | rows_deleted | delta_row_count | status    | initiated_by
    1          | 1      | DIFFERENTIAL | 1             | 0            | 1               | COMPLETED | SCHEDULER
    

    The delta_row_count column (new in v0.2.0) records the total number of change buffer rows consumed during this refresh cycle.

  4. The data timestamp on the stream table is advanced, resetting the staleness clock.

  5. The MERGE template is cached in thread-local storage. The next refresh for this stream table skips SQL parsing, operator tree construction, and differentiation — it only substitutes LSN values into the cached template. This saves ~45ms per refresh cycle.
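The caching pattern is ordinary memoization keyed by stream table. A sketch (illustrative Python; `render_merge` and `expensive_build` are hypothetical names standing in for the Rust internals):

```python
# Illustrative template cache; the real extension caches per backend in
# thread-local storage, keyed by stream table id.
merge_template_cache = {}

def render_merge(pgt_id, build_template, prev_lsn, new_lsn):
    """First refresh: build_template() does the expensive work (parse SQL,
    build the operator tree, differentiate, render the MERGE text).
    Later refreshes: cache hit -- only the LSN bounds are substituted."""
    if pgt_id not in merge_template_cache:
        merge_template_cache[pgt_id] = build_template()
    return merge_template_cache[pgt_id].format(prev=prev_lsn, new=new_lsn)

builds = []
def expensive_build():                    # stands in for the ~45 ms path
    builds.append(1)
    return "MERGE ... WHERE lsn > '{prev}' AND lsn <= '{new}' ..."

render_merge(1, expensive_build, "0/1A3F2A00", "0/1A3F2C00")
sql = render_merge(1, expensive_build, "0/1A3F2C00", "0/1A3F2E00")
assert len(builds) == 1                   # template built exactly once
assert "0/1A3F2E00" in sql                # fresh LSNs substituted each time
```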

What About UPDATE and DELETE?

UPDATE

UPDATE orders SET amount = 59.99 WHERE id = 1;

The trigger writes a single row with action = 'U', capturing both OLD and NEW values:

action | new_amount | old_amount | new_customer | old_customer
-------|------------|------------|--------------|-------------
U      | 59.99      | 49.99      | alice        | alice

The scan differentiation splits this into:

  • DELETE old: (alice, 49.99) with action 'D'
  • INSERT new: (alice, 59.99) with action 'I'

The aggregate differentiation computes: +59.99 - 49.99 = +10.00 for alice's total. The MERGE updates the existing row.

DELETE

DELETE FROM orders WHERE id = 1;

The trigger writes action = 'D' with the OLD values. The aggregate differentiation computes -49.99 for the total and -1 for the count. If the __pgt_count reaches 0 (no more orders for alice), the MERGE deletes alice's row from the stream table entirely.

Performance: Why This Is Fast

Step | What it avoids
Trigger-based CDC | No logical replication slot, no WAL parsing, no external process
Typed columns | No JSONB serialization in the trigger, no jsonb_populate_record in the delta query
Pre-computed pk_hash | No per-row hash computation during the delta query
LSN-bounded reads | Index scan on the change buffer, not a full table scan
Algebraic differentiation | Processes only changed rows — O(changes) not O(table size)
MERGE statement | Single SQL round-trip for all inserts, updates, and deletes
Cached templates | After the first refresh, delta SQL generation is skipped entirely
Adaptive fallback | Automatically switches to FULL refresh when changes exceed a threshold

For a table with 10 million rows and 100 changed rows, a DIFFERENTIAL refresh processes only those 100 rows. A FULL refresh would need to scan all 10 million.


What About IMMEDIATE Mode?

Everything described above applies to the default AUTO mode — changes accumulate in a buffer and are applied on a schedule using differential (delta-only) maintenance. As of v0.2.0, pg_trickle also supports IMMEDIATE mode, which takes a fundamentally different path.

With IMMEDIATE mode, there are no change buffers, no scheduler, and no waiting:

SELECT pgtrickle.create_stream_table(
    name         => 'customer_totals_live',
    query        => $$
      SELECT customer, SUM(amount) AS total, COUNT(*) AS order_count
      FROM orders GROUP BY customer
    $$,
    refresh_mode => 'IMMEDIATE'
);

How IMMEDIATE Mode Differs for INSERT

Phase | DIFFERENTIAL | IMMEDIATE
Trigger type | Row-level AFTER trigger | Statement-level AFTER trigger with REFERENCING NEW TABLE
What's captured | One buffer row per INSERT | A transition table containing all inserted rows
When delta runs | Next scheduler tick (up to schedule bound) | Immediately, in the same transaction
Delta source | Change buffer table (pgtrickle_changes.*) | Temp table copied from transition table
Concurrency | No locking between writers | Advisory lock per stream table

When you run INSERT INTO orders ...:

  1. A BEFORE INSERT statement-level trigger acquires an advisory lock on the stream table
  2. The AFTER INSERT trigger captures the transition table (NEW TABLE AS __pgt_newtable) into a temp table
  3. The DVM engine generates the same delta query, but reads from the temp table instead of the change buffer
  4. The delta is applied to the stream table via INSERT/DELETE DML (not MERGE)
  5. The stream table is immediately up-to-date — within the same transaction

BEGIN;
INSERT INTO orders (customer, amount) VALUES ('alice', 49.99);
-- customer_totals_live already shows alice with total=49.99 here!
SELECT * FROM customer_totals_live;
COMMIT;

The delta SQL template is cached per (pgt_id, source_oid, has_new, has_old) combination, so subsequent trigger invocations skip query parsing entirely.


Next in This Series

What Happens When You UPDATE a Row?

This tutorial traces what happens when an UPDATE statement hits a base table that is referenced by a stream table. It covers the trigger capture, the scan-level decomposition into DELETE + INSERT, and how each DVM operator propagates the change — including cases where the group key changes, where JOINs are involved, and where multiple UPDATEs happen within a single refresh window.

Prerequisite: Read WHAT_HAPPENS_ON_INSERT.md first — it introduces the full 7-phase lifecycle. This tutorial focuses on how UPDATE differs.

Setup

Same e-commerce example:

CREATE TABLE orders (
    id       SERIAL PRIMARY KEY,
    customer TEXT NOT NULL,
    amount   NUMERIC(10,2) NOT NULL
);

SELECT pgtrickle.create_stream_table(
    name     => 'customer_totals',
    query    => $$
      SELECT customer, SUM(amount) AS total, COUNT(*) AS order_count
      FROM orders GROUP BY customer
    $$,
    schedule => '1m'
);

-- Seed some data
INSERT INTO orders (customer, amount) VALUES
    ('alice', 49.99),
    ('alice', 30.00),
    ('bob',   75.00);

After the first refresh, the stream table contains:

customer | total | order_count
---------|-------|------------
alice    | 79.99 | 2
bob      | 75.00 | 1

Case 1: Simple Value UPDATE (Same Group Key)

UPDATE orders SET amount = 59.99 WHERE id = 1;

Alice's first order changes from 49.99 to 59.99. The customer (group key) stays the same.

Phase 1: Trigger Capture

The AFTER UPDATE trigger fires and writes one row to the change buffer with both OLD and NEW values:

pgtrickle_changes.changes_16384
┌───────────┬─────────────┬────────┬──────────┬──────────┬────────────┬──────────┬────────────┐
│ change_id │ lsn         │ action │ new_cust │ new_amt  │ old_cust   │ old_amt  │ pk_hash    │
├───────────┼─────────────┼────────┼──────────┼──────────┼────────────┼──────────┼────────────┤
│ 4         │ 0/1A3F3000  │ U      │ alice    │ 59.99    │ alice      │ 49.99    │ -837291    │
└───────────┴─────────────┴────────┴──────────┴──────────┴────────────┴──────────┴────────────┘

Key difference from INSERT: the trigger writes both new_* and old_* columns. The pk_hash is computed from NEW.id.

Phase 2–4: Scheduler, Frontier, Change Detection

Identical to the INSERT flow. The scheduler detects one change row in the LSN window.

Phase 5: Scan Differentiation — The U → D+I Split

This is where UPDATE handling diverges fundamentally. The scan delta operator decomposes the UPDATE into two events:

__pgt_row_id | __pgt_action | customer | amount
-------------|--------------|----------|-------
-837291      | D            | alice    | 49.99     ← old values (DELETE)
-837291      | I            | alice    | 59.99     ← new values (INSERT)

Why split into D+I? This is a core IVM principle. Downstream operators (aggregates, joins, filters) don't have special "update" logic — they only understand insertions and deletions. By decomposing the UPDATE:

  • The DELETE event subtracts the old values from running aggregates
  • The INSERT event adds the new values

This algebraic approach handles arbitrary operator trees without operator-specific update logic.
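As a sketch of this algebra (illustrative Python, not pg_trickle internals): an aggregate that only understands signed insert/delete events applies the decomposed UPDATE correctly, with no update-specific code.

```python
# Toy SUM/COUNT aggregate fed only I/D events; an UPDATE arrives as D+I.
def apply_events(state, events):
    # state: {group_key: (total, count)}
    for action, customer, amount in events:
        sign = 1 if action == "I" else -1
        total, count = state.get(customer, (0.0, 0))
        state[customer] = (round(total + sign * amount, 2), count + sign)
    return state

state = {"alice": (79.99, 2), "bob": (75.00, 1)}

# UPDATE orders SET amount = 59.99 WHERE id = 1  →  D(old) + I(new)
apply_events(state, [("D", "alice", 49.99), ("I", "alice", 59.99)])
print(state["alice"])  # (89.99, 2): total moved by +10.00, count unchanged
```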

Phase 5 (continued): Aggregate Differentiation

The aggregate operator processes both events against the alice group:

-- DELETE event: subtract old values
alice: total += CASE WHEN action='D' THEN -49.99 END  →  -49.99
alice: count += CASE WHEN action='D' THEN -1 END       →  -1

-- INSERT event: add new values
alice: total += CASE WHEN action='I' THEN +59.99 END  →  +59.99
alice: count += CASE WHEN action='I' THEN +1 END       →  +1

Net effect on alice's group:

total delta:  -49.99 + 59.99 = +10.00
count delta:  -1 + 1 = 0

The aggregate emits this as an INSERT (because the group still exists and its value changed):

customer | total  | order_count | __pgt_row_id | __pgt_action
---------|--------|-------------|--------------|-------------
alice    | +10.00 | 0           | 7283194      | I

Phase 6: MERGE

The MERGE updates the existing row:

-- MERGE WHEN MATCHED AND action = 'I' THEN UPDATE:
-- alice's total: 79.99 + 10.00 = 89.99
-- alice's count: 2 + 0 = 2

A subtlety: the MERGE itself doesn't add deltas; it replaces the matched row wholesale. It's the aggregate delta query, upstream of the MERGE, that computes the new absolute values by combining the stored state with the delta:

COALESCE(existing.total, 0) + delta.total  → 79.99 + 10.00 = 89.99
COALESCE(existing.__pgt_count, 0) + delta.__pgt_count → 2 + 0 = 2

Result:

SELECT * FROM customer_totals;
 customer | total | order_count
----------|-------|------------
 alice    | 89.99 | 2            ← was 79.99
 bob      | 75.00 | 1

Case 2: Group Key Change (Customer Reassignment)

UPDATE orders SET customer = 'bob' WHERE id = 2;

Alice's second order (amount=30.00) is reassigned to Bob. The group key itself changes.

Trigger Capture

change_id | lsn         | action | new_cust | new_amt | old_cust | old_amt | pk_hash
5         | 0/1A3F3100  | U      | bob      | 30.00   | alice    | 30.00   | 4521038

The old and new customer values differ.

Scan Delta: D+I Split

__pgt_row_id | __pgt_action | customer | amount
-------------|--------------|----------|-------
4521038      | D            | alice    | 30.00    ← removes from alice's group
4521038      | I            | bob      | 30.00    ← adds to bob's group

Aggregate Delta

The aggregate groups by customer, so the DELETE and INSERT land in different groups:

Group "alice":
  total delta:  -30.00
  count delta:  -1

Group "bob":
  total delta:  +30.00
  count delta:  +1

After MERGE

SELECT * FROM customer_totals;
 customer | total  | order_count
----------|--------|------------
 alice    | 59.99  | 1            ← lost one order (-30.00)
 bob      | 105.00 | 2            ← gained one order (+30.00)

This is why the D+I decomposition is essential. Without it, you'd need special "move between groups" logic. With it, the standard aggregate differentiation handles group key changes naturally.
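The same sketch extends to group moves: feeding Case 2's D and I events into a toy aggregate routes each event to its own group, with no dedicated "move" logic (names and state layout are illustrative, not pg_trickle internals).

```python
# Toy aggregate: the D lands in alice's group, the I in bob's.
def apply_events(state, events):
    # state: {group_key: (total, count)}
    for action, customer, amount in events:
        sign = 1 if action == "I" else -1
        total, count = state.get(customer, (0.0, 0))
        state[customer] = (round(total + sign * amount, 2), count + sign)
        if state[customer][1] == 0:
            del state[customer]  # group vanishes at refcount zero
    return state

# State after Case 1: alice 89.99/2, bob 75.00/1
state = {"alice": (89.99, 2), "bob": (75.00, 1)}

# UPDATE orders SET customer = 'bob' WHERE id = 2  →  D(alice,30) + I(bob,30)
apply_events(state, [("D", "alice", 30.00), ("I", "bob", 30.00)])
print(state)  # {'alice': (59.99, 1), 'bob': (105.0, 2)}
```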


Case 3: UPDATE That Deletes a Group

-- Alice only has one order left. Reassign it to bob.
UPDATE orders SET customer = 'bob' WHERE id = 1;

Aggregate Delta

Group "alice":
  total delta:    -59.99
  count delta:    -1
  new __pgt_count: 1 - 1 = 0  → group vanishes!

Group "bob":
  total delta:    +59.99
  count delta:    +1

When __pgt_count reaches 0, the aggregate emits a DELETE for alice's group:

customer | total | __pgt_row_id | __pgt_action
---------|-------|--------------|-------------
alice    | —     | 7283194      | D             ← group removed
bob      | ...   | 9182734      | I             ← group updated

The MERGE deletes alice's row entirely:

SELECT * FROM customer_totals;
 customer | total  | order_count
----------|--------|------------
 bob      | 164.99 | 3

Case 4: Multiple UPDATEs on the Same Row (Within One Refresh Window)

What if a row is updated multiple times before the next refresh?

UPDATE orders SET amount = 10.00 WHERE id = 3;  -- bob: 75 → 10
UPDATE orders SET amount = 20.00 WHERE id = 3;  -- bob: 10 → 20
UPDATE orders SET amount = 30.00 WHERE id = 3;  -- bob: 20 → 30

The change buffer now has 3 rows for pk_hash of order #3:

change_id | action | old_amt | new_amt
6         | U      | 75.00   | 10.00
7         | U      | 10.00   | 20.00
8         | U      | 20.00   | 30.00

Net-Effect Computation

The scan delta uses a split fast-path design. Since order #3 has multiple changes (cnt > 1), it takes the multi-change path with window functions:

FIRST_VALUE(action) OVER (PARTITION BY pk_hash ORDER BY change_id)  → 'U'
LAST_VALUE(action) OVER (...)                                        → 'U'

Both first and last actions are 'U', so:

  • DELETE: emits using old values from the earliest change (change_id=6): old_amt = 75.00
  • INSERT: emits using new values from the latest change (change_id=8): new_amt = 30.00

Net delta:

__pgt_row_id | __pgt_action | amount
-------------|--------------|-------
pk_hash_3    | D            | 75.00    ← original value before all changes
pk_hash_3    | I            | 30.00    ← final value after all changes

The aggregate sees -75.00 + 30.00 = -45.00. This is correct regardless of the intermediate values. The intermediate rows (10.00, 20.00) are never seen.
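The first-old/last-new rule can be sketched directly (illustrative Python; the real implementation is generated SQL using window functions):

```python
# Net effect of several buffered changes to one primary key: emit a D with
# the earliest old values unless the row was born in this window, and an I
# with the latest new values unless the row died in this window.
def net_effect(changes):
    # changes: list of (action, old_value, new_value), ordered by change_id
    first_action = changes[0][0]
    last_action = changes[-1][0]
    events = []
    if first_action != "I":
        events.append(("D", changes[0][1]))   # old values of earliest change
    if last_action != "D":
        events.append(("I", changes[-1][2]))  # new values of latest change
    return events

# Three UPDATEs on order #3: 75 → 10 → 20 → 30
print(net_effect([("U", 75.00, 10.00),
                  ("U", 10.00, 20.00),
                  ("U", 20.00, 30.00)]))
# [('D', 75.0), ('I', 30.0)] — intermediate values 10 and 20 never surface
```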


Case 5: INSERT + UPDATE in Same Window

INSERT INTO orders (customer, amount) VALUES ('charlie', 100.00);
UPDATE orders SET amount = 200.00 WHERE customer = 'charlie';

Both happen before the next refresh. The buffer has:

change_id | action | old_amt | new_amt
9         | I      | NULL    | 100.00
10        | U      | 100.00  | 200.00

Net-effect analysis:

  • first_action = 'I' (row didn't exist before this window)
  • last_action = 'U' (row exists after)

Result:

  • No DELETE emitted (first_action = 'I' means the row was born in this window)
  • INSERT with final values: (charlie, 200.00)

The aggregate sees a pure insertion of (charlie, 200.00) — the intermediate value of 100.00 never appears.


Case 6: UPDATE + DELETE in Same Window

UPDATE orders SET amount = 999.99 WHERE id = 3;
DELETE FROM orders WHERE id = 3;

Net-effect:

  • first_action = 'U' (row existed before)
  • last_action = 'D' (row no longer exists)

Result:

  • DELETE with original old values from the first change
  • No INSERT (last_action = 'D')

The aggregate correctly sees only a removal.


Case 7: UPDATE with JOINs

Consider a stream table that joins two tables:

CREATE TABLE customers (
    id   SERIAL PRIMARY KEY,
    name TEXT NOT NULL,
    tier TEXT NOT NULL DEFAULT 'standard'
);

CREATE TABLE orders (
    id          SERIAL PRIMARY KEY,
    customer_id INT REFERENCES customers(id),
    amount      NUMERIC(10,2)
);

SELECT pgtrickle.create_stream_table(
    name         => 'order_details',
    query        => $$
      SELECT c.name, c.tier, o.amount
      FROM orders o
      JOIN customers c ON o.customer_id = c.id
    $$,
    schedule => '1m'
);

Now update a customer's tier:

UPDATE customers SET tier = 'premium' WHERE name = 'alice';

How the JOIN Delta Works

The join differentiation follows the formula:

$$\Delta(L \bowtie R) = (\Delta L \bowtie R) \cup (L \bowtie \Delta R) - (\Delta L \bowtie \Delta R)$$

Since only the customers table changed:

  • $\Delta L$ = changes to orders (empty)
  • $\Delta R$ = changes to customers (alice's tier: standard → premium)

So:

  • Part 1: $\Delta\text{orders} \bowtie \text{customers}$ = empty (no order changes)
  • Part 2: $\text{orders} \bowtie \Delta\text{customers}$ = all of alice's orders joined with her tier change
  • Part 3: $\Delta\text{orders} \bowtie \Delta\text{customers}$ = empty (no order changes)

Part 2 produces the delta: for each of alice's orders, DELETE the old row (with tier='standard') and INSERT a new row (with tier='premium').

The stream table is updated to reflect the new tier across all of alice's order rows.
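Under stated assumptions (signed rows standing in for D/I events, illustrative relation layouts), the delta formula can be traced in a few lines of Python:

```python
# Signed-multiset join: +1 rows are insertions, -1 rows are deletions.
def join(left, right):
    # left rows: (sign, customer_id, amount); right rows: (sign, id, name, tier)
    return [(ls * rs, name, tier, amount)
            for ls, cid, amount in left
            for rs, rid, name, tier in right
            if cid == rid]

orders = [(+1, 1, 50.00), (+1, 1, 30.00)]     # alice's orders (L, unchanged)
d_orders = []                                 # ΔL: no order changes
d_customers = [(-1, 1, "alice", "standard"),  # ΔR: tier change as D+I
               (+1, 1, "alice", "premium")]
customers = [(+1, 1, "alice", "premium")]     # R after the change

# ΔL is empty, so both the ΔL⋈R term and the ΔL⋈ΔR correction are empty;
# only L⋈ΔR contributes.
delta = join(d_orders, customers) + join(orders, d_customers)
for row in sorted(delta):
    print(row)
```

Each of alice's two orders gets a -1 row carrying tier='standard' and a +1 row carrying tier='premium', matching the DELETE/INSERT pairs described above.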


Performance Summary

Scenario                  | Buffer rows | Delta rows emitted           | Work
--------------------------|-------------|------------------------------|-------------------------
Simple value change       | 1           | 2 (D+I)                      | O(1) per group
Group key change          | 1           | 2 (D+I, different groups)    | O(1) per affected group
Group deletion            | 1           | 1 (D) + 1 (I) or 1 (D)       | O(1)
N updates same row        | N           | 2 (D first-old + I last-new) | O(N) scan, O(1) aggregate
INSERT+UPDATE same window | 2           | 1 (I only)                   | O(1)
UPDATE+DELETE same window | 2           | 1 (D only)                   | O(1)

In all cases, the work is proportional to the number of changed rows, not the total table size. A single UPDATE on a billion-row table produces the same delta cost as on a 10-row table.


What About IMMEDIATE Mode?

Everything above describes DIFFERENTIAL mode — changes accumulate in a buffer and are applied on a schedule. As of v0.2.0, pg_trickle also supports IMMEDIATE mode, where the stream table is updated synchronously within the same transaction as your UPDATE.

How IMMEDIATE Mode Differs for UPDATE

Phase             | DIFFERENTIAL                        | IMMEDIATE
------------------|-------------------------------------|---------------------------------------------------------------------
Trigger type      | Row-level AFTER trigger             | Statement-level AFTER trigger with REFERENCING OLD TABLE, NEW TABLE
What's captured   | One buffer row with old_* and new_* | Two transition tables: __pgt_oldtable and __pgt_newtable
When delta runs   | Next scheduler tick                 | Immediately, in the same transaction
D+I decomposition | In the scan delta CTE               | Same algebra, but reading from transition temp tables
Concurrency       | No locking between writers          | Advisory lock per stream table

When you run UPDATE orders SET amount = 59.99 WHERE id = 1:

  1. A BEFORE UPDATE trigger acquires an advisory lock on the stream table
  2. The AFTER UPDATE trigger captures both OLD TABLE AS __pgt_oldtable and NEW TABLE AS __pgt_newtable into temp tables
  3. The DVM engine generates the same D+I decomposition, reading old values from the old-table and new values from the new-table
  4. The delta is applied to the stream table immediately
  5. Any query within the same transaction sees the updated stream table

BEGIN;
UPDATE orders SET amount = 59.99 WHERE id = 1;
-- customer_totals already reflects the new amount here!
SELECT * FROM customer_totals WHERE customer = 'alice';
COMMIT;

The same D+I split, aggregate differentiation, and net-effect logic applies — the only difference is the data source (transition tables vs change buffer) and timing (synchronous vs scheduled).


Next in This Series

What Happens When You DELETE a Row?

This tutorial traces what happens when a DELETE statement hits a base table that is referenced by a stream table. It covers the trigger capture, how the scan delta emits a single DELETE event, and how each DVM operator propagates the removal — including group deletion, partial group reduction, JOINs, cascading deletes within a single refresh window, and the important edge case where a DELETE cancels a prior INSERT.

Prerequisite: Read WHAT_HAPPENS_ON_INSERT.md first — it introduces the full 7-phase lifecycle (trigger → scheduler → frontier → change detection → DVM delta → MERGE → cleanup). This tutorial focuses on how DELETE differs.

Setup

Same e-commerce example used throughout the series:

CREATE TABLE orders (
    id       SERIAL PRIMARY KEY,
    customer TEXT NOT NULL,
    amount   NUMERIC(10,2) NOT NULL
);

SELECT pgtrickle.create_stream_table(
    name     => 'customer_totals',
    query    => $$
      SELECT customer, SUM(amount) AS total, COUNT(*) AS order_count
      FROM orders GROUP BY customer
    $$,
    schedule => '1m'
);

-- Seed some data
INSERT INTO orders (customer, amount) VALUES
    ('alice', 50.00),
    ('alice', 30.00),
    ('bob',   75.00),
    ('bob',   25.00);

After the first refresh, the stream table contains:

customer | total  | order_count
---------|--------|------------
alice    | 80.00  | 2
bob      | 100.00 | 2

Case 1: Delete One Row (Group Survives)

DELETE FROM orders WHERE id = 2;  -- alice's 30.00 order

Alice still has one remaining order (id=1, amount=50.00). The group shrinks but doesn't vanish.

Phase 1: Trigger Capture

The AFTER DELETE trigger fires and writes one row to the change buffer with only OLD values:

pgtrickle_changes.changes_16384
┌───────────┬─────────────┬────────┬──────────┬──────────┬────────────┬──────────┬────────────┐
│ change_id │ lsn         │ action │ new_cust │ new_amt  │ old_cust   │ old_amt  │ pk_hash    │
├───────────┼─────────────┼────────┼──────────┼──────────┼────────────┼──────────┼────────────┤
│ 5         │ 0/1A3F3000  │ D      │ NULL     │ NULL     │ alice      │ 30.00    │ 4521038    │
└───────────┴─────────────┴────────┴──────────┴──────────┴────────────┴──────────┴────────────┘

Key difference from INSERT and UPDATE:

  • new_* columns are all NULL — the row no longer exists, so there are no NEW values
  • old_* columns contain the deleted row's data — this is what gets subtracted
  • pk_hash is computed from OLD.id (the deleted row's primary key)

Phase 2–4: Scheduler, Frontier, Change Detection

Identical to the INSERT flow. The scheduler detects one change row in the LSN window.

Phase 5: Scan Differentiation — Pure DELETE

Unlike UPDATE (which splits into D+I), a DELETE produces a single event:

__pgt_row_id | __pgt_action | customer | amount
-------------|--------------|----------|-------
4521038      | D            | alice    | 30.00

The scan delta applies the net-effect filtering rule:

  • first_action = 'D' → row existed before the refresh window
  • last_action = 'D' → row does not exist after

Result: emit a DELETE using old values. No INSERT is emitted (because last_action = 'D').

This is the simplest path through the scan delta — one change, one PK, one DELETE event.

Phase 5 (continued): Aggregate Differentiation

The aggregate operator processes the DELETE event against the alice group:

-- DELETE event: subtract old values from alice's group
__ins_count = 0         -- no inserts
__del_count = 1         -- one deletion
__ins_total = 0         -- no amount added
__del_total = 30.00     -- 30.00 removed

The merge CTE joins this delta with the existing stream table state:

new_count = old_count + ins_count - del_count = 2 + 0 - 1 = 1  (still > 0)

Since new_count > 0 and the group already existed (old_count = 2), the action is classified as 'U' (update). The aggregate emits the group with its new values:

customer | total | order_count | __pgt_row_id | __pgt_action
---------|-------|-------------|--------------|-------------
alice    | 50.00 | 1           | 7283194      | I

Note: the 'U' meta-action is emitted as __pgt_action = 'I' because the MERGE treats it as an update-via-INSERT (see aggregate final CTE: CASE WHEN __pgt_meta_action = 'D' THEN 'D' ELSE 'I' END).
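The classification rule above can be condensed into a small sketch (illustrative; the real logic lives in the generated aggregate CTEs):

```python
# Combine the stored reference count with the delta's insert/delete counts,
# then map the meta-action to the 'I'/'D' actions the MERGE understands.
def classify(old_count, ins_count, del_count):
    new_count = (old_count or 0) + ins_count - del_count
    if new_count <= 0:
        return "D" if old_count else None  # drop the group (if it ever existed)
    # 'U' (existing group changed) and 'I' (brand-new group) both reach the
    # MERGE as action 'I': MATCHED rows get updated, unmatched ones inserted.
    return "I"

print(classify(old_count=2, ins_count=0, del_count=1))  # 'I'  (group survives)
print(classify(old_count=1, ins_count=0, del_count=1))  # 'D'  (group vanishes)
print(classify(old_count=0, ins_count=1, del_count=0))  # 'I'  (new group)
```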

Phase 6: MERGE

The MERGE statement matches alice's existing row and updates it:

MERGE INTO customer_totals AS st
USING (...delta...) AS d
ON st.__pgt_row_id = d.__pgt_row_id
WHEN MATCHED AND d.__pgt_action = 'I' THEN
  UPDATE SET customer = d.customer, total = d.total, order_count = d.order_count, ...

Result:

SELECT * FROM customer_totals;
 customer | total  | order_count
----------|--------|------------
 alice    | 50.00  | 1            ← was 80.00 / 2
 bob      | 100.00 | 2

Phase 7: Cleanup

The change buffer rows in the consumed LSN window are deleted:

DELETE FROM pgtrickle_changes.changes_16384
WHERE lsn > '0/1A3F2FFF'::pg_lsn AND lsn <= '0/1A3F3000'::pg_lsn;

Case 2: Delete Last Row in Group (Group Vanishes)

-- Alice has one order left (id=1, amount=50.00). Delete it.
DELETE FROM orders WHERE id = 1;

Trigger Capture

change_id | lsn         | action | old_cust | old_amt | pk_hash
6         | 0/1A3F3100  | D      | alice    | 50.00   | -837291

Scan Delta

Single DELETE event:

__pgt_row_id | __pgt_action | customer | amount
-------------|--------------|----------|-------
-837291      | D            | alice    | 50.00

Aggregate Delta

Group "alice":
  ins_count = 0
  del_count = 1
  new_count = old_count + 0 - 1 = 1 - 1 = 0  → group vanishes!

When new_count drops to 0 (or below), the aggregate classifies this as action 'D' (delete). The reference count has reached zero — no rows contribute to this group anymore.

The aggregate emits a DELETE for alice's group:

customer | __pgt_row_id | __pgt_action
---------|--------------|-------------
alice    | 7283194      | D

MERGE

The MERGE matches alice's existing row and deletes it:

WHEN MATCHED AND d.__pgt_action = 'D' THEN DELETE

Result:

SELECT * FROM customer_totals;
 customer | total  | order_count
----------|--------|------------
 bob      | 100.00 | 2

Alice's row is completely removed from the stream table. This is the correct behavior — with zero contributing rows, the group should not exist.


Case 3: Delete Multiple Rows (Same Group, Same Window)

-- Delete both of bob's orders before the next refresh
DELETE FROM orders WHERE id = 3;  -- bob, 75.00
DELETE FROM orders WHERE id = 4;  -- bob, 25.00

The change buffer has two rows with different pk_hash values (different PKs):

change_id | action | old_cust | old_amt | pk_hash
7         | D      | bob      | 75.00   | pk_hash_3
8         | D      | bob      | 25.00   | pk_hash_4

Scan Delta

Each PK has exactly one change, so both take the single-change fast path:

__pgt_row_id | __pgt_action | customer | amount
-------------|--------------|----------|-------
pk_hash_3    | D            | bob      | 75.00
pk_hash_4    | D            | bob      | 25.00

Two DELETE events, both targeting bob's group.

Aggregate Delta

The aggregate sums both deletions:

Group "bob":
  ins_count = 0
  del_count = 2
  del_total = 75.00 + 25.00 = 100.00
  new_count = 2 + 0 - 2 = 0  → group vanishes!

The aggregate emits a DELETE for bob's group.

MERGE

Bob's row is deleted from the stream table. With both alice and bob gone (from Cases 1+2+3), the stream table is now empty.


Case 4: INSERT + DELETE in Same Window (Cancellation)

What if a row is inserted and then deleted before the next refresh?

INSERT INTO orders (customer, amount) VALUES ('charlie', 200.00);
DELETE FROM orders WHERE customer = 'charlie';

The change buffer has:

change_id | action | new_cust | new_amt | old_cust | old_amt | pk_hash
9         | I      | charlie  | 200.00  | NULL     | NULL    | pk_hash_new
10        | D      | NULL     | NULL    | charlie  | 200.00  | pk_hash_new

Net-Effect Computation

Both changes share the same pk_hash. The pk_stats CTE finds cnt = 2, so this goes through the multi-change path:

first_action = FIRST_VALUE(action) OVER (...) → 'I'
last_action  = LAST_VALUE(action)  OVER (...) → 'D'

The scan delta applies the net-effect filtering:

  • DELETE branch: requires first_action != 'I' → FAILS (first_action = 'I')
  • INSERT branch: requires last_action != 'D' → FAILS (last_action = 'D')

Result: zero events emitted. The INSERT and DELETE completely cancel each other out.

The aggregate never sees charlie. The stream table is unchanged. This is correct — the row was born and died within the same refresh window, so it should have no visible effect.
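The suppression logic can be sketched as follows (illustrative Python mirroring the branch conditions above):

```python
# first_action = 'I' suppresses the DELETE branch; last_action = 'D'
# suppresses the INSERT branch. A row born and deleted inside one
# refresh window therefore emits nothing.
def net_effect(changes):
    # changes: list of (action, old_value, new_value), ordered by change_id
    first_action, last_action = changes[0][0], changes[-1][0]
    events = []
    if first_action != "I":
        events.append(("D", changes[0][1]))   # old values of earliest change
    if last_action != "D":
        events.append(("I", changes[-1][2]))  # new values of latest change
    return events

# INSERT charlie 200.00, then DELETE the same row, same refresh window:
print(net_effect([("I", None, 200.00), ("D", 200.00, None)]))  # []
```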


Case 5: UPDATE + DELETE in Same Window

UPDATE orders SET amount = 999.99 WHERE id = 3;  -- bob: 75 → 999.99
DELETE FROM orders WHERE id = 3;

The change buffer:

change_id | action | old_amt | new_amt
11        | U      | 75.00   | 999.99
12        | D      | 999.99  | NULL

Net-Effect Computation

Same pk_hash, cnt = 2:

first_action = 'U'  (row existed before this window)
last_action  = 'D'  (row no longer exists)

Filtering:

  • DELETE branch: first_action != 'I' → OK. Emit DELETE with old values from the earliest change: old_amt = 75.00
  • INSERT branch: last_action != 'D' → FAILS. No INSERT emitted.

Net delta:

__pgt_row_id | __pgt_action | amount
-------------|--------------|-------
pk_hash_3    | D            | 75.00

The intermediate value of 999.99 never appears. The aggregate sees only the removal of the original value (75.00), which is correct — that's the value that was previously accounted for in the stream table.


Case 6: DELETE with JOINs

Consider a stream table that joins two tables:

CREATE TABLE customers (
    id   SERIAL PRIMARY KEY,
    name TEXT NOT NULL,
    tier TEXT NOT NULL DEFAULT 'standard'
);

CREATE TABLE orders (
    id          SERIAL PRIMARY KEY,
    customer_id INT REFERENCES customers(id),
    amount      NUMERIC(10,2)
);

SELECT pgtrickle.create_stream_table(
    name         => 'order_details',
    query        => $$
      SELECT c.name, c.tier, o.amount
      FROM orders o
      JOIN customers c ON o.customer_id = c.id
    $$,
    schedule => '1m'
);

Seed data:

INSERT INTO customers VALUES (1, 'alice', 'premium'), (2, 'bob', 'standard');
INSERT INTO orders VALUES (1, 1, 50.00), (2, 1, 30.00), (3, 2, 75.00);

After refresh, the stream table has:

name  | tier     | amount
------|----------|-------
alice | premium  | 50.00
alice | premium  | 30.00
bob   | standard | 75.00

Now delete an order:

DELETE FROM orders WHERE id = 2;  -- alice's 30.00 order

How the JOIN Delta Works

The join differentiation formula:

$$\Delta(L \bowtie R) = (\Delta L \bowtie R) \cup (L \bowtie \Delta R) - (\Delta L \bowtie \Delta R)$$

Since only the orders table changed:

  • $\Delta L$ = changes to orders (one DELETE: order #2)
  • $\Delta R$ = changes to customers (empty)

So:

  • Part 1: $\Delta\text{orders} \bowtie \text{customers}$ = the deleted order joined with its customer
  • Part 2: $\text{orders} \bowtie \Delta\text{customers}$ = empty (no customer changes)
  • Part 3: $\Delta\text{orders} \bowtie \Delta\text{customers}$ = empty (customers unchanged)

Part 1 produces:

name  | tier    | amount | __pgt_action
------|---------|--------|-------------
alice | premium | 30.00  | D

The deleted order is joined with alice's customer record to produce a DELETE delta row with the complete joined values.

MERGE

The MERGE matches the row (alice, premium, 30.00) and deletes it:

SELECT * FROM order_details;
 name  | tier     | amount
-------|----------|-------
 alice | premium  | 50.00      ← alice's remaining order
 bob   | standard | 75.00

What About Deleting From the Dimension Table?

DELETE FROM customers WHERE id = 2;  -- remove bob entirely

Now $\Delta R$ has a DELETE for bob, while $\Delta L$ is empty:

  • Part 1: $\Delta\text{orders} \bowtie \text{customers}$ = empty
  • Part 2: $\text{orders} \bowtie \Delta\text{customers}$ = bob's order(s) joined with deleted customer record

Part 2 produces DELETE events for every order that referenced bob:

name | tier     | amount | __pgt_action
-----|----------|--------|-------------
bob  | standard | 75.00  | D

After MERGE, bob's rows vanish from the stream table.

Note: In practice the foreign key would block this DELETE while orders rows still reference customer #2 — you'd need ON DELETE CASCADE or to delete the orders first. From the IVM perspective, though, the join delta correctly handles the removal regardless.


Case 7: Bulk DELETE

-- Starting again from the seeded data in Setup:
DELETE FROM orders WHERE amount < 50.00;

This deletes multiple rows across potentially multiple groups. The trigger fires once per row (it's a FOR EACH ROW trigger), writing one change buffer entry per deleted row:

change_id | action | old_cust | old_amt | pk_hash
13        | D      | alice    | 30.00   | pk_hash_2
14        | D      | bob      | 25.00   | pk_hash_4

Scan Delta

Each deleted PK is independent (different pk_hash values), so each takes the single-change fast path. Two DELETE events:

__pgt_row_id | __pgt_action | customer | amount
-------------|--------------|----------|-------
pk_hash_2    | D            | alice    | 30.00
pk_hash_4    | D            | bob      | 25.00

Aggregate Delta

The aggregate groups these by customer:

Group "alice":
  del_count = 1, del_total = 30.00
  new_count = 2 - 1 = 1  (survives)

Group "bob":
  del_count = 1, del_total = 25.00
  new_count = 2 - 1 = 1  (survives)

Both groups survive (count > 0), so the aggregate emits UPDATE (as 'I') events with new values:

customer | total | order_count
---------|-------|------------
alice    | 50.00 | 1
bob      | 75.00 | 1

The MERGE updates both rows. All work is proportional to the number of deleted rows (2), not the total table size.


Case 8: TRUNCATE (Automatic Full Refresh)

TRUNCATE orders;

TRUNCATE does not fire row-level triggers. However, as of v0.2.0, pg_trickle installs a statement-level AFTER TRUNCATE trigger that writes a 'T' marker to the change buffer. On the next refresh cycle, the scheduler detects this marker and automatically performs a full refresh — truncating the stream table and recomputing from the defining query.

No manual intervention is required. For details on how TRUNCATE is handled across all three refresh modes (DIFFERENTIAL, IMMEDIATE, FULL), see What Happens When You TRUNCATE a Table?.


How DELETE Differs From INSERT and UPDATE — A Summary

Aspect              | INSERT                  | UPDATE                     | DELETE
--------------------|-------------------------|----------------------------|-------------------------
Trigger writes      | new_* columns only      | Both new_* and old_*       | old_* columns only
new_* columns       | Row values              | New values                 | NULL
old_* columns       | NULL                    | Old values                 | Row values
pk_hash source      | NEW.pk                  | NEW.pk                     | OLD.pk
Scan delta output   | 1 INSERT event          | 2 events (D+I split)       | 1 DELETE event
Aggregate effect    | Adds to group count/sum | Subtracts old, adds new    | Subtracts from group
Can delete a group? | No (only creates/grows) | Yes (if group key changes) | Yes (if count reaches 0)
MERGE action        | INSERT new row          | UPDATE existing row        | DELETE matched row

The Reference Counting Principle

The core insight behind incremental DELETE handling is reference counting. Every aggregate group in the stream table maintains an internal counter (__pgt_count) that tracks how many source rows contribute to the group:

Stream table internal state:

customer | total  | order_count | __pgt_count (hidden)
---------|--------|-------------|---------------------
alice    | 80.00  | 2           | 2
bob      | 100.00 | 2           | 2

  • INSERT → __pgt_count += 1
  • DELETE → __pgt_count -= 1
  • UPDATE → __pgt_count += 0 (the D cancels the I for same-group updates)

When __pgt_count reaches 0:

  • The group has zero contributing rows
  • The aggregate emits a DELETE event
  • The MERGE removes the row from the stream table

This is mathematically rigorous — the stream table always reflects the correct result of the defining query over the current base table contents, incrementally maintained through algebraic delta operations.
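A toy model of the reference-counting invariant (illustrative Python, not the extension's storage format):

```python
# Each group carries a hidden contributor count; the group's row is removed
# exactly when that count reaches zero.
def apply(state, action, customer, amount):
    # state: {group_key: (total, refcount)}
    total, count = state.get(customer, (0.0, 0))
    sign = 1 if action == "I" else -1
    total, count = round(total + sign * amount, 2), count + sign
    if count <= 0:
        state.pop(customer, None)  # zero contributors: group must not exist
    else:
        state[customer] = (total, count)

state = {"alice": (80.00, 2), "bob": (100.00, 2)}
apply(state, "D", "alice", 30.00)  # alice survives: (50.0, 1)
apply(state, "D", "alice", 50.00)  # refcount hits 0 → alice's row removed
print(state)  # {'bob': (100.0, 2)}
```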


Performance Summary

Scenario                           | Buffer rows | Delta rows emitted           | Work
-----------------------------------|-------------|------------------------------|--------------------------
Single row DELETE (group survives) | 1           | 1 (D)                        | O(1) per group
Single row DELETE (group vanishes) | 1           | 1 (D)                        | O(1)
N deletes same group               | N           | N (D) → 1 group delta        | O(N) scan, O(1) per group
INSERT+DELETE same window          | 2           | 0 (cancels)                  | O(1)
UPDATE+DELETE same window          | 2           | 1 (D original)               | O(1)
Bulk DELETE across M groups        | N           | N (D) → M group deltas       | O(N) scan, O(M) aggregate
JOIN table DELETE                  | 1           | K (one per matched join row) | O(K) join

In all cases, the work is proportional to the number of changed rows, not the total table size. Deleting 3 rows from a billion-row table produces the same delta cost as from a 10-row table.


What About IMMEDIATE Mode?

Everything above describes DIFFERENTIAL mode — changes accumulate in a buffer and are applied on a schedule. As of v0.2.0, pg_trickle also supports IMMEDIATE mode, where the stream table is updated synchronously within the same transaction as your DELETE.

How IMMEDIATE Mode Differs for DELETE

Phase           | DIFFERENTIAL                                       | IMMEDIATE
----------------|----------------------------------------------------|----------------------------------------------------------
Trigger type    | Row-level AFTER trigger                            | Statement-level AFTER trigger with REFERENCING OLD TABLE
What's captured | One buffer row with old_* columns per deleted row  | A transition table containing all deleted rows
When delta runs | Next scheduler tick                                | Immediately, in the same transaction
Delta source    | Change buffer rows with action='D'                 | Temp table copied from transition table
Concurrency     | No locking between writers                         | Advisory lock per stream table

When you run DELETE FROM orders WHERE id = 2:

  1. A BEFORE DELETE trigger acquires an advisory lock on the stream table
  2. The AFTER DELETE trigger captures OLD TABLE AS __pgt_oldtable into a temp table
  3. The DVM engine generates the same aggregate delta, reading deleted values from the old-table
  4. The delta is applied to the stream table immediately — groups are decremented, and groups reaching count=0 are removed
  5. Any query within the same transaction sees the updated stream table

BEGIN;
DELETE FROM orders WHERE id = 2;  -- alice's 30.00 order
-- customer_totals already reflects the deletion here!
SELECT * FROM customer_totals WHERE customer = 'alice';
-- Shows: alice | 50.00 | 1
COMMIT;

The same reference counting, group deletion, and net-effect logic applies — the only difference is the data source (transition tables vs change buffer) and timing (synchronous vs scheduled).


Next in This Series

What Happens When You TRUNCATE a Table?

This tutorial explains what happens when a TRUNCATE statement hits a base table that is referenced by a stream table. Unlike INSERT, UPDATE, and DELETE — which are fully tracked by the CDC trigger — TRUNCATE is a special case that bypasses row-level triggers entirely. Understanding this gap is essential for operating pg_trickle correctly.

Prerequisite: Read WHAT_HAPPENS_ON_INSERT.md first — it introduces the 7-phase lifecycle. This tutorial explains why TRUNCATE breaks that lifecycle and how to recover.

Setup

Same e-commerce example used throughout the series:

CREATE TABLE orders (
    id       SERIAL PRIMARY KEY,
    customer TEXT NOT NULL,
    amount   NUMERIC(10,2) NOT NULL
);

SELECT pgtrickle.create_stream_table(
    name     => 'customer_totals',
    query    => $$
      SELECT customer, SUM(amount) AS total, COUNT(*) AS order_count
      FROM orders GROUP BY customer
    $$,
    schedule => '1m'
);

-- Seed some data
INSERT INTO orders (customer, amount) VALUES
    ('alice', 50.00),
    ('alice', 30.00),
    ('bob',   75.00),
    ('bob',   25.00);

After the first refresh, the stream table contains:

customer | total  | order_count
---------|--------|------------
alice    | 80.00  | 2
bob      | 100.00 | 2

Case 1: TRUNCATE the Base Table (DIFFERENTIAL Mode)

TRUNCATE orders;

All four rows are removed instantly.

What Happens at the Trigger Level: TRUNCATE Marker

Updated in v0.2.0: pg_trickle now installs a statement-level AFTER TRUNCATE trigger on tracked source tables. This trigger writes a single marker row to the change buffer with action = 'T'.

Unlike the per-row DML triggers, the TRUNCATE trigger cannot capture individual row data (PostgreSQL's TRUNCATE does not provide OLD records). Instead, it writes a sentinel:

pgtrickle_changes.changes_16384
┌───────────┬─────────────┬────────┬──────────┬──────────┐
│ change_id │ lsn         │ action │ new_*    │ old_*    │
├───────────┼─────────────┼────────┼──────────┼──────────┤
│ 5         │ 0/1A3F4000  │ T      │ NULL     │ NULL     │
└───────────┴─────────────┴────────┴──────────┴──────────┘

The 'T' action marker tells the refresh engine: "a TRUNCATE happened — a full refresh is required."

What Happens at the Scheduler: Automatic Full Refresh

On the next refresh cycle, the scheduler:

  1. Checks the change buffer for rows in the LSN window
  2. Finds the action = 'T' marker row
  3. Falls back to a FULL refresh — regardless of the stream table's configured refresh_mode
  4. TRUNCATEs the stream table
  5. Re-executes the defining query against the current base table state
  6. Inserts all results
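
The marker check in step 2 can be pictured as a simple existence query. The buffer table matches the example above; the frontier LSN value is illustrative.

```sql
-- Illustrative: does the current LSN window contain a TRUNCATE marker?
SELECT EXISTS (
    SELECT 1
    FROM pgtrickle_changes.changes_16384
    WHERE lsn > '0/1A3F0000'::pg_lsn   -- the stream table's current frontier
      AND action = 'T'
) AS needs_full_refresh;
```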

Since the orders table is now empty, the defining query returns zero rows:

-- After the next scheduled refresh:
SELECT * FROM customer_totals;
 customer | total | order_count
----------|-------|------------
 (0 rows)                        ← correct: orders is empty

No manual intervention required. The TRUNCATE marker ensures the stream table is automatically brought back into consistency on the next refresh cycle.

Note: In versions before v0.2.0, TRUNCATE was not captured at all — the change buffer stayed empty and the stream table became silently stale. If you're running an older version, you still need to call pgtrickle.refresh_stream_table() manually after a TRUNCATE.


Case 2: Manual Refresh (Explicit Recovery)

Although TRUNCATE is now automatically handled on the next refresh cycle, you can force an immediate recovery without waiting:

SELECT pgtrickle.refresh_stream_table('customer_totals');

This executes a full refresh regardless of the stream table's configured refresh mode:

  1. TRUNCATE the stream table itself (clearing the stale data)
  2. Re-execute the defining query
  3. INSERT the results into the stream table
  4. Update the frontier so future differential refreshes start from the current LSN

This is useful when you can't wait for the next scheduled refresh cycle and need the stream table consistent immediately.
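
Conceptually, steps 1-3 of the full refresh amount to the following plain SQL (the frontier bookkeeping in step 4 is internal to pg_trickle; this is an illustration, not the actual implementation):

```sql
-- Conceptual equivalent of a full refresh of customer_totals:
BEGIN;
TRUNCATE customer_totals;
INSERT INTO customer_totals (customer, total, order_count)
SELECT customer, SUM(amount), COUNT(*)
FROM orders
GROUP BY customer;
COMMIT;
```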


Case 3: TRUNCATE Then INSERT (Common ETL Pattern)

A common data loading pattern is:

BEGIN;
TRUNCATE orders;
INSERT INTO orders (customer, amount) VALUES
    ('charlie', 100.00),
    ('charlie', 200.00),
    ('dave',    150.00);
COMMIT;

What the Change Buffer Sees

  • TRUNCATE: 1 marker event (action = 'T') — captured by the statement-level trigger
  • INSERT charlie 100.00: 1 event (captured)
  • INSERT charlie 200.00: 1 event (captured)
  • INSERT dave 150.00: 1 event (captured)

The change buffer has 4 rows — the TRUNCATE marker plus 3 INSERT events.

What the Scheduler Does

The scheduler sees the action = 'T' marker and triggers a full refresh, ignoring the individual INSERT events. The full refresh re-executes the defining query against the current state of orders, which now contains only charlie and dave:

-- After the next scheduled refresh:
SELECT * FROM customer_totals;
 customer | total  | order_count
----------|--------|------------
 charlie  | 300.00 | 2            ← correct
 dave     | 150.00 | 1            ← correct

The old data (alice, bob) is gone because the full refresh recomputed from scratch. This is correct — the TRUNCATE marker ensures consistency regardless of what other changes occurred in the same window.


Case 4: TRUNCATE a Dimension Table in a JOIN

Consider a stream table that joins two tables:

-- Start fresh for this example (orders is redefined with a new schema below)
DROP TABLE IF EXISTS orders;

CREATE TABLE customers (
    id   SERIAL PRIMARY KEY,
    name TEXT NOT NULL,
    tier TEXT NOT NULL DEFAULT 'standard'
);

CREATE TABLE orders (
    id          SERIAL PRIMARY KEY,
    customer_id INT REFERENCES customers(id),
    amount      NUMERIC(10,2)
);

SELECT pgtrickle.create_stream_table(
    name         => 'order_details',
    query        => $$
      SELECT c.name, c.tier, o.amount
      FROM orders o
      JOIN customers c ON o.customer_id = c.id
    $$,
    schedule => '1m'
);

Now truncate the dimension table:

TRUNCATE customers CASCADE;

The CASCADE also truncates orders (due to the foreign key). Both tables have TRUNCATE triggers installed, so both write a 'T' marker to their respective change buffers.

On the next refresh cycle, the scheduler detects the TRUNCATE markers and performs a full refresh. The stream table is recomputed from the now-empty base tables:

-- After the next scheduled refresh:
SELECT * FROM order_details;
-- (0 rows) — correct

Case 5: FULL Mode Stream Tables Are Immune

If the stream table uses FULL refresh mode instead of DIFFERENTIAL:

SELECT pgtrickle.create_stream_table(
    name         => 'customer_totals_full',
    query        => $$
      SELECT customer, SUM(amount) AS total, COUNT(*) AS order_count
      FROM orders GROUP BY customer
    $$,
    schedule     => '1m',
    refresh_mode => 'FULL'
);

A FULL-mode stream table doesn't use the change buffer at all. Every refresh cycle:

  1. TRUNCATEs the stream table
  2. Re-executes the defining query
  3. Inserts all results

So after a TRUNCATE of the base table, the next scheduled refresh automatically picks up the correct state — no manual intervention needed. The trade-off is that every refresh recomputes from scratch, which is more expensive for large result sets.


Why PostgreSQL Doesn't Fire Row Triggers on TRUNCATE

Understanding the PostgreSQL internals helps explain why per-row capture is impossible:

Operation                | Mechanism                                              | Row triggers fired?        | Statement triggers fired?
-------------------------|--------------------------------------------------------|----------------------------|--------------------------
DELETE FROM t            | Scans and removes rows one by one                      | Yes — AFTER DELETE per row | Yes
TRUNCATE t               | Removes all heap files and reinitializes table storage | No — no per-row processing | Yes — AFTER TRUNCATE
DELETE FROM t WHERE true | Same as DELETE FROM t (full scan)                      | Yes — AFTER DELETE per row | Yes

TRUNCATE is fundamentally different from DELETE. It's an O(1) operation that replaces the table's storage files, while DELETE is O(N) — scanning every row and recording each removal in WAL.

pg_trickle uses a statement-level AFTER TRUNCATE trigger to detect the event and write a 'T' marker to the change buffer. This marker does not contain per-row data (PostgreSQL's TRUNCATE trigger doesn't provide OLD records), but it's sufficient to signal that a full refresh is needed.
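
You can observe this asymmetry in plain PostgreSQL, without pg_trickle installed. The table and function names below are invented for the demo:

```sql
-- Plain PostgreSQL demo: DELETE fires a row trigger per row,
-- TRUNCATE fires only the statement-level trigger.
CREATE TABLE trig_demo (id INT);
CREATE TABLE trig_log (evt TEXT);

CREATE FUNCTION log_row_delete() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN INSERT INTO trig_log VALUES ('row-delete'); RETURN OLD; END $$;

CREATE FUNCTION log_truncate() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN INSERT INTO trig_log VALUES ('stmt-truncate'); RETURN NULL; END $$;

CREATE TRIGGER t_del AFTER DELETE ON trig_demo
    FOR EACH ROW EXECUTE FUNCTION log_row_delete();
CREATE TRIGGER t_trunc AFTER TRUNCATE ON trig_demo
    FOR EACH STATEMENT EXECUTE FUNCTION log_truncate();

INSERT INTO trig_demo SELECT generate_series(1, 3);
DELETE FROM trig_demo;    -- logs 'row-delete' three times
INSERT INTO trig_demo SELECT generate_series(1, 3);
TRUNCATE trig_demo;       -- logs 'stmt-truncate' exactly once

SELECT evt, count(*) FROM trig_log GROUP BY evt ORDER BY evt;
--      evt       | count
-- ---------------+-------
--  row-delete    |     3
--  stmt-truncate |     1
```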


Alternative: DELETE FROM Instead of TRUNCATE

For DIFFERENTIAL mode, TRUNCATE is now handled automatically (via the 'T' marker and full refresh fallback). However, using DELETE FROM instead of TRUNCATE has its own advantages:

-- Instead of: TRUNCATE orders;
DELETE FROM orders;

This fires the row-level DELETE trigger for every row. The change buffer captures all removals, and the next differential refresh correctly decrements all reference counts through the standard algebraic delta path — avoiding the need for a full refresh fallback.

Approach                  | Speed                 | Stream table consistent?                   | Refresh type
--------------------------|-----------------------|--------------------------------------------|----------------
TRUNCATE orders           | O(1) — instant        | Yes — automatic full refresh on next cycle | FULL (fallback)
DELETE FROM orders        | O(N) — scans all rows | Yes — per-row triggers fire                | DIFFERENTIAL
TRUNCATE + manual refresh | O(1) + O(query)       | Yes — immediately                          | FULL (manual)

For tables with millions of rows, DELETE FROM can be slow and generate significant WAL. TRUNCATE is generally the better choice — the automatic full refresh fallback makes it safe to use.


Best Practices

1. TRUNCATE Is Safe to Use

As of v0.2.0, TRUNCATE on tracked source tables is automatically detected and triggers a full refresh on the next scheduler cycle. No manual intervention is required for standard operation.

2. Use Manual Refresh for Immediate Consistency

If you need the stream table to be consistent immediately (not on the next cycle), call refresh explicitly:

TRUNCATE orders;
SELECT pgtrickle.refresh_stream_table('customer_totals');

3. Consider IMMEDIATE Mode for Real-Time Needs

For stream tables that need to reflect TRUNCATE instantly (within the same transaction), use IMMEDIATE mode. The TRUNCATE trigger automatically performs a full refresh synchronously.

4. Consider FULL Mode for ETL-Heavy Tables

If a table is routinely truncated and reloaded, FULL refresh mode may be simpler than DIFFERENTIAL — it naturally handles TRUNCATE because it recomputes from scratch every cycle.

5. Use trigger_inventory() to Verify Triggers

You can verify that both the DML trigger and the TRUNCATE trigger are installed and enabled:

SELECT * FROM pgtrickle.trigger_inventory();

This shows one row per (source table, trigger type) pair, confirming that both the pg_trickle_cdc_<oid> (DML) and pg_trickle_cdc_truncate_<oid> (TRUNCATE) triggers are present.


How TRUNCATE Compares to Other Operations

Aspect                   | INSERT                   | UPDATE                   | DELETE                   | TRUNCATE
-------------------------|--------------------------|--------------------------|--------------------------|----------------------------------
Row trigger fires?       | Yes (per row)            | Yes (per row)            | Yes (per row)            | No
Statement trigger fires? | Yes                      | Yes                      | Yes                      | Yes (writes 'T' marker)
Change buffer            | 1 row per INSERT         | 1 row per UPDATE         | 1 row per DELETE         | 1 marker row (action = 'T')
Stream table updated?    | Yes (next refresh)       | Yes (next refresh)       | Yes (next refresh)       | Yes (full refresh on next cycle)
Recovery                 | Automatic (differential) | Automatic (differential) | Automatic (differential) | Automatic (full refresh fallback)
FULL mode affected?      | N/A (recomputes)         | N/A (recomputes)         | N/A (recomputes)         | N/A (recomputes)
IMMEDIATE mode?          | Synchronous delta        | Synchronous delta        | Synchronous delta        | Synchronous full refresh
Speed                    | O(1) per row             | O(1) per row             | O(1) per row             | O(1) + O(query) for refresh

What About IMMEDIATE Mode?

In IMMEDIATE mode, TRUNCATE is handled synchronously within the same transaction:

  1. The BEFORE TRUNCATE trigger acquires an advisory lock on the stream table
  2. The AFTER TRUNCATE trigger calls pgt_ivm_handle_truncate(pgt_id)
  3. This function TRUNCATEs the stream table and re-populates it by re-executing the defining query
  4. The stream table is immediately consistent — within the same transaction

SELECT pgtrickle.create_stream_table(
    name         => 'customer_totals_live',
    query        => $$
      SELECT customer, SUM(amount) AS total, COUNT(*) AS order_count
      FROM orders GROUP BY customer
    $$,
    refresh_mode => 'IMMEDIATE'
);

BEGIN;
TRUNCATE orders;
-- customer_totals_live is already empty here!
SELECT * FROM customer_totals_live;  -- (0 rows)
COMMIT;

No waiting for a scheduler cycle, no stale data — TRUNCATE is fully handled in real time.


Summary

As of v0.2.0, TRUNCATE is fully tracked by pg_trickle across all three refresh modes. While it cannot be captured as per-row DELETE events (PostgreSQL's TRUNCATE doesn't process individual rows), pg_trickle uses a statement-level trigger to detect the event and respond appropriately.

The key takeaways:

  1. TRUNCATE is automatically handled — a statement-level AFTER TRUNCATE trigger writes a 'T' marker to the change buffer
  2. DIFFERENTIAL mode: automatic full refresh — the scheduler detects the marker and falls back to a full refresh on the next cycle
  3. IMMEDIATE mode: synchronous full refresh — the stream table is rebuilt within the same transaction
  4. FULL mode: naturally immune — every refresh recomputes from scratch regardless
  5. Manual refresh for instant consistency — call pgtrickle.refresh_stream_table() if you can't wait for the next cycle
  6. DELETE FROM remains an alternative — fires per-row triggers, enabling incremental delta processing instead of full refresh fallback

Next in This Series

Row-Level Security (RLS) on Stream Tables

This tutorial shows how to apply PostgreSQL Row-Level Security to stream tables so that different database roles see only the rows they are permitted to access.

Background

Stream tables materialize the full result set of their defining query, regardless of any RLS policies on the source tables. This matches the behavior of PostgreSQL's built-in MATERIALIZED VIEW — the cache contains everything, and access control is enforced at read time.

The recommended pattern is:

  1. Source tables: may or may not have RLS. Stream tables always see all rows.
  2. Stream table: enable RLS on the stream table and create per-role policies so each role sees only its permitted rows.

Setup: Multi-Tenant Orders

-- Source table: all tenant orders
CREATE TABLE orders (
    id        SERIAL PRIMARY KEY,
    tenant_id INT    NOT NULL,
    product   TEXT   NOT NULL,
    amount    NUMERIC(10,2) NOT NULL
);

INSERT INTO orders (tenant_id, product, amount) VALUES
    (1, 'Widget A', 19.99),
    (1, 'Widget B',  9.50),
    (2, 'Gadget X', 49.00),
    (2, 'Gadget Y', 25.00),
    (3, 'Doohickey', 5.00);

-- Stream table: per-tenant spend summary
SELECT pgtrickle.create_stream_table(
    name  => 'tenant_spend',
    query => $$
      SELECT tenant_id,
             COUNT(*)       AS order_count,
             SUM(amount)    AS total_spend
      FROM orders
      GROUP BY tenant_id
    $$,
    schedule => '1m'
);

After the first refresh, tenant_spend contains all three tenants:

SELECT * FROM pgtrickle.tenant_spend ORDER BY tenant_id;
--  tenant_id | order_count | total_spend
-- -----------+-------------+-------------
--          1 |           2 |       29.49
--          2 |           2 |       74.00
--          3 |           1 |        5.00

Step 1: Enable RLS on the Stream Table

ALTER TABLE pgtrickle.tenant_spend ENABLE ROW LEVEL SECURITY;

Once RLS is enabled, non-superuser roles see zero rows unless a policy grants access. Superusers always bypass RLS, and the table owner bypasses it by default (unless FORCE ROW LEVEL SECURITY is set).

Step 2: Create Per-Tenant Roles

CREATE ROLE tenant_1 LOGIN;
CREATE ROLE tenant_2 LOGIN;

GRANT USAGE  ON SCHEMA pgtrickle TO tenant_1, tenant_2;
GRANT SELECT ON pgtrickle.tenant_spend TO tenant_1, tenant_2;

Step 3: Create RLS Policies

-- Tenant 1 sees only tenant_id = 1
CREATE POLICY tenant_1_policy ON pgtrickle.tenant_spend
    FOR SELECT
    TO tenant_1
    USING (tenant_id = 1);

-- Tenant 2 sees only tenant_id = 2
CREATE POLICY tenant_2_policy ON pgtrickle.tenant_spend
    FOR SELECT
    TO tenant_2
    USING (tenant_id = 2);

Step 4: Verify Filtering

Connect as each tenant role and query:

-- As tenant_1:
SET ROLE tenant_1;
SELECT * FROM pgtrickle.tenant_spend;
--  tenant_id | order_count | total_spend
-- -----------+-------------+-------------
--          1 |           2 |       29.49

RESET ROLE;

-- As tenant_2:
SET ROLE tenant_2;
SELECT * FROM pgtrickle.tenant_spend;
--  tenant_id | order_count | total_spend
-- -----------+-------------+-------------
--          2 |           2 |       74.00

RESET ROLE;

Each tenant sees only their own data. The underlying stream table still contains all rows — the filtering happens at query time via RLS.

How Refresh Works with RLS

Both scheduled and manual refreshes run with superuser-equivalent privileges, so RLS on source tables is always bypassed during refresh. This ensures:

  • The stream table always contains the complete result set.
  • A refresh_stream_table() call produces the same result regardless of who calls it.
  • IMMEDIATE mode (IVM triggers) also bypasses RLS via SECURITY DEFINER trigger functions.
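
The SECURITY DEFINER mechanism can be seen in miniature. This is a hand-written illustration, not pg_trickle's actual refresh function:

```sql
-- Illustration only: a SECURITY DEFINER function executes with the
-- privileges of its owner, not its caller. This is how refreshes can
-- read source tables without being filtered by the caller's RLS policies.
CREATE FUNCTION count_all_orders() RETURNS bigint
LANGUAGE sql SECURITY DEFINER AS $$
    SELECT count(*) FROM orders;
$$;

SET ROLE tenant_1;
SELECT count_all_orders();  -- runs as the function owner, sees every row
RESET ROLE;
```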

Policy Change Detection

pg_trickle automatically detects RLS-related DDL on source tables:

DDL on source table                         | Effect
--------------------------------------------|--------------------------------
CREATE POLICY / ALTER POLICY / DROP POLICY  | Stream table marked for reinit
ALTER TABLE ... ENABLE ROW LEVEL SECURITY   | Stream table marked for reinit
ALTER TABLE ... DISABLE ROW LEVEL SECURITY  | Stream table marked for reinit
ALTER TABLE ... FORCE ROW LEVEL SECURITY    | Stream table marked for reinit
ALTER TABLE ... NO FORCE ROW LEVEL SECURITY | Stream table marked for reinit

Since refreshes bypass RLS, a policy change on a source table does not actually alter the stream table's contents. The reinit is a conservative safety measure: it rebuilds the materialized data so it is verifiably consistent after the source table's security posture changes.

Tips

  • One stream table, many roles: A single stream table can serve all tenants. Each role's RLS policy filters at read time — no per-tenant duplication needed.
  • Write policies: Stream tables are maintained by pg_trickle. Restrict writes to the pg_trickle system by only creating FOR SELECT policies.
  • Default deny: Once RLS is enabled, roles without a matching policy see zero rows. Always test with a non-superuser role.
  • FORCE ROW LEVEL SECURITY: By default, table owners bypass RLS. Use ALTER TABLE ... FORCE ROW LEVEL SECURITY if the owner should also be subject to policies.
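
The default-deny tip can be verified directly. The role tenant_3 below is new, introduced here purely for illustration:

```sql
-- A role with GRANTs but no matching RLS policy sees nothing:
CREATE ROLE tenant_3 LOGIN;
GRANT USAGE  ON SCHEMA pgtrickle TO tenant_3;
GRANT SELECT ON pgtrickle.tenant_spend TO tenant_3;

SET ROLE tenant_3;
SELECT * FROM pgtrickle.tenant_spend;  -- (0 rows): RLS denies by default
RESET ROLE;
```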

Partitioned Tables as Sources

This tutorial shows how pg_trickle works with PostgreSQL's declarative table partitioning. It covers RANGE, LIST, and HASH partitioned source tables, explains what happens when you add or remove partitions, and documents known caveats.

Background

PostgreSQL lets you split large tables into smaller "partitions" — for example one partition per month for an orders table. This is a common technique for managing very large datasets. pg_trickle handles partitioned source tables transparently:

  • CDC triggers fire on all partitions. PostgreSQL 13+ automatically clones row-level triggers from the parent to every child partition. All DML (INSERT, UPDATE, DELETE) on any partition is captured in a single change buffer keyed by the parent table's OID.

  • ATTACH PARTITION is detected automatically. When you add a new partition with pre-existing data, pg_trickle's DDL event trigger detects the change and marks affected stream tables for reinitialization. No manual intervention required.

  • WAL-based CDC works correctly. When using WAL mode, publications are created with publish_via_partition_root = true so all partition changes appear under the parent table's identity.
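
That WAL-mode behavior corresponds to a publication created roughly like this, using the sales table from the example below. The publication name pgt_pub_sales is hypothetical:

```sql
-- Hedged sketch: the kind of publication WAL-based CDC would rely on.
CREATE PUBLICATION pgt_pub_sales
    FOR TABLE sales
    WITH (publish_via_partition_root = true);

-- With publish_via_partition_root = true, changes made in any partition
-- are published under the parent table's identity, so a single change
-- stream covers the whole hierarchy.
```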

Example: Monthly Sales Partitions (RANGE)

-- Create a RANGE-partitioned source table
CREATE TABLE sales (
    id         SERIAL,
    sale_date  DATE    NOT NULL,
    region     TEXT    NOT NULL,
    amount     NUMERIC NOT NULL,
    PRIMARY KEY (id, sale_date)
) PARTITION BY RANGE (sale_date);

-- Create partitions for each half of the year
CREATE TABLE sales_h1_2025 PARTITION OF sales
    FOR VALUES FROM ('2025-01-01') TO ('2025-07-01');
CREATE TABLE sales_h2_2025 PARTITION OF sales
    FOR VALUES FROM ('2025-07-01') TO ('2026-01-01');

-- Insert data across partitions
INSERT INTO sales (sale_date, region, amount) VALUES
    ('2025-02-15', 'US', 100.00),
    ('2025-05-20', 'EU', 250.00),
    ('2025-08-10', 'US', 175.00),
    ('2025-11-30', 'EU', 300.00);

-- Create a stream table over the partitioned source
SELECT pgtrickle.create_stream_table(
    name  => 'regional_sales',
    query => $$
        SELECT region, SUM(amount) AS total, COUNT(*) AS cnt
        FROM sales
        GROUP BY region
    $$,
    schedule     => '1 minute',
    refresh_mode => 'DIFFERENTIAL'
);

-- Refresh to populate
SELECT pgtrickle.refresh_stream_table('regional_sales');

-- Verify — aggregates span all partitions:
SELECT * FROM regional_sales ORDER BY region;
--  region | total  | cnt
-- --------+--------+-----
--  EU     | 550.00 |   2
--  US     | 275.00 |   2

Adding New Partitions

When you add a new partition, any new rows inserted through the parent are automatically captured by CDC triggers. The trigger on the parent is cloned to the new partition by PostgreSQL.

-- Add a new partition for 2026
CREATE TABLE sales_h1_2026 PARTITION OF sales
    FOR VALUES FROM ('2026-01-01') TO ('2026-07-01');

-- Inserts into the new partition are captured normally
INSERT INTO sales (sale_date, region, amount)
    VALUES ('2026-03-15', 'US', 400.00);

-- Next refresh picks up the new row
SELECT pgtrickle.refresh_stream_table('regional_sales');

SELECT * FROM regional_sales ORDER BY region;
--  region | total  | cnt
-- --------+--------+-----
--  EU     | 550.00 |   2
--  US     | 675.00 |   3

ATTACH PARTITION with Pre-Existing Data

The most important edge case: attaching a table that already contains rows. These rows were never seen by CDC triggers, so the stream table would be stale. pg_trickle detects this automatically.

-- Create a standalone table with existing data
CREATE TABLE sales_h2_2026 (
    id        SERIAL,
    sale_date DATE    NOT NULL,
    region    TEXT    NOT NULL,
    amount    NUMERIC NOT NULL,
    PRIMARY KEY (id, sale_date)
);
INSERT INTO sales_h2_2026 (sale_date, region, amount) VALUES
    ('2026-08-01', 'EU', 500.00),
    ('2026-09-15', 'US', 200.00);

-- Attach it to the partitioned table
ALTER TABLE sales ATTACH PARTITION sales_h2_2026
    FOR VALUES FROM ('2026-07-01') TO ('2027-01-01');

-- pg_trickle detects the partition change and marks the stream table
-- for reinitialization. Check:
SELECT pgt_name, needs_reinit
FROM pgtrickle.pgt_stream_tables
WHERE pgt_name = 'regional_sales';
--  pgt_name        | needs_reinit
-- -----------------+--------------
--  regional_sales  | t

-- The next refresh reinitializes — re-reading all data from scratch:
SELECT pgtrickle.refresh_stream_table('regional_sales');

SELECT * FROM regional_sales ORDER BY region;
--  region | total   | cnt
-- --------+---------+-----
--  EU     | 1050.00 |   3
--  US     |  875.00 |   4

DETACH PARTITION

When you detach a partition, the detached table's data is no longer visible through the parent. pg_trickle detects this too and marks stream tables for reinitialization.

-- Archive the old partition
ALTER TABLE sales DETACH PARTITION sales_h1_2025;

-- Stream table is marked for reinit:
SELECT pgt_name, needs_reinit
FROM pgtrickle.pgt_stream_tables
WHERE pgt_name = 'regional_sales';
--  pgt_name        | needs_reinit
-- -----------------+--------------
--  regional_sales  | t

-- After refresh, the detached partition's rows are gone:
SELECT pgtrickle.refresh_stream_table('regional_sales');
SELECT * FROM regional_sales ORDER BY region;
-- (only rows from remaining partitions)

LIST Partitioning

LIST partitioning splits rows by discrete values. It works identically:

CREATE TABLE events (
    id      SERIAL,
    region  TEXT NOT NULL,
    payload TEXT,
    PRIMARY KEY (id, region)
) PARTITION BY LIST (region);

CREATE TABLE events_us PARTITION OF events FOR VALUES IN ('US');
CREATE TABLE events_eu PARTITION OF events FOR VALUES IN ('EU');
CREATE TABLE events_ap PARTITION OF events FOR VALUES IN ('AP');

SELECT pgtrickle.create_stream_table(
    name  => 'event_counts',
    query => 'SELECT region, count(*) AS cnt FROM events GROUP BY region',
    schedule => '1 minute'
);

HASH Partitioning

HASH partitioning distributes rows across a fixed number of partitions. Useful for spreading write load evenly:

CREATE TABLE metrics (
    id        SERIAL PRIMARY KEY,
    sensor_id INT    NOT NULL,
    value     DOUBLE PRECISION
) PARTITION BY HASH (id);

CREATE TABLE metrics_0 PARTITION OF metrics
    FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE metrics_1 PARTITION OF metrics
    FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE metrics_2 PARTITION OF metrics
    FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE metrics_3 PARTITION OF metrics
    FOR VALUES WITH (MODULUS 4, REMAINDER 3);

SELECT pgtrickle.create_stream_table(
    name  => 'sensor_avg',
    query => $$
        SELECT sensor_id, AVG(value) AS avg_val, COUNT(*) AS cnt
        FROM metrics GROUP BY sensor_id
    $$,
    schedule => '1 minute'
);

Foreign Tables

Tables from other databases (via postgres_fdw) can be used as sources, but with restrictions:

  • No trigger-based CDC — foreign tables don't support row-level triggers.
  • No WAL-based CDC — foreign tables don't generate local WAL.
  • FULL refresh works — SELECT * executes a remote query each time.
  • Polling-based CDC works — when pg_trickle.foreign_table_polling is enabled, pg_trickle creates a local snapshot table and detects changes via EXCEPT ALL comparison.

When you use a foreign table as a source, pg_trickle emits an info message explaining the limitations:

CREATE EXTENSION postgres_fdw;

CREATE SERVER remote_db
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'remote-host', dbname 'analytics');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER remote_db OPTIONS (user 'reader');

CREATE FOREIGN TABLE remote_orders (
    id     INT,
    amount NUMERIC
) SERVER remote_db OPTIONS (table_name 'orders');

-- Only FULL refresh is available:
SELECT pgtrickle.create_stream_table(
    name  => 'remote_totals',
    query => 'SELECT SUM(amount) AS total FROM remote_orders',
    schedule     => '5 minutes',
    refresh_mode => 'FULL'
);
-- INFO: pg_trickle: source table remote_orders is a foreign table.
-- Foreign tables cannot use trigger-based or WAL-based CDC —
-- only FULL refresh mode or polling-based change detection is supported.

Known Caveats

Caveat                          | Description
--------------------------------|------------
PostgreSQL 13+ required         | Parent-table triggers only propagate to child partitions on PG 13+. pg_trickle targets PostgreSQL 18, so this is always satisfied.
Partition key in PRIMARY KEY    | PostgreSQL requires the partition key to be part of any unique constraint. This means your PRIMARY KEY must include the partition column.
ATTACH with data = reinitialize | Attaching a partition with pre-existing rows triggers a full reinitialize on the next refresh. For very large tables, this may be slow. Consider gating the source with pgtrickle.gate_source() during bulk partition operations.
Sub-partitioning                | Multi-level partitioning (partitions of partitions) works in principle because triggers propagate through the entire hierarchy, but it is not extensively tested.
pg_partman compatibility        | pg_partman dynamically creates and drops partitions. Since pg_trickle detects ATTACH/DETACH via DDL event triggers, it should work, but this combination is not yet tested.
Partitioned storage tables      | Using a partitioned table as the stream table's storage is not supported. This is tracked for a future release.
DETACH PARTITION CONCURRENTLY   | DETACH PARTITION ... CONCURRENTLY is a two-phase operation. The DDL event trigger fires after the first phase; the partition is not fully detached until the second phase commits. The stream table may briefly reflect the old partition count.

Foreign Table Sources

This tutorial shows how to use a postgres_fdw foreign table as a source for a stream table. Foreign tables let you aggregate data from remote PostgreSQL databases into a local stream table that refreshes automatically.

Background

PostgreSQL's Foreign Data Wrapper (postgres_fdw) lets you define tables that transparently query a remote database. pg_trickle can use these foreign tables as stream table sources, but with different change-detection semantics than regular tables.

Key difference: Foreign tables cannot use trigger-based or WAL-based CDC. Changes are detected either by re-scanning the entire remote table (FULL refresh) or by comparing a local snapshot to the remote data (polling-based CDC).

Step 1 — Set Up the Foreign Server

-- Enable the foreign data wrapper extension
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Create a connection to the remote database
CREATE SERVER warehouse_db
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'warehouse.example.com', dbname 'analytics', port '5432');

-- Map the current user to a remote user
CREATE USER MAPPING FOR CURRENT_USER
    SERVER warehouse_db
    OPTIONS (user 'readonly_user', password 'secret');

Step 2 — Define the Foreign Table

CREATE FOREIGN TABLE remote_orders (
    id          INT,
    customer_id INT,
    amount      NUMERIC(12,2),
    region      TEXT,
    created_at  TIMESTAMP
) SERVER warehouse_db
  OPTIONS (schema_name 'public', table_name 'orders');

Alternatively, import an entire remote schema:

IMPORT FOREIGN SCHEMA public
    LIMIT TO (orders, customers)
    FROM SERVER warehouse_db
    INTO public;

Step 3 — Create a Stream Table with FULL Refresh

The simplest approach uses FULL refresh mode — pg_trickle re-executes the query against the remote table on every refresh cycle:

SELECT pgtrickle.create_stream_table(
    name         => 'orders_by_region',
    query        => $$
        SELECT
            region,
            COUNT(*)        AS order_count,
            SUM(amount)     AS total_revenue,
            AVG(amount)     AS avg_order_value
        FROM remote_orders
        GROUP BY region
    $$,
    schedule     => '5m',
    refresh_mode => 'FULL'
);

pg_trickle will emit an informational message:

INFO: pg_trickle: source table remote_orders is a foreign table.
Foreign tables cannot use trigger-based or WAL-based CDC —
only FULL refresh mode or polling-based change detection is supported.

How FULL refresh works with foreign tables:

  1. Every 5 minutes, pg_trickle executes the defining query.
  2. The query is sent to the remote database via postgres_fdw.
  3. The complete result set replaces the stream table contents.
  4. This is equivalent to a MATERIALIZED VIEW refresh, but automated.

Step 4 — Polling-Based CDC (Optional)

If the remote table is large and changes are small, FULL refresh becomes expensive because it transfers the entire result set every cycle. Polling-based CDC provides a more efficient alternative:

-- Enable polling globally (or per-session)
SET pg_trickle.foreign_table_polling = on;

-- Now create with DIFFERENTIAL mode — pg_trickle will use polling
SELECT pgtrickle.create_stream_table(
    name         => 'orders_by_region_polling',
    query        => $$
        SELECT
            region,
            COUNT(*)        AS order_count,
            SUM(amount)     AS total_revenue,
            AVG(amount)     AS avg_order_value
        FROM remote_orders
        GROUP BY region
    $$,
    schedule     => '5m',
    refresh_mode => 'DIFFERENTIAL'
);

How polling works:

  1. On the first refresh, pg_trickle creates a local snapshot table that mirrors the remote table's data.
  2. On subsequent refreshes, it fetches the current remote data and computes an EXCEPT ALL difference against the snapshot.
  3. Only the changed rows are written to the change buffer and processed through the incremental delta pipeline.
  4. The snapshot table is updated to reflect the new remote state.
  5. When the stream table is dropped, the snapshot table is cleaned up.
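
Steps 2 and 3 can be pictured as a pair of set differences. The snapshot table name below is hypothetical:

```sql
-- Rows now present remotely but missing from the snapshot: inserts
SELECT * FROM remote_orders
EXCEPT ALL
SELECT * FROM pgt_snapshot_remote_orders;

-- Rows in the snapshot that no longer exist remotely: deletes
SELECT * FROM pgt_snapshot_remote_orders
EXCEPT ALL
SELECT * FROM remote_orders;
```

EXCEPT ALL (rather than EXCEPT) preserves duplicate rows, so tables without a unique key still diff correctly.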

Trade-offs:

Aspect           | FULL Refresh                | Polling CDC
-----------------|-----------------------------|--------------------------------------------
Network transfer | Full result set every cycle | Full remote scan, but only diffs applied
Local storage    | Stream table only           | Stream table + snapshot table
Best for         | Small remote tables         | Large remote tables with small change rates
GUC required     | No                          | pg_trickle.foreign_table_polling = on

Step 5 — Verify and Monitor

-- Check stream table status
SELECT * FROM pgtrickle.pgt_status('orders_by_region');

-- Check CDC health (will show foreign table constraints)
SELECT * FROM pgtrickle.check_cdc_health();

-- View refresh history
SELECT * FROM pgtrickle.get_refresh_history('orders_by_region', 5);

-- Monitor staleness
SELECT * FROM pgtrickle.get_staleness('orders_by_region');

Worked Example — Remote Inventory Dashboard

This example aggregates inventory data from a remote warehouse database into a local dashboard table:

-- Remote table definition
CREATE FOREIGN TABLE remote_inventory (
    sku         TEXT,
    warehouse   TEXT,
    quantity    INT,
    updated_at  TIMESTAMP
) SERVER warehouse_db
  OPTIONS (schema_name 'inventory', table_name 'stock_levels');

-- Dashboard: inventory summary by warehouse
SELECT pgtrickle.create_stream_table(
    name     => 'inventory_dashboard',
    query    => $$
        SELECT
            warehouse,
            COUNT(DISTINCT sku)  AS unique_products,
            SUM(quantity)        AS total_units,
            MIN(updated_at)      AS oldest_update,
            MAX(updated_at)      AS newest_update
        FROM remote_inventory
        GROUP BY warehouse
    $$,
    schedule     => '10m',
    refresh_mode => 'FULL'
);

After the first refresh:

SELECT * FROM inventory_dashboard;
 warehouse | unique_products | total_units | oldest_update       | newest_update
-----------+-----------------+-------------+---------------------+---------------------
 east      |             142 |       23500 | 2026-03-14 08:00:00 | 2026-03-14 09:15:00
 west      |              98 |       15200 | 2026-03-14 07:30:00 | 2026-03-14 09:10:00
 central   |             215 |       41000 | 2026-03-14 06:00:00 | 2026-03-14 09:20:00

Constraints and Caveats

Constraint          | Details
--------------------|--------
No trigger CDC      | Foreign tables don't support PostgreSQL row-level triggers.
No WAL CDC          | Foreign tables don't generate local WAL entries.
Network latency     | Each refresh cycle queries the remote database. Schedule accordingly.
Remote availability | If the remote database is down, the refresh will fail (logged in pgt_refresh_history). The stream table retains its last successful data.
Authentication      | CREATE USER MAPPING credentials must remain valid. Use .pgpass or environment variables in production.
Snapshot storage    | Polling CDC creates a snapshot table sized proportionally to the remote table. Monitor disk usage.

FAQ

Q: Why does my stream table over a foreign table only work in FULL mode?

Foreign tables cannot install row-level triggers (the mechanism pg_trickle uses for trigger-based CDC) and don't generate local WAL records (used by WAL-based CDC). FULL refresh works because it simply re-executes the remote query. Enable pg_trickle.foreign_table_polling if you need differential-style change detection.

Q: Can I mix foreign and local tables in the same defining query?

Yes. If your query joins a foreign table with a local table, pg_trickle uses trigger/WAL CDC for the local table and FULL-rescan or polling for the foreign table. The refresh mode must be FULL unless polling is enabled for the foreign table sources.

Q: What happens if the remote database is temporarily unavailable?

The refresh attempt fails, is logged in pgt_refresh_history with status FAILED, and the consecutive_errors counter increments. The stream table retains its last successful data. When the remote database recovers, the next scheduled refresh succeeds and the error counter resets.

Tutorial: Migrating from Materialized Views

This guide shows how to incrementally migrate existing PostgreSQL MATERIALIZED VIEW + manual REFRESH workflows to pg_trickle stream tables.

Why Migrate?

 Aspect              | Materialized View                           | Stream Table
---------------------+---------------------------------------------+-------------------------------------------
 Refresh             | Manual (REFRESH MATERIALIZED VIEW)          | Automatic (scheduler) or manual
 Incremental refresh | Not supported                               | Built-in differential mode
 Blocking reads      | REFRESH without CONCURRENTLY blocks readers | Never blocks readers
 Dependency ordering | Manual                                      | Automatic (DAG-aware topological refresh)
 Monitoring          | None                                        | Built-in views, stats, NOTIFY alerts
 Scheduling          | External (cron, pg_cron)                    | Native (duration, cron, CALCULATED)

Step-by-Step Migration

1. Identify materialized views to migrate

-- List all materialized views with their defining queries
SELECT schemaname, matviewname, definition
FROM pg_matviews
ORDER BY schemaname, matviewname;

2. Create the stream table

Take the materialized view's defining query and pass it to create_stream_table():

Before (materialized view):

CREATE MATERIALIZED VIEW order_totals AS
SELECT customer_id, SUM(amount) AS total, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id;

-- Refreshed via cron or pg_cron:
-- */5 * * * * psql -c "REFRESH MATERIALIZED VIEW CONCURRENTLY order_totals"

After (stream table):

SELECT pgtrickle.create_stream_table(
    name     => 'order_totals',
    query    => 'SELECT customer_id, SUM(amount) AS total, COUNT(*) AS order_count
                 FROM orders GROUP BY customer_id',
    schedule => '5m'
);

3. Update application queries

Stream tables live in the pgtrickle schema by default. Update your application queries to reference the new location:

-- Before:
SELECT * FROM public.order_totals WHERE total > 1000;

-- After:
SELECT * FROM pgtrickle.order_totals WHERE total > 1000;

Or create a view in the original schema for backward compatibility:

CREATE VIEW public.order_totals AS
SELECT customer_id, total, order_count
FROM pgtrickle.order_totals;

4. Recreate indexes

Stream tables are regular heap tables — you can add indexes just like any other table. Recreate the indexes your queries depend on:

-- Before (on materialized view):
CREATE UNIQUE INDEX ON order_totals (customer_id);

-- After (on stream table):
CREATE INDEX ON pgtrickle.order_totals (customer_id);

Note: The __pgt_row_id column is the primary key on stream tables. You cannot add a separate UNIQUE primary key, but you can add regular or unique indexes on your business columns.

5. Remove the old materialized view

Once you've verified the stream table is working correctly:

DROP MATERIALIZED VIEW IF EXISTS public.order_totals;

6. Remove external refresh jobs

Delete any cron jobs, pg_cron entries, or application-level refresh triggers that were maintaining the old materialized view.

Migrating Concurrent Refresh Patterns

If you use REFRESH MATERIALIZED VIEW CONCURRENTLY (which requires a unique index), the stream table equivalent is simpler — differential refresh never blocks readers and doesn't require a unique index:

Before:

CREATE MATERIALIZED VIEW active_users AS
SELECT user_id, MAX(login_at) AS last_login
FROM logins
WHERE login_at > NOW() - INTERVAL '30 days'
GROUP BY user_id;

CREATE UNIQUE INDEX ON active_users (user_id);
REFRESH MATERIALIZED VIEW CONCURRENTLY active_users;

After:

SELECT pgtrickle.create_stream_table(
    name     => 'active_users',
    query    => 'SELECT user_id, MAX(login_at) AS last_login
                 FROM logins
                 WHERE login_at > NOW() - INTERVAL ''30 days''
                 GROUP BY user_id',
    schedule => '1m'
);
-- No unique index needed. No manual refresh needed.

Migrating Cascading Materialized Views

If you have materialized views that depend on other materialized views, the migration is straightforward — pg_trickle handles dependency ordering automatically:

Before:

CREATE MATERIALIZED VIEW order_totals AS
SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id;

CREATE MATERIALIZED VIEW big_customers AS
SELECT customer_id, total FROM order_totals WHERE total > 1000;

-- Must refresh in order:
REFRESH MATERIALIZED VIEW order_totals;
REFRESH MATERIALIZED VIEW big_customers;

After:

SELECT pgtrickle.create_stream_table(
    name     => 'order_totals',
    query    => 'SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id',
    schedule => '1m'
);

SELECT pgtrickle.create_stream_table(
    name     => 'big_customers',
    query    => 'SELECT customer_id, total FROM pgtrickle.order_totals WHERE total > 1000',
    schedule => '1m'
);
-- Dependency ordering is automatic. No manual refresh needed.

Idempotent Deployment

For CI/CD pipelines, use create_or_replace_stream_table() so your migration scripts are safe to re-run:

SELECT pgtrickle.create_or_replace_stream_table(
    name         => 'order_totals',
    query        => 'SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id',
    schedule     => '5m',
    refresh_mode => 'DIFFERENTIAL'
);

Choosing the Right Refresh Mode

 Scenario                                          | Mode
---------------------------------------------------+------------------------------------------------------------
 Most migrations (default)                         | DIFFERENTIAL — only processes changes
 Volatile functions (NOW(), RANDOM()) in the query | FULL — the query result changes even without source DML
 Need real-time consistency within a transaction   | IMMEDIATE
 Unsure                                            | AUTO (default) — pg_trickle picks the best mode per cycle

Migration Checklist

  • Identify all materialized views and their refresh schedules
  • Create equivalent stream tables with matching queries
  • Recreate any required indexes on the stream tables
  • Update application queries to reference the pgtrickle schema
  • Verify data correctness (compare stream table vs. materialized view)
  • Remove external refresh jobs (cron, pg_cron)
  • Drop the old materialized views
  • Set up monitoring (Prometheus/Grafana or built-in views)

Tutorial: Fuse Circuit Breaker

The fuse circuit breaker (v0.11.0+) suspends differential refreshes when the incoming change volume exceeds a threshold. This protects your database from runaway refresh cycles during bulk data loads, accidental mass-deletes, or migration scripts.

When to Use It

  • Bulk ETL loads — loading millions of rows that would overwhelm a differential refresh
  • Data migration scripts — large schema or data changes that temporarily spike the change buffer
  • Protection against accidents — an errant DELETE FROM orders shouldn't silently cascade through all downstream stream tables

How It Works

Normal operation:   Source DML ──▶ CDC ──▶ Refresh
Fuse blows:         Source DML ──▶ CDC ──▶ BLOCKED ──▶ NOTIFY alert (fuse_blown)
After reset:        Source DML ──▶ CDC ──▶ Refresh (resumed)

  1. Each refresh cycle, the scheduler counts pending changes in the buffer.
  2. If the count exceeds fuse_ceiling for fuse_sensitivity consecutive cycles, the fuse blows.
  3. The stream table enters a paused state — no refreshes occur.
  4. A fuse_blown alert is emitted via NOTIFY pg_trickle_alert.
  5. An operator investigates and calls reset_fuse() to resume.
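
The blow/reset cycle above can be sketched as a small state machine. This is an illustrative model of the documented behavior, not the extension's internals; the names Fuse and observe() are assumptions:

```python
class Fuse:
    def __init__(self, ceiling, sensitivity):
        self.ceiling = ceiling          # pending-change threshold per cycle
        self.sensitivity = sensitivity  # consecutive over-ceiling cycles required
        self.over_count = 0
        self.state = "armed"

    def observe(self, pending_changes):
        """Called once per scheduler cycle; returns True if a refresh may run."""
        if self.state == "blown":
            return False                # paused until an operator resets the fuse
        if pending_changes > self.ceiling:
            self.over_count += 1
            if self.over_count >= self.sensitivity:
                self.state = "blown"    # the real extension emits NOTIFY here
                return False
        else:
            self.over_count = 0         # streak broken; fuse stays armed
        return True

    def reset(self):
        self.state, self.over_count = "armed", 0

fuse = Fuse(ceiling=50_000, sensitivity=3)
print([fuse.observe(n) for n in (100, 80_000, 90_000, 70_000)])
# [True, True, True, False] — the third consecutive over-ceiling cycle blows it
```

Note that a single under-ceiling cycle resets the streak, so brief spikes do not trip the fuse.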

Step-by-Step Example

1. Create a stream table with fuse protection

SELECT pgtrickle.create_stream_table(
    name         => 'category_summary',
    query        => 'SELECT category, COUNT(*) AS cnt, SUM(price) AS total
                     FROM products GROUP BY category',
    schedule     => '1m',
    refresh_mode => 'DIFFERENTIAL'
);

-- Arm the fuse: blow when pending changes exceed 50,000 rows
SELECT pgtrickle.alter_stream_table(
    'category_summary',
    fuse           => 'on',
    fuse_ceiling   => 50000,
    fuse_sensitivity => 3    -- require 3 consecutive over-ceiling cycles
);

2. Observe normal operation

-- Insert a small batch — well under the ceiling
INSERT INTO products (name, category, price)
SELECT 'Product ' || i, 'Electronics', 9.99
FROM generate_series(1, 100) i;

-- After the next refresh cycle, the stream table is updated normally
SELECT * FROM pgtrickle.category_summary;

3. Trigger a bulk load

-- Simulate a large ETL load — 100,000 rows
INSERT INTO products (name, category, price)
SELECT 'Bulk ' || i, 'Imported', 4.99
FROM generate_series(1, 100000) i;

After fuse_sensitivity scheduler cycles (3 in our example), the fuse blows. The stream table stops refreshing.

4. Inspect the fuse state

SELECT name, fuse_mode, fuse_state, fuse_ceiling, blown_at, blow_reason
FROM pgtrickle.fuse_status();
     name          | fuse_mode | fuse_state | fuse_ceiling |          blown_at          |       blow_reason
-------------------+-----------+------------+--------------+----------------------------+---------------------------
 category_summary  | on        | blown      |        50000 | 2026-03-31 14:22:01.123+00 | change_count_exceeded

5. Decide how to recover

You have three options:

-- Option A: Apply the changes (process the bulk load normally)
SELECT pgtrickle.reset_fuse('category_summary', action => 'apply');

-- Option B: Skip the changes (discard the batch, resume from current state)
SELECT pgtrickle.reset_fuse('category_summary', action => 'skip_changes');

-- Option C: Reinitialize (full rebuild from the defining query)
SELECT pgtrickle.reset_fuse('category_summary', action => 'reinitialize');

After resetting, the fuse returns to 'armed' state and the scheduler resumes.

Fuse Modes

 Mode   | Behavior
--------+--------------------------------------------------------------------
 'off'  | No fuse protection (default)
 'on'   | Always armed — blows when changes exceed fuse_ceiling
 'auto' | Blows only when a FULL refresh would be cheaper than DIFFERENTIAL

'auto' mode is recommended for most use cases — it protects against bulk loads while allowing large-but-efficient differential refreshes to proceed.

Using with dbt

In dbt models, configure the fuse via the stream_table materialization:

-- models/marts/category_summary.sql
{{ config(
    materialized='stream_table',
    schedule='5m',
    refresh_mode='DIFFERENTIAL',
    fuse='auto',
    fuse_ceiling=50000,
    fuse_sensitivity=3
) }}

SELECT category, COUNT(*) AS cnt, SUM(price) AS total
FROM {{ source('raw', 'products') }}
GROUP BY category

Global Defaults

Set a cluster-wide default ceiling via the pg_trickle.fuse_default_ceiling GUC. Stream tables with fuse_ceiling = NULL inherit this value:

ALTER SYSTEM SET pg_trickle.fuse_default_ceiling = 100000;
SELECT pg_reload_conf();

Monitoring

  • pgtrickle.fuse_status() — inspect fuse state for all stream tables
  • LISTEN pg_trickle_alert — receive real-time fuse_blown notifications
  • pgtrickle.dedup_stats() — includes fuse-related counters
  • pgtrickle.pgt_stream_tables.fuse_state — direct catalog query

Tutorial: Tiered Scheduling

Tiered scheduling (v0.12.0+) lets you assign refresh priorities to stream tables using four tiers: Hot, Warm, Cold, and Frozen. This reduces CPU and I/O overhead by refreshing less-critical tables less frequently.

When to Use It

  • You have many stream tables (50+) and want to reduce scheduler load
  • Some tables power real-time dashboards (need hot refresh) while others serve weekly reports (can be cold)
  • You want to freeze tables during maintenance windows without dropping them

Tier Overview

 Tier   | Multiplier | Effect
--------+------------+------------------------------------------------
 hot    | 1×         | Refresh at the configured schedule (default)
 warm   | 2×         | Refresh at 2× the configured interval
 cold   | 10×        | Refresh at 10× the configured interval
 frozen | skip       | Never refreshed until manually promoted

For a stream table with schedule => '1m':

 Tier   | Effective Interval
--------+--------------------
 hot    | 1 minute
 warm   | 2 minutes
 cold   | 10 minutes
 frozen | never

Note: Cron-based schedules are not affected by the tier multiplier. They always fire at the configured cron time.
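
For duration-based schedules, the multiplier rule reduces to a one-line calculation. A minimal sketch (the function name is illustrative, not part of the extension):

```python
from datetime import timedelta

# Tier multipliers for duration-based schedules (cron schedules are exempt).
MULTIPLIERS = {"hot": 1, "warm": 2, "cold": 10}

def effective_interval(schedule: timedelta, tier: str):
    """Return the effective refresh interval, or None for a frozen table."""
    if tier == "frozen":
        return None                      # never refreshed until promoted
    return schedule * MULTIPLIERS[tier]

print(effective_interval(timedelta(minutes=1), "warm"))   # 0:02:00
print(effective_interval(timedelta(minutes=5), "cold"))   # 0:50:00
```

This matches the worked example later in the tutorial: a 5-minute schedule at cold tier yields a 50-minute effective interval.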

Step-by-Step Example

1. Enable tiered scheduling

Tiered scheduling is enabled by default since v0.12.0. Verify:

SHOW pg_trickle.tiered_scheduling;
-- Should return: on

2. Create stream tables with different priorities

-- Real-time dashboard — stays hot (default)
SELECT pgtrickle.create_stream_table(
    name     => 'live_order_count',
    query    => 'SELECT COUNT(*) AS total FROM orders WHERE status = ''active''',
    schedule => '30s'
);

-- Important but not latency-critical
SELECT pgtrickle.create_stream_table(
    name     => 'daily_revenue',
    query    => 'SELECT DATE_TRUNC(''day'', created_at) AS day, SUM(amount) AS revenue
                 FROM orders GROUP BY 1',
    schedule => '1m'
);

-- Weekly report — rarely queried
SELECT pgtrickle.create_stream_table(
    name     => 'customer_lifetime_value',
    query    => 'SELECT customer_id, SUM(amount) AS lifetime_value
                 FROM orders GROUP BY customer_id',
    schedule => '5m'
);

3. Assign tiers

-- live_order_count stays at 'hot' (default) — refreshes every 30s

-- daily_revenue: 2× multiplier → effective interval = 2 minutes
SELECT pgtrickle.alter_stream_table('daily_revenue', tier => 'warm');

-- customer_lifetime_value: 10× multiplier → effective interval = 50 minutes
SELECT pgtrickle.alter_stream_table('customer_lifetime_value', tier => 'cold');

4. Verify effective schedules

SELECT pgt_name, schedule, refresh_tier,
       CASE refresh_tier
           WHEN 'hot'  THEN schedule
           WHEN 'warm' THEN schedule || ' ×2'
           WHEN 'cold' THEN schedule || ' ×10'
           WHEN 'frozen' THEN 'never'
       END AS effective
FROM pgtrickle.pgt_stream_tables
ORDER BY refresh_tier;

5. Freeze a table during maintenance

-- Freeze before a schema migration
SELECT pgtrickle.alter_stream_table('customer_lifetime_value', tier => 'frozen');

-- ... perform migration ...

-- Promote back when ready
SELECT pgtrickle.alter_stream_table('customer_lifetime_value', tier => 'warm');

Choosing the Right Tier

 Use Case                                   | Recommended Tier
--------------------------------------------+------------------
 Real-time dashboards, alerting tables      | hot
 Operational reports queried hourly         | warm
 Weekly/monthly analytics, batch consumers  | cold
 Tables under maintenance, seasonal reports | frozen

Rules of thumb:

  • Start with everything at hot (the default). Move tables to warm or cold as you identify which ones can tolerate more staleness.
  • Warm halves the refresh CPU cost compared to hot.
  • Cold reduces refresh overhead by 90%.
  • Use frozen sparingly — changes accumulate in the buffer and will be processed when you promote the table back.

Monitoring Tiers

-- Check which tables are in which tier
SELECT pgt_name, refresh_tier, status, staleness
FROM pgtrickle.stream_tables_info
ORDER BY refresh_tier, staleness DESC;

-- Find frozen tables (these are NOT being refreshed)
SELECT pgt_name, refresh_tier
FROM pgtrickle.pgt_stream_tables
WHERE refresh_tier = 'frozen';

Troubleshooting

All tables are frozen and nothing is refreshing:
If every stream table is set to frozen, the scheduler has nothing to do. Promote at least one table back to hot or warm.

Staleness exceeds expectations for cold tables:
Remember that cold applies a 10× multiplier. A 5-minute schedule becomes a 50-minute effective interval. If this is too stale, use warm instead.

Tutorial: Tuning Refresh Mode

This tutorial walks you through using pg_trickle's built-in diagnostics to determine whether your stream tables are running in the most efficient refresh mode (FULL vs DIFFERENTIAL), and how to act on the recommendations.

Prerequisites

  • pg_trickle v0.14.0 or later
  • At least one stream table with several completed refresh cycles (the diagnostics become more accurate with more history)

Step 1: Check Current Refresh Efficiency

Start by reviewing how your stream tables are performing with their current refresh mode:

SELECT pgt_name, refresh_mode, diff_count, full_count,
       avg_diff_ms, avg_full_ms, diff_speedup
FROM pgtrickle.refresh_efficiency();

Example output:

 pgt_name      | refresh_mode | diff_count | full_count | avg_diff_ms | avg_full_ms | diff_speedup
---------------+--------------+------------+------------+-------------+-------------+--------------
 order_totals  | DIFFERENTIAL |        142 |          3 |        12.4 |       850.2 | 68.6x
 user_stats    | FULL         |          0 |         14 |             |      5320.1 |
 daily_metrics | DIFFERENTIAL |         98 |         47 |       425.8 |       410.3 | 1.0x

Key observations:

  • order_totals: DIFFERENTIAL is 68× faster — this is a great fit.
  • user_stats: Running in FULL mode with no DIFFERENTIAL history — worth checking if DIFFERENTIAL would be faster.
  • daily_metrics: DIFFERENTIAL and FULL take about the same time (1.0× speedup). FULL might actually be simpler and more predictable here.

Step 2: Get Recommendations

Use recommend_refresh_mode() to get recommendations derived from weighted diagnostic signals:

SELECT pgt_name, current_mode, recommended_mode, confidence, reason
FROM pgtrickle.recommend_refresh_mode();

Example output:

 pgt_name      | current_mode | recommended_mode | confidence | reason
---------------+--------------+------------------+------------+------------------------------------------------------------------
 order_totals  | DIFFERENTIAL | KEEP             | high       | DIFFERENTIAL is 68.6× faster than FULL with low latency variance
 user_stats    | FULL         | DIFFERENTIAL     | medium     | Query is simple (no complex joins), change ratio is low (2.1%), target table is large
 daily_metrics | DIFFERENTIAL | FULL             | medium     | DIFFERENTIAL shows no speedup over FULL (1.0×); high latency variance (p95/p50 = 4.2) suggests unstable performance

For a single table with full signal details:

SELECT recommended_mode, confidence, reason,
       jsonb_pretty(signals) AS signal_details
FROM pgtrickle.recommend_refresh_mode('daily_metrics');

Step 3: Understand the Signals

The signals JSONB column contains the detailed breakdown of all seven weighted signals that contributed to the recommendation:

{
  "composite_score": -0.22,
  "signals": [
    { "name": "change_ratio_avg", "score": -0.1, "weight": 0.30 },
    { "name": "empirical_timing", "score": -0.3, "weight": 0.35 },
    { "name": "change_ratio_current", "score": -0.2, "weight": 0.25 },
    { "name": "query_complexity", "score": 0.0, "weight": 0.10 },
    { "name": "target_size", "score": 0.1, "weight": 0.10 },
    { "name": "index_coverage", "score": 0.0, "weight": 0.05 },
    { "name": "latency_variance", "score": -0.4, "weight": 0.05 }
  ]
}

Positive scores favour DIFFERENTIAL; negative scores favour FULL. A composite score above +0.15 recommends DIFFERENTIAL; below −0.15 recommends FULL; in between, the current mode is near-optimal (KEEP).
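
The decision rule can be sketched as a weighted sum checked against the ±0.15 thresholds. The extension's exact aggregation is internal; note that the plain weighted sum of the example signals lands near -0.20, in the same FULL band as the reported composite:

```python
# Sketch of the mode-recommendation decision rule, assuming the composite
# is a weighted sum of per-signal scores. recommend() is an illustrative
# name, not the extension's API.
def recommend(signals, hi=0.15, lo=-0.15):
    composite = sum(s["score"] * s["weight"] for s in signals)
    if composite > hi:
        return composite, "DIFFERENTIAL"
    if composite < lo:
        return composite, "FULL"
    return composite, "KEEP"   # current mode is near-optimal

signals = [
    {"name": "change_ratio_avg",     "score": -0.1, "weight": 0.30},
    {"name": "empirical_timing",     "score": -0.3, "weight": 0.35},
    {"name": "change_ratio_current", "score": -0.2, "weight": 0.25},
    {"name": "query_complexity",     "score":  0.0, "weight": 0.10},
    {"name": "target_size",          "score":  0.1, "weight": 0.10},
    {"name": "index_coverage",       "score":  0.0, "weight": 0.05},
    {"name": "latency_variance",     "score": -0.4, "weight": 0.05},
]
score, mode = recommend(signals)
print(mode)  # FULL — the composite falls below the -0.15 threshold
```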

Confidence levels:

 Level  | Meaning
--------+--------------------------------------------------------
 high   | 10+ completed refresh cycles; strong signal agreement
 medium | 5–10 cycles or mixed signals
 low    | Fewer than 5 cycles; recommendation is speculative

Step 4: Apply the Recommendation

If you decide to follow a recommendation, use ALTER STREAM TABLE:

-- Switch daily_metrics from DIFFERENTIAL to FULL
SELECT pgtrickle.alter_stream_table('daily_metrics',
    refresh_mode => 'FULL'
);

Or switch a table to DIFFERENTIAL:

-- Switch user_stats to DIFFERENTIAL mode
SELECT pgtrickle.alter_stream_table('user_stats',
    refresh_mode => 'DIFFERENTIAL'
);

The change takes effect on the next refresh cycle. No data is lost during the transition.

Step 5: Monitor After the Change

After switching modes, wait for several refresh cycles and re-check:

-- Wait a few minutes, then re-check efficiency
SELECT pgt_name, refresh_mode, diff_count, full_count,
       avg_diff_ms, avg_full_ms, diff_speedup
FROM pgtrickle.refresh_efficiency()
WHERE pgt_name = 'daily_metrics';

Run the recommendation function again to verify the change was beneficial:

SELECT recommended_mode, confidence, reason
FROM pgtrickle.recommend_refresh_mode('daily_metrics');

If the recommendation now says KEEP, the new mode is working well.

Common Scenarios

High-cardinality aggregates

Stream tables with SUM/COUNT/AVG over high-cardinality GROUP BY keys (1000+ groups) are almost always better in DIFFERENTIAL mode. pg_trickle warns about low-cardinality groups at creation time (DIAG-2).

Small tables with frequent full rewrites

If the source table is small (< 10,000 rows) and changes affect > 30% of rows per cycle, FULL refresh is often faster because it avoids the overhead of change tracking and delta application.

Complex multi-join queries

Queries with 4+ JOINs may have high DIFFERENTIAL overhead due to the delta propagation rules. If diff_speedup is below 2×, consider FULL mode.

Tables with volatile functions

Stream tables using volatile functions (e.g., now(), random()) must use FULL mode. pg_trickle rejects volatile functions in DIFFERENTIAL mode at creation time.

Using the TUI

The pgtrickle TUI provides a visual diagnostics panel. Press 5 or d in the interactive dashboard to open the diagnostics view, which shows recommendations with confidence levels for all stream tables at a glance.

From the CLI:

# Show recommendations for all tables
pgtrickle diag

# Show recommendations in JSON format (for automation)
pgtrickle diag --format json

Tutorial: Circular Dependencies

pg_trickle supports circular (cyclic) stream table dependencies (v0.7.0+) for queries that use only monotone operators. The scheduler groups circular dependencies into Strongly Connected Components (SCCs) and iterates them to a fixed point.

When to Use It

  • Transitive closure — computing all reachable nodes in a graph
  • Graph reachability — finding all paths between nodes
  • Iterative convergence — mutual dependencies that stabilize after a few iterations

Prerequisites

Circular dependencies are disabled by default. Enable them:

SET pg_trickle.allow_circular = true;

Monotone Operator Requirement

Only monotone operators are allowed in circular dependency chains. Monotone operators guarantee convergence — the result set grows (or stays the same) with each iteration until a fixed point is reached.

 Allowed (Monotone)               | Blocked (Non-Monotone)
----------------------------------+--------------------------------
 Joins (INNER, LEFT, RIGHT, FULL) | Aggregates (SUM, COUNT, etc.)
 Filters (WHERE)                  | EXCEPT
 Projections (SELECT)             | Window functions
 UNION ALL                        | NOT EXISTS / NOT IN
 INTERSECT                        |
 EXISTS                           |

Creating a circular dependency with non-monotone operators is rejected with a clear error message, regardless of the allow_circular setting.

Step-by-Step Example: Transitive Closure

Suppose you have a graph of relationships:

CREATE TABLE edges (src INT, dst INT);
INSERT INTO edges VALUES
    (1, 2), (2, 3), (3, 4), (4, 5),
    (1, 3), (2, 5);

1. Create the base reachability table

-- Direct edges: all nodes directly connected
SELECT pgtrickle.create_stream_table(
    name     => 'reachable_direct',
    query    => 'SELECT src, dst FROM edges',
    schedule => '1m',
    refresh_mode => 'DIFFERENTIAL'
);

2. Create the transitive closure with a self-reference

-- Transitive closure: if A→B and B→C, then A→C
-- This creates a circular dependency (reachable depends on itself via the join)
SELECT pgtrickle.create_stream_table(
    name     => 'reachable',
    query    => 'SELECT DISTINCT r1.src, r2.dst
                 FROM pgtrickle.reachable_direct r1
                 JOIN pgtrickle.reachable_direct r2 ON r1.dst = r2.src
                 UNION ALL
                 SELECT src, dst FROM edges',
    schedule => '1m',
    refresh_mode => 'DIFFERENTIAL'
);

Note: This example uses the reachable_direct table for the join rather than self-referencing reachable directly. For a true self-referencing cycle, pg_trickle detects the SCC and iterates.

3. Observe the fixed-point iteration

When the scheduler processes an SCC, it iterates until no new rows are produced (the fixed point):

-- Check SCC status
SELECT * FROM pgtrickle.pgt_scc_status();

Output:

 scc_id | members                          | iteration | converged
--------+----------------------------------+-----------+-----------
      1 | {reachable_direct,reachable}     |         3 | true

4. Add new edges and watch convergence

INSERT INTO edges VALUES (5, 1);  -- creates a cycle in the graph

On the next refresh cycle, the scheduler re-iterates the SCC until the transitive closure stabilizes with the new edge.
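
The fixed-point loop the scheduler runs over an SCC can be sketched with semi-naive iteration: each pass joins only the newly derived rows against the base edges, stopping when nothing new appears. This is illustrative code, not the extension's internals; the iteration cap mirrors the max_fixpoint_iterations GUC:

```python
def transitive_closure(edges, max_iterations=100):
    """Iterate the monotone join step to a fixed point (semi-naive evaluation)."""
    closure = set(edges)        # iteration 0: the base facts
    delta = set(edges)          # rows derived in the previous pass
    iterations = 0
    while delta and iterations < max_iterations:
        iterations += 1
        # the monotone step: join newly derived rows against the base edges
        new = {(a, d) for (a, b) in delta for (c, d) in edges if b == c}
        delta = new - closure   # only genuinely new rows feed the next pass
        closure |= delta        # the result set only ever grows (monotone)
    return closure, iterations

edges = {(1, 2), (2, 3), (3, 4), (4, 5), (1, 3), (2, 5)}
closure, iters = transitive_closure(edges)
print(len(closure), iters)  # 10 2 — ten reachable pairs, converged in 2 passes
```

Because the step is monotone, the closure grows until it stabilizes; adding the cycle-creating edge (5, 1) simply makes every node reachable from every other on the next run.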

Monitoring SCCs

-- View all SCCs and their convergence status
SELECT * FROM pgtrickle.pgt_scc_status();

-- Check which stream tables belong to which SCC
SELECT pgt_name, scc_id
FROM pgtrickle.pgt_stream_tables
WHERE scc_id IS NOT NULL;

Controlling Iteration Limits

The pg_trickle.max_fixpoint_iterations GUC limits how many iterations the scheduler attempts before declaring non-convergence:

-- Default: 100 (generous headroom)
SHOW pg_trickle.max_fixpoint_iterations;

-- Lower it for fast-converging workloads
SET pg_trickle.max_fixpoint_iterations = 20;

If convergence is not reached within the limit, all SCC members are marked as ERROR. This prevents runaway infinite loops.

Limitations

  • Non-monotone operators are always rejected — aggregates, EXCEPT, window functions, and NOT EXISTS/NOT IN cannot appear in circular chains because they prevent convergence.
  • Performance scales with iteration count — each iteration runs a full differential refresh cycle for all SCC members. Keep cycles small.
  • All SCC members must use DIFFERENTIAL mode — FULL and IMMEDIATE modes are not supported for circular dependencies.

Tutorial: Monitoring & Alerting

This guide consolidates all pg_trickle monitoring capabilities into a single reference: built-in SQL views, NOTIFY-based alerts, and the Prometheus/Grafana observability stack.

Quick Health Check

The fastest way to verify pg_trickle is healthy:

SELECT * FROM pgtrickle.health_check() WHERE severity != 'OK';

If this returns no rows, everything is working. Any WARN or ERROR rows tell you where to investigate.

Built-in Monitoring Views

Stream table status

-- Overview: name, status, mode, staleness
SELECT name, status, refresh_mode, staleness, stale
FROM pgtrickle.stream_tables_info;

-- Detailed stats: refresh counts, duration, error streaks
SELECT pgt_name, total_refreshes, avg_duration_ms, consecutive_errors, stale
FROM pgtrickle.pg_stat_stream_tables;

-- Live status with error counts
SELECT * FROM pgtrickle.pgt_status();

Refresh history

-- Last 10 refreshes for a specific stream table
SELECT start_time, action, status, duration_ms, rows_inserted, rows_deleted, error_message
FROM pgtrickle.get_refresh_history('order_totals', 10);

-- Global refresh timeline (last 20 events across all stream tables)
SELECT start_time, stream_table, action, status, duration_ms, error_message
FROM pgtrickle.refresh_timeline(20);

-- Aggregate refresh statistics
SELECT * FROM pgtrickle.st_refresh_stats();

CDC pipeline health

-- Per-source CDC mode, WAL lag, and alerts
SELECT * FROM pgtrickle.check_cdc_health();

-- Change buffer sizes (pending changes not yet consumed)
SELECT stream_table, source_table, cdc_mode, pending_rows, buffer_bytes
FROM pgtrickle.change_buffer_sizes()
ORDER BY pending_rows DESC;

-- Verify all CDC triggers are installed and enabled
SELECT source_table, trigger_type, trigger_name
FROM pgtrickle.trigger_inventory()
WHERE NOT present OR NOT enabled;

Dependencies

-- ASCII tree view of the entire dependency graph
SELECT tree_line, status, refresh_mode
FROM pgtrickle.dependency_tree();

-- Diamond consistency groups
SELECT * FROM pgtrickle.diamond_groups();

Fuse circuit breaker

-- Check fuse state for all stream tables
SELECT name, fuse_mode, fuse_state, fuse_ceiling, blown_at
FROM pgtrickle.fuse_status();

Parallel workers

-- Worker pool status (when parallel_refresh_mode = 'on')
SELECT * FROM pgtrickle.worker_pool_status();

-- Recent parallel job history
SELECT job_id, unit_key, status, duration_ms
FROM pgtrickle.parallel_job_status(60);

NOTIFY-Based Alerting

pg_trickle emits real-time events via PostgreSQL's NOTIFY system:

LISTEN pg_trickle_alert;

Event Types

 Event                    | Trigger                                                                      | Severity
--------------------------+------------------------------------------------------------------------------+----------
 stale_data               | Scheduler is behind — view is genuinely out of date                          | Warning
 no_upstream_changes      | Scheduler is healthy but source tables have had no writes — view is correct  | Info
 auto_suspended           | Stream table suspended after max consecutive errors                          | Critical
 resumed                  | Stream table resumed after suspension                                        | Info
 reinitialize_needed      | Upstream DDL change detected                                                 | Warning
 buffer_growth_warning    | Change buffer growing unexpectedly                                           | Warning
 slot_lag_warning         | WAL replication slot retaining excessive data                                | Warning
 fuse_blown               | Circuit breaker tripped                                                      | Warning
 refresh_completed        | Refresh completed successfully                                               | Info
 refresh_failed           | Refresh failed                                                               | Error
 diamond_partial_failure  | One member of an atomic diamond group failed                                 | Warning
 scheduler_falling_behind | Refresh duration approaching the schedule interval                           | Warning
 spill_threshold_exceeded | Delta MERGE spilled to temp files for consecutive refreshes, forcing FULL    | Warning

Notification Payload

Each notification carries a JSON payload:

{
  "event": "auto_suspended",
  "stream_table": "order_totals",
  "consecutive_errors": 3,
  "last_error": "column \"deleted_column\" does not exist",
  "timestamp": "2026-03-31T14:22:01.123Z"
}

Bridging to External Systems

To forward NOTIFY events to external alerting systems (PagerDuty, Slack, OpsGenie), use a listener process:

# Example: Python listener using psycopg (v3)
import json
import psycopg

conn = psycopg.connect("postgresql://user:pass@host/db", autocommit=True)
conn.execute("LISTEN pg_trickle_alert")

# send_to_pagerduty() is your own integration. Only page on actionable events:
# no_upstream_changes is informational (source tables are quiet but healthy),
# while stale_data means the scheduler itself is falling behind.
for notify in conn.notifies():
    payload = json.loads(notify.payload)
    if payload["event"] in ("auto_suspended", "fuse_blown",
                            "refresh_failed", "stale_data"):
        send_to_pagerduty(payload)

Prometheus & Grafana Stack

For production deployments, use the pre-built observability stack in the monitoring/ directory:

cd monitoring/
docker compose up -d

This gives you:

  • Prometheus scraping pg_trickle metrics via postgres_exporter
  • Grafana with a pre-provisioned dashboard
  • Alerting rules for staleness, errors, CDC lag, and scheduler health

See Prometheus & Grafana Integration for full setup details.

Diagnostic Workflow

When something is wrong, follow this systematic workflow:

Step 1 — Global health

SELECT * FROM pgtrickle.health_check() WHERE severity != 'OK';

Step 2 — Status and staleness

SELECT name, status, consecutive_errors, staleness
FROM pgtrickle.pgt_status()
ORDER BY staleness DESC NULLS FIRST;

Step 3 — Recent refresh activity

SELECT start_time, stream_table, action, status, error_message
FROM pgtrickle.refresh_timeline(20);

Step 4 — Error details for a specific stream table

SELECT * FROM pgtrickle.diagnose_errors('my_stream_table');

Step 5 — CDC pipeline

SELECT stream_table, source_table, pending_rows, buffer_bytes
FROM pgtrickle.change_buffer_sizes()
ORDER BY pending_rows DESC;

Step 6 — Trigger verification

SELECT source_table, trigger_type, trigger_name
FROM pgtrickle.trigger_inventory()
WHERE NOT present OR NOT enabled;

Common Alert Responses

 Alert                 | Likely Cause                                       | Action
-----------------------+----------------------------------------------------+--------------------------------------------------------------
 stale_data            | Scheduler behind, long refresh, or lock contention | Check pgt_status() and refresh_timeline()
 auto_suspended        | Repeated refresh failures                          | Fix root cause, then resume_stream_table()
 fuse_blown            | Bulk load exceeded fuse ceiling                    | Investigate, then reset_fuse()
 buffer_growth_warning | Scheduler not consuming buffers fast enough        | Check scheduler status and refresh errors
 reinitialize_needed   | Source table DDL changed                           | Verify schema compatibility; scheduler handles automatically

Further Reading

Tutorial: ETL & Bulk Load Patterns

pg_trickle provides source gating (v0.5.0+) and watermark gating (v0.7.0+) to coordinate stream table refreshes with ETL pipelines and bulk data loads. This tutorial covers common patterns for pausing refreshes during loads and resuming them safely afterward.

The Problem

When you bulk-load data into a source table (e.g., a nightly ETL job), the change buffer fills rapidly. Without coordination:

  • A differential refresh mid-load sees a partial batch, producing incomplete results
  • The adaptive fallback may trigger repeated FULL refreshes during the load
  • The fuse circuit breaker may blow, requiring manual intervention

Source gating solves this by telling pg_trickle to skip refreshes for gated sources until the load completes.

Recipe 1 — Single Source Bulk Load

The simplest pattern: gate the source, load data, ungate.

-- 1. Gate the source table — all dependent stream tables pause
SELECT pgtrickle.gate_source('public.orders');

-- 2. Perform the bulk load
COPY orders FROM '/data/orders_20260331.csv' WITH (FORMAT csv, HEADER);
-- or: INSERT INTO orders SELECT ... FROM staging_orders;

-- 3. Ungate — stream tables resume and process the full batch
SELECT pgtrickle.ungate_source('public.orders');

While gated, the scheduler skips all stream tables that depend on the gated source. Changes still accumulate in the CDC buffer and are processed in a single batch after ungating.

Recipe 2 — Coordinated Multi-Source Load

When your ETL loads multiple tables that feed into the same stream table:

-- Gate all sources involved in the load
SELECT pgtrickle.gate_source('public.orders');
SELECT pgtrickle.gate_source('public.customers');
SELECT pgtrickle.gate_source('public.products');

-- Load all tables
COPY orders FROM '/data/orders.csv' WITH (FORMAT csv, HEADER);
COPY customers FROM '/data/customers.csv' WITH (FORMAT csv, HEADER);
COPY products FROM '/data/products.csv' WITH (FORMAT csv, HEADER);

-- Ungate all at once — stream tables see a consistent snapshot
SELECT pgtrickle.ungate_source('public.orders');
SELECT pgtrickle.ungate_source('public.customers');
SELECT pgtrickle.ungate_source('public.products');

Recipe 3 — Gate + Deferred Stream Table Creation

For initial deployments where data must be loaded before stream tables are created:

-- 1. Gate the source before any stream tables exist
SELECT pgtrickle.gate_source('public.orders');

-- 2. Load the initial data
COPY orders FROM '/data/historical_orders.csv' WITH (FORMAT csv, HEADER);

-- 3. Create stream tables — they won't refresh yet (source is gated)
SELECT pgtrickle.create_stream_table(
    name     => 'order_totals',
    query    => 'SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id',
    schedule => '1m'
);

-- 4. Ungate — the first refresh processes all data cleanly
SELECT pgtrickle.ungate_source('public.orders');

Recipe 4 — Nightly Batch Pattern

A common production pattern using a scheduled batch job:

-- Run nightly at 02:00 UTC

-- Step 1: Gate all ETL sources
DO $$
DECLARE
    src TEXT;
BEGIN
    FOR src IN SELECT DISTINCT source_table
               FROM pgtrickle.list_sources('daily_report')
    LOOP
        PERFORM pgtrickle.gate_source(src);
    END LOOP;
END;
$$;

-- Step 2: Run the ETL pipeline
CALL etl.load_daily_data();

-- Step 3: Ungate all sources
DO $$
DECLARE
    gated RECORD;
BEGIN
    FOR gated IN SELECT source_name FROM pgtrickle.source_gates()
                 WHERE is_gated = true
    LOOP
        PERFORM pgtrickle.ungate_source(gated.source_name);
    END LOOP;
END;
$$;

Monitoring During a Gated Load

While sources are gated, verify the gate status:

-- Check which sources are currently gated
SELECT * FROM pgtrickle.source_gates();

-- Bootstrap gate status (v0.6.0+)
SELECT * FROM pgtrickle.bootstrap_gate_status();

Combining with the Fuse Circuit Breaker

For extra safety, combine gating with the fuse circuit breaker:

-- Arm the fuse as a safety net
SELECT pgtrickle.alter_stream_table('order_totals',
    fuse         => 'on',
    fuse_ceiling => 500000
);

-- Gate for controlled loads
SELECT pgtrickle.gate_source('public.orders');
-- ... load data ...
SELECT pgtrickle.ungate_source('public.orders');

-- The fuse catches any unexpected bulk changes outside the gated window

Watermark Gating (v0.7.0+)

Watermark gating extends source gating with LSN-based coordination for more precise control:

-- Set a watermark — refreshes only consume changes up to this LSN
SELECT pgtrickle.set_watermark('public.orders', pg_current_wal_lsn());

-- Load new data (changes accumulate beyond the watermark)
COPY orders FROM '/data/new_orders.csv' WITH (FORMAT csv, HEADER);

-- Advance the watermark to include the new data
SELECT pgtrickle.advance_watermark('public.orders', pg_current_wal_lsn());

-- Or clear the watermark entirely
SELECT pgtrickle.clear_watermark('public.orders');

See the SQL Reference — Watermark Gating for the complete API.

Further Reading

Tutorial: Migrating from pg_ivm to pg_trickle

This guide walks through migrating existing pg_ivm IMMVs (Incrementally Maintained Materialized Views) to pg_trickle stream tables. It covers API mapping, behavioral differences, and a step-by-step migration checklist.

See also: plans/ecosystem/GAP_PG_IVM_COMPARISON.md for the full feature comparison and gap analysis between the two extensions.


Why Migrate?

| Feature | pg_ivm (IMMV) | pg_trickle (Stream Table) |
|---|---|---|
| Maintenance model | Immediate only (in-transaction) | Deferred (scheduler) and Immediate |
| Aggregate functions | 5 (COUNT, SUM, AVG, MIN, MAX) | 60+ (all built-in + user-defined) |
| Window functions | Not supported | Full support |
| CTEs (recursive) | Not supported | Semi-naive, DRed, recomputation |
| Subqueries | Very limited | Full (EXISTS, NOT EXISTS, IN, LATERAL, scalar) |
| Set operations | Not supported | UNION, INTERSECT, EXCEPT (bag + set) |
| HAVING clause | Not supported | Supported |
| GROUPING SETS / CUBE / ROLLUP | Not supported | Auto-rewritten to UNION ALL |
| DISTINCT ON | Not supported | Auto-rewritten to ROW_NUMBER |
| Views as sources | Not supported | Auto-inlined |
| Cascading views | Not supported | DAG-aware topological scheduling |
| Background scheduling | None (manual only) | Native cron, duration, CALCULATED |
| Monitoring | 1 catalog table | 15+ diagnostic functions |
| Concurrency | ExclusiveLock during maintenance | Advisory locks, non-blocking reads |
| Parallel refresh | Not supported | Worker pool with caps |

Concept Mapping

| pg_ivm Concept | pg_trickle Equivalent | Notes |
|---|---|---|
| IMMV (Incrementally Maintained Materialized View) | Stream table | Same idea — a query result kept incrementally up to date |
| pgivm.create_immv(name, query) | pgtrickle.create_stream_table(name, query) | pg_trickle adds optional schedule and refresh_mode parameters |
| pgivm.refresh_immv(name, true) | pgtrickle.refresh_stream_table(name) | Manual refresh |
| pgivm.refresh_immv(name, false) | No direct equivalent | pg_trickle has pgtrickle.alter_stream_table(name, enabled => false) to suspend |
| pgivm.pg_ivm_immv catalog | pgtrickle.pgt_stream_tables | Plus pgt_status(), refresh_timeline(), etc. |
| DROP TABLE immv_name | pgtrickle.drop_stream_table(name) | Stream tables must be dropped via the API |
| ALTER TABLE immv RENAME TO ... | pgtrickle.alter_stream_table(old, name => new) | Rename via API |
| In-transaction maintenance (AFTER row triggers) | refresh_mode => 'IMMEDIATE' | Same model — triggers fire in the writing transaction |
| (not available) | refresh_mode => 'DIFFERENTIAL' | Deferred incremental refresh via change buffers |
| (not available) | refresh_mode => 'AUTO' | Picks DIFFERENTIAL or FULL automatically |
| Auto-created indexes on GROUP BY / PK | Manual CREATE INDEX | pg_trickle auto-creates the primary key but not secondary indexes |

Step-by-Step Migration

1. Inventory existing IMMVs

List all pg_ivm IMMVs in your database:

-- pg_ivm catalog
SELECT immvrelid::regclass AS immv_name,
       pgivm.get_immv_def(immvrelid) AS defining_query
FROM pgivm.pg_ivm_immv
ORDER BY immvrelid::regclass::text;

Record each IMMV's name, defining query, and any indexes you have created on it.

2. Check query compatibility

pg_trickle supports a superset of pg_ivm's SQL dialect, so any query that works with pg_ivm will work with pg_trickle. However, there are a few things to verify:

  • Data types: pg_ivm requires a btree operator class for every column, which rules out types such as json, xml, and point. pg_trickle has no such restriction.
  • Outer joins: pg_trickle removes pg_ivm's outer-join restrictions (single equijoin only, no aggregates, no CASE). Your query may work unchanged, and you may be able to simplify workarounds you added for pg_ivm.

3. Choose a refresh mode

For each IMMV, decide which pg_trickle refresh mode to use:

| pg_ivm behavior | pg_trickle refresh mode | When to choose |
|---|---|---|
| Zero staleness required | IMMEDIATE | Same in-transaction behavior as pg_ivm |
| Some staleness acceptable | DIFFERENTIAL with schedule | Lower write latency, batched refresh |
| Let pg_trickle decide | AUTO (default) | Recommended for most cases |

4. Create stream tables

For each IMMV, create the corresponding stream table:

pg_ivm (before):

SELECT pgivm.create_immv(
    'order_totals',
    'SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id'
);

pg_trickle — IMMEDIATE mode (same behavior as pg_ivm):

SELECT pgtrickle.create_stream_table(
    'order_totals',
    'SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id',
    NULL,          -- no schedule needed for IMMEDIATE
    'IMMEDIATE'
);

pg_trickle — deferred mode (lower write latency):

SELECT pgtrickle.create_stream_table(
    'order_totals',
    'SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id',
    '30s'          -- refresh every 30 seconds; mode defaults to AUTO
);

5. Recreate indexes

pg_ivm auto-creates indexes on GROUP BY, DISTINCT, and primary key columns. pg_trickle auto-creates the primary key (__pgt_row_id) but not secondary indexes.

Recreate any indexes that your read queries depend on:

-- Example: index on the GROUP BY column for lookup queries
CREATE INDEX ON pgtrickle.order_totals (customer_id);

6. Update application queries

pg_ivm IMMVs live in the schema where they were created (usually public). pg_trickle stream tables default to the pgtrickle schema.

-- Before (pg_ivm):
SELECT * FROM public.order_totals WHERE customer_id = 42;

-- After (pg_trickle):
SELECT * FROM pgtrickle.order_totals WHERE customer_id = 42;

To avoid changing application code, create a compatibility view:

CREATE VIEW public.order_totals AS
SELECT customer_id, total   -- list columns explicitly to hide __pgt_row_id
FROM pgtrickle.order_totals;

7. Verify correctness

After creating the stream table and running a refresh, compare results:

-- Compare row counts
SELECT 'immv' AS source, COUNT(*) FROM public.order_totals_immv
UNION ALL
SELECT 'stream_table', COUNT(*) FROM pgtrickle.order_totals;

-- Full diff (should return zero rows); list columns explicitly so the
-- stream table's extra __pgt_row_id column doesn't break the comparison
(SELECT customer_id, total FROM public.order_totals_immv
 EXCEPT
 SELECT customer_id, total FROM pgtrickle.order_totals)
UNION ALL
(SELECT customer_id, total FROM pgtrickle.order_totals
 EXCEPT
 SELECT customer_id, total FROM public.order_totals_immv);

8. Drop the old IMMV

Once you have verified the stream table is correct and applications are updated:

DROP TABLE public.order_totals_immv;

9. (Optional) Remove pg_ivm

After all IMMVs are migrated:

DROP EXTENSION pg_ivm CASCADE;

Remove pg_ivm from shared_preload_libraries if it was listed there and restart PostgreSQL.


Behavioral Differences to Be Aware Of

Locking

  • pg_ivm: Holds ExclusiveLock on the IMMV during maintenance. In REPEATABLE READ / SERIALIZABLE, concurrent writes to the same IMMV's base tables may raise serialization errors.
  • pg_trickle (IMMEDIATE): Uses advisory locks. Concurrent reads of the stream table are never blocked.
  • pg_trickle (deferred): Base table writes only insert into change buffers (~2–50 μs). No lock contention with refresh.

TRUNCATE

  • pg_ivm: Synchronously truncates or fully refreshes the IMMV.
  • pg_trickle (IMMEDIATE): Performs a full refresh within the same transaction.
  • pg_trickle (deferred): Clears the change buffer and queues a full refresh on the next cycle.

Logical Replication

  • pg_ivm: Not compatible with logical replication — subscriber nodes do not have triggers that fire for replicated changes.
  • pg_trickle (deferred): Supports WAL-based CDC (pg_trickle.cdc_mode = 'wal') which reads from the WAL directly. Trigger-based CDC also works with logical replication if triggers are created on the subscriber.

Schema Changes

  • pg_ivm: No automatic DDL tracking. If a base table column is altered, the IMMV may break silently.
  • pg_trickle: Event triggers detect DDL changes on source tables and automatically reinitialize affected stream tables.

Upgrading Queries That pg_ivm Couldn't Handle

pg_ivm's SQL restrictions often force users to create workarounds. With pg_trickle, many of these workarounds can be simplified:

HAVING clauses

-- pg_ivm workaround: filter in application or wrap in a view
SELECT pgivm.create_immv('big_customers',
    'SELECT customer_id, SUM(amount) AS total
     FROM orders GROUP BY customer_id'
);
-- Then: SELECT * FROM big_customers WHERE total > 1000;

-- pg_trickle: use HAVING directly
SELECT pgtrickle.create_stream_table('big_customers',
    'SELECT customer_id, SUM(amount) AS total
     FROM orders GROUP BY customer_id
     HAVING SUM(amount) > 1000'
);

NOT EXISTS / anti-joins

-- pg_ivm: not supported — manual workaround required

-- pg_trickle: works directly
SELECT pgtrickle.create_stream_table('orphan_orders',
    'SELECT o.* FROM orders o
     WHERE NOT EXISTS (SELECT 1 FROM customers c WHERE c.id = o.customer_id)'
);

Window functions

-- pg_ivm: not supported

-- pg_trickle: works directly
SELECT pgtrickle.create_stream_table('ranked_products',
    'SELECT product_id, category, revenue,
            RANK() OVER (PARTITION BY category ORDER BY revenue DESC) AS rnk
     FROM product_revenue'
);

UNION ALL pipelines

-- pg_ivm: not supported — requires separate IMMVs + application-side UNION

-- pg_trickle: works directly
SELECT pgtrickle.create_stream_table('all_events',
    'SELECT id, ts, ''order'' AS type FROM order_events
     UNION ALL
     SELECT id, ts, ''return'' AS type FROM return_events'
);

Monitoring After Migration

pg_trickle provides extensive monitoring that pg_ivm does not offer:

-- Overall health
SELECT * FROM pgtrickle.health_check();

-- Status of all stream tables (includes staleness, last refresh, error count)
SELECT * FROM pgtrickle.pgt_status();

-- Recent refresh history across all stream tables
SELECT * FROM pgtrickle.refresh_timeline(20);

-- CDC pipeline health
SELECT * FROM pgtrickle.change_buffer_sizes();

-- Diagnose errors for a specific stream table
SELECT * FROM pgtrickle.diagnose_errors('order_totals');

See SQL Reference for the complete list of monitoring functions.

Frequently Asked Questions

This FAQ covers everything from core concepts and getting started, through SQL support details, to operational topics like deployment, monitoring, and troubleshooting. Use the table of contents below to jump to a specific topic.


New User FAQ — Top 15 Questions

New to pg_trickle? Start here. Each answer is a short summary with a link to the full explanation further down.

1. What is pg_trickle?

A PostgreSQL 18 extension that adds stream tables — materialized views that refresh themselves incrementally, processing only changed rows instead of re-running the entire query. Full answer →

2. How is this different from a materialized view?

Stream tables refresh automatically on a schedule, support incremental (differential) refresh, track changes via CDC triggers, and propagate updates through dependency chains — none of which REFRESH MATERIALIZED VIEW provides. Full answer →

3. How do I install pg_trickle?

Install from the Docker image, PGXN, or build from source. Add shared_preload_libraries = 'pg_trickle' to postgresql.conf, then CREATE EXTENSION pg_trickle; in each database. Full answer →

4. How do I create my first stream table?

One function call: SELECT pgtrickle.create_stream_table(name => 'my_st', query => 'SELECT ...', schedule => '5s'); See the Getting Started guide for a walkthrough. Full answer →

5. What is the difference between FULL and DIFFERENTIAL refresh?

FULL re-runs the entire defining query. DIFFERENTIAL reads only the changed rows from the change buffer and computes the delta — orders of magnitude faster for small changes on large tables. AUTO mode picks the best strategy per cycle. Full answer →

6. Which refresh mode should I use?

Use AUTO (the default) — it selects DIFFERENTIAL when possible and falls back to FULL when needed. Use IMMEDIATE for same-transaction consistency. Use FULL only when the defining query uses volatile functions or is not IVM-eligible. Full answer →

7. What SQL features are supported?

Joins (INNER, LEFT, RIGHT, FULL OUTER, CROSS, LATERAL), aggregates (60+ functions including SUM, COUNT, AVG, array_agg, jsonb_agg), CTEs (including recursive), window functions, UNION/INTERSECT/EXCEPT, subqueries, CASE, COALESCE, DISTINCT, GROUP BY with ROLLUP/CUBE/GROUPING SETS, and more. Full answer →

8. How fresh is my stream table data?

As fresh as the refresh schedule allows. With a 1s schedule, data is typically < 2 seconds stale. With IMMEDIATE mode, data is updated within the same transaction as the source write. Full answer →

9. Can I chain stream tables (ST reads from another ST)?

Yes — stream tables can reference other stream tables. pg_trickle builds a dependency DAG and refreshes them in topological order automatically. Full answer →

10. How does change data capture work?

Lightweight row-level AFTER triggers capture every INSERT, UPDATE, and DELETE into per-table change buffers. If wal_level = logical is available, pg_trickle can automatically transition to WAL-based CDC for near-zero write-path overhead. Full answer →

11. Do I need wal_level = logical?

No. pg_trickle works with the default wal_level = replica using trigger-based CDC. WAL-based CDC is optional and provides lower write-path overhead. Full answer →

12. Can I use pg_trickle with PgBouncer / connection poolers?

Yes. pg_trickle's background workers use direct connections, not pooled ones. Your application can use any pooler for reads and writes — the scheduler operates independently. Full answer →

13. How do I monitor stream table health?

Built-in views (pgtrickle.pgt_status, pgtrickle.pgt_refresh_history), Prometheus metrics endpoint, Grafana dashboard, NOTIFY-based alerts, and a TUI tool. Full answer →

14. What happens if a refresh fails?

The stream table is marked SUSPENDED after exceeding the fuse threshold (default 5 consecutive failures). Data in the change buffer is preserved. Use pgtrickle.reset_fuse('my_st') to resume after fixing the issue. Full answer →

15. Can I use pg_trickle with dbt?

Yes — the dbt-pgtrickle package provides a stream_table materialization. dbt run creates/alters stream tables, dbt source freshness checks staleness. Full answer →


Table of Contents

Getting started

Consistency & refresh modes

SQL features

Internals & architecture

Operations

Troubleshooting & reference


General

These questions cover fundamental concepts — what pg_trickle is, how incremental view maintenance works, and the key building blocks (frontiers, row IDs, the auto-rewrite pipeline) that power the extension.

What is pg_trickle?

pg_trickle is a PostgreSQL 18 extension that implements stream tables — declarative, automatically-refreshing materialized views with Differential View Maintenance (DVM). You define a SQL query and a refresh schedule; the extension handles change capture, delta computation, and incremental refresh automatically.

It is inspired by the DBSP differential dataflow framework. See DBSP_COMPARISON.md for a detailed comparison.

What is incremental view maintenance (IVM) and why does it matter?

Incremental View Maintenance means updating a materialized view by processing only the changes (deltas) to the source data, rather than re-executing the entire defining query from scratch.

Consider a stream table defined as SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id over a 10-million-row orders table. When you insert 5 new rows:

  • Without IVM (FULL refresh): Re-scans all 10 million rows and recomputes every group. Cost: O(total rows).
  • With IVM (DIFFERENTIAL refresh): Reads only the 5 new rows from the change buffer, identifies the affected groups, and updates just those groups. Cost: O(changed rows × affected groups).

pg_trickle's DVM engine implements IVM using differentiation rules for each SQL operator (Scan, Filter, Join, Aggregate, etc.), generating a delta query that computes the exact changes to the stream table from the exact changes to the source.

What is the difference between a stream table and a regular materialized view, in practice?

| Feature | Materialized Views | Stream Tables |
|---|---|---|
| Refresh | Manual (REFRESH MATERIALIZED VIEW) | Automatic (scheduler) or manual |
| Incremental refresh | Not supported natively | Built-in differential mode |
| Change detection | None — always full recompute | CDC triggers track row-level changes |
| Dependency ordering | None | DAG-aware topological refresh |
| Monitoring | None | Built-in views, stats, NOTIFY alerts |
| Schedule | None | Duration strings (5m) or cron (*/5 * * * *) |
| Transactional IVM | No | Yes (IMMEDIATE mode) |

In practice, stream tables are regular PostgreSQL heap tables under the hood — you can query them, create indexes on them, join them with other tables, and reference them from views. The key difference is that pg_trickle manages their contents automatically.
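Because of this, the usual PostgreSQL tooling applies directly. A sketch, reusing the order_totals example from elsewhere in this document (the customers join is illustrative):

```sql
-- Secondary index to speed up read queries on the stream table
CREATE INDEX ON pgtrickle.order_totals (total DESC);

-- Join the stream table like any other table
SELECT c.name, t.total
FROM pgtrickle.order_totals AS t
JOIN customers AS c ON c.id = t.customer_id
ORDER BY t.total DESC
LIMIT 10;
```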

What happens behind the scenes when I INSERT a row into a table tracked by a stream table?

The full data flow for a DIFFERENTIAL-mode stream table:

  1. Your INSERT completes normally. The row is written to the source table.
  2. A CDC trigger fires (row-level AFTER INSERT). It writes a change record (action=I, the new row data as JSONB, the current WAL LSN) into the source's change buffer table (pgtrickle_changes.changes_<oid>). This happens within your transaction — if you roll back, the change record is also rolled back.
  3. You commit. Both the source row and the change record become visible.
  4. The scheduler wakes up (every pg_trickle.scheduler_interval_ms, default 1 second). It checks whether the stream table's schedule says a refresh is due.
  5. If due, the refresh engine runs. It reads the change buffer for rows with LSN > the stream table's current frontier, generates a delta query from the DVM operator tree, and applies the result via MERGE.
  6. Frontier advances. The stream table's frontier is updated to the new LSN, and the consumed change buffer rows are cleaned up.

For IMMEDIATE-mode stream tables, steps 2–6 are replaced: a statement-level AFTER trigger computes and applies the delta within your transaction, so the stream table is updated before your transaction commits.

What does "differential" mean in the context of pg_trickle?

"Differential" refers to the mathematical approach of computing differences (deltas) rather than absolute values. Given a query Q and a set of changes ΔR to source table R, the DVM engine computes ΔQ(R, ΔR) — the change to the query result caused by the change to the source. This delta is then applied (merged) into the stream table.

Each SQL operator has its own differentiation rule. For example:

  • Filter: ΔFilter(R, ΔR) = Filter(ΔR) — just apply the filter to the changes.
  • Join: ΔJoin(R, S, ΔR) = Join(ΔR, S) — join the changes against the other side's current state.
  • Aggregate: Recompute only the groups whose keys appear in the changes.

See DVM_OPERATORS.md for the complete set of differentiation rules.
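As an illustrative sketch (not the engine's actual generated SQL), the filter rule means the delta for SELECT * FROM orders WHERE status = 'active' is simply the same predicate applied to the buffered changes, carrying each change's insert/delete action through:

```sql
-- Hypothetical delta query for a Filter node; "changed_orders" stands in
-- for the change buffer, and "action" for the I/D change type
SELECT action, o.*
FROM changed_orders AS o
WHERE o.status = 'active';
```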

What is a frontier, and why does pg_trickle track LSNs?

A frontier is a per-source map of {source_oid → LSN} that records exactly how far each stream table has consumed changes from each of its source tables. It is stored as JSONB in the pgtrickle.pgt_stream_tables catalog.

Why LSNs? PostgreSQL's Write-Ahead Log Sequence Number (LSN) provides a globally ordered, monotonically increasing position in the change stream. By recording the LSN at which each source was last consumed, the frontier ensures:

  • No missed changes. The next refresh reads changes with LSN > frontier, ensuring contiguous, non-overlapping windows.
  • No duplicate processing. Changes at or below the frontier are never re-read.
  • Consistent snapshots. When a stream table depends on multiple source tables, the frontier tracks each source independently, enabling consistent multi-source delta computation.

Lifecycle: Created on first full refresh → Advanced on each differential refresh → Reset on reinitialize.
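To inspect how far each stream table has consumed its sources, you can read the catalog directly (a sketch; the frontier column name is an assumption, so check the actual catalog definition):

```sql
-- Per-source consumption positions: JSONB keys are source table OIDs,
-- values are WAL LSNs (the "frontier" column name is assumed)
SELECT pgt_name, frontier
FROM pgtrickle.pgt_stream_tables;
```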

What is the __pgt_row_id column and why does it appear in my stream tables?

Every stream table has a __pgt_row_id BIGINT PRIMARY KEY column. It stores a 64-bit xxHash of the row's group-by key (for aggregate queries) or all output columns (for non-aggregate queries). The refresh engine uses it to match incoming deltas against existing rows during the MERGE operation.

You should ignore this column in your queries. It is an implementation detail. If it bothers you, exclude it explicitly:

SELECT customer_id, total FROM order_totals;  -- omit __pgt_row_id

What is the auto-rewrite pipeline and how does it affect my queries?

Before parsing a defining query into the DVM operator tree, pg_trickle runs six automatic rewrite passes:

| # | Pass | What it does |
|---|---|---|
| 0 | View inlining | Replaces view references with (view_definition) AS alias subqueries (fixpoint, max depth 10) |
| 1 | DISTINCT ON | Converts to ROW_NUMBER() OVER (PARTITION BY … ORDER BY …) = 1 subquery |
| 2 | GROUPING SETS / CUBE / ROLLUP | Decomposes into UNION ALL of separate GROUP BY queries |
| 3 | Scalar subquery in WHERE | Rewrites WHERE col > (SELECT …) to CROSS JOIN |
| 4 | Correlated scalar subquery in SELECT | Rewrites to LEFT JOIN with grouped inline view |
| 5 | SubLinks in OR | Splits WHERE a OR EXISTS (…) into UNION branches |

The rewrites are transparent — your original query is preserved in the catalog (original_query column) while the rewritten version is stored in defining_query. The DVM engine only sees standard SQL operators after rewriting.
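To see a rewrite pass in action, compare the two catalog columns for one of your stream tables (sketch; 'order_totals' is a placeholder name):

```sql
-- original_query: the SQL you wrote; defining_query: after the rewrite passes
SELECT original_query, defining_query
FROM pgtrickle.pgt_stream_tables
WHERE pgt_name = 'order_totals';
```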

See ARCHITECTURE.md for details on each pass.

How does pg_trickle compare to DBSP (the academic framework)?

pg_trickle is inspired by DBSP but is not a direct implementation. Key differences:

  • DBSP is a general-purpose differential dataflow framework with a Rust runtime (Feldera). It models computation as circuits over Z-sets (multisets with integer weights).
  • pg_trickle implements the same mathematical principles (delta queries, frontier tracking) but embedded inside PostgreSQL as an extension. It generates SQL delta queries rather than running a separate computation engine.
  • Trade-off: pg_trickle leverages PostgreSQL's optimizer, indexes, and storage engine but is limited to what can be expressed as SQL queries. DBSP can implement arbitrary dataflow computations.

See DBSP_COMPARISON.md for a detailed comparison.

How does pg_trickle compare to pg_ivm?

| Feature | pg_ivm | pg_trickle |
|---|---|---|
| Refresh timing | Immediate (same transaction) only | Immediate, Deferred (scheduled), or Manual |
| Incremental strategy | Transition tables + query rewriting | DVM operator tree + delta SQL generation |
| Supported SQL | Inner joins, simple outer joins, COUNT/SUM/AVG/MIN/MAX, EXISTS, DISTINCT | All of the above + window functions, recursive CTEs, LATERAL, UNION/INTERSECT/EXCEPT, 37 aggregates, TopK, GROUPING SETS |
| Cascading (view-on-view) | No | Yes (DAG-aware topological refresh) |
| Scheduling | None (always immediate) | Duration, cron, CALCULATED, or NULL |
| Monitoring | None | Built-in views, stats, NOTIFY alerts |
| PostgreSQL version | 14–17 | 18 only (until v0.4.0) |

pg_trickle's IMMEDIATE mode is designed as a migration path for pg_ivm users — it uses the same statement-level trigger approach with transition tables.

What PostgreSQL versions are supported?

PostgreSQL 18.x exclusively. pg_trickle uses PostgreSQL 18 features such as enhanced MERGE syntax with NOT MATCHED BY SOURCE and improved event trigger payloads. These features are not available in earlier versions.

Backward compatibility with PostgreSQL 16–17 is planned for a future release (tracked in the roadmap).

Does pg_trickle require wal_level = logical?

No. By default, pg_trickle uses lightweight row-level triggers for change data capture instead of logical replication. This means you do not need to set wal_level = logical, configure max_replication_slots, or create publications.

If you later enable the hybrid CDC mode (pg_trickle.cdc_mode = 'auto'), WAL-based capture becomes an option — but this is opt-in and not required for normal operation.
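Enabling the hybrid mode later is a single GUC change (sketch; whether a reload suffices or a full restart is required is not specified here, so verify against the configuration reference):

```sql
ALTER SYSTEM SET pg_trickle.cdc_mode = 'auto';
SELECT pg_reload_conf();  -- or restart PostgreSQL if this GUC requires it
```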

Is pg_trickle production-ready?

pg_trickle is under active development and approaching production readiness. It has a comprehensive test suite with 700+ unit tests and 290+ end-to-end tests covering correctness, failure recovery, and concurrency scenarios.

That said, as with any new extension, you should evaluate it against your specific workloads before deploying to production. Start with non-critical dashboards or reporting tables, monitor refresh performance and data correctness, and gradually expand usage as confidence grows.


Installation & Setup

How do I install pg_trickle?

  1. Add pg_trickle to shared_preload_libraries in postgresql.conf:
    shared_preload_libraries = 'pg_trickle'
    
  2. Restart PostgreSQL.
  3. Run:
    CREATE EXTENSION pg_trickle;
    

See INSTALL.md for platform-specific instructions and pre-built release artifacts.

What are the minimum configuration requirements?

The only mandatory setting is adding pg_trickle to shared_preload_libraries in postgresql.conf (this requires a PostgreSQL restart):

shared_preload_libraries = 'pg_trickle'

All other GUC parameters have sensible defaults and can be tuned later. However, max_worker_processes often needs to be raised from its default of 8 to leave headroom for pg_trickle's scheduler and refresh workers alongside PostgreSQL's own background workers.

Can I install pg_trickle on a managed PostgreSQL service (RDS, Cloud SQL, etc.)?

It depends on whether the service allows custom extensions and shared_preload_libraries modifications. Many managed services restrict these. However, pg_trickle has one advantage over replication-based extensions: it does not require wal_level = logical, which avoids one of the most common restrictions on managed PostgreSQL services.

Check your provider's documentation for custom extension support. Services that support custom extensions (e.g., some tiers of Azure Flexible Server, Supabase, Neon) are more likely to work.

How do I uninstall pg_trickle?

  1. Drop all stream tables first (or they will be cascade-dropped):
    SELECT pgtrickle.drop_stream_table(pgt_name) FROM pgtrickle.pgt_stream_tables;
    
  2. Drop the extension:
    DROP EXTENSION pg_trickle CASCADE;
    
  3. Remove pg_trickle from shared_preload_libraries and restart PostgreSQL.

Creating & Managing Stream Tables

Do I need to choose a refresh mode?

No. The default mode ('AUTO') is adaptive: it uses differential (delta-only) maintenance when efficient, and automatically falls back to full recomputation when the change volume is high or the query cannot be differentiated. This works well for the vast majority of queries.

You only need to specify a mode explicitly when:

  • You want FULL mode to force recomputation every time (rare).
  • You want IMMEDIATE mode for sub-second, in-transaction updates (adds overhead to every write on source tables).
  • You want strict DIFFERENTIAL mode and prefer an error over silent fallback when the query isn't differentiable.
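When you do want an explicit mode, pass it at creation time (sketch; the named refresh_mode parameter follows the migration examples earlier in this document):

```sql
-- Strict differential: fail loudly rather than silently falling back to FULL
SELECT pgtrickle.create_stream_table(
    name         => 'order_totals',
    query        => 'SELECT customer_id, SUM(amount) AS total
                     FROM orders GROUP BY customer_id',
    schedule     => '30s',
    refresh_mode => 'DIFFERENTIAL'
);
```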

How do I create a stream table?

-- Minimal: just name and query. Refreshes on a calculated schedule
-- using adaptive differential maintenance.
SELECT pgtrickle.create_stream_table(
    'order_totals',
    'SELECT customer_id, SUM(amount) AS total
     FROM orders GROUP BY customer_id'
);

-- With custom schedule:
SELECT pgtrickle.create_stream_table(
    name     => 'order_totals',
    query    => 'SELECT customer_id, SUM(amount) AS total
     FROM orders GROUP BY customer_id',
    schedule => '5m'
);

What is the difference between FULL and DIFFERENTIAL refresh mode?

  • FULL — Truncates the stream table and re-runs the entire defining query every refresh cycle. Simple but expensive for large result sets.
  • DIFFERENTIAL — Computes only the delta (changes since the last refresh) using the DVM engine and applies it via a MERGE statement. Much faster when only a small fraction of source data changes between refreshes. When the change ratio exceeds pg_trickle.differential_max_change_ratio (default 15%), DIFFERENTIAL automatically falls back to FULL for that cycle.
  • IMMEDIATE — Maintains the stream table synchronously within the same transaction as the base table DML. Uses statement-level triggers with transition tables — no change buffers, no scheduler. The stream table is always up-to-date.

Why does FULL mode exist if DIFFERENTIAL can fall back to it automatically?

DIFFERENTIAL mode with adaptive fallback covers most user needs — it uses incremental deltas when changes are small and automatically switches to a full recompute when the change ratio is high. However, explicit FULL mode still has its place:

  1. No CDC overhead. FULL mode installs CDC triggers on source tables (for DAG tracking), but the refresh itself ignores the change buffers entirely. If your workload has very high write throughput and you know you'll always do a full recompute, FULL mode avoids the per-row trigger overhead of writing change records that will never be consumed incrementally.

  2. Simpler debugging. When investigating data correctness issues, FULL mode is a clean baseline — it re-runs the defining query with no delta computation, no frontier tracking, and no MERGE logic. If FULL produces correct results but DIFFERENTIAL doesn't, the bug is in the delta pipeline.

  3. Predictable performance. DIFFERENTIAL refresh time varies with the number of changes, which can be unpredictable. FULL refresh time is proportional to the total result set size, which is stable. For SLA-sensitive workloads where you'd rather have consistent 500ms refreshes than variable 5ms–500ms refreshes, FULL provides that predictability.

  4. Unsupported-but-planned constructs. Some queries may parse correctly in DIFFERENTIAL mode but produce suboptimal deltas. Using FULL mode explicitly is a safe fallback while the DVM engine matures.

For most users, DIFFERENTIAL is the right default. Use FULL when you have a specific reason.

When should I use FULL vs. DIFFERENTIAL vs. IMMEDIATE?

Use DIFFERENTIAL (default) when:

  • Source tables are large and changes between refreshes are small
  • The defining query uses supported operators (most common SQL is supported)
  • Some staleness (seconds to minutes) is acceptable

Use FULL when:

  • The defining query uses unsupported aggregates (CORR, COVAR_*, REGR_*)
  • Source tables are small and a full recompute is cheap
  • You see frequent adaptive fallbacks to FULL (check refresh history)

Use IMMEDIATE when:

  • The stream table must always reflect the latest committed data
  • You need transactional consistency (reads within the same transaction see updated data)
  • Write-side overhead per DML statement is acceptable
  • The defining query doesn't use unsupported source types (materialized views, foreign tables)

What are the advantages and disadvantages of IMMEDIATE vs. deferred (FULL/DIFFERENTIAL) refresh modes?

IMMEDIATE mode

  • ✅ Read-your-writes consistency — The stream table is updated within the same transaction as the base table DML; always current from the writer's perspective.
  • ✅ No lag — No background worker, no schedule interval. The view is never stale.
  • ✅ No change buffers — The pgtrickle_changes.* tables are not used, reducing write overhead on source tables.
  • ✅ pg_ivm compatibility — Drop-in migration path for existing pg_ivm / IMMV users.
  • ❌ Write amplification — Every DML statement on a base table also executes IVM trigger logic, adding latency to the original transaction.
  • ❌ Serialized concurrent writes — An ExclusiveLock is taken on the stream table during maintenance, serializing writers.
  • ❌ Source-type limits — Materialized views and foreign tables cannot serve as sources (no trigger-based capture); use DIFFERENTIAL or FULL instead.
  • ❌ Cascading limitations — Cascading IMMEDIATE stream tables work but may require manual refresh for deep chains.
  • ❌ No throttling — The refresh cannot be delayed or rate-limited.

Deferred mode (FULL / DIFFERENTIAL)

  • ✅ Decoupled write path — Base table writes are fast; view maintenance runs later via the scheduler or a manual refresh.
  • ✅ Broadest SQL support — Window functions, recursive CTEs, LATERAL, UNION, user-defined aggregates, TopK, cascading stream tables, and more.
  • ✅ Adaptive cost control — DIFFERENTIAL automatically falls back to FULL when the change ratio exceeds pg_trickle.differential_max_change_ratio.
  • ✅ Concurrency-friendly — Writers never block on view maintenance.
  • ❌ Staleness — The stream table lags by up to one schedule interval (e.g. 1m).
  • ❌ No read-your-writes — A writer querying the stream table immediately after a write may see the pre-change data.
  • ❌ Infrastructure overhead — Requires change buffer tables, a background worker, and frontier tracking.

Rule of thumb: use IMMEDIATE when the query is simple and freshness within the transaction matters. Use DIFFERENTIAL (or FULL) for complex queries, high concurrency, or when you want to decouple write latency from view maintenance.

What happens if I have an IMMEDIATE stream table between two DIFFERENTIAL stream tables in a dependency chain?

Consider the chain: source → ST_A (DIFFERENTIAL) → ST_B (IMMEDIATE) → ST_C (DIFFERENTIAL). This is a valid but unusual configuration with important behavioral consequences:

  • ST_A refreshes on its schedule (e.g., every 1 minute) via the background scheduler.
  • ST_B is IMMEDIATE, so it has no CDC triggers on ST_A — it uses statement-level IVM triggers. But ST_A is updated by the scheduler (not by user DML), and the scheduler's MERGE operation does fire statement-level triggers on ST_A's dependents. So ST_B updates within the scheduler's transaction when ST_A refreshes.
  • ST_C is DIFFERENTIAL and depends on ST_B. Since ST_B is a stream table, ST_C's CDC triggers fire when ST_B is modified. The scheduler refreshes ST_C on its own schedule.

The practical concern: write latency stacking. When the scheduler refreshes ST_A, ST_B's IVM triggers fire synchronously within that same transaction, adding IVM overhead to ST_A's refresh. If ST_B's delta computation is expensive, it slows down the entire scheduler cycle.

Recommendation: Avoid mixing IMMEDIATE into the middle of a deferred chain. Either make the entire chain IMMEDIATE (for small, simple queries) or keep it entirely DIFFERENTIAL. If you need read-your-writes for one specific step, consider making that the terminal (leaf) stream table in the chain.

What schedule formats are supported?

Duration strings:

  Unit      Suffix   Example
  Seconds   s        30s
  Minutes   m        5m
  Hours     h        2h
  Days      d        1d
  Weeks     w        1w
  Compound           1h30m

Cron expressions:

  Format    Example           Description
  5-field   */5 * * * *       Every 5 minutes
  Aliases   @hourly, @daily   Built-in shortcuts

CALCULATED mode: Pass schedule => 'calculated' (or NULL) to inherit the schedule from downstream dependents.

How do cron schedules handle timezones? What does @daily really mean?

pg_trickle evaluates cron expressions in UTC. The underlying croner crate computes the next occurrence from a UTC timestamp, and the scheduler compares this against chrono::Utc::now(). There is no per-stream-table timezone setting.

This means:

  • @daily (equivalent to 0 0 * * *) fires at midnight UTC, not midnight in your local timezone.
  • @hourly (equivalent to 0 * * * *) fires at the top of each UTC hour.
  • 0 9 * * 1-5 fires at 09:00 UTC on weekdays — if your server is in America/New_York, that's 04:00 or 05:00 local time depending on DST.

If you need a schedule aligned to a local timezone, convert the desired local time to UTC and write the cron expression accordingly. For example, to refresh at 08:00 Europe/Oslo (UTC+1 in winter, UTC+2 in summer), use 0 6 * * * in summer and 0 7 * * * in winter — or accept the 1-hour seasonal shift and pick one.
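Plain PostgreSQL can do the local-to-UTC conversion for you when working out the cron hour (no pg_trickle involvement; the dates are illustrative):

```sql
-- 08:00 Europe/Oslo on a winter date corresponds to 07:00 UTC:
SELECT (TIMESTAMP '2025-01-15 08:00' AT TIME ZONE 'Europe/Oslo')
       AT TIME ZONE 'UTC';   -- 2025-01-15 07:00:00

-- The same local time during DST corresponds to 06:00 UTC:
SELECT (TIMESTAMP '2025-07-15 08:00' AT TIME ZONE 'Europe/Oslo')
       AT TIME ZONE 'UTC';   -- 2025-07-15 06:00:00
```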

Tip: For most analytics workloads, UTC-based schedules are preferable because they don't shift with daylight saving transitions.

What is the minimum allowed schedule?

The pg_trickle.min_schedule_seconds GUC (default: 60 seconds) sets the shortest allowed refresh schedule. Any create_stream_table or alter_stream_table call with a schedule shorter than this floor is rejected with a clear error message.

This guard exists to prevent accidentally creating stream tables that refresh too frequently, which could overload the scheduler or the source tables. During development and testing, you can lower it:

ALTER SYSTEM SET pg_trickle.min_schedule_seconds = 1;
SELECT pg_reload_conf();

What happens if all stream tables in the DAG have a CALCULATED schedule?

When every stream table uses a CALCULATED schedule (schedule => 'calculated'), there are no explicit schedules for the resolution algorithm to derive from. The CALCULATED logic works by propagating MIN(effective_schedule) from downstream dependents upward through the DAG. If no node has an explicit duration:

  1. Leaf nodes (no downstream dependents) have no schedules to take the minimum of, so they fall back to the pg_trickle.min_schedule_seconds GUC (default: 60 seconds).
  2. Upstream nodes then resolve to MIN(fallback) = fallback.
  3. The result: every stream table in the DAG gets the fallback schedule (60 s by default).

This is safe but usually not what you want — the whole DAG refreshes at the same generic interval. Best practice is to set an explicit schedule on at least the leaf (most-downstream) stream tables so that upstream CALCULATED schedules resolve to something meaningful:

-- Leaf ST with an explicit schedule
SELECT pgtrickle.create_stream_table(
    name     => 'daily_summary',
    query    => 'SELECT region, SUM(total) FROM pgtrickle.order_totals GROUP BY region',
    schedule => '10m'
);

-- Upstream ST inherits that 10m schedule via CALCULATED
SELECT pgtrickle.create_stream_table(
    name     => 'order_totals',
    query    => 'SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id',
    schedule => 'calculated'
);

You can inspect the resolved effective schedules with:

SELECT pgt_name, schedule, effective_schedule
FROM pgtrickle.pgt_stream_tables;

Can a stream table reference another stream table?

Yes. Stream tables can depend on other stream tables. The scheduler automatically refreshes them in topological order (upstream first). Circular dependencies are detected and rejected at creation time.

-- ST1: aggregates orders
SELECT pgtrickle.create_stream_table(
    name         => 'order_totals',
    query        => 'SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id',
    schedule     => '1m',
    refresh_mode => 'DIFFERENTIAL'
);

-- ST2: filters ST1
SELECT pgtrickle.create_stream_table(
    name         => 'big_customers',
    query        => 'SELECT customer_id, total FROM pgtrickle.order_totals WHERE total > 1000',
    schedule     => '1m',
    refresh_mode => 'DIFFERENTIAL'
);

How do I change a stream table's schedule or mode?

-- Change schedule
SELECT pgtrickle.alter_stream_table('order_totals', schedule => '10m');

-- Switch refresh mode
SELECT pgtrickle.alter_stream_table('order_totals', refresh_mode => 'FULL');

-- Suspend
SELECT pgtrickle.alter_stream_table('order_totals', status => 'SUSPENDED');

-- Resume
SELECT pgtrickle.alter_stream_table('order_totals', status => 'ACTIVE');

Can I change the defining query of a stream table?

Yes — use the query parameter of alter_stream_table():

SELECT pgtrickle.alter_stream_table('order_totals',
    query => 'SELECT customer_id, SUM(amount) AS total, COUNT(*) AS order_count
              FROM orders GROUP BY customer_id');

The ALTER QUERY operation validates the new query, migrates the storage table schema if needed, updates catalog entries and source dependencies, and runs a full refresh — all within a single transaction. Concurrent readers see either the old data or the new data, never an empty table.

Schema migration behavior:

  • Same columns — Fast path: no storage DDL, just a catalog update plus a full refresh.
  • Columns added or removed — Compatible migration via ALTER TABLE ADD/DROP COLUMN; the storage table OID is preserved.
  • Column type incompatible — Full rebuild: the storage table is dropped and recreated (the OID changes and a WARNING is emitted).

You can also change the query and other parameters simultaneously:

SELECT pgtrickle.alter_stream_table('order_totals',
    query => 'SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id',
    refresh_mode => 'FULL');

How do I deploy stream tables idempotently?

Use create_or_replace_stream_table() — one function call that does the right thing automatically:

-- Safe to run on every deploy — creates, updates, or no-ops as needed:
SELECT pgtrickle.create_or_replace_stream_table(
    name         => 'order_totals',
    query        => 'SELECT region, SUM(amount) AS total FROM orders GROUP BY region',
    schedule     => '2m',
    refresh_mode => 'DIFFERENTIAL'
);

What happens on each deploy:

  • First deploy (stream table doesn't exist) — Creates it and populates the data.
  • Nothing changed since the last deploy — No-op: logs INFO and returns instantly.
  • You changed the schedule or mode — Updates the config in place (no data loss).
  • You changed the query — Migrates the storage schema and runs a full refresh.

This mirrors PostgreSQL's CREATE OR REPLACE VIEW / CREATE OR REPLACE FUNCTION pattern.

When to use which function:

  • create_or_replace_stream_table() — Recommended for most deployments. Declarative and idempotent; handles all cases automatically.
  • create_stream_table_if_not_exists() — Safe to re-run, but never modifies an existing definition. Good for one-time seed migrations.
  • create_stream_table() — Strict: errors if the stream table already exists. Use when you want an explicit failure on duplicates.

How do I trigger a manual refresh?

Call refresh_stream_table() to immediately refresh a stream table without waiting for the next scheduled cycle:

SELECT pgtrickle.refresh_stream_table('order_totals');

This runs a synchronous refresh in your current session and returns when complete. It works even when the background scheduler is disabled (pg_trickle.enabled = false), making it useful for testing, debugging, or one-off data refreshes.

To force a full refresh regardless of the stream table's configured mode, temporarily change the refresh mode:

SELECT pgtrickle.alter_stream_table('order_totals', refresh_mode => 'FULL');
SELECT pgtrickle.refresh_stream_table('order_totals');
-- Switch back to the original mode when done:
SELECT pgtrickle.alter_stream_table('order_totals', refresh_mode => 'DIFFERENTIAL');

Data Freshness & Consistency

Understanding when and how stream tables become current is the #1 conceptual hurdle for users coming from synchronous materialized views. This section explains staleness guarantees, read-your-writes behavior, and Delayed View Semantics (DVS).

How stale can a stream table be?

For deferred modes (FULL / DIFFERENTIAL): A stream table can be at most one schedule interval behind the source data, plus the time it takes to execute the refresh itself. For example, with schedule => '1m', the maximum staleness is approximately 1 minute + refresh duration.

In practice, staleness is often less than the schedule interval because the scheduler checks for due refreshes every pg_trickle.scheduler_interval_ms (default: 1 second).

For IMMEDIATE mode: The stream table is always current within the transaction that modified the source data. There is zero staleness.

Check current staleness:

SELECT pgtrickle.get_staleness('order_totals');  -- returns seconds, NULL if never refreshed

-- Or check all stream tables:
SELECT pgt_name, staleness, stale FROM pgtrickle.stream_tables_info;

Can I read my own writes immediately after an INSERT?

It depends on the refresh mode:

  • IMMEDIATE mode: Yes. The stream table is updated within the same transaction as your INSERT. You can query it immediately and see the updated data.
  • DIFFERENTIAL / FULL mode: No. The stream table is updated by the background scheduler in a separate transaction. Your INSERT is captured by the CDC trigger, but the stream table won't reflect it until the next scheduled refresh (or a manual refresh_stream_table() call).

If read-your-writes consistency is a requirement, use refresh_mode => 'IMMEDIATE'.
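If you mostly use DIFFERENTIAL but occasionally need a fresh read, a manual refresh before the query has the same effect at the cost of a synchronous refresh. A sketch reusing the order_totals example (the inserted columns are illustrative):

```sql
INSERT INTO orders (id, customer_id, amount, status)
VALUES (43, 7, 99.50, 'active');

-- Apply pending changes now instead of waiting for the scheduler:
SELECT pgtrickle.refresh_stream_table('order_totals');

-- This read now reflects the insert above:
SELECT total FROM pgtrickle.order_totals WHERE customer_id = 7;
```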

What consistency guarantees does pg_trickle provide?

pg_trickle provides Delayed View Semantics (DVS): the contents of every stream table are logically equivalent to evaluating its defining query at some past point in time — the data_timestamp. This means:

  • The data is always internally consistent — it corresponds to a valid snapshot of the source data.
  • The data may be stale — it reflects the source state at data_timestamp, not necessarily the current state.
  • For cascading stream tables, the scheduler refreshes in topological order so that when ST B references upstream ST A, A has already been refreshed before B runs its delta query against A's contents.

For IMMEDIATE mode, the guarantee is stronger: the stream table always reflects the state of the source data as of the current transaction.

What are "Delayed View Semantics" (DVS)?

DVS is the formal consistency guarantee: a stream table's contents are equivalent to evaluating its defining query at a specific past time (the data_timestamp). This is analogous to how a materialized view captured at a point in time is always internally consistent, even if the source data has since changed.

The data_timestamp is recorded in the catalog and advanced after each successful refresh:

SELECT pgt_name, data_timestamp FROM pgtrickle.pgt_stream_tables;

What happens if the scheduler is behind — does data get lost?

No. Change data is never lost, even if the scheduler falls behind. Changes accumulate in the change buffer tables (pgtrickle_changes.changes_<oid>) until consumed by a refresh. The frontier ensures that each refresh picks up exactly where the last one left off.

However, a growing change buffer increases:

  • Disk usage (change buffer tables grow)
  • Refresh time (more changes to process per cycle)
  • Risk of adaptive fallback to FULL (if the change ratio exceeds pg_trickle.differential_max_change_ratio)

The monitoring system emits a buffer_growth_warning NOTIFY alert if buffers grow unexpectedly.
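To keep an eye on buffer growth yourself, you can size the change-buffer tables with standard catalog views (plain PostgreSQL; assumes only the pgtrickle_changes schema naming described above):

```sql
-- Approximate backlog per change-buffer table:
SELECT c.relname                                      AS buffer_table,
       pg_size_pretty(pg_total_relation_size(c.oid))  AS total_size,
       c.reltuples::bigint                            AS estimated_rows
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'pgtrickle_changes'
  AND c.relkind = 'r'
ORDER BY pg_total_relation_size(c.oid) DESC;
```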

How does pg_trickle ensure deltas are applied in the right order across cascading stream tables?

The scheduler uses topological ordering from the dependency DAG. When ST B depends on ST A:

  1. ST A is refreshed first — its data is brought up to date and its frontier advances.
  2. ST A's refresh writes are captured by CDC triggers (since ST A is a source for ST B).
  3. ST B is refreshed next — its delta query reads ST A's current (just-refreshed) data and the change buffer.

This ensures that downstream stream tables always see consistent upstream data. Circular dependencies are rejected at creation time.


IMMEDIATE Mode (Transactional IVM)

IMMEDIATE mode maintains the stream table synchronously — within the same transaction as the source DML. This section covers when to use it, what SQL it supports, locking behavior, and how to switch between modes.

When should I use IMMEDIATE mode instead of DIFFERENTIAL?

Use IMMEDIATE when:

  • Your application requires read-your-writes consistency — e.g., a user inserts an order and immediately queries a dashboard that must include that order.
  • The defining query is relatively simple (single-table aggregation, joins, filters).
  • The source table write rate is moderate (IMMEDIATE adds latency to every DML statement).

Stick with DIFFERENTIAL when:

  • Staleness of a few seconds to minutes is acceptable.
  • The defining query uses unsupported IMMEDIATE constructs (materialized-view sources, foreign-table sources).
  • Write-side performance is critical (high-throughput OLTP).
  • You need to decouple write latency from view maintenance.

What SQL features are NOT supported in IMMEDIATE mode?

IMMEDIATE mode supports all constructs that DIFFERENTIAL supports, with two source-type exceptions:

  • WITH RECURSIVE — ✅ Supported (IM1). Semi-naive evaluation inside the trigger; a depth counter guards against infinite loops (pg_trickle.ivm_recursive_max_depth, default 100), and a warning is emitted at create time for very deep hierarchies.
  • TopK (ORDER BY … LIMIT N [OFFSET M]) — ✅ Supported (IM2). Micro-refresh: recomputes the top-N rows on every DML statement. Gated by pg_trickle.ivm_topk_max_limit to prevent unbounded scans.
  • Materialized views as sources — ❌ Rejected. A materialized view's stale snapshot prevents trigger-based capture; use the underlying query instead.
  • Foreign tables as sources — ❌ Rejected. Triggers cannot be created on foreign tables; use FULL mode instead.

Attempting to create or switch to IMMEDIATE mode with an unsupported construct produces a clear error message.

What happens when I TRUNCATE a source table in IMMEDIATE mode?

A statement-level AFTER TRUNCATE trigger fires and truncates the stream table, then re-populates it by executing a full refresh from the defining query — all within the same transaction. The stream table remains consistent.

Can I have cascading IMMEDIATE stream tables (ST A → ST B)?

Yes. When ST A is IMMEDIATE and ST B depends on ST A and is also IMMEDIATE, changes propagate through the chain within the same transaction. The IVM triggers on the base table update ST A, and since that write is visible within the transaction, ST B's triggers fire and update ST B.

What locking does IMMEDIATE mode use?

IMMEDIATE mode acquires statement-level locks on the stream table during delta application:

  • Simple queries (single-table scan/filter without aggregates or DISTINCT): RowExclusiveLock — allows concurrent readers, blocks other writers.
  • Complex queries (joins, aggregates, DISTINCT, window functions): ExclusiveLock — blocks both readers and writers to ensure delta consistency.

This means concurrent writes to the same base table are serialized through the stream table lock. For high-concurrency write workloads, DIFFERENTIAL mode avoids this bottleneck.

How do I switch an existing DIFFERENTIAL stream table to IMMEDIATE?

SELECT pgtrickle.alter_stream_table('order_totals', refresh_mode => 'IMMEDIATE');

This:

  1. Validates the defining query against IMMEDIATE mode restrictions.
  2. Removes the row-level CDC triggers from source tables.
  3. Installs statement-level IVM triggers (BEFORE + AFTER with transition tables).
  4. Clears the schedule (IMMEDIATE mode has no schedule).
  5. Performs a full refresh to establish a consistent baseline.

To switch back:

SELECT pgtrickle.alter_stream_table('order_totals', refresh_mode => 'DIFFERENTIAL');

This reverses the process: removes IVM triggers, installs CDC triggers, restores the schedule (default 1m), and performs a full refresh.

What happens to IMMEDIATE mode during a manual refresh_stream_table() call?

For IMMEDIATE mode stream tables, refresh_stream_table() performs a FULL refresh — truncates and re-populates from the defining query. This is useful for recovering from edge cases or forcing a clean baseline. It is equivalent to pg_ivm's refresh_immv(name, true).

How much write-side overhead does IMMEDIATE mode add?

Each DML statement on a base table tracked by an IMMEDIATE stream table incurs:

  • BEFORE trigger: Advisory lock acquisition + pre-state setup (~0.1–0.5 ms).
  • AFTER trigger: Transition table copy to temp tables + delta SQL generation + delta application (~1–50 ms depending on query complexity and delta size).

For a simple single-table aggregate, expect 2–10 ms overhead per statement. For multi-table joins or window functions, overhead is higher. The overhead scales with the number of IMMEDIATE stream tables that depend on the same source table.


SQL Support

pg_trickle supports a broad range of SQL in defining queries. This section covers what’s supported, what’s rejected (with rewrites), and how specific constructs like aggregates and ORDER BY are handled. The subsections that follow dive deeper into aggregates, joins, CTEs, window functions, and TopK.

What SQL features are supported in defining queries?

Most common SQL is supported in both FULL and DIFFERENTIAL modes:

  • Table scans, projections, WHERE/HAVING filters
  • INNER, LEFT, RIGHT, FULL OUTER JOIN (including multi-table joins)
  • GROUP BY with 25+ aggregate functions (COUNT, SUM, AVG, MIN, MAX, BOOL_AND/OR, STRING_AGG, ARRAY_AGG, JSON_AGG, JSONB_AGG, BIT_AND/OR/XOR, STDDEV, VARIANCE, MODE, PERCENTILE_CONT/DISC, and more)
  • FILTER (WHERE ...) on aggregates
  • DISTINCT
  • Set operations: UNION ALL, UNION, INTERSECT, INTERSECT ALL, EXCEPT, EXCEPT ALL
  • Subqueries: EXISTS, NOT EXISTS, IN (subquery), NOT IN (subquery), scalar subqueries
  • Non-recursive and recursive CTEs
  • Window functions (ROW_NUMBER, RANK, SUM OVER, etc.)
  • LATERAL joins with set-returning functions and correlated subqueries
  • CASE, COALESCE, NULLIF, GREATEST, LEAST, BETWEEN, IS DISTINCT FROM

See DVM Operators for the complete list.

What SQL features are NOT supported?

The following are rejected with clear error messages and suggested rewrites:

  • TABLESAMPLE — Reason: stream tables materialize the full result set. Rewrite: use WHERE random() < fraction in the consuming query.
  • Window functions in expressions — Reason: cannot be differentially maintained. Rewrite: move the window function to a separate column.
  • LIMIT / OFFSET without ORDER BY — Reason: stream tables materialize the full result set (ORDER BY … LIMIT N [OFFSET M] is supported as TopK). Rewrite: apply LIMIT/OFFSET when querying the stream table, or add ORDER BY to use the TopK pattern.
  • FOR UPDATE / FOR SHARE — Reason: row-level locking is not applicable. Rewrite: remove the locking clause.
  • RANGE_AGG / RANGE_INTERSECT_AGG — Reason: no incremental delta decomposition exists for range aggregates. Rewrite: use FULL mode, or compute range unions in the consuming query.

Each rejected feature is explained in detail in the Why Are These SQL Features Not Supported? section below.

What happens to ORDER BY in defining queries?

ORDER BY in the defining query is accepted but silently discarded. This is consistent with how PostgreSQL handles CREATE MATERIALIZED VIEW AS SELECT ... ORDER BY ... — the ordering only affects the initial INSERT, not the stored data.

Stream tables are heap tables with no guaranteed row order. Apply ORDER BY when querying the stream table instead:

-- Don't rely on ORDER BY in the defining query:
-- 'SELECT region, SUM(amount) AS total FROM orders GROUP BY region ORDER BY total DESC'

-- Instead, order when reading:
SELECT * FROM regional_totals ORDER BY total DESC;

Exception: When ORDER BY is paired with LIMIT N (with or without OFFSET M), pg_trickle recognizes the TopK pattern and preserves the ordering, limit, and offset.
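For example, a top-10 leaderboard keeps its ordering because it matches the TopK pattern (a sketch; table and column names are illustrative):

```sql
SELECT pgtrickle.create_stream_table(
    name  => 'top_customers',
    query => 'SELECT customer_id, SUM(amount) AS total
              FROM orders
              GROUP BY customer_id
              ORDER BY total DESC
              LIMIT 10'
);
```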

Which aggregates support DIFFERENTIAL mode?

Algebraic (O(changes), fully incremental): COUNT, SUM, AVG

Semi-algebraic (incremental with occasional group rescan): MIN, MAX

Group-rescan (affected groups re-aggregated from source): STRING_AGG, ARRAY_AGG, JSON_AGG, JSONB_AGG, BOOL_AND, BOOL_OR, BIT_AND, BIT_OR, BIT_XOR, JSON_OBJECT_AGG, JSONB_OBJECT_AGG, STDDEV, STDDEV_POP, STDDEV_SAMP, VARIANCE, VAR_POP, VAR_SAMP, MODE, PERCENTILE_CONT, PERCENTILE_DISC

FULL mode only (no incremental delta rules; see "Why are CORR, COVAR_*, and REGR_* limited to FULL mode?" below): CORR, COVAR_POP, COVAR_SAMP, REGR_AVGX, REGR_AVGY, REGR_COUNT, REGR_INTERCEPT, REGR_R2, REGR_SLOPE, REGR_SXX, REGR_SXY, REGR_SYY

In total, 37 aggregate function variants are supported: 25 incrementally in DIFFERENTIAL mode, plus the 12 regression-family aggregates in FULL mode.


Aggregates & Group-By

Aggregate handling is one of the most complex parts of incremental view maintenance. This section explains how pg_trickle categorizes aggregates by their incremental cost, how hidden auxiliary columns work, and what happens when groups are created or destroyed.

Which aggregates are fully incremental (O(1) per change) vs. group-rescan?

pg_trickle categorizes aggregates into three tiers:

  • Algebraic — O(1) per change: COUNT, SUM, AVG. Hidden auxiliary columns (__pgt_count, __pgt_sum_x) track running totals; deltas update them arithmetically.
  • Semi-algebraic — O(1) normally, O(group) when an extremum is deleted: MIN, MAX. Maintained via LEAST/GREATEST; if the current MIN/MAX is deleted, the group is rescanned to find the new extremum.
  • Group-rescan — O(group size) per affected group: all other DIFFERENTIAL-supported aggregates. Affected groups are re-aggregated from source data; a NULL sentinel marks stale groups for rescan.

For most workloads, the algebraic tier (COUNT/SUM/AVG) covers the majority of aggregations and is the fastest.

Why do some aggregates have hidden auxiliary columns?

For algebraic aggregates (COUNT, SUM, AVG), the DVM engine adds hidden __pgt_count and __pgt_sum_x columns to the stream table's storage. These store running totals that can be updated with O(1) arithmetic per change instead of rescanning the entire group.

For example, a stream table defined as SELECT dept, AVG(salary) FROM employees GROUP BY dept internally stores:

  • dept — the group-by key
  • avg — the user-visible average (computed as __pgt_sum_x / __pgt_count)
  • __pgt_count — running count of rows in the group
  • __pgt_sum_x — running sum of salary values
  • __pgt_row_id — row identity hash

When a new employee is inserted, the refresh updates __pgt_count += 1, __pgt_sum_x += new_salary, and recomputes avg. No rescan of the source table is needed.
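Conceptually, the maintenance work for that insert is plain arithmetic on the hidden columns. The engine's actual MERGE differs in detail, but it behaves roughly like this illustrative statement (dept_avg_salary is a hypothetical stream table name):

```sql
-- One new employee in 'eng' with salary 90000:
UPDATE dept_avg_salary
SET __pgt_count = __pgt_count + 1,
    __pgt_sum_x = __pgt_sum_x + 90000,
    avg         = (__pgt_sum_x + 90000) / (__pgt_count + 1)
WHERE dept = 'eng';
```

All SET expressions read the pre-update row values, which is why avg is recomputed from the incremented totals explicitly.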

How does HAVING work with incremental refresh?

HAVING is fully supported in DIFFERENTIAL mode. The DVM engine tracks threshold transitions — groups entering or exiting the HAVING condition:

  • Group crosses threshold upward: A previously excluded group (e.g., HAVING COUNT(*) > 5) gains enough members → the group is inserted into the stream table.
  • Group crosses threshold downward: A group that was included drops below the threshold → the group is deleted from the stream table.
  • Group stays above threshold: Normal delta update (adjust aggregate values).

This means the stream table always reflects only the groups that satisfy the HAVING clause, even as group membership changes.
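As an illustration, consider a stream table with a HAVING threshold (names are illustrative):

```sql
SELECT pgtrickle.create_stream_table(
    name  => 'frequent_customers',
    query => 'SELECT customer_id, COUNT(*) AS order_count
              FROM orders
              GROUP BY customer_id
              HAVING COUNT(*) > 5'
);
-- A customer's 6th order inserts their group row;
-- deleting back down to 5 orders removes it again.
```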

What happens to a group when all its rows are deleted?

When the last row of a group is deleted from the source table, the DVM engine detects that __pgt_count drops to zero and deletes the group row from the stream table. The hidden auxiliary columns are cleaned up along with it.

If a new row for the same group-by key is later inserted, a fresh group row is created from scratch.

Why are CORR, COVAR_*, and REGR_* limited to FULL mode?

Regression aggregates like CORR, COVAR_POP, COVAR_SAMP, and the REGR_* family require maintaining running sums of products and squares across the entire group. Unlike COUNT/SUM/AVG (where deltas can be computed from the change alone), regression aggregates:

  1. Lack algebraic delta rules. There is no closed-form way to update a correlation coefficient from a single row change without access to the full group's data.
  2. Would degrade to group-rescan anyway. Even if supported, the implementation would need to rescan the full group from source — identical to FULL mode for most practical group sizes.

These aggregates work fine in FULL refresh mode, which re-runs the entire query from scratch each cycle.
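If you need one of these aggregates, declare the mode explicitly at creation time (a sketch; sales is an illustrative table):

```sql
SELECT pgtrickle.create_stream_table(
    name         => 'price_qty_correlation',
    query        => 'SELECT region, CORR(price, quantity) AS r
                     FROM sales GROUP BY region',
    schedule     => '10m',
    refresh_mode => 'FULL'
);
```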


Joins

Join delta computation can produce surprising results when both sides change simultaneously. This section covers the standard IVM join rule, FULL OUTER JOIN support, and known edge cases.

How does a DIFFERENTIAL refresh handle a join when both sides changed?

When both tables in a join have changes since the last refresh, the DVM engine computes the join delta using the standard IVM join rule:

$$\Delta(R \bowtie S) = (\Delta R \bowtie S) \cup (R \bowtie \Delta S) \cup (\Delta R \bowtie \Delta S)$$

In practice, this means:

  1. Join the changes from the left against the current state of the right.
  2. Join the current state of the left against the changes from the right.
  3. Join the changes from both sides (handles simultaneous changes to matching keys).

All three parts are combined into a single CTE-based delta query that PostgreSQL executes in one pass.
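A simplified sketch of the shape of that delta query (the generated SQL is more involved; it also tracks insert/delete multiplicities, and the buffer table names here are illustrative):

```sql
-- Sketch only: r/s are the source tables, delta_r/delta_s their
-- captured changes since the last refresh, k the join key
WITH delta_r AS (SELECT * FROM pgtrickle_changes.changes_r),
     delta_s AS (SELECT * FROM pgtrickle_changes.changes_s)
SELECT dr.*, s.*  FROM delta_r dr JOIN s          ON dr.k = s.k   -- ΔR ⋈ S
UNION ALL
SELECT r.*, ds.*  FROM r        r JOIN delta_s ds ON r.k  = ds.k  -- R ⋈ ΔS
UNION ALL
SELECT dr.*, ds.* FROM delta_r dr JOIN delta_s ds ON dr.k = ds.k; -- ΔR ⋈ ΔS
```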

Does pg_trickle support FULL OUTER JOIN incrementally?

Yes. FULL OUTER JOIN is supported in DIFFERENTIAL mode with an 8-part delta computation. This handles all four cases: matched rows on both sides, left-only rows, right-only rows, and rows that transition between matched and unmatched states as data changes.

The 8 parts cover: new left matches, removed left matches, new right matches, removed right matches, newly matched from left-only, newly matched from right-only, newly unmatched to left-only, and newly unmatched to right-only.

What happens when a join key is updated and the joined row is simultaneously deleted?

This is a known edge case. When a join key column is updated in the same refresh cycle as the joined-side row is deleted, the delta may miss the required DELETE, potentially leaving a stale row in the stream table.

Mitigations:

  • The adaptive FULL fallback (triggered when the change ratio exceeds pg_trickle.differential_max_change_ratio) catches most high-change-rate scenarios where this is likely.
  • You can stagger changes across refresh cycles.
  • Use FULL mode for tables where this pattern is common.

How does NATURAL JOIN work?

NATURAL JOIN is fully supported. At parse time, pg_trickle resolves the common columns between the two tables and synthesizes explicit equi-join conditions. The internal __pgt_row_id column is excluded from common column resolution, so NATURAL JOINs between stream tables also work correctly.
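For example, if orders and refunds share only the order_id column (hypothetical tables), the join is internally rewritten along these lines:

```sql
-- What the user writes:
SELECT * FROM orders NATURAL JOIN refunds;

-- The equi-join pg_trickle synthesizes at parse time (conceptually;
-- duplicate-column elimination elided):
SELECT * FROM orders o JOIN refunds r ON o.order_id = r.order_id;
```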


CTEs & Recursive Queries

Recursive CTE support is a key differentiator for pg_trickle. This section explains the three maintenance strategies (semi-naive, DRed, recomputation) and when each is used.

Do recursive CTEs work in DIFFERENTIAL mode?

Yes. pg_trickle supports WITH RECURSIVE in DIFFERENTIAL mode with three auto-selected strategies:

| Strategy | When used | How it works |
|---|---|---|
| Semi-naive evaluation | INSERT-only changes to the base case | Iteratively evaluates new derivations from the inserted rows without touching existing rows. Fastest path. |
| Delete-and-Rederive (DRed) | Mixed changes (INSERT + DELETE/UPDATE) | Deletes potentially affected derived rows, then rederives them from scratch to determine the true delta. |
| Recomputation fallback | Column mismatch or non-monotone recursive terms | Falls back to full recomputation of the recursive CTE. Used when the recursive term contains EXCEPT, Aggregate, Window, DISTINCT, AntiJoin, or INTERSECT set operators. |

The strategy is selected automatically based on the type of changes and the recursive term's structure.

What are the three strategies for recursive CTE maintenance?

See the table above. In brief:

  • Semi-naive is the fast path for append-only workloads (e.g., adding nodes to a tree). It's O(new derivations) — much cheaper than a full re-evaluation.
  • DRed handles deletions and updates correctly by first removing potentially invalidated rows and then rederiving them. More expensive than semi-naive, but still incremental.
  • Recomputation is the safe fallback that re-executes the entire recursive CTE. Used when the recursive term's structure is too complex for incremental processing.
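For instance (hypothetical employees table), an org-chart traversal stays on the semi-naive path as long as rows are only inserted:

```sql
-- Append-only inserts into employees take the semi-naive fast path;
-- deletes or updates switch the refresh to DRed
SELECT pgtrickle.create_stream_table(
    name     => 'org_paths',
    query    => 'WITH RECURSIVE sub AS (
                     SELECT id, manager_id FROM employees
                     WHERE manager_id IS NULL
                     UNION ALL
                     SELECT e.id, e.manager_id
                     FROM employees e JOIN sub s ON e.manager_id = s.id
                 )
                 SELECT * FROM sub',
    schedule => '1m'
);
```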

What triggers a fallback from semi-naive to recomputation?

A recomputation fallback is triggered when:

  1. The recursive term contains non-monotone operators — EXCEPT, Aggregate, Window, DISTINCT, AntiJoin, or INTERSECT set operators. These operators can "un-derive" rows when inputs change, which semi-naive evaluation cannot handle.
  2. Column mismatch — the CTE's output columns don't match the stream table's storage schema (e.g., after a schema change).
  3. Mixed DML with non-monotone terms — DELETE or UPDATE changes combined with non-monotone recursive terms always trigger recomputation.

Check which strategy was used in the refresh history:

SELECT action, rows_inserted, rows_deleted
FROM pgtrickle.get_refresh_history('my_recursive_st', 5);

What happens when a CTE is referenced multiple times in the same query?

When a non-recursive CTE is referenced more than once, pg_trickle uses shared delta computation — the CTE's delta is computed once and cached, then reused by each reference. This is tracked via CteScan operator nodes that look up the shared delta from an internal CTE registry.

For single-reference CTEs, pg_trickle simply inlines them as subqueries (no overhead).


Window Functions & LATERAL

Window functions are maintained via partition-based recomputation rather than row-level deltas. This section covers what’s supported, the expression restriction, and LATERAL constructs.

How are window functions maintained incrementally?

pg_trickle uses partition-based recomputation for window functions. When source data changes, the DVM engine:

  1. Identifies which partitions are affected by the changes (based on the PARTITION BY key).
  2. Recomputes the window function for only the affected partitions.
  3. Replaces the old partition results with the new ones in the stream table.

This is more efficient than a full recomputation when changes affect a small number of partitions.
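Conceptually, the maintenance step looks like this (a sketch, not the generated SQL; changed_employees stands in for the change buffer):

```sql
-- Drop stale results for the touched partitions only
DELETE FROM dept_rankings
WHERE dept IN (SELECT DISTINCT dept FROM changed_employees);

-- Re-evaluate the window function for just those partitions
INSERT INTO dept_rankings
SELECT id, dept, salary,
       ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) AS rn
FROM employees
WHERE dept IN (SELECT DISTINCT dept FROM changed_employees);
```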

Why can't I use a window function inside a CASE or COALESCE expression?

Window functions like ROW_NUMBER() OVER (…) are supported as standalone columns but cannot be embedded in expressions (e.g., CASE WHEN ROW_NUMBER() OVER (...) = 1 THEN ...).

This restriction exists because the DVM engine handles window functions by recomputing entire partitions. When a window function is buried inside an expression, the engine cannot isolate the window computation from the surrounding expression.

Rewrite: Move the window function to a separate column in one stream table, then reference it in a second stream table:

-- ST1: compute the window function
SELECT id, dept, salary,
       ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) AS rn
FROM employees

-- ST2: use it in an expression (references ST1)
SELECT id, CASE WHEN rn = 1 THEN 'top' ELSE 'other' END AS rank_label
FROM st1

What LATERAL constructs are supported?

pg_trickle supports three kinds of LATERAL constructs:

| Construct | Example | Delta strategy |
|---|---|---|
| Set-returning functions | LATERAL jsonb_array_elements(data) | Row-scoped recomputation — only affected parent rows are re-expanded |
| Correlated subqueries | LATERAL (SELECT ... WHERE t.id = s.id) | Row-scoped recomputation |
| JSON_TABLE (PG 17+) | JSON_TABLE(data, '$.items[*]' ...) | Modeled as LateralFunction |

Additional supported SRFs: jsonb_each, jsonb_each_text, unnest, generate_series, and others.

What happens when a row moves between window partitions during a refresh?

When a row's PARTITION BY key changes (e.g., an employee moves departments), the DVM engine recomputes both the old partition (to remove the row) and the new partition (to add it). Both partitions are re-evaluated from the source data, ensuring window function results are correct.


TopK (ORDER BY … LIMIT)

TopK queries (ORDER BY ... LIMIT N, optionally with OFFSET M) are handled via a specialized MERGE-based strategy that re-executes the bounded query each cycle. This section explains how it works and its limitations.

How does ORDER BY … LIMIT N work in a stream table?

When a defining query has a top-level ORDER BY … LIMIT N (with a constant integer N), pg_trickle recognizes it as a TopK pattern. An optional OFFSET M (constant integer) selects a "page" within the ranked result. The stream table stores exactly N rows and is refreshed via a MERGE-based scoped-recomputation strategy:

  1. On each refresh, the full query (with ORDER BY + LIMIT, and OFFSET if present) is re-executed against the source tables.
  2. The result is merged into the stream table using MERGE with NOT MATCHED BY SOURCE for deletes.
  3. The catalog records topk_limit, topk_order_by, and optionally topk_offset for the stream table.

TopK bypasses the DVM delta pipeline — it always re-executes the bounded query. This is efficient because the result set is bounded by N.

SELECT pgtrickle.create_stream_table(
    name         => 'top_customers',
    query        => 'SELECT customer_id, total FROM order_totals ORDER BY total DESC LIMIT 100',
    schedule     => '1m',
    refresh_mode => 'DIFFERENTIAL'
);

-- With OFFSET — "page 2" of the leaderboard (rows 101–200):
SELECT pgtrickle.create_stream_table(
    name         => 'next_customers',
    query        => 'SELECT customer_id, total FROM order_totals ORDER BY total DESC LIMIT 100 OFFSET 100',
    schedule     => '1m',
    refresh_mode => 'DIFFERENTIAL'
);

Does OFFSET work with TopK?

Yes. ORDER BY … LIMIT N OFFSET M is fully supported. The stream table stores exactly N rows starting from position M+1 in the ranked result. This is useful for:

  • Paginated dashboards: Each page is a separate stream table with a different OFFSET.
  • Excluding outliers: OFFSET 5 LIMIT 50 skips the top 5 and shows the next 50.
  • Windowed leaderboards: OFFSET 10 LIMIT 10 shows the "second tier."

Caveat: When source data changes, the "page" can shift — a row on page 3 may move to page 2 or 4. The stream table always reflects the current state of the page at the time of the last refresh.

OFFSET 0 is treated as no offset.

What happens when a row below the top-N cutoff rises above it?

On the next refresh, the full ORDER BY … LIMIT N query is re-executed. The newly qualifying row appears in the result, and the row that fell out of the top-N is removed. The MERGE operation handles this by:

  • An INSERT for the newly qualifying row
  • A DELETE for the row that fell below the cutoff
  • An UPDATE for any rows whose values changed but remained in the top-N

Since TopK always re-executes the bounded query, it correctly detects all ranking changes.
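The merge step can be sketched as follows (simplified; the actual statement is generated internally, and NOT MATCHED BY SOURCE requires PostgreSQL 17+):

```sql
MERGE INTO top_customers t
USING (
    SELECT customer_id, total
    FROM order_totals
    ORDER BY total DESC LIMIT 100
) s ON t.customer_id = s.customer_id
WHEN MATCHED AND t.total IS DISTINCT FROM s.total THEN
    UPDATE SET total = s.total
WHEN NOT MATCHED THEN
    INSERT (customer_id, total) VALUES (s.customer_id, s.total)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;
```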

Can I use TopK with aggregates or joins?

Yes. The defining query can contain any SQL that pg_trickle supports, plus ORDER BY … LIMIT N:

-- TopK over an aggregate
SELECT dept, SUM(salary) AS total_salary
FROM employees GROUP BY dept
ORDER BY total_salary DESC LIMIT 10

-- TopK over a join
SELECT e.name, d.name AS dept, e.salary
FROM employees e JOIN departments d ON e.dept_id = d.id
ORDER BY e.salary DESC LIMIT 20

The only restriction is that TopK cannot be combined with set operations (UNION/INTERSECT/EXCEPT) or GROUPING SETS/CUBE/ROLLUP.


Tables Without Primary Keys

While primary keys are not required, their absence changes how pg_trickle identifies rows. This section explains the content-based hashing fallback and its limitations with duplicate rows.

Do source tables need a primary key?

No, but it is strongly recommended. When a source table has a primary key, pg_trickle uses it to generate a deterministic __pgt_row_id for each row — this is the most reliable way to track row identity across refreshes.

Without a primary key, pg_trickle falls back to content-based hashing — an xxHash of all column values. This works correctly for tables where every row is unique, but has known issues with exact duplicate rows. See What are the risks of using tables without primary keys? for details.

What are the risks of using tables without primary keys?

Content-based row identity has known limitations with exact duplicate rows (rows where every column value is identical):

  1. INSERT as no-op: If a row identical to an existing one is inserted, both have the same __pgt_row_id hash, so the MERGE treats it as a no-op (the row already exists).
  2. DELETE removes all copies: Deleting one of N identical rows generates a DELETE delta, but the MERGE removes all rows with that __pgt_row_id.
  3. Aggregate drift: Over time, these mismatches can cause aggregate values to drift from the true result.

Recommendation: Add a primary key or unique constraint to source tables, or use FULL mode for tables with frequent exact-duplicate rows.

How does content-based row identity work for duplicate rows?

For tables without a primary key, __pgt_row_id is computed as pg_trickle_hash_multi(ARRAY[col1::text, col2::text, ...]) — an xxHash of all column values. Rows with identical content produce identical hashes.

The hash uses \x1E (record separator) between values and \x00NULL\x00 for NULL values, minimizing collision risk for rows with different content. However, truly identical rows (same values in every column) will always hash to the same value — this is inherent to content-based identity.
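Conceptually, the row id is equivalent to hashing the column values joined by the record separator, with NULLs encoded distinctly. The sketch below is illustrative only: the extension uses xxHash, and md5 stands in here; SQL text cannot hold NUL bytes, so a visible placeholder replaces the \x00NULL\x00 sentinel:

```sql
-- Sketch of the identity rule: identical rows produce identical ids
SELECT md5(concat_ws(E'\x1E',
           coalesce(col1::text, '<NULL>'),
           coalesce(col2::text, '<NULL>'))) AS row_id_sketch
FROM some_table;
```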


Change Data Capture (CDC)

This section explains how pg_trickle captures changes to your source tables, the trade-offs between trigger-based and WAL-based CDC, and operational topics like backup/restore and buffer inspection.

How does pg_trickle capture changes to source tables?

pg_trickle installs AFTER INSERT/UPDATE/DELETE row-level PL/pgSQL triggers on each source table referenced by a stream table. Whenever a row in the source table is modified, the trigger writes a change record into a per-source buffer table in the pgtrickle_changes schema.

Each change record contains:

  • Action — I (insert), U (update), D (delete), or T (truncate marker)
  • Row data — old and/or new row values serialized as JSONB
  • LSN — the current WAL log sequence number, used for frontier tracking
  • Transaction ID — links the change to its originating transaction

The trigger fires within your transaction, so if you roll back, the change record is also rolled back. This guarantees that only committed changes appear in the buffer.
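A simplified sketch of what such a capture trigger looks like (the real function is generated internally by pg_trickle; the function name, buffer table name, and column list here are illustrative):

```sql
CREATE FUNCTION cdc_capture_sketch() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO pgtrickle_changes.changes_16384 (action, lsn, txid, old_data, new_data)
        VALUES ('I', pg_current_wal_lsn(), txid_current(), NULL, to_jsonb(NEW));
    ELSIF TG_OP = 'UPDATE' THEN
        INSERT INTO pgtrickle_changes.changes_16384 (action, lsn, txid, old_data, new_data)
        VALUES ('U', pg_current_wal_lsn(), txid_current(), to_jsonb(OLD), to_jsonb(NEW));
    ELSE  -- DELETE
        INSERT INTO pgtrickle_changes.changes_16384 (action, lsn, txid, old_data, new_data)
        VALUES ('D', pg_current_wal_lsn(), txid_current(), to_jsonb(OLD), NULL);
    END IF;
    RETURN NULL;  -- the return value of an AFTER row trigger is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER cdc_capture_sketch
AFTER INSERT OR UPDATE OR DELETE ON orders
FOR EACH ROW EXECUTE FUNCTION cdc_capture_sketch();
```

Because the buffer INSERT runs inside the same transaction as the source DML, a rollback discards the change record along with the data change.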

What is the overhead of CDC triggers?

The per-row overhead is approximately 20–55 μs, which covers the PL/pgSQL function dispatch, row_to_json() serialization, and the buffer table INSERT.

At typical write rates (fewer than 1,000 writes per second per source table), this adds less than 5% additional DML latency. For most OLTP workloads, the overhead is negligible — a single network round-trip to the database is usually 10–100× more expensive.

If you have very high-throughput source tables (>10K writes/sec), consider enabling the hybrid CDC mode (pg_trickle.cdc_mode = 'auto') which can automatically transition to WAL-based capture for lower per-row overhead (~5–15 μs).

What happens when I TRUNCATE a source table?

TRUNCATE is captured via a statement-level AFTER TRUNCATE trigger that writes a T marker row to the change buffer. When the differential refresh engine detects this marker, it automatically falls back to a full refresh for that cycle, ensuring the stream table stays consistent. Both FULL and DIFFERENTIAL mode stream tables handle TRUNCATE correctly.

Are CDC triggers automatically cleaned up?

Yes. pg_trickle tracks which source tables are referenced by which stream tables in the pgt_dependencies catalog. When the last stream table referencing a particular source table is dropped, pg_trickle automatically:

  1. Removes the CDC triggers from the source table.
  2. Drops the associated change buffer table (pgtrickle_changes.changes_<oid>).

You do not need to manually clean up triggers or buffer tables.

What happens if a source table is dropped or altered?

pg_trickle has DDL event triggers that listen for ALTER TABLE and DROP TABLE on source tables. When a change is detected, pg_trickle responds automatically:

  1. All stream tables that depend on the altered source are marked with needs_reinit = true in the catalog.
  2. On the next scheduler cycle, each affected stream table is reinitialized — the existing storage table is dropped, recreated from the current defining query schema, and re-populated with a full refresh.
  3. A reinitialize_needed NOTIFY alert is sent so your monitoring can detect the event.

If the DDL change breaks the defining query (e.g., a column referenced in the query was dropped), the reinitialization will fail and the stream table will enter ERROR status. In that case, you need to drop and recreate the stream table with an updated query.

How do I check if a source table has switched from trigger-based CDC to WAL-based CDC?

When you enable hybrid CDC (pg_trickle.cdc_mode = 'auto'), pg_trickle starts capturing changes with triggers and can automatically transition to WAL-based logical replication once conditions are met. There are several ways to check the current CDC mode for each source table:

1. Query the dependency catalog directly:

SELECT d.source_relid, c.relname AS source_table, d.cdc_mode,
       d.slot_name, d.decoder_confirmed_lsn, d.transition_started_at
FROM pgtrickle.pgt_dependencies d
JOIN pg_class c ON c.oid = d.source_relid;

The cdc_mode column shows one of three values:

  • TRIGGER — changes are captured via row-level triggers (the default)
  • TRANSITIONING — the system is in the process of switching from triggers to WAL
  • WAL — changes are captured via logical replication

2. Use the built-in health check function:

SELECT source_table, cdc_mode, slot_name, lag_bytes, alert
FROM pgtrickle.check_cdc_health();

This returns a row per source table with the current mode, replication slot lag (for WAL-mode sources), and any alert conditions such as slot_lag_exceeds_threshold or replication_slot_missing.

3. Listen for real-time transition notifications:

LISTEN pg_trickle_cdc_transition;

pg_trickle sends a NOTIFY with a JSON payload whenever a transition starts, completes, or is rolled back. Example payload:

{
  "event": "transition_complete",
  "source_table": "public.orders",
  "old_mode": "TRANSITIONING",
  "new_mode": "WAL",
  "slot_name": "pg_trickle_slot_16384"
}

This lets you integrate CDC mode changes into your monitoring stack without polling.

4. Check the global GUC setting:

SHOW pg_trickle.cdc_mode;

This shows the desired global behavior (trigger, auto, or wal), not the per-table actual state. The per-table state lives in pgt_dependencies.cdc_mode as described above.

See CONFIGURATION.md for details on the pg_trickle.cdc_mode, pg_trickle.wal_transition_timeout, pg_trickle.slot_lag_warning_threshold_mb, and pg_trickle.slot_lag_critical_threshold_mb GUCs.

Is it safe to add triggers to a stream table while the source table is switching CDC modes?

Yes, this is completely safe. CDC mode transitions and user-defined triggers operate on different tables and do not interfere with each other:

  • CDC transitions affect how changes are captured from source tables (e.g., orders). The transition switches the capture mechanism from row-level triggers on the source table to WAL-based logical replication.
  • User-defined triggers live on stream tables (e.g., order_totals) and control how the refresh engine applies changes to the materialized output.

Because these are independent concerns, you can freely add, modify, or remove triggers on a stream table at any point — including during an active CDC transition on its source tables.

How it works in practice:

  1. The refresh engine checks for user-defined triggers on the stream table at the start of each refresh cycle (via a fast pg_trigger lookup, <0.1 ms).
  2. If user triggers are detected, the engine uses explicit DELETE / UPDATE / INSERT statements instead of MERGE, so your triggers fire with correct TG_OP, OLD, and NEW values.
  3. The change data consumed by the refresh engine has the same format regardless of whether it came from CDC triggers or WAL decoding — so the trigger detection and the CDC mode are fully decoupled.

A trigger added between two refresh cycles will simply be picked up on the next cycle. The only (theoretical) edge case is adding a trigger in the tiny window during a single refresh transaction, between the trigger-detection check and the MERGE execution — but since both happen within the same transaction, this is virtually impossible in practice.

Why does pg_trickle use triggers instead of logical replication for initial CDC?

pg_trickle always bootstraps CDC with row-level AFTER triggers because they provide single-transaction atomicity — the change record is written in the same transaction as the source DML, so:

  1. No commit-order ambiguity. The change buffer always reflects committed data; rolled-back transactions never produce partial change records.
  2. No replication slot management at creation time. Logical replication requires creating and monitoring replication slots, which can bloat WAL if the subscriber falls behind. Trigger-based bootstrap avoids this complexity.
  3. Works on all hosting providers. Some managed PostgreSQL services restrict wal_level = logical or limit the number of replication slots. Trigger bootstrap works everywhere, with no configuration changes.
  4. Simpler initial deployment. No need for wal_level = logical, no publication/subscription setup, and no extra connections for WAL senders.

With pg_trickle.cdc_mode = 'auto' (the default since v0.3.0), pg_trickle uses triggers initially and then transparently transitions to WAL-based CDC if wal_level = logical is available. If WAL is not available, triggers are kept permanently — no degradation, no errors. Set pg_trickle.cdc_mode = 'trigger' if you want to disable WAL transitions entirely. See ADR-001 and ADR-002 in the architecture documentation for the full rationale.

Why is auto the default pg_trickle.cdc_mode?

As of v0.3.0, auto is the default CDC mode. This was changed from trigger based on the following considerations:

1. Safe no-op on standard installs. PostgreSQL ships with wal_level = replica by default. In this configuration, auto simply stays on trigger-based CDC permanently — it does not create replication slots, publications, or any WAL infrastructure. There is no error, warning, or user-visible difference from the old trigger default. auto only activates the WAL transition path when wal_level = logical is explicitly configured by the operator.

2. Automatic fallback hardening. The WAL transition and steady-state polling now include robust automatic fallback:

  • Consecutive poll errors (5 failures) trigger automatic revert to triggers.
  • check_decoder_health() validates slot existence, WAL lag, and wal_level on every tick.
  • The TRANSITIONING phase has a progressive timeout with informative warnings.
  • Post-restart health checks (check_cdc_transition_health()) automatically clean up stale transitions.

3. Zero overhead for trigger-only deployments. When wal_level != logical, the auto scheduler branch takes a fast-path exit after a single GUC check and pg_replication_slots query. The overhead compared to trigger mode is negligible (<1 ms per scheduler tick).

4. Progressive optimization without config changes. When an operator later enables wal_level = logical (e.g., for other replication needs), pg_trickle automatically benefits from lower per-row CDC overhead (~5–15 μs vs ~20–55 μs) without any configuration change. This aligns with the principle of least surprise.

When to use trigger instead: Set pg_trickle.cdc_mode = 'trigger' if you want fully deterministic trigger-only behavior, need to minimize replication slot management, or are on a restricted managed PostgreSQL service that caps replication slots. This reverts to the pre-v0.3.0 default.

Caveats to be aware of in auto mode:

  • Keyless tables (no PRIMARY KEY) stay on triggers permanently — WAL mode requires a PK for pk_hash computation.
  • Replication slots prevent WAL recycling: if the decoder falls behind, WAL accumulates. pg_trickle now warns at pg_trickle.slot_lag_warning_threshold_mb (default 100 MB) and marks per-source CDC health unhealthy at pg_trickle.slot_lag_critical_threshold_mb (default 1024 MB).
  • The TRANSITIONING phase runs both trigger and WAL decoder simultaneously; LSN-based deduplication handles correctness. If anything goes wrong, the system rolls back to triggers.

How does the trigger-to-WAL automatic transition work?

When pg_trickle.cdc_mode = 'auto', pg_trickle monitors each source table's write rate. When the rate exceeds an internal threshold, the transition proceeds in three phases:

  1. Slot creation. A logical replication slot is created for the source table's OID (e.g., pg_trickle_slot_16384).
  2. Dual capture. For a brief period, both triggers and WAL decoding capture changes. The system uses LSN comparison to deduplicate, ensuring no changes are lost or double-counted.
  3. Trigger removal. Once the WAL decoder has confirmed it is caught up (its confirmed LSN ≥ the frontier LSN), the row-level triggers are dropped and the source transitions fully to WAL mode.

The transition is tracked in pgt_dependencies.cdc_mode (values: TRIGGER → TRANSITIONING → WAL). If the transition times out (pg_trickle.wal_transition_timeout, default 5 minutes), it is rolled back and triggers are kept.

What happens to CDC if I restore a database backup?

After restoring a backup (pg_dump, pg_basebackup, or PITR), the CDC state depends on the backup type:

| Backup type | Triggers | Change buffers | Frontier | Action needed |
|---|---|---|---|---|
| pg_dump (logical) | Preserved (in DDL) | Buffer rows included | Catalog restored | Usually none — next refresh detects stale frontier and does a full refresh |
| pg_basebackup (physical) | Preserved | Buffer rows preserved (committed at backup time) | Catalog restored | Replication slots may be invalid — WAL-mode sources may need manual transition back to TRIGGER mode |
| PITR (point-in-time) | Preserved | Only committed buffer rows at the recovery target | Catalog restored | Similar to pg_basebackup; frontier may point ahead of actual buffer content, so the first refresh does a full refresh to reconcile |

In all cases, the pg_trickle scheduler automatically detects frontier inconsistencies and falls back to a full refresh for the first cycle after restore. No manual intervention is required for trigger-mode sources.

For full guidelines on disaster recovery strategies, see our dedicated Backup and Restore chapter.

For WAL-mode sources, replication slots created after the backup point will not exist in the restored state. Set pg_trickle.cdc_mode = 'trigger' temporarily, or let the auto transition recreate slots.

Do CDC triggers fire for rows inserted via logical replication (subscribers)?

Yes. PostgreSQL fires row-level triggers on the subscriber side for rows applied via logical replication. This means if you have a subscriber database with pg_trickle installed, the CDC triggers will capture replicated changes into the local change buffers.

Implication: You can run stream tables on a subscriber database that tracks replicated tables — the change capture works transparently. However, be careful about:

  • Double-counting. If the same table is tracked by pg_trickle on both the publisher and subscriber, changes are captured twice (once on each side). This is fine if the stream tables are independent, but confusing if you expect them to be identical.
  • Replication lag. The stream table on the subscriber will be delayed by both the replication lag and the pg_trickle refresh schedule.

Can I inspect the change buffer tables directly?

Yes. Change buffers are ordinary tables in the pgtrickle_changes schema, named changes_<source_oid>:

-- List all change buffer tables
SELECT tablename FROM pg_tables WHERE schemaname = 'pgtrickle_changes';

-- Inspect recent changes for a source table (find OID first)
SELECT c.oid FROM pg_class c JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname = 'orders' AND n.nspname = 'public';

-- Then query the buffer
SELECT action, lsn, txid, old_data, new_data
FROM pgtrickle_changes.changes_16384
ORDER BY lsn DESC LIMIT 10;

The action column contains: I (insert), U (update), D (delete), or T (truncate).

Warning: Do not modify buffer tables directly. The refresh engine manages buffer cleanup (truncation) after each successful refresh. Manual changes will corrupt the frontier tracking.

How does pg_trickle prevent its own refresh writes from re-triggering CDC?

When the refresh engine writes to a stream table (via MERGE or explicit DML), it does not trigger CDC capture on that stream table, even if the stream table is itself a source for a downstream stream table. This is because:

  1. CDC triggers are only installed on source tables, not on stream tables. The refresh engine writes directly to the stream table's storage without going through any change-capture mechanism.
  2. Downstream change propagation uses a different path. When stream table A is a source for stream table B, changes to A are detected at B's refresh time by re-reading A's data (not via triggers on A). The topological ordering ensures A is refreshed before B.

This design prevents infinite loops (A triggers B triggers A) and avoids the overhead of capturing changes to materialized output that will be recomputed anyway.


Diamond Dependencies & DAG Scheduling

When multiple stream tables form a diamond-shaped dependency graph, careful coordination is needed to avoid inconsistent snapshots. This section covers atomic consistency, schedule policies, and topological ordering.

What is a diamond dependency and why does it matter?

A diamond dependency occurs when two (or more) intermediate stream tables both depend on the same source, and a downstream stream table depends on both of them:

       Source: orders
       /             \
  ST: totals      ST: counts
       \             /
    ST: combined_report

Without coordination, combined_report might be refreshed after totals is updated but before counts is updated (or vice versa), producing a temporarily inconsistent snapshot — totals reflects the latest data but counts is stale.

What does diamond_consistency = 'atomic' do?

When diamond_consistency = 'atomic' is set on the downstream stream table (e.g., combined_report), pg_trickle ensures that all upstream stream tables in the diamond are refreshed within the same scheduler cycle before the downstream table is refreshed. This guarantees a consistent point-in-time snapshot.

If any upstream refresh in the atomic group fails, the downstream refresh is skipped for that cycle to avoid inconsistency. The failed upstream will be retried on the next cycle.

SELECT pgtrickle.alter_stream_table('combined_report',
    diamond_consistency => 'atomic');

What is the difference between 'fastest' and 'slowest' schedule policy?

When a stream table has multiple upstream dependencies with different schedules, pg_trickle needs a policy for when to refresh the downstream table:

| Policy | Behavior | Best for |
|---|---|---|
| fastest | Refresh downstream whenever any upstream refreshes | Low-latency dashboards where partial freshness is acceptable |
| slowest | Refresh downstream only after all upstreams have refreshed | Reports requiring all-or-nothing consistency |

The default is fastest. Use slowest with diamond_consistency = 'atomic' for the strongest consistency guarantees.
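Assuming a schedule_policy parameter on alter_stream_table (the parameter name here is hypothetical, chosen by analogy with diamond_consistency; check the function reference for the exact name), the policy could be set like this:

```sql
-- Hypothetical parameter name shown for illustration
SELECT pgtrickle.alter_stream_table('combined_report',
    schedule_policy => 'slowest');
```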

What happens when an atomic diamond group partially fails?

When diamond_consistency = 'atomic' is set and one upstream stream table in the diamond fails to refresh:

  1. The downstream refresh is skipped for that cycle (it reads stale-but-consistent data from the previous successful cycle).
  2. The failed upstream follows the normal retry logic (exponential backoff, up to max_consecutive_errors).
  3. Other non-failing upstreams in the diamond are still refreshed normally — their data is fresh, but the downstream won't consume it until all upstreams succeed.
  4. A NOTIFY pg_trickle_alert with event diamond_partial_failure is sent so your monitoring can detect the situation.

How does pg_trickle determine topological refresh order?

The scheduler builds a directed acyclic graph (DAG) of stream table dependencies at startup and after any create_stream_table / drop_stream_table call. The algorithm:

  1. Edge discovery. For each stream table, the defining query's source tables are extracted. If a source table is itself a stream table, a dependency edge is added.
  2. Cycle detection. The DAG is checked for cycles. If a cycle is detected, the offending create_stream_table call is rejected with a clear error message listing the cycle path.
  3. Topological sort. A Kahn's algorithm topological sort produces the refresh order — leaf nodes (no stream table dependencies) are refreshed first, then their dependents, and so on.
  4. Level assignment. Each stream table is assigned a "level" (0 for leaves, max(parent levels) + 1 for dependents). Stream tables at the same level are refreshed concurrently when pg_trickle.parallel_refresh_mode = 'on'.

The topological order is recalculated whenever the DAG changes. You can inspect it with:

SELECT pgt_name, depends_on, topo_level
FROM pgtrickle.stream_tables_info
ORDER BY topo_level, pgt_name;
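
The sort and level assignment can be sketched as follows (a simplified model over table names; the real implementation works on catalog OIDs):

```python
from collections import deque

def topo_levels(edges: dict) -> dict:
    """Kahn's algorithm with level assignment.

    edges maps each stream table to the stream tables it depends on
    (non-stream sources are omitted). Leaves get level 0; dependents get
    max(parent levels) + 1. Raises ValueError on a dependency cycle.
    """
    indegree = {node: len(deps) for node, deps in edges.items()}
    dependents = {node: [] for node in edges}
    for node, deps in edges.items():
        for dep in deps:
            dependents[dep].append(node)

    level = {}
    queue = deque(n for n, d in indegree.items() if d == 0)
    while queue:
        node = queue.popleft()
        # All dependencies are already levelled when node is dequeued.
        level[node] = max((level[d] + 1 for d in edges[node]), default=0)
        for child in dependents[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    if len(level) != len(edges):
        raise ValueError("dependency cycle detected")
    return level
```

Leaves come out at level 0 and refresh first; stream tables sharing a level are the ones eligible to run concurrently when parallel refresh is enabled.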

Schema Changes & DDL Events

pg_trickle detects source table schema changes via PostgreSQL’s DDL event trigger system and reacts automatically. This section explains what happens for various DDL operations and how to handle them.

What happens when I add a column to a source table?

Adding a column to a source table is safe and non-disruptive if the stream table's defining query does not use SELECT *:

  • Named columns: If the defining query explicitly lists columns (e.g., SELECT id, name, amount FROM orders), the new column is simply not captured by CDC and has no effect on the stream table.
  • SELECT *: If the defining query uses SELECT *, pg_trickle detects the schema mismatch at the next refresh and marks the stream table with needs_reinit = true. The next scheduler cycle performs a full reinitialization — drops the storage table, recreates it with the new column set, and does a full refresh.

CDC triggers capture the full row as JSONB regardless of which columns the stream table uses, so no trigger changes are needed.

What happens when I drop a column used in a stream table's query?

Dropping a column that is referenced in a stream table's defining query will cause the next refresh to fail because the column no longer exists in the source table. pg_trickle handles this via:

  1. DDL event trigger detects the ALTER TABLE ... DROP COLUMN and marks all affected stream tables with needs_reinit = true.
  2. On the next refresh cycle, the scheduler attempts reinitialization — but the defining query will fail with a PostgreSQL error (e.g., column "amount" does not exist).
  3. The stream table moves to ERROR status after max_consecutive_errors failures.
  4. A reinitialize_needed NOTIFY alert is sent.

Resolution: Drop and recreate the stream table with an updated defining query:

SELECT pgtrickle.drop_stream_table('order_totals');
SELECT pgtrickle.create_stream_table(
    name         => 'order_totals',
    query        => 'SELECT id, name FROM orders',  -- updated query without dropped column
    schedule     => '1m',
    refresh_mode => 'DIFFERENTIAL'
);

What happens when I CREATE OR REPLACE a view used by a stream table?

PostgreSQL event triggers fire on CREATE OR REPLACE VIEW, so pg_trickle detects the change and marks dependent stream tables with needs_reinit = true. On the next refresh:

  • If the new view definition is compatible (same output columns, same types), reinitialization succeeds transparently — the stream table is repopulated with the new query logic.
  • If the new view definition changes the output schema (different columns or types), the delta query will fail and the stream table enters ERROR status.

Tip: To avoid disruption, use pgtrickle.alter_stream_table() to pause the stream table before replacing the view, then resume after verifying compatibility.

What happens when I alter or drop a function used in a stream table's query?

If a stream table's defining query calls a user-defined function (e.g., SELECT my_func(amount) FROM orders) and that function is altered or dropped:

  • ALTER FUNCTION (changing the body): pg_trickle does not detect this automatically — PostgreSQL does not fire DDL event triggers for function body changes. The stream table continues refreshing with the new function behavior. If this is intentional, no action is needed. If you want to fully rebuild the stream table against the new logic, temporarily switch to FULL mode and refresh:
    SELECT pgtrickle.alter_stream_table('my_st', refresh_mode => 'FULL');
    SELECT pgtrickle.refresh_stream_table('my_st');
    SELECT pgtrickle.alter_stream_table('my_st', refresh_mode => 'DIFFERENTIAL');
    
  • DROP FUNCTION: The next refresh fails because the function no longer exists. The stream table enters ERROR status. Recreate the function or drop and recreate the stream table.

What is reinitialize and when does it trigger?

Reinitialize is pg_trickle's mechanism for handling structural changes to source tables. When a stream table is marked with needs_reinit = true, the next scheduler cycle performs:

  1. Drop the existing storage table (the physical heap table backing the stream table).
  2. Recreate the storage table from the defining query's current output schema.
  3. Full refresh — run the defining query against current source data and populate the new storage table.
  4. Reset the frontier to the current LSN.
  5. Clear the needs_reinit flag.

Reinitialization triggers automatically when DDL event triggers detect ALTER TABLE, DROP TABLE, or CREATE OR REPLACE VIEW on source tables or intermediate views; a needs_reinit NOTIFY alert is sent at the same time so monitoring can observe it. You can also trigger reinitialization manually:

UPDATE pgtrickle.pgt_stream_tables SET needs_reinit = true WHERE pgt_name = 'my_st';
    

Can I block DDL on tracked source tables?

pg_trickle does not currently block DDL on source tables — it only reacts to DDL changes via event triggers. If you want to prevent accidental schema changes on critical source tables, use PostgreSQL's built-in mechanisms:

-- Revoke ALTER/DROP from application roles
REVOKE ALL ON TABLE orders FROM app_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON TABLE orders TO app_user;
-- ALTER/DROP require table ownership, which app_user (a non-owner) lacks;
-- the REVOKE/GRANT pair limits app_user to plain DML

Alternatively, create a custom event trigger that raises an exception when DDL targets tracked source tables:

CREATE OR REPLACE FUNCTION prevent_source_ddl() RETURNS event_trigger AS $$
BEGIN
    IF EXISTS (
        SELECT 1 FROM pg_event_trigger_ddl_commands() cmd
        JOIN pgtrickle.pgt_dependencies d ON d.source_relid = cmd.objid
    ) THEN
        RAISE EXCEPTION 'Cannot ALTER/DROP a table tracked by pg_trickle';
    END IF;
END;
$$ LANGUAGE plpgsql;

CREATE EVENT TRIGGER guard_source_ddl ON ddl_command_end
EXECUTE FUNCTION prevent_source_ddl();

What happens if I run DDL on a source table during an active refresh?

PostgreSQL's locking mechanism prevents most conflicts. The refresh transaction acquires a ShareLock on source tables before reading them. Since ALTER TABLE (including ADD COLUMN, DROP COLUMN, ALTER TYPE) requires an AccessExclusiveLock, the DDL statement blocks until the refresh transaction completes.

In practice:

  • During a refresh: The ALTER TABLE waits for the refresh to finish, then proceeds. pg_trickle's DDL event trigger then detects the change and marks the stream table for reinitialization.
  • Between refreshes: DDL proceeds immediately. The next refresh picks up the reinitialization flag.

There is a tiny theoretical window between lock acquisition and the first read where DDL could sneak in, but this is prevented by PostgreSQL's MVCC — the refresh's snapshot was taken before the DDL committed, so it reads the old schema regardless.

If pg_trickle.block_source_ddl = true: Column-affecting DDL on tracked source tables is rejected entirely with an ERROR, regardless of whether a refresh is running.

Do stream tables work with logical replication?

Stream tables are ordinary heap tables, so their contents replicate to physical standbys via streaming replication and can be published over logical replication. However, pg_trickle does not maintain them on a standby or subscriber:

  • Scheduler runs: primary yes; physical standby no (read-only); logical subscriber no (no pg_trickle catalog).
  • Stream tables readable: primary yes; physical standby yes (replicated); logical subscriber only if published.
  • Refreshes occur: primary yes; physical standby no (standby is read-only); logical subscriber no.
  • Change buffers: managed by pg_trickle on the primary; replicated but not consumed on a physical standby; not available on a logical subscriber.

Key limitations:

  • Change buffer tables (pgtrickle_changes.*) are not published through logical replication — they are internal transient data.
  • The pg_trickle catalog (pgtrickle.pgt_stream_tables) is not replicated through logical replication.
  • On a physical standby, stream tables receive updates through streaming replication with the usual replication lag.

Recommended pattern: Run pg_trickle on the primary only. Read stream tables from any physical standby.


Performance & Tuning

This section covers scheduler tuning, the adaptive FULL fallback, disk space management, and guidance on when to use DIFFERENTIAL vs. FULL mode.

How do I tune the scheduler interval?

The pg_trickle.scheduler_interval_ms GUC controls how often the scheduler checks for stale stream tables (default: 1000 ms).

  • Low-latency (near real-time): 100–500 ms
  • Standard: 1000 ms (default)
  • Low-overhead (many STs, long schedules): 5000–10000 ms

Is there any risk in setting min_schedule_seconds very low?

Yes. pg_trickle.min_schedule_seconds (default: 60) is a safety guardrail, not an arbitrary limit. Setting it very low — especially in production — can cause several problems:

WAL amplification. Every differential refresh writes a MERGE to the WAL. At 1-second intervals across many stream tables, WAL generation rises sharply, increasing replication lag and storage costs.

Lock contention. Each refresh acquires locks on the change buffer table. With cleanup_use_truncate = true (the default), this is an AccessExclusiveLock. Sub-second schedules can starve concurrent INSERT/UPDATE/DELETE statements on the source tables.

Cascading refresh load. If a refresh takes longer than the schedule interval (e.g., an 800 ms refresh on a 1-second schedule), the next refresh fires almost immediately upon completion. With chained or diamond-shaped ST graphs, the entire topological chain must complete within the interval to avoid falling behind.

Autovacuum pressure. Rapid MERGE operations produce dead tuples in the stream table faster than autovacuum can clean them up, bloating the table and degrading query performance over time.

Adaptive fallback triggering. At high change rates, pg_trickle.differential_max_change_ratio may trigger a FULL refresh instead of DIFFERENTIAL. A FULL refresh at 1-second intervals is very expensive and defeats the purpose of differential maintenance.

Practical guidance:

  • Development / testing: 1 s is fine for fast iteration
  • Lightly loaded production: 10–30 s
  • Standard production: 60 s (default)
  • High-throughput OLTP: 120 s or more, letting change buffers accumulate for efficient batch merging

If you need near-real-time results, consider IMMEDIATE mode (refresh_mode => 'DIFFERENTIAL' with same-transaction refresh) instead of a very short schedule — it avoids the scheduler overhead entirely and updates the stream table within your transaction.

What is the adaptive fallback to FULL?

When the number of pending changes exceeds pg_trickle.differential_max_change_ratio (default: 15%) of the source table size, DIFFERENTIAL mode automatically falls back to FULL for that refresh cycle. This prevents pathological delta queries on bulk changes.

  • Set to 1.0 to effectively always use DIFFERENTIAL (fallback fires only when pending changes exceed the table size)
  • Set to 0.0 to always fall back to FULL
  • The default 0.15 (15%) is a good balance

How many concurrent refreshes can run?

By default (parallel_refresh_mode = 'off') refreshes are processed sequentially within the scheduler's single background worker. This is safe and efficient for most deployments.

Starting in v0.4.0, true parallel refresh is available via:

ALTER SYSTEM SET pg_trickle.parallel_refresh_mode = 'on';
ALTER SYSTEM SET pg_trickle.max_dynamic_refresh_workers = 4;  -- cluster-wide cap
ALTER SYSTEM SET pg_trickle.max_concurrent_refreshes = 4;    -- per-database cap
SELECT pg_reload_conf();

When enabled, independent stream tables at the same DAG level are refreshed concurrently in separate dynamic background workers. Each worker uses one max_worker_processes slot — see the worker-budget formula before enabling.

Monitor parallel refresh with:

SELECT * FROM pgtrickle.worker_pool_status();
SELECT * FROM pgtrickle.parallel_job_status(60);

For most deployments with fewer than 100 stream tables, sequential processing is still efficient (each differential refresh typically takes 5–50 ms).

How do I check if my stream tables are keeping up?

-- Quick overview
SELECT pgt_name, status, staleness, stale
FROM pgtrickle.stream_tables_info;

-- Detailed statistics
SELECT pgt_name, total_refreshes, avg_duration_ms, consecutive_errors, stale
FROM pgtrickle.pg_stat_stream_tables;

-- Recent refresh history for a specific ST
SELECT * FROM pgtrickle.get_refresh_history('order_totals', 10);

What is __pgt_row_id?

Every stream table has a __pgt_row_id BIGINT PRIMARY KEY column that stores a 64-bit xxHash of the row's identity key. The refresh engine uses it to match incoming deltas against existing rows during MERGE operations.

For a detailed explanation of how this column is computed and why it exists, see What is the __pgt_row_id column and why does it appear in my stream tables? in the General section.

You should ignore this column in your queries. It is an implementation detail.
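
For intuition, here is a stand-in for the computation. This is a sketch, not the extension's exact scheme: Python's hashlib has no xxHash, so an 8-byte BLAKE2b digest substitutes, and the separator and signed-wrap details are illustrative:

```python
import hashlib

def row_id(*key_cols) -> int:
    """Map a row's identity key to a signed 64-bit id, as __pgt_row_id
    does conceptually. pg_trickle uses 64-bit xxHash; BLAKE2b stands in.
    """
    # Separator avoids ambiguity between ('ab', 'c') and ('a', 'bc').
    material = "\x1f".join(str(c) for c in key_cols).encode()
    unsigned = int.from_bytes(
        hashlib.blake2b(material, digest_size=8).digest(), "big")
    # BIGINT is signed: wrap values into the signed 64-bit range.
    return unsigned - (1 << 64) if unsigned >= (1 << 63) else unsigned
```

The point is determinism: the same identity key always yields the same id, which is what lets MERGE match an incoming delta row against the stored row.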

How much disk space do change buffer tables consume?

Each change buffer table stores one row per source-table change (INSERT, UPDATE, DELETE, or TRUNCATE marker). The row size depends on the source table's column count and data types:

  • action column (char): 1 byte
  • old_data / new_data (JSONB): 1–10 KB per row, depending on source columns
  • lsn (pg_lsn): 8 bytes
  • txid (xid8): 8 bytes
  • index on lsn: ~40 bytes per row

Rule of thumb: Buffer tables consume roughly 2–3× the raw row size of the source change, because both OLD and NEW values are stored as JSONB.
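
A back-of-envelope sizing helper based on that rule of thumb (all numbers illustrative):

```python
def buffer_growth_per_cycle(writes_per_sec: float, avg_row_bytes: float,
                            refresh_interval_s: float,
                            amplification: float = 2.5) -> float:
    """Estimate bytes accumulated in a change buffer between refreshes.

    amplification ~2.5 reflects OLD + NEW stored as JSONB plus index
    overhead (the 2-3x rule of thumb).
    """
    return writes_per_sec * refresh_interval_s * avg_row_bytes * amplification

# 200 writes/s, 300-byte rows, 60 s schedule:
# 200 * 60 * 300 * 2.5 = 9,000,000 bytes, roughly 9 MB per cycle
```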

Buffer tables are cleaned up (truncated or deleted) after each successful refresh. If you suspect buffer bloat, check:

SELECT relname, pg_size_pretty(pg_total_relation_size(oid)) AS size
FROM pg_class
WHERE relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = 'pgtrickle_changes')
ORDER BY pg_total_relation_size(oid) DESC;

What determines whether DIFFERENTIAL or FULL is faster for a given workload?

The breakeven point depends on the change ratio — the number of changed rows relative to the total source table size:

  • < 5%: DIFFERENTIAL. The delta query touches few rows and is much cheaper than re-reading everything.
  • 5–15%: DIFFERENTIAL (usually). Still faster, but approaching the crossover.
  • 15–50%: FULL. The delta query scans a large fraction of the source anyway; FULL avoids the overhead of delta computation.
  • > 50%: FULL. Bulk-load scenario; TRUNCATE + INSERT is simpler and faster.

Additional factors:

  • Query complexity: Queries with many joins or window functions have more expensive delta computation. The crossover shifts lower.
  • Source table size: For small tables (<10K rows), FULL is nearly always faster because the overhead is negligible.
  • Index presence: DIFFERENTIAL uses indexes to look up changed rows. Missing indexes on join keys or GROUP BY columns can make delta queries slow.

The adaptive fallback (pg_trickle.differential_max_change_ratio, default 0.15) automates this decision per-cycle.

What are the planner hints and when should I disable them?

Before executing a delta query, pg_trickle sets several session-level planner parameters to guide PostgreSQL toward efficient delta plans:

SET LOCAL enable_seqscan = off;     -- Prefer index scans for small deltas
SET LOCAL enable_nestloop = on;     -- Nested loops are good for small delta × large table joins
SET LOCAL enable_mergejoin = off;   -- Merge joins are worse for skewed delta sizes

These hints are active only during the refresh transaction and are reset afterward.

When to disable hints: If you notice that a particular stream table's refresh is slow (check avg_duration_ms in pg_stat_stream_tables), the planner hints may be suboptimal for that specific query. You can disable them by setting:

SET pg_trickle.planner_hints = off;

This allows PostgreSQL's planner to choose its own strategy. Test both settings and compare avg_duration_ms.

How do prepared statements help refresh performance?

The refresh engine uses PostgreSQL prepared statements (PREPARE / EXECUTE) for the delta and MERGE queries. On the first refresh, the statement is prepared; subsequent refreshes reuse the cached plan. Benefits:

  • Reduced planning overhead. For complex delta queries with many joins and CTEs, planning can take 5–50 ms. Prepared statements skip this on subsequent refreshes.
  • Stable plans. The planner uses generic plans after the 5th execution (PostgreSQL default), avoiding plan instability from statistic fluctuations.

Prepared statements are stored per-session and are invalidated when:

  • The stream table is reinitialized (schema change)
  • The shared cache generation advances after DDL or stream-table metadata changes
  • The PostgreSQL connection is recycled
  • The session ends
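
Conceptually this is a cache keyed by a generation counter: a plan is reusable only while the generation is unchanged. A toy model (names and structure illustrative, not the engine's internals):

```python
class PlanCache:
    """Toy model of per-session prepared-statement reuse in a refresh engine."""

    def __init__(self):
        self._plans = {}  # st_name -> (generation, plan)

    def get_or_prepare(self, st_name, generation, prepare):
        cached = self._plans.get(st_name)
        if cached and cached[0] == generation:
            return cached[1]        # cache hit: planning cost skipped
        plan = prepare(st_name)     # invalidated or cold: re-PREPARE
        self._plans[st_name] = (generation, plan)
        return plan
```

Reinitialization, DDL, and metadata changes bump the generation, forcing the next refresh to re-prepare; a recycled connection or ended session simply starts with an empty cache.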

How does the adaptive FULL fallback threshold work in practice?

The pg_trickle.differential_max_change_ratio GUC (default: 0.15) is evaluated per source table, per refresh cycle:

  1. Before each differential refresh, the engine counts pending changes in the buffer table: pending_changes = COUNT(*) FROM pgtrickle_changes.changes_<oid>.
  2. It estimates the source table size from pg_class.reltuples.
  3. If pending_changes / reltuples > differential_max_change_ratio, the engine falls back to FULL for that cycle.

Edge cases:

  • If the source table has reltuples = 0 (freshly created, no ANALYZE yet), the engine always uses FULL until statistics are available.
  • For multi-source stream tables (joins), each source is evaluated independently. If any source exceeds the threshold, the entire refresh falls back to FULL.
  • The threshold applies to the current cycle only — the next cycle re-evaluates.
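
Putting those rules together, the per-cycle decision can be sketched as (field handling simplified):

```python
def choose_refresh_mode(pending_changes: dict, reltuples: dict,
                        max_change_ratio: float = 0.15) -> str:
    """Return 'FULL' or 'DIFFERENTIAL' for one refresh cycle.

    pending_changes: buffered change counts per source table.
    reltuples: planner row estimates per source table (0 = no ANALYZE yet).
    """
    for source, changes in pending_changes.items():
        size = reltuples.get(source, 0)
        if size <= 0:
            return "FULL"  # no statistics yet: change ratio is undefined
        if changes / size > max_change_ratio:
            return "FULL"  # any one source over the threshold forces FULL
    return "DIFFERENTIAL"
```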

How many stream tables can a single PostgreSQL instance handle?

There is no hard limit. Practical limits depend on:

  • Scheduler overhead: each cycle iterates all STs; at 1000 STs with 1 ms of overhead per check, the cycle takes ~1 s.
  • Background connections: 1 per database (the scheduler) plus 1 per manual refresh call.
  • Change buffer bloat: each source table gets its own buffer table, so many sources mean many tables in pgtrickle_changes.
  • Catalog size: pgt_stream_tables and pgt_dependencies grow linearly.
  • Refresh throughput: sequential processing means total cycle time is the sum of individual refresh times.

Tested benchmarks: Up to 500 stream tables on a single instance with <2s total cycle time for DIFFERENTIAL refreshes averaging 3ms each.

What is the TRUNCATE vs DELETE cleanup trade-off for change buffers?

After each successful refresh, the engine cleans up processed change records from the buffer table. The pg_trickle.cleanup_use_truncate GUC (default: true) controls the method:

  • TRUNCATE (default): instant, O(1) regardless of row count, and reclaims disk space immediately. Cost: an ACCESS EXCLUSIVE lock on the buffer table briefly blocks concurrent INSERTs from CDC triggers (~0.1 ms typical).
  • DELETE: takes row-level locks only, so concurrent CDC writes are never blocked. Cost: O(N) in the number of processed rows, and dead tuples require VACUUM to reclaim space.

When to switch to DELETE: If your source table has extremely high write throughput (>10K writes/sec) and you observe brief stalls in DML latency during refresh cleanup, switch to DELETE:

ALTER SYSTEM SET pg_trickle.cleanup_use_truncate = false;
SELECT pg_reload_conf();

For most workloads, TRUNCATE is the better choice because buffer tables are typically emptied completely after each refresh.


Interoperability

Stream tables are standard PostgreSQL heap tables, which means they work with most PostgreSQL features. This section clarifies what’s compatible (views, replication, triggers) and what’s not (direct DML, foreign keys).

Can PostgreSQL views reference stream tables?

Yes. Since stream tables are standard PostgreSQL heap tables, you can create views on top of them just like any other table. The view will return whatever data is currently in the stream table, reflecting the most recent refresh:

CREATE VIEW high_value_customers AS
SELECT customer_id, total FROM pgtrickle.order_totals WHERE total > 1000;

This is a common pattern for adding per-user filters or formatting on top of a shared stream table.

Can materialized views reference stream tables?

Yes, though this is usually redundant — both materialized views and stream tables are physical snapshots of query results. The key difference is that the materialized view requires its own manual REFRESH MATERIALIZED VIEW call; it does not auto-refresh when the underlying stream table refreshes.

A more idiomatic approach is to create a second stream table that references the first one. This way, pg_trickle handles the dependency ordering and refresh scheduling for both automatically.

Can I replicate stream tables with logical replication?

Yes. Stream tables can be published like any ordinary table:

CREATE PUBLICATION my_pub FOR TABLE pgtrickle.order_totals;

Important caveats:

  • The __pgt_row_id column is replicated (it is the primary key)
  • Subscribers receive materialized data, not the defining query
  • Do not install pg_trickle on the subscriber and attempt to refresh the replicated table — it will have no CDC triggers or catalog entries
  • Internal change buffer tables are not published by default

Can I INSERT, UPDATE, or DELETE rows in a stream table directly?

No. Stream table contents are managed exclusively by the refresh engine, and direct DML will corrupt the internal state (row IDs, frontier tracking, and change buffer consistency). See Why can't I INSERT, UPDATE, or DELETE rows in a stream table? for a detailed explanation of what goes wrong.

If you need to post-process stream table data, create a view or a second stream table that references the first one.

Can I add foreign keys to or from stream tables?

No. Foreign key constraints are incompatible with how the refresh engine operates. The engine uses bulk MERGE operations that apply inserts and deletes atomically, without guaranteeing the row-by-row ordering that foreign key checks require. Full refreshes also use TRUNCATE + INSERT, which bypasses cascade logic entirely.

See Why can't I add foreign keys? for details. If you need referential integrity, enforce it in your application or in a view that joins the stream tables.

Can I add my own triggers to stream tables?

Yes, for DIFFERENTIAL mode stream tables. When user-defined row-level triggers are detected, the refresh engine automatically switches from MERGE to explicit DELETE + UPDATE + INSERT statements. This ensures triggers fire with the correct TG_OP, OLD, and NEW values. Legacy configs that still set pg_trickle.user_triggers = 'on' are treated the same as auto.

Limitations:

  • Row-level triggers do not fire during FULL refresh (they are automatically suppressed via DISABLE TRIGGER USER). Use REFRESH MODE DIFFERENTIAL for stream tables with triggers.
  • The IS DISTINCT FROM guard prevents no-op UPDATE triggers when the aggregate result is unchanged.
  • BEFORE triggers that modify NEW will affect the stored value — the next refresh may "correct" it back, causing oscillation.

See the pg_trickle.user_triggers GUC in CONFIGURATION.md for control options.

Can I ALTER TABLE a stream table directly?

No. Direct ALTER TABLE would change the physical table without updating pg_trickle's catalog, causing column mismatches and __pgt_row_id invalidation on the next refresh. See Why can't I ALTER TABLE a stream table directly? for details.

Instead, use the pg_trickle API:

-- Change schedule, mode, or status:
SELECT pgtrickle.alter_stream_table('order_totals', schedule => '10m');

-- To change the defining query or column structure, drop and recreate:
SELECT pgtrickle.drop_stream_table('order_totals');
SELECT pgtrickle.create_stream_table(
    name         => 'order_totals',
    query        => '...',
    schedule     => '5m',
    refresh_mode => 'DIFFERENTIAL'
);

Does pg_trickle work with PgBouncer or other connection poolers?

It depends on the pooling mode. pg_trickle's background scheduler uses session-level features that are incompatible with transaction-mode connection pooling:

  • pg_advisory_lock(): session-level locks are released when the connection returns to the pool, making concurrent refreshes possible.
  • PREPARE / EXECUTE: prepared statements are session-scoped, so "does not exist" errors occur when statements land on a different connection.
  • LISTEN / NOTIFY: notifications are lost when listeners change connections.

Recommended configurations:

  • Session-mode pooling (pool_mode = session): Fully compatible. The scheduler holds a dedicated connection.
  • Direct connection (no pooler for the scheduler): Fully compatible. Application queries can still go through a pooler.
  • Transaction-mode pooling (pool_mode = transaction): Not supported. The scheduler requires a persistent session.

Tip: If your infrastructure requires transaction-mode pooling (e.g., AWS RDS Proxy, Supabase), route the pg_trickle background worker through a direct connection while keeping application traffic on the pooler. Most connection poolers support per-database or per-user routing rules.
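
For PgBouncer specifically, per-database pool_mode makes this routing straightforward. A hedged sketch (host names and the alias are placeholders):

```ini
; pgbouncer.ini
[databases]
; Application traffic: transaction-mode pooling.
appdb         = host=db.internal dbname=appdb pool_mode=transaction
; Alias for pg_trickle clients that need a persistent session
; (manual refresh calls, LISTEN pg_trickle_alert consumers).
appdb_trickle = host=db.internal dbname=appdb pool_mode=session
```

Session-dependent pg_trickle clients connect to appdb_trickle (or directly to the database); everything else stays on appdb.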

Does pg_trickle work with pgvector?

Partially — it depends on the refresh mode and what the defining query does.

What works:

  • Source tables with vector columns. CDC triggers are generated using PostgreSQL's format_type(), which returns the full type name (e.g. vector(1536)). Change buffer tables mirror the source schema correctly, so inserts, updates, and deletes on pgvector tables are captured and replayed without issue.
  • Passing vector columns through in DIFFERENTIAL mode. Stream tables that select, filter (on non-vector columns), or join sources that happen to contain vector columns work correctly — the vector data is treated as an opaque value and copied through unchanged.
  • FULL mode with any pgvector expression. Because FULL mode re-executes the entire defining query, all pgvector operators (<->, <=>, <#>) and functions (cosine_distance, l2_normalize, etc.) work exactly as they do in a regular query.

What does not work:

  • DIFFERENTIAL mode with pgvector distance operators in the query. The DVM engine needs a differentiation rule for every SQL operator it encounters. Custom operators like <-> (L2 distance) or <=> (cosine distance) are not in the built-in rule set. The engine will fall back automatically to FULL mode if such operators appear in the delta query path. Set refresh_mode => 'FULL' explicitly to make this intent clear.
  • Incremental aggregation over vector columns. There is no meaningful incremental form for aggregates over vector values (e.g. averaging embeddings). Use FULL mode for any aggregate that involves vector arithmetic.

Recommended pattern for a nearest-neighbour cache or semantic search result set:

CREATE EXTENSION IF NOT EXISTS vector;

SELECT pgtrickle.create_stream_table(
    name         => 'top_similar_docs',
    query        => $$
        SELECT d.id, d.title, d.embedding,
               d.embedding <=> '[0.1, 0.2, 0.3]'::vector AS distance
        FROM documents d
        ORDER BY distance
        LIMIT 100
    $$,
    schedule     => '5m',
    refresh_mode => 'FULL'
);

For use-cases that only carry vector columns through without computing on them, DIFFERENTIAL mode works fine:

-- Vectors are not used in the delta computation — DIFFERENTIAL is safe here
SELECT pgtrickle.create_stream_table(
    name         => 'active_doc_embeddings',
    query        => $$
        SELECT id, embedding
        FROM documents
        WHERE status = 'published'
    $$,
    schedule     => '1m',
    refresh_mode => 'DIFFERENTIAL'
);

dbt Integration

The dbt-pgtrickle package provides a stream_table materialization that lets you manage stream tables through dbt’s standard workflow. This section covers setup, commands, freshness checks, and query change handling.

How do I use pg_trickle with dbt?

Install the dbt-pgtrickle package (a pure Jinja SQL macro package — no Python dependencies):

# packages.yml
packages:
  - package: pg_trickle/dbt_pgtrickle
    version: ">=0.2.0"

Then define a stream table model using the stream_table materialization:

-- models/order_totals.sql
{{ config(
    materialized='stream_table',
    schedule='1m',
    refresh_mode='DIFFERENTIAL'
) }}

SELECT customer_id, SUM(amount) AS total
FROM {{ source('public', 'orders') }}
GROUP BY customer_id

The stream_table materialization calls pgtrickle.create_stream_table() on the first run and pgtrickle.alter_stream_table() on subsequent runs (if the schedule or mode changes).

What dbt commands work with stream tables?

  • dbt run: creates stream tables that don't exist and updates schedule/mode if changed; it does not alter the defining query of existing STs.
  • dbt run --full-refresh: drops and recreates all stream tables from scratch (new defining query, fresh data).
  • dbt test: works normally; tests query the stream table as a regular table.
  • dbt source freshness: works if you configure a freshness block on the stream table source.
  • dbt docs generate: documents stream tables like any other model.

How does dbt run --full-refresh work with stream tables?

When --full-refresh is passed, the stream_table materialization:

  1. Calls pgtrickle.drop_stream_table('model_name') to remove the existing stream table, CDC triggers, and change buffers.
  2. Calls pgtrickle.create_stream_table(...) with the current defining query from the model file.
  3. The new stream table starts in INITIALIZING status and performs its first full refresh.

This is the correct way to update a stream table's defining query in dbt. Without --full-refresh, dbt will not detect query changes (it only compares schedule and mode).

How do I check stream table freshness in dbt?

Use dbt's built-in source freshness feature by adding a freshness block to your source definition:

# models/sources.yml
sources:
  - name: pgtrickle
    schema: pgtrickle
    tables:
      - name: order_totals
        loaded_at_field: "last_refreshed_at"  # from stream_tables_info
        freshness:
          warn_after: {count: 5, period: minute}
          error_after: {count: 15, period: minute}

Then run dbt source freshness to check.

Alternatively, query the pg_trickle monitoring views directly in a dbt test:

-- tests/check_freshness.sql
SELECT pgt_name FROM pgtrickle.stream_tables_info WHERE stale = true

What happens when the defining query changes in dbt?

If you modify the SQL in a stream table model file and run dbt run without --full-refresh:

  • The stream_table materialization detects that the stream table already exists.
  • It compares the schedule and refresh mode — if either changed, it calls alter_stream_table() to update them.
  • It does not compare the defining query text. The existing defining query remains in effect.

To apply a new defining query, you must run dbt run --full-refresh. This drops and recreates the stream table with the new query.

Recommendation: After changing a model's SQL, always run dbt run --full-refresh -s model_name to apply the change.

Can I use dbt snapshot with stream tables?

Yes, with caveats. dbt snapshots work by tracking changes to a source table over time using updated_at or check strategies. You can snapshot a stream table like any other table.

However, keep in mind:

  • Stream tables are refreshed periodically, not on every write. The snapshot will only capture changes at refresh boundaries, not at the granularity of individual source-table writes.
  • The __pgt_row_id column will appear in the snapshot. You may want to exclude it with check_cols or a select in the snapshot configuration.
  • FULL refresh mode replaces all rows each cycle, which will appear as "updates" to the snapshot strategy even if the data hasn't changed. Use DIFFERENTIAL mode for stream tables that are snapshotted.

What dbt versions are supported?

dbt-pgtrickle is a pure Jinja SQL macro package that works with:

  • dbt-core 1.7+ (the stream_table materialization uses standard Jinja patterns)
  • dbt-postgres adapter (required for PostgreSQL connection)

There are no Python dependencies beyond dbt-core and dbt-postgres. The package is tested against dbt 1.7.x and 1.8.x in CI.


Row-Level Security (RLS)

Does RLS on source tables affect stream table content?

No. Stream tables always materialize the full, unfiltered result set, regardless of any RLS policies on source tables. This matches the behavior of PostgreSQL's built-in REFRESH MATERIALIZED VIEW.

The scheduled refresh runs as a superuser background worker. Manual calls to refresh_stream_table() and IMMEDIATE-mode IVM triggers also bypass RLS internally (SET LOCAL row_security = off / SECURITY DEFINER trigger functions), ensuring the stream table content is always complete and deterministic.

Can I use RLS on a stream table to filter reads per role?

Yes. Stream tables are regular PostgreSQL tables, so ALTER TABLE … ENABLE ROW LEVEL SECURITY and CREATE POLICY work exactly as expected. This is the recommended pattern for multi-tenant filtering:

ALTER TABLE pgtrickle.order_totals ENABLE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON pgtrickle.order_totals
    USING (tenant_id = current_setting('app.tenant_id')::INT);

One stream table serves all tenants. Per-tenant filtering happens at query time with zero storage duplication.
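At query time, the application sets the tenant context before reading (the app.tenant_id setting name follows the policy above):

```sql
-- Set the tenant for this session, then query; RLS filters automatically.
SET app.tenant_id = '42';
SELECT * FROM pgtrickle.order_totals;  -- only rows where tenant_id = 42
```

With connection pooling, prefer SET LOCAL inside a transaction so the setting cannot leak into another pooled session.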

What happens when I ENABLE or DISABLE RLS on a source table?

pg_trickle's DDL event trigger detects ALTER TABLE … ENABLE ROW LEVEL SECURITY, DISABLE ROW LEVEL SECURITY, FORCE ROW LEVEL SECURITY, and NO FORCE ROW LEVEL SECURITY on source tables and marks all dependent stream tables for reinitialization. The same applies to CREATE POLICY, ALTER POLICY, and DROP POLICY.

Why are IVM trigger functions SECURITY DEFINER?

In IMMEDIATE mode, the IVM trigger fires in the DML-issuing user's context. If that user has restricted RLS visibility, the delta query could see only a subset of the base table rows, producing a corrupt stream table. Making the trigger function SECURITY DEFINER (owned by the extension installer, typically a superuser) ensures the delta query always has full visibility. The DML itself is still subject to the user's own RLS policies — only the stream table maintenance runs with elevated privileges.

The trigger functions also set search_path = pg_catalog, pgtrickle, pgtrickle_changes, public to prevent search_path hijacking — a security best practice for all SECURITY DEFINER functions. The public schema is included because the delta SQL references user tables that typically reside there.


Deployment & Operations

This section covers the operational aspects of running pg_trickle in production: background workers, upgrades, restarts, replicas, Kubernetes, partitioned tables, and multi-database deployments.

How many background workers does pg_trickle use?

pg_trickle uses a two-tier background worker model:

  1. Launcher (pg_trickle launcher) — one per cluster, static. Scans pg_database every ~10 seconds and spawns a per-database scheduler for every database where pg_trickle is installed. Automatically re-spawns schedulers that exit.
  2. Per-database scheduler (pg_trickle scheduler) — one dynamic worker per database with pg_trickle installed.
| Component | Workers | Notes |
|---|---|---|
| Launcher | 1 (static) | Cluster-wide; connects to postgres database |
| Scheduler | 1 per database (dynamic) | Persistent per database; drives all refreshes |
| Parallel refresh workers | 0–N per database | Only when pg_trickle.parallel_refresh_mode = 'on' |
| WAL decoder | 0 (shared) | Shares the scheduler's SPI connection |
| Manual refresh | 0 | Runs in the caller's session |

How do I size max_worker_processes?

When max_worker_processes is too low, the launcher silently fails to spawn schedulers for some databases and retries every 5 minutes. Those databases stop refreshing with no error in the stream table itself — you only see it in the PostgreSQL log:

WARNING:  pg_trickle launcher: could not spawn scheduler for database 'mydb'

The minimum formula:

max_worker_processes ≥
  1 (pg_trickle launcher)
  + N  (one scheduler per database with pg_trickle installed)
  + max_dynamic_refresh_workers  (only if parallel_refresh_mode = 'on'; default 4)
  + autovacuum_max_workers        (default 3)
  + parallel query workers        (max_parallel_workers_per_gather × concurrent queries)
  + slots for other extensions    (logical replication launcher, etc.)
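The same arithmetic as a quick sanity-check script. The default inputs below are illustrative, not values read from your cluster:

```python
def min_worker_processes(pgtrickle_databases: int,
                         parallel_refresh_on: bool = False,
                         max_dynamic_refresh_workers: int = 4,
                         autovacuum_max_workers: int = 3,
                         parallel_query_slots: int = 8,
                         other_extension_slots: int = 2) -> int:
    """Minimum max_worker_processes per the formula above (illustrative defaults)."""
    total = 1                                 # pg_trickle launcher (static)
    total += pgtrickle_databases              # one scheduler per database
    if parallel_refresh_on:
        total += max_dynamic_refresh_workers  # dynamic refresh workers
    total += autovacuum_max_workers           # core autovacuum workers
    total += parallel_query_slots             # max_parallel_workers_per_gather x concurrent queries
    total += other_extension_slots            # logical replication launcher, etc.
    return total

# Three databases with parallel refresh enabled:
print(min_worker_processes(3, parallel_refresh_on=True))  # prints 21
```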

A practical starting point for a cluster with a handful of databases:

max_worker_processes = 32

This value requires a full PostgreSQL restart (not just reload).

How do I upgrade pg_trickle to a new version?

  1. Install the new shared library (replace the .so/.dylib file in PostgreSQL's lib directory).
  2. Run the upgrade SQL:
    ALTER EXTENSION pg_trickle UPDATE;
    
    This applies migration scripts (e.g., pg_trickle--0.2.1--0.2.2.sql) that update catalog tables, add new functions, and migrate data as needed.
  3. Restart PostgreSQL if the shared library changed (required for shared_preload_libraries changes).
  4. Verify:
    SELECT pgtrickle.version();
    

Zero-downtime upgrades are possible for minor versions (patch releases) that don't change the shared library. Just run ALTER EXTENSION pg_trickle UPDATE — no restart needed.

For detailed instructions, version-specific notes, rollback procedures, and troubleshooting, see the full Upgrading Guide.

How do I know if my shared library and SQL extension versions match?

The background worker checks for version mismatches at startup and logs a WARNING if the compiled .so version differs from the installed SQL extension version. You can also check manually:

-- Compiled .so version:
SELECT pgtrickle.version();

-- Installed SQL extension version:
SELECT extversion FROM pg_extension WHERE extname = 'pg_trickle';

If these differ, run ALTER EXTENSION pg_trickle UPDATE; and restart PostgreSQL if prompted.

Are stream tables preserved during an upgrade?

Yes. ALTER EXTENSION pg_trickle UPDATE applies only additive schema migrations (new columns, updated function signatures). Existing stream tables, their data, refresh history, and CDC infrastructure are preserved. The scheduler resumes normal operation after the upgrade completes.

For version-specific migration notes, see the Upgrading Guide — Version-Specific Notes.

What happens to stream tables during a PostgreSQL restart?

During a restart:

  1. The scheduler stops. No refreshes occur while PostgreSQL is down.
  2. CDC triggers are inactive while the server is down, but so are all source-table writes. Triggers are persistent DDL objects, so change capture resumes as soon as PostgreSQL comes back up; no writes are missed.
  3. On startup, the scheduler background worker starts, reads the catalog, rebuilds the DAG, and resumes refresh cycles from where it left off.
  4. Frontier reconciliation. The scheduler detects any gap between the stored frontier LSN and the current WAL position. Source changes that occurred between the last successful refresh and the restart are in the change buffers (for trigger-mode CDC) and will be processed in the first refresh cycle.

Net effect: Stream tables may be stale for the duration of the downtime, but no data is lost. The first refresh cycle after restart catches up automatically.

Can I use pg_trickle on a read replica / standby?

The scheduler does not run on standby servers. When pg_trickle detects it is running in recovery mode (pg_is_in_recovery() = true), the background worker enters a sleep loop and does not attempt any refreshes.

However, stream tables replicated from the primary are readable on the standby — they are regular heap tables and are replicated via physical (streaming) replication like any other table.

Pattern for read-heavy workloads:

  • Run pg_trickle on the primary — it performs all refreshes.
  • Query stream tables on the standby — read replicas get the latest refreshed data via streaming replication, with replication lag as the only additional delay.

How does pg_trickle work with CloudNativePG / Kubernetes?

pg_trickle is compatible with CloudNativePG. The cnpg/ directory in the repository contains example manifests:

  • Dockerfile.ext — builds a PostgreSQL image with pg_trickle pre-installed
  • cluster-example.yaml — CloudNativePG Cluster manifest with shared_preload_libraries = 'pg_trickle'

Key considerations:

  • Include pg_trickle in shared_preload_libraries in the Cluster's postgresql configuration.
  • The scheduler runs on the primary pod only. Replica pods detect recovery mode and sleep.
  • Pod restarts are handled the same way as PostgreSQL restarts (see above).
  • Persistent volume claims preserve catalog and change buffers across pod rescheduling.
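A minimal Cluster fragment illustrating these points might look like the following. The image name and storage size are placeholders — see cluster-example.yaml in the repository for the authoritative manifest:

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-trickle-demo
spec:
  instances: 3                   # one primary + replicas (replicas detect recovery mode and sleep)
  imageName: registry.example.com/postgres-pgtrickle:18   # built via Dockerfile.ext
  postgresql:
    shared_preload_libraries:
      - pg_trickle
    parameters:
      max_worker_processes: "32" # see the sizing formula above
  storage:
    size: 20Gi                   # PVC preserves catalog and change buffers across rescheduling
```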

Does pg_trickle work with partitioned source tables?

Yes. pg_trickle installs CDC triggers on the partitioned parent table, which PostgreSQL automatically propagates to all existing and future partitions. When a row is inserted into any partition, the trigger fires and writes the change to the buffer table.

Caveats:

  • TRUNCATE on individual partitions fires the partition-level trigger, which is also captured.
  • Attaching or detaching partitions (ALTER TABLE ... ATTACH/DETACH PARTITION) fires DDL event triggers, which may mark the stream table for reinitialization.
  • Row movement between partitions (when the partition key is updated) is captured as a DELETE from the old partition and an INSERT into the new partition.

Can I run pg_trickle in multiple databases on the same cluster?

Yes. Each database gets its own independent scheduler background worker, its own catalog tables, and its own change buffers. Stream tables in different databases do not interact.

Resource planning: Each database with stream tables requires 1 background worker slot in max_worker_processes. If you have many databases, the default of 8 is easily exhausted.

Important: When max_worker_processes is exhausted, the launcher silently skips databases it cannot spawn a scheduler for and retries every 5 minutes. This means stream tables in those databases stop refreshing with no visible error — they just go stale. Check the PostgreSQL log for:

WARNING:  pg_trickle launcher: could not spawn scheduler for database 'mydb'

If you see this, increase max_worker_processes and restart PostgreSQL.

See How do I size max_worker_processes? for the full formula.

-- On each database where you want pg_trickle:
CREATE EXTENSION pg_trickle;

The extension must be created separately in each database — shared_preload_libraries loads the shared library cluster-wide, but the SQL objects (catalog tables, functions) are per-database.


Monitoring & Alerting

pg_trickle provides built-in monitoring views and NOTIFY-based alerting. This section explains the available views, alert events, and failure handling.

How do I list all stream tables in my database?

Several options depending on how much detail you need:

-- Quickest: name + status + mode + staleness
SELECT name, status, refresh_mode, is_populated, staleness
FROM pgtrickle.stream_tables_info;

-- Full stats: refresh counts, rows inserted/deleted, avg duration, error streaks
SELECT * FROM pgtrickle.pg_stat_stream_tables;

-- Live status including consecutive_errors and data_timestamp
SELECT * FROM pgtrickle.pgt_status();

-- Raw catalog (all persisted properties, no computed fields)
SELECT * FROM pgtrickle.pgt_stream_tables;

How do I inspect what pg_trickle is doing right now?

Quick status snapshot:

SELECT name, status, refresh_mode, consecutive_errors, staleness
FROM pgtrickle.pgt_status();

Deep dive into a specific stream table — shows the defining query, DVM operator tree, source tables, generated delta SQL, and current WAL frontier:

SELECT * FROM pgtrickle.explain_st('my_table');

Key properties returned:

| Property | Description |
|---|---|
| dvm_supported | Whether differential maintenance is possible for this query |
| operator_tree | How the DVM engine has decomposed the query |
| delta_query | The actual SQL executed during a differential refresh |
| frontier | Per-source LSN positions flushed at last refresh |

Recent refresh activity:

-- Last 10 refreshes for a stream table (action, status, rows, duration):
SELECT * FROM pgtrickle.get_refresh_history('my_table', 10);

-- Aggregate refresh stats for all stream tables:
SELECT * FROM pgtrickle.st_refresh_stats();

CDC and slot health:

-- Per-source CDC mode, WAL lag, and alerts:
SELECT * FROM pgtrickle.check_cdc_health();

-- Replication slot health (slot_name, active, lag_bytes):
SELECT * FROM pgtrickle.slot_health();

Real-time event stream:

LISTEN pg_trickle_alert;
-- Receives JSON payloads for: stale_data, auto_suspended, resumed,
-- reinitialize_needed, buffer_growth_warning, refresh_completed, refresh_failed

Pending change buffers (rows not yet consumed by a differential refresh):

SELECT stream_table, source_table, cdc_mode, pending_rows, buffer_bytes
FROM pgtrickle.change_buffer_sizes()
ORDER BY pending_rows DESC;

Are there convenience functions for inspecting source tables and CDC buffers?

Yes. pg_trickle provides two convenience functions that complement the broader monitoring suite:

pgtrickle.list_sources(name) — shows every source table a stream table depends on, the CDC mode each uses, and any column-level usage metadata:

SELECT * FROM pgtrickle.list_sources('order_totals');
-- Returns: source_table, source_oid, source_type, cdc_mode, columns_used

pgtrickle.change_buffer_sizes() — shows, for every tracked source table, how many CDC rows are pending (not yet consumed by a differential refresh) and the estimated on-disk size of the change buffer:

SELECT * FROM pgtrickle.change_buffer_sizes()
ORDER BY pending_rows DESC;
-- Returns: stream_table, source_table, source_oid, cdc_mode, pending_rows, buffer_bytes

A large pending_rows value for a source table means a differential refresh is overdue or stalled — use pgtrickle.get_refresh_history() to investigate.

Can I see a tree view of all stream table dependencies?

Yes. pgtrickle.dependency_tree() walks the dependency DAG and renders it as an indented ASCII tree:

SELECT tree_line, status, refresh_mode
FROM pgtrickle.dependency_tree();

Example output:

tree_line                                 | status | refresh_mode
------------------------------------------+--------+--------------
report_summary                            | ACTIVE | DIFFERENTIAL
├── orders_by_region                      | ACTIVE | DIFFERENTIAL
│   ├── public.orders [src]              |        |
│   └── public.customers [src]           |        |
└── revenue_totals                        | ACTIVE | DIFFERENTIAL
    └── public.orders [src]              |        |

Each row has node (qualified name), node_type (stream_table or source_table), depth, status, and refresh_mode. Source tables are shown as leaves tagged with [src].

What monitoring views are available?

| View | Description |
|---|---|
| pgtrickle.stream_tables_info | Status overview with computed staleness |
| pgtrickle.pg_stat_stream_tables | Comprehensive stats (refresh counts, avg duration, error streaks) |

How do I get alerted when something goes wrong?

pg_trickle sends PostgreSQL NOTIFY messages on the pg_trickle_alert channel with JSON payloads:

| Event | When |
|---|---|
| stale_data | Staleness exceeds 2× the schedule |
| auto_suspended | Stream table suspended after max consecutive errors |
| reinitialize_needed | Upstream DDL change detected |
| buffer_growth_warning | Change buffer growing unexpectedly |
| refresh_completed | Refresh completed successfully |
| refresh_failed | Refresh failed |

Listen with:

LISTEN pg_trickle_alert;
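Payloads are JSON; a minimal dispatch sketch is below. The field names in the sample payload are assumptions — inspect a real payload from your install before relying on them:

```python
import json

# Hypothetical alert payload; actual field names may differ from this sketch.
raw = '{"event": "auto_suspended", "stream_table": "order_totals", "consecutive_errors": 3}'

alert = json.loads(raw)
SEVERE = {"auto_suspended", "refresh_failed", "reinitialize_needed"}

if alert["event"] in SEVERE:
    # Route severe events to paging; everything else to logs.
    print(f"PAGE: {alert['stream_table']} -> {alert['event']}")
else:
    print(f"log: {alert}")
```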

What happens when a stream table keeps failing?

After pg_trickle.max_consecutive_errors (default: 3) consecutive failures, the stream table moves to ERROR status and automatic refreshes stop. An auto_suspended NOTIFY alert is sent.

To recover:

-- Fix the underlying issue (e.g., restore a dropped source table), then:
SELECT pgtrickle.alter_stream_table('my_table', status => 'ACTIVE');

Retries use exponential backoff (base 1s, max 60s, ±25% jitter, up to 5 retries before counting as a real failure).
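The stated policy (base 1 s, cap 60 s, ±25% jitter) produces delays of roughly this shape. This is an illustrative sketch, not the extension's actual retry code:

```python
import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                  jitter: float = 0.25) -> float:
    """Exponential backoff: base * 2^attempt, capped at `cap`, with +/-25% jitter."""
    delay = min(base * (2 ** attempt), cap)
    return delay * (1 + random.uniform(-jitter, jitter))

# Retries 1..5 land around 1s, 2s, 4s, 8s, 16s (each +/-25%):
for attempt in range(5):
    print(f"retry {attempt + 1}: ~{backoff_delay(attempt):.2f}s")
```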


Configuration Reference

All pg_trickle settings are configured via PostgreSQL GUC parameters. The table below lists every available parameter with its type, default, and description.

| GUC | Type | Default | Description |
|---|---|---|---|
| pg_trickle.enabled | bool | true | Enable/disable the scheduler. Manual refreshes still work when false. |
| pg_trickle.scheduler_interval_ms | int | 1000 | Scheduler wake interval in milliseconds (100–60000) |
| pg_trickle.min_schedule_seconds | int | 60 | Minimum allowed schedule duration (1–86400) |
| pg_trickle.max_consecutive_errors | int | 3 | Failures before auto-suspending (1–100) |
| pg_trickle.change_buffer_schema | text | pgtrickle_changes | Schema for CDC buffer tables |
| pg_trickle.max_concurrent_refreshes | int | 4 | Max parallel refresh workers (1–32) |
| pg_trickle.user_triggers | text | auto | User trigger handling: auto (detect), off (suppress), on (deprecated alias for auto) |
| pg_trickle.differential_max_change_ratio | float | 0.15 | Change ratio threshold for adaptive FULL fallback (0.0–1.0) |
| pg_trickle.cleanup_use_truncate | bool | true | Use TRUNCATE instead of DELETE for buffer cleanup |

All pg_trickle GUCs are SUSET context (settable by superusers) and take effect without a restart. The only restart-requiring setting is shared_preload_libraries itself, which is a core PostgreSQL parameter.
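For example, to tighten the adaptive fallback threshold for the current session, or raise the error budget cluster-wide (the values here are arbitrary):

```sql
-- Session-level (superuser), effective immediately:
SET pg_trickle.differential_max_change_ratio = 0.05;

-- Cluster-wide and persistent:
ALTER SYSTEM SET pg_trickle.max_consecutive_errors = 5;
SELECT pg_reload_conf();
```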


Troubleshooting

This section covers common problems and how to debug them. If your issue isn’t listed here, check the refresh history for error messages and the monitoring views for status information.

How do I diagnose stalled data flow through stream tables?

See also: Error Reference — comprehensive guide to all pg_trickle error variants with causes and fixes.

If data seems to have stopped flowing -- stream tables show stale results despite DML on the source tables -- follow this systematic diagnostic workflow. Each step narrows the problem from broad health checks down to specific root causes.

Step 0 -- Verify GUC configuration:

Misconfigured GUCs are a common and easy-to-miss cause of stalled or severely throttled data flow. Check all pg_trickle settings in one query:

SELECT name, setting, unit
FROM pg_settings
WHERE name LIKE 'pg_trickle.%'
   OR name = 'max_worker_processes'
ORDER BY name;

Key values to check:

| GUC | Safe value | Problem if set to |
|---|---|---|
| pg_trickle.enabled | on | off -- stops all automatic refreshes |
| pg_trickle.tiered_scheduling | on (fine) | on with all STs at tier = 'frozen' -- silently skips them |
| pg_trickle.max_consecutive_errors | 3-10 | 1 -- one transient error suspends the ST permanently |
| pg_trickle.scheduler_interval_ms | 100-1000 | Very high (e.g. 60000) -- scheduler only wakes every 60 s |
| pg_trickle.event_driven_wake | on | off -- falls back to poll-only; latency equals scheduler_interval_ms |
| pg_trickle.wake_debounce_ms | 1-50 | Very high (e.g. 5000) -- coalesces notifications for 5 s before acting |
| pg_trickle.auto_backoff | on | Fine normally, but if refreshes take >95% of schedule it silently stretches intervals up to 8x |
| pg_trickle.default_schedule_seconds | 1-60 | Very high -- isolated CALCULATED tables refresh very infrequently |
| max_worker_processes | >= 16 (typical) | Too low -- workers cannot be spawned; parallel mode silently stalls |

Also check whether any stream tables are frozen:

SELECT pgt_name, refresh_tier
FROM pgtrickle.pgt_stream_tables
WHERE refresh_tier = 'frozen';

Step 1 -- Quick health overview:

SELECT * FROM pgtrickle.health_check() WHERE severity != 'OK';

This single call checks scheduler status, error tables, stale tables, buffer growth, replication slot lag, and the worker pool. Any WARN or ERROR row tells you where to look next.

Step 2 -- Check stream table status and staleness:

SELECT name, status, refresh_mode, consecutive_errors, staleness
FROM pgtrickle.pgt_status()
ORDER BY staleness DESC NULLS FIRST;

Look for SUSPENDED status (auto-suspended after repeated errors), high consecutive_errors, or unexpectedly large staleness.

Step 3 -- Check recent refresh activity:

SELECT start_time, stream_table, action, status, duration_ms, error_message
FROM pgtrickle.refresh_timeline(20);

If no recent rows appear, the scheduler may not be running. If rows show ERROR, the error messages explain why refreshes are failing.

Step 4 -- Inspect errors for a specific stream table:

SELECT * FROM pgtrickle.diagnose_errors('my_stream_table');

Returns the last 5 FAILED refresh events with error classification and suggested remediation steps.

Step 5 -- Check the CDC pipeline (are changes being captured?):

SELECT stream_table, source_table, cdc_mode, pending_rows, buffer_bytes
FROM pgtrickle.change_buffer_sizes()
ORDER BY pending_rows DESC;
  • pending_rows = 0 everywhere: either no DML is happening on the source tables, or CDC triggers are missing.
  • pending_rows growing but stream tables are not refreshing: scheduler or refresh problem (go back to Steps 1-3).

Step 6 -- Verify CDC triggers exist and are enabled:

SELECT source_table, trigger_type, trigger_name
FROM pgtrickle.trigger_inventory()
WHERE NOT present OR NOT enabled;

Any rows returned here mean change capture is broken for that source table -- DML changes are not being recorded.

Step 7 -- Check CDC slot health (WAL mode only):

SELECT * FROM pgtrickle.check_cdc_health();

Look for alert values like slot_lag_exceeds_threshold or replication_slot_missing.
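The core catalog offers an independent cross-check of slot state (standard PostgreSQL, not pg_trickle-specific):

```sql
SELECT slot_name, active,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS lag_bytes
FROM pg_replication_slots;
```

An inactive slot with growing lag_bytes is the classic signature of a stalled WAL-mode CDC pipeline.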

Step 8 -- Verify the dependency DAG:

SELECT tree_line, status, refresh_mode
FROM pgtrickle.dependency_tree();

Confirms the dependency graph is wired as expected. A missing edge means upstream changes will not propagate to downstream stream tables.

Step 9 -- Check the parallel worker pool (if using parallel mode):

SELECT * FROM pgtrickle.worker_pool_status();

SELECT job_id, unit_key, status, duration_ms
FROM pgtrickle.parallel_job_status(300)
WHERE status NOT IN ('SUCCEEDED');

Common root causes at a glance:

| Symptom | Diagnostic function | Likely root cause |
|---|---|---|
| No refreshes happening at all | health_check -> scheduler_running | Background worker not running or pg_trickle.enabled = off |
| Stream table in SUSPENDED status | pgt_status | Repeated errors hit max_consecutive_errors threshold |
| Zero pending changes despite DML | trigger_inventory | CDC trigger was dropped or disabled by DDL |
| WAL slot missing or lagging | check_cdc_health, slot_health | Replication slot dropped, or WAL retention exceeded |
| Buffers growing but no refreshes | change_buffer_sizes + refresh_timeline | Scheduler stalled, refresh failing, or lock contention |
| Upstream changes not propagating | dependency_tree | Upstream ST not connected in the DAG |

Unit tests crash with symbol not found in flat namespace on macOS 26+

macOS 26 (Tahoe) changed the dynamic linker (dyld) to eagerly resolve all flat-namespace symbols at binary load time. pgrx extensions link PostgreSQL server symbols (e.g. CacheMemoryContext, SPI_connect) with -Wl,-undefined,dynamic_lookup, which previously resolved lazily. Since cargo test --lib runs outside the postgres process, those symbols are missing and the test binary aborts:

dyld[66617]: symbol not found in flat namespace '_CacheMemoryContext'

Use just test-unit — it automatically detects macOS 26+ and injects a stub library (libpg_stub.dylib) via DYLD_INSERT_LIBRARIES. The stub provides NULL/no-op definitions for the ~28 PostgreSQL symbols; they are never called during unit tests (pure Rust logic only).

This does not affect integration tests, E2E tests, just lint, just build, or the extension running inside PostgreSQL.

See the Installation Guide for details and manual usage.

My stream table is stuck in INITIALIZING status

The initial full refresh may have failed. Check:

SELECT * FROM pgtrickle.get_refresh_history('my_table', 5);

If the error is transient, retry with:

SELECT pgtrickle.refresh_stream_table('my_table');

My stream table shows stale data but the scheduler is running

Common causes:

  1. TRUNCATE on source table — bypasses CDC triggers. Manual refresh needed.
  2. Too many errors — check consecutive_errors in pgtrickle.pg_stat_stream_tables. Resume with ALTER ... status => 'ACTIVE'.
  3. Long-running refresh — check for lock contention or slow defining queries.
  4. Scheduler disabled — verify SHOW pg_trickle.enabled; returns on.

I get "cycle detected" when creating a stream table

Stream tables cannot have circular dependencies. If stream table A depends on stream table B and B depends on A (either directly or through a chain of intermediate stream tables), pg_trickle rejects the creation with a clear error message listing the cycle path.

To resolve this, restructure your queries to eliminate the circular reference. Common patterns:

  • Extract the shared logic into a single base stream table that both A and B reference.
  • Use a regular view instead of a stream table for one side of the dependency.
  • Merge the two queries into a single stream table if possible.

A source table was altered and my stream table stopped refreshing

pg_trickle detects DDL changes (column additions, drops, type changes) via event triggers and marks affected stream tables with needs_reinit = true. The next scheduler cycle will attempt to reinitialize the stream table — drop the storage table, recreate it from the current defining query schema, and perform a full refresh.

If the schema change breaks the defining query (e.g., a column referenced in the query was dropped or renamed), the reinitialization will fail repeatedly until the stream table hits max_consecutive_errors and enters ERROR status.

To fix it: Update the defining query and recreate the stream table:

SELECT pgtrickle.drop_stream_table('order_totals');
SELECT pgtrickle.create_stream_table(
    name         => 'order_totals',
    query        => 'SELECT id, name FROM orders',  -- updated query reflecting new schema
    schedule     => '1m',
    refresh_mode => 'DIFFERENTIAL'
);

Check the refresh history for the specific error message:

SELECT * FROM pgtrickle.get_refresh_history('order_totals', 5);

How do I see the delta query generated for a stream table?

SELECT pgtrickle.explain_st('order_totals');

This shows the DVM operator tree, source tables, and the generated delta SQL.

How do I interpret the refresh history?

The pgtrickle.get_refresh_history() function returns the most recent refresh records for a stream table:

SELECT * FROM pgtrickle.get_refresh_history('order_totals', 10);

Key columns:

| Column | Meaning |
|---|---|
| action | Refresh type: FULL, DIFFERENTIAL, TOPK, IMMEDIATE, or REINITIALIZE |
| rows_inserted | Rows added to the stream table in this cycle |
| rows_deleted | Rows removed from the stream table in this cycle |
| rows_updated | Rows modified in the stream table (for explicit DML path) |
| duration_ms | Wall-clock time for the refresh |
| error_message | NULL for success; error text for failures |
| source_changes | Number of pending change records processed |
| fallback_reason | If DIFFERENTIAL fell back to FULL: change_ratio_exceeded, truncate_detected, or reinitialize |

Patterns to look for:

  • High rows_inserted + rows_deleted with low source_changes → possible duplicate rows (keyless source tables)
  • fallback_reason = 'change_ratio_exceeded' frequently → consider lowering the threshold or switching to FULL mode
  • Increasing duration_ms over time → index maintenance or buffer bloat; consider VACUUM or checking for missing indexes

How can I tell if the scheduler is running?

Several ways to verify:

1. Check the background worker:

SELECT pid, datname, backend_type, state, query
FROM pg_stat_activity
WHERE backend_type = 'pg_trickle scheduler';

If no rows are returned, the scheduler is not running. Common causes:

  • pg_trickle.enabled = false
  • Extension not in shared_preload_libraries
  • max_worker_processes exhausted — the launcher silently skips databases it cannot accommodate and retries every 5 minutes. Check the PostgreSQL log for WARNING: pg_trickle launcher: could not spawn scheduler for database '...'.

2. Check recent refresh activity:

SELECT MAX(refreshed_at) AS last_refresh
FROM pgtrickle.pgt_stream_tables
WHERE status = 'ACTIVE';

If the last refresh was long ago relative to the shortest schedule, the scheduler may be stuck.
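A per-table view of refresh age makes a stuck scheduler obvious (column names as used elsewhere in this FAQ):

```sql
SELECT pgt_name, refreshed_at, now() - refreshed_at AS age
FROM pgtrickle.pgt_stream_tables
WHERE status = 'ACTIVE'
ORDER BY age DESC;
```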

3. Check PostgreSQL logs: The scheduler logs startup and shutdown messages at LOG level:

LOG:  pg_trickle scheduler started for database "mydb"
LOG:  pg_trickle scheduler shutting down (SIGTERM)

How do I debug a stream table that shows stale data?

Follow this diagnostic checklist:

  1. Is the scheduler running? (See above)
  2. Is the stream table active?
    SELECT pgt_name, status, consecutive_errors FROM pgtrickle.pg_stat_stream_tables
    WHERE pgt_name = 'my_st';
    
    If status is ERROR or SUSPENDED, the stream table has been auto-suspended after repeated failures.
  3. Are there pending changes?
    SELECT COUNT(*) FROM pgtrickle_changes.changes_<source_oid>;
    
    If zero, the source table may not have CDC triggers (check SELECT tgname FROM pg_trigger WHERE tgrelid = '<source_oid>').
  4. Is the refresh failing silently?
    SELECT * FROM pgtrickle.get_refresh_history('my_st', 5);
    
    Check for error messages.
  5. Is there lock contention? Long-running transactions holding locks on the source or stream table can block refreshes. Check pg_locks and pg_stat_activity.
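For step 5, pg_blocking_pids() gives a quick view of who is blocking whom (standard PostgreSQL catalog machinery, not pg_trickle-specific):

```sql
SELECT pid, pg_blocking_pids(pid) AS blocked_by, state, left(query, 60) AS query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;
```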

What does the needs_reinit flag mean and how do I clear it?

The needs_reinit flag in pgtrickle.pgt_stream_tables indicates that the stream table's physical storage needs to be rebuilt — typically because a source table's schema changed.

When needs_reinit = true:

  • The scheduler skips normal differential/full refresh.
  • Instead, it performs a reinitialize: drop the storage table, recreate it from the current defining query schema, and populate with a full refresh.
  • If reinitialization succeeds, needs_reinit is cleared automatically.

If reinitialization keeps failing (e.g., the defining query references a dropped column):

-- Fix the underlying issue first, then clear manually:
UPDATE pgtrickle.pgt_stream_tables SET needs_reinit = false WHERE pgt_name = 'my_st';
-- Or drop and recreate:
SELECT pgtrickle.drop_stream_table('my_st');
SELECT pgtrickle.create_stream_table(
    name         => 'my_st',
    query        => 'SELECT ...',
    schedule     => '1m',
    refresh_mode => 'DIFFERENTIAL'
);

Why Are These SQL Features Not Supported?

This section gives detailed technical explanations for each SQL limitation. pg_trickle follows the principle of "fail loudly rather than produce wrong data" — every unsupported feature is detected at stream-table creation time and rejected with a clear error message and a suggested rewrite.

For all of these, returning an explicit error is a deliberate design choice: the alternative would be silently producing incorrect results after a refresh, which is far harder to diagnose.

How does NATURAL JOIN work?

NATURAL JOIN is fully supported. At parse time, pg_trickle resolves the common columns between the two tables (using OpTree::output_columns()) and synthesizes explicit equi-join conditions. This supports INNER, LEFT, RIGHT, and FULL NATURAL JOIN variants.

Internally, NATURAL JOIN is converted to an explicit JOIN ... ON before the DVM engine builds its operator tree, so delta computation works identically to a manually specified equi-join.

Note: The internal __pgt_row_id column is excluded from common column resolution, so NATURAL JOINs between stream tables work correctly.
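To illustrate with a hypothetical schema in which customer_id is the only column shared by the two tables:

```sql
-- This defining query:
SELECT customer_id, order_date, customer_name
FROM orders NATURAL JOIN customers;

-- is treated internally like the explicit equi-join:
SELECT o.customer_id, o.order_date, c.customer_name
FROM orders o JOIN customers c ON o.customer_id = c.customer_id;
```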

How do GROUPING SETS, CUBE, and ROLLUP work?

GROUPING SETS, CUBE, and ROLLUP are fully supported via an automatic parse-time rewrite. pg_trickle decomposes these constructs into a UNION ALL of separate GROUP BY queries before the DVM engine processes the query.

Explosion guard: CUBE(N) generates 2^N branches. pg_trickle rejects CUBE/ROLLUP combinations that would produce more than 64 branches to prevent runaway memory usage. Use explicit GROUPING SETS(...) instead.
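The branch counts behind the guard follow standard SQL expansion rules (CUBE over N columns gives 2^N branches, ROLLUP over N columns gives N + 1); this sketch is not pg_trickle's actual code:

```python
def cube_branches(n: int) -> int:
    """CUBE over n columns expands to 2**n GROUP BY branches."""
    return 2 ** n

def rollup_branches(n: int) -> int:
    """ROLLUP over n columns expands to n + 1 branches."""
    return n + 1

print(cube_branches(6))  # prints 64  -> at the limit, accepted
print(cube_branches(7))  # prints 128 -> rejected by the explosion guard
```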

For example:

-- This defining query:
SELECT dept, region, SUM(amount) FROM sales GROUP BY CUBE(dept, region)

-- Is automatically rewritten to:
SELECT dept, region, SUM(amount) FROM sales GROUP BY dept, region
UNION ALL
SELECT dept, NULL::text, SUM(amount) FROM sales GROUP BY dept
UNION ALL
SELECT NULL::text, region, SUM(amount) FROM sales GROUP BY region
UNION ALL
SELECT NULL::text, NULL::text, SUM(amount) FROM sales

GROUPING() function calls are replaced with integer literal constants corresponding to the grouping level. The rewrite is transparent — the DVM engine sees only standard GROUP BY + UNION ALL operators and can apply incremental delta computation to each branch independently.

How does DISTINCT ON (…) work?

DISTINCT ON is fully supported via an automatic parse-time rewrite. pg_trickle transparently transforms DISTINCT ON into a ROW_NUMBER() window function subquery:

-- This defining query:
SELECT DISTINCT ON (dept) dept, employee, salary
FROM employees ORDER BY dept, salary DESC

-- Is automatically rewritten to:
SELECT dept, employee, salary FROM (
    SELECT dept, employee, salary,
           ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) AS rn
    FROM employees
) sub WHERE rn = 1

The rewrite happens before DVM parsing, so the operator tree sees a standard window function query and can apply partition-based recomputation for incremental delta maintenance.

Why is TABLESAMPLE rejected?

TABLESAMPLE returns a random subset of rows from a table (e.g., FROM orders TABLESAMPLE BERNOULLI(10) gives ~10% of rows).

Stream tables materialize the complete result set of the defining query and keep it up-to-date across refreshes. Baking a random sample into the defining query is not meaningful because:

  1. Non-determinism. Each refresh would sample different rows, making the stream table contents unstable and unpredictable. The delta between refreshes would be dominated by sampling noise, not actual data changes.

  2. CDC incompatibility. The trigger-based change-capture system tracks specific row-level changes (inserts, updates, deletes). A TABLESAMPLE defining query has no stable row identity — the "changed rows" concept doesn't apply when the entire sample shifts each cycle.

Rewrite:

-- Instead of sampling in the defining query:
SELECT * FROM orders TABLESAMPLE BERNOULLI(10)

-- Materialize the full result and sample when querying:
SELECT * FROM order_stream_table WHERE random() < 0.1

Why is LIMIT / OFFSET rejected?

Stream tables materialize the complete result set and keep it synchronized with source data. Bare LIMIT/OFFSET (without a recognized pattern) would truncate the result:

  1. Undefined ordering. LIMIT without ORDER BY returns an arbitrary subset.

  2. Delta instability. When source rows change, the boundary between "in the LIMIT" and "out of the LIMIT" shifts. A single INSERT could evict one row and admit another, requiring the refresh to track the full ordered position of every row.

  3. Semantic mismatch. Users who write LIMIT 100 typically want to limit what they read, not what is stored.

Exception — TopK pattern: When the defining query has a top-level ORDER BY … LIMIT N (constant integer, optionally with OFFSET M), pg_trickle recognizes this as a TopK query and accepts it. The ORDER BY clause is required — bare LIMIT without ORDER BY is always rejected because it selects an arbitrary subset. With ORDER BY, the top-N boundary is well-defined and the stream table stores exactly those N rows (starting from position M+1 if OFFSET is specified). See the TopK section for details.
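
For illustration, a TopK defining query that pg_trickle accepts (table and column names are hypothetical):

-- Accepted: top-level ORDER BY with a constant LIMIT
SELECT pgtrickle.create_stream_table(
    name     => 'latest_orders',
    query    => 'SELECT * FROM orders ORDER BY created_at DESC LIMIT 100',
    schedule => '30s'
);

-- Rejected: bare LIMIT with no ORDER BY (the subset is arbitrary)
-- 'SELECT * FROM orders LIMIT 100'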

Rewrite (when TopK doesn't apply):

-- Instead of:
'SELECT * FROM orders ORDER BY created_at DESC LIMIT 100'

-- Omit LIMIT from the defining query, apply when reading:
SELECT * FROM orders_stream_table ORDER BY created_at DESC LIMIT 100

Why are window functions in expressions rejected?

Window functions like ROW_NUMBER() OVER (…) are supported as standalone columns in stream tables. However, embedding a window function inside an expression — such as CASE WHEN ROW_NUMBER() OVER (...) = 1 THEN ... or SUM(x) OVER (...) + 1 — is rejected.

This restriction exists because:

  1. Partition-based recomputation. pg_trickle's differential mode handles window functions by recomputing entire partitions that were affected by changes. When a window function is buried inside an expression, the DVM engine cannot isolate the window computation from the surrounding expression, making it impossible to correctly identify which partitions to recompute.

  2. Expression tree ambiguity. The DVM parser would need to differentiate the outer expression (arithmetic, CASE, etc.) while treating the inner window function specially. This creates a combinatorial explosion of differentiation rules for every possible expression type × window function combination.

Rewrite:

-- Instead of:
SELECT id, CASE WHEN ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) = 1
                THEN 'top' ELSE 'other' END AS rank_label
FROM employees

-- Move window function to a separate column, then use a wrapping stream table:
-- ST1:
SELECT id, dept, salary,
       ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) AS rn
FROM employees

-- ST2 (references ST1):
SELECT id, CASE WHEN rn = 1 THEN 'top' ELSE 'other' END AS rank_label
FROM pgtrickle.employees_ranked

Why is FOR UPDATE / FOR SHARE rejected?

FOR UPDATE and related locking clauses (FOR SHARE, FOR NO KEY UPDATE, FOR KEY SHARE) acquire row-level locks on selected rows. This is incompatible with stream tables because:

  1. Refresh semantics. Stream table contents are managed by the refresh engine using bulk MERGE operations. Row-level locks taken during the defining query would conflict with the refresh engine's own locking strategy.

  2. No direct DML. Since users cannot directly modify stream table rows, there is no use case for locking rows inside the defining query. The locks would be held for the duration of the refresh transaction and then released, serving no purpose.
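
A sketch of the typical rewrite, assuming the lock was intended for a read-modify-write on the source rows (table and column names are illustrative):

-- Instead of locking inside the defining query:
'SELECT * FROM orders WHERE status = ''active'' FOR UPDATE'

-- Drop the locking clause from the defining query, and take row locks
-- on the source table in the transaction that actually modifies it:
BEGIN;
SELECT * FROM orders WHERE status = 'active' FOR UPDATE;
UPDATE orders SET status = 'shipped' WHERE id = 42;
COMMIT;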

How does ALL (subquery) work?

ALL (subquery) comparisons (e.g., WHERE x > ALL (SELECT y FROM t)) are supported via an automatic rewrite to NOT EXISTS. For example, x > ALL (SELECT y FROM t) is rewritten to NOT EXISTS (SELECT 1 FROM t WHERE y >= x), which pg_trickle handles via its anti-join operator.
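
Concretely, with hypothetical tables products and competitors:

-- This defining query:
SELECT * FROM products p
WHERE p.price > ALL (SELECT price FROM competitors)

-- Is automatically rewritten to:
SELECT * FROM products p
WHERE NOT EXISTS (SELECT 1 FROM competitors c WHERE c.price >= p.price)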

Why is ORDER BY silently discarded?

ORDER BY in the defining query is accepted but ignored. This is consistent with how PostgreSQL treats CREATE MATERIALIZED VIEW AS SELECT ... ORDER BY ... — the ordering is not preserved in the stored data.

Stream tables are heap tables with no guaranteed row order. The ORDER BY in the defining query would only affect the order of the initial INSERT, which has no lasting effect. Apply ordering when querying the stream table:

-- This ORDER BY is meaningless in the defining query:
'SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY total DESC'

-- Instead, order when reading:
SELECT * FROM regional_totals ORDER BY total DESC

Why are statistical aggregates (CORR, COVAR_*, REGR_*) limited to FULL mode?

Statistical aggregates like CORR, COVAR_POP, COVAR_SAMP, and the REGR_* family require maintaining running sums of products and squares across the entire group. Unlike COUNT/SUM/AVG (where deltas can be computed from the change alone) or group-rescan aggregates (where only affected groups are re-read), these aggregates:

  1. Lack algebraic delta rules. There is no closed-form way to update a correlation coefficient from a single row change without access to the full group's data.

  2. Would degrade to group-rescan anyway. Even if supported, the implementation would need to rescan the full group from source — identical to FULL mode for most practical group sizes.

These aggregates work fine in FULL refresh mode, which re-runs the entire query from scratch each cycle.
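
For example, a correlation query can still be materialized by opting into FULL mode explicitly (table and column names are hypothetical; the parameters follow the create_stream_table signature shown earlier):

SELECT pgtrickle.create_stream_table(
    name         => 'price_qty_corr',
    query        => 'SELECT region, CORR(price, quantity) AS r FROM sales GROUP BY region',
    schedule     => '5m',
    refresh_mode => 'FULL'
);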


Why Are These Stream Table Operations Restricted?

Stream tables are regular PostgreSQL heap tables under the hood, but their contents are managed exclusively by the refresh engine. This section explains why certain operations that work on ordinary tables are disallowed or unsupported on stream tables, and what to do instead.

Why can't I INSERT, UPDATE, or DELETE rows in a stream table?

Stream table contents are the output of the refresh engine — they represent the materialized result of the defining query at a specific point in time. Direct DML would corrupt this contract in several ways:

  1. Row ID integrity. Every row has a __pgt_row_id (a 64-bit xxHash of the group-by key or all columns). The refresh engine uses this for delta MERGE — matching incoming deltas against existing rows. A manually inserted row with an incorrect or duplicate __pgt_row_id would cause the next differential refresh to produce wrong results (double-counting, missed deletes, or merge conflicts).

  2. Frontier inconsistency. Each refresh records a frontier — a set of per-source LSN positions that represent "data up to this point has been materialized." A manual DML change is not tracked by any frontier. The next differential refresh would either overwrite the change (if the delta touches the same row) or leave the stream table in a state that doesn't match any consistent point-in-time snapshot of the source data.

  3. Change buffer desync. The CDC triggers on source tables write changes to buffer tables. The refresh engine reads these buffers and advances the frontier. Manual DML on the stream table bypasses this pipeline entirely — the buffer and frontier have no record of the change, so future refreshes cannot account for it.

If you need to post-process stream table data, create a view or a second stream table that references the first one.
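
A sketch of that pattern (names are illustrative):

-- Base stream table:
SELECT pgtrickle.create_stream_table(
    name     => 'daily_totals',
    query    => 'SELECT day, SUM(amount) AS total FROM orders GROUP BY day',
    schedule => '1m'
);

-- Post-process in a second stream table that reads the first:
SELECT pgtrickle.create_stream_table(
    name     => 'daily_totals_flagged',
    query    => 'SELECT day, total, total > 10000 AS is_large FROM daily_totals',
    schedule => '1m'
);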

Why can't I add foreign keys to or from a stream table?

Foreign key constraints require that referenced/referencing rows exist at the time of each DML statement. The refresh engine violates this assumption:

  1. Bulk MERGE ordering. A differential refresh executes a single MERGE INTO statement that applies all deltas (inserts and deletes) atomically. PostgreSQL evaluates FK constraints row-by-row within this MERGE. If a parent row is deleted and a new parent inserted in the same delta batch, the child FK check may fail because it sees the delete before the insert — even though the final state would be consistent.

  2. Full refresh uses TRUNCATE + INSERT. In FULL mode, the refresh engine truncates the stream table and re-inserts all rows. TRUNCATE does not fire individual DELETE triggers and bypasses FK cascade logic, which would leave referencing tables with dangling references.

  3. Cross-table refresh ordering. If stream table A has an FK referencing stream table B, both tables refresh independently (in topological order, but in separate transactions). There is no guarantee that A's refresh sees B's latest data — the FK constraint could transiently fail between refreshes.

Workaround: Enforce referential integrity in the consuming application or use a view that joins the stream tables and validates the relationship.

How do user-defined triggers work on stream tables?

When a DIFFERENTIAL mode stream table has user-defined row-level triggers, the refresh engine uses explicit DML decomposition instead of MERGE:

  1. Delta materialized once. The delta query result is stored in a temporary table (__pgt_delta_<id>) to avoid evaluating it three times.

  2. DELETE removed rows. Rows in the stream table whose __pgt_row_id is absent from the delta are deleted. AFTER DELETE triggers fire with correct OLD values.

  3. UPDATE changed rows. Rows whose __pgt_row_id exists in both the stream table and delta but whose values differ (checked via IS DISTINCT FROM) are updated. AFTER UPDATE triggers fire with correct OLD and NEW. No-op updates (where values are identical) are skipped, preventing spurious triggers.

  4. INSERT new rows. Rows in the delta whose __pgt_row_id is absent from the stream table are inserted. AFTER INSERT triggers fire with correct NEW values.

FULL refresh behavior: Row-level user triggers are automatically suppressed during FULL refresh via DISABLE TRIGGER USER / ENABLE TRIGGER USER. A NOTIFY pgtrickle_refresh is emitted so listeners know a FULL refresh occurred. Use REFRESH MODE DIFFERENTIAL for stream tables that need per-row trigger semantics.

Performance: The explicit DML path adds ~25–60% overhead compared to MERGE for triggered stream tables. Stream tables without user triggers have zero overhead (only a fast pg_trigger check, <0.1 ms).

Control: The pg_trickle.user_triggers GUC controls this behavior:

  • auto (default): detect user triggers automatically
  • off: always use MERGE, suppressing triggers
  • on: deprecated compatibility alias for auto

Why can't I ALTER TABLE a stream table directly?

Stream table metadata (defining query, schedule, refresh mode) is stored in the pg_trickle catalog (pgtrickle.pgt_stream_tables). A direct ALTER TABLE would change the physical table without updating the catalog, causing:

  1. Column mismatch. If you add or remove columns, the refresh engine's cached delta query and MERGE statement would reference columns that no longer exist (or miss new ones), causing runtime errors.

  2. __pgt_row_id invalidation. The row ID hash is computed from the defining query's output columns. Altering the table schema without updating the defining query would make existing row IDs inconsistent with the new column set.

Use pgtrickle.alter_stream_table() to change schedule, refresh mode, or status. To change the defining query or column structure, drop and recreate the stream table.
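
For example, to change only the schedule (schedule is assumed here to be accepted as a named parameter, by analogy with create_stream_table):

SELECT pgtrickle.alter_stream_table('my_st', schedule => '5m');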

Why can't I TRUNCATE a stream table?

TRUNCATE removes all rows instantly but does not update the pg_trickle frontier or change buffers. After a TRUNCATE:

  1. Differential refresh sees no changes. The frontier still records the last-processed LSN, so if no new source changes have occurred, the next differential refresh produces an empty delta — leaving the stream table empty even though the source still has data.

  2. No recovery path for differential mode. The refresh engine has no way to detect that the stream table was externally truncated. It assumes the current contents match the frontier.

Use pgtrickle.refresh_stream_table('my_table') to force a full re-materialization, or drop and recreate the stream table if you need a clean slate.

What are the memory limits for delta processing?

The differential refresh path executes the delta query as a single SQL statement. For large batches (e.g., a bulk UPDATE of 10M rows), PostgreSQL may attempt to materialize the entire delta result set in memory. If the delta exceeds work_mem, PostgreSQL will spill to temporary files on disk, which is slower but safe. In extreme cases, OOM (out of memory) can occur if work_mem is set very high and the delta is enormous.

Mitigations:

  1. Adaptive fallback. The pg_trickle.differential_max_change_ratio GUC (default 0.15) automatically triggers a FULL refresh when the ratio of pending changes to total rows exceeds the threshold. This prevents large deltas from consuming excessive memory.

  2. work_mem tuning. PostgreSQL's work_mem setting controls how much memory each sort/hash operation uses before spilling to disk. For pg_trickle workloads, 64–256 MB is typical. Monitor temp_blks_written in pg_stat_statements to detect spilling.

  3. pg_trickle.merge_work_mem_mb GUC. Sets a session-level work_mem override during the MERGE execution (default: 0 = use global work_mem). This allows higher memory for refresh without affecting other queries.

  4. Monitoring. If pg_stat_statements is installed, pg_trickle logs a warning when the MERGE query writes temporary blocks to disk.
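
A quick way to check for spilling, assuming pg_stat_statements is installed:

-- Statements that wrote temporary blocks to disk (likely work_mem spills):
SELECT query, calls, temp_blks_written
FROM pg_stat_statements
WHERE temp_blks_written > 0
ORDER BY temp_blks_written DESC
LIMIT 10;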

Why are refreshes processed sequentially by default?

The default (parallel_refresh_mode = 'off') is sequential because it is simple, correct, and efficient for most workloads. Topological ordering guarantees upstream stream tables refresh before downstream ones with no coordination overhead.

When to consider enabling parallel refresh:

  • Your database has many independent stream tables (no shared dependencies).
  • Total cycle time approaches the sum of all refresh durations, and some refreshes visibly block unrelated ones.
  • You have enough max_worker_processes headroom (each parallel worker uses one slot).

Enabling parallel refresh (v0.4.0+):

ALTER SYSTEM SET pg_trickle.parallel_refresh_mode = 'on';
SELECT pg_reload_conf();

With parallel_refresh_mode = 'on', the scheduler builds an execution-unit DAG and dispatches independent units to dynamic background workers. Atomic consistency groups and IMMEDIATE-trigger closures remain single-worker for correctness.

See CONFIGURATION.md — Parallel Refresh for tuning guidance including the worker-budget formula.

How many connections does pg_trickle use?

pg_trickle uses the following PostgreSQL connections:

| Component | Connections | When |
| --- | --- | --- |
| Background scheduler | 1 | Always (per database with STs) |
| WAL decoder polling | 0 (shared) | Uses the scheduler's SPI connection |
| Manual refresh | 1 | Per-call (uses caller's session) |

Total: 1 persistent connection per database. WAL decoder polling shares the scheduler's SPI connection rather than opening separate connections.

max_worker_processes: pg_trickle registers 1 background worker per database during _PG_init(). Ensure max_worker_processes (default 8) has room for the pg_trickle worker plus any other extensions.

Advisory locks: The scheduler holds a session-level advisory lock per actively-refreshing ST. These are released immediately after each refresh completes.
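
To inspect those locks, a generic pg_locks query works (pg_trickle does not document its advisory lock key layout, so treat classid/objid as opaque):

SELECT pid, classid, objid, granted
FROM pg_locks
WHERE locktype = 'advisory';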

Troubleshooting & Failure Mode Runbook

This document covers common failure scenarios, their symptoms, diagnosis steps, and resolution procedures. It is intended for operators and DBAs running pg_trickle in production.

Quick start: Run SELECT * FROM pgtrickle.health_check() WHERE severity != 'OK'; for a single-call triage of your installation.

Diagnostic Toolkit

These functions are your primary tools for diagnosing issues:

| Function | Purpose |
| --- | --- |
| pgtrickle.health_check() | Single-call overall health triage (OK/WARN/ERROR) |
| pgtrickle.pgt_status() | Status, staleness, error count for all stream tables |
| pgtrickle.refresh_timeline(N) | Last N refresh events across all stream tables |
| pgtrickle.diagnose_errors('name') | Last 5 failed events with classification and remediation |
| pgtrickle.change_buffer_sizes() | CDC pipeline: pending rows and buffer bytes per source |
| pgtrickle.trigger_inventory() | CDC trigger presence and enabled state |
| pgtrickle.check_cdc_health() | WAL replication slot health (WAL mode only) |
| pgtrickle.dependency_tree() | Dependency DAG visualization |
| pgtrickle.worker_pool_status() | Parallel refresh worker pool state |
| pgtrickle.explain_st('name') | DVM operator tree and generated delta SQL |

Quick health check script:

-- 1. Overall health
SELECT * FROM pgtrickle.health_check() WHERE severity != 'OK';

-- 2. Problem stream tables
SELECT name, status, refresh_mode, consecutive_errors, staleness
FROM pgtrickle.pgt_status()
WHERE status != 'ACTIVE' OR consecutive_errors > 0
ORDER BY consecutive_errors DESC;

-- 3. Recent failures
SELECT start_time, stream_table, action, status, duration_ms, error_message
FROM pgtrickle.refresh_timeline(20)
WHERE status = 'FAILED';

Failure Scenarios

1. Scheduler Not Running

Symptoms:

  • No stream tables are refreshing
  • health_check() reports scheduler_running = false
  • No pg_trickle scheduler process in pg_stat_activity

Diagnosis:

-- Check for the scheduler process
SELECT pid, datname, backend_type, state
FROM pg_stat_activity
WHERE backend_type = 'pg_trickle scheduler';

-- Check GUC
SHOW pg_trickle.enabled;

-- Check shared_preload_libraries
SHOW shared_preload_libraries;

Resolution:

| Cause | Fix |
| --- | --- |
| pg_trickle.enabled = off | ALTER SYSTEM SET pg_trickle.enabled = on; SELECT pg_reload_conf(); |
| Not in shared_preload_libraries | Add pg_trickle to shared_preload_libraries in postgresql.conf and restart PostgreSQL |
| max_worker_processes exhausted | Increase max_worker_processes and restart. The launcher retries every 5 minutes — check PostgreSQL logs for WARNING: pg_trickle launcher: could not spawn scheduler |
| Scheduler crashed | Check PostgreSQL logs for crash details. The launcher will auto-restart it. If recurring, check for OOM or resource limits |

2. Stream Table Stuck in SUSPENDED Status

Symptoms:

  • Stream table status shows SUSPENDED
  • consecutive_errors is at or above pg_trickle.max_consecutive_errors
  • No refreshes happening for this stream table

Diagnosis:

-- Check the stream table status
SELECT pgt_name, status, consecutive_errors, last_error_message
FROM pgtrickle.pg_stat_stream_tables
WHERE pgt_name = 'my_stream_table';

-- Get detailed error history
SELECT * FROM pgtrickle.diagnose_errors('my_stream_table');

Resolution:

  1. Fix the underlying error (check last_error_message and diagnose_errors)
  2. Resume the stream table:
SELECT pgtrickle.alter_stream_table('my_stream_table', enabled => true);
  3. Trigger a manual refresh to verify:
SELECT pgtrickle.refresh_stream_table('my_stream_table');

Prevention: Increase pg_trickle.max_consecutive_errors (default 3) if transient errors are common in your environment:

ALTER SYSTEM SET pg_trickle.max_consecutive_errors = 5;
SELECT pg_reload_conf();

3. CDC Triggers Missing or Disabled

Symptoms:

  • Stream table refreshes succeed but shows no changes
  • change_buffer_sizes() shows pending_rows = 0 despite active DML
  • Source tables have no pg_trickle triggers

Diagnosis:

-- Check trigger inventory
SELECT source_table, trigger_type, trigger_name, present, enabled
FROM pgtrickle.trigger_inventory()
WHERE NOT present OR NOT enabled;

-- Manual check on a specific source table
SELECT tgname, tgenabled
FROM pg_trigger
WHERE tgrelid = 'public.orders'::regclass
  AND tgname LIKE 'pgt_%';

Resolution:

| Cause | Fix |
| --- | --- |
| Triggers dropped by DDL (e.g., pg_dump + restore without triggers) | Drop and recreate the stream table, or reinitialize: SELECT pgtrickle.refresh_stream_table('my_st'); |
| Triggers disabled (ALTER TABLE ... DISABLE TRIGGER) | ALTER TABLE source_table ENABLE TRIGGER ALL; |
| Source gating active | Check SELECT * FROM pgtrickle.source_gates(); and ungate: SELECT pgtrickle.ungate_source('source_table'); |
| WAL mode active but slot missing | See WAL Replication Slot Lag or Missing |

4. WAL Replication Slot Lag or Missing

Symptoms:

  • check_cdc_health() shows slot_lag_exceeds_threshold or replication_slot_missing
  • WAL disk usage growing unexpectedly
  • Stream tables not receiving changes in WAL mode

Diagnosis:

-- Check CDC health
SELECT * FROM pgtrickle.check_cdc_health();

-- Check replication slots directly
SELECT slot_name, active, restart_lsn, confirmed_flush_lsn,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS lag
FROM pg_replication_slots
WHERE slot_name LIKE 'pgt_%';

Resolution:

| Cause | Fix |
| --- | --- |
| Slot dropped externally | pg_trickle will auto-fallback to trigger-based CDC. To recreate: drop and recreate the stream table |
| Slot lagging (WAL accumulation) | Check for long-running transactions: SELECT pid, age(backend_xmin) FROM pg_stat_activity WHERE backend_xmin IS NOT NULL;. Kill idle-in-transaction sessions |
| wal_level != logical | WAL CDC requires wal_level = logical. Set it and restart PostgreSQL |
| max_replication_slots exhausted | Increase max_replication_slots and restart |

Fallback: Force trigger-based CDC mode if WAL mode is problematic:

ALTER SYSTEM SET pg_trickle.cdc_mode = 'trigger';
SELECT pg_reload_conf();

5. Stream Table Stuck in INITIALIZING

Symptoms:

  • Stream table status is INITIALIZING for an extended period
  • The initial full refresh may have failed or is still running

Diagnosis:

-- Check refresh history
SELECT * FROM pgtrickle.get_refresh_history('my_st', 5);

-- Check for active refresh
SELECT pid, state, query, now() - query_start AS running_for
FROM pg_stat_activity
WHERE query LIKE '%pgtrickle%' AND state = 'active';

Resolution:

| Cause | Fix |
| --- | --- |
| Initial refresh failed (check error in history) | Fix the error, then: SELECT pgtrickle.refresh_stream_table('my_st'); |
| Defining query is very slow | Optimize the query, add indexes on source tables, or increase work_mem |
| Lock contention during initial refresh | See Lock Contention |

6. Change Buffers Growing Without Refresh

Symptoms:

  • change_buffer_sizes() shows large pending_rows and growing buffer_bytes
  • Stream tables are stale
  • Refreshes are not running or are failing

Diagnosis:

-- Check buffer sizes
SELECT stream_table, source_table, pending_rows, buffer_bytes
FROM pgtrickle.change_buffer_sizes()
ORDER BY pending_rows DESC;

-- Check if refreshes are happening
SELECT * FROM pgtrickle.refresh_timeline(10);

-- Check for blocked refresh processes
SELECT pid, wait_event_type, wait_event, state, query
FROM pg_stat_activity
WHERE query LIKE '%pgtrickle%';

Resolution:

| Cause | Fix |
| --- | --- |
| Scheduler not running | See Scheduler Not Running |
| All refreshes failing | Check diagnose_errors() for each affected stream table |
| Lock contention | See Lock Contention |
| Very large buffer causing slow MERGE | Consider lowering pg_trickle.differential_change_ratio_threshold to trigger FULL refresh for large batches |

Emergency: If buffers are dangerously large and you need immediate relief:

-- Force a full refresh (bypasses change buffers)
SELECT pgtrickle.refresh_stream_table('my_st', force_full => true);

7. Lock Contention Blocking Refresh

Symptoms:

  • Refresh duration is much longer than usual
  • pg_stat_activity shows refresh processes in Lock wait state
  • Long-running transactions on source or stream tables

Diagnosis:

-- Find blocking locks
SELECT blocked.pid AS blocked_pid,
       blocked.query AS blocked_query,
       blocking.pid AS blocking_pid,
       blocking.query AS blocking_query
FROM pg_stat_activity blocked
JOIN pg_locks bl ON bl.pid = blocked.pid AND NOT bl.granted
JOIN pg_locks gl ON gl.locktype = bl.locktype
    AND gl.database IS NOT DISTINCT FROM bl.database
    AND gl.relation IS NOT DISTINCT FROM bl.relation
    AND gl.page IS NOT DISTINCT FROM bl.page
    AND gl.tuple IS NOT DISTINCT FROM bl.tuple
    AND gl.pid != bl.pid
    AND gl.granted
JOIN pg_stat_activity blocking ON blocking.pid = gl.pid
WHERE blocked.query LIKE '%pgtrickle%';

Resolution:

  1. Identify and terminate the blocking session if appropriate:
    SELECT pg_terminate_backend(<blocking_pid>);
    
  2. Investigate why the blocking transaction is long-running (idle in transaction, slow query, etc.)
  3. Consider adding statement_timeout or idle_in_transaction_session_timeout to prevent future occurrences

8. Out-of-Memory During Refresh

Symptoms:

  • Refresh processes killed by OS OOM killer
  • PostgreSQL logs show out of memory errors
  • Stream tables fail with system-category errors

Diagnosis:

# Check OS OOM killer logs
dmesg | grep -i "oom\|killed process" | tail -20

# Check PostgreSQL logs for memory errors
grep -i "out of memory\|oom" /var/log/postgresql/postgresql-*.log | tail -10

-- Check which stream tables have large source data
SELECT stream_table, source_table, pending_rows
FROM pgtrickle.change_buffer_sizes()
ORDER BY pending_rows DESC;

Resolution:

| Cause | Fix |
| --- | --- |
| Large FULL refresh on big table | Reduce work_mem or maintenance_work_mem to limit per-query memory |
| Large change buffer accumulation | Refresh more frequently (shorter schedule) to keep buffers small |
| Complex query with many joins | Simplify the defining query or break it into cascading stream tables |
| Parallel refresh amplifies memory | Reduce pg_trickle.max_concurrent_refreshes |

Tuning:

-- Limit per-refresh memory
SET work_mem = '64MB';

-- Limit concurrent refreshes to reduce peak memory
ALTER SYSTEM SET pg_trickle.max_concurrent_refreshes = 2;
SELECT pg_reload_conf();

9. Disk Full / WAL Retention Exceeded

Symptoms:

  • PostgreSQL logs No space left on device errors
  • WAL directory consuming excessive disk
  • Replication slots preventing WAL cleanup

Diagnosis:

# Check disk usage
df -h /var/lib/postgresql/data
du -sh /var/lib/postgresql/data/pg_wal/

-- Check replication slot WAL retention
SELECT slot_name, active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC;

-- Check change buffer table sizes
SELECT stream_table, source_table,
       pg_size_pretty(buffer_bytes::bigint) AS buffer_size
FROM pgtrickle.change_buffer_sizes()
ORDER BY buffer_bytes DESC;

Resolution:

| Cause | Fix |
| --- | --- |
| Inactive replication slot holding WAL | Drop the slot: SELECT pg_drop_replication_slot('pgt_...'); |
| Change buffer tables too large | Force a full refresh to clear buffers, or refresh more frequently |
| WAL accumulation from long transactions | Terminate idle-in-transaction sessions |
| max_wal_size too low | Increase max_wal_size in postgresql.conf |

Emergency cleanup:

-- Drop inactive pg_trickle replication slots
SELECT pg_drop_replication_slot(slot_name)
FROM pg_replication_slots
WHERE slot_name LIKE 'pgt_%' AND NOT active;

10. Circular Pipeline Convergence Failure

Symptoms:

  • Stream tables in a circular dependency hit the maximum iteration limit
  • Refresh history shows repeated cycles without convergence
  • Error messages mention fixed_point_max_iterations

Diagnosis:

-- Check for circular dependencies
SELECT * FROM pgtrickle.dependency_tree();

-- Check refresh history for iteration patterns
SELECT start_time, stream_table, action, status, error_message
FROM pgtrickle.refresh_timeline(50)
WHERE stream_table IN ('st_a', 'st_b')  -- suspected cycle members
ORDER BY start_time DESC;

Resolution:

  1. Verify the cycle is intentional (see Circular Dependencies tutorial)
  2. Increase the iteration limit if convergence is slow:
    ALTER SYSTEM SET pg_trickle.fixed_point_max_iterations = 20;
    SELECT pg_reload_conf();
    
  3. If the cycle never converges, the defining queries may not be monotone. Restructure to eliminate the cycle or ensure monotonicity

11. Schema Change Broke Stream Table

Symptoms:

  • Stream table has needs_reinit = true
  • Reinitialization keeps failing
  • Error messages reference dropped or renamed columns

Diagnosis:

-- Check for pending reinit
SELECT pgt_name, needs_reinit, status, last_error_message
FROM pgtrickle.pg_stat_stream_tables
WHERE needs_reinit;

-- Get error details
SELECT * FROM pgtrickle.diagnose_errors('my_st');

Resolution:

If the defining query is still valid after the DDL change, force a reinit:

SELECT pgtrickle.refresh_stream_table('my_st');

If the defining query needs to be updated:

-- Option 1: Alter the defining query
SELECT pgtrickle.alter_stream_table('my_st',
    query => 'SELECT new_column, SUM(amount) FROM orders GROUP BY new_column'
);

-- Option 2: Drop and recreate
SELECT pgtrickle.drop_stream_table('my_st');
SELECT pgtrickle.create_stream_table(
    'my_st',
    'SELECT new_column, SUM(amount) FROM orders GROUP BY new_column',
    '1m'
);

12. Worker Pool Exhaustion

Symptoms:

  • Refresh latency increases across the board
  • Some stream tables refresh while others queue indefinitely
  • worker_pool_status() shows all workers busy

Diagnosis:

-- Check worker pool
SELECT * FROM pgtrickle.worker_pool_status();

-- Check for long-running parallel jobs
SELECT job_id, unit_key, status, duration_ms
FROM pgtrickle.parallel_job_status(300)
WHERE status = 'RUNNING'
ORDER BY duration_ms DESC;

Resolution:

| Cause | Fix |
| --- | --- |
| Too few workers for workload | Increase pg_trickle.max_concurrent_refreshes and/or max_worker_processes |
| One stream table monopolizing workers | Check whether a single slow refresh is blocking the pool. Consider splitting it into smaller stream tables |
| Global worker cap reached | Increase pg_trickle.max_dynamic_refresh_workers |

13. Fuse Tripped (Circuit Breaker)

Symptoms:

  • Stream table shows fuse_state = 'BLOWN' or refresh is paused
  • fuse_status() reports a tripped fuse
  • No refreshes happening despite active scheduler

Diagnosis:

-- Check fuse status
SELECT * FROM pgtrickle.fuse_status();

Resolution:

Reset the fuse after investigating the root cause:

SELECT pgtrickle.reset_fuse('my_stream_table');

See the Fuse Circuit Breaker tutorial for details on fuse thresholds and configuration.


General Diagnostic Workflow

When investigating any issue, follow this sequence:

1. health_check()          → identify which subsystem is unhealthy
2. pgt_status()            → find specific affected stream tables
3. diagnose_errors('name') → get root cause for failures
4. refresh_timeline(20)    → correlate with recent refresh events
5. change_buffer_sizes()   → check CDC pipeline health
6. trigger_inventory()     → verify change capture is working
7. dependency_tree()       → confirm DAG wiring
8. PostgreSQL logs         → low-level crash/resource details
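
Assuming the function names in the sequence above, the first three steps can be run as a single diagnostic pass ('my_st' is a placeholder stream table name):

```sql
-- Steps 1–3 of the workflow in one pass (function names from the
-- sequence above; 'my_st' is a placeholder).
SELECT * FROM pgtrickle.health_check();            -- which subsystem is unhealthy?
SELECT * FROM pgtrickle.pgt_status();              -- which stream tables are affected?
SELECT * FROM pgtrickle.diagnose_errors('my_st');  -- root cause for one table
```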

GUC Quick Reference for Troubleshooting

| GUC | Default | What to check |
| --- | --- | --- |
| pg_trickle.enabled | on | Must be on for the scheduler to run |
| pg_trickle.max_consecutive_errors | 3 | Stream tables suspend after this many failures |
| pg_trickle.scheduler_interval_ms | 100 | Very high values cause refresh lag |
| pg_trickle.event_driven_wake | on | off = poll-only, higher latency |
| pg_trickle.cdc_mode | auto | Set to trigger for a reliable fallback |
| pg_trickle.max_concurrent_refreshes | 4 | Per-database parallel refresh cap |
| pg_trickle.fixed_point_max_iterations | 10 | Circular pipeline iteration limit |
| pg_trickle.differential_change_ratio_threshold | 0.5 | Falls back to FULL above this ratio |
| pg_trickle.auto_backoff | on | Stretches intervals up to 8x under load |

pg_trickle Error Reference

This document lists all PgTrickleError variants with descriptions, common causes, and suggested fixes. If you encounter an error not listed here, please open an issue.

Tip: Most errors include context (table name, OID, or query fragment) in the message text. Use that context to narrow down the root cause.


SQLSTATE Code Reference

Every pg_trickle error includes a PostgreSQL SQLSTATE code for programmatic error handling. Use SQLSTATE in PL/pgSQL EXCEPTION WHEN blocks or check the error code in your client library.
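
For example, a PL/pgSQL block can trap specific codes from the table below by their standard condition names — a sketch, assuming the API functions documented elsewhere in this reference:

```sql
-- Sketch: trap pg_trickle errors by SQLSTATE condition name.
-- 'my_st' is a placeholder stream table name.
DO $$
BEGIN
    PERFORM pgtrickle.refresh_stream_table('my_st');
EXCEPTION
    WHEN lock_not_available THEN       -- 55P03: LockTimeout, retried by scheduler
        RAISE NOTICE 'refresh blocked by a lock, will retry: %', SQLERRM;
    WHEN feature_not_supported THEN    -- 0A000: UnsupportedOperator
        RAISE NOTICE 'query not incrementally maintainable: %', SQLERRM;
END $$;
```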

| Error Variant | SQLSTATE | Code Name |
| --- | --- | --- |
| QueryParseError | 42000 | SYNTAX_ERROR_OR_ACCESS_RULE_VIOLATION |
| TypeMismatch | 42804 | DATATYPE_MISMATCH |
| UnsupportedOperator | 0A000 | FEATURE_NOT_SUPPORTED |
| CycleDetected | 3F000 | INVALID_SCHEMA_DEFINITION |
| NotFound | 42P01 | UNDEFINED_TABLE |
| AlreadyExists | 42P07 | DUPLICATE_TABLE |
| InvalidArgument | 22023 | INVALID_PARAMETER_VALUE |
| QueryTooComplex | 54000 | PROGRAM_LIMIT_EXCEEDED |
| UpstreamTableDropped | 42P01 | UNDEFINED_TABLE |
| UpstreamSchemaChanged | 42P17 | INVALID_TABLE_DEFINITION |
| LockTimeout | 55P03 | LOCK_NOT_AVAILABLE |
| ReplicationSlotError | 55000 | OBJECT_NOT_IN_PREREQUISITE_STATE |
| WalTransitionError | 55000 | OBJECT_NOT_IN_PREREQUISITE_STATE |
| SpiError | XX000 | INTERNAL_ERROR |
| SpiPermissionError | 42501 | INSUFFICIENT_PRIVILEGE |
| WatermarkBackwardMovement | 22000 | DATA_EXCEPTION |
| WatermarkGroupNotFound | 42704 | UNDEFINED_OBJECT |
| WatermarkGroupAlreadyExists | 42710 | DUPLICATE_OBJECT |
| RefreshSkipped | 55000 | OBJECT_NOT_IN_PREREQUISITE_STATE |
| InternalError | XX000 | INTERNAL_ERROR |

Error Categories

pg_trickle classifies errors into four categories that determine retry behavior:

| Category | Retried by scheduler? | Description |
| --- | --- | --- |
| User | No | Invalid queries, type mismatches, DAG cycles. Fix the input. |
| Schema | No (triggers reinitialize) | Upstream DDL changes. The stream table is reinitialized automatically. |
| System | Yes (with backoff) | Lock timeouts, replication slot problems, transient SPI failures. |
| Internal | No | Unexpected bugs. Please report these. |

User Errors

QueryParseError

Message: query parse error: <details>

Description: The defining query could not be parsed or validated by the pg_trickle query analyzer.

Common causes:

  • Syntax error in the defining query
  • Use of PostgreSQL syntax not yet supported by pgrx's query parser
  • A CTE or subquery that cannot be analyzed

Suggested fix: Simplify the query. Check that it runs as a standalone SELECT statement. Review SQL Reference — Expression Support for supported syntax.


TypeMismatch

Message: type mismatch: <details>

Description: A type incompatibility was detected between the defining query output and the stream table schema, or between source columns and expected types.

Common causes:

  • Column type changed on a source table after stream table creation
  • Explicit cast to an incompatible type in the defining query
  • UNION branches with mismatched column types

Suggested fix: Ensure column types match. Use explicit CAST() to align types if needed. If the source table changed, use pgtrickle.repair_stream_table() to reinitialize.


UnsupportedOperator

Message: unsupported operator for DIFFERENTIAL mode: <operator>

Description: The defining query uses an SQL operator or construct that pg_trickle cannot maintain incrementally.

Common causes:

  • TABLESAMPLE, GROUPING SETS beyond the branch limit, recursive CTEs with unsupported patterns, certain window function combinations
  • Non-monotonic or volatile functions in positions that prevent differential maintenance

Suggested fix: Use refresh_mode => 'FULL' to fall back to full recomputation:

SELECT pgtrickle.alter_stream_table('my_stream_table',
    refresh_mode => 'FULL');

Or restructure the query to avoid the unsupported construct. See SQL Reference — Expression Support.


CycleDetected

Message: cycle detected in dependency graph: A -> B -> C -> A

Description: Adding or altering this stream table would create a circular dependency in the refresh DAG.

Common causes:

  • Stream table A depends on stream table B, which depends on A
  • Indirect cycles through chains of stream tables

Suggested fix: Restructure the stream table definitions to break the cycle. Use pgtrickle.get_dependency_graph() to visualize the current DAG. If circular dependencies are intentional, enable pg_trickle.allow_circular = true (see Configuration).


NotFound

Message: stream table not found: <name>

Description: The specified stream table does not exist in the pgtrickle.pgt_stream_tables catalog.

Common causes:

  • Typo in the stream table name
  • The stream table was already dropped
  • Schema-qualified name required but not provided (e.g., myschema.my_st)

Suggested fix: Check the name with pgtrickle.list_stream_tables(). Use the fully qualified name: schema.table_name.


AlreadyExists

Message: stream table already exists: <name>

Description: A create_stream_table() call was made for a stream table name that is already registered.

Common causes:

  • Re-running a migration or DDL script without IF NOT EXISTS

Suggested fix: Use pgtrickle.create_stream_table_if_not_exists() or pgtrickle.create_or_replace_stream_table() for idempotent creation.


InvalidArgument

Message: invalid argument: <details>

Description: An invalid value was passed to a pg_trickle API function.

Common causes:

  • Invalid refresh_mode value (must be 'DIFFERENTIAL', 'FULL', or 'AUTO')
  • Calling resume_stream_table() on a table that is not suspended
  • Invalid schedule interval or threshold value
  • Empty or malformed table name

Suggested fix: Check the function signature in the SQL Reference and correct the argument.


QueryTooComplex

Message: query too complex: <details>

Description: The defining query exceeds the maximum parse depth, which protects against stack overflow during query analysis.

Common causes:

  • Deeply nested subqueries (> 64 levels by default)
  • Large UNION ALL chains
  • Complex CTE hierarchies

Suggested fix: Simplify the query. If the depth limit is too restrictive, increase pg_trickle.max_parse_depth (default: 64). See Configuration.


Schema Errors

UpstreamTableDropped

Message: upstream table dropped: OID <oid>

Description: A source table referenced by the stream table's defining query was dropped.

Common causes:

  • DROP TABLE on a source table
  • Table replaced via DROP + CREATE (new OID)

Suggested fix: Either recreate the source table with the same schema or drop the stream table and recreate it. If pg_trickle.block_source_ddl = true (default), the DROP would have been blocked in the first place.


UpstreamSchemaChanged

Message: upstream table schema changed: OID <oid>

Description: A source table's schema was altered (e.g., column added, dropped, or type changed) in a way that affects the defining query.

Common causes:

  • ALTER TABLE ... ADD/DROP/ALTER COLUMN on a source table
  • Type change on a column used in the defining query

Suggested fix: The stream table will be automatically reinitialized on the next scheduler tick. If pg_trickle.block_source_ddl = true (default), most schema changes are blocked proactively. Use pgtrickle.alter_stream_table(..., query => '...') to update the defining query if needed.


System Errors

LockTimeout

Message: lock timeout: <details>

Description: A lock required for refresh could not be acquired within the configured timeout.

Common causes:

  • Long-running transactions holding locks on the stream table or source tables
  • Concurrent ALTER TABLE or VACUUM FULL operations
  • High contention on the change buffer tables

Suggested fix: This error is automatically retried with exponential backoff. If persistent, investigate long-running transactions with pg_stat_activity. Consider increasing lock_timeout or reducing refresh frequency.
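
Long-running transactions can be spotted with a standard pg_stat_activity query (no pg_trickle objects involved; the 5-minute cutoff is an arbitrary example):

```sql
-- Find transactions open longer than 5 minutes that may hold the locks.
SELECT pid, state, wait_event_type,
       now() - xact_start AS xact_age,
       left(query, 60)    AS query_head
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND now() - xact_start > interval '5 minutes'
ORDER BY xact_start;
```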


ReplicationSlotError

Message: replication slot error: <details>

Description: An error occurred with the logical replication slot used for WAL-based CDC.

Common causes:

  • Replication slot dropped externally
  • wal_level changed from logical to a lower level
  • Slot lag exceeded max_slot_wal_keep_size

Suggested fix: Check replication slot status with SELECT * FROM pg_replication_slots. Ensure wal_level = logical. If the slot was dropped, pg_trickle will recreate it automatically. See Configuration — WAL CDC.
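
Both checks from the suggested fix can be run directly against standard PostgreSQL catalogs:

```sql
-- Inspect slot health: 'active' and 'wal_status' reveal dropped or lagging slots.
SELECT slot_name, slot_type, active, wal_status, restart_lsn
FROM pg_replication_slots;

SHOW wal_level;  -- must return 'logical' for WAL-based CDC
```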


WalTransitionError

Message: WAL transition error: <details>

Description: An error occurred during the transition from trigger-based CDC to WAL-based CDC.

Common causes:

  • wal_level is not logical when cdc_mode = 'auto'
  • Transient connection issues during the transition

Suggested fix: Ensure wal_level = logical in postgresql.conf if you want WAL-based CDC. Otherwise set pg_trickle.cdc_mode = 'trigger' to stay on trigger-based CDC. This error is retried automatically.


SpiError

Message: SPI error: <details>

Description: A PostgreSQL Server Programming Interface (SPI) error occurred during an internal query.

Common causes:

  • Transient serialization failures under high concurrency
  • Deadlocks between refresh and concurrent DML
  • Connection issues in background workers
  • Permanent errors: missing columns, syntax errors in generated SQL

Suggested fix: Transient SPI errors (deadlocks, serialization failures) are retried automatically. Permanent errors (permission denied, missing objects) will suspend the stream table after max_consecutive_errors failures. Check pgtrickle.check_health() for details.


SpiPermissionError

Message: SPI permission error: <details>

Description: The background worker's role lacks required permissions.

Common causes:

  • Missing SELECT privilege on a source table
  • Missing INSERT/UPDATE/DELETE privilege on the stream table
  • Role used by the background worker is not the table owner

Suggested fix: Grant the necessary privileges to the role running pg_trickle's background workers:

GRANT SELECT ON source_table TO pgtrickle_role;
GRANT ALL ON pgtrickle.my_stream_table TO pgtrickle_role;

This error does not count toward the consecutive error suspension limit.


Watermark Errors

WatermarkBackwardMovement

Message: watermark moved backward: <details>

Description: A watermark advancement was rejected because the new value is older than the current watermark, violating monotonicity.

Common causes:

  • Clock skew in distributed systems
  • Manual watermark manipulation with an incorrect value
  • Bug in watermark tracking logic

Suggested fix: Ensure watermark values are monotonically increasing. Check the current watermark with pgtrickle.get_watermark_groups().


WatermarkGroupNotFound

Message: watermark group not found: <details>

Description: The specified watermark group does not exist.

Common causes:

  • Typo in the watermark group name
  • The group was deleted or never created

Suggested fix: List existing groups with pgtrickle.get_watermark_groups().


WatermarkGroupAlreadyExists

Message: watermark group already exists: <details>

Description: A watermark group with this name already exists.

Common causes:

  • Re-running a setup script without idempotent guards

Suggested fix: Use a different name or delete the existing group first.


Transient Errors

RefreshSkipped

Message: refresh skipped: <details>

Description: A refresh was skipped because a previous refresh for the same stream table is still running.

Common causes:

  • Slow refresh (large delta or complex query) overlapping with the next scheduled cycle
  • Multiple manual refresh_stream_table() calls in parallel

Suggested fix: No action needed — the scheduler will retry on the next cycle. If this happens frequently, increase the schedule interval or investigate why refreshes are slow using pgtrickle.explain_st().

This error does not count toward the consecutive error suspension limit.


Internal Errors

InternalError

Message: internal error: <details>

Description: An unexpected internal error that indicates a bug in pg_trickle.

Common causes:

  • This should not happen in normal operation

Suggested fix: Please report the issue with the full error message, your PostgreSQL version, and pg_trickle version. Include the output of pgtrickle.check_health() and the relevant PostgreSQL log entries.



Changelog

What's new in pg_trickle — written for everyone, not just developers.

For future plans and upcoming features, see ROADMAP.md.



[Unreleased]


[0.20.0] — Dog Feeding

pg_trickle now monitors itself. Instead of you having to check on pg_trickle's health manually, this release lets pg_trickle watch its own performance, spot problems early, and even fix some of them on its own. Five new stream tables sit in the pgtrickle schema and continuously analyse refresh history — the same technology you use for your own data, pointed inward. One SQL call sets everything up; one call tears it down.

We call this dog feeding — pg_trickle uses its own stream-table technology to keep an eye on itself, just like it keeps your data views up to date.

What's new

  • One-click self-monitoring — run SELECT pgtrickle.setup_dog_feeding() and pg_trickle creates five monitoring stream tables that continuously track how well it is performing. Run teardown_dog_feeding() to remove them. Both are idempotent — safe to call as many times as you like, even during rolling upgrades.

  • Health at a glance — the new dog_feeding_status() function shows the status of all five monitoring views in one query: whether each one exists, its refresh mode, and the last time it refreshed. Quick to run from a monitoring script or dashboard.

  • Threshold recommendations — after enough refresh cycles accumulate (typically 10–20 minutes of activity), df_threshold_advice starts producing suggestions for each stream table. Each recommendation includes a confidence level (HIGH / MEDIUM / LOW) and a reason — for example, "DIFF is 73% faster — raise threshold to allow more DIFF". A sla_headroom_pct column shows exactly how much faster incremental refresh is versus full refresh for that table.

  • Automatic tuning — set pg_trickle.dog_feeding_auto_apply = 'threshold_only' and pg_trickle will apply HIGH-confidence threshold recommendations automatically. Changes are rate-limited to once per 10 minutes per stream table, and every adjustment is logged to pgt_refresh_history with initiated_by = 'DOG_FEED' so you have a full audit trail.

  • Real-time alerts — when pg_trickle detects an anomaly (duration spike exceeding 3× the baseline, or two or more recent failures), it sends a NOTIFY on the pgtrickle_alert channel with a JSON payload. Your application, Alertmanager webhook, or LISTEN client can act immediately without polling.

  • Scheduling interference detection — df_scheduling_interference tracks pairs of stream tables that consistently overlap during refresh. When overlap is heavy, the scheduler automatically backs off its poll interval (up to 2× the configured base) to reduce contention.

  • Visual dependency graph — the new explain_dag() function renders your full refresh pipeline as a Mermaid or Graphviz DOT diagram. User stream tables appear in blue, dog-feeding tables in green, suspended tables in red. Paste the output into any Mermaid renderer or dot to see exactly how your tables depend on each other.

  • Scheduler overhead report — scheduler_overhead() returns metrics for the last hour: total refreshes, how many were dog-feeding, the fraction they represent, and average durations. Useful for confirming that self-monitoring adds negligible cost.
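
Putting the pieces above together, a minimal setup session might look like this (function and channel names as described in this release):

```sql
-- Enable self-monitoring and verify the five monitoring stream tables.
SELECT pgtrickle.setup_dog_feeding();
SELECT * FROM pgtrickle.dog_feeding_status();

-- React to anomaly alerts without polling (e.g. in a psql session or
-- any client that supports LISTEN/NOTIFY).
LISTEN pgtrickle_alert;
```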

What pg_trickle watches

| Monitoring view | What it tracks |
| --- | --- |
| df_efficiency_rolling | Rolling-window refresh speed, change ratio, DIFF vs FULL counts |
| df_anomaly_signals | Duration spikes (> 3× baseline), error bursts, mode oscillation |
| df_threshold_advice | Per-table threshold recommendations with confidence level and reasoning |
| df_cdc_buffer_trends | Change-capture buffer growth rate per source table; alerts on burst spikes |
| df_scheduling_interference | Refresh overlap patterns; pairs with 3+ concurrent refreshes in the last hour |

Faster and more reliable

  • A new index on pgt_refresh_history(pgt_id, start_time) speeds up all dog-feeding queries and general history lookups. Applied automatically during the 0.19.0 → 0.20.0 upgrade.
  • Old history records are now pruned in batches of 1,000 rows per transaction (previously one large DELETE), which avoids long lock holds on pgt_refresh_history during the nightly cleanup.
  • check_cdc_health() is enriched with spill-risk alerts: if a source table's max burst delta exceeds 10× its average, you get an early warning before the buffer fills.
  • explain_st() now shows two new properties: dog_feeding_coverage (none / partial / full) and recommended_refresh_mode, so diagnostics automatically surface self-monitoring data when it is available.

New documentation and tooling

  • SQL Reference — a new "Dog Feeding — Self-Monitoring" section covers all five stream tables, setup_dog_feeding(), teardown_dog_feeding(), confidence levels, and the sla_headroom_pct column.
  • Getting Started — a new "Day 2 Operations" section walks through enabling dog-feeding, reading recommendations, enabling auto-apply, and visualising the DAG.
  • Configuration — pg_trickle.dog_feeding_auto_apply is fully documented with values, rate-limiting behaviour, and the audit trail.
  • A ready-made Grafana dashboard (pg_trickle_dog_feeding.json) with five panels covers refresh throughput, anomaly heatmap, threshold calibration, CDC buffer growth, and the scheduling interference matrix.
  • A dbt macro (pgtrickle_enable_monitoring) enables monitoring as a post-hook with one line in dbt_project.yml.
  • A quick-start SQL script at sql/dog_feeding_setup.sql walks through setup, auto-apply, alert listening, and status verification in six steps.

[0.19.0] — 2026-04-13

Safer, faster, easier to operate. This release closes several security and correctness gaps, adds new conveniences for operators and developers, and significantly improves performance for deployments with many stream tables. The background scheduler finds the next table to refresh 10–15× faster. Four breaking changes are included — all easy to adapt to, each one correcting behaviour that was a source of subtle bugs in production.

Breaking changes

  • Only owners can modify their own stream tables — other database users can no longer drop or alter a stream table they did not create. If shared access is intentional, grant superuser or explicitly add the user as owner. Superusers are unaffected.

  • Dropping a stream table no longer cascades — drop_stream_table() now behaves like PostgreSQL's own DROP TABLE: it refuses to drop if dependent objects exist, unless you pass cascade => true explicitly. Previously it silently removed all dependents, which surprised operators after restructuring.

  • The refresh notification channel was renamed — change LISTEN pgtrickle_refresh to LISTEN pg_trickle_refresh (note the added underscore). The old name was inconsistent with every other channel in the extension.

  • The delete_insert refresh strategy was removed — this strategy could produce wrong results for queries containing aggregates or DISTINCT. If you had it configured, pg_trickle logs a warning and automatically switches to the safe auto strategy. No data is lost; the next refresh corrects any affected rows.

New features

  • Installation health check — version_check() returns the installed extension version, the loaded library version, and the PostgreSQL server version in one row. If the extension was upgraded but the server has not been restarted, you get an explicit warning. Useful in deploy scripts and smoke tests.

  • Write and refresh in one step — write_and_refresh(sql, st_name) executes an arbitrary SQL statement and immediately refreshes the named stream table in the same transaction. Downstream readers see consistent results as soon as the transaction commits — no polling loop needed.

  • Better connection-pooler support — the new pg_trickle.connection_pooler_mode GUC configures pg_trickle for PgBouncer, pgcat, or Supavisor at the cluster level. Previously each stream table had to be configured individually, which was error-prone on large deployments.

  • Automatic refresh history cleanup — pgt_refresh_history is now trimmed automatically after 90 days (configurable with pg_trickle.history_retention_days; set to 0 to disable). Without this, the history table could grow by thousands of rows per day on busy deployments.

  • Schema migration tracking — pg_trickle now records which upgrade scripts have been applied in pgtrickle.pgt_schema_version. This makes it straightforward to verify that a deployment is fully up to date and simplifies the rollback story.

  • Clearer skip messages — when a refresh is skipped because another refresh of the same stream table is already running, you now see a NOTICE: skipping refresh of <name> — already running message instead of silence. Reduces confusion when debugging slow or stuck schedulers.

  • Deeper diagnostics — explain_st() gains a with_analyze parameter. When set to true, it runs EXPLAIN (ANALYZE, BUFFERS) on the refresh query and returns actual row counts, timing, and buffer hit/miss ratios — the same information PostgreSQL's query planner provides for any query, but surfaced inside the stream-table diagnostic tool.

  • New deployment guides — step-by-step documentation for PgBouncer, pgcat, Supavisor, CNPG, and Kubernetes deployments, plus an operational runbook for common Kubernetes failure modes.
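
The one-step write-and-refresh feature described above might be used like this (a sketch — argument order as stated in the bullet; table and query names are illustrative):

```sql
-- Execute a write and refresh the dependent stream table atomically.
-- write_and_refresh(sql, st_name) as described in this release.
SELECT pgtrickle.write_and_refresh(
    $$INSERT INTO orders (id, status) VALUES (99, 'active')$$,
    'active_orders'
);
-- Readers of active_orders see the new row as soon as this commits.
```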

Bug fixes

  • Fixed a constraint-validation inconsistency in databases upgraded from 0.11.0 or earlier where pgt_refresh_history had a duplicate check entry in the catalog. Affected databases could see spurious constraint errors on busy write paths.

  • Error messages throughout the extension now show human-readable table names (e.g. public.orders) instead of raw PostgreSQL OIDs. This affects "source table was dropped", "schema changed", and several other error paths that were previously unreadable without a catalog lookup.

Performance

  • 10–15× faster scheduler dispatch — the scheduler now finds the next stream table to process with a direct lookup instead of scanning the full list on every poll cycle. On a deployment with 500 stream tables this drops from ~650 µs to ~45 µs per poll, reducing background CPU overhead significantly at scale.

  • Single-query change detection — when the scheduler checks whether any source tables have changed, it now issues one query covering all sources at once instead of one query per source table. On deployments with 50+ source tables this meaningfully reduces the overhead of each scheduler cycle, especially under PgBouncer transaction pooling.


[0.18.0] — 2026-04-12

Hardening & Delta Performance. This release focuses on correctness, reliability, and giving operators better visibility into what pg_trickle is doing. Stream tables that group by columns containing NULL values now refresh correctly in all cases. A new memory safety net prevents runaway refreshes from consuming too much RAM. Error messages across the board now explain what went wrong and suggest how to fix it. Two new SQL functions — health_summary() and cache_stats() — give you a single-query overview of the entire system, and updated Grafana dashboards make monitoring plug-and-play. The TPC-H industry benchmark now runs as a nightly regression guard, and property-based tests mathematically verify the core delta engine's arithmetic.

Highlights

  • NULL values in GROUP BY now handled correctly — previous versions could produce wrong results when a stream table grouped by a column that contained NULL values and rows were deleted. The root cause was that NULL group keys broke the internal row-matching logic. This is now fixed: NULL keys are matched correctly during both inserts and deletes, so aggregate stream tables always return the right answer regardless of NULLs in the data.

  • Memory safety net for large deltas — if an unexpectedly large batch of changes arrives (for example, a bulk import into a source table), the incremental refresh could previously consume unbounded memory. A new configuration option (pg_trickle.delta_work_mem_cap_mb) lets you set a ceiling. When a refresh would exceed it, pg_trickle automatically falls back to a full refresh instead of risking an out-of-memory crash.

  • Early warning when refreshes spill to disk — when the incremental refresh engine runs low on memory, PostgreSQL may spill intermediate data to temporary files on disk, which is much slower. pg_trickle now detects this and sends a notification so you can investigate before performance degrades. If spilling happens repeatedly, the scheduler automatically switches the affected stream table to full refresh.

  • One-query system health check — the new pgtrickle.health_summary() function returns a single row with everything you need at a glance: how many stream tables are active, how many are in error or suspended state, the worst staleness across all tables, whether the scheduler is running, and the overall cache hit rate. Perfect for dashboards, alerting rules, or a quick manual check.

  • Cache performance visibility — the new pgtrickle.cache_stats() function shows how effectively pg_trickle is reusing its internal query templates. You can see cache hit rates, eviction counts, and current cache size — useful for tuning pg_trickle.template_cache_size on busy systems.

  • Better error messages — every error pg_trickle can raise now includes a standard PostgreSQL error code (SQLSTATE), a DETAIL line explaining the context, and a HINT suggesting what to do. Instead of a cryptic internal error, you get actionable guidance like "Table 'orders' was dropped while stream table 'order_summary' depends on it — recreate the source table or drop the stream table."

Monitoring & dashboards

  • Updated Grafana dashboards — the bundled pg_trickle_overview.json dashboard now includes panels for template cache hit rate, P99 and average refresh latency, hourly refresh success/failure counts, and cache eviction trends. Import it into Grafana and point it at your Prometheus instance for instant visibility.

  • Prometheus metric documentation — all 8 new metrics exposed by cache_stats() and health_summary() are now fully documented in the monitoring guide, with ready-to-use PromQL queries.

Correctness & testing

  • TPC-H regression guard — all 22 queries from the TPC-H industry benchmark now run nightly against known-good expected output. If a code change causes any query to return different results, CI fails immediately. This catches subtle correctness regressions that targeted tests might miss.

  • Mathematical proof of delta arithmetic — 6 property-based tests (2,000 random cases each) verify that the core engine's insert/delete accounting is correct: operations compose in the right order, groups cancel out properly, and no phantom rows appear after mixed workloads. An additional 4 end-to-end property tests exercise the full pipeline from change capture through to the final merged result.

  • CDC edge case coverage — new tests cover composite primary keys, generated (computed) columns, NULL values in non-key columns, and domain types — real-world schema patterns that were previously untested.

  • dbt integration tests — the dbt adapter now has regression tests for AUTO refresh mode, stream table health checks, and refresh history lifecycle — ensuring the dbt workflow stays reliable across releases.

Scalability

  • Scaling guide — a new docs/SCALING.md document covers how to configure pg_trickle for large deployments (200+ stream tables), including worker pool sizing, tiered scheduling, per-database quotas, and tuning profiles for different workload types.

  • Buffer growth stress tests — new tests verify that the max_buffer_rows safety limit works correctly under sustained high write rates, including automatic recovery back to incremental refresh after a burst subsides.

Testing infrastructure

  • Faster CI on pull requests — 19 additional test files (~197 tests) were moved to the lightweight test runner that does not require building a custom Docker image. Pull request CI is now faster without sacrificing coverage.

  • Upgrade path tested — the full upgrade chain from version 0.1.3 through every release up to 0.18.0 is verified automatically in CI, including function availability, schema integrity, and data survival.

Fixed

  • Upgrade script completeness — the 0.17.0→0.18.0 upgrade migration now includes all new and changed functions (pg_trickle_hash, cache_stats(), health_summary()), so ALTER EXTENSION pg_trickle UPDATE works correctly.

[0.17.0] — 2026-04-08

Query Intelligence & Stability. This release teaches pg_trickle to make smarter decisions about how to refresh each stream table, reduces unnecessary work when only a handful of columns actually changed, and proves correctness through 10,000 automated random mutations every night. Large deployments with hundreds of stream tables now handle schema changes much faster. Alongside these improvements, three new documentation resources make it easier to get started, troubleshoot problems, and migrate from pg_ivm.

Highlights

  • Query-aware refresh decisions — pg_trickle previously used a fixed threshold to decide between incremental and full refresh: if more than 50% of rows changed, switch to full. That works for simple queries but is poorly calibrated for joins or aggregates. The engine now classifies each query by its complexity (simple scan, filter, aggregate, join, or join+aggregate) and weights the cost estimate accordingly. Simple queries stay incremental even at high change rates; expensive join-heavy queries switch to full refresh sooner when the data is largely different. You can also pin a table to always use one strategy with the new pg_trickle.refresh_strategy setting ('auto' / 'differential' / 'full'), or tune the aggressiveness with pg_trickle.cost_model_safety_margin.

  • Skip columns that did not change — when a row is updated in a wide source table (say, 50 columns) but only 2 columns that the stream table actually uses are modified, pg_trickle previously processed the full change anyway. It now tracks exactly which columns were modified and skips updates that touch none of the relevant columns. For aggregate stream tables the savings go further: a value-only update that does not affect group membership is applied as a single lightweight correction instead of a delete-then-insert pair. On write-heavy workloads with wide tables, this reduces the volume of data flowing through the refresh pipeline by 50–90%.

  • Faster schema changes on large deployments — every time you create, alter, or drop a stream table, pg_trickle previously rebuilt the entire internal dependency graph from scratch. With 100 stream tables that takes only a few milliseconds, but at 1,000 it becomes noticeable. The graph is now updated incrementally — only the affected edges are touched, leaving everything else in place. At 1,000 stream tables the rebuild time drops from ~600 µs to ~116 µs and no longer scales with the total number of tables in the database.

  • Nightly correctness oracle — a new automated test runs 10,000 random data mutations every night against a broad set of query shapes. For each mutation it compares the result of incremental refresh against a full recompute and fails if they ever disagree. This catches subtle correctness bugs that only surface after unusual sequences of inserts, updates, and deletes — the kind that hand-written tests rarely reach.

  • ROWS FROM() fully supported — queries that use ROWS FROM() to call multiple set-returning functions side-by-side are now fully supported in incremental mode, including updates and deletes. This was previously restricted to insert-only workloads.

New documentation

  • Try it in 60 seconds — a new playground/ directory contains a docker compose up environment with PostgreSQL 18 + pg_trickle pre-wired, sample data loaded, and five stream tables ready to query. No installation required beyond Docker.

  • Troubleshooting runbook — docs/TROUBLESHOOTING.md covers 13 real-world failure scenarios, including: scheduler not running, stream table stuck in SUSPENDED state, CDC triggers missing, WAL slot problems, out-of-memory, disk full, circular dependency convergence issues, unexpected schema changes, worker pool exhaustion, and blown fuses. Each scenario lists symptoms, diagnostic queries, and step-by-step resolution.

  • Migrating from pg_ivm — docs/tutorials/MIGRATING_FROM_PG_IVM.md is a step-by-step guide for teams moving from the pg_ivm extension. It maps every pg_ivm API to its pg_trickle equivalent, explains behavioral differences, and includes ready-to-run SQL examples and a post-migration verification checklist.

  • New user FAQ — the 15 most common questions are now answered at the top of docs/FAQ.md, so new users find answers before scrolling through the full document.

  • Post-install verification script — scripts/verify_install.sql walks through the complete setup: checks that pg_trickle is loaded, creates a test stream table, runs a refresh, verifies the result, and cleans up. Useful for confirming a fresh installation or diagnosing environment issues.

Stability & code quality

  • Safer internal code — the number of unsafe Rust blocks in the query parser was reduced from 690 to 441 (a 36% drop) by introducing two helper macros that wrap the most common unsafe patterns. No behavior change; this makes the codebase easier to audit and maintain.

  • Cleaner internal structure — the largest source file (api.rs, ~9,400 lines) was split into three focused modules. This has no user-visible effect but makes the codebase significantly easier to work with and reduces the risk of regressions from unrelated code being in the same file.

  • Refresh logic extracted and tested — seven functions responsible for building the SQL used during refresh were extracted into standalone testable units and covered with 29 new unit tests. This catches regressions in generated SQL templates before they reach production.


[0.16.0] — 2026-04-06

Performance & Refresh Optimization. This release makes stream table refreshes significantly faster across the board. Small changes to large tables are now applied without expensive full-table scans. Tables that only receive new rows (no updates or deletes) use a streamlined path that skips unnecessary work. Aggregate queries like SUM and COUNT are refreshed with pinpoint updates instead of recalculating entire groups. A new template cache eliminates repeated startup work when database connections are recycled. An automated benchmark system now prevents future changes from accidentally slowing things down.

Highlights

  • Smarter refresh for small changes — when only a handful of rows change in a large stream table (less than 1% of total rows), pg_trickle now uses a faster strategy that skips the full-table comparison. This can reduce refresh time by up to 40% for common workloads where most data stays the same between refreshes. The system picks the best strategy automatically, but you can override it via the merge_strategy setting.

  • Insert-only fast path — stream tables backed by append-only data sources (like event logs or audit trails that never update or delete rows) are now detected automatically and refreshed using a much simpler, faster path. No configuration is needed — pg_trickle observes your data patterns and switches to the fast path on its own. If an update or delete is later detected, it safely falls back to the standard approach with a warning.

  • Faster aggregate refreshes — stream tables that use SUM, COUNT, AVG, or STDDEV aggregates now update individual groups directly instead of re-joining against the entire table. For queries with many distinct groups, this can be 5–20× faster. Non-invertible aggregates like MIN, MAX, and STRING_AGG continue using the standard path.

  • Template cache for faster cold starts — the first time a database connection refreshes a stream table, pg_trickle normally spends ~45 ms preparing the refresh query. A new cross-connection cache stores these prepared queries so that subsequent connections (including those from connection poolers like PgBouncer) start refreshing in about 1 ms instead.

  • Automated performance regression checks — every code change to pg_trickle is now automatically benchmarked before it can be merged. If any operation slows down by more than 10%, the change is blocked until the regression is fixed. This protects users from accidental performance degradation in future releases.

New features

  • Error reference guide — a new error reference page documents every error message pg_trickle can produce, explains what caused it, and suggests how to fix it. Useful when troubleshooting unexpected behavior in production.

  • Change buffer growth protection — if a stream table's refresh keeps failing, the backlog of unprocessed changes could previously grow without limit, consuming disk space. A new max_buffer_rows setting (default: 1,000,000 rows) caps this growth. When the limit is reached, pg_trickle performs a full refresh to clear the backlog and warns you about the situation.
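    Tuning the cap might look like this (a sketch; the pg_trickle. prefix on the GUC is an assumption based on the other settings in this release):

    ```sql
    -- Halve the backlog cap from the 1,000,000-row default
    ALTER SYSTEM SET pg_trickle.max_buffer_rows = 500000;
    SELECT pg_reload_conf();
    ```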

  • Automatic index creation control — pg_trickle has always created helpful indexes on stream tables automatically. A new auto_index setting lets you disable this behavior when you want full control over indexing. Stream tables using SELECT DISTINCT now also get an automatic index on their distinct columns.

  • Compaction and predicate pushdown stats — the explain_st() diagnostics function now shows additional information about change buffer compaction thresholds, merge strategy selection, append-only mode, aggregate fast-path status, and template cache hit rates.

Improved

  • Configuration guidance — the documentation now includes detailed tuning advice for the planner_aggressive and cleanup_use_truncate settings, especially for environments using connection poolers like PgBouncer or running under memory pressure.

  • Terminal dashboard improvements — the pgtrickle TUI dashboard now shows the effective refresh mode for each stream table (e.g., when a table is temporarily downgraded from differential to full refresh). The Alerts tab has been restructured with a clearer table layout and better distinction between "stale data" and "no upstream changes" conditions.

Fixed

  • Append-only detection with chained stream tables — stream tables that feed into other stream tables (cascading dependencies) now correctly skip the append-only fast path to avoid data inconsistencies. Previously, a chained stream table could incorrectly use the insert-only path even when downstream tables needed the full change set.

  • Append-only heuristic accuracy — the automatic detection of insert-only data sources now also checks the stream table's own change buffer for non-insert operations, avoiding false positives.

  • Full refresh fallback for mixed changes — when both a stream table and its source table have pending changes in the same refresh cycle, pg_trickle now correctly falls back to a full refresh to avoid inconsistencies.

  • resume_stream_table() confirmed working — the function referenced in error messages when a stream table enters SUSPENDED state was verified to exist and work correctly (present since v0.2.0).

Testing & quality

  • 13 new end-to-end tests covering JOIN correctness across update/delete cycles, window function differential behavior, differential-vs-full equivalence validation, and source table schema evolution resilience.
  • 5 new benchmark scenarios covering semi-joins, anti-joins, multi-table join chains, and aggregate queries at varying group counts. Total: 22 benchmark functions.
  • 1,700 unit tests pass (up from 1,630 in v0.15.0).

[0.15.0] — 2026-04-03

0.15.0 brings the terminal dashboard to full operational capability, adds safety features that protect against runaway refreshes, and broadens the ecosystem with guides for popular migration and ORM frameworks. It also includes a major internal refactoring of the query parser and a new streaming benchmark suite.

Highlights

  • Interactive terminal dashboard — the pgtrickle TUI is no longer read-only. Refresh, pause, resume, and repair stream tables directly from the dashboard. A command palette (:) with fuzzy search makes common operations fast. The poller reconnects automatically after network interruptions.

  • Bulk creation — pgtrickle.bulk_create() creates many stream tables in a single atomic transaction, ideal for CI/CD and dbt pipelines.

  • Runaway-refresh protection — two new safety nets prevent expensive merges from spiralling: a pre-flight row-count estimate that downgrades to FULL refresh when deltas are too large (max_delta_estimate_rows), and a spill detector that forces FULL refresh after repeated temp-file writes (spill_threshold_blocks).

  • Stuck-watermark alerting — if an upstream ETL pipeline stops advancing its watermark, pg_trickle now pauses affected stream tables and sends a watermark_stuck notification so the issue is surfaced immediately rather than silently producing stale data.

  • Integration guides — new documentation for Flyway, Liquibase, SQLAlchemy, Django, and dbt Hub helps teams adopt pg_trickle alongside their existing tooling.

New Features

  • Volatile function policy — a new volatile_function_policy setting lets you choose whether volatile functions (like random() or clock_timestamp()) should be rejected (the default), allowed with a warning, or allowed silently when creating stream tables.

  • Bulk create API — pgtrickle.bulk_create(definitions) accepts a JSON array of stream table definitions and creates them all in one transaction. If any definition fails, the entire batch is rolled back.
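    A sketch of a bulk_create call, assuming the JSON keys mirror the named arguments of create_stream_table (the exact schema is not shown in these notes; table names are illustrative):

    ```sql
    SELECT pgtrickle.bulk_create('[
      {"name": "active_orders",
       "query": "SELECT * FROM orders WHERE status = ''active''",
       "schedule": "30s"},
      {"name": "daily_totals",
       "query": "SELECT order_date, sum(amount) AS total FROM orders GROUP BY order_date",
       "schedule": "5m"}
    ]');
    ```

    If either definition fails, neither stream table is created.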

  • Enhanced diagnostics — pgtrickle.explain_st() now shows refresh timing statistics (min/max/average duration), partition info for partitioned source tables, and a dependency graph you can render with Graphviz.

  • Join strategy override — the merge_join_strategy setting lets you force a specific join method (hash_join, nested_loop, or merge_join) during delta merges, which can help when the automatic heuristic doesn't suit your workload.

  • Pre-flight delta estimation — when max_delta_estimate_rows is set, pg_trickle counts the delta rows before merging. If the count exceeds the limit, it falls back to a FULL refresh and logs a notice, preventing out-of-memory conditions on unexpectedly large change sets.

  • Spill-aware refresh — if differential merges spill to disk repeatedly (controlled by spill_threshold_blocks and spill_consecutive_limit), the scheduler switches to FULL refresh automatically.

  • Stuck watermark hold-back — the watermark_holdback_timeout setting detects watermarks that have not advanced within a configurable window. Downstream stream tables are paused and a watermark_stuck notification is emitted until the watermark advances again.

  • Cascade drop — drop_stream_table() now accepts an optional cascade parameter (default true). Setting it to false raises an error if dependent stream tables exist, matching PostgreSQL's RESTRICT behavior.
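    For example (the cascade parameter comes from these notes; the table name is illustrative):

    ```sql
    -- Default: drop this stream table and everything that depends on it
    SELECT pgtrickle.drop_stream_table('active_orders');

    -- RESTRICT-style: raise an error if dependent stream tables exist
    SELECT pgtrickle.drop_stream_table('active_orders', cascade => false);
    ```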

  • Nexmark benchmark suite — a 10-query streaming benchmark (modelled on an online auction system) validates correctness under sustained high-frequency inserts, updates, and deletes.

  • 17 new end-to-end tests — 7 tests for multi-level stream-table chains (3- and 4-level cascades with mixed refresh modes) and 10 tests for diamond/fan-in topologies with IMMEDIATE mode. No deadlocks were found.

Terminal Dashboard (TUI)

  • Write actions — refresh, pause, resume, repair, reset fuse, and gate/ungate operations can now be performed without leaving the dashboard.
  • Command palette — press : for fuzzy-matched command entry with tab-completion.
  • Automatic reconnection — the dashboard reconnects with exponential back-off (up to 15 s) after a connection loss, with a visual indicator.
  • Richer views — all 14 views now show additional live data (diagnostics, CDC health, refresh history with row-delta counts, error remediation hints, dependency-graph annotations, worker queue status, and watermark alignment).
  • Cross-view filtering — the / search filter now persists across all 10 list views.
  • Navigation re-fetch — moving between rows in the Detail view immediately fetches fresh data for the selected table.
  • Toast messages — write actions show confirmation and error toasts.
  • Sort cycling — press s / S on the Dashboard to cycle through 6 sort modes.
  • Mouse support — --mouse enables scroll-wheel navigation.
  • Theme toggle — t or --theme dark|light switches colour themes.
  • JSON export — Ctrl+E or :export writes the current view to a file.
  • TLS support — --sslmode and --sslrootcert flags.

Documentation & Ecosystem

  • Flyway / Liquibase guide — migration patterns for versioned and repeatable migrations, rollback blocks, and CI environments.
  • SQLAlchemy / Django guide — read-only model patterns, write-blocking safeguards, DRF viewsets, and freshness checking.
  • dbt Hub readiness — the dbt-pgtrickle package is version-synced and ready for dbt Hub submission.
  • Kubernetes / CNPG — updated probe configuration and a new deployment section in the Getting Started guide.
  • Full documentation review — configuration reference expanded from 23 to 40+ settings, missing SQL reference entries filled in, outdated FAQ answers corrected.

Internal Improvements

  • Parser modularisation — the 21,000-line query parser has been split into 5 focused sub-modules (types, validation, rewrites, sublinks, and the main entry point). No behavior change — all 1,687 unit tests pass.
  • Unsafe audit — every unsafe block in the codebase (~750 total) now has a // SAFETY: comment explaining why it is sound.
  • Shared-memory cache RFC — an RFC for a DSM-based MERGE template cache has been written, informing the v0.16.0 implementation plan.
  • TRUNCATE handling verified — TRUNCATE on source tables in trigger CDC mode already triggers a FULL refresh; this is now documented.
  • JOIN key-change fix verified — the v0.14.0 correctness fix for simultaneous JOIN key updates and DELETEs has been verified working and the former known-limitation note replaced with a description of the fix.

Bug Fixes

  • Fixed a panic in the TUI when deserializing health-check data that returned 64-bit integers where 32-bit was expected.
  • Fixed spurious "Error: db error" toasts in the TUI Detail view — background queries now degrade silently instead of surfacing transient errors.
  • Fixed incorrect integer type annotations in two E2E tests for IMMEDIATE mode diamond topologies.

[0.14.0] — 2026-04-02

0.14.0 is the Tiered Scheduling, Diagnostics & TUI release. It gives you fine-grained control over how often each stream table refreshes, adds tools that recommend the best refresh strategy for your workload, introduces a full-screen terminal dashboard for managing stream tables without SQL, and includes important security and reliability fixes.

Terminal Dashboard (TUI)

A new pgtrickle command-line tool lets you monitor and manage stream tables from a terminal — no SQL required. Run it with no arguments to launch a live-updating full-screen dashboard (think htop for stream tables), or use one-shot subcommands like pgtrickle list, pgtrickle status, or pgtrickle refresh for scripting and CI.

The interactive dashboard includes:

  • Live overview — stream table statuses, refresh timing, and issue counts update every 2 seconds, with color-coded health indicators.
  • Dependency graph — see how stream tables relate to each other in an ASCII tree view.
  • Diagnostics — view refresh mode recommendations with confidence levels.
  • CDC health — monitor change buffer sizes with warnings when they grow too large.
  • Alert feed — real-time notification display with severity levels.
  • Issue detection — automatically spots broken dependency chains, growing buffers, blown fuses, and stale data, with a persistent badge showing the issue count from any view.
  • Watch mode — pgtrickle watch provides continuous non-interactive output suitable for log aggregation.
  • Output formats — all CLI subcommands support --format json, --format csv, and human-readable table output.

See docs/TUI.md for the full user guide.

Tiered Refresh Scheduling

Stream tables can now be assigned to refresh tiers — hot, warm, cold, or frozen — to control how frequently they refresh:

  • Hot (default) — refreshes at the configured interval.
  • Warm — refreshes at 2× the interval.
  • Cold — refreshes at 10× the interval, ideal for infrequently accessed reports.
  • Frozen — pauses automatic refresh entirely until promoted back.

Assign a tier with ALTER STREAM TABLE ... SET (tier = 'cold'). A NOTICE is emitted when demoting from Hot to Cold or Frozen so operators are aware of the change in refresh frequency.

Smarter Refresh Recommendations

Two new diagnostic functions help you choose the most efficient refresh strategy for each stream table:

  • pgtrickle.recommend_refresh_mode(name) — analyzes seven workload signals (including change frequency, timing history, query complexity, table size, index coverage, and latency patterns) and recommends FULL or DIFFERENTIAL mode with a confidence level and plain-language explanation. Useful when you're unsure which mode will be faster for a particular table.

  • pgtrickle.refresh_efficiency(name) — shows per-table refresh performance: how many FULL vs. DIFFERENTIAL refreshes have run, average timing for each, and the speedup factor. Good for monitoring dashboards and alerting.
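    A typical workflow might be (a sketch; the exact result shape of these functions is not documented here, and the table name is illustrative):

    ```sql
    -- Ask the engine which refresh mode it expects to be faster
    SELECT * FROM pgtrickle.recommend_refresh_mode('active_orders');

    -- Then verify against observed refresh history
    SELECT * FROM pgtrickle.refresh_efficiency('active_orders');
    ```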

A new tutorial — Tuning Refresh Mode — walks through the process step by step.

Reduced Write Overhead with UNLOGGED Buffers

Enable pg_trickle.unlogged_buffers = true and newly created change buffer tables will skip write-ahead logging, reducing WAL volume by roughly 30%. This is ideal for workloads where you can tolerate a full re-sync after a crash (the extension detects the crash and re-syncs automatically).

A utility function — pgtrickle.convert_buffers_to_unlogged() — converts existing buffers in one call. Run it during a maintenance window since it briefly locks each buffer table.
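    Enabling this end to end might look like the following sketch, using the setting and function names above:

    ```sql
    -- New change buffers will be created UNLOGGED from now on
    ALTER SYSTEM SET pg_trickle.unlogged_buffers = true;
    SELECT pg_reload_conf();

    -- Convert pre-existing buffers too; this briefly locks each buffer
    -- table, so run it during a maintenance window
    SELECT pgtrickle.convert_buffers_to_unlogged();
    ```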

Instant Error Detection

Previously, when a stream table's refresh hit a permanent error (for example, a function that doesn't exist for the column type), the extension would retry several times before giving up. Now it recognizes permanent errors immediately, sets the stream table status to ERROR with a clear error message, and stops retrying. You can see the error at a glance in the stream_tables_info view or the TUI dashboard, and fix it by altering the stream table's query.

Security Hardening

  • CDC trigger functions now use SECURITY DEFINER — change-data-capture trigger functions run with the privileges of the extension owner rather than the current user, preventing privilege escalation through modified search paths.
  • Explicit SET search_path — all CDC trigger functions now set search_path to pgtrickle_changes, pg_catalog to prevent search-path manipulation attacks.

Other Improvements

  • Export definitions — pgtrickle.export_definition(name) exports a stream table's full configuration as reproducible SQL (DROP + CREATE + ALTER statements), making it easy to version-control or migrate stream table definitions between environments.
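    From psql, the exported DDL can be captured straight into a file for version control (a sketch; the stream table name is illustrative):

    ```sql
    \o active_orders.sql
    SELECT pgtrickle.export_definition('active_orders');
    \o
    ```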

  • Creation-time warnings — when creating a stream table with aggregates like MIN, MAX, or STRING_AGG in DIFFERENTIAL mode, a warning now suggests that FULL or AUTO mode may be more efficient. For algebraic aggregates (SUM/COUNT/AVG), the warning only appears when the estimated number of groups is below a configurable threshold.

  • Simplified settings — the merge_planner_hints and merge_work_mem_mb settings have been consolidated into a single planner_aggressive switch. The old setting names are still accepted for backward compatibility but are ignored in favor of the new one.

  • GHCR Docker image — a multi-architecture Docker image (ghcr.io/grove/pg_trickle) with PostgreSQL 18.3 and pg_trickle pre-installed is now published automatically on each release.

  • Pre-deployment checklist — new PRE_DEPLOYMENT.md with a 10-point checklist for production deployments.

  • Best-practice patterns guide — new PATTERNS.md with 6 common patterns: Bronze/Silver/Gold materialization, event sourcing, slowly-changing dimensions, high-fan-out topology, real-time dashboards, and tiered refresh strategies.

  • Keyless dedup fix — replaced MAX(col) with (array_agg(col))[1] for deduplicating keyless scan results; unlike MAX, this also works for types without an ordering operator.

Bug Fixes

  • ST-on-ST differential refresh — manually refreshing a stream table that reads from another stream table now uses true incremental (DIFFERENTIAL) refresh instead of falling back to a full re-scan. This matches the behavior of the automatic scheduler and is significantly faster for large tables.

  • Staleness tracking — the staleness indicator now uses the actual last refresh time instead of an internal data timestamp, making the pg_stat_stream_tables view more accurate.

Testing & Reliability

  • Soak test — a new long-running stability test validates zero worker crashes, zero ERROR states, and stable memory usage under sustained mixed workload (configurable duration, default 10 minutes).

  • Multi-database isolation test — verifies that two databases in the same PostgreSQL cluster run pg_trickle independently without interference.

  • 140 TUI tests — comprehensive unit, snapshot, and interaction tests for the terminal dashboard.

  • 23 mixed-object E2E tests — validates stream tables alongside regular PostgreSQL views, materialized views, and other objects.

  • Scheduler race fixes — eliminated flaky test failures caused by scheduler timing races and GUC leak between tests.

New SQL Functions

  • pgtrickle.recommend_refresh_mode(name) — Workload-based refresh mode recommendation
  • pgtrickle.refresh_efficiency(name) — Per-table refresh performance metrics
  • pgtrickle.export_definition(name) — Export stream table as reproducible DDL
  • pgtrickle.convert_buffers_to_unlogged() — Convert logged change buffers to UNLOGGED

New Settings

  • pg_trickle.planner_aggressive (default: true) — Consolidated switch for MERGE planner hints
  • pg_trickle.unlogged_buffers (default: false) — Create new change buffers as UNLOGGED
  • pg_trickle.agg_diff_cardinality_threshold (default: 1000) — Warn about DIFFERENTIAL mode below this group count

Deprecated

  • pg_trickle.merge_planner_hints — Use pg_trickle.planner_aggressive instead. Still accepted but ignored at runtime.
  • pg_trickle.merge_work_mem_mb — Same; use planner_aggressive instead.

Upgrading

Run ALTER EXTENSION pg_trickle UPDATE; after installing the new binaries. The upgrade adds new catalog columns, functions, and the TUI workspace member. No breaking changes — everything from v0.13.0 continues to work. See UPGRADING.md for details.


[0.13.0] — 2026-03-31

0.13.0 is the Scalability Foundations release. It makes pg_trickle handle large tables, complex queries, and multi-tenant deployments much more efficiently — and it achieves a major milestone: all 22 TPC-H benchmark queries now run in incremental (DIFFERENTIAL) mode, meaning the engine no longer needs to fall back to slow full-refresh for any standard analytical query pattern.

Smarter Change Detection for Wide Tables

When you UPDATE a few columns in a large table — say, changing a status column in a 60-column table — pg_trickle used to treat every column as potentially changed, doing extra work to keep all downstream views up to date.

Now it knows the difference. Columns used in GROUP BY, JOIN, or WHERE clauses are "key columns"; everything else is a "value column." When only value columns change, the engine takes a shortcut: it sends a single correction row instead of a full delete-and-reinsert pair. For wide-table workloads, this can cut the volume of data processed by 50% or more.
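As a hypothetical illustration (the table and stream table names here are invented for this example):

```sql
SELECT pgtrickle.create_stream_table(
    name     => 'sales_by_region',
    query    => 'SELECT region, sum(amount) AS total FROM sales GROUP BY region',
    schedule => '1m'
);

-- region is a key column (it appears in GROUP BY); amount is a value column.
-- This update touches only a value column, so the engine can apply a single
-- correction row instead of a delete-and-reinsert pair.
UPDATE sales SET amount = amount + 10 WHERE sale_id = 42;
```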

Shared Change Buffers

If you have several stream tables watching the same source table, each one used to maintain its own private copy of the change log. That's wasteful. Now they share a single change buffer per source, and each consumer simply tracks how far it has read. The slowest reader protects the buffer for everyone.

You can see how this is working with the new pgtrickle.shared_buffer_stats() function — it shows each buffer, who's reading from it, how many rows are queued, and whether it's been automatically partitioned for performance.

Automatic Buffer Partitioning

Set pg_trickle.buffer_partitioning = 'auto' and pg_trickle will start with simple, unpartitioned change buffers. If a buffer starts accumulating a lot of rows (high-throughput sources), it automatically converts to a partitioned layout where old data can be removed almost instantly instead of deleting rows one by one.

More Partitioning Options for Stream Tables

Building on the RANGE partitioning added in v0.11.0, you can now partition stream tables in three additional ways:

  • Multi-column keys — partition by a combination of columns (partition_by='region,year')
  • LIST partitioning — for low-cardinality columns like status or type (partition_by='LIST:status')
  • HASH partitioning — for even distribution across a fixed number of partitions (partition_by='HASH:customer_id:8')

You can also change the partition key of an existing stream table at runtime with alter_stream_table(partition_by => ...) — data is preserved automatically. If rows land in the default (catch-all) partition, a WARNING is emitted to prompt you to add explicit partitions.
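For example, re-keying an existing stream table (the partition_by syntax comes from these notes; table and column names are illustrative):

```sql
-- Switch to 8-way HASH partitioning; existing data is preserved automatically
SELECT pgtrickle.alter_stream_table('orders_st',
                                    partition_by => 'HASH:customer_id:8');

-- Or use a multi-column key
SELECT pgtrickle.alter_stream_table('sales_st',
                                    partition_by => 'region,year');
```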

All 22 TPC-H Queries Now Run Incrementally

The DVM (differential view maintenance) engine received its most significant set of improvements yet, targeting the complex multi-table join patterns found in standard analytical benchmarks:

  • Smarter pre-image lookups — instead of reconstructing what the data looked like before a change by subtracting deltas (expensive for large tables), the engine now uses targeted index lookups that only touch the rows that actually changed.
  • Predicate pushdown — WHERE conditions from the original query are now pushed into the delta computation, preventing unnecessary cross-products in multi-table joins.
  • Deep-join optimizations — queries joining 5+ tables get automatic planner hints (more memory, smarter join strategies) to avoid spilling to disk.
  • Scan-count-aware strategy selector — queries that exceed configurable join complexity or delta volume thresholds automatically fall back to full refresh on a per-query basis rather than failing.

The result: all 22 TPC-H queries pass at SF=0.01 in DIFFERENTIAL mode with zero drift across 3 refresh cycles. The DIFFERENTIAL_SKIP_ALLOWLIST (queries that previously required full refresh) is now empty.

Refresh Performance Inspection Tools

Two new functions help you understand what pg_trickle is doing under the hood:

  • pgtrickle.explain_delta(name, format) — shows you the query plan for the auto-generated delta SQL, the same way EXPLAIN works for regular queries. Available in text, JSON, XML, or YAML format.
  • pgtrickle.dedup_stats() — reports how often concurrent writes produce duplicate entries that need pre-processing before the MERGE step.
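    For example (assuming the same argument conventions as the rest of the pgtrickle API; the table name is illustrative):

    ```sql
    -- Inspect the auto-generated delta plan in JSON form
    SELECT pgtrickle.explain_delta('active_orders', 'json');

    -- How often do concurrent writes require pre-MERGE deduplication?
    SELECT * FROM pgtrickle.dedup_stats();
    ```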

Multi-Tenant Worker Quotas

New setting: pg_trickle.per_database_worker_quota — if you run many databases on one PostgreSQL cluster, this prevents a busy database from monopolizing all the refresh workers. Workers are assigned by priority (immediate-mode tables first, then hot, warm, and cold), with burst capacity up to 150% when other databases are idle.

TPC-H Benchmark Harness

You can now measure refresh performance across all 22 TPC-H queries in a structured way. Run just bench-tpch to get per-query timing, FULL vs. DIFFERENTIAL comparison, and P95 latency numbers. Five synthetic benchmarks (q01, q05, q08, q18, q21) also measure the pure Rust delta-SQL generation time without needing a database.

Broader SQL Support

  • IS JSON predicates (PG 16+) — expressions like expr IS JSON OBJECT now work in incremental mode.
  • SQL/JSON constructors (PG 16+) — JSON_OBJECT(...), JSON_ARRAY(...), JSON_OBJECTAGG(...), and JSON_ARRAYAGG(...) are now accepted.
  • Recursive CTEs — recursive queries with non-monotone operators (like EXCEPT) correctly fall back to full refresh instead of producing wrong results.

dbt Integration Updates

If you use dbt-pgtrickle, you can now set partitioning and fuse options directly from dbt model config:

  • {{ config(partition_by='customer_id') }} for partitioned stream tables
  • {{ config(fuse='auto', fuse_ceiling=100000, fuse_sensitivity=3) }} for circuit-breaker protection
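Put together, a dbt model using these options might look like this (a sketch: the config keys come from these notes, but the model body and ref names are invented):

```sql
-- models/orders_by_customer.sql
{{ config(
    partition_by='customer_id',
    fuse='auto',
    fuse_ceiling=100000,
    fuse_sensitivity=3
) }}

select customer_id, sum(amount) as total_amount
from {{ ref('orders') }}
group by customer_id
```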

Bug Fixes

  • Scheduler cascade fix — stream tables downstream of FULL-mode upstream tables now detect changes correctly via a last_refresh_at fallback, preventing stale data in chains where the upstream uses full refresh.
  • SUM(CASE WHEN ...) drift fix — aggregate expressions using CASE were occasionally producing slightly wrong incremental results; these are now correctly detected and processed via a group rescan.
  • Duplicate column DDL fix — removed a duplicate column definition in the pgt_stream_tables DDL that could cause issues on fresh installs.

Testing Improvements

  • New regression test suite targeting 9 structural weaknesses: join multi-cycle correctness (7 tests), differential-equals-full equivalence (11 tests), DVM operator execution, failure recovery, and MERGE template unit tests.
  • E2E test infrastructure now uses template databases, cutting per-test setup time significantly.

New SQL Functions

  • pgtrickle.explain_delta(name, format) — Show the query plan for the delta SQL
  • pgtrickle.dedup_stats() — MERGE deduplication frequency counters
  • pgtrickle.shared_buffer_stats() — Per-source change buffer status
  • pgtrickle.explain_refresh_mode(name) — Why a stream table uses its current refresh mode
  • pgtrickle.reset_fuse(name) — Reset a blown circuit-breaker fuse
  • pgtrickle.fuse_status() — Fuse state across all stream tables

New Catalog Columns

Ten new columns on pgtrickle.pgt_stream_tables:

  • effective_refresh_mode — The actual refresh mode after AUTO resolution
  • fuse_mode — Circuit-breaker configuration (off / auto / manual)
  • fuse_state — Current fuse state (armed / blown)
  • fuse_ceiling — Maximum change count before fuse blows
  • fuse_sensitivity — Consecutive cycles above ceiling before triggering
  • blown_at — When the fuse last blew
  • blow_reason — Why the fuse blew
  • st_partition_key — Partition key specification
  • max_differential_joins — Maximum join count for differential mode
  • max_delta_fraction — Maximum delta-to-table ratio for differential mode

Upgrading

Run ALTER EXTENSION pg_trickle UPDATE; after installing the new binaries. All new columns and functions are added automatically. No breaking changes — everything from v0.12.0 continues to work as before. See UPGRADING.md for details.


[0.12.0] — 2026-03-28

0.12.0 is a correctness, reliability, and developer-experience release built on top of 0.11.0's major new features. It closes the last known wrong-answer bugs for complex join queries, adds tools to help you understand and debug stream table behavior, hardens the scheduler against several edge cases that could cause stale data or crashes, and backs it all with thousands of new automatically generated tests.

Stale Rows Fixed in Stream-Table Chains

What was the problem? When a stream table (B) reads from another stream table (A), each change in A is recorded as a small "what changed" entry — a row added or removed. But the identity key used for those entries was computed differently inside the change buffer than it was inside B's own storage. As a result, when A changed via an upstream UPDATE, B's refresh could silently fail to delete the old version of a row, leaving a stale duplicate.

What changed? The change buffer now computes row identity the same way B does — using a hash of all the data columns rather than the upstream source's primary key. Stale rows after UPDATE no longer appear in stream-table chains. This bug was found and confirmed by the new property-based test suite (see below).

Phantom Rows Fixed for Complex Joins (TPC-H Q7 / Q8 / Q9)

What was the problem? When a stream table's query joins three or more tables together and rows are deleted from more than one join side at the same time, the incremental engine could silently drop the correction — leaving rows in the stream table that should have been removed.

This affected TPC-H queries Q7, Q8, and Q9 (which all involve deep join trees), and any user query with a similar multi-table join structure. A temporary workaround (falling back to full refresh for wide joins) had been in place since v0.11.0 and has now been lifted.

What changed? The incremental engine now takes an individual "before snapshot" for each leaf table in the join tree — each one cheaply computed from a single-table comparison — and re-joins them after the delete. This avoids writing multi-gigabyte temp files to disk (the root cause of the original workaround) and eliminates the phantom-row bug entirely. Q7, Q8, and Q9 now run in differential mode without any workarounds.

Type Errors Fixed in Parallel Refresh Chains

What was the problem? When a chain of stream tables is fused into a single execution unit for efficiency (the "bypass" optimisation added in v0.11.0), the internal bypass table used text for every column regardless of the actual column type. This caused an operator does not exist: text > integer error whenever a downstream stream table had a type-sensitive WHERE clause (e.g. WHERE amount > 100), causing the parallel worker tests to time out across every topology that included a fused chain.

What changed? Bypass tables now use the real column types. The six parallel-worker benchmark tests now complete in 9–26 seconds rather than timing out after 120 seconds.

Scheduler Fixes for Diamond and ST-on-ST Topologies

Two scheduler bugs that caused incorrect refresh behavior with complex dependency graphs were fixed:

  • Diamond timeout. In a diamond topology (A → B, A → C, B+C → D), the L1 arm stream tables (B and C) were created with a 1-minute fixed interval rather than a calculated schedule. This meant D never received updates within the test window. The scheduler also had a bug loading stream table records by ID that caused silent failures in parallel worker paths. Both are fixed.

  • ST-on-ST parallel workers. When an upstream stream table changed, the parallel worker paths (singleton, atomic group, immediate closure, fused chain) were not forcing a full refresh on downstream stream tables the way the main scheduler loop did. This could leave downstream tables stale. The fix ensures all parallel paths treat upstream stream-table changes the same way.

Four New Diagnostic Functions

When stream table behavior is unexpected — a wrong refresh mode, a query rewritten in a surprising way, persistent errors — understanding why previously required reading server logs or source code. Four new SQL functions expose that internal state directly in queries:

  • pgtrickle.explain_query_rewrite(query TEXT) — shows exactly how pg_trickle rewrites your query for incremental refresh: which operators were applied, how delta keys are injected, and how aggregates are classified. Useful for understanding why a query got a particular refresh mode.

  • pgtrickle.diagnose_errors(name TEXT) — shows the last 5 errors for a stream table, each classified by type (correctness, performance, configuration, infrastructure) with a suggested fix.

  • pgtrickle.list_auxiliary_columns(name TEXT) — lists the internal __pgt_* columns that pg_trickle injects into a stream table's query plan, with an explanation of each one's purpose. Helpful when SELECT * returns unexpected extra columns.

  • pgtrickle.validate_query(query TEXT) — analyses a SQL query and reports which refresh mode it would get, which SQL constructs were detected, and any warnings — all without creating a stream table.
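For instance, the two read-only diagnostics can be called straight from psql — a minimal sketch, reusing the active_orders stream table from the introduction (result column shapes are illustrative, not part of the documented interface):

```sql
-- Preview the refresh mode, detected constructs, and warnings
-- for a query, without creating a stream table
SELECT pgtrickle.validate_query(
  $$ SELECT region, SUM(amount) FROM orders GROUP BY region $$
);

-- Inspect the last 5 classified errors for an existing stream table
SELECT * FROM pgtrickle.diagnose_errors('active_orders');
```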

Multi-Column IN (subquery) Now Gives a Clear Error

What was the problem? A query like WHERE (col_a, col_b) IN (SELECT x, y FROM …) passed validation but produced silently wrong results — the engine was only matching on the first column and ignoring the second.

What changed? This construct is now detected at stream table creation time and rejected with a clear error message that recommends rewriting it as EXISTS (SELECT 1 FROM … WHERE col_a = x AND col_b = y).
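A sketch of the recommended rewrite, using hypothetical tables t and other:

```sql
-- Rejected at creation time (the second column was silently ignored):
--   ... WHERE (col_a, col_b) IN (SELECT x, y FROM other)

-- Supported equivalent using EXISTS, matching on both columns:
SELECT *
FROM t
WHERE EXISTS (
    SELECT 1
    FROM other o
    WHERE t.col_a = o.x
      AND t.col_b = o.y
);
```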

IMMEDIATE Mode Proven Correct Under High Concurrency

IMMEDIATE mode (where the stream table updates inside the same transaction as the source table change) now has a dedicated concurrency stress test: 100–120 concurrent transactions firing simultaneously against the same source table, across five scenarios (all inserts, all updates to distinct rows, all updates to the same row, all deletes, and a mixed workload). Zero lost updates, zero phantom rows, and no deadlocks were observed in any run.

Protection Against Pathological Queries

A new guard prevents a particularly deep or convoluted query from consuming all available stack space and crashing the database backend. When the query analyser recurses more than 64 levels deep (configurable via pg_trickle.max_parse_depth), it now returns a clear QueryTooComplex error instead of crashing.

Tiered Scheduling Now On By Default

The tiered scheduling feature — which automatically slows down cold (infrequently-read) stream tables and speeds up hot ones — is now enabled by default. In large deployments this reduces the scheduler's CPU usage significantly. Stream tables you query often continue refreshing at full speed. Stream tables that nobody has read recently back off gracefully.

If you rely on all stream tables refreshing at the same rate regardless of read frequency, set pg_trickle.tiered_scheduling = off.

Thousands of Automatically Generated Tests

Two new automated testing systems were added to complement the hand-written test suite:

  • Property-based tests — the test framework automatically generates thousands of random DAG shapes, schedule combinations, and edge cases and checks that the scheduler's ordering guarantees hold for all of them. If any configuration would cause a table to refresh in the wrong order or get spuriously suspended, these tests catch it.

  • SQLancer fuzzing — SQLancer generates random SQL queries and checks that pg_trickle's incremental result matches the result of running the same query directly in PostgreSQL. Any mismatch is automatically saved as a permanent regression test, and a weekly CI job keeps the fuzzing campaign running. At the time of release, zero mismatches had been found.

CDC Write-Side Benchmark Published

A new benchmark suite measures the overhead that pg_trickle's change capture triggers add to your write workload. Results across five scenarios (single-row INSERT, bulk INSERT, bulk UPDATE, bulk DELETE, concurrent writers) are published in docs/BENCHMARK.md. Use these numbers to estimate the impact before deploying pg_trickle on a write-heavy table.

MERGE Template Validation at Test Startup

The SQL templates that pg_trickle generates for applying incremental changes (the MERGE statements) are now validated with an EXPLAIN dry-run at every test startup. If a code change accidentally produces a malformed MERGE template, the tests catch it before any data is processed — rather than manifesting as a cryptic runtime error.


[0.11.0] — 2026-03-26

This is the biggest release since the initial launch. The headline features are 34× lower latency for real-time workloads, stream-table chains that now refresh incrementally (no more forced full recomputation when one stream table feeds another), declarative partitioning to cut I/O on large tables by up to 100×, a ready-to-use Prometheus and Grafana monitoring stack, and a circuit breaker to protect production databases from runaway change bursts.

34× Lower Latency — Changes Arrive Instantly

Previously, the background worker woke up on a fixed timer every ~500 ms to check for new data, even when nothing had changed. Every change had to wait up to half a second in the change buffer before being processed.

Now, when a source table is modified, the change capture trigger immediately wakes the background worker via a PostgreSQL notification channel. The worker starts processing within ~15 ms of the write committing — a 34× improvement for low-volume workloads. Under heavy DML, a 10 ms debounce window coalesces rapid notifications so the worker isn't flooded.

Event-driven wake is on by default. You can turn it off (pg_trickle.event_driven_wake = off) to revert to poll-based wake, and you can tune the debounce window with pg_trickle.wake_debounce_ms (default 10).
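For example (session-level SET shown for brevity; in practice these GUCs would usually live in postgresql.conf or be set via ALTER SYSTEM):

```sql
-- Revert to the old poll-based wake behaviour
SET pg_trickle.event_driven_wake = off;

-- Or keep event-driven wake but widen the debounce window
SET pg_trickle.wake_debounce_ms = 25;  -- default 10
```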

Stream-Table-to-Stream-Table Chains Now Refresh Incrementally

Previously, when stream table B's query read from stream table A, pg_trickle had to do a full recomputation of B every time A changed — even if only a few rows in A actually changed. For long chains (A → B → C → D), every hop was a full re-scan.

Now, stream tables can read from other stream tables incrementally. When A refreshes, the rows it added and removed are recorded in a change buffer just like a base table. B wakes up, reads only the changed rows from A, and applies a delta — not a full recomputation. Even when A does a full refresh (e.g. because its query does not support differential mode), a before/after snapshot diff is captured automatically so downstream tables still receive a small insert/delete delta rather than cascading full refreshes through the chain.

Declaratively Partitioned Stream Tables

Stream tables can now be declared with a partition key:

SELECT create_stream_table(
  'monthly_sales',
  $$ SELECT month, region, SUM(amount) FROM orders GROUP BY 1, 2 $$,
  partition_by => 'month'
);

pg_trickle creates a range-partitioned storage table and, when refreshing, automatically restricts the MERGE operation to only the partitions that contain changed rows. For large tables where changes touch only 2–3 out of 100 monthly partitions, this can reduce the MERGE I/O from 10 million rows to ~100,000 — a 100× improvement.

Ready-to-Use Prometheus and Grafana Monitoring

A complete observability stack is now included in the monitoring/ directory:

  • monitoring/prometheus/pg_trickle_queries.yml — drop-in configuration for postgres_exporter that exports 14 metrics covering refresh performance, CDC buffer sizes, staleness, error rates, and per-table status.
  • monitoring/prometheus/alerts.yml — 8 alerting rules that page you when a stream table goes stale (> 5 min), starts error-looping (≥ 3 consecutive failures), is suspended, or when the CDC buffer exceeds 1 GB.
  • monitoring/grafana/dashboards/pg_trickle_overview.json — a pre-built Grafana dashboard with six sections: cluster overview, refresh latency time-series, staleness heatmap, CDC lag, per-table drill-down, and scheduler health.
  • monitoring/docker-compose.yml — brings up PostgreSQL + pg_trickle + postgres_exporter + Prometheus + Grafana with one command (docker compose up). Grafana opens at http://localhost:3000; the dashboard shows live metrics generated by a seed workload of stream tables continuously refreshing synthetic order and product data (see monitoring/init/01_demo.sql).

No code changes are needed to use this stack with an existing pg_trickle installation.

Circuit Breaker (Fuse) — Protection Against Runaway Change Bursts

A new circuit breaker mechanism halts refresh for a stream table when its pending change count exceeds a configurable threshold. This protects your database from accidental mass-delete scripts, runaway migrations, or data imports that would otherwise trigger an unexpectedly large and expensive refresh operation.

When the fuse blows, pg_trickle sends a pgtrickle_alert PostgreSQL notification that you can subscribe to, and suspends the affected stream table. You then choose how to recover using reset_fuse():

  • reset_fuse(name, action => 'apply') — process the backlog normally (default).
  • reset_fuse(name, action => 'reinitialize') — clear the change buffer and repopulate the stream table from scratch.
  • reset_fuse(name, action => 'skip_changes') — discard the pending changes and resume without reprocessing them.

Configure per-table with alter_stream_table(fuse => 'on', fuse_ceiling => 10000) or set a global default with pg_trickle.fuse_default_ceiling. Use fuse_status() to inspect the blown/active state of all stream tables at once.
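Putting the pieces together — a sketch assuming a stream table named active_orders, with arguments following the signatures shown above:

```sql
-- Arm the fuse with a 10,000-change ceiling on one stream table
SELECT pgtrickle.alter_stream_table('active_orders',
                                    fuse => 'on',
                                    fuse_ceiling => 10000);

-- Inspect fuse state across all stream tables
SELECT * FROM pgtrickle.fuse_status();

-- After the fuse blows: clear the backlog and rebuild from scratch
SELECT pgtrickle.reset_fuse('active_orders', action => 'reinitialize');
```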

Wider Column Bitmask — No More 63-Column Limit

pg_trickle's change capture tracks which columns were actually modified in each row so that stream tables that reference only a subset of columns can ignore irrelevant updates. Previously, this optimization silently stopped working for source tables with more than 63 columns — all updates were treated as touching every column.

The bitmask has been extended from a 64-bit integer to an arbitrary-width PostgreSQL VARBIT value, removing the column count cap entirely. Existing deployments are migrated automatically (the old column value becomes NULL, which the filter treats conservatively — no rows are silently dropped). Tables with fewer than 64 columns are unaffected at the data level.

Per-Database Worker Quotas

In multi-tenant environments where multiple databases share a single PostgreSQL instance, all stream-table refresh workers previously competed for the same concurrency pool. A single busy database could crowd out others.

A new GUC pg_trickle.per_database_worker_quota sets a soft concurrency limit per database. When the rest of the cluster is lightly loaded (< 80% of available capacity in use), a database can burst to 150% of its quota. When the cluster is busy, each database is held to its base quota.

Refresh work is also now dispatched in priority order: IMMEDIATE mode tables → atomic diamond groups → singleton tables.

DAG Scheduling Performance

For deployments with chains of stream tables (A → B → C), several improvements reduce end-to-end propagation latency:

  • Fused single-consumer chains. When a stream table chain has exactly one downstream consumer at each hop, the scheduler fuses the chain into a single execution unit in one background worker. Intermediate deltas are stored in temporary in-memory tables instead of persistent change buffers, eliminating the WAL writes, index maintenance, and cleanup that would normally occur at each hop.
  • Batch coalescing. Before a downstream table reads from an upstream change buffer, redundant insert/delete pairs for the same row are cancelled out. This prevents rapid-fire upstream refreshes from accumulating duplicate work for downstream tables.
  • Adaptive dispatch polling. The parallel dispatch loop now backs off exponentially (20 ms → 200 ms) instead of using a fixed 200 ms poll, and resets to 20 ms as soon as any worker finishes. Cheap refreshes no longer wait a full 200 ms for the next tick.
  • Delta amplification warnings. When a differential refresh produces many more output rows than input rows (default threshold: 100×), a WARNING is emitted with the table name, input and output counts, and a tuning hint. explain_st() now exposes amplification_stats from the last 20 refreshes.

Smarter Diagnostics and Warnings

Several improvements to make problems visible earlier and easier to diagnose:

  • Know which refresh mode is actually running. When a stream table is set to AUTO, pg_trickle now records which mode it actually chose at each refresh (DIFFERENTIAL, FULL, etc.) in a new effective_refresh_mode column on pgt_stream_tables. A new explain_refresh_mode(name) function reports the configured mode, the actual mode used, and the reason for any downgrade — all in one query.
  • Clearer warning when a stream table falls back to full refresh. If a stream table cannot use differential mode, pg_trickle now emits a WARNING message naming the affected table and the reason. Previously this happened silently.
  • Warning when using aggregates that require full group rescans. Aggregate functions like STRING_AGG, ARRAY_AGG, and JSON_AGG require re-aggregating the entire group whenever any member changes. pg_trickle now warns at stream table creation time when such aggregates are used in DIFFERENTIAL mode, and explain_st() classifies each aggregate's maintenance strategy (incremental, auxiliary-state, or group-rescan) so you can understand the cost.
  • Better error messages. Errors for unsupported query patterns, cycle detection, upstream schema changes, and query parse failures now include a DETAIL field explaining what went wrong and a HINT field suggesting how to fix it.
  • Invalid parameter combinations are rejected at creation time. For example, using diamond_schedule_policy='slowest' without diamond_consistency='atomic' now produces a clear error at create_stream_table / alter_stream_table time rather than silently doing the wrong thing at refresh time.
  • TopK queries validate their metadata on every refresh. Stream tables defined with ORDER BY ... LIMIT N now recheck that the stored LIMIT/OFFSET metadata still matches the actual query on each refresh. On mismatch, they fall back to a full refresh with a WARNING rather than silently producing wrong results.

Safety and Reliability Improvements

  • No more crashes from schema changes. If a source table's schema changes while a refresh is running (e.g. a column is dropped), pg_trickle now catches the error, emits a structured WARNING with the table name and error details, and continues refreshing all other stream tables. The scheduler never crashes due to an individual table's error.
  • Failure injection tests. New end-to-end tests deliberately drop columns and tables mid-refresh to verify that the scheduler stays alive and other stream tables continue processing correctly.
  • Safer defaults. Three default settings have been updated to reflect production-safe behavior:
    • parallel_refresh_mode now defaults to 'on' (was 'off'). Parallel refresh has been stable for several releases; serial mode is now opt-in.
    • block_source_ddl now defaults to true. Accidental ALTER TABLE on a source table while a stream table depends on it is now blocked by default, with clear instructions on how to temporarily disable the guard if needed.
    • The invalidation ring capacity has been doubled from 32 to 128 slots, reducing the risk of invalidation events being silently discarded under rapid DDL.

Getting Started Guide Restructured

docs/GETTING_STARTED.md has been reorganised into five progressive chapters:

  1. Hello World — create your first stream table and watch it update.
  2. Joins, Aggregates & Chains — multi-table dependencies and DAG patterns.
  3. Scheduling & Backpressure — controlling refresh frequency and auto-backoff.
  4. Monitoring In Depth — using the five key diagnostic functions and the Prometheus/Grafana stack.
  5. Advanced Topics — FUSE circuit breaker, partitioned stream tables, IMMEDIATE (in-transaction) IVM, and multi-tenant worker quotas.

TPC-H Correctness Gate Added to CI

Five queries derived from the TPC-H benchmark — covering single-table GROUP BY, filter-aggregate, CASE WHEN inside SUM, a three-way join, and LEFT OUTER JOIN with GROUP BY — now run in DIFFERENTIAL mode on every push to main and daily. Any correctness mismatch between pg_trickle's incremental output and plain PostgreSQL execution fails the CI build automatically.

Docker Hub Image Improvements

The Dockerfile.hub image that is published to Docker Hub has been expanded with a comprehensive set of GUC defaults fine-tuned for production use. A new just build-hub-image recipe builds the image locally for testing.

Bug Fixes

  • Scheduler crash after event-driven wake was enabled. The background worker crashed immediately after startup when event_driven_wake = on (the default) because the LISTEN command was being issued outside of a transaction. Fixed by issuing LISTEN inside a short-lived SPI transaction at startup. (#296)
  • Spurious full refresh for non-recursive CTEs. Stream tables containing WITH clauses that were not recursive (WITH foo AS (SELECT ...)) were being incorrectly forced to FULL refresh mode. Only truly recursive CTEs (WITH RECURSIVE) require this. Non-recursive CTEs now correctly use differential mode. (#298)
  • DISTINCT ON inside a CTE body caused a parse error. When a stream table's defining query contained a WITH clause whose body used DISTINCT ON (...), the DVM query analyser failed with a parse error. The DISTINCT ON clause is now rewritten before analysis so it no longer interferes. (#300)
  • Full-refresh fallback warning now names the affected table. When pg_trickle falls back from differential to full refresh, the emitted WARNING now includes the stream table name and the reason, making it straightforward to identify which table you need to investigate. (#301)

[0.10.0] — 2026-03-25

The headline features of 0.10.0 are cloud deployment compatibility, query engine correctness, refresh performance, and improved developer experience for auto_backoff. pg_trickle now works reliably behind PgBouncer — the connection pooler used by default on Supabase, Railway, Neon, and other managed PostgreSQL platforms. A broad set of correctness issues in the incremental query engine are fixed. And several performance optimizations cut refresh time for large tables and busy deployments.

auto_backoff Is Now Much Friendlier on Developer Machines

When pg_trickle.auto_backoff = true is enabled, the scheduler automatically slows down stream tables whose refresh cost exceeds their schedule budget — a good safeguard in production. This release makes the feature safe to use alongside short schedules (e.g. '1s') in developer and CI environments:

  • Trigger threshold raised from 80 % → 95 %. Backoff now only activates when a refresh consumes more than 95 % of the schedule window. A 900 ms refresh on a 1-second schedule (90 %) used to trigger backoff; it no longer does. EC-11 operator alerting continues to fire at 80 % (unchanged) so you still get an early warning before the scheduler is actually stuck.

  • Maximum slowdown reduced from 64× → 8×. In the worst case, a stream table's effective refresh interval is now capped at 8× its configured schedule (e.g. 8 seconds for a '1s' table) instead of 64 seconds. The cap self-heals immediately: a single on-time refresh resets the factor to 1×.

  • Backoff events now emit WARNING instead of INFO. When the scheduler stretches or resets a stream table's effective interval, you will see a WARNING message in your PostgreSQL client, including the new effective interval — rather than a silent slowdown with no explanation.

  • auto_backoff now defaults to on. With the above improvements in place, the feature is safe in all environments. New installations get CPU runaway protection out of the box. To restore the old opt-in behaviour, set pg_trickle.auto_backoff = off.

Works Behind PgBouncer

PgBouncer is the most popular PostgreSQL connection pooler. In "transaction mode" — the default setting on most cloud PostgreSQL platforms — it hands a fresh database connection to every transaction, which breaks anything that assumes the same connection stays open between calls (session locks, prepared statements). pg_trickle previously relied on both. This release makes pg_trickle work correctly in such deployments.

  • Session locks replaced with row-level locking. The background scheduler now acquires a short-lived row-level lock on each stream table's catalog entry instead of a session-level advisory lock. Row-level locks are released automatically at transaction end — exactly what PgBouncer transaction mode requires. If a concurrent refresh is already running for a given stream table, the scheduler skips that cycle and retries, rather than blocking.

  • New pooler_compatibility_mode option per stream table. Setting pooler_compatibility_mode => true when creating or altering a stream table disables prepared statements and NOTIFY emissions for that table. Leave it off (the default) if you're not behind a pooler — behaviour is unchanged from v0.9.0.

  • PgBouncer tested end-to-end. A new automated test suite boots PgBouncer in transaction-pool mode alongside pg_trickle and exercises the full lifecycle: create, refresh, alter, drop — all through the pooler. Run with just test-pgbouncer.
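A sketch of opting a single stream table into pooler-safe behaviour, reusing the create_stream_table call from the introduction:

```sql
SELECT pgtrickle.create_stream_table(
    name     => 'active_orders',
    query    => 'SELECT * FROM orders WHERE status = ''active''',
    schedule => '30s',
    -- Disable prepared statements and NOTIFY for this table,
    -- as required behind PgBouncer in transaction mode
    pooler_compatibility_mode => true
);
```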

Query Engine Correctness Fixes

Several SQL patterns that appeared to work correctly could produce wrong results silently under the incremental query engine. All of the following are now fixed:

  • Recursive queries (WITH RECURSIVE) update correctly when rows are deleted. Recursive queries are used for organisation hierarchies, bill-of-materials roll-ups, graph traversals, and similar structures. In DIFFERENTIAL mode, deleting a row from the source previously caused a full recomputation (correct, but expensive — O(n)). Now pg_trickle uses the Delete-and-Rederive algorithm, updating only affected rows at O(delta) cost. Computed expressions like ancestor.path || ' > ' || node.name update correctly when any ancestor is renamed or moved.

  • SUM over a FULL OUTER JOIN no longer returns 0 instead of NULL. When matched rows on both join sides transition to matched on one side only (creating null-padded rows), the incremental SUM formula previously returned 0 instead of NULL. pg_trickle now tracks how many non-null values exist in each group and produces the correct answer without any full-group rescan.

  • Multi-source delta merging is now correct for diamond-shaped queries. A "diamond" topology is when two separate paths through the dependency graph both feed into the same stream table (e.g. table A → both B and C → D). Simultaneous changes on both paths could previously cause some corrections to be silently discarded, leaving D with wrong values. Now uses proper weight aggregation (Z-set algebra) so every correction is applied. Six property-based tests verify this for different diamond shapes.

  • Statistical aggregates (CORR, COVAR, REGR_*) now update in constant time. All twelve SQL correlation and regression functions — CORR, COVAR_POP, COVAR_SAMP, and the ten REGR_* variants — now update incrementally using running totals (Welford-style accumulation) instead of rescanning the whole group. Each changed row is processed once regardless of group size.

  • LATERAL subqueries only re-examine correlated rows. When data changes in the inner part of a LATERAL JOIN, pg_trickle previously re-ran the subquery for every row in the outer table. Now it re-runs it only for outer rows that actually correlate with the changed inner data, reducing work from proportional-to-table-size to proportional-to-changes.

  • Materialized view sources now work in DIFFERENTIAL mode. Stream tables can use a PostgreSQL materialized view as their data source when pg_trickle.matview_polling = on is set. Changes are detected by comparing snapshots, the same mechanism used for foreign table sources.

  • Five correctness bugs in the query rewriting engine fixed. These all involved edge cases in how the incremental engine translates SQL:

    • SQL comment fragments such as /* unsupported ... */ that were being injected into generated SQL and causing runtime syntax errors are now replaced with clear extension-level errors.
    • When a column-rename step (e.g. EXTRACT(year FROM orderdate) AS o_year) sits between an aggregate and its source, GROUP BY and aggregate expressions now resolve correctly.
    • EXCEPT queries wrapped in a projection no longer silently lose their row multiplicity tracking.
    • A placeholder row identifier value of zero could collide with real row hashes; changed to a sentinel value (i64::MIN) outside the normal hash range.
    • Empty scalar subqueries now raise a clear error instead of silently emitting NULL.
  • Change capture (CDC) fixes. The UPDATE trigger now correctly handles rows with NULL values in their primary key columns (previously those rows were silently dropped from the change buffer). WAL logical replication publications are automatically rebuilt when a source table is converted to partitioned after the publication was set up — previously this caused the stream table to silently stop updating. TRUNCATE followed by INSERT is handled atomically so post-TRUNCATE inserts are never lost.

Faster Refreshes

  • Automatic covering index on stream table row IDs. Stream tables with eight or fewer output columns now automatically get a covering index with INCLUDE (col1, col2, ...) on the internal __pgt_row_id column. This lets the MERGE step use index-only scans — no heap lookups for matched rows — reducing refresh time by roughly 20–50% in small-delta / large-table scenarios.

  • Change buffer compaction. When the pending change buffer grows beyond pg_trickle.compact_threshold (default 100,000 rows), pg_trickle compacts it before the next refresh cycle. INSERT→DELETE pairs that cancel each other out are eliminated; multiple sequential changes to the same row are collapsed to a single net change. Reduces delta scan overhead by 50–90% for high-churn tables. Uses change_id (not ctid) for safe operation under concurrent VACUUM.

  • Tiered refresh scheduling. Large deployments can assign stream tables to one of four tiers: Hot (refresh at the configured interval), Warm (2× interval), Cold (10× interval), or Frozen (skip until manually promoted). Gate the feature with pg_trickle.tiered_scheduling = on (default off). Set per stream table via ALTER STREAM TABLE ... SET (tier => 'warm'). Frozen stream tables are entirely skipped by the scheduler until you promote them.

  • Incremental dependency-graph updates. When a stream table is created, altered, or dropped, the internal dependency graph now updates only the affected entries instead of rebuilding the entire graph from scratch. Reduces the latency impact of DDL operations from roughly 50 ms to roughly 1 ms in deployments with 1,000+ stream tables.

  • Smarter topo-sort caching inside a scheduler tick. The ordering in which stream tables are refreshed (topological order through the dependency graph) is now computed once per scheduler tick and reused across all internal callers, eliminating redundant work.

Better Visibility Into What pg_trickle Is Doing

Several behaviours that previously happened silently now produce a short, actionable message at the moment they occur:

  • ORDER BY without LIMIT warns you at creation time. Adding ORDER BY to a stream table's defining query without also adding LIMIT has no effect: stream table storage has no guaranteed row order. pg_trickle now emits a WARNING pointing you toward the TopK pattern or suggesting you remove the ORDER BY.

  • append_only mode reversions are visible. When pg_trickle automatically exits append-only mode (because deletions or updates were detected in the source), the notice is now emitted at WARNING level (was INFO, normally suppressed) and also dispatched as a pgtrickle_alert notification.

  • Cleanup failures escalate after 3 consecutive attempts. If the background worker fails to clean up a source table 3 times in a row, the message is promoted from DEBUG1 (normally invisible) to WARNING so it appears in the server log.

  • Diamond dependency with diamond_consistency='none' now advises you. When you create a stream table that forms a diamond in the dependency graph and explicitly set diamond_consistency='none', a NOTICE advises you to consider diamond_consistency='atomic' for consistent cross-branch reads.

  • diamond_consistency now defaults to 'atomic'. New stream tables get atomic group semantics by default, meaning all branches of a diamond are refreshed together in a single savepoint before the convergence node is updated. This prevents a read from the convergence node seeing one branch partially updated and the other stale. To restore the old independent behavior, pass diamond_consistency => 'none' explicitly.

  • Adaptive fallback is visible at the default log level. When a differential refresh falls back to a full refresh because the delta is too large, the message is now emitted at NOTICE level (the default client_min_messages threshold) instead of INFO (usually suppressed in the client session).

  • CALCULATED schedule without downstream dependents notifies you. When a stream table is created with schedule='calculated' but nothing yet depends on it, a NOTICE explains that the schedule will fall back to pg_trickle.default_schedule_seconds.

  • Internal __pgt_* auxiliary columns are now documented. The hidden columns that the refresh engine may add to stream table physical storage are described in a new section of SQL_REFERENCE.md. This covers all variants from the always-present __pgt_row_id primary key through the aggregate-specific auxiliary columns for AVG, STDDEV, CORR, COVAR, REGR_*, window functions, and recursive CTE depth.

Bug Fixes

  • Scheduler no longer permanently misses stream tables created under a stale snapshot. signal_dag_invalidation is called inside the creating transaction, before it commits. If the background scheduler happened to start a new tick and capture a catalog snapshot at that exact instant, the DAG rebuild query would not see the new stream table — yet the version counter had already advanced, so the scheduler would never rebuild again, and the affected stream table would never be scheduled for refresh. Fixed by verifying that every invalidated pgt_id is present in the rebuilt DAG after each rebuild; if any are missing, the scheduler signals a full rebuild for the next tick (which starts a fresh transaction that sees all committed data) rather than accepting the stale version. Fixes CI test test_autorefresh_diamond_cascade.

Upgrade Notes

  • New catalog columns. The 0.9.0 → 0.10.0 upgrade migration adds pooler_compatibility_mode BOOLEAN and refresh_tier TEXT to pgt_stream_tables. Run ALTER EXTENSION pg_trickle UPDATE TO '0.10.0' after replacing the extension files. Verification script: scripts/check_upgrade_completeness.sh.

  • Hidden auxiliary columns for statistical aggregates. Stream tables using CORR, COVAR_POP, COVAR_SAMP, or any REGR_* aggregate will get hidden __pgt_aux_* columns when created or altered under 0.10.0. These are invisible to normal queries (excluded by the NOT LIKE '__pgt_%' convention) and managed automatically.

  • pooler_compatibility_mode is off by default. Existing stream tables are unaffected. Enable it only for stream tables accessed through PgBouncer transaction-mode pooling.

Additional Bug Fixes (2026-03-24)

Scheduler stability:

  • Scheduler no longer crashes when concurrent refreshes compete. The internal function that decides whether to skip a refresh cycle was running a locking query outside any transaction, which PostgreSQL strictly forbids. It now runs inside a proper subtransaction, eliminating the crash.

  • Auto-backoff no longer causes a transaction conflict in the background worker. When the auto-backoff feature stretches a stream table's refresh interval, it previously tried to open a new transaction inside the background worker's already-open transaction. PostgreSQL does not allow this nesting; the code path is now restructured to avoid it.

Query engine correctness:

  • Queries that filter on hidden columns now produce correct results. For example, SELECT name FROM users WHERE internal_id > 5 — where internal_id is not part of the output — could return wrong rows during incremental updates. Fixed.

  • JOIN results are correct when both joined tables change at the same time. Simultaneous changes to two stream tables connected by a JOIN could leave the output with stale or duplicated rows. Fixed.

  • NULLIF(a, b) expressions now work in incremental queries. NULLIF returns NULL when its two arguments are equal. It was not recognised by the incremental parser, causing a fallback error. Fixed.

  • LIKE and ILIKE pattern matching now work in filter conditions. Filter expressions such as WHERE name LIKE 'A%' or WHERE description ILIKE '%widget%' were not handled by the incremental engine. Fixed.

  • Subqueries with ORDER BY, LIMIT, or OFFSET are now preserved correctly. When the incremental engine reconstructed a subquery, those clauses were silently dropped. The incremental result no longer differs from a full refresh for such queries.

  • Scalar subqueries using LIMIT or OFFSET are now handled gracefully. Rather than producing a runtime error, the engine falls back to a full refresh for those cases and continues.

SQL parser:

  • Wildcard column references (table.*) now work for qualified names. A two- or three-part column reference such as schema.table.* or alias.* caused a parser crash. Fixed.

Change capture and WAL:

  • State transitions no longer stall when the WAL replication slot is behind. When a stream table moves through the TRANSITIONING state, pg_trickle now advances the WAL replication slot up-front. This eliminates a lag-check stall that could cause the transition to hang indefinitely under write-heavy workloads.

Security:

  • Several low-severity code quality and security scanner alerts from Semgrep and CodeQL are resolved. No user-visible behaviour changes.

[0.9.0] — 2026-03-20

The headline feature of 0.9.0 is incremental aggregate maintenance: when a single row changes inside a group of 100,000 rows, pg_trickle no longer has to re-scan all 100,000 rows to update COUNT, SUM, AVG, STDDEV, or VAR results. Instead it keeps running totals and adjusts them in constant time. Only MIN/MAX still needs a rescan — and only when the deleted value happens to be the current extreme.

Beyond aggregates, this release contains a broad set of performance optimizations that reduce wasted I/O during every refresh cycle, two new configuration knobs, a refresh-group management API, and several bug fixes.

Faster Aggregates

  • Constant-time COUNT, SUM, AVG: Changed rows are now applied algebraically (new_sum = old_sum + inserted − deleted) instead of re-aggregating the whole group. AVG uses hidden auxiliary SUM and COUNT columns maintained automatically on the stream table.
  • Constant-time STDDEV and VAR: Standard-deviation and variance aggregates (STDDEV_POP, STDDEV_SAMP, VAR_POP, VAR_SAMP) now use a sum-of-squares decomposition with a hidden auxiliary column, achieving the same constant-time update as COUNT/SUM/AVG.
  • MIN/MAX safety guard: Deleting the row that currently holds the minimum (or maximum) value correctly triggers a rescan of that group. Property-based tests verify this boundary.
  • Floating-point drift reset: A new setting (pg_trickle.algebraic_drift_reset_cycles) periodically forces a full recomputation to correct any floating-point rounding drift that accumulates over many incremental cycles.
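
The algebraic update for COUNT/SUM/AVG can be illustrated in plain SQL. This is only a sketch of the idea, not pg_trickle's actual generated query; the delta-table layout (a weight column of +1 for inserts and −1 for deletes) and all table/column names are assumptions:

```sql
-- Sketch: fold a delta batch into per-group running SUM and COUNT in
-- constant time per changed row, instead of re-aggregating the whole group.
UPDATE sales_summary s
SET    total = s.total + d.delta_sum,
       cnt   = s.cnt   + d.delta_cnt
FROM (
    SELECT group_id,
           SUM(amount * weight) AS delta_sum,  -- weight: +1 insert, -1 delete
           SUM(weight)          AS delta_cnt
    FROM   changed_rows
    GROUP  BY group_id
) d
WHERE  s.group_id = d.group_id;
-- AVG is derived as total / cnt from the two maintained columns.
-- (Brand-new and fully-emptied groups need an INSERT/DELETE step, omitted here.)
```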

Smarter Refresh Scheduling

  • Automatic backoff for overloaded streams: The pg_trickle.auto_backoff GUC was introduced here (default off at the time). See the v0.10.0 entry for the improved thresholds, reduced cap, and the flip to on by default.
  • Index-aware MERGE: A new threshold setting (pg_trickle.merge_seqscan_threshold, default 0.001) tells PostgreSQL to use an index lookup instead of a full table scan when only a tiny fraction of the stream table's rows are changing.

Less Wasted I/O

  • Skip unchanged columns: The scan operator now checks the CDC trigger's per-row bitmask to skip UPDATE rows where none of the columns your query actually uses were modified. For wide tables where you only reference a few columns, most UPDATE processing is eliminated.
  • Skip unchanged sources in joins: When a multi-source join query has three source tables but only one of them changed, the delta branches for the two unchanged sources are now replaced with FALSE at plan time. PostgreSQL's planner recognises those branches as empty and skips them entirely.
  • Push WHERE filters into the change scan: If your stream table's defining query has a WHERE clause (e.g. WHERE status = 'shipped'), that filter is now applied immediately after reading the change buffer — before rows enter the join or aggregate pipeline. Rows that don't match the filter are discarded right away.
  • Faster DISTINCT counting: The per-row multiplicity lookup for SELECT DISTINCT queries now uses an index-driven scalar subquery instead of a LEFT JOIN, guaranteeing I/O proportional to the number of changed rows regardless of stream table size.
  • Scalar subquery short-circuit: When a scalar subquery's inner source has no changes in the current cycle, the expensive outer-table snapshot reconstruction is skipped entirely.

Refresh Group Management

  • New SQL functions for grouping stream tables that should always be refreshed together (cross-source snapshot consistency):
    • pgtrickle.create_refresh_group(name, members, isolation)
    • pgtrickle.drop_refresh_group(name)
    • pgtrickle.refresh_groups() — lists all declared groups.
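
A hedged usage sketch of the three functions above — the member-array form and the isolation argument's accepted values are assumptions:

```sql
-- Group two stream tables so they always refresh in the same cycle
SELECT pgtrickle.create_refresh_group(
    'daily_reports',
    ARRAY['orders_summary', 'revenue_summary'],
    'snapshot'
);

-- List declared groups, then tear the group down when no longer needed
SELECT * FROM pgtrickle.refresh_groups();
SELECT pgtrickle.drop_refresh_group('daily_reports');
```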

Bug Fixes

  • Fixed a crash when internal status queries failed: The source_gates() and watermarks() SQL functions previously crashed the entire PostgreSQL backend process on any internal error. They now report a normal SQL error instead.
  • Clearer handling of window functions in expressions: Queries like CASE WHEN ROW_NUMBER() OVER (...) > 5 THEN ... were silently accepted but failed at refresh time with a confusing error. pg_trickle now automatically falls back to full refresh mode (in AUTO mode) or warns you at creation time (in explicit DIFFERENTIAL mode).

Documentation

  • Documented the known limitation that recursive CTE stream tables in DIFFERENTIAL mode fall back to full recomputation when rows are deleted or updated. Workaround: use refresh_mode = 'IMMEDIATE'.
  • Documented the pgt_refresh_groups catalog table schema and usage.
  • Documented the O(partition_size) cost of window function maintenance with mitigation strategies.

Deferred to v0.10.0

The following performance optimizations were evaluated and explicitly deferred. In every case the current behaviour is correct — these items would make certain workloads faster but carry enough implementation risk that they need more design work first:

  • Recursive CTE incremental delete/update in DIFFERENTIAL mode (P2-1)
  • SUM NULL-transition shortcut for FULL OUTER JOIN aggregates (P2-2)
  • Materialized view sources in IMMEDIATE mode (P2-4)
  • LATERAL subquery scoped re-execution (P2-6)
  • Welford auxiliary columns for CORR/COVAR/REGR_* aggregates (P3-2)
  • Merged-delta weight aggregation for multi-source deduplication (B3-2/B3-3)

Upgrade Notes

  • New SQL objects: The 0.8.0 → 0.9.0 upgrade migration adds the pgt_refresh_groups table and the restore_stream_tables function. Run ALTER EXTENSION pg_trickle UPDATE TO '0.9.0' after replacing the extension files.
  • Hidden auxiliary columns: Stream tables using AVG, STDDEV, or VAR aggregates will automatically get hidden __pgt_aux_* columns when created or altered. These columns are invisible to normal queries (filtered by the existing NOT LIKE '__pgt_%' convention) and are managed automatically.
  • PGXN publishing: Release artifacts are now automatically uploaded to PGXN via GitHub Actions.

[0.8.0] — 2026-03-17

This release focuses on making your streams easier to back up, improving reliability in complex scenarios, and solidifying the core engine through greatly expanded testing.

Added

  • Backup and Restore Support: You can now safely back up your database using the standard pg_dump and pg_restore commands. After a restore, the system automatically reconnects all streams and data queues, minimizing downtime during disaster recovery.
  • Connection Pooler Opt-In: Replaced the global PgBouncer pooler compatibility setting with a per-stream option. You can now enable connection pooling optimizations selectively on a stream-by-stream basis.

Fixed

  • Cyclic Stream Reliability: Fixed internal bugs that occasionally caused streams referencing each other in a loop to get stuck refreshing forever. Streams now accurately detect when row changes stop and naturally settle.
  • Large Dependency Chains: Fixed a crash (stack overflow) that could happen if you attempted to drop an extremely large or heavily recursive chain of stream tables sequentially.
  • Special Character Support in SQL: Handled an edge case causing errors when multi-byte characters or special non-ASCII symbols were parsed inside certain SQL commands.
  • Mac Support for Developer Tooling: Addressed a minor internal tool error stopping test components from automatically building on Apple Silicon machines.

Under the Hood Code and Testing Enhancements

  • Testing Hardening: The internal test suite was substantially overhauled and now runs tens of thousands of automated correctness checks, verifying that query results stay correct no matter how complex the joins or updates get.
  • Tooling: Began adopting cargo nextest to speed up test runs and shorten the development feedback loop.

[0.7.0] — 2026-03-16

0.7.0 makes pg_trickle easier to trust in real-world data pipelines. The big theme of this release is fewer surprises: the scheduler can now wait for late-arriving source data, some circular pipelines can run safely instead of being blocked, more queries stay on incremental refresh, and the system is better at deciding when incremental work is no longer worth it.

Added

Multi-source data can wait until it is actually ready

pg_trickle can now delay a refresh until related source tables have all caught up to roughly the same point in time. This is useful for ETL jobs where, for example, orders arrives before order_lines and refreshing too early would produce a half-finished report.

  • New watermark APIs: advance_watermark(source, watermark), create_watermark_group(name, sources[], tolerance_secs), and drop_watermark_group(name).
  • New status helpers: watermarks(), watermark_groups(), and watermark_status().
  • The scheduler now skips gated refreshes when grouped sources are too far apart and records the reason in refresh history.
  • New catalog tables store per-source watermarks and watermark group definitions.
  • 28 end-to-end tests cover normal operation, bad input, tolerance windows, and scheduler behavior.
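
A hedged sketch of the workflow, using the API names above — the watermark value type (a timestamp here) and the schema-qualified call form are assumptions:

```sql
-- Gate refreshes over this group until both sources' watermarks
-- are within 60 seconds of each other
SELECT pgtrickle.create_watermark_group(
    'order_ingest',
    ARRAY['orders', 'order_lines'],
    60
);

-- Typically called by the ETL job after each batch commits
SELECT pgtrickle.advance_watermark('orders', now());
SELECT pgtrickle.advance_watermark('order_lines', now());

-- Inspect gating state when a refresh seems delayed
SELECT * FROM pgtrickle.watermark_status();
```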

Some circular pipelines can now run safely

Stream tables that depend on each other in a loop are no longer always blocked. If the cycle is monotone and uses DIFFERENTIAL mode, pg_trickle can now keep refreshing the group until it stops changing.

  • Circular refreshes run to a fixed point, with pg_trickle.max_fixpoint_iterations as a safety limit.
  • Cycle creation and ALTER validation now check that every member is safe for convergence before allowing the loop.
  • pgtrickle.pgt_status() now reports scc_id, and pgtrickle.pgt_scc_status() shows per-cycle-group status.
  • pgtrickle.pgt_stream_tables now tracks last_fixpoint_iterations so it is easier to spot slow or unstable cycles.
  • 6 end-to-end tests cover convergence, rejection of unsafe cycles, non-convergence handling, and cleanup.

More queries stay on incremental refresh

Several query shapes that used to fall back to FULL refresh, or fail outright, now keep working in DIFFERENTIAL and AUTO mode.

  • User-defined aggregates created with CREATE AGGREGATE now work through the existing group-rescan strategy, including common extension-provided aggregates.
  • More complex OR plus subquery patterns are now rewritten correctly, including cases that need De Morgan normalization and multiple rewrite passes.
  • The rewrite pipeline has a guardrail to stop runaway branch explosion.
  • A dedicated 14-test end-to-end suite covers these previously missing cases.

Easier packaging ahead of 1.0

The release also adds infrastructure that makes evaluation and future distribution simpler.

  • Dockerfile.hub and a dedicated CI workflow can build and smoke-test a ready-to-run PostgreSQL 18 image with pg_trickle preinstalled.
  • META.json adds PGXN package metadata with release_status: "testing".
  • CNPG smoke testing is now part of the documented pre-1.0 packaging story.

Improved

Refresh strategy and performance decisions are smarter

The scheduler and refresh engine now make better choices when incremental work is likely to help and back off sooner when it is not.

  • Wide tables now use xxh64-based change detection instead of slower MD5-based comparisons.
  • Aggregate stream tables can skip expensive incremental work and jump straight to FULL refresh when the pending change set is obviously too large.
  • Strategy selection now combines a change-ratio signal with recent refresh history, which helps on workloads with uneven batch sizes.
  • DAG levels are extracted explicitly, enabling level-parallel refresh scheduling.
  • Small internal hot paths such as column-list building and LSN comparison were tightened to remove avoidable allocations.

Benchmarking is much easier to use and compare

The performance toolchain was expanded so regressions are easier to spot and large-scale behavior is easier to study.

  • Benchmarks now support per-cycle output, optional EXPLAIN ANALYZE capture, larger 1M-row runs, and more stable Criterion settings.
  • New tooling covers cross-run comparison, concurrent writers, and extra query shapes such as window, lateral, CTE, and UNION ALL workloads.
  • just bench-docker makes it easier to run Criterion inside the builder image when local linking is awkward.

Changed

Internal Code Quality: Integration Test Suite Hardening

Completed a full hardening pass of the integration test suite, bringing all items in PLAN_TEST_EVALS_INTEGRATION.md to done:

  • Multiset validation — Extracted assert_sets_equal() helper relying on EXCEPT/UNION ALL SQL logic and applied it to workflow tests to ensure storage table state correctly matches the defining query post-refresh.
  • Round-trip notifications — pg_trickle_alert notifications now verify receipt end-to-end via sqlx::PgListener.
  • DVM operators — Added unit coverage for complex semi/anti-join behaviors (multi-column, filtered, complementary), multi-table join chains for inner and full joins, and proptest! fuzz tests enforcing generated SQL invariants across INNER, SEMI, and ANTI joins.
  • Resilience and edge cases — Test coverage for ST drop cascades verifying dependent object removal, exact error escalation thresholds, and scheduler job lifecycles across queued mock states.
  • Cleanups — Standardized naming practices (test_workflow_*, test_infra_*) and eliminated clock-bound flakes by widening staleness assertions.

Internal low-level code is much safer to audit

This release cuts the amount of low-level unsafe Rust in half without changing behavior.

  • Unsafe blocks were reduced by 51%, from 1,309 to 641.
  • Repeated patterns were consolidated into a small set of documented helper functions.
  • 37 internal functions no longer need to be marked unsafe.
  • Existing unit tests continued to pass unchanged after the refactor.

[0.6.0] — 2026-03-14

Added

Idempotent DDL (create_or_replace)

New one-call function for deploying stream tables without worrying about whether they already exist. Replaces the old "check if it exists, then drop and recreate" pattern.

  • create_or_replace_stream_table() — a single function that does the right thing automatically:
    • Creates the stream table if it doesn't exist yet.
    • Does nothing if the stream table already exists with the same query and settings (logs an INFO so you know it was a no-op).
    • Updates settings (schedule, refresh mode, etc.) if only config changed.
    • Replaces the query if the defining query changed — including automatic schema migration and a full refresh.
  • dbt uses it automatically. The stream_table materialization now calls create_or_replace_stream_table() when running against pg_trickle 0.6.0+, with automatic fallback for older versions.
  • Whitespace-insensitive. Cosmetic SQL differences (extra spaces, tabs, newlines) are correctly treated as no-ops — won't trigger unnecessary rebuilds.
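
Assuming the parameter names mirror create_stream_table from the README (an assumption — consult SQL_REFERENCE.md for the exact signature), an idempotent deploy looks like:

```sql
-- Safe to run on every migration, regardless of current state
SELECT pgtrickle.create_or_replace_stream_table(
    name     => 'active_orders',
    query    => 'SELECT * FROM orders WHERE status = ''active''',
    schedule => '30s'
);
-- Same query and settings: no-op (logged at INFO).
-- Only settings changed: config updated in place.
-- Query changed: schema migration plus a full refresh.
```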

dbt Integration Enhancements

  • Check stream table health from dbt. New pgtrickle_stream_table_status() macro returns whether a stream table is healthy, stale, erroring, or paused. Pair it with the new built-in stream_table_healthy test in your schema.yml to fail CI when a stream table is behind or broken.
  • Refresh everything in the right order. New refresh_all_stream_tables run-operation refreshes all dbt-managed stream tables in dependency order. Run it after dbt run and before dbt test in your CI pipeline.

Partitioned Source Tables

Stream tables now work with PostgreSQL's declarative table partitioning — RANGE, LIST, and HASH partitioned tables all work as sources out of the box.

  • Changes in any partition are captured automatically. CDC triggers fire on the parent table so inserts, updates, and deletes in any child partition are picked up.
  • ATTACH PARTITION triggers automatic rebuild. When you attach a new partition, pg_trickle detects the structural change and rebuilds affected stream tables to include the new partition's pre-existing data.
  • WAL mode works with partitions. Publications are configured with publish_via_partition_root = true, so all partitions report changes under the parent table's identity.
  • New tutorial covering partitioned source tables, ATTACH/DETACH behavior, and known caveats (docs/tutorials/PARTITIONED_TABLES.md).

Circular Dependency Foundation

Lays the groundwork for stream tables that reference each other in a cycle (A → B → A). The actual cyclic refresh execution is planned for v0.7.0 — this release adds the detection, validation, and safety infrastructure.

  • Cycle detection. pg_trickle can now identify groups of stream tables that form circular dependencies.
  • Safety checks at creation time. Queries that can't safely participate in a cycle (those using aggregates, EXCEPT, window functions, or NOT EXISTS) are rejected with a clear error explaining why.
  • New settings:
    • pg_trickle.allow_circular (default: off) — master switch for circular dependencies.
    • pg_trickle.max_fixpoint_iterations (default: 100) — prevents runaway loops.

Source Gating Improvements

  • bootstrap_gate_status() function. Shows which sources are currently gated, when they were gated, how long the gate has been active, and which stream tables are waiting. Useful for debugging "why isn't my stream table refreshing?"
  • ETL coordination cookbook. SQL Reference now includes five step-by-step recipes for common bulk-load patterns.

More SQL Patterns Supported

Two query patterns that previously required workarounds now just work:

  • Window functions inside expressions. Queries like CASE WHEN ROW_NUMBER() OVER (...) = 1 THEN 'top' ELSE 'other' END or COALESCE(SUM() OVER (...), 0) are now accepted and produce correct results. Use FULL refresh mode for these queries — incremental (DIFFERENTIAL) refresh of window-in-expression patterns is not yet supported. Previously, the query was rejected entirely at creation time.

  • ALL (subquery) comparisons. Queries like WHERE price < ALL (SELECT price FROM competitors) are now accepted in both FULL and DIFFERENTIAL modes. Supports all comparison operators (>, >=, <, <=, =, <>) and correctly handles NULL values per the SQL standard.

Operational Safety Improvements

  • Function changes detected automatically. If a stream table's query calls a user-defined function and you update that function with CREATE OR REPLACE FUNCTION, pg_trickle detects the change and automatically rebuilds the stream table on the next cycle. No manual intervention needed.

  • WAL mode explains why it isn't activating. When cdc_mode = 'auto' and the system stays on trigger-based tracking, the scheduler now periodically logs the exact reason (e.g., "wal_level is not logical") and check_cdc_health() reports the current mode so you can diagnose the issue.

  • WAL + keyless tables rejected early. Creating a stream table with cdc_mode = 'wal' on a table that has no primary key and no REPLICA IDENTITY FULL is now rejected at creation time with a clear error — instead of silently producing incomplete results later.

  • Automatic recovery after backup/restore. When a PostgreSQL server is restored from pg_basebackup, WAL replication slots are lost. pg_trickle now detects the missing slot, automatically falls back to trigger-based tracking, and logs a WARNING so you know what happened.

Documentation

  • ALL (subquery) worked example in the SQL Reference with sample data and expected results.
  • Window-in-expression documentation showing before/after examples of the automatic rewrite.
  • Foreign table sources tutorial — step-by-step guide for using postgres_fdw foreign tables as stream table sources.

Fixed

  • create_or_replace whitespace handling. Extra spaces, tabs, and newlines in queries no longer trigger unnecessary rebuilds.
  • create_or_replace schema incompatibility detection. Incompatible column type changes (e.g., text → integer) are now properly detected and handled.

[0.5.0] — 2026-03-13

Added

Row-Level Security (RLS) Support

Stream tables now work correctly with PostgreSQL's Row-Level Security feature, which lets you control which rows different users can see.

  • Refreshes always see all data. When a stream table is refreshed, it computes the full result regardless of RLS policies on the source tables. This matches how PostgreSQL's built-in materialized views work. You then add RLS policies directly on the stream table to control who can read what.
  • Internal tables are protected. The internal change-tracking tables used by pg_trickle are shielded from RLS interference, so refreshes won't silently fail if you turn on RLS at the schema level.
  • Real-time (IMMEDIATE) mode secured. Triggers that keep stream tables updated in real time now run with elevated privileges and a locked-down search path, preventing data corruption or security bypasses.
  • RLS changes are detected automatically. If you enable, disable, or force RLS on a source table, pg_trickle detects the change and marks affected stream tables for a full rebuild.
  • New tutorial. Step-by-step guide for setting up per-tenant RLS policies on stream tables (see docs/tutorials/ROW_LEVEL_SECURITY.md).
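
Since refreshes bypass source-table RLS, read access is restricted with policies on the stream table itself. A minimal per-tenant sketch — the tenant_id column and the current_setting key are illustrative, not part of pg_trickle:

```sql
-- Lock down reads on the stream table's storage
ALTER TABLE active_orders ENABLE ROW LEVEL SECURITY;

-- Each session sees only its own tenant's rows
CREATE POLICY tenant_isolation ON active_orders
    USING (tenant_id = current_setting('app.tenant_id')::int);
```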

Source Gating for Bulk Loads

New pause/resume mechanism for large data imports. When you're loading a big batch of data into a source table, you can temporarily "gate" it to prevent the background scheduler from triggering refreshes mid-load. Once the load is done, ungate it and everything catches up in a single refresh.

  • gate_source('my_table') — pauses automatic refreshes for any stream table that depends on my_table.
  • ungate_source('my_table') — resumes automatic refreshes. All changes made during the gate are picked up in the next refresh cycle.
  • source_gates() — shows which source tables are currently gated, when they were gated, and by whom.
  • Manual refresh still works. Even while a source is gated, you can explicitly call refresh_stream_table() if needed.
  • Gating is idempotent — calling gate_source() twice is safe, and gating a source that's already gated is a no-op.
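
A typical bulk-load sequence using the functions above (the COPY source path is illustrative, and the schema-qualified call form is an assumption):

```sql
-- Pause refreshes of everything downstream of orders during the load
SELECT pgtrickle.gate_source('orders');

COPY orders FROM '/data/orders_batch.csv' WITH (FORMAT csv);

-- Resume; all changes made during the gate are absorbed in one refresh
SELECT pgtrickle.ungate_source('orders');

-- See what is currently gated
SELECT * FROM pgtrickle.source_gates();
```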

Append-Only Fast Path

Significant performance improvement for tables that only receive INSERTs (event logs, audit trails, time-series data, etc.). When you mark a stream table as append_only, refreshes skip the expensive merge logic (checking for deletes, updates, and row comparisons) and use a simple, fast insert.

  • How to use: Pass append_only => true when creating or altering a stream table.
  • Safe fallback. If a DELETE or UPDATE is detected on a source table, the extension automatically falls back to the standard refresh path and logs a warning. It won't silently produce wrong results.
  • Restrictions. Append-only mode requires DIFFERENTIAL refresh mode and source tables with primary keys.
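
A sketch of opting in at creation time, assuming append_only is passed alongside the parameters shown in the README (the exact parameter placement is an assumption):

```sql
-- INSERT-only event log: the merge step is skipped on every refresh
SELECT pgtrickle.create_stream_table(
    name        => 'event_feed',
    query       => 'SELECT user_id, event_type, ts FROM events',
    schedule    => '30s',
    append_only => true
);
```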

Usability Improvements

  • Manual refresh history. When you manually call refresh_stream_table(), the result (success or failure, timing, rows affected) is now recorded in the refresh history, just like scheduled refreshes.
  • quick_health view. A single-row health summary showing how many stream tables you have, how many are in error or stale, whether the scheduler is running, and an overall status (OK, WARNING, CRITICAL). Easy to plug into monitoring dashboards.
  • create_stream_table_if_not_exists(). A convenience function that does nothing if the stream table already exists, instead of raising an error. Makes migration scripts and deployment automation simpler.

Smooth Upgrade from 0.4.0

  • Existing installations can upgrade with ALTER EXTENSION pg_trickle UPDATE TO '0.5.0'. All new features (source gating, append-only mode, quick health view, and the new convenience functions) are included in the upgrade script.
  • The upgrade has been verified with automated tests that confirm all 40 SQL objects survive the upgrade intact.

[0.4.0] — 2026-03-12

Added

Parallel Refresh (opt-in)

Stream tables can now be refreshed in parallel, using multiple background workers instead of processing them one at a time. This can dramatically reduce end-to-end refresh latency when you have many independent stream tables.

  • Off by default. Set pg_trickle.parallel_refresh_mode = 'on' to enable. Use 'dry_run' to preview what the scheduler would do without changing behavior.
  • Automatic dependency awareness. The scheduler figures out which stream tables can safely refresh at the same time and which must wait for others. Stream tables connected by real-time (IMMEDIATE) triggers are always refreshed together to prevent race conditions.
  • Atomic groups. When a group of stream tables must succeed or fail together (e.g. diamond dependencies), all members are wrapped in a single transaction — if one fails, the whole group rolls back cleanly.
  • Worker pool controls:
    • pg_trickle.max_dynamic_refresh_workers (default 4) — cluster-wide cap on concurrent refresh workers.
    • pg_trickle.max_concurrent_refreshes — per-database dispatch cap.
  • Monitoring:
    • worker_pool_status() — shows how many workers are active and the current limits.
    • parallel_job_status(max_age_seconds) — lists recent and active refresh jobs with timing and status.
    • health_check() now warns when the worker pool is saturated or the job queue is backing up.
  • Self-healing. On startup, the scheduler automatically cleans up orphaned jobs and reclaims leaked worker slots from previous crashes.
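
A hedged rollout sketch using the GUCs and monitoring functions above — whether these settings take effect on reload rather than restart, and the pgtrickle schema qualifier on the monitoring calls, are assumptions:

```sql
-- Preview first: 'dry_run' logs what the scheduler would parallelize
ALTER SYSTEM SET pg_trickle.parallel_refresh_mode = 'dry_run';
ALTER SYSTEM SET pg_trickle.max_dynamic_refresh_workers = 8;
SELECT pg_reload_conf();

-- After checking the dry-run output, opt in for real
ALTER SYSTEM SET pg_trickle.parallel_refresh_mode = 'on';
SELECT pg_reload_conf();

-- Monitor pool saturation and recent jobs (last hour)
SELECT * FROM pgtrickle.worker_pool_status();
SELECT * FROM pgtrickle.parallel_job_status(3600);
```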

Statement-Level CDC Triggers

Change tracking triggers have been upgraded from row-level to statement-level, reducing write-side overhead for bulk INSERT and UPDATE operations. This is now the default for all new and existing stream tables. A benchmark harness is included so you can measure the difference on your own hardware.

dbt Getting Started Example

New examples/dbt_getting_started/ project with a complete, runnable dbt example showing org-chart seed data, staging views, and three stream table models. Includes an automated test script.

Fixed

Refresh Lock Not Released After Errors

Fixed a bug where refresh_stream_table() could get permanently stuck after a PostgreSQL error (e.g. running out of temp file space). The internal lock was session-level and survived transaction rollback, causing all future refreshes for that stream table to report "another refresh is already in progress". Refresh locks are now transaction-level, so they are automatically released when the transaction ends — whether it succeeds or fails.

dbt Integration Fixes

  • Fixed query quoting in dbt macros that broke when queries contained single quotes.
  • Fixed schedule = none in dbt being incorrectly mapped to SQL NULL.
  • Fixed view inlining when the same view was referenced with different aliases.

Changed

Internal Code Quality: Integration Test Suite Hardening

Completed a full hardening pass of the integration test suite, bringing all items in PLAN_TEST_EVALS_INTEGRATION.md to done:

  • Multiset validation — Extracted assert_sets_equal() helper relying on EXCEPT/UNION ALL SQL logic and applied it to workflow tests to ensure storage table state correctly matches the defining query post-refresh.

  • Round-trip notifications — pg_trickle_alert notifications now verify receipt end-to-end via sqlx::PgListener.

  • DVM operators — Added unit coverage for complex semi/anti-join behaviors (multi-column, filtered, complementary), multi-table join chains for inner and full joins, and proptest! fuzz tests enforcing generated SQL invariants across INNER, SEMI, and ANTI joins.

  • Resilience and edge cases — Test coverage for ST drop cascades verifying dependent object removal, exact error escalation thresholds, and scheduler job lifecycles across queued mock states.

  • Cleanups — Standardized naming practices (test_workflow_*, test_infra_*) and eliminated clock-bound flakes by widening staleness assertions.

  • Updated to PostgreSQL 18.3 across CI and test infrastructure.

  • Dependency updates: tokio 1.49 → 1.50 and several GitHub Actions bumps.

Breaking Changes

These behavioural changes shipped in v0.4.0. They improve usability but may require action from users upgrading from v0.3.0.

  • Schedule default changed from '1m' to 'calculated'. create_stream_table now defaults to schedule => 'calculated', which auto-computes the refresh interval from downstream dependents instead of refreshing every 1 minute. If you relied on the implicit 1-minute default, explicitly pass schedule => '1m' to preserve the old behaviour.

  • NULL schedule input rejected. Passing schedule => NULL to create_stream_table now returns an error. Use schedule => 'calculated' instead — it's explicit and self-documenting.

  • Diamond GUCs removed. The cluster-wide GUCs pg_trickle.diamond_consistency and pg_trickle.diamond_schedule_policy have been removed. Diamond behaviour is now controlled per-table via parameters on create_stream_table() / alter_stream_table(): diamond_consistency => 'atomic', diamond_schedule_policy => 'slowest'.
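For a v0.3.0 → v0.4.0 upgrade, the adjustments might look like the following sketch (the stream-table name is hypothetical; the parameter names are as documented above):

```sql
-- Keep the old implicit 1-minute interval by passing it explicitly
SELECT pgtrickle.create_stream_table(
    name     => 'daily_totals',
    query    => 'SELECT day, sum(amount) AS total FROM orders GROUP BY day',
    schedule => '1m'   -- pre-v0.4.0 implicit default
);

-- Diamond behaviour is now set per table instead of via cluster GUCs
SELECT pgtrickle.alter_stream_table(
    name                    => 'daily_totals',
    diamond_consistency     => 'atomic',
    diamond_schedule_policy => 'slowest'
);
```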


[0.3.0] — 2026-03-11

This is a correctness and hardening release. No new SQL functions, tables, or views were added — all changes are in the compiled extension code. ALTER EXTENSION pg_trickle UPDATE is safe and a no-op for schema objects.

Fixed

Incremental Correctness Fixes

All 18 previously-disabled correctness tests have been re-enabled (0 remaining). The following query patterns now produce correct results during incremental (non-full) refreshes:

  • HAVING clause threshold crossing. Queries with HAVING filters (e.g. HAVING SUM(amount) > 100) now produce correct totals when groups cross the threshold. Previously, a group gaining enough rows to meet the condition would show only the newly added values instead of the correct total.

  • FULL OUTER JOIN. Five bugs affecting incremental updates for FULL OUTER JOIN queries are fixed, including mismatched row identifiers, incorrect handling of compound GROUP BY expressions such as COALESCE(left.col, right.col), and wrong NULL handling for SUM aggregates.

  • EXISTS with HAVING subqueries. Queries using WHERE EXISTS(... GROUP BY ... HAVING ...) now work correctly — the inner GROUP BY and HAVING were previously being silently discarded.

  • Correlated scalar subqueries. Correlated subqueries in SELECT like (SELECT MAX(e.salary) FROM emp e WHERE e.dept_id = d.id) are now automatically rewritten into LEFT JOINs so the incremental engine can handle them correctly.
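The rewrite amounts to roughly the following transformation (a hand-written illustration of the idea, not the engine's literal output):

```sql
-- Before: correlated scalar subquery in the SELECT list
SELECT d.id,
       (SELECT MAX(e.salary) FROM emp e WHERE e.dept_id = d.id) AS top_salary
FROM dept d;

-- After: equivalent LEFT JOIN on a grouped derived table, which the
-- incremental engine can maintain delta-by-delta
SELECT d.id, s.top_salary
FROM dept d
LEFT JOIN (
    SELECT dept_id, MAX(salary) AS top_salary
    FROM emp
    GROUP BY dept_id
) s ON s.dept_id = d.id;
```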

Background Worker Detection on PostgreSQL 18

Fixed a bug where health_check() and the scheduler reported zero active workers on PostgreSQL 18 due to a column name change in system views.

Scheduler Stability

Fixed a bug where the scheduler launcher could get stuck in a tight loop, retrying failed database probes indefinitely instead of backing off properly.

Added

Security Tooling

Added static security analysis to the CI pipeline:

  • GitHub CodeQL — automated security scanning across all Rust source files. First scan: zero findings.
  • cargo deny — enforces a license allow-list and flags unmaintained or yanked dependencies.
  • Semgrep — custom rules that flag potentially dangerous patterns such as dynamic SQL construction and privilege escalation. Advisory-only (does not block merges).
  • Unsafe block inventory — CI tracks the count of unsafe code blocks per file and fails if any file exceeds its baseline, preventing unreviewed growth of low-level code.

[0.2.3] — 2026-03-09

Added

  • Unsafe function detection. Queries using non-deterministic functions like random() or clock_timestamp() are now rejected when creating incremental stream tables, because they can't produce reliable results. Functions like now() that return the same value within a transaction are allowed with a warning.

  • Per-table change tracking mode. You can now choose how each stream table tracks changes ('auto', 'trigger', or 'wal') via the cdc_mode parameter on create_stream_table() and alter_stream_table(), instead of relying only on the global setting.

  • CDC status view. New pgtrickle.pgt_cdc_status view shows the change tracking mode, replication slot, and transition status for every source table in one place.

  • Configurable WAL lag thresholds. The warning and critical thresholds for replication slot lag are now configurable via pg_trickle.slot_lag_warning_threshold_mb (default 100 MB) and pg_trickle.slot_lag_critical_threshold_mb (default 1024 MB), instead of being hard-coded.

  • pg_trickle_dump backup tool. New standalone CLI that exports all your stream table definitions as replayable SQL, ordered by dependency. Useful for backups before upgrades or migrations.

  • Upgrade path. ALTER EXTENSION pg_trickle UPDATE picks up all new features from this release.
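A sketch of the new per-table controls together (the table and stream-table names are hypothetical; the pgtrickle schema qualification is assumed):

```sql
-- Pin one high-churn stream table to WAL-based change tracking
SELECT pgtrickle.create_stream_table(
    name     => 'open_tickets',
    query    => 'SELECT * FROM tickets WHERE state = ''open''',
    schedule => '30s',
    cdc_mode => 'wal'   -- 'auto' (default), 'trigger', or 'wal'
);

-- Raise the slot-lag warning threshold from the 100 MB default
ALTER SYSTEM SET pg_trickle.slot_lag_warning_threshold_mb = 256;
SELECT pg_reload_conf();

-- One row per source table: tracking mode, slot, transition status
SELECT * FROM pgtrickle.pgt_cdc_status;
```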

Changed


  • After a full refresh, WAL replication slots are now advanced to the current position, preventing unnecessary WAL accumulation and false lag alarms.

  • Change buffers are now flushed after a full refresh, fixing a cycle where the scheduler would alternate endlessly between incremental and full refreshes on bulk-loaded tables.

  • IMMEDIATE mode now correctly rejects explicit WAL CDC requests with a clear error, since real-time mode uses its own trigger mechanism.

  • The pg_trickle.user_triggers setting is simplified to auto and off. The old on value still works as an alias for auto.

  • CI pipelines are faster on PRs — only essential tests run; the full suite runs on merge and daily schedule.


[0.2.2] — 2026-03-08

Added

  • Change a stream table's query. alter_stream_table now accepts a query parameter, so you can change what a stream table computes without dropping and recreating it. If the new query's columns are compatible, the underlying storage table is preserved — existing views, policies, and publications continue to work.

  • AUTO refresh mode (new default). Stream tables now default to AUTO mode, which uses fast incremental updates when the query supports it and automatically falls back to a full recompute when it doesn't. You no longer need to think about whether your query is "incremental-compatible" — just create the stream table and it picks the best strategy.

  • Version mismatch warning. The background scheduler now warns if the installed extension version doesn't match the compiled library, making it easier to spot a half-finished upgrade.

  • ORDER BY + LIMIT + OFFSET. You can now page through top-N results, e.g. ORDER BY revenue DESC LIMIT 10 OFFSET 20 to get the third page of top earners.

  • Real-time mode: recursive queries. WITH RECURSIVE queries (e.g. org-chart hierarchies) now work in IMMEDIATE mode. A depth limit (default 100) prevents infinite loops.

  • Real-time mode: top-N queries. ORDER BY ... LIMIT N queries now work in IMMEDIATE mode — the top-N rows are recomputed on every data change. Maximum N is controlled by pg_trickle.ivm_topk_max_limit (default 1000).

  • Foreign table support. Stream tables can now use foreign tables as sources. Changes are detected by comparing snapshots since foreign tables don't support triggers. Enable with pg_trickle.foreign_table_polling = on.

  • Documentation reorganization. Configuration and SQL reference docs are reorganized around practical workflows. New sections cover DDL-during-refresh behavior, standby/replica limitations, and PgBouncer constraints.
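As a sketch of the query-change workflow (names are hypothetical):

```sql
-- Swap the defining query in place; if the new column list is
-- compatible, the storage table and everything pointing at it survive
SELECT pgtrickle.alter_stream_table(
    name  => 'top_products',
    query => 'SELECT product_id, sum(revenue) AS revenue
              FROM sales
              GROUP BY product_id
              ORDER BY revenue DESC LIMIT 10 OFFSET 0'
);
```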

Changed


  • Default refresh mode changed from 'DIFFERENTIAL' to 'AUTO'.

  • Default schedule changed from '1m' to 'calculated' (automatic).

  • Default change tracking mode changed from 'trigger' to 'auto' — WAL-based tracking starts automatically when available, with trigger-based as fallback.


[0.2.1] — 2026-03-05

Added

  • Safe upgrades. New upgrade infrastructure ensures that ALTER EXTENSION pg_trickle UPDATE works correctly. A CI check detects missing functions or views in upgrade scripts, and automated tests verify that stream tables survive version-to-version upgrades intact. See docs/UPGRADING.md for the upgrade guide.

  • ORDER BY + LIMIT + OFFSET. You can now create stream tables over paged results, like "the second page of the top-100 products by revenue" (ORDER BY revenue DESC LIMIT 100 OFFSET 100).

  • 'calculated' schedule. Instead of passing SQL NULL to request automatic scheduling, you can now write schedule => 'calculated'. Passing NULL now gives a helpful error message.

  • Documentation expansion. Six new pages in the online book covering dbt integration, contributing guidelines, security policy, release process, and research comparisons with other projects.

  • Better warnings and safety checks:

    • Warning when a source table lacks a primary key (duplicate rows are handled safely but less efficiently).
    • Warning when using SELECT * (new columns added later can break incremental updates).
    • Alert when the refresh queue is falling behind (> 80% capacity).
    • Guard triggers prevent accidental direct writes to stream table storage.
    • Automatic fallback from WAL to trigger-based change tracking when the replication slot disappears.
    • Nested window functions and complex WHERE clauses with EXISTS are now handled automatically.
  • Change buffer partitioning. For high-throughput tables, change buffers can now be partitioned so that processed data is dropped efficiently.

  • Column pruning. The incremental engine now skips source columns not used in the query, reducing I/O for wide tables.

Changed


  • Default schedule changed from '1m' to 'calculated' (automatic).

  • Minimum schedule interval lowered from 60 s to 1 s.

  • Cluster-wide diamond consistency settings removed; per-table settings remain and now default to 'atomic' / 'fastest'.

Fixed

  • The 0.1.3 → 0.2.0 upgrade script was accidentally a no-op, silently skipping 11 new functions. Fixed.
  • Queries combining WITH (CTEs) and UNION ALL now parse correctly.

[0.2.0] — 2026-03-04

Added

  • Monitoring & health checks. Six new functions for inspecting your stream tables at runtime (no superuser required):

    • change_buffer_sizes() — shows how much pending change data each stream table has queued up.
    • list_sources(name) — lists all base tables that feed a given stream table, with row counts and size estimates.
    • dependency_tree() — displays an ASCII tree of how your stream tables depend on each other.
    • health_check() — quick system triage that checks whether the scheduler is running, flags tables in error or stale, and warns about large change buffers or WAL lag.
    • refresh_timeline() — recent refresh history across all stream tables, showing timing, row counts, and any errors.
    • trigger_inventory() — verifies that all required change-tracking triggers are in place and enabled.
  • IMMEDIATE refresh mode (real-time updates). New 'IMMEDIATE' mode keeps stream tables updated within the same transaction as your data changes. There's no delay — the stream table reflects changes the instant they happen. Supports window functions, LATERAL joins, scalar subqueries, and aggregate queries. You can switch between IMMEDIATE and other modes at any time using alter_stream_table.

  • Top-N queries (ORDER BY + LIMIT). Queries like SELECT ... ORDER BY score DESC LIMIT 10 are now supported. The stream table stores only the top N rows and updates efficiently.

  • Diamond dependency consistency. When multiple stream tables share common sources and feed into the same downstream table (a "diamond" pattern), they can now be refreshed as an atomic group — either all succeed or all roll back. This prevents inconsistent reads at convergence points. Controlled via the diamond_consistency parameter (default: 'atomic').

  • Multi-database auto-discovery. The background scheduler now automatically finds and services all databases on the server where pg_trickle is installed. No manual pg_trickle.database configuration required — just install the extension and the scheduler discovers it.
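The monitoring functions can be combined into a quick triage pass; a sketch, assuming they live in the pgtrickle schema:

```sql
-- Pending change volume queued per stream table
SELECT * FROM pgtrickle.change_buffer_sizes();

-- Base tables feeding one stream table, with row counts and sizes
SELECT * FROM pgtrickle.list_sources('active_orders');

-- Dependency structure and overall system health
SELECT pgtrickle.dependency_tree();
SELECT * FROM pgtrickle.health_check();

-- Recent refresh history and trigger integrity
SELECT * FROM pgtrickle.refresh_timeline();
SELECT * FROM pgtrickle.trigger_inventory();
```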

Fixed

  • Fixed IMMEDIATE mode incorrectly trying to read from change buffer tables (which don't exist in that mode) for certain aggregate queries.
  • Fixed type mismatches when join queries had unchanged source tables producing empty change sets.
  • Fixed join condition column order being swapped when the right-side table was written first in the ON clause (e.g. ON r.id = l.id).
  • Fixed dbt macros silently rolling back stream table creation because dbt wraps statements in a ROLLBACK by default.
  • Fixed LIMIT ALL being incorrectly rejected as an unsupported LIMIT clause.
  • Fixed false "query may produce incorrect incremental results" warnings on simple arithmetic like depth + 1 or path || name.
  • Fixed auto-created indexes using the wrong column name when the query had a column alias (e.g. SELECT id AS department_id).

[0.1.3] — 2026-03-02

Major hardening release with 50 improvements across correctness, robustness, operational safety, and test coverage.

Added

  • DDL change tracking expanded. ALTER TYPE, ALTER POLICY, and ALTER DOMAIN on source tables are now detected and trigger a rebuild of affected stream tables. Previously only column changes were tracked.
  • Recursive query safety guard. Recursive CTEs (WITH RECURSIVE) are now checked for non-monotonic terms that could produce incorrect incremental results.
  • Read replica awareness. The background scheduler detects when it's running on a read replica and skips refresh work, preventing errors.
  • Range aggregates rejected. RANGE_AGG and RANGE_INTERSECT_AGG are now properly rejected in incremental mode with a clear error.
  • Refresh history: row counts. Refresh history now records how many rows were inserted, updated, and deleted in each refresh cycle.
  • Change buffer alerts. New pg_trickle.buffer_alert_threshold setting lets you configure when to be warned about growing change buffers.
  • st_auto_threshold() function. Shows the current adaptive threshold that decides when to switch between incremental and full refresh.
  • Wide table optimization. Tables with more than 50 columns use a hash shortcut during refresh merges, improving performance.
  • Change buffer security. Internal change buffer tables are no longer accessible to PUBLIC.
  • Documentation. PgBouncer compatibility, keyless table limitations, delta memory bounds, sequential processing rationale, and connection overhead are all now documented in the FAQ.

TPC-H Correctness Suite: 22/22 Queries Passing

The TPC-H-derived correctness test suite (22 industry-standard analytical queries) now passes completely across multiple rounds of data changes. This validates that incremental refreshes produce identical results to full recomputation for complex real-world query patterns.

Fixed

Window Function Correctness

Fixed incremental maintenance of window functions (ROW_NUMBER, RANK, DENSE_RANK, NTILE, LAG/LEAD, SUM OVER, etc.) to correctly handle:

  • Non-RANGE frame types
  • Ranking functions over tied values
  • Window functions wrapping aggregates (e.g. RANK() OVER (ORDER BY SUM(x)))
  • Multiple window functions with different PARTITION BY clauses

INTERSECT / EXCEPT Correctness

Fixed incremental maintenance of INTERSECT and EXCEPT queries that produced wrong results due to invalid SQL generation.

EXISTS / IN with OR Correctness

Fixed EXISTS and IN subqueries combined with OR in WHERE clauses that produced wrong results.

Aggregate Correctness

  • MIN / MAX now correctly rescan the source table when the current minimum or maximum value is deleted.
  • STRING_AGG(... ORDER BY ...) and ARRAY_AGG(... ORDER BY ...) no longer silently drop the ORDER BY clause.
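The MIN/MAX case is easy to see with a tiny scenario (hypothetical table):

```sql
CREATE TABLE readings (sensor int, value int);
INSERT INTO readings VALUES (1, 10), (1, 30);
-- A stream table over SELECT sensor, MAX(value) ... GROUP BY sensor
-- currently stores (1, 30).

DELETE FROM readings WHERE value = 30;
-- The deleted row WAS the maximum, so the delta alone cannot reveal the
-- new MAX; the engine now rescans the group and lands on (1, 10).
```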

[0.1.2] — 2026-02-28

Changed


Project Renamed from pg_stream to pg_trickle

Renamed the entire project from pg_stream to pg_trickle to avoid a naming collision with an unrelated project. If you were using the old name, all configuration prefixes changed from pg_stream.* to pg_trickle.*, and the SQL schemas changed from pgstream to pgtrickle. The "stream tables" terminology is unchanged.

Fixed

Fixed numerous incremental computation bugs discovered while building a comprehensive correctness test suite based on all 22 TPC-H analytical queries:

  • Inner join double-counting. When both sides of a join had changes in the same refresh cycle, some rows were counted twice.
  • Shared source cleanup. Cleaning up processed changes for one stream table could accidentally delete entries still needed by another stream table sharing the same source.
  • Scalar aggregate identity mismatch. Queries like SELECT SUM(amount) FROM orders could produce mismatched row identifiers between the incremental and merge phases. AVG also failed to recompute correctly after partial group changes.
  • EXISTS / NOT EXISTS snapshots. Incremental maintenance of EXISTS and NOT EXISTS subqueries missed pre-change state, producing wrong results.
  • Column resolution in complex joins. Several fixes for column name resolution in multi-table joins and nested subqueries.
  • COUNT(*) rendering. COUNT(*) was sometimes rendered as COUNT() (missing the star), causing SQL errors.
  • Subquery rewriting. Several subquery patterns (correlated vs non-correlated scalar subqueries, derived tables in FROM) were incorrectly rewritten, blocking certain queries from being created.
  • Cleanup worker crash. The background cleanup worker no longer crashes when it encounters entries for stream tables that were dropped mid-cycle.

Added

TPC-H Correctness Test Suite

Added a comprehensive correctness test suite based on all 22 TPC-H analytical queries. These tests verify that incremental refreshes produce identical results to a full recompute after INSERT, DELETE, and UPDATE mutations. 20 of 22 queries can be created as stream tables; 15 pass full correctness checks at this point (improved to 22/22 in v0.1.3).


[0.1.1] — 2026-02-26

Changed


CloudNativePG Extension Image

Replaced the full PostgreSQL Docker image (~400 MB) with a minimal extension-only image (< 10 MB) following the CloudNativePG Image Volume Extensions specification. This means faster pulls and less disk usage in Kubernetes deployments. The image contains just the extension files — no full PostgreSQL server.


[0.1.0] — 2026-02-26

Initial release of pg_trickle — a PostgreSQL extension that keeps query results automatically up to date as your data changes.

Core Concept

Define a SQL query and a schedule. pg_trickle creates a stream table that stores the query's results and keeps them fresh — either on a schedule (every N seconds) or in real time. When data in your source tables changes, only the affected rows are recomputed instead of re-running the entire query.

What You Can Do

  • Create stream tables from SELECT queries — joins, aggregates, subqueries, CTEs, window functions, set operations, and more.
  • Automatic refresh — a background scheduler refreshes stream tables in dependency order. You can also trigger refreshes manually.
  • Incremental updates — the engine automatically figures out how to update only the rows that changed, instead of recomputing everything. This works for most query patterns including multi-table joins and aggregates.
  • Views as sources — views referenced in your query are automatically expanded so change tracking works on the underlying tables.
  • Tables without primary keys — supported via content hashing. Tables with primary keys get better performance.
  • Hybrid change tracking — starts with lightweight triggers (no special PostgreSQL configuration needed). Can automatically switch to WAL-based tracking for lower overhead when wal_level = logical is available.
  • Multi-database support — the scheduler automatically discovers all databases on the server where the extension is installed.
  • User triggers on stream tables — your own AFTER triggers on stream tables fire correctly during incremental refreshes.
  • DDL awareness — ALTER TABLE, DROP TABLE, CREATE OR REPLACE FUNCTION, and other DDL on source tables or functions used in your query are detected and handled automatically.
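Several of these capabilities compose naturally; a sketch with hypothetical tables and names:

```sql
-- A plain view over a join; pg_trickle expands it so change tracking
-- attaches to orders and customers underneath
CREATE VIEW region_revenue AS
SELECT c.region, sum(o.amount) AS total
FROM orders o
JOIN customers c ON c.id = o.customer_id
GROUP BY c.region;

SELECT pgtrickle.create_stream_table(
    name     => 'region_totals',
    query    => 'SELECT * FROM region_revenue',
    schedule => '1m'
);

-- Manual refresh, outside the scheduler
SELECT pgtrickle.refresh_stream_table('region_totals');
```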

SQL Support

Broad coverage of SQL features:

  • Joins: INNER, LEFT, RIGHT, FULL OUTER, NATURAL, LATERAL subqueries, LATERAL set-returning functions (unnest, jsonb_array_elements, etc.)
  • Aggregates: 39 functions including COUNT, SUM, AVG, MIN, MAX, STRING_AGG, ARRAY_AGG, JSON_ARRAYAGG, JSON_OBJECTAGG, statistical regression functions (CORR, COVAR_, REGR_), and ordered-set aggregates (MODE, PERCENTILE_CONT, PERCENTILE_DISC)
  • Window functions: ROW_NUMBER, RANK, DENSE_RANK, NTILE, LAG, LEAD, SUM OVER, etc. with full frame clause support
  • Set operations: UNION, UNION ALL, INTERSECT, EXCEPT
  • Subqueries: in FROM, EXISTS/NOT EXISTS, IN/NOT IN, scalar subqueries
  • CTEs: WITH and WITH RECURSIVE
  • Special syntax: DISTINCT, DISTINCT ON, GROUPING SETS / CUBE / ROLLUP, CASE WHEN, COALESCE, JSON_TABLE (PostgreSQL 17+)
  • Unsafe function detection: queries using non-deterministic functions like random() are rejected with a clear error

Monitoring

  • explain_st() — shows the incremental computation plan
  • st_refresh_stats(), get_refresh_history(), get_staleness() — refresh performance and status
  • slot_health() — WAL replication slot health
  • check_cdc_health() — change tracking health per source table
  • stream_tables_info and pg_stat_stream_tables views
  • NOTIFY alerts for stale data, errors, and refresh events

Documentation

  • Architecture guide, SQL reference, configuration reference, FAQ, getting-started tutorial, and deep-dive tutorials.

Known Limitations

  • TABLESAMPLE, LIMIT / OFFSET, FOR UPDATE / FOR SHARE — not yet supported (clear error messages).
  • Window functions inside expressions (e.g. CASE WHEN ROW_NUMBER() ...) — not yet supported.
  • Circular stream table dependencies — not yet supported.

pg_trickle — Project Roadmap

Last updated: 2026-04-13
Latest release: 0.19.0 (2026-04-13)
Current milestone: v0.21.0 — PostgreSQL 17 Support

For a concise description of what pg_trickle is and why it exists, read ESSENCE.md — it explains the core problem (full REFRESH MATERIALIZED VIEW recomputation), how the differential dataflow approach solves it, the hybrid trigger→WAL CDC architecture, and the broad SQL coverage, all in plain language.


Overview

pg_trickle is a PostgreSQL 18 extension that implements streaming tables with incremental view maintenance (IVM) via differential dataflow. The extension is designed for maximum performance, low latency, and high throughput — differential refresh is the default mode, and full refresh is a fallback of last resort. All 13 design phases are complete. This roadmap tracks the path from the v0.1.x series to 1.0 and beyond.

| Version | Theme | Status |
| --- | --- | --- |
| v0.1.x | Core engine, DVM, CDC, scheduling, monitoring | ✅ Released |
| v0.2.0 | TopK, diamond consistency, transactional IVM | ✅ Released |
| v0.2.1 | Upgrade infrastructure & documentation | ✅ Released |
| v0.2.2 | OFFSET, AUTO mode, ALTER QUERY, CDC hardening | ✅ Released |
| v0.2.3 | Non-determinism, CDC/mode gaps, operational polish | ✅ Released |
| v0.3.0 | DVM correctness, SAST & test coverage | ✅ Released |
| v0.4.0 | Parallel refresh & performance hardening | ✅ Released |
| v0.5.0 | Row-level security & operational controls | ✅ Released |
| v0.6.0 | Partitioning, idempotent DDL, circular dependency foundation | ✅ Released |
| v0.7.0 | Performance, watermarks, circular DAG, observability | ✅ Released |
| v0.8.0 | pg_dump support & test hardening | ✅ Released |
| v0.9.0 | Incremental aggregate maintenance | ✅ Released |
| v0.10.0 | DVM hardening, connection pooler compat, refresh optimizations | ✅ Released |
| v0.11.0 | Partitioned stream tables, Prometheus/Grafana, safety hardening | ✅ Released |
| v0.12.0 | Correctness, reliability & developer tooling | ✅ Released |
| v0.13.0 | Scalability foundations, MERGE profiling, multi-tenant scheduling | ✅ Released |
| v0.14.0 | Tiered scheduling, UNLOGGED buffers & diagnostics | ✅ Released |
| v0.15.0 | External test suites & integration | ✅ Released |
| v0.16.0 | Performance & refresh optimization | ✅ Released |
| v0.17.0 | Query intelligence & stability | ✅ Released |
| v0.18.0 | Hardening & delta performance | ✅ Released |
| v0.19.0 | Production gap closure & distribution | ✅ Released |
| v0.20.0 | Dog-feeding (pg_trickle monitors itself) | ✅ Released |
| v0.21.0 | PostgreSQL 17 support | Planned |
| v0.22.0 | PGlite proof of concept | Planned |
| v0.23.0 | Core extraction (pg_trickle_core) | Planned |
| v0.24.0 | PGlite WASM extension | Planned |
| v0.25.0 | PGlite reactive integration | Planned |
| v1.0.0 | Stable release (incl. PG 19 compatibility) | Planned |

v0.1.x Series — Released


v0.1.0 — Released (2026-02-26)

Status: Released — all 13 design phases implemented.

Core engine, DVM with 21 OpTree operators, trigger-based CDC, DAG-aware scheduling, monitoring, dbt macro package, and 1,300+ tests.

Key additions over pre-release:

  • WAL decoder pgoutput edge cases (F4)
  • JOIN key column change limitation docs (F7)
  • Keyless duplicate-row behavior documented (F11)
  • CUBE explosion guard (F14)

v0.1.1 — Released (2026-02-27)

Patch release: WAL decoder keyless pk_hash fix (F2), old_* column population for UPDATEs (F3), and delete_insert merge strategy removal (F1).

v0.1.2 — Released (2026-02-28)

Patch release: ALTER TYPE/POLICY DDL tracking (F6), window partition key E2E tests (F8), PgBouncer compatibility docs (F12), read replica detection (F16), SPI retry with SQLSTATE classification (F29), and 40+ additional E2E tests.

v0.1.3 — Released (2026-03-01)

Patch release: Completed 50/51 SQL_GAPS_7 items across all tiers. Highlights:

  • Adaptive fallback threshold (F27), delta change metrics (F30)
  • WAL decoder hardening: replay deduplication, slot lag alerting (F31–F38)
  • TPC-H 22-query correctness baseline (22/22 pass, SF=0.01)
  • 460 E2E tests (≥ 400 exit criterion met)
  • CNPG extension image published to GHCR

See CHANGELOG.md for the full feature list.


v0.2.0 — TopK, Diamond Consistency & Transactional IVM

Status: Released (2026-03-04).

The 51-item SQL_GAPS_7 correctness plan was completed in v0.1.x. v0.2.0 delivers three major feature additions.

Completed items (click to expand)
| Tier | Items | Status |
|---|---|---|
| 0 — Critical | F1–F3, F5–F6 | ✅ Done in v0.1.1–v0.1.3 |
| 1 — Verification | F8–F10, F12 | ✅ Done in v0.1.2–v0.1.3 |
| 2 — Robustness | F13, F15–F16 | ✅ Done in v0.1.2–v0.1.3 |
| 3 — Test coverage | F17–F26 (62 E2E tests) | ✅ Done in v0.1.2–v0.1.3 |
| 4 — Operational hardening | F27–F39 | ✅ Done in v0.1.3 |
| 4 — Upgrade migrations | F40 | ✅ Done in v0.2.1 |
| 5 — Nice-to-have | F41–F51 | ✅ Done in v0.1.3 |

TPC-H baseline: 22/22 queries pass deterministic correctness checks across multiple mutation cycles (just test-tpch, SF=0.01).

Queries are derived from the TPC-H Benchmark specification; results are not comparable to published TPC results. TPC Benchmark™ is a trademark of TPC.

ORDER BY / LIMIT / OFFSET — TopK Support ✅

In plain terms: Stream tables can now be defined with ORDER BY ... LIMIT N — for example "keep the top 10 best-selling products". When the underlying data changes, only the top-N slot is updated incrementally rather than recomputing the entire sorted list from scratch every tick.

ORDER BY ... LIMIT N defining queries are accepted and refreshed correctly. All 9 plan items (TK1–TK9) implemented, including 5 TPC-H queries with ORDER BY restored (Q2, Q3, Q10, Q18, Q21).

| Item | Description | Status |
|---|---|---|
| TK1 | E2E tests for FETCH FIRST / FETCH NEXT rejection | ✅ Done |
| TK2 | OFFSET without ORDER BY warning in subqueries | ✅ Done |
| TK3 | detect_topk_pattern() + TopKInfo struct in parser.rs | ✅ Done |
| TK4 | Catalog columns: pgt_topk_limit, pgt_topk_order_by | ✅ Done |
| TK5 | TopK-aware refresh path (scoped recomputation via MERGE) | ✅ Done |
| TK6 | DVM pipeline bypass for TopK tables in api.rs | ✅ Done |
| TK7 | E2E + unit tests (e2e_topk_tests.rs, 18 tests) | ✅ Done |
| TK8 | Documentation (SQL Reference, FAQ, CHANGELOG) | ✅ Done |
| TK9 | TPC-H: restored ORDER BY + LIMIT in Q2, Q3, Q10, Q18, Q21 | ✅ Done |

See PLAN_ORDER_BY_LIMIT_OFFSET.md.
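A TopK defining query can be sketched with the create_stream_table API from the introduction (the table and column names here are illustrative, not from the test suite):

```sql
-- Maintain the top 10 best-selling products; on refresh only the
-- affected top-N slots are recomputed via MERGE rather than
-- re-sorting the whole result every tick.
SELECT pgtrickle.create_stream_table(
    name     => 'top_products',
    query    => 'SELECT product_id, sum(quantity) AS sold
                 FROM order_items
                 GROUP BY product_id
                 ORDER BY sold DESC
                 LIMIT 10',
    schedule => '30s'
);
```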

Diamond Dependency Consistency ✅

In plain terms: A "diamond" is when two stream tables share the same source (A → B, A → C) and a third (D) reads from both B and C. Without special handling, updating A could refresh B before C, leaving D briefly in an inconsistent state where it sees new-B but old-C. This groups B and C into an atomic refresh unit so D always sees them change together in a single step.

Atomic refresh groups eliminate the inconsistency window in diamond DAGs (A→B→D, A→C→D). All 8 plan items (D1–D8) implemented.

| Item | Description | Status |
|---|---|---|
| D1 | Data structures (Diamond, ConsistencyGroup) in dag.rs | ✅ Done |
| D2 | Diamond detection algorithm in dag.rs | ✅ Done |
| D3 | Consistency group computation in dag.rs | ✅ Done |
| D4 | Catalog columns + GUCs (diamond_consistency, diamond_schedule_policy) | ✅ Done |
| D5 | Scheduler wiring with SAVEPOINT loop | ✅ Done |
| D6 | Monitoring function pgtrickle.diamond_groups() | ✅ Done |
| D7 | E2E test suite (tests/e2e_diamond_tests.rs) | ✅ Done |
| D8 | Documentation (SQL_REFERENCE.md, CONFIGURATION.md, ARCHITECTURE.md) | ✅ Done |

See PLAN_DIAMOND_DEPENDENCY_CONSISTENCY.md.
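A minimal diamond can be sketched like this (schema and queries are illustrative; only pgtrickle.diamond_groups() is named in the plan above):

```sql
-- B and C both derive from orders; D reads from both — a diamond.
SELECT pgtrickle.create_stream_table(
    name  => 'active_orders_d',
    query => 'SELECT id, customer_id FROM orders WHERE status = ''active''');
SELECT pgtrickle.create_stream_table(
    name  => 'order_counts',
    query => 'SELECT customer_id, count(*) AS n FROM orders GROUP BY customer_id');
SELECT pgtrickle.create_stream_table(
    name  => 'dashboard',
    query => 'SELECT a.id, c.n
              FROM active_orders_d a JOIN order_counts c USING (customer_id)');

-- Inspect the detected atomic refresh groups (D6):
SELECT * FROM pgtrickle.diamond_groups();
```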

Transactional IVM — IMMEDIATE Mode ✅

In plain terms: Normally stream tables refresh on a schedule (every N seconds). IMMEDIATE mode updates the stream table inside the same database transaction as the source table change — so by the time your INSERT/UPDATE/DELETE commits, the stream table is already up to date. Zero lag, at the cost of a slightly slower write.

New IMMEDIATE refresh mode that updates stream tables within the same transaction as base table DML, using statement-level AFTER triggers with transition tables. Phase 1 (core engine) and Phase 3 (extended SQL support) are complete. Phase 2 (pg_ivm compatibility layer) is postponed. Phase 4 (performance optimizations) has partial completion (delta SQL template caching).

| Item | Description | Status |
|---|---|---|
| TI1 | RefreshMode::Immediate enum, catalog CHECK, API validation | ✅ Done |
| TI2 | Statement-level IVM trigger functions with transition tables | ✅ Done |
| TI3 | DeltaSource::TransitionTable — Scan operator dual-path | ✅ Done |
| TI4 | Delta application (DELETE + INSERT ON CONFLICT) | ✅ Done |
| TI5 | Advisory lock-based concurrency (IvmLockMode) | ✅ Done |
| TI6 | TRUNCATE handling (full refresh of stream table) | ✅ Done |
| TI7 | alter_stream_table mode switching (DIFFERENTIAL↔IMMEDIATE, FULL↔IMMEDIATE) | ✅ Done |
| TI8 | Query restriction validation (validate_immediate_mode_support) | ✅ Done |
| TI9 | Delta SQL template caching (thread-local IVM_DELTA_CACHE) | ✅ Done |
| TI10 | Window functions, LATERAL, scalar subqueries in IMMEDIATE mode | ✅ Done |
| TI11 | Cascading IMMEDIATE stream tables (ST_A → ST_B) | ✅ Done |
| TI12 | 29 E2E tests + 8 unit tests | ✅ Done |
| TI13 | Documentation (SQL Reference, Architecture, FAQ, CHANGELOG) | ✅ Done |

Remaining performance optimizations (ENR-based transition table access, aggregate fast-path, C-level trigger functions, prepared statement reuse) are tracked under post-1.0 A2.

See PLAN_TRANSACTIONAL_IVM.md.
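Creating an IMMEDIATE-mode stream table might look like this (a sketch; the refresh_mode parameter name is an assumption based on the mode-switching API in TI7):

```sql
SELECT pgtrickle.create_stream_table(
    name         => 'active_orders_now',
    query        => 'SELECT * FROM orders WHERE status = ''active''',
    refresh_mode => 'IMMEDIATE'
);

BEGIN;
INSERT INTO orders (id, status) VALUES (43, 'active');
-- The statement-level trigger has already applied the delta,
-- inside this same transaction:
SELECT count(*) FROM active_orders_now;
COMMIT;
```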

Exit criteria:

  • ORDER BY ... LIMIT N (TopK) defining queries accepted and refreshed correctly
  • TPC-H queries Q2, Q3, Q10, Q18, Q21 pass with original LIMIT restored
  • Diamond dependency consistency (D1–D8) implemented and E2E-tested
  • IMMEDIATE refresh mode: INSERT/UPDATE/DELETE on base table updates stream table within the same transaction
  • Window functions, LATERAL, scalar subqueries work in IMMEDIATE mode
  • Cascading IMMEDIATE stream tables (ST_A → ST_B) propagate correctly
  • Concurrent transaction tests pass

v0.2.1 — Upgrade Infrastructure & Documentation

Status: Released (2026-03-05).

Patch release focused on upgrade safety, documentation, and three catalog schema additions via sql/pg_trickle--0.2.0--0.2.1.sql:

Completed items (click to expand)
  • has_keyless_source BOOLEAN NOT NULL DEFAULT FALSE — EC-06 keyless source flag; changes apply strategy from MERGE to counted DELETE when set.
  • function_hashes TEXT — EC-16 function-body hash map; forces a full refresh when a referenced function's body changes silently.
  • topk_offset INT — OS2 catalog field, pre-provisioned in this release for the paged TopK OFFSET feature completed in v0.2.2.

Upgrade Migration Infrastructure ✅

In plain terms: When you run ALTER EXTENSION pg_trickle UPDATE, all your stream tables should survive intact. This adds the safety net that makes that true: automated scripts that check every upgrade script covers all database objects, real end-to-end tests that actually perform the upgrade in a test container, and CI gates that catch regressions before they reach users.

Complete safety net for ALTER EXTENSION pg_trickle UPDATE:

| Item | Description | Status |
|---|---|---|
| U1 | scripts/check_upgrade_completeness.sh — CI completeness checker | ✅ Done |
| U2 | sql/archive/ with archived SQL baselines per version | ✅ Done |
| U3 | tests/Dockerfile.e2e-upgrade for real upgrade tests | ✅ Done |
| U4 | 6 upgrade E2E tests (function parity, stream table survival, etc.) | ✅ Done |
| U5 | CI: upgrade-check (every PR) + upgrade-e2e (push-to-main) | ✅ Done |
| U6 | docs/UPGRADING.md user-facing upgrade guide | ✅ Done |
| U7 | just check-upgrade, just build-upgrade-image, just test-upgrade | ✅ Done |
| U8 | Fixed 0.1.3→0.2.0 upgrade script (was no-op placeholder) | ✅ Done |

Documentation Expansion ✅

In plain terms: Added six new pages to the documentation book: a dbt integration guide, contributing guide, security policy, release process, a comparison with the pg_ivm extension, and a deep-dive explaining why row-level triggers were chosen over logical replication for CDC.

GitHub Pages book grew from 14 to 20 pages:

| Page | Section | Source |
|---|---|---|
| dbt Integration | Integrations | dbt-pgtrickle/README.md |
| Contributing | Reference | CONTRIBUTING.md |
| Security Policy | Reference | SECURITY.md |
| Release Process | Reference | docs/RELEASE.md |
| pg_ivm Comparison | Research | plans/ecosystem/GAP_PG_IVM_COMPARISON.md |
| Triggers vs Replication | Research | plans/sql/REPORT_TRIGGERS_VS_REPLICATION.md |

Exit criteria:

  • ALTER EXTENSION pg_trickle UPDATE from 0.1.3→0.2.0 tested end-to-end
  • Completeness check passes (upgrade script covers all pgrx-generated SQL objects)
  • CI enforces upgrade script completeness on every PR
  • All documentation pages build and render in mdBook

v0.2.2 — OFFSET, AUTO Mode, ALTER QUERY, Edge Cases & CDC Hardening

Status: Released (2026-03-08).

This milestone shipped paged TopK OFFSET support, AUTO-by-default refresh selection, ALTER QUERY, the remaining upgrade-tooling work, edge-case and WAL CDC hardening, IMMEDIATE-mode parity fixes, and the outstanding documentation sweep.

Completed items (click to expand)

ORDER BY + LIMIT + OFFSET (Paged TopK) — Finalization ✅

In plain terms: Extends TopK to support OFFSET — so you can define a stream table as "rows 11–20 of the top-20 best-selling products" (page 2 of a ranked list). Useful for paginated leaderboards, ranked feeds, or any use case where you want a specific window into a sorted result.

Core implementation is complete (parser, catalog, refresh path, docs, 9 E2E tests). The topk_offset catalog column shipped in v0.2.1 and is exercised by the paged TopK feature here.

| Item | Description | Status | Ref |
|---|---|---|---|
| OS1 | 9 OFFSET E2E tests in e2e_topk_tests.rs | ✅ Done | PLAN_OFFSET_SUPPORT.md §Step 6 |
| OS2 | sql/pg_trickle--0.2.1--0.2.2.sql — function signature updates (no schema DDL needed) | ✅ Done | PLAN_OFFSET_SUPPORT.md §Step 2 |
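Paged TopK in use (a sketch; table and column names are illustrative):

```sql
-- Rows 11–20 of the ranked list — page 2 of a leaderboard.
SELECT pgtrickle.create_stream_table(
    name  => 'top_products_page2',
    query => 'SELECT product_id, sum(quantity) AS sold
              FROM order_items
              GROUP BY product_id
              ORDER BY sold DESC
              LIMIT 10 OFFSET 10'
);
```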

AUTO Refresh Mode ✅

In plain terms: Changes the default from "always try differential (incremental) refresh" to a smart automatic selection: use differential when the query supports it, fall back to a full re-scan when it doesn't. New stream tables also get a calculated schedule interval instead of a hardcoded 1-minute default.

| Item | Description | Status | Ref |
|---|---|---|---|
| AM1 | RefreshMode::Auto — uses DIFFERENTIAL when supported, falls back to FULL | ✅ Done | PLAN_REFRESH_MODE_DEFAULT.md |
| AM2 | create_stream_table default changed from 'DIFFERENTIAL' to 'AUTO' | ✅ Done | |
| AM3 | create_stream_table schedule default changed from '1m' to 'calculated' | ✅ Done | |
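With AUTO as the default, the minimal call no longer needs an explicit mode or schedule (a sketch of the behavior described above):

```sql
-- AUTO picks DIFFERENTIAL when the query supports it, FULL otherwise,
-- and derives a schedule interval instead of the old hardcoded '1m'.
SELECT pgtrickle.create_stream_table(
    name  => 'customer_totals',
    query => 'SELECT customer_id, sum(amount) AS total
              FROM orders GROUP BY customer_id'
);
```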

ALTER QUERY ✅

In plain terms: Lets you change the SQL query of an existing stream table without dropping and recreating it. pg_trickle inspects the old and new queries, determines what type of change was made (added a column, dropped a column, or fundamentally incompatible change), and performs the most minimal migration possible — updating in place where it can, rebuilding only when it must.

| Item | Description | Status | Ref |
|---|---|---|---|
| AQ1 | alter_stream_table(query => ...) — validate, classify schema change, migrate storage | ✅ Done | PLAN_ALTER_QUERY.md |
| AQ2 | Schema classification: same, compatible (ADD/DROP COLUMN), incompatible (full rebuild) | ✅ Done | |
| AQ3 | ALTER-aware cycle detection (check_for_cycles_alter) | ✅ Done | |
| AQ4 | CDC dependency migration (add/remove triggers for changed sources) | ✅ Done | |
| AQ5 | SQL Reference & CHANGELOG documentation | ✅ Done | |
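An in-place query change might look like this (a sketch of the alter_stream_table(query => ...) call named in AQ1; the stream table name is illustrative):

```sql
-- Adding a column is classified as a compatible change (AQ2) and
-- migrated in place; an incompatible change triggers a full rebuild.
SELECT pgtrickle.alter_stream_table(
    name  => 'customer_totals',
    query => 'SELECT customer_id, sum(amount) AS total, count(*) AS n
              FROM orders GROUP BY customer_id'
);
```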

Upgrade Tooling ✅

In plain terms: If the compiled extension library (.so file) is a different version than the SQL objects in the database, the scheduler now warns loudly at startup instead of failing in confusing ways later. Also adds FAQ entries and cross-links for common upgrade questions.

| Item | Description | Status | Ref |
|---|---|---|---|
| UG1 | Version mismatch check — scheduler warns if .so version ≠ SQL version | ✅ Done | PLAN_UPGRADE_MIGRATIONS.md §5.2 |
| UG2 | FAQ upgrade section — 3 new entries with UPGRADING.md cross-links | ✅ Done | PLAN_UPGRADE_MIGRATIONS.md §5.4 |
| UG3 | CI and local upgrade automation now target 0.2.2 (upgrade-check, upgrade-image defaults, upgrade E2E env) | ✅ Done | PLAN_UPGRADE_MIGRATIONS.md |

IMMEDIATE Mode Parity ✅

In plain terms: Closes two remaining SQL patterns that worked in DIFFERENTIAL mode but not in IMMEDIATE mode. Recursive CTEs (queries that reference themselves to compute e.g. graph reachability or org-chart hierarchies) now work in IMMEDIATE mode with a configurable depth guard. TopK (ORDER BY + LIMIT) queries also get a dedicated fast micro-refresh path in IMMEDIATE mode.

Close the gap between DIFFERENTIAL and IMMEDIATE mode SQL coverage for the two remaining high-risk patterns — recursive CTEs and TopK queries.

| Item | Description | Effort | Ref |
|---|---|---|---|
| IM1 | Validate recursive CTE semi-naive in IMMEDIATE mode; add stack-depth guard for deeply recursive defining queries | 2–3d | PLAN_EDGE_CASES_TIVM_IMPL_ORDER.md Stage 6 §5.1 |
| IM2 | TopK in IMMEDIATE mode: statement-level micro-refresh + ivm_topk_max_limit GUC | 2–3d | PLAN_EDGE_CASES_TIVM_IMPL_ORDER.md Stage 6 §5.2 |

IMMEDIATE parity subtotal: ✅ Complete (IM1 + IM2)

Edge Case Hardening ✅

In plain terms: Three targeted fixes for uncommon-but-real scenarios: a cap on CUBE/ROLLUP combinatorial explosion (which can generate thousands of grouping variants from a single query and crash the database); automatic recovery when CDC gets stuck in a "transitioning" state after a database restart; and polling-based change detection for foreign tables (tables in external databases) that can't use triggers or WAL.

Self-contained items from Stage 7 of the edge-cases/TIVM implementation plan.

| Item | Description | Effort | Ref |
|---|---|---|---|
| EC1 | pg_trickle.max_grouping_set_branches GUC — cap CUBE/ROLLUP branch-count explosion | 4h | PLAN_EDGE_CASES.md EC-02 |
| EC2 | Post-restart CDC TRANSITIONING health check — detect stuck CDC transitions after crash or restart | 1d | PLAN_EDGE_CASES.md EC-20 |
| EC3 | Foreign table support: polling-based change detection via periodic re-execution | 2–3d | PLAN_EDGE_CASES.md EC-05 |

Edge-case hardening subtotal: ✅ Complete (EC1 + EC2 + EC3)

Documentation Sweep

In plain terms: Filled three documentation gaps: what happens to an in-flight refresh if you run DDL (ALTER TABLE, DROP INDEX) at the same time; limitations when using pg_trickle on standby replicas; and a PgBouncer configuration guide explaining the session-mode requirement and incompatible settings.

Remaining documentation gaps identified in Stage 7 of the gap analysis.

| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| DS1 | DDL-during-refresh behaviour: document safe patterns and races | 2h | ✅ Done | PLAN_EDGE_CASES.md EC-17 |
| DS2 | Replication/standby limitations: document in FAQ and Architecture | 3h | ✅ Done | PLAN_EDGE_CASES.md EC-21/22/23 |
| DS3 | PgBouncer configuration guide: session-mode requirements and known incompatibilities | 2h | ✅ Done | PLAN_EDGE_CASES.md EC-28 |

Documentation sweep subtotal: ✅ Complete

WAL CDC Hardening

In plain terms: WAL (Write-Ahead Log) mode tracks changes by reading PostgreSQL's internal replication stream rather than using row-level triggers — which is more efficient and works across concurrent sessions. This work added a complete E2E test suite for WAL mode, hardened the automatic fallback from WAL to trigger mode when WAL isn't available, and promoted cdc_mode = 'auto' (try WAL first, fall back to triggers) as the default.

WAL decoder F2–F3 fixes (keyless pk_hash, old_* columns for UPDATE) landed in v0.1.3.

| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| W1 | WAL mode E2E test suite (parallel to trigger suite) | 8–12h | ✅ Done | PLAN_HYBRID_CDC.md |
| W2 | WAL→trigger automatic fallback hardening | 4–6h | ✅ Done | PLAN_HYBRID_CDC.md |
| W3 | Promote pg_trickle.cdc_mode = 'auto' to default | ~1h | ✅ Done | PLAN_HYBRID_CDC.md |

WAL CDC subtotal: ✅ Complete (~13–19 hours)
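Selecting the CDC mode (the GUC name and 'auto' default are from W3; the ALTER SYSTEM usage is a sketch):

```sql
-- 'auto' tries WAL-based capture first and falls back to triggers
-- when logical decoding is unavailable.
ALTER SYSTEM SET pg_trickle.cdc_mode = 'auto';
SELECT pg_reload_conf();
```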

Exit criteria:

  • ORDER BY + LIMIT + OFFSET defining queries accepted, refreshed, and E2E-tested
  • sql/pg_trickle--0.2.1--0.2.2.sql exists (column pre-provisioned in 0.2.1; function signature updates)
  • Upgrade completeness check passes for 0.2.1→0.2.2
  • CI and local upgrade-E2E defaults target 0.2.2
  • Version check fires at scheduler startup if .so/SQL versions diverge
  • IMMEDIATE mode: recursive CTE semi-naive validated; ivm_recursive_max_depth depth guard added
  • IMMEDIATE mode: TopK micro-refresh fully tested end-to-end (10 E2E tests)
  • max_grouping_set_branches GUC guards CUBE/ROLLUP explosion (3 E2E tests)
  • Post-restart CDC TRANSITIONING health check in place
  • Foreign table polling-based CDC implemented (3 E2E tests)
  • DDL-during-refresh and standby/replication limitations documented
  • WAL CDC mode passes full E2E suite
  • E2E tests pass (just build-e2e-image && just test-e2e)

v0.2.3 — Non-Determinism, CDC/Mode Gaps & Operational Polish

Status: Released (2026-03-09).

Completed items (click to expand)

Goal: Close a small set of high-leverage correctness and operational gaps that do not need to wait for the larger v0.3.0 parallel refresh, security, and partitioning work. This milestone tightens refresh-mode behavior, makes CDC transitions easier to observe, and removes one silent correctness hazard in DIFFERENTIAL mode.

Non-Deterministic Function Handling

In plain terms: Functions like random(), gen_random_uuid(), and clock_timestamp() return a different value every time they're called. In DIFFERENTIAL mode, pg_trickle computes what changed between the old and new result — but if a function changes on every call, the "change" is meaningless and produces phantom rows. This detects such functions at stream-table creation time and rejects them in DIFFERENTIAL mode (they still work fine in FULL or IMMEDIATE mode).

Status: Done. Volatility lookup, OpTree enforcement, E2E coverage, and documentation are complete.

Volatile functions (random(), gen_random_uuid(), clock_timestamp()) break delta computation in DIFFERENTIAL mode — values change on each evaluation, causing phantom changes and corrupted row identity hashes. This is a silent correctness gap.

| Item | Description | Status | Ref |
|---|---|---|---|
| ND1 | Volatility lookup via pg_proc.provolatile + recursive Expr scanner | Done | PLAN_NON_DETERMINISM.md §Part 1 |
| ND2 | OpTree volatility walker + enforcement policy (reject volatile in DIFFERENTIAL, warn for stable) | Done | PLAN_NON_DETERMINISM.md §Part 2 |
| ND3 | E2E tests (volatile rejected, stable warned, immutable allowed, nested volatile in WHERE) | Done | PLAN_NON_DETERMINISM.md §E2E Tests |
| ND4 | Documentation (SQL_REFERENCE.md, DVM_OPERATORS.md) | Done | PLAN_NON_DETERMINISM.md §Files |

Non-determinism subtotal: ~4–6 hours
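The enforcement in practice (a sketch; the error text and the refresh_mode parameter name are assumptions, not copied from the implementation):

```sql
-- Rejected at creation time: random() is volatile, so each delta
-- evaluation would see different values and produce phantom rows.
SELECT pgtrickle.create_stream_table(
    name         => 'sampled_orders',
    query        => 'SELECT * FROM orders WHERE random() < 0.1',
    refresh_mode => 'DIFFERENTIAL'
);
-- ERROR: volatile function in DIFFERENTIAL defining query

-- The same query is accepted in FULL mode, which recomputes from scratch.
```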

CDC / Refresh Mode Interaction Gaps ✅

In plain terms: pg_trickle has four CDC modes (trigger, WAL, auto, per-table override) and four refresh modes (FULL, DIFFERENTIAL, IMMEDIATE, AUTO). Not every combination makes sense, and some had silent bugs. This fixed six specific gaps: stale change buffers not being flushed after FULL refreshes (so they got replayed again on the next tick), a missing error for the IMMEDIATE + WAL combination, a new pgt_cdc_status monitoring view, per-table CDC mode overrides, and a guard against refreshing stream tables that haven't been populated yet.

Six gaps between the four CDC modes and four refresh modes — missing validations, resource leaks, and observability holes. Phased from quick wins (pure Rust) to a larger feature (per-table cdc_mode override).

| Item | Description | Status | Ref |
|---|---|---|---|
| G6 | Defensive is_populated + empty-frontier check in execute_differential_refresh() | Done | PLAN_CDC_MODE_REFRESH_MODE_GAPS.md §G6 |
| G2 | Validate IMMEDIATE + cdc_mode='wal' — global-GUC path logs INFO; explicit per-table override is rejected with a clear error | Done | PLAN_CDC_MODE_REFRESH_MODE_GAPS.md §G2 |
| G3 | Advance WAL replication slot after FULL refresh; flush change buffers | Done | PLAN_CDC_MODE_REFRESH_MODE_GAPS.md §G3 |
| G4 | Flush change buffers after AUTO→FULL adaptive fallback (prevents ping-pong) | Done | PLAN_CDC_MODE_REFRESH_MODE_GAPS.md §G4 |
| G5 | pgtrickle.pgt_cdc_status view + NOTIFY on CDC transitions | Done | PLAN_CDC_MODE_REFRESH_MODE_GAPS.md §G5 |
| G1 | Per-table cdc_mode override (SQL API, catalog, dbt, migration) | Done | PLAN_CDC_MODE_REFRESH_MODE_GAPS.md §G1 |

CDC/refresh mode gaps subtotal: ✅ Complete

Progress: G6 is now implemented in v0.2.3: the low-level differential executor rejects unpopulated stream tables and missing frontiers before it can scan from 0/0, while the public manual-refresh path continues to fall back to FULL for initialize => false stream tables.

Progress: G1 and G2 are now complete: create_stream_table() and alter_stream_table() accept an optional per-table cdc_mode override, the requested value is stored in pgt_stream_tables.requested_cdc_mode, dbt forwards the setting, and shared-source WAL transition eligibility is now resolved conservatively from all dependent deferred stream tables. The cluster-wide pg_trickle.cdc_mode = 'wal' path still logs INFO for refresh_mode = 'IMMEDIATE', while explicit per-table cdc_mode => 'wal' requests are rejected for IMMEDIATE mode with a clear error.

Progress: G3 and G4 are now implemented in v0.2.3: advance_slot_to_current() in wal_decoder.rs advances WAL slots after each FULL refresh; the shared post_full_refresh_cleanup() helper in refresh.rs advances all WAL/TRANSITIONING slots and flushes change buffers, called from scheduler.rs after every Full/Reinitialize execution and from the adaptive fallback path. This prevents change-buffer ping-pong on bulk-loaded tables.

Progress: G5 is now implemented in v0.2.3: the pgtrickle.pgt_cdc_status convenience view has been added, and a cdc_modes text-array column surfaces per-source CDC modes in pgtrickle.pg_stat_stream_tables. NOTIFY on CDC transitions (TRIGGER → TRANSITIONING → WAL) was already implemented via emit_cdc_transition_notify() in wal_decoder.rs.

Progress: The SQL upgrade path for these CDC and monitoring changes is in place via sql/pg_trickle--0.2.2--0.2.3.sql, which adds requested_cdc_mode, updates the create_stream_table / alter_stream_table signatures, recreates pgtrickle.pg_stat_stream_tables, and adds pgtrickle.pgt_cdc_status for ALTER EXTENSION ... UPDATE users.
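The per-table override and the new monitoring view in use (a sketch based on the G1/G5 descriptions above; the stream table name is illustrative):

```sql
-- G1: pin one stream table's sources to trigger-based capture,
-- overriding the cluster-wide pg_trickle.cdc_mode setting.
SELECT pgtrickle.alter_stream_table(
    name     => 'active_orders',
    cdc_mode => 'trigger'
);

-- G5: observe per-source CDC modes and transitions.
SELECT * FROM pgtrickle.pgt_cdc_status;
```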

Operational

In plain terms: Four housekeeping improvements: clean up prepared statements when the database catalog changes (prevents stale caches after DDL); make WAL slot lag alert thresholds configurable rather than hardcoded; simplify a confusing GUC setting (user_triggers) with a deprecated alias; and add a pg_trickle_dump tool that exports all stream table definitions to a replayable SQL file — useful as a backup before running an upgrade.

| Item | Description | Status | Ref |
|---|---|---|---|
| O1 | Prepared statement cleanup on cache invalidation | Done | GAP_SQL_PHASE_7.md G4.4 |
| O2 | Slot lag alerting thresholds configurable (slot_lag_warning_threshold_mb, slot_lag_critical_threshold_mb) | Done | PLAN_HYBRID_CDC.md §6.2 |
| O3 | Simplify pg_trickle.user_triggers GUC (canonical auto / off, deprecated on alias) | Done | PLAN_FEATURE_CLEANUP.md C5 |
| O4 | pg_trickle_dump: SQL export tool for manual backup before upgrade | Done | PLAN_UPGRADE_MIGRATIONS.md §5.3 |

Operational subtotal: Done

Progress: All four operational items are now shipped in v0.2.3. Warning-level and critical WAL slot lag thresholds are configurable, prepared __pgt_merge_* statements are cleaned up on shared cache invalidation, pg_trickle.user_triggers is simplified to canonical auto / off semantics with a deprecated on alias, and pg_trickle_dump provides a replayable SQL export for upgrade backups.
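Tuning the O2 thresholds (a sketch; the pg_trickle. GUC prefix is an assumption based on the extension's other settings, and the values are examples):

```sql
-- Raise WAL slot lag alert thresholds for a write-heavy cluster.
ALTER SYSTEM SET pg_trickle.slot_lag_warning_threshold_mb  = 512;
ALTER SYSTEM SET pg_trickle.slot_lag_critical_threshold_mb = 2048;
SELECT pg_reload_conf();
```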

v0.2.3 total: ~45–66 hours

Exit criteria:

  • Volatile functions rejected in DIFFERENTIAL mode; stable functions warned
  • DIFFERENTIAL on unpopulated ST returns error (G6)
  • IMMEDIATE + explicit cdc_mode='wal' rejected with clear error (G2)
  • WAL slot advanced after FULL refresh; change buffers flushed (G3)
  • Adaptive fallback flushes change buffers; no ping-pong cycles (G4)
  • pgtrickle.pgt_cdc_status view available; NOTIFY on CDC transitions (G5)
  • Prepared statement cache cleanup works after invalidation
  • Per-table cdc_mode override functional in SQL API and dbt adapter (G1)
  • Extension upgrade path tested (0.2.2 → 0.2.3)

v0.3.0 — DVM Correctness, SAST & Test Coverage

Status: Released (2026-03-11).

Completed items (click to expand)

Goal: Re-enable all 18 previously-ignored DVM correctness E2E tests by fixing HAVING, FULL OUTER JOIN, correlated EXISTS+HAVING, and correlated scalar subquery differential computation bugs. Harden the SAST toolchain with privilege-context rules and an unsafe-block baseline. Expand TPC-H coverage with rollback, mode-comparison, single-row, and DAG tests.

DVM Correctness Fixes

In plain terms: The Differential View Maintenance engine — the core algorithm that computes what changed incrementally — had four correctness bugs in specific SQL patterns. Queries using these patterns were silently producing wrong results and had their tests marked "ignored". This release fixes all four: HAVING clauses on aggregates, FULL OUTER JOINs, correlated EXISTS subqueries combined with HAVING, and correlated scalar subqueries in SELECT lists. All 18 previously-ignored E2E tests now pass.

| Item | Description | Status |
|---|---|---|
| DC1 | HAVING clause differential correctness — fix COUNT(*) rewrite and threshold-crossing upward rescan (5 tests un-ignored) | ✅ Done |
| DC2 | FULL OUTER JOIN differential correctness — fix row-id mismatch, compound GROUP BY expressions, SUM NULL semantics, and rescan CTE SELECT list (5 tests un-ignored) | ✅ Done |
| DC3 | Correlated EXISTS with HAVING differential correctness — fix EXISTS sublink parser discarding GROUP BY/HAVING, row-id mismatch for Project(SemiJoin), and diff_project row-id recomputation (1 test un-ignored) | ✅ Done |
| DC4 | Correlated scalar subquery differential correctness — rewrite_correlated_scalar_in_select rewrites correlated scalar subqueries to LEFT JOINs before DVM parsing (2 tests un-ignored) | ✅ Done |

DVM correctness subtotal: 18 previously-ignored E2E tests re-enabled (0 remaining)

SAST Program (Phases 1–3)

In plain terms: Adds formal static security analysis (SAST) to every build. CodeQL and Semgrep scan for known vulnerability patterns — for example, using SECURITY DEFINER functions without locking down search_path, or calling SET ROLE in ways that could be abused. Separately, every Rust unsafe {} block is inventoried and counted; any PR that adds new unsafe blocks beyond the committed baseline fails CI automatically.

| Item | Description | Status |
|---|---|---|
| S1 | CodeQL + cargo deny + initial Semgrep baseline — zero findings across 115 Rust source files | ✅ Done |
| S2 | Narrow rust.panic-in-sql-path scope — exclude `src/dvm/**` and `src/bin/**` to eliminate 351 false-positive alerts | ✅ Done |
| S3 | sql.row-security.disabled Semgrep rule — flag SET LOCAL row_security = off | ✅ Done |
| S4 | sql.set-role.present Semgrep rule — flag SET ROLE / RESET ROLE patterns | ✅ Done |
| S5 | Updated sql.security-definer.present message to require explicit SET search_path | ✅ Done |
| S6 | scripts/unsafe_inventory.sh + .unsafe-baseline — per-file `unsafe {` counter with committed baseline (1309 blocks across 6 files) | ✅ Done |
| S7 | .github/workflows/unsafe-inventory.yml — advisory CI workflow; fails if any file exceeds its baseline | ✅ Done |
| S8 | Remove pull_request trigger from CodeQL + Semgrep workflows (no inline PR annotations; runs on push-to-main + weekly schedule) | ✅ Done |

SAST subtotal: Phases 1–3 complete; Phase 4 rule promotion tracked as post-v0.3.0 cleanup

TPC-H Test Suite Enhancements (T1–T6)

In plain terms: TPC-H is an industry-standard analytical query benchmark — 22 queries against a simulated supply-chain database. This extends the pg_trickle TPC-H test suite to verify four additional scenarios that the basic correctness checks didn't cover: that ROLLBACK atomically undoes an IVM stream table update; that DIFFERENTIAL and IMMEDIATE mode produce identical answers for the same data; that single-row mutations work correctly (not just bulk changes); and that multi-level stream table DAGs refresh in the correct topological order.

| Item | Description | Status |
|---|---|---|
| T1 | `__pgt_count < 0` guard in assert_tpch_invariant — over-retraction detector, applies to all existing TPC-H tests | ✅ Done |
| T2 | Skip-set regression guard in DIFFERENTIAL + IMMEDIATE tests — any newly skipped query not in the allowlist fails CI | ✅ Done |
| T3 | test_tpch_immediate_rollback — verify ROLLBACK restores IVM stream table atomically across RF mutations | ✅ Done |
| T4 | test_tpch_differential_vs_immediate — side-by-side comparison: both incremental modes produce identical results after shared mutations | ✅ Done |
| T5 | test_tpch_single_row_mutations + SQL fixtures — single-row INSERT/UPDATE/DELETE IVM trigger paths on Q01/Q06/Q03 | ✅ Done |
| T6a | test_tpch_dag_chain — two-level DAG (Q01 → filtered projection), refreshed in topological order | ✅ Done |
| T6b | test_tpch_dag_multi_parent — multi-parent fan-in (Q01 + Q06 → UNION ALL), DIFFERENTIAL mode | ✅ Done |

TPC-H subtotal: T1–T6 complete; 22/22 TPC-H queries passing

Exit criteria:

  • All 18 previously-ignored DVM correctness E2E tests re-enabled
  • SAST Phases 1–3 deployed; unsafe baseline committed; CodeQL zero findings
  • TPC-H T1–T6 implemented; rollback, differential-vs-immediate, single-row, and DAG tests pass
  • Extension upgrade path tested (0.2.3 → 0.3.0)

v0.4.0 — Parallel Refresh & Performance Hardening

Status: Released (2026-03-12).

Completed items (click to expand)

Goal: Deliver true parallel refresh, cut write-side CDC overhead with statement-level triggers, close a cross-source snapshot consistency gap, and ship quick ergonomic and infrastructure improvements. Together these close the main performance and operational gaps before the security and partitioning work begins.

Parallel Refresh

In plain terms: Previously the scheduler refreshed stream tables one at a time. This feature lets multiple stream tables refresh simultaneously — like running several errands at once instead of in a queue. When you have dozens of stream tables, this can cut total refresh latency dramatically.

Detailed implementation is tracked in PLAN_PARALLELISM.md. The older REPORT_PARALLELIZATION.md remains the options-analysis precursor.

| Item | Description | Effort | Ref |
|---|---|---|---|
| P1 | Phase 0–1: instrumentation, dry_run, and execution-unit DAG (atomic groups + IMMEDIATE closures) | 12–20h | PLAN_PARALLELISM.md §10 |
| P2 | Phase 2–4: job table, worker budget, dynamic refresh workers, and ready-queue dispatch | 16–28h | PLAN_PARALLELISM.md §10 |
| P3 | Phase 5–7: composite units, observability, rollout gating, and CI validation | 12–24h | PLAN_PARALLELISM.md §10 |

Progress:

  • P1 — Phase 0 + Phase 1 (done): GUCs (parallel_refresh_mode, max_dynamic_refresh_workers), ExecutionUnit/ExecutionUnitDag types in dag.rs, IMMEDIATE-closure collapsing, dry-run logging in scheduler, 10 new unit tests (1211 total).
  • P2 — Phase 2–4 (done): Job table (pgt_scheduler_jobs), catalog CRUD, shared-memory token pool (Phase 2). Dynamic worker entry point, spawn helper, reconciliation (Phase 3). Coordinator dispatch loop with ready-queue scheduling, per-db/cluster-wide budget enforcement, transaction-split spawning, dynamic poll interval, 8 new unit tests (Phase 4). 1233 unit tests total.
  • P3a — Phase 5 (done): Composite unit execution — execute_worker_atomic_group() with C-level sub-transaction rollback, execute_worker_immediate_closure() with root-only refresh (IMMEDIATE triggers propagate downstream). Replaces Phase 3 serial placeholder.
  • P3b — Phase 6 (done): Observability — worker_pool_status(), parallel_job_status() SQL functions; health_check() extended with worker_pool and job_queue checks; docs updated.
  • P3c — Phase 7 (done): Rollout — GUC documentation in CONFIGURATION.md, worker-budget guidance in ARCHITECTURE.md, CI E2E coverage with PGT_PARALLEL_MODE=on, feature stays gated behind parallel_refresh_mode = 'off' default.

Parallel refresh subtotal: ✅ Complete (~40–72 hours)
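Enabling and observing parallel refresh (a sketch using the GUCs and functions named in the progress notes above; the feature ships gated behind the 'off' default):

```sql
ALTER SYSTEM SET pg_trickle.parallel_refresh_mode       = 'on';
ALTER SYSTEM SET pg_trickle.max_dynamic_refresh_workers = 4;
SELECT pg_reload_conf();

-- Phase 6 observability:
SELECT * FROM pgtrickle.worker_pool_status();
SELECT * FROM pgtrickle.parallel_job_status();
```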

Statement-Level CDC Triggers

In plain terms: Previously, when you updated 1,000 rows in a source table, the database fired a "row changed" notification 1,000 times — once per row. Now it fires once per statement, handing off all 1,000 changed rows in a single batch. For bulk operations like data imports or batch updates this is 50–80% cheaper; for single-row changes you won't notice a difference.

Replace per-row AFTER triggers with statement-level triggers using NEW TABLE AS __pgt_new / OLD TABLE AS __pgt_old. Expected write-side trigger overhead reduction of 50–80% for bulk DML; neutral for single-row.

| Item | Description | Effort | Status |
|---|---|---|---|
| B1 | Replace per-row triggers with statement-level triggers; INSERT/UPDATE/DELETE via set-based buffer fill | 8h | ✅ Done — build_stmt_trigger_fn_sql in cdc.rs; REFERENCING NEW TABLE AS __pgt_new OLD TABLE AS __pgt_old FOR EACH STATEMENT created by create_change_trigger |
| B2 | pg_trickle.cdc_trigger_mode = 'statement'\|'row' GUC + migration to replace row-level triggers on ALTER EXTENSION UPDATE | 4h | ✅ Done — CdcTriggerMode enum in config.rs; rebuild_cdc_triggers() in api.rs; 0.3.0→0.4.0 upgrade script migrates existing triggers |
| B3 | Write-side benchmark matrix (narrow/medium/wide tables × bulk/single DML) | 2h | ✅ Done — bench_stmt_vs_row_cdc_matrix + bench_stmt_vs_row_cdc_quick in e2e_bench_tests.rs; runs via cargo test -- --ignored bench_stmt_vs_row_cdc_matrix |

Statement-level CDC subtotal: ✅ All done (~14h)

Cross-Source Snapshot Consistency (Phase 1)

In plain terms: Imagine a stream table that joins orders and customers. If a single transaction updates both tables, the old scheduler could read the new orders data but the old customers data — a half-applied, internally inconsistent snapshot. This fix takes a "freeze frame" of the change log at the start of each scheduler tick and only processes changes up to that point, so all sources are always read from the same moment in time. Zero configuration required.

At start of each scheduler tick, snapshot pg_current_wal_lsn() as a tick_watermark and cap all CDC consumption to that LSN. Zero user configuration — prevents interleaved reads from two sources that were updated in the same transaction from producing an inconsistent stream table.
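Conceptually, a tick works like this. The change-buffer relation and its `lsn` column are illustrative, not the real catalog shape:

```sql
-- At the start of the tick, freeze the frontier once:
SELECT pg_current_wal_lsn();  -- e.g. 0/1A2B3C4

-- All CDC consumption within the tick is capped to that LSN, so two
-- sources written by the same transaction are either both visible or
-- both invisible — never half-applied:
SELECT * FROM pgtrickle.orders_changes    -- hypothetical change buffer
WHERE lsn <= '0/1A2B3C4'::pg_lsn;
```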

| Item | Description | Effort | Ref |
|---|---|---|---|
| CSS1 | LSN tick watermark: snapshot pg_current_wal_lsn() per tick; cap frontier advance; log in pgt_refresh_history; pg_trickle.tick_watermark_enabled GUC (default on) | 3–4h | ✅ Done |

Cross-source consistency subtotal: ✅ All done

Ergonomic Hardening

In plain terms: Added helpful warning messages for common mistakes: "your WAL level isn't configured for logical replication", "this source table has no primary key — duplicate rows may appear", "this change will trigger a full re-scan of all source data". Think of these as friendly guardrails that explain why something might not work as expected.

| Item | Description | Effort | Ref |
|---|---|---|---|
| ERG-B | Warn at _PG_init when cdc_mode='auto' but wal_level != 'logical' — prevents silent trigger-only operation | 30min | ✅ Done |
| ERG-C | Warn at create_stream_table when source has no primary key — surfaces keyless duplicate-row risk | 1h | ✅ Done (pre-existing in warn_source_table_properties) |
| ERG-F | Emit WARNING when alter_stream_table triggers an implicit full refresh | 1h | ✅ Done |

Ergonomic hardening subtotal: ✅ All done

Code Coverage

In plain terms: Every pull request now automatically reports what percentage of the code is exercised by tests, and which specific lines are never touched. It's like a map that highlights the unlit corners — helpful for spotting blind spots before they become bugs.

| Item | Description | Effort | Ref |
|---|---|---|---|
| COV | Codecov integration: move token to with:, add codecov.yml with patch targets for src/dvm/, add README badge, verify first upload | 1–2h | ✅ Done — reports live at app.codecov.io/github/grove/pg-trickle |

v0.4.0 total: ~60–94 hours

Exit criteria:

  • max_concurrent_refreshes drives real parallel refresh via coordinator + dynamic refresh workers
  • Statement-level CDC triggers implemented (B1/B2/B3); benchmark harness in bench_stmt_vs_row_cdc_matrix
  • LSN tick watermark active by default; no interleaved-source inconsistency in E2E tests
  • Codecov badge on README; coverage report uploading
  • Extension upgrade path tested (0.3.0 → 0.4.0)

v0.5.0 — Row-Level Security & Operational Controls

Status: Released (2026-03-13).

Completed items

Goal: Harden the security context for stream tables and IVM triggers, add source-level pause/resume gating for bulk-load coordination, and deliver small ergonomic improvements.

Row-Level Security (RLS) Support

In plain terms: Row-level security lets you write policies like "user Alice can only see rows where tenant_id = 'alice'". Stream tables already honour these policies when users query them. What this work fixes is the machinery behind the scenes — the triggers and refresh functions that build the stream table need to see all rows regardless of who is running them, otherwise they'd produce an incomplete result. This phase hardens those internal components so they always have full visibility, while end-users still see only their filtered slice.

Stream tables materialize the full result set (like MATERIALIZED VIEW). RLS is applied on the stream table itself for read-side filtering. Phase 1 hardens the security context; Phase 2 adds a tutorial; Phase 3 completes DDL tracking. Phase 4 (per-role security_invoker) is deferred to post-1.0.
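A sketch of the read-side pattern, assuming a tenant_id column on the stream table and an app.tenant_id session setting (both illustrative):

```sql
-- Stream tables are ordinary tables, so standard RLS applies on reads:
ALTER TABLE active_orders ENABLE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON active_orders
    USING (tenant_id = current_setting('app.tenant_id'));

-- End users see only their slice; the SECURITY DEFINER refresh
-- machinery still materializes the complete result set.
SET app.tenant_id = 'alice';
SELECT * FROM active_orders;   -- only Alice's rows
```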

| Item | Description | Effort | Ref |
|---|---|---|---|
| R1 | Document RLS semantics in SQL_REFERENCE.md and FAQ.md | 1h | PLAN_ROW_LEVEL_SECURITY.md §3.1 |
| R2 | Disable RLS on change buffer tables (ALTER TABLE ... DISABLE ROW LEVEL SECURITY) | 30min | PLAN_ROW_LEVEL_SECURITY.md §3.1 R2 |
| R3 | Force superuser context for manual refresh_stream_table() (prevent "who refreshed it?" hazard) | 2h | PLAN_ROW_LEVEL_SECURITY.md §3.1 R3 |
| R4 | Force SECURITY DEFINER on IVM trigger functions (IMMEDIATE mode delta queries must see all rows) | 2h | PLAN_ROW_LEVEL_SECURITY.md §3.1 R4 |
| R5 | E2E test: RLS on source table does not affect stream table content | 1h | PLAN_ROW_LEVEL_SECURITY.md §3.1 R5 |
| R6 | Tutorial: RLS on stream tables (enable RLS, per-tenant policies, verify filtering) | 1.5h | PLAN_ROW_LEVEL_SECURITY.md §3.2 R6 |
| R7 | E2E test: RLS on stream table filters reads per role | 1h | PLAN_ROW_LEVEL_SECURITY.md §3.2 R7 |
| R8 | E2E test: IMMEDIATE mode + RLS on stream table | 30min | PLAN_ROW_LEVEL_SECURITY.md §3.2 R8 |
| R9 | Track ENABLE/DISABLE RLS DDL on source tables (AT_EnableRowSecurity et al.) in hooks.rs | 2h | PLAN_ROW_LEVEL_SECURITY.md §3.3 R9 |
| R10 | E2E test: ENABLE RLS on source table triggers reinit | 1h | PLAN_ROW_LEVEL_SECURITY.md §3.3 R10 |

RLS subtotal: ~8–12 hours (Phase 4 security_invoker deferred to post-1.0)

Bootstrap Source Gating

In plain terms: A pause/resume switch for individual source tables. If you're bulk-loading 10 million rows into a source table (a nightly ETL import, for example), you can "gate" it first — the scheduler will skip refreshing any stream table that reads from it. Once the load is done you "ungate" it and a single clean refresh runs. Without gating, the CDC system would frantically process millions of intermediate changes during the load, most of which get immediately overwritten anyway.

Allow operators to pause CDC consumption for specific source tables (e.g. during bulk loads or ETL windows) without dropping and recreating stream tables. The scheduler skips any stream table whose transitive source set intersects the current gated set.
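The intended bulk-load workflow, using the functions defined in the items below (the COPY path is illustrative):

```sql
SELECT pgtrickle.gate_source('orders');     -- scheduler skips dependents
COPY orders FROM '/data/nightly.csv' CSV;   -- bulk load, no CDC churn
SELECT pgtrickle.ungate_source('orders');   -- next tick: one clean refresh

SELECT * FROM pgtrickle.source_gates();     -- which sources are gated, since when
```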

| Item | Description | Effort | Ref |
|---|---|---|---|
| BOOT-1 | pgtrickle.pgt_source_gates catalog table (source_relid, gated, gated_at, gated_by) | 30min | PLAN_BOOTSTRAP_GATING.md |
| BOOT-2 | gate_source(source TEXT) SQL function — sets gate, pg_notify scheduler | 1h | PLAN_BOOTSTRAP_GATING.md |
| BOOT-3 | ungate_source(source TEXT) + source_gates() introspection view | 30min | PLAN_BOOTSTRAP_GATING.md |
| BOOT-4 | Scheduler integration: load gated-source set per tick; skip and log SKIP in pgt_refresh_history | 2–3h | PLAN_BOOTSTRAP_GATING.md |
| BOOT-5 | E2E tests: single-source gate, coordinated multi-source, partial DAG, bootstrap with initialize => false | 3–4h | PLAN_BOOTSTRAP_GATING.md |

Bootstrap source gating subtotal: ~7–9 hours

Ergonomics & API Polish

In plain terms: A handful of quality-of-life improvements: track when someone manually triggered a refresh and log it in the history table; a one-row quick_health view that tells you at a glance whether the extension is healthy (total tables, any errors, any stale tables, scheduler running); a create_stream_table_if_not_exists() helper so deployment scripts don't crash if the table was already created; and CALL syntax wrappers so the functions feel like native PostgreSQL commands rather than extension functions.
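Sketches of the new surfaces (call shapes follow the items below; exact column names may differ):

```sql
-- One-row health summary at a glance:
SELECT status, error_tables, stale_tables, scheduler_running
FROM pgtrickle.quick_health;

-- Safe to re-run from deployment scripts — no error if it already exists:
SELECT pgtrickle.create_stream_table_if_not_exists(
    name     => 'active_orders',
    query    => 'SELECT * FROM orders WHERE status = ''active''',
    schedule => '30s'
);

-- Manual refreshes are recorded with initiated_by = 'MANUAL':
SELECT pgtrickle.refresh_stream_table('active_orders');
```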

| Item | Description | Effort | Ref |
|---|---|---|---|
| ERG-D | Record manual refresh_stream_table() calls in pgt_refresh_history with initiated_by='MANUAL' | 2h | PLAN_ERGONOMICS.md §D |
| ERG-E | pgtrickle.quick_health view — single-row status summary (total_stream_tables, error_tables, stale_tables, scheduler_running, status) | 2h | PLAN_ERGONOMICS.md §E |
| COR-2 | create_stream_table_if_not_exists() convenience wrapper | 30min | PLAN_CREATE_OR_REPLACE.md §COR-2 |
| NAT-CALL | CREATE PROCEDURE wrappers for all four main SQL functions — enables CALL pgtrickle.create_stream_table(...) syntax | 1h | Deferred — PostgreSQL does not allow procedures and functions with the same name and argument types |

Ergonomics subtotal: ~5–5.5 hours (NAT-CALL deferred)

Performance Foundations (Wave 1)

This quick-win item from PLAN_NEW_STUFF.md ships alongside the RLS and operational work. Read the risk analysis in that document before implementing it.

| Item | Description | Effort | Ref |
|---|---|---|---|
| A-3a | MERGE bypass — Append-Only INSERT path: expose APPEND ONLY declaration on CREATE STREAM TABLE; CDC heuristic fallback (fast-path until first DELETE/UPDATE seen) | 1–2 wk | PLAN_NEW_STUFF.md §A-3 |

A-4, B-2, and C-4 deferred to v0.6.0 Performance Wave 2 (scope mismatch with the RLS/operational-controls theme; correctness risk warrants a dedicated wave).

Performance foundations subtotal: ~10–20h (A-3a only)

v0.5.0 total: ~51–97h

Exit criteria:

  • RLS semantics documented; change buffers RLS-hardened; IVM triggers SECURITY DEFINER
  • RLS on stream table E2E-tested (DIFFERENTIAL + IMMEDIATE)
  • gate_source / ungate_source operational; scheduler skips gated sources correctly
  • quick_health view and create_stream_table_if_not_exists available
  • Manual refresh calls recorded in history with initiated_by='MANUAL'
  • A-3a: Append-Only INSERT path eliminates MERGE for event-sourced stream tables
  • Extension upgrade path tested (0.4.0 → 0.5.0)

v0.6.0 — Partitioning, Idempotent DDL, Edge Cases & Circular Dependency Foundation

Status: Released (2026-03-14).

Completed items

Goal: Validate partitioned source tables, add create_or_replace_stream_table for idempotent deployments (critical for dbt and migration workflows), close all remaining P0/P1 edge cases and two usability-tier gaps, harden ergonomics and source gating, expand the dbt integration, fill SQL documentation gaps, and lay the foundation for circular stream table DAGs.

Partitioning Support (Source Tables)

In plain terms: PostgreSQL lets you split large tables into smaller "partitions" — for example one partition per month for an orders table. This is a common technique for managing very large datasets. This work teaches pg_trickle to track all those partitions as a unit, so adding a new monthly partition doesn't silently break stream tables that depend on orders. It also handles the special case of foreign tables (tables that live in another database), restricting them to full-scan refresh since they can't be change-tracked the normal way.
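A sketch of the scenario PT1–PT2 cover (table and partition names illustrative):

```sql
-- A RANGE-partitioned source works like any other source table:
CREATE TABLE orders (
    id bigint, status text, created_at date NOT NULL
) PARTITION BY RANGE (created_at);

CREATE TABLE orders_2026_03 PARTITION OF orders
    FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');

-- The stream table targets the parent; partitions are tracked as a unit.
SELECT pgtrickle.create_stream_table(
    name     => 'active_orders',
    query    => 'SELECT * FROM orders WHERE status = ''active''',
    schedule => '30s'
);

-- PT2: attaching a new partition is detected and dependents are rebuilt,
-- so the new partition's rows are not silently ignored.
CREATE TABLE orders_2026_04 (LIKE orders INCLUDING ALL);
ALTER TABLE orders ATTACH PARTITION orders_2026_04
    FOR VALUES FROM ('2026-04-01') TO ('2026-05-01');
```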

| Item | Description | Effort | Ref |
|---|---|---|---|
| PT1 | Verify partitioned tables work end-to-end. Create stream tables over RANGE-partitioned source tables, insert/update/delete rows, refresh, and confirm results match — proving that pg_trickle handles partitions correctly out of the box. | 8–12h | PLAN_PARTITIONING_SHARDING.md §7 |
| PT2 | Detect new partitions automatically. When someone runs ALTER TABLE orders ATTACH PARTITION orders_2026_04 ..., pg_trickle notices and rebuilds affected stream tables so the new partition's data is included. Without this, the new partition would be silently ignored. | 4–8h | PLAN_PARTITIONING_SHARDING.md §3.3 |
| PT3 | Make WAL-based change tracking work with partitions. PostgreSQL's logical replication normally sends changes tagged with the child partition name, not the parent. This configures it to report changes under the parent table name so pg_trickle's WAL decoder can match them correctly. | 2–4h | PLAN_PARTITIONING_SHARDING.md §3.4 |
| PT4 | Handle foreign tables gracefully. Tables that live in another database (via postgres_fdw) can't have triggers or WAL tracking. pg_trickle now detects them and automatically uses full-scan refresh mode instead of failing with a confusing error. | 2–4h | PLAN_PARTITIONING_SHARDING.md §6.3 |
| PT5 | Document partitioned table support. User-facing guide covering which partition types work, what happens when you add/remove partitions, and known caveats. | 2–4h | PLAN_PARTITIONING_SHARDING.md §8 |

Partitioning subtotal: ~18–32 hours

Idempotent DDL (create_or_replace)

In plain terms: Right now if you run create_stream_table() twice with the same name it errors out, and changing the query means drop_stream_table() followed by create_stream_table() — which loses all the data in between. create_or_replace_stream_table() does the right thing automatically: if nothing changed it's a no-op, if only settings changed it updates in place, if the query changed it rebuilds. This is the same pattern as CREATE OR REPLACE FUNCTION in PostgreSQL — and it's exactly what the dbt materialization macro needs so every dbt run doesn't drop and recreate tables from scratch.

create_or_replace_stream_table() performs a smart diff: no-op if identical, in-place alter for config-only changes, schema migration for ADD/DROP column, full rebuild for incompatible changes. Eliminates the drop-and-recreate pattern used by the dbt materialization macro.
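Illustrative usage — same call shape as create_stream_table, but safe to re-run:

```sql
-- First run creates the stream table; identical reruns are no-ops.
SELECT pgtrickle.create_or_replace_stream_table(
    name     => 'active_orders',
    query    => 'SELECT * FROM orders WHERE status = ''active''',
    schedule => '30s'
);

-- Changing only the schedule takes the cheap in-place path; a query
-- change triggers a full rebuild only when the columns are incompatible.
SELECT pgtrickle.create_or_replace_stream_table(
    name     => 'active_orders',
    query    => 'SELECT * FROM orders WHERE status = ''active''',
    schedule => '10s'
);
```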

| Item | Description | Effort | Ref |
|---|---|---|---|
| COR-1 | The core function. create_or_replace_stream_table() compares the new definition against the existing one and picks the cheapest path: no-op if identical, settings-only update if just config changed, column migration if columns were added/dropped, or full rebuild if the query is fundamentally different. One function call replaces the drop-and-recreate dance. | 4h | PLAN_CREATE_OR_REPLACE.md |
| COR-3 | dbt just works. Updates the stream_table dbt materialization macro to call create_or_replace instead of dropping and recreating on every dbt run. Existing data survives deployments; only genuinely changed stream tables get rebuilt. | 2h | PLAN_CREATE_OR_REPLACE.md |
| COR-4 | Upgrade path and documentation. Upgrade SQL script so existing installations get the new function via ALTER EXTENSION UPDATE. SQL Reference and FAQ updated with usage examples. | 2.5h | PLAN_CREATE_OR_REPLACE.md |
| COR-5 | Thorough test coverage. 13 end-to-end tests covering: identical no-op, config-only change, query change with compatible columns, query change with incompatible columns, mode switches, and error cases. | 4h | PLAN_CREATE_OR_REPLACE.md |

Idempotent DDL subtotal: ~12–13 hours

Circular Dependency Foundation ✅

In plain terms: Normally stream tables form a one-way chain: A feeds B, B feeds C. A circular dependency means A feeds B which feeds A — usually a mistake, but occasionally useful for iterative computations like graph reachability or recursive aggregations. This lays the groundwork — the algorithms, catalog columns, and GUC settings — to eventually allow controlled circular stream tables. The actual live execution is completed in v0.7.0.

Forms the prerequisite for full SCC-based fixpoint refresh in v0.7.0.
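A sketch of the opt-in surface and the kind of monotone cycle it is meant to support once live execution lands in v0.7.0. The reachability query is illustrative only:

```sql
SET pg_trickle.allow_circular = on;            -- master switch, default off
SET pg_trickle.max_fixpoint_iterations = 100;  -- runaway-loop guard

-- Illustrative monotone cycle: graph reachability, where the stream
-- table's query reads the stream table itself.
SELECT pgtrickle.create_stream_table(
    name  => 'reachable',
    query => 'SELECT src, dst FROM edges
              UNION
              SELECT r.src, e.dst
              FROM reachable r JOIN edges e ON e.src = r.dst',
    schedule => '1m'
);
```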

| Item | Description | Effort | Ref |
|---|---|---|---|
| CYC-1 | Find cycles in the dependency graph. Implement Tarjan's algorithm to efficiently detect which stream tables form circular groups. This tells the scheduler "these three stream tables reference each other — they need special handling." | ~2h | PLAN_CIRCULAR_REFERENCES.md Part 1 |
| CYC-2 | Block unsafe cycles. Not all queries can safely participate in a cycle — aggregates, EXCEPT, window functions, and NOT EXISTS can't converge to a stable answer when run in a loop. This checker rejects those at creation time with a clear error explaining why. | ~1h | PLAN_CIRCULAR_REFERENCES.md Part 2 |
| CYC-3 | Track cycles in the catalog. Add columns to the internal tables that record which cycle group each stream table belongs to and how many iterations the last refresh took. Needed for monitoring and the scheduler logic in v0.7.0. | ~1h | PLAN_CIRCULAR_REFERENCES.md Part 3 |
| CYC-4 | Safety knobs. Two new settings: max_fixpoint_iterations (default 100) prevents runaway loops, and allow_circular (default off) is the master switch — circular dependencies are rejected unless you explicitly opt in. | ~30min | PLAN_CIRCULAR_REFERENCES.md Part 4 |

Circular dependency foundation subtotal: ~4.5 hours

Edge Case Hardening

In plain terms: Six remaining edge cases from the PLAN_EDGE_CASES.md catalogue — one data correctness issue (P0), three operational-surprise items (P1), and two usability gaps (P2). Together they close every open edge case above "accepted trade-off" status.

P0 — Data Correctness

| Item | Description | Effort | Ref |
|---|---|---|---|
| EC-19 | Prevent silent data corruption with WAL + keyless tables. If you use WAL-based change tracking on a table without a primary key, PostgreSQL needs REPLICA IDENTITY FULL to send complete row data. Without it, deltas are silently incomplete. This rejects the combination at creation time with a clear error instead of producing wrong results. | 0.5 day | PLAN_EDGE_CASES.md EC-19 |
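The accepted configuration for a keyless source looks like this; without the REPLICA IDENTITY, creation now fails fast instead of producing incomplete deltas (table name illustrative):

```sql
-- Keyless table + WAL-based CDC requires full row images:
ALTER TABLE events REPLICA IDENTITY FULL;

SELECT pgtrickle.create_stream_table(
    name     => 'event_counts',
    query    => 'SELECT kind, count(*) FROM events GROUP BY kind',
    schedule => '1m'
);
```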

P1 — Operational Safety

| Item | Description | Effort | Ref |
|---|---|---|---|
| EC-16 | Detect when someone silently changes a function your query uses. If a stream table's query calls calculate_discount() and someone does CREATE OR REPLACE FUNCTION calculate_discount(...) with new logic, the stream table's cached computation plan becomes stale. This checks function body hashes on each refresh and triggers a rebuild when a change is detected. | 2 days | PLAN_EDGE_CASES.md EC-16 |
| EC-18 | Explain why WAL mode isn't activating. When cdc_mode = 'auto', pg_trickle is supposed to upgrade from trigger-based to WAL-based change tracking when possible. If it stays stuck on triggers (e.g. because wal_level isn't set to logical), there's no feedback. This adds a periodic log message explaining the reason and surfaces it in the health_check() output. | 1 day | PLAN_EDGE_CASES.md EC-18 |
| EC-34 | Recover gracefully after restoring from backup. When you restore a PostgreSQL server from pg_basebackup, replication slots are lost. pg_trickle's WAL decoder would fail trying to read from a slot that no longer exists. This detects the missing slot, automatically falls back to trigger-based tracking, and logs a WARNING so you know what happened. | 1 day | PLAN_EDGE_CASES.md EC-34 |

P2 — Usability Gaps

| Item | Description | Effort | Ref |
|---|---|---|---|
| EC-03 | Support window functions inside expressions. Queries like CASE WHEN ROW_NUMBER() OVER (...) = 1 THEN 'first' ELSE 'other' END are currently rejected because the incremental engine can't handle a window function nested inside a CASE. This automatically extracts the window function into a preliminary step and rewrites the outer query to reference the precomputed result — so the query pattern just works. | 3–5 days | PLAN_EDGE_CASES.md EC-03 |
| EC-32 | Support ALL (subquery) comparisons. Queries like WHERE price > ALL (SELECT price FROM competitors) (meaning "greater than every row in the subquery") are currently rejected in incremental mode. This rewrites them into an equivalent form the engine can handle, removing a Known Limitation from the changelog. | 2–3 days | PLAN_EDGE_CASES.md EC-32 |
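For EC-32, the rewrite target is roughly the following NULL-safe anti-join. This is a sketch of the shape, not the engine's exact internal output:

```sql
-- Original form, previously rejected in incremental mode:
SELECT * FROM products p
WHERE p.price > ALL (SELECT c.price FROM competitors c);

-- Equivalent anti-join shape the engine can maintain incrementally.
-- The NULL checks preserve SQL's three-valued ALL semantics: an empty
-- subquery keeps the row; a NULL price on either side drops it whenever
-- at least one competitor row exists.
SELECT * FROM products p
WHERE NOT EXISTS (
    SELECT 1 FROM competitors c
    WHERE c.price >= p.price
       OR c.price IS NULL
       OR p.price IS NULL
);
```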

Edge case hardening subtotal: ~9.5–12.5 days

Ergonomics Follow-Up

In plain terms: Several test gaps and a documentation item were left over from the v0.5.0 ergonomics work. These are all small E2E tests that confirm existing features actually produce the warnings and errors they're supposed to — catching regressions before users hit them. The changelog entry documents breaking behavioural changes (the default schedule changed from a fixed "every 1 minute" to an auto-calculated interval, and NULL schedule input is now rejected).

| Item | Description | Effort | Ref |
|---|---|---|---|
| ERG-T1 | Test the smart schedule default. Verify that passing 'calculated' as a schedule works (pg_trickle picks an interval based on table size) and that passing NULL gives a clear error instead of silently breaking. Catches regressions in the schedule parser. | 4h | PLAN_ERGONOMICS.md §Remaining follow-up |
| ERG-T2 | Test that removed settings stay removed. The diamond_consistency GUC was removed in v0.4.0. Verify that SHOW pg_trickle.diamond_consistency returns an error — not a stale value from a previous installation that confuses users. | 2h | PLAN_ERGONOMICS.md §Remaining follow-up |
| ERG-T3 | Test the "heads up, this will do a full refresh" warning. When you change a stream table's query via alter_stream_table(query => ...), it may trigger an expensive full re-scan. Verify the WARNING appears so users aren't surprised by a sudden spike in load. | 3h | PLAN_ERGONOMICS.md §Remaining follow-up |
| ERG-T4 | Test the WAL configuration warning. When cdc_mode = 'auto' but PostgreSQL's wal_level isn't set to logical, pg_trickle can't use WAL-based tracking and silently falls back to triggers. Verify the startup WARNING appears so operators know they need to change wal_level. | 3h | PLAN_ERGONOMICS.md §Remaining follow-up |
| ERG-T5 | Document breaking changes in the changelog. In v0.4.0 the default schedule changed from "every 1 minute" to auto-calculated, and NULL schedule input started being rejected. These behavioural changes need explicit CHANGELOG entries so upgrading users aren't caught off guard. | 2h | PLAN_ERGONOMICS.md §Remaining follow-up |

Ergonomics follow-up subtotal: ~14 hours

Bootstrap Source Gating Follow-Up

In plain terms: Source gating (pause/resume for bulk loads) shipped in v0.5.0 with the core API and scheduler integration. This follow-up adds robustness tests for edge cases that real-world ETL pipelines will hit: What happens if you gate a source twice? What if you re-gate it after ungating? It also adds a dedicated introspection function that shows the full gate lifecycle (when gated, who gated it, how long it's been gated), and documentation showing common ETL coordination patterns like "gate → bulk load → ungate → single clean refresh."

| Item | Description | Effort | Ref |
|---|---|---|---|
| BOOT-F1 | Calling gate twice is safe. Verify that calling gate_source('orders') when orders is already gated is a harmless no-op — not an error. Important for ETL scripts that may retry on failure. | 3h | PLAN_BOOTSTRAP_GATING.md |
| BOOT-F2 | Gate → ungate → gate again works correctly. Verify the full lifecycle: gate a source (scheduler skips it), ungate it (scheduler resumes), gate it again (scheduler skips again). Proves the mechanism is reusable across multiple load cycles. | 3h | PLAN_BOOTSTRAP_GATING.md |
| BOOT-F3 | See your gates at a glance. A new bootstrap_gate_status() function that shows which sources are gated, when they were gated, who gated them, and how long they've been paused. Useful for debugging when the scheduler seems to be "doing nothing" — it might just be waiting for a gate. | 3h | PLAN_BOOTSTRAP_GATING.md |
| BOOT-F4 | Cookbook for common ETL patterns. Documentation with step-by-step recipes: gating a single source during a bulk load, coordinating multiple source loads that must finish together, gating only part of a stream table DAG, and the classic "nightly batch → gate → load → ungate → single clean refresh" workflow. | 3h | PLAN_BOOTSTRAP_GATING.md |

Bootstrap gating follow-up subtotal: ~12 hours

dbt Integration Enhancements

In plain terms: The dbt macro package (dbt-pgtrickle) shipped in v0.4.0 with the core stream_table materialization. This adds three improvements: a stream_table_status macro that lets dbt models query health information (stale? erroring? how many refreshes?) so you can build dbt tests that fail when a stream table is unhealthy; a bulk refresh_all_stream_tables operation for CI pipelines that need everything fresh before running tests; and expanded integration tests covering the alter_stream_table flow (which gets more important once create_or_replace lands in the same release).

| Item | Description | Effort | Ref |
|---|---|---|---|
| DBT-1 | Check stream table health from dbt. A new stream_table_status() macro that returns whether a stream table is healthy, stale, or erroring — so you can write dbt tests like "fail if the orders summary hasn't refreshed in the last 5 minutes." Makes pg_trickle a first-class citizen in dbt's testing framework. | 3h | PLAN_ECO_SYSTEM.md §Project 1 |
| DBT-2 | Refresh everything in one command. A dbt run-operation refresh_all_stream_tables command that refreshes all stream tables in the correct dependency order. Designed for CI pipelines: run it after dbt run and before dbt test to make sure all materialized data is current. | 2h | PLAN_ECO_SYSTEM.md §Project 1 |
| DBT-3 | Test the dbt ↔ alter flow. Integration tests that verify query changes, config changes, and mode switches all work correctly when made through dbt's stream_table materialization. Especially important now that create_or_replace is landing in the same release. | 3h | PLAN_ECO_SYSTEM.md §Project 1 |

dbt integration subtotal: ~8 hours

SQL Documentation Gaps

In plain terms: Once EC-03 (window functions in expressions) and EC-32 (ALL (subquery)) are implemented in this release, the documentation needs to explain the new patterns with examples. The foreign table polling CDC feature (shipped in v0.2.2) also needs a worked example showing common setups like postgres_fdw source tables with periodic polling.

| Item | Description | Effort | Ref |
|---|---|---|---|
| DOC-1 | Show users how ALL-subqueries work. Once EC-32 lands, add a SQL Reference section explaining WHERE price > ALL (SELECT ...), how pg_trickle rewrites it internally, and a complete worked example with sample data and expected output. | 2h | GAP_SQL_OVERVIEW.md |
| DOC-2 | Show the window-in-expression pattern. Once EC-03 lands, add a before/after example to the SQL Reference: "Here's your original query with CASE WHEN ROW_NUMBER() ..., and here's what pg_trickle does under the hood to make it work incrementally." | 2h | PLAN_EDGE_CASES.md EC-03 |
| DOC-3 | Walkthrough for foreign table sources. A step-by-step recipe showing how to create a postgres_fdw foreign table, use it as a stream table source with polling-based change detection, and what to expect in terms of refresh behaviour. This feature shipped in v0.2.2 but was never properly documented with an example. | 1h | Existing feature (v0.2.2) |

SQL documentation subtotal: ~5 hours

v0.6.0 total: ~77–92h

Exit criteria:

  • Partitioned source tables E2E-tested; ATTACH PARTITION detected
  • WAL mode works with publish_via_partition_root = true
  • create_or_replace_stream_table deployed; dbt macro updated
  • SCC algorithm in place; monotonicity checker rejects non-monotone cycles
  • WAL + keyless without REPLICA IDENTITY FULL rejected at creation (EC-19)
  • ALTER FUNCTION body changes detected via pg_proc hash polling (EC-16)
  • Stuck auto CDC mode surfaces explanation in logs and health check (EC-18)
  • Missing WAL slot after restore auto-detected with TRIGGER fallback (EC-34)
  • Window functions in expressions supported via subquery-lift rewrite (EC-03)
  • ALL (subquery) rewritten to NULL-safe anti-join (EC-32)
  • Ergonomics E2E tests for calculated schedule, warnings, and removed GUCs pass
  • gate_source() idempotency and re-gating tested; bootstrap_gate_status() available
  • dbt stream_table_status() and refresh_all_stream_tables macros shipped
  • SQL Reference updated for EC-03, EC-32, and foreign table polling patterns
  • Extension upgrade path tested (0.5.0 → 0.6.0)

v0.7.0 — Performance, Watermarks, Circular DAG Execution, Observability & Infrastructure

Status: Released (2026-03-16).

Goal: Land Part 9 performance improvements (parallel refresh scheduling, MERGE strategy optimization, advanced benchmarks), add user-injected temporal watermark gating for batch-ETL coordination, complete the fixpoint scheduler for circular stream table DAGs, ship ready-made Prometheus/Grafana monitoring, and prepare the 1.0 packaging and deployment infrastructure.

Completed items

Watermark Gating

In plain terms: A scheduling control for ETL pipelines where multiple source tables are populated by separate jobs that finish at different times. For example, orders might be loaded by a job that finishes at 02:00 and products by one that finishes at 03:00. Without watermarks, the scheduler might refresh a stream table that joins the two at 02:30, producing a half-complete result. Watermarks let each ETL job declare "I'm done up to timestamp X", and the scheduler waits until all sources are caught up within a configurable tolerance before proceeding.

Let producers signal their progress so the scheduler only refreshes stream tables when all contributing sources are aligned within a configurable tolerance. The primary use case is nightly batch ETL pipelines where multiple source tables are populated on different schedules.
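The intended producer/consumer flow. Function names come from the WM items below; the argument order and literal values are illustrative:

```sql
-- Each ETL job declares how far it has loaded when it finishes:
SELECT pgtrickle.advance_watermark('orders',   '2026-03-15 02:00:00+00');
SELECT pgtrickle.advance_watermark('products', '2026-03-15 03:00:00+00');

-- Group the sources that must be aligned before dependents refresh:
SELECT pgtrickle.create_watermark_group(
    'nightly_etl', ARRAY['orders', 'products'], '5 minutes'
);

-- Until the watermarks agree within the tolerance, the scheduler logs
-- SKIP(watermark_misaligned) instead of refreshing:
SELECT * FROM pgtrickle.watermark_status();
```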

| Item | Description | Effort | Ref |
|---|---|---|---|
| WM-1 | Catalog: pgt_watermarks table (source_relid, current_watermark, updated_at, wal_lsn_at_advance); pgt_watermark_groups table (group_name, sources, tolerance) | ✅ Done | PLAN_WATERMARK_GATING.md |
| WM-2 | advance_watermark(source, watermark) — monotonicity check, store LSN alongside watermark, lightweight scheduler signal | ✅ Done | PLAN_WATERMARK_GATING.md |
| WM-3 | create_watermark_group(name, sources[], tolerance) / drop_watermark_group() | ✅ Done | PLAN_WATERMARK_GATING.md |
| WM-4 | Scheduler pre-check: evaluate watermark alignment predicate; skip + log SKIP(watermark_misaligned) if not aligned | ✅ Done | PLAN_WATERMARK_GATING.md |
| WM-5 | watermarks(), watermark_groups(), watermark_status() introspection functions | ✅ Done | PLAN_WATERMARK_GATING.md |
| WM-6 | E2E tests: nightly ETL, micro-batch tolerance, multiple pipelines, mixed external+internal sources | ✅ Done | PLAN_WATERMARK_GATING.md |

Watermark gating: ✅ Complete

Circular Dependencies — Scheduler Integration

In plain terms: Completes the circular DAG work started in v0.6.0. When stream tables reference each other in a cycle (A → B → A), the scheduler now runs them repeatedly until the result stabilises — no more changes flowing through the cycle. This is called "fixpoint iteration", like solving a system of equations by re-running it until the numbers stop moving. If it doesn't converge within a configurable number of rounds (default 100) it surfaces an error rather than looping forever.

Completes the SCC foundation from v0.6.0 with a working fixpoint iteration loop. Stream tables in a monotone cycle are refreshed repeatedly until convergence (zero net change) or max_fixpoint_iterations is exceeded.
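Observable behaviour after a cyclic refresh, using the monitoring surface from CYC-7 (column names per the catalog items above):

```sql
-- How many rounds did the last fixpoint refresh take per cycle group?
SELECT * FROM pgtrickle.pgt_scc_status();

-- A cycle that does not converge within the limit is marked ERROR
-- rather than looping forever:
SHOW pg_trickle.max_fixpoint_iterations;  -- default 100
```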

| Item | Description | Effort | Ref |
|---|---|---|---|
| CYC-5 | Scheduler fixpoint iteration: iterate_to_fixpoint(), convergence detection from (rows_inserted, rows_deleted), non-convergence → ERROR status | ✅ Done | PLAN_CIRCULAR_REFERENCES.md Part 5 |
| CYC-6 | Creation-time validation: allow monotone cycles when allow_circular=true; assign scc_id; recompute SCCs on drop_stream_table | ✅ Done | PLAN_CIRCULAR_REFERENCES.md Part 6 |
| CYC-7 | Monitoring: scc_id + last_fixpoint_iterations in views; pgtrickle.pgt_scc_status() function | ✅ Done | PLAN_CIRCULAR_REFERENCES.md Part 7 |
| CYC-8 | Documentation + E2E tests (e2e_circular_tests.rs): 6 scenarios (monotone cycle, non-monotone reject, convergence, non-convergence→ERROR, drop breaks cycle, allow_circular=false default) | ✅ Done | PLAN_CIRCULAR_REFERENCES.md Part 8 |

Circular dependencies subtotal: ~19 hours

Last Differential Mode Gaps

In plain terms: Three query patterns that previously fell back to FULL refresh in AUTO mode — or hard-errored in explicit DIFFERENTIAL mode — despite the DVM engine having the infrastructure to handle them. All three gaps are now closed.

| Item | Description | Effort | Ref |
|---|---|---|---|
| DG-1 | User-Defined Aggregates (UDAs). PostGIS (ST_Union, ST_Collect), pgvector vector averages, and any CREATE AGGREGATE function were rejected. Fix: classify unknown aggregates as AggFunc::UserDefined and route them through the existing group-rescan strategy — no new delta math required. | ✅ Done | PLAN_LAST_DIFFERENTIAL_GAPS.md §G1 |
| DG-2 | Window functions nested in expressions. RANK() OVER (...) + 1, CASE WHEN ROW_NUMBER() OVER (...) <= 10, COALESCE(LAG(v) OVER (...), 0) etc. were rejected. | ✅ Done (v0.6.0) | PLAN_LAST_DIFFERENTIAL_GAPS.md §G2 |
| DG-3 | Sublinks in deeply nested OR. The two-stage rewrite pipeline handled flat EXISTS(...) OR … and AND(EXISTS OR …) but gave up on multiple OR+sublink conjuncts. Fix: expand all OR+sublink conjuncts in AND to a cartesian product of UNION branches with a 16-branch explosion guard. | ✅ Done | PLAN_LAST_DIFFERENTIAL_GAPS.md §G3 |

Last differential gaps: ✅ Complete

Pre-1.0 Infrastructure Prep

In plain terms: Three preparatory tasks that make the eventual 1.0 release smoother. A draft Docker Hub image workflow (tests the build but doesn't publish yet); a PGXN metadata file so the extension can eventually be installed with pgxn install pg_trickle; and a basic CNPG integration test that verifies the extension image loads correctly in a CloudNativePG cluster. None of these ship user-facing features — they're CI and packaging scaffolding.

| Item | Description | Effort | Status |
|---|---|---|---|
| INFRA-1 | Prove the Docker image builds. Set up a CI workflow that builds the official Docker Hub image (PostgreSQL 18 + pg_trickle pre-installed), runs a smoke test (create extension, create a stream table, refresh it), but doesn't publish anywhere yet. When 1.0 arrives, publishing is just flipping a switch. | 5h | ✅ Done |
| INFRA-2 | Publish an early PGXN testing release. Draft META.json and upload a release_status: "testing" package to PGXN so pgxn install pg_trickle works for early adopters now. PGXN explicitly supports pre-stable releases; this gets real-world install testing and establishes registry presence before 1.0. At 1.0 the only change is flipping release_status to "stable". | 2–3h | ✅ Done |
| INFRA-3 | Verify Kubernetes deployment works. A CI smoke test that deploys the pg_trickle extension image into a CloudNativePG (CNPG) Kubernetes cluster, creates a stream table, and confirms a refresh cycle completes. Catches packaging and compatibility issues before they reach Kubernetes users. | 4h | ✅ Done |

Pre-1.0 infrastructure prep: ✅ Complete

Performance — Regression Fixes & Benchmark Infrastructure (Part 9 S1–S2) ✅ Done

Fixes Criterion benchmark regressions identified in Part 9 and ships five benchmark infrastructure improvements to support data-driven performance decisions.

| Item | Description | Status |
|---|---|---|
| A-3 | Fix prefixed_col_list/20 +34% regression — eliminate intermediate Vec allocation | ✅ Done |
| A-4 | Fix lsn_gt +22% regression — use split_once instead of split().collect() | ✅ Done |
| I-1c | just bench-docker target for running Criterion inside the Docker builder image | ✅ Done |
| I-2 | Per-cycle [BENCH_CYCLE] CSV output in E2E benchmarks for external analysis | ✅ Done |
| I-3 | EXPLAIN ANALYZE capture mode (PGS_BENCH_EXPLAIN=true) for delta query plans | ✅ Done |
| I-6 | 1M-row benchmark tier (bench_*_1m_* + bench_large_matrix) | ✅ Done |
| I-8 | Criterion noise reduction (sample_size(200), measurement_time(10s)) | ✅ Done |

Performance — Parallel Refresh, MERGE Optimization & Advanced Benchmarks (Part 9 S4–S6) ✅ Done

DAG level-parallel scheduling, improved MERGE strategy selection (xxh64 hashing, aggregate saturation bypass, cost-based threshold), and expanded benchmark suite (JSON comparison, concurrent writers, window/lateral/CTE).

| Item | Description | Status |
|---|---|---|
| C-1 | DAG level extraction (topological_levels() on StDag and ExecutionUnitDag) | ✅ Done |
| C-2 | Level-parallel dispatch (existing parallel_dispatch_tick infrastructure sufficient) | ✅ Done |
| C-3 | Result communication (existing SchedulerJob + pgt_refresh_history sufficient) | ✅ Done |
| D-1 | xxh64 hash-based change detection for wide tables (≥50 cols) | ✅ Done |
| D-2 | Aggregate saturation FULL bypass (changes ≥ groups → FULL) | ✅ Done |
| D-3 | Cost-based strategy selection from pgt_refresh_history data | ✅ Done |
| I-4 | Cross-run comparison tool (just bench-compare, JSON output) | ✅ Done |
| I-5 | Concurrent writer benchmarks (1/2/4/8 writers) | ✅ Done |
| I-7 | Window / lateral / CTE / UNION ALL operator benchmarks | ✅ Done |

v0.7.0 total: ~59–62h

Exit criteria:

  • Part 9 performance: DAG levels, xxh64 hashing, aggregate saturation bypass, cost-based threshold, advanced benchmarks
  • advance_watermark + scheduler gating operational; ETL E2E tests pass
  • Monotone circular DAGs converge to fixpoint; non-convergence surfaces as ERROR
  • UDAs, nested window expressions, and deeply nested OR+sublinks supported in DIFFERENTIAL mode
  • Docker Hub image CI workflow builds and smoke-tests successfully
  • PGXN testing release uploaded; pgxn install pg_trickle works
  • CNPG integration smoke test passes in CI
  • Extension upgrade path tested (0.6.0 → 0.7.0)

v0.8.0 — pg_dump Support & Test Hardening

Status: Released

Goal: Complete the pg_dump round-trip story so stream tables survive pg_dump/pg_restore cycles, and comprehensively harden the E2E test suites with multiset invariants to mathematically enforce DVM correctness.

Completed items (click to expand)

pg_dump / pg_restore Support

In plain terms: pg_dump is the standard PostgreSQL backup tool. Without this, a dump of a database containing stream tables may not capture them correctly — and restoring from that dump would require recreating them by hand. This teaches pg_dump to emit valid SQL for every stream table, and adds logic to automatically re-link orphaned catalog entries when restoring an extension from a backup.

Complete the native DDL story: teach pg_dump to emit CREATE MATERIALIZED VIEW … WITH (pgtrickle.stream = true) for stream tables and add an event trigger that re-links orphaned catalog entries on extension restore.
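Under this design, the dump output for the example stream table from the introduction might look roughly as follows. This is an illustrative sketch, not captured output: the pgtrickle.schedule reloption name is an assumption, and the actual emitted DDL may differ.

```sql
-- Hypothetical pg_dump output for a stream table once NAT-DUMP lands.
-- The pgtrickle.schedule reloption is assumed for illustration.
CREATE MATERIALIZED VIEW active_orders
    WITH (pgtrickle.stream = true, pgtrickle.schedule = '30s') AS
SELECT * FROM orders WHERE status = 'active';

-- Restore path: re-link any orphaned catalog entries after CREATE EXTENSION.
SELECT pgtrickle.restore_stream_tables();
```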

| Item | Description | Effort | Ref |
|---|---|---|---|
| NAT-DUMP | generate_dump() + restore_stream_tables() companion functions (done); event trigger on extension load for orphaned catalog entries | 3–4d | PLAN_NATIVE_SYNTAX.md §pg_dump |
| NAT-TEST | E2E tests: pg_dump round-trip, restore from backup, orphaned-entry recovery | 2–3d | PLAN_NATIVE_SYNTAX.md §pg_dump |

pg_dump support subtotal: ~5–7 days

Test Suite Evaluation & Hardening

In plain terms: Replacing legacy, row-count-based assertions with comprehensive, order-independent multiset checks (assert_st_matches_query) across all testing tiers. Proving these multiset invariants mathematically enforces differential dataflow correctness under arbitrary DML interleavings and edge cases.

| Item | Description | Effort | Ref |
|---|---|---|---|
| TE1 | Unit Test Hardening: Full multiset equality testing for pure-Rust DVM operators | Done | PLAN_EVALS_UNIT |
| TE2 | Light E2E Migration: Expand speed-optimized E2E pipeline with rigorous symmetric difference checks | Done | PLAN_EVALS_LIGHT_E2E |
| TE3 | Integration Concurrency: Prove complex orchestration correctness under transaction delays | Done | PLAN_EVALS_INTEGRATION |
| TE4 | Full E2E Hardening: Validate cross-boundary, multi-DAG cascades, partition handling, and upgrade paths | Done | PLAN_EVALS_FULL_E2E |
| TE5 | TPC-H Smoke Test: Stateful invariant evaluations for heavily randomized DML loads over large matrices | Done | PLAN_EVALS_TPCH |
| TE6 | Property-Based Invariants: Chaotic property testing pipelines for topological boundaries and cyclic executions | Done | PLAN_PROPERTY_BASED_INVARIANTS |
| TE7 | cargo-nextest Migration: Move test suite execution to cargo-nextest to aggressively parallelize and isolate tests, solving wall-clock execution regressions | 1–2d | PLAN_CARGO_NEXTEST |

Test evaluation subtotal: ~11–14 days (mostly complete)

v0.8.0 total: ~16–21 days

Exit criteria:

  • Test infrastructure hardened with exact mathematical multiset validation
  • Test harness migrated to cargo-nextest to fix speed and CI flake regressions
  • pg_dump round-trip produces valid, restorable SQL for stream tables (Done)
  • Extension upgrade path tested (0.7.0 → 0.8.0)

v0.9.0 — Incremental Aggregate Maintenance

Status: Released (2026-03-20).

Goal: Implement algebraic incremental maintenance for decomposable aggregates (COUNT, SUM, AVG, MIN, MAX, STDDEV), reducing per-group refresh from O(group_size) to O(1) for the common case. This is the highest-potential-payoff item in the performance plan — benchmarks show aggregate scenarios going from 2.5 ms to sub-1 ms per group.

Completed items (click to expand)

Critical Bug Fixes

| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| G-1 | panic!() in SQL-callable source_gates() and watermarks() functions. Both functions reach panic!() on any SPI error, crashing the PostgreSQL backend process. AGENTS.md explicitly forbids panic!() in code reachable from SQL. Replace both .unwrap_or_else(|e| panic!(…)) calls with pgrx::error!(…) so any SPI failure surfaces as a PostgreSQL ERROR instead. | ~1h | ✅ Done | src/api.rs |

Critical bug fixes subtotal: ~1 hour

Algebraic Aggregate Shortcuts (B-1)

In plain terms: When only one row changes in a group of 100,000, today pg_trickle re-scans all 100,000 rows to recompute the aggregate. Algebraic maintenance keeps running totals: new_sum = old_sum + Δsum, new_count = old_count + Δcount. Only MIN/MAX needs a rescan — and only when the deleted value was the current minimum or maximum.
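A minimal sketch of what an algebraic merge looks like in SQL. The table, column, and delta names here are hypothetical — pg_trickle's generated MERGE differs — but the shape of the arithmetic is the point:

```sql
-- Sketch only: O(1) algebraic maintenance of SUM/COUNT/AVG per changed group.
-- sales_summary, amt, region, and the delta relation are illustrative names.
UPDATE sales_summary s
SET __pgt_aux_sum_amt   = s.__pgt_aux_sum_amt   + d.delta_sum,
    __pgt_aux_count_amt = s.__pgt_aux_count_amt + d.delta_count,
    avg_amt = (s.__pgt_aux_sum_amt + d.delta_sum)
              / NULLIF(s.__pgt_aux_count_amt + d.delta_count, 0)
FROM delta d
WHERE s.region = d.region;   -- touches only groups present in the delta
```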

| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| B1-1 | Algebraic rules: COUNT, SUM (already algebraic), AVG (done — aux cols), STDDEV/VAR (done — sum-of-squares decomposition), MIN/MAX with rescan guard (already implemented) | 3–4 wk | ✅ Done | PLAN_NEW_STUFF.md §B-1 |
| B1-2 | Auxiliary column management (__pgt_aux_sum_*, __pgt_aux_count_*, __pgt_aux_sum2_* — done); hidden via __pgt_* naming convention (existing NOT LIKE '__pgt_%' filter) | 1–2 wk | ✅ Done | PLAN_NEW_STUFF.md §B-1 |
| B1-3 | Migration story for existing aggregate stream tables; periodic full-group recomputation to reset floating-point drift | 1 wk | ✅ Done | PLAN_NEW_STUFF.md §B-1 |
| B1-4 | Fallback to full-group recomputation for non-decomposable aggregates (mode, percentile, string_agg with ordering) | 1 wk | ✅ Done | PLAN_NEW_STUFF.md §B-1 |
| B1-5 | Property-based tests: MIN/MAX boundary case (deleting the exact current min or max value must trigger rescan) | 1 wk | ✅ Done | PLAN_NEW_STUFF.md §B-1 |

Implementation Progress

Completed:

  • AVG algebraic maintenance (B1-1): AVG no longer triggers full group-rescan. Classified as is_algebraic_via_aux() and tracked via __pgt_aux_sum_* / __pgt_aux_count_* columns. The merge expression computes (old_sum + ins - del) / NULLIF(old_count + ins - del, 0).

  • STDDEV/VAR algebraic maintenance (B1-1): STDDEV_POP, STDDEV_SAMP, VAR_POP, and VAR_SAMP are now algebraic using sum-of-squares decomposition. Auxiliary columns: __pgt_aux_sum_* (running SUM), __pgt_aux_sum2_* (running SUM(x²)), __pgt_aux_count_*. Merge formulas:

    • VAR_POP = GREATEST(0, (n·sum2 − sum²) / n²)
    • VAR_SAMP = GREATEST(0, (n·sum2 − sum²) / (n·(n−1)))
    • STDDEV_POP = SQRT(VAR_POP), STDDEV_SAMP = SQRT(VAR_SAMP)

    NULL guards match PostgreSQL semantics (NULL when count ≤ threshold).
  • Auxiliary column infrastructure (B1-2): create_stream_table() and alter_stream_table() detect AVG/STDDEV/VAR aggregates and automatically add NUMERIC sum/sum2 and BIGINT count columns. Full refresh and initialization paths inject SUM(arg), COUNT(arg), and SUM(arg*arg). All __pgt_aux_* columns are automatically hidden by the existing NOT LIKE '__pgt_%' convention used throughout the codebase.

  • Non-decomposable fallback (B1-4): Already existed as the group-rescan strategy — any aggregate not classified as algebraic or algebraic-via-aux falls back to full group recomputation.

  • Property-based tests (B1-5): Seven proptest tests verify: (a) MIN merge uses LEAST, MAX merge uses GREATEST; (b) deleting the exact current extremum triggers rescan; (c) delta expressions use matching aggregate functions; (d) AVG is classified as algebraic-via-aux (not group-rescan); (e) STDDEV/VAR use sum-of-squares algebraic path with GREATEST guard; (f) STDDEV wraps in SQRT, VAR does not; (g) DISTINCT STDDEV falls back (not algebraic).

  • Migration story (B1-3): ALTER QUERY transitions are seamless. Handled by extending migrate_aux_columns to execute ALTER TABLE ADD COLUMN or DROP COLUMN statements that exactly match the changes in the new query's new_avg_aux or new_sum2_aux definitions.

  • Floating-point drift reset (B1-3): Implemented a global GUC pg_trickle.algebraic_drift_reset_cycles (0 = disabled) that counts differential refresh attempts per stream table in scheduler memory. When the threshold fires, the action degrades to RefreshAction::Reinitialize.

  • E2E integration tests: Multi-cycle insert, update, and delete sequences verify correct results with no regressions (added specifically for STDDEV/VAR).
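The sum-of-squares merge formulas reduce to a single SQL expression. A self-contained sketch with the auxiliary values plugged in as literals (not the generated SQL verbatim):

```sql
-- n = __pgt_aux_count, s = __pgt_aux_sum, s2 = __pgt_aux_sum2 (illustrative).
SELECT GREATEST(0, (n * s2 - s * s) / (n * (n - 1)))       AS var_samp,
       SQRT(GREATEST(0, (n * s2 - s * s) / (n * (n - 1)))) AS stddev_samp
FROM (VALUES (3::numeric, 12::numeric, 50::numeric)) AS aux(n, s, s2)
WHERE n > 1;  -- NULL-guard threshold for the sample variants
-- var_samp = (3·50 − 12²) / (3·2) = 1 for these values
```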

Remaining work:

  • Extension upgrade path (0.8.0 → 0.9.0): Upgrade SQL stub created. Left as a final pre-release checklist item to generate the final sql/archive/pg_trickle--0.9.0.sql with cargo pgrx package once all CI checks pass.

  • F15 — Selective CDC Column Capture: ✅ Complete. Column-selection pipeline, monitoring exposure via check_cdc_health().selective_capture, and 3 E2E integration tests done.

⚠️ Critical: the MIN/MAX maintenance rule is directionally tricky. The correct condition for triggering a rescan is: deleted value equals the current min/max (not when it differs). Getting this backwards silently produces stale aggregates on the most common OLTP delete pattern. See the corrected table and risk analysis in PLAN_NEW_STUFF.md §B-1.
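To make the direction unambiguous, here is the guard's shape as a hedged SQL sketch (all names hypothetical; the real logic lives in the generated merge expression):

```sql
-- Correct direction: rescan only when the deleted value WAS the stored minimum.
UPDATE st
SET min_v = CASE
    WHEN d.deleted_v = st.min_v                            -- extremum removed
        THEN (SELECT MIN(v) FROM src WHERE src.k = st.k)   -- O(group) rescan
    ELSE LEAST(st.min_v, d.inserted_min)                   -- O(1) merge
END
FROM d
WHERE st.k = d.k;
```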

Retraction consideration (B-1): Keep in v0.9.0, but item B1-5 (property-based tests covering the MIN/MAX boundary case) is a hard prerequisite for B1-1, not optional follow-on work. The MIN/MAX rule was stated backwards in the original spec; the corrected rule is now in PLAN_NEW_STUFF.md. Do not merge any MIN/MAX algebraic path until property-based tests confirm: (a) deleting the exact current min triggers a rescan and (b) deleting a non-min value does not. Floating-point drift reset (B1-3) is also required before enabling persistent auxiliary columns.

B1-5 hard prerequisite satisfied. Property-based tests now cover both conditions — see prop_min_max_rescan_guard_direction in tests/property_tests.rs.

Algebraic aggregates subtotal: ~7–9 weeks

Advanced SQL Syntax & DVM Capabilities (B-2)

These items expand the DVM engine to handle richer SQL constructs and improve runtime execution consistency.

| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| B2-1 | LIMIT / OFFSET / ORDER BY. Top-K queries evaluated directly within the DVM engine. | 2–3 wk | ✅ Done | PLAN_ORDER_BY_LIMIT_OFFSET.md |
| B2-2 | LATERAL Joins. Expanding the parser and DVM diff engine to handle LATERAL subqueries. | 2 wk | ✅ Done | PLAN_LATERAL_JOINS.md |
| B2-3 | View Inlining. Allow stream tables to query standard PostgreSQL views natively. | 1–2 wk | ✅ Done | PLAN_VIEW_INLINING.md |
| B2-4 | Synchronous / Transactional IVM. Evaluating DVM diffs synchronously in the same transaction as the DML. | 3 wk | ✅ Done | PLAN_TRANSACTIONAL_IVM.md |
| B2-5 | Cross-Source Snapshot Consistency. Improving engine consistency models when joining multiple tables. | 2 wk | ✅ Done | PLAN_CROSS_SOURCE_SNAPSHOT_CONSISTENCY.md |
| B2-6 | Non-Determinism Guarding. Better handling or rejection of non-deterministic functions (random(), now()). | 1 wk | ✅ Done | PLAN_NON_DETERMINISM.md |

Multi-Table Delta Batching (B-3)

In plain terms: When a join query has three source tables and all three change in the same cycle, today pg_trickle makes three separate passes through the source tables. B-3 merges those passes into one and prunes UNION ALL branches for sources with no changes.
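An illustrative merged-delta shape for B-3 (source and delta relation names are hypothetical): one pass over all per-source deltas, weights summed per row, and zero-weight rows (inserted and deleted within the same cycle) dropped.

```sql
-- Sketch only: cross-source delta merging with weight aggregation.
SELECT __pgt_row_id, SUM(weight) AS weight
FROM (
    SELECT __pgt_row_id, weight FROM delta_orders
    UNION ALL
    SELECT __pgt_row_id, weight FROM delta_customers  -- branch pruned when the
                                                      -- source has no changes (B3-1)
) deltas
GROUP BY __pgt_row_id
HAVING SUM(weight) <> 0;  -- drop rows that net out to no change
```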

| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| B3-1 | Intra-query delta-branch pruning: skip UNION ALL branch entirely when a source has zero changes in this cycle | 1–2 wk | ✅ Done | PLAN_NEW_STUFF.md §B-3 |
| B3-2 | Merged-delta generation: weight aggregation (GROUP BY __pgt_row_id, SUM(weight)) for cross-source deduplication; remove zero-weight rows | 3–4 wk | ✅ Done (v0.10.0) | PLAN_NEW_STUFF.md §B-3 |
| B3-3 | Property-based correctness tests for simultaneous multi-source changes; diamond-flow scenarios | 1–2 wk | ✅ Done (v0.10.0) | PLAN_NEW_STUFF.md §B-3 |

✅ B3-2 correctly uses weight aggregation (GROUP BY __pgt_row_id, SUM(weight)) instead of DISTINCT ON. B3-3 property-based tests (6 diamond-flow scenarios) verify correctness.

Multi-source delta batching subtotal: ~5–8 weeks

Phase 7 Gap Resolutions (DVM Correctness, Syntax & Testing)

These items pull in the remaining correctness edge cases and syntax expansions identified in the Phase 7 SQL Gap Analysis, and finish maturing the exhaustive differential E2E test suite.

| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| G1.1 | JOIN Key Column Changes. Handle updates that simultaneously modify a JOIN key and right-side tracked columns. | 3–5d | ✅ Done | GAP_SQL_PHASE_7.md |
| G1.2 | Window Function Partition Drift. Explicit tracking for updates that cause rows to cross PARTITION BY ranges. | 4–6d | ✅ Done | GAP_SQL_PHASE_7.md |
| G1.5/G7.1 | Keyless Table Duplicate Identity. Resolve __pgt_row_id collisions for non-PK tables with exact duplicate rows. | 3–5d | ✅ Done | GAP_SQL_PHASE_7.md |
| G5.6 | Range Aggregates. Support and differentiate RANGE_AGG and RANGE_INTERSECT_AGG. | 1–2d | ✅ Done | GAP_SQL_PHASE_7.md |
| G5.3 | XML Expression Parsing. Native DVM handling for T_XmlExpr syntax trees. | 1–2d | ✅ Done | GAP_SQL_PHASE_7.md |
| G5.5 | NATURAL JOIN Drift Tracking. DVM tracking of schema shifts in NATURAL JOIN between refreshes. | 2–3d | ✅ Done | GAP_SQL_PHASE_7.md |
| F15 | Selective CDC Column Capture. Limit row I/O by only tracking columns referenced in query lineage. | 1–2 wk | ✅ Done | GAP_SQL_PHASE_6.md |
| F40 | Extension Upgrade Migrations. Robust versioned SQL schema migrations. | 1–2 wk | ✅ Done | REPORT_DB_SCHEMA_STABILITY.md |

Phase 7 Gaps subtotal: ~5–7 weeks

Additional Query Engine Improvements

| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| A1 | Circular dependency support (SCC fixpoint iteration) | ~40h | ✅ Done | CIRCULAR_REFERENCES.md |
| A7 | Skip-unchanged-column scanning in delta SQL (requires column-usage demand-propagation pass in DVM parser) | ~1–2d | ✅ Done | PLAN_EDGE_CASES_TIVM_IMPL_ORDER.md Stage 4 §3.4 |
| EC-03 | Window-in-expression DIFFERENTIAL fallback warning: emit a WARNING (and eventually an INFO hint) when a stream table with CASE WHEN window_fn() OVER (...) ... silently falls back from DIFFERENTIAL to FULL refresh mode; previously failed at runtime with column st.* does not exist and no user-visible signal | ~1d | ✅ Done | PLAN_EDGE_CASES.md §EC-03 |
| A8 | pgt_refresh_groups SQL API: companion functions (pgtrickle.create_refresh_group(), pgtrickle.drop_refresh_group(), pgtrickle.refresh_groups()) for the Cross-Source Snapshot Consistency catalog table introduced in the 0.8.0→0.9.0 upgrade script | ~2–3d | ✅ Done | PLAN_CROSS_SOURCE_SNAPSHOT_CONSISTENCY.md |

Advanced Capabilities subtotal: ~11–13 weeks

DVM Engine Correctness & Performance Hardening (P2)

These items address correctness gaps that silently degrade to full-recompute modes or cause excessive I/O on each differential cycle. All are observable in production workloads.

| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| P2-1 | Recursive CTE DRed in DIFFERENTIAL mode. Currently, any DELETE or UPDATE against a recursive CTE's source in DIFFERENTIAL mode falls back to O(n) full recompute + diff. The Delete-and-Rederive (DRed) algorithm exists for IMMEDIATE mode only. Implement DRed for DeltaSource::ChangeBuffer so recursive CTE stream tables in DIFFERENTIAL mode maintain O(delta) cost. | 2–3 wk | ⏭️ Deferred to v0.10.0 | src/dvm/operators/recursive_cte.rs |
| P2-2 | SUM NULL-transition rescan for FULL OUTER JOIN aggregates. When SUM sits above a FULL OUTER JOIN and rows transition between matched and unmatched states (matched→NULL), the algebraic formula gives 0 instead of NULL, triggering a child_has_full_join() full-group rescan on every cycle where rows cross that boundary. Implement a targeted correction that avoids full-group rescans in the common case. | 1–2 wk | ⏭️ Deferred to v0.10.0 | src/dvm/operators/aggregate.rs |
| P2-3 | DISTINCT multiplicity-count JOIN overhead. Every differential refresh for SELECT DISTINCT queries joins against the stream table's __pgt_count column for the full stream table, even when only a tiny delta is being processed. Replace with a per-affected-row lookup pattern to limit this to O(delta) I/O. | 1 wk | ✅ Done | src/dvm/operators/distinct.rs |
| P2-4 | Materialized view sources in IMMEDIATE mode (EC-09). Stream tables that use a PostgreSQL materialized view as a source are rejected at creation time when IMMEDIATE mode is requested. Implement a polling-change-detection wrapper (same approach as EC-05 for foreign tables) to support REFRESH MATERIALIZED VIEW-sourced queries in IMMEDIATE mode. | 2–3 wk | ⏭️ Deferred to v0.10.0 | plans/PLAN_EDGE_CASES.md §EC-09 |
| P2-5 | changed_cols bitmask captured but not consumed in delta scan SQL. Every CDC change buffer row stores a changed_cols BIGINT bitmask recording which source columns were modified by an UPDATE. The DVM delta scan CTE reads every UPDATE row regardless of whether any query-referenced column actually changed. Implement a demand-propagation pass to identify referenced columns per Scan, then inject a changed_cols & referenced_mask != 0 filter into the delta CTE WHERE clause. For wide source tables (50+ columns) where a typical UPDATE touches 1–3 columns, this eliminates ~98% of UPDATE rows entering the join/aggregate pipeline. | 2–3 wk | ✅ Done | src/dvm/operators/scan.rs · plans/PLAN_EDGE_CASES_TIVM_IMPL_ORDER.md §Task 3.1 |
| P2-6 | LATERAL subquery inner-source change triggers O(\|outer table\|) full re-execution. When any inner source has CDC entries in the current window, build_inner_change_branch() re-materializes the entire outer table snapshot and re-executes the lateral subquery for every outer row — O(\|outer\|) per affected cycle. Gate the outer-table scan behind a join to the inner delta rows so only outer rows correlated with changed inner rows are re-executed. (The analogous scalar subquery fix is P3-3; this is the lateral equivalent.) | 1–2 wk | ⏭️ Deferred to v0.10.0 | src/dvm/operators/lateral_subquery.rs |
| P2-7 | Delta predicate pushdown not implemented. WHERE predicates from the defining query are not pushed into the change buffer scan CTE. A stream table defined as SELECT … FROM orders WHERE status = 'shipped' reads all changes from pgtrickle_changes.changes_<oid> then filters — for 10K changes/cycle with 50 matching the predicate, 9,950 rows traverse the join/aggregate pipeline needlessly. Collect pushable predicates from the Filter node above the Scan; inject new_<col> / old_<col> predicate variants into the delta scan SQL. Care required: UPDATE rows need both old and new column values checked to avoid missing deletions that move rows out of the predicate window. | 2–3 wk | ✅ Done | src/dvm/operators/scan.rs · src/dvm/operators/filter.rs · plans/performance/PLAN_NEW_STUFF.md §B-2 |
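A hedged sketch of what the combined P2-5 / P2-7 delta scan filters could look like. The change-table OID, bitmask value, and predicate columns are all illustrative, not the generated SQL:

```sql
-- Sketch: delta scan CTE with referenced-column mask (P2-5) and
-- predicate pushdown (P2-7). Here columns at ordinals 0 and 2 are
-- referenced, so referenced_mask = 0b101 = 5.
WITH delta AS (
    SELECT *
    FROM pgtrickle_changes.changes_12345            -- hypothetical <oid>
    WHERE (op <> 'U' OR changed_cols & 5 <> 0)      -- skip irrelevant UPDATEs
      -- check BOTH images so rows leaving the predicate window still
      -- produce deletions:
      AND (new_status = 'shipped' OR old_status = 'shipped')
)
SELECT * FROM delta;
```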

DVM hardening (P2) subtotal: ~6–9 weeks

DVM Performance Trade-offs (P3)

These items are correct as implemented but scale with data size rather than delta size. They are lower priority than P2 but represent solid measurable wins for high-cardinality workloads.

| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| P3-1 | Window partition full recompute. Any single-row change in a window partition triggers recomputation of the entire partition. Add a partition-size heuristic: if the affected partition exceeds a configurable row threshold, downgrade to FULL refresh for that cycle and emit a pgrx::info!() message. At minimum, document the O(partition_size) cost prominently. | 1 wk | ✅ Done (documented) | src/dvm/operators/window.rs |
| P3-2 | Welford auxiliary columns for CORR/COVAR/REGR_* aggregates. CORR, COVAR_POP, COVAR_SAMP, REGR_* currently use O(group_size) group-rescan. Implement Welford-style auxiliary column accumulation (__pgt_aux_sumx_*, __pgt_aux_sumy_*, __pgt_aux_sumxy_*) to reach O(1) algebraic maintenance identical to the STDDEV/VAR path. | 2–3 wk | ⏭️ Deferred to v0.10.0 | src/dvm/operators/aggregate.rs |
| P3-3 | Scalar subquery C₀ EXCEPT ALL scan. Part 2 of the scalar subquery delta computes C₀ = C_current EXCEPT ALL Δ_inserts UNION ALL Δ_deletes by scanning the full outer snapshot. For large outer tables with an unstable inner source, this scan is proportional to the outer table size. Profile and gate the scan behind an existence check on inner-source stability to avoid it when possible; the WHERE EXISTS (SELECT 1 FROM delta_subquery) guard already handles the trivial case. | 1 wk | ✅ Done | src/dvm/operators/scalar_subquery.rs |
| P3-4 | Index-aware MERGE planning. For small deltas against large stream tables (e.g. 5 delta rows, 10M-row ST), the PostgreSQL planner often chooses a sequential scan of the stream table for the MERGE join on __pgt_row_id, yielding O(n) full-table I/O when an index lookup would be O(log n). Emit SET LOCAL enable_seqscan = off within the MERGE transaction when the delta row count is below a configurable threshold fraction of the ST row count (pg_trickle.merge_seqscan_threshold GUC, default 0.001). | 1–2 wk | ✅ Done | src/refresh.rs · src/config.rs · plans/performance/PLAN_NEW_STUFF.md §A-4 |
| P3-5 | auto_backoff GUC for falling-behind stream tables. EC-11 implemented the scheduler_falling_behind NOTIFY alert at 80% of the refresh budget. The companion auto_backoff GUC that automatically doubles the effective refresh interval when a stream table consistently runs behind was explicitly deferred. Add a pg_trickle.auto_backoff bool GUC (default off); when enabled, track a per-ST exponential backoff factor in scheduler shared state and reset it on the first on-time cycle. Prevents runaway CPU use when operators are not around to respond manually. | 1–2d | ✅ Done | src/scheduler.rs · src/config.rs · plans/PLAN_EDGE_CASES.md §EC-11 |
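For P3-4, the plan nudge might be wrapped roughly like this (stream table, delta relation, and column names are illustrative; the GUC name comes from the item description):

```sql
-- Sketch: applied only when delta_rows / st_rows < pg_trickle.merge_seqscan_threshold.
BEGIN;
SET LOCAL enable_seqscan = off;   -- scoped to this transaction; reverts at COMMIT
MERGE INTO active_orders st
USING delta d ON st.__pgt_row_id = d.__pgt_row_id
WHEN MATCHED AND d.weight < 0 THEN DELETE
WHEN MATCHED THEN UPDATE SET status = d.status
WHEN NOT MATCHED THEN INSERT (id, status) VALUES (d.id, d.status);
COMMIT;
```

SET LOCAL keeps the planner override from leaking into other queries on the same connection, which matters in pooled deployments.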

DVM performance trade-offs (P3) subtotal: ~4–7 weeks

Documentation Gaps (D)

| Item | Description | Effort | Status |
|---|---|---|---|
| D1 | Recursive CTE DIFFERENTIAL mode limitation. The O(n) fallback for mixed DELETE/UPDATE against a recursive CTE source is not documented in docs/SQL_REFERENCE.md or docs/DVM_OPERATORS.md. Users hitting DELETE/UPDATE-heavy workloads on recursive CTE stream tables will see unexpectedly slow refresh times with no explanation. Add a "Known Limitations" callout in both files. | ~2h | ✅ Done |
| D2 | pgt_refresh_groups catalog table undocumented. The catalog table added in the 0.8.0→0.9.0 upgrade script is not described in docs/SQL_REFERENCE.md. Even before the full A8 API lands, document the table schema, its purpose, and the manual INSERT/DELETE workflow users can use in the interim. | ~2h | ✅ Done |

v0.9.0 total: ~23–29 weeks

Exit criteria:

  • AVG algebraic path implemented (SUM/COUNT auxiliary columns)
  • STDDEV/VAR algebraic path implemented (sum-of-squares decomposition)
  • MIN/MAX boundary case (delete-the-extremum) covered by property-based tests
  • Non-decomposable fallback confirmed (group-rescan strategy)
  • Auxiliary columns hidden from user queries via __pgt_* naming convention
  • Migration path for existing aggregate stream tables tested
  • Floating-point drift reset mechanism in place (periodic recompute)
  • E2E integration tests for algebraic aggregate paths
  • B2-1: Top-K queries (LIMIT/OFFSET/ORDER BY) support
  • B2-2: LATERAL Joins support
  • B2-3: View Inlining support
  • B2-4: Synchronous / Transactional IVM mode
  • B2-5: Cross-Source Snapshot Consistency models
  • B2-6: Non-Determinism Guarding semantics implemented
  • Extension upgrade path tested (0.8.0 → 0.9.0)
  • G1 Correctness Gaps addressed (G1.1, G1.2, G1.5, G1.6)
  • G5 Syntax Gaps addressed (G5.2, G5.3, G5.5, G5.6)
  • G6 Test Coverage expanded (G6.1, G6.2, G6.3, G6.5)
  • F15: Selective CDC Column Capture (optimize I/O by only tracking columns referenced in query lineage)
  • F40: Extension Upgrade Migration Scripts (finalize versioned SQL schema migrations)
  • B3-1: Delta-branch pruning for zero-change sources (skip UNION ALL branch when source has no changes)
  • B3-2: Merged-delta weight aggregation — implemented in v0.10.0 (weight aggregation replaces DISTINCT ON; B3-3 property tests verify correctness)
  • B3-3: Property-based correctness tests for B3-2 — implemented in v0.10.0 (6 diamond-flow E2E property tests)
  • EC-03: WARNING emitted when window-in-expression query silently falls back from DIFFERENTIAL to FULL refresh mode
  • A8: pgt_refresh_groups SQL API (pgt_add_refresh_group, pgt_remove_refresh_group, pgt_list_refresh_groups)
  • P2-1: Recursive CTE DRed for DIFFERENTIAL mode — deferred to v0.10.0 (high risk; ChangeBuffer mode lacks old-state context for safe rederivation; recomputation fallback is correct)
  • P2-2: SUM NULL-transition rescan optimization — deferred to v0.10.0 (requires auxiliary nonnull-count columns; current rescan approach is correct)
  • P2-3: DISTINCT __pgt_count lookup scoped to O(delta) I/O per cycle
  • P2-4: Materialized view sources in IMMEDIATE mode — deferred to v0.10.0 (requires external polling-change-detection wrapper; out of scope for v0.9.0)
  • P3-1: Window partition O(partition_size) cost documented; heuristic downgrade implemented or explicitly deferred
  • P3-2: CORR/COVAR_/REGR_ Welford auxiliary columns — explicitly deferred to v0.10.0 (group-rescan strategy already works correctly for all regression/correlation aggregates)
  • P3-3: Scalar subquery C₀ EXCEPT ALL scan gated behind inner-source stability check or explicitly deferred
  • D1: Recursive CTE DIFFERENTIAL mode limitation documented in SQL_REFERENCE.md and DVM_OPERATORS.md
  • D2: pgt_refresh_groups table schema and interim workflow documented in SQL_REFERENCE.md
  • G-1: panic!() replaced with pgrx::error!() in source_gates() and watermarks() SQL functions
  • G-2 (P2-5): changed_cols bitmask consumed in delta scan CTE — referenced-column mask filter injected
  • G-3 (P2-6): LATERAL subquery inner-source scoping — deferred to v0.10.0 (requires correlation predicate extraction from raw SQL; full re-execution is correct)
  • G-4 (P2-7): Delta predicate pushdown implemented (pushable predicates injected into change buffer scan CTE)
  • G-5 (P3-4): Index-aware MERGE planning: SET LOCAL enable_seqscan = off for small deltas against large STs
  • G-6 (P3-5): auto_backoff GUC implemented; scheduler doubles interval when stream table falls behind

v0.10.0 — DVM Hardening, Connection Pooler Compatibility, Core Refresh Optimizations & Infrastructure Prep

Status: Released (2026-03-23).

Goal: Land deferred DVM correctness and performance improvements (recursive CTE DRed, FULL OUTER JOIN aggregate fix, LATERAL scoping, Welford regression aggregates, multi-source delta merging); fix a class of post-audit DVM safety issues (SQL comment injection as FROM fragments, silent wrong aggregate results, the EC-01 gap for complex join trees) and CDC correctness bugs (NULL-unsafe PK join, TRUNCATE+INSERT race, stale WAL publication after partitioning); deliver the first wave of refresh performance optimizations (index-aware MERGE, predicate pushdown, change buffer compaction, cost-based refresh strategy); enable cloud-native PgBouncer transaction-mode deployments via an opt-in compatibility mode; and complete the pre-1.0 packaging and deployment infrastructure.

Completed items (click to expand)

Connection Pooler Compatibility

In plain terms: PgBouncer is the most widely used PostgreSQL connection pooler — it sits in front of the database and reuses connections across many application threads. In its common "transaction mode" it hands a different physical connection to each transaction, which breaks anything that assumes the same connection persists between calls (session locks, prepared statements). This work introduces an opt-in compatibility mode for pg_trickle so it works correctly in cloud deployments — Supabase, Railway, Neon, and similar platforms that route through PgBouncer by default.

pg_trickle uses session-level advisory locks and PREPARE statements that are incompatible with PgBouncer transaction-mode pooling. This section introduces an opt-in graceful degradation layer for connection pooler compatibility.

| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| PB1 | Replace pg_advisory_lock() with catalog row-level locking (FOR UPDATE SKIP LOCKED) | 3–4d | ✅ Done (0.10-adjustments) | PLAN_PG_BOUNCER.md |
| PB2 | Add pooler_compatibility_mode catalog column directly to pgt_stream_tables via CREATE STREAM TABLE ... WITH (...) or alter_stream_table() to bypass PREPARE statements and skip NOTIFY locally | 3–4d | ✅ Done (0.10-adjustments) | PLAN_PG_BOUNCER.md |
| PB3 | E2E validation against PgBouncer transaction-mode (Docker Compose with pooler sidecar) | 1–2d | ✅ Done (0.10-adjustments) | PLAN_EDGE_CASES.md EC-28 |

⚠️ PB1 — SKIP LOCKED fails silently, not safely. pg_advisory_lock() blocks until the lock is granted, guaranteeing mutual exclusion. FOR UPDATE SKIP LOCKED returns zero rows immediately if the row is already locked — meaning a second worker will simply not acquire the lock and proceed as if uncontested, potentially running a concurrent refresh on the same stream table. Before merging PB1, verify that every call site that previously relied on the blocking guarantee now explicitly handles the "lock not acquired" path (e.g. skip this cycle and retry) rather than silently proceeding. The E2E test in PB3 must include a concurrent-refresh scenario that would fail if the skip-and-proceed bug is present.
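A sketch of the handled "lock not acquired" path (schema and column names are assumptions based on the catalog table named in PB2):

```sql
-- Attempt to claim the per-stream-table refresh lock via its catalog row.
SELECT st_id
FROM   pgtrickle.pgt_stream_tables    -- hypothetical schema qualification
WHERE  st_id = 42
FOR UPDATE SKIP LOCKED;
-- 1 row  → this worker holds the lock; proceed, it releases at COMMIT/ROLLBACK.
-- 0 rows → another refresh is in flight: log it, skip this cycle, retry on the
--          next tick. Proceeding here is exactly the concurrent-refresh bug.
```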

PgBouncer compatibility subtotal: ~7–10 days

DVM Correctness & Performance (deferred from v0.9.0)

In plain terms: These items were evaluated during v0.9.0 and deferred because the current implementations are correct — they just scale with data size rather than delta size in certain edge cases. All produce correct results today; this work makes them faster.

| Item | Description | Effort | Status | Ref |
|---|---|---|---|---|
| P2-1 | Recursive CTE DRed in DIFFERENTIAL mode. DELETE/UPDATE against a recursive CTE source falls back to O(n) full recompute + diff. Implement DRed for DeltaSource::ChangeBuffer to maintain O(delta) cost. | 2–3 wk | ✅ Done (0.10-adjustments) | src/dvm/operators/recursive_cte.rs |
| P2-2 | SUM NULL-transition rescan for FULL OUTER JOIN aggregates. When SUM sits above a FULL OUTER JOIN and rows transition between matched/unmatched states, the algebraic formula gives 0 instead of NULL, triggering full-group rescan. Implement targeted correction. | 1–2 wk | ✅ Done | src/dvm/operators/aggregate.rs |
| P2-4 | Materialized view sources in IMMEDIATE mode (EC-09). Implement polling-change-detection wrapper for REFRESH MATERIALIZED VIEW-sourced queries in IMMEDIATE mode. | 2–3 wk | ✅ Done | plans/PLAN_EDGE_CASES.md §EC-09 |
| P2-6 | LATERAL subquery inner-source scoped re-execution. Gate outer-table scan behind a join to inner delta rows so only correlated outer rows are re-executed, reducing O(\|outer\|) to O(delta). | 1–2 wk | ✅ Done | src/dvm/operators/lateral_subquery.rs |
| P3-2 | Welford auxiliary columns for CORR/COVAR/REGR_* aggregates. Implement Welford-style accumulation to reach O(1) algebraic maintenance identical to the STDDEV/VAR path. | 2–3 wk | ✅ Done | src/dvm/operators/aggregate.rs |
| B3-2 | Merged-delta weight aggregation. GROUP BY __pgt_row_id, SUM(weight) for cross-source deduplication; remove zero-weight rows. | 3–4 wk | ✅ Done | PLAN_NEW_STUFF.md §B-3 |
| B3-3 | Property-based correctness tests for simultaneous multi-source changes; diamond-flow scenarios. Hard prerequisite for B3-2. | 1–2 wk | ✅ Done | PLAN_NEW_STUFF.md §B-3 |

✅ B3-2 correctly uses weight aggregation (GROUP BY __pgt_row_id, SUM(weight)) instead of DISTINCT ON. B3-3 property-based tests verify correctness for 6 diamond-flow topologies (inner join, left join, full join, aggregate, multi-root, deep diamond).
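The shape of the B3-2 aggregation step, sketched in plain SQL (the CTE name and weight column are illustrative of the generated delta query, not verbatim output):

```sql
-- Rows arriving from multiple sources for the same logical row are
-- collapsed by summing their weights; a row whose weights cancel to
-- zero (e.g. insert + delete in the same window) disappears entirely.
SELECT __pgt_row_id,
       SUM(weight) AS weight
FROM merged_delta              -- illustrative CTE of cross-source deltas
GROUP BY __pgt_row_id
HAVING SUM(weight) <> 0;
```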

DVM deferred items subtotal: ~12–19 weeks

DVM Safety Fixes & CDC Correctness Hardening

These items were identified during a post-v0.9.0 audit of the DVM engine and CDC pipeline. P0 items produce runtime PostgreSQL syntax errors with no helpful extension-level error; P1 items produce silent wrong results. They affect uncommon query shapes, but users can hit them with no warning.

SQL Comment Injection (P0)

| Item | Description | Effort | Status | Ref |
|------|-------------|--------|--------|-----|
| SF-1 | build_snapshot_sql catch-all returns an SQL comment as a FROM clause fragment. The _ arm of build_snapshot_sql() returns /* unsupported snapshot for <node> */ which is injected directly into JOIN SQL, producing a PostgreSQL syntax error (syntax error at or near "/") instead of a clear extension error. Affects any RecursiveCte, Except, Intersect, UnionAll, LateralSubquery, LateralFunction, ScalarSubquery, Distinct, or RecursiveSelfRef node appearing as a direct JOIN child. Replace the catch-all arm with PgTrickleError::UnsupportedQuery. | 0.5 d | ✅ Done | src/dvm/operators/join_common.rs |
| SF-2 | Explicit /* unsupported snapshot for distinct */ string in join.rs. Hardcoded variant of SF-1 for the Distinct-child case in inner-join snapshot construction. Same fix: return PgTrickleError::UnsupportedQuery. | 0.5 d | ✅ Done | src/dvm/operators/join.rs |
| SF-3 | parser.rs FROM-clause deparser fallbacks inject SQL comments. /* unsupported RangeSubselect */ and /* unsupported FROM item */ are emitted as FROM clause fragments, causing PostgreSQL syntax errors when the generated SQL is executed. Replace with PgTrickleError::UnsupportedQuery. | 0.5 d | ✅ Done | src/dvm/parser.rs |

DVM Correctness Bugs (P1)

| Item | Description | Effort | Status | Ref |
|------|-------------|--------|--------|-----|
| SF-4 | child_to_from_sql returns None for renamed-column Project nodes, silently skipping group rescan. When a Project with column renames (e.g. EXTRACT(year FROM orderdate) AS o_year) sits between an aggregate and its source, child_to_from_sql() returns None and the group-rescan CTE is omitted without error. Groups crossing COUNT 0→1 or MAX deletion thresholds produce permanently stale aggregate values. Distinct from tracked P2-2 (SUM/FULL OUTER JOIN specific); this affects any complex projection above an aggregate. | 1–2 wk | ✅ Done | src/dvm/operators/aggregate.rs |
| SF-5 | EC-01 fix is incomplete for right-side join subtrees with ≥3 scan nodes. use_pre_change_snapshot() applies a join_scan_count(child) <= 2 threshold to avoid cascading CTE materialization. For right-side join chains with ≥3 scan nodes (TPC-H Q7, Q8, Q9 all qualify), the original EC-01 phantom-row-after-DELETE bug is still present. The roadmap marks EC-01 as "Done" without noting this remaining boundary. Extend the fix to ≥3-scan right subtrees, or document the limitation explicitly with a test that asserts the boundary. | 2–3 wk | ✅ Done (boundary documented with 5 unit tests + DVM_OPERATORS.md limitation note) | src/dvm/operators/join_common.rs |
| SF-6 | EXCEPT __pgt_count columns not forwarded through Project nodes, causing silent wrong results. EXCEPT uses a "retain but mark invisible" design (never emits 'D' events). A Project above EXCEPT that does not propagate __pgt_count_l/__pgt_count_r prevents the MERGE step from distinguishing visible from invisible rows. Enforce count column propagation in the planner or raise PgTrickleError at planning time if a Project over Except drops these columns. | 1–2 wk | ✅ Done | src/dvm/operators/project.rs |

DVM Edge-Condition Correctness (P2)

| Item | Description | Effort | Status | Ref |
|------|-------------|--------|--------|-----|
| SF-7 | Empty subquery_cols silently emits (SELECT NULL FROM …) as scalar subquery result. When inner column detection fails (e.g. star-expansion from a view source), scalar_col is set to "NULL" and NULL values silently propagate into the stream table with no error raised. Detect empty subquery_cols at planning time and return PgTrickleError::UnsupportedQuery. | 0.5 d | ✅ Done | src/dvm/operators/scalar_subquery.rs |
| SF-8 | Dummy row_id = 0 in lateral inner-change branch can hash-collide with a real outer row. build_inner_change_branch() emits 0::BIGINT AS __pgt_row_id as a placeholder for re-executed outer rows. Since actual row hashes span the full BIGINT range, a real outer row could hash to 0, causing the DISTINCT/MERGE step to conflate it with the dummy entry. Use a sentinel outside the hash range (e.g. (-9223372036854775808)::BIGINT, i.e. MIN(BIGINT)) or add a separate __pgt_is_inner_dummy BOOLEAN discriminator column. | 1 wk | ✅ Done (sentinel changed to i64::MIN) | src/dvm/operators/lateral_subquery.rs |

CDC Correctness (P1–P2)

| Item | Description | Effort | Status | Ref |
|------|-------------|--------|--------|-----|
| SF-9 | UPDATE trigger uses = (not IS NOT DISTINCT FROM) on composite PK columns, silently dropping rows with NULL PK columns. The __pgt_new JOIN __pgt_old ON pk_a = pk_a AND pk_b = pk_b uses =, so NULL = NULL evaluates to NULL (not true) and those rows are silently dropped from the change buffer. The stream table permanently diverges from the source with no error. Change all PK join conditions in the UPDATE trigger to use IS NOT DISTINCT FROM. | 0.5 d | ✅ Done | src/cdc.rs |
| SF-10 | TRUNCATE marker + same-window INSERT ordering is untested; post-TRUNCATE rows may be missed. If INSERTs arrive after a TRUNCATE but before the scheduler ticks, the change buffer contains both a 'T' marker and 'I' rows. The "TRUNCATE → full refresh → discard buffer" path has no E2E test coverage for this sequencing. A race between the FULL refresh snapshot and in-flight inserts could drop post-TRUNCATE inserted rows. Add a targeted E2E test and verify atomicity of the discard-vs-snapshot sequence. | 0.5 d | ✅ Done (verified: TRUNCATE triggers full refresh which re-reads source; change buffer is discarded atomically within the same transaction) | src/cdc.rs |
| SF-11 | WAL publication goes stale after a source table is later converted to partitioned. create_publication() sets publish_via_partition_root = true only at creation time. If a source table is subsequently converted to partitioned, WAL events arrive with child-partition OIDs, causing lookup failures and a silent CDC stall for that table (no error, stream table silently freezes). Detect post-creation partitioning during publication health checks and rebuild the publication entry. | 1–2 wk | ✅ Done | src/wal_decoder.rs |
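The SF-9 fix in miniature — why the NULL-safe comparison matters for composite PKs (the aliases and PK column names are illustrative):

```sql
-- NULL = NULL evaluates to NULL (not true), so a plain-equality join
-- silently drops any row whose PK contains a NULL:
--   ON n.pk_a = o.pk_a AND n.pk_b = o.pk_b
-- The NULL-safe form pairs old and new row images correctly:
SELECT n.*, o.*
FROM __pgt_new AS n
JOIN __pgt_old AS o
  ON n.pk_a IS NOT DISTINCT FROM o.pk_a
 AND n.pk_b IS NOT DISTINCT FROM o.pk_b;
```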

Operational & Documentation Gaps (P3)

| Item | Description | Effort | Status | Ref |
|------|-------------|--------|--------|-----|
| SF-12 | DiamondSchedulePolicy::Fastest CPU multiplication is undocumented. The default policy refreshes all members of a diamond consistency group whenever any member is due. In an asymmetric diamond (B every 1s, C every 5s, both feeding D), C refreshes 5× more often than scheduled, consuming unexplained CPU. Add a cost-implication warning to CONFIGURATION.md and ARCHITECTURE.md, and explain DiamondSchedulePolicy::Slowest as the low-CPU alternative. | 0.5 d | ✅ Done | src/dag.rs · docs/CONFIGURATION.md |
| SF-13 | ROADMAP inconsistency: B-2 (Delta Predicate Pushdown) listed as ⬜ Not started in v0.10.0 but G-4/P2-7 marked completed in v0.9.0. The v0.9.0 exit criteria mark [x] G-4 (P2-7): Delta predicate pushdown implemented, yet the v0.10.0 table lists B-2 \| Delta Predicate Pushdown \| ⬜ Not started. If B-2 has additional scope beyond G-4 (e.g. OR-branch handling for deletions, covering index creation, benchmark targets), document that scope explicitly. If B-2 is fully covered by G-4, remove or mark it done in the v0.10.0 table to avoid double-counting effort. | 0.5 d | ✅ Done (B-2 marked as completed by G-4/P2-7) | ROADMAP.md |

DVM safety & CDC hardening subtotal: ~3–4 days (SF-1–3, SF-7, SF-9–10, SF-12–13) + ~6–10 weeks (SF-4–6, SF-8, SF-11)

Core Refresh Optimizations (Wave 2)

Read the risk analyses in PLAN_NEW_STUFF.md before implementing. Implement in this order: A-4 (no schema change), B-2, C-4, then B-4.

| Item | Description | Effort | Status | Ref |
|------|-------------|--------|--------|-----|
| A-4 | Index-Aware MERGE Planning. Planner hint injection (enable_seqscan = off for small-delta / large-target); covering index auto-creation on __pgt_row_id. No schema changes required. | 1–2 wk | ✅ Done | PLAN_NEW_STUFF.md §A-4 |
| B-2 | Delta Predicate Pushdown. Push WHERE predicates from defining query into change-buffer delta_scan CTE; OR old_col handling for deletions; 5–10× delta-row-volume reduction for selective queries. | 2–3 wk | ✅ Done (v0.9.0 as G-4/P2-7) | PLAN_NEW_STUFF.md §B-2 |
| C-4 | Change Buffer Compaction. Net-change compaction (INSERT+DELETE=no-op; UPDATE+UPDATE=single row); run when buffer exceeds pg_trickle.compact_threshold; use advisory lock to serialise with refresh. | 2–3 wk | ✅ Done | PLAN_NEW_STUFF.md §C-4 |
| B-4 | Cost-Based Refresh Strategy. Replace fixed differential_max_change_ratio with a history-driven cost model fitted on pgt_refresh_history; cold-start fallback to fixed threshold. | 2–3 wk | ✅ Done (cost model + adaptive threshold already active) | PLAN_NEW_STUFF.md §B-4 |

⚠️ C-4: The compaction DELETE must use seq (the sequence primary key) not ctid as the stable row identifier. ctid changes under VACUUM and will silently delete the wrong rows. See the corrected SQL and risk analysis in PLAN_NEW_STUFF.md §C-4.
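A minimal sketch of a compaction DELETE keyed on the buffer's sequence PK — the table and column names here are assumptions for illustration; see PLAN_NEW_STUFF.md §C-4 for the actual SQL:

```sql
-- Keep only the newest buffered change per logical row. Rows are
-- identified by the immutable seq primary key — never by ctid, which
-- is not a stable row identifier and can change under VACUUM.
DELETE FROM pgtrickle.change_buffer AS cb
USING (
    SELECT row_key, max(seq) AS keep_seq
    FROM pgtrickle.change_buffer
    GROUP BY row_key
) AS latest
WHERE cb.row_key = latest.row_key
  AND cb.seq < latest.keep_seq;
```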

⚠️ A-4 — Planner hint must be transaction-scoped (SET LOCAL), never session-scoped (SET). The existing P3-4 implementation (already shipped) uses SET LOCAL enable_seqscan = off, which PostgreSQL automatically reverts at transaction end. Any extension of A-4 (e.g. the covering index auto-creation path) must continue to use SET LOCAL. Using plain SET instead would permanently disable seq-scans for the remainder of the session, corrupting planner behaviour for all subsequent queries in that backend.
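The distinction in a few lines — the MERGE target and columns are illustrative, not pg_trickle's generated SQL:

```sql
BEGIN;
SET LOCAL enable_seqscan = off;   -- scoped to this transaction;
                                  -- PostgreSQL reverts it at COMMIT/ABORT
MERGE INTO my_stream_table AS t   -- illustrative small-delta MERGE
USING delta AS d ON t.__pgt_row_id = d.__pgt_row_id
WHEN MATCHED THEN UPDATE SET val = d.val
WHEN NOT MATCHED THEN INSERT (val) VALUES (d.val);
COMMIT;
-- Plain "SET enable_seqscan = off" would instead persist for the whole
-- session, distorting every later plan in that backend.
```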

Core refresh optimizations subtotal: ~7–11 weeks

Scheduler & DAG Scalability

These items address scheduler CPU efficiency and DAG maintenance overhead at scale. They were identified as C-1 and C-2 in plans/performance/PLAN_NEW_STUFF.md but were not included in earlier milestones.

| Item | Description | Effort | Status | Ref |
|------|-------------|--------|--------|-----|
| G-7 | Tiered refresh scheduling (Hot/Warm/Cold/Frozen). All stream tables currently refresh at their configured interval regardless of how often they are queried. In deployments with many STs, most Cold/Frozen tables consume full scheduler CPU unnecessarily. Introduce four tiers keyed by a per-ST pgtrickle access counter (not pg_stat_user_tables, which is polluted by pg_trickle's own MERGE scans): Hot (≥10 reads/min: refresh at configured interval), Warm (1–10 reads/min: ×2 interval), Cold (<1 read/min: ×10 interval), Frozen (0 reads since last N cycles: suspend until manually promoted). A single GUC pg_trickle.tiered_scheduling (default off) gates the feature. | 3–4 wk | ✅ Done | src/scheduler.rs · plans/performance/PLAN_NEW_STUFF.md §C-1 |
| G-8 | Incremental DAG rebuild on DDL changes. Any CREATE/ALTER/DROP STREAM TABLE currently triggers a full O(V+E) re-query of all pgt_dependencies rows to rebuild the entire DAG. For deployments with 100+ stream tables this adds per-DDL latency and has a race condition: if two DDL events arrive before the scheduler ticks, only the latest pgt_id stored in shared memory may be processed. Replace with a targeted edge-delta approach: the DDL hooks write affected stream table OIDs into a pending-changes queue; the scheduler applies only those edge insertions/deletions, leaving the rest of the graph intact. | 2–3 wk | ✅ Done | src/dag.rs · src/scheduler.rs · plans/performance/PLAN_NEW_STUFF.md §C-2 |
| C2-1 | Ring-buffer DAG invalidation. Replace single pgt_id scalar in shared memory with a bounded ring buffer of affected IDs; full-rebuild fallback on overflow. Hard prerequisite for correctness of G-8 under rapid DDL changes. | 1 wk | ✅ Done | PLAN_NEW_STUFF.md §C-2 |
| C2-2 | Incremental topo-sort. Re-sort only the affected subgraph; cache the sorted schedule in shared memory. | 1–2 wk | ✅ Done | PLAN_NEW_STUFF.md §C-2 |

⚠️ A single pgt_id scalar in shared memory is vulnerable to overwrite when two DDL changes arrive between scheduler ticks — use a ring buffer (C2-1) or fall back to full rebuild. See PLAN_NEW_STUFF.md §C-2 risk analysis.

Scheduler & DAG scalability subtotal: ~7–10 weeks

"No Surprises" — Principle of Least Astonishment

In plain terms: pg_trickle does a lot of work automatically — rewriting queries, managing auxiliary columns, transitioning CDC modes, falling back between refresh strategies. Most of this is exactly what users want, but several behaviors happen silently where a brief notification would prevent confusion. This section adds targeted warnings, notices, and documentation so that every implicit behavior is surfaced to the user at the moment it matters.

| Item | Description | Effort | Status | Ref |
|------|-------------|--------|--------|-----|
| NS-1 | Warn on ORDER BY without LIMIT. Emit WARNING at create_stream_table / alter_stream_table time when query contains ORDER BY without LIMIT: "ORDER BY without LIMIT has no effect on stream tables — storage row order is undefined." | 2–4 h | ✅ Done | src/api.rs |
| NS-2 | Warn on append_only auto-revert. Upgrade the info!() to warning!() when append_only is automatically reverted due to DELETE/UPDATE. Add a pgtrickle_alert NOTIFY with category append_only_reverted. | 1–2 h | ✅ Done | src/refresh.rs |
| NS-3 | Promote cleanup errors after consecutive failures. Track consecutive drain_pending_cleanups() error count in thread-local state; promote from debug1 to WARNING after 3 consecutive failures for the same source OID. | 2–4 h | ✅ Done | src/refresh.rs |
| NS-4 | Document __pgt_* auxiliary columns in SQL_REFERENCE. Add a dedicated subsection listing all implicit columns (__pgt_row_id, __pgt_count, __pgt_sum, __pgt_sum2, __pgt_nonnull, __pgt_covar_*, __pgt_count_l, __pgt_count_r) with the aggregate functions that trigger each. | 2–4 h | ✅ Done | docs/SQL_REFERENCE.md |
| NS-5 | NOTICE on diamond detection with diamond_consistency='none'. When create_stream_table detects a diamond dependency and the user hasn't explicitly set diamond_consistency, emit NOTICE: "Diamond dependency detected — consider setting diamond_consistency='atomic' for consistent cross-branch reads." | 2–4 h | ✅ Done | src/api.rs · src/dag.rs |
| NS-6 | NOTICE on differential→full fallback. Upgrade the existing info!() in adaptive fallback to NOTICE so it appears at default client_min_messages level. | 0.5–1 h | ✅ Done | src/refresh.rs |
| NS-7 | NOTICE on isolated CALCULATED schedule. When create_stream_table creates an ST with schedule='calculated' that has no downstream dependents, emit NOTICE: "No downstream dependents found — schedule will fall back to pg_trickle.default_schedule_seconds (currently Ns)." | 1–2 h | ✅ Done | src/api.rs |

"No Surprises" subtotal: ~10–20 hours

v0.10.0 total: ~58–84 hours + ~32–50 weeks DVM, refresh & safety work + ~10–20 hours "No Surprises"

Exit criteria:

  • ALTER EXTENSION pg_trickle UPDATE tested (0.9.0 → 0.10.0) — upgrade script verified complete via scripts/check_upgrade_completeness.sh; adds pooler_compatibility_mode, refresh_tier, pgt_refresh_groups, and updated API function signatures
  • All public documentation current and reviewed — SQL_REFERENCE.md, CONFIGURATION.md, CHANGELOG.md, and ROADMAP.md updated for all v0.10.0 features
  • G-7: Tiered scheduling (Hot/Warm/Cold/Frozen) implemented; pg_trickle.tiered_scheduling GUC gating the feature
  • G-8: Incremental DAG rebuild implemented; DDL-triggered edge-delta replaces full O(V+E) re-query
  • C2-1: Ring-buffer DAG invalidation safe under rapid consecutive DDL changes
  • C2-2: Incremental topo-sort caches sorted schedule; verified by property-based test
  • P2-1: Recursive CTE DRed for DIFFERENTIAL mode (O(delta) instead of O(n) recompute) — implemented in 0.10-adjustments
  • P2-2: SUM NULL-transition correction for FULL OUTER JOIN aggregates — implemented; __pgt_aux_nonnull_* auxiliary column eliminates full-group rescan
  • P2-4: Materialized view sources supported in IMMEDIATE mode
  • P2-6: LATERAL subquery inner-source scoped re-execution (O(delta) instead of O(|outer|))
  • P3-2: CORR/COVAR/REGR_* Welford auxiliary columns for O(1) algebraic maintenance
  • B3-2: Merged-delta weight aggregation passes property-based correctness proofs — implemented; replaces DISTINCT ON with GROUP BY + SUM(weight) + HAVING
  • B3-3: Property-based tests for simultaneous multi-source changes — implemented; 6 diamond-flow E2E property tests
  • A-4: Covering index auto-created on __pgt_row_id with INCLUDE clause for ≤8-column schemas; planner hint prevents seq-scan on small delta; SET LOCAL confirmed (not SET) so hint reverts at transaction end
  • B-2: Predicate pushdown reduces delta volume for selective queries — bench_b2_predicate_pushdown in e2e_bench_tests.rs measures median filtered vs unfiltered refresh time; asserts filtered ≤3× unfiltered (in practice typically faster)
  • C-4: Compaction uses change_id PK (not ctid); correct under concurrent VACUUM; serialised with advisory lock; net-zero elimination + intermediate row collapse
  • B-4: Cost model self-calibrates from refresh history (estimate_cost_based_threshold + compute_adaptive_threshold with 60/40 blend); cold-start fallback to fixed GUC threshold
  • PB1: Concurrent-refresh scenario covered by test_pb1_concurrent_refresh_skip_locked_no_corruption in e2e_concurrent_tests.rs; two concurrent refresh_stream_table() calls verified to produce correct data without corruption; SKIP LOCKED path confirmed non-blocking
  • SF-1: build_snapshot_sql catch-all arm uses pgrx::error!() instead of injecting an SQL comment as a FROM fragment
  • SF-2: Explicit /* unsupported snapshot for distinct */ string replaced with PgTrickleError::UnsupportedQuery in join.rs
  • SF-3: parser.rs FROM-clause deparser fallbacks replaced with PgTrickleError::UnsupportedQuery
  • SF-4: child_to_from_sql wraps Project in subquery with projected expressions; rescan CTE correctly resolves aliased column names
  • SF-5: EC-01 ≤2-scan boundary documented with 5 unit tests asserting the boundary + DVM_OPERATORS.md limitation note explaining the CTE materialization trade-off
  • SF-6: diff_project forwards __pgt_count_l/__pgt_count_r through projection when present in child result
  • SF-7: Empty subquery_cols in scalar subquery returns PgTrickleError::UnsupportedQuery rather than emitting NULL
  • SF-8: Lateral inner-change branch uses i64::MIN sentinel instead of 0::BIGINT as dummy __pgt_row_id
  • SF-9: UPDATE trigger PK join uses IS NOT DISTINCT FROM for all PK columns; NULL-PK rows captured correctly
  • SF-10: TRUNCATE + same-window INSERT E2E test passes; post-TRUNCATE rows not dropped
  • SF-11: check_publication_health() detects post-creation partitioning and rebuilds publication with publish_via_partition_root = true
  • SF-12: DiamondSchedulePolicy::Fastest cost-multiplication documented in CONFIGURATION.md with Slowest explanation
  • SF-13: B-2 / G-4 roadmap inconsistency resolved; entry reflects actual remaining scope (or marked done if fully completed)
  • NS-1: ORDER BY without LIMIT emits WARNING at creation time; E2E test verifies message
  • NS-2: append_only auto-revert uses WARNING (not INFO) and sends pgtrickle_alert NOTIFY
  • NS-3: drain_pending_cleanups promotes to WARNING after 3 consecutive failures per source OID
  • NS-4: __pgt_* auxiliary columns documented in SQL_REFERENCE with triggering aggregate functions
  • NS-5: Diamond detection with diamond_consistency='none' emits NOTICE suggesting 'atomic'
  • NS-6: Differential→full adaptive fallback uses NOTICE (not INFO)
  • NS-7: Isolated CALCULATED schedule emits NOTICE with effective fallback interval
  • NS-8: diamond_consistency default changed to 'atomic'; catalog DDL, API code comments, and all documentation updated to match actual runtime behavior (API already resolved NULL to Atomic)

v0.11.0 — Partitioned Stream Tables, Prometheus & Grafana Observability, Safety Hardening & Correctness

Status: Released 2026-03-26. See CHANGELOG.md §0.11.0 for the full feature list.

Highlights: 34× lower latency via event-driven scheduler wake · incremental ST-to-ST refresh chains · declaratively partitioned stream tables (100× I/O reduction) · ready-to-use Prometheus + Grafana monitoring stack · FUSE circuit breaker · VARBIT changed-column bitmask (no more 63-column cap) · per-database worker quotas · DAG scheduling performance improvements (fused chains, adaptive polling, amplification detection) · TPC-H correctness gate in CI · safer production defaults.

Completed items

Partitioned Stream Tables — Storage (A-1)

In plain terms: A 10M-row stream table partitioned into 100 ranges means only the 2–3 partitions that actually received changes are touched by MERGE — reducing the MERGE scan from 10M rows to ~100K. The partition key must be a user-visible column and the refresh path must inject a verified range predicate.

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| A1-1 | DDL: CREATE STREAM TABLE … PARTITION BY declaration; catalog column for partition key | 1–2 wk | PLAN_NEW_STUFF.md §A-1 |
| A1-2 | Delta inspection: extract min/max of partition key from delta CTE per scheduler tick | 1 wk | PLAN_NEW_STUFF.md §A-1 |
| A1-3 | MERGE rewrite: inject validated partition-key range predicate or issue per-partition MERGEs via Rust loop | 2–3 wk | PLAN_NEW_STUFF.md §A-1 |
| A1-4 | E2E benchmarks: 10M-row partitioned ST, 0.1% change rate concentrated in 2–3 partitions | 1 wk | PLAN_NEW_STUFF.md §A-1 |

⚠️ MERGE joins on __pgt_row_id (a content hash unrelated to the partition key) — partition pruning will not activate automatically. A predicate injection step is mandatory. See PLAN_NEW_STUFF.md §A-1 risk analysis before starting.

Retraction consideration (A-1): The 5–7 week effort estimate is optimistic. The core assumption — that partition pruning can be activated via a WHERE partition_key BETWEEN ? AND ? predicate — requires the partition key to be a tracked catalog column (not currently the case) and a verified range derivation from the delta. The alternative (per-partition MERGE loop in Rust) is architecturally sound but requires significant catalog and refresh-path changes. A design spike (2–4 days) producing a written implementation plan must be completed before A1-1 is started. The milestone is at P3 / Very High risk and should not block the 1.0 release if the design spike reveals additional complexity.
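How the injected predicate might look, assuming the partition key (sale_date here) is a tracked user-visible column and the delta's key range has been verified per tick (table, column, and range values are all hypothetical):

```sql
MERGE INTO sales_st AS t
USING delta AS d
   ON t.__pgt_row_id = d.__pgt_row_id
  -- Injected, verified range predicate: without it, the join on the
  -- content-hash __pgt_row_id gives the planner nothing to prune on.
  AND t.sale_date BETWEEN DATE '2026-03-01' AND DATE '2026-03-03'
WHEN MATCHED THEN
    UPDATE SET amount = d.amount
WHEN NOT MATCHED THEN
    INSERT (sale_date, amount) VALUES (d.sale_date, d.amount);
```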

Partitioned stream tables subtotal: ~5–7 weeks

Multi-Database Scheduler Isolation (C-3)

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| C3-1 | Per-database worker quotas (pg_trickle.per_database_worker_quota); priority ordering (IMMEDIATE > Hot > Warm > Cold); burst capacity up to 150% when other DBs are under budget. ✅ Done in v0.11.0 Phase 11 — compute_per_db_quota() helper with burst threshold at 80% cluster utilisation; sort_ready_queue_by_priority() dispatches ImmediateClosure first; 7 unit tests. | — | src/scheduler.rs |

Multi-DB isolation subtotal: ✅ Complete

Prometheus & Grafana Observability

In plain terms: Most teams already run Prometheus and Grafana to monitor their databases. This ships ready-to-use configuration files — no custom code, no extension changes — that plug into the standard postgres_exporter and light up a Grafana dashboard showing refresh latency, staleness, error rates, CDC lag, and per-stream-table detail. Also includes Prometheus alerting rules so you get paged when a stream table goes stale or starts error-looping. A Docker Compose file lets you try the full observability stack with a single docker compose up.

Zero-code monitoring integration. All config files live in a new monitoring/ directory in the main repo (or a separate pgtrickle-monitoring repo). Queries use existing views (pg_stat_stream_tables, check_cdc_health(), quick_health).

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| OBS-1 | Prometheus metrics out of the box. ✅ Done in v0.11.0 Phase 3 — monitoring/prometheus/pg_trickle_queries.yml exports 14 metrics (per-table refresh stats, health summary, CDC buffer sizes, status counts, recent error rate) via postgres_exporter. | — | monitoring/prometheus/pg_trickle_queries.yml |
| OBS-2 | Get paged when things go wrong. ✅ Done in v0.11.0 Phase 3 — monitoring/prometheus/alerts.yml has 8 alerting rules: staleness > 5 min, ≥3 consecutive failures, table SUSPENDED, CDC buffer > 1 GB, scheduler down, high refresh duration, cluster WARNING/CRITICAL. | — | monitoring/prometheus/alerts.yml |
| OBS-3 | See everything at a glance. ✅ Done in v0.11.0 Phase 3 — monitoring/grafana/dashboards/pg_trickle_overview.json has 6 sections: cluster overview stat panels, refresh performance time-series, staleness heatmap, CDC health graphs, per-table drill-down table with schema/table variable filters. | — | monitoring/grafana/dashboards/pg_trickle_overview.json |
| OBS-4 | Try it all in one command. ✅ Done in v0.11.0 Phase 3 — monitoring/docker-compose.yml spins up PostgreSQL + pg_trickle + postgres_exporter + Prometheus + Grafana with pre-wired config and demo seed data (monitoring/init/01_demo.sql). docker compose up → Grafana at :3000. | — | monitoring/docker-compose.yml |

Observability subtotal: ~12 hours

Default Tuning & Safety Defaults (from REPORT_OVERALL_STATUS.md)

These changes flip conservative defaults to the behavior that is safe and correct in production. All underlying features are implemented and tested; only the default values change. Each keeps its original GUC so operators can revert if needed.

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| DEF-1 | Flip parallel_refresh_mode default to 'on'. ✅ Done in v0.11.0 Phase 1 — default flipped; normalize_parallel_refresh_mode maps None/unknown → On; unit test renamed to defaults_to_on. | — | REPORT_OVERALL_STATUS.md §R1 |
| DEF-2 | Flip auto_backoff default to true. ✅ Done in v0.10.0 — default flipped to true; trigger threshold raised to 95%, cap reduced to 8×, log level raised to WARNING. CONFIGURATION.md updated. | 1–2 h | REPORT_OVERALL_STATUS.md §R10 |
| DEF-3 | SemiJoin delta-key pre-filter (O-1). ✅ Verified already implemented in v0.11.0 Phase 2 — left_snapshot_filtered pre-filter with WHERE left_key IN (SELECT DISTINCT right_key FROM delta) was already present in semi_join.rs. | — | src/dvm/operators/semi_join.rs |
| DEF-4 | Increase invalidation ring capacity from 32 to 128 slots. ✅ Done in v0.11.0 Phase 1 — INVALIDATION_RING_CAPACITY raised to 128 in shmem.rs. | — | REPORT_OVERALL_STATUS.md §R9 |
| DEF-5 | Flip block_source_ddl default to true. ✅ Done in v0.11.0 Phase 1 — default flipped to true; both error messages in hooks.rs include step-by-step escape-hatch procedure. | — | REPORT_OVERALL_STATUS.md §R12 |

Default tuning subtotal: ~14–21 hours

Safety & Resilience Hardening (Must-Ship)

In plain terms: The background worker should never silently hang or leave a stream table in an undefined state when an internal operation fails. These items replace panic!/unwrap() in code paths reachable from the background worker with structured errors and graceful recovery.

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| SAF-1 | Replace worker-path panics with structured errors. ✅ Done in v0.11.0 Phase 1 — full audit of scheduler.rs, refresh.rs, hooks.rs: no panic!/unwrap() outside #[cfg(test)]. check_skip_needed now logs WARNING on SPI error with table name and error details. Audit finding documented in comment. | — | src/scheduler.rs |
| SAF-2 | Failure-injection E2E test. ✅ Done in v0.11.0 Phase 2 — two E2E tests in tests/e2e_safety_tests.rs: (1) column drop triggers UpstreamSchemaChanged, verifies scheduler stays alive and other STs continue; (2) source table drop, same verification. | — | tests/e2e_safety_tests.rs |

Safety hardening subtotal: ~7–12 hours

Correctness & Code Quality Quick Wins (from REPORT_OVERALL_STATUS.md §12–§15)

In plain terms: Six self-contained improvements identified in the deep gap analysis. Each takes under a day and substantially reduces silent failure modes, operator confusion, and diagnostic friction.

Quick Fixes (< 1 hour each)

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| QF-1 | Fix unguarded debug println!. ✅ Done in v0.11.0 Phase 1 — println! replaced with pgrx::log!() guarded by new pg_trickle.log_merge_sql GUC (default off). | — | src/refresh.rs |
| QF-2 | Upgrade AUTO mode downgrade log level. ✅ Done in v0.11.0 Phase 1 — four AUTO→FULL downgrade paths in api.rs raised from pgrx::info!() to pgrx::warning!(). | — | plans/performance/REPORT_OVERALL_STATUS.md §12 |
| QF-3 | Warn when append_only auto-reverts. ✅ Verified already implemented — pgrx::warning!() + emit_alert(AppendOnlyReverted) already present in refresh.rs. | — | plans/performance/REPORT_OVERALL_STATUS.md §15 |
| QF-4 | Document parser unwrap() invariants. ✅ Done in v0.11.0 Phase 1 — // INVARIANT: comments added at four unwrap() sites in dvm/parser.rs (after is_empty() guard, len()==1 guards, and non-empty Err return). | — | src/dvm/parser.rs |

Quick-fix subtotal: ~3–4 hours

Effective Refresh Mode Tracking (G12-ERM)

In plain terms: When a stream table is configured as AUTO, operators currently have no way to discover which mode is actually being used at runtime without reading warning logs. Storing the resolved mode in the catalog and exposing a diagnostic function closes this observability gap.

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| G12-ERM-1 | Add effective_refresh_mode column to pgt_stream_tables. ✅ Done in v0.11.0 Phase 2 — column added; scheduler writes actual mode (FULL/DIFFERENTIAL/APPEND_ONLY/TOP_K/NO_DATA) via thread-local tracking; upgrade SQL pg_trickle--0.10.0--0.11.0.sql created. | — | src/catalog.rs |
| G12-ERM-2 | Add explain_refresh_mode(name TEXT) SQL function. ✅ Done in v0.11.0 Phase 2 — pgtrickle.explain_refresh_mode() returns configured mode, effective mode, and downgrade reason. | — | src/api.rs |

Effective refresh mode subtotal: ~4–7 hours

Correctness Guards (G12-2, G12-AGG)

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| G12-2 | TopK runtime validation. ✅ Done in v0.11.0 Phase 4 — validate_topk_metadata() re-parses the reconstructed full query on each TopK refresh; validate_topk_metadata_fields() validates stored fields (pure logic, unit-testable). Falls back to FULL + WARNING on mismatch. 7 unit tests. | — | src/refresh.rs |
| G12-AGG | Group-rescan aggregate warning. ✅ Done in v0.11.0 Phase 4 — classify_agg_strategy() classifies each aggregate as ALGEBRAIC_INVERTIBLE / ALGEBRAIC_VIA_AUX / SEMI_ALGEBRAIC / GROUP_RESCAN. Warning emitted at create_stream_table time for DIFFERENTIAL + group-rescan aggs. Strategy exposed in explain_st() as aggregate_strategies JSON. 18 unit tests. | — | src/dvm/parser.rs |

Correctness guards subtotal: ✅ Complete

Parameter & Error Hardening (G15-PV, G13-EH)

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| G15-PV | Validate incompatible parameter combinations. ✅ Done in v0.11.0 Phase 2 — (a) cdc_mode='wal' + refresh_mode='IMMEDIATE' rejection was already present; (b) diamond_schedule_policy='slowest' + diamond_consistency='none' now rejected in create_stream_table_impl and alter_stream_table_impl with structured error. | — | src/api.rs |
| G13-EH | Structured error HINT/DETAIL fields. ✅ Done in v0.11.0 Phase 2 — raise_error_with_context() helper in api.rs uses ErrorReport::new().set_detail().set_hint() for UnsupportedOperator, CycleDetected, UpstreamSchemaChanged, and QueryParseError; all 8 API-boundary error sites updated. | — | src/api.rs |

Parameter & error hardening subtotal: ~6–12 hours

Testing: EC-01 Boundary Regression (G17-EC01B-NEG)

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| G17-EC01B-NEG | Add a negative regression test asserting that ≥3-scan join right subtrees currently fall back to FULL refresh. ✅ Done in v0.11.0 Phase 4 — 4 unit tests in join_common.rs covering 3-way join, 4-way join, right-subtree ≥3 scans, and 2-scan boundary. // TODO: Remove when EC01B-1/EC01B-2 fixed in v0.12.0 | — | src/dvm/operators/join_common.rs |

EC-01 boundary regression subtotal: ✅ Complete

Documentation Quick Wins (G16-GS, G16-SM, G16-MQR, G15-GUC)

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| G16-GS | Restructure GETTING_STARTED.md with progressive complexity. Five chapters: (1) Hello World — single-table ST with no join; (2) Multi-table join; (3) Scheduling & backpressure; (4) Monitoring — 5 key functions; (5) Advanced — FUSE, wide bitmask, partitions. Remove the current flat wall-of-SQL structure. ✅ Done in v0.11.0 Phase 11 — 5-chapter structure implemented; Chapter 1 Hello World example added; Chapter 5 Advanced Topics adds inline FUSE, partitioning, IMMEDIATE, and multi-tenant quota examples. | — | docs/GETTING_STARTED.md |
| G16-SM | SQL/mode operator support matrix. ✅ Done — 60+ row operator support matrix added to docs/DVM_OPERATORS.md covering all operators × FULL/DIFFERENTIAL/IMMEDIATE modes with caveat footnotes. | — | docs/DVM_OPERATORS.md |
| G16-MQR | Monitoring quick reference. ✅ Done — Monitoring Quick Reference section added to docs/GETTING_STARTED.md with pgt_status(), health_check(), change_buffer_sizes(), dependency_tree(), fuse_status(), Prometheus/Grafana stack, key metrics table, and alert summary. | — | docs/GETTING_STARTED.md |
| G15-GUC | GUC interaction matrix. ✅ Done — GUC Interaction Matrix (14 interaction pairs) and three named Tuning Profiles (Low-Latency, High-Throughput, Resource-Constrained) added to docs/CONFIGURATION.md. | — | docs/CONFIGURATION.md |

Documentation subtotal: ~2–3 days

Correctness quick-wins & documentation subtotal: ~1–2 days code + ~2–3 days docs

Should-Ship Additions

Wider Changed-Column Bitmask (>63 columns)

In plain terms: Stream tables built on source tables with more than 63 columns silently fall back to tracking every column on every UPDATE, losing all CDC selectivity. Extending the changed_cols field from a BIGINT to a variable-length BYTEA bitmask removes this cliff without breaking existing deployments.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| WB-1 | Extend the CDC trigger changed_cols column from BIGINT to BYTEA; update bitmask encoding/decoding in cdc.rs; add schema migration for existing change buffer tables (tables with <64 columns are unaffected at the data level). | 1–2 wk | REPORT_OVERALL_STATUS.md §R13 |
| WB-2 | E2E test: wide (>63 column) source table; verify only referenced columns trigger delta propagation; benchmark UPDATE selectivity before/after. | 2–4h | tests/e2e_cdc_tests.rs |

Wider bitmask subtotal: ~1–2 weeks + ~4h testing
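The WB-1 migration amounts to replacing a fixed 64-bit mask with a byte vector indexed by column number, plus the skip test that gives CDC its selectivity. A minimal sketch with illustrative helper names (set_changed, is_changed, and any_referenced_changed are not the actual cdc.rs API):

```rust
/// Set the bit for a column in a variable-length changed-column mask,
/// growing the byte vector on demand so any column count is supported.
fn set_changed(mask: &mut Vec<u8>, col: usize) {
    let byte = col / 8;
    if byte >= mask.len() {
        mask.resize(byte + 1, 0);
    }
    mask[byte] |= 1 << (col % 8);
}

/// Test whether a column's bit is set; out-of-range reads as unchanged.
fn is_changed(mask: &[u8], col: usize) -> bool {
    mask.get(col / 8).map_or(false, |b| b & (1 << (col % 8)) != 0)
}

/// A delta row can be skipped entirely when none of the columns the
/// stream table's query references actually changed.
fn any_referenced_changed(mask: &[u8], referenced: &[usize]) -> bool {
    referenced.iter().any(|&c| is_changed(mask, c))
}
```

Because unset high bytes simply read as "unchanged", existing masks for tables with fewer than 64 columns remain valid after the migration, which is the compatibility property WB-1 relies on.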

Fuse — Anomalous Change Detection

In plain terms: A circuit breaker that stops a stream table from processing an unexpectedly large batch of changes (runaway script, mass delete, data migration) without operator review. A blown fuse halts refresh and emits a pgtrickle_alert NOTIFY; reset_fuse() resumes with a chosen recovery action (apply, reinitialize, or skip_changes).

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| FUSE-1 | Catalog: fuse state columns on pgt_stream_tables (fuse_mode, fuse_state, fuse_ceiling, fuse_sensitivity, blown_at, blow_reason) | 1–2h | PLAN_FUSE.md |
| FUSE-2 | alter_stream_table() new params: fuse, fuse_ceiling, fuse_sensitivity | 1h | PLAN_FUSE.md |
| FUSE-3 | reset_fuse(name, action => 'apply'\|'reinitialize'\|'skip_changes') SQL function | 1h | PLAN_FUSE.md |
| FUSE-4 | fuse_status() introspection function | 1h | PLAN_FUSE.md |
| FUSE-5 | Scheduler pre-check: count change buffer rows; evaluate threshold; blow fuse + NOTIFY if exceeded | 2–3h | PLAN_FUSE.md |
| FUSE-6 | E2E tests: normal baseline, spike → blow, reset (apply/reinitialize/skip_changes), diamond/DAG interaction | 4–6h | PLAN_FUSE.md |

Fuse subtotal: ~10–14 hours — ✅ Complete

External Correctness Gate (TS1 or TS2)

In plain terms: Run an independent public query corpus through pg_trickle's DIFFERENTIAL mode and assert the results match a vanilla PostgreSQL execution. This catches blind spots that the extension's own test suite cannot, and provides an objective correctness baseline before v1.0.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| TS1 | sqllogictest suite. Run the PostgreSQL sqllogic suite through pg_trickle DIFFERENTIAL mode; gate CI on zero correctness mismatches. Preferred choice: broadest query coverage. | 2–3d | PLAN_TESTING_GAPS.md §J |
| TS2 | JOB (Join Order Benchmark). Correctness baseline and refresh latency profiling on realistic multi-join analytical queries. Alternative if sqllogictest setup is too costly. | 1–2d | PLAN_TESTING_GAPS.md §J |

Deliver one of TS1 or TS2; whichever is completed first meets the exit criterion.

External correctness gate subtotal: ~1–3 days

Differential ST-to-ST Refresh (✅ Done)

In plain terms: When stream table B's defining query reads from stream table A, pg_trickle currently forces a FULL refresh of B every time A updates — re-executing B's entire query even when only a handful of rows changed. This feature gives ST-to-ST dependencies the same CDC change buffer that base tables already have, so B refreshes differentially (applying only the delta). Crucially, even when A itself does a FULL refresh, a pre/post snapshot diff is captured so B still receives a small I/D delta rather than cascading FULL through the chain.

| Item | Description | Status | Ref |
| --- | --- | --- | --- |
| ST-ST-1 | Change buffer infrastructure. create_st_change_buffer_table() / drop_st_change_buffer_table() in cdc.rs; lifecycle hooks in api.rs; idempotent ensure_st_change_buffer() | ✅ Done | PLAN_ST_TO_ST.md §Phase 1 |
| ST-ST-2 | Delta capture — DIFFERENTIAL path. Force explicit DML when ST has downstream consumers; capture delta from __pgt_delta_{id} to changes_pgt_{id} | ✅ Done | PLAN_ST_TO_ST.md §Phase 2 |
| ST-ST-3 | Delta capture — FULL path. Pre/post snapshot diff writes I/D pairs to changes_pgt_{id}; eliminates cascading FULL | ✅ Done | PLAN_ST_TO_ST.md §7 |
| ST-ST-4 | DVM scan operator for ST sources. Read from changes_pgt_{id}; pgt_-prefixed LSN tokens; extended frontier and placeholder resolver | ✅ Done | PLAN_ST_TO_ST.md §Phase 3 |
| ST-ST-5 | Scheduler integration. Buffer-based change detection in has_stream_table_source_changes(); removed FULL override; frontier augmented with ST source positions | ✅ Done | PLAN_ST_TO_ST.md §Phase 4 |
| ST-ST-6 | Cleanup & lifecycle. cleanup_st_change_buffers_by_frontier() for ST buffers; removed prewarm skip for ST sources; ST buffer cleanup in both differential and full refresh paths | ✅ Done | PLAN_ST_TO_ST.md §Phase 5–6 |

ST-to-ST differential subtotal: ~4.5–6.5 weeks
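The FULL-path delta capture (ST-ST-3) reduces to a multiset diff between the pre- and post-refresh snapshots: rows only in the new snapshot become inserts, rows only in the old one become deletes, and rows present in both cancel out. A sketch over string-encoded rows (the extension performs the equivalent diff in SQL against snapshot tables, not in Rust memory):

```rust
use std::collections::HashMap;

/// Multiset diff between a pre-refresh and post-refresh snapshot.
/// Returns (inserts, deletes): the small I/D delta a downstream ST
/// consumes instead of cascading a FULL refresh.
fn snapshot_diff(pre: &[String], post: &[String]) -> (Vec<String>, Vec<String>) {
    let mut counts: HashMap<&String, i64> = HashMap::new();
    for row in post {
        *counts.entry(row).or_insert(0) += 1; // present after refresh
    }
    for row in pre {
        *counts.entry(row).or_insert(0) -= 1; // present before refresh
    }
    let mut inserts = Vec::new();
    let mut deletes = Vec::new();
    for (row, n) in counts {
        for _ in 0..n.max(0) {
            inserts.push(row.clone()); // net new occurrences
        }
        for _ in 0..(-n).max(0) {
            deletes.push(row.clone()); // net removed occurrences
        }
    }
    (inserts, deletes)
}
```

Counting occurrences rather than using a set keeps duplicate rows correct, matching SQL multiset semantics.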

Adaptive/Event-Driven Scheduler Wake (Must-Ship)

In plain terms: The scheduler currently wakes on a fixed 1-second timer even when nothing has changed. This adds event-driven wake: CDC triggers notify the scheduler immediately when changes arrive. Median end-to-end latency drops from ~515 ms to ~15 ms for low-volume workloads — a 34× improvement. This is a must-ship item because low latency is a primary project goal.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| WAKE-1 | Event-driven scheduler wake. ✅ Done in v0.11.0 Phase 7 — CDC triggers emit pg_notify('pgtrickle_wake', '') after each change buffer INSERT; scheduler issues LISTEN pgtrickle_wake at startup; 10 ms debounce coalesces rapid notifications; poll fallback preserved. New GUCs: event_driven_wake (default true), wake_debounce_ms (default 10). E2E tests in tests/e2e_wake_tests.rs. | — | REPORT_OVERALL_STATUS.md §R16 |

Event-driven wake subtotal: ✅ Complete
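The 10 ms debounce can be modeled as a small pure function: a new notification either rides an already-pending wake or schedules a fresh one debounce_ms in the future. This is an illustrative model of the behavior described above, not the scheduler's actual code:

```rust
/// Debounce coalescing for event-driven wake. Given the scheduled
/// time of any pending wake (ms), a new notification's arrival time,
/// and the debounce window, return when the scheduler should wake.
fn coalesce_wake(pending_wake_ms: Option<u64>, notify_ms: u64, debounce_ms: u64) -> u64 {
    match pending_wake_ms {
        // A wake is already scheduled at or after this notification:
        // coalesce into the same refresh cycle, no extra wake-up.
        Some(w) if w >= notify_ms => w,
        // No pending wake (or it already fired): schedule a new one
        // debounce_ms later so rapid-fire notifications batch together.
        _ => notify_ms + debounce_ms,
    }
}
```

Coalescing is what keeps a burst of single-row INSERTs from producing one refresh per row while still bounding added latency to the debounce window.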

Stretch Goals (if capacity allows after Must-Ship)

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| STRETCH-1 | Partitioned stream tables — design spike only. ✅ Done in v0.11.0 Partitioning Spike — RFC written (PLAN_PARTITIONING_SPIKE.md), go/no-go decision: Go. A1-1 implemented (catalog column, API parameter, validation). | 2–4d | PLAN_PARTITIONING_SPIKE.md |
| A1-1 | DDL: CREATE STREAM TABLE … PARTITION BY; st_partition_key catalog column. ✅ Done — partition_by parameter added to all three create_stream_table* functions; st_partition_key TEXT column in catalog; validate_partition_key() validates column exists in output; build_create_table_sql emits PARTITION BY RANGE (key); setup_storage_table creates default catch-all partition and non-unique __pgt_row_id index. | 1–2 wk | PLAN_PARTITIONING_SPIKE.md |
| A1-2 | Delta min/max inspection. ✅ Done — extract_partition_range() in refresh.rs runs SELECT MIN/MAX(key)::text on the resolved delta SQL; returns None on empty delta (MERGE skipped). | 1 wk | PLAN_PARTITIONING_SPIKE.md §8 |
| A1-3 | MERGE rewrite. ✅ Done — inject_partition_predicate() replaces __PGT_PART_PRED__ placeholder in MERGE ON clause with AND st."key" BETWEEN 'min' AND 'max'; CachedMergeTemplate stores delta_sql_template; D-2 prepared statements disabled for partitioned STs. | 2–3 wk | PLAN_PARTITIONING_SPIKE.md §8 |
| A1-4 | E2E benchmarks: 10M-row partitioned ST, 0.1%/0.2%/100% change rate scenarios; EXPLAIN (ANALYZE, BUFFERS) partition-scan verification. ✅ Done — 7 E2E tests added to tests/e2e_partition_tests.rs covering: initial populate, differential inserts, updates/deletes, empty-delta fast path, EXPLAIN plan verification, invalid partition key rejection; added to light-E2E allowlist. | 1 wk | PLAN_PARTITIONING_SPIKE.md §9 |

Stretch subtotal: STRETCH-1 + A1-1 + A1-2 + A1-3 + A1-4 ✅ All complete
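The A1-3 MERGE rewrite is essentially string substitution: the placeholder in the cached template is replaced by a range predicate built from the delta's MIN/MAX so the planner can prune untouched partitions. A sketch under the assumption of text-cast bounds and the placeholder named above (quoting rules simplified):

```rust
/// Replace the partition-predicate placeholder in a cached MERGE
/// template with a range predicate derived from the delta's MIN/MAX.
/// Quoting here is simplified; real code must quote identifiers and
/// literals defensively.
fn inject_partition_predicate(template: &str, key: &str, min: &str, max: &str) -> String {
    let pred = format!("AND st.\"{}\" BETWEEN '{}' AND '{}'", key, min, max);
    template.replace("__PGT_PART_PRED__", &pred)
}
```

With a 0.1% change rate concentrated in recent partitions, this predicate is what turns a 10M-row MERGE scan into a scan of only the affected partitions.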

DAG Refresh Performance Improvements (from PLAN_DAG_PERFORMANCE.md §8)

In plain terms: Now that ST-to-ST differential refresh eliminates the "every hop is FULL" bottleneck, the next performance frontier is reducing per-hop overhead and exploiting DAG structure more aggressively. These items target the scheduling and dispatch layer — not the DVM engine — and collectively can reduce end-to-end propagation latency by 30–50% for heterogeneous DAGs.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| DAG-1 | Intra-tick pipelining. Within a single scheduler tick, begin processing a downstream ST as soon as all its specific upstream dependencies have completed — not when the entire topological level finishes. Requires per-ST completion tracking in the parallel dispatch loop and immediate enqueuing of newly-ready STs. Expected 30–50% latency reduction for DAGs with mixed-cost levels. ✅ Done — Already achieved by Phase 4’s parallel dispatch architecture: per-dependency remaining_upstreams tracking with immediate downstream readiness propagation. No level barrier exists. 3 validation tests. | 2–3 wk | PLAN_DAG_PERFORMANCE.md §8.1 |
| DAG-2 | Adaptive poll interval. Replace the fixed 200 ms parallel dispatch poll with exponential backoff (20 ms → 200 ms), resetting on worker completion. Makes parallel mode competitive with CALCULATED for cheap refreshes ($T_r \approx 10\text{ms}$). Alternative: WaitLatch with shared-memory completion flags. ✅ Done — compute_adaptive_poll_ms() pure-logic helper with exponential backoff (20ms → 200ms); ParallelDispatchState tracks adaptive_poll_ms + completions_this_tick; resets to 20ms on worker completion; 8 unit tests. | 1–2 wk | PLAN_DAG_PERFORMANCE.md §8.2 |
| DAG-3 | Delta amplification detection. Track input→output delta ratio per hop via pgt_refresh_history. When a join ST amplifies delta beyond a configurable threshold (e.g., output > 100× input), emit a performance WARNING and optionally fall back to FULL for that hop. Expose amplification metrics in explain_st(). ✅ Done — pg_trickle.delta_amplification_threshold GUC (default 100×); compute_amplification_ratio + should_warn_amplification pure-logic helpers; WARNING emitted after MERGE with ratio, counts, and tuning hint; explain_st() exposes amplification_stats JSON from last 20 DIFFERENTIAL refreshes; 15 unit tests. | 3–5d | PLAN_DAG_PERFORMANCE.md §8.4 |
| DAG-4 | ST buffer bypass for single-consumer CALCULATED chains. For ST dependencies with exactly one downstream consumer refreshing in the same tick, pass the delta in-memory instead of writing/reading from the changes_pgt_ buffer table. Eliminates 2× SPI DML per hop (~20 ms savings per hop for 10K-row deltas). ✅ Done — FusedChain execution unit kind; find_fusable_chains() pure-logic detection; capture_delta_to_bypass_table() writes to temp table; DiffContext.st_bypass_tables threads bypass through DVM scan; delta SQL cache bypassed when active; 11+4 unit tests. | 3–4 wk | PLAN_DAG_PERFORMANCE.md §8.3 |
| DAG-5 | ST buffer batch coalescing. Apply net-effect computation to ST change buffers before downstream reads — cancel INSERT/DELETE pairs for the same __pgt_row_id that accumulate between reads during rapid-fire upstream refreshes. Adapts existing compute_net_effect() logic to the ST buffer schema. ✅ Done — compact_st_change_buffer() with build_st_compact_sql() pure-logic helper; advisory lock namespace 0x5047_5500; integrated in execute_differential_refresh() after C-4 base-table compaction; 9 unit tests. | 1–2 wk | PLAN_DAG_PERFORMANCE.md §8.5 |

DAG refresh performance subtotal: ~8–12 weeks
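DAG-2's backoff is simple enough to state precisely: double the poll interval on each idle poll, clamp at 200 ms, and reset to 20 ms on any worker completion. A stand-alone approximation (the shipped compute_adaptive_poll_ms() may differ in detail):

```rust
/// Bounds for the adaptive parallel-dispatch poll interval.
const MIN_POLL_MS: u64 = 20;
const MAX_POLL_MS: u64 = 200;

/// Compute the next poll interval: reset to the floor whenever a
/// worker completed this poll, otherwise back off exponentially
/// up to the ceiling.
fn next_poll_ms(current: u64, worker_completed: bool) -> u64 {
    if worker_completed {
        MIN_POLL_MS
    } else {
        (current * 2).min(MAX_POLL_MS)
    }
}
```

The reset-on-completion rule is what makes parallel mode competitive with CALCULATED for cheap refreshes: a chain of 10 ms refreshes polls near the 20 ms floor instead of paying 200 ms per hop.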

v0.11.0 total: ~7–10 weeks (partitioning + isolation) + ~12h observability + ~14–21h default tuning + ~7–12h safety hardening + ~2–4 weeks should-ship (bitmask + fuse + external corpus) + ~4.5–6.5 weeks ST-to-ST differential + ~2–3 weeks event-driven wake + ~1–2 days correctness quick-wins + ~2–3 days documentation + ~8–12 weeks DAG performance

Exit criteria: ✅ All met. Released 2026-03-26.

  • Declaratively partitioned stream tables accepted; partition key tracked in catalog — ✅ Done in v0.11.0 Partitioning Spike (STRETCH-1 RFC + A1-1)
  • Partitioned storage table created with PARTITION BY RANGE + default catch-all partition — ✅ Done (A1-1 physical DDL)
  • Partition-key range predicate injected into MERGE ON clause; empty-delta fast-path skips MERGE — ✅ Done (A1-2 + A1-3)
  • Partition-scoped MERGE benchmark: 10M-row ST, 0.1% change rate (expect ~100× I/O reduction) — ✅ Done (A1-4 E2E tests)
  • Per-database worker quotas enforced; burst reclaimed within 1 scheduler cycle — ✅ Done in v0.11.0 Phase 11 (pg_trickle.per_database_worker_quota GUC; burst to 150% at < 80% cluster load)
  • Prometheus queries + alerting rules + Grafana dashboard shipped — ✅ Done in v0.11.0 Phase 3 (monitoring/ directory)
  • DEF-1: parallel_refresh_mode default is 'on'; unit test updated — ✅ Done in v0.11.0 Phase 1
  • DEF-2: auto_backoff default is true; CONFIGURATION.md updated — ✅ Done in v0.10.0
  • DEF-3: SemiJoin delta-key pre-filter verified already implemented — ✅ Done in v0.11.0 Phase 2 (pre-existing in semi_join.rs)
  • DEF-4: Invalidation ring capacity is 128 slots — ✅ Done in v0.11.0 Phase 1
  • DEF-5: block_source_ddl default is true; error message includes escape-hatch instructions — ✅ Done in v0.11.0 Phase 1
  • SAF-1: No panic!/unwrap() in background worker hot paths; check_skip_needed logs SPI errors — ✅ Done in v0.11.0 Phase 1
  • SAF-2: Failure-injection E2E tests in tests/e2e_safety_tests.rs — ✅ Done in v0.11.0 Phase 2
  • WB-1+2: Changed-column bitmask supports >63 columns (VARBIT); wide-table CDC selectivity E2E passes; schema migration tested — ✅ Done in v0.11.0 Phase 5
  • FUSE-1–6: Fuse blows on configurable change-count threshold; reset_fuse() recovers in all three action modes; diamond/DAG interaction tested — ✅ Done in v0.11.0 Phase 6
  • TS2: TPC-H-derived 5-query DIFFERENTIAL correctness gate passes with zero mismatches; gated in CI — ✅ Done in v0.11.0 Phase 9
  • QF-1–4: println! replaced with guarded pgrx::log!(); AUTO downgrades emit WARNING; append_only reversion verified already warns; parser invariant sites annotated — ✅ Done in v0.11.0 Phase 1
  • G12-ERM: effective_refresh_mode column present in pgt_stream_tables; explain_refresh_mode() returns configured mode, effective mode, downgrade reason — ✅ Done in v0.11.0 Phase 2
  • G12-2: TopK path validates assumptions at refresh time; triggers FULL fallback with WARNING on violation — ✅ Done in v0.11.0 Phase 4
  • G12-AGG: Group-rescan aggregate warning fires at create_stream_table for DIFFERENTIAL mode; strategy visible in explain_st() — ✅ Done in v0.11.0 Phase 4
  • G15-PV: Incompatible cdc_mode/refresh_mode and diamond_schedule_policy combinations rejected at creation time with structured HINT — ✅ Done in v0.11.0 Phase 2
  • G13-EH: UnsupportedOperator, CycleDetected, UpstreamSchemaChanged, QueryParseError include DETAIL and HINT fields — ✅ Done in v0.11.0 Phase 2
  • G17-EC01B-NEG: Negative regression test documents ≥3-scan fall-back behavior; linked to v0.12.0 EC01B fix — ✅ Done in v0.11.0 Phase 4
  • G16-GS/SM/MQR/GUC: GETTING_STARTED restructured (5 chapters + Hello World + Advanced Topics); DVM_OPERATORS support matrix; monitoring quick reference; CONFIGURATION.md GUC matrix — ✅ Done in v0.11.0 Phase 11
  • ST-ST-1–6: All ST-to-ST dependencies refresh differentially when upstream has a change buffer; FULL refreshes on upstream produce pre/post I/D diff; no cascading FULL — ✅ Done in v0.11.0 Phase 8
  • WAKE-1: Event-driven scheduler wake; median latency ~15 ms (34× improvement); 10 ms debounce; poll fallback — ✅ Done in v0.11.0 Phase 7
  • DAG-1: Intra-tick pipelining confirmed in Phase 4 architecture — ✅ Done
  • DAG-2: Adaptive poll interval (20 ms → 200 ms exponential backoff) — ✅ Done in v0.11.0 Phase 10
  • DAG-3: Delta amplification detection with pg_trickle.delta_amplification_threshold GUC — ✅ Done in v0.11.0 Phase 10
  • DAG-4: ST buffer bypass (FusedChain) for single-consumer CALCULATED chains — ✅ Done in v0.11.0 Phase 10
  • DAG-5: ST buffer batch coalescing cancels redundant I/D pairs — ✅ Done in v0.11.0 Phase 10
  • Extension upgrade path tested (0.10.0 → 0.11.0) — ✅ upgrade SQL in sql/pg_trickle--0.10.0--0.11.0.sql

v0.12.0 — Correctness, Reliability & Developer Tooling

Goal: Close the last known wrong-answer bugs in the incremental query engine, add SQL-callable diagnostic functions for observability, harden the scheduler against edge cases uncovered by deeper topologies, and back the whole release with thousands of automatically generated property and fuzz tests.

Phases 5–8 from the original v0.12.0 scope (Scalability Foundations, Partitioning Enhancements, MERGE Profiling, and dbt Macro Updates) have been moved to v0.13.0 to keep this release tightly focused on correctness and reliability. See §v0.13.0 for those items.

Status: Released (2026-03-28).

Completed items (click to expand)

Anomalous Change Detection (Fuse)

In plain terms: Imagine a source table suddenly receives a million-row batch delete — a bug, runaway script, or intentional purge. Without a fuse, pg_trickle would try to process all of it and potentially overload the database. This adds a circuit breaker: you set a ceiling (e.g. "never process more than 50,000 changes at once"), and if that limit is hit the stream table pauses and sends a notification. You investigate, fix the root cause, then resume with reset_fuse() and choose how to recover (apply the changes, reinitialize from scratch, or skip them entirely).

Per-stream-table fuse that blows when the change buffer row count exceeds a configurable fixed ceiling or an adaptive μ+kσ threshold derived from pgt_refresh_history. A blown fuse halts refresh and emits a pgtrickle_alert NOTIFY; reset_fuse() resumes with a chosen recovery action.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| FUSE-1 | Catalog: fuse state columns on pgt_stream_tables (fuse_mode, fuse_state, fuse_ceiling, fuse_sensitivity, blown_at, blow_reason) | 1–2h | PLAN_FUSE.md |
| FUSE-2 | alter_stream_table() new params: fuse, fuse_ceiling, fuse_sensitivity | 1h | PLAN_FUSE.md |
| FUSE-3 | reset_fuse(name, action => 'apply'\|'reinitialize'\|'skip_changes') SQL function | 1h | PLAN_FUSE.md |
| FUSE-4 | fuse_status() introspection function | 1h | PLAN_FUSE.md |
| FUSE-5 | Scheduler pre-check: count change buffer rows; evaluate threshold; blow fuse + NOTIFY if exceeded | 2–3h | PLAN_FUSE.md |
| FUSE-6 | E2E tests: normal baseline, spike → blow, reset, diamond/DAG interaction | 4–6h | PLAN_FUSE.md |

Anomalous change detection subtotal: ~10–14 hours
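The adaptive μ+kσ threshold can be sketched as mean plus k standard deviations over recent refresh history, checked alongside the fixed ceiling. How the two limits combine below (either one exceeded blows the fuse) is an assumption for illustration; PLAN_FUSE.md defines the authoritative semantics:

```rust
/// Adaptive threshold over recent change-count history: mu + k*sigma.
/// Returns None when there is too little history to estimate sigma,
/// in which case only the fixed ceiling applies.
fn adaptive_threshold(history: &[f64], k: f64) -> Option<f64> {
    if history.len() < 2 {
        return None;
    }
    let n = history.len() as f64;
    let mu = history.iter().sum::<f64>() / n;
    let var = history.iter().map(|x| (x - mu).powi(2)).sum::<f64>() / n;
    Some(mu + k * var.sqrt())
}

/// Blow the fuse when the pending change count exceeds either limit.
/// (Combining rule is an assumption of this sketch.)
fn fuse_should_blow(pending: u64, ceiling: Option<u64>, threshold: Option<f64>) -> bool {
    let over_ceiling = ceiling.map_or(false, |c| pending > c);
    let over_adaptive = threshold.map_or(false, |t| pending as f64 > t);
    over_ceiling || over_adaptive
}
```

The fuse_sensitivity parameter in FUSE-1/FUSE-2 corresponds to k here: a lower k trips earlier on smaller deviations from the historical baseline.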

Correctness — EC-01 Deep Fix (≥3-Scan Join Right Subtrees)

In plain terms: The phantom-row-after-DELETE bug (EC-01) was fixed for join children with ≤2 scan nodes on the right side. Wider join chains — TPC-H Q7, Q8, and Q9 all qualify — are still affected: when both sides of a join are deleted in the same batch, the DELETE can be silently dropped. The existing EXCEPT ALL snapshot strategy causes PostgreSQL to spill multi-GB temp files for deep join trees, which is why the threshold exists. This work designs a fundamentally different per-subtree snapshot strategy that removes the cap.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| EC01B-1 | Design and implement a per-subtree CTE-based snapshot strategy to replace EXCEPT ALL for right-side join chains with ≥3 scan nodes; remove the join_scan_count(child) <= 2 threshold in use_pre_change_snapshot ✅ Done | — | src/dvm/operators/join_common.rs · plans/PLAN_EDGE_CASES.md §EC-01 |
| EC01B-2 | TPC-H Q7/Q8/Q9 regression tests: combined left-DELETE + right-DELETE in same cycle; assert no phantom-row drop ✅ Done | — | tests/e2e_tpch_tests.rs |

EC-01 deep fix subtotal: ~3–4 weeks — ✅ Complete

CDC Write-Side Overhead Benchmark

In plain terms: Every INSERT/UPDATE/DELETE on a source table fires a PL/pgSQL trigger that writes to the change buffer. We have never measured how much write throughput this costs. These benchmarks quantify it across five scenarios (single-row, bulk INSERT, bulk UPDATE, bulk DELETE, concurrent writers) and gate the decision on whether to implement a change_buffer_unlogged GUC that could reduce WAL overhead by ~20–30%.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| BENCH-W1 | Implement tests/e2e_cdc_write_overhead_tests.rs: compare source-only vs. source + stream table DML throughput across five scenarios; report write amplification factor ✅ Done | — | tests/e2e_cdc_write_overhead_tests.rs |
| BENCH-W2 | Publish results in docs/BENCHMARK.md ✅ Done | — | docs/BENCHMARK.md |

CDC write-side benchmark subtotal: ~3–5 days — ✅ Complete

DAG Topology Benchmark Suite (from PLAN_DAG_BENCHMARK.md)

In plain terms: Production deployments form DAGs with 10–500+ stream tables arranged in chains, fan-outs, diamonds, and mixed topologies. This benchmark suite measures end-to-end propagation latency and throughput through these DAG shapes, validates the theoretical latency formulas from PLAN_DAG_PERFORMANCE.md, and provides regression detection for DAG propagation overhead.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| DAG-B1 | Session 1: Infrastructure, linear chain topology builder, latency + throughput measurement drivers, reporting (ASCII/JSON), 7 benchmark tests ✅ Done | — | PLAN_DAG_BENCHMARK.md §11.1 |
| DAG-B2 | Session 2: Wide DAG + fan-out tree topology builders; 9 latency + throughput tests (5 wide + 2 fan-out latency, 2 throughput) ✅ Done | — | PLAN_DAG_BENCHMARK.md §11.2 |
| DAG-B3 | Session 3: Diamond + mixed topology builders; 5 latency + throughput tests; per-level breakdown reporting ✅ Done | — | PLAN_DAG_BENCHMARK.md §11.3 |
| DAG-B4 | Session 4: Update docs/BENCHMARK.md, full suite validation run ✅ Done | — | PLAN_DAG_BENCHMARK.md §11.4 |

DAG topology benchmark subtotal: ~3–5 days — ✅ Complete

Developer Tooling & Observability Functions (from REPORT_OVERALL_STATUS.md §15) ✅ Complete

In plain terms: pg_trickle's diagnostic toolbox today is limited to explain_st() and refresh_history(). Operators debugging unexpected mode changes, query rewrites, or error patterns must read source code or server logs. This section adds four SQL-callable diagnostic functions that surface internal state in a structured, queryable form.

| Item | Description | Effort | Status |
| --- | --- | --- | --- |
| DT-1 | explain_query_rewrite(query TEXT) — parse a query through the DVM pipeline and return the rewritten SQL plus a list of passes applied (operator rewrites, delta-key injections, TopK detection, group-rescan classification). Useful for debugging unexpected refresh behavior without creating a stream table. | ~1–2d | ✅ Done in v0.12.0 Phase 2 |
| DT-2 | diagnose_errors(name TEXT) — return the last 5 error events for a stream table, classified by type (correctness, performance, config, infrastructure), with a suggested remediation for each class. | ~2–3d | ✅ Done in v0.12.0 Phase 2 |
| DT-3 | list_auxiliary_columns(name TEXT) — list all __pgt_* internal columns injected into the stream table's query plan with their purpose (delta tracking, row identity, compaction key). Helps users understand unexpected columns in SELECT * output. | ~1d | ✅ Done in v0.12.0 Phase 2 |
| DT-4 | validate_query(query TEXT) — parse and run DVM validation on a query without creating a stream table; return the resolved refresh mode, detected SQL constructs (group-rescan aggregates, non-equijoins, multi-scan subtrees), and any warnings. | ~1–2d | ✅ Done in v0.12.0 Phase 2 |

Developer tooling subtotal: ~5–8 days

Parser Safety, Concurrency & Query Coverage (from REPORT_OVERALL_STATUS.md §13/§12/§17)

Additional correctness and robustness items from the deep gap analysis: a stack-overflow prevention guard for pathological queries, a concurrency stress test for IMMEDIATE mode, and two investigations into known under-documented query constructs.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| G13-SD | Parser recursion depth limit. Add a recursion depth counter to all recursive parse-tree visitor functions in dvm/parser.rs. Return PgTrickleError::QueryTooComplex if depth exceeds pg_trickle.max_parse_depth (GUC, default 64). Prevents stack-overflow crashes on pathological queries. ✅ Done | — | src/dvm/parser.rs · src/config.rs · src/error.rs |
| G17-IMS | IMMEDIATE mode concurrency stress test. 100+ concurrent DML transactions on the same source table in IMMEDIATE refresh mode; assert zero lost updates, zero phantom rows, and no deadlocks. ✅ Done | — | tests/e2e_immediate_concurrency_tests.rs |
| G12-SQL-IN | Multi-column IN (subquery) correctness investigation. Determine behavior when DVM encounters EXPR IN (subquery returning multiple columns). Add a correctness test; if the construct is broken, fix it or document as unsupported with a structured error. ✅ Done — documented as unsupported | — | tests/e2e_multi_column_in_tests.rs · src/dvm/parser.rs |
| G14-MDED | MERGE deduplication profiling. Profile how often concurrent-write scenarios produce duplicate key entries requiring pre-MERGE compaction. If ≥10% of refresh cycles need dedup, write an RFC for a two-pass MERGE strategy. | ~3–5d | plans/performance/REPORT_OVERALL_STATUS.md §14 |
| G17-MERGEEX | MERGE template EXPLAIN validation in E2E tests. Add EXPLAIN (COSTS OFF) dry-run checks for generated MERGE SQL templates at E2E test startup. Catches malformed templates before any data is processed. ✅ Done | — | tests/e2e_merge_template_tests.rs |

Parser safety & coverage subtotal: ~9–15 days
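G13-SD's guard pattern is a depth parameter threaded through every recursive visitor, failing fast past a configurable limit instead of risking a stack overflow. A minimal stand-in with hypothetical Node and error types (the real visitors in dvm/parser.rs walk PostgreSQL parse trees):

```rust
/// Hypothetical parse-tree node; Vec provides the indirection that
/// makes the recursive type well-formed.
enum Node {
    Leaf,
    Children(Vec<Node>),
}

/// Stand-in for PgTrickleError::QueryTooComplex.
#[derive(Debug, PartialEq)]
struct QueryTooComplex;

/// Visit the tree, counting nodes, but refuse to descend past
/// max_depth so pathological nesting cannot overflow the stack.
fn visit(node: &Node, depth: u32, max_depth: u32) -> Result<u32, QueryTooComplex> {
    if depth > max_depth {
        return Err(QueryTooComplex);
    }
    match node {
        Node::Leaf => Ok(1),
        Node::Children(cs) => {
            let mut count = 1;
            for c in cs {
                count += visit(c, depth + 1, max_depth)?;
            }
            Ok(count)
        }
    }
}
```

Threading the limit as a parameter (rather than a thread-local) keeps each visitor pure and unit-testable, which matches the document's pattern of extracting pure-logic helpers.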

Differential Fuzzing (SQLancer)

In plain terms: SQLancer is a SQL fuzzer that generates thousands of syntactically valid but structurally unusual queries and uses mathematical oracles (NoREC, TLP) to prove our DVM engine produces exactly the same results as PostgreSQL's native executor. Unlike hand-written tests, it explores the long tail of NULL semantics, nested aggregations, and edge cases no human would write. Any backend crash or result mismatch becomes a permanent regression test seed.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| SQLANCER-1 | Docker-based harness: just sqlancer spins up E2E container; crash-test oracle verifies that no SQLancer-generated create_stream_table call crashes the backend | 3–4d | PLAN_SQLANCER.md §Steps 1–2 |
| SQLANCER-2 | Equivalence oracle: for each generated query Q, assert create_stream_table + refresh output equals native SELECT (multiset comparison); failures auto-committed as proptest regression seeds | 3–4d | PLAN_SQLANCER.md §Step 3 |
| SQLANCER-3 | CI weekly-sqlancer job (daily schedule + manual dispatch); new proptest seed files committed on any detected correctness failure | 1–2d | PLAN_SQLANCER.md |

SQLancer fuzzing subtotal: ~1–2 weeks
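The SQLANCER-2 oracle compares results as multisets: order-insensitive but duplicate-sensitive, matching SQL bag semantics. A sketch with result rows simplified to vectors of strings (real tuples and type handling are more involved):

```rust
use std::collections::HashMap;

/// Multiset equality between two result sets: same rows with the same
/// multiplicities, in any order. This is the pass/fail criterion for
/// comparing a refreshed stream table against the native SELECT.
fn multiset_equal(a: &[Vec<String>], b: &[Vec<String>]) -> bool {
    fn counts(rows: &[Vec<String>]) -> HashMap<&Vec<String>, usize> {
        let mut m = HashMap::new();
        for r in rows {
            *m.entry(r).or_insert(0) += 1; // count each row's multiplicity
        }
        m
    }
    counts(a) == counts(b)
}
```

A plain set comparison would miss duplicated-row bugs (e.g., a delta applied twice), which is exactly the class of error an incremental engine is prone to.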

Property-Based Invariant Tests (Items 5 & 6)

In plain terms: Items 1–4 of the property test plan are done. These two remaining items add topology/scheduler stress tests (random DAG shapes with multi-source branch interactions) and pure Rust unit-level properties (ordering monotonicity, SCC bookkeeping correctness). Both slot into the existing proptest harness and provide coverage that example-based tests cannot exhaustively explore.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| PROP-5 | Topology / scheduler stress: randomized DAG topologies with multi-source branch interactions; assert no incorrect refresh ordering or spurious suspension | 4–6d | PLAN_TEST_PROPERTY_BASED_INVARIANTS.md §Item 5 |
| PROP-6 | Pure Rust DAG / scheduler helper properties: ordering invariants, monotonic metadata helpers, SCC bookkeeping edge-cases | 2–4d | PLAN_TEST_PROPERTY_BASED_INVARIANTS.md §Item 6 |

Property testing subtotal: ~6–10 days

Async CDC — Research Spike (D-2)

In plain terms: A custom PostgreSQL logical decoding plugin could write changes directly to change buffers without the polling round-trip, cutting CDC latency by ~10× and WAL decoding CPU by 50–80%. This milestone scopes a research spike only — not a full implementation — to validate the key technical constraints.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| D2-R | Research spike: prototype in-memory row buffering inside pg_trickle_decoder; validate SPI flush in commit callback; document memory-safety constraints and feasibility; produce a written RFC before any full implementation is started | 2–3 wk | PLAN_NEW_STUFF.md §D-2 |

⚠️ SPI writes inside logical decoding change callbacks are not supported. All row buffering must occur in-memory within the plugin's memory context; flush only in the commit callback. In-memory buffers must handle arbitrarily large transactions. See PLAN_NEW_STUFF.md §D-2 risk analysis before writing any C code.

Retraction candidate (D-2): Even as a research spike, this item introduces C-level complexity (custom output plugin memory management, commit-callback SPI failure handling, arbitrarily large transaction buffering) that substantially exceeds the stated 2–3 week estimate once the architectural constraints are respected. The risk rating is Very High and the SPI-in-change-callback infeasibility makes the originally proposed design non-functional. Recommend moving D-2 to a post-1.0 research backlog entirely; do not include it in a numbered milestone until a separate feasibility study (outside the release cycle) produces a concrete RFC.

D-2 research spike subtotal: ~2–3 weeks

Scalability Foundations (pulled forward from v0.13.0)

In plain terms: These items directly serve the project's primary goal of world-class performance and scalability. Columnar change tracking eliminates wasted delta processing for wide tables, and shared change buffers reduce I/O multiplication in deployments with many stream tables reading from the same source.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| A-2 | Columnar Change Tracking. Per-column bitmask in CDC triggers; skip rows where no referenced column changed; lightweight UPDATE-only path when only projected columns changed; 50–90% delta-volume reduction for wide-table UPDATE workloads. | 3–4 wk | PLAN_NEW_STUFF.md §A-2 |
| D-4 | Shared Change Buffers. Single buffer per source shared across all dependent STs; multi-frontier cleanup coordination; static-superset column mode for initial implementation. | 3–4 wk | PLAN_NEW_STUFF.md §D-4 |

Scalability foundations subtotal: ~6–8 weeks

Partitioning Enhancements (A1 follow-ons from v0.11.0 spike)

In plain terms: The v0.11.0 spike delivered RANGE partitioning end-to-end. These follow-on items extend coverage to the use cases deliberately deferred from A1: multi-column keys, retrofitting existing stream tables, LIST-based partitions, HASH partitions (which need a different strategy than predicate injection), and operational quality-of-life improvements.

| Item | Description | Effort | Ref |
|---|---|---|---|
| A1-1b | Multi-column partition keys. Comma-separated partition_by; PARTITION BY RANGE (col_a, col_b); multi-column MIN/MAX extraction; ROW() comparison predicates for partition pruning. ✅ Done — parse_partition_key_columns(), composite extract_partition_range(), ROW comparison in inject_partition_predicate(); 5 unit tests + 3 E2E tests | — | src/api.rs, src/refresh.rs |
| A1-1c | alter_stream_table(partition_by => …) support. Add/change/remove partition key on existing stream tables; alter_stream_table_partition_key() handles DROP + recreate + full refresh; update_partition_key() in catalog; SQL migration adds parameter; also fixed alter_stream_table_query to preserve partition key. ✅ Done — 4 E2E tests | — | src/api.rs, src/catalog.rs |
| A1-1d | LIST partitioning support. partition_by => 'LIST:col' creates PARTITION BY LIST storage; PartitionMethod enum dispatches LIST vs RANGE; extract_partition_bounds() uses SELECT DISTINCT for LIST; inject_partition_predicate() emits IN (…) predicate; single-column-only validation. ✅ Done — 16 unit tests + 4 E2E tests | — | src/api.rs, src/refresh.rs |
| A1-3b | HASH partitioning via per-partition MERGE loop. partition_by => 'HASH:col[:N]' creates PARTITION BY HASH storage with N auto-created child partitions; execute_hash_partitioned_merge() materializes delta → discovers children via pg_inherits → per-child MERGE filtered through satisfies_hash_partition(); build_hash_child_merge() rewrites MERGE targeting ONLY child_partition. ✅ Done — 22 unit tests + 6 E2E tests | — | src/api.rs, src/refresh.rs |
| PART-WARN | Default-partition growth warning. warn_default_partition_growth() emits pgrx::warning!() after FULL and DIFFERENTIAL refresh when the default partition has rows; includes example DDL. ✅ Done — 2 E2E tests | — | src/refresh.rs |
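The A1-3b per-partition MERGE loop can be sketched roughly as follows. This is an illustrative sketch only: the table name pending_delta, the column names, and the builder signatures are assumptions for the example, not the actual code in src/refresh.rs; the real implementation discovers children via pg_inherits rather than generating their names.

```rust
// Sketch of the per-partition MERGE strategy: the delta is materialized once,
// then one MERGE is emitted per HASH child partition, each filtered through
// satisfies_hash_partition() so every row lands only in the child that owns
// its hash bucket. Names and SQL shapes are simplified stand-ins.
fn build_hash_child_merge(child: &str, key_col: &str, modulus: u32, remainder: u32) -> String {
    format!(
        "MERGE INTO ONLY {child} AS t \
         USING (SELECT * FROM pending_delta d \
                WHERE satisfies_hash_partition('{child}'::regclass, {modulus}, {remainder}, d.{key_col})) AS d \
         ON t.{key_col} = d.{key_col} \
         WHEN MATCHED THEN UPDATE SET val = d.val \
         WHEN NOT MATCHED THEN INSERT (id, val) VALUES (d.{key_col}, d.val)"
    )
}

// One MERGE statement per child partition, assuming children named {table}_p{r}.
fn build_all_child_merges(table: &str, key_col: &str, n_partitions: u32) -> Vec<String> {
    (0..n_partitions)
        .map(|r| build_hash_child_merge(&format!("{table}_p{r}"), key_col, n_partitions, r))
        .collect()
}

fn main() {
    for sql in build_all_child_merges("active_orders", "id", 4) {
        println!("{sql}");
    }
}
```

Targeting ONLY each child lets PostgreSQL validate the MERGE against a concrete leaf table, which is why this strategy works where predicate injection (the RANGE/LIST approach) does not.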

Auto-partition creation (TimescaleDB-style automatic chunk management) remains a post-1.0 item as stated in PLAN_PARTITIONING_SPIKE.md §10.

Partitioning enhancements subtotal: ~5–8 weeks

Performance Defaults (from REPORT_OVERALL_STATUS.md)

Targeted improvements identified in the overall status report. None require large design changes; all build on existing infrastructure.

| Item | Description | Effort | Ref |
|---|---|---|---|
| PERF-2 | Auto-enable buffer_partitioning for high-throughput sources. ✅ Done — should_promote_inner() throughput-based heuristic; convert_buffer_to_partitioned() runtime migration; auto-promote hook in execute_differential_refresh(); docs/CONFIGURATION.md updated; 10 unit tests + 3 E2E tests | — | REPORT_OVERALL_STATUS.md §R7 |
| PERF-3 | Flip tiered_scheduling default to true. The feature has been implemented and tested since v0.10.0. ✅ Done — default flipped; CONFIGURATION.md updated with tier thresholds section | — | src/config.rs · docs/CONFIGURATION.md |
| PERF-1 | Adaptive scheduler wake interval. ➡️ Pulled forward to v0.11.0 as WAKE-1. | — | REPORT_OVERALL_STATUS.md §R3/R16 |
| PERF-4 | Flip block_source_ddl default to true. ➡️ Pulled forward to v0.11.0 as DEF-5. | — | REPORT_OVERALL_STATUS.md §R12 |
| PERF-5 | Wider changed-column bitmask (>63 columns). ➡️ Pulled forward to v0.11.0 as WB-1/WB-2. | — | REPORT_OVERALL_STATUS.md §R13 |

Performance defaults subtotal: ~1–3 weeks

DAG Refresh Performance Improvements (from PLAN_DAG_PERFORMANCE.md §8)

➡️ Moved to v0.11.0 — these items build directly on the ST-to-ST differential infrastructure shipped in v0.11.0 Phase 8 and are most impactful while that work is fresh.

v0.12.0 total: ~18–27 weeks + ~6–8 weeks scalability + ~5–8 weeks partitioning enhancements + ~1–3 weeks defaults + ~3–5 weeks developer tooling & observability

Priority tiers: P0 = Phases 1–3 (must ship); P1 = Phases 4 + 7 (target); P2 = Phases 5, 6, 8 (can defer to v0.13.0 as a unit — never partially ship Phase 5/6).

dbt Macro Updates (Phase 8)

Priority P2 — Expose the v0.11.0 SQL API additions (partition_by, fuse, fuse_ceiling, fuse_sensitivity) in the dbt materialization macros so dbt users can configure them via config(...). No catalog changes; pure Jinja/SQL. Can defer to v0.13.0 as a unit.

| Item | Description | Effort |
|---|---|---|
| DBT-1 | partition_by config option wired through stream_table.sql, create_stream_table.sql, and alter_stream_table.sql | ~1d |
| DBT-2 | fuse, fuse_ceiling, fuse_sensitivity config options wired through the materialization and alter macro with change-detection logic | ~1–2d |
| DBT-3 | dbt docs update: README and SQL_REFERENCE.md dbt section | ~0.5d |

dbt macro updates subtotal: ~2–3.5 days

Exit criteria — all met (v0.12.0 Released 2026-03-28):

  • EC01B-1/2: No phantom-row drop for ≥3-scan right-subtree joins; TPC-H Q7/Q8/Q9 DELETE regression tests pass ✅
  • BENCH-W: Write-side overhead benchmarks published in docs/BENCHMARK.md
  • DAG-B1–B4: DAG topology benchmark suite complete ✅
  • SQLANCER-1/2/3: Crash-test + equivalence oracles in weekly CI job; zero mismatches ✅
  • PROP-5+6: Topology stress and DAG/scheduler helper property tests pass ✅
  • DT-1–4: explain_query_rewrite(), diagnose_errors(), list_auxiliary_columns(), validate_query() callable from SQL ✅
  • G13-SD: max_parse_depth guard active; pathological query returns QueryTooComplex
  • G17-IMS: IMMEDIATE mode concurrency stress test (5 scenarios × 100+ concurrent DML) passes ✅
  • G12-SQL-IN: Multi-column IN subquery documented as unsupported with structured error + EXISTS hint ✅
  • G17-MERGEEX: MERGE template EXPLAIN validation at E2E test startup ✅
  • PERF-3: tiered_scheduling default is true; CONFIGURATION.md updated ✅
  • ST-ST-9: Content-hash pk_hash in ST change buffers; stale-row-after-UPDATE bug fixed ✅
  • DAG-4 bypass column types fixed; parallel worker tests complete without timeout ✅
  • docs/UPGRADING.md updated with v0.11.0→v0.12.0 migration notes ✅
  • scripts/check_upgrade_completeness.sh passes ✅
  • Extension upgrade path tested (0.11.0 → 0.12.0) ✅

v0.13.0 — Scalability Foundations, Partitioning Enhancements, MERGE Profiling & Multi-Tenant Scheduling

Status: Released (2026-03-31).

Goal: Deliver the scalability foundations deferred from v0.12.0 — columnar change tracking and shared change buffers — alongside the partitioning enhancements that build on v0.11.0's RANGE partitioning spike. Also in scope: a MERGE deduplication profiling pass, the dbt macro updates, per-database worker quotas for multi-tenant deployments, the TPC-H-derived benchmarking harness for data-driven performance validation, and a small SQL coverage cleanup for PG 16+ expression types.

Completed items (click to expand)

Phases from PLAN_0_12_0.md: Phases 5 (Scalability), 6 (Partitioning), 7 (MERGE Profiling), and 8 (dbt Macro Updates). Plus three new phases: 9 (Multi-Tenant Scheduler Isolation), 10 (TPC-H Benchmark Harness), and 11 (SQL Coverage Cleanup).

Scalability Foundations (Phase 5)

In plain terms: These items directly serve the project's primary goal of world-class performance and scalability. Columnar change tracking eliminates wasted delta processing for wide tables, and shared change buffers reduce I/O multiplication in deployments with many stream tables reading from the same source.

| Item | Description | Effort | Ref |
|---|---|---|---|
| A-2 | Columnar Change Tracking. Per-column bitmask in CDC triggers; skip rows where no referenced column changed; lightweight UPDATE-only path when only projected columns changed; 50–90% delta-volume reduction for wide-table UPDATE workloads. | 3–4 wk | PLAN_NEW_STUFF.md §A-2 |
| D-4 | Shared Change Buffers. Single buffer per source shared across all dependent STs; multi-frontier cleanup coordination; static-superset column mode for initial implementation. | 3–4 wk | PLAN_NEW_STUFF.md §D-4 |
| PERF-2 | Auto-enable buffer_partitioning for high-throughput sources. ✅ Done — throughput-based auto-promotion: buffer exceeding compact_threshold in a single refresh cycle is converted to RANGE(lsn) partitioned mode at runtime. | — | REPORT_OVERALL_STATUS.md §R7 |

⚠️ D-4 multi-frontier cleanup correctness verified. MIN(consumer_frontier) used in all cleanup paths. Property-based tests with 5–10 consumers and 500 random frontier advancement cases pass.

Scalability foundations subtotal: ~6–8 weeks
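The A-2 relevance test can be sketched with a simplified model: assume each CDC row carries a u64 bitmask with bit i set when column i changed, and each stream table precomputes masks of the source columns it references, split into key columns and projected value columns. The names and shapes here are illustrative, not the actual trigger code.

```rust
// Sketch of the columnar change-tracking skip test: a changed-columns bitmask
// is intersected with the stream table's referenced-column masks. No overlap
// means the UPDATE is irrelevant to this ST; value-only overlap means the
// cheap UPDATE-only path; any key-column change needs the full delta.
#[derive(Debug, PartialEq)]
enum DeltaAction {
    Skip,            // no referenced column changed — row contributes nothing
    ValueOnlyUpdate, // only projected (non-key) columns changed — cheap path
    FullDelta,       // a key column changed — full delete+insert delta needed
}

fn classify_update(changed: u64, key_mask: u64, value_mask: u64) -> DeltaAction {
    if changed & (key_mask | value_mask) == 0 {
        DeltaAction::Skip
    } else if changed & key_mask == 0 {
        DeltaAction::ValueOnlyUpdate
    } else {
        DeltaAction::FullDelta
    }
}

fn main() {
    let key_mask = 0b0001;   // column 0 participates in the join/group key
    let value_mask = 0b0010; // column 1 is projected into the ST
    println!("{:?}", classify_update(0b0100, key_mask, value_mask));
}
```

The 50–90% reduction quoted above comes from wide tables where most UPDATEs touch columns the stream table never reads, so the Skip branch dominates.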

Partitioning Enhancements (Phase 6)

In plain terms: The v0.11.0 spike delivered RANGE partitioning end-to-end. These follow-on items extend coverage to the use cases deliberately deferred from A1: multi-column keys, retrofitting existing stream tables, LIST-based partitions, HASH partitions, and operational quality-of-life improvements.

| Item | Description | Effort | Ref |
|---|---|---|---|
| A1-1b | Multi-column partition keys. Comma-separated partition_by; ROW() predicate for composite keys. ✅ Done | — | src/api.rs, src/refresh.rs |
| A1-1c | alter_stream_table(partition_by => …) support. Add/change/remove partition key with full storage rebuild. ✅ Done | — | src/api.rs, src/catalog.rs |
| A1-1d | LIST partitioning support. PARTITION BY LIST for low-cardinality columns; IN (…) predicate style from the delta. ✅ Done | — | src/api.rs, src/refresh.rs |
| A1-3b | HASH partitioning via per-partition MERGE loop. HASH:col[:N] with auto-created child partitions; per-partition MERGE through satisfies_hash_partition(). ✅ Done | — | src/api.rs, src/refresh.rs |
| PART-WARN | Default-partition growth warning. warn_default_partition_growth() after FULL and DIFFERENTIAL refresh. ✅ Done | — | src/refresh.rs |

Partitioning enhancements subtotal: ~5–8 weeks

MERGE Profiling (Phase 7)

| Item | Description | Effort | Ref |
|---|---|---|---|
| G14-MDED | MERGE deduplication profiling. Profile how often concurrent-write scenarios produce duplicate key entries requiring pre-MERGE compaction. If ≥10% of refresh cycles need dedup, write an RFC for a two-pass MERGE strategy. | 3–5d | plans/performance/REPORT_OVERALL_STATUS.md §14 |
| PROF-DLT | Delta SQL query plan profiling (explain_delta() function). Capture EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) for auto-generated delta SQL queries to identify PostgreSQL execution bottlenecks (join algorithms, scan types, sort spills). Add pgtrickle.explain_delta(st_name, format DEFAULT 'text') SQL function; optional PGS_PROFILE_DELTA=1 environment variable for E2E test auto-capture to /tmp/delta_plans/<st>.json. Enables identification of operator-level performance issues (semi-join full scans, deep join chains). Prerequisite for data-driven MERGE optimization. | 1–2w | PLAN_TPC_H_BENCHMARKING.md §1-5 |

MERGE profiling subtotal: ~1–3 weeks

dbt Macro Updates (Phase 8)

In plain terms: Expose the v0.11.0 SQL API additions (partition_by, fuse, fuse_ceiling, fuse_sensitivity) in the dbt materialization macros so dbt users can configure them via config(...). No catalog changes; pure Jinja/SQL.

| Item | Description | Effort |
|---|---|---|
| DBT-1 | partition_by config option wired through stream_table.sql, create_stream_table.sql, and alter_stream_table.sql | ~1d |
| DBT-2 | fuse, fuse_ceiling, fuse_sensitivity config options wired through the materialization and alter macro with change-detection logic | ~1–2d |
| DBT-3 | dbt docs update: README and SQL_REFERENCE.md dbt section | ~0.5d |

dbt macro updates subtotal: ~2–3.5 days

Multi-Tenant Scheduler Isolation (Phase 9)

In plain terms: As deployments grow past 10 databases on a single cluster, all schedulers compete for the same global background-worker pool. One busy database can starve the others. Phase 9 gives operators per-database quotas and a priority queue so critical databases always get workers.

| Item | Description | Effort | Ref |
|---|---|---|---|
| C-3 | Per-database worker quotas. Add pg_trickle.per_database_worker_quota GUC; priority ordering: IMMEDIATE > Hot > Warm > Cold STs; burst capacity up to 150% when other databases are under quota. ✅ Done — GUC registered; compute_per_db_quota() with 80% burst; tier-aware sort_ready_queue_by_priority; 5 unit tests + 6 E2E tests | — | src/scheduler.rs |

⚠️ C-3 depends on C-1 (tiered scheduling) for Hot/Warm/Cold classification. If C-1 is not ready, fall back to IMMEDIATE > all-other ordering with equal priority within each tier; add full tier-aware ordering as a follow-on when C-1 lands in v0.14.0.

Multi-tenant scheduler isolation subtotal: ~2–3 weeks
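The quota and ordering rules described in C-3 can be sketched as follows. The burst condition here (cluster pool under 80% utilized) and the flat per-database quota are simplifying assumptions for illustration; the real compute_per_db_quota() in src/scheduler.rs may differ.

```rust
// Sketch of per-database quota with burst, plus tier-aware priority ordering.
// Derived Ord on Tier follows declaration order, so ascending sort puts
// IMMEDIATE first, matching the IMMEDIATE > Hot > Warm > Cold rule above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Tier { Immediate, Hot, Warm, Cold }

fn effective_quota(base_quota: u32, cluster_used: u32, cluster_total: u32) -> u32 {
    // Burst to 150% of quota only while the shared pool is under 80% utilized,
    // i.e. while other databases are leaving workers on the table.
    if (cluster_used as f64) < 0.8 * cluster_total as f64 {
        base_quota + base_quota / 2
    } else {
        base_quota
    }
}

fn sort_ready_queue(queue: &mut Vec<(String, Tier)>) {
    queue.sort_by_key(|entry| entry.1);
}

fn main() {
    let mut ready: Vec<(String, Tier)> = vec![
        ("daily_rollup".to_string(), Tier::Cold),
        ("orders_live".to_string(), Tier::Immediate),
        ("hourly_stats".to_string(), Tier::Warm),
    ];
    sort_ready_queue(&mut ready);
    println!("{:?}", ready);
    println!("quota = {}", effective_quota(4, 2, 10));
}
```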

TPC-H Benchmark Harness (Phase 10)

In plain terms: The existing TPC-H correctness suite (22/22 queries passing) has no timing infrastructure. Phase 10 adds benchmark mode so we can measure FULL vs DIFFERENTIAL speedups across all 22 queries — the only way to validate that A-2, D-4, and other v0.13.0 changes actually help on realistic analytical workloads, and to catch per-query regressions at larger scale factors.

| Item | Description | Effort | Ref |
|---|---|---|---|
| TPCH-1 | TPCH_BENCH=1 benchmark mode for Phase 3. Instrument test_tpch_full_vs_differential with warm-up cycles (WARMUP_CYCLES=2), reuse extract_last_profile() for [PGS_PROFILE] extraction, emit [TPCH_BENCH] structured output per cycle (query=q01 tier=2 cycle=1 mode=DIFF ms=12.7 decision=0.41 merge=11.3 …). Add print_tpch_summary() with per-query FULL/DIFF median, speedup, P95, and MERGE% table. | 4–5h | PLAN_TPC_H_BENCHMARKING.md §3 |
| TPCH-2 | just bench-tpch / bench-tpch-large / bench-tpch-fast justfile targets. bench-tpch: SF-0.01 with TPCH_BENCH=1; bench-tpch-large: SF-0.1 with 5 cycles; bench-tpch-fast: skip Docker image rebuild. Enables before/after measurement for every v0.13.0 optimization. | 15 min | PLAN_TPC_H_BENCHMARKING.md §3 |
| TPCH-3 | TPC-H OpTree Criterion micro-benchmarks. Add composite OpTree benchmarks to benches/diff_operators.rs representing TPC-H query shapes (diff_tpch_q01, diff_tpch_q05, diff_tpch_q08, diff_tpch_q18, diff_tpch_q21). Measures pure-Rust delta SQL generation time for complex multi-join/semi-join trees; catches DVM engine regressions without a running database. | 4h | PLAN_TPC_H_BENCHMARKING.md §4 |

TPC-H benchmark harness subtotal: ~1 day

SQL Coverage Cleanup (Phase 11)

In plain terms: Three small SQL expression gaps that are not scheduled anywhere else. Two are standard PG 16+ SQL syntax currently rejected with errors; one is an audit-gated correctness check for recursive CTEs with non-monotone operators. All are low-effort items that round out DVM coverage without adding scope risk.

| Item | Description | Effort | Ref |
|---|---|---|---|
| SQL-RECUR | Recursive CTE non-monotone divergence audit. Write an E2E test for a recursive CTE with EXCEPT or aggregation in the recursive term (WITH RECURSIVE … SELECT … EXCEPT SELECT …). If the test passes → downgrade G1.3 to P4 (verified correct, no code change). If it fails → add a guard in diff_recursive_cte that detects non-monotone recursive terms and rejects them with ERROR: non-monotone recursive CTEs are not supported in DIFFERENTIAL mode — use FULL. | 6–8h | GAP_SQL_PHASE_7.md §G1.3 |
| SQL-PG16-1 | IS JSON predicate support (PG 16+). expr IS JSON, expr IS JSON OBJECT, expr IS JSON ARRAY, expr IS JSON SCALAR, expr IS JSON WITH UNIQUE KEYS — standard SQL/JSON predicates rejected today. Add a T_JsonIsPredicate arm in parser.rs; the predicate is treated opaquely (no delta decomposition); it passes through to the delta SQL unchanged where the PG executor evaluates it natively. | 2–3h | GAP_SQL_PHASE_6.md §G1.4 |
| SQL-PG16-2 | SQL/JSON constructor support (PG 16+). JSON_OBJECT(…), JSON_ARRAY(…), JSON_OBJECTAGG(…), JSON_ARRAYAGG(…) — standard SQL/JSON constructors (T_JsonConstructorExpr) currently rejected. Add opaque pass-through in parser.rs; treat as scalar expressions (no incremental maintenance of the JSON value itself); handle the aggregate variants the same way as other custom aggregates (full group rescan). | 4–6h | GAP_SQL_PHASE_6.md §G1.5 |

SQL coverage cleanup subtotal: ~1–2 days

DVM Engine Improvements

In plain terms: The delta SQL generated for deep multi-table joins (e.g., TPC-H Q05/Q09 with 6 joined tables) computes identical pre-change snapshots redundantly at every reference site, spilling multi-GB temporary files that exceed temp_file_limit. Nested semi-joins (Q20) exhibit an O(n²) blowup from fully materializing the right-side pre-change state. These improvements target the intermediate data volume directly in the delta SQL generator, with TPC-H 22/22 DIFFERENTIAL correctness as the measurable gate.

| Item | Description | Effort | Ref |
|---|---|---|---|
| DI-1 | Named CTE L₀ snapshots. Emit per-leaf pre-change snapshots as named CTEs (NOT MATERIALIZED default; MATERIALIZED when reference count ≥ 3); deduplicate 3–10× redundant EXCEPT ALL evaluations per leaf. Targets Q05/Q09 temp spill root cause. | 2–3d | PLAN_DVM_IMPROVEMENTS.md §DI-1 |
| DI-2 | Pre-image read from change buffer + aggregate UPDATE-split. Replace per-leaf EXCEPT ALL with a NOT EXISTS anti-join on pk_hash + direct old_* read. Per-leaf conditional fallback to EXCEPT ALL when delta exceeds max_delta_fraction for that leaf. Includes aggregate UPDATE-split: the 'D' side of SUM(CASE WHEN …) evaluates using old_* column values, superseding DI-8's band-aid. | 3.5–5.5d | PLAN_DVM_IMPROVEMENTS.md §DI-2 |
| DI-3 | Group-key filtered aggregate old rescan. Restrict non-algebraic aggregate EXCEPT ALL rescans to affected groups via EXISTS (… IS NOT DISTINCT FROM …) filter. NULL-safe. Independent quick win. | 0.5–1d | PLAN_DVM_IMPROVEMENTS.md §DI-3 |
| DI-6 | Lazy semi-join R_old materialization. Skip EXCEPT ALL for unchanged semi-join right children; push down equi-join key as a filter when R_old is needed. Eliminates Q20-type O(n²) blowup. | 1–2d | PLAN_DVM_IMPROVEMENTS.md §DI-6 |
| DI-4 | Shared R₀ CTE cache. Cache pre-change snapshot SQL by OpTree node identity to avoid regenerating duplicate inline subqueries for shared subtrees. Depends on DI-1. | 1–2d | PLAN_DVM_IMPROVEMENTS.md §DI-4 |
| DI-5 | Part 3 correction consolidation. Consolidate per-node Part 3 correction CTEs for linear inner-join chains into a single term. | 2–3d | PLAN_DVM_IMPROVEMENTS.md §DI-5 |
| DI-7 | Scan-count-aware strategy selector. max_differential_joins and max_delta_fraction per-stream-table options; auto-fallback to FULL refresh when join count or delta-rate threshold is exceeded. Complements DI-2's per-leaf fallback with a coarser per-ST guard at scheduler decision time. | 1–2d | PLAN_DVM_IMPROVEMENTS.md §DI-7 |
| DI-8 | SUM(CASE WHEN …) algebraic drift fix. Detect Expr::Raw("CASE …") in is_algebraically_invertible() and fall back to GROUP_RESCAN. Q14 is unaffected (parsed as ComplexExpression, already GROUP_RESCAN). Correctness band-aid superseded by DI-2's aggregate UPDATE-split. | ~0.5d | PLAN_DVM_IMPROVEMENTS.md §DI-8 |
| DI-9 | Scheduler skips IMMEDIATE-mode tables. Raise scheduler_interval_ms GUC cap to 600,000 ms; return early from refresh-due check for refresh_mode = IMMEDIATE (verified safe: IMMEDIATE drains TABLE-source buffers synchronously; downstream CALCULATED tables detected via has_stream_table_source_changes() independently). | 0.5d | PLAN_DVM_IMPROVEMENTS.md §DI-9 |
| DI-10 | SF=1 benchmark validation gate. Add bench-tpch-sf1 justfile target (TPCH_SF=1 TPCH_BENCH=1). Gate v0.13.0 release on 22/22 queries at SF=1. CI: manual dispatch only (60–180 min runtime, 4h timeout). | ~0.5d | PLAN_DVM_IMPROVEMENTS.md §DI-10 |
| DI-11 | Predicate pushdown + deep-join L₀ threshold + planner hints. (a) Enable push_filter_into_cross_joins() with scalar-subquery guard. (b) Deep-join L₀ threshold (4+ scans): skip L₀ reconstruction, use L₁ + Part 3 correction. (c) Deep-join planner hints (5+ scans): disable nestloop, raise work_mem, override temp_file_limit. Result: 22/22 TPC-H DIFFERENTIAL. | ~1d | — |

DI-2 promoted from v1.x: CDC old_* column capture was completed as part of the typed-column CDC rewrite (already in production). DI-2 scope includes both the join-level pre-image capture (NOT EXISTS anti-join) and an aggregate UPDATE-split that uses old_* values for the 'D' side of SUM(CASE WHEN …), superseding DI-8's GROUP_RESCAN band-aid.

Implementation order: DI-8 → DI-9 → DI-1 → DI-3 → DI-2 → DI-6 → DI-4 → DI-5 → DI-7 → DI-10 → DI-11

DVM improvements subtotal: ~2–3 weeks (DI-8/DI-9 are small independent fixes; DI-1–DI-7 are the core engine work; DI-10 is a validation run; DI-11 is predicate pushdown + deep-join optimization)
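The DI-1 materialization rule is small enough to sketch directly. The function name and CTE naming scheme below are illustrative assumptions, not the actual generator code.

```rust
// Sketch of DI-1: each leaf's pre-change (L0) snapshot becomes a named CTE.
// NOT MATERIALIZED by default so the planner may inline it; MATERIALIZED once
// the snapshot is referenced 3+ times, so it is computed exactly once instead
// of redundantly at every reference site (the Q05/Q09 temp-spill root cause).
fn snapshot_cte(leaf: &str, snapshot_sql: &str, ref_count: usize) -> String {
    let keyword = if ref_count >= 3 { "MATERIALIZED" } else { "NOT MATERIALIZED" };
    format!("{leaf}_l0 AS {keyword} ({snapshot_sql})")
}

fn main() {
    // A hypothetical snapshot referenced four times: force one-shot evaluation.
    println!("{}", snapshot_cte("lineitem", "SELECT * FROM lineitem_pre", 4));
}
```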

Regression-Free Testing Initiative (Q2 2026)

Tracking: TESTING_GAPS_2_IMPLEMENTATION_PROPOSAL.md

Addresses 9 structural weaknesses identified in the regression risk analysis. Target: reduce regression escape rate from ~15% to <5%.

| Phase | Item | Status |
|---|---|---|
| P1 | Test infrastructure hardening: #[must_use] on poll helpers; wait_for_condition with exponential backoff; assert_column_types_match | ✅ Done (2026-03-28) |
| P2 | Join multi-cycle correctness: 7 tests — LEFT/RIGHT/FULL join, join-key update, both-sides DML, 4-table chain, NULL key | ✅ Done (2026-03-28) |
| P3 | Differential ≡ Full equivalence: 11 tests covering every major DVM operator class; effective_refresh_mode guard | ✅ Done (2026-03-28) |
| P4 | DVM operator execution: LATERAL MAX subquery multi-cycle (5 cycles) + recursive CTE org hierarchy multi-cycle (5 cycles) | ✅ Done (2026-03-28) |
| P5 | Failure recovery & schema evolution: 6 failure recovery tests (FR-1..6 in e2e_failure_recovery_tests.rs) + 5 schema evolution tests (SE-1..5 in e2e_ddl_event_tests.rs) | ✅ Done (2026-03-28) |
| P6 | MERGE template unit tests: 8 pure-Rust tests — determine_refresh_action (×5) + build_is_distinct_clause boundary (×3) in src/refresh.rs | ✅ Done (2026-03-28) |

v0.13.0 total: ~15–23 weeks (Scalability: 6–8w, Partitioning: 5–8w, MERGE Profiling: 1–3w, dbt: 2–3.5d, Multi-tenant: 2–3w, TPC-H harness: ~1d, SQL cleanup: ~1–2d, DVM improvements: ~2–3w)

Exit criteria:

  • A-2: Columnar change tracking bitmask skips irrelevant rows; key column classification ✅, __pgt_key_changed annotation ✅, P5 value-only fast path ✅, DiffResult.has_key_changed signal propagation ✅, MERGE value-only UPDATE optimization ✅, upgrade script ✅ — Done
  • D-4: Shared buffer serves multiple STs via per-source changes_{oid} naming; pgt_change_tracking.tracked_by_pgt_ids reference counting; shared_buffer_stats() observability; property-based test with 5–10 consumers (3 properties, 500 cases) ✅ Done; 5 E2E fan-out tests
  • PERF-2: buffer_partitioning = 'auto' activates RANGE(lsn) partitioned mode for high-throughput sources — throughput-based should_promote_inner() heuristic, convert_buffer_to_partitioned() runtime migration, 10 unit tests + 3 E2E tests, docs/CONFIGURATION.md updated ✅ Done
  • A1-1b: Multi-column RANGE partition keys work end-to-end; composite ROW() predicate triggers partition pruning; 3 E2E tests + 5 unit tests ✅ Done
  • A1-1c: alter_stream_table(partition_by => …) repartitions existing storage table without data loss; add/change/remove tested
  • A1-1d: LIST partitioning creates PARTITION BY LIST storage; IN-list predicate injected; single-column-only validated; 4 E2E tests pass
  • A1-3b: HASH partitioning uses per-partition MERGE loop; auto-creates N child partitions; satisfies_hash_partition() filter; 22 unit tests + 6 E2E tests ✅ Done
  • PART-WARN: WARNING emitted when default partition has rows after refresh; warn_default_partition_growth() on both FULL and DIFFERENTIAL paths ✅ Done
  • G14-MDED: Deduplication frequency profiling complete; TOTAL_DIFF_REFRESHES + DEDUP_NEEDED_REFRESHES shared-memory atomic counters; pgtrickle.dedup_stats() reports ratio; RFC threshold documented at ≥10% ✅ Done
  • PROF-DLT: pgtrickle.explain_delta(st_name, format) function captures delta query plans in text/json/xml/yaml; PGS_PROFILE_DELTA=1 auto-capture to /tmp/delta_plans/; documented in SQL_REFERENCE.md ✅ Done
  • C-3: Per-database worker quota enforced; tier-aware priority sort (IMMEDIATE > Hot > Warm > Cold) implemented; GUC + E2E quota tests added; compute_per_db_quota() with burst at 80% cluster load ✅ Done
  • TPCH-1/2: TPCH_BENCH=1 mode emits [TPCH_BENCH] lines + summary table; just bench-tpch and bench-tpch-large targets functional ✅ Done
  • TPCH-3: Five TPC-H OpTree Criterion benchmarks pass and run without a PostgreSQL backend ✅ Done
  • DBT-1/2/3: partition_by, fuse, fuse_ceiling, fuse_sensitivity exposed in dbt macros; change detection wired; integration tests added; README and SQL_REFERENCE.md updated ✅ Done
  • SQL-RECUR: Recursive CTE non-monotone audit complete; G1.3 downgraded to P4 — two Tier 3h E2E tests verify recomputation fallback is correct ✅ Done
  • SQL-PG16-1: IS JSON predicate accepted in DIFFERENTIAL defining queries; E2E tests in e2e_expression_tests.rs confirm correct delta behaviour ✅ Done
  • SQL-PG16-2: JSON_OBJECT, JSON_ARRAY, JSON_OBJECTAGG, JSON_ARRAYAGG accepted in DIFFERENTIAL defining queries; E2E tests in e2e_expression_tests.rs confirm correct delta behaviour ✅ Done
  • scripts/check_upgrade_completeness.sh passes (all catalog changes in sql/pg_trickle--0.12.0--0.13.0.sql) ✅ Done — 58 functions, 8 new columns, all covered
  • DI-8: is_algebraically_invertible() detects Expr::Raw("CASE …") and returns false for SUM(CASE WHEN …) (Q14 unaffected — ComplexExpression); Q12 removed from DIFFERENTIAL_SKIP_ALLOWLIST; 4 unit tests ✅ Done
  • DI-9: scheduler_interval_ms cap raised to 600,000 ms; scheduler skips IMMEDIATE-mode tables in check_schedule(); verified safe for CALCULATED dependants ✅ Done
  • DI-1: Named CTE L₀ snapshots implemented (NOT MATERIALIZED default, MATERIALIZED when ref ≥ 3); Q05/Q09 pass DIFFERENTIAL correctness ✅ Done
  • DI-2: NOT EXISTS anti-join replaces EXCEPT ALL in build_pre_change_snapshot_sql(); per-leaf conditional EXCEPT ALL fallback when delta > max_delta_fraction; aggregate UPDATE-split blocked on Q12 drift root cause (DI-8 band-aid retained) ✅ Done
  • DI-3: Already implemented — non-algebraic aggregate old rescan filtered via EXISTS (… IS NOT DISTINCT FROM …) to affected groups; NULL-safe ✅ Done
  • DI-6: Semi-join R_old lazy materialization with key push-down; Q20 DIFF passes at SF=0.01 ✅ Done
  • DI-4/5/7: R₀ cache (subset of DI-1), Part 3 threshold raised from 3→5, strategy selector + max_delta_fraction complete ✅ Done
  • DI-10: bench-tpch-sf1 target added; 22/22 queries pass at SF=0.01 (3 cycles, zero drift) ✅ Done
  • DI-11: Predicate pushdown enabled with scalar-subquery guard; deep-join L₀ threshold (4 scans); deep-join planner hints (5+ total scans); 22/22 TPC-H DIFFERENTIAL ✅ Done
  • Extension upgrade path tested (0.12.0 → 0.13.0) ✅ Done

v0.14.0 — Tiered Scheduling, UNLOGGED Buffers & Diagnostics

Status: Released (2026-04-02).

Tiered refresh scheduling, UNLOGGED change buffers, refresh mode diagnostics, error-state circuit breaker, a full-featured TUI dashboard, security hardening (SECURITY DEFINER triggers with explicit search_path), GHCR Docker image, pre-deployment checklist, best-practice patterns guide, and comprehensive E2E test coverage. See CHANGELOG.md for the full feature list.

Completed items (click to expand)

Quick Polish & Error State Circuit Breaker (Phase 1 + 1b) — ✅ Done

  • C4: pg_trickle.planner_aggressive GUC consolidates merge_planner_hints + merge_work_mem_mb. Old GUCs deprecated.
  • DIAG-2: Creation-time WARNING for group-rescan and low-cardinality algebraic aggregates. agg_diff_cardinality_threshold GUC added.
  • DOC-OPM: Operator support matrix summary table linked from SQL_REFERENCE.md.
  • ERR-1: Permanent failures immediately set ERROR status with last_error_message/last_error_at. API calls clear error state. E2E test pending.

Manual Tiered Scheduling (Phase 2 — C-1) — ✅ Done

Tiered scheduling infrastructure had been in place since v0.11/v0.12 (refresh_tier column, RefreshTier enum, ALTER ... SET (tier=...), scheduler multipliers). Phase 2 verified completeness and added:

  • C-1b: NOTICE on tier demotion from Hot to Cold/Frozen, alerting operators to the effective interval change.
  • C-1c: Scheduler tier-aware multipliers confirmed: Hot ×1, Warm ×2, Cold ×10, Frozen = skip. Gated by pg_trickle.tiered_scheduling (default true since v0.12.0).

UNLOGGED Change Buffers (Phase 3 — D-1) — ✅ Done

  • D-1a: pg_trickle.unlogged_buffers GUC (default false). New change buffer tables created as UNLOGGED when enabled, reducing WAL amplification by ~30%.
  • D-1b: Crash recovery detection — scheduler detects UNLOGGED buffers emptied by crash (postmaster restart after last refresh) and auto-enqueues FULL refresh.
  • D-1c: pgtrickle.convert_buffers_to_unlogged() utility function for converting existing logged buffers. Documents lock-window warning.
  • D-1e: Documentation in CONFIGURATION.md and SQL_REFERENCE.md.
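The D-1b recovery rule can be sketched as follows. PostgreSQL truncates UNLOGGED tables during crash recovery, so any stream table whose last successful refresh predates the postmaster start may have lost buffered changes. The tuple shape and unix-seconds timestamps are illustrative assumptions, not the scheduler's actual catalog types.

```rust
// Sketch of crash-recovery detection for UNLOGGED change buffers: if the
// postmaster started after a stream table's last refresh and its buffer is
// UNLOGGED, the buffer may have been emptied by recovery, so the table is
// queued for a FULL refresh to re-seed from the sources.
fn tables_needing_full(tables: &[(&str, u64, bool)], postmaster_start: u64) -> Vec<String> {
    tables
        .iter()
        // (name, last_refresh_ts, buffer_is_unlogged)
        .filter(|t| t.2 && postmaster_start > t.1)
        .map(|t| t.0.to_string())
        .collect()
}

fn main() {
    let tables = [
        ("orders_daily", 100, true),  // UNLOGGED, refreshed before restart → FULL
        ("users_agg", 100, false),    // logged buffer survived the crash → skip
        ("fresh_st", 300, true),      // refreshed after restart → buffer intact
    ];
    println!("{:?}", tables_needing_full(&tables, 200));
}
```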

Documentation: Best-Practice Patterns Guide (G16-PAT) — ✅ Done

| Item | Description | Effort | Ref |
|---|---|---|---|
| G16-PAT | Best-practice patterns guide. docs/PATTERNS.md: 6 patterns (Bronze/Silver/Gold, event sourcing, SCD type-1/2, high-fan-out, real-time dashboards, tiered refresh) with SQL examples, anti-patterns, and refresh mode recommendations. | ✅ Done | — |

Patterns guide subtotal: ✅ Done

Long-Running Stability & Multi-Database Testing (G17-SOAK, G17-MDB) — ✅ Done

The soak test validates zero worker crashes, zero ERROR states, and stable RSS under sustained mixed DML. The multi-database test validates catalog isolation, shared-memory independence, and concurrent correctness.

| Item | Description | Effort | Ref |
|---|---|---|---|
| G17-SOAK | Long-running stability soak test. tests/e2e_soak_tests.rs with configurable duration, 5 source tables, mixed DML, health checks, RSS monitoring, correctness verification. just test-soak / just test-soak-short. CI job: schedule + manual dispatch. | ✅ Done | — |
| G17-MDB | Multi-database scheduler isolation test. tests/e2e_mdb_tests.rs with two databases, catalog isolation assertion, concurrent mutation cycles, correctness verification per database. just test-mdb. CI job: schedule + manual dispatch. | ✅ Done | — |

Stability & multi-database testing subtotal: ✅ Done

Container Infrastructure (INFRA-GHCR)

| Item | Description | Effort | Ref |
|---|---|---|---|
| INFRA-GHCR | GHCR Docker image. Dockerfile.ghcr (pinned to postgres:18.3-bookworm) + .github/workflows/ghcr.yml workflow that builds a multi-arch (linux/amd64 + linux/arm64) PostgreSQL 18.3 server image with pg_trickle pre-installed and all sensible GUC defaults baked in. Smoke-tests on amd64 before push. Published to ghcr.io/grove/pg_trickle on every v* tag with immutable (<version>-pg18.3), floating (pg18), and latest tags. Uses GITHUB_TOKEN — no extra secrets. | 4h | — |

Container infrastructure subtotal: ✅ Done

Refresh Mode Diagnostics (DIAG-1) — ✅ Done

Analyzes stream table workload characteristics and recommends the optimal refresh mode. Seven weighted signals (historical change ratio, current change ratio, empirical timing, query complexity, target size, index coverage, latency variance) produce a composite score with a confidence level and a human-readable explanation.

| Item | Description | Effort | Ref |
|---|---|---|---|
| DIAG-1a | src/diagnostics.rs — pure signal-scoring functions + unit tests | ✅ Done | — |
| DIAG-1b | SPI data-gathering layer | ✅ Done | — |
| DIAG-1c | pgtrickle.recommend_refresh_mode() SQL function | ✅ Done | — |
| DIAG-1d | pgtrickle.refresh_efficiency() function | ✅ Done | — |
| DIAG-1e | E2E integration tests; upgrade migration | ✅ Done | — |
| DIAG-1f | Documentation: SQL_REFERENCE.md additions | ✅ Done | — |

The function synthesises 7 weighted signals (historical change ratio 0.30, empirical timing 0.35, current change ratio 0.25, query complexity 0.10, target size 0.10, index coverage 0.05, P95/P50 variance 0.05) into a composite score. Confidence degrades gracefully when history is sparse.
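The scoring shape can be sketched as a plain weighted sum. The weights below are the ones listed above; treating each signal as a value in [0, 1] (1.0 = favors DIFFERENTIAL) and normalizing by the total weight are simplifying assumptions for the example, not the actual src/diagnostics.rs logic.

```rust
// Sketch of the DIAG-1 composite score: seven signals combined with the
// documented weights (which sum to 1.20, hence the normalization).
const WEIGHTS: [(&str, f64); 7] = [
    ("historical_change_ratio", 0.30),
    ("empirical_timing", 0.35),
    ("current_change_ratio", 0.25),
    ("query_complexity", 0.10),
    ("target_size", 0.10),
    ("index_coverage", 0.05),
    ("p95_p50_variance", 0.05),
];

fn composite_score(signals: &[f64; 7]) -> f64 {
    let total: f64 = WEIGHTS.iter().map(|p| p.1).sum();
    let weighted: f64 = WEIGHTS.iter().zip(signals.iter()).map(|(p, s)| p.1 * s).sum();
    weighted / total
}

fn main() {
    // Hypothetical workload: small deltas, fast observed refreshes, modest query.
    let signals = [0.9, 0.8, 0.9, 0.2, 0.5, 0.3, 0.6];
    println!("composite = {:.3}", composite_score(&signals));
}
```

Because the two timing-and-change-ratio signals carry 0.90 of the 1.20 total weight, observed behavior dominates static query analysis, which is why confidence degrades when refresh history is sparse.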

Diagnostics subtotal: ~3.5–7 days

Export Definition API (G15-EX) — ✅ Done

| Item | Description | Effort | Ref |
|---|---|---|---|
| G15-EX | export_definition(name TEXT) — export a stream table configuration as reproducible DDL | ✅ Done | — |

G15-EX subtotal: ~1–2 days

TUI Tool (E3-TUI)

In plain terms: A full-featured terminal user interface (TUI) for managing, monitoring, and diagnosing pg_trickle stream tables without touching SQL. Built with ratatui in Rust, it provides a real-time dashboard (think htop for stream tables), interactive dependency graph visualization, live refresh log, diagnostics with signal breakdown charts, CDC health monitoring, a GUC configuration editor, and a real-time alert feed — all navigable with keyboard shortcuts and a command palette. It also supports every original CLI command as one-shot subcommands for scripting and CI.

| Item | Description | Effort | Ref |
|---|---|---|---|
| E3-TUI | TUI tool (pgtrickle) for interactive management and monitoring | 8–10d | PLAN_TUI.md |

E3-TUI subtotal: ~8–10 days (T1–T8 implemented: CLI skeleton with 18 subcommands, interactive dashboard with 15 views, watch mode with --filter, LISTEN/NOTIFY alerts with JSON parsing, async polling with force-poll, cascade staleness detection, DAG issue detection, sparklines, fuse detail panel, trigger inventory, context-sensitive help, docs/TUI.md)

GUC Surface Consolidation (C4)

| Item | Description | Effort | Ref |
|---|---|---|---|
| C4 | Consolidate merge_planner_hints + merge_work_mem_mb into single planner_aggressive boolean. Reduces GUC surface area; existing two GUCs become aliases that emit a deprecation notice. | ~1–2h | PLAN_FEATURE_CLEANUP.md §C4 |

C4 subtotal: ~1–2 hours

Documentation: Pre-Deployment Checklist (DOC-PDC) — ✅ Done

| Item | Description | Effort | Ref |
|---|---|---|---|
| DOC-PDC | Pre-deployment checklist page. docs/PRE_DEPLOYMENT.md: 10-point checklist covering PG version, shared_preload_libraries, WAL configuration, PgBouncer compatibility, recommended GUCs, resource planning, monitoring, validation script. Cross-linked from GETTING_STARTED.md and INSTALL.md. | ✅ Done | — |

DOC-PDC subtotal: ✅ Done

Documentation: Operator Support Matrix Cross-Link (DOC-OPM)

| Item | Description | Effort | Ref |
|---|---|---|---|
| DOC-OPM | Cross-link operator support matrix from SQL_REFERENCE.md. The 60+ operator × FULL/DIFFERENTIAL/IMMEDIATE matrix in DVM_OPERATORS.md is not discoverable from the page users actually read. Add a summary table and prominent link in SQL_REFERENCE.md §Supported SQL Constructs. | ~2–4h | docs/DVM_OPERATORS.md · docs/SQL_REFERENCE.md |

DOC-OPM subtotal: ~2–4 hours

Aggregate Mode Warning at Creation Time (DIAG-2)

In plain terms: Queries with very few distinct GROUP BY groups (e.g. 5 regions from 100K rows) are always faster with FULL refresh — differential overhead exceeds the cost of re-aggregating a tiny result set. Today users discover this only after benchmarking. A creation-time WARNING with an explicit recommendation prevents the surprise. The classification logic is already present in the DVM parser (aggregate strategy classification from is_algebraically_invertible, is_group_rescan); this item exposes it at the SQL boundary.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| DIAG-2 | Aggregate mode warning at create_stream_table time. After parsing the defining query, inspect the top-level operator: if it is an Aggregate node containing non-algebraic (group-rescan) functions such as MIN, MAX, STRING_AGG, ARRAY_AGG, BOOL_AND/OR, emit a WARNING recommending refresh_mode='full' or 'auto' and citing the group-rescan cost. For algebraic aggregates (SUM/COUNT/AVG), emit the warning only when the estimated group cardinality (from pg_stats.n_distinct on the GROUP BY columns) is below pg_trickle.agg_diff_cardinality_threshold (default: 1000 distinct groups), since below this threshold FULL is reliably faster. No behavior change — warning only. | ~2–4h | plans/performance/REPORT_OVERALL_STATUS.md §12.3 |

DIAG-2 subtotal: ~2–4 hours
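A hypothetical session showing the intended creation-time hint (the table, the group count, and the exact message wording are illustrative; the threshold GUC name is taken from the item above):

```sql
-- Illustrative only: 'sales' and the message text are hypothetical.
SELECT pgtrickle.create_stream_table(
    name  => 'sales_by_region',
    query => 'SELECT region, sum(amount) FROM sales GROUP BY region'
);
-- WARNING:  estimated 5 distinct GROUP BY groups is below
--           pg_trickle.agg_diff_cardinality_threshold (1000);
--           consider refresh_mode => 'full' or 'auto'
```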

DIFFERENTIAL Refresh for Manual ST-on-ST Path (FIX-STST-DIFF)

Background: When a stream table reads from another stream table (calculated schedule), the scheduler propagates changes via a per-ST change buffer (pgtrickle_changes.changes_pgt_{id}) and performs a true DIFFERENTIAL DVM refresh against that buffer. The manual pgtrickle.refresh_stream_table() path does not: it currently falls back to an unconditional TRUNCATE + INSERT (FULL refresh) for every call.

This was introduced as a correctness fix in v0.13.0 (PR #371) to close a scheduler race where the previous no-op guard could leave stale data in place. The FULL fallback is correct but inefficient — it pays a full table scan of all upstream STs even when only a small delta is present.

What needs to happen: Wire execute_manual_differential_refresh to use the same changes_pgt_ change buffers the scheduler already writes. When a manual refresh is requested for a calculated ST that has a stored frontier, check each upstream ST's change buffer for rows with lsn > frontier.get_st_lsn(upstream_pgt_id). If new rows exist, apply the DVM delta SQL (same as execute_differential_refresh). If no rows exist beyond the frontier, return a true no-op. This also fixes the pre-existing test_st_on_st_uses_differential_not_full E2E failure.
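The frontier check can be sketched as plain SQL; the buffer name and LSN literal are illustrative, and the real logic lives in src/api.rs:

```sql
-- Does upstream ST #42's change buffer hold rows beyond our stored frontier?
SELECT count(*) > 0 AS has_new_rows
FROM pgtrickle_changes.changes_pgt_42
WHERE lsn > '0/1A2B3C4'::pg_lsn;   -- frontier.get_st_lsn(upstream_pgt_id)
-- true  => apply the DVM delta SQL and advance the frontier
-- false => return a true no-op
```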

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| FIX-STST-DIFF | DIFFERENTIAL manual refresh for ST-on-ST. In execute_manual_differential_refresh (src/api.rs), replace the unconditional FULL fallback for has_st_source with a proper change-buffer delta path: read rows from changes_pgt_{upstream_pgt_id} beyond the stored frontier LSN, run DVM differential SQL, advance the frontier. Matches the scheduler path exactly. Fixes test_st_on_st_uses_differential_not_full. | ✅ Done | |

FIX-STST-DIFF subtotal: ~1–2 days — ✅ Done

v0.14.0 total: ~2–6 weeks + ~1wk patterns guide + ~2–4 days stability tests + ~3.5–7 days diagnostics + ~1–2d export API + ~8–10d TUI + ~0.5d docs + ~2–4h aggregate warning + ~1–2d ST-on-ST diff manual path

Exit criteria:

  • C-1: Tier classification with manual assignment; Cold STs skip refresh correctly; E2E tested ✅ Done
  • D-1: UNLOGGED change buffers opt-in (unlogged_buffers = false by default); crash-recovery FULL-refresh path tested; E2E tested ✅ Done
  • G16-PAT: Patterns guide published in docs/PATTERNS.md covering 6 patterns ✅ Done
  • G17-SOAK: Soak test passes with zero worker crashes, zero zombie stream tables, stable memory ✅ Done
  • G17-MDB: Multi-database scheduler isolation verified ✅ Done
  • DIAG-1: recommend_refresh_mode() + refresh_efficiency() implemented with 7 signals; E2E tested; tutorial published ✅ Done
  • DIAG-2: WARNING emitted at creation time for group-rescan and low-cardinality aggregates; threshold configurable ✅ Done
  • G15-EX: export_definition(name TEXT) returns valid reproducible DDL; round-trip tested ✅ Done
  • E3-TUI: pgtrickle TUI binary builds as workspace member; one-shot CLI commands functional with --format json; interactive dashboard launches with no subcommand; 15 views with cascade staleness, issue detection, sparklines, force-poll, NOTIFY, and context-sensitive help; documented in docs/TUI.md ✅ Done
  • C4: merge_planner_hints and merge_work_mem_mb consolidated into planner_aggressive ✅ Done
  • DOC-PDC: Pre-deployment checklist published in docs/PRE_DEPLOYMENT.md ✅ Done
  • DOC-OPM: Operator mode support matrix summary and link added to SQL_REFERENCE.md ✅ Done
  • FIX-STST-DIFF: Manual DIFFERENTIAL refresh for ST-on-ST path ✅ Done
  • INFRA-GHCR: ghcr.io/grove/pg_trickle multi-arch image builds, smoke-tests, and pushes on v* tags ✅ Done
  • ERR-1: Error-state circuit breaker with E2E test coverage ✅ Done
  • Extension upgrade path tested (0.13.0 → 0.14.0) ✅ Done

v0.15.0 — External Test Suites & Integration

Status: Released (2026-04-03). All 20 roadmap items complete.

Goal: Validate correctness against independent query corpora and ship the dbt integration as a formal release.

Completed items

External Test Suite Integration

In plain terms: pg_trickle's own tests were written by the pg_trickle team, which means they can have the same blind spots as the code. This adds validation against three independent public benchmarks: PostgreSQL's own SQL conformance suite (sqllogictest), the Join Order Benchmark (a realistic analytical query workload), and Nexmark (a streaming data benchmark). If pg_trickle produces a different answer than PostgreSQL does on the same query, these external suites will catch it.

Validate correctness against independent query corpora beyond TPC-H.

➡️ TS1 and TS2 pulled forward to v0.11.0. Delivering one of TS1 or TS2 is an exit criterion for 0.11.0. TS3 (Nexmark) remains in 0.15.0. If TS1/TS2 slip from 0.11.0, they land here.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| TS1 | sqllogictest: run PostgreSQL sqllogic suite through pg_trickle DIFFERENTIAL mode ➡️ Pulled to v0.11.0 | 2–3d | PLAN_TESTING_GAPS.md §J |
| TS2 | JOB (Join Order Benchmark): correctness baseline and refresh latency profiling ➡️ Pulled to v0.11.0 | 1–2d | PLAN_TESTING_GAPS.md §J |
| TS3 | Nexmark streaming benchmark: sustained high-frequency DML correctness | 1–2d | PLAN_TESTING_GAPS.md §J |

External test suites subtotal: ~1–2 days (TS3 only; TS1/TS2 in v0.11.0) — ✅ TS3 complete

Documentation Review

In plain terms: A full documentation review polishes everything so the product is ready to be announced to the wider PostgreSQL community.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| I2 | Complete documentation review & polish | 4–6h | docs/ |

Documentation subtotal: ✅ Done

Bulk Create API (G15-BC)

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| G15-BC | bulk_create(definitions JSONB) — create multiple stream tables and their CDC triggers in a single transaction. Useful for dbt/CI pipelines that manage many STs programmatically. ✅ Done | ~2–3d | plans/performance/REPORT_OVERALL_STATUS.md §15 |

G15-BC subtotal: ✅ Completed

Parser Modularization (G13-PRF) — ✅ Done

In plain terms: At ~21,000 lines, parser.rs was too large to maintain safely. Split into 5 sub-modules by concern — zero behavior change.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| G13-PRF | Modularize src/dvm/parser.rs. ✅ Done. Split into mod.rs, types.rs, validation.rs, rewrites.rs, sublinks.rs. Added // SAFETY: comments to all ~750 unsafe blocks (~676 newly documented). | ~3–4wk | plans/performance/REPORT_OVERALL_STATUS.md §13 |

G13-PRF subtotal: ✅ Completed

Watermark Hold-Back Mode (WM-7) — ✅ Done

In plain terms: The watermark gating system (shipped in v0.7.0) lets ETL producers signal their progress. Hold-back mode adds stuck detection: when a watermark is not advanced within a configurable timeout, downstream stream tables are paused and operators are notified.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| WM-7 | Watermark hold-back mode. watermark_holdback_timeout GUC detects stuck watermarks; pauses downstream gated STs; emits pgtrickle_alert NOTIFY with watermark_stuck event; auto-resumes with watermark_resumed event when watermark advances. | ✅ Done | PLAN_WATERMARK_GATING.md §4.1 |

WM-7 subtotal: ✅ Done

Delta Cost Estimation (PH-E1) — ✅ Done

In plain terms: Before executing the MERGE, runs a capped COUNT on the delta subquery to estimate output cardinality. If the count exceeds pg_trickle.max_delta_estimate_rows, emits a NOTICE and falls back to FULL refresh to prevent OOM or excessive temp-file spills.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| PH-E1 | Delta cost estimation. Capped SELECT count(*) FROM (delta LIMIT N+1) before MERGE execution. max_delta_estimate_rows GUC (default: 0 = disabled). Falls back to FULL + NOTICE when exceeded. | ✅ Done | PLAN_PERFORMANCE_PART_9.md §Phase E |

PH-E1 subtotal: ✅ Complete
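The capped pre-flight count can be sketched as follows, assuming max_delta_estimate_rows = 100000 and writing the generated delta subquery as delta_q (an illustrative name):

```sql
-- LIMIT N+1 bounds the cost of the estimate itself: we only need to know
-- whether the delta exceeds N, not its exact size.
SELECT count(*) AS est
FROM (SELECT 1 FROM delta_q LIMIT 100001) capped;
-- est = 100001 means "more than N": emit NOTICE, fall back to FULL refresh
```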

dbt Hub Publication (I3) — ✅ Done

In plain terms: dbt-pgtrickle is now prepared for dbt Hub publication. The dbt_project.yml is version-synced (0.15.0), README documents both git and Hub install methods, and a submission guide documents the hubcap PR process. Actual Hub listing requires creating a standalone grove/dbt-pgtrickle repository and submitting a PR to dbt-labs/hubcap.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| I3 | Prepared dbt-pgtrickle for dbt Hub publication. Version synced to 0.15.0, README updated with Hub install snippet, submission guide written. Hub listing pending separate repo creation + hubcap PR. | 2–4h | dbt-pgtrickle/ · docs/integrations/dbt-hub-submission.md |

I3 subtotal: ~2–4 hours — ✅ Complete

Hash-Join Planner Hints (PH-D2) — ✅ Done

In plain terms: Added pg_trickle.merge_join_strategy GUC that lets operators manually override the join strategy used during MERGE. Values: auto (default heuristic), hash_join, nested_loop, merge_join. The existing delta-size heuristics remain the default (auto).

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| PH-D2 | Hash-join planner hints. Added merge_join_strategy GUC with manual override for join strategy during MERGE. auto preserves existing delta-size heuristics; hash_join/nested_loop/merge_join force specific strategies. | 3–5d | PLAN_PERFORMANCE_PART_9.md §Phase D |

PH-D2 subtotal: ~3–5 days — ✅ Complete
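Usage is a plain GUC assignment; the values are those listed above:

```sql
-- Force hash joins for MERGE application in this session:
SET pg_trickle.merge_join_strategy = 'hash_join';

-- Restore the delta-size heuristics:
SET pg_trickle.merge_join_strategy = 'auto';
```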

Shared-Memory Template Cache Research Spike (G14-SHC-SPIKE)

In plain terms: Every new database connection that triggers a refresh pays a 15–50ms cold-start cost to regenerate the MERGE SQL template. With PgBouncer in transaction mode, this happens on every refresh cycle. This milestone scopes a research spike only: write an RFC, build a prototype, measure whether DSM-based caching eliminates the cold-start. Full implementation stays in v0.16.0.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| G14-SHC-SPIKE | Shared-memory template cache research spike. Write an RFC for DSM + lwlock-based MERGE SQL template caching. Build a prototype benchmark to validate cold-start elimination. Full implementation deferred to v0.16.0. | 2–3d | plans/performance/REPORT_OVERALL_STATUS.md §14 |

G14-SHC-SPIKE subtotal: ~2–3 days — ✅ RFC complete (plans/performance/RFC_SHARED_TEMPLATE_CACHE.md)

TRUNCATE Capture for Trigger-Mode CDC (TRUNC-1)

In plain terms: WAL-mode CDC detects TRUNCATE on source tables and marks downstream stream tables for reinitialization. But trigger-mode CDC has no TRUNCATE handler — a TRUNCATE silently leaves the stream table stale. Adding a DDL event trigger that catches TRUNCATE and flags affected STs closes this correctness gap.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| TRUNC-1 | TRUNCATE capture for trigger-mode CDC. Add a DDL event trigger or statement-level trigger that detects TRUNCATE on source tables in trigger CDC mode and marks downstream STs for needs_reinit. ✅ Done — CDC TRUNCATE triggers write action='T' marker; refresh engine detects and falls back to FULL. | 4–6h | plans/adrs/PLAN_ADRS.md ADR-070 |

TRUNC-1 subtotal: ✅ Completed
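A minimal sketch of the statement-level capture trigger; the function body, buffer name, and source table are illustrative, while the action='T' marker is the shipped behavior described above:

```sql
CREATE FUNCTION pgtrickle.capture_truncate() RETURNS trigger AS $$
BEGIN
    -- Marker row: the refresh engine sees action='T' and falls back to FULL
    INSERT INTO pgtrickle_changes.changes_pgt_42 (action) VALUES ('T');
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER pgt_truncate_capture
    AFTER TRUNCATE ON orders
    FOR EACH STATEMENT
    EXECUTE FUNCTION pgtrickle.capture_truncate();
```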

Volatile Function Policy GUC (VOL-1)

In plain terms: Volatile functions (random(), clock_timestamp(), etc.) are correctly rejected at stream table creation time in DIFFERENTIAL and IMMEDIATE modes. But there’s no way for users to override this — some want volatile functions in FULL mode. Adding a volatile_function_policy GUC with reject/warn/allow modes gives operators control.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| VOL-1 | pg_trickle.volatile_function_policy GUC. Add a GUC with values reject (default), warn, allow to control volatile function handling. reject preserves current behavior; warn emits WARNING but allows creation; allow silently permits (user accepts correctness risk). ✅ Done | 3–5h | plans/sql/PLAN_NON_DETERMINISM.md |

VOL-1 subtotal: ✅ Completed
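A sketch of the intended usage (the query and ST name are illustrative; warn allows creation with a WARNING, reject keeps today's error):

```sql
SET pg_trickle.volatile_function_policy = 'warn';

SELECT pgtrickle.create_stream_table(
    name     => 'order_sample',
    query    => 'SELECT * FROM orders WHERE random() < 0.01',
    schedule => '5m'
);
-- WARNING instead of ERROR; the user accepts the correctness risk
```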

Spill-Aware Refresh (PH-E2)

In plain terms: After PH-E1 adds pre-flight cost estimation, PH-E2 adds post-flight monitoring: track temp_bytes from pg_stat_statements after each refresh cycle and auto-adjust if spill is excessive.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| PH-E2 | Spill-aware refresh. Monitor temp_bytes from pg_stat_statements after each refresh cycle. If spill exceeds threshold 3 consecutive times, automatically increase per-ST work_mem override or switch to FULL. Expose in explain_st() as spill_history. ✅ Done | 1–2 wk | PLAN_PERFORMANCE_PART_9.md §Phase E |

PH-E2 subtotal: ✅ Completed

ORM Integration Guides (E5)

In plain terms: Documentation showing how popular ORMs (SQLAlchemy, Django, etc.) interact with stream tables — model definitions, migrations, and freshness checks. Documentation-only work.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| E5 | ORM integrations guide (SQLAlchemy, Django, etc.) | 8–12h | PLAN_ECO_SYSTEM.md §5 |

E5 subtotal: ✅ Done

Flyway / Liquibase Migration Support (E4)

In plain terms: Documentation showing how standard migration frameworks interact with stream tables — CREATE/ALTER/DROP patterns, handling CDC triggers across schema migrations. Documentation-only work.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| E4 | Flyway / Liquibase migration support | 8–12h | PLAN_ECO_SYSTEM.md §5 |

E4 subtotal: ✅ Done

JOIN Key Change + DELETE Correctness Fix (EC-01) — ✅ Done (pre-existing)

In plain terms: The phantom-row-after-DELETE bug was fixed in v0.14.0 via the R₀ pre-change snapshot strategy. Part 1 of the JOIN delta is split into 1a (inserts ⋈ R₁) + 1b (deletes ⋈ R₀), ensuring DELETE deltas always find the old join partner. The fix was extended to all join depths via the EC-01B-1 per-leaf CTE strategy, and regression tests (EC-01B-2) cover TPC-H Q07, Q08, Q09.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| EC-01 | R₀ pre-change snapshot for JOIN key change + DELETE. Part 1 split into 1a (inserts ⋈ R₁) + 1b (deletes ⋈ R₀). Applied to INNER/LEFT/FULL JOIN. Closes G1.1. | | GAP_SQL_PHASE_7.md §G1.1 |

EC-01 subtotal: ✅ Complete (implemented in v0.14.0)

Multi-Level ST-on-ST Testing (STST-3)

In plain terms: FIX-STST-DIFF (v0.14.0) fixed 2-level stream-table-on-stream-table DIFFERENTIAL refresh. Some 3-level cascade tests exist, but systematic coverage for 3+ level chains — including mixed refresh modes, concurrent DML at multiple levels, and DELETE/UPDATE propagation through deep chains — is missing. This adds a dedicated test matrix to prevent regressions as cascade depth increases.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| STST-3 | Multi-level ST-on-ST test matrix (3+ levels). Systematic coverage: 3-level and 4-level chains, INSERT/UPDATE/DELETE propagation, mixed DIFFERENTIAL/FULL modes, concurrent DML at multiple levels, correctness comparison against materialized-view baseline. | 3–5d | e2e_cascade_regression_tests.rs |

STST-3 subtotal: ✅ Done

Circular Dependencies + IMMEDIATE Mode (CIRC-IMM)

In plain terms: Circular dependencies are rejected at creation time (EC-30), but the interaction between near-circular topologies (e.g. diamond dependencies with IMMEDIATE triggers on both sides) and IMMEDIATE mode is untested territory. This adds targeted testing and, if needed, hardening to ensure IMMEDIATE mode doesn't deadlock or produce incorrect results on complex dependency graphs. Conditional P1 — can slip to v0.16.0 if no issues surface during other IMMEDIATE-mode work.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| CIRC-IMM | Circular-dependency + IMMEDIATE mode hardening. Test: diamond deps with IMMEDIATE triggers, near-circular topologies, lock ordering under concurrent DML. Add deadlock detection / timeout guard if issues found. | 3–5d | PLAN_EDGE_CASES.md §EC-30 · PLAN_CIRCULAR_REFERENCES.md |

CIRC-IMM subtotal: ✅ Done

Cross-Session MERGE Cache Staleness Fix (G8.1)

In plain terms: When session A alters a stream table's defining query, session B's cached MERGE SQL template remains stale until B encounters a refresh error or reconnects. Adding a catalog version counter that is bumped on every ALTER QUERY and checked before each refresh closes this race window.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| G8.1 | Cross-session MERGE cache invalidation. Add a catalog_version counter to pgt_stream_tables, bump on ALTER QUERY / DROP / reinit. Before each refresh, compare cached version to catalog; regenerate template on mismatch. ✅ Done — existing CACHE_GENERATION counter + defining_query_hash provides cross-session + per-ST invalidation without a schema change. | 4–6h | |

G8.1 subtotal: ✅ Completed

explain_st() Enhancements (EXPL-ENH) — ✅ Done

In plain terms: Small quality-of-life improvements to the diagnostic function: refresh timing statistics, partition source info, and a dependency-graph visualization snippet in DOT format.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| EXPL-ENH | explain_st() enhancements. Added: (a) refresh timing stats (min/max/avg/latest duration from last 20 refreshes), (b) source partition info for partitioned tables, (c) dependency sub-graph visualization in DOT format. | 4–8h | PLAN_FEATURE_CLEANUP.md |

EXPL-ENH subtotal: ~4–8 hours — ✅ Complete

CNPG Operator Hardening (R4)

In plain terms: Kubernetes-native improvements for the CloudNativePG integration: adopt K8s 1.33+ native ImageVolume (replacing the init-container workaround), add liveness/readiness probe integration for pg_trickle health, and test failover behavior with stream tables.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| R4 | CNPG operator hardening. Adopt K8s 1.33+ native ImageVolume, add pg_trickle health to CNPG liveness/readiness probes, test primary→replica failover with active stream tables. | 4–6h | PLAN_CLOUDNATIVEPG.md |

R4 subtotal: ~4–6 hours — ✅ Complete

v0.15.0 total: ~52–90h + ~2–3d bulk create + ~3–5d planner hints + ~2–3d cache spike + ~3–4wk parser + ~1–2wk watermark + ~2–4wk delta cost/spill + ~2–3d EC-01 + ~3–5d ST-on-ST + ~3–5d CIRC-IMM

Exit criteria:

  • At least one external test corpus (sqllogictest, JOB, or Nexmark) passes
  • Complete documentation review done
  • G15-BC: pgtrickle.bulk_create(definitions JSONB) creates all STs and CDC triggers atomically; tested with 10+ definitions in a single call
  • WM-7: Stuck watermarks detected and downstream STs paused; watermark_stuck alert emitted; auto-resume on watermark advance
  • PH-E1: Delta cost estimation via capped COUNT on delta subquery; max_delta_estimate_rows GUC; FULL downgrade + NOTICE when threshold exceeded
  • PH-E2: Spill-aware auto-adjustment triggers after 3 consecutive spills; spill_info exposed in explain_st()
  • PH-D2: merge_join_strategy GUC with manual override (auto/hash_join/nested_loop/merge_join)
  • G14-SHC-SPIKE: RFC written; prototype benchmark validates or invalidates DSM-based approach
  • I2: Complete documentation review done — CONFIGURATION.md GUCs documented (40+), SQL_REFERENCE.md gaps filled, FAQ refs fixed
  • TRUNC-1: TRUNCATE on trigger-mode CDC source marks downstream STs for reinit; tested end-to-end
  • VOL-1: volatile_function_policy GUC controls volatile function handling; reject/warn/allow modes tested
  • I3: dbt-pgtrickle prepared for dbt Hub; submission guide written; Hub listing pending separate repo + hubcap PR
  • E4: Flyway / Liquibase integration guide published in docs/integrations/flyway-liquibase.md
  • E5: ORM integration guides (SQLAlchemy, Django) published in docs/integrations/orm.md
  • EC-01: R₀ pre-change snapshot ensures DELETE deltas find old join partners; unit + TPC-H regression tests confirm correctness
  • STST-3: 3-level and 4-level ST-on-ST chains tested with INSERT/UPDATE/DELETE propagation; mixed modes covered
  • CIRC-IMM: Diamond + near-circular IMMEDIATE topologies tested; no deadlocks or incorrect results
  • G8.1: Cross-session MERGE cache invalidation via catalog version counter; tested with concurrent ALTER QUERY + refresh
  • EXPL-ENH: explain_st() shows refresh timing stats, source partition info, and dependency sub-graph (DOT format)
  • R4: CNPG operator hardening — ImageVolume, health probes, failover tested
  • G13-PRF: parser.rs split into 5 sub-modules; all ~750 unsafe blocks have // SAFETY: comments; zero behavior change; all existing tests pass
  • Extension upgrade path tested (0.14.0 → 0.15.0)
  • just check-version-sync passes

v0.16.0 — Performance & Refresh Optimization

Status: Released (2026-04-06).

Faster refreshes across the board: sub-1% deltas use DELETE+INSERT instead of MERGE, insert-only stream tables auto-detect and skip the MERGE join, algebraic aggregates apply pinpoint updates, and a cross-backend template cache eliminates cold-start latency. Automated benchmark regression gating prevents future performance degradation.

Completed items

Goal: Attack the MERGE bottleneck from multiple angles — alternative merge strategies, algebraic aggregate shortcuts, append-only bypass, delta filtering, change buffer compaction, shared-memory template caching — and close critical test coverage gaps to validate these new paths.

MERGE Alternatives & Planner Control (Phase D)

In plain terms: MERGE dominates 70–97% of refresh time. This explores whether replacing MERGE with DELETE+INSERT (or INSERT ON CONFLICT + DELETE) is faster for specific patterns — particularly for small deltas against large stream tables where the MERGE join is the bottleneck.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| PH-D1 | DELETE+INSERT strategy. For stream tables where delta is <1% of target, replace MERGE with DELETE WHERE __pgt_row_id IN (delta_deletes) + INSERT ... SELECT FROM delta_inserts. Benchmark against MERGE for 1K/10K/100K deltas against 1M/10M targets. Gate behind pg_trickle.merge_strategy = 'auto'\|'merge'\|'delete_insert' GUC. | 1–2 wk | PLAN_PERFORMANCE_PART_9.md §Phase D |

MERGE alternatives subtotal: ~1–2 weeks
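For a small delta, the strategy replaces the single MERGE with two cheap statements. A sketch, with delta_deletes and delta_inserts standing in for the generated delta subqueries:

```sql
-- Opt in (values from the item above):
SET pg_trickle.merge_strategy = 'delete_insert';

-- What the refresh then executes instead of MERGE:
DELETE FROM st
WHERE __pgt_row_id IN (SELECT __pgt_row_id FROM delta_deletes);

INSERT INTO st
SELECT * FROM delta_inserts;
```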

Algebraic Aggregate UPDATE Fast-Path (B-1)

In plain terms: The current aggregate delta rule recomputes entire groups where the GROUP BY key appears in the delta. For a group with 100K rows where 1 row changed, the aggregate re-scans all 100K rows in that group. For decomposable aggregates (SUM/COUNT/AVG), a direct UPDATE target SET col = col + Δ replaces the full MERGE join — dropping aggregate refresh from O(group_size) to O(1) per group.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| B-1 | Algebraic aggregate UPDATE fast-path. For GROUP BY queries where all aggregates are algebraically invertible (SUM/COUNT/AVG), replace the MERGE with a direct UPDATE target SET col = col + Δ WHERE group_key = ? for existing groups, plus INSERT for newly-appearing groups and DELETE for groups whose count reaches zero. Eliminates the MERGE join overhead — the dominant cost for aggregate refresh when group cardinality is high. Requires adding __pgt_aux_count / __pgt_aux_sum auxiliary columns to the stream table. Fallback to existing MERGE path for non-algebraic aggregates (MIN, MAX, STRING_AGG, etc.). Gate behind pg_trickle.aggregate_fast_path GUC (default true). Expected impact: 5–20× apply-time reduction for high-cardinality GROUP BY (10K+ distinct groups); aggregate scenarios at 100K/1% projected to drop from ~50ms to sub-1ms apply time. | 4–6 wk | plans/performance/PLAN_NEW_STUFF.md §B-1 · plans/sql/PLAN_TRANSACTIONAL_IVM.md §Phase 4 |

B-1 subtotal: ~4–6 weeks
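The O(1)-per-group apply can be sketched as a direct UPDATE; delta_groups and the column names are illustrative, standing for the per-group net change computed from the change buffer:

```sql
UPDATE st
SET total           = st.total + d.delta_sum,
    __pgt_aux_count = st.__pgt_aux_count + d.delta_count
FROM delta_groups d
WHERE st.group_key = d.group_key;
-- plus an INSERT for newly-appearing groups, and
-- DELETE FROM st WHERE __pgt_aux_count = 0 for groups that vanished
```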

Append-Only Stream Tables — MERGE Bypass (A-3-AO)

In plain terms: When a stream table's sources are insert-only (e.g. event logs, append-only tables where CDC never sees DELETE/UPDATE), the MERGE is pure overhead — every delta row is an INSERT, never a match. Bypassing MERGE entirely with a plain INSERT INTO st SELECT ... FROM delta removes the join against the target table, takes only RowExclusiveLock, and is the single highest-payoff optimization for event-sourced architectures.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| A-3-AO | Append-only stream table fast path. Expose an explicit CREATE STREAM TABLE … APPEND ONLY declaration. When set, refresh uses INSERT INTO st SELECT ... FROM delta instead of MERGE — no target-table join, RowExclusiveLock only. CDC-observed heuristic fallback: if no DELETE/UPDATE has been seen, use the fast path; fall back to MERGE on first non-insert. Benchmark against MERGE for 1K/10K/100K append deltas. | 1–2 wk | plans/performance/PLAN_NEW_STUFF.md §A-3 |

A-3-AO subtotal: ~1–2 weeks
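On the fast path the whole apply step collapses to a single statement (the ST name is illustrative; delta stands for the generated delta query):

```sql
-- No MERGE, no join against the target, RowExclusiveLock only:
INSERT INTO events_st
SELECT * FROM delta;
```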

Delta Predicate Pushdown (B-2)

In plain terms: For a query like SELECT ... FROM orders WHERE status = 'shipped', if a CDC change row has status = 'pending', the delta processes it through scan → filter → discard. All the scan and join work is wasted. Pushing the WHERE predicate down into the change buffer scan eliminates irrelevant rows before any join processing begins — a 5–10× reduction in delta row volume for selective queries.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| B-2 | Delta predicate pushdown. During OpTree construction, identify Filter nodes whose predicates reference only columns from a single source table. Inject these predicates into the delta_scan CTE as additional WHERE clauses (including OR old_col = 'value' for DELETE correctness). Expected impact: 5–10× delta row reduction for queries with < 10% selectivity. | 2–3 wk | plans/performance/PLAN_NEW_STUFF.md §B-2 |

B-2 subtotal: ~2–3 weeks
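For the SELECT ... FROM orders WHERE status = 'shipped' example above, the pushed-down delta scan might look like this; the buffer name and the new_/old_ column layout are illustrative of the change-row shape:

```sql
WITH delta_scan AS (
    SELECT *
    FROM pgtrickle_changes.changes_pgt_42
    WHERE new_status = 'shipped'
       OR old_status = 'shipped'   -- keeps DELETEs whose old image matched
)
SELECT * FROM delta_scan;
```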

Shared-Memory Template Caching (G14-SHC)

In plain terms: Every new database connection that triggers a refresh pays a 15–50ms cold-start cost to regenerate the MERGE SQL template. With PgBouncer in transaction mode, this happens on every single refresh cycle. Shared-memory caching stores compiled templates in PostgreSQL DSM so they survive across connections — eliminating the cold-start entirely for steady-state workloads.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| G14-SHC | Shared-memory template caching (implementation). Full implementation of DSM + lwlock-based MERGE SQL template caching, building on the G14-SHC-SPIKE RFC from v0.15.0. | ~2–3wk | plans/performance/REPORT_OVERALL_STATUS.md §14 |

G14-SHC subtotal: ~2–3 weeks

PostgreSQL 19 Forward-Compatibility (A3) — Moved to v1.0.0

PG 19 beta not available in time. Items A3-1 through A3-4 deferred to v1.0.0 milestone.

Change Buffer Compaction (C-4)

In plain terms: A high-churn source table can accumulate thousands of changes to the same row between refresh cycles — an INSERT followed by 10 UPDATEs followed by a DELETE is really just "nothing happened." Compaction merges multiple changes to the same row ID into a single net change before the delta query runs, reducing change buffer size by 50–90% for high-churn tables. This directly reduces work for every downstream path (MERGE, DELETE+INSERT, append-only INSERT, predicate pushdown).

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| C-4 | Change buffer compaction. Before delta-query execution, merge multiple changes to the same __pgt_row_id into a single net change: INSERT+DELETE cancel out; consecutive UPDATEs collapse to one. Trigger on buffer exceeding pg_trickle.compact_threshold rows (default: 100K). Expected impact: 50–90% reduction in change buffer size for high-churn tables. | 2–3 wk | plans/performance/PLAN_NEW_STUFF.md §C-4 |

C-4 subtotal: ~2–3 weeks
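The net-change rule can be sketched by comparing the first and last action per row id (toy schema, all names illustrative: action is 'I', 'U', or 'D', ordered by lsn):

```sql
SELECT __pgt_row_id,
       CASE
         WHEN first_action = 'I' AND last_action = 'D' THEN NULL  -- cancels out
         WHEN first_action = 'I'                       THEN 'I'   -- net insert
         WHEN last_action  = 'D'                       THEN 'D'   -- net delete
         ELSE 'U'                                                 -- collapsed update
       END AS net_action
FROM (
    SELECT __pgt_row_id,
           (array_agg(action ORDER BY lsn))[1]      AS first_action,
           (array_agg(action ORDER BY lsn DESC))[1] AS last_action
    FROM changes
    GROUP BY __pgt_row_id
) s;
```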

Test Coverage Hardening (TG2)

In plain terms: The performance optimizations in this release change core refresh paths (MERGE alternatives, aggregate fast-path, append-only bypass, predicate pushdown). Before and alongside these changes, critical test coverage gaps need closing — particularly around operators and scenarios where bugs could hide silently. These gaps were identified in the TESTING_GAPS_2 audit.

High-Priority Gaps

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| TG2-WIN | Window function DVM execution tests. ~5 unit tests exist but 0 DVM execution tests. Add execution-level tests for ROW_NUMBER, RANK, DENSE_RANK, LAG/LEAD delta behavior across INSERT/UPDATE/DELETE cycles. | 3–5d | TESTING_GAPS_2.md |
| TG2-JOIN | Join multi-cycle UPDATE/DELETE correctness. E2E join tests are INSERT-only; no UPDATE/DELETE differential cycles. Add systematic multi-cycle coverage for INNER/LEFT/FULL JOIN with UPDATE and DELETE propagation. Risk: silent data corruption in production workloads. | 3–5d | TESTING_GAPS_2.md |
| TG2-EQUIV | Differential ≡ Full equivalence validation. Only CTEs validated; joins and aggregates lack equivalence proof. Add a test harness that runs every defining query in both DIFFERENTIAL and FULL mode and asserts identical results. Critical for trusting the new optimization paths. | 3–5d | TESTING_GAPS_2.md |

Medium-Priority Gaps

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| TG2-MERGE | refresh.rs MERGE template unit tests. Only helpers/enums tested; the core MERGE SQL template generation is untested at the unit level. | 2–3d | TESTING_GAPS_2.md |
| TG2-CANCEL | Timeout/cancellation during refresh. Zero tests for statement_timeout, pg_cancel_backend() during active refresh. Risk: silent failures or resource leaks under production load. | 1–2d | TESTING_GAPS_2.md |
| TG2-SCHEMA | Source table schema evolution. Partial DDL tests exist; type changes and column renames are thin. Risk: silent data corruption on schema change. | 2–3d | TESTING_GAPS_2.md |

TG2 subtotal: ~2–4 weeks (high-priority) + ~1–2 weeks (medium-priority)

Performance Regression CI (BENCH-CI)

In plain terms: v0.16.0 changes core refresh paths (MERGE alternatives, aggregate fast-path, append-only bypass, predicate pushdown, buffer compaction). Without automated benchmarks in CI, performance regressions will slip through silently. This adds a benchmark suite that runs on every PR and compares against a committed baseline — any statistically significant regression blocks the merge.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| BENCH-CI-1 | Benchmark harness in CI. Run just bench (Criterion-based) on a fixed hardware profile (GitHub Actions large runner or self-hosted). Capture results as JSON artifacts. Compare against committed baseline using Criterion's --save-baseline / --baseline. | 2–3d | plans/performance/PLAN_PERFORMANCE_PART_9.md §I |
| BENCH-CI-2 | Regression gate. Parse Criterion JSON output; fail CI if any benchmark regresses by more than 10% (configurable threshold). Report regressions as PR comment with before/after numbers. | 1–2d | plans/performance/PLAN_PERFORMANCE_PART_9.md §I |
| BENCH-CI-3 | Scenario coverage. Ensure benchmark suite covers: scan, filter, aggregate (algebraic + non-algebraic), join (2-table, 3-table), window function, CTE, TopK, append-only, and mixed workloads. At minimum 1K/10K/100K row scales. | 2–3d | plans/performance/PLAN_PERFORMANCE_PART_9.md §I |

BENCH-CI subtotal: ~1–2 weeks

Auto-Indexing on Stream Table Creation (AUTO-IDX)

In plain terms: pg_ivm automatically creates indexes on GROUP BY columns and primary key columns when creating an incrementally maintained view. pg_trickle currently requires manual index creation, which is a friction point for new users. Auto-indexing creates appropriate indexes at stream table creation time — GROUP BY keys, DISTINCT columns, and the __pgt_row_id covering index for MERGE performance.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| AUTO-IDX-1 | Auto-create indexes on GROUP BY / DISTINCT columns. ✅ GROUP BY composite index (existing) and DISTINCT composite index (new) auto-created at create_stream_table() time. Gated behind pg_trickle.auto_index GUC. | | src/api.rs |
| AUTO-IDX-2 | Covering index on __pgt_row_id. ✅ Already implemented (A-4). Now gated behind pg_trickle.auto_index GUC (default true). | | src/api.rs |

AUTO-IDX: ✅ Done

Quick Wins

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| C2-BUG | Implement missing resume_stream_table(). ✅ Already existed since v0.2.0 — verified operational. | | |
| ERR-REF | Error reference documentation. ✅ Published as docs/ERRORS.md with all 20 variants documented. Cross-linked from FAQ. | | docs/ERRORS.md |
| GUC-DEFAULTS | Review dangerous GUC defaults. ✅ Defaults kept at true (correct for most workloads). Added detailed tuning guidance for memory-constrained and PgBouncer environments in CONFIGURATION.md. | | docs/CONFIGURATION.md |
| BUF-LIMIT | Change buffer hard growth limit. pg_trickle.max_buffer_rows GUC added (default: 1M). Forces FULL refresh + truncation when exceeded. | | src/config.rs · src/refresh.rs |

Quick wins: ✅ Done

v0.16.0 total: ~1–2 weeks (MERGE alts) + ~4–6 weeks (aggregate fast-path) + ~1–2 weeks (append-only) + ~2–3 weeks (predicate pushdown) + ~2–3 weeks (template cache) + ~2–3 weeks (buffer compaction) + ~3–6 weeks (test coverage) + ~1–2 weeks (bench CI) + ~2–3 days (auto-indexing) + ~2–4 hours (quick wins). Note: PG 19 compatibility (A3, ~18–36h) moved to v1.0.0.

Exit criteria:

  • PH-D1: DELETE+INSERT strategy implemented and gated behind merge_strategy GUC; correctness verified for INSERT/UPDATE/DELETE deltas
  • B-1: Algebraic aggregate fast-path replaces MERGE for SUM/COUNT/AVG GROUP BY queries; aggregate_fast_path GUC respected; explicit DML path (DELETE+UPDATE+INSERT) used instead of MERGE for all-algebraic aggregates; explain_st() exposes aggregate_path; existing tests pass — ✅ Done in v0.16.0 Phase 8
  • A-3-AO: CREATE STREAM TABLE … APPEND ONLY accepted; refresh uses INSERT path; heuristic auto-promotion on insert-only buffers; falls back to MERGE on first non-insert CDC event
  • B-2: Delta predicate pushdown implemented for single-source Filter nodes (P2-7); DELETE correctness verified (OR old_col predicate); selective-query benchmarks show delta row reduction
  • G14-SHC: Cross-backend template cache eliminates cold-start; catalog-backed L2 cache with template_cache GUC; invalidation on DDL; explain_st() exposes stats
  • A3: PG 19 builds and passes full E2E suite — moved to v1.0.0
  • C-4: Change buffer compaction reduces buffer size by ≥50% for high-churn workloads; compact_threshold GUC respected; no correctness regressions
  • TG2-WIN: Window function DVM execution tests cover ROW_NUMBER, RANK, DENSE_RANK, LAG/LEAD across INSERT/UPDATE/DELETE
  • TG2-JOIN: Join multi-cycle tests cover INNER/LEFT/FULL JOIN with UPDATE and DELETE propagation; no silent data loss
  • TG2-EQUIV: Differential ≡ Full equivalence validated for joins, aggregates, and window functions
  • TG2-MERGE: refresh.rs MERGE template generation has unit test coverage (completed in v0.17.0)
  • TG2-CANCEL: Timeout and cancellation during refresh tested; no resource leaks (completed in v0.17.0)
  • TG2-SCHEMA: Source table type changes and column renames tested end-to-end
  • BENCH-CI: Performance regression CI runs on every PR; 10% regression threshold blocks merge; scenario coverage includes scan/filter/aggregate/join/window/CTE/TopK/SemiJoin/AntiJoin
  • AUTO-IDX: Stream tables auto-create indexes on GROUP BY / DISTINCT columns; __pgt_row_id covering index for ≤ 8-column tables; auto_index GUC respected
  • C2-BUG: resume_stream_table() verified operational (present since v0.2.0)
  • ERR-REF: Error reference doc published with all 20 PgTrickleError variants, common causes, and suggested fixes
  • GUC-DEFAULTS: planner_aggressive and cleanup_use_truncate defaults reviewed; trade-offs documented in CONFIGURATION.md
  • BUF-LIMIT: max_buffer_rows GUC prevents unbounded change buffer growth; triggers FULL + truncation when exceeded
  • Extension upgrade path tested (0.15.0 → 0.16.0)
  • just check-version-sync passes

v0.17.0 — Query Intelligence & Stability

Status: Released (2026-04-08).

Goal: Make the refresh engine smarter, prove correctness through automated fuzzing, harden for scale, and prepare for adoption. Cost-based strategy selection replaces the fixed DIFF/FULL threshold, columnar change tracking skips irrelevant columns in wide-table UPDATEs, SQLancer integration provides automated semantic proving, incremental DAG rebuild supports 1000+ stream table deployments, and unsafe block reduction continues the safety hardening toward 1.0. On the adoption side: api.rs modularization improves code maintainability, a pg_ivm migration guide targets the largest potential adopter audience, a failure mode runbook equips production teams, and a Docker Compose playground provides a 60-second tryout experience.

Completed items (click to expand)

Cost-Based Refresh Strategy Selection (B-4)

In plain terms: The current adaptive FULL/DIFFERENTIAL threshold is a fixed ratio (differential_max_change_ratio default 0.5). A join-heavy query may be better off with FULL at 5% change rate, while a scan-only query benefits from DIFFERENTIAL up to 80%. This replaces the fixed threshold with a cost model trained on each stream table's own refresh history — selecting the cheapest strategy per cycle automatically.

| Item | Description | Effort | Ref |
|---|---|---|---|
| B-4 | Cost-based refresh strategy selection. Collect per-ST statistics (delta_row_count, merge_duration_ms, full_refresh_duration_ms, query_complexity_class) from pgt_refresh_history. Fit a simple linear cost model. Before each refresh, compare estimated_diff_cost(Δ) vs estimated_full_cost × safety_margin and select the cheaper path. Cold-start heuristic (< 10 refreshes) falls back to existing fixed threshold. Gate behind pg_trickle.refresh_strategy = 'auto'\|'differential'\|'full' GUC. | 2–3 wk | plans/performance/PLAN_NEW_STUFF.md §B-4 |

B-4 subtotal: ~2–3 weeks
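The pre-refresh comparison described above can be sketched as follows. This is a hedged illustration: the coefficient stands in for a model fit from `pgt_refresh_history`, and all type and function names (`CostModel`, `pick_strategy`) are assumptions, not pg_trickle's actual API.

```rust
// Minimal sketch of cost-based strategy selection (illustrative names only).
struct CostModel {
    per_delta_row_ms: f64, // fitted slope: differential cost per delta row
    full_refresh_ms: f64,  // observed mean FULL refresh duration
}

#[derive(Debug, PartialEq)]
enum Strategy {
    Differential,
    Full,
}

fn pick_strategy(m: &CostModel, delta_rows: u64, safety_margin: f64, history: usize) -> Strategy {
    if history < 10 {
        // Cold start: too little history to trust the model; the real code
        // falls back to the fixed differential_max_change_ratio threshold.
        return Strategy::Differential;
    }
    let diff_cost = m.per_delta_row_ms * delta_rows as f64;
    if diff_cost < m.full_refresh_ms * safety_margin {
        Strategy::Differential
    } else {
        Strategy::Full
    }
}
```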

Columnar Change Tracking (A-2-COL)

In plain terms: When a source table UPDATE changes only 1 of 50 columns, the current CDC captures the entire row (old + new) and the delta query processes all columns. If the changed column is not referenced by the stream table's defining query, the entire refresh is wasted work. Columnar change tracking adds a per-column bitmask to CDC events so the delta query can skip irrelevant rows at scan time — a 50–90% reduction in delta volume for wide-table OLTP workloads.

| Item | Description | Effort | Ref |
|---|---|---|---|
| A-2-COL-1 | CDC trigger bitmask. Compute changed_columns bitmask (old.col IS DISTINCT FROM new.col) in the CDC trigger; store as int8 or bit(n) alongside the change row. | 1–2 wk | plans/performance/PLAN_NEW_STUFF.md §A-2 |
| A-2-COL-2 | Delta-scan column filtering. At delta-query build time, consult the bitmask: skip rows where no referenced column changed; use lightweight UPDATE-only path when only projected columns changed (no join keys, no filter predicates, no aggregate keys). | 1–2 wk | plans/performance/PLAN_NEW_STUFF.md §A-2 |
| A-2-COL-3 | Aggregate correction optimization. For aggregates where only the aggregated value column changed (not GROUP BY key), emit a single correction row instead of delete-old + insert-new. | 3–5d | plans/performance/PLAN_NEW_STUFF.md §A-2 |

A-2-COL subtotal: ~3–4 weeks
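The bitmask logic can be illustrated as below. This is a sketch under assumptions: the function names are invented, and `Option` stands in for SQL NULL (Rust's `==` on `Option` matches `IS DISTINCT FROM` semantics, where two NULLs are not distinct).

```rust
// Illustrative per-column change bitmask: bit i is set when column i differs
// between the old and new row images.
fn changed_columns(old: &[Option<&str>], new: &[Option<&str>]) -> u64 {
    old.iter()
        .zip(new.iter())
        .enumerate()
        .filter(|(_, (o, n))| o != n) // mirrors IS DISTINCT FROM for nullable values
        .fold(0u64, |mask, (i, _)| mask | (1 << i))
}

// A change row can be skipped when no column referenced by the stream table's
// defining query actually changed.
fn is_relevant(changed: u64, referenced_mask: u64) -> bool {
    changed & referenced_mask != 0
}
```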

Transactional IVM Phase 4 Remaining (A2)

In plain terms: IMMEDIATE mode (same-transaction refresh) shipped in v0.2.0 using SQL-level statement triggers. Phase 4 completes the transition to lower-overhead C-level triggers and ENR-based transition tables — sharing the transition tuplestore directly between the trigger and the refresh engine instead of copying through a temp table. Also adds prepared statement reuse to eliminate repeated parse/plan overhead for the delta query.

| Item | Description | Effort | Ref |
|---|---|---|---|
| A2-ENR | ENR-based transition tables. 🚫 Deferred post-1.0 — requires raw pg_sys ENR tuplestore FFI not surfaced by pgrx; carries memory-corruption and pg_upgrade compatibility risk. Revisit after 1.0 stabilisation. | 12–18h | PLAN_TRANSACTIONAL_IVM.md §Phase 4 |
| A2-CTR | C-level triggers. 🚫 Deferred post-1.0 — requires raw CreateTrigger() FFI not surfaced by pgrx; carries memory-corruption and pg_upgrade compatibility risk. Revisit after 1.0 stabilisation. | 12–18h | PLAN_TRANSACTIONAL_IVM.md §Phase 4 |
| A2-PS | Prepared statement reuse. Already shipped — pg_trickle.use_prepared_statements GUC (default true) implemented and wired in refresh.rs; parse/plan overhead eliminated on steady-state workloads. | 8–12h | PLAN_TRANSACTIONAL_IVM.md §Phase 4 |

A2 subtotal: 0h remaining (A2-PS shipped; A2-ENR + A2-CTR deferred post-1.0)

ROWS FROM() Support (A8)

In plain terms: ROWS FROM() with multiple set-returning functions is a rarely-used SQL feature, but supporting it closes a coverage gap in the parser and DVM pipeline.

| Item | Description | Effort | Ref |
|---|---|---|---|
| A8 | ROWS FROM() with multiple SRF functions. Parser + DVM support for ROWS FROM(generate_series(...), unnest(...)) in defining queries. Very low demand. | ~1–2d | PLAN_TRANSACTIONAL_IVM_PART_2.md Task 2.3 |

A8 subtotal: ~1–2 days

SQLancer Fuzzing Integration (SQLANCER)

In plain terms: pg_trickle's tests were written by the pg_trickle team, which means they share the same assumptions as the code. SQLancer is an automated database testing tool that generates random SQL queries and checks whether the results are correct — it has found hundreds of bugs in PostgreSQL, SQLite, CockroachDB, and TiDB. Integrating SQLancer gives pg_trickle a crash-test oracle (does the parser panic on fuzzed input?), an equivalence oracle (does DIFFERENTIAL mode produce the same answer as FULL?), and stateful DML fuzzing (do random INSERT/UPDATE/DELETE sequences corrupt stream table data?). This is the single highest-value testing investment for finding unknown correctness bugs.

| Item | Description | Effort | Ref |
|---|---|---|---|
| SQLANCER-1 | Fuzzing environment. Done — Docker-based harness (just sqlancer), Rust LCG query generator, SQLANCER_CASES/SQLANCER_SEED controls, weekly-sqlancer CI job. | 2–3d | PLAN_SQLANCER.md §1 |
| SQLANCER-2 | Crash-test oracle. Done — test_sqlancer_crash_oracle / run_crash_oracle() verifies zero backend crashes over 200–2000 fuzzed queries. | 3–5d | PLAN_SQLANCER.md §2 |
| SQLANCER-3 | Equivalence oracle. Done — test_sqlancer_diff_vs_full_oracle / run_diff_vs_full_oracle() creates DIFFERENTIAL + FULL stream tables, applies 4 DML mutations, and asserts count parity. Integrated into test_sqlancer_ci_combined. | 3–5d | PLAN_SQLANCER.md §3 |
| SQLANCER-4 | Stateful DML fuzzing. Done — test_sqlancer_stateful_dml / run_stateful_dml_fuzzing() runs SQLANCER_MUTATIONS (default 100, nightly 10 000) random INSERT/UPDATE/DELETE mutations with checkpoints every 50. CI: weekly-sqlancer-stateful job (SQLANCER_MUTATIONS=10000). | 3–5d | PLAN_SQLANCER.md §4 |

SQLANCER subtotal: 0 remaining (all four items shipped in v0.17.0)
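The seed-driven generator idea can be sketched as below. This is an illustration in the spirit of the harness's Rust LCG generator, not its actual code: the LCG constants and query templates are assumptions. The point is determinism — the same seed reproduces the same query stream, which is what makes `SQLANCER_SEED` useful for replaying a failing case.

```rust
// Minimal seedable LCG-driven SQL query generator (illustrative only).
struct Lcg(u64);

impl Lcg {
    fn next(&mut self) -> u64 {
        // Knuth's MMIX LCG constants; any full-period pair would do here
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.0
    }

    fn pick<'a>(&mut self, choices: &[&'a str]) -> &'a str {
        choices[(self.next() % choices.len() as u64) as usize]
    }
}

fn gen_query(rng: &mut Lcg) -> String {
    let agg = rng.pick(&["count(*)", "sum(v)", "avg(v)"]);
    let pred = rng.pick(&["v > 0", "v IS NULL", "k = 1"]);
    format!("SELECT k, {agg} FROM src WHERE {pred} GROUP BY k")
}
```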

Incremental DAG Rebuild (C-2)

In plain terms: When any DDL change occurs (e.g. ALTER STREAM TABLE, DROP STREAM TABLE), the entire dependency graph is rebuilt from scratch by querying pgt_dependencies. For 1000+ stream tables this becomes expensive — O(V+E) SPI queries. Incremental DAG maintenance records which specific stream table was affected and only re-sorts the affected subgraph, reducing the scheduler latency spike from ~50ms to ~1ms at scale.

| Item | Description | Effort | Ref |
|---|---|---|---|
| C-2-1 | Delta-based rebuild. Record affected pgt_id in a bounded ring buffer in shared memory alongside DAG_REBUILD_SIGNAL. On overflow, fall back to full rebuild. | 1 wk | plans/performance/PLAN_NEW_STUFF.md §C-2 |
| C-2-2 | Incremental topological sort. Add/remove only affected edges and vertices; re-run topological sort on the affected subgraph only. Cache the sorted schedule in shared memory. | 1–2 wk | plans/performance/PLAN_NEW_STUFF.md §C-2 |

C-2 subtotal: ~2–3 weeks
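The record-then-drain contract of the bounded buffer can be sketched as below. This is a simplified, single-threaded stand-in (the real structure lives in shared memory); `DirtyBuffer` and its methods are illustrative names.

```rust
// Sketch of the bounded dirty-ID buffer: DDL records the affected stream table
// IDs; the scheduler drains them for an incremental re-sort, or falls back to
// a full O(V+E) rebuild on overflow.
struct DirtyBuffer {
    ids: Vec<u32>,
    cap: usize,
    overflowed: bool,
}

impl DirtyBuffer {
    fn new(cap: usize) -> Self {
        Self { ids: Vec::new(), cap, overflowed: false }
    }

    fn record(&mut self, pgt_id: u32) {
        if self.ids.len() >= self.cap {
            self.overflowed = true; // too many changes to track individually
        } else {
            self.ids.push(pgt_id);
        }
    }

    /// Some(ids) → re-sort only the affected subgraph; None → full rebuild.
    fn drain(&mut self) -> Option<Vec<u32>> {
        if self.overflowed {
            self.overflowed = false;
            self.ids.clear();
            return None;
        }
        Some(std::mem::take(&mut self.ids))
    }
}
```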

Unsafe Block Reduction — Phase 6 (UNSAFE-R1/R2)

In plain terms: pg_trickle achieved a 51% reduction in unsafe blocks (from ~1,300 to 641) in earlier releases. The remaining blocks are concentrated in well-documented field-accessor macros and standalone is_a type checks. Converting these to safe wrappers removes another 150–250 unsafe blocks with minimal risk — a meaningful safety improvement before 1.0.

| Item | Description | Effort | Ref |
|---|---|---|---|
| UNSAFE-R1 | Safe field-accessor macros. Replace unsafe { (*node).field } patterns with safe accessor functions. Estimated reduction: ~100–150 unsafe blocks. | 2–4h | PLAN_REDUCED_UNSAFE.md §R1 |
| UNSAFE-R2 | Safe is_a checks. Convert standalone unsafe { is_a(node, T_Foo) } calls to safe wrapper functions. Estimated reduction: ~50–99 unsafe blocks. | 2–4h | PLAN_REDUCED_UNSAFE.md §R2 |

UNSAFE-R1/R2 subtotal: ~4–8 hours

api.rs Modularization (API-MOD)

In plain terms: api.rs is 9,413 lines — the largest file in the codebase. It contains stream table CRUD, ALTER QUERY, CDC management, bulk operations, diagnostics, and monitoring functions all in one file. The same treatment that parser.rs received in v0.15.0 (split from 21K lines into 5 sub-modules) is needed here. Zero behavior change — purely structural.

| Item | Description | Effort | Ref |
|---|---|---|---|
| API-MOD | Split src/api.rs into sub-modules. Proposed split: api/create.rs (create/drop/alter), api/refresh.rs (refresh entry points), api/cdc.rs (CDC management), api/diagnostics.rs (explain_st, health_check), api/bulk.rs (bulk_create), api/mod.rs (re-exports). Zero behavior change. | 1–2 wk | — |

API-MOD subtotal: ~1–2 weeks

pg_ivm Migration Guide (MIG-IVM)

In plain terms: pg_ivm is the incumbent IVM extension with 1,400+ GitHub stars and 4 years of production use. Many potential pg_trickle adopters are currently using pg_ivm. A step-by-step migration guide — mapping pg_ivm concepts to pg_trickle equivalents, with concrete SQL examples — removes the biggest adoption friction for this audience.

| Item | Description | Effort | Ref |
|---|---|---|---|
| MIG-IVM | pg_ivm → pg_trickle migration guide. Map: create_immv() → create_stream_table(); refresh_immv() → refresh_stream_table(); IMMEDIATE mode equivalence; aggregate coverage differences (5 vs 60+); GUC mapping; worked example migrating a real pg_ivm deployment. Publish as docs/tutorials/MIGRATING_FROM_PG_IVM.md. | 2–3d | docs/research/PG_IVM_COMPARISON.md |

MIG-IVM subtotal: ~2–3 days

Failure Mode Runbook (RUNBOOK)

In plain terms: Production teams need to know what happens when things go wrong — and what to do about it. This documents every failure mode pg_trickle can encounter (scheduler crash, WAL slot lag, OOM during refresh, disk full, replication slot conflict, stuck watermarks, circular convergence failure) with symptoms, diagnosis steps, and resolution procedures. Essential for on-call engineers.

| Item | Description | Effort | Ref |
|---|---|---|---|
| RUNBOOK | Failure mode runbook. Document: scheduler crash recovery, WAL decoder failures, OOM during refresh, disk-full behavior, replication slot conflicts, stuck watermarks, circular convergence timeout, CDC trigger failures, SUSPENDED state recovery, lock contention diagnosis. Include health_check() output interpretation and explain_st() troubleshooting. Publish as docs/TROUBLESHOOTING.md. | 3–5d | docs/PRE_DEPLOYMENT.md |

RUNBOOK subtotal: ~3–5 days

Docker Quickstart Playground (PLAYGROUND)

In plain terms: The fastest way to evaluate any database extension is to run it locally in 60 seconds. A docker-compose.yml with PostgreSQL + pg_trickle pre-installed, sample data (e.g. the org-chart from GETTING_STARTED.md), and a Jupyter notebook or pgAdmin web UI gives potential users a zero-friction tryout experience. This is the single most impactful thing for driving initial adoption.

| Item | Description | Effort | Ref |
|---|---|---|---|
| PLAYGROUND | Docker Compose quickstart. docker-compose.yml with: PG 18 + pg_trickle, seed SQL script (org-chart example from GETTING_STARTED.md + TPC-H SF=0.01), pgAdmin web UI (optional). Single docker compose up command. README with guided walkthrough. | 2–3d | docs/GETTING_STARTED.md |

PLAYGROUND subtotal: ~2–3 days

Documentation Polish (DOC-POLISH)

In plain terms: The existing documentation is comprehensive and technically excellent, but it's optimized for users already familiar with IVM and PostgreSQL internals. These items restructure the docs for a better "first hour" experience — simpler getting-started examples, a refresh mode decision guide, a condensed new-user FAQ, and a setup verification checklist. The goal is to reduce cognitive overload for new users without losing the depth that experienced users need.

| Item | Description | Effort | Ref |
|---|---|---|---|
| DOC-HELLO | Simplified "Hello Stream Table" in GETTING_STARTED. Add a Chapter 0 with a single-table, single-aggregate stream table (e.g. SELECT department, count(*) FROM employees GROUP BY department). Create it, insert a row, verify the refresh. Build confidence before the multi-table org-chart example. | 2–4h | docs/GETTING_STARTED.md |
| DOC-DECIDE | Refresh mode decision guide. Flowchart: "Need transactional consistency? → IMMEDIATE. Volatile functions? → FULL. Otherwise → AUTO (DIFFERENTIAL with FULL fallback)." Include when-to-use guidance for each mode with concrete examples. Publish as a section in GETTING_STARTED or as a standalone tutorial. | 2–4h | docs/tutorials/tuning-refresh-mode.md |
| DOC-FAQ-NEW | New User FAQ (top 15 questions). Extract the 15 most common new-user questions from the 3,000-line FAQ into a prominent "New User FAQ" section at the top. Keyword-rich headings for searchability. Link to deep FAQ for details. | 2–3h | docs/FAQ.md |
| DOC-VERIFY | Post-install verification checklist. SQL script that verifies: extension loaded, shared_preload_libraries configured, GUCs set, CDC triggers installable, first stream table creates and refreshes successfully. Runnable as psql -f verify_install.sql. | 2–4h | docs/GETTING_STARTED.md |
| DOC-STUBS | Fill or remove research stubs. PG_IVM_COMPARISON.md (60 bytes) and CUSTOM_SQL_SYNTAX.md (57 bytes) are empty stubs. Either flesh them out (PG_IVM_COMPARISON can draw from the existing comparison data) or remove from SUMMARY.md. | 2–4h | docs/research/ |

DOC-POLISH subtotal: ~2–3 days

v0.17.0 total: ~2–3 weeks (cost-based strategy) + ~3–4 weeks (columnar tracking) + ~32–48 hours (TIVM Phase 4) + ~1–2 days (ROWS FROM) + ~2–3 weeks (SQLancer) + ~2–3 weeks (incremental DAG) + ~4–8 hours (unsafe reduction) + ~1–2 weeks (api.rs modularization) + ~2–3 days (pg_ivm migration) + ~3–5 days (failure runbook) + ~2–3 days (Docker playground) + ~2–3 days (doc polish)

Exit criteria:

  • B-4: Cost-based strategy selector trained on per-ST history; cold-start fallback to fixed threshold; QueryComplexityClass cost model (scan/filter/aggregate/join/join_agg); refresh_strategy + cost_model_safety_margin GUCs; pre-refresh predictive comparison; 10 unit tests
  • A-2-COL: CDC trigger emits changed_cols VARBIT bitmask (COL-1); delta-scan filters irrelevant rows via changed_cols & mask (COL-2); aggregate value-only correction 'V' path halves row volume (COL-3)
  • A2-ENR: 🚫 Deferred post-1.0 — requires raw pg_sys ENR tuplestore FFI (memory-corruption risk); revisit after 1.0 stabilisation
  • A2-CTR: 🚫 Deferred post-1.0 — requires raw CreateTrigger() C FFI (memory-corruption risk); revisit after 1.0 stabilisation
  • A2-PS: ✅ Already shipped — pg_trickle.use_prepared_statements GUC (default true) wired in refresh.rs; parse/plan overhead eliminated on steady-state workloads
  • A8: ROWS FROM() with multiple SRFs accepted in defining queries; E2E tests cover INSERT/UPDATE/DELETE propagation
  • SQLANCER: ✅ SQLANCER-1/2 crash + equivalence oracles shipped in v0.12.0; SQLANCER-3 diff-vs-full oracle and SQLANCER-4 stateful DML soak (10K mutations) added in v0.17.0; weekly-sqlancer-stateful CI job wired
  • C-2: Incremental DAG rebuild reduces DDL-triggered latency spike to < 5ms at 100+ STs; ring buffer overflow falls back to full rebuild; no correctness regressions
  • UNSAFE-R1/R2: Unsafe block count reduced by 249 (690→441 in parser); is_node_type! and pg_deref! macros; all 1,700 unit tests pass
  • API-MOD: api.rs split into 3 sub-modules (mod.rs 5,624 + diagnostics.rs 1,377 + helpers.rs 2,461); zero behavior change; all 1,700 unit tests pass
  • MIG-IVM: docs/tutorials/MIGRATING_FROM_PG_IVM.md published with step-by-step migration, API mapping, behavioral differences, SQL upgrade examples, and verification checklist
  • RUNBOOK: docs/TROUBLESHOOTING.md covers 13 failure scenarios (scheduler, SUSPENDED, CDC triggers, WAL slots, INITIALIZING, buffer growth, lock contention, OOM, disk full, circular convergence, schema changes, worker pool, fuse) with symptoms, diagnosis, and resolution
  • PLAYGROUND: playground/ with docker-compose.yml, seed.sql (3 base tables, 5 stream tables), and README walkthrough
  • DOC-HELLO: Chapter 1 "Hello World" in GETTING_STARTED already provides the single-table aggregate example (products/category_summary)
  • DOC-DECIDE: Refresh mode decision guide already published as tutorials/tuning-refresh-mode.md with recommend_refresh_mode() and signal breakdown
  • DOC-FAQ-NEW: New User FAQ section with 15 keyword-rich entries added at top of FAQ.md
  • DOC-VERIFY: scripts/verify_install.sql checks shared_preload_libraries, extension, scheduler, GUCs, and runs end-to-end stream table cycle
  • DOC-STUBS: Research stubs already use {{#include}} directives pointing to substantial content (923 + 1232 lines)
  • Extension upgrade path tested (0.16.0 → 0.17.0)

v0.18.0 — Hardening & Delta Performance

Status: Released (2026-04-12).

Release Theme: This release hardens pg_trickle for production at scale and delivers the biggest remaining performance win in the differential refresh path. The Z-set multi-source delta engine merges per-source delta branches into a single GROUP BY + SUM(weight) query, eliminating redundant join evaluation when multiple source tables change in the same cycle. Cross-source snapshot consistency guarantees that multi-source stream tables always read all upstream tables at the same transaction boundary — closing the last known correctness gap. Every production-path .unwrap() is replaced with graceful error propagation, another ~69 unsafe blocks are eliminated, and a populated TPC-H baseline turns the 22-query suite into a true regression canary. SQLancer fuzzing integration provides an external, assumption-free correctness oracle. Together, these changes build the confidence foundation for 1.0.

Completed items (click to expand)

Correctness

| ID | Title | Effort | Priority |
|---|---|---|---|
| CORR-1 | Enforce cross-source snapshot consistency | L | P0 |
| CORR-2 | Populate TPC-H expected-output regression guard | XS | P0 |
| CORR-3 | NULL-safe GROUP BY elimination under deletes | S | P1 |
| CORR-4 | Z-set merged-delta weight accounting proof | M | P0 |
| CORR-5 | HAVING-filtered aggregate correction under group depletion | S | P1 |

CORR-1 — Enforce cross-source snapshot consistency (CSS-3)

In plain terms: When a stream table reads from two different source tables, there is a window where it can see source A at a newer point in time than source B — for example, seeing a new order but the old inventory count. Phase 3 completes the tick-watermark enforcement so both sources are always read at the same consistent LSN before any refresh proceeds. Phases 1 and 2 are already complete.

| Item | Description | Effort | Ref |
|---|---|---|---|
| CSS-3-1 | LSN watermark enforcement in the scheduler — hold refresh until all upstream sources reach the same tick boundary | 4–6h | PLAN_CROSS_SOURCE_SNAPSHOT_CONSISTENCY.md §Phase 3 |
| CSS-3-2 | Catalog column pgt_css_watermark_lsn + GUC pg_trickle.cross_source_consistency (default off) | 2–3h | — |
| CSS-3-3 | E2E test: concurrent writes to two sources, assert stream table never sees a split snapshot | 2–3h | — |

CSS-3 subtotal: ~8–12 hours. Dependencies: None. Schema change: Yes.

CORR-2 — Populate TPC-H expected-output regression guard (TPCH-BASE)

In plain terms: The TPC-H correctness tests run all 22 queries but the expected-output comparison guard was never populated — so the tests catch structural failures but not quiet result regressions. Populating the baseline turns the suite into a true correctness canary.

| Item | Description | Effort |
|---|---|---|
| TPCH-BASE-1 | Run TPC-H suite once at known-good state; capture output | 30min |
| TPCH-BASE-2 | Populate comparison baseline in e2e_tpch_tests.rs line 89 (remove TODO); verify guard fires on a deliberate regression | 1h |

TPCH-BASE subtotal: ~1–2 hours. Dependencies: None. Schema change: No.

CORR-3 — NULL-safe GROUP BY elimination under deletes

In plain terms: When all rows in a GROUP BY group are deleted and the grouping key contains NULLs, the differential engine must correctly remove the group. SQL's three-valued logic in IS DISTINCT FROM may cause delta weight miscounting for NULL keys.

Verify: E2E test with GROUP BY nullable_col, delete all group members, assert zero rows remain in the stream table. Dependencies: None. Schema change: No.
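The NULL-key subtlety can be made concrete with a small sketch, using `Option` as a stand-in for SQL NULL. Plain SQL equality (`NULL = NULL`) yields unknown under three-valued logic, while `IS NOT DISTINCT FROM` treats two NULLs as the same key — and the latter is the semantics the delta engine needs when matching a delete against its GROUP BY group. The function name is illustrative only.

```rust
// Rust's Option equality mirrors `a IS NOT DISTINCT FROM b`:
// None == None is true, so a deleted row with a NULL key still
// finds (and can eliminate) its group.
fn same_group(a: &Option<i32>, b: &Option<i32>) -> bool {
    a == b
}
```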

CORR-4 — Z-set merged-delta weight accounting proof

In plain terms: Companion correctness gate for PERF-1 (B3-MERGE). The Z-set algebra requires that SUM(weight) across all merged branches for every primary key never produces a spurious net-positive or net-negative for a single join path.

Verify: property-based tests (proptest) asserting merged_weights == individual_branch_sums for randomly generated multi-source DAGs. All existing B3-3 diamond-flow tests must pass unchanged. Dependencies: PERF-1. Schema change: No.
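The invariant under test can be sketched as below. This is a simplified model of the Z-set merge, not the engine's code: each per-source branch emits (key, weight) pairs, the merged delta sums weights per key (the GROUP BY + SUM(weight) step), and net-zero keys must vanish rather than over- or under-count.

```rust
use std::collections::HashMap;

// Merge per-branch Z-set deltas: sum weights per key, drop net-zero keys.
fn merge_branches(branches: &[Vec<(&str, i64)>]) -> HashMap<String, i64> {
    let mut merged: HashMap<String, i64> = HashMap::new();
    for branch in branches {
        for (key, weight) in branch {
            *merged.entry((*key).to_string()).or_insert(0) += weight;
        }
    }
    merged.retain(|_, w| *w != 0); // net-zero rows must vanish entirely
    merged
}
```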

CORR-5 — HAVING-filtered aggregate correction under group depletion

In plain terms: When a HAVING-qualified group loses enough rows to no longer satisfy the predicate (e.g., HAVING count(*) > 5 and 3 of 6 rows are deleted), the differential aggregate path must delete the stream table row rather than leaving a stale row matching the old HAVING predicate.

Verify: E2E test with HAVING + selective deletes crossing the threshold. Dependencies: None. Schema change: No.

Stability

| ID | Title | Effort | Priority |
|---|---|---|---|
| STAB-1 | Eliminate production-path .unwrap() calls | S | P0 |
| STAB-2 | unsafe block reduction Phase 1 | M | P1 |
| STAB-3 | Spill detection alerting | S | P1 |
| STAB-4 | Parallel worker orphaned resource cleanup | M | P1 |
| STAB-5 | Upgrade migration test (0.17→0.18) | S | P0 |
| STAB-6 | Error SQLSTATE coverage audit | S | P2 |

STAB-1 — Eliminate production-path .unwrap() calls (SAFE-1)

In plain terms: A small number of SQL-parsing code paths in production (non-test) code call .unwrap() directly — if they encounter unexpected input they will panic the backend process and disconnect all clients. These should propagate errors gracefully instead.

| Item | Description | Effort |
|---|---|---|
| SAFE-1-1 | detect_and_strip_distinct() call in api.rs (L8163) → propagate PgTrickleError | 1h |
| SAFE-1-2 | find_top_level_keyword(sql, "FROM") calls in api.rs (L8229–8258, 3×) → propagate error | 1h |
| SAFE-1-3 | merge_sql[using_start.unwrap()..using_end.unwrap()] in refresh.rs (L6236) → bounds-check | 1h |
| SAFE-1-4 | entry.unwrap() in delta computation loop in refresh.rs (L5992) → return Err | 1h |
| SAFE-1-5 | Chained .unwrap().unwrap() in refresh.rs (L6556–6557) → propagate | 1h |

SAFE-1 subtotal: ~4–6 hours. Dependencies: None. Schema change: No.
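The unwrap-elimination pattern has a common shape, sketched below with an assumed stand-in for the real error type. Instead of panicking the backend on unexpected SQL, the position lookup propagates a typed error that the caller can surface through normal error reporting.

```rust
// Stand-in error type for illustration; the real PgTrickleError is richer.
#[derive(Debug)]
struct PgTrickleError(String);

fn from_clause_start(sql: &str) -> Result<usize, PgTrickleError> {
    // Before: sql.find(" FROM ").unwrap() — a panic path in production code.
    // After: a recoverable error the caller can report and retry.
    sql.find(" FROM ")
        .ok_or_else(|| PgTrickleError(format!("no FROM clause in: {sql}")))
}
```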

STAB-2 — unsafe block reduction Phase 1 (UNSAFE-P1)

In plain terms: The DVM parser has 1,286 unsafe blocks — 98% of the total. Phase 1 introduces a single pg_cstr_to_str() safe helper that eliminates ~69 of the most mechanical ones: C-string-to-Rust conversions. No API or behavior change; pure safety improvement.

| Item | Description | Effort | Ref |
|---|---|---|---|
| UNSAFE-P1-1 | Implement pg_cstr_to_str(ptr: *const c_char) -> &str safe wrapper in src/dvm/parser/mod.rs | 1h | PLAN_REDUCED_UNSAFE.md §Phase 1 |
| UNSAFE-P1-2 | Replace ~69 unsafe { CStr::from_ptr(...).to_str()... } call-sites with the safe helper | 4–6h | — |
| UNSAFE-P1-3 | unsafe_inventory.sh baseline update + CI check | 1h | scripts/unsafe_inventory.sh |

UNSAFE-P1 subtotal: ~6–8 hours. Dependencies: None. Schema change: No.
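A plausible shape for the helper is sketched below; the production version may differ in signature and lifetime handling. The safety contract does not disappear — callers must still pass a valid NUL-terminated pointer that outlives the returned slice — but the unsafe block now appears once, behind a null check, instead of at ~69 call-sites.

```rust
use std::ffi::{c_char, CStr};

// Sketch of a single safe C-string-to-&str helper (illustrative signature).
fn pg_cstr_to_str<'a>(ptr: *const c_char) -> Option<&'a str> {
    if ptr.is_null() {
        return None;
    }
    // SAFETY: non-null, and the caller guarantees a valid NUL-terminated
    // buffer that outlives the returned slice.
    unsafe { CStr::from_ptr(ptr) }.to_str().ok()
}
```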

STAB-3 — Spill detection alerting (PH-E2)

In plain terms: The GUCs pg_trickle.spill_threshold_blocks and pg_trickle.spill_consecutive_limit already exist to configure spill budgets, but no alert fires when a refresh actually spills to disk. This adds an AlertEvent::SpillThresholdExceeded notification so operators know when large delta queries are hitting disk.

| Item | Description | Effort |
|---|---|---|
| PH-E2-1 | Add AlertEvent::SpillThresholdExceeded variant to src/monitor.rs | 1h |
| PH-E2-2 | Detect spill after MERGE execution; emit alert when consecutive count exceeds limit | 2–3h |
| PH-E2-3 | E2E test: configure low spill threshold, trigger spill, assert alert fires | 1–2h |

PH-E2 subtotal: ~4–6 hours. Dependencies: None. Schema change: No.

STAB-4 — Parallel worker orphaned resource cleanup

In plain terms: After a parallel worker panics mid-refresh, advisory locks, __pgt_delta_* temp tables, and partially-written change buffer rows may be left behind. The scheduler recovery path must clean these up.

Audit the recovery path to ensure: (a) advisory locks are released on next scheduler tick, (b) temp tables are cleaned up, (c) change buffer rows are not double-counted on retry. Verify: E2E test simulating worker crash via pg_terminate_backend() followed by successful recovery. Dependencies: None. Schema change: No.

STAB-5 — Upgrade migration test (0.17→0.18)

Extend the upgrade E2E test framework to cover the 0.17.0→0.18.0 migration path and the three-version chain 0.16→0.17→0.18. Verify: catalog column additions, new function signatures, existing stream tables survive, refresh continues working post-upgrade. Dependencies: All schema-changing items (CORR-1). Schema change: No.

STAB-6 — Error SQLSTATE coverage audit

Audit all ereport!() and error!() calls for SQLSTATE classification. Ensure every user-facing error has a unique, documented SQLSTATE code that connection poolers and application retry logic can pattern-match. Cross-reference with docs/ERRORS.md for completeness. Dependencies: None. Schema change: No.

Performance

| ID | Title | Effort | Priority |
|---|---|---|---|
| PERF-1 | Z-set multi-source delta engine | L | P0 |
| PERF-2 | Cost-based refresh strategy completion | L | P1 |
| PERF-3 | Zero-change source branch elision | M | P1 |
| PERF-4 | Columnar change tracking Phase 1 — CDC bitmask | L | P1 |
| PERF-5 | Index hint generation for MERGE target | S | P2 |

PERF-1 — Z-set multi-source delta engine (B3-MERGE)

In plain terms: When a stream table joins multiple tables and more than one of those tables receives changes in the same scheduler cycle, the current engine generates one delta branch per source and stacks them in a UNION ALL. With this change those branches are merged into a single GROUP BY + SUM(weight) query using Z-set algebra, eliminating duplicate evaluation of shared join paths. B3-1 (branch pruning) and B3-3 (correctness proofs) are already done; this is the final payoff.

| Item | Description | Effort | Ref |
|---|---|---|---|
| B3-2-1 | Z-set merged-delta generation in src/dvm/diff.rs (DiffEngine::diff_node()) | 8–10h | PLAN_MULTI_TABLE_DELTA_BATCHING.md |
| B3-2-2 | Unit + property-based tests (existing B3-3 diamond-flow tests must pass unchanged) | 2–4h | — |
| B3-2-3 | Benchmark regression check against Part-8 baseline | 2h | — |

B3-MERGE subtotal: ~12–16 hours. Dependencies: CORR-4 (property tests must accompany). Schema change: No.

PERF-2 — Cost-based refresh strategy completion (B-4 remainder)

In plain terms: Deferred from v0.17.0. The refresh_strategy GUC landed in the current cycle. The remaining work is the per-ST cost model: collect delta_row_count, merge_duration_ms, full_refresh_duration_ms from pgt_refresh_history; fit a simple linear cost model; cold-start heuristic (<10 refreshes) falls back to the fixed threshold.

Verify: mixed-workload benchmark showing the model picks the cheaper strategy ≥80% of the time. Dependencies: B-4 Phase 1 (shipped). Schema change: No.

PERF-3 — Zero-change source branch elision

In plain terms: When building a multi-source delta query, skip branches entirely for sources with empty change buffers. Currently all branches are generated and executed regardless of whether a source has changes.

Verify: benchmark showing latency reduction when 1-of-3 sources changes vs. all 3 changing. Dependencies: PERF-1 (applies to the merged delta builder). Schema change: No.

PERF-4 — Columnar change tracking Phase 1 — CDC bitmask (A-2-COL-1)

In plain terms: Deferred from v0.17.0. Compute changed_columns bitmask (old.col IS DISTINCT FROM new.col) in the CDC trigger; store as int8 or bit(n) alongside the change row. Phase 1 only: bitmask computation + storage. Phase 2 (delta-scan filtering using the bitmask) deferred to v0.22.0. Provides the foundation for 50–90% delta volume reduction on wide-table UPDATE workloads.

Gate behind pg_trickle.columnar_tracking GUC (default off). Dependencies: None. Schema change: Yes (change buffer schema addition).

PERF-5 — Index hint generation for MERGE target

In plain terms: When the stream table has a covering index on the MERGE join keys, bias the planner toward the index to avoid expensive sequential scans during delta application on large stream tables.

Emit SET enable_seqscan = off within the MERGE statement's session. Verify: EXPLAIN ANALYZE shows index scan on MERGE for tables with PK index. Dependencies: None. Schema change: No.

Scalability

| ID | Title | Effort | Priority |
|---|---|---|---|
| SCAL-1 | Change buffer growth stress test at 10× write rate | M | P1 |
| SCAL-2 | Parallel worker utilization profiling at 200+ STs | M | P2 |
| SCAL-3 | Delta working-set memory cap | M | P2 |

SCAL-1 — Change buffer growth stress test at 10× write rate

Run a sustained write load at 10× normal throughput for 30+ minutes with intentionally slow refresh intervals. Verify the max_buffer_rows cap triggers correctly, FULL refresh clears the backlog, no disk exhaustion occurs, and the extension recovers cleanly once write rate normalizes. This validates the v0.16.0 buffer growth protection under extreme conditions. Dependencies: None. Schema change: No.

SCAL-2 — Parallel worker utilization profiling at 200+ STs

Profile the scheduler with 200+ stream tables across pg_trickle.max_workers = 4/8/16 settings. Measure: CPU utilization per worker, scheduling queue depth, per-ST refresh latency P50/P99. Identify whether the scheduling loop itself becomes a bottleneck before worker saturation. Document findings as a scaling guide section. Dependencies: None. Schema change: No.

SCAL-3 — Delta working-set memory cap The current delta merge can allocate unbounded work_mem for hash joins. Add a configurable cap (pg_trickle.delta_work_mem_mb, default: 256 MB) that triggers FULL refresh fallback when the delta working set would exceed the limit, preventing OOM on unexpectedly large deltas. Verify: E2E test with low cap triggers fallback and logs a warning. Dependencies: None. Schema change: No.

Ease of Use

ID | Title | Effort | Priority
UX-1 | Template cache observability | S | P1
UX-2 | Pre-built Grafana dashboard panels | M | P1
UX-3 | Error message actionability audit | S | P1
UX-4 | Single-endpoint health summary function | S | P2
UX-5 | Prometheus metric completeness audit | XS | P2
UX-6 | TUI surfaces for cache_stats and health_summary | XS | P2

UX-1 — Template cache observability (CACHE-OBS)

In plain terms: The delta SQL template cache (IVM_DELTA_CACHE) saves regenerating delta queries on every refresh cycle, but its hit rate is invisible to operators. Adding pgtrickle.cache_stats() lets you see whether the cache is effective and tune pg_trickle.ivm_cache_size accordingly.

Item | Description | Effort
CACHE-OBS-1 | Add hit/miss/eviction counters to IVM_DELTA_CACHE | 1h
CACHE-OBS-2 | Expose via pgtrickle.cache_stats() returning (hits BIGINT, misses BIGINT, evictions BIGINT, size INT) | 1–2h
CACHE-OBS-3 | Documentation and E2E smoke test | 1h

CACHE-OBS subtotal: ~3–4 hours Dependencies: None. Schema change: No.
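The counter surface can be sketched as a thin wrapper around the cache map. This is a hypothetical stand-in for IVM_DELTA_CACHE — the eviction policy here is "any entry", not the real one — but it shows the (hits, misses, evictions, size) shape that pgtrickle.cache_stats() would expose:

```rust
use std::collections::HashMap;

/// Minimal sketch of a bounded delta-SQL template cache with
/// hit/miss/eviction counters. Names and policy are illustrative.
struct DeltaCache {
    map: HashMap<String, String>,
    capacity: usize,
    hits: u64,
    misses: u64,
    evictions: u64,
}

impl DeltaCache {
    fn new(capacity: usize) -> Self {
        Self { map: HashMap::new(), capacity, hits: 0, misses: 0, evictions: 0 }
    }

    fn get(&mut self, key: &str) -> Option<&String> {
        if self.map.contains_key(key) { self.hits += 1 } else { self.misses += 1 }
        self.map.get(key)
    }

    fn insert(&mut self, key: String, sql: String) {
        if !self.map.contains_key(&key) && self.map.len() >= self.capacity {
            // Evict an arbitrary entry to stay under capacity.
            if let Some(victim) = self.map.keys().next().cloned() {
                self.map.remove(&victim);
                self.evictions += 1;
            }
        }
        self.map.insert(key, sql);
    }

    /// Shape of the proposed pgtrickle.cache_stats() row.
    fn stats(&self) -> (u64, u64, u64, usize) {
        (self.hits, self.misses, self.evictions, self.map.len())
    }
}
```

A low hit rate or high eviction count in these stats is the signal to raise pg_trickle.ivm_cache_size.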

UX-2 — Pre-built Grafana dashboard panels Extend monitoring/grafana/ with import-ready JSON panels for: refresh latency P50/P99 histogram, differential vs. FULL refresh ratio over time, change buffer backlog per stream table, spill event count, template cache hit rate, and worker utilization gauge. Document import instructions in monitoring/README.md. Dependencies: UX-1 (cache stats metric), STAB-3 (spill events). Schema change: No.

UX-3 — Error message actionability audit Audit all PgTrickleError variants and ereport!()/error!() calls. Ensure every user-facing error includes: the stream table name (when applicable), the operation that failed, and a 1-sentence remediation hint. Cross-reference with docs/ERRORS.md; add missing entries. Dependencies: None. Schema change: No.

UX-4 — Single-endpoint health summary function New pgtrickle.health_summary() function returning a single-row JSONB: total STs, healthy/degraded/error counts, oldest un-refreshed age, largest buffer backlog, fuse status, scheduler state. Useful for monitoring integrations (Nagios, Datadog) without parsing multiple views. Dependencies: None. Schema change: No.

UX-5 — Prometheus metric completeness audit Verify every metric emitted by the extension matches the documented name in docs/CONFIGURATION.md §Prometheus. Remove undocumented metrics or add documentation. Ensure metric names follow Prometheus naming conventions (pgtrickle_* prefix, snake_case, unit suffix). Dependencies: None. Schema change: No.

UX-6 — TUI surfaces for cache_stats and health_summary

In plain terms: The new pgtrickle.cache_stats() (UX-1) and pgtrickle.health_summary() (UX-4) functions are useful in isolation but are most discoverable when surfaced in the TUI. Even a read-only status panel showing total STs, healthy/degraded/error counts, cache hit rate, and scheduler state would make these endpoints visible to users who reach the extension through pgtrickle-tui rather than raw SQL. Audit pgtrickle-tui/src/ to identify the lightest-weight integration point (likely a new "Health" tab or an expanded "Status" panel). If TUI changes are out of scope for this release, document the gap in docs/TUI.md so it is not silently deferred.

Verify: TUI displays non-zero cache stats and a valid health JSONB row after at least one refresh cycle in the E2E playground environment. Dependencies: UX-1, UX-4. Schema change: No.

Test Coverage

ID | Title | Effort | Priority
TEST-1 | TPC-H regression baseline | XS | P0
TEST-2 | SQLancer fuzzing — crash-test oracle | L | P1
TEST-3 | CDC edge cases: NULL PKs, composite PKs, generated columns | M | P1
TEST-4 | Property-based tests for Z-set merged delta | M | P0
TEST-5 | Light E2E eligibility audit | S | P2
TEST-6 | Three-version upgrade chain test (0.16→0.17→0.18) | S | P0
TEST-7 | dbt integration regression coverage | S | P1

TEST-1 — TPC-H regression baseline (TPCH-BASE) Same as CORR-2. Capture known-good outputs; verify guard fires on deliberate regression. Dependencies: None. Schema change: No.

TEST-2 — SQLancer fuzzing — crash-test oracle

In plain terms: Deferred from v0.17.0 (second time). Scope reduced to crash-test oracle only for v0.18.0: SQLancer in Docker, configured to feed randomized SQL to the parser and DVM pipeline. Zero-panic guarantee — any input that crashes the extension is a bug. Equivalence oracle (DIFFERENTIAL ≡ FULL) and stateful DML fuzzing deferred to v0.22.0.

Verify: 10K+ fuzzed queries with zero panics. Dependencies: None. Schema change: No.

TEST-3 — CDC edge cases: NULL PKs, composite PKs, generated columns Create E2E tests covering: (a) tables with nullable PK columns in differential mode, (b) composite PKs with 3+ columns, (c) GENERATED ALWAYS AS stored columns as source columns, (d) domain-typed columns, (e) array-typed columns referenced in defining queries. Dependencies: None. Schema change: No.

TEST-4 — Property-based tests for Z-set merged delta Required companion to PERF-1. proptest-based tests generating random multi-source DAGs (2–5 sources, 1–3 join levels) with random DML sequences. Assert merged delta produces identical stream table state as sequential per-branch application. Detect weight-accounting bugs before they ship. Dependencies: PERF-1. Schema change: No.

TEST-5 — Light E2E eligibility audit Review all 10 full E2E test files (~90 tests). Identify tests that don't require custom Docker image features (custom extensions, special configurations) and can run on the stock postgres:18.3 image. Migrate eligible tests to reduce CI wall-clock time on PRs. Dependencies: None. Schema change: No.

TEST-6 — Three-version upgrade chain test (0.16→0.17→0.18) Extend upgrade E2E tests to cover: fresh install of 0.16.0, create stream tables, upgrade to 0.17.0, verify survival, upgrade to 0.18.0, verify survival + new features functional. Dependencies: All schema-changing items. Schema change: No.

TEST-7 — dbt integration regression coverage

In plain terms: The dbt-pgtrickle macro package is the primary adoption vector for teams using dbt, but the integration test suite in dbt-pgtrickle/integration_tests/ currently verifies only happy-path macro expansion. Add regression tests covering: (a) pgtrickle_stream_table macro with all supported materialisation strategies (differential, full, auto), (b) incremental model compatibility, (c) pgtrickle_status test macro, (d) teardown and recreation idempotency (drop + re-run produces identical output). Run as part of just test-dbt.

Verify: just test-dbt passes all new cases; idempotency test confirms identical stream table contents after a full dbt run --full-refresh cycle. Dependencies: None. Schema change: No.

Conflicts & Risks

  1. PERF-1 + CORR-4 + TEST-4 form a mandatory cluster. The Z-set multi-source delta engine (B3-MERGE) is the highest-impact performance item but also touches the DVM engine core. Property-based tests (TEST-4) and the weight accounting proof (CORR-4) are not optional — they must ship alongside PERF-1 to prevent correctness regressions.

  2. Two schema changes. CORR-1 (CSS-3) adds pgt_css_watermark_lsn to the catalog. PERF-4 (A-2-COL-1) adds changed_columns to change buffer tables. Both require upgrade migration scripts and freeze-risk coordination. Consider batching both into a single migration file.

  3. PERF-3 depends on PERF-1. Zero-change branch elision modifies the same delta query builder as B3-MERGE. Sequence PERF-3 strictly after PERF-1 to avoid merge conflicts and compound risk.

  4. TEST-2 (SQLancer) is deferred for the second time. Originally planned for v0.17.0, it remains unstarted. v0.18.0 scopes it to crash-test oracle only (L effort instead of XL), but there is a risk of perpetual deferral. If capacity is tight, prioritize the crash-test oracle as a standalone deliverable rather than deferring the full suite again.

  5. PERF-2 (cost model) requires production history data. The per-ST cost model trains on pgt_refresh_history. Users upgrading from v0.17.0 will have a cold history cache. The cold-start heuristic (< 10 refreshes) is critical — test it explicitly.

  6. PERF-4 (columnar tracking) changes CDC trigger output. The changed_columns bitmask adds overhead to every trigger invocation. Gate behind a GUC (default off) and benchmark the per-row overhead (< 1μs target) before enabling by default in a later release.

  7. B-4 and A-2-COL are carry-overs from v0.17.0. Both were originally scoped for v0.17.0 but not started. They are re-proposed here with reduced scope (B-4 cost model only, A-2-COL Phase 1 bitmask only). If v0.17.0 ships B-4 partially, adjust PERF-2 scope accordingly.

v0.18.0 total: ~70–100 hours

Exit criteria:

  • CORR-1: Split-snapshot E2E test passes under concurrent writes; pgt_css_watermark_lsn column added
  • CORR-2 / TEST-1: TPC-H baseline populated; deliberate regression detected by the guard
  • CORR-3: NULL-keyed GROUP BY group fully removed after all-row delete
  • CORR-4 / TEST-4: Property-based Z-set weight tests pass for randomly generated multi-source DAGs
  • CORR-5: HAVING-qualified group deleted from stream table when row count drops below threshold
  • STAB-1: All production-path unwrap() calls in api.rs and refresh.rs replaced with proper error propagation
  • STAB-2: unsafe_inventory.sh reports ≥69 fewer unsafe blocks; CI baseline updated
  • STAB-3: Spill alert fires in E2E test with artificially low threshold
  • STAB-4: Worker crash recovery E2E test cleans up advisory locks, temp tables, and buffer rows
  • STAB-5 / TEST-6: Three-version upgrade chain (0.16→0.17→0.18) passes
  • STAB-6: All user-facing errors have documented SQLSTATE codes in docs/ERRORS.md
  • PERF-1: Merged multi-source delta implemented; all B3-3 diamond-flow property tests pass unchanged
  • PERF-2: Cost model picks cheaper strategy ≥80% of the time on mixed workload benchmark
  • PERF-3: Zero-change branch elision shows measurable latency reduction in multi-source benchmark
  • PERF-4: changed_columns bitmask stored in change buffer; per-row overhead < 1μs
  • PERF-5: Index scan confirmed via EXPLAIN ANALYZE for MERGE on tables with PK covering index
  • SCAL-1: Buffer growth stress test at 10× rate completes without disk exhaustion or data loss
  • SCAL-2: Profiling report for 200+ STs documented
  • SCAL-3: Delta work_mem cap triggers FULL fallback in E2E test
  • UX-1: pgtrickle.cache_stats() returns correct counters in smoke test
  • UX-2: Grafana dashboard JSON importable; documents refresh latency, buffer backlog, spill events
  • UX-3: Error message audit complete; all errors include table name and remediation hint
  • UX-4: pgtrickle.health_summary() returns single-row JSONB with correct counts
  • UX-5: Prometheus metric names match documentation; no undocumented metrics
  • TEST-2: SQLancer crash-test oracle runs 10K+ fuzzed queries with zero panics
  • TEST-3: CDC edge case tests cover NULL PKs, composite PKs, generated columns, domain types, arrays
  • TEST-5: At least 10 tests migrated from full E2E to light E2E
  • TEST-7: dbt regression suite covers all macro strategies and teardown idempotency; just test-dbt passes
  • UX-6: TUI (or docs/TUI.md gap note) reflects cache_stats() and health_summary() availability
  • Extension upgrade path tested (0.17.0 → 0.18.0)
  • just check-version-sync passes

v0.19.0 — Production Gap Closure & Distribution

Status: Released (2026-04-13).

Release Theme

This release closes the most impactful correctness, security, stability, and performance gaps identified in the Phase 7 deep-dive and subsequent audits that v0.18.0 did not address. It removes the unsafe delete_insert merge strategy, adds ownership checks to all DDL-like API functions, hardens the WAL decoder path before it is promoted to production-ready, eliminates O(n²) scheduler dispatch overhead, and ships pg_trickle on standard package registries for the first time. The JOIN delta R₀ fix for simultaneous key-change + right-side delete is the highest-value correctness improvement remaining before 1.0. CDC ordering guarantees, parallel worker crash recovery, delta branch pruning for zero-change sources, and an index-aware MERGE path round out a release that strengthens every layer of the stack. Four to five weeks of focused work delivers measurable correctness improvements, privilege enforcement, catalog index optimizations, a PgBouncer transaction-mode compatibility fix, read-replica safety, and PGXN/apt/rpm distribution.

Completed items (click to expand)

Correctness

ID | Title | Effort | Priority
CORR-1 | Remove unsafe delete_insert merge strategy | XS | P0
CORR-2 | JOIN delta R₀ fix — key change + right-side delete | M | P1
CORR-3 | Track ALTER TYPE / ALTER DOMAIN DDL events | S | P1
CORR-4 | Track ALTER POLICY DDL events for RLS source tables | S | P1
CORR-5 | Fix keyless content-hash collision on identical-content rows | S | P1
CORR-6 | Harden guarded .unwrap() calls in DVM operators | XS | P2
CORR-7 | TRUNCATE + INSERT CDC ordering guarantee | S | P1
CORR-8 | NULL join-key delta handling for INNER/OUTER joins | S | P1

CORR-1 — Remove unsafe delete_insert merge strategy

In plain terms: The delete_insert strategy (set via pg_trickle.merge_join_strategy = 'delete_insert') is semantically unsafe for aggregate and DISTINCT queries because the DELETE half executes against already-mutated state, producing phantom deletes. It is slower than standard MERGE for small deltas and incompatible with prepared statements. The auto strategy already covers its only legitimate use case.

Item | Description | Effort
CORR-1-1 | Remove delete_insert as a valid enum value; emit ERROR if set with hint to use 'auto'. | XS
CORR-1-2 | Add upgrade SQL to detect old GUC value and log a NOTICE. | XS

Verify: SET pg_trickle.merge_join_strategy = 'delete_insert' raises ERROR with actionable hint. All existing benchmarks pass. Dependencies: None. Schema change: No.

CORR-2 — JOIN delta R₀ fix for simultaneous key-change + right-side delete

In plain terms: When a row's join key column is updated (UPDATE orders SET cust_id = 5 WHERE cust_id = 3) in the same refresh cycle as the old join partner (customer 3) is deleted, the DELETE half of the delta finds no match in current_right and is silently dropped, leaving a stale row in the stream table until the next full refresh. The fix applies the R₀ snapshot technique (pre-change right-side state via EXCEPT ALL) symmetrically with the existing L₀ already implemented for Part 2 of the delta. build_snapshot_sql() in join_common.rs already exists.

Item | Description | Effort
CORR-2-1 | Add right_part1_source / use_r0 logic mirroring use_l0 in diff_inner_join, diff_left_join, diff_full_join. | M
CORR-2-2 | Split Part 1 SQL into two UNION ALL arms for the use_r0 case; update row ID hashing for Part 1b. | M
CORR-2-3 | Integration tests: co-delete scenario, UPDATE-then-delete, multi-cycle correctness, TPC-H Q07 regression. | M

Verify: E2E test where UPDATE orders SET cust_id = new_id and DELETE FROM customers WHERE id = old_id land in the same refresh cycle produces correct stream table result without a forced full refresh. Dependencies: EC-01 R₀ EXCEPT ALL pattern (shipped in v0.15.0). Schema change: No.
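The need for the R₀ snapshot can be demonstrated with a toy Z-set model. The sketch below is not the extension's code — it models relations as weighted multisets and applies the standard bilinear delta rule Δ(A ⋈ B) = ΔA ⋈ B₀ + A₁ ⋈ ΔB, where B₀ is the pre-change right side and A₁ the post-change left side. Feeding it the key-change + co-delete scenario shows the delete arm retiring the stale row precisely because ΔA joins against B₀, which still contains the deleted partner:

```rust
use std::collections::HashMap;

// A Z-set: ((join_key, payload), weight). Negative weight = retraction.
type ZSet = HashMap<(i64, &'static str), i64>;
type ZOut = HashMap<(i64, &'static str, &'static str), i64>;

fn add(a: &ZSet, b: &ZSet) -> ZSet {
    let mut out = a.clone();
    for (k, w) in b {
        *out.entry(*k).or_insert(0) += *w;
    }
    out.retain(|_, w| *w != 0);
    out
}

fn join(a: &ZSet, b: &ZSet) -> ZOut {
    let mut out = ZOut::new();
    for ((ka, pa), wa) in a {
        for ((kb, pb), wb) in b {
            if ka == kb {
                *out.entry((*ka, *pa, *pb)).or_insert(0) += wa * wb;
            }
        }
    }
    out.retain(|_, w| *w != 0);
    out
}

fn join_delta(a0: &ZSet, da: &ZSet, b0: &ZSet, db: &ZSet) -> ZOut {
    // Δ(A ⋈ B) = ΔA ⋈ B₀ + A₁ ⋈ ΔB, with A₁ = A₀ + ΔA.
    // Joining ΔA against B₀ (not B₁) is what lets the DELETE half
    // find the old join partner that was removed in the same cycle.
    let a1 = add(a0, da);
    let mut out = join(da, b0);
    for (k, w) in join(&a1, db) {
        *out.entry(k).or_insert(0) += w;
    }
    out.retain(|_, w| *w != 0);
    out
}
```

Replacing B₀ with the post-change B₁ in the first arm reproduces the bug: the retraction of the old (order, customer 3) pair finds no match and is silently dropped.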

CORR-3 — Track ALTER TYPE / ALTER DOMAIN DDL events

In plain terms: When a user-defined type or domain used by a source table column is altered (e.g., extending an enum, changing a domain constraint), the DDL event trigger fires but hooks.rs does not classify it as requiring downstream stream table invalidation. Fix: extend the DDL classifier to catch ALTER TYPE and ALTER DOMAIN and trigger cascade invalidation.

Verify: ALTER TYPE my_enum ADD VALUE 'new_val' on a type used by a source column triggers the marked-for-reinit flag on dependent stream tables. Dependencies: None. Schema change: No.

CORR-4 — Track ALTER POLICY DDL events for RLS source tables

In plain terms: If an ALTER POLICY changes the USING expression on a source table, stream tables may silently return wrong results for sessions with active RLS. Fix: detect ALTER POLICY in the DDL classifier and mark dependent stream tables for conservative reinit.

Verify: ALTER POLICY on a source table with dependent stream tables triggers invalidation. E2E test with RLS policy change confirms correct reinitialization. Dependencies: None. Schema change: No.

CORR-5 — Fix keyless content-hash collision on identical-content rows

In plain terms: The keyless table path uses a content hash to identify rows. If two rows have completely identical content, they hash to the same bucket. Under concurrent INSERT + DELETE of identical rows, the net-counting approach may attribute a delete to the wrong "copy" of the row, leaving incorrect counts. Fix: incorporate the change buffer's (lsn, op_index) pair into the hash to break ties between otherwise-identical rows.

Verify: E2E test with two identical rows — insert 2, delete 1 in same cycle; stream table retains exactly 1 row. Dependencies: EC-06 keyless path (shipped in prior release). Schema change: No.
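A minimal sketch of the tie-breaking idea, assuming row content as string slices and hypothetical lsn/op_index fields (the real change buffer schema may differ):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Break ties between identical-content rows by mixing the change buffer's
/// (lsn, op_index) position into the row hash. Two rows with identical
/// content but different buffer positions now hash to distinct identities.
fn row_hash(content: &[&str], lsn: u64, op_index: u32) -> u64 {
    let mut h = DefaultHasher::new();
    content.hash(&mut h);
    (lsn, op_index).hash(&mut h); // positional tiebreaker
    h.finish()
}
```

With the positional pair folded in, net-counting attributes each delete to a specific "copy" of the row rather than to the shared content bucket.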

CORR-6 — Harden guarded .unwrap() calls in DVM operators

In plain terms: Several DVM operators use .unwrap() on values that are logically guaranteed by a prior is_some() guard, but the coupling is implicit and fragile — a refactor could silently break the invariant, causing a panic in SQL-reachable code. The most fragile instance is ctx.st_qualified_name.as_deref().unwrap() in filter.rs (line ~130), guarded by has_st which is derived from is_some() several lines earlier. Replace these patterns with if let Some(…) or .unwrap_or_else(|| …) to make the invariant structurally enforced rather than comment-documented.

Verify: grep -rn '\.unwrap()' src/dvm/operators/ returns zero hits outside test modules. All existing unit tests pass. Dependencies: None. Schema change: No.
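The replacement pattern can be sketched as follows — DeltaCtx and the error message are illustrative stand-ins for the real operator context, not the extension's types:

```rust
#[derive(Default)]
struct DeltaCtx {
    st_qualified_name: Option<String>, // None when no stream table in scope
}

/// Before: `let name = ctx.st_qualified_name.as_deref().unwrap();`
/// guarded only by an is_some() check several lines earlier — a refactor
/// can silently break that coupling and panic in SQL-reachable code.
/// After: the invariant is enforced structurally; callers must handle Err.
fn st_name(ctx: &DeltaCtx) -> Result<&str, String> {
    ctx.st_qualified_name
        .as_deref()
        .ok_or_else(|| "filter operator requires a stream table context".to_string())
}
```

The `?` operator then propagates the error up the DVM pipeline instead of aborting the backend.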

CORR-7 — TRUNCATE + INSERT CDC ordering guarantee

In plain terms: When a TRUNCATE and subsequent INSERT occur within the same transaction on a source table, the change buffer must preserve their ordering. If the refresh engine processes the INSERT before the TRUNCATE, the stream table loses all rows including the newly inserted ones. The trigger-based CDC path records operations in ctid order within a statement, but cross-statement ordering within a single transaction relies on the change buffer’s op_seq column. Verify that op_seq is monotonically increasing across statements and that the refresh engine applies TRUNCATE before INSERT.

Verify: E2E test: BEGIN; TRUNCATE src; INSERT INTO src VALUES (1); COMMIT; followed by refresh — stream table contains exactly 1 row. Dependencies: None. Schema change: No.
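A toy replay makes the ordering requirement concrete. Op and the op_seq column are illustrative stand-ins for the change buffer's actual representation:

```rust
/// Change-buffer operations, replayed strictly in op_seq order.
enum Op {
    Truncate,
    Insert(i64),
    Delete(i64),
}

fn replay(mut ops: Vec<(u64, Op)>) -> Vec<i64> {
    // Sort by op_seq so cross-statement order within a transaction is
    // preserved regardless of the order rows were read from the buffer.
    ops.sort_by_key(|(seq, _)| *seq);
    let mut rows = Vec::new();
    for (_, op) in ops {
        match op {
            Op::Truncate => rows.clear(),
            Op::Insert(v) => rows.push(v),
            Op::Delete(v) => rows.retain(|r| *r != v),
        }
    }
    rows
}
```

Applying the same two operations in the opposite order clears the freshly inserted row — exactly the failure mode this item guards against.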

CORR-8 — NULL join-key delta handling for INNER/OUTER joins

In plain terms: When a join key column contains NULL, the INNER JOIN delta should produce zero matching rows (NULL ≠ NULL in SQL), and LEFT/FULL OUTER JOIN deltas should produce NULL-extended rows. The v0.18.0 NULL GROUP BY fix addressed aggregate grouping but the JOIN delta path’s NULL-key behavior is exercised only indirectly by existing tests. Add explicit coverage: INSERT a row with NULL join key, UPDATE it to a non-NULL key, DELETE it — verify each delta cycle produces correct results under both INNER and LEFT JOIN.

Verify: E2E tests with NULL join keys for INNER JOIN, LEFT JOIN, and FULL JOIN — all delta cycles produce correct results matching a full recompute. Dependencies: None. Schema change: No.

Security

ID | Title | Effort | Priority
SEC-1 | Add ownership checks to drop_stream_table / alter_stream_table | S | P0
SEC-2 | SQL injection audit for dynamic refresh SQL | XS | P1

SEC-1 — Add ownership checks to drop_stream_table / alter_stream_table

In plain terms: Currently, any role with EXECUTE privilege on pgtrickle.drop_stream_table() or pgtrickle.alter_stream_table() can modify or drop any stream table, regardless of who created it. PostgreSQL convention requires that only the owner (or a superuser) can DROP or ALTER an object. Fix: call pg_class_ownercheck(stream_table_oid, GetUserId()) (or the pgrx-safe equivalent) at the top of both functions and raise ERROR: must be owner of stream table "name" if the check fails. create_stream_table already records the creating role as the table owner in pg_class.

Verify: Non-owner role calling pgtrickle.drop_stream_table('other_users_st') receives ERROR: must be owner of stream table "other_users_st". Superuser can still drop any stream table. E2E test with two roles confirms. Dependencies: None. Schema change: No.

SEC-2 — SQL injection audit for dynamic refresh SQL

In plain terms: The refresh engine builds SQL strings dynamically using format!() with user-provided table names, column names, and schema names. While pgrx’s quote_identifier() and quote_literal() are used in most places, a focused audit of every format!() call site in refresh.rs, diff.rs, and the operators/ directory ensures no path allows unquoted user input into executable SQL. This is a review-only item — fix any findings immediately as P0.

Verify: Audit checklist signed off — every format!() that incorporates catalog-derived names uses quote_identifier() or parameterised SPI queries. Zero unquoted interpolations outside test code. Dependencies: None. Schema change: No.

Stability

ID | Title | Effort | Priority
STAB-1 | PgBouncer transaction-mode compatibility guard | M | P1
STAB-2 | Read-replica / hot-standby safety guard | S | P1
STAB-3 | Elevate Semgrep to blocking in CI | XS | P1
STAB-4 | auto_backoff GUC — double interval after 3 falling-behind cycles | S | P2
STAB-5 | Harden unwrap() in scheduler hot path | XS | P2
STAB-6 | Parallel worker crash recovery sweep | M | P1
STAB-7 | Extension version mismatch detection at load | XS | P2

STAB-1 — PgBouncer transaction-mode compatibility guard

In plain terms: In PgBouncer transaction mode, session-level state is lost between transactions because different backend connections may serve the same session. pg_trickle uses transaction-scoped advisory locks which are safe, but also uses prepared statements and SET LOCAL — both of which fail silently in transaction mode, causing incorrect refresh behavior. Adding pg_trickle.connection_pooler_mode GUC (none / session / transaction) and disabling prepared statements in transaction mode prevents silent misbehavior.

Verify: integration test with PgBouncer transaction mode confirms refreshes complete correctly without prepared statement errors. pg_trickle.connection_pooler_mode = 'transaction' documented in docs/PRE_DEPLOYMENT.md. Dependencies: None. Schema change: No.

STAB-2 — Read-replica / hot-standby safety guard

In plain terms: If pg_trickle's background worker accidentally starts on a streaming replica (hot standby), it attempts writes to the catalog and crash-loops. Fix: detect pg_is_in_recovery() at worker startup and exit gracefully with LOG: pg_trickle background worker skipped: server is in recovery mode.

Verify: integration test that simulates a replica environment; background worker exits cleanly with the correct log message. No crash loop. Dependencies: None. Schema change: No.

STAB-3 — Elevate Semgrep to blocking in CI

In plain terms: CodeQL and cargo-deny are already blocking in CI; Semgrep runs as advisory-only. Before v1.0.0, all SAST tooling should be blocking. Verify zero findings across all current rules, then flip the CI step from continue-on-error: true to blocking.

Verify: CI step passes in blocking mode. Zero advisory-only bypasses remain. Dependencies: None. Schema change: No.

STAB-4 — auto_backoff GUC for scheduler overload

In plain terms: EC-11 shipped the scheduler_falling_behind alert but deferred auto-remediation. When a stream table has triggered the alert for 3 consecutive cycles, automatically double the effective refresh interval for that table until the next successful on-time cycle. Prevents a single heavy stream table from starving the rest of the queue.

Verify: E2E test with artificially slow stream table; effective interval doubles after 3 consecutive falling-behind alerts; returns to original interval after catching up. Dependencies: EC-11 scheduler_falling_behind (shipped in v0.18.0). Schema change: No.
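The policy is small enough to sketch directly. The 3-cycle threshold and single doubling follow the text; struct and method names are assumptions:

```rust
/// Per-stream-table backoff state for the proposed auto_backoff GUC.
struct Backoff {
    base_secs: u64,
    consecutive_behind: u32,
}

impl Backoff {
    fn new(base_secs: u64) -> Self {
        Self { base_secs, consecutive_behind: 0 }
    }

    /// Called once per scheduler cycle for this stream table. After 3
    /// consecutive falling-behind cycles the effective interval doubles;
    /// the next on-time cycle restores the original interval.
    fn next_interval(&mut self, on_time: bool) -> u64 {
        if on_time {
            self.consecutive_behind = 0;
        } else {
            self.consecutive_behind += 1;
        }
        if self.consecutive_behind >= 3 {
            self.base_secs * 2
        } else {
            self.base_secs
        }
    }
}
```

Because the doubling is bounded and self-resetting, one overloaded stream table degrades only its own freshness rather than starving the queue.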

STAB-5 — Harden unwrap() in scheduler hot path

In plain terms: The scheduler dispatch loop in scheduler.rs uses eu_dag.units().find(|u| u.id == uid).unwrap() at several call sites (lines ~1522, ~1680, ~1751, ~1811, ~1859, ~1885). While the IDs come from the same DAG and are expected to always match, a stale topo-order after a concurrent DDL change could cause a panic inside the background worker. Fix: replace with .ok_or(PgTrickleError::InternalError("unit not found in DAG"))? or use the HashMap introduced by PERF-5. This eliminates the last unwrap() cluster in the scheduler hot path.

Verify: grep -n '\.unwrap()' src/scheduler.rs returns zero hits outside test-only code. All scheduler integration tests pass. Dependencies: PERF-5 (HashMap replaces .find().unwrap() pattern). Schema change: No.

STAB-6 — Parallel worker crash recovery sweep

In plain terms: If a background worker is killed (OOM, SIGKILL) or crashes mid-refresh, it may leave behind: (a) orphaned advisory locks that block the next refresh of that stream table, (b) partially consumed rows in the change buffer (consumed but not committed), or (c) incomplete catalog state. Add a startup recovery sweep to the scheduler: on launch, scan for advisory locks held by PIDs that no longer exist (pg_stat_activity), roll back any xact_status = 'in progress' from dead backends, and reset stream tables stuck in REFRESHING state with no active backend.

Verify: Integration test: kill a worker PID mid-refresh via pg_terminate_backend(); restart the scheduler; the affected stream table recovers without manual intervention within one scheduler cycle. Dependencies: None. Schema change: No.

STAB-7 — Extension version mismatch detection at load

In plain terms: Running ALTER EXTENSION pg_trickle UPDATE updates the SQL objects but the shared library (pg_trickle.so) remains loaded from the previous version until the server is restarted. This mismatch can cause subtle failures (wrong function signatures, missing struct fields). Add a version check in _PG_init() that compares the compiled-in version string against the SQL-level extversion from pg_extension. Emit a WARNING if they differ and refuse to start background workers until the server is reloaded.

Verify: After ALTER EXTENSION pg_trickle UPDATE without server restart, the extension log shows WARNING: pg_trickle shared library version (X) does not match installed extension version (Y) — restart PostgreSQL. Background workers do not start. Dependencies: None. Schema change: No.

Performance

ID | Title | Effort | Priority
PERF-1 | Fix WAL decoder: old_* columns always NULL on UPDATE | S | P1
PERF-2 | Fix WAL decoder: naive pgoutput action string parsing | S | P1
PERF-3 | EXPLAIN (ANALYZE, BUFFERS) surface for delta SQL in explain_st() | S | P2
PERF-4 | Add catalog indexes on pgt_relid and pgt_dependencies(pgt_id) | XS | P1
PERF-5 | Eliminate O(n²) units().find() in scheduler dispatch | S | P1
PERF-6 | Batch has_table_source_changes() into single query | S | P2
PERF-7 | Delta branch pruning for zero-change sources | S | P1
PERF-8 | Index-aware MERGE path selection | S | P2

PERF-1 — Fix WAL decoder: old_* columns always NULL on UPDATE

In plain terms: In WAL-based CDC (pg_trickle.wal_enabled = true), the old_col_* values for UPDATE rows are always NULL because the decoder reads new_tuple for both old and new field positions. This breaks R₀ snapshot construction for the WAL path. Fix: correctly write old_tuple fields to the old_col_* buffer columns for UPDATE events. Currently dormant (only manifests with wal_enabled = true).

Verify: WAL decoder integration test: UPDATE source SET pk = new_pk; assert old_col_pk IS NOT NULL in the change buffer and equals the pre-update value. Dependencies: None. Schema change: No.

PERF-2 — Fix WAL decoder: naive pgoutput action string parsing

In plain terms: The WAL decoder parses action type with starts_with("I") which incorrectly matches any string beginning with "I" (e.g., "INSERT"). Fix: use exact single-character comparison (== "I") or parse the action byte directly from the pgoutput message buffer. Currently dormant (only manifests with wal_enabled = true).

Verify: WAL decoder unit tests for each action type using exact-match assertion. Fuzz test with action strings longer than 1 character. Dependencies: None. Schema change: No.
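The exact-match fix can be sketched as follows — the Action enum is an illustrative stand-in for the decoder's internal type:

```rust
#[derive(Debug, PartialEq)]
enum Action {
    Insert,
    Update,
    Delete,
}

/// Exact-match parsing of the pgoutput action tag. The buggy version used
/// tag.starts_with("I"), which also accepts "INSERT" or any other string
/// beginning with 'I'; exact comparison rejects everything but the single
/// action byte.
fn parse_action(tag: &str) -> Option<Action> {
    match tag {
        "I" => Some(Action::Insert),
        "U" => Some(Action::Update),
        "D" => Some(Action::Delete),
        _ => None,
    }
}
```

Parsing the action byte directly from the pgoutput message buffer avoids the string round-trip entirely and is the more robust long-term fix.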

PERF-3 — EXPLAIN (ANALYZE, BUFFERS) in explain_st()

In plain terms: pgtrickle.explain_st(name) returns the delta SQL template without execution statistics. Adding a with_analyze BOOLEAN parameter that runs EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) on the delta SQL gives operators plan + actual row counts + buffer hit/miss data — making slow refresh diagnosis much easier.

Verify: pgtrickle.explain_st('my_st', with_analyze => true) returns JSONB with Plan, Actual Rows, and Shared Hit Blocks fields. Documented in docs/SQL_REFERENCE.md. Dependencies: None. Schema change: No.

PERF-4 — Add catalog indexes on pgt_relid and pgt_dependencies(pgt_id)

In plain terms: pgt_stream_tables has an index on status but not on pgt_relid, which is used in hot-path lookups (WHERE pgt_relid = $1) by DDL hooks, CDC trigger installation, and refresh dependency resolution. pgt_dependencies has an index on source_relid but not on pgt_id, which is used when rebuilding a single stream table's dependency set. Adding these two B-tree indexes eliminates sequential scans on these catalog tables at scale.

Verify: \di pgtrickle.idx_pgt_relid and \di pgtrickle.idx_deps_pgt_id exist after upgrade. EXPLAIN of SELECT * FROM pgtrickle.pgt_stream_tables WHERE pgt_relid = 12345 shows Index Scan. Dependencies: None. Schema change: Yes (upgrade SQL adds CREATE INDEX).

PERF-5 — Eliminate O(n²) units().find() in scheduler dispatch

In plain terms: The scheduler dispatch loop calls eu_dag.units().find(|u| u.id == uid) inside iteration over topo_order and ready_queue, causing O(n²) behavior per tick. At 500+ stream tables this adds measurable overhead. Fix: build a HashMap<UnitId, &Unit> once per tick and replace all .find() lookups with O(1) map access.

Verify: Benchmark with 500 stream tables shows tick latency < 1ms (currently ~5–10ms). grep -n 'units().find' src/scheduler.rs returns zero hits. Dependencies: None. Schema change: No.
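The shape of the fix, with illustrative Unit and field names (the real scheduler types differ):

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
struct Unit {
    id: u32,
    name: String,
}

/// Build the id → unit index once per tick (O(n)); every dispatch lookup
/// is then O(1) instead of a linear units().find() scan per unit.
fn index_units(units: &[Unit]) -> HashMap<u32, &Unit> {
    units.iter().map(|u| (u.id, u)).collect()
}

/// Lookup that propagates a scheduler error instead of panicking on a
/// stale topo-order (replacing the .find(...).unwrap() pattern — see
/// STAB-5, which reuses this map).
fn lookup<'a>(index: &HashMap<u32, &'a Unit>, uid: u32) -> Result<&'a Unit, String> {
    index
        .get(&uid)
        .copied()
        .ok_or_else(|| format!("unit {uid} not found in DAG"))
}
```

Rebuilding the map each tick keeps it consistent with concurrent DDL: a unit dropped between ticks simply yields Err instead of a panic.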

PERF-6 — Batch has_table_source_changes() into single query

In plain terms: has_table_source_changes() executes N separate SELECT EXISTS(SELECT 1 FROM changes_<oid> LIMIT 1) SPI queries — one per source table per stream table per scheduler tick. For a stream table with 5 sources, this is 5 SPI round-trips. Batching into a single SELECT unnest(ARRAY[oid1, oid2, ...]) AS oid WHERE EXISTS(...) or using a single UNION ALL subquery reduces this to 1 SPI call regardless of source count.

Verify: SPI call count for has_table_source_changes() is 1 regardless of source table count. Scheduler integration tests pass. Dependencies: None. Schema change: No.
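One way to build the single statement is to stitch the per-source EXISTS probes into one UNION ALL query. The generated SQL shape below is an assumption for illustration — only the pgtrickle.changes_<oid> table naming comes from the text:

```rust
/// Collapse N per-source EXISTS probes into one SPI statement: each arm
/// yields its source oid only if that source's change buffer is non-empty.
fn batched_change_check_sql(oids: &[u32]) -> String {
    oids.iter()
        .map(|oid| {
            format!(
                "SELECT {oid}::oid AS source_oid \
                 WHERE EXISTS (SELECT 1 FROM pgtrickle.changes_{oid} LIMIT 1)"
            )
        })
        .collect::<Vec<_>>()
        .join(" UNION ALL ")
}
```

The result set lists exactly the sources with pending changes, which is also the per-source signal PERF-7 needs for delta branch pruning.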

PERF-7 — Delta branch pruning for zero-change sources

In plain terms: In a multi-source JOIN stream table (SELECT * FROM a JOIN b ON ...), the delta has two arms: Δ_a ⋈ b and a ⋈ Δ_b. If only source a has changes, the second arm (a ⋈ Δ_b) reads an empty change buffer and produces zero rows — but the engine still executes the full SQL including the join against a. Short-circuit: check has_table_source_changes() per source before building each delta arm. Skip arms where the source has zero changes. For a 5-source star join with only 1 changing source, this eliminates 4 of 5 delta arms entirely.

Verify: Benchmark with 5-source JOIN where only 1 source changes; observe 4 of 5 delta arms skipped in explain_st() output. Refresh latency drops proportionally. Dependencies: PERF-6 (batched source-change check). Schema change: No.

PERF-8 — Index-aware MERGE path selection

In plain terms: The MERGE statement used during differential refresh joins the delta against the stream table on __pgt_row_id. If the stream table has a covering index on the row ID column (which pg_trickle creates by default), the planner should use an index nested-loop join. However, PostgreSQL’s cost model sometimes prefers a hash join for large deltas. Add a targeted SET LOCAL enable_hashjoin = off within the refresh transaction when the delta cardinality is below a configurable threshold (pg_trickle.merge_index_threshold, default 10,000 rows) to steer the planner toward the index path for small deltas.

Verify: EXPLAIN of the MERGE with delta < 10,000 rows shows Index Nested Loop instead of Hash Join. Benchmark shows improved P99 latency for small deltas on large stream tables. Dependencies: None. Schema change: No.

Scalability

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| SCAL-1 | Read replica compatibility section in docs/SCALING.md | S | P1 |
| SCAL-2 | Multi-database GUC stub (pg_trickle.database_list) | S | P2 |
| SCAL-3 | CNPG operational runbook in docs/SCALING.md | S | P2 |
| SCAL-4 | Partitioned source table impact assessment | M | P2 |

SCAL-1 — Read replica compatibility documentation

In plain terms: The background worker now safely skips on replicas (STAB-2), but the interaction with read replicas for query offloading deserves its own documentation section. Add docs/SCALING.md §Read Replicas covering: which queries are safe on a replica, how pg_is_in_recovery() is used by the extension, and the recommended architecture for OLAP read-offload alongside pg_trickle stream tables.

Verify: docs/SCALING.md has a dedicated replica section. Dependencies: STAB-2. Schema change: No.

SCAL-2 — Multi-database GUC stub

In plain terms: Post-1.0 multi-database support requires catalog changes. This item adds only the pg_trickle.database_list TEXT GUC declaration with a default of '' (current database only) and a startup WARNING if set. This reserves the configuration namespace and lets operators test GUC surface before the full feature ships.

Verify: SHOW pg_trickle.database_list returns ''. Setting a non-empty value emits a WARNING: "pg_trickle.database_list is not yet implemented." Dependencies: None. Schema change: No.

SCAL-3 — CNPG operational runbook in docs/SCALING.md

In plain terms: The CNPG (CloudNativePG) smoke test in CI validates that pg_trickle loads and functions on a CNPG-managed cluster, but the operational patterns are not documented. Add a §CNPG / Kubernetes section to docs/SCALING.md covering: cluster-example.yaml annotations for loading the extension, pod restart behavior when the background worker crashes, WAL volume sizing for CDC, recommended shared_preload_libraries configuration, and health check integration with Kubernetes liveness/readiness probes.

Verify: docs/SCALING.md has a CNPG/Kubernetes section. Content reviewed against actual CNPG deployment behavior. Dependencies: None. Schema change: No.

SCAL-4 — Partitioned source table impact assessment

In plain terms: Stream tables backed by partitioned source tables (inheritance or declarative partitioning) are untested and likely broken: CDC triggers may be installed only on the parent, change buffers may miss partition-routed inserts, and ALTER TABLE ... ATTACH/DETACH PARTITION DDL events are unhandled. This item is a time-boxed spike (2 days): create a partitioned source, attach a stream table, run INSERT/UPDATE/DELETE through various partitions, and document what works, what breaks, and what the fix scope is. Output: a plans/PLAN_PARTITIONING_SPIKE.md update.

Verify: Spike report documents concrete findings. At minimum: which operations work, which fail, and a rough estimate for full partitioning support. Dependencies: None. Schema change: No.

Ease of Use

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| UX-1 | PGXN release_status → "stable" | XS | P1 |
| UX-2 | Automated Docker Hub release pipeline | S | P1 |
| UX-3 | apt/rpm packaging via PGDG | M | P1 |
| UX-4 | Connection pooler compatibility guide in docs/PRE_DEPLOYMENT.md | S | P1 |
| UX-5 | pgtrickle.write_and_refresh(dml_sql TEXT, st_name TEXT) | S | P2 |
| UX-6 | Change drop_stream_table cascade default to false | XS | P1 |
| UX-7 | Resolve OIDs to table names in error messages | S | P1 |
| UX-8 | Emit NOTICE when refresh_stream_table is skipped | XS | P1 |
| UX-9 | Fix CONFIGURATION.md TOC gaps for 3 undocumented GUCs | XS | P2 |
| UX-10 | TUI per-table refresh latency sparkline | S | P2 |
| UX-11 | pgtrickle.version() diagnostic function | XS | P2 |

UX-1 — PGXN release_status → "stable"

In plain terms: pg_trickle's META.json uses release_status: "testing". Flipping to "stable" signals production-readiness, enabling the extension to appear in the main PGXN package listing and in downstream package managers that consume the PGXN stable feed. One field change in META.json.

Verify: META.json "release_status": "stable". Published PGXN listing reflects the change after the next PGXN sync. Dependencies: None. Schema change: No.

UX-2 — Automated Docker Hub release pipeline

In plain terms: Automate publishing pgtrickle/pg_trickle:<ver>-pg18 and pgtrickle/pg_trickle:latest on every tagged release. Wire the existing Dockerfile.hub into the GitHub Actions release workflow via docker/build-push-action. The latest tag tracks the highest non-prerelease version.

Verify: After a test release tag, Docker Hub shows the correct image. docker pull pgtrickle/pg_trickle:0.19.0-pg18 succeeds and passes the smoke test. Dependencies: Dockerfile.hub (already exists). Schema change: No.

UX-3 — apt/rpm packaging via PGDG

In plain terms: PostgreSQL users expect to install extensions via apt install postgresql-18-pg-trickle or dnf install pg_trickle_18. Submit package specs to pgrpms.org (rpm) and the PGDG apt repository (deb), generating packages from the GitHub release tarball. Native packages reach users who never touch PGXN or Docker Hub, making this the highest-impact distribution improvement on the list.

Verify: apt install postgresql-18-pg-trickle works on Ubuntu 24.04. dnf install pg_trickle_18 works on RHEL 9. Both pass verify_install.sql. Dependencies: None. Schema change: No.

UX-4 — Connection pooler compatibility guide

In plain terms: Add a dedicated section to docs/PRE_DEPLOYMENT.md covering: PgBouncer session mode (fully compatible), PgBouncer transaction mode (set pg_trickle.connection_pooler_mode = 'transaction'), pgpool-II (session mode only), PgCat (session mode only). Include a compatibility matrix and postgresql.conf + PgBouncer config snippets.

Verify: PRE_DEPLOYMENT.md pooler section reviewed by a DBA familiar with PgBouncer. All described modes are tested or explicitly marked "untested." Dependencies: STAB-1. Schema change: No.

UX-5 — pgtrickle.write_and_refresh() convenience function

In plain terms: In DIFFERENTIAL mode, a write followed by refresh_stream_table() requires two API calls. A single function that executes the DML and triggers a refresh atomically simplifies read-your-writes patterns for applications that need immediate consistency without the overhead of IMMEDIATE mode.
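A minimal plpgsql sketch of the proposed function — not the shipped definition, just the shape implied by the signature above:

```sql
CREATE FUNCTION pgtrickle.write_and_refresh(dml_sql TEXT, st_name TEXT)
RETURNS void
LANGUAGE plpgsql
AS $$
BEGIN
    EXECUTE dml_sql;                                  -- run the caller's DML
    PERFORM pgtrickle.refresh_stream_table(st_name);  -- refresh in the same session
END;
$$;

-- Read-your-writes in one call:
SELECT pgtrickle.write_and_refresh('INSERT INTO src VALUES (1)', 'my_st');
```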

Verify: SELECT pgtrickle.write_and_refresh('INSERT INTO src VALUES (1)', 'my_st') executes the INSERT and refreshes the stream table. Documented in docs/SQL_REFERENCE.md. Dependencies: None. Schema change: No.

UX-6 — Change drop_stream_table cascade default to false

In plain terms: pgtrickle.drop_stream_table(name, cascade) currently defaults cascade to true. This violates the PostgreSQL convention where DROP defaults to RESTRICT and CASCADE must be explicit. A user calling SELECT pgtrickle.drop_stream_table('my_st') may inadvertently cascade-drop dependent stream tables. Fix: change the default to false (RESTRICT). This is a behavior change — existing scripts that rely on the implicit cascade must add cascade => true explicitly.

Verify: SELECT pgtrickle.drop_stream_table('parent_st') returns an error when parent_st has dependents. SELECT pgtrickle.drop_stream_table('parent_st', cascade => true) succeeds. Documented in CHANGELOG as a breaking change. Dependencies: None. Schema change: No (function signature change only).

UX-7 — Resolve OIDs to table names in error messages

In plain terms: UpstreamTableDropped(u32) and UpstreamSchemaChanged(u32) display raw PostgreSQL OIDs (e.g., "upstream table dropped: OID 16384"). Users cannot easily map OIDs to table names. Fix: resolve the OID to schema.table via pg_class at error-construction time or store the name alongside the OID. If the table is already dropped, fall back to "OID <oid> (table no longer exists)".
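The lookup at error-construction time could be a single catalog query with the fallback folded in (sketch, using the OID from the example above):

```sql
-- Resolve 16384 to schema.table; fall back when the table is already gone.
SELECT COALESCE(
    (SELECT quote_ident(n.nspname) || '.' || quote_ident(c.relname)
       FROM pg_class c
       JOIN pg_namespace n ON n.oid = c.relnamespace
      WHERE c.oid = 16384),
    'OID 16384 (table no longer exists)');
```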

Verify: UpstreamTableDropped error message shows "upstream table dropped: public.orders" instead of raw OID. Fallback tested with a pre-dropped table. Dependencies: None. Schema change: No.

UX-8 — Emit NOTICE when refresh_stream_table is skipped

In plain terms: When refresh_stream_table() encounters a RefreshSkipped condition (e.g., no changes detected, another refresh already in progress), it currently logs at debug1 level and returns success — invisible to the caller at default log levels. Fix: emit a PostgreSQL NOTICE (visible to the calling session) in addition to the debug1 log, so the caller knows the refresh did not execute.

Verify: SELECT pgtrickle.refresh_stream_table('my_st') with no pending changes emits NOTICE: refresh skipped for "my_st": no changes detected. Visible in psql output. Dependencies: None. Schema change: No.

UX-9 — Fix CONFIGURATION.md TOC gaps

In plain terms: Three GUCs (delta_work_mem_cap_mb, volatile_function_policy, unlogged_buffers) have full documentation sections in docs/CONFIGURATION.md but are missing from the table of contents navigation at the top of the file. Additionally, there is a duplicate "Guardrails" entry in the TOC. Fix: add the missing TOC entries and remove the duplicate.

Verify: All ### pg_trickle.* headings in CONFIGURATION.md have a corresponding TOC link. No duplicate entries. Dependencies: None. Schema change: No.

UX-10 — TUI per-table refresh latency sparkline

In plain terms: The pgtrickle TUI dashboard shows each stream table’s current status and last refresh duration, but operators cannot see at a glance whether latency is trending up or down. Add a sparkline column (last 20 refresh latencies, ~80 chars wide) to the stream table list view. The data is already available in pgt_refresh_history; the TUI polls it on each tick. This makes performance degradation and recovery immediately visible without switching to Grafana.

Verify: TUI stream table view shows a sparkline column. Sparkline updates after each refresh cycle. Values match pgt_refresh_history entries. Dependencies: None. Schema change: No.

UX-11 — pgtrickle.version() diagnostic function

In plain terms: A SELECT pgtrickle.version() function that returns the installed extension version, the shared library version, and the target PostgreSQL major version as a composite record. This is standard practice for PostgreSQL extensions (cf. postgis_full_version()) and simplifies remote diagnostics — support can ask a user to run one query instead of checking pg_available_extensions, pg_config, and SHOW server_version separately.
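One possible shape, sketched with OUT parameters — the field names match the item's intent, but the real definition would bake the library version into the shared object rather than hard-code it:

```sql
CREATE FUNCTION pgtrickle.version(
    OUT extension_version TEXT,
    OUT library_version   TEXT,
    OUT pg_major_version  INT)
LANGUAGE sql STABLE
AS $$
    SELECT extversion,
           '0.19.0',          -- placeholder; reported by the library in practice
           18
      FROM pg_extension
     WHERE extname = 'pg_trickle';
$$;

SELECT * FROM pgtrickle.version();
```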

Verify: SELECT * FROM pgtrickle.version() returns three fields: extension_version, library_version, pg_major_version. Values match the installed state. Dependencies: None. Schema change: No.

Test Coverage

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| TEST-1 | E2E tests for CORR-2 (JOIN delta R₀ fix) | S | P1 |
| TEST-2 | E2E tests for DDL tracking gaps (CORR-3 / CORR-4) | S | P1 |
| TEST-3 | WAL decoder unit tests for PERF-1 / PERF-2 | S | P1 |
| TEST-4 | PgBouncer transaction-mode integration smoke test | M | P1 |
| TEST-5 | Read-replica guard integration test | S | P1 |
| TEST-6 | Ownership-check privilege tests for SEC-1 | S | P1 |
| TEST-7 | Scheduler dispatch benchmark (500+ STs) | S | P1 |
| TEST-8 | Upgrade E2E tests (e2e_migration_tests.rs) | M | P1 |
| TEST-9 | Extract unit-testable logic from E2E-only paths | M | P1 |
| TEST-10 | TPC-H scale factor coverage (SF-1, SF-10) | S | P2 |

TEST-1 — E2E tests for CORR-2 (JOIN delta R₀ fix)

In plain terms: The co-delete scenario (UPDATE join key + DELETE join partner in same cycle) is currently untested. Add three E2E tests: (a) simultaneous key change + right-side delete; (b) UPDATE key + DELETE multiple right-side rows; (c) multi-cycle correctness after the scenario.

Verify: 3 E2E tests in e2e_join_tests.rs. All pass; intermediate full refresh not required for correctness. Dependencies: CORR-2. Schema change: No.

TEST-2 — E2E tests for DDL tracking (CORR-3 / CORR-4)

In plain terms: Add E2E tests verifying that ALTER TYPE, ALTER DOMAIN, and ALTER POLICY DDL events correctly trigger stream table invalidation.

Verify: 3 E2E tests (one per DDL type). Stream table state after reinit is correct. Dependencies: CORR-3, CORR-4. Schema change: No.

TEST-3 — WAL decoder unit tests

In plain terms: Add WAL decoder unit tests that explicitly enable wal_enabled = true and verify: (a) old_col_* values are non-NULL for UPDATE rows; (b) pk_hash is non-zero for keyless tables; (c) action string parsing uses exact comparison.

Verify: 5+ unit tests in tests/wal_decoder_tests.rs using Testcontainers with WAL mode enabled. Dependencies: PERF-1, PERF-2. Schema change: No.

TEST-4 — PgBouncer transaction-mode smoke test

In plain terms: Start PgBouncer in transaction mode via Testcontainers, connect pg_trickle through it, and run a basic refresh cycle. Verifies connection_pooler_mode = 'transaction' correctly disables prepared statements and refreshes complete without errors.

Verify: integration test passes with PgBouncer transaction mode container. Dependencies: STAB-1. Schema change: No.

TEST-5 — Read-replica guard integration test

In plain terms: Start a streaming replica via Testcontainers, install pg_trickle on the replica, and verify the background worker exits cleanly with the correct log message rather than crash-looping.

Verify: worker log contains "pg_trickle background worker skipped: server is in recovery mode." No ERROR or FATAL in replica logs. Dependencies: STAB-2. Schema change: No.

TEST-6 — Ownership-check privilege tests for SEC-1

In plain terms: Add E2E tests with two PostgreSQL roles: role A creates a stream table, role B (non-superuser, non-owner) attempts to drop and alter it. Verify that role B receives ERROR: must be owner of stream table. Also verify that a superuser can drop/alter any stream table regardless of ownership.

Verify: 3 E2E tests (non-owner drop, non-owner alter, superuser override). Dependencies: SEC-1. Schema change: No.

TEST-7 — Scheduler dispatch benchmark (500+ STs)

In plain terms: Add a Criterion benchmark that creates a mock DAG with 500+ stream tables and measures per-tick dispatch latency. This gates PERF-5 (HashMap optimization) and provides a regression baseline for future scheduler changes. The benchmark should run in the existing benches/ framework.

Verify: cargo bench --bench scheduler_bench runs and reports P50/P99 tick latency. Baseline saved for Criterion regression gate. Dependencies: PERF-5. Schema change: No.

TEST-8 — Upgrade E2E tests (e2e_migration_tests.rs)

In plain terms: The upgrade path from 0.18.0 → 0.19.0 is currently tested only by verifying ALTER EXTENSION pg_trickle UPDATE runs without error. There are no tests that verify (a) existing stream tables continue to function after upgrade, (b) the new catalog schema items (DB-2 FK, DB-3 version table, DB-5 history retention) are present and correct, or (c) stream table data is preserved. Add a Testcontainers-based upgrade E2E test.

Verify: tests/e2e_migration_tests.rs tests: fresh install, upgrade from previous version with populated stream tables, catalog integrity check, post-upgrade refresh cycle. All pass. Dependencies: DB-1, DB-2, DB-3. Schema change: No (tests existing schema).

TEST-9 — Extract unit-testable logic from E2E-only paths

In plain terms: Several core functions in refresh.rs and scheduler.rs are currently exercised only through end-to-end tests that require a PostgreSQL container. Extracting pure logic from SPI-dependent code and adding direct unit tests makes regressions detectable in seconds instead of minutes. Target: identify 5+ functions (refresh strategy selection, delta cardinality estimation, backoff calculation, topo-sort cycle detection, merge strategy costing) that operate on plain Rust data structures and can be tested with #[cfg(test)] modules.

Verify: 5+ new #[cfg(test)] unit tests in src/refresh.rs or src/scheduler.rs. just test-unit runs them in < 5 seconds. Dependencies: None. Schema change: No.

TEST-10 — TPC-H scale factor coverage (SF-1, SF-10)

In plain terms: The v0.18.0 TPC-H regression guard runs all 22 queries at a single scale factor. Real-world correctness bugs sometimes only manifest at higher cardinalities where hash collisions, sort spill, and parallel execution change the code path. Add nightly runs at SF-1 (6M rows) and SF-10 (60M rows) alongside the existing default. The SF-10 run doubles as a performance soak test — flag any query whose refresh time regresses by more than 20% compared to the previous nightly.

Verify: CI nightly job runs TPC-H at SF-1 and SF-10. All 22 queries produce correct results at both scales. SF-10 timing baseline saved for regression detection. Dependencies: None. Schema change: No.

Schema Stability

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| DB-1 | Fix duplicate 'DIFFERENTIAL' in two CHECK constraints | XS | P0 |
| DB-2 | Add ON DELETE CASCADE FK on pgt_refresh_history.pgt_id | XS | P0 |
| DB-3 | Add pgtrickle.pgt_schema_version version tracking table | XS | P0 |
| DB-4 | Rename pgtrickle_refresh NOTIFY channel → pg_trickle_refresh | XS | P0 |
| DB-5 | pg_trickle.history_retention_days GUC + scheduler daily cleanup | S | P1 |
| DB-6 | Document public API stability contract in docs/SQL_REFERENCE.md | XS | P1 |
| DB-7 | Add migration script template to sql/ | XS | P1 |
| DB-8 | Validate orphan cleanup in drop_stream_table | XS | P1 |
| DB-9 | pgtrickle.migrate() utility function | S | P2 |

DB-1 — Fix duplicate 'DIFFERENTIAL' in CHECK constraints

In plain terms: Both pgt_stream_tables.refresh_mode and pgt_refresh_history.action have 'DIFFERENTIAL' listed twice in their CHECK constraints. While logically harmless, it signals sloppiness and produces confusing output in dumps. Both from REPORT_DB_SCHEMA_STABILITY.md §3.1.

Verify: \d+ pgtrickle.pgt_stream_tables and \d+ pgtrickle.pgt_refresh_history show their CHECK constraints with no duplicate values. Dependencies: None. Schema change: Yes (upgrade SQL drops/recreates constraints).

DB-2 — Add ON DELETE CASCADE FK on pgt_refresh_history.pgt_id

In plain terms: pgt_refresh_history.pgt_id references pgt_stream_tables.pgt_id logically but has no formal FK. When a stream table is dropped, orphan history rows accumulate indefinitely. Adding FOREIGN KEY (pgt_id) REFERENCES pgtrickle.pgt_stream_tables(pgt_id) ON DELETE CASCADE cleans up automatically.
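The upgrade-script fragment would need to purge existing orphans before the constraint can validate — a sketch (constraint name hypothetical):

```sql
-- Remove history rows whose stream table is already gone, so the FK validates.
DELETE FROM pgtrickle.pgt_refresh_history h
 WHERE NOT EXISTS (SELECT 1 FROM pgtrickle.pgt_stream_tables s
                    WHERE s.pgt_id = h.pgt_id);

ALTER TABLE pgtrickle.pgt_refresh_history
    ADD CONSTRAINT pgt_refresh_history_pgt_id_fkey
    FOREIGN KEY (pgt_id)
    REFERENCES pgtrickle.pgt_stream_tables (pgt_id)
    ON DELETE CASCADE;
```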

Verify: Drop a stream table; SELECT count(*) FROM pgtrickle.pgt_refresh_history WHERE pgt_id = <dropped_id> returns 0. Dependencies: None. Schema change: Yes.

DB-3 — Add pgtrickle.pgt_schema_version version tracking table

In plain terms: There is currently no way for migration scripts to verify which schema version is installed before applying changes. Add a pgt_schema_version(version TEXT PRIMARY KEY, applied_at TIMESTAMPTZ, description TEXT) table seeded with the current version. Every future migration script will check this table and insert its target version.
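A sketch of the table, its seed row, and the guard pattern future migration scripts would use:

```sql
CREATE TABLE pgtrickle.pgt_schema_version (
    version     TEXT PRIMARY KEY,
    applied_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    description TEXT
);

INSERT INTO pgtrickle.pgt_schema_version (version, description)
VALUES ('0.19.0', 'version-tracking table introduced');

-- Guard at the top of a future migration script:
DO $$
BEGIN
    IF NOT EXISTS (SELECT 1 FROM pgtrickle.pgt_schema_version
                    WHERE version = '0.19.0') THEN
        RAISE EXCEPTION 'expected schema version 0.19.0 before applying this migration';
    END IF;
END $$;
```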

Verify: SELECT version FROM pgtrickle.pgt_schema_version ORDER BY applied_at DESC LIMIT 1 returns the current extension version after upgrade. Dependencies: None. Schema change: Yes.

DB-4 — Rename pgtrickle_refresh NOTIFY channel → pg_trickle_refresh

In plain terms: Two existing NOTIFY channels use pg_trickle_* naming (pg_trickle_alert, pg_trickle_cdc_transition). The third uses inconsistent pgtrickle_refresh (no separator). Rename before 1.0 while still pre-1.0. Any external LISTEN pgtrickle_refresh in application code must be updated. Document as a breaking change in CHANGELOG.

Verify: LISTEN pg_trickle_refresh receives notifications on refresh events. LISTEN pgtrickle_refresh receives none. Dependencies: None. Schema change: No (code change only).

DB-5 — pg_trickle.history_retention_days GUC + scheduler cleanup

In plain terms: pgt_refresh_history has no retention policy. Production deployments running daily refreshes on 100+ stream tables will accumulate millions of rows within months. Add a GUC (default: 30 days) and a daily cleanup step in the scheduler: DELETE FROM pgtrickle.pgt_refresh_history WHERE start_time < now() - make_interval(...).
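The daily cleanup statement the scheduler would run, reading the GUC at execution time (sketch):

```sql
DELETE FROM pgtrickle.pgt_refresh_history
 WHERE start_time < now() - make_interval(
           days => current_setting('pg_trickle.history_retention_days')::int);
```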

Verify: SET pg_trickle.history_retention_days = 1 and run the cleanup; rows older than 1 day are removed. Default retains 30 days. Dependencies: None. Schema change: No (new GUC + cleanup logic only).

DB-6 — Document public API stability contract

In plain terms: The stability contract defined in REPORT_DB_SCHEMA_STABILITY.md §5 (Tier 1/2/3 surfaces) is not yet published anywhere users can find it. Add a "Stability Guarantees" section to docs/SQL_REFERENCE.md covering: which function signatures are stable, which view columns can be added without a major version, and which internal objects may change with migration scripts.

Verify: docs/SQL_REFERENCE.md has a §Stability Guarantees section linked from the TOC. Dependencies: None. Schema change: No.

DB-7 — Add migration script template to sql/

In plain terms: The sql/pg_trickle--0.18.0--0.19.0.sql file is currently empty (stub). Populate it with: (a) the DB-1 CHECK constraint fixes, (b) the DB-2 FK addition, (c) the DB-3 schema version table creation, and (d) the DB-4 NOTIFY channel rename notice. Also create a reusable migration script template comment header for future versions.

Verify: ALTER EXTENSION pg_trickle UPDATE on a 0.18.0 instance applies all schema changes correctly. check_upgrade_completeness.sh passes. Dependencies: DB-1, DB-2, DB-3, DB-4. Schema change: Yes (this IS the migration script).

DB-8 — Validate orphan cleanup in drop_stream_table

In plain terms: When a stream table is dropped, pgt_change_tracking rows with the dropped pgt_id in tracked_by_pgt_ids (a BIGINT[] column) may not be cleaned up if the array contains other IDs. Add an explicit sweep: remove the dropped pgt_id from all tracked_by_pgt_ids arrays; delete rows where the array becomes empty.
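The sweep reduces to two statements — a sketch in which 42 stands for the pgt_id being dropped:

```sql
-- Remove the dropped ST from every tracking array it appears in...
UPDATE pgtrickle.pgt_change_tracking
   SET tracked_by_pgt_ids = array_remove(tracked_by_pgt_ids, 42::bigint)
 WHERE tracked_by_pgt_ids @> ARRAY[42::bigint];

-- ...then delete tracking rows no stream table needs anymore.
DELETE FROM pgtrickle.pgt_change_tracking
 WHERE cardinality(tracked_by_pgt_ids) = 0;
```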

Verify: Create a shared-source ST pair, drop one; SELECT * FROM pgtrickle.pgt_change_tracking shows correct state. Dependencies: None. Schema change: No.

DB-9 — pgtrickle.migrate() utility function

In plain terms: Add a pgtrickle.migrate() SQL function that iterates over all registered stream tables and applies any pending dynamic object migrations (change buffer schema updates, CDC trigger function regeneration). This is called automatically at the end of ALTER EXTENSION UPDATE and can also be called manually after an upgrade to repair STs that were being refreshed during the upgrade window.

Verify: SELECT pgtrickle.migrate() completes without error on a fresh install and after a version upgrade. Returns a summary of migrated objects. Dependencies: DB-3 (uses schema version to determine needed migrations). Schema change: No.

v0.19.0 total: ~4–5 weeks

Exit criteria:

  • CORR-1: delete_insert strategy removed; ERROR raised on old GUC value
  • CORR-2: JOIN delta R₀ fix: UPDATE key + DELETE partner in same cycle produces correct stream table result
  • CORR-3: ALTER TYPE / ALTER DOMAIN DDL events trigger stream table invalidation
  • CORR-4: ALTER POLICY DDL events trigger stream table invalidation
  • CORR-5: Keyless content-hash collision test passes with two identical-content rows
  • CORR-6: Zero .unwrap() in src/dvm/operators/ outside test modules
  • SEC-1: Non-owner drop_stream_table/alter_stream_table raises ERROR: must be owner
  • STAB-1: pg_trickle.connection_pooler_mode GUC added; transaction mode disables prepared statements
  • STAB-2: Background worker exits cleanly on hot standby with correct log message
  • STAB-3: Semgrep elevated to blocking; zero findings verified
  • STAB-4: auto_backoff GUC: interval doubles after 3 consecutive falling-behind alerts
  • STAB-5: Zero .unwrap() in scheduler hot path outside test modules
  • PERF-1: WAL decoder writes correct old_col_* values for UPDATE rows
  • PERF-2: WAL decoder uses exact action string comparison
  • PERF-4: Catalog indexes on pgt_relid and pgt_dependencies(pgt_id) exist after upgrade
  • PERF-5: Zero units().find() in scheduler; HashMap-based O(1) lookup
  • PERF-6: has_table_source_changes() executes single SPI query regardless of source count
  • SCAL-1: docs/SCALING.md replica section added
  • UX-1: META.json release_status set to "stable"; PGXN listing updated
  • UX-2: Docker Hub release automation wired in GitHub Actions
  • UX-3: apt/rpm packages available via PGDG
  • UX-4: docs/PRE_DEPLOYMENT.md connection pooler compatibility guide added
  • UX-6: drop_stream_table defaults to cascade => false
  • UX-7: UpstreamTableDropped/UpstreamSchemaChanged show table name instead of raw OID
  • UX-8: refresh_stream_table emits NOTICE when refresh is skipped
  • UX-9: CONFIGURATION.md TOC complete; no duplicate entries
  • TEST-1: 3 JOIN delta R₀ E2E tests pass
  • TEST-2: 3 DDL tracking E2E tests pass
  • TEST-3: 5+ WAL decoder unit tests pass with wal_enabled = true
  • TEST-4: PgBouncer transaction-mode integration test passes
  • TEST-5: Read-replica guard integration test passes
  • TEST-6: 3 ownership-check privilege E2E tests pass
  • TEST-7: Scheduler dispatch benchmark baseline saved
  • TEST-8: Upgrade E2E tests pass (pre- and post-upgrade stream table correctness)
  • DB-1: No duplicate 'DIFFERENTIAL' in CHECK constraints
  • DB-2: pgt_refresh_history.pgt_id FK with ON DELETE CASCADE added
  • DB-3: pgtrickle.pgt_schema_version table present and seeded
  • DB-4: pgtrickle_refresh channel renamed to pg_trickle_refresh
  • DB-5: pg_trickle.history_retention_days GUC active; daily cleanup deletes old rows
  • DB-6: docs/SQL_REFERENCE.md stability contract section published
  • DB-7: sql/pg_trickle--0.18.0--0.19.0.sql applies DB-1 through DB-4 changes
  • DB-8: drop_stream_table leaves no orphan rows in pgt_change_tracking
  • CORR-7: TRUNCATE + INSERT in same transaction — stream table correct after refresh
  • CORR-8: NULL join-key delta correct for INNER, LEFT, and FULL JOIN
  • SEC-2: SQL injection audit complete — zero unquoted interpolations in refresh SQL
  • STAB-6: Worker crash recovery sweep cleans orphaned locks and stuck REFRESHING state
  • STAB-7: Version mismatch WARNING emitted after ALTER EXTENSION without restart
  • PERF-7: Delta branch pruning skips zero-change source arms in multi-JOIN
  • PERF-8: Index-aware MERGE uses nested loop for small deltas on indexed tables
  • SCAL-3: docs/SCALING.md CNPG/Kubernetes section published
  • SCAL-4: Partitioning spike report written with concrete findings
  • UX-10: TUI sparkline column visible for refresh latency trend
  • UX-11: pgtrickle.version() returns extension, library, and PG versions
  • TEST-9: 5+ unit tests extracted from E2E-only refresh/scheduler logic
  • TEST-10: TPC-H nightly runs at SF-1 and SF-10 with correct results
  • Extension upgrade path tested (0.18.0 → 0.19.0)
  • just check-version-sync passes

Conflicts & Risks

  1. CORR-1 is a user-visible breaking change. Any deployment with merge_join_strategy = 'delete_insert' in postgresql.conf will error at startup after upgrade. Requires a prominent CHANGELOG entry and a NOTICE during the upgrade migration.

  2. CORR-2 touches high-traffic diff operators. diff_inner_join and diff_left_join are the most commonly used operators. Gate the merge behind TPC-H regression suite + TEST-1. Do not merge without both passing.

  3. STAB-1 introduces a new GUC. The pg_trickle.connection_pooler_mode GUC must be mirrored in upgrade migration SQL, CONFIGURATION.md, and check-version-sync validation.

  4. PERF-1/PERF-2 are currently dormant. Changes to wal_decoder.rs must be tested with wal_enabled = true explicitly. The default trigger-based CDC is unaffected — keep WAL tests behind an explicit env var to avoid slowing down the default test run.

  5. UX-3 (apt/rpm packaging) depends on PGDG maintainer availability plus ~8–12h of packaging work, and can be cut without impacting correctness if it risks delaying the release.

  6. SEC-1 changes privilege semantics. Existing deployments where non-owner roles call drop_stream_table or alter_stream_table will break. Requires a CHANGELOG entry and, optionally, a pg_trickle.skip_ownership_check GUC (default false) for a transition period.

  7. UX-6 changes the cascade default. Scripts relying on implicit cascade => true will silently change behavior — DROP will error instead of cascading. Ship alongside SEC-1 and document both breaking changes together.

  8. PERF-4 requires upgrade SQL. The two CREATE INDEX statements must be added to sql/pg_trickle--0.18.0--0.19.0.sql. Index creation on a busy system may briefly lock the catalog tables (millisecond-range for small catalogs; document in upgrade notes).

  9. DB-4 renames the pgtrickle_refresh NOTIFY channel. Any application code using LISTEN pgtrickle_refresh will stop receiving notifications after upgrade. The old channel name ceases to exist. Document prominently in CHANGELOG and UPGRADING.md.

  10. DB-2 adds a CASCADE FK. If any external tooling holds open transactions when a stream table is dropped, the cascade may fail under lock. Test in upgrade E2E (TEST-8) before shipping.

  11. STAB-6 touches the scheduler startup path. A bug in the recovery sweep could incorrectly reset a stream table that is still being refreshed on a live backend. The sweep must verify that the PID is truly dead via pg_stat_activity before taking corrective action.

  12. PERF-8 disables hashjoin within the refresh transaction. If the threshold is set too high, large deltas will use a slower nested-loop path. Make the merge_index_threshold GUC tunable and document clearly that it only affects the MERGE step, not the delta SQL.

  13. SCAL-4 (partitioning spike) may uncover scope too large for v0.19.0. If the spike reveals that full partitioning support requires CDC architectural changes, defer the implementation to a later release and document findings in the spike report.


v0.20.0 — Dog-Feeding (pg_trickle Monitors Itself)

Status: Released (2026-04-15). 61 of 62 items implemented; 1 skipped (PERF-6 already shipped in v0.19.0). See plans/PLAN_0_20_0.md.

Release Theme: This release implements dog-feeding — pg_trickle uses its own stream tables to maintain reactive analytics over its internal catalog and refresh-history tables. Five dog-feeding stream tables (df_efficiency_rolling, df_anomaly_signals, df_threshold_advice, df_cdc_buffer_trends, df_scheduling_interference) replace repeated full-scan diagnostic functions with continuously maintained incremental views, enable multi-cycle trend detection for threshold tuning, and surface anomalies reactively. An optional auto-apply policy layer can automatically adjust auto_threshold when confidence is high. This validates pg_trickle on its own non-trivial workload and demonstrates the incremental analytics value proposition to users.

See plans/PLAN_DOG_FEEDING.md for the full design, architecture, and risk analysis.

Phase 1 — Foundation

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| DF-F1 | Verify CDC on pgt_refresh_history. Confirm that create_stream_table() installs INSERT triggers on pgt_refresh_history. Fix schema-exclusion logic if the pgtrickle schema is skipped. | 2–4h | PLAN_DOG_FEEDING.md §7 Phase 1 |
| DF-F2 | Create df_efficiency_rolling (DF-1). Maintained rolling-window aggregates over pgt_refresh_history. Replaces refresh_efficiency() full scans. | 2–4h | PLAN_DOG_FEEDING.md §5 DF-1 |
| DF-F3 | E2E test: DF-1 output matches refresh_efficiency(). Insert synthetic history rows, refresh DF-1, assert aggregates agree. | 2–4h | PLAN_DOG_FEEDING.md §8 |
| DF-F4 | pgtrickle.setup_dog_feeding() helper. Single SQL call that creates all five df_* stream tables. | 2–4h | PLAN_DOG_FEEDING.md §7 Phase 4 |
| DF-F5 | pgtrickle.teardown_dog_feeding() helper. Drops all df_* stream tables cleanly. | 1h | PLAN_DOG_FEEDING.md §7 Phase 4 |

Phase 2 — Anomaly Detection

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| DF-A1 | Create df_anomaly_signals (DF-2). Detects duration spikes, error bursts, and mode oscillation by comparing recent behavior against DF-1 baselines. | 3–5h | PLAN_DOG_FEEDING.md §5 DF-2 |
| DF-A2 | Create df_threshold_advice (DF-3). Multi-cycle threshold recommendation replacing the single-step compute_adaptive_threshold() convergence. | 3–5h | PLAN_DOG_FEEDING.md §5 DF-3 |
| DF-A3 | Verify DAG ordering. DF-1 refreshes before DF-2 and DF-3. | 1–2h | PLAN_DOG_FEEDING.md §7 Phase 2 |
| DF-A4 | E2E test: threshold spike detection. Inject synthetic history making DIFF consistently fast; assert DF-3 recommends raising the threshold. | 2–4h | PLAN_DOG_FEEDING.md §8 |
| DF-A5 | E2E test: anomaly duration spike. Inject a 3× duration spike; assert DF-2 detects it. | 2–4h | PLAN_DOG_FEEDING.md §8 |

Phase 3 — CDC Buffer & Interference

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| DF-C1 | Create df_cdc_buffer_trends (DF-4). Tracks change-buffer growth rates per source table. May require pgtrickle.cdc_buffer_row_counts() helper for dynamic table names. | 4–8h | PLAN_DOG_FEEDING.md §5 DF-4 |
| DF-C2 | Create df_scheduling_interference (DF-5). Detects concurrent refresh overlap. FULL-refresh mode initially (bounded 1-hour window). | 3–5h | PLAN_DOG_FEEDING.md §5 DF-5 |
| DF-C3 | E2E test: scheduling overlap detection. Create 3 STs with overlapping schedules; verify DF-5 detects overlap. | 2–4h | PLAN_DOG_FEEDING.md §8 |

Phase 4 — GUC & Auto-Apply

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| DF-G1 | pg_trickle.dog_feeding_auto_apply GUC. Values: off (default) / threshold_only / full. Registered in src/config.rs. | 1–2h | PLAN_DOG_FEEDING.md §6.2 |
| DF-G2 | Auto-apply worker (threshold_only). Post-tick hook reads df_threshold_advice; applies ALTER STREAM TABLE ... SET auto_threshold = <recommended> when confidence is HIGH and delta > 5%. Rate-limited to 1 change per ST per 10 minutes. | 4–8h | PLAN_DOG_FEEDING.md §7 Phase 5 |
| DF-G3 | initiated_by = 'DOG_FEED' audit trail. Log auto-apply changes to pgt_refresh_history. | 1–2h | PLAN_DOG_FEEDING.md §7 Phase 5 |
| DF-G4 | E2E test: auto-apply threshold. Enable threshold_only, inject history making DIFF consistently faster, verify threshold increases automatically. | 2–4h | PLAN_DOG_FEEDING.md §8 |
| DF-G5 | E2E test: rate limiting. Verify no more than 1 threshold change per ST per 10 minutes. | 1–2h | PLAN_DOG_FEEDING.md §8 |

Phase 5 — Operational Diagnostics

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| OPS-1 | pgtrickle.recommend_refresh_mode(st_name). Reads df_threshold_advice to return a structured recommendation { mode, confidence, reason } rather than computing on demand. | 2–4h | PLAN_DOG_FEEDING.md §10.6 |
| OPS-2 | check_cdc_health() spill-risk enrichment. Query df_cdc_buffer_trends growth rate; emit a spill_risk alert when buffer growth will breach spill_threshold_blocks within 2 cycles. | 2–4h | PLAN_DOG_FEEDING.md §10.3 |
| OPS-3 | pgtrickle.scheduler_overhead() diagnostic function. Returns busy-time ratio, queue depth, avg dispatch latency, and fraction of CPU spent on DF STs vs user STs. | 2–4h | — |
| OPS-4 | pgtrickle.explain_dag() — Mermaid/DOT output. Returns DAG as Mermaid markdown with node colours: user=blue, dog-feeding=green, suspended=red. | 3–4h | — |
| OPS-5 | sql/dog_feeding_setup.sql quick-start template. Runnable script: call setup_dog_feeding(), set dog_feeding_auto_apply = 'threshold_only', configure LISTEN, query initial recommendations. | 1h | — |
| OPS-6 | Workload-aware poll intervals via DF-5 signal. Replace compute_adaptive_poll_ms() exponential backoff with pre-emptive dispatch interval widening when df_scheduling_interference detects contention. | 2–4h | PLAN_DOG_FEEDING.md §10.2 |
| DASH-1 | Grafana Dog-Feeding Dashboard. New monitoring/grafana/dashboards/pg_trickle_dog_feeding.json — 5 panels reading from DF-1 through DF-5. | 4–6h | PLAN_DOG_FEEDING.md §10.5 |
| DBT-1 | dbt pgtrickle_enable_monitoring post-hook macro. Calls setup_dog_feeding() automatically after a successful dbt run; documented in dbt-pgtrickle/. | 2h | — |

OPS-1 — pgtrickle.recommend_refresh_mode(st_name text)

Reads directly from df_threshold_advice instead of computing a single-cycle cost comparison on demand (PLAN_DOG_FEEDING.md §10.6). Returns TABLE(mode text, confidence text, reason text). When confidence is LOW (< 10 history rows), emits a fallback with mode='AUTO' and a reason explaining insufficient data. Integrates with explain_st() output.

Verify: call on an ST with ≥ 20 history cycles; assert mode ∈ {'DIFFERENTIAL','FULL','AUTO'} and confidence ∈ {'HIGH','MEDIUM','LOW'}. Dependencies: DF-A2. Schema change: No.
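A minimal sketch of what the function body could look like. The df_threshold_advice column names used here (st_name, recommended_mode, confidence, rationale) are illustrative assumptions, not the plan's actual schema:

```sql
-- Sketch only: df_threshold_advice column names are assumed for illustration.
CREATE OR REPLACE FUNCTION pgtrickle.recommend_refresh_mode(st_name text)
RETURNS TABLE(mode text, confidence text, reason text)
LANGUAGE sql STABLE AS $$
    SELECT
        COALESCE(a.recommended_mode, 'AUTO'),
        COALESCE(a.confidence, 'LOW'),
        COALESCE(a.rationale, 'insufficient history (< 10 rows); defaulting to AUTO')
    FROM (SELECT 1) AS fallback          -- guarantees one output row even with no advice
    LEFT JOIN pgtrickle.df_threshold_advice a
           ON a.st_name = recommend_refresh_mode.st_name;
$$;
```

The LEFT JOIN against a one-row subquery produces the mode = 'AUTO' fallback whenever no advice row exists, matching the LOW-confidence behavior described above.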

OPS-2 — check_cdc_health() spill-risk enrichment

Currently check_cdc_health() performs full-table scans to detect anomalies. When DF-C1 is active, query df_cdc_buffer_trends growth rate instead. Emit a spill_risk = 'IMMINENT' row when the 1-cycle growth rate extrapolated 2 cycles ahead exceeds spill_threshold_blocks. Falls back to full scan when dog-feeding is not set up.

Verify: inject 80% of spill_threshold_blocks worth of buffer rows with a steep growth rate; assert check_cdc_health() returns a spill-risk alert. Dependencies: DF-C1. Schema change: No.
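The two-cycle extrapolation could be expressed roughly as below. The df_cdc_buffer_trends column names and the GUC spelling pg_trickle.spill_threshold_blocks are assumptions for illustration:

```sql
-- Sketch: flag buffers whose linear growth would breach the spill threshold
-- within two refresh cycles. Column and GUC names are assumed.
SELECT source_table,
       CASE
         WHEN current_buffer_blocks + 2 * growth_blocks_per_cycle
              > current_setting('pg_trickle.spill_threshold_blocks')::bigint
         THEN 'IMMINENT'
         ELSE 'OK'
       END AS spill_risk
FROM pgtrickle.df_cdc_buffer_trends;
```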

OPS-3 — pgtrickle.scheduler_overhead() diagnostic function

Returns a snapshot of scheduler efficiency: scheduler_busy_ratio (fraction of wall-clock time spent executing refreshes), queue_depth (STs waiting to be dispatched), avg_dispatch_latency_ms, df_refresh_fraction (fraction of busy time attributable to DF STs). This makes PERF-3's < 1% CPU target observable in production without custom monitoring.

Verify: function returns non-NULL values after 5+ refresh cycles; assert df_refresh_fraction < 0.01 in the soak test context. Dependencies: DF-D4. Schema change: No (new function only).

OPS-4 — pgtrickle.explain_dag() — Mermaid / DOT graph output

Returns the full refresh DAG as a Mermaid markdown string (default) or Graphviz DOT (via format => 'dot' argument). Node labels show ST name, current mode, and refresh interval. Node colours: user STs = blue, dog-feeding STs = green, suspended = red, fused = orange. Edges show dependency direction. Validates that DF-1 → DF-2 → DF-3 ordering is correct post-setup.

Verify: SELECT pgtrickle.explain_dag() after setup_dog_feeding() returns a string containing all five df_ nodes in green with correct edges. Dependencies: None. Schema change: No (new function only).

OPS-5 — sql/dog_feeding_setup.sql quick-start template

A standalone SQL script in sql/ that an operator can run with psql -f sql/dog_feeding_setup.sql. Contents: calls setup_dog_feeding(), sets pg_trickle.dog_feeding_auto_apply = 'threshold_only', runs LISTEN pg_trickle_alert, queries dog_feeding_status() for a status summary, and queries df_threshold_advice for initial recommendations with a warm-up note. Referenced from GETTING_STARTED.md Day 2 operations section (UX-4).

Verify: script executes without errors on a fresh install; produces visible output showing 5 active DF STs. Dependencies: DF-F4, DF-G1, UX-4. Schema change: No.

OPS-6 — Workload-aware poll intervals via DF-5 signal

Currently compute_adaptive_poll_ms() uses pure exponential backoff that reacts to contention only after it occurs. Replace this with a pre-emptive signal: after each scheduler tick, read the latest overlap_count from df_scheduling_interference; if overlap_count >= 2, increase the dispatch interval for the next tick by 20% before dispatching (capped at pg_trickle.max_poll_interval_ms). This closes the dog-feeding feedback loop by letting the analytics directly influence scheduling policy, reducing contention on write-heavy deployments without waiting for timeouts.

Verify: soak test with known-contending STs shows lower overlap_count in DF-5 with signal enabled vs disabled. scheduler_overhead() shows reduced busy-time ratio. Dependencies: DF-C2, OPS-3. Schema change: No.
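The per-tick read must stay cheap. A sketch of the O(1) lookup, with column names (overlap_count, collected_at) assumed from the plan:

```sql
-- Latest interference signal only — an index on collected_at keeps this O(1).
-- If the table is absent (dog-feeding not set up), the caller falls back to
-- the existing exponential backoff.
SELECT overlap_count
FROM pgtrickle.df_scheduling_interference
ORDER BY collected_at DESC
LIMIT 1;
```

When overlap_count >= 2, the worker widens the next dispatch interval by 20%, capped at pg_trickle.max_poll_interval_ms.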

DASH-1 — Grafana Dog-Feeding Dashboard

Add monitoring/grafana/dashboards/pg_trickle_dog_feeding.json alongside the existing pg_trickle_overview.json. Five panels: (1) Refresh throughput timeline (DF-1 avg_diff_ms over time), (2) Anomaly heatmap (DF-2 per-ST anomaly type grid), (3) Threshold calibration scatter (DF-3 current vs recommended threshold), (4) CDC buffer growth sparklines (DF-4 per-source growth rate), (5) Interference matrix (DF-5 overlap heatmap). Provisioned automatically in monitoring/grafana/provisioning/.

Verify: docker compose up in monitoring/ loads both dashboards; all five panels resolve without No data errors using the postgres-exporter queries. Dependencies: DF-F2, DF-A1, DF-A2, DF-C1, DF-C2. Schema change: No.

DBT-1 — pgtrickle_enable_monitoring dbt post-hook macro

Add a pgtrickle_enable_monitoring macro to dbt-pgtrickle/macros/ that calls {{ pgtrickle.setup_dog_feeding() }} and emits a log() message confirming activation. Documented in dbt-pgtrickle/README.md. Users add +post-hook: "{{ pgtrickle_enable_monitoring() }}" to dbt_project.yml to auto-enable monitoring after any dbt run. Idempotent — safe to call on every run because setup_dog_feeding() is already idempotent (STAB-1).

Verify: just test-dbt includes a test case that runs the macro twice; asserts dog_feeding_status() shows 5 active STs after both calls. Dependencies: DF-F4, STAB-1. Schema change: No.

Documentation & Safety

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| DF-D1 | SQL_REFERENCE.md: dog-feeding quick start. Document setup_dog_feeding(), teardown_dog_feeding(), all five df_* stream tables, and the auto-apply GUC. | 2–4h | — |
| DF-D2 | CONFIGURATION.md: pg_trickle.dog_feeding_auto_apply GUC. | 1h | — |
| DF-D3 | E2E test: control plane survives DF ST suspension. Drop or suspend all df_* STs; verify the scheduler and refresh logic operate identically. | 2–4h | PLAN_DOG_FEEDING.md §8 |
| DF-D4 | Soak test addition. Add dog-feeding STs to the existing soak test; verify no memory growth or scheduler stalls under 1-hour sustained load. | 2–4h | PLAN_DOG_FEEDING.md §8 |

Correctness

| ID | Title | Effort | Priority |
| --- | --- | --- | --- |
| CORR-1 | df_threshold_advice output always within [0.01, 0.80] | S | P0 |
| CORR-2 | DF-2 suppresses false-positive spike on first-ever refresh | S | P0 |
| CORR-3 | avg_change_ratio never NaN/Inf on zero-delta streams | S | P0 |
| CORR-4 | CDC INSERT-only invariant verified on pgt_refresh_history | XS | P1 |
| CORR-5 | DF-1 historical window boundary is exclusive, not inclusive | XS | P1 |

CORR-1 — df_threshold_advice output always within [0.01, 0.80]

The LEAST(0.80, GREATEST(0.01, …)) expression in DF-3 must hold for all input combinations including NULL avg_diff_ms, zero avg_full_ms, and extreme ratios. Add a property-based test (proptest) that generates random (avg_diff_ms, avg_full_ms, current_threshold) triples and asserts the output is always in the valid range. Any value outside [0.01, 0.80] that reaches auto-apply would corrupt stream table configuration.

Verify: proptest with 10,000 iterations; zero out-of-range results. Dependencies: DF-A2. Schema change: No.
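The clamp can be exercised directly in SQL; the inner ratio expression here is an illustrative stand-in for the real DF-3 formula:

```sql
-- NULL avg_diff_ms and zero avg_full_ms both collapse to the (already valid)
-- current threshold, so the clamped result stays inside [0.01, 0.80].
SELECT LEAST(0.80, GREATEST(0.01,
         COALESCE(avg_diff_ms / NULLIF(avg_full_ms, 0), current_threshold)
       )) AS recommended_threshold
FROM (VALUES (NULL::float8, 0::float8, 0.25::float8))
     AS t(avg_diff_ms, avg_full_ms, current_threshold);
-- recommended_threshold = 0.25
```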

CORR-2 — DF-2 suppresses false-positive spike on first-ever refresh

df_anomaly_signals compares latest.duration_ms against eff.avg_diff_ms. On the very first refresh of a stream table there is no rolling average yet (eff.avg_diff_ms IS NULL), so the CASE WHEN would produce no anomaly. Confirm the LATERAL subquery returns NULL (not 0) when history is empty, and that the CASE guard is > 3.0 * NULLIF(eff.avg_diff_ms, 0) so a NULL baseline never triggers a spike.

Verify: E2E test creating a brand-new ST; assert duration_anomaly IS NULL on first DF-2 refresh. Dependencies: DF-A1. Schema change: No.
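SQL's three-valued logic does the guarding here: NULL > anything evaluates to NULL, not true, so a missing baseline can never produce a spike. A self-contained demonstration with synthetic values:

```sql
-- First-ever refresh: no rolling baseline yet (avg_diff_ms IS NULL).
SELECT CASE
         WHEN latest.duration_ms > 3.0 * NULLIF(eff.avg_diff_ms, 0)
         THEN 'duration_spike'
       END AS duration_anomaly
FROM (VALUES (120.0)) AS latest(duration_ms),
     (VALUES (NULL::float8)) AS eff(avg_diff_ms);
-- duration_anomaly IS NULL: no baseline, no false positive
```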

CORR-3 — avg_change_ratio never NaN/Inf on zero-delta streams

DF-1 computes avg(h.delta_row_count::float / NULLIF(h.rows_inserted + h.rows_deleted, 0)). If a stream table runs only FULL refreshes (no DIFF cycles) the divisor is always NULL and avg() returns NULL — correct. But if DIFF runs with exactly zero rows inserted and zero deleted (CDC buffer was empty), NULLIF must prevent a divide-by-zero NaN. Verify the guard holds and that avg_change_ratio is either a valid float in [0, 1] or NULL.

Verify: E2E test triggering a DIFF refresh on a quiescent source; assert avg_change_ratio IS NULL OR avg_change_ratio BETWEEN 0 AND 1. Dependencies: DF-F2. Schema change: No.
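The guard can be verified in isolation — NULLIF turns a zero divisor into NULL, and avg() then skips that row entirely:

```sql
SELECT avg(delta_row_count::float8
           / NULLIF(rows_inserted + rows_deleted, 0)) AS avg_change_ratio
FROM (VALUES (0, 0, 0),      -- empty DIFF cycle: contributes NULL, not NaN
             (5, 10, 0))     -- normal cycle: 5 / 10 = 0.5
     AS h(delta_row_count, rows_inserted, rows_deleted);
-- avg_change_ratio = 0.5
```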

CORR-4 — CDC INSERT-only invariant verified on pgt_refresh_history

pgt_refresh_history is semantically append-only: rows are only ever INSERTed (one per refresh). The CDC trigger installed by DF-F1 must be an INSERT-only trigger (no UPDATE/DELETE triggers). If the trigger were registered as FOR EACH ROW AFTER INSERT OR UPDATE, a future catalog UPDATE would generate spurious change-buffer rows and corrupt DF-1 aggregates. Inspect pg_trigger to confirm only an INSERT trigger exists.

Verify: SELECT tgtype FROM pg_trigger WHERE tgrelid = 'pgtrickle.pgt_refresh_history'::regclass returns only INSERT-event triggers. Dependencies: DF-F1. Schema change: No.
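A concrete form of that inspection — tgtype is a bitmask in which bit 3 (value 8) marks DELETE and bit 4 (value 16) marks UPDATE events:

```sql
-- Any user-level trigger on the history table carrying UPDATE or DELETE
-- events violates the append-only invariant. Expect zero rows.
SELECT tgname, tgtype
FROM pg_trigger
WHERE tgrelid = 'pgtrickle.pgt_refresh_history'::regclass
  AND NOT tgisinternal
  AND (tgtype::int & (8 | 16)) <> 0;
```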

CORR-5 — DF-1 historical window boundary is exclusive, not inclusive

The WHERE h.start_time > now() - interval '1 hour' clause uses a strict > comparison. This ensures a row with start_time exactly equal to the boundary is excluded on each pass, preventing double-counting in rolling aggregates. Confirm the query plan uses the index on (pgt_id, start_time) (see PERF-2) and that the boundary is consistent across DF-1, DF-2, and DF-4 (all use the same 1-hour lookback).

Verify: unit test comparing aggregate output with a row at the exact boundary; assert it is excluded. Dependencies: DF-F2. Schema change: No.
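The boundary behavior is easy to demonstrate in a single statement — now() is transaction-stable, so both calls below see the same timestamp:

```sql
-- A row whose start_time equals the boundary is excluded by the strict '>'.
SELECT count(*) AS in_window
FROM (VALUES (now() - interval '1 hour')) AS h(start_time)
WHERE h.start_time > now() - interval '1 hour';
-- in_window = 0
```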


Stability

| ID | Title | Effort | Priority |
| --- | --- | --- | --- |
| STAB-1 | setup_dog_feeding() is fully idempotent | S | P0 |
| STAB-2 | Auto-apply handles ALTER STREAM TABLE failure gracefully | S | P0 |
| STAB-3 | DF STs survive DROP EXTENSION + CREATE EXTENSION cycle | S | P1 |
| STAB-4 | Auto-apply worker checks ST still exists before applying | XS | P1 |
| STAB-5 | teardown_dog_feeding() is safe when some DF STs already removed | XS | P1 |

STAB-1 — setup_dog_feeding() is fully idempotent

Calling setup_dog_feeding() a second time while DF STs already exist must not raise an error. Use IF NOT EXISTS semantics internally (or check catalog before creating). The function must also be safe to call concurrently from two sessions. Idempotency is critical for upgrade scripts and Terraform-style declarative deployment workflows.

Verify: call setup_dog_feeding() three times in a row; no errors, no duplicate stream tables. Dependencies: DF-F4. Schema change: No.
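One way the internals could look, sketched under assumptions: the pgt_stream_tables/pgt_name catalog names follow usage elsewhere in this plan, the advisory-lock key is arbitrary, and the defining query is deliberately elided:

```sql
DO $$
BEGIN
    -- Serialize concurrent setup calls from two sessions.
    PERFORM pg_advisory_xact_lock(hashtext('pg_trickle.setup_dog_feeding'));

    IF NOT EXISTS (SELECT 1 FROM pgtrickle.pgt_stream_tables
                   WHERE pgt_name = 'df_efficiency_rolling') THEN
        PERFORM pgtrickle.create_stream_table(
            name     => 'df_efficiency_rolling',
            query    => '/* defining query per PLAN_DOG_FEEDING.md §5 DF-1 */',
            schedule => '48s');
    END IF;
    -- ... repeat for the remaining four df_* stream tables
END $$;
```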

STAB-2 — Auto-apply handles ALTER STREAM TABLE failure gracefully

The auto-apply post-tick hook reads df_threshold_advice and issues ALTER STREAM TABLE … SET auto_threshold = <recommended>. If the stream table was dropped between the advice read and the apply (a TOCTOU race), the ALTER will error. Catch SQL errors in the post-tick hook with an appropriate match on PgTrickleError and log a WARNING rather than crashing the background worker.

Verify: unit test with a mocked ALTER that returns ERROR: relation does not exist; assert the worker logs a warning and continues to the next advice row. Dependencies: DF-G2. Schema change: No.

STAB-3 — DF STs survive DROP EXTENSION + CREATE EXTENSION cycle

DROP EXTENSION pg_trickle CASCADE drops all extension-owned objects. After CREATE EXTENSION pg_trickle, setup_dog_feeding() should recreate the DF STs cleanly. There must be no leftover triggers, orphaned change buffer tables, or stale catalog rows from the previous installation. This is the most likely failure mode after an emergency rollback + reinstall.

Verify: E2E test: setup_dog_feeding() → DROP EXTENSION CASCADE → CREATE EXTENSION → setup_dog_feeding() → insert history → refresh DF-1; assert correct aggregates. Dependencies: DF-F4, DF-F5. Schema change: No.

STAB-4 — Auto-apply worker checks ST still exists before applying

Before issuing ALTER STREAM TABLE, the worker should confirm the ST is still in pgt_stream_tables and is not in SUSPENDED or FUSED state. Applying a threshold change to a SUSPENDED ST is harmless but wasteful; applying to a FUSED ST is wrong (the fuse exists for a reason). Add a pre-apply guard in the Rust post-tick hook.

Verify: E2E test suspending an ST manually while auto-apply is enabled; assert no threshold change is applied to the suspended stream table. Dependencies: DF-G2. Schema change: No.

STAB-5 — teardown_dog_feeding() is safe when some DF STs already removed

If a user manually drops df_anomaly_signals before calling teardown_dog_feeding(), the teardown function must not error on DROP STREAM TABLE df_anomaly_signals. Use drop_stream_table(name, if_exists => true) semantics for each DF table in the teardown. Otherwise a partial teardown leaves the system in an inconsistent state.

Verify: drop two DF STs manually, then call teardown_dog_feeding(); assert no errors and remaining DF STs are gone. Dependencies: DF-F5. Schema change: No.


Performance

| ID | Title | Effort | Priority |
| --- | --- | --- | --- |
| PERF-1 | Index on pgt_refresh_history(pgt_id, start_time) for DF queries | XS | P0 |
| PERF-2 | Benchmark DF-1 vs refresh_efficiency() on 10 K history rows | S | P0 |
| PERF-3 | Dog-feeding scheduler overhead target: < 1% of total CPU | S | P1 |
| PERF-4 | DF-5 self-join uses bounded index scan, not seq-scan | S | P1 |
| PERF-5 | History pruning batch-DELETE with short transactions (no CDC lock contention) | S | P1 |
| PERF-6 | Columnar change tracking Phase 1 — CDC bitmask (deferred from v0.17/v0.18) | M | P1 |

PERF-1 — Index on pgt_refresh_history(pgt_id, start_time) for DF queries

All five DF stream tables filter pgt_refresh_history on (pgt_id, start_time). Without a composite index on these columns the rolling-window WHERE clause forces a sequential scan of the growing history table. Verify the index was created during extension install (check the upgrade migration); if missing, add it as part of the 0.19.0 → 0.20.0 migration script.

Verify: EXPLAIN (FORMAT TEXT) SELECT … FROM pgtrickle.pgt_refresh_history WHERE pgt_id = 1 AND start_time > now() - interval '1 hour' shows an index scan. Schema change: Yes (index addition in migration script).
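The migration statement would look roughly like this (the index name is illustrative); note the CONCURRENTLY restriction called out under Conflicts & Risks:

```sql
-- Must run outside a transaction block: CREATE INDEX CONCURRENTLY cannot be
-- wrapped in the default single-transaction migration.
CREATE INDEX CONCURRENTLY IF NOT EXISTS pgt_refresh_history_pgt_id_start_time_idx
    ON pgtrickle.pgt_refresh_history (pgt_id, start_time);
```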

PERF-2 — Benchmark DF-1 vs refresh_efficiency() on 10 K history rows

The primary performance claim of dog-feeding is that a maintained DIFFERENTIAL stream table is cheaper than scanning the full history table on every diagnostic call. Establish a Criterion micro-benchmark that seeds 10 K history rows, then compares: (a) a full SELECT * FROM pgtrickle.refresh_efficiency() call vs (b) a SELECT * FROM pgtrickle.df_efficiency_rolling read after one incremental refresh. The benchmark documents the win concretely.

Verify: Criterion benchmark shows DF-1 read is at least 5× faster than refresh_efficiency() at 10 K rows. Included in benches/ and run in CI. Dependencies: DF-F2. Schema change: No.

PERF-3 — Dog-feeding scheduler overhead target: < 1% of total CPU

Five DF STs at 48–96 s schedules add background refresh work. Under a realistic load (20 user STs, 10 K history rows), the total time spent refreshing DF STs should be < 1% of total scheduler CPU. Measure in the E2E soak test by comparing scheduler loop busy-time with and without DF STs. If overhead exceeds 1%, relax schedules to 120 s or move DF STs to refresh_tier = 'cold'.

Verify: soak test reports DF refresh overhead as a fraction of total scheduler CPU; assert < 1%. Dependencies: DF-D4. Schema change: No.

PERF-4 — DF-5 self-join uses bounded index scan, not seq-scan

df_scheduling_interference joins pgt_refresh_history to itself on an overlap condition with a 1-hour bound. Without the index from PERF-1 this double-scan is O(N²) in history rows. Verify EXPLAIN shows nested-loop index scans (not hash or merge join over full table) for both sides of the self-join. If the planner chooses a seq-scan, add enable_seqscan = off for the DF-5 query or restructure with a CTE.

Verify: EXPLAIN of DF-5 query shows index scans on both sides of the JOIN. Dependencies: PERF-1, DF-C2. Schema change: No.

PERF-5 — History pruning batch-DELETE with short transactions

pg_trickle.history_retention_days cleanup (shipped in v0.19.0) currently deletes rows in a single long transaction. Under dog-feeding, that transaction holds a lock on pgt_refresh_history that can delay CDC trigger INSERTs. Rewrite the purge as batched DELETEs: delete at most 500 rows per transaction, commit between batches, sleep 50 ms between batches. The index from PERF-1 ensures each batch is an index-range scan, not a seq-scan.

Verify: soak test running history purge concurrently with DF CDC trigger INSERTs; no lock wait timeout observed. Batch size configurable via pg_trickle.history_purge_batch_size GUC (default 500). Dependencies: PERF-1. Schema change: No.
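A plpgsql sketch of the batched purge. The procedure name is illustrative; COMMIT between batches requires invoking it via CALL outside an enclosing transaction:

```sql
CREATE OR REPLACE PROCEDURE pgtrickle.purge_history_batched()
LANGUAGE plpgsql AS $$
DECLARE
    batch int;
BEGIN
    LOOP
        DELETE FROM pgtrickle.pgt_refresh_history
        WHERE ctid IN (
            SELECT ctid FROM pgtrickle.pgt_refresh_history
            WHERE start_time < now() - make_interval(days =>
                  current_setting('pg_trickle.history_retention_days')::int)
            LIMIT 500);                -- pg_trickle.history_purge_batch_size
        GET DIAGNOSTICS batch = ROW_COUNT;
        EXIT WHEN batch = 0;
        COMMIT;                        -- release locks between batches
        PERFORM pg_sleep(0.05);        -- 50 ms breather for CDC INSERTs
    END LOOP;
END $$;
```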

PERF-6 — Columnar change tracking Phase 1 — CDC bitmask

Deferred from v0.17.0 (twice) and v0.18.0. Dog-feeding now provides concrete internal workload data that justifies the schema change. Phase 1 only: compute changed_columns bitmask (old.col IS DISTINCT FROM new.col) in the CDC trigger for UPDATE rows; store as int8 in the change buffer. Phase 2 (delta-scan filtering using the bitmask) deferred to v0.22.0. Gate behind pg_trickle.columnar_tracking GUC (default off). This is the foundation for 50–90% delta volume reduction on wide-table UPDATE workloads.

Verify: UPDATE a 20-column row, changing 2 columns; assert changed_columns bitmask has exactly 2 bits set. just check-upgrade-all passes. Dependencies: None. Schema change: Yes (change buffer schema addition + migration script).


Scalability

| ID | Title | Effort | Priority |
| --- | --- | --- | --- |
| SCAL-1 | DF STs refresh within window at 100 user stream tables | S | P1 |
| SCAL-2 | pgt_refresh_history retention interacts correctly with dog-feeding | S | P1 |
| SCAL-3 | 1-hour rolling window doesn't over-aggregate when history is sparse | XS | P2 |

SCAL-1 — DF STs refresh within window at 100 user stream tables

With 100 user STs generating up to 100 history rows per 48 s window, DF-1 processes up to ~7,500 rows/hour. Verify that the DIFFERENTIAL refresh of DF-1 completes within its 48 s schedule interval at this load, leaving margin for DF-2 and DF-3. If DF-1 duration exceeds 10 s, investigate query plan and index usage. Run as part of the soak-test at high table count.

Verify: soak test with 100 STs; DF-1 refresh duration < 10 s throughout. Dependencies: PERF-1. Schema change: No.

SCAL-2 — pgt_refresh_history retention interacts correctly with dog-feeding

pg_trickle.history_retention_days (shipped in v0.19.0, default 90 days) purges old history rows. DF-1 only looks back 1 hour, so retention does not affect correctness. However the purge job must not hold a long-running lock that delays CDC trigger firing on concurrent INSERT into the history table. Verify that the cleanup job uses a DELETE … RETURNING batch strategy with short transactions to avoid blocking DF CDC triggers.

Verify: E2E test running the history purge job while DF-1 is being refreshed; no lock wait timeout, no CDC trigger delay. Dependencies: DF-F1. Schema change: No.

SCAL-3 — 1-hour rolling window doesn't over-aggregate when history is sparse

For a stream table that refreshes every 30 minutes (2 refreshes/hour), the DF-1 1-hour window contains at most 2 rows. The AVG() aggregate is still meaningful, but percentile_cont(0.95) over 2 rows is misleading. Document the minimum sample size (in the confidence column of DF-3) and add a note in SQL_REFERENCE.md that DF stats are most meaningful for STs refreshing every 60 s or faster.

Verify: SQL_REFERENCE.md updated; confidence = 'LOW' for STs with total_refreshes < 10. Dependencies: DF-A2. Schema change: No.


Ease of Use

| ID | Title | Effort | Priority |
| --- | --- | --- | --- |
| UX-1 | pgtrickle.dog_feeding_status() diagnostic function | S | P0 |
| UX-2 | setup_dog_feeding() warm-up hint when history is sparse | XS | P1 |
| UX-3 | NOTIFY on anomaly via pg_trickle_alert channel | S | P1 |
| UX-4 | GETTING_STARTED.md: "Day 2 operations" section | S | P1 |
| UX-5 | explain_st() shows if a DF ST covers the queried stream table | XS | P2 |
| UX-6 | recommend_refresh_mode() exposed in explain_st() JSON output | XS | P2 |
| UX-7 | scheduler_overhead() output included in TUI diagnostics panel | XS | P2 |
| UX-8 | df_threshold_advice extended with SLA headroom column | S | P2 |

UX-1 — pgtrickle.dog_feeding_status() diagnostic function

A single-query overview of the dog-feeding analytics plane: name, last refresh timestamp, row count, and whether the DF ST is ACTIVE / SUSPENDED / NOT_CREATED. Calling this function is the first thing an operator should run to check that dog-feeding is working. Return type: TABLE(df_name text, status text, last_refresh timestamptz, row_count bigint, note text).

Verify: function returns 5 rows when all DF STs are active; returns rows with status = 'NOT_CREATED' when setup_dog_feeding() has not been called. Schema change: No (new function only).

UX-2 — setup_dog_feeding() warm-up hint when history is sparse

If pgt_refresh_history has fewer than 50 rows when setup_dog_feeding() is called, emit a NOTICE: "Dog-feeding stream tables created. DF analytics will populate as refresh history accumulates (currently N rows; recommend ≥ 50 before consulting df_threshold_advice)." This prevents operators from acting on meaningless LOW-confidence advice immediately after setup.

Verify: call setup_dog_feeding() on a fresh install; assert NOTICE contains the row count and the ≥ 50 recommendation. Dependencies: DF-F4. Schema change: No.

UX-3 — NOTIFY on anomaly via pg_trickle_alert channel

When df_anomaly_signals detects a duration_anomaly IS NOT NULL or recent_failures >= 2 after a refresh, emit a pg_notify('pg_trickle_alert', payload::text) with event = 'dog_feed_anomaly', the stream table name, anomaly type, last duration, baseline, and a plain-English recommendation. This integrates with existing alert pipelines without requiring a new channel. Fires from a post-refresh trigger on df_anomaly_signals or from the auto-apply post-tick hook.

Verify: E2E test LISTEN on pg_trickle_alert; inject a 3× duration spike; assert NOTIFY payload arrives with correct anomaly type. Dependencies: DF-A1. Schema change: No.
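The payload could be built as below; field names follow the description above and the values are synthetic. NOTIFY payloads are capped at 8000 bytes, which is why the Conflicts & Risks section guards this call against oversized payloads:

```sql
-- Synthetic example of the anomaly notification payload.
SELECT pg_notify(
    'pg_trickle_alert',
    json_build_object(
        'event',            'dog_feed_anomaly',
        'stream_table',     'active_orders',
        'anomaly',          'duration_spike',
        'last_duration_ms', 930,
        'baseline_ms',      310,
        'recommendation',   'Refresh took 3x the rolling baseline; check source churn.'
    )::text);
```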

UX-4 — GETTING_STARTED.md: "Day 2 operations" section

Add a new section to docs/GETTING_STARTED.md covering the first steps after initial deployment: (1) enable dog-feeding with setup_dog_feeding(), (2) check status with dog_feeding_status(), (3) query df_threshold_advice to tune thresholds, (4) set up anomaly alerting via LISTEN. This gives new users a clear post-install checklist and demonstrates the dog-feeding value proposition immediately.

Verify: documentation PR reviewed; code examples in GETTING_STARTED.md execute without modification. Dependencies: UX-1, UX-2. Schema change: No.

UX-5 — explain_st() shows if a DF ST covers the queried stream table

When a user calls pgtrickle.explain_st('my_table'), append a line "Dog-feeding coverage: df_efficiency_rolling ✓, df_threshold_advice ✓" (or "Not set up — run setup_dog_feeding()") to the output. This surfaces the analytics plane to users who might not know dog-feeding exists, without requiring a separate function call.

Verify: SELECT explain_st('any_table') output includes a dog_feeding field in the JSON output. Dependencies: UX-1. Schema change: No.

UX-6 — recommend_refresh_mode() exposed in explain_st() JSON output

explain_st() already shows dog-feeding coverage (UX-5). Extend its JSON output with a recommended_mode field reading from df_threshold_advice (OPS-1). If OPS-1 is not available (no DF setup), fall back to null with a setup_dog_feeding() hint. Keeps the single-function diagnostic surface comprehensive without requiring separate calls.

Verify: SELECT explain_st('any_table') JSON includes recommended_mode and mode_confidence fields. Dependencies: OPS-1. Schema change: No.

UX-7 — scheduler_overhead() output included in TUI diagnostics panel

The TUI (pgtrickle-tui) already shows refresh latency sparklines and ST status. Add a diagnostics panel (toggle key D) showing the fields from scheduler_overhead(): busy ratio, queue depth, and DF fraction as a percentage. Gives operators hands-on observability without needing psql.

Verify: TUI diagnostics panel shows all three scheduler overhead fields; df_refresh_fraction updates after each DF refresh cycle. Dependencies: OPS-3. Schema change: No.

UX-8 — df_threshold_advice extended with SLA headroom column

Extend the DF-3 defining query to include a computed sla_headroom_ms column: freshness_deadline_ms - avg_diff_ms from pgt_refresh_history. When sla_headroom_ms < 0, add a boolean sla_breach_risk = true flag so operators can see at a glance which STs risk missing their freshness SLA on the next DIFFERENTIAL cycle. The freshness_deadline column already exists in pgt_refresh_history (since v0.2.3). No schema change required.

Verify: create an ST with a tight freshness_deadline; run slow synthetic refreshes; assert df_threshold_advice.sla_breach_risk = true. Dependencies: DF-A2. Schema change: No (view column addition only).


Test Coverage

| ID | Title | Effort | Priority |
| --- | --- | --- | --- |
| TEST-1 | Property test: DF-3 recommended threshold always ∈ [0.01, 0.80] | S | P0 |
| TEST-2 | Light E2E: dog-feeding create/refresh/teardown full cycle | S | P0 |
| TEST-3 | Upgrade test: pgt_refresh_history rows survive 0.19.0 → 0.20.0 | S | P0 |
| TEST-4 | Regression test: DF STs absent from check_cdc_health() anomaly list | XS | P1 |
| TEST-5 | Stability test: dog-feeding under 1-h soak with 50 user STs | M | P1 |
| TEST-6 | Light E2E: setup_dog_feeding() idempotency (3× call) | XS | P1 |

TEST-1 — Property test: DF-3 recommended threshold always ∈ [0.01, 0.80]

Implements CORR-1 as a proptest unit test. Generate random (avg_diff_ms: 0.0–100_000.0, avg_full_ms: 0.0–100_000.0, current: 0.01–0.80) triples, compute the DF-3 CASE expression in Rust, assert output ∈ [0.01, 0.80]. Can be a pure Rust unit test in src/refresh.rs alongside the existing compute_adaptive_threshold tests — no database required.

Verify: just test-unit passes; 10,000 proptest iterations with zero failures. Dependencies: CORR-1. Schema change: No.

TEST-2 — Light E2E: dog-feeding create/refresh/teardown full cycle

A light E2E test (stock postgres:18.3 container) that: (1) installs the extension, (2) creates 3 user STs, (3) runs 5 refresh cycles to populate history, (4) calls setup_dog_feeding(), (5) refreshes all DF STs once, (6) asserts dog_feeding_status() shows 5 active STs, (7) calls teardown_dog_feeding(), (8) asserts all DF STs are gone.

Verify: test passes in just test-light-e2e with zero assertions failed. Schema change: No.

TEST-3 — Upgrade test: pgt_refresh_history rows survive 0.19.0 → 0.20.0

The 0.19.0 → 0.20.0 migration adds an index to pgt_refresh_history (PERF-1). The upgrade must not truncate, reorder, or modify existing history rows. Write an upgrade E2E test: deploy 0.19.0, run 10 refreshes, ALTER EXTENSION pg_trickle UPDATE, assert all 10 history rows are intact and the new index exists.

Verify: upgrade E2E test passes; SELECT count(*) FROM pgt_refresh_history unchanged after upgrade. Schema change: Yes (index).

TEST-4 — Regression test: DF STs absent from check_cdc_health() anomaly list

pgtrickle.check_cdc_health() scans all stream tables for CDC anomalies. After setup_dog_feeding(), DF STs must not appear in the anomaly list just because they are refreshed at longer intervals (48–96 s). Their schedules must be recognised as intentionally relaxed, not "falling behind".

Verify: E2E test: setup_dog_feeding() → wait one full DF cycle → assert check_cdc_health() returns no anomalies for any df_ table. Dependencies: DF-F4. Schema change: No.

TEST-5 — Stability test: dog-feeding under 1-h soak with 50 user STs

Extends DF-D4. Runs 50 user STs + 5 DF STs for 1 hour under steady insert load (1 000 rows/min across all sources). Assertions: (a) all DF STs remain ACTIVE, (b) no OOM or background worker crash, (c) DF-1 avg refresh duration < 5 s throughout, (d) pgtrickle.dog_feeding_status() shows 5 active STs at end of run.

Verify: soak test passes with all four assertions. Dependencies: DF-D4, SCAL-1. Schema change: No.

TEST-6 — Light E2E: setup_dog_feeding() idempotency (3× call)

Implements STAB-1 as a light E2E test. Call setup_dog_feeding() three consecutive times in the same session. Assert: no errors, exactly five df_ stream tables in pgt_stream_tables, no duplicate triggers in pg_trigger for history table.

Verify: test passes in just test-light-e2e; SELECT count(*) FROM pgtrickle.pgt_stream_tables WHERE pgt_name LIKE 'df_%' returns 5 after all three calls. Dependencies: STAB-1. Schema change: No.


Conflicts & Risks

  1. PERF-1 (index addition) requires a migration script change. Adding CREATE INDEX CONCURRENTLY to the 0.19.0 → 0.20.0 migration must be tested with just check-upgrade-all. CONCURRENTLY cannot run inside a transaction block — the migration must issue it outside the default single-transaction DDL wrapper.

  2. UX-3 (NOTIFY on anomaly) fires from a post-refresh path. If the pg_notify() call fails (e.g., payload too large), it must not roll back the DF-2 refresh. Wrap the notify in a BEGIN … EXCEPTION WHEN OTHERS THEN NULL END block, or fire it from a deferred trigger.

  3. STAB-3 (DROP EXTENSION cycle) requires DF STs to be extension-owned or cleanly unregistered. If DF STs are not extension-owned objects, DROP EXTENSION CASCADE will not drop them. Either register them as extension members or document that teardown_dog_feeding() must be called before DROP EXTENSION.

  4. TEST-5 (soak test) overlaps with the existing soak test in CI. Add it to the daily stability-tests.yml workflow rather than ci.yml to avoid extending PR CI time. Mark with #[ignore] and trigger via just test-soak.

  5. CORR-5 / PERF-4 interaction. The start_time > now() - interval '1 hour' boundary and the index depend on the planner choosing an index range scan. On very busy deployments where the cardinality estimate is off, the planner may prefer a seq-scan. Consider adding SET enable_seqscan = off inside the DF stream table queries if plan stability is a concern.

  6. PERF-6 (columnar tracking) is a schema change — deferred twice already. The changed_columns column addition to all change buffer tables requires a migration script. Gate strictly behind pg_trickle.columnar_tracking = off default. If capacity is tight, PERF-6 can be cut from v0.20.0 without affecting any other item — it shares no code paths with the DF pipeline.

  7. OPS-2 (check_cdc_health() enrichment) has a fallback requirement. When setup_dog_feeding() has not been called, the function must fall back to the old full-scan path without error. Guard with a catalog check for df_cdc_buffer_trends existence before querying it.

  8. OPS-4 (explain_dag()) output size. At 100+ user STs the Mermaid output may exceed typical terminal width. Offer format => 'dot' and limit => N arguments to constrain output. Default format => 'mermaid' with a NOTICE when DAG has > 20 nodes.

  9. OPS-6 (workload-aware poll) writes to the scheduler hot path. The compute_adaptive_poll_ms() function is called on every scheduler tick. The DF-5 read must be a single O(1) catalog lookup (latest row only), not a full table scan. Guard with LIMIT 1 ORDER BY collected_at DESC. If the DF-5 table does not exist (dog-feeding not set up), fall back to the old backoff logic without error.

  10. DASH-1 (Grafana) depends on postgres-exporter SQL queries. The dashboard panels use custom SQL collectors in the postgres-exporter config. Verify that monitoring/ docker-compose already mounts query config; if not, add a pg_trickle_df_queries.yaml collector file alongside the existing exporter config.

  11. DBT-1 macro idempotency. The pgtrickle_enable_monitoring macro calls setup_dog_feeding() on every dbt run. Document that this is intentionally safe (STAB-1) and adds < 5 ms overhead per run.
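The exception shield described in risk 2 can be sketched in PL/pgSQL as follows. The function name is illustrative; the pg_trickle_alert channel matches the UX-3 exit criterion:

```sql
-- Sketch of an exception-shielded notify (risk 2): a failure inside
-- pg_notify() is downgraded to a WARNING so it cannot abort the
-- enclosing DF-2 refresh transaction.
CREATE OR REPLACE FUNCTION pgtrickle.df_notify_anomaly(payload text)
RETURNS void LANGUAGE plpgsql AS $$
BEGIN
    PERFORM pg_notify('pg_trickle_alert', payload);
EXCEPTION WHEN OTHERS THEN
    -- e.g. payload exceeds the ~8000-byte NOTIFY limit; log and continue
    RAISE WARNING 'pg_trickle: anomaly NOTIFY failed: %', SQLERRM;
END $$;
```

Note that NOTIFY is transactional: the message is delivered only if the surrounding refresh commits, which is usually the desired behavior here.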

v0.20.0 total: ~3–4 weeks

Exit criteria:

  • DF-F1: pgt_refresh_history receives CDC INSERT triggers when create_stream_table() is called
  • DF-F2: df_efficiency_rolling created and refreshes correctly in DIFFERENTIAL mode
  • DF-F3: DF-1 output matches refresh_efficiency() results on synthetic history
  • DF-F4: setup_dog_feeding() creates all five df_* stream tables in one call
  • DF-F5: teardown_dog_feeding() drops all df_* tables cleanly with no orphaned triggers
  • DF-A1: df_anomaly_signals created and detects 3× duration spikes
  • DF-A2: df_threshold_advice provides HIGH-confidence recommendations after ≥ 20 refresh cycles
  • DF-A3: DAG ensures DF-1 refreshes before DF-2 and DF-3 in every scheduler tick
  • DF-C1: df_cdc_buffer_trends created (FULL or DIFFERENTIAL mode)
  • DF-C2: df_scheduling_interference detects overlapping concurrent refreshes
  • DF-G1: pg_trickle.dog_feeding_auto_apply GUC registered with default off
  • DF-G2: Auto-apply adjusts threshold with ≥ 1 confirmed change in E2E test
  • DF-G5: Rate limiting verified — no more than 1 change per ST per 10 minutes
  • DF-D3: Suspending all df_* STs does not affect control-plane operation
  • CORR-1: df_threshold_advice output always within [0.01, 0.80] (property test)
  • CORR-2: No false-positive DURATION_SPIKE on first-ever refresh of a new ST
  • CORR-3: avg_change_ratio is NULL or in [0, 1] for zero-delta sources
  • CORR-4: Only INSERT triggers (no UPDATE/DELETE) on pgt_refresh_history
  • STAB-1: setup_dog_feeding() called 3× produces no errors and no duplicates
  • STAB-2: Auto-apply worker logs WARNING (not panic) when ALTER target disappears
  • STAB-3: DROP EXTENSION + CREATE EXTENSION + setup_dog_feeding() cycle works cleanly
  • PERF-1: pgt_refresh_history(pgt_id, start_time) index exists and is used by DF queries
  • PERF-2: DF-1 read ≥ 5× faster than refresh_efficiency() at 10 K history rows
  • UX-1: pgtrickle.dog_feeding_status() returns correct status for all five DF STs
  • UX-2: setup_dog_feeding() emits warm-up NOTICE when history has < 50 rows
  • UX-3: pg_trickle_alert NOTIFY received within one DF cycle after a 3× duration spike
  • TEST-1: Proptest for DF-3 threshold bounds passes 10,000 iterations
  • TEST-2: Light E2E full cycle test passes
  • TEST-3: Upgrade E2E: history rows intact and index present after 0.19.0 → 0.20.0
  • TEST-4: check_cdc_health() reports no anomalies for df_* tables after setup
  • OPS-1: recommend_refresh_mode() returns mode ∈ {'DIFFERENTIAL','FULL','AUTO'} and confidence ∈ {'HIGH','MEDIUM','LOW'}
  • OPS-2: check_cdc_health() returns spill-risk alert when buffer growth rate extrapolates to breach threshold within 2 cycles
  • OPS-3: scheduler_overhead() returns non-NULL fields after ≥ 5 refresh cycles; df_refresh_fraction < 0.01 in soak test
  • OPS-4: explain_dag() output contains all five df_* nodes after setup_dog_feeding()
  • OPS-5: sql/dog_feeding_setup.sql executes without errors on a fresh install
  • PERF-5: Concurrent history purge + DF CDC INSERT produces no lock wait timeouts in soak test
  • PERF-6: changed_columns bitmask stored in change buffer for UPDATE rows when columnar_tracking = on (if included)
  • OPS-6: Soak test shows lower overlap_count in DF-5 with workload-aware poll enabled vs disabled
  • DASH-1: docker compose up in monitoring/ loads pg_trickle_dog_feeding dashboard; all 5 panels show data
  • DBT-1: pgtrickle_enable_monitoring macro runs twice without error; dog_feeding_status() shows 5 active STs after both calls
  • UX-8: df_threshold_advice.sla_breach_risk = true when avg_diff_ms > freshness_deadline_ms on synthetic data
  • Extension upgrade path tested (0.19.0 → 0.20.0)
  • just check-version-sync passes

v0.21.0 — PostgreSQL 17 Support

Release Theme: This release adds PostgreSQL 17 as a supported target alongside PostgreSQL 18. PGlite is built on PostgreSQL 17, so this is a hard prerequisite for the PGlite proof of concept (v0.22.0). The pgrx 0.17.x framework already supports PG 17 — the work is enabling the feature flag, adapting version-sensitive code paths, expanding the CI matrix, and validating the full test suite against a PG 17 instance.

Cargo & Build System

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| PG17-1 | Add pg17 feature to Cargo.toml. Define pg17 = ["pgrx/pg17", "pgrx-tests/pg17"] feature. Keep default = ["pg18"]. | 1h | |
| PG17-2 | Broaden #[cfg] guards in src/dag.rs. Three #[cfg(feature = "pg18")] blocks must become #[cfg(any(feature = "pg17", feature = "pg18"))]. | 1–2h | |
| PG17-3 | Guard NodeTag numeric assertions. src/dvm/parser/mod.rs asserts specific NodeTag integer values (e.g., T_GroupingSet = 107) that shift between PG versions. Gate behind #[cfg(feature = "pg18")] or use per-version value tables. | 2–4h | |
| PG17-4 | Audit pg_sys::* API surface. Verify that every pg_sys call compiles and behaves correctly on PG 17 bindings. Focus on catalog struct field names, WAL decoder types, and any PG 18-only additions. | 4–8h | |

CI & Infrastructure

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| PG17-5 | CI matrix expansion. Add PG 17 build + unit test job to ci.yml. Use postgres:17 Docker image for integration and light E2E tests. | 4–8h | |
| PG17-6 | justfile parameterisation. Add pg17 variants for build, test, and package recipes (e.g., just build-pg17, just test-e2e-pg17). | 2–4h | |
| PG17-7 | tests/Dockerfile.e2e PG version parameter. Accept a build arg for the base PostgreSQL image version so the same Dockerfile works for PG 17 and PG 18. | 2–4h | |
| PG17-8 | Scripts parameterisation. Update run_unit_tests.sh, run_light_e2e_tests.sh, run_e2e_tests.sh to accept a PG version argument instead of hardcoding pg18. | 2–4h | |

Testing & Validation

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| PG17-9 | Full E2E suite against PG 17. Run the complete E2E test suite against a PG 17 instance. Fix any parser or catalog incompatibilities that surface. | 1–2d | |
| PG17-10 | TPC-H validation on PG 17. Run TPC-H benchmark queries on PG 17 to verify differential refresh correctness for complex queries. | 4–8h | |
| PG17-11 | Upgrade path test. Verify ALTER EXTENSION pg_trickle UPDATE from 0.20.0 to 0.21.0 works on both PG 17 and PG 18. | 2–4h | |
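The upgrade-path check (PG17-11 and its exit criterion) has roughly this shape when run against each PG major version; the version literals follow the v0.21.0 exit criteria:

```sql
-- Install the previous release, exercise it, then upgrade in place.
CREATE EXTENSION pg_trickle VERSION '0.20.0';
-- ... create a stream table and run at least one refresh cycle ...
ALTER EXTENSION pg_trickle UPDATE TO '0.21.0';

-- The catalog must report the new version, and existing stream tables
-- must still refresh afterwards.
SELECT extversion FROM pg_extension WHERE extname = 'pg_trickle';
```

Running this identical script against both a postgres:17 and a postgres:18 container is what distinguishes PG17-11 from the generic upgrade test.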

Documentation

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| PG17-12 | Update docs and README. Change "PostgreSQL 18 extension" to "PostgreSQL 17/18 extension" in README.md, INSTALL.md, src/lib.rs doc comments, and ARCHITECTURE.md. | 1–2h | |
| PG17-13 | Docker Hub image variants. Publish images tagged with both PG versions (e.g., :0.21.0-pg17, :0.21.0-pg18). | 2–4h | |

v0.21.0 total: ~2–4 days

Exit criteria:

  • PG17-1: cargo build --features pg17 --no-default-features compiles cleanly
  • PG17-2/PG17-3: cargo clippy --features pg17 --no-default-features passes with zero warnings
  • PG17-4: No pg_sys compile errors on PG 17 bindings
  • PG17-5: CI runs unit + integration + light E2E tests on PG 17
  • PG17-9: Full E2E suite passes on PG 17 with zero failures
  • PG17-10: TPC-H differential refresh matches full refresh on PG 17
  • PG17-11: Extension upgrade path works on both PG 17 and PG 18
  • PG17-12: Documentation reflects PG 17/18 dual support
  • Extension upgrade path tested (0.20.0 → 0.21.0)
  • just check-version-sync passes

v0.22.0 — PGlite Proof of Concept

Release Theme: This release validates whether PGlite users want real incremental view maintenance by shipping a lightweight TypeScript plugin with zero core changes. The plugin (@pgtrickle/pglite-lite) intercepts DML via statement-level AFTER triggers and applies pre-computed delta SQL for simple patterns — single-table aggregates, two-table inner joins, and filtered scans. It deliberately limits scope to 3–5 SQL patterns to keep effort low while generating a concrete demand signal. If adoption materialises, the full core extraction (v0.23.0) and WASM build (v0.24.0) proceed. The main pg_trickle PostgreSQL extension ships no functional changes in this release — only version bumps and upgrade migration plumbing.

See PLAN_PGLITE.md for the full feasibility report.

PGlite JS Plugin PoC (Strategy C — Phase 0)

In plain terms: PGlite's built-in live.incrementalQuery() re-runs the full query on every change and diffs at the JavaScript layer. This proof of concept ships a PGlite plugin (@pgtrickle/pglite-lite) that intercepts DML via statement-level AFTER triggers and applies pre-computed delta SQL for simple cases — single-table aggregates and two-table inner joins. It validates whether PGlite users want real IVM and whether the trigger infrastructure works correctly in PGlite's single-user WASM mode. No WASM compilation, no pgrx changes, no core refactoring required.

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| PGL-0-1 | PGlite trigger infrastructure validation. Empirically verify that statement-level triggers with REFERENCING NEW TABLE AS ... OLD TABLE AS ... work in PGlite's single-user mode. Document any limitations. | 4–8h | PLAN_PGLITE.md §8 Q1 |
| PGL-0-2 | Delta SQL templates for simple patterns. Implement delta SQL generation in TypeScript for: (a) single-table GROUP BY with COUNT/SUM/AVG, (b) two-table INNER JOIN, (c) simple WHERE filter. Pre-compute at createStreamTable() time. | 2–3d | PLAN_PGLITE.md §5 Strategy C |
| PGL-0-3 | PGlite plugin skeleton. TypeScript plugin implementing createStreamTable(), dropStreamTable(), trigger registration, and delta application via PGlite's plugin API. | 2–3d | PLAN_PGLITE.md §5 Strategy C |
| PGL-0-4 | npm package @pgtrickle/pglite-lite. Package, publish, README with usage examples, and 3–5 supported SQL patterns documented. | 1–2d | |
| PGL-0-5 | Benchmark vs live.incrementalQuery(). Compare latency and throughput for a 10K-row table with single-row inserts. Quantify the IVM advantage. | 1d | PLAN_PGLITE.md §4.2 |
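The trigger shape PGL-0-1 must validate, together with the kind of pre-computed delta SQL PGL-0-2 generates for a COUNT/SUM aggregate, can be sketched as below. All table and function names are illustrative, the sketch covers only the INSERT path (UPDATE/DELETE need companion triggers using OLD TABLE), and it assumes a unique constraint on the stream table's group key — whether PGlite supports this trigger form at all is exactly what PGL-0-1 verifies:

```sql
-- Statement-level AFTER trigger with a transition table (PGL-0-1),
-- applying a pre-computed aggregate delta (PGL-0-2).
CREATE FUNCTION st_orders_by_status_ins() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- Fold only the newly inserted rows into the maintained aggregate.
    INSERT INTO st_orders_by_status (status, cnt, total)
    SELECT n.status, count(*), sum(n.amount)
    FROM new_rows n
    GROUP BY n.status
    ON CONFLICT (status) DO UPDATE
        SET cnt   = st_orders_by_status.cnt   + EXCLUDED.cnt,
            total = st_orders_by_status.total + EXCLUDED.total;
    RETURN NULL;  -- return value is ignored for statement-level triggers
END $$;

CREATE TRIGGER st_orders_by_status_ai
    AFTER INSERT ON orders
    REFERENCING NEW TABLE AS new_rows
    FOR EACH STATEMENT
    EXECUTE FUNCTION st_orders_by_status_ins();
```

The plugin would emit this SQL once at createStreamTable() time; per-DML work is then limited to the rows in the transition table.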

Phase 0 subtotal: ~2–3 weeks

Correctness

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| CORR-1 | Delta SQL equivalence for supported patterns | M | P0 |
| CORR-2 | NULL-key aggregate correctness in JS delta | S | P0 |
| CORR-3 | Multi-DML transaction atomicity | S | P1 |

CORR-1 — Delta SQL equivalence for supported patterns

In plain terms: The TypeScript delta SQL templates must produce the exact same stream table state as a full query re-evaluation, for every combination of INSERT, UPDATE, and DELETE on the supported patterns (single-table GROUP BY + COUNT/SUM/AVG, two-table INNER JOIN, simple WHERE filter). Correctness is proven by running each DML operation, comparing the delta-maintained result against a fresh SELECT, and asserting row-for-row equivalence.

Verify: automated test suite runs 100+ randomised DML sequences per pattern; zero divergence from full re-evaluation. Dependencies: PGL-0-2, PGL-0-3. Schema change: No.

CORR-2 — NULL-key aggregate correctness in JS delta

In plain terms: Although NULL = NULL is never true under SQL's three-valued logic, GROUP BY treats all NULL keys as a single group — so the NULL group exists and must be maintained. The TypeScript delta templates must handle NULL group keys correctly — insertions into the NULL group, deletions that empty it, and updates that move rows in/out of the NULL group. This is the most common correctness pitfall in hand-rolled IVM.

Verify: E2E test with nullable GROUP BY column; assert NULL group appears, grows, shrinks, and disappears correctly. Dependencies: CORR-1. Schema change: No.
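The pitfall is easy to see in SQL. A merge joined on plain = never matches the NULL group (the comparison yields NULL, not true), so a naive delta would re-insert the NULL group instead of updating it; IS NOT DISTINCT FROM treats two NULLs as equal. Illustrative sketch with hypothetical names (st_by_region, delta), not the plugin's actual delta SQL:

```sql
-- Naive merge: WHERE s.region = d.region silently skips the NULL group.
-- NULL-safe merge: IS NOT DISTINCT FROM matches the NULL group as well.
UPDATE st_by_region s
SET    cnt = s.cnt + d.cnt
FROM   delta d
WHERE  s.region IS NOT DISTINCT FROM d.region;
```

The delta templates (or any ON CONFLICT-based variant, which also relies on plain key equality) need this NULL-safe comparison wherever a group key is nullable.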

CORR-3 — Multi-DML transaction atomicity

In plain terms: PGlite runs in single-connection mode, so a BEGIN; INSERT ...; DELETE ...; COMMIT sequence fires two separate statement-level triggers. The plugin must ensure the stream table reflects the net effect of the entire transaction, not an intermediate state. If trigger ordering produces incorrect intermediate results, a post-transaction reconciliation pass is needed.

Verify: test with BEGIN; INSERT; UPDATE; DELETE; COMMIT on a single base table; stream table matches full re-evaluation after commit. Dependencies: PGL-0-3. Schema change: No.
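The CORR-3 acceptance check can be sketched directly in SQL. Table and column names are illustrative, and the symmetric EXCEPT should be run as well to catch rows missing from the stream table:

```sql
-- Three statements, three separate statement-level trigger firings,
-- net effect zero on the base table.
BEGIN;
INSERT INTO orders (id, status, amount) VALUES (1, 'active', 10);
UPDATE orders SET amount = 20 WHERE id = 1;
DELETE FROM orders WHERE id = 1;
COMMIT;

-- Equivalence assertion: any row returned indicates that the maintained
-- state diverged from a fresh evaluation of the defining query.
SELECT * FROM st_orders_by_status
EXCEPT
SELECT status, count(*), sum(amount) FROM orders GROUP BY status;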

Stability

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| STAB-1 | Trigger cleanup on dropStreamTable | S | P0 |
| STAB-2 | Graceful error on unsupported SQL | S | P0 |
| STAB-3 | Plugin idempotency (create-drop-create cycle) | S | P1 |

STAB-1 — Trigger cleanup on dropStreamTable

In plain terms: When a user calls dropStreamTable(), all statement-level AFTER triggers registered on source tables must be removed. Orphaned triggers would fire on every subsequent DML and attempt to write to a non-existent stream table, causing errors.

Verify: after dropStreamTable(), no pg_trickle-related triggers remain in pg_trigger for the source tables. Dependencies: PGL-0-3. Schema change: No.

STAB-2 — Graceful error on unsupported SQL

In plain terms: The PoC supports only 3–5 SQL patterns. If a user passes an unsupported query (e.g., a LEFT JOIN, window function, or recursive CTE), the plugin must throw a clear, actionable error message listing what is supported — not silently produce wrong results or crash.

Verify: createStreamTable() with an unsupported query throws an error whose message names the unsupported feature and lists supported alternatives. Dependencies: PGL-0-2. Schema change: No.

STAB-3 — Plugin idempotency (create-drop-create cycle)

In plain terms: Creating a stream table, dropping it, and creating it again with the same name must work without leftover state. Leftover catalog rows, triggers, or temp tables from the first creation must not interfere with the second.

Verify: create-drop-create cycle produces correct results; no duplicate triggers or stale catalog entries. Dependencies: STAB-1. Schema change: No.

Performance

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| PERF-1 | Benchmark vs live.incrementalQuery() | M | P0 |
| PERF-2 | Delta overhead profiling per DML | S | P1 |
| PERF-3 | Large result set scalability (10K/100K rows) | S | P1 |

PERF-1 — Benchmark vs live.incrementalQuery() (= PGL-0-5)

In plain terms: The entire value proposition of this PoC depends on being faster than PGlite's built-in live.incrementalQuery() for the supported patterns. Produce a public benchmark comparing latency and throughput for single-row inserts into a 10K-row base table across all three supported patterns (aggregate, join, filter).

Verify: delta-maintained stream table refresh latency < 50% of live.incrementalQuery() latency for all supported patterns at 10K rows. Dependencies: PGL-0-3, PGL-0-4. Schema change: No.

PERF-2 — Delta overhead profiling per DML

In plain terms: Measure the per-DML overhead added by the statement-level triggers. INSERT-heavy workloads should not suffer more than a 2x latency increase compared to the same INSERT without pg_trickle triggers installed. Profile trigger function execution time, temp table creation, and delta DML.

Verify: microbenchmark shows per-DML overhead < 2 ms for aggregate pattern; < 5 ms for join pattern at 10K source rows. Dependencies: PGL-0-3. Schema change: No.

PERF-3 — Large result set scalability (10K/100K rows)

In plain terms: Verify that the delta approach maintains its advantage over full re-evaluation as base table size grows. At 100K rows, the delta path should be significantly faster than full re-evaluation for single-row changes.

Verify: at 100K base table rows, single-row insert refresh latency is < 10% of full query re-evaluation latency. Dependencies: PERF-1. Schema change: No.

Scalability

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| SCAL-1 | Multiple stream tables on same source | S | P1 |
| SCAL-2 | Cascading stream table triggers | M | P2 |
| SCAL-3 | Concurrent DML with multiple stream tables | S | P2 |

SCAL-1 — Multiple stream tables on same source

In plain terms: Verify that 3+ stream tables can be maintained from the same base table simultaneously. Each DML fires one trigger per stream table; ensure triggers do not interfere with each other.

Verify: 3 stream tables on the same source; INSERT + UPDATE + DELETE cycle; all 3 produce correct results. Dependencies: PGL-0-3. Schema change: No.

SCAL-2 — Cascading stream table triggers

In plain terms: If stream table B reads from stream table A's underlying storage, an INSERT into A's source should propagate through A's trigger, update A, and then fire B's trigger to update B — all within the same PGlite transaction. Verify this works in PGlite's single-connection environment without deadlocks or infinite trigger loops.

Verify: A->B cascade produces correct results for INSERT/DELETE on A's source. No infinite loops detected. Dependencies: SCAL-1. Schema change: No.

SCAL-3 — Concurrent DML with multiple stream tables

In plain terms: PGlite is single-connection, but a user could issue rapid sequential DML (INSERT; INSERT; INSERT) without explicit transactions. Verify all stream tables converge to the correct state.

Verify: 100 sequential INSERTs with 3 stream tables; final state matches full re-evaluation. Dependencies: SCAL-1. Schema change: No.

Ease of Use

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| UX-1 | Getting-started README with copy-paste examples | S | P0 |
| UX-2 | Supported patterns decision table | XS | P0 |
| UX-3 | Error messages include remediation hints | S | P1 |
| UX-4 | TypeScript type definitions | S | P1 |
| UX-5 | ElectricSQL outreach and collaboration | S | P1 |

UX-1 — Getting-started README with copy-paste examples

In plain terms: The npm package README must include 3 complete, copy-pasteable examples — one per supported pattern — that a developer can run in under 2 minutes. Include Node.js and browser (Vite) examples.

Verify: all README examples execute without modification on a fresh PGlite instance. Dependencies: PGL-0-4. Schema change: No.

UX-2 — Supported patterns decision table

In plain terms: A clear table showing which SQL patterns are and are not supported, what error you get for unsupported patterns, and when full support is expected (v0.24.0). This prevents user frustration and sets expectations.

Verify: decision table in README and npm page lists all tested patterns with status (supported / unsupported / planned). Dependencies: None. Schema change: No.

UX-3 — Error messages include remediation hints

In plain terms: Every error thrown by the plugin must include the table name, the failing operation, and a one-sentence hint. Example: "LEFT JOIN is not supported in pglite-lite. Use @pgtrickle/pglite (v0.24.0+) for full SQL support, or rewrite as INNER JOIN."

Verify: all error paths tested; every error message includes a remediation sentence. Dependencies: STAB-2. Schema change: No.

UX-4 — TypeScript type definitions

In plain terms: Ship .d.ts type definitions so TypeScript users get autocomplete and type checking for createStreamTable(), dropStreamTable(), and configuration options.

Verify: TypeScript project consumes the plugin with strict mode; no any types leaked. Dependencies: PGL-0-4. Schema change: No.

UX-5 — ElectricSQL outreach and collaboration

In plain terms: PGlite is developed by ElectricSQL. Their cooperation is essential for Phase 2 (WASM build). Initiate contact before shipping Phase 0 to gauge interest, validate assumptions about PGlite's trigger infrastructure, and explore potential co-marketing.

Verify: documented exchange with ElectricSQL team (GitHub issue, email, or meeting notes). Dependencies: None. Schema change: No.

Test Coverage

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| TEST-1 | Automated correctness suite (all patterns x DML types) | M | P0 |
| TEST-2 | PGlite version compatibility matrix | S | P1 |
| TEST-3 | Regression test: trigger firing order | S | P1 |
| TEST-4 | Bundle size monitoring | XS | P2 |
| TEST-5 | Extension upgrade path (0.21.0 to 0.22.0) | S | P0 |

TEST-1 — Automated correctness suite (all patterns x DML types)

In plain terms: For each supported pattern (aggregate, join, filter), run every DML type (INSERT, UPDATE, DELETE, multi-row, TRUNCATE) and assert the stream table matches a fresh full evaluation. This is the primary quality gate.

Verify: Jest/Vitest test suite with > 50 test cases; all pass on PGlite latest. Dependencies: PGL-0-2, PGL-0-3. Schema change: No.

TEST-2 — PGlite version compatibility matrix

In plain terms: PGlite updates frequently. Test the plugin against the last 3 PGlite releases to ensure trigger behavior hasn't changed. Document the minimum supported PGlite version.

Verify: CI matrix runs tests against PGlite N, N-1, N-2. Dependencies: TEST-1. Schema change: No.

TEST-3 — Regression test: trigger firing order

In plain terms: When multiple triggers exist on the same table, PostgreSQL fires them in alphabetical order by trigger name. Verify that trigger naming conventions prevent ordering conflicts with user-defined triggers.

Verify: test with a user-defined AFTER trigger alongside the plugin's trigger; both fire correctly; stream table produces correct results. Dependencies: PGL-0-3. Schema change: No.
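A sketch of the ordering convention TEST-3 exercises — trigger and function names are illustrative; the 'zzz_' prefix is a hypothetical convention, not the plugin's documented naming scheme:

```sql
-- Same table, same event: PostgreSQL fires both triggers in
-- alphabetical order by trigger name, so 'aa_user_audit' runs before
-- 'zzz_pgtrickle_orders'. A late-sorting prefix keeps the plugin's
-- delta application after typical user-defined triggers.
CREATE TRIGGER aa_user_audit
    AFTER INSERT ON orders
    FOR EACH STATEMENT EXECUTE FUNCTION user_audit();

CREATE TRIGGER zzz_pgtrickle_orders
    AFTER INSERT ON orders
    REFERENCING NEW TABLE AS new_rows
    FOR EACH STATEMENT EXECUTE FUNCTION pgtrickle_delta_apply();
```

The regression test pairs such a user trigger with the plugin's trigger and asserts that both fire and the stream table remains correct.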

TEST-4 — Bundle size monitoring

In plain terms: The npm package should be small (< 50 KB minified + gzipped) since this is a pure-JS plugin with no WASM. Add a CI check that fails if bundle size exceeds the threshold.

Verify: npm pack --dry-run reports < 50 KB gzipped. Dependencies: PGL-0-4. Schema change: No.

TEST-5 — Extension upgrade path (0.21.0 to 0.22.0)

In plain terms: The main pg_trickle PostgreSQL extension ships no functional changes in v0.22.0, but the upgrade migration path must still be tested. ALTER EXTENSION pg_trickle UPDATE from 0.21.0 to 0.22.0 must leave existing stream tables intact.

Verify: upgrade E2E test confirms all existing stream tables survive and refresh correctly after the 0.21.0 -> 0.22.0 upgrade. Dependencies: None. Schema change: No (PG extension unchanged).

Conflicts & Risks

  1. Demand uncertainty is the primary risk. This entire milestone is a bet that PGlite users want IVM beyond what pg_ivm provides. If Phase 0 generates no adoption signal, v0.23.0–v0.25.0 should be deprioritised and v1.0.0 proceeds without PGlite. Define a concrete adoption threshold (e.g., > 100 npm weekly downloads within 60 days of publication) as a go/no-go gate for v0.23.0.

  2. PGlite trigger infrastructure is unverified. PGL-0-1 (trigger validation) is a hard prerequisite for everything else. If statement-level triggers with transition tables do not work in PGlite's single-user mode, the entire Strategy C approach fails and the PoC must pivot to a pure JS diff approach (lower value).

  3. PGlite version mismatch. PGlite tracks PostgreSQL 17; pg_trickle targets PG 18. The PoC operates at the SQL level and should be unaffected, but if PGlite upgrades to PG 18 mid-cycle, trigger behavior may change. Pin the minimum PGlite version in package.json.

  4. No core Rust changes, but version bump required. The main pg_trickle extension needs a v0.22.0 version bump, upgrade migration SQL, and passing CI even though no functional code changes. This is low-risk but must not be forgotten.

  5. ElectricSQL collaboration timing. UX-5 (outreach) should happen early — before v0.22.0 ships — to avoid building something ElectricSQL is already working on or would actively resist. If they signal interest in co-development, Phase 2 scope and timeline may shift.

  6. TypeScript delta SQL correctness is harder to prove than in Rust. The main extension uses property-based testing and SQLancer for correctness. The TS plugin lacks these tools. TEST-1 must be rigorously designed to compensate — consider porting the proptest approach to a JS property-testing library (e.g., fast-check).

v0.22.0 total: ~2–3 weeks (PGlite plugin) + ~1–2 days (PG extension version bump)

Exit criteria:

  • PGL-0-1: Statement-level triggers with transition tables confirmed working in PGlite
  • PGL-0-2: Delta SQL correct for single-table aggregate, two-table join, and filtered query
  • PGL-0-3: @pgtrickle/pglite-lite plugin creates and maintains stream tables in PGlite
  • PGL-0-4: npm package published with README and usage examples
  • PGL-0-5: Benchmark shows measurable latency improvement over live.incrementalQuery() for supported patterns
  • CORR-1: Automated delta SQL equivalence tests pass (100+ DML sequences per pattern)
  • CORR-2: NULL-key aggregate groups correctly created, updated, and removed
  • CORR-3: Multi-DML transaction produces correct net result
  • STAB-1: No orphaned triggers after dropStreamTable()
  • STAB-2: Unsupported SQL patterns produce clear, actionable errors
  • STAB-3: Create-drop-create cycle produces correct results
  • PERF-1: Delta refresh latency < 50% of live.incrementalQuery() at 10K rows
  • PERF-3: Delta advantage holds at 100K rows (< 10% of full re-evaluation latency)
  • SCAL-1: 3+ stream tables on same source produce correct results
  • UX-1: README examples run unmodified on fresh PGlite instance
  • UX-2: Supported patterns decision table published
  • UX-4: TypeScript type definitions ship with strict-mode compatibility
  • TEST-1: > 50 correctness test cases pass on PGlite latest
  • TEST-2: CI tests pass against PGlite N, N-1, N-2
  • TEST-5: Extension upgrade path tested (0.21.0 -> 0.22.0)
  • just check-version-sync passes

v0.23.0 — Core Extraction (pg_trickle_core)

Release Theme: This release surgically separates pg_trickle's "brain" — the DVM engine, operator delta SQL generation, query rewrite passes, and DAG computation — into a standalone Rust crate (pg_trickle_core) with zero pgrx dependency. The extraction touches ~51,000 lines of code across 30+ source files but produces zero user-visible behavior change: every existing test must pass unchanged. The payoff is threefold: the core crate compiles to WASM (enabling the PGlite extension in v0.24.0), pure-logic unit tests run without a PostgreSQL instance (10x faster CI), and the main extension gains a cleaner internal architecture. Approximately 500 unsafe blocks in the parser require an abstraction layer over raw pg_sys node traversal, making this the most technically demanding refactoring in the project's history.

See PLAN_PGLITE.md §5 Strategy A for the full extraction architecture.

Core Crate Extraction (Phase 1)

In plain terms: pg_trickle's "brain" — the code that analyses SQL queries, builds operator trees, and generates delta SQL — is currently tangled with pgrx (the Rust-to-PostgreSQL bridge). This milestone surgically separates the pure logic into its own crate so it can be compiled independently. The existing extension continues to work unchanged; it just imports from pg_trickle_core instead of having the code inline. A trait DatabaseBackend abstracts SPI and parser access so the core logic can be tested without a running PostgreSQL instance.

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| PGL-1-1 | Create pg_trickle_core crate. Workspace member with [lib] target, no pgrx dependency. Move OpTree, Expr, Column, AggExpr, and all shared types. | 1–2d | PLAN_PGLITE.md §5 Strategy A |
| PGL-1-2 | Extract operator delta SQL generation. Move all src/dvm/operators/ logic (~24K lines, 23 files) into the core crate. Each operator's generate_delta_sql() becomes a pure function taking abstract types. | 3–5d | PLAN_PGLITE.md §5 Strategy A |
| PGL-1-3 | Extract auto-rewrite passes. Move view inlining, DISTINCT ON rewrite, GROUPING SETS expansion, and SubLink extraction into pg_trickle_core::rewrites. | 2–3d | PLAN_PGLITE.md §5 Strategy A |
| PGL-1-4 | Extract DAG computation. Move dependency graph, topological sort, cycle detection, diamond detection into pg_trickle_core::dag. | 1–2d | PLAN_PGLITE.md §5 Strategy A |
| PGL-1-5 | Define trait DatabaseBackend. Abstract trait for SPI queries and raw_parser access. Implement for pgrx in the main extension crate. | 2–3d | PLAN_PGLITE.md §5 Strategy A |
| PGL-1-6 | WASM compilation gate. Verify pg_trickle_core compiles to wasm32-unknown-emscripten target. CI check for WASM build. | 1–2d | PLAN_PGLITE.md §5 Strategy A |
| PGL-1-7 | Existing test suite passes. All unit, integration, and E2E tests pass with the refactored crate structure. Zero behavior change. | 2–3d | |

Phase 1 subtotal: ~3–4 weeks

Correctness

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| CORR-1 | Delta SQL output byte-for-byte equivalence | M | P0 |
| CORR-2 | OpTree serialization round-trip fidelity | S | P0 |
| CORR-3 | Rewrite pass ordering preservation | S | P1 |
| CORR-4 | DAG cycle detection parity after extraction | S | P1 |

CORR-1 — Delta SQL output byte-for-byte equivalence

In plain terms: After the extraction, every operator's generate_delta_sql() must produce the exact same SQL string as it did before the refactoring. Any byte-level difference — even whitespace — indicates a semantic shift that could change query plans or correctness. Capture the SQL output for all 22 TPC-H stream tables before and after the extraction and assert bit-for-bit equality.

Verify: snapshot test comparing delta SQL for all TPC-H queries + the full E2E test suite. Any diff fails the build. Dependencies: PGL-1-2. Schema change: No.

CORR-2 — OpTree serialization round-trip fidelity

In plain terms: The OpTree types are moving to a new crate. If any field is accidentally dropped or retyped during the move, the delta SQL generator will silently produce wrong output. Add a round-trip test: serialize an OpTree to JSON, deserialize it back, and assert structural equality. This catches missing #[derive] attributes and field ordering issues.

Verify: proptest generating random OpTrees; serialize-deserialize round-trip produces identical trees. Dependencies: PGL-1-1. Schema change: No.

CORR-3 — Rewrite pass ordering preservation

In plain terms: The auto-rewrite passes (view inlining, DISTINCT ON, GROUPING SETS, SubLink extraction) must execute in the same order after extraction. Reordering could change the resulting OpTree and thereby the delta SQL. Add an integration test that runs all rewrite passes on a complex query (joining 3 tables with DISTINCT ON + GROUPING SETS) and asserts the final OpTree matches a golden snapshot.

Verify: golden-snapshot test for rewrite pass output on complex query. Dependencies: PGL-1-3. Schema change: No.

CORR-4 — DAG cycle detection parity after extraction

In plain terms: The cycle detection algorithm in dag.rs has subtleties around self-referencing views and diamond patterns. After moving to the core crate, the algorithm must detect the same cycles. Run the existing cycle-detection unit tests and add 3 new edge cases: self-referencing CTE, diamond with mixed IMMEDIATE/DIFFERENTIAL, and 4-level cascade.

Verify: all existing DAG unit tests pass + 3 new edge-case tests. Dependencies: PGL-1-4. Schema change: No.

Stability

| ID | Title | Effort | Priority |
| --- | --- | --- | --- |
| STAB-1 | pg_sys node abstraction layer (~500 unsafe blocks) | L | P0 |
| STAB-2 | Compile-time pgrx dependency leak detection | S | P0 |
| STAB-3 | Cargo workspace configuration correctness | S | P0 |
| STAB-4 | Extension upgrade path (0.22 to 0.23) | S | P0 |
| STAB-5 | Feature-flag isolation for WASM target | S | P1 |

STAB-1 — pg_sys node abstraction layer (~500 unsafe blocks)

In plain terms: rewrites.rs (118 unsafe blocks, 295 pg_sys refs) and sublinks.rs (367 unsafe blocks, 492 pg_sys refs) are the most deeply coupled to pgrx. The core crate cannot contain raw pg_sys calls. Define a trait NodeVisitor (or equivalent) that wraps pg_sys node traversal behind safe method calls. The pgrx backend implements the trait using actual pg_sys pointers; a mock backend can be used for unit tests. This is the single highest-effort item in the release.

Verify: zero pg_sys:: references in pg_trickle_core/; grep -r pg_sys pg_trickle_core/src/ returns empty. Dependencies: PGL-1-1, PGL-1-5. Schema change: No.
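A hypothetical shape for the abstraction: the core crate sees only a trait, the pgrx backend implements it over raw pg_sys pointers, and tests use a plain in-memory mock (node kinds, method names, and the mock are all illustrative assumptions):

```rust
// A tiny subset of node kinds for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
enum NodeKind {
    SelectStmt,
    JoinExpr,
    RangeVar,
}

// The core crate traverses parse trees only through this safe surface.
trait NodeVisitor {
    fn kind(&self) -> NodeKind;
    fn children(&self) -> Vec<&dyn NodeVisitor>;
}

// Mock backend for core-crate unit tests: no pg_sys, no unsafe.
struct MockNode {
    kind: NodeKind,
    children: Vec<MockNode>,
}

impl NodeVisitor for MockNode {
    fn kind(&self) -> NodeKind {
        self.kind
    }
    fn children(&self) -> Vec<&dyn NodeVisitor> {
        self.children.iter().map(|c| c as &dyn NodeVisitor).collect()
    }
}

// Safe traversal the core crate can use instead of unsafe pg_sys walks.
fn count_nodes(n: &dyn NodeVisitor) -> usize {
    1 + n.children().into_iter().map(count_nodes).sum::<usize>()
}

fn main() {
    let tree = MockNode {
        kind: NodeKind::SelectStmt,
        children: vec![MockNode {
            kind: NodeKind::JoinExpr,
            children: vec![
                MockNode { kind: NodeKind::RangeVar, children: vec![] },
                MockNode { kind: NodeKind::RangeVar, children: vec![] },
            ],
        }],
    };
    assert_eq!(count_nodes(&tree), 4);
    println!("traversal without pg_sys");
}
```

The pgrx backend would implement the same trait by dereferencing pg_sys node pointers inside its own unsafe blocks, keeping all unsafety on one side of the crate boundary.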

STAB-2 — Compile-time pgrx dependency leak detection

In plain terms: After extraction, any accidental use pgrx::* in the core crate would break the WASM build. Add a CI job that compiles pg_trickle_core in isolation (without the pgrx feature) and fails if any pgrx symbol is referenced. This catches leaks immediately rather than at WASM build time.

Verify: cargo build -p pg_trickle_core --no-default-features succeeds in CI. Dependencies: PGL-1-1. Schema change: No.

STAB-3 — Cargo workspace configuration correctness

In plain terms: Adding a workspace member changes Cargo.lock resolution, feature unification, and cargo pgrx behavior. Verify: cargo pgrx package still produces a valid .so, cargo test runs all workspace tests, and cargo pgrx test works for the extension crate. pgrx version must remain pinned at 0.17.x.

Verify: cargo pgrx package, cargo test --workspace, cargo pgrx test all succeed. Dependencies: PGL-1-1. Schema change: No.

STAB-4 — Extension upgrade path (0.22 to 0.23)

In plain terms: v0.23.0 makes no SQL-visible changes (same functions, same catalog schema), but the upgrade migration must still be tested. ALTER EXTENSION pg_trickle UPDATE from 0.22.0 to 0.23.0 must leave existing stream tables intact and refreshable.

Verify: upgrade E2E test confirms stream tables survive and refresh correctly after 0.22.0 -> 0.23.0.

STAB-5 — Feature-flag isolation for WASM target

In plain terms: The core crate must compile on both native and WASM. Any platform-specific code (e.g., std::time::Instant unavailable on wasm32-unknown-emscripten) must be gated behind #[cfg] attributes. Add a CI matrix entry for the WASM target that catches platform leaks.

Verify: cargo build --target wasm32-unknown-emscripten -p pg_trickle_core succeeds in CI. Dependencies: PGL-1-6. Schema change: No.
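A sketch of the gating pattern, assuming a clock helper is the only platform-specific call — the WASM fallback shown here is a stub, not the real implementation:

```rust
// Native build: use the system clock directly.
#[cfg(not(target_family = "wasm"))]
fn now_micros() -> u128 {
    use std::time::{SystemTime, UNIX_EPOCH};
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("clock before epoch")
        .as_micros()
}

// WASM build: in the real extension the host environment would supply the
// clock; this stub only demonstrates the #[cfg] split.
#[cfg(target_family = "wasm")]
fn now_micros() -> u128 {
    0
}

fn main() {
    // Exactly one definition exists per target, so both builds compile.
    println!("now = {} us", now_micros());
}
```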

Performance

| ID | Title | Effort | Priority |
| --- | --- | --- | --- |
| PERF-1 | Zero-overhead abstraction for DatabaseBackend | M | P0 |
| PERF-2 | Benchmark regression gate across extraction | S | P0 |
| PERF-3 | Core-only unit test speedup measurement | S | P1 |

PERF-1 — Zero-overhead abstraction for DatabaseBackend

In plain terms: The trait DatabaseBackend can be dispatched dynamically (dyn DatabaseBackend) or statically (generics). For the native extension, the abstraction must add zero measurable overhead. Use monomorphization (generics, not trait objects) for the hot path — delta SQL generation is called on every refresh cycle and must not regress. Measure with Criterion before/after on the diff_operators benchmark suite.

Verify: Criterion benchmark shows < 1% regression on diff_operators suite after extraction. Dependencies: PGL-1-5. Schema change: No.
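The static-vs-dynamic split can be sketched with a toy backend (the trait surface and method names here are assumptions about the eventual API):

```rust
trait DatabaseBackend {
    fn quote_ident(&self, ident: &str) -> String;
}

struct NativeBackend;

impl DatabaseBackend for NativeBackend {
    fn quote_ident(&self, ident: &str) -> String {
        format!("\"{ident}\"")
    }
}

// Hot path: generic over B, so every call monomorphizes and can inline —
// no vtable lookup per refresh cycle.
fn generate_delta_sql<B: DatabaseBackend>(backend: &B, table: &str) -> String {
    format!("DELETE FROM {} WHERE pgt_op = 'D'", backend.quote_ident(table))
}

// Cold path (setup, catalog work) can afford `dyn` dispatch for flexibility.
fn describe(backend: &dyn DatabaseBackend, table: &str) -> String {
    backend.quote_ident(table)
}

fn main() {
    let b = NativeBackend;
    assert_eq!(
        generate_delta_sql(&b, "orders"),
        "DELETE FROM \"orders\" WHERE pgt_op = 'D'"
    );
    assert_eq!(describe(&b, "orders"), "\"orders\"");
    println!("dispatch sketch ok");
}
```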

PERF-2 — Benchmark regression gate across extraction

In plain terms: The extraction touches 51K lines of code. Even without functional changes, module restructuring can alter inlining, cache locality, and link-time optimization. Run the full Criterion benchmark suite before and after and assert no regression > 5%.

Verify: scripts/criterion_regression_check.py passes with 5% threshold on all existing benchmarks. Dependencies: PGL-1-7. Schema change: No.

PERF-3 — Core-only unit test speedup measurement

In plain terms: One of the key benefits of extraction is that pg_trickle_core unit tests run without starting PostgreSQL. Measure the wall-clock time for cargo test -p pg_trickle_core vs the old in-tree unit tests. Document the speedup in the CHANGELOG — expect 5-10x faster CI for unit-level tests.

Verify: document test execution times before/after in PR description. Dependencies: PGL-1-7. Schema change: No.

Scalability

| ID | Title | Effort | Priority |
| --- | --- | --- | --- |
| SCAL-1 | Workspace build parallelism verification | S | P1 |
| SCAL-2 | Core crate binary size for WASM budget | S | P1 |
| SCAL-3 | Incremental compilation impact assessment | S | P2 |

SCAL-1 — Workspace build parallelism verification

In plain terms: With two crates, cargo build can compile pg_trickle_core in parallel with the extension crate's other dependencies (the extension itself still waits on core). Verify that the workspace DAG allows this parallelism and measure the incremental rebuild time for a change in pg_trickle_core only.

Verify: cargo build --timings shows parallel compilation of core crate. Dependencies: PGL-1-1. Schema change: No.

SCAL-2 — Core crate binary size for WASM budget

In plain terms: v0.24.0 targets < 2 MB WASM bundle. Measure the compiled size of pg_trickle_core for the WASM target now so the budget is known before Phase 2. If > 5 MB, investigate wasm-opt stripping and feature-gating large operator modules.

Verify: wasm32-unknown-emscripten build of pg_trickle_core produces < 5 MB unoptimized. Document size in tracking issue. Dependencies: PGL-1-6. Schema change: No.

SCAL-3 — Incremental compilation impact assessment

In plain terms: Splitting into two crates changes the incremental compilation boundary. A change in pg_trickle_core now forces a recompile of the extension crate. Measure incremental compile time for common edit patterns (add a test, modify an operator, change a rewrite pass) and ensure developer-experience compile times remain < 30s.

Verify: document incremental compile times for 3 edit patterns. Dependencies: PGL-1-1. Schema change: No.

Ease of Use

| ID | Title | Effort | Priority |
| --- | --- | --- | --- |
| UX-1 | Workspace-aware justfile targets | S | P0 |
| UX-2 | Developer guide for core crate contributions | S | P1 |
| UX-3 | ARCHITECTURE.md update for two-crate layout | S | P1 |

UX-1 — Workspace-aware justfile targets

In plain terms: Existing just targets (just test-unit, just lint, just fmt) must work seamlessly with the new workspace layout. Update the justfile so just test-unit runs both pg_trickle_core unit tests and extension unit tests. Add just test-core for core-only tests.

Verify: all existing just targets pass; just test-core runs core-only tests in < 5 seconds. Dependencies: PGL-1-1. Schema change: No.

UX-2 — Developer guide for core crate contributions

In plain terms: Contributors need to know the rules: what goes in pg_trickle_core (pure logic, no pgrx) vs the extension crate (SPI, FFI, SQL functions). Add a section to CONTRIBUTING.md explaining the crate boundary, the DatabaseBackend trait contract, and how to add a new operator to the core crate.

Verify: CONTRIBUTING.md updated with crate boundary rules. Dependencies: PGL-1-5. Schema change: No.

UX-3 — ARCHITECTURE.md update for two-crate layout

In plain terms: The module layout diagram in docs/ARCHITECTURE.md and AGENTS.md must reflect the new two-crate structure. Update both files so new contributors see the correct layout.

Verify: docs/ARCHITECTURE.md and AGENTS.md module diagrams show pg_trickle_core/ and pg_trickle/ crates. Dependencies: PGL-1-7. Schema change: No.

Test Coverage

| ID | Title | Effort | Priority |
| --- | --- | --- | --- |
| TEST-1 | Delta SQL snapshot tests for all 22 TPC-H queries | M | P0 |
| TEST-2 | Pure-Rust unit tests for extracted operators | L | P0 |
| TEST-3 | Mock DatabaseBackend for in-memory testing | M | P1 |
| TEST-4 | WASM build smoke test in CI | S | P0 |
| TEST-5 | Cargo deny / audit for new crate | XS | P0 |

TEST-1 — Delta SQL snapshot tests for all 22 TPC-H queries

In plain terms: Before extraction, capture the exact delta SQL output for each of the 22 TPC-H stream table definitions. After extraction, run the same generator and diff. Any change is a hard failure. This is the primary correctness gate for the refactoring.

Verify: cargo test -p pg_trickle_core -- snapshot passes with zero diffs. Dependencies: CORR-1. Schema change: No.

TEST-2 — Pure-Rust unit tests for extracted operators

In plain terms: The 23 operator files currently have ~1,700 unit tests that run inside cargo pgrx test (requires PostgreSQL). After extraction, all pure-logic tests should run via cargo test -p pg_trickle_core without a database. Tests that require SPI (e.g., catalog lookups) stay in the extension crate. Audit and migrate every test that can run without PostgreSQL.

Verify: > 80% of existing operator unit tests run in pg_trickle_core without PostgreSQL. Dependencies: PGL-1-2, TEST-3. Schema change: No.

TEST-3 — Mock DatabaseBackend for in-memory testing

In plain terms: For core crate tests that need to call the parser or SPI, provide a MockBackend that returns canned parse trees and query results. This allows testing the full pipeline (parse -> rewrite -> operator tree -> delta SQL) without PostgreSQL.

Verify: MockBackend supports at least: raw_parser() returning a canned OpTree, and spi_query() returning a canned result set. 10+ tests use it. Dependencies: PGL-1-5. Schema change: No.
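A minimal sketch of the mock, assuming a two-method trait surface (the real DatabaseBackend will expose more, and the canned-value types here are simplifications):

```rust
use std::collections::HashMap;

// Assumed trait surface; canned strings stand in for parse trees,
// and Vec<Vec<String>> stands in for SPI result sets.
trait DatabaseBackend {
    fn raw_parser(&self, sql: &str) -> Result<String, String>;
    fn spi_query(&self, sql: &str) -> Result<Vec<Vec<String>>, String>;
}

#[derive(Default)]
struct MockBackend {
    parses: HashMap<String, String>,
    results: HashMap<String, Vec<Vec<String>>>,
}

impl DatabaseBackend for MockBackend {
    fn raw_parser(&self, sql: &str) -> Result<String, String> {
        self.parses
            .get(sql)
            .cloned()
            .ok_or_else(|| format!("no canned parse tree for: {sql}"))
    }
    fn spi_query(&self, sql: &str) -> Result<Vec<Vec<String>>, String> {
        self.results
            .get(sql)
            .cloned()
            .ok_or_else(|| format!("no canned result for: {sql}"))
    }
}

fn main() {
    let mut mock = MockBackend::default();
    mock.parses.insert("SELECT 1".into(), "scan(const)".into());
    mock.results.insert("SELECT 1".into(), vec![vec!["1".into()]]);
    assert_eq!(mock.raw_parser("SELECT 1").unwrap(), "scan(const)");
    assert_eq!(mock.spi_query("SELECT 1").unwrap(), vec![vec!["1".to_string()]]);
    assert!(mock.raw_parser("SELECT 2").is_err()); // unregistered SQL errors out
    println!("mock backend ok");
}
```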

TEST-4 — WASM build smoke test in CI

In plain terms: Add a CI job that compiles pg_trickle_core to wasm32-unknown-emscripten on every PR. This catches platform-specific code leaks before they accumulate. The job does not need to run the WASM binary — just compile it.

Verify: CI job build-wasm passes on every PR targeting the core crate. Dependencies: PGL-1-6, STAB-5. Schema change: No.

TEST-5 — Cargo deny / audit for new crate

In plain terms: The new pg_trickle_core crate may introduce new transitive dependencies. Ensure cargo deny check and cargo audit cover the new crate and report no advisories.

Verify: cargo deny check and cargo audit pass for the full workspace. Dependencies: PGL-1-1. Schema change: No.

Conflicts & Risks

  1. STAB-1 is the critical path. The ~500 unsafe blocks in rewrites.rs and sublinks.rs require a NodeVisitor abstraction over raw pg_sys pointer traversal. This is the highest-effort, highest-risk item. If the abstraction proves too leaky (e.g., too many pg_sys node types to wrap), consider leaving rewrites.rs and sublinks.rs in the extension crate and extracting only operators + DAG + types to the core crate. This reduces v0.23.0 scope but still delivers the WASM-compilable operator engine for v0.24.0.

  2. PERF-1 must be validated before merging. Introducing a trait DatabaseBackend could add vtable dispatch overhead on the hot refresh path. Use monomorphization (generics) rather than dyn Trait for the extension-side implementation. If Criterion shows > 1% regression, investigate #[inline] annotations and LTO settings.

  3. No schema changes, but workspace restructuring can break cargo pgrx. The cargo-pgrx tool makes assumptions about workspace layout (e.g., expecting a single lib.rs entry point). Test cargo pgrx package, cargo pgrx test, and cargo pgrx run early. If cargo-pgrx 0.17.x cannot handle the workspace, consider upgrading to a newer pgrx that supports workspaces, or use a [patch] section in Cargo.toml.

  4. TEST-2 depends on TEST-3 (MockBackend). Pure-Rust operator tests need a way to feed canned parse trees. Build the MockBackend early so TEST-2 can proceed.

  5. WASM target may not be available in standard CI runners. The wasm32-unknown-emscripten target requires Emscripten SDK. Either install it in CI (adds ~2 min setup) or use a pre-built Docker image with the SDK. Budget for CI setup time.

  6. Extraction is all-or-nothing per module. Partially extracting a module (e.g., moving half of rewrites.rs) creates circular dependencies. Each module must move completely or stay. Plan the extraction order: types -> operators -> DAG -> diff -> rewrites -> sublinks.

v0.23.0 total: ~3–4 weeks (extraction) + ~1–2 weeks (abstraction layer + testing)

Exit criteria:

  • PGL-1-1: pg_trickle_core crate exists as a workspace member with zero pgrx dependencies
  • PGL-1-2: All operator delta SQL generation lives in the core crate
  • PGL-1-3: All auto-rewrite passes live in the core crate
  • PGL-1-4: DAG computation lives in the core crate
  • PGL-1-5: trait DatabaseBackend defined; pgrx implementation passes all existing tests
  • PGL-1-6: cargo build --target wasm32-unknown-emscripten -p pg_trickle_core succeeds
  • PGL-1-7: just test-all passes with zero regressions
  • CORR-1: Delta SQL snapshot tests pass for all 22 TPC-H queries (byte-for-byte match)
  • CORR-2: OpTree serialize-deserialize round-trip passes proptest
  • CORR-3: Rewrite pass ordering golden snapshot matches
  • CORR-4: DAG cycle detection passes with 3 new edge-case tests
  • STAB-1: Zero pg_sys:: references in pg_trickle_core/src/
  • STAB-2: cargo build -p pg_trickle_core --no-default-features passes in CI
  • STAB-3: cargo pgrx package and cargo pgrx test succeed with workspace layout
  • STAB-4: Extension upgrade path tested (0.22.0 -> 0.23.0)
  • STAB-5: WASM target builds in CI
  • PERF-1: Criterion shows < 1% regression on diff_operators benchmark
  • PERF-2: Full benchmark suite passes with < 5% regression threshold
  • TEST-1: TPC-H delta SQL snapshot tests pass
  • TEST-2: > 80% of operator unit tests run without PostgreSQL
  • TEST-3: MockBackend used by 10+ core crate tests
  • TEST-4: CI build-wasm job passes on every PR
  • TEST-5: cargo deny check and cargo audit pass for workspace
  • UX-1: All existing just targets pass; just test-core added
  • UX-3: ARCHITECTURE.md and AGENTS.md updated with two-crate layout
  • just check-version-sync passes

v0.24.0 — PGlite WASM Extension

Release Theme

This release delivers the first working PGlite extension — the moment pg_trickle's incremental view maintenance runs in the browser. By wrapping pg_trickle_core (extracted in v0.23.0) in a thin C/FFI shim and compiling to WASM via PGlite's Emscripten toolchain, we ship an npm package (@pgtrickle/pglite) that gives PGlite users the full DVM operator vocabulary — outer joins, window functions, subqueries, recursive CTEs — in IMMEDIATE mode. This dramatically exceeds pg_ivm's PGlite offering (INNER joins + basic aggregates only). The release also establishes the cross-platform correctness and performance baselines that all future PGlite work builds on.

See PLAN_PGLITE.md §5 Strategy A and §7 Phase 2 for the full architecture.

PGlite WASM Build (Phase 2)

In plain terms: This takes the pg_trickle_core crate extracted in v0.23.0 and wraps it in a thin C shim that PGlite's Emscripten-based extension build system can compile to WASM. The result is a PGlite extension package (@pgtrickle/pglite) that provides create_stream_table(), drop_stream_table(), and alter_stream_table() — all running IMMEDIATE mode inside the WASM PostgreSQL engine with the full DVM operator set.

| Item | Description | Effort | Ref |
| --- | --- | --- | --- |
| PGL-2-1 | C shim for PGlite. Thin C wrapper bridging PGlite's Emscripten environment to pg_trickle_core via Rust FFI. Handles raw_parser calls through PGlite's built-in PostgreSQL parser. | 1–2wk | PLAN_PGLITE.md §5 Strategy A |
| PGL-2-2 | DatabaseBackend for PGlite. Implement the trait for PGlite's single-connection SPI and built-in parser. Remove advisory lock acquisition (trivial in single-connection). | 3–5d | PLAN_PGLITE.md §5 Strategy A |
| PGL-2-3 | WASM bundle build. Integrate with PGlite's extension toolchain (postgres-pglite). Produce .tar.gz WASM bundle. Target bundle size < 2 MB. | 3–5d | PLAN_PGLITE.md §8 |
| PGL-2-4 | TypeScript wrapper. @pgtrickle/pglite npm package with PGlite plugin API. createStreamTable(), dropStreamTable(), alterStreamTable() with full IMMEDIATE mode support. | 2–3d | PLAN_PGLITE.md §7 Phase 2 |
| PGL-2-5 | IMMEDIATE mode E2E tests on PGlite. Verify inner joins, outer joins, aggregates, DISTINCT, UNION ALL, window functions, subqueries, CTEs (non-recursive + recursive), LATERAL, view inlining, DISTINCT ON, GROUPING SETS. | 1–2wk | PLAN_PGLITE.md §4.1 |
| PGL-2-6 | PG 17 vs PG 18 parse tree compatibility. PGlite tracks PG 17; pg_trickle targets PG 18. Audit and gate any node struct differences with conditional compilation. | 3–5d | PLAN_PGLITE.md §8 |

Phase 2 subtotal: ~5–7 weeks

Correctness

| ID | Title | Effort | Priority |
| --- | --- | --- | --- |
| CORR-1 | PG 17/18 parse tree node divergence audit | M | P0 |
| CORR-2 | Delta SQL cross-platform equivalence | M | P0 |
| CORR-3 | Advisory lock no-op safety proof | S | P1 |
| CORR-4 | IMMEDIATE trigger ordering in single-connection | S | P1 |

CORR-1 — PG 17/18 parse tree node divergence audit

In plain terms: PGlite embeds PostgreSQL 17's parser; pg_trickle's OpTree construction targets PostgreSQL 18 node structs. Any struct layout difference (added fields, renamed members, changed enum values) would cause the C shim to misinterpret parse trees, producing silently wrong delta SQL. Systematically diff the PG 17 and PG 18 parse tree headers (nodes/parsenodes.h, nodes/primnodes.h) and catalog every node type that pg_trickle traverses. Gate incompatible nodes behind #[cfg(pg17)] / #[cfg(pg18)] conditional compilation.

Verify: a CI job compiles pg_trickle_core against both PG 17 and PG 18 parse tree headers. A test generates OpTrees from the same SQL on both versions and asserts structural equality. Dependencies: PGL-2-6. Schema change: No.

CORR-2 — Delta SQL cross-platform equivalence

In plain terms: The same SQL view definition must produce the exact same delta SQL on native PostgreSQL 18 and PGlite (WASM + PG 17 parser). Any divergence means one platform gets wrong incremental results. Create a snapshot test suite that runs all 22 TPC-H stream table definitions through both the native and WASM DatabaseBackend implementations and asserts byte-for-byte identical delta SQL output.

Verify: snapshot comparison test passes for all 22 TPC-H queries on both platforms. Any diff is a hard failure. Dependencies: PGL-2-2, CORR-1. Schema change: No.

CORR-3 — Advisory lock no-op safety proof

In plain terms: The native extension uses pg_advisory_xact_lock() to prevent concurrent refresh of the same stream table. PGlite is single-connection — the lock acquisition is a no-op. Verify that removing the lock cannot cause re-entrancy (a trigger firing create_stream_table() from within a refresh) by auditing all SPI call paths from the PGlite DatabaseBackend for re-entrant calls.

Verify: code review + integration test that attempts re-entrant refresh from within a trigger. Must error cleanly, not corrupt state. Dependencies: PGL-2-2. Schema change: No.

CORR-4 — IMMEDIATE trigger ordering in single-connection

In plain terms: IMMEDIATE mode relies on AFTER triggers firing in a specific order when multiple source tables are modified in the same statement (e.g., a CTE with multiple INSERTs). Verify that PGlite's trigger execution order matches native PostgreSQL's for the trigger configurations pg_trickle creates.

Verify: integration test with multi-table CTE INSERT on PGlite; assert stream table state matches native. Dependencies: PGL-2-5. Schema change: No.

Stability

| ID | Title | Effort | Priority |
| --- | --- | --- | --- |
| STAB-1 | WASM heap OOM graceful degradation | M | P0 |
| STAB-2 | C shim panic/unwind boundary safety | S | P0 |
| STAB-3 | Extension load/unload lifecycle correctness | S | P0 |
| STAB-4 | Native extension upgrade path (0.23 → 0.24) | S | P0 |
| STAB-5 | npm package version synchronization | XS | P1 |

STAB-1 — WASM heap OOM graceful degradation

In plain terms: WASM environments have a finite heap (typically 256 MB in browsers, configurable in Node). A large stream table with many operators could exhaust WASM memory during OpTree construction or delta SQL generation. The extension must detect allocation failures and return a clear PostgreSQL error rather than crashing the WASM instance (which would kill all PGlite state). Implement a memory-aware allocator wrapper or check emscripten_get_heap_size() at entry points.

Verify: stress test creating stream tables over increasingly complex views until OOM; assert PGlite remains functional and returns an actionable error. Dependencies: PGL-2-1. Schema change: No.
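A sketch of the entry-point guard, assuming the embedder can report used and total heap bytes (for example via Emscripten's heap queries); the 80% threshold matches the capacity guard described under SCAL-1:

```rust
// Refuse new work near the heap ceiling so work already in flight can
// finish, instead of letting an allocation failure trap the WASM instance.
fn check_heap_budget(used_bytes: usize, total_bytes: usize) -> Result<(), String> {
    if used_bytes.saturating_mul(10) >= total_bytes.saturating_mul(8) {
        Err(format!(
            "pg_trickle: WASM heap at {used_bytes}/{total_bytes} bytes; \
             drop unused stream tables or increase the heap"
        ))
    } else {
        Ok(())
    }
}

fn main() {
    let total = 256 * 1024 * 1024; // typical browser heap
    assert!(check_heap_budget(100 * 1024 * 1024, total).is_ok());
    assert!(check_heap_budget(210 * 1024 * 1024, total).is_err()); // ~82%
    println!("heap guard ok");
}
```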

STAB-2 — C shim panic/unwind boundary safety

In plain terms: Rust panics must not cross the FFI boundary into C. The C shim must catch panics via std::panic::catch_unwind() and convert them to PostgreSQL ereport(ERROR) calls. Any uncaught panic in WASM would abort the entire PGlite instance. Audit every #[no_mangle] extern "C" entry point in the shim for panic safety.

Verify: test that triggers a panic path (e.g., invalid SQL) from TypeScript; assert PGlite returns a SQL error, not a WASM trap. Dependencies: PGL-2-1. Schema change: No.
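A sketch of one panic-safe entry point (function name and error codes are assumptions; the real shim would also carry the panic message across for ereport):

```rust
use std::panic;

// In the real shim this would be #[unsafe(no_mangle)] pub extern "C" and
// called from the C side; a panic becomes an error code the C wrapper
// turns into ereport(ERROR), never an unwind across the FFI boundary.
pub extern "C" fn pgt_refresh(fail: bool) -> i32 {
    let outcome = panic::catch_unwind(|| {
        if fail {
            panic!("simulated internal error");
        }
        0 // success
    });
    match outcome {
        Ok(code) => code,
        Err(_) => -1, // mapped to a SQL error on the C side
    }
}

fn main() {
    panic::set_hook(Box::new(|_| {})); // keep the demo's output quiet
    assert_eq!(pgt_refresh(false), 0);
    assert_eq!(pgt_refresh(true), -1);
    println!("panic contained");
}
```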

STAB-3 — Extension load/unload lifecycle correctness

In plain terms: PGlite extensions can be loaded and unloaded. The C shim must free all Rust-allocated memory on unload and not leave dangling pointers or leaked state. Test the full lifecycle: load extension → create stream tables → drop stream tables → unload extension → reload extension → create new stream tables.

Verify: lifecycle test with memory profiling shows zero leaked allocations after unload/reload cycle. Dependencies: PGL-2-1, PGL-2-4. Schema change: No.

STAB-4 — Native extension upgrade path (0.23 → 0.24)

In plain terms: v0.24.0 adds PGlite support but makes no SQL-visible changes to the native extension. The upgrade migration from 0.23.0 to 0.24.0 must leave existing stream tables intact and refreshable.

Verify: upgrade E2E test confirms stream tables survive and refresh correctly after 0.23.0 -> 0.24.0.

STAB-5 — npm package version synchronization

In plain terms: The @pgtrickle/pglite npm package version must match the extension version (0.24.0). Add a CI check that verifies package.json version matches pg_trickle.control version, similar to the existing just check-version-sync target.

Verify: just check-version-sync also validates npm package version. Dependencies: PGL-2-4. Schema change: No.

Performance

| ID | Title | Effort | Priority |
| --- | --- | --- | --- |
| PERF-1 | WASM vs native refresh latency benchmark | M | P0 |
| PERF-2 | WASM bundle size optimization (< 2 MB target) | M | P0 |
| PERF-3 | PGlite cold-start extension load time | S | P1 |

PERF-1 — WASM vs native refresh latency benchmark

In plain terms: WASM is expected to be 1.5–3× slower than native (per PLAN_PGLITE.md §8). Quantify the actual overhead by benchmarking IMMEDIATE-mode refresh on both platforms using the same schema + data. The overhead must stay below the threshold where IMMEDIATE mode is still faster than full re-evaluation — otherwise PGlite users would be better off just re-running the query. Establish a Criterion-like benchmark suite for PGlite (potentially using Node.js + @electric-sql/pglite).

Verify: benchmark report showing WASM refresh latency for 5 representative stream tables (scan, join, aggregate, window, recursive CTE). Document native-to-WASM overhead ratio. Dependencies: PGL-2-5. Schema change: No.

PERF-2 — WASM bundle size optimization (< 2 MB target)

In plain terms: The WASM bundle must be < 2 MB for acceptable download times in browser environments (PostGIS is 8.2 MB, pgcrypto is 1.1 MB — pg_trickle should be closer to pgcrypto). Apply wasm-opt -Oz, LTO, codegen-units = 1, strip debug info, and feature-gate large operator modules (e.g., recursive CTE, window functions) behind optional features if needed to meet the target.

Verify: CI job measures WASM bundle size after wasm-opt and fails if > 2 MB. Document size breakdown by operator module. Dependencies: PGL-2-3. Schema change: No.

PERF-3 — PGlite cold-start extension load time

In plain terms: The first CREATE EXTENSION pg_trickle in a PGlite session compiles and loads the WASM module. This must complete in < 500 ms in a browser and < 200 ms in Node.js. Measure and optimize by using streaming WASM compilation (WebAssembly.compileStreaming()) and ensuring the extension _PG_init() function does minimal work.

Verify: benchmark measuring time from CREATE EXTENSION to first create_stream_table() on fresh PGlite instance. Document cold-start time. Dependencies: PGL-2-1, PGL-2-3. Schema change: No.

Scalability

| ID | Title | Effort | Priority |
| --- | --- | --- | --- |
| SCAL-1 | Stream table count ceiling in WASM | S | P1 |
| SCAL-2 | Wide-table OpTree memory footprint | S | P1 |
| SCAL-3 | Dataset size practical limit for IMMEDIATE mode | S | P2 |

SCAL-1 — Stream table count ceiling in WASM

In plain terms: Each stream table consumes memory for its OpTree, delta SQL templates, and trigger metadata. In native PostgreSQL with gigabytes of RAM this is trivial, but in a 256 MB WASM heap it matters. Determine the practical limit by creating stream tables in a loop until OOM, then document the ceiling and add a guard that errors at 80% capacity with an actionable message.

Verify: stress test documents the ceiling (e.g., "~200 stream tables with average 3-table join in 256 MB heap"). Guard errors at 80%. Dependencies: STAB-1. Schema change: No.

SCAL-2 — Wide-table OpTree memory footprint

In plain terms: A stream table over a 100-column source table produces a large OpTree and long delta SQL strings. Profile the memory consumption of OpTree construction for wide tables and ensure it fits within the WASM heap budget alongside typical stream table counts.

Verify: profile OpTree allocation for 10, 50, 100-column source tables. Document memory per stream table as a function of column count. Dependencies: PGL-2-5. Schema change: No.

SCAL-3 — Dataset size practical limit for IMMEDIATE mode

In plain terms: IMMEDIATE mode fires triggers on every DML, so overhead scales with write frequency. In a WASM environment with ~2× slower execution, determine at what dataset size (rows × columns × writes/second) IMMEDIATE mode becomes impractical. Document the breakpoint so PGlite users know when their use case has outgrown the browser and should migrate to native pg_trickle with DIFFERENTIAL mode.

Verify: benchmark with increasing write rates; document the throughput ceiling (e.g., "> 10K rows/sec INSERT rate degrades stream table latency past 100 ms"). Dependencies: PERF-1. Schema change: No.

Ease of Use

| ID | Title | Effort | Priority |
| --- | --- | --- | --- |
| UX-1 | TypeScript API ergonomics and type safety | S | P0 |
| UX-2 | PGlite getting-started guide | M | P0 |
| UX-3 | WASM-context error message quality | S | P1 |
| UX-4 | npm package README with runnable examples | S | P1 |

UX-1 — TypeScript API ergonomics and type safety

In plain terms: The @pgtrickle/pglite TypeScript API must follow PGlite plugin conventions (PGlitePlugin interface, init() lifecycle). All methods must be fully typed — no any types. The API surface must be minimal: createStreamTable(sql), dropStreamTable(name), alterStreamTable(name, sql), listStreamTables(), and refreshStreamTable(name). Review against existing PGlite plugins (@electric-sql/pglite-repl, pglite-vector) for consistency.

Verify: TypeScript strict mode compilation with no errors. API review against PGlite plugin conventions checklist. Dependencies: PGL-2-4. Schema change: No.

UX-2 — PGlite getting-started guide

In plain terms: A docs/tutorials/PGLITE_QUICKSTART.md guide walking a user from npm install to a working React app with live stream tables in < 10 minutes. Include: install, create PGlite instance with extension, define source table + stream table, insert data, observe stream table update. Provide a CodeSandbox / StackBlitz link for zero-install try-it-now experience.

Verify: a new developer can follow the guide and see a working stream table in PGlite in a browser within 10 minutes. Dependencies: PGL-2-4, UX-1. Schema change: No.

UX-3 — WASM-context error message quality

In plain terms: Error messages from the Rust/C shim must be JavaScript-friendly: no raw pg_sys error codes, no memory addresses. Every error must include the stream table name, the failing SQL fragment, and a remediation hint. Unsupported features (DIFFERENTIAL mode, scheduled refresh, parallel workers) must error with "Not supported in PGlite: <feature>. Use IMMEDIATE mode." rather than cryptic internal errors.

Verify: audit all error paths in the C shim + PGlite DatabaseBackend. Every error message includes table name + remediation hint. Dependencies: PGL-2-1, PGL-2-2. Schema change: No.
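A minimal sketch of the message template (the helper name and exact wording are assumptions, chosen to carry the table name, the feature, and the remediation hint):

```rust
// Every unsupported-feature path funnels through one formatter so the
// message shape stays consistent across the shim.
fn unsupported_in_pglite(table: &str, feature: &str) -> String {
    format!(
        "stream table \"{table}\": Not supported in PGlite: {feature}. Use IMMEDIATE mode."
    )
}

fn main() {
    let msg = unsupported_in_pglite("daily_totals", "DIFFERENTIAL mode");
    assert!(msg.contains("daily_totals")); // table name present
    assert!(msg.contains("Use IMMEDIATE mode")); // remediation hint present
    println!("{msg}");
}
```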

UX-4 — npm package README with runnable examples

In plain terms: The npm package must have a README with: badge for PGlite compatibility, install command, 3 runnable examples (basic aggregate, join, window function), API reference, link to the full PGlite quickstart guide, and a "Limitations vs native pg_trickle" section clearly stating: no DIFFERENTIAL mode, no scheduled refresh, no parallel workers, PG 17 parser only.

Verify: README renders correctly on npmjs.com; examples are copy-pasteable into a Node.js REPL. Dependencies: PGL-2-4, UX-2. Schema change: No.

Test Coverage

| ID | Title | Effort | Priority |
| --- | --- | --- | --- |
| TEST-1 | Full DVM operator E2E suite on PGlite | L | P0 |
| TEST-2 | PG 17/18 parse tree compatibility tests | M | P0 |
| TEST-3 | WASM memory stress tests | M | P1 |
| TEST-4 | TypeScript integration tests | M | P0 |
| TEST-5 | Bundle size regression gate in CI | S | P0 |

TEST-1 — Full DVM operator E2E suite on PGlite

In plain terms: Run every DVM operator (23 operators across inner join, outer join, full join, semi-join, anti-join, aggregate, distinct, union/intersect/except, subquery, scalar subquery, CTE scan, recursive CTE, lateral function, lateral subquery, window function, scan, filter, project) through IMMEDIATE mode in PGlite. This is the primary correctness gate for the WASM extension. Use a Node.js test harness with @electric-sql/pglite to run the tests headlessly.

Verify: test suite with ≥ 1 test per operator (23+ tests) passes in CI using PGlite Node.js. Test matrix: INSERT, UPDATE, DELETE for each operator. Dependencies: PGL-2-5. Schema change: No.

TEST-2 — PG 17/18 parse tree compatibility tests

In plain terms: For every parse tree node type that pg_trickle traverses, generate a test query that exercises that node, parse it on both PG 17 (PGlite) and PG 18 (native), and assert that the resulting OpTree is structurally identical. This catches version-specific divergences before they reach users.

Verify: compatibility test suite covers all node types referenced in pg_trickle_core. Any divergence is a hard failure with clear diagnostic. Dependencies: CORR-1. Schema change: No.

TEST-3 — WASM memory stress tests

In plain terms: Create increasing numbers of stream tables with increasing complexity until OOM. Verify that: (a) the guard from SCAL-1 fires at 80% capacity, (b) PGlite remains functional after the guard fires, (c) dropping stream tables actually frees memory. Run under different heap sizes (64 MB, 128 MB, 256 MB) to validate the guard thresholds.

Verify: stress test with 3 heap sizes completes without WASM trap. Guard fires at documented threshold. Memory reclaimed after DROP. Dependencies: STAB-1, SCAL-1. Schema change: No.

TEST-4 — TypeScript integration tests

In plain terms: Test the @pgtrickle/pglite TypeScript API end-to-end using Jest or Vitest in Node.js. Cover: create/drop/alter stream table, error handling (invalid SQL, unsupported features), plugin lifecycle (init/cleanup), and concurrent operations on different stream tables. Run as part of CI on every PR that touches pg_trickle_pglite/.

Verify: ≥ 20 TypeScript integration tests pass in CI. Test coverage report for the TypeScript wrapper shows > 90% line coverage. Dependencies: PGL-2-4, UX-1. Schema change: No.

TEST-5 — Bundle size regression gate in CI

In plain terms: Add a CI job that builds the WASM bundle, runs wasm-opt, measures the final .wasm file size, and fails if it exceeds 2 MB. Store the current size as a baseline and alert on any increase > 10%. This prevents bundle bloat as features are added.

Verify: CI job check-wasm-size runs on every PR touching pg_trickle_core/ or pg_trickle_pglite/. Fails at > 2 MB. Dependencies: PGL-2-3, PERF-2. Schema change: No.

Conflicts & Risks

  1. CORR-1 (PG 17/18 parse tree compatibility) is the highest risk. PGlite embeds PG 17; pg_trickle targets PG 18. If node struct layouts diverged significantly between versions (e.g., JoinExpr gained a field, RangeTblEntry changed a flag), the C shim must handle both layouts via conditional compilation. In the worst case, some operators may need version-specific code paths. Start this audit early — it blocks PGL-2-1 and PGL-2-2.

  2. PERF-2 (bundle size < 2 MB) may conflict with full operator coverage. If the 23-operator delta SQL generator compiles to > 2 MB, we may need to feature-gate rarely-used operators (recursive CTE, GROUPING SETS) behind cargo features. This would reduce the "full DVM vocabulary" claim and require documenting which operators are available by default. Measure early with a minimal build to establish baseline.

  3. PGlite's Emscripten toolchain is a moving target. PGlite's extension build system (postgres-pglite) is not yet stable. Breaking changes in the toolchain could block PGL-2-3. Pin the PGlite version and track upstream releases. Have a fallback plan: manual Emscripten compilation without the PGlite toolchain.

  4. STAB-2 (panic boundary) and STAB-1 (OOM handling) interact. A Rust OOM in WASM triggers a panic, which must not cross the FFI boundary. Both items must be implemented together: the OOM guard (STAB-1) sets a pre-panic threshold, and the catch_unwind wrapper (STAB-2) is the last-resort safety net.

  5. No prior C FFI in the codebase. The only C code is scripts/pg_stub.c (test helper). The C shim (PGL-2-1) introduces a new language and toolchain requirement. Ensure the C code is minimal (< 500 lines), well-documented, and covered by the TypeScript integration tests.

  6. TEST-1 and TEST-4 require a PGlite-based CI runner. Need Node.js 18+ with @electric-sql/pglite in CI. This is a new CI dependency. Add it to the existing CI matrix as a separate job that only runs when pg_trickle_pglite/ or pg_trickle_core/ files are modified.

v0.24.0 total: ~5–7 weeks (WASM build) + ~2–3 weeks (testing + polish)

Exit criteria:

  • PGL-2-1: C shim compiles and links against PGlite's WASM PostgreSQL headers
  • PGL-2-2: PGlite DatabaseBackend passes all IMMEDIATE-mode operator tests
  • PGL-2-3: WASM bundle size < 2 MB after wasm-opt
  • PGL-2-4: @pgtrickle/pglite npm package published to npmjs.com
  • PGL-2-5: All 23 DVM operators pass E2E tests on PGlite
  • PGL-2-6: PG 17 parse tree differences documented and handled with #[cfg]
  • CORR-1: PG 17/18 parse tree audit complete; compatibility tests pass
  • CORR-2: Delta SQL cross-platform snapshot tests pass for all 22 TPC-H queries
  • CORR-3: Re-entrant refresh test passes on PGlite
  • CORR-4: Multi-table CTE trigger ordering matches native
  • STAB-1: OOM stress test: PGlite survives with actionable error
  • STAB-2: Panic from invalid SQL returns SQL error, not WASM trap
  • STAB-3: Load/unload/reload lifecycle test: zero leaked allocations
  • STAB-4: Extension upgrade path tested (0.23.0 -> 0.24.0)
  • PERF-1: WASM vs native benchmark report published (≤ 3× overhead)
  • PERF-2: WASM bundle ≤ 2 MB (CI gated)
  • PERF-3: Cold-start load time < 500 ms browser, < 200 ms Node.js
  • TEST-1: ≥ 23 operator E2E tests pass on PGlite in CI
  • TEST-2: Parse tree compatibility tests cover all traversed node types
  • TEST-3: Memory stress tests pass under 64/128/256 MB heap sizes
  • TEST-4: ≥ 20 TypeScript integration tests with > 90% line coverage
  • TEST-5: CI check-wasm-size job passes on every PR
  • UX-1: TypeScript strict mode compilation: zero errors
  • UX-2: PGlite getting-started guide published with CodeSandbox link
  • UX-4: npm README renders correctly on npmjs.com
  • just check-version-sync passes (incl. npm package version)

v0.25.0 — PGlite Reactive Integration

Release Theme: This release completes the PGlite story by bridging the gap between database-side incremental view maintenance and front-end UI reactivity. By connecting stream table deltas to PGlite's live.changes() API and providing framework-specific hooks (useStreamTable() for React and Vue), pg_trickle becomes the first IVM engine to offer truly reactive UI bindings — where DOM updates are proportional to changed rows, not result set size. This is the local-first developer's final mile: from INSERT to re-render in single-digit milliseconds, with no polling, no diffing, and no full query re-execution.

See PLAN_PGLITE.md §7 Phase 3 for the full reactive integration design.

Reactive Bindings (Phase 3)

In plain terms: Phase 2 gave PGlite users in-engine IVM. This phase connects stream table changes to PGlite's live.changes() API and provides framework-specific bindings — a useStreamTable() hook for React and an equivalent useStreamTable() composable for Vue — so UI components automatically re-render when the underlying data changes. For local-first apps like collaborative editors, dashboards, and offline-capable tools, this is the last mile between incremental SQL and reactive UI.

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| PGL-3-1 | live.changes() bridge. Emit INSERT/UPDATE/DELETE change events from stream table delta application to PGlite's live query system. Keyed by __pgt_row_id. | 3–5d | PLAN_PGLITE.md §7 Phase 3 |
| PGL-3-2 | React hooks. useStreamTable(query) hook that subscribes to stream table changes and returns reactive state. Handles mount/unmount lifecycle. | 3–5d | |
| PGL-3-3 | Vue composable. useStreamTable(query) composable with equivalent functionality. | 2–3d | |
| PGL-3-4 | Documentation and examples. Local-first app patterns: collaborative todo list, real-time dashboard, offline-first inventory tracker. Published as @pgtrickle/pglite docs. | 2–3d | |
| PGL-3-5 | Performance benchmarks. End-to-end latency from INSERT to React re-render. Compare against live.incrementalQuery() for complex queries (3-table join + aggregate). | 1–2d | |

Phase 3 subtotal: ~2–3 weeks

Correctness

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| CORR-1 | Change event fidelity vs stream table state | M | P0 |
| CORR-2 | Multi-row DML atomicity in reactive stream | S | P0 |
| CORR-3 | Hook state consistency after rapid mutations | M | P1 |
| CORR-4 | DELETE/re-INSERT identity stability | S | P1 |

CORR-1 — Change event fidelity vs stream table state

In plain terms: The live.changes() bridge emits INSERT/UPDATE/DELETE events derived from the IMMEDIATE mode delta application. If an event is missed, duplicated, or misclassified (e.g., an UPDATE emitted as DELETE + INSERT), the React/Vue state will diverge from the actual stream table contents. For every DML operation on every DVM operator type, assert that the sequence of change events, when applied to an empty accumulator, produces a set identical to SELECT * FROM stream_table.

Verify: integration test replaying 1,000 random DML operations across all operator types; final accumulator state matches SELECT *. Any divergence is a hard failure. Dependencies: PGL-3-1. Schema change: No.
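The replay check can be sketched framework-free: apply the bridge's event stream to an accumulator keyed by __pgt_row_id and compare the result against a table snapshot. A minimal sketch — the ChangeEvent shape and function names here are assumptions for illustration, not the bridge's actual event format:

```typescript
// Hypothetical shape of a bridge change event (field names are assumptions).
type ChangeEvent =
  | { kind: "insert"; rowId: number; row: Record<string, unknown> }
  | { kind: "update"; rowId: number; row: Record<string, unknown> }
  | { kind: "delete"; rowId: number };

// Replay a stream of change events into an accumulator keyed by __pgt_row_id.
// Misclassified events (e.g. an UPDATE for a row never inserted) fail loudly.
function replay(events: ChangeEvent[]): Map<number, Record<string, unknown>> {
  const acc = new Map<number, Record<string, unknown>>();
  for (const ev of events) {
    switch (ev.kind) {
      case "insert":
        if (acc.has(ev.rowId)) throw new Error(`duplicate INSERT for row ${ev.rowId}`);
        acc.set(ev.rowId, ev.row);
        break;
      case "update":
        if (!acc.has(ev.rowId)) throw new Error(`UPDATE for unknown row ${ev.rowId}`);
        acc.set(ev.rowId, ev.row);
        break;
      case "delete":
        if (!acc.delete(ev.rowId)) throw new Error(`DELETE for unknown row ${ev.rowId}`);
        break;
    }
  }
  return acc;
}

// The fidelity assertion: replayed state must equal SELECT * FROM stream_table.
function matchesSnapshot(
  acc: Map<number, Record<string, unknown>>,
  snapshot: Array<{ rowId: number; row: Record<string, unknown> }>,
): boolean {
  if (acc.size !== snapshot.length) return false;
  return snapshot.every(
    (s) => JSON.stringify(acc.get(s.rowId)) === JSON.stringify(s.row),
  );
}
```

The strict error on duplicate or unknown row ids is what distinguishes this from a tolerant merge: any missed, duplicated, or misclassified event surfaces as a hard failure rather than silent divergence.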

CORR-2 — Multi-row DML atomicity in reactive stream

In plain terms: A single INSERT INTO source SELECT ... FROM generate_series(1, 100) inserts 100 rows and triggers IMMEDIATE mode delta application. The live.changes() bridge must emit all 100 change events as a single batch — not trickle them one-by-one — so that React performs a single re-render, not 100. If events leak across batch boundaries, the UI shows intermediate states that never existed in the database.

Verify: test with 100-row INSERT; assert useStreamTable() callback fires exactly once with all 100 rows. Intermediate renders counted via React profiler must be ≤ 1. Dependencies: PGL-3-1, PGL-3-2. Schema change: No.

CORR-3 — Hook state consistency after rapid mutations

In plain terms: If a user performs INSERT → DELETE → INSERT on the same row within 10 ms (e.g., optimistic UI with undo), the hook must resolve to the correct final state. Race conditions between the live.changes() event stream and React's asynchronous render cycle could show stale data. The hook must use a monotonic sequence number (from the bridge's event stream) to discard stale updates.

Verify: stress test with 50 rapid mutations on the same row at 1 ms intervals; final hook state matches SELECT *. Test on both React 18 (concurrent mode) and React 19. Dependencies: PGL-3-1, PGL-3-2. Schema change: No.
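The monotonic-sequence guard described above can be sketched as a tiny store that discards any snapshot older than the one it already holds. This is an illustrative sketch (class and method names are assumptions), independent of React:

```typescript
// Stale-update guard: each bridge event carries a monotonically increasing
// sequence number; the hook's backing store ignores anything out of order.
class SequencedStore<T> {
  private lastSeq = -1;
  private snapshot: T | undefined;

  // Returns true if the update was applied, false if discarded as stale.
  apply(seq: number, next: T): boolean {
    if (seq <= this.lastSeq) return false; // stale or duplicate: drop it
    this.lastSeq = seq;
    this.snapshot = next;
    return true;
  }

  current(): T | undefined {
    return this.snapshot;
  }
}
```

A hook built on this store can safely race against React's asynchronous render cycle: even if renders are delayed or reordered, the store only ever moves forward.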

CORR-4 — DELETE/re-INSERT identity stability

In plain terms: When a row is deleted and a new row with the same PK is inserted, the __pgt_row_id changes but the PK doesn't. The change bridge must emit a DELETE for the old __pgt_row_id and an INSERT for the new one — not an UPDATE — so that React's reconciler correctly unmounts and remounts the component (not just re-renders it). Wrong identity semantics cause stale closures and event handler leaks.

Verify: test DELETE + INSERT with same PK; verify React component lifecycle (unmount + mount, not just update). Use React DevTools profiler. Dependencies: PGL-3-1, PGL-3-2. Schema change: No.
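The identity rule can be expressed as a pure classification function: changes are keyed on __pgt_row_id, never the user-visible PK, so a DELETE + re-INSERT with the same PK is never collapsed into an UPDATE. A sketch under those assumptions (the Row shape is illustrative):

```typescript
// Rows carry both the stable internal identity and the user-visible PK.
type Row = { pgtRowId: number; pk: number };

// Classify a before/after pair into change events, keyed on __pgt_row_id.
function classify(
  before: Row | undefined,
  after: Row | undefined,
): Array<{ op: "insert" | "update" | "delete"; pgtRowId: number }> {
  if (before && after) {
    // Same stable identity → genuine UPDATE; new identity → DELETE + INSERT,
    // so React unmounts the old component instead of merely re-rendering it.
    return before.pgtRowId === after.pgtRowId
      ? [{ op: "update", pgtRowId: after.pgtRowId }]
      : [
          { op: "delete", pgtRowId: before.pgtRowId },
          { op: "insert", pgtRowId: after.pgtRowId },
        ];
  }
  if (before) return [{ op: "delete", pgtRowId: before.pgtRowId }];
  if (after) return [{ op: "insert", pgtRowId: after.pgtRowId }];
  return [];
}
```

Using the emitted pgtRowId as the React list key is what makes the reconciler honor these semantics.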

Stability

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| STAB-1 | Memory leak prevention in long-lived hooks | M | P0 |
| STAB-2 | Subscription cleanup on component unmount | S | P0 |
| STAB-3 | Error boundary integration for hook failures | S | P0 |
| STAB-4 | Native extension upgrade path (0.24 → 0.25) | S | P0 |
| STAB-5 | Framework version compatibility matrix | S | P1 |

STAB-1 — Memory leak prevention in long-lived hooks

In plain terms: A useStreamTable() hook in a long-lived component (e.g., a dashboard that runs for hours) accumulates change events via the live.changes() subscription. If the bridge or hook retains references to processed events, memory grows unboundedly. Implement a bounded event buffer (configurable, default 1,000 events) that discards processed events after they are applied to the hook's state snapshot. After the buffer fills, old entries are garbage-collected.

Verify: 4-hour soak test with continuous 1 row/sec mutations. Heap snapshot at 1h and 4h shows < 10% growth. No detached DOM nodes or leaked closures. Dependencies: PGL-3-1, PGL-3-2. Schema change: No.
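A bounded buffer of the kind described is a few lines of TypeScript. This is a minimal sketch (class name and API are assumptions; the 1,000 default mirrors the figure above):

```typescript
// Bounded event buffer: retains at most `capacity` processed events so a
// long-lived subscription cannot grow the heap without bound.
class BoundedBuffer<T> {
  private items: T[] = [];
  constructor(private readonly capacity = 1000) {}

  push(item: T): void {
    this.items.push(item);
    // Discard the oldest entries once the cap is exceeded, so references
    // to already-applied events become garbage-collectable.
    if (this.items.length > this.capacity) {
      this.items.splice(0, this.items.length - this.capacity);
    }
  }

  get size(): number {
    return this.items.length;
  }

  get oldest(): T | undefined {
    return this.items[0];
  }
}
```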

STAB-2 — Subscription cleanup on component unmount

In plain terms: When a React component using useStreamTable() is unmounted (e.g., route change), the live.changes() subscription must be cancelled immediately. Failing to clean up causes: (a) memory leaks from the change listener, (b) "setState on unmounted component" warnings, (c) stale event processing after the component is gone. Use useEffect() cleanup function with an AbortController pattern.

Verify: mount/unmount cycle test (100 cycles); zero console warnings, zero leaked subscriptions (verified via PGlite connection subscription count). Dependencies: PGL-3-2. Schema change: No.
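The cleanup pattern can be shown framework-free: a subscription stops delivering events the moment its AbortSignal fires. In the real hook, the controller would be created inside the useEffect body with controller.abort() returned as the cleanup function; subscribeChanges below is a hypothetical stand-in for the live.changes() bridge, not PGlite's actual API:

```typescript
// Subscription whose delivery is gated on an AbortSignal. After abort(),
// further events are silently dropped — no setState on a dead component.
function subscribeChanges(
  onEvent: (ev: string) => void,
  signal: AbortSignal,
): { emit: (ev: string) => void } {
  let active = !signal.aborted;
  signal.addEventListener("abort", () => { active = false; }, { once: true });
  return {
    // The bridge calls emit() for each change event.
    emit: (ev) => { if (active) onEvent(ev); },
  };
}
```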

STAB-3 — Error boundary integration for hook failures

In plain terms: If the live.changes() bridge throws (e.g., stream table was dropped while the hook is active), the hook must propagate the error to React's error boundary / Vue's onErrorCaptured — not swallow it silently or crash the app. Provide an onError callback option and a default that throws to the nearest error boundary.

Verify: test dropping a stream table while useStreamTable() is active; assert error boundary catches the error with an actionable message. Dependencies: PGL-3-2, PGL-3-3. Schema change: No.

STAB-4 — Native extension upgrade path (0.24 → 0.25)

In plain terms: v0.25.0 adds reactive bindings at the TypeScript/npm layer only. The native PostgreSQL extension and PGlite WASM extension must continue to work unchanged. The upgrade migration from 0.24.0 to 0.25.0 must leave existing stream tables and the @pgtrickle/pglite WASM extension intact.

Verify: upgrade E2E test confirms stream tables survive and refresh correctly after 0.24.0 -> 0.25.0. TypeScript API backward compatibility verified. Dependencies: None. Schema change: No.

STAB-5 — Framework version compatibility matrix

In plain terms: Test useStreamTable() against: React 18.x, React 19.x, Vue 3.4+. Document which framework versions are supported. Future consideration: Svelte 5 (runes), SolidJS, Angular signals — document these as "community-contributed" integration points, not first-party.

Verify: CI matrix testing React 18, React 19, Vue 3.4. Published compatibility table in npm README. Dependencies: PGL-3-2, PGL-3-3. Schema change: No.

Performance

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| PERF-1 | INSERT-to-render latency benchmark | M | P0 |
| PERF-2 | Batch rendering efficiency (single re-render) | S | P0 |
| PERF-3 | Bridge overhead vs raw live.changes() | S | P1 |

PERF-1 — INSERT-to-render latency benchmark

In plain terms: Measure the end-to-end latency from INSERT INTO source_table to the React component's DOM update. The target is < 50% of live.incrementalQuery() latency for a 3-table join + aggregate at 10K rows (per PLAN_PGLITE.md). This is the headline metric: if pg_trickle's reactive path is not significantly faster than PGlite's built-in incremental query, the value proposition collapses.

Verify: benchmark suite with 5 complexity levels (scan, filter, join, aggregate, window). Publish results as a comparison table against live.incrementalQuery(). Target: < 50% latency at 10K rows. Dependencies: PGL-3-1, PGL-3-2, PGL-3-5. Schema change: No.

PERF-2 — Batch rendering efficiency (single re-render)

In plain terms: A bulk INSERT (100 rows) must produce exactly one React re-render, not 100. The change bridge must batch events emitted within the same transaction into a single live.changes() notification. Use queueMicrotask() or requestAnimationFrame() batching in the TypeScript wrapper to coalesce rapid-fire events.

Verify: React profiler shows ≤ 1 render per bulk DML. Test with 1, 10, 100, 1000-row INSERTs; render count is always 1. Dependencies: PGL-3-1, PGL-3-2, CORR-2. Schema change: No.
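The coalescing idea can be sketched as a small batcher: events pushed synchronously within one tick are flushed as a single batch via queueMicrotask, so N same-transaction events yield one listener call and hence one render. A sketch, with illustrative names; the explicit flush() is where a transaction-boundary marker from the bridge would hook in:

```typescript
// Coalesces rapid-fire events into one batch per microtask tick.
class MicrotaskBatcher<T> {
  private pending: T[] = [];
  private scheduled = false;
  constructor(private readonly onBatch: (batch: T[]) => void) {}

  push(item: T): void {
    this.pending.push(item);
    if (!this.scheduled) {
      this.scheduled = true;
      queueMicrotask(() => this.flush());
    }
  }

  // Exposed so a transaction-boundary marker can force a flush explicitly.
  flush(): void {
    if (this.pending.length === 0) { this.scheduled = false; return; }
    const batch = this.pending;
    this.pending = [];
    this.scheduled = false;
    this.onBatch(batch);
  }
}
```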

PERF-3 — Bridge overhead vs raw live.changes()

In plain terms: The change bridge adds a translation layer between the IMMEDIATE mode delta application and PGlite's live.changes() API. Measure the overhead of this translation (serialization, event construction, key mapping) and ensure it is < 5% of total refresh latency. If overhead is higher, optimize the bridge's change event construction (e.g., avoid JSON round-trips, use structured clones).

Verify: micro-benchmark isolating bridge overhead from WASM refresh time. Document overhead as percentage of total INSERT-to-event latency. Dependencies: PGL-3-1. Schema change: No.

Scalability

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| SCAL-1 | Multiple concurrent subscriptions | S | P1 |
| SCAL-2 | Large result set rendering (10K+ rows) | M | P1 |
| SCAL-3 | Multi-tab / SharedWorker isolation | S | P2 |

SCAL-1 — Multiple concurrent subscriptions

In plain terms: A dashboard page may render 5-10 useStreamTable() hooks simultaneously, each watching a different stream table. The bridge must not create per-hook subscriptions to live.changes() — instead, use a single multiplexed subscription that fans out to registered hooks. Measure performance with 1, 5, 10, 20 concurrent hooks.

Verify: benchmark with 20 concurrent useStreamTable() hooks; latency degradation < 20% vs single hook. Memory growth linear (not quadratic). Dependencies: PGL-3-1, PGL-3-2. Schema change: No.

SCAL-2 — Large result set rendering (10K+ rows)

In plain terms: A stream table with 10K+ rows produces a large initial snapshot when useStreamTable() mounts. The hook must support virtualized rendering (integrating with libraries like react-virtual or tanstack-virtual) by providing a stable row identity key (__pgt_row_id) and fine-grained change signals (which rows changed, not just "something changed"). Without this, mounting a 10K-row stream table would freeze the UI for seconds.

Verify: demo app with 10K-row stream table using @tanstack/react-virtual. Mount time < 200 ms. Single-row INSERT re-renders only the affected row, not the full list. Dependencies: PGL-3-2, PGL-3-4. Schema change: No.

SCAL-3 — Multi-tab / SharedWorker isolation

In plain terms: In multi-tab apps using PGlite with SharedWorker, each tab gets its own useStreamTable() hooks but shares a single PGlite instance. The bridge must correctly fan out change events to all tabs without cross-tab interference or duplicate processing. Document the SharedWorker architecture and test with 3 concurrent tabs.

Verify: 3-tab test with shared PGlite instance via SharedWorker. INSERT in tab 1 causes re-render in all 3 tabs. No duplicate events. No memory leaks across tabs. Dependencies: PGL-3-1. Schema change: No.

Ease of Use

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| UX-1 | Local-first app example: collaborative todo | M | P0 |
| UX-2 | Real-time dashboard example | M | P0 |
| UX-3 | API reference with interactive playground | S | P1 |
| UX-4 | Migration guide from live.incrementalQuery() | S | P1 |

UX-1 — Local-first app example: collaborative todo

In plain terms: A complete, runnable React app demonstrating pg_trickle + PGlite for a collaborative todo list: multiple "users" (simulated in separate components) INSERT/UPDATE/DELETE todos, each user's view updates reactively via useStreamTable(). Published in the monorepo under examples/pglite-todo/ with a CodeSandbox link. This is the primary "show, don't tell" marketing asset.

Verify: example app runs in CodeSandbox with zero local setup. README explains every code section. A non-pg_trickle developer can understand it in 5 minutes. Dependencies: PGL-3-2, PGL-3-4. Schema change: No.

UX-2 — Real-time dashboard example

In plain terms: A React dashboard with 3 stream tables: (a) live order count (aggregate), (b) revenue by region (join + aggregate), (c) top products (window function + LIMIT). Data is inserted via a simulated event stream. Each panel updates reactively. Demonstrates the breadth of SQL operators supported in PGlite, beyond what live.incrementalQuery() can efficiently handle.

Verify: example app with 3 panels. INSERT 100 orders; all 3 panels update with a single render each. Published to CodeSandbox. Dependencies: PGL-3-2, PGL-3-4. Schema change: No.

UX-3 — API reference with interactive playground

In plain terms: An interactive documentation page (MDX or Storybook) where users can type SQL, create a stream table, insert data, and see the useStreamTable() hook update live — all in the browser via PGlite. This replaces the need for a local install for initial exploration.

Verify: playground page loads in < 3 seconds. Users can create a stream table and see reactive updates within 30 seconds of page load. Dependencies: PGL-3-2, UX-1. Schema change: No.

UX-4 — Migration guide from live.incrementalQuery()

In plain terms: Users already using PGlite's live.incrementalQuery() need a clear guide showing: (a) when to switch to pg_trickle (complex queries, high-throughput writes, large result sets), (b) how to migrate step-by-step (replace live.incrementalQuery(q) with createStreamTable(q) + useStreamTable(name)), (c) what to expect (latency improvement, memory trade-off, SQL surface differences).

Verify: migration guide published in docs. Includes a before/after code diff and a decision flowchart. Dependencies: PGL-3-4, PERF-1. Schema change: No.

Test Coverage

| ID | Title | Effort | Priority |
|----|-------|--------|----------|
| TEST-1 | Change event fidelity suite (all operators) | L | P0 |
| TEST-2 | React hook lifecycle tests | M | P0 |
| TEST-3 | Vue composable lifecycle tests | M | P0 |
| TEST-4 | Cross-framework render count assertions | S | P0 |
| TEST-5 | Long-running soak test for memory leaks | M | P1 |

TEST-1 — Change event fidelity suite (all operators)

In plain terms: For each of the 23 DVM operators, test that the live.changes() bridge emits the correct change events for INSERT, UPDATE, and DELETE on the source table. Replay events into an accumulator and assert it matches SELECT * FROM stream_table. This extends v0.24.0 TEST-1 (operator E2E) by adding the reactive layer.

Verify: ≥ 69 tests (23 operators × 3 DML types). Accumulator matches SELECT * for every test case. Dependencies: PGL-3-1, v0.24.0 TEST-1. Schema change: No.

TEST-2 — React hook lifecycle tests

In plain terms: Test the full lifecycle of useStreamTable(): (a) initial mount returns current stream table state, (b) INSERT on source triggers re-render with new data, (c) unmount cancels subscription, (d) remount re-subscribes and returns current state, (e) rapid mount/unmount (100 cycles) has no leaks. Use React Testing Library with renderHook().

Verify: ≥ 15 tests covering mount, update, unmount, remount, error, and stress scenarios. Zero console warnings in test output. Dependencies: PGL-3-2. Schema change: No.

TEST-3 — Vue composable lifecycle tests

In plain terms: Equivalent of TEST-2 for Vue: mount, update, unmount, remount, error handling. Use Vue Test Utils with mount() and wrapper.unmount(). Test with both Options API and Composition API usage patterns.

Verify: ≥ 10 tests covering Vue lifecycle. Zero console warnings. Dependencies: PGL-3-3. Schema change: No.

TEST-4 — Cross-framework render count assertions

In plain terms: For each framework (React, Vue), verify that a bulk INSERT (100 rows) triggers exactly 1 render, not 100. This is the batching correctness test. Use framework-specific profiling APIs (React Profiler, Vue DevTools perf hooks) to count renders.

Verify: render count = 1 for 100-row bulk INSERT in both React and Vue. CI assertion. Dependencies: PGL-3-2, PGL-3-3, PERF-2. Schema change: No.

TEST-5 — Long-running soak test for memory leaks

In plain terms: Run a React app with useStreamTable() for 4 hours with 1 mutation/second. Take heap snapshots at 0h, 1h, 2h, 4h. Assert heap growth < 10%. Check for detached DOM nodes, leaked event listeners, and orphaned closures. This validates STAB-1 under real conditions.

Verify: soak test runs in CI (with a 30-min abbreviated version for PR CI). Full 4-hour version runs in nightly CI. Heap growth < 10%. Dependencies: STAB-1, PGL-3-2. Schema change: No.

Conflicts & Risks

  1. live.changes() API stability. PGlite's live.changes() is relatively new and its event format may change between PGlite releases. Pin the PGlite version and add an adapter layer so the bridge can accommodate event format changes without rewriting the React/Vue hooks. If PGlite deprecates live.changes() before v0.25.0 ships, fall back to LISTEN/NOTIFY with a custom channel.

  2. CORR-2 (batch atomicity) and PERF-2 (single re-render) are coupled. The batching mechanism must ensure correctness (all-or-nothing event delivery) AND performance (single render). Using queueMicrotask() for batching risks splitting a transaction's events across two microtasks if the event stream straddles a microtask boundary. Consider explicit transaction-boundary markers in the bridge's event protocol.

  3. React concurrent mode complicates CORR-3 (rapid mutations). React 18/19 concurrent features (startTransition, useDeferredValue) may delay or re-order state updates from useStreamTable(). The hook must use useSyncExternalStore() (React 18+) to ensure tearing-free reads. This is non-negotiable for correctness.

  4. SCAL-2 (large result set rendering) requires external library integration. The useStreamTable() hook should not bundle a virtualization library — instead, expose stable row keys and fine-grained change signals that integrate with @tanstack/react-virtual or similar. Document the pattern but do not create a hard dependency.

  5. SCAL-3 (SharedWorker) is exploratory. PGlite's SharedWorker support has known limitations (no concurrent transactions). Mark SCAL-3 as P2 and scope it to documentation + a proof-of-concept, not production-grade support.

  6. No native extension changes in v0.25.0. This release is entirely in the TypeScript/npm layer. Any temptation to add native features (e.g., LISTEN/NOTIFY bridge, WebSocket push) should be deferred to post-1.0. Keep the scope tight: reactive bindings + examples + docs.

v0.25.0 total: ~2–3 weeks (bridge + hooks) + ~1–2 weeks (examples + testing + polish)

Exit criteria:

  • PGL-3-1: Stream table changes appear in live.changes() event stream
  • PGL-3-2: React useStreamTable() hook re-renders on stream table changes
  • PGL-3-3: Vue useStreamTable() composable re-renders on stream table changes
  • PGL-3-4: At least 2 example apps published with documentation and CodeSandbox links
  • PGL-3-5: End-to-end latency benchmarked and published
  • CORR-1: 1,000-operation replay test: accumulator matches SELECT * for all operators
  • CORR-2: 100-row bulk INSERT triggers exactly 1 re-render
  • CORR-3: 50 rapid same-row mutations: final hook state matches SELECT *
  • CORR-4: DELETE + re-INSERT with same PK: correct unmount/mount lifecycle
  • STAB-1: 4-hour soak test: heap growth < 10%
  • STAB-2: 100 mount/unmount cycles: zero leaked subscriptions
  • STAB-3: Stream table dropped while hook active: error boundary catches
  • STAB-4: Extension upgrade path tested (0.24.0 -> 0.25.0)
  • STAB-5: CI matrix passes for React 18, React 19, Vue 3.4+
  • PERF-1: INSERT-to-render latency < 50% of live.incrementalQuery() at 10K rows
  • PERF-2: Render count = 1 for bulk DML (1, 10, 100, 1000 rows)
  • TEST-1: ≥ 69 change event fidelity tests pass (23 operators × 3 DML types)
  • TEST-2: ≥ 15 React hook lifecycle tests pass
  • TEST-3: ≥ 10 Vue composable lifecycle tests pass
  • TEST-4: Cross-framework render count = 1 for bulk DML
  • TEST-5: 30-min abbreviated soak test passes in PR CI
  • UX-1: Collaborative todo example published to CodeSandbox
  • UX-2: Real-time dashboard example published to CodeSandbox
  • UX-4: Migration guide from live.incrementalQuery() published
  • just check-version-sync passes (incl. npm package version)

v1.0.0 — Stable Release

Goal: First officially supported release. Semantic versioning locks in. API, catalog schema, and GUC names are considered stable. Focus is distribution — getting pg_trickle onto package registries — and PostgreSQL 19 forward-compatibility.

PostgreSQL 19 Forward-Compatibility (A3)

In plain terms: When PostgreSQL 19 beta stabilises and pgrx 0.18.x ships with PG 19 support, this milestone bumps the pgrx dependency, audits every internal pg_sys::* API call for breaking changes, adds conditional compilation gates, and validates the WAL decoder against any pgoutput format changes introduced in PG 19. Moved here from the earlier v0.22.0 milestone because PG 19 beta availability is uncertain.

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| A3-1 | pgrx version bump to 0.18.x (PG 19 support) + cargo pgrx init --pg19 | 2–4h | PLAN_PG19_COMPAT.md §2 |
| A3-2 | pg_sys::* API audit: heap access, catalog structs, WAL decoder LogicalDecodingContext | 8–16h | PLAN_PG19_COMPAT.md §3 |
| A3-3 | Conditional compilation (#[cfg(feature = "pg19")]) for changed APIs | 4–8h | PLAN_PG19_COMPAT.md §4 |
| A3-4 | CI matrix expansion for PG 19 + full E2E suite run | 4–8h | PLAN_PG19_COMPAT.md |

A3 subtotal: ~18–36 hours

Release engineering

In plain terms: The 1.0 release is the official "we stand behind this API" declaration — from this point on the function names, catalog schema, and configuration settings won't change without a major version bump. The practical work is getting pg_trickle onto standard package registries (PGXN, apt, rpm) so it can be installed with the same commands as any other PostgreSQL extension, and hardening the CloudNativePG integration for Kubernetes deployments.

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| R1 | Semantic versioning policy + compatibility guarantees | 2–3h | PLAN_VERSIONING.md |
| R2 | apt / rpm packaging (Debian/Ubuntu .deb + RHEL .rpm via PGDG) | 8–12h | PLAN_PACKAGING.md |
| R2b | PGXN release_status → "stable" (flip one field; PGXN testing release ships in v0.7.0) | 30min | PLAN_PACKAGING.md |
| R3 | Docker Hub official image → CNPG extension image | ✅ Done | PLAN_CLOUDNATIVEPG.md |
| R4 | CNPG operator hardening (K8s 1.33+ native ImageVolume) ➡️ Pulled to v0.15.0 | 4–6h | PLAN_CLOUDNATIVEPG.md |
| R5 | Docker Hub official image. Publish pgtrickle/pg_trickle:1.0.0-pg18 and :latest to Docker Hub. Sync Dockerfile.hub version tag with release. Automate via GitHub Actions release workflow. | 2–4h | |
| R6 | Version sync automation. Ensure just check-version-sync covers all version references (Cargo.toml, extension control files, Dockerfile.hub, dbt_project.yml, CNPG manifests). Add to CI as a blocking check. | 2–3h | |
| SAST-SEMGREP | Elevate Semgrep to blocking in CI. CodeQL and cargo-deny already block; Semgrep is advisory-only. Flip to blocking for consistent safety gating. Before flipping, verify zero findings across all current rules. | 1–2h | PLAN_SAST.md |

v1.0.0 total: ~36–66 hours (incl. PG 19 compat ~18–36h + release engineering ~18–30h)

Exit criteria:

  • A3: PG 19 builds and passes full E2E suite
  • CI matrix includes PG 19
  • Published on PGXN (stable) and apt/rpm via PGDG
  • Docker Hub image published (pgtrickle/pg_trickle:1.0.0-pg18 and :latest)
  • CNPG extension image published to GHCR (pg_trickle-ext)
  • CNPG cluster-example.yaml validated (Image Volume approach)
  • just check-version-sync passes and blocks CI on mismatch
  • SAST-SEMGREP: Semgrep elevated to blocking in CI; zero findings verified
  • Upgrade path from v0.17.0 tested
  • Semantic versioning policy in effect

Post-1.0 — Scale, Ecosystem & Platform Expansion

These are not gated on 1.0 but represent the longer-term horizon. PG backward compatibility (PG 16–18) and native DDL syntax were moved here from v0.16.0 to keep the pre-1.0 milestones focused on performance and correctness.

Ecosystem expansion

In plain terms: Building first-class integrations with the tools most data teams already use — a proper dbt adapter (beyond just a materialization macro), an Airflow provider so you can trigger stream table refreshes from Airflow DAGs, a pgtrickle TUI for managing and monitoring stream tables without writing SQL (shipped in v0.14.0), and integration guides for popular ORMs and migration frameworks like Django, SQLAlchemy, Flyway, and Liquibase.

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| E1 | dbt full adapter (dbt-pgtrickle extending dbt-postgres) | 20–30h | PLAN_DBT_ADAPTER.md |
| E2 | Airflow provider (apache-airflow-providers-pgtrickle) | 16–20h | PLAN_ECO_SYSTEM.md §4 |
| E3 | CLI tool (pgtrickle) for management outside SQL ➡️ Pulled to v0.14.0 as TUI (E3-TUI) | 4–6d | PLAN_TUI.md |
| E4 | Flyway / Liquibase migration support ➡️ Pulled to v0.15.0 | 8–12h | PLAN_ECO_SYSTEM.md §5 |
| E5 | ORM integrations guide (SQLAlchemy, Django, etc.) ➡️ Pulled to v0.15.0 | 8–12h | PLAN_ECO_SYSTEM.md §5 |

Scale

In plain terms: When you have hundreds of stream tables or a very large cluster, the single background worker that drives pg_trickle today can become a bottleneck. These items explore running the scheduler as an external sidecar process (outside the database itself), distributing stream tables across Citus shards for horizontal scale-out, and managing stream tables that span multiple databases in the same PostgreSQL cluster.

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| S1 | External orchestrator sidecar for 100+ STs | 20–40h | REPORT_PARALLELIZATION.md §D |
| S2 | Citus / distributed PostgreSQL compatibility | ~6 months | plans/infra/CITUS.md |
| S3 | Multi-database support (beyond postgres DB) | TBD | PLAN_MULTI_DATABASE.md |

PG Backward Compatibility (PG 16–18)

In plain terms: pg_trickle currently only targets PostgreSQL 18. This work adds support for PG 16 and PG 17 so teams that haven't yet upgraded can still use the extension. Each PostgreSQL major version has subtly different internal APIs — especially around query parsing and the WAL format used for change-data-capture — so each version needs its own feature flags, build path, and CI test run.

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| BC1 | Cargo.toml feature flags (pg16, pg17, pg18) + cfg_aliases | 4–8h | PLAN_PG_BACKCOMPAT.md §5.2 Phase 1 |
| BC2 | #[cfg] gate JSON_TABLE nodes in parser.rs (~250 lines, PG 17+) | 12–16h | PLAN_PG_BACKCOMPAT.md §5.2 Phase 2 |
| BC3 | pg_get_viewdef() trailing-semicolon behavior verification | 2–4h | PLAN_PG_BACKCOMPAT.md §5.2 Phase 3 |
| BC4 | CI matrix expansion (PG 16, 17, 18) + parameterized Dockerfiles | 12–16h | PLAN_PG_BACKCOMPAT.md §5.2 Phases 4–5 |
| BC5 | WAL decoder validation against PG 16–17 pgoutput format | 8–12h | PLAN_PG_BACKCOMPAT.md §6A |

Backward compatibility subtotal: ~38–56 hours

Native DDL Syntax

In plain terms: Currently you create stream tables by calling a function: SELECT pgtrickle.create_stream_table(...). This adds support for standard PostgreSQL DDL syntax: CREATE MATERIALIZED VIEW my_view WITH (pgtrickle.stream = true) AS SELECT .... That single change means pg_dump can back them up properly, \dm in psql lists them, ORMs can introspect them, and migration tools like Flyway treat them like ordinary database objects. Stream tables finally look native to PostgreSQL tooling.

| Item | Description | Effort | Ref |
|------|-------------|--------|-----|
| NAT-1 | ProcessUtility_hook infrastructure: register in _PG_init(), dispatch+passthrough, hook chaining with TimescaleDB/pg_stat_statements | 3–5d | PLAN_NATIVE_SYNTAX.md §Tier 2 |
| NAT-2 | CREATE/DROP/REFRESH interception: parse CreateTableAsStmt reloptions, route to internal impls, IF EXISTS handling, CONCURRENTLY no-op | 8–13d | PLAN_NATIVE_SYNTAX.md §Tier 2 |
| NAT-3 | E2E tests: CREATE/DROP/REFRESH via DDL syntax, hook chaining, non-pg_trickle matview passthrough | 2–3d | PLAN_NATIVE_SYNTAX.md §Tier 2 |

Native DDL syntax subtotal: ~13–21 days

Advanced SQL

In plain terms: Longer-horizon features requiring significant research — backward compatibility with PG 14/15, partitioned stream table storage, and remaining SQL coverage gaps. Several items have been pulled forward to v0.16.0 and v0.17.0.

| Item | Description | Effort | Ref |
|---|---|---|---|
| A2 | Transactional IVM Phase 4 remaining (ENR-based transition tables, C-level triggers, prepared stmt reuse) ➡️ Pulled to v0.17.0 | ~36–54h | PLAN_TRANSACTIONAL_IVM.md |
| A3 | PostgreSQL 19 forward-compatibility ➡️ Pulled to v0.16.0 ➡️ Moved to v1.0.0 | ~18–36h | PLAN_PG19_COMPAT.md |
| A4 | PostgreSQL 14–15 backward compatibility | ~40h | PLAN_PG_BACKCOMPAT.md |
| A5 | Partitioned stream table storage (opt-in) | ~60–80h | PLAN_PARTITIONING_SHARDING.md §4 |
| A6 | Buffer table partitioning by LSN range (pg_trickle.buffer_partitioning GUC) | ✅ Done | PLAN_EDGE_CASES_TIVM_IMPL_ORDER.md Stage 4 §3.3 |
| A8 | ROWS FROM() with multiple SRF functions ➡️ Pulled to v0.17.0 | ~1–2d | PLAN_TRANSACTIONAL_IVM_PART_2.md Task 2.3 |

Parser Modularization & Shared Template Cache (G13-PRF, G14-SHC)

In plain terms: Two large-effort research items identified in the deep gap analysis. Parser modularization is a prerequisite for native DDL syntax (BC2); shared template caching eliminates per-connection cold-start overhead.

| Item | Description | Effort | Ref |
|---|---|---|---|
| G13-PRF | Modularize src/dvm/parser.rs. ✅ Done in v0.15.0 | ~3–4wk | plans/performance/REPORT_OVERALL_STATUS.md §13 |
| G14-SHC | Shared-memory template caching (research spike). ➡️ Pulled to v0.16.0 | ~2–3wk | plans/performance/REPORT_OVERALL_STATUS.md §14 |

Parser modularization: ✅ Done in v0.15.0. Template caching: ➡️ v0.16.0

Convenience API Functions (G15-BC, G15-EX)

In plain terms: Two quality-of-life API additions that simplify programmatic stream table management, useful for dbt/CI pipelines.

| Item | Description | Effort | Ref |
|---|---|---|---|
| G15-BC | bulk_create(definitions JSONB) — create multiple stream tables and their CDC triggers in a single transaction. Useful for dbt/CI pipelines that manage many STs programmatically. ➡️ Pulled to v0.15.0 | ~2–3d | plans/performance/REPORT_OVERALL_STATUS.md §15 |
| G15-EX | export_definition(name TEXT) — export a stream table configuration as reproducible CREATE STREAM TABLE … WITH (…) DDL. ➡️ Pulled to v0.14.0 | ~1–2d | plans/performance/REPORT_OVERALL_STATUS.md §15 |

Convenience API subtotal: ~2–3 days (G15-EX pulled to v0.14.0; G15-BC pulled to v0.15.0)


Effort Summary

| Milestone | Effort estimate | Cumulative | Status |
|---|---|---|---|
| v0.1.x — Core engine + correctness | ~30h actual | 30h | ✅ Released |
| v0.2.0 — TopK, Diamond & Transactional IVM | ✔️ Complete | 62–78h | ✅ Released |
| v0.2.1 — Upgrade Infrastructure & Documentation | ~8h | 70–86h | ✅ Released |
| v0.2.2 — OFFSET Support, ALTER QUERY & Upgrade Tooling | ~50–70h | 120–156h | ✅ Released |
| v0.2.3 — Non-Determinism, CDC/Mode Gaps & Operational Polish | 45–66h | 165–222h | ✅ Released |
| v0.3.0 — DVM Correctness, SAST & Test Coverage | ~20–30h | 185–252h | ✅ Released |
| v0.4.0 — Parallel Refresh & Performance Hardening | ~60–94h | 245–346h | ✅ Released |
| v0.5.0 — RLS, Operational Controls + Perf Wave 1 (A-3a only) | ~51–97h | 296–443h | ✅ Released |
| v0.6.0 — Partitioning, Idempotent DDL & Circular Dependency Foundation | ~35–50h | 331–493h | ✅ Released |
| v0.7.0 — Performance, Watermarks, Circular DAG Execution, Observability & Infrastructure | ~59–62h | 390–555h | |
| v0.8.0 — pg_dump Support & Test Hardening | ~16–21d | | |
| v0.9.0 — Incremental Aggregate Maintenance (B-1) | ~7–9 wk | | |
| v0.10.0 — DVM Hardening, Connection Pooler Compat, Core Refresh Opts & Infra Prep | ~7–10d + ~26–40 wk | | |
| v0.11.0 — Partitioned Stream Tables, Prometheus & Grafana, Safety Hardening & Correctness | ~7–10 wk + ~12h obs + ~14–21h defaults + ~7–12h safety + ~2–4 wk should-ship | | |
| v0.12.0 — Scalability Foundations, Partitioning Enhancements & Correctness | ~18–27 wk + ~6–8 wk scalability + ~5–8 wk partitioning + ~1–3 wk defaults | | |
| v0.13.0 — Scalability Foundations, Partitioning Enhancements, MERGE Profiling & Multi-Tenant Scheduling | ~15–23 wk | | |
| v0.14.0 — Tiered Scheduling, UNLOGGED Buffers & Diagnostics | ~2–6 wk + ~1 wk patterns + ~2–4d stability + ~3.5–7d diagnostics + ~1–2d export + ~4–6d TUI + ~0.5d docs | | |
| v0.15.0 — External Test Suites & Integration | ~40–70h + ~2–3d bulk create + ~3–5d planner hints + ~2–3d cache spike + ~3–4wk parser + ~1–2wk watermark + ~2–4wk delta cost/spill | | ✅ Released |
| v0.16.0 — Performance & Refresh Optimization | ~1–2wk MERGE alts + ~4–6wk aggregate fast-path + ~1–2wk append-only + ~2–3wk predicate pushdown + ~2–3wk template cache + ~2–3wk buffer compaction + ~3–6wk test coverage + ~1–2wk bench CI + ~2–3d auto-indexing + ~12–22h quick wins | | |
| v0.17.0 — Query Intelligence & Stability | ~2–3wk cost-based strategy + ~3–4wk columnar tracking + ~32–48h TIVM Phase 4 + ~1–2d ROWS FROM + ~2–3wk SQLancer + ~2–3wk incremental DAG + ~4–8h unsafe reduction + ~1–2wk api.rs mod + ~2–3d migration guide + ~3–5d runbook + ~2–3d playground + ~2–3d doc polish | | |
| v0.18.0 — Hardening & Delta Performance | ~70–100h | | |
| v0.19.0 — Production Gap Closure & Distribution | ~4–5 weeks | | |
| v0.20.0 — Dog-Feeding (pg_trickle monitors itself) | ~3–4wk | | |
| v0.21.0 — PostgreSQL 17 Support | ~2–4d | | |
| v0.22.0 — PGlite Proof of Concept | ~2–3wk (plugin) + ~1–2d (version bump) | | |
| v0.23.0 — Core Extraction (pg_trickle_core) | ~3–4wk (extraction) + ~1–2wk (abstraction + testing) | | |
| v0.24.0 — PGlite WASM Extension | ~5–7wk (WASM build) + ~2–3wk (testing + polish) | | |
| v0.25.0 — PGlite Reactive Integration | ~2–3wk (bridge + hooks) + ~1–2wk (examples + testing + polish) | | |
| v1.0.0 — Stable release (incl. PG 19 compat) | ~36–66h | | |
| Post-1.0 (PG compat + Native DDL) | ~38–56h (PG 16–18) + ~13–21d (Native DDL) | | |
| Post-1.0 (ecosystem) | 88–134h | | |
| Post-1.0 (scale) | 6+ months | | |

References

| Document | Purpose |
|---|---|
| CHANGELOG.md | What's been built |
| plans/PLAN.md | Original 13-phase design plan |
| plans/sql/SQL_GAPS_7.md | 53 known gaps, prioritized |
| plans/sql/PLAN_PARALLELISM.md | Detailed implementation plan for true parallel refresh |
| plans/performance/REPORT_PARALLELIZATION.md | Parallelization options analysis |
| plans/performance/STATUS_PERFORMANCE.md | Benchmark results |
| plans/ecosystem/PLAN_ECO_SYSTEM.md | Ecosystem project catalog |
| plans/dbt/PLAN_DBT_ADAPTER.md | Full dbt adapter plan |
| plans/infra/CITUS.md | Citus compatibility plan |
| plans/infra/PLAN_VERSIONING.md | Versioning & compatibility policy |
| plans/infra/PLAN_PACKAGING.md | PGXN / deb / rpm packaging |
| plans/infra/PLAN_DOCKER_IMAGE.md | Official Docker image (superseded by CNPG extension image) |
| plans/ecosystem/PLAN_CLOUDNATIVEPG.md | CNPG Image Volume extension image |
| plans/infra/PLAN_MULTI_DATABASE.md | Multi-database support |
| plans/infra/PLAN_PG19_COMPAT.md | PostgreSQL 19 forward-compatibility |
| plans/sql/PLAN_UPGRADE_MIGRATIONS.md | Extension upgrade migrations |
| plans/sql/PLAN_TRANSACTIONAL_IVM.md | Transactional IVM (immediate, same-transaction refresh) |
| plans/sql/PLAN_ORDER_BY_LIMIT_OFFSET.md | ORDER BY / LIMIT / OFFSET gaps & TopK support |
| plans/sql/PLAN_NON_DETERMINISM.md | Non-deterministic function handling |
| plans/sql/PLAN_ROW_LEVEL_SECURITY.md | Row-Level Security support plan (Phases 1–4) |
| plans/infra/PLAN_PARTITIONING_SHARDING.md | PostgreSQL partitioning & sharding compatibility |
| plans/infra/PLAN_PG_BACKCOMPAT.md | Supporting older PostgreSQL versions (13–17) |
| plans/sql/PLAN_DIAMOND_DEPENDENCY_CONSISTENCY.md | Diamond dependency consistency (multi-path refresh atomicity) |
| plans/adrs/PLAN_ADRS.md | Architectural decisions |
| docs/ARCHITECTURE.md | System architecture |

Release Process

This document describes how to create a release of pg_trickle.

Overview

Releases are fully automated via GitHub Actions. Pushing a version tag (v*) triggers the Release workflow, which:

  1. Runs a preflight version-sync check to ensure all version references match the tag
  2. Builds extension packages for Linux (amd64), macOS (arm64), and Windows (amd64)
  3. Smoke-tests the Linux artifact against a live PostgreSQL 18 instance
  4. Creates a GitHub Release with archives and SHA256 checksums
  5. Builds and pushes a multi-arch extension image to GHCR (for CNPG Image Volumes)

A separate PGXN workflow also fires on the same v* tag and publishes the source archive to the PostgreSQL Extension Network.

Prerequisites

  • Push access to the repository (or a PR merged by a maintainer)
  • All CI checks passing on main (verify the last run on the version-bump commit succeeded)
  • The version in Cargo.toml matches the tag you intend to push
  • Required GitHub secrets configured (see Required GitHub Secrets below)

Required GitHub Secrets

The release automation uses the following GitHub Actions secrets. Set them under Settings → Secrets and variables → Actions → New repository secret.

| Secret | Used by | Description |
|---|---|---|
| PGXN_USERNAME | pgxn.yml | Your PGXN account username. Used to authenticate the curl upload to PGXN Manager when publishing source archives to the PostgreSQL Extension Network. Register at pgxn.org. |
| PGXN_PASSWORD | pgxn.yml | Password for the PGXN account above. Never hardcode this — it must be stored as a secret so it is never exposed in logs or committed to the repository. |
| CODECOV_TOKEN | coverage.yml | Upload token for Codecov. Used to publish unit and E2E coverage reports. Obtain it from the Codecov dashboard after linking the repository. The workflow degrades gracefully (fail_ci_if_error: false) if absent. |
| BENCHER_API_TOKEN | benchmarks.yml | API token for Bencher, the continuous benchmarking platform. Used to track Criterion benchmark results on main and detect regressions on pull requests. The benchmark steps are skipped entirely when this secret is absent, so CI still passes without it. Create a project at bencher.dev and copy the token from the project settings. |

Note: The GITHUB_TOKEN secret is provided automatically by GitHub Actions and does not need to be configured manually. It is used by the release workflow to create GitHub Releases, by the Docker workflow to push images to GHCR, and by Bencher to post PR comments.

Step-by-Step

1. Decide the version number

Follow Semantic Versioning:

| Change type | Bump | Example |
|---|---|---|
| Breaking SQL API or config change | Major | 1.0.0 → 2.0.0 |
| New feature, backward-compatible | Minor | 0.1.0 → 0.2.0 |
| Bug fix, no API change | Patch | 0.2.0 → 0.2.1 |
| Pre-release / release candidate | Suffix | 0.3.0-rc.1 |

2. Update the version

Four files must have their version bumped together:

# 1. Cargo.toml — the canonical version source for the extension
#    Change:  version = "0.7.0"  →  version = "0.8.0"

# 2. pgtrickle-tui/Cargo.toml — the TUI binary; must always match Cargo.toml
#    Change:  version = "0.7.0"  →  version = "0.8.0"

# 3. META.json — the PGXN package metadata
#    Change both top-level "version" and the nested "provides" version

# 4. CHANGELOG.md
#    Rename ## [Unreleased] → ## [0.8.0] — YYYY-MM-DD
#    Add a new empty ## [Unreleased] section at the top

Important: Cargo.toml (extension) and pgtrickle-tui/Cargo.toml (TUI) must always carry the same version. They are built and released together, and a mismatch causes cargo install --path pgtrickle-tui to report the wrong version. The just check-version-sync script does not currently enforce this, so it must be checked manually.

The extension control file (pg_trickle.control) uses default_version = '@CARGO_VERSION@', which pgrx substitutes automatically at build time — no manual edit needed there.

After editing, verify all version-related files are in sync:

just check-version-sync

3. Commit the version bump

git add Cargo.toml META.json CHANGELOG.md
git commit -m "release: v0.8.0"
git push origin main

4. Wait for CI to pass and verify upgrade completeness

Ensure the CI workflow passes on main with the version bump commit. All unit, integration, E2E, and pgrx tests must be green.

Critical: Before tagging, verify that the upgrade script covers all SQL schema changes:

# Run comprehensive upgrade completeness checks
just check-upgrade-all

# If any check fails (e.g. "ERROR: X new function(s) missing from upgrade script"),
# fix the issue by adding the missing SQL objects to:
#   sql/pg_trickle--<prev>--<new>.sql
#
# Then re-run until all checks pass:
just check-upgrade-all  # Should print "All 15 upgrade step(s) passed completeness checks."

Why this matters: New SQL functions, views, tables, and columns added in any prior release must be carried forward in the upgrade script, even if the current release doesn't change them. The upgrade script is the source of truth for what PostgreSQL applies when users run ALTER EXTENSION pg_trickle UPDATE.
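As a sketch, a carry-forward section of an upgrade script could look like the following (the function name is borrowed from the troubleshooting example in this document; the signature shown is an assumption, and the real body must be copied verbatim from sql/archive/pg_trickle--<prev>.sql):

```sql
-- sql/pg_trickle--<old>--<new>.sql (illustrative fragment)

-- Carried forward from a prior release: objects that must still exist
-- after ALTER EXTENSION pg_trickle UPDATE, even though this release
-- does not change them.
CREATE OR REPLACE FUNCTION pgtrickle."fuse_status"()
RETURNS TABLE (st_name text, tripped boolean)   -- signature is an assumption
LANGUAGE sql STABLE
AS $$ SELECT ... $$;                            -- body copied from the archive baseline
```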

Confirm the local and CI upgrade-E2E defaults were advanced to the new release:

just check-version-sync  # Verifies ci.yml, justfile, and test defaults

5. Create and push the tag

git tag -a v0.2.0 -m "Release v0.2.0"
git push origin v0.2.0

This triggers the Release workflow automatically.

6. Monitor the release

Watch the Actions tab for progress. The release workflow runs these jobs in order:

preflight  ──►  build-release (linux, macos, windows)
                      │
                      ▼
                test-release  ──►  publish-release
                              ──►  publish-docker-arch (linux/amd64 + linux/arm64)
                                         │
                                         ▼
                                   publish-docker (merge manifest + push :latest)

The PGXN workflow (pgxn.yml) runs independently and publishes the source archive to pgxn.org in parallel with the release workflow.

7. Make the GHCR package public (first release only)

When a package is pushed to GHCR for the first time it is private by default, even when it is linked to a public open-source repository. GitHub will not flip it automatically; you must change the visibility once yourself:

  1. Go to github.com/⟨owner⟩ → Packages → pg_trickle-ext
  2. Click Package settings
  3. Scroll to Danger Zone → Change package visibility → set to Public

After that first change:

  • All future pushes keep the package public automatically
  • Unauthenticated docker pull ghcr.io/grove/pg_trickle-ext:... works
  • Storage and bandwidth are free (GHCR open-source advantage)
  • The package page shows the README, linked repository, license, and description from the OCI labels

8. Verify the release

Once both workflows complete:

  • Check the GitHub Releases page for the new release
  • Verify all three platform archives are attached (.tar.gz for Linux/macOS, .zip for Windows)
  • Verify SHA256SUMS.txt is present
  • Verify the extension image is available at ghcr.io/grove/pg_trickle-ext:<version>
  • Verify the PGXN upload succeeded: pgxn info pg_trickle should show the new version
  • Optionally verify the extension image layout:
docker pull ghcr.io/grove/pg_trickle-ext:<version>
ID=$(docker create ghcr.io/grove/pg_trickle-ext:<version>)
docker cp "$ID:/lib/" /tmp/ext-lib/
docker cp "$ID:/share/" /tmp/ext-share/
docker rm "$ID"
ls -la /tmp/ext-lib/ /tmp/ext-share/extension/

Post-Release Checklist

Complete these steps immediately after a release tag has been pushed and both the Release and PGXN workflows have finished successfully.

  • Create a post-release branch from main (e.g. post-release-<ver>-a)
  • Bump Cargo.toml version to the next development version (e.g. 0.12.0 → 0.13.0)
  • Bump pgtrickle-tui/Cargo.toml version to the same next development version — must always match Cargo.toml
  • Bump META.json — both the top-level "version" and the nested "provides" → "pg_trickle" → "version" to match
  • Write plans/PLAN_0_<next>_0.md — initial planning document for the next milestone
  • Delete plans/PLAN_0_<released>_0.md — remove the now-completed plan
  • Wrap roadmap items — in ROADMAP.md, wrap all completed items from the old release with <details> tags to archive them
  • Add ## [Unreleased] stub to CHANGELOG.md above the just-released entry
  • Create sql/pg_trickle--<released>--<next>.sql — empty upgrade script stub for the next migration hop
  • Copy sql/archive/pg_trickle--<released>.sql → sql/archive/pg_trickle--<next>.sql — placeholder archive baseline for the next version
  • Update justfile — advance build-upgrade-image and test-upgrade to defaults to <next>; update the build-hub Docker image tag
  • Update tests/e2e_upgrade_tests.rs — advance all unwrap_or("<released>".into()) fallback strings to <next>
  • Update version numbers in README.md — search for occurrences of the released version (e.g. 0.17.0) and advance them to <next>: CNPG image reference (ghcr.io/grove/pg_trickle-ext:<version>), dbt revision tag, and any other hardcoded version strings. A quick check: grep -n '<released>' README.md
  • Run just check-version-sync — must exit 0 before opening the PR
  • Open a PR against main with the commit title chore: start v<next> development cycle

Preparing for the Next Release (Pre-Work Checklist)

Use this checklist at the start of each new release milestone to ensure the repository is properly set up before development begins. This maps directly to what just check-version-sync verifies.

| File / target | Action | check-version-sync check |
|---|---|---|
| Cargo.toml | version = "<next>" | canonical version source |
| META.json | both "version" fields set to <next> | PGXN manifest |
| CHANGELOG.md | ## [Unreleased] section present | (manual hygiene) |
| sql/pg_trickle--<prev>--<next>.sql | stub file exists | upgrade SQL exists |
| sql/archive/pg_trickle--<next>.sql | placeholder file exists (copy of <prev>) | archive SQL exists |
| .github/workflows/ci.yml | upgrade matrix and chain end at <next> | CI matrix up to date |
| justfile | build-upgrade-image and test-upgrade to defaults = <next> | justfile defaults |
| tests/e2e_upgrade_tests.rs | all unwrap_or fallbacks = "<next>" | e2e fallback strings |

Quick-verify with:

just check-version-sync
# Should print: All version references are in sync.

Release Artifacts

Each release produces:

| Artifact | Description |
|---|---|
| pg_trickle-<ver>-pg18-linux-amd64.tar.gz | Extension files for Linux x86_64 |
| pg_trickle-<ver>-pg18-macos-arm64.tar.gz | Extension files for macOS Apple Silicon |
| pg_trickle-<ver>-pg18-windows-amd64.zip | Extension files for Windows x64 |
| SHA256SUMS.txt | SHA-256 checksums for all archives |
| ghcr.io/grove/pg_trickle-ext:<ver> | CNPG extension image for Image Volumes (amd64 + arm64) |

Installing from an archive

tar xzf pg_trickle-<version>-pg18-linux-amd64.tar.gz
cd pg_trickle-<version>-pg18-linux-amd64

sudo cp lib/*.so "$(pg_config --pkglibdir)/"
sudo cp extension/*.control extension/*.sql "$(pg_config --sharedir)/extension/"

Then add to postgresql.conf and restart:

shared_preload_libraries = 'pg_trickle'

See INSTALL.md for full installation details.

Pre-releases

Tags containing -rc, -beta, or -alpha (e.g., v0.3.0-rc.1) are automatically marked as pre-releases on GitHub. Pre-release extension images are tagged but do not update the latest tag.

Hotfix Releases

For urgent fixes on an older release:

# Branch from the tag
git checkout -b hotfix/v0.2.1 v0.2.0

# Apply fix, bump version to 0.2.1
git commit -am "fix: ..."
git push origin hotfix/v0.2.1

# Tag from the branch (CI will still run the release workflow)
git tag -a v0.2.1 -m "Release v0.2.1"
git push origin v0.2.1

Files to Update for Each Release

Every release requires manual updates to the files below. Missing any of them leads to version skew between the code, the docs, and the packages.

| File | What to change | Why |
|---|---|---|
| Cargo.toml | version = "x.y.z" field | The canonical version source. pgrx reads this at build time and substitutes it into pg_trickle.control via @CARGO_VERSION@. The git tag must match. |
| META.json | Both "version" fields (top-level and inside "provides") | The PGXN package manifest. The pgxn.yml workflow uploads this file as part of the source archive; a stale version here means the wrong version appears on pgxn.org. |
| CHANGELOG.md | Rename ## [Unreleased] → ## [x.y.z] — YYYY-MM-DD; add a new empty ## [Unreleased] at the top | Keeps the public changelog accurate and gives downstream users a dated record of changes. |
| ROADMAP.md | Update the preamble's latest-release/current-milestone lines; mark the released milestone done; advance the "We are here" pointer to the next milestone | Keeps the forward-looking plan aligned with reality. Leaves no confusion about what just shipped versus what is next. |
| README.md | Update test-count line (~N unit tests + M E2E tests) if test counts changed significantly | The README is the first thing users read; stale numbers erode trust. |
| INSTALL.md | Update any version numbers in install commands or example URLs | Users copy-paste installation commands; stale versions cause failures. |
| docs/UPGRADING.md | Add the new version-specific migration notes and extend the supported upgrade-path table | Documents exactly what ALTER EXTENSION ... UPDATE will do and which chains are supported. |
| sql/pg_trickle--<old>--<new>.sql | Add or update the hand-authored upgrade script for every SQL-surface change (new objects, changed signatures, changed defaults, view changes). Also carry forward all functions/views/tables added in previous releases — the upgrade script is cumulative. | ALTER EXTENSION ... UPDATE only applies what is explicitly scripted; function defaults and signatures stored in pg_proc do not update themselves. Omitting a function that existed in <old> but is expected in <new> will break user upgrades. |
| sql/archive/pg_trickle--<new>.sql | Regenerate and commit the full-install SQL baseline for the new version. This file was created as a placeholder copy of <prev> at the start of the development cycle — it must be replaced with the actual generated SQL before tagging. Run cargo pgrx schema (or the equivalent just target) to produce the final schema, then overwrite the placeholder. | Future upgrade-completeness checks and upgrade E2E tests need an exact baseline for the released version. A stale placeholder from the start of the cycle will cause spurious failures. |
| .github/workflows/ci.yml, justfile, tests/build_e2e_upgrade_image.sh, tests/Dockerfile.e2e-upgrade | Advance the upgrade-check chain and default upgrade-E2E target version to the new release | Prevents release automation and local upgrade validation from getting stuck on the previous version after a new migration hop is added. |
| pg_trickle.control | No manual edit needed — default_version is set to '@CARGO_VERSION@' and pgrx substitutes it at build time. Verify the substitution in the built artifact. | Ensures the SQL CREATE EXTENSION command installs the right version. |

CRITICAL: After updating sql/pg_trickle--<old>--<new>.sql, always run just check-upgrade-all to verify that the upgrade script is complete. This checks not just the immediate hop to the new version, but the entire upgrade chain from v0.1.3 onwards. If the check fails (e.g. "ERROR: 3 new function(s) missing"), it means the upgrade script is missing one or more SQL objects that users will expect to have after upgrading. Fix all failures before tagging.

Checklist summary

[ ] Cargo.toml — version bumped
[ ] META.json — both "version" fields updated to match
[ ] CHANGELOG.md — [Unreleased] renamed to [x.y.z] with date; new empty [Unreleased] added
[ ] ROADMAP.md — preamble updated; released milestone marked done
[ ] README.md — test counts current (if materially changed)
[ ] INSTALL.md — version references current
[ ] docs/UPGRADING.md — latest migration notes and supported chains added
[ ] sql/pg_trickle--<old>--<new>.sql — covers every SQL-surface change AND carries forward all previous release functions
[ ] sql/archive/pg_trickle--<new>.sql — regenerated from final schema and committed (replaces the dev-cycle placeholder)
[ ] just check-upgrade-all — all upgrade steps pass completeness checks (not just the one-step hop)
[ ] Upgrade automation defaults — CI/local upgrade checks and E2E target the new version
[ ] just check-version-sync — all version references in sync
[ ] All CI checks on main have passed (verify the last run on the version-bump commit succeeded)
[ ] git tag matches Cargo.toml version

Troubleshooting

Release workflow failed

Go to the Actions tab and identify which job failed. Then follow the appropriate recovery path below.

Option A: Re-run (transient failure)

If the failure is transient — network timeout, registry hiccup, runner issue — you can re-run without changing anything:

  1. Open the failed workflow run in the Actions tab
  2. Click Re-run all jobs (or re-run just the failed job)

This works because the v* tag still points to the same commit, and the workflow uses cancel-in-progress: false so a re-run won't be cancelled.

Option B: Fix code and re-tag

If the failure is a real build or code issue:

# 1. Delete the remote tag
git push origin :refs/tags/v0.2.0

# 2. Delete the local tag
git tag -d v0.2.0

# 3. Fix the issue, commit, and push
git add <files>
git commit -m "fix: ..."
git push origin main

# 4. Re-tag on the new commit and push
git tag -a v0.2.0 -m "Release v0.2.0"
git push origin v0.2.0

This triggers a fresh release workflow run.

Option C: Clean up a partial GitHub Release

If the workflow created a draft or partial Release before failing:

  1. Go to Releases in the repository
  2. Delete the broken release (this does not delete the tag)
  3. Then follow Option A or Option B above

Upgrade script completeness check failed

If just check-upgrade-all reports errors like "ERROR: X new function(s) missing from upgrade script", it means the upgrade SQL script is incomplete:

# 1. Look at the error — it tells you exactly what's missing
just check-upgrade-all  # e.g. "ERROR: 3 new function(s) missing from upgrade script:
                        #        - pgtrickle.\"explain_refresh_mode\"
                        #        - pgtrickle.\"fuse_status\"
                        #        - pgtrickle.\"reset_fuse\""

# 2. Find where those objects are defined in the previous release
#    (they should already exist in sql/archive/pg_trickle--<prev>.sql)
grep -n "CREATE.*FUNCTION.*explain_refresh_mode" sql/archive/pg_trickle--*.sql

# 3. Copy the function definitions (CREATE OR REPLACE FUNCTION) to the
#    upgrade script you're fixing. They should go into:
#    sql/pg_trickle--<old>--<new>.sql
#    
#    Typically, carry-forward functions are grouped in their own section
#    at the top of the upgrade script with a comment explaining they're
#    from a prior release.

# 4. Re-run the check to verify it passes
just check-upgrade-all

Why this happens: When a new release (e.g. v0.11.0) adds SQL functions, those functions must be explicitly included in all subsequent upgrade scripts. The upgrade script is the ground truth — PostgreSQL only applies what is listed in the .sql file. If you skip a function that users expect, their upgraded extension will be missing that object.

Common failure causes

| Symptom | Cause | Fix |
|---|---|---|
| Version mismatch error | Cargo.toml version doesn't match the git tag | Run just check-version-sync, fix any skew, commit, delete tag, re-tag (Option B) |
| Build failure | Compilation error in release profile | Fix on main, re-tag (Option B) |
| Docker push failed | Missing permissions | Verify packages: write is in the workflow and GITHUB_TOKEN has GHCR access, then re-run (Option A) |
| Smoke test failed | Extension doesn't load in PostgreSQL | Fix the issue, re-tag (Option B) |
| PGXN upload failed | Missing PGXN_USERNAME / PGXN_PASSWORD secrets, or META.json version not updated | Add the secrets in repository settings; verify META.json version matches the tag; re-run the pgxn.yml workflow from the Actions tab |
| just check-upgrade-all reports missing functions/views | Upgrade script is incomplete — new objects from prior releases not carried forward | See "Upgrade script completeness check failed" above for recovery steps |
| Rate limited | GitHub API or GHCR throttling | Wait a few minutes, then re-run (Option A) |

Yanking a release

If a release has a critical issue:

  1. Mark it as pre-release on the GitHub Releases page (uncheck "Set as the latest release")
  2. Add a warning to the release notes
  3. Publish a patch release with the fix

Security Policy

Supported Versions

| Version | Supported |
|---|---|
| 0.13.x (current pre-release) | ✅ |

During pre-1.0 development, only the latest minor version receives security fixes. Once v1.0.0 is released, the two most recent minor versions will receive security fixes.

Reporting a Vulnerability

Please do not report security vulnerabilities via public GitHub Issues.

Use GitHub's built-in private vulnerability reporting:

  1. Go to the Security tab of this repository
  2. Click "Report a vulnerability"
  3. Fill in the details — affected version, description, reproduction steps, and potential impact

We aim to acknowledge reports within 48 hours and provide a fix or mitigation within 14 days for critical issues.

What to Include

A useful report includes:

  • PostgreSQL version and pg_trickle version
  • Minimal reproduction SQL or Rust code
  • Description of the unintended behaviour and its security impact
  • Whether the vulnerability requires a trusted (superuser) or untrusted role to trigger

Scope

In-scope:

  • SQL injection or privilege escalation via pgtrickle.* functions
  • Memory safety issues in the Rust extension code (buffer overflows, use-after-free, etc.)
  • Denial-of-service caused by a low-privilege user triggering runaway resource usage
  • Information disclosure through change buffers (pgtrickle_changes.*) or monitoring views

Out-of-scope:

Disclosure Policy

We follow coordinated disclosure. Once a fix is released we will publish a security advisory on GitHub with a CVE if applicable.

pg_trickle vs. DBSP: Similarities and Differences

What They Share (Conceptual Foundation)

pg_trickle explicitly cites DBSP as its theoretical foundation (see PRIOR_ART.md). The key overlap:

| Concept | DBSP (paper) | pg_trickle (implementation) |
|---|---|---|
| Z-set / delta model | Rows annotated with weights (+1/−1) in an abelian group | __pgt_action = 'I'/'D' column on every delta row — effectively Z-sets restricted to {+1, −1} |
| Per-operator differentiation | Recursive Algorithm 4.6: Q^Δ = D ∘ Q ∘ I, decomposed per-operator via the chain rule (Q₁ ∘ Q₂)^Δ = Q₁^Δ ∘ Q₂^Δ | DiffContext::diff_node() walks the OpTree and calls per-operator differentiators (scan, filter, project, join, aggregate, distinct, union, etc.) — same recursive structural decomposition |
| Linear operators are self-incremental | Theorem 3.3: for LTI operator Q, Q^Δ = Q | Filter and Project pass deltas through unchanged (just apply predicate/projection to the delta stream) |
| Bilinear join rule | Theorem 3.4: Δ(a × b) = Δa × Δb + a × Δb + Δa × b | diff_inner_join generates exactly 3 UNION ALL parts: (delta_left ⋈ current_right), (current_left ⋈ delta_right), and optionally (delta_left ⋈ delta_right) |
| Aggregate auxiliary counters | §4.2: counting algorithm for maintaining aggregates with deletions | __pgt_count auxiliary column, LEFT JOIN back to stream table to read old counts and compute new counts |
| Recursive queries | §6: fixed-point iteration with z⁻¹ delay operator, semi-naive evaluation | diff_recursive_cte uses recomputation-diff (DRed-style), not DBSP's native fixed-point circuit |

Key Differences

1. Execution model — standalone engine vs. embedded in PostgreSQL

DBSP is a standalone streaming runtime (Rust library, now Feldera). It compiles query plans into dataflow graphs that maintain in-memory state and process continuous micro-batches. Operators are long-lived stateful actors with their own memory.

pg_trickle is an extension inside PostgreSQL. It has no persistent dataflow graph. On each refresh, it generates a single SQL query (CTE chain) that PostgreSQL's own planner/executor evaluates. After execution, no operator state persists — auxiliary state lives in the stream table itself (__pgt_count columns) and change buffer tables.

2. Streams vs. periodic batches

DBSP operates on true infinite streams indexed by logical time t ∈ ℕ. Each "step" processes one micro-batch of changes, and operators carry integration state (I operator = running sum from t=0).

pg_trickle operates in discrete refresh cycles triggered by a lag-based scheduler. There is no integration operator — the "current state" is just the stream table's contents, and changes are consumed from CDC buffer tables between LSN boundaries. Each refresh is a self-contained transaction.

3. Z-set weights vs. binary actions

DBSP uses integer weights in ℤ — rows can have weights > 1 (bags) or < −1 (multiple deletions). This enables correct multiset semantics and composable group algebra.

pg_trickle uses binary actions ('I' insert, 'D' delete, sometimes 'U' update). It doesn't maintain true Z-set weights. For aggregates, the __pgt_count auxiliary column serves a similar purpose but is specific to the aggregate operator — it's not a general weight propagated through the operator tree.
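The contrast between the two representations can be sketched in a few lines (purely illustrative; pg_trickle's actual change format lives in its CDC buffer tables):

```python
# Toy contrast between Z-set weights and binary actions.
# Illustrative only; not pg_trickle's real change representation.

# Z-set: two inserts of the same row carry a single entry with weight 2 ...
zset_delta = {("widget", 9.99): 2}

# ... whereas binary actions record each change as a separate (action, row) event.
action_delta = [("I", ("widget", 9.99)), ("I", ("widget", 9.99))]

# Applying one delete leaves weight 1 in the Z-set model,
zset_delta[("widget", 9.99)] += -1
assert zset_delta[("widget", 9.99)] == 1

# whereas with binary actions the consumer must net out matching events itself.
net = sum(1 if act == "I" else -1
          for act, row in action_delta + [("D", ("widget", 9.99))])
assert net == 1
```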

4. Integration operator (I)

DBSP: The integration operator I(s)[t] = Σᵢ≤ₜ s[i] is an explicit first-class circuit element. It maintains running sums of changes and is the key mechanism for computing incremental joins (z⁻¹(I(a)) = "accumulated left side up to previous step").

pg_trickle: No explicit integration. The equivalent of I is just "read the current contents of the source/stream table." Join differentiation directly reads the current snapshot of the non-delta side (build_snapshot_sql() generates FROM "public"."orders" r), which implicitly includes all historical changes.
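The equivalence is easy to picture: integrating a stream of micro-batches reproduces exactly what scanning the current table would return. A sketch (not pg_trickle code):

```python
# Sketch: DBSP's integration operator I as a running sum over micro-batches,
# versus "just read the current table", which pg_trickle relies on instead.

def integrate(stream):
    """I(s)[t] = sum of s[0..t]; returns the accumulated Z-set at each step."""
    acc, out = {}, []
    for batch in stream:
        for row, w in batch.items():
            acc[row] = acc.get(row, 0) + w
            if acc[row] == 0:
                del acc[row]
        out.append(dict(acc))
    return out

# Three micro-batches of changes to an 'orders' relation.
stream = [{("o1",): 1}, {("o2",): 1}, {("o1",): -1}]

states = integrate(stream)
# The final accumulated state equals the table's current contents after
# applying all changes, which is why pg_trickle needs no explicit I operator.
assert states[-1] == {("o2",): 1}
```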

5. Recursion

DBSP: Native fixed-point circuits with z⁻¹ delay. Can incrementally maintain recursive queries (e.g., transitive closure) by iterating only on new changes within each step — semi-naive evaluation generalized to arbitrary recursion.

pg_trickle: Uses recomputation-diff for recursive CTEs — re-executes the full recursive query and anti-joins against current storage to compute the delta. This is correct but not truly incremental for the recursive part.
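A sketch of recomputation-diff on a transitive-closure query (illustrative only; the real logic lives in diff_recursive_cte):

```python
# Sketch of the recomputation-diff strategy for a recursive query:
# re-run the whole fixpoint, then diff against stored results.
# Illustrative; mirrors the idea behind diff_recursive_cte, not its code.

def transitive_closure(edges):
    """Naive fixpoint: keep joining until no new pairs appear."""
    closure = set(edges)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in edges if b == c}
        if new <= closure:
            return closure
        closure |= new

stored = transitive_closure({(1, 2), (2, 3)})           # current stream-table contents
fresh  = transitive_closure({(1, 2), (2, 3), (3, 4)})   # after inserting edge (3, 4)

inserts = fresh - stored    # the "anti-join" of new results against storage
deletes = stored - fresh
assert inserts == {(3, 4), (2, 4), (1, 4)}
assert deletes == set()
```

The cost is the full fixpoint on every refresh; DBSP's native circuits would instead iterate only on the new edge.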

6. Correctness guarantees

DBSP: Proven correct in Lean. All theorems are machine-checked. The chain rule, cycle rule, and bilinear decomposition are formally verified.

pg_trickle: Verified empirically via property-based tests (the assert_invariant checks that Contents(ST) = Q(DB) after each mutation cycle). No formal proof, but the per-operator rules are direct translations of DBSP's rules.

7. Scope

DBSP: A general-purpose theory and streaming engine. Handles nested relations, streaming aggregation over windows, arbitrary compositions. The Feldera implementation supports a full SQL frontend.

pg_trickle: Focused on materialized views inside PostgreSQL. Supports a specific subset of SQL (scan, filter, project, inner/left/full join, aggregates, DISTINCT, UNION ALL, INTERSECT, EXCEPT, CTEs, window functions, lateral joins). It is not a general streaming engine — it leverages PostgreSQL's own query planner and executor.


Summary

pg_trickle applies DBSP's differentiation rules to generate delta queries, but it is not a DBSP implementation. It borrows the mathematical framework (per-operator differentiation, Z-set-like deltas, bilinear join decomposition) while making fundamentally different architectural choices: embedded in PostgreSQL, no persistent dataflow state, periodic batch execution, and PostgreSQL's planner as the optimizer. Think of it as "DBSP's differentiation algebra, compiled down to SQL CTEs and executed by PostgreSQL."

Prior Art

This document lists the academic papers, PostgreSQL commits, open-source tools, and standard algorithms whose techniques are reused in pg_trickle.

Maintaining this record serves two purposes:

  1. Attribution — credit the research and engineering work this project builds upon.
  2. Independent derivation — demonstrate that every core technique predates and is independent of any single vendor's commercial product.

Differential View Maintenance (DVM)

DBSP — Automatic Incremental View Maintenance

Budiu, M., Ryzhyk, L., McSherry, F., & Tannen, V. (2023). "DBSP: Automatic Incremental View Maintenance for Rich Query Languages." Proceedings of the VLDB Endowment (PVLDB), 16(7), 1601–1614. https://arxiv.org/abs/2203.16684

The Z-set abstraction (rows annotated with integer multiplicities; pg_trickle restricts these to +1/−1) is the theoretical foundation for the __pgt_action column produced by the delta operators in src/dvm/operators/. The per-operator differentiation rules (scan, filter, project, join, aggregate, union) are direct applications of the incrementalization operator (Q^Δ) described in this paper.

See DBSP_COMPARISON.md for a detailed comparison of pg_trickle's architecture with the DBSP model.

Gupta & Mumick — Materialized Views Survey

Gupta, A. & Mumick, I.S. (1995). "Maintenance of Materialized Views: Problems, Techniques, and Applications." IEEE Data Engineering Bulletin, 18(2), 3–18.

Gupta, A. & Mumick, I.S. (1999). Materialized Views: Techniques, Implementations, and Applications. MIT Press. ISBN 978-0-262-57122-7.

The per-operator differentiation rules in src/dvm/operators/ follow the derivation given in section 3 of the 1995 survey. The counting algorithm for maintaining aggregates with deletions uses the approach described in the MIT Press book.
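The counting idea can be sketched in a few lines. This is a toy model of the mechanism behind the __pgt_count column, not pg_trickle's SQL:

```python
# Sketch of the counting algorithm for maintaining an aggregate under
# deletions: keep a per-group support count so a group can be dropped
# exactly when its count reaches zero. Illustrative only.

def apply_changes(agg, changes):
    """agg maps group -> (sum, count); changes are (action, group, value)."""
    for action, group, value in changes:
        s, c = agg.get(group, (0, 0))
        if action == "I":
            agg[group] = (s + value, c + 1)
        else:  # "D"
            s, c = s - value, c - 1
            if c == 0:
                del agg[group]        # count hit zero: the group disappears
            else:
                agg[group] = (s, c)
    return agg

agg = apply_changes({}, [("I", "east", 10), ("I", "east", 5), ("I", "west", 7)])
agg = apply_changes(agg, [("D", "east", 5), ("D", "west", 7)])
assert agg == {"east": (10, 1)}   # 'west' dropped when its count reached zero
```

Without the count, a delete could not distinguish "group shrank" from "group vanished", which is exactly the problem the counting algorithm solves.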

DBToaster — Higher-order Delta Processing

Koch, C., Ahmad, Y., Kennedy, O., Nikolic, M., Nötzli, A., Olteanu, D., & Zavodny, J. (2014). "DBToaster: Higher-order Delta Processing for Dynamic, Frequently Fresh Views." The VLDB Journal, 23(2), 253–278. https://doi.org/10.1007/s00778-013-0348-4

Inspiration for the recursive delta compilation strategy where the delta of a complex query is itself a query that can be differentiated.

DRed — Deletion and Re-derivation

Gupta, A., Mumick, I.S., & Subrahmanian, V.S. (1993). "Maintaining Views Incrementally." Proceedings of the 1993 ACM SIGMOD International Conference, 157–166.

The DRed algorithm for handling deletions in recursive views is the basis for the recursive CTE differential refresh strategy in src/dvm/operators/recursive_cte.rs.


Scheduling

Earliest-Deadline-First (EDF)

Liu, C.L. & Layland, J.W. (1973). "Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment." Journal of the ACM, 20(1), 46–61. https://doi.org/10.1145/321738.321743

The schedule-based scheduling in src/scheduler.rs applies the classic EDF principle: the stream table whose freshness deadline expires soonest is refreshed first. EDF is optimal for uniprocessor preemptive scheduling and is a standard technique in operating systems and real-time databases.
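The EDF rule itself is tiny: sort by deadline, where deadline = last refresh time plus the freshness period. A toy sketch (src/scheduler.rs is the real implementation; the names here are illustrative):

```python
# Minimal EDF sketch: refresh the stream table whose freshness deadline
# expires soonest. Illustrative only.

def edf_order(tables):
    """tables: (name, last_refresh, period_seconds); earliest deadline first."""
    return [name for _, name in sorted((last + period, name)
                                       for name, last, period in tables)]

tables = [("hourly_rollup", 0, 3600),
          ("active_orders", 0, 30),
          ("daily_totals", 0, 86400)]
assert edf_order(tables) == ["active_orders", "hourly_rollup", "daily_totals"]
```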

Topological Sort — Kahn's Algorithm

Kahn, A.B. (1962). "Topological sorting of large networks." Communications of the ACM, 5(11), 558–562. https://doi.org/10.1145/368996.369025

The dependency DAG in src/dag.rs uses Kahn's algorithm for topological ordering and cycle detection. This is standard computer science curriculum and appears in every major algorithms textbook (Cormen et al., Sedgewick, Kleinberg & Tardos).
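For reference, Kahn's algorithm in a few lines of Python (a generic sketch of the algorithm, not the src/dag.rs code):

```python
# Kahn's algorithm: topological order plus cycle detection for a dependency DAG.
from collections import deque

def kahn(nodes, edges):
    """edges: (upstream, downstream) pairs. Returns a topo order; raises on a cycle."""
    indeg = {n: 0 for n in nodes}
    adj = {n: [] for n in nodes}
    for u, v in edges:
        adj[u].append(v)
        indeg[v] += 1
    queue = deque(n for n in nodes if indeg[n] == 0)
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for m in adj[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                queue.append(m)
    if len(order) != len(nodes):
        raise ValueError("cycle detected")   # leftover nodes all sit on a cycle
    return order

order = kahn(["orders", "active_orders", "order_stats"],
             [("orders", "active_orders"), ("active_orders", "order_stats")])
assert order == ["orders", "active_orders", "order_stats"]
```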


Change Data Capture (CDC)

PostgreSQL Row-Level Triggers

Row-level AFTER INSERT/UPDATE/DELETE triggers have been available in PostgreSQL since version 6.x (late 1990s). The trigger-based change capture pattern used in src/cdc.rs is a well-established PostgreSQL technique:

  • PostgreSQL documentation: CREATE TRIGGER — trigger-based CDC has been a standard pattern for decades.
  • PostgreSQL wiki: "Trigger-based Change Data Capture in PostgreSQL."

Debezium

Debezium project (Red Hat, open source since 2016). https://debezium.io/

Debezium implements trigger-based and WAL-based CDC for PostgreSQL and other databases. The change buffer table pattern (pg_trickle_changes.changes_<oid>) follows a similar approach, modified for single-process consumption within the PostgreSQL backend.

pgaudit

pgaudit extension (2015). https://github.com/pgaudit/pgaudit

Captures DML via AFTER row-level triggers for audit logging, demonstrating the same trigger-based change-capture technique in production since 2015.


Materialized View Refresh

PostgreSQL REFRESH MATERIALIZED VIEW CONCURRENTLY

PostgreSQL 9.4 (December 2014, commit 96ef3b8). src/backend/commands/matview.c

The snapshot-diff strategy used for recomputation-diff refreshes (where the full query is re-executed and anti-joined against current storage to compute inserts and deletes) mirrors the algorithm implemented in PostgreSQL's REFRESH MATERIALIZED VIEW CONCURRENTLY. This PostgreSQL feature predates all relevant patents and is publicly documented.
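The snapshot-diff reduces to two set differences. A sketch over in-memory sets (illustrative; the real versions express this as anti-joins in SQL):

```python
# Sketch of the snapshot-diff used for recomputation-diff refreshes:
# re-run the full query, then anti-join both ways against current storage.

def snapshot_diff(current, fresh):
    inserts = fresh - current    # rows in the new result but missing from storage
    deletes = current - fresh    # rows in storage no longer in the result
    return inserts, deletes

current = {(1, "active"), (2, "active")}
fresh   = {(2, "active"), (3, "active")}   # row 1 left the result, row 3 arrived

inserts, deletes = snapshot_diff(current, fresh)
assert inserts == {(3, "active")}
assert deletes == {(1, "active")}
```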

SQL MERGE Statement

ISO/IEC 9075:2003 (SQL:2003 standard) — MERGE statement. PostgreSQL 15 (October 2022, commit 7103eba).

The MERGE-based delta application in src/refresh.rs uses the ISO-standard MERGE statement, independently implemented by Oracle, SQL Server, DB2, and PostgreSQL. This is not derived from any vendor-specific implementation.
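The effect of the delta application can be pictured with a toy in-memory analogue of the SQL MERGE (names and the I/U/D action encoding here are illustrative, echoing the actions described earlier):

```python
# Sketch of MERGE-style delta application: each delta row carries an
# action, and one pass merges it into keyed storage. Illustrative only;
# the real refresh emits a SQL MERGE statement instead.

def merge_apply(table, delta):
    """table: key -> row; delta: (action, key, row) triples."""
    for action, key, row in delta:
        if action == "D":
            table.pop(key, None)     # WHEN MATCHED ... THEN DELETE
        else:  # "I" or "U"
            table[key] = row         # WHEN MATCHED UPDATE / WHEN NOT MATCHED INSERT
    return table

table = {1: ("east", 100)}
delta = [("U", 1, ("east", 150)), ("I", 2, ("west", 40)), ("D", 1, None)]
assert merge_apply(table, delta) == {2: ("west", 40)}
```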


General Database Theory

Relational Algebra

Codd, E.F. (1970). "A Relational Model of Data for Large Shared Data Banks." Communications of the ACM, 13(6), 377–387.

The operator tree in src/dvm/parser.rs models standard relational algebra operators (select, project, join, aggregate, union). These are foundational database theory from 1970.

Semi-Naive Evaluation

Bancilhon, F. & Ramakrishnan, R. (1986). "An Amateur's Introduction to Recursive Query Processing Strategies." Proceedings ACM SIGMOD, 16–52.

General background for recursive CTE evaluation strategies. PostgreSQL's own WITH RECURSIVE implementation uses iterative fixpoint evaluation based on these principles.


This document is maintained for attribution and independent-derivation documentation purposes. It does not constitute legal advice.

Custom SQL Syntax for PostgreSQL Extensions

Comprehensive Technical Research Report

Date: 2026-02-25
Context: pg_trickle extension — evaluating approaches to support CREATE STREAM TABLE syntax or equivalent native-feeling DDL.


Table of Contents

  1. Executive Summary
  2. PostgreSQL Parser Hooks / Utility Hooks
  3. The ProcessUtility_hook Approach
  4. Raw Parser Extension (gram.y)
  5. The Utility Command Approach
  6. Custom Access Methods (CREATE ACCESS METHOD)
  7. Table Access Method API (PostgreSQL 12+)
  8. Foreign Data Wrapper Approach
  9. Event Triggers
  10. TimescaleDB Continuous Aggregates Pattern
  11. Citus Distributed DDL Pattern
  12. PostgreSQL 18 New Features
  13. COMMENT / OPTIONS Abuse Pattern
  14. pg_ivm (Incremental View Maintenance) Pattern
  15. CREATE TABLE ... USING (Table Access Methods) Deep Dive
  16. Comparison Matrix
  17. Recommendations for pg_trickle

1. Executive Summary

PostgreSQL's parser is not extensible — there is no parser hook that allows extensions to add new grammar rules. This is a fundamental design constraint. Every approach to "custom DDL syntax" in extensions falls into one of two categories:

  1. Intercept existing syntax — Use ProcessUtility_hook or event triggers to intercept standard DDL (e.g., CREATE TABLE, CREATE VIEW) and augment its behavior.
  2. Use a SQL function as the DDL interface — Define SELECT my_extension.create_thing(...) as the user-facing API (this is what pg_trickle currently does).

No production PostgreSQL extension ships truly new SQL grammar without forking the PostgreSQL parser. TimescaleDB, Citus, pg_ivm, and others all work within existing syntax boundaries.


2. PostgreSQL Parser Hooks / Utility Hooks

Available Hook Points

PostgreSQL provides several hook function pointers that extensions can override in _PG_init():

  • ProcessUtility_hook (tcop/utility.h) — Intercept utility (DDL) statement execution
  • post_parse_analyze_hook (parser/analyze.h) — Inspect/modify the analyzed parse tree after semantic analysis
  • planner_hook (optimizer/planner.h) — Replace or augment the query planner
  • ExecutorStart_hook (executor/executor.h) — Intercept executor startup
  • ExecutorRun_hook (executor/executor.h) — Intercept executor row processing
  • ExecutorFinish_hook (executor/executor.h) — Intercept executor finish
  • ExecutorEnd_hook (executor/executor.h) — Intercept executor cleanup
  • object_access_hook (catalog/objectaccess.h) — Notifications when objects are created/modified/dropped
  • emit_log_hook (utils/elog.h) — Intercept log messages

What's Missing: No Parser Hook

There is no parser_hook or raw_parser_hook. The raw parser (the Bison grammar in gram.y plus the Flex lexer in scan.l) is compiled into the PostgreSQL server binary. Extensions cannot:

  • Add new keywords (e.g., STREAM)
  • Add new grammar productions (e.g., CREATE STREAM TABLE)
  • Modify the tokenizer/lexer
  • Intercept raw SQL text before parsing

The closest hook is post_parse_analyze_hook, which fires after the SQL has already been parsed and analyzed. By this point:

  • The SQL string has already been tokenized and parsed by gram.y
  • A parse tree (Query node) has been produced
  • If the SQL contains unknown syntax, a syntax error has already been raised

Technical Details of post_parse_analyze_hook

/* In src/backend/parser/analyze.c */
typedef void (*post_parse_analyze_hook_type)(ParseState *pstate,
                                             Query *query,
                                             JumbleState *jstate);
post_parse_analyze_hook_type post_parse_analyze_hook = NULL;

Extensions can set this in _PG_init():

static post_parse_analyze_hook_type prev_post_parse_analyze_hook = NULL;

static void my_post_parse_analyze(ParseState *pstate, Query *query,
                                  JumbleState *jstate) {
    /* Chain first so other extensions' hooks still run */
    if (prev_post_parse_analyze_hook)
        prev_post_parse_analyze_hook(pstate, query, jstate);
    /* inspect or rewrite the analyzed Query here */
}

void _PG_init(void) {
    prev_post_parse_analyze_hook = post_parse_analyze_hook;
    post_parse_analyze_hook = my_post_parse_analyze;
}

Use cases: Query rewriting after parsing (e.g., adding security predicates, row-level security), statistics collection, plan caching invalidation. Not usable for new syntax because parsing has already completed.

Pros/Cons

  • Native syntax — Impossible; cannot add new grammar
  • Intercept existing DDL — Yes, via ProcessUtility_hook
  • Modify parsed queries — Yes, via post_parse_analyze_hook
  • Complexity — Low for hooking, but limited in capability
  • PG version — All modern versions (hooks stable since PG 9.x)
  • Maintenance — Very low; hook signatures rarely change

3. The ProcessUtility_hook Approach

How It Works

ProcessUtility_hook is the most powerful DDL interception point. It fires for every "utility statement" (DDL, COPY, EXPLAIN, etc.) after parsing but before execution.

typedef void (*ProcessUtility_hook_type)(PlannedStmt *pstmt,
                                         const char *queryString,
                                         bool readOnlyTree,
                                         ProcessUtilityContext context,
                                         ParamListInfo params,
                                         QueryEnvironment *queryEnv,
                                         DestReceiver *dest,
                                         QueryCompletion *qc);

An extension can:

  1. Inspect the parse tree node — The PlannedStmt->utilityStmt field contains the parsed DDL node (e.g., CreateStmt, AlterTableStmt, ViewStmt).
  2. Modify the parse tree — Change fields before passing to the standard handler.
  3. Replace execution entirely — Skip calling the standard handler and do something else.
  4. Post-process — Call the standard handler first, then do additional work.
  5. Block execution — Raise an error to prevent the DDL.

What Extensions Use This

  • TimescaleDB — intercepts CREATE TABLE, ALTER TABLE, DROP TABLE, CREATE INDEX, etc. to convert regular tables to hypertables and distribute DDL
  • Citus — intercepts most DDL statements to propagate DDL to worker nodes
  • pg_partman — intercepts CREATE TABLE and partition DDL to auto-manage partitioning
  • pg_stat_statements — intercepts all utility statements to track DDL execution statistics
  • pgAudit — intercepts all utility statements for audit logging
  • pg_hint_plan — uses post_parse_analyze_hook instead
  • sepgsql — intercepts object creation/modification for security label enforcement

Can It Handle New Syntax?

No. It can only intercept DDL that PostgreSQL's parser already understands. You cannot use ProcessUtility_hook to handle CREATE STREAM TABLE because the parser will reject that syntax before the hook is ever called.

However, it can intercept and augment existing syntax:

  • CREATE TABLE ... (some_option) → Intercept CreateStmt, check for special markers, do extra work
  • CREATE VIEW ... WITH (custom_option = true) → Intercept ViewStmt, check reloptions
  • CREATE MATERIALIZED VIEW ... WITH (custom = true) → Same approach

Pattern: Intercepting CREATE TABLE

static void my_process_utility(PlannedStmt *pstmt, ...) {
    Node *parsetree = pstmt->utilityStmt;

    if (IsA(parsetree, CreateStmt)) {
        CreateStmt *stmt = (CreateStmt *) parsetree;
        // Check for a special reloption or table name pattern
        ListCell *lc;
        foreach(lc, stmt->options) {
            DefElem *opt = (DefElem *) lfirst(lc);
            if (strcmp(opt->defname, "stream") == 0) {
                // This is a stream table! Do custom logic.
                create_stream_table_from_ddl(stmt, queryString);
                return; // Don't call standard handler
            }
        }
    }

    // Pass through to standard handler
    if (prev_ProcessUtility)
        prev_ProcessUtility(pstmt, ...);
    else
        standard_ProcessUtility(pstmt, ...);
}

Pros/Cons

  • Native CREATE STREAM TABLE — No; the parser rejects unknown syntax
  • CREATE TABLE ... WITH (stream=true) — Yes; feasible via reloptions
  • Complexity — Medium; must carefully chain with other extensions
  • PG version — All modern versions
  • Maintenance — Low; the hook signature changes rarely (changed in PG14 and PG15)
  • Risk — Must always chain prev_ProcessUtility; a misbehaving hook can break other extensions

4. Raw Parser Extension (gram.y)

How It Works

PostgreSQL's SQL parser is a Bison-generated LALR(1) parser defined in:

  • src/backend/parser/gram.y — Grammar rules (~18,000 lines)
  • src/backend/parser/scan.l — Flex lexer (tokenizer)
  • src/include/parser/kwlist.h — Reserved/unreserved keyword list

To add CREATE STREAM TABLE, you would:

  1. Add STREAM to the keyword list (unreserved or reserved)
  2. Add grammar rules to gram.y:
    CreateStreamTableStmt:
        CREATE STREAM TABLE qualified_name '(' OptTableElementList ')'
        OptWith AS SelectStmt
        {
            CreateStreamTableStmt *n = makeNode(CreateStreamTableStmt);
            n->relation = $4;
            n->query = $10;
            /* ... */
            $$ = (Node *) n;
        }
    ;
    
  3. Add a new NodeTag for CreateStreamTableStmt
  4. Handle it in ProcessUtility
  5. Rebuild the PostgreSQL server

Implications

This requires forking PostgreSQL. The modified parser is compiled into postgres binary. You cannot ship a grammar modification as a loadable extension (.so/.dylib).

Who Does This?

  • YugabyteDB — Fork of PG with custom grammar for distributed features
  • CockroachDB — Entirely custom parser (Go, not PG's Bison grammar)
  • Amazon Aurora (partially) — Custom grammar additions for Aurora-specific features
  • Greenplum — Fork of PG with added grammar for DISTRIBUTED BY, PARTITION BY etc.
  • ParadeDB — Fork of PG with some custom syntax additions

Pros/Cons

  • Native CREATE STREAM TABLE — Yes; full parser-level support
  • Complexity — Very high; must maintain a PG fork
  • PG version — Tied to a single PG version
  • Maintenance — Extremely high; must rebase on every PG release (gram.y changes significantly between major versions)
  • Distribution — Cannot use CREATE EXTENSION; must ship the entire modified PostgreSQL
  • User adoption — Very low; users must replace their PostgreSQL installation
  • psql autocomplete — Would work with matching psql modifications
  • pg_dump/pg_restore — Broken unless you also modify those tools

Verdict: Not viable for an extension. Only viable for a PostgreSQL fork/distribution.


5. The Utility Command Approach

How It Works

Some sources reference a "custom utility command" mechanism. In practice, this does not exist as a formal PostgreSQL extension point. What people sometimes mean is one of:

5a. Using DO Blocks as Custom Commands

DO $$ BEGIN PERFORM pgtrickle.create_stream_table('my_st', 'SELECT ...'); END $$;

This is just a wrapped function call — not a real custom command.

5b. Abusing COMMENT or SET for Command Dispatch

Some extensions parse custom commands from strings:

-- Using SET to pass commands
SET myext.command = 'CREATE STREAM TABLE my_st AS SELECT ...';
SELECT myext.execute_pending_command();

Or using post_parse_analyze_hook to intercept a specially-formatted query:

-- Extension intercepts this via post_parse_analyze_hook
SELECT * FROM myext.dispatch('CREATE STREAM TABLE ...');

5c. Overloading Existing Syntax

Some extensions overload SELECT or CALL:

CALL pgtrickle.create_stream_table('my_st', $$SELECT ...$$);

CALL was introduced in PostgreSQL 11 for stored procedures. Using it makes the DDL feel more "command-like" than SELECT function().

Pros/Cons

  • Native syntax — No; still a function call in disguise
  • User experience — Moderate; CALL reads better than SELECT
  • Complexity — Low
  • PG version — PG 11+ for CALL
  • Maintenance — Very low

6. Custom Access Methods (CREATE ACCESS METHOD)

How It Works

PostgreSQL supports extension-defined access methods (index AMs and table AMs):

CREATE ACCESS METHOD my_am TYPE TABLE HANDLER my_am_handler;

This was introduced in PostgreSQL 9.6 for index AMs and extended to table AMs in PostgreSQL 12. The CREATE ACCESS METHOD statement shows PostgreSQL's philosophy: extensions can define new implementations of existing concepts (tables, indexes) but not new concepts (stream tables).

Table AM vs. Index AM

  • Index AM — since PG 9.6; IndexAmRoutine with scan/insert/delete callbacks (examples: bloom, BRIN, GiST)
  • Table AM — since PG 12; TableAmRoutine with 60+ callbacks (examples: heap (default), columnar (Citus), zedstore (experimental))

Can We Use This for Stream Tables?

The table AM API defines how tuples are stored and retrieved, not how tables are created or maintained. A stream table's key features are:

  • Defining query — Not part of the table AM concept
  • Automatic refresh — Not part of the table AM concept
  • Change tracking — Could partially overlap with table AM's tuple modification callbacks
  • Storage — The actual storage could use heap (default) AM

You could theoretically create a custom table AM that:

  1. Uses heap storage underneath
  2. Intercepts INSERT/UPDATE/DELETE to maintain change buffers
  3. Adds custom metadata

But this would be an extreme abuse of the API. Table AMs are meant for storage engines, not for implementing materialized view semantics.

Pros/Cons

  • Native syntax — No; CREATE TABLE ... USING my_am is the closest
  • Complexity — Extremely high; 60+ callbacks to implement
  • Fitness — Poor; the table AM is about storage, not view maintenance
  • PG version — PG 12+
  • Maintenance — High; the AM API evolves between major versions

7. Table Access Method API (PostgreSQL 12+)

Deep Technical Details

The Table Access Method (AM) API was introduced in PostgreSQL 12 via commit c2fe139c20 by Andres Freund. It abstracts the storage layer, allowing extensions to replace the default heap storage with custom implementations.

The CREATE TABLE ... USING Syntax

-- Use default AM (heap)
CREATE TABLE normal_table (id int, data text);

-- Use custom AM
CREATE TABLE my_table (id int, data text) USING my_custom_am;

-- Set default for a database
SET default_table_access_method = 'my_custom_am';

TableAmRoutine Structure

The handler function must return a TableAmRoutine struct with callbacks:

typedef struct TableAmRoutine {
    NodeTag type;

    /* Slot callbacks */
    const TupleTableSlotOps *(*slot_callbacks)(Relation rel);

    /* Scan callbacks */
    TableScanDesc (*scan_begin)(Relation rel, Snapshot snap, int nkeys, ...);
    void (*scan_end)(TableScanDesc scan);
    void (*scan_rescan)(TableScanDesc scan, ...);
    bool (*scan_getnextslot)(TableScanDesc scan, ...);

    /* Parallel scan */
    Size (*parallelscan_estimate)(Relation rel);
    Size (*parallelscan_initialize)(Relation rel, ...);
    void (*parallelscan_reinitialize)(Relation rel, ...);

    /* Index fetch */
    IndexFetchTableData *(*index_fetch_begin)(Relation rel);
    void (*index_fetch_reset)(IndexFetchTableData *data);
    void (*index_fetch_end)(IndexFetchTableData *data);
    bool (*index_fetch_tuple)(IndexFetchTableData *data, ...);

    /* Tuple modification */
    void (*tuple_insert)(Relation rel, TupleTableSlot *slot, ...);
    void (*tuple_insert_speculative)(Relation rel, ...);
    void (*tuple_complete_speculative)(Relation rel, ...);
    void (*multi_insert)(Relation rel, TupleTableSlot **slots, int nslots, ...);
    TM_Result (*tuple_delete)(Relation rel, ItemPointer tid, ...);
    TM_Result (*tuple_update)(Relation rel, ItemPointer otid, ...);
    TM_Result (*tuple_lock)(Relation rel, ItemPointer tid, ...);

    /* DDL callbacks */
    void (*relation_set_new_filelocator)(Relation rel, ...);
    void (*relation_nontransactional_truncate)(Relation rel);
    void (*relation_copy_data)(Relation rel, const RelFileLocator *newrlocator);
    void (*relation_copy_for_cluster)(Relation rel, ...);
    void (*relation_vacuum)(Relation rel, VacuumParams *params, ...);
    bool (*scan_analyze_next_block)(TableScanDesc scan, ...);
    bool (*scan_analyze_next_tuple)(TableScanDesc scan, ...);

    /* Planner support */
    void (*relation_estimate_size)(Relation rel, int32 *attr_widths, ...);

    /* ... more callbacks */
} TableAmRoutine;

Hybrid Approach: Table AM + ProcessUtility_hook

A more practical pattern:

  1. Register a custom table AM (e.g., stream_am) that wraps heap
  2. Use ProcessUtility_hook to intercept CREATE TABLE ... USING stream_am
  3. When detected, perform stream table registration (catalog, CDC, etc.)
  4. The actual storage uses standard heap via delegation

-- User writes:
CREATE TABLE order_totals (region text, total numeric)
    USING stream_am
    WITH (query = 'SELECT region, SUM(amount) FROM orders GROUP BY region',
          schedule = '1m',
          refresh_mode = 'DIFFERENTIAL');

Problems with This Approach

  1. Column list is mandatory — CREATE TABLE ... USING requires explicit column definitions. Stream tables should derive columns from the query.
  2. Query in WITH clause — Storing a full SQL query in reloptions is hacky and has length limits.
  3. No AS SELECT — Table AMs don't support CREATE TABLE ... AS SELECT with USING clause in the standard grammar.
  4. VACUUM, ANALYZE complexity — Must implement or delegate all maintenance callbacks.
  5. pg_dump compatibility — pg_dump would dump CREATE TABLE ... USING stream_am but not the associated metadata (query, schedule, etc.)

Pros/Cons

  • Native syntax — Partial; CREATE TABLE ... USING stream_am
  • Feels like a stream table — No; it still looks like a regular table with options
  • Complexity — Very high
  • pg_dump — Broken; metadata in catalog tables won't be dumped
  • PG version — PG 12+
  • Maintenance — High; the table AM API changes between versions

8. Foreign Data Wrapper Approach

How It Works

Foreign Data Wrappers (FDW) allow PostgreSQL to access external data sources via CREATE FOREIGN TABLE. An extension can register a custom FDW:

CREATE EXTENSION pg_trickle;
CREATE SERVER stream_server FOREIGN DATA WRAPPER pgtrickle_fdw;

CREATE FOREIGN TABLE order_totals (region text, total numeric)
    SERVER stream_server
    OPTIONS (
        query 'SELECT region, SUM(amount) FROM orders GROUP BY region',
        schedule '1m',
        refresh_mode 'DIFFERENTIAL'
    );

FDW API

The FDW API provides callbacks for:

  • GetForeignRelSize — Estimate relation size for planning
  • GetForeignPaths — Generate access paths
  • GetForeignPlan — Create a plan node
  • BeginForeignScan — Start scan
  • IterateForeignScan — Get next tuple
  • EndForeignScan — End scan
  • AddForeignUpdateTargets / PlanForeignModify / ExecForeignInsert etc. — Support INSERT/UPDATE/DELETE (optional)

How It Could Work for Stream Tables

  1. Define a custom FDW (pgtrickle_fdw)
  2. The FDW's scan callbacks read from the underlying storage table
  3. ProcessUtility_hook intercepts CREATE FOREIGN TABLE ... SERVER stream_server to set up CDC, catalog entries, etc.
  4. A background worker handles refresh scheduling

Problems

  1. Foreign tables have restrictions — They cannot have indexes, and constraints are declarative rather than enforced. This limits usability.
  2. Query planner limitations — Foreign tables use a separate planning path with potentially worse plan quality.
  3. No MVCC — Foreign tables typically don't provide snapshot isolation semantics.
  4. User model confusion — "Foreign table" implies external data, not a derived view.
  5. EXPLAIN output — Shows "Foreign Scan" instead of "Seq Scan", confusing users.
  6. pg_dump — Foreign tables are dumped, but server/FDW setup may not transfer correctly.
  7. Two-step creation — Requires CREATE SERVER before CREATE FOREIGN TABLE.

Pros/Cons

  • Native syntax — Partial; CREATE FOREIGN TABLE with options
  • Feels like a stream table — No; foreign tables have different semantics
  • Index support — No; a major limitation
  • Trigger support — No; a major limitation
  • Complexity — Medium
  • PG version — PG 9.1+
  • Maintenance — Low; the FDW API is very stable

Verdict: Not suitable. The restrictions on foreign tables (no indexes, no triggers) make this impractical for stream tables that need to behave like regular tables.


9. Event Triggers

How It Works

Event triggers fire on DDL events at the database level:

CREATE EVENT TRIGGER my_trigger ON ddl_command_end
    WHEN TAG IN ('CREATE TABLE', 'ALTER TABLE', 'DROP TABLE')
    EXECUTE FUNCTION my_handler();

Available events:

  • ddl_command_start — Before DDL execution (PG 9.3+)
  • ddl_command_end — After DDL execution (PG 9.3+)
  • sql_drop — When objects are dropped (PG 9.3+)
  • table_rewrite — When a table is rewritten (PG 9.5+)

Inside the Handler

CREATE FUNCTION my_handler() RETURNS event_trigger AS $$
DECLARE
    obj record;
BEGIN
    FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands()
    LOOP
        -- obj.objid, obj.object_type, obj.command_tag, etc.
        IF obj.command_tag = 'CREATE TABLE' AND obj.object_type = 'table' THEN
            -- Check if this table has a special marker
            -- (e.g., a specific reloption or comment)
        END IF;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

Pattern: CREATE TABLE + Event Trigger

  1. User creates a table with a special comment or option:
    CREATE TABLE order_totals (region text, total numeric);
    COMMENT ON TABLE order_totals IS 'pgtrickle:query=SELECT region...;schedule=1m';
    
  2. Event trigger on ddl_command_end fires
  3. Handler parses the comment, detects stream table intent
  4. Handler registers the stream table in the catalog

Limitations

  1. Cannot modify the DDL — Event triggers observe DDL; they can't change what happened. By ddl_command_end, the table already exists.
  2. Cannot redirect the DDL — On ddl_command_start, you can raise an error to block the command, but you can't redirect it.
  3. Two-step process — The user must CREATE TABLE and then mark it somehow (comment, option, separate function call).
  4. No custom syntax — Event triggers watch existing DDL commands.
  5. pg_trickle already uses this — For DDL tracking on upstream tables (see hooks.rs).

Pros/Cons

  • Native syntax — No; watches existing DDL only
  • Complexity — Low
  • Can transform DDL — No; observe only
  • PG version — PG 9.3+
  • Maintenance — Very low
  • pg_trickle usage — Already used for upstream DDL tracking

10. TimescaleDB Continuous Aggregates Pattern

How It Works

TimescaleDB continuous aggregates (caggs) demonstrate the most sophisticated approach to custom DDL-like syntax in a PostgreSQL extension. Their evolution is instructive.

Phase 1: Pure Function API (early versions)

-- Create a view, then register it
CREATE VIEW daily_temps AS
SELECT time_bucket('1 day', time) AS day, AVG(temp)
FROM conditions GROUP BY 1;

SELECT add_continuous_aggregate_policy('daily_temps', ...);

Phase 2: CREATE MATERIALIZED VIEW WITH (introduced in TimescaleDB 2.0)

CREATE MATERIALIZED VIEW daily_temps
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 day', time) AS day, device_id, AVG(temp)
FROM conditions
GROUP BY 1, 2;

How the Hook Chain Works

TimescaleDB's approach uses layered hooks:

  1. ProcessUtility_hook intercepts CREATE MATERIALIZED VIEW
  2. Checks reloptions for timescaledb.continuous in the WithClause
  3. If found:
    • Does NOT call standard ProcessUtility for the matview
    • Instead creates a regular hypertable (the materialization)
    • Creates an internal view (the user-facing query interface)
    • Registers refresh policies in the catalog
    • Sets up continuous aggregate metadata
  4. For REFRESH MATERIALIZED VIEW, intercepts and routes to their refresh engine
  5. For DROP MATERIALIZED VIEW, intercepts and cleans up all artifacts

The Magic: Reloptions as Extension Point

PostgreSQL's CREATE MATERIALIZED VIEW ... WITH (option = value) passes options as DefElem nodes in the parse tree. The parser treats these as generic key-value pairs — it does NOT validate the option names. This is the key insight: PostgreSQL's parser accepts arbitrary options in WITH clauses.

// In ProcessUtility_hook:
if (IsA(parsetree, CreateTableAsStmt)) {
    CreateTableAsStmt *stmt = (CreateTableAsStmt *) parsetree;
    if (stmt->objtype == OBJECT_MATVIEW) {
        // Check for our custom option in stmt->into->options.
        // Note: a namespaced option like timescaledb.continuous arrives as
        // DefElem { defnamespace = "timescaledb", defname = "continuous" }.
        bool is_continuous = false;
        ListCell *lc;
        foreach(lc, stmt->into->options) {
            DefElem *opt = (DefElem *) lfirst(lc);
            if (opt->defnamespace != NULL &&
                strcmp(opt->defnamespace, "timescaledb") == 0 &&
                strcmp(opt->defname, "continuous") == 0) {
                is_continuous = true;
                break;
            }
        }
        if (is_continuous) {
            // Handle as continuous aggregate
            return;
        }
    }
}

Refresh Policies

-- Add a refresh policy (function call, not DDL)
SELECT add_continuous_aggregate_policy('daily_temps',
    start_offset => INTERVAL '1 month',
    end_offset => INTERVAL '1 day',
    schedule_interval => INTERVAL '1 hour');

What pg_trickle Could Learn

The TimescaleDB pattern for pg_trickle would look like:

-- Option A: CREATE MATERIALIZED VIEW with custom option
CREATE MATERIALIZED VIEW order_totals
WITH (pgtrickle.stream = true, pgtrickle.schedule = '1m', pgtrickle.mode = 'DIFFERENTIAL')
AS SELECT region, SUM(amount) FROM orders GROUP BY region;

-- Option B: CREATE TABLE with custom option (less natural)
CREATE TABLE order_totals (region text, total numeric)
WITH (pgtrickle.stream = true);
-- Then separately: SELECT pgtrickle.set_query('order_totals', 'SELECT ...');

Pros/Cons

Aspect | Assessment
Native syntax | Good — CREATE MATERIALIZED VIEW ... WITH (pgtrickle.stream) looks natural
User experience | Very good — familiar DDL syntax with extension options
Complexity | High — must implement full ProcessUtility_hook chain
pg_dump | Partial — matview DDL is dumped, but custom metadata needs pg_dump extension or config tables
PG version | PG 9.3+ (matviews), PG 12+ (better option handling)
Maintenance | Medium — must track changes to matview creation internals
Shared preload | Required — ProcessUtility_hook needs shared_preload_libraries

11. Citus Distributed DDL Pattern

How It Works

Citus (now part of Microsoft) demonstrates another approach to extending DDL behavior:

ProcessUtility_hook Chain

Citus has one of the most comprehensive ProcessUtility_hook implementations:

void multi_ProcessUtility(PlannedStmt *pstmt, ...) {
    // 1. Classify the DDL
    Node *parsetree = pstmt->utilityStmt;

    // 2. Check if it affects distributed tables
    if (IsA(parsetree, AlterTableStmt)) {
        // Propagate ALTER TABLE to all worker nodes
        PropagateAlterTable((AlterTableStmt *)parsetree, queryString);
    }

    // 3. Call standard handler (or skip for intercepted commands)
    if (prev_ProcessUtility)
        prev_ProcessUtility(pstmt, ...);
    else
        standard_ProcessUtility(pstmt, ...);

    // 4. Post-processing
    if (IsA(parsetree, CreateStmt)) {
        // Check if we should auto-distribute this table
    }
}

Table Distribution via Function Calls

Citus does NOT add custom DDL syntax. Distribution is done via function calls:

-- Create a regular table
CREATE TABLE events (id bigint, data jsonb, created_at timestamptz);

-- Distribute it (function call, not DDL)
SELECT create_distributed_table('events', 'id');

-- Or create a reference table
SELECT create_reference_table('lookups');

Columnar Storage via Table AM

Citus also provides columnar storage as a table AM:

CREATE TABLE analytics_data (...)
    USING columnar;

This uses the table AM API (PostgreSQL 12+) — see Section 7.

What Citus Teaches Us

  • Function calls for complex operations — create_distributed_table() is analogous to pgtrickle.create_stream_table().
  • ProcessUtility_hook for DDL propagation — Intercept standard DDL and add behavior.
  • Table AM for storage — Separate concern from distribution logic.
  • No custom syntax — Even with Microsoft's resources, Citus doesn't fork the parser.

Pros/Cons

Aspect | Assessment
Native syntax | No — uses function calls like pg_trickle
Approach validated | Yes — Citus is used at massive scale with this pattern
Complexity | Medium (function API) to High (ProcessUtility_hook)
User adoption | Proven successful
Maintenance | Low for function API

12. PostgreSQL 18 New Features

Relevant Extension Points in PG 18

PostgreSQL 18 (released 2025) includes several features relevant to this analysis:

12a. Virtual Generated Columns

PG 18 adds GENERATED ALWAYS AS (expr) VIRTUAL columns. Not directly relevant to stream tables, but shows PostgreSQL's willingness to expand CREATE TABLE syntax incrementally.
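For illustration, the PG 18 syntax (the value is computed at read time rather than stored; table and column names are made up):

```sql
CREATE TABLE prices (
    net   numeric,
    -- VIRTUAL is new in PG 18; STORED generated columns exist since PG 12
    gross numeric GENERATED ALWAYS AS (net * 1.2) VIRTUAL
);
```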

12b. Improved Table AM API

PG 18 refines the table AM API with better TOAST handling and improved parallel scan support. This makes custom table AMs slightly more practical.

12c. Enhanced Event Trigger Information

PG 18 expands pg_event_trigger_ddl_commands() with additional metadata fields, making event-trigger-based approaches more capable.

12d. pg_stat_io Improvements

Enhanced I/O statistics infrastructure that could benefit monitoring of stream table refresh operations.

12e. No New Parser Extension Points

PostgreSQL 18 does not add any parser extension mechanism. The parser remains monolithic and non-extensible. There have been occasional discussions on pgsql-hackers about parser hooks, but no concrete proposals have been accepted.

12f. No Custom DDL Extension Points

No new general-purpose DDL extension points beyond the existing hook system.

Looking Forward: Discussion on pgsql-hackers

There have been recurring threads on pgsql-hackers about:

  • Extension-defined SQL syntax — Rejected due to complexity and parser architecture
  • Loadable parser modules — Theoretical discussions, no implementation
  • Extension catalogs — Some interest in allowing extensions to register custom catalogs

None of these are implemented in PG 18.

Pros/Cons

Aspect | Assessment
New syntax extension points | None in PG 18
Table AM improvements | Minor — slightly easier to implement
Event trigger improvements | Minor — more metadata available
Parser extensibility | Not planned for any upcoming PG release

13. COMMENT / OPTIONS Abuse Pattern

How It Works

Several extensions use table comments or reloptions as a "poor man's metadata" to tag tables with custom semantics.

Pattern 1: COMMENT-based

CREATE TABLE order_totals (region text, total numeric);
COMMENT ON TABLE order_totals IS '@pgtrickle {"query": "SELECT ...", "schedule": "1m"}';

An event trigger or background worker scans pg_description for tables with the @pgtrickle prefix and processes them.
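A minimal version of that scan — the @pgtrickle marker is the hypothetical convention from the example above:

```sql
SELECT c.relname, d.description
FROM pg_description d
JOIN pg_class c ON c.oid = d.objoid
WHERE d.classoid = 'pg_class'::regclass
  AND d.objsubid = 0              -- 0 = comment on the table itself
  AND d.description LIKE '@pgtrickle%';
```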

Pattern 2: Reloptions-based

CREATE TABLE order_totals (region text, total numeric)
    WITH (fillfactor = 70, pgtrickle.stream = true);

Problem: PostgreSQL validates reloptions against a known list. You cannot add arbitrary options to WITH (...) without registering them. Extensions can register custom reloptions via the add_reloption_kind() / add_string_reloption() family, but this is a relatively obscure API aimed primarily at access methods.
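Concretely, on an unpatched server the reloption above is rejected at parse-analysis time (error text as observed on recent PG releases; assumed unchanged in PG 18):

```sql
CREATE TABLE t (x int) WITH (pgtrickle.stream = true);
-- ERROR:  unrecognized parameter namespace "pgtrickle"
-- (only built-in namespaces such as "toast" are accepted for heap tables)
```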

Pattern 3: GUC-based Tagging

-- Set a GUC that our ProcessUtility_hook reads
SET pgtrickle.next_create_is_stream = true;
SET pgtrickle.stream_query = 'SELECT region, SUM(amount) FROM orders GROUP BY region';

-- Hook intercepts this CREATE TABLE and registers it
CREATE TABLE order_totals (region text, total numeric);

-- Reset
RESET pgtrickle.next_create_is_stream;

This is extremely hacky but has been used in practice (some partitioning extensions used similar patterns before native partitioning).

Who Uses This?

  • pgmemcache — Uses comments to configure caching behavior
  • Some row-level security extensions — Comments to define policies
  • pg_partman — Uses a configuration table (not comments) but similar concept
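The pg_partman-style configuration-table variant would look roughly like this for pg_trickle (the table and column names here are hypothetical):

```sql
CREATE TABLE pgtrickle_config (
    stream_table regclass PRIMARY KEY,
    query        text NOT NULL,
    schedule     text NOT NULL DEFAULT '1m'
);

-- Tagging a table is then an ordinary INSERT, picked up by a background worker:
INSERT INTO pgtrickle_config
VALUES ('order_totals',
        'SELECT region, SUM(amount) FROM orders GROUP BY region',
        '30s');
```

Unlike COMMENT abuse, this keeps metadata in a typed, constrained table, at the cost of the same two-step UX.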

Pros/Cons

Aspect | Assessment
Native syntax | No — abuses existing mechanisms
User experience | Poor — fragile, easy to break by editing comments
Complexity | Low
pg_dump | COMMENT is dumped — metadata survives pg_dump/restore
Robustness | Low — comments can be accidentally changed
PG version | All versions

14. pg_ivm (Incremental View Maintenance) Pattern

How It Works

pg_ivm is the most directly comparable extension to pg_trickle. It implements incremental view maintenance for PostgreSQL.

API Design

pg_ivm uses a pure function-call API:

-- Create an incrementally maintainable materialized view
SELECT create_immv('order_totals', 'SELECT region, SUM(amount) FROM orders GROUP BY region');

-- Refresh
SELECT refresh_immv('order_totals');

-- Drop
DROP TABLE order_totals;  -- Just drop the underlying table

Key function: create_immv(name, query) — Creates an "Incrementally Maintainable Materialized View" (IMMV).

Internal Implementation

  1. create_immv() is a SQL function (not a hook)
  2. It parses the query, creates a storage table, sets up triggers on source tables
  3. IMMVs are stored as regular tables with metadata in a custom catalog (pg_ivm_immv)
  4. Triggers on source tables automatically update the IMMV on DML

No ProcessUtility_hook

pg_ivm does not use ProcessUtility_hook. It operates entirely through:

  • SQL functions (create_immv, refresh_immv)
  • Row-level triggers for automatic maintenance
  • A custom catalog table for metadata

Why No Custom Syntax?

pg_ivm was developed as a proof-of-concept for PostgreSQL core IVM support. The authors explicitly chose function-call syntax to:

  1. Avoid shared_preload_libraries requirement (hooks need it)
  2. Keep the extension simple and portable
  3. Focus on the IVM algorithm, not the user interface

Eventually Merged to Core?

There was discussion about upstreaming IVM to PostgreSQL core. If merged, it would get proper syntax (CREATE INCREMENTAL MATERIALIZED VIEW). As an extension, it stays with function calls.

Relevance to pg_trickle

pg_trickle's current API (pgtrickle.create_stream_table()) follows the exact same pattern as pg_ivm. This is the established approach for IVM extensions.

Pros/Cons

Aspect | Assessment
Native syntax | No — function calls
Complexity | Low — simple function API
shared_preload_libraries | Not required for basic function API
pg_dump | No — function calls are not dumped; must use custom dump/restore
User experience | Moderate — familiar to pg_ivm users
Community acceptance | Established pattern for IVM extensions

15. CREATE TABLE ... USING (Table Access Methods) Deep Dive

Full Syntax

CREATE TABLE tablename (
    column1 datatype,
    column2 datatype,
    ...
) USING access_method_name
  WITH (storage_parameter = value, ...);

How the Parser Handles USING

In gram.y:

CreateStmt: CREATE OptTemp TABLE ...
    OptTableAccessMethod OptWith ...

OptTableAccessMethod:
    USING name    { $$ = $2; }
    | /* empty */ { $$ = NULL; }
    ;

The USING clause sets CreateStmt->accessMethod to the access method name string.

How ProcessUtility Handles It

In createRelation() (src/backend/commands/tablecmds.c):

  1. If accessMethod is specified, look it up in pg_am
  2. Verify it's a table AM (not an index AM)
  3. Store the AM OID in pg_class.relam
  4. Use the AM's callbacks for all subsequent operations
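The pg_am lookup in step 1 can be inspected directly; a stock install has exactly one table AM:

```sql
-- amtype 't' = table access method, 'i' = index access method
SELECT amname, amhandler, amtype FROM pg_am WHERE amtype = 't';
--  amname |      amhandler       | amtype
--  heap   | heap_tableam_handler | t
```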

Custom Reloptions with Table AMs

Table AMs can define custom reloptions via:

static relopt_parse_elt stream_relopt_tab[] = {
    {"query", RELOPT_TYPE_STRING, offsetof(StreamOptions, query)},
    {"schedule", RELOPT_TYPE_STRING, offsetof(StreamOptions, schedule)},
    {"refresh_mode", RELOPT_TYPE_STRING, offsetof(StreamOptions, refresh_mode)},
};

This would allow:

CREATE TABLE order_totals (region text, total numeric)
    USING stream_heap
    WITH (query = 'SELECT ...', schedule = '1m', refresh_mode = 'DIFFERENTIAL');

Problems Specific to Stream Tables

  1. Column derivation — Stream tables derive columns from the query. CREATE TABLE ... USING requires explicit column definitions, creating redundancy and potential inconsistency.

  2. No AS SELECT — You can't combine USING with AS SELECT:

    -- This does NOT work in PostgreSQL grammar:
    CREATE TABLE order_totals
        USING stream_heap
        AS SELECT region, SUM(amount) FROM orders GROUP BY region;
    
  3. Full AM implementation required — Even if you delegate to heap, you must implement all callbacks and handle edge cases.

  4. VACUUM/ANALYZE — Must properly delegate to heap for these to work.

  5. Replication — Logical replication assumes heap tuples; custom AMs may break.

Hybrid Practical Approach

If pursuing this route:

-- Step 1: Set default AM
SET default_table_access_method = 'stream_heap';

-- Step 2: Create with query in options
CREATE TABLE order_totals ()
    WITH (pgtrickle.query = 'SELECT region, SUM(amount) FROM orders GROUP BY region',
          pgtrickle.schedule = '1m');

-- ProcessUtility_hook would:
-- 1. Detect USING stream_heap (or detect our custom reloptions)
-- 2. Parse the query from options
-- 3. Derive columns from the query
-- 4. Create the actual table with proper columns using heap AM
-- 5. Register in pgtrickle catalog
-- 6. Set up CDC

Pros/Cons

Aspect | Assessment
Native syntax | Partial — CREATE TABLE ... USING stream_heap WITH (...)
Column derivation | Not supported — must specify columns or use hook magic
Complexity | Very high
pg_dump | Good — CREATE TABLE ... USING is properly dumped
PG version | PG 12+
Maintenance | High — AM API changes between versions

16. Comparison Matrix

Approach | Native Syntax | Complexity | pg_dump | PG Version | Maintenance | Recommended
Function API (current) | No | Low | No* | Any | Very Low | Yes
ProcessUtility_hook + MATVIEW WITH | Good | High | Partial | 9.3+ | Medium | Maybe
Raw parser fork | Perfect | Very High | No | Fork only | Very High | No
Table AM USING | Partial | Very High | Yes | 12+ | High | No
FDW FOREIGN TABLE | Partial | Medium | Yes | 9.1+ | Low | No
Event triggers alone | No | Low | No | 9.3+ | Low | No
COMMENT abuse | No | Low | Yes | Any | Low | No
GUC + CREATE TABLE hack | No | Medium | Partial | Any | Medium | No
TimescaleDB pattern (MATVIEW + WITH) | Good | High | Partial | 9.3+ | Medium | Best option

* Custom dump support can be approximated by marking catalog tables with pg_extension_config_dump() or via a wrapper script; pg_dump itself has no hook mechanism.


17. Recommendations for pg_trickle

Current Approach: Function API (Keep and Enhance)

pg_trickle's current approach (pgtrickle.create_stream_table('name', 'query', ...)) is:

  • Proven — Same pattern as pg_ivm, Citus, and many other extensions
  • Simple — No shared_preload_libraries required for basic usage
  • Maintainable — No hook chains to debug
  • Portable — Works on any PG version that supports pgrx

Enhancement opportunities:

-- Current
SELECT pgtrickle.create_stream_table('order_totals',
    'SELECT region, SUM(amount) FROM orders GROUP BY region', '1m');

-- Enhanced: CALL syntax for a more DDL-like feel (PG 11+; requires wrapping
-- create_stream_table in a companion procedure, since CALL only invokes procedures)
CALL pgtrickle.create_stream_table('order_totals',
    $$SELECT region, SUM(amount) FROM orders GROUP BY region$$, '1m');

Future Option: TimescaleDB-style Materialized View Integration

If user demand justifies the complexity, pg_trickle could add a second creation path via ProcessUtility_hook:

-- New native-feeling syntax (requires shared_preload_libraries)
CREATE MATERIALIZED VIEW order_totals
WITH (pgtrickle.stream = true, pgtrickle.schedule = '1m')
AS SELECT region, SUM(amount) FROM orders GROUP BY region
WITH NO DATA;

-- Original function API still works (no hook needed)
SELECT pgtrickle.create_stream_table('order_totals',
    'SELECT region, SUM(amount) FROM orders GROUP BY region', '1m');

Implementation plan for hook-based approach:

  1. Register ProcessUtility_hook in _PG_init() (already needed for shared_preload_libraries)
  2. Intercept CREATE MATERIALIZED VIEW → Check for pgtrickle.stream option
  3. If found: parse options, call create_stream_table_impl() internally, create standard storage table instead of matview
  4. Intercept DROP MATERIALIZED VIEW → Check if target is a stream table → Clean up
  5. Intercept REFRESH MATERIALIZED VIEW → Route to stream table refresh engine
  6. Intercept ALTER MATERIALIZED VIEW → Route to stream table alter logic

Estimated complexity: ~800-1200 lines of Rust hook code + tests.

Approaches ruled out:

  • Forking PostgreSQL for custom grammar — Maintenance cost is prohibitive
  • Table AM approach — Complexity without proportional benefit
  • FDW approach — Too many restrictions on foreign tables
  • COMMENT abuse — Fragile and poor UX

pg_dump / pg_restore Strategy

Regardless of approach, pg_dump is a challenge. Options:

  1. Custom dump/restore functions — pgtrickle.dump_config() and pgtrickle.restore_config()
  2. Migration script generation — pgtrickle.generate_migration() outputs SQL to recreate all stream tables
  3. Event trigger on restore — Detect when tables are restored and re-register them
  4. Sidecar file — Generate a companion SQL file alongside pg_dump
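Option 2 could be as small as a set-returning function whose output is replayed after restore. The function name comes from the list above; the catalog table it reads is assumed:

```sql
-- Sketch: regenerate create_stream_table() calls from the extension catalog
CREATE FUNCTION pgtrickle.generate_migration() RETURNS SETOF text
LANGUAGE sql STABLE AS $$
    SELECT format('SELECT pgtrickle.create_stream_table(%L, %L, %L);',
                  st.name, st.query, st.schedule)
    FROM pgtrickle.stream_tables st   -- assumed catalog table
$$;
```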

Appendix A: Hook Registration in pgrx (Rust)

For reference, here's how ProcessUtility_hook registration works in pgrx:

use pgrx::prelude::*;
use pgrx::pg_sys;

static mut PREV_PROCESS_UTILITY_HOOK: pg_sys::ProcessUtility_hook_type = None;

#[pg_guard]
pub extern "C-unwind" fn my_process_utility(
    pstmt: *mut pg_sys::PlannedStmt,
    query_string: *const std::os::raw::c_char,
    read_only_tree: bool,
    context: pg_sys::ProcessUtilityContext,
    params: pg_sys::ParamListInfo,
    query_env: *mut pg_sys::QueryEnvironment,
    dest: *mut pg_sys::DestReceiver,
    qc: *mut pg_sys::QueryCompletion,
) {
    // SAFETY: pstmt is a valid pointer provided by PostgreSQL
    let stmt = unsafe { (*pstmt).utilityStmt };

    // Check if this is a CreateTableAsStmt (materialized view)
    if unsafe { pgrx::is_a(stmt, pg_sys::NodeTag::T_CreateTableAsStmt) } {
        // Check for our custom options...
    }

    // Chain to previous hook or standard handler
    unsafe {
        if let Some(prev) = PREV_PROCESS_UTILITY_HOOK {
            prev(pstmt, query_string, read_only_tree, context,
                 params, query_env, dest, qc);
        } else {
            pg_sys::standard_ProcessUtility(
                pstmt, query_string, read_only_tree, context,
                params, query_env, dest, qc);
        }
    }
}

pub fn register_hooks() {
    unsafe {
        PREV_PROCESS_UTILITY_HOOK = pg_sys::ProcessUtility_hook;
        pg_sys::ProcessUtility_hook = Some(my_process_utility);
    }
}

Appendix B: Key Source Files in PostgreSQL

File | Purpose
src/backend/parser/gram.y | SQL grammar (~18,000 lines)
src/backend/parser/scan.l | Lexer/tokenizer
src/include/parser/kwlist.h | Keyword definitions
src/backend/tcop/utility.c | ProcessUtility() — DDL dispatcher
src/backend/commands/tablecmds.c | CREATE/ALTER/DROP TABLE implementation
src/backend/commands/createas.c | CREATE TABLE AS / CREATE MATVIEW AS
src/include/access/tableam.h | Table Access Method API
src/include/foreign/fdwapi.h | FDW API
src/backend/commands/event_trigger.c | Event trigger infrastructure

Appendix C: References

  1. PostgreSQL Documentation — Table Access Method Interface
  2. PostgreSQL Documentation — Event Triggers
  3. PostgreSQL Documentation — Writing A Foreign Data Wrapper
  4. TimescaleDB Source — process_utility.c
  5. Citus Source — multi_utility.c
  6. pg_ivm Source — createas.c
  7. pgrx Documentation — Hooks
  8. PostgreSQL Wiki — CustomScanProviders

pg_trickle vs pg_ivm — Comparison Report & Gap Analysis

Date: 2026-02-28 (merged 2026-03-01, updated 2026-03-20)
Author: Internal research
Status: Reference document


1. Executive Summary

Both pg_trickle and pg_ivm implement Incremental View Maintenance (IVM) as PostgreSQL extensions — the goal of keeping materialized query results up-to-date without full recomputation. Despite the shared objective they differ fundamentally in design philosophy, maintenance model, SQL coverage, operational model, and target audience.

pg_ivm is a mature, widely-deployed C extension (1.4k GitHub stars, 17 releases) focused on immediate, synchronous IVM that runs inside the same transaction as the base-table write. pg_trickle is a Rust extension (v0.9.0) offering both deferred (scheduled) and immediate (transactional) IVM with a richer SQL dialect, a dependency DAG, and built-in operational tooling.

pg_trickle is significantly ahead of pg_ivm in SQL coverage, operator support, aggregate support, and operational features. As of v0.2.1, pg_trickle also matches pg_ivm's core strength — immediate, in-transaction maintenance — via the IMMEDIATE refresh mode (all phases complete). pg_ivm's one remaining structural advantage is broader PostgreSQL version support (PG 13–18):

  • IMMEDIATE mode — fully implemented. Statement-level AFTER triggers with transition tables update stream tables within the same transaction as base-table DML. Window functions, LATERAL, scalar subqueries, cascading IMMEDIATE stream tables, WITH RECURSIVE (with a stack-depth warning), and TopK micro-refresh are all supported. See PLAN_TRANSACTIONAL_IVM.md.
  • AUTO refresh mode — new default for create_stream_table. Selects DIFFERENTIAL when the query supports it and transparently falls back to FULL otherwise, eliminating the need to choose a mode at creation time.
  • pg_ivm compatibility layer — postponed. The pgivm.create_immv() / pgivm.refresh_immv() / pgivm.pg_ivm_immv wrappers (Phase 2) are deferred to post-1.0.
  • PLAN_PG_BACKCOMPAT.md details backporting pg_trickle to PG 14–18 (recommended) or PG 16–18 (minimum viable), requiring ~2.5–3 weeks of effort primarily in #[cfg]-gating ~435 lines of JSON/SQL-standard parse-tree handling.

With IMMEDIATE mode fully implemented, Row Level Security support (v0.5.0), pg_dump/restore support (v0.8.0), algebraic aggregate maintenance (v0.9.0), parallel refresh (v0.4.0), circular pipeline support (v0.7.0), watermark APIs (v0.7.0), and 40+ unique features, pg_ivm's only remaining advantages are PG version breadth and production maturity.


2. Project Overview

Attribute | pg_ivm | pg_trickle
Repository | sraoss/pg_ivm | grove/pg-trickle
Language | C | Rust (pgrx 0.17)
Latest release | 1.13 (2025-10-20) | 0.9.0 (2026-03-20)
Stars | ~1,400 | early stage
License | PostgreSQL License | Apache 2.0
PG versions | 13–18 | 18 only; PG 14–18 planned
Schema | pgivm | pgtrickle / pgtrickle_changes
Shared library required | Yes (shared_preload_libraries or session_preload_libraries) | Yes (shared_preload_libraries, required for background worker)
Background worker | No | Yes (scheduler + optional WAL decoder)

3. Maintenance Model

This is the most important design difference between the two extensions.

pg_ivm — Immediate Maintenance

pg_ivm updates its views synchronously inside the same transaction that modified the base table. When a row is inserted/updated/deleted, AFTER row triggers fire and update the IMMV before the transaction commits.

BEGIN;
  UPDATE base_table ...;   -- triggers fire here
  -- IMMV is updated before COMMIT
COMMIT;

Consequences:

  • The IMMV is always exactly consistent with the committed state of the base table — zero staleness.
  • Write latency increases by the cost of view maintenance. For large joins or aggregates on popular tables this can be significant.
  • Locking: ExclusiveLock is held on the IMMV during maintenance to prevent concurrent anomalies. In REPEATABLE READ or SERIALIZABLE isolation, errors are raised when conflicts are detected.
  • TRUNCATE on a base table triggers full IMMV refresh (for most view types).
  • Not compatible with logical replication (subscriber nodes are not updated).

pg_trickle — Deferred, Scheduled Maintenance

pg_trickle updates its stream tables asynchronously, driven by a background worker scheduler. Changes are captured by row-level triggers (or optionally by WAL decoding) into change-buffer tables and are applied in batch on the next refresh cycle.

-- Write path: only a trigger INSERT into change buffer
BEGIN;
  UPDATE base_table ...;   -- trigger captures delta into pgtrickle_changes.*
COMMIT;

-- Separate refresh cycle (background worker):
  apply_delta_to_stream_table(...)

Consequences:

  • Write latency is minimized — the trigger write into the change buffer is ~2–50 μs regardless of view complexity.
  • Stream tables are stale between refresh cycles. The staleness bound is configurable (e.g. '30s', '5m', '@hourly', or cron expressions).
  • Refresh can be triggered manually: pgtrickle.refresh_stream_table(...).
  • Multiple stream tables can share a refresh pipeline ordered by dependency (topological DAG scheduling).
  • The WAL-based CDC mode (pg_trickle.cdc_mode = 'wal') eliminates trigger overhead entirely when wal_level = logical is available.
  • Append-only fast path (v0.5.0): append_only => true skips merge for INSERT-only tables with auto-fallback if DELETE/UPDATE detected.
  • Source gating (v0.5.0): pause CDC during bulk loads via gate_source() and ungate_source() to avoid trigger overhead during large batch inserts.
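Switching capture modes is a configuration change. The GUC name comes from the list above; the ALTER SYSTEM form is shown for illustration:

```sql
-- Prerequisite for WAL-based CDC (requires a server restart):
ALTER SYSTEM SET wal_level = 'logical';

-- Switch pg_trickle from trigger capture to WAL decoding:
ALTER SYSTEM SET pg_trickle.cdc_mode = 'wal';
SELECT pg_reload_conf();
```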

Implemented: pg_trickle IMMEDIATE Mode

pg_trickle now offers an IMMEDIATE refresh mode (Phase 1 + Phase 3 complete) that uses statement-level AFTER triggers with transition tables — the same mechanism pg_ivm uses. Key implementation details:

  • Reuses the DVM engine — the Scan operator reads from transition tables (via temporary views) instead of change buffer tables.
  • Phase 1 (complete): core IMMEDIATE engine — INSERT/UPDATE/DELETE/TRUNCATE handling, advisory lock-based concurrency (IvmLockMode), mode switching via alter_stream_table, query restriction validation.
  • Phase 2 (postponed): pgivm.* compatibility layer for drop-in migration.
  • Phase 3 (complete): extended SQL support — window functions, LATERAL, scalar subqueries, cascading IMMEDIATE stream tables, WITH RECURSIVE (IM1: supported with a stack-depth warning), and TopK micro-refresh (IM2: recomputes top-K on every DML, gated by pg_trickle.ivm_topk_max_limit).
  • Phase 4 (complete): delta SQL template caching (IVM_DELTA_CACHE); ENR-based transition tables and C-level triggers deferred to post-1.0 as optimizations only.

-- Create an IMMEDIATE stream table (zero staleness)
SELECT pgtrickle.create_stream_table(
    'live_totals',
    'SELECT region, SUM(amount) AS total FROM orders GROUP BY region',
    NULL,          -- no schedule needed
    'IMMEDIATE'
);

-- Updates propagate within the same transaction
BEGIN;
  INSERT INTO orders (region, amount) VALUES ('EU', 100);
  SELECT * FROM live_totals;  -- already includes the new row
COMMIT;

4. SQL Feature Coverage — Summary

Dimension | pg_ivm | pg_trickle | Winner
Maintenance timing | Immediate (in-transaction triggers) | Deferred (scheduler/manual) and IMMEDIATE (in-transaction) | pg_trickle (offers both models)
PostgreSQL versions | 13–18 | 18 only; PG 14–18 planned | pg_ivm (today); planned parity
Aggregate functions | 5 (COUNT, SUM, AVG, MIN, MAX) | 60+ (all built-in aggregates incl. algebraic O(1) for COUNT/SUM/AVG/STDDEV/VAR) | pg_trickle
FILTER clause on aggregates | No | Yes | pg_trickle
HAVING clause | No | Yes | pg_trickle
Inner joins | Yes (including self-join) | Yes (including self-join, NATURAL, nested) | pg_trickle
Outer joins | Yes (limited — equijoin, single condition, many restrictions) | Yes (LEFT/RIGHT/FULL, nested, complex conditions) | pg_trickle
DISTINCT | Yes (reference-counted) | Yes (reference-counted) | Tie
DISTINCT ON | No | Yes (auto-rewritten to ROW_NUMBER) | pg_trickle
UNION / INTERSECT / EXCEPT | No | Yes (all 6 variants, bag + set) | pg_trickle
Window functions | No | Yes (partition recomputation) | pg_trickle
CTEs (non-recursive) | Simple only (no aggregates, no DISTINCT inside) | Full (aggregates, DISTINCT, multi-reference shared delta) | pg_trickle
CTEs (recursive) | No | Yes (semi-naive, DRed, recomputation; IMMEDIATE mode with stack-depth warning) | pg_trickle
Subqueries in FROM | Simple only (no aggregates/DISTINCT inside) | Full support | pg_trickle
EXISTS subqueries | Yes (WHERE only, AND only, no agg/DISTINCT) | Yes (WHERE + targetlist, AND/OR, agg/DISTINCT inside) | pg_trickle
NOT EXISTS / NOT IN | No | Yes (anti-join operator) | pg_trickle
IN (subquery) | No | Yes (semi-join operator) | pg_trickle
Scalar subquery in SELECT | No | Yes (scalar subquery operator) | pg_trickle
LATERAL subqueries | No | Yes (row-scoped recomputation) | pg_trickle
LATERAL SRFs | No | Yes (jsonb_array_elements, unnest, etc.) | pg_trickle
JSON_TABLE (PG 17+) | No | Yes | pg_trickle
GROUPING SETS / CUBE / ROLLUP | No | Yes (auto-rewritten to UNION ALL) | pg_trickle
Views as sources | No (simple tables only) | Yes (auto-inlined, nested) | pg_trickle
Partitioned tables | No | Yes | pg_trickle
Foreign tables | No | FULL mode only | pg_trickle
Cascading (view-on-view) | No | Yes (DAG-aware scheduling) | pg_trickle
Background scheduling | No (user must trigger) | Yes (cron + duration, background worker) | pg_trickle
Monitoring / observability | 1 catalog table | Extensive (stats, history, staleness, CDC health, NOTIFY) | pg_trickle
CDC mechanism | Triggers only | Hybrid (triggers + optional WAL) | pg_trickle
DDL tracking | No automatic handling | Yes (event triggers, auto-reinit) | pg_trickle
TRUNCATE handling | Yes (auto-truncate IMMV) | IMMEDIATE mode: full refresh in same txn; DEFERRED: queued full refresh | Tie (functionally equivalent in IMMEDIATE mode)
Auto-indexing | Yes (on GROUP BY / DISTINCT / PK columns) | No (user creates indexes) | pg_ivm
Row Level Security | Yes (with limitations) | Yes (refreshes see all data; RLS on stream table; IMMEDIATE mode secured) | pg_trickle (richer model)
Concurrency model | ExclusiveLock on IMMV during maintenance | Advisory locks, non-blocking reads, parallel refresh | pg_trickle
Data type restrictions | Must have btree opclass (no json, xml, point) | No documented type restrictions | pg_trickle
Maturity / ecosystem | 4 years, 1.4k stars, PGXN, yum packages | v0.9.0 released, 1,100+ unit tests + 900+ E2E tests, 22 TPC-H benchmarks, dbt integration | pg_ivm

4.1 Areas Where pg_ivm Wins

Of the ~35 dimensions in the summary table above, pg_ivm holds an advantage in only 3 (down from 6 before IMMEDIATE mode and RLS were implemented). Two are substantive gaps with planned resolutions; the third closes naturally with time and adoption.

1. PostgreSQL Version Support (substantive, planned resolution)

pg_ivm ships pre-built packages for PostgreSQL 13–18 across all major Linux distros via yum.postgresql.org and PGXN. pg_trickle currently targets PG 18 only.

This is the single largest remaining structural gap. PG 13 is EOL (Nov 2025), but PG 14–17 are widely deployed in production environments. Users on those versions simply cannot use pg_trickle today.

Planned resolution: PLAN_PG_BACKCOMPAT.md details backporting to PG 14–18 (~2.5–3 weeks). pgrx 0.17 already supports PG 14–18 via feature flags; ~435 lines in parser.rs need #[cfg] gating for JSON/SQL-standard parse-tree handling.

2. Auto-Indexing (substantive, low priority)

When pg_ivm creates an IMMV, it automatically adds indexes on columns used in GROUP BY, DISTINCT, and primary keys. This is a genuine usability advantage — new users get reasonable read performance without manual intervention.

pg_trickle leaves index creation entirely to the user. For DIFFERENTIAL mode stream tables, the DVM engine's MERGE-based delta application already uses the stream table's primary key (which is auto-created), and index-aware MERGE (pg_trickle.merge_seqscan_threshold, added v0.9.0) uses index lookups for tiny change ratios, but secondary indexes for read-side query patterns must be added manually.

Impact: Low — experienced users always create application-specific indexes anyway. Auto-indexing mostly helps onboarding and simple use-cases.

Planned resolution: Tracked as part of the pg_ivm compatibility layer (Phase 2, postponed to post-1.0). Could also be implemented independently as a CREATE INDEX IF NOT EXISTS step in create_stream_table.
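The independent implementation mentioned above amounts to one extra statement per grouping column at creation time (sketch; index name is hypothetical):

```sql
-- Auto-index the GROUP BY column of the order_totals example
CREATE INDEX IF NOT EXISTS order_totals_region_idx
    ON order_totals (region);
```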

3. Maturity / Ecosystem (temporary, closing over time)

pg_ivm has 4 years of production use, ~1,400 GitHub stars, 17 releases, and is distributed via PGXN, yum, and apt package repositories. It has a track record of stability and a community of users.

pg_trickle is a v0.9.0 series release with 1,100+ unit tests, 200+ integration tests, 570+ light E2E tests, 90+ full E2E tests, and 22 TPC-H correctness benchmarks—but no wide production deployments yet. It lacks the battle-testing that comes from years of real-world usage.

Impact: High for risk-averse organizations considering production adoption. Low for greenfield projects or teams willing to adopt early.

Resolution: This gap closes naturally with time, releases, and adoption. The dbt integration (dbt-pgtrickle) and CNPG/Kubernetes deployment support accelerate ecosystem development.


5. Detailed SQL Comparison

5.1 Aggregate Functions

| Function | pg_ivm | pg_trickle |
|---|---|---|
| COUNT(*) / COUNT(expr) | ✅ Algebraic | ✅ Algebraic (O(1) running total, v0.9.0) |
| SUM | ✅ Algebraic | ✅ Algebraic (O(1) running total, v0.9.0) |
| AVG | ✅ Algebraic (via SUM/COUNT) | ✅ Algebraic (O(1) via SUM/COUNT decomposition, v0.9.0) |
| MIN | ✅ Semi-algebraic (rescan on extremum delete) | ✅ Semi-algebraic (O(1) unless extremum deleted, v0.9.0 safety guard) |
| MAX | ✅ Semi-algebraic (rescan on extremum delete) | ✅ Semi-algebraic (O(1) unless extremum deleted, v0.9.0 safety guard) |
| BOOL_AND / BOOL_OR | ❌ | ✅ Group-rescan |
| STRING_AGG | ❌ | ✅ Group-rescan |
| ARRAY_AGG | ❌ | ✅ Group-rescan |
| JSON_AGG / JSONB_AGG | ❌ | ✅ Group-rescan |
| BIT_AND / BIT_OR / BIT_XOR | ❌ | ✅ Group-rescan |
| JSON_OBJECT_AGG / JSONB_OBJECT_AGG | ❌ | ✅ Group-rescan |
| STDDEV / VARIANCE (all variants) | ❌ | ✅ Algebraic (O(1) sum-of-squares decomposition, v0.9.0) |
| MODE / PERCENTILE_CONT / PERCENTILE_DISC | ❌ | ✅ Group-rescan |
| CORR / COVAR / REGR_* (11 functions) | ❌ | ✅ Group-rescan |
| ANY_VALUE (PG 16+) | ❌ | ✅ Group-rescan |
| JSON_ARRAYAGG / JSON_OBJECTAGG (PG 16+) | ❌ | ✅ Group-rescan |
| User-defined aggregates (CREATE AGGREGATE) | ❌ | ✅ Group-rescan |
| FILTER (WHERE) clause | ❌ | ✅ |
| WITHIN GROUP (ORDER BY) | ❌ | ✅ |
| COUNT(DISTINCT expr) / SUM(DISTINCT expr) | ❌ | ✅ |
| Total | 5 | 60+ |

Gap for pg_ivm: Massive. Only 5 of ~60 built-in aggregate functions are supported. pg_trickle v0.9.0 also introduced algebraic (O(1)) maintenance for COUNT, SUM, AVG, STDDEV, and VARIANCE: these aggregates update in constant time per changed row via running totals, whereas pg_ivm's algebraic support covers only COUNT, SUM, and AVG. pg_trickle additionally supports user-defined aggregates via group-rescan, and corrects floating-point drift in algebraic running totals (pg_trickle.algebraic_drift_reset_cycles).
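A single defining query can mix the two maintenance strategies. In this sketch (illustrative schema), SUM and STDDEV are maintained algebraically while STRING_AGG falls back to group-rescan, and the FILTER clause would be rejected by pg_ivm outright:

```sql
SELECT pgtrickle.create_stream_table(
    'order_stats',
    $$SELECT region,
             SUM(amount)                                 AS total,      -- algebraic, O(1)
             STDDEV(amount)                              AS spread,     -- algebraic, O(1)
             COUNT(*) FILTER (WHERE status = 'returned') AS returns,    -- FILTER clause
             STRING_AGG(currency, ',')                   AS currencies  -- group-rescan
      FROM orders
      GROUP BY region$$,
    schedule => '1m'
);
```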

5.2 Joins

| Feature | pg_ivm | pg_trickle |
|---|---|---|
| Inner join | ✅ | ✅ |
| Self-join | ✅ | ✅ |
| LEFT JOIN | ✅ (restricted) | ✅ (full) |
| RIGHT JOIN | ✅ (restricted) | ✅ (normalized to LEFT) |
| FULL OUTER JOIN | ✅ (restricted) | ✅ (8-part delta) |
| NATURAL JOIN | ? | ✅ |
| Cross join | ? | ✅ |
| Nested joins (3+ tables) | ✅ | ✅ |
| Non-equi joins (theta) | ? | ✅ |
| Outer join + aggregates | ❌ | ✅ |
| Outer join + subqueries | ❌ | ✅ |
| Outer join + CASE/non-strict | ❌ | ✅ |
| Outer join multi-condition | ❌ (single equality only) | ✅ |

Gap for pg_ivm: Outer joins are heavily restricted — single equijoin condition, no aggregates, no subqueries, no CASE expressions, no IS NULL in WHERE.
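For instance, an outer join combined with a second join condition and an aggregate hits three of pg_ivm's restrictions at once, while pg_trickle maintains it differentially. A sketch with an illustrative schema:

```sql
-- Multi-condition LEFT JOIN + aggregate: rejected by pg_ivm,
-- maintained incrementally by pg_trickle.
SELECT pgtrickle.create_stream_table(
    'customer_order_counts',
    $$SELECT c.id, c.name, COUNT(o.id) AS active_orders
      FROM customers c
      LEFT JOIN orders o
        ON o.customer_id = c.id
       AND o.status = 'active'
      GROUP BY c.id, c.name$$
);
```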

5.3 Subqueries

| Feature | pg_ivm | pg_trickle |
|---|---|---|
| Simple subquery in FROM | ✅ (no aggregates/DISTINCT inside) | ✅ (full support) |
| EXISTS in WHERE | ✅ (AND only, no agg/DISTINCT inside) | ✅ (AND + OR, full SQL inside) |
| NOT EXISTS in WHERE | ❌ | ✅ (anti-join operator) |
| IN (subquery) | ❌ | ✅ (rewritten to semi-join) |
| NOT IN (subquery) | ❌ | ✅ (rewritten to anti-join) |
| ALL (subquery) | ❌ | ✅ (rewritten to anti-join) |
| Scalar subquery in SELECT | ❌ | ✅ (scalar subquery operator) |
| Scalar subquery in WHERE | ❌ | ✅ (auto-rewritten to CROSS JOIN) |
| LATERAL subquery in FROM | ❌ | ✅ (row-scoped recomputation) |
| LATERAL SRF in FROM | ❌ | ✅ (jsonb_array_elements, unnest, etc.) |
| Subqueries in OR | ❌ | ✅ (auto-rewritten to UNION) |

Gap for pg_ivm: Severely limited subquery support. No anti-joins, no scalar subqueries, no LATERAL, no SRFs.
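As an example of the anti-join operator, NOT EXISTS in a defining query is maintained incrementally by pg_trickle but rejected by pg_ivm. A sketch with an illustrative schema:

```sql
SELECT pgtrickle.create_stream_table(
    'customers_without_orders',
    $$SELECT c.id, c.name
      FROM customers c
      WHERE NOT EXISTS (
          SELECT 1 FROM orders o WHERE o.customer_id = c.id
      )$$
);
```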

5.4 CTEs

| Feature | pg_ivm | pg_trickle |
|---|---|---|
| Simple non-recursive CTE | ✅ (no aggregates/DISTINCT inside) | ✅ (full SQL inside) |
| Multi-reference CTE | ? | ✅ (shared delta optimization) |
| Chained CTEs | ? | ✅ |
| WITH RECURSIVE | ❌ | ✅ (semi-naive, DRed, recomputation; IMMEDIATE mode with stack-depth warning) |

Gap for pg_ivm: No recursive CTEs, no aggregates/DISTINCT inside CTEs.
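A recursive hierarchy rollup is the canonical case. In this sketch (illustrative schema), pg_trickle maintains the transitive closure with semi-naive evaluation; pg_ivm rejects WITH RECURSIVE entirely:

```sql
SELECT pgtrickle.create_stream_table(
    'org_tree',
    $$WITH RECURSIVE tree AS (
          SELECT id, name, 0 AS depth
          FROM employees WHERE manager_id IS NULL
          UNION ALL
          SELECT e.id, e.name, t.depth + 1
          FROM employees e
          JOIN tree t ON e.manager_id = t.id
      )
      SELECT * FROM tree$$
);
```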

5.5 Set Operations

| Feature | pg_ivm | pg_trickle |
|---|---|---|
| UNION ALL | ❌ | ✅ |
| UNION (set) | ❌ | ✅ (via DISTINCT + UNION ALL) |
| INTERSECT | ❌ | ✅ (dual-count multiplicity) |
| INTERSECT ALL | ❌ | ✅ |
| EXCEPT | ❌ | ✅ (dual-count multiplicity) |
| EXCEPT ALL | ❌ | ✅ |

Gap for pg_ivm: No set operations at all.
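For example, EXCEPT (maintained via dual-count multiplicity) supports a classic reconciliation view. A sketch, with illustrative table names:

```sql
-- Orders that have no corresponding shipment row.
SELECT pgtrickle.create_stream_table(
    'unshipped_orders',
    $$SELECT id FROM orders
      EXCEPT
      SELECT order_id FROM shipments$$
);
```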

5.6 Window Functions

| Feature | pg_ivm | pg_trickle |
|---|---|---|
| ROW_NUMBER, RANK, DENSE_RANK | ❌ | ✅ |
| SUM/AVG/COUNT OVER () | ❌ | ✅ |
| Frame clauses (ROWS/RANGE/GROUPS) | ❌ | ✅ |
| Named WINDOW clauses | ❌ | ✅ |
| PARTITION BY recomputation | ❌ | ✅ |

Gap for pg_ivm: Window functions are completely unsupported.
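A ranked leaderboard illustrates the partition-recomputation strategy: on refresh, only partitions containing changed rows are re-ranked. A sketch with an illustrative schema:

```sql
SELECT pgtrickle.create_stream_table(
    'region_leaders',
    $$SELECT region, product,
             RANK() OVER (PARTITION BY region ORDER BY revenue DESC) AS rnk
      FROM product_revenue$$
);
```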

5.7 DISTINCT & Grouping

| Feature | pg_ivm | pg_trickle |
|---|---|---|
| SELECT DISTINCT | ✅ | ✅ |
| DISTINCT ON (expr, ...) | ❌ | ✅ (auto-rewritten to ROW_NUMBER) |
| GROUP BY | ✅ | ✅ |
| GROUPING SETS | ❌ | ✅ (auto-rewritten to UNION ALL) |
| CUBE | ❌ | ✅ (auto-rewritten via GROUPING SETS) |
| ROLLUP | ❌ | ✅ (auto-rewritten via GROUPING SETS) |
| GROUPING() function | ❌ | ✅ |
| HAVING | ❌ | ✅ |

5.8 Source Table Types

| Source type | pg_ivm | pg_trickle |
|---|---|---|
| Simple heap tables | ✅ | ✅ |
| Views | ❌ | ✅ (auto-inlined) |
| Materialized views | ❌ | FULL mode only |
| Partitioned tables | ❌ | ✅ |
| Partitions | ❌ | ✅ (via parent) |
| Foreign tables | ❌ | FULL mode only |
| Other IMMVs / stream tables | ❌ | ✅ (DAG cascading) |

Gap for pg_ivm: Only simple heap tables. No views, no partitioned tables, no cascading.
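View inlining means existing view hierarchies need no restructuring: pg_trickle expands the view definition and attaches CDC to the underlying tables. A sketch with illustrative names:

```sql
CREATE VIEW active_orders_v AS
    SELECT * FROM orders WHERE status = 'active';

-- The view is auto-inlined; CDC is set up on the orders table.
SELECT pgtrickle.create_stream_table(
    'active_totals',
    'SELECT region, SUM(amount) AS total FROM active_orders_v GROUP BY region'
);
```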


6. API Comparison

pg_ivm API

-- Create an IMMV
SELECT pgivm.create_immv('myview', 'SELECT * FROM mytab');

-- Full refresh (emergency)
SELECT pgivm.refresh_immv('myview', true);   -- with data
SELECT pgivm.refresh_immv('myview', false);  -- disable maintenance

-- Inspect
SELECT immvrelid, pgivm.get_immv_def(immvrelid)
FROM pgivm.pg_ivm_immv;

-- Drop
DROP TABLE myview;

-- Rename
ALTER TABLE myview RENAME TO myview2;

pg_ivm IMMVs are standard PostgreSQL tables. They can be dropped with DROP TABLE and renamed with ALTER TABLE.

pg_trickle API

-- Create a stream table (AUTO mode: DIFFERENTIAL when possible, FULL fallback)
SELECT pgtrickle.create_stream_table(
    'order_totals',
    'SELECT region, SUM(amount) AS total FROM orders GROUP BY region'
    -- refresh_mode defaults to 'AUTO', schedule defaults to 'calculated'
);

-- Create a stream table (explicit deferred, scheduled)
SELECT pgtrickle.create_stream_table(
    'order_totals',
    'SELECT region, SUM(amount) AS total FROM orders GROUP BY region',
    schedule     => '2m',
    refresh_mode => 'DIFFERENTIAL'
);

-- Create a stream table (immediate, in-transaction)
SELECT pgtrickle.create_stream_table(
    'live_totals',
    'SELECT region, SUM(amount) AS total FROM orders GROUP BY region',
    schedule     => NULL,
    refresh_mode => 'IMMEDIATE'
);

-- Manual refresh
SELECT pgtrickle.refresh_stream_table('order_totals');

-- Alter schedule, mode, or defining query
SELECT pgtrickle.alter_stream_table('order_totals', schedule => '5m');
SELECT pgtrickle.alter_stream_table(
    'order_totals',
    query => 'SELECT region, SUM(amount) AS total FROM orders WHERE active GROUP BY region'
);

-- Drop
SELECT pgtrickle.drop_stream_table('order_totals');

-- Status and monitoring
SELECT * FROM pgtrickle.pgt_status();
SELECT * FROM pgtrickle.pg_stat_stream_tables;
SELECT * FROM pgtrickle.pgt_stream_tables;

-- DAG inspection
SELECT * FROM pgtrickle.pgt_dependencies;

-- Extended observability (added v0.2.0+)
SELECT * FROM pgtrickle.change_buffer_sizes();  -- CDC buffer health
SELECT * FROM pgtrickle.list_sources('order_totals');  -- source table stats
SELECT * FROM pgtrickle.dependency_tree();  -- ASCII DAG view
SELECT * FROM pgtrickle.health_check();  -- OK/WARN/ERROR triage
SELECT * FROM pgtrickle.refresh_timeline();  -- cross-stream history
SELECT * FROM pgtrickle.trigger_inventory();  -- CDC trigger audit
SELECT * FROM pgtrickle.diamond_groups();  -- diamond consistency groups

-- Source gating (v0.5.0)
SELECT pgtrickle.gate_source('orders');      -- pause CDC
SELECT pgtrickle.ungate_source('orders');    -- resume CDC
SELECT * FROM pgtrickle.source_gates();      -- gate status

-- Watermarks (v0.7.0)
SELECT pgtrickle.advance_watermark('orders', '2026-03-20 12:00:00');
SELECT pgtrickle.create_watermark_group('sync', ARRAY['orders','products'], 30);
SELECT * FROM pgtrickle.watermarks();
SELECT * FROM pgtrickle.watermark_status();

-- Parallel refresh monitoring (v0.4.0)
SELECT * FROM pgtrickle.worker_pool_status();
SELECT * FROM pgtrickle.parallel_job_status();

-- Refresh groups (v0.9.0)
SELECT pgtrickle.create_refresh_group('my_group', ARRAY['st1','st2']);
SELECT pgtrickle.drop_refresh_group('my_group');

-- Idempotent DDL (v0.6.0)
SELECT pgtrickle.create_or_replace_stream_table(
    'order_totals',
    'SELECT region, SUM(amount) AS total FROM orders GROUP BY region'
);

pg_trickle stream tables are regular PostgreSQL tables, but they are managed through the pgtrickle schema's API functions: use alter_stream_table for schedule, mode, and query changes, and drop_stream_table to drop. Renaming with ALTER TABLE is not supported (event-trigger rename support is planned).


7. Scheduling and Dependency Management

| Capability | pg_ivm | pg_trickle |
|---|---|---|
| Automatic scheduling | ❌ (immediate only, no scheduler) | ✅ background worker |
| Manual refresh | refresh_immv() | refresh_stream_table() |
| Cron schedules | ❌ | ✅ (standard 5/6-field cron + aliases) |
| Duration-based staleness bounds | ❌ | ✅ ('30s', '5m', '1h', …) |
| Dependency DAG | ❌ | ✅ (stream tables can reference other stream tables) |
| Topological refresh ordering | ❌ | ✅ (upstream refreshes before downstream) |
| CALCULATED schedule propagation | ❌ | ✅ (consumers drive upstream schedules) |
| Parallel refresh | ❌ | ✅ (worker pool with database + cluster caps, v0.4.0) |
| Circular pipeline support | ❌ | ✅ (monotone cycles with fixed-point iteration, v0.7.0) |
| Watermark coordination | ❌ | ✅ (multi-source readiness gates, v0.7.0) |
| Refresh group management | ❌ | ✅ (atomic multi-ST refresh, v0.9.0) |

pg_trickle's DAG scheduling is a significant differentiator: you can build multi-layer pipelines where each downstream stream table is automatically refreshed after its upstream dependencies.
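A minimal two-layer pipeline sketch (illustrative schema): hourly_sales feeds daily_sales, the scheduler refreshes them in topological order, and the upstream layer's default 'calculated' schedule is driven by the downstream '5m' freshness bound:

```sql
SELECT pgtrickle.create_stream_table(
    'hourly_sales',
    $$SELECT date_trunc('hour', ordered_at) AS hour, SUM(amount) AS total
      FROM orders GROUP BY 1$$
    -- schedule defaults to 'calculated': inherited from consumers
);

SELECT pgtrickle.create_stream_table(
    'daily_sales',
    $$SELECT hour::date AS day, SUM(total) AS total
      FROM hourly_sales GROUP BY 1$$,
    schedule => '5m'
);
```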


8. Change Data Capture

| Attribute | pg_ivm | pg_trickle |
|---|---|---|
| Mechanism | AFTER row triggers (inline, same txn) | AFTER row/statement triggers → change buffer |
| WAL-based CDC | ❌ | ✅ optional (pg_trickle.cdc_mode = 'wal') |
| Statement-level triggers | ❌ | ✅ (v0.4.0, reduced overhead for bulk operations) |
| Logical replication slots | Not used | Used in WAL mode only |
| Write-side overhead | Higher (view maintenance in txn) | Lower (small trigger insert only) |
| Change buffer tables | None (applied immediately) | pgtrickle_changes.changes_<oid> |
| TRUNCATE handling | IMMV truncated/refreshed synchronously | Change buffer cleared; full refresh queued |

9. Concurrency and Isolation

pg_ivm

  • Holds ExclusiveLock on the IMMV during incremental update.
  • In READ COMMITTED: serializes concurrent updates to the same IMMV.
  • In REPEATABLE READ / SERIALIZABLE: raises an error when a concurrent transaction has already updated the IMMV.
  • Single-table INSERT-only IMMVs use the lighter RowExclusiveLock.

pg_trickle

  • Refresh operations acquire an advisory lock per stream table so only one refresh can run at a time.
  • Base table writes are never blocked by refresh operations.
  • Parallel refresh (v0.4.0): pg_trickle.parallel_refresh_mode = 'on' enables a worker pool with per-database (max_concurrent_refreshes, default 4) and cluster-wide (max_dynamic_refresh_workers) caps.
  • Atomic refresh groups for diamond dependencies.
  • Crash recovery: in-flight refreshes are marked failed on restart; the scheduler retries on the next cycle.

10. Observability

| Feature | pg_ivm | pg_trickle |
|---|---|---|
| Catalog of managed views | pgivm.pg_ivm_immv | pgtrickle.pgt_stream_tables |
| Per-refresh timing/history | ❌ | pgtrickle.pgt_refresh_history |
| Staleness reporting | ❌ | stale column + get_staleness() |
| Scheduler status | ❌ | pgtrickle.pgt_status() |
| NOTIFY-based alerting | ❌ | pgtrickle_refresh channel (10+ alert types) |
| Error tracking | ❌ | ✅ consecutive error counter, last error message |
| dbt integration | ❌ | dbt-pgtrickle macro package |
| Explain/introspection | ❌ | explain_st |
| CDC buffer health | ❌ | pgtrickle.change_buffer_sizes() (v0.2.0) |
| Source table stats | ❌ | pgtrickle.list_sources() (v0.2.0) |
| Dependency tree view | ❌ | pgtrickle.dependency_tree() (v0.2.0) |
| Health triage | ❌ | pgtrickle.health_check() (v0.2.0) |
| Cross-stream refresh history | ❌ | pgtrickle.refresh_timeline() (v0.2.0) |
| CDC trigger audit | ❌ | pgtrickle.trigger_inventory() (v0.2.0) |
| Diamond group inspection | ❌ | pgtrickle.diamond_groups() (v0.2.0) |
| Quick health summary | ❌ | pgtrickle.quick_health view (v0.5.0) |
| Source gating status | ❌ | pgtrickle.source_gates() (v0.5.0) |
| Watermark monitoring | ❌ | pgtrickle.watermarks() / watermark_status() (v0.7.0) |
| Parallel worker status | ❌ | pgtrickle.worker_pool_status() / parallel_job_status() (v0.4.0) |
| SCC cycle status | ❌ | pgtrickle.pgt_scc_status() (v0.7.0) |
| Replication slot health | ❌ | pgtrickle.slot_health() |
| CDC mode per-source | ❌ | pgtrickle.pgt_cdc_status view |

11. Installation and Deployment

| Attribute | pg_ivm | pg_trickle |
|---|---|---|
| Pre-built packages | RPM via yum.postgresql.org | OCI image, tarball |
| CNPG / Kubernetes | ❌ (no OCI image) | ✅ OCI extension image + CNPG smoke tests |
| Docker local dev | Manual | ✅ documented + Docker Hub image |
| shared_preload_libraries | Required (or session_preload_libraries) | Required |
| Extension upgrade scripts | ✅ (1.0 → 1.1 → … → 1.13) | ✅ (0.1.3 → … → 0.9.0, CI completeness check, upgrade E2E tests) |
| pg_dump / restore | Manual IMMV recreation required | ✅ Standard pg_dump supported (v0.8.0) |

12. Performance Characteristics

pg_ivm

  • Write path: slower — every DML statement triggers inline view maintenance. From the README example: a single row update on a 10M-row join IMMV takes ~15 ms vs ~9 ms for a plain table update.
  • Read path: instant — IMMV is always current, no refresh needed on read.
  • Refresh (full): comparable to REFRESH MATERIALIZED VIEW (~20 seconds for a 10M-row join in the example).

pg_trickle

  • Write path: minimal overhead — only a small trigger INSERT into the change buffer (~2–50 μs per row). In WAL mode, zero trigger overhead. Statement-level CDC triggers (v0.4.0) further reduce overhead for bulk ops.
  • Read path: instant from the materialized table (potentially stale).
  • Refresh (differential): proportional to the number of changed rows, not the total table size. A single-row change on a million-row aggregate touches one row's worth of computation. Algebraic aggregates (v0.9.0) like COUNT/SUM/AVG/STDDEV/VAR update in O(1) constant time per changed row.
  • Refresh (full): re-runs the entire query; comparable to REFRESH MATERIALIZED VIEW.
  • Parallel refresh (v0.4.0): linear speedup with worker pool size.
  • I/O optimizations (v0.9.0): column skipping, source skipping in joins, WHERE filter push-down, index-aware MERGE for tiny change ratios, scalar subquery short-circuit.

13. Known Limitations

pg_ivm Limitations

  • Adds latency to every write on tracked base tables.
  • Cannot track tables modified via logical replication (subscriber nodes are not updated).
  • pg_dump / pg_upgrade require manual recreation of all IMMVs.
  • Limited aggregate support (no user-defined aggregates, no window functions).
  • Column type restrictions (btree operator class required in target list).
  • No scheduler or background worker — refresh is immediate only.
  • On high-churn tables, min/max aggregates can trigger expensive rescans.

pg_trickle Limitations

  • In DIFFERENTIAL/FULL mode, data is stale between refresh cycles. Use IMMEDIATE mode for zero-staleness, in-transaction consistency.
  • Recursive CTEs in IMMEDIATE mode emit a stack-depth warning; very deep recursion may hit PostgreSQL's stack limit.
  • Recursive CTEs in DIFFERENTIAL mode fall back to full recomputation for mixed DELETE/UPDATE changes (DRed scheduled for v0.10.0+).
  • LIMIT without ORDER BY is not supported in defining queries.
  • OFFSET without ORDER BY … LIMIT is not supported. Paged TopK (ORDER BY … LIMIT N OFFSET M) is fully supported.
  • ORDER BY + LIMIT (TopK) without OFFSET uses scoped recomputation (MERGE).
  • Volatile SQL functions rejected in DIFFERENTIAL mode.
  • Materialized views as sources not supported in DIFFERENTIAL mode.
  • Window functions in expressions (e.g. CASE WHEN ROW_NUMBER() OVER (...) > 5) require FULL mode.
  • Foreign tables as sources require FULL mode.
  • ALTER EXTENSION pg_trickle UPDATE migration scripts ship only from v0.2.1 onward; from there the upgrade path is continuous through v0.9.0.
  • Targets PostgreSQL 18 only; no backport to PG 13–17 (planned for PG 14–18).
  • v0.9.x series — extensive testing but not yet production-hardened at scale.

14. PostgreSQL Version Support

| Version | pg_ivm | pg_trickle (current) | pg_trickle (planned) |
|---|---|---|---|
| PG 13 | ✅ | ❌ | ❌ (EOL Nov 2025) |
| PG 14 | ✅ | ❌ | ✅ (full plan) |
| PG 15 | ✅ | ❌ | ✅ (full plan) |
| PG 16 | ✅ | ❌ | ✅ (MVP target) |
| PG 17 | ✅ | ❌ | ✅ (MVP target) |
| PG 18 | ✅ | ✅ | ✅ |

Planned resolution: PLAN_PG_BACKCOMPAT.md:

  • Minimum viable (PG 16–18): ~1.5 weeks effort.
  • Full target (PG 14–18): ~2.5–3 weeks effort.
  • pgrx 0.17.0 already supports PG 14–18 via feature flags.
  • ~435 lines in src/dvm/parser.rs need #[cfg] gating (all in JSON/SQL-standard sections). The remaining ~13,500 lines compile unchanged.

Feature degradation matrix:

| Feature | PG 14 | PG 15 | PG 16 | PG 17 | PG 18 |
|---|---|---|---|---|---|
| Core streaming tables | ✅ | ✅ | ✅ | ✅ | ✅ |
| Trigger-based CDC | ✅ | ✅ | ✅ | ✅ | ✅ |
| Differential refresh | ✅ | ✅ | ✅ | ✅ | ✅ |
| SQL/JSON constructors | ❌ | ❌ | ✅ | ✅ | ✅ |
| JSON_TABLE | ❌ | ❌ | ❌ | ✅ | ✅ |
| WAL-based CDC | Needs test | Needs test | Likely | Likely | ✅ |

15. Features Unique to Each System

Features Unique to pg_trickle (42 items, no pg_ivm equivalent)

  1. IMMEDIATE + deferred modes (pg_ivm is immediate-only; pg_trickle offers both)
  2. 60+ aggregate functions (vs 5), including algebraic O(1) for COUNT/SUM/AVG/STDDEV/VAR
  3. FILTER / HAVING / WITHIN GROUP on aggregates
  4. Window functions (partition recomputation)
  5. Set operations (UNION ALL, UNION, INTERSECT, EXCEPT — all 6 variants)
  6. Recursive CTEs (semi-naive, DRed, recomputation; including IMMEDIATE mode with stack-depth warning)
  7. LATERAL subqueries and SRFs (jsonb_array_elements, unnest, JSON_TABLE)
  8. Anti-join / semi-join operators (NOT EXISTS, NOT IN, IN, EXISTS with full SQL)
  9. Scalar subqueries in SELECT list
  10. Views as sources (auto-inlined with nested expansion)
  11. Partitioned table support (RANGE, LIST, HASH with auto-rebuild on ATTACH PARTITION)
  12. Cascading stream tables (ST referencing other STs via DAG)
  13. Background scheduler (cron + duration + canonical periods) with multi-database auto-discovery
  14. GROUPING SETS / CUBE / ROLLUP (auto-rewritten)
  15. DISTINCT ON (auto-rewritten to ROW_NUMBER)
  16. Hybrid CDC (trigger → WAL transition)
  17. DDL change detection and automatic reinitialization (including ALTER FUNCTION body changes)
  18. Monitoring suite (15+ observability functions: change_buffer_sizes, list_sources, dependency_tree, health_check, refresh_timeline, trigger_inventory, diamond_groups, source_gates, watermarks, watermark_groups, watermark_status, worker_pool_status, parallel_job_status, pgt_scc_status, slot_health, check_cdc_health)
  19. Auto-rewrite pipeline (6 transparent SQL rewrites)
  20. Volatile function detection
  21. AUTO refresh mode (smart DIFFERENTIAL/FULL selection with transparent fallback)
  22. ALTER QUERY — change the defining query of an existing stream table online, with schema-change classification and OID-preserving migration
  23. dbt macro package (materialization, status macro, health test, refresh operation)
  24. CNPG / Kubernetes deployment
  25. SQL/JSON constructors (JSON_OBJECT, JSON_ARRAY, etc.)
  26. JSON_TABLE support (PG 17+)
  27. TopK stream tables (ORDER BY + LIMIT, including IMMEDIATE mode via micro-refresh)
  28. Paged TopK (ORDER BY + LIMIT + OFFSET for server-side pagination)
  29. Diamond dependency consistency (multi-path refresh atomicity with SAVEPOINT)
  30. Extension upgrade infrastructure (SQL migration scripts, CI completeness check, upgrade E2E tests, per-release SQL baselines)
  31. Row Level Security (refreshes see all data; RLS policies on ST itself; IMMEDIATE mode secured; internal change buffers shielded from RLS interference) (v0.5.0)
  32. Source gating (pause/resume CDC for bulk loads: gate_source, ungate_source) (v0.5.0)
  33. Append-only fast path (append_only => true skips merge for INSERT-only tables) (v0.5.0)
  34. Parallel refresh (background worker pool with per-database and cluster-wide caps, atomic groups for diamond dependencies) (v0.4.0)
  35. Statement-level CDC triggers (reduced write-side overhead for bulk operations) (v0.4.0)
  36. Circular pipeline support (monotone cycles with fixed-point iteration, max_fixpoint_iterations safety limit, SCC status monitoring) (v0.7.0)
  37. Watermark APIs (delay refresh until multi-source data is ready: advance_watermark, create_watermark_group, tolerance-based readiness) (v0.7.0)
  38. pg_dump / pg_restore support (safe backup with auto-reconnect of streams) (v0.8.0)
  39. Algebraic aggregate maintenance (O(1) constant-time updates for COUNT/SUM/AVG/STDDEV/VAR with floating-point drift correction) (v0.9.0)
  40. Refresh group management (create_refresh_group, drop_refresh_group for atomic multi-ST refresh) (v0.9.0)
  41. Automatic backoff (exponential slowdown for overloaded streams) (v0.9.0)
  42. Index-aware MERGE (use index lookups for tiny change ratios) (v0.9.0)

Features Unique to pg_ivm (with planned resolutions)

| # | Feature | Status | Ref |
|---|---|---|---|
| 1 | Immediate (synchronous) maintenance | Closed — IMMEDIATE refresh mode fully implemented (all phases) | PLAN_TRANSACTIONAL_IVM |
| 2 | Auto-index creation on GROUP BY / DISTINCT / PK | Postponed (Phase 2 of transactional IVM) | PLAN_TRANSACTIONAL_IVM §5.2 |
| 3 | TRUNCATE propagation (auto-truncate IMMV) | Closed — IMMEDIATE mode fires full refresh on TRUNCATE | PLAN_TRANSACTIONAL_IVM §3.2 |
| 4 | Row Level Security respect | Closed — v0.5.0: refreshes see all data; RLS on ST itself; IMMEDIATE mode secured; change buffers shielded | ROW_LEVEL_SECURITY.md |
| 5 | PostgreSQL 13–17 support | PG 14–18 backcompat planned (~2.5–3 weeks) | PLAN_PG_BACKCOMPAT |
| 6 | session_preload_libraries | Not applicable (background worker needs shared_preload) | |
| 7 | Rename via ALTER TABLE | Event trigger support (low effort) | |
| 8 | Drop via DROP TABLE | Postponed (Phase 2 of transactional IVM) | PLAN_TRANSACTIONAL_IVM §4.3 |
| 9 | Extension upgrade scripts | Closed — Scripts ship from v0.2.1; CI completeness check and upgrade E2E tests in place | |
| 10 | pg_dump / pg_restore | Closed — v0.8.0: safe backup with pg_dump and pg_restore, auto-reconnect streams | |

Of the 10 items, 5 are now closed (immediate maintenance, TRUNCATE, RLS, upgrade scripts, pg_dump), 3 have concrete implementation plans, and 2 are low-priority or not applicable.


16. Use-Case Fit

| Scenario | Recommended |
|---|---|
| Need views consistent within the same transaction | Either (pg_trickle IMMEDIATE mode or pg_ivm) |
| Application cannot tolerate any view staleness | Either (pg_trickle IMMEDIATE mode or pg_ivm) |
| High write throughput, views can be slightly stale | pg_trickle (DIFFERENTIAL mode) |
| Multi-layer summary pipelines with dependencies | pg_trickle |
| Time-based or cron-driven refresh schedules | pg_trickle |
| Views with complex SQL (window functions, CTEs, UNION) | pg_trickle |
| Simple aggregation with zero-staleness requirement | Either (pg_trickle has richer SQL coverage) |
| Kubernetes / CloudNativePG deployment | pg_trickle |
| dbt integration | pg_trickle |
| Circular / self-referencing pipelines | pg_trickle |
| Multi-source watermark coordination | pg_trickle |
| High-throughput bulk loading (append-only) | pg_trickle (append-only fast path) |
| Row Level Security on analytical summaries | pg_trickle (richer RLS model) |
| pg_dump / pg_restore workflow | pg_trickle |
| PostgreSQL 13–17 | pg_ivm |
| PostgreSQL 18 | pg_trickle (superset of pg_ivm) |
| Production-hardened, stable API | pg_ivm |
| Early adopter, rich SQL coverage needed | pg_trickle |

17. Coexistence

The two extensions can be installed in the same database simultaneously — they use different schemas (pgivm vs pgtrickle/pgtrickle_changes) and do not interfere with each other. However, with pg_trickle's IMMEDIATE mode now available and its dramatically broader feature set (v0.9.0), there is little reason to use both:

  • Use pg_trickle IMMEDIATE for small, critical lookup tables that must be perfectly consistent within transactions (the use-case that previously required pg_ivm).
  • Use pg_trickle DIFFERENTIAL/FULL for large analytical summary tables, multi-layer aggregation pipelines, circular pipelines, or views where slight staleness is acceptable.
  • Use pg_trickle AUTO (default) to let the system choose the best strategy.
  • Use pg_ivm only if you need PostgreSQL 13–17 support or prefer its mature, battle-tested codebase.

18. Recommendations

Planned work that closes pg_ivm gaps

| Priority | Item | Plan | Effort | Closes Gaps |
|---|---|---|---|---|
| ✅ Done | IMMEDIATE refresh mode (all phases) | PLAN_TRANSACTIONAL_IVM | Complete | #1 (immediate maintenance), #3 (TRUNCATE) |
| ✅ Done | Extension upgrade scripts | v0.2.1 release | Complete | #9 (upgrade scripts) |
| ✅ Done | Row Level Security | v0.5.0 release | Complete | #4 (RLS) |
| ✅ Done | pg_dump / pg_restore | v0.8.0 release | Complete | #10 (backup/restore) |
| Postponed | pg_ivm compatibility layer | PLAN_TRANSACTIONAL_IVM Phase 2 | Deferred to post-1.0 | #2 (auto-indexing), #7 (rename), #8 (DROP TABLE) |
| High | PG 16–18 backcompat (MVP) | PLAN_PG_BACKCOMPAT §11 | ~1.5 weeks | #5 (PG version support) |
| Medium | PG 14–18 backcompat (full) | PLAN_PG_BACKCOMPAT §5 | ~2.5–3 weeks | #5 (PG version support) |

Remaining small gaps (no existing plan)

| Priority | Item | Description | Effort |
|---|---|---|---|
| Low | ALTER TABLE RENAME | Detect rename via event trigger, update catalog | 2–4h |

Not worth pursuing

| Item | Reason |
|---|---|
| PG 13 support | EOL since November 2025. Incompatible raw_parser() API. |
| session_preload_libraries | Requires background worker, which needs shared_preload_libraries. |

19. Conclusion

pg_trickle covers all of pg_ivm's SQL surface and extends it dramatically with 55+ additional aggregate functions (including algebraic O(1) maintenance for COUNT/SUM/AVG/STDDEV/VAR), window functions, set operations, recursive CTEs, LATERAL support, anti/semi-joins, circular pipeline support, watermark coordination, parallel refresh, Row Level Security, and a comprehensive operational layer.

The immediate maintenance gap is now fully closed: pg_trickle's IMMEDIATE refresh mode provides the same in-transaction consistency as pg_ivm, while also supporting window functions, LATERAL, scalar subqueries, WITH RECURSIVE (IM1), TopK micro-refresh (IM2), and cascading stream tables in IMMEDIATE mode — all of which pg_ivm cannot do.

The upgrade infrastructure gap is also closed: v0.2.1 ships SQL migration scripts with continuous upgrade path through v0.9.0, a CI completeness checker, and upgrade E2E tests, matching pg_ivm's upgrade path story.

The Row Level Security gap is closed (v0.5.0): refreshes see all data, RLS policies on the stream table itself control access, and IMMEDIATE mode is secured with shielded change buffers.

The pg_dump/restore gap is closed (v0.8.0): safe backup with standard PostgreSQL tools and automatic stream reconnection on restore.

The one remaining structural gap is PG version support:

  • PLAN_PG_BACKCOMPAT details backporting to PG 14–18 (or PG 16–18 as MVP) in ~2.5–3 weeks, primarily by #[cfg]-gating ~435 lines of JSON/SQL-standard parse-tree code.

Once backcompat is implemented, pg_trickle will be a strict superset of pg_ivm in every dimension: same immediate maintenance model, comparable PG version support (14–18 vs 13–18, with PG 13 EOL), dramatically wider SQL coverage (60+ aggregates vs 5, 21 DVM operators, 42 unique features), and a complete operational layer that pg_ivm entirely lacks.

For users migrating from pg_ivm, the IMMEDIATE refresh mode already provides the same zero-staleness guarantee. A full compatibility layer (pgivm.create_immv, pgivm.refresh_immv, pgivm.pg_ivm_immv) is planned for post-1.0 to enable zero-change migration.


References

Triggers vs Logical Replication for CDC in pg_trickle

Status: Evaluation Report (updated with implementation status)
Date: 2026-02-24
Context: ADR-001/ADR-002 in PLAN_ADRS.md · PLAN_USER_TRIGGERS_EXPLICIT_DML.md


Executive Summary

pg_trickle uses row-level AFTER triggers to capture changes on source tables. This report evaluates the trigger-based approach against logical replication (WAL-based CDC) across five dimensions: correctness, performance, operations, and two end-user features — user-defined triggers on stream tables and logical replication subscriptions from stream tables.

Conclusion: Triggers remain the correct choice for the current scope given operational simplicity and zero-config deployment. The hybrid approach — trigger bootstrap at creation with automatic WAL transition for steady state — is now implemented (pg_trickle.cdc_mode GUC, src/wal_decoder.rs). User-defined triggers on stream tables are also implemented (pg_trickle.user_triggers GUC, DISABLE TRIGGER USER during refresh). Both were previously recommendations (§6.2, §6.6) and have now shipped.

However, the atomicity constraint — the original reason for choosing triggers — is primarily a creation-time inconvenience, not a steady-state limitation. Once a stream table exists, logical replication has three significant runtime advantages:

  • No write-side overhead — With triggers, every INSERT/UPDATE/DELETE on a tracked source table does extra work before the application's transaction can commit: it runs a PL/pgSQL function, writes a row into a buffer table, and updates an index. This slows down the application. With logical replication, PostgreSQL already writes every change to its internal transaction log (WAL) regardless — the CDC layer simply reads that log after the fact, so the application's writes are not slowed down at all.

  • TRUNCATE capture — When someone runs TRUNCATE on a source table, row-level triggers do not fire (TRUNCATE replaces the entire file rather than deleting rows one-by-one). This leaves stream tables silently stale until a manual refresh. Logical replication captures TRUNCATE natively from the WAL, so pg_trickle would know immediately that all rows were removed.

  • Change ordering from the transaction log — With triggers, each trigger independently calls pg_current_wal_lsn() to timestamp its change. With logical replication, the ordering comes directly from the WAL — the authoritative, global record of all database changes — which means change ordering is guaranteed to match commit order, even across concurrent transactions.

The two end-user features (user triggers and logical replication FROM stream tables) are both achievable without changing the CDC mechanism. A hybrid approach (triggers for creation, logical replication for steady-state) deserves serious consideration. See §3 for the full analysis.


1. Background

Current Architecture

CDC triggers on each tracked source table write typed per-column rows into per-table buffer tables (pgtrickle_changes.changes_<oid>). Each buffer row captures:

| Column | Purpose |
|---|---|
| change_id | BIGSERIAL ordering within a source |
| lsn | pg_current_wal_lsn() at trigger time |
| action | 'I' / 'U' / 'D' |
| pk_hash | Content hash of PK columns (optional) |
| new_<col> | Per-column NEW values (INSERT/UPDATE) |
| old_<col> | Per-column OLD values (UPDATE/DELETE) |

A covering B-tree index (lsn, pk_hash, change_id) INCLUDE (action) supports the differential refresh's LSN-range scan.
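For concreteness, a hand-written sketch of what the generated buffer table and covering index look like for a hypothetical source with an int PK and one text column (the real DDL is generated by the extension and the OID suffix is illustrative):

```sql
CREATE TABLE pgtrickle_changes.changes_16384 (
    change_id  BIGSERIAL,   -- ordering within this source
    lsn        pg_lsn,      -- pg_current_wal_lsn() at trigger time
    action     "char",      -- 'I' / 'U' / 'D'
    pk_hash    bigint,      -- content hash of PK columns
    new_id     int,   old_id   int,
    new_note   text,  old_note text
);

-- Covering index for the differential refresh's LSN-range scan.
CREATE INDEX ON pgtrickle_changes.changes_16384
    (lsn, pk_hash, change_id) INCLUDE (action);
```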

The Atomicity Constraint

create_stream_table() performs DDL (CREATE TABLE) and DML (catalog inserts) before setting up CDC. pg_create_logical_replication_slot() cannot execute inside a transaction that has already performed writes. This makes single-transaction atomic creation impossible with logical replication — the decisive factor in the original ADR.
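The constraint is easy to reproduce in isolation: PostgreSQL refuses to create a logical slot in a transaction that has already written.

```sql
BEGIN;
CREATE TABLE t (id int);  -- any write taints the transaction
SELECT pg_create_logical_replication_slot('demo_slot', 'pgoutput');
-- ERROR: cannot create logical replication slot in transaction
--        that has performed writes
ROLLBACK;
```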


2. Comparison Matrix

2.1 Correctness & Transactional Safety

| Aspect | Triggers | Logical Replication |
|---|---|---|
| Atomic creation | ✅ Same transaction as DDL+catalog | ❌ Slot creation requires separate transaction |
| Change visibility | ✅ Immediate (same transaction) | ⚠️ Asynchronous (after COMMIT + WAL decode) |
| TRUNCATE capture | ❌ Row-level triggers not fired | ✅ WAL emits TRUNCATE since PG 11 |
| Transaction ordering | ✅ Change buffer rows ordered by LSN | ✅ WAL stream preserves commit order |
| Crash recovery | ✅ Buffer tables are WAL-logged; no orphan state | ⚠️ Slot survives crash but may need re-sync |
| Schema change handling | ✅ DDL event hooks rebuild trigger in-place | ⚠️ Requires slot re-creation or output plugin awareness |

Key insight: The TRUNCATE gap is the most significant correctness limitation of the trigger approach. A statement-level AFTER TRUNCATE trigger that marks downstream STs for automatic FULL refresh would close this gap without changing the CDC architecture (see §6 Recommendation 3).
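The proposed mitigation can be sketched as a statement-level TRUNCATE trigger that queues a FULL refresh for downstream stream tables. The helper function name below is hypothetical; only the trigger shape matters:

```sql
CREATE OR REPLACE FUNCTION pgtrickle_changes.on_truncate() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- Hypothetical helper: mark every stream table sourced from
    -- TG_RELID for a FULL refresh on the next scheduler cycle.
    PERFORM pgtrickle.request_full_refresh(TG_RELID);
    RETURN NULL;
END $$;

-- Statement-level triggers DO fire on TRUNCATE, unlike row-level ones.
CREATE TRIGGER pgt_truncate_capture
    AFTER TRUNCATE ON orders
    FOR EACH STATEMENT
    EXECUTE FUNCTION pgtrickle_changes.on_truncate();
```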

2.2 Performance

| Metric | Triggers | Logical Replication |
|---|---|---|
| Per-row write overhead | ~2–4 μs (narrow INSERT) to ~5–15 μs (wide UPDATE) | ~0 (WAL writes happen regardless) |
| Expected throughput reduction | 1.5–5× on tracked source tables | None on source tables |
| Write amplification | 2× (source WAL + buffer table WAL + index) | 1× (only source WAL) |
| Change buffer storage | Heap table + index per source | WAL segments (shared, recycled) |
| Sequence contention | BIGSERIAL per buffer (lightweight) | N/A |
| Throughput ceiling | ~5,000 writes/sec (estimated) | WAL throughput (much higher) |
| Decoding CPU cost | N/A | Non-trivial; output plugin runs in WAL sender |
| Zero-change refresh | ~3 ms (EXISTS check on empty buffer) | ~3 ms (no pending WAL changes) |

Key insight: Trigger overhead is synchronous — every committing transaction pays the cost. For applications with moderate write rates (<5,000 writes/sec) this is acceptable. For high-throughput OLTP workloads, logical replication's zero write-side overhead is a significant advantage.

2.3 Operational Complexity

| Aspect | Triggers | Logical Replication |
|---|---|---|
| PostgreSQL configuration | None required | wal_level = logical + restart |
| Managed PG compatibility | ✅ Works everywhere | ⚠️ Some providers restrict wal_level |
| WAL retention risk | None (buffer tables are independent) | Slots prevent WAL cleanup; disk exhaustion risk |
| Slot management | N/A | Create, monitor, drop; orphan detection |
| max_replication_slots | N/A | Must be sized for number of tracked sources |
| REPLICA IDENTITY config | N/A | Required on all tracked source tables |
| Monitoring | Buffer table row counts | Slot lag, WAL retention, decode rate |
| Extension dependencies | None | Output plugin (pgoutput, wal2json, or custom) |
| Upgrade path | CREATE OR REPLACE FUNCTION | Slot protocol version compatibility |

Key insight: Triggers are operationally simpler by a wide margin. Logical replication introduces a class of failure modes (stuck slots, WAL bloat, replica identity misconfiguration) that require dedicated monitoring and operational runbooks.
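To make the monitoring burden concrete, slot lag and WAL retention can be watched with a single query against PostgreSQL's built-in pg_replication_slots view. This is a generic sketch; the pg_trickle_% slot-name pattern is an assumption, not something the extension defines.

```sql
-- Flag slots that are inactive or retaining large amounts of WAL.
-- wal_status ('reserved', 'extended', 'unreserved', 'lost') exists since PG 13.
SELECT slot_name,
       active,
       wal_status,
       pg_size_pretty(
         pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
       ) AS retained_wal
FROM pg_replication_slots
WHERE slot_name LIKE 'pg\_trickle\_%'
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC;
```

An alert on wal_status = 'unreserved' (or on a retained_wal threshold) catches a stuck consumer before the disk fills.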

2.4 Feature: User Triggers on Stream Tables

This addresses end-user triggers on the output stream tables, not CDC triggers on source tables.

| Aspect | Current (Trigger CDC) | With Logical Replication CDC |
|---|---|---|
| Feasibility | ✅ Achievable via session_replication_role | ✅ Same mechanism applies |
| Refresh suppression | SET LOCAL session_replication_role = 'replica' | Same |
| Post-refresh notification | NOTIFY pg_trickle_refresh with metadata | Same |
| MERGE firing pattern | DELETE+INSERT (not UPDATE); must be suppressed | Same — refresh mechanism is independent of CDC |

Key insight: User trigger support on stream tables is orthogonal to the CDC mechanism and is now implemented. The solution uses ALTER TABLE ... DISABLE TRIGGER USER / ENABLE TRIGGER USER around FULL refresh (avoiding the session_replication_role conflict with logical replication publishing). In DIFFERENTIAL mode, explicit per-row DML (INSERT/UPDATE/DELETE) is used instead of MERGE so that user-defined AFTER triggers fire correctly. The implementation is controlled by the pg_trickle.user_triggers GUC (auto/on/off). See PLAN_USER_TRIGGERS_EXPLICIT_DML.md for the full design.
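As a sketch of what this enables: a user-defined AFTER trigger on a stream table fires once per merged row under DIFFERENTIAL refresh. The audit table, function, and trigger names below are illustrative, not part of pg_trickle.

```sql
SET pg_trickle.user_triggers = 'on';

-- Hypothetical audit target.
CREATE TABLE order_audit (order_id bigint, seen_at timestamptz);

CREATE FUNCTION audit_active_orders() RETURNS trigger AS $$
BEGIN
  INSERT INTO order_audit (order_id, seen_at) VALUES (NEW.id, now());
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Fires for each row the refresh engine inserts, because DIFFERENTIAL
-- refresh issues explicit per-row DML rather than MERGE.
CREATE TRIGGER trg_audit_active_orders
  AFTER INSERT ON active_orders
  FOR EACH ROW EXECUTE FUNCTION audit_active_orders();
```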

Note: Sections 2.1–2.5 compare creation-time and operational aspects. For a focused steady-state comparison (what matters once the ST exists), see §3.

2.5 Feature: Logical Replication FROM Stream Tables

This addresses end-users subscribing to stream table changes via PostgreSQL's built-in logical replication.

| Aspect | Status | Notes |
|---|---|---|
| Basic publishing | ✅ Works today | STs are regular heap tables; CREATE PUBLICATION works |
| __pgt_row_id column | ⚠️ Replicated by default | Use column list in PUBLICATION to exclude, or document as usable PK |
| Differential refresh | ✅ DELETE+INSERT via MERGE are replicated | Subscriber sees individual DELETEs and INSERTs, not UPDATEs |
| Full refresh | ✅ TRUNCATE + INSERT replicated | Subscriber needs replica_identity set; receives TRUNCATE + mass INSERT |
| REPLICA IDENTITY | Needs configuration | __pgt_row_id could serve as unique index for identity |
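A sketch of the two configuration options above (publication, index, and column names are illustrative). Note that they are alternatives: a publication's column list must include the replica identity columns when UPDATEs and DELETEs are published.

```sql
-- Option A (PG 15+): exclude __pgt_row_id via a column list.
CREATE PUBLICATION active_orders_pub
  FOR TABLE active_orders (id, status);

-- Option B: keep __pgt_row_id and use it as the replica identity
-- (USING INDEX requires a unique, non-partial index on NOT NULL columns).
CREATE UNIQUE INDEX active_orders_row_id_idx ON active_orders (__pgt_row_id);
ALTER TABLE active_orders REPLICA IDENTITY USING INDEX active_orders_row_id_idx;
```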

The session_replication_role Conflict

If the refresh engine sets session_replication_role = 'replica' to suppress user triggers (Phase 1 of the user-trigger plan), this may also suppress publication of the DML to logical replication subscribers. When a session is in replica mode, PostgreSQL treats it as a replication subscriber — DML performed in that session may not be forwarded to downstream subscribers (depending on the publication's publish_via_partition_root and the subscriber's origin setting).

This is a potential conflict between the two features. Options:

| Option | User Triggers Suppressed? | Replication Published? | Drawback |
|---|---|---|---|
| session_replication_role = 'replica' | ✅ Yes | ❌ May not be published | Breaks logical replication from STs |
| ALTER TABLE ... DISABLE TRIGGER USER | ✅ Yes | ✅ Yes | Requires ACCESS EXCLUSIVE lock |
| pg_trickle.suppress_user_triggers GUC → DISABLE TRIGGER USER only when needed | ✅ Configurable | ✅ Yes | Lock overhead; crash safety concern (ENABLE on recovery) |
| tgisinternal flag manipulation | ✅ Yes | ✅ Yes | Non-portable; catalog-level hack |

Recommended resolution: Use ALTER TABLE ... DISABLE TRIGGER USER within a SAVEPOINT, restoring on error. The ACCESS EXCLUSIVE lock is brief (only held for the catalog update, not the entire refresh). If the user has enabled both user triggers AND logical replication on a stream table, this is the only approach that supports both simultaneously. If neither feature is in use, skip the overhead entirely.
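The pattern, sketched in SQL with the refresh body elided (the stream table name is illustrative). ALTER TABLE ... DISABLE TRIGGER is transactional in PostgreSQL, so rolling back to the savepoint restores the trigger state.

```sql
BEGIN;
SAVEPOINT before_refresh;
ALTER TABLE active_orders DISABLE TRIGGER USER;  -- brief ACCESS EXCLUSIVE lock
-- ... FULL refresh: TRUNCATE active_orders; INSERT INTO active_orders ... ;
ALTER TABLE active_orders ENABLE TRIGGER USER;
RELEASE SAVEPOINT before_refresh;
COMMIT;
-- On error: ROLLBACK TO SAVEPOINT before_refresh; the trigger state
-- reverts along with the refresh's DML.
```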


3. Separating Creation-Time from Steady-State

The original ADR chose triggers because pg_create_logical_replication_slot() cannot execute inside a transaction that has already performed writes. This report initially treated that constraint as "decisive." But it deserves scrutiny: the atomicity constraint only affects the create_stream_table() call — a one-time event. Once a stream table exists, CDC runs for hours, days, or months. The steady-state characteristics are what actually matter for performance, correctness, and user experience.

3.1 The Atomicity Constraint Is a Solvable Engineering Problem

The constraint is real but workable. Three approaches exist, all with well-understood trade-offs:

| Approach | How It Works | Downside |
|---|---|---|
| Two-phase creation | Phase 1: DDL + catalog in one transaction. Phase 2: slot creation in a separate transaction. Roll back Phase 1 artifacts on Phase 2 failure. | Brief window where the catalog entry exists without CDC. Cleanup on failure adds ~50 lines of code. |
| Background worker handoff | Main transaction creates DDL + catalog + a temporary trigger. A background worker creates the slot asynchronously, then drops the trigger. Changes between COMMIT and slot creation are captured by the temporary trigger, so no data is lost. | Added complexity (~100 lines). |
| Trigger bootstrap → slot transition | Create with triggers (current approach). After the first successful refresh, migrate to logical replication in the background. The most natural hybrid approach. | Trigger overhead during the bootstrap period (minutes). |

None of these are architecturally difficult. The two-phase approach is straightforward — if slot creation fails, drop the storage table and catalog entry. The temporary-trigger approach eliminates even the theoretical data-loss window. These are engineering inconveniences, not fundamental blockers.
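The two-phase flow, sketched in SQL. The catalog table and its columns are hypothetical; only pg_create_logical_replication_slot() is PostgreSQL's real API.

```sql
-- Phase 1: DDL + catalog entry, atomically.
BEGIN;
CREATE TABLE pgtrickle_st.active_orders (id bigint, status text, __pgt_row_id bigint);
INSERT INTO pgtrickle.pgt_stream_tables (st_name)  -- hypothetical catalog
VALUES ('active_orders');
COMMIT;

-- Phase 2: slot creation, in a transaction with no prior writes.
SELECT pg_create_logical_replication_slot('pg_trickle_active_orders', 'pgoutput');

-- If Phase 2 fails, undo Phase 1 (the "~50 lines" of cleanup):
-- DROP TABLE pgtrickle_st.active_orders;
-- DELETE FROM pgtrickle.pgt_stream_tables WHERE st_name = 'active_orders';
```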

3.2 Steady-State: Triggers vs Logical Replication (Honest Comparison)

Once the stream table exists and CDC is running, here is how the two approaches compare on their actual runtime merits.

In plain terms: With triggers, every time the application writes a row to a tracked source table, the database does extra work right then and there — calling a function, writing to a buffer table, updating an index — all before the application's transaction can finish. This is like a toll booth on a highway: every car (write) must stop and pay (trigger overhead) before continuing.

With logical replication, the database already writes every change to its internal transaction log (the WAL) as part of normal operation. CDC simply reads that log after the fact, in a separate background process. The application's writes pass through without stopping — there is no toll booth. The cost of reading the log is paid by the database server, but it happens asynchronously and never slows down the application.

Where Logical Replication Wins (Steady-State)

| Dimension | Trigger Impact | Logical Replication Advantage |
|---|---|---|
| Write-path latency | Every INSERT/UPDATE/DELETE on a tracked source pays ~2–15 μs synchronous overhead (PL/pgSQL dispatch, buffer INSERT, index update). This is inside the committing transaction's critical path. | Zero additional write-path cost. WAL writes happen regardless; decoding is asynchronous. Source table DML performance is completely unaffected. |
| Write amplification | Each source row change produces: (1) source table WAL, (2) buffer table heap write, (3) buffer table WAL, (4) buffer index update, (5) index WAL. ~2–3× total write amplification. | 1× — only the source table's normal WAL. No additional heap writes, no secondary indexes. |
| TRUNCATE capture | Cannot capture. Row-level triggers don't fire. Requires a separate statement-level AFTER TRUNCATE workaround (§4) that only marks for reinit — the actual row deletions are invisible to differential mode. | Native. WAL emits TRUNCATE events since PG 11. The decoder receives a clean signal that all rows were removed. |
| Throughput ceiling | Estimated ~5,000 writes/sec on tracked sources before trigger overhead dominates. PL/pgSQL function dispatch is the bottleneck. | Bounded by WAL throughput — typically 50,000–200,000+ writes/sec depending on hardware and wal_buffers. |
| Connection-pool pressure | Trigger executes in the application's connection. Long-running trigger INSERTs can increase connection hold time under load. | Decoding runs in a dedicated WAL sender process. Application connections are unaffected. |
| Vacuum pressure | Buffer tables accumulate dead tuples between cleanups. Each refresh cycle creates bloat that autovacuum must reclaim. | No buffer tables to vacuum. WAL segments are recycled by the WAL management subsystem. |
| Transaction ID consumption | Each trigger INSERT consumes sub-transaction resources within the outer transaction. High-volume batch operations can cause excessive subtransaction overhead. | No additional transaction work. |

Where Triggers Win (Steady-State)

| Dimension | Trigger Advantage | Logical Replication Impact |
|---|---|---|
| Operational simplicity | No external state to manage. Buffer tables are regular heap tables — queryable, monitorable, backed up normally. Drop the trigger and it's gone. | Replication slots are persistent server-side state. A stuck or crashed consumer prevents WAL recycling, potentially filling the disk. Requires monitoring, max_slot_wal_keep_size guards, and orphan-slot cleanup. |
| Zero configuration | Works with any wal_level (minimal, replica, logical). No restart required. No REPLICA IDENTITY configuration. | Requires wal_level = logical (server restart), max_replication_slots sizing, and REPLICA IDENTITY on every tracked source table. Many managed PostgreSQL providers default to wal_level = replica. |
| Schema evolution | DDL event hooks rebuild the trigger function via CREATE OR REPLACE FUNCTION. New columns are added to the buffer table with ADD COLUMN IF NOT EXISTS. Simple, same-transaction, no coordination. | Schema changes on tracked tables require careful handling. The output plugin must be aware of column additions/removals. Slot may need to be recreated. ALTER TABLE during active decoding can cause protocol errors. |
| Debugging & visibility | Change buffers are queryable tables: SELECT * FROM pgtrickle_changes.changes_12345 ORDER BY change_id DESC LIMIT 10. Immediate visibility into what was captured. | WAL is binary and opaque. Inspecting captured changes requires pg_logical_slot_peek_changes() (decodes pending WAL without consuming it, which can be expensive) or its get counterpart (which advances the slot) — awkward in production. |
| Crash recovery | Buffer tables are WAL-logged and survive crashes. No special recovery needed — the refresh engine picks up from the last frontier LSN. | Slots survive crashes, but the decoding position may be ahead of what pg_trickle has consumed. Requires careful bookkeeping to avoid replaying or losing changes. |
| Multi-source coordination | Each source has an independent buffer table. The refresh engine reads from multiple buffers with independent LSN ranges. No coordination between sources. | Multiple sources could share a single slot (decoding all tables) or use per-source slots. Shared slots require demultiplexing; per-source slots multiply the slot management burden. |
| Isolation | Trigger failure (e.g., buffer table full) raises an error in the application transaction — visible and immediate. | Decoding failure is asynchronous. The application commits successfully, but changes may never reach the buffer. Silent data loss is possible unless monitored. |
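The visibility gap in practice, as two inspection queries (buffer table and slot names are illustrative; pg_logical_slot_peek_changes decodes pending WAL without consuming it, while the get variant advances the slot):

```sql
-- Triggers: the change buffer is an ordinary, queryable table.
SELECT * FROM pgtrickle_changes.changes_12345
ORDER BY change_id DESC LIMIT 10;

-- Logical replication: peek at up to 10 pending changes on a slot
-- created with a textual output plugin such as test_decoding.
SELECT lsn, xid, data
FROM pg_logical_slot_peek_changes('pg_trickle_orders', NULL, 10);
```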

Neutral (Roughly Equivalent)

| Dimension | Notes |
|---|---|
| Refresh-path performance | Both approaches populate the same buffer table schema. The MERGE/DVM pipeline is identical regardless of how buffers were filled. |
| Zero-change detection | Triggers: EXISTS check on empty buffer (~3 ms). Logical replication: check slot position vs current WAL LSN (~3 ms). Equivalent. |
| Memory footprint | Triggers: PL/pgSQL function cache per backend. Logical replication: WAL sender process + decoding context. Both are modest. |

3.3 When Does Logical Replication Become the Better Choice?

The crossover point depends on workload characteristics:

| Scenario | Better Choice | Why |
|---|---|---|
| < 1,000 writes/sec on tracked sources | Triggers | Overhead is negligible; operational simplicity dominates |
| 1,000–5,000 writes/sec | Either (triggers still acceptable) | Trigger overhead is measurable but unlikely to be the bottleneck |
| > 5,000 writes/sec | Logical Replication | Write-path overhead starts to matter; 2–3× write amplification compounds |
| ETL patterns (TRUNCATE + bulk INSERT) | Logical Replication | Native TRUNCATE capture; no stale-data gap |
| Wide tables (20+ columns) | Logical Replication | Trigger overhead scales with column count (~5–15 μs); WAL overhead does not |
| Managed PostgreSQL with wal_level restrictions | Triggers | No choice — logical replication may not be available |
| Many tracked sources (50+) | Logical Replication | Fewer moving parts than 50 triggers + 50 buffer tables + 50 indexes |
| Need logical replication FROM stream tables | Triggers (with caveats) | See §2.5 — session_replication_role conflict; DISABLE TRIGGER USER as workaround |

3.4 Reassessing the Decision

With the atomicity constraint properly scoped as a creation-time concern, the decision to use triggers rests on three remaining pillars:

  1. Operational simplicity — no wal_level change, no slot management, no REPLICA IDENTITY configuration. This is genuinely valuable for an early-stage extension that needs frictionless adoption.

  2. Debugging visibility — queryable buffer tables are a major developer experience advantage. Being able to SELECT * FROM changes_<oid> during debugging is invaluable.

  3. Zero-config deployment — works on any PostgreSQL 18 instance without server restarts or configuration changes. Critical for managed PostgreSQL environments.

However, these advantages are primarily about developer and operator experience, not about the fundamental capability of the system. A mature pg_trickle deployment that needs high write throughput, TRUNCATE support, or minimal source-table impact would be better served by logical replication in steady-state.

The honest assessment: Triggers are the right choice today for pragmatic reasons (simplicity, early-stage adoption, managed PG compatibility). But the report should not overstate the atomicity constraint as a fundamental blocker — it is a solvable problem. If pg_trickle grows to serve high-throughput production workloads, the migration to logical replication for steady-state CDC should be treated as a planned evolution, not a theoretical future.


4. TRUNCATE: The Gap and How to Close It

This limitation is one of the strongest arguments for logical replication in steady-state — see §3.2 for the comparison.

The TRUNCATE limitation is the most commonly cited drawback of trigger-based CDC. PostgreSQL does not fire row-level triggers for TRUNCATE because TRUNCATE operates at the file level (O(1)) — there are no individual rows to enumerate.

Current Behavior

  1. User runs TRUNCATE source_table
  2. CDC trigger does not fire — change buffer remains empty
  3. Scheduler sees zero changes → NO_DATA → stream table is stale
  4. Stream table shows data from rows that no longer exist

Proposed Fix: Statement-Level AFTER TRUNCATE Trigger

PostgreSQL supports statement-level AFTER TRUNCATE triggers. While they provide no OLD row data, they can mark downstream stream tables for reinitialization:

CREATE TRIGGER pg_trickle_truncate_<oid>
  AFTER TRUNCATE ON <source_table>
  FOR EACH STATEMENT
  EXECUTE FUNCTION pgtrickle.on_source_truncated('<source_oid>');

The trigger function would:

  1. Look up all stream tables that depend on this source
  2. Mark them needs_reinit = true in the catalog
  3. Cascade transitively to downstream STs

This closes the TRUNCATE gap without changing the CDC architecture. The next scheduler cycle would trigger a FULL refresh automatically.
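A sketch of on_source_truncated() as a statement-level trigger function. The pgt_dependencies column names and the needs_reinit catalog column are assumptions here; the recursive CTE implements the transitive cascade of step 3.

```sql
CREATE FUNCTION pgtrickle.on_source_truncated() RETURNS trigger AS $$
BEGIN
  WITH RECURSIVE affected(relid) AS (
    -- STs reading directly from the truncated source (oid passed as TG_ARGV[0])
    SELECT st_relid FROM pgtrickle.pgt_dependencies
    WHERE source_relid = TG_ARGV[0]::oid
    UNION
    -- STs reading from an already-affected ST
    SELECT d.st_relid FROM pgtrickle.pgt_dependencies d
    JOIN affected a ON d.source_relid = a.relid
  )
  UPDATE pgtrickle.pgt_stream_tables
     SET needs_reinit = true
   WHERE st_relid IN (SELECT relid FROM affected);
  RETURN NULL;  -- AFTER STATEMENT triggers ignore the return value
END;
$$ LANGUAGE plpgsql;
```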

Effort estimate: ~2–4 hours (trigger creation in cdc.rs, PL/pgSQL or Rust function for on_source_truncated, cascade logic reuse from hooks.rs).


5. Migration Path: Trigger → Logical Replication (Now Implemented)

Status: Phase A (Hybrid Creation) is now implemented in src/wal_decoder.rs. The pg_trickle.cdc_mode GUC controls the behavior (trigger/auto/wal).

As discussed in §3, the atomicity constraint is a creation-time problem with known solutions. The buffer table schema and downstream IVM pipeline are decoupled from the capture mechanism, so migration is isolated to the CDC layer. This should be treated as a planned evolution for high-throughput deployments, not a theoretical future:

Phase A: Hybrid Creation

  1. create_stream_table() continues using triggers for atomic creation
  2. After first successful full refresh, a background worker creates a replication slot and transitions to WAL-based capture
  3. Trigger is dropped; buffer table continues to be populated from WAL decode

Phase B: Steady-State WAL Capture

  1. Background worker runs a logical decoding consumer per tracked source
  2. WAL changes are decoded and written to the same buffer table schema
  3. Downstream pipeline (DVM, MERGE, frontier) is unchanged
  4. TRUNCATE events are captured natively from WAL

Prerequisites

  • wal_level = logical (must be documented as optional upgrade path)
  • REPLICA IDENTITY on tracked sources (auto-configured or user-managed)
  • Custom output plugin or pgoutput + column mapping
  • Slot health monitoring (WAL retention alerts, orphan cleanup)
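Roughly, these prerequisites translate to the following setup (slot name illustrative; REPLICA IDENTITY FULL is the simplest option, while USING INDEX on a unique index is cheaper at decode time):

```sql
-- One-time server configuration (restart required for wal_level):
ALTER SYSTEM SET wal_level = 'logical';
ALTER SYSTEM SET max_replication_slots = 16;  -- size for tracked sources

-- Per tracked source table:
ALTER TABLE orders REPLICA IDENTITY FULL;

-- Per-source slot, created in a transaction with no prior writes:
SELECT pg_create_logical_replication_slot('pg_trickle_orders', 'pgoutput');
```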

Effort estimate: 3–5 weeks for a production-quality implementation.


6. Recommendations

Recommendation 1: Keep Trigger-Based CDC (For Now)

Operational simplicity and zero-config deployment are strong advantages for an early-stage extension. The performance ceiling (~5,000 writes/sec) is adequate for current target use cases. The atomicity constraint, while solvable (see §3.1), adds creation-time complexity that is not yet justified.

However: this decision should be revisited when any of the following occurs: (a) users report write-path latency from CDC triggers, (b) TRUNCATE-based ETL patterns become a common pain point, or (c) pg_trickle targets environments where wal_level = logical is already the norm. The steady-state advantages of logical replication (§3.2) are substantial and should not be dismissed.

Recommendation 2: ✅ IMPLEMENTED — User Trigger Suppression

User-defined triggers on stream tables are now fully supported. The implementation uses ALTER TABLE ... DISABLE TRIGGER USER / ENABLE TRIGGER USER around FULL refresh, and explicit per-row DML (INSERT/UPDATE/DELETE) instead of MERGE during DIFFERENTIAL refresh so user AFTER triggers fire correctly. Controlled by pg_trickle.user_triggers GUC (auto/on/off). The session_replication_role approach from the original plan was rejected to avoid conflict with logical replication publishing (see §2.5).

Recommendation 3: Add TRUNCATE Capture Trigger

Add a statement-level AFTER TRUNCATE trigger on each tracked source table that marks downstream STs for reinitialization. This closes the most significant usability gap without changing the CDC architecture.

Recommendation 4: Document Logical Replication FROM Stream Tables

Add documentation and examples for CREATE PUBLICATION on stream tables, including:

  • Column filtering to exclude __pgt_row_id
  • REPLICA IDENTITY configuration using __pgt_row_id as unique index
  • Behavior during FULL vs DIFFERENTIAL refresh
  • Interaction with user trigger suppression

Recommendation 5: Benchmark Trigger Overhead

Execute the benchmark plan in PLAN_TRIGGERS_OVERHEAD.md to establish data-driven thresholds for the logical replication migration crossover point. The results should feed directly into the §3.3 crossover analysis.

Recommendation 6: ✅ IMPLEMENTED — Hybrid CDC Approach

The "trigger bootstrap → slot transition" pattern is now implemented in src/wal_decoder.rs (1152 lines). The implementation includes:

  • Automatic transition: After stream table creation with triggers, a background worker creates a logical replication slot and transitions to WAL-based capture.
  • GUC control: pg_trickle.cdc_mode (trigger/auto/wal) and pg_trickle.wal_transition_timeout control the behavior.
  • Transition orchestration: Create slot → wait for catch-up → drop trigger. Automatic fallback to triggers if slot creation fails.
  • Catalog extension: pgt_dependencies gains cdc_mode, slot_name, decoder_confirmed_lsn, transition_started_at columns.
  • Health monitoring: pgtrickle.check_cdc_health() function and NOTIFY pg_trickle_cdc_transition notifications.
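Assuming check_cdc_health() returns one row per tracked source (its result columns are not specified above), monitoring might look like:

```sql
-- Poll CDC health from a cron job or dashboard.
SELECT * FROM pgtrickle.check_cdc_health();

-- Or react to transition events asynchronously.
LISTEN pg_trickle_cdc_transition;
```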

7. Decision Log

| # | Decision | Rationale |
|---|---|---|
| D1 | Keep triggers for CDC on source tables — for now | Zero-config, operational simplicity, adequate for current scale |
| D2 | Atomicity constraint is solvable, not fundamental | Two-phase creation and hybrid bootstrap are proven patterns (§3.1) |
| D3 | Logical replication is superior in steady-state | Zero write overhead, TRUNCATE capture, higher throughput ceiling (§3.2) |
| D4 | User triggers on STs are orthogonal to CDC choice | session_replication_role / DISABLE TRIGGER USER works with either approach |
| D5 | Logical replication FROM STs works today | Regular heap tables; needs documentation, not code |
| D6 | TRUNCATE gap is closable with statement-level trigger | Low effort, high impact — but logical replication handles it natively |
| D7 | Hybrid approach is the optimal long-term target | Trigger bootstrap for creation + logical replication for steady-state |
| D8 | User trigger suppression uses DISABLE TRIGGER USER | Avoids session_replication_role conflict with logical replication publishing (§2.5) |
| D9 | Hybrid CDC implemented with auto-transition | pg_trickle.cdc_mode = 'auto': triggers → WAL transition after creation |
| D10 | Explicit DML for DIFFERENTIAL refresh with user triggers | INSERT/UPDATE/DELETE instead of MERGE so AFTER triggers fire correctly |