Changelog
All notable changes to pg_trickle are documented in this file.
The format is based on Keep a Changelog. For future plans and release milestones, see ROADMAP.md.
[Unreleased]
Added
- TopK (ORDER BY + LIMIT) support — Queries with a top-level
ORDER BY … LIMIT N(constant integer, no OFFSET) are now recognized as "TopK" and accepted. TopK stream tables store only the top-N rows. Refreshes use scoped-recomputation via MERGE (bypass the DVM delta pipeline). Catalog columnstopk_limitandtopk_order_byrecord the pattern. Monitoring view exposesis_topk. - FETCH FIRST / FETCH NEXT rejection —
FETCH FIRST N ROWS ONLYandFETCH NEXT N ROWS ONLYnow produce the same unsupported-feature error asLIMIT. - OFFSET without ORDER BY warning — Subqueries using
OFFSETwithoutORDER BYnow emit a parser warning (alongside the existingLIMITwithoutORDER BYwarning).
[0.1.3] — 2026-03-02
Added
SQL_GAPS_7: 50/51 Gap Items Completed
Comprehensive gap remediation across all 5 tiers of the SQL_GAPS_7 plan, completing 50 of 51 items (F40 — extension upgrade migration scripts — deferred to PLAN_DB_SCHEMA_STABILITY.md).
Tier 0 — Critical Correctness:
- F1 — Removed
delete_insertmerge strategy (unsafe, superseded byauto). - F2 — WAL decoder: keyless-table
pk_hashnow rejects keyless tables and requiresREPLICA IDENTITY FULL. - F3 — WAL decoder:
old_*column population for UPDATEs viaparse_pgoutput_old_columnsand old-key→new-tuple section parsing. - F6 — ALTER TYPE / ALTER POLICY DDL tracking via
handle_type_changeandhandle_policy_changeinhooks.rs.
Tier 1 — High-Value Correctness Verification:
- F8 — Window partition key change E2E tests (2 tests in
e2e_window_tests.rs). - F9 — Recursive CTE monotonicity audit with
recursive_term_is_non_monotoneguard and 11 unit tests. - F10 — ALTER DOMAIN DDL tracking via
handle_domain_changeinhooks.rs. - F11 — Keyless table duplicate-row limitation documented in SQL_REFERENCE.md.
- F12 — PgBouncer compatibility documented in FAQ.md.
Tier 2 — Robustness:
- F13 — Warning on
LIMITin subquery withoutORDER BY. - F15 —
RANGE_AGG/RANGE_INTERSECT_AGGrecognized and rejected in DIFFERENTIAL mode. - F16 — Read replica detection:
pg_is_in_recovery()check skips background worker on replicas.
Tier 3 — Test Coverage (62 new E2E tests across 10 test files):
- F17 — 18 aggregate differential E2E tests (
e2e_aggregate_coverage_tests.rs). - F18 — 5 FULL JOIN E2E tests (
e2e_full_join_tests.rs). - F19 — 6 INTERSECT/EXCEPT E2E tests (
e2e_set_operation_tests.rs). - F20 — 4 scalar subquery E2E tests (
e2e_scalar_subquery_tests.rs). - F21 — 4 SubLinks-in-OR E2E tests (
e2e_sublink_or_tests.rs). - F22 — 6 multi-partition window E2E tests (
e2e_multi_window_tests.rs). - F23 — 7 GUC variation E2E tests (
e2e_guc_variation_tests.rs). - F24 — 5 multi-cycle refresh E2E tests (
e2e_multi_cycle_tests.rs). - F25 — 7 HAVING group transition E2E tests (
e2e_having_transition_tests.rs). - F26 — FULL JOIN NULL keys E2E tests (in
e2e_full_join_tests.rs).
Tier 4 — Operational Hardening (13/14, F40 deferred):
- F27 — Adaptive threshold exposed in
stream_tables_infoview. - F29 — SPI SQLSTATE error classification for retry (
classify_spi_error_retryable). - F30 — Delta row count in refresh history (3 new columns +
RefreshRecordAPI). - F31 —
StaleDataNOTIFY emitted consistently (emit_stale_alert_if_needed). - F32 — WAL transition retry with 3× progressive backoff.
- F33 — WAL column rename detection via
detect_schema_mismatch. - F34 — Clear error on SPI permission failure (
SpiPermissionErrorvariant). - F38 — NATURAL JOIN column drift tracking (warning emitted).
- F39 — Drop orphaned buffer table columns (
sync_change_buffer_columns).
Tier 5 — Nice-to-Have:
- F41 — Wide table MERGE hash shortcut for >50-column tables.
- F42 — Delta memory bounds documented in FAQ.md.
- F43 — Sequential processing rationale documented in FAQ.md.
- F44 — Connection overhead documented in FAQ.md.
- F45 — Memory/temp file usage tracking (
query_temp_file_usage). - F46 —
pg_trickle.buffer_alert_thresholdGUC. - F47 —
pgtrickle.st_auto_threshold()SQL function. - F48 — 7 keyless table duplicate-row E2E tests (
e2e_keyless_duplicate_tests.rs). - F49 — Generated column snapshot filter alignment.
- F50 — Covering index overhead benchmark (
e2e_bench_tests.rs). - F51 — Change buffer schema permissions (
REVOKE ALL FROM PUBLIC).
TPC-H-Derived Correctness Suite: 22/22 Queries Passing
TPC Fair Use Policy: Queries are derived from the TPC-H Benchmark specification and do not constitute TPC-H Benchmark results. TPC Benchmark™ is a trademark of the Transaction Processing Performance Council (TPC).
Improved the TPC-H-derived correctness suite from 20/22 create + 15/22 deterministic pass to 22/22 queries create and pass across multiple mutation cycles. Fixed Q02 subquery and TPC-H schema/datagen edge cases.
Planning & Research Documentation
- PLAN_DIAMOND_DEPENDENCY_CONSISTENCY.md — multi-path refresh correctness analysis for diamond-shaped DAG dependencies.
- PLAN_PG_BACKCOMPAT.md — analysis for supporting PostgreSQL 13–17.
- PLAN_TRANSACTIONAL_IVM.md — immediate (transactional) IVM design.
- PLAN_EXTERNAL_PROCESS.md — external sidecar feasibility analysis.
- PLAN_PGWIRE_PROXY.md — pgwire proxy/intercept feasibility analysis.
- GAP_ANALYSIS_EPSIO.md / GAP_ANALYSIS_FELDERA.md — competitive gap analysis documents.
Fixed
Window Function Differential Maintenance (6 tests un-ignored)
Fixed window function differential maintenance to correctly handle non-RANGE frames, LAG/LEAD, ranking functions (DENSE_RANK, NTILE, RANK), and window-over-aggregate queries. Six previously-ignored E2E tests now pass:
- Parser:
is_agg_node()OVER clause check — Window function calls withOVERwere incorrectly classified as aggregate nodes, causing wrong operator tree construction. - Parser:
extract_aggregates()OVER clause early return — Aggregates wrapped inOVER (...)were extracted as plain aggregates, producing duplicate columns in the delta SQL. - Parser:
needs_pgt_countWindow delegation — The__pgt_counttracking column was not propagated through Window operators. - Window diff: NOT EXISTS filter on pass-through columns — The
current_inputCTE used__pgt_row_idfor change detection, which does not exist in the Window operator's input. Switched to NOT EXISTS join on pass-through columns. - Window diff:
build_agg_alias_map+render_window_sql— Window functions wrapping aggregates (e.g.,RANK() OVER (ORDER BY SUM(x))) emitted raw aggregate expressions instead of referencing the aggregate output aliases. - Row ID uniqueness via
row_to_json+row_number— Window functions over tied values (DENSE_RANK, RANK) produced duplicate__pgt_row_idhashes. Row IDs are now computed from the full row content plus a positional disambiguator.
INTERSECT/EXCEPT Differential Correctness (6 tests un-ignored)
Fixed INTERSECT and EXCEPT differential SQL generation that produced invalid GROUP BY clauses. The set operation diff now correctly generates dual-count multiplicity tracking with LEAST/GREATEST boundary crossing.
SubLink OR Differential Correctness (3 tests un-ignored)
Fixed EXISTS/IN subqueries combined with OR in WHERE clauses that generated invalid GROUP BY expressions. The semi-join/anti-join delta operators now correctly handle OR-combined SubLinks.
Multi-Partition Window Native Handling
Queries with multiple window functions using different PARTITION BY clauses are now handled natively by the parser instead of requiring a CTE+JOIN rewrite. If all windows share the same partition key, it is used directly; otherwise the window operator falls back to un-partitioned (full) recomputation.
Aggregate Differential Correctness
- MIN/MAX rescan on extremum deletion: When the current MIN or MAX value was deleted and no new inserts existed, the merge expression returned NULL instead of rescanning the source table. MIN/MAX now participate in the rescan CTE and use the rescanned value when the extremum is deleted.
- Regular aggregate ORDER BY parsing:
STRING_AGG(val, ',' ORDER BY val)andARRAY_AGG(val ORDER BY val)silently dropped the ORDER BY clause because the parser only captured ordering for ordered-set aggregates (WITHIN GROUP). Now all aggregate ORDER BY clauses are parsed correctly. - ORDER BY placement in rescan SQL: Regular aggregate ORDER BY is now emitted
inside the function call parentheses (
STRING_AGG(val, ',' ORDER BY val)) rather than asWITHIN GROUP (ORDER BY ...), which is reserved for ordered-set aggregates (MODE, PERCENTILE_CONT, PERCENTILE_DISC).
E2E Test Infrastructure: Multi-Statement Execute
- Fixed
db.execute()calls that sent multiple SQL statements in a single prepared statement (which PostgreSQL rejects). Split into separate calls ine2e_full_join_tests.rs,e2e_scalar_subquery_tests.rs,e2e_set_operation_tests.rs,e2e_sublink_or_tests.rs, ande2e_multi_cycle_tests.rs.
CI: pg_stub.c Missing Stubs
Added palloc0 and error reporting stubs to scripts/pg_stub.c to fix unit
test compilation.
Changed
- Test count: ~1,455 total tests (up from ~1,138): 963 unit + 32 integration
- 460 E2E across 34 test files (up from ~22). 18 E2E tests are
#[ignore]d pending DVM correctness fixes (see Known Limitations below).
- 460 E2E across 34 test files (up from ~22). 18 E2E tests are
- 1 new GUC variable —
buffer_alert_thresholdadded. Total: 16 GUCs.
Known Limitations
18 E2E tests are marked #[ignore] due to pre-existing DVM differential logic
bugs that will be addressed in future releases:
| Suite | Ignored | Reason |
|---|---|---|
e2e_full_join_tests | 5/5 | FULL OUTER JOIN differential produces incorrect results |
e2e_having_transition_tests | 5/7 | HAVING threshold crossing differential incorrect |
e2e_keyless_duplicate_tests | 5/7 | Keyless table duplicate-row row_id hash collision |
e2e_scalar_subquery_tests | 2/4 | Correlated scalar subquery differential generates invalid SQL |
e2e_sublink_or_tests | 1/4 | Correlated EXISTS with HAVING loses aggregate state |
[0.1.2] — 2026-02-28
Changed
Project Renamed from pg_stream to pg_trickle
Renamed the entire project to avoid a naming collision with an unrelated project. All identifiers, schemas, GUC prefixes, catalog columns, and documentation references have been updated:
- Crate name:
pg_stream→pg_trickle - Extension control file:
pg_stream.control→pg_trickle.control - SQL schemas:
pgstream→pgtrickle,pgstream_changes→pgtrickle_changes - Catalog column prefix:
pgs_→pgt_ - Internal column prefix:
__pgs_→__pgt_ - GUC prefix:
pg_stream.*→pg_trickle.* - CamelCase types:
PgStreamError→PgTrickleError - dbt package:
dbt-pgstream→dbt-pgtrickle
"Stream tables" terminology is unchanged — only the project/extension name was renamed.
Fixed
DVM: Inner Join Delta Double-Counting
Fixed inner join pre-change snapshot logic that caused delta double-counting during differential refresh. The snapshot now correctly eliminates rows that would be counted twice when both sides of the join have changes in the same refresh cycle. Discovered via TPC-H-derived Q07.
DVM: Multi-Stream-Table Change Buffer Cleanup
Fixed a bug where change buffer cleanup for one stream table could delete entries still needed by another stream table that shares the same source table. Buffer cleanup now scopes deletions per-stream-table rather than per-source-table.
DVM: Scalar Aggregate Row ID Mismatch and AVG Group Rescan
Fixed scalar aggregate row_id generation that produced mismatched identifiers
between delta and merge phases, and corrected AVG group rescan logic that
failed to recompute averages after partial group changes. Fixes TPC-H-derived
Q06 and improves Q01.
DVM: SemiJoin/AntiJoin Snapshots and GROUP BY Alias Projection
Fixed snapshot handling for SemiJoin and AntiJoin operators that missed
pre-change state, corrected __pgt_count filtering in delta output, and
fixed the parser's GROUP BY alias resolution to emit proper Project nodes.
Raises TPC-H-derived passing count to 14/22.
DVM: Unqualified Column Resolution and Deep Disambiguation
Fixed unqualified column resolution in join contexts, intermediate aggregate delta computation, and deep column disambiguation for nested subqueries.
DVM: COALESCE Null Counts and Walker-Based OID Extraction
Fixed COALESCE handling for null count columns in aggregate deltas, replaced
regex-based OID extraction with a proper AST walker, and fixed
ComplexExpression aggregate detection.
DVM: Column Reference Resolution Against Disambiguated Join CTEs
Fixed column reference resolution that failed to match against disambiguated join CTE column names, causing incorrect references in multi-join queries.
Stale Pending Cleanup Crash on Dropped Change Buffer Tables
Prevented the background cleanup worker from crashing when it encounters pending cleanup entries for change buffer tables that have already been dropped (e.g., after a stream table is removed mid-cycle).
DVM Parser: 4 Query Rewrite Bugs (TPC-H-Derived Regression Coverage)
Fixed four bugs in src/dvm/parser.rs discovered while building the TPC-H-derived
correctness test suite. Together they unblock 3 more TPC-H-derived queries (Q04,
Q15, Q21) from stream table creation, raising the create-success rate from
17/22 to 20/22.
node_to_expragg_star —FuncCallnodes withagg_star: true(i.e.COUNT(*)) were emitted ascount()(no argument). Addedagg_starcheck that insertsExpr::Raw("*")so the deparser producescount(*).rewrite_sublinks_in_orfalse trigger — The OR-sublink rewriter was entered for any AND expression containing a SubLink (e.g. a bareEXISTSclause). Addedand_contains_or_with_sublink()guard so the rewriter only activates when the AND contains an OR conjunct that itself has a SubLink. Prevented the false-positiveCOUNT()deparse for Q04 and Q21.- Correlated scalar subquery detection —
rewrite_scalar_subquery_in_wherenow collects outer table names and checks whether the scalar subquery references any of them (is_correlated()). Correlated subqueries are skipped (rather than incorrectly CROSS JOIN-rewritten). Non-correlated subqueries now use the correct wrapper pattern:CROSS JOIN (SELECT v."c" AS "sq_col" FROM (subquery) AS v("c")) AS sq. T_RangeSubselectin FROM clause — Bothfrom_item_to_sqlanddeparse_from_itemnow handleT_RangeSubselect(derived tables / inline views in FROM). Previously these fell through to a"?"placeholder, causing a syntax error for Q15 after its CTE was inlined.
Added
TPC-H-Derived Correctness Test Suite
TPC Fair Use Policy: The queries in this test suite are derived from the TPC-H Benchmark specification and do not constitute TPC-H Benchmark results. TPC Benchmark™ is a trademark of the Transaction Processing Performance Council (TPC). pg_trickle results are not comparable to published TPC results.
Added a TPC-H-derived correctness test suite (tests/e2e_tpch_tests.rs) that
validates the core DBSP invariant — Contents(ST) ≡ Result(defining_query)
after every differential refresh — across all 22 TPC-H-derived queries at SF=0.01.
-
Schema & data generation (
tests/tpch/schema.sql,datagen.sql) — SQL-only, no externaldbgendependency, works with existingE2eDbtestcontainers infrastructure. -
Mutation scripts (
rf1.sqlINSERT,rf2.sqlDELETE,rf3.sqlUPDATE) — multi-cycle churn to catch cumulative drift. -
22 query files (
tests/tpch/queries/q01.sql–q22.sql) — queries derived from TPC-H, adapted for pg_trickle SQL compatibility:Query Adaptation Q08 NULLIF→CASE WHEN;BETWEEN→ explicit>= AND <=Q09 LIKE '%green%'→strpos(p_name, 'green') > 0Q14 NULLIF→CASE;LIKE 'PROMO%'→left(p_type, 5) = 'PROMO'Q15 WITH revenue0 AS (...)CTE → inline derived tableQ16 COUNT(DISTINCT)→ DISTINCT subquery +COUNT(*);NOT LIKE/LIKE→left()/strpos()All →replaced with->in comments (avoids UTF-8 byte-boundary panic) -
3 test functions —
test_tpch_differential_correctness,test_tpch_cross_query_consistency,test_tpch_full_vs_differential. All pass (3 passed; 0 failed). Queries blocked by known DVM limitations soft-skip rather than fail. -
Current score: 20/22 create successfully; 15/22 pass deterministic correctness checks across multiple mutation cycles after the DVM fixes listed above.
-
justtargets:test-tpch(fast, SF=0.01),test-tpch-large(SF=0.1, 5 cycles),test-tpch-fast(skips image rebuild).
[0.1.1] — 2026-02-26
Changed
CloudNativePG Image Volume Extension Distribution
- Extension-only OCI image — replaced the full PostgreSQL Docker image
(
ghcr.io/<owner>/pg_trickle) with a minimalscratch-based extension image (ghcr.io/<owner>/pg_trickle-ext) following the CNPG Image Volume Extensions specification. The image contains only.so,.control, and.sqlfiles (< 10 MB vs ~400 MB for the old full image). - New
cnpg/Dockerfile.ext— release Dockerfile for packaging pre-built artifacts into the scratch-based extension image. - New
cnpg/Dockerfile.ext-build— multi-stage from-source build for local development and CI. - New
cnpg/database-example.yaml— CNPGDatabaseresource for declarativeCREATE EXTENSION pg_trickle(replacespostInitSQL). - Updated
cnpg/cluster-example.yaml— uses official CNPG PostgreSQL 18 operand image with.spec.postgresql.extensionsfor Image Volume mounting. - Removed
cnpg/Dockerfileandcnpg/Dockerfile.release— the old full PostgreSQL images are no longer built or published. - Updated release workflow — publishes multi-arch (amd64/arm64) extension image to GHCR with layout verification and SQL smoke test.
- Updated CI CNPG smoke test — uses transitional composite image approach
until
kindsupports Kubernetes 1.33 withImageVolumefeature gate.
[0.1.0] — 2026-02-26
Fixed
WAL Decoder pgoutput Action Parsing (F4 / G2.3)
- Positional action parsing —
parse_pgoutput_action()previously useddata.contains("INSERT:")etc., which would misclassify events when a schema name, table name, or column value contained an action keyword (e.g., a table namedINSERT_LOGor a text value"DELETE: old row"). Replaced with positional parsing: strip"table "prefix, skipschema.table:, then match the action keyword before the next:. - 3 new unit tests covering the edge cases.
Added
CUBE / ROLLUP Combinatorial Explosion Guard (F14 / G5.2)
- Branch limit guard —
CUBE(n)on N columns generates $2^N$UNION ALLbranches. Large CUBEs would silently produce memory-exhausting query trees.rewrite_grouping_sets()now rejects CUBE/ROLLUP combinations that would expand beyond 64 branches, emitting a clear error that directs users to explicitGROUPING SETS(...).
Documentation: Known Delta Computation Limitations (F7 / F11)
- JOIN key change + simultaneous right-side delete — documented in
docs/SQL_REFERENCE.md§ "Known Delta Computation Limitations" with a concrete SQL example, root-cause explanation, and three mitigations (adaptive FULL fallback, staggered changes, FULL mode). - Keyless table duplicate-row limitation — the "Tables Without Primary
Keys" section now includes a
> Limitationcallout explaining that rows with identical content produce the same content hash, causing INSERT deduplication and ambiguous DELETE matching. Recommends adding a surrogate PK or UNIQUE constraint. - SQL-standard JSON aggregate recognition —
JSON_ARRAYAGG(expr ...)andJSON_OBJECTAGG(key: value ...)are now recognized as first-class DVM aggregates with the group-rescan strategy. Previously treated as opaque raw expressions, they now work correctly in DIFFERENTIAL mode. - Two new
AggFuncvariants:JsonObjectAggStd(String),JsonArrayAggStd(String). The carried String preserves the full deparsed SQL since the specialkey: value,ABSENT ON NULL,ORDER BY, andRETURNINGclauses differ from regular function syntax. - 4 E2E tests covering both FULL and DIFFERENTIAL modes.
JSON_TABLE Support (F12)
JSON_TABLE()in FROM clause — PostgreSQL 17+JSON_TABLE(expr, path COLUMNS (...))is now supported. Deparsed with full syntax includingPASSINGclauses, regular/EXISTS/formatted/nested columns, andON ERROR/ON EMPTYbehaviors. Modeled internally asLateralFunction.- 2 E2E tests (FULL and DIFFERENTIAL modes).
Operator Volatility Checking (F16)
- Operator volatility checking — custom operators backed by volatile
functions are now detected and rejected in DIFFERENTIAL mode. The check
queries
pg_operator→pg_procto resolve operator function volatility. This completes the volatility coverage (G7.2) started with function volatility detection. - 3 unit tests and 2 E2E tests.
Cross-Session Cache Invalidation (F17)
- Cross-session cache invalidation — a shared-memory atomic counter
(
CACHE_GENERATION) ensures that when one backend alters a stream table, drops a stream table, or triggers a DDL hook, all other backends automatically flush their delta template and MERGE template caches on the next refresh cycle. Previously, cached templates could become stale in multi-backend deployments. - Thread-local generation tracking in both
dvm/mod.rs(delta cache) andrefresh.rs(MERGE cache + prepared statements).
Function/Operator DDL Tracking (F18)
- Function DDL tracking —
CREATE OR REPLACE FUNCTIONandALTER FUNCTIONon functions referenced by stream table defining queries now trigger reinit of affected STs.DROP FUNCTIONalso marks affected STs for reinit. functions_usedcatalog column — newTEXT[]column inpgtrickle.pgt_stream_tablesstores all function names used by the defining query (extracted from the parsed OpTree at creation time). DDL hooks query this column to find affected STs.- 2 E2E tests and 5 unit tests for function name extraction.
View Inlining (G2.1)
- View inlining auto-rewrite — views referenced in defining queries are transparently replaced with their underlying SELECT definition as inline subqueries. CDC triggers land on base tables, so DIFFERENTIAL mode works correctly with views. Nested views (view → view → table) are fully expanded via a fixpoint loop (max depth 10).
- Materialized view rejection — materialized views (
relkind = 'm') are rejected with a clear error in DIFFERENTIAL mode. FULL mode allows them. - Foreign table rejection — foreign tables (
relkind = 'f') are rejected in DIFFERENTIAL mode (row-level triggers cannot be created on foreign tables). - Original query preservation — the user's original SQL (pre-inlining) is
stored in
pgtrickle.pgt_stream_tables.original_queryfor reinit after view changes and user introspection. - View DDL hooks —
CREATE OR REPLACE VIEWtriggers reinit of affected stream tables.DROP VIEWsets affected stream tables to ERROR status. - View dependency tracking — views are registered as soft dependencies in
pgtrickle.pgt_dependencies(source_type = 'VIEW') for DDL hook lookups. - E2E test suite — 16 E2E tests covering basic view inlining, UPDATE/DELETE through views, filtered views, aggregation, joins, nested views, FULL mode, materialized view rejection/allowance, view replacement/drop hooks, TRUNCATE propagation, column renaming, catalog verification, and dependency registration.
SQL Feature Gaps (S1–S15)
- Volatile function detection (S1) — defining queries containing volatile
functions (e.g.,
random(),clock_timestamp()) are rejected in DIFFERENTIAL mode with a clear error. Stable functions (e.g.,now()) emit a warning. - TRUNCATE capture in CDC (S2) — statement-level
AFTER TRUNCATEtrigger writes aTmarker row to the change buffer. Differential refresh detects the marker and automatically falls back to a full refresh. ALL (subquery)support (S3) —x op ALL (subquery)is rewritten to an AntiJoin viaNOT EXISTSwith a negated condition.DISTINCT ONauto-rewrite (S4) —DISTINCT ON (col1, col2)is transparently rewritten to aROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) = 1subquery before DVM parsing. Previously rejected.- 12 regression aggregates (S5) —
CORR,COVAR_POP,COVAR_SAMP,REGR_AVGX,REGR_AVGY,REGR_COUNT,REGR_INTERCEPT,REGR_R2,REGR_SLOPE,REGR_SXX,REGR_SXY,REGR_SYY— all use group-rescan strategy. 39 aggregate function variants total (up from 25). - Mixed
UNION/UNION ALL(S6) — nested set operations with differentALLflags are now parsed correctly. - Column snapshot + schema fingerprint (S7) —
pgt_dependenciesstores a JSONB column snapshot and SHA-256 fingerprint for each source table. DDL change detection uses a 3-tier fast path: fingerprint → snapshot → legacycolumns_usedfallback. pg_trickle.block_source_ddlGUC (S8) — whentrue, column-affecting DDL on tracked source tables is blocked with an ERROR instead of marking stream tables for reinit.NATURAL JOINsupport (S9) — common columns are resolved at parse time and an explicit equi-join condition is synthesized. Supports INNER, LEFT, RIGHT, and FULL NATURAL JOIN variants. Previously rejected.- Keyless table support (S10) — source tables without a primary key now
work correctly. CDC triggers compute an all-column content hash for row
identity. Consistent
__pgt_row_idbetween full and delta refreshes. GROUPING SETS/CUBE/ROLLUPauto-rewrite (S11) — decomposed at parse time into aUNION ALLof separateGROUP BYqueries.GROUPING()calls become integer literals. Previously rejected.- Scalar subquery in WHERE rewrite (S12) —
WHERE col > (SELECT avg(x) FROM t)is rewritten to aCROSS JOINwith column reference replacement. - SubLinks in OR rewrite (S13) —
WHERE a OR EXISTS (...)is decomposed intoUNIONbranches, one per OR arm. - Multi-PARTITION BY window rewrite (S14) — window functions with different
PARTITION BYclauses are split into separate subqueries joined by aROW_NUMBER() OVER ()row marker. - Recursive CTE semi-naive + DRed (S15) — DIFFERENTIAL mode for recursive CTEs now uses semi-naive evaluation for INSERT-only changes, Delete-and- Rederive (DRed) for mixed changes, and recomputation fallback. Strategy is auto-selected per refresh.
Native Syntax Planning
- Native DDL syntax research — comprehensive analysis of 15 PostgreSQL
extension syntax mechanisms for supporting
CREATE STREAM TABLE-like syntax. Seeplans/sql/REPORT_CUSTOM_SQL_SYNTAX.md. - Native syntax plan — tiered strategy: Tier 1 (function API, existing),
Tier 1.5 (
CALLprocedure wrappers), Tier 2 (CREATE MATERIALIZED VIEW ... WITH (pgtrickle.stream = true)viaProcessUtility_hook). Seeplans/sql/PLAN_NATIVE_SYNTAX.md.
Hybrid CDC — Automatic Trigger → WAL Transition
- Hybrid CDC architecture — stream tables now start with lightweight
row-level triggers for zero-config setup and can automatically transition to
WAL-based (logical replication) capture for lower write-side overhead. The
transition is controlled by the
pg_trickle.cdc_modeGUC (trigger/auto/wal). - WAL decoder background worker — dedicated worker that polls logical replication slots and writes decoded changes into the same change buffer tables used by triggers, ensuring a uniform format for the DVM engine.
- Transition orchestration — transparent three-step process: create replication slot, wait for decoder catch-up, drop trigger. Falls back to triggers automatically if the decoder does not catch up within the timeout.
- CDC health monitoring — new
pgtrickle.check_cdc_health()function returns per-source CDC mode, slot lag, confirmed LSN, and alerts. - CDC transition notifications —
NOTIFY pg_trickle_cdc_transitionemits JSON payloads when sources transition between CDC modes. - New GUCs —
pg_trickle.cdc_modeandpg_trickle.wal_transition_timeout. - Catalog extension —
pgt_dependenciestable gainscdc_mode,slot_name,decoder_confirmed_lsn, andtransition_started_atcolumns.
User-Defined Triggers on Stream Tables
- User trigger support in DIFFERENTIAL mode — user-created
AFTERtriggers on stream tables now fire correctly during differential refresh via explicit per-row DML (INSERT/UPDATE/DELETE) instead of bulk MERGE. - FULL refresh trigger handling — user triggers are suppressed during FULL
refresh with
DISABLE TRIGGER USERand aNOTIFY pgtrickle_refreshis emitted so listeners know when to re-query. - Trigger detection —
has_user_triggers()automatically detects user-defined triggers on storage tables at refresh time. - DDL warning —
CREATE TRIGGERon a stream table emits a notice explaining the trigger semantics and thepg_trickle.user_triggersGUC. - New GUC —
pg_trickle.user_triggers(auto/on/off) controls whether the explicit DML path is used.
Changed
- Monitoring layer —
slot_health()now covers WAL-mode sources. Architecture diagrams and documentation updated to reflect the hybrid CDC model. - Stream table restrictions — user triggers on stream tables upgraded from "⚠️ Unsupported" to "✅ Supported (DIFFERENTIAL mode)".
Core Engine
- Declarative stream tables — define a SQL query and a schedule; the extension handles automatic refresh.
- Differential View Maintenance (DVM) — incremental delta computation derived automatically from the defining query's operator tree.
- Trigger-based CDC — lightweight
AFTERrow-level triggers capture changes into per-source buffer tables. Nowal_level = logicalrequired. - DAG-aware scheduling — stream tables that depend on other stream tables are refreshed in topological order with cycle detection.
- Background scheduler — canonical-period scheduling (48·2ⁿ seconds) with cron expression support.
- Crash-safe refresh — advisory locks prevent concurrent refreshes; crash recovery marks in-flight refreshes as failed.
SQL Support
- Full operator coverage — table scans, projections, WHERE/HAVING filters, INNER/LEFT/RIGHT/FULL OUTER joins, nested multi-table joins, GROUP BY with 25 aggregate functions, DISTINCT, UNION ALL, UNION, INTERSECT, EXCEPT.
- Subquery support — subqueries in FROM, EXISTS/NOT EXISTS, IN/NOT IN (subquery), scalar subqueries in SELECT.
- CTE support — non-recursive CTEs (inline and shared delta), recursive
CTEs (
WITH RECURSIVE) in both FULL and DIFFERENTIAL modes. - Recursive CTE incremental maintenance — DIFFERENTIAL mode now uses semi-naive evaluation for INSERT-only changes, Delete-and-Rederive (DRed) for mixed changes, and recomputation fallback when CTE columns don't match ST storage. Strategy is auto-selected per refresh.
- DISTINCT ON auto-rewrite — transparently rewritten to ROW_NUMBER() window subquery before DVM parsing.
- GROUPING SETS / CUBE / ROLLUP auto-rewrite — decomposed into UNION ALL of separate GROUP BY queries at parse time.
- NATURAL JOIN support — common columns resolved at parse time with explicit equi-join synthesis.
- ALL (subquery) support — rewritten to AntiJoin via NOT EXISTS.
- Scalar subquery in WHERE — rewritten to CROSS JOIN.
- SubLinks in OR — decomposed into UNION branches.
- Multi-PARTITION BY windows — split into joined subqueries.
- Regression aggregates — CORR, COVAR_POP, COVAR_SAMP, REGR_* (12 new).
- JSON_ARRAYAGG / JSON_OBJECTAGG — SQL-standard JSON aggregates recognized as first-class DVM aggregates in DIFFERENTIAL mode.
- JSON_TABLE — PostgreSQL 17+ JSON_TABLE() in FROM clause.
- Keyless table support — tables without primary keys use content hashing.
- Volatile function and operator detection — rejected in DIFFERENTIAL, warned for stable. Custom operators backed by volatile functions are also detected.
- TRUNCATE capture in CDC — triggers fall back to full refresh.
- Window functions — ROW_NUMBER, RANK, SUM OVER, etc. with full frame clause support (ROWS, RANGE, GROUPS, BETWEEN, EXCLUDE) and named WINDOW clauses.
- LATERAL SRFs —
jsonb_array_elements,unnest,jsonb_each, etc. via row-scoped recomputation. - LATERAL subqueries — explicit
LATERAL (SELECT ...)in FROM with correlated references. - Expression support — CASE WHEN, COALESCE, NULLIF, GREATEST, LEAST, IN (list), BETWEEN, IS DISTINCT FROM, IS TRUE/FALSE/UNKNOWN, SIMILAR TO, ANY/ALL (array), ARRAY/ROW constructors, array subscript, field access.
- Ordered-set aggregates — MODE, PERCENTILE_CONT, PERCENTILE_DISC with WITHIN GROUP (ORDER BY).
Monitoring & Observability
- Refresh statistics —
st_refresh_stats(),get_refresh_history(),get_staleness(). - Slot health —
slot_health()checks replication slot state and WAL retention. - DVM plan inspection —
explain_st()describes the operator tree. - Monitoring views —
pgtrickle.stream_tables_infoandpgtrickle.pg_stat_stream_tables. - NOTIFY alerting —
pg_trickle_alertchannel broadcasts stale, suspended, reinitialize, slot lag, refresh completed/failed events.
Infrastructure
- Row ID hashing —
pg_trickle_hash()andpg_trickle_hash_multi()using xxHash (xxh64) for deterministic row identity. - DDL event tracking —
ALTER TABLEandDROP TABLEon source tables automatically setneeds_reiniton affected stream tables.CREATE OR REPLACE FUNCTION/ALTER FUNCTION/DROP FUNCTIONon functions used by defining queries also triggers reinit. - Cross-session cache coherence — shared-memory
CACHE_GENERATIONatomic counter ensures all backends flush delta/MERGE template caches when DDL changes occur. - Version / frontier tracking — per-source JSONB frontier for consistent snapshots and Delayed View Semantics (DVS) guarantee.
- 12 GUC variables —
enabled,scheduler_interval_ms,min_schedule_seconds,max_consecutive_errors,change_buffer_schema,max_concurrent_refreshes,differential_max_change_ratio,cleanup_use_truncate,user_triggers,cdc_mode,wal_transition_timeout,block_source_ddl.
Documentation
- Architecture guide, SQL reference, configuration reference, FAQ, getting-started tutorial, DVM operators reference, benchmark guide.
- Deep-dive tutorials: What Happens on INSERT / UPDATE / DELETE / TRUNCATE.
Testing
- ~1,138 unit tests, 22 E2E test suites (Testcontainers + custom Docker image).
- Property-based tests, integration tests, resilience tests.
- Column snapshot and schema fingerprint-based DDL change detection.
Known Limitations
TABLESAMPLE,LIMIT/OFFSET,FOR UPDATE/FOR SHARE— rejected with clear error messages.- Window functions inside expressions (CASE, COALESCE, arithmetic) — rejected.
- Circular stream table dependencies (cycles) — not yet supported.