What Happens When You TRUNCATE a Table?
This tutorial explains what happens when a TRUNCATE statement hits a base table that is referenced by a stream table. Unlike INSERT, UPDATE, and DELETE — which are fully tracked by the CDC trigger — TRUNCATE is a special case that bypasses row-level triggers entirely. Understanding this gap is essential for operating pg_stream correctly.
Prerequisite: Read WHAT_HAPPENS_ON_INSERT.md first — it introduces the 7-phase lifecycle. This tutorial explains why TRUNCATE breaks that lifecycle and how to recover.
Setup
Same e-commerce example used throughout the series:
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer TEXT NOT NULL,
amount NUMERIC(10,2) NOT NULL
);
SELECT pgstream.create_stream_table(
'customer_totals',
$$
SELECT customer, SUM(amount) AS total, COUNT(*) AS order_count
FROM orders GROUP BY customer
$$,
'1m',
'DIFFERENTIAL'
);
-- Seed some data
INSERT INTO orders (customer, amount) VALUES
('alice', 50.00),
('alice', 30.00),
('bob', 75.00),
('bob', 25.00);
After the first refresh, the stream table contains:
customer | total | order_count
---------|--------|------------
alice | 80.00 | 2
bob | 100.00 | 2
Case 1: TRUNCATE the Base Table
TRUNCATE orders;
All four rows are removed instantly.
What Happens at the Trigger Level: Nothing
PostgreSQL's TRUNCATE command does not fire row-level AFTER triggers. This is by design in PostgreSQL — TRUNCATE is a DDL-like operation that removes all rows without scanning them individually. The per-row AFTER INSERT OR UPDATE OR DELETE trigger that pg_stream installs is simply never invoked.
The change buffer remains empty:
pgstream_changes.changes_16384
┌───────────┬─────────────┬────────┬──────────┬──────────┐
│ change_id │ lsn │ action │ ... │ ... │
├───────────┼─────────────┼────────┼──────────┼──────────┤
│ (empty) │ │ │ │ │
└───────────┴─────────────┴────────┴──────────┴──────────┘
No rows. Zero change events captured.
What Happens at the Scheduler: Skip
On the next refresh cycle, the scheduler:
- Checks the change buffer for rows in the LSN window
- Finds zero rows
- Classifies the refresh action as
NO_DATA - Advances the data timestamp but does not modify the stream table
The stream table still shows the old values:
SELECT * FROM customer_totals;
customer | total | order_count
----------|--------|------------
alice | 80.00 | 2 ← STALE! orders table is empty
bob | 100.00 | 2 ← STALE! orders table is empty
The stream table is now stale. It reflects data that no longer exists in the base table.
Why This Happens
The entire incremental view maintenance pipeline depends on the change buffer to know what changed. Without change events, the DVM has no deltas to apply. The stream table's reference-counted aggregates still think there are 4 orders contributing to two groups.
This is not a bug — it's a fundamental limitation of trigger-based CDC. The trigger can only fire when PostgreSQL executes individual row operations, and TRUNCATE deliberately skips per-row processing for performance.
Case 2: How to Recover — Manual Refresh
The fix is straightforward. Force a manual refresh:
SELECT pgstream.refresh_stream_table('customer_totals');
This executes a full refresh regardless of the stream table's configured refresh mode:
- TRUNCATE the stream table itself (clearing the stale data)
- Re-execute the defining query:
SELECT customer, SUM(amount) AS total, COUNT(*) AS order_count FROM orders GROUP BY customer - INSERT the results into the stream table
- Update the frontier so future differential refreshes start from the current LSN
Since the orders table is empty, the defining query returns zero rows. The stream table becomes empty too:
SELECT pgstream.refresh_stream_table('customer_totals');
SELECT * FROM customer_totals;
customer | total | order_count
----------|-------|------------
(0 rows) ← correct: orders is empty
The stream table is now consistent again. Future INSERT/UPDATE/DELETE operations on orders will be captured normally by the trigger and propagated incrementally.
Case 3: TRUNCATE Then INSERT (Common ETL Pattern)
A common data loading pattern is:
BEGIN;
TRUNCATE orders;
INSERT INTO orders (customer, amount) VALUES
('charlie', 100.00),
('charlie', 200.00),
('dave', 150.00);
COMMIT;
What the Change Buffer Sees
- TRUNCATE: 0 events (not captured)
- INSERT charlie 100.00: 1 event (captured)
- INSERT charlie 200.00: 1 event (captured)
- INSERT dave 150.00: 1 event (captured)
The change buffer has 3 INSERT events — but knows nothing about the 4 rows that were removed by TRUNCATE.
What the Scheduler Does
The scheduler sees 3 pending changes and runs a DIFFERENTIAL refresh:
Aggregate delta (from INSERTs only):
Group "charlie": ins_count = 2, ins_total = 300.00
Group "dave": ins_count = 1, ins_total = 150.00
The MERGE applies these deltas to the existing stream table:
SELECT * FROM customer_totals;
customer | total | order_count
----------|--------|------------
alice | 80.00 | 2 ← STALE! alice has no orders
bob | 100.00 | 2 ← STALE! bob has no orders
charlie | 300.00 | 2 ← correct
dave | 150.00 | 1 ← correct
The new data (charlie, dave) is correct. But the old data (alice, bob) persists because the TRUNCATE that removed their orders was never captured. The reference counts for alice and bob were never decremented.
How to Fix
SELECT pgstream.refresh_stream_table('customer_totals');
SELECT * FROM customer_totals;
customer | total | order_count
----------|--------|------------
charlie | 300.00 | 2 ← correct
dave | 150.00 | 1 ← correct
Case 4: TRUNCATE a Dimension Table in a JOIN
Consider a stream table that joins two tables:
CREATE TABLE customers (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
tier TEXT NOT NULL DEFAULT 'standard'
);
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_id INT REFERENCES customers(id),
amount NUMERIC(10,2)
);
SELECT pgstream.create_stream_table(
'order_details',
$$
SELECT c.name, c.tier, o.amount
FROM orders o
JOIN customers c ON o.customer_id = c.id
$$,
'1m',
'DIFFERENTIAL'
);
Now truncate the dimension table:
TRUNCATE customers CASCADE;
The CASCADE also truncates orders (due to the foreign key). Neither TRUNCATE fires row-level triggers. The change buffer is empty for both tables.
The stream table continues to show all the old joined rows as if nothing happened. The only recovery is a manual refresh:
SELECT pgstream.refresh_stream_table('order_details');
Case 5: FULL Mode Stream Tables Are Immune
If the stream table uses FULL refresh mode instead of DIFFERENTIAL:
SELECT pgstream.create_stream_table(
'customer_totals_full',
$$
SELECT customer, SUM(amount) AS total, COUNT(*) AS order_count
FROM orders GROUP BY customer
$$,
'1m',
'FULL'
);
A FULL-mode stream table doesn't use the change buffer at all. Every refresh cycle:
- TRUNCATEs the stream table
- Re-executes the defining query
- Inserts all results
So after a TRUNCATE of the base table, the next scheduled refresh automatically picks up the correct state — no manual intervention needed. The trade-off is that every refresh recomputes from scratch, which is more expensive for large result sets.
Why PostgreSQL Doesn't Fire Row Triggers on TRUNCATE
Understanding the PostgreSQL internals helps explain why this limitation exists:
| Operation | Mechanism | Row triggers fired? |
|---|---|---|
DELETE FROM t | Scans and removes rows one by one | Yes — AFTER DELETE per row |
TRUNCATE t | Removes all heap files and reinitializes the table storage | No — no per-row processing |
DELETE FROM t WHERE true | Same as DELETE FROM t (full scan) | Yes — AFTER DELETE per row |
TRUNCATE is fundamentally different from DELETE. It's an O(1) operation that replaces the table's storage files, while DELETE is O(N) — scanning every row and recording each removal in WAL.
PostgreSQL does support statement-level AFTER TRUNCATE triggers:
CREATE TRIGGER after_truncate_trigger
AFTER TRUNCATE ON orders
FOR EACH STATEMENT
EXECUTE FUNCTION some_function();
However, statement-level TRUNCATE triggers:
- Do not receive OLD row data (there's no
OLDrecord) - Cannot enumerate which rows were removed
- Only know that a TRUNCATE happened on a specific table
This means a TRUNCATE trigger could detect the event but cannot generate the per-row DELETE events that the DVM pipeline needs.
Alternative: DELETE FROM Instead of TRUNCATE
If you need the stream table to stay consistent without manual intervention, use DELETE FROM instead of TRUNCATE:
-- Instead of: TRUNCATE orders;
DELETE FROM orders;
This is slower (O(N) vs O(1)) but fires the row-level DELETE trigger for every row. The change buffer captures all removals, and the next differential refresh correctly decrements all reference counts, removing groups whose count reaches zero.
| Approach | Speed | Stream table consistent? |
|---|---|---|
TRUNCATE orders | O(1) — instant | No — requires manual refresh |
DELETE FROM orders | O(N) — scans all rows | Yes — triggers fire for each row |
TRUNCATE + manual refresh | O(1) + O(query) | Yes — after manual refresh |
For tables with millions of rows, DELETE FROM can be slow and generate significant WAL. In those cases, TRUNCATE followed by a manual refresh is often the pragmatic choice.
Best Practices
1. Always Refresh After TRUNCATE
Make it a habit: if you TRUNCATE a base table, immediately refresh all dependent stream tables:
TRUNCATE orders;
SELECT pgstream.refresh_stream_table('customer_totals');
2. Use DELETE FROM for Small Tables
For tables with fewer than ~100K rows, DELETE FROM is fast enough and keeps everything consistent automatically.
3. Wrap TRUNCATE + Refresh in a Function
For ETL pipelines, create a helper:
CREATE OR REPLACE FUNCTION reload_orders(data jsonb) RETURNS void AS $$
BEGIN
TRUNCATE orders;
INSERT INTO orders SELECT * FROM jsonb_populate_recordset(null::orders, data);
PERFORM pgstream.refresh_stream_table('customer_totals');
END;
$$ LANGUAGE plpgsql;
4. Consider FULL Mode for ETL-Heavy Tables
If a table is routinely truncated and reloaded, FULL refresh mode may be simpler than DIFFERENTIAL — it naturally handles TRUNCATE because it recomputes from scratch every cycle.
5. Monitor for Staleness
pg_stream emits monitoring alerts when a stream table's data deviates from expected freshness. After a TRUNCATE, the data_timestamp still advances (since the scheduler sees no pending changes), but the actual data is stale. The most reliable detection is comparing the stream table results against a fresh query:
-- Quick consistency check
SELECT count(*) FROM customer_totals; -- still shows rows?
SELECT count(*) FROM orders; -- should be 0 after TRUNCATE
How TRUNCATE Compares to Other Operations
| Aspect | INSERT | UPDATE | DELETE | TRUNCATE |
|---|---|---|---|---|
| Trigger fires? | Yes (per row) | Yes (per row) | Yes (per row) | No |
| Change buffer | 1 row per INSERT | 1 row per UPDATE | 1 row per DELETE | Empty |
| Stream table updated? | Yes (next refresh) | Yes (next refresh) | Yes (next refresh) | No — stays stale |
| Recovery | Automatic | Automatic | Automatic | Manual refresh required |
| FULL mode affected? | N/A (recomputes) | N/A (recomputes) | N/A (recomputes) | N/A (recomputes) |
| Speed | O(1) per row | O(1) per row | O(1) per row | O(1) total |
Summary
TRUNCATE is the one common DML-like operation that falls outside pg_stream's automatic change tracking. The trigger-based CDC architecture captures INSERT, UPDATE, and DELETE perfectly — but TRUNCATE bypasses row-level triggers by design.
The key takeaways:
- TRUNCATE does not fire row-level triggers — the change buffer stays empty
- The stream table becomes stale — showing data that no longer exists in the base table
- Manual refresh fixes it —
SELECT pgstream.refresh_stream_table('name')recomputes from scratch - FULL mode is immune — every refresh recomputes regardless of change tracking
DELETE FROMis the trigger-safe alternative — slower but keeps everything consistent automatically- After TRUNCATE, always refresh — make this a standard part of your ETL workflow
Next in This Series
- What Happens When You INSERT a Row? — The full 7-phase lifecycle (start here if you haven't already)
- What Happens When You UPDATE a Row? — D+I split, group key changes, net-effect for multiple UPDATEs
- What Happens When You DELETE a Row? — Reference counting, group deletion, INSERT+DELETE cancellation