Scaling

pg_ripple scales vertically within a single PostgreSQL instance and horizontally for read traffic via streaming replication. This page covers how to allocate resources, tune the merge worker, set up read replicas, and understand current limitations.

Current scaling model

pg_ripple runs entirely within PostgreSQL. It inherits PostgreSQL's single-writer architecture: one primary handles all writes, and read replicas serve read-only SPARQL queries. Cross-node sharding is not yet supported.


Vertical Scaling

The most impactful scaling lever is giving your single PostgreSQL instance more resources.

Memory

Memory affects three key areas:

| Resource | Controlled By | Impact |
|---|---|---|
| Dictionary LRU cache | pg_ripple.dictionary_cache_size | Reduces disk I/O for IRI/literal lookups. Every SPARQL query touches the dictionary on decode. |
| PostgreSQL shared buffers | shared_buffers | Caches VP table pages. Larger = fewer disk reads for joins. |
| Work memory | work_mem | Memory for sorts, hash joins, and hash aggregates in SPARQL-generated SQL. |

Dictionary Cache Sizing

The dictionary cache is allocated in shared memory at server startup. Each entry consumes approximately 200 bytes.

```sql
-- Check current utilization
SELECT
  s->>'encode_cache_capacity' AS capacity,
  s->>'encode_cache_utilization_pct' AS utilization_pct,
  ROUND(
    (s->>'encode_cache_hits')::numeric /
    NULLIF((s->>'encode_cache_hits')::numeric + (s->>'encode_cache_misses')::numeric, 0),
    4
  ) AS hit_rate
FROM pg_ripple.stats() s;
```
| Hit Rate | Action |
|---|---|
| > 95% | Healthy — no change needed |
| 90–95% | Consider increasing dictionary_cache_size |
| < 90% | Double dictionary_cache_size and restart |

Rule of thumb

Set dictionary_cache_size to at least 10% of the total number of unique IRIs and literals in your dataset. For a dataset with 5M unique terms, start with 500K entries (~100 MB of shared memory).
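As a concrete sketch of that rule (assuming, per the description above, that dictionary_cache_size counts entries at roughly 200 bytes each):

```sql
-- 5M unique terms * 10% = 500K entries; 500K * ~200 bytes ≈ 100 MB of shared memory
ALTER SYSTEM SET pg_ripple.dictionary_cache_size = 500000;
-- The cache is allocated in shared memory at startup,
-- so the new size takes effect only after a server restart.
```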

PostgreSQL Memory Settings

```ini
# postgresql.conf — for a 64 GB server with pg_ripple as the primary workload
shared_buffers = 16GB
effective_cache_size = 48GB
work_mem = 256MB
maintenance_work_mem = 2GB
```

work_mem and SPARQL

Complex SPARQL queries with multiple joins, UNIONs, or aggregates can spawn many hash operations. PostgreSQL allows up to work_mem per sort or hash operation, per query, so a single complex query can consume several multiples of work_mem. Start conservatively (64MB–256MB) and increase if you see "temporary file" entries in the logs.
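To see whether queries are actually spilling to disk, the stock PostgreSQL statistics views are enough (nothing here is pg_ripple-specific):

```sql
-- Cumulative spill activity for this database since the last stats reset
SELECT temp_files, pg_size_pretty(temp_bytes) AS temp_bytes
FROM pg_stat_database
WHERE datname = current_database();

-- Log every spill over 10 MB so the offending queries appear in the server log
ALTER SYSTEM SET log_temp_files = '10MB';
SELECT pg_reload_conf();
```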

CPU

| Workload | CPU Benefit |
|---|---|
| SPARQL query execution | More cores → more parallel workers for large joins |
| Merge worker | Single-threaded per predicate, but merges run concurrently across predicates |
| Bulk loading | load_turtle / load_ntriples are I/O-bound; CPU helps with dictionary encoding |
| Datalog inference | Semi-naive fixpoint is CPU-intensive; benefits from faster cores |

Set max_parallel_workers_per_gather to allow PostgreSQL to parallelize large VP table scans:

```ini
max_parallel_workers_per_gather = 4
max_parallel_workers = 8
parallel_setup_cost = 100
parallel_tuple_cost = 0.001
```

pg_ripple's parallel_query_min_joins GUC controls when the SPARQL engine enables parallel hints in generated SQL (default: 3 joins).
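To confirm that a large scan actually runs in parallel, EXPLAIN it and look for Gather nodes. A sketch, assuming a VP table named vp_foaf_knows (the table name is illustrative; actual VP table names depend on your predicates):

```sql
-- Table name is hypothetical; substitute one of your own VP tables
EXPLAIN (COSTS OFF)
SELECT count(*) FROM vp_foaf_knows;
-- A parallel plan has this shape:
--   Finalize Aggregate
--     ->  Gather
--           ->  Partial Aggregate
--                 ->  Parallel Seq Scan on vp_foaf_knows
```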

Storage

| Tier | Recommendation |
|---|---|
| NVMe SSD | Best for all workloads; handles the random I/O of dictionary lookups and VP table joins well. |
| SATA SSD | Acceptable for medium datasets. |
| HDD | Not recommended. Dictionary lookups and VP joins are random-I/O heavy. |

Separate WAL and data

Place pg_wal on a separate NVMe device from the main data directory. pg_ripple's bulk load and merge operations generate significant WAL traffic.


Merge Worker Tuning

The HTAP merge worker is the most important pg_ripple-specific scaling knob. It controls how quickly delta rows (recent writes) are consolidated into the main BRIN-indexed partition.

How the Merge Worker Operates

  1. The worker polls every merge_interval_secs (default: 60s)
  2. For each predicate, it checks if delta row count >= merge_threshold
  3. If yes, it creates a new main table: (old main − tombstones) UNION ALL delta
  4. It swaps the view to point at the new main, then drops the old main after merge_retention_seconds
  5. If auto_analyze is on, it runs ANALYZE on the new main
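The steps above amount to the classic swap-under-a-view pattern. A simplified sketch for one predicate — all table, view, and column names here are illustrative, not actual pg_ripple internals:

```sql
-- Illustrative only: real pg_ripple objects and columns may differ
BEGIN;
-- Step 3: new main = (old main minus tombstones) UNION ALL delta
CREATE TABLE vp_knows_main_new AS
SELECT m.s, m.o, m.g
FROM vp_knows_main m
WHERE NOT EXISTS (
  SELECT 1 FROM vp_knows_tombstones t
  WHERE t.s = m.s AND t.o = m.o AND t.g = m.g
)
UNION ALL
SELECT d.s, d.o, d.g FROM vp_knows_delta d;

-- Step 4: repoint the reader view; the old main is dropped later,
-- once merge_retention_seconds has elapsed
CREATE OR REPLACE VIEW vp_knows AS
SELECT s, o, g FROM vp_knows_main_new;

TRUNCATE vp_knows_delta, vp_knows_tombstones;
COMMIT;

-- Step 5: refresh planner statistics (the auto_analyze behavior)
ANALYZE vp_knows_main_new;
```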

Tuning for Write-Heavy Workloads

Lower the merge threshold and interval to keep the delta tables small:

```ini
pg_ripple.merge_threshold = 5000
pg_ripple.merge_interval_secs = 30
pg_ripple.latch_trigger_threshold = 5000
```

This gives fresher reads but increases I/O from more frequent merges.

Tuning for Read-Heavy Workloads

Raise the threshold to batch more writes before merging:

```ini
pg_ripple.merge_threshold = 50000
pg_ripple.merge_interval_secs = 120
```

This reduces merge I/O overhead but means queries scan larger delta tables.

Monitoring Merge Activity

```sql
-- Is the merge worker running?
SELECT (pg_ripple.stats()->>'merge_worker_pid')::int AS pid;

-- How many unmerged delta rows?
SELECT (pg_ripple.stats()->>'unmerged_delta_rows')::int AS delta_rows;
```

Merge worker stalls

If unmerged_delta_rows grows continuously while merge_worker_pid is non-zero, the worker may be stuck. Check pg_stat_activity for long-running merge transactions and look for lock contention. The merge_watchdog_timeout GUC (default: 300s) logs a WARNING if the worker is idle too long.
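Standard PostgreSQL views are enough to diagnose a stall; nothing below is pg_ripple-specific:

```sql
-- Long-running transactions (the merge worker, or whatever is blocking it)
SELECT pid, backend_type, state,
       now() - xact_start AS xact_age,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_age DESC;

-- Lock waits: which sessions are blocked, and by whom
SELECT pid, pg_blocking_pids(pid) AS blocked_by
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;
```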


Read Replicas

PostgreSQL streaming replication provides horizontal read scaling for SPARQL queries.

Architecture

```
┌────────────┐     WAL stream     ┌────────────┐
│  Primary   │ ──────────────────→ │  Replica 1 │ ← SPARQL reads
│  (writes)  │                     └────────────┘
│            │     WAL stream     ┌────────────┐
│            │ ──────────────────→ │  Replica 2 │ ← SPARQL reads
└────────────┘                     └────────────┘
```

Setting Up a Read Replica

On the primary:

```ini
# postgresql.conf
wal_level = replica
max_wal_senders = 5
wal_keep_size = 1GB
```

Create a replication slot:

```sql
SELECT pg_create_physical_replication_slot('replica1');
```

On the replica:

```shell
pg_basebackup -h primary-host -D /var/lib/postgresql/18/main -R -S replica1 -P
```

Start the replica — it will begin streaming WAL and replaying changes, including all VP table mutations.
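Once streaming, replication health can be checked with stock PostgreSQL monitoring:

```sql
-- On the primary: per-replica streaming state and lag (PostgreSQL 10+)
SELECT application_name, state, sync_state,
       write_lag, flush_lag, replay_lag
FROM pg_stat_replication;

-- On a replica: bytes received but not yet replayed
SELECT pg_size_pretty(
  pg_wal_lsn_diff(pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn())
) AS replay_gap;
```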

Replica Considerations

Merge worker does not run on replicas

The background merge worker only runs on the primary. Replicas receive already-merged state through WAL replay. This means replicas always have a consistent view of the data without any additional overhead.

  • SPARQL queries work identically on replicas — the query engine reads VP tables the same way
  • Dictionary cache is independent per instance — each replica maintains its own LRU cache
  • Replication lag: monitor with pg_stat_replication on the primary. Under normal load, lag should be sub-second
  • Hot standby conflicts: long-running SPARQL queries on replicas may conflict with WAL replay. Set max_standby_streaming_delay appropriately:
```ini
# On the replica
max_standby_streaming_delay = 30s
hot_standby_feedback = on
```
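If you suspect replay is cancelling queries, PostgreSQL's built-in conflict counters show how often and why:

```sql
-- On the replica: queries cancelled due to recovery conflicts, by cause
SELECT confl_snapshot, confl_lock, confl_bufferpin, confl_deadlock
FROM pg_stat_database_conflicts
WHERE datname = current_database();
```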

Connection Pooling

For workloads with many concurrent SPARQL clients, use a connection pooler:

PgBouncer with pg_ripple

pg_ripple uses session-level GUC parameters (e.g., pg_ripple.inference_mode). If you use PgBouncer, configure it in session pooling mode, not transaction mode. Transaction-mode pooling resets GUCs between transactions, which can cause unexpected behavior.

```ini
# pgbouncer.ini
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
pool_mode = session
max_client_conn = 200
default_pool_size = 20
```

Scaling Limits and Honest Boundaries

| Dimension | Current Capability | Limitation |
|---|---|---|
| Triples per instance | Tested to 1B+ | Bound by disk and memory |
| Concurrent SPARQL queries | Hundreds (with pooler) | Bound by max_connections and CPU |
| Write throughput | ~50K–200K triples/sec (bulk load) | Single-writer architecture |
| Read replicas | Unlimited | Standard PG replication |
| Cross-node sharding | Not supported | No distributed query planner |
| Multi-primary writes | Not supported | PostgreSQL limitation |
| Federation | Supported (SERVICE clause) | Remote endpoints add latency |

Sharding is not available

pg_ripple does not support sharding VP tables across multiple PostgreSQL instances. If your dataset exceeds what a single instance can handle, consider: (1) vertical scaling with larger hardware, (2) federation via SERVICE clauses to distribute queries across multiple pg_ripple instances, each holding a subset of graphs, or (3) archiving cold graphs to separate instances.


Capacity Planning

Storage Estimates

| Component | Per Triple (approx.) |
|---|---|
| VP table row (s, o, g, i, source) | ~40 bytes |
| VP indexes (dual B-tree) | ~80 bytes |
| Dictionary entry (per unique term) | ~120 bytes |
| HTAP overhead (delta + tombstone tables) | ~20% of VP size during active writes |

Example: 100M triples with 20M unique terms ≈ 12 GB (VP rows + indexes) + 2.4 GB (dictionary) + overhead ≈ 20 GB total.
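The same arithmetic can be reproduced in SQL from the per-triple figures above:

```sql
SELECT
  round((100e6 * (40 + 80) / 1e9)::numeric, 1)        AS vp_gb,         -- rows + indexes: 12.0
  round((20e6 * 120 / 1e9)::numeric, 1)               AS dictionary_gb, -- 2.4
  round((100e6 * (40 + 80) * 0.20 / 1e9)::numeric, 1) AS htap_gb;       -- 2.4
-- ≈ 17 GB, plus WAL and free-space headroom; hence the ~20 GB estimate
```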

Memory Estimates

| Component | Sizing |
|---|---|
| shared_buffers | 25% of RAM |
| dictionary_cache_size | 10% of unique terms |
| work_mem | 64MB–512MB depending on query complexity |
| OS page cache | Remaining RAM |

Start small, measure, scale

Deploy with conservative settings, load your data, and run representative queries. Use pg_ripple.stats() and PostgreSQL's pg_stat_user_tables to identify bottlenecks before adding hardware.