samokhvalov · 2026-05-14T22:55:58 1778799358

Hey, PostgresAI founder here.

thank you for using DBLab

Can you DM me, please? Really curious about your experience

samokhvalov · 2026-04-19T05:13:26 1776575606

thanks for pushing back, by the way – I'm thinking this thru, and will likely rename

fun fact: I now think, "River" (Go project) is also a misleading name for a task queue system :)

samokhvalov · 2026-04-19T04:47:46 1776574066

you need to explain to Claude Code that PG18 is out already ;)

samokhvalov · 2026-04-19T04:45:55 1776573955

Fair. I tried to clarify in the README that PgQue is "closer to Kafka topics than to a job queue" -- per-subscription cursor on a shared event log, no ACK-delete, no visibility timeout.

That makes PgQue an event-streaming tool, not an MQ. For SKIP LOCKED systems like PGMQ, PgQue can still be a replacement in certain cases – similarly to how Kafka can be a replacement for RabbitMQ or ActiveMQ in certain cases.

Agreed the "queue" naming is historical and a bit loose -- https://github.com/NikolayS/pgque/issues/70

samokhvalov · 2026-04-19T01:15:50 1776561350

correct

it's explained in README:

> Category: River, Que, and pg-boss (and Oban, graphile-worker, solid_queue, good_job) are job queue frameworks. PgQue is an event/message queue optimized for high-throughput streaming with fan-out.

samokhvalov · 2026-04-19T00:19:46 1776557986

nice work

I wonder if you considered WAL-G, which is also written in Go

and has this: https://github.com/wal-g/wal-g/blob/master/docs/PostgreSQL.m...

alzhi7 · 2026-04-20T15:36:24 1776699384

Thanks for the feedback!

Yes, I know about this tool, it's great. I watched videos about how it was developed, what difficulties there were in achieving delta backups, and how the developers also spent a ton of time studying the PostgreSQL source code. And I studied the Wal-G source code myself. I just never had to use it at work, since I was used to pgBackRest and, a bit later, to Barman. Wal-G focuses on cloud and universality (i.e., it's not only used for PG, but has a unified interface for many different storage systems).

Initially, I didn't even have the idea of making a complete, reliable tool. Over time, I started striving toward exactly that. When there was an available hypervisor at work, I set up k8s there and ran my receiver for several dev databases, just to test its operation 24/7, setting aggressive config parameters (frequent compression, unloading, cleanup, frequent backups, etc.). At the same time, I was choosing not small databases, but quite real production ones, with various nightly integrations for data population (external APIs, Airflow, and all that), blobs/tablespaces.

And of course I read your articles, and watched a lot of videos

samokhvalov · 2026-04-19T00:03:37 1776557017

Taxonomy is correct. But the benefit isn't "table grows indefinitely vs. vacuum-starved death spiral"

in all three approaches, if the consumer falls behind, events accumulate

The real distinction is cost per event under MVCC pressure. Under held xmin (idle-in-transaction, long-running writer, lagging logical slot, physical standby with hot_standby_feedback=on):

1. SKIP LOCKED systems: every DELETE or UPDATE creates a dead tuple that autovacuum can't reclaim while the xmin horizon is held. Indexes bloat. Subsequent FOR UPDATE SKIP LOCKED scans don't help.

2. Partition + DROP (some SKIP LOCKED systems already support it, e.g. PGMQ): old partitions drop cleanly, but the active partition is still DELETE-based and accumulates dead tuples — same pathology within the active window, just bounded by retention. Another thing is that DROPping and attaching/detaching partitions is more painful than working with a few existing ones and using TRUNCATE.

3. PgQue / PgQ: active event table is INSERT-only. Each consumer remembers its own pointer (ID of last event processed) independently. CPU stays flat under xmin pressure.
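To make the cost difference concrete, here is a toy in-memory model, not PostgreSQL and not PgQue's actual code. It assumes a DELETE-based consumer has to step over every dead tuple left in the table, and that vacuum reclaims nothing while xmin is held. The class names, the `rows_visited` counter, and the dense 1-based event IDs are all hypothetical simplifications.

```python
from dataclasses import dataclass, field


@dataclass
class DeleteBasedQueue:
    """Toy stand-in for a DELETE-based queue table (illustrative only)."""
    rows: list = field(default_factory=list)  # each row is [event_id, is_dead]
    xmin_held: bool = False

    def produce(self, event_id):
        self.rows.append([event_id, False])

    def consume_one(self):
        """Return (event_id, rows_visited); dead rows are still stepped over."""
        visited = 0
        for row in self.rows:
            visited += 1
            if not row[1]:
                row[1] = True  # the DELETE leaves a dead tuple behind
                return row[0], visited
        return None, visited

    def vacuum(self):
        """Assumption: dead rows are reclaimed only when xmin is not held."""
        if not self.xmin_held:
            self.rows = [row for row in self.rows if not row[1]]


@dataclass
class AppendOnlyLog:
    """Toy stand-in for an INSERT-only event table with per-consumer cursors."""
    events: list = field(default_factory=list)
    cursors: dict = field(default_factory=dict)  # consumer -> last event ID processed

    def produce(self, event_id):
        self.events.append(event_id)

    def consume_batch(self, consumer):
        last = self.cursors.get(consumer, 0)
        # Assumption: IDs are dense and start at 1, so slicing at `last`
        # yields exactly the events with ID greater than the cursor.
        batch = self.events[last:]
        if batch:
            self.cursors[consumer] = batch[-1]
        return batch
```

With `xmin_held=True`, consuming five events from `DeleteBasedQueue` visits 1, 2, 3, 4, then 5 rows, because the dead rows never go away. `AppendOnlyLog` consumers only move their own pointer, and each one sees the full stream independently.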

I posted a few more benchmark charts on my LinkedIn and Twitter, and plan to post an article explaining all this with examples. Among them was a benchmark with xmin held for 30 min at 2000 ev/s: PgQue sustained the full producer rate at ~14% CPU, while SKIP LOCKED queues were pinned at 55-87% CPU with throughput dropping 20-80%. What's even worse, after the xmin horizon was unblocked, not all of them recovered and caught up on consumption within the next 30 min.

pierrekin · 2026-04-19T04:32:55 1776573175

I think there are two kinds of partition based approach which may cause some confusion if lumped together in this kind of comparison.

Insert and delete with old partition drop vs insert only with old partition drop.

The semantics of the two approaches differ by default but you can achieve the same semantics from either with some higher order changes (partitioning the event space, tracking a cursor per consumer etc).

How does PgQue compare to the insert only partition based approach?

samokhvalov · 2026-04-19T04:55:48 1776574548

1. partitions are never dropped – they get TRUNCATEd (gracefully) during rotation

2. INSERT-only. Each consumer remembers its position – ID of the last event consumed. This pointer shifts independently for each consumer. It's much closer to Kafka than to task queue systems like ActiveMQ or RabbitMQ.

When you run a long-running transaction with a real XID, or a read-only one in REPEATABLE READ (e.g., pg_dump running for a long time), or a logical slot is unused/lagging, performance suffers badly if you have dead tuples accumulated from DELETEs/UPDATEs that have not been promptly vacuumed.

PgQue event tables are append-only, and consumers know how to find the next batch of events to consume – so a blocked xmin horizon doesn't affect them, by design.
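A rough sketch of the rotation idea in plain Python, purely illustrative and not PgQue's implementation: a fixed ring of partitions where the next one is cleared (standing in for TRUNCATE) and reused. The class and method names are hypothetical, and the rule that rotation waits until the slowest consumer has passed every event in the target partition is an assumption of this sketch.

```python
class RotatingEventLog:
    """Toy model: fixed ring of append-only partitions with per-consumer cursors."""

    def __init__(self, partitions=3):
        self.partitions = [[] for _ in range(partitions)]
        self.active = 0      # index of the partition currently receiving inserts
        self.next_id = 1
        self.cursors = {}    # consumer -> ID of the last event consumed

    def subscribe(self, consumer):
        # A new consumer starts at the current end of the log.
        self.cursors[consumer] = self.next_id - 1

    def insert(self, payload):
        event_id = self.next_id
        self.next_id += 1
        self.partitions[self.active].append((event_id, payload))
        return event_id

    def next_batch(self, consumer):
        """Return all events past the consumer's cursor, then advance the cursor."""
        last = self.cursors[consumer]
        batch = sorted(ev for part in self.partitions for ev in part if ev[0] > last)
        if batch:
            self.cursors[consumer] = batch[-1][0]
        return batch

    def rotate(self):
        """Switch inserts to the next partition, clearing it first.

        Assumption: rotation is refused while any consumer still needs
        events stored in the target partition.
        """
        target = (self.active + 1) % len(self.partitions)
        slowest = min(self.cursors.values(), default=self.next_id - 1)
        if any(event_id > slowest for event_id, _ in self.partitions[target]):
            return False
        self.partitions[target].clear()  # stands in for TRUNCATE
        self.active = target
        return True
```

Nothing here deletes individual events: consuming only moves a cursor, and space comes back a whole partition at a time.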

samokhvalov · 2026-04-18T23:20:17 1776554417

(PgQue author here)

I didn't understand the nuances in the beginning myself

We have 3 kinds of latencies when dealing with event messages:

1. producer latency – how long does it take to insert an event message?

2. subscriber latency – how long does it take to get a message? (or a batch of all new messages, like in this case)

3. end-to-end event delivery time – how long does it take for a message to go from producer to consumer?

In case of PgQ/PgQue, the 3rd one is limited by "tick" frequency – by default, it's once per second (I'm thinking about how to simplify more frequent configs; pg_cron is limited to 1/s).

Meanwhile, 1 and 2 are both sub-ms for PgQue. Consumers just don't see fresh messages until a tick happens, while the consuming queries themselves are fast.
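As a back-of-the-envelope illustration, the end-to-end delay can be modeled as the wait until the next tick. This is a simplified sketch, not PgQue code: it assumes ticks fire at exact multiples of the interval, that an event inserted exactly on a tick is visible in that tick, and it ignores the sub-ms producer and subscriber latencies. The function name is hypothetical.

```python
import math


def delivery_latency(insert_time, tick_interval=1.0):
    """Seconds an event waits before the first tick at or after insert_time."""
    next_tick = math.ceil(insert_time / tick_interval) * tick_interval
    return next_tick - insert_time


# An event inserted 0.25 s after a tick waits 0.75 s with 1 s ticks,
# but only 0.25 s if ticks run every 0.5 s.
print(delivery_latency(0.25))        # 0.75
print(delivery_latency(0.25, 0.5))   # 0.25
```

Under this model the wait is always shorter than one tick interval, so a more frequent tick lowers worst-case delivery time without changing how fast the insert or the read itself runs.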

Hope this helps. Thanks for the question. Will add this to the README.

hardwaresofton · 2026-04-19T06:22:55 1776579775

Not the original commenter but this is an excellent answer, thanks for the clear explanation — this is what people continue to come to HN for IMO.

Also thanks for all the podcasts and content, always a joy to watch.

samokhvalov · 2026-04-19T19:58:10 1776628690

thank you!

samokhvalov · 2026-01-26T19:52:17 1769457137

congrats! the more postgres everywhere, the better

samokhvalov · 2025-06-03T03:00:14 1748919614

1) built using an open source kubernetes operator, as I understand 2) Crunchy provides true superuser access and access to physical backups – that's huge

anonymousDan · 2025-06-03T04:22:00 1748924520

Why is that huge, out of interest?

CBLT · 2025-06-03T04:27:31 1748924851

Business continuity. If you don't have access to your backups, there's nothing you can do to work around a vendor issue.

apexalpha · 2025-06-03T06:18:22 1748931502

Sounds like Stackgres?