Lots of good improvements; my favorites are OAuth support, NOT NULL constraints with NOT VALID, uuidv7, and RETURNING with old/new values. And I think the async I/O will bring performance benefits, although maybe not so much immediately.
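A quick sketch of the last two, assuming the Postgres 18 syntax (the `items` table and its columns are made up for illustration):

```sql
-- uuidv7() generates time-ordered UUIDs, convenient as primary keys.
CREATE TABLE items (
    id    uuid PRIMARY KEY DEFAULT uuidv7(),
    price numeric
);
INSERT INTO items (price) VALUES (10), (20);

-- RETURNING can now reference both the old and the new row values.
UPDATE items
SET    price = price * 1.1
RETURNING old.price AS old_price, new.price AS new_price;
```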
Besides DDL changes, pgstream can also do on-the-fly data anonymization and data masking. It can stream to other stores, like Elasticsearch, but the current main focus is on PG to PG replication with DDL changes and anonymization.
We've been using it in Xata to power our `xata clone` functionality, which creates a "staging replica" on the Xata platform, with anonymized data that closely resembles production. Then one gets fast copy-on-write branching from this anonymized staging replica. This is great for creating dev branches and ephemeral environments.
For a tooling solution to this problem, and many others, pgroll (https://github.com/xataio/pgroll) automates the steps from the blog post in a single higher-level operation. It can, for example, add a hidden column, backfill it with data, then add the constraint, and only then expose it in the new schema.
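For reference, the manual sequence that this kind of tool automates looks roughly like the usual "safe NOT NULL" pattern (a hedged sketch; the `users` table, the column, and the constraint name are hypothetical, and the backfill would normally be batched):

```sql
-- 1. Add the column as nullable so no table rewrite is needed.
ALTER TABLE users ADD COLUMN email text;

-- 2. Backfill existing rows (in batches, to avoid long-running locks).
UPDATE users SET email = 'unknown@example.com' WHERE email IS NULL;

-- 3. Add the check as NOT VALID: existing rows are not scanned yet,
--    so only a brief lock is taken.
ALTER TABLE users
  ADD CONSTRAINT users_email_not_null CHECK (email IS NOT NULL) NOT VALID;

-- 4. Validate separately; this scans the table but holds a weaker lock.
ALTER TABLE users VALIDATE CONSTRAINT users_email_not_null;
```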
That is correct: for non-volatile default values Postgres is quick, which makes it a generally safe operation.
Also interesting: `now()` is non-volatile because it's defined as the start of the transaction, so if you add a column with `DEFAULT now()`, all rows get the same value and the operation is fast. But `timeofday()` is volatile, so `DEFAULT timeofday()` forces a full table rewrite and is going to lock the table for a long time. A bit of a subtle gotcha.
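A quick illustration of the difference (the `events` table is hypothetical):

```sql
-- Fast: now() is stable, so the default is stored once in the catalog
-- and existing rows are not rewritten; they all see the same timestamp.
ALTER TABLE events ADD COLUMN created_at timestamptz DEFAULT now();

-- Slow: timeofday() is volatile, so every existing row has to be
-- rewritten with its own evaluated default (a full table rewrite).
ALTER TABLE events ADD COLUMN noted_at text DEFAULT timeofday();
```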
[author] My point was that using the default config for the other providers, while controlling your own, is a weakness in the methodology and might influence the results.
Note that when strictly limiting it to 4 CPUs, it's still faster.
Another way to mitigate this is to make the agents always work only with a copy of the data that is anonymized. Assuming the anonymization step removes or replaces all sensitive data, then whatever the AI agent does, the consequences won't be disastrous.
The anonymization can be done by pgstream or pg_anonymizer. In combination with copy-on-write branching, you can create safe environments on the fly for AI agents, giving them access to data that closely resembles production without being actual production data.
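For example, with the PostgreSQL Anonymizer extension, masking rules are declared per column via security labels. A minimal sketch (function and table names are illustrative, and the exact calls are from memory, so check the extension's docs):

```sql
-- Assumes the anon extension is installed and preloaded.
CREATE EXTENSION IF NOT EXISTS anon;
SELECT anon.init();  -- load the built-in fake data sets

-- Declare a masking rule: replace real emails with fake ones.
SECURITY LABEL FOR anon ON COLUMN users.email
  IS 'MASKED WITH FUNCTION anon.fake_email()';

-- Statically rewrite the data so the branch never contains real emails.
SELECT anon.anonymize_table('users');
```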
On the Xata platform we actually do CoW snapshots and branching at the block device level, which works great.
However, we are developing pgstream in order to bring in and sync data from other Postgres providers. pgstream can also do anonymization and, in the future, subsetting. Basically this means that no matter which Postgres service you are using (RDS, CloudSQL, etc.), you can still use Xata for staging and dev branches.