One thing I admire about Snowflake is a real commitment to self-cannibalization. They were super out front with Iceberg even though it could disrupt them, because that's what customers were asking for, and they're willing to bet they'll figure out how to make money in that new world.
Have you interacted with Snowflake teams much? We are using external Iceberg tables with Snowflake. Every interaction pretty much boils down to "you really should not be using Iceberg, you should be using Snowflake for storage." It's also pretty obvious some things are strategically not implemented to push you very strongly in that direction.
Not surprised - this stuff isn’t fully mature yet. But I interact with their team a lot and know they have a commitment to it (I’m the other guy in that video)
Even partition elimination is pretty primitive. For the query optimizer, Iceberg is really not a primary target. The overall interaction, even with technical people, gives off a strong "this is a sales org that happens to own an OLAP db product" vibe.
I have to very much disagree on that.
All pruning techniques in Snowflake work equally well on their proprietary format and on Iceberg tables. Iceberg is nowadays a first-class citizen in Snowflake, with pruning working at the file level, row group level, and page level. The same is true for other query optimization techniques. There is even a paper on that: https://arxiv.org/abs/2504.11540
Where pruning differences might arise for Iceberg tables is the structure of the Parquet files and the availability of metadata. Both depend on the writer of the Parquet files. Metadata might be completely missing (e.g., no per-column min/max) or partially missing (e.g., no page indexes), which will indeed impact performance. This is why it's super important to choose a writer that produces rich metadata. The metadata can be backfilled / recomputed after the fact by the querying engine, but that comes at a cost.
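For illustration, here's a minimal sketch (assuming a recent pyarrow; the file name and columns are made up) of writing Parquet with the kind of metadata engines can prune on:

```python
# Minimal sketch: write Parquet with per-column min/max statistics and
# page indexes (column/offset indexes), which enable row-group- and
# page-level pruning. Columns and file name are illustrative.
import pyarrow as pa
import pyarrow.parquet as pq

table = pa.table({
    "event_date": ["2024-01-01", "2024-01-02", "2024-01-03"],
    "amount": [10.5, 20.0, 7.25],
})

pq.write_table(
    table,
    "events.parquet",
    write_statistics=True,      # per-row-group, per-column min/max
    write_page_index=True,      # page-level column/offset indexes
    row_group_size=128 * 1024,  # rows per row group; tune to your data
)
```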
Another aspect is storage optimization: the ability to skip / prune files is intrinsically tied to how well the table's storage is optimized. If the table is neither clustered nor partitioned, or if it has sub-optimally sized files, then any engine's ability to skip files or subsets thereof will be severely impacted.
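As a rough illustration of the file-sizing point, here's a sketch using pyarrow's dataset API on a plain Parquet directory (paths and row counts are made up; an actual Iceberg table would be compacted through its catalog/engine rather than by rewriting files directly):

```python
# Sketch only: compact many small Parquet files into fewer, larger ones
# so an engine has well-sized files and row groups to prune against.
import pyarrow.dataset as ds

small_files = ds.dataset("data/events_raw/", format="parquet")

ds.write_dataset(
    small_files,
    "data/events_compacted/",
    format="parquet",
    max_rows_per_file=4_000_000,   # aim for a few hundred MB per file
    max_rows_per_group=1_000_000,  # keep row groups at a prunable size
)
```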
I would be very curious if you can find a query on an Iceberg table that shows a better partition elimination rate in a different system.
Supporting Iceberg eventually means people leaving you because they have something better elsewhere, but this is bidirectional: it also means you can welcome people from Databricks because you have feature parity.
It's not going to scale as well as Snowflake, but it gets you into an Iceberg ecosystem which Snowflake can ingest and process at scale. Analytical data systems are typically trending toward heterogeneous compute with a shared storage backend -- you have large, autoscaling systems to process the raw data down to something that is usable by a smaller, cheaper query engine supporting UIs/services.
Different parts of the analytical stack have different performance requirements and characteristics. Maybe none of your stack needs it and so you never need Snowflake at all.
More likely, you don't need Snowflake to process queries from your BI tools (Mode, Tableau, Superset, etc.), but you do need it to prepare data for those BI tools. It's entirely possible that you have hundreds of terabytes, if not petabytes, of input data that you want to pare down to < 1 TB datasets for querying, and Snowflake can chew through those datasets. There are also third-party integrations and things like ML tooling that you need to consider.
You shouldn't really consider analytical systems the same as a database backing a service. Analytical systems are designed to funnel large datasets that cover the entire business (cross-cutting services and any sharding you've done) into subsequently smaller datasets that are cheaper and faster to query. And you may be using different compute engines for different parts of these pipelines; there's a good chance you're not using only Snowflake but Snowflake and a bunch of different tools.
When we first developed pg_lake at Crunchy Data and defined GTM, we considered whether it could be a Snowflake competitor, but we quickly realised that did not make sense.
Data platforms like Snowflake are built as a central place to collect your organisation's data, do governance, large scale analytics, AI model training and inference, share data within and across orgs, build and deploy data products, etc. These are not jobs for a Postgres server.
Pg_lake foremost targets Postgres users who currently need complex ETL pipelines to get data in and out of Postgres, and accidental Postgres data warehouses where you ended up overloading your server with slow analytical queries, but you still want to keep using Postgres.
For testing, we at least have a Dockerfile to automate the setup of the pgduck_server and a MinIO instance, so it Just Works™ once the extensions are installed in your local Postgres cluster.
The configuration mainly involves defining the default Iceberg location for new tables, pointing it to the pgduck_server, and providing the appropriate auth/secrets for your bucket access.
This feels like a contentless article. They gave a statistic and crafted a narrative based on one person’s experience, which leaves me with many more questions than answers.
Agreed, a very shallow article with a few personal opinions. A little breather for Austin proper may be a good thing; housing prices and rents have come down significantly from a few years ago. The infrastructure is presently a mess and will be for the next few years, along with the airport expansion. But the surrounding areas are still growing quickly, and there is no shortage of interesting startups in the area. The one obvious thing the article misses is the weather, which simply is not for everyone.
This is why one should never waste time clicking on links without numbers, specifically the change in the distribution of the data (deciles/quintiles), or at least the median.
If there are no numbers about the distribution, there can be no evidence for the article's claim, hence it is useless and meant to evoke emotion.
I’m a big fan of chezmoi (https://www.chezmoi.io/) which is a very capable dotfile manager. Chezmoi supports some useful advanced capabilities like work/home profiles and secrets manager integration.
Same for me. I'd done the same thing as the author with various methods like stow, symlink farms, etc. over the years. Chezmoi is good enough that I'm willing to let someone else handle maintaining all the logic.
Yup, I tried a number of dotfile managers. I think yadm was the first one I started with and then ended up with chezmoi.
The main reason was that I discovered the power of templating. With yadm it required an external dependency, first envtpl, then j2cli, and both of these became unmaintained, while chezmoi uses the Go text/template standard library. After the task of converting my jinja2 templates to gotmpl I never looked back.
One of the other things I like about chezmoi is that I cut my "scripts" down to just a few, as most of the logic became "deterministic", i.e. I set conditions based on the host in chezmoi.toml.tmpl, and that defines how everything under it runs across multiple hosts and devices.
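For anyone curious what that looks like, here is a minimal sketch (the hostname, file names, and values are made up; `.chezmoi.hostname` and config-defined data like `.work` are standard chezmoi template inputs):

```
# ~/.local/share/chezmoi/.chezmoi.toml.tmpl (sketch)
{{- $work := eq .chezmoi.hostname "work-laptop" -}}
[data]
    work = {{ $work }}

# ~/.local/share/chezmoi/dot_gitconfig.tmpl (sketch)
[user]
    name = Jane Doe
{{- if .work }}
    email = jane@company.example
{{- else }}
    email = jane@home.example
{{- end }}
```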
I migrated to chezmoi recently. My only gripe is `chezmoi cd` opening in a new shell, but `chezmoi git` is usually what I need. The age [0] integration is nice.
I added an alias `cm='cd $(chezmoi source-path)'` to my shell config to cd to the chezmoi directory (without opening a new shell) so I can use all the usual commands (e.g. git) without needing the chezmoi prefix. The alias is in a chezmoi-managed file, naturally.
Hey, I had never heard about chezmoi before reading your comment, but I just installed it. Took less than 10 minutes to set up from start to finish. I noticed that if you choose to use it to manage your `~/.ssh/config/`, by default chezmoi sets it up as `private_dot_ssh/` and so if your dotfiles are public it doesn't expose sensitive data like private key files such as `~/.ssh/id_rsa`. Smart!
The private_ prefix only applies to file permissions, so in this case it makes the .ssh directory readable only by the owner. This is checked for by OpenSSH, and the config will be ignored if it's readable by the group or all.
If you make your dotfiles repo publicly accessible, you will leak your private keys unless you use other features in chezmoi to protect them.
Also a big fan of it, because the templating feature makes it very easy to handle dotfiles that live in different locations on multiple machines, or if you use multiple operating systems. There really aren't that many tools around that have good Windows support.
You jest, but with an ESP32 flashed with ESPHome and a few dollars of electronics to regulate the power, I think controlling model trains would actually be quite doable. Your biggest challenge is probably dealing with network/scripting latency for events that need to happen in quick succession, like when dealing with switches.
Edit: there's also this https://github.com/aaron9589/esphome-for-model-railroading for the more serious model railroad enthusiast, though I'm not 100% sure if that actually controls the trains themselves (or just the switches and lights)
That's why I try as hard as I can to find either truly "dumb" devices with mechanical switches rather than momentary buttons, or devices that remember their last state after AC power is restored. It's hard to figure out the second option without trying it, though, unless a review happens to mention it specifically.
Awesome. The Home Assistant and related (ESPHome, Voice Assistant, Music Assistant) communities are amazing, and there is just a crazy number of projects one can pick up and use.
In my opinion, psql is the "perfect" terminal database client. It's fast, has the ability to easily switch between wide and narrow row formats, and has commands for the things I do frequently.
I have often wanted psql to work with other databases, because the other CLI clients are either bare-bones, or just plain unusable.
I heard a story about an Intel fab in Arizona that would always produce bad silicon at a certain time of day. After some investigation it was determined that a train passed by at that time every day, causing enough seismic activity to disrupt the manufacturing process.
Actually, I love this. I remember having a teeny tiny phone back then (not quite Zoolander-sized), and it was a lot of fun. Now everything is a boring rectangle of glass. Bring back the fun!
Have you tried any of the "human or Dall-E" tests?
How did you score?
I only scored as well as I did because I knew the kind of stylistic choices to look out for. In terms of "quality" I really don't understand how you've reached this conclusion.