Amundsen has two databases and three services in its architecture diagram. For me, that's a smell: you now risk inconsistency between the two databases, and you may have to learn how to tune both Elasticsearch and Neo4j...
Versus the conceptually simpler "one binary, one container, one storage volume/database" model.
I acknowledge it's a false choice and a semi-silly thing to fixate on (how do you perf-tune ingestion queue problems vs write problems vs read problems for a Go binary?)...
But, like, I have 10 different systems I'm already debugging.
Adding another one, like a data catalog that is supposed to make life easier, and discovering I now have 5-subsystems-in-a-trenchcoat that I might need to debug, means I'm spending even more time babysitting the metadata manager rather than doing data engineering _for the business_.
Off topic, but the Prometheus Pushgateway is such a bad implementation (once you push metrics, they stay there until it's restarted; a counter doesn't increase, each push just replaces the metric with the new value) that we had to write our own metrics collector endpoint.
That is literally how it is supposed to work. Prometheus scrapes metrics; that is the model. If you for some reason find yourself unable to host an endpoint with metrics, you can use the Pushgateway as a fallback and push metrics there, where yes, they will stay until the gateway is restarted. Ask yourself how it could ever work if they were deleted after being read: how would multiple Prometheus agents be able to read from the same source?
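For context, the normal pull setup is just a long-lived process exposing /metrics and letting Prometheus scrape it on its own schedule. A minimal sketch with the Python prometheus_client (metric and port names here are illustrative):

    import random
    import time

    from prometheus_client import Counter, start_http_server

    # Normal pull model: the process stays up, exposes /metrics on :8000,
    # and Prometheus scrapes it whenever it wants.
    REQUESTS = Counter("myapp_requests_total", "Requests handled by this process")

    if __name__ == "__main__":
        start_http_server(8000)      # serves http://localhost:8000/metrics
        while True:
            REQUESTS.inc()           # the counter accumulates in-process
            time.sleep(random.random())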
It sounds like you are using it for the wrong job. It's meant for batch jobs / short-running processes that don't stay up long enough to expose a /metrics endpoint for Prometheus to scrape, and there you want exactly that kind of behavior.
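A minimal sketch of that intended use, assuming the Python prometheus_client and a Pushgateway on localhost:9091 (job and metric names are made up):

    from prometheus_client import CollectorRegistry, Gauge, push_to_gateway

    # Short-lived batch job: it is gone before Prometheus ever scrapes it,
    # so it pushes its final state to the gateway, which holds it for scraping.
    registry = CollectorRegistry()
    last_success = Gauge("batch_last_success_unixtime",
                         "Last time the batch job finished successfully",
                         registry=registry)
    rows = Gauge("batch_rows_processed", "Rows processed in the last run",
                 registry=registry)

    rows.set(12345)
    last_success.set_to_current_time()

    # The pushed group replaces the previous one for this job name; it does not
    # accumulate, which is exactly the behavior complained about above.
    push_to_gateway("localhost:9091", job="nightly_batch", registry=registry)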
The Pushgateway is itself a horrible hack around the fact that Prometheus is designed only for scraping metrics. Unfortunately the whole ecosystem around it is an utter mess.
Remote Write is a viable alternative in Prometheus and its drop-in replacements. I'm not a massive fan of it myself, as I feel the pull-based approach is superior overall, but I still make heavy use of it.
The Pushgateway's documentation itself calls out that there are only very limited circumstances where it makes sense.
I personally only used it in $old_job and only for batch jobs that could not use the node_exporter's textfile collector. I would not use it again and would even advise against it.
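For comparison, the textfile-collector route is just writing a .prom file into a directory that node_exporter watches (whatever you pointed --collector.textfile.directory at). A sketch with the Python prometheus_client; the path and metric names are illustrative:

    from prometheus_client import CollectorRegistry, Gauge, write_to_textfile

    # Batch job on a host that is already scraped via node_exporter:
    # drop the metrics into the textfile collector directory instead of
    # pushing them anywhere.
    registry = CollectorRegistry()
    duration = Gauge("backup_duration_seconds", "How long the backup took",
                     registry=registry)
    duration.set(87.3)

    # write_to_textfile writes to a temp file and renames it, so node_exporter
    # never sees a half-written file.
    write_to_textfile("/var/lib/node_exporter/textfile/backup.prom", registry)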
> "But I got it all working; now I can finally stop explaining to my boss why we need to re-structure the monitoring stack every year."
Prometheus and Grafana have been progressing in their own ways, each trying to build a full-stack solution, and then the OTEL thingy came along and ruined the party for everyone.
I think OTEL has made things worse for metrics. Prometheus was so simple and clean before the long journey toward OTEL support began. Now Prometheus is much more complicated:
- all the delta-vs-cumulative counter confusion (sketched below)
- push support for Prometheus, and the resulting out-of-order errors
- the {"metric_name"} syntax changes in PromQL
- resource attributes and the new info() function needed to join them
I just don’t see how any of these OTEL requirements make my day-to-day monitoring tasks easier. Everything has only become more complicated.
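To make the first bullet above concrete, here is a plain-Python toy sketch of the delta-vs-cumulative issue (made-up numbers, no libraries): Prometheus counters are cumulative, OTLP also allows delta temporality, and a cumulative backend has to reconstruct the running total itself.

    # One counter, four reporting intervals.
    cumulative = [10, 25, 40, 55]   # Prometheus-native: each sample is the running total
    deltas     = [10, 15, 15, 15]   # delta temporality: each sample is only the increase

    # A cumulative backend ingesting deltas must re-accumulate them, and a lost,
    # duplicated, or out-of-order delta silently corrupts the reconstructed total.
    total, reconstructed = 0, []
    for d in deltas:
        total += d
        reconstructed.append(total)
    assert reconstructed == cumulative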
I still haven't got my head around how OTEL fits into a good open-source monitoring stack. Afaik, it is a protocol for metrics, traces, and logs. And we want our open-source monitoring services/dbs to support it, so they become pluggable. But, afaik, there's no one good DB for logs and metrics, so most of us use Prometheus for metrics and OpenSearch for logs.
Does OTEL mean we just need to replace all our collectors (like logstash for logs and all the native metrics collectors and pushgateway crap) and then reconfigure Prometheus and OpenSearch?
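Roughly, yes on the instrumentation/collector side: the app speaks the OTel API/OTLP, and the backend either ingests OTLP directly or you bridge it. A minimal sketch, assuming the opentelemetry-sdk and opentelemetry-exporter-prometheus Python packages, of an app instrumented with OTel but still scraped by an unchanged Prometheus:

    from prometheus_client import start_http_server
    from opentelemetry import metrics
    from opentelemetry.sdk.metrics import MeterProvider
    from opentelemetry.exporter.prometheus import PrometheusMetricReader

    # Instrument with the OTel API, but expose the result as a plain
    # Prometheus /metrics endpoint so the existing scrape config still works.
    start_http_server(8000)
    reader = PrometheusMetricReader()
    metrics.set_meter_provider(MeterProvider(metric_readers=[reader]))

    meter = metrics.get_meter("demo")
    requests = meter.create_counter("app_requests", description="Handled requests")
    requests.add(1, {"route": "/"})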
Logs, spans, and metrics are all stored as time-stamped stuff. Sure, simple fixed-width columnar storage is faster, and it makes sense to special-case numbers (add downsampling, aggregations, histogram maintenance, and whatnot), but any write-optimized storage engine can handle this; it's not the hard part (basically LevelDB, and if there's a need for scaling out it'll look like Cassandra, Aerospike, ScyllaDB, or ClickHouse ... see also https://docs.greptime.com/user-guide/concepts/data-model/ and the specialized storage engines at https://docs.greptime.com/reference/about-greptimedb-engines... )
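As a toy illustration of that claim (pure Python, not a real engine): samples are just rows keyed by (series, timestamp), kept in sorted key order the way an LSM tree keeps its SSTables, and reads are range scans over that key order.

    import bisect
    import time

    class ToyTSDB:
        """Samples keyed by (series, timestamp), kept in sorted key order."""

        def __init__(self):
            self._rows = []  # sorted list of ((series_key, ts), value)

        def append(self, series_key, ts, value):
            bisect.insort(self._rows, ((series_key, ts), value))

        def range_query(self, series_key, start_ts, end_ts):
            # Range scan over the sorted key space, like an SSTable iterator.
            lo = bisect.bisect_left(self._rows, ((series_key, start_ts), float("-inf")))
            hi = bisect.bisect_right(self._rows, ((series_key, end_ts), float("inf")))
            return [(key[1], value) for key, value in self._rows[lo:hi]]

    db = ToyTSDB()
    now = int(time.time())
    db.append('http_requests_total{job="api"}', now, 42)
    db.append('http_requests_total{job="api"}', now + 15, 57)
    print(db.range_query('http_requests_total{job="api"}', now, now + 60))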
I think the answer is that it doesn't fit any definition of a _good_ monitoring stack, but we are stuck with it. It has largely become the blessed protocol, specification, and standard for OSS monitoring, along every axis (logging, tracing, collecting, instrumentation, etc)... it's a bit like the efforts that resulted in J2EE and EJBs back in the day, only more diffuse and with more varied implementations.
And we don't really have a simpler alternative in sight... at least in the Java days there was the disgust and the reaction via Struts, Spring, EJB3+, and of course other languages and communities.
Not sure exactly how we got into such an over-engineered mono-culture in terms of operations, monitoring, and deployment for 80%+ of the industry (k8s + graf/loki/tempo + endless supporting tools or flavors), but it is really a sad state.
Then you have endless implementations handling bits and pieces of various parts of the spec, and of course you have the tools to actually ingest and analyze and report on them.
For filters, create a set of pre-defined tags and let the LLM choose one of your pre-defined tags based on the paper's summary.
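A hypothetical sketch of that idea; call_llm stands in for whatever client you actually use and is not a real library function, and the tag list is made up:

    ALLOWED_TAGS = ["databases", "ml-systems", "security", "networking", "compilers"]

    def tag_paper(summary: str, call_llm) -> str:
        # Constrain the model to a fixed vocabulary and validate its answer,
        # so the filter set stays closed no matter what the model returns.
        prompt = (
            "Pick exactly one tag for the paper summary below. "
            f"Answer with only one of: {', '.join(ALLOWED_TAGS)}.\n\n{summary}"
        )
        answer = call_llm(prompt).strip().lower()
        return answer if answer in ALLOWED_TAGS else "untagged"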