I've been thinking about something like this from a UI perspective. I'm a UX designer working on a product with a fairly legacy codebase. We're vibe coding prototypes and moving towards making it easier for devs to bring in new components. We have a hard enough time verifying UI quality as it is, and having more devs vibing on frontend code is probably going to make it a lot worse. I'm thinking about something like having agents regularly traverse the code to identify non-approved components (and either fix or flag them). Maybe with this we won't fall even further behind on verification debt than we already are.
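To make that idea a bit more concrete, the kind of scan I have in mind is roughly the sketch below. It assumes components come from a single design-system package and that the approved list lives in a JSON file; the `@acme/design-system` name and `approved-components.json` are placeholders, not anything real.

```typescript
// Rough sketch of the "flag non-approved components" idea.
// Assumptions (all hypothetical): components are imported from an
// "@acme/design-system" package and the approved list lives in
// approved-components.json as a string array.
import * as fs from "fs";
import * as path from "path";

const APPROVED: Set<string> = new Set(
  JSON.parse(fs.readFileSync("approved-components.json", "utf8"))
);

// Matches e.g. `import { Button, Card } from "@acme/design-system"`.
const IMPORT_RE = /import\s*\{([^}]+)\}\s*from\s*["']@acme\/design-system["']/g;

function* walk(dir: string): Generator<string> {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) yield* walk(full);
    else if (/\.(tsx|jsx)$/.test(entry.name)) yield full;
  }
}

for (const file of walk("src")) {
  const source = fs.readFileSync(file, "utf8");
  for (const match of source.matchAll(IMPORT_RE)) {
    const names = match[1]
      .split(",")
      .map((n) => n.trim().split(/\s+as\s+/)[0])
      .filter(Boolean);
    for (const name of names) {
      if (!APPROVED.has(name)) {
        // An agent could open a ticket or PR here; this just flags it.
        console.warn(`${file}: non-approved component "${name}"`);
      }
    }
  }
}
```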
For context, I'm a UX Designer at a low-code company. LLMs are great at cranking out prototypes using well-known React component libraries, but lesser-known low-code syntax takes more work. We made an MCP server that helps a lot, but what I'm working on now is a set of steering docs to generate components and prototypes that are "backwards compatible" with our bespoke front-end language. This way our vibe prototyping has our default look out of the box and translates more directly to production code. https://github.com/pglevy/sail-zero
Our low-code expression language is not well-represented in the pre-training data, so as a baseline we get lots of syntax errors and really bad-looking UIs. But we're getting much better results by setting up our design system documentation as an MCP server. Our docs include curated guidance and code samples, so when the LLM uses the server, it's able to more competently search for things and call the relevant tools. With this small but high-quality dataset, the output also looks better than some of our experiments with fine-tuning. I imagine this could work for other docs use cases that are more dynamic (i.e., we're actively updating the docs, so having the LLM call APIs for what it needs seems more appropriate than a static RAG setup).
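To give a rough idea of the shape (not our actual server), a docs-search tool on an MCP server could look something like this with the TypeScript MCP SDK; the tool name and the `searchDocs` helper are placeholders for whatever index sits over the curated guidance:

```typescript
// Minimal sketch of a design-system docs MCP server (illustrative only).
// Assumes the TypeScript MCP SDK; searchDocs() is a placeholder for a
// small search index over the curated guidance and code samples.
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

async function searchDocs(query: string): Promise<string> {
  // Placeholder: in practice this would return guidance plus code samples.
  return `No real index wired up; you searched for: ${query}`;
}

const server = new McpServer({ name: "design-system-docs", version: "0.1.0" });

server.tool(
  "search_design_docs",
  "Search curated design-system guidance and code samples",
  { query: z.string() },
  async ({ query }) => ({
    content: [{ type: "text", text: await searchDocs(query) }],
  })
);

await server.connect(new StdioServerTransport());
```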
Not an engineer, but I think this is where my mind was going after reading the post. Seems like what will be useful is continuously generated "decision documentation," so the system has access to what has come before in a dynamic way. (Like some mix of RAG with a knowledge graph + MCP?) Maybe even pre-outlining "decisions to be made," so if an agent is checking in, it could see there is something that needs to be figured out but hasn't been done yet.
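Something like a lightweight decision record that agents can both read and append to; the fields and statuses below are just a guess at the shape, not an existing format:

```typescript
// Sketch of what a "decision record" an agent could read/write might
// look like; fields and statuses are illustrative, not a real schema.
type DecisionStatus = "open" | "decided" | "superseded";

interface DecisionRecord {
  id: string;
  title: string;
  status: DecisionStatus; // "open" = flagged but not yet figured out
  context: string;        // why the decision exists
  decision?: string;      // filled in once it's made
  supersedes?: string[];  // links to earlier records (knowledge-graph edges)
  relatedCode?: string[]; // files or modules the decision touches
}

// An agent checking in could query for status === "open" and see that
// something still needs to be figured out.
const pendingDecision: DecisionRecord = {
  id: "DR-042",
  title: "Pick a pattern for cross-component state",
  status: "open",
  context: "Two prototypes currently do this differently.",
  relatedCode: ["src/flows/checkout/", "src/flows/onboarding/"],
};
```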
I actually have an "LLM as a judge" loop on all my codebases. I have an architecture panel that debates improvements given an optimization metric and convergence criteria, and I feed their findings into a deterministic spec generator (CUE w/ validation) that can emit unit/e2e tests and scaffold Terraform. It's pretty magical.
This CUE spec gets decomposed into individual tasks by an orchestrator that does research per ticket and bundles it.
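In the abstract, the debate/convergence part is roughly the loop below; `askPanel` is a stand-in for however the panel of models is actually invoked, and the scoring here is faked just to show the shape:

```typescript
// Very rough sketch of a panel/convergence loop, not the actual setup.
interface PanelVerdict {
  proposal: string; // proposed architecture change
  score: number;    // judged value of the optimization metric
}

// Placeholder: in practice this would call the model panel; here it just
// echoes the proposal with a made-up score so the loop shape is visible.
async function askPanel(proposal: string, metric: string): Promise<PanelVerdict> {
  return { proposal: `${proposal} (refined for ${metric})`, score: Math.random() };
}

async function debate(
  initialProposal: string,
  metric: string,
  epsilon = 0.01, // convergence criterion: stop when improvement is tiny
  maxRounds = 5
): Promise<PanelVerdict> {
  let current = await askPanel(initialProposal, metric);
  for (let round = 1; round < maxRounds; round++) {
    const next = await askPanel(current.proposal, metric);
    if (next.score - current.score < epsilon) return next; // converged
    current = next;
  }
  return current; // findings then feed the deterministic spec generator
}
```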
Mine is a much simpler use case but sharing in case it's useful. I wanted to be able to quickly generate and iterate on user flows during design collaboration. So I use some boilerplate HTML/CSS and have the LLM generate an "outline" (basically a config file) and then generate the HTML from that. This way I can make quick adjustments in the outline and just have it refresh the code when needed, to avoid too much back and forth with the chat.
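For a sense of what the "outline" is, it's roughly a small config like this; the field names are illustrative rather than the exact format, and the real thing could just as easily be YAML or JSON:

```typescript
// Illustrative shape of the flow "outline" the LLM generates first.
interface FlowStep {
  id: string;
  title: string;
  elements: string[]; // which boilerplate HTML/CSS blocks to drop in
  next?: string;      // id of the step this one links to
}

interface FlowOutline {
  name: string;
  steps: FlowStep[];
}

const signupFlow: FlowOutline = {
  name: "Sign-up flow",
  steps: [
    { id: "email", title: "Enter email", elements: ["text-input", "primary-button"], next: "verify" },
    { id: "verify", title: "Verify code", elements: ["code-input", "primary-button"], next: "done" },
    { id: "done", title: "Welcome", elements: ["hero", "secondary-button"] },
  ],
};
// "/refresh" would regenerate the HTML pages from this outline.
```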
Overall, it has been working pretty well. I did make a tweak I haven't pushed yet so it always writes the outline to a file first (instead of just to the terminal). And I've also started adding slash commands to the instructions so I can type things like "/create some flow" and then just "/refresh" (instead of "pardon me, would you mind refreshing that flow now?").
My use case is a little different (mostly prototyping and building design ops tools) but +1 to this flow.
At this point, I typically do an LLM-readme at the branch level to document both planning and progress. At the project level I've started having it dump (and organize) everything in a work-focused Obsidian vault. This way I end up with cross-project resources in one place, it doesn't bloat my repos, and it can be used by other agents from where it is.
From my understanding of Simon's project, it only supports OpenAI and OpenAI-compatible models, plus local models. For example, if I wanted to use a model on Amazon Bedrock, I'd first have to deploy (and manage) a gateway/proxy layer[1] to make it OpenAI-compatible.
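To make the gateway point concrete: with something like a LiteLLM proxy in front of Bedrock, the client side is just an OpenAI-compatible client pointed at the proxy's URL. The base URL, key, and model alias below are placeholders, not real config:

```typescript
// Example of the proxy pattern: an OpenAI-compatible client pointed at a
// locally running gateway (e.g. a LiteLLM proxy) that fronts Bedrock.
// The base URL, API key, and model alias are placeholders.
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:4000/v1", // the gateway, not api.openai.com
  apiKey: "anything-the-proxy-accepts",
});

const response = await client.chat.completions.create({
  model: "bedrock-claude", // whatever alias the proxy maps to the Bedrock model
  messages: [{ role: "user", content: "Hello from behind the gateway" }],
});

console.log(response.choices[0].message.content);
```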
Mozilla's project already boasts a lot of existing provider interfaces, much like LiteLLM, which has the benefit of being able to use a wider range of supported models directly.
> No Proxy or Gateway server required so you don't need to deal with setting up any other service to talk to whichever LLM provider you need.
As for how it compares to LiteLLM, I don't have enough experience with either to tell.