Hacker News | sams99's comments

For those interested, edit is a surprisingly difficult problem. It seems easy on the surface, but you are fighting both fine-tuning and real-world hallucinations. I implemented one this week in:

https://github.com/samsaffron/term-llm

It is about my 10th attempt at the problem, so I am aware of a lot of the edge cases. A very interesting bit of research is here:

https://gist.github.com/SamSaffron/5ff5f900645a11ef4ed6c87f2...

Fascinating read.
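To make the hallucination problem concrete, here is a minimal sketch of one common shape for an edit tool: an exact search/replace that refuses to guess. This is not the term-llm implementation; `apply_edit` and `EditError` are hypothetical names made up for illustration, and real tools usually layer fuzzier matching on top.

```python
class EditError(Exception):
    """Raised when an edit cannot be applied safely."""


def apply_edit(text: str, search: str, replace: str) -> str:
    """Replace exactly one occurrence of `search` in `text`.

    Hypothetical helper: it rejects the edit instead of guessing when the
    search text is missing or matches more than once.
    """
    if not search:
        raise EditError("search text is empty")

    count = text.count(search)
    if count == 0:
        # Typical hallucination case: the model quoted text that is not in the file.
        raise EditError("search text not found")
    if count > 1:
        # Ambiguous edit: the model did not give enough context to pick one match.
        raise EditError(f"search text is ambiguous ({count} matches)")

    return text.replace(search, replace, 1)
```

Rejecting missing or ambiguous matches and feeding the error back to the model is the simple option; whitespace differences and slightly misquoted lines are where the harder edge cases start.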


Codex requires stuffing a very specific system prompt, otherwise the custom endpoint will reject you.



Author here. Thanks heaps for the discussion; I replied to a few of the points in my blog comments:

https://discuss.samsaffron.com/t/your-vibe-coded-slop-pr-is-...


Qwen coder 32b with a JavaScript interpreter

Impressive answer for a model that can run on your own computer

https://discuss.samsaffron.com/discourse-ai/ai-bot/shared-ai...


Thanks for sharing. Your blog looks like an old forum board.


I find it odd that it refused me so badly: https://discuss.samsaffron.com/discourse-ai/ai-bot/shared-ai... My guess is that it is because I am using a quantized model.

For some reason it simply did not want to use XML tools, something that even Qwen coder does not struggle with: https://discuss.samsaffron.com/discourse-ai/ai-bot/shared-ai...

I have not seen any model, including Sonnet, that is able to one-shot a working 9x9 Go board.

For reference, here is gpt-4o, which is still quite bad: https://discuss.samsaffron.com/discourse-ai/ai-bot/shared-ai...
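For anyone unfamiliar with "XML tools": the model is asked to emit its tool call as an XML fragment inside its reply, and the host parses it out. A rough sketch of the parsing side follows. The tag names (`tool_call`, `name`, `parameters`) and the `parse_tool_call` helper are assumptions for illustration, not the format used in the conversations linked above.

```python
import xml.etree.ElementTree as ET

OPEN_TAG = "<tool_call>"
CLOSE_TAG = "</tool_call>"


def parse_tool_call(reply: str):
    """Return (tool_name, params) from a model reply, or None if there is no valid call.

    Hypothetical format: <tool_call><name>..</name><parameters>..</parameters></tool_call>
    """
    start = reply.find(OPEN_TAG)
    end = reply.find(CLOSE_TAG)
    if start == -1 or end == -1 or end < start:
        # The model answered in plain prose and never emitted a tool call.
        return None

    fragment = reply[start:end + len(CLOSE_TAG)]
    try:
        root = ET.fromstring(fragment)
    except ET.ParseError:
        # Malformed XML, for example unescaped "<" inside a code parameter.
        return None

    name = root.findtext("name")
    if not name:
        return None

    params = {}
    params_el = root.find("parameters")
    if params_el is not None:
        for child in params_el:
            params[child.tag] = child.text or ""
    return name.strip(), params
```

A model that "does not want to use XML tools" is one whose replies keep landing in the first `None` branch: it writes prose or refuses instead of emitting the fragment.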


The original was posted at work earlier this week. To me, the original missed a bit around explaining what this tech is good at... https://meta.discourse.org/discourse-ai/ai-bot/shared-ai-con...


Highly Gamed === It is better if users with slow devices see a white screen for 30 seconds vs an indication that something is happening, because ... reasons?


You missed the point, which is that it's better if users with slow devices see actually useful content rather than a splash screen.


For those looking for a Rubyish approach to this, see: https://github.com/discourse/mini_sql



This is something we are investigating. We would like it to interoperate with the wider ecosystem.

