Hacker News

Using APIs that don't exist is the biggest problem I've seen with ChatGPT, and it seems GPT-4 as well.


I asked ChatGPT about the API for an old programming game called ChipWits. It invented a whole programming language it called "chiptalk": an amalgam of the original ChipWits features, missing some bits and adding others. It even generated a parser for it, which I implemented and got to work, before figuring out how much was imaginary after talking to the original ChipWits devs. They found it pretty amusing.


> and got to work

Can you elaborate?


I'm learning Django quickly, and even though it's an extremely well-documented space, ChatGPT has sent me down the wrong path more than a handful of times.

This is especially difficult because I don't know when it's wrong, and it's so damn confident. I've gotten better at questioning its correctness when the code doesn't perform as expected, but initially it cost me upwards of 30 minutes each time.

Still, I would say between ChatGPT and Copilot - I'm WAY further ahead.


ChatGPT or GPT-4?

Public Copilot uses GPT-3.5, as does non-premium ChatGPT.


My biggest problem with it is that it doesn't seem to understand its own knowledge. If you go back and forth on a coding problem for a while, it will often suddenly start using syntax that doesn't exist, even though by that point it has responded correctly many times and should know that syntax can't possibly exist. In human terms, it has read the documentation and must know the syntax is wrong, yet it doesn't know that ten seconds later. That's currently what makes it seem like not a real intelligence to me.


One of the advantages of Bing, and I'd guess now ChatGPT with the browsing plugin, is that it's able to search the web for the right API.


To be fair, using APIs that I think should exist is how I develop most of my APIs.


Except that I wasn't asking it to develop a new API.


It's very likely it was using other languages' APIs as "inspiration", given there's very little Zig code out there... so it's perhaps natural that it would use APIs that don't yet exist. Maybe informing it that it also needs to implement those APIs would work?


Then I guess you're not using it to its fullest potential ;)


We can’t keep blaming the prompter.


A simple confidence metric could do the trick. As the model grows larger it gets harder to understand what is going on, but that doesn't mean it needs to be a total black box. At least let it expose some proxy metrics; in due course we will learn to interpret those metrics and adjust our internal trust model.


You can just ask it to give you its confidence in the output on a scale of 0 to 1.
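A minimal sketch of what "ask it for a score" could look like in practice: append a self-rating instruction to the prompt and parse the number back out of the reply. The prompt wording and the `Confidence:` line format are assumptions, not any real product's API, and a self-reported score is of course not a calibrated probability.

```python
import re
from typing import Optional

def confidence_prompt(question: str) -> str:
    # Wrap the user's question with an instruction asking the model
    # to self-rate its answer on a final line (hypothetical format).
    return (
        f"{question}\n\n"
        "After your answer, on a final line write 'Confidence: X' "
        "where X is a number between 0 and 1."
    )

def parse_confidence(reply: str) -> Optional[float]:
    # Pull the self-reported score out of the reply, if present.
    m = re.search(r"Confidence:\s*([01](?:\.\d+)?)", reply)
    return float(m.group(1)) if m else None
```

For example, a reply ending in `Confidence: 0.4` parses to `0.4`, while a reply with no score line returns `None`, so you can flag unscored answers for extra scrutiny.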


I wonder if a plugin to let it query API docs would solve this problem.
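One piece such a plugin would need, sketched here with an entirely hypothetical API index: check every call the model emitted against the documented API surface and surface the ones that don't exist, rather than trusting the generated code blindly. The `KNOWN_API` entries below are stand-in Django names for illustration only.

```python
# Hypothetical documented API surface; a real plugin would build this
# by querying the actual API docs rather than hard-coding names.
KNOWN_API = {"QuerySet.filter", "QuerySet.exclude", "QuerySet.annotate"}

def unknown_calls(generated_calls):
    # Return the calls the model used that are not in the docs,
    # i.e. the likely hallucinations worth double-checking.
    return [c for c in generated_calls if c not in KNOWN_API]
```

So `unknown_calls(["QuerySet.filter", "QuerySet.frobnicate"])` would flag only `QuerySet.frobnicate`, and an empty result means every call at least appears in the documentation.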


Also it makes up Python libraries, macOS apps to do certain tasks, etc.



