
> What if you just... removed the AI part?

Maybe I'm not fully understanding the approach, but it seems like if you started relying on third-party MCP servers without the AI layer in the middle, you'd quickly run into backcompat issues. Since MCP servers assume they're being called by an AI, they have the right to make breaking changes to the tools, input schemas, and output formats without notice.


Exactly my thoughts after reading the article. I am surprised that so few have pointed this out, because it entirely invalidates the article's conclusion for any serious usage. To stay with the USB-C example: it's like plugging a toaster into a monitor, but the toaster changes its communication protocol every time it gets reconnected.


Yes! Once the first integration is done, it will be static unless someone manually changes it.

Maybe the author is okay with that and just wants new APIs (for his toaster).


The state already puts a lot of resources into improving road safety, and that's one of the primary goals of the DMV, DOT, etc. There's reason to believe that driverless cars will greatly improve road safety, so it feels reasonable that the state would allow their development and have a framework for responsibly expanding their use.


Interestingly, the ChatGPT Plugin docs [1] say that POST operations like these are required to implement user confirmation, so you might blame the plugin implementation (or OpenAI's non-enforcement of the policy) in this case:

> for POST requests, we require that developers build a user confirmation flow to avoid destruction actions

However, at least from what I can see, the docs don't provide much more detail about how to actually implement confirmation. I haven't played around with the plugins API myself, but I originally assumed it was a non-AI-driven technical constraint, maybe a confirmation modal that ChatGPT always shows to the user before any POST. From a forum post I saw [2], though, it looks like ChatGPT doesn't have any system like that, and you're just supposed to write your manifest and OpenAPI spec in a way that tells ChatGPT to confirm with the user. From the forum post, it sounds like this is pretty fragile, and of course is susceptible to prompt injection as well.

[1] https://platform.openai.com/docs/plugins/introduction

[2] https://community.openai.com/t/implementing-user-confirmatio...


This might be an intentional interpretation by the plugin authors.

Meaning they potentially took the reasoning "in order to prevent destruction actions" to inversely mean that non-destructive POST requests must be OK and do not require a prompt. There are plenty of POST search APIs out there to get around path-length limitations and such.

That is probably not the intended meaning, but it's a valid enough, if somewhat tongue-in-cheek, we-will-do-as-we-please-following-the-letter-only implementation. And as the author found, even creative and not destructive actions can be surprising and unwanted. But isn't this what AI would ultimately be about?


Why would it not be the intended meaning? If they wanted it to be all POST requests they would have said so; they specifically scoped it to "destructive actions", so their intention is in their words. POST as a verb can be used for pretty much anything: retrieval, creation, deletion, updates, noops. It's just code; it does whatever we tell it to do.


I think you are slightly misreading it. The rule is a requirement and an explanation.

Requirement: for POST requests, we require that developers build a user confirmation flow

Explanation: to avoid destruction actions

I think you are reading it as if it said:

> for POST requests, we require that developers build a user confirmation flow *for* destruction actions


After I shared some POC exploits with Plugins, OpenAI added this requirement, it seems.

However, as far as I can tell (and the most recent testing shows), this requirement is not enforced: https://embracethered.com/blog/posts/2023/chatgpt-plugin-vul...

I'm still hoping that OpenAI will fix this at the platform level, so that not every Plugin developer has to do this themselves.

It took 15+ years to get same-site cookies - let's see if we can do better here...


> It took 15+ years to ~~get~~ re-gain same-site cookies.

IIRC, cookies were originally tightly locked to the domain/subdomain which set them.


Wow, not the kinda thing you'd want to be so precarious. Really surprising that more thought wasn't put into this.


This article argues that there's no reliable way to detect prompt injection: https://simonwillison.net/2022/Sep/17/prompt-injection-more-...

One solution to some indirect prompt injection attacks is proposed in this article, where you "sandbox" untrusted content into a second LLM that isn't given the ability to decide which actions to take: https://simonwillison.net/2023/Apr/25/dual-llm-pattern/


I see absolutely no way prompt injection can be fully protected against.

There are nearly infinite ways to word an attack. You can only protect against the most common of them.



I mean, sure that'd work, but doesn't it defeat most of the point in using an LLM?

The only way that works is if you escape _all_ user content. If you're telling an LLM to ignore all user content, then why are you using an LLM in the first place?


The approach isn't to ignore all "user" content at all. The model is trained to follow instructions in normal text; only instructions contained in specially quoted text (that is, external text, like a website) are ignored. Quotation would apply to Bing's search abilities or ChatGPT's new Browsing Mode, which both load website content into the context window.


Giving different permissions levels to different email senders would be very challenging to implement reliably with LLMs. With an AI assistant like this, the typical implementation would be to feed it the current instruction, history of interactions, content of recent emails, etc, and ask it what command to run to best achieve the most recent instruction. You could try to ask the LLM to say which email the command originates from, but if there's a prompt injection, the LLM can be tricked into lying about that. Any permissions details need to be implemented outside the LLM, but that pretty much means that each email would need to be handled in its own isolated LLM instance, which means that it's impossible to implement features like summarizing all recent emails.


You don’t need to ask the LLM where the email came from or provide the LLM with the email address. You just take the subject and the body of the email and provide that to the LLM, and then take the response from the LLM along with the unaffected email address to make the API calls…

  addTodoItem(taintedLLMtranslation, untaintedOriginalEmailAddress)
As for summaries, don’t allow that output to make API calls or be eval’d! Sure, it might be in pig latin from a prompt injection but it won’t be executing arbitrary code or even making API calls to delete Todo items.

All of the data that came from remote commands, such as the body of a newly created Todo item, should still be considered tainted and treated in a similar manner.

These are the exact same security issues for any case of remote API calls with arbitrary execution.
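
Here's a rough sketch of that separation in Python. The names (llm_translate, add_todo_item, and the email fields) are hypothetical, just to illustrate the pattern:

  def handle_email(email):
      # Only the untrusted subject/body ever reach the LLM; the sender address
      # never enters the prompt, so a prompt injection can't spoof it.
      tainted_text = llm_translate(subject=email.subject, body=email.body)

      # The API call combines the tainted LLM output (treated purely as data)
      # with the untainted sender address read straight from the email headers.
      add_todo_item(item_text=tainted_text, owner=email.sender_address)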


Agreed that if you focus on any specific task, there's a safe way to do it, but the challenge is to handle arbitrary natural language requests from the user. That's what the Privileged LLM in the article is for: given a user prompt and only the trusted snippets of conversation history, figure out what action should be taken and how the Quarantined LLM should be used to power the inputs to that action. I think you really need that kind of two-layer approach for the general use case of an AI assistant.
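
Here's a rough sketch of how that two-layer split might look, based on my reading of the linked article (privileged_llm, quarantined_llm, and run_action are hypothetical names, not the article's actual code):

  def handle_request(user_prompt, trusted_history, untrusted_content):
      # The Privileged LLM only ever sees trusted text: the user's prompt and
      # prior trusted conversation. It alone decides which action to take, and
      # it refers to untrusted documents only by opaque identifiers.
      plan = privileged_llm(prompt=user_prompt, history=trusted_history)
      # e.g. plan = {"action": "add_todo", "inputs": ["email_123"]}

      # The Quarantined LLM processes the untrusted content but has no ability
      # to trigger actions; its output is treated as tainted data.
      resolved = {ref: quarantined_llm(untrusted_content[ref])
                  for ref in plan["inputs"]}

      # Only ordinary, non-LLM code executes the action, using the tainted
      # strings strictly as data, never as instructions.
      return run_action(plan["action"], resolved)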


I think the two-layer approach is worthwhile if only for limiting tokens!

Here’s an example of what I mean:

https://github.com/williamcotton/transynthetical-engine#brow...

By keeping the main discourse between the user and the LLM from containing all of the generated code, and instead just using that main "thread" to orchestrate instructions to write code, it allows for more back-and-forth.

It’s a good technique in general!

I’m still too paranoid to execute instructions via email without a very limited set of abilities!


It looks to me like the profile viewer is actually speedscope ( https://www.speedscope.app/ ). I find it nicer for exploring profiles compared with Chrome's built-in viewer.

To use with Node.js profiling, do the `node --inspect` and `chrome://inspect` steps, then save the profile as a .cpuprofile file and drag that file into speedscope.

Another thing I've found useful is programmatically starting/stopping the profiler using `console.profile()` and `console.profileEnd()`.


That is a great tool. Also, I learned about perfetto for the first time today, after doing JavaScript profiling for a few years.


Like any transpiler, Sucrase can be run in parallel by having the build system send different files to different threads/processes. Sucrase itself is more of a primitive, just a plain function from input code to output code.

> What I do not understand is "Sucrase does not check your code for errors." So it's not a type checker?

That's correct: Sucrase, swc, esbuild, and Babel are all just transpilers that transform TypeScript syntax into plain JavaScript (plus other transformations). The usual way to set things up is to use the transpiler to run and build your TS code, and to separately run type checking using the official TypeScript package.


Hi, Sucrase author here.

To be clear, the benchmark in the README does not allow JIT warm-up. The Sucrase numbers would be better if it did. From testing just now (add `warmUp: true` to `benchmarkJest`), Sucrase is a little over 3x faster than swc if you allow warm-up, but it seemed unfair to disregard warm-up for the comparison in the README.

It's certainly fair to debate whether 360k lines of code is a realistic codebase size for the benchmark; the higher-scale the test case, the better Sucrase looks.

> worse it disables esbuild and swc's multi-threading

At some point I'm hoping to update the README benchmark to run all tools in parallel, which should be more convincing despite the added variability: https://github.com/alangpierce/sucrase/issues/730 . In an ideal environment, the results are pretty much the same as a per-core benchmark, but I do expect that Node's parallelism overhead and the JIT warm-up cost across many cores would make Sucrase less competitive than the current numbers.


itertools.groupby isn't really the groupBy operation that people would normally expect. It looks like it would do a SQL-style "group by" where it categorizes elements across the collection, but really it only groups adjacent elements, so you end up with the same group key multiple times, which can cause subtle bugs. From my experience, it's more common for it to be misused than used correctly, so at my work we have a lint rule disallowing it. IMO this surprising behavior is one of the unfriendliest parts of Python when it comes to collection manipulation.

https://docs.python.org/3/library/itertools.html#itertools.g...
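
A minimal illustration of the gotcha (the repeated 'alice' key is the subtle bug):

  from itertools import groupby

  orders = [("alice", 1), ("bob", 2), ("alice", 3)]

  # groupby only merges *adjacent* elements, so "alice" shows up as two groups.
  print([(key, [amount for _, amount in group])
         for key, group in groupby(orders, key=lambda o: o[0])])
  # [('alice', [1]), ('bob', [2]), ('alice', [3])]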


It's common in other languages as well (at least Haskell) and a bit surprising at first. However, a `.sortBy(fn).groupBy(fn)` is easy and of similar efficiency, and when you actually need the local-only `groupBy()`, you're happy it's there.

A bit more expressive overall.

At least it is better than lodash's useless groupBy, which creates this weird key-value mapping, loses order, converts keys to strings, and whatnot.
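
For reference, the Python equivalent of that sort-then-group pattern:

  from itertools import groupby

  orders = [("alice", 1), ("bob", 2), ("alice", 3)]
  by_user = lambda o: o[0]

  # Sorting by the same key first makes groupby behave like a SQL GROUP BY.
  grouped = {key: [amount for _, amount in group]
             for key, group in groupby(sorted(orders, key=by_user), key=by_user)}
  print(grouped)  # {'alice': [1, 3], 'bob': [2]}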


yep, that's a good example of what I refer to as IKEA-assembling your groupby. You need to put something like 3 parts together before it does what you want, and they aren't that intuitive (or they only are in retrospect).


The resulting groups are also iterators which are exhaustible. It's good if you're running group by on a huge dataset to save some memory, but for everyday operations it's another trap to fall into.
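
For example, on recent CPython the stored groups come back empty once groupby has advanced past them:

  from itertools import groupby

  data = ["apple", "avocado", "banana", "blueberry"]  # already sorted by first letter

  # The group iterators share the underlying iterable with groupby, so
  # advancing to the next key silently exhausts the previous group.
  pairs = list(groupby(data, key=lambda s: s[0]))
  print([(key, list(group)) for key, group in pairs])
  # [('a', []), ('b', [])]

  # Materialize each group while you're still on it if you need it later.
  pairs = [(key, list(group)) for key, group in groupby(data, key=lambda s: s[0])]
  print(pairs)  # [('a', ['apple', 'avocado']), ('b', ['banana', 'blueberry'])]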


Yes, for itertools.groupby to work as most people would expect, the data needs to be sorted by the grouping key first. That may obviously cause a significant performance hit.



The big difference, I think, is that dx87's post encourages the reader to make fewer assumptions and to keep an open mind ("While we don't know the details, it's entirely possible..."), while your bullet points are statements that encourage the reader to make more assumptions.

