Another interesting use case for WASM is making cross-language libraries. Write ...

rpeden · on Nov 15, 2023

It's awesome, but also funny that we're using WASM to reinvent COM 30 years later.

And that's not a knock on WASM. It's just that COM was pretty neat for its time, even if it was sometimes painful in practice.

But I find it pretty nifty that I can still take a COM library written in one language and then import and use it in C++, C#, Python, Ruby, JavaScript, or Racket (and plenty of others - those are just the ones I've used COM libraries with.)

pjmlp · on Nov 16, 2023

COM is pretty much present tense, it being the main Windows ABI since Windows Vista adopted Longhorn ideas into COM/C++ instead of .NET.

The tooling could have been improved, but I guess WinDev really loves their baroque tools.

theta_d · on Nov 17, 2023

1998 was the pinnacle of software development. VB6 composing off-the-shelf COM components purchased from a catalog sent in the physical mail.

Desktop apps were still a thing. The web was simple. ASP or some CGI scripts. Perfection.

brabel · on Nov 15, 2023

That's currently impossible. It requires the component model[1] to be figured out, and that's taking a ridiculous amount of time because, as anyone who has tried this before (and many have) knows, this is really hard (languages have different semantics, different rules, different representations, etc.).

If they do manage to get it working somehow, it will indeed be very exciting... but I've been waiting for this for several years :D (I was misled to believe this sort of modular thing was possible, but that's false unless you generate lots and lots of intermediate JS to glue different modules together - and then all "communication" goes through JS, nothing goes directly from one module to another - which completely defeats any possible performance advantage over just doing pure JS), so don't hold your breath.

[1] https://github.com/WebAssembly/component-model

phickey · on Nov 15, 2023

The component model is already shipping in Wasmtime, and will be stable for use in Node.js and in browsers via jco (https://github.com/bytecodealliance/jco) soon. WASI Preview 2 will be done in December or January, giving component model users a stable set of interfaces to use for scheduling, streams, and higher level functionality like stdio, filesystem, sockets, and http on an opt-in basis. You should look at wit-bindgen (https://github.com/bytecodealliance/wit-bindgen) to see some of the languages currently supported, and more that will be mature enough to use very soon (https://github.com/bytecodealliance/componentize-py)

Right now jco will automatically generate the JS glue code which implements a Component Model runtime on top of the JS engine's existing WebAssembly implementation. So, yes, Components are a composition of Wasm Modules and JS code is handling passing values from one module/instance to another. You still get the performance benefits of running computation in Wasm.

One day further down the standardization road, we would like to see Web engines ship a native implementation of the Component Model, which might be able to make certain optimizations that the JS implementation cannot. Until then you can consider jco a polyfill for a native implementation, and it still gives you the power to compose isolated programs written in many languages and run them in many different contexts, including the Web.

(Disclosure: I am co-chair of WASI, Wasmtime maintainer, implemented many parts of WASI/CM)

brabel · on Nov 15, 2023

Hm, ok so it seems there has finally been progress since I stopped looking (around a year ago, I was very actively following developments for a couple of years before that but got tired). I will check wit-bindgen and jco and see if I can finally make my little compiler emit code that can be called from other languages and vice versa without myself generating any JS glue code.

flohofwoe · on Nov 15, 2023

> That's currently impossible. It requires the component model[1] to be figured out...

Not really, if you use C APIs with 'primitive-type args' at the language boundaries, which is the same as in any other mixed-language scenario. Some languages make it harder to interact with C APIs than others, but that's a problem that needs to be fixed in those languages.

ncruces · on Nov 15, 2023

This. And as long as you provide memory allocation primitives, you can pass arbitrarily complex arguments in linear memory. It's just a matter of “ABI.”
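To make the "it's just ABI" point concrete, here is a minimal sketch in plain Python. It is not a real Wasm runtime: `LinearMemory`, `guest_total_words`, and `host_total_words` are hypothetical names, and the bytearray only stands in for a module's linear memory. The guest function receives nothing but integers (a table pointer and a count); the list of strings travels as bytes that the host lays out after asking the allocator for space.

```python
import struct


class LinearMemory:
    """Toy stand-in for a module's linear memory, with a bump allocator."""

    def __init__(self, size=64 * 1024):
        self.buf = bytearray(size)
        self.next_free = 8  # leave offset 0 unused so it can serve as a null pointer

    def alloc(self, n):
        """Play the role of the allocator a module would export to its host."""
        ptr = self.next_free
        if ptr + n > len(self.buf):
            raise MemoryError("out of linear memory")
        self.next_free = (ptr + n + 7) & ~7  # round up to keep 8-byte alignment
        return ptr

    def write(self, ptr, data):
        self.buf[ptr:ptr + len(data)] = data

    def read(self, ptr, n):
        return bytes(self.buf[ptr:ptr + n])


def guest_total_words(mem, table_ptr, count):
    """Pretend guest export: its arguments are two integers and nothing else."""
    total = 0
    for i in range(count):
        # Each table entry is a little-endian (ptr, len) pair of u32 values.
        ptr, n = struct.unpack("<II", mem.read(table_ptr + 8 * i, 8))
        total += len(mem.read(ptr, n).decode("utf-8").split())
    return total


def host_total_words(mem, strings):
    """Host side: copy the strings in, build the (ptr, len) table, call the guest."""
    entries = []
    for s in strings:
        data = s.encode("utf-8")
        ptr = mem.alloc(len(data))
        mem.write(ptr, data)
        entries.append((ptr, len(data)))
    table_ptr = mem.alloc(8 * len(entries))
    for i, (ptr, n) in enumerate(entries):
        mem.write(table_ptr + 8 * i, struct.pack("<II", ptr, n))
    return guest_total_words(mem, table_ptr, len(entries))
```

The layout chosen here (a table of 8-byte `(ptr, len)` entries) is an assumption for illustration; the only real requirement is that both sides agree on it.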

dlock17 · on Nov 16, 2023

It's possible all right, I've already done it. Imported Tesseract into Go with WASM.

https://news.ycombinator.com/item?id=38146154

It's not trivial because you have to figure out intricacies of the language and whatever compiled it into WASM, and Emscripten compiled WASM does expect some JS glue code. But WASM with WASI doesn't inherently require JS. And since Emscripten's JS glue is called via WASM host imports, you can implement them in whatever the host language is.

nl · on Nov 16, 2023

This isn't really true at a base level. You can do things like JS/Python interop now without this, eg: https://til.simonwillison.net/deno/pyodide-sandbox

ReactiveJelly · on Nov 15, 2023

Most of the 3rd-party libraries I use, I use for their side effects.

Qt opens GUI windows and sockets and such. libusb touches USB devices. OpenCV can capture video frames from a camera and sometimes use GPU acceleration. Sqlite manipulates files on disk.

So unfortunately with wasm in a sandbox, the easiest libraries to work with are only pure functions. ffmpeg would work, but HW encoding or decoding would be difficult, and I need to either enable some file system access in the wasm runtime or feed it file chunks on demand.

WanderPanda · on Nov 16, 2023

This! Who really needs to tie together business logic components written in different languages?!

umvi · on Nov 15, 2023

Yes, and go upvote this .NET feature so we can make portable .NET WASM libraries: https://github.com/dotnet/runtime/issues/86162

.NET WASM performance is actually very impressive, especially with AOT enabled.

mazeez · on Nov 16, 2023

You're in luck because this is already possible: https://github.com/extism/dotnet-pdk

kaba0 · on Nov 16, 2023

As mentioned by others, it is not particularly new in any way.

I find GraalVM’s polyglot abilities far more impressive, where the VM can actually optimize across your JS code calling into Python calling into C, while providing more granular sandboxing abilities as well (you can run certain so-called isolates with, say, file access only, and others without even that, all under the same runtime/setup).

d_philla · on Nov 15, 2023

This is exactly one of the use-cases for the Scale Framework[1]. (Full disclosure: I work on this project)

You can absolutely take a library from one language and run it in another. In a sense, you could kind of see this ability as drastically reducing the need for rewriting sdks, middlewares, etc. across languages, as you could just reuse code from one language across many others. We played around with some fun ideas here, like taking a Rust regex library and using it in a Golang program via a scale function plugin (compiled to Wasm), to the effect of the performance being ~4x faster than native code that uses Go's regex library[2].

[1] https://github.com/loopholelabs/scale

[2] https://twitter.com/confusedqubit/status/1628409282462093312

nilslice · on Nov 16, 2023

Extism handles this really well across 16 or so different languages - and you don’t need to write a whole IDL / schema.

https://github.com/extism/extism

It’s a general purpose framework for building with WebAssembly and sharing code across languages is a great way to put it to work.

rockwotj · on Nov 15, 2023

Yeah the component model the bytecode alliance is pushing defines a canonical ABI and codegen tools to make this easier (also separating memory from these components so a bug in some random C library doesn’t have a blast radius outside the processing it does in its library boundary)

jacobheric · on Nov 15, 2023

We sort of do this with WASM for just in time pipelines. We write pipeline rules in WASM...for things like detecting/masking fields...then we import and execute those wasm rules in a variety of language SDKs. As a sibling comment indicates, it's pretty difficult getting data in and out, but it's doable. See here for an example: https://github.com/streamdal/node-sdk/blob/main/src/internal.... We do this sort of thing in node, go & python and are adding other languages.

ledgerdev · on Nov 15, 2023

With supply chain attacks becoming more of an issue, the strong sandboxing of library permissions would be a huge benefit also. A thought on how this might be workable would be to have a wasm registry that, when pushed to, would auto-build packages for each ecosystem, then push upstream to npm/maven/etc.

Of course the "component model" or some agreed upon structure of data shared among modules and the mappings to each language is the missing piece.

never_inline · on Nov 15, 2023

Your post got me wondering, what advantage might it provide over an FFI? Does WASM ABI define higher level common primitives than C?

coderedart · on Nov 15, 2023

well, for starters, wasm is sandboxed. So, if a wasm library needs an import (eg: read/write filesystem), it has to be explicitly provided. It cannot do anything except math by default. This gives the host a high degree of control.

different wasm libraries can have separate memories. So, if a library X depends on a jpeg decoder library, the host has to provide that as import. The jpeg decoder library might export a fn called "decode" which takes an array of bytes of a jpeg, and returns an Image struct with rgba pixels. This allows the "memory" of the two libraries to be separate. the jpeg decoder cannot "write" to the X's memory, cleanly separating the two of them.
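A rough way to picture the "imports must be explicitly provided" rule, sketched in plain Python. Python itself enforces none of this isolation, so this only models the shape of it; `ChecksumGuest`, `LinkError`, and `make_single_file_reader` are hypothetical names, not any runtime's API. The guest declares the imports it needs, instantiation fails if the host withholds one, and the host decides how narrow the capability it hands over is.

```python
class LinkError(Exception):
    """Raised at instantiation when a declared import was not supplied by the host."""


class ChecksumGuest:
    """Toy stand-in for a guest module that declares exactly one import."""

    DECLARED_IMPORTS = ("read_file",)

    def __init__(self, imports):
        missing = [name for name in self.DECLARED_IMPORTS if name not in imports]
        if missing:
            raise LinkError(f"missing imports: {missing}")
        # Keep only what was declared; guest code has no other way to reach the host.
        self._imports = {name: imports[name] for name in self.DECLARED_IMPORTS}

    def checksum(self, path):
        data = self._imports["read_file"](path)
        return sum(data) % 256


def make_single_file_reader(allowed_path, contents):
    """Host-side capability: a read_file import that serves one file and nothing else."""

    def read_file(path):
        if path != allowed_path:
            raise PermissionError(path)
        return contents

    return read_file
```

Here the host hands the guest a reader for a single in-memory "file"; any other path is refused by the host function itself, not by the guest's good behavior.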

Wasm component model recognizes higher level objects called resources, which can be shared between libraries (components). This allows X to simply pass a file descriptor to jpeg decode fn, and the sandbox model makes sure that jpeg library can read from that file only and the rest of the filesystem is still off-limits. wasm is already getting support for garbage collection. So, a high level language can just rely on wasm's GC and avoid shipping its entire runtime. Or the host can guarantee that all the imports a language needs will be provided, so that the language libraries can shed as much runtime weight as possible.

Finally, Component model is designed from ground up to be modular, which allows imports/exports/namespaces and other such modern features. C.. well, only has headers and usually brings a lot of backwards compatibility baggage. The tooling (eg: wit-bindgen) will provide higher level support like generating code for different language bindings by taking a wit (header for wasm) declaration file. If you are familiar with rust, then https://github.com/bytecodealliance/cargo-component#getting-... shows how easy it is to safely create (or bind to) wasm bindings

nonethewiser · on Nov 15, 2023

I had to look up FFI (Foreign Function Interface). I'm not sure if WASM is better. I'm aware there are language bindings (maybe synonymous or overlaps with FFI?) but I'm not that familiar with them either.

I wondered if perhaps this WASM use case for a cross-language library was already just as possible and ergonomic using language bindings and maybe that's why this use case doesn't seem like a big deal to people. It does seem possible that the allure of running in the browser might prompt deeper support for WASM compilation than language bindings. The WASM case is also a many-to-one relationship (all languages to WASM) whereas language bindings are a many-to-many relationship (all languages to all languages) so it would take a lot more effort for the same level of support.

josephg · on Nov 15, 2023

> I wondered if perhaps this WASM use case for a cross-language library was already just as possible and ergonomic using language bindings and maybe that's why this use case doesn't seem like a big deal to people.

Yeah that’s the reason. You don’t notice it a lot of the time, but FFIs are everywhere already. The most common foreign function interface is basically the ability to call C code, or have functions made available to C code. C is used because everyone knows it and it’s simple. And most languages either compile to native code (eg rust) - which makes linking to C code easy. Or the runtime is implemented in C or C++ (eg V8, Ruby). In languages like that, the standard library is already basically implemented via a FFI to C/C++ code.

I’ve got an iOS app I’m working on that’s half rust and half swift, with a touch of C in the middle. The bindings work great - the whole thing links together into one binary, even with link time optimizations. But the glue code is gross, and when I want to fiddle with the rust to Swift API I need to change my code in about 4 different places.

Most FFIs are a one to many relationship in that if you write a clean C API, you can probably write bindings in every language. But you don’t actually want to call naked C code from Ruby or Javascript. Good bindings will make you forget everything is done via ffi. Eg numpy. I haven’t looked at the wasm component model proposal - I assume it’s trying to make this process cleaner, which sounds lovely.

I maintain the nodejs bindings for foundationdb. Foundationdb bindings are all done via ffi linking to their C code. And the API is complex, using promises and things. I find it really interesting browsing their official bindings to go, Java, Python and Ruby. Same bindings. Same wrapped api. Same team of authors. Just different languages. And that’s enough to make the wrapper wildly different in every language. From memory the Java ffi wrapper is 4x as much code as it is in Ruby.

https://github.com/apple/foundationdb/tree/main/bindings

kaba0 · on Nov 16, 2023

> You don’t notice it a lot of the time, but FFIs are everywhere already

That’s true, but I sort of find it a negative in a platform if it relies too much on C libs, unless absolutely necessary. FFI is the prime reason why a given software fails to work on another OS, e.g. if your python/js project doesn’t build elsewhere, you 90% have trouble with a C lib.

There are various reasons why this is not the case with JVM languages (it historically didn’t have too great FFI options, also, the ratio of Java:C speed is much less than Python:C, so it didn’t make that much sense), but that platform has grown to be almost 100% pure JVM byte code. The only part where they use native parts is stuff like OpenGL, where you pretty much have to. I think this makes for a more ideal starting point.

josephg · on Nov 16, 2023

Yeah - it's a problem with nodejs as well. Native C libraries in npm regularly break when you change OS, and given that most javascript packages pull in a small country's worth of dependencies, it's very common to have some native code in there somewhere.

I really hope most native javascript modules get rewritten / repackaged into wasm. As well as solving any cross-OS compatibility problems, that'll also make them work transparently in the browser.

lesuorac · on Nov 15, 2023

If it ends up becoming more similar to Lua then a big advantage is that the WASM code won't randomly read your hard drive and send your bitcoins to North Korea unless you explicitly gave the WASM code disk/network permissions.

brabel · on Nov 15, 2023

Lua allows full control over which APIs you want to expose to a script you embed in your application. With some effort you can even expose only a constrained version of the `os` module, for example, which only lets you access a few resources. Why do you believe WASM can do better here? In fact, as far as I know, there's nothing in WASM that lets you sandbox it yet once you've given it WASI access (unless you're talking about host-specific features, which are NOT part of the WASM spec itself).

ncruces · on Nov 15, 2023

That depends entirely on the runtime, and its WASI implementation.

wazero [1], which I'm most familiar with, allows you to decide in a relatively fine-grained way what capabilities your WASI module will have: command line arguments, environment variables, stdin/out/err, monotonic/wall clock, sleeping, even yielding CPU… Maybe more importantly, filesystem access can be fully emulated, or sandboxed to a specific folder, or have some directories mounted read-only, etc; it's very much up to you.

I've used it to wrap command line utilities, and package them as Go libraries.

For one example, dcraw [2]. WASM makes this memory safe, and I can sandbox it to access only the single file I want it to process (which can be a memory buffer, or something in blob storage, if I want it to).

Notice in [3] how you provide an io.ReadSeeker which could be anything from a file, a buffer in memory, or an HTTP resource. The spaghetti C that dcraw is made of won't be able to access any other file, bring your server down, etc.

1: https://wazero.io/

2: https://dechifro.org/dcraw/

3: https://pkg.go.dev/github.com/ncruces/rethinkraw/pkg/dcraw

emmanueloga_ · on Nov 16, 2023

Wazero looks super cool. I saw somewhere that programs can be run with a timeout, which sounds great for sandboxing. The program input is just a slice of bytes [1], so an interesting use case would be to use something like Nats [2] to distribute programs to different servers. Super simple distributed computing!

--

1: https://github.com/tetratelabs/wazero/blob/main/examples/bas...

2: https://natsbyexample.com/examples/messaging/pub-sub/go

ncruces · on Nov 16, 2023

Yes: if so configured, wazero respects context cancelation (including, but not limited to, timeouts).

This has a slight toll on performance: a call back from WASM AOT-compiled-assembly into Go is introduced regularly (on every backwards jump?) to give the Go runtime the opportunity to yield the goroutine and update the context (and break infinite loops), even when GOMAXPROCS=1.

Coordinating with the Go scheduler might be an area where there's some room for improvement, in fact.
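The general idea of checking for cancellation on backward jumps can be sketched with a toy interpreter. This is plain Python and not how wazero is implemented; the instruction set (`set`, `dec`, `jnz`) and the `run` / `Cancelled` names are made up for illustration. Every loop has to contain a jump to an earlier instruction, so testing the deadline only there is enough to break infinite loops while leaving straight-line code unchecked.

```python
import time


class Cancelled(Exception):
    """Raised when the deadline has passed at a backward jump."""


def run(program, deadline=None, clock=time.monotonic):
    """Run a toy program of (op, arg) pairs; return how many instructions executed."""
    acc = 0
    pc = 0
    steps = 0
    while pc < len(program):
        op, arg = program[pc]
        steps += 1
        if op == "set":
            acc = arg
            pc += 1
        elif op == "dec":
            acc -= 1
            pc += 1
        elif op == "jnz":  # jump to `arg` if the accumulator is non-zero
            if acc != 0:
                # Only backward jumps can form loops, so only they pay for the check.
                if arg <= pc and deadline is not None and clock() >= deadline:
                    raise Cancelled()
                pc = arg
            else:
                pc += 1
        else:
            raise ValueError(f"unknown op: {op}")
    return steps
```

For example, `run([("set", 3), ("dec", None), ("jnz", 1)])` counts down and finishes normally, while `[("set", 1), ("jnz", 1)]` spins forever unless a deadline is supplied. The cost is one extra comparison per loop iteration, which is the kind of overhead the comment above describes.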

josephg · on Nov 15, 2023

Yeah lua is a weird example because it’s actually amazing at this. Lua gives you a massive amount of control over what scripts can access.

It is much safer than pulling in an opaque C library that works via ffi. Eg a nodejs native module. Those are written in C and can indeed sell your data to North Korea. (Just like any other package in npm.)

I’m excited by the idea of being able to depend on 3rd party code without it having access to my entire OS.

pjmlp · on Nov 16, 2023

As proven by the JVM, CLR, TIMI and many others that predated WASM, that is much harder in practice than people think.

Turns out there always needs to exist a common subset that most languages are capable of understanding.

ms4720 · on Nov 16, 2023

VAX/VMS used to do that.

Also, anything that compiles to a C shared library does that.

whoopdedo · on Nov 15, 2023

WASM is the new DLL. Will we have to deal with WASM-Hell eventually? But that's not catchy enough. Maybe we should call circular dependency and incompatible versions "WASM-WTF"