Rebuilding the Racket Compiler with Chez Scheme

patterns · on Nov 27, 2020

I wonder whether the migration to Chez Scheme will improve the performance of Dr. Racket which consumes a lot of memory and feels very sluggish.

I have been experimenting with Racket for quite a while and I appreciate the effort that went into making the language extensible. That being said, I wish that the community had embraced "object-oriented" techniques for building the VM - Ian Piumarta's COLA (Combined Object-Lambda Architecture) comes to mind [1]. I think this would have saved them a lot of performance troubles (but this is mere speculation) and would make the system far more flexible and pleasant to use.

While it's comforting to have a very powerful macro system at your fingertips to change the semantics of the language when needed, many of the macros in the base language and libraries could be eliminated with a more convenient default notation for closures and message-passing (to objects simulating control structures like in Smalltalk).

Racket has one of the best module systems I have worked with, but modules cannot be (easily) parameterized. Research on systems such as Self and, more recently, Newspeak [2] gives ample demonstration of the benefits (conceptual integrity, security) of having modules as objects.
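To make "modules as objects" concrete, here is a minimal Python sketch (not Racket or Newspeak code). The names `make_logger_module` and `sink` are made up for illustration. The "module" is just a function that takes its dependencies as arguments and returns an object holding its exports, so nothing is reached through a global import:

```python
from types import SimpleNamespace


def make_logger_module(sink):
    """Build a 'module' whose only dependency, `sink`, is passed in explicitly."""

    def log(message):
        # The module can only reach the outside world through `sink`.
        sink(f"[log] {message}")

    # The returned object plays the role of the module's export table.
    return SimpleNamespace(log=log)


# Instantiate the same module twice with different dependencies.
captured = []
test_logger = make_logger_module(captured.append)
console_logger = make_logger_module(print)

test_logger.log("hello")
print(captured)  # ['[log] hello']
```

Because each instance sees only what it was handed, swapping `print` for `captured.append` needs no change inside the module, which is roughly the conceptual-integrity and security argument.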

What Racket also lacks is a good meta-object system with highly reflective capabilities (of say a CLOS or Smalltalk system). This makes it difficult to build tooling such as inspectors or browsers.

I hope that in the future these issues will be addressed.

[1] https://www.piumarta.com/papers/colas-whitepaper.pdf

[2] https://newspeaklanguage.org/

chriswarbo · on Nov 27, 2020

I would absolutely love to see such a thing (e.g. VPRI's 'frank' system), but it's quite a different approach than Scheme, and I wouldn't want to see PLT/Racket take such a drastic course change while there's still so much interesting work to be done in the Scheme world (e.g. Kernel's F-expressions are powerful, but hard to optimise; work on reflective towers and multi-stage programming is pushing languages beyond the traditional AOT/JIT dichotomy; and so on).

From my understanding, Piumarta's main trick with COLA is fast dynamic dispatch through a uniform interface (putting pointers to vtables at offset -1), and implementing vtables via interface (so they can be replaced). It's a cool approach, but it's also a case of 'everything can be solved by adding a layer of indirection'; and I've not sat down and thought through how these ideas might apply to different paradigms, and what might be unique or common to them all.
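A rough Python sketch of that dispatch scheme as I understand it, with simplifying assumptions: the attribute `vt` stands in for the pointer at offset -1, and the names `Obj`, `VTable`, `send`, `default_lookup` and `traced_lookup` are mine, not COLA's. The point is that method lookup is itself a message sent to the vtable, so it can be swapped out:

```python
class Obj:
    """Any object; `vt` stands in for the vtable pointer kept at offset -1."""
    def __init__(self, vt):
        self.vt = vt


class VTable(Obj):
    """A vtable is itself an object, so it has a vtable of its own."""
    def __init__(self, vt, parent=None):
        super().__init__(vt)
        self.parent = parent
        self.methods = {}


def default_lookup(vtable, selector):
    """Walk the parent chain until a method for `selector` is found."""
    while vtable is not None:
        if selector in vtable.methods:
            return vtable.methods[selector]
        vtable = vtable.parent
    raise AttributeError(selector)


# Bootstrap: the vtable of vtables is its own vtable.
vtable_vt = VTable(None)
vtable_vt.vt = vtable_vt
vtable_vt.methods["lookup"] = default_lookup


def send(receiver, selector, *args):
    vt = receiver.vt
    if selector == "lookup" and vt is vtable_vt:
        # Base case that stops the "lookup of lookup" regress.
        method = default_lookup(vt, "lookup")
    else:
        # Lookup is an ordinary message sent to the vtable.
        method = send(vt, "lookup", selector)
    return method(receiver, *args)


point_vt = VTable(vtable_vt)
point_vt.methods["describe"] = lambda self: "a point"
p = Obj(point_vt)
print(send(p, "describe"))  # a point

# Replace the lookup strategy for one vtable without touching `send`.
seen = []

def traced_lookup(vtable, selector):
    seen.append(selector)
    return default_lookup(vtable, selector)

tracing_vt = VTable(vtable_vt)
tracing_vt.methods["lookup"] = traced_lookup
point_vt.vt = tracing_vt
print(send(p, "describe"), seen)  # a point ['describe']
```

The extra `send` on every call is the layer of indirection mentioned above; as far as I know, real implementations cache lookups to win the cost back.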

patterns · on Nov 27, 2020

If I remember correctly, Adele Goldberg once remarked that "in Smalltalk everything happens somewhere else". Although there has been ongoing work to collapse the many layers of indirection you find in message-passing systems, I am not convinced that throwing more tools at the problem will address the deeper underlying issues. At some level, you need more concise descriptions of your system and operators to shape its structure and organization; perhaps even compute different organizations. This is very difficult to accomplish in purely object-oriented systems/languages that have no direct and convenient "algebraic" correspondence for composing complex communication structures and specifying relationships such as inheritance. For example, I was very taken by the idea of class definitions as expressions, having seen it in Racket first many years ago.
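As a small illustration of why class definitions as expressions are appealing, here is a Python analogue (Python, not Racket's `class` form; `with_logging` and `Point` are hypothetical names). When a class is just a value, a mixin becomes an ordinary function from a class to a class, so inheritance relationships can be composed like any other expression:

```python
def with_logging(base):
    """A mixin written as a function: takes a class, returns a new subclass."""
    class Logged(base):
        def describe(self):
            return "logged " + super().describe()
    return Logged


class Point:
    def describe(self):
        return "point"


# Inheritance is computed by applying functions to classes.
LoggedPoint = with_logging(Point)
TwiceLogged = with_logging(with_logging(Point))

print(LoggedPoint().describe())  # logged point
print(TwiceLogged().describe())  # logged logged point
```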

On the other hand, I believe that when it comes to low-level "machine" work, objects are a good abstraction to model components such as activation records such that they can be uniformly explored and modified (on the fly). But this is perhaps a moot point.

Over the years, I have studied many object/component-oriented systems and come up with a sizable catalogue of message-exchange patterns, plumbing and machinery. My hope is that at some point, this can be crystallized into a language or calculus for specifying systems; and Scheme/Racket is certainly a good language to think about these issues from another perspective (which is worthwhile to preserve).

So I guess, I understand your point. Thanks for raising it, much appreciated.

alexisread · on Nov 27, 2020

Oberon and Maude, I think, make a good case for (parameterised) modules, both for being able to reason mathematically and for their use in low-level operating-system work.

I've been very slowly trying to combine the work in Maru with Composita to yield a modular and deterministic lisp (Composita uses managed memory without GC).

http://concurrency.ch/Content/publications/Blaeser_Component...

Naturally you'd want to layer something like Shen or Maude on top of this, to provide the equational proof-checking. The K language is a good example - it provides facilities to model new languages and semantics, as Racket does.

http://fsl.cs.illinois.edu/index.php/K-Maude:_A_Rewriting_Ba...

rscho · on Nov 27, 2020

I have no CL experience but doesn't a CLOS system require multiple dispatch?

dfox · on Nov 27, 2020

CLOS is inherently multiple dispatch, but Flavors, the most prominent precursor of CLOS, was single dispatch. An interesting consequence of that is that, IIRC, the traditional implementation of CLOS/MOP does not really use multiple dispatch internally.

ska · on Nov 27, 2020

It's multiple dispatch, but the meta-object protocol runs deeper than that (e.g. :before, :after, :around )
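For readers who have not used CLOS: those qualifiers belong to its standard method combination. A deliberately simplified Python sketch of the calling order follows. It assumes a single primary method and at most one around method, and the names `combine`, `greet` and `shout` are invented for the example:

```python
def combine(primary, before=(), after=(), around=None):
    """Build an effective method from a primary method and its auxiliaries."""

    def core(*args):
        for method in before:   # before methods run for side effects only
            method(*args)
        result = primary(*args)  # the primary method supplies the value
        for method in after:    # after methods run once the primary is done
            method(*args)
        return result

    def effective(*args):
        if around is None:
            return core(*args)
        # The around method decides whether and how to call the rest.
        return around(core, *args)

    return effective


trace = []

def greet(name):
    trace.append("primary")
    return f"hello {name}"

def shout(call_next_method, name):
    trace.append("around")
    return call_next_method(name).upper()

speak = combine(
    greet,
    before=[lambda name: trace.append("before")],
    after=[lambda name: trace.append("after")],
    around=shout,
)

result = speak("ada")
print(result)  # HELLO ADA
print(trace)   # ['around', 'before', 'primary', 'after']
```

Real CLOS also orders multiple before and after methods by specificity, and the meta-object protocol lets you replace the combination itself; none of that is modelled here.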

Pet_Ant · on Nov 27, 2020

With macros you can easily build your own multiple dispatch; it's really not a concern.
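To give a sense of how little machinery is involved, here is a sketch of multiple dispatch as a plain library, in Python with a decorator standing in for the macro. The names `defmethod`, `call` and the shape classes are hypothetical. It assumes ordinary classes (no registered ABCs) and picks the most specific method by comparing arguments left to right:

```python
_methods = {}


def defmethod(name, *types):
    """Register a function as the method of `name` for these argument types."""
    def register(fn):
        _methods.setdefault(name, []).append((types, fn))
        return fn
    return register


def call(name, *args):
    """Dispatch on the runtime types of all arguments."""
    applicable = [
        (types, fn)
        for types, fn in _methods.get(name, [])
        if len(types) == len(args)
        and all(isinstance(a, t) for a, t in zip(args, types))
    ]
    if not applicable:
        raise TypeError(f"no applicable method for {name}")

    def specificity(entry):
        types, _ = entry
        # Smaller MRO index means a closer (more specific) match.
        return tuple(type(a).__mro__.index(t) for a, t in zip(args, types))

    _, best = min(applicable, key=specificity)
    return best(*args)


class Shape: pass
class Circle(Shape): pass
class Square(Shape): pass


@defmethod("collide", Shape, Shape)
def collide_generic(a, b):
    return "generic collision"


@defmethod("collide", Circle, Square)
def collide_circle_square(a, b):
    return "circle hits square"


print(call("collide", Circle(), Square()))  # circle hits square
print(call("collide", Square(), Circle()))  # generic collision
```

Note that the registry is global and open to anyone who can import it.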

rscho · on Nov 27, 2020

Well, there's at least the concern that multiple dispatch results in unsafe edge cases across module boundaries, apparently.

So, multiple dispatch as a library, yes (and Racket has that), but probably not OK in the core language?

bloopernova · on Nov 27, 2020

Very cool that HackerNews is using "their own language called Arc that is programmed in Racket."

http://arclanguage.org/

I had no idea I was using Racket already, via this website! Makes me wonder if Terraform's HCL could be re-written using Racket.

opnitro · on Nov 27, 2020

I'm sure they could (don't know if they should). If you'd like to learn more about Racket's language building infrastructure, check out: https://beautifulracket.com/

pmoriarty · on Nov 27, 2020

"Chez Scheme is one of the oldest Scheme implementations, and its evolution informed many parts of the Scheme standard through R6RS. (Racket's influence on the Scheme standard, in contrast, is limited to aspects of the R6RS library design.)..."

Why did Chez and Racket embrace R6RS, a standard widely rejected by the rest of the Scheme community?

Also, why did Racket choose Chez over some other high-performance Schemes, like Chicken?

Finally, how many Racket and Chez users are there compared to those of other Schemes?

samth · on Nov 27, 2020

The main implementors of Racket and Chez Scheme (Matthew Flatt and Kent Dybvig) were part of the committee that created the R6RS, and many of the ideas in R6RS came from Racket and Chez. Whether R6RS was "widely rejected" is of course up for debate, but the people that didn't like it decided that mostly after Matthew and Kent's contributions.

https://ecraven.github.io/r7rs-benchmarks/ is an ok comparison between Scheme implementations; you can see that Chez is much faster than Chicken in general. Chez also already supports many features beyond Chicken that were needed to implement Racket (for example, internal support that made delimited continuations easy).

It's hard to know exactly how many people use what, but it seems likely that Racket is the most widely used Scheme derivative.

kryptiskt · on Nov 27, 2020

I never understood that hate. R6RS is the only Scheme standard that makes it possible to write useful standards-compliant programs. R7RS-small is a big step back in that respect. The new standard is less of a burden on the implementer[0], and I guess that makes it popular in the Scheme community, as that mostly consists of people implementing Scheme and very few people writing Scheme programs.

I built a Chez Scheme backend for Idris (and Idris2 uses Chez as the default compilation target). I used it because it's so much better than the other compilers, and what's the point of being compliant with a standard that provides no interoperability anyway because it is so meager?

[0] https://weinholt.se/articles/r7rs-vs-r6rs/

cat199 · on Nov 28, 2020

From what I gather (not an implementor), R6RS was too much to implement for the smaller 'testbed' Schemes popular with language designers, and perhaps a bit too prescriptive for the core, which is why R7RS has the small/big split. Small can be implemented more easily, and big can be added on as a portable library or alternatively implemented at a lower level by the language runtime if desired.

GrumpySloth · on Nov 27, 2020

> Why did Chez and Racket embrace R6RS, a standard widely rejected by the rest of the Scheme community?

Why wouldn't they? The usual rationale for rejecting R6RS is that it's somehow "too big". But Racket already has a library whose size blows R6RS out of the water. So this rationale is irrelevant in this context. Also, Chez Scheme is an implementation detail. R6RS is only one of many languages that you may use through Racket.

And what is this "Scheme community" that rejected R6RS? It's always been my impression that it was rejected by authors of toy implementations. Is there actually any data on what portion of people who use Scheme (as opposed to write another implementation of it) reject R6RS?

bitwize · on Nov 27, 2020

> Why wouldn't they? The usual rationale for rejecting R6RS is that it's somehow "too big". But Racket already has a library whose size blows R6RS out of the water. So this rationale is irrelevant in this context.

No one begrudges Racket for having a huge library. The Racket team have obviously had enormous success building large, comprehensive real-world systems.

But Racket is an implementation -- not the Scheme standard. The historic value of the Scheme standard is its smallness and ease of implementation -- a small base upon which to build large systems. Large, comprehensive implementations -- such as T -- have been built on top of this small basis for almost as long as there's been an RnRS.

R6RS represented a fundamental shift in philosophy for what is Scheme. It de-emphasized the "small language kernel" approach and emphasized a comprehensive, software-engineering approach for building "real world systems". But the Lisp community already has a comprehensive, software-engineering-oriented language, Common Lisp (which borrowed some of its best features, like lexical scoping, from Scheme). So what's the point of building Scheme into a large software-engineering language to compete with Common Lisp?

It's a bit hard to fathom today, when most modern programming languages -- excluding ECMAScript but including, for example, Rust -- are defined in terms of a single reference implementation, but there are advantages to keeping the language standard small while allowing implementations to grow to arbitrary size and accrete features. Advantages which Scheme has turned into its own little niche among the family of Lisp programming languages.

> And what is this "Scheme community" that rejected R6RS? It's always been my impression that it was rejected by authors of toy implementations.

Of course, because Gambit, Chicken, and Gauche -- among others -- are mere toys.

A toy implementation that implements all of R5RS is at least possible. You can't build a toy implementation of Common Lisp, Python, or Rust, that covers even a small fraction of the core language. But you can build a Scheme that's nearly or entirely R5RS-compliant in maybe a few thousand lines of C, and from there build out the language to incorporate Python- or Rust-like features. "Toy" Schemes can even be useful in production, as an embedded scripting layer to a larger system.

cwyers · on Nov 28, 2020

The reason for Racket and its large library to embrace a large standard is the amount of things that Racket doesn't have to implement itself because it's in the standard already.

bitwize · on Nov 28, 2020

That's not how standards work. The standard just specifies something; you still have to implement it, unless the standard supplies a reference implementation which you can make work with your code. There's a reason why the quasi-standards for extensions to Scheme are called Scheme Requests for Implementation.

Neither R6RS nor any other RnRS standard contains a reference implementation, so whether it's in R6RS or not, the Racket developers have to implement it. RnRS standards do tend to have a description of Scheme's formal semantics, and SRFI documents, by convention, contain a reference implementation of the proposed API where possible.

rscho · on Nov 27, 2020

Isn't Chicken actually slower than Racket was before the switch? IMO its strong point is its seamless C FFI more than its performance. Chez also has a small C core with the rest in Scheme, which I think was the main motive for the switch.

jhoechtl · on Nov 27, 2020

What are the reasons Racket CS is, or at least used to be, slower than Racket BC? Was the whole transition towards CS not primarily to boost performance? Were expectations too high in this regard?

samth · on Nov 27, 2020

1. The big performance issues early on were compilation speed -- Racket had a simple JIT compiler that ran almost instantly.

2. Some rewritten parts of the system, such as IO, were initially slower, but have now been improved to catch up to the previous implementation.

3. Chez Scheme didn't support some performance features that Racket BC had, such as unboxed floating point arithmetic, parallel GC, or continuation marks. Those have been added to (Racket's variant of) Chez Scheme.

4. Racket's prior compiler was somewhat more competitive with Chez Scheme than we expected, particularly for larger and more Racket-idiomatic programs.

dreamcompiler · on Nov 27, 2020

Flatt has written elsewhere that the motivation to use Chez was not speed per se but better maintainability and portability, and that required rewriting more of the C core in Racket itself [0]. Doing so would have slowed down all of Racket considerably, but by using Chez it became possible without a big speed penalty. At first there was such a penalty because the rewritten stuff was still slower than C, but with tuning that has mostly gone away.

[0] https://blog.racket-lang.org/2018/01/racket-on-chez-status.h...

tux1968 · on Nov 27, 2020

They took the opportunity to move some of the infrastructure from C to Racket itself:

   Mostly, we did reimplement the C stuff in Racket. The I/O
   subsystem, the concurrency subsystem (which includes the 
   scheduler for “green” threads, Concurrent ML-style events,
   and custodians), and the regexp matcher were all rewritten
   in Racket. Those pieces followed the rewrite of the macro 
   expander in Racket.

So there was a slowdown until those pieces matured. But now they get the benefit of having all those pieces at a higher level of abstraction and more accessible to Racket programmers.

ywei3410 · on Nov 27, 2020

To add to the other reasons - the CS compiler optimizes different forms, and Racket is currently optimized for its own compiler. To give a concrete example, taken directly from the Racket blog [1]:

> For example, Racket BC makes dormant code relatively cheap, so Racket libraries generate a lot of code. Future Racket libraries will likely shift to take advantage of Racket CS’s cheaper function calls and dramatically cheaper continuations. One day, probably, Racket BC will no longer be a viable alternative to Racket CS for most programs.

[1] https://blog.racket-lang.org/2020/02/racket-on-chez-status.h...

Janet29 · on Nov 27, 2020

In my opinion, Dr Racket is the most bloated and most sluggish programming editor ever written. And the editor on Windows constantly displays strange unwanted lines (see for example this screenshot [1]). I have complained numerous times to Racket's news group about this, but it has never been fixed. Very strange community, very poor tool.

[1] https://pasteboard.co/JCkb0tM.png

shakna · on Nov 27, 2020

That looks like underline has been enabled for parentheses. Dr. Racket is _highly_ configurable. I believe you'll find the option you're looking for at:

Preferences->Colors->Racket->Parenthesis (Underline)

andrewflnr · on Nov 27, 2020

That looks like it's working as intended, to mark the end of the function. The lines from a symbol to its definition look even weirder, but I can see the appeal. A setting to toggle them would be reasonable, but they're not going to get rid of them, if that's what you asked for.

nyanpasu64 · on Nov 27, 2020

I think the issue is because you're running on 125%-ish DPI scaling, and the program doesn't handle it well.

Janet29 · on Nov 28, 2020

Yes, I do have 125% DPI on my computer. When I set it back to 100% DPI, the strange lines don't show. But I want to keep 125% DPI and the point is that Dr Racket should respect that. But, Dr Racket is the only program that displays strange artifacts at 125% DPI. Neither VS Code nor Sublime nor Notepad++ does that. As I already mentioned, I reported this bug many times during the past few years to the Racket group, but the damn lines are still there, and that makes the editor totally unusable. What should I say? Strange community, they don't care about this serious bug, they are preoccupied with the wrong stuff of "speeding" the language with Chez Scheme. The result is a crappy, unusable tool and angry users.

samth · on Nov 28, 2020

Can you point to where this bug report was? I'll take a look at what happened.

jasonwa · on Nov 28, 2020

It seems that someone reported the issue more than three years ago: https://groups.google.com/g/racket-users/c/6dIb242Ve8s/m/us8...

nyanpasu64 · on Nov 28, 2020

Did you mention it's caused by fractional DPI scaling, when reporting the bug?

You can set Dr. Racket to "system" DPI scaling (by accessing the .exe or .lnk properties), at the cost of rendering at 96dpi and being blurrily upscaled. The program is not good, the result sucks, but it technically works.

mitch_br · on Nov 27, 2020

My experience with Dr Racket is the same.