Stak Scheme: The tiny R7RS-small implementation

	Stak	Ribbit
"Bytecode" encoding	Structured memory snapshot	Serialization + ad-hoc merging
`eval` procedure	The compiler itself	A separate library

	Lines of code	Binary size (bytes)
mstak	9,127	108,648
tr7i	16,891	301,536

Benchmark	mstak	stak	mstak (embed)	stak (embed)	tr7i	gsi	chibi	gosh
empty	1.00	1.04	0.14	0.38	0.77	0.51	3.63	1.27
hello	1.00	1.04	0.13	0.36	0.73	0.53	9.84	3.62
fibonacci	1.00	1.12	0.96	1.05	1.35	1.66	0.93	0.45
sum	1.00	1.13	1.01	1.06	1.19	1.64	0.98	0.24
tak	1.00	1.09	0.89	0.98	0.96	1.23	1.21	0.54

I'm Yota Toyama, a developer of Stak Scheme. In this talk, I introduce a new tiny R7RS Scheme implementation called Stak Scheme.

There is a tiny R4RS Scheme implementation called Ribbit Scheme. It is very tiny: its R4RS REPL fits in 7 KB. Its VM aims to be simple, portable, compact, and fast at the same time. One of the primary features of Ribbit Scheme is the split architecture of the compiler and the virtual machine. The compiler compiles source code in Scheme into bytecode. The virtual machine runs the bytecode as a Scheme program. Proving its portability, the VM is implemented in various host languages including assembly, C, JavaScript, and even Bash.

RVM, Ribbit Scheme's virtual machine, is very compact and reasonably fast. The question is: can we implement the entire R7RS-small, the latest standard of Scheme, on RVM?

And the answer is yes, we did.

That's how the Stak Scheme project started. Stak Scheme is still tiny but implements the entire R7RS-small standard on RVM. Its design and goals are similar to those of Ribbit Scheme. Stak Scheme is primarily designed as an embedded scripting language, but it can also run by itself as a standalone interpreter on the command line. In terms of differences from Ribbit Scheme, Stak Scheme uses a different encoding scheme for bytecode, and the `eval` procedure is implemented differently. I'm gonna focus on these two topics in today's talk.

Before going into the details of Stak Scheme, let me briefly explain RVM, on which Ribbit and Stak Scheme are implemented. Ribbit Scheme's virtual machine is called the Ribbit Virtual Machine (RVM), and it is a typical stack machine. On a virtual machine, we need to represent bytecode and Scheme values in some way. On RVM, everything is represented as a list, including all Scheme values, the bytecode, and the VM's state such as the call stack. In other words, RVM adopts a "von Neumann architecture": you can manipulate code and data in exactly the same way.

On RVM, the representation of a Scheme program is called a code graph. This example shows the code graph of the Fibonacci function. A code graph is just a DAG of pairs containing both the code and the data of a Scheme program. Because of that, to implement the `eval` procedure, for example, we simply compile an S-expression into a code graph and execute it as a procedure. I'm gonna talk more about it later.
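To make the "compile, then execute" view of `eval` concrete, here is a minimal sketch in Python. It is not Stak Scheme's or Ribbit's actual instruction set: the `const` and `call` opcodes, the tuple layout `(opcode, operand, next)`, and the names `compile_expr`, `run`, and `evaluate` are all assumptions made up for illustration. The only point is that the code and the stack are both plain linked cells, so evaluating an expression is just compiling it into a graph and running that graph.

```python
# Hypothetical primitives; a real VM would provide many more.
PRIMITIVES = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def compile_expr(expr, next_code):
    """Compile a nested-tuple S-expression into linked instruction cells."""
    if isinstance(expr, int):
        return ("const", expr, next_code)
    operator, *arguments = expr
    code = ("call", (operator, len(arguments)), next_code)
    # Arguments run left to right, so the chain is built back to front.
    for argument in reversed(arguments):
        code = compile_expr(argument, code)
    return code


def run(code):
    """Execute a code graph on a stack that is itself a linked list."""
    stack = None  # each stack cell is a (value, rest) pair
    while code is not None:
        opcode, operand, code = code
        if opcode == "const":
            stack = (operand, stack)
        else:  # "call"
            name, arity = operand
            arguments = []
            for _ in range(arity):
                value, stack = stack
                arguments.append(value)
            # Values were popped last-first, so restore the original order.
            stack = (PRIMITIVES[name](*reversed(arguments)), stack)
    return stack[0]


def evaluate(expr):
    """An eval-like procedure: compile to a code graph, then run it."""
    return run(compile_expr(expr, None))


print(evaluate(("+", 1, ("*", 2, 3))))  # 7
```

In this sketch the compiler is an ordinary function over the same kind of cells that the interpreter loop consumes, which is why no separate interpreter for S-expressions is needed.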

The code graph is used in two places. In Ribbit Scheme and Stak Scheme, the compiler compiles source code into a code graph. The virtual machine, RVM, runs the code graph as a program. We have extra encoding and decoding steps to store and load a code graph as a byte sequence. In both the compiler and the VM, we use code graphs as the representation of a compiled Scheme program.

The encoding of a code graph in Stak Scheme is conceptually a structured memory snapshot. Its purpose is to transfer a code graph, as a compiled Scheme program, from the compiler into the VM through the encoding and decoding algorithms. Behind the scenes, it works just like a topological sort with a cache table for shared nodes. The point is that this cache table is implemented in the VM's heap memory. It does not require any extra complex data structure in the host language, such as a hash map, which contributes to the portability of the VM.
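The idea of encoding a DAG in topological order with a cache for shared nodes can be sketched as follows. This is a toy illustration in Python under stated assumptions, not Stak Scheme's actual byte format: the token names (`int`, `pair`, `share`, `ref`), the `Pair` class, and the `encode` and `decode` functions are all hypothetical. The decoder keeps its cache in a plain list and looks entries up by position, so it needs no hash map, which mirrors the portability argument above.

```python
class Pair:
    """A cons cell; leaves of the graph are plain integers."""

    def __init__(self, car, cdr):
        self.car = car
        self.cdr = cdr


def encode(root):
    """Flatten a DAG of pairs into tokens, children before parents."""
    # First pass: count how often each pair is reached to find shared nodes.
    counts = {}

    def count(node):
        if isinstance(node, Pair):
            counts[id(node)] = counts.get(id(node), 0) + 1
            if counts[id(node)] == 1:
                count(node.car)
                count(node.cdr)

    count(root)

    tokens = []
    cache = []  # shared nodes already emitted, in emission order

    def emit(node):
        if not isinstance(node, Pair):
            tokens.append(("int", node))
            return
        for index, cached in enumerate(cache):
            if cached is node:
                tokens.append(("ref", index))  # reuse instead of re-emitting
                return
        emit(node.car)
        emit(node.cdr)
        tokens.append(("pair",))
        if counts[id(node)] > 1:
            cache.append(node)
            tokens.append(("share",))  # tell the decoder to cache this node

    emit(root)
    return tokens


def decode(tokens):
    """Rebuild the graph with a stack and a list-based cache."""
    stack = []
    cache = []
    for token in tokens:
        kind = token[0]
        if kind == "int":
            stack.append(token[1])
        elif kind == "pair":
            cdr = stack.pop()
            car = stack.pop()
            stack.append(Pair(car, cdr))
        elif kind == "share":
            cache.append(stack[-1])
        else:  # "ref"
            stack.append(cache[token[1]])
    return stack.pop()


shared = Pair(1, 2)
root = Pair(shared, Pair(shared, 3))
copy = decode(encode(root))
print(copy.car is copy.cdr.car)  # True: sharing survives the round trip
```

Because the encoder and the decoder append to their caches in the same order, an index in a `ref` token always names the same node on both sides.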

In Ribbit Scheme, the `eval` procedure is implemented as a library attached to a main program. But in Stak Scheme, the compiler itself is part of the `eval` procedure. We took this design because the compiler is relatively large for R7RS, as it includes the macro and library systems. It would be very tedious to maintain two separate implementations of the compiler, so we need some way to make the compiler available at runtime.

Stak Scheme implements the R7RS-small standard. Compared to R4RS, the biggest features in R7RS-small include hygienic macros and the library system. In a world without the `eval` procedure, we do not need macros and libraries at runtime: we expand them at compile time. However, the `eval` procedure needs their information at runtime.

First, we compared the compactness of Stak Scheme with TR7. TR7 was the tiniest R7RS-small implementation before Stak Scheme. One of the biggest reasons Stak Scheme is so tiny is that it is implemented mostly in Scheme itself, so most of the interpreter logic fits into the compact bytecode.

In terms of speed, Stak Scheme is comparable with the other Scheme implementations. It's steadily faster than TR7 and the interpreter of Gambit Scheme. It's still far behind modern interpreters such as Gauche.

RVM looks good from every perspective so far, but it has a weakness: it is not as secure as other modern VMs due to its flexibility. For example, because of the unified representation of code and data, user input might maliciously try to modify the code of the Scheme program dynamically. To prevent that, we need type checking in primitives on RVM. Porting the VM to another host language is another goal for Stak Scheme, to actually prove its portability as Ribbit Scheme has done.
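As a rough illustration of what type checking in a primitive could look like, here is a small Python sketch. The `Cell` class, the `PAIR` and `PROCEDURE` tags, and the `set_car` function are assumptions invented for this example and do not describe how Stak Scheme or RVM tags its values. The sketch only shows the general shape of the guard: a mutating primitive inspects the tag of its target and refuses to touch cells that hold code.

```python
PAIR = "pair"            # hypothetical tag for ordinary data pairs
PROCEDURE = "procedure"  # hypothetical tag for cells that hold code


class Cell:
    """A tagged heap cell shared by code and data."""

    def __init__(self, tag, car, cdr):
        self.tag = tag
        self.car = car
        self.cdr = cdr


def set_car(cell, value):
    """A set-car!-like primitive that mutates only ordinary pairs."""
    if not isinstance(cell, Cell) or cell.tag != PAIR:
        raise TypeError("set-car!: expected a pair")
    cell.car = value


data = Cell(PAIR, 1, 2)
set_car(data, 42)  # allowed: the target is a data pair

code = Cell(PROCEDURE, "body", None)
try:
    set_car(code, "evil")  # rejected: the target holds code
except TypeError as error:
    print(error)
```

The trade-off is one extra tag comparison per primitive call in exchange for keeping user data from rewriting the program's own code.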

On decoding, we do the same thing but in reverse order.

Stak Scheme: The tiny R7RS-small implementation

Scheme Workshop 2025

Background

Can we implement the entire R7RS-small standard on RVM?

Yes, we can!

Stak Scheme

RVM in depth

Code graph

Compiling and running a program

Example

Scheme

Example

Code graph

Encoding for structured memory snapshot

`eval` and the compiler

Macros and libraries in `eval`

Compactness

Benchmarks

Future work

Acknowledgements

Interpreter demo

Appendix

If instruction

Duplicate strings

Code graph in depth

Fibonacci function

Encoding shared nodes

References
