Stak Scheme: The tiny R7RS-small implementation

Scheme Workshop 2025

Yota Toyama

Background

  • Ribbit Scheme, the tiny R4RS implementation
    • Bytecode compiler: Scheme
    • Virtual Machine (VM): x86-64 assembly, C, JavaScript, Bash, etc.
      • Simple, portable, compact, and fast
  • Can we implement the entire R7RS-small standard on the Ribbit VM? 🤔
    • Yes, we can!

Stak Scheme

  • Stak Scheme, the tiny R7RS-small implementation
  • Open source on GitHub: raviqqe/stak

Comparison to Ribbit Scheme

                   Stak           Ribbit
Data structure     Pair           Rib
Bytecode encoding  Dynamic cache  Global cache + continuation/constant
Compiler           Scheme         Scheme
VM                 Rust           Many languages

Virtual machine

  • A stack machine
  • Everything is a pair.
    • Bytecode
    • Values
      • Lists, characters, strings, etc.
    • A stack
  • Binary-level homoiconicity
  • "Von Neumann architecture"
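
To make "everything is a pair" concrete, here is a minimal sketch of a stack machine whose bytecode, values, and stack are all ordinary Scheme pairs. The instruction names (constant, add, multiply) and the run procedure are assumptions made up for this illustration; they are not Stak's actual instruction set.

```scheme
(import (scheme base))

;; Hypothetical instruction set, not Stak's actual one.
;; A program is a list of instructions, and each instruction is itself a
;; pair (opcode . operand), so code is plain Scheme data.
(define (run code stack)
  (if (null? code)
      (car stack)
      (let ((opcode (car (car code)))
            (operand (cdr (car code))))
        (case opcode
          ;; Push the operand onto the stack.
          ((constant) (run (cdr code) (cons operand stack)))
          ;; Pop two values and push their sum.
          ((add)
           (run (cdr code)
                (cons (+ (cadr stack) (car stack)) (cddr stack))))
          ;; Pop two values and push their product.
          ((multiply)
           (run (cdr code)
                (cons (* (cadr stack) (car stack)) (cddr stack))))
          (else (error "unknown instruction" opcode))))))

;; Computes (1 + 2) * 3.
(define program
  '((constant . 1) (constant . 2) (add . #f) (constant . 3) (multiply . #f)))

(define result (run program '()))
```

Because the program is a list of pairs, the same procedures that manipulate values (car, cdr, cons) can also inspect or build code.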

Code graph

  • An in-memory representation of a Scheme program
  • Directed Acyclic Graph (DAG) of pairs
  • Used both at compile time in the compiler and at run time in the VM.
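
The difference between a tree and a DAG of pairs can be sketched in plain Scheme. In this illustration one node is referenced twice, so the graph holds fewer distinct pairs than the equivalent tree would. The procedure names are hypothetical helpers for this sketch only, not part of Stak.

```scheme
(import (scheme base))

;; One node for the call (display "foo"), referenced twice.
(define call (list 'display "foo"))
(define graph (list call call))

;; Count pairs as if the structure were a tree (shared nodes count twice).
(define (count-tree-pairs node)
  (if (pair? node)
      (+ 1 (count-tree-pairs (car node)) (count-tree-pairs (cdr node)))
      0))

;; Count distinct pairs by remembering the ones already visited.
(define (count-graph-pairs root)
  (define visited '())
  (let visit ((node root))
    (when (and (pair? node) (not (memq node visited)))
      (set! visited (cons node visited))
      (visit (car node))
      (visit (cdr node))))
  (length visited))

(define shared? (eq? (car graph) (cadr graph)))
(define tree-size (count-tree-pairs graph))
(define graph-size (count-graph-pairs graph))
```

Here shared? is #t, tree-size is 6, and graph-size is 4: the two calls are the same object in memory rather than two copies.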

Examples

If instruction

Scheme

(display (if x "foo" "bar"))

If instruction

Code graph

Duplicate strings

Scheme

(display "foo")
(display "foo")
(display "bar")

Duplicate strings

Code graph

Library system

Scheme

(define-library (foo)
  (export foo)

  (begin
    (define foo 123)))
(import (prefix (foo) bar-))

(define foo 456)

(+ bar-foo foo)

Library system

Code graph

Encoding & decoding

  • A code graph is encoded by a topological sort.
  • The compiler encodes a code graph into a byte sequence.
  • The VM decodes a byte sequence into a code graph.
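
As a rough sketch of the round trip, the following encodes a structure of pairs into bytes and decodes it back with a stack. The byte format is invented for this example and is much simpler than Stak's real encoding; it handles only trees of small integers and ignores sharing. The post-order traversal stands in for the topological sort: every child is emitted before the pair that refers to it.

```scheme
(import (scheme base))

;; Hypothetical byte format, not Stak's actual encoding:
;;   0..253  push that integer
;;   254     push the empty list
;;   255     pop cdr, pop car, push (car . cdr)

(define (encode node)
  ;; Post-order traversal: children are emitted before their parent pair.
  (cond
    ((pair? node)
     (append (encode (car node)) (encode (cdr node)) (list 255)))
    ((null? node) (list 254))
    ((and (exact-integer? node) (<= 0 node 253)) (list node))
    (else (error "unsupported value" node))))

(define (decode bytes)
  ;; Rebuild the structure with a stack while scanning the bytes in order.
  (let loop ((index 0) (stack '()))
    (if (= index (bytevector-length bytes))
        (car stack)
        (let ((byte (bytevector-u8-ref bytes index)))
          (loop (+ index 1)
                (cond
                  ((= byte 255)
                   (cons (cons (cadr stack) (car stack)) (cddr stack)))
                  ((= byte 254) (cons '() stack))
                  (else (cons byte stack))))))))

(define graph '(1 (2 3) 4))
(define bytes (apply bytevector (encode graph)))
(define decoded (decode bytes))
```

In this sketch the encoder plays the compiler's role and the decoder plays the VM's role; decoded is equal? to graph.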

Encoding merges

  • Merged pairs (nodes) are cached locally and dynamically.
  • On the first visit, the pair is added to the cache.
  • On the last visit, the pair is removed from the cache.
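
The caching rule above can be sketched as follows. This is an illustration under assumptions, not Stak's encoder: it emits symbolic tokens instead of bytes, and the token names (atom, cons, share, ref, take) are hypothetical. A pair reached from more than one parent is added to the cache on its first visit, referenced by cache index on later visits, and removed on its last visit.

```scheme
(import (scheme base))

;; Remove the first element that is eq? to item.
(define (delete-eq item items)
  (cond ((null? items) '())
        ((eq? (car items) item) (cdr items))
        (else (cons (car items) (delete-eq item (cdr items))))))

;; Count how many times each pair is reached from its parents.
(define (count-visits root)
  (define counts '())
  (let visit ((node root))
    (when (pair? node)
      (let ((entry (assq node counts)))
        (if entry
            (set-cdr! entry (+ (cdr entry) 1))
            (begin
              (set! counts (cons (cons node 1) counts))
              (visit (car node))
              (visit (cdr node)))))))
  counts)

(define (encode root)
  (define counts (count-visits root)) ; remaining visits per pair
  (define cache '())                  ; merged pairs still needed
  (define output '())                 ; tokens in reverse order
  (define (emit token)
    (set! output (cons token output)))
  (define (index-of node)
    (let loop ((items cache) (index 0))
      (if (eq? (car items) node)
          index
          (loop (cdr items) (+ index 1)))))
  (define (visit node)
    (cond
      ((not (pair? node))
       (emit (list 'atom node)))
      ((memq node cache)
       ;; Later visit of a merged pair: refer to it by cache index.
       (let ((entry (assq node counts))
             (index (index-of node)))
         (set-cdr! entry (- (cdr entry) 1))
         (cond
           ((zero? (cdr entry))
            ;; Last visit: the pair leaves the cache.
            (set! cache (delete-eq node cache))
            (emit (list 'take index)))
           (else
            (emit (list 'ref index))))))
      (else
       (visit (car node))
       (visit (cdr node))
       (emit (list 'cons))
       (let ((entry (assq node counts)))
         (when (> (cdr entry) 1)
           ;; First visit of a merged pair: the pair enters the cache.
           (set-cdr! entry (- (cdr entry) 1))
           (set! cache (cons node cache))
           (emit (list 'share)))))))
  (visit root)
  (reverse output))

(define (decode tokens)
  (let loop ((tokens tokens) (stack '()) (cache '()))
    (if (null? tokens)
        (car stack)
        (let ((token (car tokens))
              (rest (cdr tokens)))
          (case (car token)
            ((atom) (loop rest (cons (cadr token) stack) cache))
            ((cons)
             (loop rest
                   (cons (cons (cadr stack) (car stack)) (cddr stack))
                   cache))
            ((share) (loop rest stack (cons (car stack) cache)))
            ((ref)
             (loop rest (cons (list-ref cache (cadr token)) stack) cache))
            ((take)
             (let ((node (list-ref cache (cadr token))))
               (loop rest (cons node stack) (delete-eq node cache))))
            (else (error "unknown token" token)))))))

;; The call (display "foo") is one node used twice.
(define foo (list 'display "foo"))
(define graph (list foo foo (list 'display "bar")))
(define tokens (encode graph))
(define decoded (decode tokens))
```

Because the decoder mirrors every cache update the encoder makes, the indices stay in sync, and the decoded graph keeps the sharing: its first two elements are the same pair.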

eval and compiler

  • The compiler from S-expressions to code graphs is itself data.
  • (incept source) embeds the compiler as a library into source code.
  • ((eval compiler) source) compiles the source code.
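
The idea of a compiler that is itself data can be sketched with standard R7RS eval. The toy compiler below is a quoted lambda expression that turns an arithmetic expression into a list of stack instructions; it is a hypothetical stand-in, far simpler than Stak's compiler, and it does not use incept. Note that R7RS eval takes an explicit environment argument, which the slide's shorthand omits.

```scheme
(import (scheme base) (scheme eval))

;; A toy compiler stored as data: a quoted lambda expression.
;; It handles only constants and binary calls such as (+ 1 2).
(define compiler
  '(lambda (source)
     (let compile ((expression source))
       (if (pair? expression)
           ;; Compile both operands, then emit the call.
           (append (compile (cadr expression))
                   (compile (car (cddr expression)))
                   (list (list 'call (car expression))))
           (list (list 'constant expression))))))

(define source '(+ 1 (* 2 3)))

;; Same shape as ((eval compiler) source), with an explicit environment.
(define code ((eval compiler (environment '(scheme base))) source))
```

Here code is ((constant 1) (constant 2) (constant 3) (call *) (call +)). Since the compiler is an ordinary S-expression until eval is applied, it can be embedded in a program like any other constant.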

Compactness

       Lines of code (LOC)  Binary size (KB)
mstak  9,127                108,648
tr7i   16,891               301,536

Future work

WIP

Acknowledgements

WIP

References

Appendix

Code graph in depth

  • A pair consists of car, cdr, and a tag on the side of cdr.
    • Tags represent either instructions or data types.
  • Universal representation for both in-memory bytecode and Scheme values
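
A tagged pair can be modelled in portable Scheme with a record, as a sketch of how one layout can serve both instructions and values. The record name, accessor names, tag symbols, and the layout of the string value are all assumptions for illustration; Stak's VM is written in Rust and its actual tags and layouts may differ.

```scheme
(import (scheme base))

;; A pair with a car, a cdr, and a tag stored alongside the cdr.
(define-record-type tagged-pair
  (make-tagged-pair head tail tag)
  tagged-pair?
  (head tagged-pair-car)
  (tail tagged-pair-cdr)
  (tag tagged-pair-tag))

;; Hypothetical layout of a value: a string as its length plus its characters.
(define value
  (make-tagged-pair 3 (list #\f #\o #\o) 'string))

;; Hypothetical layout of an instruction: its operand in the car and the
;; next instruction in the cdr.
(define instruction
  (make-tagged-pair value '() 'constant))

;; The same accessors walk both bytecode and data.
(define operand-tag (tagged-pair-tag (tagged-pair-car instruction)))
```

In this sketch operand-tag is the symbol string: the instruction's operand is reached with the same accessor that reads a value's fields, and only the tag tells the two apart.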

Fibonacci function

Examples

WIP