Planet Scheme

Thursday, November 30, 2023

Scheme Requests for Implementation

SRFI 251: Mixing groups of definitions with expressions within bodies

SRFI 251 is now in draft status.

Scheme has traditionally required procedure bodies and the bodies of derived constructs such as let to contain definitions followed by commands/expressions. This SRFI proposes to allow mixing commands and groups of definitions in such bodies, so that each command/expression is in the scope of all local definitions preceding it, but not in scope of the local definitions following it. This approach is backwards compatible with R7RS and upholds the intuitive rule that to find the definition of a lexical variable, one has to look up the source code tree.

by Sergei Egorov at Thursday, November 30, 2023

SRFI 243: Unreadable Data

SRFI 243 is now in withdrawn status.

This SRFI suggests how the Scheme reader and writer should handle unreadable data in general, and unreadable objects in particular.

by Lassi Kortela at Thursday, November 30, 2023

spritely.institute

Building interactive web pages with Guile Hoot

Hoot owl with a to-do list scroll

A question we frequently hear in discussions about WebAssembly (Wasm) is:

"Can Wasm call the DOM (Document Object Model) API?"

The answer is: Yes, thanks to Wasm Garbage Collection!

In this post, we will use Guile Hoot (our Scheme to Wasm compiler) to demonstrate how a language that compiles to Wasm GC can be used to implement the kind of interactivity we're used to implementing with JavaScript. We'll start very simple and build up to something that resembles a React-like application.

In our previous post about running Scheme in the browser, we had to use quite a lot of JavaScript code to call Scheme procedures (functions), render the user interface, and handle user input. However, today we're pleased to announce that those days are behind us; Hoot 0.2.0, released today, now includes a foreign function interface (FFI) for calling JavaScript from Scheme. In other words, the vast majority of code in a Hoot application can now be written directly in Scheme!

Hello, world!

Let's start with a "Hello, world" application that simply adds a text node to the document body.

(define-foreign document-body
  "document" "body"
  ;; Parameters: none
  ;; Result: an external reference which may be null
  -> (ref null extern))

(define-foreign make-text-node
  "document" "createTextNode"
  ;; Parameters: a string
  ;; Result: an external reference which may be null
  (ref string) -> (ref null extern))

(define-foreign append-child!
  "element" "appendChild"
  ;; Parameters: two external references which may be null
  ;; Result: an external reference which may be null
  (ref null extern) (ref null extern) -> (ref null extern))

(append-child! (document-body) (make-text-node "Hello, world!"))

The define-foreign syntax declares a Wasm import with a given signature that is bound to a Scheme variable. In this example, we'd like access to document.body, document.createTextNode, and element.appendChild.

Each import has a two-part name, which correspond to the strings in the define-foreign expressions above. Imported functions have a signature specifying the parameter and result types.

WebAssembly follows the capability security model (and if you know us, you know we're big fans). Wasm modules act as guests within a host environment — in our case, the host is the browser. By default, a Wasm module has no access to functions that interact with its host. The Wasm guest must be granted explicit permission by its host to be able to, for example, add DOM elements to a web page.

The way capabilities work in Wasm is as follows:

  • The Wasm module declares a set of imports.
  • When the Wasm module is instantiated, the host maps the import names to concrete implementations.

So, define-foreign merely declares the module's need for a particular function import; it does not bind to a particular function on the host. The Wasm guest cannot grant capabilities unto itself.

The host environment in the browser is the JavaScript engine, so we must use JavaScript to bootstrap our Wasm program. To do so, we instantiate the Wasm module and pass along the implementations for the declared imports. The import names in JavaScript match up to the import names in the Wasm module. For our "Hello, world" example, that code looks like this:

window.addEventListener("load", function() {
  Scheme.load_main("hello.wasm", {}, {
    document: {
      body() { return document.body; },
      createTextNode: Document.prototype.createTextNode.bind(document)
    },
    element: {
      appendChild(parent, child) { return parent.appendChild(child); }
    }
  });
});

The Scheme class is provided by our JavaScript integration library, reflect.js. This library provides a Scheme abstraction layer on top of the WebAssembly API and is used to load Hoot binaries and manipulate Scheme values from JavaScript.

The result:

The text "Hello, world!" rendered in a web browser.

HTML as Scheme data

Now that we've explained the basics, let's move on to something more interesting. How about rendering an entire tree of DOM elements? For that, we'll need to declare imports for document.createElement and element.setAttribute:

(define-foreign make-element
  "document" "createElement"
  (ref string) -> (ref null extern))

(define-foreign set-attribute!
  "element" "setAttribute"
  (ref null extern) (ref string) (ref string) -> none)

And here's the additional JavaScript wrapper code:

// These are added to the existing bindings from the previous section.
{
  document: {
    createElement: Document.prototype.createElement.bind(document),
    // ... the previously defined document functions
  },
  element: {
    setAttribute(elem, name, value) { elem.setAttribute(name, value); },
    // ... the previously defined element functions
  }
}

Now we need some markup to render. Thanks to the symbolic manipulation powers of Scheme, we have no need for an additional language like React's JSX to cleanly mix markup with code. Scheme has a lovely little thing called quote which we can use to represent arbitrary data as Scheme expressions. We'll use this to embed HTML within our Scheme code.

We use quote by prepending a single quote (') to an expression. For example, the expression (+ 1 2 3) calls + with the numbers 1, 2, and 3 and returns 6. However, the expression '(+ 1 2 3) (note the ') does not call + but instead returns a list of 4 elements: the symbol +, followed by the numbers 1, 2, and 3. In other words, quote turns code into data.

Now, to write HTML from within Scheme, we can leverage a format called SXML:

(define sxml
  '(section
    (h1 "Scheme rocks!")
    (p "With Scheme, data is code and code is data.")
    (small "Made with "
           (a (@ (href "https://spritely.institute/hoot"))
              "Guile Hoot"))))

section, h1, etc. are not procedure calls, they are literal symbolic data.

To transform this Scheme data into a rendered document, we need to walk the SXML tree and make the necessary DOM API calls, like so:

(define (sxml->dom exp)
  (match exp
    ;; The simple case: a string representing a text node.
    ((? string? str)
     (make-text-node str))
    ;; An element tree.  The first item is the HTML tag.
    (((? symbol? tag) . body)
     ;; Create a new element with the given tag.
     (let ((elem (make-element (symbol->string tag))))
       (define (add-children children)
         ;; Recursively call sxml->dom for each child node and
         ;; append it to elem.
         (for-each (lambda (child)
                     (append-child! elem (sxml->dom child)))
                   children))
       (match body
         ;; '@' denotes an attribute list.  Child nodes follow.
         ((('@ . attrs) . children)
          ;; Set attributes.
          (for-each (lambda (attr)
                      (match attr
                        ;; Attributes are (symbol string) tuples.
                        (((? symbol? name) (? string? val))
                         (set-attribute! elem
                                         (symbol->string name)
                                         val))))
                    attrs)
          (add-children children))
         ;; No attributes, just a list of child nodes.
         (children (add-children children)))
       elem))))

Then we can use our newly defined sxml->dom procedure to generate the element tree and add it to the document body:

(append-child! (document-body) (sxml->dom sxml))

The result:

SXML rendered to DOM elements.

HTML templating with Scheme

That was pretty neat, but we don't need JavaScript, let alone WebAssembly, to render a simple static document! In other words, we're far from done here. Let's introduce some interactivity by adding a button and a counter that displays how many times the button was clicked. We could take an imperative approach and modify the element by mutating the counter value every time the button is clicked. Indeed, for something this simple that would be fine. But just for fun, let's take a look at a more functional approach that uses a template to create the entire document body.

Scheme provides support for structured templating via quasiquote, which uses backticks (`) instead of single-quotes ('). Arbitrary code can be evaluated in the template by using unquote, represented by a comma (,). The expression `(+ 1 2 ,(+ 1 2)) produces the list (+ 1 2 3). The first (+ ...) expression is quoted, but the second is not, and thus (+ 1 2) is evaluated as code and the value 3 is placed into the final list element.

Scheme's quasiquote stands in stark contrast to the limited string templating available in most other languages. In JavaScript, we could do `1 + 2 + ${1 + 2}`, but this form of templating lacks structure. The result is just a flat string, which makes it clumsy to work with and vulnerable to injection bugs.

Below is an SXML template. It is a procedure which generates a document based on the current value of *clicks*, a global mutable variable that stores how many times the button was clicked:

;; It is a Scheme/Lisp naming convention to use asterisks, AKA
;; earmuffs, to mark global mutable variables.
(define *clicks* 0)

(define (template)
  `(div (@ (id "container"))
    (p ,(number->string *clicks*) " clicks")
    ;; Increment click counter when button is clicked.
    (button (@ (click ,(lambda (event)
                         (set! *clicks* (+ *clicks* 1))
                         (render))))
            "Click me")))

To handle interactive elements, we've added a new feature on top of SXML. Notice that the button has a click property with a procedure as its value. To get some interactivity, we need event handlers, and we've chosen to encode the click handler into the template much like how you could use the onclick attribute in plain HTML.

To make this work, we need imports for document.getElementById, element.addEventListener and element.remove:

(define-foreign get-element-by-id
  "document" "getElementById"
  (ref string) -> (ref null extern))

(define-foreign add-event-listener!
  "element" "addEventListener"
  (ref null extern) (ref string) (ref null extern) -> none)

(define-foreign remove!
  "element" "remove"
  (ref null extern) -> none)

And the JavaScript bindings:

{
  document: {
    getElementById: Document.prototype.getElementById.bind(document),
    // ...
  },
  element: {
    remove(elem) { elem.remove(); },
    addEventListener(elem, name, f) { elem.addEventListener(name, f); },
    // ...
  }
}

Here is what the attribute handling code in the sxml->dom procedure looks like now:

(match body
  ((('@ . attrs) . children)
   (for-each (lambda (attr)
               (match attr
                 (((? symbol? name) (? string? val))
                  (set-attribute! elem
                                  (symbol->string name)
                                  val))
                 ;; THIS IS THE NEW BIT!
                 ;;
                 ;; If the attribute's value is a procedure, add an
                 ;; event listener.
                 (((? symbol? name) (? procedure? proc))
                  (add-event-listener! elem
                                       (symbol->string name)
                                       (procedure->external proc)))))
             attrs)
   (add-children children))
  (children (add-children children)))

To keep things simple (for now), every time the button is clicked we'll delete what's in the document and re-render the template:

(define (render)
  (let ((old (get-element-by-id "container")))
    (unless (external-null? old) (remove! old)))
  (append-child! (document-body) (sxml->dom (template))))

The result:

Button click counter screenshot.

Building a virtual DOM

Wouldn't it be even cooler if we could apply some kind of React-like diffing algorithm that only updates the parts of the document that have changed when we need to re-render? Why yes, that would be cool! Let's do that now.

We'll add bindings for the TreeWalker interface to traverse the document, as well as some additional element methods to get/set the value and checked properties, remove event listeners, replace elements, remove attributes, and get event targets:

;; TreeWalker constructor:
(define-foreign make-tree-walker
  "document" "createTreeWalker"
  (ref null extern) -> (ref null extern))

;; TreeWalker API:
(define-foreign current-node
  "treeWalker" "currentNode"
  (ref null extern) -> (ref null extern))

(define-foreign set-current-node!
  "treeWalker" "setCurrentNode"
  (ref null extern) (ref null extern) -> (ref null extern))

(define-foreign next-node!
  "treeWalker" "nextNode"
  (ref null extern) -> (ref null extern))

(define-foreign first-child!
  "treeWalker" "firstChild"
  (ref null extern) -> (ref null extern))

(define-foreign next-sibling!
  "treeWalker" "nextSibling"
  (ref null extern) -> (ref null extern))

;; More element API:
(define-foreign element-value
  "element" "value"
  (ref null extern) -> (ref string))

(define-foreign set-element-value!
  "element" "setValue"
  (ref null extern) (ref string) -> none)

(define-foreign remove-event-listener!
  "element" "removeEventListener"
  (ref null extern) (ref string) (ref null extern) -> none)

(define-foreign replace-with!
  "element" "replaceWith"
  (ref null extern) (ref null extern) -> none)

(define-foreign remove-attribute!
  "element" "removeAttribute"
  (ref null extern) (ref string) -> none)

;; Event API:
(define-foreign event-target
  "event" "target"
  (ref null extern) -> (ref null extern))

And the JavaScript bindings:

{
  document: {
    createTreeWalker: Document.prototype.createTreeWalker.bind(document),
    // ...
  },
  element: {
    value(elem) { return elem.value; },
    setValue(elem, value) { elem.value = value; },
    checked(elem) { return elem.checked; },
    setChecked(elem, checked) { elem.checked = (checked == 1); },
    removeAttribute(elem, name) { elem.removeAttribute(name); },
    removeEventListener(elem, name, f) { elem.removeEventListener(name, f); },
    // ...
  },
  treeWalker: {
    currentNode(walker) { return walker.currentNode; },
    setCurrentNode(walker, node) { walker.currentNode = node; },
    nextNode(walker) { return walker.nextNode(); },
    firstChild(walker) { return walker.firstChild(); },
    nextSibling(walker) { return walker.nextSibling(); }
  },
  event: {
    target(event) { return event.target; }
  }
}

Now we can implement the diffing algorithm. Below is an abridged version, but you can see the beautiful, fully-fledged source code on GitLab:

(define (virtual-dom-render root old new)
  ;; <nested helper procedures have been omitted>
  (let ((walker (make-tree-walker root)))
    (first-child! walker)
    (let loop ((parent root)
               (old old)
               (new new))
      (match old
        (#f
         ;; It's the first render, so clear out whatever might be
         ;; in the actual DOM and render the entire tree.  No
         ;; diffing necessary.
         (let loop ((node (current-node walker)))
           (unless (external-null? node)
             (let ((next (next-sibling! walker)))
               (remove! node)
               (loop next))))
         (append-child! parent (sxml->dom new)))
        ((? string?)
         ;; Replace text node with either a new text node if the
         ;; text has changed, or an element subtree if the text
         ;; has been replaced by an element.
         (unless (and (string? new) (string-=? old new))
           (let ((new-node (sxml->dom new)))
             (replace-with! (current-node walker) new-node)
             (set-current-node! walker new-node))))
        (((? symbol? old-tag) . old-rest)
         (let-values (((old-attrs old-children)
                       (attrs+children old-rest)))
           (match new
             ((? string?)
              ;; Old node was an element, but the new node is a
              ;; string, so replace the element subtree with a
              ;; text node.
              (let ((new-text (make-text-node new)))
                (replace-with! (current-node walker) new-text)
                (set-current-node! walker new-text)))
             (((? symbol? new-tag) . new-rest)
              (let-values (((new-attrs new-children)
                            (attrs+children new-rest)))
                (cond
                 ;; The element tag is the same, so modify the
                 ;; inner contents of the element if necessary.
                 ((eq? old-tag new-tag)
                  (let ((parent (current-node walker)))
                    (update-attrs parent old-attrs new-attrs)
                    (first-child! walker)
                    (let child-loop ((old old-children)
                                     (new new-children))
                      (match old
                        (()
                         ;; The old child list is empty, so
                         ;; diffing stops here.  All remaining
                         ;; children in the new list are fresh
                         ;; elements that need to be added.
                         (for-each
                          (lambda (new)
                            (append-child! parent (sxml->dom new)))
                          new))
                        ((old-child . old-rest)
                         (match new
                           ;; The new child list is empty, so any
                           ;; remaining children in the old child
                           ;; list need to be removed, including
                           ;; the current one.
                           (()
                            (let rem-loop ((node (current-node walker)))
                              (unless (external-null? node)
                                (let ((next (next-sibling! walker)))
                                  (remove! node)
                                  (rem-loop next)))))
                           ;; Recursively diff old and new child
                           ;; elements.
                           ((new-child . new-rest)
                            (loop parent old-child new-child)
                            (next-sibling! walker)
                            (child-loop old-rest new-rest))))))
                    (set-current-node! walker parent)))
                 ;; New element tag is different than the old
                 ;; one, so replace the entire element subtree.
                 (else
                  (replace-with! (current-node walker)
                                 (sxml->dom new)))))))))))))

Now, instead of deleting and recreating every single element on the DOM tree to update the text for the counter, we can call the virtual-dom-render procedure and it will replace just one text node instead.

;; We'll diff the new vdom against the current one.
(define *current-vdom* #f)

(define (render)
  (let ((new-vdom (template)))
    (virtual-dom-render (document-body) *current-vdom* new-vdom)
    (set! *current-vdom* new-vdom)))

To-do app

Let's wrap things up with our take on a classic: the to-do list.

A to-do list application inside a web browser.

The humble to-do list is often used as a "Hello, world" of sorts for client-side UI libraries (there's even an entire website dedicated to them).

A to-do list has many tasks. Tasks have a name and a flag indicating if the task has been completed. We'll use Scheme's define-record-type to encapsulate a task:

(define-record-type <task>
  (make-task name done?)
  task?
  (name task-name)
  (done? task-done? set-task-done!))

Tasks can be created and deleted. We'll use a bit of global state for managing the task list as a substitute for actual persistent storage:

(define *tasks* '())

(define (add-task! task)
  (set! *tasks* (cons task *tasks*)))

(define (remove-task! task)
  (set! *tasks* (delq task *tasks*)))

Now we can define a template for rendering the UI:

(define (template)
  (define (task-template task)
    `(li (input (@ (type "checkbox")
                   ;; Toggle done? flag on click.
                   (change ,(lambda (event)
                              (let* ((checkbox (event-target event))
                                     (checked? (element-checked? checkbox)))
                                (set-task-done! task checked?)
                                (render))))
                   ;; Check the box if task is done.
                   (checked ,(task-done? task))))
         (span (@ (style "padding: 0 1em 0 1em;"))
               ;; Strikethrough if task is done.
               ,(if (task-done? task)
                    `(s ,(task-name task))
                    (task-name task)))
         (a (@ (href "#")
               ;; Remove task on click.
               (click ,(lambda (event)
                         (remove-task! task)
                         (render))))
            "remove")))
  `(div
    (h2 "Tasks")
    ;; Tasks are stored in reverse order.
    (ul ,@(map task-template (reverse *tasks*)))
    (input (@ (id "new-task")
              (placeholder "Write more Scheme")))
    ;; Add new task on click
    (button (@ (click ,(lambda (event)
                         (let* ((input (get-element-by-id "new-task"))
                                (name (element-value input)))
                           (unless (string-=? name "")
                             (add-task! (make-task name #f))
                             (set-element-value! input "")
                             (render))))))
            "Add task")))

For this final example, we've embedded the result in an iframe below. This requires a browser capable of Wasm GC and tail calls, such as Chrome 119 or Firefox 121:

It should be said that a React-like virtual DOM isn't necessarily the best way to implement a UI layer. We like how the abstraction turns the state of the UI into a function mapping application state to HTML elements, as it avoids an entire class of synchronization bugs that impact more imperative approaches. That said, the overhead introduced by a virtual DOM is not always acceptable. Simple web pages are better off with little to no client-side code so that they are more usable on low power devices. Still, a to-do application with a virtual DOM diffing algorithm is a neat yet familiar way to show off the expressive power of Hoot and its FFI.

Looking forward

With the introduction of an FFI, you can now implement nearly your entire web frontend in Scheme; the examples we've looked at today are but a glimpse of what's now possible!

In the future, we hope the Guile community will join us in developing a colorful variety of wrapper libraries for commonly-used web APIs, so that building with Hoot becomes increasingly fun and easy.

If you'd like to start playing around with the demos today, you can find the complete source code on GitLab.

To see everything that's new in Hoot 0.2.0, be sure to check out our release announcement post!

by Spritely Institute (contact@spritely.institute) at Thursday, November 30, 2023

Guile Hoot v0.2.0 released!

Hoot version 0.2.0

We are excited to announce the release of Guile Hoot v0.2.0! Hoot is a Scheme to WebAssembly compiler backend for Guile, as well as a general purpose WebAssembly toolchain. In other words, Scheme in the browser! It has been a busy two months since 0.1.0, and we've got a lot of goodies to share.

Highlights

  • Nearly all of R7RS-small is now implemented! Hoot 0.2.0 is now capable of running many more standard Scheme programs than 0.1.0.

  • A foreign function interface (FFI) has been added to make it easy to declare imported host functions and call them from Scheme.

  • User-defined record types, sorely missing from 0.1.0, have been added.

Read on for a detailed list of changes.

R7RS-small

  • Added record types and define-record-type syntax.

  • Added read.

  • Added case-lambda and case-lambda*.

  • Added system timer procedures current-second, current-jiffy, and jiffies-per-second.

  • Added promise procedures make-promise, promise, ``delay,force, anddelay-force`.

  • Added string->symbol.

  • Added partial string->number implementation.

  • Added character procedures char-upcase, char-downcase, char-titlecase, char-upper-case?, char-lower-case?, char-alphabetic?, char-whitespace?, char-numeric?, digit-value, char-ci<?, char-ci<=?, char-ci=?, char-ci>=?, and char-ci>?.

  • Added mutable string procedures string-set!, string-copy!, and string-fill!.

  • Added stubs for command-line, get-environment-variable, and get-environment-variables procedures that do not make much sense in a web context.

  • Added integer flonum support for quotient, remainder, and modulo.

  • Fixed string->?.

  • Fixed assq, assv, and assoc.

  • Fixed memq, memv, and member.

  • Fixed gcd, expt, and exact.

  • Fixed finite?, infinite?, and nan?.

  • Fixed peek-char and read-char for multibyte chars.

Non-standard interfaces

  • Added foreign function interface via define-foreign syntax.

  • Added support for unboxed parameters and results to %inline-wasm.

  • Added %wasm-import form for use by the FFI.

  • Added Guile's structured exception system. All errors raised in the Scheme prelude are now structured.

Toolchain

  • Added global lowering pass to more easily emit Scheme constants that are not Wasm constants.

  • Added support for symbols/keywords in secondary Wasm modules that import the ABI from a primary module.

  • Improved compiler tree shaker to reduce overall Wasm binary size.

  • Reduced the size of bytevector literals in Wasm binaries.

  • Fixed lowering of stringview_iter.slice instruction.

  • Fixed flonum constants sometimes being treated as fixnums.

  • Fixed printing of mutable strings in both JS and Scheme reflection libraries.

  • Fixed scm_from_char in reflect.wasm.

  • Fixed br_table implementation in Wasm interpreter.

  • Fixed mis-compilation of rational? and integer? primcalls.

  • Fixed switch optimization for repeated symbol comparisons.

Browser compatibility

  • Google Chrome 119 was released about a month ago and comes with Wasm GC and tail calls enabled by default. Hoot binaries now work in Chrome without fiddling with any experimental settings!

  • Firefox 120 was released with Wasm GC enabled, but tail calls will not be enabled until December 19th when Firefox 121 is released. For the time being, Firefox users can either install a Nightly build or enable javascript.options.wasm_tail_calls in about:config on Firefox 120.

  • Safari/WebKit is still unsupported.

Get Hoot 0.2.0!

Hoot is already available in GNU Guix:

$ guix pull
$ guix install guile-next guile-hoot

(Hoot currently requires a bleeding-edge version of Guile, hence guile-next above.)

Otherwise, Hoot can be built from source via our release tarball. See the Hoot landing page for a download link and GPG signature.

Documentation for Hoot 0.2.0, including build instructions, can be found here.

Check out our greatest hoots!

A "Greatest Hoots" album CD cover

For a detailed walkthrough of using the new foreign function interface, check out our cool new blog post about building interactive web pages with Hoot!

Robin Templeton and David Thompson recently presented about Hoot at the SeaGL conference in Seattle, WA. The talk was livestreamed and you can catch the replay on YouTube!

For bug reports, pull requests, or just to follow along with development, check out the Hoot project on GitLab.

If you build something cool with Hoot, let us know on our community forum! (Use OCAPN2023 for the Invite Code when you join.)

The code in this release was brought to you by Andy Wingo, Robin Templeton, David Thompson, and Christine Lemmer-Webber. The lovely Hoot art is by tessa. Special thanks to the MetaMask folks for funding this work!

Happy hooting! 🦉

by Spritely Institute (contact@spritely.institute) at Thursday, November 30, 2023

The Racket Blog

Racket v8.11.1


Racket version 8.11.1 is now available from https://racket-lang.org/

This bug-fix release repairs a problem with building from source when using the “builtpkgs” source distribution.

Feedback Welcome


by The Unknown Author at Thursday, November 30, 2023

Wednesday, November 29, 2023

Idiomdrottning

Applying thread patches with Emacs Notmuch

I have made this shell script called git-inled in my path.

#!/bin/sh

if git show-ref --verify --quiet "refs/heads/$1"; then
    git checkout "$1"
else
    git checkout main
    git checkout -b "$1"
fi

Then I had this snippet copied from somewhere and changed it from always wanting to create new branches to instead use the inled script:

(defun apply-thread-patchset (repo branch)
  (interactive "Dgit repo: \nsbranch name: ")
  (let ((tid notmuch-show-thread-id)
        (tmp "/tmp/notmuch-patchset"))
    (shell-command (format "notmuch-extract-patch %s > %s && ( cd %s && git inled %s && git am %s )"
                           (shell-quote-argument tid)
                           (shell-quote-argument tmp)
                           (shell-quote-argument (expand-file-name repo))
                           (shell-quote-argument branch)
                           (shell-quote-argument tmp)))))

Thanks to the script I added, if the branch already exists, it switches to it, and if it does not exist, it makes sure the new one is branched off of main.

Before, it’d always create new branches and it’d always create them off of where I already was (which got confusing if I was on a dev branch).

Most of the time I actually do want to accept the patches directly on main. That’s one of the advantages of this workflow after all. And that’s what this script also allows me to do, I just enter main as the branchname to apply to. Then I can use git tools on top of that if I want to fix it up further beyond what was addressed in the review process.

by Idiomdrottning (sandra.snan@idiomdrottning.org) at Wednesday, November 29, 2023

Wiki on the wider web

Alex writes:

Perhaps there is something to it, but perhaps there simply is something to writing hypertext. Alone.

I mean I don’t just make internal links. I link to other people and to Wikipedia. Ideally the entire web is one big web. “Wiki” was invented by Ward to make web quickly. Wiki means fast or quick.

Before wiki, most (not all, but most) people made web in one of two ways: bug-ridden GUI hells like Macromedia Dreamweaver, or hand-writing HTML the long way. M4 and other shortcuts weren’t something we, the slovenly post–Eternal-September arrivees, were familiar with. Wiki made “text-near, minimalist” markup mainstream.

And yes, Wards wiki was also publicly editable. If someone’s gonna argue that that’s an inalienable part of the definition of a wiki, that’s fine. When I switched from UseMod to Jekyll, I first thought “this is gonna be a blog, not a wiki” but I made some decisions early on, like “no dates in URLs” that has led me to keep editing and polishing old pages. It’s like the entire thing is permanently in a state of rough draft. That “wiki-like-ness” has made me feel a li’l guilty about not turning on public write-ability, at first. Then, the more I used it, something happened. I found myself feeling less like the curator of my own garden and more a part of something bigger and interconnected. I make links to my other pages, sure, I do write interconnectedly, but I also link to Wikipedia and to other blogs.

It’s all one big world wide web.

by Idiomdrottning (sandra.snan@idiomdrottning.org) at Wednesday, November 29, 2023

Sunday, November 26, 2023

Idiomdrottning

Judging a programming language

Here is something where I have a li’l bit of hypocrisy in my own bitter heart.

I’ve got a close friend who, whenever we’re talking about programming languages, I get the impression that he only judges them by the surfacest of surface syntax features. Does the code look like the languages of old that he’s used to? Then good. Otherwise bad.

Never a look at things like type systems, generics, comptime, metaprogramming, container heterogenity, polymorphism, inference, verbosity (well—he does favor explicitness whereas I favor implicitness & good defaults), first-class functions or the first-class–ness of any other codemes, performance, code reuse, ease of factoring and refactoring, flexibility etc.

No. It’s only “I think square brackets look good”, that’s the extent of it with him.

Now to the hypcrisy part. It’s not as if I don’t have a passion for surface syntax: I love sexps and I cannot lie. If I can’t paredit it’s not my revolution. And I generally hate grawlixes and redundancy. It’s just… that’s not the only thing I look at.

by Idiomdrottning (sandra.snan@idiomdrottning.org) at Sunday, November 26, 2023

Saturday, November 25, 2023

Idiomdrottning

Branching & Patching

Having read seven or eight posts about the “patch”-style, kernel.org style git workflow, I feel like this article by Drew explains it the straight-forwardliest.

That is pretty much how I do it too. Unless I’m in an organization with a different flow, of course. When in Rome… Both for my own stuff, and for the typical drive-by bazaar-contributions I do, this work well.

I don’t like calling it “branchless”, though, since just like Drew:

Sometimes I do use branches, though, when I know that a workstream is going to be a lot of work — it involves lots of large-scale refactoring, or will take several weeks to complete. This isolates it from my normal workflow on small-to-medium patches, acknowledging that the large workstream is going to be more prone to conflicts. By addressing these separately, I don’t waste my time fixing up the error-prone branch all the time while I’m working on my smaller workstreams.

One example of that for me is when I did the massive update of match-generics, which was one of the difficultest things I’ve ever written (ironic since the original version was just a one-hour hack wrapper for foof’s match-lambda). I kept the “rewrite” in its own branch until it was done, so I could still do maintenance fixes for the original. I knew the “rewrite” would take a while.

Once it was done, I could say goodbye to the old version.

This debate is pretty over now that there are tools that lets you get the best of your preferred workflow regardless. For example, there is notmuch-import-patch that turns a branch (from a pull-request or the like) into email so you can review and reply to it.

Here is how I apply patches.

by Idiomdrottning (sandra.snan@idiomdrottning.org) at Saturday, November 25, 2023

Friday, November 24, 2023

GNU Guix

Write package definitions in a breeze

More than 28,000 packages are available in Guix today, not counting third-party channels. That’s a lot—the 5th largest GNU/Linux distro! But it’s nothing if the one package you care about is missing. So even you, dear reader, may one day find yourself defining a package for your beloved deployment tool. This post introduces a new tool poised to significantly lower the barrier to writing new packages.

Introducing Guix Packager

Defining packages for Guix is not all that hard but, as always, it’s much harder the first time you do it, especially when starting from a blank page and/or not being familiar with the programming environment of Guix. Guix Packager is a new web user interface to get you started—try it!. It arrived right in time as an aid to the packaging tutorial given last week at the Workshop on Reproducible Software Environments.

Screenshot showing the Guix Packager interface.

The interface aims to be intuitive: fill in forms on the left and it produces a correct, ready-to-use package definition on the right. Importantly, it helps you avoid pitfalls that trip up many newcomers:

  • When you add a dependency in one of the “Inputs” fields, it adds the right variable name in the generated code and imports the right package module.
  • Likewise, you can choose a license and be sure the license field will refer to the right variable representing that license.
  • You can turn tests on and off, and add configure flags. These translate to a valid arguments field of your package, letting you discover the likes of keyword arguments and G-expressions without having to first dive into the manual.

Pretty cool, no?

Implementation

All the credit for this tool goes to co-worker and intrepid hacker Philippe Virouleau. A unique combination of paren aversion and web development superpowers—unique in the Guix community—led Philippe to develop the whole thing in a glimpse (says Ludovic!).

The purpose was to provide a single view to be able to edit a package recipe, therefore the application is a single-page application (SPA) written in using the UI library Philippe is most comfortable with: React, and MaterialUI for styling the components. It's built with TypeScript, and the library part actually defines all the types needed to manipulate Guix packages and their components (such as build systems or package sources). One of the more challenging parts was to be able to provide fast and helpful “search as you type” results over the 28k+ packages. It required a combination of MaterialUI's virtualized inputs, as well as caching the packages data in the browser's local storage, when possible (packaging metadata itself is fetched from https://guix.gnu.org/packages.json, a generic representation of the current package set).

While the feature set provides a great starting point, there are still a few things that may be worth implementing. For instance, only the GNU and CMake build systems are supported so far; it would make sense to include a few others (Python-related ones might be good candidates).

Running a local (development) version of the application can happen on top of Guix, since—obviously—it's been developed with the node version packaged in Guix, using the quite standard packages.json for JavaScript dependencies installed through npm. Contributions welcome!

Lowering the barrier to entry

This neat tool complements a set of steps we’ve taken over time to make packaging in Guix approachable. Indeed, while package definitions are actually code written in the Scheme language, the package “language” was designed from the get-go to be fully declarative—think JSON with parens instead of curly braces and semicolons. More recently we simplified the way package inputs are specified with an eye on making package definitions less intimidating.

The guix import command also exists to make it easier to simplify packaging: it can generate a package definition for anything available in other package repositories such as PyPI, CRAN, Crates.io, and so forth. If your preference goes to curly braces rather than parens, it can also convert a JSON package description to Scheme code. Once you have your first .scm file, guix build prints hints for common errors such missing module imports (those #:use-module stanzas). We also put effort into providing reference documentation, a video tutorial, and a tutorial for more complex packages.

Do share your experience with us and until then, happy packaging!

Acknowledgments

Thanks to Felix Lechner and Timothy Sample for providing feedback on an earlier draft of this post.

About GNU Guix

GNU Guix is a transactional package manager and an advanced distribution of the GNU system that respects user freedom. Guix can be used on top of any system running the Hurd or the Linux kernel, or it can be used as a standalone operating system distribution for i686, x86_64, ARMv7, AArch64 and POWER9 machines.

In addition to standard package management features, Guix supports transactional upgrades and roll-backs, unprivileged package management, per-user profiles, and garbage collection. When used as a standalone GNU/Linux distribution, Guix offers a declarative, stateless approach to operating system configuration management. Guix is highly customizable and hackable through Guile programming interfaces and extensions to the Scheme language.

by Ludovic Courtès, Philippe Virouleau at Friday, November 24, 2023

Andy Wingo

tree-shaking, the horticulturally misguided algorithm

Let's talk about tree-shaking!

looking up from the trough

But first, I need to talk about WebAssembly's dirty secret: despite the hype, WebAssembly has had limited success on the web.

There is Photoshop, which does appear to be a real success. 5 years ago there was Figma, though they don't talk much about Wasm these days. There are quite a number of little NPM libraries that use Wasm under the hood, usually compiled from C++ or Rust. I think Blazor probably gets used for a few in-house corporate apps, though I could be fooled by their marketing.

You might recall the hyped demos of 3D first-person-shooter games with Unreal engine again from 5 years ago, but that was the previous major release of Unreal and was always experimental; the current Unreal 5 does not support targetting WebAssembly.

Don't get me wrong, I think WebAssembly is great. It is having fine success in off-the-web environments, and I think it is going to be a key and growing part of the Web platform. I suspect, though, that we are only just now getting past the trough of disillusionment.

It's worth reflecting a bit on the nature of web Wasm's successes and failures. Taking Photoshop as an example, I think we can say that Wasm does very well at bringing large C++ programs to the web. I know that it took quite some work, but I understand the end result to be essentially the same source code, just compiled for a different target.

Similarly for the JavaScript module case, Wasm finds success in getting legacy C++ code to the web, and as a way to write new web-targetting Rust code. These are often tasks that JavaScript doesn't do very well at, or which need a shared implementation between client and server deployments.

On the other hand, WebAssembly has not been a Web success for DOM-heavy apps. Nobody is talking about rewriting the front-end of wordpress.com in Wasm, for example. Why is that? It may sound like a silly question to you: Wasm just isn't good at that stuff. But why? If you dig down a bit, I think it's that the programming models are just too different: the Web's primary programming model is JavaScript, a language with dynamic typing and managed memory, whereas WebAssembly 1.0 was about static typing and linear memory. Getting to the DOM from Wasm was a hassle that was overcome only by the most ardent of the true Wasm faithful.

Relatedly, Wasm has also not really been a success for languages that aren't, like, C or Rust. I am guessing that wordpress.com isn't written mostly in C++. One of the sticking points for this class of language. is that C#, for example, will want to ship with a garbage collector, and that it is annoying to have to do this. Check my article from March this year for more details.

Happily, this restriction is going away, as all browsers are going to ship support for reference types and garbage collection within the next months; Chrome and Firefox already ship Wasm GC, and Safari shouldn't be far behind thanks to the efforts from my colleague Asumu Takikawa. This is an extraordinarily exciting development that I think will kick off a whole 'nother Gartner hype cycle, as more languages start to update their toolchains to support WebAssembly.

if you don't like my peaches

Which brings us to the meat of today's note: web Wasm will win where compilers create compact code. If your language's compiler toolchain can manage to produce useful Wasm in a file that is less than a handful of over-the-wire kilobytes, you can win. If your compiler can't do that yet, you will have to instead rely on hype and captured audiences for adoption, which at best results in an unstable equilibrium until you figure out what's next.

In the JavaScript world, managing bloat and deliverable size is a huge industry. Bundlers like esbuild are a ubiquitous part of the toolchain, compiling down a set of JS modules to a single file that should include only those functions and data types that are used in a program, and additionally applying domain-specific size-squishing strategies such as minification (making monikers more minuscule).

Let's focus on tree-shaking. The visual metaphor is that you write a bunch of code, and you only need some of it for any given page. So you imagine a tree whose, um, branches are the modules that you use, and whose leaves are the individual definitions in the modules, and you then violently shake the tree, probably killing it and also annoying any nesting birds. The only thing that's left still attached is what is actually needed.

This isn't how trees work: holding the trunk doesn't give you information as to which branches are somehow necessary for the tree's mission. It also primes your mind to look for the wrong fixed point, removing unneeded code instead of keeping only the necessary code.

But, tree-shaking is an evocative name, and so despite its horticultural and algorithmic inaccuracies, we will stick to it.

The thing is that maximal tree-shaking for languages with a thicker run-time has not been a huge priority. Consider Go: according to the golang wiki, the most trivial program compiled to WebAssembly from Go is 2 megabytes, and adding imports can make this go to 10 megabytes or more. Or look at Pyodide, the Python WebAssembly port: the REPL example downloads about 20 megabytes of data. These are fine sizes for technology demos or, in the limit, very rich applications, but they aren't winners for web development.

shake a different tree

To be fair, both the built-in Wasm support for Go and the Pyodide port of Python both derive from the upstream toolchains, where producing small binaries is nice but not necessary: on a server, who cares how big the app is? And indeed when targetting smaller devices, we tend to see alternate implementations of the toolchain, for example MicroPython or TinyGo. TinyGo has a Wasm back-end that can apparently go down to less than a kilobyte, even!

These alternate toolchains often come with some restrictions or peculiarities, and although we can consider this to be an evil of sorts, it is to be expected that the target platform exhibits some co-design feedback on the language. In particular, running in the sea of the DOM is sufficiently weird that a Wasm-targetting Python program will necessarily be different than a "native" Python program. Still, I think as toolchain authors we aim to provide the same language, albeit possibly with a different implementation of the standard library. I am sure that the ClojureScript developers would prefer to remove their page documenting the differences with Clojure if they could, and perhaps if Wasm becomes a viable target for Clojurescript, they will.

on the algorithm

To recap: now that it supports GC, Wasm could be a winner for web development in Python and other languages. You would need a different toolchain and an effective tree-shaking algorithm, so that user experience does not degrade. So let's talk about tree shaking!

I work on the Hoot Scheme compiler, which targets Wasm with GC. We manage to get down to 70 kB or so right now, in the minimal "main" compilation unit, and are aiming for lower; auxiliary compilation units that import run-time facilities (the current exception handler and so on) from the main module can be sub-kilobyte. Getting here has been tricky though, and I think it would be even trickier for Python.

Some background: like Whiffle, the Hoot compiler prepends a prelude onto user code. Tree-shakind happens in a number of places:

Generally speaking, procedure definitions (functions / closures) are the easy part: you just include only those functions that are referenced by the code. In a language like Scheme, this gets you a long way.

However there are three immediate challenges. One is that the evaluation model for the definitions in the prelude is letrec*: the scope is recursive but ordered. Binding values can call or refer to previously defined values, or capture values defined later. If evaluating the value of a binding requires referring to a value only defined later, then that's an error. Again, for procedures this is trivially OK, but as soon as you have non-procedure definitions, sometimes the compiler won't be able to prove this nice "only refers to earlier bindings" property. In that case the fixing letrec (reloaded) algorithm will end up residualizing bindings that are set!, which of all the tree-shaking passes above require the delicate DCE pass to remove them.

Worse, some of those non-procedure definitions are record types, which have vtables that define how to print a record, how to check if a value is an instance of this record, and so on. These vtable callbacks can end up keeping a lot more code alive even if they are never used. We'll get back to this later.

Similarly, say you print a string via display. Well now not only are you bringing in the whole buffered I/O facility, but you are also calling a highly polymorphic function: display can print anything. There's a case for bitvectors, so you pull in code for bitvectors. There's a case for pairs, so you pull in that code too. And so on.

One solution is to instead call write-string, which only writes strings and not general data. You'll still get the generic buffered I/O facility (ports), though, even if your program only uses one kind of port.

This brings me to my next point, which is that optimal tree-shaking is a flow analysis problem. Consider display: if we know that a program will never have bitvectors, then any code in display that works on bitvectors is dead and we can fold the branches that guard it. But to know this, we have to know what kind of arguments display is called with, and for that we need higher-level flow analysis.

The problem is exacerbated for Python in a few ways. One, because object-oriented dispatch is higher-order programming. How do you know what foo.bar actually means? Depends on foo, which means you have to thread around representations of what foo might be everywhere and to everywhere's caller and everywhere's caller's caller and so on.

Secondly, lookup in Python is generally more dynamic than in Scheme: you have __getattr__ methods (is that it?; been a while since I've done Python) everywhere and users might indeed use them. Maybe this is not so bad in practice and flow analysis can exclude this kind of dynamic lookup.

Finally, and perhaps relatedly, the object of tree-shaking in Python is a mess of modules, rather than a big term with lexical bindings. This is like JavaScript, but without the established ecosystem of tree-shaking bundlers; Python has its work cut out for some years to go.

in short

With GC, Wasm makes it thinkable to do DOM programming in languages other than JavaScript. It will only be feasible for mass use, though, if the resulting Wasm modules are small, and that means significant investment on each language's toolchain. Often this will take the form of alternate toolchains that incorporate experimental tree-shaking algorithms, and whose alternate standard libraries facilitate the tree-shaker.

Welp, I'm off to lunch. Happy wassembling, comrades!

by Andy Wingo at Friday, November 24, 2023

Wednesday, November 22, 2023

Idiomdrottning

Re: Stop Using Encrypted Email

I already have my own positive take on this.

But I want to reply this post directly.

Plaintext email only?

When you’re saying:

Email is unsafe and cannot be made safe. The tools we have today to encrypt email are badly flawed. Even if those flaws were fixed, email would remain unsafe. Its problems cannot plausibly be mitigated. Avoid encrypted email.

and

There are reasons people use and like email. We use email, too! It’s incredibly convenient. You can often guess people’s email addresses and communicate with them without ever being introduced. Every computing platform in the world supports it. Nobody needs to install anything new, or learn how to use a new system. Email is not going away.

at the same time, what that means is that you’re saying “go ahead, keep using unencrypted, insecure, plaintext email”. I’m not onboard with that.

Keep using encrypted email?

When you’re saying:

And we don’t object to email security features, like hop-by-hop TLS encryption and MTA-STS, that make the system more resistant to dragnet surveillance.

You’re saying yes to email encryption after all. Good, then we’re on the same page.

I also wanna add SPF, DKIM, DMARC, and PGP. I use them every day, pretty happy about them.

LARPers vs the state?

When you’re saying:

Most email encryption on the Internet is performative, done as a status signal or show of solidarity. Ordinary people don’t exchange email messages that any powerful adversary would bother to read, and for those people, encrypted email is LARP security. […] Metadata is as important as content, and email leaks it.

You’re saying that the only adversaries that matter are powerful ones, like state actors that are gonna put you in jail for whatever reason. But we also want to fight the corporations and the spammers and the ad profilers.

Truth is that with TLS and STS, only people who can see the emails are the email service providers. And in a world where the biggest email provider is also the biggest ad seller, maybe encrypting the body text is not a bad idea.

Not that the larping isn’t fun. PGP does have a nostalgic charm for many of my generation. From being something I could barely understand, something people tattooed on their bodies, to something I now use hundreds of times per week, there is something joyful in that.

Bad alternatives

Yes, email isn’t the most secure, and OMEMO and Matrix also leaks metadata. To the provider, not to the entire world, but that’s true for email as well in the TLS era.

They suggest Signal, which is proprietary and leaks your actual cell phone number. What.

age will encrypt documents that can be sent through less secure systems. These tools are all harder to use and more fraught than secure messengers, but they’re better than encrypted email.

Age has all the same drawbacks as PGP without any of its advantages like WKD.

But there are good ways to speak anonymously and securely to each other. I’m glad you have apps you like. Email is very far from being the securest thing out there right now. That’s not what I’m contesting at all.

My problem with this widely quoted and self-contradictory essay is that they have driven a lot of otherwise very smart people to give up on making email the best it can be.

And there are plenty of things where email is still the best thing around. It’s a miracle that a world-writable fully federated inbox system is as good as it is. Fedi is a spam pit, IRC a harassment nightmare, email is at the forefront of fighting those things. Of course we want encryption on that.

by Idiomdrottning (sandra.snan@idiomdrottning.org) at Wednesday, November 22, 2023

Saturday, November 18, 2023

Idiomdrottning

Age for email

I’ve had a couple of different people ask to use age for email, and Emacs can handle that just fine, so I’ve acquiesced. I haven’t settled on a permanent key yet but here is one I’ve been using:

public key: age1p0r26arafja3ehq4kn5mrsh5fjw5mnp6jnde35hs4yq3z9l7tuyqcmjtmu

However, PGP has better tooling for automatic encryption and for key exchange (WKD and Autocrypt). It’s just something the email ecosystem has better adapted to.

Another thing that really sucks about age compared to PGP is that if I encrypt and send something to someone, I can’t then read it myself. If I wanna remember what the heck I’m even writing, I need to save a copy first.

For email, age has all the same drawbacks of PGP:

  • Susceptibility to MITM
  • No forward secrecy (worse, since there are some hacky efforts to get some semblance of forward secrecy in some PGP setups)
  • Same encryption algorithms (like RSA and the quantum-fragile ed25519)
  • Leaks metadata (sender, recipient & subject)

Age is a good tool for encrypting your backups and your own secret local text files, especially compared to a specific version of PGP called GPG:

  • Easier to use than GPG
  • Less liable to include deprecated and crusty old algorithms
  • (This next one one only applies to super old versions of GPG since it now also has it, but) Authenticated encryptions

Those drawbacks aren’t universal to all PGP implementations, though.

I don’t think it’s worthwhile to use for email.

I prefer PGP. Here is my key.

PGP is probably not the be-all, end-all either. I love email, I want email as a protocol to last forever, it’s the only platform that has managed to be fully federated with a world-writable inbox and a robust set of spam-fighting tools, and with SSL, DKIM, DMARC it has seen great strides, and if email can get better e2ee than PGP that’d be something I’d love; age isn’t it.

The age devs don’t wanna use age for email either, not because it can’t be done (as they point out in their thread, there’s an -a option to make it work) but because they are opposed to email security.🤦🏻‍♀️

Out of scope: Anything about emails (which are a fundamentally unsecurable medium)

Not really into the idea that we shouldn’t harm reduce email, that we should just give up on it etc. That’s not what I want.

by Idiomdrottning (sandra.snan@idiomdrottning.org) at Saturday, November 18, 2023

Friday, November 17, 2023

Idiomdrottning

Adless Kineto

Here is a branch of Kineto that doesn’t have the “details” at the bottom, making the pages look more like they belong on the web and less like an ad for Kineto.

I’ve sent this patch upstream, if they like it I’ll update this page to point there instead.

(I can see why having those details can be helpful for a technical audience.)

by Idiomdrottning (sandra.snan@idiomdrottning.org) at Friday, November 17, 2023

Thursday, November 16, 2023

Andy Wingo

a whiff of whiffle

A couple nights ago I wrote about a superfluous Scheme implementation and promised to move on from sheepishly justifying my egregious behavior in my next note, and finally mention some results from this experiment. Well, no: I am back on my bullshit. Tonight I write about a couple of implementation details that discerning readers may find of interest: value representation, the tail call issue, and the standard library.

what is a value?

As a Lisp, Scheme is one of the early "dynamically typed" languages. These days when you say "type", people immediately think propositions as types, mechanized proof of program properties, and so on. But "type" has another denotation which is all about values and almost not at all about terms: one might say that vector-ref has a type, but it's not part of a proof; it's just that if you try to vector-ref a pair instead of a vector, you get a run-time error. You can imagine values as being associated with type tags: annotations that can be inspected at run-time for, for example, the sort of error that vector-ref will throw if you call it on a pair.

Scheme systems usually have a finite set of type tags: there are fixnums, booleans, strings, pairs, symbols, and such, and they all have their own tag. Even a Scheme system that provides facilities for defining new disjoint types (define-record-type et al) will implement these via a secondary type tag layer: for example that all record instances are have the same primary tag, and that you have to retrieve their record type descriptor to discriminate instances of different record types.

Anyway. In Whiffle there are immediate types and heap types. All values have a low-bit tag which is zero for heap objects and nonzero for immediates. For heap objects, the first word of the heap object has tagging in the low byte as well. The 3-bit heap tag for pairs is chosen so that pairs can just be two words, with no header word. There is another 3-bit heap tag for forwarded objects, which is used but the GC when evacuating a value. Other objects put their heap tags in the low 8 bits of the first word. Additionally there is a "busy" tag word value, used to prevent races when evacuating from multiple threads.

Finally, for generational collection of objects that can be "large" -- the definition of large depends on the collector implementation, and is not nicely documented, but is more than, like, 256 bytes -- anyway these objects might need to have space for a "remembered" bit in the object themselves. This is not the case for pairs but is the case for, say, vectors: even though they are prolly smol, they might not be, and they need space for a remembered bit in the header.

tail calls

When I started Whiffle, I thought, let's just compile each Scheme function to a C function. Since all functions have the same type, clang and gcc will have no problem turning any tail call into a proper tail call.

This intuition was right and wrong: at optimization level -O2, this works great. We don't even do any kind of loop recognition / contification: loop iterations are tail calls and all is fine. (Not the most optimal implementation technique, but the assumption is that for our test cases, GC costs will dominate.)

However, when something goes wrong, I will need to debug the program to see what's up, and so you might think to compile at -O0 or -Og. In that case, somehow gcc does not compile to tail calls. One time while debugging a program I was flummoxed at a segfault during the call instruction; turns out it was just stack overflow, and the call was trying to write the return address into an unmapped page. For clang, I could use the musttail attribute; perhaps I should, to allow myself to debug properly.

Not being able to debug at -O0 with gcc is annoying. I feel like if GNU were an actual thing, we would have had the equivalent of a musttail attribute 20 years ago already. But it's not, and we still don't.

stdlib

So Whiffle makes C, and that C uses some primitives defined as inline functions. Whiffle actually lexically embeds user Scheme code with a prelude, having exposed a set of primitives to that prelude and to user code. The assumption is that the compiler will open-code all primitives, so that the conceit of providing a primitive from the Guile compilation host to the Whiffle guest magically works out, and that any reference to a free variable is an error. This works well enough, and it's similar to what we currently do in Hoot as well.

This is a quick and dirty strategy but it does let us grow the language to something worth using. I think I'll come back to this local maximum later if I manage to write about what Hoot does with modules.

coda

So, that's Whiffle: the Guile compiler front-end for Scheme, applied to an expression that prepends a user's program with a prelude, in a lexical context of a limited set of primitives, compiling to very simple C, in which tail calls are just return f(...), relying on the C compiler to inline and optimize and all that.

Perhaps next up: some results on using Whiffle to test Whippet. Until then, good night!

by Andy Wingo at Thursday, November 16, 2023