Planet Scheme

Sunday, November 24, 2024

GNU Guix

Guix/Hurd on a Thinkpad X60

A lot has happened with respect to the Hurd since our Childhurds and GNU/Hurd Substitutes post. As long as two years ago some of you have been asking for a progress update and although there have been rumours on a new blog post for over a year, we were kind of waiting for the rumoured x86_64 support.

With all the exciting progress on the Hurd coming available after the recent (last?) merger of core-updates we thought it better not to wait any longer. So here is a short overview of our Hurd work over the past years:

NetDDE and Rumpdisk support

Back in 2020, Ricardo Wurmus added the NetDDE package that provides Linux 2.6 network drivers. At the time we didn't get to integrate and use it though and meanwhile it bitrotted.

After we resurrected the NetDDE build, and with kind help of the Hurd developers we finally managed to support NetDDE for the Hurd.. This allows the usage of the Intel 82573L Gigabit Ethernet Controller of the Thinkpad X60 (and many other network cards, possibly even WIFI). Instead of using the builtin kernel driver in GNU Mach, it would be running as a userland driver.

What sparked this development was upstream's NetBSD rumpdisk support that would allow using modern hard disks such as SSDs, again running as a userland driver. Hard disk support builtin in GNU Mach was once considered to be a nice hack but it only supported disks up to 128 GiB…

First, we needed to fix the cross build on Guix.

After the initial attempt at rumpdisk support for the Hurd it took (v2) some (v3) work (v4) to finally arrive at rumpdisk support for the Hurd, really, *really* (v5)

Sadly when actually using them, booting hangs:

start: pci.arbiter:

What did not really help is that upstream's rumpkernel archive was ridiculously large. We managed to work with upstream to remove unused bits from the archive. Upstream created a new archive that instead of 1.8 GiB (!) now “only” weighs 670 MiB.

Anyway, after a lot of building, rebuilding, and debugging and some more with kind help from upstream we finally got Rumpdisk and NetDDE to run in a Childhurd.

NetDDE and Rumpdisk userland processes in a Childhurd

Initial Guix/Hurd on the Thinkpad X60

Now that the last (!) core-updates merge has finally happened (thanks everyone!), the recipe of installing Guix/Hurd has been much simpfilied. It goes something along these lines.

  1. Install Guix/Linux on your X60,

  2. Reserve a partition and format it for the Hurd:

    mke2fs -o hurd -L hurd /dev/sdaX
  3. In your config.scm, add some code to add GRUB menuentries for booting the Hurd, and mount the Hurd partition under /hurd:

    (use-modules (srfi srfi-26)
                 (ice-9 match)
                 (ice-9 rdelim)
                 (ice-9 regex)
                 (gnu build file-systems))
    
    (define %hurd-menuentry-regex
      "menuentry \"(GNU with the Hurd[^{\"]*)\".*multiboot ([^ \n]*) +([^\n]*)")
    (define (text->hurd-menuentry text)
      (let* ((m (string-match %hurd-menuentry-regex text))
             (label (match:substring m 1))
             (kernel (match:substring m 2))
             (arguments (match:substring m 3))
             (arguments (string-split arguments #\space))
             (root (find (cute string-prefix? "root=" <>) arguments))
             (device-spec (match (string-split root #\=)
                            (("root" device) device)))
             (device (hurd-device-name->device-name device-spec))
             (modules (list-matches "module ([^\n]*)" text))
             (modules (map (cute match:substring <> 1) modules))
             (modules (map (cute string-split <> #\space) modules)))
        (menu-entry
         (label label)
         (device device)
         (multiboot-kernel kernel)
         (multiboot-arguments arguments)
         (multiboot-modules modules))))
    
    (define %hurd-menuentries-regex
      "menuentry \"(GNU with the Hurd[^{\"]*)\" \\{([^}]|[^\n]\\})*\n\\}")
    (define (grub.cfg->hurd-menuentries grub.cfg)
      (let* ((entries (list-matches %hurd-menuentries-regex grub.cfg))
             (entries (map (cute match:substring <> 0) entries)))
        (map text->hurd-menuentry entries)))
    
    (define (hurd-menuentries)
      (let ((grub.cfg (with-input-from-file "/hurd/boot/grub/grub.cfg"
                        read-string)))
        (grub.cfg->hurd-menuentries grub.cfg)))
    
    ...
    (operating-system
       ...
      (bootloader (bootloader-configuration
                   (bootloader grub-bootloader)
                   (targets '("/dev/sda"))
                   (menu-entries (hurd-menuentries))))
      (file-systems (cons* (file-system
                             (device (file-system-label "guix"))
                             (mount-point "/")
                             (type "ext4"))
                           (file-system
                             (device (file-system-label "hurd"))
                             (mount-point "/hurd")
                             (type "ext2"))
                           %base-file-systems))
      ...)
  4. Create a config.scm for your Hurd system. You can get inspiration from bare-hurd.tmpl and inherit from %hurd-default-operating-system. Use grub-minimal-bootloader and add a static-networking-service-type. Something like:

    (use-modules (srfi srfi-1) (ice-9 match))
    (use-modules (gnu) (gnu system hurd))
    
    (operating-system
      (inherit %hurd-default-operating-system)
      (bootloader (bootloader-configuration
                   (bootloader grub-minimal-bootloader)
                   (targets '("/dev/sda"))))
      (kernel-arguments '("noide"))
    ...
      (services
        (cons*
          (service static-networking-service-type
                   (list %loopback-static-networking
                         (static-networking
                          (addresses
                           (list
                            (network-address
                             (device "eth0")
                             (value "192.168.178.37/24"))))
                          (routes
                           (list (network-route
                                  (destination "default")
                                  (gateway "192.168.178.1"))))
                          (requirement '())
                          (provision '(networking))
                          (name-servers '("192.168.178.1")))))
        ...)))
  5. Install the Hurd. Assuming you have an ext2 filesystem mounted on /hurd, do something like:

    guix system build --target=i586-pc-gnu vuurvlieg.hurd --verbosity=1
    sudo -E guix system init --target=i586-pc-gnu --skip-checks \
        vuurvlieg.hurd /hurd
    sudo -E guix system reconfigure vuurvlieg.scm
  6. Reboot and...

Hurray!

We now have Guix/Hurd running on Thinkpad.

Guix/Hurd GRUB menu on ThinkpadX60

Guix/Hurd running on ThinkpadX60

Guix/Hurd on Real Iron

While the initial manual install on the X60 was an inspiring milestone, we can do better. As mentioned above, just recently the installer learnt about the Hurd, right after some smaller problems were addressed, like guix system init creating essential devices for the Hurd, not attempting to run a cross-built grub-install to install Grub, soft-coding the hard-coded part:1:device:wd0 root file-system, adding support for booting Guix/Hurd more than once.

To install Guix/Hurd, first, build a 32bit installation image and copy it to a USB stick:

guix system image --image-type=iso9660 --system=i686-linux gnu/system/install.scm
dd if=/gnu/store/cabba9e-image.iso of=/dev/sdX status=progress
sync

then boot it on a not-too-new machine that has wired internet (although installation over WIFI is possible, there is currently no WIFI support for the installed Hurd to use it). On the new Kernel page:

Installer Kernel page

choose Hurd. Do not choose a desktop environment, that's not available yet. On the Network management page:

Installer Network management page

choose the new Static networking service. In the final Configuration file step, don't forget to edit:

Installer Configuration file page

and fill-in your IP and GATEWAY:

Installer Edit static networking

You may want to add some additional packages such as git-minimal from (gnu packages version-control) and sqlite from (gnu packages sqlite).

If you also intend to do Guix development on the Hurd—e.g., debugging and fixing native package builds—then you might want to include all dependencies to build the guix package, see the devel-hurd.tmpl for inspiration on how to do that. Note that any package you add must already have support for cross-building.

Good luck, and let us know if it works for you and on what kind of machine, or why it didn't!

What's next?

In an earlier post we tried to answer the question “Why bother with the Hurd anyway?” An obvious question because it is all too easy to get discouraged, to downplay or underestimate the potential social impact of GNU and the Hurd.

The most pressing problem currently is that the guix-daemon fails when invoking guix authenticate on the Hurd, which means that we cannot easily keep our substitutes up to date. guix pull is known to work but only by pulling from a local branch doing something like:

mkdir -p ~/src/guix
cd src/guix
git clone https://git.savannah.gnu.org/git/guix.git master
guix pull --url=~/src/guix/master

kinda like we did it in the old days. Sometimes it seems that guix does not grok the sqlite store database. This is needs to be resolved but as sqlite does read it this can be easily worked around by doing something like:

mv /var/guix/db/db.sqlite /var/guix/db/db.sqlite.orig
sqlite3 /var/guix/db/db.sqlite.orig .dump > /var/guix/db/db.sqlite.dump
sqlite3 -init /var/guix/db/db.sqlite.dump /var/guix/db/db.sqlite .quit

Also, the boot process still fails to handle an unclean root file system. Whenever the Hurd has suffered an unclean shutdown, cleaning it must currently be done manually, e.g., by booting from the installer USB stick.

Upstream now has decent support for 64bit (x86_64) albeit with more bugs and fewer packages. As mentioned in the overview we are now working to have that in Guix and have made some progress:

Hello Guix 64bit Hurd

more on that in another post. Other interesting task for Guix include:

  • Have guix system reconfigure work on the Hurd,
  • Figure out WiFi support with NetDDE (and add it to installer!),
  • An isolated build environment (or better wait for, err, contribute to the Guile guix-daemon?),
  • An installer running the Hurd, and,
  • Packages, packages, packages!

We tried to make Hurd development as easy and as pleasant as we could. As you have seen, things start to work pretty nicely and there is still plenty of work to do in Guix. In a way this is “merely packaging” the amazing work of others. Some of the real work that needs to be done and which is being discussed and is in progress right now includes:

All these tasks look daunting, and indeed that’s a lot of work ahead. But the development environment is certainly an advantage. Take an example: surely anyone who’s hacked on device drivers or file systems before would have loved to be able to GDB into the code, restart it, add breakpoints and so on—that’s exactly the experience that the Hurd offers. As for Guix, it will make it easy to test changes to the micro-kernel and to the Hurd servers, and that too has the potential to speed up development and make it a very nice experience.

Join #guix and #hurd on libera.chat or the mailing lists and get involved!

Addendum

It has been brought to our attention that people haven't heard that Debian GNU/Hurd has been a reality for some years now. While Guix GNU/Hurd has an exciting future, please be aware that it lacks many packages and services, including Xorg. If you would simply like to install the Hurd on bare metal running your favorite window manager (eg: i3, icewm, etc.) or lightweight desktop environment (Xfce) right now, then installing Debian GNU/Hurd is a good choice. Though we hope to catch up to them soon!

by Janneke Nieuwenhuizen at Sunday, November 24, 2024

Wednesday, November 20, 2024

Scheme Requests for Implementation

SRFI 256: Minimal extension to SRFI 9/R7RS small record type definitions for inheritance

SRFI 256 is now in draft status.

A SRFI 9-style define-record-type is specified which allows subtyping while preserving encapsulation, in that the field structure of supertypes remains an implementation detail with which subtypes need not concern themselves.

by Daphne Preston-Kendal at Wednesday, November 20, 2024

Monday, November 18, 2024

Peter Bex

What to expect from CHICKEN 6

NOTE: This is a guest post by Felix Winkelmann, the founder and one of the current maintainers of CHICKEN Scheme.

Introduction

This article is about the changes in the next major version of CHICKEN Scheme.

The current version is 5.4.0. The next major version, 6.0.0, is mostly implemented. Currently we are adding final touches before preparing the next release. If you are already familiar with CHICKEN, this article will help you get a more detailed view of what has been done. If you are not into CHICKEN or even Scheme, this article may still give you an insight into what's involved in the development and maintenance of a programming language project. You may also find it interesting to know how we address issues like portability and backwards-compatibility. There are also some juicy details on implementation techniques.

No previous knowledge of CHICKEN Scheme is required, but it will help if you are familiar with common terms used in programming language design and Lisp-y languages.

Versions

CHICKEN uses the usual major.minor.patch versioning scheme, where we bump major versions only for significant changes that break compatibility with older versions. CHICKEN has a relatively large community for a non-mainstream language and has lots of contributed extension libraries called "eggs". Breaking backwards compatibility in non-major versions adds implementation effort for users that just want keep their CHICKEN up to date and any eggs they're using should also keep working.

The process of avoiding breakage can sometimes become challenging. We may, for example, provide two different implementations for one and the same thing, to allow adoption of the new feature while keeping old code working. Typically this happens when the new feature is safer, easier to use, faster or more standards compliant than the old. We also remove features in stages. First we mark it as deprecated, and delete it later. This allows users to upgrade their code gradually.

On major version changes, we create a new mirror of the egg "repository", where most of the extensions are stored. Externally contributed eggs can choose when and how to publish an extension for specific versions. This is described in a previous post on this blog.

Major version changes also usually bump the "binary version". This is a suffix to the shared runtime library name to avoid intermixing tools and programs written for one version with the libraries for another.

We started CHICKEN major version 6 to introduce new features that were dearly required but were impossible to add without changing external interfaces, in particular what modules exist and what they export. Specifically, we needed full support for UNICODE strings and compliance with the R7RS (small) language standard.

Both features were already available in a rudimentary manner as external libraries. They were not fully integrated, a fact that showed in various places and could lead to subtle bugs as core and extension code assumed differences in string representation and the behaviour of standard procedures. The same approach was used for integrating the "full numeric tower" (the full set of numeric types, including rationals, big integers and complex numbers), which was formerly a library and was moved into the core system with the 5.0.0 release.

We'll now describe the most noteworthy changes made during this major version transition.

UNICODE

A major shortcoming of CHICKEN up to now was that it didn't support full UNICODE strings. All string data was assumed to consist of 8-bit characters in the ASCII or Latin-1 range. Internationalisation of source code and applications is unavoidable and libraries and operating systems have moved towards UTF-8 as a standard character encoding. So we saw no choice but to find a reasonably efficient way of using the full range of possible characters in all parts of the system.

There is an open-ended space of design choices regarding how to efficiently implement UNICODE strings. The Damocles Sword of unanticipated performance degradation constantly looms over the head of the language implementer, so finding the ideal solution in terms of memory use, speed and simplicity is not easy. Some implementations go to great lengths by providing complex storage schemes or multiple representations depending on a strings' content. In the end, we think a simple representation serves everybody better by being easier to understand and maintain.

Also it is not entirely sure that (say) a dynamic multi-representation approach pays off sufficiently. Too many factors come into play, like string usage patterns in applications, memory management and implementation overhead. Data exchange at the boundaries of operating system and foreign code also have to be taken into account. You want to avoid unnecessary conversions and copying, especially because CHICKEN is geared towards easy interfacing to external libraries and OS functionality.

We decided to use an UTF-8 representation internally. Many operating systems and applications already support UTF-8, which removes the need for costly conversions. UTF-8 is also backwards compatible to ASCII and keeps storage requirements at a minimum.

Since UTF-8 is a multi-byte encoding, character lookup in a string is of linear complexity. Characters have to be scanned when addressing a specific index in a string. To avoid repeatedly scanning the same section of a string when iterating over it, we use a simple cache slot in the string that maps byte-offsets to code point indices and holds the offset/index pair of the last access.

String representation

A quick recap regarding how strings are currently stored in CHICKEN is in order. If you are interested, there's also an older post on this blog with a more detailed explanation of the data representation used by CHICKEN.

Characters are stored as immediates with the special bit pattern 1110 (type code 111):

This gives us enough bits (even on 32-bit platforms) to hold all code points in the Basic Multilingual Plane (code points in the range 32 - 0x1fffff). CHICKEN 5 already supported characters in that range, but strings were still restricted to having 8-bit elements.

Up to CHICKEN 5 a string was represented by a single byteblock:

In CHICKEN 6 we add an indirection:

As you can see, the string itself is a normal block containing a pointer to a byteblock holding the UTF-8 byte sequence and a trailing zero byte. The "Count" slot is a fixnum with the number of code points in the string (the length). "Offset" and "Index" act as a cache for the byte-offset and code point-index of the last access. They are reset to zero if the string changes its length due to a character's width changing at an offset before the cached point.

An indirection is necessary regardless of the other details of the representation: as UTF-8 is a multi-byte encoding, destructively modifying characters in a string may grow or shrink the total byte-sequence. The string must still keep its identity and be indistinguishable from the original string (in terms of the primitive eq? procedure).

One obvious optimisation we can do here is that character accesses for strings where the length ("Count") and the length of the character data (encoded in the header of the byteblock) minus one is equal: then we can simply index by byte offset.

Alternative representations are possible. I believe the following is used in Chibi Scheme: keep a list of "chunks", i.e. multiple byte-offset/code point-index pairs per string. You can traverse this list to obtain the first chunk of string data containing the index you want to access.

In CHICKEN this would probably be represented thus:

This way we can find the offset for the index right at or before the location addressed. This reduces the amount of scanning to the part inside a chunk pointed to by the offset/index pair.

Such an approach of maintaining the offset/index list is more complex and keeping it up to date is more effort than the simple offset/index cache. Should the performance impact of the simpler representation turn out to be too large, the alternative approach may still be an option to try.

This leads me to the subject of benchmarks. We do have a benchmark suite holding a number of micro-benchmarks and CHICKEN 6 has been compared to previous versions using this suite. Nevertheless, the results of micro-benchmarking have to be taken with a grain of salt.

The behaviour of real-world applications may differ considerably depending on the memory consumption, memory address patterns and the type and amount of data processed. Reasoning about performance is especially difficult due to the complex caching mechanisms of contemporary hardware and operating systems. Therefore we will try to use the simplest approach that has a good chance of providing sufficient performance while keeping the implementation effort and complexity at a minimum.

Dealing with the outside world

There's another point we have to address when considering strings that are received from external sources like OS APIs, from files and from the foreign function interface. What to do when encountering invalid UTF-8 byte sequences, for example when reading file or directory names? Signal an error? Convert to some other representation or mark the location in the byte sequence using some special pattern?

We decided to basically ignore the problem. Strings are accepted whether valid or not. Only when decoding a string we distinguish between valid and invalid sequences. When indexing and scanning a string's byte sequence, we return the invalid byte as a "surrogate pair end" code point that has the value 0xDCxx. This approach allows us to store the byte in the lower 8 bits of the code point. When inserting a character in a string that has such a value, we simply copy the byte. As I understand it, this is the approach used by the "surrogateescape" error handler in PEP 383. However, we do this masking every time we decode an unexpected byte (and do the inverse when encoding).

Say we received a string from an external source containing the following byte sequence:

This is invalid UTF-8. Extracting character by character with the described method would yield the following code point values:

Inserting such a surrogate pair end code point will inject the value of the lower byte. For example, calling (string-set! str #\xdcfe), where str is the above invalid utf-8 encoded string would yield the following byte sequence:

The advantage is that we can simply accept and pass strings containing invalid sequences as if nothing happened. We don't need to do any conversions and checks. We can also access items in the character sequence and store them back without having to worry about the validity of the sequence with respect to the UTF-8 encoding. The process is "lossless".

The disadvantage is that we have to perform an additional check when encoding or decoding UTF-8 sequences. Again we decided to reduce copying and transcoding overhead and rather have complete transparency regardless of the source of the string. Furthermore no validation is performed for incoming strings. The R7RS standard procedure utf8->string which converts bytevectors into strings does validation, though to at least ensure that strings created from binary data are always correctly encoded.

A final issue is UNICODE handling on Windows platforms. There, the OS API entry-points that are UNICODE aware use UTF-16 for strings like filenames or the values of environment variables. On Windows there is no choice but to encode and decode from UTF-8 to UTF-16 and back when interfacing with the OS.

Port encodings

When accessing files, we still want to be able to read and write data in an 8-bit encoding or in binary. We may also want to support additional encodings, even though these are currently not provided by default so we'll need to associate an "encoding" property to "ports". Ports are the Scheme objects that provide access to files and other streams of data, like traffic received from a network. The encoding is selected when opening a port using additional arguments to standard procedures that create ports for textual input and output like open-input-port/open-output-port.

Ports internally hold a table of "port methods", roughly similar to how object-oriented programming systems attach behaviour to data objects of a particular class. The port methods in former versions of CHICKEN included the low-level counterpart of the standard operations peek-char, read-char and read-string for input ports and write-char/write-string (among others) for output ports. The major change here is to replace the string I/O methods with methods that act upon bytevectors.

An internal mechanism delegates the operations for encoding and decoding or scanning for the next character in a stream to an internal hook procedure. Additional encodings can be registered by using a built-in procedure to extend the hook. Currently supported are binary, UTF-8 and Latin-1 encodings. Support for a larger set of encodings can be done through extensions and thus can be developed separately from the core system. Port encodings can be accessed using port-encoding. They can also be changed using the SRFI-17 setter for port-encoding, because encodings need sometimes to be changed while the port is still open.

R7RS does not define whether binary I/O standard procedures are allowed to operate on textual ports and vice versa. In CHICKEN we do not restrict the set of operations depending on the port type, so you can read and write bytevectors to and from a textual port and the other way round. We think this is more practical and intuitive - it makes more sense to read and write binary data as bytevectors and have dedicated operations for this.

R7RS support

The second big change for CHICKEN 6 is proper R7RS (small) compliance. Like with the UTF-8 support this was previously provided by an extension library, but using it needed some minor extra steps to set up. Now, all syntactic definitions of the (scheme base) module are available by default (most notably define-library) without requiring any further action.

Deciding what is available by default in a compilation unit or interpretation environment is a bit of a problem: to make it easy to get going when experimenting or scripting, we defaulted to having all standard R5RS procedures and macros available in the base environment of the interpreter (csi), together with the imports from the (chicken base) module. Compiled code was slightly more restricted but defaulted also to R5RS.

In CHICKEN 5 the scheme module referred to the R5RS standard procedures. To avoid breaking too much existing code this is still the case. So now, scheme is an alias for the R7RS (scheme r5rs) library module that exports all bindings of the former language standard. But to avoid duplicating the set of exported identifiers over several core modules, certain functionality has been moved from (chicken base) to (scheme base) and is thus not available in the default toplevel environment.

To make a long story short, the switch makes it necessary to access certain standard bindings like open-input-string by importing additional modules like (scheme base). This is not necessarily a bad thing, as it incentivises the user to write code in a more standard compliant way. But it all feels a bit clunky and may have been a suboptimal choice. Note that just adding an (import (scheme base)) is usually enough to make existing code run. We will see how that works out.

All required (scheme ...) modules are implemented and can be imported using their full functionality. Two notable changes that influence backwards compatibility are generative record types and hexadecimal escape sequences in string literals.

Formerly record types defined with define-record-type were non-generative: re-defining a record type with the same name, even if the type definition is completely different, would not create a new type. Instances of the former definition would also be of the newly defined type, provided the type name is the same. Now every definition of a record type creates a completely new type, regardless of the name. This is of course more correct and much safer as it doesn't invalidate previously defined instances.

A second (and much more annoying) change is that R7RS requires hex escape sequences in string literals to be terminated by a semicolon. Unfortunately the change is incompatible to the convention used in most programming languages, including existing many Lisp and Scheme implementations.

What in CHICKEN 5 would looked like this:

   "the color \x1b[31mRED\x1b[0m"

in CHICKEN 6 (and R7RS) must now be (note the semicolon):

   "the color \x1b;[31mRED\x1b;[0m"

The motivation here was probably to avoid having dedicated escape sequences for values that require more than 2 hex digits (e.g. \uNNNN). The change is particularly bad from a compatibility point of view. All string literals that contain the old style of escape sequences must be changed. To keep code working in both CHICKEN 5 and 6 you can use the 4-digit \uNNNN escape sequence which is still valid in all versions.

Foreign Function Interface changes

CHICKEN has a very easy to use foreign function interface, which mostly derives from the fact that the compiler generates C. The alternative approach are binary FFIs that use native code to interface with C code, like libffi, which must reproduce a lot of ABI details to safely interface with C libraries, things like struct alignment and padding, how arguments of various lengths are passed on the stack, etc.

The most notable FFI-related change for CHICKEN 6 is that it allows passing and returning C structs and unions to and from foreign code by value. The contents are copied in and out of bytevectors and wrapped in a block. The components of the struct or union can not be directly manipulated in Scheme but can be passed on to other foreign functions. Additionally, for completeness, you can now also directly pass C99 complex numbers. Note that complex numbers in the FFI are always passed as inexact (i.e., floating-point), as opposed to Scheme complex numbers that may have exact (integer or even rational) real and imaginary components.

Platform support and build system

Here two major changes have been introduced. First, we now have a proper configuration ("configure") script. Formerly, all build parameters were passed as extra arguments to make(1), like this:

   make PREFIX=$HOME/.local ...

This required that for every make(1) invocation, the same set of parameters had to be given to avoid inconsistencies. A configuration script written in portable POSIX sh(1) notation is now provided to perform the configuration step once before building. It also follows the de facto convention used in many GNU programs, where the usual incantation is:

   ./configure; make; make install

Note that we don't use the dreaded "autotools" (autoconf, automake and libtool), which have arcane syntax, are very brittle and produce more problems that they are trying to solve. They were originally designed to port code to now dead or obscure UNIX derivatives, yet fail to provide a straightforward and easy to use configuration tool for modern systems (Linux/BSD, Mac, Windows, mostly). Our configure script is hand written instead of auto-generated, and only checks for the platform differences that are relevant to those platforms that CHICKEN actually supports.

The old style of passing variables to make(1) should still work, but is deprecated.

The second change in the build system is that we cleaned up the confusion about toolchains on Windows systems. There is a bunch of half-maintained flavors of GNU-based C development environments for Windows systems ("MingGW", "MingGW-w64", "MSYS2", etc.) and it is a constant source of pain to support and test all these variants.

There is now one official "blessed" toolchain that we support. Specifically, we recommend Chris Wellon's excellent w64devkit. It contains compilers, build tools and enough of a POSIX environment for building CHICKEN on Windows. We also require a POSIX sh(1) (contained in w64devkit) and have dropped all support for building on a non-POSIX shell, i.e. cmd.exe. This simplifies the build and package management tools considerably. It also ensures we have less environments to test.

Building for Windows Subsystem for Linux (WSL) and Cygwin is of course still supported, but those use the UNIX build setup and need no or very little specific platform support.

Minor changes

Quite a number of minor changes have taken place that either increase the robustness or standards compliance of the system.

syntax-error is now a macro, as required by R7RS. Previously there was a syntax-error procedure in (chicken syntax) with the same argument signature. So where the error was previously raised at runtime, it is now done at expansion time. This is something to take note of when porting code to CHICKEN 6. The invocation is still the same, so the difference can at least be identified easily and corrected, and (scheme base) needs to be imported, of course.

The csc compiler driver passes arguments now properly to subprocesses via execve(2) and not system(3) which reduces shell quoting headaches.

The chicken-install package ("egg") manager locks the build cache directory to avoid conflicts when running multiple installations in parallel. Also, a custom-config feature has been added to places in the package description (.egg) file that specify compile and link options provided by external tools like pkg-config for more portable configuration of native libraries that packages use. The configuration script is expected to be Scheme code. Other eggs can extend the set of possible customisation facilities by providing library code to access them.

The feathers debugger has been removed from the core system and re-packaged as an egg, allowing separate development or replacement. It always was a bit of a proof-of-concept thing that lacks the robustness and flexibility of a real debugger. Some users found it helpful, so we keep it for the time being.

Future directions

Every major release is a chance of fixing long-standing problems with the codebase and address bad design decisions. CHICKEN is now nearly 25 years old and we had many major overhauls of the system. Sometimes these caused a lot of pain, but still we always try to improve things and hopefully make it more enjoyable and practical for our users. There are places in the code that are messy, too complex, or that require cleanup or rewrite, always sitting there waiting to be addressed. On the other hand CHICKEN has been relatively stable compared to many other language implementations and has a priceless community of users that help us improving it. Our users never stop reminding us of what could be better, where the shortcomings are, where things are hard to use or inefficient.

So the final goal is and will always be to make it more robust and easier to use. Performance improvements are always good but are secondary. Strong standards compliance is a requirement, unless it interferes with practicality. We also try to avoid dependencies in the core system at all costs. This eases porting and avoids friction that is very often introduced by inadequate tools or overdesigned development environments.

Some moody ramblings about Scheme standards

The switch to seamless support for R7RS (small) was due for quite a while. R7RS is by now the generally accepted Scheme standard that implementations try to follow. How the further development of R7RS (large) will turn out remains to be seen, but I'm not very confident that it will result in anything more that just a shapeless agglomeration of features. The current direction seems to be oriented towards creating something that includes everything but the "kitchen-sink" - too ambitious and too much driven by the urge to produce something big and comprehensive.

What always distinguished Scheme from other languages and Lisp dialects was its elegance and minimalism. This resulted in a large number of experimental implementations, the investigation of various directions of programming language semantics and in highly interesting advanced implementation techniques that are still in use today. The expressiveness and the small number of core concepts made it also perfect for teaching computing. It still is great for teaching, even if the tendency to address perceived requirements of the "market" replaces the academic use of Scheme with languages that belong more to the "mainstream". This strikes me as strange, as if learning multiple languages and studying different programming paradigms couldn't help in obtaining a broader view of software development; or as if one is somehow wasting time by exploring the world of programming in a non-mainstream language.

Repeatedly the small size and limited scope of Scheme has driven a number of enthusiasts dedicated to favouring a broader use in the industry to demand "bigness". They dream of a comprehensive standard that will support everything and please everyone and makes it easy to write portable and non-trivial applications in Scheme and have them run on large number of implementations. With this comes the tacit expectation that implementers will just follow such a large standard and implement it faithfully.

But what made Scheme successful (and beautiful) was the small size, the small number of powerful concepts, the minimalism, the joy in experimentation. Trying to create the Comprehensive Mother of all Schemes is in my opinion a waste of time. In fact, it makes Scheme less relevant. Large languages inevitably die - the warts and inconsistencies that crop up when you try to design too much up front will just get more blatant as the language ages and will annoy users, constrain implementers and make people search for alternatives.

You can already write serious programs and use a large number of libraries in Scheme: just install CHICKEN, Guile or (even) Racket and get going. The code will to a certain part not be portable across implementations, that's unavoidable, but to use dynamic languages effectively and successfully, some experience with and a certain understanding of the design of the underlying compiler or interpreter is necessary anyway.

Acknowledgements

I would like to thank my employer, bevuta IT GmbH which sponsored part of the effort of preparing a new major release of CHICKEN.

by Peter Bex at Monday, November 18, 2024

Friday, November 15, 2024

Scheme Requests for Implementation

SRFI 253: Data (Type-)Checking

SRFI 253 is now in final status.

Data validation and type checking (supposedly) make for more correct code. And faster code too, sometimes. And, in rare cases, code that's easier to follow than un-checked code. Unfortunately, Scheme does not have many (type-)checking primitives out of the box. This SRFI provides some, with the aim of allowing more performant and correct code with minimum effort on the user side. Both (manual) argument checking/validation (check-arg) and return value(s) (values-checked) checking/coercion are provided. Syntax sugar like define-checked and define-record-type-checked is added on top.

by Artyom Bologov at Friday, November 15, 2024

Sunday, November 10, 2024

GNU Guix

Take the Guix User and Contributor Survey

To understand the views of the Guix community we're running a survey that we'd love you to take part in! The Guix User and Contributor Survey is live now, and should take about 10 minutes to fill out. Perfect for doing with a cup of tea and a biscuit!

The Guix project continues to grow and change, with new contributors and users joining our community. We decided to run this survey as it's the best way to gather good quality feedback across the widest cross-section of the community. Of course, there's lots of interesting topics a survey could ask about! We decided to focus on how Guix is used, and how contributors take part in the project.

The survey is being run on LimeSurvey which is a Free Software project and has been used by many other projects for similar surveys. The survey's hosted on the LimeSurvey SaaS so that we don't have the additional task of operating the software. No personal data is asked for (e.g. email addresses), no tracking data is being collected (e.g. IP addresses) and the entries are anonymised.

We'll be making the results and the anonymised data available under the Creative Commons CCO: that way anyone can analyse the data for further insights.

We hope the results of the survey will be used to understand both the Guix project's strengths and areas we can improve. Which is why your input is so important. If you can, please take the survey!

Take the survey now!

by Steve George at Sunday, November 10, 2024

Tuesday, November 5, 2024

The Racket Blog

Racket v8.15

posted by Stephen De Gabrielle


We are pleased to announce Racket v8.15 is now available from https://download.racket-lang.org/.

As of this release:

  • Documentation search results are ordered, with visual cues indicating what their source is (core, main-distribution, etc.). These results are also grouped by language family (Racket, Rhombus, etc.). Search e.g. second to see an example.

  • DrRacket offers to restore previously open files when starting, which can be made the default.

DrRacket restore open files dialog

  • In DrRacket, Images in editing panels can be saved by right-clicking, including those generated by the functional picture libraries pict and 2htdp/image.

DrRacket save image in context menu

Thank you

The following people contributed to this release:

Alec Mills, Alex Knauth, Alexander Shopov, Ashlynn Anderson, Ashton Wiersdorf, Ben Greenman, Benjamin Yeung, Bob Burger, Bogdan Popa, Breck Yunits, Carl Gay, Claes Wallin (韋嘉誠), CooperCorad, Crystal Jacobs, D. Ben Knoble, Dexter Santucci, Eduardo Cavazos, Emil Szpakowski, evelynmitchell, Greg Hendershott, Gunnar Ahlberg, Gwen Weinholt, Idiomdrottning, Ikko Eltociear Ashimine, Jacqueline Firth, Jarhmander, Jay McCarthy, Jens Axel Søgaard, Jimmy McNutt, jinser, Jinser Kafka, John Clements, lukejianu, Marc Nieper-Wißkirchen, Matej Fandl, Matthew Flatt, Matthias Felleisen, Michael Ballantyne, Mike Sperber, olopierpa, Paul Morris, Phil Nguyen, Philip McGrath, Robby Findler, Ronald Garcia, Ryan Culpepper, Sam Phillips, Sam Tobin-Hochstadt, Siddhartha Kasivajhula, Sorawee Porncharoenwase, Stephen De Gabrielle, Syntacticlosure, Taylor Allred, Tomas Fabrizio Orsi, Wing Hei Chan, and Yafei Yang.

Racket is a community developed open source project and we welcome new contributors. See racket/README.md to learn how you can be a part of this amazing project.

Feedback Welcome

Questions and discussion welcome at the Racket community Discourse announcement(join) or on the Racket Discord.

Please share

If you can - please help get the word out to users and platform specific repo packagers

Racket - the Language-Oriented Programming Language - version 8.15 is now available from https://download.racket-lang.org

See https://blog.racket-lang.org/2024/08/racket-v8-15.html for the release announcement and highlights.

by John Clements, Stephen De Gabrielle at Tuesday, November 5, 2024

Monday, October 28, 2024

Idiomdrottning

Watching Jellyfin videos with VLC

Okay so if you wanna watch a Jellyfin video with VLC, go to your Jellyfin instance’s web view where you can copy a stream URL and it’ll look like this:

https://jellyfinserver.tld/Items/TheIdForTheVideoGoesHere/Download?api_key=YourKeyIdGoesHere

And you can paste that URL into your VLC.

If you want to grab a whole season’s worth of video IDs that you can mangle into a .pls, you can with Jellyfin’s API! Here’s an example using httpie and jq.

https jellyfinserver.tld/Shows/ShowIdThatYouAlsoCanFindFromTheWebView/Episodes X-Emby-Token:YourKeyIdGoesHere season==1 |jq -r '.Items[].Id'

With VLC, watching stream URLs from .pls files is way better than watching individually pasted streams since it can remember how far into the video you were.

Pretty sure this works with other stream players too, like mpd.

by Idiomdrottning (sandra.snan@idiomdrottning.org) at Monday, October 28, 2024

Tuesday, October 22, 2024

spritely.institute

Make a game with Hoot for the Autumn Lisp Game Jam!

Hi everyone, it's that time of year again: the Autumn Lisp Game Jam is upon us!

In case you're unfamiliar, the Lisp Game Jam is a twice yearly ten day game jam co-hosted by our very own Dave Thompson, in which participants make games using their favorite Lisp dialects. It's a much more relaxed schedule than your average game jam to foster a more inclusive turnout.

Although there are no prizes, there is time reserved at the end to play everyone's games and give feedback! ✨ Of course, the real game jam is the Lisps you used along the way ☺�

This fall's game jam starts on Friday, October 25th. Head on over to the official game jam page on itch.io for helpful tips and information on how to join!

Spritely is here to help

Do you wish you could use Scheme in the browser? Well thanks to Hoot, our Scheme to WebAssembly compiler, you can! In fact, we've prepared a game jam template repository to get you up and running on making a 2D game with Hoot and HTML5 canvas. It even comes with a game demo of its own!

If you run into any issues with Hoot or have ideas on how to make it even better, we'd love to hear about them.

For help with Hoot or the game template during the jam, you can post questions on our community forum or in this itch.io forum thread.

And if you'd like to show off your game in progress, we'd love to hear about it on our community forum!

Further reading

If you'd like to learn more about past game jams and what we've built, here are some of our previous posts to get you started.

Thanks for reading, and see you at the itch arcade! 👾🕹�

by tessa (contact@spritely.institute) at Tuesday, October 22, 2024

Monday, October 21, 2024

GNU Guix

Build User Takeover Vulnerability (CVE-2024-52867)

A security issue, known as CVE-2024-52867, has been identified in guix-daemon which allows for a local user to gain the privileges of any of the build users and subsequently use this to manipulate the output of any build. You are strongly advised to upgrade your daemon now (see instructions below), especially on multi-user systems.

This exploit requires the ability to start a derivation build and the ability to run arbitrary code with access to the store in the root PID namespace on the machine the build occurs on. As such, this represents an increased risk primarily to multi-user systems and systems using dedicated privilege-separation users for various daemons: without special sandboxing measures, any process of theirs can take advantage of this vulnerability.

Vulnerability

For a very long time, guix-daemon has helpfully made the outputs of failed derivation builds available at the same location they were at in the build container. This has aided greatly especially in situations where test suites require the package to already be installed in order to run, as it allows one to re-run the test suite interactively outside of the container when built with --keep-failed. This transferral of store items from inside the chroot to the real store was implemented with a simple rename, and no modification of the store item or any files it may contain.

If an attacker starts a build of a derivation that creates a binary with the setuid and/or setgid bit in an output directory, then, and the build fails, that binary will be accessible unaltered for anybody on the system. The attacker or a cooperating user can then execute the binary, gain the privileges, and from there use a combination of signals and procfs to freeze a builder, open any file it has open via /proc/$PID/fd, and overwrite it with whatever it wants. This manipulation of builds can happen regardless of which user started the build, so it can work not only for producing compromised outputs for commonly-used programs before anybody else uses them, but also for compromising any builds another user happens to start.

A related vulnerability was also discovered concerning the outputs of successful builds. These were moved - also via rename() - outside of the container prior to having their permissions, ownership, and timestamps canonicalized. This means that there also exists a window of time for a successful build's outputs during which a setuid/setgid binary can be executed.

In general, any time that a build user running a build for some submitter can get a setuid/setgid binary to a place the submitter can execute it, it is possible for the submitter to use it to take over the build user. This situation always occurs when --disable-chroot is passed to guix-daemon. This holds even in the case where there are no dedicated build users, and builds happen under the same user the daemon runs as, as happens during make check in the guix repository. Consequently, if a permissive umask that allows execute permission for untrusted users on directories all the way to a user's guix checkout is used, an attacker can use that user's test-environment daemon to gain control over their user while make check is running.

Mitigation

This security issue has been fixed by two commits. Users should make sure they have updated to the second commit to be protected from this vulnerability. Upgrade instructions are in the following section. If there is a possibility that a failed build has left a setuid/setgid binary lying around in the store by accident, run guix gc to remove all failed build outputs.

The fix was accomplished by sanitizing the permissions of all files in a failed build output prior to moving it to the store, and also by waiting to move successful build outputs to the store until after their permissions had been canonicalized. The sanitizing was done in such a way as to preserve as many non-security-critical properties of failed build outputs as possible to aid in debugging. After applying these two commits, the guix package in Guix was updated so that guix-daemon deployed using it would use the fixed version.

If you are using --disable-chroot, whether with dedicated build users or not, make sure that access to your daemon's socket is restricted to trusted users. This particularly affects anyone running make check and anyone running on GNU/Hurd. The former should either manually remove execute permission for untrusted users on their guix checkout or apply this patch, which restricts access to the test-environment daemon to the user running the tests. The latter should adjust the ownership and permissions of /var/guix/daemon-socket, which can be done for Guix System users using the new socket-directory-{perms,group,user} fields in this patch.

A proof of concept is available at the end of this post. One can run this code with:

guix repl -- setuid-exposure-vuln-check.scm

This will output whether the current guix-daemon being used is vulnerable or not. If it is not vulnerable, the last line will contain your system is not vulnerable, otherwise the last line will contain YOUR SYSTEM IS VULNERABLE.

Upgrading

Due to the severity of this security advisory, we strongly recommend all users to upgrade their guix-daemon immediately.

For Guix System, the procedure is to reconfigure the system after a guix pull, either restarting guix-daemon or rebooting. For example:

guix pull
sudo guix system reconfigure /run/current-system/configuration.scm
sudo herd restart guix-daemon

where /run/current-system/configuration.scm is the current system configuration but could, of course, be replaced by a system configuration file of a user's choice.

For Guix running as a package manager on other distributions, one needs to guix pull with sudo, as the guix-daemon runs as root, and restart the guix-daemon service, as documented. For example, on a system using systemd to manage services, run:

sudo --login guix pull
sudo systemctl restart guix-daemon.service

Note that for users with their distro's package of Guix (as opposed to having used the install script) you may need to take other steps or upgrade the Guix package as per other packages on your distro. Please consult the relevant documentation from your distro or contact the package maintainer for additional information or questions.

Conclusion

Even with the sandboxing features of modern kernels, it can be quite challenging to synthesize a situation in which two users on the same system who are determined to cooperate nevertheless cannot. Guix has an especially difficult job because it needs to not only realize such a situation, but also maintain the ability to interact with both users itself, while not allowing them to cooperate through itself in unintended ways. Keeping failed build outputs around for debugging introduced a vulnerability, but finding that vulnerability because of it enabled the discovery of an additional vulnerability that would have existed anyway, and prompted the use of mechanisms for securing access to the guix daemon.

I would like to thank Ludovic Courtès for giving feedback on these vulnerabilities and their fixes — discussion of which led to discovering the vulnerable time window with successful build outputs — and also for helping me to discover that my email server was broken.

Proof of Concept

Below is code to check if your guix-daemon is vulnerable to this exploit. Save this file as setuid-exposure-vuln-check.scm and run following the instructions above, in "Mitigation."

(use-modules (guix)
             (srfi srfi-34))

(define maybe-setuid-file
  ;; Attempt to create a setuid file in the store, with one of the build
  ;; users as its owner.
  (computed-file "maybe-setuid-file"
                 #~(begin
                     (call-with-output-file #$output (const #t))
                     (chmod #$output #o6000)

                     ;; Failing causes guix-daemon to copy the output from
                     ;; its temporary location back to the store.
                     (exit 1))))

(with-store store
  (let* ((drv (run-with-store store
                (lower-object maybe-setuid-file)))
         (out (derivation->output-path drv)))
    (guard (c (#t
               (if (zero? (logand #o6000 (stat:perms (stat out))))
                   (format #t "~a is not setuid: your system is not \
vulnerable.~%"
                           out)
                   (format #t "~a is setuid: YOUR SYSTEM IS VULNERABLE.

Run 'guix gc' to remove that file and upgrade.~%"
                           out))))
      (build-things store (list (derivation-file-name drv))))))

by Caleb Ristvedt at Monday, October 21, 2024

Thursday, October 17, 2024

spritely.institute

Our first office hours: A recap (and how to join the next one)

In mid-September, the Spritely Institute held our first open office hours event. 🎉

The team was surprised and delighted that ten members of our community came to ask questions and learn more about our work!

A dive into Spritely

After opening with introductions, we asked attendees what brought them here and if they had any questions for the team. Reasons included:

  • Curiosity
  • Wanting to try Spritely technologies for game development and more
  • Tangential interest through Scheme
  • Concerns about social network design
  • Desire to learn more about cryptography
  • Enthusiasm to hear about the institute's roadmap

As for questions, the discussion this time around centered on broad overviews of the Institute and broader community's various initiatives, such as:

Thank you to everyone who joined us for a rousing conversation in September!

What the future holds

Going forward, we're looking to establish a cadence of meeting on the last Wednesday of the month at 2:00 PM Eastern Time (UTC-4/5 depending on Daylight Saving Time), with exceptions in the schedule to account for holidays and so forth.

As previously, reservations are available for the closest new office hours meeting, and drop ins are also welcome!

For our next meeting, we're interested in hosting a discussion on community-driven topics, as well as demoing how to get started with Hoot for the Autumn Lisp Game Jam starting up next Friday, October 25th.

We'll be addressing pre-scheduled project and other technical questions first, so if that describes your needs, you can read more about how to reserve time with us below.

To submit topics for the open discussion, we'd love to hear your thoughts on our community form thread here. We'll add some suggestions first to help get the party started. 🥳

Reserving an office hours slot

If, after filing an issue and/or posting on our forums, you still require further discussion, you can schedule a meeting with a member of the team here.

Dropping in

Alternatively, if you just want to say hello, listen in, or think of a burning question on the day of, you can join us impromptu in the meeting room. Our next meeting as of this post is on Wednesday, October 30th at 2:00 PM EDT (UTC-4)!

In the meantime, we look forward to seeing you around on our community forum. Thanks for reading, and see you soon! 👋

by tessa (contact@spritely.institute) at Thursday, October 17, 2024

Wednesday, October 16, 2024

Idiomdrottning

md, mvd and cpd

I have gotten so much mileage out of these three zsh functions for making directories for over a decade, I should’ve posted them a long time ago but I just didn’t think to.

md () {
	mkdir -p $1
	cd $1
}

mvd () {
	mkdir -p "${@[$#]}"
	mv "$@"
	cd "${@[$#]}"
}

cpd () {
	mkdir -p "${@[$#]}"
	cp -r "$@"
	cd "${@[$#]}"
}

by Idiomdrottning (sandra.snan@idiomdrottning.org) at Wednesday, October 16, 2024

Thursday, October 3, 2024

Andy Wingo

preliminary notes on a nofl field-logging barrier

When you have a generational collector, you aim to trace only the part of the object graph that has been allocated recently. To do so, you need to keep a remembered set: a set of old-to-new edges, used as roots when performing a minor collection. A language run-time maintains this set by adding write barriers: little bits of collector code that run when a mutator writes to a field.

Whippet’s nofl space is a block-structured space that is appropriate for use as an old generation or as part of a sticky-mark-bit generational collector. It used to have a card-marking write barrier; see my article diving into V8’s new write barrier, for more background.

Unfortunately, when running whiffle benchmarks, I was seeing no improvement for generational configurations relative to whole-heap collection. Generational collection was doing fine in my tiny microbenchmarks that are part of Whippet itself, but when translated to larger programs (that aren’t yet proper macrobenchmarks), it was a lose.

I had planned on doing some serious tracing and instrumentation to figure out what was happening, and thereby correct the problem. I still plan on doing this, but instead for this issue I used the old noggin technique instead: just, you know, thinking about the thing, eventually concluding that unconditional card-marking barriers are inappropriate for sticky-mark-bit collectors. As I mentioned in the earlier article:

An unconditional card-marking barrier applies to stores to slots in all objects, not just those in oldspace; a store to a new object will mark a card, but that card may contain old objects which would then be re-scanned. Or consider a store to an old object in a more dense part of oldspace; scanning the card may incur more work than needed. It could also be that Whippet is being too aggressive at re-using blocks for new allocations, where it should be limiting itself to blocks that are very sparsely populated with old objects.

That’s three problems. The second is well-known. But the first and last are specific to sticky-mark-bit collectors, where pages mix old and new objects.

a precise field-logging write barrier

Back in 2019, Steve Blackburn’s paper Design and Analysis of Field-Logging Write Barriers took a look at the state of the art in precise barriers that record not regions of memory that have been updated, but the precise edges (fields) that were written to. He ends up re-using this work later in the 2022 LXR paper (see §3.4), where the write barrier is used for deferred reference counting and a snapshot-at-the-beginning (SATB) barrier for concurrent marking. All in all field-logging seems like an interesting strategy. Relative to card-marking, work during the pause is much less: you have a precise buffer of all fields that were written to, and you just iterate that, instead of iterating objects. Field-logging does impose some mutator cost, but perhaps the payoff is worth it.

To log each old-to-new edge precisely once, you need a bit per field indicating whether the field is logged already. Blackburn’s 2019 write barrier paper used bits in the object header, if the object was small enough, and otherwise bits before the object start. This requires some cooperation between the collector, the compiler, and the run-time that I wasn’t ready to pay for. The 2022 LXR paper was a bit vague on this topic, saying just that it used “a side table”.

In Whippet’s nofl space, we have a side table already, used for a number of purposes:

  1. Mark bits.

  2. Iterability / interior pointers: is there an object at a given address? If so, it will have a recognizable bit pattern.

  3. End of object, to be able to sweep without inspecting the object itself

  4. Pinning, allowing a mutator to prevent an object from being evacuated, for example because a hash code was computed from its address

  5. A hack to allow fully-conservative tracing to identify ephemerons at trace-time; this re-uses the pinning bit, since in practice such configurations never evacuate

  6. Bump-pointer allocation into holes: the mark byte table serves the purpose of Immix’s line mark byte table, but at finer granularity. Because of this though, it is swept lazily rather than eagerly.

  7. Generations. Young objects have a bit set that is cleared when they are promoted.

Well. Why not add another thing? The nofl space’s granule size is two words, so we can use two bits of the byte for field logging bits. If there is a write to a field, a barrier would first check that the object being written to is old, and then check the log bit for the field being written. The old check will be to a byte that is nearby or possibly the same as the one to check the field logging bit. If the bit is unsert, we call out to a slow path to actually record the field.

preliminary results

I disassembled the fast path as compiled by GCC and got something like this on x86-64, in AT&T syntax, for the young-generation test:

mov    %rax,%rdx
and    $0xffffffffffc00000,%rdx
shr    $0x4,%rax
and    $0x3ffff,%eax
or     %rdx,%rax
testb  $0xe,(%rax)

The first five instructions compute the location of the mark byte, from the address of the object (which is known to be in the nofl space). If it has any of the bits in 0xe set, then it’s in the old generation.

Then to test a field logging bit it’s a similar set of instructions. In one of my tests the data type looks like this:

struct Node {
  uintptr_t tag;
  struct Node *left;
  struct Node *right;
  int i, j;
};

Writing the left field will be in the same granule as the object itself, so we can just test the byte we fetched for the logging bit directly with testb against $0x80. For right, we should be able to know it’s in the same slab (aligned 4 MB region) and just add to the previously computed byte address, but the C compiler doesn’t know that right now and so recomputes. This would work better in a JIT. Anyway I think these bit-swizzling operations are just lost in the flow of memory accesses.

For the general case where you don’t statically know the offset of the field in the object, you have to compute which bit in the byte to test:

mov    %r13,%rcx
mov    $0x40,%eax
shr    $0x3,%rcx
and    $0x1,%ecx
shl    %cl,%eax
test   %al,%dil

Is it good? Well, it improves things for my whiffle benchmarks, relative to the card-marking barrier, seeing a 1.05×-1.5× speedup across a range of benchmarks. I suspect the main advantage is in avoiding the “unconditional” part of card marking, where a write to a new object could cause old objects to be added to the remembered set. There are still quite a few whiffle configurations in which the whole-heap collector outperforms the sticky-mark-bit generational collector, though; I hope to understand this a bit more by building a more classic semi-space nursery, and comparing performance to that.

Implementation links: the barrier fast-path, the slow path, and the sequential store buffers. (At some point I need to make it so that allocating edge buffers in the field set causes the nofl space to page out a corresponding amount of memory, so as to be honest when comparing GC performance at a fixed heap size.)

Until next time, onwards and upwards!

by Andy Wingo at Thursday, October 3, 2024

Tuesday, October 1, 2024

Idiomdrottning

chunk-by

Takes a list and splits it in order into smaller lists such that each list contains one pred. If there are no pred, that’s fine, DTRT silently i.e. make a one-element list containing the whole list.

(define (vowel? l) ((as-list (c member l)) "aeiou"))

(define (chunk-by pred lis) (list lis))

(define (chunk-by pred lis)
  (receive (hd tl)
      (split-at lis (add1 (require number? (list-index pred lis))))
    (cons hd (chunk-by pred (require (c any pred) tl)))))

(chunk-by vowel? (string->list "somejunk"))

⇒ ((#\s #\o) (#\m #\e) (#\j #\u #\n #\k))

by Idiomdrottning (sandra.snan@idiomdrottning.org) at Tuesday, October 1, 2024

Thursday, September 26, 2024

Andy Wingo

needed-bits optimizations in guile

Hey all, I had a fun bug this week and want to share it with you.

numbers and representations

First, though, some background. Guile’s numeric operations are defined over the complex numbers, not over e.g. a finite field of integers. This is generally great when writing an algorithm, because you don’t have to think about how the computer will actually represent the numbers you are working on.

In practice, Guile will represent a small exact integer as a fixnum, which is a machine word with a low-bit tag. If an integer doesn’t fit in a word (minus space for the tag), it is represented as a heap-allocated bignum. But sometimes the compiler can realize that e.g. the operands to a specific bitwise-and operation are within (say) the 64-bit range of unsigned integers, and so therefore we can use unboxed operations instead of the more generic functions that do run-time dispatch on the operand types, and which might perform heap allocation.

Unboxing is important for speed. It’s also tricky: under what circumstances can we do it? In the example above, there is information that flows from defs to uses: the operands of logand are known to be exact integers in a certain range and the operation itself is closed over its domain, so we can unbox.

But there is another case in which we can unbox, in which information flows backwards, from uses to defs: if we see (logand n #xff), we know:

  • the result will be in [0, 255]

  • that n will be an exact integer (or an exception will be thrown)

  • we are only interested in a subset of n‘s bits.

Together, these observations let us transform the more general logand to an unboxed operation, having first truncated n to a u64. And actually, the information can flow from use to def: if we know that n will be an exact integer but don’t know its range, we can transform the potentially heap-allocating computation that produces n to instead truncate its result to the u64 range where it is defined, instead of just truncating at the use; and potentially this information could travel farther up the dominator tree, to inputs of the operation that defines n, their inputs, and so on.

needed-bits: the |0 of scheme

Let’s say we have a numerical operation that produces an exact integer, but we don’t know the range. We could truncate the result to a u64 and use unboxed operations, if and only if only u64 bits are used. So we need to compute, for each variable in a program, what bits are needed from it.

I think this is generally known a needed-bits analysis, though both Google and my textbooks are failing me at the moment; perhaps this is because dynamic languages and flow analysis don’t get so much attention these days. Anyway, the analysis can be local (within a basic block), global (all blocks in a function), or interprocedural (larger than a function). Guile’s is global. Each CPS/SSA variable in the function starts as needing 0 bits. We then compute the fixpoint of visiting each term in the function; if a term causes a variable to flow out of the function, for example via return or call, the variable is recorded as needing all bits, as is also the case if the variable is an operand to some primcall that doesn’t have a specific needed-bits analyser.

Currently, only logand has a needed-bits analyser, and this is because sometimes you want to do modular arithmetic, for example in a hash function. Consider Bon Jenkins’ lookup3 string hash function:

#define rot(x,k) (((x)<<(k)) | ((x)>>(32-(k))))
#define mix(a,b,c) \
{ \
  a -= c;  a ^= rot(c, 4);  c += b; \
  b -= a;  b ^= rot(a, 6);  a += c; \
  c -= b;  c ^= rot(b, 8);  b += a; \
  a -= c;  a ^= rot(c,16);  c += b; \
  b -= a;  b ^= rot(a,19);  a += c; \
  c -= b;  c ^= rot(b, 4);  b += a; \
}
...

If we transcribe this to Scheme, we get something like:

(define (jenkins-lookup3-hashword2 str)
  (define (u32 x) (logand x #xffffFFFF))
  (define (shl x n) (u32 (ash x n)))
  (define (shr x n) (ash x (- n)))
  (define (rot x n) (logior (shl x n) (shr x (- 32 n))))
  (define (add x y) (u32 (+ x y)))
  (define (sub x y) (u32 (- x y)))
  (define (xor x y) (logxor x y))

  (define (mix a b c)
    (let* ((a (sub a c)) (a (xor a (rot c 4)))  (c (add c b))
           (b (sub b a)) (b (xor b (rot a 6)))  (a (add a c))
           (c (sub c b)) (c (xor c (rot b 8)))  (b (add b a))
           ...)
      ...))

  ...

These u32 calls are like the JavaScript |0 idiom, to tell the compiler that we really just want the low 32 bits of the number, as an integer. Guile’s compiler will propagate that information down to uses of the defined values but also back up the dominator tree, resulting in unboxed arithmetic for all of these operations.

(When writing this, I got all the way here and then realized I had already written quite a bit about this, almost a decade ago ago. Oh well, consider this your lucky day, you get two scoops of prose!)

the bug

All that was just prelude. So I said that needed-bits is a fixed-point flow analysis problem. In this case, I want to compute, for each variable, what bits are needed for its definition. Because of loops, we need to keep iterating until we have found the fixed point. We use a worklist to represent the conts we need to visit.

Visiting a cont may cause the program to require more bits from the variables that cont uses. Consider:

(define-significant-bits-handler
    ((logand/immediate label types out res) param a)
  (let ((sigbits (sigbits-intersect
                   (inferred-sigbits types label a)
                   param
                   (sigbits-ref out res))))
    (intmap-add out a sigbits sigbits-union)))

This is the sigbits (needed-bits) handler for logand when one of its operands (param) is a constant and the other (a) is variable. It adds an entry for a to the analysis out, which is an intmap from variable to a bitmask of needed bits, or #f for all bits. If a already has some computed sigbits, we add to that set via sigbits-union. The interesting point comes in the sigbits-intersect call: the bits that we will need from a are first the bits that we infer a to have, by forward type-and-range analysis; intersected with the bits from the immediate param; intersected with the needed bits from the result value res.

If the intmap-add call is idempotent—i.e., out already contains sigbits for a—then out is returned as-is. So we can check for a fixed-point by comparing out with the resulting analysis, via eq?. If they are not equal, we need to add the cont that defines a to the worklist.

The bug? The bug was that we were not enqueuing the def of a, but rather the predecessors of label. This works when there are no cycles, provided we visit the worklist in post-order; and regardless, it works for many other analyses in Guile where we compute, for each labelled cont (basic block), some set of facts about all other labels or about all other variables. In that case, enqueuing a predecessor on the worklist will cause all nodes up and to including the variable’s definition to be visited, because each step adds more information (relative to the analysis computed on the previous visit). But it doesn’t work for this case, because we aren’t computing a per-label analysis.

The solution was to rewrite that particular fixed-point to enqueue labels that define a variable (possibly multiple defs, because of joins and loop back-edges), instead of just the predecessors of the use.

Et voilà ! If you got this far, bravo. Type at y’all again soon!

by Andy Wingo at Thursday, September 26, 2024

spritely.institute

Spritely went to DWeb Camp: 2024 Recap

Decorative painting of goblin trying to set up a tent but tripping and flopping it down on another

Last month, Spritely's Executive Director (Christine) and CTO (me) attended DWeb Camp. DWeb Camp is an annual tech conference unlike any I’ve experienced before. It's organized by the Internet Archive, and is focused on creating a truly decentralized web.

Sign at campsite entrance, adapted from photo by John Bruhling

In lieu of a traditional conference space, DWeb Camp takes place at a campground in the redwood forests of northern California. On top being outdoors, the event has several layers of COVID mitigations in place to help protect public safety. It's a unique setting for discussing all the problems facing the decentralized web and the solutions currently being developed. Getting there from the east coast of the US was a long journey, but DWeb Camp was worth it!

DWeb Camp main building, adapted from photo by John Bruhling

The plan 🗺�

Some of Spritely’s many goals at DWeb Camp this year were to:

  • Share the current state of our technology through presentations

  • Generate excitement for our work with cool demos

  • Talk to Spritely supporters/collaborators

  • Make new connections with potential future collaborators and funders

Presentations and panels 🎤

DWeb Camp stage, adapted from photo by JohnBruhling

To that end, we gave two presentations:

  • Why Spritely is all-in on WebAssembly: A lightning talk about our Hoot project and why we’re investing in WebAssembly. This talk was recorded but, as of this writing, the video has not yet been posted publicly.

  • Keeping decentralized networking fun with Spritely: A longer talk about why Spritely uses games as a way to generate excitement for decentralized web technology and make a very technical subject more fun and approachable. This talk was not recorded, unfortunately.

Additionally, Christine participated in a couple panels:

  • Film Showing: Secrets in Your Data: An outdoor screening of PBS NOVA’s Secrets in Your Data documentary (which features Christine for her work on ActivityPub) followed by a Q&A session.

  • Successful Cultures for Long-Term Decentralized Communities: An open discussion on what technical, structural, and cultural considerations engender sustainable, decentralized communities.

As if that weren’t enough, Christine also ran an in-person instance of the Hack & Craft workshop, a part of her FOSS and Crafts podcast project.

The Night Market 🌙

DWeb Camp enameled mugs, adapted from photo by Brad Shirakawa

The best event of all for Spritely was the Demo Night Market, which gave us a chance for us to run in-person, live demos of our technologies. Tables were placed in a circle around a cluster of redwood trees and the space was divided up amongst the various projects on display.

We shared a table with object capability comrades from the Endo JS project. Our theme was the Spritely Arcade. We featured games like Cirkoban and Strigoform that show off our Goblins and Hoot projects.

We didn’t have room in our checked luggage for arcade cabinets (but how cool would it be to roll up to camp with a Taito Egret II?) so we made do with simply running the games on some spare laptops. What we did have room for, though, were a bunch of lovely, colorful cardboard displays made by tessa (Developer Relations) to attract passersby. We even had a DIY gachapon machine as a fun way to give out stickers and trinkets.

Our combination of displays and cute pixel graphics worked well… almost too well. We were busy entertaining from the moment the night market opened until a half hour after it officially closed! While I’ve never been to a true night market, we were told our table embodied the vibe and spirit of one. What a wonderful compliment to top off a fantastic night!

Speaking of lovely, colorful things… it’s safe to say that our merch game was on point this year, thanks to tessa. We had many types of stickers, postcards, coasters, stamps, and even coloring books (many campers bring their kids but these were a hit with adults, too!) Check out our pre-DWeb Camp post to see lovely pictures of all the wonderful merch.

All work and some play 🥽

During rare moments of downtime, we had fun doing things like:

  • Singing karaoke

  • Participating in the talent show

  • Swimming in the nearby river

  • Stargazing

Audiovisual floral art installation, adapted from photo by Brad Shirakawa

Final thoughts ��

In summary, DWeb Camp was an overwhelmingly good time! We left camp exhausted but with high spirits and optimism that we’re building the right things for the future of the decentralized web. It’s clear that many people like what we’re doing and for that we are very grateful. If these goals and activities sound at a tech conference sound up your alley, DWeb Camp might be for you, too!

Upshot of trees, adapted from photo by John Bruhling

As always, if you’d like to discuss anything Spritely-related then please join our community forum. Until next time! ��


Unedited photo sources: 1 2 3 4 5 6

Original photos by John Bruhling and Brad Shirakawa

by Dave Thompson (contact@spritely.institute) at Thursday, September 26, 2024