Planet Scheme

Wednesday, November 6, 2024

The Racket Blog

Racket v8.15

posted by Stephen De Gabrielle


We are pleased to announce Racket v8.15 is now available from https://download.racket-lang.org/.

As of this release:

  • Documentation search results are ordered, with visual cues indicating what their source is (core, main-distribution, etc.). These results are also grouped by language family (Racket, Rhombus, etc.). (See https://docs.racket-lang.org/search/index.html?q=second .)

  • DrRacket offers to restore previously open files when starting, which can be made the default.

  • In DrRacket, Picts can be saved by right-clicking.

  • raco pkg introduces the uninstall command as the opposite of install. The remove name for this functionality is retained for compatibility. (See https://docs.racket-lang.org/pkg/cmdline.html#%28part._raco-pkg-uninstall%29 .)

  • raco pkg improves the handling of --clone and --unclone. (See https://docs.racket-lang.org/pkg/cmdline.html#%28part._raco-pkg-update%29 .)

  • iOS is a compilation target, distinct from macOS.

  • Racket supports falling back to IPv4 during hostname resolution when IPv6 fails.

  • Memory allocated using the ffi/unsafe library can be initially zeroed, using the 'zeroed-atomic and 'zeroed-atomic-interior flags. (See https://docs.racket-lang.org/foreign/foreign_pointer-funcs.html#%28idx._%28gentag.11.%28lib._scribblings%2Fforeign%2Fforeign..scrbl%29%29%29 .)

  • Many other bugs are fixed and documentation has been improved!

Thank you

The following people contributed to this release:

Alec Mills, Alex Knauth, Alexander Shopov, Ashlynn Anderson, Ashton Wiersdorf, Ben Greenman, Benjamin Yeung, Bob Burger, Bogdan Popa, Breck Yunits, Carl Gay, Claes Wallin (韋嘉誠), CooperCorad, Crystal Jacobs, D. Ben Knoble, Dexter Santucci, Eduardo Cavazos, Emil Szpakowski, evelynmitchell, Greg Hendershott, Gunnar Ahlberg, Gwen Weinholt, Idiomdrottning, Ikko Eltociear Ashimine, Jacqueline Firth, Jarhmander, Jay McCarthy, Jens Axel Søgaard, Jimmy McNutt, jinser, Jinser Kafka, John Clements, lukejianu, Marc Nieper-Wißkirchen, Matej Fandl, Matthew Flatt, Matthias Felleisen, Michael Ballantyne, Mike Sperber, olopierpa, Paul Morris, Phil Nguyen, Philip McGrath, Robby Findler, Ronald Garcia, Ryan Culpepper, Sam Phillips, Sam Tobin-Hochstadt, Siddhartha Kasivajhula, Sorawee Porncharoenwase, Stephen De Gabrielle, Syntacticlosure, Taylor Allred, Tomas Fabrizio Orsi, Wing Hei Chan, and Yafei Yang.

Racket is a community-developed open source project and we welcome new contributors. See racket/README.md to learn how you can be a part of this amazing project.

Feedback Welcome

Questions and discussion are welcome at the Racket community Discourse announcement (join) or on the Racket Discord.

Please share

If you can, please help get the word out to users and platform-specific repo packagers.

Racket - the Language-Oriented Programming Language - version 8.15 is now available from https://download.racket-lang.org

See https://blog.racket-lang.org/2024/08/racket-v8-15.html for the release announcement and highlights.

by John Clements, Stephen De Gabrielle at Wednesday, November 6, 2024

Monday, October 28, 2024

Idiomdrottning

Watching Jellyfin videos with VLC

Okay so if you wanna watch a Jellyfin video with VLC, go to your Jellyfin instance’s web view where you can copy a stream URL and it’ll look like this:

https://jellyfinserver.tld/Items/TheIdForTheVideoGoesHere/Download?api_key=YourKeyIdGoesHere

And you can paste that URL into your VLC.

If you want to grab a whole season’s worth of video IDs that you can mangle into a .pls, you can with Jellyfin’s API! Here’s an example using httpie and jq.

https jellyfinserver.tld/Shows/ShowIdThatYouAlsoCanFindFromTheWebView/Episodes X-Emby-Token:YourKeyIdGoesHere season==1 |jq -r '.Items[].Id'

With VLC, watching stream URLs from .pls files is way better than watching individually pasted streams since it can remember how far into the video you were.

Pretty sure this works with other stream players too, like mpd.

by Idiomdrottning (sandra.snan@idiomdrottning.org) at Monday, October 28, 2024

Tuesday, October 22, 2024

spritely.institute

Make a game with Hoot for the Autumn Lisp Game Jam!

Hi everyone, it's that time of year again: the Autumn Lisp Game Jam is upon us!

In case you're unfamiliar, the Lisp Game Jam is a twice-yearly, ten-day game jam co-hosted by our very own Dave Thompson, in which participants make games using their favorite Lisp dialects. It's a much more relaxed schedule than your average game jam, to foster a more inclusive turnout.

Although there are no prizes, there is time reserved at the end to play everyone's games and give feedback! ✨ Of course, the real game jam is the Lisps you used along the way ☺

This fall's game jam starts on Friday, October 25th. Head on over to the official game jam page on itch.io for helpful tips and information on how to join!

Spritely is here to help

Do you wish you could use Scheme in the browser? Well thanks to Hoot, our Scheme to WebAssembly compiler, you can! In fact, we've prepared a game jam template repository to get you up and running on making a 2D game with Hoot and HTML5 canvas. It even comes with a game demo of its own!

If you run into any issues with Hoot or have ideas on how to make it even better, we'd love to hear about them.

For help with Hoot or the game template during the jam, you can post questions on our community forum or in this itch.io forum thread.

And if you'd like to show off your game in progress, we'd love to hear about it on our community forum!

Further reading

If you'd like to learn more about past game jams and what we've built, here are some of our previous posts to get you started.

Thanks for reading, and see you at the itch arcade! 👾🕹

by tessa (contact@spritely.institute) at Tuesday, October 22, 2024

Monday, October 21, 2024

GNU Guix

Build User Takeover Vulnerability

A security issue has been identified in guix-daemon which allows a local user to gain the privileges of any of the build users and subsequently use this to manipulate the output of any build. You are strongly advised to upgrade your daemon now (see instructions below), especially on multi-user systems.

This exploit requires the ability to start a derivation build and the ability to run arbitrary code with access to the store in the root PID namespace on the machine the build occurs on. As such, this represents an increased risk primarily to multi-user systems and systems using dedicated privilege-separation users for various daemons: without special sandboxing measures, any process of theirs can take advantage of this vulnerability.

Vulnerability

For a very long time, guix-daemon has helpfully made the outputs of failed derivation builds available at the same location they occupied in the build container. This has been a great aid, especially in situations where test suites require the package to already be installed in order to run, as it allows one to re-run the test suite interactively outside of the container when built with --keep-failed. This transferral of store items from inside the chroot to the real store was implemented with a simple rename, with no modification of the store item or any files it may contain.

If an attacker starts a build of a derivation that creates a binary with the setuid and/or setgid bit set in an output directory, and the build fails, that binary will be accessible unaltered to anybody on the system. The attacker or a cooperating user can then execute the binary, gain its privileges, and from there use a combination of signals and procfs to freeze a builder, open any file it has open via /proc/$PID/fd, and overwrite it with whatever they want. This manipulation of builds can happen regardless of which user started the build, so it works not only for producing compromised outputs for commonly-used programs before anybody else uses them, but also for compromising any builds another user happens to start.

A related vulnerability was also discovered concerning the outputs of successful builds. These were moved - also via rename() - outside of the container prior to having their permissions, ownership, and timestamps canonicalized. This means that there also exists a window of time for a successful build's outputs during which a setuid/setgid binary can be executed.

In general, any time that a build user running a build for some submitter can get a setuid/setgid binary to a place the submitter can execute it, it is possible for the submitter to use it to take over the build user. This situation always occurs when --disable-chroot is passed to guix-daemon. This holds even in the case where there are no dedicated build users, and builds happen under the same user the daemon runs as, as happens during make check in the guix repository. Consequently, if a permissive umask that allows execute permission for untrusted users on directories all the way to a user's guix checkout is used, an attacker can use that user's test-environment daemon to gain control over their user while make check is running.

Mitigation

This security issue has been fixed by two commits. Users should make sure they have updated to the second commit to be protected from this vulnerability. Upgrade instructions are in the following section. If there is a possibility that a failed build has left a setuid/setgid binary lying around in the store by accident, run guix gc to remove all failed build outputs.

The fix was accomplished by sanitizing the permissions of all files in a failed build output prior to moving it to the store, and also by waiting to move successful build outputs to the store until after their permissions had been canonicalized. The sanitizing was done in such a way as to preserve as many non-security-critical properties of failed build outputs as possible to aid in debugging. After applying these two commits, the guix package in Guix was updated so that guix-daemon deployed using it would use the fixed version.

If you are using --disable-chroot, whether with dedicated build users or not, make sure that access to your daemon's socket is restricted to trusted users. This particularly affects anyone running make check and anyone running on GNU/Hurd. The former should either manually remove execute permission for untrusted users on their guix checkout or apply this patch, which restricts access to the test-environment daemon to the user running the tests. The latter should adjust the ownership and permissions of /var/guix/daemon-socket, which can be done for Guix System users using the new socket-directory-{perms,group,user} fields in this patch.

A proof of concept is available at the end of this post. One can run this code with:

guix repl -- setuid-exposure-vuln-check.scm

This will output whether the current guix-daemon being used is vulnerable or not. If it is not vulnerable, the last line will contain your system is not vulnerable; otherwise, the last line will contain YOUR SYSTEM IS VULNERABLE.

Upgrading

Due to the severity of this security advisory, we strongly recommend that all users upgrade their guix-daemon immediately.

For Guix System, the procedure is to reconfigure the system after a guix pull, either restarting guix-daemon or rebooting. For example:

guix pull
sudo guix system reconfigure /run/current-system/configuration.scm
sudo herd restart guix-daemon

where /run/current-system/configuration.scm is the current system configuration but could, of course, be replaced by a system configuration file of a user's choice.

For Guix running as a package manager on other distributions, one needs to guix pull with sudo, as the guix-daemon runs as root, and restart the guix-daemon service, as documented. For example, on a system using systemd to manage services, run:

sudo --login guix pull
sudo systemctl restart guix-daemon.service

Note that if you use your distro's package of Guix (as opposed to having used the install script), you may need to take other steps or upgrade the Guix package as you would other packages on your distro. Please consult the relevant documentation from your distro or contact the package maintainer for additional information or questions.

Conclusion

Even with the sandboxing features of modern kernels, it can be quite challenging to synthesize a situation in which two users on the same system who are determined to cooperate nevertheless cannot. Guix has an especially difficult job because it needs to not only realize such a situation, but also maintain the ability to interact with both users itself, while not allowing them to cooperate through itself in unintended ways. Keeping failed build outputs around for debugging introduced a vulnerability, but finding that vulnerability because of it enabled the discovery of an additional vulnerability that would have existed anyway, and prompted the use of mechanisms for securing access to the guix daemon.

I would like to thank Ludovic Courtès for giving feedback on these vulnerabilities and their fixes — discussion of which led to discovering the vulnerable time window with successful build outputs — and also for helping me to discover that my email server was broken.

Proof of Concept

Below is code to check if your guix-daemon is vulnerable to this exploit. Save this file as setuid-exposure-vuln-check.scm and run following the instructions above, in "Mitigation."

(use-modules (guix)
             (srfi srfi-34))

(define maybe-setuid-file
  ;; Attempt to create a setuid file in the store, with one of the build
  ;; users as its owner.
  (computed-file "maybe-setuid-file"
                 #~(begin
                     (call-with-output-file #$output (const #t))
                     (chmod #$output #o6000)

                     ;; Failing causes guix-daemon to copy the output from
                     ;; its temporary location back to the store.
                     (exit 1))))

(with-store store
  (let* ((drv (run-with-store store
                (lower-object maybe-setuid-file)))
         (out (derivation->output-path drv)))
    (guard (c (#t
               (if (zero? (logand #o6000 (stat:perms (stat out))))
                   (format #t "~a is not setuid: your system is not \
vulnerable.~%"
                           out)
                   (format #t "~a is setuid: YOUR SYSTEM IS VULNERABLE.

Run 'guix gc' to remove that file and upgrade.~%"
                           out))))
      (build-things store (list (derivation-file-name drv))))))

by Caleb Ristvedt at Monday, October 21, 2024

Thursday, October 17, 2024

spritely.institute

Our first office hours: A recap (and how to join the next one)

In mid-September, the Spritely Institute held our first open office hours event. 🎉

The team was surprised and delighted that ten members of our community came to ask questions and learn more about our work!

A dive into Spritely

After opening with introductions, we asked attendees what brought them here and if they had any questions for the team. Reasons included:

  • Curiosity
  • Wanting to try Spritely technologies for game development and more
  • Tangential interest through Scheme
  • Concerns about social network design
  • Desire to learn more about cryptography
  • Enthusiasm to hear about the institute's roadmap

As for questions, the discussion this time around centered on broad overviews of the various initiatives of the Institute and the broader community.

Thank you to everyone who joined us for a rousing conversation in September!

What the future holds

Going forward, we're looking to establish a cadence of meeting on the last Wednesday of the month at 2:00 PM Eastern Time (UTC-4/5 depending on Daylight Saving Time), with exceptions in the schedule to account for holidays and so forth.

As before, reservations are available for the next office hours meeting, and drop-ins are also welcome!

For our next meeting, we're interested in hosting a discussion on community-driven topics, as well as demoing how to get started with Hoot for the Autumn Lisp Game Jam starting up next Friday, October 25th.

We'll be addressing pre-scheduled project and other technical questions first, so if that describes your needs, you can read more about how to reserve time with us below.

To submit topics for the open discussion, we'd love to hear your thoughts on our community forum thread here. We'll add some suggestions first to help get the party started. 🥳

Reserving an office hours slot

If, after filing an issue and/or posting on our forums, you still require further discussion, you can schedule a meeting with a member of the team here.

Dropping in

Alternatively, if you just want to say hello, listen in, or think of a burning question on the day of, you can join us impromptu in the meeting room. Our next meeting as of this post is on Wednesday, October 30th at 2:00 PM EDT (UTC-4)!

In the meantime, we look forward to seeing you around on our community forum. Thanks for reading, and see you soon! 👋

by tessa (contact@spritely.institute) at Thursday, October 17, 2024

Wednesday, October 16, 2024

Idiomdrottning

md, mvd and cpd

I have gotten so much mileage out of these three zsh functions for making directories for over a decade, I should’ve posted them a long time ago but I just didn’t think to.

md () {
	mkdir -p $1
	cd $1
}

mvd () {
	mkdir -p "${@[$#]}"
	mv "$@"
	cd "${@[$#]}"
}

cpd () {
	mkdir -p "${@[$#]}"
	cp -r "$@"
	cd "${@[$#]}"
}

by Idiomdrottning (sandra.snan@idiomdrottning.org) at Wednesday, October 16, 2024

Thursday, October 3, 2024

Andy Wingo

preliminary notes on a nofl field-logging barrier

When you have a generational collector, you aim to trace only the part of the object graph that has been allocated recently. To do so, you need to keep a remembered set: a set of old-to-new edges, used as roots when performing a minor collection. A language run-time maintains this set by adding write barriers: little bits of collector code that run when a mutator writes to a field.

Whippet’s nofl space is a block-structured space that is appropriate for use as an old generation or as part of a sticky-mark-bit generational collector. It used to have a card-marking write barrier; see my article diving into V8’s new write barrier for more background.

Unfortunately, when running whiffle benchmarks, I was seeing no improvement for generational configurations relative to whole-heap collection. Generational collection was doing fine in my tiny microbenchmarks that are part of Whippet itself, but when translated to larger programs (that aren’t yet proper macrobenchmarks), it was a lose.

I had planned on doing some serious tracing and instrumentation to figure out what was happening, and thereby correct the problem. I still plan on doing this, but for this issue I instead used the old noggin technique: just, you know, thinking about the thing, eventually concluding that unconditional card-marking barriers are inappropriate for sticky-mark-bit collectors. As I mentioned in the earlier article:

An unconditional card-marking barrier applies to stores to slots in all objects, not just those in oldspace; a store to a new object will mark a card, but that card may contain old objects which would then be re-scanned. Or consider a store to an old object in a more dense part of oldspace; scanning the card may incur more work than needed. It could also be that Whippet is being too aggressive at re-using blocks for new allocations, where it should be limiting itself to blocks that are very sparsely populated with old objects.

That’s three problems. The second is well-known. But the first and last are specific to sticky-mark-bit collectors, where pages mix old and new objects.

a precise field-logging write barrier

Back in 2019, Steve Blackburn’s paper Design and Analysis of Field-Logging Write Barriers took a look at the state of the art in precise barriers that record not regions of memory that have been updated, but the precise edges (fields) that were written to. He ends up re-using this work later in the 2022 LXR paper (see §3.4), where the write barrier is used for deferred reference counting and a snapshot-at-the-beginning (SATB) barrier for concurrent marking. All in all field-logging seems like an interesting strategy. Relative to card-marking, work during the pause is much less: you have a precise buffer of all fields that were written to, and you just iterate that, instead of iterating objects. Field-logging does impose some mutator cost, but perhaps the payoff is worth it.

To log each old-to-new edge precisely once, you need a bit per field indicating whether the field is logged already. Blackburn’s 2019 write barrier paper used bits in the object header, if the object was small enough, and otherwise bits before the object start. This requires some cooperation between the collector, the compiler, and the run-time that I wasn’t ready to pay for. The 2022 LXR paper was a bit vague on this topic, saying just that it used “a side table”.

In Whippet’s nofl space, we have a side table already, used for a number of purposes:

  1. Mark bits.

  2. Iterability / interior pointers: is there an object at a given address? If so, it will have a recognizable bit pattern.

  3. End of object, to be able to sweep without inspecting the object itself

  4. Pinning, allowing a mutator to prevent an object from being evacuated, for example because a hash code was computed from its address

  5. A hack to allow fully-conservative tracing to identify ephemerons at trace-time; this re-uses the pinning bit, since in practice such configurations never evacuate

  6. Bump-pointer allocation into holes: the mark byte table serves the purpose of Immix’s line mark byte table, but at finer granularity. Because of this though, it is swept lazily rather than eagerly.

  7. Generations. Young objects have a bit set that is cleared when they are promoted.

Well. Why not add another thing? The nofl space’s granule size is two words, so we can use two bits of the byte for field-logging bits. If there is a write to a field, a barrier would first check that the object being written to is old, and then check the log bit for the field being written. The old check reads a byte that is nearby, or possibly the same as, the one holding the field-logging bit. If the bit is unset, we call out to a slow path to actually record the field.

preliminary results

I disassembled the fast path as compiled by GCC and got something like this on x86-64, in AT&T syntax, for the young-generation test:

mov    %rax,%rdx
and    $0xffffffffffc00000,%rdx
shr    $0x4,%rax
and    $0x3ffff,%eax
or     %rdx,%rax
testb  $0xe,(%rax)

The first five instructions compute the location of the mark byte, from the address of the object (which is known to be in the nofl space). If it has any of the bits in 0xe set, then it’s in the old generation.

Then to test a field logging bit it’s a similar set of instructions. In one of my tests the data type looks like this:

struct Node {
  uintptr_t tag;
  struct Node *left;
  struct Node *right;
  int i, j;
};

Writing the left field will be in the same granule as the object itself, so we can just test the byte we fetched for the logging bit directly with testb against $0x80. For right, we should be able to know it’s in the same slab (aligned 4 MB region) and just add to the previously computed byte address, but the C compiler doesn’t know that right now and so recomputes. This would work better in a JIT. Anyway I think these bit-swizzling operations are just lost in the flow of memory accesses.

For the general case where you don’t statically know the offset of the field in the object, you have to compute which bit in the byte to test:

mov    %r13,%rcx
mov    $0x40,%eax
shr    $0x3,%rcx
and    $0x1,%ecx
shl    %cl,%eax
test   %al,%dil

Is it good? Well, it improves things for my whiffle benchmarks, relative to the card-marking barrier, seeing a 1.05×-1.5× speedup across a range of benchmarks. I suspect the main advantage is in avoiding the “unconditional” part of card marking, where a write to a new object could cause old objects to be added to the remembered set. There are still quite a few whiffle configurations in which the whole-heap collector outperforms the sticky-mark-bit generational collector, though; I hope to understand this a bit more by building a more classic semi-space nursery, and comparing performance to that.

Implementation links: the barrier fast-path, the slow path, and the sequential store buffers. (At some point I need to make it so that allocating edge buffers in the field set causes the nofl space to page out a corresponding amount of memory, so as to be honest when comparing GC performance at a fixed heap size.)

Until next time, onwards and upwards!

by Andy Wingo at Thursday, October 3, 2024

Tuesday, October 1, 2024

Idiomdrottning

chunk-by

Takes a list and splits it in order into smaller lists such that each list contains one element satisfying pred. If no element satisfies pred, that’s fine; it does the right thing silently, i.e. makes a one-element list containing the whole list.

(define (vowel? l) ((as-list (c member l)) "aeiou"))

(define (chunk-by pred lis) (list lis))

(define (chunk-by pred lis)
  (receive (hd tl)
      (split-at lis (add1 (require number? (list-index pred lis))))
    (cons hd (chunk-by pred (require (c any pred) tl)))))

(chunk-by vowel? (string->list "somejunk"))

⇒ ((#\s #\o) (#\m #\e) (#\j #\u #\n #\k))

by Idiomdrottning (sandra.snan@idiomdrottning.org) at Tuesday, October 1, 2024

Thursday, September 26, 2024

Andy Wingo

needed-bits optimizations in guile

Hey all, I had a fun bug this week and want to share it with you.

numbers and representations

First, though, some background. Guile’s numeric operations are defined over the complex numbers, not over e.g. a finite field of integers. This is generally great when writing an algorithm, because you don’t have to think about how the computer will actually represent the numbers you are working on.

In practice, Guile will represent a small exact integer as a fixnum, which is a machine word with a low-bit tag. If an integer doesn’t fit in a word (minus space for the tag), it is represented as a heap-allocated bignum. But sometimes the compiler can realize that e.g. the operands to a specific bitwise-and operation are within (say) the 64-bit range of unsigned integers, and so therefore we can use unboxed operations instead of the more generic functions that do run-time dispatch on the operand types, and which might perform heap allocation.

Unboxing is important for speed. It’s also tricky: under what circumstances can we do it? In the example above, there is information that flows from defs to uses: the operands of logand are known to be exact integers in a certain range and the operation itself is closed over its domain, so we can unbox.

But there is another case in which we can unbox, in which information flows backwards, from uses to defs: if we see (logand n #xff), we know:

  • the result will be in [0, 255]

  • that n will be an exact integer (or an exception will be thrown)

  • we are only interested in a subset of n‘s bits.

Together, these observations let us transform the more general logand to an unboxed operation, having first truncated n to a u64. And actually, the information can flow from use to def: if we know that n will be an exact integer but don’t know its range, we can transform the potentially heap-allocating computation that produces n to instead truncate its result to the u64 range where it is defined, instead of just truncating at the use; and potentially this information could travel farther up the dominator tree, to inputs of the operation that defines n, their inputs, and so on.

needed-bits: the |0 of scheme

Let’s say we have a numerical operation that produces an exact integer, but we don’t know the range. We could truncate the result to a u64 and use unboxed operations, if and only if only u64 bits are used. So we need to compute, for each variable in a program, what bits are needed from it.

I think this is generally known as a needed-bits analysis, though both Google and my textbooks are failing me at the moment; perhaps this is because dynamic languages and flow analysis don’t get so much attention these days. Anyway, the analysis can be local (within a basic block), global (all blocks in a function), or interprocedural (larger than a function). Guile’s is global. Each CPS/SSA variable in the function starts as needing 0 bits. We then compute the fixpoint of visiting each term in the function; if a term causes a variable to flow out of the function, for example via return or call, the variable is recorded as needing all bits, as is also the case if the variable is an operand to some primcall that doesn’t have a specific needed-bits analyser.

Currently, only logand has a needed-bits analyser, and this is because sometimes you want to do modular arithmetic, for example in a hash function. Consider Bob Jenkins’ lookup3 string hash function:

#define rot(x,k) (((x)<<(k)) | ((x)>>(32-(k))))
#define mix(a,b,c) \
{ \
  a -= c;  a ^= rot(c, 4);  c += b; \
  b -= a;  b ^= rot(a, 6);  a += c; \
  c -= b;  c ^= rot(b, 8);  b += a; \
  a -= c;  a ^= rot(c,16);  c += b; \
  b -= a;  b ^= rot(a,19);  a += c; \
  c -= b;  c ^= rot(b, 4);  b += a; \
}
...

If we transcribe this to Scheme, we get something like:

(define (jenkins-lookup3-hashword2 str)
  (define (u32 x) (logand x #xffffFFFF))
  (define (shl x n) (u32 (ash x n)))
  (define (shr x n) (ash x (- n)))
  (define (rot x n) (logior (shl x n) (shr x (- 32 n))))
  (define (add x y) (u32 (+ x y)))
  (define (sub x y) (u32 (- x y)))
  (define (xor x y) (logxor x y))

  (define (mix a b c)
    (let* ((a (sub a c)) (a (xor a (rot c 4)))  (c (add c b))
           (b (sub b a)) (b (xor b (rot a 6)))  (a (add a c))
           (c (sub c b)) (c (xor c (rot b 8)))  (b (add b a))
           ...)
      ...))

  ...

These u32 calls are like the JavaScript |0 idiom, to tell the compiler that we really just want the low 32 bits of the number, as an integer. Guile’s compiler will propagate that information down to uses of the defined values but also back up the dominator tree, resulting in unboxed arithmetic for all of these operations.

(When writing this, I got all the way here and then realized I had already written quite a bit about this, almost a decade ago. Oh well, consider this your lucky day, you get two scoops of prose!)

the bug

All that was just prelude. So I said that needed-bits is a fixed-point flow analysis problem. In this case, I want to compute, for each variable, what bits are needed for its definition. Because of loops, we need to keep iterating until we have found the fixed point. We use a worklist to represent the conts we need to visit.

Visiting a cont may cause the program to require more bits from the variables that cont uses. Consider:

(define-significant-bits-handler
    ((logand/immediate label types out res) param a)
  (let ((sigbits (sigbits-intersect
                   (inferred-sigbits types label a)
                   param
                   (sigbits-ref out res))))
    (intmap-add out a sigbits sigbits-union)))

This is the sigbits (needed-bits) handler for logand when one of its operands (param) is a constant and the other (a) is variable. It adds an entry for a to the analysis out, which is an intmap from variable to a bitmask of needed bits, or #f for all bits. If a already has some computed sigbits, we add to that set via sigbits-union. The interesting point comes in the sigbits-intersect call: the bits that we will need from a are first the bits that we infer a to have, by forward type-and-range analysis; intersected with the bits from the immediate param; intersected with the needed bits from the result value res.

If the intmap-add call is idempotent—i.e., out already contains sigbits for a—then out is returned as-is. So we can check for a fixed-point by comparing out with the resulting analysis, via eq?. If they are not equal, we need to add the cont that defines a to the worklist.

The bug? The bug was that we were not enqueuing the def of a, but rather the predecessors of label. This works when there are no cycles, provided we visit the worklist in post-order; and regardless, it works for many other analyses in Guile where we compute, for each labelled cont (basic block), some set of facts about all other labels or about all other variables. In that case, enqueuing a predecessor on the worklist will cause all nodes up to and including the variable’s definition to be visited, because each step adds more information (relative to the analysis computed on the previous visit). But it doesn’t work for this case, because we aren’t computing a per-label analysis.

The solution was to rewrite that particular fixed-point to enqueue labels that define a variable (possibly multiple defs, because of joins and loop back-edges), instead of just the predecessors of the use.
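In outline, a worklist fixed point of this shape can be sketched generically as follows (a hedged sketch, not Guile’s actual code; the visit function and its return convention are illustrative):

```scheme
;; Generic worklist-driven fixed point. VISIT takes a label and the
;; current analysis and returns (values new-analysis labels-to-enqueue).
;; The fix described above amounts to VISIT enqueuing the labels that
;; *define* a changed variable, rather than the predecessors of the use.
(define (fixed-point visit worklist analysis)
  (if (null? worklist)
      analysis
      (call-with-values
          (lambda () (visit (car worklist) analysis))
        (lambda (analysis* to-enqueue)
          (fixed-point visit
                       (append (cdr worklist) to-enqueue)
                       analysis*)))))
```

The loop terminates when a visit stops producing new labels to enqueue, i.e. when the analysis has stopped growing.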

Et voilà ! If you got this far, bravo. Type at y’all again soon!

by Andy Wingo at Thursday, September 26, 2024

spritely.institute

Spritely went to DWeb Camp: 2024 Recap

Decorative painting of goblin trying to set up a tent but tripping and flopping it down on another

Last month, Spritely's Executive Director (Christine) and CTO (me) attended DWeb Camp. DWeb Camp is an annual tech conference unlike any I’ve experienced before. It's organized by the Internet Archive, and is focused on creating a truly decentralized web.

Sign at campsite entrance, adapted from photo by John Bruhling

In lieu of a traditional conference space, DWeb Camp takes place at a campground in the redwood forests of northern California. On top of being outdoors, the event has several layers of COVID mitigations in place to help protect public safety. It's a unique setting for discussing all the problems facing the decentralized web and the solutions currently being developed. Getting there from the east coast of the US was a long journey, but DWeb Camp was worth it!

DWeb Camp main building, adapted from photo by John Bruhling

The plan 🗺️

Some of Spritely’s many goals at DWeb Camp this year were to:

  • Share the current state of our technology through presentations

  • Generate excitement for our work with cool demos

  • Talk to Spritely supporters/collaborators

  • Make new connections with potential future collaborators and funders

Presentations and panels

DWeb Camp stage, adapted from photo by John Bruhling

To that end, we gave two presentations:

  • Why Spritely is all-in on WebAssembly: A lightning talk about our Hoot project and why we’re investing in WebAssembly. This talk was recorded but, as of this writing, the video has not yet been posted publicly.

  • Keeping decentralized networking fun with Spritely: A longer talk about why Spritely uses games as a way to generate excitement for decentralized web technology and make a very technical subject more fun and approachable. This talk was not recorded, unfortunately.

Additionally, Christine participated in a couple panels:

  • Film Showing: Secrets in Your Data: An outdoor screening of PBS NOVA’s Secrets in Your Data documentary (which features Christine for her work on ActivityPub) followed by a Q&A session.

  • Successful Cultures for Long-Term Decentralized Communities: An open discussion on what technical, structural, and cultural considerations engender sustainable, decentralized communities.

As if that weren’t enough, Christine also ran an in-person instance of the Hack & Craft workshop, a part of her FOSS and Crafts podcast project.

The Night Market 🌙

DWeb Camp enameled mugs, adapted from photo by Brad Shirakawa

The best event of all for Spritely was the Demo Night Market, which gave us a chance to run in-person, live demos of our technologies. Tables were placed in a circle around a cluster of redwood trees and the space was divided up amongst the various projects on display.

We shared a table with object capability comrades from the Endo JS project. Our theme was the Spritely Arcade. We featured games like Cirkoban and Strigoform that show off our Goblins and Hoot projects.

We didn’t have room in our checked luggage for arcade cabinets (but how cool would it be to roll up to camp with a Taito Egret II?) so we made do with simply running the games on some spare laptops. What we did have room for, though, were a bunch of lovely, colorful cardboard displays made by tessa (Developer Relations) to attract passersby. We even had a DIY gachapon machine as a fun way to give out stickers and trinkets.

Our combination of displays and cute pixel graphics worked well… almost too well. We were busy entertaining from the moment the night market opened until a half hour after it officially closed! While I’ve never been to a true night market, we were told our table embodied the vibe and spirit of one. What a wonderful compliment to top off a fantastic night!

Speaking of lovely, colorful things… it’s safe to say that our merch game was on point this year, thanks to tessa. We had many types of stickers, postcards, coasters, stamps, and even coloring books (many campers bring their kids but these were a hit with adults, too!) Check out our pre-DWeb Camp post to see lovely pictures of all the wonderful merch.

All work and some play 🥽

During rare moments of downtime, we had fun doing things like:

  • Singing karaoke

  • Participating in the talent show

  • Swimming in the nearby river

  • Stargazing

Audiovisual floral art installation, adapted from photo by Brad Shirakawa

Final thoughts

In summary, DWeb Camp was an overwhelmingly good time! We left camp exhausted but in high spirits, optimistic that we’re building the right things for the future of the decentralized web. It’s clear that many people like what we’re doing, and for that we are very grateful. If these goals and activities at a tech conference sound up your alley, DWeb Camp might be for you, too!

Upshot of trees, adapted from photo by John Bruhling

As always, if you’d like to discuss anything Spritely-related then please join our community forum. Until next time!


Unedited photo sources: 1 2 3 4 5 6

Original photos by John Bruhling and Brad Shirakawa

by Dave Thompson (contact@spritely.institute) at Thursday, September 26, 2024

Tuesday, September 24, 2024

Scheme Requests for Implementation

SRFI 250: Insertion-ordered hash tables

SRFI 250 is now in withdrawn status.

This SRFI defines an interface to hash tables, which are widely recognized as a fundamental data structure for a wide variety of applications. A hash table is a data structure that:

  • Is disjoint from all other types.
  • Provides a mapping from objects known as keys to corresponding objects known as values.
    • Keys may be any Scheme objects in some kinds of hash tables, but are restricted in other kinds.
    • Values may be any Scheme objects.
  • Provides an equality predicate which defines when a proposed key is the same as an existing key. No table may contain more than one value for a given key.
  • Provides a hash function which maps a candidate key into a non-negative exact integer.
  • Supports mutation as the primary means of setting the contents of a table.
  • Provides key lookup and destructive update in (expected) amortized constant time, provided that a satisfactory hash function is available.
  • Does not guarantee that whole-table operations work in the presence of concurrent mutation of the whole hash table. (Values may be safely mutated.)

Unlike the hash tables of SRFI 125, which is the direct ancestor of this specification, the hash tables described here are ordered by insertion: that is, associations inserted earlier in the history of the hash table appear earlier in the ordering. Advances in the implementations of hash tables, as provided by C++, Python, JavaScript, etc., make the provision of this new facility practical. As a result, the hash tables of this SRFI do not interoperate with the hash tables of SRFI 125, SRFI 126, or existing R6RS implementations.
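To illustrate the ordering guarantee, here is a hedged sketch using SRFI 125-style names (the library name and the exact SRFI 250 constructor and comparator arguments are assumptions, not quoted from the SRFI):

```scheme
(import (srfi 250)   ;; assumed library name for an implementation
        (srfi 128))  ;; comparators

(define t (make-hash-table (make-default-comparator)))
(hash-table-set! t 'banana 2)
(hash-table-set! t 'apple 5)
(hash-table-set! t 'cherry 1)

;; In an insertion-ordered table, keys come back in the order they
;; were first inserted, independent of their hash values:
(hash-table-keys t) ;; => (banana apple cherry)
```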

by John Cowan at Tuesday, September 24, 2024

SRFI 234: Topological Sorting

SRFI 234 is now in final status.

Topological sorting is an algorithm that takes a graph consisting of nodes and other nodes that depend on them, forming a partial order, and returns a list representing a total ordering of the graph. If the graph is cyclic, the topological sort will fail. The procedure topological-sort returns three values. If sorting succeeds, the first value contains the result and the second and third are #false. If sorting fails, the result is #false and the second and third value may provide additional information about the error.
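As an illustration of the algorithm itself, here is a self-contained sketch of topological sorting (not the SRFI's implementation or API; the graph representation is an assumption for this example):

```scheme
(use-modules (srfi srfi-1)) ;; find, every, remove

;; GRAPH is an alist mapping each node to the list of nodes it
;; depends on. Returns a total order with dependencies first,
;; or #f if the graph is cyclic.
(define (topo-sort graph)
  (let loop ((pending (map car graph)) (done '()))
    (cond
     ((null? pending) (reverse done))
     ;; Pick any node all of whose dependencies are already done.
     ((find (lambda (n)
              (every (lambda (d) (memq d done))
                     (cdr (assq n graph))))
            pending)
      => (lambda (n)
           (loop (remove (lambda (m) (eq? m n)) pending)
                 (cons n done))))
     ;; No ready node: the remaining nodes form a cycle.
     (else #f))))
```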

by John Cowan and Arne Babenhauserheide at Tuesday, September 24, 2024

Thursday, September 19, 2024

spritely.institute

Spritely Goblins v0.14.0: libp2p and Improved Persistence

Goblins version 0.14.0

The goblins have been working away all summer to make a new release, and we're excited to announce that version 0.14.0 of Guile Goblins is here! In it, we introduce not only a new libp2p netlayer but also a veritable smörgåsbord of new persistence features.

(Note: There are no major protocol updates to OCapN functionality in this release, and there is not an accompanying release of Racket Goblins this time.)

libp2p

libp2p is a networking stack designed for secure peer-to-peer connections, making it a great fit for Goblins. This netlayer uses transport protocols like TCP and QUIC, which help make libp2p fast.

In addition, typical netlayer implementations, such as our TCP-TLS netlayer, have trouble being reached by other peers when used behind firewalls; libp2p can sometimes mitigate this by automatically attempting NAT hole-punching to open the ports needed for peer-to-peer communication.

Note that, because libp2p doesn't have a Scheme implementation, the Goblins libp2p netlayer uses our own libp2p daemon written in Go. This needs to be run alongside Goblins — similarly to how Goblins uses the Tor daemon whenever the onion netlayer is in use. To learn more about the libp2p daemon, check out its README.

Persistence improvements (Aurie)

Last release, we introduced Aurie, our persistence system which enables writing to actors which can persist across and rehydrate from different stores automatically. In this release, we upgraded Aurie to persist all netlayers and work across different local vats!

Intra-vat persistence

When building Goblins applications, developers often code across several vats with objects that reference other objects which live on different vats on their local systems (whew! Say that ten times fast). These references are known as "far references" (or "far refs" for short), and up until this release, trying to persist an object which held far references would cause an error to be thrown!

But now there's a new actor in Goblinstown: ^persistence-registry! ^persistence-registry coordinates restoration across local vats. And while this actor itself doesn't persist any information (in other words, it isn't persistence aware), vats can use it to register their own existence at startup time. ^persistence-registry also enables a vat to request the far references on other vats that it needs to restore.

Let's see it in use!

(use-modules (goblins)
             (goblins vat)
             (goblins actor-lib cell)
             (goblins persistence-store syrup))

;; First let's spawn the ^persistence-registry actor!
;; It's not storing anything, so no need for a persistence vat
(define persistence-vat (spawn-vat #:name "Persistence"))
(define persistence-registry
  (with-vat persistence-vat
    (spawn ^persistence-registry))) ;; provided by (goblins vat)

;; Next let's spawn both our vats using the registry above.
(define-values (a-vat a-cell)
  (spawn-persistent-vat
   cell-env
   (lambda () (spawn ^cell))
   (make-syrup-store "a-vat.syrup")
   #:persistence-registry persistence-registry))

(define-values (b-vat b-cell)
  (spawn-persistent-vat
   cell-env
   ;; Keep a reference to a-cell on a-vat.
   (lambda () (spawn ^cell a-cell))
   (make-syrup-store "b-vat.syrup")
   #:persistence-registry persistence-registry))

Note that the above vats don't need to be spawned in a particular order; any far reference is restored as a promise that resolves when the object on the other vat becomes available. In other words, all far references start out as promises.

Netlayer persistence

Previously, if you wanted to maintain multiple netlayers within a persistent location, you'd have had to save a bunch of data, read it back in somehow, and provide it to all the netlayers.

As of today's release, you can now leverage Goblins' persistence system, just like you do for all the other objects, to persist netlayers. This is especially useful when considering the intra-vat persistence feature detailed above.

Selfish actors

It can sometimes be useful for actors to reference themselves, such as when building a user profile (which might want to create a sturdyref to itself). In the past, there were several ways to accomplish this, usually with the help of (goblins actor-lib selfish-spawn). The selfish-spawn library gives you this fancy selfish-spawn procedure, which everyone who wants to spawn your actor needs to use, instead of the usual spawn that you normally use when spawning actors. It's weird to need to use a special spawn procedure to spawn an actor because, really, the selfish nature of some actors is an internal concern of the actor, not the spawner. Actors shouldn't be imposing strange spawning APIs on everyone who wants to use them. On top of all that, you couldn't use selfish spawn and the persistence system together; you had to rely on other tricks to get a reference to yourself.

But with the release of 0.14.0, define-actor now provides a new #:self keyword that fixes both of these issues! You simply specify the variable to bind the "self" reference to, and you can then use it within the actor.

Take the following actor whose behavior takes an actor reference in and returns differing commentary depending on whether the reference points to itself or someone else:


(define-actor (^knows-self bcom)
  #:self self
  (lambda (someone)
    (if (eq? someone self)
        "that's me!"
        "I dunno who that is")))

We can now use the #:self keyword on define-actor. We're able to reference ourself with the identifier we choose (self). This lets our ^knows-self actor use the usual eq? identity predicate to see if the reference passed in really is itself or not!

Migrations macro

The persistence system can be used to not just store actors, but also upgrade them. Longtime readers will recognize that this is nothing new; developers could always optionally specify a version for portrait data (if no version is specified, Goblins will tag the data as version 0) and then pass said data into the restore function. In this way, developers were able to upgrade data prior to spawning the actor, but had to manually write out the code every time.

As of this release, Goblins now has a handy new migrations macro that can successfully apply migrations until it reaches the desired version. Let's look at the idea of a thermostat which initially stored temperatures in Fahrenheit but now has switched to Celsius.

(define-actor (^thermostat bcom temperature)
  #:version 2 ;; We're on the second version
  ;; Specify a restore function which we can use to migrate from previous versions
  #:restore
  (lambda (version restored-temp)
    ;; Convert the temperature if needed...
    (define temperature
      (match version
        ;; F -> C conversion
        [0 (* 5/9 (- restored-temp 32))]
        ;; C, no conversion needed.
        [1 restored-temp]))
    ;; Finally spawn the thermostat
    (spawn ^thermostat temperature))

  (methods
   [(get-current-temp) temperature]
   [(set-temp new-temp) ...]))

This technically works, but what if you wanted to see the numbers in Kelvin? Previously, you'd have had to change the initial version 0 to go from Fahrenheit to Kelvin, add a migration from Celsius to Kelvin, and handle the case where you're being handed Kelvin. Let's see how this new migration macro helps.

(define-actor (^thermostat bcom temperature)
   #:version 2
   ;; Instead of #:restore, we can use #:upgrade which isn't
   ;; responsible for spawning, but is responsible for upgrading
   ;; the portrait data. This works with the new migrations macro.
   #:upgrade
   (migrations
    ;; Version 0 (F) to version 1 (C)
    [(1 f-temp) (* 5/9 (- f-temp 32))]
    ;; Version 1 (C) to version 2 (K)
    [(2 c-temp) (+ c-temp 273.15)])

   ;; Same behavior as before
   (methods
    [(get-current-temp) temperature]
    [(set-temp new-temp) ...]))

Notice the similarity between migration syntax and the methods macro:

[(target-version data ...) body ...]

That means in our example when a Fahrenheit temperature (at version 0) is passed in, it'll run the initial F -> C migration, migrating it to Celsius, then it'll run the next to get it into Kelvin. This makes the job of writing migrations far easier as you only have to focus on migrating from one version to the next, not updating how all the migrations work.

Vat root upgrades

As we mentioned previously, Aurie, Goblins' persistence system, provides an upgrade path for actor portrait data.

But did you know it can also be useful to upgrade the roots of a graph when spawning a persistence vat? Similarly to how this works with actors, specify #:version with a version number for the roots and #:upgrade with an upgrade procedure when calling spawn-persistent-vat.

This even works with the new migrations macro!

Let's see it in action. Imagine we've got a vat that just spawns a single object as its cell:

(use-modules (goblins)
             (goblins vat)
             (goblins actor-lib cell)
             (goblins persistence-store syrup))

;; Version 0
(define-values (vat cell)
  (spawn-persistent-vat
   cell-env
   (lambda ()
     (spawn ^cell))
   (make-syrup-store "my-vat.syrup")))

We didn't specify any version above, but just as with actors, Goblins tags this as version 0 so that if we want to upgrade in the future, we have a version number to work from. Now, imagine we want to change this vat so that it returns two objects as the roots. We want to take the cell above and wrap it with swappable, which returns a proxy to the cell and a swapper actor.

Let's see what that looks like:

(use-modules (goblins)
             (goblins vat)
             (goblins actor-lib cell)
             (goblins actor-lib swappable)
             (goblins persistence-store syrup))

;; Version 1
(define-values (vat cell swapper)
  (spawn-persistent-vat
   (persistence-env-compose cell-env swappable-env)
   (lambda ()
     ;; Change the initial spawn procedure so it matches what we expect.
     (swappable (spawn ^cell)))
   (make-syrup-store "my-vat.syrup")
   ;; Specify the version now that we've changed the roots.
   #:version 1
   #:upgrade
   (migrations
    ;; Migration from 0 -> 1
    [(1 cell)
     (define-values (proxy swapper)
       (swappable cell))
     (list proxy swapper)])))

Just like we saw with the ^thermostat, we can keep adding migrations and Goblins will apply them as needed.

Getting the release

This release also includes a number of other bugfixes and minor features. See the NEWS file for more information!

As usual, if you're using Guix, you can upgrade to 0.14.0 by using:

guix pull
guix install guile-goblins

Otherwise, you can find the tarball on our release page.

If you're making something with Goblins or want to contribute to Goblins itself, be sure to join our community at community.spritely.institute! Thanks for reading, and hope to see you there!

by Jessica Tallon (contact@spritely.institute) at Thursday, September 19, 2024

Wednesday, September 18, 2024

Andy Wingo

whippet progress update: feature-complete!

Greetings, gentle readers. Today, an update on recent progress in the Whippet embeddable garbage collection library.

feature-completeness

When I started working on Whippet, two and a half years ago already, I was aiming to make a new garbage collector for Guile. In the beginning I was just focussing on proving that it would be advantageous to switch, and also learning how to write a GC. I put off features like ephemerons and heap resizing until I was satisfied with the basics.

Well now is the time: with recent development, Whippet is finally feature-complete! Huzzah! Now it’s time to actually work on getting it into Guile. (I have some performance noodling to do before then, adding tracing support, and of course I have lots of ideas for different collectors to build, but I don’t have any missing features at the current time.)

heap resizing

When you benchmark a garbage collector (or a program with garbage collection), you generally want to fix the heap size. GC overhead goes down with more memory, generally speaking, so you don’t want to compare one collector at one size to another collector at another size.

(Unfortunately, many (most?) benchmarks of dynamic language run-times and the programs that run on them fall into this trap. Imagine you are benchmarking a program before and after a change. Automatic heap sizing is on. Before your change the heap size is 200 MB, but after it is 180 MB. The benchmark shows the change to regress performance. But did it really? It could be that at the same heap size, the change improved performance. You won’t know unless you fix the heap size.)

Anyway, Whippet now has an implementation of MemBalancer. After every GC, we measure the live size of the heap, and compute a new GC speed, as a constant factor to live heap size. Every second, in a background thread, we observe the global allocation rate. The heap size is then the live data size plus the square root of the live data size times a factor. The factor has two components. One is constant, the expansiveness of the heap: the higher it is, the more room the program has. The other depends on the program, and is computed as the square root of the ratio of allocation speed to collection speed.
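Concretely, the sizing rule as described reads roughly like this (a sketch of the formula only, not Whippet's actual code; the names are illustrative):

```scheme
;; heap-size = live + sqrt(live) * E * sqrt(allocation-rate / gc-speed)
;; where E is the constant "expansiveness" factor.
(define (membalancer-heap-size live expansiveness alloc-rate gc-speed)
  (+ live
     (* (sqrt live)
        expansiveness
        (sqrt (/ alloc-rate gc-speed)))))
```

A faster allocator or a slower collector both grow the square-root term, giving the program more headroom between collections.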

With MemBalancer, the heap ends up getting resized at every GC, and via the heartbeat thread. During the GC it’s easy because it’s within the pause; no mutators are running. From the heartbeat thread, mutators are active: taking the heap lock prevents concurrent resizes, but mutators are still consuming empty blocks and producing full blocks. This works out fine in the same way that concurrent mutators is fine: shrinking takes blocks from the empty list one by one, atomically, and returns them to the OS. Expanding might reclaim paged-out blocks, or allocate new slabs of blocks.

However, even with some exponentially weighted averaging on the speed observations, I have a hard time understanding whether the algorithm is overall a good thing. I like the heartbeat thread, as it can reduce memory use of idle processes. The general square-root idea sounds nice enough. But adjusting the heap size at every GC feels like giving control of your stereo’s volume knob to a hyperactive squirrel.

GC collection time vs memory usage graph comparing V8 with MemBalancer. MemBalancer is shown to be better.
Figure 5 from the MemBalancer paper

Furthermore, the graphs in the MemBalancer paper are not clear to me: the paper claims more optimal memory use even in a single-heap configuration, but the x axis of the graphs is “average heap size”, which I understand to mean that maximum heap size could be higher than V8’s maximum heap size, taking into account more frequent heap size adjustments. Also, some measurement of total time would have been welcome, in addition to the “garbage collection time” on the paper’s y axis; there are cases where less pause time doesn’t necessarily correlate to better total times.

deferred page-out

Motivated by MemBalancer’s jittery squirrel, I implemented a little queue for use in paging blocks in and out, for the mmc and pcc collectors: blocks are quarantined for a second or two before being returned to the OS via madvise(MADV_DONTNEED). That way if you release a page and then need to reacquire it again, you can do so without bothering the kernel or other threads. Does it matter? It seems to improve things marginally and conventional wisdom says to not mess with the page table too much, but who knows.

mmc rename

Relatedly, Whippet used to be three things: the project itself, consisting of an API and a collection of collectors; one specific collector; and one specific space within that collector. Last time I mentioned that I renamed the whippet space to the nofl space. Now I finally got around to renaming what was the whippet collector as well: it is now the mostly-marking collector, or mmc. Be it known!

Also as a service note, I removed the “serial copying collector” (scc). It had the same performance as the parallel copying collector with parallelism=1, and pcc can be compiled with GC_PARALLEL=0 to explicitly choose the simpler serial grey-object worklist.

per-object pinning

The nofl space has always supported pinned objects, but it was never exposed in the API. Now it is!

Of course, collectors with always-copying spaces won’t be able to pin objects. If we want to support use of these collectors with embedders that require pinning, perhaps because of conservative root scanning, we’d need to switch to some kind of mostly-copying algorithm.

safepoints without allocation

Another missing feature was a safepoint API. It hasn’t been needed up to now because my benchmarks all allocate, but for programs that have long (tens of microseconds maybe) active periods without allocation, you want to be able to stop them without waiting too long. Well we have that exposed in the API now.

removed ragged stop

Following on my article on ragged stops, I removed ragged-stop marking from mmc, for a nice net 180 line reduction in some gnarly code. Speed seems to be similar.

next up: tracing

And with that, I’m relieved to move on to the next phase of Whippet development. Thanks again to NLnet for their support of this work. Next up, adding fine-grained tracing, so that I can noodle a bit on performance. Happy allocating!

by Andy Wingo at Wednesday, September 18, 2024

Friday, September 13, 2024

Scheme Requests for Implementation

SRFI 249: Restarting conditions

SRFI 249 is now in withdrawn status.

When an exceptional situation is encountered by a program, it may create a condition object describing the situation and then signal the condition and pass control to a condition handler. The signaler and handler are two different parts of a system, between which there is a barrier of abstraction. In order to recover gracefully and flexibly from exceptional situations, however, the signaler can provide multiple ways by which the handler can restart the computation, some of which may require extra input. Often, the decision of which method of recovery to choose is left up to a human user, who may be prompted for the input needed to recover. This SRFI proposes a simple mechanism called restarters to encapsulate the information necessary to restart a computation with associated interactive prompters.

by John Cowan at Friday, September 13, 2024