Planet Scheme

Thursday, October 3, 2024

Andy Wingo

preliminary notes on a nofl field-logging barrier

When you have a generational collector, you aim to trace only the part of the object graph that has been allocated recently. To do so, you need to keep a remembered set: a set of old-to-new edges, used as roots when performing a minor collection. A language run-time maintains this set by adding write barriers: little bits of collector code that run when a mutator writes to a field.

Whippet’s nofl space is a block-structured space that is appropriate for use as an old generation or as part of a sticky-mark-bit generational collector. It used to have a card-marking write barrier; see my article diving into V8’s new write barrier for more background.

Unfortunately, when running whiffle benchmarks, I was seeing no improvement for generational configurations relative to whole-heap collection. Generational collection was doing fine in my tiny microbenchmarks that are part of Whippet itself, but when translated to larger programs (that aren’t yet proper macrobenchmarks), it was a lose.

I had planned on doing some serious tracing and instrumentation to figure out what was happening, and thereby correct the problem. I still plan on doing this, but for this issue I used the old noggin technique instead: just, you know, thinking about the thing, eventually concluding that unconditional card-marking barriers are inappropriate for sticky-mark-bit collectors. As I mentioned in the earlier article:

An unconditional card-marking barrier applies to stores to slots in all objects, not just those in oldspace; a store to a new object will mark a card, but that card may contain old objects which would then be re-scanned. Or consider a store to an old object in a more dense part of oldspace; scanning the card may incur more work than needed. It could also be that Whippet is being too aggressive at re-using blocks for new allocations, where it should be limiting itself to blocks that are very sparsely populated with old objects.

That’s three problems. The second is well-known. But the first and last are specific to sticky-mark-bit collectors, where pages mix old and new objects.

a precise field-logging write barrier

Back in 2019, Steve Blackburn’s paper Design and Analysis of Field-Logging Write Barriers took a look at the state of the art in precise barriers that record not regions of memory that have been updated, but the precise edges (fields) that were written to. He ends up re-using this work later in the 2022 LXR paper (see §3.4), where the write barrier is used for deferred reference counting and a snapshot-at-the-beginning (SATB) barrier for concurrent marking. All in all field-logging seems like an interesting strategy. Relative to card-marking, work during the pause is much less: you have a precise buffer of all fields that were written to, and you just iterate that, instead of iterating objects. Field-logging does impose some mutator cost, but perhaps the payoff is worth it.

To log each old-to-new edge precisely once, you need a bit per field indicating whether the field is logged already. Blackburn’s 2019 write barrier paper used bits in the object header, if the object was small enough, and otherwise bits before the object start. This requires some cooperation between the collector, the compiler, and the run-time that I wasn’t ready to pay for. The 2022 LXR paper was a bit vague on this topic, saying just that it used “a side table”.

In Whippet’s nofl space, we have a side table already, used for a number of purposes:

  1. Mark bits.

  2. Iterability / interior pointers: is there an object at a given address? If so, it will have a recognizable bit pattern.

  3. End of object, to be able to sweep without inspecting the object itself.

  4. Pinning, allowing a mutator to prevent an object from being evacuated, for example because a hash code was computed from its address.

  5. A hack to allow fully-conservative tracing to identify ephemerons at trace-time; this re-uses the pinning bit, since in practice such configurations never evacuate.

  6. Bump-pointer allocation into holes: the mark byte table serves the purpose of Immix’s line mark byte table, but at finer granularity. Because of this though, it is swept lazily rather than eagerly.

  7. Generations. Young objects have a bit set that is cleared when they are promoted.

Well. Why not add another thing? The nofl space’s granule size is two words, so we can use two bits of the byte for field logging bits. If there is a write to a field, a barrier would first check that the object being written to is old, and then check the log bit for the field being written. The old check will be against a byte that is nearby, or possibly the same as, the one holding the field-logging bit. If the bit is unset, we call out to a slow path to actually record the field.

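To make those two checks concrete, here is a minimal sketch of the fast-path decision in Scheme; Whippet’s real barrier is C, and the helper’s name and shape are illustrative, but the 0xe “old generation” mask and the per-field logging bits come from the disassembly discussed below.

(define (needs-logging? object-mark-byte field-mark-byte field-log-bit)
  (and (not (zero? (logand object-mark-byte #x0e)))     ; object is in oldspace?
       (zero? (logand field-mark-byte field-log-bit)))) ; field not yet logged?

When needs-logging? is true, the mutator takes the slow path and records the field; otherwise the store proceeds with no further work.
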
preliminary results

I disassembled the fast path as compiled by GCC and got something like this on x86-64, in AT&T syntax, for the young-generation test:

mov    %rax,%rdx
and    $0xffffffffffc00000,%rdx
shr    $0x4,%rax
and    $0x3ffff,%eax
or     %rdx,%rax
testb  $0xe,(%rax)

The first five instructions compute the location of the mark byte, from the address of the object (which is known to be in the nofl space). If it has any of the bits in 0xe set, then it’s in the old generation.

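In Scheme, the address arithmetic those instructions perform looks roughly like this; the 4 MB slab size and 16-byte (two-word) granule size are inferred from the masks in the disassembly, and the procedure is illustrative rather than Whippet’s actual code.

(define slab-size (* 4 1024 1024))  ; slabs are aligned 4 MB regions

(define (mark-byte-address obj-addr)
  (let ((slab-base (logand obj-addr (- slab-size)))          ; and $0xff...c00000
        (granule-index (logand (ash obj-addr -4) #x3ffff)))  ; shr $4; and $0x3ffff
    (logior slab-base granule-index)))                       ; metadata lives at the slab start

With 16-byte granules, a 4 MB slab contains 0x40000 granules, which is why the granule index is masked to 18 bits.
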
Then to test a field logging bit it’s a similar set of instructions. In one of my tests the data type looks like this:

struct Node {
  uintptr_t tag;
  struct Node *left;
  struct Node *right;
  int i, j;
};

Writing the left field will be in the same granule as the object itself, so we can just test the byte we fetched for the logging bit directly with testb against $0x80. For right, we should be able to know it’s in the same slab (aligned 4 MB region) and just add to the previously computed byte address, but the C compiler doesn’t know that right now and so recomputes. This would work better in a JIT. Anyway I think these bit-swizzling operations are just lost in the flow of memory accesses.

For the general case where you don’t statically know the offset of the field in the object, you have to compute which bit in the byte to test:

mov    %r13,%rcx
mov    $0x40,%eax
shr    $0x3,%rcx
and    $0x1,%ecx
shl    %cl,%eax
test   %al,%dil

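In other words (again as an illustrative Scheme sketch of the same arithmetic), the bit to test depends on whether the field is the first or second word of its granule:

(define (field-log-bit field-addr)
  ;; First word of a granule -> #x40, second word -> #x80,
  ;; matching the shl %cl,%eax sequence above.
  (ash #x40 (logand (ash field-addr -3) 1)))
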
Is it good? Well, it improves things for my whiffle benchmarks, relative to the card-marking barrier, seeing a 1.05×-1.5× speedup across a range of benchmarks. I suspect the main advantage is in avoiding the “unconditional” part of card marking, where a write to a new object could cause old objects to be added to the remembered set. There are still quite a few whiffle configurations in which the whole-heap collector outperforms the sticky-mark-bit generational collector, though; I hope to understand this a bit more by building a more classic semi-space nursery, and comparing performance to that.

Implementation links: the barrier fast-path, the slow path, and the sequential store buffers. (At some point I need to make it so that allocating edge buffers in the field set causes the nofl space to page out a corresponding amount of memory, so as to be honest when comparing GC performance at a fixed heap size.)

Until next time, onwards and upwards!

by Andy Wingo at Thursday, October 3, 2024

Tuesday, October 1, 2024

Idiomdrottning

chunk-by

Takes a list and splits it, in order, into smaller lists such that each sublist contains one element satisfying pred. If nothing satisfies pred, that’s fine; it DTRTs silently, i.e. makes a one-element list containing the whole list.

(define (vowel? l) ((as-list (c member l)) "aeiou"))

(define (chunk-by pred lis) (list lis))

(define (chunk-by pred lis)
  (receive (hd tl)
      (split-at lis (add1 (require number? (list-index pred lis))))
    (cons hd (chunk-by pred (require (c any pred) tl)))))

(chunk-by vowel? (string->list "somejunk"))

⇒ ((#\s #\o) (#\m #\e) (#\j #\u #\n #\k))

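Per that description, a list with no element satisfying pred should come back as a single chunk containing everything; this is an illustrative expectation based on the prose above, not output from the original post:

(chunk-by vowel? (string->list "xyz"))

⇒ ((#\x #\y #\z))
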
by Idiomdrottning (sandra.snan@idiomdrottning.org) at Tuesday, October 1, 2024

Thursday, September 26, 2024

Andy Wingo

needed-bits optimizations in guile

Hey all, I had a fun bug this week and want to share it with you.

numbers and representations

First, though, some background. Guile’s numeric operations are defined over the complex numbers, not over e.g. a finite field of integers. This is generally great when writing an algorithm, because you don’t have to think about how the computer will actually represent the numbers you are working on.

In practice, Guile will represent a small exact integer as a fixnum, which is a machine word with a low-bit tag. If an integer doesn’t fit in a word (minus space for the tag), it is represented as a heap-allocated bignum. But sometimes the compiler can realize that e.g. the operands to a specific bitwise-and operation are within (say) the 64-bit range of unsigned integers, and so therefore we can use unboxed operations instead of the more generic functions that do run-time dispatch on the operand types, and which might perform heap allocation.

Unboxing is important for speed. It’s also tricky: under what circumstances can we do it? In the example above, there is information that flows from defs to uses: the operands of logand are known to be exact integers in a certain range and the operation itself is closed over its domain, so we can unbox.

But there is another case in which we can unbox, in which information flows backwards, from uses to defs: if we see (logand n #xff), we know:

  • the result will be in [0, 255]

  • that n will be an exact integer (or an exception will be thrown)

  • we are only interested in a subset of n’s bits.

Together, these observations let us transform the more general logand to an unboxed operation, having first truncated n to a u64. And actually, the information can flow from use to def: if we know that n will be an exact integer but don’t know its range, we can transform the potentially heap-allocating computation that produces n to instead truncate its result to the u64 range where it is defined, instead of just truncating at the use; and potentially this information could travel farther up the dominator tree, to inputs of the operation that defines n, their inputs, and so on.

needed-bits: the |0 of scheme

Let’s say we have a numerical operation that produces an exact integer, but we don’t know the range. We could truncate the result to a u64 and use unboxed operations, if and only if only u64 bits are used. So we need to compute, for each variable in a program, what bits are needed from it.

I think this is generally known as a needed-bits analysis, though both Google and my textbooks are failing me at the moment; perhaps this is because dynamic languages and flow analysis don’t get so much attention these days. Anyway, the analysis can be local (within a basic block), global (all blocks in a function), or interprocedural (larger than a function). Guile’s is global. Each CPS/SSA variable in the function starts as needing 0 bits. We then compute the fixpoint of visiting each term in the function; if a term causes a variable to flow out of the function, for example via return or call, the variable is recorded as needing all bits, as is also the case if the variable is an operand to some primcall that doesn’t have a specific needed-bits analyser.

Currently, only logand has a needed-bits analyser, and this is because sometimes you want to do modular arithmetic, for example in a hash function. Consider Bob Jenkins’ lookup3 string hash function:

#define rot(x,k) (((x)<<(k)) | ((x)>>(32-(k))))
#define mix(a,b,c) \
{ \
  a -= c;  a ^= rot(c, 4);  c += b; \
  b -= a;  b ^= rot(a, 6);  a += c; \
  c -= b;  c ^= rot(b, 8);  b += a; \
  a -= c;  a ^= rot(c,16);  c += b; \
  b -= a;  b ^= rot(a,19);  a += c; \
  c -= b;  c ^= rot(b, 4);  b += a; \
}
...

If we transcribe this to Scheme, we get something like:

(define (jenkins-lookup3-hashword2 str)
  (define (u32 x) (logand x #xffffFFFF))
  (define (shl x n) (u32 (ash x n)))
  (define (shr x n) (ash x (- n)))
  (define (rot x n) (logior (shl x n) (shr x (- 32 n))))
  (define (add x y) (u32 (+ x y)))
  (define (sub x y) (u32 (- x y)))
  (define (xor x y) (logxor x y))

  (define (mix a b c)
    (let* ((a (sub a c)) (a (xor a (rot c 4)))  (c (add c b))
           (b (sub b a)) (b (xor b (rot a 6)))  (a (add a c))
           (c (sub c b)) (c (xor c (rot b 8)))  (b (add b a))
           ...)
      ...))

  ...

These u32 calls are like the JavaScript |0 idiom, to tell the compiler that we really just want the low 32 bits of the number, as an integer. Guile’s compiler will propagate that information down to uses of the defined values but also back up the dominator tree, resulting in unboxed arithmetic for all of these operations.

(When writing this, I got all the way here and then realized I had already written quite a bit about this, almost a decade ago. Oh well, consider this your lucky day, you get two scoops of prose!)

the bug

All that was just prelude. So I said that needed-bits is a fixed-point flow analysis problem. In this case, I want to compute, for each variable, what bits are needed for its definition. Because of loops, we need to keep iterating until we have found the fixed point. We use a worklist to represent the conts we need to visit.

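As a generic sketch (not Guile’s actual analysis code), the driver of such an analysis is a worklist loop that keeps visiting items until nothing new is learned:

(use-modules (srfi srfi-11))  ; let-values

;; `visit` takes an item and the current analysis and returns the updated
;; analysis plus a list of items to put back on the worklist.
(define (fixed-point worklist analysis visit)
  (let loop ((worklist worklist) (analysis analysis))
    (if (null? worklist)
        analysis
        (let-values (((analysis* more) (visit (car worklist) analysis)))
          (loop (append (cdr worklist) more) analysis*)))))

The interesting question, and where the bug lived, is which items get put back on the worklist when a visit changes the analysis.
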
Visiting a cont may cause the program to require more bits from the variables that cont uses. Consider:

(define-significant-bits-handler
    ((logand/immediate label types out res) param a)
  (let ((sigbits (sigbits-intersect
                   (inferred-sigbits types label a)
                   param
                   (sigbits-ref out res))))
    (intmap-add out a sigbits sigbits-union)))

This is the sigbits (needed-bits) handler for logand when one of its operands (param) is a constant and the other (a) is variable. It adds an entry for a to the analysis out, which is an intmap from variable to a bitmask of needed bits, or #f for all bits. If a already has some computed sigbits, we add to that set via sigbits-union. The interesting point comes in the sigbits-intersect call: the bits that we will need from a are first the bits that we infer a to have, by forward type-and-range analysis; intersected with the bits from the immediate param; intersected with the needed bits from the result value res.

If the intmap-add call is idempotent—i.e., out already contains sigbits for a—then out is returned as-is. So we can check for a fixed-point by comparing out with the resulting analysis, via eq?. If they are not equal, we need to add the cont that defines a to the worklist.

The bug? The bug was that we were not enqueuing the def of a, but rather the predecessors of label. This works when there are no cycles, provided we visit the worklist in post-order; and regardless, it works for many other analyses in Guile where we compute, for each labelled cont (basic block), some set of facts about all other labels or about all other variables. In that case, enqueuing a predecessor on the worklist will cause all nodes up to and including the variable’s definition to be visited, because each step adds more information (relative to the analysis computed on the previous visit). But it doesn’t work for this case, because we aren’t computing a per-label analysis.

The solution was to rewrite that particular fixed-point to enqueue labels that define a variable (possibly multiple defs, because of joins and loop back-edges), instead of just the predecessors of the use.

Et voilà! If you got this far, bravo. Type at y’all again soon!

by Andy Wingo at Thursday, September 26, 2024

spritely.institute

Spritely went to DWeb Camp: 2024 Recap

Decorative painting of goblin trying to set up a tent but tripping and flopping it down on another

Last month, Spritely's Executive Director (Christine) and CTO (me) attended DWeb Camp. DWeb Camp is an annual tech conference unlike any I’ve experienced before. It's organized by the Internet Archive, and is focused on creating a truly decentralized web.

Sign at campsite entrance, adapted from photo by John Bruhling

In lieu of a traditional conference space, DWeb Camp takes place at a campground in the redwood forests of northern California. On top of being outdoors, the event has several layers of COVID mitigations in place to help protect public safety. It's a unique setting for discussing all the problems facing the decentralized web and the solutions currently being developed. Getting there from the east coast of the US was a long journey, but DWeb Camp was worth it!

DWeb Camp main building, adapted from photo by John Bruhling

The plan 🗺️

Some of Spritely’s many goals at DWeb Camp this year were to:

  • Share the current state of our technology through presentations

  • Generate excitement for our work with cool demos

  • Talk to Spritely supporters/collaborators

  • Make new connections with potential future collaborators and funders

Presentations and panels

DWeb Camp stage, adapted from photo by John Bruhling

To that end, we gave two presentations:

  • Why Spritely is all-in on WebAssembly: A lightning talk about our Hoot project and why we’re investing in WebAssembly. This talk was recorded but, as of this writing, the video has not yet been posted publicly.

  • Keeping decentralized networking fun with Spritely: A longer talk about why Spritely uses games as a way to generate excitement for decentralized web technology and make a very technical subject more fun and approachable. This talk was not recorded, unfortunately.

Additionally, Christine participated in a couple panels:

  • Film Showing: Secrets in Your Data: An outdoor screening of PBS NOVA’s Secrets in Your Data documentary (which features Christine for her work on ActivityPub) followed by a Q&A session.

  • Successful Cultures for Long-Term Decentralized Communities: An open discussion on what technical, structural, and cultural considerations engender sustainable, decentralized communities.

As if that weren’t enough, Christine also ran an in-person instance of the Hack & Craft workshop, a part of her FOSS and Crafts podcast project.

The Night Market 🌙

DWeb Camp enameled mugs, adapted from photo by Brad Shirakawa

The best event of all for Spritely was the Demo Night Market, which gave us a chance to run in-person, live demos of our technologies. Tables were placed in a circle around a cluster of redwood trees and the space was divided up amongst the various projects on display.

We shared a table with object capability comrades from the Endo JS project. Our theme was the Spritely Arcade. We featured games like Cirkoban and Strigoform that show off our Goblins and Hoot projects.

We didn’t have room in our checked luggage for arcade cabinets (but how cool would it be to roll up to camp with a Taito Egret II?) so we made do with simply running the games on some spare laptops. What we did have room for, though, were a bunch of lovely, colorful cardboard displays made by tessa (Developer Relations) to attract passersby. We even had a DIY gachapon machine as a fun way to give out stickers and trinkets.

Our combination of displays and cute pixel graphics worked well… almost too well. We were busy entertaining from the moment the night market opened until a half hour after it officially closed! While I’ve never been to a true night market, we were told our table embodied the vibe and spirit of one. What a wonderful compliment to top off a fantastic night!

Speaking of lovely, colorful things… it’s safe to say that our merch game was on point this year, thanks to tessa. We had many types of stickers, postcards, coasters, stamps, and even coloring books (many campers bring their kids, but these were a hit with adults, too!). Check out our pre-DWeb Camp post to see lovely pictures of all the wonderful merch.

All work and some play 🥽

During rare moments of downtime, we had fun doing things like:

  • Singing karaoke

  • Participating in the talent show

  • Swimming in the nearby river

  • Stargazing

Audiovisual floral art installation, adapted from photo by Brad Shirakawa

Final thoughts

In summary, DWeb Camp was an overwhelmingly good time! We left camp exhausted but with high spirits and optimism that we’re building the right things for the future of the decentralized web. It’s clear that many people like what we’re doing and for that we are very grateful. If these goals and activities at a tech conference sound up your alley, DWeb Camp might be for you, too!

Upshot of trees, adapted from photo by John Bruhling

As always, if you’d like to discuss anything Spritely-related then please join our community forum. Until next time!


Unedited photo sources: 1 2 3 4 5 6

Original photos by John Bruhling and Brad Shirakawa

by Dave Thompson (contact@spritely.institute) at Thursday, September 26, 2024

Tuesday, September 24, 2024

Scheme Requests for Implementation

SRFI 250: Insertion-ordered hash tables

SRFI 250 is now in withdrawn status.

This SRFI defines an interface to hash tables, which are widely recognized as a fundamental data structure for a wide variety of applications. A hash table is a data structure that:

  • Is disjoint from all other types.
  • Provides a mapping from objects known as keys to corresponding objects known as values.
    • Keys may be any Scheme objects in some kinds of hash tables, but are restricted in other kinds.
    • Values may be any Scheme objects.
  • Provides an equality predicate which defines when a proposed key is the same as an existing key. No table may contain more than one value for a given key.
  • Provides a hash function which maps a candidate key into a non-negative exact integer.
  • Supports mutation as the primary means of setting the contents of a table.
  • Provides key lookup and destructive update in (expected) amortized constant time, provided that a satisfactory hash function is available.
  • Does not guarantee that whole-table operations work in the presence of concurrent mutation of the whole hash table. (Values may be safely mutated.)

Unlike the hash tables of SRFI 125, which is the direct ancestor of this specification, the hash tables described here are ordered by insertion: that is, associations inserted earlier in the history of the hash table appear earlier in the ordering. Advances in the implementations of hash tables, as provided by C++, Python, JavaScript, etc., make the provision of this new facility practical. As a result, the hash tables of this SRFI do not interoperate with the hash tables of SRFI 125, SRFI 126, or existing R6RS implementations.

by John Cowan at Tuesday, September 24, 2024

SRFI 234: Topological Sorting

SRFI 234 is now in final status.

Topological sorting is an algorithm that takes a graph consisting of nodes and other nodes that depend on them, forming a partial order, and returns a list representing a total ordering of the graph. If the graph is cyclic, the topological sort will fail. The procedure topological-sort returns three values. If sorting succeeds, the first value contains the result and the second and third are #false. If sorting fails, the result is #false and the second and third values may provide additional information about the error.

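For illustration, a call might look like the following; the adjacency-list graph representation (each sublist is a node followed by the nodes it links to) is an assumption on my part, but the three-value return protocol is as described above.

(import (srfi 234))

(call-with-values
    (lambda () (topological-sort '((a b) (b c) (c))))
  (lambda (result second third)
    (if result
        result                        ; a total ordering such as (a b c)
        (list 'cyclic second third)))) ; on failure, extra error information
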
by John Cowan and Arne Babenhauserheide at Tuesday, September 24, 2024

Thursday, September 19, 2024

spritely.institute

Spritely Goblins v0.14.0: libp2p and Improved Persistence

Goblins version 0.14.0

The goblins have been working away all summer to make a new release, and we're excited to announce that version 0.14.0 of Guile Goblins is here! In it, we introduce not only a new libp2p netlayer but also a veritable smörgåsbord of new persistence features.

(Note: There are no major protocol updates to OCapN functionality in this release, and there is not an accompanying release of Racket Goblins this time.)

libp2p

libp2p is a networking stack designed for secure peer-to-peer connections, making it a great fit for Goblins. This netlayer uses protocols like TCP and QUIC which enable libp2p to be fast.

In addition, typical netlayer implementations, such as our TCP-TLS netlayer, have issues with other peers reaching them when used behind firewalls; libp2p can sometimes mitigate these issues by automatically trying to use NAT hole-punching to open the ports needed for peer-to-peer communication.

Note that, because libp2p doesn't have a Scheme implementation, the Goblins libp2p netlayer uses our own libp2p daemon written in Go. This needs to be run alongside Goblins — similarly to how Goblins uses the Tor daemon whenever the onion netlayer is in use. To learn more about the libp2p daemon, check out its README.

Persistence improvements (Aurie)

Last release, we introduced Aurie, our persistence system, which enables writing actors that can persist across and rehydrate from different stores automatically. In this release, we upgraded Aurie to persist all netlayers and work across different local vats!

Intra-vat persistence

When building Goblins applications, developers often code across several vats with objects that reference other objects which live on different vats on their local systems (whew! Say that ten times fast). These references are known as "far references" (or "far refs" for short), and up until this release, trying to persist an object which held far references would cause an error to be thrown!

But now there's a new actor in Goblinstown: ^persistence-registry! ^persistence-registry coordinates restoration across local vats. And while this actor itself doesn't persist any information (in other words, it isn't persistence aware), vats can use it to register their own existences at startup time. ^persistence-registry also enables vats to request far references on other vats that they need to restore.

Let's see it in use!

(use-modules (goblins)
             (goblins vat)
             (goblins actor-lib cell)
             (goblins persistence-store syrup))

;; First let's spawn the ^persistence-registry actor!
;; It's not storing anything, so no need for a persistence vat
(define persistence-vat (spawn-vat #:name "Persistence"))
(define persistence-registry
  (with-vat persistence-vat
    (spawn ^persistence-registry))) ;; provided by (goblins vat)

;; Next let's spawn both our vats using the registry above.
(define-values (a-vat a-cell)
  (spawn-persistent-vat
   cell-env
   (lambda () (spawn ^cell))
   (make-syrup-store "a-vat.syrup")
   #:persistence-registry persistence-registry))

(define-values (b-vat b-cell)
  (spawn-persistent-vat
   cell-env
   ;; Keep a reference to a-cell on a-vat.
   (lambda () (spawn ^cell a-cell))
   (make-syrup-store "b-vat.syrup")
   #:persistence-registry persistence-registry))

Note that the above vats don't need to be spawned in a particular order; any far reference is restored as a promise that resolves when the object on the other vat becomes available. In other words, all far references start out as promises.

Netlayer persistence

Previously, if you wanted to maintain multiple netlayers within a persistent location, you'd have had to save a bunch of data, read it back in somehow, and provide it to all the netlayers.

As of today's release, you can now leverage Goblins' persistence system, just like you do for all the other objects, to persist netlayers. This is especially useful when considering the intra-vat persistence feature detailed above.

Selfish actors

It can sometimes be useful for actors to reference themselves, such as when building a user profile (which might want to create a sturdyref to itself). In the past, there were several ways to accomplish this, usually with the help of (goblins actor-lib selfish-spawn). The selfish-spawn library gives you this fancy selfish-spawn procedure, which everyone who wants to spawn your actor needs to use, instead of the usual spawn that you normally use when spawning actors. It's weird to need to use a special spawn procedure to spawn an actor because, really, the selfish nature of some actors is an internal concern of the actor, not the spawner. Actors shouldn't be imposing strange spawning APIs on everyone who wants to use them. On top of all that, you couldn't use selfish spawn and the persistence system together; you had to rely on other tricks to get a reference to yourself.

But with the release of 0.14.0, define-actor now provides a new #:self keyword that fixes both of these issues! You simply specify the variable you want to bind the "self" reference to, and within the actor you can now use it.

Take the following actor whose behavior takes an actor reference in and returns differing commentary depending on whether the reference points to itself or someone else:


(define-actor (^knows-self bcom)
  #:self self
  (lambda (someone)
    (if (eq? someone self)
        "that's me!"
        "I dunno who that is")))

We can now use the #:self keyword on define-actor. We're able to reference ourself with the identifier we choose (self). This lets our ^knows-self actor use the usual eq? identity predicate to see if the reference passed in really is itself or not!

Migrations macro

The persistence system can be used to not just store actors, but also upgrade them. Longtime readers will recognize that this is nothing new; developers could always optionally specify a version for portrait data (if no version is specified, Goblins will tag the data as version 0) and then pass said data into the restore function. In this way, developers were able to upgrade data prior to spawning the actor, but had to manually write out the code every time.

As of this release, Goblins now has a handy new migrations macro that can successfully apply migrations until it reaches the desired version. Let's look at the idea of a thermostat which initially stored temperatures in Fahrenheit but now has switched to Celsius.

(define-actor (^thermostat bcom temperature)
  #:version 2 ;; We're on the second version
  ;; Specify a restore function which we can use to migrate from previous versions
  #:restore
  (lambda (version restored-temp)
    ;; Convert the temperature if needed...
    (define temperature
      (match version
        ;; F -> C conversion
        [0 (* 5/9 (- restored-temp 32))]
        ;; C, no conversion needed.
        [1 restored-temp]))
    ;; Finally spawn the thermostat
    (spawn ^thermostat temperature))

  (methods
   [(get-current-temp) temperature]
   [(set-temp new-temp) ...]))

This technically works, but what if you wanted to see the numbers in Kelvin? Previously, you'd have had to change the initial version 0 to go from Fahrenheit to Kelvin, add a migration from Celsius to Kelvin, and handle the case where you're being handed Kelvin. Let's see how this new migration macro helps.

(define-actor (^thermostat bcom temperature)
   #:version 2
   ;; Instead of #:restore, we can use #:upgrade which isn't
   ;; responsible for spawning, but is responsible for upgrading
   ;; the portrait data. This works with the new migrations macro.
   #:upgrade
   (migrations
    ;; Version 0 (F) to version 1 (C)
    [(1 f-temp) (* 5/9 (- f-temp 32))]
    ;; Version 1 (C) to version 2 (K)
    [(2 c-temp) (+ c-temp 273.15)])

   ;; Same behavior as before
   (methods
    [(get-current-temp) temperature]
    [(set-temp new-temp) ...]))

Notice the similarity between migration syntax and the methods macro:

[(target-version data ...) body ...]

That means in our example when a Fahrenheit temperature (at version 0) is passed in, it'll run the initial F -> C migration, migrating it to Celsius, then it'll run the next to get it into Kelvin. This makes the job of writing migrations far easier as you only have to focus on migrating from one version to the next, not updating how all the migrations work.

Vat root upgrades

As we mentioned previously, Aurie, Goblins' persistence system, provides an upgrade path for actor portrait data.

But did you know it can also be useful to upgrade the roots of a graph when spawning a persistence vat? Similarly to how this works with actors, specify #:version with a version number for the roots and #:upgrade with an upgrade procedure when calling spawn-persistent-vat.

This even works with the new migrations macro!

Let's see it in action. Imagine we've got a vat that just spawns a single object as its cell:

(use-modules (goblins)
             (goblins vat)
             (goblins actor-lib cell)
             (goblins persistence-store syrup))

;; Version 0
(define-values (vat cell)
  (spawn-persistent-vat
   cell-env
   (lambda ()
     (spawn ^cell))
   (make-syrup-store "my-vat.syrup")))

We didn't specify any version above, but just like actors, Goblins is tagging this as version 0 so that if we want to upgrade in the future, we have a version number to work from. Now imagine we want to change this vat so that we return two objects as the roots: we want to take the cell above and wrap it with swappable, which returns a proxy to the cell and a swapper actor.

Let's see what that looks like:

(use-modules (goblins)
             (goblins vat)
             (goblins actor-lib cell)
             (goblins actor-lib swappable)
             (goblins persistence-store syrup))

;; Version 1
(define-values (vat cell swapper)
  (spawn-persistent-vat
   (persistence-env-compose cell-env swappable-env)
   (lambda ()
     ;; Change the initial spawn procedure so it matches what we expect.
     (swappable (spawn ^cell)))
   (make-syrup-store "my-vat.syrup")
   ;; Specify the version now we've changed the roots.
   #:version 1
   #:upgrade
   (migrations
    ;; Migration from 0 -> 1
    [(1 cell)
     (define-values (proxy swapper)
       (swappable cell))
     (list proxy swapper)])))

Just like we saw with the ^thermostat, we can keep adding migrations as required and Goblins will apply them as needed.

Getting the release

This release also includes a number of other bugfixes and minor features. See the NEWS file for more information!

As usual, if you're using Guix, you can upgrade to 0.14.0 by using:

guix pull
guix install guile-goblins

Otherwise, you can find the tarball on our release page.

If you're making something with Goblins or want to contribute to Goblins itself, be sure to join our community at community.spritely.institute! Thanks for reading, and hope to see you there!

by Jessica Tallon (contact@spritely.institute) at Thursday, September 19, 2024

Wednesday, September 18, 2024

Andy Wingo

whippet progress update: feature-complete!

Greetings, gentle readers. Today, an update on recent progress in the Whippet embeddable garbage collection library.

feature-completeness

When I started working on Whippet, two and a half years ago already, I was aiming to make a new garbage collector for Guile. In the beginning I was just focussing on proving that it would be advantageous to switch, and also learning how to write a GC. I put off features like ephemerons and heap resizing until I was satisfied with the basics.

Well now is the time: with recent development, Whippet is finally feature-complete! Huzzah! Now it’s time to actually work on getting it into Guile. (I have some performance noodling to do before then, adding tracing support, and of course I have lots of ideas for different collectors to build, but I don’t have any missing features at the current time.)

heap resizing

When you benchmark a garbage collector (or a program with garbage collection), you generally want to fix the heap size. GC overhead goes down with more memory, generally speaking, so you don’t want to compare one collector at one size to another collector at another size.

(Unfortunately, many (most?) benchmarks of dynamic language run-times and the programs that run on them fall into this trap. Imagine you are benchmarking a program before and after a change. Automatic heap sizing is on. Before your change the heap size is 200 MB, but after it is 180 MB. The benchmark shows that the change regresses performance. But did it really? It could be that at the same heap size, the change improved performance. You won’t know unless you fix the heap size.)

Anyway, Whippet now has an implementation of MemBalancer. After every GC, we measure the live size of the heap, and compute a new GC speed, as a constant factor to live heap size. Every second, in a background thread, we observe the global allocation rate. The heap size is then the live data size plus the square root of the live data size times a factor. The factor has two components. One is constant, the expansiveness of the heap: the higher it is, the more room the program has. The other depends on the program, and is computed as the square root of the ratio of allocation speed to collection speed.

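Reading that description literally, the sizing rule is roughly the following; this is a sketch of the formula as described, with invented parameter names, not Whippet’s actual code.

(define (target-heap-size live expansiveness allocation-rate collection-rate)
  (+ live
     (* (sqrt live)
        (+ expansiveness
           (sqrt (/ allocation-rate collection-rate))))))
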
With MemBalancer, the heap ends up getting resized at every GC, and via the heartbeat thread. During the GC it’s easy because it’s within the pause; no mutators are running. From the heartbeat thread, mutators are active: taking the heap lock prevents concurrent resizes, but mutators are still consuming empty blocks and producing full blocks. This works out fine in the same way that concurrent mutators are fine: shrinking takes blocks from the empty list one by one, atomically, and returns them to the OS. Expanding might reclaim paged-out blocks, or allocate new slabs of blocks.

However, even with some exponentially weighted averaging on the speed observations, I have a hard time understanding whether the algorithm is overall a good thing. I like the heartbeat thread, as it can reduce memory use of idle processes. The general square-root idea sounds nice enough. But adjusting the heap size at every GC feels like giving control of your stereo’s volume knob to a hyperactive squirrel.

GC collection time vs memory usage graph comparing V8 with MemBalancer. MemBalancer is shown to be better.
Figure 5 from the MemBalancer paper

Furthermore, the graphs in the MemBalancer paper are not clear to me: the paper claims more optimal memory use even in a single-heap configuration, but the x axis of the graphs is “average heap size”, which I understand to mean that maximum heap size could be higher than V8’s maximum heap size, taking into account more frequent heap size adjustments. Also, some measurement of total time would have been welcome, in addition to the “garbage collection time” on the paper’s y axis; there are cases where less pause time doesn’t necessarily correlate to better total times.

deferred page-out

Motivated by MemBalancer’s jittery squirrel, I implemented a little queue for use in paging blocks in and out, for the mmc and pcc collectors: blocks are quarantined for a second or two before being returned to the OS via madvise(MADV_DONTNEED). That way if you release a page and then need to reacquire it again, you can do so without bothering the kernel or other threads. Does it matter? It seems to improve things marginally and conventional wisdom says to not mess with the page table too much, but who knows.

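As a toy sketch of the quarantine idea (the names and the page-out! hook are illustrative, not Whippet’s API): keep each block with the time it was freed, and only return to the OS the blocks whose quarantine period has expired.

(define quarantine-seconds 2)

;; queue is a list of (block . free-time) pairs, oldest first. Pages out
;; expired blocks and returns the still-quarantined remainder of the queue.
(define (flush-quarantine queue now page-out!)
  (let loop ((queue queue))
    (cond ((null? queue) '())
          ((>= (- now (cdar queue)) quarantine-seconds)
           (page-out! (caar queue))
           (loop (cdr queue)))
          (else queue))))
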
mmc rename

Relatedly, Whippet used to be three things: the project itself, consisting of an API and a collection of collectors; one specific collector; and one specific space within that collector. Last time I mentioned that I renamed the whippet space to the nofl space. Now I finally got around to renaming what was the whippet collector as well: it is now the mostly-marking collector, or mmc. Be it known!

Also as a service note, I removed the “serial copying collector” (scc). It had the same performance as the parallel copying collector with parallelism=1, and pcc can be compiled with GC_PARALLEL=0 to explicitly choose the simpler serial grey-object worklist.

per-object pinning

The nofl space has always supported pinned objects, but it was never exposed in the API. Now it is!

Of course, collectors with always-copying spaces won’t be able to pin objects. If we want to support use of these collectors with embedders that require pinning, perhaps because of conservative root scanning, we’d need to switch to some kind of mostly-copying algorithm.

safepoints without allocation

Another missing feature was a safepoint API. It hasn’t been needed up to now because my benchmarks all allocate, but for programs that have long (tens of microseconds maybe) active periods without allocation, you want to be able to stop them without waiting too long. Well we have that exposed in the API now.

removed ragged stop

Following on my article on ragged stops, I removed ragged-stop marking from mmc, for a nice net 180 line reduction in some gnarly code. Speed seems to be similar.

next up: tracing

And with that, I’m relieved to move on to the next phase of Whippet development. Thanks again to NLnet for their support of this work. Next up, adding fine-grained tracing, so that I can noodle a bit on performance. Happy allocating!

by Andy Wingo at Wednesday, September 18, 2024

Friday, September 13, 2024

Scheme Requests for Implementation

SRFI 249: Restarting conditions

SRFI 249 is now in withdrawn status.

When an exceptional situation is encountered by a program, it may create a condition object describing the situation and then signal the condition and pass control to a condition handler. The signaler and handler are two different parts of a system, between which there is a barrier of abstraction. In order to recover gracefully and flexibly from exceptional situations, however, the signaler can provide multiple ways by which the handler can restart the computation, some of which may require extra input. Often, the decision of which method of recovery to choose is left up to a human user, who may be prompted for the input needed to recover. This SRFI proposes a simple mechanism called restarters to encapsulate the information necessary to restart a computation with associated interactive prompters.

by John Cowan at Friday, September 13, 2024

Thursday, September 12, 2024

spritely.institute

Come work with us! Hiring a Technical Administrator

A picture of a goblin at a desk saying "Work With Us!"

Like Spritely? Like free and open source software? Want to work at a mission-based nonprofit? And do you bridge the gap between technical and organizational/administrative skills? Then consider applying to our Technical Administrator position and become part of the Spritely Institute team!

by The Spritely Institute (contact@spritely.institute) at Thursday, September 12, 2024

Wednesday, September 11, 2024

Scheme Requests for Implementation

SRFI 246: Guardians

SRFI 246 is now in withdrawn status.

Guardians allow programs to protect objects from deallocation by the garbage collector and to determine which objects would otherwise have been deallocated. When the object has associated non-memory resources, a program can register it with a guardian. The GC will mark inaccessible objects but will not collect them; at the program's convenience, inaccessible objects are removed from the guardian and their non-memory resources are disposed of. Guardians allow objects to be saved from deallocation indefinitely so that they can be reused or so that clean-up or other actions can be performed using the data stored within the objects. Guardians avoid the problems associated with classical finalizers detailed in the Rationale section.

by John Cowan at Wednesday, September 11, 2024

Saturday, September 7, 2024

Andy Wingo

conservative gc can be faster than precise gc

Should your garbage collector be precise or conservative? The prevailing wisdom is that precise is always better. Conservative GC can retain more objects than strictly necessary, making GC slow: GC has to run more frequently, and it has to trace a larger heap on each collection. However, the calculus is not as straightforward as most people think, and indeed there are some reasons to expect that conservative root-finding can result in faster systems.

(I have made / relayed some of these arguments before but I feel like a dedicated article can make a contribution here.)

problem precision

Let us assume that by conservative GC we mean conservative root-finding, in which the collector assumes that any integer on the stack that happens to be a heap address indicates a reference to the object containing that address. The address doesn’t have to be at the start of the object. Assume that objects on the heap are traced precisely; contrast to BDW-GC which generally traces both the stack and the heap conservatively. Assume a collector that will pin referents of conservative roots, but in which objects not referred to by a conservative root can be moved, as in Conservative Immix or Whippet’s stack-conservative-mmc collector.

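As a toy model of that definition (illustrative only), conservative root-finding over a snapshot of stack words amounts to a range test:

(define (conservative-roots stack-words heap-start heap-end)
  ;; Any word that happens to fall inside the heap is treated as a root,
  ;; whether or not it was really intended as a pointer.
  (filter (lambda (word) (and (>= word heap-start) (< word heap-end)))
          stack-words))
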
With that out of the way, let’s look at some reasons why conservative GC might be faster than precise GC.

smaller lifetimes

A compiler that does precise root-finding will typically output a side-table indicating which slots in a stack frame hold references to heap objects. These lifetimes aren’t always precise, in the sense that although they precisely enumerate heap references, those heap references might actually not be used in the continuation of the stack frame. When GC occurs, it might mark more objects as live than are actually live, which is the imputed disadvantage of conservative collectors.

This is most obviously the case when you need to explicitly register roots with some kind of handle API: the handle will typically be kept live until the scope ends, but that might be an overapproximation of lifetime. A compiler that can assume conservative stack scanning may well exhibit more precision than it would if it needed to emit stack maps.

no run-time overhead

For generated code, stack maps are great. But if a compiler needs to call out to C++ or something, it needs to precisely track roots in a run-time data structure. This is overhead, and conservative collectors avoid it.

smaller stack frames

A compiler may partition spill space on a stack into a part that contains pointers to the heap and a part containing numbers or other unboxed data. This may lead to larger stack sizes than if you could just re-use a slot for two purposes, if the lifetimes don’t overlap. A similar concern applies for compilers that partition registers.

no stack maps

The need to emit stack maps is annoying for a compiler and makes binaries bigger. Of course it’s necessary for precise roots. But then there is additional overhead when tracing the stack: for each frame on the stack, you need to look up the stack map for the return continuation, which takes time. It may be faster to just test if words on the stack might be pointers to the heap.

unconstrained compiler

Having to make stack maps is a constraint imposed on the compiler. Maybe if you don’t need them, the compiler could do a better job, or you could use a different compiler entirely. A conservative compiler can sometimes have better codegen, for example by the use of interior pointers.

anecdotal evidence

The Conservative Immix paper shows that conservative stack scanning can beat precise scanning in some cases. I have reproduced these results with parallel-stack-conservative-mmc compared to parallel-mmc. It’s small—maybe a percent—but it was a surprising result to me and I thought others might want to know.

Also, Apple’s JavaScriptCore uses conservative stack scanning, and V8 is looking at switching to it. Funny, right?

conclusion

When it comes to designing a system with GC, don’t count out conservative stack scanning; the tradeoffs don’t obviously go one way or the other, and conservative scanning might be the right engineering choice for your system.

by Andy Wingo at Saturday, September 7, 2024

Friday, September 6, 2024

Andy Wingo

on taking advantage of ragged stops

Many years ago I read one of those Cliff Click “here’s what I learned” articles in which he was giving advice about garbage collector design, and one of the recommendations was that at a GC pause, running mutator threads should cooperate with the collector by identifying roots from their own stacks. You can read a similar assertion in their VEE2005 paper, The Pauseless GC Algorithm, though this wasn’t the source of the information.

One motivation for the idea was locality: a thread’s stack is already local to a thread. Then specifically in the context of a pauseless collector, you need to avoid races between the collector and the mutator for a thread’s stack, and having the thread visit its own stack neatly handles this problem.

However, I am not so interested any more in (so-called) pauseless collectors; though I have not measured myself, I am convinced enough by the arguments in the Distilling the real costs of production garbage collectors paper, which finds that state of the art pause-minimizing collectors actually increase both average and p99 latency, relative to a well-engineered collector with a pause phase. So, the racing argument is not so relevant to me, because a pause is happening anyway.

There was one more argument that I thought was interesting, which was that having threads visit their own stacks is a kind of cheap parallelism: the mutator threads are already there, they might as well do some work; it could be that it saves time, if other threads haven’t seen the safepoint yet. Mutators exhibit a ragged stop, in the sense that there is no clean cutoff time at which all mutators stop simultaneously, only a time after which no more mutators are running.

Visiting roots during a ragged stop introduces concurrency between the mutator and the collector, which is not exactly free; notably, it prevents objects marked while mutators are running from being evacuated. Still, it could be worth it in some cases.

Or so I thought! Let’s try to look at the problem analytically. Consider that you have a system with N processors, a stop-the-world GC with N tracing threads, and M mutator threads. Let’s assume that we want to minimize GC latency, as defined by the time between GC is triggered and the time that mutators resume. There will be one triggering thread that causes GC to begin, and then M–1 remote threads that need to reach a safepoint before the GC pause can begin.

The total amount of work that needs to be done during GC can be broken down into roots_i, the time needed to visit roots for mutator i, and then graph, the time to trace the transitive closure of live objects. We want to know whether it’s better to perform roots_i during the ragged stop or in the GC pause itself.

Let’s first look to the case where M is 1 (just one mutator thread). If we visit roots before the pause, we have

latency(ragged, M=1) = roots_0 + graph/N

Which is to say, thread 0 triggers GC, visits its own roots, then enters the pause in which the whole graph is traced by all workers with maximum parallelism. It may be that graph tracing doesn’t fully parallelize, for example if the graph has a long singly-linked list, but the parallelism will be maximal within the pause as there are N workers tracing the graph.

If instead we visit roots within the pause, we have:

latency(pause, M=1) = (roots_0 + graph)/N

This is strictly better than the ragged-visit latency.

If we have two threads, then we will need to add in some delay, corresponding to the time it takes for remote threads to reach a safepoint. Let’s assume that there is a maximum period (in terms of instructions) at which a mutator will check for safepoints. In that case the worst-case delay will be a constant, and we add it on to the latency. Let us assume also there are more than two threads available. The marking-roots-during-the-pause case is easiest to analyze:

latency(pause, M=2) = delay + (roots_0 + roots_1 + graph)/N

In this case, a ragged visit could win: while the triggering thread is waiting for the remote thread to stop, it could perform roots_0, moving the work out of the pause, reducing pause time, and reducing latency, for free.

latency(ragged, M=2) = delay + roots_1 + graph/N

However, we only have this win if the root-visiting time is smaller than the safepoint delay; otherwise we are just prolonging the pause. Thing is, you don’t know in general. If indeed the root-visiting time is short, relative to the delay, we can assume the roots elements of our equation are 0, and so the choice to mark during ragged stop doesn’t matter at all! If we assume instead that root-visiting time is long, then it is suboptimally parallelised: under-parallelised if we have more than M cores, oversubscribed if M is greater than N, and needlessly serializing before the pause while it waits for the last mutator to finish marking its roots. What’s worse, root-visiting actually slows down delay, because the oversubscribed threads compete with visitation for CPU time.

So in summary, I plan to switch away from doing GC work during the ragged stop. It is complexity that doesn’t pay. Onwards and upwards!

by Andy Wingo at Friday, September 6, 2024

spritely.institute

Guile Hoot v0.5.0 released!

Hoot version 0.5.0

We are excited to announce the release of Guile Hoot v0.5.0! Hoot is a Scheme to WebAssembly compiler backend for Guile, as well as a general purpose WebAssembly toolchain. In other words, Scheme in the browser!

Hoot 0.5.0, four months in the making, features a Scheme interpreter, a lightweight concurrency API, improved Guile compatibility, and many small improvements and bug fixes.

Before reading further, how about evaluating some Scheme code?

Read on for the full release notes.

Highlights

  • New Scheme interpreter in the (hoot eval) module. Finally, it is possible to eval Scheme code at runtime! See the manual for an example program that uses the interpreter. Note that this initial version of the Scheme interpreter supports base Scheme syntax only and does not support user-defined macros. This restriction will be relaxed in the future when Hoot ships a full-featured macro expander.

  • New lightweight concurrency API based on Guile's fibers library. Fibers build atop JavaScript's promises and Guile's delimited continuations for convenient asynchronous programming without having to pepper your code with async/await. See the manual for documentation of the fibers API.

  • More of Guile's default environment (the (guile) module) has been implemented, and Guile-flavored define-module libraries now properly handle duplicate definitions. A few modules from upstream Guile can now be compiled into Hoot programs, such as (ice-9 vlist). Much work remains to be done, but 0.5.0 is a vast improvement over prior releases in this regard.

  • Uncaught exceptions now print backtraces with source locations! This one has been a long time coming and we're happy to finally say we have backtraces. Debugging was getting pretty painful without them!

New features

  • Added Scheme interpreter in (hoot eval) module.

  • Added fibers (lightweight concurrency) API in (fibers ...) module namespace.

  • Added support for R7RS define-library syntax.

  • Added code-ref Wasm heap type for storing source locations.

  • Added source locations to exceptions.

  • Added backtrace printing for uncaught exceptions.

  • Added (hoot interaction-environment) module.

  • Added (hoot regexps) module.

  • Added (hoot syntax-objects) module.

  • Added (hoot tree-il) module.

  • Added (ice-9 atomic) module.

  • Added (ice-9 binary-ports) module.

  • Added (ice-9 custom-ports) module.

  • Added (ice-9 control) module.

  • Added (ice-9 exceptions) module.

  • Added (ice-9 i18n) stub module.

  • Added (ice-9 match) module.

  • Added (ice-9 optargs) stub module.

  • Added (ice-9 ports internal) module.

  • Added (ice-9 rdelim) module, though it is mostly stubs.

  • Added (ice-9 regex) module.

  • Added (rnrs bytevectors gnu) module.

  • Added (srfi srfi-9 gnu) module.

  • Added (srfi srfi-11) module.

  • Added #:unwind-for-type support to with-exception-handler.

  • Added fluid? predicate to (hoot fluids).

  • Added endianness, native-endianness, string->utf8, utf8->string, and big endian support to (rnrs bytevectors).

  • Reimplemented hashtables and added support for eqv, equal, and custom hashtables.

  • Added partial compatibility for Guile's legacy hashtable API (hashq-ref and friends) to (guile).

  • Added numerous miscellaneous procedures to (guile) module for improved Guile compatibility.

  • Added duplicate definition resolution to define-module style modules.

  • Added define-external-type macro to (hoot ffi).

  • Added support for (ref null string) result type in define-foreign.

  • Added --run flag to guild compile-wasm for quickly testing compiled programs.

  • Added --async flag to guild compile-wasm to run async programs in conjunction with --run.

  • Reorganized JS reflection code and slightly changed the API. See the manual for updated web setup instructions.

  • Added call_async method to Procedure class in JS reflection API. Use this method to invoke async procedures that use fibers.

  • Added ,hoot-compile, ,hoot-compile-file, ,hoot-run, and ,hoot-run-file REPL commands. Enter ,use (hoot repl) at the Guile REPL to use them.

Bug fixes

  • Fixed Wasm invalid cast error when applying a non-procedure. A Scheme exception is now raised.

  • Fluids can now be used at expansion time.

  • Current port parameters (current-output-port, current-input-port, current-error-port) can now be used at expansion time.

  • make-bytevector and bytevector-copy can now be used at expansion time.

  • string->list can now be used at expansion time.

  • Fixed string conversion when going across the Guile/Wasm boundary when running Hoot programs on the Hoot VM.

  • Exceptions are now thrown when trying to open files when running in a web browser.

  • Fixed emitted code for lsh/immediate CPS instruction when y >= 33.

  • Fixed multi-byte character reads from input ports.

  • Fixed beyond-relooper compiler pass for convergent case-lambda.

  • Fixed stringref to wtf8 lowering for nullable strings.

  • Fixed characters being quoted when printed with display.

Browser compatibility

  • Compatible with Firefox 121 or later.

  • Compatible with Google Chrome 119 or later.

  • Safari/WebKit is currently unsupported. The recent Safari Technology Preview has enabled Wasm GC and tail calls, but it seems there is still something missing.

Get Hoot 0.5.0!

Hoot is already available in GNU Guix:

$ guix pull
$ guix install guile-next guile-hoot

(Hoot currently requires a bleeding-edge version of Guile, hence guile-next above.)

Otherwise, Hoot can be built from source via our release tarball. See the Hoot homepage for a download link and GPG signature.

Documentation for Hoot 0.5.0, including build instructions, can be found here.

Hoot at the ICFP Scheme workshop

Tomorrow, 9/7, lead Hoot engineer Andy Wingo of Igalia will be giving the opening keynote presentation at the Scheme workshop colocated with ICFP in Milan, Italy. Andy will be talking all about how we've made the dreams of the 90's come to life by bringing Scheme to the web. You can watch the talk live or later on the ACM SIGPLAN YouTube channel.

Get in touch!

For bug reports, pull requests, or just to follow along with development, check out the Hoot project on GitLab.

If you build something cool with Hoot, let us know on our community forum!

Thanks to tessa for the lovely Hoot art, and a special thanks to the MetaMask folks for funding this work!

Until next time, happy hooting! 🦉

by Dave Thompson (contact@spritely.institute) at Friday, September 6, 2024

Thursday, September 5, 2024

spritely.institute

Announcing Spritely Office Hours!

writing a Camp Spritely-themed postcard home

The Spritely Institute is excited to announce our new trial for Engineering Office Hours by popular demand! Have a burning question for the team? Read on to learn more about how the program works:

Reporting a bug

If you have a bug to report in Hoot or Goblins, please open an issue on the respective repo below:

Getting help

To get help with questions about Spritely technologies as soon as possible, share your question with the community on the Spritely forums!

Reserving an office hours slot

If, after filing an issue and/or posting on our forums, you still require further discussion, you can schedule a meeting with a member of the team here. We currently have two slots open on Wednesday, September 18th at 2:00 PM EDT (UTC-4).

Dropping in

Alternatively, if you just want to say hello, listen in, or think of a burning question on the day of, you can join us impromptu in the meeting room on Wednesday, September 18th at 2:00 PM EDT (UTC-4)!

Thanks as always for reading. We look forward to chatting soon!

by tessa (contact@spritely.institute) at Thursday, September 5, 2024