Planet Scheme

Wednesday, July 24, 2024

Andy Wingo

whippet progress update: funding, features, future

Greets greets! Today, an update on recent progress in Whippet, including sponsorship, a new collector, and a new feature.

the lob, the pitch

But first, a reminder of what the haps: Whippet is a garbage collector library. The target audience is language run-time authors, particularly “small” run-times: wasm2c, Guile, OCaml, and so on; to a first approximation, the kinds of projects that currently use the Boehm-Demers-Weiser collector.

The pitch is that if you use Whippet, you get a low-fuss small dependency to vendor into your source tree that offers you access to a choice of advanced garbage collectors: not just the conservative mark-sweep collector from BDW-GC, but also copying collectors, an Immix-derived collector, generational collectors, and so on. You can choose the GC that fits your problem domain, like Java people have done for many years. The Whippet API is designed to be a no-overhead abstraction that decouples your language run-time from the specific choice of GC.

I co-maintain Guile and will be working on integrating Whippet in the next months, and have high hopes for success.

bridgeroos!

I’m delighted to share that Whippet was granted financial support from the European Union via the NGI Zero Core fund, administered by the Dutch non-profit NLnet Foundation. See the NLnet project page for the overview.

This funding allows me to devote time to Whippet to bring it from proof-of-concept to production. I’ll finish the missing features, spend some time adding tracing support, measuring performance, and sanding off any rough edges, then work on integrating Whippet into Guile.

This bloggery is the first update on the progress of the funded NLnet project.

a new collector!

I landed a new collector a couple weeks ago, a parallel copying collector (PCC). It’s like a semi-space collector, in that it always evacuates objects (except large objects, which are managed in their own space). However, instead of having a single global bump-pointer allocation region, it breaks the heap into 64-kB blocks. In this way it supports multiple mutator threads: mutators do local bump-pointer allocation into their own block, and when their block is full, they fetch another from the global store.

The block size is 64 kB, but really it’s 128 kB, because each block has two halves: the active region and the copy reserve. It’s a copying collector, after all. Dividing each block in two allows me to easily grow and shrink the heap while ensuring there is always enough reserve space.

Blocks are allocated in 64-MB aligned slabs, so you get 512 blocks in a slab. The first block in a slab is used by the collector itself, to keep metadata for the rest of the blocks, for example a chain pointer allowing blocks to be collected in lists, a saved allocation pointer for partially-filled blocks, whether the block is paged in or out, and so on.
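The aligned-slab trick means any interior address can find its slab, and hence its block’s metadata, with a mask and a divide. A sketch in C; the names and layout here are only illustrative of the idea, not Whippet’s actual code:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical geometry following the text: 64-MB aligned slabs, each
   holding 512 blocks of 128 kB (a 64-kB active region plus its 64-kB
   copy reserve). */
#define SLAB_SIZE        (64UL * 1024 * 1024)
#define BLOCK_SIZE       (128UL * 1024)
#define BLOCKS_PER_SLAB  (SLAB_SIZE / BLOCK_SIZE)

/* Because slabs are aligned to their size, masking any address inside
   a slab recovers the slab base; the block index then locates that
   block's metadata, kept in the slab's first block. */
static inline uintptr_t slab_base(uintptr_t addr) {
  return addr & ~(SLAB_SIZE - 1);
}
static inline size_t block_index(uintptr_t addr) {
  return (addr - slab_base(addr)) / BLOCK_SIZE;
}
```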

The PCC not only supports parallel mutators, it can also trace in parallel. This mechanism works somewhat like allocation, in which multiple trace workers compete to evacuate objects into their local allocation buffers; when an allocation buffer is full, the trace worker grabs another, just like mutators do.

However, unlike the simple semi-space collector which uses a Cheney grey worklist, the PCC uses the fine-grained work-stealing parallel tracer originally developed for Whippet’s Immix-like collector. Each trace worker maintains a local queue of objects that need tracing, which currently has 1024 entries. If the local queue becomes full, the worker will publish 3/4 of those entries to the worker’s shared worklist. When a worker runs out of local work, it will first try to remove work from its own shared worklist, then will try to steal from other workers.
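The overflow behavior can be sketched like so, with a counting stub standing in for the shared worklist; all names here are hypothetical, not Whippet’s API:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LOCAL_QUEUE_SIZE 1024

/* Stub for the worker's shared worklist: just counts what it is given. */
static size_t published;
static void shared_publish(uintptr_t *entries, size_t n) { published += n; }

/* A tracer's fixed-size local worklist.  On overflow, 3/4 of the
   entries move to the shared worklist, where idle workers can steal. */
struct local_queue { uintptr_t entries[LOCAL_QUEUE_SIZE]; size_t count; };

static void local_push(struct local_queue *q, uintptr_t obj) {
  if (q->count == LOCAL_QUEUE_SIZE) {
    size_t spill = LOCAL_QUEUE_SIZE / 4 * 3;          /* 768 entries */
    shared_publish(q->entries, spill);
    memmove(q->entries, q->entries + spill,
            (q->count - spill) * sizeof q->entries[0]);
    q->count -= spill;
  }
  q->entries[q->count++] = obj;
}
```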

Of course, because threads compete to evacuate objects, we have to use atomic compare-and-swap instead of simple forwarding pointer updates; if you only have one mutator thread and are OK with just one tracing thread, you can avoid the ~30% performance penalty that atomic operations impose. The PCC generally starts to win over a semi-space collector when it can trace with 2 threads, and gets better with each thread you add.
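The race to forward an object might look something like this in C11. This is a simplification of the general technique, not Whippet’s header encoding, and it glosses over the window in which the winner is still copying the payload:

```c
#include <stdatomic.h>
#include <stdint.h>

/* A forwarding word of 0 means "not yet copied"; any other value is
   the object's new address in to-space. */
struct obj { _Atomic uintptr_t forward; /* payload follows */ };

/* Try to claim OBJ, installing NEW_ADDR as its forwarded location.
   Returns the address all tracers should use, whether we won the race
   (and must now copy the payload) or another tracer beat us to it. */
static uintptr_t forward_object(struct obj *obj, uintptr_t new_addr) {
  uintptr_t expected = 0;
  if (atomic_compare_exchange_strong(&obj->forward, &expected, new_addr))
    return new_addr;   /* we won; caller copies the payload */
  return expected;     /* we lost; expected now holds the winner's copy */
}
```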

I sometimes wonder whether the worklist should contain grey edges or grey objects. MMTk seems to do the former, and bundles edges into work packets, which are the unit of work sharing. I don’t know yet what is best and look forward to experimenting once I have better benchmarks.

Anyway, maintaining an external worklist is cheating in a way: unlike the Cheney worklist, this memory is not accounted for as part of the heap size. If you are targeting a microcontroller or something, you probably need to choose a different kind of collector. Fortunately, Whippet enables this kind of choice, as it contains a number of collector implementations.

What about benchmarks? Well, I’ll be doing those soon in a more rigorous way. For now I will just say that it seems to behave as expected and I am satisfied; it is useful to have a performance oracle against which to compare other collectors.

finalizers!

This week I landed support for finalizers!

Finalizers work in all the collectors: semi, pcc, whippet, and the BDW collector that is a shim to use BDW-GC behind the Whippet API. They have a well-defined relationship with ephemerons and are multi-priority, allowing embedders to build guardians or phantom references on top.

In the future I should do some more work to make finalizers support generations, if the collector is generational, allowing a minor collection to avoid visiting finalizers for old objects. But this is a straightforward extension that will happen at some point.

future!

And that’s the end of this update. Next up, I am finally going to tackle heap resizing, using the membalancer approach. Then basic Windows and Mac support, then I move on to the tracing and performance measurement phase. Onwards and upwards!

by Andy Wingo at Wednesday, July 24, 2024

Tuesday, July 23, 2024

GNU Guix

The European Union must keep funding free software

Guix is the fruit of a combination of volunteer work by an amazing number of people, work paid for by employers, but also work sponsored by public institutions. The European Commission’s Next Generation Internet (NGI) calls have been instrumental in that regard. News that NGI funding could vanish came to us as a warning signal.

Since 2020, NGI has supported many free software projects, allowing for significant strides on important topics that would otherwise be hard to fund. Several NGI grants have directly benefited Guix and related projects.

Over the years, NGI has more than demonstrated that public financial support for free software development makes a difference. We strongly believe that this support must continue, and that it must strengthen the development of innovative software where user autonomy and freedom are a central aspect.

For these reasons, the Guix project joins a growing number of projects and organizations in signing the following open letter to the European Commission.

The open letter below was initially published by petites singularités. English translation provided by OW2.

Open Letter to the European Commission

Since 2020, Next Generation Internet (NGI) programmes, part of the European Commission's Horizon programme, have funded free software in Europe using a cascade funding mechanism (see for example NLnet's calls). This year, according to the Horizon Europe working draft detailing funding programmes for 2025, we notice that Next Generation Internet is no longer mentioned as part of Cluster 4.

NGI programmes have shown their strength and importance to supporting the European software infrastructure, as a generic funding instrument to fund digital commons and ensure their long-term sustainability. We find this transformation incomprehensible, moreover when NGI has proven efficient and economical to support free software as a whole, from the smallest to the most established initiatives. This ecosystem diversity backs the strength of European technological innovation, and maintaining the NGI initiative to provide structural support to software projects at the heart of worldwide innovation is key to enforce the sovereignty of a European infrastructure. Contrary to common perception, technical innovations often originate from European rather than North American programming communities, and are mostly initiated by small-scaled organisations.

Previous Cluster 4 allocated 27 million euros to:

  • "Human centric Internet aligned with values and principles commonly shared in Europe" ;
  • "A flourishing internet, based on common building blocks created within NGI, that enables better control of our digital life" ;
  • "A structured ecosystem of talented contributors driving the creation of new internet commons and the evolution of existing internet commons".

In the name of these challenges, more than 500 projects received NGI funding in the first 5 years, backed by 18 organisations managing these European funding consortia.

NGI contributes to a vast ecosystem, as most of its budget is allocated to fund third parties by the means of open calls, to structure commons that cover the whole Internet scope - from hardware to application, operating systems, digital identities or data traffic supervision. This third-party funding is not renewed in the current program, leaving many projects short on resources for research and innovation in Europe.

Moreover, NGI allows exchanges and collaborations across all the Euro zone countries as well as "widening countries"¹, currently both a success and an ongoing progress, likewise the Erasmus programme before us. NGI also contributes to opening and supporting longer relationships than strict project funding does. It encourages implementing projects funded as pilots, backing collaboration, identification and reuse of common elements across projects, interoperability in identification systems and beyond, and setting up development models that mix diverse scales and types of European funding schemes.

While the USA, China or Russia deploy huge public and private resources to develop software and infrastructure that massively capture private consumer data, the EU can't afford this renunciation. Free and open source software, as supported by NGI since 2020, is by design the opposite of potential vectors for foreign interference. It lets us keep our data local and favors a community-wide economy and know-how, while allowing an international collaboration.

This is all the more essential in the current geopolitical context: the challenge of technological sovereignty is central, and free software allows us to address it while acting for peace and sovereignty in the digital world as a whole.

In this perspective, we urge you to preserve the NGI programme as part of the 2025 funding programme.

¹ As defined by Horizon Europe, widening Member States are Bulgaria, Croatia, Cyprus, Czechia, Estonia, Greece, Hungary, Latvia, Lithuania, Malta, Poland, Portugal, Romania, Slovakia, and Slovenia. Widening associated countries (under condition of an association agreement) include Albania, Armenia, Bosnia, Faroe Islands, Georgia, Kosovo, Moldova, Montenegro, Morocco, North Macedonia, Serbia, Tunisia, Türkiye, and Ukraine. Widening overseas regions are Guadeloupe, French Guiana, Martinique, Réunion, Mayotte, Saint-Martin, The Azores, Madeira, the Canary Islands.

by The Guix Project at Tuesday, July 23, 2024

Monday, July 22, 2024

Andy Wingo

finalizers, guardians, phantom references, et cetera

Consider guardians. Compared to finalizers, whose cleanup procedures are run by the garbage collector, concurrently with the mutator, guardians allow the mutator to control concurrency. See Guile’s documentation for more notes. Java’s PhantomReference / ReferenceQueue seems to be similar in spirit, though the details differ.

questions

If we want guardians, how should we implement them in Whippet? How do they relate to ephemerons and finalizers?

It would be a shame if guardians were primitive, as they are a relatively niche language feature. Guile has them, yes, but really what Guile has is bugs: because Guile implements guardians on top of BDW-GC’s finalizers (without topological ordering), all the other things that finalizers might do in Guile (e.g. closing file ports) might run at the same time as the objects protected by guardians. For the specific object being guarded, this isn’t so much of a problem, because when you put an object in the guardian, it arranges to prepend the guardian finalizer before any existing finalizer. But when a whole clique of objects becomes unreachable, objects referred to by the guarded object may be finalized. So the object you get back from the guardian might refer to, say, already-closed file ports.

The guardians-are-primitive solution is to add a special guardian pass to the collector that will identify unreachable guarded objects. In this way, the transitive closure of all guarded objects will be already visited by the time finalizables are computed, protecting them from finalization. This would be sufficient, but is it necessary?

answers?

Thinking more abstractly, perhaps we can solve this issue and others with a set of finalizer priorities: a collector could have, say, 10 finalizer priorities, and run the finalizer fixpoint once per priority, in order. If no finalizer is registered at a given priority, there is no overhead. A given embedder (Guile, say) could use priority 0 for guardians, priority 1 for “normal” finalizers, and ignore the other priorities. I think such a facility could also support other constructs, including Java’s phantom references, weak references, and reference queues, especially when combined with ephemerons.
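As a sketch of the idea, the per-priority pass could look like the toy model below. This is a model I made up to illustrate the ordering, not Whippet’s finalizer code; the real pass would run a trace fixpoint at each level rather than just flip a flag:

```c
#include <stddef.h>

#define PRIORITY_COUNT 10

/* Each finalizer has a priority; the collector visits priorities in
   order, finishing one level before starting the next.  An embedder
   could use priority 0 for guardians and priority 1 for "normal"
   finalizers; priorities with nothing registered cost nothing. */
struct finalizer { int priority; int order; /* -1 until resolved */ };

static void run_finalizer_pass(struct finalizer *fs, size_t n) {
  int seq = 0;
  for (int p = 0; p < PRIORITY_COUNT; p++)
    for (size_t i = 0; i < n; i++)
      if (fs[i].priority == p && fs[i].order < 0)
        fs[i].order = seq++;  /* real code: run the fixpoint here */
}
```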

Anyway, all this is a note for posterity. Are there other interesting mutator-exposed GC constructs that can’t be implemented with a combination of ephemerons and multi-priority finalizers? Do let me know!

by Andy Wingo at Monday, July 22, 2024

Sunday, July 21, 2024

Idiomdrottning

Put that in your CLI and smoke it

Snover, as transcribed by Michael Lynch, says:

I realized that — you know, that the mouse is antisocial. The GUI is antisocial. So what’s that mean? You have a problem to solve, and you solve it with the GUI. What do you have? A problem solved.

But when you solve it with a command line interface in a scripting environment, you have an artifact. And all of a sudden, that artifact can be shared with someone.

It’s certainly true that historically, CLIs have lent themselves much more readily to reproducing and automating tedious tasks, and that those automations can be shared. That’s the main point, and I’m right there joining the choir.

It’s his secondary point that’s a li’l off:

By the way, the way you did it can show cleverness. I’ve never seen anybody use a GUI in a clever way. Ever. There’s no cleverness to it. No, like, “Oh my god, you should see the way Adam clicked that mouse. Oh my god. Guys, guys, guys, guys, come on! Check it out: Adam’s going to click the button! Oh my god! That’s amazing!” It just doesn’t happen.

Snover contrasts this with the reaction to seeing an impressive command-line script:

It’s like, “Oh my god! Did you see what Proust did? That’s phenomenal! This guy’s a frickin’ genius.” And then, “Hey, give that to me. I’m going to steal that technique and apply it to my code.”

Now, this on the other hand is the most solipsistic thing I’ve heard all week (and it was RNC week!).

You’ve never seen anyone use a GUI in a clever way? I’ve been ten thousand miles in the mouth of a graveyard. The last makings of the future upon green banks of unseen battlefields. I traveled far and wide to prisons of the cross. What did you see there? I’ve seen hackers juggle mouse chords in ACME, musicians sequencing sequences of MIDI sequences in seq24, speed painters catching the light of day itself. I’ve seen Bay Raitt box modelling in Mirai.

It’s not a selling point of any UI that you’ve got to be clever to use it well; I don’t even agree that this is a good thing about CLIs, but GUIs share the same trait.

Ten years ago, the first post on this entire blog was about CLIs. I love CLIs. I just feel like this particular selling point is based on an ignorance of all the wonderfully productive work people do in spreadsheets, MIDI sequencers, box modellers, and painting apps. We use UIs to live, we don’t live to use UIs.

Michael replied (and I’ll paste some of this in as reply to his comment on his web page next time I have www access):

Thanks for responding! I’m still thinking about whether it is possible to use GUIs in a “clever” way. You shared the video of box modeling in Mirai, and others have talked about ableton speedruns, but even that feels like not quite using a GUI in a clever way. Like if you watched F. Scott Fitzgerald or Alice Walker sit at a typewriter and write an amazing page of a novel, we probably wouldn’t say they’re using the typewriter in a clever way. When I see the Mirai and Ableton videos, it feels like the person is using the GUI in an efficient or competent way, but it still feels somehow different from scripting something in a clever way.

A typewriter is a straightforward UI and while the end product of a story like Bernice Bobs Her Hair is indeed awesome, I agree that it’s not UI dazzling (a court stenographer might impress me with their chorded typing of text and even words, but Fitzgerald or Walker banging on a Sholes, that’s not a “clever UI” in that same way). I also had many people writing in with game examples, yeah, I had some game examples in my own first draft but ended up deleting them since the point of a UI is to be easy to use while the point of a game is to be challenging to use. Not the same thing.

But the Ableton or Mirai examples or even some video or audio editors, that’s where I’m gonna take my stand. If we disagree that those usages are clever then there’s got to be some sort of semantics mismatch of what “clever” really means: I’ve seen radio station journalists use audio editing apps the public has never seen in ways that are insanely idiosyncratic, keyboard-centric, efficient, and clever.

My reaction to this framing is that we devs make apps (and libs) for making apps for making apps for making apps. Someone who’s learned to juggle Mirai or Ableton isn’t part of that same dev community, that same hacker culture, but I respect them maybe even more for how they do what they do. I understand some of the scripting clevernesses like Duff’s device or the eval/apply loop or even code golfing. For me, the awe I feel is the same awe. If you have a more nuanced take on different kinds of awe and cleverness, that’s something I don’t wanna take away from you, but it’s hard to argue for it in a way that doesn’t disparage these non-coders who are masters of their tools in their own right.

Of course, when I can combine the two, that’s where I’m really in my element. I love the scripting interfaces of MyPaint, GIMP (#ChangeTheName), Blender, and Inkscape. Blender is as much of a Python machine as Emacs is a Lisp machine. We don’t have to choose.

Artifacts

Snover went on:

Or then I have this artifact, and I publish it, and people are using it. There’s a debt of gratitude. Like, they owe me a beer.

Always a quid-pro-quo with these guys. I’m not on board with shackling our world of imagination and sharing-is-caring to the zero-sum world of beers. Since we can make copies there’s no limit to how much we can lift each other up, how many shoulders we can stand on. It’s amazing that we can reproduce and automate and share workflows, that’s great, and that’s one area CLIs are winning.

Not the only area: on the web, in search boxes and address fields and chats, those are all CLIs. CLIs are back with a bang and there are so many jobs that a CLI does better.

by Idiomdrottning (sandra.snan@idiomdrottning.org) at Sunday, July 21, 2024

Gwen Weinholt

Chez Scheme 10 in Debian experimental

I have recently been working on getting Chez Scheme 10.0.0 into Debian and have uploaded it to Debian experimental. After some minor fixes it now builds on most archs except armel and x32, using bytecode where a native port is not available. Please test it and report any bugs in the Debian bug tracker.

by weinholt at Sunday, July 21, 2024

Wednesday, July 10, 2024

Andy Wingo

copying collectors with block-structured heaps are unreliable

Good day, garbage pals! This morning, a quick note on “reliability” and garbage collectors, how a common GC construction is unreliable, and why we choose it anyway.

on reliability

For context, I’m easing back in to Whippet development. One of Whippet’s collectors is a semi-space collector. Semi-space collectors are useful as correctness oracles: they always move objects, so they require their embedder to be able to precisely enumerate all edges of the object graph, to update those edges to point to the relocated objects. A semi-space collector is so simple that if there is a bug, it is probably in the mutator rather than the collector. They also have well-understood performance, and as such are useful when comparing the performance of other collectors.

But one other virtue of the simple semi-space collector is that it is reliable, in the sense that given a particular set of live objects, allocated in any order, there is a single heap size at which the allocation (and collection) will succeed, and below which the program fails (not enough memory). This is because all allocations go in the same linear region, collection itself doesn’t allocate memory, the amount of free space after an object (the fragmentation) does not depend on where it is allocated, and those object extents just add up in a commutative way.

Reliability is a virtue. Sometimes it is a requirement: for example, the Toit language and run-time targets embedded microcontrollers, where you have finite resources and either the workload fits or it doesn’t. You can’t really tolerate a result of “it works sometimes”. So, Toit uses a generational semi-space + mark-sweep collector that never makes things worse.

on block-structured heaps

But, threads make reliability tricky. With Whippet I am targeting embedders with multiple mutator threads, and a classic semi-space collector doesn’t scale – you could access the allocation pointer atomically, but that would be a bottleneck, serializing mutators, not to mention the cache contention.

The usual solution for this problem is to arrange the heap in such a way that different threads can allocate in different areas, so they don’t need to share an allocation pointer and so they don’t write to the same cache lines. And, the most common way to do this is to use a block-structured heap; for example you might have a 256 MB heap, but divided into 4096 blocks, each of which is 64 kB. That’s enough granularity to dynamically partition out space between many threads: you keep a list of available blocks and allocator threads compete to grab fresh blocks as needed. There’s central contention on the block list, so you want blocks big enough that you aren’t fetching blocks too often.
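The block-fetching protocol could be sketched as a CAS-based free list, something like the following. This is only an illustration of the idea; in particular it ignores the ABA problem that a production lock-free list has to deal with:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Mutators pop a fresh block with compare-and-swap instead of sharing
   one global allocation pointer. */
struct block { struct block *next; };
static _Atomic(struct block *) free_blocks;

static void return_block(struct block *b) {
  b->next = atomic_load(&free_blocks);
  while (!atomic_compare_exchange_weak(&free_blocks, &b->next, b))
    ;  /* b->next was updated to the new head; retry */
}

static struct block *grab_block(void) {
  struct block *head = atomic_load(&free_blocks);
  while (head &&
         !atomic_compare_exchange_weak(&free_blocks, &head, head->next))
    ;  /* another thread raced us; head was reloaded, retry */
  return head;  /* NULL: the list is empty, go get a fresh slab */
}
```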

To break a heap into blocks requires a large-object space, to allow for allocations that are larger than a block. And actually, as I mentioned in the article about block-structured heaps, usually you choose a threshold for large object allocations that is smaller than the block size, to limit the maximum and expected amount of fragmentation at the end of each block, when an allocation doesn’t fit.

on unreliability

Which brings me to my point: a copying collector with a block-structured heap is unreliable, in the sense that there is no single heap size below which the program fails and above which it succeeds.

Consider a mutator with a single active thread, allocating a range of object sizes, all smaller than the large object threshold. There is a global list of empty blocks available for allocation, and the thread grabs blocks as needed and bump-pointer allocates into that block. The last allocation in each block will fail: that’s what forces the thread to grab a new fresh block. The space left at the end of the block is fragmentation.

Assuming that the sequence of allocations performed by the mutator is deterministic, by the time the mutator has forced the first collection, the total amount of fragmentation will also be deterministic, as will the set of live roots at the time of collection. Assume also that there is a single collector thread which evacuates the live objects; this will also produce deterministic fragmentation.

However, there is no guarantee that the post-collection fragmentation is less than the pre-collection fragmentation. Unless objects are copied in such a way that preserves allocation order—generally not the case for a semi-space collector, and it would negate any advantage of a block-structured heap—then different object order could produce different end-of-block fragmentation.
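A toy bump-allocation model makes the non-commutativity concrete: the same multiset of objects, allocated in a different order, can leave a different total of end-of-block waste. The numbers here are made up and much smaller than real 64-kB blocks:

```c
#include <stddef.h>

/* Bump-allocate a sequence of object sizes into fixed-size blocks,
   wasting the tail of a block whenever the next object doesn't fit. */
enum { TOY_BLOCK_SIZE = 64 };

static size_t total_fragmentation(const size_t *sizes, size_t n) {
  size_t frag = 0, used = 0;
  for (size_t i = 0; i < n; i++) {
    if (used + sizes[i] > TOY_BLOCK_SIZE) {
      frag += TOY_BLOCK_SIZE - used;  /* tail of this block is wasted */
      used = 0;                       /* grab a fresh block */
    }
    used += sizes[i];
  }
  return frag;
}
```

For example, the order {40, 40, 24, 24} wastes 24 units, while the same objects as {40, 24, 40, 24} pack perfectly.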

causes of unreliability

The unreliability discussed above is due to non-commutative evacuation. If your collector marks objects in place, you are not affected. If you don’t commute live objects—if you preserve their allocation order, as Toit’s collector does—then you are not affected. If your evacuation commutes, as in the case of the simple semi-space collector, you are not affected. But if you have a block-structured heap and you evacuate, your collector is probably unreliable.

There are other sources of unreliability in a collector, but to my mind they are not as fundamental as this one.

  • Multiple mutator threads generally lead to a kind of unreliability, because the size of the live object graph is not deterministic at the time of collection: even if all threads have the same allocation trace, they don’t necessarily proceed in lock-step nor stop in the same place.

  • Adding collector threads to evacuate in parallel adds its own form of unreliability: if you have 8 evacuator threads, then there are 8 blocks local to the evacuator threads which also contribute to post-collection wasted space, possibly an entire block per thread.

  • Some collectors will allocate memory during collection, for example to represent a worklist of objects that need tracing. This allocation can fail. Also, annoyingly, collection-time allocation complicates comparison: you can no longer compare two collectors at the same heap size, because one of them cheats.

  • Virtual memory and paging can make you have a bad time. For example, you go to allocate a large object, so you remove some empty blocks from the main space and return them to the OS, providing you enough budget to allocate the new large object. Then the new large object is collected, so you reclaim the pages you returned to the OS, adding them to the available list. But you probably don’t page them in right away, because who wants a syscall? They get paged in lazily when the mutator needs them, but that could fail because of other processes on the system.

embracing unreliability

I think it only makes sense to insist on a reliable collector if your mutator does not have threads; otherwise, the fragmentation-related unreliability pales in comparison.

What’s more, fragmentation-related unreliability can be entirely mitigated by giving the heap more memory: the maximum amount of fragmentation is an object just shy of the large object threshold, per block, so in our case 8 kB per 64 kB. So, just increase your heap by 12.5%. You will certainly not regret increasing your heap by 12.5%.

And happily, increasing heap size also works to mitigate unreliability related to multiple mutator threads. Consider 10 threads, each of which has a local object graph that is usually 10 MB but briefly 100 MB when calculating: usually when GC happens, total live object size is 10 × 10 MB = 100 MB, but sometimes as much as 1 GB; there is a minimum heap size for which the program sometimes works, but also a minimum heap size at which it always works. The trouble is, of course, that you generally only know the minimum always-works size by experimentation, and you are unlikely to be able to actually measure the maximum heap size.

Which brings me to my final point, which is that virtual memory and growable heaps are good. Unless you have a dedicated devops team or you are evaluating a garbage collector, you should not be using a fixed heap size. The ability to just allocate some pages to keep the heap from being too tight buys us a form of soft reliability.

And with that, end of observations. Happy fragmenting, and until next time!

by Andy Wingo at Wednesday, July 10, 2024

Saturday, June 29, 2024

Gauche Devlog

Running prebuilt Gauche on GitHub workflow


The setup-gauche action installs Gauche on GitHub workflow runners for you (Using Gauche in GitHub Actions). But it downloaded the source tarball and compiled it, which took time. Especially if your repo is a small library, it feels like a waste of time to compile Gauche every time you push to the repo.

Now, setup-gauche can use a prebuilt binary on the ubuntu-latest and macos-latest platforms. Just pass prebuilt-binary: true as a parameter:

name: Build and test

on: [push, pull_request]

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
    - uses: actions/checkout@v3
    - uses: practical-scheme/setup-gauche@v5
      with:
        prebuilt-binary: true
    - name: Install dependencies
      run: |
        sudo apt install -y gettext
    - name: Build and check
      run: |
        ./configure
        make
        make -s check

Installing the prebuilt binary takes around 10 seconds, a huge time saving.

Note that the prebuilt binary is provided with the latest Gauche release only. Other parameters of setup-gauche are ignored if you use the prebuilt binary.

(You may have noticed that the repository name is now under practical-scheme instead of shirok: I made the practical-scheme organization and am gradually moving Gauche repositories there, for easier maintenance. The URL is redirected from shirok so you don't need to update immediately, but just FYI.)


The following is for those who are curious about what happens behind the scenes.

Prebuilt binaries are prepared in a different repository: https://github.com/practical-scheme/setup-gauche-binary

It has a GitHub Actions workflow that fetches the latest release tarball, builds it on a GitHub runner, and uploads the result as assets of the repo's release. That ensures the binary runs on GitHub runners.

Tags: github, CI

Saturday, June 29, 2024

Monday, June 17, 2024

Jérémy Korwin-Zmijowski

Gunit64 - Dev Log


I practice Test Driven Development almost systematically when I write code. So executing tests and reading their results are more than repetitive tasks.

Repetition could lead to automation... Here I present the first features of a tool that should make my life easier. I call it Gunit64!

This article is valid for the following versions:

– guile-gunit64: 0.0.1-0.094c17a
– geiser-gunit64: 0.0.1-0.7a87357

Introductory tutorial

Given an SRFI-64 test suite in the current buffer:

The key binding C-c r t will execute these tests and display the result.

The key binding C-c r f will only execute tests that failed during the last execution.

The key binding C-c r . will execute the test at point (the cursor must be somewhere on the string representing the test name).

The key binding C-c r r will re-execute the tests from the last execution.

Quick explanation

The geiser-gunit64 minor mode key bindings call Emacs Lisp functions which will:

– save all buffers
– compile the current buffer
– evaluate an s-exp in the REPL

This s-exp is one of the procedures exported by the Guile module gunit64.

Troubleshooting

ice-9/boot-9.scm:1685:16: In procedure raise-exception:
Unbound variable: run-root-tests

You're missing (use-modules (gunit64)).

ice-9/boot-9.scm:1685:16: In procedure raise-exception:
Wrong type to apply: ""

Your tests are not correctly passed to the *test-root* parameter. Make sure you follow this model:

(*test-root*
 (lambda ()
   ;; your tests here
   ))

Installation

  1. Install guile-gunit64
  2. Install geiser-gunit64
  3. Enable geiser-gunit64-mode when geiser-mode is enabled (via a hook in your Emacs configuration).
  4. Configure Geiser so that the *Geiser Debug* buffer processes ANSI colors (setq geiser-debug-treat-ansi-colors 'colors).
  5. Configure Geiser not to jump to *Geiser Debug* buffer on compile-time error (setq geiser-debug-jump-to-debug nil).

For steps 1 and 2, I created the Guix definitions for these 2 packages. You can find them in my personal channel.

Limitations

Gunit64 only handles simple test suites (test-assert, test-equal and test-error). As soon as test-group is introduced, things get messy. I haven't yet seen how it behaves with user-defined test-runner (I guess, not very well).

Thank you very much for reading this article! Hope you learned something!

Don't hesitate to give me your opinion, suggest an idea for improvement, report an error, or ask a question! I would be glad to discuss the topics covered here with you! You can reach me here.

Don't miss out on the next ones! You can subscribe via RSS or via e-mail just right here!

And more importantly, share this blog and tell your friends why they should read this post!

#gnu #guile #emacs #geiser #gunit64 #tdd #english

Monday, June 17, 2024

Gunit64 - Dev Log

I practice Test Driven Development almost systematically when I work on my small projects. Suffice it to say that running tests and reading their results are more than repetitive tasks.

Repetition can lead to automation… Here I present the first features of a tool that should make my life easier. I call it Gunit64!

This article is valid for the following versions:

  • guile-gunit64: 0.0.1-0.094c17a
  • geiser-gunit64: 0.0.1-0.7a87357

Introductory tutorial

Given the following SRFI-64 test suite.

The key binding C-c r t will execute these tests and display the result.

The key binding C-c r f will only execute the tests that failed during the last run.

The key binding C-c r . will execute the test at point (the cursor must be on the string representing the test name).

The key binding C-c r r will re-execute the tests from the last run.

Quick explanation

The geiser-gunit64 minor mode key bindings call Emacs Lisp functions which will:

  • save all buffers
  • compile the current buffer
  • evaluate an s-exp in the REPL

That s-exp is one of the procedures exported by the Guile module gunit64.

Troubleshooting

ice-9/boot-9.scm:1685:16: In procedure raise-exception:
Unbound variable: run-root-tests

You're missing (use-modules (gunit64)) or an equivalent.

ice-9/boot-9.scm:1685:16: In procedure raise-exception:
Wrong type to apply: ""

Your tests are not correctly passed to the *test-root* parameter. Make sure you follow this model:

(*test-root*
 (lambda ()
   ;; your tests here
   ))

Installation

  1. Install guile-gunit64
  2. Install geiser-gunit64
  3. Enable geiser-gunit64-mode when geiser-mode is enabled (via a hook in your Emacs configuration).
  4. Configure Geiser so that the *Geiser Debug* buffer processes ANSI colors (setq geiser-debug-treat-ansi-colors 'colors).
  5. Configure Geiser not to jump to the *Geiser Debug* buffer on compile-time errors (setq geiser-debug-jump-to-debug nil).

For steps 1 and 2, I created the Guix definitions for these 2 packages. You can find them in my personal channel.

Limitations

Gunit64 only handles simple test suites (test-assert, test-equal and test-error). As soon as test-group enters the picture, things get messy. I haven't yet seen how it behaves with user-defined test-runners (not very well, I imagine).

Thank you very much for reading this article! Hope you learned something!

Don't hesitate to give me your opinion, suggest an idea for improvement, report an error, or ask a question! I would be glad to discuss the topics covered here with you! You can reach me here.

Don't miss out on the next ones! You can subscribe via RSS or via e-mail just right here!

And more importantly, share this blog and tell your friends why they should read this post!

#gnu #guile #emacs #geiser #gunit64 #tdd #français

Monday, June 17, 2024

Wednesday, June 12, 2024

Idiomdrottning

Finding PGP keys

Here’s how I find PGP public keys. I have a zsh function that runs this, where “$1” means the email address I want to send to:

gpg --auto-key-locate local,wkd,keyserver --locate-keys "$1" ||
    curl -XGET https://api.protonmail.ch/pks/lookup\?op\=get\&search\=$(uenc "$1")|gpg --import

That second clause doesn’t get invoked very often: Proton users who don’t have their own domain have their keys available over WKD, some users who do have their own domain still have WKD set up, and some (all?) who don’t are still in Proton’s HKP keyserver. Maybe that covers all of them and there’s nothing left. I put it in the script before I knew about their keyserver, and got good mileage out of it early on. I’m never ever gonna get Proton myself, and it’s so nice to be able to just email people who’re on there normally; step one to doing that is getting their keys.
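The uenc call in the snippet above is the author's own URL-encoding helper, whose definition isn't shown. A minimal bash sketch of such a percent-encoder (ASCII-only, names hypothetical) might be:

```shell
# A possible definition of a `uenc`-style URL-encoding helper.
# ASCII-only sketch; the author's real helper is not shown in the post.
uenc() {
  local s=$1 out='' c i
  for (( i = 0; i < ${#s}; i++ )); do
    c=${s:i:1}
    case $c in
      [A-Za-z0-9._~-]) out+=$c ;;                # unreserved: keep as-is
      *) printf -v c '%%%02X' "'$c"; out+=$c ;;  # everything else: %XX
    esac
  done
  printf '%s\n' "$out"
}

uenc 'sandra.snan@idiomdrottning.org'   # sandra.snan%40idiomdrottning.org
```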

This doesn’t find Autocrypt keys; that’s something I might wanna fix somehow, maybe introducing a notmuch query into the mix? I’d have to reindex with that header.

Now, local is checked first and that’s bad, don’t try that at home, kids. If I already have some old, stale key to them, that’s what’s gonna pop up first and end the search. But I’m such a ditz that I kept re-importing keys that I’ve already got until I introduced local as the first step.

The keyservers I currently check are these:

keyserver hkps://keys.openpgp.org
keyserver hkps://mail-api.proton.me
keyserver hkps://keys.mailvelope.com
keyserver hkps://keyserver.ubuntu.com
keyserver hkps://pgp.mit.edu

I’m not sure my own keys are in any of them; I might’ve submitted them at one time or another. I primarily rely on WKD or Autocrypt. The keyserver idea was pretty flawed compared to WKD, and Autocrypt is a good workaround for email providers that don’t allow WKD. Those are pretty few, though: Posteo allows WKD, as does anything that lets you use your own domain.

by Idiomdrottning (sandra.snan@idiomdrottning.org) at Wednesday, June 12, 2024

Thursday, June 6, 2024

Idiomdrottning

Coding for yourself is OK

Re: Learning to be a programmer

I’ve made tons of driveby commits since pandemic started. 100 contributions in the last year just on GitHub and probably just as many on other repos.

So “even they can’t read programs written by others without 1000x effort” isn’t literally true.

I also… I think “programming for the household” is actually awesome. Automating our own lives, autonomously, not squeezing our lives into someone else’s automation.

When what we do can be extracted into reusable tools or libraries, that’s great. I’ve started 136 repos on idiomdrottning.org also since pandemic started. Programming for your own needs doesn’t have to end there.

There’s no shame in programming for yourself.

The other day I was pondering if egoism can be used for altruism and in hindsight, the answer’s closer than I would’ve guessed: in FOSS licensing and paying it forward.

by Idiomdrottning (sandra.snan@idiomdrottning.org) at Thursday, June 6, 2024

Monday, June 3, 2024

spritely.institute

Cirkoban: Sokoban meets cellular automata written in Scheme

Last week, we released a small puzzle game called Cirkoban. Cirkoban is the very first publicly accessible application developed by Spritely that features the Goblins distributed programming library running in web browsers. We bet big on Hoot, our Scheme-to-WebAssembly compiler, a little over a year ago in order to bring Goblins to the web. That bet is starting to pay off! In this post, we’ll talk in detail about how we made Cirkoban and how it showcases Spritely technology.

About Cirkoban

Cirkoban combines Sokoban block pushing with the Wireworld cellular automaton. It was made in 10 days for the Spring Lisp Game Jam 2024.

In Cirkoban, you play as an owl stuck in a secret, ethereal lab filled with strange circuitry. You must solve devious puzzles to ascend the stairs and reach the top floor. It has minimal controls (4 direction movement and an “undo” button) and contains 15 levels of increasing difficulty. Solving the puzzles requires precise movement, but you can travel back in time to undo mistakes. For an added challenge, there is a gem in each level that is harder to reach than the stairs to the next level.

Goals

While it is fun to make games, the real reason we made Cirkoban was to exercise Spritely technology and produce an appealing user-facing application under a tight deadline. We wanted to demonstrate the current state of Hoot, as well as the early progress made towards porting Goblins to the web. Games also have the tendency to stress language implementations and libraries since they require asynchronous event loops, perform a lot of math calculations and I/O, and need to refresh at 60Hz or so to appear smooth to the player.

I successfully used Hoot in the autumn edition of the Lisp Game Jam, when it was barely usable for such things. This time around, we were hoping to spend very little time debugging Hoot issues and much more time making the game and testing out something new: Goblins!

Thanks to the work of Juli Sims, it is now possible to compile and run a subset of the Goblins library with Hoot. Specifically, we had a working port of the (goblins core) module that we wanted to incorporate into the architecture of the game. The port is not yet at the point where we can do networking, so instead we set out to demonstrate local actors and transactional rollback.

Design

If you’re a regular reader of our blog, then you know that Wireworld comes up often around here. It is our favorite cellular automaton because, with a few simple rules, you can build logic gates, diodes, and other components which can be composed into complicated circuitry. The simplicity of Wireworld has made it a compelling demo program for Hoot at various stages of its development.

Christine Lemmer-Webber thought it would be interesting to create a puzzle game where the levels contain broken circuits that need to be fixed by moving some puzzle pieces around. Sokoban was a natural starting point for such a game, and it also provided an opportunity to show off Goblins’ rollback feature. In Sokoban, if you push a block into a corner then you can’t push it anymore and thus may not be able to complete the puzzle. Games like Baba is You allow the player to undo previous moves when they get stuck. With the addition of a similar undo mechanic, we now had a way for a user to directly experience rollback as part of the gameplay.

Terminal Phase time-travel demo

In the past, we have shown off rollback using the space shooter game Terminal Phase, a text-based game that runs in the player’s terminal. With Cirkoban, we now demonstrate that we have successfully ported this core feature to the web!

For the sake of fun gameplay, we decided to take some liberties with the Wireworld rules. For example, we wanted to condense common constructs which require many cells in Wireworld, such as electron-emitting generators and logic gates, into a single cell. This would help us keep the levels small and more readable, especially for the average player who isn’t already familiar with Wireworld, at the expense of more complicated code under the hood.

Development

To model the game state, we used Goblins actors. Every object in the game world is represented as an actor that responds to messages sent from other actors. Every time the player presses a key, the game either advances by a turn or, in the case of the “undo” key, rolls back to the state of the previous turn. The game “client” then processes the events of that turn, playing sound and visual effects as appropriate, and updates the visual representation of the level. The game state and rendering state are thus separated from one another.

Emacs with Cirkoban actor source code open

To edit the code, we used Emacs, naturally. I use paredit and rainbow-delimiters to efficiently edit Scheme, and the unsurpassed magit for working with our Git repository. A simple Makefile handles build automation and running a local web server for testing the game.

For graphics, we thought a low-resolution pixel art style with our own original art would work well. Using HTML5 canvas was an obvious choice to make rendering the game as simple as possible. WebGL/WebGPU are not currently viable options with Wasm GC, anyway. We chose to feature an owl as the player character because the Hoot mascot is an owl. Christine came up with the ethereal “technomancer” theme and created lots of amazing sprites in Libresprite.

Musical artist and Spritely community member EncryptedWhispers whipped up a background music track that fit our theme based on a very funny recording of Christine imitating a theremin. Juli and I made retro sound effects using the classic game jam tool sfxr.

Tiled map editor window with a Cirkoban level loaded

To design the levels, we used Tiled. We kept the map structure as simple as possible so that we could build levels quickly. There are only two layers in each map: A background tile layer for the static level tiles and an object layer for dynamic objects such as pushable blocks. Tiled’s native .tmx format uses XML, but we did not want to be in the business of fetching and parsing XML at runtime. Instead, we baked the levels into the Wasm binary by compiling Tiled map files to Scheme. The compiled levels use Scheme bytevector literals to densely pack only the essential map data needed at runtime. As levels were added, the change to the total binary size was nearly imperceptible. The compiled game code takes up much more space than all of the levels combined.
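The post doesn't show the compiled-level format, but a hedged sketch of the idea (all names invented; the real Cirkoban format may differ) could look like:

```scheme
;; Invented shape: width and height plus one byte per background tile,
;; packed densely into a bytevector literal by the Tiled-to-Scheme compiler.
(define level-01
  (make-level 4 3
              #vu8(0 1 1 0
                   1 2 2 1
                   0 1 1 0)))
```

The key point is that a bytevector literal lives directly in the compiled Wasm binary, so no XML fetching or parsing is needed at runtime.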

At the last minute I added some onscreen controls for touchscreens using Kenney assets so the game was playable on phones and tablets. The controls are shown or hidden based on the pointer CSS media query: when the pointer is fine (mouse), the onscreen controls are display: none; when coarse (touchscreen), they are shown. I used Inkscape to export just the directional pad and A button from Kenney’s combined SVG file, made some modifications to the resulting XML by hand to remove extraneous things, and then inlined the SVG into index.html. With the SVG inlined, I was able to attach click handlers to individual components of the graphic, such as each button on the directional pad. I had no idea you could do this before! After the deployment freeze of the jam rating period I made some additional improvements to how the game canvas is scaled on small phone screens.

I even had the time to sneak in some quality-of-life improvements. For instance, game progress is automatically saved to localStorage at the start of each level. We save the last completed level and the set of levels for which gems have been collected. This small feature became very important for testing as the level count grew, and players who played multiple sessions appreciated it.

Reflections

We’re a bit biased, but Hoot is actually pretty good now! We built Cirkoban using the latest unstable code in Hoot’s main branch, rather than the most recent 0.4.1 release. Even so, I only found one small problem during development and it was trivial to fix.

Now that Hoot is feeling quite stable, the lack of proper Scheme debug tools has become the most annoying developer experience issue. The current state of WebAssembly makes it hard for Hoot to do things like show a proper Scheme backtrace, but we need to see what we can do with the tools we have to improve the debugging story.

Relatedly, I also miss the REPL. Towards the end of the jam, compile times were about 20 seconds long. Every time I made a small tweak I had to recompile the entire program. Hoot is a whole-program, ahead-of-time, cross compiler that does a good job generating optimized binaries, but during development it would be nice to have a way to build incrementally like with native Guile. Generating Wasm from within Wasm is something we haven’t done yet, but Andy Wingo has thoughts about how to do it.

Finally, the Goblins port is going well! The (goblins core) module worked reliably and very little time was spent debugging issues. Juli has been making incredible progress porting Goblins to Hoot, and if the jam had been just a week or two later we might have even had vats, our asynchronous event loops, running on top of Hoot’s fibers asynchronous programming API. We could have also had true persistence that saved the game state exactly where you left off; players could take a break in the middle of a level and not have to start that level over when they returned. We are looking forward to having the full power of Goblins available in the browser in the coming months.

Reactions

Cirkoban received positive ratings and comments amongst the other jam participants, ranking second in the jam overall! We are simply thrilled with how the game turned out and the feedback we’ve received. We love shiny demos here, and the positive feedback is a good indicator that we should continue to make them!

by Dave Thompson (contact@spritely.institute) at Monday, June 3, 2024

Friday, May 31, 2024

GNU Guix

Source code archiving in Guix: new publication

We are glad to announce the publication of a new research paper entitled Source Code Archiving to the Rescue of Reproducible Deployment for the ACM Conference on Reproducibility and Replicability. The paper presents work that has been done since we started connecting Guix with the Software Heritage (SWH) archive five years ago:

The ability to verify research results and to experiment with methodologies are core tenets of science. As research results are increasingly the outcome of computational processes, software plays a central role. GNU Guix is a software deployment tool that supports reproducible software deployment, making it a foundation for computational research workflows. To achieve reproducibility, we must first ensure the source code of software packages Guix deploys remains available.

We describe our work connecting Guix with Software Heritage, the universal source code archive, making Guix the first free software distribution and tool backed by a stable archive. Our contribution is twofold: we explain the rationale and present the design and implementation we came up with; second, we report on the archival coverage for package source code with data collected over five years and discuss remaining challenges.

The ability to retrieve package source code is important for researchers who need to be able to replay scientific workflows, but it’s just as important for engineers and developers alike, who may also have good reasons to redeploy or to audit past package sets.

Support for source code archiving and recovery in Guix has improved a lot over the past five years, in particular with:

  • Support for recovering source code tarballs (tar.gz and similar files): this is made possible by Disarchive, written by Timothy Sample.

Diagram taken from the paper showing Disarchive tarball “disassembly” and “assembly”.

  • The ability to look up data by nar hash in the SWH archive (“nar” is the normalized archive format used by Nix and Guix), thanks to fellow SWH hackers. This, in turn, allows Guix to look up any version control checkout by content hash—Git, Subversion, Mercurial, you name it!
  • The monitoring of archival coverage with Timothy’s Preservation of Guix reports has allowed us to identify discrepancies in Guix, Disarchive, and/or SWH and to increase archival coverage.

Graph taken from the paper showing package source code archival coverage over time.

94% of the packages in a January 2024 snapshot of Guix are known to have their source code archived!

Check out the paper to learn more about the machinery at play and the current status.

by Ludovic Courtès, Timothy Sample, Simon Tournier, Stefano Zacchiroli at Friday, May 31, 2024

Monday, May 27, 2024

Andy Wingo

cps in hoot

Good morning good morning! Today I have another article on the Hoot Scheme-to-Wasm compiler, this time on Hoot’s use of the continuation-passing-style (CPS) transformation.

calls calls calls

So, just a bit of context to start out: Hoot is a Guile, Guile is a Scheme, Scheme is a Lisp, one with “proper tail calls”: function calls are either in tail position, syntactically, in which case they are tail calls, or they are not in tail position, in which case they are non-tail calls. A non-tail call suspends the calling function, putting the rest of it (the continuation) on some sort of stack, and will resume when the callee returns. Because non-tail calls push their continuation on a stack, we can call them push calls.

(define (f)
  ;; A push call to g, binding its first return value.
  (define x (g))
  ;; A tail call to h.
  (h x))

Usually the problem in implementing Scheme on other language run-times comes in tail calls, but WebAssembly supports them natively (except on JSC / Safari; should be coming at some point though). Hoot’s problem is the reverse: how to implement push calls?

The issue might seem trivial but it is not. Let me illustrate briefly by describing what Guile does natively (not compiled to WebAssembly). Firstly, note that I am discussing residual push calls, by which I mean to say that the optimizer might remove a push call in the source program via inlining: we are looking at those push calls that survive the optimizer. Secondly, note that native Guile manages its own stack instead of using the stack given to it by the OS; this allows for push-call recursion without arbitrary limits. It also lets Guile capture stack slices and rewind them, which is the fundamental building block we use to implement exception handling, Fibers and other forms of lightweight concurrency.

The straightforward function call will have an artificially limited total recursion depth in most WebAssembly implementations, meaning that many idiomatic uses of Guile will throw exceptions. Unpleasant, but perhaps we could stomach this tradeoff. The greater challenge is how to slice the stack. As far as I am aware, there are three possible implementation strategies.

generic slicing

One possibility is that the platform provides a generic, powerful stack-capture primitive, which is what Guile does. The good news is that one day, the WebAssembly stack-switching proposal should provide this too. And in the meantime, the so-called JS Promise Integration (JSPI) proposal gets close: if you enter Wasm from JS via a function marked as async, and you call out to JavaScript to a function marked as async (i.e. returning a promise), then on that nested Wasm-to-JS call, the engine will suspend the continuation and resume it only when the returned promise settles (i.e. completes with a value or an exception). Each entry from JS to Wasm via an async function allocates a fresh stack, so I understand you can have multiple pending promises, and thus multiple wasm coroutines in progress. It gets a little gnarly if you want to control when you wait, for example if you might want to wait on multiple promises; in that case you might not actually mark promise-returning functions as async, and instead import an async-marked async function waitFor(p) { return await p} or so, allowing you to use Promise.race and friends. The main problem though is that JSPI is only for JavaScript. Also, its stack sizes are even smaller than the default stack size.

instrumented slicing

So much for generic solutions. There is another option, to still use push calls from the target machine (WebAssembly), but to transform each function to allow it to suspend and resume. This is what I think of as Joe Marshall’s stack trick (also see §4.2 of the associated paper). The idea is that although there is no primitive to read the whole stack, each frame can access its own state. If you insert a try/catch around each push call, the catch handler can access local state for activations of that function. You can slice a stack by throwing a SaveContinuation exception, in which each frame’s catch handler saves its state and re-throws. And if we want to avoid exceptions, we can use checked returns as Asyncify does.

I never understood, though, how you resume a frame. The Generalized Stack Inspection paper would seem to indicate that you need the transformation to introduce a function to run “the rest of the frame” at each push call, which becomes the Invoke virtual method on the reified frame object. To avoid code duplication you would have to make normal execution flow run these Invoke snippets as well, and that might undo much of the advantages. I understand the implementation that Joe Marshall was working on was an interpreter, though, which bounds the number of sites needing such a transformation.

cps transformation

The third option is a continuation-passing-style transformation. A CPS transform results in a program whose procedures “return” by tail-calling their “continuations”, which themselves are procedures. Taking our previous example, a naïve CPS transformation would reify the following program:

(define (f' k)
  (g' (lambda (x) (h' k x))))

Here f' (“f-prime”) receives its continuation as an argument. We call g', for whose continuation argument we pass a closure. That closure is the return continuation of g, binding a name to its result, and then tail-calls h with respect to f. We know their continuations are the same because it is the same binding, k.

Unfortunately we can’t really slice arbitrary ranges of a stack with the naïve CPS transformation: we can only capture the entire continuation, and can’t really inspect its structure. There is also no way to compose a captured continuation with the current continuation. And, in a naïve transformation, we would be constantly creating lots of heap allocation for these continuation closures; a push call effectively pushes a frame onto the heap as a closure, as we did above for g'.

There is also the question of when to perform the CPS transform; most optimizing compilers would like a large first-order graph to work on, which is out of step with the way CPS transformation breaks functions into many parts. Still, there is a nugget of wisdom here. What if we preserve the conventional compiler IR for most of the pipeline, and only perform the CPS transformation at the end? In that way we can have nice SSA-style optimizations. And, for return continuations of push calls, what if instead of allocating a closure, we saved the continuation data on an explicit stack? As Andrew Kennedy notes, closures introduced by the CPS transform follow a stack discipline, so this seems promising; we would have:

(define (f'' k)
  (push! k)
  (push! h'')
  (g'' (lambda (x)
         (define h'' (pop!))
         (define k (pop!))
         (h'' k x))))

The explicit stack allows for generic slicing, which makes it a win for implementing delimited continuations.

hoot and cps

Hoot takes the CPS transformation approach with stack-allocated return closures. In fact, Hoot goes a little farther, too far probably:

(define (f''')
  (push! k)
  (push! h''')
  (push! (lambda (x)
           (define h''' (pop!))
           (define k (pop!))
           (h''' k x)))
  (g'''))

Here instead of passing the continuation as an argument, we pass it on the stack of saved values. Returning pops off from that stack; for example, (lambda () 42) would transform as (lambda () ((pop!) 42)). But some day I should go back and fix it to pass the continuation as an argument, to avoid excess stack traffic for leaf function calls.

There are some gnarly details though, which I know you are here for!

splits

For our function f, we had to break it into two pieces: the part before the push-call to g and the part after. If we had two successive push-calls, we would instead split into three parts. In general, each push-call introduces a split; let us use the term tails for the components produced by a split. (You could also call them continuations.) How many tails will a function have? Well, one for the entry, one for each push call, and one any time control-flow merges between two tails. This is a fixpoint problem, given that the input IR is a graph. (There is also some special logic for call-with-prompt but that is too much detail for even this post.)

where to save the variables

Guile is a dynamically-typed language, having a uniform SCM representation for every value. However in the compiler and run-time we can often unbox some values, generally as u64/s64/f64 values, but also raw pointers of some specific types, some GC-managed and some not. In native Guile, we can just splat all of these data members into 64-bit stack slots and rely on the compiler to emit stack maps to determine whether a given slot is a double or a tagged heap object reference or what. In WebAssembly though there is no sum type, and no place we can put either a u64 or a (ref eq) value. So we have not one stack but three (!) stacks: one for numeric values, implemented using a Wasm memory; one for (ref eq) values, using a table; and one for return continuations, because the func type hierarchy is disjoint from eq. It’s... it’s gross? It’s gross.

what variables to save

Before a push-call, you save any local variables that will be live after the call. This is also a flow analysis problem. You can leave off constants, and instead reify them anew in the tail continuation.

I realized, though, that we have some pessimality related to stacked continuations. Consider:

(define (q x)
  (define y (f))
  (define z (f))
  (+ x y z))

Hoot’s CPS transform produces something like:

(define (q0 x)
  (save! x)
  (save! q1)
  (f))

(define (q1 y)
  (restore! x)
  (save! x)
  (save! y)
  (save! q2)
  (f))

(define (q2 z)
  (restore! x)
  (restore! y)
  ((pop!) (+ x y z)))

So q0 saved x, fine, indeed we need it later. But q1 needn’t have restored x only to save it again on q2’s behalf. Really we should be applying a stack discipline for saved data within a function. Given that the source IR is a graph, this means another flow analysis problem, one that I haven’t thought about how to solve yet. I am not even sure if there is a solution in the literature, given that SSA-like flow graphs plus tail calls / CPS is a somewhat niche combination.

calling conventions

The continuations introduced by CPS transformation have associated calling conventions: return continuations may have the generic varargs type, or the compiler may have concluded they have a fixed arity that doesn’t need checking. In any case, for a return, you call the return continuation with the returned values, and the return point then restores any live-in variables that were previously saved. But for a merge between tails, you can arrange to take the live-in variables directly as parameters; it is a direct call to a known continuation, rather than an indirect call to an unknown call site.

cps soup?

Guile’s intermediate representation is called CPS soup, and you might wonder what relationship that CPS has to this CPS. The answer is not much. The continuations in CPS soup are first-order; a term in one function cannot continue to a continuation in another function. (Inlining and contification can merge graphs from different functions, but the principle is the same.)

It might help to explain that it is the same relationship as it would be if Guile represented programs using SSA: the Hoot CPS transform runs at the back-end of Guile’s compilation pipeline, where closure representations have already been made explicit. The IR is still direct-style, just that syntactically speaking, every call in a transformed program is a tail call. We had to introduce save and restore primitives to implement the saved-variable stack, and some other tweaks, but generally speaking, the Hoot CPS transform ensures the run-time all-tail-calls property rather than altering the compile-time language; a transformed program is still CPS soup.

fin

Did we actually make the right call in going for a CPS transformation?

I don’t have good performance numbers at the moment, but from what I can see, the CPS transformation can impose real penalties, even 10x penalties in some cases. But some results are quite good, improving over native Guile, so I can’t be categorical.

But really the question is, is the performance acceptable for the functionality, and there I think the answer is more clear: we have a port of Fibers that I am sure Spritely colleagues will be writing more about soon, we have good integration with JavaScript promises while not relying on JSPI or Asyncify or anything else, and we haven’t had to compromise in significant ways regarding the source language. So, for now, I am satisfied, and looking forward to experimenting with the stack slicing proposal as it becomes available.

Until next time, happy hooting!

by Andy Wingo at Monday, May 27, 2024

Arthur A. Gleckler

papers

I left graduate school decades ago, but I still love to read academic papers. The field of computer science reinvents its wheels constantly, but academic literature is a great way to mine existing ideas and avoid that problem. It's a way to "stand on the shoulders of giants."

For a long time, I maintained and carefully indexed a collection of actual printed papers. Once it reached the hundreds, that approach became too cumbersome. I ended up throwing away papers in order to avoid having too many, but often regretted doing that when some half-remembered idea popped up again in another context.

Now I have a crude system that meets my needs. I keep notes on the most interesting papers using Org-mode files, and I keep my collection in a Git repo in purely digital form. Every paper appears in the top-level directory, and there's a subdirectory to-read/ for papers I haven't yet read. A little bit of automation helps, too. Now managing almost two thousand papers is no problem.

Here's the Scheme program, papers, I use for adding new papers:

#!/usr/local/bin/chibi-scheme -r
;;;; -*- mode: scheme -*-

;;;; Expect environment variable <p> to name a directory that holds
;;;; all papers, e.g. "papers/".  A subdirectory "to-read/" that holds
;;;; unread papers must also exist.

;;;; Copy specified documents into "$p/" directory.

;;;; If <--to-read> is specified, copy them to "$p/to-read/", too.

;;;; If <--commit> is specified, use Git to commit each of the new
;;;; documents.  Each document will be committed separately.  It is an
;;;; error if any other file is already staged.

(import (chibi filesystem)
        (chibi pathname)
        (only (chibi process)
              exit
              process->output+error+status)
        (scheme process-context)
        (scheme small)
        (scheme write)
        (only (srfi 1)
              any
              remove)
        (srfi 98)
        (only (srfi 130)
              string-join
              string-prefix?)
        (srfi 166 base))

(define (echo . rest)
  (apply show #true rest)
  (show #true nl))

(define (echo-command command)
  (echo "command: " (string-join command " ")))

(define (usage program)
  (echo "Usage: "
        (path-strip-directory program)
        " [--commit] [--to-read] <pathname> ..."
        nl
        nl
        "Environment variable p must be set to a directory for holding papers."
        nl
        "It must have a subdirectory called \"to-read/\"."
        nl))

(define (run command)
  (let* ((out+err+status (process->output+error+status command))
         (stdout (car out+err+status))
         (stderr (cadr out+err+status))
         (status (caddr out+err+status)))
    (unless (zero? status)
      (echo-command command)
      (write-string stderr)
      (write-string stdout)
      (exit 1))))

(define (run/assert command message)
  (unless (zero? (caddr (process->output+error+status command)))
    (echo-command command)
    (echo message)
    (exit 1)))

(define (act document papers commit? to-read?)
  (let ((filename (path-strip-directory document)))
    (link-file document (path-resolve filename papers))
    (when commit? (run `("git" "add" ,filename)))
    (when to-read?
      (let ((place (path-resolve filename (path-resolve "to-read" papers))))
        (symbolic-link-file (path-resolve filename "..") place)
        (when commit? (run `("git" "add" ,place)))))
    (when commit?
      (run `("git" "commit" "-m" ,filename)))))

(define (switch? string) (string-prefix? "--" string))

(define (main arguments)
  (let* ((program (car arguments))
         (options (cdr arguments))
         (papers (or (get-environment-variable "p")
                     (begin (usage program) (exit 1))))
         (valid-switches '("--commit" "--help" "--to-read")))
    (cond ((member "--help" options) (usage program) (exit 0))
          ((any (lambda (a)
                  (and (switch? a)
                       (not (member a valid-switches))))
                options)
           (usage program)
           (exit 1)))
    (let* ((commit? (member "--commit" options))
           (to-read? (member "--to-read" options))
           (cwd (current-directory))
           (documents (map (lambda (f) (path-resolve f cwd))
                           (remove switch? options))))

      (when (null? documents)
        (usage program)
        (exit 1))
      (change-directory papers)
      (when commit?
        (run/assert `("git" "diff" "--cached" "--quiet")
                    "Error: Files already staged."))
      (for-each (lambda (f) (act f papers commit? to-read?))
                documents))))

For example, to add one paper, including a copy in to-read/, committing it to the repo:

papers --commit --to-read /tmp/aim-349.pdf

I use MIT Scheme for most of my Scheme hacking, but Alex Shinn's Chibi Scheme is wonderful for implementing this kind of tool. It's small, R7RS-Small-compliant, and has many useful libraries. Thank you, Alex!

Fixed on Wed 29 May 2024 to handle relative pathnames.

by Arthur A. Gleckler at Monday, May 27, 2024