Syntax extensions are a feature in LIPS Scheme that allows users to add new syntax. They work
similarly to reader macros in Common Lisp: you define a sequence of characters that maps to a
function executed by the parser. The function works much like a macro, and its result is
returned by the parser in place of the defined character sequence.
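A minimal sketch of the idea, assuming LIPS's set-special! is the registration form and that the parser hands the following expression to the named handler (check the LIPS documentation for the exact signature and options):

;; Assumption: (set-special! sequence handler-name) tells the parser to call
;; the handler on the expression that follows the character sequence, and to
;; use whatever the handler returns in its place.
(define (quote-next expr)
  (list 'quote expr))

(set-special! "::" 'quote-next)

;; The parser now turns ::(1 2 3) into (quote (1 2 3)),
;; which evaluates to the list (1 2 3).
::(1 2 3)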
I more or less only used FOSS from the late nineties until autumn 2021,
when I got an iPad. This decision was partially prompted by
frustration with Android’s open washing (I was stuck with an
un-updatable Android tablet that wasn’t that old. By comparison, this
iPad has received updates for way more years already) and partially in
a rash angry moment after interacting with a complete jerk online who
made me in one instant give up on the entire FOSS community.
There was also the fact that I realized I had already been letting
my own FOSS-y values leak away, that there had been one growing,
frog-boiling exception: video game consoles. Now, for a while when I
was at my most freshly converted GNU-wly brainwashed, I didn’t use
any. Then I rationalized that it’s not that bad to use a Game Boy or
NES. The ROMs are runnable on widely available emulators (not to
discount the incredible feats of engineering that creating those emulators
entailed) and are often decompilable, or simple enough to understand in
machine code. I found them similar to the Z-machine or ScummVM, which I
already thought were OK. After all, the requirement was free
software, not that all media be free. I could still watch normal movies, for
example.
But that can hardly be said about modern consoles like Switch or Wii
or 3DS. They’re like entire operating systems spamming tons of
telemetry and junk.
So “I am already using proprietary apps so why not do it more?” was my thinking.
After getting the iPad I immediately regretted that. It was so much worse than I could’ve ever imagined.
And now I’m stuck with it for as long as I have this device. There’s no chance of a FOSS firmware being made anytime soon. That M1/M2 Linux distro that existed briefly until the kernel guys bullied them away was prohibited from making it work on tablets; they could only work on computers.
I’m not gonna get the Switch 2 nor a new iPad. I’m going to try to get
better at this in the future. It’s not exactly easy because this world
has made it way harder to be FOSS only than it was in the nineties.
I’m just gonna do my best from now on and not beat myself up.
The one rule I stuck to and I’m glad I did
I want to steer clear of “network effect” apps like Messenger or
WhatsApp or X. Those are much much harder to get out of once you
start using them. Do not lightly sign up for them. So on the iPad I
never ever used iMessage for example.
However, Scheme also has exact numbers. Numbers without a decimal
point or exponent, or rational numbers, are read as exact numbers.
You can also prefix decimal numbers with #e to make them exact.
Using exact numbers, you can have an exact result.
gosh> (apply + (make-list 10 #e0.1))
1
The trick is that Gauche reads #e0.1 as an exact rational number
1/10, and performs the computation with exact rationals.
It is revealed when the result is not a whole number:
gosh> (+ #e0.1 #e0.1)
1/5
It is inconvenient, though, when you want to perform exact computation
with decimal numbers, e.g. adding prices in dollars and cents.
If you add $15.15 and $8.91, you want to see the result as
24.06 instead of 1203/50.
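Indeed, with the default printing the sum comes back as a bare rational:

gosh> (+ #e15.15 #e8.91)
1203/50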
So, we added a new REPL print mode, exact-decimal. If you set it
to #t, Gauche tries to print exact non-integer results in
decimal notation whenever possible.
We can always have an exact decimal notation
for rational numbers whose denominator
has only 2 and 5 as prime factors.
gosh> 1/65536
#e0.0000152587890625
As long as we stick to addition, subtraction,
and multiplication of exact decimal numbers, the result
is always representable in exact decimal notation.
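For example, multiplying two exact decimals keeps the denominator a product of 2s and 5s; 1/10 here is exactly #e0.1:

gosh> (* #e0.25 #e0.4)
1/10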
But what if division is involved? Isn't it a shame that we have
an exact value (as a rational number), but can't print it as a
decimal exactly?
The decimal notation of a rational number whose denominator contains
factors other than 2 and 5 becomes a repeating decimal. Hence,
if we have a notation for repeating decimals, we can cover such
cases.
So, here it is. If a numeric literal contains # followed
by one or more digits, we interpret the digits after # as
repeating infinitely.
(Note: if no digits follow #, it is the "insignificant digit"
notation from R5RS.)
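For instance, without the #e prefix such literals are read as ordinary inexact flonums, so the repeating part is cut off at double precision (the exact digits shown depend on the flonum printer):

gosh> 1.#3
1.3333333333333333
gosh> 0.1#6
0.16666666666666666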
The above examples have a limited number of digits because
they're inexact numbers (note that we didn't
prefix them with #e). For exact numbers, we can represent
any rational number exactly with this notation:
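For example, reading such literals back gives exact rationals:

gosh> #e0.1#6
1/6
gosh> #e3.#142857
22/7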
Note that the length of repetition can be arbitrarily long, so
there are numbers that can't practically be printed in this notation.
For the time being, we have a hard limit of 1024 for the length of
repetition. If the result exceeds this limitation, we fall
back to rational notation.
;; 1/2063 has repeating cycle of 1031 digits
gosh> (/ 1 2063)
1/2063
The Spritely team is overjoyed by the support we've received in our
first ever supporter drive.
More than just donations, these are individual voices which inspire us
to deliver on our promise of a more secure future.
The world needs social media that isn't powered by centralized sources
that don't have our best interests in mind. Spritely's mission helps
make that more achievable. 🖤
- Josh Mock
Spritely is painting a picture of a hopeful, human future for people
and their communication, and have a credible vision to get
there. That's a rare thing, and worthy of support.
- David Anderson
My digital life is my life.
Corporations and their centralized web services exist to turn a
profit. They’ll eventually do so by extracting what they can from
me. And by using their services. I’ve been giving them coercive
leverage!
To hold pieces of me hostage.
No mas.
- Moto A
More than five hundred people have decided to donate to Spritely so
far, with more than three hundred new voices being heard during this campaign!
We easily surpassed our original goal of $80,000 with weeks to spare!
Even though we fell short of the stretch goal we
introduced in January,
our final amount was $90,000, well in excess of what we expected.
Also, if you donated at the Silver level or higher during the
campaign, your name has already been added to the
credits for Cirkoban
in the shiniest way we could think of. Our founder, Christine
Lemmer-Webber, is busy working on doodles to send out to the Diamond
tier supporters and those should be received in the mail in the near
future. As part of our appreciation for your support, we are keeping
the tiered reward system for future donations, regardless of funding
drive, and we hope to add even more rewards in the future.
Just because the campaign is over doesn't mean we are slowing
down. Just the opposite; your support has given us a second
wind. Already, we've released new versions of
goblins
and hoot and we have even more
in store for the near future. On top of this, we gave seven
presentations at
FOSDEM which
laid out our vision and our progress and allowed us to meet with
partners and supporters in person.
Despite the recent uncertainty in non-profit funding in the US, we are
continuing to work against centralized interests as hard as we can,
and all of your support makes that possible. Donor drive or not, we
always welcome more donations and we appreciate every
dollar and cent.
Thank you to everyone who made this supporter drive a success. We
couldn't have done it without you!
An uninterned symbol is not the same as any other symbol, even one
with the same name. These symbols are useful in macro programming and
in other situations where guaranteed-unique names are needed.
A survey of uninterned and uniquely named symbols
in Scheme is also provided.
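As a small illustration of the first point, using string->uninterned-symbol as the constructor (that name is an assumption here; several Schemes provide it, and this SRFI may specify a different one):

(define s1 (string->uninterned-symbol "temp"))
(define s2 (string->uninterned-symbol "temp"))

(eq? s1 s2)                        ; => #f, distinct despite having the same name
(eq? s1 (string->symbol "temp"))   ; => #f, and not the interned symbol either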
We are happy to announce the release of Goblins 0.15.1! This patch
release includes many bug fixes, documentation fixes, and
quality-of-life improvements made since the
0.15.0 release
back in January.
Thank you to the Spritely community for identifying and reporting many
of these issues! We’re thrilled to see more people trying Goblins for
the first time. Big thanks to Jessica Tallon, who has done nearly all
of the hard work sanding the rough edges for this release.
For more detail about the changes in this release, see the
NEWS
file.
Bug fixes
Fixed sending ghashes over CapTP that have actor references or other
complex objects.
Fixed sending messages to far references in persistent vats.
Fixed thread-safety issues in CapTP garbage collection.
Fixed issue where severed connections to peers couldn’t be re-enlivened.
Fixed CapTP crossed hellos mitigation.
Fixed call-with-vat blocking the current vat’s event loop. It now
returns a promise when called in this context.
Fixed prelay netlayer not reconnecting after severance.
Fixed prelay severance breaking local references.
Fixed on-sever for WebSocket netlayer on Hoot.
Fixed WebSocket error handling on Hoot.
Fixed (goblins ocapn netlayer base-port) and some of its
dependencies not compiling with Hoot.
Fixed make install installing utility libraries that are only for
the test suite.
Documentation fixes/improvements
Fixed documentation for spawning ^onion-netlayer and registering
sturdyrefs.
Fixed the greeter example in the “Persistence Environments” section.
Added additional explanation of cooperative multitasking, event
loops, and common vat pitfalls in response to user feedback.
Added example of passing an actor reference over OCapN.
Quality-of-life improvements
Added race* variant of race joiner in (goblins actor-lib joiners).
Vectors and non-list pairs (such as key/value pairs within
association lists) are now serializable over CapTP.
Exported ^persistence-registry from (goblins) module.
The fake netlayer can now be halted, which is useful when testing.
Added on-sever support to the prelay netlayer.
Added a new procedure, timeout, for creating promises that resolve
after a certain amount of time has passed. timeout can be found
in the new (goblins actor-lib timers) module (see the sketch after this list).
Suppressed warning for the override of Guile core binding spawn
when (goblins) is imported.
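A hypothetical sketch of the new timers module (the delay unit is an assumption, and the surrounding spawn-vat/with-vat/on boilerplate is just the usual way of running code in a vat; see the Goblins manual for the real API):

(use-modules (goblins)
             (goblins actor-lib timers))

(define vat (spawn-vat))

(with-vat vat
  ;; timeout returns a promise; on attaches a handler to it.
  (on (timeout 1)                  ; assumed: delay in seconds
      (lambda (_)
        (display "timer fired\n"))))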
Getting the release
As usual, if you're using Guix, you can upgrade to 0.15.1 by running:
guix pull
guix install guile-goblins
Otherwise, you can find the source tarball on our release
page.
Get in touch!
For bug reports, pull requests, or just to follow along with
development, check out the Goblins project on
Codeberg.
If you're making something with Goblins or want to contribute to
Goblins itself, be sure to join our community at
community.spritely.institute!
Thanks for following along and hope to see you there!
Earlier this
week
I took an inventory of how Guile uses the
Boehm-Demers-Weiser (BDW) garbage collector, with the goal of making
sure that I had replacements for all uses lined up in
Whippet. I categorized the uses
into seven broad categories, and I was mostly satisfied that I have
replacements for all except the last: I didn’t know what to do with
untagged allocations: those that contain arbitrary data, possibly full
of pointers to other objects, and which don’t have a header that we can
use to inspect their type.
But now I do! Today’s note is about how we can support untagged
allocations of a few different kinds in Whippet’s mostly-marking
collector.
inside and outside
Why bother supporting untagged allocations at all? Well, if I had my
way, I wouldn’t; I would just slog through Guile and fix all uses to be
tagged. There are only a finite number of use sites and I could get to
them all in a month or so.
The problem comes for uses of scm_gc_malloc from outside libguile
itself, in C extensions and embedding programs. These users are loath
to adapt to any kind of change, and garbage-collection-related changes
are the worst. So, somehow, we need to support these users if we are
not to break the Guile community.
on intent
The problem with scm_gc_malloc, though, is that it is missing an expression of intent, notably as regards tagging. You can use it
to allocate an object that has a tag and thus can be traced precisely,
or you can use it to allocate, well, anything else. I think we will
have to add an API for the tagged case and assume that anything that
goes through scm_gc_malloc is requesting an untagged,
conservatively-scanned block of memory. Similarly for
scm_gc_malloc_pointerless: you could be allocating a tagged object
that happens to not contain pointers, or you could be allocating an
untagged array of whatever. A new API is needed there too for
pointerless untagged allocations.
on data
Recall that the mostly-marking collector can be built in a number of
different ways: it can support conservative and/or precise roots, it can
trace the heap precisely or conservatively, it can be generational or
not, and the collector can use multiple threads during pauses or not.
Consider a basic configuration with precise roots. You can make
tagged pointerless allocations just fine: the trace function for that
tag is just trivial. You would like to extend the collector with the ability
to make untagged pointerless allocations, for raw data. How to do
this?
Consider first that when the collector goes to trace an object, it can’t use bits inside
the object to discriminate between the tagged and untagged cases.
Fortunately though the main space of the mostly-marking collector has one metadata byte for each 16 bytes of
payload. Of those 8 bits, 3 are used for the mark (five different
states, allowing for future concurrent tracing), two for the precise
field-logging write
barrier,
one to indicate whether the object is pinned or not, and one to indicate
the end of the object, so that we can determine object bounds just by
scanning the metadata byte array. That leaves 1 bit, and we can use it
to indicate untagged pointerless allocations. Hooray!
However there is a wrinkle: when Whippet decides that it should evacuate
an object, it tracks the evacuation state in the object itself; the
embedder has to provide an implementation of a little state machine,
allowing the collector to detect whether an object is forwarded or not,
to claim an object for forwarding, to commit a forwarding pointer, and
so on. We can’t do that for raw data, because all bit states belong to
the object, not the collector or the embedder. So, we have to set the
“pinned” bit on the object, indicating that these objects can’t move.
We could in theory manage the forwarding state in the metadata byte, but
we don’t have the bits to do that currently; maybe some day. For now,
untagged pointerless allocations are pinned.
on slop
You might also want to support untagged allocations that contain
pointers to other GC-managed objects. In this case you would want these
untagged allocations to be scanned conservatively. We can do this, but
if we do, it will pin all objects.
Thing is, conservative stack roots are a kind of a sweet spot in
language run-time design. You get to avoid constraining your compiler,
you avoid a class of bugs related to rooting, but you can still support
compaction of the heap.
How is this, you ask? Well, consider that you can move any object for
which we can precisely enumerate the incoming references. This is
trivially the case for precise roots and precise tracing. For
conservative roots, we don’t know whether a given edge is really an
object reference or not, so we have to conservatively avoid moving those
objects. But once you are done tracing conservative edges, any live
object that hasn’t yet been traced is fair game for evacuation, because
none of its predecessors have yet been visited.
But once you add conservatively-traced objects back into the mix, you
don’t know when you are done tracing conservative edges; you could
always discover another conservatively-traced object later in the trace,
so you have to pin everything.
The good news, though, is that we have gained an easier migration path.
I can now shove Whippet into Guile and get it running even before I have
removed untagged allocations. Once I have done so, I will be able to
allow for compaction / evacuation; things only get better from here.
Also as a side benefit, the mostly-marking collector’s heap-conservative
configurations are now faster, because we have metadata attached to
objects which allows tracing to skip known-pointerless objects. This
regains an optimization that BDW has long had via its
GC_malloc_atomic, used in Guile since time out of mind.
fin
With support for untagged allocations, I think I am finally ready to
start getting Whippet into Guile itself. Happy hacking, and see you on
the other side!
I was talking to Arthur Gleckler last night and he mentioned that
he had been making good use of a function he
called index-list. This function takes two selector
functions and a list of objects. The first selector extracts a key
from each object, and the second selector extracts a value. A table
is returned that maps the keys to a list of all the values that were
associated with that key.
I had to laugh. I had written the same function a few months back.
I called it collate.
Here is Arthur’s version in Scheme:
(define (index-list elements table choose-data choose-key)
  "Given a hash table `table', walk a list of `elements' E, using
`choose-key' to extract the key K from each E and `choose-data' to
extract a list of data D from each E.  Store each K in `table' along
with a list of all the elements of all the D for that K."
  (do-list (e elements)
    (hash-table-update!/default
     table
     (choose-key e)
     (lambda (previous) (append (choose-data e) previous))
     '()))
  table)
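For instance, a small usage sketch (assuming SRFI 69-style hash-table procedures and that do-list walks the list left to right; the data here is made up):

(define table (make-hash-table))           ; equal?-based by default

(index-list '((fruit apple) (fruit pear) (veg kale))
            table
            cdr                            ; choose-data: a list of data per record
            car)                           ; choose-key: the record's first element

(hash-table-ref/default table 'fruit '())  ; => (pear apple)
(hash-table-ref/default table 'veg '())    ; => (kale)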
Arthur’s version takes the hash table as a parameter. This
allows the caller to control the hash table’s properties. My
version creates a hash table using the test
parameter, which defaults to eql.
Arthur’s version uses choose-key to extract the key
from each element. My version uses key, which is a
keyword parameter defaulting to car. My choice was
driven by the convention of Common Lisp sequence functions to take
a :key parameter.
Arthur’s version uses a default value of '() for
the entries in the hash table. My version uses
the :default keyword argument that defaults to
'().
Arthur’s version uses choose-data to extract the
datum in each element. My version uses the :merger
keyword argument to specify how to merge the entire element into the
table. If you only want a subfield of the element, you
can compose a selector function with a merger
function.
Arthur’s version uses append to collect the data
associated with each element. My version uses a merger function to
merge the element into the entry in the hash table. The default
merger is merge-adjoin, which uses adjoin
to add the element to the list of elements associated with the
key. merge-adjoin is parameterized by a test function
that defaults to eql. If the test is true, the new
data is not merged, so the result of (merge-adjoin
#'eql) is a list with no duplicates.
If you instead specify a default of 0 and a merger of (lambda
(existing new) (+ existing 1)) you get a histogram.
Another merger I make use of is merge-unique, which
ensures that all copies of the data being merged are the same,
raising a warning if they are not.
Finally, I occasionally make use of a higher-order merger called
merge-list that takes a list of mergers and applies
them elementwise to two lists to be merged. This allows you to
create a singleton aggregate merged element where the subfields are
merged using different strategies.
Like Arthur, I found this to be a very useful function. I was
auditing a data set obtained from GitHub. It came in as a flat list
of records of users. Each record was a list of GitHub org, GitHub
ID, and SAML/SSO login. Many of our users inadvertently have
multiple GitHub IDs associated with their accounts. I used
my collate function to create a table that mapped
SAML/SSO login to a list of all the GitHub IDs associated with that
login, and the list of orgs where that mapping applies.
Salutations, populations. Today’s note is more of a work-in-progress
than usual; I have been finally starting to look at getting
Whippet into
Guile, and there are some open questions.
inventory
I started by taking a look at how Guile uses the Boehm-Demers-Weiser
collector's API, to make sure I had all
my bases covered for an eventual switch to something that was not BDW.
I think I have a good overview now, and have divided the parts of BDW-GC
used by Guile into seven categories.
implicit uses
Firstly there are the ways in which Guile’s run-time and compiler depend
on BDW-GC’s behavior, without actually using BDW-GC’s API. By this I
mean principally that we assume that any reference to a GC-managed
object from any thread’s stack will keep that object alive. The same
goes for references originating in global variables, or static data
segments more generally. Additionally, we rely on GC objects not to
move: references to GC-managed objects in registers or stacks are valid
across a GC boundary, even if those references are outside the GC-traced
graph: all objects are pinned.
Some of these “uses” are internal to Guile’s implementation itself, and
thus amenable to being changed, albeit with some effort. However some
escape into the wild via Guile’s API, or, as in this case, as implicit
behaviors; these are hard to change or evolve, which is why I am putting
my hopes on Whippet’s mostly-marking
collector,
which allows for conservative roots.
defensive uses
Then there are the uses of BDW-GC’s API, not to accomplish a task, but
to protect the mutator from the collector:
GC_call_with_alloc_lock,
explicitly enabling or disabling GC, calls to sigmask that take
BDW-GC’s use of POSIX signals into account, and so on. BDW-GC can stop
any thread at any time, between any two instructions; for most users is
anodyne, but if ever you use weak references, things start to get really
gnarly.
Of course a new collector would have its own constraints, but switching
to cooperative instead of pre-emptive safepoints would be a welcome
relief from this mess. On the other hand, we will require client code
to explicitly mark their threads as inactive during calls in more cases,
to ensure that all threads can promptly reach safepoints at all times.
Swings and roundabouts?
precise tracing
Did you know that the Boehm collector allows for precise tracing? It
does! It’s slow and truly gnarly, but when you need precision, precise
tracing is nice to have. (This is the
GC_new_kind
interface.) Guile uses it to mark Scheme stacks, allowing it to avoid
treating unboxed locals as roots. When it loads compiled files, Guile
also adds some slices of the mapped files to the root set. These
interfaces will need to change a bit in a switch to Whippet but are
ultimately internal, so that’s fine.
What is not fine is that Guile allows C users to hook into precise
tracing, notably via
scm_smob_set_mark.
This is not only the wrong interface, not allowing for copying
collection, but these functions are just truly gnarly. I don’t know
what to do with them yet; are our external users ready to forgo
this interface entirely? We have been working on them over time, but I
am not sure.
reachability
Weak references, weak maps of various kinds: the implementation of these
in terms of BDW’s API is incredibly gnarly and ultimately unsatisfying.
We will be able to replace all of these with ephemerons and tables of
ephemerons, which are natively supported by Whippet. The same goes with
finalizers.
The same goes for constructs built on top of finalizers, such as
guardians;
we’ll get to reimplement these on top of nice Whippet-supplied
primitives. Whippet allows for resuscitation of finalized objects, so
all is good here.
misc
There is a long list of miscellanea: the interfaces to explicitly
trigger GC, to get statistics, to control the number of marker threads,
to initialize the GC; these will change, but all uses are internal, making it not a terribly big
deal.
I should mention one API concern, which is that BDW’s state is all
implicit. For example, when you go to allocate, you don’t pass the API
a handle which you have obtained for your thread, and which might hold
some thread-local freelists; BDW will instead load thread-local
variables in its API. That’s not as efficient as it could be and
Whippet goes the explicit route, so there is some additional plumbing to
do.
Finally I should mention the true miscellaneous BDW-GC function:
GC_free. Guile exposes it via an API, scm_gc_free. It was already
vestigial and we should just remove it, as it has no sensible semantics
or implementation.
allocation
That brings me to what I wanted to write about today, but am going to
have to finish tomorrow: the actual allocation routines. BDW-GC
provides two, essentially: GC_malloc and GC_malloc_atomic. The
difference is that “atomic” allocations don’t refer to other
GC-managed objects, and as such are well-suited to raw data. Otherwise you can think of atomic allocations as a pure optimization, given that BDW-GC mostly traces conservatively anyway.
From the perspective of a user of BDW-GC looking to switch away, there
are two broad categories of allocations, tagged and untagged.
Tagged objects have attached metadata bits allowing their type to be inspected by the user later on. This is the
happy path! We’ll be able to write a gc_trace_object function that
takes any object, does a switch on, say, some bits in the first word,
dispatching to type-specific tracing code. As long as the object is
sufficiently initialized by the time the next safepoint comes around,
we’re good, and given cooperative safepoints, the compiler should be able to
ensure this invariant.
Then there are untagged allocations. Generally speaking, these are of
two kinds: temporary and auxiliary. An example of a temporary
allocation would be growable storage used by a C run-time routine,
perhaps as an unbounded-sized alternative to alloca. Guile uses these a
fair amount, as they compose well with non-local control flow as
occurring for example in exception handling.
An auxiliary allocation on the other hand might be a data structure only
referred to by the internals of a tagged object, but which itself never
escapes to Scheme, so you never need to inquire about its type; it’s
convenient to have the lifetimes of these values managed by the GC, and
when desired to have the GC automatically trace their contents. Some of
these should just be folded into the allocations of the tagged objects
themselves, to avoid pointer-chasing. Others are harder to change,
notably for mutable objects. And the trouble is that for external users of scm_gc_malloc, I fear that we won’t be able to migrate them over, as we don’t know whether they are making tagged mallocs or not.
what is to be done?
One conventional way to handle untagged allocations is to manage
to fit your data into other tagged data structures; V8 does this in many
places with instances of FixedArray, for example, and Guile should do
more of this. Otherwise, you make new tagged data types. In either case, all auxiliary data
should be tagged.
I think there may be an alternative, which would be just to support the
equivalent of untagged GC_malloc and GC_malloc_atomic; but for that,
I am out of time today, so type at y’all tomorrow. Happy hacking!
The serialize-structs module allows the minimization of dependencies by providing only a handful of core forms.
The flbit-field function allows access to the binary representation of IEEE floating-point numbers.
The top-left search box in the documentation works once more.
The XML reader is 2–3x faster on inputs with long CDATA and comments, and avoids some internal contract checks to obtain a 25% speedup on large documents generally.
The read-json* and write-json* functions allow customization of the Racket representation of JSON elements, eliminating the need for a separate “translation” pass.
There is lots of new documentation, and many defects repaired!
Thank you
The following people contributed to this release:
a11ce, Alex Knauth, Alexander Shopov, Alexis King, Andrew Mauer-Oats, Anthony Carrico, Bert De Ketelaere, Bob Burger, Bogdan Popa, D. Ben Knoble, David Van Horn, Gustavo Massaccesi, halfminami, Hao Zhang, Jacqueline Firth, Jinser Kafka, JJ, John Clements, Jörgen Brandt, Kraskaska, lafirest, Laurent Orseau, lukejianu, Marc Nieper-Wißkirchen, Matthew Flatt, Matthias Felleisen, mehbark, Mike Sperber, Noah Ma, Onorio Catenacci, Oscar Waddell, Pavel Panchekha, payneca, Robby Findler, Sam Phillips, Sam Tobin-Hochstadt, Shu-Hung You, Sorawee Porncharoenwase, Stephen De Gabrielle, Wing Hei Chan, Yi Cao, and ZhangHao.
Racket is a community developed open source project and we welcome new contributors. See racket/README.md to learn how you can be a part of this amazing project.
Feedback Welcome
Questions and discussion welcome at the Racket community Discourse or Discord
Please share
If you can - please help get the word out to users and platform specific repo packagers
Racket - the Language-Oriented Programming Language - version 8.16 is now available from https://download.racket-lang.org
See https://blog.racket-lang.org/2024/08/racket-v8-16.html for the release announcement and highlights.
The first version of LIPS Scheme had a regex-based tokenizer: it used a single regex to split the
input string into tokens. In this article, I will show the internals of the new
Lexer in LIPS Scheme.
I ended the talk with some puzzling results around generational
collection, which prompted yesterday’s
post.
I don’t have a firm answer yet. Or rather, perhaps for the splay
benchmark, it is to be expected that a generational GC is not great; but
there are other benchmarks that also show suboptimal throughput in
generational configurations. Surely it is some tuning issue; I’ll be
looking into it.
A SRFI 9-style define-record-type is specified which allows subtyping while preserving encapsulation, in that the field structure of supertypes remains an implementation detail with which subtypes need not concern themselves.
Today we're looking at the results from the Contributor section of the Guix User and Contributor Survey (2024). The goal was to understand how people contribute to Guix and their overall development experience. A great development experience is important because a Free Software project's sustainability depends on happy contributors to continue the work!
See Part 1 for insights about Guix adoption, and Part 2 for users overall experience. With over 900 participants there's lots of interesting insights!
Contributor community
The survey defined someone as a Contributor if they sent patches of any form. That includes changes to code, but also other improvements such as documentation and translations. Some Guix contributors have commit access to the Guix repository, but it's a much more extensive group than those with commit rights.
Of the survey's 943 full responses, 297 participants classified themselves as current contributors and 58 as previous contributors, so 355 participants were shown this section.
The first question was (Q22), How many patches do you estimate you've contributed to Guix in the last year?
Table 21: Guix contributors patch estimates

  Number of patches                         Count   Percentage
  1 — 5 patches                               190   61%
  6 — 20 patches                               60   19%
  21 — 100 patches                             36   12%
  100+ patches                                 27   9%
  None, but I've contributed in the past       42   N/A
Note that the percentages in this table, and throughout the posts, are rounded to make them easier to refer to.
The percentage is the percentage of contributors that sent patches in the last year. That means the 42 participants who were previous contributors have been excluded.
Figure 13 shows this visually:
Figure 13: Guix contributor estimated patch count
As we can see many contributors send a few patches (61%), perhaps updating a package that they personally care about. At the other end of the scale, there are a few contributors who send a phenomenal number of patches.
Active contributors
It's interesting to investigate the size of Guix's contributor community. While running the survey I did some separate research to find out the total number of contributors. I defined an Active contributor as someone who had sent a patch in the last two years, which was a total of 454 people. I deduplicated by names, but as this is a count by email address there may be some double counting.
This research also showed the actual number of patches that were sent by contributors:
Table 22: Active contributors by patch count

  Number of patches    Count   Percentage of Contributors
  1 — 5 patches          187   41%
  6 — 20 patches         102   22%
  21 — 100 patches        91   20%
  100+ patches            74   16%
Figure 14 shows this:
Figure 14: Active Guix contributors by patch count
Together this gives us an interesting picture of the contributor community:
There's a good community of active contributors to Guix: 300 in the survey data, and 454 from the direct research.
A significant percentage of contributors send one, or a few patches. This reflects that packaging in Guix can be easy to get started with.
The direct research shows an even distribution of contributors across the different levels of contribution. This demonstrates that there are some contributors who have been working on Guix for a long time, as well as newer people joining the team. That's great news for the sustainability of the project!
There are also some very committed contributors who have created a lot of patches and been contributing to the project for many years. In fact, the top 10 contributors have all contributed over 700 patches each!
Types of contribution
The survey also asked contributors (Q23), How do you participate in the development of Guix?
Table 23: Types of contribution

  Type of contribution                                               Count   Percentage
  Develop new code (patches, services, modules, etc.)                  312   59%
  Review patches                                                        65   12%
  Triage, handle and test bugs                                          65   12%
  Write documentation                                                   38   7%
  Quality Assurance (QA) and testing                                    23   4%
  Organise the project (e.g. mailing lists, infrastructure, etc.)       16   3%
  Localise and translate                                                12   2%
  Graphical design and User Experience (UX)                              2   0.4%
Figure 15 shows this as a pie chart (upping my game!):
Figure 15: Guix contribution types
Of course, the same person can contribute in multiple areas: as there were 531 responses to this question, from 355 participants, we can see that's happening.
Complex projects like Guix need a variety of contributions, not just code. Guix's web site needs visual designers who have great taste, and certainly a better sense of colour than mine! We need documentation writers to provide the variety of articles and how-tos that we've seen users asking for in the comments. The list goes on!
Unsurprisingly, Guix is code-heavy, with 60% of contributors focusing on this area, but it's great to see that there are people contributing across the project. Perhaps there's a role you can play? ... yes, you reading this post!
Paid vs unpaid contribution
FOSS projects exist on a continuum of paid and unpaid contribution. Many projects are wholly built by volunteers. Equally, there are many large and complex projects where the reality is that they're built by paid developers — after all, everyone needs to eat!
To explore this area the survey then asked (Q24), Are you paid to contribute to Guix?
The results show:
Table 24: Contributor compensation

  Type of compensation                                                                Count   Percentage
  I'm an unpaid volunteer                                                               328   94%
  I'm partially paid to work on Guix (e.g. part of my employment or a small grant)       19   5%
  I'm full-time paid to work on Guix                                                      1   0.3%
  No answer                                                                               7   N/A
We can see this as Figure 16 :
Figure 16: Guix developer compensation
Some thoughts:
Guix is a volunteer driven project.
The best way to work on Guix professionally is to find a way to make it part of your employment.
For everyone involved in the project the fact that the majority of contributors are doing it in their spare time has to be factored into everything we do, and how we treat each other.
Previous contributors
Ensuring contributors continue to be excited and active in the project is important for its health. Ultimately, fewer developers means less can be done. In volunteer projects there's always natural churn as contributors' lives change. But fixing any issues that discourage contributors is important for maintaining a healthy project.
Question 25 was targeted at the 59 participants who identified themselves as Previous Contributors. It asked, You previously contributed to Guix, but stopped, why did you stop?
The detailed results are:
Table 25: Previous contributor analysis

  Category                                                                    Count   Percentage of Previous Contributors
  External circumstances (e.g. other priorities, not enough time, etc)          28    35%
  Response to contributions was slow and/or reviews arduous                     12    15%
  The contribution process (e.g. email and patch flow)                          11    14%
  Developing in Guix/Guile was too difficult (e.g. REPL/developer tooling)       6    8%
  Guix speed and performance                                                     3    4%
  Project co-ordination, decision making and governance                          2    3%
  Lack of appreciation, acknowledgement and/or loneliness                        2    3%
  Negative interactions with other contributors (i.e. conflict)                  2    3%
  Burnt out from contributing to Guix                                            2    3%
  Learning Guix internals was too complex (e.g. poor documentation)              1    1%
  Social pressure of doing reviews and/or turning down contributions             1    1%
  Other                                                                         10    13%
Figure 17 shows this graphically:
Figure 17: Reasons for ceasing to contribute to Guix
There were 80 answers from the 59 participants, so some participants chose more than one reason.
As we can see, a change in external circumstances was the biggest reason, and is to be expected.
The next reason was Response to contributions was slow and/or reviews arduous; as we'll see, this repeatedly showed up as the biggest issue.
Next was The contribution process (e.g. email and patch flow), which also appears in many comments. Judging by the comments, the email and patch flow may be a gateway factor that puts off potential contributors from starting. There's no way for the survey to determine this, as it only covers people who started contributing and then stopped, but the comments are interesting.
Future contributions
Q26 asked contributors to grade their likelihood of contributing further; this is essentially a satisfaction score.
The question was, If you currently contribute patches to Guix, how likely are you to do so in the future?
Table 26: Future contributions scoring

  Category            Count   Percentage
  Definitely not          7   2%
  Probably not           34   10%
  Moderately likely      80   23%
  Likely                111   31%
  Certain               123   35%
Figure 18 shows this graphically:
Figure 18: Contributor satisfaction
Out of the audience of current and previous contributors, 355 in total:
The 35% of contributors who are 'Certain' they'll contribute is a great sign.
The 31% that are 'Likely' shows that there's a good pool of people who could be encouraged to continue to contribute.
We had 58 participants who categorised themselves as Previous Contributors, and 41 answered this question with definitely or probably not; that's about 12%. That leaves the 80 (23%) who are loosely positive.
Improving contribution
The survey then explored areas of friction for contributors. Anything that reduces friction should increase overall satisfaction for existing contributors.
The question (Q27) was, What would help you contribute more to the project?
Table 27: Contribution improvements

  Answer                                                                  Count   Percentage
  Timely reviews and actions taken on contributions                         203   20%
  Better read-eval-print loop (REPL) and debugging                          124   12%
  Better performance and tuning (e.g. faster guix pull)                     102   10%
  Better documentation on Guix's internals (e.g. Guix modules)              100   10%
  Guidance and mentoring from more experienced contributors                 100   10%
  Addition of a pull request workflow like GitHub/Gitlab                     90   9%
  Improved documentation on the contribution process                         77   8%
  Nothing, the limitations to contributing are external to the project       65   7%
  More acknowledgement of contributions                                      40   4%
  More collaborative interactions (e.g. sprints)                             41   4%
  Other                                                                      56   6%
Figure 19 bar chart visualises this:
Figure 19: Improvements for contributors
The 355 contributors selected 933 options for this question, so many of them selected multiple aspects that would help them to contribute more to the project.
Conclusions we can draw are:
Ensuring there's Timely reviews and actions taken on contributions is the biggest concern for contributors, and as we saw also causes contributors to become demoralised and cease working on the project.
The concern over debugging and error messages has been a consistent one from contributors.
Interestingly, documentation of Guix's internals is a priority in this list, but in other questions it doesn't appear as a high priority.
Comments on improving contribution
Jumping ahead, the last question of the contributor section (Q30) was a comment box. It asked, Is there anything else that you would do to improve contributing to Guix?
The full lists of comments from Q27 and Q30 are available and worth reading (or at least scanning!).
Looking across all of them I've created some common themes - picking a couple of example comments to avoid repetition:
Compensation for developers: there were comments from developers who want to work on Guix professionally, or people offering to donate.
"[Part of a long comment] ... For me personally it really boils down to the review process. Some patches just hang there for many months without *any* reaction. That is quite discouraging to be honest. So if there would be fund raising, I think it should (for a large part) go to paying someone (maybe even multiple people?) to do code reviews and merge patches. And in general do the "gardening job" on the issue tracker."
"I would be happy to contribute to some kind of fund, maybe by a monthly subscription, which would award stipends for experienced guix contributors to work on patch review."
Complexity of contribution: where the overall set of steps required to contribute were too complex.
"For occasional contributions, the threshold is higher than for most projects, in part due to less common tools used in the project (bugtracker for example)"
"[long comment where the substance is] I'd rather spend my limited time contributing to a 100% free software project than reading 20 webpages on how to use all the CLI tooling."
Issues with email-based contribution: concerns about the steps to create a patch, email it and so forth.
"Difficult; I am not used to the email workflow and since I'm not contributing often it is close to rediscovering everything again which is annoying. There isn't a specific thing that could solve that I guess. apologies if this doesn't say much"
"The GNU development process with mailing lists and email patches is the most difficult aspect."
Issues with speed and capacity of patch reviews: this is the highest priority amongst contributors, so there were many comments about patches not being reviewed, or reviews taking a long time.
"I really dislike that 70% of my patches don't get reviewed at all, how simple or trivial they may be. I do really test them and dogfood so contributing seems like a waste of time as someone without commit-access."
"I already checked "timely reviews/actions", but I want to emphasize how demoralizing my contribution experience was. I was excited about how easy it was to make, test, and submit a patch; I would love to help improve the packaging situation for things that I use. But it's been about a year now since I submitted some patches and have received exactly 0 communication regarding what I submitted. No reviews, no comments, no merge, nothing. Really took the wind out of my sails"
Automate patch testing and acceptance: suggestions to speed up the review pipeline by automating.
"A bias for action. If no one shows interest in a patch, and it's constrained, it should just be landed."
"Minimizing the required work needed to keep packages up to date. Most of the best developers in Guix are commiters and all the time they have to spend reviewing package update patches etc. is away from improving Guix's core. They should be able to focus on things like shepherd, bootloader configuration, guix-daemon in guile, distributed substitutes or a more modular system configuration (e.g. letting nginx know of certbot certificates without having to manually pass (ssl-certificate "/etc/...")).*"
Adding more committers: comments that more contributors would increase project velocity, and general concerns about how difficult it is to become a committer.
"Keep manual up to date, I think we need more committers doing reviews and give more love to darker corners."
"All the best. The project might need more hands to review incoming patches."
Addition of a pull requests workflow: specific comments requesting the addition of a Forge experience.
"I would use Forgejo (either an instance or at codeberg) to simplify contributions and issue handling. In my humble and personal opinion the forge workflow makes it much easier to get an overview of what is going on and to interact with others on issues and PRs"
"I think opening a pull request approach would really modernize the way of working and help welcome more people. We could still easily support patches too."
Automating package builds and tests: comments relating to automation of building packages as part of the contribution flow.
"We really need to improve the CICD situation. I see we have so many system tests that could catch issues. Let's make sure each patch has run at least against a couple of those system tests before it is being merged, or even before a reviewer has even looked at. Today a colleague of mine, who is just getting into Guix because I told him had issues with the u-boot-tools package not being built on a substitute server and being broken. Yeah, that can happen, but it happens all the time and it is such a bad experience for new and existing users."
Bugtracker improvements: comments about improving the bug tracker.
"A formal method to count the number of users affected by an issue so that developers know what to prioritize. For example, ubuntu's launchpad has a "bug heat" metric which counts the number of users that report they are affected by the bug."
Debugging and error reporting: challenges debugging issues due to difficult error messages in Guix, or underlying Guile weaknesses.
"The development workflow with Guile. I've recently switched to arei/ares, but I'm still a total newbie on how to effectively develop and navigate. I've used quite some Common Lisp, and I have my own channel with a handful packages, but it takes a long time to develop without the necessary knowledge of both Guile setup and Guix setup."
"I just want to reiterate that the debugging process can be painful sometimes. Sometimes guile gives error messages that can be bewildering. As an example, I spent awhile debugging the error message "no value specified for service of type 'myservice'". The problem was that I omitted the default-value field in my service-type, but the error message could have included the default-value field."
Runtime performance and resource usage: where it makes the experience of building and testing Guix slow or unusable.
"Foremost faster guix evals. I forget what I was doing while it runs."
"Building guix takes too long time for packagers. It is not clear why everything needs to be compiled when only contributing a package. Why does the documentation need to be built when adding a package?"
Practical guides, how-tos and examples: requests for direct instructions or examples, as compared to reference documentation.
"Improve the documentation on how to contribute. It is currently very hard to follow, some sections are simply in the wrong order, others presuppose the reader wants to evaluate several different alternatives instead of pointing to one simple way of doing things. And steps that though simple are unusual and will seem complicated to most people don't get explained in sufficient detail and examples."
FSF association as a constraint: concerns about Free Software and GNU as an organisation constraining practical user freedom.
"*Drop GNU and drop the hardline stance against discussing any proprietary software. It doesn't have to be supported directly, but at least have a link to Nonguix or something. Or have a feature flag like Nixpkgs. Who cares if the distro is certified by an organization that is pretty much irrelevant, actually giving people agency over their tech is what should be the number one goal."
"Guix is one of the GNU projects with the most potential and relevance, but unfortunately it seems association with the FSF is a contributing factor to limited adoption."
Not enough FSF: comments that the Guix project was not sufficiently supportive of FSF and/or Richard Stallman.
"collaborate more with other GNU projects"
Commit messages: concerns that the commit message format is repetitious or unnecessary.
"Encourage or enforce the usage of commit messages that detail why a change is done (and not what is done - which is already visible from the git diff)."
Importers and language ecosystem: comments about possible improvements to deal with dynamic language ecosystems (e.g. Javascript and Rust).
"Improved build systems and importers. Generally improving management of high-noise ecosystems (Maven, Rust, NPM, …)"
"Packaging Golang or Rust apps can be difficult and time-consuming because Guix requires all (recursive) dependencies to be packaged in Guix. I often gave up and just re-packaged a precompiled binary from upstream or another distro. It would be much easier if Guix relied on existing language-specific dependency management (e.g., use Cargo.lock files to fix all dependencies) - not perfect from Guix pov, but pragmatic and much more usable."
"More flexible package definitions but also more strict filtering of available packages. For example, allow some packages to use the internet in the build path (so you may easily install pip packages like TensorFlow, Flax), but by default do not allow installation of NonFree licenses and network enabled packages. We allow package transformations (--with-commit) which need network access anyway and doesn't verify hashes, I think this can be allowed. The end goal of a package should be to be reproducible from source, but the current goal can be usability, widespread adoption, reliability. This way we can start to use Guix in more scientific experiments and super computers, then the new users can help contribute further."
Project communications methods: requests for communications within the project to use modern methods (e.g. Matrix, Discourse, Github).
"Having a Discourse instance, so that people can ask questions and others and chime in and the best answers get upvotes. IRC and mailing lists are suboptimal. My success rate of getting ANY reply to my question have been consistently less than 50% regardless of the time of the day, because in IRC it scrolls down and questions go out of focus. Also in IRC the threads of discussion is getting mixed. Keep the IRC, but provide a Discourse instance. I personally even pay for paart of the cost."
Repo organisation: ideas to widen the set of contributors by having a community repo (e.g. Arch Linux like).
"I would like more packages under Guix, but I am not convinced that adding them all to the Guix channel is the way. I believe a large number of Guix packages should be moved to guix-free or similar channel. The packages in Guix itself should be the minimal ones that come installed in Guix system. The guix-free channel should be part of default-channels."
"I feel like channels are a cumbersome alternative to community packages. I previously tried to package a lesser known programming language compiler for Guix but never got replies to my patches to contribute the package. Perhaps there could be a core and community channel with stronger/weaker standards."
Project culture: concerns about the project being inward looking, not inclusive and with too much gatekeeping. Most comments in this area were very passionate, and in some cases a bit angry.
"TODO lists and direction is very helpful. Lists of "good first task" or "very important — need help with" etc, things to motivate others to contribute in. Also helpful if people ACTUALLY become part of the distro and it's not all gate-kept by idiots with attitude. I don't want to invest 1000 man hours to prove myself worthy of maintanership of a package!"
Organisational and social improvements
It's common in FOSS projects to focus on the technical issues, but Free Software is a social endeavour where organisational and social aspects are just as important. Q28 focused on the social and organisational parts of contribution by asking, What organisational and social areas would you prioritise to improve Guix?
This was a ranked question where participants had to prioritise their top 3. The rationale for asking it in this way was to achieve prioritisation.
It's useful to look at the results in two ways, first the table where participants set their highest priority (Rank 1):
Table 28: Rank 1 — Organisational and social improvements

  Category                                                               Count   Percentage
  Improve the speed and capacity of the contribution process                213   63%
  Project decision making and co-ordination                                  36   11%
  Fund raising                                                               22   7%
  Request-for-comments (RFC) process for project-wide decision making        17   5%
  Regular releases (i.e. release management)                                 19   6%
  In-person collaboration and sprints                                         8   2%
  Promotion and advocacy                                                     23   7%
Out of the 355 participants in this section, 338 answered this question and marked their highest priority.
Figure 20 shows it as a pie chart:
Figure 20: Organisational and social improvements to GNU Guix (Rank 1)
This second table shows how each category was prioritised across all positions:
Table 29: All Ranks — Organisational and social improvements

  Category                                                               Rank 1   Rank 2   Rank 3   Overall priority
  Project decision making and co-ordination                                   2        1        3        1
  Promotion and advocacy                                                      3        3        1        2
  Fund raising                                                                4        5        2        3
  Request-for-comments (RFC) process for project-wide decision making         6        2        4        4
  Improve the speed and capacity of the contribution process                  1        6        6        5
  Regular releases (i.e. release management)                                  5        4        5        6
  In-person collaboration and sprints                                         7        7        7        7
Figure 21 shows this as a stacked bar chart. Each of the categories is the position for a rank (priority), so the smallest overall priority is the most important:
Figure 21: Organisational and social improvements to GNU Guix (All Ranks)
Looking at these together:
It's clear that the highest priority (table 28) is to Improve the speed and capacity of the contribution process, as 63% of participants selected it and nothing else was close to it.
I found it quite confusing that it didn't also score highly in the second and third rank questions, which negatively impacts the overall score. This seems to be caused by the question having a significant drop-off in answers: 338 participants set their 'Rank 1', but only 264 set a 'Rank 2' and then 180 set a 'Rank 3'. The conclusion I draw is that for many contributors the sole important organisational improvement is to improve the speed and capacity of the contribution process.
Nonetheless, overall Project decision making and co-ordination was the most important social improvement across all ranks, and it was the second most important one for 'Rank 1' — so that's pretty consistent. Other than improving the contribution process this was the next most important item on contributors minds.
Promotion and advocacy also seems to be important, though there are very few comments about it in the survey overall. The next most important across all ranks was Fund raising, which does get some comments.
Technical improvements
The partner question was Q29 which asked, What technical areas would you prioritise to improve Guix overall?
This was also a ranked question where participants had to prioritise their top 3.
Table 30: Rank 1 — Technical improvements

  Category                                                               Count   Percentage
  Debugging and error reporting                                             63   18%
  Making the latest version of packages available (package freshness)
There were 345 answers for the highest priority, 327 for the second rank and 285 for the third rank — so not as significant a drop-off as for the social question. Figure 22 shows this as a bar chart:
Figure 22: Technical improvements to GNU Guix (Rank 1)
As before I've converted them to priorities in each rank. The smallest overall score is the highest priority:
Table 31: All Ranks — Technical improvements

  Category                                                               Rank 1   Rank 2   Rank 3   Overall priority
  Making the latest version of packages available (package freshness)         2        8        6        6
  Package reliability (e.g. installs and works)                               5        7        4        7
  More packages (more is better!)                                             7        6       10        8
  Guix Home services                                                         11       10        8        9
  Improving Guix's modules                                                    8       12        9       10
  Guix System services                                                       10        9       11       11
  Stable releases (e.g. regular tested releases)                             12       11       12       12
  Focused packages (fewer is better!)                                        13       13       13       13
Figure 23 shows this as a stacked bar chart.
Figure 23: Technical improvements to GNU Guix (All Ranks)
Some things that are interesting from this question:
For the technical improvements there isn't a single over-riding 'Rank 1' priority (table 30). The first choice, Debugging and error reporting, does come up consistently in comments as a problem for packagers, and across all three ranks it's the third priority.
Across all ranks Debugging and error reporting along with Runtime performance (speed and memory) are high priorities. These are probably quite connected as there's lots of comments in the survey about error reporting and slow evaluations making development time-consuming and difficult.
It's possible to think of the second and third priorities for 'Rank 1' (table 30) as being connected, since the velocity needed for Making the latest version of packages available would be helped by Automate patch testing and acceptance. We can see from the second table that through all priorities this is the area that contributors care about the most.
We asked the same question of all Users (Q21) earlier in the survey. This time the question was for Contributors only, and there were a few specific contribution-focused options. It's interesting to see the contrast between contributors' and users' priorities:
For both contributors (P2) and users (P1), improving the runtime performance was a high priority, so it's pretty consistent.
For users, making Guix easier to learn was the second highest priority; there wasn't really an equivalent option in the contributor question.
Users identified Making the latest versions of packages available (package freshness) as very important and it's also a high priority in the first rank for contributors. However, overall it was middle of the pack for them — with both Project infrastructure (e.g. continuous integration) and Contribution workflow (e.g. Pull Requests) coming higher.
Key insights recap
That completes our review of the contributor section! Here are the key insights I draw:
The size of the active contributor community (~450) is really exciting. Many developers send a few patches (~60%), while at the other end of the scale there are some who have sent hundreds.
Retaining and developing contributors is important for the project's sustainability. About 66% of active developers are likely to contribute again. That's great, how can we encourage that to happen?
The key reasons contributors stopped (aside from life changes) were a slow response to contributions and the contribution process (e.g. email and patch flow).
Improving the capacity and speed of reviews was also the overriding concern for active contributors, by a significant margin. High-priority suggestions were automating patch testing and acceptance, along with improving the project's infrastructure (e.g. continuous integration).
Technical improvements to the developer experience were improving debugging and error reporting, improving runtime performance, and providing a more commonly used contribution process (e.g. Pull Requests).
Finally, the project is 95% a volunteer one, so we should bear in mind that everyone's contributing to Guix on their personal time! While it's great to see all this fantastic feedback and it's very useful, Guix is a collective of volunteers with the constraints that brings.
Getting the Data
We've really squeezed the juice from the lemon over these three posts — but maybe you'd like to dig into the data and do your own analysis? If so head over to the Guix Survey repository where you'll find all the data available to create your own plots!
This SRFI defines the procedure generate-symbol. Each
time it is invoked, the procedure returns a new symbol whose name
cannot be guessed. The returned symbol is a standard symbol for all
purposes; it obeys write/read invariance and it is equal to another
symbol if and only if their names are spelt the same.
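A small illustrative sketch of that behavior:

(define g (generate-symbol))

(symbol? g)                                   ; => #t
(eq? g (generate-symbol))                     ; => #f, each call returns a fresh symbol
(eq? g (string->symbol (symbol->string g)))   ; => #t, write/read invariance via its name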