<?xml version="1.0" encoding="utf-8" ?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>weinholt.se</title>
    <atom:link href="https://weinholt.se/feed.xml" rel="self" type="application/rss+xml"></atom:link>
    <link>https://weinholt.se</link>
    <description>Notes of a technical nature</description>
    <pubDate>Sun, 21 Jul 2024 02:00:00 +0200</pubDate>
    <generator>Wintersmith - https://github.com/jnordberg/wintersmith</generator>
    <language>en</language>
    <item>
      <title>Chez Scheme 10 in Debian experimental</title>
      <link>https://weinholt.se/articles/chezscheme-10-in-debian-experimental/</link>
      <pubDate>Sun, 21 Jul 2024 02:00:00 +0200</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/chezscheme-10-in-debian-experimental/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;I have recently been working on getting Chez Scheme 10.0.0 into Debian
and have uploaded it to &lt;a href=&quot;https://wiki.debian.org/DebianExperimental&quot;&gt;Debian experimental&lt;/a&gt;. After
some minor fixes it now builds on most
archs &lt;a href=&quot;https://buildd.debian.org/status/package.php?p=chezscheme&amp;amp;suite=experimental&quot;&gt;except armel and x32&lt;/a&gt;, using bytecode where a native
port is not available. Please test it and report any bugs in the
Debian bug tracker.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>Fuzzing Scheme with AFL++</title>
      <link>https://weinholt.se/articles/fuzzing-scheme-with-aflplusplus/</link>
      <pubDate>Thu, 18 May 2023 02:00:00 +0200</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/fuzzing-scheme-with-aflplusplus/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;The comments on this blog are now back from their GDPR-induced coma.
I’m using a custom comment system powered by &lt;a href=&quot;https://htmx.org/&quot;&gt;HTMX&lt;/a&gt;
and a backend built on &lt;a href=&quot;https://scheme.fail&quot;&gt;Loko Scheme&lt;/a&gt;. While
writing the backend, one thing led to another and I wanted to see if
my HTTP message parser could crash. This is when I discovered that the
AFL support in Loko Scheme had suffered bit rot. I have repaired it
now and wanted to demonstrate how to fuzz Scheme code with Loko Scheme
and AFL++.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://aflplus.plus/&quot;&gt;AFL++&lt;/a&gt; is a fuzzer based on the original
American Fuzzy Lop (AFL)
by &lt;a href=&quot;https://lcamtuf.coredump.cx/&quot;&gt;Michał “lcamtuf” Zalewski&lt;/a&gt;. Loko
Scheme previously had support for AFL but it inadvertently stopped working back when the
&lt;code&gt;.data&lt;/code&gt; segment was made read-only.
The &lt;a href=&quot;https://gitlab.com/weinholt/loko/-/commit/fa5dc51daa77e4b0ee2832e7832bdf5218a46700&quot;&gt;fuzzer support is now repaired&lt;/a&gt;
and accessible with a command line flag.&lt;/p&gt;
&lt;p&gt;I did find one bug in the HTTP message parser, but it was not very
interesting. More interesting were the problems that I found
in &lt;a href=&quot;https://akkuscm.org/packages/laesare/&quot;&gt;laesare&lt;/a&gt;, my Scheme reader
library. AFL++ found a way to crash it and to make it hang. Oops!&lt;/p&gt;
&lt;h1 id=&quot;steps-to-fuzzing&quot;&gt;Steps to Fuzzing&lt;/h1&gt;
&lt;p&gt;Here is how you fuzz a Scheme program (R6RS or R7RS):&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;First make sure you have AFL++ installed: &lt;code&gt;sudo apt install afl++&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Install &lt;a href=&quot;https://akkuscm.org&quot;&gt;Akku.scm&lt;/a&gt;, the Scheme package manager.
It is required by Loko Scheme.&lt;/li&gt;
&lt;li&gt;Install &lt;a href=&quot;https://scheme.fail&quot;&gt;Loko Scheme&lt;/a&gt; from git, or any future
version later than 0.12.0.&lt;/li&gt;
&lt;li&gt;Write a program that reads input from standard input and passes it
to your function under test.&lt;/li&gt;
&lt;li&gt;Compile that program with &lt;code&gt;loko -fcoverage=afl++ --compile coverage.sps&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Create a directory called &lt;code&gt;inputs&lt;/code&gt; with sample input files. It is
often enough to provide a single file, but it can help speed up the
fuzzing process if you have samples of a wide variety.&lt;/li&gt;
&lt;li&gt;Run AFL++ with &lt;code&gt;env AFL_CRASH_EXITCODE=70 afl-fuzz -i inputs/ -o outputs -- ./coverage&lt;/code&gt;.
Watch the pretty status screen and wait for it to find crashes and
timeouts.&lt;/li&gt;
&lt;li&gt;Analyze the outputs.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;That’s essentially it. Only steps 4 to 8 are really specific to fuzzing,
so I will go through them in detail.&lt;/p&gt;
&lt;h1 id=&quot;program-under-test&quot;&gt;Program Under Test&lt;/h1&gt;
&lt;p&gt;You need a program that reads from standard input and passes the data
to the code you want to test with AFL++. This is usually very simple
to accomplish.&lt;/p&gt;
&lt;p&gt;If your program works with textual data then you read from
&lt;code&gt;(current-input-port)&lt;/code&gt;, but if it works with binary data then you make a new
binary input port with &lt;code&gt;(standard-input-port)&lt;/code&gt; and read from that. You
can either read until you see the eof object and pass all the data
directly to the code under test, or you can pass the port directly to
the code under test. It depends on what your API needs as input.&lt;/p&gt;
&lt;p&gt;Here is an example program that feeds data to &lt;code&gt;get-token&lt;/code&gt; from laesare:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;&lt;span class=&quot;comment&quot;&gt;;; SPDX-License-Identifier: MIT&lt;/span&gt;
(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;import&lt;/span&gt;&lt;/span&gt;
  (&lt;span class=&quot;name&quot;&gt;rnrs&lt;/span&gt;)
  (&lt;span class=&quot;name&quot;&gt;laesare&lt;/span&gt; reader))

(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let&lt;/span&gt;&lt;/span&gt; ((&lt;span class=&quot;name&quot;&gt;reader&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;make-reader&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;current-input-port&lt;/span&gt;&lt;/span&gt;) &lt;span class=&quot;string&quot;&gt;&quot;stdin&quot;&lt;/span&gt;)))
  (&lt;span class=&quot;name&quot;&gt;reader-mode-set!&lt;/span&gt; reader &lt;span class=&quot;symbol&quot;&gt;'r6rs&lt;/span&gt;)
  (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let&lt;/span&gt;&lt;/span&gt; lp ()
    (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let-values&lt;/span&gt;&lt;/span&gt; ([(&lt;span class=&quot;name&quot;&gt;type&lt;/span&gt; token)
                  (&lt;span class=&quot;name&quot;&gt;guard&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;con&lt;/span&gt;
                          ((&lt;span class=&quot;name&quot;&gt;lexical-violation?&lt;/span&gt; con)
                           (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;values&lt;/span&gt;&lt;/span&gt; &lt;span class=&quot;symbol&quot;&gt;'condition&lt;/span&gt; con)))
                    (&lt;span class=&quot;name&quot;&gt;get-token&lt;/span&gt; reader))])
      (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;write&lt;/span&gt;&lt;/span&gt; type)
      (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;display&lt;/span&gt;&lt;/span&gt; &lt;span class=&quot;literal&quot;&gt;#\space&lt;/span&gt;)
      (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;write&lt;/span&gt;&lt;/span&gt; token)
      (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;newline&lt;/span&gt;&lt;/span&gt;)
      (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;unless&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;eof-object?&lt;/span&gt;&lt;/span&gt; token)
        (&lt;span class=&quot;name&quot;&gt;lp&lt;/span&gt;)))))
&lt;/code&gt;&lt;/pre&gt;
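&lt;p&gt;For binary input the same pattern can be sketched as follows. This is
only an illustration: &lt;code&gt;parse-message&lt;/code&gt; is a hypothetical stand-in for the
code under test and is not part of laesare or Loko Scheme.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;;; SPDX-License-Identifier: MIT
(import (rnrs))

;; Hypothetical code under test: here it just counts the bytes.
(define (parse-message bv)
  (bytevector-length bv))

(let* ((port (standard-input-port))
       (data (get-bytevector-all port)))
  ;; get-bytevector-all returns the eof object if the input is empty.
  (write (parse-message (if (eof-object? data)
                            (make-bytevector 0)
                            data)))
  (newline))
&lt;/code&gt;&lt;/pre&gt;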
&lt;h1 id=&quot;compile-with-instrumentation&quot;&gt;Compile with Instrumentation&lt;/h1&gt;
&lt;p&gt;The program now needs to be compiled with instrumentation for AFL++.
This is done by passing a new flag to loko:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;.akku/env loko -fcoverage=afl++ --compile coverage.sps
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The new &lt;code&gt;-fcoverage=afl++&lt;/code&gt; flag tells the code generator to insert a
special code sequence inside every &lt;code&gt;if&lt;/code&gt; expression.&lt;/p&gt;
&lt;p&gt;The way that AFL++ works is that the program receives a shared memory
segment from &lt;code&gt;afl-fuzz&lt;/code&gt; that it mutates during the execution of the
program. You can imagine that this program:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;if&lt;/span&gt;&lt;/span&gt; test
    conseq
    altern)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;is transformed into this program:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;if&lt;/span&gt;&lt;/span&gt; test
    (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;begin&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;afl-mutate&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;compile-time-random&lt;/span&gt;)) conseq)
    (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;begin&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;afl-mutate&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;compile-time-random&lt;/span&gt;)) altern))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The expression &lt;code&gt;(compile-time-random)&lt;/code&gt; should be fixed at compile-time
and should be different for the two branches. Therefore the effect of
&lt;code&gt;afl-mutate&lt;/code&gt; on the shared memory segment will be different depending
on which branch is taken at runtime, but it will be identical for
different runs of the program if the input is identical.&lt;/p&gt;
&lt;p&gt;Now suppose that this transformation is applied to the whole program.
The path of all branches taken through the program for a given input
generates a unique fingerprint that AFL++ uses to explore all the
branches in the program. It uses some clever algorithms to mutate the
input in ways that uncover new paths through the program. This is
repeated thousands of times per second to eventually (maybe) find
inputs that crash or hang the program.&lt;/p&gt;
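&lt;p&gt;To make the fingerprint idea concrete, here is a rough model of the
classic AFL edge counter in portable R6RS Scheme. It is a sketch under
assumptions, not what Loko’s code generator actually emits: the 64 KiB
map size follows the original AFL design, and &lt;code&gt;shared-map&lt;/code&gt;,
&lt;code&gt;prev-location&lt;/code&gt;, &lt;code&gt;run-path&lt;/code&gt; and the location numbers are made up
for the illustration.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (rnrs))

;; Assumption: a 64 KiB map, as in the classic AFL design.
(define map-size 65536)
(define shared-map (make-bytevector map-size 0))
(define prev-location 0)

;; cur-location plays the role of (compile-time-random) and must be
;; below map-size. Bump the counter for the edge from the previous
;; location to this one, then remember where we are.
(define (afl-mutate cur-location)
  (let* ((index (bitwise-xor cur-location prev-location))
         (hits (bytevector-u8-ref shared-map index)))
    (bytevector-u8-set! shared-map index (mod (+ hits 1) 256))
    (set! prev-location (bitwise-arithmetic-shift-right cur-location 1))))

;; Replay a list of branch locations and return a copy of the map.
(define (run-path locations)
  (bytevector-fill! shared-map 0)
  (set! prev-location 0)
  (for-each afl-mutate locations)
  (bytevector-copy shared-map))

;; Same first branch, then either the conseq or the altern location.
(define via-conseq (run-path '(1234 40000)))
(define via-altern (run-path '(1234 51111)))

(write (bytevector=? via-conseq via-altern)) ; the two maps differ
(newline)
(write (bytevector=? via-conseq (run-path '(1234 40000)))) ; same path, same map
(newline)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Shifting the previous location before storing it keeps the edge from A
to B distinct from the edge from B to A.&lt;/p&gt;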
&lt;p&gt;By the way, when you use the &lt;code&gt;-fcoverage=afl++&lt;/code&gt; flag with Loko you
also get an instrumented standard library. AFL++ can therefore
see into &lt;code&gt;(rnrs)&lt;/code&gt;, &lt;code&gt;(scheme base)&lt;/code&gt;, etc., and fuzz them along with
your code. This lets AFL++ be smarter when it searches for
bugs that are triggered by how your program interacts with the
standard library, which would otherwise be a black box.&lt;/p&gt;
&lt;h1 id=&quot;run-the-fuzzer&quot;&gt;Run the Fuzzer&lt;/h1&gt;
&lt;p&gt;With the binary built you can start the fuzzer:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ env AFL_CRASH_EXITCODE=70 afl-fuzz -i inputs \
    -o outputs -- ./coverage
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;It will tell you if something is wrong with the program. Otherwise it
starts up a screen that looks like this:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/articles/fuzzing-scheme-with-aflplusplus/aflplusplus.png&quot; alt=&quot;The status screen of AFL++ shows progress, findings and various statistics&quot; title=&quot;AFL++ running on Laesare&quot;&gt;&lt;/p&gt;
&lt;p&gt;Not all crashes and timeouts are necessarily unique; many of them are
likely to be triggered by the same bug.&lt;/p&gt;
&lt;p&gt;The speed of fuzzing can shift over time, but I commonly see around
&lt;s&gt;2500&lt;/s&gt; 4300 executions/sec using a single core on my machine. This can be
further sped up by using multiple cores and system tuning that the
AFL++ manual can tell you more about.&lt;/p&gt;
&lt;h1 id=&quot;analyze-the-outputs&quot;&gt;Analyze the outputs&lt;/h1&gt;
&lt;p&gt;If the “findings in depth” box reports crashes and timeouts then you
can go and look in the output directory. The &lt;code&gt;outputs/default/crashes&lt;/code&gt;
directory contains files that you can just feed directly into your
test program:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ ./coverage &amp;lt; outputs/default/crashes/id:000000,*
…
 Frame 2 has return address #x26987C.
  Local 3: #&amp;lt;closure dynamic-wind /usr/local/lib/loko/loko/runtime/control.loko.sls:164:0&amp;gt;
  Local 4: #[reader port: #&amp;lt;textual-input-port &amp;quot;*stdin*&amp;quot; fd: 0&amp;gt;
                    filename: &amp;quot;stdin&amp;quot; line: 1 column: 40 saved-line: 1
                    saved-column: 10 fold-case?: #f mode: r6rs tolerant?: #f]
  Local 5: &amp;amp;lexical
 Frame 3 has return address #x24282F.
  Local 0: #f
  Local 1: 0
  Local 2: 0
End of stack trace.
The condition has 6 components:
 1. &amp;amp;assertion &amp;amp;violation &amp;amp;serious
 2. &amp;amp;who: get-token
 3. &amp;amp;who: &amp;quot;/usr/local/lib/loko/laesare/reader.sls:460:0&amp;quot;
 4. &amp;amp;message: &amp;quot;Type error: expected a fixnum&amp;quot;
 5. &amp;amp;program-counter: #x2E68F1
 6. &amp;amp;continuation
     k: #&amp;lt;closure continuation /usr/local/lib/loko/loko/runtime/control.loko.sls:152:21&amp;gt;
End of condition components.
…
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This tells us there is a crash in the &lt;code&gt;get-token&lt;/code&gt; procedure.
Unfortunately this is a huge procedure and Loko does not yet generate
DWARF information that lets us get source line information from the
instruction pointer. We know that something in &lt;code&gt;get-token&lt;/code&gt; expected
a fixnum, but Loko is a bit sloppy with the &lt;code&gt;&amp;amp;who&lt;/code&gt; condition when
it comes to assertions from inlined built-ins.&lt;/p&gt;
&lt;p&gt;We can use &lt;code&gt;objdump -d ./coverage&lt;/code&gt; and look for the instruction at
&lt;code&gt;0x2E68F1&lt;/code&gt; or the instruction that jumps to that address and try
to make sense of the context. But there is another way.&lt;/p&gt;
&lt;h2 id=&quot;more-tools-for-easier-fun&quot;&gt;More Tools for Easier Fun&lt;/h2&gt;
&lt;p&gt;AFL++ comes with tools that let you analyze the crashes and minimize
the inputs. When you report a bug found using a fuzzer it is important
to first use a minimizer to find a minimal reproducer. The person
reading your bug report does not want to have to guess which parts of
the input are relevant and which are noise. This also applies to us
when we’re the ones using AFL++ for our own code.&lt;/p&gt;
&lt;h2 id=&quot;minimize-with-afl-tmin&quot;&gt;Minimize with afl-tmin&lt;/h2&gt;
&lt;p&gt;Here is how you run the minimizer:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ env AFL_CRASH_EXITCODE=70 afl-tmin  \
  -i outputs/default/crashes/id:000000,* \
  -o crash -- ./coverage
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The afl-tmin program uses the same binary as before but has a
different goal: remove as much of the input as possible.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/articles/fuzzing-scheme-with-aflplusplus/afl-tmin.png&quot; alt=&quot;afl-tmin reduced the file size by 33.87%&quot; title=&quot;afl-tmin shrinking an input file&quot;&gt;&lt;/p&gt;
&lt;p&gt;This is what happened to the file:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ hexdump outputs/default/crashes/id:000000,*
0000000 0023 0100 2300 0000 0001 5c23 4678 4646
0000010 4646 4646 4646 4646 4646 4646 4646 4646
0000020 4646 4646 4646 4646 4623 4646 1821 3070
0000030 1818 1818 1818 1702 1818 5c00 7038
000003e
$ hexdump crash
0000000 5c23 4678 3030 3030 3030 3030 3030 3030
0000010 3030 0030                              
0000013
$ cat -vet crash
#\xF000000000000000
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;AFL++ has just told us that laesare crashes if it attempts to read a
large character constant! That bug is now &lt;a href=&quot;https://gitlab.com/weinholt/laesare/-/commit/1d996c67883ed2ba23f459d6c5bd9c8ffc16b1ec&quot;&gt;fixed in the git repo&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Perhaps even more impressive is that the minimizer works even when the
program hangs. It turns out that &lt;code&gt;(string-&amp;gt;number &amp;quot;0F800000&amp;quot;)&lt;/code&gt; hangs
Loko Scheme. Oops again!&lt;/p&gt;
&lt;h2 id=&quot;analyze-with-afl-analyze&quot;&gt;Analyze with afl-analyze&lt;/h2&gt;
&lt;p&gt;The analyzer is another fun tool and here is how you run it:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ afl-analyze -i crash -- ./coverage
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;&lt;img src=&quot;/articles/fuzzing-scheme-with-aflplusplus/afl-analyze.png&quot; alt=&quot;afl-analyze categorizes each byte in the file&quot; title=&quot;afl-analyze has analyzed the input&quot;&gt;&lt;/p&gt;
&lt;p&gt;In this case we already know that the problem is a character constant
that is too large, so it is not telling us anything new. But it has
figured out that the middle zeros do not really affect the program
flow, which can be useful information when analyzing other test cases.&lt;/p&gt;
&lt;h1 id=&quot;tl-dr&quot;&gt;tl;dr&lt;/h1&gt;
&lt;p&gt;Fuzzing is a powerful technique that automatically searches for inputs
that crash or hang your program. Loko Scheme can now be used with
AFL++ to fuzz Scheme programs.&lt;/p&gt;
&lt;p&gt;Write a program &lt;code&gt;coverage.sps&lt;/code&gt; that passes standard input to the
procedure you want to test and then compile it with a recent Loko
Scheme:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-sh&quot;&gt;mkdir inputs
&lt;span class=&quot;built_in&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;string&quot;&gt;'()'&lt;/span&gt; &amp;gt; inputs/nil
.akku/env loko -ftarget=linux -fcoverage=afl++ --compile coverage.sps
env AFL_CRASH_EXITCODE=70 afl-fuzz -i inputs -o outputs -- ./coverage
&lt;/code&gt;&lt;/pre&gt;
</description>
    </item>
    <item>
      <title>Akku website updates</title>
      <link>https://weinholt.se/articles/akku-website-updates/</link>
      <pubDate>Sun, 15 Jan 2023 01:00:00 +0100</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/akku-website-updates/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;I have had some free time recently between working for clients, and
took this opportunity to implement new features
for &lt;a href=&quot;https://akkuscm.org/&quot;&gt;Akku’s website&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;In case you did not know, &lt;a href=&quot;https://akkuscm.org/&quot;&gt;Akku is a package manager&lt;/a&gt; with
features specially designed for R6RS and R7RS Scheme.&lt;/p&gt;
&lt;p&gt;The library systems in Scheme make it possible to automatically
analyze source code to find libraries, exports and imports. Akku
combines such analysis with a package index that lets you install
packages from the command line, automatically resolving dependencies
and placing files in the right place. Dependencies and installed files
are project-specific, so you can concentrate on each project
separately.&lt;/p&gt;
&lt;h1 id=&quot;search-box&quot;&gt;Search box&lt;/h1&gt;
&lt;p&gt;The top of the page now has a handy search box:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;img src=&quot;/articles/akku-website-updates/search-box.png&quot; alt=&quot;DuckDuckGo site search&quot;&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;It uses DuckDuckGo, which I think is a good balance between privacy
and functionality.&lt;/p&gt;
&lt;p&gt;At some point it would be good to set up a custom search engine which
also searches through all source code.&lt;/p&gt;
&lt;h1 id=&quot;who-uploaded-the-package-&quot;&gt;Who uploaded the package?&lt;/h1&gt;
&lt;p&gt;Akku has over 500 packages in its index. Many of them have been
uploaded by me personally when I initially set up the archive.
Packaging is really simple due to the automatic analysis built into
Akku, so when packaging something new I just needed to read through
the code to see that nothing bad was going on, then add the required
dependencies and write a description.&lt;/p&gt;
&lt;p&gt;The package uploader is identified through their OpenPGP signature on
the uploaded package
at &lt;a href=&quot;https://archive.akkuscm.org/archive/packages/&quot;&gt;https://archive.akkuscm.org/archive/packages/&lt;/a&gt;. The website
generator uses this signature to figure out who uploaded the package
and shows their name if it is different from the (single) package
author.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;img src=&quot;/articles/akku-website-updates/uploader.png&quot; alt=&quot;Authors box showing an uploader&amp;#39;s name&quot;&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;There’s an exception for packages mirrored from Snow. Those are all
signed by me anyway, as there is no way to directly upload a Snow
package to Akku. Those all have to go
through &lt;a href=&quot;https://snow-fort.org&quot;&gt;snow-fort.org&lt;/a&gt; first.&lt;/p&gt;
&lt;p&gt;Anyone can upload packages to Akku’s archive, so if you find some cool
R6RS project that’s missing in Akku then you can go ahead and upload
it. See the man page for instructions. (If you want to upload a new
version of a package that already exists then that’s okay too, but if
the author themselves uploaded the previous version then please check
with them first.)&lt;/p&gt;
&lt;p&gt;Before packages are published in the archive they are manually
reviewed to verify that they’re not up to any funny business. I think
this is the only way to truly prevent the attacks that regularly
happen to the larger package repositories for other languages. The
work is manageable today and if Akku gets more popular then it should
still be sustainable with increased automation to help with the
tedious parts of the manual review.&lt;/p&gt;
&lt;h1 id=&quot;where-do-i-get-that-library-&quot;&gt;Where do I get that library?&lt;/h1&gt;
&lt;p&gt;If you have some Scheme code in front of you that imports a library
then you might like to find the package that contains it. There is now
a new page with just such an index:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;img src=&quot;/articles/akku-website-updates/libraries.png&quot; alt=&quot;Image showing the Libraries page&quot;&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The index is still small enough that everything fits on one page. On
the left you see libraries, which are tagged as R6RS libraries, R7RS
libraries or implementation-specific modules. On the right you see
which packages contain the library.&lt;/p&gt;
&lt;p&gt;A library can exist in multiple packages, so you will see some
libraries with links to multiple packages. When Akku encounters this
situation it is the order of the dependencies in Akku.manifest that
determines which variant “wins”: dependencies can overwrite files from
dependencies specified earlier in the list.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;img src=&quot;/articles/akku-website-updates/multiple-packages.png&quot; alt=&quot;Image shows the (sdl2) package on the libraries page&quot;&gt;&lt;/p&gt;
&lt;/blockquote&gt;
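&lt;p&gt;The overwrite rule can be modelled in a few lines of Scheme. This is a
sketch of the idea only, not Akku’s implementation; the package names,
file names and the &lt;code&gt;resolve-owners&lt;/code&gt; procedure are made up.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (rnrs))

;; Each entry is (package-name file ...), in manifest order.
(define dependencies
  '((&quot;pkg-a&quot; &quot;sdl2.sls&quot; &quot;a.sls&quot;)
    (&quot;pkg-b&quot; &quot;sdl2.sls&quot;)))

;; Install files in order; a later package replaces an earlier owner.
(define (resolve-owners deps)
  (let ((owners (make-hashtable string-hash string=?)))
    (for-each
     (lambda (dep)
       (for-each (lambda (file)
                   (hashtable-set! owners file (car dep)))
                 (cdr dep)))
     deps)
    owners))

(define owners (resolve-owners dependencies))

(display (hashtable-ref owners &quot;sdl2.sls&quot; #f)) ; pkg-b, which is listed last
(newline)
&lt;/code&gt;&lt;/pre&gt;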
&lt;h1 id=&quot;who-exports-this-identifier-&quot;&gt;Who exports this identifier?&lt;/h1&gt;
&lt;p&gt;Akku’s archive also has information about what identifiers are
exported by each library, so I have made a page where you can look for
identifiers and find the relevant libraries and packages.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;img src=&quot;/articles/akku-website-updates/identifiers.png&quot; alt=&quot;Image showing the packages export &amp;quot;fail&amp;quot;, etc&quot;&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I’m not too happy about the usability of this page, and ideas for
improvements are very welcome. I tried putting it all on one page,
thinking that at least then you can search in the browser, but Mr.
Browser got slow and the page was 11MB. So instead there’s an awkward
split into multiple pages. It should however make search engines
happy, and that’s good enough for now.&lt;/p&gt;
&lt;h1 id=&quot;what-s-in-the-package-&quot;&gt;What’s in the package?&lt;/h1&gt;
&lt;p&gt;Akku’s analyzer knows what is in each package and the archive software
has been publishing this information for some time, but it has never
been visualized before. The only information you got on the website
was a synopsis, a list of authors, maybe a description and a reference
to the source code.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;img src=&quot;/articles/akku-website-updates/contents.png&quot; alt=&quot;Package contents of the spdx package&quot;&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The new “Package contents” box lists all libraries and modules that
are found in the package. The first row under each library name is the
list of exported identifiers, followed by one row for each imported
library. Meta-information is added to the rows, like a little tag
showing if it’s an R6RS library, an R7RS library, or an
implementation-specific module format.&lt;/p&gt;
&lt;p&gt;A library name can show up multiple times if there are
implementation-specific variants, e.g., one variant for Chez Scheme
and another one for Chibi-Scheme. This is also shown with a little tag
on the package name.&lt;/p&gt;
&lt;p&gt;SRFI library names are linked to the relevant page
on &lt;a href=&quot;https://srfi.schemers.org&quot;&gt;https://srfi.schemers.org&lt;/a&gt;. This type of linking is something that
should be expanded on later, but for now that’s the only type of link
you will see on a library.&lt;/p&gt;
&lt;p&gt;I think this type of information adds a lot to the package pages. You
get a deeper insight into what libraries there are, what libraries
they use, and you can quickly see if the package is missing library
variants for your Scheme implementation. An example is
the &lt;a href=&quot;https://akkuscm.org/packages/wak-common/&quot;&gt;wak-common&lt;/a&gt; package and
its (wak private include compat) library, which exists only for Chez
Scheme, GNU Guile, Ikarus, Mosh, Racket and Ypsilon. However, if you
look in the library index then you can see that the akku package also
has a variant of this library for Loko Scheme.&lt;/p&gt;
&lt;h1 id=&quot;future-work&quot;&gt;Future work&lt;/h1&gt;
&lt;p&gt;There is more I would like to add to the website, and these are
not such small projects:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Documentation. Some packages have proper documentation and it would
be good to link to this. It might require an update to Akku’s
package format so that it will be built properly, e.g., if there are
PDFs to generate. It’s also common for other languages to have
automatically extracted documentation from comments, but today there
is no widespread tool for Scheme that does this.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Test reports. Many packages have automatic test suites, but Akku
does nothing with them at the moment. An even simpler test would be
to just try importing each library in each Scheme and see if that
works.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Suggestions and merge requests are welcome at the website’s project
page: &lt;a href=&quot;https://gitlab.com/akkuscm/akku-web&quot;&gt;https://gitlab.com/akkuscm/akku-web&lt;/a&gt;.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>Loko Scheme 2022 Q4 Update</title>
      <link>https://weinholt.se/articles/loko-scheme-2022-q4/</link>
      <pubDate>Sun, 06 Nov 2022 01:00:00 +0100</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/loko-scheme-2022-q4/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;I released &lt;a href=&quot;https://scheme.fail/&quot;&gt;Loko Scheme&lt;/a&gt; 0.12.0 last month and forgot to blog
about it. I’ve been busy starting my own consulting company so it just
slipped my mind. There are two cool milestones with this release.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;h1 id=&quot;self-compilation-on-bare-metal&quot;&gt;Self-compilation on bare metal&lt;/h1&gt;
&lt;p&gt;A cool milestone in 0.12.0 is one of those things that is pretty
significant but that you can’t really demonstrate visually.&lt;/p&gt;
&lt;p&gt;I have implemented enough of the Linux syscall layer that I was able
to run Loko’s compiler on bare metal. I used an old Acer laptop to
compile Loko itself while running only Loko on the laptop. Many
compilers can compile themselves but this is a new extreme.&lt;/p&gt;
&lt;h1 id=&quot;valand-a-windowing-system&quot;&gt;Valand, a windowing system&lt;/h1&gt;
&lt;p&gt;Loko now has a windowing system called Valand. Its design is somewhat
inspired by Wayland, except it’s integrated in the kernel and is meant
to be used on bare metal. So Loko on bare metal now has support for
running multiple graphical programs with preemptive multitasking. You
can even run Doom through a port
of &lt;a href=&quot;https://gitlab.com/weinholt/doomgeneric&quot;&gt;doomgeneric&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;/articles/loko-scheme-2022-q4/doom-loko-0-12-0.png&quot;&gt;&lt;img src=&quot;/articles/loko-scheme-2022-q4/doom-loko-0-12-0.png_thumb.jpg&quot; alt=&quot;Doom on Loko screenshot&quot;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The way this works is through an extension to the Linux syscall ABI
emulation. When you cross-compile doomgeneric on Linux you get an ELF
binary that you can copy to the hard drive and then load with
&lt;code&gt;@/doomgeneric&lt;/code&gt; in the REPL window. That starts a doomgeneric process
that opens &lt;code&gt;/dev/valand&lt;/code&gt;, which gives it a file descriptor for Valand.&lt;/p&gt;
&lt;p&gt;The Valand file descriptor supports an &lt;code&gt;ioctl&lt;/code&gt; for creating a
graphical surface which is then mapped into the process memory with
&lt;code&gt;mmap&lt;/code&gt;. Doomgeneric writes pixel data to this memory and calls another
&lt;code&gt;ioctl&lt;/code&gt; to mark the surface as &lt;em&gt;damaged&lt;/em&gt;. Valand regularly fixes the
damages by copying the damaged pixels to the framebuffer, which means
that the screen is updated with a new frame from the game.&lt;/p&gt;
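&lt;p&gt;The damage repair step can be pictured roughly like this. It is a
simplified sketch and not Valand’s actual code: it assumes 32-bit
pixels and a surface that covers the whole framebuffer, and
&lt;code&gt;repair-damage!&lt;/code&gt; is a made-up name.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (rnrs))

;; Copy the damaged rectangle (x, y, w, h) from the surface to the
;; framebuffer. Both are row-major bytevectors with 4 bytes per pixel
;; and the same width in pixels.
(define (repair-damage! surface framebuffer width x y w h)
  (do ((row y (+ row 1)))
      ((= row (+ y h)))
    (let ((offset (* 4 (+ (* row width) x))))
      (bytevector-copy! surface offset framebuffer offset (* 4 w)))))

(define width 4)
(define height 2)
(define surface (make-bytevector (* 4 width height) 255))
(define framebuffer (make-bytevector (* 4 width height) 0))

;; Two pixels on the first row were marked as damaged.
(repair-damage! surface framebuffer width 1 0 2 1)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Only the damaged pixels are copied, so a small change on screen costs
a small copy.&lt;/p&gt;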
&lt;p&gt;Keyboard events are returned by doing a non-blocking &lt;code&gt;read&lt;/code&gt; on the
Valand file descriptor. If there is an event then it’s returned as a
struct that specifies a USB HID page and usage. Using USB HID means
that there is no need to invent yet another scancode table just for
Loko.&lt;/p&gt;
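&lt;p&gt;As a small illustration of why this is convenient: the USB HID usage
tables put keyboards on usage page &lt;code&gt;#x07&lt;/code&gt;, with usages &lt;code&gt;#x04&lt;/code&gt; through
&lt;code&gt;#x1D&lt;/code&gt; for the letters a to z. The sketch below decodes such a pair.
Passing the event as two plain integers is an assumption made for the
example, and &lt;code&gt;event-&amp;gt;letter&lt;/code&gt; is a made-up name; the real struct
layout is not shown here.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (rnrs))

(define keyboard-page #x07)

;; Returns a lowercase letter for usages #x04 to #x1D on the keyboard
;; page, or #f for anything else.
(define (event-&amp;gt;letter page usage)
  (and (= page keyboard-page)
       (&amp;lt;= #x04 usage #x1D)
       (integer-&amp;gt;char (+ (char-&amp;gt;integer #\a) (- usage #x04)))))

(write (event-&amp;gt;letter #x07 #x04)) ; the letter a
(newline)
(write (event-&amp;gt;letter #x07 #x1D)) ; the letter z
(newline)
&lt;/code&gt;&lt;/pre&gt;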
&lt;p&gt;Valand keeps track of surfaces and composes an image from them. The
composing magic is done with a bunch of rectangle math and a z-buffer.
It is all done in Scheme code compiled to native machine code by Loko.
I haven’t benchmarked it, but it’s fast enough to not be laggy.&lt;/p&gt;
&lt;p&gt;It’s not much but it’s enough to get Doom running. You might notice
that there are no title bars and controls on the windows. There’s very
little that the window system gives you in the current version. You
can move windows and keyboard focus will follow the mouse. Valand is
starting out small and simple.&lt;/p&gt;
&lt;h1 id=&quot;dreaming-up-what-s-next&quot;&gt;Dreaming up what’s next&lt;/h1&gt;
&lt;p&gt;The next milestone could be to port an editor. With an editor running
on Loko and Valand it would in principle be possible to keep
developing Loko without using another OS. I’m thinking that the fork
of uEmacs/PK that Torvalds maintains should be pretty simple to port.
Loko doesn’t have a terminal emulator, not even a tty layer, but you
could build the terminal renderer into the uEmacs binary and have it
use Valand for the UI. &lt;strong&gt;Update 2023-02-05: I just learned that uEmacs
has a non-free license. I will find something else.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;And I intend for Valand to be an integral part of the operating system
that I’m building with Loko. This will make it possible to do some
things that you can’t do in an OS like GNU/Linux where these
components are much more loosely coupled. The Linux kernel has no idea
about the desktop environment you’re using, which is the right thing
for its design, but which also limits what can be done.&lt;/p&gt;
&lt;p&gt;The tighter coupling means that Valand can provide a &lt;em&gt;trusted path&lt;/em&gt;.
The user should have a way into the system which they know with
certainty can’t be faked. The system menu on top of the screen will be
one such trusted path. It’s a placeholder in the screenshot shown
above, but you can imagine something like the macOS menu. Window
decorations will be another trusted path; it should not be possible to
fake them.&lt;/p&gt;
&lt;h1 id=&quot;a-mini-rant&quot;&gt;A mini-rant&lt;/h1&gt;
&lt;p&gt;Linux systems sometimes freeze because the kernel overcommits memory
and under heavy memory pressure begins discarding the pages of
demand-paged executables. The kernel can basically decide to discard
all of user space in favor of a rogue memory hog, so user space grinds
to a halt.&lt;/p&gt;
&lt;p&gt;Loko should guarantee that the computer always remains responsive,
even if a program goes rogue and uses up all resources. I’m pretty
weary of my Linux desktop occasionally freezing, so I’m not going to
allow that in Loko.&lt;/p&gt;
&lt;p&gt;And I don’t want to support anything that steals keyboard focus. Not
even dialogue windows. Imagine typing and knowing with utter certainty
where your keystrokes will be sent. I haven’t experienced that since
DOS.&lt;/p&gt;
&lt;h1 id=&quot;so-when-1-0-&quot;&gt;So when 1.0?&lt;/h1&gt;
&lt;p&gt;Obviously a version number like 0.12.0
is &lt;a href=&quot;https://0ver.org/&quot;&gt;getting ridiculous&lt;/a&gt; and it’s time for 1.0.0
soon. The big milestone that I’ve been wanting to reach before 1.0.0
is to make &lt;code&gt;eval&lt;/code&gt; use the compiler. I’ve been putting it off, even
though it’s not really all that difficult. Perhaps I’ll get to it once
my company is off the ground.&lt;/p&gt;
&lt;!-- # Possibly a hiatus --&gt;
&lt;!-- I will take a break away from working on Loko for a couple months --&gt;
&lt;!-- while my business starts up. --&gt;
</description>
    </item>
    <item>
      <title>Cond-expand and #ifdef</title>
      <link>https://weinholt.se/articles/cond-expand-and-ifdef/</link>
      <pubDate>Thu, 09 Jun 2022 02:00:00 +0200</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/cond-expand-and-ifdef/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;In the C programming language you can ask the macro preprocessor to
keep or remove part of a source file. This is done with &lt;code&gt;#ifdef&lt;/code&gt;. The
equivalent in Scheme is called &lt;code&gt;cond-expand&lt;/code&gt;. R7RS Scheme has two
different instances of &lt;code&gt;cond-expand&lt;/code&gt;, while R6RS Scheme does not have
it at all. What does R6RS do instead, and is &lt;code&gt;cond-expand&lt;/code&gt; a bad idea?&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;h1 id=&quot;use-cases&quot;&gt;Use cases&lt;/h1&gt;
&lt;p&gt;What is &lt;code&gt;#ifdef&lt;/code&gt;, and its cousins &lt;code&gt;#if&lt;/code&gt; and &lt;code&gt;#ifndef&lt;/code&gt;, used for? And why
might we want its equivalent in Scheme? There are two major use cases
for &lt;code&gt;#ifdef&lt;/code&gt;: build-time configuration and portability. The build
system will usually have some sort of configuration script that
detects what system it’s running on. Usually these scripts also let
the user enable or disable features.&lt;/p&gt;
&lt;p&gt;Chez Scheme runs on top of a chunk of fairly portable C and uses
&lt;code&gt;#ifdef&lt;/code&gt; as described:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ grep -h &amp;#39;^#ifdef&amp;#39; ChezScheme/c/*.[ch] |awk &amp;#39;{print $2}&amp;#39; \
   | sort -u | xargs
ARCHYPERBOLIC ARMV6 BSDI CHAFF CHECK_FOR_ROSETTA CLOCK_HIGHRES
CLOCK_MONOTONIC CLOCK_MONOTONIC_HR CLOCK_PROCESS_CPUTIME_ID
CLOCK_REALTIME CLOCK_REALTIME_HR CLOCK_THREAD_CPUTIME_ID DEBUG
DEFINE_MATHERR DISABLE_CURSES EINTR ENABLE_OBJECT_COUNTS
FEATURE_EXPEDITOR FEATURE_ICONV FEATURE_PTHREADS FEATURE_WINDOWS FLOCK
FLUSHCACHE FunCRepl GETWD HANDLE_SIGWINCH HPUX I386 IEEE_DOUBLE ITEST
KEEPSMALLPUPPIES LIBX11 LITTLE_ENDIAN_IEEE_DOUBLE LOAD_SHARED_OBJECT
LOCKF LOG1P LOOKUP_DYNAMIC MACOSX MAP_32BIT __MINGW32__ MMAP_HEAP
NAN_INCLUDE NO_DIRTY_NEWSPACE_POINTERS NOISY
NO_LOCKED_OLDSPACE_OBJECTS PPC32 PROMPT PTHREADS SA_INTERRUPT
SA_RESTART SAVEDHEAPS segment_t2_bits segment_t3_bits SIGBUS SIGQUIT
SOLARIS SPARC SPARC64 TIOCGWINSZ USE_MBRTOWC_L WIN32 _WIN64 WIPECLEAN
X86_64
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Macros like &lt;code&gt;FEATURE_EXPEDITOR&lt;/code&gt; turn on and off functionality, while
macros like &lt;code&gt;HPUX&lt;/code&gt; and &lt;code&gt;I386&lt;/code&gt; are used for portability. So,
configuration and portability.&lt;/p&gt;
&lt;h1 id=&quot;portability-&quot;&gt;Portability?&lt;/h1&gt;
&lt;p&gt;Does &lt;code&gt;#ifdef&lt;/code&gt; truly help with portability? It can certainly seem this
way, but there’s a different way to think about this issue. This is
what &lt;a href=&quot;https://lwn.net/ml/tuhs/CAKzdPgw3F9snv-kO+tE=rE2Q_wh_7AKxVaZ9gXFoCxaX6pgBkA@mail.gmail.com/&quot;&gt;Rob Pike had to say on the TUHS mailing list&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;C with #ifdefs is not portable, it is a collection of 2^n overlaid
programs, where n is the number of distinct #if[n]def tags. It’s too bad
the problems of that approach were not appreciated by the C standard
committee, who mandated the #ifndef guard approach that I’m sure could
count as a provable billion dollar mistake, probably much more. The cost of
building #ifdef’ed code, especially with C++, which decided to be more
fine-grained about it, is unfathomable.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;For each C file with &lt;code&gt;#ifdef&lt;/code&gt; you need to understand what happens if
the condition is true versus if it’s false. It’s simple with just one
&lt;code&gt;#ifdef&lt;/code&gt;, but the problem grows exponentially.&lt;/p&gt;
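&lt;p&gt;To put a number on that growth, here is a minimal sketch in R6RS
Scheme that enumerates every combination of three tags borrowed from
the list above. Each combination is one of Pike’s overlaid programs.
The procedure name &lt;code&gt;configurations&lt;/code&gt; is made up for this
illustration, and the sketch assumes the tags are independent of each
other.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (rnrs))

;; Returns all subsets of TAGS. Each subset is the set of tags that
;; are defined in one possible build configuration.
(define (configurations tags)
  (if (null? tags)
      '(())
      (let ((without (configurations (cdr tags))))
        (append without
                (map (lambda (c) (cons (car tags) c))
                     without)))))

(display (length (configurations '(HPUX I386 PTHREADS)))) ; prints 8
(newline)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Three tags already give eight programs to reason about, and each
additional tag doubles that number.&lt;/p&gt;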
&lt;h1 id=&quot;configuration-&quot;&gt;Configuration?&lt;/h1&gt;
&lt;p&gt;You can use &lt;code&gt;#ifdef&lt;/code&gt; for conditional compilation, to turn on and off
features. But this can also create a mess like that described by Pike
above. &lt;a href=&quot;https://www.gnu.org/prep/standards/standards.html#Conditional-Compilation&quot;&gt;The GNU Coding Standards&lt;/a&gt; have this to say:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;When supporting configuration options already known when building
your program we prefer using &lt;code&gt;if (... )&lt;/code&gt; over conditional compilation,
as in the former case the compiler is able to perform more extensive
checking of all possible code paths.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The same sentiment is echoed by &lt;a href=&quot;https://lwn.net/ml/tuhs/CAKH6PiVCk6gSv-WVztRUiJrOt3QHVi1pCVEKzw1RcEi+m+G=bw@mail.gmail.com/&quot;&gt;Douglas McIlroy in the message that
preceded Pike’s message above&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This approach is generally a good idea. All those conditionals give
you a lot of code paths. Checking that all of them even compile is
difficult to do by hand, and you need all the help you can get. When
you use &lt;code&gt;#ifdef&lt;/code&gt; you hide the code from the compiler. The compiler
can’t check code that it can’t see. Using &lt;code&gt;if (... )&lt;/code&gt; when the
expression is constant at compile time should give the same result as
conditional inclusion, at least if you have an optimizing compiler.&lt;/p&gt;
&lt;h1 id=&quot;also-considered-harmful&quot;&gt;Also Considered Harmful&lt;/h1&gt;
&lt;p&gt;It’s not just people on the Internet saying these things about
&lt;code&gt;#ifdef&lt;/code&gt;, and the complaints are not new either.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We believe that a C programmer’s impulse to use &lt;code&gt;#ifdef&lt;/code&gt; in an attempt at portability is
usually a mistake. Portability is generally the result of advance planning rather than trench
warfare involving &lt;code&gt;#ifdef&lt;/code&gt;. In the course of developing C News on different systems, we
evolved various tactics for dealing with differences among systems without producing a
welter of &lt;code&gt;#ifdef&lt;/code&gt;s at points of difference. We discuss the alternatives to, and occasional
proper use of, &lt;code&gt;#ifdef&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Source: SPENCER, Henry; COLLYER, Geoff. &lt;a href=&quot;https://www.usenix.org/legacy/publications/library/proceedings/sa92/spencer.pdf&quot;&gt;#ifdef considered harmful, or portability experience with C News.&lt;/a&gt; In: &lt;em&gt;USENIX Summer 1992 Technical Conference&lt;/em&gt;. 1992.&lt;/p&gt;
&lt;p&gt;You can
use &lt;a href=&quot;https://scholar.google.com/scholar?cites=7662692707746701882&quot;&gt;Google Scholar to find papers that cite this paper&lt;/a&gt;, if
you’re interested in more reading.&lt;/p&gt;
&lt;h1 id=&quot;cond-expand-is-not-as-bad-&quot;&gt;cond-expand is not as bad…&lt;/h1&gt;
&lt;p&gt;The first standardization of &lt;code&gt;cond-expand&lt;/code&gt; that I know of
is &lt;a href=&quot;https://srfi.schemers.org/srfi-0/srfi-0.html&quot;&gt;Marc Feeley’s SRFI-0&lt;/a&gt;. It is also part
of &lt;a href=&quot;https://github.com/johnwcowan/r7rs-spec/blob/errata/spec/r7rs.pdf&quot;&gt;R7RS Scheme&lt;/a&gt;, where I believe it has seen wider adoption
than plain SRFI-0.&lt;/p&gt;
&lt;p&gt;There is one major difference between &lt;code&gt;#ifdef&lt;/code&gt; and &lt;code&gt;cond-expand&lt;/code&gt;. The
former is handled by a preprocessor that does not understand the
lexical syntax of the language it is working with. You can even use
cpp with other languages than C, e.g. assembly. This means you can
easily introduce latent syntax errors with &lt;code&gt;#ifdef&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;There is a &lt;code&gt;cond-expand&lt;/code&gt; available from inside &lt;code&gt;define-library&lt;/code&gt; and
another one available as syntax in &lt;code&gt;(scheme base)&lt;/code&gt;. Both of these are
handled after the source file has been parsed. This means that
&lt;code&gt;cond-expand&lt;/code&gt; cannot create an unbalanced syntax tree. What I mean by
this is that you can’t somehow use &lt;code&gt;cond-expand&lt;/code&gt; wrong in such a way
that the parentheses become unbalanced. To make this mistake with
&lt;code&gt;#ifdef&lt;/code&gt; is trivial; simply place &lt;code&gt;}&lt;/code&gt; before &lt;code&gt;#endif&lt;/code&gt; when it should
have been after, or vice versa.&lt;/p&gt;
&lt;p&gt;You get bonus points, so to speak, if you do this near code that
handles portability to operating systems that you can’t test your
changes on.&lt;/p&gt;
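&lt;p&gt;To illustrate the point about balanced syntax, here is a small
sketch of how the expression form from &lt;code&gt;(scheme base)&lt;/code&gt; might
be used. The procedure name &lt;code&gt;lambda-name&lt;/code&gt; is invented for
this example, while &lt;code&gt;full-unicode&lt;/code&gt; is one of the standard
R7RS feature identifiers. Each clause is a complete datum, so the
reader has already balanced every parenthesis before
&lt;code&gt;cond-expand&lt;/code&gt; gets to pick one.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (scheme base) (scheme write))

;; Both clauses are read in full, whichever one ends up selected.
;; A clause cannot contain half of an expression.
(define (lambda-name)
  (cond-expand
    (full-unicode &quot;λ&quot;)
    (else &quot;lambda&quot;)))

(display (lambda-name))
(newline)
&lt;/code&gt;&lt;/pre&gt;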
&lt;h1 id=&quot;-but-not-really-better&quot;&gt;… but not really better&lt;/h1&gt;
&lt;p&gt;Apart from the differences in how the compiler handles them, they do
actually express the same thing. One can look at &lt;code&gt;cond-expand&lt;/code&gt; as
morally equivalent to a series of &lt;code&gt;#if&lt;/code&gt;, &lt;code&gt;#else&lt;/code&gt; and &lt;code&gt;#endif&lt;/code&gt;
directives. Therefore the very same problems that happen with &lt;code&gt;#ifdef&lt;/code&gt;
also happen with &lt;code&gt;cond-expand&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;I’m not optimistic about the future landscape of R7RS code if
&lt;code&gt;cond-expand&lt;/code&gt; is not recognized for the problems it brings. It may be
that each R7RS library will become a jungle of 2&lt;sup&gt;n&lt;/sup&gt; overlaid
libraries. I have seen some indication of this process already
beginning when looking at the packages in &lt;a href=&quot;https://snow-fort.org/&quot;&gt;Snow Fort&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Fortunately I have not seen any examples where &lt;code&gt;cond-expand&lt;/code&gt; is used
to change which identifiers are exported from a library, but that day
may yet come.&lt;/p&gt;
&lt;h1 id=&quot;back-to-r6rs&quot;&gt;Back to R6RS&lt;/h1&gt;
&lt;p&gt;So in the beginning of this article I wrote that R6RS Scheme does not
have &lt;code&gt;cond-expand&lt;/code&gt;. Does that mean it has another way to handle these
problems?&lt;/p&gt;
&lt;p&gt;No, but in practice: yes. The R6RS report offers only libraries as a
way to handle this, and there isn’t really a way to conditionally
import libraries at compile time.&lt;/p&gt;
&lt;p&gt;This situation has given rise to some creative solutions. The
configuration and portability problems do not disappear just like
that, so people have tried to solve them within the restrictions of the
language.&lt;/p&gt;
&lt;h2 id=&quot;portability-between-r6rs-implementations&quot;&gt;Portability between R6RS implementations&lt;/h2&gt;
&lt;p&gt;I believe that all R6RS Scheme implementations implement the de facto
standard of importing libraries by first looking for them in files
that end with the &lt;code&gt;.&amp;lt;impl&amp;gt;.sls&lt;/code&gt; suffix before they try &lt;code&gt;.sls&lt;/code&gt;. If Chez
Scheme sees &lt;code&gt;(import (foo))&lt;/code&gt; then it first tries &lt;code&gt;foo.chezscheme.sls&lt;/code&gt;
before it tries &lt;code&gt;foo.sls&lt;/code&gt;. This mechanism, even though it’s not in
R6RS, is widely implemented.&lt;/p&gt;
&lt;p&gt;This mechanism is used to create compatibility libraries. One striking
example is the &lt;code&gt;(xitomatl common)&lt;/code&gt; library from Derick
Eddington’s &lt;a href=&quot;https://akkuscm.org/packages/xitomatl/&quot;&gt;xitomatl&lt;/a&gt;. It contains a few procedures that
traditionally appear in Scheme implementations, but which are not in
the reports. Here is &lt;code&gt;common.chezscheme.sls&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;&lt;span class=&quot;comment&quot;&gt;;; Copyright 2009 Derick Eddington.  My MIT-style license is in the file named&lt;/span&gt;
&lt;span class=&quot;comment&quot;&gt;;; LICENSE from the original collection this file is distributed with.&lt;/span&gt;

(&lt;span class=&quot;name&quot;&gt;library&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;xitomatl&lt;/span&gt; common)
  (&lt;span class=&quot;name&quot;&gt;export&lt;/span&gt;
    add1 sub1
    format printf fprintf pretty-print
    gensym
    time
    with-input-from-string with-output-to-string
    system
    &lt;span class=&quot;comment&quot;&gt;;; &lt;span class=&quot;doctag&quot;&gt;TODO:&lt;/span&gt; add to as needed/appropriate&lt;/span&gt;
    )
  (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;import&lt;/span&gt;&lt;/span&gt;
    (&lt;span class=&quot;name&quot;&gt;chezscheme&lt;/span&gt;))
)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;There are matching libraries for Guile, Ikarus, Larceny, Racket, Mosh
and Ypsilon. They export the same identifiers, but they all have some 
tweaks to adapt them to the various implementations.&lt;/p&gt;
&lt;p&gt;This is the “Plan 9” approach to portability, as briefly described
in &lt;a href=&quot;https://weinholt.se/articles/cond-expand-and-ifdef/doug&quot;&gt;the mailing list thread referenced above&lt;/a&gt;. Define APIs and
let the implementation of the API hide the portability problems from
the rest of the program.&lt;/p&gt;
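&lt;p&gt;A stripped-down sketch of the same idea, using a hypothetical
&lt;code&gt;(mylib compat)&lt;/code&gt; library that is not part of xitomatl, could
consist of two files. Chez Scheme picks up the first one and every
other implementation falls back to the second. Both export the same
API, so the rest of the program never needs to know which one it got.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;;; File mylib/compat.chezscheme.sls
;; Chez Scheme already provides add1, so it is simply re-exported.
(library (mylib compat)
  (export add1)
  (import (only (chezscheme) add1)))

;; File mylib/compat.sls
;; Portable fallback for implementations without a specific file.
(library (mylib compat)
  (export add1)
  (import (rnrs))

  (define (add1 n)
    (+ n 1)))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Code that needs &lt;code&gt;add1&lt;/code&gt; just writes &lt;code&gt;(import (mylib compat))&lt;/code&gt;
and contains no implementation-specific conditionals of its own.&lt;/p&gt;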
&lt;h2 id=&quot;configuration-for-r6rs-code&quot;&gt;Configuration for R6RS code&lt;/h2&gt;
&lt;p&gt;What is to be done for configuration? When I need this in my own code,
I create a library that exports the configuration as identifier
syntax. Here’s an abbreviated example from Loko Scheme:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;library&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;loko&lt;/span&gt; arch amd64 config)
  (&lt;span class=&quot;name&quot;&gt;export&lt;/span&gt;
    &lt;span class=&quot;comment&quot;&gt;; ...&lt;/span&gt;
    use-popcnt)
  (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;import&lt;/span&gt;&lt;/span&gt;
    (&lt;span class=&quot;name&quot;&gt;rnrs&lt;/span&gt;))

(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define-syntax&lt;/span&gt;&lt;/span&gt; define-const
  (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;syntax-rules&lt;/span&gt;&lt;/span&gt; ()
    ((&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; name v)
     (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define-syntax&lt;/span&gt;&lt;/span&gt; name
       (&lt;span class=&quot;name&quot;&gt;identifier-syntax&lt;/span&gt; v)))))

&lt;span class=&quot;comment&quot;&gt;; ...&lt;/span&gt;

(&lt;span class=&quot;name&quot;&gt;define-const&lt;/span&gt; use-popcnt &lt;span class=&quot;literal&quot;&gt;#f&lt;/span&gt;))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I can then use this identifier syntax as a regular variable, like &lt;code&gt;(if
use-popcnt &amp;lt;do-this&amp;gt; &amp;lt;do-that&amp;gt;)&lt;/code&gt;. But thanks to &lt;code&gt;define-const&lt;/code&gt; it
becomes inlined at the place where it is used. So if it’s set to &lt;code&gt;#f&lt;/code&gt;
then the expanded conditional is actually &lt;code&gt;(if #f &amp;lt;do-this&amp;gt; &amp;lt;do-that&amp;gt;)&lt;/code&gt;,
which is trivial to optimize.&lt;/p&gt;
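&lt;p&gt;Here is a self-contained sketch of that pattern as a small R6RS
program, in case you want to try it. The names &lt;code&gt;use-fast-path&lt;/code&gt;
and &lt;code&gt;describe&lt;/code&gt; are made up for this example and do not come
from Loko.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (rnrs))

;; Same helper as above: NAME becomes identifier syntax for V.
(define-syntax define-const
  (syntax-rules ()
    ((_ name v)
     (define-syntax name
       (identifier-syntax v)))))

(define-const use-fast-path #f)

;; After expansion the test is the literal #f, so both branches are
;; still seen and checked by the compiler, but only one can run.
(define (describe)
  (if use-fast-path
      &quot;fast path&quot;
      &quot;portable path&quot;))

(display (describe))   ; prints portable path
(newline)
&lt;/code&gt;&lt;/pre&gt;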
&lt;h1 id=&quot;akku-supports-this-stuff&quot;&gt;Akku supports this stuff&lt;/h1&gt;
&lt;p&gt;I have included support in &lt;a href=&quot;https://akkuscm.org&quot;&gt;Akku&lt;/a&gt; for both the
R6RS and R7RS approach to portability. Akku will keep track of which
Scheme implementation an R6RS library is meant for and adapt the way
it installs the library. It will use the &lt;code&gt;.&amp;lt;impl&amp;gt;.sls&lt;/code&gt; extension and
even escape the file name correctly for that implementation.&lt;/p&gt;
&lt;p&gt;With R7RS libraries it merely needs to install the &lt;code&gt;.sld&lt;/code&gt; file to the
right location. This is simple enough to do. But Akku also translates
R7RS libraries into R6RS libraries. Akku has to do some interesting
juggling when &lt;code&gt;cond-expand&lt;/code&gt; appears at the &lt;code&gt;define-library&lt;/code&gt; level.&lt;/p&gt;
&lt;p&gt;Akku checks the list of features that appear in &lt;code&gt;cond-expand&lt;/code&gt; and looks
to see if it recognizes any implementation names. For each
implementation it then creates a copy of the library that is specific
to that implementation. For each such copy it expands all
&lt;code&gt;cond-expand&lt;/code&gt; expressions at the &lt;code&gt;define-library&lt;/code&gt; level, as best as it
can, and installs &lt;code&gt;.&amp;lt;impl&amp;gt;.sls&lt;/code&gt; files. Kind of dirty, but it mostly
works.&lt;/p&gt;
&lt;p&gt;For example, this library will be installed as &lt;code&gt;hello.chezscheme.sls&lt;/code&gt;,
&lt;code&gt;hello.loko.sls&lt;/code&gt;, &lt;code&gt;hello.sld&lt;/code&gt; and &lt;code&gt;hello.sls&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;define-library&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;hello&lt;/span&gt;)
  (&lt;span class=&quot;name&quot;&gt;export&lt;/span&gt; hello)
  (&lt;span class=&quot;name&quot;&gt;cond-expand&lt;/span&gt;
   ((&lt;span class=&quot;name&quot;&gt;library&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;rnrs&lt;/span&gt;))
    (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;import&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;rnrs&lt;/span&gt;)))
   (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;else&lt;/span&gt;&lt;/span&gt;
    (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;import&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;scheme&lt;/span&gt; base))))

  (&lt;span class=&quot;name&quot;&gt;cond-expand&lt;/span&gt;
   (&lt;span class=&quot;name&quot;&gt;chezscheme&lt;/span&gt;
    (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;begin&lt;/span&gt;&lt;/span&gt;
      (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;hello&lt;/span&gt;)
        (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;display&lt;/span&gt;&lt;/span&gt; &lt;span class=&quot;string&quot;&gt;&quot;Hello Chez!\n&quot;&lt;/span&gt;))))
   (&lt;span class=&quot;name&quot;&gt;loko&lt;/span&gt;
    (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;begin&lt;/span&gt;&lt;/span&gt;
      (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;hello&lt;/span&gt;)
        (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;display&lt;/span&gt;&lt;/span&gt; &lt;span class=&quot;string&quot;&gt;&quot;Hello Loko!\n&quot;&lt;/span&gt;))))
   (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;else&lt;/span&gt;&lt;/span&gt;
    (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;begin&lt;/span&gt;&lt;/span&gt;
      (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;hello&lt;/span&gt;)
        (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;display&lt;/span&gt;&lt;/span&gt; &lt;span class=&quot;string&quot;&gt;&quot;Hello world!\n&quot;&lt;/span&gt;))))))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;(The &lt;code&gt;hello.loko.sls&lt;/code&gt; file it creates is actually a symlink to
&lt;code&gt;hello.sld&lt;/code&gt; because Akku knows that Loko supports R7RS).&lt;/p&gt;
&lt;h1 id=&quot;maybe-a-way-forward&quot;&gt;Maybe a way forward&lt;/h1&gt;
&lt;p&gt;The picture I’ve painted seems quite damning for &lt;code&gt;cond-expand&lt;/code&gt;. But
the problem is not really &lt;code&gt;cond-expand&lt;/code&gt; itself. The problem is when
the misuse of &lt;code&gt;cond-expand&lt;/code&gt; leads to a mess. I’m not advocating for
its removal, but I would like it to be better understood for what it
is. Some widely distributed guidelines for how to use it would go far
in reducing the damage.&lt;/p&gt;
&lt;p&gt;The “Plan 9” approach of making APIs can be done with &lt;code&gt;cond-expand&lt;/code&gt;
just as well as with the R6RS approach of &lt;code&gt;.&amp;lt;impl&amp;gt;.sls&lt;/code&gt;. It just
requires that you know what you’re doing; that it’s a bad idea to
sprinkle all your code with &lt;code&gt;cond-expand&lt;/code&gt; and that you should keep
this code in isolated libraries.&lt;/p&gt;
&lt;p&gt;Finally, I’d like to say that &lt;code&gt;cond-expand&lt;/code&gt; is actually more powerful
than the &lt;code&gt;.&amp;lt;impl&amp;gt;.sls&lt;/code&gt; approach. This power unfortunately makes static
analysis of library declarations more difficult, but it also gives you
access to feature identifiers that are more interesting than just the
name of the Scheme implementation, such as &lt;code&gt;x86-64&lt;/code&gt; and &lt;code&gt;clr&lt;/code&gt;. But
please do hide your use of these behind an appropriate API.&lt;/p&gt;
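&lt;p&gt;As a rough sketch of what such an API might look like, the
hypothetical library &lt;code&gt;(example machine)&lt;/code&gt; below keeps the only
&lt;code&gt;cond-expand&lt;/code&gt; in the program in one place. The library and
procedure names are invented for this illustration.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(define-library (example machine)
  (export machine-name)
  (import (scheme base))

  ;; Every clause defines the same identifier, so the exports do not
  ;; depend on which clause is chosen.
  (cond-expand
    (x86-64
     (begin
       (define (machine-name) &quot;x86-64&quot;)))
    (else
     (begin
       (define (machine-name) &quot;unknown&quot;)))))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Everything else imports &lt;code&gt;(example machine)&lt;/code&gt; and calls
&lt;code&gt;(machine-name)&lt;/code&gt; without ever mentioning a feature
identifier.&lt;/p&gt;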
</description>
    </item>
    <item>
      <title>Loko Scheme 0.9.0</title>
      <link>https://weinholt.se/articles/loko-scheme-0-9-0/</link>
      <pubDate>Sat, 21  Aug 2021 02:00:00 +0200</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/loko-scheme-0-9-0/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;Loko Scheme 0.9.0 is now available from:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://scheme.fail/releases/loko-0.9.0.tar.gz&quot;&gt;https://scheme.fail/releases/loko-0.9.0.tar.gz&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://scheme.fail/releases/loko-0.9.0.tar.gz.sig&quot;&gt;https://scheme.fail/releases/loko-0.9.0.tar.gz.sig&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;A bootable disk image for 64-bit PCs is available from:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://scheme.fail/releases/disk-images/loko-hdd-0.9.0.img.gz&quot;&gt;https://scheme.fail/releases/disk-images/loko-hdd-0.9.0.img.gz&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://scheme.fail/releases/disk-images/loko-hdd-0.9.0.img.gz.sig&quot;&gt;https://scheme.fail/releases/disk-images/loko-hdd-0.9.0.img.gz.sig&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The signatures are made with the GnuPG key 0xE33E61A2E9B8C3A2.&lt;/p&gt;
&lt;p&gt;Loko Scheme 0.9.0 fixes bugs, improves performance and adds features.
See NEWS.md in the distribution for a more detailed summary of
changes.&lt;/p&gt;
&lt;p&gt;Loko Scheme is an optimizing Scheme compiler that builds statically
linked binaries for bare metal, Linux and NetBSD/amd64. It supports
the R6RS Scheme and R7RS Scheme standards.&lt;/p&gt;
&lt;p&gt;Loko Scheme’s web site is &lt;a href=&quot;https://scheme.fail&quot;&gt;https://scheme.fail&lt;/a&gt;, where you can find
the release tarballs and the manual.&lt;/p&gt;
&lt;p&gt;Loko Scheme is available under GNU Affero GPL version 3 or later.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>A Record Type Representation Trick</title>
      <link>https://weinholt.se/articles/record-type-representation-trick/</link>
      <pubDate>Sat, 14  Aug 2021 02:00:00 +0200</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/record-type-representation-trick/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;I’ve been working on optimizations in &lt;a href=&quot;https://scheme.fail&quot;&gt;Loko Scheme&lt;/a&gt; recently and
have implemented large parts
of &lt;a href=&quot;https://andykeep.com/pubs/scheme-12b.pdf&quot;&gt;A Sufficiently Smart Compiler for Procedural Records&lt;/a&gt; (Keep
&amp;amp; Dybvig, 2012). At the same time I have improved the representation
of record type descriptors and wanted to share a simple trick
I used to improve record type checks for non-sealed records. But first
I should explain what a record is in Scheme.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;h1 id=&quot;background-records-in-scheme&quot;&gt;Background: Records in Scheme&lt;/h1&gt;
&lt;p&gt;Scheme supports &lt;em&gt;record types&lt;/em&gt;, which are user-defined data
types. &lt;a href=&quot;https://r7rs.scheme.org/&quot;&gt;R⁷RS Scheme&lt;/a&gt; has a syntax-based variant of this feature,
based on &lt;a href=&quot;https://srfi.schemers.org/srfi-9/&quot;&gt;SRFI-9&lt;/a&gt;. Here’s an example:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;import&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;scheme&lt;/span&gt; base))

(&lt;span class=&quot;name&quot;&gt;define-record-type&lt;/span&gt; point
  (&lt;span class=&quot;name&quot;&gt;make-point&lt;/span&gt; x y)
  point?
  (&lt;span class=&quot;name&quot;&gt;x&lt;/span&gt; point-x point-x-set!)
  (&lt;span class=&quot;name&quot;&gt;y&lt;/span&gt; point-y point-y-set!))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This will let you write &lt;code&gt;(make-point 0.0 0.0)&lt;/code&gt; to get a point at (0.0,
0.0), and &lt;code&gt;(point-x p)&lt;/code&gt; to access the &lt;em&gt;x&lt;/em&gt; field of &lt;em&gt;p&lt;/em&gt;. That’s it for
records in R⁷RS and SRFI-9.&lt;/p&gt;
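&lt;p&gt;A short usage sketch, written as a complete R⁷RS program, shows
the constructor, predicate, accessor and mutator working together. The
variable &lt;code&gt;p&lt;/code&gt; is just an example name.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (scheme base) (scheme write))

(define-record-type point
  (make-point x y)
  point?
  (x point-x point-x-set!)
  (y point-y point-y-set!))

(define p (make-point 0.0 0.0))

(point-x-set! p 3.0)      ; mutate the x field in place
(display (point-x p))     ; the x field is now 3.0
(newline)
(display (point? p))      ; prints #t
(newline)
(display (point? 42))     ; prints #f, a number is not a point
(newline)
&lt;/code&gt;&lt;/pre&gt;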
&lt;p&gt;The record type system in &lt;a href=&quot;https://r7rs.scheme.org/&quot;&gt;R⁶RS Scheme&lt;/a&gt; improves on SRFI-9 in
several ways. In R⁶RS you would instead write this:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;import&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;rnrs&lt;/span&gt;))

(&lt;span class=&quot;name&quot;&gt;define-record-type&lt;/span&gt; point
  (&lt;span class=&quot;name&quot;&gt;fields&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;mutable&lt;/span&gt; x)
          (&lt;span class=&quot;name&quot;&gt;mutable&lt;/span&gt; y)))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;There is no longer any need to explicitly write out the names of the
constructor, predicate, accessors and mutators (unless you want to).
Additionally, you can extend a record type, you can customize the
constructor, and you can control what happens if &lt;code&gt;define-record-type&lt;/code&gt;
runs multiple times, i.e. if it makes a new type each time or not.&lt;/p&gt;
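&lt;p&gt;Here is a minimal sketch of those three features in one R⁶RS
program. The &lt;code&gt;point3d&lt;/code&gt; type and the uid
&lt;code&gt;point-0f4bd3a1&lt;/code&gt; are invented for this example.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (rnrs))

;; A nongenerative type: evaluating this definition again yields the
;; same type instead of a fresh one. The uid is an arbitrary symbol.
(define-record-type point
  (nongenerative point-0f4bd3a1)
  (fields (mutable x) (mutable y)))

;; Extends point with a z field. The protocol customizes the
;; constructor so that z always starts out as 0.
(define-record-type point3d
  (parent point)
  (fields (mutable z))
  (protocol
    (lambda (new)
      (lambda (x y)
        ((new x y) 0)))))

(define p (make-point3d 1 2))

(display (point? p))      ; prints #t, a point3d is also a point
(newline)
(display (point3d-z p))   ; prints 0
(newline)
&lt;/code&gt;&lt;/pre&gt;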
&lt;p&gt;A syntactical record layer can be an abstraction on top of a procedural
layer. What you can do with syntax, you can also do with procedure
calls at runtime. R⁶RS standardizes this layer as well. It also
standardizes a record inspection layer that lets you get the record
type descriptor (&lt;em&gt;RTD&lt;/em&gt;) from a record at runtime (unless it’s marked
as &lt;em&gt;opaque&lt;/em&gt;) and also to inspect all aspects of RTDs. In fact, RTDs
are objects in their own right, just like pairs and symbols.&lt;/p&gt;
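&lt;p&gt;The inspection layer can be tried out with a few lines of R⁶RS.
This is only a sketch; the variable names &lt;code&gt;p&lt;/code&gt; and
&lt;code&gt;rtd&lt;/code&gt; are arbitrary.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (rnrs))

(define-record-type point
  (fields (mutable x) (mutable y)))

(define p (make-point 1 2))

;; Works because point is not opaque.
(define rtd (record-rtd p))

(display (record-type-name rtd))          ; prints point
(newline)
(display (record-type-field-names rtd))   ; prints #(x y)
(newline)
(display (record-field-mutable? rtd 0))   ; prints #t
(newline)

;; The RTD found at runtime is the one that the syntactic layer made.
(display (eqv? rtd (record-type-descriptor point)))   ; prints #t
(newline)
&lt;/code&gt;&lt;/pre&gt;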
&lt;p&gt;The above record type definition might expand to this code that uses
the procedural layer (a real expansion would use fresh identifiers for
the RTD and RCD):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; point-rtd
  (&lt;span class=&quot;name&quot;&gt;make-record-type-descriptor&lt;/span&gt;
    &lt;span class=&quot;symbol&quot;&gt;'point&lt;/span&gt; &lt;span class=&quot;literal&quot;&gt;#f&lt;/span&gt; &lt;span class=&quot;literal&quot;&gt;#f&lt;/span&gt; &lt;span class=&quot;literal&quot;&gt;#f&lt;/span&gt; &lt;span class=&quot;literal&quot;&gt;#f&lt;/span&gt;
    '#((&lt;span class=&quot;name&quot;&gt;mutable&lt;/span&gt; x) (&lt;span class=&quot;name&quot;&gt;mutable&lt;/span&gt; y))))
(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; point-rcd
  (&lt;span class=&quot;name&quot;&gt;make-record-constructor-descriptor&lt;/span&gt;
    point-rtd &lt;span class=&quot;literal&quot;&gt;#f&lt;/span&gt; &lt;span class=&quot;literal&quot;&gt;#f&lt;/span&gt;))

(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; point? (&lt;span class=&quot;name&quot;&gt;record-predicate&lt;/span&gt; point-rtd))
(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; make-point (&lt;span class=&quot;name&quot;&gt;record-constructor&lt;/span&gt; point-rcd))
(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; point-x (&lt;span class=&quot;name&quot;&gt;record-accessor&lt;/span&gt; point-rtd &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;))
(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; point-y (&lt;span class=&quot;name&quot;&gt;record-accessor&lt;/span&gt; point-rtd &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;))
(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; point-x-set! (&lt;span class=&quot;name&quot;&gt;record-mutator&lt;/span&gt; point-rtd &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;))
(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; point-y-set! (&lt;span class=&quot;name&quot;&gt;record-mutator&lt;/span&gt; point-rtd &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This code uses the helpers &lt;code&gt;record-predicate&lt;/code&gt;, &lt;code&gt;record-constructor&lt;/code&gt;,
&lt;code&gt;record-accessor&lt;/code&gt; and &lt;code&gt;record-mutator&lt;/code&gt; to create procedures. An
intermediate &lt;em&gt;record constructor descriptor&lt;/em&gt; contains the information
needed to make the constructor.&lt;/p&gt;
&lt;p&gt;Next we will have a look at records in memory, how the code above
is optimized, and finally a trick to speed up record type checks.&lt;/p&gt;
&lt;h1 id=&quot;record-type-representation&quot;&gt;Record Type Representation&lt;/h1&gt;
&lt;p&gt;Loko Scheme has a straightforward representation of records. Using the
above &lt;code&gt;point&lt;/code&gt; type as an example, here are the records returned by
&lt;code&gt;(make-point 1.0 2.0)&lt;/code&gt; and &lt;code&gt;(make-point 1.5 -2.0)&lt;/code&gt; as they are
structured in memory. The rows are 64-bit words and the arrows are
tagged pointers. Memory allocations are aligned to 16 bytes, and the
empty space created by alignment is also shown.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;/articles/record-type-representation-trick/point.dot&quot;&gt;&lt;img src=&quot;/articles/record-type-representation-trick/point.svg&quot; alt=&quot;Graphviz view of two point records&quot;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The information stored in the record is a pointer to the record type
descriptor, which is reused for each record of the same type, followed
by a slot for each field in the record.&lt;/p&gt;
&lt;p&gt;The information in the record type descriptor is used by the record
inspection procedures and the garbage collector. The slots contain: a
type tag for the RTD itself, the size of the records, an optional
parent type, an optional record unique identifier, the field names,
field mutability (a bit-field), and an optional record writer
procedure.&lt;/p&gt;
&lt;p&gt;Loko uses the first slot in the RTD to store the length of the RTD and
these flags: &lt;em&gt;opaque?&lt;/em&gt;, &lt;em&gt;sealed?&lt;/em&gt;, &lt;em&gt;generative?&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Other R⁶RS implementations will have similar representations of RTDs
because all this information is needed at runtime.&lt;/p&gt;
&lt;h1 id=&quot;single-inheritance&quot;&gt;Single Inheritance&lt;/h1&gt;
&lt;p&gt;R⁶RS supports single inheritance for record types. Instead of
demonstrating this with some contrived geometric shapes or balloon
animals, let’s use an example from working code. The following record
types are simplified variants of records used in Loko’s PCI library.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;define-record-type&lt;/span&gt; pcibar
  (&lt;span class=&quot;name&quot;&gt;fields&lt;/span&gt; reg base size))

(&lt;span class=&quot;name&quot;&gt;define-record-type&lt;/span&gt; pcibar-i/o
  (&lt;span class=&quot;name&quot;&gt;parent&lt;/span&gt; pcibar))

(&lt;span class=&quot;name&quot;&gt;define-record-type&lt;/span&gt; pcibar-mem
  (&lt;span class=&quot;name&quot;&gt;parent&lt;/span&gt; pcibar)
  (&lt;span class=&quot;name&quot;&gt;fields&lt;/span&gt; type prefetchable?))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;PCI base address registers (BARs) all have these fields: &lt;em&gt;reg&lt;/em&gt;,
&lt;em&gt;base&lt;/em&gt;, and &lt;em&gt;size&lt;/em&gt;. If they are in I/O space then that’s all, but BARs
in memory space have two additional fields: &lt;em&gt;type&lt;/em&gt; and
&lt;em&gt;prefetchable?&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;/articles/record-type-representation-trick/pcibar.dot&quot;&gt;&lt;img src=&quot;/articles/record-type-representation-trick/pcibar.svg&quot; alt=&quot;Graphviz view of PCI bars records&quot;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Notice that both &lt;code&gt;pcibar-i/o&lt;/code&gt; and &lt;code&gt;pcibar-mem&lt;/code&gt; point to &lt;code&gt;pcibar&lt;/code&gt; as
their parent. The size field is larger in &lt;code&gt;pcibar-mem&lt;/code&gt; to account for
the extra fields. The extra fields in the &lt;code&gt;pcibar-mem&lt;/code&gt; record are
placed immediately after the fields that belong to &lt;code&gt;pcibar&lt;/code&gt;, so
accessors for &lt;code&gt;pcibar&lt;/code&gt; don’t need to recompute the slot numbers when
passed a &lt;code&gt;pcibar-mem&lt;/code&gt;.&lt;/p&gt;
&lt;h1 id=&quot;predicates-for-non-sealed-types&quot;&gt;Predicates for Non-Sealed Types&lt;/h1&gt;
&lt;p&gt;R⁶RS lets you say that a record type is &lt;em&gt;sealed&lt;/em&gt;. This prevents a
record type from being extended. As a consequence, type checks are
more efficient. Why is that?&lt;/p&gt;
&lt;p&gt;The record predicate &lt;code&gt;pcibar?&lt;/code&gt; is given an object and returns true if
the object has that record type, and false otherwise. If the
implementation uses tagged pointers then the predicate first checks
the tag. Next, it reads the type field of the object and compares it
to the &lt;code&gt;pcibar&lt;/code&gt; type.&lt;/p&gt;
&lt;p&gt;But if types are not sealed then they can be extended, and it’s
possible that the type that the predicate is checking for was used as
a parent. The &lt;code&gt;pcibar?&lt;/code&gt; predicate should return true even for a
&lt;code&gt;pcibar-mem&lt;/code&gt; record.&lt;/p&gt;
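&lt;p&gt;A small sketch of the expected behavior, repeating the record definitions from the previous section so that it runs on its own. The field values are invented for illustration, and the &lt;code&gt;(rnrs)&lt;/code&gt; import assumes an R⁶RS top-level program:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (rnrs))

(define-record-type pcibar
  (fields reg base size))
(define-record-type pcibar-i/o
  (parent pcibar))
(define-record-type pcibar-mem
  (parent pcibar)
  (fields type prefetchable?))

;; Hypothetical BARs; the values are invented for this example.
(define io  (make-pcibar-i/o 4 #xc000 32))
(define mem (make-pcibar-mem 0 #xfebc0000 #x1000 32 #f))

(pcibar? io)       ;⇒ #t
(pcibar? mem)      ;⇒ #t, pcibar-mem extends pcibar
(pcibar-mem? io)   ;⇒ #f
(pcibar-i/o? mem)  ;⇒ #f
(pcibar-size mem)  ;⇒ 4096, the parent accessor works on the subtype
&lt;/code&gt;&lt;/pre&gt;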
&lt;h1 id=&quot;the-trick&quot;&gt;The Trick&lt;/h1&gt;
&lt;p&gt;Previously Loko Scheme’s &lt;code&gt;record-predicate&lt;/code&gt; procedure worked as
follows. It checked whether the RTD was sealed. For sealed RTDs it
returned a procedure that implemented the fast check described above.&lt;/p&gt;
&lt;p&gt;For non-sealed RTDs another procedure was returned that did that
check, and additionally looped over all parent RTDs to see if any of
them was the desired RTD:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;record-predicate&lt;/span&gt; rtd)
  (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;if&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;record-type-sealed?&lt;/span&gt; rtd)
      (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; (obj)     &lt;span class=&quot;comment&quot;&gt;;fast path&lt;/span&gt;
        (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;and&lt;/span&gt;&lt;/span&gt;
          (&lt;span class=&quot;name&quot;&gt;$box?&lt;/span&gt; obj)
          (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;eq?&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;$box-type&lt;/span&gt; obj) rtd)))
      (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; (obj)     &lt;span class=&quot;comment&quot;&gt;;slow path&lt;/span&gt;
        (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;and&lt;/span&gt;&lt;/span&gt;
          (&lt;span class=&quot;name&quot;&gt;$box?&lt;/span&gt; obj)
          (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let&lt;/span&gt;&lt;/span&gt; ((&lt;span class=&quot;name&quot;&gt;t&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;$box-type&lt;/span&gt; obj)))
            (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;or&lt;/span&gt;&lt;/span&gt;
              (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;eq?&lt;/span&gt;&lt;/span&gt; t rtd)
              (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;and&lt;/span&gt;&lt;/span&gt;
                (&lt;span class=&quot;name&quot;&gt;record-type-descriptor?&lt;/span&gt; t)
                (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let&lt;/span&gt;&lt;/span&gt; lp ((&lt;span class=&quot;name&quot;&gt;t&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;record-type-parent&lt;/span&gt; t)))
                  (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;cond&lt;/span&gt;&lt;/span&gt;
                    ((&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;eq?&lt;/span&gt;&lt;/span&gt; t rtd) &lt;span class=&quot;literal&quot;&gt;#t&lt;/span&gt;)
                    ((&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;not&lt;/span&gt;&lt;/span&gt; t) &lt;span class=&quot;literal&quot;&gt;#f&lt;/span&gt;)
                    (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;else&lt;/span&gt;&lt;/span&gt;
                     (&lt;span class=&quot;name&quot;&gt;lp&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;record-type-parent&lt;/span&gt; t))))))))))))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I haven’t surveyed other implementations, but I suspect that most R⁶RS
implementations do something similar. When I looked at Chez Scheme’s
assembly output I saw a loop that’s morally equivalent to this one. This
loop also shows up in accessors and mutators, because they need to know
that the object they’ve been passed has the right type.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The trick:&lt;/strong&gt; this loop can be avoided by extending the RTD
representation so that each RTD directly contains pointers to all
parent RTDs. The pointers are laid out so that the base type is placed
first, followed by the sub-types in order. An RTD will then appear at
a &lt;em&gt;fixed&lt;/em&gt; location in any RTD that extends it.&lt;/p&gt;
&lt;p&gt;I’m sure that there are other language implementations where this
problem of sub-typing shows up and someone else has come up with just
this optimization, because it’s kind of obvious.&lt;/p&gt;
&lt;p&gt;Suppose that we have a base record type and some record types that
extend each other. For simplicity, I will not give them any fields.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;define-record-type&lt;/span&gt; A)

(&lt;span class=&quot;name&quot;&gt;define-record-type&lt;/span&gt; B (&lt;span class=&quot;name&quot;&gt;parent&lt;/span&gt; A))
(&lt;span class=&quot;name&quot;&gt;define-record-type&lt;/span&gt; C (&lt;span class=&quot;name&quot;&gt;parent&lt;/span&gt; B))

(&lt;span class=&quot;name&quot;&gt;define-record-type&lt;/span&gt; S (&lt;span class=&quot;name&quot;&gt;parent&lt;/span&gt; A))
(&lt;span class=&quot;name&quot;&gt;define-record-type&lt;/span&gt; T (&lt;span class=&quot;name&quot;&gt;parent&lt;/span&gt; S))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The expression &lt;code&gt;(A? (make-A))&lt;/code&gt; evaluates to true, and &lt;code&gt;A?&lt;/code&gt; also
returns true for records of every other type shown. But &lt;code&gt;(B? (make-T))&lt;/code&gt;
evaluates to false because &lt;code&gt;T&lt;/code&gt; does not have &lt;code&gt;B&lt;/code&gt; anywhere in its
chain of parents. That’s what the loop would be checking.&lt;/p&gt;
&lt;p&gt;This picture shows the memory layout when the trick is used on these
RTDs.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;/articles/record-type-representation-trick/inherit.dot&quot;&gt;&lt;img src=&quot;/articles/record-type-representation-trick/inherit.svg&quot; alt=&quot;Graphviz view of the flat parent list&quot;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The pointer to the &lt;code&gt;A&lt;/code&gt; RTD is always in the &lt;code&gt;0:&lt;/code&gt; slot for any RTD that
extends it. Similarly, the predicate for &lt;code&gt;B&lt;/code&gt; knows to always check in
the &lt;code&gt;1:&lt;/code&gt; slot. A bounds check on the RTD is also needed in this
layout, because an RTD with fewer ancestors has fewer of these slots.&lt;/p&gt;
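&lt;p&gt;The idea can be sketched in portable Scheme. This is only a model and not Loko’s actual representation: the vector-based RTDs and the names &lt;code&gt;make-rtd&lt;/code&gt;, &lt;code&gt;rtd-depth&lt;/code&gt;, &lt;code&gt;rtd-ancestors&lt;/code&gt; and &lt;code&gt;rtd-extends?&lt;/code&gt; are hypothetical. As a simplifying assumption, each RTD also stores itself as the last entry of its own ancestor list, so a single comparison covers both the exact type and its sub-types:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (rnrs))

;; Hypothetical model: an RTD is the vector #(name depth ancestors).
;; The ancestors vector holds the base type first and the RTD itself last.
(define (rtd-depth rtd) (vector-ref rtd 1))
(define (rtd-ancestors rtd) (vector-ref rtd 2))

(define (make-rtd name parent)
  (let* ((depth (if parent (+ 1 (rtd-depth parent)) 0))
         (ancestors (make-vector (+ depth 1))))
    ;; Copy the parent's flat list, then append the new RTD.
    (when parent
      (let ((pa (rtd-ancestors parent)))
        (do ((i 0 (+ i 1)))
            ((= i (vector-length pa)))
          (vector-set! ancestors i (vector-ref pa i)))))
    (let ((rtd (vector name depth ancestors)))
      (vector-set! ancestors depth rtd)
      rtd)))

;; Would a record whose type is obj-rtd satisfy the predicate for rtd?
(define (rtd-extends? obj-rtd rtd)
  (let ((depth (rtd-depth rtd))
        (ancestors (rtd-ancestors obj-rtd)))
    (and (&amp;lt; depth (vector-length ancestors))       ;bounds check
         (eq? (vector-ref ancestors depth) rtd))))  ;fixed slot, no loop

(define A (make-rtd 'A #f))
(define B (make-rtd 'B A))
(define C (make-rtd 'C B))
(define S (make-rtd 'S A))
(define T (make-rtd 'T S))

(rtd-extends? C B)  ;⇒ #t
(rtd-extends? T A)  ;⇒ #t
(rtd-extends? T B)  ;⇒ #f, slot 1 of T holds S
(rtd-extends? A B)  ;⇒ #f, the bounds check fails
&lt;/code&gt;&lt;/pre&gt;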
&lt;h1 id=&quot;taking-it-further&quot;&gt;Taking it further&lt;/h1&gt;
&lt;p&gt;Further improvements on this layout are possible. Loko Scheme always
allocates memory for four parent RTDs. If an RTD will appear at slots
0 to 3, then the predicate does not need to do bounds checking on the
RTD. The &lt;code&gt;parent:&lt;/code&gt; slot is not strictly needed and can be removed.&lt;/p&gt;
&lt;p&gt;Specific to Loko Scheme, the predicates hidden inside accessors
and mutators use slightly less code than the predicate procedures.
These hidden predicates do not explicitly verify the tags on the
pointers, instead leaving it up to the processor’s
built-in &lt;a href=&quot;https://weinholt.se/articles/alignment-check/&quot;&gt;alignment checking&lt;/a&gt; to trap
invalid references.&lt;/p&gt;
&lt;p&gt;The type checks are even faster if specialized predicates, accessors,
and mutators can be generated at compile time. If the RTD is known at
compile time then the slot that contains the RTD is also known and can
be inlined.&lt;/p&gt;
&lt;h1 id=&quot;sufficiently-smart&quot;&gt;Sufficiently Smart&lt;/h1&gt;
&lt;p&gt;A &lt;em&gt;sufficiently smart compiler&lt;/em&gt; is a legendary compiler that does your
favorite optimizations so that your favorite language feature becomes
very efficient.&lt;/p&gt;
&lt;p&gt;In &lt;a href=&quot;https://andykeep.com/pubs/scheme-12b.pdf&quot;&gt;A Sufficiently Smart Compiler for Procedural Records&lt;/a&gt;, Andy
Keep and Kent Dybvig present their work on optimizing the R⁶RS
procedural record system in Chez Scheme. It builds on top of the
source-level optimizer cp0. Loko Scheme has its own implementation of
cp0, so adapting their work has been pretty simple.&lt;/p&gt;
&lt;p&gt;The basic idea is to have cp0 generate static or partially static
RTDs, which are then propagated throughout the program using cp0’s
existing mechanisms. If cp0 succeeds in propagating the RTDs to where
&lt;code&gt;record-accessor&lt;/code&gt; (etc) are called, then it can also generate code
specialized to each record type.&lt;/p&gt;
&lt;p&gt;I’ve implemented large parts of the ideas in the paper in Loko Scheme,
together with the improved record type representation.&lt;/p&gt;
&lt;h1 id=&quot;post-script&quot;&gt;Post-script&lt;/h1&gt;
&lt;p&gt;Compiling Loko Scheme with Loko Scheme is now almost as fast as
compiling it with Chez Scheme, &lt;em&gt;if garbage collection time is not
counted&lt;/em&gt; (run the compilation with something like &lt;code&gt;LOKO_HEAP=28000&lt;/code&gt; if
you have enough RAM). I’m not sure it’s an apples-to-apples comparison
though, because when Chez is used, it also has to load and compile
Loko’s compiler, whereas Loko cheats by already having it loaded. But
still, Loko’s performance is improving. I’m interested in seeing how
well the next release will fare in the &lt;a href=&quot;https://ecraven.github.io/r7rs-benchmarks/&quot;&gt;Scheme Benchmarks&lt;/a&gt;.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>Akku.scm 1.1.0 released</title>
      <link>https://weinholt.se/articles/akku-scm-1-1-0/</link>
      <pubDate>Sat, 06 Feb 2021 01:00:00 +0100</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/akku-scm-1-1-0/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;&lt;a href=&quot;https://akkuscm.org/&quot;&gt;Akku.scm&lt;/a&gt; version 1.1.0, a language package manager for R6RS
and R7RS Scheme, is now generally available. It can be downloaded
from &lt;a href=&quot;https://gitlab.com/akkuscm/akku/-/releases&quot;&gt;GitLab&lt;/a&gt;. This version adds support for Guile 3.0
and Digamma, and includes some bug fixes and new features.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;Akku is a language package manager designed for Scheme. In Scheme,
libraries can be analyzed to find their names, exports and imports.
Akku uses this information, plus knowledge of how the various Scheme
implementations work, to automatically install libraries where they
will be found. Libraries can come from the current project or from
packages downloaded from the Internet.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://asciinema.org/a/259410&quot;&gt;&lt;img src=&quot;https://asciinema.org/a/259410.svg&quot; alt=&quot;asciicast&quot;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Akku installs libraries to a per-project library directory that works
across all supported Scheme implementations. On top of this there is a
traditional package index and a dependency solver. Packages are
manually reviewed before they are published in the cryptographically
signed index.&lt;/p&gt;
&lt;p&gt;To increase the portability of R7RS code, Akku also performs an
automatic conversion from the R7RS &lt;code&gt;define-library&lt;/code&gt; form to the R6RS
&lt;code&gt;library&lt;/code&gt; form.&lt;/p&gt;
&lt;p&gt;Akku supports Chez Scheme, Chibi Scheme, Digamma, GNU Guile, Gauche
Scheme, Ikarus Scheme, IronScheme, Larceny Scheme, Loko Scheme, Mosh
Scheme, Racket (plt-r6rs), Sagittarius Scheme, Vicare Scheme and
Ypsilon Scheme. It has been tested on Cygwin, FreeBSD, GNU/Linux,
MSYS2, OpenBSD and macOS.&lt;/p&gt;
&lt;h2 id=&quot;further-reading&quot;&gt;Further reading&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://akkuscm.org&quot;&gt;The Akku website&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://akkuscm.org/packages&quot;&gt;The Akku package list&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://gitlab.com/akkuscm/akku/wikis/FAQ&quot;&gt;The Akku FAQ&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    <item>
      <title>Loko Scheme 0.6.0</title>
      <link>https://weinholt.se/articles/loko-scheme-0-6-0/</link>
      <pubDate>Sat, 29 Aug 2020 02:00:00 +0200</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/loko-scheme-0-6-0/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;Loko Scheme 0.6.0 is now available from:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://scheme.fail/releases/loko-0.6.0.tar.gz&quot;&gt;https://scheme.fail/releases/loko-0.6.0.tar.gz&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://scheme.fail/releases/loko-0.6.0.tar.gz.sig&quot;&gt;https://scheme.fail/releases/loko-0.6.0.tar.gz.sig&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;The release tarball is signed by the GnuPG key 0xE33E61A2E9B8C3A2.&lt;/p&gt;
&lt;p&gt;Loko Scheme 0.6.0 introduces support for R7RS-small. The release
tarballs now include a pre-built compiler and all dependencies needed
for building Loko. See NEWS.md in the distribution for a more detailed
summary of changes.&lt;/p&gt;
&lt;p&gt;Loko Scheme is an optimizing Scheme compiler that builds statically
linked binaries for bare metal, Linux and NetBSD/amd64. It supports
the R6RS Scheme and R7RS Scheme standards.&lt;/p&gt;
&lt;p&gt;Loko Scheme’s web site is &lt;a href=&quot;https://scheme.fail&quot;&gt;https://scheme.fail&lt;/a&gt;, where you can find
the release tarballs and the manual.&lt;/p&gt;
&lt;p&gt;Loko Scheme is available under GNU Affero GPL version 3 or later.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>Akku Archive Improvements</title>
      <link>https://weinholt.se/articles/akku-archive-improvements/</link>
      <pubDate>Sat, 20 Jun 2020 02:00:00 +0200</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/akku-archive-improvements/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;&lt;a href=&quot;https://akkuscm.org/&quot;&gt;Akku.scm&lt;/a&gt; is a language package manager for R6RS and R7RS
Scheme. The &lt;a href=&quot;https://gitlab.com/akkuscm/akku-archive&quot;&gt;software that powers the package index&lt;/a&gt; has
been growing beyond the simple one-liner it was in the beginning and
today I’ve finally pushed it to a public repository. I’ve also made
preparations for hosting packages as tarballs directly in the archive.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;h1 id=&quot;tarballs&quot;&gt;Tarballs&lt;/h1&gt;
&lt;p&gt;The Akku archive has never hosted packages directly. The index points at
git repositories and commit revisions. These are added to each
project’s &lt;code&gt;Akku.lock&lt;/code&gt; file and are used when &lt;code&gt;akku install&lt;/code&gt; clones the
repository.&lt;/p&gt;
&lt;p&gt;This has two major drawbacks. Cloning a git repository can be really
slow. The repositories are also hosted on sites like GitHub where
users sometimes decide to force-push or remove the repositories
completely. I feel this is likely to happen more often the more
politics and business influence GitHub in the future.&lt;/p&gt;
&lt;p&gt;I’ve prepared the Akku archive to host tarballs directly. These are made
with &lt;code&gt;git archive&lt;/code&gt; from the submitted git repository. Downloading
these is much faster than cloning a repository, they are not at risk
of being removed at a whim, and they are cached in a local shared
cache. Other package managers as a rule host their own archives as
well, so this is nothing unusual.&lt;/p&gt;
&lt;h1 id=&quot;provenance&quot;&gt;Provenance&lt;/h1&gt;
&lt;p&gt;It’s important to me that users of Akku can trust that they get
original software that has not been tampered with. I review all code
that goes into the archive to protect Akku against use
in &lt;a href=&quot;https://attack.mitre.org/techniques/T1195/&quot;&gt;supply chain attacks&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Building tarballs changes the equation a little bit since you now need
to trust that the tarballs have not been tampered with. Tarballs are
verified when they are downloaded, but how do you know that they match
the original software?&lt;/p&gt;
&lt;p&gt;This can be seen as an issue of &lt;em&gt;provenance&lt;/em&gt;, or providing proof of
the history of a piece of software. Here is the chain for the new
tarballs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Akku packages are submitted through &lt;code&gt;akku publish&lt;/code&gt; by a developer
(or by the Snow mirror software) as a &lt;code&gt;.akku&lt;/code&gt; file with a detached
GPG signature. This signature can be independently verified by
fetching the key from the keyservers.&lt;/p&gt;
&lt;p&gt;The signed &lt;code&gt;.akku&lt;/code&gt; file contains a git commit id. Because it is
signed by the person who submitted the package, we can use the
signature to verify that it was not tampered with after it went into
the archive.&lt;/p&gt;
&lt;p&gt;Copies of these files are hosted
under
&lt;a href=&quot;https://archive.akkuscm.org/archive/packages/&quot;&gt;/archive/packages&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The archive software creates a tarball from the original repository
using the &lt;code&gt;git archive&lt;/code&gt; command. It also creates a new &lt;code&gt;.akku&lt;/code&gt; file
which contains information about the original repository and commit
id as a comment. The non-comment part of the file contains the URL
and hash of the new tarball. Like other &lt;code&gt;.akku&lt;/code&gt; files, it is signed.
This provides a signature linking the original git commit to the new
tarball’s hash.&lt;/p&gt;
&lt;p&gt;These files are available
under &lt;a href=&quot;https://archive.akkuscm.org/archive/pkg/&quot;&gt;/archive/pkg&lt;/a&gt;. The
signature is made with the current Akku archive key, which is in
turn signed by my own key (which is in the Debian keyring).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The &lt;code&gt;.akku&lt;/code&gt; files for Snow packages and the new tarballs are
combined using &lt;code&gt;akku archive-scan&lt;/code&gt; and written to &lt;code&gt;Akku-index.scm&lt;/code&gt;,
which is then XZ-compressed and signed with the archive key. The
&lt;code&gt;akku update&lt;/code&gt; command verifies the signature when it downloads this
file. When Akku creates an &lt;code&gt;Akku.lock&lt;/code&gt; file it incorporates the hash
from the index, which is verified when &lt;code&gt;akku install&lt;/code&gt; runs.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The above should make it possible for any interested party to check
the integrity of the archive. It also protects against attackers
uploading funky tarballs that don’t match the git repository.&lt;/p&gt;
&lt;p&gt;All git repositories and Snow packages are mirrored in the archive
under &lt;a href=&quot;https://archive.akkuscm.org/archive/mirror/&quot;&gt;/archive/mirror&lt;/a&gt;.
This mirror is not used in the index and is mostly provided for backup
purposes.&lt;/p&gt;
&lt;h1 id=&quot;beta-testers&quot;&gt;Beta testers&lt;/h1&gt;
&lt;p&gt;The new index with tarballs is not live yet; it needs some testing first.&lt;/p&gt;
&lt;p&gt;Anyone who wants to do so can try it and report successes or failures
in the comments section below or in GitLab issues. Here is how to
update to the new archive manually:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;curl https://archive.akkuscm.org/beta/Akku-index.scm \
  &amp;gt; ~/.local/share/akku/index.db
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;There is a GPG signature (&lt;code&gt;.sig&lt;/code&gt;) in the same directory in case you
want to verify that it was not tampered with.&lt;/p&gt;
&lt;p&gt;Run &lt;code&gt;akku lock&lt;/code&gt; in your existing project to get a lockfile that uses
the new index. Then run &lt;code&gt;akku install&lt;/code&gt; to download your packages as
usual.&lt;/p&gt;
&lt;p&gt;If all goes well then some time soon the switch to the new index will
happen and &lt;code&gt;akku update&lt;/code&gt; will use the new style index. You will still
be able to revert to the old index by downloading &lt;code&gt;Akku-origin.scm&lt;/code&gt;
manually from the archive site and then using that as your index. This
file will keep being maintained because that is where the Akku website
generator finds pointers to upstream Git repositories.&lt;/p&gt;
&lt;h1 id=&quot;further-reading&quot;&gt;Further reading&lt;/h1&gt;
&lt;p&gt;More about Akku:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://akkuscm.org&quot;&gt;The Akku website&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://akkuscm.org/packages&quot;&gt;The Akku package list&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://gitlab.com/akkuscm/akku/wikis/FAQ&quot;&gt;The Akku FAQ&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    <item>
      <title>Quasiquote - Literal Magic</title>
      <link>https://weinholt.se/articles/quasiquote-literal-magic/</link>
      <pubDate>Fri, 15 May 2020 02:00:00 +0200</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/quasiquote-literal-magic/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;While I was writing a manpage for Scheme’s &lt;a href=&quot;https://github.com/schemedoc/scheme-manpages/blob/master/man3/quasiquote.3scheme&quot;&gt;quasiquote&lt;/a&gt;,
something I saw surprised me and changed my understanding of
quasiquote. It turns out that a new language, with semantics that are
interesting to PLT enthusiasts, hides behind the innocent backtick
character. Starting with R6RS Scheme, quasiquote became &lt;em&gt;total
magic&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;h1 id=&quot;background&quot;&gt;Background&lt;/h1&gt;
&lt;p&gt;It is not going to be easy to understand the argument in this article
if you lack some background knowledge, so here’s a brief explanation
of quasiquote, Scheme’s concept of locations, Scheme’s handling of
literal constants, and finally referential transparency.&lt;/p&gt;
&lt;h2 id=&quot;briefly-on-quasiquote&quot;&gt;Briefly on quasiquote&lt;/h2&gt;
&lt;p&gt;Quasiquote is a language feature in Scheme that lets you write a
template for a structure of lists and vectors. These templates are
more like web templates than C++ templates; don’t let the terminology
confuse you.&lt;/p&gt;
&lt;p&gt;Basically you write a backtick character to start a template. The code
immediately following the backtick is the template. You write a comma
wherever you want to fill in some variable or other expression.
(There’s also a list-splicing version of the comma which often comes in
handy).&lt;/p&gt;
&lt;p&gt;The expression &lt;code&gt;`(b &amp;quot;Hello &amp;quot; ,x)&lt;/code&gt; builds a list with these elements:
the symbol &lt;code&gt;b&lt;/code&gt;, the string &lt;code&gt;&amp;quot;Hello &amp;quot;&lt;/code&gt; and lastly whatever value the
variable &lt;code&gt;x&lt;/code&gt; happens to have. The result might be &lt;code&gt;(b &amp;quot;Hello &amp;quot; &amp;quot;John&amp;quot;)&lt;/code&gt;.&lt;/p&gt;
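&lt;p&gt;As a short illustration that can be tried at a REPL, here is the same template together with one that uses the list-splicing comma (&lt;code&gt;,@&lt;/code&gt;). The variables &lt;code&gt;x&lt;/code&gt; and &lt;code&gt;xs&lt;/code&gt; and their values are made up for this example:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(define x &quot;John&quot;)
(define xs '(1 2 3))

;; A comma fills in the value of one expression.
`(b &quot;Hello &quot; ,x)      ;⇒ (b &quot;Hello &quot; &quot;John&quot;)

;; A comma followed by @ splices the elements of a list into the template.
`(sum ,(+ 1 2) ,@xs)  ;⇒ (sum 3 1 2 3)
&lt;/code&gt;&lt;/pre&gt;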
&lt;p&gt;Quasiquote is very useful to build SXML and SHTML. Forget learning a
new templating system every week; this one’s a keeper. But this being
Scheme, the most popular use is likely writing S-expressions that
represent code in some language. It’s used for just that
in &lt;a href=&quot;https://nanopass.org/&quot;&gt;the nanopass framework&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;location-location-location-&quot;&gt;Location, location, location!&lt;/h2&gt;
&lt;p&gt;Making a language is a difficult job. Everything should ideally work
smoothly together as a coherent whole, like pineapples on a pizza.
Objects in Scheme programs implicitly refer to &lt;em&gt;locations&lt;/em&gt;. There are
many details around that which affect the whole language, and they
have interesting consequences for quasiquote.&lt;/p&gt;
&lt;p&gt;What’s a location? It’s just a place where you can store a value. The
vector &lt;code&gt;#(1 2 3)&lt;/code&gt; has three locations where values are stored, and
currently they hold the numbers 1, 2 and 3. If you make a vector using
&lt;code&gt;make-vector&lt;/code&gt; then the vector is &lt;em&gt;mutable&lt;/em&gt; and you can change the
values. Later when you see the same vector again it will still contain
the new values.&lt;/p&gt;
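&lt;p&gt;A minimal sketch of that behavior, using a hypothetical variable &lt;code&gt;v&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;;; make-vector returns a fresh, mutable vector with three locations.
(define v (make-vector 3 0))

;; Store a new value in the first location.
(vector-set! v 0 'changed)

;; The same vector still holds the new value when we look again.
(vector-ref v 0)  ;⇒ changed
v                 ;⇒ #(changed 0 0)
&lt;/code&gt;&lt;/pre&gt;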
&lt;p&gt;In practice a location is some address in memory. The garbage
collector might move it around so that its address changes, but it is
still the same location.&lt;/p&gt;
&lt;p&gt;Other objects in Scheme do not have locations. The number &lt;code&gt;1&lt;/code&gt; does not
have any locations that hold values that you can change. This is only
because of the wisdom, kindheartedness and foresight of the language
designers, because it is possible to design things differently.&lt;/p&gt;
&lt;p&gt;As a consequence of numbers not having locations, there is also very
little point in worrying about which number object you have. Suppose
that numbers &lt;em&gt;did&lt;/em&gt; have locations and you could store important
information in them. You’d be very concerned that the &lt;code&gt;1&lt;/code&gt; number
object where you stored all your passwords is the same &lt;code&gt;1&lt;/code&gt; that you
now have in your &lt;em&gt;secrets&lt;/em&gt; variable. (Nobody will ever think to look
inside the number &lt;code&gt;1&lt;/code&gt; for your passwords, so your secrets are safe
there.) But number objects do not actually have locations, so it
doesn’t matter if the Scheme implementation fiddles with them behind
your back and gives you a different &lt;code&gt;1&lt;/code&gt; object the next time you’re
looking.&lt;/p&gt;
&lt;h2 id=&quot;literal-constants&quot;&gt;Literal constants&lt;/h2&gt;
&lt;p&gt;Pairs and vectors have locations, but the rules for their locations
are much relaxed if they are literal constants in the program code.&lt;/p&gt;
&lt;p&gt;Constants in Scheme are allowed to have read-only locations. If you
compile a program with Loko Scheme, you will notice that you get an
assertion violation if you try to change any of the constants. From R6RS:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;It is desirable for constants (i.e. the values of literal
expressions) to reside in read-only memory. To express this, it is
convenient to imagine that every object that refers to locations is
associated with a flag telling whether that object is mutable or
immutable. Literal constants, the strings returned by
&lt;code&gt;symbol-&amp;gt;string&lt;/code&gt;, records with no mutable fields, and other values
explicitly designated as immutable are immutable objects, while all
objects created by the other procedures listed in this report are
mutable. An attempt to store a new value into a location referred to
by an immutable object should raise an exception with condition type
&lt;code&gt;&amp;amp;assertion&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Literal constants can also share locations. If the same constant
appears in different places in the program then the compiler is
allowed to create a single shared instance of the constant, here
explained as it applies to structures, in the section on &lt;code&gt;eqv?&lt;/code&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Implementations may share structure between constants where
appropriate. Thus the value of &lt;code&gt;eqv?&lt;/code&gt; on constants is sometimes
implementation-dependent.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The illustrative examples are:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;eqv?&lt;/span&gt;&lt;/span&gt; '(a) '(a))         &lt;span class=&quot;comment&quot;&gt;;⇒ unspecified&lt;/span&gt;
(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;eqv?&lt;/span&gt;&lt;/span&gt; &lt;span class=&quot;string&quot;&gt;&quot;a&quot;&lt;/span&gt; &lt;span class=&quot;string&quot;&gt;&quot;a&quot;&lt;/span&gt;)           &lt;span class=&quot;comment&quot;&gt;;⇒ unspecified&lt;/span&gt;
(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;eqv?&lt;/span&gt;&lt;/span&gt; '(b) (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;cdr&lt;/span&gt;&lt;/span&gt; '(a b))) &lt;span class=&quot;comment&quot;&gt;;⇒ unspecified&lt;/span&gt;
(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let&lt;/span&gt;&lt;/span&gt; ((&lt;span class=&quot;name&quot;&gt;x&lt;/span&gt; '(a)))
  (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;eqv?&lt;/span&gt;&lt;/span&gt; x x))            &lt;span class=&quot;comment&quot;&gt;;⇒ #t&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So when it comes to literal constants, Scheme’s normal storage rules
do not apply. A program might find that two different locations have
become the same location, so that changing the value in a quoted
vector ends up changing the value in another quoted vector. It’s also
very likely that the program gets an exception when it tries to change
the value in such a location. The last example shows that going the
other way is not allowed: the compiler is not allowed to create two
different versions of the list &lt;code&gt;(a)&lt;/code&gt; in that program.&lt;/p&gt;
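&lt;p&gt;A small sketch of the difference, with made-up variable names. The first
comparison is implementation-dependent because the two quoted vectors may or
may not have been merged into one; the second is always &lt;code&gt;#f&lt;/code&gt; because
&lt;code&gt;vector&lt;/code&gt; returns newly allocated vectors:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (rnrs))

(define v1 '#(0 0 0))       ; two literal constants that look the same
(define v2 '#(0 0 0))
(define w1 (vector 0 0 0))  ; two vectors allocated at run time
(define w2 (vector 0 0 0))

(display (eqv? v1 v2))      ; #t or #f, depending on whether constants are shared
(newline)
(display (eqv? w1 w2))      ; prints #f
(newline)
&lt;/code&gt;&lt;/pre&gt;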
&lt;h2 id=&quot;referential-transparency&quot;&gt;Referential transparency&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Referential transparency&lt;/em&gt; is a concept that is important in purely
functional programming languages. An expression that is referentially
transparent can be replaced by the values it returns.&lt;/p&gt;
&lt;p&gt;Some expressions in Scheme are referentially transparent: constants,
references to variables that are never mutated, arithmetic in general,
type predicates, etc. A Scheme compiler is allowed to replace &lt;code&gt;(+ 1
2)&lt;/code&gt; with &lt;code&gt;3&lt;/code&gt;. It doesn’t matter that the program might actually have
returned a “different” &lt;code&gt;3&lt;/code&gt; each time the expression runs. In the same
way it doesn’t matter if the compiler turns two different constants
into the same constant.&lt;/p&gt;
&lt;p&gt;Most parts of Scheme are not referentially transparent. As an example,
a Scheme compiler cannot replace &lt;code&gt;(vector 1 2 3)&lt;/code&gt; with &lt;code&gt;&amp;#39;#(1 2 3)&lt;/code&gt;. The
locations created by the vector procedure need to be fresh and
mutable. But it can replace &lt;code&gt;(vector-ref &amp;#39;#(1 2 3) 0)&lt;/code&gt; with &lt;code&gt;1&lt;/code&gt;, so
this expression is referentially transparent. And as we previously
saw, it can replace &lt;code&gt;(cdr &amp;#39;(a b))&lt;/code&gt; with &lt;code&gt;&amp;#39;(b)&lt;/code&gt;.&lt;/p&gt;
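&lt;p&gt;To see why the first replacement is off limits, consider this hypothetical
procedure &lt;code&gt;bump&lt;/code&gt;, which mutates the vector it has just created. With
a fresh vector, every call returns 1. If a compiler substituted the literal
&lt;code&gt;&amp;#39;#(0)&lt;/code&gt;, the store would either raise an exception or, in an
implementation that does not protect constants, persist between calls so that
the second call returns 2:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (rnrs))

(define (bump)
  (let ((v (vector 0)))                       ; fresh vector on every call
    (vector-set! v 0 (+ 1 (vector-ref v 0)))  ; mutate it in place
    (vector-ref v 0)))

(display (bump))   ; prints 1
(newline)
(display (bump))   ; prints 1 again, since the previous vector is not reused
(newline)
&lt;/code&gt;&lt;/pre&gt;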
&lt;p&gt;All of this should be fairly widely known, but now comes the
interesting part.&lt;/p&gt;
&lt;h1 id=&quot;there-is-a-crack-in-everything&quot;&gt;There is a crack in everything&lt;/h1&gt;
&lt;p&gt;So far I have explained Scheme’s notion of locations, referential
transparency and that the rules are different for literal constants.&lt;/p&gt;
&lt;p&gt;Behold this &lt;a href=&quot;https://weinholt.se/scheme/r6rs/r6rs-Z-H-14.html#node_idx_770&quot;&gt;hidden gem&lt;/a&gt; in R6RS:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A quasiquote expression may return either fresh, mutable objects or
literal structure for any structure that is constructed at run time
during the evaluation of the expression. Portions that do not need to
be rebuilt are always literal. Thus,&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let&lt;/span&gt;&lt;/span&gt; ((&lt;span class=&quot;name&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;3&lt;/span&gt;)) `((&lt;span class=&quot;name&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;) ,a ,&lt;span class=&quot;number&quot;&gt;4&lt;/span&gt; ,&lt;span class=&quot;symbol&quot;&gt;'five&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;6&lt;/span&gt;))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;may be equivalent to either of the following expressions:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;'((&lt;span class=&quot;number&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;) &lt;span class=&quot;number&quot;&gt;3&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;4&lt;/span&gt; five &lt;span class=&quot;number&quot;&gt;6&lt;/span&gt;)

(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let&lt;/span&gt;&lt;/span&gt; ((&lt;span class=&quot;name&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;3&lt;/span&gt;))
   (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;cons&lt;/span&gt;&lt;/span&gt; '(&lt;span class=&quot;number&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;)
         (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;cons&lt;/span&gt;&lt;/span&gt; a (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;cons&lt;/span&gt;&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;4&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;cons&lt;/span&gt;&lt;/span&gt; &lt;span class=&quot;symbol&quot;&gt;'five&lt;/span&gt; '(&lt;span class=&quot;number&quot;&gt;6&lt;/span&gt;))))))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;However, it is not equivalent to this expression:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let&lt;/span&gt;&lt;/span&gt; ((&lt;span class=&quot;name&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;3&lt;/span&gt;)) (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;list&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;list&lt;/span&gt;&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;) a &lt;span class=&quot;number&quot;&gt;4&lt;/span&gt; &lt;span class=&quot;symbol&quot;&gt;'five&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;6&lt;/span&gt;))
&lt;/code&gt;&lt;/pre&gt;
&lt;/blockquote&gt;
&lt;p&gt;This part of R6RS originally came from Kent
Dybvig’s &lt;a href=&quot;http://www.r6rs.org/r6rs/formal-comments/comment-204.txt&quot;&gt;formal comment #204&lt;/a&gt;. The same type of language was
adopted in R7RS.&lt;/p&gt;
&lt;p&gt;The meaning is that a quasiquoted expression can be turned into a
literal, or parts may be turned into literals. Where there was code in
the quasiquote expression, there can now be a literal. Going the other
direction is not allowed: literals cannot be turned into code that
returns fresh, mutable structure. But as the example &lt;code&gt;&amp;#39;((1 2) 3 4 five
6)&lt;/code&gt; shows, a compiler is even allowed to propagate constants into
quasiquote.&lt;/p&gt;
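&lt;p&gt;One way to observe this from inside a program is to evaluate the same
quasiquote expression twice and compare the parts that did not need to be
rebuilt. The procedure &lt;code&gt;make-row&lt;/code&gt; is made up for this sketch, and the
&lt;code&gt;eq?&lt;/code&gt; results are not promised by the report; they merely tend to be
&lt;code&gt;#t&lt;/code&gt; when the literal parts are shared between evaluations:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (rnrs))

(define (make-row a)
  `((1 2) ,a 6))

(define x (make-row 3))
(define y (make-row 4))

(display x) (newline)   ; prints ((1 2) 3 6)
(display y) (newline)   ; prints ((1 2) 4 6)

;; The sublist (1 2) and the tail (6) do not need to be rebuilt, so both
;; results may refer to the very same literal pairs.
(display (eq? (car x) (car y))) (newline)
(display (eq? (cddr x) (cddr y))) (newline)
&lt;/code&gt;&lt;/pre&gt;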
&lt;p&gt;There is a very deep rabbit hole here! Have a look again: &lt;strong&gt;return […]
literal structure&lt;/strong&gt; &lt;em&gt;for any structure that is constructed at&lt;/em&gt; &lt;strong&gt;run
time&lt;/strong&gt; &lt;em&gt;during the evaluation of the expression&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;There is now a way to construct literals from run time code, but to do
so ahead of run time.&lt;/p&gt;
&lt;h1 id=&quot;literal-magic&quot;&gt;Literal magic&lt;/h1&gt;
&lt;p&gt;Let me demonstrate the power of Scheme’s magic quasiquote. Let &lt;code&gt;--&amp;gt;&lt;/code&gt;
mean “equivalent to”. It can be the result of an expansion or another
compiler pass, such as a partial evaluator. Here is the original,
innocuous-looking example:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let&lt;/span&gt;&lt;/span&gt; ((&lt;span class=&quot;name&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;3&lt;/span&gt;)) `((&lt;span class=&quot;name&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;) ,a ,&lt;span class=&quot;number&quot;&gt;4&lt;/span&gt; ,&lt;span class=&quot;symbol&quot;&gt;'five&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;6&lt;/span&gt;))
&lt;span class=&quot;comment&quot;&gt;; --&amp;gt;&lt;/span&gt;
'((&lt;span class=&quot;number&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;) &lt;span class=&quot;number&quot;&gt;3&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;4&lt;/span&gt; five &lt;span class=&quot;number&quot;&gt;6&lt;/span&gt;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Can we get literal structure copied into the constant part? Easy:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let&lt;/span&gt;&lt;/span&gt; ((&lt;span class=&quot;name&quot;&gt;a&lt;/span&gt; '(&lt;span class=&quot;number&quot;&gt;3&lt;/span&gt;))) `((&lt;span class=&quot;name&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;) ,a ,&lt;span class=&quot;number&quot;&gt;4&lt;/span&gt; ,&lt;span class=&quot;symbol&quot;&gt;'five&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;6&lt;/span&gt;))
&lt;span class=&quot;comment&quot;&gt;; --&amp;gt;&lt;/span&gt;
'((&lt;span class=&quot;number&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;) (&lt;span class=&quot;number&quot;&gt;3&lt;/span&gt;) &lt;span class=&quot;number&quot;&gt;4&lt;/span&gt; five &lt;span class=&quot;number&quot;&gt;6&lt;/span&gt;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;But we’re just getting started. Can we construct a structure at
runtime and have that appear as a constant? Of course!&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;`((1 2) ,(list 3) ,4 ,'five 6)
;; --&amp;gt;
'((1 2) (3) 4 five 6)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Those Schemers who are paying attention will be thinking I’ve gone mad
now. Maybe I have, but this example simply &lt;em&gt;returned literal structure
for the (list) structure that was constructed at run time during the
evaluation of the expression&lt;/em&gt;, to paraphrase the quasiquote
specification. Let’s increase the volume:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;`((1 2) ,(map + '(1 1) '(2 3)) ,'five 6)
;; --&amp;gt;
'((1 2) (3 4) five 6)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Hang on, shouldn’t &lt;code&gt;map&lt;/code&gt; return a fresh, mutable list? Not anymore,
this is quasiquote. The &lt;code&gt;map&lt;/code&gt; function constructs a list at run time
during the evaluation of the quasiquote expression, so the structure
no longer needs to be fresh. (Besides, the R6RS and R7RS definitions
of &lt;code&gt;map&lt;/code&gt; do not actually say that the list needs to be fresh and
mutable, but everyone probably assumes it does.)&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;letrec&lt;/span&gt;&lt;/span&gt; ((&lt;span class=&quot;name&quot;&gt;iota&lt;/span&gt;
          (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; (n)
            (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let&lt;/span&gt;&lt;/span&gt; lp ((&lt;span class=&quot;name&quot;&gt;m&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;) (&lt;span class=&quot;name&quot;&gt;n&lt;/span&gt; n))
              (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;if&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;&amp;gt;=&lt;/span&gt;&lt;/span&gt; m n)
                  '()
                  (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;cons&lt;/span&gt;&lt;/span&gt; m (&lt;span class=&quot;name&quot;&gt;lp&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;+&lt;/span&gt;&lt;/span&gt; m &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;) n)))))))
  `((&lt;span class=&quot;name&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;) ,(&lt;span class=&quot;name&quot;&gt;iota&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;4&lt;/span&gt;) ,&lt;span class=&quot;symbol&quot;&gt;'five&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;6&lt;/span&gt;))
&lt;span class=&quot;comment&quot;&gt;;; --&amp;gt;&lt;/span&gt;
'((&lt;span class=&quot;number&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;) (&lt;span class=&quot;number&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;3&lt;/span&gt;) five &lt;span class=&quot;number&quot;&gt;6&lt;/span&gt;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Is this valid? I think most would say it isn’t; the list created by
&lt;code&gt;iota&lt;/code&gt; is constructed with &lt;code&gt;cons&lt;/code&gt;, which returns fresh, mutable pairs.
But this is happening inside quasiquote where the normal rules of
society break down. The &lt;code&gt;iota&lt;/code&gt; procedure constructs a list structure
at runtime during the evaluation of a quasiquote expression, so a
compiler is allowed to return literal structure for that list
structure.&lt;/p&gt;
&lt;h1 id=&quot;these-go-to-11&quot;&gt;These go to 11&lt;/h1&gt;
&lt;p&gt;Let’s crank it up and make quasiquote transform code.&lt;/p&gt;
&lt;p&gt;Compilers like Chez Scheme, Loko Scheme, Guile Scheme and many others
use partial evaluators to perform source-level optimizations. A
partial evaluator runs code symbolically. The output is a new program
that is hopefully more efficient than the program that went into it.&lt;/p&gt;
&lt;p&gt;The partial evaluators used by Scheme compilers are pretty mild as far
as partial evaluators go, mostly because of the semantics of Scheme.
Doing more powerful transformations on Scheme programs would require
quite powerful static analysis, and that is both slow and difficult.&lt;/p&gt;
&lt;p&gt;To get quasiquote to work with code, we need something that enables a
partial evaluator to run a given procedure in such a way that it’s
always inside a quasiquote expression. If we have such a procedure
then the partial evaluator can start using the tricks described above,
and treat all code that constructs new structures as if their
structures were literals. This makes the partial evaluator very happy,
so here is &lt;code&gt;happy&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;happy&lt;/span&gt; proc)
  (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; &lt;span class=&quot;name&quot;&gt;x&lt;/span&gt;
    `,(apply proc x)))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In a normal Scheme implementation this operator doesn’t do much more
than maybe waste a little time and space. But in a Scheme that knows
the magic nature of quasiquote, it would enable powerful program
transformations on lists and vectors, without the need for as much
analysis as it would normally require. In particular, it should no
longer be necessary to analyze if intermediate results are mutated,
nor to analyze if programs check results for pointer equality with
&lt;code&gt;eq?&lt;/code&gt;.&lt;/p&gt;
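&lt;p&gt;In an implementation that gives quasiquote no special treatment, wrapping a
procedure in &lt;code&gt;happy&lt;/code&gt; should not change its results, as this sketch
tries to show. It repeats the definition so that it is self-contained;
&lt;code&gt;happy-list&lt;/code&gt; and &lt;code&gt;happy-sum&lt;/code&gt; are names invented here:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (rnrs))

(define (happy proc)
  (lambda x
    `,(apply proc x)))

(define happy-list (happy list))
(define happy-sum (happy +))

(display (happy-list 1 2 3))   ; prints (1 2 3)
(newline)
(display (happy-sum 1 2 3))    ; prints 6
(newline)
;; Under the reading above, the list returned by happy-list may be literal
;; structure, so a portable caller should not mutate it.
&lt;/code&gt;&lt;/pre&gt;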
&lt;p&gt;Here is an illustrative example of a potential program transformation,
based on Philip Wadler’s 1990 paper &lt;em&gt;Deforestation: Transforming
programs to eliminate trees&lt;/em&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;upto&lt;/span&gt; m n)
  (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;if&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt; m n)
      '()
      (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;cons&lt;/span&gt;&lt;/span&gt; m (&lt;span class=&quot;name&quot;&gt;upto&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;+&lt;/span&gt;&lt;/span&gt; m &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;) n))))

(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;square&lt;/span&gt; x)
  (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;*&lt;/span&gt;&lt;/span&gt; x x))

(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;sum&lt;/span&gt; xs)
  (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let&lt;/span&gt;&lt;/span&gt; sum* ((&lt;span class=&quot;name&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;) (&lt;span class=&quot;name&quot;&gt;xs&lt;/span&gt; xs))
    (&lt;span class=&quot;name&quot;&gt;match&lt;/span&gt; xs
      [() a]
      [(&lt;span class=&quot;name&quot;&gt;x&lt;/span&gt; . xs) (&lt;span class=&quot;name&quot;&gt;sum*&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;+&lt;/span&gt;&lt;/span&gt; a x) xs)])))

(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;squares&lt;/span&gt; xs)
  (&lt;span class=&quot;name&quot;&gt;match&lt;/span&gt; xs
    [() '()]
    [(&lt;span class=&quot;name&quot;&gt;x&lt;/span&gt; . xs) (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;cons&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;square&lt;/span&gt; x) (&lt;span class=&quot;name&quot;&gt;squares&lt;/span&gt; xs))]))

(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; f
  (&lt;span class=&quot;name&quot;&gt;happy&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; (n) (&lt;span class=&quot;name&quot;&gt;sum&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;squares&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;upto&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt; n))))))
&lt;span class=&quot;comment&quot;&gt;;; --&amp;gt;&lt;/span&gt;
(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; f
  (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; (n)
    (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;letrec&lt;/span&gt;&lt;/span&gt; ((&lt;span class=&quot;name&quot;&gt;h0&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; (u0 u1 n)
                   (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;if&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;&amp;gt;&lt;/span&gt;&lt;/span&gt; u1 n)
                       u0
                       (&lt;span class=&quot;name&quot;&gt;h0&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;+&lt;/span&gt;&lt;/span&gt; u0 (&lt;span class=&quot;name&quot;&gt;square&lt;/span&gt; u1)) (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;+&lt;/span&gt;&lt;/span&gt; u1 &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;) n)))))
      (&lt;span class=&quot;name&quot;&gt;h0&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt; n))))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After &lt;em&gt;deforestation&lt;/em&gt; (or &lt;em&gt;fusion&lt;/em&gt;), the intermediate lists used in
&lt;code&gt;f&lt;/code&gt; have been eliminated. This is beneficial in that you can write
high-level code, but still have the compiler produce the efficient
loop you would have had to write by hand. Scheme compilers normally
don’t do these transformations due to the required analysis.&lt;/p&gt;
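&lt;p&gt;As a quick sanity check of the example, the sketch below restates the
pipeline without &lt;code&gt;match&lt;/code&gt; (which is not part of R6RS), using
&lt;code&gt;car&lt;/code&gt; and &lt;code&gt;cdr&lt;/code&gt; instead, and places it next to the fused
loop. The names &lt;code&gt;f-pipeline&lt;/code&gt; and &lt;code&gt;f-fused&lt;/code&gt; are made up for
this comparison:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (rnrs))

(define (square x) (* x x))

(define (upto m n)
  (if (&amp;gt; m n)
      '()
      (cons m (upto (+ m 1) n))))

(define (squares xs)
  (if (null? xs)
      '()
      (cons (square (car xs)) (squares (cdr xs)))))

(define (sum xs)
  (let sum* ((a 0) (xs xs))
    (if (null? xs)
        a
        (sum* (+ a (car xs)) (cdr xs)))))

;; Builds two intermediate lists before summing.
(define (f-pipeline n)
  (sum (squares (upto 1 n))))

;; The fused loop from the example: no intermediate lists.
(define (f-fused n)
  (let h0 ((u0 0) (u1 1))
    (if (&amp;gt; u1 n)
        u0
        (h0 (+ u0 (square u1)) (+ u1 1)))))

(display (f-pipeline 10))   ; prints 385
(newline)
(display (f-fused 10))      ; prints 385
(newline)
&lt;/code&gt;&lt;/pre&gt;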
&lt;p&gt;The &lt;code&gt;happy&lt;/code&gt; operator does not completely open the barn doors: the
transformation must still preserve the program’s other side effects,
such as I/O and exceptions.&lt;/p&gt;
&lt;h1 id=&quot;the-fly-in-the-ointment&quot;&gt;The fly in the ointment&lt;/h1&gt;
&lt;p&gt;Imagine a program written according to this template:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;main&lt;/span&gt;)
  ...)

(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; main* (&lt;span class=&quot;name&quot;&gt;happy&lt;/span&gt; main))

(&lt;span class=&quot;name&quot;&gt;main*&lt;/span&gt;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;What are the consequences for the main program? It would seem that
everything in it follows the rules of quasiquote and it can’t use
&lt;code&gt;cons&lt;/code&gt; in the normal way. This is bad news for the main program.&lt;/p&gt;
&lt;h1 id=&quot;conclusions&quot;&gt;Conclusions&lt;/h1&gt;
&lt;p&gt;I don’t know where this leads. What is the precise limit for what a
compiler can and can’t turn into literal structure in quasiquote? The
example with the main program makes it seem that quasiquote actually
gives the compiler a bit too much freedom.&lt;/p&gt;
&lt;p&gt;Perhaps it’s actually just poor wording, so R6RS and R7RS will get a
new erratum that clarifies what is and what isn’t allowed. I suspect
that this is the most likely outcome.&lt;/p&gt;
&lt;p&gt;But it doesn’t stop someone who is working on a partial evaluator or
another program transformation from proposing a &lt;code&gt;happy&lt;/code&gt; operator as a
SRFI, giving it semantics that enable even more powerful
transformations, but without the need to rely on language lawyering.&lt;/p&gt;
&lt;p&gt;There is one conclusion I can draw from all this: don’t assume that
what comes out of quasiquote can be mutated.&lt;/p&gt;
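&lt;p&gt;If a program does need to mutate the result of a quasiquote expression, one
defensive option is to copy it first. A minimal sketch, assuming a helper
&lt;code&gt;copy-tree&lt;/code&gt; (not a standard procedure; it is defined here) that copies
pairs only and leaves vectors and other objects shared:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (rnrs) (rnrs mutable-pairs))

;; Recursively copies pairs; everything else is returned as is.
(define (copy-tree x)
  (if (pair? x)
      (cons (copy-tree (car x)) (copy-tree (cdr x)))
      x))

(define template `((1 2) ,(+ 1 2) 6))   ; may be partly or wholly literal
(define mutable (copy-tree template))   ; built with cons, so fresh and mutable

(set-car! (car mutable) 'one)
(display mutable)    ; prints ((one 2) 3 6)
(newline)
(display template)   ; prints ((1 2) 3 6)
(newline)
&lt;/code&gt;&lt;/pre&gt;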
</description>
    </item>
    <item>
      <title>Loko Scheme 0.4.3</title>
      <link>https://weinholt.se/articles/loko-scheme-0-4-3/</link>
      <pubDate>Mon, 02 Mar 2020 01:00:00 +0100</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/loko-scheme-0-4-3/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;&lt;a href=&quot;https://scheme.fail&quot;&gt;Loko Scheme&lt;/a&gt; 0.4.3 is now out with a few
important fixes, new features and network card drivers for eepro100,
rtl8139, virtio net and Linux tuntap devices. The &lt;code&gt;include&lt;/code&gt; form is
now available and &lt;code&gt;#u8()&lt;/code&gt; is recognized. Hashtables are written using
Racket’s &lt;code&gt;#hasheq()&lt;/code&gt; syntax, and cycles are handled while printing
records.&lt;/p&gt;
&lt;p&gt;Read more about the drivers in the companion article
&lt;a href=&quot;https://weinholt.se/articles/drivers-loko-scheme&quot;&gt;Device Drivers in Loko Scheme&lt;/a&gt;.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>A New R6RS Scheme Compiler</title>
      <link>https://weinholt.se/articles/new-r6rs-compiler/</link>
      <pubDate>Wed, 02 Oct 2019 02:00:00 +0200</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/new-r6rs-compiler/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;Some readers already know this and a few have suspected. I’ve been
working on a new R6RS Scheme compiler for a while. Now I have released
it as free software. Read on to learn the many wonderful drawbacks of
this niche compiler.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;I will start with what many will find to be the largest drawback, so
that those of you who don’t want it can close this tab right away and
never look back (but don’t close the tab!). The compiler is licensed
under the GNU &lt;strong&gt;Affero&lt;/strong&gt; General Public License, version 3 or later.
I chose this license, not only to promote chaos and disorder, but also
because of where I see technology and society heading.&lt;/p&gt;
&lt;p&gt;With that out of the way, here are some questionable facts about Loko
Scheme:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;You can download it from the &lt;a href=&quot;https://scheme.fail/&quot;&gt;Loko Scheme web site&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It is written in R6RS Scheme and a wafer thin amount of assembly.
Once it has been bootstrapped it can self-compile. There is no C
code in the compiler or the runtime.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It generates code for the AMD64 instruction set.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It has a few optimization passes, such as &lt;em&gt;Fixing Letrec
(reloaded)&lt;/em&gt;, the inliner &lt;em&gt;cp0&lt;/em&gt; (also used in Chez Scheme and Ikarus
Scheme) and a low-level instruction level optimizer. Some code runs
really fast, but most code runs just okay.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It outputs statically compiled binaries only, although there is also
an interpreter.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The binaries are simultaneously Linux ELF binaries and multiboot
binaries for bare hardware.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The Linux port of Loko starts in just 3 ms on my machine.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It has concurrency based on Concurrent ML with an API surface mostly
nicked from fibers for Guile. I/O is non-blocking by default. If
you’re familiar with Golang then this part will feel familiar.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Most SRFIs from &lt;a href=&quot;https://akkuscm.org/packages/chez-srfi/&quot;&gt;chez-srfi&lt;/a&gt; are supported and there’s an
early POSIX library for the Linux port based on the current SRFI 170
draft.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id=&quot;why-loko&quot;&gt;Why Loko&lt;/h1&gt;
&lt;p&gt;Why not? If you can live with the license (which really isn’t as bad
as you might think), then Loko is definitely for you. If one looks at
how the GPL has worked out for Linux, I think it will be okay. Linux’s
license doesn’t extend to user space and I want that aspect to work
the same for Loko.&lt;/p&gt;
&lt;p&gt;My original use case for Loko Scheme is experimental operating system
development. Forget about all legacy software and build your own
kernel, with blackjack and hookers, so to speak. Due to the nature of
that kind of work, I think it will necessarily be useful for more
things.&lt;/p&gt;
&lt;p&gt;Suppose that you can compile a Scheme program on your machine and have
a guarantee that it will work on all other Linux AMD64 systems. No
confusion with glibc vs musl vs whatever. You also have access to
non-blocking I/O, a concurrency library and direct syscalls. What can
you build with that?&lt;/p&gt;
&lt;p&gt;There’s a directory in the Loko repository with the following samples,
which might give you an idea of where Loko is today:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;bga-graphics - this is a simple program that uses the linear
framebuffer of the Bochs graphics adapter (available in QEMU), reads
a 3D model and renders it on the screen.&lt;/li&gt;
&lt;li&gt;etherdump - simple driver for an RTL8139 networking chip combined
with a text-mode based Ethernet frame dumper.&lt;/li&gt;
&lt;li&gt;hello - it’s just Hello World as a library, that runs on Linux or
on bare hardware (printing to the serial port)&lt;/li&gt;
&lt;li&gt;lspci - scans the devices on the PCI bus, prints the register
locations, IRQs and option ROM sizes, and uses the PCI ID database
to print the name of the devices and the vendors&lt;/li&gt;
&lt;li&gt;web-server - this Linux program sets up a concurrent web server that
responds with a static payload. Pretty simplistic code, but not too
far away from a working web server. Handles maybe 15k requests per
second.&lt;/li&gt;
&lt;/ul&gt;
&lt;h1 id=&quot;future-direction&quot;&gt;Future direction&lt;/h1&gt;
&lt;p&gt;In the future I think there will be ports to more kernels, probably
NetBSD and FreeBSD, and more instruction sets. I’d like to try to port
it to AArch64 myself unless someone gets there before me.&lt;/p&gt;
&lt;p&gt;But before that, I will work on more operating system stuff. There’s
an experimental USB stack. I’ve got some code that reads a file from a
FAT file system on a USB stick, but it’s very slow in its current
manifestation. I just recently added a buddy allocator and fibers and
just haven’t had time to write more drivers.&lt;/p&gt;
&lt;p&gt;Kernel programming with Loko is not as difficult as regular kernel
programming. The concurrency model of kernel code is the same as the one
for user programs. There is no need to care about writing special code
safe for interrupt contexts. If you’re a Scheme programmer then you
can probably already do kernel programming with Loko; you just don’t
know it yet.&lt;/p&gt;
&lt;p&gt;I also want to have user space support in Loko. This would mean you
could have a kernel in Scheme that can run regular Unix-like programs.
(It’s not a pipe dream either; it’s actually pretty straightforward.)
This would let you design your own syscall interface for programs.
Loko doesn’t really care about what those syscalls do and doesn’t have
any opinions about file systems, networking and drivers. If you
preferred how file systems worked in TOPS-20, ITS or VMS, then you
could make it work that way.&lt;/p&gt;
&lt;p&gt;Do not cling to the past. Download Loko and start experimenting today!&lt;/p&gt;
&lt;h1 id=&quot;further-reading&quot;&gt;Further reading&lt;/h1&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://scheme.fail/&quot;&gt;Loko Scheme’s web site&lt;/a&gt; has links to
downloads and so on.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://weinholt.se/articles/alignment-check/&quot;&gt;Faster Dynamic Type Checks&lt;/a&gt; describes how Loko
gets hardware-assisted type checks, so that e.g. a safe &lt;code&gt;car&lt;/code&gt; or
&lt;code&gt;cdr&lt;/code&gt; is a single instruction.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://weinholt.se/articles/design-low-tagging-z3py&quot;&gt;Design Your Low-Bit Tagging with Z3Py&lt;/a&gt;
describes how Loko’s value tagging system is designed (the git
repository has the proof code).&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://akkuscm.org/&quot;&gt;Akku.scm&lt;/a&gt; is the Scheme package manager
required when building Loko Scheme. This will also allow for R7RS
support in the future. It would already be working, but CI tests for
akku-r7rs are currently failing for Loko.&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    <item>
      <title>Announcing Akku.scm 1.0.0</title>
      <link>https://weinholt.se/articles/announcing-akku-scm-1-0-0/</link>
      <pubDate>Fri, 26 Jul 2019 02:00:00 +0200</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/announcing-akku-scm-1-0-0/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;I am happy to announce the general availability of &lt;a href=&quot;https://akkuscm.org/&quot;&gt;Akku.scm&lt;/a&gt;
1.0.0, a language package manager for R6RS and R7RS Scheme. It can
be downloaded from &lt;a href=&quot;https://gitlab.com/akkuscm/akku/releases&quot;&gt;GitLab&lt;/a&gt; and &lt;a href=&quot;https://github.com/weinholt/akku/releases&quot;&gt;GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;Akku is a package manager with features specially designed for Scheme.
The library systems of R6RS and R7RS, where libraries are fully
self-describing, make it possible to automatically analyze source code to
find libraries and imports.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://asciinema.org/a/259410&quot;&gt;&lt;img src=&quot;https://asciinema.org/a/259410.svg&quot; alt=&quot;asciicast&quot;&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Akku installs libraries to a per-project library directory that works
across all supported Scheme implementations. On top of this there is a
traditional package index and a dependency solver. Packages are
manually vetted before they are published in the index.&lt;/p&gt;
&lt;p&gt;To increase the portability of R7RS code, Akku also performs an
automatic conversion from the R7RS &lt;code&gt;define-library&lt;/code&gt; form to the R6RS
&lt;code&gt;library&lt;/code&gt; form.&lt;/p&gt;
&lt;p&gt;Akku supports Chez Scheme, Chibi Scheme, GNU Guile, Gauche Scheme,
Ikarus Scheme, IronScheme, Larceny Scheme, Loko Scheme, Mosh Scheme,
Racket (plt-r6rs), Sagittarius Scheme, Vicare Scheme and Ypsilon
Scheme. It has been tested on Cygwin, FreeBSD, GNU/Linux, MSYS2,
OpenBSD and macOS.&lt;/p&gt;
&lt;p&gt;Akku has been in development for 21 months.&lt;/p&gt;
&lt;h2 id=&quot;further-reading&quot;&gt;Further reading&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://akkuscm.org&quot;&gt;The Akku website&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://akkuscm.org/packages&quot;&gt;The Akku package list&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://gitlab.com/akkuscm/akku/wikis/FAQ&quot;&gt;The Akku FAQ&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    <item>
      <title>Terminfo and its DSL</title>
      <link>https://weinholt.se/articles/terminfo-and-its-dsl/</link>
      <pubDate>Tue, 05 Feb 2019 01:00:00 +0100</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/terminfo-and-its-dsl/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;Programs for Linux that run in the terminal often use color. There are
a few approaches to making this work. Many programs use hardcoded ANSI
compatible escape sequences, which are widespread enough today that
they work almost everywhere. There are drawbacks to hardcoding these
and for that reason there’s a database called &lt;em&gt;terminfo&lt;/em&gt;, which has
its own stack-based Domain Specific Language (DSL).&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;!--

If you don't want to use terminfo then a good reference is
the [*console_codes*][codes](4) manpage. You can start to use these
escape sequences immediately:

 [codes]: http://manpages.ubuntu.com/manpages/xenial/en/man4/console_codes.4.html

```sh
$ echo -e '\e[1;33;44mYellow on blue\e[m'
```

The first escape sequence, starting with `\e`, turns on a yellow
foreground color (`1;33`) and a blue background (`44`). The second
escape sequence disables these attributes. This is described in the
manpage. These codes work in most terminals today.

--&gt;
&lt;p&gt;The terminfo database has entries for most terminals that were ever
made, even very obscure brands. My machine has 1743 entries in
&lt;code&gt;/lib/terminfo&lt;/code&gt; and &lt;code&gt;/usr/share/terminfo&lt;/code&gt;. Terminfo provides a
standardized interface towards all these terminals, including future
terminals.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;img src=&quot;giFTcurs-0.6.2.png&quot; alt=&quot;giFTcurs 0.6.2 screenshot&quot;&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The terminfo database is also used by ncurses, a library for making
programs like the one shown above. These programs will work more or
less the same on all terminals that support cursor movement. If there
is no color support then they are monochrome, but still they work.
Terminfo has a standard set of booleans and numbers that tell you how
many colors the terminal supports, how many rows and columns it has
(by default), whether it supports a mouse and also which quirks it
has.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-sh&quot;&gt;$ &lt;span class=&quot;built_in&quot;&gt;echo&lt;/span&gt; &lt;span class=&quot;string&quot;&gt;&quot;&lt;span class=&quot;variable&quot;&gt;$(tput bold; tput setaf 3;
tput setab 4)&lt;/span&gt;Yellow on blue&lt;span class=&quot;variable&quot;&gt;$(tput sgr0)&lt;/span&gt;&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;tput&lt;/code&gt; tool uses the terminfo library to generate escape sequences
for the current terminal, which it finds through the &lt;code&gt;TERM&lt;/code&gt;
environment variable. &lt;!-- The above command is equivalent to the example --&gt;
&lt;!-- from earlier. --&gt;&lt;/p&gt;
&lt;h1 id=&quot;a-dumb-terminfo-entry&quot;&gt;A dumb terminfo entry&lt;/h1&gt;
&lt;p&gt;The &lt;code&gt;infocmp&lt;/code&gt; tool can inspect terminfo entries:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-sh&quot;&gt;$ TERM=dumb infocmp -x
dumb|80-column dumb tty,
    am,
    cols&lt;span class=&quot;comment&quot;&gt;#80,&lt;/span&gt;
    bel=^G, cr=\r, cud1=\n, ind=\n,
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;All terminfo entries consist of named fields (called capabilities)
that are either booleans, numbers or strings. The names are short and
cryptic, although there is a longer version available as well.
The &lt;a href=&quot;http://manpages.ubuntu.com/manpages/xenial/en/man5/terminfo.5.html&quot;&gt;&lt;em&gt;terminfo&lt;/em&gt;&lt;/a&gt;(5) manpage has a short description of all
fields, e.g. &lt;code&gt;cud1&lt;/code&gt; means &lt;code&gt;cursor_down&lt;/code&gt; and contains the bytes that
move the cursor down one line. Unsurprisingly, it’s a newline
character. A prerequisite is that the terminal is in raw mode, so the
kernel does not translate newline to newline + carriage return as it
would normally do.&lt;/p&gt;
&lt;p&gt;There is also support for extended capabilities, which are not in the
pre-defined list. The difference is basically that the binary terminfo
format then has to explicitly encode the name of the field, whereas
otherwise it is implicit by its location in the file. Terminfo can use
these to encode new and interesting capabilities, like support for
24-bit color.&lt;/p&gt;
&lt;h1 id=&quot;an-advanced-entry&quot;&gt;An advanced entry&lt;/h1&gt;
&lt;p&gt;Let’s go from dumb to advanced. The entry below is for the DEC VT100
(the 24-bit color variant of xterm comes up later). It’s big and cryptic!&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-sh&quot;&gt;$ TERM=vt100 infocmp -x
&lt;span class=&quot;comment&quot;&gt;#    Reconstructed via infocmp from file: /lib/terminfo/v/vt100&lt;/span&gt;
vt100|vt100-am|dec vt100 (w/advanced video),
    OTbs, am, mc5i, msgr, xenl, xon,
    cols&lt;span class=&quot;comment&quot;&gt;#80, it#8, lines#24, vt#3,&lt;/span&gt;
    acsc=``aaffggjjkkllmmnnooppqqrrssttuuvvwwxxyyzz{{||}}~~,
    bel=^G, blink=\E[5m$&amp;lt;2&amp;gt;, bold=\E[1m$&amp;lt;2&amp;gt;,
    clear=\E[H\E[J$&amp;lt;50&amp;gt;, cr=\r, csr=\E[%i%p1%d;%p2%dr,
    cub=\E[%p1%dD, cub1=^H, cud=\E[%p1%dB, cud1=\n,
    cuf=\E[%p1%dC, cuf1=\E[C$&amp;lt;2&amp;gt;,
    cup=\E[%i%p1%d;%p2%dH$&amp;lt;5&amp;gt;, cuu=\E[%p1%dA,
    cuu1=\E[A$&amp;lt;2&amp;gt;, ed=\E[J$&amp;lt;50&amp;gt;, el=\E[K$&amp;lt;3&amp;gt;, el1=\E[1K$&amp;lt;3&amp;gt;,
    enacs=\E(B\E)0, home=\E[H, ht=^I, hts=\EH, ind=\n, ka1=\EOq,
    ka3=\EOs, kb2=\EOr, kbs=^H, kc1=\EOp, kc3=\EOn, kcub1=\EOD,
    kcud1=\EOB, kcuf1=\EOC, kcuu1=\EOA, kent=\EOM, kf0=\EOy,
    kf1=\EOP, kf10=\EOx, kf2=\EOQ, kf3=\EOR, kf4=\EOS, kf5=\EOt,
    kf6=\EOu, kf7=\EOv, kf8=\EOl, kf9=\EOw, lf1=pf1, lf2=pf2,
    lf3=pf3, lf4=pf4, mc0=\E[0i, mc4=\E[4i, mc5=\E[5i, rc=\E8,
    rev=\E[7m$&amp;lt;2&amp;gt;, ri=\EM$&amp;lt;5&amp;gt;, rmacs=^O, rmam=\E[?7l,
    rmkx=\E[?1l\E&amp;gt;, rmso=\E[m$&amp;lt;2&amp;gt;, rmul=\E[m$&amp;lt;2&amp;gt;,
    rs2=\E&amp;lt;\E&amp;gt;\E[?3;4;5l\E[?7;8h\E[r, sc=\E7,
    sgr=\E[0%?%p1%p6%|%t;1%;%?%p2%t;4%;%?%p1%p3%|%t;7%;%?%p4%t;5%;m%?%p9%t\016%e\017%;$&amp;lt;2&amp;gt;,
    sgr0=\E[m\017$&amp;lt;2&amp;gt;, smacs=^N, smam=\E[?7h, smkx=\E[?1h\E=,
    smso=\E[7m$&amp;lt;2&amp;gt;, smul=\E[4m$&amp;lt;2&amp;gt;, tbc=\E[3g,
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Many interesting things are going on here. All the string capabilities
that start with &lt;code&gt;k&lt;/code&gt;, plus a few more, tell you how the terminal
encodes keys. The &lt;code&gt;kcuu1&lt;/code&gt; entry says that the up arrow key sends &lt;code&gt;\EOA&lt;/code&gt;
(i.e. &lt;code&gt;ESC O A&lt;/code&gt;). A quirky thing with this terminal is that it
requires delays after some commands. Those are encoded as e.g. &lt;code&gt;$&amp;lt;5&amp;gt;&lt;/code&gt;,
which is a 5 millisecond delay. The delays are generated by sending a
number of NUL bytes that takes the appropriate time to transmit at the
terminal’s current baud rate (although today they are often omitted).&lt;/p&gt;
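&lt;p&gt;As a rough sketch of that calculation (not the code ncurses uses), the
following Python function computes how many NUL bytes cover a given delay.
The name &lt;code&gt;pad_bytes&lt;/code&gt; and the default of 10 bits per character (one
start bit, eight data bits, one stop bit) are assumptions made for
illustration.&lt;/p&gt;

```python
def pad_bytes(delay_ms, baud, bits_per_char=10):
    """Return the NUL bytes that take at least delay_ms to send at baud."""
    # One character takes bits_per_char / baud seconds on the wire, so the
    # delay needs delay_ms * baud / (bits_per_char * 1000) characters.
    divisor = bits_per_char * 1000
    count = (delay_ms * baud + divisor - 1) // divisor   # round up
    return b'\x00' * count

print(len(pad_bytes(5, 9600)))     # 5 NULs for the 5 ms delay after cup
print(len(pad_bytes(50, 9600)))    # 48 NULs for the 50 ms delay after clear
```

&lt;p&gt;The faster the line, the more bytes are needed for the same delay,
which is why the padding depends on the baud rate.&lt;/p&gt;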
&lt;h1 id=&quot;anatomy-of-a-string&quot;&gt;Anatomy of a string&lt;/h1&gt;
&lt;p&gt;All non-dumb terminals can be told to move the cursor using the &lt;code&gt;cup&lt;/code&gt;
(&lt;em&gt;cursor_address&lt;/em&gt;) capability. When &lt;code&gt;tput&lt;/code&gt; is called as &lt;code&gt;tput cup 5
10&lt;/code&gt; it takes two arguments: the row and the column. In this &lt;code&gt;vt100&lt;/code&gt;
entry this string is fairly simple:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;cup=\E[%i%p1%d;%p2%dH$&amp;lt;5&amp;gt;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;It’s ASCII soup. This is a program written in a DSL. Actually, all the
strings that generate output are written in this DSL, even if they
just output the same bytes every time. I have written a parser and
compiler in R6RS Scheme for this DSL, which you can find in
the &lt;a href=&quot;https://gitlab.com/weinholt/text-mode&quot;&gt;&lt;code&gt;text-mode&lt;/code&gt;&lt;/a&gt; package. Here is the result of tokenizing
the string:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-lisp&quot;&gt;&amp;gt; (&lt;span class=&quot;name&quot;&gt;import&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;text-mode&lt;/span&gt; terminfo parser))
&amp;gt; (&lt;span class=&quot;name&quot;&gt;parse-term-string&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;string-&amp;gt;utf8&lt;/span&gt; &lt;span class=&quot;string&quot;&gt;&quot;\x1b;[%i%p1%d;%p2%dH$&amp;lt;5&amp;gt;&quot;&lt;/span&gt;))
((&lt;span class=&quot;name&quot;&gt;print&lt;/span&gt; #vu8(&lt;span class=&quot;number&quot;&gt;27&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;91&lt;/span&gt;))      &lt;span class=&quot;comment&quot;&gt;;prints ESC [&lt;/span&gt;
 (&lt;span class=&quot;name&quot;&gt;inc-p1/p2&lt;/span&gt;)              &lt;span class=&quot;comment&quot;&gt;;increment parameters 1 and 2&lt;/span&gt;
 (&lt;span class=&quot;name&quot;&gt;parameter&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;)            &lt;span class=&quot;comment&quot;&gt;;push parameter 1 to the stack&lt;/span&gt;
 (&lt;span class=&quot;name&quot;&gt;printf&lt;/span&gt; &lt;span class=&quot;string&quot;&gt;&quot;%d&quot;&lt;/span&gt; #f #f #f #\space #f #f #\d)  &lt;span class=&quot;comment&quot;&gt;;pop and printf as %d&lt;/span&gt;
 (&lt;span class=&quot;name&quot;&gt;print&lt;/span&gt; #vu8(&lt;span class=&quot;number&quot;&gt;59&lt;/span&gt;))         &lt;span class=&quot;comment&quot;&gt;;print ;&lt;/span&gt;
 (&lt;span class=&quot;name&quot;&gt;parameter&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;)            &lt;span class=&quot;comment&quot;&gt;;push parameter 2 to the stack&lt;/span&gt;
 (&lt;span class=&quot;name&quot;&gt;printf&lt;/span&gt; &lt;span class=&quot;string&quot;&gt;&quot;%d&quot;&lt;/span&gt; #f #f #f #\space #f #f #\d)  &lt;span class=&quot;comment&quot;&gt;;pop and printf as %d&lt;/span&gt;
 (&lt;span class=&quot;name&quot;&gt;print&lt;/span&gt; #vu8(&lt;span class=&quot;number&quot;&gt;72&lt;/span&gt;))         &lt;span class=&quot;comment&quot;&gt;;print H&lt;/span&gt;
 (&lt;span class=&quot;name&quot;&gt;msleep&lt;/span&gt; #vu8(&lt;span class=&quot;number&quot;&gt;53&lt;/span&gt;) #f #f)) &lt;span class=&quot;comment&quot;&gt;;sleep for 5 ms&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Parameters passed to the program itself are addressed by their index
(1-9), whereas operations in the program implicitly pop from or push
to an argument stack. Some operations pop their arguments, e.g. the
&lt;code&gt;%d&lt;/code&gt; operation that pops a number and formats it as the &lt;code&gt;printf&lt;/code&gt;
function in the C library would. Other operations push to the stack,
e.g. &lt;code&gt;%p1&lt;/code&gt; that pushes the first parameter.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;%i&lt;/code&gt; operation is an interesting special operation that increments
the first two parameters. This is useful because terminfo (and
termcap, its historical predecessor) uses zero-based indexing for rows
and columns, whereas VT100/ANSI terminals use one-based indexing.&lt;/p&gt;
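&lt;p&gt;To make the walk-through concrete, here is a minimal Python sketch of an
interpreter for just the operations used by this string (&lt;code&gt;%i&lt;/code&gt;,
&lt;code&gt;%p1&lt;/code&gt; to &lt;code&gt;%p9&lt;/code&gt;, &lt;code&gt;%d&lt;/code&gt; and &lt;code&gt;%%&lt;/code&gt;). It is not how
ncurses or &lt;code&gt;text-mode&lt;/code&gt; work; the function name &lt;code&gt;expand&lt;/code&gt; is made
up for illustration and the &lt;code&gt;$&amp;lt;5&amp;gt;&lt;/code&gt; padding is left out.&lt;/p&gt;

```python
def expand(program, *params):
    """Expand a terminfo string that uses only %i, %p1-%p9, %d and %%."""
    params = list(params) + [0] * (9 - len(params))   # missing parameters are 0
    stack = []
    out = []
    i = 0
    while i != len(program):
        ch = program[i]
        i += 1
        if ch != '%':
            out.append(ch)            # literal character, printed as-is
            continue
        op = program[i]
        i += 1
        if op == 'i':                 # increment the first two parameters
            params[0] += 1
            params[1] += 1
        elif op == 'p':               # push parameter N (1-based) to the stack
            stack.append(params[int(program[i]) - 1])
            i += 1
        elif op == 'd':               # pop a number and print it in decimal
            out.append(str(stack.pop()))
        elif op == '%':               # a literal percent sign
            out.append('%')
        else:
            raise ValueError('unsupported operation: %' + op)
    return ''.join(out)

cup = '\x1b[%i%p1%d;%p2%dH'           # the vt100 cup string without the padding
print(repr(expand(cup, 5, 10)))       # '\x1b[6;11H'
```

&lt;p&gt;Calling it with row 5 and column 10 mirrors &lt;code&gt;tput cup 5 10&lt;/code&gt;: the
zero-based arguments come out as the one-based &lt;code&gt;6;11&lt;/code&gt;.&lt;/p&gt;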
&lt;h1 id=&quot;the-terminfo-language&quot;&gt;The terminfo language&lt;/h1&gt;
&lt;p&gt;The terminfo language is a simple stack-based DSL that has parameters,
&lt;em&gt;if-then-else&lt;/em&gt;, built-in &lt;code&gt;printf&lt;/code&gt; (with padding and decimal, octal and
hex output), persistent variables and basic math/logic operators.&lt;/p&gt;
&lt;p&gt;It is a dynamically typed language that supports integers of some
unspecified type (e.g. signed 32-bit) and NUL-terminated strings.
There are two operations on strings: &lt;code&gt;%l&lt;/code&gt; (pop a string and push its
length) and &lt;code&gt;%s&lt;/code&gt; (pop and print a string). Terminfo generally has no
clue if what is on the argument stack is a valid string or not, so
these operations can very easily be made to cause segfaults in C
implementations of terminfo.&lt;/p&gt;
&lt;p&gt;The persistent variables are interesting. There are two sets of them:
the static and the dynamic. Historically there was likely some
difference between these, but today they appear to be the same. They
are commonly used as temporary variables inside programs, as a way to
get around the need to manage the argument stack properly. But they
can also be used to store information that persists between calls to
the terminfo library (when used through &lt;code&gt;tparm&lt;/code&gt;/&lt;code&gt;tiparm&lt;/code&gt; rather than
&lt;code&gt;tput&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;So basically terminfo comes equipped with a language that can scan
arbitrary addresses for NUL bytes (strlen), &lt;strong&gt;copy memory from
arbitrary addresses in your program to the terminal&lt;/strong&gt;, persist
information between library calls, and perform basic logic and
arithmetic. It lacks loops, which means it is not really Turing
complete as such, but that can be worked around by relying on multiple
calls to the program and by using the persistent variables to drive
control flow through &lt;em&gt;if-then-else&lt;/em&gt; constructs. Let’s hope your
terminfo entry comes from a trustworthy source.&lt;/p&gt;
&lt;h1 id=&quot;compiling-terminfo-programs&quot;&gt;Compiling terminfo programs&lt;/h1&gt;
&lt;p&gt;The ncurses implementation of terminfo uses a simple combined
lexer-and-interpreter to run the programs. One of my hobbies is
compilers, so I decided to do things differently in the &lt;code&gt;text-mode&lt;/code&gt;
package. I decided to compile the programs.&lt;/p&gt;
&lt;p&gt;Let’s have a look at the &lt;code&gt;setab&lt;/code&gt; string from the &lt;code&gt;xterm-direct&lt;/code&gt; entry:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-lisp&quot;&gt;&amp;gt; (&lt;span class=&quot;name&quot;&gt;import&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;text-mode&lt;/span&gt; terminfo parser))
&amp;gt; (&lt;span class=&quot;name&quot;&gt;define&lt;/span&gt; setab &lt;span class=&quot;string&quot;&gt;&quot;\x1b;[%?%p1%{8}%&amp;lt;%t4%p1%d%e48\\:2\\:\\:%p1%{65536}%/%d\\:%p1%{256}%/%{255}%&amp;amp;%d\\:%p1%{255}%&amp;amp;%d%;m&quot;&lt;/span&gt;)
&lt;span class=&quot;comment&quot;&gt;; Tokenizer output&lt;/span&gt;
&amp;gt; (&lt;span class=&quot;name&quot;&gt;parse-term-string&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;string-&amp;gt;utf8&lt;/span&gt; setab))
((&lt;span class=&quot;name&quot;&gt;print&lt;/span&gt; #vu8(&lt;span class=&quot;number&quot;&gt;27&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;91&lt;/span&gt;))
 (&lt;span class=&quot;name&quot;&gt;if&lt;/span&gt;)
 (&lt;span class=&quot;name&quot;&gt;parameter&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;)
 (&lt;span class=&quot;name&quot;&gt;push&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;8&lt;/span&gt;)
 (&lt;span class=&quot;name&quot;&gt;&amp;lt;&lt;/span&gt;)
 (&lt;span class=&quot;name&quot;&gt;then&lt;/span&gt;)
 (&lt;span class=&quot;name&quot;&gt;print&lt;/span&gt; #vu8(&lt;span class=&quot;number&quot;&gt;52&lt;/span&gt;))
 (&lt;span class=&quot;name&quot;&gt;parameter&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;)
 (&lt;span class=&quot;name&quot;&gt;printf&lt;/span&gt; &lt;span class=&quot;string&quot;&gt;&quot;%d&quot;&lt;/span&gt; #f #f #f #\space #f #f #\d)
 (&lt;span class=&quot;name&quot;&gt;else&lt;/span&gt;)
 (&lt;span class=&quot;name&quot;&gt;print&lt;/span&gt; #vu8(&lt;span class=&quot;number&quot;&gt;52&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;56&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;92&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;58&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;50&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;92&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;58&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;92&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;58&lt;/span&gt;))
 (&lt;span class=&quot;name&quot;&gt;parameter&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;)
 (&lt;span class=&quot;name&quot;&gt;push&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;65536&lt;/span&gt;)
 (&lt;span class=&quot;name&quot;&gt;quotient&lt;/span&gt;)
 (&lt;span class=&quot;name&quot;&gt;printf&lt;/span&gt; &lt;span class=&quot;string&quot;&gt;&quot;%d&quot;&lt;/span&gt; #f #f #f #\space #f #f #\d)
 (&lt;span class=&quot;name&quot;&gt;print&lt;/span&gt; #vu8(&lt;span class=&quot;number&quot;&gt;92&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;58&lt;/span&gt;))
 (&lt;span class=&quot;name&quot;&gt;parameter&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;)
 (&lt;span class=&quot;name&quot;&gt;push&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;256&lt;/span&gt;)
 (&lt;span class=&quot;name&quot;&gt;quotient&lt;/span&gt;)
 (&lt;span class=&quot;name&quot;&gt;push&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;255&lt;/span&gt;)
 (&lt;span class=&quot;name&quot;&gt;bitwise-and&lt;/span&gt;)
 (&lt;span class=&quot;name&quot;&gt;printf&lt;/span&gt; &lt;span class=&quot;string&quot;&gt;&quot;%d&quot;&lt;/span&gt; #f #f #f #\space #f #f #\d)
 (&lt;span class=&quot;name&quot;&gt;print&lt;/span&gt; #vu8(&lt;span class=&quot;number&quot;&gt;92&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;58&lt;/span&gt;))
 (&lt;span class=&quot;name&quot;&gt;parameter&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;)
 (&lt;span class=&quot;name&quot;&gt;push&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;255&lt;/span&gt;)
 (&lt;span class=&quot;name&quot;&gt;bitwise-and&lt;/span&gt;)
 (&lt;span class=&quot;name&quot;&gt;printf&lt;/span&gt; &lt;span class=&quot;string&quot;&gt;&quot;%d&quot;&lt;/span&gt; #f #f #f #\space #f #f #\d)
 (&lt;span class=&quot;name&quot;&gt;endif&lt;/span&gt;)
 (&lt;span class=&quot;name&quot;&gt;print&lt;/span&gt; #vu8(&lt;span class=&quot;number&quot;&gt;109&lt;/span&gt;)))
&lt;span class=&quot;comment&quot;&gt;; Compiler output&lt;/span&gt;
&amp;gt; (&lt;span class=&quot;name&quot;&gt;expand/optimize&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;terminfo-expression&lt;/span&gt; setab))
(&lt;span class=&quot;name&quot;&gt;lambda&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;p&lt;/span&gt; dvar svar baudrate lines p1 p2 p3 p4 p5 p6 p7 p8 p9)
  (&lt;span class=&quot;name&quot;&gt;put-bytevector&lt;/span&gt; p #vu8(&lt;span class=&quot;number&quot;&gt;27&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;91&lt;/span&gt;))
  (&lt;span class=&quot;name&quot;&gt;if&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&amp;lt;&lt;/span&gt; p1 &lt;span class=&quot;number&quot;&gt;8&lt;/span&gt;)
      (&lt;span class=&quot;name&quot;&gt;begin&lt;/span&gt;
        (&lt;span class=&quot;name&quot;&gt;put-u8&lt;/span&gt; p &lt;span class=&quot;number&quot;&gt;52&lt;/span&gt;)
        (&lt;span class=&quot;name&quot;&gt;ti-printf&lt;/span&gt; p &lt;span class=&quot;string&quot;&gt;&quot;%d&quot;&lt;/span&gt; p1 #f #f #f #\space #f #f #\d))
      (&lt;span class=&quot;name&quot;&gt;begin&lt;/span&gt;
        (&lt;span class=&quot;name&quot;&gt;put-bytevector&lt;/span&gt; p #vu8(&lt;span class=&quot;number&quot;&gt;52&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;56&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;92&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;58&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;50&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;92&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;58&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;92&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;58&lt;/span&gt;))
        (&lt;span class=&quot;name&quot;&gt;let&lt;/span&gt; ([v (&lt;span class=&quot;name&quot;&gt;quotient&lt;/span&gt; p1 &lt;span class=&quot;number&quot;&gt;65536&lt;/span&gt;)])
          (&lt;span class=&quot;name&quot;&gt;ti-printf&lt;/span&gt; p &lt;span class=&quot;string&quot;&gt;&quot;%d&quot;&lt;/span&gt; v #f #f #f #\space #f #f #\d)
          (&lt;span class=&quot;name&quot;&gt;put-bytevector&lt;/span&gt; p #vu8(&lt;span class=&quot;number&quot;&gt;92&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;58&lt;/span&gt;))
          (&lt;span class=&quot;name&quot;&gt;let&lt;/span&gt; ([v (&lt;span class=&quot;name&quot;&gt;bitwise-and&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;255&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;quotient&lt;/span&gt; p1 &lt;span class=&quot;number&quot;&gt;256&lt;/span&gt;))])
            (&lt;span class=&quot;name&quot;&gt;ti-printf&lt;/span&gt; p &lt;span class=&quot;string&quot;&gt;&quot;%d&quot;&lt;/span&gt; v #f #f #f #\space #f #f #\d)
            (&lt;span class=&quot;name&quot;&gt;put-bytevector&lt;/span&gt; p #vu8(&lt;span class=&quot;number&quot;&gt;92&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;58&lt;/span&gt;))
            (&lt;span class=&quot;name&quot;&gt;let&lt;/span&gt; ([v (&lt;span class=&quot;name&quot;&gt;bitwise-and&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;255&lt;/span&gt; p1)])
              (&lt;span class=&quot;name&quot;&gt;ti-printf&lt;/span&gt; p &lt;span class=&quot;string&quot;&gt;&quot;%d&quot;&lt;/span&gt; v #f #f #f #\space #f #f #\d))))))
  (&lt;span class=&quot;name&quot;&gt;put-u8&lt;/span&gt; p &lt;span class=&quot;number&quot;&gt;109&lt;/span&gt;)
  (&lt;span class=&quot;name&quot;&gt;void&lt;/span&gt;))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The output from the compiler is ready to be consumed by &lt;code&gt;eval&lt;/code&gt;, which
in &lt;code&gt;Chez Scheme&lt;/code&gt; will generate fast machine code for this program.&lt;/p&gt;
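&lt;p&gt;The arithmetic is easier to follow with concrete numbers. The following
Python sketch mirrors the two branches of the compiled program. It is an
illustration rather than part of &lt;code&gt;text-mode&lt;/code&gt;: the name
&lt;code&gt;setab_direct&lt;/code&gt; is hypothetical, the color is assumed to be
non-negative, and the escaped colons are written as plain &lt;code&gt;:&lt;/code&gt;.&lt;/p&gt;

```python
def setab_direct(color):
    """Build the background-color sequence that the setab program describes."""
    if color in range(8):
        # Low numbers select one of the eight classic colors: ESC [ 4N m
        body = '4' + str(color)
    else:
        # Otherwise treat the number as 0xRRGGBB and split it into channels.
        red = color // 65536
        green = (color // 256) % 256      # same as (color / 256) AND 255
        blue = color % 256                # same as color AND 255
        body = '48:2::' + ':'.join(str(c) for c in (red, green, blue))
    return '\x1b[' + body + 'm'

print(repr(setab_direct(4)))          # '\x1b[44m'
print(repr(setab_direct(0xFF8000)))   # '\x1b[48:2::255:128:0m'
```

&lt;p&gt;So color 4 takes the first branch and gives the familiar blue background,
while anything from 8 and up is interpreted as a packed 24-bit RGB value.&lt;/p&gt;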
&lt;p&gt;The process is a whole lot more complex than it looks. The compiler
has to translate stack-based code to direct code, which means that it
has to keep track of the argument stack and translate it to procedure
arguments and return values. It has to implement &lt;em&gt;if-then-else&lt;/em&gt; and
handle arguments going into the branches and possibly going out of the
branches. On top of that the first two parameters can be mutated and
the resulting code should still be efficient.&lt;/p&gt;
&lt;p&gt;To make things easier, I relied on cp0 (compiler pass 0). This is a
partial evaluator described in Oscar Waddell’s Ph.D. dissertation and
is built into Chez Scheme. It basically lets my compiler write pretty
terrible code, but terrible in a particularly good way that cp0 likes.
The call to &lt;code&gt;expand/optimize&lt;/code&gt; above shows the output after cp0 has
done its job.&lt;/p&gt;
&lt;p&gt;The first thing the compiler does is to statically analyze the
operations in the program to find the required stack size. Taking the
&lt;code&gt;%+&lt;/code&gt; operation as an example, it first pops two values and then pushes
a value. This means that: the stack has to have room for at least two
values, there are two values going in to the operation, and there is
one value going out. The concatenative nature of the language makes
the analysis simple to perform for any sequence of operations. This
analysis is carried out for the whole program, but also for branches
in the program.&lt;/p&gt;
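&lt;p&gt;A minimal sketch of this kind of analysis for straight-line code, in
Python rather than Scheme. The &lt;code&gt;EFFECTS&lt;/code&gt; table and the
&lt;code&gt;analyze&lt;/code&gt; function are hypothetical names, branches are not handled,
and the real compiler in &lt;code&gt;text-mode&lt;/code&gt; may well be organized
differently.&lt;/p&gt;

```python
# (pops, pushes) for a few operations; the table is illustrative, not complete.
EFFECTS = {
    'parameter': (0, 1),     # %p1 .. %p9
    'push': (0, 1),          # %{n}
    '+': (2, 1),             # %+
    'quotient': (2, 1),      # %/
    'bitwise-and': (2, 1),
    'printf': (1, 0),        # %d and friends
    'print': (0, 0),         # literal bytes
}

def analyze(ops):
    """Return (values_in, values_out, slots) for a sequence of operations."""
    depth = 0      # stack depth relative to the start of the sequence
    lowest = 0     # deepest point reached below the starting depth
    highest = 0    # highest point reached above the starting depth
    for op in ops:
        pops, pushes = EFFECTS[op]
        depth -= pops
        lowest = min(lowest, depth)
        depth += pushes
        highest = max(highest, depth)
    values_in = -lowest              # values that must already be on the stack
    values_out = depth - lowest      # values left behind in the touched region
    slots = highest - lowest         # stack locations the sequence needs
    return values_in, values_out, slots

print(analyze(['+']))                        # (2, 1, 2)
print(analyze(['parameter', 'printf']))      # (0, 0, 1)
```

&lt;p&gt;Because each operation has a fixed effect, the effect of a sequence is
just the running sum of its parts, which is what makes the analysis cheap.&lt;/p&gt;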
&lt;p&gt;The stack is made explicit as variables in the generated program. This
is key to letting cp0 get rid of the stack completely. Here is the
code generated for the program &lt;code&gt;%p1%d&lt;/code&gt; before cp0:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-lisp&quot;&gt;(&lt;span class=&quot;name&quot;&gt;lambda&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;p&lt;/span&gt; dvar svar baudrate lines p1 p2 p3 p4 p5 p6 p7 p8 p9)
  (&lt;span class=&quot;name&quot;&gt;define&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;k&lt;/span&gt; . x) (&lt;span class=&quot;name&quot;&gt;if&lt;/span&gt; #f #f))
  (&lt;span class=&quot;name&quot;&gt;let&lt;/span&gt; ([p1^ p1] [p2^ p2] [s0 &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;])
    (&lt;span class=&quot;name&quot;&gt;let&lt;/span&gt; ()
      ((&lt;span class=&quot;name&quot;&gt;lambda&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;s0&lt;/span&gt;)
         (&lt;span class=&quot;name&quot;&gt;let&lt;/span&gt; ([s0 p1^])
           ((&lt;span class=&quot;name&quot;&gt;lambda&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;s0&lt;/span&gt;)
              (&lt;span class=&quot;name&quot;&gt;let&lt;/span&gt; ([v s0])
                (&lt;span class=&quot;name&quot;&gt;ti-printf&lt;/span&gt; p &lt;span class=&quot;string&quot;&gt;&quot;%d&quot;&lt;/span&gt; v #f #f #f #\space #f #f #\d)
                (&lt;span class=&quot;name&quot;&gt;k&lt;/span&gt; s0)))
             s0)))
        s0)))
  (&lt;span class=&quot;name&quot;&gt;if&lt;/span&gt; #f #f))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is lambda soup. The &lt;code&gt;k&lt;/code&gt; procedure acts as a continuation in the
case when the compiler detects that it can’t generate sequential code
for an &lt;em&gt;if-then-else&lt;/em&gt;. This is rare, but happens when a branch affects
the explicit program state in a bad way. It can be due to either a
side-effect in a branch (e.g. &lt;code&gt;%i&lt;/code&gt;) or that a branch affects the
stack. In these cases the output from the branches has to be passed to
the continuation of the program, which is therefore made explicit as a
new &lt;code&gt;k&lt;/code&gt; procedure. In all examples I looked at, cp0 optimized this
quite well.&lt;/p&gt;
&lt;p&gt;The program after cp0:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;(lambda (p dvar svar baudrate lines p1 p2 p3 p4 p5 p6 p7 p8 p9)
  (ti-printf p &amp;quot;%d&amp;quot; p1 #f #f #f #\space #f #f #\d)
  (void))
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Notice that the program before cp0 has an s0 variable. This is the
single stack location needed for this program. Pushing a value to the
stack is simply done by rebinding s0 to the new value, &lt;code&gt;(let ([s0
p1^])&lt;/code&gt;. This binds &lt;code&gt;s0&lt;/code&gt; to &lt;code&gt;p1^&lt;/code&gt;, which is the program’s current value
of parameter 1. It is different from &lt;code&gt;p1&lt;/code&gt;, because incrementing p1 in
&lt;code&gt;%i&lt;/code&gt; is handled by rebinding &lt;code&gt;p1^&lt;/code&gt; as &lt;code&gt;(+ 1 p1)&lt;/code&gt;. The mutable program
state (&lt;code&gt;s0&lt;/code&gt; in this case) is passed to the rest of the program by
application of a procedure that rebinds all state variables and
contains the rest of the program, in this case &lt;code&gt;((lambda (s0) ...)
s0)&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;When popping from the stack, a temporary variable is bound to &lt;code&gt;s0&lt;/code&gt; and
the rest of the stack is rebound. The temporary variable is then used
in the implementation of the operation, in this case &lt;code&gt;ti-printf&lt;/code&gt;. Since
this example only has one stack slot nothing happens to &lt;code&gt;s0&lt;/code&gt;, but
otherwise &lt;code&gt;s0&lt;/code&gt; would be rebound to &lt;code&gt;s1&lt;/code&gt;, and so on.&lt;/p&gt;
&lt;p&gt;This is a lot of rebinding and unnecessary lambdas, but cp0 eats this
kind of code for breakfast and transforms it into (most of the time)
optimal code. This approach to compiling the programs is made easier
by the fact that there is no advanced control flow.&lt;/p&gt;
&lt;h1 id=&quot;summary&quot;&gt;Summary&lt;/h1&gt;
&lt;p&gt;Down the rabbit hole, indeed. Terminfo has a strangely powerful DSL
that specializes in generating escape sequences for terminals.
Although ncurses uses an interpreter to run the programs, it is also
possible to compile them and get efficient direct code.&lt;/p&gt;
&lt;p&gt;Someone clever in a white hat should probably have a look at how good
it is that terminfo programs, in the ncurses implementation, can copy
arbitrary memory to the terminal. The Scheme implementation has no way
to reproduce that behavior.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>Alignment Checking &amp; Meltdown</title>
      <link>https://weinholt.se/articles/low-bit-tagging-meltdown/</link>
      <pubDate>Sun, 06 Jan 2019 01:00:00 +0100</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/low-bit-tagging-meltdown/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;Here is some interesting news for compiler writers worried about
Meltdown. I have previously described a way to get hardware-based type
checks (think branchless &lt;code&gt;car&lt;/code&gt;, &lt;code&gt;cdr&lt;/code&gt;, &lt;code&gt;vector-ref&lt;/code&gt;, etc.)
using &lt;a href=&quot;https://weinholt.se/articles/alignment-check/&quot;&gt;alignment checks&lt;/a&gt;. It now appears
that this technique may be immune to Meltdown-type attacks:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Alignment Faults.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Upon detecting an unaligned memory operand, the processor can
(optionally) generate an alignment check exception (#AC). We found
that the results of unaligned memory accesses never reach the
transient execution. We suspect that this is because #AC is generated
early-on (even before the operand’s virtual address is translated to a
physical one). Thus, Meltdown-AC is not possible.&lt;/p&gt;
&lt;p&gt;– &lt;a href=&quot;https://arxiv.org/abs/1811.05441&quot;&gt;A Systematic Evaluation of Transient Execution Attacks and Defenses&lt;/a&gt; (2018, Canella, et al.)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The kernel unfortunately can’t use it because #AC does not work at
CPL=0, but for user space it could be a great way to avoid some
Meltdown vulnerabilities.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>Design Your Low-Bit Tagging with Z3Py</title>
      <link>https://weinholt.se/articles/design-low-tagging-z3py/</link>
      <pubDate>Sun, 18 Nov 2018 01:00:00 +0100</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/design-low-tagging-z3py/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;&lt;a href=&quot;https://wingolog.org/archives/2011/05/18/value-representation-in-javascript-implementations&quot;&gt;Low-bit tagging&lt;/a&gt; is
a technique where the low bits of values are used to store type
information. There are numerous benefits that come with this technique
and it is quite popular in implementations of Scheme, JavaScript and
other languages. But once you start down the road of bit-twiddling it
is hard to stop and the design of the tagging system may become
difficult to understand. So that’s when you look in your tool box and
pull out something like &lt;a href=&quot;https://github.com/Z3Prover/z3&quot;&gt;Z3&lt;/a&gt;, which this article explores.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;h1 id=&quot;example-of-cleverness&quot;&gt;Example of Cleverness&lt;/h1&gt;
&lt;p&gt;Let us begin with a look at a real life example of low-bit tagging
from Chez Scheme on an AMD64 system, where integers in the interval
[-2&lt;sup&gt;60&lt;/sup&gt;, 2&lt;sup&gt;60&lt;/sup&gt;-1] are encoded directly into the
value’s bit pattern:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-x86asm&quot;&gt;Chez Scheme Version &lt;span class=&quot;number&quot;&gt;9.5&lt;/span&gt;
Copyright &lt;span class=&quot;number&quot;&gt;1984&lt;/span&gt;-&lt;span class=&quot;number&quot;&gt;2017&lt;/span&gt; Cisco Systems, &lt;span class=&quot;keyword&quot;&gt;Inc&lt;/span&gt;.

&amp;gt; (#%$assembly-output #t)
&amp;gt; (lambda (x y) (fx+ x y))
&lt;span class=&quot;symbol&quot;&gt;
entry.28:&lt;/span&gt;
&lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;:       cmpi           (imm &lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;), %ac0
&lt;span class=&quot;number&quot;&gt;4&lt;/span&gt;:       bne            lf&lt;span class=&quot;meta&quot;&gt;.27&lt;/span&gt;(&lt;span class=&quot;number&quot;&gt;35&lt;/span&gt;)
&lt;span class=&quot;symbol&quot;&gt;dcl.29:&lt;/span&gt;
&lt;span class=&quot;number&quot;&gt;6&lt;/span&gt;:       &lt;span class=&quot;keyword&quot;&gt;mov&lt;/span&gt;            %rdi, %rcx
&lt;span class=&quot;number&quot;&gt;9&lt;/span&gt;:       &lt;span class=&quot;keyword&quot;&gt;or&lt;/span&gt;             %r8, %rcx
&lt;span class=&quot;number&quot;&gt;12&lt;/span&gt;:      testib         (imm &lt;span class=&quot;number&quot;&gt;7&lt;/span&gt;), %rcx
&lt;span class=&quot;number&quot;&gt;15&lt;/span&gt;:      bne            Llib&lt;span class=&quot;meta&quot;&gt;.26&lt;/span&gt;(&lt;span class=&quot;number&quot;&gt;12&lt;/span&gt;)
&lt;span class=&quot;symbol&quot;&gt;lt.30:&lt;/span&gt;
…
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Chez Scheme has its own mnemonics for the assembler instructions, but
it should be familiar enough. In this example the &lt;code&gt;rdi&lt;/code&gt; and &lt;code&gt;r8&lt;/code&gt;
registers are bitwise logical OR’d together, the three low bits are
tested and at &lt;code&gt;15:&lt;/code&gt; it branches to some error handler if the bits were
not 0. In C terms: &lt;code&gt;if (((x | y) &amp;amp; 7) != 0) { goto error; }&lt;/code&gt;. This is an
example of the cleverness possible in a well-designed low-bit tagging
system: two values can be type checked with a single branch
instruction.&lt;/p&gt;
&lt;p&gt;This works because all fixnums are tagged with three 0 bits and
bitwise logical OR with something not a fixnum would introduce some 1
bits. Chez Scheme is careful to never create non-fixnum objects with
three lower 0 bits.&lt;/p&gt;
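&lt;p&gt;As a sanity check outside of Z3, here is a minimal plain-Python sketch that compares the combined test with two separate tests for every pair of small words. The 6-bit word size and the helper names are assumptions made for illustration; they are not taken from Chez Scheme.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-Python&quot;&gt;from itertools import product

TAG_SPAN = 2 ** 3   # three tag bits, as in the listing above
WORD_SPAN = 2 ** 6  # assumption: tiny 6-bit words keep the search small

def low_bits(word):
    # word % 8 keeps the three low bits, the same bits a mask of 7 selects.
    return word % TAG_SPAN

def is_fixnum(word):
    return low_bits(word) == 0

def both_fixnums(x, y):
    # One test on the OR of both words, like the testib instruction.
    return low_bits(x | y) == 0

# Collect every pair where the combined test disagrees with two
# separate tests. The list should come out empty.
mismatches = [(x, y)
              for x, y in product(range(WORD_SPAN), repeat=2)
              if both_fixnums(x, y) != (is_fixnum(x) and is_fixnum(y))]
print(len(mismatches))  # prints 0
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The search is exhaustive only for the tiny word size assumed here, but the argument does not depend on the upper bits, so the same holds for 64-bit words.&lt;/p&gt;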
&lt;h1 id=&quot;starting-with-z3py&quot;&gt;Starting with Z3Py&lt;/h1&gt;
&lt;p&gt;Let us have a look at how to enter this system into Z3 via Z3Py. Z3 is
an MIT-licensed theorem prover from Microsoft Research. The native
language of Z3 is actually &lt;a href=&quot;http://smtlib.cs.uiowa.edu/&quot;&gt;SMT-LIB&lt;/a&gt;, but
I will use Z3Py here because I find it helps with writing logic at a
higher level. Z3Py is available in many Linux distributions, including
Debian: &lt;code&gt;apt-get install python-z3&lt;/code&gt; (currently Python 2.7 only).&lt;/p&gt;
&lt;p&gt;So what does Z3 actually do? Think of it as a tool that does a brute
force search through a whole problem space, looking at every possible
model that satisfies a set of constraints, but that also knows a
lot of shortcuts that speed up the search. If you gave it the
assertion &lt;em&gt;x&lt;/em&gt; + 1 = 2 it would be clever enough not to
search through all possible values of &lt;em&gt;x&lt;/em&gt; until it found &lt;em&gt;x&lt;/em&gt;
= 1 (although in some situations it would do basically that).&lt;/p&gt;
&lt;p&gt;Back to tagging. One interesting property of tagging fixnums with 0
bits is that addition can be performed without masking away the tag
bits. This Z3Py script checks this property:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-Python&quot;&gt;&lt;span class=&quot;keyword&quot;&gt;from&lt;/span&gt; __future__ &lt;span class=&quot;keyword&quot;&gt;import&lt;/span&gt; print_function
&lt;span class=&quot;keyword&quot;&gt;from&lt;/span&gt; z3 &lt;span class=&quot;keyword&quot;&gt;import&lt;/span&gt; *

&lt;span class=&quot;comment&quot;&gt;# We want fixnums to be tagged with some three low bits.&lt;/span&gt;
tag_fixnum = BitVec(&lt;span class=&quot;string&quot;&gt;'tag-fixnum'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
mask_fixnum = BitVecVal(&lt;span class=&quot;number&quot;&gt;0b111&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)

&lt;span class=&quot;comment&quot;&gt;# Keep these ordered.&lt;/span&gt;
tags = ( tag_fixnum, )
masks = ( mask_fixnum, )

&lt;span class=&quot;comment&quot;&gt;# Create a solver.&lt;/span&gt;
s = Solver()

&lt;span class=&quot;comment&quot;&gt;# Two fixnums can be added and the result is a fixnum.&lt;/span&gt;
x = BitVec(&lt;span class=&quot;string&quot;&gt;'x'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
y = BitVec(&lt;span class=&quot;string&quot;&gt;'y'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
s.add(ForAll([x, y],
             Implies(And((x &amp;amp; mask_fixnum) == tag_fixnum,
                         (y &amp;amp; mask_fixnum) == tag_fixnum),
                     ((x + y) &amp;amp; mask_fixnum) == tag_fixnum)))

print(s.sexpr())
print(s.check())
print(s.model().sexpr())
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;BitVec&lt;/code&gt; objects here are bit-vector variables in the model,
representing some 64-bit value and &lt;code&gt;BitVecVal&lt;/code&gt; is a 64-bit constant.
The argument to &lt;code&gt;s.add&lt;/code&gt; is read as: for all &lt;em&gt;x&lt;/em&gt; and for all &lt;em&gt;y&lt;/em&gt; it is
true that &lt;em&gt;x&lt;/em&gt; and &lt;em&gt;y&lt;/em&gt; being fixnums implies (as in &lt;em&gt;P → Q&lt;/em&gt;) that &lt;em&gt;x&lt;/em&gt; + &lt;em&gt;y&lt;/em&gt;
is also a fixnum. It is important to use &lt;em&gt;x&lt;/em&gt; and &lt;em&gt;y&lt;/em&gt; inside &lt;code&gt;ForAll&lt;/code&gt;,
otherwise Z3 will look for some specific &lt;em&gt;x&lt;/em&gt; and &lt;em&gt;y&lt;/em&gt; that satisfy the
assertions instead of proving a model for all fixnums. (There is a
missing constraint, though. Can you see it?)&lt;/p&gt;
&lt;p&gt;When run through Python it generates this output:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-Lisp&quot;&gt;(&lt;span class=&quot;name&quot;&gt;declare-fun&lt;/span&gt; tag-fixnum () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;))
(&lt;span class=&quot;name&quot;&gt;assert&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;forall&lt;/span&gt; ((&lt;span class=&quot;name&quot;&gt;x&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)) (&lt;span class=&quot;name&quot;&gt;y&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)))
  (&lt;span class=&quot;name&quot;&gt;=&amp;gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;and&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;=&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;bvand&lt;/span&gt; x &lt;span class=&quot;number&quot;&gt;#x0000000000000007&lt;/span&gt;) tag-fixnum)
           (&lt;span class=&quot;name&quot;&gt;=&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;bvand&lt;/span&gt; y &lt;span class=&quot;number&quot;&gt;#x0000000000000007&lt;/span&gt;) tag-fixnum))
      (&lt;span class=&quot;name&quot;&gt;=&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;bvand&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;bvadd&lt;/span&gt; x y) &lt;span class=&quot;number&quot;&gt;#x0000000000000007&lt;/span&gt;) tag-fixnum))))

sat
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; tag-fixnum () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000000000000&lt;/span&gt;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The first part is the solver’s assertions in the SMT-LIB language and &lt;code&gt;sat&lt;/code&gt; means
that they are satisfiable. That they are satisfiable means that Z3 also found a
model, which is the last output shown. Here it has assigned all 0 bits
to the fixnum tag, having proved that tagging with 0 bits allows
addition to work directly with fixnums. (Almost, actually. We didn’t
check that addition itself gives the right result.)&lt;/p&gt;
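&lt;p&gt;That remaining gap can be probed with ordinary Python integers. The sketch below assumes the all-zero tag found above sits in three low bits, and it uses hypothetical &lt;code&gt;encode&lt;/code&gt; and &lt;code&gt;decode&lt;/code&gt; helpers. Python integers are unbounded, so overflow of the payload is not modelled here.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-Python&quot;&gt;SHIFT = 3  # assumption: three tag bits below the payload, all zero

def encode(n):
    # Multiplying by 8 moves n above the three tag bits and leaves
    # those bits as zero.
    return n * 2 ** SHIFT

def decode(word):
    # Floor division drops the tag bits again. Python rounds towards
    # negative infinity, so negative fixnums round-trip as well.
    return word // 2 ** SHIFT

# Add the tagged words directly and compare with the tagged sum.
samples = range(-50, 51)
bad = [(a, b) for a in samples for b in samples
       if encode(a) + encode(b) != encode(a + b)]
print(len(bad))                           # prints 0
print(decode(encode(20) + encode(22)))    # prints 42
&lt;/code&gt;&lt;/pre&gt;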
&lt;p&gt;Something that’s tricky with these theorem solvers is that you really
need to tell them all the constraints or, like little children, they
will find that one electrical outlet that you didn’t secure. The
missing constraint here is that the tag must fit within the mask. Z3
actually found a model where the tag is zero, which is what we were
looking for, but it could just as well have given us a model where
some high bit of the fixnum tag is set. That will be seen in the next
section.&lt;/p&gt;
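&lt;p&gt;A small plain-Python sketch shows why a solver is free to pick such a tag. The tag value below is hypothetical, chosen only because it has a bit set outside the mask.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-Python&quot;&gt;MASK_SPAN = 8       # a mask of 0b111 selects word % 8
BAD_TAG = 2 ** 61   # hypothetical tag with a bit outside the mask

def is_fixnum(word, tag):
    # Masking can only produce the values 0 to 7, so a tag outside
    # that range can never be matched.
    return word % MASK_SPAN == tag

# Only the three low bits matter to the test, so trying the eight
# low-bit patterns covers every possible word.
matches = [w for w in range(MASK_SPAN) if is_fixnum(w, BAD_TAG)]
print(matches)  # prints []
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With no word passing the test, the premise of the implication is never true, so the statement about addition holds vacuously.&lt;/p&gt;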
&lt;h1 id=&quot;types-types-types-&quot;&gt;Types, types, types!&lt;/h1&gt;
&lt;p&gt;Just having fixnums is no fun, so let’s add pairs, characters,
booleans and the empty list (&lt;em&gt;nil&lt;/em&gt;). Some additional constraints will
be needed, but let us first see what happens without them.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-Python&quot;&gt;&lt;span class=&quot;keyword&quot;&gt;from&lt;/span&gt; __future__ &lt;span class=&quot;keyword&quot;&gt;import&lt;/span&gt; print_function
&lt;span class=&quot;keyword&quot;&gt;from&lt;/span&gt; z3 &lt;span class=&quot;keyword&quot;&gt;import&lt;/span&gt; *

tag_fixnum   = BitVec(&lt;span class=&quot;string&quot;&gt;'tag-fixnum'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
tag_pair     = BitVec(&lt;span class=&quot;string&quot;&gt;'tag-pair'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
tag_char     = BitVec(&lt;span class=&quot;string&quot;&gt;'tag-char'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
tag_boolean  = BitVec(&lt;span class=&quot;string&quot;&gt;'tag-boolean'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
tag_nil      = BitVec(&lt;span class=&quot;string&quot;&gt;'tag-nil'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)

mask_fixnum  = BitVecVal(&lt;span class=&quot;number&quot;&gt;0b111&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
mask_pair    = BitVecVal(&lt;span class=&quot;number&quot;&gt;0b111&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
mask_char    = BitVec(&lt;span class=&quot;string&quot;&gt;'mask-char'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
mask_boolean = BitVec(&lt;span class=&quot;string&quot;&gt;'mask-boolean'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
mask_nil     = BitVec(&lt;span class=&quot;string&quot;&gt;'mask-nil'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)

tags = ( tag_fixnum, tag_pair, tag_char, tag_boolean, tag_nil )
masks = ( mask_fixnum, mask_pair, mask_char, mask_boolean, mask_nil )

s = Solver()
x = BitVec(&lt;span class=&quot;string&quot;&gt;'x'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
y = BitVec(&lt;span class=&quot;string&quot;&gt;'y'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
s.add(ForAll([x, y],
             Implies(And((x &amp;amp; mask_fixnum) == tag_fixnum,
                         (y &amp;amp; mask_fixnum) == tag_fixnum),
                     ((x + y) &amp;amp; mask_fixnum) == tag_fixnum)))

&lt;span class=&quot;comment&quot;&gt;# Uncommented one by one in the discussion below.&lt;/span&gt;
&lt;span class=&quot;comment&quot;&gt;#s.add(Distinct(tags))&lt;/span&gt;
&lt;span class=&quot;comment&quot;&gt;#s.add([(tag &amp;amp; mask) == tag for (tag, mask) in zip(tags, masks)])&lt;/span&gt;
&lt;span class=&quot;comment&quot;&gt;#s.add([And(mask &amp;gt; 0, mask &amp;lt;= 0xff) for mask in masks])&lt;/span&gt;

print(s.sexpr())
print(s.check())
print(s.model().sexpr())
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you were to run this through Python you would find that Z3Py
probably prints exactly the same thing as before! That’s because the
new tag and mask variables are not referenced anywhere in the model.
Uncomment the &lt;code&gt;Distinct&lt;/code&gt; constraint and you might get this model:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-Lisp&quot;&gt;(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; tag-pair () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000000000003&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; tag-nil () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000000000000&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; tag-char () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000000000002&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; tag-boolean () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000000000001&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; tag-fixnum () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x2000000000000000&lt;/span&gt;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Distinct means that the tags must have different values. Note that Z3
gave the all-0 tag to &lt;em&gt;nil&lt;/em&gt; and fixnums have a high bit set. And
indeed, the addition constraint still holds in this model. Oops.
Uncomment the constraint after the &lt;code&gt;Distinct&lt;/code&gt; line to constrain tags
to fit inside their mask. Here is a model with the new constraints:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-Lisp&quot;&gt;(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; tag-pair () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000000000004&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; tag-nil () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x32212a2aa3282220&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; mask-char () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000010000000&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; mask-nil () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x32212a2aa3282220&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; tag-char () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000010000000&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; tag-boolean () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0100000000000000&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; tag-fixnum () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000000000000&lt;/span&gt;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Z3 really got creative here with the tag and mask for &lt;em&gt;nil&lt;/em&gt;, but
fixnums are back to zero tags, so that’s good. In general the tags are
a bit too large, so let’s enable the next constraint, saying that the
masks should be 8-bit values. Here is the new model (again, several
models are possible):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-Lisp&quot;&gt;sat
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; tag-pair () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000000000004&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; tag-nil () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000000000040&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; mask-char () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000000000061&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; mask-nil () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000000000040&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; tag-char () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000000000061&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; tag-boolean () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000000000080&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; tag-fixnum () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000000000000&lt;/span&gt;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This iterative approach to the design is where using a theorem prover
really shines. In this model the &lt;em&gt;nil&lt;/em&gt; value and booleans are actually
fixnums, which is wrong, so assertions should be added that prevent
this from happening. But now on to some cleverness.&lt;/p&gt;
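&lt;p&gt;The collision is easy to confirm by hand. This plain-Python sketch copies the tag values from the model above and runs each through the fixnum test; the dictionary and the helper name are only for illustration.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-Python&quot;&gt;# Tag values copied from the model above.
model = {
    'pair': 0x04,
    'nil': 0x40,
    'char': 0x61,
    'boolean': 0x80,
    'fixnum': 0x00,
}

def passes_fixnum_check(word):
    # word % 8 == 0 is the fixnum test: three low bits all zero.
    return word % 8 == 0

# Every non-fixnum tag that still passes the fixnum test is a
# collision that a further assertion has to rule out.
collisions = sorted(name for name, tag in model.items()
                    if name != 'fixnum' and passes_fixnum_check(tag))
print(collisions)  # prints ['boolean', 'nil']
&lt;/code&gt;&lt;/pre&gt;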
&lt;h1 id=&quot;clever-masking&quot;&gt;Clever masking&lt;/h1&gt;
&lt;p&gt;The masks in the current design are small and neat and fit as
immediates in instruction encodings. However, anyone who is familiar
with the x86 instruction set knows that registers can be addressed in
smaller parts without separate masking. Here is a handy table for one
of the registers:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style=&quot;text-align:right&quot;&gt;Register name&lt;/th&gt;
&lt;th style=&quot;text-align:center&quot;&gt;Register size&lt;/th&gt;
&lt;th&gt;Addressed data&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;code&gt;rax&lt;/code&gt;&lt;/td&gt;
&lt;td style=&quot;text-align:center&quot;&gt;64&lt;/td&gt;
&lt;td&gt;&lt;code&gt;rax&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;code&gt;eax&lt;/code&gt;&lt;/td&gt;
&lt;td style=&quot;text-align:center&quot;&gt;32&lt;/td&gt;
&lt;td&gt;&lt;code&gt;rax &amp;amp; 0xffffffff&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;code&gt;ax&lt;/code&gt;&lt;/td&gt;
&lt;td style=&quot;text-align:center&quot;&gt;16&lt;/td&gt;
&lt;td&gt;&lt;code&gt;rax &amp;amp; 0xffff&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;code&gt;al&lt;/code&gt;&lt;/td&gt;
&lt;td style=&quot;text-align:center&quot;&gt;8&lt;/td&gt;
&lt;td&gt;&lt;code&gt;rax &amp;amp; 0xff&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style=&quot;text-align:right&quot;&gt;&lt;code&gt;ah&lt;/code&gt;&lt;/td&gt;
&lt;td style=&quot;text-align:center&quot;&gt;8&lt;/td&gt;
&lt;td&gt;&lt;code&gt;(rax &amp;gt;&amp;gt; 8) &amp;amp; 0xff&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Using &lt;code&gt;al&lt;/code&gt; would make it possible to type check without applying the
mask explicitly, provided the mask is exactly &lt;code&gt;0xff&lt;/code&gt;. This also means that the
original value is not overwritten, so a temporary register is not
needed. Decreasing register pressure is important when optimizing some
code, e.g. tight loops.&lt;/p&gt;
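&lt;p&gt;The sub-register views can be mimicked in plain Python by laying a value out as bytes. This is only a sketch of which bits each name selects, not a model of the processor, and the register contents are made up.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-Python&quot;&gt;# Hypothetical 64-bit register contents, chosen so each byte differs.
rax = 0x1122334455667788

# Lay the value out as little-endian bytes so that index 0 is the
# least significant byte.
raw = rax.to_bytes(8, 'little')

al = raw[0]                               # low 8 bits
ah = raw[1]                               # bits 8 to 15
ax = int.from_bytes(raw[:2], 'little')    # low 16 bits
eax = int.from_bytes(raw[:4], 'little')   # low 32 bits

print(hex(al), hex(ah), hex(ax), hex(eax))
# prints 0x88 0x77 0x7788 0x55667788
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Reading &lt;code&gt;al&lt;/code&gt; gives the same bits as masking with &lt;code&gt;0xff&lt;/code&gt;, without writing to any register.&lt;/p&gt;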
&lt;p&gt;Let us see what Chez Scheme does here.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-x86asm&quot;&gt;&amp;gt; (#%$assembly-output #t)
&amp;gt; (lambda (x) (char=? x #\space))
&lt;span class=&quot;symbol&quot;&gt;
entry.21:&lt;/span&gt;
&lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;:       cmpi           (imm &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;), %ac0
&lt;span class=&quot;number&quot;&gt;4&lt;/span&gt;:       bne            lf&lt;span class=&quot;meta&quot;&gt;.20&lt;/span&gt;(&lt;span class=&quot;number&quot;&gt;66&lt;/span&gt;)
&lt;span class=&quot;symbol&quot;&gt;dcl.22:&lt;/span&gt;
&lt;span class=&quot;number&quot;&gt;6&lt;/span&gt;:       &lt;span class=&quot;keyword&quot;&gt;mov&lt;/span&gt;            %r8, %rcx
&lt;span class=&quot;number&quot;&gt;9&lt;/span&gt;:       andi           (imm &lt;span class=&quot;number&quot;&gt;255&lt;/span&gt;), %rcx
&lt;span class=&quot;number&quot;&gt;16&lt;/span&gt;:      cmpi           (imm &lt;span class=&quot;number&quot;&gt;22&lt;/span&gt;), %rcx
&lt;span class=&quot;number&quot;&gt;20&lt;/span&gt;:      bne            lf&lt;span class=&quot;meta&quot;&gt;.19&lt;/span&gt;(&lt;span class=&quot;number&quot;&gt;31&lt;/span&gt;)
&lt;span class=&quot;symbol&quot;&gt;lt.23:&lt;/span&gt;
…
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Hmm! The equivalent C code is &lt;code&gt;rcx = (r8 &amp;amp; 255);&lt;/code&gt; &lt;code&gt;if (rcx != 22) {&lt;/code&gt;
&lt;code&gt;goto error; }&lt;/code&gt;. It turns out that Chez Scheme doesn’t know that it can
use &lt;code&gt;r8l&lt;/code&gt; to do the check without involving a temporary register, even
though the mask allows for this. Perhaps Chez Scheme has some
low-hanging fruit for the intrepid compiler developer.&lt;/p&gt;
&lt;p&gt;When you find a trick that you want to use in your tagging system, you
just add it as a constraint. This Z3Py snippet adds the new constraint
for characters:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-Python&quot;&gt;s.add(Or(mask_char == &lt;span class=&quot;number&quot;&gt;0xff&lt;/span&gt;,         &lt;span class=&quot;comment&quot;&gt;# for free with al&lt;/span&gt;
         mask_char == &lt;span class=&quot;number&quot;&gt;0xffff&lt;/span&gt;,       &lt;span class=&quot;comment&quot;&gt;# ax&lt;/span&gt;
         mask_char == &lt;span class=&quot;number&quot;&gt;0xffffffff&lt;/span&gt;))  &lt;span class=&quot;comment&quot;&gt;# eax&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;h1 id=&quot;clever-shifting&quot;&gt;Clever shifting&lt;/h1&gt;
&lt;p&gt;Any language with both characters and integers needs some way to
convert between them and Scheme is no exception. Chez Scheme’s
implementation of &lt;code&gt;char-&amp;gt;integer&lt;/code&gt; contains a piece of cleverness (not
unique to itself) that works because of how the character and fixnum
tags are arranged:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-x86asm&quot;&gt;&amp;gt; (#%$assembly-output #t)
&amp;gt; (lambda (x) (char-&amp;gt;integer x))
&lt;span class=&quot;symbol&quot;&gt;
entry.28:&lt;/span&gt;
&lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;:       cmpi           (imm &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;), %ac0
&lt;span class=&quot;number&quot;&gt;4&lt;/span&gt;:       bne            lf&lt;span class=&quot;meta&quot;&gt;.27&lt;/span&gt;(&lt;span class=&quot;number&quot;&gt;39&lt;/span&gt;)
&lt;span class=&quot;symbol&quot;&gt;dcl.29:&lt;/span&gt;
&lt;span class=&quot;number&quot;&gt;6&lt;/span&gt;:       &lt;span class=&quot;keyword&quot;&gt;mov&lt;/span&gt;            %r8, %rcx
&lt;span class=&quot;number&quot;&gt;9&lt;/span&gt;:       andi           (imm &lt;span class=&quot;number&quot;&gt;255&lt;/span&gt;), %rcx
&lt;span class=&quot;number&quot;&gt;16&lt;/span&gt;:      cmpi           (imm &lt;span class=&quot;number&quot;&gt;22&lt;/span&gt;), %rcx
&lt;span class=&quot;number&quot;&gt;20&lt;/span&gt;:      bne            lf&lt;span class=&quot;meta&quot;&gt;.26&lt;/span&gt;(&lt;span class=&quot;number&quot;&gt;11&lt;/span&gt;)
&lt;span class=&quot;symbol&quot;&gt;lt.30:&lt;/span&gt;
&lt;span class=&quot;number&quot;&gt;22&lt;/span&gt;:      &lt;span class=&quot;keyword&quot;&gt;mov&lt;/span&gt;            %r8, %ac0
&lt;span class=&quot;number&quot;&gt;25&lt;/span&gt;:      lsri           (imm &lt;span class=&quot;number&quot;&gt;5&lt;/span&gt;), %ac0
…
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It first does a type check on &lt;code&gt;r8&lt;/code&gt; to ensure that it’s a character.
Then it writes the return value as (in C terms) &lt;code&gt;r8 &amp;gt;&amp;gt; 5&lt;/code&gt;. How can a
shift by 5 work with no unmasking or tagging? Let’s add it as a constraint. This
requires some leg work, see the comments:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-Python&quot;&gt;&lt;span class=&quot;comment&quot;&gt;# SPDX-License-Identifier: MIT&lt;/span&gt;
&lt;span class=&quot;comment&quot;&gt;# Unchanged from before&lt;/span&gt;
&lt;span class=&quot;keyword&quot;&gt;from&lt;/span&gt; __future__ &lt;span class=&quot;keyword&quot;&gt;import&lt;/span&gt; print_function
&lt;span class=&quot;keyword&quot;&gt;from&lt;/span&gt; z3 &lt;span class=&quot;keyword&quot;&gt;import&lt;/span&gt; *
tag_fixnum   = BitVec(&lt;span class=&quot;string&quot;&gt;'tag-fixnum'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
tag_pair     = BitVec(&lt;span class=&quot;string&quot;&gt;'tag-pair'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
tag_char     = BitVec(&lt;span class=&quot;string&quot;&gt;'tag-char'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
tag_boolean  = BitVec(&lt;span class=&quot;string&quot;&gt;'tag-boolean'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
tag_nil      = BitVec(&lt;span class=&quot;string&quot;&gt;'tag-nil'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
mask_fixnum  = BitVecVal(&lt;span class=&quot;number&quot;&gt;0b111&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
mask_pair    = BitVecVal(&lt;span class=&quot;number&quot;&gt;0b111&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
mask_char    = BitVec(&lt;span class=&quot;string&quot;&gt;'mask-char'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
mask_boolean = BitVec(&lt;span class=&quot;string&quot;&gt;'mask-boolean'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
mask_nil     = BitVec(&lt;span class=&quot;string&quot;&gt;'mask-nil'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
tags = ( tag_fixnum, tag_pair, tag_char, tag_boolean, tag_nil )
masks = ( mask_fixnum, mask_pair, mask_char, mask_boolean, mask_nil )
s = Solver()
x = BitVec(&lt;span class=&quot;string&quot;&gt;'x'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
y = BitVec(&lt;span class=&quot;string&quot;&gt;'y'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
s.add(ForAll([x, y],
             Implies(And((x &amp;amp; mask_fixnum) == tag_fixnum,
                         (y &amp;amp; mask_fixnum) == tag_fixnum),
                     ((x + y) &amp;amp; mask_fixnum) == tag_fixnum)))
s.add(Distinct(tags))
s.add([(tag &amp;amp; mask) == tag &lt;span class=&quot;keyword&quot;&gt;for&lt;/span&gt; (tag, mask) &lt;span class=&quot;keyword&quot;&gt;in&lt;/span&gt; zip(tags, masks)])
s.add([And(mask &amp;gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;, mask &amp;lt;= &lt;span class=&quot;number&quot;&gt;0xff&lt;/span&gt;) &lt;span class=&quot;keyword&quot;&gt;for&lt;/span&gt; mask &lt;span class=&quot;keyword&quot;&gt;in&lt;/span&gt; masks])
s.add(Or(mask_char == &lt;span class=&quot;number&quot;&gt;0xff&lt;/span&gt;,
         mask_char == &lt;span class=&quot;number&quot;&gt;0xffff&lt;/span&gt;,
         mask_char == &lt;span class=&quot;number&quot;&gt;0xffffffff&lt;/span&gt;))
&lt;span class=&quot;comment&quot;&gt;# New code starts here:&lt;/span&gt;

&lt;span class=&quot;comment&quot;&gt;# A trick (see Hacker's Delight) to get the shift amounts&lt;/span&gt;
&lt;span class=&quot;comment&quot;&gt;# that match the masks.&lt;/span&gt;
shift_char = BitVec(&lt;span class=&quot;string&quot;&gt;'shift-char'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
shift_fixnum = BitVec(&lt;span class=&quot;string&quot;&gt;'shift-fixnum'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
s.add(mask_char == ((&lt;span class=&quot;number&quot;&gt;1&lt;/span&gt; &amp;lt;&amp;lt; shift_char) - &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;))
s.add(mask_fixnum == ((&lt;span class=&quot;number&quot;&gt;1&lt;/span&gt; &amp;lt;&amp;lt; shift_fixnum) - &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;))
s.add(shift_char &amp;gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;)
s.add(shift_fixnum &amp;gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;)

&lt;span class=&quot;comment&quot;&gt;# Z3Py versions of char-&amp;gt;integer and integer-&amp;gt;char&lt;/span&gt;
&lt;span class=&quot;function&quot;&gt;&lt;span class=&quot;keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;title&quot;&gt;char_to_integer&lt;/span&gt;&lt;span class=&quot;params&quot;&gt;(ch)&lt;/span&gt;:&lt;/span&gt;
    &lt;span class=&quot;keyword&quot;&gt;return&lt;/span&gt; ((ch &amp;gt;&amp;gt; shift_char) &amp;lt;&amp;lt; shift_fixnum) | tag_fixnum
&lt;span class=&quot;function&quot;&gt;&lt;span class=&quot;keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;title&quot;&gt;integer_to_char&lt;/span&gt;&lt;span class=&quot;params&quot;&gt;(fx)&lt;/span&gt;:&lt;/span&gt;
    &lt;span class=&quot;keyword&quot;&gt;return&lt;/span&gt; ((fx &amp;gt;&amp;gt; shift_fixnum) &amp;lt;&amp;lt; shift_char) | tag_char

&lt;span class=&quot;comment&quot;&gt;# Sets fx_A to the fixnum that represents the 'A' code point&lt;/span&gt;
&lt;span class=&quot;comment&quot;&gt;# and sets ch_A to the character 'A'. Then asserts that the&lt;/span&gt;
&lt;span class=&quot;comment&quot;&gt;# conversion functions work. I'm a little bit lazy and do this&lt;/span&gt;
&lt;span class=&quot;comment&quot;&gt;# for 'A' instead of all chars.&lt;/span&gt;
ch_A = BitVec(&lt;span class=&quot;string&quot;&gt;'ch-A'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
fx_A = BitVec(&lt;span class=&quot;string&quot;&gt;'fx-A'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
s.add(fx_A == ((ord(&lt;span class=&quot;string&quot;&gt;'A'&lt;/span&gt;) &amp;lt;&amp;lt; shift_fixnum) | tag_fixnum))
s.add(ch_A == ((ord(&lt;span class=&quot;string&quot;&gt;'A'&lt;/span&gt;) &amp;lt;&amp;lt; shift_char) | tag_char))
s.add(char_to_integer(ch_A) == fx_A)
s.add(integer_to_char(fx_A) == ch_A)

&lt;span class=&quot;comment&quot;&gt;# Assert that no object satisfies both fixnump and charp.&lt;/span&gt;
&lt;span class=&quot;comment&quot;&gt;# Ideally there should be a complete set of these.&lt;/span&gt;
&lt;span class=&quot;function&quot;&gt;&lt;span class=&quot;keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;title&quot;&gt;fixnump&lt;/span&gt;&lt;span class=&quot;params&quot;&gt;(obj)&lt;/span&gt;:&lt;/span&gt; &lt;span class=&quot;keyword&quot;&gt;return&lt;/span&gt; (obj &amp;amp; mask_fixnum) == tag_fixnum
&lt;span class=&quot;function&quot;&gt;&lt;span class=&quot;keyword&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;title&quot;&gt;charp&lt;/span&gt;&lt;span class=&quot;params&quot;&gt;(obj)&lt;/span&gt;:&lt;/span&gt; &lt;span class=&quot;keyword&quot;&gt;return&lt;/span&gt; (obj &amp;amp; mask_char) == tag_char
s.add(ForAll([x], Implies(fixnump(x), Not(charp(x)))))
s.add(ForAll([x], Implies(charp(x), Not(fixnump(x)))))

&lt;span class=&quot;comment&quot;&gt;# Assert that char-&amp;gt;integer is equivalent to (ch &amp;gt;&amp;gt; n) for some n.&lt;/span&gt;
&lt;span class=&quot;comment&quot;&gt;# This is the main point of this section.&lt;/span&gt;
shift_ch_to_fx = BitVec(&lt;span class=&quot;string&quot;&gt;'shift-ch-&amp;gt;fx'&lt;/span&gt;, &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
s.add(char_to_integer(ch_A) == (ch_A &amp;gt;&amp;gt; shift_ch_to_fx))

print(s.check())
print(s.model().sexpr())
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Quite a bit of code, but it is necessary so that Z3 will not find a
loophole. Here is one possible output:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-Lisp&quot;&gt;sat
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; shift-char () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000000000008&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; ch-A () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000000004102&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; shift-fixnum () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000000000003&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; mask-char () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x00000000000000ff&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; tag-char () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000000000002&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; mask-nil () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x000000000000000c&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; tag-boolean () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000000000014&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; tag-pair () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000000000004&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; tag-nil () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x000000000000000c&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; tag-fixnum () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000000000000&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; fx-A () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000000000208&lt;/span&gt;)
(&lt;span class=&quot;name&quot;&gt;define-fun&lt;/span&gt; shift-ch-&amp;gt;fx () (&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; BitVec &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;)
  &lt;span class=&quot;number&quot;&gt;#x0000000000000005&lt;/span&gt;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The interesting part is &lt;code&gt;shift-ch-&amp;gt;fx&lt;/code&gt;, which is 5, just like in Chez
Scheme. Furthermore, if you add the assertion &lt;code&gt;shift_ch_to_fx != 5&lt;/code&gt;
then Z3 will say “model is not available”, meaning that only this
shift amount has the desired properties. No single assertion stands
out as the cause of this (although
it is possible to get other shift amounts if some constraints are
relaxed).&lt;/p&gt;
&lt;p&gt;This has also affected the selections of the other values in the
model. If you were to change the shift amount for characters then you
could no longer rely on this trick and Z3 would happily let you know
about it. If you rely on this trick in your compiler then it’s a good
idea to add it as an assertion. In fact, while writing your Z3Py code
it’s a good idea to add all your assumptions as assertions.&lt;/p&gt;
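&lt;p&gt;The same property can be sanity-checked without Z3. The following is a minimal sketch in plain Python that assumes the tag layout from the model above; the helper names are made up for this illustration and do not come from Chez Scheme.&lt;/p&gt;

```python
# Tag layout taken from the model printed above (one possible solution).
SHIFT_CHAR = 8      # characters: code point sits above an 8-bit tag
SHIFT_FIXNUM = 3    # fixnums: value sits above a 3-bit tag
TAG_CHAR = 0x02
TAG_FIXNUM = 0x00
SHIFT_CH_TO_FX = 5  # the single shift amount that Z3 found


def make_char(code_point):
    # Multiplying by 2**n is the same as shifting left by n bits.
    return (code_point * 2**SHIFT_CHAR) | TAG_CHAR


def make_fixnum(value):
    return (value * 2**SHIFT_FIXNUM) | TAG_FIXNUM


def char_to_integer(ch):
    # The trick: one right shift drops the char tag and leaves a fixnum.
    return ch >> SHIFT_CH_TO_FX


def check_all_code_points():
    # The Z3 script only asserts the property for 'A';
    # this loops over every Unicode code point instead.
    return all(char_to_integer(make_char(cp)) == make_fixnum(cp)
               for cp in range(0x110000))


print(check_all_code_points())
```

&lt;p&gt;If &lt;code&gt;SHIFT_CHAR&lt;/code&gt; is changed to 9 while &lt;code&gt;SHIFT_CH_TO_FX&lt;/code&gt; stays at 5, the check fails, which is the kind of breakage the Z3 assertion is meant to catch.&lt;/p&gt;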
&lt;h1 id=&quot;summing-up&quot;&gt;Summing up&lt;/h1&gt;
&lt;p&gt;I could go on and show a few more missed optimization opportunities in
Chez Scheme, like how it doesn’t type check multiple characters with
one branch, but I hope that you have already seen how Z3 is useful. It
lets you prove properties of your tagging system and, perhaps just as
importantly, lets you document your assumptions.&lt;/p&gt;
&lt;p&gt;Z3 itself is lacking in documentation, attempting to use tutorials as
a substitute, and its web site is full of dead links. When Z3 fails to
find a solution it either goes into a seemingly infinite spin or
prints unsat, hoping you will go away. (There is a way to get it to
print a counterexample, but the output is sadly incomprehensible.)
Quite often I found myself sitting at my terminal wondering why my
assertions were unsatisfiable.&lt;/p&gt;
&lt;p&gt;My advice, if you find yourself in the situation where you are 100%
certain something should work, is to remove the general cases and
add assertions for very specific cases. At some point a specific case
will cause Z3 to reject your assertions, which will give you a clue
as to what has gone wrong.&lt;/p&gt;
&lt;p&gt;Give it a spin if you’re thinking about revamping your tagging system
or if you want to add extra tricks and be certain that they are well
founded.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>R7RS versus R6RS</title>
      <link>https://weinholt.se/articles/r7rs-vs-r6rs/</link>
      <pubDate>Fri, 22 Jun 2018 02:00:00 +0200</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/r7rs-vs-r6rs/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;InPhase asked today on &lt;code&gt;#scheme&lt;/code&gt; about the R7RS vs R6RS debate. I
followed the original debate closely and have experience both using
and implementing R6RS. I also recently added R7RS support
in &lt;a href=&quot;https://akkuscm.org/&quot;&gt;Akku.scm&lt;/a&gt; 0.3.0, so I feel like I can weigh in on this. It’s
a topic that many feel passionately about, and I’m also firmly on one
side of the debate, but I will try to keep my own opinions and
hyperbole out of it this time.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;[Note from January 2020: This article is about R7RS-small.
As R7RS-large is coming together, new readers could assume that it is
about R7RS-large, but it is not. Thanks to cos on lobste.rs for
pointing this out.]&lt;/em&gt;&lt;/p&gt;
&lt;h1 id=&quot;the-number-argument&quot;&gt;The number argument&lt;/h1&gt;
&lt;p&gt;If you simply asked around, you might get the answer that R6RS is just
so much bigger than R5RS/R7RS (and bigger is presumed to be not as
good). It looks obvious on the surface, but defenders of R6RS see it
as a canard. Here are the numbers, based on
the &lt;a href=&quot;https://weinholt.se/scheme/r6rs/&quot;&gt;latest updated documents&lt;/a&gt;.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;AI Memo № 349: 43 pages.&lt;/li&gt;
&lt;li&gt;RRS: 35 pages.&lt;/li&gt;
&lt;li&gt;R2RS: 76 pages.&lt;/li&gt;
&lt;li&gt;R3RS: 43 pages.&lt;/li&gt;
&lt;li&gt;R4RS: 55 pages.&lt;/li&gt;
&lt;li&gt;R5RS: 50 pages.&lt;/li&gt;
&lt;li&gt;R6RS: 91 + 72 = 163 pages (not counting the non-normative appendices
and the rationale, which I think is fair).&lt;/li&gt;
&lt;li&gt;R7RS: 88 pages (similarly not counting the overview).&lt;/li&gt;
&lt;/ul&gt;
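&lt;p&gt;The relative sizes follow directly from these page counts. A small sketch of the arithmetic in Python (the function name is only for illustration):&lt;/p&gt;

```python
# Page counts from the list above.
PAGES = {'R5RS': 50, 'R6RS': 91 + 72, 'R7RS': 88}


def relative_size(report, baseline):
    """Size of `report` as a rounded percentage of `baseline`."""
    return round(100 * PAGES[report] / PAGES[baseline])


print(relative_size('R6RS', 'R7RS'))
print(relative_size('R6RS', 'R5RS'))
```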
&lt;p&gt;By these numbers we see that R6RS is 185% the size of R7RS and 326%
the size of R5RS. The argument looks true, at least on the surface.
These numbers hide the fact that R6RS contains 15 pages of formal
semantics.&lt;/p&gt;
&lt;p&gt;Is it fair to count these pages as a point against R6RS? The formal
semantics are a good reference for implementers who are unsure about
some corner of the language and can be used to validate an
implementation’s semantics. Here are the recent numbers again,
excluding formal semantics, appendices, bibliographies and indices:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;R5RS: 40 pages (excluding section 7.2 and forwards).&lt;/li&gt;
&lt;li&gt;R6RS: 60 + 65 = 125 pages (excluding Appendix A and forwards).&lt;/li&gt;
&lt;li&gt;R7RS: 65 pages (excluding section 7.2 and forwards).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The 65 pages from the R6RS standard libraries are the only remaining
part of the number argument that still holds.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;R7RS side&lt;/strong&gt;: The R6RS is too big. &lt;strong&gt;R6RS side&lt;/strong&gt;: It’s smaller than
it appears.&lt;/p&gt;
&lt;h1 id=&quot;the-condition-system&quot;&gt;The condition system&lt;/h1&gt;
&lt;p&gt;R6RS specifies a condition system with 14 standard condition types for
code that raises exceptions. R5RS does not provide any standard way to
handle or distinguish exceptions and not even a way to raise
exceptions. R7RS borrows &lt;code&gt;guard&lt;/code&gt;, &lt;code&gt;raise&lt;/code&gt; and &lt;code&gt;raise-continuable&lt;/code&gt; from
R6RS but does not specify a complete condition system.&lt;/p&gt;
&lt;p&gt;Instead of a condition system, R7RS has &lt;code&gt;error-object-message&lt;/code&gt;,
&lt;code&gt;error-object-irritants&lt;/code&gt;, &lt;code&gt;error-object?&lt;/code&gt;, &lt;code&gt;read-error?&lt;/code&gt; and
&lt;code&gt;file-error?&lt;/code&gt;. These are not necessarily meant to work with a new
distinct type, but may simply work with e.g. symbols and vectors. This
gives the implementer the freedom to reuse whatever condition objects
were used before they implemented R7RS support.&lt;/p&gt;
&lt;p&gt;The condition system in R6RS comes from the pain of trying to write
any kind of error handling at all in R5RS. It was not possible to,
let’s say, write portable code that reliably writes to the file system
and correctly handles I/O errors. In contrast, if an R6RS program
tries to open a file it does not have access to then it will get an
exception with an &lt;code&gt;&amp;amp;i/o-file-protection&lt;/code&gt; condition as well as a few
other conditions that together give a complete picture of the
condition. In Chez Scheme (which also adds the extra &lt;code&gt;&amp;amp;format&lt;/code&gt;):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-Scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;guard&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;exn&lt;/span&gt;
        ((&lt;span class=&quot;name&quot;&gt;i/o-file-protection-error?&lt;/span&gt; exn)
         (&lt;span class=&quot;name&quot;&gt;simple-conditions&lt;/span&gt; exn)))
  (&lt;span class=&quot;name&quot;&gt;open-file-output-port&lt;/span&gt; &lt;span class=&quot;string&quot;&gt;&quot;/dev/foo&quot;&lt;/span&gt;))
&lt;span class=&quot;comment&quot;&gt;;; =&amp;gt; (#&amp;lt;condition &amp;amp;i/o-file-protection&amp;gt; #&amp;lt;condition &amp;amp;format&amp;gt;&lt;/span&gt;
&lt;span class=&quot;comment&quot;&gt;;;     #&amp;lt;condition &amp;amp;who&amp;gt; #&amp;lt;condition &amp;amp;message&amp;gt;&lt;/span&gt;
&lt;span class=&quot;comment&quot;&gt;;;     #&amp;lt;condition &amp;amp;irritants&amp;gt; #&amp;lt;condition &amp;amp;continuation&amp;gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;User code has the same access to these conditions as the implementer
and can construct, inspect and pretty print them.&lt;/p&gt;
&lt;p&gt;However, this means that anyone implementing R6RS should go through
all their code and update it to raise the correct conditions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;R7RS side&lt;/strong&gt;: The condition system is too big and burdensome. &lt;strong&gt;R6RS
side&lt;/strong&gt;: It lets us write portable code that catches exceptions.&lt;/p&gt;
&lt;h1 id=&quot;undefined-behavior-controversy&quot;&gt;Undefined behavior controversy&lt;/h1&gt;
&lt;p&gt;This is a big philosophical difference between the reports. I’ll let
the documents themselves tell the story.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;As defined by this document, the Scheme programming
language is safe in the following sense: The execution of
a safe top-level program cannot go so badly wrong as to
crash or to continue to execute while behaving in ways
that are inconsistent with the semantics described in this
document, unless an exception is raised.&lt;/p&gt;
&lt;p&gt;&amp;mdash; Revised&lt;sup&gt;6&lt;/sup&gt; Report on the Algorithmic Language Scheme&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That’s for R6RS, although it does later leave room for implementations
to add unsafe features. But it’s clear that a program that doesn’t
import such extra libraries is safe. It will not have a buffer
overflow waiting for an attacker to use it.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;When speaking of an error situation, this report uses the
phrase “an error is signaled” to indicate that implementations
must detect and report the error. […]&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;If such wording does not appear&lt;/strong&gt; in the discussion of an error,
then implementations are not required to detect or
report the error, though they are encouraged to do so.
Such a situation is sometimes, but not always, referred
to with the phrase “an error.” In such a situation, an
implementation &lt;strong&gt;may or may not signal an error&lt;/strong&gt;; […]&lt;/p&gt;
&lt;p&gt;For example, it is an error for a procedure to be passed
an argument of a type that the procedure is not explicitly
specified to handle, even though such domain errors are
seldom mentioned in this report. Implementations may
signal an error, extend a procedure’s domain of definition
to include such arguments, or &lt;strong&gt;fail catastrophically&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;&amp;mdash; Revised&lt;sup&gt;7&lt;/sup&gt; Report on the Algorithmic Language Scheme&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;“Fail catastrophically” is presumably not too far away from the nasal
demons of C compilers. The report does not say what will happen if
a program does &lt;code&gt;(string-ref &amp;quot;&amp;quot; -1)&lt;/code&gt; or &lt;code&gt;(car 0)&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Implementation restrictions provide even more ways that things can go
wrong:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;This report uses the phrase “may report a violation of an
implementation restriction” to indicate circumstances under
which an implementation is permitted to report that
it is unable to continue execution of a correct program
because of some restriction imposed by the implementation.
Implementation restrictions are discouraged, but &lt;strong&gt;implementations
are encouraged to report violations of implementation
restrictions&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;For example, an implementation may report a violation of
an implementation restriction if it &lt;strong&gt;does not have enough
storage to run a program&lt;/strong&gt;, or if &lt;strong&gt;an arithmetic operation
would produce an exact number that is too large&lt;/strong&gt; for the
implementation to represent.&lt;/p&gt;
&lt;p&gt;&amp;mdash; Revised&lt;sup&gt;7&lt;/sup&gt; Report on the Algorithmic Language Scheme&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So an implementation is also within its rights to &lt;em&gt;not&lt;/em&gt; detect out of
memory errors or integer overflow. R6RS does not work that way:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Implementations must raise an exception when they are
unable to continue correct execution of a correct program
due to some implementation restriction. For example,
an implementation that does not support infinities
must raise an exception with condition type
&lt;code&gt;&amp;amp;implementation-restriction&lt;/code&gt; when it evaluates an
expression whose result would be an infinity.&lt;/p&gt;
&lt;p&gt;&amp;mdash; Revised&lt;sup&gt;6&lt;/sup&gt; Report on the Algorithmic Language Scheme&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;R7RS side&lt;/strong&gt;: Safety is not an essential language feature.
&lt;strong&gt;R6RS side&lt;/strong&gt;: Safety is an essential language feature.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;[Updated in January 2020: this previously stated the R7RS side as&lt;/em&gt;
“Safety is not a desirable language feature”, &lt;em&gt;but that does not
accurately describe the situation. Thanks to John Cowan for pointing
this out.]&lt;/em&gt;&lt;/p&gt;
&lt;h1 id=&quot;optional-is-better-argument&quot;&gt;Optional is better argument&lt;/h1&gt;
&lt;p&gt;R7RS requires that implementations support 7-bit ASCII (except for NUL
in strings). This is different from R5RS, which is character set
agnostic. And it’s different again from R6RS which requires full
Unicode support.&lt;/p&gt;
&lt;p&gt;Unicode is one of several optional features in R7RS. Appendix B gives
a list of feature identifiers that may be missing in any given
implementation:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;exact-closed&lt;/code&gt; - The algebraic operations &lt;code&gt;+&lt;/code&gt;, &lt;code&gt;-&lt;/code&gt;, &lt;code&gt;*&lt;/code&gt;, and &lt;code&gt;expt&lt;/code&gt;
where the second argument is a non-negative integer produce exact
values given exact inputs.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;exact-complex&lt;/code&gt; - Exact complex numbers are provided.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ieee-float&lt;/code&gt; - Inexact numbers are IEEE 754 binary floating point
values.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;full-unicode&lt;/code&gt; - All Unicode characters present in Unicode version
6.0 are supported as Scheme characters.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ratios&lt;/code&gt; - &lt;code&gt;/&lt;/code&gt; with exact arguments produces an exact result when
the divisor is nonzero.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A benefit of these features being optional is that R7RS is easier to
implement in certain environments. We can see that R7RS strings can be
implemented as C strings, which also do not support NUL characters. An
R7RS Scheme targeting ECMAScript can let &lt;code&gt;(+ 1 1)&lt;/code&gt; evaluate to &lt;code&gt;2.0&lt;/code&gt;
and &lt;code&gt;(/ 1 2)&lt;/code&gt; to &lt;code&gt;0.5&lt;/code&gt;. An R7RS targeting an AVR microcontroller can
exclude Unicode support. This will lead to more R7RS implementations,
which is good.&lt;/p&gt;
&lt;p&gt;Now the other side of the argument. Implementations which don’t
implement these features will likely list them as restrictions in
their documentation. Nothing stops an implementer from similarly
claiming compliance with R6RS and listing some restrictions. Some
targets will require certain restrictions, for example because of memory
limits on microcontrollers. But if these features are taken as
optional in the language itself then we can’t write portable code that
uses these features. The burden is on the user to provide a full
Unicode library if our software requires the use of Unicode.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;R7RS side&lt;/strong&gt;: It is too burdensome and/or restrictive to require all
these features. &lt;strong&gt;R6RS side&lt;/strong&gt;: It is too burdensome and/or difficult
to write portable code without these features.&lt;/p&gt;
&lt;h1 id=&quot;syntax-case&quot;&gt;syntax-case&lt;/h1&gt;
&lt;p&gt;The choice of macro system is a hotly contested issue. The result of
this particular controversy was that R7RS just kept &lt;code&gt;syntax-rules&lt;/code&gt;
from R5RS. In R6RS there is both &lt;code&gt;syntax-rules&lt;/code&gt; and the more powerful
&lt;code&gt;syntax-case&lt;/code&gt;, in which &lt;code&gt;syntax-rules&lt;/code&gt; can be written in just a few
lines.&lt;/p&gt;
&lt;p&gt;I don’t think I can properly do justice to the arguments for and
against &lt;code&gt;syntax-case&lt;/code&gt; in this article. There are other popular macro
systems with the same expressiveness, and perhaps the popularity of
some of those is the largest reason why R7RS didn’t choose
&lt;code&gt;syntax-case&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;In R5RS and R7RS it is not possible
to &lt;a href=&quot;http://okmij.org/ftp/Scheme/Dirty-Macros.pdf&quot;&gt;write macros that violate syntactic hygiene&lt;/a&gt;. The macro
system is based on rewriting rules, which happen to easily be Turing
complete, but which do not have access to the Scheme language itself.
They therefore cannot deconstruct strings or create new
identifiers. That’s why &lt;code&gt;define-record-type&lt;/code&gt; in R7RS (and SRFI 9)
requires the user to write out all procedure names for all fields.
This is also a simple motivation for wanting a more powerful macro
system.&lt;/p&gt;
&lt;p&gt;A macro expander like &lt;code&gt;syntax-rules&lt;/code&gt; is a very tricky piece of code
and &lt;code&gt;syntax-case&lt;/code&gt; is even trickier. Those interested can check
out &lt;a href=&quot;https://web.archive.org/web/20010615153947/https://www.cs.indiana.edu/~owaddell/papers/thesis.ps.gz&quot;&gt;Oscar Waddell’s PhD thesis&lt;/a&gt;. Requiring a tricky macro
system is obviously a burden for the implementer and perhaps another
reason why R7RS did not add &lt;code&gt;syntax-case&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;From the side of R6RS, adding &lt;code&gt;syntax-case&lt;/code&gt; made sense. It is a great
feature to have as a user of the language. We can run any Scheme code
at expansion time and easily write macros that insert new identifiers,
even while preserving hygiene.&lt;/p&gt;
&lt;h1 id=&quot;bottom-line&quot;&gt;Bottom line&lt;/h1&gt;
&lt;p&gt;I’ve not written about all controversies. The record system of R6RS
has also received criticism. But I have shown a number of essential
differences between R7RS and R6RS. I think that this is a fair
summary:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;R6RS is more demanding on implementers but easier on users.
Conversely, R7RS is easier on implementers but more demanding on
users.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Finally, a caveat for this whole article is that it applies to
R7RS-small vs R6RS. Much might change with R7RS-large.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>R7RS comes to Akku</title>
      <link>https://weinholt.se/articles/r7rs-comes-to-akku/</link>
      <pubDate>Sun, 10 Jun 2018 02:00:00 +0200</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/r7rs-comes-to-akku/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;I have made some strides with &lt;a href=&quot;https://akkuscm.org/&quot;&gt;Akku.scm&lt;/a&gt; since
the &lt;a href=&quot;https://weinholt.se/articles/introduction-to-akku-scm/&quot;&gt;introductory blog article&lt;/a&gt; and &lt;a href=&quot;https://groups.google.com/forum/#!topic/chez-scheme/bqkX9eUdEH4&quot;&gt;the announcement&lt;/a&gt; on
the Chez Scheme mailing list. The big feature on the horizon is
support for translating R7RS libraries to run on R6RS.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;But first of all I need to apologize for building version 0.2.3 with
libncurses6. Chez Scheme uses ncurses for its expression editor and
Debian sid, which I use, has just had a migration from libncurses5 to
libncurses6. Most Linux systems do not have that version of ncurses
yet, so many of you could not get the pre-compiled version of Akku to
run (the &lt;code&gt;+src&lt;/code&gt; version works). I will have this fixed for the next
release.&lt;/p&gt;
&lt;p&gt;So what is this about translating R7RS to R6RS? The subset of R7RS
that already exists in R6RS is quite large and the incompatible bits
are manageable. Anyone interested in the details can
read &lt;a href=&quot;http://www.schemeworkshop.org/2014/papers/Kato2014.pdf&quot;&gt;Implementing R7RS on an R6RS Scheme system&lt;/a&gt; (Kato,
2014). There are some syntactical differences, so a new reader is
needed and the &lt;code&gt;define-library&lt;/code&gt; forms need to be translated into
&lt;code&gt;library&lt;/code&gt; forms. Lastly the R7RS standard library needs to be
implemented.&lt;/p&gt;
&lt;p&gt;For this purpose I’ve written a reader called &lt;a href=&quot;https://akkuscm.org/packages/laesare/&quot;&gt;&lt;code&gt;laesare&lt;/code&gt;&lt;/a&gt;
that can handle both R6RS and R7RS. The bulk of the reader already
existed and I added support for the R7RS lexical syntax. Next I added
code to Akku to have it understand R7RS libraries and translate them
to R6RS libraries. The final piece of the puzzle is
the &lt;a href=&quot;https://akkuscm.org/packages/akku-r7rs/&quot;&gt;&lt;code&gt;akku-r7rs&lt;/code&gt;&lt;/a&gt; package that provides the standard
library. The latter is based on &lt;a href=&quot;https://github.com/okuoku/yuni/&quot;&gt;yuni&lt;/a&gt; by okuoku, with my own
additions.&lt;/p&gt;
&lt;p&gt;Some trickery was needed to support &lt;code&gt;include&lt;/code&gt; and &lt;code&gt;cond-expand&lt;/code&gt;. The
&lt;code&gt;include&lt;/code&gt; form in R7RS is somewhat loosely specified and leaves it up
to the implementation to decide how to search for the files, but in
practice the file paths are relative to the file where the &lt;code&gt;include&lt;/code&gt;
form appeared. This is not trivial to get working in straight up R6RS,
but is easy with some help from Akku.&lt;/p&gt;
&lt;p&gt;The next release of Akku will install the &lt;code&gt;(akku metadata)&lt;/code&gt; library
that describes all the libraries and assets (i.e. included files) that
exist in the project. The &lt;code&gt;include&lt;/code&gt; form uses this library to look up
the location of the referenced files:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-Scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; installed-assets
  '(((include &lt;span class=&quot;string&quot;&gt;&quot;match/match.scm&quot;&lt;/span&gt;)
      (&lt;span class=&quot;string&quot;&gt;&quot;chibi/match/match.scm&quot;&lt;/span&gt;)
      (chibi match))
    ...))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The location is relative to the library search path, which is how all
R6RS &lt;code&gt;include&lt;/code&gt; forms already do the job today. The metadata library
also contains a list of installed libraries so that &lt;code&gt;cond-expand&lt;/code&gt;
library clauses work. These tricks work because Akku is
project-oriented: it has full knowledge of all files it has installed.&lt;/p&gt;
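&lt;p&gt;As a rough illustration of the lookup, here is a sketch in Python rather than Scheme. The table mirrors the single entry shown above; the function name, the tuple layout and the &lt;code&gt;lib&lt;/code&gt; directory are assumptions made for this example and not Akku’s actual code.&lt;/p&gt;

```python
import posixpath

# Hypothetical Python mirror of one installed-assets entry:
# (include form, installed paths, library that contains the include).
INSTALLED_ASSETS = [
    (('include', 'match/match.scm'),
     ['chibi/match/match.scm'],
     ('chibi', 'match')),
]


def find_asset(library, filename, search_path):
    """Return candidate locations for a file included from `library`."""
    for (kind, original), installed, owner in INSTALLED_ASSETS:
        if kind == 'include' and original == filename and owner == library:
            # Installed paths are relative to the library search path.
            return [posixpath.join(directory, path)
                    for directory in search_path
                    for path in installed]
    return []


print(find_asset(('chibi', 'match'), 'match/match.scm', ['lib']))
```

&lt;p&gt;The point of the sketch is that the path written in the &lt;code&gt;include&lt;/code&gt; form is never resolved against the including file; it is only used as a key into the metadata.&lt;/p&gt;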
&lt;p&gt;Currently the R7RS support works with Chez Scheme. There is a slight
problem with Racket: it has its own &lt;code&gt;(scheme *)&lt;/code&gt; modules. Other
implementations either already have R7RS support (Sagittarius and
Larceny) or they don’t have a compatibility library in &lt;code&gt;akku-r7rs&lt;/code&gt;
and/or &lt;code&gt;chez-srfi&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Another fly in the ointment is that many of the packages
in &lt;a href=&quot;http://snow-fort.org/&quot;&gt;Snow&lt;/a&gt; require implementation-specific code, which doesn’t
exist yet for R6RS implementations. Hopefully that will change if Akku
gains some popularity.&lt;/p&gt;
&lt;p&gt;And finally, a bit of news for everyone who wants to use the git
master version of Akku: you no longer need to manually bootstrap the
dependencies. See &lt;a href=&quot;https://github.com/weinholt/akku/blob/master/CONTRIBUTING.md&quot;&gt;CONTRIBUTING.md&lt;/a&gt; for instructions. Happy hacking!&lt;/p&gt;
</description>
    </item>
    <item>
      <title>So many package managers</title>
      <link>https://weinholt.se/articles/so-many-scheme-package-managers/</link>
      <pubDate>Sun, 25 Feb 2018 01:00:00 +0100</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/so-many-scheme-package-managers/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;In a &lt;a href=&quot;https://weinholt.se/articles/introduction-to-akku-scm/&quot;&gt;previous article&lt;/a&gt;, I wrote about &lt;a href=&quot;https://github.com/weinholt/akku&quot;&gt;Akku.scm&lt;/a&gt;,
a package manager for Scheme. It is far from being the first package
manager or even the first for Scheme. There have been at least a dozen
failed attempts at getting something going.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://gitlab.com/rotty/dorodango&quot;&gt;Dorodango&lt;/a&gt; is another package manager aimed at R6RS. It
works more like a &lt;em&gt;system&lt;/em&gt; package manager (apt, dpkg) in some sense.
You point it at a repository and it can then install packages for you.
It is not project-oriented and does not help you with locking specific
dependencies for your project. It was later forked
as &lt;a href=&quot;https://github.com/ijp/guildhall&quot;&gt;Guildhall&lt;/a&gt; for GNU Guile. That one has also stopped
moving (perhaps they moved to Guix).&lt;/p&gt;
&lt;p&gt;Another package manager is &lt;a href=&quot;http://ravensc.com/&quot;&gt;Raven&lt;/a&gt;, which is targeted at Chez
Scheme. It wants to be installed by root into &lt;code&gt;/usr/local&lt;/code&gt; but can
then be used by unprivileged users. It’s quite new and doesn’t really
have a lot of features yet. It works and it can download some
packages. You can have a look at the code yourself; it’s quite small.&lt;/p&gt;
&lt;p&gt;There are a bunch of defunct package managers: Alex
Shinn’s &lt;a href=&quot;http://synthcode.com/scheme/common-scheme/doc/common-scheme.html&quot;&gt;Common-Scheme&lt;/a&gt;, Will Donnelly’s &lt;a href=&quot;https://web.archive.org/web/20100825100541/http://ucl.willdonnelly.net/&quot;&gt;UnCommon Lisp (UCL)&lt;/a&gt;,
Marc Feeley’s &lt;a href=&quot;http://snow.iro.umontreal.ca/&quot;&gt;Scheme Now!&lt;/a&gt; (the first Snow), Aaron
Hsu’s &lt;a href=&quot;https://github.com/arcfide/descot/&quot;&gt;DeSCoT&lt;/a&gt;, Higepon Taro Minowa’s &lt;a href=&quot;https://github.com/higepon/spon&quot;&gt;spon&lt;/a&gt; and Manuel
Serrano’s &lt;a href=&quot;http://www.stklos.net/scmpkg.html&quot;&gt;ScmPkg&lt;/a&gt;.&lt;/p&gt;
&lt;h1 id=&quot;centrifugal-vs-centripetal&quot;&gt;Centrifugal vs. centripetal&lt;/h1&gt;
&lt;p&gt;The &lt;a href=&quot;http://www.stklos.net/~eg/Publis/dls07.pdf&quot;&gt;ScmPkg paper&lt;/a&gt; (Serrano and Gallesio, 2007) is interesting reading.
They make the distinction between &lt;em&gt;centrifugal&lt;/em&gt; and &lt;em&gt;centripetal&lt;/em&gt;
approaches. The centrifugal approach is to attract users to the
language implementation by having a lot of libraries, so that a
community forms around it, which then contributes more libraries to
it. The centripetal approach is to create some common framework that,
if used, lets unmodified Scheme code run in multiple implementations.&lt;/p&gt;
&lt;p&gt;There are successful centrifugal projects. &lt;a href=&quot;http://www.schemespheres.org/&quot;&gt;SchemeSpheres&lt;/a&gt; is,
according to their site, &lt;em&gt;like the batteries (as in batteries
included) for the Gambit Scheme compiler&lt;/em&gt;. The site is a bit wonky at
the moment and the blog was last updated in 2014, but they have around
150 libraries. Even more successful, I would say, is &lt;a href=&quot;https://wiki.call-cc.org/eggs&quot;&gt;Eggs&lt;/a&gt; for
Felix Winkelmann’s Chicken Scheme. But the grand winner in this
contest must be &lt;a href=&quot;https://pkgs.racket-lang.org/&quot;&gt;Racket Packages&lt;/a&gt; for Racket. You know you have
a winner on your hands when they have four different sets of bindings
for ØMQ. (Snide remarks aside, they are very popular).&lt;/p&gt;
&lt;p&gt;There are centripetal approaches alive today, but they are not
anywhere near as popular as Eggs or Racket Packages. &lt;a href=&quot;http://snow-fort.org/&quot;&gt;Snow&lt;/a&gt; is the
successor to the old Snow (Scheme Now). It’s alive and in use by some
Scheme implementers and a handful of other people. Eerily similar to
Snow is &lt;a href=&quot;https://github.com/sethalves/snow2-client&quot;&gt;Snow2&lt;/a&gt; by Seth Alves. Some or all of this is the basis for
a &lt;a href=&quot;https://bitbucket.org/cowan/r7rs-wg1-infra/src/a7b7be6fc01f37bddc4ea700b02fcbad6c37e532/Snow.md?at=default&amp;amp;fileviewer=file-view-default&quot;&gt;rather flawed R7RS package repository format&lt;/a&gt;.&lt;/p&gt;
&lt;h1 id=&quot;my-pointy-analysis-the-rant-&quot;&gt;My pointy analysis (the rant)&lt;/h1&gt;
&lt;p&gt;The centrifugal approaches are excellent for the implementations that
they target. Chicken and Racket are themselves excellent
implementations and better off for their package repositories. But
it’s not directly useful to Scheme as such. The packages are not
portable between implementations. Not what I’m looking for.&lt;/p&gt;
&lt;p&gt;Centripetal is no good either. In my opinion, Snow suffers from the
same problems as other centripetal approaches. The packages in ScmPkg
used &lt;code&gt;.spi&lt;/code&gt; files that defined an interface and did &lt;code&gt;include&lt;/code&gt; of some
code from regular naked &lt;code&gt;.scm&lt;/code&gt; files. Snow does the same but calls
those files &lt;code&gt;.sld&lt;/code&gt; and relies heavily on &lt;code&gt;cond-expand&lt;/code&gt;. Same kind of
animal. So every library is divided in two files and on top of that
it’s &lt;a href=&quot;https://www.cqse.eu/en/blog/living-in-the-ifdef-hell/&quot;&gt;#ifdef Hell&lt;/a&gt; all over again.&lt;/p&gt;
&lt;p&gt;I don’t think Akku falls directly into either centrifugal or
centripetal, and neither should it. Something else is going on. As I
wrote in Akku’s README: &lt;em&gt;It grabs hold of code and vigorously shakes
it until it behaves properly&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Akku is built on the solid ground of R6RS Scheme. This provides a
target that is well-specified and stable. Sure, some things are not
specified, such as how libraries are stored in the file system, but
that is where Akku comes in and bridges the gap between the code and
the implementation. I think this is a better approach than, let’s say,
smugly not providing a REPL just because R6RS didn’t specify the
procedures necessary for it. (Such an implementation should, to be
consistent with its ideals, not permit loading libraries from the file
system either).&lt;/p&gt;
&lt;p&gt;R6RS does not have &lt;code&gt;cond-expand&lt;/code&gt;. Instead there is a de facto standard
of loading &lt;code&gt;.&amp;lt;impl&amp;gt;.sls&lt;/code&gt; before &lt;code&gt;.sls&lt;/code&gt; that all R6RS implementations
support, as well as Akku. This works better than the &lt;code&gt;cond-expand&lt;/code&gt;
forms which tend to show up in every little library file. (Can you
find the typo’d SRFI number hidden in one of the cond-expands in Chibi
Scheme?). The &lt;code&gt;.&amp;lt;impl&amp;gt;.sls&lt;/code&gt; approach, on the other hand, results in a
few libraries where all the non-portable stuff goes. You end up
building reusable abstractions instead of clutter. That is just
better.&lt;/p&gt;
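&lt;p&gt;A rough sketch of that lookup order, written in Python so that it
can be tried without an R6RS system at hand. The function name
&lt;code&gt;candidate_files&lt;/code&gt; and the way the implementation tag is passed in are
made up for this illustration; each real implementation has its own
search rules and extra extensions.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-python&quot;&gt;def candidate_files(library_name, impl):
    """Return the file names to try, most specific first.

    library_name is a list of strings such as ['mylib', 'compat'];
    impl is a short implementation tag such as 'chezscheme'.
    Both are assumptions made for this sketch.
    """
    base = '/'.join(library_name)
    # The implementation-specific file shadows the portable one.
    return [base + '.' + impl + '.sls', base + '.sls']


print(candidate_files(['mylib', 'compat'], 'chezscheme'))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Only the one compatibility library needs a second file; everything
that imports it stays free of conditionals.&lt;/p&gt;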
&lt;p&gt;And R6RS already supports libraries. ScmPkg had to target any number
of module systems. I think it is rather telling that the centripetal
approaches tend to expect implementers to write their own clients for
the package system.&lt;/p&gt;
&lt;h1 id=&quot;minor-r7rs-rant&quot;&gt;Minor R7RS rant&lt;/h1&gt;
&lt;p&gt;I cannot write my software with R7RS as the target language. My
critiques are too numerous and I don’t even know &lt;em&gt;how&lt;/em&gt; my name got
into the standard.&lt;/p&gt;
&lt;p&gt;However, the library system in R7RS is quite alright, basically being a
downgraded copy of that in R6RS, and a future Akku version is likely
to support R7RS/Snow packages. They would be installed both for use
directly in R7RS implementations and lightly converted for use in R6RS
implementations. Going in the other direction is not possible in the
general case (see the botched attempts at porting Industria to R7RS).&lt;/p&gt;
&lt;p&gt;To me R7RS-large looks like the centrifugal approach, except applied
directly to the language standard. Most of what it attempts to
accomplish has been possible with R6RS all along. More progress would
have been made in this if Scheme standardization had not been turned
into an “Us vs. Them” game. It disillusioned and demotivated many
schemers.&lt;/p&gt;
&lt;p&gt;Besides, implementers who want access to many libraries can add R6RS
support, which I think will be less effort than adding R7RS-large.&lt;/p&gt;
&lt;h1 id=&quot;security&quot;&gt;Security&lt;/h1&gt;
&lt;blockquote&gt;
&lt;p&gt;I stand in front of you. I’ll take the force of the blow. Protection.&lt;/p&gt;
&lt;p&gt;– Massive Attack – Protection&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Am I the only one to see that the &lt;a href=&quot;http://snow-fort.org/doc/spec/&quot;&gt;Snow repository format&lt;/a&gt;
uses &lt;a href=&quot;https://crypto.stackexchange.com/questions/31286/textbook-rsa-signature-scheme-security&quot;&gt;unpadded RSA signatures&lt;/a&gt;, to be verified by keys sent
next to the signatures over plain http? As we say in Sweden: &lt;em&gt;va&lt;/em&gt;?&lt;/p&gt;
&lt;p&gt;The other package managers listed are not any more serious about
security, but at least they don’t pretend to be. Racket
is again the best of the bunch.&lt;/p&gt;
&lt;p&gt;I’m taking Akku’s security seriously. That includes using standard
cryptographic protocols and algorithms, independently verifiable
signatures on the package index, message digests on all downloaded
code, no arbitrary code execution on installation and manual reviews
before code even shows up in the official index.&lt;/p&gt;
&lt;p&gt;We are all friends here but let’s not kid ourselves, the Internet is a
wild place, and let’s have some standards.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>Introduction to Akku.scm</title>
      <link>https://weinholt.se/articles/introduction-to-akku-scm/</link>
      <pubDate>Sat, 24 Feb 2018 01:00:00 +0100</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/introduction-to-akku-scm/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;For the past few months I’ve been working on &lt;a href=&quot;https://github.com/weinholt/akku&quot;&gt;Akku.scm&lt;/a&gt;, a
language package manager for R6RS Scheme. It’s not the first one for
Scheme and it’s not even the first for R6RS. But it’s here, it’s yet
another package manager, it works and I’m using it.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;Language package managers are specialized to some specific set of
programming languages. They are not general tools to distribute any
kind of software. The purpose is to aid developers in managing
the dependencies of their code. Here’s a quick demo:&lt;/p&gt;
&lt;p&gt;&lt;script src=&quot;https://asciinema.org/a/2czq2tV7tYqTan7wVOzaZvvqj.js&quot; id=&quot;asciicast-2czq2tV7tYqTan7wVOzaZvvqj&quot; async&gt;&lt;/script&gt;&lt;/p&gt;
&lt;p&gt;&lt;noscript&gt;
&lt;a href=&quot;https://asciinema.org/a/2czq2tV7tYqTan7wVOzaZvvqj&quot;&gt;&lt;img src=&quot;https://asciinema.org/a/2czq2tV7tYqTan7wVOzaZvvqj.png&quot; alt=&quot;asciicast&quot;&gt;&lt;/a&gt;
&lt;/noscript&gt;&lt;/p&gt;
&lt;p&gt;As the demonstration shows, there are three parts: manifest, lock and
install. I started writing Akku.scm in reverse order. I first wrote
the installer, then applied it to manually written lockfiles, and then
wrote the code that computed the lockfile automatically.&lt;/p&gt;
&lt;h1 id=&quot;installing-a-locked-set-of-projects&quot;&gt;Installing a locked set of projects&lt;/h1&gt;
&lt;p&gt;The lockfile contains specifications of projects to download and
install. Any code pulled in by Akku.scm’s installer will come from a
lock specification in the lockfile. Currently Akku can clone git
repositories and install R6RS libraries/programs, but the format is
flexible enough to support anything.&lt;/p&gt;
&lt;p&gt;The lockfile will always contain some cryptographic checksums. Today
this is the sha1 commit id to checkout in git. Even when a git tag is
used to checkout, it is verified against the sha1 commit id. This is
because git tags can be replaced and are not cryptographically secure
by themselves. When file downloads are later added they will have
their sha256 digests in the lock, and so on. The locks come from the
package index and that is signed with an OpenPGP signature.&lt;/p&gt;
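&lt;p&gt;For the file download case, the check could look something like the
sketch below. It is in Python and is not Akku’s code; the function name
&lt;code&gt;digest_matches&lt;/code&gt; is invented here, and the only library call
used is the standard &lt;code&gt;hashlib.sha256&lt;/code&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-python&quot;&gt;import hashlib


def digest_matches(data, expected_hex):
    """Compare the SHA-256 digest of data (bytes) with the locked value."""
    return hashlib.sha256(data).hexdigest() == expected_hex.lower()


# Stand-in for a digest that would normally be read from the lockfile.
locked = hashlib.sha256(b'(library (hello) ...)').hexdigest()

print(digest_matches(b'(library (hello) ...)', locked))
print(digest_matches(b'(library (evil) ...)', locked))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The point is that the expected digest travels with the signed index,
not with the download itself.&lt;/p&gt;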
&lt;p&gt;Other than code locations and checksums, the lockfile can also contain
instructions for what to do with the downloaded projects. By default
everything is treated the same, as R6RS libraries and programs.
Reasonable future extensions include: library name prefixing, library
filtering (to split a project into multiple packages), build
instructions for code loaded via FFIs, and conversion of R7RS
libraries.&lt;/p&gt;
&lt;p&gt;Installation is not as simple as just copying files that match
&lt;code&gt;*.sls&lt;/code&gt;. Akku has a repository scanner that analyzes all files to
locate and categorize libraries, programs, included files and license
notices. It furthermore has code to recreate the pathnames of
libraries based on the library names and the various rules used in
different Scheme implementations. Example: &lt;code&gt;(srfi :1 lists)&lt;/code&gt; is loaded
from &lt;code&gt;srfi/:1/lists.sls&lt;/code&gt; in Chez Scheme, &lt;code&gt;srfi/1/lists.sls&lt;/code&gt; in
IronScheme and &lt;code&gt;srfi/%3a1/lists.sls&lt;/code&gt; in Ikarus. Akku installs it at
all these locations.&lt;/p&gt;
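&lt;p&gt;The three spellings from that example can be reproduced with a few
lines of Python. This is only a sketch of the idea: the function name
&lt;code&gt;library_paths&lt;/code&gt; is hypothetical, and it handles nothing beyond the
leading colon shown above, whereas the real rules cover more characters
and more implementations.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-python&quot;&gt;def library_paths(name):
    """Return the path spellings from the (srfi :1 lists) example.

    name is a list of strings, e.g. ['srfi', ':1', 'lists'].
    """
    def join(encode):
        return '/'.join(encode(part) for part in name) + '.sls'

    return {
        # Chez Scheme keeps the colon as it is.
        'chez': join(lambda p: p),
        # IronScheme drops the leading colon.
        'ironscheme': join(lambda p: p[1:] if p.startswith(':') else p),
        # Ikarus percent-encodes the colon.
        'ikarus': join(lambda p: p.replace(':', '%3a')),
    }


for impl, path in library_paths(['srfi', ':1', 'lists']).items():
    print(impl, path)
&lt;/code&gt;&lt;/pre&gt;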
&lt;p&gt;The result is that you can simply point the installer at a repository
and it automatically figures out what code there is and how to install
it in the library path.&lt;/p&gt;
&lt;h1 id=&quot;solving&quot;&gt;Solving&lt;/h1&gt;
&lt;p&gt;I’ve briskly ripped out the dependency solver from Andreas
Rottmann’s &lt;a href=&quot;https://gitlab.com/rotty/dorodango&quot;&gt;Dorodango&lt;/a&gt;. The solver is a Scheme port of the
one in Debian’s aptitude. Its inputs are the dependencies listed in
the package manifest in combination with the package index. The output
is a set of package versions that go into the lockfile.&lt;/p&gt;
&lt;p&gt;All packages have SemVer versions and package dependencies are listed
as ranges with a syntax borrowed from npm. Essentially SemVer means
that versions are written as X.Y.Z, where X is incremented when
breaking backwards compatibility, Y when adding features and Z when
fixing bugs. Usually 0.x.x implies that compatibility is not
guaranteed.&lt;/p&gt;
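&lt;p&gt;As a small illustration of those rules, here is a Python sketch
that computes the next version for each kind of change. The function
&lt;code&gt;bump&lt;/code&gt; and its change names are invented for the example; pre-release
and build suffixes are ignored.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-python&quot;&gt;def bump(version, change):
    """Return the next X.Y.Z version for a 'breaking', 'feature' or 'fix' change."""
    major, minor, patch = (int(part) for part in version.split('.'))
    if change == 'breaking':
        return '%d.0.0' % (major + 1)
    if change == 'feature':
        return '%d.%d.0' % (major, minor + 1)
    if change == 'fix':
        return '%d.%d.%d' % (major, minor, patch + 1)
    raise ValueError('unknown change: ' + change)


print(bump('1.4.2', 'breaking'))  # 2.0.0
print(bump('1.4.2', 'feature'))   # 1.5.0
print(bump('1.4.2', 'fix'))       # 1.4.3
&lt;/code&gt;&lt;/pre&gt;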
&lt;p&gt;The solver’s job is to take the package’s direct dependencies and
select a set of compatible package versions that pulls in not just the
immediate dependencies, but also their dependencies, and then the next
level of dependencies. You need package A and package A needs package
B, so the lockfile must have both A and B. The trick for the solver is
to get the highest versions which are all compatible with each other.
This is in general a difficult problem, &lt;a href=&quot;https://research.swtch.com/version-sat&quot;&gt;NP-complete&lt;/a&gt;
actually.&lt;/p&gt;
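&lt;p&gt;To make the problem concrete, here is a deliberately naive Python
sketch that simply tries every combination, highest versions first. It
is not the aptitude algorithm and not what Akku does; the data layout
and the name &lt;code&gt;solve&lt;/code&gt; are assumptions for the example, and the toy index
is assumed to contain only packages that are actually needed. The
exhaustive search is exactly what stops scaling as the index grows.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-python&quot;&gt;import itertools


def solve(index, wanted):
    """Pick one version per package so that every dependency is satisfied.

    index maps a package name to {version: {dependency: allowed_versions}}.
    wanted maps the directly required packages to their allowed versions.
    Returns the first consistent choice found, or None.
    """
    names = sorted(index)
    options = [sorted(index[n], reverse=True) for n in names]
    for choice in itertools.product(*options):
        picked = dict(zip(names, choice))
        direct_ok = all(picked[n] in allowed for n, allowed in wanted.items())
        deps_ok = all(
            picked[dep] in allowed
            for n, v in picked.items()
            for dep, allowed in index[n][v].items()
        )
        if direct_ok and deps_ok:
            return picked
    return None


index = {
    'a': {(1, 0, 0): {'b': {(1, 0, 0)}}, (2, 0, 0): {'b': {(2, 0, 0)}}},
    'b': {(1, 0, 0): {}, (2, 0, 0): {}},
}
# The project accepts any a, but insists on b 1.0.0.
print(solve(index, {'a': {(1, 0, 0), (2, 0, 0)}, 'b': {(1, 0, 0)}}))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here the newest A cannot be used, because it needs a B that the
project has ruled out, so the search falls back to the older A.&lt;/p&gt;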
&lt;p&gt;For now the solver is working beautifully, but that might change as
the package index grows and dependencies grow more complex. The way
out of the NP trap is to switch to a simpler problem. Akku can later
be extended to do what npm does: if the dependencies want to use
package A both in version X and Y, then npm will install both X and Y
at the same time. I think that with proper care this will mostly work
in R6RS.&lt;/p&gt;
&lt;h1 id=&quot;infrastructure-not-there-yet&quot;&gt;Infrastructure - not there yet&lt;/h1&gt;
&lt;p&gt;The next natural steps for Akku involve infrastructure for publishing
and discovering packages. These are in progress. Currently the package
index can be updated with &lt;code&gt;akku update&lt;/code&gt;, but there also need to be
commands to publish packages and securely bind an OpenPGP key to the
publisher and the package names.&lt;/p&gt;
&lt;p&gt;To make the situation complete there also needs to be a web site
connected with the package index. Package documentation and testing
need to be handled as well. There must also be support for publishing
tarballs rather than git repos.&lt;/p&gt;
&lt;h1 id=&quot;epilogue&quot;&gt;Epilogue&lt;/h1&gt;
&lt;p&gt;I have mentioned npm a few times, but don’t be led to believe that I
think npm is a good example of a package manager. It seems that every
few months there’s a scandal about npm or some other popular package
manager. It has become something of a fashion to publicize package
manager failures. Everyone has heard of &lt;a href=&quot;http://blog.npmjs.org/post/141577284765/kik-left-pad-and-npm&quot;&gt;left-pad&lt;/a&gt; and recently it
would even &lt;a href=&quot;https://github.com/npm/npm/issues/19883&quot;&gt;chmod your filesystem&lt;/a&gt; into chaos. So now everyone
should switch to &lt;a href=&quot;https://yarnpkg.com/en/&quot;&gt;Yarn&lt;/a&gt;, which is advertised as… Mega Secure?
Golang’s &lt;code&gt;go get&lt;/code&gt; uses GitHub URLs without any commit ids and someone
thinks &lt;a href=&quot;https://donatstudios.com/GithubsTotalSecurityFacepalm&quot;&gt;it’s GitHub’s fault&lt;/a&gt; that things can go wrong. Okay.
It’s the new normal. (Russ Cox writes about &lt;a href=&quot;https://research.swtch.com/vgo&quot;&gt;vgo&lt;/a&gt; where these
problems are fixed).&lt;/p&gt;
&lt;p&gt;I have been wanting to get a package manager going for a while, but it
was only after reading Sam Boyer’s article
&lt;em&gt;&lt;a href=&quot;https://medium.com/@sdboyer/so-you-want-to-write-a-package-manager-4ae9c17d9527&quot;&gt;So you want to write a package manager&lt;/a&gt;&lt;/em&gt; that the final
pieces fell into place. Sam’s work is for Golang, which is somewhat
more popular than R6RS. Akku has an easier problem to solve.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;“Python is popular; we’re not!  We have an advantage!”&lt;/p&gt;
&lt;p&gt;&amp;mdash; Abdulaziz Ghuloum&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;For even more opinions about package managers and Scheme in general,
see the next article: &lt;a href=&quot;https://weinholt.se/articles/so-many-scheme-package-managers/&quot;&gt;So many package managers&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;
</description>
    </item>
    <item>
      <title>Linting Scheme with r6lint</title>
      <link>https://weinholt.se/articles/linting-r6rs-scheme/</link>
      <pubDate>Sat, 08 Apr 2017 02:00:00 +0200</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/linting-r6rs-scheme/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;I find it useful while working in Python, JavaScript or C to have
Emacs show me the location of code errors. For Python there
is &lt;a href=&quot;https://www.pylint.org/&quot;&gt;Pylint&lt;/a&gt; and for JavaScript one can use &lt;a href=&quot;http://jshint.com/&quot;&gt;JSHint&lt;/a&gt; and a few
others. And of course with C there was the original &lt;a href=&quot;http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.56.1841&quot;&gt;lint&lt;/a&gt;, but
today the compilers themselves generate quite good warnings. These
linters are easily integrated with Emacs via &lt;a href=&quot;http://www.flycheck.org/en/latest/&quot;&gt;Flycheck&lt;/a&gt;, which
highlights errors in the code. Finding that they produce too many
errors when fed Scheme code, I decided to make my own
linter, &lt;a href=&quot;https://github.com/weinholt/r6lint&quot;&gt;r6lint&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;img src=&quot;/articles/linting-r6rs-scheme/r6lint.png&quot; alt=&quot;Screenshot of r6lint in Emacs&quot;&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Some normal things are expected from a linter:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;It should not produce irrelevant messages.&lt;/li&gt;
&lt;li&gt;It should show the location of the problem.&lt;/li&gt;
&lt;li&gt;It should do some useful analysis.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I wanted my linter to warn about bad style (possibly according
to &lt;a href=&quot;https://mumble.net/~campbell/scheme/style.txt&quot;&gt;Riastradh’s Lisp Style Rules&lt;/a&gt;), improper usage of procedures and
to show the location of unused variables. This last part is something
that even the original &lt;a href=&quot;http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.56.1841&quot;&gt;lint&lt;/a&gt; did. Of course, a linter performs
static analysis and this limits what it can do. But it should at least
find some problems that compilers don’t bother to warn about. And a
linter can be a social tool, helping to spread awareness of good
style.&lt;/p&gt;
&lt;h1 id=&quot;practical-issues&quot;&gt;Practical issues&lt;/h1&gt;
&lt;p&gt;Scheme is pretty good for working with Scheme code, but in this case
the tools in the standard library are not adequate. The &lt;code&gt;read&lt;/code&gt;
procedure that parses S-expressions does not keep the source
information, so e.g. the location of procedure definitions would be
lost. The linter needs this information and that means a custom reader
must be used.&lt;/p&gt;
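&lt;p&gt;The core of the idea is small enough to sketch in Python: a
tokenizer that hands out the line and column together with each token.
This is not r6lint’s reader, just an illustration; the name
&lt;code&gt;tokens&lt;/code&gt; is made up, and strings, comments and the rest of the R6RS
lexical syntax are left out.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-python&quot;&gt;def tokens(text):
    """Yield (token, line, column) for parentheses and atoms in text."""
    line, column, start, atom = 1, 0, None, ''
    for ch in text + ' ':          # the extra space flushes a final atom
        if ch in '() \t\n':
            if atom:
                yield atom, start[0], start[1]
                atom = ''
            if ch in '()':
                yield ch, line, column
        else:
            if not atom:
                start = (line, column)   # remember where the atom began
            atom += ch
        if ch == '\n':
            line, column = line + 1, 0
        else:
            column += 1


for token in tokens('(define x\n  42)'):
    print(token)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A parser built on top of this can attach the position of the opening
parenthesis to every list it constructs.&lt;/p&gt;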
&lt;p&gt;Scheme is not limited to the syntax provided by the language designer
and compiler implementer. Code can define new syntax and package it
in libraries. R6RS Scheme supports &lt;code&gt;syntax-case&lt;/code&gt;, which allows macros
to run arbitrary code at compile time. These can introduce new
variables and control structures. If the linter didn’t understand
these then the analysis would be very lacking. So when the parsed
S-expressions are in memory they need to be macro expanded.&lt;/p&gt;
&lt;p&gt;The macro expander is however not exported by the standard libraries.
One of the reasons for this is that the output from the expander is
very implementation-specific. Exposing the macro expander wouldn’t
automatically mean that programs could do anything useful with its
output, because the forms it returns do not need to be standard Scheme
forms.&lt;/p&gt;
&lt;h1 id=&quot;practical-solutions&quot;&gt;Practical solutions&lt;/h1&gt;
&lt;p&gt;I happened to already have &lt;a href=&quot;https://github.com/weinholt/r6lint/blob/master/lib/reader.sls&quot;&gt;a lexer and parser for R6RS&lt;/a&gt; and for
this project I’ve improved it so that it keeps source information. I
also modified the reader to be tolerant of errors, so it can emit more
than one error message. If packaged separately it should be called
tra-la-la, because it will happily ignore all possible errors and
continue reading until end of file.&lt;/p&gt;
&lt;p&gt;The next part of the solution is a macro expander. For this I dug up
the &lt;a href=&quot;http://www.cs.indiana.edu/chezscheme/syntax-case/&quot;&gt;portable syntax-case&lt;/a&gt; implementation by Abdulaziz Ghuloum and
R.&amp;nbsp;Kent Dybvig. The official code repository is in Launchpad
as &lt;a href=&quot;https://launchpad.net/r6rs-libraries&quot;&gt;lp:r6rs-libraries&lt;/a&gt;, but some fixes and improvements can be found
in the psyntax embedded in Ikarus and &lt;a href=&quot;https://github.com/leppie/IronScheme&quot;&gt;IronScheme&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I made my own modifications to psyntax. There is a small change in how
source information is handled so that the reader’s annotations can be
used. Furthermore there is no longer any need for compatibility
libraries. One assumption of psyntax is that it will be integrated in
a Scheme implementation. This means that it needs access to a lower
level of the Scheme implementation than is accessible from R6RS. It
wants to read and set top-level variables that are reachable from
&lt;code&gt;eval&lt;/code&gt;’d code (it uses &lt;code&gt;eval&lt;/code&gt; to run user-provided macros). In R6RS
there is no portable interaction environment, which would normally
provide this kind of semantics. I’ve worked around this by placing
macro-defined global variables in a hashtable.&lt;/p&gt;
&lt;p&gt;Another problem is that psyntax needs to generate unique symbols. This
is an important feature for syntactical hygiene: if a macro contains
the variable &lt;em&gt;x&lt;/em&gt; it should not clash with the macro user’s variable
&lt;em&gt;x&lt;/em&gt;. In R6RS there isn’t really a &lt;code&gt;gensym&lt;/code&gt;, but macros still need to
be able to generate temporary names, so access to the host Scheme’s
&lt;code&gt;gensym&lt;/code&gt; can more or less be finagled by using the standard procedures
&lt;code&gt;generate-temporaries&lt;/code&gt; and &lt;code&gt;syntax-&amp;gt;datum&lt;/code&gt;. A requirement from the
linter (not psyntax) is that a gensym must be possible to turn back
into the original symbol. In Chez Scheme this isn’t a problem due to
an innovative &lt;code&gt;gensym&lt;/code&gt; that works with &lt;code&gt;symbol-&amp;gt;string&lt;/code&gt;. But generally
in other implementations the name returned could be anything, so the
linter saves all gensym names in a hashtable.&lt;/p&gt;
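&lt;p&gt;The bookkeeping amounts to a table from generated names back to
the symbols they were made from. A Python sketch of the same idea
follows; the names &lt;code&gt;gensym&lt;/code&gt; and &lt;code&gt;original_name&lt;/code&gt; and the
&lt;code&gt;g0&lt;/code&gt;, &lt;code&gt;g1&lt;/code&gt; naming scheme are assumptions for the example, not what
the linter actually produces.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-python&quot;&gt;import itertools

_counter = itertools.count()
_original = {}  # generated name, mapped to the symbol it was made from


def gensym(symbol):
    """Return a fresh name for symbol and remember where it came from."""
    fresh = 'g%d' % next(_counter)
    _original[fresh] = symbol
    return fresh


def original_name(name):
    """Map a generated name back; other names are returned unchanged."""
    return _original.get(name, name)


a = gensym('x')
b = gensym('x')
print(a != b, original_name(a), original_name(b), original_name('y'))
&lt;/code&gt;&lt;/pre&gt;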
&lt;p&gt;Finally the output from psyntax is records instead of S-expressions.
In part this eliminated the need to represent the void value, but
primarily it was useful to get a more general way to store source
information.&lt;/p&gt;
&lt;h1 id=&quot;lint-it&quot;&gt;Lint it&lt;/h1&gt;
&lt;p&gt;In r6lint the analysis happens on several levels. The lexer itself
warns about lexical violations, e.g. unexpected end of file,
characters outside the valid Unicode range and invalid identifiers.
The reader finds problems with mismatching braces and other structural
problems. The tokens from the lexer are also used to detect formatting
errors, e.g. hanging parentheses, trailing whitespace and other
whitespace issues.&lt;/p&gt;
&lt;p&gt;Syntactical violations are reported during macro expansion. Exceptions
from the expander are caught and transformed into something useful.
This doesn’t do much more than a compiler already does, except it
tries to preserve source information.&lt;/p&gt;
&lt;p&gt;The more interesting analysis has barely even been implemented, but a
proof of concept is there. The records returned by psyntax are fed
into a simple analyzer that warns about unused variables.&lt;/p&gt;
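&lt;p&gt;To give a feel for what such a pass does, here is a Python sketch
over a tiny invented expression language. The tuple forms and the name
&lt;code&gt;walk&lt;/code&gt; are assumptions for the example and have nothing to do with the
records psyntax really returns; source locations are left out to keep
it short.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-python&quot;&gt;def walk(expr):
    """Return (referenced free names, unused let-bound names) for expr.

    Forms: ('const', value), ('ref', name), ('call', f, arg, ...),
    ('let', name, init, body).
    """
    kind = expr[0]
    if kind == 'const':
        return set(), []
    if kind == 'ref':
        return {expr[1]}, []
    if kind == 'call':
        refs, unused = set(), []
        for sub in expr[1:]:
            sub_refs, sub_unused = walk(sub)
            refs |= sub_refs
            unused += sub_unused
        return refs, unused
    if kind == 'let':
        _, name, init, body = expr
        init_refs, init_unused = walk(init)
        body_refs, body_unused = walk(body)
        unused = init_unused + body_unused
        if name not in body_refs:
            unused.append(name)      # bound but never referenced
        return init_refs | (body_refs - {name}), unused
    raise ValueError('unknown form: ' + str(kind))


program = ('let', 'x', ('const', 1),
           ('let', 'y', ('const', 2),
            ('call', ('ref', 'display'), ('ref', 'x'))))
print(walk(program)[1])  # ['y']
&lt;/code&gt;&lt;/pre&gt;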
&lt;h1 id=&quot;wonders&quot;&gt;Wonders&lt;/h1&gt;
&lt;p&gt;I integrated the linter with my editor before it was working. At one
point while I was coding, the linter sprang to life and started to warn
about errors in itself. This sort of thing tends to happen a lot with
Scheme.&lt;/p&gt;
&lt;p&gt;In summary there is a new R6RS Scheme frontend that is designed to run
standalone in any R6RS implementation. It feeds a simple static
analyzer where new analysis passes can be plugged in. It’s an
interesting framework that I hope will grow more and more featureful.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>Structure of the ARM A64 instruction set</title>
      <link>https://weinholt.se/articles/arm-a64-instruction-set/</link>
      <pubDate>Sun, 29 Jan 2017 01:00:00 +0100</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/arm-a64-instruction-set/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;Earlier this year I bought a Raspberry Pi 3 to have as an AArch64
development machine. The fastest way to get familiar with an
instruction set is to write a disassembler for it and I’ve made one
for 64-bit ARM in R6RS Scheme as part of
the &lt;a href=&quot;https://github.com/weinholt/machine-code&quot;&gt;machine-code&lt;/a&gt; project.
The instruction set is called ARM A64, instructions are always 32 bits
wide and they have a neat structure which is pretty fast to decode in
software.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;The architecture has 31 integer registers (x0-x30). There is also a
stack pointer register and a zero register that always contains
zeroes. Both these registers are encoded as register number 31, and
it’s up to each instruction if an operand can use the stack pointer or
the zero register. The x30 register is used to store the return
address. These registers are all 64-bit registers and the lower 32
bits can be accessed using the names w0-w30 (with wsp and wzr for register 31). Operations that write to
the lower 32 bits also clear the upper 32 bits, just like on AMD64.&lt;/p&gt;
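&lt;p&gt;In a disassembler this means the register printer has to be told
which reading of number 31 applies. A Python sketch of that decision
follows; the function name and its parameters are made up here and are
not taken from the machine-code project.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-python&quot;&gt;def register_name(number, wide=True, sp_allowed=False):
    """Name integer register number (0-31) as the assembler would.

    wide selects the 64-bit (x) or 32-bit (w) view. sp_allowed says
    whether this operand position treats 31 as the stack pointer.
    """
    if number not in range(32):
        raise ValueError('register number out of range')
    prefix = 'x' if wide else 'w'
    if number == 31:
        if sp_allowed:
            return 'sp' if wide else 'wsp'
        return prefix + 'zr'
    return prefix + str(number)


print(register_name(30))                   # x30
print(register_name(31))                   # xzr
print(register_name(31, sp_allowed=True))  # sp
&lt;/code&gt;&lt;/pre&gt;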
&lt;p&gt;There are also 32 registers usable as either floating point registers
or 128-bit vector registers. As vectors they support different
arrangements that are either 64 or 128 bits in total, containing
8-bit, 16-bit, 32-bit or 64-bit quantities. There are many
instructions that operate on multiple quantities at the same time,
which is an interesting way to speed up code. Multiple loop iterations
can be run simultaneously.&lt;/p&gt;
&lt;p&gt;The instructions are documented in the &lt;a href=&quot;https://developer.arm.com/docs/ddi0487/a/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile&quot;&gt;ARM ARM for ARMv8-A&lt;/a&gt;. I’ve
counted, not including instruction aliases, 442 instruction mnemonics
(things like ADD, EOR, B.EQ, etc). They are organized in what is
basically a four-level table: main encoding, instruction group, decode
group and instruction. Chapter C4 of the manual follows the same
structure. This structure is nice for fast decoding, but it’s not
strictly necessary since all encodings at the instruction level still
need to have a unique meaning.&lt;/p&gt;
&lt;p&gt;For each instruction mnemonic there can be multiple variants that
enable the instruction to handle different types of operands. An
example of this is the FMUL instruction that multiplies two floating
point values. In a C program it would look like &lt;code&gt;a = b * c&lt;/code&gt;. In A64
assembler it might look like one of these, depending on what the
surrounding code does:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-armasm&quot;&gt;&lt;span class=&quot;symbol&quot;&gt;fmul&lt;/span&gt; &lt;span class=&quot;built_in&quot;&gt;s0&lt;/span&gt;, &lt;span class=&quot;built_in&quot;&gt;s1&lt;/span&gt;, &lt;span class=&quot;built_in&quot;&gt;s2&lt;/span&gt;            &lt;span class=&quot;comment&quot;&gt;;single precision floats&lt;/span&gt;
&lt;span class=&quot;symbol&quot;&gt;fmul&lt;/span&gt; &lt;span class=&quot;built_in&quot;&gt;d0&lt;/span&gt;, &lt;span class=&quot;built_in&quot;&gt;d1&lt;/span&gt;, &lt;span class=&quot;built_in&quot;&gt;d2&lt;/span&gt;            &lt;span class=&quot;comment&quot;&gt;;double precision floats&lt;/span&gt;
&lt;span class=&quot;symbol&quot;&gt;fmul&lt;/span&gt; v0.&lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;s, &lt;span class=&quot;built_in&quot;&gt;v1&lt;/span&gt;.&lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;s, &lt;span class=&quot;built_in&quot;&gt;v2&lt;/span&gt;.&lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;s   &lt;span class=&quot;comment&quot;&gt;;vectors with two singles&lt;/span&gt;
&lt;span class=&quot;symbol&quot;&gt;fmul&lt;/span&gt; v0.&lt;span class=&quot;number&quot;&gt;4&lt;/span&gt;s, &lt;span class=&quot;built_in&quot;&gt;v1&lt;/span&gt;.&lt;span class=&quot;number&quot;&gt;4&lt;/span&gt;s, &lt;span class=&quot;built_in&quot;&gt;v2&lt;/span&gt;.&lt;span class=&quot;number&quot;&gt;4&lt;/span&gt;s   &lt;span class=&quot;comment&quot;&gt;;vectors with four singles&lt;/span&gt;
&lt;span class=&quot;symbol&quot;&gt;fmul&lt;/span&gt; v0.&lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;d, &lt;span class=&quot;built_in&quot;&gt;v1&lt;/span&gt;.&lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;d, &lt;span class=&quot;built_in&quot;&gt;v2&lt;/span&gt;.&lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;d   &lt;span class=&quot;comment&quot;&gt;;vectors with two doubles&lt;/span&gt;
&lt;span class=&quot;symbol&quot;&gt;fmul&lt;/span&gt; &lt;span class=&quot;built_in&quot;&gt;s0&lt;/span&gt;, &lt;span class=&quot;built_in&quot;&gt;s1&lt;/span&gt;, &lt;span class=&quot;built_in&quot;&gt;v2&lt;/span&gt;.s[&lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;]       &lt;span class=&quot;comment&quot;&gt;;multiplies by a vector element&lt;/span&gt;
&lt;span class=&quot;symbol&quot;&gt;fmul&lt;/span&gt; &lt;span class=&quot;built_in&quot;&gt;d0&lt;/span&gt;, &lt;span class=&quot;built_in&quot;&gt;d1&lt;/span&gt;, &lt;span class=&quot;built_in&quot;&gt;v2&lt;/span&gt;.d[&lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;]
&lt;span class=&quot;symbol&quot;&gt;fmul&lt;/span&gt; v0.&lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;s, &lt;span class=&quot;built_in&quot;&gt;v1&lt;/span&gt;.&lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;s, &lt;span class=&quot;built_in&quot;&gt;v2&lt;/span&gt;.s[&lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;] &lt;span class=&quot;comment&quot;&gt;;combinations of the above&lt;/span&gt;
&lt;span class=&quot;symbol&quot;&gt;fmul&lt;/span&gt; v0.&lt;span class=&quot;number&quot;&gt;4&lt;/span&gt;s, &lt;span class=&quot;built_in&quot;&gt;v1&lt;/span&gt;.&lt;span class=&quot;number&quot;&gt;4&lt;/span&gt;s, &lt;span class=&quot;built_in&quot;&gt;v2&lt;/span&gt;.s[&lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;]
&lt;span class=&quot;symbol&quot;&gt;fmul&lt;/span&gt; v0.&lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;d, &lt;span class=&quot;built_in&quot;&gt;v1&lt;/span&gt;.&lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;d, &lt;span class=&quot;built_in&quot;&gt;v2&lt;/span&gt;.d[&lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That’s quite a few variants for a single mnemonic. Not all mnemonics
have this many variants, but depending on how one counts I estimate
that there are in total around 1000-2000 variants. The instruction set
designers had to fit all these variants into 32 bits, while at the
same time making space for instructions that encode relatively large
immediate operands, and not forgetting about leaving space for future
extensions. As if that wasn’t difficult enough, the instructions
should also be easy to decode with hardware.&lt;/p&gt;
&lt;h1 id=&quot;instruction-encodings&quot;&gt;Instruction encodings&lt;/h1&gt;
&lt;p&gt;I’ve extracted the tables from my disassembler, rendered them with
the &lt;a href=&quot;https://www.npmjs.com/package/bit-field&quot;&gt;bit-field package&lt;/a&gt;, and
made them slightly interactive. If you’re reading this in a browser
you can see the encodings below. The thing to notice is that each
layer adds extra fixed bits: fields that must be a fixed 0 or 1 value.
(The last level, the instruction level, is not shown in this table).
Two encodings under the same parent always have some differences in
these fields, so that they can be separated by an instruction decoder.
Click an encoding to expand the next level of encodings.&lt;/p&gt;
&lt;noscript&gt;
The table is not shown without JavaScript support in your browser.
&lt;/noscript&gt;

&lt;div id=&quot;instructions&quot; align=&quot;center&quot;&gt;
&lt;/div&gt;

&lt;p&gt;There are many conventions in the field names. Instructions that take
register operands encode them in fields named &lt;em&gt;Rd&lt;/em&gt;, &lt;em&gt;Rn&lt;/em&gt; and &lt;em&gt;Rm&lt;/em&gt;.
Immediate values (integers, PC-relative offsets, etc) are named &lt;em&gt;imm&lt;/em&gt;.
Fields that change the type of operation tend to be called &lt;em&gt;op&lt;/em&gt;N or
&lt;em&gt;opcode&lt;/em&gt;. In general a few of the fields encode the operation (or the
size of the operation) and the rest encode the operands.&lt;/p&gt;
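&lt;p&gt;As a rough illustration of these conventions, here is a small Python sketch that pulls the register fields out of one instruction word. The helper name &lt;code&gt;field&lt;/code&gt; is made up for this example. The bit positions (&lt;em&gt;Rd&lt;/em&gt; in bits 0-4, &lt;em&gt;Rn&lt;/em&gt; in bits 5-9, &lt;em&gt;Rm&lt;/em&gt; in bits 16-20) are those of the register form of &lt;code&gt;add&lt;/code&gt;; other encoding groups may place their fields differently.&lt;/p&gt;

```python
def field(word, low, width):
    """Extract `width` bits of `word`, starting at bit position `low`."""
    return (word >> low) % (2 ** width)

# add x3, x4, x5 (64-bit, shifted-register form with no shift applied)
word = 0x8B050083

rd = field(word, 0, 5)    # destination register
rn = field(word, 5, 5)    # first source register
rm = field(word, 16, 5)   # second source register

print(rd, rn, rm)
```

&lt;p&gt;The remaining bits of the word are the fixed bits and the operation fields that select this particular encoding.&lt;/p&gt;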
&lt;h1 id=&quot;room-to-grow&quot;&gt;Room to grow&lt;/h1&gt;
&lt;p&gt;The image below shows the encoding space of the instruction set. The
&lt;em&gt;x&lt;/em&gt; axis goes from 0 to 2&lt;sup&gt;16&lt;/sup&gt;-1 and encodes the lower 16 bits
of the instruction space, and the &lt;em&gt;y&lt;/em&gt; axis contains the upper 16 bits.
The different colors denote different decode groups, i.e. all the
encodings at the third level of the table above. (There is probably a
better representation).&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;a64-2048-crushed.png&quot; title=&quot;Encoding space used by ARM A64&quot;
  style=&quot;background: #202020&quot; alt=&quot;An image showing the 32-bit
  encoding space. Mostly there are horizontal thick lines of different
  colors. This shows that the higher 16 bits tend to keep similar
  instructions together, although there is some mirroring around the
  middle of the image.&quot; /&gt;&lt;/p&gt;
&lt;p&gt;All the dark spots are places where ARMv8 does not have any allocated
instructions, or the encoding is reserved. For many instructions there
are some fields that have reserved encodings and these are also dark.&lt;/p&gt;
&lt;p&gt;Even if instructions are kept to the fixed 32 bit encoding there is
still plenty of room for the instruction set to grow.&lt;/p&gt;
&lt;h1 id=&quot;impression&quot;&gt;Impression&lt;/h1&gt;
&lt;p&gt;ARM A64 is a quite clean instruction set with only a few quirks here
and there in its encoding. Compared to AMD64 it has twice the number
of registers, a clean separation of load/store instructions, clean
RISCy operands (mostly one destination register and two source
registers) and of course the register names and most mnemonics are
totally different. Both have 128-bit vector registers and 64-bit
integer registers and a 64-bit address space. They look quite similar,
except everything’s different.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>Splitting Industria</title>
      <link>https://weinholt.se/articles/splitting-industria/</link>
      <pubDate>Sat, 14 Jan 2017 01:00:00 +0100</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/splitting-industria/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;Recently a friend lent me the book &lt;em&gt;Start With Why&lt;/em&gt; by Simon Sinek. It
made a lot of sense to me and made me look at my own projects in a new
light. The &lt;a href=&quot;https://weinholt.se/industria&quot;&gt;Industria libraries&lt;/a&gt; are a set of libraries for
R6RS Scheme that do, well, quite a few different things. There’s
cryptography, compression, a few network protocols, various things,
but also an assembler and a few disassemblers. It has many things, but
it doesn’t truly have a “why”.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;The original idea was to make pure Scheme implementations of things
that are needed by user space. Libraries for those things that are
needed at the base of an operating system, and everything that’s
needed to communicate with computers running other modern operating
systems. So a few things made sense to add. But at some point there
were unnecessary things added (e.g. the broken FiSH crypto protocol
for IRC), and at some point the complexity of multiple systems is just
too much to contain in a single project (the DNS libraries languish,
the TLS client is not up to date, etc).&lt;/p&gt;
&lt;p&gt;So the original “why” got lost at some point. And now the project can
move in so many different directions that it’s not clear what, if
anything, should be done.&lt;/p&gt;
&lt;p&gt;That’s the background for my decision to split Industria into multiple
projects. The first split-off is
the &lt;a href=&quot;https://github.com/weinholt/machine-code&quot;&gt;machine-code&lt;/a&gt; project.
This is where Industria’s assembler, disassembler and object code
libraries have moved. To get some momentum into this project I’m also
writing a new disassembler for 64-bit ARM which will be released soon.&lt;/p&gt;
&lt;p&gt;There are a few other mega projects like Industria in the R6RS world.
I think it would be beneficial for more projects to do similar splits.
Of course, this will increase the need for a proper package manager
and, more importantly, a public package repository. But I see that as
a positive side effect. I think that we need to experience some
discomfort before we can gather enough motivation to improve our
infrastructure. There is
already &lt;a href=&quot;http://home.gna.org/dorodango/&quot;&gt;Dorodango&lt;/a&gt;, but I’m not aware
of a public package repository for R6RS. I think that would be an
interesting project in itself.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>Automated Testing of Zabavno</title>
      <link>https://weinholt.se/articles/zabavno-automated-testing/</link>
      <pubDate>Fri, 23 Dec 2016 01:00:00 +0100</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/zabavno-automated-testing/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;I had already been programming for twenty years before I started my
current project at Ericsson. During my time in the project I’ve come
to really appreciate a few things that were new to me, like Continuous
Integration (CI) and automated testing. I recently set up CI for
Zabavno on GitHub with a new test case generator and immediately found
bugs.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;h1 id=&quot;the-approach&quot;&gt;The approach&lt;/h1&gt;
&lt;p&gt;Zabavno is an x86 emulator and the x86 is a notoriously tricky
architecture. Of course, it’s not the first x86 emulator and not the
first one that needs testing. In the
paper &lt;a href=&quot;https://www.microsoft.com/en-us/research/publication/design-and-testing-of-a-cpu-emulator/&quot;&gt;Design and Testing of a CPU Emulator&lt;/a&gt; (2009, Forin and Liu)
a very systematic way to generate test cases for the x86 is described.
But the findings in their evaluation section are a bit puzzling to me.
When the processor manuals say that the flags are undefined it should
not really be surprising that they can have been modified. They
describe errors in the processor manual’s description of the
instruction encodings. My own approach to the x86 has been to use the
opcode tables, but even they have errors. They’ve also rediscovered
opcode 82, which is actually already in the opcode map. Their
techniques seem quite complex for what they accomplish. My main
takeaway from this paper is to generate test cases automatically and
run them on actual hardware, but to do it in an easier way. (I’ll also
be stealing their parity matrix for the &lt;code&gt;aas&lt;/code&gt; instruction).&lt;/p&gt;
&lt;p&gt;A simpler approach is to use the opcode tables to generate random
operands for instructions. The instructions can then be run in the
emulator and the results incorporated into a binary that runs them on
real hardware. I took a similar approach in &lt;a href=&quot;https://github.com/weinholt/schjig&quot;&gt;schjig&lt;/a&gt;, which is a
program that tests R6RS Scheme implementations by comparing two
implementations. In the 2009 paper they generate C programs that are
run under a “test execution engine”. This might be necessary later but
the first version will incorporate everything into the test binary,
which will be built using the &lt;a href=&quot;https://github.com/weinholt/machine-code&quot;&gt;machine-code&lt;/a&gt; x86 assembler.&lt;/p&gt;
&lt;h1 id=&quot;generating-test-cases&quot;&gt;Generating test cases&lt;/h1&gt;
&lt;p&gt;In schjig there is a table of Scheme procedures along with a
description of their arguments and return values (similar to what is
found in the R6RS documents themselves):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-Scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; ops
  '#(((&lt;span class=&quot;name&quot;&gt;bool&lt;/span&gt;) number? obj)
     ((&lt;span class=&quot;name&quot;&gt;bool&lt;/span&gt;) complex? obj)
     ((&lt;span class=&quot;name&quot;&gt;bool&lt;/span&gt;) real? obj)
     ((&lt;span class=&quot;name&quot;&gt;bool&lt;/span&gt;) rational? obj)
     ((&lt;span class=&quot;name&quot;&gt;bool&lt;/span&gt;) integer? obj)
     ((&lt;span class=&quot;name&quot;&gt;bool&lt;/span&gt;) real-valued? obj)
     ((&lt;span class=&quot;name&quot;&gt;bool&lt;/span&gt;) rational-valued? obj)
     ((&lt;span class=&quot;name&quot;&gt;bool&lt;/span&gt;) integer-valued? obj)
     ((&lt;span class=&quot;name&quot;&gt;bool&lt;/span&gt;) exact? z)
     ((&lt;span class=&quot;name&quot;&gt;bool&lt;/span&gt;) inexact? z)
     ((&lt;span class=&quot;name&quot;&gt;z&lt;/span&gt;) exact z)
     ((&lt;span class=&quot;name&quot;&gt;bool&lt;/span&gt;) = z z z ...)
     ((&lt;span class=&quot;name&quot;&gt;bool&lt;/span&gt;) &amp;lt; x x x ...)
…
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;These descriptions are used to generate test cases. If an argument has
type &lt;em&gt;z&lt;/em&gt; then a random complex number is generated as an argument, and
of course the right number of arguments should be generated. (The
table itself was partly generated automatically by eval’ing procedure calls
with random arguments. It turns out that the procedure signatures differ
slightly between Scheme implementations).&lt;/p&gt;
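&lt;p&gt;The idea can be sketched in a few lines of Python. This is not schjig’s code: the &lt;code&gt;GENERATORS&lt;/code&gt; table, the &lt;code&gt;random_call&lt;/code&gt; helper and the value ranges are invented for illustration, and a type followed by &lt;code&gt;...&lt;/code&gt; is assumed to mean “zero or more of these”.&lt;/p&gt;

```python
import random

# Hypothetical generators for a few of the type letters used in the table.
GENERATORS = {
    "obj": lambda rng: rng.choice([0, 1.5, "str", True, None]),
    "z": lambda rng: complex(rng.randint(-4, 4), rng.randint(-4, 4)),
    "x": lambda rng: rng.uniform(-4.0, 4.0),
}

def random_call(entry, rng, max_rest=3):
    """Build (procedure name, arguments) from an entry such as
    (("bool",), "=", "z", "z", "z", "...")."""
    _returns, name, *params = entry
    args = []
    i = 0
    while i != len(params):
        # A type followed by "..." may occur zero or more times.
        repeated = i + 1 != len(params) and params[i + 1] == "..."
        count = rng.randint(0, max_rest) if repeated else 1
        args.extend(GENERATORS[params[i]](rng) for _ in range(count))
        i += 2 if repeated else 1
    return name, args

rng = random.Random(2016)
print(random_call((("bool",), "=", "z", "z", "z", "..."), rng))
print(random_call((("bool",), "number?", "obj"), rng))
```

&lt;p&gt;The generated call would then be evaluated in both Scheme implementations and the results compared.&lt;/p&gt;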
&lt;p&gt;Instructions in &lt;a href=&quot;https://github.com/weinholt/machine-code/blob/master/disassembler/x86-opcodes.sls&quot;&gt;x86 opcode tables&lt;/a&gt; are described using what is
basically the same notation (albeit more complex, with nested tables):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-Scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; opcodes
  '#((&lt;span class=&quot;name&quot;&gt;add&lt;/span&gt; Eb Gb)
     (&lt;span class=&quot;name&quot;&gt;add&lt;/span&gt; Ev Gv)
     (&lt;span class=&quot;name&quot;&gt;add&lt;/span&gt; Gb Eb)
     (&lt;span class=&quot;name&quot;&gt;add&lt;/span&gt; Gv Ev)
     (&lt;span class=&quot;name&quot;&gt;add&lt;/span&gt; *AL Ib)
     (&lt;span class=&quot;name&quot;&gt;add&lt;/span&gt; *rAX Iz)
     #(&lt;span class=&quot;name&quot;&gt;Mode&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;push&lt;/span&gt; *ES) &lt;span class=&quot;literal&quot;&gt;#f&lt;/span&gt;)
     #(&lt;span class=&quot;name&quot;&gt;Mode&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;pop&lt;/span&gt; *ES) &lt;span class=&quot;literal&quot;&gt;#f&lt;/span&gt;)
     &lt;span class=&quot;comment&quot;&gt;;; 08&lt;/span&gt;
     (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;or&lt;/span&gt;&lt;/span&gt; Eb Gb)
     (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;or&lt;/span&gt;&lt;/span&gt; Ev Gv)
     (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;or&lt;/span&gt;&lt;/span&gt; Gb Eb)
     (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;or&lt;/span&gt;&lt;/span&gt; Gv Ev)
     (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;or&lt;/span&gt;&lt;/span&gt; *AL Ib)
     (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;or&lt;/span&gt;&lt;/span&gt; *rAX Iz)
     #(&lt;span class=&quot;name&quot;&gt;Mode&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;push&lt;/span&gt; *CS) &lt;span class=&quot;literal&quot;&gt;#f&lt;/span&gt;)
…
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The structure of the table itself is used to navigate the opcode
space. From the snippet above it can be seen that the &lt;code&gt;add&lt;/code&gt;
instruction uses opcodes 00 to 05, &lt;code&gt;push es&lt;/code&gt; is on 06, etc. Some
instructions take implicitly encoded operands (e.g. &lt;code&gt;*AL&lt;/code&gt; can only be
the &lt;code&gt;al&lt;/code&gt; register operand), but most operands need to be provided
separately using additional bytes. The &lt;code&gt;Eb&lt;/code&gt; opsyntax can be a byte
register or a byte memory reference. The encoding details are left to
the assembler and the task is just to generate operands that match the
requirements.&lt;/p&gt;
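&lt;p&gt;To make the “position is the opcode” idea concrete, here is the start of the table transcribed into Python. The list and the &lt;code&gt;opcodes_for&lt;/code&gt; helper are only an illustration; the &lt;code&gt;Mode&lt;/code&gt; entries are simplified to their first alternative, on the assumption that it is the one used outside 64-bit mode.&lt;/p&gt;

```python
# The first entries of the one-byte opcode table, in table order.
OPCODES = [
    ("add", "Eb", "Gb"),
    ("add", "Ev", "Gv"),
    ("add", "Gb", "Eb"),
    ("add", "Gv", "Ev"),
    ("add", "*AL", "Ib"),
    ("add", "*rAX", "Iz"),
    ("push", "*ES"),
    ("pop", "*ES"),
    ("or", "Eb", "Gb"),
]

def opcodes_for(mnemonic):
    """Return the opcode bytes whose table entry has the given mnemonic."""
    return [op for op, entry in enumerate(OPCODES) if entry[0] == mnemonic]

# The index in the table is the opcode byte itself.
for opcode, (mnemonic, *operands) in enumerate(OPCODES):
    print(f"{opcode:02X} {mnemonic} {' '.join(operands)}")
```

&lt;p&gt;A test generator can walk the table the same way and look only at the operand syntax of each entry to decide what kind of operands to produce.&lt;/p&gt;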
&lt;p&gt;Both schjig and the new test generator for Zabavno prefer to generate
integers that lie close to power-of-two boundaries. This tends to
uncover a lot of edge cases. The test generator is very simple right
now: at around 380 lines, it only generates byte register operands and
byte immediates, but it is easy to extend with additional operands.&lt;/p&gt;
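&lt;p&gt;One simple way to get such integers is sketched below in Python. This is a guess at the general technique, not the actual generator: the name &lt;code&gt;boundary_integer&lt;/code&gt;, the choice of offsets and the 32-bit default are all assumptions.&lt;/p&gt;

```python
import random

def boundary_integer(rng, bits=32):
    """Pick an integer at or next to a power of two, wrapped to `bits` bits."""
    exponent = rng.randint(0, bits)
    offset = rng.choice([-1, 0, 1])
    return (2 ** exponent + offset) % (2 ** bits)

rng = random.Random(1)
print([hex(boundary_integer(rng)) for _ in range(8)])
```

&lt;p&gt;Values of this shape, such as &lt;code&gt;#x7FFF&lt;/code&gt; or &lt;code&gt;#x80000001&lt;/code&gt;, sit right where carries, borrows and sign changes happen.&lt;/p&gt;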
&lt;p&gt;The tested instruction is placed in a Linux i386 ELF binary that’s
generated using the x86 assembler of the &lt;a href=&quot;https://github.com/weinholt/machine-code&quot;&gt;machine-code&lt;/a&gt; project.
There is no need to emit C code or interact with other tools at all,
except for the Linux kernel, so the test runtime environment is pretty
simple to build and execute. For each test case the ELF binary
contains a short setup sequence followed by the tested instruction
itself. Then it compares the actual register values with the register
values produced by Zabavno. If there’s a mismatch it prints a failure
report and exits with a non-zero status.&lt;/p&gt;
&lt;h1 id=&quot;bugs-found&quot;&gt;Bugs found&lt;/h1&gt;
&lt;p&gt;Even before the test program was finished it found a bug in the &lt;code&gt;dec&lt;/code&gt;
instruction. Here’s the report (note the difference in the &lt;em&gt;flags&lt;/em&gt;
line):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Test failed: dec-Eb-0
(mov (mem32+ scratch-flags) #x41)
(mov esp scratch-flags)
(popfd)
(mov eax #x10001)
(mov ecx #x80000001)
(mov edx #xFFFFFFF)
(mov ebx #x1)
(mov esp #x7FFF)
(mov ebp #x3FFFF)
(mov esi #x1FFFFF)
(mov edi #x40000001)
(dec al)
Result from emulation in Zabavno:
eax     #x00010000
ecx     #x80000001
edx     #x0FFFFFFF
ebx     #x00000001
esp     #x00007FFF
ebp     #x0003FFFF
esi     #x001FFFFF
edi     #x40000001
flags   #x00000257
Result from processor execution:
eax     #x00010000
ecx     #x80000001
edx     #x0FFFFFFF
ebx     #x00000001
esp     #x00007FFF
ebp     #x0003FFFF
esi     #x001FFFFF
edi     #x40000001
flags   #x00000247
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;It has so far found bugs in the AF flag handling of &lt;code&gt;adc&lt;/code&gt;, &lt;code&gt;dec&lt;/code&gt; and
&lt;code&gt;sbb&lt;/code&gt;. And it hasn’t even been activated yet for most instructions.
One complication with enabling this for more instructions is that
Zabavno doesn’t emulate the undefined processor flags “correctly” (it
just clears them). It remains to be seen what can be done there, but
it should not be very difficult for the emulator to track which flags
are undefined.&lt;/p&gt;
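&lt;p&gt;For reference, the flag behaviour of a byte-sized &lt;code&gt;dec&lt;/code&gt; can be sketched in Python as follows. This is not Zabavno’s code, and the function name &lt;code&gt;dec8&lt;/code&gt; is made up. The starting flags value &lt;code&gt;#x243&lt;/code&gt; is an assumption: the &lt;code&gt;#x41&lt;/code&gt; loaded by the test plus the always-set bit 1 and the interrupt flag.&lt;/p&gt;

```python
CF, PF, AF, ZF, SF, OF = 0x001, 0x004, 0x010, 0x040, 0x080, 0x800

def dec8(value, flags):
    """Emulate `dec` on a byte; returns (result, new_flags). CF is preserved."""
    result = (value - 1) % 256
    status = PF | AF | ZF | SF | OF
    flags = (flags | status) ^ status   # clear the flags that dec defines
    if bin(result).count("1") % 2 == 0:
        flags |= PF                     # even number of set bits
    if value % 16 == 0:
        flags |= AF                     # borrow out of the low nibble
    if result == 0:
        flags |= ZF
    if result >= 0x80:
        flags |= SF
    if value == 0x80:
        flags |= OF                     # -128 - 1 overflows to +127
    return result, flags

print([hex(v) for v in dec8(0x01, 0x243)])
```

&lt;p&gt;With &lt;code&gt;al&lt;/code&gt; equal to 1 there is no borrow out of the low nibble, so AF stays clear and the flags come out as &lt;code&gt;#x247&lt;/code&gt;, which is what the processor reported above.&lt;/p&gt;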
&lt;h1 id=&quot;continuous-integration&quot;&gt;Continuous Integration&lt;/h1&gt;
&lt;p&gt;The test suite should run automatically. GitHub offers
a &lt;a href=&quot;https://github.com/integrations/feature/continuous-integration&quot;&gt;large number of CI tools&lt;/a&gt;, so it can be hard to know where to
start. I naturally picked the one with a moustache, &lt;a href=&quot;https://travis-ci.org/&quot;&gt;Travis CI&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Setting up an account on Travis CI is just a matter of logging in via
your GitHub account and granting some innocuous permissions. Travis
will then let you enable testing for any of your repositories.&lt;/p&gt;
&lt;p&gt;Tests are configured in a configuration file that you commit to your
repository as &lt;code&gt;.travis.yml&lt;/code&gt;. Right now they don’t have a runtime for
Scheme. But they do have runtimes for C and most of the big popular
languages, so installing a Scheme as part of the build process isn’t
very difficult. (And besides that, they also let you use apt to
install packages from Ubuntu, and there are a bunch of Schemes
available through there). Here’s the configuration file used by
Zabavno (slightly reformatted for the web). It downloads Chez Scheme,
generates a test suite and runs it:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-yaml&quot;&gt;&lt;span class=&quot;attr&quot;&gt;language:&lt;/span&gt; c

&lt;span class=&quot;attr&quot;&gt;os:&lt;/span&gt;
&lt;span class=&quot;bullet&quot;&gt;  -&lt;/span&gt; linux

&lt;span class=&quot;attr&quot;&gt;compiler:&lt;/span&gt;
&lt;span class=&quot;bullet&quot;&gt;  -&lt;/span&gt; gcc

&lt;span class=&quot;attr&quot;&gt;before_script:&lt;/span&gt;
  &lt;span class=&quot;comment&quot;&gt;# Install Chez Scheme&lt;/span&gt;
&lt;span class=&quot;bullet&quot;&gt;  -&lt;/span&gt; &lt;span class=&quot;string&quot;&gt;&quot;wget https://github.com/cisco/ChezScheme/archive/master.zip
    -O ChezScheme-master.zip&quot;&lt;/span&gt;
&lt;span class=&quot;bullet&quot;&gt;  -&lt;/span&gt; unzip ChezScheme-master.zip
&lt;span class=&quot;bullet&quot;&gt;  -&lt;/span&gt; &lt;span class=&quot;string&quot;&gt;&quot;pushd ChezScheme-master &amp;amp;&amp;amp; ./configure
      --installprefix=$TRAVIS_BUILD_DIR/chez &amp;amp;&amp;amp;
     make &amp;amp;&amp;amp; make install &amp;amp;&amp;amp; popd&quot;&lt;/span&gt;
&lt;span class=&quot;bullet&quot;&gt;  -&lt;/span&gt; export PATH=$TRAVIS_BUILD_DIR/chez/bin:$PATH
&lt;span class=&quot;bullet&quot;&gt;  -&lt;/span&gt; export CHEZSCHEMELIBDIRS=$TRAVIS_BUILD_DIR/..:$TRAVIS_BUILD_DIR
  &lt;span class=&quot;comment&quot;&gt;# Install machine-code&lt;/span&gt;
&lt;span class=&quot;bullet&quot;&gt;  -&lt;/span&gt; &lt;span class=&quot;string&quot;&gt;&quot;wget https://github.com/weinholt/machine-code/archive/master.zip
      -O machine-code-master.zip&quot;&lt;/span&gt;
&lt;span class=&quot;bullet&quot;&gt;  -&lt;/span&gt; unzip machine-code-master.zip
&lt;span class=&quot;bullet&quot;&gt;  -&lt;/span&gt; mv machine-code-master machine-code

&lt;span class=&quot;attr&quot;&gt;script:&lt;/span&gt;
&lt;span class=&quot;bullet&quot;&gt;  -&lt;/span&gt; programs/zabavno --help
&lt;span class=&quot;bullet&quot;&gt;  -&lt;/span&gt; tests/x86/generate.scm &amp;amp;&amp;amp; chmod +x generate.out
&lt;span class=&quot;bullet&quot;&gt;  -&lt;/span&gt; ./generate.out
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Simply commit a file similar to this one, named &lt;code&gt;.travis.yml&lt;/code&gt;, and
push it to a branch. For the initial setup you can push it to a test
branch and Travis will still see it.&lt;/p&gt;
&lt;p&gt;Travis sends you emails about the build status and also shows the
build output as the build is happening. To top it all off there’s a
status image you can link to from your project. Now everyone can see
that the code is working.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://travis-ci.org/weinholt/zabavno&quot;&gt;&lt;img src=&quot;https://travis-ci.org/weinholt/zabavno.svg?branch=master&quot; alt=&quot;Build Status&quot;&gt;&lt;/a&gt;&lt;/p&gt;
</description>
    </item>
    <item>
      <title>Make Test Inputs with Prolog</title>
      <link>https://weinholt.se/articles/make-test-inputs-prolog/</link>
      <pubDate>Wed, 23 Nov 2016 01:00:00 +0100</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/make-test-inputs-prolog/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;A while back I wrote a parser for R6RS Scheme numbers, or the
&lt;code&gt;string-&amp;gt;number&lt;/code&gt; procedure. Numbers in Scheme are somewhat
sophisticated and can be written in some surprising variations and I
wanted some test inputs for verifying that the parser doesn’t crash on
valid inputs. Luckily, the number syntax is specified in such a way
that a Prolog program generating test inputs can easily be
written.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://www.swi-prolog.org/&quot;&gt;SWI-Prolog&lt;/a&gt; supports an alternative syntax
called &lt;a href=&quot;http://www.swi-prolog.org/pldoc/man?section=DCG&quot;&gt;definite clause grammars&lt;/a&gt; (DCG) that is suitable for this
task. The specification in R6RS is written in a similar BNF syntax so
translation is very easy. Except for this there is nothing particular
about Prolog itself that makes it suitable for this task and
alternatives such as &lt;a href=&quot;http://minikanren.org/&quot;&gt;miniKanren&lt;/a&gt; or &lt;a href=&quot;http://www.r6rs.org/corrected/html/r6rs/r6rs-Z-H-7.html#node_sec_4.2.1&quot;&gt;µKanren&lt;/a&gt; could be used
instead.&lt;/p&gt;
&lt;p&gt;Let’s get down to numbers. The datum syntax can be found
in &lt;a href=&quot;http://www.r6rs.org/corrected/html/r6rs/r6rs-Z-H-7.html#node_sec_4.2.1&quot;&gt;R6RS section 4.2.1&lt;/a&gt;, which has this to say:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The rules for ⟨num R⟩, ⟨complex R⟩, ⟨real R⟩, ⟨ureal R⟩, ⟨uinteger R⟩, and ⟨prefix R⟩ below should be replicated for R = 2, 8, 10, and 16. There are no rules for ⟨decimal 2⟩, ⟨decimal 8⟩, and ⟨decimal 16⟩, which means that number representations containing decimal points or exponents must be in decimal radix.&lt;/p&gt;
&lt;p&gt;In the following rules, case is insignificant.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So we’ll need to remember to replicate the rules for every radix and
to handle both upper and lower case letters. (Luckily DCG handles the
first for us). This text is followed up by rules that look something
like this (made to be less compact than in the specification):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;⟨digit⟩ → 0 ∣ 1 ∣ 2 ∣ 3 ∣ 4 ∣ 5 ∣ 6 ∣ 7 ∣ 8 ∣ 9
…
⟨number⟩ → ⟨num 2⟩ ∣ ⟨num 8⟩ ∣ ⟨num 10⟩ ∣ ⟨num 16⟩
⟨num R⟩ → ⟨prefix R⟩ ⟨complex R⟩
⟨complex R⟩ → ⟨real R⟩
    ∣ ⟨real R⟩ @ ⟨real R⟩
    ∣ ⟨real R⟩ + ⟨ureal R⟩ i
    ∣ ⟨real R⟩ - ⟨ureal R⟩ i
    ∣ ⟨real R⟩ + ⟨naninf⟩ i
    ∣ ⟨real R⟩ - ⟨naninf⟩ i
    ∣ ⟨real R⟩ + i
    ∣ ⟨real R⟩ - i
    ∣ + ⟨ureal R⟩ i
    ∣ - ⟨ureal R⟩ i
    ∣ + ⟨naninf⟩ i
    ∣ - ⟨naninf⟩ i
    ∣ + i
    ∣ - i
…
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;If one’s not familiar with BNF this might be tricky to read. Rules are
surrounded by ⟨brackets⟩ and can be referred to using the same syntax.
Things outside the brackets are right arrows (→) saying that the thing
on the left side is defined as what’s on the right side. The right
side contains references to rules, vertical bars (∣) to define
multiple options, and literal characters to say that “this character
must be here”.&lt;/p&gt;
&lt;p&gt;One way to use this is for determining if a particular string is a
valid number or not. The first rule defines a number as one of four
possible things. To see if a string is a number a program can follow
the rules and try to match every character in the string to a
character in a rule. For the string &lt;code&gt;&amp;quot;52697461&amp;quot;&lt;/code&gt; there should be a way
to get all the way from the &lt;code&gt;⟨number⟩&lt;/code&gt; rule to &lt;code&gt;⟨digit⟩&lt;/code&gt;, and not just
once but eight times. The path might be through &lt;code&gt;⟨num 10⟩&lt;/code&gt; or
&lt;code&gt;⟨num 16⟩&lt;/code&gt;, it doesn’t really matter which.&lt;/p&gt;
&lt;p&gt;Here is one (abbreviated) path that takes us through the rules from
&lt;code&gt;⟨number⟩&lt;/code&gt; to &lt;code&gt;&amp;quot;+i&amp;quot;&lt;/code&gt;: &lt;code&gt;⟨number⟩&lt;/code&gt; ⇒ &lt;code&gt;⟨num 2⟩&lt;/code&gt; ⇒ &lt;code&gt;⟨prefix 2⟩ ⟨complex
2⟩&lt;/code&gt; ⇒ (prefix can be empty) ⇒ &lt;code&gt;⟨complex 2⟩&lt;/code&gt; ⇒ &lt;code&gt;+ i&lt;/code&gt;. A program can be
written that walks all paths in the rules and when it finds a dead end
prints the characters it has collected along the way, and then
backtracks to continue on another path.&lt;/p&gt;
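&lt;p&gt;Such a walk can be sketched with Python generators, which backtrack in much the same way. The grammar below is a tiny made-up fragment covering only a few radix 2 cases, not the real R6RS rules, and the function names are invented for this example.&lt;/p&gt;

```python
# Each rule maps to a list of alternatives, and each alternative is a
# sequence of rule names or literal strings.
GRAMMAR = {
    "num 2": [["prefix 2", "complex 2"]],
    "prefix 2": [[], ["#b"]],
    "complex 2": [["digit 2"], ["+", "i"], ["-", "i"]],
    "digit 2": [["0"], ["1"]],
}

def expand(symbol):
    """Yield every string that a rule name or literal can produce."""
    if symbol not in GRAMMAR:
        yield symbol                      # a literal stands for itself
        return
    for alternative in GRAMMAR[symbol]:   # try each option in turn
        yield from expand_sequence(alternative)

def expand_sequence(symbols):
    """Yield every concatenation of expansions of `symbols`, left to right."""
    if not symbols:
        yield ""
        return
    for head in expand(symbols[0]):
        for tail in expand_sequence(symbols[1:]):
            yield head + tail

print(list(expand("num 2")))
```

&lt;p&gt;This prints the eight strings &lt;code&gt;0&lt;/code&gt;, &lt;code&gt;1&lt;/code&gt;, &lt;code&gt;+i&lt;/code&gt;, &lt;code&gt;-i&lt;/code&gt;, &lt;code&gt;#b0&lt;/code&gt;, &lt;code&gt;#b1&lt;/code&gt;, &lt;code&gt;#b+i&lt;/code&gt; and &lt;code&gt;#b-i&lt;/code&gt;. It only terminates because the fragment has no recursive rules; a rule for one or more digits would need some bound on the length.&lt;/p&gt;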
&lt;p&gt;In Prolog with DCG the program looks remarkably similar to the rules
in the specification (disregarding the insignificance of case for a
moment):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-Prolog&quot;&gt;scheme_number --&amp;gt; (num(&lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;); num(&lt;span class=&quot;number&quot;&gt;8&lt;/span&gt;); num(&lt;span class=&quot;number&quot;&gt;10&lt;/span&gt;); num(&lt;span class=&quot;number&quot;&gt;16&lt;/span&gt;)).

num(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;) --&amp;gt; prefix(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;), complex(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;).

complex(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;) --&amp;gt; (real(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;);
                real(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;), &lt;span class=&quot;string&quot;&gt;&quot;@&quot;&lt;/span&gt;, real(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;);
                real(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;), &lt;span class=&quot;string&quot;&gt;&quot;+&quot;&lt;/span&gt;, ureal(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;), &lt;span class=&quot;string&quot;&gt;&quot;i&quot;&lt;/span&gt;;
                real(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;), &lt;span class=&quot;string&quot;&gt;&quot;-&quot;&lt;/span&gt;, ureal(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;), &lt;span class=&quot;string&quot;&gt;&quot;i&quot;&lt;/span&gt;;
                real(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;), &lt;span class=&quot;string&quot;&gt;&quot;+&quot;&lt;/span&gt;, naninf(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;), &lt;span class=&quot;string&quot;&gt;&quot;i&quot;&lt;/span&gt;;
                real(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;), &lt;span class=&quot;string&quot;&gt;&quot;-&quot;&lt;/span&gt;, naninf(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;), &lt;span class=&quot;string&quot;&gt;&quot;i&quot;&lt;/span&gt;;
                real(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;), &lt;span class=&quot;string&quot;&gt;&quot;+i&quot;&lt;/span&gt;;
                real(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;), &lt;span class=&quot;string&quot;&gt;&quot;-i&quot;&lt;/span&gt;;
                &lt;span class=&quot;string&quot;&gt;&quot;+&quot;&lt;/span&gt;, ureal(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;), &lt;span class=&quot;string&quot;&gt;&quot;i&quot;&lt;/span&gt;;
                &lt;span class=&quot;string&quot;&gt;&quot;-&quot;&lt;/span&gt;, ureal(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;), &lt;span class=&quot;string&quot;&gt;&quot;i&quot;&lt;/span&gt;;
                &lt;span class=&quot;string&quot;&gt;&quot;+&quot;&lt;/span&gt;, naninf(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;), &lt;span class=&quot;string&quot;&gt;&quot;i&quot;&lt;/span&gt;;
                &lt;span class=&quot;string&quot;&gt;&quot;-&quot;&lt;/span&gt;, naninf(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;), &lt;span class=&quot;string&quot;&gt;&quot;i&quot;&lt;/span&gt;;
                &lt;span class=&quot;string&quot;&gt;&quot;+i&quot;&lt;/span&gt;;
                &lt;span class=&quot;string&quot;&gt;&quot;-i&quot;&lt;/span&gt;).

real(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;) --&amp;gt; (sign, ureal(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;);
             &lt;span class=&quot;string&quot;&gt;&quot;+&quot;&lt;/span&gt;, naninf(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;);
             &lt;span class=&quot;string&quot;&gt;&quot;-&quot;&lt;/span&gt;, naninf(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;)).

naninf(&lt;span class=&quot;number&quot;&gt;10&lt;/span&gt;) --&amp;gt; (&lt;span class=&quot;string&quot;&gt;&quot;nan.0&quot;&lt;/span&gt;; &lt;span class=&quot;string&quot;&gt;&quot;inf.0&quot;&lt;/span&gt;).

ureal(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;) --&amp;gt; (uinteger(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;);
              uinteger(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;), &lt;span class=&quot;string&quot;&gt;&quot;/&quot;&lt;/span&gt;, uinteger(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;);
              decimal(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;), mantissa_width).

decimal(&lt;span class=&quot;number&quot;&gt;10&lt;/span&gt;) --&amp;gt; (uinteger(&lt;span class=&quot;number&quot;&gt;10&lt;/span&gt;), suffix;
                 &lt;span class=&quot;string&quot;&gt;&quot;.&quot;&lt;/span&gt;, digits(&lt;span class=&quot;number&quot;&gt;10&lt;/span&gt;), suffix;
                 digits(&lt;span class=&quot;number&quot;&gt;10&lt;/span&gt;), &lt;span class=&quot;string&quot;&gt;&quot;.&quot;&lt;/span&gt;, digits0(&lt;span class=&quot;number&quot;&gt;10&lt;/span&gt;), suffix;
                 digits(&lt;span class=&quot;number&quot;&gt;10&lt;/span&gt;), &lt;span class=&quot;string&quot;&gt;&quot;.&quot;&lt;/span&gt;, suffix).

uinteger(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;) --&amp;gt; digits(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;).

prefix(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;) --&amp;gt; (radix(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;), exactness;
               exactness, radix(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;)).

suffix --&amp;gt; (&lt;span class=&quot;string&quot;&gt;&quot;&quot;&lt;/span&gt;;
            exponent_marker, sign, digits(&lt;span class=&quot;number&quot;&gt;10&lt;/span&gt;)).
exponent_marker --&amp;gt; (&lt;span class=&quot;string&quot;&gt;&quot;e&quot;&lt;/span&gt;; &lt;span class=&quot;string&quot;&gt;&quot;s&quot;&lt;/span&gt;; &lt;span class=&quot;string&quot;&gt;&quot;f&quot;&lt;/span&gt;; &lt;span class=&quot;string&quot;&gt;&quot;d&quot;&lt;/span&gt;; &lt;span class=&quot;string&quot;&gt;&quot;l&quot;&lt;/span&gt;).
mantissa_width --&amp;gt; (&lt;span class=&quot;string&quot;&gt;&quot;&quot;&lt;/span&gt;; &lt;span class=&quot;string&quot;&gt;&quot;|&quot;&lt;/span&gt;, digits(&lt;span class=&quot;number&quot;&gt;10&lt;/span&gt;)).
sign --&amp;gt; (&lt;span class=&quot;string&quot;&gt;&quot;&quot;&lt;/span&gt;; &lt;span class=&quot;string&quot;&gt;&quot;+&quot;&lt;/span&gt;; &lt;span class=&quot;string&quot;&gt;&quot;-&quot;&lt;/span&gt;).
exactness --&amp;gt; (&lt;span class=&quot;string&quot;&gt;&quot;&quot;&lt;/span&gt;; &lt;span class=&quot;string&quot;&gt;&quot;#i&quot;&lt;/span&gt;; &lt;span class=&quot;string&quot;&gt;&quot;#e&quot;&lt;/span&gt;).

radix(&lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;) --&amp;gt; &lt;span class=&quot;string&quot;&gt;&quot;#b&quot;&lt;/span&gt;.
radix(&lt;span class=&quot;number&quot;&gt;8&lt;/span&gt;) --&amp;gt; &lt;span class=&quot;string&quot;&gt;&quot;#o&quot;&lt;/span&gt;.
radix(&lt;span class=&quot;number&quot;&gt;10&lt;/span&gt;) --&amp;gt; &lt;span class=&quot;string&quot;&gt;&quot;&quot;&lt;/span&gt;; &lt;span class=&quot;string&quot;&gt;&quot;#d&quot;&lt;/span&gt;.
radix(&lt;span class=&quot;number&quot;&gt;16&lt;/span&gt;) --&amp;gt; &lt;span class=&quot;string&quot;&gt;&quot;#x&quot;&lt;/span&gt;.

digit(&lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;) --&amp;gt; (&lt;span class=&quot;string&quot;&gt;&quot;0&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;1&quot;&lt;/span&gt;).
digit(&lt;span class=&quot;number&quot;&gt;8&lt;/span&gt;) --&amp;gt; (&lt;span class=&quot;string&quot;&gt;&quot;0&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;1&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;2&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;3&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;4&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;5&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;6&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;7&quot;&lt;/span&gt;).
digit(&lt;span class=&quot;number&quot;&gt;10&lt;/span&gt;) --&amp;gt; (&lt;span class=&quot;string&quot;&gt;&quot;0&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;1&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;2&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;3&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;4&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;5&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;6&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;7&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;8&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;9&quot;&lt;/span&gt;).
digit(&lt;span class=&quot;number&quot;&gt;16&lt;/span&gt;) --&amp;gt; (&lt;span class=&quot;string&quot;&gt;&quot;0&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;1&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;2&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;3&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;4&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;5&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;6&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;7&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;8&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;9&quot;&lt;/span&gt;;
               &lt;span class=&quot;string&quot;&gt;&quot;a&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;b&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;c&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;d&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;e&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;f&quot;&lt;/span&gt;;
               &lt;span class=&quot;string&quot;&gt;&quot;A&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;B&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;C&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;D&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;E&quot;&lt;/span&gt;;&lt;span class=&quot;string&quot;&gt;&quot;F&quot;&lt;/span&gt;).

digits(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;) --&amp;gt; (digit(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;);
               digit(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;), digits(&lt;span class=&quot;symbol&quot;&gt;R&lt;/span&gt;)).
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;When developing one’s own solution I suggest starting small: focus on
the basic rules that consist almost only of characters, and add more
rules once those are working. There are two primary ways to use this
program. Firstly it can check if a string is a valid number or not.
Load the program into SWI-Prolog and call &lt;code&gt;scheme_number&lt;/code&gt; with a
string and the empty list. Prolog will print true or false and might
wait for the user to type a command (try &lt;code&gt;.&lt;/code&gt; or &lt;code&gt;;&lt;/code&gt;).&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ swipl -s numbers.pl
% numbers.pl compiled 0.00 sec, 42 clauses
Welcome to SWI-Prolog (Multi-threaded, 64 bits, Version 6.6.6)
…
?- scheme_number(&amp;quot;52697461&amp;quot;, []).
true ;
true ;
true ;
true ;
false.
?- scheme_number(&amp;quot;42i&amp;quot;, []).
false.
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;From the first test it looks like there are four different ways to
parse &lt;code&gt;&amp;quot;52697461&amp;quot;&lt;/code&gt; as a number (e.g. as decimal and hexadecimal). The
program can also be run in the other direction and generate all
numbers:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;$ swipl -s numbers.pl
% numbers.pl compiled 0.00 sec, 42 clauses
Welcome to SWI-Prolog (Multi-threaded, 64 bits, Version 6.6.6)
…
?- forall(scheme_number(X, []), writef(&amp;quot;%s\n&amp;quot;, [X])).
#b0
#b1
#b00
#b01
#b000
#b001
…
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Oops! It is really printing &lt;em&gt;all&lt;/em&gt; numbers. Some modifications need to
be made before the program is useful. &lt;a href=&quot;/scheme/every-number-revised2.pl&quot;&gt;The final version&lt;/a&gt; has
case-insensitivity added and some sample strings for those cases that
would otherwise generate digits forever. After these modifications it
prints useful test inputs, although it can no longer recognize all
numbers. Even though the search space is now finite the program
still &lt;a href=&quot;/scheme/every-number-sorted-unique-revised2.txt.xz&quot;&gt;outputs 523&amp;thinsp;908 unique numbers&lt;/a&gt;. But that is just a
consequence of the notation.&lt;/p&gt;
&lt;p&gt;The program finds some very strange looking numbers, e.g.
&lt;code&gt;#e#d-inf.0-49.83e+49|53i&lt;/code&gt; and &lt;code&gt;#o#e-0755/0755@-0755/0755&lt;/code&gt;. The first
is the exact decimal complex number with real part -&amp;infin; and
imaginary part -49.83&amp;sdot;10&lt;sup&gt;49&lt;/sup&gt; with mantissa width 53. The
second is the octal exact complex number with both magnitude and angle
-1 (i.e. -1∠-1). Kind of hard to see, though. Many Scheme
implementations do not support exact complex numbers and they should
reject these inputs.&lt;/p&gt;
&lt;p&gt;The testing doesn’t stop after all the lovely test cases have been fed
through the parser that’s being tested. Unfortunately the program
can’t generate all invalid numbers, so using its output does not
completely test an implementation. A parser tested only using the
strategy presented here could accept or even crash on invalid inputs.
(However if crashes are possible then &lt;a href=&quot;http://lcamtuf.coredump.cx/afl/&quot;&gt;american fuzzy lop&lt;/a&gt; can be
used to automatically find crashing test cases).&lt;/p&gt;
&lt;p&gt;The fact that the parser doesn’t reject valid inputs also doesn’t say
much about whether the input was parsed correctly. However, the test inputs
can be fed through the parser and printed, and then be compared to a
reference printout (e.g. generated with a trusted implementation).&lt;/p&gt;
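&lt;p&gt;A minimal sketch of that comparison step, written in R6RS Scheme since
that is the parser under test. The procedure names are made up for this
example, and it assumes that &lt;code&gt;string-&amp;gt;number&lt;/code&gt; returns &lt;code&gt;#f&lt;/code&gt; for
rejected input. An implementation may instead raise an exception for
notation it cannot represent, and that case is not handled here.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (rnrs))

;; Parse one test input and print it back. A rejected input is
;; recorded as such rather than dropped, so the line counts match.
(define (parse-and-print str)
  (let ((n (string-&amp;gt;number str)))
    (if n (number-&amp;gt;string n) &amp;quot;rejected&amp;quot;)))

;; Read one generated number per line from IN and write one printed
;; result per line to OUT.
(define (print-all in out)
  (let loop ((line (get-line in)))
    (unless (eof-object? line)
      (put-string out (parse-and-print line))
      (newline out)
      (loop (get-line in)))))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Calling &lt;code&gt;(print-all (current-input-port) (current-output-port))&lt;/code&gt; in
each implementation, with the generated numbers on standard input, gives
printouts that can be compared line by line. Differences only show that
two implementations disagree, not which of them is right.&lt;/p&gt;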
&lt;p&gt;&lt;em&gt;This was written as part of a series of articles on stuff that has
been lying around on this website for a long time without any real
commentary. &lt;a href=&quot;/scheme/every-number.pl&quot;&gt;The original version of the number generator&lt;/a&gt; was
written in 2012, &lt;a href=&quot;/scheme/every-number-revised.pl&quot;&gt;revised&lt;/a&gt; the following year, and revised again
for this article.&lt;/em&gt;&lt;/p&gt;
</description>
    </item>
    <item>
      <title>Efficient computation of the &quot;man or boy&quot; test</title>
      <link>https://weinholt.se/articles/man-or-boy-test/</link>
      <pubDate>Tue, 25 Oct 2016 02:00:00 +0200</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/man-or-boy-test/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;Here’s a quote from a computer scientist living in what were clearly
simpler times:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;[…]. Hence I have written the following simple routine, which may
separate the man-compilers from the boy-compilers: […]&lt;br&gt;
– &lt;a href=&quot;http://archive.computerhistory.org/resources/text/algol/algol_bulletin/A17/P24.HTM&quot;&gt;Donald Knuth&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;Here is Knuth’s program in ALGOL 60:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-algol60&quot;&gt;begin real procedure A(k, x1, x2, x3, x4, x5);
           value k; integer k;
           begin real procedure B;
                      begin k := k - 1;
                            B := A := A(k, B, x1, x2, x3, x4)
                      end;
                 if k ≤ 0 then A := x4 + x5 else B
           end;
      outreal(A(10, 1, -1, -1, 1, 0))
end;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This seemingly innocent program uses a lot of resources to compute its
result. It takes a parameter &lt;em&gt;k&lt;/em&gt; that is used to scale up the
difficulty of the problem. It’s an interesting program to use for
testing language implementations, since it’s easy to verify the result
and also to compare the performance with other language
implementations. People have translated the program into many
languages over at &lt;a href=&quot;http://rosettacode.org/wiki/Man_or_boy_test&quot;&gt;Rosetta code&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;But what if you have a groundbreaking compiler that manages to compute
the solution to a larger &lt;em&gt;k&lt;/em&gt;-value than you can find in the tables?
You would want to know if you got the right answer. The “last word”
from Knuth in &lt;a href=&quot;http://algol60.org/bulletins/ab19.zip&quot;&gt;ALGOL bulletin #19&lt;/a&gt;, page 8, gives us a way to
easily compute the answer much faster than running the actual program:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Since then my right hand has observed that the value of A(k, x1, x2,
x3, x4, x5) is equal to c1 × x1 + c2 × x2 + c3 × x3 + c4 × x4 + c5 ×
x5 where the coefficients are given in the following table:&lt;/p&gt;
&lt;/blockquote&gt;
&lt;style type=&quot;text/css&quot;&gt;
td
{
    padding:0 0.5em 0 0.5em;
}
&lt;/style&gt;

&lt;blockquote&gt;
&lt;blockquote&gt;
&lt;table&gt;
&lt;thead align=&quot;right&quot;&gt;
&lt;tr&gt;&lt;td&gt; k &lt;/td&gt;&lt;td&gt;    c1(k) &lt;/td&gt;   &lt;td&gt;c2(k)&lt;/td&gt;    &lt;td&gt;c3(k)&lt;/td&gt;   &lt;td&gt;c4(k)&lt;/td&gt;   &lt;td&gt;c5(k)&lt;/td&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody align=&quot;right&quot;&gt;
&lt;tr&gt;&lt;td&gt;≤0 &lt;/td&gt;&lt;td&gt;     0    &lt;/td&gt;    &lt;td&gt;0    &lt;/td&gt;    &lt;td&gt;0    &lt;/td&gt;   &lt;td&gt;1    &lt;/td&gt;   &lt;td&gt;1&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt; 1 &lt;/td&gt;&lt;td&gt;     0    &lt;/td&gt;    &lt;td&gt;0    &lt;/td&gt;    &lt;td&gt;1    &lt;/td&gt;   &lt;td&gt;1    &lt;/td&gt;   &lt;td&gt;0&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt; 2 &lt;/td&gt;&lt;td&gt;     0    &lt;/td&gt;    &lt;td&gt;1    &lt;/td&gt;    &lt;td&gt;1    &lt;/td&gt;   &lt;td&gt;0    &lt;/td&gt;   &lt;td&gt;0&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt; 3 &lt;/td&gt;&lt;td&gt;     1    &lt;/td&gt;    &lt;td&gt;1    &lt;/td&gt;    &lt;td&gt;0    &lt;/td&gt;   &lt;td&gt;0    &lt;/td&gt;   &lt;td&gt;0&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt; 4 &lt;/td&gt;&lt;td&gt;     2    &lt;/td&gt;    &lt;td&gt;1    &lt;/td&gt;    &lt;td&gt;0    &lt;/td&gt;   &lt;td&gt;0    &lt;/td&gt;   &lt;td&gt;0&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt; 5 &lt;/td&gt;&lt;td&gt;     3    &lt;/td&gt;    &lt;td&gt;2    &lt;/td&gt;    &lt;td&gt;1    &lt;/td&gt;   &lt;td&gt;0    &lt;/td&gt;   &lt;td&gt;0&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt; 6 &lt;/td&gt;&lt;td&gt;     5    &lt;/td&gt;    &lt;td&gt;3    &lt;/td&gt;    &lt;td&gt;3    &lt;/td&gt;   &lt;td&gt;2    &lt;/td&gt;   &lt;td&gt;0&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt; 7 &lt;/td&gt;&lt;td&gt;     8    &lt;/td&gt;    &lt;td&gt;6    &lt;/td&gt;    &lt;td&gt;9    &lt;/td&gt;   &lt;td&gt;6    &lt;/td&gt;   &lt;td&gt;0&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt; 8 &lt;/td&gt;&lt;td&gt;    14    &lt;/td&gt;   &lt;td&gt;15   &lt;/td&gt;    &lt;td&gt;22   &lt;/td&gt;   &lt;td&gt;13   &lt;/td&gt;    &lt;td&gt;0&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt; 9 &lt;/td&gt;&lt;td&gt;    29    &lt;/td&gt;   &lt;td&gt;37   &lt;/td&gt;    &lt;td&gt;48   &lt;/td&gt;   &lt;td&gt;26   &lt;/td&gt;    &lt;td&gt;0&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;10 &lt;/td&gt;&lt;td&gt;    66    &lt;/td&gt;   &lt;td&gt;85   &lt;/td&gt;   &lt;td&gt;102  &lt;/td&gt;   &lt;td&gt; 54  &lt;/td&gt;     &lt;td&gt;0&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;/blockquote&gt;
&lt;p&gt;When k ≥ 5, these values may be obtained by the relations&lt;br&gt;&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;c1(k + 1) = c1(k) + c2(k) &lt;br&gt;
c2(k + 1) = c2(k) + c3(k) &lt;br&gt;
c3(k + 1) = c3(k) + c4(k + 1) &lt;br&gt;
c4(k + 1) = c1(k) + c4(k) - 1 &lt;br&gt;
c5(k) = 0&lt;/p&gt;
&lt;/blockquote&gt;
&lt;/blockquote&gt;
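&lt;p&gt;As a quick check of the table, here is a sketch that applies the
relations one step at a time, starting from the row for &lt;em&gt;k&lt;/em&gt; = 5. The
names &lt;code&gt;step&lt;/code&gt; and &lt;code&gt;coefficients&lt;/code&gt; are made up for this example.
Note that the new &lt;em&gt;c4&lt;/em&gt; has to be computed first, because the relation
for &lt;em&gt;c3&lt;/em&gt; uses &lt;em&gt;c4(k + 1)&lt;/em&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (rnrs))

;; One step of the relations: takes c1..c4 at k and returns the
;; list of c1..c4 at k + 1.
(define (step c1 c2 c3 c4)
  (let ((c4^ (+ c1 c4 -1)))
    (list (+ c1 c2) (+ c2 c3) (+ c3 c4^) c4^)))

;; Coefficients c1..c4 for k, only meant for k of at least 5.
(define (coefficients k)
  (let loop ((i 5) (c '(3 2 1 0)))      ;row k = 5 of the table
    (if (= i k)
        c
        (loop (+ i 1) (apply step c)))))

(coefficients 6)    ; =&amp;gt; (5 3 3 2)
(coefficients 10)   ; =&amp;gt; (66 85 102 54)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Both results agree with the corresponding rows of the table.&lt;/p&gt;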
&lt;p&gt;By using Knuth’s table and relations it’s possible
to &lt;a href=&quot;/hacks/man-or-boy.scm&quot;&gt;compute the man or boy function quite quickly&lt;/a&gt;. I wrote a
program that does this. A walk-through follows.&lt;/p&gt;
&lt;p&gt;The first part of the program is a gratuitous use of macros. The
&lt;code&gt;define-coefficients&lt;/code&gt; macro defines one of the coefficients mentioned
in Knuth’s last word letter (&lt;em&gt;c1&lt;/em&gt; to &lt;em&gt;c5&lt;/em&gt;). Each coefficient is
parameterized by the &lt;em&gt;k&lt;/em&gt;-value, which means that we should define a
procedure that takes &lt;em&gt;k&lt;/em&gt; as an argument and returns &lt;em&gt;cn(k)&lt;/em&gt;. Knuth’s
table gives us the first values it should return (&lt;em&gt;consts&lt;/em&gt;). For the
rest of the values we’ll use the corresponding relation. As a bonus
the macro also creates a hashtable where it stores previously computed
values, which is key to speeding up the algorithm.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define-syntax&lt;/span&gt;&lt;/span&gt; define-coefficients
  (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;syntax-rules&lt;/span&gt;&lt;/span&gt; ()
    ((&lt;span class=&quot;name&quot;&gt;_&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;name&lt;/span&gt; k) consts equation)
     (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; name
       (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let&lt;/span&gt;&lt;/span&gt; ((&lt;span class=&quot;name&quot;&gt;h&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;make-eqv-hashtable&lt;/span&gt;)))
         (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; (k)
           (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let&lt;/span&gt;&lt;/span&gt; ((&lt;span class=&quot;name&quot;&gt;v&lt;/span&gt; consts))
             (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;cond&lt;/span&gt;&lt;/span&gt; ((&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;&amp;lt;&lt;/span&gt;&lt;/span&gt; k &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;)
                    (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;vector-ref&lt;/span&gt;&lt;/span&gt; v &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;))
                   ((&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;&amp;lt;&lt;/span&gt;&lt;/span&gt; k (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;vector-length&lt;/span&gt;&lt;/span&gt; v))
                    (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;vector-ref&lt;/span&gt;&lt;/span&gt; v k))
                   ((&lt;span class=&quot;name&quot;&gt;hashtable-ref&lt;/span&gt; h k &lt;span class=&quot;literal&quot;&gt;#f&lt;/span&gt;))
                   (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;else&lt;/span&gt;&lt;/span&gt;
                    (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let&lt;/span&gt;&lt;/span&gt; ((&lt;span class=&quot;name&quot;&gt;t&lt;/span&gt; equation))
                      (&lt;span class=&quot;name&quot;&gt;hashtable-set!&lt;/span&gt; h k t) &lt;span class=&quot;comment&quot;&gt;;memoize&lt;/span&gt;
                      t))))))))))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Next the macro is used to define the coefficients.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;define-coefficients&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;c1&lt;/span&gt; k) '#(&lt;span class=&quot;name&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;3&lt;/span&gt;)
                     (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;+&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;c1&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;-&lt;/span&gt;&lt;/span&gt; k &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;))
                        (&lt;span class=&quot;name&quot;&gt;c2&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;-&lt;/span&gt;&lt;/span&gt; k &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;))))

(&lt;span class=&quot;name&quot;&gt;define-coefficients&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;c2&lt;/span&gt; k) '#(&lt;span class=&quot;name&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;2&lt;/span&gt;)
                     (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;+&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;c2&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;-&lt;/span&gt;&lt;/span&gt; k &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;))
                        (&lt;span class=&quot;name&quot;&gt;c3&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;-&lt;/span&gt;&lt;/span&gt; k &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;))))

(&lt;span class=&quot;name&quot;&gt;define-coefficients&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;c3&lt;/span&gt; k) '#(&lt;span class=&quot;name&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;)
                     (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;+&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;c3&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;-&lt;/span&gt;&lt;/span&gt; k &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;))
                        (&lt;span class=&quot;name&quot;&gt;c4&lt;/span&gt; k)))

(&lt;span class=&quot;name&quot;&gt;define-coefficients&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;c4&lt;/span&gt; k) '#(&lt;span class=&quot;name&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;)
                     (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;+&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;c1&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;-&lt;/span&gt;&lt;/span&gt; k &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;))
                        (&lt;span class=&quot;name&quot;&gt;c4&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;-&lt;/span&gt;&lt;/span&gt; k &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;))
                        &lt;span class=&quot;number&quot;&gt;-1&lt;/span&gt;))

(&lt;span class=&quot;name&quot;&gt;define-coefficients&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;c5&lt;/span&gt; k) '#(&lt;span class=&quot;name&quot;&gt;1&lt;/span&gt;)
                     &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The equations given as the last argument to the macro are the
relations adjusted to &lt;em&gt;cn(k)&lt;/em&gt; instead of &lt;em&gt;cn(k+1)&lt;/em&gt;. Now that the
coefficients have been defined it is easy to define &lt;em&gt;A&lt;/em&gt;, just as Knuth
observed:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;A&lt;/span&gt; k x1 x2 x3 x4 x5)
  (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;+&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;*&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;c1&lt;/span&gt; k) x1)
     (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;*&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;c2&lt;/span&gt; k) x2)
     (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;*&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;c3&lt;/span&gt; k) x3)
     (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;*&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;c4&lt;/span&gt; k) x4)
     (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;*&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;c5&lt;/span&gt; k) x5)))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Computing what is commonly referred to as &lt;em&gt;A(k)&lt;/em&gt; is now achieved by
calling &lt;code&gt;(A k 1 -1 -1 1 0)&lt;/code&gt;. In Chez Scheme the computation takes no
time at all, even for very large &lt;em&gt;k&lt;/em&gt; that no existing machine could
possibly handle using the original algorithm:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;&amp;gt; (time (A 11 1 -1 -1 1 0))
(time (A 11 ...))
    no collections
    0.000004543s elapsed cpu time
    0.000004149s elapsed real time
    96 bytes allocated
-138
&amp;gt; (time (A 128 1 -1 -1 1 0))
(time (A 128 ...))
    no collections
    0.000007392s elapsed cpu time
    0.000007114s elapsed real time
    304 bytes allocated
-4635937302118988398753530618599286738522250
&amp;gt; (time (A 1024 1 -1 -1 1 0))
(time (A 1024 ...))
    no collections
    0.000744346s elapsed cpu time
    0.000743677s elapsed real time
    635472 bytes allocated
-134525946204897748012677078309699584745858677996302504232754322006054514469295899995329805478290190232956296857851394596551319734862436102284482400522332522964879376131124504760141628006776332968808079671234906412428151832076596502192653046509414077174915139024920128557185075083924034657098210492453447406606370793777979291350811243549339280709645840417
&lt;/code&gt;&lt;/pre&gt;
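&lt;p&gt;For small &lt;em&gt;k&lt;/em&gt; the fast version can be checked against a direct
translation of the ALGOL 60 program. The following is a sketch along the
lines of the usual Scheme translations, not code from the program
discussed above. The names &lt;code&gt;A-direct&lt;/code&gt; and &lt;code&gt;man-or-boy&lt;/code&gt; are made up
for this example.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (rnrs))

;; Call-by-name parameters become thunks. B decrements the k of the
;; enclosing call, just as in the original.
(define (A-direct k x1 x2 x3 x4 x5)
  (letrec ((B (lambda ()
                (set! k (- k 1))
                (A-direct k B x1 x2 x3 x4))))
    (if (positive? k)
        (B)
        (+ (x4) (x5)))))

(define (man-or-boy k)
  (A-direct k
            (lambda () 1) (lambda () -1) (lambda () -1)
            (lambda () 1) (lambda () 0)))

(man-or-boy 10)   ; =&amp;gt; -67
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The result for &lt;em&gt;k&lt;/em&gt; = 10 matches the table row, since
66 - 85 - 102 + 54 = -67. The direct version quickly becomes impractical
as &lt;em&gt;k&lt;/em&gt; grows, so the comparison is only useful for small values.&lt;/p&gt;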
&lt;p&gt;Only one question is left unanswered: whether or not a sufficiently
smart compiler can (or indeed should) transform Knuth’s original
algorithm into the fast algorithm presented here.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;This was written as the first part of a series of articles on stuff
that has been lying around on this website for a long time without any
real commentary. The code it talks about was written in 2012.&lt;/em&gt;&lt;/p&gt;
</description>
    </item>
    <item>
      <title>Internals of Zabavno the x86 emulator</title>
      <link>https://weinholt.se/articles/zabavno-pc-emulator/</link>
      <pubDate>Mon, 24 Oct 2016 02:00:00 +0200</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/zabavno-pc-emulator/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;&lt;a href=&quot;https://github.com/weinholt/zabavno/&quot;&gt;Zabavno&lt;/a&gt; (Забавно) is an x86 emulator I’ve been working on in my
spare time. It translates x86 instructions into Scheme and eval’s
them, which works surprisingly well.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;The initial commit was made two years ago, but at the time I only
worked on it for a few weeks. When &lt;a href=&quot;https://github.com/cisco/chezscheme/&quot;&gt;Chez Scheme&lt;/a&gt; was open sourced I
got interested in it again, since the techniques used depend on having
a good compiler. (And Chez Scheme is a really good compiler).&lt;/p&gt;
&lt;p&gt;Here’s a look at the internals of the emulator. The rest of this
article gets very technical and assumes that the reader knows
something about CPUs. The core of the CPU emulation happens in this
procedure:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;%run-until-abort&lt;/span&gt; M debug instruction-limit
                          fl ip AX CX DX BX SP BP SI DI
                          cs ds ss es fs gs)
  (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;call/cc&lt;/span&gt;&lt;/span&gt;
    (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; (abort)
      (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let&lt;/span&gt;&lt;/span&gt; loop ((&lt;span class=&quot;name&quot;&gt;ip&lt;/span&gt; ip) (&lt;span class=&quot;name&quot;&gt;fl&lt;/span&gt; fl) (&lt;span class=&quot;name&quot;&gt;AX&lt;/span&gt; AX) (&lt;span class=&quot;name&quot;&gt;CX&lt;/span&gt; CX) (&lt;span class=&quot;name&quot;&gt;DX&lt;/span&gt; DX) (&lt;span class=&quot;name&quot;&gt;BX&lt;/span&gt; BX)
                 (&lt;span class=&quot;name&quot;&gt;SP&lt;/span&gt; SP) (&lt;span class=&quot;name&quot;&gt;BP&lt;/span&gt; BP) (&lt;span class=&quot;name&quot;&gt;SI&lt;/span&gt; SI) (&lt;span class=&quot;name&quot;&gt;DI&lt;/span&gt; DI)
                 (&lt;span class=&quot;name&quot;&gt;cs&lt;/span&gt; cs) (&lt;span class=&quot;name&quot;&gt;ds&lt;/span&gt; ds) (&lt;span class=&quot;name&quot;&gt;ss&lt;/span&gt; ss) (&lt;span class=&quot;name&quot;&gt;es&lt;/span&gt; es) (&lt;span class=&quot;name&quot;&gt;fs&lt;/span&gt; fs) (&lt;span class=&quot;name&quot;&gt;gs&lt;/span&gt; gs))
        &lt;span class=&quot;comment&quot;&gt;;; Translate instruction(s) or get an existing translation.&lt;/span&gt;
        (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let&lt;/span&gt;&lt;/span&gt; ((&lt;span class=&quot;name&quot;&gt;trans&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;translate&lt;/span&gt; cs ip debug instruction-limit)))
          &lt;span class=&quot;comment&quot;&gt;;; Call the translation and get new values for the&lt;/span&gt;
          &lt;span class=&quot;comment&quot;&gt;;; registers. The translation may choose to abort in the&lt;/span&gt;
          &lt;span class=&quot;comment&quot;&gt;;; middle of a translation.&lt;/span&gt;
          (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let-values&lt;/span&gt;&lt;/span&gt; (((&lt;span class=&quot;name&quot;&gt;ip^&lt;/span&gt; fl^ AX^ CX^ DX^ BX^ SP^ BP^ SI^ DI^
                             cs^ ds^ ss^ es^ fs^ gs^)
                        (&lt;span class=&quot;name&quot;&gt;trans&lt;/span&gt; abort fl AX CX DX BX SP BP SI DI
                               cs ds ss es fs gs)))
            (&lt;span class=&quot;name&quot;&gt;loop&lt;/span&gt; ip^ fl^ AX^ CX^ DX^ BX^ SP^ BP^ SI^ DI^
                  cs^ ds^ ss^ es^ fs^ gs^)))))))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;translate&lt;/code&gt; procedure translates a block of machine instructions
from memory at the address pointed to by &lt;code&gt;cs:ip&lt;/code&gt;. The translated code
is a procedure that takes the registers as arguments and returns new
values for the registers. This is repeated in a loop, with the result
that the program in the emulated machine runs. Translations
are cached, so if the emulator sees the same code address again it can
just use the previous translation. This makes things a lot faster.&lt;/p&gt;
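&lt;p&gt;The caching idea can be sketched with a hashtable keyed on the code
address. This is an illustration only, not Zabavno’s actual code: the
names are made up, a single integer address stands in for
&lt;code&gt;cs:ip&lt;/code&gt;, and &lt;code&gt;translate-block&lt;/code&gt; stands in for the real
translator.&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (rnrs))

(define cache (make-eqv-hashtable))

;; Return the translation for ADDRESS, translating only on a miss.
(define (cached-translation address translate-block)
  (or (hashtable-ref cache address #f)
      (let ((trans (translate-block address)))
        (hashtable-set! cache address trans)
        trans)))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A second lookup of the same address returns the stored procedure
without calling the translator again.&lt;/p&gt;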
&lt;p&gt;One might think that translation caching would violate the processor
semantics in some way. Actually, it’s the opposite. Real processors
have a buffer that tries to stay ahead of the actual code execution.
One way to detect that code is running under a debugger is to modify
an instruction that should have been pre-fetched by the processor. If
a debugger is single-stepping then the modified code will be used,
otherwise the original pre-fetched code is used. We can mimic the
real processor’s semantics by putting multiple instructions into the
same translated block. One just needs to be careful to invalidate the
cache when writing to memory, but there is no need to interrupt due to
writes in the middle of a translated block.&lt;/p&gt;
&lt;p&gt;There are other benefits to putting multiple instructions in the same
translated block. A lot of translated code will compute some
intermediate result that is never actually used. This seems
counter-intuitive since an optimizing compiler should have removed all
such computations. However, the x86 architecture has a flags register
that is updated automatically in a very inconsistent manner. This is
how it looked in Intel’s 80386 programmer’s reference manual (1986):&lt;/p&gt;
&lt;!--
```
  31              23               15                7           0
 ╔═══════════════╪═══════════╤═╤═╤╪╤═╤════╤═╤═╤═╤═╤╪╤═╤═╤═╤═╤═╤═╤═╗
 ║▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│V│R│▒│N│IO  │O│D│I│T│S│Z│▒│A│▒│P│▒│C║
 ║0 0 0 0 0 0 0 0 0 0 0 0 0 0│ │ │0│ │    │▒│▒│ │▒│▒│▒│0│▒│0│▒│1│▒║
 ║▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒│M│F│▒│T│  PL│F│F│F│F│F│F│▒│F│▒│F│▒│F║
 ╚═══════════════╪═══════════╧╤╧╤╧╪╧╤╧═╤══╧═╧═╧╤╧═╧╪╧═╧═╧═╧═╧═╧═╧═╝
                              │ │   │  │       │
         VIRTUAL 8086 MODE────┘ │   │  │       │
               RESUME FLAG──────┘   │  │       │
          NESTED TASK FLAG──────────┘  │       │
       I/O PRIVILEGE LEVEL─────────────┘       │
          INTERRUPT ENABLE─────────────────────┘
```
--&gt;
&lt;p&gt;&lt;img src=&quot;/articles/zabavno-pc-emulator/flags.png&quot; alt=&quot;Picture of the flags register, because sadly Firefox is unable to correctly render box characters&quot;&gt;&lt;/p&gt;
&lt;p&gt;The lower bits of the flags register hold information about
arithmetic results: &lt;strong&gt;o&lt;/strong&gt;verflow, &lt;strong&gt;s&lt;/strong&gt;ign, &lt;strong&gt;z&lt;/strong&gt;ero and &lt;strong&gt;c&lt;/strong&gt;arry.
Then there is the more interesting &lt;strong&gt;p&lt;/strong&gt;arity flag that contains the
even parity of the lowest byte of the result, and the even more
interesting &lt;strong&gt;a&lt;/strong&gt;uxiliary carry that signals carry/borrow from the
lower four bits of the result (used for &lt;span style=&quot;font-variant:
small-caps&quot;&gt;bcd&lt;/span&gt;). The processor can compute these flags as a
side-effect of its normal operation, but the emulator has no such
luxury. Computing all six flags in software adds a lot of overhead to
each and every arithmetic instruction.&lt;/p&gt;
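&lt;p&gt;To give a feel for that overhead, here is a minimal sketch of how
the parity and auxiliary carry flags can be computed in R6RS Scheme.
The name &lt;code&gt;byte-parity-table&lt;/code&gt; matches the table used by the translated
code later in this article, but the way it is built here is an
assumption, and &lt;code&gt;parity-flag&lt;/code&gt; and &lt;code&gt;aux-carry-flag-add&lt;/code&gt; are
hypothetical helper names:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (rnrs))

;; Assumption: entry i is #t when byte i has an even number of 1 bits.
(define byte-parity-table
  (let ([table (make-vector 256 #f)])
    (do ([i 0 (fx+ i 1)])
        ((fx=? i 256) table)
      (vector-set! table i (fxeven? (fxbit-count i))))))

;; PF is bit 2 (value 4) and only looks at the lowest byte of the result.
(define (parity-flag result)
  (if (vector-ref byte-parity-table (fxand result 255)) 4 0))

;; AF is bit 4 (value 16): carry out of the lower four bits of an addition.
(define (aux-carry-flag-add a b)
  (if (fxbit-set? (fx+ (fxand a 15) (fxand b 15)) 4) 16 0))

(display (list (parity-flag 3) (parity-flag 1) (aux-carry-flag-add 15 1)))
(newline)                               ; prints (4 0 16)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Even with a lookup table, that is a handful of extra operations per
flag, and there are six flags to keep up to date.&lt;/p&gt;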
&lt;p&gt;Zabavno emits code to update the flags, but it’s done in a clever way
so that in the majority of cases the code is never used. As was
mentioned earlier, a translation block can contain multiple
instructions. When one arithmetic instruction is followed by another
one, the second tends to overwrite the flags. In that case it’s
completely unnecessary to do the first flag update.&lt;/p&gt;
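&lt;p&gt;The idea can be sketched with thunks in plain R6RS Scheme. This is
not Zabavno’s code: &lt;code&gt;make-zero-flag&lt;/code&gt;, &lt;code&gt;run-block&lt;/code&gt; and the
counter are hypothetical names, and the counter exists only to make
visible how often a flag is really computed. It assumes a host with
fixnums wider than 33 bits:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (rnrs))

;; Hypothetical counter: how many times a zero flag was actually computed.
(define zero-flag-computations 0)

;; Delay the flag computation by wrapping it in a thunk.
(define (make-zero-flag result)
  (lambda ()
    (set! zero-flag-computations (fx+ zero-flag-computations 1))
    (if (eqv? result 0) 64 0)))

;; Two arithmetic instructions in one block: sub eax,edx then add eax,ecx.
(define (run-block eax ecx edx)
  (let* ([eax (fxand (fx- eax edx) #xffffffff)]
         [fl-ZF (make-zero-flag eax)]   ; flag from sub, never forced
         [eax (fxand (fx+ eax ecx) #xffffffff)]
         [fl-ZF (make-zero-flag eax)])  ; shadows the thunk from sub
    (values eax (fl-ZF))))

(let-values ([(eax zf) (run-block 5 0 5)])
  (display (list eax zf zero-flag-computations))
  (newline))                            ; prints (0 64 1)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Only the thunk created by the second instruction is ever called, so
the flag update belonging to the first one costs nothing beyond
creating the closure.&lt;/p&gt;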
&lt;p&gt;The bookkeeping for tracking which flags should be used and which
should be discarded is generally a bit tricky, since a lot of
instructions use the flags as input, and some instructions update only
a few of them, and a lot of the time the flags are left undefined.
Zabavno outsources almost all bookkeeping to the host Scheme’s
optimizer. The code generator was written with cp0 (from Oscar
Waddell’s PhD thesis) in mind. This optimizer is available in Chez
Scheme and a few other Scheme implementations.&lt;/p&gt;
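&lt;p&gt;To experiment with what cp0 does to this style of code, Chez
Scheme’s &lt;code&gt;expand/optimize&lt;/code&gt; procedure can be called at the REPL.
The following is a small sketch for experimentation, not code from
Zabavno, and the exact printed form depends on the Chez Scheme version
and its optimizer settings:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;;; Chez Scheme specific: expand/optimize returns the expression as it
;; looks after expansion and the source optimizer.
(define optimized
  (expand/optimize
   '(lambda (a b)
      (let* ([flag (lambda () (fxand a 1))]    ; never called
             [flag (lambda () (fxand b 1))])   ; shadows the first one
        (flag)))))

(pretty-print optimized)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The first thunk is never called, so the optimizer should be able to
drop it and inline the second one, leaving only the computation on
&lt;code&gt;b&lt;/code&gt;.&lt;/p&gt;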
&lt;p&gt;Let’s look at the translation of a small instruction sequence. This
code updates the &lt;code&gt;eax&lt;/code&gt; register twice and computes flags twice:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-x86asm&quot;&gt;&lt;span class=&quot;symbol&quot;&gt;example:&lt;/span&gt;
    &lt;span class=&quot;keyword&quot;&gt;sub&lt;/span&gt; &lt;span class=&quot;built_in&quot;&gt;eax&lt;/span&gt;,&lt;span class=&quot;built_in&quot;&gt;edx&lt;/span&gt;
    &lt;span class=&quot;keyword&quot;&gt;add&lt;/span&gt; &lt;span class=&quot;built_in&quot;&gt;eax&lt;/span&gt;,&lt;span class=&quot;built_in&quot;&gt;ecx&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;As previously mentioned, the code generator in Zabavno emits code that
returns new values for the registers. This makes it easy to wrap each
translated instruction in a lambda and call it with the registers
returned by the previous instruction. It also happens that cp0 is very
good at optimizing this kind of code. The arithmetic flags are also
wrapped in lambdas and are called when they are needed. They are never
used as first-class procedures, and they are quite small, so cp0
inlines them every time and no memory allocations are needed. Here is
the initial (huge, nightmarish) translation of the example program:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; (abort fl AX CX DX BX SP BP SI DI cs ds ss es fs gs)
  (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; RAM
    (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;case-lambda&lt;/span&gt;&lt;/span&gt;
      [(&lt;span class=&quot;name&quot;&gt;addr&lt;/span&gt; size)
       (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;case&lt;/span&gt;&lt;/span&gt; size
         [(&lt;span class=&quot;name&quot;&gt;8&lt;/span&gt;) (&lt;span class=&quot;name&quot;&gt;memory-u8-ref&lt;/span&gt; addr)]
         [(&lt;span class=&quot;name&quot;&gt;16&lt;/span&gt;) (&lt;span class=&quot;name&quot;&gt;memory-u16-ref&lt;/span&gt; addr)]
         [(&lt;span class=&quot;name&quot;&gt;32&lt;/span&gt;) (&lt;span class=&quot;name&quot;&gt;memory-u32-ref&lt;/span&gt; addr)])]
      [(&lt;span class=&quot;name&quot;&gt;addr&lt;/span&gt; size value)
       (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;case&lt;/span&gt;&lt;/span&gt; size
         [(&lt;span class=&quot;name&quot;&gt;8&lt;/span&gt;) (&lt;span class=&quot;name&quot;&gt;memory-u8-set!&lt;/span&gt; addr value)]
         [(&lt;span class=&quot;name&quot;&gt;16&lt;/span&gt;) (&lt;span class=&quot;name&quot;&gt;memory-u16-set!&lt;/span&gt; addr value)]
         [(&lt;span class=&quot;name&quot;&gt;32&lt;/span&gt;) (&lt;span class=&quot;name&quot;&gt;memory-u32-set!&lt;/span&gt; addr value)])]))
  (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;define&lt;/span&gt;&lt;/span&gt; I/O
    (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;case-lambda&lt;/span&gt;&lt;/span&gt;
      [(&lt;span class=&quot;name&quot;&gt;addr&lt;/span&gt; size) (&lt;span class=&quot;name&quot;&gt;port-read&lt;/span&gt; addr size)]
      [(&lt;span class=&quot;name&quot;&gt;addr&lt;/span&gt; size value) (&lt;span class=&quot;name&quot;&gt;port-write&lt;/span&gt; addr size value)]))
  (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let&lt;/span&gt;&lt;/span&gt; ([&lt;span class=&quot;name&quot;&gt;fl&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; () fl)]
        [&lt;span class=&quot;name&quot;&gt;fl-OF&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; () (&lt;span class=&quot;name&quot;&gt;fxand&lt;/span&gt; fl &lt;span class=&quot;number&quot;&gt;2048&lt;/span&gt;))]
        [&lt;span class=&quot;name&quot;&gt;fl-SF&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; () (&lt;span class=&quot;name&quot;&gt;fxand&lt;/span&gt; fl &lt;span class=&quot;number&quot;&gt;128&lt;/span&gt;))]
        [&lt;span class=&quot;name&quot;&gt;fl-ZF&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; () (&lt;span class=&quot;name&quot;&gt;fxand&lt;/span&gt; fl &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt;))]
        [&lt;span class=&quot;name&quot;&gt;fl-AF&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; () (&lt;span class=&quot;name&quot;&gt;fxand&lt;/span&gt; fl &lt;span class=&quot;number&quot;&gt;16&lt;/span&gt;))]
        [&lt;span class=&quot;name&quot;&gt;fl-PF&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; () (&lt;span class=&quot;name&quot;&gt;fxand&lt;/span&gt; fl &lt;span class=&quot;number&quot;&gt;4&lt;/span&gt;))]
        [&lt;span class=&quot;name&quot;&gt;fl-CF&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; () (&lt;span class=&quot;name&quot;&gt;fxand&lt;/span&gt; fl &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt;))])
    ((&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; (fl AX CX DX BX SP BP SI DI cs ds ss es fs gs)
       (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let*&lt;/span&gt;&lt;/span&gt; ([&lt;span class=&quot;name&quot;&gt;t0&lt;/span&gt; AX]
              [&lt;span class=&quot;name&quot;&gt;t1&lt;/span&gt; DX]
              [&lt;span class=&quot;name&quot;&gt;tmp&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fx-&lt;/span&gt; t0 t1)]
              [&lt;span class=&quot;name&quot;&gt;result&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxand&lt;/span&gt; tmp &lt;span class=&quot;number&quot;&gt;4294967295&lt;/span&gt;)]
              [&lt;span class=&quot;name&quot;&gt;fl-OF&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; ()
                       (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;if&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxbit-set?&lt;/span&gt;
                            (&lt;span class=&quot;name&quot;&gt;fxand&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxxor&lt;/span&gt; t0 t1) (&lt;span class=&quot;name&quot;&gt;fxxor&lt;/span&gt; t0 result))
                            &lt;span class=&quot;number&quot;&gt;31&lt;/span&gt;)
                           &lt;span class=&quot;number&quot;&gt;2048&lt;/span&gt;
                           &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;))]
              [&lt;span class=&quot;name&quot;&gt;fl-SF&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; ()
                       (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;if&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxbit-set?&lt;/span&gt; result &lt;span class=&quot;number&quot;&gt;31&lt;/span&gt;) &lt;span class=&quot;number&quot;&gt;128&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;))]
              [&lt;span class=&quot;name&quot;&gt;fl-ZF&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; ()
                       (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;if&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;eqv?&lt;/span&gt;&lt;/span&gt; result &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;) &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;))]
              [&lt;span class=&quot;name&quot;&gt;fl-AF&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; ()
                       (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;if&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxbit-set?&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fx-&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxand&lt;/span&gt; t0 &lt;span class=&quot;number&quot;&gt;15&lt;/span&gt;)
                                            (&lt;span class=&quot;name&quot;&gt;fxand&lt;/span&gt; t1 &lt;span class=&quot;number&quot;&gt;15&lt;/span&gt;))
                                       &lt;span class=&quot;number&quot;&gt;4&lt;/span&gt;)
                           &lt;span class=&quot;number&quot;&gt;16&lt;/span&gt;
                           &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;))]
              [&lt;span class=&quot;name&quot;&gt;fl-PF&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; ()
                       (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;if&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;vector-ref&lt;/span&gt;&lt;/span&gt;
                            byte-parity-table
                            (&lt;span class=&quot;name&quot;&gt;fxand&lt;/span&gt; result &lt;span class=&quot;number&quot;&gt;255&lt;/span&gt;))
                           &lt;span class=&quot;number&quot;&gt;4&lt;/span&gt;
                           &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;))]
              [&lt;span class=&quot;name&quot;&gt;fl-CF&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; () (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;if&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxbit-set?&lt;/span&gt; tmp &lt;span class=&quot;number&quot;&gt;32&lt;/span&gt;) &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;))]
              [&lt;span class=&quot;name&quot;&gt;AX&lt;/span&gt; result])
         ((&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; (fl AX CX DX BX SP BP SI DI cs ds ss es fs gs)
            (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let*&lt;/span&gt;&lt;/span&gt; ([&lt;span class=&quot;name&quot;&gt;t0&lt;/span&gt; AX]
                   [&lt;span class=&quot;name&quot;&gt;t1&lt;/span&gt; CX]
                   [&lt;span class=&quot;name&quot;&gt;tmp&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fx+&lt;/span&gt; t0 t1)]
                   [&lt;span class=&quot;name&quot;&gt;result&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxand&lt;/span&gt; tmp &lt;span class=&quot;number&quot;&gt;4294967295&lt;/span&gt;)]
                   [&lt;span class=&quot;name&quot;&gt;fl-OF&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; ()
                            (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;if&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxbit-set?&lt;/span&gt;
                                 (&lt;span class=&quot;name&quot;&gt;fxand&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxxor&lt;/span&gt; t0 result)
                                        (&lt;span class=&quot;name&quot;&gt;fxxor&lt;/span&gt; t1 result))
                                 &lt;span class=&quot;number&quot;&gt;31&lt;/span&gt;)
                                &lt;span class=&quot;number&quot;&gt;2048&lt;/span&gt;
                                &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;))]
                   [&lt;span class=&quot;name&quot;&gt;fl-SF&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; ()
                            (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;if&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxbit-set?&lt;/span&gt; result &lt;span class=&quot;number&quot;&gt;31&lt;/span&gt;) &lt;span class=&quot;number&quot;&gt;128&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;))]
                   [&lt;span class=&quot;name&quot;&gt;fl-ZF&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; ()
                            (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;if&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;eqv?&lt;/span&gt;&lt;/span&gt; result &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;) &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;))]
                   [&lt;span class=&quot;name&quot;&gt;fl-AF&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; ()
                            (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;if&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxbit-set?&lt;/span&gt;
                                 (&lt;span class=&quot;name&quot;&gt;fx+&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxand&lt;/span&gt; t0 &lt;span class=&quot;number&quot;&gt;15&lt;/span&gt;) (&lt;span class=&quot;name&quot;&gt;fxand&lt;/span&gt; t1 &lt;span class=&quot;number&quot;&gt;15&lt;/span&gt;))
                                 &lt;span class=&quot;number&quot;&gt;4&lt;/span&gt;)
                                &lt;span class=&quot;number&quot;&gt;16&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;))]
                   [&lt;span class=&quot;name&quot;&gt;fl-PF&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; ()
                            (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;if&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;vector-ref&lt;/span&gt;&lt;/span&gt; byte-parity-table
                                            (&lt;span class=&quot;name&quot;&gt;fxand&lt;/span&gt; result &lt;span class=&quot;number&quot;&gt;255&lt;/span&gt;))
                                &lt;span class=&quot;number&quot;&gt;4&lt;/span&gt;
                                &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;))]
                   [&lt;span class=&quot;name&quot;&gt;fl-CF&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; () (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;if&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxbit-set?&lt;/span&gt; tmp &lt;span class=&quot;number&quot;&gt;32&lt;/span&gt;) &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;))]
                   [&lt;span class=&quot;name&quot;&gt;AX&lt;/span&gt; result])
              (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let*&lt;/span&gt;&lt;/span&gt; ([&lt;span class=&quot;name&quot;&gt;fl&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; ()
                           (&lt;span class=&quot;name&quot;&gt;fxior&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxand&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fl&lt;/span&gt;) &lt;span class=&quot;number&quot;&gt;-2262&lt;/span&gt;) (&lt;span class=&quot;name&quot;&gt;fl-OF&lt;/span&gt;) (&lt;span class=&quot;name&quot;&gt;fl-SF&lt;/span&gt;)
                                  (&lt;span class=&quot;name&quot;&gt;fl-ZF&lt;/span&gt;) (&lt;span class=&quot;name&quot;&gt;fl-PF&lt;/span&gt;)
                                  (&lt;span class=&quot;name&quot;&gt;fxior&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fl-AF&lt;/span&gt;) (&lt;span class=&quot;name&quot;&gt;fl-CF&lt;/span&gt;))))])
                (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;values&lt;/span&gt;&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;262&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fl&lt;/span&gt;) AX CX DX BX SP BP SI DI
                        cs ds ss es fs gs))))
          fl AX CX DX BX SP BP SI DI
          cs ds ss es fs gs)))
     fl AX CX DX BX SP BP SI DI
     cs ds ss es fs gs)))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;There is some initial setup code for accessing memory and the &lt;span
style=&quot;font-variant: small-caps&quot;&gt;i/o&lt;/span&gt; bus. Then the initial
values of the flags are wrapped in lambdas. The instructions
themselves correspond to the &lt;code&gt;let*&lt;/code&gt; expressions. The innermost &lt;code&gt;let*&lt;/code&gt;
computes the flags and then returns all registers. This is pretty
terrible: with so many lambdas, the GC would be invoked very frequently
if the emulator ran this code as-is. But this is the code after cp0 has
optimized it:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;lambda&lt;/span&gt;&lt;/span&gt; (abort fl AX CX DX BX SP BP SI DI cs ds ss es fs gs)
  (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let&lt;/span&gt;&lt;/span&gt; ([&lt;span class=&quot;name&quot;&gt;result&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxand&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;#xffffffff&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fx-&lt;/span&gt; AX DX))])
    (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let&lt;/span&gt;&lt;/span&gt; ([&lt;span class=&quot;name&quot;&gt;tmp&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fx+&lt;/span&gt; result CX)])
      (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;let&lt;/span&gt;&lt;/span&gt; ([&lt;span class=&quot;name&quot;&gt;result&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxand&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;#xffffffff&lt;/span&gt; tmp)])
        (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;values&lt;/span&gt;&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;262&lt;/span&gt;
          (&lt;span class=&quot;name&quot;&gt;fxior&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxand&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;-2262&lt;/span&gt; fl)
                 (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;if&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxbit-set?&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxand&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxxor&lt;/span&gt; result result)
                                        (&lt;span class=&quot;name&quot;&gt;fxxor&lt;/span&gt; CX result)) &lt;span class=&quot;number&quot;&gt;31&lt;/span&gt;)
                &lt;span class=&quot;number&quot;&gt;2048&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;)
            (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;if&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxbit-set?&lt;/span&gt; result &lt;span class=&quot;number&quot;&gt;31&lt;/span&gt;) &lt;span class=&quot;number&quot;&gt;128&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;)
            (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;if&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;eqv?&lt;/span&gt;&lt;/span&gt; result &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;) &lt;span class=&quot;number&quot;&gt;64&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;)
            (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;if&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;vector-ref&lt;/span&gt;&lt;/span&gt; byte-parity-table (&lt;span class=&quot;name&quot;&gt;fxand&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;#xff&lt;/span&gt; result))
                &lt;span class=&quot;number&quot;&gt;4&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;)
            (&lt;span class=&quot;name&quot;&gt;fxior&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;if&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxbit-set?&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fx+&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxand&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;15&lt;/span&gt; result)
                                        (&lt;span class=&quot;name&quot;&gt;fxand&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;15&lt;/span&gt; CX))
                                   &lt;span class=&quot;number&quot;&gt;4&lt;/span&gt;)
                       &lt;span class=&quot;number&quot;&gt;16&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;)
                   (&lt;span class=&quot;name&quot;&gt;&lt;span class=&quot;builtin-name&quot;&gt;if&lt;/span&gt;&lt;/span&gt; (&lt;span class=&quot;name&quot;&gt;fxbit-set?&lt;/span&gt; tmp &lt;span class=&quot;number&quot;&gt;32&lt;/span&gt;)
                       &lt;span class=&quot;number&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;number&quot;&gt;0&lt;/span&gt;)))
          result CX DX BX SP BP SI DI cs ds ss es fs gs)))))
&lt;/code&gt;&lt;/pre&gt;
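&lt;p&gt;None of the remaining code needs to allocate memory (in fact only the
&lt;code&gt;vector-ref&lt;/code&gt; call needs to read from memory). There is still some room
for improvement, but the flags are only computed once. Note that two
distinct variables are printed under the name &lt;code&gt;result&lt;/code&gt; in this
output: the first operand of &lt;code&gt;(fxxor result result)&lt;/code&gt; and the operand
of &lt;code&gt;(fxand 15 result)&lt;/code&gt; refer to the value left by &lt;code&gt;sub&lt;/code&gt;, while the
others refer to the final sum. If there were an
instruction that did need a flag from &lt;code&gt;sub&lt;/code&gt; then cp0 would see that,
and that flag would be computed. In general the flags tend to be
computed only once per block.&lt;/p&gt;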
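&lt;p&gt;As a sketch of that case, consider &lt;code&gt;sub eax,edx&lt;/code&gt; followed by
&lt;code&gt;adc eax,ecx&lt;/code&gt;, which adds the carry flag as well. The code below is
a simplified, hypothetical rendering in the same style as the
translation above, not actual Zabavno output, and it tracks only the
carry flag. Like the code above, it assumes fixnums wider than 33 bits:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (rnrs))

;; sub eax,edx followed by adc eax,ecx (eax := eax + ecx + CF).
(define (run-block eax ecx edx)
  (let* ([tmp (fx- eax edx)]
         [eax (fxand tmp #xffffffff)]
         ;; CF from sub: bit 32 of the untruncated result signals a borrow.
         [fl-CF (lambda () (if (fxbit-set? tmp 32) 1 0))]
         ;; adc consumes the flag, so this thunk is actually called.
         [tmp (fx+ (fx+ eax ecx) (fl-CF))]
         [eax (fxand tmp #xffffffff)]
         ;; CF from adc replaces the one from sub.
         [fl-CF (lambda () (if (fxbit-set? tmp 32) 1 0))])
    (values eax (fl-CF))))

(let-values ([(eax cf) (run-block 0 5 1)])
  (display (list eax cf))
  (newline))                            ; prints (5 1)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Here the carry computation from &lt;code&gt;sub&lt;/code&gt; has a use, so it cannot be
discarded, while any of its other flags that nothing reads could still
be dropped.&lt;/p&gt;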
&lt;p&gt;This is how the core of the CPU emulation works. Now that you’ve seen
it, why don’t you give it a try?
&lt;a href=&quot;https://github.com/weinholt/zabavno/&quot;&gt;Zabavno is available from GitHub&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-bash&quot;&gt;git &lt;span class=&quot;built_in&quot;&gt;clone&lt;/span&gt; https://github.com/weinholt/zabavno/
&lt;/code&gt;&lt;/pre&gt;
</description>
    </item>
    <item>
      <title>Shiny new website layout</title>
      <link>https://weinholt.se/articles/shiny-new-layout/</link>
      <pubDate>Sun, 23 Oct 2016 02:00:00 +0200</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/shiny-new-layout/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;This website has a new layout. The previous &lt;s&gt;anti-social&lt;/s&gt;
directory listing layout is gone and the future is shiny. I’ve been
putting this off for maybe ten years now.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;I still wanted a static web site because it’s so much easier to run a
web server that way. So I went looking for a Node.js based static site
generator. Node.js because, honestly, JavaScript is the language for
the web.&lt;/p&gt;
&lt;p&gt;There are a few of these. My previous web site was made with a
self-made static site generator, and they are not actually very
complicated when you get down to it. It’s just that there are more
design choices than there are lines of code. The ones that caught my
eye were &lt;a href=&quot;http://wintersmith.io/&quot;&gt;Wintersmith&lt;/a&gt; and &lt;a href=&quot;https://docpad.org/&quot;&gt;DocPad&lt;/a&gt;. DocPad talks about setting
me free, suggesting that I’m currently a captive. Not a positive
image. I went with Wintersmith because I happen to like winter. And
the author is Swedish, so there’s that. The obvious choice.&lt;/p&gt;
&lt;p&gt;Wintersmith lets you write in Markdown and it automatically turns it
into web pages. Markdown is a shiny web thing. But the layout of the
site itself is controlled with Jade (apparently now renamed Pug?). In
my own templating engine I used &lt;a href=&quot;http://okmij.org/ftp/Scheme/xml.html&quot;&gt;SXML&lt;/a&gt; for the jobs of both Markdown
and Jade. I hadn’t used Jade for more than five minutes before I
started to miss quasiquote, unquote and in particular my good old
friend unquote-splicing. (Search for how to make a list of tags
separated by commas.) If I were to do a site generator again I’d use
Markdown for the content and SXML for the structure. Scheme gets a lot
of things right, in the end.&lt;/p&gt;
</description>
    </item>
    <item>
      <title>Faster Dynamic Type Checks</title>
      <link>https://weinholt.se/articles/alignment-check/</link>
      <pubDate>Wed, 07 Mar 2012 01:00:00 +0100</pubDate>
      <guid isPermaLink="true">https://weinholt.se/articles/alignment-check/</guid>
      <author>weinholt</author>
      <description>&lt;p&gt;&lt;a href=&quot;/scheme/alignment-check.pdf&quot;&gt;“Arranging for Safety Checks with Hardware Traps”&lt;/a&gt; was the title
of an article I wrote for a class project. It describes how to use the
Alignment Checking feature of the x86/AMD64 architecture to get
branchless dynamic type checks.&lt;/p&gt;
&lt;p&gt;&lt;span class=&quot;more&quot;&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;The article has not been published formally, although it was written
in a style that makes it look like it could be published. I was unable
to find anything in the literature that describes how to use alignment
checking for dynamic type checks.&lt;/p&gt;
&lt;p&gt;After writing the article I discovered that the idea is mentioned in
passing in Olin Shivers’s dissertation and there are several mentions
of it on Usenet. The method is old but has not been used on the x86
because it is difficult to make it work with other software.&lt;/p&gt;
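&lt;p&gt;The general idea can be modelled in a few lines of Scheme. The tag
assignment below is purely hypothetical and not taken from the article:
it assumes 8-byte aligned objects, a pair tag of 1 in the low three
bits, and the car stored in the first field. The names
&lt;code&gt;car-address&lt;/code&gt; and &lt;code&gt;aligned-load?&lt;/code&gt; are made up for this sketch:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;lang-scheme&quot;&gt;(import (rnrs))

;; Assumption for illustration only: pairs carry the tag 1 in the low bits.
(define pair-tag 1)

;; Address used by a load of the car: the displacement cancels the pair tag.
(define (car-address tagged-pointer)
  (fx- tagged-pointer pair-tag))

;; With alignment checking enabled, an 8-byte load from an address that is
;; not a multiple of 8 traps. This predicate models &quot;the load succeeds&quot;.
(define (aligned-load? address)
  (fxzero? (fxand address 7)))

(display (list (aligned-load? (car-address (fxior #x1000 pair-tag)))
               (aligned-load? (car-address (fxior #x1000 3)))))
(newline)                               ; prints (#t #f)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A pointer with the expected tag yields an aligned address, while any
other tag leaves the address misaligned, so the hardware trap takes
the place of an explicit compare and branch.&lt;/p&gt;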
&lt;p&gt;And for what it’s worth: the branchless vector-ref is probably slower
than the one with the branch. But it is cool.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;“A hack is a terrible thing to waste, please give to the
implementation of your choice…” – GJC&lt;/p&gt;
&lt;/blockquote&gt;
</description>
    </item>
  </channel>
</rss>