6 Months of Bluejay

July 13, 2021

*Blue Jay photo by Erin Minuskin on Unsplash*

Nearly six months ago now I started working on a kernel that I called Bluejay. I didn’t have much of a goal from the get-go beyond wanting to do a bit of OS development, but after a few weeks of work I started to get an idea of where I wanted to go with the project.

In short, I wanted to create an operating system blending the best parts of UNIX and Lisp machines: the “UNIX philosophy” mixed with a high-level and expressive language. Nearly six months later and I think I am well on my way to that goal. Here’s what I’ve made so far:

Bluejay has two main parts, the kernel and the Lisp compiler. Both will eventually work hand in hand, but for now they are separate. Both the kernel and compiler are written in C and Assembly language.

The Lisp Compiler

The Lisp compiler is the most interesting part of Bluejay so far in my opinion. It is a quite capable and reasonably performant JIT compiler powered by DynASM (the library powering the LuaJIT and PHP compilers). It supports many of bells and whistles you would expect from a normal Lisp implementation, with a few other features thrown in:

Functions, variables, the basic stuff
Optional and variadic arguments
Macros
A garbage collector
Seamless interop with C
Lexically-scoped closures
The beginning of a standard library

The notable exception so far is an object system similar to CLOS, but I plan on implementing that soon.

Bluejay compiles functions as it goes to x86 machine code, which is stored in memory. Compiled functions follow the hosts calling conventions, meaning that they can be called seamlessly from C. Take for example the following:

;;;; hello.lisp

;; Create a list of numbers
(defun make-list-range (from to)
  (if (= from to)
      (cons to nil)
      (cons from (make-list-range (+ from 1) to))))

Say we want to create a range of numbers 2 to 6, and then add two to each one. This example shows doing so from C, by passing a C function to a Lisp definition of mapcar.

value_t add_two(value_t val)
{
    return l_plus(val, intval(2));
}

int main(int argc, char **argv)
{
    // Create an input stream
    struct istream *is = new_fistream("hello.lisp", false);
    // Compile the file
    struct environment env = compile_all(is);
    
    // This is kind of verbose, but I'll explain it:
    // print the value ...
    l_printval(
        // you get from calling mapcar ...
        find_function(env, "mapcar")->def2(
            // with add_two ...
            create_closure(add_two, 1, 0), // 1 arg, 0 captures
            // and the result of calling make-list-range ...
            find_function(env, "make-list-range")->def2(
                // with 2 and 6
                intval(2), intval(6))));
}

The C example is a bit verbose, but that’s C for you. Anyway, this works as you (hopefully) expected, printing out 4 5 6 7 8.

This small example shows how easy it is to pass data between Lisp and C. Although it’s missing some error handling and some details required for the GC to work properly it should demonstrate just how easy it is to plug Lisp into an existing C application.

Maybe next post I’ll show the opposite, mapcar in C calling a Lisp function…

One feature that was particularly tricky to get working well was lexical scope for closures. If you don’t know what lexical scope is, here’s a quick example:

(let1 (number 4)
  (lambda (n)
    (+ n number))) ; `number` is captured thanks to lexical scope

This makes possible the famous “let over lambda” pattern so common in Lisp.

The way this works is that when the compiler encounters a lambda expression it compiles it just like a normal function, but passes the current local scope as the “parent” scope. Then, when a variable can’t be resolved in the lambda’s scope (like number in the example) it will check the parent scope. If the variable is found in the parent scope it will mark it as “free” in the current scope (i.e. bound outside this scope).

Once the lambda has been compiled code is generated to bind the captured variables at runtime. It first creates a closure object, then the parent function walks through the lambda’s local scope and finds any variables marked as free. It then saves those variables in the capture list of the closure object.

When the closure is eventually called (with apply or funcall) these captured values are passed along too (in edi, if you’re interested). This way even though the same code is run for each instance of the closure, different captured values can be passed to it.

This of course works with the garbage collector, so if a reference to the closure exists the captured values will be marked as alive too.

The Bluejay Kernel

The second big part of Bluejay is the kernel. A ton of work has gone into it, but there are less bells and whistles to show off compared to the compiler. However, the architecture is in place to integrate the compiler and kernel together very soon, so I expect to be able to write Lisp code from within Bluejay itself before the end of the year.

Multi-tasking

That said, there are some very interesting features in the kernel already. It is full preemptive and supports thread-based multi-tasking. Memory protections are of course in place, with the kernel mapping itself to 0xC0000000 and above as is common among monolithic kernels.

Several drivers in the kernel already take advantage of the simple multi-threading capabilities. For example polling can be easily offloaded to a new thread.

void polling_thread(void *data)
{
    while (poll(something))
    {
        // Wait for something to happen
    }
}

void main_task()
{
    spawn_thread(polling_thread, NULL);
    
    // Keep working here while the other thread polls
}

Once file system support is finished user programs will be able to take advantage of these features as well.

Drivers

So far only a few simple drivers have been implemented, but I am in the process of developing some more interesting ones. Currently in progress is the EXT-2 driver which will provide a real back-end for the existing VFS architecture. The actual hard drive interface so far is handled through the old but simple ATA protocol, which all hard disks support for backwards compatibility.

I have also implemented PCI drivers which provide a convenient base for other device drivers to build upon. PCI device drivers can specify the physical devices they support and will be automatically launched if a compatible device is found. I plan to implement a similar interface for USB device drivers as well when I get to that point.

Future plans

Hopefully the next few months will see this all come together and make Bluejay the (sort-of) usable system I would like it to be. My goal for this year is to be able to write Lisp programs on a machine running Bluejay on bare metal, and at the current pace that doesn’t seem that far off. This project has been great fun to work on, and I am excited to see where it takes me.