In this post I wanted to give people the low-down on
radicle-surf and the journey for its first version.
In part it’s also a personal journey for myself and the team, since we are a partly-remote team and I wanted to give an
idea of how we work.
As a part of the greater Radicle ecosystem, we will focus on the
radicle-surf component here. The aim of this component is to provide an API for browsing code in a Version Control System (VCS). Think of it as similar to going to GitHub and looking at a specific repository. Some of the common things we want to do are:
- Navigate directories
- View a file’s contents
- See what commit touched a file last
- View the commits in some history
Another wish for the project was to have it independent from the VCS it used. We would like to explore having support for Pijul repositories, for example. With a few of these details in mind we moved forward into thinking about the design of
Denotational design is a technique for designing your API in such a way that it minimises abstraction leaks and lends itself to getting to the “true meaning” of what you are designing. Conal Elliott is the main (pretty much only) proponent of this technique. I first became interested in it after watching his talk and remarked on how elegant his abstractions turned out to be. On top of that, the process of thinking about what you want to represent and eliminating edge cases in the system is dear to my heart as a functional programmer who likes types and static guarantees.
The reason I mention this is because as I began to think about code browsing my friend Sandy Maguire had mentioned denotational design in his upcoming book, “Design and Interpretation of Haskell Programs”. This re-lit the first in my mind and I decided to explore the technique as a way to provide some upfront work on
radicle-surf before diving in and writing code.
It’s rare that we get things right the first time around. It was no different with working denotational design and our initial stab at the problem. We can find the initial design here.
The tactic for designing in this way is to lay out your domain as much as you can up front. So, to start off we list all the objects we think we will require in the system. In this case, we had listed objects such as:
Change. We can note here that these might not have initially existed from the outset, but instead brought into existence after exploring the design further.
Once we had the objects listed, we listed the actions we would like to have working on them. These build up the functionality of our API. For example, in code browsing we want to list a directory, get a particular set of commits, view a file’s contents, etc. With the types of objects and functions in hand we begin to reason about the “meaning” of these objects and functions. The aim of this step is a bit more nuanced and we spend the most of our time figuring this out and refining it. The best guide for this step is to find the type that is used the least in the functions. For example,
Repo was only mentioned in two places:
addCommitHistory :: CommitHistory -> Repo -> Repo getCommitHistories :: Repo -> [CommitHistory]
We have a function that adds a
CommitHistory to a
Repo, getting back the updated
Repo. The other is to retrieve all the
CommitHistory objects in a
Repo. When we want to assign a meaning to an object, we want to pick the simplest meaning as possible. This means that we do not want to bring any unnecessary meaning with us during the design. We will not want to talk about what time the
Repo was created at, so we will not be adding a time type to the meaning. The simplest meaning in our case here is
type instance Repo = [CommitHistory]. A
Repo is simply the collection of
CommitHistory objects. From there we can derive the meaning of our two functions above using the overloaded operator
μ (read “as meaning of”).
-- Prepend a 'CommitHistory' to a 'Repo' addCommitHistory :: CommitHistory -> Repo -> Repo μ addCommitHistory history = history : μ repo -- Retrieve a list of the 'CommitHistory's getCommitHistories :: Repo -> [CommitHistory] μ getCommitHistories repo = μ repo
The issue was that when starting out I did not describe the domain correctly to myself. When fleshing out the design I got to the point where a
Change was the atomic unit of the system. A
Commit was made up of a set of
Changes and they would build a file. Maybe it was my buzz on just learning about patch theory, Darcs, and Pijul.
In the design we defined
type Change -- Constructors of Change - think GADT AddLineToFile :: FileName -> Location -> FileContents -> Change RemoveLineFromFile :: FileName -> Location -> Change MoveFile :: FileName -> FileName -> Change CreateFile :: FileName -> Change DeleteFile :: FileName -> Change
Then to build up a file’s contents we added:
applyChange :: Change -> FileContents -> FileContents μ applyChange change fileContents = case change of AddLineToFile _ location fileContents' -> addLine location fileContents' fileContents RemoveLineFromFile _ location -> removeLine location fileContents -- These operations don't modify contents per se but rather just modify the file MoveFile _ _ -> fileContents CreateFile _ -> fileContents -- Deleting will set FileContents to nothing DeleteFile _ -> emptyFileContents
While this is elegant from the point of view of constructing a file via its history, it became unnecessarily complicated to implement when all we wanted to do is view a file’s state. We would have to implement a patch-based approach for Git Commits. This would be a distraction from what we really wanted to do: view files and directories.
The good news is that I have a team at Monadic to lean on when the going gets tough! After expressing my frustration with the initial design, Kim and I setup a call where we could collaborate on a refinement of the design. Kim is based in Berlin and I am based in Dublin so we hopped on Google Hangouts and used tmate to start a remote terminal that we could both work in. This is a really cool tool for doing some pair programming remotely, or even in the same room.
Sitting down together we thought about the problem again. We initially tried to think about the initial design and what was the issue, but we switched how we were thinking about things a bit. We had a thought experiment as to what a code browser might look like if we were to build it for a REPL. This lead us to an initial jumping off point where the user would get a repository, pick a history to view, build the directory from that history, and finally browse that directory. This matched up pretty well when thinking about how a user would utilise GitHub. We first go to a URL, pick a branch (default to the
master branch), and we have our directories and files to view.
Then things got interesting. We knew we had this concept of building a directory. Using Pijul and Git as examples we thought about how they worked to render a directory from their units of change.
In Git we have the notion of Commit (see Commit Objects), and this will give us the ability to render a Tree. So if we had a list of commits (a Branch can be seen as a list of Commits), to get the directory state we grab the head of the list and render the Tree. On the flip-side we have Pijul which contains a list of Patches. If we are given a collection of Patches we can fold over them, applying each using Pijul’s logic, to get the redered directory state.
So it was important to note that we have two similar functions that go from some VCS artifact to a rendered directory. This led us to the idea of a
Snapshot was simply a function
[a] -> Directory where
a could be a
Commit or a
Patch. This is when we felt like we were really onto something. The boundary was clearly there separating any
Repo logic from the
We refined this further. Since we must have at least one
a to bring a
Directory into existence, a
NonEmpty a -> Directory. We note that we could have also said that it would be
[a] -> Maybe Directory. Instead we wanted to work on more natural semantics, since we can also note that there is a transformation from
[a] -> Maybe (NonEmpty a). For a further explanation on this method, we highly recommend reading Parse, don’t validate, by Alexis King.
One issue with this initial meaning for
Snapshot though was that
head of an empty list is not a total function and so we could not get the
head of list of
Commits. We would like to avoid these kinds of faulty semantics as much as possible, and thankfully this is simply fixed by using a
NonEmpty list, but it clearly illustrates the push and pull of giving meanings to types and functions.
We explored further the meaning of
Snapshot and how it would play in the browsing semantics. We noted that in the initial user flow of the REPL, that there was no way of changing the history we wanted to work with. This smelled like we needed some way of viewing and updating state. In Haskell-land, this is achieved simply by using the
State monad, where we can call functions such as
modify. At that moment we realised that our way of browsing code was simply combining a
Reader monad (a read only state) over a
State monad. The read-only state was the
Snapshot function, and the modifiable state was
NonEmpty a). From there we could easily define a function that rendered a
Directory from its current state and modify state, such as finding a particular commit, or viewing a particular branch.
Once we have a
Directory we can do things like finding a particular file or directory, or getting the diff between to two
Directory objects. These are outlined in the second design.
The first minimal version of
radicle-surf is coming along nicely and we would say this is due to having put some good up-front work before writing any code.
On top of having a reference document for writing the implementation, it also gave other Monadlings the chance to give their input on the design and allowed us to have a common place to discuss the design. It allowed us to not get distracted by nit-picky semantics, but rather focus on what we truly wanted to design and use.
We would love to hear your feedback and if you have had any experience in writing denotational design before
Lots of love,