This is an old revision of the document!

Circuit Synthesis

The process of converting a CHP description of a process into a PRS description automatically.

Maelstrom Guide

work in progress…

Maelstrom is a new logic synthesis technique (and software tool) for asynchronous circuits, developed at the AVLSI Lab at Yale. This is a guide to writing CHP so that the tool produces high-quality circuits.

Synthesizable CHP

Maelstrom only accepts CHP descriptions of the following form:

x0:=v0;
x1:=v1;
...
xn:=vn;
*[ true -> P ]

i.e. a (possibly empty) semi-colon separated list of initializations to a subset of variables used in the CHP, followed by an infinite repetition of an arbitrary CHP program P. The values vi must evaluate to constants. There are no restrictions on P, but some forms of P will lead to circuits that are much more expensive compared to others, as we will see later. Further, the channel implementation leads to all receives being passive and all sends being active. This means that (currently) only the receiving side may probe a channel. At the process level, the ports of a process may only be channels or Booleans (see section below on external constants for usage).

Program Structure

The first and most important concept to keep in mind when writing a CHP program is that the structure of the program you write very directly defines the control circuit that gets generated. The syntax is the controller. For example, suppose you would like a process to repeat two things, one after the other, forever. It is easy to accidentally write:

s:=0;
*[
    [s=0 -> thing1
    []s=1 -> thing2
    ];
    s:=1-s
 ]

when the correct thing to do is:

*[
    thing1;
    thing2
 ]

The former is very similar to Verilog, in terms of how the state machine is defined, but CHP is more abstract than RTL. Such writing has a cost as well. The first program will have a selection block which computes guards every iteration and activates the correct thing and a loop-carry (the variable s) apart from the circuit to implement the functionality that is actually needed. The second program will simply contain the circuits to implement the two things. Note that such writing occurs more frequently when the program size grows. The figure below shows, at a block-level, the controllers that will be generated for the two programs. It should be pretty obvious as to which is the better implementation. It is helpful to internalize this bijection between the structure of the CHP and the generated control circuits.

The only time to use the state machine-style syntax is when the number of iterations spent in each state is unknown/variable. For example, consider the simple CHP that computes the 2n-th Fibonacci number.

c:=0; x:=0; y:=0;
*[
    [ c=0 -> r!x; l?c , x:=0 , y:=1
    []c>0  -> x:=x+y ; y:=x+y ; c:=c-1
    ]
 ]

Here, the number of iterations cannot be statically determined when the program is written and necessitates the state-machine style CHP. One might think to write this as a nested loop with a c>0 guard. This is addressed in a section below on loops.

Conditionals and Ternaries

The selection in CHP allows for conditional statements and control flow control. However, there is a controller overhead for writing selections. In general, it is helpful to be sparing with the use of selections. Consider the following CHP:

[ x=0 -> y:=42 []else -> y:=0 ]

This is not a situation that warrants the overhead of a selection control block. The more appropriate CHP to write would be:

y:=(x=0)?(42):(0)

Although these look very similar, the hardware overhead is not. The selection controller is much more expensive than simply the combinational logic for a ternary, which is just a mux (or perhaps even cheaper, depending on the exact logical expression and how good a job the logic optimizer can do with it). However, there are cases where the usage of selections over ternaries is justified. For example, consider:

[ c -> y:=huge_expensive_computation
[]else-> y:=0 
]

Here, suppose the slow, first branch is not taken very often in the execution of your program. Then the control will entirely skip the expensive portion and run faster on average. Rewriting this into a ternary would then actually hurt the performance, since the slow computation is put unconditionally into the execution path and slows down the circuit on every cycle.

Further, there is an important distinction between selections written directly within a CHP body in a process and one written in a function. Selections written in functions are converted into the equivalent ternary expression in the function in-lining step and substituted. Selections in CHP bodies in a process are as they are. For example, consider:

fn_vs_sel.act

function f (int<1> c; int x) : int {
    chp {
        [ c=0 -> self:=x+1 [] c=1 -> self:=x-1 ]
    }
}
 
defproc p1 () {
    int x,y;
    int<1> c;
    chp {
        y:=f(c,x)
    }
}
 
defproc p2 () {
    int x,y;
    int<1> c;
    chp {
        [ c=0 -> y:=x+1 [] c=1 -> y:=x-1 ]    
    }
}

The CHP body in p1 will contain a single ternary, but p2 will contain a full selection.

Another situation where selections cannot be avoided is when there is a channel access in one of the branches but not the others. For example,

[ c -> Y!e
[]else -> skip
]

This cannot be rewritten into a ternary operator without re-architecting the usage of the channel Y at a higher level, and often that is unwise as well, since you must try to access channels only as often as needed and not just send values just to be thrown away at the other end.

Channel Accesses

Continuing from the previous example, it is possible that the selection is more than 2-way and the channel is only accessed in some branches but not the others. For example, consider:

[c=0 -> Y!e0
[]c=1 -> Y!e1
[]c=2 -> x:=x+1
[]c=3 -> skip
]

Here, although the entire selection cannot be squashed down into ternaries, the channel Y is syntactically accessed at more than one point in the program. This leads to an expensive circuit. Ideally, you want every channel in your process to appear in the CHP at most once. In this case, the correct thing to do would be to assign the value to send over Y to a variable then conditionally send it later:

[c=0 -> y:=e0
[]c=1 -> y:=e1
[]c=2 -> x:=x+1
[]c=3 -> skip
];
[c=0 | c=1 -> Y!y []else -> skip ]

In this way, the channel Y is only accessed once. In essence, channels should not be accessed multiple times when you know you're going to send something over it, but just don't know what. Instead, you must do so only know when you don't know how many times you are going to access it. The same logic extends to receives as well.

At a higher level, it is also important to be mindful when having multiple channels that go between the same two processes in the same direction. For example,

*[ ... ; C0!e0 , C1!e1 ; ... ]
||
*[ ... ; C0?x0 , C1?x1 ; ... ]

is incredibly wasteful. There is absolutely no need to have two channels between the same two points in the two programs. Instead, send one wider variable and assign the individual variables on the other end. Or better yet, use a structure for readability:

*[ ... ; (s.x0:=e0,s.x1:=e1) ; C!s ; ... ]
||
*[ ... ; C?t ; x0:=t.x0,x1:=t.x1 ; ... ]

External Constants

To Do…

Nested Loop Constructs

The loop construct in CHP allows for a conditional repetition of a CHP block. For example:

*[ L?x; *[x>0 -> X!x; x:=x-1] ]

generates all numbers from x to zero and sends them over X. However, the controller for a nested conditional loop is expensive. On top of that, the datapath must also include muxes to correctly set values of variables from before the loop or from the previous iteration of the loop, based on whether the current iteration of the loop is the first one or not. After some experimentation, it becomes apparent that the cheapest implementation for this is to factor out nested loops into multiple parallel (unconditional) loops. For example, the process above would automatically get rewritten internally by Maelstrom into:

*[ L?x; C!y ]
||
s:=0;
*[ [s=0 -> C?y
   []s=1 & y>0 -> X!y; y:=y-1
   []s=1 & y=0 -> s=0
   ] 
 ]