Publications

This is a selected list of papers, chosen because, to my knowledge, they represent interesting (to me) "firsts." Some of the ideas presented in these papers were further developed and published. Along with each paper, I have included a brief description explaining its contribution.

The documents listed below are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons viewing this information will adhere to the terms and constraints invoked by each author's copyright.

Selected Papers (full list)

Karthi Srinivasan and Rajit Manohar. Maelstrom: A Logic Synthesis Technique for Asynchronous Circuits. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), (early access) May 2025. (abstract, pdf)
This paper introduces the first scalable and efficient logic synthesis method that translates behavioral asynchronous descriptions into asynchronous circuits. Previous work was either non-scalable (computationally intractable for large circuits), or generated inefficient circuits. This work presents an alternate approach, and represents a qualitative advance in logic synthesis techniques for asynchronous circuits.

Rajit Manohar and Yoram Moses. Timed Signalling Processes. IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC), July 2023. (abstract, pdf)
This paper introduces the timed signalling processes model that captures digital computations that can use timing information to order signal transitions. It introduces the notion of a zigzag pattern in circuits, and shows that the existence of a generalized zigzag pattern is both a necessary and sufficient condition for ordering signal transitions in the timed asynchronous circuit setting. In particular, all timing constraints for correctness are special cases of zigzag patterns, including setup/hold time requirements in synchronous logic. This also proves that the digital ASIC flow using the ACT framework is universal—it has no inherent limitation and can be used to support any digital logic family.

Karthi Srinivasan, Yoram Moses, and Rajit Manohar. Opportunistic Mutual Exclusion. IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC), July 2023. (abstract, pdf)
This paper shows how we can exploit run-time information combined with knowledge of timing to improve the performance of a classic algorithm—mutual exclusion using a central server. This is the first example of using a zigzag pattern to solve a concrete problem where realized delays at run-time are exploited for improving performance.

Ruslan Dashkin and Rajit Manohar. General Approach to Asynchronous Circuits Simulation Using Synchronous FPGAs. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 41(10):3452--3465, October 2022. (pdf)
This paper describes a new universal method that can translate asynchronous circuits into a synthesizable synchronous simulation model. What is novel about the approach presented is that it provides a unified mechanism that can handle a wide range of asynchronous circuit families. This translation provides a mechanism for fast functional simulation of asynchronous circuits on commercially available FPGAs.

Rui Li, Lincoln Berkley, Yihang Yang, and Rajit Manohar. Fluid: An Asynchronous High-level Synthesis Tool for Complex Program Structures. IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC), September 2021. (abstract, pdf)
A high-level synthesis engine that can generate asynchronous dataflow circuits from optimized C/C++ programs, operating directly on the LLVM compiler intermediate representation. The paper introduces several new techniques that enable dataflow synthesis in the presence of complex control dataflow graphs that arise due to common software design patterns and compiler optimizations.

Rajit Manohar. Exact Timing Analysis for Asynchronous Circuits with Multiple Periods. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2020. (abstract, pdf)
Repetitive event-rule (RER) systems are used to capture the timing properties of asynchronous circuits. It is known that critically connected RER systems are exactly periodic with a single cycle period. This paper completes the analysis, showing that general RER systems can have multiple periods, and provides an efficient algorithm to compute the period for every signal in the circuit.

Wenmian Hua, Yi-Shan Lu, Keshav Pingali, Rajit Manohar. Cyclone: a static timing and power analysis engine for asynchronous circuits. Proceedings of the IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC), May 2020. (abstract, pdf)
The first static timing and power analysis engine for asynchronous logic. The paper introduces the concepts of arrival time, required time, and slack for asynchronous circuits.

Rajit Manohar and Yoram Moses. Asynchronous Signalling Processes. Proceedings of the IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC), May 2019. (abstract, pdf)
There was a long-standing question about the foundations of delay-insensitive (DI) circuits. One group of papers proposed sets of universal delay-insensitive components (universal in the sense of computational ability). A second group of papers argued that DI circuits are limited in what they can compute; that, in fact, they are essentially just C-elements. This paper proves that the C-element limitations of DI circuits are a consequence of the assumption that a DI module has a single output. We show that in a circuit consisting of single-output DI modules, every DI module must behave like a C-element, no matter how complex the module. We also show that having two-output DI gates is sufficient to compute any function using purely DI circuits.

Samira Ataei and Rajit Manohar. AMC: An Asynchronous Memory Compiler. Proceedings of the IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC), May 2019. (abstract, pdf)
The first open-source asynchronous pipelined memory compiler. The memories generated are competitive with those from commercial memory compilers.

Nitish Srivastava and Rajit Manohar. Operation Dependent Frequency Scaling Using Desynchronization. IEEE Transactions on VLSI, 27(4):799--809, April 2019. (abstract, pdf)
This paper presents a modified version of desynchronization that, for the first time, yields a synchronous-to-asynchronous conversion with better performance than the original synchronous design.

Alexander Neckar, Sam Fok, Ben Benjamin, Terrence C. Stewart, Nick N. Oza, Aaron R. Voelker, Chris Eliasmith, Rajit Manohar, Kwabena Boahen. Braindrop: A Mixed-Signal Neuromorphic Architecture with a Dynamical Systems-Based Programming Model. Proceedings of the IEEE, 107(1):144--164, January 2019. (abstract, pdf)
A mixed-signal neuromorphic chip that supports a dynamical systems programming model using the NEF framework. The energy-efficiency of the chip exceeds the state of the art by about an order of magnitude.

Yu Chen, Xiaoyang Zhang, Yong Lian, Rajit Manohar, Yannis Tsividis. A Continuous-Time Digital IIR Filter with Signal-Derived Timing and Fully Agile Power Consumption. IEEE Journal of Solid-State Circuits, 53(2):418--430, February 2018. (abstract, pdf)
This paper presents the first continuous-time IIR filter. This is a non-trivial result because a straightforward implementation of an IIR filter in the presence of timing uncertainty can easily have a large number of spurious outputs.

Wenmian Hua and Rajit Manohar. Exact Timing Analysis for Asynchronous Systems. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 37(1):203--216, January 2018. (abstract, pdf)
There are many papers on timing analysis for asynchronous circuits in the presence of AND causality. They all assume that the repetitive event structure used to represent event causality is strongly connected. This paper lifts this restriction; it shows that, after an initial finite prefix, the timing behavior of asynchronous circuits is periodic even without the restriction of strong connectivity. Hence, for the first time, we have a unified framework to analyze both synchronous and asynchronous circuits. We show how the theory can be used to create a metastability-free interface between asynchronous and synchronous logic.

Rajit Manohar and Yoram Moses. The Eventual C-Element Theorem for Delay-Insensitive Asynchronous Circuits. Proceedings of the IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC), May 2017. (abstract, pdf)
In a classic result, Martin showed that purely delay-insensitive circuits are very limited---under an assumption on computations that makes the result much less general. This paper presents a new theorem that also shows that purely delay-insensitive circuits are very limited---without making any a priori assumptions about computations.

Sandra Jackson and Rajit Manohar. Gradual Synchronization. IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC), May 2016. (abstract, pdf)
Synchronizing an asynchronous signal to a clock requires a circuit that can handle metastability. While techniques have been developed for high-throughput synchronization, there is a fundamental latency penalty required to achieve low failure rates. This is the first paper to show that you can compute on the data while resolving the metastability, enabling the latency to be hidden behind useful work. In other words, this is the first "computing synchronizer" (thanks to M. Nystrom for suggesting the phrase).

Rajit Manohar. Comparing Stochastic and Deterministic Computing. IEEE Computer Architecture Letters, 2015. (abstract, pdf)
Researchers are investigating stochastic computing again. This paper provides a simple analytical treatment of the benefits and drawbacks of stochastic computing versus conventional approaches, taking into account the numerical errors introduced by the two approaches.

Rajit Manohar and Yoram Moses. Analyzing Isochronic Forks with Potential Causality. IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC), May 2015. (abstract, pdf)
First complete proof of the precise nature of the timing constraint required for correct operation of quasi delay-insensitive circuits. This paper also introduces the analog of Lamport causality (widely used in the distributed systems literature) for asynchronous circuits.

Stephen Longfield and Rajit Manohar. Removing Concurrency for Rapid Functional Verification. Proceedings of the 2014 International Conference on Computer-Aided Design (ICCAD), November 2014. (abstract, pdf)
This paper shows how slack elasticity can be used to "sequentialize" a concurrent asynchronous system, thereby vastly reducing the complexity of the verification problem in asynchronous circuits.

Paul A. Merolla, John V. Arthur, Rodrigo Alvarez-Icaza, Andrew S. Cassidy, Jun Sawada, Filipp Akopyan, Bryan L. Jackson, Nabil Imam, Chen Guo, Yutaka Nakamura, Bernard Brezzo, Ivan Vo, Steven K. Esser, Rathinakumar Appuswamy, Brian Taba, Arnon Amir, Myron D. Flickner, William P. Risk, Rajit Manohar, and Dharmendra Modha. A Million Spiking-Neuron Integrated Circuit with a Scalable Communication Network and Interface. Science, 345(6197):668--673, August 2014. (abstract, pdf)
First large-scale deterministic neuromorphic architecture. Largest asynchronous chip ever designed (5.4B transistors, correct on first silicon). Record for low power operation in neuromorphic electronics.

Benjamin Tang, Sunil Bhave, and Rajit Manohar. Low Power Asynchronous VLSI with NEM Relays. Proceedings of the 20th IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC), May 2014. (abstract, pdf)
First paper that looks at integrating nano-mechanical relays with asynchronous logic.

Stephen Longfield and Rajit Manohar. Inverting Martin Synthesis for Verification. Proceedings of the 19th IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC), May 2013. (abstract, pdf)
A new approach to verification of asynchronous circuits that inverts the synthesis procedure to simplify equivalence checking.

Robert Karmazin, Carlos Otero, and Rajit Manohar. CellTK: Automated Layout for Asynchronous Circuits with Nonstandard Cells. Proceedings of the 19th IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC), May 2013. (abstract, pdf)
Automated layout flow for self-timed circuits with dynamic cell library generation. First automated layout flow for general asynchronous circuits.

Benjamin Tang, Stephen Longfield, Sunil Bhave, and Rajit Manohar. A Low Power Asynchronous GPS Baseband Processor. Proceedings of the 18th IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC), May 2012. (abstract, pdf)
A low power GPS baseband processor implemented with self-timed circuits. The design uses significantly lower power than previous GPS baseband designs---1.4 mW in 90nm for continuous tracking of six channels.

Paul Merolla, John Arthur, Filipp Akopyan, Nabil Imam, Rajit Manohar, Dharmendra Modha. A Digital Neurosynaptic Core Using Embedded Crossbar Memory with 45pJ per Spike in 45nm. Proceedings of the IEEE Custom Integrated Circuits Conference (CICC), September 2011. (abstract, pdf)
This paper introduced the notion of a "neurosynaptic core" for neuromorphic computing. It also presents a fully digital implementation that is, for the first time, competitive with previous mixed-signal implementations.

Basit Riaz Sheikh and Rajit Manohar. An Operand-Optimized Asynchronous IEEE 754 Double-precision floating-point adder. Proceedings of the IEEE International Symposium on Asynchronous Circuits and Systems, May 2010. (abstract, pdf)
This paper presents the first detailed design of an asynchronous double-precision floating-point adder. The paper introduces a new class of data-dependent optimizations for asynchronous arithmetic circuits. The adder achieves 33 GFLOPS/W at 2.15 GHz, and 52 GFLOPS/W at 1.3 GHz in a 65nm bulk technology.

S. Ramaswamy, L. Rockett, D. Patel, S. Danziger, R. Manohar, C. Kelly, J. Holt, V. Ekanayake, D. Elftmann. A Radiation Hardened Reconfigurable FPGA. Proceedings of the IEEE Aerospace Conference, March 2009.
This paper presents test results from the first radiation-hardened, re-programmable FPGA architecture.

David Fang, Filipp Akopyan, and Rajit Manohar. Self-Timed Thermally Aware Circuits. IEEE Computer Society Annual Symposium on VLSI, March 2006. (abstract, pdf)
This paper describes a low-overhead method to guarantee that an asynchronous circuit will never exhibit thermal runaway.

Song Peng, David Fang, John Teifel, and Rajit Manohar. Automated Synthesis for Asynchronous FPGAs. 13th ACM International Symposium on Field Programmable Gate Arrays, February 2005. (abstract, pdf, ps)
This paper describes a complete automated synthesis flow for asynchronous dataflow computations, and a mapping to asynchronous FPGAs. This is the first time anyone has bridged the gap between a high-level ("RTL"-level) language and an asynchronous FPGA architecture using automated tools.

Rajit Manohar and K. Mani Chandy. Δ-Dataflow Networks for Event Stream Processing. Proceedings of the IASTED International Conference on Parallel and Distributed Computing and Systems, November 2004. (abstract, pdf, ps)
This paper describes a simple model for incremental computations. The model is very efficient at change detection, and can be thought of as "memoization on steroids."

John Teifel and Rajit Manohar. Static Tokens: Using Dataflow to Automate Concurrent Pipeline Synthesis. Proceedings of the 10th International Symposium on Asynchronous Circuits and Systems, April 2004. (abstract, pdf, ps)
This paper describes an intermediate representation--static token form--that is suitable for dataflow-style synthesis of high-level asynchronous specifications. Both normal and loop-carried dependencies are handled in a unified framework.

John Teifel and Rajit Manohar. Programmable Asynchronous Pipeline Arrays. Proceedings of the 13th International Conference on Field Programmable Logic and Applications, Lisbon, Portugal, September 2003. (abstract, ps, pdf)
This paper describes an asynchronous FPGA architecture that is programmable at the pipeline stage level. We report performance numbers that, for the first time, are competitive with (and actually better than) clocked FPGA architectures, and that are also competitive with full custom asynchronous design.

Clinton Kelly IV and Rajit Manohar. An Event-Synchronization Protocol for Parallel Simulation of Large-Scale Wireless Networks. Seventh IEEE International Symposium on Distributed Simulation and Real Time Applications, October 2003. (abstract, pdf, ps)
This paper describes a method to implement scalable parallel discrete event simulators based on executing events at approximately a scaled version of real-time.

Clinton Kelly IV, Virantha Ekanayake, and Rajit Manohar. SNAP: A Sensor Network Asynchronous Processor. Proceedings of the Ninth International Symposium on Asynchronous Circuits and Systems, Vancouver, BC, May 2003. (abstract, ps, pdf)
This paper presents the first microprocessor optimized for sensor network applications and wireless network simulation. The entire processor is clockless and event-driven, allowing for very fast transitions to/from its idle state as well as energy-efficient operation. The processor can handle 10 sensor events/sec with 20-40 nW of active power.

Rajit Manohar and Clinton Kelly, IV. Network on a Chip: Modeling Wireless Networks with Asynchronous VLSI. IEEE Communications Magazine, November 2001. (abstract, ps, pdf)
This paper presents the connection between asynchronous VLSI and networks, and argues that efficient hardware network emulators can be built using asynchronous design techniques.

Rajit Manohar. Width-Adaptive Data Word Architectures. Proceedings of the 19th Conference on Advanced Research in VLSI, Salt Lake City, Utah, March 2001. (abstract, ps)
This paper presents a comprehensive set of techniques for designing adaptive processors that only have datapath switching activity for the significant digits in a binary number. Independently, Jim Smith's group at Wisconsin provided an architectural evaluation of clocked datapaths that use similar concepts but a different representation (MICRO, December 2000).

Rajit Manohar, Tak-Kwan Lee, and Alain J. Martin. Projection: A Synthesis Technique for Concurrent Systems. Proceedings of the Fifth International Symposium on Advanced Research in Asynchronous Circuits and Systems, April 1999. (abstract, ps)
This paper presents a powerful program transformation that can be used to reason about the correctness of asynchronous pipelines. In particular, asynchronous computations pipelined according to their dataflow graph can be shown to be correct in a trivial manner.

Rajit Manohar and José A. Tierno. Asynchronous Parallel Prefix Computation. IEEE Transactions on Computers, 47(11):1244--1252, November 1998. (abstract, ps)
This paper presents the design of an N-input asynchronous parallel prefix circuit that has an expected latency that is O(log log N) when the prefix operator has a right zero. In particular, this circuit can be used to construct an asynchronous adder that has O(log log N) expected latency. Asymptotically, our design has the best attainable: (i) throughput; (ii) worst-case latency; (iii) average-case latency for any input distribution (!). Given its performance characteristics, it also has the best possible area. My thesis chapter has a more complete analysis.
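As a rough illustration of what a prefix operator with a right zero looks like, the sketch below computes adder carries with a generic log-depth prefix scan. It is not the asynchronous circuit from the paper and does not model its O(log log N) expected latency; the names `combine`, `prefix_scan`, and `add` are hypothetical and chosen only for this example. The carry symbols "k" (kill) and "g" (generate) are right zeros of the operator: combining anything with them on the right yields them unchanged, so the result at such a position is known without looking further to the left.

```python
# Carry symbols for one bit position of an adder:
#   "k" = kill (carry out is 0), "g" = generate (carry out is 1),
#   "p" = propagate (carry out equals carry in).

def combine(left, right):
    """Associative prefix operator; "k" and "g" are right zeros."""
    return left if right == "p" else right


def prefix_scan(symbols):
    """Inclusive prefix computation using about log2(N) combining rounds."""
    result = list(symbols)
    distance = 1
    while distance < len(result):
        previous = list(result)  # snapshot so each round reads old values
        for i in range(distance, len(result)):
            result[i] = combine(previous[i - distance], previous[i])
        distance *= 2
    return result


def add(a, b, width):
    """Add two non-negative integers modulo 2**width via the prefix scan."""
    symbols = []
    for i in range(width):
        x, y = (a >> i) & 1, (b >> i) & 1
        symbols.append("g" if x & y else "p" if x ^ y else "k")
    carries_out = prefix_scan(symbols)
    total = 0
    carry_in = 0  # no carry into the least significant bit
    for i in range(width):
        x, y = (a >> i) & 1, (b >> i) & 1
        total |= (x ^ y ^ carry_in) << i
        carry_in = 1 if carries_out[i] == "g" else 0
    return total
```

For instance, `add(5, 3, 4)` builds the symbols g, p, p, k (least significant bit first), and the scan resolves every carry from those symbols alone.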

Rajit Manohar and Alain J. Martin. Slack Elasticity in Concurrent Computing. Proceedings of the Fourth International Conference on the Mathematics of Program Construction, Lecture Notes in Computer Science 1422, pp. 272--285, Springer-Verlag 1998. (abstract, ps)
This paper presents an analysis of the effect of increasing the synchronization slack between two communication actions on the correctness of the computation. In particular, it is shown that a large class of asynchronous computations remain unchanged when the slack is increased. This has important consequences for asynchronous microprocessor design, and shows that most local re-pipelining decisions do not affect global correctness.

Alain J. Martin, Andrew Lines, Rajit Manohar, Mika Nyström, Paul Penzes, Robert Southworth, Uri V. Cummings, and Tak-Kwan Lee. The Design of an Asynchronous MIPS R3000 microprocessor. Proceedings of the 17th Conference on Advanced Research in VLSI, pp. 164--181, September 1997. (abstract, ps, pdf)
This paper presented the first published asynchronous microprocessor that was competitive with (actually better than) clocked microprocessors in terms of performance. This paper introduced a number of important techniques at the circuit and microarchitecture level that were used to achieve high performance without resorting to aggressive timing assumptions. This paper also introduced the Ed² energy-efficiency metric.

José A. Tierno, Rajit Manohar, and Alain J. Martin. The Energy and Entropy of VLSI Computations. Proceedings of the Second International Symposium on Advanced Research in Asynchronous Circuits and Systems, March 1996. (abstract, ps)
This paper presents the connection between energy, entropy, and asynchronous computation. This is a follow-on to an earlier paper on low energy asynchronous memories that contains some of the theory presented here.

Rajit Manohar and Alain J. Martin. Quasi-delay-insensitive circuits are Turing-complete. Invited article, Second International Symposium on Advanced Research in Asynchronous Circuits and Systems, March 1996. Available as Caltech technical report CS-TR-95-11, November 1995. (abstract, ps)
This paper presents the connection between hazard-free quasi-delay insensitive (QDI) circuits, the stability property of gates, and the confluence property of computations. It also shows that the synthesis method used for QDI circuits is complete.



Errata: The paper on "Slack Elasticity" published in the proceedings of the conference on the Mathematics of Program Construction (1998) has an error in the final printed version due to an unfortunate oversight in proof-reading. Corollary 1 should read: If a system satisfies its specification when the slack on channel c is k, and if it is unchanged when the slack on channel c is l (> k), it satisfies its specification when the slack on c is s, for all s satisfying k <= s <= l. An examination of the proof shows that this is the statement being established, so the proof is identical. This statement was the version presented at the conference as well.

 
  