Department of Computer Science
Phone: (212) 939-7056
FAX: (646) 775-6023
E-mail: nowick AT cs DOT columbia DOT edu
I am a Professor Emeritus of Computer
Science (and, by courtesy, Electrical Engineering) at Columbia University,
and was a co-founder and former chair of the Computer Engineering Program.
I received a Ph.D. in Computer Science from Stanford University in 1993,
and a B.A. from Yale University in 1976. I retired from Columbia in 2019.
My main research interests
are: the design and optimization of asynchronous and mixed-timing digital
systems (globally-asynchronous locally-synchronous [GALS]); scalable high-performance and low-power on-chip networks for parallel processors and embedded systems;
computer-aided digital design and optimization (CAD); fault tolerance and reliability; and ultra-low-energy digital systems.
I am an IEEE Fellow (2009), and recipient of an Alfred P. Sloan Research Fellowship (1995), and NSF CAREER (1995) and RIA (1993) Awards, and am a Senior Member of the ACM. I received Best Paper Awards at the IEEE International Conference on Computer Design (1991, 2012) and the IEEE Async Symposium (2000). I co-founded the IEEE "Async" Symposia series (1994), now in its 26th year, and was its Program Committee Chair (1994, 1999) and General Chair (2005). I was Program Chair of the ACM/IEEE International Workshop on Logic and Synthesis (IWLS, 2005), and Program Track/Subcommittee Chair at DAC, DATE and ICCD conferences. I served on the editorial boards of several leading journals: IEEE Design & Test Magazine, IEEE Transactions on Computer-Aided Design, IEEE Transactions on VLSI Systems, and ACM Journal on Emerging Technologies in Computer Systems. I also served as a guest editor for a special issue of the Proceedings of the IEEE (Feb. 1999).
I was the selection committee chair of ACM/SIGDA's Outstanding Dissertation in EDA (Electronic Design Automation) Award, a selection committee member of the ACM/IEEE A. Richard Newton Technical Impact Award in Electronic Design Automation, a jury member of the ACM India Doctoral Dissertation Award, and a member of the Best Paper Award committees for ACM/IEEE DAC and ICCAD conferences. I am also a recipient of the Columbia Engineering School Alumni Distinguished Faculty Teaching Award (2011). I hold 13 issued US patents.
My industrial collaborations and technology transfer of our group's asynchronous digital designs and methodologies include: (i) to NASA Goddard Space Center (Greenbelt, MD), for joint design of an experimental laser space measurement circuit for space applications, using our asynchronous "burst-mode" controllers and Minimalist CAD tool; (ii) to IBM T.J. Watson Research, for joint design of a mixed async-sync experimental low-power chip for an FIR filter for disk drive reads, using our "high-capacity" asynchronous pipelines, which outperformed the best comparable IBM synchronous commercial design; and (iii) to AMD Corporation, implementing our low-latency and low-energy asynchronous network-on-chip switch for multicore systems, using advanced 14nm FinFET technology, which significantly outperformed one of their recent commercial synchronous designs. (See "Research Summary" and CV below for details.)
Data Science Institute: Chair/Founder of Research Center
I was also the founder and former chair of a new Columbia University research center, Computing Systems for Data-Driven Science, which has 45 faculty members (2018-2019). It is part of Columbia's Data Science Institute (DSI). It was formerly a working group, "Frontiers in Computing Systems," which I founded and chaired (2016-2018). Its focus is on the design and application of large-scale computing systems to break through current barriers in processing and analyzing vast data sets. The center brings together diverse researchers at Columbia in three areas: (i) computing systems (hardware, parallel computer architecture, distributed and cloud computing, software programming environments, databases, quantum computing and other emerging paradigms); (ii) data science and machine learning; and (iii) large-scale computational application areas in science, engineering and medicine (e.g. ocean and climate science, astrophysics, materials science, civil engineering, physics, biomedical informatics, and computational genomics). Its goal is to foster new and exciting cross-disciplinary collaborations and research projects between these areas.
See initial news story (July 2016), as well as a summary of our recent inaugural symposium (story and agenda) (March 2017).
Our DSI center also hosted the 2019 New York Scientific Data Summit (NYSDS-19), which we co-organized with Brookhaven National Laboratory's Computational Science Initiative (June 12-14), held in Davis Auditorium on Columbia's Morningside Campus. The event is the leading regional symposium focusing on large-scale computational problems in science/medicine/engineering, computing systems, and data analytics. See a news story on the event. For agenda and details, see the NYSDS-19 web site.
Short Profile of My Research (PDF): (click here)
Recent Professional Highlights (2008-present) (PDF): (click here)
Detailed Research Summary (July 2018) (PDF): (click here) (covers my main research areas, recent papers, technology transfer, grants)
Bios and CVs:
Provides an introduction to our complete methodology for designing low-latency and power-efficient asynchronous on-chip networks. The approach includes: a network-on-chip (NoC) microarchitecture and design, a complete CAD tool flow harnessing synchronous commercial EDA tools, detailed comparisons with synchronous designs in identical technology, and re-implementation at AMD Research (Boxborough, MA) in advanced 14nm FinFET commercial technology. We also provide direct normalized comparisons, for the first time, with some other influential asynchronous NoC designs.
Our asynchronous design largely dominates some leading synchronous designs, from both academia and industry, in most cost metrics: area, power and performance.
In comparison to an ultra-low power synchronous NoC using state-of-the-art clock gating, our async NoC, projected to a full HD video playback application for high-end mobile devices, exhibits up to 45% power improvements and 37% latency savings.
Moreover, we were invited to collaborate by AMD, and together we reimplemented one of their recent commercial synchronous NoCs in identical advanced (14nm FinFET) technology. Results confirm substantial benefits: 55% lower area, 28% lower latency, and reductions of 88% in idle power and 58% in active power. This is the first apples-to-apples comparison of asynchronous to commercial synchronous general-purpose NoCs in advanced technology.
Provides a broad and modern overview of the state of the art in asynchronous design. Includes a short history of asynchronous design, as well as a technical introduction to handshaking protocols and data encoding. It also covers recent industrial successes in mainstream technologies (IBM, Intel, Philips Semiconductors, etc.), as well as recent applications to emerging areas (neuromorphic computers, flexible electronics, quantum cellular automata, continuous-time DSPs, ultra-low voltage design, extreme environments). Highlights several application areas in depth, with a wide range of cited publications: GALS systems, networks-on-chip, computer architecture, testing and design-for-testability, and CAD tool development.
Provides a good basic introduction to asynchronous pipelines. Includes basic background on handshaking protocols and industrial developments, as well as a detailed technical introduction to several leading high-performance pipelines, and their use at Intel, Achronix Semiconductor, and other companies.
Selected Research Papers + Slides:
In collaboration with AMD Research, migrating our asynchronous network-on-chip into advanced 14nm FinFET industrial technology, with a direct head-to-head comparison with an AMD commercial network-on-chip (NoC). This is the first "apples-to-apples" comparison of an asynchronous NoC in advanced technology to a commercial chip. Our asynchronous network-on-chip exhibited dominating results: 55% less circuit area, 28% lower latency, and 58% (/88%) savings in active (/idle) power. For slides, click here.
An automated tool flow for asynchronous networks-on-chip using synchronous commercial CAD tools (joint with University of Ferrara, Italy).
Recent Web Articles (grants and tool releases):
"Steven Nowick Invited To Present Work on Asynchronous On-Chip Networks at Two National Study Groups" -- CS department news story (July 2015)
"Prof. Nowick Developing New Dynamically-Adaptable On-Chip Networks" -- Engineering School profile on my NSF grant (June 2012)
"Profs. Nowick and Tsividis Developing Ultra-Low Energy Continuous-Time Signal Processors" -- Engineering School profile on my medium-scale NSF grant (May 2010)
"Columbia Engineering News": "Nowick Developing New Desktop Supercomputer" -- story on my medium-scale NSF team grant, joint with University of Maryland (Summer 2008)
"EE Times (Europe)" -- article on our "CaSCADE" Asynchronous CAD Tool Release (December 2007)
"Columbia Engineering News": "New Information Technology" -- cover story on my two Medium-Scale NSF ITR Awards (Fall 2000)
Asynchronous Design in the News...:
Back row (left to right): Yu Chen, Kshitij Bhardwaj, Steve Nowick, Christos Vezyrtzis, Weiwei Jiang
Front row (left to right): Adil Sadik, George Faldamis
Former Post-Doctoral Research Scientists:
Gennette Gill (D.E. Shaw Research Laboratory, New York, NY)
Former PhD Students:
Kshitij Bhardwaj (Post-Doctoral Research Fellow, Harvard University, Computer Architecture and VLSI group, EE/CS Depts. [leads: Profs. David Brooks and Gu-Yeon Wei], Cambridge, MA)
Weiwei Jiang (Senior R&D Engineer, FPGA Synthesis Team, Verification Group, Synopsys Corporation, Mountain View, CA)
Christos Vezyrtzis (first position: Research Staff Member, IBM T.J. Watson Research Center, Yorktown, NY)
Melinda Agyekum (Program Manager, Enterprise Storage Backend, Google, New York, NY)
Peggy McGee (Senior R&D Engineer, Power Compiler group, Synopsys Corporation, Sunnyvale, CA)
Cheoljoo Jeong (Senior Design Engineer, Cadence Design Systems, Sunnyvale, CA)
Cheng-Hong Li (became student of Prof. Luca Carloni, now at Google)
Tiberiu Chelcea (first position: Postdoctoral Fellow, CS Department, CMU [Prof. Seth Goldstein's group])
Michael Theobald (Researcher, D.E. Shaw Research Laboratory, New York, NY; first position: Postdoctoral Fellow, CS Department, CMU [Prof. Ed Clarke's group])
Montek Singh (Associate Professor, CS Department, University of North Carolina - Chapel Hill)
Robert Fuhrer (Software Engineer, Google, New York, NY; first position: IBM T.J. Watson Research Center, Yorktown, NY)
Kunal Mahajan (transferred to Vishal Misra and Dan Rubenstein)
Yu Chen (EE, became student of Prof. Yannis Tsividis)
Former Collaborating PhD Students:
Gabriele Miorandi (University of Ferrara [D. Bertozzi group])
Alberto Ghiribaldi (University of Ferrara [D. Bertozzi group])
Former MS Students: (partial list)
Sumedh Attarde (Server Design group, Intel Corporation, Santa Clara, CA)
Clementine Barbet (Comp Eng)
Marco Cannizzaro (MS co-advisor; from Politecnico di Torino, Italy)
Georgios (George) Faldamis (Cavium, Inc.)
Michael Horak [U. of Maryland, co-chair of MS thesis committee] (Advanced Simulation Technology, Inc.)
Kiran Kumar Mada
Geoffray Lacourba (ARM Ltd., France)
Amitava Mitra (Intel India)
Wei Wei (CDM verification engineer, CPU design and verification team, Apple Corporation)
Former Undergraduate Project Students: (partial list)
Steven Callender (Intel Hillsboro; formerly UC Berkeley, PhD Student)
Charles O'Donnell (MIT, PhD Student)
Fall 16: CSEE W4823 Advanced Logic Design
Detailed Course Overview: (click here) Class Web Page: http://www.cs.columbia.edu/~cs4823
Spring 16: CSEE E6861 Computer-Aided Design of Digital Systems
Course Advertisement: (click here)
Detailed Course Overview: (click here) Class Web Page: http://www.cs.columbia.edu/~cs6861
Fall-16: Monday 4:30-5:30pm, Thursday 4:00-5:00pm
Room 508, Computer Science Building
phone: (212) 939-7056
"The CaSCADE Package" is our new asynchronous design environment, including six different tools and libraries. The acronym "CaSCADE" = "Columbia University and University of Southern California Asynchronous Design Environment". It was developed under NSF ITR Award No. NSF-CCR-0086036, with support from additional grants (see CaSCADE web pages for details). This set of asynchronous CAD tools is available for free download for use with Linux platforms.
Three of the tools in the CaSCADE package were developed and maintained by our Columbia asynchronous research group: (a) "MINIMALIST" for asynchronous controllers; (b) the "ATN_OPT Toolset" for robust asynchronous threshold networks; and (c) the "DES (Discrete Event System) Analyzer" for performance analysis and timing verification of concurrent systems.
(a) The MINIMALIST CAD Package, release v2.0: "MINIMALIST" is a comprehensive CAD package for the automated synthesis and optimization of asynchronous controllers. It includes a Verilog back-end, multi-level logic optimizer, decomposition tool for large specifications, verifier, online help and graphical interfaces. Click below to access the web page of the "CaSCADE" asynchronous tool package, where you can download Minimalist (including extensive tutorial slides and setup instructions), available for Linux platforms.
Go to the "CaSCADE" web page to download this tool (click here)
(b) The ATN_OPT Toolset, release v0.1: The "ATN_OPT" Toolset is a comprehensive CAD package for the automated synthesis and optimization of robust dual-rail asynchronous threshold networks. It supports circuit descriptions in several common formats (Verilog/VHDL/BLIF), and supports cell libraries defined in GENLIB format. It allows several user-specified optimization targets: area, delay, power, and delay-area tradeoffs. It also includes a user shell. Click below to access the web page of the "CaSCADE" asynchronous tool package, where you can download ATN_OPT (including tutorial slides and setup instructions), available for Linux platforms.
Go to the "CaSCADE" web page to download this tool (click here)
(c) The DES (Discrete Event System) Analyzer, release v0.1: The "DES Analyzer" is a comprehensive CAD package for performance analysis and timing verification of concurrent digital systems. It includes two tools: (i) "DES-PERF", which uses user-supplied stochastic information to compute asymptotic system performance, and (ii) "DES-TSE", which uses user-supplied min/max delay bounds on individual events to compute the global min/max "time-separation-of-events" between any two pairs of events. The DES-TSE tool is especially useful in determining which orderings of concurrent events are impossible in the actual global evolution of a concurrent system (going from startup to steady-state) -- potentially useful for optimizing the system -- as well as providing min/max bounds on the system's cycle time (and hence min/max bounds on system throughput). The tool accepts system specifications in the form of a restricted class of Petri net (i.e. "marked graph"), and includes several user options, graphical interfaces, and detailed output reports. Click below to access the web page of the "CaSCADE" asynchronous tool package, where you can download the DES Analyzer (and tutorial slides and setup instructions), available for Linux platforms.
Go to the "CaSCADE" web page to download this tool (click here)