The problem with Moore's Law
While Moore's Law has fueled the semiconductor industry, it
has also fueled a spiral of increasing costs and shrinking fab
customer bases. As transistors have shrunk, the cost of fabricating a
semiconductor device has grown commensurately. Although the fabrication
cost per transistor has steadily declined, other expenses have
skyrocketed, resulting in increased total cost. For example, small
features are more susceptible to process variation than larger ones,
increasing the range of variation and the proportion of faulty
chips. In addition, the smaller the transistor, the more transistors
fit in a given area of silicon. As a result, circuit
complexity has increasingly outstripped designer productivity,
a phenomenon referred to as Moore's Law's corollary of "compound
complexity".
Industry has dealt with these challenges by increasing the
engineering effort that goes into each chip. This effort manifests
itself as larger design teams, longer product cycles, or often
both at once. The vast majority of this engineering effort is incurred
once per chip design, and does not vary with the number of chips
produced. Accordingly, this expense is called the non-recurring
engineering cost (NRE) of a chip. Industry analysts estimate that the
NREs for a typical 90nm standard cell ASIC can range from $5M up to
$50M.
Maintaining a particular price per chip in the face of
skyrocketing NREs requires larger and larger batches of chips. This is
because the single NRE is shared evenly across the population of chips
produced. The larger the population, the smaller the impact of the NRE
on individual chip cost, so chips produced in large batches cost less
than chips produced in small batches. Growing NREs are pushing the
line that divides "small" from "large" higher and higher. As a result,
only high-volume chip manufacturers, or
those who can sell smaller batches at high prices, can afford to be in
the chip business.
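The amortization argument above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming a simple model in which the per-chip cost is the NRE divided by the batch size plus a fixed recurring cost per chip. The $5M and $50M NRE figures come from the estimate quoted earlier; the $10 recurring cost and $20 target price are hypothetical values chosen only for illustration.

```python
import math


def cost_per_chip(nre_dollars, unit_cost_dollars, volume):
    """Per-chip cost when a one-time NRE is shared evenly across `volume` chips."""
    if volume <= 0:
        raise ValueError("volume must be positive")
    return nre_dollars / volume + unit_cost_dollars


def break_even_volume(nre_dollars, unit_cost_dollars, target_price_dollars):
    """Smallest batch size whose per-chip cost does not exceed the target price."""
    margin = target_price_dollars - unit_cost_dollars
    if margin <= 0:
        raise ValueError("target price must exceed the recurring unit cost")
    return math.ceil(nre_dollars / margin)


# Hypothetical recurring cost and target price; NRE range taken from the text.
UNIT_COST = 10
TARGET_PRICE = 20
for nre in (5_000_000, 50_000_000):
    volume = break_even_volume(nre, UNIT_COST, TARGET_PRICE)
    print(f"NRE ${nre:,}: need at least {volume:,} chips")
```

Under these assumed numbers, a tenfold increase in NRE raises the required batch from 500,000 to 5,000,000 chips, which is the sense in which growing NREs move the line between "small" and "large" batches.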
Moreover, at the same time that complexity and engineering
effort have been soaring, the commercial market has been demanding and
rewarding short chip design cycles. This is due to shrinking product
lifetimes and the increasing competitive importance of being the first
to market with a new product. A technology that succeeds in reducing
engineering effort will attack both the cost of chip
preparation and its time to market. This research seeks to
develop such a technology: one that reaps the benefits of Moore's Law
(e.g., high clock speeds, integration) without incurring the downsides
(e.g., high NRE costs, long time to market). A viable technology with
these characteristics would serve markets that today are economically
unreachable.
Brick and Mortar Chips
We propose a system, called brick and mortar, which is
designed to allow fabricated ASICs to be used in many different chip
designs. In this way, brick and mortar achieves high volume usage of
the individual ASIC components while producing small
batches of any given chip. The purpose is to reduce the non-recurring
engineering costs as much as possible while maintaining, to the
largest extent possible, the other benefits of ASICs.
At the heart of the brick and mortar manufacturing technique are two
architectural components: bricks, which are mass-produced pieces of
silicon containing processor cores, memory arrays, small gate arrays, DSPs,
FFT engines, and other IP (intellectual property) blocks; and mortar,
also called the I/O cap, which is a mass-produced silicon substrate containing
inter-brick communication infrastructure and I/O support. In the brick and
mortar process, engineers design chips by assembling an application-specific
layout of bricks. This arrangement of bricks is then bonded, as illustrated
below, to the I/O cap that interconnects them.
Applications can execute on this chip exactly as they would on a traditional
chip. Because this chip is constructed from discrete dies, however, the gap
between communication within a functional core and communication between functional
cores is more pronounced. Thus, it is especially critical for performance on
such a chip that an application be carefully partitioned and mapped to the
functional cores.
Brick and mortar chips offer a number of key advantages:
- Reduced cost. This is the chief motivation for brick and mortar
chips. The aim is to produce a low-cost alternative to ASIC chips. The
cost savings of brick and mortar stem from mass production of the
constituent parts. Although the components themselves are ASICs, they are
produced in bulk to be reused in a variety of end user designs. This reuse
amortizes the design and verification cost of the components across multiple
products. In addition, bricks are small, resulting in lower individual
design and verification costs to begin with.
- Compatible design flow. Today ASIC designers employ significant
amounts of existing IP to produce chips. This improves design reliability
and saves design time. In such design flows, the IP blocks are provided as
"gateware" netlists. The designer integrates these netlists into a
complete design which is then manufactured. Brick and mortar is compatible
with this design flow, merely moving the bricks from design modules, which
fit into synthesis tool flows, to physical bricks, which fit into a
manufacturing flow. The IP blocks are pre-manufactured physical entities
which will be bonded to a general purpose communication substrate, the I/O
cap. In many ways, bricks are the modern-day analogue of the 7400 series of
logic, and the I/O cap is the modern wire-wrap board. Rather than spin
custom ASICs for products, engineers could purchase these pre-fabricated
components and bond them together as needed.
- ASIC-like speed and power. Because most of the logic of a brick
and mortar chip exists within a single ASIC component, its performance, in
speed and power, will tend to be closer to that of an ASIC than to that of other custom logic
implementation techniques, such as field programmable gate arrays (FPGAs).
When a design calls for it, gate array bricks can implement any necessary
custom logic.
- Mixed process integration. Bricks must comply with a standard
physical and logical interface. They do not, however, have to be
built from the same underlying technology. This makes it readily
possible to mix and match bulk CMOS, SOI, DRAM and other process
technologies into the same chip.
- Improved speed. A subtle benefit of brick and mortar
production is that, under certain circumstances, it can produce
higher-performing large chips than an ASIC process. This is because bricks
can be sorted by speed grade, and chips then assembled from
bricks of like grade.
- Improved yield. Large brick and mortar chips can have a higher
yield than large ASICs. The advantage comes from assembling a large chip
out of many smaller components. The smaller the component, the higher the
yield. One can test component bricks before assembling them, ensuring only
functional bricks are included in any assembly, and resulting in an
extremely high overall yield.
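The speed and yield arguments in the list above can be illustrated numerically. The sketch below is not taken from the original work: it assumes a simple Poisson defect model for die yield, assumes that a chip built from pretested bricks can fail only at the bonding step, and assumes that a chip runs at the speed of its slowest brick. The die area, defect density, bond yield, and brick speeds are all hypothetical.

```python
import math


def poisson_yield(area_cm2, defects_per_cm2):
    """Fraction of defect-free dies under a Poisson defect model (an assumption)."""
    return math.exp(-area_cm2 * defects_per_cm2)


def assembled_yield(num_bricks, bond_yield):
    """Yield of a chip built from pretested bricks, where only bonding can fail."""
    return bond_yield ** num_bricks


def chip_speeds(brick_speeds_mhz, bricks_per_chip):
    """Clock of each chip when bricks are grouped in list order.

    Each chip is assumed to run at the speed of its slowest brick.
    """
    groups = [brick_speeds_mhz[i:i + bricks_per_chip]
              for i in range(0, len(brick_speeds_mhz), bricks_per_chip)]
    return [min(group) for group in groups]


# Hypothetical numbers, chosen only for illustration.
area, density, num_bricks = 4.0, 0.5, 16
print(f"monolithic die yield: {poisson_yield(area, density):.3f}")
print(f"single brick yield:   {poisson_yield(area / num_bricks, density):.3f}")
print(f"assembled chip yield: {assembled_yield(num_bricks, 0.999):.3f}")

speeds = [900, 1200, 1000, 1100, 950, 1150, 1050, 1250]
print(chip_speeds(speeds, 4))                        # arbitrary grouping
print(chip_speeds(sorted(speeds, reverse=True), 4))  # grouped by speed grade
```

With these assumed values, a small brick yields far better than one large die of the same total area, and grouping bricks by speed grade produces one chip that runs at 1100 MHz, where arbitrary grouping tops out at 950 MHz.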
These benefits will not come for free. Brick and mortar chips will achieve
them only through careful design of the necessary hardware, software, and
manufacturing subsystems:
- Communication network. The communication infrastructure to be
implemented in the I/O cap presents a particular challenge. It must be
fixed in silicon well before the brick and application layers that are to
use it have been determined. Because of this, the communication
infrastructure must be general purpose, while also endeavoring to maintain
as much of the performance of an application-specific network as
possible. Combining high performance with generality is a
delicate balance.
- Brick family design. The more applications that can be implemented with
brick and mortar, the greater the cost savings the system can offer. It is
important to keep this need for re-use in mind when designing a brick
family. Specifically, the bricks must have appropriate sizes and useful
functions.
- Component assembly. Brick and mortar chips introduce a new step
to the chip fabrication pipeline: die assembly. It is essential that the
assembly technique one uses not be so costly that it erases the savings
gained elsewhere by brick and mortar. The assembly options that are
available trade off assembly speed and cost.
- Software. Because the cores reside on different physical dies, there
is a greater "distance" between cores in a brick and mortar chip than in a
traditional chip. This makes the quality of the application's mapping to
computational cores extremely important to overall performance.
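To see why the mapping matters, consider a toy cost model. This is only a sketch under stated assumptions: the function name, the task names, the traffic volumes, and the per-word costs (1 for communication within a brick, 10 for communication between bricks) are hypothetical and are not drawn from the original work.

```python
def communication_cost(traffic, placement, local_cost=1, inter_brick_cost=10):
    """Total weighted communication cost of a task-to-brick placement.

    traffic:   {(task_a, task_b): words exchanged between the two tasks}
    placement: {task: name of the brick the task is mapped to}
    The per-word costs are hypothetical; inter-brick links are assumed slower.
    """
    total = 0
    for (task_a, task_b), words in traffic.items():
        same_brick = placement[task_a] == placement[task_b]
        total += words * (local_cost if same_brick else inter_brick_cost)
    return total


# Hypothetical application: two tasks that talk heavily and one that rarely does.
traffic = {("fft", "filter"): 100, ("filter", "control"): 5, ("fft", "control"): 1}
good = {"fft": "brick0", "filter": "brick0", "control": "brick1"}
poor = {"fft": "brick0", "filter": "brick1", "control": "brick0"}

print(communication_cost(traffic, good))
print(communication_cost(traffic, poor))
```

In this example, keeping the two heavily communicating tasks on the same brick gives a cost of 160, while separating them gives 1051. A real mapping tool would also have to account for brick capacity and the topology of the I/O cap network, which this sketch ignores.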
Publications