11.1 Introduction

Two 3D IC heterogeneous integrations by fan-out wafer-level packaging (FOWLP) technology are presented in this chapter. The emphasis of the first is on the design; the emphasis of the second is on the manufacturing process. Heterogeneous integration versus SoC (system-on-chip) is briefly discussed, and many examples of TSV (through-silicon via)-less heterogeneous integration by FOWLP are presented. Because the MCM (multichip module) is the forerunner of heterogeneous integration, it is briefly mentioned first.

11.2 Multichip Module (MCM)

MCM integrates different chips and discrete components side-by-side on a common substrate such as ceramic, silicon, or organic to form a system or subsystem for high-end networking, telecommunication, servers, and computer applications. Basically, there are three different kinds of MCM, namely, MCM-C, MCM-D, and MCM-L.

11.2.1 MCM-C

MCM-C are multichip modules that use thick film technology such as fireable metals to form the conductive patterns, and are constructed entirely from ceramic or glass-ceramic materials, or possibly, other materials having a dielectric constant above five. In short, an MCM-C is constructed on ceramic (C) or glass-ceramic substrates [1].

11.2.2 MCM-D

MCM-D are multichip modules on which the multilayered signal conductors are formed by the deposition of thin-film metals on unreinforced dielectric materials with a dielectric constant below 5 over a support structure of silicon, ceramic, or metal. In short, MCM-D uses deposited (D) metals and unreinforced dielectrics on a variety of rigid bases [1].

11.2.3 MCM-L

MCM-L are multichip modules which use laminate structures and employ PCB (printed circuit board) technology to form predominantly copper conductors and vias. These structures may sometimes contain thermal expansion controlling metal layers. In short, MCM-L utilizes PCB technology of reinforced organic laminates (L) [1].

Much research was performed on MCMs during the 1990s. Unfortunately, because of the high cost of ceramic and silicon substrates, the limited line width and spacing of laminate substrates, and business obstacles such as the difficulty of obtaining bare chips, high-volume manufacturing (HVM) of MCMs never materialized, except for some niche applications. Since then, MCM has been a “dirty” word in semiconductor packaging.

11.3 System-in-Package (SiP)

11.3.1 Intention of SiP

SiP integrates different chips and discrete components, as well as 3D chip stacking of either packaged chips or bare chips (e.g., wide-bandwidth memory cubes and memory on logic with TSVs), side-by-side on a common (silicon, ceramic, or organic) substrate to form a system or subsystem for smartphones, tablets, high-end networking, telecommunication, server, and computer applications. SiP technology performs horizontal as well as vertical integration. Some people also call SiP a vertical MCM or 3D MCM.

11.3.2 Actual Applications of SiP

Unfortunately, because of the high cost of TSV technology [2, 3], this form of SiP never materialized for smartphones and tablets. Most SiPs that went into HVM in the past 10 years are actually MCM-L for low-end applications such as smartphones, tablets, smart watches, medical devices, wearable electronics, gaming systems, consumer products, and internet of things (IoT)-related products [4] such as smart homes, smart energy, and smart industrial automation. Most actual applications of SiPs by OSATs (outsourced semiconductor assembly and test providers) integrate two or more dissimilar chips and some discrete components on a common laminated substrate.

11.3.3 Potential Applications of SiP

High-price, high-margin, high-end products are potential applications of SiP; dual-lens camera modules are one example. At present, however, such a SiP cannot be produced by the OSATs alone, because it also requires optical design, testing, lenses, micromotors, flexible substrates, and system integration capabilities, which still need to be strengthened.

11.4 System-on-Chip (SoC)

Moore’s law [5] has been driving the system-on-chip (SoC) platform. Especially in the past 10 years, SoCs have been very popular for smartphones, tablets, and the like. SoCs integrate ICs with different functions into a single chip for a system or subsystem. Two typical SoC examples are given below.

11.4.1 Apple Application Processor (A10)

The application processor (AP) A10 is designed by Apple and manufactured by TSMC using its 16 nm process technology. It consists of a 6-core graphics processing unit (GPU), two dual-core central processing units (CPUs), two blocks of static random access memory (SRAM), etc. The chip area (11.6 mm × 10.8 mm) is 125 mm² (Fig. 11.1a).

Fig. 11.1 SoC platforms for the A10 and A11 APs

11.4.2 Apple Application Processor (A11)

The application processor A11 is also designed by Apple and is manufactured using TSMC’s 10 nm process technology. The A11 integrates more functions, including a tri-core Apple-designed GPU, a neural engine for Face ID, etc. However, its chip area (89.23 mm²) is about 30% smaller than that of the A10 because of Moore’s law, i.e., the feature size has been reduced from 16 nm to 10 nm (Fig. 11.1b).
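The area comparison between the two processors can be checked with a few lines of arithmetic. The sketch below uses only the dimensions quoted above; the "ideal" ratio, which assumes that every linear dimension scales with the node name, is an illustrative assumption and not a claim about the actual designs.

```python
# Die dimensions and areas as quoted in the text; nothing is measured independently.
A10_WIDTH_MM, A10_HEIGHT_MM = 11.6, 10.8
A11_AREA_MM2 = 89.23


def relative_shrink(old_area: float, new_area: float) -> float:
    """Fractional area reduction when going from old_area to new_area."""
    return (old_area - new_area) / old_area


a10_area = A10_WIDTH_MM * A10_HEIGHT_MM           # about 125 mm^2
shrink = relative_shrink(a10_area, A11_AREA_MM2)  # about 0.29, i.e. "about 30%"

# Assumption for comparison only: if all linear dimensions scaled with the
# node name (16 nm -> 10 nm), area would scale by (10/16)^2.
ideal_area_ratio = (10 / 16) ** 2
actual_area_ratio = A11_AREA_MM2 / a10_area

print(f"A10 area: {a10_area:.1f} mm^2")
print(f"A11 is {shrink:.0%} smaller than A10")
print(f"Area ratio, ideal vs. actual: {ideal_area_ratio:.2f} vs. {actual_area_ratio:.2f}")
```

The actual ratio is larger than the idealized one, which is consistent with the A11 integrating more functions than the A10.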

11.5 Heterogeneous Integration

Some of the early research in heterogeneous integration was reported by the Georgia Institute of Technology [6,7,8], which demonstrated a differential Si CMOS (complementary metal–oxide semiconductor) receiver IC (operating at 1 Gbps) integrated with a large-area thin-film InGaAs/InP I-MSM (metal–semiconductor–metal) photodetector (Fig. 11.2). Today, most heterogeneous integrations focus on higher density, finer pitch, and more complex systems.

Fig. 11.2 InGaAs/InP I-MSM integrated onto differential Si CMOS receiver IC

11.5.1 Heterogeneous Integration Versus SoC

Why is heterogeneous integration of such great interest? One of the key reasons is that the end of Moore’s law is fast approaching, and it is becoming more and more difficult and costly to reduce the feature size (to do the scaling) to make SoCs.

Heterogeneous integration contrasts with SoCs as follows. Heterogeneous integration uses packaging technology to integrate dissimilar chips (either side-by-side or stacked) with different functions from different foundries, wafer sizes, and feature sizes (as shown in Fig. 11.3) into a system or subsystem on different (e.g., organic, silicon, or RDL) substrates, rather than integrating most of the functions into a single chip and going for a finer feature size. Heterogeneous integration and SiP are similar, except that heterogeneous integration is for finer-pitch and higher-density applications.

Fig. 11.3 Heterogeneous integration or SiP

11.5.2 Advantages of Heterogeneous Integration

Over the next few years, we will see higher levels of heterogeneous integration, whether for time-to-market, performance, form factor, power consumption, signal integrity, or cost. Heterogeneous integration is going to take some market share away from SoCs in high-end applications such as high-end smartphones, tablets, wearables, networking, telecommunication, and computing devices. How should these dissimilar chips talk to each other, however? The answer is redistribution layers (RDLs) [9]. How should those RDLs be made? One key method is FOWLP technology.

11.6 Heterogeneous Integration on Organic Substrates

Today, the most common applications of heterogeneous integration are on organic substrates, the so-called SiP. The assembly methods are usually SMT (surface mount technology), including flip chips with mass reflow as shown in Fig. 2.16a, and wire bonding of chips on board. In general, this is for low-end to middle-end applications.

11.6.1 Amkor’s SiP for Automobiles

Amkor’s SiP for automobiles focuses on autonomous driving, infotainment, ADAS (advanced driver-assistance systems), and the computer in a car. Figure 11.4a, b shows a couple of examples. It can be seen from Fig. 11.4a that the 42.5 mm × 42.5 mm infotainment organic substrate supports the processor and DDR (double data rate) memories, while in Fig. 11.4b the 55 mm × 72 mm organic substrate supports the network switch, ASIC (application-specific integrated circuit), and memories.

Fig. 11.4 Amkor’s SiP for automobiles. a 42.5 mm × 42.5 mm infotainment organic substrate. b 55 mm × 72 mm organic substrate

11.6.2 Apple Watch II (SiP) Assembled by ASE

Through USI (Universal Scientific Industrial), ASE is the sole backend provider for Apple’s custom-designed S2 SiP modules (Fig. 11.5) for use in the Apple Watch II. It can be seen from Fig. 11.5 that there are 42 chips on an organic substrate. These include discrete passive components such as capacitors and resistors, as well as ASIC, processors, controller, converter, DRAM (dynamic random access memory), NAND, Wi-Fi, NFC (near-field communication), GPS (global positioning system), sensors, etc.

Fig. 11.5 Apple’s smart watch S2 SiP assembled by ASE

11.6.3 Cisco’s ASIC and HBM on Organic Substrate

Figure 11.6 shows a 3D system-in-package (SiP) designed and manufactured with a large organic interposer (substrate) with fine-pitch and fine-line interconnections by Cisco/eSilicon [10]. The organic interposer has a size of 38 mm × 30 mm × 0.4 mm. The linewidth, spacing, and thickness of the conductors on the front side and backside of the organic interposer are the same: 6, 6, and 10 μm, respectively. A high-performance ASIC die measuring 19.1 mm × 24 mm × 0.75 mm is attached on top of the organic interposer along with four HBM (high-bandwidth memory) DRAM die stacks. Each 3D HBM die stack, with a size of 5.5 mm × 7.7 mm × 0.48 mm, includes one base buffer die and four DRAM core dice, which are interconnected with TSVs and fine-pitch micro-pillars with solder caps. This is a high-end application.
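As a rough illustration of how much of the organic interposer is occupied by silicon, the footprints quoted above can be summed and compared with the interposer area. This is a minimal sketch: it uses only the dimensions given in the text and ignores keep-out zones, underfill fillets, and any passive components, none of which are specified here.

```python
# Footprints (width, length) in mm, as quoted in the text for the Cisco/eSilicon SiP.
INTERPOSER = (38.0, 30.0)
ASIC = (19.1, 24.0)
HBM_STACK = (5.5, 7.7)
NUM_HBM_STACKS = 4


def area(footprint):
    """Planar area in mm^2 of a (width, length) footprint."""
    width, length = footprint
    return width * length


interposer_area = area(INTERPOSER)                        # 1140 mm^2
die_area = area(ASIC) + NUM_HBM_STACKS * area(HBM_STACK)  # ASIC plus four HBM stacks
utilization = die_area / interposer_area                  # fraction covered by dies

print(f"Interposer area: {interposer_area:.0f} mm^2")
print(f"Total die footprint: {die_area:.1f} mm^2")
print(f"Die-to-interposer area ratio: {utilization:.0%}")
```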

Fig. 11.6 Cisco’s networking system with organic interposer

11.6.4 Intel’s CPU and Micron’s HMC on Organic Substrate

Figure 11.7 shows Intel’s Knights Landing CPU with Micron’s HMC (hybrid memory cube), which has been shipping to Intel’s select customers since the second half of 2016. It can be seen that the 72-core processor is supported by 8 multichannel DRAMs (MCDRAM) based on Micron’s HMC technology. Each HMC consists of 4 DRAMs and a logic controller (with TSVs), and each DRAM has >2000 TSVs with C2 bumps (Fig. 2.6). The CPU and the DRAM + logic controller stack are attached to an organic package substrate. Micron’s current HMC assembly process uses low-force TCB (thermocompression bonding) with CUF (capillary underfill), as shown in Fig. 2.16b. This is a high-end application.

Fig. 11.7 Intel’s Knights Landing and Micron’s HMC on an organic substrate

11.7 Heterogeneous Integration on Silicon Substrates (SoW)

In general, heterogeneous integrations on silicon substrates are for multichips on silicon wafer or system-on-wafer (SoW). The assembly methods are usually flip chips-on-wafers (CoW) with TSVs (through-silicon vias) with mass reflow (Fig. 2.16a) or with thermocompression bonding (Fig. 2.16b, c) for very fine pitches. In general, this is for high-end applications.

11.7.1 Leti’s SoW

One of the early applications of SoW is given by Leti [11, 12] as shown in Fig. 11.8. It can be seen that a system of chips such as ASIC and memories, PMIC (power management IC) and MEMS (microelectromechanical systems) are on a silicon wafer with TSVs. After dicing, the individual unit becomes a system or subsystem and can be attached on an organic substrate or stand alone.

Fig. 11.8 Leti’s SoW

11.7.2 Xilinx/TSMC’s CoWoS

In the past few years, because of the very high-density, high-I/O, and ultrafine-pitch requirements of devices such as the sliced field-programmable gate array (FPGA), even an organic package substrate with 12 build-up layers (6-2-6) is not enough to support the chips, and a TSV interposer is needed [13,14,15,16,17,18,19,20,21,22]. For example, Fig. 11.9 shows Xilinx/TSMC’s sliced FPGA chip-on-wafer-on-substrate (CoWoS) [15,16,17]. It can be seen that the TSV (10 µm diameter) interposer (100 µm deep) has four top RDLs: three Cu damascene layers and one aluminum layer. The 10,000+ lateral interconnections between the sliced FPGA chips are made mainly by the 0.4 µm pitch (minimum) RDLs of the interposer. The minimum thickness of the RDLs and passivation is <1 µm. Each FPGA has more than 50,000 microbumps (200,000+ microbumps on the interposer) at 45 µm pitch, as shown in Fig. 11.9.
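The microbump figures quoted above can be cross-checked with simple arithmetic. The sketch below treats the quoted counts as lower bounds and assumes, purely for illustration, that the microbumps sit on a full square grid at the quoted pitch; the actual bump layout is not given in the text.

```python
# Counts and pitch as quoted in the text (both counts are lower bounds).
MICROBUMPS_PER_FPGA = 50_000
MICROBUMPS_ON_INTERPOSER = 200_000
MICROBUMP_PITCH_UM = 45.0

# Number of FPGA slices implied by the two quoted counts.
implied_slices = MICROBUMPS_ON_INTERPOSER // MICROBUMPS_PER_FPGA

# Assumption: full square grid, so each bump occupies one pitch-by-pitch cell.
cell_area_mm2 = (MICROBUMP_PITCH_UM / 1000.0) ** 2
min_bump_field_mm2 = MICROBUMPS_PER_FPGA * cell_area_mm2

print(f"Implied number of FPGA slices: {implied_slices}")
print(f"Minimum bump-field area per slice: {min_bump_field_mm2:.2f} mm^2")
```

Under these assumptions, each slice needs a bump field of roughly 100 mm² just to accommodate its microbumps at 45 µm pitch.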

Fig. 11.9 Xilinx/TSMC’s CoWoS

11.7.3 Analog Devices’ MEMS on ASIC Wafer

Figure 11.10 shows Analog Devices’ MEMS on ASIC wafer. It can be seen that the MEMS chip is bonded on the ASIC wafer with TSVs. After the wafer is diced into individual units, they can be attached to the PCB (printed circuit board) with solder bumps (balls).

Fig. 11.10 Analog Devices’ MEMS on ASIC wafer

11.7.4 AMD’s GPU and Hynix’s HBM on TSV-Interposer

Figure 11.11 shows AMD’s Radeon R9 Fury X graphics processing unit (GPU), shipped in the second half of 2015. The GPU is built on TSMC’s 28 nm process technology and is supported by four HBM cubes manufactured by Hynix. Each HBM consists of four DRAMs with C2 bumps and a logic base with TSVs straight through them. Each DRAM chip has >1000 TSVs. The GPU and HBM cubes are on top of a TSV interposer (28 mm × 35 mm), which is fabricated by UMC with a 64 nm process technology. The final assembly of the TSV interposer (with C4 bumps as shown in Fig. 2.4) on a 4-2-4 organic package substrate (fabricated by Ibiden) is by ASE.

Fig. 11.11 AMD’s GPU and Hynix’s HBM on Si interposer

11.7.5 Nvidia’s GPU and Samsung’s HBM2 on TSV-Interposer

Figure 11.12 shows Nvidia’s Pascal 100 GPU, which was shipped in the second half of 2016. The GPU is built on TSMC’s 16 nm process technology and is supported by four HBM2 (16 GB) fabricated by Samsung. Each HBM2 consists of four DRAMs with C2 bumps and a base logic die with TSVs straight through them. Each DRAM chip has >1000 TSVs. The GPU and HBM2s are on top of a TSV interposer (1200 mm²), which is fabricated by TSMC with a 64 nm process technology. The TSV interposer is attached to a 5-2-5 organic package substrate with C4 bumps.

Fig. 11.12 Nvidia’s GPU and Samsung’s HBM2 on Si interposer

11.7.6 UCLA’s SoW

Figure 11.13 shows the complete fabricated Si-IF (silicon interconnect fabric) by UCLA [23]. It can be seen that the test Si-IF has 4 dielets of size (4 mm × 4 mm) with an interconnect pitch of 10 μm and with a total of 640,000 connections. The Si-IF is fabricated using conventional Si-based BEOL (back end of line) processing with up to four levels of conventional Cu damascene interconnects with wire pitches in the range of 1–10 μm and is terminated with Cu pillars of 2–5 μm height and diameter also using a damascene process. Au-capped Cu–Cu thermocompression direct bonding has been used.
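The quoted total of 640,000 connections follows directly from the dielet size and the interconnect pitch if each dielet carries a full area array of pads. The short check below makes that assumption explicit; the pad layout itself is not described in the text.

```python
# Values as quoted in the text for the UCLA Si-IF test vehicle.
DIELET_SIDE_MM = 4.0
PITCH_UM = 10.0
NUM_DIELETS = 4

# Assumption: a full square area array of pads at the quoted pitch.
pads_per_side = round(DIELET_SIDE_MM * 1000 / PITCH_UM)  # pads along one edge
pads_per_dielet = pads_per_side ** 2
total_connections = NUM_DIELETS * pads_per_dielet

print(f"Pads per dielet edge: {pads_per_side}")
print(f"Pads per dielet: {pads_per_dielet:,}")
print(f"Total connections: {total_connections:,}")
```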

Fig. 11.13 UCLA’s SoW

11.8 Heterogeneous Integration on RDLs

Recently, in order to lower the package profile, enhance the performance, and lower the cost, heterogeneous integration on RDLs has become very popular, especially with the FOWLP technology. In general, this is for middle-end to high-end applications.

11.8.1 Xilinx/SPIL’s TSV-Less SLIT

In the past few years, the through-silicon via (TSV)-less interposer [24] to support flip chips has been a very hot topic in semiconductor packaging. In 2014, Xilinx/SPIL proposed a TSV-less interposer for sliced FPGA chips called silicon-less interconnect technology (SLIT) [25]. The upper right-hand corner of Fig. 11.14 shows the new packaging structure, and the left-hand corner shows the old one. It can be seen that the TSVs and most of the interposer are eliminated and only the four RDLs needed for performance, mainly for the lateral communication of the sliced FPGA chips, remain.

Fig. 11.14 Xilinx/SPIL’s SLIT

The SLIT process flow is shown in Fig. 11.15. It starts with fabricating the RDLs on a bare silicon wafer (Fig. 11.15a); examples can be seen in [9]. This is followed by chip-to-wafer bonding (i.e., bonding the FPGA chip to the silicon wafer with RDLs; Fig. 11.15b) and underfilling/curing (Fig. 11.15c). The whole wafer is then overmolded with an epoxy mold compound (EMC), followed by backgrinding the overmold to expose the backside of the chips and attaching an optional reinforcement wafer on the backside of the chips (Fig. 11.15d). Backgrinding of the silicon wafer follows (Fig. 11.15e). Next come passivation, photoresist, mask, patterning, etching, sputtering TiCu, photoresist, mask, and patterning (Fig. 11.15f). Finally, Cu-contact pad plating (Fig. 11.15g), photoresist stripping, TiCu etching, and controlled-collapse chip connection (C4) wafer bumping are done (Fig. 11.15h). The final assembly of the heterogeneous integration package on the substrate and then on the PCB is shown in Fig. 11.16.

Fig. 11.15 Process flow for implementing SLIT technology

Fig. 11.16 Final assembly of the Xilinx/SPIL’s SLIT

Depending on the line width/spacing of the RDLs’ conductive wiring, the RDLs can be fabricated either by using a polymer for the dielectric layer and Cu plating for the conductive wiring (line width/spacing ≥5 μm), or by using plasma-enhanced chemical vapor deposition (PECVD) to make the SiO₂ dielectric layer and Cu damascene plus chemical mechanical polishing (CMP) to make the conductive wiring (line width/spacing <5 μm). In 2016, SPIL/Xilinx published a similar paper [26] with more characterization results, including warpage data, and called the technology non-TSV interposer (NTI).
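The selection rule for the RDL fabrication method can be written as a small function. This is only a sketch of the rule as stated above; the names `RdlMethod`, `select_rdl_method`, and `POLYMER_THRESHOLD_UM` are hypothetical and introduced here for illustration.

```python
from dataclasses import dataclass

# Threshold quoted in the text: polymer + Cu plating at or above 5 um,
# PECVD SiO2 + Cu damascene + CMP below 5 um.
POLYMER_THRESHOLD_UM = 5.0


@dataclass(frozen=True)
class RdlMethod:
    """Hypothetical record describing how an RDL layer is built."""
    dielectric: str
    conductor: str


def select_rdl_method(line_space_um: float) -> RdlMethod:
    """Pick the RDL fabrication route from the line width/spacing in micrometers."""
    if line_space_um <= 0:
        raise ValueError("line width/spacing must be positive")
    if line_space_um >= POLYMER_THRESHOLD_UM:
        return RdlMethod(dielectric="polymer", conductor="Cu plating")
    return RdlMethod(dielectric="PECVD SiO2", conductor="Cu damascene + CMP")


for ls in (10.0, 5.0, 2.0, 0.4):
    method = select_rdl_method(ls)
    print(f"{ls:>4} um -> {method.dielectric} / {method.conductor}")
```

For example, a 0.4 µm line width/spacing falls on the damascene side of the threshold, while a 10 µm line width/spacing falls on the polymer side.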

11.8.2 Amkor’s TSV-Less SLIM

In 2015, Amkor announced a technology very similar to SLIT, called silicon interposer-less integrated module (SLIM) [27].

11.8.3 Intel’s TSV-Less EMIB (RDL) for FPGA and HBM

Intel proposed embedded multi-die interconnect bridge (EMIB) [28] RDLs to replace the TSV interposer [29]. The lateral communication between the chips is handled by the embedded silicon bridge, while the power/ground and some signals go through the organic package substrate, as shown in Fig. 11.17. There are two major tasks in fabricating the organic package substrate with EMIB. One is to make the EMIB, and the other is to make the substrate with EMIB. To make the EMIB, one must first build the RDLs (including the contact pads) on a Si wafer. The way to make the RDLs depends on the line width/spacing of the conductive wiring of the RDLs. Finally, the non-RDL side of the Si wafer is attached to a die-attach film, and the Si wafer is singulated.

Fig. 11.17 Intel’s TSV-less interposer: EMIB

To make the substrate with an EMIB, first place the singulated EMIB with the die-attach film on top of the Cu foil in the cavity of the substrate (Fig. 11.18a). This is followed by laminating a resin film on the whole organic package substrate. Then, holes (vias) are drilled in the epoxy resin and filled by Cu plating to make connections to the contact pads of the EMIB. Cu plating continues to make the lateral connections of the substrate, as shown in Fig. 11.18b. Next, another resin film is laminated on the whole substrate, and holes are drilled in the resin and filled by Cu plating to make the contact pads (Fig. 11.18c). (Smaller pads on a finer pitch are for microbumps, while larger pads on a coarser pitch are for ordinary bumps.) The organic package substrate with an EMIB is then ready for bonding of the chips, as shown in Fig. 11.18d.

Fig. 11.18 Assembly process of Intel’s EMIB

On November 9, 2015, Altera/Intel announced the industry’s first heterogeneous integration devices that integrate stacked HBM from SK Hynix with high-performance Stratix® 10 FPGAs and SoCs as shown in Fig. 11.19. It can be seen that the TSV interposer is gone and replaced by Intel’s EMIB.

Fig. 11.19 Intel’s FPGA and HBM on EMIB

It is interesting to note that in order to use the EMIB, the chips will have different kinds/sizes of bumps as shown in Fig. 11.19, i.e., C4 bumps and microbumps (Cu pillar + solder cap). Wafer bumping and flip chip assembly could be challenging.

11.8.4 EMIB (RDL) for Intel’s CPU and AMD’s GPU

On November 6, 2017, Intel formally revealed that it had been working on a new series of processors that combine its high-performance x86 CPU cores with AMD GPUs (Radeon Graphics), as shown in Fig. 11.20, in the same processor package (heterogeneous integration) using Intel’s own EMIB multi-die technology. Intel also announced that it is bundling the design with the latest high-bandwidth memory (HBM), as shown schematically in Fig. 11.21.

Fig. 11.20 Intel’s CPU and AMD’s GPU on EMIB

Fig. 11.21 Schematic of Intel’s CPU, AMD’s GPU, and Hynix’s HBM on EMIB

11.8.5 STATS ChipPAC’s FOFC-eWLB

At ECTC 2013, STATS ChipPAC proposed [30, 31] using the fan-out flip chip (FOFC)-eWLB to make the RDLs for the chips to perform mostly lateral communications, as shown in Fig. 11.22. It can be seen that the TSV interposer, wafer bumping, fluxing, chip-to-wafer bonding, cleaning, and underfill dispensing and curing are eliminated.

Fig. 11.22 STATS ChipPAC’s TSV-less FOFC-eWLB

11.8.6 ASE’s FOCoS

In 2016, ASE [32] proposed using the fan-out wafer-level packaging (FOWLP) technology (chip-first and die-down on a temporary wafer carrier and then overmolded by the compression method) to make the RDLs for the chips to perform mostly lateral communications, as shown in Fig. 11.23; the technology is called fan-out wafer-level chip-on-substrate (FOCoS). The TSV interposer, wafer bumping of the chips, fluxing, chip-to-wafer bonding, cleaning, and underfill dispensing and curing are eliminated. The bottom RDL is connected to the package substrate using under bump metallurgy (UBM) and C4 bumps, as shown in Fig. 11.23.

Fig. 11.23 ASE’s FOCoS

11.8.7 MediaTek’s RDLs by FOWLP

In 2016, MediaTek [33] proposed similar TSV-less interposer RDLs fabricated with FOWLP technology as shown in Figs. 11.24 and 11.25. Instead of the C4 bump, they used a microbump (Cu pillar + solder cap) to connect the bottom RDL to the 6-2-6 package substrate.

Fig. 11.24 Schematic of MediaTek’s RDLs by FOWLP

Fig. 11.25 SEM images of MediaTek’s RDLs by FOWLP

11.9 3D IC Heterogeneous Integration by FOWLP

A low-profile and low-cost 3D IC heterogeneous integration of the application processor chipset by FOWLP is presented in this section. The emphasis is placed on the design of the package.

11.9.1 Application Processor with FOWLP

The A10 and A11 application processors are packaged using TSMC’s InFO (integrated fan-out) wafer-level packaging method [34,35,36,37,38,39,40,41,42,43,44]. The mobile dynamic random access memories (DRAMs) are wire bonded on a 3-layer core-less package substrate and the substrate is area-array solder balled on top of the application processor package—a package-on-package (PoP) format as shown schematically in Fig. 11.26. The interconnections between the application processor and the mobile DRAMs are mainly through the RDLs, through-InFO vias (TIVs), solder balls, and core-less substrate.

Fig. 11.26 PoP for packaging the application processor and mobile memory

11.9.2 Application Processor by 3D IC Heterogeneous Integration with FOWLP

A new 3D IC heterogeneous integration by FOWLP, as shown in Fig. 11.27, is proposed in this chapter. It consists of the SoC, chips, and the mobile DRAMs. Their interconnections are mainly through the RDLs, which can be fabricated by the FOWLP method. The total RDL thickness depends on the number of layers; a 3-layer RDL is usually about 40 µm thick. The DRAMs (≤50 µm thick) are cross-stacked with wire bonds and then encapsulated. The diameter of the solder ball is usually 200 µm.

Fig. 11.27 3D IC heterogeneous integration by FOWLP

Figure 11.28 shows a special case of Fig. 11.27 (when there is no other chip and the SoC is the application processor). Comparing the new design (Fig. 11.28) with that of Fig. 11.26 (the 3D IC heterogeneous integration vs. the PoP), it is obvious that: (1) the new design leads to a lower package profile; (2) the new design has fewer interconnects; (3) the new design is more reliable because of fewer interconnects; (4) the new design has better electrical performance; and (5) the new design leads to lower cost.

Fig. 11.28 3D IC heterogeneous integration to package the application processor chipset

The manufacturing process of the proposed 3D IC heterogeneous integration is very simple. First, the device wafer has to be modified by sputtering an under bump metallurgy (UBM) and electroplating a Cu contact pad (for building the RDLs later), as shown in Fig. 11.29. This step is followed by spin coating a polymer on top of the device wafer and laminating a die-attach film (DAF) at the bottom of the device wafer. Meanwhile, a light-to-heat conversion (LTHC) layer is spin coated onto the temporary glass carrier wafer. Then the individual known-good die (KGD) (chip) from the device wafer is placed face-up on the LTHC carrier. This step is followed by epoxy molding compound (EMC) dispensing, compression molding, and finally, post mold cure (PMC). These steps are followed by backgrinding the EMC and polymer to expose the Cu contact pad for making the RDLs and for mounting the solder balls, as shown in Fig. 11.29. This is the conventional FOWLP method to package the application processor [34,35,36,37,38,39,40,41,42,43,44], as shown in Chap. 6.

Fig. 11.29 Manufacturing process for packaging the application processor chipset

There are two methods to attach the mobile DRAMs to the bottom of the application processor fan-out wafer-level package. The first method comprises the following steps: (1) removing the glass carrier by a laser (Fig. 11.30a); (2) dicing the reconstituted wafer into strips with individual packages (Fig. 11.30b); (3) wire bonding the memory chips to the bottom side of the individual package (Fig. 11.30c, d); and (4) glob topping the wires and memory chips with an encapsulant (Fig. 11.30c, d).

Fig. 11.30 Wire bonding memory chip at the bottom of the individual application processor package

The second method to attach the mobile DRAMs to the bottom of the application processor fan-out wafer-level package comprises the following steps: (1) wire bonding the memory chips to the bottom side of every package on the reconstituted wafer; (2) glob topping the wires and memory chips with an encapsulant; and (3) then dicing the reconstituted wafer into individual packages (Fig. 11.31).

Fig. 11.31 Wire bonding memory chip at the bottom of the application processor package on a wafer

Figure 11.32 is a special case of Fig. 11.27. It applies when it is difficult and costly to reduce the feature size to make the SoC. In that case, some of the functions (for example, the GPU) are not integrated into the SoC, and the GPU chip is placed side-by-side with the SoC.

Fig. 11.32 3D IC heterogeneous integration to package the application processor chipset

In [21], we asked the question: “What if there is no PoP for the application processor chipset?” We proposed to place the application processor and the mobile DRAMs side-by-side on a build-up package substrate. The memory chips can be either cross-stacked or individually placed by wire bonding. Also, the memory chips can be placed individually by solder-bumped flip chips. The memory chips can even be stacked and have TSVs. In this study, because we used the FOWLP method to construct the RDLs for the interconnections between the SoC and mobile DRAMs as shown in Fig. 11.33, the build-up package substrate was eliminated.

Fig. 11.33 2D IC heterogeneous integration to package the application processor chipset

11.10 3D IC High-Performance Heterogeneous Integration by FOWLP

A high-performance 3D IC heterogeneous integration of CPU, GPU, FPGA, ASIC, HBM, etc., is presented in this section. The emphasis is placed on the manufacturing process.

11.10.1 High-Performance 3D IC Heterogeneous Integration System

Figure 11.34 schematically shows a 3D IC high-performance heterogeneous integration by FOWLP technology. It can be seen that it consists of a GPU, an FPGA (field-programmable gate array), a CPU, or a high-performance application-specific integrated circuit (ASIC), surrounded by high-bandwidth memory (HBM) cubes. Each HBM cube consists of four DRAMs and a logic base with through-silicon vias (TSVs) [2, 3] straight through them. Each DRAM chip has >500 TSVs. The interconnections between the GPU/FPGA/CPU/ASIC and HBMs are through the RDLs. The major heat path of this structure is from the backside of the GPU/FPGA/CPU/ASIC to the heat spreader. A heat sink can be added on top of the heat spreader if necessary.

Fig. 11.34 3D IC high-performance heterogeneous integration by FOWLP

11.10.2 Manufacturing Process

In this case, the emphasis is placed on the manufacturing method (process) of this structure. The method comprises these steps: (1) testing for KGD of device wafers; (2) sputtering UBM; (3) electroplating the Cu contact pad; (4) spin coating a polymer on top of the device wafers; and (5) painting a thermal interface material (TIM) on the bottom (backside) of the device wafers (Fig. 11.35). The last step differs from the conventional method (the first case), in which a DAF is laminated on the bottom of the device wafers.

Fig. 11.35 Manufacturing method for 3D IC high-performance heterogeneous integration

After the steps outlined above are completed, the following are done: (1) picking and placing the individual KGDs face-up on a metal carrier (such as copper, aluminum, steel, or alloy 42, with thermal expansion coefficient = 8 to 10 × 10⁻⁶/°C) about 1 mm thick; (2) molding the EMC on the reconstituted wafer by the compression method and then post mold curing (PMC) the EMC; (3) backgrinding the EMC and polymer to expose the Cu contact pad; (4) building up the RDLs; and (5) mounting the solder balls. Then, the reconstituted wafer is diced into individual packages (Fig. 11.35). (Note: this process differs from the conventional method, which uses a glass carrier coated with an LTHC release layer.)

11.10.3 Advantages of the New Manufacturing Process

It should be emphasized that unlike the conventional method, there is no debonding of the carrier. The metal carrier becomes the heat spreader of the individual high-performance heterogeneous integration package. This new method of manufacturing high-performance chips and memory cubes in a heterogeneous integration scheme with the FOWLP technology results in fewer assembly steps, lower cost, faster time-to-market, and higher assembly yield. Also, because of the metal carrier, the warpage is reduced during all the process steps. Furthermore, because of the metal carrier, the individual package size can be larger.
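The difference between the two process flows can be made concrete by writing each one as an ordered list of steps and comparing them. The step names below are paraphrased from the descriptions in Sects. 11.9.2 and 11.10.2; KGD testing is left out of both lists, and the helper `only_in` is a hypothetical name used only for this illustration.

```python
# Conventional FOWLP flow (glass carrier with LTHC release layer), paraphrased.
CONVENTIONAL_FLOW = [
    "sputter UBM",
    "electroplate Cu contact pad",
    "spin coat polymer on device wafer",
    "laminate DAF on wafer backside",
    "spin coat LTHC layer on glass carrier",
    "place KGD face-up on carrier",
    "compression mold EMC and post mold cure",
    "backgrind EMC and polymer to expose Cu pad",
    "build RDLs",
    "mount solder balls",
    "debond glass carrier by laser",
    "dice reconstituted wafer",
]

# Metal-carrier flow: the carrier stays in the package as the heat spreader.
METAL_CARRIER_FLOW = [
    "sputter UBM",
    "electroplate Cu contact pad",
    "spin coat polymer on device wafer",
    "apply TIM on wafer backside",
    "place KGD face-up on carrier",
    "compression mold EMC and post mold cure",
    "backgrind EMC and polymer to expose Cu pad",
    "build RDLs",
    "mount solder balls",
    "dice reconstituted wafer",
]


def only_in(flow_a, flow_b):
    """Steps that appear in flow_a but not in flow_b, in flow_a order."""
    return [step for step in flow_a if step not in flow_b]


removed_steps = only_in(CONVENTIONAL_FLOW, METAL_CARRIER_FLOW)
added_steps = only_in(METAL_CARRIER_FLOW, CONVENTIONAL_FLOW)

print("Removed:", removed_steps)
print("Added:", added_steps)
print(f"Step count: {len(CONVENTIONAL_FLOW)} -> {len(METAL_CARRIER_FLOW)}")
```

In this paraphrase, the metal-carrier flow drops the DAF lamination, the LTHC coating, and the laser debonding, and adds only the TIM application, which is consistent with the claim of fewer assembly steps.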

11.11 Summary and Recommendations

Two 3D IC heterogeneous integrations by FOWLP technology have been presented. The emphasis of the first 3D IC heterogeneous integration is on the design, and that of the other, the 3D IC high-performance heterogeneous integration, is on the manufacturing method. Some important results and recommendations are as follows:

  • A 3D IC heterogeneous integration of the application processor chipset has been proposed. The interconnections between the application processor and mobile DRAMs are through the RDLs, which are fabricated using the FOWLP method. The manufacturing processes for making the 3D IC heterogeneous integration have also been presented.

  • When it is difficult and costly to reduce the feature size to make the SoC, one way is not to integrate some of the functions (for example, the GPU) into the SoC and instead place the GPU chip side-by-side with the SoC.

  • The simplest heterogeneous integration of the application processor chipset is to place the application processor and the mobile DRAMs side-by-side on RDLs. One consideration is that the package size could be too large to be reliable. One of the alternatives is to stack up the mobile DRAMs by wire bonding (for lower cost) or TSVs (for wider bandwidth).

  • A 3D IC high-performance heterogeneous integration of GPU/FPGA/CPU/ASIC and HBM/HBM2 by FOWLP technology has been proposed. Emphasis is placed on a simple and effective manufacturing method to fabricate the structure. Unlike the conventional method, which uses a temporary carrier, there is no debonding of the metal carrier. The metal carrier becomes the heat spreader of the individual high-performance heterogeneous integration package.

  • The advantages of heterogeneous integration are time-to-market, performance, form factor, power consumption, signal integrity, and cost.

  • In order to lower the package profile and enhance the electrical and thermal performance of the application processor chipset for mobile applications such as smartphones and tablets, the current PoP format should be eliminated.

  • The recent advances of heterogeneous integrations on organic substrates, silicon wafers, and RDLs have been briefly mentioned.