SECTION 01

Logic Synthesis

1.1 Introduction to Synthesis

🧱 Start Here — What Is RTL?
RTL (Register Transfer Level) is the way engineers describe digital hardware in code, using languages like Verilog or VHDL. You write what the circuit does (for example assign out = in & enable; or always @(posedge clk) ...) without specifying exactly which physical transistors to use. Think of it like a recipe: RTL is the recipe, silicon transistors are the actual ingredients. Synthesis is the chef that takes that recipe and works out how to build it from whatever physical components are available in the foundry's library.
💡 Why Can't We Just Send RTL to the Foundry?
A foundry manufactures silicon using specific transistor sizes (5nm, 7nm, 28nm…). They have a library of pre-designed, pre-tested cells (AND gates, flip-flops, buffers) called the Standard Cell Library. Your RTL code says "I want an adder" but the foundry needs to know exactly which cells to connect together, how many transistors, and what wire connections to make. Synthesis bridges this gap — it translates your behavioral intent into real, physical gate instances from the foundry's library.
Key Terms — Defined Before We Go Further
🔵
RTL (Register Transfer Level)
Verilog/VHDL code describing circuit behavior at the level of registers and data transfers. Tells you what to compute, not how to implement it in transistors. Example: assign y = a & b; is RTL for an AND gate.
🔲
Standard Cell
A pre-designed, pre-verified logic building block from the foundry library — AND2, OR3, NAND2, DFF (flip-flop), BUF, INV, MUX2. Every cell has a fixed height (fits in a row), known delay, area, and power. Synthesis maps your RTL into thousands of these cells.
🗂️
Gate-Level Netlist
The output of synthesis — a file listing every standard cell instance and every wire connection between them. It's still text (Verilog), but instead of behavioral code, it says things like: AND2_X4 U101 (.A(net1), .B(net2), .Y(net3)); — a specific AND gate with specific connections.
📚
Technology Library (.lib)
The catalog of all available cells at a specific process node and PVT condition. Contains: cell delay vs load tables, setup/hold times, leakage power, area in µm². The synthesis tool uses this to pick the right cell for each function and estimate timing. Think of it as the ingredient nutrition label.
📋
SDC (Synopsys Design Constraints)
A file where you tell the synthesis tool your timing requirements: how fast the clock is, when data arrives at inputs, when outputs must be ready. Without SDC, synthesis has no idea what "fast enough" means and will produce a correct but potentially very slow netlist.
📐
QoR (Quality of Results)
How good the synthesis output is, measured across multiple dimensions: timing (WNS/TNS), area (mm² or cell count), power (mW), and DRC violations (at this stage, design-rule constraints such as max capacitance, max transition, and max fanout). A good synthesis engineer optimizes all four simultaneously, because improving one often hurts another.
📥
Inputs to Synthesis
RTL source files (.v / .sv / .vhd) — the behavioral description
Technology library (.lib / .db) — available cells and their properties
SDC constraints (.sdc) — timing requirements (clock, I/O delays)
UPF/CPF (optional) — low-power intent file
⚙️
What Synthesis Does
1. Parses and understands your RTL code (Elaboration)
2. Converts logic to technology-independent gates (Generic Mapping)
3. Maps those to real cells from your library (Technology Mapping)
4. Optimizes: minimize delay on critical paths, reduce area, lower power (Optimization)
📤
Outputs of Synthesis
Gate-level netlist (.v) — your RTL translated to real gates
Mapped SDC (.sdc) — constraints ready for PD tool
Reports — timing (WNS/TNS), area (mm²), power (mW), DRC violations
DDC database — for incremental re-runs
RTL-to-GDS Complete Chip Design Flow
[Figure: RTL to GDS II design flow]
RTL (Verilog/VHDL) → Synthesis (DC / Genus) → Netlist (gate-level) → Floorplan (Innovus/ICC2) → Place + CTS + Route → Sign-off (STA/DRC/LVS) → Tapeout (GDS II) → Foundry (silicon fab, wafer)
Stage groups shown in the figure: Front-End Design | Back-End / Physical Design
📌 Key Concept
Synthesis sits at the boundary of front-end and back-end design. The quality of synthesis directly impacts all downstream physical design steps — poor synthesis results in congestion, timing closure problems, and increased power.

1.2 Detailed Synthesis Flow

The synthesis flow transforms RTL into an optimized gate-level netlist through several distinct stages, each with specific goals and transformations.

[Figure: synthesis stage-by-stage flow, six boxes labeled Step 1 to Step 6]
RTL Input (Verilog / VHDL, .v / .sv files) → Elaboration (parse HDL, build hierarchy; read_design) → Generic Mapping (GTECH library, Boolean optimization) → Tech Mapping (target library; compile / syn_map) → Optimization (area/power/timing, DRC fixing; compile_ultra) → Netlist Out (gate-level .v + SDC + reports)
Side inputs: Technology Library (.lib / .db / .lef), Constraints (.sdc file)
🔍 What Happens at Each Stage
Step 1 — Elaboration: The tool reads your Verilog files and "understands" your design — like a compiler parsing code. It figures out what each module does, how they connect, and what kind of logic is needed (registers, adders, FSMs).

Step 2 — Generic Mapping: Converts your design into a technology-independent intermediate form using GTECH (generic) gates — simple AND/OR/NOT/FF operations with no size or speed information yet. Boolean optimization happens here: constant propagation, dead code removal, logic simplification.

Step 3 — Technology Mapping: Now the tool looks at your target library (.lib file) and replaces each generic gate with an actual physical cell from that library. An AND2 becomes AND2_X4 (4x drive strength), a flip-flop becomes DFF_X1. This is where cell selection decisions are made.

Step 4 — Optimization: Iteratively improve the design. Fix timing violations by upsizing cells or restructuring logic. Reduce area by downsizing non-critical cells. Insert clock gating to save power. This phase runs many passes until WNS/TNS meets your target.
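The Boolean clean-up mentioned under generic mapping (constant propagation and logic simplification) can be illustrated with a toy sketch. This is not how any synthesis tool is implemented; the tuple-based expression format and the `simplify` function are assumptions made purely for illustration.

```python
# Toy illustration of constant propagation and logic simplification.
# Expressions are nested tuples: ("and", a, b), ("or", a, b), ("not", a).
# Leaves are signal names (str) or the constants 0 and 1.
# The data format and function name are hypothetical, chosen for this sketch.

def simplify(expr):
    """Propagate constants and drop redundant terms, bottom-up."""
    if not isinstance(expr, tuple):
        return expr                          # a signal name or a constant
    op, *args = expr
    args = [simplify(a) for a in args]       # simplify children first

    if op == "not":
        (a,) = args
        return 1 - a if isinstance(a, int) else ("not", a)

    if op in ("and", "or"):
        dominant = 0 if op == "and" else 1   # x AND 0 = 0, x OR 1 = 1
        identity = 1 - dominant              # x AND 1 = x, x OR 0 = x
        if any(isinstance(a, int) and a == dominant for a in args):
            return dominant
        args = [a for a in args if not (isinstance(a, int) and a == identity)]
        if not args:
            return identity
        return args[0] if len(args) == 1 else (op, *args)

    raise ValueError(f"unknown operator: {op}")


# (a AND 1) OR (b AND 0) collapses to just "a"
print(simplify(("or", ("and", "a", 1), ("and", "b", 0))))
```

Real tools work on much richer netlist structures, but the idea is the same: once a constant is known, the logic it feeds can often be removed before any library cell is chosen.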
📋 Elaboration
Parses HDL source files, resolves module hierarchy, identifies registers, FSMs, and datapath elements. Builds an internal design representation (GTECH netlist) using generic logic cells independent of technology.
🗺️ Technology Mapping
Maps GTECH gates to cells from the target technology library (.lib). Uses pattern matching and tree-covering algorithms to find optimal cell selections that meet timing and area targets.

1.3 Synopsys Design Compiler (DC)

Design Compiler is the industry-standard synthesis tool from Synopsys. It supports hierarchical synthesis, compile strategies, and advanced optimization for timing, area, and power.

Key DC Commands
Command | Purpose | Key Options
analyze | Parse and check RTL source files | -format verilog / sverilog / vhdl, file list
read_verilog | Read Verilog sources (analyze and elaborate in one step) | -rtl, -netlist (use read_sverilog for SystemVerilog)
elaborate | Build design hierarchy | -parameters, -library
link | Resolve all design references | Must be called after elaborate
compile_ultra | Full compile with all optimizations | -no_autoungroup, -timing_high_effort_script
compile | Basic compile | -map_effort low/medium/high, -incremental_mapping
report_timing | Timing path reports | -max_paths N, -slack_lesser_than 0
report_area | Area statistics | -hier (hierarchical breakdown)
report_power | Power analysis | -analysis_effort high
write_file | Output gate-level netlist | -format verilog -hierarchy -output
write_sdc | Write constraints file | -version 2.0
set_dont_touch | Protect cells from optimization | Apply to specific instances
check_timing | Validate timing constraints | Reports unconstrained paths
Sample DC Synthesis Script (.tcl)
TCL — DC Synthesis Script
## =========================================================
## DC Synthesis Script — sample_chip.tcl
## Project: sample_chip | Author: VLSI Engineer
## =========================================================

## 1. Setup target/link libraries
set target_library    "saed32nm_tt1p05v25c.db"
set link_library      "* $target_library"
set symbol_library    "saed32nm.sdb"

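## 2. Analyze RTL sources (syntax check only; -format sverilog also accepts plain Verilog)
analyze -format sverilog {../rtl/top.v ../rtl/core.v ../rtl/alu.v}

## 3. Elaborate the top module, then link and check the design
elaborate    sample_chip
link
check_design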

## 4. Apply timing constraints
read_sdc "../constraints/sample_chip.sdc"

## 5. Set operating conditions
set_operating_conditions "tt1p05v25c"

## 6. Compile with high effort
compile_ultra -no_autoungroup -timing_high_effort_script

## 7. Reports (create the output directories first so the redirects cannot fail)
file mkdir rpt out
report_timing  -max_paths 10 -slack_lesser_than 0 -nosplit > rpt/timing.rpt
report_area    -hier                                              > rpt/area.rpt
report_power   -analysis_effort high                              > rpt/power.rpt
report_qor                                                         > rpt/qor.rpt

## 8. Write outputs
write_file -format verilog -hierarchy -output "out/sample_chip_netlist.v"
write_sdc  -version 2.0                             "out/sample_chip_mapped.sdc"
write_file -format ddc -hierarchy -output  "out/sample_chip.ddc"

puts "=== Synthesis Complete ==="
Sample SDC Constraint File
SDC — Timing Constraints
## =========================================================
## SDC Constraint File — sample_chip.sdc
## =========================================================

## Clock definition
create_clock -name CLK -period 5.0 -waveform {0 2.5} [get_ports clk]

## Clock uncertainty (jitter + skew)
set_clock_uncertainty -setup 0.15 [get_clocks CLK]
set_clock_uncertainty -hold  0.05 [get_clocks CLK]

## Clock transition
set_clock_transition  0.1 [get_clocks CLK]

## Input delays (relative to CLK edge)
set_input_delay  -max 1.5 -clock CLK [get_ports data_in*]
set_input_delay  -min 0.2 -clock CLK [get_ports data_in*]

## Output delays
set_output_delay -max 1.2 -clock CLK [get_ports data_out*]
set_output_delay -min 0.1 -clock CLK [get_ports data_out*]

## Drive strength and load
set_driving_cell  -lib_cell BUFX4 [get_ports data_in*]
set_load 0.05 [get_ports data_out*]

## False paths (async reset, test ports)
set_false_path -from [get_ports rst_n]
set_false_path -from [get_ports scan_en]

## Multicycle path (2-cycle computation)
set_multicycle_path 2 -setup -from [get_cells mult_inst*]
set_multicycle_path 1 -hold  -from [get_cells mult_inst*]

## Max capacitance / transition constraints
set_max_capacitance 0.2 [current_design]
set_max_transition  0.4 [current_design]

1.4 Cadence Genus

Genus is Cadence's modern synthesis solution featuring concurrent optimization and a unified data model with Innovus for seamless handoff.

Key Genus Commands
Command | Purpose
read_hdl -language sv | Read SystemVerilog/Verilog/VHDL sources
elaborate | Elaborate and link design hierarchy
read_mmmc | Read multi-mode multi-corner view definition
syn_generic | Generic synthesis (technology-independent)
syn_map | Technology mapping to library cells
syn_opt | Incremental optimization (timing/area)
report_timing | Report worst timing paths (legacy UI: report timing)
report_area | Report cell count and area (legacy UI: report area)
report_power | Dynamic and leakage power (legacy UI: report power)
write_hdl | Write gate-level netlist
write_sdc | Write timing constraints
DC vs Genus Comparison
Feature | Synopsys DC | Cadence Genus
Vendor | Synopsys | Cadence
Script Language | TCL (dc_shell) | TCL / Innovus-compatible
Compile Command | compile_ultra | syn_generic → syn_map → syn_opt
MMMC Support | Via scenario objects | Native via read_mmmc
PD Integration | ICC2 (write_icc2_files) | Innovus (write_db)
Physical Guidance | DC Topographical | Physical Guidance Mode
Industry Usage | Dominant | Growing

1.5 Timing Constraints (SDC)

Clock & I/O Timing Waveform — SDC Constraints Visualized

This diagram shows the complete setup timing budget for a register-to-register path through an I/O port. All SDC constraint values map directly to regions on the waveform. The available combinational logic window = Period − input_delay − output_delay − setup_margin.

[Figure: clock and I/O timing waveform. CLK at 200 MHz (Period = 5.0 ns, ± jitter) with launch edge t₀ and capture edge t₁. DATA_IN (port) arrives after input_delay -max = 1.5 ns. DATA_OUT (port) must be valid output_delay -max = 1.2 ns before the capture edge. T_su marks the flip-flop setup time.]

Available combo window = 5.0 − 1.5 − 1.2 = 2.3 ns

Clock uncertainty breakdown:
Jitter (PLL): ±50–100 ps
Skew (pre-CTS): ±100–200 ps
Margin (guardband): ~50 ps
Total (setup): 0.15 ns typical
Total (hold): 0.05 ns typical
Post-CTS: only jitter remains (propagated clocks)

Setup required time:
T_req = Period + T_clk_capture − T_uncertainty_setup − T_setup_FF = 5.0 + 0.0 − 0.15 − 0.09 = 4.76 ns

Setup slack:
Slack = T_req − T_arrival = 4.76 − (T_cq + T_combo + T_input_delay)
Slack ≥ 0 → PASS | Slack < 0 → VIOLATION
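The required-time and slack relations annotated on the waveform can be checked with a few lines of arithmetic. The sketch below reuses the numbers from the figure (5.0 ns period, 0.15 ns setup uncertainty, 0.09 ns flip-flop setup); the function names and the clock-to-Q and combinational delays are hypothetical values picked only to show one passing and one failing path.

```python
def setup_required_time(period, t_clk_capture, uncertainty_setup, t_setup_ff):
    """T_req = Period + T_clk_capture - T_uncertainty_setup - T_setup_FF (ns)."""
    return period + t_clk_capture - uncertainty_setup - t_setup_ff


def setup_slack(t_req, t_cq, t_combo, t_input_delay=0.0):
    """Slack = T_req - T_arrival, with T_arrival = T_cq + T_combo + T_input_delay."""
    t_arrival = t_cq + t_combo + t_input_delay
    return t_req - t_arrival


# Values from the figure
t_req = setup_required_time(period=5.0, t_clk_capture=0.0,
                            uncertainty_setup=0.15, t_setup_ff=0.09)

# Hypothetical path delays (assumed for illustration only)
slack_ok = setup_slack(t_req, t_cq=0.12, t_combo=4.30)
slack_bad = setup_slack(t_req, t_cq=0.12, t_combo=4.90)

print(f"T_req      = {t_req:.2f} ns")
print(f"slack (ok) = {slack_ok:+.2f} ns")
print(f"slack (bad)= {slack_bad:+.2f} ns")
```

A positive result means the path passes setup; a negative result is the amount by which data arrives too late.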
📐 Input Delay Explained
set_input_delay -max 1.5 -clock CLK [get_ports data_in*]

This tells the tool: upstream logic takes 1.5 ns of the clock period before data is valid at our input port. This is NOT a constraint we impose — it's a description of the external world. The tool uses it to compute the remaining time budget for our internal combinational logic. Tighter input delay = less margin for your combo path.
📐 Output Delay Explained
set_output_delay -max 1.2 -clock CLK [get_ports data_out*]

This says: the downstream chip needs our output to be valid 1.2 ns before its capture clock edge. The tool reserves this time from the end of the period. Together: available combo window = Period − input_delay − output_delay = 5.0 − 1.5 − 1.2 = 2.3 ns (before accounting for FF setup time and uncertainty).
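The I/O budget above is a simple subtraction, sketched here so the effect of each constraint is easy to try out. The function name and the `setup_margin` parameter are illustrative assumptions; the 5.0, 1.5 and 1.2 ns values come from the text, and the 0.15 ns margin reuses the setup uncertainty from the sample SDC.

```python
def io_combo_window(period, input_delay_max, output_delay_max, setup_margin=0.0):
    """Time left for internal combinational logic between the two ports (ns)."""
    return period - input_delay_max - output_delay_max - setup_margin


PERIOD = 5.0  # ns, from create_clock -period 5.0

window = io_combo_window(PERIOD, input_delay_max=1.5, output_delay_max=1.2)
window_with_margin = io_combo_window(PERIOD, 1.5, 1.2, setup_margin=0.15)

print(f"window             = {window:.2f} ns ({100 * window / PERIOD:.0f}% of the period)")
print(f"window with margin = {window_with_margin:.2f} ns")
```

Raising either delay by 0.1 ns shrinks the window by the same 0.1 ns, which is why over-budgeted I/O delays make an otherwise healthy block look hard to close.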
Complete SDC Command Reference
SDC Command | Analysis Type | What It Models | Example
create_clock | Both | Defines clock signal: period, waveform shape, source pin. Foundation of all timing analysis. | create_clock -period 5 -waveform {0 2.5} [get_ports clk]
create_generated_clock | Both | Clock derived from master clock (PLL output, divider). Must be declared for STA to analyze crossing paths. | create_generated_clock -divide_by 2 -source [get_ports clk] [get_pins div_reg/Q]
set_clock_uncertainty | Both | Models jitter + skew + margin. Setup uncertainty reduces the required time. Hold uncertainty adds to the minimum required time. | set_clock_uncertainty -setup 0.15 [get_clocks CLK]; set_clock_uncertainty -hold 0.05 [get_clocks CLK]
set_clock_transition | Both | Sets the clock rise/fall slew assumed at register clock pins while the clock network is still ideal (pre-CTS). | set_clock_transition 0.08 [get_clocks CLK]
set_input_delay -max | Setup | Latest external data arrival relative to clock. Reduces time budget for internal combo logic. | set_input_delay -max 1.5 -clock CLK [get_ports din*]
set_input_delay -min | Hold | Earliest external data arrival. Used for hold analysis. Without -min, hold on input paths is unconstrained. | set_input_delay -min 0.2 -clock CLK [get_ports din*]
set_output_delay -max | Setup | Time before the next clock edge by which the downstream chip needs our output stable. Eats into our combo budget. | set_output_delay -max 1.2 -clock CLK [get_ports dout*]
set_output_delay -min | Hold | Minimum time the downstream chip needs the output stable after our clock. Constrains the minimum combo path. | set_output_delay -min 0.1 -clock CLK [get_ports dout*]
set_false_path | Disable | Removes path from STA entirely. For async resets, test ports, clock MUX select pins: paths that never race functionally. | set_false_path -from [get_ports rst_n]
set_multicycle_path | Both | Allows N-cycle propagation. ALWAYS pair setup with hold correction (N-1). Missing hold fix → hold violations. | set_multicycle_path 2 -setup -from [get_cells mul*]; set_multicycle_path 1 -hold -from [get_cells mul*]
set_clock_groups | Async | Tells STA not to analyze paths between unrelated clocks. Essential for correct CDC handling in STA. | set_clock_groups -asynchronous -group {CLK_A} -group {CLK_B}
set_driving_cell | Setup | Models external driver strength at input ports. Without this, input transitions are ideal (zero-resistance). Affects input timing accuracy. | set_driving_cell -lib_cell BUFX4 [get_ports din*]
set_load | Setup | Models output port capacitive load (downstream PCB trace, other chip input). Affects output transition and delay. | set_load 0.05 [get_ports dout*]
💡 Pro Tip — Pre-CTS vs Post-CTS Uncertainty
Pre-CTS: Use set_clock_uncertainty -setup 0.15 to model total skew + jitter. This is pessimistic because skew is unknown.
Post-CTS: Switch to set_propagated_clock [all_clocks] in PrimeTime. The tool computes actual clock latencies through the synthesized clock tree. Only jitter uncertainty remains (typically 0.05–0.08 ns). This recovers significant timing margin — often 100–200 ps — that was previously modeled as skew pessimism.

1.6 Optimization Techniques

📐
Area Optimization
Logic sharing, constant folding, dead code elimination, cell downsizing. Minimize cell count and wire length. Use compile -map_effort high and set_max_area 0.
⏱️
Timing Optimization
Critical path restructuring, cell upsizing, buffer insertion, logic duplication for fanout reduction. Fix negative slack paths. Use compile_ultra -timing_high_effort_script.
Power Optimization
Clock gating insertion, operand isolation, multi-threshold voltage assignment (HVT/SVT/LVT), data activity propagation via switching activity files (SAIF).
Advanced Optimization Techniques
Technique | Description | Benefit
Retiming | Move registers across combinational logic to balance pipeline stages | Timing
Constant Propagation | Replace signals that are always 0/1 with constants; simplify downstream logic | Area
Logic Restructuring | Rearrange tree structures (AND/OR) to reduce critical path depth | Timing
Ungroup Hierarchy | Flatten sub-modules to enable cross-boundary optimization | Timing
Path Grouping | Group critical paths for prioritized optimization effort | Timing
Multi-Vt Assignment | Use HVT cells in non-critical paths, LVT on critical paths | Power+Timing

1.7 Quality of Results (QoR)

QoR is the overall measure of synthesis success across all objectives: timing, area, power, and design rule compliance. A good synthesis engineer tracks all four simultaneously — improving one often hurts another.

📊 Reading QoR Numbers — What Do They Mean?
After synthesis, you run report_qor and see numbers like WNS = −0.28ns, TNS = −15.4ns. Here is what that means:

WNS (Worst Negative Slack) = the single worst timing path in the design. −0.28ns means the most critical path is 0.28 nanoseconds too slow — data arrives 0.28ns after it needs to. This is the path you fix first.

TNS (Total Negative Slack) = the sum of all negative slacks across all violating paths. −15.4ns means that if you added up all the violations, the total shortfall to fix is 15.4ns. A large TNS with a small WNS means many paths are slightly violated (broad problem). A TNS close to the WNS means only one or a few paths violate, possibly badly (focused problem).

Goal: WNS ≥ 0 AND TNS = 0 — every single timing endpoint must pass. Even one path at −0.001ns is a failure at sign-off.
WNS
Worst Negative Slack
The single most critical path. Must be ≥ 0 for timing sign-off. This is what you fix first — it sets your maximum achievable frequency.
TNS
Total Negative Slack
Sum of ALL negative slacks across all endpoints. Indicates total work remaining. Must be exactly 0 at sign-off. Large TNS = many violations to fix.
WHS
Worst Hold Slack
Most critical hold violation. Hold failures happen at ALL frequencies — they are structural problems fixed with delay buffers, not by lowering frequency.
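To make the WNS/TNS definitions concrete, here is a minimal sketch that derives both from a list of endpoint slacks. The function name and the slack lists are made-up illustrations, not output from any tool, and tools differ in whether they report a positive worst slack as WNS or clamp it to zero.

```python
def qor_summary(endpoint_slacks_ns):
    """Summarize setup slacks: worst slack, total negative slack, violation count."""
    violations = [s for s in endpoint_slacks_ns if s < 0]
    return {
        "wns": min(endpoint_slacks_ns),     # worst (most negative) slack
        "tns": sum(violations),             # sum of negative slacks only
        "violating_endpoints": len(violations),
    }


# Hypothetical designs (values assumed for illustration)
broad = qor_summary([-0.05] * 40 + [0.10, 0.30])   # many small violations
focused = qor_summary([-0.90, 0.05, 0.20, 0.40])   # one badly failing path

for name, q in (("broad", broad), ("focused", focused)):
    print(f"{name:8s} WNS={q['wns']:+.2f} ns  TNS={q['tns']:+.2f} ns  "
          f"violations={q['violating_endpoints']}")
```

The first design has a mild WNS but a TNS forty times larger, pointing at a broad problem such as an over-tight clock; the second has WNS equal to TNS, pointing at a single path to restructure.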
QoR Improvement Checklist
Issue | Symptom | Fix | Priority
Setup violations | WNS < 0 | Upsize cells, remove logic levels, add pipeline stage | P0
Hold violations | WHS < 0 | Insert delay buffers on short paths | P0
High leakage power | report_power shows high static | Replace LVT with HVT on non-critical paths | P1
High dynamic power | Switching activity high | Enable clock gating, operand isolation | P1
DRC violations | Max cap/trans violations | Buffer high-fanout nets, fix transitions | P0
Large area | Area > target | set_max_area 0, downsize cells on non-critical paths | P2
SECTION 02

Physical Design

2.1 Introduction to Physical Design

🔧 What Is Physical Design and Why Does It Exist?
After synthesis you have a gate-level netlist — a text file listing cells and connections. But the foundry cannot manufacture a text file. They need a GDS II file — a precise geometric description of every shape of every metal layer at exact X,Y coordinates, measured in nanometers. Physical Design is the entire process of going from that netlist to GDS: figuring out where every cell physically sits on the silicon, how to distribute power to all of them, how to route every wire connecting them, and verifying it all meets manufacturing rules. PD is what transforms your design from "a list of logic" to "a physical chip that can be manufactured."

Physical Design converts a synthesized gate-level netlist into a manufacturing-ready GDS layout, determining the physical placement, power, clock, and routing of all cells.

PD Flow
Netlist (input) → Floorplan (die/core/IO) → Power Plan (VDD/VSS grid) → Placement (global + detail) → CTS (clock tree) → Routing (global + detail) → Sign-off (STA/DRC/LVS) → Tapeout (GDS II out)
STEP 1 Floorplanning
Define chip boundary (die area), place macros, I/O pins, and establish power domains. Sets utilization and aspect ratio constraints that guide all subsequent steps.
STEP 2 Power Planning
Create VDD/VSS rings around the core and stripes across the die. Ensure low IR drop and EM-safe current densities throughout the power network.
STEP 3 Placement
Place standard cells within the core area. Global placement minimizes wirelength. Detailed placement legalizes to rows. Timing-driven placement optimizes critical paths.
STEP 4 Clock Tree Synthesis (CTS)
Build a balanced clock distribution network minimizing skew (difference in clock arrival times at flip-flops) and insertion delay. Uses buffers and inverters to drive all clocked elements.
STEP 5 Routing
Connect all cell pins using metal interconnects. Global routing assigns regions. Detail routing assigns actual wires. Must satisfy all DRC rules (spacing, width, via enclosure).
STEP 6 Sign-Off Verification
Run STA with parasitic extraction, DRC (design rule check), LVS (layout vs schematic), and IR drop analysis. All checks must pass before tape-out.

2.2 Floorplanning

🧱 Start Here — What Is a Chip Physically Made Of?
Before floorplanning makes sense, you need to understand what actually sits on a piece of silicon. A chip is a layered sandwich: a silicon substrate at the bottom, transistors built on top of it, then alternating layers of metal wires and insulating oxide on top. All of these layers together form the die. Floorplanning is the step where you decide where on that silicon each piece of logic goes.
Anatomy of a Chip Die — Every Term Explained
[Figure: anatomy of a chip die. The wafer and scribe lines (dicing streets) surround the die; I/O, CLK, VDD and VSS pads sit along the die edge; the core area holds the logic plus two hard macros, an SRAM (512KB) and a PLL / clock generator.]
① Die: total silicon rectangle cut from the wafer
② I/O pad: connects chip to package pins / PCB
③ VDD/VSS power ring: wide metal loop distributing power
④ Core area: where std cells + macros are placed
⑤ Hard macro: pre-designed IP (SRAM, PLL, ROM). Fixed shape. Not synthesized.
⑥ Macro halo (keepout): no std cells allowed in this margin
⑦ Standard cell rows: horizontal strips (height = cell height). Logic gates + FFs snap to these rows.
⑧ Core-to-IO margin: gap between core and pad ring. Contains power rings + routing.
⑨ Scribe line: the diamond saw cuts here to separate dies
Every Floorplan Term — Explained From Scratch
① Die
The die (also called a chip) is a single rectangular piece of silicon cut from a wafer. Hundreds of identical dies are fabricated simultaneously on one 300mm wafer, then sawn apart. The die includes everything: pads, power rings, seal ring, and the core logic. Die area is measured in mm². Larger die = more expensive: fewer dies fit on each wafer and yield falls as area grows, so cost per die rises faster than linearly with area.
② I/O Pads
I/O pads are the connection points between your chip and the outside world (PCB board, other chips). They live around the perimeter of the die. Pads come in two kinds:

• Signal pads, for data inputs/outputs (bidirectional, input-only, or output-only)
• Power pads, for VDD (positive supply) and VSS (ground). Multiple VDD/VSS pads are used because each pad can only carry limited current.

In your SDC file, get_ports refers to these pad signals. The set_input_delay and set_output_delay constraints model the timing from/to these pads.
③ Core Area vs Die Area
The core area is the inner rectangle where all your synthesized logic (standard cells + macros) lives. The die area is the full silicon including the pad ring around it.

Think of it like a room (core) inside a building (die). The walls of the building hold the doors (I/O pads). The room is where all the furniture (logic) goes. The gap between the room walls and the building walls is used for corridors (power rings, routing channels) — this gap is called the core-to-IO margin (typically 20–50 µm).
④ Standard Cell
A standard cell is a pre-designed, pre-characterized logic gate (AND, OR, NAND, flip-flop, buffer, mux, etc.) from the technology library. Every standard cell has:

• A fixed height, measured in routing tracks (e.g., 9-track or 12-track, depending on the library); all cells in the same library share this height
• A variable width depending on complexity (a 4-input NAND is wider than a 2-input AND)
• VDD and VSS rails running along the top and bottom edges

Because all standard cells have the same height, they can be placed in rows like books on a shelf. The synthesis tool converts your Verilog RTL into thousands of standard cell instances. The PD tool then physically places them in the rows.
⑤ Hard Macro
A hard macro is a large, pre-designed block with a fixed physical layout — you cannot synthesize it or change its internals. Common examples:

SRAM — On-chip memory. Your CPU's cache or register file. Designed by memory compilers with optimized bit-cell layout.
PLL (Phase-Locked Loop) — Clock generator circuit. Analog design, cannot be synthesized.
ROM — Read-only memory for boot code or lookup tables.
Analog IP — ADC, DAC, SerDes PHY — all analog, all hard macros.

Hard macros are placed first during floorplanning, before any standard cells. Their position determines how efficiently the remaining logic can be placed and routed.
⑥ Macro Halo (Keepout Zone)
A macro halo is an exclusion zone around each hard macro where standard cells cannot be placed. Typical size: 2–5 µm on all sides.

Why? Because the macro's internal structure needs routing access around its edges (for signal and power connections). If standard cells are placed right up against the macro wall, the router has no room to route those connections — creating a routing deadlock.

It's like leaving a sidewalk around a building so people can walk to the entrance — if you park cars right up to the walls, nobody can get in.
⑦ Utilization
Utilization = how full is your core area with actual logic?

Utilization = (Total std cell area) / (Core area) × 100%

Why not 100%? Because you need space for:
• Routing channels (wires between cells)
• Clock buffers and power supply cells inserted during PD
• Filler cells and decap cells
• Spare cells for post-silicon ECO

Rule of thumb: 60–75% utilization is the sweet spot. Below 60% = die is wastefully large (costs more money per chip). Above 80% = routing becomes extremely congested and timing closure becomes very difficult or impossible.
⑧ Aspect Ratio
Aspect ratio = Core height / Core width. An aspect ratio of 1.0 means a perfect square core.

Most designs target 1:1 (square) because it minimizes average wire length (which minimizes delay and power). Non-square shapes are used when:
• I/O pad constraints require a specific shape (e.g., a chip with many memory interfaces on one side)
• Large hard macros naturally push the aspect ratio
• The package dictates the die shape

Extreme aspect ratios (e.g., 3:1 — very tall and thin) cause problems: clock distribution becomes unbalanced, wire lengths increase, and some areas become routability bottlenecks.
Utilization — Visualized
[Figure: three example floorplans and a utilization scale]
40% utilization: wasteful, die too big
65–75% utilization: TARGET, good density + routable gaps
90% utilization: CONGESTED, no routing room → DRC failures
Utilization vs impact (QoR from poor to good): <50% high cost | 60–75% ideal | 75–85% caution | >85% danger
Formulas with Worked Example
Core Utilization
Utilization = (Total Standard Cell Area) / (Core Area) × 100%
Core Area from Target Utilization
Core Area = Total Cell Area / Target Utilization
Aspect Ratio
AR = Core Height / Core Width (1.0 = square)
📐 Worked Example — How to Size a Floorplan
Given: Synthesis reports total cell area = 4.8 mm². Target utilization = 70%. Preferred square core.

Step 1: Core Area = 4.8 / 0.70 = 6.86 mm²
Step 2: For AR = 1.0 (square): Width = Height = √6.86 = 2.62 mm × 2.62 mm
Step 3: Add core-to-IO margin (say 40 µm each side): Die = (2.62 + 0.08) × (2.62 + 0.08) = 2.70 mm × 2.70 mm
Step 4: Verify: Do the hard macros (SRAMs) fit? If an SRAM is 1.2 mm × 0.8 mm, then with a 5 µm halo on each side it needs about a 1.21 mm × 0.81 mm footprint, which fits in the 2.62 mm × 2.62 mm core.
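The sizing steps in the worked example can be scripted as follows. This is a rough sketch: the function names are hypothetical, the macro check only compares bounding boxes (it says nothing about whether the remaining area is still placeable), and the 5 µm halo is an assumed value from the typical range quoted earlier.

```python
import math


def size_floorplan(cell_area_mm2, target_utilization, aspect_ratio=1.0, margin_um=40.0):
    """Return core and die dimensions in mm. aspect_ratio = core height / core width."""
    core_area = cell_area_mm2 / target_utilization
    core_width = math.sqrt(core_area / aspect_ratio)
    core_height = aspect_ratio * core_width
    margin_mm = margin_um / 1000.0
    return {
        "core_area": core_area,
        "core_width": core_width,
        "core_height": core_height,
        "die_width": core_width + 2 * margin_mm,    # margin on both sides
        "die_height": core_height + 2 * margin_mm,
    }


def macro_fits(fp, macro_w_mm, macro_h_mm, halo_um):
    """Bounding-box check only: macro plus halo must fit inside the core."""
    halo_mm = halo_um / 1000.0
    return (macro_w_mm + 2 * halo_mm <= fp["core_width"]
            and macro_h_mm + 2 * halo_mm <= fp["core_height"])


fp = size_floorplan(cell_area_mm2=4.8, target_utilization=0.70)
print(f"core area = {fp['core_area']:.2f} mm^2")
print(f"core      = {fp['core_width']:.2f} x {fp['core_height']:.2f} mm")
print(f"die       = {fp['die_width']:.2f} x {fp['die_height']:.2f} mm")
print("SRAM fits:", macro_fits(fp, 1.2, 0.8, halo_um=5.0))
```

Passing a different `aspect_ratio` (for example 2.0) keeps the same core area but reshapes it, which is a quick way to see how a tall or wide core changes the die outline.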
Floorplan Parameters — With Full Explanation
Parameter | Typical Value | What It Means | If You Get It Wrong
Core Utilization | 60–75% | Percentage of core area filled with standard cell logic. The rest is routing space + buffers. | >80%: router can't fit all wires → routing overflow → unroutable design. <50%: die is larger than needed → higher cost per chip.
Aspect Ratio | 1:1 (square) | Core height divided by core width. 1.0 = perfect square. Controls the shape of the chip. | Extreme ratios (3:1) make clock distribution and power delivery much harder. I/O pad count may also force non-square shapes.
Core-to-IO margin | 20–50 µm | The gap between the outer edge of the core and the inner edge of the I/O pad ring. Used for power rings (VDD/VSS) and routing channels to connect pads to core logic. | Too narrow: power rings don't fit, I/O connections cannot be routed. Too wide: wastes die area.
Macro halo | 2–5 µm | The empty forbidden zone around each hard macro where NO standard cells are placed. Required to leave room for the macro's own routing connections. | Without halo: standard cells crowd the macro edges → router cannot access macro pins → open circuits in the layout (LVS failures).

2.3 Power Planning

🔌 Start Here — Why Does Every Cell Need VDD and VSS?
Every single logic gate needs two power connections: VDD (positive supply, e.g. 1.0V) and VSS (ground, 0V). Without these, no transistor can switch. A chip with 10 million cells all simultaneously drawing current needs a power delivery highway network. Power planning builds this network. Get it wrong → cells starve for power → they slow down → timing failures.
🔴
VDD — Power Supply Rail
The positive voltage rail. At 28nm ≈ 0.9–1.05V, at 5nm ≈ 0.65–0.8V. Every cell's PMOS transistors connect here. When VDD wire has resistance, current causes a voltage drop — cells at the far end see less than VDD → slower switching → potential timing violations.
🔵
VSS — Ground Rail
The 0V reference. Every cell's NMOS transistors return current to VSS. Must also be low-resistance — a "bouncing" VSS from high return currents can cause ground bounce noise that flips logic states erroneously (functional failure!).
🏗️
The Power Delivery Hierarchy
Current path: PCB → Package pins → Solder bumps/Bond wires → I/O pads on die edge → Power rings (thick metal rings around core) → Power stripes (wide wires criss-crossing the core on upper metals) → Power rails inside std cell rows → Individual cell VDD/VSS pins. Each step adds resistance.
Decap Cells
Decoupling capacitor cells placed between VDD and VSS in empty spaces. They act as local charge reservoirs — when many cells switch simultaneously and demand a sudden surge of current, the decaps supply it instantly without waiting for current to travel from far-away pads. Reduces dynamic IR drop peaks.
IR Drop — Voltage Lost Along the Wire
V_drop = I (current drawn by cells) × R_metal (wire resistance) → Cell sees VDD − V_drop
Wire Resistance — Why Wider = Better
R = ρ × L / (W × T) where L=length, W=width, T=thickness, ρ=metal resistivity. Double W → half R → half IR drop
Electromigration — Maximum Safe Current Density
J = I / A_wire (A_wire = W × T, the wire cross-section) must stay below the foundry limit J_max. Lifetime follows Black's equation: MTTF = A × J^(−n) × e^(Ea/kT). Exceed J_max → wire fails within the product lifetime
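The three relations above (wire resistance, IR drop, current density) are easy to evaluate together. The sketch below is illustrative only: the function names are hypothetical, the resistivity is the approximate bulk value for copper (on-chip wires are typically more resistive), and the wire dimensions, current and J_max limit are assumed numbers, not foundry data.

```python
def wire_resistance(rho_ohm_m, length_m, width_m, thickness_m):
    """R = rho * L / (W * T)"""
    return rho_ohm_m * length_m / (width_m * thickness_m)


def ir_drop(current_a, resistance_ohm):
    """V_drop = I * R"""
    return current_a * resistance_ohm


def current_density(current_a, width_m, thickness_m):
    """J = I / (W * T), in A/m^2"""
    return current_a / (width_m * thickness_m)


RHO_CU = 1.7e-8          # ohm*m, approximate bulk copper (assumption)
J_MAX = 1.0e10           # A/m^2, hypothetical EM limit (assumption)
VDD = 1.0                # V, as in the IR-drop example in the text

length, thickness, current = 1e-3, 1e-6, 10e-3   # 1 mm, 1 um, 10 mA (assumed)

for width in (2e-6, 4e-6):                        # doubling W halves R
    r = wire_resistance(RHO_CU, length, width, thickness)
    v = ir_drop(current, r)
    j = current_density(current, width, thickness)
    print(f"W={width * 1e6:.0f} um  R={r:.2f} ohm  drop={v * 1e3:.1f} mV "
          f"({100 * v / VDD:.1f}% of VDD)  J={j:.2e} A/m^2  EM ok={j < J_MAX}")
```

With these assumed numbers, widening the stripe from 2 µm to 4 µm halves both the resistance and the voltage drop, and also halves the current density that electromigration depends on.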
⚠️ Consequence of IR Drop — A Real Example
If VDD drops 10% (1.0V → 0.9V) in a hot corner of the chip, transistors in that region become ~20% slower. Paths that barely meet 5ns timing now take 6ns → new setup violations appear at sign-off that weren't visible in pre-IR-drop STA. This is why IR drop is a mandatory sign-off check — STA without IR drop is not sign-off quality.
Power Grid Topology
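[Figure: power grid topology showing the VDD ring and VSS ring around the core, alternating VDD and VSS stripes across it, and decap cells placed between VDD and VSS.]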
⚡ IR Drop
Voltage reduction along the power rail due to resistive metal. Static IR drop: DC current × metal resistance. Dynamic IR drop: transient switching currents cause instantaneous dip. Must keep < 5–10% of VDD.
🌊 Electromigration (EM)
Gradual movement of metal atoms due to electron flow (current density). Causes open circuits over time. Limit: J < Jmax for each metal segment. Wider wires or via arrays reduce EM risk.

2.4 Placement

📍 Start Here — What Exactly Gets Placed?
After synthesis, you have a gate-level netlist — a list of cells (AND2, DFF, BUF...) and wires connecting them. But they have no physical location yet — it's like having all the components of a city but no map showing where each building goes. Placement decides the X,Y coordinates of every standard cell inside the core area. This decision is critical: cells that are logically connected should be physically close → shorter wires → less resistance and capacitance → faster timing → less power. Bad placement = long wires everywhere = timing closure becomes nearly impossible.
What Is a Standard Cell Row? (The Grid Cells Sit In)
📏 Standard Cell Rows — The Shelf System
The core area is divided into horizontal strips called rows, all of the same height. The height is set by the library's track count and the technology node: on the order of 1µm at 28nm, shrinking to a few hundred nanometres at the most advanced nodes. Every standard cell has exactly this height (or an integer multiple of it for multi-height cells), so all cells snap perfectly into rows, like books on a shelf. Each row has VDD and VSS power rails running horizontally through it. Cells in adjacent rows are flipped upside down so they share the power rails between rows, which halves the number of rails needed. Cells must be placed aligned to the row grid AND to a horizontal placement site (typically 0.09µm pitch). Any deviation = legalization violation.
[Figure: Before placement, the cells (AND, FF, OR, INV, BUF, NAND) are unplaced, at overlapping or random locations. After legal placement, the same cells are legalized and row-aligned.]
Stage | Description | Key Metric
Global Placement | Distributes cells across core to minimize total wirelength. Cells may overlap temporarily. | HPWL (half-perimeter wire length)
Legalization | Moves cells to legal rows, removes overlaps, snaps to row grid. | Cell displacement from global
Detailed Placement | Local cell swaps and moves to improve timing and routing. | WNS improvement
Congestion Reduction | Spread cells in congested areas, use placement blockages. | Routing overflow %
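HPWL, the global-placement metric in the table, is the half-perimeter of the bounding box around a net's pins, summed over all nets. A minimal sketch follows; the net names and pin coordinates are made up for illustration.

```python
# Half-perimeter wirelength (HPWL) of a placement.
# Pin coordinates are hypothetical, in arbitrary placement units.


def hpwl(pins):
    """Half-perimeter of the bounding box enclosing all (x, y) pins of a net."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


nets = {
    "n1": [(0, 0), (3, 4)],          # 2-pin net
    "n2": [(1, 1), (2, 5), (6, 2)],  # 3-pin net
}

per_net = {name: hpwl(pins) for name, pins in nets.items()}
total_hpwl = sum(per_net.values())

print(per_net)
print("total HPWL:", total_hpwl)
```

A global placer moves cells to reduce this total, because the bounding box is a cheap estimate of how much wire the router will later need for each net.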

2.5 Clock Tree Synthesis (CTS)

⏰ Start Here — The Clock Distribution Problem
Your chip has one clock source (e.g. a PLL output) but potentially millions of flip-flops that all need that clock signal. You cannot connect one wire from the source to every FF — a single wire driving 1,000,000 FFs would have astronomical capacitance → extremely slow transitions → the clock would barely toggle. Also, if the wire is very long, the signal takes different amounts of time to reach FFs at different corners of the die → clock skew — some FFs see the clock edge nanoseconds after others, which breaks timing. CTS solves this by building a tree of buffers that fans out the clock progressively, like a branching river delta, ensuring every FF gets a clean, fast clock edge at approximately the same time.
💡 What Is a Clock Buffer and Why Is It Needed?
A clock buffer is a standard cell with one input (the incoming clock) and one output (a buffered copy of the clock). Its job: take a weak, slightly degraded clock signal and reproduce it as a clean, strong signal that can drive many downstream loads. Without buffers: one wire from the PLL driving 100,000 FFs would have total capacitance of ~50pF → the clock signal would have a 5ns rise time → the "edge" would be a slow ramp instead of a sharp transition → setup and hold times cannot be met → chip fails. Clock buffers are inserted every few cells in the tree to keep the clock transition times under ~100ps at every FF.
Key CTS Concepts — From Scratch
🌳
What Is a Clock Tree?
A tree of clock buffers starting at the clock source and branching out to all flip-flop clock pins. Each level buffers the signal and drives the next level. A typical design might have 4–8 levels of buffering. The root drives 2–4 branches, each branch drives more sub-branches, eventually reaching individual FFs. The tree ensures signal integrity (clean transitions) and balances arrival times.
📐
Clock Skew
The difference in clock arrival time between any two flip-flops. If FF_A receives the clock edge at 1.00ns and FF_B at 1.25ns, the skew is 0.25ns. Skew matters because: positive skew (the capture clock arrives later than the launch clock) relaxes setup but tightens hold; negative skew tightens setup and relaxes hold. CTS target: local skew < 50ps, global skew < 200ps.
📏
Clock Insertion Delay (Latency)
The time from the clock source to a flip-flop's clock pin, through all the buffers and wires of the clock tree. Typical values: 0.3–1.5ns depending on design size and node. Latency itself doesn't cause problems — it's the difference in latency between FFs (skew) that causes timing issues.
🎯
Clock Uncertainty
A timing margin added to clock edges to account for: Jitter (cycle-to-cycle variation from PLL noise, typically 50–100ps), Skew (modeled as uncertainty pre-CTS), and extra guardband. Applied in SDC as set_clock_uncertainty. Post-CTS, skew is captured by propagated clock latencies, so only jitter+guardband remain.
Skew
Δ in clock arrival between FFs
Target: <50ps (local), <200ps (global)
Latency
Clock insertion delay
Source → FF clock pin delay
Uncertainty
Jitter + skew margin
Applied as timing margin in STA
[Figure: H-tree topology. The CLK source drives L1 buffers, which drive L2 buffers, which drive the flip-flop sinks. All CLK → FF paths have equal length: equal wire length → equal delay → minimal skew.]
[Figure: Clock skew waveform, period = 5 ns. CLK arrives at FF1 at t₁ and at FF2 at t₂. Skew = t₂ − t₁ = late arrival of CLK at FF2. Target after CTS: local < 50ps, global < 200ps.]
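Skew is just a subtraction of clock arrival times, so the definitions above are easy to sketch. FF_A and FF_B use the arrival times from the skew definition; FF_C and the list of launch/capture pairs are assumptions added for illustration, and "local skew" is taken here to mean the worst skew between flip-flops that share a timing path.

```python
# Clock arrival time at each flip-flop clock pin, in picoseconds.
# FF_A and FF_B follow the text; FF_C is an assumed extra sink.
arrival_ps = {"FF_A": 1000, "FF_B": 1250, "FF_C": 1100}

# Hypothetical launch -> capture pairs that share a data path.
paths = [("FF_A", "FF_B"), ("FF_C", "FF_A")]


def skew(launch, capture):
    """Capture arrival minus launch arrival (positive = capture clock is later)."""
    return arrival_ps[capture] - arrival_ps[launch]


def global_skew():
    """Largest arrival difference between any two sinks."""
    return max(arrival_ps.values()) - min(arrival_ps.values())


def local_skew():
    """Largest absolute skew between flip-flops that talk to each other."""
    return max(abs(skew(l, c)) for l, c in paths)


print("FF_A -> FF_B skew:", skew("FF_A", "FF_B"), "ps")
print("global skew:", global_skew(), "ps (target < 200)")
print("local skew:", local_skew(), "ps (target < 50)")
```

With these numbers both targets are missed, so CTS would need to rebalance the branch feeding FF_B.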
Key CTS Commands
Command | Tool | Purpose
ccopt_design | Innovus | Run CTS with concurrent optimization
set_ccopt_property | Innovus | Set CTS target skew, latency targets
clock_opt | ICC2 | Run clock tree optimization
set_clock_tree_options | ICC2 | Configure CTS parameters
report_clock_tree | Both | Report skew, latency, buffer count

2.6 Routing

🔗 Start Here — What Is Routing?
After placement, you know where every cell sits, but the cells are still disconnected — like buildings in a city with no roads between them. Routing draws the actual metal wires that connect every cell pin to every other cell pin according to the netlist. A modern chip might have 50–200 million net connections to route. The wires must: (1) actually connect what the netlist says, (2) satisfy all foundry DRC rules, (3) minimize wire length (affects timing and power), and (4) not cause crosstalk noise. This is done in two phases: global routing (plan the routes) and detailed routing (draw the actual shapes).
Metal Layers — Why Multiple Layers Exist
🏗️ Why Do Chips Have Multiple Metal Layers?
If you had only one metal layer, wires couldn't cross without shorting — like a city with only one road that can never have intersections. Multiple metal layers (M1, M2, M3… up to M15+ at advanced nodes) solve this: each layer's wires run in one direction (alternating horizontal/vertical), and vias connect between layers wherever a wire needs to change layers or connect to another wire. Lower metals (M1, M2) have thin, tight-pitch wires for local connections. Upper metals (M8+) have wide, coarse wires for global signals, power, and clock distribution — they carry more current and span longer distances.
What Is a Via? How Do Layers Connect?
🔩
Via — The Vertical Connection
A via is a small metal pillar that connects two adjacent metal layers vertically. Via between M1 and M2 is called V1 (Via 1). Between M2 and M3 is V2, etc. Vias have limited current capacity — high-current nets need via arrays (many parallel vias) to prevent electromigration. A missing or broken via = open circuit = LVS failure.
↔️
Preferred Routing Direction
Each metal layer has a preferred routing direction: M1 vertical, M2 horizontal, M3 vertical, M4 horizontal... (alternating). This orthogonal arrangement minimizes parallel-running wires on adjacent layers (reduces coupling capacitance / crosstalk). Routing against preferred direction is allowed but penalized by the router.
📏
Routing Track
The routing grid on each metal layer is divided into tracks — parallel lines at the minimum wire pitch. Each track can hold one wire. The number of tracks in a routing channel = available routing resources. When more wires need to cross a region than there are tracks → routing overflow / congestion → DRC violations or unroutable design.
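The track arithmetic behind congestion can be sketched directly. The region width, wire pitch, and demand below are assumed numbers, and the overflow percentage is one simple definition (excess wires relative to capacity); real routers report congestion per global-routing cell and may define it differently.

```python
# Routing capacity vs. demand for one region on one metal layer.
# All numbers are hypothetical.


def track_count(region_width_nm, pitch_nm):
    """Number of whole routing tracks that fit across the region."""
    return region_width_nm // pitch_nm


def overflow(demand, capacity):
    """Wires that cannot be given a track (0 if everything fits)."""
    return max(0, demand - capacity)


capacity = track_count(region_width_nm=2000, pitch_nm=80)
demand = 30  # wires that need to cross this region

excess = overflow(demand, capacity)
overflow_pct = 100.0 * excess / capacity

print(f"capacity = {capacity} tracks, demand = {demand} wires")
print(f"overflow = {excess} wires ({overflow_pct:.0f}% of capacity)")
```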
Metal Layer Stack (Color-Coded)
M8: Global bus
M6: Semi-global (horizontal)
M5: Semi-global (vertical)
M4: Intermediate (horizontal)
M3: Intermediate (vertical)
M2: Local (horizontal)
M1: Local (vertical, cell connections)
DRC Rule | Definition | Violation Impact
Spacing | Minimum distance between same-metal parallel wires | Short circuit risk, manufacturing defects
Width | Minimum wire width per metal layer | Higher resistance → IR drop, EM failure
Via enclosure | Metal must extend beyond via by minimum amount | Broken via connection on manufacturing variation
Antenna | Limits ratio of metal area to gate oxide area | Gate oxide damage during plasma etch
Density | Min/max metal fill requirements per layer | CMP non-uniformity → dishing/erosion

2.7 Physical Verification

🔍
DRC — Design Rule Check
Verifies the layout satisfies all foundry manufacturing rules (spacing, width, enclosure, density). Zero DRC violations required for tape-out. Tool: Calibre DRC (Siemens EDA, formerly Mentor Graphics).
⚖️
LVS — Layout vs Schematic
Compares extracted netlist from layout with the reference schematic. Verifies all connections are correct and no opens/shorts were introduced during PD. Tool: Calibre LVS.
🛡️
ERC — Electrical Rule Check
Checks for floating nodes, unconnected power/ground, improper biasing, ESD violations, latchup risk areas. Ensures circuit will function correctly electrically.
Common DRC Violation Examples
SPACING VIOLATION
Two parallel metal wires are drawn 4nm apart. Rule: min spacing = 8nm. 4nm < 8nm → VIOLATION.
WIDTH VIOLATION
A wire is drawn 3nm wide. Rule: min width = 8nm. 3nm < 8nm → DRC FAIL. The correct wire is 8nm wide.

2.8 PD Tool Knowledge

Command (Cadence Innovus) | Purpose
read_db | Import design from Genus (unified data model)
init_design | Initialize design with LEF/DEF/SDC
floorPlan | Define die/core size and utilization
addRing / addStripe | Create power rings and stripes
place_design | Run global and detailed placement
ccopt_design | Concurrent CTS and optimization
routeDesign | Global + detailed routing
extractRC | Parasitic extraction (RC)
timeDesign | In-tool timing analysis
streamOut | Generate GDS II for tape-out
Command (Synopsys ICC2) | Purpose
open_lib / open_block | Open design library and block
initialize_floorplan | Set die, core area, utilization
create_net_shape | Create power network shapes
connect_pg_net | Connect power/ground nets
place_opt | Placement with optimization
clock_opt | CTS with timing optimization
route_auto | Automatic global + detail routing
route_opt | Post-route optimization
write_gds | Output GDS stream
report_design | Design statistics and QoR
Feature | Cadence Innovus | Synopsys ICC2
Synthesis Handoff | Genus (write_db → read_db) | DC (write_icc2)
CTS Command | ccopt_design | clock_opt
Script Format | TCL / Encounter-style | TCL / IC Compiler style
STA Integration | Tempus (native) | PrimeTime (GoldRoute)
EM/IR Analysis | Voltus | StarRC + RedHawk
DRC/LVS | Calibre in-design | Calibre / IC Validator
Market Position | Strong | Strong
SECTION 03

Static Timing Analysis

3.1 Introduction to STA

🕐 Start Here — Why Does Timing Matter At All?
Every digital circuit operates on a clock — a signal that ticks millions or billions of times per second. On each tick, every flip-flop captures whatever data is at its input at that exact moment. The fundamental question STA answers is: "Does the data have enough time to travel from one flip-flop to the next between two consecutive clock ticks?" If yes → the chip works. If no → the chip captures wrong data → functional failure. STA checks this for every single one of the potentially millions of paths in your design.
Key Terms — Defined Before We Go Further
🔲
Flip-Flop (Register)
A memory element that captures (stores) its input D on the rising edge of the clock and presents it at output Q. Every register in your design is a flip-flop. Synthesis maps Verilog always @(posedge clk) blocks to flip-flop cells from the library.
🔀
Combinational Logic
Logic gates (AND, OR, NAND, MUX, adders…) between flip-flops. No memory — output depends only on current inputs. Data has to propagate through these gates within one clock period. The more gates in the path, the longer it takes, the lower the maximum frequency.
Clock Period
The time between two rising clock edges. A 1 GHz clock has period = 1ns. A 500 MHz clock has period = 2ns. All combinational logic between two FFs must finish within one period (minus setup time and clock uncertainty). The period sets your timing budget.
📍
Timing Path
The route data travels from a starting point (a FF output or input port) through combinational gates to an ending point (a FF input or output port). STA measures the propagation delay of every timing path and checks it against the constraint.
📊
Slack
Timing margin = Required Time − Actual Arrival Time. Positive slack (+) = data arrives early enough — timing is MET. Negative slack (−) = data arrives too late — timing VIOLATED. Goal: all slacks ≥ 0 at sign-off.
🏁
Critical Path
The timing path with the worst (most negative or least positive) slack. This is the bottleneck that limits your maximum operating frequency. Fixing the critical path is the primary goal of timing closure. WNS (Worst Negative Slack) = the slack of the critical path.
Why STA over Dynamic Simulation?
Dynamic simulation requires test vectors, is slow, and may miss rare corner-case paths. STA analyzes ALL paths statically in minutes, covering 100% of the design space including paths with near-zero functional probability. A design with 1M flip-flops has trillions of possible paths — only STA can check them all.
⚠️
STA Limitations — What It Cannot Catch
STA cannot catch functional logic bugs (wrong RTL behavior), doesn't simulate dynamic power behavior, and requires correctly specified constraints (garbage-in garbage-out). False paths and multicycle paths must be explicitly declared by the engineer — STA trusts what you tell it.
Setup Time & Hold Time Waveforms
[Figure: CLK and DATA waveforms showing the launch edge, the capture edge, the data arrival, the setup window before the capture edge, the hold window after it, and the setup slack (+ve = MET). Parameters: T_setup = FF setup time; T_hold = FF hold time; T_arrival = data path delay; T_required = clock period.]

3.2 Timing Path Types

🗺️ What Is a Timing Path?
A timing path is the route a signal takes through combinational logic from where it starts (a startpoint) to where it's captured (an endpoint). Startpoint = either a flip-flop's clock pin (Q output launches data) or an input port (external data enters the chip). Endpoint = either a flip-flop's data pin (D input captures data) or an output port (data leaves the chip). Everything in between is combinational logic: AND gates, OR gates, adders, muxes, inverters — all the logic that computes the result. The sum of all gate delays + wire delays along this path is the path delay that STA measures.

STA analyzes 4 fundamental path types in digital circuits. Every timing path has a startpoint (port or FF clock pin) and endpoint (FF data pin or output port).

PATH TYPE 1: INPUT → REGISTER
Input port → combinational logic → FF D-pin. Startpoint: input port | Endpoint: FF D-pin
PATH TYPE 2: REGISTER → REGISTER (Most Common)
FF1 clock pin (Q launches data) → combinational logic → FF2 D-pin. Both FFs are clocked by CLK (launch → capture).
PATH TYPE 3: REGISTER → OUTPUT
FF clock pin (Q launches data) → combinational logic → output port. Startpoint: FF clk pin | Endpoint: output port
PATH TYPE 4: INPUT → OUTPUT (Combinational)
Input port → AND → OR → output port. No FFs: a pure combinational path.

3.3 Setup & Hold Slack Analysis

Setup Slack
Slack_setup = T_required – T_arrival, where T_arrival = T_clk_latency_launch + T_cq_launch + T_combo
Setup Required Time
T_required = T_clock_edge + T_clk_latency_capture – T_clock_uncertainty_setup – T_setup
Hold Slack
Slack_hold = T_arrival – (T_clk_latency_capture + T_clock_uncertainty_hold + T_hold), checked against the same clock edge that launched the data
✅ Positive Slack (MET)
Data arrives before required time. Extra margin available. Setup: Slack = +0.3ns means 300ps of timing margin. Design passes. No action needed.
slack: +0.350 ns (MET)
❌ Negative Slack (VIOLATED)
Data arrives AFTER required time. Setup violation = data might not be captured correctly. Must fix before tape-out. Hold violation = new data arrives too soon after the capture edge and overwrites the value being captured.
slack: -0.120 ns (VIOLATED)
Sample Timing Report (PrimeTime Format)
TIMING REPORT — Setup Analysis
===========================================================
Path Type       : max (Setup)
Point                              Incr       Path
===========================================================
--- Launch Clock ---
clock CLK (rise edge)              0.000      0.000
clock network delay (propagated)    0.500      0.500
FF1/CK                              0.000      0.500 r
--- Data Path (Launch) ---
FF1/Q       (DFF_X2/Q)               0.120      0.620 r
U101/Y      (AND2_X4/Y)              0.085      0.705 r
U102/Y      (OAI21_X2/Y)             0.110      0.815 f
U103/Y      (INV_X4/Y)               0.062      0.877 r
U104/Y      (BUF_X8/Y)               0.075      0.952 r
FF2/D                                0.000      0.952 r
data arrival time                                0.952
--- Capture Edge ---
clock CLK (rise edge)              5.000      5.000
clock network delay (propagated)    0.510      5.510
FF2/CK                              0.000      5.510 r
library setup time                 -0.085     5.425
data required time                               5.425
-----------------------------------------------------------
data required time                               5.425
data arrival time                               -0.952
-----------------------------------------------------------
slack (MET)                                       4.473
===========================================================
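The arithmetic in the sample report can be reproduced from the setup and hold definitions. The sketch below plugs in the report's numbers for the setup check; the 0.030 ns hold time used for the hold check is an assumed value, since the report only shows setup, and the function names are hypothetical.

```python
# Setup and hold slack from path components (all times in ns).


def setup_slack(period, launch_latency, clk_to_q, comb_delay,
                capture_latency, setup_time, uncertainty=0.0):
    """Required time (next capture edge) minus data arrival time."""
    arrival = launch_latency + clk_to_q + comb_delay
    required = period + capture_latency - uncertainty - setup_time
    return required - arrival


def hold_slack(launch_latency, clk_to_q, comb_delay,
               capture_latency, hold_time, uncertainty=0.0):
    """Data arrival time minus the earliest time data is allowed to change."""
    arrival = launch_latency + clk_to_q + comb_delay
    required = capture_latency + uncertainty + hold_time
    return arrival - required


# Increments from the sample report: U101..U104 form the combinational path.
comb = 0.085 + 0.110 + 0.062 + 0.075

s = setup_slack(period=5.0, launch_latency=0.500, clk_to_q=0.120,
                comb_delay=comb, capture_latency=0.510, setup_time=0.085)

# Hold check on the same path; hold_time is an assumed library value.
h = hold_slack(launch_latency=0.500, clk_to_q=0.120, comb_delay=comb,
               capture_latency=0.510, hold_time=0.030)

print(f"setup slack = {s:.3f} ns")
print(f"hold slack  = {h:.3f} ns")
```

The setup result matches the report's 4.473 ns. A path this short relative to a 5 ns period has plenty of setup margin, and hold is the check to watch on it.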

3.4 Clock Domain Crossing (CDC)

CDC occurs when a signal crosses from one clock domain to another. This creates a risk of metastability — the output of a flip-flop remains at an indeterminate voltage level for an unpredictable time if setup/hold requirements are violated during the crossing.

2-FF Synchronizer (Most Common Fix)
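[Figure: 2-FF synchronizer. FF_SRC (clocked by clkA) drives FF_S1 (clocked by clkB), whose output may go metastable. FF_S2 (also clkB) resamples it, and its output is safe to use in the clkB domain. With two flops, MTBF increases exponentially.]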
CDC Signal Crossing Waveform
[Figure: waveforms for clkA, clkB, DATA_A, FF_S1/Q (possibly metastable), and FF_S2/Q (stable, safe to use).]
CDC Violation Type | Description | Fix
Single-bit crossing (no sync) | Flip-flop driven by different clock without synchronizer | Add 2-FF synchronizer
Multi-bit bus crossing | Multiple bits cross independently and may be sampled as an incoherent value | Use gray code, handshake, async FIFO
Fast-to-slow domain | Source clock faster; receiving domain may miss pulses | Pulse stretcher + synchronizer
Reconvergence | Two paths from different domains merge, causing a non-deterministic glitch | Re-synchronize before combining
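Gray code helps with multi-bit crossings because consecutive values differ in exactly one bit, so a receiver that samples mid-transition sees either the old value or the new one, never a mix. A minimal sketch of the standard binary-reflected Gray code conversion follows; the function names are illustrative.

```python
# Binary <-> Gray code conversion, as used for async FIFO pointers.


def bin_to_gray(n):
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def gray_to_bin(g):
    """Inverse conversion: XOR together all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def bits_changed(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


for value in range(8):
    print(f"{value:03b} -> {bin_to_gray(value):03b}")
```

A plain binary counter going from 011 to 100 flips three bits at once. In Gray code the same step is 010 to 110, a single-bit change that a 2-FF synchronizer per bit can carry safely.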

3.5 On-Chip Variation (OCV) & AOCV

Real silicon has spatial and temporal variation in process, voltage, and temperature (PVT). OCV models capture that cells on the same die can behave differently from each other.

⚙️
Process Corners
FF: Fast NMOS, Fast PMOS. Cells are fast. Best case for setup, worst case for hold.

TT — Typical-Typical. Nominal design point.

SS — Slow NMOS, Slow PMOS. Worst-case for setup timing.
🌡️
Voltage & Temperature
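Low voltage + high temp = slow cells (worst setup). High voltage + low temp = fast cells (worst hold). Temperature inversion: at advanced nodes (<65nm), and especially at low supply voltage, cells can be slower at low temperature than at high temperature, so the worst setup corner may be the cold one and both temperature extremes must be checked.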
📊
Derating
Apply derating factors to account for OCV. Late (slow) path: multiply by 1.05 (5% slower). Early (fast) path: multiply by 0.95 (5% faster). Creates pessimistic timing margin.
Method | Description | Accuracy | Pessimism
OCV (flat derating) | Apply fixed derate to all paths equally | Medium | High
AOCV (Advanced) | Derate based on depth (number of cells in path). Longer paths have more statistical averaging → less pessimism | High | Medium
POCV (Parametric) | Full statistical model using σ distributions for each cell. Most accurate | Highest | Low
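Flat OCV derating can be illustrated on a single setup check: the launch clock and data path are slowed by the late factor, while the capture clock is sped up by the early factor. This is a simplified sketch with assumed path delays. It derates every delay and ignores common-path pessimism removal, which real STA tools apply to the clock segment shared by launch and capture.

```python
# Effect of flat OCV derates (late 1.05, early 0.95) on setup slack, in ns.
# Path delays are assumed values for illustration.

LATE = 1.05   # slow-side multiplier from the text
EARLY = 0.95  # fast-side multiplier from the text


def setup_slack(period, launch_clk, data, capture_clk, setup_time,
                late=1.0, early=1.0):
    """Setup slack with launch clock + data derated late, capture clock early."""
    arrival = (launch_clk + data) * late
    required = period + capture_clk * early - setup_time
    return required - arrival


path = dict(period=5.0, launch_clk=0.500, data=0.452,
            capture_clk=0.510, setup_time=0.085)

nominal = setup_slack(**path)
derated = setup_slack(**path, late=LATE, early=EARLY)

print(f"nominal slack = {nominal:.4f} ns")
print(f"derated slack = {derated:.4f} ns")
print(f"margin consumed by OCV = {nominal - derated:.4f} ns")
```

The loss grows with total path delay, which is why a fixed derate is most pessimistic on long paths and why AOCV reduces the derate as path depth increases.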

3.6 Multi-Mode Multi-Corner (MMMC)

Modern designs must meet timing across multiple operating modes (functional, scan, standby) AND multiple PVT corners simultaneously. MMMC analysis runs all combinations in one pass.

Corner Name | Process | Voltage | Temp | Analysis Type | Purpose
func_slow | SS | 0.9V | 125°C | Setup | Worst-case functional timing (setup closure)
func_fast | FF | 1.1V | -40°C | Hold | Worst-case hold (fast paths cause hold violations)
func_typical | TT | 1.0V | 25°C | Both | Nominal analysis for power estimation
scan_slow | SS | 0.9V | 25°C | Setup | Scan shift timing at slow corner
hold_fast | FF | 1.2V | -40°C | Hold | Extreme hold analysis for ECO coverage
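Sign-off across scenarios reduces to taking the worst slack over every view that applies to a given check. The sketch below uses the scenario names from the table, but the slack values are invented for illustration.

```python
# Worst slack per check type across MMMC scenarios (slack values in ns).
# Scenario names follow the corner table; the numbers are hypothetical.

setup_slack = {"func_slow": 0.12, "scan_slow": 0.40, "func_typical": 0.85}
hold_slack = {"func_fast": 0.03, "hold_fast": -0.02, "func_typical": 0.10}


def worst(slacks):
    """Return (scenario, slack) for the scenario with the smallest slack."""
    name = min(slacks, key=slacks.get)
    return name, slacks[name]


def is_clean(*slack_tables):
    """True only if every scenario in every table has non-negative slack."""
    return all(s >= 0 for table in slack_tables for s in table.values())


print("worst setup:", worst(setup_slack))
print("worst hold: ", worst(hold_slack))
print("sign-off clean:", is_clean(setup_slack, hold_slack))
```

A single failing scenario blocks sign-off even when all the others pass, which is why the scenarios are analyzed together rather than one at a time.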

3.7 Synopsys PrimeTime

PrimeTime (PT) is the industry-standard sign-off STA tool. It uses accurate parasitic data (SPEF) from the extracted layout for final timing certification.

Key PrimeTime Commands
Command | Purpose
read_verilog | Read gate-level netlist from PD tool
read_sdc | Apply timing constraints (SDC)
read_parasitics | Load extracted parasitics (SPEF file)
set_operating_conditions | Set PVT corner for analysis
update_timing | Propagate timing through all paths
report_timing | Print timing paths (worst paths)
report_constraint | Report all violated constraints
check_timing | Validate constraint coverage (unconstrained paths)
report_global_timing | Summary: WNS, TNS, WHS, THS
pt_shell -file | Run PrimeTime in batch mode
Sample PrimeTime Script
TCL — PrimeTime Sign-off Script
## PrimeTime Sign-off Script
## Library setup: PrimeTime resolves cells through link_path
set_app_var search_path [list . /tech/saed32nm/db]
set target_library "saed32nm_ss0p9v125c.db"
set_app_var link_path "* $target_library"

## Read design
read_verilog    "./out/chip_final.v"
link_design     chip_top

## Constraints and parasitics
read_sdc        "./out/chip_final.sdc"
read_parasitics -format spef "./out/chip.spef"

## PVT corner
set_operating_conditions "ss0p9v125c"

## Enable OCV derating
set_timing_derate -late  1.05 -cell_delay
set_timing_derate -early 0.95 -cell_delay

## Update timing
update_timing -full

## Reports (create the output directory first)
file mkdir rpt
report_timing -delay_type max -max_paths 20 -slack_lesser_than 0   > rpt/vio_setup.rpt
report_timing -delay_type min -max_paths 20 -slack_lesser_than 0   > rpt/vio_hold.rpt
report_constraint    -all_violators                               > rpt/all_vio.rpt
report_global_timing -significant_digits 3                        > rpt/global.rpt
check_timing                                                       > rpt/check.rpt

3.8 Cadence Tempus

Feature | Synopsys PrimeTime | Cadence Tempus
Industry Status | Gold Standard Sign-off | Challenger / Growing
MMMC | Via scenario manager | Native MMMC (view definitions)
ECO Flow | PT-ECO + write_changes | Native ECO (eco_opt_design)
Innovus Integration | Via StarRC/Signoff | Seamless (same data model)
POCV Support | Yes (POCV derating) | Yes (SOCV)
Primary Use | Sign-off timing | In-design + sign-off

3.9 Timing Closure Techniques

Fixing Setup Violations
Technique | Method
Cell Upsizing | Replace slow cell with larger drive strength version (X4 → X8)
Buffer Insertion | Split long wire into shorter segments with buffers
Logic Restructuring | Reduce logic depth on critical path by rearranging gate tree
Floorplan Change | Move source/sink cells closer to reduce wire delay
Retiming | Move registers to balance pipeline stages
Frequency Reduction | Last resort: lower clock frequency (increase period)
Fixing Hold Violations
Technique | Method
Buffer Insertion | Insert delay buffers (delay cells) on short paths to add delay
Cell Downsizing / Vt Swap | Use a lower drive strength version, or replace a fast (LVT) cell with a slower (HVT) version
Wire Stretching | Make path wire longer to add RC delay
Clock Skewing | Intentionally skew clock to give more hold margin
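Hold fixing by buffer insertion is a budgeting exercise: add enough delay to clear the hold violation without using up the setup margin on the same path. The sketch below assumes a 50 ps delay cell and a 400 ps setup slack. Both are invented values, and the model ignores the extra wire delay and load that real buffers add.

```python
import math

# Hold fix by delay-buffer insertion (all times in ps, values hypothetical).


def buffers_needed(hold_slack_ps, buffer_delay_ps):
    """Smallest number of delay cells that brings hold slack to >= 0."""
    if hold_slack_ps >= 0:
        return 0
    return math.ceil(-hold_slack_ps / buffer_delay_ps)


def apply_fix(hold_slack_ps, setup_slack_ps, buffer_delay_ps):
    """Return (buffer count, new hold slack, new setup slack)."""
    n = buffers_needed(hold_slack_ps, buffer_delay_ps)
    added = n * buffer_delay_ps
    # Added data-path delay helps hold and costs setup by the same amount.
    return n, hold_slack_ps + added, setup_slack_ps - added


n, new_hold, new_setup = apply_fix(hold_slack_ps=-120,
                                   setup_slack_ps=400,
                                   buffer_delay_ps=50)

print(f"insert {n} buffers: hold slack {new_hold} ps, setup slack {new_setup} ps")
```

If the new setup slack went negative, the fix would trade one violation for another, which is why hold fixes are checked against the setup corners as well.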
ECO (Engineering Change Order) Flow
📌 What is ECO?
ECO is a controlled method to make targeted netlist changes after synthesis or tape-out to fix timing, functional bugs, or sign-off issues. It modifies only the affected cells/nets, preserving the rest of the design.
ECO flow: PT Analysis (find violations) → ECO Script Gen (fix_eco_timing) → Innovus ECO (ecoPlace/Route) → Re-extract RC (extractRC) → Re-sign-off (PT re-run) → CLOSURE (all slack ≥ 0)
SECTION 04

Interview Prep & Quick Reference

Synthesis Interview Questions (Top 30)

1. What is logic synthesis and what are its inputs and outputs?
Logic synthesis is the process of converting an RTL (Register Transfer Level) hardware description into a gate-level netlist optimized for a target technology.

Inputs: RTL code (Verilog/VHDL/SystemVerilog), technology library (.lib/.db), timing constraints (.sdc), design rules.
Outputs: Gate-level netlist (.v), mapped SDC, timing/area/power reports, DDC database.
2. What is the difference between compile and compile_ultra in Design Compiler?
compile: Standard compile with basic optimization. Limited effort. Options: -map_effort [low/medium/high], -incremental_mapping for re-optimization of an existing netlist.

compile_ultra: Higher-effort flow that adds automatic ungrouping, boundary optimization, and advanced datapath optimization, with optional register retiming (-retime). Options include -no_autoungroup (disables automatic ungrouping, so hierarchy is preserved) and -timing_high_effort_script. Significantly better QoR at the cost of longer runtime. Used in production flows.
3. What is a technology library (.lib file)? What does it contain?
A .lib (Liberty) file characterizes every standard cell in the technology at specific PVT conditions. It contains:
  • Cell delay tables (input transition vs output load)
  • Setup/hold times for sequential cells
  • Leakage and dynamic power values
  • Area in technology units
  • Pin capacitances, max fanout, max transition limits
  • Function description (Boolean)
Multiple .lib files cover different PVT corners (ss125c, tt25c, ff-40c).
4. What is a false path? Give an example of when you would use set_false_path.
False path: A timing path that exists in the netlist but is not functionally active — it will never carry real data during operation, so timing should not be analyzed on it.

Examples:
  • Asynchronous reset/set ports: set_false_path -from [get_ports rst_n]
  • Scan test mode paths (active only during test, not functional operation)
  • Paths between mutually exclusive clocks that never switch simultaneously
  • Configuration pins written once at startup
Note: Incorrectly setting false paths can hide real timing problems. Use with care.
5. What is a multicycle path? How is set_multicycle_path used?
A multicycle path is one that is intentionally designed to take more than one clock cycle to propagate. This relaxes the timing constraint on that path.

Example: A multiplier that takes 2 clock cycles:
set_multicycle_path 2 -setup -from [get_cells mult_inst/reg*]
set_multicycle_path 1 -hold -from [get_cells mult_inst/reg*]

The -hold must be explicitly set to (N-1) to avoid hold violations introduced by the relaxed setup. Failure to set hold correction is a very common bug.
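The effect of the two commands can be seen by computing which clock edges the setup and hold checks use. This sketch assumes launch and capture flops on the same clock, with edges measured from the launch edge at time 0. The function name is illustrative, and the 5 ns period is an assumed value.

```python
# Setup/hold check edges for a same-clock multicycle path.
# setup_mult is the N in "set_multicycle_path N -setup";
# hold_mult is the value given with -hold (0 when no -hold is specified).


def check_edges(period, setup_mult=1, hold_mult=0):
    """Return (setup_edge, hold_edge) in time units after the launch edge."""
    setup_edge = setup_mult * period
    # By default hold is checked one cycle before the setup capture edge;
    # each unit of hold_mult moves the hold check one cycle earlier.
    hold_edge = (setup_mult - 1 - hold_mult) * period
    return setup_edge, hold_edge


T = 5.0  # ns, assumed clock period

print("single cycle:       ", check_edges(T))
print("setup 2, no -hold:  ", check_edges(T, setup_mult=2))
print("setup 2, hold 1:    ", check_edges(T, setup_mult=2, hold_mult=1))
```

With only the setup multiplier, the hold check lands at 5 ns, so the data path would need at least a full period of minimum delay. Setting -hold to N−1 moves the hold check back to the launch edge at 0 ns.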
6. What is clock gating and why is it used?
Clock gating reduces dynamic power by stopping the clock to a flip-flop or a group of flip-flops when their output is not needed. Instead of clocking an FF every cycle (wasting power toggling), a gating condition (enable signal) controls whether the clock reaches the FF.

Implementation: Synthesis tools insert ICG (Integrated Clock Gating) cells which are AND/OR-latch combinations that suppress the clock edge cleanly without glitches. Reduces dynamic power by 20–40% in typical designs.
7. What is the difference between WNS, TNS, and WHS?
WNS (Worst Negative Slack): The most negative setup slack in the design. Represents the single worst timing path. Must be ≥ 0 at sign-off.

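TNS (Total Negative Slack): Sum of all negative slacks across all endpoints. Indicates the total amount of timing work needed. A small WNS with a large TNS means many endpoints fail by a little; a large WNS with a small TNS means a few endpoints fail badly. If WNS ≥ 0, TNS is 0.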

WHS (Worst Hold Slack): The most negative hold slack. Indicates the worst hold violation. Must also be ≥ 0. Fixed by inserting delay buffers on short paths.
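All three metrics are simple reductions over per-endpoint slacks. In this sketch the slack lists are invented, and WNS and WHS are reported as the minimum slack even when it is positive. Some tools instead clamp the value to 0 when nothing is violating.

```python
# WNS, TNS and WHS from per-endpoint slacks (ps). Slack values are hypothetical.

setup_slacks_ps = [350, -120, -30, 0, 80]
hold_slacks_ps = [40, 15, -10, 60]


def worst_slack(slacks):
    """Most negative (or least positive) slack in the list."""
    return min(slacks)


def total_negative_slack(slacks):
    """Sum of the negative slacks only; 0 when nothing violates."""
    return sum(s for s in slacks if s < 0)


wns = worst_slack(setup_slacks_ps)
tns = total_negative_slack(setup_slacks_ps)
whs = worst_slack(hold_slacks_ps)

print(f"WNS = {wns} ps, TNS = {tns} ps, WHS = {whs} ps")
```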
8. What is retiming in synthesis?
Retiming moves registers (flip-flops) across combinational logic boundaries without changing the circuit's functional behavior. It balances pipeline stages to improve frequency.

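Example: If Stage 1 has 3ns of logic and Stage 2 has 1ns, the clock period is limited by the 3ns stage. Retiming moves a register to equalize the stages at ~2ns each, cutting the minimum period from 3ns to 2ns and raising the achievable frequency by about 1.5×. The tool handles the mathematical transformation automatically. Enabled in DC via compile_ultra -retime (or the optimize_registers command).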
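The frequency gain from balancing stages follows from the slowest stage setting the clock period. The sketch below uses the 3ns/1ns stage delays from the example and, as a simplifying assumption, ignores register clock-to-Q delay, setup time, and clock uncertainty.

```python
# Maximum clock frequency is limited by the slowest pipeline stage.
# Sequential overhead (clk-to-Q, setup, uncertainty) is ignored here.


def fmax_mhz(stage_delays_ns):
    """Highest clock frequency (MHz) at which every stage still fits."""
    return 1000.0 / max(stage_delays_ns)


before = fmax_mhz([3.0, 1.0])  # unbalanced: limited by the 3 ns stage
after = fmax_mhz([2.0, 2.0])   # retimed: same total logic, balanced

print(f"before retiming: {before:.1f} MHz")
print(f"after retiming:  {after:.1f} MHz")
print(f"speed-up: {after / before:.2f}x")
```

The total logic delay is unchanged at 4ns. Only its distribution across the register boundary moves, so the gain is bounded by how unbalanced the stages were to begin with.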
9. What is operand isolation in power optimization?
Operand isolation prevents switching activity on functional units (like adders, multipliers) when their outputs are not being used. An AND gate or mux is inserted at the inputs of the datapath block, driven by the enable signal. When disabled, all inputs are forced to 0, preventing glitches from propagating through the combinational logic and reducing switching power significantly.
10. What happens during elaboration in synthesis?
Elaboration parses the HDL source files and builds an internal design representation (GTECH netlist — technology-independent generic gates). During elaboration:
  • Module hierarchy is constructed
  • Parameters/generics are resolved to constants
  • FSMs are identified and optionally encoded
  • Registers, memories, operators (+, *, >>) are mapped to GTECH primitives
  • Design rule checks (unconnected ports, latches vs FFs) are performed
The check_design command after elaboration reports any issues.
11. What is set_dont_touch and when do you use it?
set_dont_touch prevents DC from optimizing, resizing, or removing a specific cell or net. Use cases:
  • Protect manually sized critical cells from being downsized
  • Preserve specific clock buffers needed for DFT
  • Protect cells needed for post-silicon debug/observation points
  • Guard hand-placed analog boundary interface cells
Over-use of set_dont_touch can degrade QoR by blocking legitimate optimizations.
12. What is the difference between target_library and link_library?
target_library: The technology library whose cells DC will USE when mapping the design. These are the cells that appear in the output netlist.

link_library: Libraries used to RESOLVE module references during linking. Includes "*" (current design) + all .db files. Needed so DC can find instantiated sub-modules and external IPs. A cell can be in link_library but not target_library — it gets resolved but DC won't use it for new cells.
13. What is ungroup and when should you use it during synthesis?
ungroup flattens a sub-module into its parent, removing the hierarchical boundary. This allows DC to optimize logic across that boundary (e.g., constant propagation from parent into child, logic sharing between siblings).

Use when: Sub-module boundaries prevent critical optimization. In compile_ultra, the -no_autoungroup flag disables DC's automatic ungrouping. Manual ungrouping is done before compile: ungroup -all -flatten. Tradeoff: loses hierarchy for debug and incremental compile benefits.
14. What is scan insertion and how does synthesis handle it?
Scan insertion (Design for Test, DFT) replaces regular flip-flops with scan flip-flops (SFF) that have an additional scan data input (SI) and scan enable (SE). During test mode, all SFFs form a chain allowing external test patterns to be shifted in and captured results shifted out.

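In synthesis: The design is compiled with scan-aware mapping (compile -scan or compile_ultra -scan), then preview_dft is used to review the planned scan architecture and insert_dft stitches the chains. The SDC must set false paths on scan paths (set_false_path -from [get_ports scan_en]). Scan adds ~5–10% area overhead.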
15. What is the significance of set_max_area 0 in DC?
set_max_area 0 tells Design Compiler to minimize area as much as possible (target = 0 means "minimize"). DC will aggressively use smaller cells, share logic, and apply area recovery techniques after meeting timing. Setting this to 0 doesn't mean area will be 0 — it's a directive to minimize. Without this command, DC may leave unused area if timing is met. Always set after timing constraints are applied so timing takes priority.
16. What are HVT, SVT, and LVT cells? How are they used in synthesis?
Multi-threshold voltage cells on the same process node:

LVT (Low Vt): Fast switching, but high leakage. Used on critical timing paths.
SVT (Standard Vt): Balanced. General use cells.
HVT (High Vt): Slow, but very low leakage. Used on non-critical paths to reduce standby power.

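Strategy: Use LVT to fix WNS on critical paths; replace non-critical LVT cells with HVT to recover power. DC can perform multi-Vt optimization (leakage recovery) automatically when the libraries for all Vt flavors are listed in target_library. Multiple PVT corners alone do not enable it; the tool needs cells of different threshold types to choose from.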
17. What is GTECH in Design Compiler?
GTECH (Generic Technology) is Synopsys's internal, technology-independent logic library used as an intermediate representation during synthesis. Elaboration translates the RTL into GTECH primitives (GTECH_AND2, GTECH_FD1, etc.) before technology mapping to the target library. GTECH allows Boolean optimization without technology-specific constraints. Running check_design on the GTECH netlist catches structural issues before committing to technology mapping.
18. What is the purpose of set_clock_uncertainty?
set_clock_uncertainty adds a timing margin to account for:
  • Jitter: Cycle-to-cycle variation in clock edge arrival (PLL jitter)
  • Skew: Spatial variation in clock arrival (before CTS; post-CTS uses propagated clocks)
  • Margin: Extra guardband for post-silicon variation
Pre-CTS: set_clock_uncertainty models all uncertainty.
Post-CTS: Usually only jitter+margin, as skew is captured in propagated clock latencies. Setup and hold have separate uncertainty values.
19. What is path grouping in synthesis optimization?
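Path grouping organizes timing paths into groups so DC can apply targeted optimization effort. Each group can receive a different weight and critical range. By default DC creates one path group per clock (plus a default group for paths not captured by any clock); designers commonly add their own groups such as REGOUT (reg-to-output), REGIN (input-to-reg), and COMBO (input-to-output combinational).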

group_path -name critical_paths -critical_range 0.5 -weight 5

Higher weight = more optimization effort. Useful to tell DC to focus on specific paths without spending runtime on already-met paths.
20. What is the difference between read_verilog and analyze + elaborate?
read_verilog: Reads, analyzes, and elaborates the design in one step. Simpler for single-design flows.

analyze + elaborate (two-step):
analyze -format verilog -library WORK [file list]
elaborate top_module

The two-step approach is preferred for large hierarchical designs because analyze compiles each file to an intermediate form, and elaborate builds the hierarchy. This allows reuse of analyzed modules and better error isolation. Also enables explicit parameter override during elaborate.
21. What causes latch inference vs flip-flop inference in synthesis?
In Verilog RTL:
Flip-flop is inferred when: output is assigned only on a clock edge (always @(posedge clk)).
Latch is inferred when: output is assigned inside a level-sensitive always block AND not all conditions assign the output (incomplete if/case).

Example latch inference: always @(en or d) if (en) q = d; // q holds when en=0 → LATCH

Latches are generally undesirable in synthesis (timing hard to analyze). Fix: Use flip-flops with explicit reset, or make if/case statements complete with else/default.
22. What is incremental compile and when do you use it?
Incremental compile (compile -incremental) re-optimizes only the portions of the design that violate constraints, leaving already-met portions unchanged. It is faster than a full compile and is used:
  • After making small ECO changes to the netlist
  • After constraint changes affecting only a subset of paths
  • In a second-pass optimization after an initial compile
Not as thorough as a full compile_ultra — use only when runtime is critical or changes are known to be local.
23. What does check_timing report and why is it important?
check_timing validates that all paths in the design are covered by timing constraints. It reports:
  • Unconstrained paths: Flip-flops or ports with no clock or timing constraint → timing not analyzed → potential sign-off risk
  • Loops: Combinational loops (no register) which cause infinite path delays
  • No-clock endpoints: FFs without an associated clock
Always run check_timing before reporting timing. "Clean" means 0 warnings — every path is constrained.
24. What is propagated clock vs ideal clock in synthesis?
Ideal clock: Clock arrives at all FFs simultaneously with zero skew and zero network delay. Used pre-CTS. The set_clock_uncertainty models expected skew/jitter as a guardband.

Propagated clock: After CTS, the actual clock network delay is computed from the clock source through every buffer/inverter to each FF's clock pin. The tool uses real propagated delays — more accurate, removes pessimism of ideal clock uncertainty. set_propagated_clock [all_clocks] switches to propagated mode in PrimeTime post-CTS.
25. What is set_driving_cell and set_load?
set_driving_cell: Specifies the cell driving each input port, allowing DC to accurately compute input transition times. Without this, DC assumes an ideal (zero-resistance) driver. Example: set_driving_cell -lib_cell BUFX4 [get_ports data_in*]

set_load: Specifies the capacitive load on output ports (models the off-chip load). Example: set_load 0.05 [get_ports data_out*]

Both are necessary for accurate I/O timing analysis. Without them, input/output timing will be optimistic.
26. What is the SAIF file and how is it used in power analysis?
SAIF (Switching Activity Interchange Format) captures the toggle rate and static probability of every net in the design from simulation. It is used by synthesis and power analysis tools to compute accurate dynamic (switching) power rather than relying on default activity assumptions (typically 20% toggle rate).

Flow: Run RTL or gate-level simulation → dump SAIF → read in DC/PT for power analysis: read_saif -input sim.saif -instance top. More accurate switching data = more accurate power optimization decisions.
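
To see why measured activity matters, the sketch below uses the standard relation that each toggle of a net dissipates about 0.5·C·V² of switching energy. The capacitance, supply voltage, clock frequency, and the "measured" activity value are illustrative assumptions, not figures from a real design or tool.

```python
def switching_power_w(cap_f, vdd_v, toggles_per_s):
    """Average switching power of one net: 0.5 * C * V^2 per toggle."""
    return 0.5 * cap_f * vdd_v ** 2 * toggles_per_s

F_CLK = 1e9    # assumed 1 GHz clock
CAP = 10e-15   # assumed 10 fF net capacitance
VDD = 0.9      # assumed supply voltage in volts

# Default assumption quoted in the text: 20% toggle rate (0.2 toggles per cycle).
p_default = switching_power_w(CAP, VDD, 0.2 * F_CLK)
# Hypothetical SAIF-measured activity for a mostly idle net: 0.02 toggles per cycle.
p_measured = switching_power_w(CAP, VDD, 0.02 * F_CLK)

print(f"default: {p_default * 1e6:.3f} uW, measured: {p_measured * 1e6:.3f} uW")
```

With these assumed numbers the default estimate is ten times the measured one, which is the kind of gap that can mislead power optimization when no SAIF is supplied.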
27. What is the difference between a latch and a flip-flop from a timing perspective?
Flip-flop (edge-triggered): Captures data only at the clock edge. Setup/hold times apply at that edge. STA treats it as a fixed timing endpoint — straightforward.

Latch (level-sensitive): Transparent when clock is high (or low). Late-arriving data can "time-borrow" through the latch during the transparent phase, taking time from the following stage's budget. This makes STA significantly more complex because the tool must track how much time each latch borrows. Latches in pipelines can improve throughput but require careful constraint handling, for example limiting the borrowed time with set_max_time_borrow.
28. What is a generated clock? Give an example.
A generated clock is a clock derived from a master clock by division, multiplication, or phase shift — typically from a PLL output or a clock divider register.

create_generated_clock -name CLK_DIV2 -source [get_ports clk_in] -divide_by 2 [get_pins clkdiv_reg/Q]

Generated clocks are essential for STA to correctly analyze paths crossing from the master to the generated domain. Without declaring them, flip-flops clocked by the derived signal have no clock and their paths are unconstrained. A generated clock inherits source latency from its master, but clock uncertainty is not inherited and should be set on the generated clock explicitly.
29. What is a combinational loop and how does it affect synthesis?
A combinational loop is a circuit path where the output feeds back to its own input without any register (flip-flop/latch) in between. This creates infinite path delay in STA (the propagation loops forever), and in real hardware causes oscillation or lock-up states.

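Synthesis tools flag loops through check_design and check_timing, and typically break each loop at an arbitrary timing arc so that analysis can proceed, which makes the reported timing on those paths untrustworthy. Unintended loops should be fixed in the RTL before the synthesis results are used. Common causes: feedback mux without enable register, asynchronous handshake signals coded incorrectly in RTL.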
30. What is register balancing vs pipeline optimization?
Register balancing (retiming): Moves existing registers within the current pipeline structure to equalize logic depth between stages. No new registers are added. The functional latency (number of cycles) stays the same.

Pipeline optimization: Adds NEW pipeline stages (registers) to reduce combinational depth at the cost of increased latency. This is an architectural decision made at RTL level, not done automatically by synthesis.

Key difference: Retiming is synthesis-level; pipelining is architectural. Both improve timing but retiming is transparent to function while pipelining increases output latency.
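
A small numeric sketch of the distinction, assuming made-up stage delays and ignoring register overhead (setup time and clock-to-Q): the slowest stage sets the clock period, and the number of stages sets the latency.

```python
def min_clock_period(stage_delays_ns):
    """The slowest combinational stage limits the clock period."""
    return max(stage_delays_ns)

original = [6, 2]          # two stages with unbalanced logic depth
retimed = [4, 4]           # same two stages after moving the register
pipelined = [2, 2, 2, 2]   # RTL change: four stages instead of two

for name, stages in [("original", original),
                     ("retimed", retimed),
                     ("pipelined", pipelined)]:
    print(name, "period:", min_clock_period(stages), "ns,",
          "latency:", len(stages), "cycles")
```

Retiming shortens the period while the stage count stays at two; pipelining shortens it further, but the result appears two cycles later.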

Physical Design Interview Questions (Top 30)

1. What is utilization in floorplanning and what is a good target value?
Utilization = (Total Standard Cell Area) / (Core Area) × 100%. It represents how densely cells are packed in the core.

Target: 60–75% for most designs. Lower (<50%) wastes die area and increases cost. Higher (>80%) causes routing congestion, difficulty placing buffers, and degraded routability. Memory-heavy designs may use 40–60% because large SRAMs occupy significant area.
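
The utilization formula can be applied in both directions: checking a given floorplan, or sizing the core for a target. The areas below are invented for illustration.

```python
def utilization_pct(std_cell_area_um2, core_area_um2):
    """Utilization = standard cell area / core area, as a percentage."""
    return 100.0 * std_cell_area_um2 / core_area_um2

def core_area_for_target(std_cell_area_um2, target_pct):
    """Core area needed so the given cell area lands at the target utilization."""
    return std_cell_area_um2 * 100.0 / target_pct

# Hypothetical block: 700,000 um^2 of cells in a 1,000,000 um^2 core.
util = utilization_pct(700_000, 1_000_000)
# Hypothetical sizing: 350,000 um^2 of cells at a 70% target.
needed = core_area_for_target(350_000, 70)

print(f"utilization = {util:.1f}%, core needed = {needed:.0f} um^2")
```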
2. What is the difference between die area and core area?
Die area: The total silicon area of the chip, including the I/O ring, pads, and all structures to the edge of the die.

Core area: The interior region where standard cells and macros are placed. It is surrounded by the I/O ring. Core area = Die area − I/O ring area − margins.

The core-to-die margin accommodates power rings, I/O pad connections, and design rule keepouts. Utilization is measured relative to the core area, not die area.
3. What is IR drop and how does it affect the design?
IR drop is the voltage reduction along the power delivery network due to resistive metal wires. V_drop = I × R.

Effects:
  • Cells receiving lower VDD switch slower → increased cell delay → potential setup violations
  • Severe IR drop can prevent cells from switching at all → functional failure
  • Dynamic (transient) IR drop occurs when many cells switch simultaneously and draw a large peak current
Fix: Wider power stripes, more vias, adding decap cells near the IR hotspot, reducing switching current by spreading cells.
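
As a rough illustration of why wider stripes help, the sketch below combines V_drop = I × R with the rectangular-wire resistance R = Rs × L / W. The sheet resistance, strap dimensions, and current are assumed values; a real analysis uses the full power grid and an IR-drop tool.

```python
def strap_resistance_ohm(sheet_res_ohm_sq, length_um, width_um):
    """R = Rs * L / W for a rectangular metal strap."""
    return sheet_res_ohm_sq * length_um / width_um

def ir_drop_mv(current_ma, resistance_ohm):
    """V = I * R; milliamps times ohms gives millivolts."""
    return current_ma * resistance_ohm

RS = 0.02        # assumed sheet resistance, ohm per square
LENGTH = 500.0   # assumed strap length, um
CURRENT = 10.0   # assumed current through the strap, mA

drop_narrow = ir_drop_mv(CURRENT, strap_resistance_ohm(RS, LENGTH, 2.0))
drop_wide = ir_drop_mv(CURRENT, strap_resistance_ohm(RS, LENGTH, 4.0))

print(f"2 um strap: {drop_narrow:.1f} mV, 4 um strap: {drop_wide:.1f} mV")
```

Doubling the width halves the resistance and therefore the drop, under the simplifying assumption that the current stays the same.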
4. What is electromigration (EM) and how do you fix it?
Electromigration (EM) is the gradual displacement of metal atoms in a wire due to electron momentum transfer at high current densities. Over time it creates voids (opens) or hillocks (shorts), causing chip failure.

Fix:
  • Widen the wire to reduce current density (J = I/A)
  • Add parallel wires (increase cross-section)
  • Add more vias (reduce via current density)
  • Reduce switching frequency or activity
EM analysis is part of sign-off using tools like Voltus or RedHawk.
5. What is clock skew? What is acceptable skew?
Clock skew = difference in clock arrival time between any two flip-flops in the design (or between launch FF and capture FF on a specific path).

Skew = T_clk_capture − T_clk_launch

Acceptable values:
  • Local skew (adjacent FFs): < 30–50 ps
  • Global skew (across chip): < 100–200 ps
Positive skew (capture FF's clock arrives later): relaxes setup, tightens hold. Negative skew: tightens setup, relaxes hold. CTS targets balanced (near-zero) skew between all FFs in a domain.
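
The effect of skew on both checks can be verified with simplified slack equations for a single register-to-register path. All delays below are assumed values in picoseconds, and clock uncertainty is left out to keep the sketch short.

```python
def setup_slack_ps(period, skew, t_ck_to_q, t_comb_max, t_setup):
    """skew = capture clock arrival - launch clock arrival."""
    return period + skew - t_ck_to_q - t_comb_max - t_setup

def hold_slack_ps(skew, t_ck_to_q, t_comb_min, t_hold):
    """The earliest data change must come after capture edge + hold time."""
    return t_ck_to_q + t_comb_min - t_hold - skew

# Assumed path: 1 ns period, 100 ps CK-to-Q, 800 ps max / 20 ps min logic delay.
for skew in (-50, 0, 50):
    s = setup_slack_ps(1000, skew, 100, 800, 50)
    h = hold_slack_ps(skew, 100, 20, 30)
    print(f"skew {skew:+4d} ps -> setup slack {s:4d} ps, hold slack {h:4d} ps")
```

Every picosecond of positive skew is added to setup slack and removed from hold slack, which is why CTS aims for near-zero skew unless useful skew is applied deliberately.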
6. What is the difference between global routing and detailed routing?
Global Routing: Divides the chip into a coarse grid (GCells) and assigns each net to a sequence of GCells. Determines which metal layers and routing regions each net passes through. Fast, approximate — does not produce actual wire geometries. Identifies congested areas.

Detailed Routing: Works within the global routing assignment to produce exact wire coordinates, widths, vias, and layer assignments. Must satisfy all DRC rules. The actual GDSII-ready metal geometries are the output.
7. What is a DRC violation? Give three examples.
DRC (Design Rule Check) violations are layout patterns that violate foundry manufacturing rules:
  1. Spacing violation: Two wires on the same metal layer are closer than the minimum spacing rule
  2. Width violation: A wire is narrower than the minimum width for that metal layer
  3. Via enclosure violation: Metal doesn't extend enough beyond the via in all directions
  4. Antenna violation: Metal attached to gate has too high area ratio (damages oxide during fab)
  5. Density violation: Metal fill percentage outside foundry-specified min/max range
8. What is LVS and what errors does it catch?
LVS (Layout vs. Schematic) extracts a netlist from the physical layout (by identifying connected metal regions as nets and transistors from poly-over-active patterns) and compares it to the reference schematic/netlist.

Errors caught:
  • Open circuits: A connection exists in schematic but is missing/broken in layout
  • Short circuits: Two nets that should be separate are connected in layout
  • Extra devices: Layout has transistors not in schematic
  • Missing devices: Schematic has cells not present in layout
LVS must be clean before tape-out.
9. What is a macro in physical design? How is it placed?
A macro is a large pre-designed block (hard macro) with a fixed layout: SRAM, ROM, PLL, analog IP, large memories. Unlike standard cells which are placed in rows, macros have fixed dimensions and internal structure.

Macro placement guidelines:
  • Place at die edges or corners to minimize routing blockage in the center
  • Align to row boundaries if possible
  • Add a "halo" or keepout around each macro (no std cells within 2–5µm)
  • Consider macro pin accessibility — pins should face the routing channels
  • Group related macros (e.g., all SRAMs near their controllers)
10. What are filler cells and their purpose?
Filler cells are placed in empty spaces between standard cells in each row to:
  • Maintain N-well continuity across the row (required for correct transistor operation)
  • Keep the VDD/VSS power rails continuous along the standard cell row
  • Provide decoupling capacitance (decap fillers include capacitors; plain fillers do not)
  • Help meet the foundry's minimum layer density requirements
Different sizes exist (FILL1, FILL2, FILL4, FILL8, FILL16, FILL32) and the tool fills every gap. Must be removed before ECO changes and re-inserted after.
11. What is an antenna violation in routing?
During plasma etching in semiconductor fabrication, metal wires connected to gate terminals accumulate charge. If the metal-to-gate area ratio exceeds a threshold, the charge can damage the thin gate oxide.

Antenna ratio = Metal area connected to gate / Gate oxide area

Fixes:
  • Jump up to higher metal layer (top layer is added last, less exposure)
  • Insert antenna diodes at gate inputs (discharge the accumulated charge)
  • Use antenna-aware routing (route to higher layer early)
Antenna violations are found by DRC and must be fixed before tape-out.
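
The ratio check itself is simple arithmetic. In the sketch below the limit of 400 is a hypothetical placeholder (real limits come from the foundry rule deck and differ per layer), and the areas are invented.

```python
MAX_RATIO = 400.0  # hypothetical limit; real values come from the foundry rules

def antenna_ratio(metal_area_um2, gate_area_um2):
    """Metal area connected to the gate divided by gate oxide area."""
    return metal_area_um2 / gate_area_um2

def violates(metal_area_um2, gate_area_um2, limit=MAX_RATIO):
    return antenna_ratio(metal_area_um2, gate_area_um2) > limit

# Long run on one layer: 120 um^2 of metal hanging on a 0.25 um^2 gate.
before = antenna_ratio(120.0, 0.25)
# After a jump to a higher layer, only 40 um^2 is attached while this layer is etched.
after = antenna_ratio(40.0, 0.25)

print(before, violates(120.0, 0.25))
print(after, violates(40.0, 0.25))
```

The layer jump works because the upper-layer segment does not yet exist while the lower layer is being etched, so less metal is attached to the gate at that step.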
12. What is crosstalk and how does it affect timing?
Crosstalk occurs when a switching wire (aggressor) capacitively couples noise onto an adjacent wire (victim).

Timing impact:
  • Crosstalk delta delay: Aggressor switching in same direction as victim → speeds up victim (improves setup, worsens hold). Opposite direction → slows down victim (worsens setup).
  • Crosstalk noise/glitch: On a quiet net, coupling from aggressor creates a voltage spike that may cause a logic error if the net is near a switching threshold.
Fixes: Shield critical nets with VDD/VSS, increase wire spacing, upsize the victim's driver, or shorten long parallel runs by inserting buffers or changing layers.
13. What is the difference between legalization and detailed placement?
Legalization: After global placement places cells at approximate (possibly overlapping) locations, legalization moves each cell to a nearby legal position: aligned to a placement row, snapped to the site grid, with no overlaps. Cells may move significantly from their global placement position.

Detailed Placement: After legalization, cells are in legal positions but timing may be degraded. Detailed placement does local cell swaps, single-row and multi-row moves to improve timing and reduce wirelength while maintaining legality.
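
A deliberately naive single-row legalizer makes the snapping and overlap-removal step concrete. It is a greedy left-to-right sketch under assumed units (nanometres, cell widths in sites); production legalizers work across many rows and minimize total displacement.

```python
def legalize_row(cells, site_width, num_sites):
    """Snap cells to sites left to right, pushing right to remove overlaps.

    cells: list of (name, desired_x, width_in_sites); x and site_width in nm.
    Returns {name: legal_x}.
    """
    legal = {}
    next_free = 0  # index of the first unoccupied site
    for name, x, width in sorted(cells, key=lambda c: c[1]):
        nearest = (x + site_width // 2) // site_width  # round to nearest site
        site = max(nearest, next_free)                 # never overlap a placed cell
        if site + width > num_sites:
            raise ValueError(f"row overflow while placing {name}")
        legal[name] = site * site_width
        next_free = site + width
    return legal

# Hypothetical global-placement result: A and B overlap and sit off the site grid.
cells = [("A", 330, 2), ("B", 450, 3), ("C", 1900, 1)]
result = legalize_row(cells, site_width=200, num_sites=12)
print(result)
```

Cell B is pushed well to the right of where global placement wanted it, which is the kind of displacement that detailed placement then tries to repair.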
14. What is a placement blockage? Name three types.
A placement blockage prevents the placer from placing standard cells in a specified area.

Types:
  • Hard blockage: No cells placed at all. Used around macros, analog circuits, special structures.
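  • Soft blockage: Keeps cells out during initial placement, but later optimization and legalization may use the area (for example, for inserted buffers).
  • Partial blockage: Caps placement density in the region at a specified percentage; commonly used to relieve congestion, for example in channels between macros.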
  • Route blockage: Blocks routing (not placement) on specific metal layers in a region.
15. What is SPEF? Why is it needed for sign-off timing?
SPEF (Standard Parasitic Exchange Format) is a file that describes the extracted RC parasitics (resistance and capacitance) of every wire in the physical layout. After routing, an extraction tool (like StarRC or RCX) reads the physical layout and produces SPEF.

SPEF is needed because wire delays depend heavily on actual metal resistance and capacitance, which are only known after physical layout. Pre-route timing uses estimated wire loads (WLM) which can be 20–30% off. Sign-off STA uses SPEF for accurate, real timing. Without SPEF, timing sign-off is unreliable.
16. What is timing-driven placement?
Timing-driven placement considers timing criticality when placing cells. Critical path cells are placed close together to minimize wire length and thus wire delay. Non-critical paths can tolerate longer wires.

The placer uses early wire length estimation and constraint data to prioritize cell proximity for critical nets. Without timing-driven placement, a pure wirelength minimizer might spread critical cells apart, degrading timing after routing when actual wire RC is seen. Most modern placers (Innovus, ICC2) do timing-driven placement by default.
17. What is a power domain and what is level shifting?
A power domain is a region of the chip that operates at a specific supply voltage, potentially different from the rest of the chip. Used in low-power design to run non-critical blocks at lower voltage (lower power).

Level shifters are required when a signal crosses between two power domains at different voltages. They translate signal levels: for example, a signal swinging between 0 V and 1.2 V in domain A must be converted to the 0 V to 1.8 V swing of domain B. Without level shifters, the receiver may see marginal logic levels, causing excess leakage current or functional failure. Level shifters must be inserted in the netlist during synthesis/PD with proper UPF (Unified Power Format) flow.
18. What is the purpose of tap cells in physical design?
Tap cells (also called well taps) connect the N-well to VDD and substrate to VSS at regular intervals to prevent latchup. They have no active function but provide necessary bias connections.

Without tap cells, parasitic PNP/NPN transistors in the CMOS structure can turn on, creating a low-resistance path from VDD to VSS (latchup), permanently damaging the chip. Foundry rules specify maximum tap cell pitch (typically 20–50µm). They are placed in every standard cell row at regular intervals.
19. What is congestion in routing? How do you resolve it?
Congestion occurs when the number of wires that need to pass through a routing region exceeds the available routing tracks (routing overflow).

Fixes:
  • Reduce placement density (lower utilization) in congested areas
  • Add routing blockages on congested layers to force rerouting
  • Move macros to open routing channels
  • Use a metal stack with more routing layers, if the process offers one
  • Use high-fanout net synthesis to break up congested drivers
  • Adjust floorplan to redistribute logic
Congestion map analysis in Innovus/ICC2 shows hotspots before detailed routing.
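
The overflow definition above translates directly into a per-GCell calculation. The grid, track demand, and capacity below are invented numbers; real tools report overflow per layer and direction.

```python
def overflow_map(demand, capacity):
    """Per-GCell overflow: wires demanded beyond the available tracks."""
    return [[max(0, d - c) for d, c in zip(d_row, c_row)]
            for d_row, c_row in zip(demand, capacity)]

def hotspots(overflow):
    """(row, col) of every GCell with nonzero overflow."""
    return [(r, c) for r, row in enumerate(overflow)
            for c, value in enumerate(row) if value > 0]

# Hypothetical 2x3 GCell grid with 10 tracks available in every GCell.
demand = [[8, 12, 9],
          [10, 15, 7]]
capacity = [[10, 10, 10],
            [10, 10, 10]]

ovf = overflow_map(demand, capacity)
print(ovf, hotspots(ovf), sum(map(sum, ovf)))
```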
20. What is double patterning and why is it needed?
At advanced nodes (<20nm), the minimum wire pitch required is smaller than what a single photolithography exposure can resolve. Double patterning splits the layout into two separate masks, each printed in a separate exposure, whose combined result achieves the fine pitch.

This requires the layout to be "colorable" — adjacent wires must be assigned to different masks (colors). DRC checks for double patterning conflicts (two adjacent same-color wires that should be different colors). Routing tools must be double-patterning-aware and ensure no conflicts.
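
Colorability can be modeled as two-coloring a conflict graph: each wire is a node, and an edge joins two wires that are too close to share a mask. The sketch below is a generic breadth-first two-coloring, not any particular tool's decomposition algorithm; the wire indices and conflict lists are made up.

```python
from collections import deque

def two_color(num_wires, conflicts):
    """Assign mask 0 or 1 to each wire; return None if the layout is not colorable."""
    adj = [[] for _ in range(num_wires)]
    for a, b in conflicts:
        adj[a].append(b)
        adj[b].append(a)
    color = [None] * num_wires
    for start in range(num_wires):
        if color[start] is not None:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if color[v] is None:
                    color[v] = 1 - color[u]   # neighbor goes on the other mask
                    queue.append(v)
                elif color[v] == color[u]:
                    return None               # odd cycle: coloring conflict
    return color

print(two_color(3, [(0, 1), (1, 2)]))          # three parallel wires in a row
print(two_color(3, [(0, 1), (1, 2), (0, 2)]))  # three mutually close wires
```

Three wires that are all within the same-mask spacing of each other form an odd cycle, so no two-mask assignment exists and the layout must change.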
21. What is a flyline (ratsnest) and how is it used?
A flyline (ratsnest) is a straight-line visual connection between unconnected pins that are logically connected in the netlist. It shows the router "intent" — which pins must be connected — before actual routing is done.

Uses:
  • Visual guide during floorplanning to estimate wire congestion and length
  • Identify poor floorplan choices (macros creating long flylines across the chip)
  • Estimate wirelength for timing budgeting
High-density flyline areas after floorplanning predict routing congestion hot spots. Move macros/cells to reduce crossing flylines.
22. What is a high-fanout net and how is it handled in PD?
A high-fanout net is a signal connected to a very large number of sink pins (e.g., enable, scan_en, reset driving hundreds or thousands of FFs). High fanout causes:
  • Excessive wire capacitance → slow transition → timing violation
  • Single wire spanning entire chip → routing congestion
Handling: Use buffer tree synthesis — insert buffers to split the net into sub-nets. The synthesis and PD tools do this automatically for nets exceeding max_fanout. Scan enable and test signals often need 4–6 levels of buffering.
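
A quick way to estimate tree depth is to count how many buffer levels a balanced tree needs so that no driver exceeds max_fanout. The sketch assumes an ideal balanced tree; real buffer trees are shaped by placement and timing, so treat the result as a lower bound.

```python
def buffer_levels(num_sinks, max_fanout):
    """Buffer levels needed between the driver and the sinks in a balanced tree."""
    stages, reach = 0, 1
    while reach < num_sinks:
        reach *= max_fanout   # each extra stage multiplies the reachable sinks
        stages += 1
    return max(0, stages - 1)  # the original driver is the first stage

print(buffer_levels(8, 8))       # driver can reach all sinks directly
print(buffer_levels(4096, 8))    # scan-enable style net
print(buffer_levels(100000, 8))  # very large reset-style net
```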
23. What is the difference between ICC2 and Innovus?
Both are industry-leading place-and-route tools from different vendors:

Synopsys ICC2: Tightly integrated with DC (write_icc2_files), PrimeTime for timing sign-off, and IC Validator for DRC. Uses hierarchical database (.dlib).

Cadence Innovus: Tight integration with Genus (write_db/read_db), Tempus for in-design STA, and Calibre in-design. Known for Clock Concurrent Optimization (CCOpt) for CTS.

Both support advanced features (multi-patterning, advanced node DRC, power analysis). Choice depends on existing tool stack and foundry PDK support.
24. What is a power intent file (UPF/CPF)?
UPF (Unified Power Format) and CPF (Common Power Format) describe the power architecture of a multi-voltage design:
  • Power domain definitions (which cells are in each domain)
  • Supply voltages for each domain
  • Power state definitions (ON, OFF, low-power)
  • Level shifter and isolation cell requirements
  • Power switching cell locations
UPF is standardized as IEEE 1801. The PD tool reads UPF to automatically insert level shifters, isolation cells, and power switches at domain boundaries. Without UPF, multi-voltage designs cannot be correctly implemented.
25. What is the difference between CTS and CTO?
CTS (Clock Tree Synthesis): Builds the clock distribution network from scratch — inserting buffers, inverters, and routing wires to distribute the clock to all FFs with controlled skew and latency.

CTO (Clock Tree Optimization): A post-CTS step that fine-tunes the existing clock tree — adjusting buffer sizes, changing net routes, and tweaking the tree topology to improve skew, latency, and clock power without fully rebuilding the tree. Used after post-route optimization when incremental clock improvement is needed. In Innovus: ccopt_design covers both CTS and CTO.
26. What is metal fill and why is it required?
Metal fill is dummy metal patterns inserted into empty areas of each metal layer to satisfy foundry density rules. CMP (Chemical Mechanical Polishing) during fabrication requires uniform metal density across the wafer to achieve planar surface topography.

Without adequate fill: The polish rate varies with local pattern density, causing dishing (metal recessed in wide features) and erosion (dielectric and metal thinned in dense regions) → non-uniform heights → via formation failures → reliability problems.

Fill is inserted after routing using fill tools. It must not electrically connect to any signal but must meet min/max density rules on each layer within specified check windows.
27. What is the difference between pre-route and post-route optimization?
Pre-route optimization occurs before detailed routing. Wire delays are estimated (using virtual wire models). Cell placement, sizing, and buffering changes are fast because no DRC checking is needed. Timing closure is attempted here first for efficiency.

Post-route optimization uses actual extracted parasitics (RC from real wires). It is slower and must maintain DRC cleanliness with every change. Changes are limited (ECO-mode: only add/resize buffers/inverters, minimal perturbation to avoid DRC). Sign-off timing happens post-route.
28. What is a standard cell row? How does orientation affect placement?
A standard cell row is a horizontal strip in the core area with fixed height (matching the standard cell height for the technology node). Rows alternate between unflipped (N, also called R0) and vertically flipped (FS, also called MX) orientation, with power rails (VDD and VSS) running horizontally along the row edges. All standard cells must snap to row boundaries.

Because every other row is mirrored about the horizontal axis, adjacent rows abut VDD-to-VDD and VSS-to-VSS and share a power rail, reducing the number of rails needed. Some cells can only be placed in certain orientations (e.g., cells with specific Nwell connections). Placement tools handle orientation automatically per-row.
29. What happens during sign-off? What must pass before tape-out?
Sign-off is the final verification phase before releasing the design to the foundry. All checks must pass with zero failures:
  • STA: Zero setup AND hold violations across all MMMC corners
  • DRC: Zero design rule violations (Calibre DRC clean)
  • LVS: Layout vs. Schematic clean (zero shorts/opens)
  • IR Drop: All cells receive sufficient voltage (static + dynamic)
  • EM: All metal/via segments below current density limits
  • Antenna: All gates meet antenna ratio rules
  • ESD: ESD protection structures verified
30. What is the purpose of decap cells?
Decap cells (decoupling capacitor cells) are standard-cell-height structures that contain a large capacitor between VDD and VSS. They serve as local charge reservoirs:
  • Supply instantaneous current to switching cells without waiting for current from the power pads (which have long RC path)
  • Reduce dynamic IR drop by providing local charge
  • Filter high-frequency noise on the power supply
Placed in empty spaces near high-switching-activity areas. Some filler cells also contain small decap capacitances. Excessive decap can cause excessive inrush current at power-on.

STA Interview Questions (Top 30)

1. What is static timing analysis and how does it differ from dynamic simulation?
STA exhaustively analyzes all timing paths in a design mathematically without requiring simulation vectors. It checks whether data can propagate from every startpoint to every endpoint within timing constraints.

Differences from dynamic simulation:
  • STA covers 100% of paths; simulation covers only exercised paths
  • STA is fast (minutes); full simulation can take days
  • STA cannot find functional bugs; simulation can
  • STA is deterministic given constraints; simulation depends on input vectors
  • STA uses library delay models; simulation evaluates actual logic values over time (with back-annotated delays at gate level, or transistor behavior in SPICE)
2. What is setup time and hold time?
Setup time (T_su): The minimum time BEFORE the active clock edge that data must be stable at the FF input (D pin). If data changes within this window, the FF may fail to capture correctly → metastability.

Hold time (T_h): The minimum time AFTER the active clock edge that data must remain stable. If data changes within this window, the FF may capture the new value instead of the intended value.

Both are characteristics of the flip-flop cell from the technology library, measured at specific operating conditions. They represent fundamental timing requirements of the storage element.
3. How is setup slack calculated?
Setup slack = Required Arrival Time − Actual Arrival Time

Required Arrival Time:
= Clock period + Capture clock latency − Clock uncertainty (setup) − Setup time of FF

Actual Arrival Time:
= Launch clock edge time + Launch clock latency + CK→Q delay + Combinational delay

Slack ≥ 0: Setup MET (timing passes)
Slack < 0: Setup VIOLATED (must fix)

Example: Required = 4.8ns, Arrival = 3.5ns → Slack = +1.3ns (MET with 1.3ns margin)
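
The two formulas can be written as one function. The component values below are assumptions chosen so that the totals reproduce the example (required 4.8 ns, arrival 3.5 ns); integer picoseconds avoid floating-point rounding.

```python
def setup_slack_ps(period, capture_latency, setup_uncertainty, t_setup,
                   launch_edge, launch_latency, t_ck_to_q, t_comb):
    """Setup slack = required arrival time - actual arrival time (all in ps)."""
    required = period + capture_latency - setup_uncertainty - t_setup
    arrival = launch_edge + launch_latency + t_ck_to_q + t_comb
    return required - arrival

# Assumed breakdown: 5 ns period, ideal clocks (zero latency),
# 150 ps uncertainty, 50 ps setup time, 200 ps CK-to-Q, 3.3 ns of logic.
slack = setup_slack_ps(period=5000, capture_latency=0,
                       setup_uncertainty=150, t_setup=50,
                       launch_edge=0, launch_latency=0,
                       t_ck_to_q=200, t_comb=3300)

print("setup slack:", slack, "ps", "(MET)" if slack >= 0 else "(VIOLATED)")
```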
4. Why do we need to fix both setup AND hold violations?
Both represent different failure modes that cause the flip-flop to capture the wrong data:

Setup violation: Data arrives too late → FF is asked to capture before data is stable → metastability → wrong Q output (random). Functional failure at speed.

Hold violation: Data changes too soon after clock → FF captures new value when old value was expected → wrong Q output. A hold violation is particularly dangerous because it causes failure at ALL frequencies — it's not a speed problem, it's a structural problem that causes failure even at low frequency.

Both must be zero violations at sign-off in every MMMC corner.
5. What is clock-to-Q delay (CK-to-Q)?
CK-to-Q (propagation delay) is the time from when the clock edge arrives at the FF's clock pin to when the output Q settles to its new logic value. It is a library cell characteristic and contributes to the data path delay in timing analysis:

Data arrival time = T_clk_source + T_launch_clk_latency + T_CKtoQ + T_combo_logic

Typical values: 50–200ps depending on cell drive strength and load. Larger, faster cells have smaller CK-to-Q. It also depends on output load capacitance (higher load → longer CK-to-Q).
6. What is metastability? How do synchronizers help?
Metastability occurs when a flip-flop's input violates setup or hold time. The FF output neither fully resolves to 0 nor 1 — it remains at an intermediate voltage. Given enough time (mean time to resolve), it will eventually resolve to a valid logic level, but the resolution time is unpredictable — it can be arbitrarily long, causing downstream logic to see incorrect values.

Synchronizers (2-FF chains) help by providing extra time for the FF output to resolve before being used. The probability of metastability causing failure decreases exponentially with the resolution time given. Mean Time Between Failures (MTBF) increases exponentially with the number of synchronizer stages.
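
The exponential dependence can be seen with the commonly used first-order model MTBF = e^(t_r/τ) / (T_w · f_clk · f_data), where t_r is the resolution time allowed, τ is the flip-flop's resolution time constant, and T_w is its metastability window. Every parameter value below is an assumption for illustration; real τ and T_w come from cell characterization.

```python
import math

def mtbf_seconds(t_resolve_s, tau_s, t_window_s, f_clk_hz, f_data_hz):
    """First-order synchronizer MTBF model."""
    return math.exp(t_resolve_s / tau_s) / (t_window_s * f_clk_hz * f_data_hz)

TAU = 20e-12      # assumed resolution time constant
T_W = 20e-12      # assumed metastability window
F_CLK = 1e9       # assumed destination clock, 1 GHz
F_DATA = 1e8      # assumed rate of asynchronous input changes

# Assume about 0.5 ns of resolution time before the first use of the signal,
# and one extra 1 ns clock period when a second synchronizer stage is added.
mtbf_one = mtbf_seconds(0.5e-9, TAU, T_W, F_CLK, F_DATA)
mtbf_two = mtbf_seconds(1.5e-9, TAU, T_W, F_CLK, F_DATA)
improvement = mtbf_two / mtbf_one

print(f"{mtbf_one:.3e} s -> {mtbf_two:.3e} s (x{improvement:.3e})")
```

Under these assumptions one extra clock period of resolution time multiplies the MTBF by e^50, which is why a second stage is usually enough at moderate clock rates.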
7. What is OCV (On-Chip Variation) and why does it matter?
OCV is the spatial variation in process, voltage, and temperature across a single die. Two identical cells at different locations on the same chip may have different delays due to process gradient, local power supply variations, and thermal gradients.

OCV matters because STA corner analysis (SS, TT, FF) assumes the whole chip is at one corner. In reality, the launch path might be slow while the capture path is fast (or vice versa), creating additional timing margin loss. OCV derating adds guardband by making launch paths pessimistically slow and capture paths pessimistically fast (or vice versa for hold).
8. What is AOCV? How does it differ from flat OCV derating?
Flat OCV: Apply the same derate factor (e.g., 5%) to every cell, regardless of path length. This is very pessimistic for long paths — a path with 50 cells has much more statistical averaging than one with 3 cells.

AOCV (Advanced OCV): Applies a smaller derating factor to longer paths (more cells) because statistical averaging reduces the probability of all cells simultaneously being at the worst case. Shorter paths get higher derating. This reduces over-pessimism in long paths, recovering timing margin and avoiding unnecessary ECO effort. The derate table is a function of path depth (cell count).
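
A depth-indexed derate table can be sketched as a simple step lookup. The table values are hypothetical (real AOCV tables come from the library vendor, are often also indexed by distance, and tools may interpolate between entries).

```python
import bisect

# Hypothetical late-derate table: path depth (cell count) -> derate factor.
DEPTHS = [1, 5, 10, 20, 50]
LATE_DERATE = [1.10, 1.08, 1.06, 1.04, 1.03]
FLAT_DERATE = 1.05  # flat OCV example from the text: 5% on every cell

def aocv_late_derate(depth):
    """Step lookup: use the entry for the largest table depth <= path depth."""
    i = bisect.bisect_right(DEPTHS, depth) - 1
    return LATE_DERATE[max(i, 0)]

for depth in (3, 12, 50):
    print(f"depth {depth:2d}: AOCV {aocv_late_derate(depth):.2f} "
          f"vs flat {FLAT_DERATE:.2f}")
```

In this made-up table the 3-cell path is derated more than the flat 5% and the 50-cell path less, matching the behavior described above.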
9. What is MMMC analysis? Name four typical corners.
MMMC (Multi-Mode Multi-Corner) runs STA simultaneously across all operating modes and PVT corners to ensure the design meets timing in every scenario:
  • func_slow: SS, 0.9V, 125°C — Setup check for functional mode
  • func_fast: FF, 1.1V, -40°C — Hold check for functional mode
  • scan_slow: SS, 0.9V, 25°C — Scan shift timing
  • hold_extreme: FF, 1.2V, -55°C — Worst-case hold
All must pass simultaneously. One tool run covers all corners efficiently.
10. What is clock uncertainty and what components does it model?
Clock uncertainty is a timing margin applied to clock edges to account for:
  • Jitter (period jitter): Cycle-to-cycle variation in clock period from the PLL/crystal
  • Skew: Spatial variation in clock arrival times (pre-CTS only; post-CTS uses actual propagated latencies)
  • Uncertainty margin: Additional guardband for modeling limitations
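Setup uncertainty moves the setup required time earlier, shrinking the window available for data (more pessimistic for setup). Hold uncertainty moves the hold required time later, so data must stay stable longer after the capture edge (more pessimistic for hold). The value is a positional argument, so each check gets its own command: set_clock_uncertainty -setup 0.15 [get_clocks clk] and set_clock_uncertainty -hold 0.05 [get_clocks clk] (clk is a placeholder clock name).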
11. What is clock reconvergence pessimism removal (CRPR)?
When the launch and capture flip-flops share a portion of their clock path (common clock path), the STA tool would otherwise apply OCV derating to the shared segment twice — once making it slow (for launch) and once making it fast (for capture). This is physically impossible: the shared wire has one actual delay.

CRPR removes this double pessimism by identifying the common portion of the clock path and applying derating only to the diverging portions. This can recover significant timing margin (50–200ps) especially in designs with long shared clock networks.
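As a rough sketch of the arithmetic, the pessimism on the shared segment is the difference between its late-derated and early-derated delay, and that amount is credited back to the slack. The delay, derate, and slack numbers below are hypothetical.

```python
def crpr_credit(common_path_delay, late_derate, early_derate):
    """Pessimism on the shared clock segment: late-derated minus early-derated delay."""
    return common_path_delay * (late_derate - early_derate)

# Hypothetical values in ns
common = 0.800                 # delay of the clock segment shared by launch and capture
credit = crpr_credit(common, late_derate=1.08, early_derate=0.92)

slack_without_crpr = -0.050
slack_with_crpr = slack_without_crpr + credit
print(f"credit = {credit:.3f} ns, slack = {slack_with_crpr:.3f} ns")
```

Here a path that appears to fail by 50ps passes once the 128ps of double-counted derating is removed.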
12. What is a timing arc?
A timing arc is a delay specification between an input pin and an output pin of a cell in the library. It describes how long it takes for a transition at the input to propagate to the output. Types:
  • Cell arc: Input→Output delay within a cell (e.g., A→Y in AND2)
  • Net arc: Wire delay from cell output to next cell input (RC delay)
  • Setup/hold arc: Constraint arcs on FF data vs clock pins
  • Clock arc: CK→Q propagation arc of a flip-flop
STA tools traverse all arcs to compute path delays.
13. What is the purpose of read_parasitics / read_spef in PrimeTime?
read_parasitics -format spef filename.spef loads the wire RC parasitics extracted from the routed layout by an extraction tool (StarRC, QRC). read_spef is the equivalent command in Cadence Tempus. Without parasitics, PrimeTime falls back to ideal wires or estimated loading (wire-load models or SDC set_load), which is inaccurate.

After loading SPEF, wire delays are computed from actual metal resistance and capacitance (R×C delay), giving accurate net delays. Sign-off timing MUST use SPEF parasitics. The SPEF file must match the design netlist exactly (same net names). Mismatches cause warnings and incorrect timing.
14. What is hold analysis and why is it corner-reversed from setup?
Hold analysis checks whether data changes too quickly after a clock edge. Hold slack = Data arrival time − (Capture clock latency + Hold time).

Hold violations occur when the DATA PATH is too fast (short logic path) and the CLOCK arrives late at the capture FF.

Therefore, hold analysis uses the FAST corner (FF process, high voltage, low temperature) which makes data paths fast and can make hold more critical. This is the reverse of setup analysis which uses the SLOW corner. That's why MMMC must check setup at slow corner AND hold at fast corner simultaneously.
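The two checks can be compared side by side with a small calculator that follows the slack formulas used in this guide. The optional uncertainty term on the hold side and all numeric values are assumptions for illustration.

```python
def setup_slack(period, t_launch, t_capture, t_cq, t_combo, t_setup, uncertainty=0.0):
    """Setup: data must arrive before the NEXT capture edge minus setup time."""
    required = period + t_capture - uncertainty - t_setup
    arrival = t_launch + t_cq + t_combo
    return required - arrival

def hold_slack(t_launch, t_capture, t_cq, t_combo, t_hold, uncertainty=0.0):
    """Hold: data must not change until hold time after the SAME capture edge."""
    arrival = t_launch + t_cq + t_combo
    required = t_capture + t_hold + uncertainty
    return arrival - required

# Hypothetical slow-corner numbers (ns): long data path, setup is the concern
s = setup_slack(period=2.0, t_launch=0.50, t_capture=0.55,
                t_cq=0.15, t_combo=1.60, t_setup=0.08, uncertainty=0.10)

# Hypothetical fast-corner numbers (ns): short data path, late capture clock
h = hold_slack(t_launch=0.30, t_capture=0.38,
               t_cq=0.06, t_combo=0.03, t_hold=0.02, uncertainty=0.05)

print(f"setup slack = {s:.3f} ns, hold slack = {h:.3f} ns")
```

With these numbers setup passes at the slow corner while hold fails at the fast corner, which is the situation hold buffers are inserted to fix.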
15. What does check_timing check and what warnings indicate?
check_timing validates constraint completeness and reports:
  • Unconstrained endpoints: FF data/output ports with no timing path from a clock — path not analyzed by STA
  • No-clock FFs: Registers with no associated clock definition
  • Partial I/O constraints: set_input_delay specifies only -max but not -min (or vice versa)
  • Loop detection: Combinational loops
  • Multiple clocks: Endpoints with multiple clock paths (may need set_false_path or set_clock_groups)
All warnings should be investigated — unconstrained paths are a sign-off risk.
16. What is path-based analysis (PBA) vs graph-based analysis (GBA)?
GBA (Graph-Based Analysis): Standard STA mode. Each cell's arrival time is calculated once using the worst-case input transition and output load from all converging paths. Very fast but pessimistic — assumes the worst condition at every cell simultaneously, even if physically impossible.

PBA (Path-Based Analysis): Re-analyzes specific critical paths using the actual input transition experienced by each cell on that specific path. More accurate, less pessimistic — removes false worst-case combinations. Much slower (only applied to a subset of near-critical paths). Used to "rescue" paths that look violated in GBA but actually pass when analyzed properly.
17. What is input transition and output load in cell timing models?
Cell delay is characterized as a 2D lookup table indexed by:

Input transition time (slew): How fast the input signal switches (rise/fall). A slower input → longer cell propagation delay.

Output load capacitance: Total capacitance the cell drives (input caps of fanout cells + wire cap). Higher load → longer output transition and higher cell delay.

The 2D table is NLDM (Non-Linear Delay Model). STA tools interpolate within the table to compute accurate delays for the specific transition and load seen at each cell in the design.
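The table lookup can be sketched as bilinear interpolation between the four surrounding table points. The axis values and delays below are made-up numbers, not taken from any real Liberty file, and points outside the table are linearly extrapolated from the nearest cell of the grid.

```python
from bisect import bisect_right

# Hypothetical NLDM table for one timing arc
SLEW_NS = [0.01, 0.05, 0.20]           # index_1: input transition
LOAD_PF = [0.001, 0.010, 0.050]        # index_2: output load
DELAY_NS = [[0.020, 0.045, 0.150],     # rows follow SLEW_NS, columns follow LOAD_PF
            [0.028, 0.055, 0.162],
            [0.050, 0.080, 0.195]]

def _bracket(axis, x):
    """Return the lower index of the bracketing interval and the fractional position in it."""
    i = min(max(bisect_right(axis, x) - 1, 0), len(axis) - 2)
    t = (x - axis[i]) / (axis[i + 1] - axis[i])
    return i, t

def nldm_delay(slew, load):
    """Bilinear interpolation of cell delay for a given input slew and output load."""
    i, ts = _bracket(SLEW_NS, slew)
    j, tl = _bracket(LOAD_PF, load)
    d00, d01 = DELAY_NS[i][j], DELAY_NS[i][j + 1]
    d10, d11 = DELAY_NS[i + 1][j], DELAY_NS[i + 1][j + 1]
    low = d00 + (d01 - d00) * tl       # interpolate along load at the lower slew
    high = d10 + (d11 - d10) * tl      # interpolate along load at the upper slew
    return low + (high - low) * ts     # interpolate along slew

print(f"{nldm_delay(0.03, 0.0055):.4f} ns")
```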
18. What causes a max-capacitance violation and how is it fixed?
A max-capacitance violation occurs when the capacitive load on a cell's output exceeds the maximum capacitance limit specified in the technology library for that cell. This causes:
  • Output transition (slew) becoming too slow
  • Downstream cell delays increasing
  • Possible functional failure if slew is extremely slow
Fixes:
  • Insert buffers to split the high-fanout net
  • Upsize the driving cell to a higher drive strength
  • Reduce wire length (physical proximity of sinks)
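Max-cap violations are reported as design rule violations (DRVs, sometimes called logical DRCs) in STA reports; they are distinct from the layout DRC checks run in physical verification.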
19. What is the difference between setup uncertainty and hold uncertainty?
Setup uncertainty: Applied to reduce the timing window available for data to meet setup. It tightens setup (makes it harder to meet). Typically 100–150ps for pre-CTS, reduced post-CTS.

Hold uncertainty: Applied to increase the minimum data arrival time required to meet hold. It tightens hold (makes hold harder to meet). Typically 50ps.

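The asymmetry exists because setup is checked between a launch edge and the following capture edge, so period jitter eats directly into the available cycle. Hold is checked against the same clock edge that launched the data, so period jitter largely cancels and hold uncertainty mainly covers skew and modeling margin. Pre-CTS uses larger uncertainty because it must also stand in for skew; post-CTS switches to propagated clocks, and setup uncertainty shrinks to jitter plus margin.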
20. What is back-annotated timing? When is it used?
Back-annotated timing (post-layout STA) uses actual extracted RC parasitics (SPEF) from the physical layout to compute wire delays. Contrast with pre-layout timing which uses estimated loads.

Used: After routing is complete for final sign-off. The parasitics precisely capture the resistance and capacitance of every metal wire and via, giving timing accuracy within 5% of silicon measurement.

Back-annotation reveals new violations not seen pre-route (because estimated wires underestimated actual wire capacitance). These violations require post-route ECO fixes with minimal netlist perturbation.
21. What is a timing exception and why must it be carefully applied?
A timing exception modifies how STA analyzes a specific path: set_false_path, set_multicycle_path, set_max_delay, set_min_delay.

They must be carefully applied because:
  • Over-generous false paths hide real timing violations
  • Wrong multicycle path settings (missing hold correction) create hold violations
  • Incorrectly specified endpoints leave real functional violations unchecked
  • Timing exceptions survive synthesis to PD to sign-off — errors propagate through the entire flow
All exceptions must be documented and reviewed. Functional paths must never be marked false.
22. What is the difference between max-delay and false path?
set_false_path: Completely removes the path from timing analysis. The tool ignores it entirely — no timing report, no optimization. For paths that are genuinely never timing-critical in any operating scenario.

set_max_delay: Still analyzes the path for timing, but uses the specified delay as the timing constraint instead of the default (clock period). For paths that need to meet a specific delay that's different from the clock period (e.g., async paths that must complete within 10ns regardless of clock).

Key difference: set_false_path means "never check this." set_max_delay means "check this, but use this constraint."
23. What is a violation cascade and how do you prioritize fixes?
A violation cascade occurs when fixing one timing violation makes another one worse. For example, upsizing a cell to fix setup on path A may load a net and degrade setup on path B.

Prioritization strategy:
  1. Fix the WNS (worst) path first — largest magnitude violation
  2. Use ECO minimize-impact mode (minimize cell moves)
  3. Iterate in small batches (fix 20 paths, re-analyze, fix next 20)
  4. Monitor TNS trend — decreasing TNS = making progress
  5. Separate setup and hold fixes (hold buffer insertion can slow setup)
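The bookkeeping behind this strategy can be sketched in a few lines: compute WNS and TNS from endpoint slacks, then group the violating endpoints worst-first into small batches. The endpoint names, slack values, and batch size are hypothetical.

```python
def wns_tns(slacks):
    """WNS = minimum slack over all endpoints; TNS = sum of the negative slacks."""
    values = list(slacks)
    return min(values), sum(s for s in values if s < 0)

def eco_batches(endpoint_slacks, batch_size=20):
    """Sort violating endpoints worst-first and split them into fix batches."""
    violating = sorted(
        (slack, name) for name, slack in endpoint_slacks.items() if slack < 0
    )
    return [violating[i:i + batch_size] for i in range(0, len(violating), batch_size)]

# Hypothetical endpoint slacks in ns
slacks = {
    "u_core/reg_a/D": -0.120,
    "u_core/reg_b/D": -0.030,
    "u_dma/reg_c/D": 0.045,
    "u_dma/reg_d/D": -0.075,
}

wns, tns = wns_tns(slacks.values())
batches = eco_batches(slacks, batch_size=2)
print(f"WNS = {wns:.3f} ns, TNS = {tns:.3f} ns, first batch = {batches[0]}")
```

Re-running the same calculation after each batch gives the TNS trend mentioned in step 4.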
24. How does temperature inversion affect timing at advanced nodes?
At mature nodes (130nm+): Higher temperature → slower transistors (mobility decreases). Standard worst-case timing = high temp.

At advanced nodes (<65nm): Temperature inversion appears when the supply voltage is low and close to the threshold voltage. Vt rises as temperature falls, so at low Vdd the loss of gate overdrive (Vdd − Vt) outweighs the mobility gain, and transistors can be SLOWER at low temperature than at high temperature. This means the traditional slow corner (SS, 125°C) may no longer be worst-case timing; SS at -40°C may be worse.

Impact: Need to check timing at multiple temperature points. Some foundries provide separate library corners for this. Ignoring temperature inversion at advanced nodes can lead to post-silicon timing failures.
25. What is signal integrity (SI) in STA context?
In STA, Signal Integrity (SI) analysis accounts for crosstalk-induced delay changes:

SI delta delay: Coupling from aggressor wires causes victim wire delay to increase or decrease. STA includes SI analysis in sign-off by computing the worst-case delay considering all possible aggressor switching combinations.

SI noise analysis: Checks if crosstalk-induced voltage glitches on quiet nets can cause logic errors. The noise immunity of the receiving cell must exceed the peak noise voltage.

SI analysis requires layout parasitics including coupling capacitance (SPEF with coupling) — simple ground capacitance models are insufficient for SI-accurate timing.
26. What are recovery and removal times for asynchronous pins?
For asynchronous control pins (async reset, preset, clear) of flip-flops:

Recovery time: Minimum time the async signal must be deasserted BEFORE the active clock edge. Analogous to setup time: if async reset is released too close to the clock, the FF may not properly respond to that clock edge. STA tools check it automatically against the recovery arc characterized in the library, provided the reset path is timed against the clock and not declared a false path.

Removal time: Minimum time the async signal must remain asserted AFTER the active clock edge. Analogous to hold time. These are library-characterized values that must be checked if async resets are used in a synchronous design.
27. What is the difference between input/output delay -max and -min?
set_input_delay -max: Latest time data can arrive at the port relative to clock. Used for setup analysis of the first internal register that captures this input.

set_input_delay -min: Earliest time data arrives. Used for hold analysis (ensures data doesn't arrive so early that it violates hold at the capturing FF).

set_output_delay -max: Latest time data must be stable at output before next clock edge (for the downstream receiver's setup).

set_output_delay -min: Earliest time data must be stable (for downstream receiver's hold).

All four values (-max/-min for input/output) must be specified for complete I/O timing coverage.
28. What is POCV (Parametric OCV)?
POCV (Parametric/Statistical OCV) replaces flat or AOCV derating with a statistical model. Each cell delay is modeled as a Gaussian distribution with a mean and standard deviation (from silicon characterization).

STA computes the statistical distribution of path delay. The sum of independent Gaussian cell delays is itself Gaussian: the means add and the sigmas combine as a root-sum-square, so path sigma grows more slowly than path delay. Slack is then expressed as a sigma value, e.g., "path meets timing at 3σ".

Benefits: Most accurate OCV model, removes pessimism from flat/AOCV derating. Used in advanced (<7nm) nodes where OCV is very significant. Requires POCV characterization data from the foundry library.
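A minimal sketch of the statistical sum, assuming every cell varies independently (real POCV flows also model correlation and slew/load dependence). The per-cell mean and sigma are hypothetical values.

```python
import math

def pocv_path(cells, n_sigma=3.0):
    """cells: list of (mean_delay, sigma). Returns (path mean, path sigma, late arrival)."""
    mean = sum(m for m, _ in cells)
    sigma = math.sqrt(sum(s * s for _, s in cells))   # root-sum-square of cell sigmas
    return mean, sigma, mean + n_sigma * sigma

# Hypothetical path: 16 identical cells, 100ps mean and 5ps sigma each (values in ns)
cells = [(0.100, 0.005)] * 16
mean, sigma, late = pocv_path(cells)

# Worst-casing every cell at +3 sigma simultaneously, as flat derating effectively does
stacked = sum(m + 3.0 * s for m, s in cells)
print(f"POCV late = {late:.3f} ns, stacked worst case = {stacked:.3f} ns")
```

The statistical 3σ arrival is 1.66ns versus 1.84ns when every cell is assumed worst-case at once.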
29. How do you handle paths between asynchronous clock domains in STA?
Paths between asynchronous clock domains (clocks with no fixed phase relationship) cannot be meaningfully analyzed by standard STA — the arrival time of data relative to the capture clock is unbounded.

Proper handling:
  • set_clock_groups -asynchronous: Tells the STA tool to not analyze paths between these clock domains. The crossing is handled by synchronizers in the design.
  • CDC analysis (separate tool: Mentor CDC, Cadence JasperGold): Verifies correct synchronization structures are present
  • The synchronizer itself is analyzed with appropriate timing constraints
Without set_clock_groups on async clocks, the tool times these crossings as if the clocks were related, producing meaningless and often severely negative setup and hold slacks.
30. What is the ECO flow in PrimeTime and how is it used?
PT-ECO is PrimeTime's automated ECO (Engineering Change Order) capability for fixing timing violations post-route:

Flow:
  1. PT analyzes sign-off netlist with SPEF, finds violations
  2. fix_eco_timing -type setup and fix_eco_timing -type hold generate cell changes (upsizing, buffer insertion)
  3. Changes written to eco_changes.tcl
  4. Innovus/ICC2 reads changes, places/routes ECO cells
  5. RC re-extracted, PT re-runs analysis
  6. Iterate until clean
PT-ECO minimizes cell perturbation to preserve DRC cleanliness of the post-route database.

Formula Cheatsheet

⏱ Setup Timing
Slack_setup = T_req − T_arr
T_req = Period + T_clk_capture − T_uncertainty − T_setup
T_arr = T_clk_launch + T_cq + T_combo
Positive slack = PASS, Negative = FAIL
⏳ Hold Timing
Slack_hold = T_arr − (T_clk_cap + T_hold)
T_arr must be GREATER than capture clock + hold time
Fix: Add delay buffers to data path
📐 Floorplan
Util = CellArea / CoreArea × 100%
AR = CoreHeight / CoreWidth
CoreArea = CellArea / Util_target
Target utilization: 60–75%
⚡ IR Drop
V_drop = I × R_metal
R = ρ × L / (W × T)
Max allowed: typically 5–10% of VDD
Fix: wider stripes, more stripes, decaps
🌊 Electromigration
J = I / A (current density)
MTTF = A × J^(−n) × e^(Ea/kT)
Black's equation for MTTF (n is typically between 1 and 2). Wider wires → lower J → longer lifetime
🕐 Clock Skew
Skew = T_clk_cap − T_clk_launch
Positive: relaxes setup, tightens hold
Negative: tightens setup, relaxes hold
🔋 Dynamic Power
P_dyn = α × C × V² × f
α = activity factor, C = load cap, V = supply, f = frequency
💤 Leakage Power
P_leak ∝ W/L × e^(-Vt/nVT)
Exponential sensitivity to Vt. HVT cells reduce leakage
📊 WNS / TNS
WNS = min(all slacks)
TNS = Σ(negative slacks)
WNS → worst path. TNS → total work needed.
🌡 PVT Corners
Setup: SS, low V, high T
Hold: FF, high V, low T
Temperature inversion at <65nm nodes!
📡 Wire Delay (Elmore)
T_d = 0.69 × R × C
T_d ∝ L² (wire delay scales as L²)
Long wires need repeaters/buffers every Lopt
🔄 Fanout & Buffering
Logical Effort: g = C_in / C_inv
Optimal fanout = e ≈ 2.7 (e-based)
Buffer chain for high-fanout: h = C_load/C_in, stages = log_e(h)
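The buffer-chain rule of thumb can be sketched as follows, under the idealized assumption of zero parasitic delay (which is what makes e the optimum; practical designs usually land nearer a fanout of 4). The capacitance values are hypothetical.

```python
import math

def buffer_chain(c_load, c_in):
    """Return (number of stages, fanout per stage) for driving c_load from c_in."""
    h = c_load / c_in                       # total electrical effort
    stages = max(1, round(math.log(h)))     # stages = ln(h), rounded to an integer
    per_stage = h ** (1 / stages)           # equal fanout at every stage
    return stages, per_stage

# Hypothetical: a 2 fF input pin must drive a 2 pF load (h = 1000)
stages, fanout = buffer_chain(c_load=2.0, c_in=0.002)
print(f"{stages} stages, fanout per stage = {fanout:.2f}")
```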

VLSI Glossary

AOCV
Advanced On-Chip Variation. OCV derating method that applies smaller derating to longer (higher cell count) paths due to statistical averaging.
Antenna Violation
Layout violation where cumulative metal area connected to a gate exceeds the foundry's antenna ratio limit, risking gate oxide damage during plasma etching.
Aspect Ratio
Core height divided by core width. Typically 1:1 (square) but can vary based on I/O and macro constraints.
Back-Annotation
Loading post-layout extracted parasitics (SPEF) into STA for accurate timing analysis using real wire RC values.
Blackbox
A module whose internal implementation is hidden from synthesis/STA. Only the timing model (liberty file) is used for analysis.
Buffer Tree
A hierarchy of buffers used to drive high-fanout nets, reducing wire capacitance per driver and improving transition times.
CCD
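Concurrent Clock and Data optimization. Synopsys (ICC2 / Fusion Compiler) technique that optimizes the clock tree and data paths together, using skew as a timing resource. Cadence's counterpart in Innovus is CCOpt.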
CDC
Clock Domain Crossing. Transfer of signals between flip-flops clocked by different, potentially asynchronous clocks. Requires synchronizer circuits.
CMP
Chemical Mechanical Polishing. Fab step that planarizes the wafer surface after each metal layer deposition. Requires uniform metal density.
CRPR
Clock Reconvergence Pessimism Removal. Removes double-counting of OCV derating on the shared clock path between launch and capture FFs.
CTS
Clock Tree Synthesis. Process of building the clock distribution network to minimize skew and control latency from clock source to all FF clock pins.
CK-to-Q
Clock-to-Q propagation delay of a flip-flop — time from clock edge to Q output settling to new value. Part of data path delay.
Compile Ultra
Design Compiler's advanced compile command (compile_ultra), enabling automatic ungrouping, boundary optimization, adaptive retiming (-retime), and high-effort timing optimization for best QoR.
Core Area
The interior region of the chip die where standard cells and macros are placed. Excludes I/O ring and pad frame.
CPF
Common Power Format. Cadence's format (now merged into UPF/IEEE 1801) for specifying multi-voltage power intent.
Crosstalk
Capacitive coupling between adjacent wires. Causes delta delays (aggressor switching affects victim timing) and noise glitches (functional risk on quiet nets).
Decap Cell
Standard-cell-height structure containing a VDD-to-VSS capacitor. Placed in empty areas to reduce dynamic IR drop and supply noise.
Derating
Multiplicative factor applied to cell or wire delays in STA to model OCV. Early path derated by <1.0 (faster), late path by >1.0 (slower).
DEF
Design Exchange Format. Contains physical placement coordinates, routing geometry, and other physical design information for a chip.
Die Area
Total silicon area of the chip including I/O ring, pads, and all structures. Larger than core area.
DRC
Design Rule Check. Verifies layout geometry against foundry manufacturing rules (spacing, width, enclosure, density). Must be clean for tape-out.
DFT
Design for Test. Techniques (scan insertion, BIST, boundary scan) that make the design testable after manufacturing.
ECO
Engineering Change Order. Targeted, minimal netlist or layout change to fix a specific timing, functional, or sign-off issue after implementation.
Elaboration
Synthesis step that parses HDL, resolves hierarchy/parameters, and maps to GTECH (generic technology-independent) primitives.
Electromigration (EM)
Gradual metal atom displacement due to high electron current density. Causes voids (opens) or hillocks (shorts) over time. Characterized by Black's equation.
ERC
Electrical Rule Check. Verifies electrical correctness: floating nodes, improper biasing, ESD violations, latchup risk.
Filler Cells
Cells placed in empty row spaces to maintain N-well continuity, connect power rails, and satisfy metal density rules.
False Path
A timing path that exists in the netlist but is never functionally active. Excluded from STA via set_false_path.
Flyline
Straight-line visual connection between logically connected but physically unrouted pins. Used to assess routing congestion in floorplanning.
Footprint
Physical area occupied by a cell or block on the die, including any keepout regions.
GBA
Graph-Based Analysis. Standard STA mode computing arrival times on the timing graph once. Fast but pessimistic vs Path-Based Analysis (PBA).
GDS II
Graphic Design System II. Binary file format that contains all layout geometry for the chip. Final output sent to the foundry for mask making.
GTECH
Generic Technology. Synopsys internal technology-independent gate library used as intermediate representation during synthesis elaboration.
HVT
High Threshold Voltage cell. Slower than SVT/LVT but has very low leakage power. Used on non-critical paths to minimize standby power.
ICG
Integrated Clock Gating cell. Latch-based AND gate that cleanly gates the clock for power reduction without glitches. Inserted by synthesis tools.
IR Drop
Resistive voltage drop along power distribution network wires (V=IR). Reduces effective VDD at cells, increasing delay.
Jitter
Cycle-to-cycle variation in clock period caused by PLL noise, supply variation, and other sources. Modeled in clock uncertainty.
Latency (Clock)
Total delay from clock source to a flip-flop's clock pin, through all buffers/wires of the clock distribution network.
Legalization
Placement step that moves cells from global placement positions to the nearest legal row-aligned positions with no overlaps.
LEF
Library Exchange Format. Describes the physical abstract views of cells (pin locations, blockages, dimensions) for use by PD tools.
Level Shifter
Cell that converts a signal between two different voltage levels at a power domain boundary. Required for multi-voltage designs.
Liberty (.lib)
Industry-standard format for cell characterization data: timing arcs, power, area, and function at specific PVT conditions.
LVS
Layout vs Schematic. Extracts netlist from layout and compares to reference schematic. Catches opens, shorts, and missing/extra devices.
LVT
Low Threshold Voltage cell. Fastest switching speed, but highest leakage power. Used on critical timing paths to meet WNS.
Macro
Pre-designed hard block (SRAM, ROM, PLL, analog IP) with fixed layout dimensions. Placed early in floorplanning, not synthesized.
Metastability
Condition where a flip-flop output hovers at an intermediate voltage for an unbounded, unpredictable time after a setup/hold violation before settling to 0 or 1. Mitigated (not eliminated) by 2-FF synchronizers.
MMMC
Multi-Mode Multi-Corner. STA analysis across all operating modes and PVT corners simultaneously in one tool run.
Multicycle Path
A timing path intentionally designed to take N clock cycles. Declared via set_multicycle_path to relax the timing constraint.
NLDM
Non-Linear Delay Model. 2D cell delay table indexed by input transition time and output load capacitance. Standard model in Liberty files.
OCV
On-Chip Variation. Spatial PVT variation across a single die causing identical cells at different locations to have different delays.
PBA
Path-Based Analysis. Accurate STA mode that re-analyzes specific paths with actual input transitions, removing pessimism vs GBA.
PDN
Power Delivery Network. The complete network of metal rails, rings, stripes, and vias that distributes VDD/VSS to all cells.
POCV
Parametric/Statistical OCV. Most accurate OCV model — each cell delay modeled as a statistical distribution, path slack expressed as sigma confidence.
PrimeTime
Synopsys's industry-standard sign-off STA tool. Uses SPEF parasitics for post-layout timing analysis across MMMC corners.
PVT
Process, Voltage, Temperature. The three main variation sources characterized by IC timing libraries at multiple corners.
QoR
Quality of Results. Overall measure of synthesis/PD success: WNS, TNS, area, power, DRC count, and routability.
Retiming
Synthesis technique that moves registers across combinational logic to balance pipeline stage delays without changing function.
SAIF
Switching Activity Interchange Format. Simulation output capturing signal toggle rates for accurate dynamic power analysis.
SDC
Synopsys Design Constraints. Industry-standard format for timing, area, and power constraints used by all EDA synthesis and STA tools.
Setup Time
Minimum time data must be stable at FF input BEFORE the active clock edge. A violation can cause metastability or capture of the wrong value.
Skew
Difference in clock arrival times between any two flip-flops. Target: <50ps local, <200ps global after CTS.
Slack
Timing margin at a path endpoint: Required Time − Arrival Time (setup) or Arrival Time − Required Time (hold). Negative = violation.
Slew Rate
Speed of signal transition (rise/fall time), measured as time to transition between 20%–80% of supply voltage. Slow slew → more delay and power.
SPEF
Standard Parasitic Exchange Format. Contains extracted wire RC values from post-layout extraction for accurate back-annotated STA.
SVT
Standard Threshold Voltage cell. Balanced speed/leakage. General-purpose cell for non-critical paths.
Tap Cell
N-well to VDD and substrate to VSS connection cell. Prevents latchup. Placed at regular intervals (≤50µm) in every standard cell row.
Tempus
Cadence's sign-off STA tool, tightly integrated with Innovus. Supports native MMMC and in-design ECO optimization.
Timing Arc
Delay specification between a cell input and output pin. Includes cell arcs (logic delay), net arcs (wire RC), and constraint arcs (setup/hold).
TNS
Total Negative Slack. Sum of all negative slack values across all timing endpoints. Indicates total timing work remaining.
Uncertainty (Clock)
Timing margin accounting for jitter, skew, and modeling uncertainty. Applied via set_clock_uncertainty in SDC.
UPF
Unified Power Format (IEEE 1801). Standard format for describing multi-voltage power intent: domains, voltages, level shifters, isolation cells.
Utilization
Ratio of placed cell area to total core area, expressed as percentage. Target 60–75% for most designs.
Via
Metal connection between two adjacent metal layers in the layout. Vias have resistance and current capacity limits (EM rules).
WHS
Worst Hold Slack. Most negative hold slack across all endpoints. Must be ≥ 0 at sign-off. Fixed with delay buffer insertion.
WNS
Worst Negative Slack. Most negative setup slack — represents the single worst timing path in the design. Must be ≥ 0 at sign-off.
SECTION 05

Interactive Waveform Lab

An interactive digital timing waveform viewer. Toggle signals, animate the waveform, and inject setup/hold violations to see how they appear in practice.

🧪 Lab Controls
■ CLK — System Clock ■ D — Data Input ■ Q — FF Output ■ RESET — Async Reset ■ Violation Window
📖 How to Use
  • Click Play to animate the waveform timeline
  • Toggle individual signals using the colored buttons
  • Click Introduce Violation to inject a setup or hold violation
  • The violation window appears highlighted in red on the waveform
  • Click Reset to restore clean waveforms
🔬 Timing Parameters Displayed
  • CLK — 5ns period, 50% duty cycle (200MHz)
  • D — Data changes asynchronously relative to CLK
  • Q — Captured on rising edge of CLK (after CK-to-Q delay)
  • RESET — Active-low async reset; clears Q immediately
  • Setup window — Red zone before capture edge where D must be stable
SECTION 06

Physical Verification (PV)

🧱 Start Here — What Is Physical Verification?
After Physical Design completes, you have a GDS II file — the full geometric layout of your chip. But how do you know it's actually manufacturable? That it matches your circuit? That it won't destroy itself electrically? Physical Verification is the set of automated checks that answer all these questions before sending the GDS to the foundry.

Think of it as the final quality inspection before manufacturing. A single unresolved DRC violation → foundry rejects your file. A single LVS open → chip has a broken wire → dead chip. PV sign-off is non-negotiable.
🔍
DRC — Will the foundry be able to manufacture it?
Checks that every wire, via, and shape meets the foundry's minimum size and spacing rules. Violations mean the photolithography process cannot print your shapes correctly → defective chip.
⚖️
LVS — Does the layout match the netlist?
Extracts the connectivity from your physical layout and compares it to the reference netlist. A mismatch means a wire is missing, shorted, or connected to the wrong place → wrong circuit manufactured.
ERC / Antenna — Will it work electrically?
Checks for floating gates, ESD risks, latchup, and plasma damage to gate oxide. DRC-clean + LVS-clean is not enough — ERC violations can destroy the chip during manufacturing or use.
🔑 Why PV Is Critical — The Stakes
Unlike simulation (checks function) or STA (checks timing), Physical Verification checks manufacturability and physical correctness. A chip that simulates perfectly and passes STA but has a spacing DRC violation will either fail in the fab or be rejected at tape-out review. A modern chip tapeout costs $500K–$5M for the mask set. One missed LVS open = wasted silicon run = millions of dollars lost. PV is the last line of defense.

6.1 PV Flow Overview

Physical Verification consists of several distinct checks, each targeting a different failure mode. They must all be run at sign-off, typically in this order:

Flow: GDS II (layout input) → DRC (Design Rule Check; Calibre / ICV) → LVS (Layout vs Schematic; Calibre LVS) → ERC (Electrical Rule Check; Calibre ERC) → Antenna (gate oxide protection; Calibre / PT) → Density (metal fill / CMP rules; Calibre Fill) → Tape-out (all clean; GDS → foundry)
📐
What gets checked?
Every geometric shape in the GDS file is checked against the foundry's Process Design Kit (PDK) rules — spacing, width, overlap, enclosure, density, connectivity, and electrical properties.
🛠️
Primary Tools
Calibre (Siemens EDA, formerly Mentor Graphics): industry-dominant sign-off PV tool. Also: Synopsys IC Validator (ICV) and its predecessor, Synopsys Hercules. Foundry PDKs are certified for Calibre, which makes it the reference.
📋
PDK Rule Deck
Foundry provides a Calibre rule deck (.svrf file) — typically 50,000–200,000 lines of rules. Engineers run this deck against their GDS. Do NOT modify rule decks without foundry approval.

6.2 DRC — Design Rule Check

DRC verifies that your layout geometry satisfies every manufacturing rule in the foundry's PDK. These rules exist because the lithography and etching processes have physical limits — too-small features simply cannot be manufactured reliably.

Understanding DRC Rule Categories
Rule Category | What It Checks | Why It Exists | Typical Fix
Minimum Width | Wire width ≥ Wmin per metal layer | Too-thin wires break during CMP or have excessive resistance / EM risk | Widen the wire; router usually handles this automatically
Minimum Spacing | Gap between same-layer shapes ≥ Smin | Lithography cannot resolve too-small gaps → shorts between wires | Increase routing track separation; re-route in congested area
Via Enclosure | Metal must extend beyond via edge by min amount on all sides | Overlay (misalignment) in fab could expose via without metal contact | Use larger via enclosure design rule; ensure auto-router uses correct rules
Via Coverage | Minimum number of vias on high-current nets | Single via has limited current capacity; EM requires multiple vias | Replace single-cut vias with via arrays; use via doubling ECO
Notch Rule | Internal notch (concave corner) ≥ Nmin | Narrow notches print incorrectly: corners round off → shape deformation | Fill small notches; ensure polygon merging after fill insertion
Area Rule | Minimum enclosed polygon area | Tiny isolated shapes may not print or etch completely | Remove floating metal shapes; merge small disconnected polygons
Extension Rule | Active/poly must extend beyond diffusion edge | Transistor channel defined by overlap; insufficient extension = no transistor | Standard cell library handles this; flagged in custom analog layout
Density Rules | Min/max metal fill % per window per layer | CMP planarization requires uniform metal density across the wafer | Run fill insertion tool (Calibre Fill); remove excess fill if over-dense
Double-Patterning | Adjacent same-mask shapes must be separable into 2 colors | At <20nm, single lithography cannot print minimum pitch → 2 exposures needed | Assign colors using DP-aware router; fix coloring conflicts
Poly Spacing to Diff | Minimum distance between poly gate and nearby diffusion | Gate coupling to adjacent diffusion can cause leakage or latchup | Handled by standard cell design; appears in custom layout
DRC Violation Examples — Before & After (Annotated)
[Figure: six annotated before/after DRC examples]
  • ① Spacing violation: gap between Wire A and Wire B is 6nm (need ≥12nm). Fix: 15nm ✓
  • ② Width violation: wire width is 4nm (FAIL, need ≥8nm). Fix: 10nm ✓ PASS
  • ③ Via enclosure violation: via with no metal enclosure. Fix: enclosure ≥2nm ✓
  • ④ Density violation (too sparse): metal fill in the check window is 8% (need ≥25%). Fix: insert dummy fill → 38% ✓
  • ⑤ Double-patterning conflict: adjacent shapes assigned to the same mask. Fix: assign alternating masks (Mask A, Mask B, Mask A) ✓
  • ⑥ Antenna violation: very long metal wire connected to a gate; ratio > limit → oxide damage. Fix A: jump to a higher layer (added later → less exposure time). Fix B: antenna diode to VSS discharges the accumulated charge.
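The width and spacing checks can be illustrated with a deliberately simplified one-dimensional model: each wire is an interval along a cross-section of one layer. The limits reuse the example numbers from the figure; real DRC engines work on 2D polygons with far richer rules, so this is only a sketch of the idea.

```python
MIN_WIDTH_NM = 8      # example limit from the width case
MIN_SPACING_NM = 12   # example limit from the spacing case

def check_track(wires):
    """wires: list of (name, left_edge_nm, right_edge_nm) on one layer cross-section."""
    violations = []
    ordered = sorted(wires, key=lambda w: w[1])
    # Width check: every wire must be at least MIN_WIDTH_NM wide
    for name, left, right in ordered:
        if right - left < MIN_WIDTH_NM:
            violations.append(("width", name, right - left))
    # Spacing check: gap between each pair of neighbouring wires
    for (a, _, a_right), (b, b_left, _) in zip(ordered, ordered[1:]):
        gap = b_left - a_right
        if gap < MIN_SPACING_NM:
            violations.append(("spacing", f"{a}/{b}", gap))
    return violations

before = [("A", 0, 10), ("B", 16, 20)]   # B is 4nm wide, gap A/B is 6nm
after = [("A", 0, 10), ("B", 25, 35)]    # B widened to 10nm, gap opened to 15nm

print(check_track(before))
print(check_track(after))
```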
Running DRC with Calibre
SHELL — Calibre DRC Invocation
# Calibre DRC batch run
# The rule file is the final positional argument; the layout path and
# top cell are set inside the rule file, not on the command line.
#   -hier      hierarchical mode (faster, reuses repeated cells)
#   -turbo 16  run on 16 CPU threads in parallel
#   -64        64-bit mode for large designs
calibre -drc -hier -turbo 16 -64 ./drc_runset.svrf

// Key statements in the Calibre runset (.svrf); SVRF comments start with //
LAYOUT SYSTEM        GDSII
LAYOUT PATH          "./out/chip_final.gds"   // input layout
LAYOUT PRIMARY       "chip_top"               // top-level cell name
DRC RESULTS DATABASE "drc.results"            // output DB
DRC SUMMARY REPORT   "drc_summary.rpt"        // human-readable summary
DRC MAXIMUM RESULTS  1000                     // stop at 1000 per rule (debug mode)

# Check results (adjust -k to whichever column holds the count in your report)
grep "RULE" drc_summary.rpt | sort -k3 -rn | head -20
# Shows the 20 rules with the most violations: fix these first
💡 Pro DRC Workflow
Don't try to fix DRC violations randomly. Sort by rule then by count — the top 5 rules usually account for 90% of all violations. Fix one rule type at a time using batch ECO scripts. Always re-run DRC after each fix iteration to catch cascading effects (fixing a spacing violation can sometimes introduce a width violation nearby).
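The sort-by-count step can also be scripted. The sketch below is a minimal illustration, not a Calibre utility: the `RULECHECK ... TOTAL Result Count = N` line shape, the rule names, and the helper names `rank_rules` and `coverage` are all assumptions made for the example, so adapt the pattern to the summary report your deck actually produces.

```python
import re
from collections import Counter

# Assumed summary-line shape (hypothetical sample, adjust to your report):
#   RULECHECK M2.S.1 ........ TOTAL Result Count = 40000 (40000)
RULE_LINE = re.compile(r"^\s*RULECHECK\s+(\S+).*?TOTAL Result Count\s*=\s*(\d+)")

def rank_rules(report_text):
    """Return (rule, count) pairs, largest count first, skipping clean rules."""
    counts = Counter()
    for line in report_text.splitlines():
        match = RULE_LINE.match(line)
        if match:
            counts[match.group(1)] += int(match.group(2))
    return [(rule, n) for rule, n in counts.most_common() if n > 0]

def coverage(ranked, top_n):
    """Fraction of all violations explained by the top_n rules."""
    total = sum(n for _, n in ranked)
    return sum(n for _, n in ranked[:top_n]) / total if total else 0.0

# Made-up report text for illustration only.
sample = """\
RULECHECK M2.S.1 ........ TOTAL Result Count = 40000 (40000)
RULECHECK M1.W.1 ........ TOTAL Result Count = 6000 (6000)
RULECHECK VIA1.EN.1 ..... TOTAL Result Count = 0 (0)
RULECHECK M3.DN.1 ....... TOTAL Result Count = 4000 (4000)
"""

ranked = rank_rules(sample)
for rule, n in ranked:
    print(f"{rule:12s} {n}")
print(f"top 2 rules cover {coverage(ranked, 2):.0%} of violations")
```

Printing the coverage figure makes it easy to see whether a handful of rules really dominate before committing to a batch fix.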

6.3 LVS — Layout vs. Schematic

LVS extracts a netlist from your physical layout (by tracing metal connectivity and identifying transistors) and compares it against your reference schematic/netlist. Any mismatch is a critical bug that would cause chip failure.

How LVS Works — Step by Step
[Figure: LVS flow] GDS layout (physical shapes: metal, poly, active, vias) → Extraction (trace connectivity, identify transistors; Calibre LVS extract) → Extracted netlist (M1↔M2 connections + device sizes) → Compare (node-by-node, device-by-device; Calibre LVS) against the reference netlist (from synthesis .v) → either LVS CLEAN ✓ (layout matches netlist) or LVS ERRORS ✗ (opens / shorts / mismatches)
Common LVS Error Types and Root Causes
LVS Error | What It Means | Root Cause | How to Debug
Open Net | A connection present in the schematic is missing in the layout, so the net is broken | Missing wire segment, broken via, net not routed, missing metal fill connection | Highlight the net in the layout viewer. Find the discontinuity. Add the missing wire/via segment.
Short Circuit | Two nets that should be separate are electrically connected in the layout | Routing DRC waiver created a short, accidentally connected polygons, missing wire cut | Identify which two nets are shorted. Find where they touch. Remove the connection or add a cut.
Device Mismatch | Device exists in the schematic but not in the layout (or vice versa) | Cell not placed, wrong cell reference, flatten/unflatten issue, macro not properly instantiated | Compare instance counts. Find the missing instance in the layout. Check hierarchy mapping.
Port Mismatch | Port name or type doesn't match between layout and schematic | Wrong pin label on layout port, renaming in synthesis not propagated to layout, case mismatch | Check label text on layout pins vs netlist port names. Whether case matters depends on the rule deck's case settings, so check them.
Unconnected Port | A port declared in the netlist has no connection in the layout | I/O pad not connected to core, power domain port not properly tied, spare gate left floating | Find the port in the layout. Verify it has a metal label and is connected to the correct net.
Parameter Mismatch | Device dimensions differ between layout and schematic (W/L, capacitor value) | Standard cell used wrong size, analog cell manually edited without updating schematic | Check transistor W/L in layout vs SPICE netlist. Typically only affects analog blocks.
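To make the open/short distinction concrete, here is a toy comparison, assuming each side has already been reduced to a pin-to-net map. The function name `classify_nets`, the pin names, and the net names are hypothetical; a real LVS tool matches devices and nets by graph comparison, which this sketch does not attempt.

```python
from collections import defaultdict

def classify_nets(schematic, layout):
    """Compare two pin->net maps and report shorts and opens.

    schematic, layout: dicts mapping a pin name such as "U1/Y" to a net name.
    Short: pins from different schematic nets share one layout net.
    Open:  pins from one schematic net land on more than one layout net.
    """
    layout_to_sch = defaultdict(set)
    sch_to_layout = defaultdict(set)
    for pin, sch_net in schematic.items():
        lay_net = layout.get(pin)
        if lay_net is None:
            continue  # pin missing in layout: a device/port mismatch, not handled here
        layout_to_sch[lay_net].add(sch_net)
        sch_to_layout[sch_net].add(lay_net)
    shorts = sorted(tuple(sorted(nets)) for nets in layout_to_sch.values() if len(nets) > 1)
    opens = sorted(net for net, lay_nets in sch_to_layout.items() if len(lay_nets) > 1)
    return shorts, opens

# Hypothetical data: n1 and n2 are merged in layout (short); n3 is split (open).
schematic = {"U1/Y": "n1", "U2/A": "n1", "U3/Y": "n2",
             "U4/A": "n2", "U5/Y": "n3", "U6/A": "n3"}
layout = {"U1/Y": "L1", "U2/A": "L1", "U3/Y": "L1",
          "U4/A": "L1", "U5/Y": "L7", "U6/A": "L8"}

shorts, opens = classify_nets(schematic, layout)
print("shorts:", shorts)
print("opens:", opens)
```

The same framing explains why a single physical short can produce a long list of downstream mismatches: once two nets merge, every pin on both is compared against the wrong partner.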
SHELL — Calibre LVS Run + Key Report Fields
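# Calibre LVS run
# As with DRC, the rule file is the final positional argument; the layout
# and the reference (source) netlist are named inside the rule file.
calibre -lvs -hier -turbo 16 -64 ./lvs_runset.svrf

// Key statements in the LVS runset (.svrf):
LAYOUT PATH    "./out/chip_final.gds"
LAYOUT PRIMARY "chip_top"
SOURCE PATH    "./out/chip_netlist.spi"   // reference netlist, in SPICE form
SOURCE PRIMARY "chip_top"
SOURCE SYSTEM  SPICE
LVS REPORT     "lvs_summary.rep"
// The Verilog netlist from synthesis is converted to SPICE first (Calibre's v2lvs utility)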

# LVS report sections to check:
# 1. CIRCUIT COMPARISON RESULTS — Overall PASS/FAIL
# 2. SHORTS — nets merged that shouldn't be
# 3. OPENS — nets split that should be connected
# 4. UNMATCHED NETS — present in one side only
# 5. UNMATCHED INSTANCES — devices missing

# LVS clean confirmation in report:
# "CORRECT" → clean
# "INCORRECT" → failures exist

# Quick grep for errors:
grep -E "INCORRECT|SHORTS|OPENS|Unmatched" lvs_summary.rep

6.4 ERC — Electrical Rule Check

ERC catches electrical issues that DRC and LVS miss. A layout can be DRC-clean and LVS-clean but still have electrical errors that cause chip malfunction.

ERC Check | What It Detects | Consequence if Missed
Floating Gate | MOSFET gate connected to nothing (floating net) | Gate floats to an indeterminate voltage → random switching behavior. Very common ERC error in early PD.
Floating Well | N-well or P-well not connected to VDD/VSS | Well floats → transistors biased incorrectly → latchup risk, parametric failures
VDD/VSS Short | Power and ground nets connected together | Direct short circuit → chip draws excessive current → burns out immediately on power-up
Input Not Driven | Logic input pin with no driver | Input floats → oscillation, metastability, excessive power consumption
Output Contention | Two outputs driving the same net simultaneously | Short circuit between drivers → device damage, incorrect logic level
ESD Violation | I/O pad has insufficient ESD protection structure | ESD event during handling destroys input gate oxide → dead chip before it even runs
Latchup Violation | Tap cells too far from active region (beyond the PDK's maximum tap distance, e.g. >50µm) | Parasitic SCR triggers → VDD-to-VSS latchup → chip permanently damaged

6.5 Antenna Check

During plasma etching in fabrication, metal connected to gate terminals accumulates charge. The antenna ratio is the cumulative metal area divided by the gate area. Exceeding the foundry limit damages the thin gate oxide — permanently degrading or destroying the transistor.

Antenna Ratio
AR = Σ(Connected Metal Area on Layer L) / Gate Oxide Area
⚠️ When Violation Occurs
Foundries specify max AR per metal layer. Typically AR < 400 for M1, AR < 800 for M2+. Violation occurs when a long wire on lower layers is connected to a gate before any higher layer connection breaks the accumulation path.
💡 Two Fix Strategies
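1. Jump to higher layer (preferred): Re-route so the wire hops up to a higher metal layer close to the gate and the long run sits on that higher layer. Higher layers are formed later in the fab process, so while the lower layers are being etched only a short stub is attached to the gate. By the time the long run is connected, the net usually also reaches the driver's diffusion, which gives the charge a discharge path.

2. Insert antenna diode: Place a diode that is reverse-biased in normal operation (cathode on the net, anode to VSS/substrate) at the gate. During fab the diode conducts the plasma-induced charge safely to the substrate before oxide damage occurs.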
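As a rough worked example of the ratio defined above, the sketch below compares a long M1 route with the same route after a layer jump. The areas are made-up numbers chosen for easy arithmetic, the limits are the guide's "typical" figures rather than values from any PDK, and `antenna_ratio` and `passes` are hypothetical helper names.

```python
# Illustrative limits only (the guide's typical values); real limits come from the PDK.
LIMITS = {"M1": 400, "M2": 800}

def antenna_ratio(metal_areas_um2, gate_area_um2):
    """AR = sum of metal area connected to the gate on a layer / gate oxide area."""
    return sum(metal_areas_um2) / gate_area_um2

def passes(layer, metal_areas_um2, gate_area_um2):
    """True when the ratio is below the limit for that layer."""
    return antenna_ratio(metal_areas_um2, gate_area_um2) < LIMITS[layer]

gate_area = 0.25  # um^2, hypothetical gate

# Before: one long M1 wire (150 um^2) lands directly on the gate.
ar_before = antenna_ratio([150.0], gate_area)

# After a layer jump: only a 10 um^2 M1 stub is attached to the gate
# while M1 is being processed; the long run now sits on a higher layer.
ar_after = antenna_ratio([10.0], gate_area)

print(f"before: AR = {ar_before:.0f}, pass = {passes('M1', [150.0], gate_area)}")
print(f"after:  AR = {ar_after:.0f}, pass = {passes('M1', [10.0], gate_area)}")
```

The higher layer still has to satisfy its own limit, which is why the jump is placed close to the gate and not simply anywhere along the wire.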

6.6 Metal Fill & Density Rules

CMP (Chemical Mechanical Polishing) planarizes each metal layer. Non-uniform metal density causes uneven polishing: sparse areas "dish" (metal removed excessively) and dense areas retain more. Both cause via formation failures and reliability issues.

Parameter | Typical Range | Effect of Violation
Minimum metal density | 20–30% per check window | Dishing: metal recedes below ILD surface → via misses metal → open circuit
Maximum metal density | 70–80% per check window | Erosion: ILD polished away → shorts between layers, increased leakage
Check window size | 50×50 µm – 200×200 µm | Foundry-defined. Smaller windows = tighter local control
Fill shape min size | ≥ Wmin per layer | Too-small fill shapes violate width rules themselves
Fill to signal spacing | ≥ 2× normal spacing | Fill too close to signal → coupling capacitance → SI issues
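A density check is just area arithmetic per window. The following sketch assumes a square check window and uses 25% and 75% as stand-in limits from the typical ranges in the table; the function names and the example areas are hypothetical.

```python
def window_density(metal_area_um2, window_side_um):
    """Metal density of one square check window, as a fraction from 0 to 1."""
    return metal_area_um2 / (window_side_um ** 2)

def classify(density, d_min=0.25, d_max=0.75):
    """Compare a window's density against assumed min/max limits."""
    if density < d_min:
        return "too sparse: add fill"
    if density > d_max:
        return "too dense: remove fill"
    return "ok"

def fill_needed(metal_area_um2, window_side_um, target):
    """Extra metal area (um^2) needed to reach the target density, 0 if already there."""
    return max(0.0, target * window_side_um ** 2 - metal_area_um2)

# Hypothetical 100 x 100 um window holding 800 um^2 of routed metal (8%).
before = window_density(800.0, 100.0)
print(f"before fill: {before:.0%} -> {classify(before)}")
print(f"area to add for 25%: {fill_needed(800.0, 100.0, 0.25):.0f} um^2")

# After inserting 3000 um^2 of dummy fill the same window sits at 38%.
after = window_density(3800.0, 100.0)
print(f"after fill:  {after:.0%} -> {classify(after)}")
```

Real decks evaluate overlapping, stepped windows on every layer, so a fill pass that fixes one window can push a neighbor over the maximum.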
SHELL — Calibre Metal Fill Insertion
# Run Calibre fill (after routing, before final DRC)
calibre -drc -hier fill_runset.svrf

// fill_runset.svrf key statements (SVRF comments start with //):
LAYOUT SYSTEM        GDSII
LAYOUT PATH          "chip_prefill.gds"
DRC RESULTS DATABASE "fill_out.gds" GDSII   // GDS with fill added

# Fill is not part of the circuit: it must NOT connect to any signal net
# Most foundry fill decks insert floating (unconnected) metal polygons
# Some advanced PDKs can instead tie fill to VSS for more predictable coupling (optional)

# After fill: re-run DRC to verify:
# 1. Fill shapes themselves don't create new DRC violations
# 2. Fill-to-signal spacing rules satisfied
# 3. Density targets met on all layers

6.7 PV Tool Knowledge

Calibre from Siemens EDA (formerly Mentor Graphics) is the industry-standard sign-off verification tool. Virtually all foundry PDKs are certified for Calibre. A clean Calibre run against the foundry's sign-off deck is the usual DRC acceptance criterion for a GDS submission.

Calibre Mode | Command | Purpose
DRC | calibre -drc -hier drc.svrf | Design rule verification against foundry rules
LVS | calibre -lvs -hier lvs.svrf | Layout vs schematic comparison
PEX/RCX | calibre -xrc -rcx rcx.svrf | Parasitic RC extraction → generates SPEF
Fill | calibre -drc -hier fill.svrf | Insert dummy metal fill to meet density rules
ERC | calibre -erc erc.svrf | Electrical connectivity and latchup checks
PERC | calibre -perc perc.svrf | ESD and latchup reliability analysis
DFM | calibre -dfm dfm.svrf | Design-for-manufacturing: yield improvement checks
Litho Check | calibreLitho -verify | Optical proximity correction / lithography simulation
📌 Calibre Interactive (RVE)
The Results Viewing Environment (RVE) is Calibre's GUI for viewing DRC/LVS errors. Open a results database with calibre -rve drc.results; the layout itself is viewed in Calibre DESIGNrev (calibredrv -m chip.gds) or in the P&R tool. Features: highlight errors in layout, zoom to violation, batch-fix mode, error count by rule, and cross-probe to schematic for LVS.

Synopsys IC Validator (ICV) is Synopsys's native sign-off verification tool, tightly integrated with StarRC (parasitic extraction) and ICC2. Growing adoption especially in designs using the full Synopsys tool chain.

ICV Mode | Command | Purpose
DRC | icv -drc -i chip.gds -c icv_drc.rs | Design rule verification
LVS | icv -lvs -i chip.gds -s netlist.v | Layout vs schematic
Fill | icv -fill -i chip.gds -c fill.rs | Density fill insertion
ERC | icv -erc -i chip.gds | Electrical rule check
In-design DRC | icc2_shell> signoff_check_drc | DRC inside ICC2 during routing, to catch violations early

Feature | Calibre (Siemens) | ICV (Synopsys)
Foundry Certification | Gold standard, all foundries | Certified at major foundries
Tape-out Acceptance | Universally accepted | TSMC, Samsung, GF certified
PD Integration | In-design: Innovus + ICC2 | Native in ICC2
Parasitic Extraction | Calibre xRC/PEX | StarRC (separate tool)
GUI Viewer | Calibre RVE | Custom Error Browser
Speed (large designs) | Excellent (hierarchical) | Excellent (hierarchical)
Rule Deck Language | SVRF / TVF | RSDB / SVRF compatible

6.8 Physical Verification — Interview Questions

1. What is the difference between DRC, LVS, and ERC? Which must pass for tape-out?
DRC (Design Rule Check): Checks layout geometry against foundry manufacturing rules — spacing, width, enclosure. Ensures the chip CAN be manufactured.

LVS (Layout vs Schematic): Extracts connectivity from layout and compares to reference netlist. Ensures the manufactured chip WILL match the intended circuit.

ERC (Electrical Rule Check): Checks for floating nodes, power/ground violations, ESD issues. Ensures the chip WILL WORK electrically.

All three must be clean for tape-out: every remaining violation is either fixed or covered by a documented, foundry-approved waiver. A single unresolved, unwaived error means the foundry may reject the submission or the chip is at risk.
2. What is an antenna violation and how do you fix it?
An antenna violation occurs when the ratio of metal area connected to a gate terminal exceeds the foundry's limit during plasma etching. The metal accumulates charge which can tunnel through and damage the gate oxide — permanently destroying the transistor.

Two fixes:
  1. Layer jump: Hop up to a higher metal layer close to the gate so the long run sits on that higher layer. Higher layers are formed later, so while the lower layers are etched only a short stub is attached to the gate, and by the time the long run connects, the net usually also reaches the driver's diffusion, which provides a discharge path. This is the preferred fix as it adds no area.
  2. Antenna diode: Insert a diode that is reverse-biased in normal operation (cathode on the net, anode to VSS/substrate) at the gate input. During fab, accumulated charge bleeds safely to the substrate through the diode. Adds a small area (~0.5–1 std cell).
3. What causes an LVS short and how do you debug it?
An LVS short means two nets that should be electrically separate in the schematic are connected in the layout.

Common causes:
  • Two wires on the same metal layer touching (a spacing DRC violation that was waived)
  • A misplaced via that lands on two unrelated nets and bridges them
  • Power and signal nets accidentally connected during a manual ECO
  • A stray via or missing cut where two nets cross or share a track
Debug: Open Calibre RVE. Click the shorted net pair. The tool highlights in layout. Zoom to find the touching shapes. Check the DRC results — usually there's a spacing violation co-located with the LVS short. Fix the spacing violation to separate the nets.
4. What is a DRC waiver and when is it appropriate?
A DRC waiver is an explicit exception that suppresses a specific DRC violation from being reported, with foundry documentation justifying why it's acceptable.

Legitimate use cases:
  • Spacing violations in ESD clamp cells — intentionally tight by design, foundry-approved cell
  • Density violations in seal ring or pad ring areas — these regions have special rules
  • Known violations inside foundry-provided hard IP (black box) — foundry-guaranteed correct
Never waive: Violations you don't understand. Violations in custom logic you designed. Always document the waiver with justification and get foundry approval if required. Incorrect waivers = chip failure in fab.
5. Why does metal fill affect timing, and how do you manage it?
Metal fill polygons are floating (unconnected) metal shapes on each layer. Although electrically disconnected, they add parasitic coupling capacitance to nearby signal wires. This:
  • Increases wire capacitance → slower transitions → increased propagation delay
  • Adds coupling between fill and signal → minor crosstalk noise
  • Can shift timing by 2–5% on metal-dense designs
Management:
  • Run fill insertion BEFORE final sign-off STA (not after). STA must include fill parasitics.
  • Foundry fill rules specify minimum fill-to-signal spacing — this limits coupling impact.
  • Some flows use "timing-aware fill" which avoids placing fill near critical nets.
  • StarRC/Calibre xRC re-extraction after fill captures the additional capacitance in SPEF.
6. What is double patterning and which nodes require it?
Double Patterning (DP) splits a single metal layer's patterns into two separate photomasks that are exposed sequentially. After first exposure + etch, the second mask fills in the remaining patterns. Together they achieve pitches half of what single exposure can print.

Required at:
  • 28nm: some critical layers
  • 20nm/16nm: M1, M2, via layers
  • 10nm/7nm: Most metal layers, Fin definition, contact layers
DRC implications: Adjacent wires on a DP layer must be "colorable" — assigned to alternating masks without conflict. A "coloring conflict" occurs when three adjacent wires are too close together to assign alternating colors without two same-color wires violating spacing. Requires routing perturbation to resolve.
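The colorability condition is a graph two-coloring problem: shapes are nodes, and an edge joins any two shapes that are too close to share a mask. The sketch below is a minimal illustration using breadth-first search. The name `assign_masks`, the shape names, and the conflict lists are hypothetical, and production decomposers handle stitching and many more constraints.

```python
from collections import deque

def assign_masks(shapes, conflicts):
    """Try to split shapes across masks "A" and "B".

    shapes:    list of shape names.
    conflicts: pairs of shapes spaced closer than the same-mask minimum.
    Returns (mask_by_shape, None) on success, or (None, offending_pair)
    when an odd cycle makes alternating assignment impossible.
    """
    neighbors = {shape: [] for shape in shapes}
    for a, b in conflicts:
        neighbors[a].append(b)
        neighbors[b].append(a)

    mask = {}
    for start in neighbors:
        if start in mask:
            continue
        mask[start] = "A"
        queue = deque([start])
        while queue:
            cur = queue.popleft()
            for nxt in neighbors[cur]:
                if nxt not in mask:
                    mask[nxt] = "B" if mask[cur] == "A" else "A"
                    queue.append(nxt)
                elif mask[nxt] == mask[cur]:
                    return None, (cur, nxt)  # same-mask neighbors: coloring conflict
    return mask, None

wires = ["w1", "w2", "w3"]

# Three parallel wires where only direct neighbors are too close: colorable.
ok, _ = assign_masks(wires, [("w1", "w2"), ("w2", "w3")])
print(ok)

# All three mutually too close (an odd cycle): not colorable.
bad, pair = assign_masks(wires, [("w1", "w2"), ("w2", "w3"), ("w1", "w3")])
print(bad, pair)
```

Resolving the failing case means removing an edge from the cycle, which in layout terms is the routing perturbation that moves one wire far enough away.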
7. What is Calibre xRC and how does it relate to STA sign-off?
Calibre xRC (eXtracted RC) is Calibre's parasitic extraction engine. It reads the post-route GDS layout and computes the actual resistance (R) and capacitance (C) of every metal wire and via, outputting a SPEF file.

Relationship to STA:
  1. Routing completes → GDS generated
  2. Calibre xRC reads GDS → produces chip.spef
  3. PrimeTime reads chip.spef via read_parasitics
  4. Wire delays computed from real RC → accurate back-annotated timing
  5. Sign-off STA with real parasitics must pass before tape-out
Pre-route timing uses estimated loads (WLM) which can be 20–40% off. Calibre xRC extracts parasitics from the actual routed geometry, typically within ~3% of silicon measurement. Without SPEF from a sign-off extractor such as Calibre xRC, timing sign-off is not reliable.
8. How do you approach a large DRC run with 50,000 violations?
50,000 violations sounds overwhelming but they're usually from just 3–5 root causes. Systematic approach:
  1. Sort by rule, count descending: grep "RULE" drc_summary.rpt | sort -k3 -rn. The top rule might account for 40,000 violations from one root cause.
  2. Fix the top rule first: Understand why it's occurring. Is it a routing configuration issue? A macro halo not set? A missing fill constraint?
  3. Batch fix vs point fix: If 10,000 violations are "M2 spacing" due to track pitch, fix the router configuration and re-route — don't fix them one by one.
  4. Re-run DRC after each major fix: Cascade effects — fixing spacing might introduce new width violations.
  5. Isolate by region: If violations cluster in one area, focus there. Use Calibre's "check window" to run DRC on a sub-region during debug.
9. What is LVS clean vs LVS correct — is there a difference?
LVS clean: Calibre reports "CORRECT" — all nodes match between layout and schematic. No shorts, opens, or device mismatches. This is what tape-out requires.

Important nuance: LVS-clean does NOT guarantee the design is functionally correct. It only guarantees the layout faithfully implements the netlist. If the netlist itself has a bug (wrong logic, timing violation, incorrect constraint), LVS will still pass. That's why functional verification (simulation), STA, and LVS are all independently required — they catch different classes of errors. An LVS-clean, STA-clean chip can still fail functionally if the RTL logic was wrong.
10. What is the seal ring and why does it have special DRC rules?
The seal ring is a continuous ring of metal and active structures running around the perimeter of the die, between the pad ring and the dicing street. Its purposes:
  • Mechanically seals the chip edge against moisture ingress (prevents corrosion)
  • Stops cracks and chipping caused by dicing from propagating into the die
  • Provides a stress buffer between the die bulk and the scribe line
Special DRC: The seal ring intentionally violates several standard DRC rules — it has very narrow/tight structures and rule violations are expected and foundry-approved. Engineers must either exclude the seal ring cell from DRC or use the foundry-provided waiver file that suppresses known seal ring rule violations. Trying to "fix" seal ring DRC violations is a common mistake by junior engineers.
SECTION 07

How to Prepare — Career Roadmap

This section is your end-to-end guide to entering and advancing in VLSI engineering. Whether you're a student targeting your first role or an experienced engineer moving into a specialized domain, follow this structured path. Advice written from the perspective of what hiring managers and senior engineers actually look for.

7.1 VLSI Domains — Which One Is For You?

🔧
RTL Design Engineer
What you do: Write synthesizable Verilog/SystemVerilog. Design microarchitecture — FSMs, pipelines, datapaths. Write verification plans.

Skills needed: SystemVerilog, microarch, timing-aware RTL coding.

Companies: Intel, AMD, ARM, Qualcomm, Apple, NVIDIA (design teams).
⚙️
Synthesis Engineer
What you do: Run synthesis flows (DC/Genus), meet QoR targets, write SDC constraints, perform timing closure at gate level.

Skills needed: TCL scripting, DC/Genus, SDC, timing analysis, QoR optimization.

Companies: Samsung, MediaTek, Marvell, Broadcom.
🗺️
Physical Design Engineer
What you do: Floorplan, power plan, place, CTS, route, close timing post-route. Work in Innovus or ICC2 daily.

Skills needed: Innovus/ICC2, floorplanning, CTS, routing DRC, ECO.

Companies: TSMC, GlobalFoundries, fabless design houses, Apple silicon.
⏱️
STA Engineer
What you do: Sign-off timing across all MMMC corners using PrimeTime/Tempus. Write ECO scripts. Own WNS/TNS/WHS closure.

Skills needed: PrimeTime, SPEF, MMMC, OCV/AOCV, ECO flows.

Companies: Any semiconductor company with tape-out responsibility.
Physical Verification Engineer
What you do: Run DRC, LVS, ERC, Antenna checks. Debug violations. Own Calibre flow. Coordinate tape-out sign-off.

Skills needed: Calibre DRC/LVS, SVRF rule decks, GDS debugging, Calibre xRC.

Companies: Foundry customers, TSMC design enablement, IP companies.
🔬
Verification Engineer (DV)
What you do: Write UVM testbenches, functional coverage, formal verification, emulation. Ensure RTL is functionally correct.

Skills needed: SystemVerilog, UVM, SVA, Questa/VCS, formal tools.

Companies: All major semiconductor companies.

7.2 Learning Roadmap — Fresher to Professional

PHASE 1: FOUNDATION (Months 0–6)
  • Digital Electronics Basics: logic gates, FFs, timing, FSMs
  • Verilog / SystemVerilog: RTL coding, simulation basics
  • CMOS Fundamentals: transistors, gates, delay, power
  • Linux + TCL Scripting: shell, grep, awk, TCL loops
  • Computer Architecture: pipeline, cache, memory hierarchy
  📚 Resources: Weste & Harris CMOS VLSI Design, Patterson & Hennessy Comp Arch

PHASE 2: CORE VLSI (Months 6–18)
  • Logic Synthesis Deep Dive: DC, SDC constraints, QoR
  • Static Timing Analysis: setup/hold, paths, PrimeTime
  • Physical Design Fundamentals: floorplan, place, CTS, route
  • Physical Verification Basics: DRC, LVS, Calibre concepts
  • OpenLane / Free PDK Practice: Sky130, full flow hands-on
  📚 Resources: Rabaey Digital Integrated Circuits, Bhatnagar ASIC Design, Synopsys/Cadence tutorials

PHASE 3: PROFESSIONAL (Year 1–3)
  • MMMC + OCV/AOCV Mastery: full sign-off corner analysis
  • Advanced CTS Optimization: skew groups, useful skew, ccopt
  • Low-Power Design (UPF): multi-Vt, power domains, CPF
  • Timing Closure ECO Flow: PT-ECO, post-route, fill-aware
  • Tape-out Ownership: sign-off checklist, tapeout flow
  📚 Resources: Company flows, Synopsys/Cadence training, internal project experience

PHASE 4: EXPERT (Year 3+)
  • Advanced Node PDK (3nm/5nm): FinFET, GAA, double-patterning
  • EDA Tool Development: OpenROAD, Python EDA scripting
  • Methodology Ownership: define flows for whole team/project
  • Research / Advanced Topics: POCV, ML-assisted PD, 3D-IC
  • Technical Leadership: Staff/Principal Engineer role
  📚 Resources: IEEE TCAD, DAC/ICCAD papers, internal tapeout retrospectives

7.3 Essential Tools — What to Learn and How

🎯 Reality Check
Industry EDA tools (DC, Innovus, PrimeTime, Calibre) are expensive and require a license. As a student, use the free alternatives below to build hands-on experience. Hiring managers know you won't have industry tool access — they want to see that you understand the concepts and can demonstrate hands-on work with open-source equivalents.
Domain | Industry Tool | Free/Open Alternative | How to Practice
Synthesis | Synopsys DC / Cadence Genus | Yosys (open source) | Synthesize your Verilog designs with Yosys. Understand liberty files. Write SDC constraints manually. Compare area reports.
Place & Route | Cadence Innovus / Synopsys ICC2 | OpenROAD via OpenLane2 | Use OpenLane with Sky130 PDK. Run full RTL-to-GDS on a small design (UART, I2C, simple CPU). Examine each output.
STA | Synopsys PrimeTime / Cadence Tempus | OpenSTA (inside OpenLane) | Read timing reports from OpenSTA. Understand slack calculation. Introduce timing violations manually and fix them.
Simulation | Synopsys VCS / Cadence Xcelium / Siemens Questa | Verilator / Icarus Verilog | Write testbenches. Simulate your RTL. View waveforms in GTKWave. Practice writing self-checking testbenches.
Physical Verification | Calibre DRC/LVS | Magic VLSI / KLayout | Open Sky130 GDS in KLayout. Inspect metal layers. Run built-in DRC checks. Understand what each layer represents.
Parasitic Extraction | Calibre xRC / Synopsys StarRC | OpenRCX (inside OpenROAD) | Run OpenRCX on a placed-and-routed design. Examine the SPEF output. Understand how RC values affect timing.
Waveform Viewing | Synopsys DVE / Cadence SimVision | GTKWave | View VCD dumps from Verilator/Icarus simulation. Practice reading waveforms, adding cursors, measuring timing.
Layout Editing | Cadence Virtuoso / Synopsys Custom Compiler / Siemens L-Edit | Magic VLSI / KLayout | Draw simple standard cells in Magic. Understand how transistors form. See the connection between schematic and layout.
OpenLane Quick Start — Full RTL-to-GDS in One Command
SHELL — OpenLane2 with Sky130 PDK
# Install OpenLane2 (requires Docker or Nix)
pip install openlane

# Create a minimal design config
mkdir my_design && cd my_design
cat > config.json << 'EOF'
{
  "DESIGN_NAME": "my_alu",
  "VERILOG_FILES": "src/alu.v",
  "CLOCK_PORT": "clk",
  "CLOCK_PERIOD": 10,
  "FP_CORE_UTIL": 40,
  "PL_TARGET_DENSITY": 0.4
}
EOF

# Run complete RTL-to-GDS flow
openlane config.json

# Outputs you'll find under runs/RUN_*/ (exact directory names vary by OpenLane version):
# synthesis    → gate-level netlist (.v)
# floorplan    → DEF with core/IO defined
# placement    → placed cells DEF
# cts          → clock tree built DEF
# routing      → fully routed DEF + GDS
# signoff      → timing reports (OpenSTA) and DRC results (KLayout/Magic)

# View final GDS in KLayout (substitute the actual run directory for RUN_*):
klayout runs/RUN_*/final/gds/my_alu.gds

7.4 Interview Preparation Plan — 8 Weeks

Week | Topic Focus | What to Study | Practice Task
Week 1 | Digital Fundamentals | Setup/hold time, metastability, clock domains, timing diagrams, flip-flop operation | Draw timing diagrams by hand. Explain setup violation to a friend without notes.
Week 2 | Synthesis Concepts | RTL-to-netlist flow, SDC constraints (create_clock, set_input/output_delay, false_path, multicycle_path), QoR metrics (WNS/TNS) | Write a complete SDC file for a simple design from memory. Run Yosys synthesis on a small Verilog module.
Week 3 | STA Deep Dive | Setup/hold slack formulas, 4 path types, timing reports, OCV/AOCV, propagated clock, MMMC corners | Manually calculate setup slack for a given circuit. Read a full PrimeTime report and identify violations.
Week 4 | Physical Design | Floorplan formulas (utilization, AR), IR drop, CTS (skew/latency/uncertainty), routing DRC rules, Innovus vs ICC2 | Run OpenLane on a UART or I2C controller. Examine floorplan DEF, routing layers, DRC results.
Week 5 | Physical Verification | DRC categories, LVS flow, antenna violations, metal fill/density, Calibre commands, ERC checks | Open a Sky130 GDS in KLayout. Identify metal layers. Find a DRC violation and understand which rule it breaks.
Week 6 | Advanced Topics | CDC (2FF synchronizer, set_clock_groups), low power (clock gating, multi-Vt, UPF), timing closure ECO flow | Write a CDC synchronizer in Verilog. Simulate it with an asynchronous signal crossing. Check that the synchronized output is clean and delayed by two destination clocks (plain RTL simulation does not model metastability itself).
Week 7 | Mock Interviews | Work through all 90 Q&As in this guide. Time yourself. Answer out loud, not just in your head. | Do 3 mock interviews with a peer or use a mirror. Record yourself. Identify weak areas and go back to Week 2–6.
Week 8 | Company-Specific Prep | Research target company's products. Know their process node (e.g., TSMC 5nm, Samsung 3nm). Read recent conference papers from their engineers. | Prepare 3–5 intelligent questions to ask the interviewer. Show you understand their specific domain challenges.

7.5 What Interviewers Actually Evaluate

✅ What Gets You Hired
  • Can explain why, not just what. "Why does hold analysis use the fast corner?" shows deep understanding.
  • Hands-on experience — even with open-source tools. Running OpenLane end-to-end beats "I studied PD in class."
  • Correct use of units and numbers. "Skew under 50ps," not "skew should be small."
  • Knows limits and tradeoffs. "Increasing drive strength fixes setup but increases power and may cause hold violations."
  • Asks clarifying questions before answering — demonstrates engineering mindset.
  • Admits uncertainty honestly: "I haven't used Genus directly but DC concepts are the same — let me explain my DC knowledge."
  • Connects concepts: "DRC-clean layout is needed before Calibre xRC extraction which feeds sign-off STA."
❌ Common Mistakes That Fail Candidates
  • Memorizing answers without understanding. Interviewers probe with follow-up questions — memorized answers collapse immediately.
  • "I know the theory but haven't used the tools." Every VLSI job requires tool proficiency. Use open-source tools to fill this gap.
  • Getting confused between setup and hold. This is the most basic STA concept — if you mix them up, the interview ends.
  • Not knowing which corner is used for setup vs hold analysis. This comes up in almost every STA interview.
  • Saying "I would just rerun synthesis" to fix a post-route timing violation. Late-stage fixes must be ECO-based — no full rerun.
  • Cannot explain what SPEF is and why it's needed. This is fundamental to any STA sign-off conversation.
  • Treating LVS-clean as the same as functionally correct. Interviewers know this is a common misconception.
⚡ The Most Common Interview Question — And How to Answer It Properly
"Explain setup and hold time." — Almost every VLSI interview starts here. Wrong answer: "Setup is the time before clock, hold is the time after clock."

Right answer: "Setup time is the minimum duration that data must be stable at the flip-flop input before the active clock edge, so the FF can reliably capture it. Hold time is the minimum duration data must remain stable after the clock edge. Violating setup causes data to arrive late — the FF may not capture the correct value. Violating hold causes data to change too quickly — the FF may capture the new value instead of the intended one. Critically, hold violations cause failures at all frequencies, not just high speed — they're structural, not a speed problem. That's why they're fixed with delay buffer insertion rather than clock frequency reduction."
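The claim that hold is independent of frequency falls straight out of the slack equations. The sketch below uses the common single-cycle textbook forms with all times in picoseconds; the numbers are invented for illustration, and `skew` is taken here to mean capture clock arrival minus launch clock arrival.

```python
def setup_slack(period, clk_to_q, comb_max, t_setup, skew=0):
    """Setup slack = required time - arrival time, checked at the next clock edge."""
    data_required = period + skew - t_setup
    data_arrival = clk_to_q + comb_max
    return data_required - data_arrival

def hold_slack(clk_to_q, comb_min, t_hold, skew=0):
    """Hold slack = arrival time - required time, checked at the same clock edge.

    Note that the clock period does not appear anywhere in this expression.
    """
    data_arrival = clk_to_q + comb_min
    data_required = skew + t_hold
    return data_arrival - data_required

# Setup: a 950 ps path fails at a 1000 ps period and passes once the clock is slowed.
print(setup_slack(period=1000, clk_to_q=100, comb_max=950, t_setup=50))
print(setup_slack(period=1200, clk_to_q=100, comb_max=950, t_setup=50))

# Hold: a 20 ps short path fails regardless of period; adding 30 ps of buffer delay fixes it.
print(hold_slack(clk_to_q=100, comb_min=20, t_hold=60, skew=80))
print(hold_slack(clk_to_q=100, comb_min=50, t_hold=60, skew=80))
```

Changing `period` moves only the setup result, which is the arithmetic behind fixing hold with delay insertion.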

7.6 Essential Books, Courses & Resources

📖
Foundational Books
  • Weste & Harris — CMOS VLSI Design (the bible)
  • Rabaey et al. — Digital Integrated Circuits
  • Patterson & Hennessy — Computer Organization & Design
  • Bhatnagar — Advanced ASIC Chip Synthesis (DC-specific)
  • Elmore — RC delay modeling papers
🌐
Online Resources
  • efabless.com — Free chip tapeout with Sky130
  • openroad.tools — Open-source RTL-to-GDS
  • vlsiuniverse.com — VLSI interview prep
  • Synopsys SolvNetPlus — Official DC/PT docs
  • Cadence Training — Innovus/Genus tutorials
  • IEEE Xplore — DAC, ICCAD, CICC papers
🎓
Courses & Projects
  • MIT 6.004 — Computation Structures (free)
  • Coursera: HDL & FPGA — Entry RTL practice
  • Build a RISC-V CPU in Verilog, synthesize it
  • Tape-out on Sky130 via efabless chipIgnite
  • Contribute to OpenROAD — visible open-source work
  • ICCAD Contests — Register for student competitions

7.7 A Day in the Life — By Role

Physical Design Engineer — Typical Day (Pre-Tapeout)
Time | Activity | Tools Used
9:00 AM | Check overnight Innovus place-and-route run results. Review DRC violation count trend and timing summary. | Innovus GUI, log parser scripts
9:30 AM | Team stand-up: report WNS/TNS status, blocking issues (congested area, unresolvable DRC in macro boundary) | Confluence, Jira
10:00 AM | Debug 3 specific DRC violations near a macro corner that have resisted automatic fixing. Manually re-route 2 wires. | Innovus ECO route, DRC GUI
11:30 AM | Review CTS results. Skew is 220ps, above the 200ps target. Adjust ccopt settings, re-run CTS on critical clock domain. | Innovus ccopt_design
1:30 PM | Run post-route STA to check setup/hold after morning ECO changes. Two new hold violations introduced by yesterday's buffer insertion. | Tempus, timing reports
2:30 PM | Fix hold violations by inserting delay buffers on 2 short paths. Re-run route_opt for those nets. | Innovus ecoAddDelay
3:30 PM | Meeting with STA team to align on acceptable WNS margin at this stage of the project. |
4:00 PM | Write TCL script to automate tomorrow's overnight run: place → CTS → route → extractRC → STA → DRC. Submit to compute farm. | TCL, LSF job scheduler
5:00 PM | Update project tracking spreadsheet. Document today's changes. Review tomorrow's schedule. | Excel, Confluence
STA Engineer — Typical Day (Sign-off Phase)
Time | Activity | Tools Used
9:00 AM | Review overnight PT sign-off results across 5 MMMC corners. Identify which corners still have violations. | PrimeTime, report parser
9:45 AM | WNS is -0.04ns at the func_slow corner on one clock domain. Identify the critical path: reg-to-reg through a wide adder. | PT report_timing
10:15 AM | Run PT-ECO to generate fix suggestions (upsize 3 cells, insert 1 buffer). Review suggestions for reasonableness. | PT fix_eco_timing
10:45 AM | Send ECO script to PD team for implementation in Innovus. Coordinate on expected turnaround. | Email, Jira ticket
11:30 AM | Review hold corner (func_fast): clean. Review scan corner (scan_slow): 2 violations. Update tracking spreadsheet. | PrimeTime, Excel
1:30 PM | New SPEF delivered from PD team after yesterday's route ECO. Run full PT update on all 5 corners. ~2 hour runtime. | PrimeTime update_timing
3:30 PM | Results back: func_slow now +0.02ns (CLEAN). Scan corner improved to -0.01ns; one path remains. Document. | PrimeTime, Confluence
4:00 PM | Debug the remaining scan violation: a multicycle path (set_multicycle_path) whose hold adjustment is wrong. Fix SDC, re-run. | PrimeTime, SDC editor
5:00 PM | Submit final overnight run with updated SDC. Send status email to project lead: "func_slow CLEAN, scan_slow -0.01ns in progress" | LSF, email
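The per-corner review in this schedule comes down to computing WNS and TNS for each corner and flagging the corners that are still negative. A minimal Python sketch, assuming endpoint slacks have already been parsed out of the timing reports into a dictionary. The corner names match the table, but the slack values and the `summarize_corners` helper are hypothetical.

```python
def summarize_corners(slacks_by_corner):
    """Return {corner: (wns, tns, violating_endpoints)} from endpoint slacks in ns."""
    summary = {}
    for corner, slacks in slacks_by_corner.items():
        negative = [s for s in slacks if s < 0]
        wns = min(slacks)             # worst slack; positive when the corner is clean
        tns = round(sum(negative), 3) # total negative slack; 0 when the corner is clean
        summary[corner] = (wns, tns, len(negative))
    return summary


# Hypothetical endpoint slacks in ns, one list per corner.
slacks = {
    "func_slow": [-0.04, 0.10, 0.25],
    "func_fast": [0.03, 0.08],
    "scan_slow": [-0.01, -0.02, 0.30],
}

summary = summarize_corners(slacks)
for corner, (wns, tns, count) in summary.items():
    status = "CLEAN" if count == 0 else "VIOLATED"
    print(f"{corner}: WNS={wns:+.2f}ns TNS={tns:+.2f}ns endpoints={count} {status}")
```

Note that this sketch reports the true minimum slack as WNS even for a clean corner, which matches the "+0.02ns (CLEAN)" style of the status line above; some reports clamp WNS to zero instead.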
Physical Verification Engineer — Typical Day (Pre-Tapeout)
Time | Activity | Tools Used
9:00 AM | Review overnight Calibre DRC results. 142 violations remain (down from 2,400 last week). Classify by rule. | Calibre RVE, shell scripts
9:30 AM | Top rule: M3_SPACING.2, with 47 violations, all in one macro boundary region. Root cause: macro halo not set correctly. | Calibre RVE, Innovus
10:00 AM | Fix: adjust macro halo in Innovus, re-run routing around that macro. Generate new GDS for the next DRC run. | Innovus, script
11:00 AM | LVS run completed overnight: 3 opens found. Debug: all 3 are on VDD tie-off cells that weren't properly connected after fill insertion. | Calibre RVE, layout viewer
11:45 AM | Fix: add missing metal connections in Innovus. Verify fix with quick LVS on the affected nets only. | Innovus ECO, Calibre partial LVS
1:30 PM | Antenna check: 8 violations remain, all on NAND gate inputs with long M1 wires. Add jumper vias to M3 for 6 of them; insert 2 antenna diodes for the others. | Innovus antenna fixer, Calibre
3:00 PM | Submit full DRC/LVS/Antenna run to compute farm with new GDS. Expected 4 hours runtime. | LSF compute farm
3:30 PM | Prepare tape-out checklist. Verify all IP blocks have current DRC waivers. Coordinate with fab liaison on GDS delivery window. | Confluence checklist, email
5:00 PM | Update PV sign-off dashboard. Send status: "DRC: 142→TBD tonight, LVS: 3 opens fixed, Antenna: 8→2 fixes sent to PD" | Dashboard, email
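The "classify by rule" step at 9:00 AM is a simple grouping: count violations per rule and sort so the biggest offender comes first. A minimal Python sketch, assuming the DRC results have already been parsed into (rule, location) pairs. Only M3_SPACING.2 and its count of 47 come from the schedule; the other rule names, the coordinates, and the `classify_by_rule` helper are hypothetical.

```python
from collections import Counter


def classify_by_rule(violations):
    """Count violations per rule name, most frequent first."""
    counts = Counter(rule for rule, _location in violations)
    return counts.most_common()


# Hypothetical parsed results: (rule name, (x, y) location in microns).
violations = (
    [("M3_SPACING.2", (120.0 + i, 45.0)) for i in range(47)]
    + [("M2_WIDTH.1", (10.0, 5.0 + i)) for i in range(3)]
    + [("VIA1_ENCLOSURE.3", (77.5, 12.0))]
)

ranked = classify_by_rule(violations)
for rule, count in ranked:
    print(f"{rule}: {count}")
```

Ranking by count is what surfaces a shared root cause: when one rule dominates and its locations cluster in one region, as with the macro halo in the table, a single fix usually clears the whole group.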
Synthesis Engineer — Typical Day (Synthesis Closure)
Time | Activity | Tools Used
9:00 AM | Review overnight compile_ultra run. WNS = -0.28ns, TNS = -15.4ns. 47 violating endpoints. Area 4.2 mm². | DC, QoR scripts
9:30 AM | Identify top 5 violating paths, all through the FP multiply unit. Discuss with RTL team: can MCP be applied? | DC report_timing, email
10:00 AM | RTL team confirms multiplier is 2-cycle. Add MCP to SDC. Re-run compile_ultra -incremental on that path group. | DC compile_ultra -incremental
11:00 AM | New WNS = -0.06ns, TNS = -0.9ns. Good progress. Remaining violations are in the interconnect arbiter. | DC, report_qor
1:30 PM | Try a path group with higher weight on the arbiter timing paths. Also try ungroup on the arbiter sub-module to allow cross-boundary optimization. | DC group_path, ungroup
2:30 PM | Check max-capacitance violations: 12 found on the high-fanout reset net. Set dont_touch on clock buffers. Insert buffer tree on reset. | DC set_max_fanout, compile
3:30 PM | Run power analysis with SAIF from simulation. Dynamic power = 380mW, 15% over target. Apply clock gating and increase HVT cell usage on non-critical paths. | DC, Power Compiler
4:30 PM | Submit overnight full compile_ultra run with updated SDC and power optimizations. Write synthesis run notes for the team. | LSF, Confluence
5:00 PM | Write handoff email to PD team with current netlist, mapped SDC, and QoR summary noting areas of concern for floorplanning. | Email
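The 3:30 PM power figure can be turned into a concrete reduction goal with a little arithmetic. The Python sketch below assumes that "15% over target" means measured power equals target times 1.15; the schedule does not state the target itself, and the `power_gap` helper is hypothetical.

```python
def power_gap(measured_mw, overshoot):
    """Return (target_mw, required_cut_mw, cut_as_fraction_of_measured).

    Assumes "X% over target" means measured = target * (1 + overshoot).
    """
    target = measured_mw / (1.0 + overshoot)
    cut = measured_mw - target
    return target, cut, cut / measured_mw


# 380 mW measured, 15% over target (figures from the schedule).
target_mw, cut_mw, cut_fraction = power_gap(380.0, 0.15)
print(f"target = {target_mw:.1f} mW, cut needed = {cut_mw:.1f} mW "
      f"({cut_fraction:.1%} of current power)")
```

Under that assumption the target is roughly 330 mW, so clock gating and the HVT swap together need to remove about 50 mW, which is about 13% of the current figure rather than 15%.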

7.8 Skills Proficiency Matrix

Rate yourself honestly against this matrix. Target "Intermediate" in your primary domain and "Awareness" in adjacent domains before your first interview. "Advanced" in your domain is the 3–5 year mark.

Skill Area | Awareness | Intermediate (Hire-ready) | Advanced (3–5yr)
Verilog / SV | Can read RTL code. Knows gates, FFs, always blocks. | Writes synthesizable RTL. Understands latch vs FF inference. Codes FSMs correctly. | Writes parameterized, reusable RTL. Knows synthesis implications of every construct.
Synthesis / SDC | Knows flow: RTL → netlist. Knows create_clock exists. | Writes complete SDC. Runs DC. Interprets QoR. Understands WNS/TNS. | Tunes compile strategies. Multi-Vt optimization. compile_ultra deep settings.
STA | Can define setup/hold time. Knows slack formula. | Reads PT timing reports. Understands MMMC. Knows OCV derating. Can close timing with ECO. | POCV, CRPR, PBA vs GBA. Develops full MMMC corner methodology for a project.
Physical Design | Knows PD flow stages. Can explain utilization and skew. | Can floorplan a block. Runs Innovus/OpenROAD full flow. Understands DRC and IR drop. | Closes timing at advanced nodes. Owns CTS strategy. Designs power grid from scratch.
Physical Verification | Knows DRC/LVS purpose. Can identify a spacing violation. | Runs Calibre DRC/LVS. Debugs top violation types. Understands antenna and fill. | Owns tape-out PV sign-off. Writes Calibre runset modifications. DP-aware verification.
TCL Scripting | Can read/modify existing TCL scripts. Knows variables, loops, procs. | Writes TCL flow scripts from scratch. Parses timing reports. Automates batch jobs. | Writes complex flow automation, QoR parsers, automatic ECO generators in TCL/Python.
Low Power | Knows clock gating saves power. Knows LVT has more leakage. | Sets up multi-Vt optimization in synthesis. Understands UPF domains and level shifters. | Designs full multi-voltage UPF architecture. Owns power sign-off (Voltus/RedHawk).