SOC: RISC-V System-on-Chip
A complete system-on-chip built around a 32-bit RISC-V CPU (RV32IM) running on a Tang Nano 20K FPGA. The CPU connects to eight peripherals through a shared 32-bit bus, with HDMI video output, 8-channel audio synthesis, SD card storage, 64 Mbit SDRAM, and a UART serial console.
text
               ┌────────────┐
               │   RISC-V   │
               │    CPU     │
               │  (SOURCE)  │
               └──────┬─────┘
                      │ SIMPLE_BUS
 ┌────────────────────┴─────────────────────────────────────┐
 │                         arbiter                          │
 └─┬───────┬──────┬──────┬────────┬────────┬──────┬───────┬─┘
   │       │      │      │        │        │      │       │
┌─────┐ ┌─────┐ ┌────┐ ┌────┐ ┌──────┐ ┌──────┐ ┌────┐ ┌─────┐
│ ROM │ │ RAM │ │LED │ │UART│ │SDRAM │ │ Term │ │ SD │ │Audio│
│0x0_ │ │0x1_ │ │0x2_│ │0x3_│ │ 0x4_ │ │ 0x5_ │ │0x6_│ │0x7_ │
└─────┘ └─────┘ └────┘ └────┘ └──┬───┘ └──────┘ └──┬─┘ └─────┘
                                 │                 │
                               SDRAM            SD Card
                              64 Mbit            (SPI)
INFO
Audio support is a work in progress and not currently functional.
Bus Architecture
The SIMPLE_BUS definition in the project file describes the shared bus:
jz
BUS SIMPLE_BUS {
OUT [32] ADDR;
OUT [1] CMD;
OUT [1] VALID;
INOUT [32] DATA;
IN [1] DONE;
}
The CPU declares a BUS SIMPLE_BUS SOURCE port. Each peripheral declares a BUS SIMPLE_BUS TARGET port. Directions resolve automatically: OUT from the CPU becomes IN at each peripheral. The INOUT DATA signal participates in tristate resolution: the compiler proves at compile time that exactly one peripheral drives DATA at any moment.
Modules
global
Shared constants and enumerations used across all modules via @global:
- OP: 8-bit opcodes for the accumulator CPU variant (NOP through ST_X, 25 instructions).
- STATE: CPU state machine states (FETCH through HALT, 12 states).
- WAVE: Audio waveform types (SQUARE, TRIANGLE, SAWTOOTH, NOISE).
- ENV: ADSR envelope states (IDLE, ATTACK, DECAY, SUSTAIN, RELEASE).
- CMD: Bus commands (READ=0, WRITE=1).
rv_cpu — RISC-V RV32IM CPU
A multi-cycle 32-bit RISC-V implementation supporting the base integer (RV32I) and multiply/divide (M) extensions.
State machine (4-bit): FETCH → WAIT_FETCH → DECODE → EXECUTE → MEM_WAIT/RMW_WAIT/MULDIV_WAIT → WRITEBACK.
Instruction support:
- LUI, AUIPC, JAL, JALR
- Branches: BEQ, BNE, BLT, BGE, BLTU, BGEU
- Loads: LB, LH, LW, LBU, LHU (with sign/zero extension)
- Stores: SW, SH, SB (sub-word stores use read-modify-write via RMW_WAIT)
- ALU: ADD, SUB, SLL, SLT, SLTU, XOR, SRL, SRA, OR, AND (immediate and register variants)
- Multiply/divide: MUL, MULH, MULHSU, MULHU, DIV, DIVU, REM, REMU
- CSR: CSRRW, CSRRS, CSRRC and immediate variants
- Trap: ECALL, EBREAK, MRET
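As a rough software model of the load handling listed above, the following Python sketch selects a byte or halfword lane from a 32-bit bus word and applies sign or zero extension. The byte-lane choice mirrors the MEM_WAIT code (address bits [1:0] pick DATA[7:0], DATA[15:8], and so on); the halfword selection by address bit 1 and the function name `extract_load` are assumptions made for illustration.

```python
# funct3 encodings for loads, as listed in the F3 global
LB, LH, LW, LBU, LHU = 0b000, 0b001, 0b010, 0b100, 0b101


def extract_load(word: int, addr_lo: int, funct3: int) -> int:
    """Return the 32-bit register value for a load from a 32-bit bus word."""
    word &= 0xFFFFFFFF
    if funct3 in (LB, LBU):
        # addr_lo (address bits [1:0]) selects one of four byte lanes
        value = (word >> (8 * addr_lo)) & 0xFF
        if funct3 == LB and value & 0x80:
            value |= 0xFFFFFF00  # sign-extend bit 7
    elif funct3 in (LH, LHU):
        # Assumption: address bit 1 selects the lower or upper halfword
        value = (word >> (16 * (addr_lo >> 1))) & 0xFFFF
        if funct3 == LH and value & 0x8000:
            value |= 0xFFFF0000  # sign-extend bit 15
    else:
        value = word  # LW: full word, no extension
    return value
```

For example, `extract_load(0x12345680, 0, LB)` sign-extends the byte `0x80`, while the same call with `LBU` returns it zero-extended.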
Interrupt handling: IRQ lines are checked at FETCH. Trap entry saves PC to mepc, sets mcause, copies MIE→MPIE, clears MIE, and jumps to mtvec. MRET restores MIE from MPIE.
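The trap sequence above can be sketched as a small Python model of the mstatus updates. The bit positions (MIE at bit 3, MPIE at bit 7) come from the RISC-V privileged specification; the function names are hypothetical, and the model covers only the steps this section describes, not every side effect the specification or rv_csr may implement.

```python
MIE = 1 << 3   # mstatus.MIE, global machine interrupt enable
MPIE = 1 << 7  # mstatus.MPIE, previous value of MIE


def trap_enter(mstatus: int, pc: int, cause: int) -> dict:
    """Save PC and cause, copy MIE into MPIE, then clear MIE."""
    mpie = MPIE if mstatus & MIE else 0
    new_status = (mstatus & ~(MIE | MPIE)) | mpie
    return {"mstatus": new_status & 0xFFFFFFFF, "mepc": pc, "mcause": cause}


def mret(mstatus: int) -> int:
    """Restore MIE from MPIE (the only MRET effect described above)."""
    mie = MIE if mstatus & MPIE else 0
    return ((mstatus & ~MIE) | mie) & 0xFFFFFFFF
```

Entering a trap with interrupts enabled (`mstatus = 0x8`) yields `mstatus = 0x80`, so a nested interrupt cannot fire until MRET restores MIE.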
Shadow register file: The register file (rv_regfile) maintains a shadow bank of 31 registers for zero-overhead trap context switching. When shadow_mode is active, reads and writes target the shadow bank, preserving the interrupted program's registers without software save/restore.
rv_alu — Combinational ALU
Implements all RV32I ALU operations via funct3 decoding. The alt bit selects SUB over ADD and SRA over SRL. Signed comparison uses the ssub sign bit. Shifts use b[4:0] as the shift amount. Entirely combinational, with no registers.
rv_muldiv — Multiply/Divide Unit
Single-cycle multiply (64-bit product via hardware multiplier, selecting upper or lower 32 bits). 32-cycle restoring division with sign correction. Handles edge cases: division by zero returns -1 as the quotient and the dividend as the remainder; signed overflow (-2^31 / -1) returns -2^31 as the quotient.
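The division edge cases can be checked against a short reference model. This Python sketch follows the RISC-V M-extension rules for DIV/DIVU/REM/REMU; the remainder of 0 for the overflow case comes from the RISC-V specification rather than from this document, and the helper names are hypothetical.

```python
MASK = 0xFFFFFFFF


def to_signed(x: int) -> int:
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


def div_rem(a: int, b: int, signed: bool = True) -> tuple:
    """Return (quotient, remainder) as 32-bit patterns."""
    a &= MASK
    b &= MASK
    if b == 0:
        return MASK, a  # divide by zero: quotient all ones (-1), remainder = dividend
    if not signed:
        return a // b, a % b
    sa, sb = to_signed(a), to_signed(b)
    if sa == -(1 << 31) and sb == -1:
        return 0x80000000, 0  # signed overflow: quotient -2^31, remainder 0 (per spec)
    q = abs(sa) // abs(sb)    # magnitude division, as a restoring divider would do
    if (sa < 0) != (sb < 0):
        q = -q                # sign correction: quotient truncates toward zero
    r = sa - q * sb           # remainder takes the sign of the dividend
    return q & MASK, r & MASK
```

A model like this is mainly useful as a golden reference when comparing simulation output from the hardware divider.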
rv_csr — CSR Register File
M-mode CSR registers: mstatus (MIE/MPIE/MPP), mie, mtvec, mepc, mcause, mtval, mcycle (free-running counter). Custom CSRs at 0xBC0-0xBC5: clock frequency, video mode, baud divider, SDRAM size, IRQ/SD card vectors. IRQ line status readable at 0xFC0.
arbiter
Template-based address decoder routing the CPU bus to 8 targets. Each target has a config entry matched against ADDR[31:28] using ((addr ^ value) & care) == 0. Three @template blocks handle matching, DONE collection, and VALID/ADDR/CMD routing with DATA aliasing for tristate pass-through.
Address map: ROM=0x0_, RAM=0x1_, LED=0x2_, UART=0x3_, SDRAM=0x4_, Terminal=0x5_, SD=0x6_, Audio=0x7_.
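To make the match rule concrete, here is a small Python model of the decode. Each 8-bit config entry is assumed to hold the value in its upper nibble and the care mask in its lower nibble, as the address-map comments in the SOC module suggest (`val=0001 care=1111 = 8'h1F`); the function and list names are illustrative only.

```python
# Config entries for targets 0..7 (ROM first), value nibble then care nibble
MAP_CONFIG = [0x0F, 0x1F, 0x2F, 0x3F, 0x4F, 0x5F, 0x6F, 0x7F]
NAMES = ["ROM", "RAM", "LED", "UART", "SDRAM", "Term", "SD", "Audio"]


def select_target(addr: int):
    """Return the name of the target selected by ADDR[31:28], or None."""
    nibble = (addr >> 28) & 0xF
    for index, entry in enumerate(MAP_CONFIG):
        value, care = entry >> 4, entry & 0xF
        if ((nibble ^ value) & care) == 0:
            return NAMES[index]
    return None  # no target claims this address
```

With every care mask set to `1111`, each target matches exactly one value of the top nibble, so addresses at `0x8000_0000` and above select nothing.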
block_ram — 20 KB RAM
5120 × 32-bit BLOCK memory (10 BSRAM banks). Two-stage read pipeline: assert address → data ready next cycle. Writes complete in one cycle. Maps bus address via ADDR[14:2].
rom — 16 KB Boot ROM
4096 × 32-bit BLOCK memory initialized from bios.hex via @file. Read-only with the same two-stage pipeline as RAM. Address mapped via ADDR[13:2].
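The bit slices used by the RAM and ROM (ADDR[14:2] and ADDR[13:2]) convert a byte address into a word index by dropping the two low bits. A minimal Python sketch of that mapping, with a hypothetical helper name:

```python
def word_index(addr: int, hi: int, lo: int = 2) -> int:
    """Extract addr[hi:lo] as an integer word index."""
    width = hi - lo + 1
    return (addr >> lo) & ((1 << width) - 1)


# RAM: ADDR[14:2] gives a 13-bit index; only 0..5119 are backed by memory
ram_last = word_index(0x1000_4FFC, 14)

# ROM: ADDR[13:2] gives a 12-bit index covering all 4096 words
rom_last = word_index(0x0000_3FFC, 13)
```

Because the upper address bits are ignored by the slice, the region selected by the arbiter determines which memory responds.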
sdram — SDRAM Controller
Low-level command sequencer for the GW2AR-18's embedded 64 Mbit SDRAM (2M × 32, 11-bit row, 8-bit column, 2-bit bank).
Initialization: 200µs power-up wait → PRECHARGE ALL → two AUTO REFRESH cycles → MODE REGISTER SET (CL=2, burst=1).
11-state machine: INIT → IPRE → IREF → IMODE → IDLE → ACT_W → ACT → RD → RD_CL → WR → REF. Auto-precharge (A10=1) is used for both reads and writes. Refresh fires every ~7.8µs.
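The timing figures above translate into clock-cycle counts that depend on the system clock. The following Python arithmetic is a sketch only: how the controller actually derives its counters from CLK_FREQ_MHZ is not shown in this document, so the rounding choices and function names here are assumptions.

```python
import math


def min_cycles(duration_us: float, clk_mhz: float) -> int:
    """Cycles needed to wait at least duration_us (round up)."""
    return math.ceil(duration_us * clk_mhz)


def max_cycles(interval_us: float, clk_mhz: float) -> int:
    """Cycles that fit within interval_us (round down, so refresh is never late)."""
    return math.floor(interval_us * clk_mhz)


# At the real 74.25 MHz system clock
init_wait = min_cycles(200, 74.25)        # power-up wait
refresh_period = max_cycles(7.8, 74.25)   # refresh interval

# With the integer CLK_FREQ_MHZ = 74 from the project CONFIG
init_wait_cfg = min_cycles(200, 74)
refresh_period_cfg = max_cycles(7.8, 74)
```

Using the integer 74 instead of 74.25 makes the power-up wait slightly shorter than 200µs in real time, which may or may not matter for the target device.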
Tristate control: sdram_dq is driven during writes (r_dq_oe == 1'b1) and released to high-Z during reads.
sdram_bus — SDRAM Bus Wrapper
Adapts the raw SDRAM controller to the SIMPLE_BUS protocol. A 2-state machine (IDLE/WAIT) latches the bus address and data, asserts rd/wr, and waits for the controller's done signal before signaling bus DONE. Address mapping: pbus.ADDR[22:2] → 21-bit controller address.
led_out
Single 32-bit register whose low bits drive the 6 LEDs via data[5:0]. Reads return the current register value.
uart — UART Controller
Wraps uart_tx and uart_rx sub-modules. Register map at two offsets:
- Offset 0x0: read returns {30'b0, rx_has_data, tx_ready}; write sends DATA[7:0].
- Offset 0x4: read returns the received byte and clears rx_has_data.
Both TX and RX are 8N1 with configurable baud via a baud_div input from the CPU's CSR. IRQ outputs signal TX ready (rising edge) and RX data available.
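From software, the UART register map could be handled roughly as follows. This Python sketch only decodes the status word layout given above ({30'b0, rx_has_data, tx_ready}, so tx_ready is bit 0 and rx_has_data is bit 1); the constant and function names are hypothetical, and the base address is taken from the arbiter address map.

```python
UART_BASE = 0x3000_0000
UART_STATUS_TX = UART_BASE + 0x0  # read: status, write: transmit byte
UART_RX = UART_BASE + 0x4         # read: received byte, clears rx_has_data


def decode_status(word: int) -> dict:
    """Split the offset 0x0 status word into its two flags."""
    return {
        "tx_ready": bool(word & 0b01),
        "rx_has_data": bool(word & 0b10),
    }
```

A polling driver would wait for `tx_ready` before writing to offset 0x0, and check `rx_has_data` before reading offset 0x4.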
sdcard — SD Card SPI Controller
Full SPI-mode SD card interface with 512-byte sector buffer, CMD0/8/55/58/ACMD41 initialization sequence, block read/write with CRC, and DMA handshake for direct-to-RAM writes. Register map includes command, status, sector address, data buffer, and IRQ control. CS gap enforcement between commands per SD specification.
video — DVI/HDMI Video Output
Mode-switchable video pipeline supporting 720p@60Hz (80×22 text, 1280×720) and 1080p@30Hz (120×33 text, 1920×1080), both at 74.25 MHz pixel clock. A 5-stage pipeline reads character and attribute data from the terminal framebuffer, fetches font bitmaps from ROM, and produces TMDS-encoded output. Each cell is 16×32 pixels with RGB565 foreground/background colors. Cursor rendering supports 4 styles (underline, block, blinking variants).
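The text grid sizes follow directly from the 16×32 cell size. A quick Python check, assuming partial rows at the bottom of the screen are simply not used:

```python
CELL_W, CELL_H = 16, 32  # cell size in pixels


def text_grid(width: int, height: int) -> tuple:
    """Return (columns, rows) of whole cells that fit in the frame."""
    return width // CELL_W, height // CELL_H


grid_720p = text_grid(1280, 720)
grid_1080p = text_grid(1920, 1080)

# The largest grid must fit in the 12-bit vram_addr space
cells_1080p = grid_1080p[0] * grid_1080p[1]
```

The 1080p grid needs 3960 cells, which fits within the 4096 entries addressable by a 12-bit address.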
video_timing
Dual-mode CEA-861 timing generator. Mode 0: 1280×720@60Hz (1650×750 total). Mode 1: 1920×1080@30Hz (2200×1125 total). Positive sync polarity for both modes.
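Both modes share one pixel clock because their total frame sizes and refresh rates multiply out to the same value. The arithmetic, as a short Python check:

```python
def pixel_clock_hz(h_total: int, v_total: int, refresh_hz: int) -> int:
    """Pixel clock = total pixels per frame (including blanking) x frames per second."""
    return h_total * v_total * refresh_hz


mode0 = pixel_clock_hz(1650, 750, 60)    # 1280x720 @ 60 Hz
mode1 = pixel_clock_hz(2200, 1125, 30)   # 1920x1080 @ 30 Hz
```

Both evaluate to 74,250,000 Hz, which is why switching modes does not require reconfiguring the PLL.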
terminal — Terminal Framebuffer
Dual-BSRAM character/attribute storage with separate sys_clk (CPU) and pixel_clk (video) ports. Register-mapped interface for cell read/write, cursor position/style, and hardware-accelerated CLEAR and SCROLL_UP commands via a 6-state FSM. Supports up to 120×33 cells.
audio — 8-Channel Audio Synthesizer
Eight independent audio channels, each with selectable waveform (square, triangle, sawtooth, noise), 24-bit frequency, 8-bit volume, 8-bit pan, 8-bit duty cycle, and full ADSR envelope. A 128-sample stereo ring buffer (DISTRIBUTED RAM) feeds the output. Register-mapped per-channel configuration at bus offsets grouped by channel index.
aud_gen — Audio Channel Generator
Single voice with waveform synthesis: square wave (phase vs. duty comparison), triangle (phase fold), sawtooth (direct phase), or noise (16-bit Galois LFSR, taps 16/14/13/11). ADSR envelope with 16-bit accumulator and configurable attack/decay/sustain/release rates. Output: wave × envelope × volume >> 24.
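The noise source can be modeled in a few lines of Python. Taps 16/14/13/11 correspond to the well-known toggle mask 0xB400 for a right-shifting Galois LFSR; the shift direction, the seed, and the function names are assumptions here, since the aud_gen source is not shown.

```python
TAP_MASK = 0xB400  # bits 15, 13, 12, 10 (taps 16, 14, 13, 11)


def lfsr_step(state: int) -> int:
    """Advance a 16-bit right-shifting Galois LFSR by one step."""
    lsb = state & 1
    state >>= 1
    if lsb:
        state ^= TAP_MASK  # toggle tap bits when a 1 is shifted out
    return state


def lfsr_period(seed: int = 0xACE1) -> int:
    """Count steps until the register returns to the (nonzero) seed."""
    state = lfsr_step(seed)
    steps = 1
    while state != seed:
        state = lfsr_step(state)
        steps += 1
    return steps
```

This tap set is maximal-length, so any nonzero seed cycles through all 65535 nonzero states; a seed of zero would lock the register at zero.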
aud_mixer — 8-Channel Stereo Mixer
Sums 8 channels with per-channel pan (0=left, 128=center, 255=right). Each channel is scaled by (255-pan) for left and pan for right. Master volume applied via smul. Output clamped to ±0x7FFF on overflow.
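The pan and clamp behavior can be illustrated with a small Python model. The (255-pan)/pan weighting and the ±0x7FFF clamp come from the description above; the normalizing right shift by 8 is an assumption, master volume is left out, and the function names are hypothetical.

```python
LIMIT = 0x7FFF


def clamp(value: int) -> int:
    """Saturate to the symmetric 16-bit range used by the mixer output."""
    return max(-LIMIT, min(LIMIT, value))


def mix(channels) -> tuple:
    """Mix (sample, pan) pairs into a clamped (left, right) pair.

    pan: 0 = left, 128 = center, 255 = right.
    """
    left = sum(sample * (255 - pan) for sample, pan in channels)
    right = sum(sample * pan for sample, pan in channels)
    # Assumption: scale the 8-bit pan weight back down with a shift by 8
    return clamp(left >> 8), clamp(right >> 8)
```

Note that a channel panned fully left still loses a little amplitude under this scaling (a weight of 255/256), and eight loud channels at center will saturate both outputs.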
cpu_accumulator — Alternative Simple CPU
A 32-bit accumulator-based CPU with A/X registers, 16-bit stack pointer, and flags (Z/C/N). Included as a simpler alternative to the RISC-V for testing. Same bus interface.
por — Power-On Reset
16-cycle delay after DONE assertion before releasing reset.
jz
@project(CHIP="GW2AR-18-QN88-C8-I7") SIMPLE_SOC
@import "global.jz"
@import "soc.jz"
@import "rv_cpu.jz"
@import "rv_regfile.jz"
@import "rv_alu.jz"
@import "rv_csr.jz"
@import "rv_muldiv.jz"
@import "por.jz"
@import "rom.jz"
@import "block_ram.jz"
@import "led_out.jz"
@import "arbiter.jz"
@import "uart_tx.jz"
@import "uart.jz"
@import "uart_rx.jz"
@import "sdram.jz"
@import "sdram_bus.jz"
@import "terminal.jz"
@import "video.jz"
@import "video_timing.jz"
@import "tmds_encoder.jz"
@import "sdcard.jz"
@import "aud_gen.jz"
@import "aud_mixer.jz"
@import "audio.jz"
CONFIG {
DATA_WIDTH = 32;
ADDR_WIDTH = 32;
CLK_FREQ_MHZ = 74;
SDRAM_SIZE_BYTES = 8388608;
}
CLOCKS {
SCLK = { period=37.037 }; // 27MHz crystal
sys_clk; // 74.25MHz system (from CLKDIV)
serial_clk; // 371.25MHz (5x pixel clock, from PLL)
pixel_clk; // 74.25MHz pixel clock (from CLKDIV)
}
IN_PINS {
SCLK = { standard=LVCMOS33 };
DONE = { standard=LVCMOS33 };
KEY[2] = { standard=LVCMOS33 };
UART_RX = { standard=LVCMOS33 };
SDIO_D0 = { standard=LVCMOS33 };
}
OUT_PINS {
LED[6] = { standard=LVCMOS33, drive=8 };
UART_TX = { standard=LVCMOS33, drive=8 };
SDIO_CLK = { standard=LVCMOS33, drive=8 };
SDIO_CMD = { standard=LVCMOS33, drive=8 };
SDIO_D3 = { standard=LVCMOS33, drive=8 };
TMDS_CLK = { mode=DIFFERENTIAL, standard=LVDS25, drive=3.5, width=10, fclk = serial_clk, pclk = pixel_clk, reset = pll_lock };
TMDS_DATA[3] = { mode=DIFFERENTIAL, standard=LVDS25, drive=3.5, width=10, fclk = serial_clk, pclk = pixel_clk, reset = pll_lock };
O_sdram_clk = { standard=LVCMOS33, drive=8 };
O_sdram_cke = { standard=LVCMOS33, drive=8 };
O_sdram_cs_n = { standard=LVCMOS33, drive=8 };
O_sdram_cas_n = { standard=LVCMOS33, drive=8 };
O_sdram_ras_n = { standard=LVCMOS33, drive=8 };
O_sdram_wen_n = { standard=LVCMOS33, drive=8 };
O_sdram_dqm[4] = { standard=LVCMOS33, drive=8 };
O_sdram_addr[11] = { standard=LVCMOS33, drive=8 };
O_sdram_ba[2] = { standard=LVCMOS33, drive=8 };
}
INOUT_PINS {
IO_sdram_dq[32] = { standard=LVCMOS33, drive=8 };
}
MAP {
// System Clock (27MHz)
SCLK = 4;
// 2 Buttons (active low)
KEY[0] = 88;
KEY[1] = 87;
// 6 LEDs (active low)
LED[0] = 15;
LED[1] = 16;
LED[2] = 17;
LED[3] = 18;
LED[4] = 19;
LED[5] = 20;
// UART
UART_TX = 69;
UART_RX = 70;
// SDCard
SDIO_D3 = 81;
SDIO_D0 = 84;
SDIO_CLK = 83;
SDIO_CMD = 82;
// DVI TMDS differential pairs
TMDS_CLK = { P=33, N=34 };
TMDS_DATA[0] = { P=35, N=36 };
TMDS_DATA[1] = { P=37, N=38 };
TMDS_DATA[2] = { P=39, N=40 };
// DONE (POR signal)
DONE = IOR32B;
// SDRAM
O_sdram_clk = IOR11B;
O_sdram_cke = IOL13A;
O_sdram_cs_n = IOL14B;
O_sdram_cas_n = IOL14A;
O_sdram_ras_n = IOL13B;
O_sdram_wen_n = IOL12B;
O_sdram_dqm[0] = IOL12A;
O_sdram_dqm[1] = IOR11A;
O_sdram_dqm[2] = IOL18A;
O_sdram_dqm[3] = IOR15B;
O_sdram_addr[0] = IOR14A;
O_sdram_addr[1] = IOR13B;
O_sdram_addr[2] = IOR14B;
O_sdram_addr[3] = IOR15A;
O_sdram_addr[4] = IOL16B;
O_sdram_addr[5] = IOL17B;
O_sdram_addr[6] = IOL16A;
O_sdram_addr[7] = IOL17A;
O_sdram_addr[8] = IOL15B;
O_sdram_addr[9] = IOL15A;
O_sdram_addr[10] = IOR12B;
O_sdram_ba[0] = IOR13A;
O_sdram_ba[1] = IOR12A;
IO_sdram_dq[0] = IOL3A;
IO_sdram_dq[1] = IOL3B;
IO_sdram_dq[2] = IOL8A;
IO_sdram_dq[3] = IOL8B;
IO_sdram_dq[4] = IOL9A;
IO_sdram_dq[5] = IOL9B;
IO_sdram_dq[6] = IOL11A;
IO_sdram_dq[7] = IOL11B;
IO_sdram_dq[8] = IOR9B;
IO_sdram_dq[9] = IOR9A;
IO_sdram_dq[10] = IOR5B;
IO_sdram_dq[11] = IOR6A;
IO_sdram_dq[12] = IOR5A;
IO_sdram_dq[13] = IOR4B;
IO_sdram_dq[14] = IOR3B;
IO_sdram_dq[15] = IOR3A;
IO_sdram_dq[16] = IOL39B;
IO_sdram_dq[17] = IOL39A;
IO_sdram_dq[18] = IOL35B;
IO_sdram_dq[19] = IOL35A;
IO_sdram_dq[20] = IOL30B;
IO_sdram_dq[21] = IOL30A;
IO_sdram_dq[22] = IOL20A;
IO_sdram_dq[23] = IOL18B;
IO_sdram_dq[24] = IOR17A;
IO_sdram_dq[25] = IOR16A;
IO_sdram_dq[26] = IOR16B;
IO_sdram_dq[27] = IOR17B;
IO_sdram_dq[28] = IOR18A;
IO_sdram_dq[29] = IOR18B;
IO_sdram_dq[30] = IOR44A;
IO_sdram_dq[31] = IOR44B;
}
BUS SIMPLE_BUS {
OUT [32] ADDR;
OUT [1] CMD;
OUT [1] VALID;
INOUT [32] DATA;
IN [1] DONE;
}
CLOCK_GEN {
PLL {
IN REF_CLK SCLK; // 27MHz crystal
OUT BASE serial_clk; // 371.25 MHz (5x pixel clock)
WIRE LOCK pll_lock;
CONFIG {
IDIV = 3; // divider = 4
FBDIV = 54; // multiplier = 55
ODIV = 2; // VCO = 371.25 * 2 = 742.5 MHz
};
};
// ┌────────────┬───────────┐
// │ Resolution │ Refresh │
// ├────────────┼───────────┤
// │ 1080p │ 30Hz │
// │ 720p │ 60Hz │
// └────────────┴───────────┘
CLKDIV {
IN REF_CLK serial_clk;
OUT BASE pixel_clk; // 371.25 / 5 = 74.25 MHz
CONFIG {
DIV_MODE = 5;
};
};
// ┌──────────┬───────────────┐
// │ DIV_MODE │ sys_clk │
// ├──────────┼───────────────┤
// │ 2 │ 185.625 MHz │
// │ 3.5 │ 106.07 MHz │
// │ 4 │ 92.8125 MHz │
// │ 5 │ 74.25 MHz │
// │ 8 │ 46.41 MHz │
// └──────────┴───────────────┘
// Note: don't forget to set CLK_FREQ_MHZ above
CLKDIV {
IN REF_CLK serial_clk;
OUT BASE sys_clk;
CONFIG {
DIV_MODE = 5;
};
};
}
@top SOC {
IN [1] sclk = sys_clk;
IN [1] rst_n = ~KEY[0];
IN [1] done = DONE;
IN [1] pixel_clk = pixel_clk;
OUT [6] leds = LED;
OUT [1] tx = UART_TX;
IN [1] rx = UART_RX;
OUT [10] tmds_clk = TMDS_CLK;
OUT [10] tmds_d0 = TMDS_DATA[0];
OUT [10] tmds_d1 = TMDS_DATA[1];
OUT [10] tmds_d2 = TMDS_DATA[2];
OUT [1] sd_clk_pin = SDIO_CLK;
OUT [1] sd_mosi_pin = SDIO_CMD;
IN [1] sd_miso_pin = SDIO_D0;
OUT [1] sd_cs_n_pin = SDIO_D3;
OUT [1] sdram_cke = O_sdram_cke;
OUT [1] sdram_cs_n = O_sdram_cs_n;
OUT [1] sdram_ras_n = O_sdram_ras_n;
OUT [1] sdram_cas_n = O_sdram_cas_n;
OUT [1] sdram_wen_n = O_sdram_wen_n;
OUT [4] sdram_dqm = O_sdram_dqm;
OUT [11] sdram_addr = O_sdram_addr;
OUT [2] sdram_ba = O_sdram_ba;
INOUT [32] sdram_dq = IO_sdram_dq;
OUT [1] sdram_clk_out = O_sdram_clk;
}
@endproj
jz
// Opcodes (8-bit)
@global OP
NOP = 8'h00;
LDI_A = 8'h01;
LDI_X = 8'h02;
LD_A = 8'h03;
ST_A = 8'h04;
ADD = 8'h05;
SUB = 8'h06;
AND = 8'h07;
OR = 8'h08;
XOR = 8'h09;
CMP = 8'h0A;
JMP = 8'h0B;
BEQ = 8'h0C;
BNE = 8'h0D;
PUSH = 8'h0E;
POP = 8'h0F;
CALL = 8'h10;
RET = 8'h11;
HLT = 8'h12;
INC = 8'h13;
DEC = 8'h14;
SHL = 8'h15;
SHR = 8'h16;
LD_X = 8'h17;
ST_X = 8'h18;
@endglob
// CPU state machine states
@global STATE
FETCH = 4'b0000;
WAIT_FETCH = 4'b0001;
DECODE = 4'b0010;
EXECUTE = 4'b0011;
MEM_READ = 4'b0100;
MEM_WAIT = 4'b0101;
WRITEBACK = 4'b0110;
PUSH_EXEC = 4'b0111;
POP_EXEC = 4'b1000;
CALL_PUSH = 4'b1001;
RET_POP = 4'b1010;
HALT = 4'b1111;
@endglob
// Audio waveform types
@global WAVE
SQUARE = 3'd0;
TRIANGLE = 3'd1;
SAWTOOTH = 3'd2;
NOISE = 3'd3;
@endglob
// Audio envelope states
@global ENV
IDLE = 3'd0;
ATTACK = 3'd1;
DECAY = 3'd2;
SUSTAIN = 3'd3;
RELEASE = 3'd4;
@endglob
// Bus commands
@global CMD
READ = 1'b0;
WRITE = 1'b1;
@endglob
jz
@module SOC
PORT {
IN [1] sclk;
IN [1] rst_n;
IN [1] done;
IN [1] pixel_clk;
OUT [6] leds;
OUT [1] tx;
IN [1] rx;
// HDMI/DVI TMDS outputs
OUT [10] tmds_clk;
OUT [10] tmds_d0;
OUT [10] tmds_d1;
OUT [10] tmds_d2;
// SDRAM physical interface
OUT [1] sdram_cke;
OUT [1] sdram_cs_n;
OUT [1] sdram_ras_n;
OUT [1] sdram_cas_n;
OUT [1] sdram_wen_n;
OUT [4] sdram_dqm;
OUT [11] sdram_addr;
OUT [2] sdram_ba;
INOUT [32] sdram_dq;
OUT [1] sdram_clk_out;
// SD card SPI pins
OUT [1] sd_clk_pin;
OUT [1] sd_mosi_pin;
IN [1] sd_miso_pin;
OUT [1] sd_cs_n_pin;
}
WIRE {
por_n [1];
reset [1];
cpu_bus [widthof(SIMPLE_BUS)];
rom_bus [widthof(SIMPLE_BUS)];
ram_bus [widthof(SIMPLE_BUS)];
led_bus [widthof(SIMPLE_BUS)];
uart_bus [widthof(SIMPLE_BUS)];
sdram_bus [widthof(SIMPLE_BUS)];
term_bus [widthof(SIMPLE_BUS)];
sd_bus [widthof(SIMPLE_BUS)];
audio_bus [widthof(SIMPLE_BUS)];
led_sw [6];
uart_tx_pin [1];
uart_rx_pin [1];
uart_irq_tx [1];
uart_irq_rx [1];
cpu_irq_lines [32];
sdcard_irq [1];
audio_irq [1];
sd_clk_w [1];
sd_mosi_w [1];
sd_cs_n_w [1];
// Video mode wire
video_mode_w [1];
// Baud rate divider
baud_div_w [16];
// Video read interface wires
vram_addr [12];
vram_char [8];
vram_attr [32];
// Cursor wires
cursor_pos_w [12];
cursor_style_w [3];
// TMDS output wires
tmds_clk_w [10];
tmds_d0_w [10];
tmds_d1_w [10];
tmds_d2_w [10];
// SDRAM internal wires
sdram_cke_w [1];
sdram_cs_n_w [1];
sdram_ras_n_w [1];
sdram_cas_n_w [1];
sdram_wen_n_w [1];
sdram_dqm_w [4];
sdram_addr_w [11];
sdram_ba_w [2];
sdram_dq_w [32];
}
@new por0 por {
IN [1] clk = sclk;
IN [1] done = done;
OUT [1] por_n = por_n;
}
// Address map (4-bit decode on addr[31:28]):
// ROM: 0x0000_0000 - 0x0000_3FFF addr[31:28]=0x0 → val=0000 care=1111 = 8'h0F
// RAM: 0x1000_0000 - 0x1000_4FFF addr[31:28]=0x1 → val=0001 care=1111 = 8'h1F
// LED: 0x2000_0000 addr[31:28]=0x2 → val=0010 care=1111 = 8'h2F
// UART: 0x3000_0000 addr[31:28]=0x3 → val=0011 care=1111 = 8'h3F
// SDRAM: 0x4000_0000 - 0x407F_FFFF addr[31:28]=0x4 → val=0100 care=1111 = 8'h4F
// TERM: 0x5000_0000 addr[31:28]=0x5 → val=0101 care=1111 = 8'h5F
// SD: 0x6000_0000 addr[31:28]=0x6 → val=0110 care=1111 = 8'h6F
// AUDIO: 0x7000_0000 addr[31:28]=0x7 → val=0111 care=1111 = 8'h7F
@new arb0 arbiter {
OVERRIDE {
TARGET_COUNT = 8;
}
IN [64] map_config = {8'h7F, 8'h6F, 8'h5F, 8'h4F, 8'h3F, 8'h2F, 8'h1F, 8'h0F};
BUS SIMPLE_BUS TARGET [1] src = {cpu_bus};
BUS SIMPLE_BUS SOURCE [8] tgt = {audio_bus, sd_bus, term_bus, sdram_bus, uart_bus, led_bus, ram_bus, rom_bus};
}
@new cpu0 cpu {
OVERRIDE {
CLK_FREQ_MHZ = CONFIG.CLK_FREQ_MHZ;
SDRAM_SIZE_BYTES = CONFIG.SDRAM_SIZE_BYTES;
}
IN [1] clk = sclk;
IN [1] rst_n = reset;
IN [32] irq_lines = cpu_irq_lines;
BUS SIMPLE_BUS SOURCE pbus = cpu_bus;
OUT [1] video_mode = video_mode_w;
OUT [16] baud_div = baud_div_w;
}
@new rom0 rom {
IN [1] clk = sclk;
IN [1] rst_n = reset;
BUS SIMPLE_BUS TARGET pbus = rom_bus;
}
@new ram0 ram {
IN [1] clk = sclk;
IN [1] rst_n = reset;
BUS SIMPLE_BUS TARGET pbus = ram_bus;
}
@new led0 led_out {
IN [1] clk = sclk;
IN [1] rst_n = reset;
BUS SIMPLE_BUS TARGET pbus = led_bus;
OUT [6] leds = led_sw;
}
@new uart0 uart {
IN [1] clk = sclk;
IN [1] rst_n = reset;
BUS SIMPLE_BUS TARGET pbus = uart_bus;
OUT [1] tx = uart_tx_pin;
IN [1] rx = uart_rx_pin;
OUT [1] irq_tx_ready = uart_irq_tx;
OUT [1] irq_rx_data = uart_irq_rx;
IN [16] baud_div = baud_div_w;
}
@new sdram0 sdram_bus {
OVERRIDE {
CLK_FREQ_MHZ = CONFIG.CLK_FREQ_MHZ;
}
IN [1] clk = sclk;
IN [1] rst_n = reset;
BUS SIMPLE_BUS TARGET pbus = sdram_bus;
OUT [1] sdram_cke = sdram_cke_w;
OUT [1] sdram_cs_n = sdram_cs_n_w;
OUT [1] sdram_ras_n = sdram_ras_n_w;
OUT [1] sdram_cas_n = sdram_cas_n_w;
OUT [1] sdram_wen_n = sdram_wen_n_w;
OUT [4] sdram_dqm = sdram_dqm_w;
OUT [11] sdram_addr = sdram_addr_w;
OUT [2] sdram_ba = sdram_ba_w;
INOUT [32] sdram_dq = sdram_dq_w;
}
@new term0 terminal_fb {
IN [1] clk = sclk;
IN [1] rst_n = reset;
IN [1] pixel_clk = pixel_clk;
BUS SIMPLE_BUS TARGET pbus = term_bus;
IN [12] vram_addr = vram_addr;
OUT [8] vram_char = vram_char;
OUT [32] vram_attr = vram_attr;
OUT [12] cursor_pos = cursor_pos_w;
OUT [3] cursor_style = cursor_style_w;
}
@new sd0 sdcard {
IN [1] clk = sclk;
IN [1] rst_n = reset;
BUS SIMPLE_BUS TARGET pbus = sd_bus;
OUT [1] irq = sdcard_irq;
OUT [1] sd_clk = sd_clk_w;
OUT [1] sd_mosi = sd_mosi_w;
IN [1] sd_miso = sd_miso_pin;
OUT [1] sd_cs_n = sd_cs_n_w;
}
@new aud0 audio {
IN [1] clk = sclk;
IN [1] rst_n = reset;
BUS SIMPLE_BUS TARGET pbus = audio_bus;
OUT [1] irq = audio_irq;
}
@new vid0 video_out {
IN [1] pixel_clk = pixel_clk;
IN [1] rst_n = reset;
IN [1] video_mode = video_mode_w;
OUT [12] vram_addr = vram_addr;
IN [8] vram_char = vram_char;
IN [32] vram_attr = vram_attr;
IN [12] cursor_pos = cursor_pos_w;
IN [3] cursor_style = cursor_style_w;
OUT [10] tmds_clk = tmds_clk_w;
OUT [10] tmds_d0 = tmds_d0_w;
OUT [10] tmds_d1 = tmds_d1_w;
OUT [10] tmds_d2 = tmds_d2_w;
}
ASYNCHRONOUS {
reset <= rst_n & por_n;
// IRQ lines: bit 0 = UART TX ready, bit 1 = UART RX data, bit 2 = SD card, bit 3 = audio
cpu_irq_lines <= {28'd0, audio_irq, sdcard_irq, uart_irq_rx, uart_irq_tx};
uart_rx_pin <= rx;
// Active-low LEDs, all 6 software controlled
leds <= ~led_sw;
tx <= uart_tx_pin;
// SDRAM physical pins
sdram_cke <= sdram_cke_w;
sdram_cs_n <= sdram_cs_n_w;
sdram_ras_n <= sdram_ras_n_w;
sdram_cas_n <= sdram_cas_n_w;
sdram_wen_n <= sdram_wen_n_w;
sdram_dqm <= sdram_dqm_w;
sdram_addr <= sdram_addr_w;
sdram_ba <= sdram_ba_w;
sdram_dq = sdram_dq_w;
// SDRAM clock (inverted for setup/hold margin)
sdram_clk_out <= ~sclk;
// SD card SPI pins
sd_clk_pin <= sd_clk_w;
sd_mosi_pin <= sd_mosi_w;
sd_cs_n_pin <= sd_cs_n_w;
// HDMI/DVI TMDS outputs
tmds_clk <= tmds_clk_w;
tmds_d0 <= tmds_d0_w;
tmds_d1 <= tmds_d1_w;
tmds_d2 <= tmds_d2_w;
}
@endmod
jz
// RV32IM CPU (RV32I base integer ISA plus M extension)
// 32-bit RISC-V, multi-cycle implementation
// Register file: x0=zero, x1-x31 general purpose
// Bus interface: 32-bit byte address, 32-bit data
// CPU state machine
@global RVS
FETCH = 4'h0;
WAIT_FETCH = 4'h1;
DECODE = 4'h2;
EXECUTE = 4'h3;
MEM_WAIT = 4'h4;
WRITEBACK = 4'h5;
RMW_WAIT = 4'h6;
MULDIV_WAIT = 4'h7;
HALT = 4'hF;
@endglob
// RV32I opcode groups (instr[6:0])
@global RVO
LUI = 7'b0110111;
AUIPC = 7'b0010111;
JAL = 7'b1101111;
JALR = 7'b1100111;
BRANCH = 7'b1100011;
LOAD = 7'b0000011;
STORE = 7'b0100011;
ALU_IMM = 7'b0010011;
ALU_REG = 7'b0110011;
FENCE = 7'b0001111;
SYSTEM = 7'b1110011;
@endglob
// RV32I funct3 values
@global F3
ADD = 3'b000;
SLL = 3'b001;
SLT = 3'b010;
SLTU = 3'b011;
XOR = 3'b100;
SRL = 3'b101;
OR = 3'b110;
AND = 3'b111;
BEQ = 3'b000;
BNE = 3'b001;
BLT = 3'b100;
BGE = 3'b101;
BLTU = 3'b110;
BGEU = 3'b111;
LB = 3'b000;
LH = 3'b001;
LW = 3'b010;
LBU = 3'b100;
LHU = 3'b101;
@endglob
// CSR funct3 values (SYSTEM opcode)
@global CF3
CSRRW = 3'b001;
CSRRS = 3'b010;
CSRRC = 3'b011;
CSRRWI = 3'b101;
CSRRSI = 3'b110;
CSRRCI = 3'b111;
@endglob
@module cpu
CONST {
CLK_FREQ_MHZ = 54;
SDRAM_SIZE_BYTES = 0;
}
PORT {
IN [1] clk;
IN [1] rst_n;
IN [32] irq_lines;
BUS SIMPLE_BUS SOURCE pbus;
OUT [1] video_mode;
OUT [16] baud_div;
}
REGISTER {
// Program counter (byte address)
pc [32] = 32'h00000000;
// State machine
state [4] = 4'h0;
// Instruction register
instr [32] = 32'h00000000;
// Decoded register values
rs1_val [32] = 32'h00000000;
rs2_val [32] = 32'h00000000;
// Decoded immediate
imm_val [32] = 32'h00000000;
// Writeback
wb_data [32] = 32'h00000000;
wb_rd [5] = 5'd0;
next_pc [32] = 32'h00000000;
// Memory access tracking
mem_funct3 [3] = 3'b000;
mem_addr_lo [2] = 2'b00;
// Bus control registers
bus_addr [32] = 32'h00000000;
bus_data [32] = 32'h00000000;
bus_cmd [1] = 1'b0;
bus_valid [1] = 1'b0;
// Shadow register bank select (0=normal, 1=trap/ISR)
shadow_mode [1] = 1'b0;
}
WIRE {
// Decoded instruction fields
opcode [7];
rd [5];
funct3 [3];
rs1_addr [5];
rs2_addr [5];
funct7 [7];
// Register file interconnect
rf_rs1_data [32];
rf_rs2_data [32];
rf_wr_en_w [1];
rf_wr_addr_w [5];
rf_wr_data_w [32];
// ALU interconnect
alu_a_w [32];
alu_b_w [32];
alu_f3_w [3];
alu_alt_w [1];
alu_result [32];
// CSR interconnect
csr_rd_addr_w [12];
csr_rd_data [32];
csr_wr_addr_w [12];
csr_wr_data_w [32];
csr_wr_en_w [1];
csr_trap_enter_w [1];
csr_trap_epc_w [32];
csr_trap_cause_w [32];
csr_trap_mret_w [1];
csr_mtvec [32];
csr_mepc [32];
csr_irqvec [32];
csr_sdcardvec [32];
csr_mstatus_mie [1];
csr_mie_meie [1];
irq_pending [1];
csr_new_val [32];
csr_zimm [32];
csr_video_mode [1];
csr_baud_div [16];
// Multiply/divide interconnect
md_start [1];
md_result [32];
md_done [1];
}
@new rf0 rv_regfile {
IN [1] clk = clk;
IN [1] rst_n = rst_n;
IN [5] rs1_addr = rs1_addr;
IN [5] rs2_addr = rs2_addr;
OUT [32] rs1_data = rf_rs1_data;
OUT [32] rs2_data = rf_rs2_data;
IN [5] wr_addr = rf_wr_addr_w;
IN [32] wr_data = rf_wr_data_w;
IN [1] wr_en = rf_wr_en_w;
IN [1] shadow = shadow_mode;
}
@new alu0 rv_alu {
IN [32] a = alu_a_w;
IN [32] b = alu_b_w;
IN [3] funct3 = alu_f3_w;
IN [1] alt = alu_alt_w;
OUT [32] result = alu_result;
}
@new csr0 rv_csr {
OVERRIDE {
CLK_FREQ_MHZ = CLK_FREQ_MHZ;
SDRAM_SIZE_BYTES = SDRAM_SIZE_BYTES;
}
IN [1] clk = clk;
IN [1] rst_n = rst_n;
IN [12] rd_addr = csr_rd_addr_w;
OUT [32] rd_data = csr_rd_data;
IN [12] wr_addr = csr_wr_addr_w;
IN [32] wr_data = csr_wr_data_w;
IN [1] wr_en = csr_wr_en_w;
IN [1] trap_enter = csr_trap_enter_w;
IN [32] trap_epc = csr_trap_epc_w;
IN [32] trap_cause = csr_trap_cause_w;
IN [1] trap_mret = csr_trap_mret_w;
OUT [32] mtvec_out = csr_mtvec;
OUT [32] mepc_out = csr_mepc;
OUT [32] irqvec_out = csr_irqvec;
OUT [32] sdcardvec_out = csr_sdcardvec;
OUT [1] mstatus_mie = csr_mstatus_mie;
OUT [1] mie_meie = csr_mie_meie;
IN [32] irq_lines = irq_lines;
OUT [1] video_mode = csr_video_mode;
OUT [16] baud_div = csr_baud_div;
}
@new md0 rv_muldiv {
IN [1] clk = clk;
IN [1] rst_n = rst_n;
IN [32] a = rs1_val;
IN [32] b = rs2_val;
IN [3] funct3 = funct3;
IN [1] start = md_start;
OUT [32] result = md_result;
OUT [1] done = md_done;
}
ASYNCHRONOUS {
// Instruction field decode
opcode = instr[6:0];
rd = instr[11:7];
funct3 = instr[14:12];
rs1_addr = instr[19:15];
rs2_addr = instr[24:20];
funct7 = instr[31:25];
// Drive bus signals
pbus.ADDR <= bus_addr;
pbus.DATA <= (bus_valid == 1'b1 && bus_cmd == CMD.WRITE) ? bus_data : 32'bz;
pbus.CMD <= bus_cmd;
pbus.VALID <= bus_valid;
// ALU inputs
alu_a_w <= rs1_val;
alu_b_w <= (opcode == RVO.ALU_REG) ? rs2_val : imm_val;
alu_f3_w <= funct3;
alu_alt_w <= (opcode == RVO.ALU_REG || funct3 == F3.SRL) ? instr[30] : 1'b0;
// Register file write control (active during WRITEBACK)
rf_wr_en_w <= (state == RVS.WRITEBACK) ? 1'b1 : 1'b0;
rf_wr_addr_w <= wb_rd;
rf_wr_data_w <= wb_data;
// CSR field decode
csr_zimm <= {27'd0, rs1_addr};
// CSR read address (combinational read)
csr_rd_addr_w <= instr[31:20];
// Video mode output
video_mode <= csr_video_mode;
// Baud rate divider output
baud_div <= csr_baud_div;
// Multiply/divide start signal
md_start <= (state == RVS.EXECUTE && opcode == RVO.ALU_REG && funct7 == 7'b0000001) ? 1'b1 : 1'b0;
// Interrupt pending: any IRQ line active & external IRQ enabled & global enable
irq_pending <= (irq_lines != 32'h00000000) ? (csr_mie_meie & csr_mstatus_mie) : 1'b0;
// CSR write value computation (read-modify-write)
IF (funct3 == CF3.CSRRW) {
csr_new_val <= rs1_val;
} ELIF (funct3 == CF3.CSRRS) {
csr_new_val <= csr_rd_data | rs1_val;
} ELIF (funct3 == CF3.CSRRC) {
csr_new_val <= csr_rd_data & ~rs1_val;
} ELIF (funct3 == CF3.CSRRWI) {
csr_new_val <= csr_zimm;
} ELIF (funct3 == CF3.CSRRSI) {
csr_new_val <= csr_rd_data | csr_zimm;
} ELIF (funct3 == CF3.CSRRCI) {
csr_new_val <= csr_rd_data & ~csr_zimm;
} ELSE {
csr_new_val <= 32'h00000000;
}
// CSR write enable: active during EXECUTE with valid CSR funct3
csr_wr_en_w <= (state == RVS.EXECUTE && opcode == RVO.SYSTEM && funct3 != 3'b000) ? 1'b1 : 1'b0;
csr_wr_addr_w <= instr[31:20];
csr_wr_data_w <= csr_new_val;
// Trap signals (combinational)
// Trap enter: interrupt at FETCH, or ECALL/EBREAK at EXECUTE
IF (state == RVS.FETCH && irq_pending == 1'b1) {
csr_trap_enter_w <= 1'b1;
csr_trap_cause_w <= 32'h8000000B;
} ELIF (state == RVS.EXECUTE && opcode == RVO.SYSTEM && funct3 == 3'b000 && instr[31:20] != 12'h302 && instr[31:20] != 12'h105) {
csr_trap_enter_w <= 1'b1;
IF (instr[20] == 1'b1) {
// EBREAK (imm=0x001)
csr_trap_cause_w <= 32'h00000003;
} ELSE {
// ECALL (imm=0x000)
csr_trap_cause_w <= 32'h0000000B;
}
} ELSE {
csr_trap_enter_w <= 1'b0;
csr_trap_cause_w <= 32'h00000000;
}
csr_trap_epc_w <= pc;
csr_trap_mret_w <= (state == RVS.EXECUTE && opcode == RVO.SYSTEM && funct3 == 3'b000 && instr[31:20] == 12'h302) ? 1'b1 : 1'b0;
}
SYNCHRONOUS(CLK = clk RESET = rst_n RESET_ACTIVE = Low) {
// ============================================================
// FETCH: Start instruction fetch at PC (or take interrupt)
// ============================================================
IF (state == RVS.FETCH) {
IF (irq_pending == 1'b1) {
// Trap: CSR module saves mepc/mcause, clears MIE
pc <= csr_mtvec;
shadow_mode <= 1'b1;
state <= RVS.FETCH;
} ELSE {
bus_addr <= pc;
bus_cmd <= CMD.READ;
bus_valid <= 1'b1;
state <= RVS.WAIT_FETCH;
}
// ============================================================
// WAIT_FETCH: Wait for bus DONE, latch instruction
// ============================================================
} ELIF (state == RVS.WAIT_FETCH) {
IF (pbus.DONE == 1'b1) {
instr <= pbus.DATA;
bus_valid <= 1'b0;
state <= RVS.DECODE;
}
// ============================================================
// DECODE: Latch register values, compute immediate
// ============================================================
} ELIF (state == RVS.DECODE) {
// Latch register file read outputs
rs1_val <= rf_rs1_data;
rs2_val <= rf_rs2_data;
// --- Decode immediate based on instruction type ---
IF (opcode == RVO.LUI || opcode == RVO.AUIPC) {
// U-type: {instr[31:12], 12'h000}
imm_val <= {instr[31:12], 12'h000};
} ELIF (opcode == RVO.JAL) {
// J-type
IF (instr[31] == 1'b1) {
imm_val <= {11'h7FF, instr[31], instr[19:12], instr[20], instr[30:21], 1'b0};
} ELSE {
imm_val <= {11'h000, instr[31], instr[19:12], instr[20], instr[30:21], 1'b0};
}
} ELIF (opcode == RVO.BRANCH) {
// B-type
IF (instr[31] == 1'b1) {
imm_val <= {19'h7FFFF, instr[31], instr[7], instr[30:25], instr[11:8], 1'b0};
} ELSE {
imm_val <= {19'h00000, instr[31], instr[7], instr[30:25], instr[11:8], 1'b0};
}
} ELIF (opcode == RVO.STORE) {
// S-type
IF (instr[31] == 1'b1) {
imm_val <= {20'hFFFFF, instr[31:25], instr[11:7]};
} ELSE {
imm_val <= {20'h00000, instr[31:25], instr[11:7]};
}
} ELSE {
// I-type (LOAD, ALU_IMM, JALR, FENCE, SYSTEM)
IF (instr[31] == 1'b1) {
imm_val <= {20'hFFFFF, instr[31:20]};
} ELSE {
imm_val <= {20'h00000, instr[31:20]};
}
}
state <= RVS.EXECUTE;
// ============================================================
// EXECUTE: ALU ops, branches, start memory ops
// ============================================================
} ELIF (state == RVS.EXECUTE) {
// --- LUI ---
IF (opcode == RVO.LUI) {
wb_data <= imm_val;
wb_rd <= rd;
next_pc <= pc + 32'h00000004;
state <= RVS.WRITEBACK;
// --- AUIPC ---
} ELIF (opcode == RVO.AUIPC) {
wb_data <= pc + imm_val;
wb_rd <= rd;
next_pc <= pc + 32'h00000004;
state <= RVS.WRITEBACK;
// --- JAL ---
} ELIF (opcode == RVO.JAL) {
wb_data <= pc + 32'h00000004;
wb_rd <= rd;
next_pc <= pc + imm_val;
state <= RVS.WRITEBACK;
// --- JALR ---
} ELIF (opcode == RVO.JALR) {
wb_data <= pc + 32'h00000004;
wb_rd <= rd;
next_pc <= (rs1_val + imm_val) & 32'hFFFFFFFE;
state <= RVS.WRITEBACK;
// --- BRANCH ---
} ELIF (opcode == RVO.BRANCH) {
IF (funct3 == F3.BEQ) {
IF (rs1_val == rs2_val) {
pc <= pc + imm_val;
} ELSE {
pc <= pc + 32'h00000004;
}
} ELIF (funct3 == F3.BNE) {
IF (rs1_val != rs2_val) {
pc <= pc + imm_val;
} ELSE {
pc <= pc + 32'h00000004;
}
} ELIF (funct3 == F3.BLT) {
// Signed less than: ssub sign bit gives signed comparison
IF (ssub(rs1_val, rs2_val)[32] == 1'b1) {
pc <= pc + imm_val;
} ELSE {
pc <= pc + 32'h00000004;
}
} ELIF (funct3 == F3.BGE) {
// Signed greater or equal
IF (ssub(rs1_val, rs2_val)[32] == 1'b0) {
pc <= pc + imm_val;
} ELSE {
pc <= pc + 32'h00000004;
}
} ELIF (funct3 == F3.BLTU) {
IF (rs1_val < rs2_val) {
pc <= pc + imm_val;
} ELSE {
pc <= pc + 32'h00000004;
}
} ELIF (funct3 == F3.BGEU) {
IF (rs1_val < rs2_val) {
pc <= pc + 32'h00000004;
} ELSE {
pc <= pc + imm_val;
}
} ELSE {
pc <= pc + 32'h00000004;
}
state <= RVS.FETCH;
// --- LOAD ---
} ELIF (opcode == RVO.LOAD) {
bus_addr <= rs1_val + imm_val;
mem_addr_lo <= (rs1_val + imm_val)[1:0];
mem_funct3 <= funct3;
bus_cmd <= CMD.READ;
bus_valid <= 1'b1;
wb_rd <= rd;
next_pc <= pc + 32'h00000004;
state <= RVS.MEM_WAIT;
// --- STORE ---
} ELIF (opcode == RVO.STORE) {
bus_addr <= rs1_val + imm_val;
mem_addr_lo <= (rs1_val + imm_val)[1:0];
mem_funct3 <= funct3;
next_pc <= pc + 32'h00000004;
IF (funct3 == F3.LW) {
// Word store (SW shares funct3 = 3'b010 with LW): direct write
bus_data <= rs2_val;
bus_cmd <= CMD.WRITE;
bus_valid <= 1'b1;
state <= RVS.MEM_WAIT;
} ELSE {
// Byte/half store: read-modify-write, start with read
bus_cmd <= CMD.READ;
bus_valid <= 1'b1;
state <= RVS.MEM_WAIT;
}
// --- ALU immediate ---
} ELIF (opcode == RVO.ALU_IMM) {
wb_data <= alu_result;
wb_rd <= rd;
next_pc <= pc + 32'h00000004;
state <= RVS.WRITEBACK;
// --- ALU register ---
} ELIF (opcode == RVO.ALU_REG) {
IF (funct7 == 7'b0000001) {
// M extension: multiply/divide (muldiv unit started via async)
wb_rd <= rd;
next_pc <= pc + 32'h00000004;
state <= RVS.MULDIV_WAIT;
} ELSE {
wb_data <= alu_result;
wb_rd <= rd;
next_pc <= pc + 32'h00000004;
state <= RVS.WRITEBACK;
}
// --- FENCE (treat as NOP) ---
} ELIF (opcode == RVO.FENCE) {
pc <= pc + 32'h00000004;
state <= RVS.FETCH;
// --- SYSTEM (CSR / MRET / ECALL / EBREAK) ---
} ELIF (opcode == RVO.SYSTEM) {
IF (funct3 != 3'b000) {
// CSR instruction: old value to rd, write handled by async wr_en
wb_data <= csr_rd_data;
wb_rd <= rd;
next_pc <= pc + 32'h00000004;
state <= RVS.WRITEBACK;
} ELSE {
IF (instr[31:20] == 12'h302) {
// MRET: return to mepc, CSR restores mstatus
pc <= csr_mepc;
shadow_mode <= 1'b0;
state <= RVS.FETCH;
} ELIF (instr[31:20] == 12'h105) {
// WFI: treat as NOP (spec allows this)
pc <= pc + 32'h00000004;
state <= RVS.FETCH;
} ELSE {
// ECALL/EBREAK -> trap to mtvec
bus_valid <= 1'b0;
pc <= csr_mtvec;
state <= RVS.FETCH;
}
}
// --- Unknown opcode: NOP ---
} ELSE {
pc <= pc + 32'h00000004;
state <= RVS.FETCH;
}
// ============================================================
// MEM_WAIT: Wait for load/store bus completion
// ============================================================
} ELIF (state == RVS.MEM_WAIT) {
IF (pbus.DONE == 1'b1) {
IF (opcode == RVO.LOAD) {
// Load: extract byte/half/word with sign extension
bus_valid <= 1'b0;
IF (mem_funct3 == F3.LB) {
IF (mem_addr_lo == 2'b00) {
wb_data <= (pbus.DATA[7] == 1'b1) ? {24'hFFFFFF, pbus.DATA[7:0]} : {24'h000000, pbus.DATA[7:0]};
} ELIF (mem_addr_lo == 2'b01) {
wb_data <= (pbus.DATA[15] == 1'b1) ? {24'hFFFFFF, pbus.DATA[15:8]} : {24'h000000, pbus.DATA[15:8]};
} ELIF (mem_addr_lo == 2'b10) {
wb_data <= (pbus.DATA[23] == 1'b1) ? {24'hFFFFFF, pbus.DATA[23:16]} : {24'h000000, pbus.DATA[23:16]};
} ELSE {
wb_data <= (pbus.DATA[31] == 1'b1) ? {24'hFFFFFF, pbus.DATA[31:24]} : {24'h000000, pbus.DATA[31:24]};
}
} ELIF (mem_funct3 == F3.LH) {
IF (mem_addr_lo[1] == 1'b0) {
wb_data <= (pbus.DATA[15] == 1'b1) ? {16'hFFFF, pbus.DATA[15:0]} : {16'h0000, pbus.DATA[15:0]};
} ELSE {
wb_data <= (pbus.DATA[31] == 1'b1) ? {16'hFFFF, pbus.DATA[31:16]} : {16'h0000, pbus.DATA[31:16]};
}
} ELIF (mem_funct3 == F3.LW) {
wb_data <= pbus.DATA;
} ELIF (mem_funct3 == F3.LBU) {
IF (mem_addr_lo == 2'b00) {
wb_data <= {24'h000000, pbus.DATA[7:0]};
} ELIF (mem_addr_lo == 2'b01) {
wb_data <= {24'h000000, pbus.DATA[15:8]};
} ELIF (mem_addr_lo == 2'b10) {
wb_data <= {24'h000000, pbus.DATA[23:16]};
} ELSE {
wb_data <= {24'h000000, pbus.DATA[31:24]};
}
} ELIF (mem_funct3 == F3.LHU) {
IF (mem_addr_lo[1] == 1'b0) {
wb_data <= {16'h0000, pbus.DATA[15:0]};
} ELSE {
wb_data <= {16'h0000, pbus.DATA[31:16]};
}
} ELSE {
wb_data <= pbus.DATA;
}
state <= RVS.WRITEBACK;
} ELIF (bus_cmd == CMD.WRITE) {
// Store write complete (SW or RMW write phase done)
bus_valid <= 1'b0;
pc <= next_pc;
state <= RVS.FETCH;
} ELSE {
// Store RMW read phase (SB/SH): merge byte/half into read word
IF (mem_funct3 == F3.LB) {
// Store byte
IF (mem_addr_lo == 2'b00) {
bus_data <= {pbus.DATA[31:8], rs2_val[7:0]};
} ELIF (mem_addr_lo == 2'b01) {
bus_data <= {pbus.DATA[31:16], rs2_val[7:0], pbus.DATA[7:0]};
} ELIF (mem_addr_lo == 2'b10) {
bus_data <= {pbus.DATA[31:24], rs2_val[7:0], pbus.DATA[15:0]};
} ELSE {
bus_data <= {rs2_val[7:0], pbus.DATA[23:0]};
}
} ELSE {
// Store halfword
IF (mem_addr_lo[1] == 1'b0) {
bus_data <= {pbus.DATA[31:16], rs2_val[15:0]};
} ELSE {
bus_data <= {rs2_val[15:0], pbus.DATA[15:0]};
}
}
bus_cmd <= CMD.WRITE;
bus_valid <= 1'b1;
state <= RVS.RMW_WAIT;
}
}
// ============================================================
// RMW_WAIT: Wait for RMW write to complete
// ============================================================
} ELIF (state == RVS.RMW_WAIT) {
IF (pbus.DONE == 1'b1) {
bus_valid <= 1'b0;
pc <= next_pc;
state <= RVS.FETCH;
}
// ============================================================
// MULDIV_WAIT: Wait for multiply/divide completion
// ============================================================
} ELIF (state == RVS.MULDIV_WAIT) {
IF (md_done == 1'b1) {
wb_data <= md_result;
state <= RVS.WRITEBACK;
}
// ============================================================
// WRITEBACK: Update PC (register write handled by regfile)
// ============================================================
} ELIF (state == RVS.WRITEBACK) {
pc <= next_pc;
state <= RVS.FETCH;
// ============================================================
// HALT: Stay halted (note: ECALL/EBREAK above trap to mtvec rather than entering HALT)
// ============================================================
} ELIF (state == RVS.HALT) {
state <= RVS.HALT;
}
}
@endmod
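The byte-lane handling in MEM_WAIT (load extraction and the read-modify-write merge for SB/SH) can be cross-checked with a small software model. The Python sketch below is illustrative only and not part of the SoC sources. The function names are hypothetical, and the funct3 values assume the standard RV32I encodings (LB/SB=000, LH/SH=001, LW/SW=010, LBU=100, LHU=101), because the F3 enum is not shown in this listing.

```python
MASK32 = 0xFFFFFFFF

def sign_extend(value: int, bits: int) -> int:
    """Sign-extend a `bits`-wide value and return it as an unsigned 32-bit int."""
    sign = 1 << (bits - 1)
    return ((value ^ sign) - sign) & MASK32

def load_extract(word: int, addr_lo: int, funct3: int) -> int:
    """Model of the MEM_WAIT load path (funct3 encodings are an assumption)."""
    if funct3 in (0b000, 0b100):              # LB / LBU: addr_lo picks the byte lane
        byte = (word >> (8 * addr_lo)) & 0xFF
        return sign_extend(byte, 8) if funct3 == 0b000 else byte
    if funct3 in (0b001, 0b101):              # LH / LHU: only addr_lo[1] is used
        half = (word >> (16 * (addr_lo >> 1))) & 0xFFFF
        return sign_extend(half, 16) if funct3 == 0b001 else half
    return word & MASK32                      # LW and the fallthrough case

def store_merge(old_word: int, rs2: int, addr_lo: int, funct3: int) -> int:
    """Model of the RMW merge: SB when funct3 == 000, otherwise SH."""
    if funct3 == 0b000:
        shift, lane = 8 * addr_lo, 0xFF
    else:
        shift, lane = 16 * (addr_lo >> 1), 0xFFFF
    cleared = old_word & ~(lane << shift) & MASK32
    return cleared | ((rs2 & lane) << shift)
```

For example, `store_merge(0xAABBCCDD, 0x11223344, 1, 0b000)` replaces only bits 15:8 of the word that was read back, which mirrors the `{pbus.DATA[31:16], rs2_val[7:0], pbus.DATA[7:0]}` case above.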
// RV32I ALU
// Pure combinational arithmetic/logic unit
// Supports ADD/SUB, SLL, SLT, SLTU, XOR, SRL/SRA, OR, AND
@module rv_alu
PORT {
IN [32] a;
IN [32] b;
IN [3] funct3;
IN [1] alt;
OUT [32] result;
}
ASYNCHRONOUS {
IF (funct3 == F3.ADD) {
IF (alt == 1'b1) {
// SUB
result <= a - b;
} ELSE {
// ADD
result <= a + b;
}
} ELIF (funct3 == F3.SLL) {
result <= a << b[4:0];
} ELIF (funct3 == F3.SLT) {
// Signed less than: ssub gives 33-bit signed difference, bit[32] is sign
result <= {31'd0, ssub(a, b)[32]};
} ELIF (funct3 == F3.SLTU) {
result <= (a < b) ? 32'h00000001 : 32'h00000000;
} ELIF (funct3 == F3.XOR) {
result <= a ^ b;
} ELIF (funct3 == F3.SRL) {
IF (alt == 1'b1) {
// SRA
result <= a >>> b[4:0];
} ELSE {
// SRL
result <= a >> b[4:0];
}
} ELIF (funct3 == F3.OR) {
result <= a | b;
} ELIF (funct3 == F3.AND) {
result <= a & b;
} ELSE {
result <= 32'h00000000;
}
}
@endmod
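A reference model makes it easier to see why SLT uses the 33-bit `ssub` result rather than a plain 32-bit subtraction. The Python sketch below is an assumption-laden model, not the project's test bench: it assumes `ssub` sign-extends both operands to 33 bits before subtracting, and that the F3 values follow the standard RV32I funct3 encodings.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x: int) -> int:
    """Interpret the low 32 bits of x as a two's-complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def ssub33(a: int, b: int) -> int:
    """Assumed behaviour of ssub: 33-bit signed difference of two 32-bit values."""
    return (to_signed(a) - to_signed(b)) & ((1 << 33) - 1)

def alu(a: int, b: int, funct3: int, alt: int) -> int:
    """Software model of rv_alu; alt selects SUB and SRA."""
    shamt = b & 0x1F
    if funct3 == 0b000:                        # ADD / SUB
        return (a - b if alt else a + b) & MASK32
    if funct3 == 0b001:                        # SLL
        return (a << shamt) & MASK32
    if funct3 == 0b010:                        # SLT: sign bit of the 33-bit difference
        return (ssub33(a, b) >> 32) & 1
    if funct3 == 0b011:                        # SLTU
        return 1 if (a & MASK32) < (b & MASK32) else 0
    if funct3 == 0b100:                        # XOR
        return (a ^ b) & MASK32
    if funct3 == 0b101:                        # SRL / SRA
        if alt:
            return (to_signed(a) >> shamt) & MASK32
        return (a & MASK32) >> shamt
    if funct3 == 0b110:                        # OR
        return (a | b) & MASK32
    return a & b & MASK32                      # AND
```

With `a = 0x80000000` and `b = 1`, a 32-bit subtraction would wrap to `0x7FFFFFFF` and lose the sign, while the 33-bit difference keeps bit 32 set, so SLT correctly reports 1.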
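// RV32I Zicsr Extension - CSR Register File
// M-mode CSRs: mstatus, mie, mtvec, mepc, mcause, mtval, mcycle
// Custom CSRs: 0xBC0 clk_freq (RO), 0xBC1 video_mode, 0xBC2 baud_div,
// 0xBC3 sdram_size (RO), 0xBC4 irqvec, 0xBC5 sdcardvec, 0xFC0 irq_lines (RO)
// Combinational read port, synchronous write port
// Trap entry/exit logic for M-mode interrupts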
@module rv_csr
CONST {
CLK_FREQ_MHZ = 54;
SDRAM_SIZE_BYTES = 0;
}
PORT {
IN [1] clk;
IN [1] rst_n;
IN [12] rd_addr;
OUT [32] rd_data;
IN [12] wr_addr;
IN [32] wr_data;
IN [1] wr_en;
IN [1] trap_enter;
IN [32] trap_epc;
IN [32] trap_cause;
IN [1] trap_mret;
OUT [32] mtvec_out;
OUT [32] mepc_out;
OUT [1] mstatus_mie;
OUT [1] mie_meie;
IN [32] irq_lines;
OUT [32] irqvec_out;
OUT [32] sdcardvec_out;
OUT [1] video_mode;
OUT [16] baud_div;
}
REGISTER {
mstatus_r [32] = 32'h00001800;
mie_r [32] = 32'h00000000;
mtvec_r [32] = 32'h00000000;
mepc_r [32] = 32'h00000000;
mcause_r [32] = 32'h00000000;
mtval_r [32] = 32'h00000000;
mcycle_r [32] = 32'h00000000;
irqvec_r [32] = 32'h00000000;
sdcardvec_r [32] = 32'h00000000;
video_mode_r [1] = 1'b0;
baud_div_r [16] = 16'd644;
}
ASYNCHRONOUS {
// CSR read mux (combinational)
IF (rd_addr == 12'h300) {
rd_data <= mstatus_r;
} ELIF (rd_addr == 12'h304) {
rd_data <= mie_r;
} ELIF (rd_addr == 12'h305) {
rd_data <= mtvec_r;
} ELIF (rd_addr == 12'h341) {
rd_data <= mepc_r;
} ELIF (rd_addr == 12'h342) {
rd_data <= mcause_r;
} ELIF (rd_addr == 12'h343) {
rd_data <= mtval_r;
} ELIF (rd_addr == 12'hB00) {
rd_data <= mcycle_r;
} ELIF (rd_addr == 12'hBC0) {
rd_data <= lit(32, CLK_FREQ_MHZ);
} ELIF (rd_addr == 12'hBC1) {
rd_data <= {31'd0, video_mode_r};
} ELIF (rd_addr == 12'hBC2) {
rd_data <= {16'd0, baud_div_r};
} ELIF (rd_addr == 12'hBC3) {
rd_data <= lit(32, SDRAM_SIZE_BYTES);
} ELIF (rd_addr == 12'hBC4) {
rd_data <= irqvec_r;
} ELIF (rd_addr == 12'hBC5) {
rd_data <= sdcardvec_r;
} ELIF (rd_addr == 12'hFC0) {
rd_data <= irq_lines;
} ELSE {
rd_data <= 32'h00000000;
}
// Direct outputs
mtvec_out <= mtvec_r;
mepc_out <= mepc_r;
irqvec_out <= irqvec_r;
sdcardvec_out <= sdcardvec_r;
mstatus_mie <= mstatus_r[3];
mie_meie <= mie_r[11];
video_mode <= video_mode_r;
baud_div <= baud_div_r;
}
SYNCHRONOUS(CLK = clk RESET = rst_n RESET_ACTIVE = Low) {
// Free-running cycle counter
mcycle_r <= mcycle_r + 32'h00000001;
// Trap entry: save mepc/mcause, MPIE<=MIE, MIE<=0
IF (trap_enter == 1'b1) {
mepc_r <= trap_epc;
mcause_r <= trap_cause;
mstatus_r <= {mstatus_r[31:8], mstatus_r[3], mstatus_r[6:4], 1'b0, mstatus_r[2:0]};
// MRET: MIE<=MPIE, MPIE<=1
} ELIF (trap_mret == 1'b1) {
mstatus_r <= {mstatus_r[31:8], 1'b1, mstatus_r[6:4], mstatus_r[7], mstatus_r[2:0]};
// CSR write
} ELIF (wr_en == 1'b1) {
IF (wr_addr == 12'h300) {
// mstatus: preserve MPP=11 (M-mode only)
mstatus_r <= {19'h00000, 2'b11, wr_data[10:8], wr_data[7], wr_data[6:4], wr_data[3], wr_data[2:0]};
} ELIF (wr_addr == 12'h304) {
mie_r <= wr_data;
} ELIF (wr_addr == 12'h305) {
// mtvec: force DIRECT mode (bits[1:0]=00)
mtvec_r <= {wr_data[31:2], 2'b00};
} ELIF (wr_addr == 12'h341) {
// mepc: align to instruction boundary
mepc_r <= {wr_data[31:2], 2'b00};
} ELIF (wr_addr == 12'h342) {
mcause_r <= wr_data;
} ELIF (wr_addr == 12'h343) {
mtval_r <= wr_data;
} ELIF (wr_addr == 12'hBC4) {
irqvec_r <= {wr_data[31:2], 2'b00};
} ELIF (wr_addr == 12'hBC5) {
sdcardvec_r <= {wr_data[31:2], 2'b00};
} ELIF (wr_addr == 12'hBC1) {
video_mode_r <= wr_data[0];
} ELIF (wr_addr == 12'hBC2) {
baud_div_r <= wr_data[15:0];
}
// Read-only (no write case): mcycle (0xB00), clk_freq (0xBC0), sdram_size (0xBC3), irq_lines (0xFC0)
}
}
@endmod
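The three `mstatus` updates in the SYNCHRONOUS block are plain bit shuffles on MIE (bit 3) and MPIE (bit 7). The following Python sketch restates them for illustration; the function names are hypothetical and the model covers only `mstatus`, not `mepc` or `mcause`.

```python
MIE_BIT = 3    # mstatus.MIE
MPIE_BIT = 7   # mstatus.MPIE
MASK32 = 0xFFFFFFFF

def trap_enter(mstatus: int) -> int:
    """Trap entry: MPIE <= MIE, MIE <= 0, all other bits unchanged."""
    mie = (mstatus >> MIE_BIT) & 1
    cleared = mstatus & ~((1 << MPIE_BIT) | (1 << MIE_BIT))
    return (cleared | (mie << MPIE_BIT)) & MASK32

def mret(mstatus: int) -> int:
    """MRET: MIE <= MPIE, MPIE <= 1, all other bits unchanged."""
    mpie = (mstatus >> MPIE_BIT) & 1
    cleared = mstatus & ~((1 << MPIE_BIT) | (1 << MIE_BIT))
    return (cleared | (1 << MPIE_BIT) | (mpie << MIE_BIT)) & MASK32

def csr_write_mstatus(wr_data: int) -> int:
    """CSR write: bits 31:13 forced to 0, MPP (12:11) forced to 11, bits 10:0 taken from wr_data."""
    return (0b11 << 11) | (wr_data & 0x7FF)
```

Starting from the reset value `0x00001800`, setting MIE gives `0x1808`; a trap then yields `0x1880`, and the matching MRET yields `0x1888` (interrupts enabled again, MPIE set).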
// RV32M Multiply/Divide Unit
// Multiply: single-cycle via umul/smul intrinsics
// Divide: 32-cycle restoring division + 1 result cycle
// Supports MUL, MULH, MULHSU, MULHU, DIV, DIVU, REM, REMU
@module rv_muldiv
PORT {
IN [1] clk;
IN [1] rst_n;
IN [32] a;
IN [32] b;
IN [3] funct3;
IN [1] start;
OUT [32] result;
OUT [1] done;
}
WIRE {
// Combinational multiply results via intrinsics
mul_uu [64]; // unsigned * unsigned
mul_ss [64]; // signed * signed
mul_su [64]; // |signed| * unsigned (for MULHSU)
mul_su_neg [64]; // negated mul_su
// Absolute values for divide via abs() intrinsic
a_abs [33];
b_abs [33];
}
REGISTER {
running [1] = 1'b0;
finishing [1] = 1'b0;
count [6] = 6'd0;
op [3] = 3'b000;
negate_res [1] = 1'b0;
// Divide state
quotient [32] = 32'h00000000;
remainder [33] = 33'h000000000;
divisor [33] = 33'h000000000;
dividend [32] = 32'h00000000;
// Output registers
res_reg [32] = 32'h00000000;
done_reg [1] = 1'b0;
}
ASYNCHRONOUS {
// Single-cycle multiply via intrinsics
mul_uu <= umul(a, b);
mul_ss <= smul(a, b);
mul_su <= umul(abs(a)[31:0], b);
mul_su_neg <= ~umul(abs(a)[31:0], b) + 64'h0000000000000001;
// Absolute values for divide
a_abs <= abs(a);
b_abs <= abs(b);
result <= res_reg;
done <= done_reg;
}
SYNCHRONOUS(CLK = clk RESET = rst_n RESET_ACTIVE = Low) {
// ---- Finishing cycle: select divide result ----
IF (finishing == 1'b1) {
finishing <= 1'b0;
done_reg <= 1'b1;
IF (op == 3'b100 || op == 3'b101) {
// DIV/DIVU: return quotient
IF (negate_res == 1'b1) {
res_reg <= ~quotient + 32'h00000001;
} ELSE {
res_reg <= quotient;
}
} ELSE {
// REM/REMU: return remainder
IF (negate_res == 1'b1) {
res_reg <= ~remainder[31:0] + 32'h00000001;
} ELSE {
res_reg <= remainder[31:0];
}
}
// ---- Divide algorithm running ----
} ELIF (running == 1'b1) {
done_reg <= 1'b0;
// Restoring division step
IF ({remainder[31:0], dividend[31]} >= divisor) {
remainder <= {remainder[31:0], dividend[31]} - divisor;
quotient <= {quotient[30:0], 1'b1};
} ELSE {
remainder <= {remainder[31:0], dividend[31]};
quotient <= {quotient[30:0], 1'b0};
}
dividend <= {dividend[30:0], 1'b0};
IF (count == 6'd31) {
running <= 1'b0;
finishing <= 1'b1;
} ELSE {
count <= count + 6'd1;
}
// ---- Idle: check for start ----
} ELIF (start == 1'b1) {
op <= funct3;
IF (funct3[2] == 1'b0) {
// ---- Multiply: single-cycle result from combinational intrinsics ----
done_reg <= 1'b1;
IF (funct3 == 3'b000) {
// MUL: low 32 bits (same for signed/unsigned)
res_reg <= mul_uu[31:0];
} ELIF (funct3 == 3'b001) {
// MULH: signed * signed, high word
res_reg <= mul_ss[63:32];
} ELIF (funct3 == 3'b010) {
// MULHSU: signed * unsigned, high word
IF (a[31] == 1'b1) {
res_reg <= mul_su_neg[63:32];
} ELSE {
res_reg <= mul_su[63:32];
}
} ELSE {
// MULHU: unsigned * unsigned, high word
res_reg <= mul_uu[63:32];
}
} ELSE {
// ---- Divide start ----
// Check divide by zero (1-cycle special case)
IF (b == 32'h00000000) {
IF (funct3 == 3'b110 || funct3 == 3'b111) {
res_reg <= a;
} ELSE {
res_reg <= 32'hFFFFFFFF;
}
done_reg <= 1'b1;
// Check signed overflow: -2^31 / -1
} ELIF (funct3 == 3'b100 && a == 32'h80000000 && b == 32'hFFFFFFFF) {
res_reg <= 32'h80000000;
done_reg <= 1'b1;
} ELIF (funct3 == 3'b110 && a == 32'h80000000 && b == 32'hFFFFFFFF) {
res_reg <= 32'h00000000;
done_reg <= 1'b1;
} ELSE {
// Normal divide: use abs() for signed operands
done_reg <= 1'b0;
IF (funct3[0] == 1'b1) {
// DIVU/REMU: unsigned
dividend <= a;
divisor <= {1'b0, b};
negate_res <= 1'b0;
} ELSE {
// DIV/REM: signed, operate on absolute values
dividend <= a_abs[31:0];
divisor <= {1'b0, b_abs[31:0]};
IF (funct3 == 3'b100) {
negate_res <= a[31] ^ b[31];
} ELSE {
negate_res <= a[31];
}
}
remainder <= 33'h000000000;
quotient <= 32'h00000000;
count <= 6'd0;
running <= 1'b1;
}
}
} ELSE {
done_reg <= 1'b0;
}
}
@endmod
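The divide path is a textbook restoring divider wrapped in sign handling and the RISC-V corner cases. As a rough cross-check, the Python sketch below follows the same steps: 32 shift/subtract iterations on absolute values, then conditional negation. It is a behavioural model only; the names are hypothetical and it ignores cycle timing.

```python
MASK32 = 0xFFFFFFFF

def to_signed(x: int) -> int:
    """Interpret the low 32 bits of x as a two's-complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def restoring_divide(dividend: int, divisor: int):
    """32-step unsigned restoring division, returning (quotient, remainder)."""
    quotient, remainder = 0, 0
    for _ in range(32):
        # Shift the next dividend bit (MSB first) into the partial remainder.
        trial = (remainder << 1) | ((dividend >> 31) & 1)
        if trial >= divisor:
            remainder = trial - divisor
            quotient = ((quotient << 1) | 1) & MASK32
        else:
            remainder = trial
            quotient = (quotient << 1) & MASK32
        dividend = (dividend << 1) & MASK32
    return quotient, remainder

def rv_div(a: int, b: int, funct3: int) -> int:
    """funct3: 100 DIV, 101 DIVU, 110 REM, 111 REMU."""
    a &= MASK32
    b &= MASK32
    is_rem = funct3 in (0b110, 0b111)
    if b == 0:                                  # divide by zero
        return a if is_rem else MASK32
    if funct3 in (0b100, 0b110) and a == 0x80000000 and b == MASK32:
        return 0 if is_rem else 0x80000000      # signed overflow
    if funct3 & 1:                              # DIVU / REMU
        q, r = restoring_divide(a, b)
        negate = False
    else:                                       # DIV / REM on absolute values
        q, r = restoring_divide(abs(to_signed(a)), abs(to_signed(b)))
        negate = bool(a >> 31) if is_rem else bool((a >> 31) ^ (b >> 31))
    value = r if is_rem else q
    return (-value) & MASK32 if negate else value
```

The remainder takes the sign of the dividend and the quotient truncates toward zero, so `-7 / 2` gives `-3` with remainder `-1`.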
// RV32I Register File with Shadow Bank
// 31 general-purpose registers (x1-x31), x0 hardwired to zero
// 31 shadow registers (s1-s31) for zero-overhead trap context switching
// 2 asynchronous read ports, 1 synchronous write port
@module rv_regfile
PORT {
IN [1] clk;
IN [1] rst_n;
IN [5] rs1_addr;
IN [5] rs2_addr;
OUT [32] rs1_data;
OUT [32] rs2_data;
IN [5] wr_addr;
IN [32] wr_data;
IN [1] wr_en;
IN [1] shadow;
}
REGISTER {
x1 [32] = 32'h00000000;
x2 [32] = 32'h00000000;
x3 [32] = 32'h00000000;
x4 [32] = 32'h00000000;
x5 [32] = 32'h00000000;
x6 [32] = 32'h00000000;
x7 [32] = 32'h00000000;
x8 [32] = 32'h00000000;
x9 [32] = 32'h00000000;
x10 [32] = 32'h00000000;
x11 [32] = 32'h00000000;
x12 [32] = 32'h00000000;
x13 [32] = 32'h00000000;
x14 [32] = 32'h00000000;
x15 [32] = 32'h00000000;
x16 [32] = 32'h00000000;
x17 [32] = 32'h00000000;
x18 [32] = 32'h00000000;
x19 [32] = 32'h00000000;
x20 [32] = 32'h00000000;
x21 [32] = 32'h00000000;
x22 [32] = 32'h00000000;
x23 [32] = 32'h00000000;
x24 [32] = 32'h00000000;
x25 [32] = 32'h00000000;
x26 [32] = 32'h00000000;
x27 [32] = 32'h00000000;
x28 [32] = 32'h00000000;
x29 [32] = 32'h00000000;
x30 [32] = 32'h00000000;
x31 [32] = 32'h00000000;
s1 [32] = 32'h00000000;
s2 [32] = 32'h00000000;
s3 [32] = 32'h00000000;
s4 [32] = 32'h00000000;
s5 [32] = 32'h00000000;
s6 [32] = 32'h00000000;
s7 [32] = 32'h00000000;
s8 [32] = 32'h00000000;
s9 [32] = 32'h00000000;
s10 [32] = 32'h00000000;
s11 [32] = 32'h00000000;
s12 [32] = 32'h00000000;
s13 [32] = 32'h00000000;
s14 [32] = 32'h00000000;
s15 [32] = 32'h00000000;
s16 [32] = 32'h00000000;
s17 [32] = 32'h00000000;
s18 [32] = 32'h00000000;
s19 [32] = 32'h00000000;
s20 [32] = 32'h00000000;
s21 [32] = 32'h00000000;
s22 [32] = 32'h00000000;
s23 [32] = 32'h00000000;
s24 [32] = 32'h00000000;
s25 [32] = 32'h00000000;
s26 [32] = 32'h00000000;
s27 [32] = 32'h00000000;
s28 [32] = 32'h00000000;
s29 [32] = 32'h00000000;
s30 [32] = 32'h00000000;
s31 [32] = 32'h00000000;
}
WIRE {
b1 [32];
b2 [32];
b3 [32];
b4 [32];
b5 [32];
b6 [32];
b7 [32];
b8 [32];
b9 [32];
b10 [32];
b11 [32];
b12 [32];
b13 [32];
b14 [32];
b15 [32];
b16 [32];
b17 [32];
b18 [32];
b19 [32];
b20 [32];
b21 [32];
b22 [32];
b23 [32];
b24 [32];
b25 [32];
b26 [32];
b27 [32];
b28 [32];
b29 [32];
b30 [32];
b31 [32];
x0 [32];
}
// Read mux: index 0 = x0 (zero), indices 1-31 = bank-selected registers
MUX {
bank = x0, b1, b2, b3, b4, b5, b6, b7, b8,
b9, b10, b11, b12, b13, b14, b15, b16,
b17, b18, b19, b20, b21, b22, b23, b24,
b25, b26, b27, b28, b29, b30, b31;
}
ASYNCHRONOUS {
// x0 is hardwired to zero
x0 <= 32'h00000000;
// Bank-selected register values
b1 <= (shadow == 1'b0) ? x1 : s1;
b2 <= (shadow == 1'b0) ? x2 : s2;
b3 <= (shadow == 1'b0) ? x3 : s3;
b4 <= (shadow == 1'b0) ? x4 : s4;
b5 <= (shadow == 1'b0) ? x5 : s5;
b6 <= (shadow == 1'b0) ? x6 : s6;
b7 <= (shadow == 1'b0) ? x7 : s7;
b8 <= (shadow == 1'b0) ? x8 : s8;
b9 <= (shadow == 1'b0) ? x9 : s9;
b10 <= (shadow == 1'b0) ? x10 : s10;
b11 <= (shadow == 1'b0) ? x11 : s11;
b12 <= (shadow == 1'b0) ? x12 : s12;
b13 <= (shadow == 1'b0) ? x13 : s13;
b14 <= (shadow == 1'b0) ? x14 : s14;
b15 <= (shadow == 1'b0) ? x15 : s15;
b16 <= (shadow == 1'b0) ? x16 : s16;
b17 <= (shadow == 1'b0) ? x17 : s17;
b18 <= (shadow == 1'b0) ? x18 : s18;
b19 <= (shadow == 1'b0) ? x19 : s19;
b20 <= (shadow == 1'b0) ? x20 : s20;
b21 <= (shadow == 1'b0) ? x21 : s21;
b22 <= (shadow == 1'b0) ? x22 : s22;
b23 <= (shadow == 1'b0) ? x23 : s23;
b24 <= (shadow == 1'b0) ? x24 : s24;
b25 <= (shadow == 1'b0) ? x25 : s25;
b26 <= (shadow == 1'b0) ? x26 : s26;
b27 <= (shadow == 1'b0) ? x27 : s27;
b28 <= (shadow == 1'b0) ? x28 : s28;
b29 <= (shadow == 1'b0) ? x29 : s29;
b30 <= (shadow == 1'b0) ? x30 : s30;
b31 <= (shadow == 1'b0) ? x31 : s31;
// Read ports (combinational): MUX selects register by address
rs1_data <= bank[rs1_addr];
rs2_data <= bank[rs2_addr];
}
SYNCHRONOUS(CLK = clk RESET = rst_n RESET_ACTIVE = Low) {
IF (wr_en == 1'b1) {
IF (shadow == 1'b0) {
IF (wr_addr == 5'd1) { x1 <= wr_data; }
ELIF (wr_addr == 5'd2) { x2 <= wr_data; }
ELIF (wr_addr == 5'd3) { x3 <= wr_data; }
ELIF (wr_addr == 5'd4) { x4 <= wr_data; }
ELIF (wr_addr == 5'd5) { x5 <= wr_data; }
ELIF (wr_addr == 5'd6) { x6 <= wr_data; }
ELIF (wr_addr == 5'd7) { x7 <= wr_data; }
ELIF (wr_addr == 5'd8) { x8 <= wr_data; }
ELIF (wr_addr == 5'd9) { x9 <= wr_data; }
ELIF (wr_addr == 5'd10) { x10 <= wr_data; }
ELIF (wr_addr == 5'd11) { x11 <= wr_data; }
ELIF (wr_addr == 5'd12) { x12 <= wr_data; }
ELIF (wr_addr == 5'd13) { x13 <= wr_data; }
ELIF (wr_addr == 5'd14) { x14 <= wr_data; }
ELIF (wr_addr == 5'd15) { x15 <= wr_data; }
ELIF (wr_addr == 5'd16) { x16 <= wr_data; }
ELIF (wr_addr == 5'd17) { x17 <= wr_data; }
ELIF (wr_addr == 5'd18) { x18 <= wr_data; }
ELIF (wr_addr == 5'd19) { x19 <= wr_data; }
ELIF (wr_addr == 5'd20) { x20 <= wr_data; }
ELIF (wr_addr == 5'd21) { x21 <= wr_data; }
ELIF (wr_addr == 5'd22) { x22 <= wr_data; }
ELIF (wr_addr == 5'd23) { x23 <= wr_data; }
ELIF (wr_addr == 5'd24) { x24 <= wr_data; }
ELIF (wr_addr == 5'd25) { x25 <= wr_data; }
ELIF (wr_addr == 5'd26) { x26 <= wr_data; }
ELIF (wr_addr == 5'd27) { x27 <= wr_data; }
ELIF (wr_addr == 5'd28) { x28 <= wr_data; }
ELIF (wr_addr == 5'd29) { x29 <= wr_data; }
ELIF (wr_addr == 5'd30) { x30 <= wr_data; }
ELIF (wr_addr == 5'd31) { x31 <= wr_data; }
} ELSE {
IF (wr_addr == 5'd1) { s1 <= wr_data; }
ELIF (wr_addr == 5'd2) { s2 <= wr_data; }
ELIF (wr_addr == 5'd3) { s3 <= wr_data; }
ELIF (wr_addr == 5'd4) { s4 <= wr_data; }
ELIF (wr_addr == 5'd5) { s5 <= wr_data; }
ELIF (wr_addr == 5'd6) { s6 <= wr_data; }
ELIF (wr_addr == 5'd7) { s7 <= wr_data; }
ELIF (wr_addr == 5'd8) { s8 <= wr_data; }
ELIF (wr_addr == 5'd9) { s9 <= wr_data; }
ELIF (wr_addr == 5'd10) { s10 <= wr_data; }
ELIF (wr_addr == 5'd11) { s11 <= wr_data; }
ELIF (wr_addr == 5'd12) { s12 <= wr_data; }
ELIF (wr_addr == 5'd13) { s13 <= wr_data; }
ELIF (wr_addr == 5'd14) { s14 <= wr_data; }
ELIF (wr_addr == 5'd15) { s15 <= wr_data; }
ELIF (wr_addr == 5'd16) { s16 <= wr_data; }
ELIF (wr_addr == 5'd17) { s17 <= wr_data; }
ELIF (wr_addr == 5'd18) { s18 <= wr_data; }
ELIF (wr_addr == 5'd19) { s19 <= wr_data; }
ELIF (wr_addr == 5'd20) { s20 <= wr_data; }
ELIF (wr_addr == 5'd21) { s21 <= wr_data; }
ELIF (wr_addr == 5'd22) { s22 <= wr_data; }
ELIF (wr_addr == 5'd23) { s23 <= wr_data; }
ELIF (wr_addr == 5'd24) { s24 <= wr_data; }
ELIF (wr_addr == 5'd25) { s25 <= wr_data; }
ELIF (wr_addr == 5'd26) { s26 <= wr_data; }
ELIF (wr_addr == 5'd27) { s27 <= wr_data; }
ELIF (wr_addr == 5'd28) { s28 <= wr_data; }
ELIF (wr_addr == 5'd29) { s29 <= wr_data; }
ELIF (wr_addr == 5'd30) { s30 <= wr_data; }
ELIF (wr_addr == 5'd31) { s31 <= wr_data; }
}
}
}
@endmod
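The effect of the `shadow` input is that a trap handler sees a separate set of x1-x31 without saving anything to memory. A minimal Python sketch of that behaviour, using a hypothetical `RegFile` class rather than anything from the project, might look like this:

```python
class RegFile:
    """Behavioural model of rv_regfile: two banks of 32 entries, x0 reads as zero."""

    def __init__(self):
        # banks[0] models x1..x31, banks[1] models s1..s31; index 0 is never written.
        self.banks = [[0] * 32, [0] * 32]

    def read(self, addr: int, shadow: int) -> int:
        """Combinational read port: address 0 always returns zero."""
        return 0 if addr == 0 else self.banks[shadow][addr]

    def write(self, addr: int, data: int, shadow: int) -> None:
        """Synchronous write port: writes to address 0 are dropped."""
        if addr != 0:
            self.banks[shadow][addr] = data & 0xFFFFFFFF
```

Writing x5 with `shadow=0` and then reading x5 with `shadow=1` returns the shadow bank's own value, so the interrupted program's registers stay intact while the handler runs.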
// Simple address-decode arbiter for a single bus master
// Routes the CPU bus to one of TARGET_COUNT targets by matching ADDR[31:28] against map_config
@module arbiter
CONST {
SOURCE_COUNT = 1;
TARGET_COUNT = 5;
}
PORT {
IN [TARGET_COUNT * 8] map_config;
BUS SIMPLE_BUS TARGET [SOURCE_COUNT] src;
BUS SIMPLE_BUS SOURCE [TARGET_COUNT] tgt;
}
WIRE {
tgt_done [TARGET_COUNT];
any_done [1];
tgt_match [TARGET_COUNT];
}
@template TARGET_MATCH(match_vec, src_port, config)
match_vec[IDX] <= (((src_port.ADDR[31:28] ^ config[IDX*8+7:IDX*8+4]) & config[IDX*8+3:IDX*8]) == 4'b0000) ? 1'b1 : 1'b0;
@endtemplate
@template TARGET_DONE(done, tgt)
done[IDX] = tgt[IDX].DONE;
@endtemplate
@template ROUTE_TARGET(src_port, tgt_port, config)
@scratch match [1];
// Address decode: ((addr[31:28] ^ value) & care) == 4'b0000
match <= (((src_port.ADDR[31:28] ^ config[IDX*8+7:IDX*8+4]) & config[IDX*8+3:IDX*8]) == 4'b0000);
// VALID only to the matching target
tgt_port[IDX].VALID <= (src_port.VALID == 1'b1 && match == 1'b1) ? 1'b1 : 1'b0;
// Forward address and command (direct assignment, not alias)
tgt_port[IDX].ADDR <= src_port.ADDR;
tgt_port[IDX].CMD <= src_port.CMD;
// DATA uses alias (=) for tristate pass-through
tgt_port[IDX].DATA = src_port.DATA;
@endtemplate
ASYNCHRONOUS {
// Gather target done signals
@apply [TARGET_COUNT] TARGET_DONE(tgt_done, tgt);
any_done <= (tgt_done != lit(TARGET_COUNT, 0));
// Route source to targets (single source, index 0)
@apply [TARGET_COUNT] ROUTE_TARGET(src[0], tgt, map_config);
// Compute which target matches address
@apply [TARGET_COUNT] TARGET_MATCH(tgt_match, src[0], map_config);
// Reverse DATA path: responding target to source
src[0].DATA <= tgt[oh2b(tgt_match)].DATA;
// Route DONE back to source
src[0].DONE <= any_done ? 1'b1 : 1'b0;
}
@endmod
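Each target owns one byte of `map_config`: the high nibble is the value compared with `ADDR[31:28]`, and the low nibble is a care mask. The Python sketch below reproduces the match expression. The helper names and the example map are hypothetical; the actual `map_config` value used by the SoC is not shown in this listing, so the map here simply follows the 0x0_ to 0x4_ layout from the block diagram.

```python
def make_map(entries):
    """Pack (value, care) nibble pairs into a map_config integer, target 0 in bits 7:0."""
    config = 0
    for idx, (value, care) in enumerate(entries):
        config |= (((value & 0xF) << 4) | (care & 0xF)) << (8 * idx)
    return config

def target_matches(addr: int, map_config: int, idx: int) -> bool:
    """((addr[31:28] ^ value) & care) == 0, as in the TARGET_MATCH template."""
    entry = (map_config >> (8 * idx)) & 0xFF
    value, care = entry >> 4, entry & 0xF
    return ((((addr >> 28) & 0xF) ^ value) & care) == 0

def matching_targets(addr: int, map_config: int, target_count: int):
    """Indices of every target whose window contains addr."""
    return [i for i in range(target_count) if target_matches(addr, map_config, i)]

# Hypothetical map: five targets at 0x0_, 0x1_, 0x2_, 0x3_, 0x4_, all nibble bits compared.
EXAMPLE_MAP = make_map([(0x0, 0xF), (0x1, 0xF), (0x2, 0xF), (0x3, 0xF), (0x4, 0xF)])
```

Because the arbiter converts `tgt_match` with `oh2b`, the configuration presumably needs non-overlapping windows; in this model an overlap would show up as `matching_targets` returning more than one index. A care mask of `0x0` matches every address.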
@module ram
PORT {
IN [1] clk;
IN [1] rst_n;
BUS SIMPLE_BUS TARGET pbus;
}
// 5120-word bank, 32-bit wide (10 BSRAMs = 20KB)
MEM(TYPE=BLOCK) {
ram_mem [32] [5120] = 32'h00000000 {
OUT read SYNC;
IN write;
};
}
REGISTER {
pending_read [1] = 1'b0;
data_ready [1] = 1'b0;
read_data [32] = 32'b0;
}
ASYNCHRONOUS {
// Drive data only when data_ready
pbus.DATA <= (pbus.VALID && pbus.CMD == CMD.READ && data_ready == 1'b1) ? read_data : 32'bz;
// DONE signaling
IF (pbus.VALID) {
IF (pbus.CMD == CMD.WRITE) {
pbus.DONE <= 1'b1;
} ELIF (data_ready == 1'b1) {
pbus.DONE <= 1'b1;
} ELSE {
pbus.DONE <= 1'b0;
}
} ELSE {
pbus.DONE <= 1'bz;
}
}
SYNCHRONOUS(CLK = clk RESET = rst_n RESET_ACTIVE = Low) {
// Write path: 1-cycle
IF (pbus.VALID && pbus.CMD == CMD.WRITE) {
ram_mem.write[pbus.ADDR[14:2]] <= pbus.DATA;
}
// Read path: 2-stage pipeline
IF (data_ready == 1'b1) {
data_ready <= 1'b0;
} ELIF (pending_read == 1'b1) {
read_data <= ram_mem.read.data;
data_ready <= 1'b1;
pending_read <= 1'b0;
} ELIF (pbus.VALID && pbus.CMD == CMD.READ) {
ram_mem.read.addr <= pbus.ADDR[14:2];
pending_read <= 1'b1;
}
}
@endmod
@module rom
PORT {
IN [1] clk;
IN [1] rst_n;
BUS SIMPLE_BUS TARGET pbus;
}
// Single 4096-word bank, 32-bit wide (BSRAM)
MEM(TYPE=BLOCK) {
rom_mem [32] [4096] = @file("../out/bios.hex") {
OUT read SYNC;
};
}
REGISTER {
pending_read [1] = 1'b0;
data_ready [1] = 1'b0;
read_data [32] = 32'b0;
}
ASYNCHRONOUS {
// Drive data only when data_ready (after BSRAM pipeline)
pbus.DATA <= (pbus.VALID && pbus.CMD == CMD.READ && data_ready == 1'b1) ? read_data : 32'bz;
// DONE: asserted on reads once data_ready is set.
// Writes never set data_ready, so a write to ROM is never acknowledged with DONE.
IF (pbus.VALID) {
IF (data_ready == 1'b1) {
pbus.DONE <= 1'b1;
} ELSE {
pbus.DONE <= 1'b0;
}
} ELSE {
pbus.DONE <= 1'bz;
}
}
SYNCHRONOUS(CLK = clk RESET = rst_n RESET_ACTIVE = Low) {
// 2-stage pipeline for SYNC memory read:
// Stage 1: pending_read=1 - address sent to BSRAM
// Stage 2: data_ready=1 - data latched into read_data
IF (data_ready == 1'b1) {
// Clear data_ready after CPU has seen DONE
data_ready <= 1'b0;
} ELIF (pending_read == 1'b1) {
// Stage 2: Memory output is now valid, latch
read_data <= rom_mem.read.data;
data_ready <= 1'b1;
pending_read <= 1'b0;
} ELIF (pbus.VALID && pbus.CMD == CMD.READ) {
// Stage 1: New read request, send address
rom_mem.read.addr <= pbus.ADDR[13:2];
pending_read <= 1'b1;
}
}
@endmod
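The read handshake shared by `ram` and `rom` takes two clock edges from VALID to DONE: one to present the address to the block RAM, one to latch its output. The cycle-level Python sketch below models the `rom` registers to make that latency concrete. It is a simplification under stated assumptions: the class and helper names are hypothetical, the high-impedance DONE state is modelled as `False`, and the BSRAM is treated as an address register followed by an array lookup.

```python
class RomModel:
    """Cycle-level model of the rom module's pending_read/data_ready pipeline."""

    def __init__(self, words):
        self.mem = list(words)
        self.pending_read = False
        self.data_ready = False
        self.read_data = 0
        self._addr_reg = 0           # models the registered BSRAM read address

    def done(self, valid: bool) -> bool:
        """Combinational DONE (the 'z' state when VALID is low is simplified to False)."""
        return valid and self.data_ready

    def tick(self, valid: bool, is_read: bool, addr: int) -> None:
        """One rising clock edge, same priority order as the SYNCHRONOUS block."""
        if self.data_ready:
            self.data_ready = False
        elif self.pending_read:
            self.read_data = self.mem[self._addr_reg]
            self.data_ready = True
            self.pending_read = False
        elif valid and is_read:
            self._addr_reg = (addr >> 2) & 0xFFF   # pbus.ADDR[13:2]
            self.pending_read = True

def cycles_until_done(rom, is_read: bool, addr: int, limit: int = 16):
    """Hold VALID high and count clock edges until DONE; None if it never arrives."""
    for edges in range(limit + 1):
        if rom.done(valid=True):
            return edges
        rom.tick(valid=True, is_read=is_read, addr=addr)
    return None
```

In this model a read completes after two edges, while a write to the ROM never produces DONE, which matches the DONE logic in the ASYNCHRONOUS block.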
@module sdram_ctrl
CONST {
CLK_FREQ_MHZ = 54;
INIT_COUNT = CLK_FREQ_MHZ * 200;
REFRESH_INTERVAL = CLK_FREQ_MHZ * 78 / 10;
MODE_REG = 544; // 0x220: CL=2, burst length 1, sequential, single-location write (A9=1)
// GW2AR-18 SDRAM geometry: 2M x 32 = 64Mbit
ROW_BITS = 11;
COL_BITS = 8;
BANK_BITS = 2;
DATA_BITS = 32;
ADDR_BITS = 21; // ROW_BITS + COL_BITS + BANK_BITS
// State machine states
ST_INIT = 0;
ST_IPRE = 1;
ST_IREF = 2;
ST_IMODE = 3;
ST_IDLE = 4;
ST_ACT_W = 5;
ST_ACT = 6;
ST_RD = 7;
ST_RD_CL = 8;
ST_WR = 9;
ST_REF = 10;
}
PORT {
IN [1] clk;
IN [1] rst_n;
// User interface
IN [21] addr;
IN [32] wdata;
OUT [32] rdata;
IN [1] rd;
IN [1] wr;
OUT [1] busy;
OUT [1] done;
// SDRAM physical interface
OUT [1] sdram_cke;
OUT [1] sdram_cs_n;
OUT [1] sdram_ras_n;
OUT [1] sdram_cas_n;
OUT [1] sdram_wen_n;
OUT [4] sdram_dqm;
OUT [11] sdram_addr;
OUT [2] sdram_ba;
INOUT [32] sdram_dq;
}
REGISTER {
state [4] = 4'd0;
init_cnt [14] = 14'b0;
wait_cnt [3] = 3'b0;
ref_cnt [10] = 10'b0;
ref_done [1] = 1'b0;
// Command registers
r_cke [1] = 1'b0;
r_cs_n [1] = 1'b1;
r_ras_n [1] = 1'b1;
r_cas_n [1] = 1'b1;
r_wen_n [1] = 1'b1;
r_addr [11] = 11'b0;
r_ba [2] = 2'b0;
r_dqm [4] = 4'b1111;
// DQ control
r_dq_oe [1] = 1'b0;
r_dq_out [32] = 32'b0;
// Latched request
r_req_addr [21] = 21'b0;
r_req_wdata [32] = 32'b0;
r_req_write [1] = 1'b0;
// Output
r_rdata [32] = 32'b0;
r_done [1] = 1'b0;
}
ASYNCHRONOUS {
// Drive SDRAM pins from registers
sdram_cke <= r_cke;
sdram_cs_n <= r_cs_n;
sdram_ras_n <= r_ras_n;
sdram_cas_n <= r_cas_n;
sdram_wen_n <= r_wen_n;
sdram_dqm <= r_dqm;
sdram_addr <= r_addr;
sdram_ba <= r_ba;
// Tristate DQ bus
sdram_dq <= (r_dq_oe == 1'b1) ? r_dq_out : 32'bz;
// User interface outputs
rdata <= r_rdata;
done <= r_done;
busy <= (state != lit(4, ST_IDLE));
}
SYNCHRONOUS(CLK=clk RESET=rst_n RESET_ACTIVE=Low) {
SELECT(state) {
// ---- INIT: Power-up wait (200us) ----
CASE (lit(4, ST_INIT)) {
r_cke <= 1'b1;
r_dq_oe <= 1'b0;
r_done <= 1'b0;
IF (init_cnt == lit(14, INIT_COUNT)) {
// PRECHARGE ALL: cs=0, ras=0, cas=1, we=0, A10=1
r_cs_n <= 1'b0;
r_ras_n <= 1'b0;
r_cas_n <= 1'b1;
r_wen_n <= 1'b0;
r_addr <= 11'b10000000000;
wait_cnt <= 3'd1;
state <= lit(4, ST_IPRE);
} ELSE {
// INHIBIT
r_cs_n <= 1'b1;
r_ras_n <= 1'b1;
r_cas_n <= 1'b1;
r_wen_n <= 1'b1;
init_cnt <= init_cnt + 14'b1;
}
}
// ---- IPRE: Wait after PRECHARGE ALL ----
CASE (lit(4, ST_IPRE)) {
r_done <= 1'b0;
IF (wait_cnt == 3'b0) {
// AUTO REFRESH: cs=0, ras=0, cas=0, we=1
r_cs_n <= 1'b0;
r_ras_n <= 1'b0;
r_cas_n <= 1'b0;
r_wen_n <= 1'b1;
wait_cnt <= 3'd2;
state <= lit(4, ST_IREF);
} ELSE {
// NOP
r_cs_n <= 1'b0;
r_ras_n <= 1'b1;
r_cas_n <= 1'b1;
r_wen_n <= 1'b1;
wait_cnt <= wait_cnt - 3'b1;
}
}
// ---- IREF: Init AUTO REFRESH (done twice) ----
CASE (lit(4, ST_IREF)) {
r_done <= 1'b0;
IF (wait_cnt == 3'b0) {
IF (ref_done == 1'b0) {
// Second AUTO REFRESH
r_cs_n <= 1'b0;
r_ras_n <= 1'b0;
r_cas_n <= 1'b0;
r_wen_n <= 1'b1;
wait_cnt <= 3'd2;
ref_done <= 1'b1;
} ELSE {
// MODE SET: cs=0, ras=0, cas=0, we=0, addr=mode
r_cs_n <= 1'b0;
r_ras_n <= 1'b0;
r_cas_n <= 1'b0;
r_wen_n <= 1'b0;
r_addr <= lit(11, MODE_REG);
r_ba <= 2'b0;
wait_cnt <= 3'd2;
state <= lit(4, ST_IMODE);
}
} ELSE {
// NOP
r_cs_n <= 1'b0;
r_ras_n <= 1'b1;
r_cas_n <= 1'b1;
r_wen_n <= 1'b1;
wait_cnt <= wait_cnt - 3'b1;
}
}
// ---- IMODE: Wait after MODE REGISTER SET ----
CASE (lit(4, ST_IMODE)) {
r_done <= 1'b0;
IF (wait_cnt == 3'b0) {
// NOP, go to IDLE
r_cs_n <= 1'b0;
r_ras_n <= 1'b1;
r_cas_n <= 1'b1;
r_wen_n <= 1'b1;
r_dqm <= 4'b0000;
state <= lit(4, ST_IDLE);
} ELSE {
// NOP
r_cs_n <= 1'b0;
r_ras_n <= 1'b1;
r_cas_n <= 1'b1;
r_wen_n <= 1'b1;
wait_cnt <= wait_cnt - 3'b1;
}
}
// ---- IDLE: Ready for commands ----
// rd/wr checked before refresh to prevent lost pulses.
// Delaying refresh by one access (~8 cycles) is within SDRAM timing margin.
CASE (lit(4, ST_IDLE)) {
r_dq_oe <= 1'b0;
r_done <= 1'b0;
IF ((rd == 1'b1 || wr == 1'b1) && r_done == 1'b0) {
// Latch request
r_req_addr <= addr;
r_req_wdata <= wdata;
r_req_write <= wr;
// ACTIVATE: cs=0, ras=0, cas=1, we=1
// Bank = addr[20:19], Row = addr[18:8]
r_cs_n <= 1'b0;
r_ras_n <= 1'b0;
r_cas_n <= 1'b1;
r_wen_n <= 1'b1;
r_addr <= addr[18:8];
r_ba <= addr[20:19];
ref_cnt <= ref_cnt + 10'b1;
state <= lit(4, ST_ACT_W);
} ELIF (ref_cnt >= lit(10, REFRESH_INTERVAL)) {
// AUTO REFRESH: cs=0, ras=0, cas=0, we=1
r_cs_n <= 1'b0;
r_ras_n <= 1'b0;
r_cas_n <= 1'b0;
r_wen_n <= 1'b1;
ref_cnt <= 10'b0;
wait_cnt <= 3'd2;
state <= lit(4, ST_REF);
} ELSE {
ref_cnt <= ref_cnt + 10'b1;
// NOP
r_cs_n <= 1'b0;
r_ras_n <= 1'b1;
r_cas_n <= 1'b1;
r_wen_n <= 1'b1;
}
}
// ---- ACT_W: NOP wait for tRCD (20ns needs 2 cycles at 54MHz) ----
CASE (lit(4, ST_ACT_W)) {
r_done <= 1'b0;
// NOP while waiting for tRCD
r_cs_n <= 1'b0;
r_ras_n <= 1'b1;
r_cas_n <= 1'b1;
r_wen_n <= 1'b1;
state <= lit(4, ST_ACT);
}
// ---- ACT: Issue READ or WRITE command ----
CASE (lit(4, ST_ACT)) {
r_done <= 1'b0;
IF (r_req_write == 1'b1) {
// WRITE: cs=0, ras=1, cas=0, we=0, A10=1 (auto-precharge)
// Col = addr[7:0]
r_cs_n <= 1'b0;
r_ras_n <= 1'b1;
r_cas_n <= 1'b0;
r_wen_n <= 1'b0;
r_addr <= {1'b1, 2'b0, r_req_addr[7:0]};
r_dq_oe <= 1'b1;
r_dq_out <= r_req_wdata;
r_dqm <= 4'b0000;
state <= lit(4, ST_WR);
} ELSE {
// READ: cs=0, ras=1, cas=0, we=1, A10=1 (auto-precharge)
// Col = addr[7:0]
r_cs_n <= 1'b0;
r_ras_n <= 1'b1;
r_cas_n <= 1'b0;
r_wen_n <= 1'b1;
r_addr <= {1'b1, 2'b0, r_req_addr[7:0]};
r_dq_oe <= 1'b0;
r_dqm <= 4'b0000;
wait_cnt <= 3'd2;
state <= lit(4, ST_RD);
}
}
// ---- RD: READ command issued, start CAS latency wait ----
CASE (lit(4, ST_RD)) {
r_done <= 1'b0;
// NOP while waiting
r_cs_n <= 1'b0;
r_ras_n <= 1'b1;
r_cas_n <= 1'b1;
r_wen_n <= 1'b1;
r_dq_oe <= 1'b0;
wait_cnt <= wait_cnt - 3'b1;
state <= lit(4, ST_RD_CL);
}
// ---- RD_CL: Waiting for CAS latency ----
CASE (lit(4, ST_RD_CL)) {
// NOP
r_cs_n <= 1'b0;
r_ras_n <= 1'b1;
r_cas_n <= 1'b1;
r_wen_n <= 1'b1;
r_dq_oe <= 1'b0;
IF (wait_cnt == 3'b0) {
// Data valid, capture it
r_rdata <= sdram_dq;
r_done <= 1'b1;
state <= lit(4, ST_IDLE);
} ELSE {
r_done <= 1'b0;
wait_cnt <= wait_cnt - 3'b1;
}
}
// ---- WR: WRITE command issued ----
CASE (lit(4, ST_WR)) {
// NOP, clear DQ drive
r_cs_n <= 1'b0;
r_ras_n <= 1'b1;
r_cas_n <= 1'b1;
r_wen_n <= 1'b1;
r_dq_oe <= 1'b0;
r_done <= 1'b1;
state <= lit(4, ST_IDLE);
}
// ---- REF: Periodic auto refresh ----
CASE (lit(4, ST_REF)) {
r_done <= 1'b0;
IF (wait_cnt == 3'b0) {
// NOP, return to IDLE
r_cs_n <= 1'b0;
r_ras_n <= 1'b1;
r_cas_n <= 1'b1;
r_wen_n <= 1'b1;
state <= lit(4, ST_IDLE);
} ELSE {
// NOP while waiting
r_cs_n <= 1'b0;
r_ras_n <= 1'b1;
r_cas_n <= 1'b1;
r_wen_n <= 1'b1;
wait_cnt <= wait_cnt - 3'b1;
}
}
DEFAULT {
r_done <= 1'b0;
// NOP
r_cs_n <= 1'b0;
r_ras_n <= 1'b1;
r_cas_n <= 1'b1;
r_wen_n <= 1'b1;
state <= lit(4, ST_IDLE);
}
}
}
@endmod
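The CONST block derives its cycle counts from `CLK_FREQ_MHZ`, and the results must fit the 14-bit `init_cnt` and 10-bit `ref_cnt` registers. The Python sketch below recomputes them and decodes `MODE_REG` using the usual SDR SDRAM mode-register layout. Two things are assumptions here: that constant expressions use truncating integer division, and that the on-package SDRAM follows the common field layout (burst length in A2:A0, burst type in A3, CAS latency in A6:A4, write burst mode in A9).

```python
def sdram_timing(clk_mhz: int):
    """Return (INIT_COUNT, REFRESH_INTERVAL) in clock cycles for a given clock in MHz."""
    init_count = clk_mhz * 200               # 200 us power-up wait
    refresh_interval = clk_mhz * 78 // 10    # 7.8 us, assuming truncating division
    return init_count, refresh_interval

def decode_mode_reg(mode: int):
    """Decode an SDR SDRAM mode register value (common field layout, assumed)."""
    burst_code = mode & 0x7
    return {
        # Codes 0..3 mean burst lengths 1, 2, 4, 8; other codes are not decoded here.
        "burst_length": 1 << burst_code if burst_code <= 3 else None,
        "interleaved": bool((mode >> 3) & 1),
        "cas_latency": (mode >> 4) & 0x7,
        "single_write": bool((mode >> 9) & 1),
    }
```

At 54 MHz this gives 10800 and 421 cycles, both within the counter widths declared in the REGISTER block. Overriding `CLK_FREQ_MHZ` with a much faster clock would be worth rechecking against those widths.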
// Bus-mapped SDRAM peripheral
// Bridges SIMPLE_BUS protocol to sdram_ctrl rd/wr/done interface
@module sdram_bus
CONST {
CLK_FREQ_MHZ = 54;
ST_IDLE = 0;
ST_WAIT = 1;
}
PORT {
IN [1] clk;
IN [1] rst_n;
BUS SIMPLE_BUS TARGET pbus;
// SDRAM physical interface (directly to pins)
OUT [1] sdram_cke;
OUT [1] sdram_cs_n;
OUT [1] sdram_ras_n;
OUT [1] sdram_cas_n;
OUT [1] sdram_wen_n;
OUT [4] sdram_dqm;
OUT [11] sdram_addr;
OUT [2] sdram_ba;
INOUT [32] sdram_dq;
}
WIRE {
ctrl_rdata [32];
ctrl_busy [1];
ctrl_done [1];
ctrl_cke [1];
ctrl_cs_n [1];
ctrl_ras_n [1];
ctrl_cas_n [1];
ctrl_wen_n [1];
ctrl_dqm [4];
ctrl_addr [11];
ctrl_ba [2];
ctrl_dq [32];
}
REGISTER {
state [1] = 1'b0;
rd_hold [1] = 1'b0;
wr_hold [1] = 1'b0;
req_addr [21] = 21'b0;
req_wdata [32] = 32'b0;
}
@new ctrl0 sdram_ctrl {
OVERRIDE {
CLK_FREQ_MHZ = CLK_FREQ_MHZ;
}
IN [1] clk = clk;
IN [1] rst_n = rst_n;
IN [21] addr = req_addr;
IN [32] wdata = req_wdata;
OUT [32] rdata = ctrl_rdata;
IN [1] rd = rd_hold;
IN [1] wr = wr_hold;
OUT [1] busy = ctrl_busy;
OUT [1] done = ctrl_done;
OUT [1] sdram_cke = ctrl_cke;
OUT [1] sdram_cs_n = ctrl_cs_n;
OUT [1] sdram_ras_n = ctrl_ras_n;
OUT [1] sdram_cas_n = ctrl_cas_n;
OUT [1] sdram_wen_n = ctrl_wen_n;
OUT [4] sdram_dqm = ctrl_dqm;
OUT [11] sdram_addr = ctrl_addr;
OUT [2] sdram_ba = ctrl_ba;
INOUT [32] sdram_dq = ctrl_dq;
}
ASYNCHRONOUS {
// Pass through SDRAM physical pins
sdram_cke <= ctrl_cke;
sdram_cs_n <= ctrl_cs_n;
sdram_ras_n <= ctrl_ras_n;
sdram_cas_n <= ctrl_cas_n;
sdram_wen_n <= ctrl_wen_n;
sdram_dqm <= ctrl_dqm;
sdram_addr <= ctrl_addr;
sdram_ba <= ctrl_ba;
sdram_dq = ctrl_dq;
// Drive bus DATA on read completion
pbus.DATA <= (pbus.VALID && pbus.CMD == CMD.READ && state == lit(1, ST_WAIT) && ctrl_done == 1'b1) ? ctrl_rdata : 32'bz;
// DONE signaling: multi-cycle, assert when sdram_ctrl done fires
IF (pbus.VALID) {
IF (state == lit(1, ST_WAIT) && ctrl_done == 1'b1) {
pbus.DONE <= 1'b1;
} ELSE {
pbus.DONE <= 1'b0;
}
} ELSE {
pbus.DONE <= 1'bz;
}
}
SYNCHRONOUS(CLK = clk RESET = rst_n RESET_ACTIVE = Low) {
SELECT(state) {
CASE (lit(1, ST_IDLE)) {
IF (pbus.VALID && ctrl_busy == 1'b0) {
req_addr <= pbus.ADDR[22:2];
req_wdata <= pbus.DATA;
IF (pbus.CMD == CMD.WRITE) {
wr_hold <= 1'b1;
rd_hold <= 1'b0;
} ELSE {
rd_hold <= 1'b1;
wr_hold <= 1'b0;
}
state <= lit(1, ST_WAIT);
} ELSE {
rd_hold <= 1'b0;
wr_hold <= 1'b0;
}
}
CASE (lit(1, ST_WAIT)) {
// Hold rd/wr high until ctrl completes.
// If ctrl was refreshing when we asserted, it will see
// rd/wr=1 when it returns to IDLE and start the access.
IF (ctrl_done == 1'b1) {
rd_hold <= 1'b0;
wr_hold <= 1'b0;
state <= lit(1, ST_IDLE);
}
}
DEFAULT {
rd_hold <= 1'b0;
wr_hold <= 1'b0;
state <= lit(1, ST_IDLE);
}
}
}
@endmod
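The slice `pbus.ADDR[22:2]` in `sdram_bus` turns a CPU byte address into a 21-bit word index. The following Python sketch models that mapping and checks the resulting capacity. The base address `0x4000_0000` is an assumption read off the `0x4_` label in the block diagram, and `sdram_word_addr` is a hypothetical helper name, not something defined in the SoC sources.

```python
# Hypothetical model of the sdram_bus address decode (not part of the SoC sources).
SDRAM_BASE = 0x4000_0000   # assumption: the 0x4_ region selects SDRAM
WORD_ADDR_BITS = 21        # width of req_addr in sdram_bus


def sdram_word_addr(byte_addr: int) -> int:
    """Mirror `req_addr <= pbus.ADDR[22:2]`: drop the two byte-lane bits, keep 21 bits."""
    return (byte_addr >> 2) & ((1 << WORD_ADDR_BITS) - 1)


# 2^21 words of 32 bits each.
CAPACITY_BITS = (1 << WORD_ADDR_BITS) * 32

if __name__ == "__main__":
    print(hex(sdram_word_addr(SDRAM_BASE + 0x10)))
    print(CAPACITY_BITS // (1 << 20), "Mbit")
```

Because only bits 22..2 reach the controller, byte addresses that differ by 8 MiB map to the same word in this model.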
@module led_out
PORT {
IN [1] clk;
IN [1] rst_n;
BUS SIMPLE_BUS TARGET pbus;
OUT [6] leds;
}
REGISTER {
data [32] = 32'b0;
}
ASYNCHRONOUS {
leds = data[5:0];
// Drive data only when selected, valid, and in READ mode
pbus.DATA <= (pbus.VALID && pbus.CMD == CMD.READ) ? data : 32'bz;
// Drive DONE only when selected and valid
IF (pbus.VALID) {
pbus.DONE <= 1'b1;
} ELSE {
pbus.DONE <= 1'bz;
}
}
SYNCHRONOUS(CLK = clk RESET = rst_n RESET_ACTIVE = Low) {
IF (pbus.VALID && pbus.CMD == CMD.WRITE) {
data <= pbus.DATA;
}
}
@endmod
// Bus-mapped UART peripheral (TX + RX)
// Offset 0x0 (ADDR[2]=0): Write → TX byte. Read → {30'b0, rx_has_data, tx_ready}
// Offset 0x4 (ADDR[2]=1): Read → {24'h00, rx_byte} (clears rx_has_data)
@module uart
PORT {
IN [1] clk;
IN [1] rst_n;
BUS SIMPLE_BUS TARGET pbus;
OUT [1] tx;
IN [1] rx;
OUT [1] irq_tx_ready;
OUT [1] irq_rx_data;
IN [16] baud_div;
}
WIRE {
tx_ready [1];
tx_wire [1];
rx_data_w [8];
rx_valid_w [1];
}
REGISTER {
tx_data [8] = 8'h00;
tx_valid [1] = 1'b0;
tx_ready_d [1] = 1'b0;
irq_tx_r [1] = 1'b0;
rx_byte [8] = 8'h00;
rx_has_data [1] = 1'b0;
}
@new tx0 uart_tx {
IN [1] clk = clk;
IN [1] rst_n = rst_n;
IN [8] data = tx_data;
IN [1] valid = tx_valid;
OUT [1] ready = tx_ready;
OUT [1] tx = tx_wire;
IN [16] baud_div = baud_div;
}
@new rx0 uart_rx {
IN [1] clk = clk;
IN [1] rst_n = rst_n;
IN [1] rx = rx;
OUT [8] data = rx_data_w;
OUT [1] valid = rx_valid_w;
IN [16] baud_div = baud_div;
}
ASYNCHRONOUS {
tx <= tx_wire;
irq_tx_ready <= irq_tx_r;
irq_rx_data <= rx_has_data;
// Drive data on READ: mux on ADDR[2]
IF (pbus.VALID && pbus.CMD == CMD.READ) {
IF (pbus.ADDR[2] == 1'b0) {
pbus.DATA <= {30'd0, rx_has_data, tx_ready};
} ELSE {
pbus.DATA <= {24'h000000, rx_byte};
}
} ELSE {
pbus.DATA <= 32'bz;
}
// DONE: immediate when selected
IF (pbus.VALID) {
pbus.DONE <= 1'b1;
} ELSE {
pbus.DONE <= 1'bz;
}
}
SYNCHRONOUS(CLK = clk RESET = rst_n RESET_ACTIVE = Low) {
// TX: any bus write latches DATA[7:0] and starts a transmit
// (ADDR[2] is not decoded for writes, so offset 0x4 behaves like offset 0x0)
IF (pbus.VALID && pbus.CMD == CMD.WRITE) {
tx_data <= pbus.DATA[7:0];
tx_valid <= 1'b1;
} ELSE {
tx_valid <= 1'b0;
}
// TX ready rising-edge detect: pulse IRQ when transmitter becomes ready
tx_ready_d <= tx_ready;
IF (tx_ready == 1'b1 && tx_ready_d == 1'b0) {
irq_tx_r <= 1'b1;
} ELSE {
irq_tx_r <= 1'b0;
}
// RX: latch received byte
IF (rx_valid_w == 1'b1) {
rx_byte <= rx_data_w;
rx_has_data <= 1'b1;
} ELIF (pbus.VALID && pbus.CMD == CMD.READ && pbus.ADDR[2] == 1'b1) {
// Clear rx_has_data when bus reads offset 0x4
rx_has_data <= 1'b0;
}
}
@endmod
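The register behaviour of the `uart` bus wrapper (status at offset 0x0, receive byte at offset 0x4 with read-to-clear) can be sketched as a small Python model. This is a behavioural approximation only: `UartRegsModel` and its methods are hypothetical names, and `tx_ready` is held at 1 on the assumption that the transmitter is idle.

```python
class UartRegsModel:
    """Behavioural sketch of the bus-visible UART registers (hypothetical helper)."""

    def __init__(self):
        self.rx_byte = 0
        self.rx_has_data = 0
        self.tx_ready = 1      # assumption: transmitter idle
        self.sent = []         # bytes handed to the transmitter

    def receive(self, byte: int) -> None:
        # Models uart_rx pulsing valid: latch the byte and set the flag.
        self.rx_byte = byte & 0xFF
        self.rx_has_data = 1

    def read(self, offset: int) -> int:
        if (offset >> 2) & 1 == 0:
            # Offset 0x0: {30'b0, rx_has_data, tx_ready}
            return (self.rx_has_data << 1) | self.tx_ready
        # Offset 0x4: return the byte and clear rx_has_data.
        value = self.rx_byte
        self.rx_has_data = 0
        return value

    def write(self, offset: int, value: int) -> None:
        # Writes are not decoded on ADDR[2]; only DATA[7:0] is used.
        self.sent.append(value & 0xFF)


if __name__ == "__main__":
    u = UartRegsModel()
    u.receive(ord("A"))
    while not (u.read(0x0) & 0b10):   # poll rx_has_data
        pass
    print(chr(u.read(0x4)))
```

Firmware would follow the same pattern: poll bit 1 of the status word before reading offset 0x4, and poll bit 0 before writing a byte.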
// Simple UART Transmitter — 8N1, no FIFO
// Asserts ready when idle. When valid is pulsed with data, transmits one byte.
@module uart_tx
PORT {
IN [1] clk;
IN [1] rst_n;
IN [8] data;
IN [1] valid;
OUT [1] ready;
OUT [1] tx;
IN [16] baud_div;
}
REGISTER {
// State machine (0=IDLE, 1=START, 2=DATA, 3=STOP)
state [2] = 2'd0;
baud_cnt [16] = 16'd0;
bit_cnt [3] = 3'd0;
shift [8] = 8'hFF;
// Outputs
tx_out [1] = 1'b1;
ready_out [1] = 1'b1;
}
ASYNCHRONOUS {
tx <= tx_out;
ready <= ready_out;
}
SYNCHRONOUS(CLK = clk RESET = rst_n RESET_ACTIVE = Low) {
SELECT (state) {
CASE (2'd0) {
// IDLE: line high, ready for data
tx_out <= 1'b1;
IF (valid == 1'b1) {
shift <= data;
baud_cnt <= baud_div;
state <= 2'd1;
ready_out <= 1'b0;
} ELSE {
ready_out <= 1'b1;
}
}
CASE (2'd1) {
// START bit: hold TX low for one baud period
tx_out <= 1'b0;
ready_out <= 1'b0;
IF (baud_cnt == 16'd0) {
baud_cnt <= baud_div;
bit_cnt <= 3'd0;
state <= 2'd2;
} ELSE {
baud_cnt <= baud_cnt - 16'd1;
}
}
CASE (2'd2) {
// DATA: shift out 8 bits LSB first
tx_out <= shift[0];
ready_out <= 1'b0;
IF (baud_cnt == 16'd0) {
shift <= { 1'b1, shift[7:1] };
IF (bit_cnt == 3'd7) {
baud_cnt <= baud_div;
state <= 2'd3;
} ELSE {
bit_cnt <= bit_cnt + 3'd1;
baud_cnt <= baud_div;
}
} ELSE {
baud_cnt <= baud_cnt - 16'd1;
}
}
CASE (2'd3) {
// STOP bit: hold TX high for one baud period
tx_out <= 1'b1;
IF (baud_cnt == 16'd0) {
state <= 2'd0;
ready_out <= 1'b1;
} ELSE {
ready_out <= 1'b0;
baud_cnt <= baud_cnt - 16'd1;
}
}
DEFAULT {
tx_out <= 1'b1;
ready_out <= 1'b1;
state <= 2'd0;
}
}
}
@endmod
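In `uart_tx` the counter is loaded with `baud_div` and counts down to zero, so each bit lasts `baud_div + 1` clock cycles. A small Python sketch for picking the divisor follows. The 54 MHz clock in the example is an assumption borrowed from `CLK_FREQ_MHZ` in `sdram_bus` (the UART may run on a different clock), and `baud_divisor` and `actual_baud` are hypothetical helper names.

```python
def baud_divisor(clk_hz: int, baud: int) -> int:
    """Return the baud_div value giving the closest bit period of (baud_div + 1) cycles."""
    cycles_per_bit = round(clk_hz / baud)
    div = cycles_per_bit - 1
    if not 0 <= div <= 0xFFFF:
        raise ValueError("divisor does not fit the 16-bit baud_div port")
    return div


def actual_baud(clk_hz: int, div: int) -> float:
    """Baud rate actually produced for a given divisor."""
    return clk_hz / (div + 1)


if __name__ == "__main__":
    clk = 54_000_000          # assumption: 54 MHz system clock
    div = baud_divisor(clk, 115_200)
    print(div, actual_baud(clk, div))
```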
// Simple UART Receiver — 8N1, no FIFO
// Pulses valid for 1 cycle when a byte is received
@module uart_rx
PORT {
IN [1] clk;
IN [1] rst_n;
IN [1] rx;
OUT [8] data;
OUT [1] valid;
IN [16] baud_div;
}
REGISTER {
// Metastability synchronizer
rx_sync1 [1] = 1'b1;
rx_sync2 [1] = 1'b1;
// State machine (0=IDLE, 1=START, 2=DATA, 3=STOP)
state [2] = 2'd0;
baud_cnt [16] = 16'd0;
bit_cnt [3] = 3'd0;
shift [8] = 8'h00;
// Output
data_out [8] = 8'h00;
valid_out [1] = 1'b0;
}
ASYNCHRONOUS {
data <= data_out;
valid <= valid_out;
}
SYNCHRONOUS(CLK=clk RESET=rst_n RESET_ACTIVE=Low) {
// 2-stage synchronizer for async RX input
rx_sync1 <= rx;
rx_sync2 <= rx_sync1;
SELECT (state) {
CASE (2'd0) {
// IDLE: wait for start bit (synchronized RX sampled low)
valid_out <= 1'b0;
IF (rx_sync2 == 1'b0) {
baud_cnt <= {1'b0, baud_div[15:1]};
state <= 2'd1;
}
}
CASE (2'd1) {
// START: verify start bit at mid-point
valid_out <= 1'b0;
IF (baud_cnt == 16'd0) {
IF (rx_sync2 == 1'b0) {
baud_cnt <= baud_div;
bit_cnt <= 3'd0;
shift <= 8'h00;
state <= 2'd2;
} ELSE {
// False start
state <= 2'd0;
}
} ELSE {
baud_cnt <= baud_cnt - 16'd1;
}
}
CASE (2'd2) {
// DATA: sample 8 bits at mid-bit
valid_out <= 1'b0;
IF (baud_cnt == 16'd0) {
shift <= { rx_sync2, shift[7:1] };
IF (bit_cnt == 3'd7) {
baud_cnt <= baud_div;
state <= 2'd3;
} ELSE {
bit_cnt <= bit_cnt + 3'd1;
baud_cnt <= baud_div;
}
} ELSE {
baud_cnt <= baud_cnt - 16'd1;
}
}
CASE (2'd3) {
// STOP: wait one bit period, then output the byte
// (the stop bit level itself is not checked)
IF (baud_cnt == 16'd0) {
data_out <= shift;
valid_out <= 1'b1;
state <= 2'd0;
} ELSE {
valid_out <= 1'b0;
baud_cnt <= baud_cnt - 16'd1;
}
}
DEFAULT {
valid_out <= 1'b0;
state <= 2'd0;
}
}
}
@endmod
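The receiver's mid-bit sampling follows from its two counter loads: half of `baud_div` after the start bit is first seen, then a full `baud_div` per bit. The Python sketch below works out at which clock edge each sample falls, counting from the edge where the IDLE state sees `rx_sync2` low. `rx_sample_edges` is a hypothetical helper, and the two-cycle synchronizer delay is not included.

```python
def rx_sample_edges(baud_div: int) -> dict:
    """Clock-edge offsets of the uart_rx sample points, relative to start detection."""
    half = baud_div >> 1            # {1'b0, baud_div[15:1]}
    bit_period = baud_div + 1       # counter runs baud_div .. 0 inclusive
    start_check = half + 1          # START state re-checks RX near mid start bit
    data_bits = [start_check + (k + 1) * bit_period for k in range(8)]
    stop_done = data_bits[-1] + bit_period   # edge where valid is raised
    return {"start_check": start_check, "data_bits": data_bits, "stop_done": stop_done}


if __name__ == "__main__":
    edges = rx_sample_edges(468)
    print(edges["start_check"], edges["data_bits"][0], edges["stop_done"])
```

Each data sample lands about half a bit period after the corresponding bit boundary, which is where the line is most stable.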
// SD Card SPI Controller
// SPI-mode SD card interface with sector read/write.
// CPU accesses registers via SIMPLE_BUS TARGET.
// 512-byte sector buffer with auto-incrementing DATA register.
//
// Register map (base + offset, 32-bit word aligned):
// 0x00: COMMAND [RW] bits[1:0]: 00=NONE, 01=READ, 10=WRITE (cleared when the command starts)
// 0x04: STATUS [R] bits[4:0] = {DMA_ACTIVE, SDHC, READY, ERROR, BUSY} (DMA_ACTIVE always reads 0)
// 0x08: SECTOR_LO [RW] bits[15:0] = sector address low
// 0x0C: SECTOR_HI [RW] bits[15:0] = sector address high
// 0x10: (reserved)
// 0x14: DATA [RW] bits[15:0] = buffer auto-increment access
// 0x18: IRQ_CTRL [RW] write bits[1:0] = {clear, enable}; read bits[1:0] = {status, enable}
// SD card state machine states
@global SDCST
POWER_UP = 5'b00000;
SEND_CLKS = 5'b00001;
CMD0 = 5'b00010;
CMD0_RESP = 5'b00011;
CMD8 = 5'b00100;
CMD8_RESP = 5'b00101;
CMD55 = 5'b00110;
CMD55_RESP = 5'b00111;
ACMD41 = 5'b01000;
ACMD41_RESP = 5'b01001;
CMD58 = 5'b01010;
CMD58_RESP = 5'b01011;
IDLE = 5'b01100;
READ_CMD = 5'b01101;
READ_RESP = 5'b01110;
READ_TOKEN = 5'b01111;
READ_DATA = 5'b10000;
READ_CRC = 5'b10001;
READ_DONE = 5'b10011;
WRITE_CMD = 5'b10100;
WRITE_RESP = 5'b10101;
WRITE_TOKEN = 5'b10110;
WRITE_DATA = 5'b10111;
WRITE_CRC = 5'b11000;
WRITE_DRESP = 5'b11001;
WRITE_BUSY = 5'b11010;
WRITE_DONE = 5'b11011;
ERROR = 5'b11100;
CS_GAP = 5'b11101;
@endglob
// SD card register addresses
@global SDREG
COMMAND = 3'b000;
STATUS = 3'b001;
SECTOR_LO = 3'b010;
SECTOR_HI = 3'b011;
DATA = 3'b101;
IRQ_CTRL = 3'b110;
@endglob
// SD card commands
@global SDCMD
NONE = 2'b00;
READ_SECTOR = 2'b01;
WRITE_SECTOR = 2'b10;
@endglob
@module sdcard
CONST {
// SPI clock dividers: system_clk / (2 * (div+1))
// Slow: 74.25MHz / (2*186) = ~200 kHz (init)
// Fast: 74.25MHz / (2*5) = ~7.4 MHz (data)
SPI_DIV_SLOW = 185;
SPI_DIV_FAST = 4;
}
PORT {
IN [1] clk;
IN [1] rst_n;
OUT [1] irq;
BUS SIMPLE_BUS TARGET pbus;
// SPI physical pins
OUT [1] sd_clk;
OUT [1] sd_mosi;
IN [1] sd_miso;
OUT [1] sd_cs_n;
}
// 256-word (512-byte) sector buffer
MEM {
buffer [16] [256] = 16'h0000 {
OUT rd ASYNC;
IN wr;
};
}
WIRE {
read_data [32];
// Precomputed CMD17/CMD24 argument bytes
// SDHC: sector number directly; SDSC: byte address (sector << 9)
cmd_arg_b3 [8];
cmd_arg_b2 [8];
cmd_arg_b1 [8];
cmd_arg_b0 [8];
}
REGISTER {
// SD Card State Machine
state [5] = 5'b00000; // SDCST.POWER_UP
// SPI Engine
spi_div [8] = 8'd185;
spi_div_cnt [8] = 8'b0;
spi_clk_reg [1] = 1'b0;
spi_shift_out [8] = 8'hFF;
spi_shift_in [8] = 8'b0;
spi_bit_cnt [4] = 4'b0;
spi_busy [1] = 1'b0;
spi_rx_data [8] = 8'b0;
// MISO synchronizer
miso_sync1 [1] = 1'b1;
miso_sync2 [1] = 1'b1;
// CS control
cs_n [1] = 1'b1;
// SD Protocol
powerup_cnt [20] = 20'b0;
gap_next_state [5] = 5'b0;
clock_cnt [8] = 8'b0;
send_cmd_phase [3] = 3'b0;
resp_wait_cnt [16] = 16'b0;
resp_byte_cnt [4] = 4'b0;
r1_response [8] = 8'b0;
retry_cnt [16] = 16'b0;
sdhc_flag [1] = 1'b0;
// Sector Read/Write
sector_lo [16] = 16'b0;
sector_hi [16] = 16'b0;
byte_cnt [10] = 10'b0;
byte_pair_lo [8] = 8'b0;
buf_wr_addr [8] = 8'b0;
buf_cpu_addr [8] = 8'b0;
// CPU Command/Status
command [2] = 2'b0;
busy [1] = 1'b0;
error [1] = 1'b0;
ready [1] = 1'b0;
irq_enable [1] = 1'b0;
irq_status [1] = 1'b0;
irq_clear_req [1] = 1'b0;
// Buffer Write Pipeline (single write point, 1-cycle latency)
buf_wr_en [1] = 1'b0;
buf_wr_addr_r [8] = 8'b0;
buf_wr_data_r [16] = 16'b0;
// Bus Interface Pipeline
pending_read [1] = 1'b0;
read_reg_sel [3] = 3'b0;
data_ready [1] = 1'b0;
}
ASYNCHRONOUS {
// SPI Pin Outputs
sd_clk <= spi_clk_reg;
sd_mosi <= spi_shift_out[7];
sd_cs_n <= cs_n;
// CPU Bus Read Data Mux
SELECT(read_reg_sel) {
CASE SDREG.COMMAND {
read_data <= {30'b0, command};
}
CASE SDREG.STATUS {
read_data <= {27'b0, 1'b0, sdhc_flag, ready, error, busy};
}
CASE SDREG.SECTOR_LO {
read_data <= {16'b0, sector_lo};
}
CASE SDREG.SECTOR_HI {
read_data <= {16'b0, sector_hi};
}
CASE SDREG.DATA {
read_data <= {16'b0, buffer.rd[buf_cpu_addr]};
}
CASE SDREG.IRQ_CTRL {
read_data <= {30'b0, irq_status, irq_enable};
}
DEFAULT {
read_data <= 32'b0;
}
}
// Drive bus data on read
pbus.DATA <= (pbus.VALID && pbus.CMD == CMD.READ && data_ready == 1'b1) ? read_data : 32'bz;
// DONE signaling
IF (pbus.VALID) {
IF (pbus.CMD == CMD.WRITE || data_ready == 1'b1) {
pbus.DONE <= 1'b1;
} ELSE {
pbus.DONE <= 1'b0;
}
} ELSE {
pbus.DONE <= 1'bz;
}
// CMD argument bytes (SDHC=sector, SDSC=sector<<9)
IF (sdhc_flag == 1'b1) {
cmd_arg_b3 <= sector_hi[15:8];
cmd_arg_b2 <= sector_hi[7:0];
cmd_arg_b1 <= sector_lo[15:8];
cmd_arg_b0 <= sector_lo[7:0];
} ELSE {
cmd_arg_b3 <= {sector_hi[6:0], sector_lo[15]};
cmd_arg_b2 <= sector_lo[14:7];
cmd_arg_b1 <= {sector_lo[6:0], 1'b0};
cmd_arg_b0 <= 8'h00;
}
// IRQ output
IF (irq_enable == 1'b1 && irq_status == 1'b1) {
irq <= 1'b1;
} ELSE {
irq <= 1'b0;
}
}
SYNCHRONOUS(CLK=clk RESET=rst_n RESET_ACTIVE=Low) {
// =====================
// MISO Synchronizer (always runs)
// =====================
miso_sync1 <= sd_miso;
miso_sync2 <= miso_sync1;
// =====================
// Bus Read Pipeline
// =====================
IF (data_ready == 1'b1) {
data_ready <= 1'b0;
} ELIF (pending_read == 1'b1) {
data_ready <= 1'b1;
pending_read <= 1'b0;
} ELIF (pbus.VALID && pbus.CMD == CMD.READ) {
pending_read <= 1'b1;
read_reg_sel <= pbus.ADDR[4:2];
}
// =====================
// Combined: Buffer Write + SPI Engine + Bus Writes + State Machine
// Priority: Buffer Write > SPI > Bus Write > State Machine
// =====================
IF (buf_wr_en == 1'b1) {
// --- Pending buffer write (1-cycle delayed) ---
buffer.wr[buf_wr_addr_r] <= buf_wr_data_r;
buf_wr_en <= 1'b0;
} ELIF (spi_busy == 1'b1) {
// --- SPI shift register ---
IF (spi_div_cnt == 8'b0) {
spi_div_cnt <= spi_div;
IF (spi_clk_reg == 1'b0) {
// Rising edge: sample MISO
spi_clk_reg <= 1'b1;
spi_shift_in <= {spi_shift_in[6:0], miso_sync2};
} ELSE {
// Falling edge
spi_clk_reg <= 1'b0;
IF (spi_bit_cnt == 4'd7) {
// Transfer complete
spi_busy <= 1'b0;
spi_rx_data <= spi_shift_in;
spi_bit_cnt <= 4'b0;
} ELSE {
spi_bit_cnt <= spi_bit_cnt + 4'b1;
spi_shift_out <= {spi_shift_out[6:0], 1'b1};
}
}
} ELSE {
spi_div_cnt <= spi_div_cnt - 8'b1;
}
} ELIF (pbus.VALID && pbus.CMD == CMD.WRITE) {
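// --- CPU register writes (when SPI idle) ---
// Note: a write that arrives while buf_wr_en or spi_busy is set is still
// acknowledged by pbus.DONE but is not latched here.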
SELECT(pbus.ADDR[4:2]) {
CASE SDREG.COMMAND {
command <= pbus.DATA[1:0];
}
CASE SDREG.SECTOR_LO {
sector_lo <= pbus.DATA[15:0];
}
CASE SDREG.SECTOR_HI {
sector_hi <= pbus.DATA[15:0];
}
CASE SDREG.DATA {
buf_wr_en <= 1'b1;
buf_wr_addr_r <= buf_cpu_addr;
buf_wr_data_r <= pbus.DATA[15:0];
buf_cpu_addr <= buf_cpu_addr + 8'b1;
}
CASE SDREG.IRQ_CTRL {
irq_enable <= pbus.DATA[0];
IF (pbus.DATA[1] == 1'b1) {
irq_clear_req <= 1'b1;
}
}
DEFAULT {
}
}
} ELSE {
// --- State Machine (SPI idle, no bus write) ---
SELECT(state) {
// ----- Power-Up Delay -----
CASE SDCST.POWER_UP {
cs_n <= 1'b1;
busy <= 1'b1;
IF (powerup_cnt == 20'd742500) {
powerup_cnt <= 20'b0;
clock_cnt <= 8'b0;
state <= SDCST.SEND_CLKS;
} ELSE {
powerup_cnt <= powerup_cnt + 20'b1;
}
}
// ----- Send 20 dummy bytes (160 clocks) with CS high; SD needs at least 74 -----
CASE SDCST.SEND_CLKS {
IF (clock_cnt == 8'd20) {
state <= SDCST.CMD0;
send_cmd_phase <= 3'b0;
retry_cnt <= 16'b0;
} ELSE {
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
clock_cnt <= clock_cnt + 8'b1;
}
}
// ----- CMD0: GO_IDLE_STATE -----
CASE SDCST.CMD0 {
SELECT(send_cmd_phase) {
CASE 3'd0 {
cs_n <= 1'b0;
spi_busy <= 1'b1;
spi_shift_out <= 8'h40;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd1;
}
CASE 3'd1 {
spi_busy <= 1'b1;
spi_shift_out <= 8'h00;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd2;
}
CASE 3'd2 {
spi_busy <= 1'b1;
spi_shift_out <= 8'h00;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd3;
}
CASE 3'd3 {
spi_busy <= 1'b1;
spi_shift_out <= 8'h00;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd4;
}
CASE 3'd4 {
spi_busy <= 1'b1;
spi_shift_out <= 8'h00;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd5;
}
CASE 3'd5 {
spi_busy <= 1'b1;
spi_shift_out <= 8'h95; // CRC for CMD0
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd6;
}
CASE 3'd6 {
resp_wait_cnt <= 16'b0;
state <= SDCST.CMD0_RESP;
}
DEFAULT {
send_cmd_phase <= 3'd0;
}
}
}
// ----- CMD0 Response -----
CASE SDCST.CMD0_RESP {
IF (spi_rx_data[7] == 1'b0 && resp_wait_cnt != 16'b0) {
r1_response <= spi_rx_data;
cs_n <= 1'b1;
IF (spi_rx_data == 8'h01) {
gap_next_state <= SDCST.CMD8;
state <= SDCST.CS_GAP;
} ELSE {
IF (retry_cnt == 16'd255) {
state <= SDCST.ERROR;
} ELSE {
retry_cnt <= retry_cnt + 16'b1;
gap_next_state <= SDCST.CMD0;
state <= SDCST.CS_GAP;
}
}
} ELIF (resp_wait_cnt == 16'd64) {
cs_n <= 1'b1;
IF (retry_cnt == 16'd255) {
state <= SDCST.ERROR;
} ELSE {
retry_cnt <= retry_cnt + 16'b1;
gap_next_state <= SDCST.CMD0;
state <= SDCST.CS_GAP;
}
} ELSE {
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
resp_wait_cnt <= resp_wait_cnt + 16'b1;
}
}
// ----- CMD8: SEND_IF_COND -----
CASE SDCST.CMD8 {
SELECT(send_cmd_phase) {
CASE 3'd0 {
cs_n <= 1'b0;
spi_busy <= 1'b1;
spi_shift_out <= 8'h48;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd1;
}
CASE 3'd1 {
spi_busy <= 1'b1;
spi_shift_out <= 8'h00;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd2;
}
CASE 3'd2 {
spi_busy <= 1'b1;
spi_shift_out <= 8'h00;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd3;
}
CASE 3'd3 {
spi_busy <= 1'b1;
spi_shift_out <= 8'h01; // VHS
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd4;
}
CASE 3'd4 {
spi_busy <= 1'b1;
spi_shift_out <= 8'hAA; // Check pattern
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd5;
}
CASE 3'd5 {
spi_busy <= 1'b1;
spi_shift_out <= 8'h87; // CRC for CMD8
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd6;
}
CASE 3'd6 {
resp_wait_cnt <= 16'b0;
resp_byte_cnt <= 4'd0;
state <= SDCST.CMD8_RESP;
}
DEFAULT {
send_cmd_phase <= 3'd0;
}
}
}
// ----- CMD8 Response: R7 -----
CASE SDCST.CMD8_RESP {
IF (resp_byte_cnt == 4'd0) {
IF (spi_rx_data[7] == 1'b0 && resp_wait_cnt != 16'b0) {
r1_response <= spi_rx_data;
IF (spi_rx_data == 8'h01) {
resp_byte_cnt <= 4'd1;
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
} ELSE {
cs_n <= 1'b1;
state <= SDCST.ERROR;
}
} ELIF (resp_wait_cnt == 16'd64) {
cs_n <= 1'b1;
state <= SDCST.ERROR;
} ELSE {
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
resp_wait_cnt <= resp_wait_cnt + 16'b1;
}
} ELIF (resp_byte_cnt == 4'd4) {
cs_n <= 1'b1;
IF (spi_rx_data == 8'hAA) {
retry_cnt <= 16'b0;
gap_next_state <= SDCST.CMD55;
state <= SDCST.CS_GAP;
} ELSE {
state <= SDCST.ERROR;
}
} ELSE {
resp_byte_cnt <= resp_byte_cnt + 4'b1;
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
}
}
// ----- CMD55: APP_CMD prefix -----
CASE SDCST.CMD55 {
SELECT(send_cmd_phase) {
CASE 3'd0 {
cs_n <= 1'b0;
spi_busy <= 1'b1;
spi_shift_out <= 8'h77;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd1;
}
CASE 3'd1 {
spi_busy <= 1'b1;
spi_shift_out <= 8'h00;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd2;
}
CASE 3'd2 {
spi_busy <= 1'b1;
spi_shift_out <= 8'h00;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd3;
}
CASE 3'd3 {
spi_busy <= 1'b1;
spi_shift_out <= 8'h00;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd4;
}
CASE 3'd4 {
spi_busy <= 1'b1;
spi_shift_out <= 8'h00;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd5;
}
CASE 3'd5 {
spi_busy <= 1'b1;
spi_shift_out <= 8'h65;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd6;
}
CASE 3'd6 {
resp_wait_cnt <= 16'b0;
state <= SDCST.CMD55_RESP;
}
DEFAULT {
send_cmd_phase <= 3'd0;
}
}
}
// ----- CMD55 Response -----
CASE SDCST.CMD55_RESP {
IF (spi_rx_data[7] == 1'b0 && resp_wait_cnt != 16'b0) {
r1_response <= spi_rx_data;
cs_n <= 1'b1;
gap_next_state <= SDCST.ACMD41;
state <= SDCST.CS_GAP;
} ELIF (resp_wait_cnt == 16'd64) {
cs_n <= 1'b1;
state <= SDCST.ERROR;
} ELSE {
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
resp_wait_cnt <= resp_wait_cnt + 16'b1;
}
}
// ----- ACMD41: SD_SEND_OP_COND -----
CASE SDCST.ACMD41 {
SELECT(send_cmd_phase) {
CASE 3'd0 {
cs_n <= 1'b0;
spi_busy <= 1'b1;
spi_shift_out <= 8'h69;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd1;
}
CASE 3'd1 {
spi_busy <= 1'b1;
spi_shift_out <= 8'h40; // HCS bit
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd2;
}
CASE 3'd2 {
spi_busy <= 1'b1;
spi_shift_out <= 8'h00;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd3;
}
CASE 3'd3 {
spi_busy <= 1'b1;
spi_shift_out <= 8'h00;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd4;
}
CASE 3'd4 {
spi_busy <= 1'b1;
spi_shift_out <= 8'h00;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd5;
}
CASE 3'd5 {
spi_busy <= 1'b1;
spi_shift_out <= 8'h77;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd6;
}
CASE 3'd6 {
resp_wait_cnt <= 16'b0;
state <= SDCST.ACMD41_RESP;
}
DEFAULT {
send_cmd_phase <= 3'd0;
}
}
}
// ----- ACMD41 Response -----
CASE SDCST.ACMD41_RESP {
IF (spi_rx_data[7] == 1'b0 && resp_wait_cnt != 16'b0) {
r1_response <= spi_rx_data;
cs_n <= 1'b1;
IF (spi_rx_data == 8'h00) {
gap_next_state <= SDCST.CMD58;
state <= SDCST.CS_GAP;
} ELSE {
IF (retry_cnt == 16'd1000) {
state <= SDCST.ERROR;
} ELSE {
retry_cnt <= retry_cnt + 16'b1;
gap_next_state <= SDCST.CMD55;
state <= SDCST.CS_GAP;
}
}
} ELIF (resp_wait_cnt == 16'd64) {
cs_n <= 1'b1;
IF (retry_cnt == 16'd1000) {
state <= SDCST.ERROR;
} ELSE {
retry_cnt <= retry_cnt + 16'b1;
gap_next_state <= SDCST.CMD55;
state <= SDCST.CS_GAP;
}
} ELSE {
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
resp_wait_cnt <= resp_wait_cnt + 16'b1;
}
}
// ----- CMD58: READ_OCR -----
CASE SDCST.CMD58 {
SELECT(send_cmd_phase) {
CASE 3'd0 {
cs_n <= 1'b0;
spi_busy <= 1'b1;
spi_shift_out <= 8'h7A;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd1;
}
CASE 3'd1 {
spi_busy <= 1'b1;
spi_shift_out <= 8'h00;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd2;
}
CASE 3'd2 {
spi_busy <= 1'b1;
spi_shift_out <= 8'h00;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd3;
}
CASE 3'd3 {
spi_busy <= 1'b1;
spi_shift_out <= 8'h00;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd4;
}
CASE 3'd4 {
spi_busy <= 1'b1;
spi_shift_out <= 8'h00;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd5;
}
CASE 3'd5 {
spi_busy <= 1'b1;
spi_shift_out <= 8'hFD;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd6;
}
CASE 3'd6 {
resp_wait_cnt <= 16'b0;
resp_byte_cnt <= 4'd0;
state <= SDCST.CMD58_RESP;
}
DEFAULT {
send_cmd_phase <= 3'd0;
}
}
}
// ----- CMD58 Response: R3 -----
CASE SDCST.CMD58_RESP {
IF (resp_byte_cnt == 4'd0) {
IF (spi_rx_data[7] == 1'b0 && resp_wait_cnt != 16'b0) {
r1_response <= spi_rx_data;
IF (spi_rx_data == 8'h00) {
resp_byte_cnt <= 4'd1;
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
} ELSE {
cs_n <= 1'b1;
state <= SDCST.ERROR;
}
} ELIF (resp_wait_cnt == 16'd64) {
cs_n <= 1'b1;
state <= SDCST.ERROR;
} ELSE {
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
resp_wait_cnt <= resp_wait_cnt + 16'b1;
}
} ELIF (resp_byte_cnt == 4'd1) {
sdhc_flag <= spi_rx_data[6];
resp_byte_cnt <= 4'd2;
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
} ELIF (resp_byte_cnt == 4'd4) {
cs_n <= 1'b1;
spi_div <= 8'd4; // Switch to fast clock
busy <= 1'b0;
ready <= 1'b1;
state <= SDCST.IDLE;
} ELSE {
resp_byte_cnt <= resp_byte_cnt + 4'b1;
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
}
}
// ----- IDLE -----
CASE SDCST.IDLE {
// IRQ clear handshake
IF (irq_clear_req == 1'b1) {
irq_status <= 1'b0;
irq_clear_req <= 1'b0;
}
// Command dispatch
IF (command == SDCMD.READ_SECTOR) {
command <= SDCMD.NONE;
error <= 1'b0;
busy <= 1'b1;
buf_wr_addr <= 8'b0;
byte_cnt <= 10'b0;
gap_next_state <= SDCST.READ_CMD;
state <= SDCST.CS_GAP;
} ELIF (command == SDCMD.WRITE_SECTOR) {
command <= SDCMD.NONE;
error <= 1'b0;
busy <= 1'b1;
byte_cnt <= 10'b0;
buf_cpu_addr <= 8'b0;
gap_next_state <= SDCST.WRITE_CMD;
state <= SDCST.CS_GAP;
} ELSE {
// Auto-increment buffer pointer after DATA read
IF (data_ready == 1'b1 && read_reg_sel == SDREG.DATA) {
buf_cpu_addr <= buf_cpu_addr + 8'b1;
}
}
}
// ----- READ_CMD: CMD17 -----
CASE SDCST.READ_CMD {
SELECT(send_cmd_phase) {
CASE 3'd0 {
cs_n <= 1'b0;
spi_busy <= 1'b1;
spi_shift_out <= 8'h51;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd1;
}
CASE 3'd1 {
spi_busy <= 1'b1;
spi_shift_out <= cmd_arg_b3;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd2;
}
CASE 3'd2 {
spi_busy <= 1'b1;
spi_shift_out <= cmd_arg_b2;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd3;
}
CASE 3'd3 {
spi_busy <= 1'b1;
spi_shift_out <= cmd_arg_b1;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd4;
}
CASE 3'd4 {
spi_busy <= 1'b1;
spi_shift_out <= cmd_arg_b0;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd5;
}
CASE 3'd5 {
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd6;
}
CASE 3'd6 {
resp_wait_cnt <= 16'b0;
state <= SDCST.READ_RESP;
}
DEFAULT {
send_cmd_phase <= 3'd0;
}
}
}
// ----- READ_RESP -----
CASE SDCST.READ_RESP {
IF (spi_rx_data[7] == 1'b0 && resp_wait_cnt != 16'b0) {
r1_response <= spi_rx_data;
IF (spi_rx_data == 8'h00) {
resp_wait_cnt <= 16'b0;
state <= SDCST.READ_TOKEN;
} ELSE {
cs_n <= 1'b1;
error <= 1'b1;
busy <= 1'b0;
state <= SDCST.IDLE;
}
} ELIF (resp_wait_cnt == 16'd64) {
cs_n <= 1'b1;
error <= 1'b1;
busy <= 1'b0;
state <= SDCST.IDLE;
} ELSE {
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
resp_wait_cnt <= resp_wait_cnt + 16'b1;
}
}
// ----- READ_TOKEN -----
CASE SDCST.READ_TOKEN {
IF (spi_rx_data == 8'hFE && resp_wait_cnt != 16'b0) {
byte_cnt <= 10'b0;
buf_wr_addr <= 8'b0;
state <= SDCST.READ_DATA;
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
} ELIF (resp_wait_cnt == 16'd4096) {
cs_n <= 1'b1;
error <= 1'b1;
busy <= 1'b0;
state <= SDCST.IDLE;
} ELSE {
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
resp_wait_cnt <= resp_wait_cnt + 16'b1;
}
}
// ----- READ_DATA -----
CASE SDCST.READ_DATA {
IF (byte_cnt == 10'd512) {
byte_cnt <= 10'b0;
state <= SDCST.READ_CRC;
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
} ELSE {
IF (byte_cnt[0] == 1'b0) {
byte_pair_lo <= spi_rx_data;
} ELSE {
buf_wr_en <= 1'b1;
buf_wr_addr_r <= buf_wr_addr;
buf_wr_data_r <= {spi_rx_data, byte_pair_lo};
buf_wr_addr <= buf_wr_addr + 8'b1;
}
byte_cnt <= byte_cnt + 10'b1;
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
}
}
// ----- READ_CRC -----
CASE SDCST.READ_CRC {
IF (byte_cnt == 10'd2) {
cs_n <= 1'b1;
state <= SDCST.READ_DONE;
} ELSE {
byte_cnt <= byte_cnt + 10'b1;
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
}
}
// ----- READ_DONE -----
CASE SDCST.READ_DONE {
busy <= 1'b0;
irq_status <= 1'b1;
irq_clear_req <= 1'b0;
buf_cpu_addr <= 8'b0;
state <= SDCST.IDLE;
}
// ----- WRITE_CMD: CMD24 -----
CASE SDCST.WRITE_CMD {
SELECT(send_cmd_phase) {
CASE 3'd0 {
cs_n <= 1'b0;
spi_busy <= 1'b1;
spi_shift_out <= 8'h58;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd1;
}
CASE 3'd1 {
spi_busy <= 1'b1;
spi_shift_out <= cmd_arg_b3;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd2;
}
CASE 3'd2 {
spi_busy <= 1'b1;
spi_shift_out <= cmd_arg_b2;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd3;
}
CASE 3'd3 {
spi_busy <= 1'b1;
spi_shift_out <= cmd_arg_b1;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd4;
}
CASE 3'd4 {
spi_busy <= 1'b1;
spi_shift_out <= cmd_arg_b0;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd5;
}
CASE 3'd5 {
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
send_cmd_phase <= 3'd6;
}
CASE 3'd6 {
resp_wait_cnt <= 16'b0;
state <= SDCST.WRITE_RESP;
}
DEFAULT {
send_cmd_phase <= 3'd0;
}
}
}
// ----- WRITE_RESP -----
CASE SDCST.WRITE_RESP {
IF (spi_rx_data[7] == 1'b0 && resp_wait_cnt != 16'b0) {
r1_response <= spi_rx_data;
IF (spi_rx_data == 8'h00) {
state <= SDCST.WRITE_TOKEN;
} ELSE {
cs_n <= 1'b1;
error <= 1'b1;
busy <= 1'b0;
state <= SDCST.IDLE;
}
} ELIF (resp_wait_cnt == 16'd64) {
cs_n <= 1'b1;
error <= 1'b1;
busy <= 1'b0;
state <= SDCST.IDLE;
} ELSE {
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
resp_wait_cnt <= resp_wait_cnt + 16'b1;
}
}
// ----- WRITE_TOKEN -----
CASE SDCST.WRITE_TOKEN {
spi_busy <= 1'b1;
spi_shift_out <= 8'hFE;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
byte_cnt <= 10'b0;
buf_cpu_addr <= 8'b0;
state <= SDCST.WRITE_DATA;
}
// ----- WRITE_DATA -----
CASE SDCST.WRITE_DATA {
IF (byte_cnt == 10'd512) {
byte_cnt <= 10'b0;
state <= SDCST.WRITE_CRC;
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
} ELSE {
IF (byte_cnt[0] == 1'b0) {
spi_shift_out <= buffer.rd[buf_cpu_addr][7:0];
} ELSE {
spi_shift_out <= buffer.rd[buf_cpu_addr][15:8];
buf_cpu_addr <= buf_cpu_addr + 8'b1;
}
spi_busy <= 1'b1;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
byte_cnt <= byte_cnt + 10'b1;
}
}
// ----- WRITE_CRC -----
CASE SDCST.WRITE_CRC {
IF (byte_cnt == 10'd2) {
resp_wait_cnt <= 16'b0;
state <= SDCST.WRITE_DRESP;
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
} ELSE {
byte_cnt <= byte_cnt + 10'b1;
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
}
}
// ----- WRITE_DRESP -----
CASE SDCST.WRITE_DRESP {
IF (spi_rx_data != 8'hFF && resp_wait_cnt != 16'b0) {
IF (spi_rx_data[3:1] == 3'b010) {
state <= SDCST.WRITE_BUSY;
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
} ELSE {
cs_n <= 1'b1;
error <= 1'b1;
busy <= 1'b0;
state <= SDCST.IDLE;
}
} ELIF (resp_wait_cnt == 16'd64) {
cs_n <= 1'b1;
error <= 1'b1;
busy <= 1'b0;
state <= SDCST.IDLE;
} ELSE {
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
resp_wait_cnt <= resp_wait_cnt + 16'b1;
}
}
// ----- WRITE_BUSY -----
CASE SDCST.WRITE_BUSY {
IF (spi_rx_data != 8'h00) {
cs_n <= 1'b1;
state <= SDCST.WRITE_DONE;
} ELSE {
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
}
}
// ----- WRITE_DONE -----
CASE SDCST.WRITE_DONE {
busy <= 1'b0;
irq_status <= 1'b1;
irq_clear_req <= 1'b0;
state <= SDCST.IDLE;
}
// ----- CS_GAP: send 8 clocks with CS HIGH between commands -----
CASE SDCST.CS_GAP {
spi_busy <= 1'b1;
spi_shift_out <= 8'hFF;
spi_shift_in <= 8'b0;
spi_bit_cnt <= 4'b0;
spi_clk_reg <= 1'b0;
spi_div_cnt <= spi_div;
state <= gap_next_state;
send_cmd_phase <= 3'd0;
}
// ----- ERROR -----
CASE SDCST.ERROR {
busy <= 1'b0;
error <= 1'b1;
cs_n <= 1'b1;
}
DEFAULT {
state <= SDCST.ERROR;
}
}
}
}
@endmod
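The write path above shifts each 16-bit buffer word out low byte first, then checks bits [3:1] of the card's data-response token. The following is a minimal Python sketch of those two rules, not the project's own tooling. The function names are hypothetical, and the 256-word sector layout is an assumption inferred from the 8-bit `buf_cpu_addr` and the 512-byte count.

```python
def sector_bytes(words):
    """Serialize 16-bit buffer words in the order WRITE_DATA shifts them out."""
    out = bytearray()
    for w in words:
        out.append(w & 0xFF)          # even byte_cnt: bits [7:0]
        out.append((w >> 8) & 0xFF)   # odd byte_cnt: bits [15:8], then next word
    return bytes(out)


def data_response_accepted(resp):
    """True when bits [3:1] of the data-response token equal 0b010 (data accepted)."""
    return (resp >> 1) & 0b111 == 0b010
```

A 256-word buffer serializes to exactly 512 bytes, which matches the `byte_cnt == 10'd512` exit condition in WRITE_DATA.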
// Video Output Pipeline
// Dual-mode: 1280x720 @60Hz or 1920x1080 @30Hz DVI/HDMI output.
// Mode 0 (720p): 80x22 characters, 16x32 pixel glyphs
// Mode 1 (1080p): 120x33 characters, 16x32 pixel glyphs
// RGB565 FG+BG per cell. 8-pixel vertical offset from top.
//
// Pipeline (all on pixel_clk):
// Stage 0 (comb): compute cell_addr → drives vram_addr to terminal
// Edge 0→1: terminal registers BSRAM addr; pipeline regs capture timing
// Edge 1→2: char/attr valid from terminal; font ROM addr registered
// Edge 2→3: font bitmap latched into font_data; p3 captures pixel_col/attr
// Stage 3 (comb): font_bit + pixel color computed from font_data + p3
// Edge 3→4: pixel color registered → TMDS input
// Edge 4→5: TMDS encoder registers internally
@module video_out
PORT {
IN [1] pixel_clk;
IN [1] rst_n;
IN [1] video_mode;
// Framebuffer read interface (pixel_clk domain)
OUT [12] vram_addr;
IN [8] vram_char;
IN [32] vram_attr;
// Cursor info from terminal
IN [12] cursor_pos;
IN [3] cursor_style;
// TMDS outputs
OUT [10] tmds_clk;
OUT [10] tmds_d0;
OUT [10] tmds_d1;
OUT [10] tmds_d2;
}
CONST {
V_OFFSET = 8;
}
// Font ROM: 256 glyphs x 32 rows x 16 bits = 8192 entries
MEM(TYPE=BLOCK) {
font_rom [16] [8192] = @file("../out/font_16x32.hex") {
OUT font_rd SYNC;
};
}
// Video timing generator
@new vt0 video_timing {
IN [1] clk = pixel_clk;
IN [1] rst_n = rst_n;
IN [1] mode = mode_sync2;
OUT [1] hsync = vt_hsync;
OUT [1] vsync = vt_vsync;
OUT [1] display_enable = vt_de;
OUT [12] x_pos = vt_x;
OUT [11] y_pos = vt_y;
}
// TMDS encoders
@new enc_b tmds_encoder {
IN [1] clk = pixel_clk;
IN [1] rst_n = rst_n;
IN [8] data_in = p4_b;
IN [1] c0 = p4_hsync;
IN [1] c1 = p4_vsync;
IN [1] display_enable = p4_de;
OUT [10] tmds_out = tmds_d0;
}
@new enc_g tmds_encoder {
IN [1] clk = pixel_clk;
IN [1] rst_n = rst_n;
IN [8] data_in = p4_g;
IN [1] c0 = 1'b0;
IN [1] c1 = 1'b0;
IN [1] display_enable = p4_de;
OUT [10] tmds_out = tmds_d1;
}
@new enc_r tmds_encoder {
IN [1] clk = pixel_clk;
IN [1] rst_n = rst_n;
IN [8] data_in = p4_r;
IN [1] c0 = 1'b0;
IN [1] c1 = 1'b0;
IN [1] display_enable = p4_de;
OUT [10] tmds_out = tmds_d2;
}
WIRE {
// Timing signals from video_timing
vt_hsync [1];
vt_vsync [1];
vt_de [1];
vt_x [12];
vt_y [11];
// Cell coordinates (combinational)
col [7]; // x / 16
row [6]; // (y - V_OFFSET) / 32
scanline [5]; // (y - V_OFFSET) % 32
y_adj [11]; // y - V_OFFSET
// Row multiplication intermediates
row_x128 [13];
row_x64 [13];
row_x16 [11];
row_x8 [10];
cell_addr [12];
// Text area flag
in_text_area [1];
// Cursor matching
is_cursor_cell [1];
cursor_active [1]; // cursor visible (accounts for blink)
in_cursor_zone [1]; // scanline is in cursor region
// Font pixel selection
font_bit [1];
// RGB565 fields from latched attr
fg_r5 [5]; fg_g6 [6]; fg_b5 [5];
bg_r5 [5]; bg_g6 [6]; bg_b5 [5];
// Pixel color output (comb)
pixel_r [8];
pixel_g [8];
pixel_b [8];
// Max text area bounds (mode-dependent)
y_text_max [11];
x_active [12];
col_max [7];
// Cursor pixel output override
cursor_pixel [1];
}
REGISTER {
// CDC synchronizer for video_mode (sys_clk → pixel_clk)
mode_sync1 [1] = 1'b0;
mode_sync2 [1] = 1'b0;
// Blink timer: 0.5 s at 74.25 MHz is 37,125,000 cycles, which exceeds 2^25, so the 25-bit counter times half of that (0.25 s per toggle)
blink_counter [25] = 25'b0;
blink_on [1] = 1'b1;
// Pipeline cursor cell match and scanline through stages
p1_is_cursor [1] = 1'b0;
p2_is_cursor [1] = 1'b0;
p2_scanline [5] = 5'b0;
p3_is_cursor [1] = 1'b0;
p3_scanline [5] = 5'b0;
// Pipeline stage 1: timing delayed 1 cycle
p1_pixel_col [4] = 4'b0;
p1_hsync [1] = 1'b0;
p1_vsync [1] = 1'b0;
p1_de [1] = 1'b0;
p1_in_text [1] = 1'b0;
p1_scanline [5] = 5'b0;
// Pipeline stage 2: char/attr latched, font addr sent
p2_pixel_col [4] = 4'b0;
p2_hsync [1] = 1'b0;
p2_vsync [1] = 1'b0;
p2_de [1] = 1'b0;
p2_in_text [1] = 1'b0;
p2_attr [32] = 32'b0;
// Font data latch (SYNC MEM must be read in SYNC block)
font_data [16] = 16'b0;
// Pipeline stage 3: font data valid, aligned with pixel_col/attr
p3_pixel_col [4] = 4'b0;
p3_hsync [1] = 1'b0;
p3_vsync [1] = 1'b0;
p3_de [1] = 1'b0;
p3_in_text [1] = 1'b0;
p3_attr [32] = 32'b0;
// Pipeline stage 4: pixel computed, TMDS input
p4_r [8] = 8'b0;
p4_g [8] = 8'b0;
p4_b [8] = 8'b0;
p4_hsync [1] = 1'b0;
p4_vsync [1] = 1'b0;
p4_de [1] = 1'b0;
}
ASYNCHRONOUS {
// TMDS clock channel
tmds_clk <= 10'b1111100000;
// Mode-dependent text area bounds
// 720p: 8 + 22*32 = 712, active 1280, 80 cols
// 1080p: 8 + 33*32 = 1064, active 1920, 120 cols
y_text_max <= (mode_sync2 == 1'b1) ? 11'd1064 : 11'd712;
x_active <= (mode_sync2 == 1'b1) ? 12'd1920 : 12'd1280;
col_max <= (mode_sync2 == 1'b1) ? 7'd120 : 7'd80;
// --- Stage 0: Combinational cell address ---
y_adj <= vt_y - lit(11, V_OFFSET);
col <= vt_x[10:4];
row <= y_adj[10:5];
scanline <= y_adj[4:0];
in_text_area <= (vt_y >= lit(11, V_OFFSET) && vt_y < y_text_max &&
vt_x < x_active && vt_x[10:4] < col_max)
? 1'b1 : 1'b0;
// Cell address: row * cols + col
// 720p: row*80 = row*64 + row*16
// 1080p: row*120 = row*128 - row*8
row_x128 <= {row, 7'b0};
row_x64 <= {1'b0, row, 6'b0};
row_x16 <= {1'b0, row, 4'b0};
row_x8 <= {1'b0, row, 3'b0};
IF (mode_sync2 == 1'b1) {
cell_addr <= row_x128[11:0] - {2'b0, row_x8} + {5'b0, col};
} ELSE {
cell_addr <= row_x64[11:0] + {1'b0, row_x16} + {5'b0, col};
}
// Drive BSRAM address combinationally
vram_addr <= cell_addr;
// Cursor cell detection (stage 0)
is_cursor_cell <= (cell_addr == cursor_pos && cursor_style != 3'b0) ? 1'b1 : 1'b0;
// Cursor blink: styles 3,4 blink; styles 1,2 always on
cursor_active <= (cursor_style == 3'd1 || cursor_style == 3'd2 ||
((cursor_style == 3'd3 || cursor_style == 3'd4) && blink_on == 1'b1))
? 1'b1 : 1'b0;
// Cursor scanline zone: check if current scanline is in cursor region
// Glyph is 32 pixels tall. Bottom 2 pixels = scanlines 30-31. Bottom 8 = scanlines 24-31.
IF (cursor_style == 3'd1 || cursor_style == 3'd3) {
// 2-pixel bottom line: scanlines 30-31
in_cursor_zone <= (p3_scanline >= 5'd30) ? 1'b1 : 1'b0;
} ELSE {
// 8-pixel bottom block: scanlines 24-31
in_cursor_zone <= (p3_scanline >= 5'd24) ? 1'b1 : 1'b0;
}
// Final cursor pixel: cell matches AND scanline in zone AND cursor visible
cursor_pixel <= (p3_is_cursor == 1'b1 && in_cursor_zone == 1'b1 && cursor_active == 1'b1)
? 1'b1 : 1'b0;
// --- Stage 3 combinational: Font pixel selection from latched font data ---
// font_data: MSB (bit 15) = leftmost pixel
// Use gslice for dynamic bit indexing (spec requires constant indices for [])
font_bit <= gslice(font_data, 4'd15 - p3_pixel_col, 1);
// RGB565 from latched attribute
fg_r5 <= p3_attr[15:11];
fg_g6 <= p3_attr[10:5];
fg_b5 <= p3_attr[4:0];
bg_r5 <= p3_attr[31:27];
bg_g6 <= p3_attr[26:21];
bg_b5 <= p3_attr[20:16];
// Pixel color: cursor inverts FG/BG; font_bit selects FG/BG; black outside text
IF (p3_in_text == 1'b1 && cursor_pixel == 1'b1) {
// Cursor: swap FG/BG (draw BG-colored text on FG-colored block)
IF (font_bit == 1'b1) {
pixel_r <= {bg_r5, bg_r5[4:2]};
pixel_g <= {bg_g6, bg_g6[5:4]};
pixel_b <= {bg_b5, bg_b5[4:2]};
} ELSE {
pixel_r <= {fg_r5, fg_r5[4:2]};
pixel_g <= {fg_g6, fg_g6[5:4]};
pixel_b <= {fg_b5, fg_b5[4:2]};
}
} ELIF (p3_in_text == 1'b1 && font_bit == 1'b1) {
pixel_r <= {fg_r5, fg_r5[4:2]};
pixel_g <= {fg_g6, fg_g6[5:4]};
pixel_b <= {fg_b5, fg_b5[4:2]};
} ELIF (p3_in_text == 1'b1) {
pixel_r <= {bg_r5, bg_r5[4:2]};
pixel_g <= {bg_g6, bg_g6[5:4]};
pixel_b <= {bg_b5, bg_b5[4:2]};
} ELSE {
pixel_r <= 8'h00;
pixel_g <= 8'h00;
pixel_b <= 8'h00;
}
}
SYNCHRONOUS(CLK = pixel_clk RESET = rst_n RESET_ACTIVE = Low) {
// CDC synchronizer: video_mode from sys_clk domain
mode_sync1 <= video_mode;
mode_sync2 <= mode_sync1;
// Blink timer: toggle blink_on every ~0.25 s (full on/off cycle ~0.5 s)
IF (blink_counter == 25'd0) {
blink_counter <= 25'd18562500; // 18,562,500 cycles = 0.25 s at 74.25 MHz
blink_on <= ~blink_on;
} ELSE {
blink_counter <= blink_counter - 25'd1;
}
// --- Edge 0 → 1: Terminal BSRAM addr registered (via comb vram_addr) ---
p1_pixel_col <= vt_x[3:0];
p1_hsync <= vt_hsync;
p1_vsync <= vt_vsync;
p1_de <= vt_de;
p1_in_text <= in_text_area;
p1_scanline <= scanline;
p1_is_cursor <= is_cursor_cell;
// --- Edge 1 → 2: char/attr arrive from terminal; send font ROM read ---
font_rom.font_rd.addr <= {vram_char, p1_scanline};
p2_pixel_col <= p1_pixel_col;
p2_hsync <= p1_hsync;
p2_vsync <= p1_vsync;
p2_de <= p1_de;
p2_in_text <= p1_in_text;
p2_attr <= vram_attr;
p2_is_cursor <= p1_is_cursor;
p2_scanline <= p1_scanline;
// --- Edge 2 → 3: Font data latched; p3 captures aligned pixel_col/attr ---
font_data <= font_rom.font_rd.data;
p3_pixel_col <= p2_pixel_col;
p3_hsync <= p2_hsync;
p3_vsync <= p2_vsync;
p3_de <= p2_de;
p3_in_text <= p2_in_text;
p3_attr <= p2_attr;
p3_is_cursor <= p2_is_cursor;
p3_scanline <= p2_scanline;
// --- Edge 3 → 4: Pixel color computed (comb from font_data + p3); latch for TMDS ---
p4_r <= pixel_r;
p4_g <= pixel_g;
p4_b <= pixel_b;
p4_hsync <= p3_hsync;
p4_vsync <= p3_vsync;
p4_de <= p3_de;
}
@endmod
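The shift-and-add cell address and the RGB565 bit replication in `video_out` can be cross-checked with a small software model. This is a sketch under the assumption that it mirrors the wire widths declared above; the function names are hypothetical and the model is not part of the design.

```python
V_OFFSET = 8  # same constant as the module's CONST block


def cell_addr(mode, x, y):
    """Model of the stage-0 address: row * cols + col without a multiplier."""
    y_adj = (y - V_OFFSET) & 0x7FF      # 11-bit wire
    row = (y_adj >> 5) & 0x3F           # y_adj[10:5]
    col = (x >> 4) & 0x7F               # vt_x[10:4]
    if mode == 1:
        # 1080p: row*120 = row*128 - row*8
        return ((row << 7) - (row << 3) + col) & 0xFFF
    # 720p: row*80 = row*64 + row*16
    return ((row << 6) + (row << 4) + col) & 0xFFF


def expand_rgb565(color):
    """Expand RGB565 to 8 bits per channel by replicating the top bits."""
    r5 = (color >> 11) & 0x1F
    g6 = (color >> 5) & 0x3F
    b5 = color & 0x1F
    return ((r5 << 3) | (r5 >> 2), (g6 << 2) | (g6 >> 4), (b5 << 3) | (b5 >> 2))
```

Replicating the top bits into the low bits maps full-scale 5-bit and 6-bit values to 255, which a plain left shift would not.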
// Dual-Mode Video Timing Generator
// Mode 0: 1280x720 @60Hz (CEA-861 VIC 4)
// Mode 1: 1920x1080 @30Hz (CEA-861 VIC 34)
// Both modes use 74.25 MHz pixel clock.
// Sync polarity: positive (sync HIGH during sync pulse)
@module video_timing
PORT {
IN [1] clk;
IN [1] rst_n;
IN [1] mode;
OUT [1] hsync;
OUT [1] vsync;
OUT [1] display_enable;
OUT [12] x_pos;
OUT [11] y_pos;
}
CONST {
// 720p timing
H_ACTIVE_720 = 1280;
H_FRONT_720 = 110;
H_SYNC_720 = 40;
H_BACK_720 = 220;
H_TOTAL_720 = 1650;
V_ACTIVE_720 = 720;
V_FRONT_720 = 5;
V_SYNC_720 = 5;
V_BACK_720 = 20;
V_TOTAL_720 = 750;
// 1080p@30 timing (CEA-861 VIC 34)
H_ACTIVE_1080 = 1920;
H_FRONT_1080 = 88;
H_SYNC_1080 = 44;
H_BACK_1080 = 148;
H_TOTAL_1080 = 2200;
V_ACTIVE_1080 = 1080;
V_FRONT_1080 = 4;
V_SYNC_1080 = 5;
V_BACK_1080 = 36;
V_TOTAL_1080 = 1125;
}
WIRE {
h_total_m1 [12];
v_total_m1 [11];
h_active [12];
v_active [11];
h_sync_start [12];
h_sync_end [12];
v_sync_start [11];
v_sync_end [11];
}
REGISTER {
h_cnt [12] = 12'b0;
v_cnt [11] = 11'b0;
}
ASYNCHRONOUS {
// Mux timing parameters based on mode
h_total_m1 <= (mode == 1'b1) ? lit(12, H_TOTAL_1080 - 1) : lit(12, H_TOTAL_720 - 1);
v_total_m1 <= (mode == 1'b1) ? lit(11, V_TOTAL_1080 - 1) : lit(11, V_TOTAL_720 - 1);
h_active <= (mode == 1'b1) ? lit(12, H_ACTIVE_1080) : lit(12, H_ACTIVE_720);
v_active <= (mode == 1'b1) ? lit(11, V_ACTIVE_1080) : lit(11, V_ACTIVE_720);
h_sync_start <= (mode == 1'b1) ? lit(12, H_ACTIVE_1080 + H_FRONT_1080) : lit(12, H_ACTIVE_720 + H_FRONT_720);
h_sync_end <= (mode == 1'b1) ? lit(12, H_ACTIVE_1080 + H_FRONT_1080 + H_SYNC_1080) : lit(12, H_ACTIVE_720 + H_FRONT_720 + H_SYNC_720);
v_sync_start <= (mode == 1'b1) ? lit(11, V_ACTIVE_1080 + V_FRONT_1080) : lit(11, V_ACTIVE_720 + V_FRONT_720);
v_sync_end <= (mode == 1'b1) ? lit(11, V_ACTIVE_1080 + V_FRONT_1080 + V_SYNC_1080) : lit(11, V_ACTIVE_720 + V_FRONT_720 + V_SYNC_720);
// Positive sync polarity: HIGH during sync pulse
hsync <= (h_cnt >= h_sync_start && h_cnt < h_sync_end) ? 1'b1 : 1'b0;
vsync <= (v_cnt >= v_sync_start && v_cnt < v_sync_end) ? 1'b1 : 1'b0;
// Display enable: active region
display_enable <= (h_cnt < h_active && v_cnt < v_active) ? 1'b1 : 1'b0;
x_pos <= h_cnt;
y_pos <= v_cnt;
}
SYNCHRONOUS(CLK=clk RESET=rst_n RESET_ACTIVE=Low) {
IF (h_cnt == h_total_m1) {
h_cnt <= 12'b0;
IF (v_cnt == v_total_m1) {
v_cnt <= 11'b0;
} ELSE {
v_cnt <= v_cnt + 11'b1;
}
} ELSE {
h_cnt <= h_cnt + 12'b1;
}
}
@endmod
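The timing constants in `video_timing` can be sanity-checked arithmetically: each total should equal active + front porch + sync + back porch, and the 74.25 MHz pixel clock divided by the frame size should give the stated refresh rate. The Python sketch below does that check; the table layout and helper names are hypothetical.

```python
PIXEL_CLK_HZ = 74_250_000

# (active, front porch, sync, back porch) copied from the CONST block
TIMINGS = {
    "720p": {"h": (1280, 110, 40, 220), "v": (720, 5, 5, 20)},
    "1080p": {"h": (1920, 88, 44, 148), "v": (1080, 4, 5, 36)},
}


def totals(name):
    """Return (h_total, v_total) as the sum of the four intervals."""
    t = TIMINGS[name]
    return sum(t["h"]), sum(t["v"])


def refresh_hz(name):
    """Frame rate implied by the pixel clock and the frame size."""
    h_total, v_total = totals(name)
    return PIXEL_CLK_HZ / (h_total * v_total)


def hsync_window(name):
    """Half-open [start, end) range of h_cnt where hsync is HIGH."""
    active, front, sync, _back = TIMINGS[name]["h"]
    return active + front, active + front + sync
```

Both modes share the same pixel clock because 1650 x 750 x 60 and 2200 x 1125 x 30 are both 74,250,000.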
// DVI TMDS 8b/10b Encoder
// Full DVI-compliant TMDS encoding with XOR/XNOR selection and
// running disparity tracking for DC balance on AC-coupled links.
@module tmds_encoder
PORT {
IN [1] clk;
IN [1] rst_n;
IN [8] data_in;
IN [1] c0;
IN [1] c1;
IN [1] display_enable;
OUT [10] tmds_out;
}
WIRE {
// Popcount of data_in via intrinsic
n1_d [4];
// XOR/XNOR mode selection
use_xnor [1];
// Transition-minimized intermediate word q_m[8:0]
qm0 [1]; qm1 [1]; qm2 [1]; qm3 [1];
qm4 [1]; qm5 [1]; qm6 [1]; qm7 [1];
qm8 [1];
// Popcount of q_m[7:0] via intrinsic
n1_q [4];
// Disparity conditions
cnt_is_zero [1];
qm_balanced [1];
cond1 [1];
cnt_sign [1];
cond_inv [1];
// Arithmetic for disparity update (5-bit two's complement)
diff_n1n0 [5];
diff_n0n1 [5];
qm8_x2 [5];
nqm8_x2 [5];
// Combinational outputs
tmds_data [10];
next_cnt [5];
}
REGISTER {
cnt [5] = 5'b00000;
tmds_reg [10] = 10'b0000000000;
}
ASYNCHRONOUS {
tmds_out <= tmds_reg;
// --- Popcount of data_in ---
n1_d <= popcount(data_in);
// --- XOR/XNOR selection (DVI spec section 3.3.1) ---
use_xnor <= (n1_d > 4'd4 || (n1_d == 4'd4 && data_in[0] == 1'b0))
? 1'b1 : 1'b0;
// --- Build transition-minimized word q_m ---
qm0 <= data_in[0];
qm1 <= (use_xnor == 1'b1) ? ~(data_in[1] ^ qm0) : (data_in[1] ^ qm0);
qm2 <= (use_xnor == 1'b1) ? ~(data_in[2] ^ qm1) : (data_in[2] ^ qm1);
qm3 <= (use_xnor == 1'b1) ? ~(data_in[3] ^ qm2) : (data_in[3] ^ qm2);
qm4 <= (use_xnor == 1'b1) ? ~(data_in[4] ^ qm3) : (data_in[4] ^ qm3);
qm5 <= (use_xnor == 1'b1) ? ~(data_in[5] ^ qm4) : (data_in[5] ^ qm4);
qm6 <= (use_xnor == 1'b1) ? ~(data_in[6] ^ qm5) : (data_in[6] ^ qm5);
qm7 <= (use_xnor == 1'b1) ? ~(data_in[7] ^ qm6) : (data_in[7] ^ qm6);
qm8 <= (use_xnor == 1'b1) ? 1'b0 : 1'b1;
// --- Popcount of q_m[7:0] ---
n1_q <= popcount({qm7, qm6, qm5, qm4, qm3, qm2, qm1, qm0});
// --- Disparity conditions ---
cnt_is_zero <= (cnt == 5'b00000) ? 1'b1 : 1'b0;
qm_balanced <= (n1_q == 4'd4) ? 1'b1 : 1'b0;
cond1 <= (cnt_is_zero == 1'b1 || qm_balanced == 1'b1)
? 1'b1 : 1'b0;
cnt_sign <= cnt[4];
cond_inv <= ((cnt_sign == 1'b0 && cnt_is_zero == 1'b0 && n1_q > 4'd4) ||
(cnt_sign == 1'b1 && n1_q < 4'd4))
? 1'b1 : 1'b0;
// --- Arithmetic helpers (5-bit two's complement) ---
diff_n1n0 <= {n1_q, 1'b0} - 5'd8;
diff_n0n1 <= 5'd8 - {n1_q, 1'b0};
qm8_x2 <= {3'b000, qm8, 1'b0};
nqm8_x2 <= {3'b000, ~qm8, 1'b0};
// --- Output word and next disparity (DVI spec section 3.3.2) ---
IF (cond1 == 1'b1) {
IF (qm8 == 1'b0) {
// XNOR mode, cnt==0 or balanced: invert data, bit[9]=1
tmds_data <= {1'b1, 1'b0, ~qm7, ~qm6, ~qm5, ~qm4,
~qm3, ~qm2, ~qm1, ~qm0};
next_cnt <= cnt + diff_n0n1;
} ELSE {
// XOR mode, cnt==0 or balanced: keep data, bit[9]=0
tmds_data <= {1'b0, 1'b1, qm7, qm6, qm5, qm4,
qm3, qm2, qm1, qm0};
next_cnt <= cnt + diff_n1n0;
}
} ELIF (cond_inv == 1'b1) {
// Invert to reduce disparity
tmds_data <= {1'b1, qm8, ~qm7, ~qm6, ~qm5, ~qm4,
~qm3, ~qm2, ~qm1, ~qm0};
next_cnt <= cnt + qm8_x2 + diff_n0n1;
} ELSE {
// Don't invert
tmds_data <= {1'b0, qm8, qm7, qm6, qm5, qm4,
qm3, qm2, qm1, qm0};
next_cnt <= cnt - nqm8_x2 + diff_n1n0;
}
}
SYNCHRONOUS(CLK=clk RESET=rst_n RESET_ACTIVE=Low) {
IF (display_enable == 1'b0) {
// Control period: reset disparity and emit control tokens
cnt <= 5'b00000;
IF (c0 == 1'b0 && c1 == 1'b0) {
tmds_reg <= 10'b1101010100;
} ELIF (c0 == 1'b1 && c1 == 1'b0) {
tmds_reg <= 10'b0010101011;
} ELIF (c0 == 1'b0 && c1 == 1'b1) {
tmds_reg <= 10'b0101010100;
} ELSE {
tmds_reg <= 10'b1010101011;
}
} ELSE {
// Data period: latch encoded word and update disparity
tmds_reg <= tmds_data;
cnt <= next_cnt;
}
}
@endmod
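A bit-level software model is a convenient way to check the data-period logic of `tmds_encoder`. The Python sketch below follows the same two steps (transition minimization, then disparity-controlled inversion) and adds a decoder for round-trip testing. It is an illustrative model with hypothetical function names, not the project's test bench, and it ignores the control-period tokens.

```python
def tmds_encode(data, cnt):
    """Encode one byte; return (10-bit word, updated running disparity)."""
    d = [(data >> i) & 1 for i in range(8)]
    n1_d = sum(d)
    use_xnor = n1_d > 4 or (n1_d == 4 and d[0] == 0)

    # Transition-minimized word q_m[7:0]; q_m[8] records XOR (1) or XNOR (0)
    qm = [d[0]]
    for i in range(1, 8):
        bit = d[i] ^ qm[i - 1]
        qm.append(bit ^ 1 if use_xnor else bit)
    qm8 = 0 if use_xnor else 1

    n1_q = sum(qm)
    n0_q = 8 - n1_q
    low = sum(b << i for i, b in enumerate(qm))
    inv = low ^ 0xFF

    if cnt == 0 or n1_q == 4:
        if qm8 == 0:
            return (1 << 9) | inv, cnt + (n0_q - n1_q)
        return (1 << 8) | low, cnt + (n1_q - n0_q)
    if (cnt > 0 and n1_q > 4) or (cnt < 0 and n1_q < 4):
        # Invert the data bits to pull the disparity back toward zero
        return (1 << 9) | (qm8 << 8) | inv, cnt + 2 * qm8 + (n0_q - n1_q)
    return (qm8 << 8) | low, cnt - 2 * (1 - qm8) + (n1_q - n0_q)


def tmds_decode(word):
    """Recover the original byte from a 10-bit data-period word."""
    low = word & 0xFF
    if word & 0x200:              # bit 9: data bits were inverted
        low ^= 0xFF
    q = [(low >> i) & 1 for i in range(8)]
    xor_mode = (word >> 8) & 1    # bit 8: 1 = XOR, 0 = XNOR
    d = [q[0]]
    for i in range(1, 8):
        bit = q[i] ^ q[i - 1]
        d.append(bit if xor_mode else bit ^ 1)
    return sum(b << i for i, b in enumerate(d))
```

Feeding every byte value through `tmds_encode` while carrying `cnt` forward, then through `tmds_decode`, should return the original bytes and keep `cnt` within the range of the 5-bit two's complement register.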
// Audio controller - Yamaha FM synth inspired
// 8 configurable channels with waveform, ADSR envelope, volume, pan
// Mixes to stereo, fills internal ring buffer (128 stereo samples)
// CPU reads samples via bus register interface
//
// Register map (ADDR[8:2] = register index):
// Global registers (ADDR[8:5] = 0000):
// 0x00 [RW] CTRL: [0]=enable, [1]=irq_en, [2]=irq_clear(W)
// 0x04 [R] STATUS: [7:0]=buf_level, [8]=half_full, [9]=underrun
// 0x08 [R] SAMPLE: {left[15:0], right[15:0]} - read pops from buffer
// 0x0C [RW] DIVIDER: [15:0]=sample rate divider (sclk/divider = sample rate)
// 0x10 [RW] MASTER_VOL: [7:0]=master volume 0-255
//
// Per-channel registers (ADDR[8:5] = 0001..1000 for ch0..ch7):
// +0x00 [RW] CH_CTRL: [0]=key_on, [3:1]=waveform, [4]=enabled
// +0x04 [RW] FREQ_LO: [15:0]=phase increment low
// +0x08 [RW] FREQ_HI: [7:0]=phase increment high (24-bit total)
// +0x0C [RW] VOLUME: [7:0]=volume, [15:8]=pan (0=L, 128=center, 255=R)
// +0x10 [RW] ENV_AD: [7:0]=attack rate, [15:8]=decay rate
// +0x14 [RW] ENV_SR: [7:0]=sustain level, [15:8]=release rate
// +0x18 [RW] DUTY: [7:0]=duty cycle (square wave)
@module audio
CONST {
NUM_CHANNELS = 8;
BUF_DEPTH = 128; // Stereo sample pairs in ring buffer
}
PORT {
IN [1] clk;
IN [1] rst_n;
BUS SIMPLE_BUS TARGET pbus;
OUT [1] irq;
}
WIRE {
// Bus decode
reg_group [4]; // ADDR[8:5] - 0=global, 1-8=channel
reg_sel [3]; // ADDR[4:2] - register within group
bus_read [1];
bus_write [1];
// Sample rate tick generation
tick [1];
// Channel outputs (16-bit signed each)
ch0_out [16]; ch1_out [16]; ch2_out [16]; ch3_out [16];
ch4_out [16]; ch5_out [16]; ch6_out [16]; ch7_out [16];
all_samples [128];
all_pans [64];
// Mixer outputs
mix_left [16];
mix_right [16];
mix_valid [1];
// Buffer level
buf_count [8];
buf_half [1];
}
REGISTER {
// Global control
enable [1] = 1'b0;
irq_en [1] = 1'b0;
master_vol [8] = 8'h80;
divider [16] = 16'd1125; // 54MHz/1125 ≈ 48kHz
div_count [16] = 16'd0;
sample_tick [1] = 1'b0;
underrun [1] = 1'b0;
// Ring buffer pointers (7-bit for 128-deep buffer)
wr_ptr [7] = 7'd0;
rd_ptr [7] = 7'd0;
// Per-channel registers: key_on, waveform, enabled
ch0_key [1] = 1'b0; ch1_key [1] = 1'b0; ch2_key [1] = 1'b0; ch3_key [1] = 1'b0;
ch4_key [1] = 1'b0; ch5_key [1] = 1'b0; ch6_key [1] = 1'b0; ch7_key [1] = 1'b0;
ch0_wf [3] = 3'd0; ch1_wf [3] = 3'd0; ch2_wf [3] = 3'd0; ch3_wf [3] = 3'd0;
ch4_wf [3] = 3'd0; ch5_wf [3] = 3'd0; ch6_wf [3] = 3'd0; ch7_wf [3] = 3'd0;
ch0_en [1] = 1'b0; ch1_en [1] = 1'b0; ch2_en [1] = 1'b0; ch3_en [1] = 1'b0;
ch4_en [1] = 1'b0; ch5_en [1] = 1'b0; ch6_en [1] = 1'b0; ch7_en [1] = 1'b0;
// Frequency (24-bit, split into hi/lo registers)
ch0_freq [24] = 24'd0; ch1_freq [24] = 24'd0; ch2_freq [24] = 24'd0; ch3_freq [24] = 24'd0;
ch4_freq [24] = 24'd0; ch5_freq [24] = 24'd0; ch6_freq [24] = 24'd0; ch7_freq [24] = 24'd0;
// Volume + pan
ch0_vol [8] = 8'd0; ch1_vol [8] = 8'd0; ch2_vol [8] = 8'd0; ch3_vol [8] = 8'd0;
ch4_vol [8] = 8'd0; ch5_vol [8] = 8'd0; ch6_vol [8] = 8'd0; ch7_vol [8] = 8'd0;
ch0_pan [8] = 8'h80; ch1_pan [8] = 8'h80; ch2_pan [8] = 8'h80; ch3_pan [8] = 8'h80;
ch4_pan [8] = 8'h80; ch5_pan [8] = 8'h80; ch6_pan [8] = 8'h80; ch7_pan [8] = 8'h80;
// Envelope params
ch0_atk [8] = 8'd0; ch1_atk [8] = 8'd0; ch2_atk [8] = 8'd0; ch3_atk [8] = 8'd0;
ch4_atk [8] = 8'd0; ch5_atk [8] = 8'd0; ch6_atk [8] = 8'd0; ch7_atk [8] = 8'd0;
ch0_dec [8] = 8'd0; ch1_dec [8] = 8'd0; ch2_dec [8] = 8'd0; ch3_dec [8] = 8'd0;
ch4_dec [8] = 8'd0; ch5_dec [8] = 8'd0; ch6_dec [8] = 8'd0; ch7_dec [8] = 8'd0;
ch0_sus [8] = 8'd0; ch1_sus [8] = 8'd0; ch2_sus [8] = 8'd0; ch3_sus [8] = 8'd0;
ch4_sus [8] = 8'd0; ch5_sus [8] = 8'd0; ch6_sus [8] = 8'd0; ch7_sus [8] = 8'd0;
ch0_rel [8] = 8'd0; ch1_rel [8] = 8'd0; ch2_rel [8] = 8'd0; ch3_rel [8] = 8'd0;
ch4_rel [8] = 8'd0; ch5_rel [8] = 8'd0; ch6_rel [8] = 8'd0; ch7_rel [8] = 8'd0;
// Duty cycle
ch0_duty [8] = 8'h80; ch1_duty [8] = 8'h80; ch2_duty [8] = 8'h80; ch3_duty [8] = 8'h80;
ch4_duty [8] = 8'h80; ch5_duty [8] = 8'h80; ch6_duty [8] = 8'h80; ch7_duty [8] = 8'h80;
// Bus read pipeline
read_reg [32] = 32'd0;
data_ready [1] = 1'b0;
// IRQ output
irq_reg [1] = 1'b0;
}
MEM(TYPE=DISTRIBUTED) {
// Ring buffer: 128 entries, split into left/right 16-bit memories
buf_left [16] [128] = 16'h0000 {
OUT rd ASYNC;
IN wr;
};
buf_right [16] [128] = 16'h0000 {
OUT rd ASYNC;
IN wr;
};
}
// ---- Channel Generators ----
@new gen0 aud_gen {
IN [1] clk = clk; IN [1] rst_n = rst_n;
IN [1] sample_tick = tick; IN [1] key_on = ch0_key;
IN [1] enabled = ch0_en; IN [3] waveform = ch0_wf;
IN [24] freq = ch0_freq; IN [8] volume = ch0_vol;
IN [8] duty = ch0_duty; IN [8] attack_rate = ch0_atk;
IN [8] decay_rate = ch0_dec; IN [8] sustain_level = ch0_sus;
IN [8] release_rate = ch0_rel;
OUT [16] sample_out = ch0_out;
}
@new gen1 aud_gen {
IN [1] clk = clk; IN [1] rst_n = rst_n;
IN [1] sample_tick = tick; IN [1] key_on = ch1_key;
IN [1] enabled = ch1_en; IN [3] waveform = ch1_wf;
IN [24] freq = ch1_freq; IN [8] volume = ch1_vol;
IN [8] duty = ch1_duty; IN [8] attack_rate = ch1_atk;
IN [8] decay_rate = ch1_dec; IN [8] sustain_level = ch1_sus;
IN [8] release_rate = ch1_rel;
OUT [16] sample_out = ch1_out;
}
@new gen2 aud_gen {
IN [1] clk = clk; IN [1] rst_n = rst_n;
IN [1] sample_tick = tick; IN [1] key_on = ch2_key;
IN [1] enabled = ch2_en; IN [3] waveform = ch2_wf;
IN [24] freq = ch2_freq; IN [8] volume = ch2_vol;
IN [8] duty = ch2_duty; IN [8] attack_rate = ch2_atk;
IN [8] decay_rate = ch2_dec; IN [8] sustain_level = ch2_sus;
IN [8] release_rate = ch2_rel;
OUT [16] sample_out = ch2_out;
}
@new gen3 aud_gen {
IN [1] clk = clk; IN [1] rst_n = rst_n;
IN [1] sample_tick = tick; IN [1] key_on = ch3_key;
IN [1] enabled = ch3_en; IN [3] waveform = ch3_wf;
IN [24] freq = ch3_freq; IN [8] volume = ch3_vol;
IN [8] duty = ch3_duty; IN [8] attack_rate = ch3_atk;
IN [8] decay_rate = ch3_dec; IN [8] sustain_level = ch3_sus;
IN [8] release_rate = ch3_rel;
OUT [16] sample_out = ch3_out;
}
@new gen4 aud_gen {
IN [1] clk = clk; IN [1] rst_n = rst_n;
IN [1] sample_tick = tick; IN [1] key_on = ch4_key;
IN [1] enabled = ch4_en; IN [3] waveform = ch4_wf;
IN [24] freq = ch4_freq; IN [8] volume = ch4_vol;
IN [8] duty = ch4_duty; IN [8] attack_rate = ch4_atk;
IN [8] decay_rate = ch4_dec; IN [8] sustain_level = ch4_sus;
IN [8] release_rate = ch4_rel;
OUT [16] sample_out = ch4_out;
}
@new gen5 aud_gen {
IN [1] clk = clk; IN [1] rst_n = rst_n;
IN [1] sample_tick = tick; IN [1] key_on = ch5_key;
IN [1] enabled = ch5_en; IN [3] waveform = ch5_wf;
IN [24] freq = ch5_freq; IN [8] volume = ch5_vol;
IN [8] duty = ch5_duty; IN [8] attack_rate = ch5_atk;
IN [8] decay_rate = ch5_dec; IN [8] sustain_level = ch5_sus;
IN [8] release_rate = ch5_rel;
OUT [16] sample_out = ch5_out;
}
@new gen6 aud_gen {
IN [1] clk = clk; IN [1] rst_n = rst_n;
IN [1] sample_tick = tick; IN [1] key_on = ch6_key;
IN [1] enabled = ch6_en; IN [3] waveform = ch6_wf;
IN [24] freq = ch6_freq; IN [8] volume = ch6_vol;
IN [8] duty = ch6_duty; IN [8] attack_rate = ch6_atk;
IN [8] decay_rate = ch6_dec; IN [8] sustain_level = ch6_sus;
IN [8] release_rate = ch6_rel;
OUT [16] sample_out = ch6_out;
}
@new gen7 aud_gen {
IN [1] clk = clk; IN [1] rst_n = rst_n;
IN [1] sample_tick = tick; IN [1] key_on = ch7_key;
IN [1] enabled = ch7_en; IN [3] waveform = ch7_wf;
IN [24] freq = ch7_freq; IN [8] volume = ch7_vol;
IN [8] duty = ch7_duty; IN [8] attack_rate = ch7_atk;
IN [8] decay_rate = ch7_dec; IN [8] sustain_level = ch7_sus;
IN [8] release_rate = ch7_rel;
OUT [16] sample_out = ch7_out;
}
// ---- Mixer ----
@new mix0 aud_mixer {
IN [1] clk = clk;
IN [1] rst_n = rst_n;
IN [1] sample_tick = tick;
IN [8] master_vol = master_vol;
IN [128] ch_samples = all_samples;
IN [64] ch_pans = all_pans;
OUT [16] out_left = mix_left;
OUT [16] out_right = mix_right;
OUT [1] out_valid = mix_valid;
}
ASYNCHRONOUS {
// Bus decode
reg_group <= pbus.ADDR[8:5];
reg_sel <= pbus.ADDR[4:2];
bus_read <= pbus.VALID & ~pbus.CMD;
bus_write <= pbus.VALID & pbus.CMD;
// Sample tick wire (directly from register, pulsed in SYNC block)
tick <= sample_tick;
// Concatenate channel outputs for mixer
all_samples <= {ch7_out, ch6_out, ch5_out, ch4_out,
ch3_out, ch2_out, ch1_out, ch0_out};
all_pans <= {ch7_pan, ch6_pan, ch5_pan, ch4_pan,
ch3_pan, ch2_pan, ch1_pan, ch0_pan};
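// Buffer level = wr_ptr - rd_ptr, computed as an 8-bit subtraction of the zero-extended 7-bit pointers
// NOTE: bit 7 is the borrow and is set whenever wr_ptr < rd_ptr, so buf_half can also assert after wr_ptr wraps
buf_count <= {1'b0, wr_ptr} - {1'b0, rd_ptr};
buf_half <= buf_count[7] | buf_count[6];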
// IRQ output
irq <= irq_reg;
// Bus data drive (reads)
IF (pbus.VALID && pbus.CMD == CMD.READ && data_ready == 1'b1) {
pbus.DATA <= read_reg;
} ELSE {
pbus.DATA <= 32'bz;
}
// Bus DONE
IF (pbus.VALID && pbus.CMD == CMD.WRITE) {
pbus.DONE <= 1'b1;
} ELIF (pbus.VALID && pbus.CMD == CMD.READ && data_ready == 1'b1) {
pbus.DONE <= 1'b1;
} ELSE {
pbus.DONE <= 1'bz;
}
}
SYNCHRONOUS(CLK = clk RESET = rst_n RESET_ACTIVE = Low) {
// =====================
// Chain 1: Sample rate divider (sample_tick, div_count)
// =====================
IF (enable == 1'b1 && div_count >= divider) {
div_count <= 16'd0;
sample_tick <= 1'b1;
} ELIF (enable == 1'b1) {
div_count <= div_count + 16'd1;
sample_tick <= 1'b0;
} ELSE {
div_count <= 16'd0;
sample_tick <= 1'b0;
}
// =====================
// Chain 2: Ring buffer write (wr_ptr)
// =====================
IF (mix_valid == 1'b1) {
buf_left.wr[wr_ptr] <= mix_left;
buf_right.wr[wr_ptr] <= mix_right;
IF (wr_ptr == 7'd127) {
wr_ptr <= 7'd0;
} ELSE {
wr_ptr <= wr_ptr + 7'd1;
}
}
// =====================
// Chain 3: IRQ register (irq_reg)
// Bus write CTRL with bit[2] clears; buffer half-full sets
// =====================
IF (bus_write == 1'b1 && reg_group == 4'b0000 && reg_sel == 3'b000 && pbus.DATA[2] == 1'b1) {
irq_reg <= 1'b0;
} ELIF (irq_en == 1'b1 && buf_half == 1'b1) {
irq_reg <= 1'b1;
}
// =====================
// Chain 4: Underrun flag (underrun)
// Bus write CTRL with bit[2] clears; empty buffer read sets
// =====================
IF (bus_write == 1'b1 && reg_group == 4'b0000 && reg_sel == 3'b000 && pbus.DATA[2] == 1'b1) {
underrun <= 1'b0;
} ELIF (bus_read == 1'b1 && reg_group == 4'b0000 && reg_sel == 3'b010 && rd_ptr == wr_ptr && data_ready == 1'b0) {
underrun <= 1'b1;
}
// =====================
// Chain 5: Bus read pipeline (data_ready, read_reg, rd_ptr)
// Distributed mem reads are combinational (ASYNC rd ports indexed directly by rd_ptr)
// =====================
IF (data_ready == 1'b1) {
data_ready <= 1'b0;
} ELIF (bus_read == 1'b1) {
// SAMPLE register: read from ring buffer (combinational)
IF (reg_group == 4'b0000 && reg_sel == 3'b010) {
IF (rd_ptr != wr_ptr) {
read_reg <= {buf_left.rd[rd_ptr], buf_right.rd[rd_ptr]};
data_ready <= 1'b1;
IF (rd_ptr == 7'd127) {
rd_ptr <= 7'd0;
} ELSE {
rd_ptr <= rd_ptr + 7'd1;
}
} ELSE {
// Buffer empty
read_reg <= 32'd0;
data_ready <= 1'b1;
}
// CTRL
} ELIF (reg_group == 4'b0000 && reg_sel == 3'b000) {
read_reg <= {30'd0, irq_en, enable};
data_ready <= 1'b1;
// STATUS
} ELIF (reg_group == 4'b0000 && reg_sel == 3'b001) {
read_reg <= {22'd0, underrun, buf_half, buf_count};
data_ready <= 1'b1;
// DIVIDER
} ELIF (reg_group == 4'b0000 && reg_sel == 3'b011) {
read_reg <= {16'd0, divider};
data_ready <= 1'b1;
// MASTER_VOL
} ELIF (reg_group == 4'b0000 && reg_sel == 3'b100) {
read_reg <= {24'd0, master_vol};
data_ready <= 1'b1;
// Channel 0 readback
} ELIF (reg_group == 4'b0001) {
data_ready <= 1'b1;
IF (reg_sel == 3'b000) {
read_reg <= {27'd0, ch0_en, ch0_wf, ch0_key};
} ELIF (reg_sel == 3'b001) {
read_reg <= {16'd0, ch0_freq[15:0]};
} ELIF (reg_sel == 3'b010) {
read_reg <= {24'd0, ch0_freq[23:16]};
} ELIF (reg_sel == 3'b011) {
read_reg <= {16'd0, ch0_pan, ch0_vol};
} ELIF (reg_sel == 3'b100) {
read_reg <= {16'd0, ch0_dec, ch0_atk};
} ELIF (reg_sel == 3'b101) {
read_reg <= {16'd0, ch0_rel, ch0_sus};
} ELIF (reg_sel == 3'b110) {
read_reg <= {24'd0, ch0_duty};
} ELSE {
read_reg <= 32'd0;
}
// Channel 1 readback
} ELIF (reg_group == 4'b0010) {
data_ready <= 1'b1;
IF (reg_sel == 3'b000) {
read_reg <= {27'd0, ch1_en, ch1_wf, ch1_key};
} ELIF (reg_sel == 3'b001) {
read_reg <= {16'd0, ch1_freq[15:0]};
} ELIF (reg_sel == 3'b010) {
read_reg <= {24'd0, ch1_freq[23:16]};
} ELIF (reg_sel == 3'b011) {
read_reg <= {16'd0, ch1_pan, ch1_vol};
} ELIF (reg_sel == 3'b100) {
read_reg <= {16'd0, ch1_dec, ch1_atk};
} ELIF (reg_sel == 3'b101) {
read_reg <= {16'd0, ch1_rel, ch1_sus};
} ELIF (reg_sel == 3'b110) {
read_reg <= {24'd0, ch1_duty};
} ELSE {
read_reg <= 32'd0;
}
} ELSE {
// Channels 2-7 and unmapped groups: readback not implemented, return 0
data_ready <= 1'b1;
read_reg <= 32'd0;
}
}
// =====================
// Chain 6: Bus register writes (enable, irq_en, master_vol, divider, ch regs)
// NOTE: irq_reg and underrun are in their own chains above
// =====================
IF (bus_write == 1'b1) {
// Global registers
IF (reg_group == 4'b0000) {
IF (reg_sel == 3'b000) {
enable <= pbus.DATA[0];
irq_en <= pbus.DATA[1];
} ELIF (reg_sel == 3'b011) {
divider <= pbus.DATA[15:0];
} ELIF (reg_sel == 3'b100) {
master_vol <= pbus.DATA[7:0];
}
// Channel 0
} ELIF (reg_group == 4'b0001) {
IF (reg_sel == 3'b000) {
ch0_key <= pbus.DATA[0]; ch0_wf <= pbus.DATA[3:1]; ch0_en <= pbus.DATA[4];
} ELIF (reg_sel == 3'b001) {
ch0_freq[15:0] <= pbus.DATA[15:0];
} ELIF (reg_sel == 3'b010) {
ch0_freq[23:16] <= pbus.DATA[7:0];
} ELIF (reg_sel == 3'b011) {
ch0_vol <= pbus.DATA[7:0]; ch0_pan <= pbus.DATA[15:8];
} ELIF (reg_sel == 3'b100) {
ch0_atk <= pbus.DATA[7:0]; ch0_dec <= pbus.DATA[15:8];
} ELIF (reg_sel == 3'b101) {
ch0_sus <= pbus.DATA[7:0]; ch0_rel <= pbus.DATA[15:8];
} ELIF (reg_sel == 3'b110) {
ch0_duty <= pbus.DATA[7:0];
}
// Channel 1
} ELIF (reg_group == 4'b0010) {
IF (reg_sel == 3'b000) {
ch1_key <= pbus.DATA[0]; ch1_wf <= pbus.DATA[3:1]; ch1_en <= pbus.DATA[4];
} ELIF (reg_sel == 3'b001) {
ch1_freq[15:0] <= pbus.DATA[15:0];
} ELIF (reg_sel == 3'b010) {
ch1_freq[23:16] <= pbus.DATA[7:0];
} ELIF (reg_sel == 3'b011) {
ch1_vol <= pbus.DATA[7:0]; ch1_pan <= pbus.DATA[15:8];
} ELIF (reg_sel == 3'b100) {
ch1_atk <= pbus.DATA[7:0]; ch1_dec <= pbus.DATA[15:8];
} ELIF (reg_sel == 3'b101) {
ch1_sus <= pbus.DATA[7:0]; ch1_rel <= pbus.DATA[15:8];
} ELIF (reg_sel == 3'b110) {
ch1_duty <= pbus.DATA[7:0];
}
// Channel 2
} ELIF (reg_group == 4'b0011) {
IF (reg_sel == 3'b000) {
ch2_key <= pbus.DATA[0]; ch2_wf <= pbus.DATA[3:1]; ch2_en <= pbus.DATA[4];
} ELIF (reg_sel == 3'b001) {
ch2_freq[15:0] <= pbus.DATA[15:0];
} ELIF (reg_sel == 3'b010) {
ch2_freq[23:16] <= pbus.DATA[7:0];
} ELIF (reg_sel == 3'b011) {
ch2_vol <= pbus.DATA[7:0]; ch2_pan <= pbus.DATA[15:8];
} ELIF (reg_sel == 3'b100) {
ch2_atk <= pbus.DATA[7:0]; ch2_dec <= pbus.DATA[15:8];
} ELIF (reg_sel == 3'b101) {
ch2_sus <= pbus.DATA[7:0]; ch2_rel <= pbus.DATA[15:8];
} ELIF (reg_sel == 3'b110) {
ch2_duty <= pbus.DATA[7:0];
}
// Channel 3
} ELIF (reg_group == 4'b0100) {
IF (reg_sel == 3'b000) {
ch3_key <= pbus.DATA[0]; ch3_wf <= pbus.DATA[3:1]; ch3_en <= pbus.DATA[4];
} ELIF (reg_sel == 3'b001) {
ch3_freq[15:0] <= pbus.DATA[15:0];
} ELIF (reg_sel == 3'b010) {
ch3_freq[23:16] <= pbus.DATA[7:0];
} ELIF (reg_sel == 3'b011) {
ch3_vol <= pbus.DATA[7:0]; ch3_pan <= pbus.DATA[15:8];
} ELIF (reg_sel == 3'b100) {
ch3_atk <= pbus.DATA[7:0]; ch3_dec <= pbus.DATA[15:8];
} ELIF (reg_sel == 3'b101) {
ch3_sus <= pbus.DATA[7:0]; ch3_rel <= pbus.DATA[15:8];
} ELIF (reg_sel == 3'b110) {
ch3_duty <= pbus.DATA[7:0];
}
// Channel 4
} ELIF (reg_group == 4'b0101) {
IF (reg_sel == 3'b000) {
ch4_key <= pbus.DATA[0]; ch4_wf <= pbus.DATA[3:1]; ch4_en <= pbus.DATA[4];
} ELIF (reg_sel == 3'b001) {
ch4_freq[15:0] <= pbus.DATA[15:0];
} ELIF (reg_sel == 3'b010) {
ch4_freq[23:16] <= pbus.DATA[7:0];
} ELIF (reg_sel == 3'b011) {
ch4_vol <= pbus.DATA[7:0]; ch4_pan <= pbus.DATA[15:8];
} ELIF (reg_sel == 3'b100) {
ch4_atk <= pbus.DATA[7:0]; ch4_dec <= pbus.DATA[15:8];
} ELIF (reg_sel == 3'b101) {
ch4_sus <= pbus.DATA[7:0]; ch4_rel <= pbus.DATA[15:8];
} ELIF (reg_sel == 3'b110) {
ch4_duty <= pbus.DATA[7:0];
}
// Channel 5
} ELIF (reg_group == 4'b0110) {
IF (reg_sel == 3'b000) {
ch5_key <= pbus.DATA[0]; ch5_wf <= pbus.DATA[3:1]; ch5_en <= pbus.DATA[4];
} ELIF (reg_sel == 3'b001) {
ch5_freq[15:0] <= pbus.DATA[15:0];
} ELIF (reg_sel == 3'b010) {
ch5_freq[23:16] <= pbus.DATA[7:0];
} ELIF (reg_sel == 3'b011) {
ch5_vol <= pbus.DATA[7:0]; ch5_pan <= pbus.DATA[15:8];
} ELIF (reg_sel == 3'b100) {
ch5_atk <= pbus.DATA[7:0]; ch5_dec <= pbus.DATA[15:8];
} ELIF (reg_sel == 3'b101) {
ch5_sus <= pbus.DATA[7:0]; ch5_rel <= pbus.DATA[15:8];
} ELIF (reg_sel == 3'b110) {
ch5_duty <= pbus.DATA[7:0];
}
// Channel 6
} ELIF (reg_group == 4'b0111) {
IF (reg_sel == 3'b000) {
ch6_key <= pbus.DATA[0]; ch6_wf <= pbus.DATA[3:1]; ch6_en <= pbus.DATA[4];
} ELIF (reg_sel == 3'b001) {
ch6_freq[15:0] <= pbus.DATA[15:0];
} ELIF (reg_sel == 3'b010) {
ch6_freq[23:16] <= pbus.DATA[7:0];
} ELIF (reg_sel == 3'b011) {
ch6_vol <= pbus.DATA[7:0]; ch6_pan <= pbus.DATA[15:8];
} ELIF (reg_sel == 3'b100) {
ch6_atk <= pbus.DATA[7:0]; ch6_dec <= pbus.DATA[15:8];
} ELIF (reg_sel == 3'b101) {
ch6_sus <= pbus.DATA[7:0]; ch6_rel <= pbus.DATA[15:8];
} ELIF (reg_sel == 3'b110) {
ch6_duty <= pbus.DATA[7:0];
}
// Channel 7
} ELIF (reg_group == 4'b1000) {
IF (reg_sel == 3'b000) {
ch7_key <= pbus.DATA[0]; ch7_wf <= pbus.DATA[3:1]; ch7_en <= pbus.DATA[4];
} ELIF (reg_sel == 3'b001) {
ch7_freq[15:0] <= pbus.DATA[15:0];
} ELIF (reg_sel == 3'b010) {
ch7_freq[23:16] <= pbus.DATA[7:0];
} ELIF (reg_sel == 3'b011) {
ch7_vol <= pbus.DATA[7:0]; ch7_pan <= pbus.DATA[15:8];
} ELIF (reg_sel == 3'b100) {
ch7_atk <= pbus.DATA[7:0]; ch7_dec <= pbus.DATA[15:8];
} ELIF (reg_sel == 3'b101) {
ch7_sus <= pbus.DATA[7:0]; ch7_rel <= pbus.DATA[15:8];
} ELIF (reg_sel == 3'b110) {
ch7_duty <= pbus.DATA[7:0];
}
}
}
}
@endmod
jz
// Audio channel generator - waveform synthesis with ADSR envelope
// Inspired by Yamaha FM synth voice channels
// Waveforms: square (with duty), triangle, sawtooth, noise
// ADSR envelope controls amplitude over time
// Uses @global WAVE and ENV from global.jz
@module aud_gen
PORT {
IN [1] clk;
IN [1] rst_n;
IN [1] sample_tick; // One-cycle pulse at sample rate
IN [1] key_on; // Gate signal (key pressed)
IN [1] enabled; // Channel enable
IN [3] waveform; // WAVE.SQUARE/TRIANGLE/SAWTOOTH/NOISE
IN [24] freq; // Phase increment per sample
IN [8] volume; // Channel volume 0-255
IN [8] duty; // Square wave duty cycle 0-255
IN [8] attack_rate; // Envelope attack speed
IN [8] decay_rate; // Envelope decay speed
IN [8] sustain_level; // Envelope sustain amplitude
IN [8] release_rate; // Envelope release speed
OUT [16] sample_out; // Signed 16-bit output
}
WIRE {
raw_wave [16]; // Raw waveform (signed)
tri_phase [15]; // Triangle intermediate
tri_wave [16]; // Triangle output
env_byte [8]; // Top 8 bits of envelope
vol_ext [16]; // Volume zero-extended
env_ext [16]; // Envelope zero-extended
wave_x_env [32]; // waveform * envelope
scaled_wave [16]; // After envelope scaling
scaled_ext [32]; // For volume multiply
}
REGISTER {
phase [24] = 24'd0;
env_state [3] = 3'd0;
env_value [16] = 16'd0;
key_prev [1] = 1'b0;
lfsr [16] = 16'hACE1;
}
ASYNCHRONOUS {
// Triangle: fold phase into up/down ramp
tri_phase <= (phase[23] == 1'b0) ? phase[22:8] : ~phase[22:8];
tri_wave <= {1'b0, tri_phase};
// Waveform mux
IF (waveform == WAVE.SQUARE) {
raw_wave <= (phase[23:16] < duty) ? 16'h7FFF : 16'h8001;
} ELIF (waveform == WAVE.TRIANGLE) {
// Shift up to full signed range: (tri * 2) - 0x7FFF
raw_wave <= {tri_wave[14:0], 1'b0} - 16'h7FFF;
} ELIF (waveform == WAVE.SAWTOOTH) {
raw_wave <= {phase[23], phase[22:8]};
} ELSE {
raw_wave <= lfsr;
}
// Envelope scaling: wave * env[15:8] -> take bits [23:8] (upper 16 of the 24 significant product bits)
env_byte <= env_value[15:8];
env_ext <= {8'd0, env_byte};
vol_ext <= {8'd0, volume};
wave_x_env <= smul(raw_wave, env_ext);
scaled_wave <= wave_x_env[23:8];
// Volume scaling: scaled_wave * volume -> take bits [23:8] (upper 16 of the 24 significant product bits)
scaled_ext <= smul(scaled_wave, vol_ext);
// Output
IF (enabled == 1'b1) {
sample_out <= scaled_ext[23:8];
} ELSE {
sample_out <= 16'd0;
}
}
SYNCHRONOUS(CLK = clk RESET = rst_n RESET_ACTIVE = Low) {
key_prev <= key_on;
IF (sample_tick == 1'b1) {
// Phase accumulator
IF (enabled == 1'b1) {
phase <= phase + freq;
}
// Noise LFSR (Galois, taps at 16,14,13,11 -> XOR feedback)
IF (waveform == WAVE.NOISE) {
IF (lfsr[0] == 1'b1) {
lfsr <= {1'b0, lfsr[15:1]} ^ 16'hB400;
} ELSE {
lfsr <= {1'b0, lfsr[15:1]};
}
}
// ---- ADSR Envelope State Machine ----
// Key-on rising edge -> start attack
IF (key_on == 1'b1 && key_prev == 1'b0) {
env_state <= ENV.ATTACK;
env_value <= 16'd0;
// Key-off -> start release
} ELIF (key_on == 1'b0 && key_prev == 1'b1) {
env_state <= ENV.RELEASE;
} ELSE {
// Envelope progression
IF (env_state == ENV.ATTACK) {
IF (env_value + {8'd0, attack_rate} >= 16'hFF00) {
env_value <= 16'hFF00;
env_state <= ENV.DECAY;
} ELSE {
env_value <= env_value + {8'd0, attack_rate};
}
} ELIF (env_state == ENV.DECAY) {
IF (env_value[15:8] <= sustain_level) {
env_value <= {sustain_level, 8'd0};
env_state <= ENV.SUSTAIN;
} ELIF (env_value < {8'd0, decay_rate}) {
env_value <= {sustain_level, 8'd0};
env_state <= ENV.SUSTAIN;
} ELSE {
env_value <= env_value - {8'd0, decay_rate};
}
} ELIF (env_state == ENV.SUSTAIN) {
env_value <= {sustain_level, 8'd0};
} ELIF (env_state == ENV.RELEASE) {
IF (env_value < {8'd0, release_rate}) {
env_value <= 16'd0;
env_state <= ENV.IDLE;
} ELSE {
env_value <= env_value - {8'd0, release_rate};
}
}
}
}
}
@endmod
jz
// Audio mixer - sums 8 channels into stereo output
// Accumulates signed 16-bit inputs with headroom, applies master volume,
// and clamps to signed 16-bit output range.
// Pan per channel: 0=full left, 128=center, 255=full right
@module aud_mixer
CONST {
NUM_CHANNELS = 8;
}
PORT {
IN [1] clk;
IN [1] rst_n;
IN [1] sample_tick;
IN [8] master_vol; // Master volume 0-255
// Channel inputs: concatenated [NUM_CHANNELS * 16] wide
IN [128] ch_samples; // 8 channels x 16-bit signed
IN [64] ch_pans; // 8 channels x 8-bit pan (0=L, 128=center, 255=R)
OUT [16] out_left; // Mixed left channel
OUT [16] out_right; // Mixed right channel
OUT [1] out_valid; // Pulse when outputs are ready
}
WIRE {
// Individual channel extraction
s0 [16]; s1 [16]; s2 [16]; s3 [16];
s4 [16]; s5 [16]; s6 [16]; s7 [16];
p0 [8]; p1 [8]; p2 [8]; p3 [8];
p4 [8]; p5 [8]; p6 [8]; p7 [8];
// Left gain = 255 - pan
l0 [8]; l1 [8]; l2 [8]; l3 [8];
l4 [8]; l5 [8]; l6 [8]; l7 [8];
// Scaled samples left (smul 16x16 -> 32, take [23:8] for 16-bit)
sl0 [32]; sl1 [32]; sl2 [32]; sl3 [32];
sl4 [32]; sl5 [32]; sl6 [32]; sl7 [32];
// Scaled samples right
sr0 [32]; sr1 [32]; sr2 [32]; sr3 [32];
sr4 [32]; sr5 [32]; sr6 [32]; sr7 [32];
// Pan-extended to 16 bits for smul
pe0 [16]; pe1 [16]; pe2 [16]; pe3 [16];
pe4 [16]; pe5 [16]; pe6 [16]; pe7 [16];
le0 [16]; le1 [16]; le2 [16]; le3 [16];
le4 [16]; le5 [16]; le6 [16]; le7 [16];
// Sign-extended to 20-bit for accumulation
xl0 [20]; xl1 [20]; xl2 [20]; xl3 [20];
xl4 [20]; xl5 [20]; xl6 [20]; xl7 [20];
xr0 [20]; xr1 [20]; xr2 [20]; xr3 [20];
xr4 [20]; xr5 [20]; xr6 [20]; xr7 [20];
// Partial sums (pairwise to manage width)
lsum01 [20]; lsum23 [20]; lsum45 [20]; lsum67 [20];
rsum01 [20]; rsum23 [20]; rsum45 [20]; rsum67 [20];
lsum03 [20]; lsum47 [20];
rsum03 [20]; rsum47 [20];
sum_left [20];
sum_right [20];
// Master volume scaling
mvol_ext [20];
left_scaled [40];
right_scaled [40];
left_clamped [16];
right_clamped [16];
}
REGISTER {
out_l_reg [16] = 16'd0;
out_r_reg [16] = 16'd0;
valid_reg [1] = 1'b0;
}
ASYNCHRONOUS {
// Extract individual 16-bit samples
s0 <= ch_samples[15:0];
s1 <= ch_samples[31:16];
s2 <= ch_samples[47:32];
s3 <= ch_samples[63:48];
s4 <= ch_samples[79:64];
s5 <= ch_samples[95:80];
s6 <= ch_samples[111:96];
s7 <= ch_samples[127:112];
// Extract pan values
p0 <= ch_pans[7:0]; p1 <= ch_pans[15:8];
p2 <= ch_pans[23:16]; p3 <= ch_pans[31:24];
p4 <= ch_pans[39:32]; p5 <= ch_pans[47:40];
p6 <= ch_pans[55:48]; p7 <= ch_pans[63:56];
// Left gain = 255 - pan
l0 <= 8'hFF - p0; l1 <= 8'hFF - p1;
l2 <= 8'hFF - p2; l3 <= 8'hFF - p3;
l4 <= 8'hFF - p4; l5 <= 8'hFF - p5;
l6 <= 8'hFF - p6; l7 <= 8'hFF - p7;
// Zero-extend pan/gain to 16-bit for smul
pe0 <= {8'd0, p0}; pe1 <= {8'd0, p1};
pe2 <= {8'd0, p2}; pe3 <= {8'd0, p3};
pe4 <= {8'd0, p4}; pe5 <= {8'd0, p5};
pe6 <= {8'd0, p6}; pe7 <= {8'd0, p7};
le0 <= {8'd0, l0}; le1 <= {8'd0, l1};
le2 <= {8'd0, l2}; le3 <= {8'd0, l3};
le4 <= {8'd0, l4}; le5 <= {8'd0, l5};
le6 <= {8'd0, l6}; le7 <= {8'd0, l7};
// Scale each channel by left/right gain
sl0 <= smul(s0, le0); sl1 <= smul(s1, le1);
sl2 <= smul(s2, le2); sl3 <= smul(s3, le3);
sl4 <= smul(s4, le4); sl5 <= smul(s5, le5);
sl6 <= smul(s6, le6); sl7 <= smul(s7, le7);
sr0 <= smul(s0, pe0); sr1 <= smul(s1, pe1);
sr2 <= smul(s2, pe2); sr3 <= smul(s3, pe3);
sr4 <= smul(s4, pe4); sr5 <= smul(s5, pe5);
sr6 <= smul(s6, pe6); sr7 <= smul(s7, pe7);
// Sign-extend 16-bit pan-scaled results to 20-bit
xl0 <= {sl0[23], sl0[23], sl0[23], sl0[23], sl0[23:8]};
xl1 <= {sl1[23], sl1[23], sl1[23], sl1[23], sl1[23:8]};
xl2 <= {sl2[23], sl2[23], sl2[23], sl2[23], sl2[23:8]};
xl3 <= {sl3[23], sl3[23], sl3[23], sl3[23], sl3[23:8]};
xl4 <= {sl4[23], sl4[23], sl4[23], sl4[23], sl4[23:8]};
xl5 <= {sl5[23], sl5[23], sl5[23], sl5[23], sl5[23:8]};
xl6 <= {sl6[23], sl6[23], sl6[23], sl6[23], sl6[23:8]};
xl7 <= {sl7[23], sl7[23], sl7[23], sl7[23], sl7[23:8]};
xr0 <= {sr0[23], sr0[23], sr0[23], sr0[23], sr0[23:8]};
xr1 <= {sr1[23], sr1[23], sr1[23], sr1[23], sr1[23:8]};
xr2 <= {sr2[23], sr2[23], sr2[23], sr2[23], sr2[23:8]};
xr3 <= {sr3[23], sr3[23], sr3[23], sr3[23], sr3[23:8]};
xr4 <= {sr4[23], sr4[23], sr4[23], sr4[23], sr4[23:8]};
xr5 <= {sr5[23], sr5[23], sr5[23], sr5[23], sr5[23:8]};
xr6 <= {sr6[23], sr6[23], sr6[23], sr6[23], sr6[23:8]};
xr7 <= {sr7[23], sr7[23], sr7[23], sr7[23], sr7[23:8]};
// Pairwise sum to accumulate
lsum01 <= xl0 + xl1; lsum23 <= xl2 + xl3;
lsum45 <= xl4 + xl5; lsum67 <= xl6 + xl7;
rsum01 <= xr0 + xr1; rsum23 <= xr2 + xr3;
rsum45 <= xr4 + xr5; rsum67 <= xr6 + xr7;
lsum03 <= lsum01 + lsum23;
lsum47 <= lsum45 + lsum67;
rsum03 <= rsum01 + rsum23;
rsum47 <= rsum45 + rsum67;
sum_left <= lsum03 + lsum47;
sum_right <= rsum03 + rsum47;
// Apply master volume (zero-extend to 20 for smul)
mvol_ext <= {12'd0, master_vol};
left_scaled <= smul(sum_left, mvol_ext);
right_scaled <= smul(sum_right, mvol_ext);
// Clamp to signed 16-bit range after >>8
IF (left_scaled[27] == 1'b0 && left_scaled[27:23] != 5'b00000) {
left_clamped <= 16'h7FFF;
} ELIF (left_scaled[27] == 1'b1 && left_scaled[27:23] != 5'b11111) {
left_clamped <= 16'h8001;
} ELSE {
left_clamped <= left_scaled[23:8];
}
IF (right_scaled[27] == 1'b0 && right_scaled[27:23] != 5'b00000) {
right_clamped <= 16'h7FFF;
} ELIF (right_scaled[27] == 1'b1 && right_scaled[27:23] != 5'b11111) {
right_clamped <= 16'h8001;
} ELSE {
right_clamped <= right_scaled[23:8];
}
out_left <= out_l_reg;
out_right <= out_r_reg;
out_valid <= valid_reg;
}
SYNCHRONOUS(CLK = clk RESET = rst_n RESET_ACTIVE = Low) {
IF (sample_tick == 1'b1) {
out_l_reg <= left_clamped;
out_r_reg <= right_clamped;
valid_reg <= 1'b1;
} ELSE {
valid_reg <= 1'b0;
}
}
@endmod
jz
// Terminal Framebuffer with Command Register Interface
// Dual BSRAM banks: cpu-side (sys_clk r/w) + video-side (pixel_clk read)
// CPU writes are mirrored to both banks. Hardware scroll/clear operate internally.
//
// Register map (base 0x5000_0000, ADDR[5:2] selects register):
// +0x00 CELL_ADDR [R/W] 12-bit cell index for read/write
// +0x04 CELL_CHAR [R/W] 8-bit character at CELL_ADDR
// +0x08 CELL_ATTR [R/W] 32-bit {BG[15:0], FG[15:0]} at CELL_ADDR
// +0x0C FILL_CHAR [R/W] 8-bit character for clear/scroll fill
// +0x10 FILL_ATTR [R/W] 32-bit attribute for clear/scroll fill
// +0x14 TERM_COLS [R/W] 8-bit terminal columns (80 or 120)
// +0x18 TERM_CELLS [R/W] 12-bit total cells (cols * rows)
// +0x1C COMMAND [W] 1=CLEAR, 2=SCROLL_UP
// +0x20 STATUS [R] bit 0 = BUSY
// +0x24 CURSOR [R/W] bits[14:3]=cell position, bits[2:0]=style
//
// Cursor styles: 0=none 1=line 2=block 3=line-blink 4=block-blink
//
// Write CELL_ADDR first, then read/write CELL_CHAR/CELL_ATTR.
// BSRAM read result is latched; reads return latched value immediately.
@module terminal_fb
PORT {
IN [1] clk;
IN [1] rst_n;
IN [1] pixel_clk;
BUS SIMPLE_BUS TARGET pbus;
// Video read interface (pixel_clk domain)
IN [12] vram_addr;
OUT [8] vram_char;
OUT [32] vram_attr;
// Cursor info for video pipeline
OUT [12] cursor_pos;
OUT [3] cursor_style;
}
// CPU-side memories (both ports in sys_clk domain)
MEM(TYPE=BLOCK) {
char_cpu [8] [4096] = 8'h00 {
OUT cpu_char_rd SYNC;
IN cpu_char_wr;
};
}
MEM(TYPE=BLOCK) {
attr_cpu [32] [4096] = 32'h00000000 {
OUT cpu_attr_rd SYNC;
IN cpu_attr_wr;
};
}
// Video-side memories (read on pixel_clk, write on sys_clk)
MEM(TYPE=BLOCK) {
char_vid [8] [4096] = 8'h00 {
OUT vid_char_rd SYNC;
IN vid_char_wr;
};
}
MEM(TYPE=BLOCK) {
attr_vid [32] [4096] = 32'h00000000 {
OUT vid_attr_rd SYNC;
IN vid_attr_wr;
};
}
WIRE {
// Combinational memory operation signals
char_wr_en [1];
attr_wr_en [1];
mem_wr_addr [12];
mem_wr_char [8];
mem_wr_attr [32];
mem_rd_addr [12];
// FSM next-state signals
next_state [3];
next_counter [12];
next_src [12];
// Bus decode helpers
bus_reg [4];
is_bus_wr [1];
is_bus_rd [1];
// FSM state decode
is_idle [1];
is_clear [1];
is_scr_rd [1];
is_scr_wr [1];
is_scr_fill [1];
at_last_cell [1];
at_last_src [1];
// Bus read data mux
read_data [32];
read_reg_sel [4];
}
REGISTER {
// Register file
cell_addr_reg [12] = 12'b0;
fill_char_reg [8] = 8'h20;
fill_attr_reg [32] = 32'h0000FFFF;
term_cols_reg [8] = 8'd80;
term_cells_reg [12] = 12'd1760;
cursor_reg [15] = 15'b0;
// Latched BSRAM read results
char_latch [8] = 8'h00;
attr_latch [32] = 32'h00000000;
// Latched video read results (pixel_clk domain)
vid_char_latch [8] = 8'h00;
vid_attr_latch [32] = 32'h00000000;
// FSM: 0=IDLE, 1=CLEAR, 2=SCROLL_READ, 3=SCROLL_WRITE, 4=SCROLL_FILL, 5=SCROLL_LATCH
fsm_state [3] = 3'b0;
fsm_counter [12] = 12'b0;
fsm_src [12] = 12'b0;
}
ASYNCHRONOUS {
// Video read outputs
vram_char <= vid_char_latch;
vram_attr <= vid_attr_latch;
// Cursor outputs
cursor_pos <= cursor_reg[14:3];
cursor_style <= cursor_reg[2:0];
// FSM state decode
is_idle <= (fsm_state == 3'd0) ? 1'b1 : 1'b0;
is_clear <= (fsm_state == 3'd1) ? 1'b1 : 1'b0;
is_scr_rd <= (fsm_state == 3'd2) ? 1'b1 : 1'b0;
is_scr_wr <= (fsm_state == 3'd3) ? 1'b1 : 1'b0;
is_scr_fill <= (fsm_state == 3'd4) ? 1'b1 : 1'b0;
// State 5: SCROLL_LATCH — wait cycle for BSRAM read data to arrive
at_last_cell <= (fsm_counter == term_cells_reg - 12'd1) ? 1'b1 : 1'b0;
at_last_src <= (fsm_src == term_cells_reg - 12'd1) ? 1'b1 : 1'b0;
// Bus decode
bus_reg <= pbus.ADDR[5:2];
is_bus_wr <= (fsm_state == 3'd0 && pbus.VALID == 1'b1 && pbus.CMD == CMD.WRITE) ? 1'b1 : 1'b0;
is_bus_rd <= (fsm_state == 3'd0 && pbus.VALID == 1'b1 && pbus.CMD == CMD.READ) ? 1'b1 : 1'b0;
// ---- Separate char/attr write enables ----
// FSM ops write both; bus CELL_CHAR writes char only; bus CELL_ATTR writes attr only
char_wr_en <= (is_clear == 1'b1 || is_scr_wr == 1'b1 || is_scr_fill == 1'b1 ||
(is_bus_wr == 1'b1 && bus_reg == 4'd1))
? 1'b1 : 1'b0;
attr_wr_en <= (is_clear == 1'b1 || is_scr_wr == 1'b1 || is_scr_fill == 1'b1 ||
(is_bus_wr == 1'b1 && bus_reg == 4'd2))
? 1'b1 : 1'b0;
// ---- Memory write address ----
mem_wr_addr <= (is_clear == 1'b1 || is_scr_wr == 1'b1 || is_scr_fill == 1'b1)
? fsm_counter : cell_addr_reg;
// ---- Memory write character ----
mem_wr_char <= (is_clear == 1'b1 || is_scr_fill == 1'b1) ? fill_char_reg :
(is_scr_wr == 1'b1) ? char_latch :
(is_bus_wr == 1'b1 && bus_reg == 4'd1) ? pbus.DATA[7:0] :
char_latch;
// ---- Memory write attribute ----
mem_wr_attr <= (is_clear == 1'b1 || is_scr_fill == 1'b1) ? fill_attr_reg :
(is_scr_wr == 1'b1) ? attr_latch :
(is_bus_wr == 1'b1 && bus_reg == 4'd2) ? pbus.DATA :
attr_latch;
// ---- Memory read address ----
mem_rd_addr <= (is_scr_rd == 1'b1) ? fsm_src :
(is_bus_wr == 1'b1 && bus_reg == 4'd0) ? pbus.DATA[11:0] :
cell_addr_reg;
// ---- FSM next state ----
// State flow: SCROLL_READ(2) → SCROLL_LATCH(5) → SCROLL_WRITE(3) → loop or SCROLL_FILL(4)
next_state <= (is_clear == 1'b1 && at_last_cell == 1'b1) ? 3'd0 :
(is_clear == 1'b1) ? 3'd1 :
(is_scr_rd == 1'b1) ? 3'd5 :
(fsm_state == 3'd5) ? 3'd3 :
(is_scr_wr == 1'b1 && at_last_src == 1'b1) ? 3'd4 :
(is_scr_wr == 1'b1) ? 3'd2 :
(is_scr_fill == 1'b1 && at_last_cell == 1'b1) ? 3'd0 :
(is_scr_fill == 1'b1) ? 3'd4 :
(is_bus_wr == 1'b1 && bus_reg == 4'd7 && pbus.DATA[1:0] == 2'd1) ? 3'd1 :
(is_bus_wr == 1'b1 && bus_reg == 4'd7 && pbus.DATA[1:0] == 2'd2) ? 3'd2 :
fsm_state;
// ---- FSM next counter ----
next_counter <= (is_clear == 1'b1 && at_last_cell == 1'b1) ? 12'd0 :
(is_clear == 1'b1) ? fsm_counter + 12'd1 :
(is_scr_wr == 1'b1) ? fsm_counter + 12'd1 :
(is_scr_fill == 1'b1 && at_last_cell == 1'b1) ? 12'd0 :
(is_scr_fill == 1'b1) ? fsm_counter + 12'd1 :
(is_bus_wr == 1'b1 && bus_reg == 4'd7) ? 12'd0 :
fsm_counter;
// ---- FSM next source ----
next_src <= (is_scr_wr == 1'b1) ? fsm_src + 12'd1 :
(is_bus_wr == 1'b1 && bus_reg == 4'd7 && pbus.DATA[1:0] == 2'd2)
? {4'b0, term_cols_reg} :
fsm_src;
// ---- Bus read data mux ----
read_reg_sel <= pbus.ADDR[5:2];
SELECT(read_reg_sel) {
CASE 4'd0 {
read_data <= {20'b0, cell_addr_reg};
}
CASE 4'd1 {
read_data <= {24'b0, char_latch};
}
CASE 4'd2 {
read_data <= attr_latch;
}
CASE 4'd3 {
read_data <= {24'b0, fill_char_reg};
}
CASE 4'd4 {
read_data <= fill_attr_reg;
}
CASE 4'd5 {
read_data <= {24'b0, term_cols_reg};
}
CASE 4'd6 {
read_data <= {20'b0, term_cells_reg};
}
CASE 4'd8 {
read_data <= {31'b0, is_idle == 1'b0};
}
CASE 4'd9 {
read_data <= {17'b0, cursor_reg};
}
DEFAULT {
read_data <= 32'b0;
}
}
// ---- Bus response ----
pbus.DATA <= (pbus.VALID && pbus.CMD == CMD.READ) ? read_data : 32'bz;
IF (pbus.VALID) {
pbus.DONE <= 1'b1;
} ELSE {
pbus.DONE <= 1'bz;
}
}
// CPU port (sys_clk domain)
SYNCHRONOUS(CLK = clk RESET = rst_n RESET_ACTIVE = Low) {
// FSM state update
fsm_state <= next_state;
fsm_counter <= next_counter;
fsm_src <= next_src;
// Always latch BSRAM read data
char_latch <= char_cpu.cpu_char_rd.data;
attr_latch <= attr_cpu.cpu_attr_rd.data;
// Single memory read address
char_cpu.cpu_char_rd.addr <= mem_rd_addr;
attr_cpu.cpu_attr_rd.addr <= mem_rd_addr;
// Separate char/attr writes to avoid stale latch cross-contamination
IF (char_wr_en == 1'b1) {
char_cpu.cpu_char_wr[mem_wr_addr] <= mem_wr_char;
char_vid.vid_char_wr[mem_wr_addr] <= mem_wr_char;
}
IF (attr_wr_en == 1'b1) {
attr_cpu.cpu_attr_wr[mem_wr_addr] <= mem_wr_attr;
attr_vid.vid_attr_wr[mem_wr_addr] <= mem_wr_attr;
}
// Register writes (from bus)
IF (is_bus_wr == 1'b1 && bus_reg == 4'd0) {
cell_addr_reg <= pbus.DATA[11:0];
}
IF (is_bus_wr == 1'b1 && bus_reg == 4'd3) {
fill_char_reg <= pbus.DATA[7:0];
}
IF (is_bus_wr == 1'b1 && bus_reg == 4'd4) {
fill_attr_reg <= pbus.DATA;
}
IF (is_bus_wr == 1'b1 && bus_reg == 4'd5) {
term_cols_reg <= pbus.DATA[7:0];
}
IF (is_bus_wr == 1'b1 && bus_reg == 4'd6) {
term_cells_reg <= pbus.DATA[11:0];
}
IF (is_bus_wr == 1'b1 && bus_reg == 4'd9) {
cursor_reg <= pbus.DATA[14:0];
}
}
// Video read port (pixel_clk domain)
SYNCHRONOUS(CLK = pixel_clk RESET = rst_n RESET_ACTIVE = Low) {
char_vid.vid_char_rd.addr <= vram_addr;
attr_vid.vid_attr_rd.addr <= vram_addr;
vid_char_latch <= char_vid.vid_char_rd.data;
vid_attr_latch <= attr_vid.vid_attr_rd.data;
}
@endmod
jz
// Simple 32-bit accumulator-based CPU
// Registers: A (accumulator), X (index) - 32-bit
// SP (stack pointer), PC (program counter) - 16-bit
// Flags: Z (zero), C (carry), N (negative)
//
// Instruction format (32-bit fixed):
// [31:24] opcode (8-bit)
// [23:16] operand byte
// [15:0] immediate/address (16-bit)
//
// Memory map:
// 0x0000-0x0FFF ROM (4096 words)
// 0x1000-0x1FFF RAM (4096 words, stack at top)
// 0x2000 LED output
@module cpu
CONST {
START_PC = 0;
}
PORT {
IN [1] clk;
IN [1] rst_n;
BUS SIMPLE_BUS SOURCE pbus;
}
REGISTER {
// Program counter
PC [16] = lit(16, START_PC);
// Registers
reg_a [32] = 32'h00000000;
reg_x [32] = 32'h00000000;
// Stack pointer
SP [16] = 16'h1FFF;
// Flags
flag_z [1] = 1'b0;
flag_c [1] = 1'b0;
flag_n [1] = 1'b0;
// State machine
state [4] = STATE.FETCH;
// Instruction register
instr [32] = 32'h00000000;
// Bus control registers
bus_addr [16] = 16'h0000;
bus_data [32] = 32'h00000000;
bus_cmd [1] = 1'b0;
bus_valid [1] = 1'b0;
// Load/store destination tracking
mem_dst [1] = 1'b0; // 0=A, 1=X
}
WIRE {
// Decoded instruction fields
opcode [8];
imm_addr [16];
}
ASYNCHRONOUS {
// Instruction decode
opcode = instr[31:24];
imm_addr = instr[15:0];
// Drive bus signals
pbus.ADDR <= bus_addr;
pbus.DATA <= (bus_valid == 1'b1 && bus_cmd == CMD.WRITE) ? bus_data : 32'bz;
pbus.CMD <= bus_cmd;
pbus.VALID <= bus_valid;
}
SYNCHRONOUS(CLK = clk RESET = rst_n RESET_ACTIVE = Low) {
IF (state == STATE.FETCH) {
// Start instruction fetch from memory at PC
bus_addr <= PC;
bus_cmd <= CMD.READ;
bus_valid <= 1'b1;
state <= STATE.WAIT_FETCH;
} ELIF (state == STATE.WAIT_FETCH) {
// Wait for memory to respond
IF (pbus.DONE == 1'b1) {
instr <= pbus.DATA;
bus_valid <= 1'b0;
state <= STATE.DECODE;
}
} ELIF (state == STATE.DECODE) {
// Decode and execute simple instructions, or set up memory access
IF (opcode == OP.NOP) {
PC <= PC + 16'h0001;
state <= STATE.FETCH;
} ELIF (opcode == OP.LDI_A) {
// Load 16-bit immediate into A (zero-extended)
reg_a <= {16'h0000, imm_addr};
flag_z <= (imm_addr == 16'h0000) ? 1'b1 : 1'b0;
flag_n <= 1'b0;
PC <= PC + 16'h0001;
state <= STATE.FETCH;
} ELIF (opcode == OP.LDI_X) {
// Load 16-bit immediate into X (zero-extended)
reg_x <= {16'h0000, imm_addr};
PC <= PC + 16'h0001;
state <= STATE.FETCH;
} ELIF (opcode == OP.LD_A) {
// Start memory read for A
bus_addr <= imm_addr;
bus_cmd <= CMD.READ;
bus_valid <= 1'b1;
mem_dst <= 1'b0;
PC <= PC + 16'h0001;
state <= STATE.MEM_WAIT;
} ELIF (opcode == OP.LD_X) {
// Start memory read for X
bus_addr <= imm_addr;
bus_cmd <= CMD.READ;
bus_valid <= 1'b1;
mem_dst <= 1'b1;
PC <= PC + 16'h0001;
state <= STATE.MEM_WAIT;
} ELIF (opcode == OP.ST_A) {
// Start memory write from A
bus_addr <= imm_addr;
bus_data <= reg_a;
bus_cmd <= CMD.WRITE;
bus_valid <= 1'b1;
PC <= PC + 16'h0001;
state <= STATE.MEM_WAIT;
} ELIF (opcode == OP.ST_X) {
// Start memory write from X
bus_addr <= imm_addr;
bus_data <= reg_x;
bus_cmd <= CMD.WRITE;
bus_valid <= 1'b1;
PC <= PC + 16'h0001;
state <= STATE.MEM_WAIT;
} ELIF (opcode == OP.ADD) {
// A = A + X
reg_a <= reg_a[31:0] + reg_x[31:0];
flag_z <= ((reg_a[31:0] + reg_x[31:0]) == 32'h00000000) ? 1'b1 : 1'b0;
flag_n <= reg_a[31] ^ reg_x[31];
PC <= PC + 16'h0001;
state <= STATE.FETCH;
} ELIF (opcode == OP.SUB) {
// A = A - X
reg_a <= reg_a[31:0] - reg_x[31:0];
flag_z <= ((reg_a[31:0] - reg_x[31:0]) == 32'h00000000) ? 1'b1 : 1'b0;
flag_n <= (reg_a < reg_x) ? 1'b1 : 1'b0;
PC <= PC + 16'h0001;
state <= STATE.FETCH;
} ELIF (opcode == OP.AND) {
reg_a <= reg_a & reg_x;
flag_z <= ((reg_a & reg_x) == 32'h00000000) ? 1'b1 : 1'b0;
flag_n <= 1'b0;
PC <= PC + 16'h0001;
state <= STATE.FETCH;
} ELIF (opcode == OP.OR) {
reg_a <= reg_a | reg_x;
flag_z <= ((reg_a | reg_x) == 32'h00000000) ? 1'b1 : 1'b0;
flag_n <= 1'b0;
PC <= PC + 16'h0001;
state <= STATE.FETCH;
} ELIF (opcode == OP.XOR) {
reg_a <= reg_a ^ reg_x;
flag_z <= ((reg_a ^ reg_x) == 32'h00000000) ? 1'b1 : 1'b0;
flag_n <= 1'b0;
PC <= PC + 16'h0001;
state <= STATE.FETCH;
} ELIF (opcode == OP.CMP) {
// Set flags from A - X (don't store result)
flag_z <= (reg_a == reg_x) ? 1'b1 : 1'b0;
flag_c <= (reg_a >= reg_x) ? 1'b1 : 1'b0;
flag_n <= (reg_a < reg_x) ? 1'b1 : 1'b0;
PC <= PC + 16'h0001;
state <= STATE.FETCH;
} ELIF (opcode == OP.JMP) {
PC <= imm_addr;
state <= STATE.FETCH;
} ELIF (opcode == OP.BEQ) {
IF (flag_z == 1'b1) {
PC <= imm_addr;
} ELSE {
PC <= PC + 16'h0001;
}
state <= STATE.FETCH;
} ELIF (opcode == OP.BNE) {
IF (flag_z == 1'b0) {
PC <= imm_addr;
} ELSE {
PC <= PC + 16'h0001;
}
state <= STATE.FETCH;
} ELIF (opcode == OP.INC) {
reg_a <= reg_a + 32'h00000001;
flag_z <= ((reg_a + 32'h00000001) == 32'h00000000) ? 1'b1 : 1'b0;
PC <= PC + 16'h0001;
state <= STATE.FETCH;
} ELIF (opcode == OP.DEC) {
reg_a <= reg_a - 32'h00000001;
flag_z <= ((reg_a - 32'h00000001) == 32'h00000000) ? 1'b1 : 1'b0;
PC <= PC + 16'h0001;
state <= STATE.FETCH;
} ELIF (opcode == OP.SHL) {
flag_c <= reg_a[31];
reg_a <= {reg_a[30:0], 1'b0};
flag_z <= ({reg_a[30:0], 1'b0} == 32'h00000000) ? 1'b1 : 1'b0;
PC <= PC + 16'h0001;
state <= STATE.FETCH;
} ELIF (opcode == OP.SHR) {
flag_c <= reg_a[0];
reg_a <= {1'b0, reg_a[31:1]};
flag_z <= ({1'b0, reg_a[31:1]} == 32'h00000000) ? 1'b1 : 1'b0;
PC <= PC + 16'h0001;
state <= STATE.FETCH;
} ELIF (opcode == OP.PUSH) {
// Push A: write A to mem[SP], then SP -= 1
bus_addr <= SP;
bus_data <= reg_a;
bus_cmd <= CMD.WRITE;
bus_valid <= 1'b1;
state <= STATE.PUSH_EXEC;
} ELIF (opcode == OP.POP) {
// Pop A: SP += 1, then read mem[SP]
SP <= SP + 16'h0001;
bus_addr <= SP + 16'h0001;
bus_cmd <= CMD.READ;
bus_valid <= 1'b1;
mem_dst <= 1'b0;
state <= STATE.POP_EXEC;
} ELIF (opcode == OP.CALL) {
// Push PC+1, then jump to addr
bus_addr <= SP;
bus_data <= {16'h0000, PC + 16'h0001};
bus_cmd <= CMD.WRITE;
bus_valid <= 1'b1;
state <= STATE.CALL_PUSH;
} ELIF (opcode == OP.RET) {
// Pop PC
SP <= SP + 16'h0001;
bus_addr <= SP + 16'h0001;
bus_cmd <= CMD.READ;
bus_valid <= 1'b1;
state <= STATE.RET_POP;
} ELIF (opcode == OP.HLT) {
bus_valid <= 1'b0;
state <= STATE.HALT;
} ELSE {
// Unknown opcode: treat as NOP
PC <= PC + 16'h0001;
state <= STATE.FETCH;
}
} ELIF (state == STATE.MEM_WAIT) {
// Wait for memory read/write to complete
IF (pbus.DONE == 1'b1) {
bus_valid <= 1'b0;
IF (bus_cmd == CMD.READ) {
// Load result into destination register
IF (mem_dst == 1'b0) {
reg_a <= pbus.DATA;
flag_z <= (pbus.DATA == 32'h00000000) ? 1'b1 : 1'b0;
flag_n <= pbus.DATA[31];
} ELSE {
reg_x <= pbus.DATA;
}
}
state <= STATE.FETCH;
}
} ELIF (state == STATE.PUSH_EXEC) {
// Wait for push write to complete
IF (pbus.DONE == 1'b1) {
bus_valid <= 1'b0;
SP <= SP - 16'h0001;
PC <= PC + 16'h0001;
state <= STATE.FETCH;
}
} ELIF (state == STATE.POP_EXEC) {
// Wait for pop read to complete
IF (pbus.DONE == 1'b1) {
bus_valid <= 1'b0;
reg_a <= pbus.DATA;
flag_z <= (pbus.DATA == 32'h00000000) ? 1'b1 : 1'b0;
flag_n <= pbus.DATA[31];
PC <= PC + 16'h0001;
state <= STATE.FETCH;
}
} ELIF (state == STATE.CALL_PUSH) {
// Wait for push of return address to complete
IF (pbus.DONE == 1'b1) {
bus_valid <= 1'b0;
SP <= SP - 16'h0001;
PC <= imm_addr;
state <= STATE.FETCH;
}
} ELIF (state == STATE.RET_POP) {
// Wait for pop of return address to complete
IF (pbus.DONE == 1'b1) {
bus_valid <= 1'b0;
PC <= pbus.DATA[15:0];
state <= STATE.FETCH;
}
} ELIF (state == STATE.HALT) {
// Stay halted
state <= STATE.HALT;
}
}
@endmod
jz
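// Power-on reset generator
// Holds por_n low while `done` is low, then releases it (high) once `done`
// has stayed high for POR_CYCLES consecutive clocks. Any drop of `done`
// restarts the count.
@module por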
PORT {
IN [1] clk;
IN [1] done;
OUT [1] por_n;
}
CONST {
POR_CYCLES = 16;
POR_CNT_BITS = clog2(POR_CYCLES);
POR_MAX = POR_CYCLES - 1;
}
REGISTER {
por_reg [1] = 1'b0;
cnt [POR_CNT_BITS] = POR_CNT_BITS'b0;
}
ASYNCHRONOUS {
por_n <= por_reg;
}
SYNCHRONOUS(CLK = clk) {
IF (done == 1'b0) {
por_reg <= 1'b0;
cnt <= POR_CNT_BITS'b0;
} ELIF (cnt == lit(POR_CNT_BITS, POR_MAX)) {
por_reg <= 1'b1;
cnt <= cnt;
} ELSE {
por_reg <= 1'b0;
cnt <= cnt + POR_CNT_BITS'b1;
}
}
@endmod
Clock Architecture
text
27 MHz crystal (SCLK)
 └─ PLL (IDIV=3, FBDIV=54, ODIV=2)
     └─ 371.25 MHz serial_clk
         ├─ CLKDIV (DIV_MODE=5)
         │   └─ 74.25 MHz sys_clk / pixel_clk
         └─ (sdram_clk = inverted sys_clk for DDR timing)
The sys_clk and pixel_clk are the same 74.25 MHz signal. The SDRAM clock is phase-inverted for proper setup/hold timing at the SDRAM chip.
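The frequencies in the clock tree can be checked with a few lines of arithmetic. The sketch below is in Python rather than JZ-HDL and assumes the PLL follows the convention f_out = f_in * (FBDIV + 1) / (IDIV + 1), with ODIV affecting only the internal VCO frequency and not the output. That convention is an assumption, not something stated in this document; it is used here because it reproduces the 371.25 MHz figure. The helper name `pll_out_mhz` is hypothetical.

```python
from fractions import Fraction

def pll_out_mhz(fin_mhz, idiv, fbdiv):
    """Output frequency in MHz.

    Assumed convention: f_out = f_in * (FBDIV + 1) / (IDIV + 1).
    ODIV is assumed not to change the output frequency.
    """
    return Fraction(fin_mhz) * (fbdiv + 1) / (idiv + 1)

# 27 MHz crystal through the PLL (IDIV=3, FBDIV=54)
serial_clk = pll_out_mhz(27, idiv=3, fbdiv=54)

# CLKDIV with DIV_MODE=5 divides serial_clk by 5
sys_clk = serial_clk / 5

print(float(serial_clk))  # 371.25
print(float(sys_clk))     # 74.25
```

Exact fractions are used so the comparison with the documented values does not depend on floating-point rounding.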
JZ-HDL Language Features
BUS abstraction. Adding or removing a signal from SIMPLE_BUS requires changing only the bus definition — all modules using BUS SIMPLE_BUS SOURCE or TARGET automatically get the updated port list. In Verilog, this change ripples through every module's port list, every instantiation, and every wire declaration.
Tristate ownership proof. The compiler verifies at compile time that exactly one driver is active on the shared DATA bus at any moment. Each peripheral drives DATA only when selected by the arbiter. The arbiter's template-based address decoding provides the proof structure. In Verilog, tristate conflicts are only found during simulation — or on hardware.
Global constants. @global shares opcodes, state encodings, and bus commands across all modules without parameter threading. Every module that imports the global file sees the same CMD.READ, CMD.WRITE, and STATE.FETCH constants.
Mandatory reset values. Every register in the design has a declared initial value. The SDRAM control signals (r_cs_n, r_ras_n, etc.) reset to inactive (high for active-low signals). Forgetting a register in the reset block — a common Verilog bug that sends garbage commands to SDRAM during power-on — is impossible.
Template-based code generation. The arbiter uses three @template blocks to generate address matching, DONE collection, and signal routing for all 8 targets. Adding a ninth peripheral means adding one more entry to the config constant and one more @new instance — no template changes needed.
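The single-driver property described above can be illustrated with a small software model. This is a Python sketch, not the arbiter's actual JZ-HDL templates. It assumes each target is selected by the top four address bits, following the 0x0_ to 0x7_ labels in the block diagram and the terminal base address 0x5000_0000. The names `TARGETS`, `select_lines`, and `driver` are hypothetical.

```python
# Hypothetical model: target index = ADDR[31:28], per the block diagram labels.
TARGETS = ["ROM", "RAM", "LED", "UART", "SDRAM", "Term", "SD", "Audio"]

def select_lines(addr):
    """Return one select bit per target for a 32-bit address."""
    top = (addr >> 28) & 0xF
    return [1 if top == i else 0 for i in range(len(TARGETS))]

def driver(addr):
    """Name of the target allowed to drive DATA, or None if unmapped."""
    sel = select_lines(addr)
    # At most one select line may be high, so at most one target drives DATA.
    assert sum(sel) <= 1
    return TARGETS[sel.index(1)] if 1 in sel else None

# Exhaustive check over every value of the decoded field.
for nibble in range(16):
    assert sum(select_lines(nibble << 28)) <= 1

print(driver(0x5000_001C))  # Term
print(driver(0x9000_0000))  # None
```

Because the select lines come from an equality comparison against distinct constants, they are mutually exclusive by construction. In this model an unmapped address selects no target at all, and adding a ninth peripheral only extends the `TARGETS` list.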