UART Audio

A 1280x720 @ 30Hz DVI display with UART-streamed audio playback. A host sends 16-bit mono PCM audio over UART; the FPGA buffers it in an 8192-sample BRAM ring buffer, plays back at 48 kHz via DVI/HDMI audio data islands, and renders an 80-bar Goertzel DFT spectrum analyzer with peak hold and VU meter on screen.

The design uses the same clock architecture as the DVI Color Bars example: 27 MHz crystal → 185.625 MHz serial clock (PLL) → 37.125 MHz pixel clock (CLKDIV).
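
The clock chain can be checked with exact arithmetic. The sketch below uses the IDIV, FBDIV, ODIV, and DIV_MODE values from the project's CLOCK_GEN block; treating IDIV and FBDIV as "value minus one" follows the comments in that block, and the function name is only illustrative.

```python
from fractions import Fraction

# Values copied from the CLOCK_GEN block of the project file.
CRYSTAL_HZ = 27_000_000
IDIV = 7        # input divider = IDIV + 1 = 8
FBDIV = 54      # feedback multiplier = FBDIV + 1 = 55
ODIV = 4        # VCO runs at ODIV times the PLL output
DIV_MODE = 5    # CLKDIV ratio

def derive_clocks():
    """Return (serial_clk, vco, pixel_clk) in Hz as exact fractions."""
    serial = Fraction(CRYSTAL_HZ) * (FBDIV + 1) / (IDIV + 1)
    vco = serial * ODIV
    pixel = serial / DIV_MODE
    return serial, vco, pixel

serial_hz, vco_hz, pixel_hz = derive_clocks()
print(float(serial_hz) / 1e6, float(vco_hz) / 1e6, float(pixel_hz) / 1e6)
```

The serial clock is five times the pixel clock because each 10-bit TMDS word is shifted out on both clock edges.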

Audio Pipeline

text
Host (PCM over UART @ 1.2 Mbaud, per CONFIG.BAUD_RATE)
  └─ uart_rx (8N1 deserializer)
       └─ uart_audio_rx (packet protocol: [LEN][DATA×LEN][PARITY])
            └─ audio_buffer (8192 × 16-bit BRAM ring buffer)
                 ├─ mixer (signed sum + DVI subpacket + BCH ECC)
                 │    └─ dvi_data_island (packet injection during hblank)
                 └─ spectrum_analyzer (Goertzel DFT, 80 bins)
                      └─ spectrum_display + level_display (visualization)

uart_audio_rx

UART audio packet receiver. Protocol: [LEN][DATA×LEN][PARITY]. Accepts packets of up to 64 samples, verifies XOR parity, and sends an ACK byte with the current buffer fill level (0–127) so the host can throttle transmission. A watchdog timer resets the receiver if a packet stalls mid-transfer.
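
A host-side packet builder might look like the sketch below. The text fixes only the overall shape [LEN][DATA×LEN][PARITY], the 64-sample limit, and the 0–127 fill level in the ACK. Everything else here is an assumption made for illustration: LEN as a single byte counting samples, little-endian signed samples, parity as the XOR of the data bytes only, and the `should_send` helper with its `high_water` threshold.

```python
import struct
from functools import reduce

MAX_SAMPLES = 64  # from the text: packets carry up to 64 samples

def build_packet(samples):
    """Frame 16-bit PCM samples as [LEN][DATA x LEN][PARITY].

    Assumed layout (not confirmed by the source): LEN is one byte holding
    the sample count, samples are little-endian signed 16-bit, and PARITY
    is the XOR of every data byte.
    """
    if not 1 <= len(samples) <= MAX_SAMPLES:
        raise ValueError("packet must hold 1..64 samples")
    data = struct.pack("<%dh" % len(samples), *samples)
    parity = reduce(lambda a, b: a ^ b, data, 0)
    return bytes([len(samples)]) + data + bytes([parity])

def should_send(ack_fill, high_water=96):
    """Hypothetical throttle: keep sending while the reported fill is low."""
    return ack_fill < high_water

packet = build_packet([0x0102, -1])
```

A host loop would send a packet, read the one-byte ACK, and pause while `should_send` returns False.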

audio_buffer

8192 × 16-bit BRAM ring buffer with Bresenham rate conversion for 48 kHz playback from the 37.125 MHz system clock (step=16, threshold=12375). Reports fill level for flow control back to the host via uart_audio_rx. When the buffer underruns, the output holds at zero.
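
The step and threshold follow from 37,125,000 / 48,000 = 12375 / 16 clocks per sample. A minimal software model of such an accumulator, assuming it adds the step every clock and emits a tick when it reaches the threshold (the exact register behavior in audio_buffer is not shown in this section):

```python
from fractions import Fraction

STEP = 16          # from the text
THRESHOLD = 12375  # from the text

def sample_ticks(n_clocks):
    """Return the clock indices at which a playback tick fires."""
    acc = 0
    ticks = []
    for cycle in range(n_clocks):
        acc += STEP
        if acc >= THRESHOLD:
            acc -= THRESHOLD   # keep the remainder so no error accumulates
            ticks.append(cycle)
    return ticks

ticks = sample_ticks(THRESHOLD)                 # one full pattern period
gaps = {b - a for a, b in zip(ticks, ticks[1:])}
ratio = Fraction(37_125_000, 48_000)
```

Over one 12375-clock period the model fires 16 ticks spaced 773 or 774 clocks apart, so the long-run rate is exact even though no single interval is.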

mixer

Purely combinational. Sums signed 16-bit PCM channels using sadd for correct sign-extended widening, then formats the result as a DVI audio subpacket:

  • Left-justifies the 16-bit sample to 24 bits.
  • Computes even parity via XOR folding.
  • Builds status byte 6 with parity for both L and R channels (mono duplication).
  • Generates BCH(64,56) ECC using polynomial 0x83 in a combinational XOR network.
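
The formatting steps above can be sketched in software. The left-justify and XOR-fold functions follow the description directly. The ECC routine is an assumption: it models a right-shifting 8-bit LFSR with feedback constant 0x83, fed LSB first over 56 bits, which is one common way to realize this code and may differ from the mixer's actual bit ordering.

```python
def left_justify_24(sample16):
    """Place a 16-bit sample in the top bits of a 24-bit audio word."""
    return (sample16 & 0xFFFF) << 8

def xor_fold_parity(word):
    """XOR of all bits of a word up to 32 bits wide, by repeated folding."""
    word ^= word >> 16
    word ^= word >> 8
    word ^= word >> 4
    word ^= word >> 2
    word ^= word >> 1
    return word & 1

def bch_ecc(data56, poly=0x83):
    """8-bit ECC over 56 data bits (assumed LSB-first, right-shifting LFSR)."""
    ecc = 0
    for i in range(56):
        bit = (data56 >> i) & 1
        feedback = (ecc ^ bit) & 1
        ecc >>= 1
        if feedback:
            ecc ^= poly
    return ecc

word = left_justify_24(-2)          # 0xFFFE left-justified to 24 bits
parity = xor_fold_parity(word)
```

Because the LFSR starts at zero and only XORs, the ECC is linear in the data, which is why the hardware can flatten it into a combinational XOR network.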

dvi_data_island

Injects audio sample packets, Audio InfoFrame (AIF), Audio Clock Regeneration (ACR), and AVI InfoFrame packets into the horizontal blanking interval. Shadow registers pre-load each packet's header and subpackets one cycle before transmission for clean timing.
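
The placement of each region within the blanking interval can be tabulated from the constants in dvi_data_island.jz (preamble starting at x=1449, four 32-clock packets, 2-clock guard bands). The sketch below rebuilds that schedule and models the module's `pkt_clock` expression; the Python names are illustrative only.

```python
PREAMBLE_START = 1449  # DI_PREAMBLE_START in the module

# (region name, length in pixel clocks), in transmission order
REGIONS = [
    ("preamble", 8),
    ("leading guard", 2),
    ("AVI InfoFrame", 32),
    ("ACR", 32),
    ("audio sample", 32),
    ("Audio InfoFrame", 32),
    ("trailing guard", 2),
]

def schedule(start=PREAMBLE_START):
    """Return (name, first_x, last_x) for each region."""
    out = []
    x = start
    for name, length in REGIONS:
        out.append((name, x, x + length - 1))
        x += length
    return out

def pkt_clock(x):
    """Model of pkt_clock <= x_pos[4:0] - 5'd19 (5-bit wraparound)."""
    return ((x & 0x1F) - 19) & 0x1F

for name, first, last in schedule():
    print(f"{name:16s} x={first}..{last}")
```

Every packet starts at an x whose low five bits equal 19, so one 5-bit subtraction yields the bit index for all four packets.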

The ACR packet carries N=6144 and CTS=37125, satisfying 128 × 48,000 = 37,125,000 × 6144/37125 (the sink regenerates a 128 × fs clock from the TMDS clock).
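
The HDMI clock regeneration relationship is 128 × fs = f_TMDS × N / CTS. A quick check of the values used here, with an illustrative helper name:

```python
from fractions import Fraction

def regenerated_sample_rate(tmds_hz, n, cts):
    """Audio sample rate a sink derives from the TMDS clock, N, and CTS."""
    return Fraction(tmds_hz) * n / (128 * cts)

fs = regenerated_sample_rate(37_125_000, 6144, 37125)
print(fs)
```

With CTS chosen as the pixel clock in kHz, the TMDS clock divides out to exactly 1000, leaving 6144 × 1000 / 128.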

spectrum_analyzer

Goertzel DFT spectrum analyzer. Computes 80 frequency bins (k=3 through k=102) from 2048-sample windows at 23.4 Hz resolution, covering approximately 55 Hz to 2400 Hz. Uses Q1.14 fixed-point coefficients. Includes auto-gain normalization and peak-hold display smoothing with exponential decay.
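
For reference, a floating-point Goertzel recurrence for a single bin is sketched below. It is not the fixed-point hardware: the `q14` helper only illustrates one plausible coefficient quantization, assuming the stored value is 2·cos(2πk/N) scaled by 2^14, which this section does not confirm.

```python
import math

FS = 48_000   # sample rate from the text
N = 2048      # window length from the text

def goertzel_power(samples, k, n=N):
    """Squared magnitude of DFT bin k over the first n samples."""
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s1 = s2 = 0.0
    for x in samples[:n]:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

def q14(coeff):
    """Hypothetical quantization of a coefficient to 14 fractional bits."""
    return round(coeff * (1 << 14))

# A unit-amplitude tone centered on bin 10 concentrates its energy there.
tone = [math.cos(2.0 * math.pi * 10 * i / N) for i in range(N)]
p_on = goertzel_power(tone, 10)
p_off = goertzel_power(tone, 20)
bin_hz = FS / N
```

Each bin needs only two state values and one multiply per sample, which is why Goertzel suits a small number of bins better than a full FFT.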

spectrum_display

Renders 80 color-gradient bars with reflections. Each bar occupies a 16-pixel pitch (10px body + 6px gap). Bars grow upward in segments of 16 pixels (12px body + 4px gap), up to 25 segments (400px maximum height). A reflection at 1/4 brightness extends below the baseline.

The color gradient runs from red (bar 0) through yellow (bar 39) to blue (bar 79).
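
The bar geometry reduces to two divisions by 16. The sketch below maps a screen column to a bar and a height above the baseline to a segment; it assumes the body occupies the first pixels of each pitch, which the text does not state, and the function names are illustrative.

```python
BAR_PITCH, BAR_BODY = 16, 10                 # from the text
SEG_PITCH, SEG_BODY, MAX_SEGS = 16, 12, 25   # from the text

def bar_at(x):
    """Bar index for screen column x, or None if x falls in a gap."""
    bar, offset = divmod(x, BAR_PITCH)
    return bar if offset < BAR_BODY else None

def segment_lit(height, level_segments):
    """True if a pixel `height` rows above the baseline is inside a lit segment."""
    seg, offset = divmod(height, SEG_PITCH)
    return seg < min(level_segments, MAX_SEGS) and offset < SEG_BODY

total_width = 80 * BAR_PITCH
max_height = MAX_SEGS * SEG_PITCH
```

Since both pitches are 16, the hardware can take the bar and segment indices from the upper bits of the coordinates without a divider.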

level_display

Audio VU meter and buffer fill indicator. Two horizontal bars: the VU meter (y=300–339) shows current audio amplitude, and the buffer bar (y=380–409) shows ring buffer fill level. Both use a green-to-yellow-to-red color gradient with peak-hold and decay.

terc4_encoder

Purely combinational 4-to-10-bit lookup table implementing the 16 TERC4 codewords from the HDMI specification (TERC4 is not part of DVI 1.0). Three instances encode the three data island channels.

tmds_encoder

DVI TMDS 8b/10b encoder with running disparity tracking.
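
A software reference model is handy for checking the encoder in simulation. The sketch below mirrors the data-period branches of tmds_encoder.jz (XOR/XNOR selection, then the four disparity cases) and adds a decoder for round-trip checks. It is a model written for this note, not generated from the hardware source.

```python
def popcount(v):
    return bin(v).count("1")

def tmds_encode(d, cnt):
    """Encode one data byte; return (10-bit word, updated running disparity)."""
    n1 = popcount(d)
    use_xnor = n1 > 4 or (n1 == 4 and (d & 1) == 0)
    qm = d & 1
    for i in range(1, 8):
        bit = ((d >> i) & 1) ^ ((qm >> (i - 1)) & 1)
        if use_xnor:
            bit ^= 1
        qm |= bit << i
    qm8 = 0 if use_xnor else 1
    n1q = popcount(qm)
    n0q = 8 - n1q
    if cnt == 0 or n1q == 4:
        if qm8:
            return (1 << 8) | qm, cnt + (n1q - n0q)
        return (1 << 9) | (qm ^ 0xFF), cnt + (n0q - n1q)
    if (cnt > 0 and n1q > 4) or (cnt < 0 and n1q < 4):
        return (1 << 9) | (qm8 << 8) | (qm ^ 0xFF), cnt + 2 * qm8 + (n0q - n1q)
    return (qm8 << 8) | qm, cnt - 2 * (1 - qm8) + (n1q - n0q)

def tmds_decode(word):
    """Recover the data byte from a 10-bit TMDS data word."""
    data = word & 0xFF
    if word & 0x200:          # bit 9 set: payload was inverted
        data ^= 0xFF
    out = data & 1
    for i in range(1, 8):
        bit = ((data >> i) & 1) ^ ((data >> (i - 1)) & 1)
        if not (word & 0x100):  # bit 8 clear: XNOR was used
            bit ^= 1
        out |= bit << i
    return out

def roundtrip_ok():
    cnt = 0
    for d in list(range(256)) * 4:
        word, cnt = tmds_encode(d, cnt)
        if tmds_decode(word) != d:
            return False
    return True
```

Calling `roundtrip_ok()` exercises every byte value under a changing disparity count; the same vectors can be replayed against the hardware encoder's output.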

uart_rx / uart_tx

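8N1 UART. The project file sets BAUD_RATE to 1,200,000 in its CONFIG block; 48 kHz 16-bit mono audio needs at least 960,000 baud on an 8N1 link, so 115200 baud could not sustain playback. The receiver includes a 2-stage metastability synchronizer. The transmitter sends ACK/status bytes back to the host.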
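
The link budget can be checked from the CONFIG values in the project file (CLK_HZ = 37125000, BAUD_RATE = 1200000). The sketch assumes the UART modules round the clocks-per-bit divisor to the nearest integer and that LEN and PARITY are one byte each; neither detail is shown in this section.

```python
CLK_HZ = 37_125_000     # CONFIG.CLK_HZ
BAUD_RATE = 1_200_000   # CONFIG.BAUD_RATE

def clocks_per_bit(clk_hz=CLK_HZ, baud=BAUD_RATE):
    """Integer bit period in clock cycles (rounding is an assumption)."""
    return round(clk_hz / baud)

def baud_error(clk_hz=CLK_HZ, baud=BAUD_RATE):
    """Relative error between the achieved and nominal baud rate."""
    actual = clk_hz / clocks_per_bit(clk_hz, baud)
    return (actual - baud) / baud

def required_baud(sample_rate=48_000, bytes_per_sample=2,
                  bits_per_byte=10, samples_per_packet=64):
    """Baud needed for continuous playback, including LEN and PARITY bytes."""
    payload = samples_per_packet * bytes_per_sample
    framed = payload + 2   # assumed: one LEN byte plus one PARITY byte
    return sample_rate * bytes_per_sample * bits_per_byte * framed / payload

print(clocks_per_bit(), baud_error(), required_baud())
```

The divisor error is about 0.2 percent, well inside typical UART tolerance, and the configured rate leaves headroom above the playback requirement for the host to refill the buffer after a pause.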

video_timing

CEA-861 compliant 1280×720@30Hz timing generator. 37.125 MHz pixel clock.
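
The timing constants in video_timing.jz tie the pixel clock to the frame rate and to the number of audio samples each line must carry. A small check, using the module's porch and sync values with illustrative Python names:

```python
from fractions import Fraction

# Constants from the CONST block of video_timing.jz
H = {"active": 1280, "front": 110, "sync": 40, "back": 220}
V = {"active": 720, "front": 5, "sync": 5, "back": 20}
PIXEL_HZ = 37_125_000
AUDIO_HZ = 48_000

h_total = sum(H.values())
v_total = sum(V.values())
frame_rate = Fraction(PIXEL_HZ, h_total * v_total)
line_rate = Fraction(PIXEL_HZ, h_total)
samples_per_line = Fraction(AUDIO_HZ) / line_rate

# hsync is high from the end of the front porch for H["sync"] clocks
hsync_first = H["active"] + H["front"]
hsync_last = hsync_first + H["sync"] - 1

print(h_total, v_total, frame_rate, line_rate, float(samples_per_line))
```

At just over two samples per line, the audio sample packet carries two samples on most lines and three on some, which matches the three-entry sample buffer in dvi_data_island.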

por

Power-on reset timer. Counts clock cycles after DONE before releasing active-low por_n.

jz
@project(CHIP="GW2AR-18-QN88-C8-I7") UART_AUDIO
    @import "por.jz"
    @import "video_timing.jz"
    @import "tmds_encoder.jz"
    @import "terc4_encoder.jz"
    @import "dvi_data_island.jz"
    @import "uart_rx.jz"
    @import "uart_tx.jz"
    @import "uart_audio_rx.jz"
    @import "audio_buffer.jz"
    @import "mixer.jz"
    @import "spectrum_analyzer.jz"
    @import "spectrum_display.jz"
    @import "dvi.jz"

    CONFIG {
        CLK_HZ = 37125000;
        BAUD_RATE = 1200000;
    }

    CLOCKS {
        SCLK       = { period=37.037 }; // 27MHz crystal
        serial_clk;                      // 185.625MHz (5x pixel, from PLL)
        pixel_clk;                       // 37.125MHz (from CLKDIV)
    }

    IN_PINS {
        SCLK    = { standard=LVCMOS33 };
        DONE    = { standard=LVCMOS33 };
        KEY[2]  = { standard=LVCMOS33 };
        UART_RX = { standard=LVCMOS33 };
    }

    OUT_PINS {
        LED[6]       = { standard=LVCMOS33, drive=8 };
        UART_TX      = { standard=LVCMOS33, drive=8 };
        TMDS_CLK     = { mode=DIFFERENTIAL, standard=LVDS25, drive=3.5, width=10, fclk=serial_clk, pclk=pixel_clk, reset=pll_lock };
        TMDS_DATA[3] = { mode=DIFFERENTIAL, standard=LVDS25, drive=3.5, width=10, fclk=serial_clk, pclk=pixel_clk, reset=pll_lock };
    }

    MAP {
        // System Clock 27MHz
        SCLK = 4;

        // Buttons (active high)
        KEY[0] = 87;
        KEY[1] = 88;

        // LEDs (active low)
        LED[0] = 15;
        LED[1] = 16;
        LED[2] = 17;
        LED[3] = 18;
        LED[4] = 19;
        LED[5] = 20;

        // UART (directly on Tang Nano 20K USB-C serial via BL616)
        UART_RX = 70;
        UART_TX = 69;

        // DVI TMDS differential pairs
        TMDS_CLK     = { P=33, N=34 };
        TMDS_DATA[0] = { P=35, N=36 };
        TMDS_DATA[1] = { P=37, N=38 };
        TMDS_DATA[2] = { P=39, N=40 };

        // DONE (POR)
        DONE = IOR32B;
    }

    CLOCK_GEN {
        PLL {
            IN REF_CLK SCLK;
            OUT BASE   serial_clk;  // 185.625 MHz (5x pixel clock)
            WIRE LOCK  pll_lock;

            CONFIG {
                IDIV = 7;           // divider = 8
                FBDIV = 54;         // multiplier = 55
                ODIV = 4;           // VCO = 185.625 * 4 = 742.5 MHz
            };
        };
        CLKDIV {
            IN REF_CLK serial_clk;
            OUT BASE  pixel_clk;   // 185.625 / 5 = 37.125 MHz

            CONFIG {
                DIV_MODE = 5;
            };
        };
    }

    @top dvi_top {
        IN   [1]  clk         = pixel_clk;
        IN   [1]  por         = DONE;
        IN   [1]  rst_n       = ~KEY[1];
        IN   [1]  uart_rx_pin = UART_RX;
        OUT  [1]  uart_tx_pin = UART_TX;
        OUT  [6]  leds        = ~LED;
        OUT  [10] tmds_clk    = TMDS_CLK;
        OUT  [10] tmds_d0     = TMDS_DATA[0];
        OUT  [10] tmds_d1     = TMDS_DATA[1];
        OUT  [10] tmds_d2     = TMDS_DATA[2];
    }
@endproj
jz
// DVI 1280x720 @ 30Hz with UART Audio Input
// Receives 16-bit mono PCM audio via UART, stores in BRAM ring buffer,
// plays back at 48kHz via DVI/HDMI audio data islands.
// Display shows 80-bar spectrum-style amplitude visualization.
//
// All TMDS outputs go through a registered mux in the SYNCHRONOUS block
// to give the OSER10 a clean register-to-primitive path.
@module dvi_top
    PORT {
        IN   [1]  clk;         // pixel_clk (37.125 MHz)
        IN   [1]  por;         // POR input from DONE
        IN   [1]  rst_n;       // Active-low reset from button
        IN   [1]  uart_rx_pin; // UART RX from host
        OUT  [1]  uart_tx_pin; // UART TX to host
        OUT  [6]  leds;        // Status LEDs
        OUT  [10] tmds_clk;    // TMDS clock channel (serialized)
        OUT  [10] tmds_d0;     // TMDS data channel 0 (blue)
        OUT  [10] tmds_d1;     // TMDS data channel 1 (green)
        OUT  [10] tmds_d2;     // TMDS data channel 2 (red)
    }

    WIRE {
        reset       [1];
        por_n       [1];
        hsync       [1];
        vsync       [1];
        de          [1];
        x_pos       [11];
        y_pos       [10];
        red         [8];
        green       [8];
        blue        [8];

        // DVI TMDS encoder outputs
        dvi_tmds_d0 [10];
        dvi_tmds_d1 [10];
        dvi_tmds_d2 [10];

        // TERC4 encoder outputs
        terc4_ch0_data [4];
        terc4_ch1_data [4];
        terc4_ch2_data [4];
        terc4_d0       [10];
        terc4_d1       [10];
        terc4_d2       [10];

        // Data island control
        di_active      [1];
        di_preamble    [1];
        di_guard       [1];

        // DVI encoder control signals
        enc1_c0        [1];
        enc1_c1        [1];
        enc2_c0        [1];
        enc2_c1        [1];

        // Video preamble and guard band (combinational)
        video_preamble_pre [1];
        video_guard_pre    [1];

        // UART wires
        rx_data        [8];
        rx_valid       [1];
        tx_data        [8];
        tx_valid       [1];
        tx_ready       [1];

        // Audio buffer wires
        wr_data        [16];
        wr_valid       [1];
        audio_sample   [16];
        audio_valid    [1];
        buf_fill       [7];

        // Frame pulse for analyzer
        frame_pulse    [1];

        // Spectrum analyzer <-> display interconnect
        sp_rd_bar      [7];
        sp_rd_amp      [16];

        // Mixer outputs
        mix_samp_lo   [32];
        mix_samp_hi   [32];
        mix_valid     [1];
    }

    REGISTER {
        heartbeat_cnt [25] = 25'b0;
        heartbeat_led [1]  = 1'b0;

        // RX activity LED stretch
        rx_led_cnt    [20] = 20'd0;
        rx_led        [1]  = 1'b0;

        // Data island pipeline (1 cycle to align TERC4 with encoder output)
        di_active_r  [1]  = 1'b0;
        di_guard_r   [1]  = 1'b0;
        terc4_d0_r   [10] = 10'd0;
        terc4_d1_r   [10] = 10'd0;
        terc4_d2_r   [10] = 10'd0;

        // Video preamble and guard band
        video_preamble_r [1] = 1'b0;
        video_guard_r    [1] = 1'b0;

        // Two-stage TMDS output pipeline
        tmds_d0_pre  [10] = 10'd0;
        tmds_d1_pre  [10] = 10'd0;
        tmds_d2_pre  [10] = 10'd0;
        tmds_d0_r    [10] = 10'd0;
        tmds_d1_r    [10] = 10'd0;
        tmds_d2_r    [10] = 10'd0;
    }

    // --- Power-on reset ---
    @new por0 por {
        IN  [1] clk   = clk;
        IN  [1] done  = por;
        OUT [1] por_n = por_n;
    }

    // --- Video timing generator ---
    @new vt0 video_timing {
        IN  [1]  clk            = clk;
        IN  [1]  rst_n          = reset;
        OUT [1]  hsync          = hsync;
        OUT [1]  vsync          = vsync;
        OUT [1]  display_enable = de;
        OUT [11] x_pos          = x_pos;
        OUT [10] y_pos          = y_pos;
    }

    // --- TMDS encoders ---
    @new enc0 tmds_encoder {
        IN  [1]  clk            = clk;
        IN  [1]  rst_n          = reset;
        IN  [8]  data_in        = blue;
        IN  [1]  c0             = hsync;
        IN  [1]  c1             = vsync;
        IN  [1]  display_enable = de;
        OUT [10] tmds_out       = dvi_tmds_d0;
    }

    @new enc1 tmds_encoder {
        IN  [1]  clk            = clk;
        IN  [1]  rst_n          = reset;
        IN  [8]  data_in        = green;
        IN  [1]  c0             = enc1_c0;
        IN  [1]  c1             = enc1_c1;
        IN  [1]  display_enable = de;
        OUT [10] tmds_out       = dvi_tmds_d1;
    }

    @new enc2 tmds_encoder {
        IN  [1]  clk            = clk;
        IN  [1]  rst_n          = reset;
        IN  [8]  data_in        = red;
        IN  [1]  c0             = enc2_c0;
        IN  [1]  c1             = enc2_c1;
        IN  [1]  display_enable = de;
        OUT [10] tmds_out       = dvi_tmds_d2;
    }

    // --- TERC4 encoders for data island period ---
    @new t4_0 terc4_encoder {
        IN  [4]  data_in   = terc4_ch0_data;
        OUT [10] terc4_out = terc4_d0;
    }

    @new t4_1 terc4_encoder {
        IN  [4]  data_in   = terc4_ch1_data;
        OUT [10] terc4_out = terc4_d1;
    }

    @new t4_2 terc4_encoder {
        IN  [4]  data_in   = terc4_ch2_data;
        OUT [10] terc4_out = terc4_d2;
    }

    // --- UART ---
    @new urx0 uart_rx {
        OVERRIDE {
            CLK_HZ = CONFIG.CLK_HZ;
            BAUD_RATE = CONFIG.BAUD_RATE;
        }
        IN  [1] clk   = clk;
        IN  [1] rst_n = reset;
        IN  [1] rx    = uart_rx_pin;
        OUT [8] data  = rx_data;
        OUT [1] valid = rx_valid;
    }

    @new utx0 uart_tx {
        OVERRIDE {
            CLK_HZ = CONFIG.CLK_HZ;
            BAUD_RATE = CONFIG.BAUD_RATE;
        }
        IN  [1] clk   = clk;
        IN  [1] rst_n = reset;
        IN  [8] data  = tx_data;
        IN  [1] valid = tx_valid;
        OUT [1] ready = tx_ready;
        OUT [1] tx    = uart_tx_pin;
    }

    // --- UART audio packet receiver ---
    @new uarx0 uart_audio_rx {
        IN  [1]  clk      = clk;
        IN  [1]  rst_n    = reset;
        IN  [8]  rx_data  = rx_data;
        IN  [1]  rx_valid = rx_valid;
        OUT [8]  tx_data  = tx_data;
        OUT [1]  tx_valid = tx_valid;
        IN  [1]  tx_ready = tx_ready;
        OUT [16] wr_data  = wr_data;
        OUT [1]  wr_valid = wr_valid;
        IN  [7]  buf_fill = buf_fill;
    }

    // --- Audio ring buffer + 48kHz playback ---
    @new abuf0 audio_buffer {
        IN  [1]  clk        = clk;
        IN  [1]  rst_n      = reset;
        IN  [16] wr_data    = wr_data;
        IN  [1]  wr_valid   = wr_valid;
        OUT [16] sample     = audio_sample;
        OUT [1]  samp_valid = audio_valid;
        OUT [7]  fill_level = buf_fill;
    }

    // --- Mixer: feed mono audio to ch0, silence on ch1-ch3 ---
    @new mx0 mixer {
        IN  [16] s0         = audio_sample;
        IN  [16] s1         = 16'h0000;
        IN  [16] s2         = 16'h0000;
        IN  [16] s3         = 16'h0000;
        IN  [1]  samp_valid = audio_valid;
        OUT [32] samp_lo    = mix_samp_lo;
        OUT [32] samp_hi    = mix_samp_hi;
        OUT [1]  out_valid  = mix_valid;
    }

    // --- DVI data island controller ---
    @new di0 dvi_data_island {
        IN  [1]  clk               = clk;
        IN  [1]  rst_n             = reset;
        IN  [1]  hsync             = hsync;
        IN  [1]  vsync             = vsync;
        IN  [1]  display_enable    = de;
        IN  [11] x_pos             = x_pos;
        IN  [32] samp_lo           = mix_samp_lo;
        IN  [32] samp_hi           = mix_samp_hi;
        IN  [1]  samp_valid        = mix_valid;
        OUT [4]  terc4_ch0         = terc4_ch0_data;
        OUT [4]  terc4_ch1         = terc4_ch1_data;
        OUT [4]  terc4_ch2         = terc4_ch2_data;
        OUT [1]  data_island_active = di_active;
        OUT [1]  preamble_active   = di_preamble;
        OUT [1]  guard_active      = di_guard;
    }

    // --- Goertzel DFT spectrum analyzer ---
    @new sa0 spectrum_analyzer {
        IN  [1]  clk          = clk;
        IN  [1]  rst_n        = reset;
        IN  [16] audio_sample = audio_sample;
        IN  [1]  samp_valid   = audio_valid;
        IN  [1]  frame_pulse  = frame_pulse;
        IN  [7]  rd_bar       = sp_rd_bar;
        OUT [16] rd_amp       = sp_rd_amp;
    }

    // --- Spectrum display (same as dvi_audio) ---
    @new sd0 spectrum_display {
        IN  [1]  clk     = clk;
        IN  [1]  rst_n   = reset;
        IN  [11] x_pos   = x_pos;
        IN  [10] y_pos   = y_pos;
        OUT [7]  rd_bar  = sp_rd_bar;
        IN  [16] rd_amp  = sp_rd_amp;
        OUT [8]  red     = red;
        OUT [8]  green   = green;
        OUT [8]  blue    = blue;
    }

    ASYNCHRONOUS {
        reset <= rst_n & por_n;

        // Frame pulse: high for 1 cycle at top-left pixel (start of each frame)
        frame_pulse <= (x_pos == 11'd0 && y_pos == 10'd0) ? 1'b1 : 1'b0;

        // TMDS clock channel: fixed 1111100000 pattern
        tmds_clk <= 10'b1111100000;

        // Registered TMDS output to port
        tmds_d0 <= tmds_d0_r;
        tmds_d1 <= tmds_d1_r;
        tmds_d2 <= tmds_d2_r;

        // Preamble CTL signals on ch1/ch2
        IF (di_preamble == 1'b1) {
            enc1_c0 <= 1'b1;
            enc1_c1 <= 1'b0;
            enc2_c0 <= 1'b1;
            enc2_c1 <= 1'b0;
        } ELIF (video_preamble_r == 1'b1) {
            enc1_c0 <= 1'b1;
            enc1_c1 <= 1'b0;
            enc2_c0 <= 1'b0;
            enc2_c1 <= 1'b0;
        } ELSE {
            enc1_c0 <= 1'b0;
            enc1_c1 <= 1'b0;
            enc2_c0 <= 1'b0;
            enc2_c1 <= 1'b0;
        }

        // Video preamble
        video_preamble_pre <= (
            x_pos >= 11'd1640 && x_pos < 11'd1648 &&
            (y_pos < 10'd719 || y_pos == 10'd749)
        ) ? 1'b1 : 1'b0;

        // Video guard band
        video_guard_pre <= (
            (x_pos == 11'd1648 || x_pos == 11'd1649) &&
            (y_pos < 10'd719 || y_pos == 10'd749)
        ) ? 1'b1 : 1'b0;

        // LED status: LED[0]=heartbeat, LED[1]=RX activity
        leds <= { heartbeat_led, rx_led, 4'b0000 };
    }

    SYNCHRONOUS(CLK=clk RESET=reset RESET_ACTIVE=Low) {
        // Registration pipeline
        di_active_r      <= di_active;
        di_guard_r       <= di_guard;
        terc4_d0_r       <= terc4_d0;
        terc4_d1_r       <= terc4_d1;
        terc4_d2_r       <= terc4_d2;
        video_preamble_r <= video_preamble_pre;
        video_guard_r    <= video_guard_pre;

        // Output mux
        IF (di_active_r == 1'b1) {
            IF (di_guard_r == 1'b1) {
                tmds_d0_pre <= terc4_d0_r;
                tmds_d1_pre <= 10'b0100110011;
                tmds_d2_pre <= 10'b0100110011;
            } ELSE {
                tmds_d0_pre <= terc4_d0_r;
                tmds_d1_pre <= terc4_d1_r;
                tmds_d2_pre <= terc4_d2_r;
            }
        } ELIF (video_guard_r == 1'b1) {
            tmds_d0_pre <= 10'b1011001100;
            tmds_d1_pre <= 10'b0100110011;
            tmds_d2_pre <= 10'b1011001100;
        } ELSE {
            tmds_d0_pre <= dvi_tmds_d0;
            tmds_d1_pre <= dvi_tmds_d1;
            tmds_d2_pre <= dvi_tmds_d2;
        }

        // Stage 2: clean FF-to-OSER10 path
        tmds_d0_r <= tmds_d0_pre;
        tmds_d1_r <= tmds_d1_pre;
        tmds_d2_r <= tmds_d2_pre;

        // Heartbeat blinker
        IF (heartbeat_cnt == 25'd33_554_431) {
            heartbeat_cnt <= 25'b0;
            heartbeat_led <= ~heartbeat_led;
        } ELSE {
            heartbeat_cnt <= heartbeat_cnt + 25'b1;
        }

        // RX activity LED (stretch pulse to ~28ms for visibility)
        IF (rx_valid == 1'b1) {
            rx_led     <= 1'b1;
            rx_led_cnt <= 20'd0;
        } ELIF (rx_led_cnt == 20'd1_048_575) {
            rx_led <= 1'b0;
        } ELSE {
            rx_led_cnt <= rx_led_cnt + 20'd1;
        }
    }
@endmod
jz
// 1280x720 @ 30Hz Video Timing Generator
// CEA-861 timings, pixel clock = 37.125 MHz (half of 74.25 MHz)
// H total: 1650 (1280 active + 110 front + 40 sync + 220 back)
// V total: 750  (720 active + 5 front + 5 sync + 20 back)
// Sync polarity: positive (sync HIGH during sync pulse)
@module video_timing
    PORT {
        IN  [1]  clk;
        IN  [1]  rst_n;
        OUT [1]  hsync;
        OUT [1]  vsync;
        OUT [1]  display_enable;
        OUT [11] x_pos;
        OUT [10] y_pos;
    }

    CONST {
        // Horizontal timing
        H_ACTIVE = 1280;
        H_FRONT  = 110;
        H_SYNC   = 40;
        H_BACK   = 220;
        H_TOTAL  = 1650;

        // Vertical timing
        V_ACTIVE = 720;
        V_FRONT  = 5;
        V_SYNC   = 5;
        V_BACK   = 20;
        V_TOTAL  = 750;
    }

    REGISTER {
        h_cnt [11] = 11'b0;
        v_cnt [10] = 10'b0;
    }

    ASYNCHRONOUS {
        // Positive sync polarity: HIGH during sync pulse, LOW otherwise
        hsync <= (h_cnt >= lit(11, H_ACTIVE + H_FRONT) &&
                  h_cnt <  lit(11, H_ACTIVE + H_FRONT + H_SYNC))
                 ? 1'b1 : 1'b0;

        vsync <= (v_cnt >= lit(10, V_ACTIVE + V_FRONT) &&
                  v_cnt <  lit(10, V_ACTIVE + V_FRONT + V_SYNC))
                 ? 1'b1 : 1'b0;

        // Display enable: active region
        display_enable <= (h_cnt < lit(11, H_ACTIVE) &&
                           v_cnt < lit(10, V_ACTIVE))
                          ? 1'b1 : 1'b0;

        x_pos <= h_cnt;
        y_pos <= v_cnt;
    }

    SYNCHRONOUS(CLK=clk RESET=rst_n RESET_ACTIVE=Low) {
        IF (h_cnt == lit(11, H_TOTAL - 1)) {
            h_cnt <= 11'b0;
            IF (v_cnt == lit(10, V_TOTAL - 1)) {
                v_cnt <= 10'b0;
            } ELSE {
                v_cnt <= v_cnt + 10'b1;
            }
        } ELSE {
            h_cnt <= h_cnt + 11'b1;
        }
    }
@endmod
jz
// DVI TMDS 8b/10b Encoder
// Full DVI-compliant TMDS encoding with XOR/XNOR selection and
// running disparity tracking for DC balance on AC-coupled links.
@module tmds_encoder
    PORT {
        IN  [1]  clk;
        IN  [1]  rst_n;
        IN  [8]  data_in;
        IN  [1]  c0;
        IN  [1]  c1;
        IN  [1]  display_enable;
        OUT [10] tmds_out;
    }

    WIRE {
        // Popcount of data_in (adder tree)
        d_p0 [2]; d_p1 [2]; d_p2 [2]; d_p3 [2];
        d_s0 [3]; d_s1 [3];
        n1_d [4];

        // XOR/XNOR mode selection
        use_xnor [1];

        // Transition-minimized intermediate word q_m[8:0]
        qm0 [1]; qm1 [1]; qm2 [1]; qm3 [1];
        qm4 [1]; qm5 [1]; qm6 [1]; qm7 [1];
        qm8 [1];

        // Popcount of q_m[7:0] (adder tree)
        q_p0 [2]; q_p1 [2]; q_p2 [2]; q_p3 [2];
        q_s0 [3]; q_s1 [3];
        n1_q [4];

        // Disparity conditions
        cnt_is_zero [1];
        qm_balanced [1];
        cond1       [1];
        cnt_sign    [1];
        cond_inv    [1];

        // Arithmetic for disparity update (5-bit two's complement)
        diff_n1n0 [5];
        diff_n0n1 [5];
        qm8_x2    [5];
        nqm8_x2   [5];

        // Combinational outputs
        tmds_data [10];
        next_cnt  [5];
    }

    REGISTER {
        cnt      [5]  = 5'b00000;
        tmds_reg [10] = 10'b0000000000;
    }

    ASYNCHRONOUS {
        tmds_out <= tmds_reg;

        // --- Popcount of data_in ---
        d_p0 <= {1'b0, data_in[0]} + {1'b0, data_in[1]};
        d_p1 <= {1'b0, data_in[2]} + {1'b0, data_in[3]};
        d_p2 <= {1'b0, data_in[4]} + {1'b0, data_in[5]};
        d_p3 <= {1'b0, data_in[6]} + {1'b0, data_in[7]};
        d_s0 <= {1'b0, d_p0} + {1'b0, d_p1};
        d_s1 <= {1'b0, d_p2} + {1'b0, d_p3};
        n1_d <= {1'b0, d_s0} + {1'b0, d_s1};

        // --- XOR/XNOR selection (DVI spec section 3.3.1) ---
        use_xnor <= (n1_d > 4'd4 || (n1_d == 4'd4 && data_in[0] == 1'b0))
                     ? 1'b1 : 1'b0;

        // --- Build transition-minimized word q_m ---
        qm0 <= data_in[0];
        qm1 <= (use_xnor == 1'b1) ? ~(data_in[1] ^ qm0) : (data_in[1] ^ qm0);
        qm2 <= (use_xnor == 1'b1) ? ~(data_in[2] ^ qm1) : (data_in[2] ^ qm1);
        qm3 <= (use_xnor == 1'b1) ? ~(data_in[3] ^ qm2) : (data_in[3] ^ qm2);
        qm4 <= (use_xnor == 1'b1) ? ~(data_in[4] ^ qm3) : (data_in[4] ^ qm3);
        qm5 <= (use_xnor == 1'b1) ? ~(data_in[5] ^ qm4) : (data_in[5] ^ qm4);
        qm6 <= (use_xnor == 1'b1) ? ~(data_in[6] ^ qm5) : (data_in[6] ^ qm5);
        qm7 <= (use_xnor == 1'b1) ? ~(data_in[7] ^ qm6) : (data_in[7] ^ qm6);
        qm8 <= (use_xnor == 1'b1) ? 1'b0 : 1'b1;

        // --- Popcount of q_m[7:0] ---
        q_p0 <= {1'b0, qm0} + {1'b0, qm1};
        q_p1 <= {1'b0, qm2} + {1'b0, qm3};
        q_p2 <= {1'b0, qm4} + {1'b0, qm5};
        q_p3 <= {1'b0, qm6} + {1'b0, qm7};
        q_s0 <= {1'b0, q_p0} + {1'b0, q_p1};
        q_s1 <= {1'b0, q_p2} + {1'b0, q_p3};
        n1_q <= {1'b0, q_s0} + {1'b0, q_s1};

        // --- Disparity conditions ---
        cnt_is_zero <= (cnt == 5'b00000) ? 1'b1 : 1'b0;
        qm_balanced <= (n1_q == 4'd4) ? 1'b1 : 1'b0;
        cond1       <= (cnt_is_zero == 1'b1 || qm_balanced == 1'b1)
                        ? 1'b1 : 1'b0;
        cnt_sign    <= cnt[4];
        cond_inv    <= ((cnt_sign == 1'b0 && cnt_is_zero == 1'b0 && n1_q > 4'd4) ||
                        (cnt_sign == 1'b1 && n1_q < 4'd4))
                       ? 1'b1 : 1'b0;

        // --- Arithmetic helpers (5-bit two's complement) ---
        diff_n1n0 <= {n1_q, 1'b0} - 5'd8;
        diff_n0n1 <= 5'd8 - {n1_q, 1'b0};
        qm8_x2   <= {3'b000, qm8, 1'b0};
        nqm8_x2  <= {3'b000, ~qm8, 1'b0};

        // --- Output word and next disparity (DVI spec section 3.3.2) ---
        IF (cond1 == 1'b1) {
            IF (qm8 == 1'b0) {
                // XNOR mode, cnt==0 or balanced: invert data, bit[9]=1
                tmds_data <= {1'b1, 1'b0, ~qm7, ~qm6, ~qm5, ~qm4,
                              ~qm3, ~qm2, ~qm1, ~qm0};
                next_cnt  <= cnt + diff_n0n1;
            } ELSE {
                // XOR mode, cnt==0 or balanced: keep data, bit[9]=0
                tmds_data <= {1'b0, 1'b1, qm7, qm6, qm5, qm4,
                              qm3, qm2, qm1, qm0};
                next_cnt  <= cnt + diff_n1n0;
            }
        } ELIF (cond_inv == 1'b1) {
            // Invert to reduce disparity
            tmds_data <= {1'b1, qm8, ~qm7, ~qm6, ~qm5, ~qm4,
                          ~qm3, ~qm2, ~qm1, ~qm0};
            next_cnt  <= cnt + qm8_x2 + diff_n0n1;
        } ELSE {
            // Don't invert
            tmds_data <= {1'b0, qm8, qm7, qm6, qm5, qm4,
                          qm3, qm2, qm1, qm0};
            next_cnt  <= cnt - nqm8_x2 + diff_n1n0;
        }
    }

    SYNCHRONOUS(CLK=clk RESET=rst_n RESET_ACTIVE=Low) {
        IF (display_enable == 1'b0) {
            // Control period: reset disparity and emit control tokens
            cnt <= 5'b00000;
            IF (c0 == 1'b0 && c1 == 1'b0) {
                tmds_reg <= 10'b1101010100;
            } ELIF (c0 == 1'b1 && c1 == 1'b0) {
                tmds_reg <= 10'b0010101011;
            } ELIF (c0 == 1'b0 && c1 == 1'b1) {
                tmds_reg <= 10'b0101010100;
            } ELSE {
                tmds_reg <= 10'b1010101011;
            }
        } ELSE {
            // Data period: latch encoded word and update disparity
            tmds_reg <= tmds_data;
            cnt <= next_cnt;
        }
    }
@endmod
jz
// DVI Data Island Controller with Audio
// Injects 4 DVI data island packets during horizontal blanking periods:
//   PKT0: AVI InfoFrame (video format descriptor)
//   PKT1: ACR (Audio Clock Regeneration, N=6144, CTS=37125 for 48kHz @ 37.125MHz)
//   PKT2: Audio Sample (L-PCM 2ch 16-bit, 2-3 samples per line)
//   PKT3: Audio InfoFrame (audio format descriptor)
//
// Uses shadow registers to reduce bit-extraction SELECT tables from 4 sets to 1.
// At each packet boundary, the next packet's data is copied into the shadow.
//
// Audio samples are provided externally via samp_lo/samp_hi/samp_valid ports.
//
// Data island timing within hblank (370 pixel clocks):
//   Preamble:        8 clocks  (x=1449..1456)
//   Leading guard:   2 clocks  (x=1457..1458)
//   Packet 0 (AVI):  32 clocks (x=1459..1490)
//   Packet 1 (ACR):  32 clocks (x=1491..1522)
//   Packet 2 (Audio):32 clocks (x=1523..1554)
//   Packet 3 (AIF):  32 clocks (x=1555..1586)
//   Trailing guard:  2 clocks  (x=1587..1588)
//   Control period:  51 clocks until video preamble at x=1640
@module dvi_data_island
    PORT {
        IN  [1]  clk;
        IN  [1]  rst_n;

        // Video timing inputs
        IN  [1]  hsync;
        IN  [1]  vsync;
        IN  [1]  display_enable;
        IN  [11] x_pos;

        // Audio sample inputs (from mixer)
        IN  [32] samp_lo;       // L+R subpacket low word
        IN  [32] samp_hi;       // L+R subpacket high word
        IN  [1]  samp_valid;    // pulses high for 1 cycle when a new sample is ready

        // TERC4 data outputs (active during data island)
        OUT [4]  terc4_ch0;      // {parity, hdr_bit, vsync, hsync}
        OUT [4]  terc4_ch1;      // subpacket even bits
        OUT [4]  terc4_ch2;      // subpacket odd bits

        // Control signals
        OUT [1]  data_island_active;  // HIGH during guard bands + packet data
        OUT [1]  preamble_active;     // HIGH during data island preamble
        OUT [1]  guard_active;        // HIGH during guard bands only
    }

    CONST {
        H_ACTIVE = 1280;
        H_TOTAL  = 1650;
        V_ACTIVE = 720;

        // Data island timing
        DI_PREAMBLE_START = 1449;
        DI_GUARD_START    = 1457;
        DI_PKT0_START     = 1459;
        DI_PKT1_START     = 1491;
        DI_PKT2_START     = 1523;
        DI_PKT3_START     = 1555;
        DI_TRAIL_START    = 1587;
        DI_TRAIL_END      = 1589;

        // Shadow swap points (1 cycle before each packet start)
        DI_SHD_SWAP1      = 1490;
        DI_SHD_SWAP2      = 1522;
        DI_SHD_SWAP3      = 1554;
    }

    WIRE {
        in_hblank       [1];
        in_preamble     [1];
        in_guard_lead   [1];
        in_packet       [1];
        in_guard_trail  [1];
        in_data_island  [1];
        pkt_clock       [5];

        // Shadow bit extraction outputs
        shd_hdr_bit     [1];
        shd_sub_even    [4];
        shd_sub_odd     [4];
    }

    REGISTER {
        // Shadow registers (loaded with current packet data before each packet)
        shd_header  [32] = 32'd0;
        shd_sp0_lo  [32] = 32'd0;
        shd_sp0_hi  [32] = 32'd0;
        shd_sp1_lo  [32] = 32'd0;
        shd_sp1_hi  [32] = 32'd0;
        shd_sp2_lo  [32] = 32'd0;
        shd_sp2_hi  [32] = 32'd0;
        shd_sp3_lo  [32] = 32'd0;
        shd_sp3_hi  [32] = 32'd0;

        // Audio sample packet (PKT2) - built at H_ACTIVE from sample buffer
        p2_header   [32] = 32'd0;
        p2_sp0_lo   [32] = 32'd0;
        p2_sp0_hi   [32] = 32'd0;
        p2_sp1_lo   [32] = 32'd0;
        p2_sp1_hi   [32] = 32'd0;
        p2_sp2_lo   [32] = 32'd0;
        p2_sp2_hi   [32] = 32'd0;

        // Sample buffer (filled by samp_valid between H_ACTIVE events)
        // Up to 3 samples per line (48000/22500 = 2.133 samples/line)
        samp_buf0_lo [32] = 32'd0;
        samp_buf0_hi [32] = 32'd0;
        samp_buf1_lo [32] = 32'd0;
        samp_buf1_hi [32] = 32'd0;
        samp_buf2_lo [32] = 32'd0;
        samp_buf2_hi [32] = 32'd0;
        samp_count   [2]  = 2'd0;
    }

    ASYNCHRONOUS {
        // Blanking region detection
        in_hblank <= (display_enable == 1'b0) ? 1'b1 : 1'b0;

        // Data island sub-regions
        in_preamble <= (in_hblank == 1'b1 &&
                        x_pos >= lit(11, DI_PREAMBLE_START) && x_pos < lit(11, DI_GUARD_START)) ? 1'b1 : 1'b0;

        in_guard_lead <= (in_hblank == 1'b1 &&
                          x_pos >= lit(11, DI_GUARD_START) && x_pos < lit(11, DI_PKT0_START)) ? 1'b1 : 1'b0;

        in_packet <= (in_hblank == 1'b1 &&
                      x_pos >= lit(11, DI_PKT0_START) && x_pos < lit(11, DI_TRAIL_START)) ? 1'b1 : 1'b0;

        in_guard_trail <= (in_hblank == 1'b1 &&
                           x_pos >= lit(11, DI_TRAIL_START) && x_pos < lit(11, DI_TRAIL_END)) ? 1'b1 : 1'b0;

        in_data_island <= (in_guard_lead == 1'b1 || in_packet == 1'b1 || in_guard_trail == 1'b1) ? 1'b1 : 1'b0;

        data_island_active <= in_data_island;
        preamble_active    <= in_preamble;
        guard_active       <= (in_guard_lead == 1'b1 || in_guard_trail == 1'b1) ? 1'b1 : 1'b0;

        // Packet clock (0-31 within each packet)
        // All 4 packets start at x[4:0]=19, so this formula works for all
        pkt_clock <= x_pos[4:0] - 5'd19;

        // ---------------------------------------------------------------
        // Shadow header bit extraction (1 bit per clock, 32 bits total)
        // ---------------------------------------------------------------
        SELECT (pkt_clock) {
            CASE (5'd0)  { shd_hdr_bit <= shd_header[0]; }
            CASE (5'd1)  { shd_hdr_bit <= shd_header[1]; }
            CASE (5'd2)  { shd_hdr_bit <= shd_header[2]; }
            CASE (5'd3)  { shd_hdr_bit <= shd_header[3]; }
            CASE (5'd4)  { shd_hdr_bit <= shd_header[4]; }
            CASE (5'd5)  { shd_hdr_bit <= shd_header[5]; }
            CASE (5'd6)  { shd_hdr_bit <= shd_header[6]; }
            CASE (5'd7)  { shd_hdr_bit <= shd_header[7]; }
            CASE (5'd8)  { shd_hdr_bit <= shd_header[8]; }
            CASE (5'd9)  { shd_hdr_bit <= shd_header[9]; }
            CASE (5'd10) { shd_hdr_bit <= shd_header[10]; }
            CASE (5'd11) { shd_hdr_bit <= shd_header[11]; }
            CASE (5'd12) { shd_hdr_bit <= shd_header[12]; }
            CASE (5'd13) { shd_hdr_bit <= shd_header[13]; }
            CASE (5'd14) { shd_hdr_bit <= shd_header[14]; }
            CASE (5'd15) { shd_hdr_bit <= shd_header[15]; }
            CASE (5'd16) { shd_hdr_bit <= shd_header[16]; }
            CASE (5'd17) { shd_hdr_bit <= shd_header[17]; }
            CASE (5'd18) { shd_hdr_bit <= shd_header[18]; }
            CASE (5'd19) { shd_hdr_bit <= shd_header[19]; }
            CASE (5'd20) { shd_hdr_bit <= shd_header[20]; }
            CASE (5'd21) { shd_hdr_bit <= shd_header[21]; }
            CASE (5'd22) { shd_hdr_bit <= shd_header[22]; }
            CASE (5'd23) { shd_hdr_bit <= shd_header[23]; }
            CASE (5'd24) { shd_hdr_bit <= shd_header[24]; }
            CASE (5'd25) { shd_hdr_bit <= shd_header[25]; }
            CASE (5'd26) { shd_hdr_bit <= shd_header[26]; }
            CASE (5'd27) { shd_hdr_bit <= shd_header[27]; }
            CASE (5'd28) { shd_hdr_bit <= shd_header[28]; }
            CASE (5'd29) { shd_hdr_bit <= shd_header[29]; }
            CASE (5'd30) { shd_hdr_bit <= shd_header[30]; }
            CASE (5'd31) { shd_hdr_bit <= shd_header[31]; }
            DEFAULT      { shd_hdr_bit <= 1'b0; }
        }

        // ---------------------------------------------------------------
        // Shadow subpacket bit extraction (interleaved across 4 subpackets)
        // At clock T: ch1 = {sp3[2T], sp2[2T], sp1[2T], sp0[2T]}
        //             ch2 = {sp3[2T+1], sp2[2T+1], sp1[2T+1], sp0[2T+1]}
        // T=0..15 uses sp_lo registers, T=16..31 uses sp_hi registers
        // ---------------------------------------------------------------
        SELECT (pkt_clock) {
            CASE (5'd0)  { shd_sub_even <= { shd_sp3_lo[0],  shd_sp2_lo[0],  shd_sp1_lo[0],  shd_sp0_lo[0] };
                           shd_sub_odd  <= { shd_sp3_lo[1],  shd_sp2_lo[1],  shd_sp1_lo[1],  shd_sp0_lo[1] }; }
            CASE (5'd1)  { shd_sub_even <= { shd_sp3_lo[2],  shd_sp2_lo[2],  shd_sp1_lo[2],  shd_sp0_lo[2] };
                           shd_sub_odd  <= { shd_sp3_lo[3],  shd_sp2_lo[3],  shd_sp1_lo[3],  shd_sp0_lo[3] }; }
            CASE (5'd2)  { shd_sub_even <= { shd_sp3_lo[4],  shd_sp2_lo[4],  shd_sp1_lo[4],  shd_sp0_lo[4] };
                           shd_sub_odd  <= { shd_sp3_lo[5],  shd_sp2_lo[5],  shd_sp1_lo[5],  shd_sp0_lo[5] }; }
            CASE (5'd3)  { shd_sub_even <= { shd_sp3_lo[6],  shd_sp2_lo[6],  shd_sp1_lo[6],  shd_sp0_lo[6] };
                           shd_sub_odd  <= { shd_sp3_lo[7],  shd_sp2_lo[7],  shd_sp1_lo[7],  shd_sp0_lo[7] }; }
            CASE (5'd4)  { shd_sub_even <= { shd_sp3_lo[8],  shd_sp2_lo[8],  shd_sp1_lo[8],  shd_sp0_lo[8] };
                           shd_sub_odd  <= { shd_sp3_lo[9],  shd_sp2_lo[9],  shd_sp1_lo[9],  shd_sp0_lo[9] }; }
            CASE (5'd5)  { shd_sub_even <= { shd_sp3_lo[10], shd_sp2_lo[10], shd_sp1_lo[10], shd_sp0_lo[10] };
                           shd_sub_odd  <= { shd_sp3_lo[11], shd_sp2_lo[11], shd_sp1_lo[11], shd_sp0_lo[11] }; }
            CASE (5'd6)  { shd_sub_even <= { shd_sp3_lo[12], shd_sp2_lo[12], shd_sp1_lo[12], shd_sp0_lo[12] };
                           shd_sub_odd  <= { shd_sp3_lo[13], shd_sp2_lo[13], shd_sp1_lo[13], shd_sp0_lo[13] }; }
            CASE (5'd7)  { shd_sub_even <= { shd_sp3_lo[14], shd_sp2_lo[14], shd_sp1_lo[14], shd_sp0_lo[14] };
                           shd_sub_odd  <= { shd_sp3_lo[15], shd_sp2_lo[15], shd_sp1_lo[15], shd_sp0_lo[15] }; }
            CASE (5'd8)  { shd_sub_even <= { shd_sp3_lo[16], shd_sp2_lo[16], shd_sp1_lo[16], shd_sp0_lo[16] };
                           shd_sub_odd  <= { shd_sp3_lo[17], shd_sp2_lo[17], shd_sp1_lo[17], shd_sp0_lo[17] }; }
            CASE (5'd9)  { shd_sub_even <= { shd_sp3_lo[18], shd_sp2_lo[18], shd_sp1_lo[18], shd_sp0_lo[18] };
                           shd_sub_odd  <= { shd_sp3_lo[19], shd_sp2_lo[19], shd_sp1_lo[19], shd_sp0_lo[19] }; }
            CASE (5'd10) { shd_sub_even <= { shd_sp3_lo[20], shd_sp2_lo[20], shd_sp1_lo[20], shd_sp0_lo[20] };
                           shd_sub_odd  <= { shd_sp3_lo[21], shd_sp2_lo[21], shd_sp1_lo[21], shd_sp0_lo[21] }; }
            CASE (5'd11) { shd_sub_even <= { shd_sp3_lo[22], shd_sp2_lo[22], shd_sp1_lo[22], shd_sp0_lo[22] };
                           shd_sub_odd  <= { shd_sp3_lo[23], shd_sp2_lo[23], shd_sp1_lo[23], shd_sp0_lo[23] }; }
            CASE (5'd12) { shd_sub_even <= { shd_sp3_lo[24], shd_sp2_lo[24], shd_sp1_lo[24], shd_sp0_lo[24] };
                           shd_sub_odd  <= { shd_sp3_lo[25], shd_sp2_lo[25], shd_sp1_lo[25], shd_sp0_lo[25] }; }
            CASE (5'd13) { shd_sub_even <= { shd_sp3_lo[26], shd_sp2_lo[26], shd_sp1_lo[26], shd_sp0_lo[26] };
                           shd_sub_odd  <= { shd_sp3_lo[27], shd_sp2_lo[27], shd_sp1_lo[27], shd_sp0_lo[27] }; }
            CASE (5'd14) { shd_sub_even <= { shd_sp3_lo[28], shd_sp2_lo[28], shd_sp1_lo[28], shd_sp0_lo[28] };
                           shd_sub_odd  <= { shd_sp3_lo[29], shd_sp2_lo[29], shd_sp1_lo[29], shd_sp0_lo[29] }; }
            CASE (5'd15) { shd_sub_even <= { shd_sp3_lo[30], shd_sp2_lo[30], shd_sp1_lo[30], shd_sp0_lo[30] };
                           shd_sub_odd  <= { shd_sp3_lo[31], shd_sp2_lo[31], shd_sp1_lo[31], shd_sp0_lo[31] }; }
            CASE (5'd16) { shd_sub_even <= { shd_sp3_hi[0],  shd_sp2_hi[0],  shd_sp1_hi[0],  shd_sp0_hi[0] };
                           shd_sub_odd  <= { shd_sp3_hi[1],  shd_sp2_hi[1],  shd_sp1_hi[1],  shd_sp0_hi[1] }; }
            CASE (5'd17) { shd_sub_even <= { shd_sp3_hi[2],  shd_sp2_hi[2],  shd_sp1_hi[2],  shd_sp0_hi[2] };
                           shd_sub_odd  <= { shd_sp3_hi[3],  shd_sp2_hi[3],  shd_sp1_hi[3],  shd_sp0_hi[3] }; }
            CASE (5'd18) { shd_sub_even <= { shd_sp3_hi[4],  shd_sp2_hi[4],  shd_sp1_hi[4],  shd_sp0_hi[4] };
                           shd_sub_odd  <= { shd_sp3_hi[5],  shd_sp2_hi[5],  shd_sp1_hi[5],  shd_sp0_hi[5] }; }
            CASE (5'd19) { shd_sub_even <= { shd_sp3_hi[6],  shd_sp2_hi[6],  shd_sp1_hi[6],  shd_sp0_hi[6] };
                           shd_sub_odd  <= { shd_sp3_hi[7],  shd_sp2_hi[7],  shd_sp1_hi[7],  shd_sp0_hi[7] }; }
            CASE (5'd20) { shd_sub_even <= { shd_sp3_hi[8],  shd_sp2_hi[8],  shd_sp1_hi[8],  shd_sp0_hi[8] };
                           shd_sub_odd  <= { shd_sp3_hi[9],  shd_sp2_hi[9],  shd_sp1_hi[9],  shd_sp0_hi[9] }; }
            CASE (5'd21) { shd_sub_even <= { shd_sp3_hi[10], shd_sp2_hi[10], shd_sp1_hi[10], shd_sp0_hi[10] };
                           shd_sub_odd  <= { shd_sp3_hi[11], shd_sp2_hi[11], shd_sp1_hi[11], shd_sp0_hi[11] }; }
            CASE (5'd22) { shd_sub_even <= { shd_sp3_hi[12], shd_sp2_hi[12], shd_sp1_hi[12], shd_sp0_hi[12] };
                           shd_sub_odd  <= { shd_sp3_hi[13], shd_sp2_hi[13], shd_sp1_hi[13], shd_sp0_hi[13] }; }
            CASE (5'd23) { shd_sub_even <= { shd_sp3_hi[14], shd_sp2_hi[14], shd_sp1_hi[14], shd_sp0_hi[14] };
                           shd_sub_odd  <= { shd_sp3_hi[15], shd_sp2_hi[15], shd_sp1_hi[15], shd_sp0_hi[15] }; }
            CASE (5'd24) { shd_sub_even <= { shd_sp3_hi[16], shd_sp2_hi[16], shd_sp1_hi[16], shd_sp0_hi[16] };
                           shd_sub_odd  <= { shd_sp3_hi[17], shd_sp2_hi[17], shd_sp1_hi[17], shd_sp0_hi[17] }; }
            CASE (5'd25) { shd_sub_even <= { shd_sp3_hi[18], shd_sp2_hi[18], shd_sp1_hi[18], shd_sp0_hi[18] };
                           shd_sub_odd  <= { shd_sp3_hi[19], shd_sp2_hi[19], shd_sp1_hi[19], shd_sp0_hi[19] }; }
            CASE (5'd26) { shd_sub_even <= { shd_sp3_hi[20], shd_sp2_hi[20], shd_sp1_hi[20], shd_sp0_hi[20] };
                           shd_sub_odd  <= { shd_sp3_hi[21], shd_sp2_hi[21], shd_sp1_hi[21], shd_sp0_hi[21] }; }
            CASE (5'd27) { shd_sub_even <= { shd_sp3_hi[22], shd_sp2_hi[22], shd_sp1_hi[22], shd_sp0_hi[22] };
                           shd_sub_odd  <= { shd_sp3_hi[23], shd_sp2_hi[23], shd_sp1_hi[23], shd_sp0_hi[23] }; }
            CASE (5'd28) { shd_sub_even <= { shd_sp3_hi[24], shd_sp2_hi[24], shd_sp1_hi[24], shd_sp0_hi[24] };
                           shd_sub_odd  <= { shd_sp3_hi[25], shd_sp2_hi[25], shd_sp1_hi[25], shd_sp0_hi[25] }; }
            CASE (5'd29) { shd_sub_even <= { shd_sp3_hi[26], shd_sp2_hi[26], shd_sp1_hi[26], shd_sp0_hi[26] };
                           shd_sub_odd  <= { shd_sp3_hi[27], shd_sp2_hi[27], shd_sp1_hi[27], shd_sp0_hi[27] }; }
            CASE (5'd30) { shd_sub_even <= { shd_sp3_hi[28], shd_sp2_hi[28], shd_sp1_hi[28], shd_sp0_hi[28] };
                           shd_sub_odd  <= { shd_sp3_hi[29], shd_sp2_hi[29], shd_sp1_hi[29], shd_sp0_hi[29] }; }
            CASE (5'd31) { shd_sub_even <= { shd_sp3_hi[30], shd_sp2_hi[30], shd_sp1_hi[30], shd_sp0_hi[30] };
                           shd_sub_odd  <= { shd_sp3_hi[31], shd_sp2_hi[31], shd_sp1_hi[31], shd_sp0_hi[31] }; }
            DEFAULT      { shd_sub_even <= 4'b0000; shd_sub_odd <= 4'b0000; }
        }

        // TERC4 channel outputs
        IF (in_guard_lead == 1'b1 || in_guard_trail == 1'b1) {
            terc4_ch0 <= { 2'b11, vsync, hsync };
            terc4_ch1 <= 4'b0000;
            terc4_ch2 <= 4'b0000;
        } ELIF (in_packet == 1'b1) {
            terc4_ch0 <= { 1'b1, shd_hdr_bit, vsync, hsync };
            terc4_ch1 <= shd_sub_even;
            terc4_ch2 <= shd_sub_odd;
        } ELSE {
            terc4_ch0 <= { 2'b11, vsync, hsync };
            terc4_ch1 <= 4'b0000;
            terc4_ch2 <= 4'b0000;
        }
    }

    SYNCHRONOUS(CLK=clk RESET=rst_n RESET_ACTIVE=Low) {
        // ---- Packet loading and shadow management ----
        // At H_ACTIVE: build audio sample packet from buffer, load shadow with PKT0
        // At swap points: load shadow with next packet's data
        // Sample buffering runs in every branch; the H_ACTIVE branch also
        // restarts samp_count for the next line
        IF (x_pos == lit(11, H_ACTIVE)) {
            // Build audio sample packet (PKT2) from accumulated samples
            IF (samp_count == 2'd3) {
                // 3 samples collected
                p2_header <= 32'h4D000702;
                p2_sp0_lo <= samp_buf0_lo;
                p2_sp0_hi <= samp_buf0_hi;
                p2_sp1_lo <= samp_buf1_lo;
                p2_sp1_hi <= samp_buf1_hi;
                p2_sp2_lo <= samp_buf2_lo;
                p2_sp2_hi <= samp_buf2_hi;
            } ELIF (samp_count == 2'd2) {
                // 2 samples collected
                p2_header <= 32'h80000302;
                p2_sp0_lo <= samp_buf0_lo;
                p2_sp0_hi <= samp_buf0_hi;
                p2_sp1_lo <= samp_buf1_lo;
                p2_sp1_hi <= samp_buf1_hi;
                p2_sp2_lo <= 32'd0;
                p2_sp2_hi <= 32'd0;
            } ELIF (samp_count == 2'd1) {
                // 1 sample collected (first line after reset)
                p2_header <= 32'h65000102;
                p2_sp0_lo <= samp_buf0_lo;
                p2_sp0_hi <= samp_buf0_hi;
                p2_sp1_lo <= 32'd0;
                p2_sp1_hi <= 32'd0;
                p2_sp2_lo <= 32'd0;
                p2_sp2_hi <= 32'd0;
            } ELSE {
                // No samples - send silence
                p2_header <= 32'h65000102;
                p2_sp0_lo <= 32'd0;
                p2_sp0_hi <= 32'd0;
                p2_sp1_lo <= 32'd0;
                p2_sp1_hi <= 32'd0;
                p2_sp2_lo <= 32'd0;
                p2_sp2_hi <= 32'd0;
            }

            // Reset sample buffer for next line, capturing if samp_valid fires now
            IF (samp_valid == 1'b1) {
                samp_buf0_lo <= samp_lo;
                samp_buf0_hi <= samp_hi;
                samp_count   <= 2'd1;
            } ELSE {
                samp_count <= 2'd0;
            }

            // Load shadow with PKT0: AVI InfoFrame
            // Header: {ECC=0xE4, Len=0x0D, Ver=0x02, Type=0x82}
            shd_header <= 32'hE40D0282;
            // SP0: {PB3=0, PB2=0, PB1=0, PB0=checksum=0x6F}
            shd_sp0_lo <= 32'h0000006F;
            shd_sp0_hi <= 32'h5F000000;
            shd_sp1_lo <= 32'd0;
            shd_sp1_hi <= 32'd0;
            shd_sp2_lo <= 32'd0;
            shd_sp2_hi <= 32'd0;
            shd_sp3_lo <= 32'd0;
            shd_sp3_hi <= 32'd0;

        } ELIF (x_pos == lit(11, DI_SHD_SWAP1)) {
            // Shadow <- PKT1: ACR (N=6144, CTS=37125=0x009105)
            // Header: {ECC=0x4A, 0x00, 0x00, Type=0x01}
            shd_header <= 32'h4A000001;
            // SP0: {CTS[7:0]=0x05, CTS[15:8]=0x91, CTS[19:16]=0x00, 0x00}
            shd_sp0_lo <= 32'h05910000;
            // SP0 hi: {ECC=0x16, N[7:0]=0x00, N[15:8]=0x18, N[19:16]=0x00}
            shd_sp0_hi <= 32'h16001800;
            shd_sp1_lo <= 32'd0;
            shd_sp1_hi <= 32'd0;
            shd_sp2_lo <= 32'd0;
            shd_sp2_hi <= 32'd0;
            shd_sp3_lo <= 32'd0;
            shd_sp3_hi <= 32'd0;

            // Sample buffering (samp_valid may fire this cycle)
            IF (samp_valid == 1'b1) {
                IF (samp_count == 2'd0) {
                    samp_buf0_lo <= samp_lo;
                    samp_buf0_hi <= samp_hi;
                    samp_count   <= 2'd1;
                } ELIF (samp_count == 2'd1) {
                    samp_buf1_lo <= samp_lo;
                    samp_buf1_hi <= samp_hi;
                    samp_count   <= 2'd2;
                } ELIF (samp_count == 2'd2) {
                    samp_buf2_lo <= samp_lo;
                    samp_buf2_hi <= samp_hi;
                    samp_count   <= 2'd3;
                }
            }

        } ELIF (x_pos == lit(11, DI_SHD_SWAP2)) {
            // Shadow <- PKT2: Audio Sample (from pre-built registers)
            shd_header <= p2_header;
            shd_sp0_lo <= p2_sp0_lo;
            shd_sp0_hi <= p2_sp0_hi;
            shd_sp1_lo <= p2_sp1_lo;
            shd_sp1_hi <= p2_sp1_hi;
            shd_sp2_lo <= p2_sp2_lo;
            shd_sp2_hi <= p2_sp2_hi;
            shd_sp3_lo <= 32'd0;
            shd_sp3_hi <= 32'd0;

            // Sample buffering
            IF (samp_valid == 1'b1) {
                IF (samp_count == 2'd0) {
                    samp_buf0_lo <= samp_lo;
                    samp_buf0_hi <= samp_hi;
                    samp_count   <= 2'd1;
                } ELIF (samp_count == 2'd1) {
                    samp_buf1_lo <= samp_lo;
                    samp_buf1_hi <= samp_hi;
                    samp_count   <= 2'd2;
                }
            }

        } ELIF (x_pos == lit(11, DI_SHD_SWAP3)) {
            // Shadow <- PKT3: Audio InfoFrame (2ch L-PCM 48kHz 16-bit)
            // Header: {ECC=0x4A, Len=0x0A, Ver=0x01, Type=0x84}
            shd_header <= 32'h4A0A0184;
            // SP0: {PB3=0, PB2=0x0D(48kHz/16bit), PB1=0x11(PCM/2ch), PB0=chksum=0x53}
            shd_sp0_lo <= 32'h000D1153;
            shd_sp0_hi <= 32'hB9000000;
            shd_sp1_lo <= 32'd0;
            shd_sp1_hi <= 32'd0;
            shd_sp2_lo <= 32'd0;
            shd_sp2_hi <= 32'd0;
            shd_sp3_lo <= 32'd0;
            shd_sp3_hi <= 32'd0;

            // Sample buffering
            IF (samp_valid == 1'b1) {
                IF (samp_count == 2'd0) {
                    samp_buf0_lo <= samp_lo;
                    samp_buf0_hi <= samp_hi;
                    samp_count   <= 2'd1;
                } ELIF (samp_count == 2'd1) {
                    samp_buf1_lo <= samp_lo;
                    samp_buf1_hi <= samp_hi;
                    samp_count   <= 2'd2;
                }
            }

        } ELSE {
            // Default cycle: sample buffering only
            IF (samp_valid == 1'b1) {
                IF (samp_count == 2'd0) {
                    samp_buf0_lo <= samp_lo;
                    samp_buf0_hi <= samp_hi;
                    samp_count   <= 2'd1;
                } ELIF (samp_count == 2'd1) {
                    samp_buf1_lo <= samp_lo;
                    samp_buf1_hi <= samp_hi;
                    samp_count   <= 2'd2;
                } ELIF (samp_count == 2'd2) {
                    samp_buf2_lo <= samp_lo;
                    samp_buf2_hi <= samp_hi;
                    samp_count   <= 2'd3;
                }
            }
        }
    }
@endmod
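The constant packet words loaded into the shadow registers above can be cross-checked on the host. The Python sketch below is not part of the design, and the helper names `infoframe_checksum` and `acr_sample_rate` are made up for illustration. It assumes the usual InfoFrame rule that the header bytes, the checksum byte, and the payload bytes sum to zero modulo 256, and the clock regeneration relation 128 × fs = f_TMDS × N / CTS, with the 37.125 MHz pixel clock taken as the TMDS clock.

```python
def infoframe_checksum(header_bytes, payload_bytes):
    """Return PB0 so that header + PB0 + payload sums to 0 modulo 256."""
    return (-(sum(header_bytes) + sum(payload_bytes))) & 0xFF


def acr_sample_rate(tmds_clock_hz, n, cts):
    """Audio sample rate implied by an ACR packet: 128 * fs = f_TMDS * N / CTS."""
    return tmds_clock_hz * n / (128 * cts)


# AVI InfoFrame: Type=0x82, Ver=0x02, Len=0x0D, all 13 payload bytes zero
# (this matches the all-zero PB1..PB13 loaded in the listing).
avi_pb0 = infoframe_checksum([0x82, 0x02, 0x0D], [0x00] * 13)

# Audio InfoFrame: Type=0x84, Ver=0x01, Len=0x0A, PB1=0x11, PB2=0x0D, rest zero.
audio_pb0 = infoframe_checksum([0x84, 0x01, 0x0A], [0x11, 0x0D] + [0x00] * 8)

# ACR packet values from the listing: N=6144, CTS=37125.
fs = acr_sample_rate(37_125_000, 6144, 37125)

print(hex(avi_pb0), hex(audio_pb0), fs)
```

The two checksums reproduce the `0x6F` and `0x53` bytes in the listing, and the ACR values work out to 48 kHz. The BCH ECC bytes in the headers and subpackets are not covered by this check.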
jz
// TERC4 (TMDS Error Reduction Coding, 4-bit)
// DVI/HDMI 1.4 spec encoding for data island periods.
// 4-bit input -> 10-bit TERC4 output, purely combinational.
@module terc4_encoder
    PORT {
        IN  [4]  data_in;
        OUT [10] terc4_out;
    }

    WIRE {
        result [10];
    }

    ASYNCHRONOUS {
        SELECT (data_in) {
            CASE (4'b0000) { result <= 10'b1010011100; }
            CASE (4'b0001) { result <= 10'b1001100011; }
            CASE (4'b0010) { result <= 10'b1011100100; }
            CASE (4'b0011) { result <= 10'b1011100010; }
            CASE (4'b0100) { result <= 10'b0101110001; }
            CASE (4'b0101) { result <= 10'b0100011110; }
            CASE (4'b0110) { result <= 10'b0110001110; }
            CASE (4'b0111) { result <= 10'b0100111100; }
            CASE (4'b1000) { result <= 10'b1011001100; }
            CASE (4'b1001) { result <= 10'b0100111001; }
            CASE (4'b1010) { result <= 10'b0110011100; }
            CASE (4'b1011) { result <= 10'b1011000110; }
            CASE (4'b1100) { result <= 10'b1010001110; }
            CASE (4'b1101) { result <= 10'b1001110001; }
            CASE (4'b1110) { result <= 10'b0101100011; }
            CASE (4'b1111) { result <= 10'b1011000011; }
            DEFAULT        { result <= 10'b1010011100; }
        }
        terc4_out <= result;
    }
@endmod
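For testbenches or host-side decoding it can help to have the same lookup table in software. The following Python sketch mirrors the `SELECT` table of `terc4_encoder` entry for entry; `TERC4_TABLE`, `terc4_encode`, and `count_ones` are illustrative names, not part of the design.

```python
# Software mirror of the terc4_encoder table; list index = 4-bit input value.
TERC4_TABLE = [
    0b1010011100,  # 0000
    0b1001100011,  # 0001
    0b1011100100,  # 0010
    0b1011100010,  # 0011
    0b0101110001,  # 0100
    0b0100011110,  # 0101
    0b0110001110,  # 0110
    0b0100111100,  # 0111
    0b1011001100,  # 1000
    0b0100111001,  # 1001
    0b0110011100,  # 1010
    0b1011000110,  # 1011
    0b1010001110,  # 1100
    0b1001110001,  # 1101
    0b0101100011,  # 1110
    0b1011000011,  # 1111
]


def terc4_encode(nibble: int) -> int:
    """Map a 4-bit value to its 10-bit TERC4 codeword."""
    if not 0 <= nibble <= 15:
        raise ValueError("TERC4 input must be a 4-bit value")
    return TERC4_TABLE[nibble]


def count_ones(word: int) -> int:
    """Number of set bits in a codeword."""
    return bin(word).count("1")


# Sanity checks: all codewords are distinct and each has five ones and five zeros.
all_distinct = len(set(TERC4_TABLE)) == 16
all_balanced = all(count_ones(w) == 5 for w in TERC4_TABLE)
```

Both checks pass for the table as listed, which is a quick way to catch a mistyped codeword when copying it into another tool.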
jz
// UART Audio Packet Receiver
// Protocol: Python -> FPGA: [LEN] [DATA x LEN] [PARITY]
//           FPGA -> Python: [ACK]
// LEN:    Number of data bytes (must be even for 16-bit samples)
// DATA:   16-bit samples, little-endian (low byte first)
// PARITY: XOR of all data bytes
// ACK:    bits[7:1] = buffer fill (0-127), bit[0] = status (0=OK, 1=RESEND)
//
// Samples are buffered in local RAM during reception. Only flushed to
// the audio ring buffer after parity verification passes. This prevents
// corrupted packets from producing static in the audio output.
@module uart_audio_rx
    PORT {
        IN  [1]  clk;
        IN  [1]  rst_n;

        // UART RX interface
        IN  [8]  rx_data;
        IN  [1]  rx_valid;

        // UART TX interface (for ACK)
        OUT [8]  tx_data;
        OUT [1]  tx_valid;
        IN  [1]  tx_ready;

        // Audio buffer write interface
        OUT [16] wr_data;
        OUT [1]  wr_valid;

        // Buffer status (from audio_buffer)
        IN  [7]  buf_fill;
    }

    MEM(TYPE=DISTRIBUTED) {
        // Local packet buffer: up to 64 samples per packet (128 data bytes)
        pkt_ram [16] [64] = 16'h0000 { OUT rd ASYNC; IN wr; };
    }

    CONST {
        // Watchdog timeout: ~5ms at 37.125 MHz = 185625 cycles
        // Must tolerate USB Full Speed frame gaps (~1ms for CH340E)
        WD_TIMEOUT = 185624;

        // Periodic fill report interval: ~10ms at 37.125 MHz
        FILL_INTERVAL = 371249;
    }

    REGISTER {
        // State: 0=IDLE, 1=RX_DATA, 2=RX_PARITY, 3=FLUSH, 4=TX_ACK
        state       [3]  = 3'd0;

        // Packet state
        pkt_len     [8]  = 8'd0;       // expected data byte count
        byte_cnt    [8]  = 8'd0;       // bytes received so far
        parity_acc  [8]  = 8'h00;      // running XOR parity
        parity_ok   [1]  = 1'b0;       // parity check result

        // Sample assembly (little-endian: low byte first)
        low_byte    [8]  = 8'h00;      // stored low byte
        have_low    [1]  = 1'b0;       // waiting for high byte

        // Sample counting and flush
        samp_cnt    [6]  = 6'd0;       // samples assembled so far (0-63)
        flush_idx   [6]  = 6'd0;       // index during flush to audio buffer

        // Watchdog timer: resets to IDLE on timeout during RX_DATA/RX_PARITY
        wd_cnt      [18] = 18'd0;

        // Periodic fill-level report timer (counts down in IDLE)
        fill_timer  [19] = 19'd0;

        // Output registers
        wr_data_r   [16] = 16'h0000;
        wr_valid_r  [1]  = 1'b0;
        tx_data_r   [8]  = 8'h00;
        tx_valid_r  [1]  = 1'b0;
    }

    ASYNCHRONOUS {
        wr_data  <= wr_data_r;
        wr_valid <= wr_valid_r;
        tx_data  <= tx_data_r;
        tx_valid <= tx_valid_r;
    }

    SYNCHRONOUS(CLK=clk RESET=rst_n RESET_ACTIVE=Low) {
        SELECT (state) {
            CASE (3'd0) {
                // IDLE: wait for LEN byte, send periodic fill reports
                wr_valid_r <= 1'b0;
                have_low   <= 1'b0;
                IF (rx_valid == 1'b1) {
                    // Start receiving a new packet
                    pkt_len    <= rx_data;
                    byte_cnt   <= 8'd0;
                    parity_acc <= 8'h00;
                    samp_cnt   <= 6'd0;
                    wd_cnt     <= 18'd0;
                    tx_valid_r <= 1'b0;
                    state      <= 3'd1;
                } ELIF (fill_timer == 19'd0 && tx_ready == 1'b1) {
                    // Timer expired: send unsolicited fill-level report
                    tx_data_r  <= { buf_fill, 1'b0 };
                    tx_valid_r <= 1'b1;
                    fill_timer <= lit(19, FILL_INTERVAL);
                } ELSE {
                    tx_valid_r <= 1'b0;
                    IF (fill_timer != 19'd0) {
                        fill_timer <= fill_timer - 19'd1;
                    }
                }
            }

            CASE (3'd1) {
                // RX_DATA: receive pkt_len data bytes, buffer samples locally
                tx_valid_r <= 1'b0;
                wr_valid_r <= 1'b0;
                IF (wd_cnt == lit(18, WD_TIMEOUT)) {
                    // Watchdog timeout: no byte for ~5ms, resync
                    state  <= 3'd0;
                    wd_cnt <= 18'd0;
                } ELIF (rx_valid == 1'b1) {
                    // Reset watchdog on each received byte
                    wd_cnt <= 18'd0;

                    // Accumulate parity
                    parity_acc <= parity_acc ^ rx_data;

                    // Assemble 16-bit samples (little-endian)
                    IF (have_low == 1'b0) {
                        // Low byte: store and wait for high byte
                        low_byte <= rx_data;
                        have_low <= 1'b1;
                    } ELSE {
                        // High byte: store complete sample in local RAM
                        pkt_ram.wr[samp_cnt] <= { rx_data, low_byte };
                        samp_cnt <= samp_cnt + 6'd1;
                        have_low <= 1'b0;
                    }

                    // Count bytes
                    IF (byte_cnt + 8'd1 == pkt_len) {
                        // All data received, expect parity next
                        state <= 3'd2;
                    } ELSE {
                        byte_cnt <= byte_cnt + 8'd1;
                    }
                } ELSE {
                    wd_cnt <= wd_cnt + 18'd1;
                }
            }

            CASE (3'd2) {
                // RX_PARITY: receive and check parity byte
                wr_valid_r <= 1'b0;
                tx_valid_r <= 1'b0;
                IF (wd_cnt == lit(18, WD_TIMEOUT)) {
                    // Watchdog timeout: resync
                    state  <= 3'd0;
                    wd_cnt <= 18'd0;
                } ELIF (rx_valid == 1'b1) {
                    wd_cnt <= 18'd0;
                    IF (parity_acc == rx_data) {
                        parity_ok <= 1'b1;
                        // Parity OK: flush buffered samples to audio buffer
                        flush_idx <= 6'd0;
                        state     <= 3'd3;
                    } ELSE {
                        parity_ok <= 1'b0;
                        // Parity failed: skip flush, go straight to ACK
                        state <= 3'd4;
                    }
                } ELSE {
                    wd_cnt <= wd_cnt + 18'd1;
                }
            }

            CASE (3'd3) {
                // FLUSH: write buffered samples to audio ring buffer (1 per cycle)
                tx_valid_r <= 1'b0;
                wr_data_r  <= pkt_ram.rd[flush_idx];
                wr_valid_r <= 1'b1;
                IF (flush_idx + 6'd1 == samp_cnt) {
                    // Last sample flushed, send ACK
                    state <= 3'd4;
                } ELSE {
                    flush_idx <= flush_idx + 6'd1;
                }
            }

            CASE (3'd4) {
                // TX_ACK: send ACK byte when TX is ready
                wr_valid_r <= 1'b0;
                IF (tx_ready == 1'b1) {
                    // ACK: bits[7:1] = buffer fill (0-127), bit[0] = status
                    IF (parity_ok == 1'b1) {
                        tx_data_r  <= { buf_fill, 1'b0 };
                    } ELSE {
                        tx_data_r  <= { buf_fill, 1'b1 };
                    }
                    tx_valid_r <= 1'b1;
                    state      <= 3'd0;
                } ELSE {
                    tx_valid_r <= 1'b0;
                }
            }

            DEFAULT {
                wr_valid_r <= 1'b0;
                tx_valid_r <= 1'b0;
                state      <= 3'd0;
            }
        }
    }
@endmod
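The host side of this protocol is small. The Python sketch below frames samples as `[LEN][DATA][PARITY]` and decodes the ACK byte as described in the module header. It is a minimal illustration, not the project's actual host script: `build_packet`, `parse_ack`, and `MAX_SAMPLES` are assumed names, and serial port handling is left out. Empty packets are rejected because a LEN of 0 never matches the receiver's `byte_cnt + 1 == pkt_len` comparison on the first byte.

```python
import struct
from functools import reduce

MAX_SAMPLES = 64  # documented per-packet limit (128 data bytes)


def build_packet(samples):
    """Frame signed 16-bit samples as [LEN][DATA x LEN][PARITY]."""
    if not 1 <= len(samples) <= MAX_SAMPLES:
        raise ValueError("packet must carry 1 to %d samples" % MAX_SAMPLES)
    # Little-endian 16-bit samples: low byte first, as the receiver expects.
    data = struct.pack("<%dh" % len(samples), *samples)
    # PARITY is the XOR of all data bytes (LEN is not included).
    parity = reduce(lambda a, b: a ^ b, data, 0)
    return bytes([len(data)]) + data + bytes([parity])


def parse_ack(ack):
    """Split an ACK byte into (fill_level 0-127, resend_requested)."""
    return ack >> 1, bool(ack & 1)


packet = build_packet([1, -1])
fill, resend = parse_ack(0x55)
print(packet.hex(), fill, resend)
```

A sender would typically transmit one packet, wait for the ACK, resend when the status bit is set, and pause when the reported fill level gets close to 127. Note that the receiver also emits unsolicited fill reports with the status bit clear while idle, so a host has to tolerate ACK-shaped bytes it did not ask for.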
jz
// Audio Ring Buffer with 48kHz Playback
// BRAM-based ring buffer (8192 x 16-bit samples = ~170ms at 48kHz).
// Write port fed by UART audio receiver.
// Read port produces 48kHz sample stream for DVI audio output.
// Reports buffer fill level for flow control.
@module audio_buffer
    CONST {
        // Bresenham audio sample rate: 48000 / 37125000 = 16 / 12375
        BRES_STEP     = 16;
        BRES_THRESH   = 12375;
        BRES_FIRE_MIN = 12359;
    }

    PORT {
        IN  [1]  clk;
        IN  [1]  rst_n;

        // Write port (from UART audio receiver)
        IN  [16] wr_data;
        IN  [1]  wr_valid;

        // Audio output (48kHz)
        OUT [16] sample;
        OUT [1]  samp_valid;

        // Buffer status
        OUT [7]  fill_level;    // 0-127 (top 7 bits of 13-bit count)
    }

    MEM(TYPE=BLOCK) {
        buf [16] [8192] = 16'h0000 {
            IN  write;
            OUT read SYNC;
        };
    }

    WIRE {
        // Bresenham pre-computation
        bres_fire    [1];
        bres_next    [14];

        // Buffer state
        fill_count   [13];
        buf_empty    [1];
    }

    REGISTER {
        // Ring buffer pointers (13-bit for 8192 entries)
        wr_ptr      [13] = 13'd0;
        rd_ptr      [13] = 13'd0;

        // Bresenham accumulator
        bres_acc    [14] = 14'd0;

        // Output registers
        sample_out  [16] = 16'h0000;
        valid_out   [1]  = 1'b0;
    }

    ASYNCHRONOUS {
        // Buffer fill: modular subtraction (wraps correctly for power-of-2 size)
        fill_count <= wr_ptr - rd_ptr;
        buf_empty  <= (wr_ptr == rd_ptr) ? 1'b1 : 1'b0;

        // Fill level: top 7 bits of 13-bit count (0-127)
        fill_level <= fill_count[12:6];

        // Bresenham pre-computation
        bres_fire <= (bres_acc >= lit(14, BRES_FIRE_MIN)) ? 1'b1 : 1'b0;
        IF (bres_acc >= lit(14, BRES_FIRE_MIN)) {
            bres_next <= bres_acc - lit(14, BRES_FIRE_MIN);
        } ELSE {
            bres_next <= bres_acc + lit(14, BRES_STEP);
        }

        // Drive output ports
        sample     <= sample_out;
        samp_valid <= valid_out;
    }

    SYNCHRONOUS(CLK=clk RESET=rst_n RESET_ACTIVE=Low) {
        // Bresenham accumulator
        bres_acc <= bres_next;

        // BRAM read: always present rd_ptr as read address.
        buf.read.addr <= rd_ptr;

        // 48kHz tick: grab current BRAM output and advance pointer
        IF (bres_fire == 1'b1 && buf_empty == 1'b0) {
            sample_out <= buf.read.data;
            valid_out  <= 1'b1;
            rd_ptr     <= rd_ptr + 13'd1;
        } ELIF (bres_fire == 1'b1) {
            sample_out <= 16'h0000;
            valid_out  <= 1'b1;
        } ELSE {
            valid_out <= 1'b0;
        }

        // Write port: store sample from UART receiver
        IF (wr_valid == 1'b1) {
            buf.write[wr_ptr] <= wr_data;
            wr_ptr <= wr_ptr + 13'd1;
        }
    }
@endmod
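The rate conversion and fill-level arithmetic are easy to model in software. The Python sketch below replays the accumulator logic of `audio_buffer` cycle by cycle and reproduces the fill-level calculation; the function names `bresenham_ticks` and `fill_level` are illustrative, and the model ignores the BRAM read latency and the underrun path.

```python
BRES_STEP = 16
BRES_THRESH = 12375
BRES_FIRE_MIN = BRES_THRESH - BRES_STEP  # 12359, as in the CONST block


def bresenham_ticks(n_cycles):
    """Return the clock cycle indices on which bres_fire would be high."""
    acc = 0
    ticks = []
    for cycle in range(n_cycles):
        if acc >= BRES_FIRE_MIN:
            ticks.append(cycle)
            acc -= BRES_FIRE_MIN  # same as acc + STEP - THRESH
        else:
            acc += BRES_STEP
    return ticks


def fill_level(wr_ptr, rd_ptr):
    """Top 7 bits of the 13-bit modular difference between the pointers."""
    return ((wr_ptr - rd_ptr) & 0x1FFF) >> 6


# One full accumulator period is BRES_THRESH cycles.
ticks = bresenham_ticks(BRES_THRESH)
gaps = {b - a for a, b in zip(ticks, ticks[1:])}
print(len(ticks), sorted(gaps))
```

One period of 12375 cycles produces 16 ticks spaced 773 or 774 cycles apart. Since 37,125,000 / 12375 = 3000 periods fit in a second, that gives 3000 × 16 = 48,000 ticks per second with no long-term drift. The fill level has a granularity of 64 samples, and because the count is taken modulo 8192, a completely full buffer is indistinguishable from an empty one in this model.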
jz
// 4-Channel Audio Mixer + DVI Subpacket Formatter
// Sums 4 signed 16-bit PCM channels and formats as DVI audio subpacket words.
// Includes BCH(64,56) ECC computation (polynomial 0x83, LSB-first).
// Purely combinational (ASYNC only).
@module mixer
    PORT {
        IN  [16] s0;            // channel 0 PCM sample
        IN  [16] s1;            // channel 1 PCM sample
        IN  [16] s2;            // channel 2 PCM sample
        IN  [16] s3;            // channel 3 PCM sample
        IN  [1]  samp_valid;    // sample valid pulse
        OUT [32] samp_lo;       // DVI subpacket low word
        OUT [32] samp_hi;       // DVI subpacket high word
        OUT [1]  out_valid;     // pass-through valid
    }

    WIRE {
        sum01    [17];          // sadd(s0, s1) — 17-bit signed
        sum23    [17];          // sadd(s2, s3) — 17-bit signed
        sum_all  [18];          // sadd(sum01, sum23) — 18-bit signed
        mix      [16];          // final 16-bit signed PCM
        samp_24  [24];          // left-justified to 24 bits

        // Parity computation (XOR of all 24 bits, folded in stages)
        p8       [8];           // byte0 ^ byte1 ^ byte2
        p4       [4];           // fold 8 to 4
        p2       [2];           // fold 4 to 2
        parity   [1];           // final even parity bit
        sb6      [8];           // status byte 6

        // Subpacket data (internal wires, before port assignment)
        lo       [32];          // samp_lo value
        hi_data  [24];          // upper 24 bits: {sb6, R[23:16], R[15:8]}

        // BCH ECC
        ecc      [8];           // BCH(64,56) ECC result
    }

    ASYNCHRONOUS {
        // Sum 4 channels using sadd for proper sign-extended widening
        sum01   <= sadd(s0, s1);
        sum23   <= sadd(s2, s3);
        sum_all <= sadd(sum01, sum23);
        // Keep the low 16 bits; there is no saturation, so a sum outside the
        // signed 16-bit range wraps
        mix     <= sum_all[15:0];

        // Left-justify 16-bit sample to 24 bits
        samp_24 <= { mix, 8'h00 };

        // Even parity: XOR all 24 bits by folding bytes then nibbles
        p8     <= samp_24[7:0] ^ samp_24[15:8] ^ samp_24[23:16];
        p4     <= p8[3:0] ^ p8[7:4];
        p2     <= p4[1:0] ^ p4[3:2];
        parity <= p2[0] ^ p2[1];

        // Status byte 6: {P_R, 000, P_L, 000} -- parity for both R and L (same bit, L=R mono)
        sb6 <= { parity, 3'b000, parity, 3'b000 };

        // DVI subpacket format (L=R mono):
        //   lo = { R[7:0], L[23:16], L[15:8], L[7:0] }
        lo <= { samp_24[7:0], samp_24[23:16], samp_24[15:8], samp_24[7:0] };

        // Upper 24 data bits for ECC: {sb6, R[23:16], R[15:8]}
        hi_data <= { sb6, samp_24[23:16], samp_24[15:8] };

        // BCH(64,56) ECC: polynomial 0x83, LSB-first, right-shift LFSR
        // data[31:0] = lo, data[55:32] = hi_data
        ecc[0] <= lo[0] ^ lo[1] ^ lo[3] ^ lo[4] ^ lo[5] ^ lo[6] ^ lo[11] ^ lo[12] ^ lo[14] ^ lo[17] ^ lo[21] ^ lo[22] ^ lo[23] ^ lo[24] ^ lo[25] ^ lo[27] ^ lo[29] ^ lo[30] ^ lo[31] ^ hi_data[2] ^ hi_data[5] ^ hi_data[7] ^ hi_data[8] ^ hi_data[9] ^ hi_data[10] ^ hi_data[11] ^ hi_data[12] ^ hi_data[15] ^ hi_data[16] ^ hi_data[17] ^ hi_data[18] ^ hi_data[20] ^ hi_data[21] ^ hi_data[23];
        ecc[1] <= lo[0] ^ lo[2] ^ lo[3] ^ lo[7] ^ lo[11] ^ lo[13] ^ lo[14] ^ lo[15] ^ lo[17] ^ lo[18] ^ lo[21] ^ lo[26] ^ lo[27] ^ lo[28] ^ lo[29] ^ hi_data[0] ^ hi_data[2] ^ hi_data[3] ^ hi_data[5] ^ hi_data[6] ^ hi_data[7] ^ hi_data[13] ^ hi_data[15] ^ hi_data[19] ^ hi_data[20] ^ hi_data[22] ^ hi_data[23];
        ecc[2] <= lo[0] ^ lo[5] ^ lo[6] ^ lo[8] ^ lo[11] ^ lo[15] ^ lo[16] ^ lo[17] ^ lo[18] ^ lo[19] ^ lo[21] ^ lo[23] ^ lo[24] ^ lo[25] ^ lo[28] ^ lo[31] ^ hi_data[1] ^ hi_data[2] ^ hi_data[3] ^ hi_data[4] ^ hi_data[5] ^ hi_data[6] ^ hi_data[9] ^ hi_data[10] ^ hi_data[11] ^ hi_data[12] ^ hi_data[14] ^ hi_data[15] ^ hi_data[17] ^ hi_data[18];
        ecc[3] <= lo[0] ^ lo[1] ^ lo[6] ^ lo[7] ^ lo[9] ^ lo[12] ^ lo[16] ^ lo[17] ^ lo[18] ^ lo[19] ^ lo[20] ^ lo[22] ^ lo[24] ^ lo[25] ^ lo[26] ^ lo[29] ^ hi_data[0] ^ hi_data[2] ^ hi_data[3] ^ hi_data[4] ^ hi_data[5] ^ hi_data[6] ^ hi_data[7] ^ hi_data[10] ^ hi_data[11] ^ hi_data[12] ^ hi_data[13] ^ hi_data[15] ^ hi_data[16] ^ hi_data[18] ^ hi_data[19];
        ecc[4] <= lo[0] ^ lo[1] ^ lo[2] ^ lo[7] ^ lo[8] ^ lo[10] ^ lo[13] ^ lo[17] ^ lo[18] ^ lo[19] ^ lo[20] ^ lo[21] ^ lo[23] ^ lo[25] ^ lo[26] ^ lo[27] ^ lo[30] ^ hi_data[1] ^ hi_data[3] ^ hi_data[4] ^ hi_data[5] ^ hi_data[6] ^ hi_data[7] ^ hi_data[8] ^ hi_data[11] ^ hi_data[12] ^ hi_data[13] ^ hi_data[14] ^ hi_data[16] ^ hi_data[17] ^ hi_data[19] ^ hi_data[20];
        ecc[5] <= lo[0] ^ lo[1] ^ lo[2] ^ lo[3] ^ lo[8] ^ lo[9] ^ lo[11] ^ lo[14] ^ lo[18] ^ lo[19] ^ lo[20] ^ lo[21] ^ lo[22] ^ lo[24] ^ lo[26] ^ lo[27] ^ lo[28] ^ lo[31] ^ hi_data[2] ^ hi_data[4] ^ hi_data[5] ^ hi_data[6] ^ hi_data[7] ^ hi_data[8] ^ hi_data[9] ^ hi_data[12] ^ hi_data[13] ^ hi_data[14] ^ hi_data[15] ^ hi_data[17] ^ hi_data[18] ^ hi_data[20] ^ hi_data[21];
        ecc[6] <= lo[1] ^ lo[2] ^ lo[3] ^ lo[4] ^ lo[9] ^ lo[10] ^ lo[12] ^ lo[15] ^ lo[19] ^ lo[20] ^ lo[21] ^ lo[22] ^ lo[23] ^ lo[25] ^ lo[27] ^ lo[28] ^ lo[29] ^ hi_data[0] ^ hi_data[3] ^ hi_data[5] ^ hi_data[6] ^ hi_data[7] ^ hi_data[8] ^ hi_data[9] ^ hi_data[10] ^ hi_data[13] ^ hi_data[14] ^ hi_data[15] ^ hi_data[16] ^ hi_data[18] ^ hi_data[19] ^ hi_data[21] ^ hi_data[22];
        ecc[7] <= lo[0] ^ lo[2] ^ lo[3] ^ lo[4] ^ lo[5] ^ lo[10] ^ lo[11] ^ lo[13] ^ lo[16] ^ lo[20] ^ lo[21] ^ lo[22] ^ lo[23] ^ lo[24] ^ lo[26] ^ lo[28] ^ lo[29] ^ lo[30] ^ hi_data[1] ^ hi_data[4] ^ hi_data[6] ^ hi_data[7] ^ hi_data[8] ^ hi_data[9] ^ hi_data[10] ^ hi_data[11] ^ hi_data[14] ^ hi_data[15] ^ hi_data[16] ^ hi_data[17] ^ hi_data[19] ^ hi_data[20] ^ hi_data[22] ^ hi_data[23];

        // Drive output ports
        samp_lo  <= lo;
        samp_hi  <= { ecc, hi_data };
        out_valid <= samp_valid;
    }
@endmod
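The parity folding and subpacket packing in the mixer listing can be checked against a small software model. The following is a minimal Python sketch, not part of the design: the function name `mixer_subpacket` and the dictionary keys are hypothetical, the input is assumed to be the already-summed 16-bit `mix` value, and the BCH ECC network is deliberately left out.

```python
def mixer_subpacket(mix: int) -> dict:
    """Model the parity and packing steps for one 16-bit mono sample."""
    mix &= 0xFFFF
    samp_24 = mix << 8  # left-justify the 16-bit sample to 24 bits

    b0 = samp_24 & 0xFF          # samp_24[7:0], always zero here
    b1 = (samp_24 >> 8) & 0xFF   # samp_24[15:8]
    b2 = (samp_24 >> 16) & 0xFF  # samp_24[23:16]

    # Even parity by XOR folding: bytes, then nibbles, then pairs, then bits
    p8 = b0 ^ b1 ^ b2
    p4 = (p8 & 0xF) ^ (p8 >> 4)
    p2 = (p4 & 0x3) ^ (p4 >> 2)
    parity = (p2 & 1) ^ (p2 >> 1)

    # Status byte 6: {P_L, 000, P_R, 000}, same parity for both channels
    sb6 = (parity << 7) | (parity << 3)

    # lo = {R[7:0], L[23:16], L[15:8], L[7:0]} with L = R
    lo = (b0 << 24) | (b2 << 16) | (b1 << 8) | b0
    # hi_data = {sb6, R[23:16], R[15:8]}
    hi_data = (sb6 << 16) | (b2 << 8) | b1

    return {"parity": parity, "sb6": sb6, "lo": lo, "hi_data": hi_data}


if __name__ == "__main__":
    packet = mixer_subpacket(0x1234)
    print({name: hex(value) for name, value in packet.items()})
```

Because the folded XOR is the parity of all 24 bits, it should always agree with a plain population count of the sample, which makes a convenient cross-check in a testbench.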
jz
// Goertzel DFT Spectrum Analyzer — 80-Bar Frequency Display
// Takes 16-bit signed PCM samples at 48kHz and produces per-bar energy
// data compatible with spectrum_display's rd_bar/rd_amp interface.
//
// Algorithm: Goertzel DFT with N=2048 (23.4 Hz resolution, 42.7ms blocks).
// Per-sample recurrence for each bin k:
//   s1[n] = coeff_k * s1[n-1] - s2[n-1] + x[n]
//   s2[n] = s1[n-1]
// After N samples: power ≈ s1² + s2² (top 16 bits). The coeff_k*s1*s2 cross
// term of the exact Goertzel magnitude is not computed.
// 80 bars map to DFT bins k=3..102 (55 Hz to 2400 Hz, log-spaced).
// Coefficients: coeff_k = 2*cos(2*pi*k/N) in Q1.14 fixed-point.
// Input scaled >>4 for headroom (24-bit state, split across 16-bit RAMs).
@module spectrum_analyzer
    PORT {
        IN  [1]  clk;
        IN  [1]  rst_n;
        IN  [16] audio_sample;    // signed 16-bit PCM
        IN  [1]  samp_valid;      // 48kHz sample pulse
        IN  [1]  frame_pulse;     // once per video frame
        IN  [7]  rd_bar;          // bar index from display (0..79)
        OUT [16] rd_amp;          // [7:0]=smooth_amp, [15:8]=peak_amp
    }

    WIRE {
        // Goertzel coefficient from IF/ELIF LUT (Q1.14 signed)
        coeff_val      [16];

        // Goertzel state reconstructed from split RAMs (24-bit signed)
        s1_val         [24];
        s2_val         [24];

        // Input scaling: cur_sample sign-extended to 24 bits, then >>4
        x_full         [24];
        x_scaled       [24];

        // Goertzel multiply: coeff × s1 (Q1.14 × signed-24 → 48-bit)
        product_raw    [48];
        product_scaled [24];

        // Goertzel accumulate: coeff*s1 - s2 + x (26-bit for headroom)
        goertzel_sum    [26];
        goertzel_result [24];

        // Power: s1² + s2² (unsigned 16-bit from top magnitude bits)
        power_s1sq     [48];
        power_s2sq     [48];
        power_sum      [17];
        power_result   [16];

        // State RAM write mux (24-bit values, split at write time)
        s1_wr_data     [24];
        s2_wr_data     [24];
        state_wr_en    [1];

        // Auto-gain normalization
        raw_power      [16];
        adj_power      [16];
        floor_w        [16];
        adj_max_w      [16];
        norm_shift_w   [4];

        // Barrel-shift normalized amplitude
        target_amp     [8];

        // Smoothing (from sweep_ram)
        smooth_cur     [8];
        smooth_delta   [9];
        smooth_step    [9];
        smooth_next    [9];
        smooth_result  [8];

        // Peak hold (from sweep_ram: [15:12]=peak_top4, [11:8]=timer)
        peak_approx    [8];
        timer_cur      [4];
        new_peak_amp   [8];
        new_peak_timer [4];

        // Write values
        display_wr_val [16];
        sweep_wr_val   [16];
    }

    REGISTER {
        // State machine: 0=IDLE, 1=PROCESS, 2=POWER, 3=SWEEP
        state          [2]  = 2'd0;
        // Bin iteration (0..79)
        bin_idx        [7]  = 7'd0;
        // Latched audio sample
        cur_sample     [16] = 16'd0;
        // DFT block sample counter (0..2047 = N-1)
        sample_count   [11] = 11'd0;
        // Frame sync
        frame_pending  [1]  = 1'b0;
        // Sweep iteration
        sweep_idx      [7]  = 7'd0;
        sweep_phase    [2]  = 2'b00;   // 00=SCAN, 01=COMPUTE, 10=WRITE
        // Auto-gain state
        scan_max       [16] = 16'd0;
        energy_sum     [24] = 24'd0;
        floor_val      [16] = 16'd0;
        norm_shift     [4]  = 4'd0;
    }

    MEM(TYPE=DISTRIBUTED) {
        // Goertzel state split across 16-bit RAMs:
        // s1_lo/s2_lo store lower 16 bits; s_hi packs both upper bytes
        s1_lo       [16] [128] = 16'd0 { OUT rd ASYNC; IN wr; };
        s2_lo       [16] [128] = 16'd0 { OUT rd ASYNC; IN wr; };
        s_hi        [16] [128] = 16'd0 { OUT rd ASYNC; IN wr; };
        // DFT power per bin (written after each N-sample block)
        power_ram   [16] [128] = 16'd0 { OUT rd ASYNC; IN wr; };
        // Display: [7:0]=smooth_amp, [15:8]=peak_amp — read by display module
        display_ram [16] [128] = 16'd0 { OUT rd ASYNC; IN wr; };
        // Sweep: [7:0]=smooth, [11:8]=timer, [15:12]=peak_top4
        sweep_ram   [16] [128] = 16'd0 { OUT rd ASYNC; IN wr; };
    }

    ASYNCHRONOUS {
        // Read port for display module
        rd_amp <= display_ram.rd[rd_bar];

        // ---- Goertzel coefficient LUT: bin_idx → Q1.14 coeff ----
        // coeff_k = round(2*cos(2*pi*k/N) * 16384), N=2048, fs=48kHz
        // 80 log-spaced bars from 55 Hz (k=3) to 2400 Hz (k=102)
        IF (bin_idx < 7'd9) { coeff_val <= 16'h7fff; }
        ELIF (bin_idx < 7'd14) { coeff_val <= 16'h7ffe; }
        ELIF (bin_idx < 7'd18) { coeff_val <= 16'h7ffc; }
        ELIF (bin_idx < 7'd22) { coeff_val <= 16'h7ffa; }
        ELIF (bin_idx < 7'd25) { coeff_val <= 16'h7ff8; }
        ELIF (bin_idx < 7'd27) { coeff_val <= 16'h7ff6; }
        ELIF (bin_idx < 7'd30) { coeff_val <= 16'h7ff4; }
        ELIF (bin_idx < 7'd32) { coeff_val <= 16'h7ff1; }
        ELIF (bin_idx < 7'd34) { coeff_val <= 16'h7fed; }
        ELIF (bin_idx == 7'd34) { coeff_val <= 16'h7fea; }
        ELIF (bin_idx < 7'd37) { coeff_val <= 16'h7fe6; }
        ELIF (bin_idx < 7'd39) { coeff_val <= 16'h7fe2; }
        ELIF (bin_idx == 7'd39) { coeff_val <= 16'h7fdd; }
        ELIF (bin_idx == 7'd40) { coeff_val <= 16'h7fd9; }
        ELIF (bin_idx < 7'd43) { coeff_val <= 16'h7fd3; }
        ELIF (bin_idx == 7'd43) { coeff_val <= 16'h7fce; }
        ELIF (bin_idx == 7'd44) { coeff_val <= 16'h7fc8; }
        ELIF (bin_idx == 7'd45) { coeff_val <= 16'h7fc2; }
        ELIF (bin_idx == 7'd46) { coeff_val <= 16'h7fbc; }
        ELIF (bin_idx == 7'd47) { coeff_val <= 16'h7fb5; }
        ELIF (bin_idx == 7'd48) { coeff_val <= 16'h7fae; }
        ELIF (bin_idx == 7'd49) { coeff_val <= 16'h7fa7; }
        ELIF (bin_idx == 7'd50) { coeff_val <= 16'h7f98; }
        ELIF (bin_idx == 7'd51) { coeff_val <= 16'h7f90; }
        ELIF (bin_idx == 7'd52) { coeff_val <= 16'h7f87; }
        ELIF (bin_idx == 7'd53) { coeff_val <= 16'h7f75; }
        ELIF (bin_idx == 7'd54) { coeff_val <= 16'h7f6c; }
        ELIF (bin_idx == 7'd55) { coeff_val <= 16'h7f58; }
        ELIF (bin_idx == 7'd56) { coeff_val <= 16'h7f4e; }
        ELIF (bin_idx == 7'd57) { coeff_val <= 16'h7f38; }
        ELIF (bin_idx == 7'd58) { coeff_val <= 16'h7f22; }
        ELIF (bin_idx == 7'd59) { coeff_val <= 16'h7f16; }
        ELIF (bin_idx == 7'd60) { coeff_val <= 16'h7efd; }
        ELIF (bin_idx == 7'd61) { coeff_val <= 16'h7ee3; }
        ELIF (bin_idx == 7'd62) { coeff_val <= 16'h7ec8; }
        ELIF (bin_idx == 7'd63) { coeff_val <= 16'h7e9d; }
        ELIF (bin_idx == 7'd64) { coeff_val <= 16'h7e7f; }
        ELIF (bin_idx == 7'd65) { coeff_val <= 16'h7e60; }
        ELIF (bin_idx == 7'd66) { coeff_val <= 16'h7e2f; }
        ELIF (bin_idx == 7'd67) { coeff_val <= 16'h7dfb; }
        ELIF (bin_idx == 7'd68) { coeff_val <= 16'h7dc4; }
        ELIF (bin_idx == 7'd69) { coeff_val <= 16'h7d9e; }
        ELIF (bin_idx == 7'd70) { coeff_val <= 16'h7d4e; }
        ELIF (bin_idx == 7'd71) { coeff_val <= 16'h7d0f; }
        ELIF (bin_idx == 7'd72) { coeff_val <= 16'h7cce; }
        ELIF (bin_idx == 7'd73) { coeff_val <= 16'h7c72; }
        ELIF (bin_idx == 7'd74) { coeff_val <= 16'h7c11; }
        ELIF (bin_idx == 7'd75) { coeff_val <= 16'h7bac; }
        ELIF (bin_idx == 7'd76) { coeff_val <= 16'h7b42; }
        ELIF (bin_idx == 7'd77) { coeff_val <= 16'h7ad3; }
        ELIF (bin_idx == 7'd78) { coeff_val <= 16'h7a42; }
        ELIF (bin_idx == 7'd79) { coeff_val <= 16'h79c9; }
        ELSE { coeff_val <= 16'h7fff; }

        // ---- Reconstruct 24-bit Goertzel state from split RAMs ----
        // s_hi packs {s2[23:16], s1[23:16]} in [15:8] and [7:0]
        s1_val <= { s_hi.rd[bin_idx][7:0], s1_lo.rd[bin_idx] };
        s2_val <= { s_hi.rd[bin_idx][15:8], s2_lo.rd[bin_idx] };

        // ---- Input scaling: sign-extend cur_sample to 24 bits, then >>4 ----
        x_full <= { cur_sample[15], cur_sample[15], cur_sample[15], cur_sample[15],
                    cur_sample[15], cur_sample[15], cur_sample[15], cur_sample[15],
                    cur_sample };
        x_scaled <= { x_full[23], x_full[23], x_full[23], x_full[23], x_full[23:4] };

        // ---- Goertzel multiply: coeff_val × s1_val (Q1.14 × integer) ----
        product_raw    <= smul(coeff_val, s1_val);
        product_scaled <= product_raw[37:14];

        // ---- Goertzel accumulate: coeff*s1 - s2 + x (26-bit signed) ----
        goertzel_sum <= { product_scaled[23], product_scaled[23], product_scaled }
                      - { s2_val[23], s2_val[23], s2_val }
                      + { x_scaled[23], x_scaled[23], x_scaled };

        // Saturate to signed 24-bit
        IF (goertzel_sum[25:23] == 3'b000 || goertzel_sum[25:23] == 3'b111) {
            goertzel_result <= goertzel_sum[23:0];
        } ELIF (goertzel_sum[25] == 1'b0) {
            goertzel_result <= 24'h7FFFFF;
        } ELSE {
            goertzel_result <= 24'h800000;
        }

        // ---- Power computation: s1² + s2² (top 16 bits of magnitude) ----
        power_s1sq <= smul(s1_val, s1_val);
        power_s2sq <= smul(s2_val, s2_val);
        power_sum  <= { 1'b0, power_s1sq[46:31] } + { 1'b0, power_s2sq[46:31] };
        IF (power_sum[16] == 1'b1) {
            power_result <= 16'hFFFF;
        } ELSE {
            power_result <= power_sum[15:0];
        }

        // ---- State RAM write mux (single write point for PROCESS + POWER) ----
        IF (state == 2'd1) {
            // PROCESS: Goertzel recurrence (new s2 = old s1)
            s1_wr_data  <= goertzel_result;
            s2_wr_data  <= s1_val;
            state_wr_en <= 1'b1;
        } ELIF (state == 2'd2) {
            // POWER: clear for next DFT block
            s1_wr_data  <= 24'd0;
            s2_wr_data  <= 24'd0;
            state_wr_en <= 1'b1;
        } ELSE {
            s1_wr_data  <= 24'd0;
            s2_wr_data  <= 24'd0;
            state_wr_en <= 1'b0;
        }

        // ---- Auto-gain normalization ----
        raw_power <= power_ram.rd[sweep_idx];
        floor_w   <= energy_sum[22:7];

        IF (scan_max > floor_w) {
            adj_max_w <= scan_max - floor_w;
        } ELSE {
            adj_max_w <= 16'd0;
        }

        // Leading-one priority encoder → right-shift to map max to 8 bits
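        // Worked example: adj_max_w = 16'h1234 has its leading one at bit 12, so
        // norm_shift_w = 5 and the largest bin maps to adj_power[12:5] = 8'h91 (145).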
        IF      (adj_max_w[15] == 1'b1) { norm_shift_w <= 4'd8; }
        ELIF (adj_max_w[14] == 1'b1) { norm_shift_w <= 4'd7; }
        ELIF (adj_max_w[13] == 1'b1) { norm_shift_w <= 4'd6; }
        ELIF (adj_max_w[12] == 1'b1) { norm_shift_w <= 4'd5; }
        ELIF (adj_max_w[11] == 1'b1) { norm_shift_w <= 4'd4; }
        ELIF (adj_max_w[10] == 1'b1) { norm_shift_w <= 4'd3; }
        ELIF (adj_max_w[9] == 1'b1)  { norm_shift_w <= 4'd2; }
        ELIF (adj_max_w[8] == 1'b1)  { norm_shift_w <= 4'd1; }
        ELSE                          { norm_shift_w <= 4'd0; }

        // Floor-subtracted power (clamped to 0)
        IF (raw_power > floor_val) {
            adj_power <= raw_power - floor_val;
        } ELSE {
            adj_power <= 16'd0;
        }

        // Barrel shift: map adj_power to 8-bit target_amp
        IF      (norm_shift == 4'd8) { target_amp <= adj_power[15:8]; }
        ELIF (norm_shift == 4'd7) { target_amp <= adj_power[14:7]; }
        ELIF (norm_shift == 4'd6) { target_amp <= adj_power[13:6]; }
        ELIF (norm_shift == 4'd5) { target_amp <= adj_power[12:5]; }
        ELIF (norm_shift == 4'd4) { target_amp <= adj_power[11:4]; }
        ELIF (norm_shift == 4'd3) { target_amp <= adj_power[10:3]; }
        ELIF (norm_shift == 4'd2) { target_amp <= adj_power[9:2]; }
        ELIF (norm_shift == 4'd1) { target_amp <= adj_power[8:1]; }
        ELSE                       { target_amp <= adj_power[7:0]; }

        // ---- Smoothing (from sweep_ram) ----
        smooth_cur   <= sweep_ram.rd[sweep_idx][7:0];
        smooth_delta <= { 1'b0, target_amp } - { 1'b0, smooth_cur };
        smooth_step  <= { smooth_delta[8], smooth_delta[8], smooth_delta[8:2] };
        smooth_next  <= { 1'b0, smooth_cur } + smooth_step;
        IF (smooth_next[8] == 1'b1) {
            smooth_result <= 8'd0;
        } ELSE {
            smooth_result <= smooth_next[7:0];
        }

        // ---- Peak hold ----
        peak_approx <= { sweep_ram.rd[sweep_idx][15:12], 4'b0000 };
        timer_cur   <= sweep_ram.rd[sweep_idx][11:8];

        // Pre-compute write values
        IF (smooth_result > peak_approx) {
            new_peak_amp   <= smooth_result;
            new_peak_timer <= 4'd15;
            display_wr_val <= { smooth_result, smooth_result };
        } ELIF (timer_cur == 4'd0) {
            new_peak_amp   <= smooth_result;
            new_peak_timer <= 4'd0;
            display_wr_val <= { smooth_result, smooth_result };
        } ELSE {
            new_peak_amp   <= peak_approx;
            new_peak_timer <= timer_cur - 4'd1;
            display_wr_val <= { peak_approx, smooth_result };
        }
        sweep_wr_val <= { new_peak_amp[7:4], new_peak_timer, smooth_result };
    }

    SYNCHRONOUS(CLK=clk RESET=rst_n RESET_ACTIVE=Low) {
        // Muxed state RAM write (split 24-bit across 16-bit memories)
        IF (state_wr_en == 1'b1) {
            s1_lo.wr[bin_idx] <= s1_wr_data[15:0];
            s2_lo.wr[bin_idx] <= s2_wr_data[15:0];
            s_hi.wr[bin_idx]  <= { s2_wr_data[23:16], s1_wr_data[23:16] };
        }

        IF (state == 2'd0) {
            // ---- IDLE: wait for sample ----
            IF (frame_pulse == 1'b1) {
                frame_pending <= 1'b1;
            }
            IF (samp_valid == 1'b1) {
                cur_sample <= audio_sample;
                state      <= 2'd1;
                bin_idx    <= 7'd0;
            }
        } ELIF (state == 2'd1) {
            // ---- PROCESS: Goertzel recurrence, 1 bin per cycle (80 cycles) ----
            // s1/s2 RAM write handled by mux above
            IF (bin_idx == 7'd79) {
                IF (sample_count == 11'd2047) {
                    // N samples complete → compute power
                    IF (frame_pulse == 1'b1) {
                        frame_pending <= 1'b1;
                    }
                    sample_count <= 11'd0;
                    state        <= 2'd2;
                    bin_idx      <= 7'd0;
                } ELSE {
                    sample_count <= sample_count + 11'd1;
                    IF (frame_pending == 1'b1 || frame_pulse == 1'b1) {
                        frame_pending <= 1'b0;
                        state         <= 2'd3;
                        sweep_idx     <= 7'd0;
                        sweep_phase   <= 2'b00;
                    } ELSE {
                        state <= 2'd0;
                    }
                }
            } ELSE {
                IF (frame_pulse == 1'b1) {
                    frame_pending <= 1'b1;
                }
                bin_idx <= bin_idx + 7'd1;
            }
        } ELIF (state == 2'd2) {
            // ---- POWER: compute s1² + s2² for all bins, 1 per cycle (80 cycles) ----
            // s1/s2 RAM cleared by mux above; power written here
            power_ram.wr[bin_idx] <= power_result;

            IF (bin_idx == 7'd79) {
                IF (frame_pending == 1'b1 || frame_pulse == 1'b1) {
                    frame_pending <= 1'b0;
                    state         <= 2'd3;
                    sweep_idx     <= 7'd0;
                    sweep_phase   <= 2'b00;
                } ELSE {
                    state <= 2'd0;
                }
            } ELSE {
                IF (frame_pulse == 1'b1) {
                    frame_pending <= 1'b1;
                }
                bin_idx <= bin_idx + 7'd1;
            }
        } ELIF (state == 2'd3) {
            // ---- SWEEP: auto-gain normalize + smooth + peak-hold ----
            IF (sweep_phase == 2'b00) {
                // SCAN: accumulate sum and max over 80 bins
                IF (frame_pulse == 1'b1) {
                    frame_pending <= 1'b1;
                }

                IF (sweep_idx == 7'd0) {
                    energy_sum <= { 8'd0, raw_power };
                    scan_max   <= raw_power;
                } ELSE {
                    energy_sum <= energy_sum + { 8'd0, raw_power };
                    IF (raw_power > scan_max) {
                        scan_max <= raw_power;
                    }
                }

                IF (sweep_idx == 7'd79) {
                    sweep_phase <= 2'b01;
                } ELSE {
                    sweep_idx <= sweep_idx + 7'd1;
                }
            } ELIF (sweep_phase == 2'b01) {
                // COMPUTE: register floor and norm_shift (1 cycle latency)
                IF (frame_pulse == 1'b1) {
                    frame_pending <= 1'b1;
                }
                floor_val   <= floor_w;
                norm_shift  <= norm_shift_w;
                sweep_phase <= 2'b10;
                sweep_idx   <= 7'd0;
            } ELSE {
                // WRITE: normalized smoothing + peak hold
                IF (frame_pulse == 1'b1) {
                    frame_pending <= 1'b1;
                }
                sweep_ram.wr[sweep_idx]   <= sweep_wr_val;
                display_ram.wr[sweep_idx] <= display_wr_val;

                IF (sweep_idx == 7'd79) {
                    state       <= 2'd0;
                    sweep_phase <= 2'b00;
                } ELSE {
                    sweep_idx <= sweep_idx + 7'd1;
                }
            }
        }
    }
@endmod
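To sanity-check the coefficient table and the fixed-point recurrence, a software model is handy. The sketch below is an illustrative Python model rather than the design itself: all function names are hypothetical, the clamp of the coefficient to 0x7FFF is an assumption made so that values near 2.0 fit in 16 signed bits, and `block_power` returns the full-width s1² + s2² instead of the 16-bit truncated value the hardware stores.

```python
import math

N = 2048         # DFT block length used by the listing
FRAC_BITS = 14   # fractional bits of the Q1.14 coefficient


def goertzel_coeff(k: int) -> int:
    """Q1.14 coefficient for bin k, clamped to the largest signed 16-bit value."""
    c = round(2.0 * math.cos(2.0 * math.pi * k / N) * (1 << FRAC_BITS))
    return min(c, 0x7FFF)


def wrap_signed(value: int, bits: int) -> int:
    """Keep the low `bits` bits and reinterpret them as two's complement."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def saturate24(value: int) -> int:
    """Clamp to the signed 24-bit range, like the goertzel_result logic."""
    return max(-(1 << 23), min((1 << 23) - 1, value))


def goertzel_step(coeff: int, s1: int, s2: int, sample: int) -> tuple[int, int]:
    """One recurrence step; returns the new (s1, s2) pair."""
    x_scaled = sample >> 4  # arithmetic shift right by 4 for headroom
    product_scaled = wrap_signed((coeff * s1) >> FRAC_BITS, 24)  # product_raw[37:14]
    return saturate24(product_scaled - s2 + x_scaled), s1


def block_power(k: int, samples) -> int:
    """Run one block through bin k and return s1^2 + s2^2 at full width."""
    coeff = goertzel_coeff(k)
    s1 = s2 = 0
    for x in samples:
        s1, s2 = goertzel_step(coeff, s1, s2, x)
    return s1 * s1 + s2 * s2


def tone(k: int, amplitude: int = 8000) -> list[int]:
    """N samples of a cosine centered exactly on bin k."""
    return [round(amplitude * math.cos(2.0 * math.pi * k * n / N)) for n in range(N)]


if __name__ == "__main__":
    print(hex(goertzel_coeff(102)))         # last entry of the coefficient LUT
    print(block_power(102, tone(102)))      # tone on the analyzed bin
    print(block_power(102, tone(50)))       # tone far from the analyzed bin
```

A tone centered on the analyzed bin should yield a far larger power than a tone many bins away, which is a quick way to confirm that the scaling and saturation behave as intended before committing to hardware.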
jz
// Spectrum Display — 80-Bar Renderer
// Reads packed bar state from DISTRIBUTED RAM (smooth+peak amplitudes),
// renders 80 narrow pill-shaped bar graphs with reflections and peak hold.
//
// Bar layout: 80 bars, 16px pitch (10px body + 6px gap) = 1280px total.
// Bars grow upward from baseline at y=600. Max height: 25 segments (400px).
// Segment: 12px body + 4px gap = 16px pitch.
// Reflection: below baseline, up to 6 segments, 1/4 brightness.
// Peak hold: single bright segment above bar (SHOW_PEAK=1 to enable).
// Color: smooth gradient red(bar 0) -> yellow(bar 39) -> blue(bar 79).
@module spectrum_display
    CONST {
        SHOW_PEAK = 0;    // 1=draw peak hold pill above bar, 0=disable
    }

    PORT {
        IN  [1]  clk;
        IN  [1]  rst_n;
        IN  [11] x_pos;
        IN  [10] y_pos;
        // RAM read interface
        OUT [7]  rd_bar;          // bar index to read (0..79)
        IN  [16] rd_amp;          // [7:0]=smooth_amp, [15:8]=peak_amp
        // Pixel output
        OUT [8]  red;
        OUT [8]  green;
        OUT [8]  blue;
    }

    WIRE {
        // Bar detection from x_pos
        bar_idx      [7];     // x_pos[10:4] = 0..79
        bar_x        [4];     // x_pos[3:0] = 0..15
        in_body      [1];     // bar_x < 10
        in_bar       [1];     // valid bar and in body

        // Vertical geometry
        above_base   [1];
        below_base   [1];
        pix_above    [10];    // pixels above baseline (0..399)
        pix_below    [10];    // pixels below baseline (0..99)

        // Segment computation (above baseline)
        seg_idx      [5];     // segment index (0..24)
        y_in_seg     [4];     // pixel within segment (0..15)
        in_pill      [1];     // y_in_seg < 12

        // Reflection segment (below baseline)
        ref_seg_idx  [5];
        ref_y_in_seg [4];
        in_ref_pill  [1];

        // Bar state from RAM
        smooth_amp   [8];
        peak_amp     [8];

        // Amplitude to segments
        bar_segs_raw [5];
        bar_segs     [5];
        peak_segs_raw [5];
        peak_segs    [5];
        ref_segs     [5];

        // Color gradient computation
        // half_pos: position within current color half (0-39)
        half_pos     [6];
        // ramp = half_pos * 13 >> 1 (approximates half_pos * 255/39)
        ramp_x13     [10];    // half_pos * 13, max 39*13=507
        ramp         [8];     // ramp_x13 >> 1, capped at 255

        // Color palette (computed from gradient)
        base_r       [8];
        base_g       [8];
        base_b       [8];

        // Pixel classification
        is_bar_seg   [1];
        is_peak_seg  [1];
        is_ref_seg   [1];

        // Combinational RGB
        next_r       [8];
        next_g       [8];
        next_b       [8];
    }

    REGISTER {
        red_r   [8] = 8'd0;
        green_r [8] = 8'd0;
        blue_r  [8] = 8'd0;
    }

    ASYNCHRONOUS {
        // ---- Bar column detection ----
        bar_idx <= x_pos[10:4];
        bar_x   <= x_pos[3:0];
        in_body <= (bar_x < 4'd10) ? 1'b1 : 1'b0;
        in_bar  <= (bar_idx < 7'd80 && in_body == 1'b1) ? 1'b1 : 1'b0;

        // Drive RAM read address
        rd_bar <= bar_idx;

        // Extract bar state from packed RAM value
        smooth_amp <= rd_amp[7:0];
        peak_amp   <= rd_amp[15:8];

        // ---- Amplitude to segments ----
        bar_segs_raw  <= smooth_amp[7:3];
        IF (bar_segs_raw > 5'd25) {
            bar_segs <= 5'd25;
        } ELSE {
            bar_segs <= bar_segs_raw;
        }
        peak_segs_raw <= peak_amp[7:3];
        IF (peak_segs_raw > 5'd25) {
            peak_segs <= 5'd25;
        } ELSE {
            peak_segs <= peak_segs_raw;
        }
        ref_segs <= { 2'b00, bar_segs[4:2] };

        // ---- Vertical geometry ----
        above_base <= (y_pos >= 10'd200 && y_pos < 10'd600) ? 1'b1 : 1'b0;
        below_base <= (y_pos >= 10'd600 && y_pos < 10'd700) ? 1'b1 : 1'b0;

        pix_above <= 10'd599 - y_pos;
        pix_below <= y_pos - 10'd600;

        seg_idx      <= pix_above[8:4];
        y_in_seg     <= pix_above[3:0];
        ref_seg_idx  <= pix_below[8:4];
        ref_y_in_seg <= pix_below[3:0];

        in_pill     <= (y_in_seg < 4'd12) ? 1'b1 : 1'b0;
        in_ref_pill <= (ref_y_in_seg < 4'd12) ? 1'b1 : 1'b0;

        // ---- Smooth color gradient: red(0) -> yellow(39) -> blue(79) ----
        // half_pos = position within current half (0-39)
        IF (bar_idx < 7'd40) {
            half_pos <= bar_idx[5:0];
        } ELSE {
            half_pos <= bar_idx[5:0] - 6'd40;
        }

        // ramp = half_pos * 13 >> 1 ≈ half_pos * 6.5 (maps 0-39 to 0-253)
        // half_pos * 13 = half_pos * 8 + half_pos * 4 + half_pos * 1
        ramp_x13 <= { 4'b0000, half_pos } + { 2'b00, half_pos, 2'b00 } +
                     { 1'b0, half_pos, 3'b000 };
        IF (ramp_x13[9:1] > 9'd255) {
            ramp <= 8'hFF;
        } ELSE {
            ramp <= ramp_x13[8:1];
        }

        // Bars 0-39: red -> yellow (R=FF, G ramps up, B=00)
        // Bars 40-79: yellow -> blue (R ramps down, G ramps down, B ramps up)
        IF (bar_idx < 7'd40) {
            base_r <= 8'hFF;
            base_g <= ramp;
            base_b <= 8'h00;
        } ELSE {
            base_r <= 8'hFF - ramp;
            base_g <= 8'hFF - ramp;
            base_b <= ramp;
        }

        // ---- Pixel classification ----
        is_bar_seg  <= (in_bar == 1'b1 && above_base == 1'b1 && in_pill == 1'b1 && seg_idx < bar_segs) ? 1'b1 : 1'b0;
        is_peak_seg <= (in_bar == 1'b1 && above_base == 1'b1 && in_pill == 1'b1 && seg_idx == peak_segs && peak_segs > bar_segs) ? lit(1, SHOW_PEAK) : 1'b0;
        is_ref_seg  <= (in_bar == 1'b1 && below_base == 1'b1 && in_ref_pill == 1'b1 && ref_seg_idx < ref_segs) ? 1'b1 : 1'b0;

        // ---- Combinational pixel output ----
        IF (is_bar_seg == 1'b1) {
            next_r <= base_r;
            next_g <= base_g;
            next_b <= base_b;
        } ELIF (is_peak_seg == 1'b1) {
            // Peak: brighter version (+0x40, saturate at 0xFF)
            IF (base_r > 8'hBF) {
                next_r <= 8'hFF;
            } ELSE {
                next_r <= base_r + 8'h40;
            }
            IF (base_g > 8'hBF) {
                next_g <= 8'hFF;
            } ELSE {
                next_g <= base_g + 8'h40;
            }
            IF (base_b > 8'hBF) {
                next_b <= 8'hFF;
            } ELSE {
                next_b <= base_b + 8'h40;
            }
        } ELIF (is_ref_seg == 1'b1) {
            // Reflection: 1/4 brightness
            next_r <= { 2'b00, base_r[7:2] };
            next_g <= { 2'b00, base_g[7:2] };
            next_b <= { 2'b00, base_b[7:2] };
        } ELSE {
            next_r <= 8'h00;
            next_g <= 8'h00;
            next_b <= 8'h00;
        }

        // Registered output
        red   <= red_r;
        green <= green_r;
        blue  <= blue_r;
    }

    SYNCHRONOUS(CLK=clk RESET=rst_n RESET_ACTIVE=Low) {
        red_r   <= next_r;
        green_r <= next_g;
        blue_r  <= next_b;
    }
@endmod
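The color gradient and the amplitude-to-segment mapping are easy to tabulate in software. This is a minimal Python sketch with hypothetical function names; it follows the ×13 ramp described in the listing's comments (`half_pos * 13 >> 1`) and the 25-segment cap, and it ignores the pixel-level pill geometry.

```python
def bar_color(bar_idx: int) -> tuple[int, int, int]:
    """Base (R, G, B) of a bar: red at bar 0, yellow near bar 39, blue at bar 79."""
    half_pos = bar_idx if bar_idx < 40 else bar_idx - 40
    ramp = min((half_pos * 13) >> 1, 255)  # about half_pos * 6.5, capped at 255
    if bar_idx < 40:
        return (255, ramp, 0)               # red -> yellow: green ramps up
    return (255 - ramp, 255 - ramp, ramp)   # yellow -> blue


def bar_segments(smooth_amp: int) -> tuple[int, int]:
    """Return (lit segments above the baseline, reflection segments below it)."""
    bar_segs = min(smooth_amp >> 3, 25)  # 8-bit amplitude -> 0..25 segments
    ref_segs = bar_segs >> 2             # reflection is a quarter as tall
    return bar_segs, ref_segs


if __name__ == "__main__":
    for idx in (0, 39, 40, 79):
        print(idx, bar_color(idx))
    print(bar_segments(255))
```

Printing the endpoints shows that the two halves meet at yellow between bars 39 and 40, and that a full-scale amplitude lights all 25 segments with a 6-segment reflection.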
jz
// Audio Level Display (VU Meter)
// Shows two horizontal bars on the DVI output:
//   1. Audio peak level (three color zones: green, yellow, red)
//   2. Ring buffer fill level (two color zones: blue, cyan)
// Background is near-black. Bars are centered on screen.
@module level_display
    PORT {
        IN  [1]  clk;
        IN  [1]  rst_n;
        IN  [11] x_pos;
        IN  [10] y_pos;

        // Audio level input
        IN  [16] audio_sample;   // signed 16-bit PCM
        IN  [1]  samp_valid;

        // Buffer fill level (0-127)
        IN  [7]  buf_fill;

        // RGB output
        OUT [8]  red;
        OUT [8]  green;
        OUT [8]  blue;
    }

    CONST {
        // Bar geometry
        BAR_LEFT   = 140;
        BAR_RIGHT  = 1140;

        // VU meter bar: y = 300..339
        VU_TOP     = 300;
        VU_BOT     = 339;

        // Buffer level bar: y = 380..409
        BUF_TOP    = 380;
        BUF_BOT    = 409;

        // Separator line: y = 350..355
        SEP_TOP    = 350;
        SEP_BOT    = 355;
    }

    WIRE {
        // Absolute value of audio sample
        abs_sample   [15];

        // Bar positions (pixel offset from BAR_LEFT)
        bar_x        [11];

        // VU bar width (0-1000 pixels from peak level)
        vu_width     [10];

        // Buffer bar width (0-1000 pixels)
        buf_width    [10];

        // Pixel classification
        in_vu_bar    [1];
        in_buf_bar   [1];
        in_vu_fill   [1];
        in_buf_fill  [1];
        in_sep       [1];

        // Color outputs
        r_out        [8];
        g_out        [8];
        b_out        [8];

        // Peak decay amount
        decay_amt    [15];
    }

    REGISTER {
        // Peak level with decay
        peak_level   [15] = 15'd0;

        // Decay counter: decay peak slowly
        decay_timer  [12] = 12'd0;

        // Registered buffer fill for display
        buf_fill_r   [7]  = 7'd0;
    }

    ASYNCHRONOUS {
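        // Absolute value of signed 16-bit sample.
        // -32768 has no 15-bit magnitude, so it clamps to 32767 instead of wrapping to 0.
        IF (audio_sample == 16'h8000) {
            abs_sample <= 15'h7FFF;
        } ELIF (audio_sample[15] == 1'b1) {
            abs_sample <= ~audio_sample[14:0] + 15'd1;
        } ELSE {
            abs_sample <= audio_sample[14:0];
        }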

        // Bar x position relative to left edge
        bar_x <= x_pos - lit(11, BAR_LEFT);

        // VU bar width: peak_level[14:5] gives 0-1023, cap at 1000
        IF (peak_level[14:5] > 10'd1000) {
            vu_width <= 10'd1000;
        } ELSE {
            vu_width <= peak_level[14:5];
        }

        // Buffer bar width: buf_fill_r is 0-127, scale to 0-1016
        IF ({ buf_fill_r, 3'b000 } > 10'd1000) {
            buf_width <= 10'd1000;
        } ELSE {
            buf_width <= { buf_fill_r, 3'b000 };
        }

        // Region detection
        in_vu_bar  <= (y_pos >= lit(10, VU_TOP) && y_pos <= lit(10, VU_BOT) &&
                       x_pos >= lit(11, BAR_LEFT) && x_pos <= lit(11, BAR_RIGHT)) ? 1'b1 : 1'b0;

        in_buf_bar <= (y_pos >= lit(10, BUF_TOP) && y_pos <= lit(10, BUF_BOT) &&
                       x_pos >= lit(11, BAR_LEFT) && x_pos <= lit(11, BAR_RIGHT)) ? 1'b1 : 1'b0;

        in_vu_fill  <= (in_vu_bar == 1'b1 && bar_x[10:0] < { 1'b0, vu_width }) ? 1'b1 : 1'b0;
        in_buf_fill <= (in_buf_bar == 1'b1 && bar_x[10:0] < { 1'b0, buf_width }) ? 1'b1 : 1'b0;

        in_sep <= (y_pos >= lit(10, SEP_TOP) && y_pos <= lit(10, SEP_BOT) &&
                   x_pos >= lit(11, BAR_LEFT) && x_pos <= lit(11, BAR_RIGHT)) ? 1'b1 : 1'b0;

        // Decay amount: peak/64 + 1
        decay_amt <= { 6'b000000, peak_level[14:6] } + 15'd1;

        // Color generation
        IF (in_vu_fill == 1'b1) {
            // VU meter: green -> yellow -> red
            IF (bar_x < 11'd600) {
                r_out <= 8'd40;
                g_out <= 8'd220;
                b_out <= 8'd40;
            } ELIF (bar_x < 11'd800) {
                r_out <= 8'd220;
                g_out <= 8'd220;
                b_out <= 8'd20;
            } ELSE {
                r_out <= 8'd240;
                g_out <= 8'd40;
                b_out <= 8'd20;
            }
        } ELIF (in_vu_bar == 1'b1) {
            // VU bar background (dark gray)
            r_out <= 8'd30;
            g_out <= 8'd30;
            b_out <= 8'd30;
        } ELIF (in_buf_fill == 1'b1) {
            // Buffer level: blue -> cyan
            IF (bar_x < 11'd500) {
                r_out <= 8'd20;
                g_out <= 8'd80;
                b_out <= 8'd200;
            } ELSE {
                r_out <= 8'd20;
                g_out <= 8'd180;
                b_out <= 8'd220;
            }
        } ELIF (in_buf_bar == 1'b1) {
            // Buffer bar background (dark gray)
            r_out <= 8'd30;
            g_out <= 8'd30;
            b_out <= 8'd30;
        } ELIF (in_sep == 1'b1) {
            // Separator line (dim white)
            r_out <= 8'd60;
            g_out <= 8'd60;
            b_out <= 8'd60;
        } ELSE {
            // Background (near-black)
            r_out <= 8'd8;
            g_out <= 8'd8;
            b_out <= 8'd12;
        }

        red   <= r_out;
        green <= g_out;
        blue  <= b_out;
    }

    SYNCHRONOUS(CLK=clk RESET=rst_n RESET_ACTIVE=Low) {
        // Register buffer fill level
        buf_fill_r <= buf_fill;

        // Peak level tracking and decay
        IF (samp_valid == 1'b1 && abs_sample > peak_level) {
            // New peak detected
            peak_level  <= abs_sample;
            decay_timer <= 12'd0;
        } ELIF (decay_timer == 12'd4095) {
            // Decay timer expired: reduce peak
            decay_timer <= 12'd0;
            IF (peak_level > decay_amt) {
                peak_level <= peak_level - decay_amt;
            } ELSE {
                peak_level <= 15'd0;
            }
        } ELSE {
            decay_timer <= decay_timer + 12'd1;
        }
    }
@endmod
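
The peak-hold decay above is easier to reason about with a small behavioral model. The following Python sketch mirrors the `decay_amt` and `peak_level` updates from the listing; the 37.125 MHz clock is an assumption about which clock drives this module, and the helper names are hypothetical.

```python
# Behavioral model of the peak-hold decay (a sketch, not the HDL itself).
DECAY_PERIOD_CYCLES = 4096      # decay_timer counts 0..4095 between decay events
CLK_HZ = 37_125_000             # assumption: module runs on the pixel clock


def decay_step(peak: int) -> int:
    """One decay event: subtract peak/64 + 1, clamping at zero."""
    amt = (peak >> 6) + 1       # same as { 6'b0, peak_level[14:6] } + 1
    return peak - amt if peak > amt else 0


def steps_to_silence(peak: int) -> int:
    """Count decay events needed for a held peak to reach zero."""
    steps = 0
    while peak > 0:
        peak = decay_step(peak)
        steps += 1
    return steps


if __name__ == "__main__":
    n = steps_to_silence(32767)
    ms = n * DECAY_PERIOD_CYCLES / CLK_HZ * 1e3
    print(f"{n} decay events, about {ms:.1f} ms from full scale to zero")
```

Because the decrement is proportional to the current peak, the bar falls quickly from loud peaks and slows near the bottom; the `+ 1` term ensures it always reaches zero.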
jz
// Simple UART Receiver — 8N1, no FIFO
// Pulses valid for 1 cycle when a byte is received
@module uart_rx
    CONST {
        CLK_HZ = 27000000;
        BAUD_RATE = 115200;
        BAUD_DIV = (CLK_HZ / BAUD_RATE) - 1;
        HALF_BAUD = BAUD_DIV / 2;
    }

    PORT {
        IN  [1] clk;
        IN  [1] rst_n;
        IN  [1] rx;
        OUT [8] data;
        OUT [1] valid;
    }

    REGISTER {
        // Metastability synchronizer
        rx_sync1    [1] = 1'b1;
        rx_sync2    [1] = 1'b1;

        // State machine (0=IDLE, 1=START, 2=DATA, 3=STOP)
        state       [2] = 2'd0;
        baud_cnt    [16] = 16'd0;
        bit_cnt     [3] = 3'd0;
        shift       [8] = 8'h00;

        // Output
        data_out    [8] = 8'h00;
        valid_out   [1] = 1'b0;
    }

    ASYNCHRONOUS {
        data  <= data_out;
        valid <= valid_out;
    }

    SYNCHRONOUS(CLK=clk RESET=rst_n RESET_ACTIVE=Low) {
        // 2-stage synchronizer for async RX input
        rx_sync1 <= rx;
        rx_sync2 <= rx_sync1;

        SELECT (state) {
            CASE (2'd0) {
                // IDLE: wait for RX to go low (possible start bit)
                valid_out <= 1'b0;
                IF (rx_sync2 == 1'b0) {
                    baud_cnt <= lit(16, HALF_BAUD);
                    state <= 2'd1;
                }
            }

            CASE (2'd1) {
                // START: verify start bit at mid-point
                valid_out <= 1'b0;
                IF (baud_cnt == 16'd0) {
                    IF (rx_sync2 == 1'b0) {
                        baud_cnt <= lit(16, BAUD_DIV);
                        bit_cnt <= 3'd0;
                        shift <= 8'h00;
                        state <= 2'd2;
                    } ELSE {
                        // False start
                        state <= 2'd0;
                    }
                } ELSE {
                    baud_cnt <= baud_cnt - 16'd1;
                }
            }

            CASE (2'd2) {
                // DATA: sample 8 bits at mid-bit
                valid_out <= 1'b0;
                IF (baud_cnt == 16'd0) {
                    shift <= { rx_sync2, shift[7:1] };
                    IF (bit_cnt == 3'd7) {
                        baud_cnt <= lit(16, BAUD_DIV);
                        state <= 2'd3;
                    } ELSE {
                        bit_cnt <= bit_cnt + 3'd1;
                        baud_cnt <= lit(16, BAUD_DIV);
                    }
                } ELSE {
                    baud_cnt <= baud_cnt - 16'd1;
                }
            }

            CASE (2'd3) {
                // STOP: wait one bit period, then output the byte (stop level is not checked)
                IF (baud_cnt == 16'd0) {
                    data_out <= shift;
                    valid_out <= 1'b1;
                    state <= 2'd0;
                } ELSE {
                    valid_out <= 1'b0;
                    baud_cnt <= baud_cnt - 16'd1;
                }
            }

            DEFAULT {
                valid_out <= 1'b0;
                state <= 2'd0;
            }
        }
    }
@endmod
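
To see why the receiver reloads `HALF_BAUD` first and `BAUD_DIV` afterwards, the sampling schedule can be modeled in Python. This is a simplified sketch: it ignores the two-cycle synchronizer delay, and `waveform` and `decode` are hypothetical helpers, not part of the design.

```python
# Mid-bit sampling model for the 8N1 receiver (illustrative sketch).
CLK_HZ = 27_000_000
BAUD_RATE = 115_200
BAUD_DIV = CLK_HZ // BAUD_RATE - 1   # 233: counter reload value
HALF_BAUD = BAUD_DIV // 2            # 116: reload for the half-bit wait
CYCLES_PER_BIT = BAUD_DIV + 1        # counter runs BAUD_DIV down to 0 inclusive


def waveform(byte, cycles_per_bit=CYCLES_PER_BIT, idle=50):
    """Line level per clock cycle: idle high, start, 8 data bits LSB first, stop."""
    bits = [0] + [(byte >> i) & 1 for i in range(8)] + [1]
    return [1] * idle + [b for b in bits for _ in range(cycles_per_bit)]


def decode(wave):
    """Sample half a bit after the line goes low, then once per bit period."""
    t = wave.index(0) + HALF_BAUD + 1     # roughly the middle of the start bit
    if wave[t] != 0:
        return None                       # false start: line went back high
    value = 0
    for i in range(8):
        t += CYCLES_PER_BIT
        value |= wave[t] << i             # LSB arrives first
    return value


if __name__ == "__main__":
    actual_baud = CLK_HZ / CYCLES_PER_BIT
    print(f"actual baud {actual_baud:.1f}, error {100 * (actual_baud / BAUD_RATE - 1):.2f}%")
    print(hex(decode(waveform(0x55))))
```

Sampling in the middle of each bit leaves close to half a bit period of margin on either side, which is why the model still decodes correctly when the sender's bit period is a few percent off.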
jz
// Simple UART Transmitter — 8N1, no FIFO
// Asserts ready when idle. When valid is pulsed with data, transmits one byte.
@module uart_tx
    CONST {
        CLK_HZ = 27000000;
        BAUD_RATE = 115200;
        BAUD_DIV = (CLK_HZ / BAUD_RATE) - 1;
    }

    PORT {
        IN  [1] clk;
        IN  [1] rst_n;
        IN  [8] data;
        IN  [1] valid;
        OUT [1] ready;
        OUT [1] tx;
    }

    REGISTER {
        // State machine (0=IDLE, 1=START, 2=DATA, 3=STOP)
        state     [2] = 2'd0;
        baud_cnt  [16] = 16'd0;
        bit_cnt   [3] = 3'd0;
        shift     [8] = 8'hFF;

        // Outputs
        tx_out    [1] = 1'b1;
        ready_out [1] = 1'b1;
    }

    ASYNCHRONOUS {
        tx    <= tx_out;
        ready <= ready_out;
    }

    SYNCHRONOUS(CLK=clk RESET=rst_n RESET_ACTIVE=Low) {
        SELECT (state) {
            CASE (2'd0) {
                // IDLE: line high, ready for data
                tx_out <= 1'b1;
                IF (valid == 1'b1) {
                    shift     <= data;
                    baud_cnt  <= lit(16, BAUD_DIV);
                    state     <= 2'd1;
                    ready_out <= 1'b0;
                } ELSE {
                    ready_out <= 1'b1;
                }
            }

            CASE (2'd1) {
                // START bit: hold TX low for one baud period
                tx_out    <= 1'b0;
                ready_out <= 1'b0;
                IF (baud_cnt == 16'd0) {
                    baud_cnt <= lit(16, BAUD_DIV);
                    bit_cnt  <= 3'd0;
                    state    <= 2'd2;
                } ELSE {
                    baud_cnt <= baud_cnt - 16'd1;
                }
            }

            CASE (2'd2) {
                // DATA: shift out 8 bits LSB first
                tx_out    <= shift[0];
                ready_out <= 1'b0;
                IF (baud_cnt == 16'd0) {
                    shift <= { 1'b1, shift[7:1] };
                    IF (bit_cnt == 3'd7) {
                        baud_cnt <= lit(16, BAUD_DIV);
                        state    <= 2'd3;
                    } ELSE {
                        bit_cnt  <= bit_cnt + 3'd1;
                        baud_cnt <= lit(16, BAUD_DIV);
                    }
                } ELSE {
                    baud_cnt <= baud_cnt - 16'd1;
                }
            }

            CASE (2'd3) {
                // STOP bit: hold TX high for one baud period
                tx_out <= 1'b1;
                IF (baud_cnt == 16'd0) {
                    ready_out <= 1'b1;
                    state     <= 2'd0;
                } ELSE {
                    ready_out <= 1'b0;
                    baud_cnt  <= baud_cnt - 16'd1;
                }
            }

            DEFAULT {
                tx_out    <= 1'b1;
                ready_out <= 1'b1;
                state     <= 2'd0;
            }
        }
    }
@endmod
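
The transmitter's framing and timing can be checked the same way. The sketch below reproduces the bit order and the ones-fill of the shift register from the listing; the function names are hypothetical and the defaults are the constants shown above.

```python
# Frame model for the 8N1 transmitter (illustrative sketch).
def tx_levels(byte, baud_div=233):
    """Line level for each clock cycle of one frame."""
    cycles = baud_div + 1
    shift = byte & 0xFF
    out = [0] * cycles                    # START bit: line low
    for _ in range(8):
        out += [shift & 1] * cycles       # DATA: current LSB drives the line
        shift = (shift >> 1) | 0x80       # like { 1'b1, shift[7:1] }
    out += [1] * cycles                   # STOP bit: line high
    return out


def frame_time_us(clk_hz=27_000_000, baud_div=233):
    """Duration of one 10-bit frame in microseconds."""
    return 10 * (baud_div + 1) / clk_hz * 1e6


if __name__ == "__main__":
    levels = tx_levels(0xA5)
    mid = [levels[i * 234 + 117] for i in range(10)]
    print(mid, f"{frame_time_us():.2f} us per byte")
```

With these default constants a byte occupies 2340 clock cycles, so back-to-back frames carry roughly 11.5 kB/s.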
jz
@module por
    PORT {
        IN  [1] clk;
        IN  [1] done;
        OUT [1] por_n;
    }

    CONST {
        POR_CYCLES   = 1_048_576;  // ~28ms at 37.125MHz — wait for PLL lock
        POR_CNT_BITS = clog2(POR_CYCLES);
        POR_MAX      = POR_CYCLES - 1;
    }

    REGISTER {
        por_reg [1] = 1'b0;
        cnt     [POR_CNT_BITS] = POR_CNT_BITS'b0;
    }

    ASYNCHRONOUS {
        por_n <= por_reg;
    }

    SYNCHRONOUS(CLK=clk) {
        IF (done == 1'b0) {
            por_reg <= 1'b0;
            cnt <= POR_CNT_BITS'b0;
        } ELIF (cnt == lit(POR_CNT_BITS, POR_MAX)) {
            por_reg <= 1'b1;
            cnt <= cnt;
        } ELSE {
            por_reg <= 1'b0;
            cnt <= cnt + POR_CNT_BITS'b1;
        }
    }
@endmod
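
The counter width and the delay quoted in the `POR_CYCLES` comment can be cross-checked with a few lines of Python. This sketch assumes `clog2` means the ceiling of log2, which is the usual definition but is not confirmed by this listing.

```python
# Cross-check of the power-on-reset counter sizing (illustrative sketch).
POR_CYCLES = 1_048_576


def clog2(n: int) -> int:
    """Bits needed to count n states: ceil(log2(n)) (assumed meaning of clog2)."""
    return (n - 1).bit_length()


def por_delay_ms(clk_hz: int) -> float:
    """Time from `done` going high to `por_n` releasing, in milliseconds."""
    return POR_CYCLES / clk_hz * 1e3


if __name__ == "__main__":
    print(clog2(POR_CYCLES), "counter bits")
    print(f"{por_delay_ms(37_125_000):.2f} ms at 37.125 MHz")
```

The delay scales with whichever clock feeds `clk`, so the figure in the comment holds only when the module runs on the 37.125 MHz pixel clock.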

JZ-HDL Language Features

BRAM ring buffer with rate conversion. MEM(TYPE=BLOCK) explicitly places the 8192-sample audio buffer in block RAM. The 48 kHz sample tick is derived from the 37.125 MHz clock with a Bresenham accumulator: 37.125 MHz / 48 kHz = 773.4375 = 12375/16, so adding 16 per clock against a threshold of 12375 averages exactly 48 kHz. This needs only integer arithmetic with compile-time constants; no floating-point IP or vendor clock-enable primitives are required.
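
The rate conversion can be checked with a short simulation. This Python sketch uses the step and threshold quoted for audio_buffer (16 and 12375); the exact comparison and update order inside the module are assumptions.

```python
# Bresenham-style sample tick generator (behavioral sketch).
STEP = 16
THRESHOLD = 12_375
CLK_HZ = 37_125_000


def sample_ticks(n_cycles):
    """Yield the clock-cycle indices at which a sample tick fires."""
    acc = 0
    for cycle in range(n_cycles):
        acc += STEP
        if acc >= THRESHOLD:      # assumed form of the overflow test
            acc -= THRESHOLD      # keep the remainder so no error accumulates
            yield cycle


if __name__ == "__main__":
    ticks = list(sample_ticks(THRESHOLD))          # one full accumulator period
    gaps = {b - a for a, b in zip(ticks, ticks[1:])}
    print(len(ticks), "ticks per", THRESHOLD, "cycles; gaps:", sorted(gaps))
    print("average rate:", CLK_HZ * STEP // THRESHOLD, "Hz")
```

Individual tick spacings alternate between 773 and 774 clock cycles, but because the remainder is carried forward the long-term average is exact.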

Fixed-point DSP in combinational logic. The Goertzel DFT coefficients and accumulator use Q1.14 fixed-point arithmetic expressed as standard integer operations with explicit bit widths. The compiler verifies every concatenation and slice produces exactly the declared width, preventing the silent truncation bugs that plague hand-written fixed-point Verilog.
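
As an illustration of Goertzel with a Q1.14 coefficient, here is an integer-only Python sketch. It is a simplification: Python integers are unbounded, so the accumulator widths of the real design are not modeled, and the block length and bin numbers in the example are arbitrary assumptions.

```python
# Integer Goertzel with a Q1.14 coefficient (illustrative sketch).
import math

FRAC_BITS = 14                       # Q1.14: 14 fractional bits


def goertzel_q14(samples, k, n):
    """Return the squared magnitude of bin k of an n-point DFT."""
    # coeff = 2*cos(2*pi*k/n) scaled to Q1.14; valid for 0 < k < n/2
    coeff = round(2.0 * math.cos(2.0 * math.pi * k / n) * (1 << FRAC_BITS))
    s1 = s2 = 0
    for x in samples[:n]:
        # s0 = x + coeff*s1 - s2, with the product scaled back from Q1.14
        s0 = x + ((coeff * s1) >> FRAC_BITS) - s2
        s2, s1 = s1, s0
    # |X[k]|^2 = s1^2 + s2^2 - coeff*s1*s2
    return s1 * s1 + s2 * s2 - ((coeff * s1 * s2) >> FRAC_BITS)


if __name__ == "__main__":
    n = 256
    tone = [round(8000 * math.sin(2 * math.pi * 16 * i / n)) for i in range(n)]
    print("on-bin power :", goertzel_q14(tone, 16, n))
    print("off-bin power:", goertzel_q14(tone, 40, n))
```

Each bin needs only one multiply per sample in the loop, which is what makes the algorithm attractive for a bank of 80 bars.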

Flow-control protocol. The uart_audio_rx module implements a packet protocol with parity verification and fill-level ACK, all within a single SYNCHRONOUS block's state machine. The compiler's single-driver rule guarantees the UART TX is driven from exactly one place — either the ACK response or the idle state.
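
From the host's side, the protocol might be driven along the following lines. This Python sketch makes several assumptions that the text does not settle: LEN counts samples rather than bytes, samples are sent little-endian, the parity byte is the XOR of the data bytes only, and the `high_water` throttle mark is a hypothetical value.

```python
# Host-side packet framing for [LEN][DATA x LEN][PARITY] (sketch under assumptions).
import struct
from functools import reduce

MAX_SAMPLES = 64          # packets carry up to 64 samples
FILL_MAX = 127            # ACK byte reports fill level 0..127


def build_packet(samples):
    """Frame signed 16-bit samples as one packet."""
    if not 1 <= len(samples) <= MAX_SAMPLES:
        raise ValueError("packet must carry 1..64 samples")
    data = struct.pack(f"<{len(samples)}h", *samples)   # assumption: little-endian
    parity = reduce(lambda a, b: a ^ b, data, 0)        # assumption: XOR of data bytes
    return bytes([len(samples)]) + data + bytes([parity])


def should_send(ack_fill, high_water=96):
    """Throttle: keep sending only while the reported fill is below the mark."""
    return ack_fill < high_water


if __name__ == "__main__":
    print(build_packet([1, -1]).hex())
    print(should_send(10), should_send(120))
```

A host loop would write one packet, read the single ACK byte, and pause whenever `should_send` returns false.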

Data island integration. Audio sample injection into DVI blanking intervals requires precise cycle-by-cycle control of TMDS vs. TERC4 encoding modes. The output mux in the top-level SYNCHRONOUS block selects between four encoding modes per cycle, and the compiler verifies that every output path is fully covered with no undriven cycles.