Autonomous Verification Agent
Upload a Verilog design + natural language spec. The agent generates a cocotb testbench, simulates with Icarus Verilog, and self-corrects until all tests pass. Supports combinational, sequential, protocol, and power-aware designs.
Agent Pipeline
Input
Verilog DUT + natural language specification
Generate
LLM creates a cocotb testbench; 8 auto-fixes are applied for cocotb 2.0 compatibility
Simulate
Icarus Verilog compiles and runs the design; the agent parses the structured results
Self-Correct
9-category failure analysis, max 3 corrections + 10 reboots
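As an illustration of the Simulate step's result parsing, here is a minimal sketch that reads a JUnit-style XML report of the kind cocotb writes (results.xml). Whether the agent parses this file or some other structured output is an assumption, and summarize_results is a hypothetical helper name, not part of the project.

```python
import xml.etree.ElementTree as ET


def summarize_results(xml_text: str) -> dict[str, str]:
    """Map each test case name to 'pass', 'fail', or 'skip'.

    Assumes a JUnit-style report where a failing test case carries a
    <failure> or <error> child and a skipped one carries <skipped>.
    """
    root = ET.fromstring(xml_text)
    summary: dict[str, str] = {}
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            status = "fail"
        elif case.find("skipped") is not None:
            status = "skip"
        else:
            status = "pass"
        summary[case.get("name", "<unnamed>")] = status
    return summary
```

A per-test summary like this is what a correction step could feed back to the LLM alongside the raw simulator log.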
# Run a single design
$ python3 -m src run --design-dir golden/07_dvfs_controller
# Benchmark all designs
$ python3 -m src benchmark
# Use Anthropic API backend
$ python3 -m src benchmark --backend anthropic_api
Benchmark Results (Latest Runs)
[Charts: Iterations per Design; Design Category Distribution]
// Related Open-Source Tools
| Tool | Approach | Reported Pass Rate | Self-Correction | Mutation Testing |
|---|---|---|---|---|
| Project Ava | LLM agent + cocotb | -- | Yes (3+10) | Yes (7 categories) |
| ConfiBench | Confidence masking | 72% | Limited | No |
| CorrectBench | LLM + correction | 70% | Yes (3+10) | No |
| AutoBench | LLM generation | 52% | No | No |
// How It Works
cocotb 2.0 Compatibility
LLMs tend to generate cocotb 1.x code. Before simulation, the generator auto-fixes 8 known breaking changes (the units argument renamed to unit, fork replaced by start_soon, the deprecated .value.integer accessor, etc.).
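The three fixes named above can be sketched as source-to-source rewrites. This is a rough illustration, not the project's generator: apply_cocotb2_fixes and its regexes are hypothetical, cover only 3 of the 8 fixes, and are deliberately naive (they operate on text, not on a parsed syntax tree).

```python
import re

# Each rule is (pattern, replacement); all of them are illustrative assumptions.
_COCOTB2_RULES = [
    # cocotb.fork(...) was removed; cocotb.start_soon(...) is the replacement.
    (re.compile(r"\bcocotb\.fork\("), "cocotb.start_soon("),
    # Keyword argument units="ns" became unit="ns" (string literals only).
    (re.compile(r"\bunits(\s*=\s*[\"'])"), r"unit\1"),
    # handle.value.integer is deprecated; int(handle.value) works instead.
    (re.compile(r"([A-Za-z_][\w.]*\.value)\.integer\b"), r"int(\1)"),
]


def apply_cocotb2_fixes(testbench_src: str) -> str:
    """Rewrite common cocotb 1.x idioms into cocotb 2.0 equivalents."""
    for pattern, replacement in _COCOTB2_RULES:
        testbench_src = pattern.sub(replacement, testbench_src)
    return testbench_src
```

Running the fixes before simulation means a predictable API mismatch does not consume one of the limited correction attempts.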
Self-Correction
When tests fail, the agent feeds the errors back to the LLM and retries, with up to 3 corrections and 10 full reboots.
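One way to picture the retry budget is as two nested loops. The sketch below assumes that a reboot regenerates the testbench from scratch and that each generation gets its own budget of corrections; the source does not spell out this nesting. The generate, simulate, and correct callables and the SimResult type are hypothetical stand-ins for the LLM and simulator steps.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SimResult:
    passed: bool
    log: str = ""


def run_agent(
    generate: Callable[[], str],
    simulate: Callable[[str], SimResult],
    correct: Callable[[str, str], str],
    max_corrections: int = 3,
    max_reboots: int = 10,
) -> Optional[str]:
    """Return a passing testbench, or None once the budget is exhausted."""
    for _reboot in range(max_reboots + 1):  # first attempt plus reboots
        testbench = generate()  # fresh testbench, previous one discarded
        for attempt in range(max_corrections + 1):
            result = simulate(testbench)
            if result.passed:
                return testbench
            if attempt < max_corrections:
                # Feed the failure log back and ask for a targeted fix.
                testbench = correct(testbench, result.log)
    return None
```

Under these assumptions the worst case is (10 + 1) x (3 + 1) = 44 simulations per design.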
Mutation Testing
Built-in mutation engine injects bugs into Verilog source (relational, logical, arithmetic, bit flips, stuck-at, etc.) to measure how thorough the generated tests are.
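To make the idea concrete, here is a minimal sketch of a single mutation operator: it flips one == to != per mutant, in the spirit of the relational category. It is not the project's engine; equality_mutants and mutation_score are hypothetical names, and the regex works on raw text without parsing Verilog.

```python
import re

# Match '==' but not '===' or '!==' (Verilog case equality / inequality).
_EQ = re.compile(r"(?<![=!])==(?!=)")


def equality_mutants(source: str) -> list[str]:
    """Return one mutated copy of the source per '==' occurrence."""
    return [
        source[: m.start()] + "!=" + source[m.end():]
        for m in _EQ.finditer(source)
    ]


def mutation_score(killed: int, total: int) -> float:
    """Fraction of mutants that caused at least one test to fail."""
    return killed / total if total else 0.0
```

A mutant counts as killed when the generated testbench fails against it; surviving mutants point to behavior the tests never check.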
Failure Taxonomy
Every failure is classified: SYNTAX, COCOTB_API, SIGNAL_ACCESS, TIMING, LOGIC, COMPILE, IMPORT, TIMEOUT, UNKNOWN.
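The nine categories map naturally onto an enum plus a rule table. The category names below come from the list above; the matching rules are purely illustrative assumptions that cover only four categories, and classify is a hypothetical helper rather than the project's analyzer.

```python
import re
from enum import Enum


class FailureCategory(Enum):
    SYNTAX = "SYNTAX"
    COCOTB_API = "COCOTB_API"
    SIGNAL_ACCESS = "SIGNAL_ACCESS"
    TIMING = "TIMING"
    LOGIC = "LOGIC"
    COMPILE = "COMPILE"
    IMPORT = "IMPORT"
    TIMEOUT = "TIMEOUT"
    UNKNOWN = "UNKNOWN"


# Ordered rules: the first matching pattern wins (patterns are assumptions).
_RULES = [
    (FailureCategory.IMPORT, re.compile(r"ModuleNotFoundError|ImportError")),
    (FailureCategory.SYNTAX, re.compile(r"SyntaxError|IndentationError")),
    (FailureCategory.TIMEOUT, re.compile(r"TimeoutError|timed out")),
    (FailureCategory.LOGIC, re.compile(r"AssertionError")),
]


def classify(log: str) -> FailureCategory:
    """Pick a category for a failure log, falling back to UNKNOWN."""
    for category, pattern in _RULES:
        if pattern.search(log):
            return category
    return FailureCategory.UNKNOWN
```

Classifying before correcting lets the prompt differ by category, for example asking for an API fix rather than a change to the expected values.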
// Design Categories Tested
Protocol Designs
Communication protocols with handshaking, timing constraints, and multi-cycle transactions.
Power-Aware Designs
DVFS controllers, clock gating cells, power state machines, frequency dividers, and PWM generators.
Crypto / Complex
SHA-256 core from secworks (ASIC-proven, BSD-2-Clause), tested with NIST FIPS 180-4 vectors.