CudaForge: An Agent Framework with Hardware Feedback for CUDA Kernel Optimization
Under review at ICML 2026
A simple, effective, and low-cost multi-agent workflow for CUDA kernel generation and optimization. Achieves state-of-the-art (SOTA) results on KernelBench Levels 1-3 at an API cost of only $0.3.
Recommended citation: Zijian Zhang, Rong Wang, Shiyang Li, Yuebo Luo, Mingyi Hong, Caiwen Ding. "CudaForge: An Agent Framework with Hardware Feedback for CUDA Kernel Optimization." Under review at ICML, 2026.
