Sitemap
A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.
Pages
Posts
Portfolio
Publications
AnyPrefer: An Automatic Framework for Preference Data Synthesis
Published in ICLR, 2025
An automatic framework for preference data synthesis that improves VLA models through DPO and iterative training.
Recommended citation: Yiyang Zhou, ..., Zijian Zhang, ..., Huaxiu Yao. "AnyPrefer: An Automatic Framework for Preference Data Synthesis." ICLR, 2025.
GRAPE: Generalizing Robot Policy via Preference Alignment
Published in ICRA 2026; ICLR Workshop 2025
A trajectory-wise DPO method for VLA model post-training that enhances the safety, efficiency, and success rate of robot policies.
Recommended citation: Zijian Zhang, Kaiyuan Zheng, Zhaorun Chen, Joel Jang, Yi Li, Chaoqi Wang, Mingyu Ding, Dieter Fox, Huaxiu Yao. "GRAPE: Generalizing Robot Policy via Preference Alignment." ICRA 2026; ICLR Workshop, 2025.
InfantAgent-Next: A Multimodal Generalist Agent for Automated Computer Interaction
Published in NeurIPS, 2025
A multimodal generalist agent with detailed modularization of agent workflows, tool selection, and tool execution via a unified dialogue context.
Recommended citation: Bin Lei, Weitai Kang, Zijian Zhang, Winson Chen, Xi Xie, Shan Zuo, Mimi Xie, Ali Payani, Mingyi Hong, Yan Yan, Caiwen Ding. "InfantAgent-Next: A Multimodal Generalist Agent for Automated Computer Interaction." NeurIPS, 2025.
StitchCUDA: An Automated Multi-Agent End-to-End GPU Programming Framework with Rubric-based Agentic Reinforcement Learning
Under review at ICML 2026
A multi-agent workflow for end-to-end CUDA code generation and optimization, enhanced by rubric-based agentic RL. It achieves SOTA performance, with a 32B RL-based model outperforming GPT-5.2.
Recommended citation: Shiyang Li*, Zijian Zhang* (Co-First Author), Winson Chen, Yuebo Luo, Mingyi Hong, Caiwen Ding. "StitchCUDA: An Automated Multi-Agent End-to-End GPU Programming Framework with Rubric-based Agentic Reinforcement Learning." Under review at ICML, 2026.
CudaForge: An Agent Framework with Hardware Feedback for CUDA Kernel Optimization
Under review at ICML 2026
A simple, effective, and low-cost multi-agent workflow for CUDA kernel generation and optimization. It achieves SOTA on KernelBench Levels 1-3 with only $0.3 API cost.
Recommended citation: Zijian Zhang, Rong Wang, Shiyang Li, Yuebo Luo, Mingyi Hong, Caiwen Ding. "CudaForge: An Agent Framework with Hardware Feedback for CUDA Kernel Optimization." Under review at ICML, 2026.
