Posts by Collection

portfolio

publications

AnyPrefer: An Automatic Framework for Preference Data Synthesis

Published in ICLR, 2025

An automatic framework for preference data synthesis that improves VLA models through DPO and iterative training.

Recommended citation: Yiyang Zhou, ..., Zijian Zhang, ..., Huaxiu Yao. "AnyPrefer: An Automatic Framework for Preference Data Synthesis." ICLR, 2025.

GRAPE: Generalizing Robot Policy via Preference Alignment

Published in ICRA 2026; ICLR Workshop 2025

A trajectory-wise DPO method for VLA model post-training that improves the safety, efficiency, and success rate of robot policies.

Recommended citation: Zijian Zhang, Kaiyuan Zheng, Zhaorun Chen, Joel Jang, Yi Li, Chaoqi Wang, Mingyu Ding, Dieter Fox, Huaxiu Yao. "GRAPE: Generalizing Robot Policy via Preference Alignment." ICRA 2026; ICLR Workshop, 2025.

InfantAgent-Next: A Multimodal Generalist Agent for Automated Computer Interaction

Published in NeurIPS, 2025

A multimodal generalist agent with detailed modularization of agent workflows, tool selection, and tool execution via a unified dialogue context.

Recommended citation: Bin Lei, Weitai Kang, Zijian Zhang, Winson Chen, Xi Xie, Shan Zuo, Mimi Xie, Ali Payani, Mingyi Hong, Yan Yan, Caiwen Ding. "InfantAgent-Next: A Multimodal Generalist Agent for Automated Computer Interaction." NeurIPS, 2025.

StitchCUDA: An Automated Multi-Agent End-to-End GPU Programming Framework with Rubric-based Agentic Reinforcement Learning

Under review at ICML 2026

A multi-agent workflow for end-to-end CUDA code generation and optimization, enhanced by rubric-based agentic RL. It achieves SOTA performance, with a 32B RL-based model outperforming GPT-5.2.

Recommended citation: Shiyang Li*, Zijian Zhang* (Co-First Author), Winson Chen, Yuebo Luo, Mingyi Hong, Caiwen Ding. "StitchCUDA: An Automated Multi-Agent End-to-End GPU Programming Framework with Rubric-based Agentic Reinforcement Learning." Under review at ICML, 2026.

CudaForge: An Agent Framework with Hardware Feedback for CUDA Kernel Optimization

Under review at ICML 2026

A simple, effective, and low-cost multi-agent workflow for CUDA kernel generation and optimization. It achieves SOTA on KernelBench Levels 1-3 at an API cost of only $0.3.

Recommended citation: Zijian Zhang, Rong Wang, Shiyang Li, Yuebo Luo, Mingyi Hong, Caiwen Ding. "CudaForge: An Agent Framework with Hardware Feedback for CUDA Kernel Optimization." Under review at ICML, 2026.

talks

teaching