Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.

Posts

portfolio

publications

AnyPrefer: An Automatic Framework for Preference Data Synthesis

Published in ICLR, 2025

An automatic framework for preference data synthesis that improves VLA models through DPO and iterative training.

Recommended citation: Yiyang Zhou, ..., Zijian Zhang, ..., Huaxiu Yao. "AnyPrefer: An Automatic Framework for Preference Data Synthesis." ICLR, 2025.

GRAPE: Generalizing Robot Policy via Preference Alignment

Published in ICRA 2026; ICLR Workshop 2025

A trajectory-wise DPO method for VLA model post-training that enhances the safety, efficiency, and success rate of robot policies.

Recommended citation: Zijian Zhang, Kaiyuan Zheng, Zhaorun Chen, Joel Jang, Yi Li, Chaoqi Wang, Mingyu Ding, Dieter Fox, Huaxiu Yao. "GRAPE: Generalizing Robot Policy via Preference Alignment." ICRA 2026; ICLR Workshop, 2025.

InfantAgent-Next: A Multimodal Generalist Agent for Automated Computer Interaction

Published in NeurIPS, 2025

A multimodal generalist agent with detailed modularization of agent workflows, tool selection, and tool execution via a unified dialogue context.

Recommended citation: Bin Lei, Weitai Kang, Zijian Zhang, Winson Chen, Xi Xie, Shan Zuo, Mimi Xie, Ali Payani, Mingyi Hong, Yan Yan, Caiwen Ding. "InfantAgent-Next: A Multimodal Generalist Agent for Automated Computer Interaction." NeurIPS, 2025.

StitchCUDA: An Automated Multi-Agent End-to-End GPU Programming Framework with Rubric-based Agentic Reinforcement Learning

Under review at ICML 2026

A multi-agent workflow for end-to-end CUDA code generation and optimization, enhanced by rubric-based agentic RL. It achieves SOTA performance, with a 32B RL-based model outperforming GPT-5.2.

Recommended citation: Shiyang Li*, Zijian Zhang* (Co-First Author), Winson Chen, Yuebo Luo, Mingyi Hong, Caiwen Ding. "StitchCUDA: An Automated Multi-Agent End-to-End GPU Programming Framework with Rubric-based Agentic Reinforcement Learning." Under review at ICML, 2026.

CudaForge: An Agent Framework with Hardware Feedback for CUDA Kernel Optimization

Under review at ICML 2026

A simple, effective, and low-cost multi-agent workflow for CUDA kernel generation and optimization. It achieves SOTA on KernelBench Levels 1-3 at an API cost of only $0.3.

Recommended citation: Zijian Zhang, Rong Wang, Shiyang Li, Yuebo Luo, Mingyi Hong, Caiwen Ding. "CudaForge: An Agent Framework with Hardware Feedback for CUDA Kernel Optimization." Under review at ICML, 2026.

Zijian Zhang

Pages

Page Not Found

About

Archive Layout with Content

Posts by Category

Posts by Collection

CV

Markdown

Page not in menu

Page Archive

Portfolio

Publications

Sitemap

Posts by Tags

Talk map

Invited Talks

Teaching

Terms and Privacy Policy

Blog posts

Jupyter notebook markdown generator

Posts

portfolio

publications

AnyPrefer: An Automatic Framework for Preference Data Synthesis

GRAPE: Generalizing Robot Policy via Preference Alignment

InfantAgent-Next: A Multimodal Generalist Agent for Automated Computer Interaction

StitchCUDA: An Automated Multi-Agent End-to-End GPU Programming Framework with Rubric-based Agentic Reinforcement Learning

CudaForge: An Agent Framework with Hardware Feedback for CUDA Kernel Optimization

talks

teaching