Pulp Fictions
Fill the GAP: A Granular Alignment Paradigm for Visual Reasoning in Multimodal Large Language Models
Visual latent reasoning lets a multimodal large language model (MLLM) create intermediate visual evidence as continuous tokens, …
Yanting Miao, Yutao Sun, Dexin Wang, Mengyu Zhou, Pascal Poupart, Lei Lv, Qi Zhao, Li Wang, Hao Li, Xiaoxi Jiang, Guanjun Jiang
Trace2Skill: Distill Trajectory-Local Lessons into Transferable Agent Skills
Large Language Model (LLM) agents increasingly rely on domain-specific skills, yet manually authoring such skills does not scale, and …
Jingwei Ni, Yihao Liu, Xinpeng Liu, Yutao Sun, Mengyu Zhou, Pengyu Cheng, Dexin Wang, Erchao Zhao, Xiaoxi Jiang, Guanjun Jiang