news

Sep 25, 2025 My summer internship work, Process-Supervised Reinforcement Learning for Interactive Multimodal Tool-Use Agents, is now on arXiv. We will release our training and inference code for multi-turn RL soon.
Aug 20, 2025 Seeing is Believing: Emotion-Aware Audio-Visual Language Modeling for Expressive Speech Generation accepted to EMNLP 2025!
May 20, 2025 STAR and SSR accepted to IWSLT 2025! Please come to our presentation. See you in Vienna!
May 01, 2025 Starting a new internship at ByteDance Seed! Will be working on iterative (multi-turn, multi-modal) tool-use agents.
Sep 25, 2024 DiffNorm accepted to NeurIPS 2024! Please come to our poster. See you in Vancouver!
Aug 02, 2024 I will work on Multi-modal LLMs as a part-time student researcher at Meta in Fall 2024 & Spring 2025.
Feb 20, 2024 I will be interning at Meta AI (FAIR) in summer 2024, working on speech large language models. Looking forward to the new project!
Apr 07, 2023 I will be staying at Johns Hopkins University for my PhD, working with Prof. Philipp Koehn!