Xin Gu

Computer Science and Technology, Chinese Academy of Sciences
Beijing, China

Email: guxin21@mails.ucas.ac.cn

[Google Scholar]


Biography

I am a fourth-year Ph.D. student at the Chinese Academy of Sciences, advised by Prof. Tiejian Luo. Before that, I received my Bachelor's degree from the University of Electronic Science and Technology of China in June 2021. My research focuses on multimodal video understanding, particularly the following topics: Video Captioning, Spatio-Temporal Video Grounding, Long-term Video Understanding, and Image Editing.


News

    01/2025 Two papers are accepted by ICLR 2025!
    02/2024 One paper is accepted by CVPR 2024!
    11/2023 One paper is accepted by IJCV!
    06/2023 One paper is accepted by ICCV 2023!
    02/2023 One paper is accepted by CVPR 2023!
    06/2022 Champion of the CVPR'22 LOVEU competition!
    11/2021 Joined Bytedance as a research intern.

Education


Research Experience


Publications and preprints

First Author / Co-First Author
OmniSTVG: Toward Spatio-Temporal Omni-Object Video Grounding
Jiali Yao*, Xinran Deng*, Xin Gu*, Mengrui Dai, Bing Fan, Zhipeng Zhang, Yan Huang, Heng Fan, Libo Zhang
[Paper (preprint)] [Code]
Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding
Xin Gu, Yaojie Shen, Chenxi Luo, Tiejian Luo, Yan Huang, Yuewei Lin, Heng Fan, Libo Zhang
[Paper (ICLR 2025)] [Code] [Huggingface Checkpoint] (Oral)
Multi-Reward as Condition for Instruction-based Image Editing
Xin Gu, Ming Li, Libo Zhang, Fan Chen, Longyin Wen, Tiejian Luo, Sijie Zhu
[Paper (ICLR 2025)] [Code] [Huggingface Checkpoint]
Edit3k: Universal Representation Learning for Video Editing Components
Xin Gu, Libo Zhang, Fan Chen, Longyin Wen, Yufei Wang, Tiejian Luo, Sijie Zhu
[Paper (preprint)] [Code] [Huggingface Dataset]
Context-Guided Spatio-Temporal Video Grounding
Xin Gu, Heng Fan, Yan Huang, Tiejian Luo, Libo Zhang
[Paper (CVPR 2024)] [Code] [Huggingface Checkpoint]
Local Compressed Video Stream Learning for Generic Event Boundary Detection
Libo Zhang*, Xin Gu*, Congcong Li, Tiejian Luo, Heng Fan
[Paper (IJCV)] [Code]
Accurate and Fast Compressed Video Captioning
Yaojie Shen*, Xin Gu*, Kai Xu, Heng Fan, Longyin Wen, Libo Zhang
[Paper (ICCV 2023)] [Code]
Text with Knowledge Graph Augmented Transformer for Video Captioning
Xin Gu, Guang Chen, Yufei Wang, Libo Zhang, Tiejian Luo, Longyin Wen
[Paper (CVPR 2023)] [Code]
Dual-Stream Transformer for Generic Event Boundary Captioning
Xin Gu, Hanhua Ye, Guang Chen, Yufei Wang, Libo Zhang, Longyin Wen
1st on LOVEU@CVPR'22: Generic Event Boundary Captioning
[Technical Report (CVPR 2022 Workshops)]

Awards

    12/2024 Chu Lee Yuet Wah Scholarship
    04/2024 President Award, University of Chinese Academy of Sciences
    06/2021 Outstanding Graduate, University of Electronic Science and Technology of China
    08/2020 Third Prize, National WeChat Mini Program Competition
    05/2020 Honorable Mention, Mathematical Contest in Modeling
    11/2019 National Encouragement Scholarship
    10/2019 Excellent Student Scholarship, University of Electronic Science and Technology of China