About me

I received my Ph.D. from the University of Science and Technology of China (USTC) in 2024. I am currently a researcher at Alibaba TongYi Lab. My research interests include general video understanding and generation.

Publications

2025

  • Rethinking Video Tokenization: A Conditioned Diffusion-based Approach. [arXiv]
    Nianzu Yang *, Pandeng Li *, Liming Zhao, Yang Li, Chen-Wei Xie, Yehui Tang, Xudong Lu, Zhihang Liu, Yun Zheng, Yu Liu, Junchi Yan.

  • Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models.
    Zhihang Liu, Chen-Wei Xie, Pandeng Li, Liming Zhao, Longxiang Tang, Yun Zheng, Chuanbin Liu, Hongtao Xie.
    Computer Vision and Pattern Recognition (CVPR 2025)

  • Denoised and Dynamic Alignment Enhancement for Zero-Shot Learning. [paper]
    Jiannan Ge, Zhihang Liu, Pandeng Li, Lingxi Xie, Yongdong Zhang, Qi Tian, Hongtao Xie.
    IEEE Transactions on Image Processing (TIP 2025) (SCI Zone 1, IF=10.6)

  • UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface. [arXiv]
    Hao Tang, Chen-Wei Xie, Haiyang Wang, Xiaoyi Bao, Tingyu Weng, Pandeng Li, Yun Zheng, Liwei Wang.

  • What Is a Good Caption? A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Coverage of MLLMs. [arXiv]
    Zhihang Liu, Chen-Wei Xie, Bin Wen, Feiwu Yu, Jixuan Chen, Boqiang Zhang, Nianzu Yang, Pandeng Li, Yun Zheng, Hongtao Xie.

2024

  • FuseTeacher: Modality-Fused Encoders are Strong Vision Supervisors. [paper]
    Chen-Wei Xie, Siyang Sun, Liming Zhao, Pandeng Li, Shuailei Ma, Yun Zheng.
    European Conference on Computer Vision (ECCV 2024)

  • AlignZeg: Mitigating Objective Misalignment for Zero-shot Semantic Segmentation. [paper]
    Jiannan Ge, Lingxi Xie, Hongtao Xie, Pandeng Li, Xiaopeng Zhang, Yongdong Zhang, Qi Tian.
    European Conference on Computer Vision (ECCV 2024)

  • Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval. [paper] [code]
    Zhihang Liu, Jun Li, Hongtao Xie, Pandeng Li, Jiannan Ge, Sun-Ao Liu, Guoqing Jin.
    AAAI Conference on Artificial Intelligence (AAAI 2024)

2023

  • MomentDiff: Generative Video Moment Retrieval from Random to Real. [arXiv] [code]
    Pandeng Li, Chen-Wei Xie, Hongtao Xie, Liming Zhao, Lei Zhang, Yun Zheng, Deli Zhao, Yongdong Zhang.
    Conference on Neural Information Processing Systems (NeurIPS 2023)

  • Progressive Spatio-Temporal Prototype Matching for Text-Video Retrieval. [paper] [supp] [code]
    Pandeng Li, Chen-Wei Xie, Liming Zhao, Hongtao Xie, Jiannan Ge, Yun Zheng, Deli Zhao, Yongdong Zhang.
    International Conference on Computer Vision (ICCV 2023) (Oral Presentation, 2%)

  • Balanced Classification: A Unified Framework for Long-Tailed Object Detection. [paper] [code]
    Tianhao Qi, Hongtao Xie, Pandeng Li, Jiannan Ge, Yongdong Zhang.
    IEEE Transactions on Multimedia (TMM 2023)

2022

  • Dual-Stream Knowledge-Preserving Hashing for Unsupervised Video Retrieval. [paper] [code]
    Pandeng Li, Hongtao Xie, Jiannan Ge, Lei Zhang, Shaobo Min, Yongdong Zhang.
    European Conference on Computer Vision (ECCV 2022)

  • Neighborhood-Adaptive Structure Augmented Metric Learning. [paper] [code]
    Pandeng Li, Yan Li, Hongtao Xie, Lei Zhang.
    AAAI Conference on Artificial Intelligence (AAAI 2022) (Oral Presentation, 4.5%)

  • Deep Fourier Ranking Quantization for Semi-supervised Image Retrieval. [paper] [code]
    Pandeng Li, Hongtao Xie, Shaobo Min, Jiannan Ge, Xun Chen, Yongdong Zhang.
    IEEE Transactions on Image Processing (TIP 2022) (SCI Zone 1, IF=10.6)

  • Online Residual Quantization Via Streaming Data Correlation Preserving. [paper]
    Pandeng Li, Hongtao Xie, Shaobo Min, Zheng-Jun Zha, Yongdong Zhang.
    IEEE Transactions on Multimedia (TMM 2022) (SCI Zone 1, IF=7.3)

  • Neighborhood-Adaptive Multi-cluster Ranking for Deep Metric Learning. [paper]
    Pandeng Li, Hongtao Xie, Yan Jiang, Jiannan Ge, Yongdong Zhang.
    IEEE Transactions on Circuits and Systems for Video Technology (TCSVT 2022) (SCI Zone 1, IF=8.4)

  • Dual Part Discovery Network for Zero-Shot Learning. [paper]
    Jiannan Ge, Hongtao Xie, Shaobo Min, Pandeng Li, Yongdong Zhang.
    ACM Multimedia (ACM-MM 2022) (Oral Presentation)

Awards

  • 2022/09 National Scholarship, USTC
  • 2023/12 Third prize (¥40,000) at the 2nd Guangdong-Hong Kong-Macao International Algorithm Competition
  • 2024/05 CAS Presidential Scholarship