๐Ÿ“ Publications

๐Ÿ 2025

🔥 CVPR 2025

Towards Improved Text-Aligned Codebook Learning: Multi-Hierarchical Codebook-Text Alignment with Long Text
Guotao Liang, Baoquan Zhang*, Zhiyuan Wen, Junteng Zhao, Yunming Ye, Xiaochen Qi, Yao He. (Highlight)

Project

  • We propose a novel text-augmented codebook learning framework, TA-VQ, which leverages vision-language models (VLMs) to generate longer text for each image to improve text-aligned codebook learning.

๐Ÿฒ 2024

🔥 NeurIPS 2024

LG-VQ: Language-Guided Codebook Learning
Guotao Liang, Baoquan Zhang*, Yaowei Wang, Yunming Ye, Xutao Li, Huaibin Wang, Chuyao Luo, Kola Ye, Linfeng Luo.

Project

  • We propose a novel multi-modal codebook learning method, LG-VQ, which enables the codebook to retain fine-grained reconstruction information while aligning with text.

CVPR 2024

Codebook Transfer with Part-of-Speech for Vector-Quantized Image Modeling
Baoquan Zhang, Huaibin Wang, Chuyao Luo, Xutao Li, Guotao Liang, Yunming Ye, Kola Ye, Linfeng Luo.

Project

  • We propose a new perspective, i.e., codebook transfer from language models to vector-quantized image modeling (VQIM), to alleviate the codebook collapse issue.

๐Ÿฐ 2023