Multimedia Laboratory @
Nanyang Technological University
Affiliated with S-Lab



MMLab@NTU was formed on 1 August 2018, with a research focus on computer vision and deep learning. Its sister lab is MMLab@CUHK. It is now a group of four faculty members and more than 35 other members, including research fellows, research assistants, and PhD students.

Members of MMLab@NTU conduct research primarily in low-level vision, image and video understanding, creative content creation, and 3D scene understanding and reconstruction. Have a look at the overview of our research. All publications are listed here.

We are always looking for motivated PhD students, postdocs, and research assistants who share our interests. Check out the careers page and follow us on Twitter.

MVP Point Cloud Challenge @ ICCV 2021

07/2021: Join this challenge to evaluate your point cloud completion and registration methods. Deadline on September 9, 2021.

ForgeryNet: Face Forgery Analysis Challenge @ ICCV 2021

07/2021: Join this challenge to benchmark your anti-deepfake methods on the largest face forgery dataset. Deadline on September 9, 2021.

Three Champions in NTIRE 2021 Challenge

04/2021: NTIRE is the most competitive challenge for low-level vision tasks. With BasicVSR++, we won three championships across the tracks for video super-resolution and quality enhancement of heavily compressed videos. Congrats to the team!

ICCV 2021

07/2021: The team has a total of 11 papers accepted to ICCV 2021 (including one oral).

News and Highlights

  • 05/2021: Five outstanding CVPR 2021 reviewers from our team! Congrats to Chongyi Li, Davide Moltisanti, Xiangyu Xu, Liang Pan, and Jiahao Xie.
  • 03/2021: The team has a total of 18 papers accepted to CVPR 2021 (including four orals).
  • 01/2021: Two papers to appear in ICLR 2021.
  • 12/2020: In the recent nuScenes 3D detection challenge of the 5th AI Driving Olympics at NeurIPS 2020, our multi-modality entry won the best PKL award and was second runner-up, and we obtained the best vision-only results!
  • 12/2020: Shangchen Zhou is recognized as one of the top 10% outstanding reviewers in NeurIPS 2020.
  • 07/2020: Eight papers to appear in ECCV 2020 (with one oral and one spotlight).
  • 07/2020: New toolboxes such as MMEditing, MMDetection3D and OpenSelfSup are released under OpenMMLab.
  • 03/2020: Nine papers to appear in CVPR 2020 (including one oral).



Path-Restore: Learning Network Path Selection for Image Restoration
K. Yu, X. Wang, C. Dong, X. Tang, C. C. Loy
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021 (TPAMI)
[DOI] [arXiv] [Project Page]

We observe that some corrupted image regions are inherently easier to restore than others, since the distortion and content vary within an image. Based on this observation, we propose Path-Restore, a multi-path CNN with a pathfinder that can dynamically select an appropriate route for each image region. We train the pathfinder using reinforcement learning with a difficulty-regulated reward. This reward depends on the restoration performance, the complexity of the selected path, and the difficulty of restoring the region.
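
To make the idea of a difficulty-regulated reward concrete, here is a minimal sketch. It is not the formulation used in the paper: the linear form of the reward, the `cost_weight` parameter, and the names `PathOutcome`, `difficulty_regulated_reward`, and `best_path` are all assumptions introduced for illustration. It only shows how scaling the performance term by a region's difficulty can favour cheap paths for easy regions and expensive paths for hard ones.

```python
from dataclasses import dataclass


@dataclass
class PathOutcome:
    """Hypothetical summary of running one candidate path on one region."""
    name: str
    error_reduction: float  # how much the restoration error dropped
    cost: float             # relative compute cost of the path


def difficulty_regulated_reward(outcome, difficulty, cost_weight=0.5):
    # Assumed form: performance gain scaled by difficulty, minus a cost penalty.
    return difficulty * outcome.error_reduction - cost_weight * outcome.cost


def best_path(outcomes, difficulty, cost_weight=0.5):
    # Pick the candidate with the highest reward. A trained pathfinder would
    # learn this choice; here we simply enumerate the candidates.
    return max(
        outcomes,
        key=lambda o: difficulty_regulated_reward(o, difficulty, cost_weight),
    )


# Made-up numbers for an easy region and a hard region.
easy = [PathOutcome("short", 1.0, 0.2), PathOutcome("long", 1.2, 1.0)]
hard = [PathOutcome("short", 1.0, 0.2), PathOutcome("long", 3.0, 1.0)]

print(best_path(easy, difficulty=0.2).name)  # short
print(best_path(hard, difficulty=1.0).name)  # long
```

With these illustrative numbers, the extra compute of the long path only pays off when the region is difficult.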

GLEAN: Generative Latent Bank for Large-Factor Image Super-Resolution
K. C. K. Chan, X. Wang, X. Xu, J. Gu, C. C. Loy
in Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021 (CVPR, Oral)
[PDF] [Supplementary Material] [arXiv] [Project Page]

We show that pre-trained Generative Adversarial Networks (GANs), e.g., StyleGAN, can be used as a latent bank to improve the restoration quality of large-factor image super-resolution (SR). Switching the bank allows the method to deal with images from diverse categories, e.g., cat, building, human face, and car. Images upscaled by GLEAN show clear improvements in terms of fidelity and texture faithfulness in comparison to existing methods.

FASA: Feature Augmentation and Sampling Adaptation for Long-Tailed Instance Segmentation
Y. Zang, C. Huang, C. C. Loy
in Proceedings of IEEE/CVF International Conference on Computer Vision, 2021 (ICCV)
[arXiv] [Project Page]

We propose a simple yet effective method, Feature Augmentation and Sampling Adaptation (FASA), that addresses the data scarcity issue by augmenting the feature space especially for rare classes. FASA is a fast, generic method that can be easily plugged into standard or long-tailed segmentation frameworks, with consistent performance gains and little added cost.
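
As a rough illustration of augmenting the feature space for rare classes, the sketch below draws virtual features from a per-class Gaussian fitted to the observed features, and tops up each class to a common count. This is an assumption-laden toy, not the FASA implementation: the Gaussian model, the fixed `target_count` rule, and the function names `class_statistics` and `augment_rare_classes` are hypothetical, and the adaptive sampling part of the method is not modelled here.

```python
import random
import statistics


def class_statistics(features_by_class):
    """Per-class, per-dimension mean and standard deviation of the features."""
    stats = {}
    for label, feats in features_by_class.items():
        dims = list(zip(*feats))  # transpose: one tuple per feature dimension
        means = [statistics.fmean(d) for d in dims]
        stds = [statistics.pstdev(d) for d in dims]
        stats[label] = (means, stds)
    return stats


def augment_rare_classes(features_by_class, target_count, rng):
    """Sample virtual features so every class reaches target_count examples.

    Classes that already have enough real features receive no virtual ones,
    so the augmentation effort concentrates on rare classes.
    """
    stats = class_statistics(features_by_class)
    virtual = {}
    for label, feats in features_by_class.items():
        n_missing = max(0, target_count - len(feats))
        means, stds = stats[label]
        virtual[label] = [
            [rng.gauss(m, s) for m, s in zip(means, stds)]
            for _ in range(n_missing)
        ]
    return virtual


# Toy 2-D features: one frequent class and one rare class.
features = {
    "common": [[0.0, 1.0], [0.2, 0.8], [0.1, 1.1], [-0.1, 0.9]],
    "rare": [[5.0, 5.0], [5.4, 4.6]],
}
virtual = augment_rare_classes(features, target_count=4, rng=random.Random(0))
print({label: len(v) for label, v in virtual.items()})
```

In this toy setup only the rare class receives virtual features, which is the behaviour the paragraph above describes at a high level.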

Unsupervised Object-Level Representation Learning from Scene Images
J. Xie, X. Zhan, Z. Liu, Y. S. Ong, C. C. Loy
Technical report, arXiv:2106.11952, 2021
[arXiv] [Project Page]

We introduce Object-level Representation Learning (ORL), a new self-supervised learning framework for scene images. Extensive experiments on COCO show that ORL significantly improves the performance of self-supervised learning on scene images, even surpassing supervised ImageNet pre-training on several downstream tasks.



ICCV 2021 - The 3rd Workshop on Sensing, Understanding and Synthesizing Humans

In our workshop this year, we are organizing two exciting challenges.

In the MVP Point Cloud Challenge, you can compete with others using your point cloud completion and registration methods on the newly proposed MVP dataset, a high-quality multi-view partial point cloud dataset. It contains over 100,000 high-quality scans of partial 3D shapes, rendered from 26 uniformly distributed camera poses for each 3D CAD model.

In the ForgeryNet: Face Forgery Analysis Challenge, you will benchmark your anti-deepfake methods on ForgeryNet, the largest face forgery dataset.

Both challenges have a deadline of September 19, 2021. Great prizes are up for grabs!