no code implementations • 22 Mar 2024 • Ruining Li, Chuanxia Zheng, Christian Rupprecht, Andrea Vedaldi
We introduce DragAPart, a method that, given an image and a set of drags as input, can generate a new image of the same object in a new state, compatible with the action of the drags.
no code implementations • 21 Mar 2024 • Tianhao Wu, Chuanxia Zheng, Tat-Jen Cham, Qianyi Wu
3D decomposition/segmentation remains a challenge, as large-scale 3D annotated data is not readily available.
1 code implementation • 21 Mar 2024 • Yuedong Chen, Haofei Xu, Chuanxia Zheng, Bohan Zhuang, Marc Pollefeys, Andreas Geiger, Tat-Jen Cham, Jianfei Cai
We propose MVSplat, an efficient feed-forward 3D Gaussian Splatting model learned from sparse multi-view images.
Ranked #1 on Generalizable Novel View Synthesis on ACID
1 code implementation • 28 Dec 2023 • Guanqi Zhan, Chuanxia Zheng, Weidi Xie, Andrew Zisserman
In contrast, we use 3D data to establish an automatic pipeline to determine authentic ground truth amodal masks for partially occluded objects in real images.
1 code implementation • 7 Dec 2023 • Chuanxia Zheng, Andrea Vedaldi
Similar to Zero-1-to-3, we start from a pre-trained 2D image generator for generalization, and fine-tune it for NVS.
no code implementations • 27 Nov 2023 • Minghui Hu, Jianbin Zheng, Chuanxia Zheng, Chaoyue Wang, DaCheng Tao, Tat-Jen Cham
By integrating a compact network and incorporating an additional simple yet effective step during inference, OMS elevates image fidelity and harmonizes the dichotomy between training and inference, while preserving original model parameters.
1 code implementation • 10 Oct 2023 • Guanqi Zhan, Chuanxia Zheng, Weidi Xie, Andrew Zisserman
We find that features from Stable Diffusion are good for discriminative learning of a number of properties, including scene geometry, support relations, shadows and depth, but less performant for occlusion and material.
1 code implementation • ICCV 2023 • Chuanxia Zheng, Andrea Vedaldi
Vector Quantisation (VQ) is experiencing a comeback in machine learning, where it is increasingly used in representation learning.
no code implementations • 6 Jul 2023 • Tianhao Wu, Chuanxia Zheng, Tat-Jen Cham
Generating complete 360-degree panoramas from narrow field-of-view images remains an open research problem, as omnidirectional RGB data is not readily available.
no code implementations • 1 Jun 2023 • Minghui Hu, Jianbin Zheng, Daqing Liu, Chuanxia Zheng, Chaoyue Wang, DaCheng Tao, Tat-Jen Cham
In this work, we propose Cocktail, a pipeline to mix various modalities into one embedding, amalgamated with a generalized ControlNet (gControlNet), a controllable normalisation (ControlNorm), and a spatial guidance sampling method, to actualize multi-modal and spatially-refined control for text-conditional diffusion models.
1 code implementation • 24 Apr 2023 • Yuedong Chen, Haofei Xu, Qianyi Wu, Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
The key to our approach lies in the explicitly modeled correspondence matching information, which provides a geometry prior for the prediction of NeRF color and density for volume rendering.
no code implementations • 12 Feb 2023 • Tung-Long Vuong, Trung Le, He Zhao, Chuanxia Zheng, Mehrtash Harandi, Jianfei Cai, Dinh Phung
Learning deep discrete latent representations offers a promise of better symbolic and summarized abstractions that are more useful to subsequent downstream tasks.
1 code implementation • 27 Nov 2022 • Minghui Hu, Chuanxia Zheng, Heliang Zheng, Tat-Jen Cham, Chaoyue Wang, Zuopeng Yang, DaCheng Tao, Ponnuthurai N. Suganthan
The recently developed discrete diffusion models perform extraordinarily well in the text-to-image task, showing significant promise for handling multi-modality signals.
2 code implementations • 19 Sep 2022 • Chuanxia Zheng, Long Tung Vuong, Jianfei Cai, Dinh Phung
Although two-stage Vector Quantized (VQ) generative models allow for synthesizing high-fidelity and high-resolution images, their quantization operator encodes similar patches within an image into the same index, resulting in a repeated artifact for similar adjacent regions using existing decoder architectures.
1 code implementation • 20 Jul 2022 • Qianyi Wu, Xian Liu, Yuedong Chen, Kejie Li, Chuanxia Zheng, Jianfei Cai, Jianmin Zheng
This paper proposes a novel framework, ObjectSDF, to build an object-compositional neural implicit representation with high fidelity in 3D reconstruction and object representation.
no code implementations • 5 Apr 2022 • Chuanxia Zheng, Guoxian Song, Tat-Jen Cham, Jianfei Cai, Dinh Phung, Linjie Luo
In this work, we present a novel framework for pluralistic image completion that can achieve both high quality and diversity at much faster inference speed.
1 code implementation • 21 Mar 2022 • Yuedong Chen, Qianyi Wu, Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
In light of recent advances in NeRF-based 3D-aware generative models, we introduce a new task, Semantic-to-NeRF translation, that aims to reconstruct a 3D scene modelled by NeRF, conditioned on one single-view semantic mask as input.
Ranked #1 on 3D-Aware Image Synthesis on CelebAMask-HQ
no code implementations • 23 Feb 2022 • Chuanxia Zheng
The goal of this thesis is to present my research contributions towards solving various visual synthesis and generation tasks, comprising image translation, image completion, and completed scene decomposition.
1 code implementation • ACM Transactions on Graphics 2021 • Guoxian Song, Linjie Luo, Jing Liu, Wan-Chun Ma, Chun-Pong Lai, Chuanxia Zheng, Tat-Jen Cham
While substantial progress has been made in automated stylization, generating high quality stylistic portraits is still a challenge, and even the recent popular Toonify suffers from several artifacts when used on real input images.
1 code implementation • 12 Apr 2021 • Chuanxia Zheng, Duy-Son Dao, Guoxian Song, Tat-Jen Cham, Jianfei Cai
In this work, we propose a higher-level scene understanding system to tackle both visible and invisible parts of objects and backgrounds in a given scene.
2 code implementations • CVPR 2021 • Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
We propose a novel spatially-correlative loss that is simple, efficient and yet effective for preserving scene structure consistency while supporting large appearance changes during unpaired image-to-image (I2I) translation.
1 code implementation • CVPR 2022 • Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai, Dinh Phung
Bridging global context interactions correctly is important for high-fidelity image completion with large masks.
Ranked #2 on Image Inpainting on FFHQ 512 x 512
no code implementations • ICCV 2021 • Yujun Cai, Yiwei Wang, Yiheng Zhu, Tat-Jen Cham, Jianfei Cai, Junsong Yuan, Jun Liu, Chuanxia Zheng, Sijie Yan, Henghui Ding, Xiaohui Shen, Ding Liu, Nadia Magnenat Thalmann
Notably, by considering this problem as a conditional generation process, we estimate a parametric distribution of the missing regions based on the input conditions, from which to sample and synthesize the full motion series.
1 code implementation • CVPR 2019 • Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
In this paper, we present an approach for pluralistic image completion -- the task of generating multiple and diverse plausible solutions for image completion.
1 code implementation • ECCV 2018 • Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
Current methods for single-image depth estimation use training datasets with real image-depth pairs or stereo pairs, which are not easy to acquire.
Ranked #3 on Depth Estimation on eBDtheque