Publications

My research publications.

Any Resolution Any Geometry: From Multi-View To Multi-Patch

Wenqing Cui, Zhenyu Li, Mykola Lavreniuk, Jian Shi, Ramzi Idoughi, Xiangjun Tang, Peter Wonka

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2026

A multi-patch framework for high-resolution monocular depth and surface normal estimation that works at any resolution and produces sharp boundaries with globally consistent geometry.

Project

PatchRefiner V2: Fast and Lightweight Real-Domain High-Resolution Metric Depth Estimation

Zhenyu Li, Wenqing Cui, Shariq Farooq Bhat, Peter Wonka

International Conference on Learning Representations (ICLR) 2026

A fast and lightweight framework for high-resolution metric depth estimation that outperforms state-of-the-art methods in both accuracy and speed.

DOI | Code

Geometry without Position? When Positional Embeddings Help and Hurt Spatial Reasoning

Jian Shi, Michael Birsak, Wenqing Cui, Zhenyu Li, Peter Wonka

arXiv preprint arXiv:2601.22231 2026

A study revisiting positional embeddings in vision transformers from a geometric perspective, revealing their role as geometric priors for spatial reasoning.

DOI

StereoCrafter-Zero: Zero-Shot Stereo Video Generation with Noisy Restart

Jian Shi, Qian Wang, Zhenyu Li, Ramzi Idoughi, Wenqing Cui, Peter Wonka

arXiv preprint arXiv:2411.14295 2024

A zero-shot framework for stereo video generation using video diffusion priors with a noisy restart strategy, achieving state-of-the-art depth consistency.

DOI | Code

One-Step Video Depth Estimation via Self-Distillation

Wenqing Cui, Zhenyu Li, Jian Shi, Shariq Farooq Bhat, Peter Wonka

ICLR 2026 RSI Workshop

A two-stage self-distillation strategy that distills multi-step diffusion video depth models into a one-step model, achieving comparable accuracy while reducing denoising time by up to 3x and decoding time by up to 20x.