Thomas Wimmer

PhD Student at the Max Planck ETH Center for Learning Systems (CLS).

I am currently pursuing a PhD through the Max Planck ETH Center for Learning Systems (CLS) and ELLIS programs. My advisors are Jan Eric Lenssen, Bernt Schiele, Christian Theobalt (MPI), and Siyu Tang (ETH).

I previously completed a double master’s degree at the Technical University of Munich and the Institut Polytechnique de Paris. During my studies, I had the chance to work with many great people, including Daniel Cremers, Maks Ovsjanikov, Peter Wonka, and Federico Tombari.

My main research interests are visual representation learning and 3D computer vision, though I am always open to new ideas and collaborations in related fields. This website gives an overview of my recent research and other projects.



news

Oct 14, 2025 New pre-print: “AnyUp: Universal Feature Upsampling” is now available on arXiv! Super excited to share this work, in which we propose a first-of-its-kind feature-agnostic upsampling architecture that can upsample features from any vision model at any resolution, without requiring any encoder-specific training. It sets new state-of-the-art results on multiple downstream benchmarks while being the first upsampler that naturally generalizes to different feature types at inference time. Accepted to ICLR 2026 as an oral presentation!
Jun 05, 2025 New pre-print: “Do It Yourself: Learning Semantic Correspondence from Pseudo-Labels” is now available on arXiv! We show that foundation-model features can be refined with an adapter trained on pseudo-labels, which are themselves zero-shot predictions made with the same features. We improve pseudo-label quality through 3D-aware chaining with cycle consistency and reject incorrect pairs using a spherical prototype. The method sets new state-of-the-art results on SPair-71k and scales to larger datasets. Accepted to ICCV 2025!
Jan 12, 2025 Our pre-print “MEt3R: Measuring Multi-View Consistency in Generated Images” is now available on arXiv! In this work, we propose a DUSt3R-based method for measuring multi-view consistency, which can, for example, be used to evaluate the 3D consistency of video diffusion models. Accepted to CVPR 2025!
Nov 05, 2024 Happy to report that my latest paper, “Gaussians-to-Life: Text-Driven Animation of 3D Gaussian Splatting Scenes”, was accepted for publication at 3DV 2025. Thanks to my co-authors, Michael Oechsle, Michael Niemeyer, and Federico Tombari, for a great collaboration!

highlighted work

  1. AnyUp: Universal Feature Upsampling
     In Proceedings of the International Conference on Learning Representations (ICLR), 2026
  2. Do It Yourself: Learning Semantic Correspondence from Pseudo-Labels
     In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2025
  3. MEt3R: Measuring Multi-View Consistency in Generated Images
     In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025
  4. Gaussians-to-Life: Text-Driven Animation of 3D Gaussian Splatting Scenes
     In 2025 International Conference on 3D Vision (3DV), 2025
  5. Back to 3D: Few-Shot 3D Keypoint Detection with Back-Projected 2D Features
     Thomas Wimmer, Peter Wonka, and Maks Ovsjanikov
     In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024