Research Focus
Generative Systems
Diffusion models for image, video, and audio generation with real-world quality and reliability constraints.
Visual Localization
Learning-based localization that blends geometry with deep representations for AR/VR at scale.
Privacy + Security
Content-concealing descriptors and robust perception for privacy-preserving visual systems.
Now
I am a Senior Research Scientist at Google DeepMind on the Science and Strategic Initiatives team. I am interested in generative AI, multimodal systems, and evaluation frameworks that move beyond surface-level metrics.
I am open to collaborations on generative media systems, privacy-preserving perception, and robust evaluation.
selected publications
- TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models. In CVPR, 2026.
-
CVPR
news
| Date | News |
|---|---|
| Apr 20, 2026 | Started a new role as Senior Research Scientist at Google DeepMind on the Science and Strategic Initiatives team. |
| Feb 23, 2026 | Two papers were accepted to CVPR 2026: TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models and VecGlypher: Unified Vector Glyph Generation with Language Models. |
| Dec 10, 2025 | New preprint, TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models (arXiv:2512.02014). |
| Aug 19, 2024 | Started a new role as an AI Research Scientist at Meta, focusing on diffusion models for image, video, and audio generation. |
| Feb 6, 2023 | Joined Synthesia as a Research Engineer, working on controllable video diffusion models for AI dubbing on avatars. |