I am Abhinav Utkarsh, a Computer Vision / Deep Learning Engineer (M.Sc. from TUM, 2025) based in Munich, Germany, focusing on 3D vision and multimodal perception. I have developed diffusion-guided editing for Gaussian-splat avatars and a transformer-diffusion model for head-pose estimation, and I delivered RGB-LiDAR anomaly detection pipelines on high-resolution datasets, validated by AUROC and LPIPS/PSNR. I enjoy turning research into reliable on-device solutions under real-time constraints and collaborating with impact-driven teams. My stack includes Python, PyTorch, and OpenCV/Open3D, with hands-on experience in 3D vision, diffusion, transformers, and LoRA fine-tuning for LLMs. I value reproducibility and rigorous evaluation and look forward to translating research into practical, real-world systems.

Abhinav Utkarsh

Available to hire

Experience Level

Expert: 12 skills
Intermediate: 4 skills

Language

English
Fluent
German
Fluent

Work Experience

Emotion-Driven Editing of Gaussian Avatars (Research Project) at Technical University of Munich (TUM)
February 1, 2025 - September 29, 2025
Led EMO-GA, a diffusion-guided editing pipeline for FLAME-tracked Gaussian-splat avatars that uses multi-view diffusion edits as pseudo-supervision. Achieved higher-fidelity neutral rendering via differentiable tile-based rasterization and preserved identity without drift using ArcFace/VGG-Face constraints. Implemented a multi-view, multi-frame photometric-consistency loss and a lightweight transformer-diffusion head-pose controller to steer avatar expressions. Demonstrated L1 error reductions (~10%) and PSNR gains (~0.67 dB) across ablations.
Multimodal 3D Point-Cloud Anomaly Detection (Research) at Technical University of Munich (TUM)
February 1, 2024 - September 29, 2025
Enhanced CPMF with PointNet/PointNet++ 3D feature extractors, a super-resolution SDF module, noising and patching for high-resolution point clouds, and a reworked three-stage fusion with a NASNet 2D backbone. Rebuilt the preprocessing and a Blender-generated 360-view RGB-LiDAR dataset. The resulting, stronger RGB-LiDAR model outperformed baselines on all metrics (ROC, PR).
Large Language Models for Structuring Radiology Reports (Research) at Technical University of Munich (TUM)
September 1, 2023 - September 29, 2025
Built a schema-constrained QA pipeline with progressive prompts and strict validation. Fine-tuned Vicuna-16B with LoRA for structured extraction, yielding more valid Level-3 generations and fewer off-schema outputs. Added an image-aware VQA module for Level-3 reasoning, boosting recall on radiology findings.
Autonomous Drones with ROS (Research) at Technical University of Munich (TUM)
August 1, 2023 - September 29, 2025
Developed a ROS-based autonomous drone stack; achieved 100% coverage of a 50 m x 50 m arena in 1 min 20 s with two UAVs simulated in Unity. Implemented navigation, perception, and a state machine; designed a waypoint planner and tuned the P and D gains for stable, repeatable map reconstruction.
Technology Intern at Expert PowerHouse GmbH
July 1, 2021 - September 29, 2025
Extended the Flask/Python backend and REST API for an expert-matching tool; integrated a real-time response capture service. Built one-click Python automations for reporting and autogenerated decks, plus scheduled scripts to fetch data, rerun models, refresh results, and send status emails.

Education

Master of Science (M.Sc.) Robotics, Cognition, Intelligence at Technical University of Munich
January 1, 2021 - January 1, 2025
Bachelor of Technology (B.Tech.) Information Technology at Manipal University Jaipur
January 1, 2017 - January 1, 2021

Qualifications

Goethe B2
DE C1 - TUM certificate
TOEFL

Industry Experience

Software & Internet, Media & Entertainment, Education, Professional Services, Computers & Electronics, Healthcare, Life Sciences