Oleh Siloch

Available to hire

I’m Oleh Siloch, a Senior AI Engineer with over a decade of experience building production-grade AI systems across conversational AI, multimodal generation, and enterprise automation. I specialize in Retrieval-Augmented Generation (RAG), AI agent orchestration, real-time voice systems, and GPU-optimized inference pipelines. I lead cross-functional teams that deliver reliable, scalable solutions.

I design hybrid backend architectures (Django and FastAPI), implement low-latency streaming and distributed microservices on AWS/GCP, and ensure security and regulatory compliance (HIPAA, PCI DSS) while maintaining operational reliability in high-demand environments.

Experience Level

Expert (14 skills)
Intermediate (3 skills)

Language

Ukrainian
Fluent
English
Advanced

Work Experience

Senior AI / Full-Stack Engineer at Digital A Go Go SRL
July 1, 2023 - Present
Led the end-to-end design and deployment of an AI-powered conversational platform with real-time voice assistants, chatbot workflows, and retrieval-based search. Built a real-time voice booking assistant using ElevenLabs Conversational AI, WebSocket streaming (PCM16 @ 16kHz), VAD, barge-in detection, and queued TTS across Flutter and web apps. Developed LLM-powered support tools (summarization, smart replies, semantic FAQ search) reducing human operator load by ~55%. Architected RAG pipelines with pgvector, Pinecone, and FAISS to boost FAQ relevance and task resolution speed. Optimized backend APIs and microservices (FastAPI, Django, Node.js) with caching, async queues, and high-traffic query optimization. Implemented privacy-first data ingestion with RBAC, encryption, and tokenized storage for HIPAA/PCI-DSS compliance. Owned CI/CD and infrastructure automation on AWS/GCP, with observability stacks (Prometheus, Grafana, Sentry). Built live dashboards (Next.js/React) for transcription p
AI Engineer at SolveIT
February 1, 2021 - February 1, 2024
Designed and developed a real-time multimodal AI video agent enabling natural human–AI interaction through speech, vision, and a lifelike avatar. Built a streaming speech-to-speech pipeline using Whisper-based ASR, GPT-3.5 for dialogue, and neural TTS to achieve low-latency voice interactions. Created an audio-to-visual animation pipeline with Wav2Lip and SadTalker to synchronize avatar facial movements with generated audio. Implemented real-time visual perception using OpenCV with YOLOv5 and MediaPipe to extract object, face, and gesture context during conversations. Developed an embedding-based semantic retrieval layer using sentence-transformers (MiniLM-L6-v2) and pgvector for context-aware retrieval. Optimized GPU-based inference via model parallelism and asynchronous orchestration, reducing end-to-end rendering latency by 45%. Architected a low-latency backend with FastAPI, Redis Streams, and WebRTC to support streaming audio/video and synchronized AI avatar responses, with auto
Junior Backend Engineer at Institute of Molecular Biology and Genetics, NAS of Ukraine
September 1, 2018 - April 1, 2021
Developed Python-based data preprocessing and analysis pipelines using Pandas and NumPy. Built lightweight Django-based internal tools for experiment tracking and reporting. Designed relational database schemas and optimized SQL queries for scientific datasets. Assisted in implementing data processing workflows supporting ML experiments. Refactored legacy Python scripts, improving maintainability and runtime efficiency.

Education

Bachelor of Science in Computer Software Engineering at Odessa Polytechnic National University
January 1, 2010 - January 1, 2014

Industry Experience

Software & Internet, Media & Entertainment, Professional Services, Healthcare, Financial Services