

Piyush Aaryan

AI Engineer, Data Scientist, Full Stack Developer, +3






Available to hire

I’m Piyush Aaryan, an AI Engineer specializing in agentic systems, LLM-powered products, and model fine-tuning. I build production AI backends for both consumer and B2B applications, deploying custom LLM and diffusion pipelines on serverless infrastructure to optimize costs and speed up delivery.

I thrive on scalable API development, distributed workers (FastAPI, Celery, Redis, PostgreSQL), production observability (Prometheus/Grafana), and GPU-hosted RAG services using Llama.cpp/LlamaIndex. Recently, I delivered a multi-model AI teaching agent with RL-trained code generation and multilingual TTS models.

Skills

Experience Level: Expert (19 skills), Intermediate (1 skill)

Language

English: Fluent

Gujarati: Intermediate

Hindi: Intermediate

Tamil: Intermediate

Work Experience

AI Backend Engineer at Upwork

January 1, 2024 - Present

Freelance AI Backend Engineer delivering production AI backends for 6+ client applications across LLM, diffusion, and agentic workflows. Architected serverless LLM and diffusion inference pipelines using FastAPI, Celery, Redis, and PostgreSQL, achieving $5,000/month in infrastructure cost savings. Implemented end-to-end observability with Prometheus and Grafana, covering latency, queue health, and error-rate monitoring across distributed systems. Deployed GPU-hosted RAG APIs with Llama.cpp/LlamaIndex and secured access for enterprise clients through Cloudflare Tunnels. Designed multi-layer LLM safety controls, including prompt-injection resistance and safe tool usage patterns.

AI Engineer (Consultant) at Reasonify Pvt Ltd (Singularity Learn)

June 1, 2025 - Present

Architected and built Singularity Learn’s AI Teaching Agent, a multi-model, curriculum-aware conversational teacher that delivers personalized K-12 lessons with 56+ dynamically triggered interactive tools. Engineered a 9-mixin modular agent backend (FastAPI) with multi-provider LLM support (Gemini, Claude, GPT-4o via OpenRouter), SSE streaming, pedagogical guardrails, and AI-powered curriculum content extraction. Fine-tuned Qwen Coder using RL/GRPO on a curated dataset, and trained Orpheus TTS and Qwen 3 TTS on Gujarati, Hindi, and Tamil. Built an AI avatar streaming pipeline for real-time talking-head video with lip-sync, optimized LLM/TTS inference with TensorRT and vLLM, and developed a real-time voice pipeline with STT and emotion detection.

AI Backend Engineer (Freelance – Upwork Top Rated) at Multiple Clients – Remote

January 1, 2024 - Present

Delivered production AI backends for 6+ client applications integrating LLM workflows, diffusion inference, and scalable APIs (e.g., couple.me, knoetik.ai, makeitquick.ai, clarifyme.ai, Voops, Design Homes AI). Implemented serverless LLM and diffusion inference pipelines (FastAPI, Celery, Redis, PostgreSQL), achieving approximately US$5,000/month in infrastructure cost savings. Shipped GPU-hosted RAG APIs using Llama.cpp, LlamaIndex, and FastAPI with Cloudflare Tunnels. Established observability with Prometheus and Grafana, and authored multi-layer LLM safety controls.

AI Backend Engineer (Freelance – Upwork Top Rated) at Multiple Clients – Upwork

January 1, 2024 - Present

Delivered production AI backends for 6+ client applications integrating LLM workflows, diffusion inference, and scalable APIs. Implemented serverless LLM + diffusion inference pipelines (FastAPI, Celery, Redis, PostgreSQL), achieving $5,000/month in infrastructure cost savings. Shipped GPU-hosted RAG APIs using Llama.cpp + LlamaIndex + FastAPI with Cloudflare Tunnels; established end-to-end monitoring with Prometheus & Grafana. Authored multi-layer LLM safety controls covering prompt-injection resistance, policy enforcement, and safe tool usage across agent toolchains.