Harsh Ghadiya

Available to hire

Hi, I’m Harsh Ghadiya, an AI/ML engineer with 4 years of experience designing and deploying full-stack AI solutions. I specialize in leveraging large language models, retrieval-augmented generation, and structured data extraction to create impactful tools that improve customer experiences. I enjoy building scalable, efficient AI backends using Python, Go, and cloud services like AWS, Azure, and Google Cloud.

I’m passionate about collaborating with cross-functional teams to bring AI innovation into production systems and enhancing AI observability and governance. With a strong background in AI infrastructure including Docker, Kubernetes, and Terraform, I focus on optimizing model performance and reducing latency for real-world applications. Always eager to contribute to cutting-edge AI research and development, I strive to make AI technology accessible and effective at scale.

Experience Level

Expert (5 skills)
Intermediate (9 skills)

Language

English: Fluent
Hindi: Advanced

Work Experience

AI & ML Engineer at State Street, USA
January 1, 2024 - Present
Engineered full-stack AI solutions by integrating LLM-powered Retrieval-Augmented Generation (RAG) models, improving data extraction efficiency by 30% and enhancing real-time decision-making. Developed and deployed AI inference pipelines on AWS ECS/Fargate, optimizing model execution time by 40% and reducing infrastructure costs. Built scalable AI-driven backend systems in Python with PostgreSQL on AWS RDS, handling millions of structured and unstructured data points, improving query performance by 50%. Optimized LLM-based AI Agents by implementing advanced prompt engineering techniques, improving response accuracy by 25% and reducing latency. Designed and implemented a robust AI infrastructure using Docker, Kubernetes, and Terraform for seamless deployment and scaling. Developed internal AI-powered developer tools and automation frameworks in TypeScript and Vue/Nuxt, increasing engineering productivity by 35%. Enhanced cloud-based AI monitoring and logging systems using AWS CloudWatch
AI & ML Engineer at Groovy Web, India
July 31, 2022 - July 18, 2025
Developed, delivered, and supported AI components leveraging PyTorch, AWS UltraClusters, and VectorDBs, optimizing LLM inference and similarity search to enhance scalability and performance in production-grade AI systems. Deployed scalable and responsible AI solutions on Azure cloud, leveraging advanced AI techniques to optimize model performance and reduce operational costs by 30% while improving system scalability by 20%. Developed and optimized foundation model training and evaluation processes, increasing model efficiency by 30% and enhancing real-time decision-making. Applied optimization techniques for training and inference software, enhancing hardware utilization, reducing latency, and improving throughput, resulting in a 20% reduction in operational expenses and 30% improvement in system performance. Improved AI model inference efficiency by 20% through Python and Scala development. Collaborated with engineers, research scientists, and product managers to design and deploy sca

Education

Master of Science at California State University, Northridge, CA
August 1, 2022 - May 31, 2024
Bachelor of Technology at Gujarat Technological University, India
August 1, 2016 - September 30, 2020

Industry Experience

Software & Internet, Financial Services, Professional Services