Hi, I’m Naga Nikshith G, a research‑oriented Software Engineer and Tech Lead with 5+ years of experience across Large Language Model (LLM)/agentic systems, AI safety evaluation, backend engineering, and large‑scale data workflows. I enjoy translating research concepts into robust, scalable systems and collaborating with cross‑disciplinary research, engineering, and policy teams to advance responsible AI. I’m currently the lead engineer for Project Moonshot at NTU Singapore’s AI Safety Institute / Digital Trust Centre, where I design benchmarking platforms and reproducible evaluation methodologies for frontier LLMs, and build production‑grade experimentation pipelines on AWS and multi‑agent frameworks.

Naga Nikshith G

Available to hire

Experience Level

Expert

Language

English
Fluent

Work Experience

Tech Lead / Research Engineer II at Singapore AI Safety Institute / Digital Trust Centre (NTU Singapore)
November 1, 2024 - Present
Lead engineer for Project Moonshot, designing agentic LLM safety evaluation frameworks and large-scale benchmarking infrastructure. Built a production-ready LLM safety benchmarking pipeline on AWS Bedrock with a 4-phase workflow (tool generation → prompt generation → tool usage analysis → code generation), including retry logic, timeout handling, batch processing, and deduplication across 30+ risk categories. Developed synthetic data workflows generating hundreds of structured, risk-conditioned prompts per category. Architected a scalable Corrective Retrieval-Augmented Generation (C-RAG) system using Weaviate and FAISS, processing 10M+ embeddings and improving retrieval quality by ~40%. Implemented multi-agent evaluation systems (LangChain, LangGraph, CrewAI) with planner–executor and supervisor–worker patterns, automating end-to-end safety testing and analysis. Designed an automated tool-emulation generation pipeline with human-in-the-loop validation. Integrated OpenAI, AWS Bedrock, and T
Software Developer at MBM Cloud
January 1, 2023 - November 1, 2024
Designed and deployed NLP features for enterprise products (text classification, summarization, semantic search). Built conversational and document-understanding systems using Python, FastAPI, and lightweight transformer models. Created React dashboards to visualize sentiment trends, keyword extraction, and entity summaries. Optimized model inference and caching pipelines, reducing average latency by 35% while maintaining multilingual accuracy. Worked with data and product teams to curate labeled datasets and evaluate model performance (precision, recall, F1). Integrated analytics APIs with downstream applications to deliver actionable insights.
Analyst I at TATA Consultancy Services (TCS)
January 1, 2021 - July 1, 2022
Collaborated within a cross-functional team to enhance a financial services product. Utilized SQL to query large financial databases and generate management reports. Developed and tracked KPIs, driving a 20% improvement in operational efficiency. Supported ETL processes and ensured data integrity for analysis across operational and data warehouse environments. Managed the lifecycle of data issues using Jira.
Full Stack Developer (Part-time/Contract) at CPP SECRETS
August 1, 2019 - January 1, 2020
Developed and implemented multiple front-end and back-end features for the company website, including an interactive quiz application.

Education

Master of Science in Computer Science at University of North Carolina Charlotte
August 1, 2022 - December 1, 2023
Bachelor of Technology in Information Technology at Swami Vivekananda Institute of Technology
July 1, 2017 - July 1, 2021

Industry Experience

Software & Internet, Education, Professional Services, Media & Entertainment