Harshita Kaur Chugh
Available to hire
I’m a Site Reliability Engineer | Developer | DevOps Specialist with 5+ years of experience in Java/Spring Boot, AWS, and CI/CD automation. Skilled in monitoring, performance optimization, and cloud infrastructure, I build reliable, scalable systems and deliver fast, high-quality results. Based in Toronto, Canada, I’m a quick learner passionate about driving automation and uptime excellence.
Skills
Experience Level
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Intermediate
Language
English
Fluent
Work Experience
Site Reliability Engineer at American Express
June 1, 2025 - June 1, 2025
Led production support and mission-critical incident management for 5+ Java/Spring Boot applications in enterprise environments. Coordinated RCA and incident response for 100+ high-severity incidents. Implemented proactive monitoring with Splunk, Dynatrace, and Grafana; provided CI/CD support with Jenkins, GitHub Actions, and Bitbucket; reduced MTTR and ensured reliable production operations through permanent fixes and automation.
Production Support & Incident Management Engineer at American Express
June 1, 2025 - June 1, 2025
Led incident response to 100+ high-severity incidents in a 24/7 production environment, coordinating cross-functional teams to restore services quickly and drive root-cause analysis (RCA). Reduced system downtime by 30% through proactive monitoring, RCA ownership, and implementation of permanent fixes. Applied ITIL/ITSM processes to standardize production support workflows. Defined and tracked SLIs/SLOs for key services to align with business SLAs. Designed alert strategies in Splunk, Dynatrace, and Grafana, reducing alert noise and MTTR. Led Post-Implementation Validation automation, saving 200+ man-hours across 100+ apps. Ensured smooth execution of batch scheduling jobs and production workflows, eliminating delays and data mismatches. Provided DB & platform support with SQL/PL-SQL for PostgreSQL and Oracle. Supported Spring Boot/J2EE applications, identified performance bottlenecks, and coordinated fixes with development teams. Managed secure file transfer operations by renewing 100
Education
Bachelor of Technology - Computer Science Engineering at University of Petroleum and Energy Studies
January 1, 2016 - January 1, 2020
Qualifications
ITIL / ITSM
January 11, 2030 - November 3, 2025
Post-Implementation Validation Automation
January 11, 2030 - November 3, 2025
ITIL Certification
January 11, 2030 - November 3, 2025
Industry Experience
Software & Internet, Professional Services, Financial Services