João Simões

I'm a DevOps / SRE Engineer with a strong software engineering background, focused on building and operating reliable, scalable systems end-to-end. I own production infrastructure, make architectural decisions, and iterate based on real-world performance and failures. I enjoy working across the stack—from Kubernetes and cloud infrastructure to backend services—and I'm comfortable debugging issues directly in application code when needed. I prioritize simplicity, fast iteration, and full ownership of systems in production.

Available to hire

Experience Level

Expert
Expert
Expert
Expert
Expert
Expert
Intermediate
Intermediate

Language

Portuguese
Fluent
English
Fluent
French
Intermediate
German
Beginner

Work Experience

Lead DevOps / Platform Engineer at CriticalTechWorks
January 1, 2025 - Present
Redesigned the deployment architecture of a Kubernetes-based microservices platform, reducing deployment time from ~40 minutes to under 10 minutes by restructuring Helm charts and optimizing CI/CD workflows. Built and operated a large-scale data ingestion platform processing ~20 PB/month, identifying bottlenecks in S3 and EFS usage and introducing parallelization strategies to improve throughput and reliability. Implemented a full observability stack (Prometheus, Grafana, Tempo, Loki, OpenTelemetry), defining metrics, traces, and logs to debug distributed production systems. Developed automation to convert infrastructure alerts into actionable incidents, reducing manual triage and improving response times. Debugged production issues across Kubernetes workloads and Python services, working beyond infrastructure to resolve application-level failures.
DevOps / Cloud Engineer at KPMG
January 1, 2022 - January 1, 2025
Designed and implemented Terraform-based infrastructure for a microservices platform used daily by 100+ engineers, owning the full CI/CD lifecycle. Built and maintained CI/CD pipelines (GitLab CI, Jenkins) for consistent build, testing, and deployment workflows across multiple environments. Reduced release cycle time from days to hours by introducing automated pipelines and reusable environment templates. Collaborated directly with development teams to debug deployment issues and improve service reliability.

Education

M.Sc. Computer Science and Engineering at Instituto Superior Técnico (IST), Lisbon
January 1, 2020 - January 1, 2023
B.Sc. Computer Engineering at ISCTE Business School, Lisbon
January 1, 2017 - January 1, 2020

Qualifications

Certified Kubernetes Administrator (CKA)
September 1, 2024 - April 17, 2026

Industry Experience

Software & Internet, Financial Services, Professional Services, Manufacturing, Transportation & Logistics