Azure Infrastructure & Kubernetes: Led the SRE team in building and operating Kubernetes and VM clusters for the AIOps platform with 10M+ DAU, managing 10,000+ Linux and Windows servers. Developed Helm charts for microservice deployment and versioning across multiple environments, maintaining 99.99% availability.
Terraform & IaC: Built production workloads at scale on AWS using Terraform for infrastructure as code. Managed high-concurrency services, including Nginx, LVS, and Kafka, and eliminated configuration drift.
Automation & Scripting: Developed Python, Bash, and PowerShell scripts to automate Kubernetes and VM cluster operations. Used Ansible and Argo Workflows to automate OS hardening and patching, reducing manual effort by 80% and MTTR by 35%.
Monitoring & Incident Response: Engineered full-stack observability using Prometheus, Grafana, and SQL. Designed advanced alerting/alarm logic and led on-call rotations, investigating root causes and implementing permanent fixes for high-severity outages.
Backup & Disaster Recovery: Designed and tested multi-region Disaster Recovery (DR) strategies and automated backup/recovery pipelines for PostgreSQL, MySQL, and S3, ensuring data integrity and meeting strict regulatory standards.
CI/CD & Security: Built and managed automated CI/CD pipelines using GitHub Actions, GitLab CI, and Jenkins, integrating ArgoCD for GitOps-based Kubernetes deployments.
Identity & Access Management (IAM): Implemented least-privilege access controls in Linux and Windows environments. Managed user permissions and system hardening to ensure 99% configuration compliance.
Networking & Connectivity: Tuned load balancing (Nginx, LVS) and resolved complex TCP/IP, DNS resolution, response latency, and routing issues. Managed VPN and firewall configurations to ensure secure connectivity across distributed environments.
Message Queue: Optimized kernel page cache parameters on Kafka brokers to keep high volumes of topic requests from writing to broker disk, stabilizing Kafka P99 write latency within 15 ms.
Collaboration & Documentation: Maintained comprehensive runbooks and technical documentation in English to streamline team hand-offs and improve system reliability.