Key Responsibilities
- Design, maintain and optimize modern cloud infrastructures with a focus on Kubernetes-based environments
- Ensure high availability, reliability and performance of infrastructure and services
- Monitor infrastructure and applications, respond to alerts and prevent incidents
- Participate in standby rotations and manage production incidents outside of standard business hours
- Optimize databases and infrastructure components for performance, scalability and cost efficiency
- Design, implement and maintain CI/CD pipelines, Dockerfiles and infrastructure automation workflows
- Continuously improve system visibility, resiliency and deployment processes
- Work closely with engineering teams to support the full application lifecycle
Competencies
- Strong professional experience in infrastructure operations and production environments
- Advanced troubleshooting skills across networking, DNS, web services and distributed systems
- Deep understanding of Linux-based systems and cloud-native architectures
- Expert-level proficiency in Docker, including a deep understanding of container internals
- Strong knowledge of networking, storage, security and Kubernetes scalability mechanisms
- Strong scripting skills in at least one language such as Bash, Python or Go
- Ability to monitor, tune and optimize databases for production workloads
- Experience implementing comprehensive monitoring solutions using tools such as Prometheus
- Familiarity with modern CI/CD tools and best practices for secure deployments