We are looking for a talented and experienced Site Reliability Engineer (SRE) to join our Infrastructure Team. As an SRE, you will be responsible for maintaining and improving our Kubernetes and data stack, as well as Traffic and the other services that we provide to our product teams.
As an SRE at Cafe Bazaar, you will:
- Maintain and improve our diverse range of services, including both in-house developed and open-source services.
- Work closely with our product teams to ensure that our infrastructure is reliable, scalable, and secure.
- Participate in on-call rotations and support days to ensure that our systems are available and performing optimally.
- Troubleshoot and resolve issues related to our infrastructure and services.
- Develop and implement automation tools and processes to improve efficiency and reduce manual intervention.
- Monitor system performance and proactively identify and address potential issues.
- Continuously evaluate and improve our infrastructure and services to ensure that they meet the needs of our product teams.
Requirements
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 3+ years of experience in a Site Reliability Engineering role.
- Strong knowledge of cloud-native technologies such as Kubernetes and Prometheus.
- Experience with software engineering and architecture.
- Strong problem-solving and troubleshooting skills.
- Ability to work independently and as part of a team.
- Excellent communication and collaboration skills.
- Willingness to participate in on-call rotations and support days.
- Knowledge of data engineering and security is highly beneficial for this position.