خودرو45

Site Reliability Engineer

Tehran / Mirdamad
Full Time
Saturday to Wednesday
Loan, Bonus, Health insurance, Coffee shop, Occasional packages and gifts
201 - 500 employees
Internet Provider / E-commerce / Online Services
Iranian company dealing only with Iranian entities
1397
Privately held

Key Requirements

3 years of experience in a similar position

Job Description

Role Summary

We are looking for a Site Reliability Engineer who will work closely with our development teams to continuously improve the uptime, scalability, and reliability of our services. This role focuses on application‑level reliability, architecture best practices, automation, and developer enablement. It does not involve day‑to‑day infrastructure maintenance or sysadmin responsibilities.

Key Responsibilities

  • Partner with development teams to design and improve service architectures with a strong focus on reliability, scalability, and reducing operational toil
  • Contribute to defining and promoting 12‑Factor App and cloud‑native best practices across teams
  • Support development teams in deploying and optimizing their services on Kubernetes, without being responsible for infrastructure operations
  • Build internal tools, scripts, and automation (primarily in Python) to enhance delivery quality, observability, and operational efficiency
  • Define and implement SLOs/SLIs/SLAs and establish well‑structured reliability standards
  • Improve service observability by designing metrics, dashboards, and alerting
  • Participate in incident analysis and root cause investigations, focusing on application and service layers
  • Identify and automate repetitive processes to reduce operational overhead
  • Explore and leverage AI‑powered tools to improve development, testing, and operational workflows

Required Skills & Experience

  • Hands‑on experience deploying and debugging services on Kubernetes
  • Strong programming skills, preferably in Python
  • Solid understanding of SRE principles including SLO/SLA/SLI, error budgets, monitoring, and alerting
  • Strong familiarity with the 12‑Factor methodology and cloud‑native application design
  • Experience with observability tools (e.g., Prometheus, Grafana)
  • Ability to analyze complex service‑level issues and propose pragmatic solutions
  • Familiarity with CI/CD pipelines and release engineering practices

Nice to Have

  • Experience using AI tools to enhance development, debugging, testing, or operational workflows
  • Knowledge of containerization and modern deployment practices
  • Experience designing developer golden paths or platform engineering practices
  • Understanding of DevOps concepts and ability to collaborate effectively with DevOps and Infra teams

Personal Attributes

  • A passion for reducing toil and improving software quality through automation
  • Strong communication skills and ability to collaborate closely with development teams
  • Product‑oriented thinking with a focus on end‑to‑end service reliability
  • System‑level thinking and the ability to identify architectural bottlenecks

Job Requirements

Age
22 - 50 years old
Gender
Men / Women
