Mobile Telecommunication Company of Iran - Hamrah-e Aval

Senior Site Reliability Engineer (SRE) - (Hamrah-e Aval Research and Innovation Center)

Tehran / Tarasht
Full Time
Saturday to Wednesday 7:00 to 15:50 / 8:30 to 17:20 (Flextime)
Bonus - Health insurance - Occasional packages and gifts
1001 - 5000 employees
Telecom
Iranian company dealing with Iranian and foreign customers
1373
Privately held

Key Requirements

5 years of experience in a similar position
GitLab - Advanced

Job Description

Key Responsibilities
- Design, implement, and maintain Kubernetes clusters.
- Build and manage CI/CD pipelines (GitHub Actions, GitLab CI, ArgoCD) to streamline application delivery.
- Monitor system reliability, performance, and capacity using observability tools (Prometheus, Grafana, ELK stack).
- Troubleshoot complex production issues across network, Linux OS, containers, and applications.
- Automate infrastructure provisioning using Infrastructure as Code (Terraform, Ansible).
- Harden security at all layers (network policies, RBAC in Kubernetes, secrets management).

Required Technical Skills
- Deep understanding of networking concepts: TCP/IP, DNS, HTTP/HTTPS, load balancers, firewalls, VPN, and network troubleshooting.
- Expertise in the Linux operating system (Debian, CentOS): system tuning, process management, file systems, systemd, kernel parameters.
- Production experience with Kubernetes: cluster lifecycle management, Helm, Kustomize, operators, CNI, CSI, Ingress controllers, and resource quotas.
- CI/CD tools: GitLab CI, GitHub Actions, ArgoCD.
- Containerization: Docker, CRI-O, build tools.
- Observability stack: Prometheus, Grafana.
- Infrastructure as Code: Ansible, Terraform, or similar tools.
- Scripting & automation: Python, Go, or Bash for tooling and automation tasks.

Required Soft Skills
- Problem-solving under pressure: ability to stay calm and systematic during incident response and outage resolution.
- Strong communication: clearly articulate technical issues to developers, product managers, and leadership.
- Mentorship: guide junior engineers through code reviews, design discussions, and operational best practices.
- Collaboration: work effectively with development teams to design for reliability and observability from the start.
- Proactive mindset: identify potential bottlenecks, security risks, or reliability gaps before they become incidents.
- Continuous learning: keep up with evolving cloud-native tools and SRE methodologies.
- Documentation habit: create and maintain runbooks, architecture diagrams, and post-mortem reports.
- Familiarity with operating data services in production (PostgreSQL, MySQL, Redis, Kafka).

Job Requirements

Gender
Men / Women
Software
GitLab | Advanced
