

Key Responsibilities
- Design, implement, and maintain Kubernetes clusters.
- Build and manage CI/CD pipelines (GitHub Actions, GitLab CI, ArgoCD) to streamline application delivery.
- Monitor system reliability, performance, and capacity using observability tools (Prometheus, Grafana, ELK stack).
- Troubleshoot complex production issues across network, Linux OS, containers, and applications.
- Automate infrastructure provisioning using Infrastructure as Code (Terraform, Ansible).
- Harden security at all layers (network policies, RBAC in Kubernetes, secrets management).
Required Technical Skills
- Deep understanding of networking concepts: TCP/IP, DNS, HTTP/HTTPS, load balancers, firewalls, VPNs, and network troubleshooting.
- Expertise in the Linux operating system (Debian, CentOS): system tuning, process management, file systems, systemd, kernel parameters.
- Production experience with Kubernetes: cluster lifecycle management, Helm, Kustomize, operators, CNI, CSI, Ingress controllers, and resource quotas.
- CI/CD tools: GitLab CI, GitHub Actions, ArgoCD.
- Containerization: Docker, CRI-O, image build tools.
- Observability stack: Prometheus, Grafana.
- Infrastructure as Code: Ansible, Terraform, or similar tools.
- Scripting & automation: Python, Go, or Bash for tooling and automation tasks.
Required Soft Skills
- Problem-solving under pressure: ability to stay calm and systematic during incident response and outage resolution.
- Strong communication: clearly articulate technical issues to developers, product managers, and leadership.
- Mentorship: guide junior engineers through code reviews, design discussions, and operational best practices.
- Collaboration: work effectively with development teams to design for reliability and observability from the start.
- Proactive mindset: identify potential bottlenecks, security risks, or reliability gaps before they become incidents.
- Continuous learning: keep up with evolving cloud-native tools and SRE methodologies.
- Documentation habit: create and maintain runbooks, architecture diagrams, and post-mortem reports.
- Familiarity with operating databases and data services in production (PostgreSQL, MySQL, Redis, Kafka).