About the Position:
As an SRE, you will be responsible for ensuring the uptime, availability, and reliability of all applications owned by the Irancell-Labs Service Development Center. This role involves collaborating closely with DevOps and Operations teams to improve system stability, automation, and performance.
Responsibilities:
Monitoring & Logging:
Implement and manage monitoring, logging, and alerting solutions to improve system visibility and performance.
Enhance monitoring systems to detect and resolve issues proactively.
Automation & Performance Optimization:
Automate operational tasks and processes to increase efficiency and reduce manual effort.
Identify performance bottlenecks and implement remediation strategies.
Incident Management & Troubleshooting:
Investigate and troubleshoot system issues, ensuring quick recovery.
Analyze and resolve networking and infrastructure-related problems.
Apply knowledge of database administration when troubleshooting.
Production Environment & Deployment Management:
Deploy, manage, and enhance production environments.
Requirements:
Technical Expertise:
Strong knowledge of Linux OS and system administration.
Proficiency in scripting & programming (Python, Go, Bash).
Experience with CI/CD pipelines and automation tools.
Monitoring & Observability:
Familiarity with Grafana, ELK, Prometheus, and logging frameworks.
Containerization & Cloud Technologies:
Experience with Docker for containerization and Kubernetes for container orchestration.
Networking & Distributed Systems:
Solid understanding of networking principles, protocols, and troubleshooting techniques.
Strong grasp of distributed systems, microservices architecture, and cloud-native environments.