اسنپ باکس

NOC Engineer

Tehran / Zaferanieh
Full Time
Saturday to Wednesday 9 A.M. to 6 P.M.
201 - 500 employees
Internet Provider / E-commerce / Online Services

Key Requirements

2 years of experience in a similar position
MySQL - Intermediate
Kafka - Intermediate
MongoDB - Intermediate
Grafana - Intermediate

Job Description

We are seeking a motivated NOC Engineer to join our SRE team and support our mission to deliver reliable, high-availability services and maintain the health of our infrastructure. In this role, you will be responsible for 24/7 monitoring of system performance and uptime, ensuring rapid detection and escalation of incidents, and collaborating closely with technical teams to maintain operational excellence.

Responsibilities

  • Proactively monitor production systems, applications, and infrastructure using industry-standard tools;
  • Respond to alerts and incidents, performing initial triage and escalating issues to relevant teams as needed;
  • Ensure the continuous availability and health of Java-based applications, as well as critical frontend and backend services;
  • Assist in investigating recurring issues, identifying patterns, and contributing to RCA (Root Cause Analysis);
  • Maintain accurate shift logs and incident documentation, providing clear and concise reports to technical stakeholders;
  • Collaborate with SRE, DevOps, and development teams to improve monitoring coverage and alerting rules;
  • Identify opportunities for automation or process improvement, and support their implementation as experience grows;
  • Adhere to established operational procedures and contribute to their continuous improvement.

Requirements

  • At least 2 years of experience in a NOC, IT operations, or similar monitoring-focused role;
  • Willingness to work in a 24/7 shift rotation, including nights, weekends, and holidays;
  • Good working knowledge of Linux system administration and command-line troubleshooting (equivalent to LPIC-1 level);
  • Solid understanding of networking concepts and common protocols (equivalent to CCNA level);
  • Familiarity with monitoring and logging tools such as Prometheus, Grafana, ELK stack, or similar platforms;
  • Exposure to backend technologies and databases such as MySQL, MongoDB, Redis, HAProxy, Kafka, or RabbitMQ;
  • Ability to analyze alerts and logs, and perform effective initial troubleshooting;
  • Strong sense of responsibility and attention to detail in operational environments;
  • Good communication skills, with the ability to document incidents and escalate effectively;
  • Self-motivated, organized, and adaptable, with a passion for continuous learning and quality improvement.

Preferred Qualifications

  • Experience supporting Java-based applications in production environments;
  • Familiarity with incident management and escalation workflows;
  • Exposure to automation scripting (e.g., Bash, Python) or basic SRE tasks;
  • Familiarity with Docker and Kubernetes for container management;
  • Prior experience in a high-availability or cloud-based infrastructure environment;
  • Eagerness to learn and grow into more advanced SRE responsibilities.

Benefits

  • Transportation discount and voucher
  • Organizational food discount
  • Learning budget
  • Team-building budget
  • Wellness budget
  • Comprehensive health, dental, and vision insurance

Job Requirements

Gender
Men / Women
Software
Grafana - Intermediate
MySQL - Intermediate
MongoDB - Intermediate
Kafka - Intermediate
