NOC Engineer

(24 days ago)

Snapp Box 4.1

Tehran / Zaferanieh

Full Time

Working days and hours

Saturday to Wednesday, 9 A.M. to 6 P.M.

Business trips

Facilities and Benefits

About the Company

Company Size

201 - 500 employees

Industry

Internet Provider / E-commerce / Online Services

More Details

Key Requirements

2 years of experience in a similar position

MySQL - Intermediate

Kafka - Intermediate

MongoDB - Intermediate

Grafana - Intermediate

Job Description

We are seeking a motivated NOC Engineer to join our SRE team, supporting our mission to deliver reliable, high-availability services and maintain the health of our infrastructure. In this role, you will be responsible for 24/7 monitoring of system performance and uptime, ensuring rapid detection and escalation of incidents, and collaborating closely with technical teams to maintain operational excellence.

Responsibilities

Proactively monitor production systems, applications, and infrastructure using industry-standard tools;
Respond to alerts and incidents, performing initial triage and escalating issues to relevant teams as needed;
Ensure the continuous availability and health of Java-based applications, as well as critical frontend and backend services;
Assist in investigating recurring issues, identifying patterns, and contributing to RCA (Root Cause Analysis);
Maintain accurate shift logs and incident documentation, providing clear and concise reports to technical stakeholders;
Collaborate with SRE, DevOps, and development teams to improve monitoring coverage and alerting rules;
Identify opportunities for automation or process improvement, and support their implementation as experience grows;
Adhere to established operational procedures and contribute to their continuous improvement.

Requirements

At least 2 years of experience in a NOC, IT operations, or similar monitoring-focused role;
Willingness to work in a 24/7 shift rotation, including nights, weekends, and holidays;
Good working knowledge of Linux system administration and command-line troubleshooting (equivalent to LPIC-1 level);
Solid understanding of networking concepts and common protocols (equivalent to CCNA level);
Familiarity with monitoring and logging tools such as Prometheus, Grafana, ELK stack, or similar platforms;
Exposure to backend technologies and databases such as MySQL, MongoDB, Redis, HAProxy, Kafka, or RabbitMQ;
Ability to analyze alerts and logs, and perform effective initial troubleshooting;
Strong sense of responsibility and attention to detail in operational environments;
Good communication skills, with the ability to document incidents and escalate effectively;
Self-motivated, organized, and adaptable, with a passion for continuous learning and quality improvement.

Preferred Qualifications

Experience supporting Java-based applications in production environments;
Familiarity with incident management and escalation workflows;
Exposure to automation scripting (e.g., Bash, Python) or basic SRE tasks;
Familiarity with Docker and Kubernetes for container management;
Prior experience in a high-availability or cloud-based infrastructure environment;
Eagerness to learn and grow into more advanced SRE responsibilities.

Benefits

Transportation discount and voucher
Organizational food discount
Learning budget
Team-building budget
Wellness budget
Comprehensive health, dental, and vision insurance

Job Requirements

Gender

Men / Women

Software

Grafana | Intermediate

MySQL | Intermediate

MongoDB | Intermediate

Kafka | Intermediate
