We are a technology company operating in the artificial intelligence and cloud infrastructure space. We focus on providing high‑performance GPU infrastructure for AI workloads, including large language models (LLMs). As we expand our platform, we are looking for a skilled DevOps Engineer to help deploy, manage, and scale LLM services on GPU infrastructure.
Job Description
In this role, you will deploy, maintain, and optimize large language models running on GPU servers. You will work closely with engineering and product teams to keep our AI infrastructure reliable, scalable, and high‑performance.
Responsibilities
- Deploy and host LLMs on GPU servers
- Maintain and monitor GPU infrastructure and AI model services
- Optimize performance and resource utilization for GPU workloads
- Implement automation for deployment, scaling, and maintenance of AI services
- Set up monitoring, logging, and alerting for AI infrastructure
- Troubleshoot infrastructure, deployment, and performance issues
- Collaborate with engineering teams to support AI applications and APIs
- Ensure high availability, reliability, and security of the infrastructure
Requirements
- Experience with DevOps practices and infrastructure management
- Strong experience with Linux server administration
- Experience working with Docker and containerized environments
- Familiarity with GPU environments and CUDA‑based workloads
- Experience deploying or managing machine learning or LLM services is a strong plus
- Experience with orchestration tools such as Kubernetes is a plus
- Familiarity with monitoring tools and infrastructure automation
- Strong problem‑solving skills and ability to work in a fast‑paced environment