

About Us:
Smartech, a division of the Digikala Group, develops marketing solutions and services for use within the group and on a broader scale. Our team focuses on advertising and delivering personalized ads. The data team's primary product is a Data Management Platform (DMP) that builds user profiles to enable better ad targeting and improved campaign performance.
Role Overview:
We are seeking an experienced Data Engineer to build and maintain scalable, distributed data pipelines and data lake infrastructure primarily deployed on-premises.
You will work closely with cross-functional teams to enable the real-time and batch processing that supports our DMP and analytics capabilities.
Key Responsibilities:
Design, develop, and optimize scalable ETL/ELT data pipelines using distributed processing frameworks like Apache Spark.
Manage and enhance data storage solutions to support large-scale data ingestion and retrieval.
Implement real-time data streaming pipelines with Apache Kafka or similar technologies to process event-level data.
Build and maintain fault-tolerant, highly available data processing systems on distributed clusters.
Collaborate with data scientists, analysts, and product teams to design datasets optimized for DMP use cases such as user segmentation and targeting.
Perform data modeling, indexing, and tuning to improve query performance.
Develop monitoring and alerting mechanisms for data pipelines and cluster health.
Explore and integrate new open-source technologies to improve system scalability and reduce processing latency.
Required Qualifications:
Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
2+ years of experience in data engineering or big data development, preferably in an AdTech or MarTech environment.
Strong proficiency in programming languages such as Python, Scala, or Java.
Hands-on experience with distributed data processing systems like Apache Spark or Apache Flink.
Experience with distributed messaging systems like Apache Kafka.
Deep knowledge of distributed storage systems such as HDFS and NoSQL databases (Cassandra, HBase, or MongoDB).
Solid SQL experience, including optimizing complex queries on large datasets.
Familiarity with cluster management and resource schedulers.
Experience working in on-premises or hybrid data infrastructure environments.
Strong problem-solving skills and the ability to troubleshoot distributed systems at scale.
Preferred Skills:
Exposure to container orchestration (e.g., Kubernetes) for managing on-premises clusters.
Knowledge of data governance and security best practices in large-scale data environments.
Understanding of, or experience with, AdTech components such as RTB, DSP, and SSP systems.
Understanding of DMP concepts and data management for audience segmentation and profile stitching.
Experience with batch and real-time machine learning feature pipelines.