Statsby Solutions

1 job opening at Statsby Solutions
Data Engineer | Pune, Maharashtra, India | 6 years | Not disclosed | On-site | Full Time

Data Engineer (Scala | Spark | Azure)

We are seeking a Scala- and Spark-focused Data Engineer with around 6 years of experience to join our data platform team. This is a hands-on role focused on building, optimizing, and operating scalable batch and streaming data pipelines on Azure. You will work closely with engineering, analytics, and product teams to design reliable data systems and evolve our data architecture to support current and future data initiatives. This role suits someone comfortable owning production pipelines and working deeply with distributed systems.

Responsibilities

- Design, build, and maintain scalable data pipelines using Scala and Apache Spark
- Develop and optimize batch and streaming data processing solutions
- Build, operate, and support data platforms on Azure
- Implement and maintain streaming pipelines using Kafka and Spark Structured Streaming
- Optimize Spark workloads for performance, cost, and reliability
- Develop robust ETL and ELT workflows for large-scale datasets
- Collaborate with Engineering, Analytics, and Product teams on data requirements
- Ensure data quality, security, and access controls across the data platform
- Monitor, troubleshoot, and resolve production data pipeline issues
- Contribute to data architecture decisions and engineering best practices

Required Skills and Experience

- Around 6 years of experience in data engineering or backend data systems
- Strong hands-on experience with Scala for data processing
- Deep experience with Apache Spark, including performance tuning and optimization
- Experience running Spark workloads on Azure, including:
  - Azure Databricks or Spark on Azure Synapse
  - Azure Data Lake Storage
- Experience with Kafka for streaming data pipelines
- Working knowledge of Hadoop ecosystem tools, including Hive
- Hands-on experience deploying and operating applications on Kubernetes
- Experience containerizing Spark applications and managing Kubernetes-based deployments
- Strong understanding of distributed systems, data modeling, and data pipeline design

What We're Looking For

- Proven experience owning production-grade data pipelines
- Strong problem-solving skills with a focus on reliability and scalability
- Ability to work independently while collaborating across teams
- Clear and practical communication with technical and non-technical stakeholders