We have developed API gateway aggregators using frameworks like Hystrix and Spring Cloud Gateway for circuit breaking and parallel processing (a minimal sketch of this pattern appears below).
Our serving microservices handle more than 15K RPS on normal days, and during sale days this can rise to 30K RPS. Being a consumer app, these systems have SLAs of ~10ms.
Our distributed scheduler periodically tracks more than 50 million shipments from different partners and performs async processing involving an RDBMS.
We use an in-house video streaming platform to support a wide variety of devices and networks.
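As a rough, hypothetical illustration of the circuit-breaking and parallel aggregation pattern mentioned above (not our actual gateway code), the sketch below wraps a downstream call in a Hystrix command so a failing dependency trips the circuit and serves a fallback, and uses queue() to fan two such calls out in parallel. The command group, payloads, and fallback value are placeholder assumptions.

```java
import java.util.concurrent.Future;
import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;

// Wraps one downstream call so the gateway can trip a circuit and serve a fallback.
class ProductCommand extends HystrixCommand<String> {
    private final String productId;

    ProductCommand(String productId) {
        super(HystrixCommandGroupKey.Factory.asKey("ProductService")); // hypothetical group key
        this.productId = productId;
    }

    @Override
    protected String run() {
        // Placeholder for the real HTTP call to the downstream product service.
        return "{\"id\":\"" + productId + "\"}";
    }

    @Override
    protected String getFallback() {
        // Returned when the circuit is open, the call times out, or run() throws.
        return "{}";
    }
}

class AggregatorExample {
    public static void main(String[] args) throws Exception {
        // queue() runs each command on Hystrix's thread pool, so the two calls proceed in parallel.
        Future<String> product = new ProductCommand("42").queue();
        Future<String> related = new ProductCommand("43").queue();
        System.out.println(product.get() + " " + related.get());
    }
}
```

Spring Cloud Gateway offers a comparable circuit-breaker filter; the command-per-dependency structure above is just the core idea.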
What You'll Do
- Design and implement scalable and fault-tolerant data pipelines (batch and streaming) using frameworks like Apache Spark, Flink, and Kafka.
- Lead the design and development of data platforms and reusable frameworks that serve multiple teams and use cases.
- Build and optimize data models and schemas to support large-scale operational and analytical workloads.
- Deeply understand Apache Spark internals and be capable of modifying or extending the open-source Spark codebase as needed.
- Develop streaming solutions using tools like Apache Flink and Spark Structured Streaming (a minimal Structured Streaming example is sketched after this list).
- Drive initiatives that abstract infrastructure complexity, enabling ML, analytics, and product teams to build faster on the platform.
- Champion a platform-building mindset focused on reusability, extensibility, and developer self-service.
- Ensure data quality, consistency, and governance through validation frameworks, observability tooling, and access controls.
- Optimize infrastructure for cost, latency, performance, and scalability in modern cloud-native environments.
- Mentor and guide junior engineers, contribute to architecture reviews, and uphold high engineering standards.
- Collaborate cross-functionally with product, ML, and data teams to align technical solutions with business needs.
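By way of illustration only (the brokers, topic, and paths below are placeholder assumptions, not our actual setup), a minimal fault-tolerant streaming job of the kind described above might read a Kafka topic with Spark Structured Streaming and write checkpointed output like this:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.streaming.StreamingQuery;

public class OrdersStream {
    public static void main(String[] args) throws Exception {
        // Requires the spark-sql-kafka-0-10 connector on the classpath.
        SparkSession spark = SparkSession.builder()
                .appName("orders-stream") // hypothetical job name
                .getOrCreate();

        // Read the Kafka topic as an unbounded DataFrame.
        Dataset<Row> raw = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "broker:9092") // placeholder brokers
                .option("subscribe", "orders")                    // hypothetical topic
                .load();

        // Kafka values arrive as bytes; cast to string for downstream parsing.
        Dataset<Row> values = raw.selectExpr("CAST(value AS STRING) AS json");

        // Write micro-batches to a file sink; the checkpoint enables recovery after failures.
        StreamingQuery query = values.writeStream()
                .format("parquet")
                .option("path", "s3://bucket/orders/")               // placeholder path
                .option("checkpointLocation", "s3://bucket/_chk/")   // placeholder path
                .start();

        query.awaitTermination();
    }
}
```

The checkpoint location is what lets the query restart from where it left off after a failure instead of reprocessing or dropping data.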
What We're Looking For
- 5-8 years of professional experience in software/data engineering with a focus on distributed data systems.
- Strong programming skills in Java, Scala, or Python, and expertise in SQL.
- At least 2 years of hands-on experience with big data systems including Apache Kafka, Apache Spark/EMR/Dataproc, Hive, Delta Lake, Presto/Trino, Airflow, and data lineage tools (e.g., DataHub, Marquez, OpenLineage).
- Experience implementing and tuning Spark/Delta Lake/Presto at terabyte scale or beyond (a minimal partitioned Delta write is sketched after this list).
- Strong understanding of Apache Spark internals (Catalyst, Tungsten, shuffle, etc.) with experience customizing or contributing to open-source code.
- Familiarity and hands-on experience with modern open-source and cloud-native data stack components such as:
- Apache Iceberg, Hudi, or Delta Lake
- Trino/Presto, DuckDB, ClickHouse, Pinot, or Druid
- Airflow, Dagster, or Prefect
- DBT, Great Expectations, DataHub, or OpenMetadata
- Kubernetes, Terraform, Docker
- Strong analytical and problem-solving skills, with the ability to debug complex issues in large-scale systems.
- Exposure to data security, privacy, observability, and compliance frameworks is a plus.
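As a small, hypothetical illustration of the Spark-plus-Delta Lake work referenced above (the job name, paths, and column names are assumptions), a partitioned batch write to a Delta table might look like:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class DeltaBatchWrite {
    public static void main(String[] args) {
        // Delta Lake requires the delta-spark package plus these two session settings.
        SparkSession spark = SparkSession.builder()
                .appName("delta-batch-write") // hypothetical job name
                .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
                .config("spark.sql.catalog.spark_catalog",
                        "org.apache.spark.sql.delta.catalog.DeltaCatalog")
                .getOrCreate();

        // Placeholder source; in practice this would be a large raw dataset.
        Dataset<Row> events = spark.read().parquet("s3://bucket/raw/events/");

        // Partitioning by date keeps terabyte-scale scans pruned to the partitions a query touches.
        events.write()
                .format("delta")
                .mode(SaveMode.Overwrite)
                .partitionBy("event_date") // assumes an event_date column exists
                .save("s3://bucket/curated/events_delta/");
    }
}
```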
Good to Have
- Contributions to open-source projects in the big data ecosystem (e.g., Spark, Kafka, Hive, Airflow)
- Hands-on data modeling experience and exposure to end-to-end data pipeline development
- Familiarity with OLAP data cubes and BI/reporting tools such as Tableau, Power BI, Superset, or Looker
- Working knowledge of tools and technologies like ELK Stack (Elasticsearch, Logstash, Kibana), Redis, and MySQL
- Exposure to backend technologies including RxJava, Spring Boot, and Microservices architecture