Lead Engineer - Java FS with Streaming

5 - 10 years

12 - 16 Lacs

Posted: 1 week ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

We are looking for a skilled and motivated Lead Engineer to join the Cosmos.AI platform team, focusing on the development of data-driven AI/ML. You will design and build high-performance backend systems that serve features at low latency, including feature store, streaming, vector search, and graph compute platforms. You will also ensure consistency between training and inference environments and integrate with a range of data storage and computation layers. This is a high-impact role on a technically ambitious team, working at the intersection of AI, big data, and platform engineering.

Responsibilities

  • Design, build, and maintain robust, scalable backend services for the AI/ML Feature platform using Java, Spring Boot, and modern cloud-native technologies.
  • Develop APIs and runtime systems to support low-latency, multi-modal feature serving (aggregation, structured, vector, time-series, graph-based, etc.).
  • Implement AI/ML computation applications, both real-time and batch, integrating with streaming platforms (e.g., Kafka, Dataproc) and storage.
  • Ensure data/feature consistency and parity between online and offline environments to avoid training/serving skew.
  • Optimize system performance through caching strategies, asynchronous processing, and storage selection tailored to feature types.
  • Build and maintain integration layers with upstream data producers and downstream model inference consumers.
  • Implement monitoring, alerting, and observability features to ensure production reliability.
  • Collaborate cross-functionally with Data Science, MLOps, Infrastructure, and Governance teams to deliver end-to-end capabilities.
  • Participate in on-call support rotations and contribute to incident resolution and root cause analysis.

Requirements

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
  • 5+ years of experience developing backend systems using Java, including frameworks like Spring Boot.
  • Strong understanding of distributed systems, API design, and multi-threaded programming.
  • Practical knowledge of infrastructure components (compute, networking, and storage) within modern cloud environments.
  • Familiarity with microservices architecture, containerization (e.g., Docker, Kubernetes), and CI/CD pipelines.
  • Passion for performance tuning, low-latency systems, and building for scale.
  • Experience working on machine learning platforms or feature stores.
  • Hands-on experience with vector databases (e.g., FAISS, Milvus), graph databases (e.g., Neo4j), or time-series databases (e.g., InfluxDB, Prometheus).
  • Experience with one or more of the following: BigQuery, HBase, Aerospike, Kafka, Flink, Spark (or Dataproc), or similar systems.
  • Solid understanding of the AI/ML development lifecycle, with familiarity in MLOps practices such as model deployment, serving, and monitoring.
  • Knowledge of public cloud infrastructure such as Google Cloud Platform is a strong plus.
  • Knowledge of data governance, access control, and privacy practices in feature/data platforms.
  • Contributions to open-source ML infrastructure or platform projects are a plus.
