Posted: 2 weeks ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

Working Hours:

Full Time

Locations:

Hyderabad

Experience:

4–6 years

About The Role

Soothsayer Analytics is a global AI & Data Science consultancy headquartered in Detroit, with a thriving delivery center in Hyderabad. We design and deploy end-to-end custom Machine Learning & GenAI solutions, spanning predictive analytics, optimization, NLP, and enterprise-scale AI platforms, that help leading enterprises forecast, automate, and gain a competitive edge.

As a Data Engineer, you will build the foundation that powers these AI systems: scalable, secure, and high-performance data pipelines.

Job Overview

We seek a Data Engineer (Mid-level) with 4–6 years of hands-on experience in designing, building, and optimizing data pipelines. You will work closely with AI/ML teams to ensure data availability, quality, and performance for analytics and GenAI use cases.

Key Responsibilities

Data Pipeline Development

  • Build and maintain scalable ETL/ELT pipelines for structured and unstructured data.
  • Ingest data from diverse sources (APIs, streaming, batch systems).

Data Modeling & Warehousing

  • Design efficient data models to support analytics and AI workloads.
  • Develop and optimize data warehouses/lakes using Redshift, BigQuery, Snowflake, or Delta Lake.

Big Data & Streaming

  • Work with distributed systems like Apache Spark, Kafka, or Flink for real-time/large-scale data processing.
  • Manage feature stores for ML pipelines.

Collaboration & Best Practices

  • Work closely with Data Scientists and ML Engineers to ensure high-quality training data.
  • Implement data quality checks, observability, and governance frameworks.

Required Skills & Qualifications

Education:

Bachelor’s/Master’s degree in Computer Science, Data Engineering, or a related field.

Experience:

4–6 years in data engineering with expertise in:
  • Programming: Python/Scala/Java (Python preferred).
  • Big Data & Processing: Apache Spark, Kafka, Hadoop.
  • Databases: SQL/NoSQL (Postgres, MongoDB, Cassandra).
  • Data Warehousing: Snowflake, Redshift, BigQuery, or similar.
  • Orchestration: Airflow, Luigi, or similar.
  • Cloud Platforms: AWS, Azure, or GCP (data services).
  • Version Control & CI/CD: Git, Jenkins, GitHub Actions.
  • MLOps/GenAI pipelines: feature engineering, embeddings, vector DBs.

Skills Matrix

Candidates must submit a detailed resume and fill out the following matrix:

Skill | Details | Skills Last Used | Experience (months) | Self-Rating (0–10)
Python | | | |
SQL / NoSQL | | | |
Apache Spark | | | |
Kafka | | | |
Data Warehousing (Snowflake, Redshift, etc.) | | | |
Orchestration (Airflow, Luigi, etc.) | | | |
Cloud (AWS / Azure / GCP) | | | |
Data Quality / Governance Tools | | | |
MLOps / LLMOps | | | |
GenAI Integration | | | |

Instructions For Candidates

  • Provide a detailed resume highlighting end-to-end data engineering projects.
  • Fill out the above skills matrix with accurate dates, duration, and self-ratings.
