Data Engineer with Machine Learning Specialization

5 - 10 years

15 - 30 Lacs

Posted:1 month ago| Platform: Naukri logo

Apply

Work Mode

Remote

Job Type

Full Time

Job Description

Job Requirement for Offshore Data Engineer (with ML expertise) Work Mode: Remote Base Location: Bengaluru Experience: 5+ Years Technical Skills & Expertise: PySpark & Apache Spark: Extensive experience with PySpark and Spark for big data processing and transformation. Strong understanding of Spark architecture, optimization techniques, and performance tuning. Ability to work with Spark jobs in distributed computing environments like Databricks. Data Mining & Transformation: Hands-on experience in designing and implementing data mining workflows. Expertise in data transformation processes, including ETL (Extract, Transform, Load) pipelines. Experience in large-scale data ingestion, aggregation, and cleaning. Programming Languages: Python & Scala: Proficient in Python for data engineering tasks, including using libraries like Pandas and NumPy. Scala proficiency is preferred for Spark job development. Big Data Concepts: In-depth knowledge of big data frameworks and paradigms, such as distributed file systems, parallel computing, and data partitioning. Big Data Technologies: Cassandra & Hadoop: Experience with NoSQL databases like Cassandra and distributed storage systems like Hadoop. Data Warehousing Tools: Proficiency with Hive for data warehousing solutions and querying. ETL Tools: Experience with Beam architecture and other ETL tools for large-scale data workflows. Cloud Technologies (GCP): Expertise in Google Cloud Platform (GCP), including core services like Cloud Storage, BigQuery, and DataFlow. Experience with DataFlow jobs for batch and stream processing. Familiarity with managing workflows using Airflow for task scheduling and orchestration in GCP. Machine Learning & AI: GenAI Experience: Familiarity with Generative AI and its applications in ML pipelines. ML Model Development: Knowledge of basic ML model building using tools like Pandas, NumPy, and visualization with Matplotlib. ML Ops Pipeline: Experience in managing end-to-end ML Ops pipelines for deploying models in production, particularly LLM (Large Language Models) deployments. RAG Architecture: Understanding and experience in building pipelines using Retrieval-Augmented Generation (RAG) architecture to enhance model performance and output. Tech stack : Spark, Pyspark, Python, Scala, GCP data flow, Data composer (Air flow), ETL, Databricks, Hadoop, Hive, GenAI, ML Modeling basic knowledge, ML Ops experience , LLM deployment, RAG

Mock Interview

Practice Video Interview with JobPe AI

Start PySpark Interview
cta

Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.

Job Application AI Bot

Job Application AI Bot

Apply to 20+ Portals in one click

Download Now

Download the Mobile App

Instantly access job listings, apply easily, and track applications.

coding practice

Enhance Your Python Skills

Practice Python coding challenges to boost your skills

Start Practicing Python Now

RecommendedJobs for You

Pune, Chennai, Bengaluru