Data Engineer

5 - 8 years

9 - 15 Lacs

Posted: 1 month ago | Platform: Naukri

Work Mode

Hybrid

Job Type

Full Time

Job Description

As a Data Engineer for our Large Language Model project, you will play a crucial role in designing, implementing, and maintaining the data infrastructure. Your expertise will be instrumental in ensuring the efficient flow of data, enabling seamless integration with various components, and optimizing data processing pipelines.

5+ years of relevant experience in data engineering roles.

Key Responsibilities

Data Pipeline Development - Design, develop, and maintain scalable and efficient data pipelines to support the training and deployment of large language models. Implement ETL processes to extract, transform, and load diverse datasets into suitable formats for model training.

Data Integration - Collaborate with cross-functional teams, including data scientists and software engineers, to integrate data sources and ensure the availability of relevant, high-quality data. Implement solutions for real-time data processing and integration so that model development can move quickly.

Data Quality Assurance - Establish and maintain robust data quality checks and validation processes to ensure the accuracy and consistency of datasets. Troubleshoot data quality issues, identify root causes, and implement corrective measures.

Infrastructure Management - Work closely with DevOps and IT teams to manage and optimize the data storage infrastructure, ensuring scalability and performance. Implement best practices for data security, access control, and compliance with data governance policies.

Performance Optimization - Identify bottlenecks and inefficiencies in data processing pipelines and implement optimizations to improve overall system performance. Continuously monitor and evaluate system performance metrics, making proactive adjustments as needed.

Skills & Tools

Programming Languages - Proficiency in languages such as Python for building robust data processing applications.

Big Data Technologies - Experience with large-scale data processing tools such as Apache Spark, Databricks, and dbt.

Database Systems - In-depth knowledge of both relational databases (e.g., MySQL, PostgreSQL) and NoSQL databases (e.g., MongoDB, Cassandra, vector databases).

Data Warehousing - Familiarity with data warehousing solutions such as Amazon Redshift, Google BigQuery, or Snowflake.

ETL Tools - Hands-on experience with ETL and workflow tools like Apache NiFi, Talend, or Apache Airflow.

Cloud Services - Experience with cloud platforms like AWS, Azure, or Google Cloud for deploying and managing data infrastructure.

Problem Solving - Analytical mindset with a proactive approach to identifying and solving complex data engineering challenges.

NLP - Knowledge of NLP will be an added advantage.

HCLTech

Information Technology Services

New Delhi
