Data Engineer

6 - 0 years

0 Lacs

Posted: 1 day ago | Platform: Indeed

Work Mode

Remote

Job Type

Full Time

Job Description

28 days ago
TESCRA India
DESCRIPTION

We are seeking a highly skilled and motivated candidate with expertise in programming, problem-solving, machine learning (ML), and artificial intelligence (AI). The ideal candidate will have strong programming skills, particularly in Python, and experience using key data manipulation libraries such as Pandas and NumPy.


Key Requirements:


1. Proficiency in Python and its data manipulation libraries (e.g., Datasets, Pandas, NumPy).

2. Demonstrable experience designing and building scalable data pipelines for collecting, cleaning, transforming, and versioning large-scale datasets (text, code, structured) specifically for ML model training / ML applications.

3. Hands-on experience in preparing and formatting diverse datasets into specific structures required for ML model training.

4. Experience in curating, cleaning, and structuring datasets for ML model evaluation, ensuring data quality and relevance for various benchmarks.

5. Familiarity with common challenges in training data preparation, such as bias detection/mitigation, data distribution analysis, and data augmentation techniques.

6. Solid understanding of data engineering best practices, including data quality, versioning, and efficient data processing.


Desirable skills:

1. LLM fine-tuning.

2. Dataset preparation for LLMs, including prompt-completion, instruction-following, and chat formats.

3. Familiarity with LLM model evaluation strategies.

4. Familiarity with common challenges in LLM data preparation, such as bias detection/mitigation, data distribution analysis, and data augmentation techniques.

5. Familiarity with the Hugging Face Transformers and Datasets libraries is highly desirable.


In addition to technical expertise, the ideal candidate will have experience with Git and GitHub for version control and a proven ability to collaborate effectively in a team environment, particularly when working on shared codebases and remote projects. Strong data management and manipulation skills are crucial, as is experience working on remote servers to develop and deploy machine learning models.

QUALIFICATIONS
Must Have Skills
  • Python
  • Pandas
  • NumPy
  • Data pipelines
  • Data cleaning
  • Data transformation
  • Machine learning
  • Data engineering
  • Data quality
  • Git
  • GitHub
Good To Have Skills
  • LLM fine-tuning
  • Hugging Face Transformers
  • Hugging Face Datasets
Minimum Education Level
No Education Requirement
Years of Experience
6-0 years
ADDITIONAL INFORMATION
Pay Range: Not specified
Work Type: Full Time
Location: Bangalore Karnataka, India
Job ID: Tescra-Zeb-DAF750
