Data Engineer (Python) - LLMs

8 years

0 Lacs

Posted: 2 days ago | Platform: LinkedIn

Work Mode

Remote

Job Type

Full Time

Job Description

About Us

MyRemoteTeam, Inc is a fast-growing distributed workforce enabler, helping companies scale with top global talent. We empower businesses by providing world-class software engineers, operations support, and infrastructure to help them grow faster and better.


Position:

Client:

Location:

Commitment:

Experience:

We are looking for an experienced Python professional with strong expertise in large-scale data processing. This role involves building and maintaining automated data pipelines that process massive text datasets used for AI and LLM training. The ideal candidate will have deep hands-on experience in Python, strong data engineering skills, and the ability to work closely with ML and AI teams.

Key Responsibilities

  • Design and develop scalable ETL/ELT pipelines using Python.
  • Ingest, process, clean, deduplicate, and normalize large text datasets.
  • Work with diverse data formats such as JSON, CSV, XML, and Parquet.
  • Ensure high data quality and establish quality-check standards.
  • Optimize pipelines for speed, cost efficiency, and reliability.
  • Collaborate with AI/ML teams on data requirements and training workflows.
  • Support model training by investigating data-related issues when required.

Required Skills

  • 8+ years in data engineering, backend engineering, or data processing roles.
  • Strong expertise in Python and libraries such as Pandas, NumPy, Dask, and Polars.
  • Experience building large-scale data pipelines.
  • Strong understanding of data structures, data modeling, and best coding practices.
  • Hands-on experience with JSON/CSV/XML/Parquet formats.
  • Excellent debugging and problem-solving skills.

Good to Have

  • Experience in LLM/AI data preprocessing (LLaMA, GPT, BERT, etc.).
  • Knowledge of big data frameworks (Spark, Ray).
  • Experience with Hugging Face libraries (Transformers, Datasets, Tokenizers).
  • Familiarity with PyTorch or TensorFlow.
  • Experience working on cloud platforms (AWS, GCP, Azure).
