Posted: 1 day ago
Remote
Full Time
We are seeking a highly skilled and motivated candidate with expertise in programming, problem-solving, machine learning (ML), and artificial intelligence (AI). The ideal candidate will have strong programming skills, particularly in Python, and experience with key data manipulation libraries such as Pandas and NumPy.
Key Requirements:
1. Proficiency in Python and its data manipulation libraries (e.g., Datasets, Pandas, NumPy).
2. Demonstrable experience designing and building scalable data pipelines for collecting, cleaning, transforming, and versioning large-scale datasets (text, code, structured) specifically for ML model training and ML applications.
3. Hands-on experience in preparing and formatting diverse datasets into specific structures required for ML model training.
4. Experience in curating, cleaning, and structuring datasets for ML model evaluation, ensuring data quality and relevance for various benchmarks.
5. Familiarity with common challenges in training data preparation, such as bias detection/mitigation, data distribution analysis, and data augmentation techniques.
6. Solid understanding of data engineering best practices, including data quality, versioning, and efficient data processing.
Desirable skills:
1. LLM fine-tuning.
2. Dataset preparation for LLMs, including prompt-completion, instruction-following, and chat formats.
3. Familiarity with LLM evaluation strategies.
4. Familiarity with common challenges in LLM data preparation, such as bias detection/mitigation, data distribution analysis, and data augmentation techniques.
5. Familiarity with the Hugging Face Transformers and Datasets libraries.
In addition to technical expertise, the ideal candidate will have experience with Git and GitHub for version control and a proven ability to collaborate effectively in a team environment, particularly when working on shared codebases and remote projects. Strong data management and manipulation skills are crucial, as is experience working on remote servers to develop and deploy machine learning models.
ACHNET Inc