Position Overview
We are looking for an experienced AI Engineer to design, build, and optimize AI-powered applications, leveraging both traditional machine learning and large language models (LLMs). The ideal candidate will have a strong foundation in LLM fine-tuning, inference optimization, backend development, and MLOps, with the ability to deploy scalable AI systems in production environments.
ShyftLabs is a leading data and AI company, helping enterprises unlock value through AI-driven products and solutions. We specialize in data platforms, machine learning models, and AI-powered automation, offering consulting, prototyping, solution delivery, and platform scaling. Our Fortune 500 clients rely on us to transform their data into actionable insights.
- Design and implement traditional ML and LLM-based systems and applications.
- Optimize model inference for performance and cost-efficiency.
- Fine-tune foundation models using methods like LoRA, QLoRA, and adapter layers.
- Develop and apply prompt engineering strategies including few-shot learning, chain-of-thought, and RAG.
- Build robust backend infrastructure to support AI-driven applications.
- Implement and manage MLOps pipelines for full AI lifecycle management.
- Design systems for continuous monitoring and evaluation of ML and LLM models.
- Create automated testing frameworks to ensure model quality and performance.
- Bachelor’s degree in Computer Science, AI, Data Science, or a related field.
- 4+ years of experience in AI/ML engineering, software development, or data-driven solutions.
- LLM Expertise
  - Experience with parameter-efficient fine-tuning (LoRA, QLoRA, adapter layers).
  - Understanding of inference optimization techniques: quantization, pruning, caching, and serving.
  - Skilled in prompt engineering and design, including RAG techniques.
  - Familiarity with AI evaluation frameworks and metrics.
  - Experience designing automated evaluation and continuous monitoring systems.
- Backend Engineering
  - Strong proficiency in Python and frameworks like FastAPI or Flask.
  - Experience building RESTful APIs and real-time systems.
  - Knowledge of vector databases and traditional databases.
  - Hands-on experience with cloud platforms (AWS, GCP, Azure) focusing on ML services.
- MLOps & Infrastructure
  - Familiarity with model serving tools (vLLM, SGLang, TensorRT).
  - Experience with Docker and Kubernetes for deploying ML workloads.
  - Ability to build monitoring systems for performance tracking and alerting.
  - Experience building evaluation systems using custom metrics and benchmarks.
  - Proficient in CI/CD and automated deployment pipelines.
  - Experience with orchestration tools like Airflow.
- Hands-on experience with LLM frameworks (Transformers, LangChain, LlamaIndex).
- Familiarity with LLM-specific monitoring tools and general ML monitoring systems.
- Experience with distributed training and inference on multi-GPU environments.
- Knowledge of model compression techniques like distillation and quantization.
- Experience deploying models for high-throughput, low-latency production use.
- Research background or strong awareness of the latest developments in LLMs.
- Tools & Technologies We Use
  - Frameworks: PyTorch, TensorFlow, Hugging Face Transformers
  - Serving: vLLM, TensorRT-LLM, SGLang, OpenAI API
  - Infrastructure: Docker, Kubernetes, AWS, GCP
  - Databases: PostgreSQL, Redis, Vector Databases
We offer a competitive salary alongside a strong health insurance and benefits package. We also invest in the growth of our employees, offering extensive learning and development resources.