Senior AI/ML Engineer – Agentic LLM Systems
Full-time, Paid
Important Note
Only apply if you have 2+ years of experience deploying AI/ML models in production environments.
Who We Are
Steps AI is a cutting-edge AI company redefining how enterprises retrieve and act on data through autonomous, agent-powered workflows. Our platform leverages the latest in multi-agent orchestration, RAG, and generative AI to deliver real-time, contextual insights. As we scale our product suite, we're seeking an experienced AI/ML engineer to lead the design and delivery of next-generation agentic LLM solutions.
What You'll Do
Architect & Lead Multi-Agent Frameworks
- Design production-grade agent ecosystems featuring robust error-recovery, long-term memory, and dynamic tool integration
- Standardize inter-agent communication via Model Context Protocol (MCP) and Agent-to-Agent (A2A) protocols
Build Multimodal LLM Solutions
- Integrate vision, audio, and structured data inputs into unified LLM pipelines
- Implement cross-modal retrieval strategies and fine-tune multimodal encoder–decoder models
Drive RAG Pipeline Excellence
- Lead the design of Graph-RAG, Agentic-RAG, and hybrid retrieval architectures for high-accuracy knowledge access
- Optimize embedding stores, vector databases (FAISS, Milvus), and reranking strategies
Advance Agent-Building Tooling
- Champion frameworks such as n8n, LangGraph, Langflow, Agent Builder, and AgentSpace for rapid agent prototyping and governance
- Extend and harden orchestration libraries (LangChain Agents, AutoGen, LlamaIndex) for enterprise use
- Multi-Agent Systems: Develop LLM-based agents for dynamic tasks using orchestration frameworks
Mentor & Evangelize Best Practices
- Guide junior engineers on SOLID design, DRY/YAGNI coding, CI/CD pipelines, and peer code reviews
- Present architecture reviews, design patterns, and performance benchmarks at team tech-talks
Must-Have Qualifications
- Experience: ≥ 2 years in ML/AI engineering, specifically building and deploying LLM-based systems and RAG pipelines
- We will critically evaluate your understanding of the problem statement, system design, theoretical grasp of LLM techniques and models, and hands-on coding experience in GenAI domains. The applicant should also take ownership of the tasks assigned to them
- Transformer & Multimodal Mastery: Deep expertise in transformer architectures (GPT, LLaMA, BERT, T5) and multimodal models (e.g., CLIP, Flamingo)
- Agentic Protocols & Platforms: Hands-on with MCP, A2A protocols, and agent toolkits such as Agent Builder and AgentSpace
- Proficiency with deep learning frameworks such as PyTorch and TensorFlow, and with tooling such as LangChain, Hugging Face Transformers, LlamaIndex, Ollama, LLM360, text-generation-webui, or other model orchestration libraries
- RAG Proficiency: Proven track record designing, deploying, and scaling retrieval-augmented generation systems in production
- LLM Orchestration: Extensive use of LangChain, AutoGen, LlamaIndex, or similar for multi-step reasoning workflows
- Experience in designing, deploying, and orchestrating multi-agent systems using LLM-based agents for dynamic task execution across various tools and platforms using LangGraph, LangChain, AutoGen, or LlamaIndex
- Familiarity with tool-calling frameworks, LLM sandboxing, and how LLMs interact with external systems (e.g., databases, APIs)
- MLOps & Cloud: Production deployments on AWS/Azure/GCP using Docker, Kubernetes, Terraform, CI/CD, and monitoring stacks
- Advanced Prompting: Mastery of Chain of Thought (CoT), Tree of Thought (ToT), Graph of Thought (GoT), and self-critique strategies
- Software Engineering Rigor: Solid understanding of SOLID principles, version control (Git), unit/integration testing, and code review culture
- Collaboration & Communication: Excellent at articulating technical solutions to both technical peers and non-technical stakeholders
Nice-to-Have
- Contributions to open-source agentic frameworks or active involvement in AI standards bodies
- Familiarity with real-time streaming data, event-driven architectures, or AI agents deployed at large scale
Critical Requirements - High Weightage
Open Source & Live Projects
- Active GitHub portfolio showcasing full-stack applications with AI/ML integration
- Open source contributions to web frameworks, AI tools, or developer libraries
- Live project demonstrations - must present production applications you've built and deployed
- Clean, documented code that demonstrates architectural thinking and best practices
Technical Assessment
We will critically evaluate your:
- System design for scalable web applications with AI integration
- Code quality through GitHub portfolio review and live coding sessions
- AI/ML integration capabilities through practical project demonstrations
- Ownership mindset and ability to take end-to-end responsibility for features
What We Offer
- Competitive salary and equity package with rapid growth potential
- Cutting-edge tech stack with latest AI/ML tools and frameworks
- Generous learning budget for conferences, courses, and certifications
- Fully remote with flexible hours and modern development environment
- Open source time - contribute to projects during work hours
- Direct mentorship from seasoned engineers and AI researchers
Applications without a comprehensive GitHub portfolio and live project demonstration will not be considered.
Join Steps AI and lead the charge in building the autonomous LLM engines that will power tomorrow's enterprise intelligence.