- Lead the design, architecture, and implementation of AI-powered agents using LLMs, integrating them into user-facing applications.
- Define technical direction, best practices, and standards for model serving, persistence layers, and multi-turn interactions.
- Partner with Data Science, Engineering, and Product teams to align technical solutions with business goals.
- Ensure performance, reliability, and security through code reviews, CI/CD oversight, and experimentation.
- Stay current on emerging LLM capabilities and recommend innovative applications.
- Own the team's services end to end, from architecture and infrastructure to monitoring, incident management, and documentation.
- Manage team resources, onboarding, and 1:1 coaching in collaboration with the Engineering Manager.
- Maintain compliance with software development, hosting, and security standards, addressing risks proactively.
- Operate portfolio services within AWS, coordinating maintenance and updates according to organizational policies.
Requirements for consideration
- Proven experience coordinating work between Data Science and Engineering teams.
- Strong background in architecting and leading the implementation of AI-powered agents leveraging LLMs, with integration into end-user-facing applications.
- Expertise in defining technical direction and best practices for model serving, memory/persistence layers, and orchestration of multi-turn agent interactions.
- Ability to ensure reliability, performance, and security of AI agent experiences through code reviews, experimentation, and CI/CD oversight.
- Up-to-date knowledge of emerging LLM capabilities (e.g., function calling, tools, retrieval-augmented generation) and ability to identify practical applications.
Strategic & Financial Planning
- Understanding of how technical decisions impact operational costs.
- Experience reviewing and managing team spending in alignment with budget guidelines.
- Ability to provide technical input for roadmap planning, including risk assessment and mitigation.
Architecture & Quality Management
- Track record of maintaining and evolving services in alignment with organizational architecture standards and technology strategy.
- Experience with incident management, evaluating required changes, and managing associated risks.
Team Leadership & Resource Management
- Experience conducting regular 1:1s, providing feedback, and supporting team development.
- Skilled in ensuring teams have the resources, tools, and knowledge they need to achieve their objectives.
- Proven ability to onboard new team members effectively.
Compliance & Security
- Strong knowledge of secure software development practices and hosting standards.
- Experience maintaining compliance documentation and addressing deviations from established standards.
- Ability to identify and report security-relevant issues promptly.
Operations & Infrastructure
- Experience coordinating maintenance, managing AWS-hosted services, and administering AWS accounts in line with best practices.
- Skilled in monitoring services to meet defined SLOs and updating infrastructure per organizational policies.
Documentation
- Proven ability to create and maintain detailed documentation, including domain design, architecture, and operating procedures.
- Commitment to keeping team-specific documentation current and planning for ongoing updates.