We're looking for an experienced DevOps Engineer to help build, automate, and maintain both our SaaS cloud infrastructure and on-premises client installations. You'll work closely with development teams to implement robust CI/CD pipelines, manage Kubernetes deployments, and ensure security across our microservices architecture in multiple environments, with a focus on search, AI, and vector database technologies.
Key Responsibilities
- Design and evolve AI-augmented CI/CD pipelines that serve as reusable blueprints across rewrite projects, supporting multi-tenant SaaS deployments, agentic automation, and environment creation through code.
- Collaborate with the AI methodology team to refine automation patterns and integrate AI-driven pipeline generation, test orchestration, and telemetry collection into the rewrite process.
- Develop automated installation and update frameworks for hybrid and customer-managed environments, emphasizing repeatability and low-touch deployment.
- Manage Azure-based SaaS infrastructure, ensuring reliability, elasticity, and security across Kubernetes and containerized services.
- Deploy, scale, and optimize Elasticsearch and vector database clusters supporting GenAI workloads.
- Implement, monitor, and tune LLM and AI service deployments on Azure (OpenAI Service, Cognitive Search, model hosting).
- Design and maintain federated authentication and identity integration across microservices (Okta, OAuth2, and SSO patterns).
- Oversee PostgreSQL/MS SQL and data infrastructure, ensuring resilience, automated backup, and performance tuning for high-throughput workloads.
- Establish observability standards covering metrics, traces, and logs for AI and non-AI services; use insights to improve future rewrites.
- Embed security automation into every deployment model, enforcing Zero Trust and continuous vulnerability assessment.
- Partner with development and AI teams to industrialize deployment methodology, transforming learnings from each rewrite into platform-level automation improvements.
Required Experience:
- AI-Augmented CI/CD: Proven experience building and maintaining automated pipelines (GitHub Actions or Azure DevOps) that integrate AI-assisted code generation, testing, and deployment workflows.
- Version Control & Collaboration: Deep experience with Git-based systems (GitHub, Bitbucket), including managing multi-repo architectures and enforcing branching and governance standards.
- Kubernetes & Cloud Infrastructure: Advanced proficiency with Kubernetes and Helm; experienced in operating containerized microservices across multiple environments in Azure.
- Infrastructure as Code (IaC): Strong knowledge of Terraform or Bicep for creating repeatable, parameterized deployment templates used across multiple rewrite projects.
- Authentication & IAM: Hands-on experience implementing federated identity (Okta, Azure AD, OAuth2/OIDC) across microservices and SaaS environments.
- PostgreSQL & Data Layer Operations: Skilled in tuning, scaling, and backing up PostgreSQL; familiarity with managing schema migrations in automated CI/CD contexts.
- Vector & Search Systems: Operational experience with Elasticsearch and vector databases (e.g., Milvus, Pinecone, or Azure AI Search) to support AI-driven use cases.
- Azure AI & LLM Deployments: Experience provisioning and managing Azure OpenAI, Cognitive Search, and other AI workloads, including model deployment and scaling.
- Observability & Telemetry: Strong command of Prometheus, Grafana, and distributed tracing; ability to design observability frameworks that feed back into AI-driven optimization loops.
- Security by Design: Practical application of DevSecOps, vulnerability scanning, and Zero Trust principles; automation of compliance and secret management (Vault or Azure Key Vault).
Nice to Have:
- AI Workflow Orchestration: Experience orchestrating or monitoring AI agents within build or deployment pipelines.
- Deployment Automation: Experience designing agent-driven or customer-managed installers for on-premises or hybrid client environments.
- Multi-Cloud Fluency: Hands-on experience with AWS and GCP in addition to Azure; understanding of cross-cloud deployment abstractions.
- Containerization Expertise: Proficiency with Docker, container registries, and image lifecycle management in production contexts.
- Network & Service Mesh: Familiarity with ingress controllers, API gateways, and service mesh solutions such as Istio or Linkerd.
- Scripting & Automation: Strong automation skills in Bash, Python, or PowerShell; ability to prototype AI-driven automation extensions.
- Configuration Management: Practical experience creating and managing Helm charts, Kustomize overlays, and GitOps-style deployment repositories.
- Monitoring & Continuous Improvement: Skilled in defining metrics that inform both operational health and methodology evolution across rewrite cycles.
- Governance & Quality: Experience integrating code quality and security scanning tools (SonarQube, Trivy, Snyk) directly into the build pipeline.
What We're Looking For
- 10+ years of DevOps/SRE experience in both cloud and on-premises environments
- Demonstrated experience with microservices architecture
- Experience with Elasticsearch and modern AI infrastructure components
- Familiarity with vector databases (such as Pinecone, Milvus, or Weaviate)
- Experience deploying LLMs on Azure AI or similar platforms
- Experience automating complex installation processes
- Strong problem-solving abilities and communication skills