You'll lead a hands-on engineering team that designs, builds, and operates automation to make security changes safe, fast, and audit-ready. You own the strategy, standards, and roadmap for event-driven orchestration. You stay flexible on tooling (e.g., Tines, Argo/Rundeck/StackStorm, GitHub Actions/Jenkins, or cloud-native services) while keeping the core consistent: Terraform for infrastructure, Python for services/runners, strong integrations with ITSM and ChatOps, and workflows that are safe to run more than once. You'll partner with InfraSec Operations (execution), InfraSec Reliability (resilience), and InfraSec Innovation (engineering) to deliver measurable reliability and risk reduction.
Responsibilities
-
Set the automation strategy and standards: define principles, guardrails, and patterns (infra-as-code, code review, versioning, security) that scale across public and private cloud.
-
Lead the team: hire, mentor, and grow automation engineers; foster a low-ego, metrics-driven culture with clear career paths.
-
Own the automation lifecycle: design, build, test, release, operate, retire; keep docs, runbooks, and service contracts current.
-
Build reusable "paved roads": ship libraries, modules, templates, and runners in Python/Terraform that teams can adopt with minimal friction.
-
Measure what matters: establish SLIs/SLOs for automations (success rate, latency, error budgets); instrument with logs/metrics/traces and run regular reviews.
-
Security and compliance by default: enforce least privilege, secret management, approvals, and evidence-as-you-go artifacts for audits.
-
System integration: connect automations to identity, cloud/on-prem platforms, CI/CD, observability, and ITSM/ChatOps via stable APIs and service contracts.
-
Prioritize by impact: maintain an intake and roadmap; quantify toil/ROI and sequence work for the highest reliability and risk-reduction gains.
-
Operational excellence: design changes with canaries/rollback; ensure workflows are safe to run more than once (retry-safe) and resilient to failures.
-
Incident learning loop: convert post-incident findings and recurring issues into durable automation or guardrails.
-
Tooling stewardship: evaluate and adopt orchestration solutions as needed (without vendor lock-in); keep costs and vendor relationships in check.
-
Stakeholder partnership: work closely with InfraSec Operations, Reliability (SRE), and Innovation Engineering to ensure clean handoffs and strong adoption.
About You
Basic Qualifications
-
10+ years in security/platform/SRE/DevOps with 4+ years leading engineers/tech leads.
-
Strong, hands-on proficiency in Python and Terraform (modules, testing, CI/CD, code reviews).
-
Experience designing event-driven automations and operating hybrid runners across public cloud and private cloud (bare metal, OpenStack, Kubernetes).
-
Proven integrations with ITSM (Jira) and ChatOps (Slack) including approvals, change records, and audit trails.
-
Bachelor's in CS/Cyber/IS or equivalent practical experience; Master's preferred.
Other Qualifications
-
Experience with at least one modern workflow/orchestration approach (cloud-native services, GitHub-based workflows, or equivalent).
-
Policy-as-code (OPA/Conftest), validation pipelines, and drift detection/remediation.
-
Deep observability (metrics/logs/traces) and SLO management for automation workloads.
-
Security/compliance mindset (least privilege, secret management, artifact signing).
-
Excellent cross-time-zone communication; vendor evaluation and budget stewardship.