We seek a highly skilled, hands-on Senior DevOps Engineer with a proven track record of owning and delivering DevOps for large-scale environments, particularly in-house (non-cloud) server environments. The ideal candidate is a practitioner who excels at building, automating, scaling, and maintaining high-availability systems without compromising on performance, security, or reliability. This role demands deep technical leadership and a strong DevOps culture mindset, with a focus on both current technology and people enablement.
Key Responsibilities
Hands-On DevOps & Cloud Delivery
- End-to-end ownership of DevOps processes for distributed high-traffic environments.
- Design, implement, and optimize CI/CD pipelines (using Bitbucket Pipelines, Jenkins, or GitHub Actions) for automated and secure software delivery.
- Automate infrastructure provisioning, deployment, and configuration using Ansible, HashiCorp Terraform, and similar tools.
- Manage Windows/Linux-based systems for robust performance, security, and reliability.
- Architect, scale, and operate Kubernetes clusters (multi-region, multi-tenant) on bare-metal servers.
- Leverage serverless CI/CD and GitOps techniques where beneficial.
Security & DevSecOps
- Implement DevSecOps best practices, including image scanning, policy-as-code, vulnerability management, and secure CI/CD pipelines.
- Manage secrets and enforce least-privilege access across systems.
- Proactively monitor and respond to infrastructure security threats and compliance requirements.
Observability, Monitoring & Testing
- Deploy and manage monitoring stacks using Grafana, Prometheus, Loki; integrate distributed tracing and metrics with OpenTelemetry.
- Implement Allure and related reporting for comprehensive test automation visibility in CI/CD.
- Define and track advanced observability metrics (latency, error rate, SLO/SLA adherence).
Data Streaming & Messaging
- Operate and optimize Kafka clusters and similar streaming systems for low-latency, high-throughput requirements.
AI/ML for DevOps
- Apply AI/ML models for anomaly detection, predictive scaling, incident analysis, and intelligent alerting within DevOps toolchains (optional).
Collaboration, Documentation & Leadership
- Act as a DevOps technical authority and mentor, providing guidance, code reviews, and knowledge sharing to junior engineers and cross-functional teams.
- Champion DevOps culture, fostering collaboration with product, security, and engineering groups to deliver rapid, safe releases.
- Develop and maintain clear documentation, runbooks, and incident postmortems.
Required Skills & Qualifications
- 5+ years of hands-on experience as a DevOps engineer in global, production-grade SaaS/PaaS/B2B/B2C environments.
- Expertise in designing, operating, and troubleshooting scalable bare-metal server environments and DevOps architectures.
- Strong background with Kubernetes, CI/CD pipeline tooling, Linux administration, and automation frameworks (Ansible, Terraform).
- Advanced proficiency in at least one scripting or programming language (e.g., Python, Go, Bash).
- Demonstrated implementation of DevSecOps and modern security controls in production pipelines.
- Skilled in modern observability stacks (Grafana, Prometheus, Loki, OpenTelemetry, VictoriaMetrics/VictoriaLogs), test automation, and incident management.
- Ability to proactively optimize infrastructure cost, usability, and compliance.
- Clear communicator and effective mentor, with a collaborative, product-first mindset.
- Bachelor’s degree in Computer Science, Engineering, or a related field; relevant certifications (e.g., CKA, AWS/GCP/Azure, Security+) are a plus.
- Proficient in database administration, performance tuning, backup & recovery, and security management using MySQL and MS SQL Server.
- Experienced in configuring, monitoring, and troubleshooting network communication, including VPN setup and maintenance for secure connectivity.
- Experience with GitOps, AI/ML DevOps tools, and serverless automation is a strong advantage.