5+ years overall in Cloud Operations, including:
- Minimum 5 years of hands-on experience with Google Cloud Platform (GCP)
- Minimum 3 years of experience in Kubernetes administration
Certifications:
- GCP Professional certification (mandatory)
Work Hours:
- 24x7 support coverage
- Rotational shifts (including night and weekend shifts)
Key Responsibilities
- Manage and monitor GCP infrastructure resources, ensuring optimal performance, availability, and security.
- Administer Kubernetes clusters: deployment, scaling, upgrades, patching, and troubleshooting.
- Implement and maintain automation for provisioning, scaling, and monitoring using tools like Terraform, Helm, or similar.
- Respond to incidents, perform root cause analysis, and drive issue resolution within SLAs.
- Configure logging, monitoring, and alerting solutions across GCP and Kubernetes environments.
- Support CI/CD pipelines and integrate Kubernetes deployments with DevOps processes.
- Maintain detailed documentation of processes, configurations, and runbooks.
- Work collaboratively with Development, Security, and Architecture teams to ensure compliance and best practices.
- Participate in an on-call rotation and respond promptly to critical alerts.
Required Skills & Qualifications
- GCP certification (Professional Cloud Architect, Cloud Engineer, or equivalent).
- Strong working knowledge of GCP services (Compute Engine, GKE, Cloud Storage, IAM, VPC, Cloud Monitoring, etc.).
- Solid experience in Kubernetes cluster administration (setup, scaling, upgrades, security hardening).
- Proficiency with Infrastructure as Code tools (Terraform, Deployment Manager).
- Knowledge of containerization concepts and tools (Docker).
- Experience in monitoring and observability (Prometheus, Grafana, Stackdriver/Cloud Monitoring).
- Familiarity with incident management and ITIL processes.
- Ability to work in a 24x7 operations environment with rotating shifts.
- Strong troubleshooting and problem-solving skills.
OR
5+ years overall in Cloud Operations, including:
- Minimum 5 years of hands-on experience with Microsoft Azure
- Minimum 3 years of experience in Kubernetes administration
Certifications:
- Azure certification (mandatory): Azure Administrator Associate, Azure Solutions Architect Expert, or equivalent
Work Hours:
- 24x7 support coverage
- Rotational shifts (including nights and weekends)
Key Responsibilities
- Manage and monitor Azure infrastructure resources, ensuring performance, availability, and security compliance.
- Administer Azure Kubernetes Service (AKS) clusters: provisioning, scaling, upgrades, patching, and troubleshooting.
- Implement and maintain automation for provisioning, configuration management, and monitoring (using ARM templates, Terraform, Bicep).
- Respond to incidents, perform root cause analysis, and resolve issues within defined SLAs.
- Configure and maintain logging, monitoring, and alerting solutions (Azure Monitor, Log Analytics, Application Insights).
- Support CI/CD workflows integrating Azure and Kubernetes deployments.
- Maintain detailed operational documentation, including configurations and runbooks.
- Collaborate closely with Development, Security, and Architecture teams to ensure adherence to best practices and compliance.
- Participate in an on-call rotation for incident response and critical issue remediation.
Required Skills & Qualifications
- Azure Certification (Azure Administrator, Architect, or equivalent).
- Strong working knowledge of Azure services (VMs, Azure Kubernetes Service, Storage Accounts, Networking, IAM, Azure Monitor).
- Proficiency in Kubernetes administration (setup, scaling, upgrades, securing workloads).
- Experience with Infrastructure as Code tools (ARM Templates, Terraform, Bicep).
- Familiarity with containerization concepts and tools (Docker).
- Proficiency in monitoring and observability (Azure Monitor, Prometheus, Grafana).
- Solid understanding of incident management, change management, and operational excellence.
- Ability to work in a 24x7 support environment with rotating shifts.
- Strong analytical and problem-solving skills.