Project Role :
Infra Tech Support Practitioner
Project Role Description :
Provide ongoing technical support and maintenance of production and development systems and software products (both remote and onsite), and of configured services running on various platforms, operating within a defined operating model and processes. Provide hardware/software support and implement technology at the operating-system level across all server and network areas, and for particular software solutions/vendors/brands. Work includes L1 and L2 (basic and intermediate level) troubleshooting.
Must have skills :
Kubernetes
Good to have skills :
NA
Minimum 5 Year(s) Of Experience Is Required
Educational Qualification :
15 years full time education

We are seeking a Kubernetes Support Engineer with deep technical expertise in Kubernetes, GKE, and automation. You will be a key technical point of contact for advanced troubleshooting, platform reliability, and change management, ensuring the stability and performance of our large-scale Kubernetes infrastructure. As part of the Kubernetes & Platform Engineering team, you will contribute to the support, governance, and continuous improvement of our internal cloud-native platform.

Roles & Responsibilities:

Advanced Troubleshooting & Support (L3):
- Diagnose and resolve complex Kubernetes incidents (networking, scheduling, API server, CNI, storage, etc.) across GKE clusters and potentially other distributions.
- Perform root cause analysis.

Incident & Problem Management:
- Take ownership of critical incidents, perform in-depth investigations, and ensure permanent resolutions through RCA reports and automation.
- Collaborate with DevOps, SRE, and Security teams to improve reliability and resilience.

Platform Automation & GitOps:
- Drive automation initiatives through GitOps practices using ArgoCD and internal platform engineering tools.
- Maintain consistency, compliance, and reproducibility of Kubernetes environments at scale.

Platform Change Management:
- Participate in and oversee change management processes for the Kubernetes platform:
  - Assess and validate proposed changes (cluster upgrades, configuration updates, policy enforcement).
  - Coordinate with internal teams to ensure minimal disruption and proper rollback strategies.
  - Ensure all changes are properly tracked, reviewed, tested, and documented.
- Contribute to defining governance and standard operating procedures (SOPs) for platform lifecycle management.
- Your goal: guarantee safe, auditable, and transparent platform evolution across all environments.

Performance & Reliability Optimization:
- Continuously monitor and optimize Kubernetes cluster performance, scalability, and cost efficiency.
- Contribute to reliability metrics and proactive incident prevention.

Cross-team Collaboration:
- Work closely with application teams, platform engineers, and architects to ensure operational excellence and continuous improvement.

Documentation & Knowledge Sharing:
- Write and maintain high-quality technical documentation, troubleshooting guides, and platform knowledge bases.
- Contribute to internal training and technical enablement sessions.

Technology Watch:
- Stay up to date with Kubernetes releases, CNCF ecosystem tools, and GKE innovations.
- Recommend improvements and help drive the evolution of the platform roadmap.

Professional & Technical Skills:

Certification:
- Certified Kubernetes Administrator (CKA) required (CKAD or CKS certifications are a plus).

Kubernetes Expertise:
- 3+ years of hands-on experience managing and supporting Kubernetes clusters in production (preferably GKE).
- Solid understanding of Kubernetes internals: control plane components, scheduling, networking, and security.

Cloud & Infrastructure:
- Strong knowledge of Google Cloud Platform (GCP): IAM, Cloud Logging, Monitoring, and Networking.

Automation & GitOps:
- Experience with ArgoCD and GitOps-based workflows.
- Familiarity with CI/CD tools like GitLab CI/CD, plus experience with automation (Golang if possible, but not mandatory).

Observability & Reliability:
- Experience with observability tools such as Dynatrace.

Change Management & Governance:
- Understanding of ITIL or internal change management best practices.
- Experience coordinating platform upgrades, patch management, and rollout strategies.

Security & Compliance:
- Knowledge of RBAC, Kyverno, network policies, and secret management.

Additional Information:
- Master’s degree in computer science or equivalent experience
- 3+ years as a Kubernetes Engineer, SRE, or Platform Support Engineer
- Analytical mindset and passion for solving complex platform issues
- Excellent communication, teamwork, and documentation skills
- Proactive, rigorous, and focused on reliability, automation, and user satisfaction