Cloud Engineer, Senior

4 - 8 years

1000.0 Lacs P.A.

Bengaluru

Posted: 2 weeks ago | Platform: Naukri

Skills Required

container, kubernetes, cd, continuous integration, python, github, orchestration, sre, ci/cd, microsoft azure, site reliability engineering, cloud, configuration management, scripting, automation, gcp, devops, linux, troubleshooting, shell scripting, terraform, ci cd pipeline, deployment

Work Mode

Work from Office

Job Type

Full Time

Job Description

Overview

We are seeking a dynamic and skilled professional to join our team in a hybrid role encompassing both Site Reliability Engineering (SRE) and DevOps. The ideal candidate will be responsible for managing and optimizing our cloud infrastructure on Google Cloud Platform (GCP) and Microsoft Azure. Key responsibilities include developing automated systems for monitoring and incident response, collaborating on CI/CD pipelines using GitHub Actions, and implementing infrastructure as code with Terraform. Strong troubleshooting skills, proficiency in Shell and Python scripting, and expertise in Linux and Kubernetes are essential. The candidate should possess a proactive approach to monitoring and system management, ensuring high availability, performance, and security across our cloud environments.

Responsibilities

Cloud Infrastructure Management: Design, implement, and manage scalable and reliable cloud infrastructure on Google Cloud Platform (GCP) and Microsoft Azure. Optimize cloud resource utilization and costs while ensuring high availability and performance.

Site Reliability Engineering (SRE): Develop and maintain automated systems for monitoring, alerting, and incident response to ensure system reliability and uptime. Implement best practices for incident management, including root cause analysis and post-mortem documentation.

DevOps Practices: Collaborate with development teams to integrate and deploy code using CI/CD pipelines with GitHub Actions. Automate routine tasks and deployment processes to improve efficiency and reduce time to market.

Troubleshooting and Problem Solving: Provide expert troubleshooting for complex technical issues across the infrastructure, applications, and networks. Analyze system performance and reliability, proposing enhancements and optimizations.

Infrastructure as Code (IaC): Develop and manage infrastructure using Terraform, ensuring reproducibility and scalability of cloud resources. Maintain version-controlled Terraform configurations in GitHub, enabling collaborative development and change tracking.

Scripting and Automation: Develop and maintain automation scripts using Shell and Python to streamline operations and enhance productivity. Create and manage infrastructure as code (IaC) using tools like Terraform or similar.

Linux and Kubernetes Administration: Manage and maintain Linux-based systems, ensuring security, patching, and configuration management. Deploy, configure, and manage Kubernetes clusters, ensuring robust container orchestration.

Monitoring and Proactive Management: Implement and maintain comprehensive monitoring solutions to track system health and performance. Proactively identify potential issues and implement solutions before they impact users.

Security and Compliance: Ensure cloud environments comply with organizational security policies and industry best practices. Implement security measures to protect data and applications from external threats.

Technology - Automatic Identification and Data Capture
Vernon Hills
