101 Cloud Monitoring Jobs - Page 4

JobPe aggregates results for easy application access, but you actually apply on the job portal directly.

8.0 - 11.0 years

35 - 37 Lacs

Kolkata, Ahmedabad, Bengaluru

Work from Office

Dear Candidate, We are hiring a Cloud Operations Engineer to manage and optimize cloud-based environments. The role is ideal for engineers passionate about automation, monitoring, and cloud-native technologies. Key Responsibilities: Maintain cloud infrastructure (AWS, Azure, GCP). Automate deployments and system monitoring. Ensure availability, performance, and cost optimization. Troubleshoot incidents and resolve system issues. Required Skills & Qualifications: Hands-on experience with cloud platforms and DevOps tools. Proficiency in scripting (Python, Bash) and IaC (Terraform, CloudFormation). Familiarity with logging/monitoring tools (CloudWatch, Datadog, etc.). Bonus: Experience with Kubernetes or serverless architectures. Note: If interested, please share your updated resume and preferred time for a discussion. If shortlisted, our HR team will contact you. Kandi Srinivasa, Delivery Manager, Integra Technologies

Posted 3 months ago

4.0 - 8.0 years

6 - 10 Lacs

Bengaluru

Work from Office

At BCE Global Tech, immerse yourself in exciting projects that are shaping the future of both consumer and enterprise telecommunications. This involves building innovative mobile apps to enhance user experiences and enable seamless connectivity on the go. Thrive in diverse roles like Full Stack Developer, Backend Developer, UI/UX Designer, DevOps Engineer, Cloud Engineer, Data Science Engineer, and Scrum Master, at a workplace that encourages you to freely share your bold and different ideas. If you are passionate about technology and eager to make a difference, we want to hear from you! Apply now to join our dynamic team in Bengaluru. We're seeking a dedicated Site Reliability Engineer to join our team. In this role, you will be responsible for maintaining the reliability, scalability, and performance of our systems. You'll implement best practices for monitoring, incident response, and automation to ensure seamless operations. Your expertise will help us build resilient infrastructure, reduce downtime, and enhance the overall user experience. Key Responsibilities: Work with various monitoring tools (e.g. ELK, Dynatrace, CloudWatch, Cloud Logging, Cloud Monitoring, BMC Surveyor, BMC Patrol, Grafana, Prometheus). Ensure monitoring and self-healing strategies are implemented and maintained to proactively prevent production incidents. Perform root cause analysis of production issues. Design and manage on-call and escalation processes (nice to have). Participate in design reviews and production reviews for new features, products, or pieces of infrastructure. Design and implement ELK (Elasticsearch, Logstash, and Kibana) stack, Prometheus, and Grafana solutions for monitoring and alerting. Debug production issues across services and levels of the stack. Establish KPIs to demonstrate maturity, efficiency, and value to our business partners. Work as an integral part of the DevOps team with complementary skills and common goals. L3 Support experience is an asset. Work to create a release management process and help with out-of-business-hours deployments and support (rotation with team members). Be familiar and comfortable with agile development techniques. Technology Skills (Mandatory): ELK, Dynatrace, CloudWatch, Cloud Logging, Cloud Monitoring, BMC Surveyor, BMC Patrol, Grafana, Prometheus. Required Qualifications To Be Successful In This Role: Bachelor's degree in computer science, engineering, or a related field. 8-10 years of experience as an SRE. Proven experience as an SRE, DevOps engineer, or similar role. Strong programming skills in languages such as Python, Go, Java, or Ruby. Strong problem-solving skills and ability to work under pressure. Excellent communication and collaboration skills. Flexible to work in EST time zones (9-5 EST). Competitive salaries and comprehensive health benefits. Flexible work hours and remote work options. Professional development and training opportunities. A supportive and inclusive work environment.

Posted 3 months ago

5.0 - 10.0 years

6 - 12 Lacs

Kolkata

Work from Office

At Gintaa, we're redefining how India orders food. With our focus on affordability, exclusive restaurant partnerships, and hyperlocal logistics, we aim to scale across India's Tier 1 and Tier 2 cities. We're backed by a mission-driven team and expanding rapidly; now's the time to join the core tech leadership and build something impactful from the ground up. Job Summary: We are looking for an experienced and motivated DevOps Engineer with 5–7 years of hands-on experience designing, implementing, and managing cloud infrastructure, particularly on Google Cloud Platform (GCP) and Amazon Web Services (AWS). The ideal candidate will have deep expertise in infrastructure as code (IaC), CI/CD pipelines, container orchestration, and cloud-native technologies. This role requires strong analytical skills, attention to detail, and a passion for optimizing cloud infrastructure performance and cost across multi-cloud environments. Key Responsibilities: Multi-Cloud Infrastructure: Design, implement, and maintain scalable, reliable, and secure cloud infrastructure using GCP services (Compute Engine, GKE, Cloud Functions, Pub/Sub, BigQuery, Cloud Storage) and AWS services (EC2, ECS/EKS, Lambda, S3, RDS, CloudFront). CI/CD & GitOps: Build and manage CI/CD pipelines using GitHub/GitLab Actions and artifact repositories, and enforce GitOps practices across both GCP and AWS environments. Containerization & Serverless: Leverage Docker, Kubernetes (GKE/EKS), and serverless architectures (Cloud Functions, AWS Lambda) to support microservices and modern application deployments. Infrastructure as Code: Develop and manage IaC using Terraform (or CloudFormation for AWS) to automate provisioning and drift detection across clouds. Observability & Monitoring: Implement observability tools like Prometheus, Grafana, Google Cloud Monitoring, and AWS CloudWatch for real-time system insights. 
Security & Compliance: Ensure best practices in cloud security, including IAM policies (GCP IAM + AWS IAM), encryption standards (KMS), network security (VPCs, Security Groups, Firewalls), and compliance frameworks. Service Mesh: Integrate and manage service mesh architectures such as Istio or Linkerd for secure and observable microservices communication. Troubleshooting & DR: Troubleshoot and resolve infrastructure issues, ensure high availability, disaster recovery (GCP Backup + AWS Backup/AWS DR strategies), and performance optimization. Cost Management: Drive initiatives for cloud cost management; use tools like GCP Cost Management and AWS Cost Explorer to suggest optimization strategies. Documentation & Knowledge Transfer: Document technical architectures, processes, and procedures; ensure smooth knowledge transfer and operational readiness. Cross-Functional Collaboration: Collaborate with Development, QA, Security, and Architecture teams to streamline deployment workflows. Required Skills & Qualifications 5–7 years of DevOps/Cloud Engineering experience, with at least 3 years on GCP and 3 years on AWS. Proficiency in Terraform (plus familiarity with CloudFormation), Docker, Kubernetes (GKE/EKS), and other DevOps toolchains. Strong experience with CI/CD tools (GitHub/GitLab Actions) and artifact repositories. Deep understanding of cloud networking, VPCs, load balancing, security groups, firewalls, and VPNs in both GCP and AWS. Expertise in monitoring/logging frameworks such as Prometheus, Grafana, Stackdriver (Cloud Monitoring), and AWS CloudWatch/CloudTrail. Strong scripting skills in Python, Bash, or Go for automation tasks. Knowledge of data backup, high-availability systems, and disaster recovery strategies across multi-cloud. Familiarity with service mesh technologies and microservices-based architecture. Excellent analytical, troubleshooting, and documentation skills. Effective communication and ability to work in a fast-paced, collaborative environment. 
Preferred Qualifications (Good to Have) Google Professional Cloud Architect Certification and/or AWS Certified Solutions Architect – Professional. Experience with multi-cloud or hybrid cloud setups, including VPN/Direct Connect and Interconnect configurations. Exposure to agile software development, DevSecOps, and compliance-driven environments (e.g., BFSI, Healthcare). Understanding of cost modeling and cloud billing analysis tools. Why Join Gintaa? Be a part of a purpose-driven startup revolutionizing food and local commerce in India. Build impactful, large-scale mobile applications from scratch. Work with a visionary leadership team and dynamic, entrepreneurial culture. Competitive salary and leadership visibility.

Posted 3 months ago

8.0 - 13.0 years

15 - 30 Lacs

Bengaluru

Work from Office

Job Title: Cloud Architect AWS. Location: Bangalore. Shift: Rotational. Experience Required: 8-13 years. Type: Full-time. Job Summary: We are looking for a highly skilled and experienced AWS Cloud Architect with a strong foundation in Site Reliability Engineering (SRE) practices to lead cloud transformation initiatives. The ideal candidate will have hands-on experience with AWS infrastructure, DevOps automation, security governance, cost optimization, and infrastructure as code. You will work on high-impact projects in a cloud-first environment, collaborating with cross-functional teams to ensure scalable, secure, and reliable infrastructure. Key Responsibilities. Cloud Architecture & Deployment: Design, build, and optimize cloud-native architectures on AWS. Lead the migration of on-premises workloads to AWS using best practices. Define and enforce cloud governance, tagging policies, and account management standards. Security, IAM, and Compliance: Implement AWS IAM, PIM, PAM, and manage VPC Security Groups, NACLs, and encryption policies. Conduct cloud security assessments; enforce SOC2, ISO 27001, or HIPAA compliance. Work with AWS Config, AWS CloudTrail, AWS GuardDuty, and Security Hub. Infrastructure Automation & DevOps: Build and maintain CI/CD pipelines using Jenkins, Git, Terraform, CloudFormation, and Ansible. Manage containerized workloads using Docker, Kubernetes, and orchestration tools. Implement Infrastructure as Code (IaC) and Configuration Management for consistent deployments. Cost Optimization & Performance Tuning: Use AWS Cost Explorer, Budgets, and Trusted Advisor to monitor and reduce costs. Optimize workloads through auto-scaling, spot instances, Savings Plans, and rightsizing. Regularly audit and report on cloud spend and performance KPIs. Monitoring & Reliability: Set up logging and monitoring using CloudWatch, Prometheus, Nagios, or Datadog. 
Define and maintain SLA/SLO/SLI metrics, runbooks, and incident response procedures. Implement blue/green deployments, rollback strategies, and chaos engineering principles. Documentation & Collaboration: Maintain architecture diagrams using Lucidchart, Draw.io, or Visio. Document SOPs for cloud operations, deployments, and recovery scenarios. Collaborate with engineering, security, product, and QA teams. Skills and Experience Required. Cloud Platform: AWS (EC2, S3, RDS, CloudFront, Lambda, VPC, CloudFormation). DevOps Tools: Terraform, Ansible, Jenkins, GitHub Actions, Docker, Kubernetes. Security & IAM: AWS IAM, PIM/PAM, encryption, CloudTrail, GuardDuty, Security Hub. Scripting: Python, Bash, Shell scripting. Monitoring & Logging: CloudWatch, ELK Stack, Datadog, Nagios. Networking: VPC, Subnetting, Route Tables, NAT Gateway, VPNs. Certifications (preferred): AWS Certified Solutions Architect Professional; AWS Certified DevOps Engineer Professional. Preferred Qualifications: Experience in hybrid environments (AWS + on-prem). Working knowledge of Azure or GCP is a plus. Familiarity with microservices architecture. Hands-on with CI/CD using GitOps or DevOps pipelines. Background in ERP, retail, or high-availability enterprise SaaS platforms. Soft Skills: Excellent communication and stakeholder management. Strong analytical and problem-solving mindset. Ability to work in 24x7 environments and rotational shifts. Self-motivated and team-oriented approach.

Posted 3 months ago

10.0 - 12.0 years

0 Lacs

Bengaluru / Bangalore, Karnataka, India

On-site

Introduction: A career in IBM Software means you'll be part of a team that transforms our customers' challenges into solutions. Seeking new possibilities and always staying curious, we are a team dedicated to creating the world's leading AI-powered, cloud-native software solutions for our customers. Our renowned legacy creates endless global opportunities for our IBMers, so the door is always open for those who want to grow their career. IBM's product and technology landscape includes Research, Software, and Infrastructure. Entering this domain positions you at the heart of IBM, where growth and innovation thrive. Your role and responsibilities: As a Site Reliability Engineer, you will work in an agile, collaborative environment to build, deploy, configure, and maintain systems for the IBM client business. In this role, you will lead the problem resolution process for our clients, from analysis and troubleshooting to deploying the latest software updates and fixes. Your primary responsibilities include: 24x7 Observability: Be part of a worldwide team that monitors the health of production systems and services around the clock, ensuring continuous reliability and optimal customer experience. Cross-Functional Troubleshooting: Collaborate with engineering teams to provide initial assessments and possible workarounds for production issues. Troubleshoot and resolve production issues effectively. Deployment and Configuration: Leverage Continuous Delivery (CI/CD) tools to deploy services and configuration changes at enterprise scale. Security and Compliance Implementation: Implement security measures that meet or exceed industry standards for regulations such as GDPR, SOC2, ISO 27001, PCI, HIPAA, and FBA. Maintenance and Support: Apply Couchbase security patches and upgrades, support Cassandra and Mongo for pager duty rotation, and collaborate with Couchbase Product support for issue resolution. 
Required education: Bachelor's Degree. Required technical and professional expertise: 10+ years working in a high-performance engineering team. Experience in cloud server management and troubleshooting, networking, Windows server management, AWS cloud and automation, cloud monitoring, GitHub, Kubernetes, and Linux. 10+ years of working knowledge of one or more operating systems: RHEL, CentOS Linux, and Windows Servers. Working knowledge of ServiceNow, JIRA, Confluent, and GitHub. Preferred technical and professional experience: In-depth understanding and working knowledge of server technologies. Working knowledge of how virtualization, network, and storage technologies work in data center and cloud environments. Working knowledge of ServiceNow, JIRA, Confluent, and GitHub. ITIL Foundation V4 certification is a plus. Excellent verbal and written communication skills. Highly responsible, motivated, and able to work with little direction. Ability to troubleshoot complex problems and customer issues.

Posted 3 months ago

7.0 - 12.0 years

12 - 22 Lacs

Pune

Work from Office

Experience: 7+ Years
Job Location: Pune
Notice Period: 30 Days
Job Description:
• AWS Ecosystem: EKS, EC2, DynamoDB, Lambda, etc.
• Dynatrace (or similar): The capacity planning team should include some members with Dynatrace experience, while the rest can have experience with similar tools.
Roles & Responsibilities:
• Develop and implement capacity planning strategies for AWS environments, focusing on EKS (Elastic Kubernetes Service).
• Conduct continuous load testing to assess system performance under varying conditions and identify potential bottlenecks.
• Analyze system metrics and usage patterns to forecast future capacity needs and recommend scaling solutions.
• Collaborate with development and operations teams to ensure application performance aligns with business objectives.
• Design and execute automated testing frameworks to simulate real-world usage scenarios.
• Monitor cloud resource utilization and optimize costs while maintaining performance standards.
• Prepare detailed reports on capacity trends, load testing results, and recommendations for improvements.
• Stay updated on industry best practices and emerging technologies related to cloud infrastructure and capacity management.
• 7+ years of proven experience with AWS services, particularly EKS, EC2, S3, and RDS.
• Strong understanding of container orchestration and microservices architecture.
• Experience with continuous load testing tools (e.g., JMeter, Gatling) and performance monitoring solutions.
• Proficiency in scripting languages (e.g., Python, Bash) for automation tasks.
• Excellent analytical skills with the ability to interpret complex data sets.
• Strong communication skills to effectively collaborate with cross-functional teams.

Posted 3 months ago

12.0 - 15.0 years

55 - 60 Lacs

Ahmedabad, Chennai, Bengaluru

Work from Office

Dear Candidate, We're hiring a Cloud Systems Integrator to connect disparate systems and ensure seamless cloud-native integrations. Key Responsibilities: Integrate SaaS, legacy, and cloud systems. Build APIs, webhooks, and message queues. Ensure data consistency across platforms. Required Skills & Qualifications: Experience with REST, GraphQL, and messaging (Kafka/SQS). Proficiency in integration platforms (MuleSoft, Boomi, etc.). Cloud-first development experience. Soft Skills: Strong troubleshooting and problem-solving skills. Ability to work independently and in a team. Excellent communication and documentation skills. Note: If interested, please share your updated resume and preferred time for a discussion. If shortlisted, our HR team will contact you. Srinivasa Reddy Kandi, Delivery Manager, Integra Technologies

Posted 3 months ago

6.0 - 11.0 years

1 - 5 Lacs

Bengaluru

Work from Office

Job Title: Java AWS Developer. Experience: 6-12 Years. Location: Bangalore. Experience in Java, J2EE, and Spring Boot. Experience in design, Kubernetes, and AWS (EKS, EC2) is needed. Experience in AWS cloud monitoring tools like Datadog, CloudWatch, and Lambda is needed. Experience with XACML authorization policies. Experience in NoSQL and SQL databases such as Cassandra, Aurora, and Oracle. Experience with Web Services and SOA (SOAP as well as RESTful with JSON formats), and with messaging (Kafka). Hands-on with development and test automation tools/frameworks (e.g. BDD and Cucumber).

Posted 3 months ago

6.0 - 11.0 years

2 - 5 Lacs

Hyderabad, Bengaluru

Work from Office

Job Title: Java AWS. Experience: 6-12 Years. Location: Hyderabad / Bangalore. Experience in Java, J2EE, and Spring Boot. Experience in design, Kubernetes, and AWS (Lambda, EKS, EC2) is needed. Experience in AWS cloud monitoring tools like Datadog, CloudWatch, and Lambda is needed. Experience with XACML authorization policies. Experience in NoSQL and SQL databases such as Cassandra, Aurora, and Oracle. Experience with Web Services and SOA (SOAP as well as RESTful with JSON formats), and with messaging (Kafka). Hands-on with development and test automation tools/frameworks (e.g. BDD and Cucumber).

Posted 3 months ago

8 - 10 years

7 - 11 Lacs

Pune

Work from Office

Wipro Limited (NYSE: WIT, BSE: 507685, NSE: WIPRO) is a leading technology services and consulting company focused on building innovative solutions that address clients’ most complex digital transformation needs. Leveraging our holistic portfolio of capabilities in consulting, design, engineering, and operations, we help clients realize their boldest ambitions and build future-ready, sustainable businesses. With over 230,000 employees and business partners across 65 countries, we deliver on the promise of helping our customers, colleagues, and communities thrive in an ever-changing world. For additional information, visit us at www.wipro.com. About The Role Contract Job Title: Incident & Operations Manager (DevOps Background). Responsibilities: 1. Incident Response & Triage: Develop and maintain incident response plans for various types of incidents. Ensure timely resolution of tickets by the Operations team within agreed SLAs. Assess and prioritize incidents based on severity and initiate response actions accordingly. 2. Incident Documentation & Analysis: Maintain detailed incident records including impact, nature, and resolution. Perform post-incident analysis to identify root causes and preventive actions. 3. Customer Satisfaction & Reporting: Monitor and improve customer satisfaction through efficient operations. Address customer complaints and conduct monthly SLA review meetings. 4. DevOps Background (Prior Hands-on Experience): Experience with Linux administration, shell scripting, and troubleshooting. Managed CI/CD pipelines, Kubernetes clusters, and cloud infrastructure. Knowledge of virtualization (VMware/KVM) and private cloud environments. Familiarity with tools like Docker, Jenkins, Ansible, and cloud monitoring. 5. Leadership & Coordination: Lead cross-functional teams and bridge the gap between DevOps and Operations. Ensure clear communication with stakeholders and technical teams. 
Location: Pune, Maharashtra, India. 24/7 Operations (Yes/No): Yes. General Shift (Yes/No): As per the client's requirement. All Shift Timings: As per roster. Rates including markup: 130K/M. Responsibilities: Ensure every Change is recorded and approved before implementation. Ensure Changes are categorized and approved as per the defined process based on the Change category (Standard, Normal, Expedited, Emergency). Convene and chair CAB meetings and circulate the minutes (MOM) of the CABs. Ensure no unauthorized Change is implemented that may potentially impact Production. Ensure periodic audits are in place as per Wipro and Customer process and close the audit gaps within the agreed timelines. Report to Management on the agreed Change KPIs and ensure effective change communication is in place. Ensure Change implementation is done as per the Implementation Plan with no manual errors by setting up a 4-eye review for each Change. Ensure each Change is assigned the risk involved, and ensure Wipro and Customer processes are followed in case of high-risk / high-impact Changes. Conduct Post Implementation Reviews and validate the Change status against the defined Change success criteria. Bring in service improvements to improve the overall process maturity. Key Skills and Competencies: 8-10 years of ITSM experience in Change and other processes. ITIL V3 / 2011 Foundation or Intermediate certification. Capable of collaborating with multiple technical towers, facing the Customer, and coordinating with vendors. Effective communication skills. Reinvent your world. We are building a modern Wipro. We are an end-to-end digital transformation partner with the boldest ambitions. To realize them, we need people inspired by reinvention. Of yourself, your career, and your skills. We want to see the constant evolution of our business and our industry. It has always been in our DNA - as the world around us changes, so do we. 
Join a business powered by purpose and a place that empowers you to design your own reinvention. Come to Wipro. Realize your ambitions. Applications from people with disabilities are explicitly welcome.

Posted 4 months ago

6 - 11 years

6 - 12 Lacs

Pune

Work from Office

Role & responsibilities. Location: Pune. Job Summary: Experience in handling IT audits like PCI DSS, ISO, and HIPAA. Resolve problems reported by the end user. Define network policies, procedures, and guidelines. Cloud Security and Cloud Monitoring. Job Description: JD for Server and Network Administrator (Deputy Manager). Microsoft Active Directory in the enterprise, including creating bulk user accounts, deleting and modifying user accounts, and setting up e-mail. FortiGate Firewall management. Ensure network security and connectivity. Define network policies, procedures, and guidelines. Perform routine network startup and shutdown procedures, and maintain control records. Hands-on experience in networking, routing, and switching. Re-installation of Windows Servers in the event of system crashes/failures. Periodic health checks of the systems, troubleshooting problems, and analyzing and implementing rectification measures. Implementation and maintenance of standard operating procedures for maintenance of the infrastructure. Resolve problems reported by the end user. VLANs, TCP/IP, DHCP, DNS. Experience in handling IT audits like PCI DSS, ISO, and HIPAA. Managing enterprise antivirus such as Trend Micro. Support and administer third-party applications. Microsoft SQL knowledge. Compile monthly reports for the IT Manager. Execute duties in accordance with ITIL framework guidelines. Log calls with external providers where and when necessary. Make recommendations to the IT Manager on budget items. Make recommendations to the IT Manager for improving systems and processes. Stay abreast of the latest technology and work with the IT Manager on potential upgrades. Perform and support other tasks as identified by the Technology Manager. In-depth understanding of technology systems in the enterprise. Virtual clustering and troubleshooting experience with Hyper-V and VMware. Excellent business continuity and disaster recovery methodologies and practice. PowerShell scripts and commands to automate certain tasks and to change settings in bulk. Preferred candidate profile

Posted 4 months ago

5 - 9 years

4 - 8 Lacs

Kolkata

Work from Office

We are looking for an experienced and motivated DevOps Engineer with 5 to 7 years of hands-on experience designing, implementing, and managing cloud infrastructure, particularly on Google Cloud Platform (GCP). The ideal candidate will have deep expertise in infrastructure as code (IaC), CI/CD pipelines, container orchestration, and cloud-native technologies. This role requires strong analytical skills, attention to detail, and a passion for optimizing cloud infrastructure performance and cost. Key Responsibilities: Design, implement, and maintain scalable, reliable, and secure cloud infrastructure using Google Cloud Platform (GCP) services, including Compute Engine, Google Kubernetes Engine (GKE), Cloud Functions, Cloud Pub/Sub, BigQuery, and Cloud Storage. Build and manage CI/CD pipelines using GitHub, artifact repositories, and version control systems; enforce GitOps practices across environments. Leverage Docker, Kubernetes, and serverless architectures to support microservices and modern application deployments. Develop and manage Infrastructure as Code (IaC) using Terraform to automate environment provisioning. Implement observability tools like Prometheus, Grafana, and Google Cloud Monitoring for real-time system insights. Ensure best practices in cloud security, including IAM policies, encryption standards, and network security. Integrate and manage service mesh architectures such as Istio or Linkerd for secure and observable microservices communication. Troubleshoot and resolve infrastructure issues, ensuring high availability, disaster recovery, and performance optimization. Drive initiatives for cloud cost management and suggest optimization strategies for resource efficiency. Document technical architectures, processes, and procedures; ensure smooth knowledge transfer and operational readiness. Collaborate with cross-functional teams including Development, QA, Security, and Architecture teams to streamline deployment workflows. 
Preferred candidate profile 5+ years of DevOps/Cloud Engineering experience, with at least 3 years on GCP. Proficiency in Terraform, Docker, Kubernetes, and other DevOps toolchains. Strong experience with CI/CD tools, GitHub/GitLab, and artifact repositories. Deep understanding of cloud networking, VPCs, load balancing, firewalls, and VPNs. Expertise in monitoring and logging frameworks such as Prometheus, Grafana, and Stackdriver (Cloud Monitoring). Strong scripting skills in Python, Bash, or Go for automation tasks. Knowledge of data backup, high-availability systems, and disaster recovery strategies. Familiarity with service mesh technologies and microservices-based architecture. Excellent analytical, troubleshooting, and documentation skills. Effective communication and ability to work in a fast-paced, collaborative environment.

Posted 4 months ago

7 - 12 years

15 - 20 Lacs

Navi Mumbai, Bengaluru, Mumbai (All Areas)

Work from Office

Key Responsibilities: Design, implement, and maintain GCP cloud infrastructure using Infrastructure as Code (IaC) tools. Manage and optimize Kubernetes clusters on GKE (Google Kubernetes Engine). Build and maintain CI/CD pipelines for efficient application delivery. Monitor GCP infrastructure costs and drive optimization strategies. Develop observability solutions using GCP-native and third-party tools. Collaborate with engineering teams to streamline deployment and operations workflows. Enforce security best practices and ensure compliance with internal and industry standards. Design and implement high availability (HA) and disaster recovery (DR) architectures. Mandatory Technical Skills: GCP Services: Compute Engine, VPC, Cloud Storage, Cloud SQL, IAM, Cloud DNS, Cloud Monitoring. Infrastructure as Code: Terraform (preferred), Deployment Manager. Containerization: Docker, Kubernetes (GKE expertise required). CI/CD Tools: GitHub Actions, Cloud Build, Jenkins, or similar. Version Control: Git. Scripting Languages: Python, Bash. Monitoring & Logging: Stackdriver, Prometheus, Grafana, ELK Stack. Strong experience with automation and configuration management (Terraform, Ansible, etc.). Solid understanding of cloud security best practices. Experience designing fault-tolerant, resilient cloud-native architectures. 4-7 years in DevOps/Cloud Engineering roles. Minimum 2+ years hands-on with GCP infrastructure and services. Proven experience managing CI/CD pipelines and container-based deployments. Strong background in modern DevOps tools and cloud-native architectures. Preferred candidate profile

Posted 4 months ago

8 - 13 years

40 - 45 Lacs

Bengaluru

Work from Office

About the role: We are seeking an experienced Infrastructure Engineer to join our team at, a leader in blockchain technology and solutions. The ideal candidate will have a strong background in infrastructure management and a deep understanding of blockchain ecosystems. You will be responsible for designing, implementing, and maintaining the foundational infrastructure that supports our blockchain platforms, ensuring high availability, scalability, and security. Your expertise in AWS cloud technologies and database management, particularly with RDS, PostgreSQL, and Aurora, will be essential to our success. Responsibilities: Design & Deployment: Develop, deploy, and manage the infrastructure for blockchain nodes, databases, and network systems. Automation & Optimization: Automate infrastructure provisioning and maintenance tasks to enhance efficiency and reduce downtime. Optimize performance, reliability, and scalability across our blockchain systems. Monitoring & Troubleshooting: Set up monitoring and alerting systems to proactively manage infrastructure health. Quickly identify, troubleshoot, and resolve issues in production environments. Security Management: Implement robust security protocols, firewalls, and encryption to protect infrastructure and data from breaches and vulnerabilities. Should have a good working knowledge of VPC (Virtual Private Cloud). Collaboration: Work closely with development, DevOps, and security teams to ensure seamless integration and support of blockchain applications. Support cross-functional teams in achieving network reliability and efficient resource management. Documentation: Maintain comprehensive documentation of infrastructure configurations, processes, and recovery plans. Continuous Improvement: Research and implement new tools and practices to improve infrastructure resiliency, performance, and cost-efficiency. Stay updated with blockchain infrastructure trends and industry best practices. 
Incident management: Incident dashboard management. Integrate dashboard using different power tools. Requirements: Educational Background: Bachelor's degree in Computer Science, Information Technology, or a related field. Experience: Minimum of 7 years of experience in AWS infrastructure engineering, using Terraform, Terragrunt, and Atlantis, with incident management and resolution using automation (infrastructure as code) and AWS infrastructure cloud provisioning. Should be familiar with VPC (Virtual Private Cloud). Technical Skills: Terraform and automation. AWS CloudWatch. Hands-on experience with monitoring tools (e.g., Prometheus, Grafana). DevOps with CI/CD pipelines. Incident management, resolution, and reporting. Proficiency in AWS cloud platforms and container orchestration (e.g., Docker, Kubernetes). Strong knowledge of Linux/Unix system administration. Understanding of networking protocols, VPNs, and firewalls. Participate in on-call rotations to provide 24/7 support for critical systems. Security Knowledge: Strong understanding of security best practices, especially within blockchain environments. Soft Skills: Excellent problem-solving abilities, attention to detail, strong communication skills, and a proactive, team-oriented mindset. Experience working with consensus protocols and node architecture.

Posted 4 months ago

Apply

6 - 10 years

15 - 30 Lacs

Pune, Hyderabad

Work from Office

Roles and Responsibilities Design, develop, and maintain scalable cloud infrastructure on Google Cloud Platform (GCP) using CI/CD pipelines. Collaborate with cross-functional teams to identify areas for improvement in the development process and implement changes using DevOps practices. Ensure high availability, scalability, and security of applications by monitoring performance metrics and implementing optimization strategies. Develop automation scripts using Python or other languages to streamline repetitive tasks and improve efficiency. Participate in code reviews to ensure adherence to coding standards and best practices. Desired Candidate Profile 6-10 years of experience as a Cloud DevOps Engineer with expertise in GCP Cloud. Bachelor's degree in Any Specialization (B.Tech/B.E.). Strong understanding of Kubernetes, Docker, Networking concepts, Cloud Monitoring tools like Stackdriver etc., SRE principles & practices.

Posted 4 months ago

Apply

7.0 - 12.0 years

12 - 22 Lacs

Pune

Work from Office

Experience: 7+ years. Job Location: Pune. Notice Period: 30 days. Job Description: AWS ecosystem (EKS, EC2, DynamoDB, Lambda, etc.). Dynatrace (or similar): the Observability team should include some members with Dynatrace experience, while the rest can have experience with similar tools. Monitoring Site, trend analysis, log analysis. Key Responsibilities: Design, implement, and maintain observability solutions using AWS and Dynatrace to monitor application performance and infrastructure health. Collaborate with development and operations teams to define observability requirements and ensure seamless integration of monitoring tools. Develop and manage dashboards, alerts, and reports to provide insights into system performance and user experience. Troubleshoot complex issues by analyzing logs, metrics, and traces to identify root causes and recommend solutions. Optimize existing monitoring frameworks to enhance visibility across cloud environments and applications. Stay updated on industry trends and best practices in observability, cloud technologies, and performance monitoring. 7+ years of proven experience as an Observability Engineer or similar role with a strong focus on AWS services. Proficiency in using Dynatrace for application performance monitoring and observability. Strong understanding of cloud architecture, microservices, containers, and serverless computing. Experience with scripting languages (e.g., Python, Bash) for automation tasks. • Excellent problem-solving skills with the ability to work under pressure in a fast-paced environment. • Strong communication skills to effectively collaborate with cross-functional teams.

Posted 4 months ago

Apply

1.0 - 4.0 years

5 - 5 Lacs

Chennai

Work from Office

• Proficient in using DataDog, Grafana, and Nagios for monitoring and analysis • Experience in incident management and resolution • Knowledge of AWS and Azure services and architecture • Generate and analyse monitoring reports to provide insights Required Candidate profile • Understanding DevOps principles & practices • Scripting languages • Develop & maintain automation scripts to streamline monitoring process • Detect, analyse, & resolve Cloud/Infrastructure issues

Posted Date not available

Apply

4.0 - 9.0 years

3 - 7 Lacs

Bengaluru

Work from Office

As a Technical Support Engineer, Hybrid Cloud, you will utilize your passion for helping others to ensure that our Users and Enterprises are successful in their use of DataStax products and solutions. This is a continuous learning and teaching role where you will develop and share your knowledge of troubleshooting, configuration, and exciting new technologies inclusive of and complementary to Apache Cassandra, DataStax Enterprise, and Astra. What You'll Do: • Research, reproduce, troubleshoot, and solve highly challenging technical issues for our enterprise customers on various product offerings, including NoSQL DBs, SaaS and GenAI. • Provide thoughtful direction and support for technical inquiries, ensuring customer issues are resolved as expediently as possible. • Diagnose and reproduce customer-reported issues and accurately log them in JIRA. • Participate in an on-call rotation for after-hours, holiday, and weekend support coverage. • Document known solutions in the internal and external knowledge base. • Collaborate and contribute to the development and improvement of Support Team infrastructure tools and processes. Required education: Bachelor's Degree. Preferred education: Master's Degree. Required technical and professional expertise: • 4+ years of proven experience supporting large enterprise customers in customer-facing roles such as support engineer, application developer, database administrator, or Site Reliability Engineer (SRE). • Strong understanding of Java, Python, and/or Go, including the ability to read error codes, debug, and troubleshoot issues by diving into application code. • Expert-level Linux skills, encompassing command-line navigation, diagnostic tools, and the ability to identify and resolve hardware or Linux environment bottlenecks, along with best practices for optimization. • Deep networking skills, including the ability to effectively troubleshoot network issues and proficiency with relevant diagnostic tools.
• Exceptional verbal and written communication skills. • Collaborative and self-motivated, with the ability to multi-task effectively and thrive during high-pressure situations. • Experience with Support Ticketing Systems. • Experience supporting Kubernetes-based distributed applications, or an understanding of Kubernetes fundamentals. • Experience with Java Virtual Machine (JVM) tuning, including knowledge of its lifecycle and troubleshooting. • Proficiency with cloud monitoring tools on AWS, GCP, or Azure, including setting up networking and Access Control Lists (ACLs). • Experience in designing and optimizing highly available systems, including knowledge of how resource distribution across logical nodes and infrastructure components (both on-premise and in cloud environments) impacts service resilience and uptime. Preferred technical and professional experience: • Basic understanding of Kubernetes and its core components as they relate to managing containerized applications, including familiarity with reviewing logs for troubleshooting, monitoring service health, and understanding basic networking concepts within Kubernetes clusters. Exposure to supporting distributed database systems in such environments is a plus. • Experience with Apache Cassandra™ or DataStax Enterprise. • Experience with and knowledge of monitoring technologies such as Prometheus and Grafana. • Experience supporting Apache Pulsar, Apache Kafka, or similar messaging/streaming technologies.

Posted Date not available

Apply

15.0 - 20.0 years

5 - 9 Lacs

Bengaluru

Work from Office

Project Role: Application Developer. Project Role Description: Design, build and configure applications to meet business process and application requirements. Must have skills: Google Cloud DevOps Services. Good to have skills: NA. Minimum 5 year(s) of experience is required. Educational Qualification: 15 years full time education. Key Responsibilities: Infrastructure Reliability & Monitoring: 1) Ensure high availability and performance of production and development environments. 2) Design, develop, and maintain robust monitoring, alerting, and logging systems using tools such as CloudWatch, Splunk, and Stackdriver. 3) Conduct root cause analysis and postmortems for production incidents and drive incident response processes. Automation and DevOps: 1) Build and maintain CI/CD pipelines (e.g., GitHub Actions, Jenkins, Cloud Build) for infrastructure and application deployments. 2) Implement Infrastructure as Code (IaC) using Terraform, Helm, or Deployment Manager to manage cloud resources consistently and efficiently. 3) Automate routine operational tasks to improve system efficiency and reduce manual effort. Cloud & Systems Engineering: 1) Manage production environments across Google Cloud Platform (GCP) with services such as Cloud Run, Cloud Scheduler, GKE, Cloud Storage, Pub/Sub, and BigQuery. 2) Implement and maintain scalability solutions such as autoscaling, load balancing, and failover strategies. 3) Ensure security and compliance across infrastructure by implementing best practices and regular audits. Performance & Capacity Planning: 1) Conduct system performance tuning and capacity planning to ensure systems scale to meet business needs. 2) Analyze usage patterns and load trends to forecast future infrastructure requirements. Collaboration & Documentation: 1) Partner with development, QA, and operations teams to improve the reliability and efficiency of product deployments. 2) Document operational procedures, runbooks, architecture, and incident response plans for consistency and scalability.
Qualifications and Skills: Technical Skills: 1) 8-12 years of IT experience, with at least 6 years in an SRE, DevOps, or infrastructure engineering role. 2) Hands-on experience with Google Cloud Platform (GCP), including services like Cloud Run, GKE, Cloud Monitoring, BigQuery, and IAM. 3) Strong knowledge of containers (Docker) and Kubernetes. 4) Proficiency in Shell scripting, Python, and Helm is a must. 5) Experience with CI/CD pipelines, Git-based version control (e.g., GitHub), and automated testing frameworks. 6) Deep understanding of networking concepts, DNS, SSL/TLS, proxies, and firewalls. Soft Skills: 1) Strong analytical and problem-solving capabilities. 2) Excellent written and verbal communication skills. 3) Ability to work collaboratively across global teams in a fast-paced, agile environment. 4) Detail-oriented with a focus on reliability and operational excellence. Shift & Availability: Selected candidates should be willing to work in Shift B and occasionally support weekend deployments or incidents as needed. Education and Experience: Bachelor's degree in Computer Science, Engineering, or a related technical field. Relevant certifications in GCP or Kubernetes (CKA/CKAD) are a plus.

Posted Date not available

Apply

1.0 - 2.0 years

3 - 5 Lacs

Chennai

Work from Office

Role & responsibilities: Monitor Cloud Infrastructure: Continuously monitor cloud environments (AWS, Azure, and GCP) using DataDog, AppTio, and Nagios to ensure optimal performance and availability. Incident Management: Detect, analyse, and resolve Cloud/Infrastructure issues in a timely manner to minimize downtime and impact on services. Performance Tuning: Identify and implement optimizations to improve cloud infrastructure performance. Reporting: Generate and analyse monitoring reports to provide insights and recommendations for infrastructure improvements. Collaboration: Work closely with development, DevOps, and Cloud/Infrastructure teams to ensure seamless integration and performance of cloud services. Documentation: Maintain comprehensive documentation of monitoring setups, procedures, and best practices. Compliance: Ensure all monitoring practices adhere to industry standards and compliance requirements. Required Qualifications: Experience: Minimum of 1 to 2 years of experience in cloud monitoring or a related field. Tools and Technologies: Proficient in using DataDog, Grafana, and Nagios for monitoring and analysis. Cloud Platforms: Strong knowledge of AWS, Azure, and GCP services and architecture. Scripting and Automation: Experience with scripting languages (e.g., Python, Shell) and automation tools. Incident Response: Proven experience in incident management and resolution. Analytical Skills: Strong analytical and problem-solving skills. Communication: Excellent verbal and written communication skills, with the ability to convey complex technical concepts to non-technical stakeholders. OS Competency: Linux and Windows. Microservices: Docker. Preferred Qualifications: Experience with Other Monitoring Tools: Familiarity with other monitoring tools and platforms is a plus. DevOps Practices: Understanding of DevOps principles and practices.

Posted Date not available

Apply

4.0 - 7.0 years

8 - 10 Lacs

Bengaluru

Work from Office

We are looking for a passionate DevOps Engineer for a 6-month contract role based in Bangalore. The ideal candidate will have 4-7 years of experience managing and optimizing infrastructure on AWS and GCP. Responsibilities include ensuring high availability, scalability, infrastructure security, and performance monitoring. This is a full-time role requiring immediate joining. Candidates must have expertise in AWS, scripting, Linux systems, automation, and monitoring tools like Prometheus, Grafana, CloudWatch, and New Relic.

Posted Date not available

Apply

3.0 - 5.0 years

3 - 7 Lacs

Chennai

Work from Office

Skill Set: • .NET Framework and .NET Core • Microservice Architecture and Docker Containers • AWS and Azure Cloud Services, Kubernetes • MySQL, SQL • Kafka Message Streams • Cloud Monitoring Tools • Automation Framework: SpecFlow

Posted Date not available

Apply

6.0 - 10.0 years

10 - 15 Lacs

Noida

Work from Office

Required Skills: GCP Proficiency: Strong expertise in Google Cloud Platform (GCP) services and tools, including Compute Engine, Google Kubernetes Engine (GKE), Cloud Storage, Cloud SQL, Cloud Load Balancing, IAM, Google Workflows, Google Cloud Pub/Sub, App Engine, Cloud Functions, Cloud Run, API Gateway, Cloud Build, Cloud Source Repositories, Artifact Registry, Google Cloud Monitoring, Logging, and Error Reporting. Cloud-Native Applications: Experience in designing and implementing cloud-native applications, preferably on GCP. Workload Migration: Proven expertise in migrating workloads to GCP. CI/CD Tools and Practices: Experience with CI/CD tools and practices. Python and IaC: Proficiency in Python and Infrastructure as Code (IaC) tools such as Terraform. Responsibilities: Cloud Architecture and Design: Design and implement scalable, secure, and highly available cloud infrastructure solutions using Google Cloud Platform (GCP) services and tools such as Compute Engine, Kubernetes Engine, Cloud Storage, Cloud SQL, and Cloud Load Balancing. Cloud-Native Applications Design: Develop high-level architecture designs and guidelines for the development, deployment, and life-cycle management of cloud-native applications on GCP, ensuring they are optimized for security, performance, and scalability using services like App Engine, Cloud Functions, and Cloud Run. API Management: Develop and implement guidelines for securely exposing the interfaces of workloads running on GCP, with granular access control using the IAM platform, RBAC platforms, and API Gateway. Workload Migration: Lead the design and migration of on-premises workloads to GCP, ensuring minimal downtime and data integrity.

Posted Date not available

Apply
