648 SRE Jobs - Page 3

JobPe aggregates results for easy application access, but you actually apply on the job portal directly.

12.0 - 17.0 years

40 - 50 Lacs

Hyderabad

Work from Office

Job Description

Position Overview
The Lead SRE will be responsible for overseeing the maintenance, support, and continuous improvement of the organization's software systems. This role ensures optimal performance, security, and reliability of all software applications while managing a team of engineers and collaborating with other departments to address business needs.

Skillsets: DevOps, SRE, CI/CD

Key Responsibilities

1. Software Maintenance & Support
• Oversee the maintenance and troubleshooting of enterprise applications, ensuring high availability and performance.
• Develop and implement strategies for proactive software monitoring, issue resolution, and system optimization.
• Lead the identification and resolution of bugs, security vulnerabilities, and performance bottlenecks.

2. Team Leadership & Management
• Manage and mentor a team of software maintenance engineers and support specialists.
• Define clear roles, responsibilities, and KPIs for the maintenance team.
• Foster a culture of continuous learning and process improvement.

3. Process Improvement & Automation
• Establish and optimize software maintenance processes, including version control, patch management, and rollback strategies.
• Identify opportunities for automation to improve efficiency and reduce downtime.
• Ensure adherence to ITIL best practices and industry standards.

4. Collaboration & Stakeholder Management
• Work closely with software development, infrastructure, and business teams to align maintenance strategies with organizational goals.
• Act as the escalation point for critical software issues impacting business operations.
• Communicate effectively with leadership on system health, risks, and improvement plans.

5. Security & Compliance
• Ensure compliance with data security regulations and industry standards.
• Oversee the implementation of security patches and software updates.
• Conduct periodic audits to assess system vulnerabilities and risks.

6. Vendor & Third-Party Management
• Manage relationships with third-party software vendors and service providers.
• Oversee software licensing, renewals, and support contracts.
• Evaluate vendor performance and negotiate service-level agreements (SLAs).

Key Requirements

1. Education & Experience
• Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
• 10+ years of experience in software maintenance, IT operations, or application support.
• 5+ years of leadership/management experience in a similar role.

2. Technical Skills
• Strong expertise in software maintenance methodologies, troubleshooting, and debugging.
• Proficiency in cloud platforms (AWS, Azure, or GCP), databases, and enterprise applications.
• Experience with monitoring tools, IT service management (ITSM) tools, and automation frameworks.
• Understanding of cybersecurity best practices and compliance frameworks.

3. Leadership & Soft Skills
• Excellent leadership, communication, and stakeholder management skills.
• Strong analytical and problem-solving capabilities.
• Ability to work in a fast-paced and high-pressure environment.

Preferred Qualifications
• ITIL certification or relevant IT service management experience.
• Experience in fintech, banking, or high-availability system environments.
• Exposure to DevOps, CI/CD, and Agile methodologies.

Posted 1 week ago


10.0 - 15.0 years

7 - 11 Lacs

Bengaluru

Work from Office

Capco, a Wipro company, is a global technology and management consulting firm. It was awarded Consultancy of the Year at the British Bank Award and was ranked among the Top 100 Best Companies for Women in India 2022 by Avtar & Seramount. With a presence in 32 cities across the globe, we support 100+ clients across the banking, financial and energy sectors. We are recognized for our deep transformation execution and delivery.

WHY JOIN CAPCO
You will work on engaging projects with the largest international and local banks, insurance companies, payment service providers and other key players in the industry. These are projects that will transform the financial services industry.

MAKE AN IMPACT
Innovative thinking, delivery excellence and thought leadership to help our clients transform their business. Together with our clients and industry partners, we deliver disruptive work that is changing energy and financial services.

#BEYOURSELFATWORK
Capco has a tolerant, open culture that values diversity, inclusivity, and creativity.

CAREER ADVANCEMENT
With no forced hierarchy at Capco, everyone has the opportunity to grow as we grow, taking their career into their own hands.

DIVERSITY & INCLUSION
We believe that diversity of people and perspective gives us a competitive advantage.

Job Title: Strong candidate with an engineering mindset who can drive the work independently without depending much on peers. Good Terraform coding skills required; no Azure DevOps and operations (SRE) profiles needed.

Key responsibilities
- Work with Microsoft Azure Cloud components and architecture.
- Develop, deploy, and maintain Azure Cloud resources as IaC.
- Manage the entire deployment cycle for Azure landing zones.
- Be involved in designing and automating processes for CI/CD.

Your skills and experience
- 10+ years of experience in IT infrastructure management, including a minimum of 6 years in Azure.
- Expertise in managing Azure environments in enterprise settings.
- Expertise in Azure IaaS, PaaS, and SaaS offerings. Must understand end-to-end Azure IaaS components.
- Experience implementing different Azure IaaS services in Compute, Storage, and Networking.
- Hands-on experience with GitHub Actions.
- Infrastructure as Code (IaC) experience (Terraform).
- Scripting and automation via AZ CLI/PowerShell, Python, Bash, etc.
- Hands-on experience with YAML pipelines.
- Building and automating CI/CD pipelines (Azure DevOps/GitHub) for different applications.
- Experience with Linux operating systems is preferred.
- Experience in Azure networking concepts.
- Logical thinking, troubleshooting, presentation, and good communication skills.
- Should communicate well and work independently.

Resources will support us with the following:
- Support with Landing Zone migrations
- Initiate Service Enablement Process for Azure Services currently in use
- Terraform module development
- New LZ provisioning, Access Packages, PIM groups, etc.
- Troubleshoot migration issues
- Technical documentation

Posted 1 week ago


5.0 - 7.0 years

15 - 20 Lacs

Bengaluru

Work from Office

Educational Requirements: Master of Science (Technology), Master of Comp. Applications, Master of Engineering, Master of Tech (Integrated), Master of Technology, Bachelor of Comp. Applications, Bachelor of Science, Bachelor of Engineering, Bachelor of Technology (Integrated)

Service Line: Application Development and Maintenance

Responsibilities: A day in the life of an Infoscion
As part of the Infosys consulting team, your primary role would be to lead the engagement effort of providing high-quality and value-adding consulting solutions to customers at different stages, from problem definition to diagnosis to solution design, development and deployment. You will review the proposals prepared by consultants, provide guidance, and analyze the solutions defined for the client business problems to identify any potential risks and issues. You will identify change management requirements and propose a structured approach to the client for managing the change using multiple communication mechanisms. You will also coach and create a vision for the team, provide subject matter training for your focus areas, and motivate and inspire team members through effective and timely feedback and recognition for high performance. You would be a key contributor to unit-level and organizational initiatives with the objective of providing high-quality, value-adding consulting solutions to customers while adhering to the guidelines and processes of the organisation. If you think you fit right in to help our clients navigate their next in their digital transformation journey, this is the place for you!

Additional Responsibilities:
- Good knowledge of software configuration management systems
- Strong business acumen, strategy and cross-industry thought leadership
- Awareness of latest technologies and industry trends
- Logical thinking and problem-solving skills along with an ability to collaborate
- Knowledge of two or three industry domains
- Understanding of the financial processes for various types of projects and the various pricing models available
- Client interfacing skills
- Knowledge of SDLC and agile methodologies
- Project and team management

Technical and Professional Requirements:
Primary skills: Technology-DevOps-DevOps Architecture Consultancy
Preferred Skills: Technology-DevOps-DevOps Architecture Consultancy

Posted 1 week ago


3.0 - 5.0 years

14 - 19 Lacs

Bengaluru

Work from Office

Educational Requirements: Master of Science (Technology), Master of Comp. Applications, Master of Engineering, Bachelor of Comp. Applications, Bachelor of Science (Tech), Bachelor of Engineering, Bachelor of Technology (Integrated)

Service Line: Application Development and Maintenance

Responsibilities: A day in the life of an Infoscion
As part of the Infosys consulting team, your primary role would be to actively aid the consulting team in different phases of the project, including problem definition, effort estimation, diagnosis, solution generation, design, and deployment. You will explore alternatives to the recommended solutions based on research that includes literature surveys, information available in public domains, vendor evaluation information, etc., and build POCs. You will create requirement specifications from the business needs and define the to-be processes and detailed functional designs based on requirements. You will support configuring solution requirements on the products, understand any issues, diagnose the root cause of such issues, seek clarifications, and then identify and shortlist solution alternatives. You will also contribute to unit-level and organizational initiatives with the objective of providing high-quality, value-adding solutions to customers. If you think you fit right in to help our clients navigate their next in their digital transformation journey, this is the place for you!

Additional Responsibilities:
- Ability to work with clients to identify business challenges and contribute to client deliverables by refining, analyzing, and structuring relevant data
- Awareness of latest technologies and trends
- Logical thinking and problem-solving skills along with an ability to collaborate
- Ability to assess the current processes, identify improvement areas and suggest technology solutions
- Knowledge of one or two industry domains

Technical and Professional Requirements:
Primary skills: Technology-DevOps-DevOps Architecture Consultancy
Preferred Skills: Technology-DevOps-DevOps Architecture Consultancy; Technology-DevOps-Continuous integration - Others

Posted 1 week ago


7.0 - 12.0 years

18 - 22 Lacs

Bengaluru

Work from Office

Consult with clients and propose architectural solutions to help move and improve infrastructure from on-premises to cloud, or help optimize cloud spend when moving from one public cloud to another. Be the first to experiment with new-age cloud offerings, help define best practices as a thought leader for cloud, automation and DevOps, and be a solution visionary and technology expert across multiple channels. Good understanding of cloud design principles, sizing, multi-zone/cluster setup, resiliency and DR design. A Solution Architect or similar certification from Azure is a must. Good business judgment, a comfortable, open communication style, and a willingness and ability to work with customers and teams. Strong communication skills and the ability to lead discussions with client technical experts, application teams and vendors to drive collaboration and a design-thinking model towards reaching the desired objective.

Required education: Bachelor's Degree
Preferred education: Master's Degree

Required technical and professional expertise
- Experience in participating in technical reviews of requirements, designs, code, and other artifacts, and in using your multicloud experience to build hybrid-cloud solutions for customers.
- Provide leadership to project teams and facilitate the definition of project deliverables around core cloud-based technology and methods.
- Define tracking mechanisms and ensure IT standards and methodology are met; deliver quality results.
- Sound knowledge of SRE principles and the ability to address performance issues through design or coding is a must.
- Implement observability, and develop and support a pipeline model to deploy key features and changes.
- Security, Risk and Compliance: advise customers on best practices around access management, network setup, regulatory compliance, and related areas.

Preferred technical and professional experience
- 10 - 15 years of experience, with at least 5+ years of hands-on experience in Azure Cloud Computing and IT operational experience in a global enterprise environment.
- Experience in Azure Databricks is preferred.
- Must have Azure DevOps experience, expertise in all Azure services, database and operating systems experience, and good experience with automation tools such as Terraform, Ansible, etc.
- Should work on IBM Cloud projects as and when needed.

Posted 1 week ago


2.0 - 7.0 years

10 - 20 Lacs

Hyderabad, Ahmedabad, Chennai

Work from Office

Skills: SRE, AWS Devops, Azure Devops Education: B.TECH, B.Sc, BCA Year of Experience : 2+yrs Location : Pan India

Posted 1 week ago


5.0 - 15.0 years

0 Lacs

Ahmedabad, Gujarat

On-site

The Platform Engineering Lead will drive the design, development, and continuous evolution of scalable, secure, and high-performance platforms that support OT cybersecurity services. You will be responsible for building a modular, multi-tenant technology foundation that supports rapid solution delivery, strong compliance postures (e.g., IEC 62443, NIST), and robust integrations with SIEM, IAM, EDR, and OEM tools. This role combines hands-on platform architecture leadership with strategic thinking, governance, vendor management, and team building across DevSecOps, infrastructure, and engineering teams.

Preferred Qualifications:
- Education: Bachelor's or Master's degree in Computer Science, Information Technology, or related field. Additional specialization in Cybersecurity, Cloud Architecture, or Systems Engineering is a strong plus.
- Certifications (preferred, not mandatory): Cloud certifications such as AWS Certified Solutions Architect Professional, Azure Solutions Architect Expert, or GCP Professional Cloud Architect. Security certifications like CISSP, CISM, or CISA to demonstrate security leadership. DevOps/Architecture certifications like TOGAF, Kubernetes CKA/CKAD, or HashiCorp Terraform Certification. Compliance awareness in IEC 62443, or training in NIST/ISO 27001/GRC frameworks.

Key Requirements:
- 15+ years of experience in technology architecture or platform engineering, with a minimum of 5 years in leadership roles.
- Deep expertise in cloud-native architecture, DevSecOps, SRE, and cybersecurity integrations.
- Experience in microservices, modular platforms, and container orchestration (Kubernetes, Docker).
- Strong exposure to at least two public clouds (AWS/Azure/GCP).
- Hands-on experience with infrastructure automation, secrets management, and release pipelines.
- Familiarity with compliance standards such as IEC 62443, NIST CSF, ISO 27001 is a plus.
- Prior experience in OT/ICS cybersecurity, IT-OT convergence, or critical infrastructure platforms is desirable.
- Proven ability to lead cross-functional teams, communicate with CXOs, and manage strategic vendors.

Key Responsibilities:
- Lead the architecture and engineering of modular, multi-tenant cybersecurity platforms for IT/OT convergence.
- Build and scale cloud-native infrastructures using AWS/Azure/GCP, ensuring 99.9% uptime, horizontal scalability, and security-by-design principles.
- Implement and govern robust CI/CD, IaC (e.g., Terraform), containerization (e.g., Kubernetes, Docker), and monitoring frameworks (e.g., Prometheus, Grafana, ELK).
- Ensure platform readiness for integration with cybersecurity tools including SIEM, SOAR, EDR/XDR, IAM, PKI, and asset discovery platforms.
- Drive DevSecOps maturity across environments, ensuring best practices in secure coding, automated testing, secrets management, and release pipelines.
- Define platform engineering OKRs, build sprint governance, and lead agile delivery teams across infrastructure, tooling, and backend development.
- Collaborate with Product, Delivery, OT Engineering, and GRC teams to ensure platform alignment with business goals, service offerings, and compliance needs.
- Lead vendor evaluations, tool benchmarking, and integration programs with OEM cybersecurity, cloud, and automation partners.

Posted 1 week ago


4.0 - 8.0 years

0 Lacs

Chennai, Tamil Nadu

On-site

As a qualified candidate for this role, you should possess a Bachelor's or Master's degree in Computer Science, Information Technology, or a related field. Additionally, you should have 4-5 years of software development experience, with a focus on multicloud architecture, specifically in Service cloud/Vlocity omnistudio. Your experience should include working with Omni studio tools such as Data Raptor, Integration procedure, Omni script, Flex card, Trigger, Apex, Lightning Web Components, Aura Components, and managed packages. Hands-on experience in development practices is a must. Your responsibilities will involve designing and implementing complex Salesforce solutions, including creating Salesforce flows, entry-criteria/profiles, and data migrations. You should be proficient in creating flows, modifying objects, creating custom objects, writing Apex, triggers, and integrating API services using IDE/VisualSourceCode/Codebuilder. Understanding differing integration patterns will be crucial in this role. Experience with system integrations involving Salesforce.com web services (JSON, SOAP) and Vlocity Integration Procedure is required. You will be developing Apex (classes and triggers) to extend Salesforce to meet business requirements and working on custom User Interface development, including Lightning pages and Web Components. High-quality code development that aligns with customer needs and emphasizes simplicity, clarity, and testability is essential. Collaboration with Ford IT/Development teams to integrate Salesforce across the business will be part of your role. Adhering to Salesforce best practices, maintaining code documentation, writing/maintaining test classes for all custom development, and extending these best practices across Ford organizations are key responsibilities. You will also take ownership of release cycles to implement and deploy new/updates to existing applications and code. 
Additionally, excellent analytical and problem-solving skills and strong communication and collaboration skills are important for successful performance in this role. You should ideally have strong proficiency in the Salesforce development ecosystem (Apex, LWC, Visualforce, Java), significant experience in web development environments (HTML, CSS, JavaScript), and familiarity with agile development methodologies. Nice-to-have qualifications include Salesforce certification (Salesforce Developer/Omni Script Developer), Salesforce Administrator, Platform App Builder, Data Architect. Experience in SRE in Copado and the ability to architect services considering observability, traceability, and monitoring aspects will be beneficial. Proven experience in architecting and implementing service-oriented solutions, configuring and managing enterprise monitoring tools, and integrating with Cloud PaaS tech stacks are also desirable skills for this role.

Posted 1 week ago


7.0 - 12.0 years

25 - 40 Lacs

Chennai

Work from Office

Job Title: Site Reliability Engineer (SRE), AWS
Location: Chennai
Job Type: Full-time

Job Summary:
We are seeking a highly skilled and motivated Site Reliability Engineer (SRE) with deep expertise in Amazon Web Services (AWS) to join our team. The ideal candidate will be responsible for ensuring the reliability, scalability, and performance of our cloud infrastructure and services. You will work closely with development, operations, and security teams to build and maintain robust systems that support our business goals.

Key Responsibilities:
- Design, implement, and maintain scalable, resilient, and secure AWS infrastructure.
- Develop automation tools and frameworks for deployment, monitoring, and operations.
- Monitor system performance, availability, and reliability using tools like CloudWatch, Prometheus, Grafana, etc.
- Implement Infrastructure as Code (IaC) using tools like Terraform, CloudFormation, or AWS CDK.
- Collaborate with development teams to improve system architecture and application reliability.
- Manage incident response, root cause analysis, and post-mortem documentation.
- Optimize cost and performance of AWS resources.
- Ensure compliance with security and governance policies.
- Participate in on-call rotations and proactively address system alerts and outages.

Required Qualifications:
- Bachelor’s degree in Computer Science, Engineering, or related field.
- 7+ years of experience in SRE, DevOps, or Cloud Engineering roles.
- Strong hands-on experience with AWS services (EC2, S3, RDS, Lambda, ECS/EKS, etc.).
- Proficiency in scripting languages (Python, Bash, etc.).
- Experience with CI/CD tools (Jenkins, GitLab CI, AWS CodePipeline).
- Familiarity with containerization and orchestration (Docker, Kubernetes).
- Solid understanding of networking, security, and system administration.
- Excellent problem-solving and communication skills.

Preferred Qualifications:
- AWS certifications (e.g., AWS Certified DevOps Engineer, Solutions Architect).
- Experience with observability tools (Datadog, New Relic, ELK Stack).
- Knowledge of chaos engineering and reliability testing practices.
- Experience in a high-availability, mission-critical environment.

Posted 1 week ago


5.0 - 10.0 years

15 - 30 Lacs

Noida

Hybrid

Lead Site Reliability Engineer

Lead Site Reliability Engineers at UKG are critical team members who have a breadth of knowledge encompassing all aspects of service delivery. They develop software solutions to enhance, harden and support our service delivery processes. This can include building and managing CI/CD deployment pipelines, automated testing, capacity planning, performance analysis, monitoring, alerting, chaos engineering and auto remediation. Lead Site Reliability Engineers must be passionate about learning and evolving with current technology trends. They strive to innovate and are relentless in pursuing a flawless customer experience. They have an "automate everything" mindset, helping us bring value to our customers by deploying services with incredible speed, consistency, and availability.

Job Responsibilities:
- Engage in and improve the lifecycle of services from conception to EOL, including system design consulting and capacity planning
- Define and implement standards and best practices related to system architecture, service delivery, metrics, and the automation of operational tasks
- Support services, product and engineering teams by providing common tooling and frameworks to deliver increased availability and improved incident response
- Improve system performance, application delivery and efficiency through automation, process refinement, postmortem reviews, and in-depth configuration analysis
- Collaborate closely with engineering professionals within the organization to deliver reliable services
- Increase operational efficiency, effectiveness, and quality of services by treating operational challenges as a software engineering problem (reduce toil)
- Guide junior team members and serve as a champion for Site Reliability Engineering
- Actively participate in incident response, including on-call responsibilities
- Partner with stakeholders to influence and help drive the best possible technical and business outcomes

Required Qualifications:
- Engineering degree, or a related technical discipline, or equivalent work experience
- Experience coding in higher-level languages (e.g., Python, JavaScript, C++, or Java)
- Knowledge of cloud-based applications and containerization technologies
- Demonstrated understanding of best practices in metric generation and collection, log aggregation pipelines, time-series databases, and distributed tracing
- Working experience with industry-standard tools like Terraform and Ansible
- Demonstrable fundamentals in 2 of the following: Computer Science, Cloud architecture, Security, or Network Design

Experience, Education, Certification, License and Training:
- Must have at least 5 years of hands-on experience working in Engineering or Cloud
- Minimum 5 years' experience with public cloud platforms (e.g. GCP, AWS, Azure)
- Minimum 3 years' experience in configuration and maintenance of applications and/or systems infrastructure for a large-scale customer-facing company
- Experience with distributed system design and architecture

Posted 1 week ago


3.0 - 5.0 years

5 - 15 Lacs

Hyderabad

Work from Office

Description: SRE Role

Requirements:
Years of experience: 2 to 4 years
We are seeking an SRE Engineer to help provide strategic direction and technical expertise to ensure the ongoing success and reliability of the platform and products.

Job Responsibilities:
- Support and provide guidance in designing, building, and maintaining highly available, scalable, and reliable SaaS infrastructure.
- Support resilient systems and solutions that meet stringent SLAs.
- Lead efforts to ensure the reliability and uptime of our product, driving proactive monitoring, alerting, and incident response practices.
- Develop and implement strategies for fault tolerance, disaster recovery, and capacity planning.
- Conduct thorough post-incident reviews and root cause analyses to identify areas for improvement and prevent recurrence.
- Drive automation initiatives to streamline operational workflows, reduce manual effort, and improve efficiency.
- Champion DevOps best practices, promoting infrastructure as code, CI/CD pipelines, and other automation tools and methodologies.
- Support and partner with other teams on improving our observability systems to monitor site stability and performance.
- Continuously learn and explore new tools, techniques, and methodologies to drive innovation and enhance the DevOps platform.
- Work closely with development teams to optimize application performance and efficiency.
- Implement tools and techniques to measure and improve service latency, throughput, and resource utilization.
- Identify and implement cost-saving measures to ensure cloud infrastructure spending is optimized.
- Proactively identify and address security vulnerabilities in the cloud environment.
- Collaborate closely with engineering, product management, CISO and other teams to align on reliability goals, prioritize projects, and drive cross-functional initiatives.
- Communicate effectively with stakeholders to provide visibility into reliability initiatives, progress, and challenges.
- Maintain documentation of processes, configurations, and technical guidelines.

What We Offer:
Exciting Projects: We focus on industries like high-tech, communication, media, healthcare, retail and telecom. Our customer list is full of fantastic global brands and leaders who love what we build for them.
Collaborative Environment: You can expand your skills by collaborating with a diverse team of highly talented people in an open, laidback environment, or even abroad in one of our global centers or client facilities!
Work-Life Balance: GlobalLogic prioritizes work-life balance, which is why we offer flexible work schedules, opportunities to work from home, and paid time off and holidays.
Professional Development: Our dedicated Learning & Development team regularly organizes communication skills training (GL Vantage, Toast Master), a stress management program, professional certifications, and technical and soft skill trainings.
Excellent Benefits: We provide our employees with competitive salaries, family medical insurance, Group Term Life Insurance, Group Personal Accident Insurance, NPS (National Pension Scheme), a periodic health awareness program, extended maternity leave, annual performance bonuses, and referral bonuses.
Fun Perks: We want you to love where you work, which is why we host sports events, cultural activities and corporate parties, and offer food at subsidized rates. Our vibrant offices also include dedicated GL Zones, rooftop decks and GL Club where you can drink coffee or tea with your colleagues over a game of table, and we offer discounts for popular stores and restaurants!

Posted 1 week ago


2.0 - 5.0 years

3 - 8 Lacs

Hyderabad

Work from Office

Role Description: The DevOps Engineer helps increase speed of delivery, improve the quality and security of code, and optimize processes for the development team. The DevOps Engineer is responsible for identifying the bottlenecks of various development and delivery processes, working with team members to improve them, and improving the overall experience of developers. They are responsible for infrastructure-as-code deployment tooling and supporting services.

Responsibilities:
- Work closely with Tech Leads and developers of various teams to assess existing problems and come up with process improvement solutions
- Detect upcoming bottlenecks and production issues proactively and consult teams hands-on towards improved technical solutions
- Participate in planning delivery time, code quality, and process efficiency improvement projects
- Execute on the plan by building coding standardizations and automating processes for the organization
- Perform daily tasks such as environmental health checks, disk space monitoring, and environmental status reports
- Maintain and grow knowledge of platform configuration management and troubleshooting
- Actively participate in deploying application artifacts to appropriate target environments using the supported technologies and infrastructure

Domain Experience:
- 3+ years of proven tech experience with implementing high-scale system architectures
- Bachelor's in Computer Science (or related field)
- Excellent coding and scripting skills (Bash, Perl) and experience implementing on-premise servers using the Linux CLI to develop and maintain server configurations
- Proven work experience in installing, configuring, and troubleshooting Linux-based environments
- Basic troubleshooting and diagnosis of network equipment
- Experience with continuous integration and related tools such as Jenkins, Hudson, Maven, Ant, Git, Sonar, etc.

Agile/Digital Experience:
- Strong understanding of Agile methodologies
- Experience as a DevOps or SRE Engineer on a cross-functional agile team preferred

Individual Skills:
- Strong communication skills with the ability to communicate complex technical concepts and align the organization on decisions
- Utilizes team collaboration to create innovative solutions efficiently

Mindsets and Behaviors:
- Wants to unleash inner self-starter and work in an environment that fosters entrepreneurial minds
- Believes in a culture of transparency and trust
- Open to learning new ideas outside scope or knowledge

Posted 1 week ago


1.0 - 5.0 years

0 Lacs

Pune, Maharashtra

On-site

As a Site Reliability Engineer - Incident Management, you will be responsible for monitoring, maintaining, and managing the entire Qualys infrastructure and services installed at different data centers. In the event of any malfunction in products/services, you will be required to monitor, troubleshoot, repair, and restore the service/system promptly to ensure maximum service availability and performance. Your role will also involve providing support services for Engineering and other technical teams, collaborating for quicker issue resolution, performing end-to-end incident management, documentation, and task automation. Your main responsibilities will include monitoring the performance and capacity of computer systems, utilizing various tools to identify and address issues effectively. You will be expected to conduct basic troubleshooting of platform/product issues, utilize tools such as Splunk, Grafana, Kibana for performance checking, and manage PagerDuty. Additionally, you will assist in task automation wherever applicable, ensure timely resolution of incident tickets, and work on triaging and troubleshooting problems affecting products or services. It will be crucial for you to meticulously track and document all issues and resolutions in detail on the ticketing/documentation tools to enhance the knowledge base and maintain a record of system health. In cases where troubleshooting complex issues is not feasible, you should escalate the problem to management, IT resources, or 3rd party vendors for further assistance. Communication within the team and externally to stakeholders, keeping them informed of relevant information, known issues, and steps being taken, will be an integral part of your role. The Site Reliability Engineer - Incident Management team will operate 24*7*365 on a monthly shift rotation basis as per requirements. 
To excel in this role, you should possess one to two years of IT Operations (Infra/System admin/Linux) experience or relevant certification. Familiarity with monitoring and integration tools like Splunk, Prometheus, Grafana, Kibana, PagerDuty, Runscope, and incident management tools such as Jira/ServiceNow is beneficial. A good understanding of ITSM main functions and tools, along with strong interpersonal skills to interact with employees at all levels professionally, will be essential. Certifications in computer functionality, Linux, System Admin, VMware, IT Security, or ITSM/ITIL, and knowledge of DevOps/SRE basics, Python, and Cloud will be advantageous for this role.

Posted 1 week ago

Apply

7.0 - 10.0 years

18 - 20 Lacs

Hyderabad

Remote

Java + AI/ML role requiring 6+ years of industry experience with Java, Spring Boot, and Spring Data, and at least 2 years of AI/ML project or professional experience. Strong experience in building and consuming REST APIs and asynchronous messaging (Kafka/RabbitMQ). Working experience in integrating AI/ML models into Java services or calling external ML endpoints (REST/gRPC). Understanding of the ML lifecycle: training, validation, inference, monitoring, and retraining. Familiarity with tools like TensorFlow, PyTorch, Scikit-Learn, or ONNX. Prior experience in domain-specific ML implementations (e.g., fraud detection, recommendation systems, NLP chatbots). Experience working with data formats like JSON, Parquet, Avro, and CSV. Solid understanding of database systems, both SQL (PostgreSQL, MySQL) and NoSQL (Redis). Integrate machine learning models (batch and real-time) into backend systems and APIs. Optimize and automate AI/ML workflows using MLOps best practices. Monitor and manage model performance, versioning, and rollbacks. Collaborate with cross-functional teams (DevOps, SRE, Product Engineering) to ensure seamless deployment. Exposure to MLOps tools like MLflow, Kubeflow, or Seldon. Experience with any one of the cloud platforms, preferably AWS, and knowledge of observability tools and their metrics, events, logs, and traces (e.g., Prometheus, Grafana, OpenTelemetry, Splunk, Datadog, AppDynamics).

Posted 1 week ago

Apply

3.0 - 8.0 years

10 - 20 Lacs

Hyderabad, Ahmedabad, Bengaluru

Work from Office

SUMMARY: Sr. Site Reliability Engineer. Keep Planet-Scale Systems Reliable, Secure, and Fast (On-site only). At Ajmera Infotech, we build planet-scale platforms for NYSE-listed clients, from HIPAA-compliant health systems to FDA-regulated software that simply cannot fail. Our 120+ elite engineers design, deploy, and safeguard mission-critical infrastructure trusted by millions. Why You’ll Love It: Dev-first SRE culture: automation, CI/CD, zero-toil mindset. TDD, monitoring, and observability baked in, not bolted on. Code-first reliability: script, ship, and scale with real ownership. Mentorship-driven growth with exposure to regulated industries (HIPAA, FDA, SOC2). End-to-end impact: own infra across Dev and Ops. Requirements: Key Responsibilities: Architect and manage scalable, secure Kubernetes clusters (k8s/k3s) in production. Develop scripts in Python, PowerShell, and Bash to automate infrastructure operations. Optimize performance, availability, and cost across cloud environments. Design and enforce CI/CD pipelines using Jenkins, Bamboo, GitHub Actions. Implement log monitoring and proactive alerting systems. Integrate and tune observability tools like Prometheus and Grafana. Support both development and operations pipelines for continuous delivery. Manage infrastructure components including Artifactory, Nginx, Apache, IIS. Drive compliance-readiness across HIPAA, FDA, ISO, SOC2. Must-Have Skills: 3-8 years in SRE or infrastructure engineering roles. Kubernetes (k8s/k3s) production experience. Scripting: Python, PowerShell, Bash. CI/CD tools: Jenkins, Bamboo, GitHub Actions. Experience with log monitoring, alerting, and observability stacks. Cross-functional pipeline support (Dev + Ops). Tooling: Artifactory, Nginx, Apache, IIS. Performance, availability, and cost-efficiency tuning. Nice-to-Have Skills: Background in regulated environments (HIPAA, FDA, ISO, SOC2). Multi-OS platform experience. Integration of Prometheus, Grafana, or similar observability platforms. Benefits: What We Offer:
Competitive salary package with performance-based bonuses. Comprehensive health insurance for you and your family. Flexible working hours and generous paid leave. High-end workstations and access to our in-house device lab. Sponsored learning: certifications, workshops, and tech conferences.

Posted 1 week ago

Apply

10.0 - 20.0 years

12 - 22 Lacs

Chennai, Bengaluru, Delhi / NCR

Hybrid

Job description: Hiring for SRE DevOps with experience range 5 to 15 years. Mandatory Skills: Site Reliability Engineering, DevOps, Java, Kubernetes, AWS/Azure, DevOps/DevSecOps, Monitoring Tools (AppDynamics/Dynatrace/New Relic), Build and Release, Prometheus, Python, Node.js (Site Reliability Engineer). Education: BE/B.Tech/MCA/M.Tech/MSc./MSts

Posted 1 week ago

Apply

4.0 - 9.0 years

5 - 15 Lacs

Hyderabad, Pune, Bengaluru

Hybrid

Job description: Hiring for SRE DevOps with experience range 3 to 15 years. Mandatory Skills: Site Reliability Engineering, DevOps, Java, Kubernetes, AWS/Azure, DevOps/DevSecOps, Monitoring Tools (AppDynamics/Dynatrace/New Relic), Build and Release, Prometheus, Python, Node.js (Site Reliability Engineer). Education: BE/B.Tech/MCA/M.Tech/MSc./MSts

Posted 1 week ago

Apply

5.0 - 9.0 years

14 - 18 Lacs

Hyderabad, Bengaluru

Work from Office

Grade Level (for internal use): 11. About the Role: We are looking for a highly driven Senior Platform & Full Stack Engineer who brings passion, innovation, and deep technical experience to join our high-performing DevOps and SRE team. In this role, you'll help us define, build, and scale the next generation of cloud-native, cloud-agnostic CI/CD pipelines, Infrastructure as Code (IaC) reusable workflows, and AI-driven autonomous deployments. Key Responsibilities: Lead the design and implementation of reusable IaC workflows and standardized CI/CD blueprints across multiple teams. Architect and maintain cloud-agnostic deployment solutions with deep expertise in AWS and Kubernetes (EKS). Implement and optimize configuration as code practices using tools like Terraform and GitHub Actions. Partner with developers and SREs to define end-to-end infrastructure workflows covering compute, network, and storage automation. Contribute as a hands-on developer to internal tools, platforms, and APIs (Java, Go, or similar). Collaborate on cutting-edge initiatives such as Agentic AI workflows and autonomous chat-based deployments using MCP and LLM orchestration. Foster a culture of continuous innovation, high energy, and performance excellence. Required Skills & Experience: 10+ years of experience in DevOps, Platform Engineering, or Full Stack Development with platform ownership. Proven experience designing Infrastructure as Code using Terraform at scale. Solid programming skills; Java, Python, JavaScript, and Go preferred. Expertise in CI/CD pipeline design and orchestration using GitHub Actions (and optionally ArgoCD, GitLab, Jenkins, etc.). Strong knowledge of AWS services, with hands-on experience in EKS, IAM, networking (VPCs, Route53, ALBs), storage (EBS, S3), and compute. End-to-end understanding of modern cloud infrastructure, DevSecOps, observability, and release practices.
Ability to translate product/platform needs into reliable, secure, scalable infrastructure solutions. Excellent problem-solving skills and a mindset for performance, scalability, and resilience. Passion for innovation, high energy, and eagerness to experiment with emerging tech like LLMs and Agentic AI. Additional Skills: Experience with multi-cloud environments (Azure, GCP). Knowledge of Agentic AI systems, LLMs, or AI Ops use cases. Exposure to platform-as-product or internal developer platforms. Familiarity with Kubernetes Operators, Helm charts, and service mesh (Istio, Linkerd). Why Join Us: Be part of a forward-thinking DevOps and SRE team pushing the boundaries of platform automation. Work on AI-powered workflows and define how infrastructure can be deployed through intelligent assistants. Build developer-centric platforms that make a real impact on engineering productivity and product reliability. Enjoy a culture of innovation, energy, and excellence where your ideas will be heard and executed. What's In It For You: Our Purpose: Progress is not a self-starter. It requires a catalyst to be set in motion. Information, imagination, people, technology: the right combination can unlock possibility and change the world. Our world is in transition and getting more complex by the day. We push past expected observations and seek out new levels of understanding so that we can help companies, governments and individuals make an impact on tomorrow. At S&P Global we transform data into Essential Intelligence, pinpointing risks and opening possibilities. We Accelerate Progress. Our People: Our Values: Integrity, Discovery, Partnership. At S&P Global, we focus on Powering Global Markets. Throughout our history, the world's leading organizations have relied on us for the Essential Intelligence they need to make confident decisions about the road ahead.
We start with a foundation of integrity in all we do, bring a spirit of discovery to our work, and collaborate in close partnership with each other and our customers to achieve shared goals. Benefits: We take care of you, so you can take care of business. We care about our people. That's why we provide everything you and your career need to thrive at S&P Global. Health & Wellness: Health care coverage designed for the mind and body. Continuous Learning: Access a wealth of resources to grow your career and learn valuable new skills. Invest in Your Future: Secure your financial future through competitive pay, retirement planning, a continuing education program with a company-matched student loan contribution, and financial wellness programs. Family Friendly Perks: It's not just about you. S&P Global has perks for your partners and little ones, too, with some best-in-class benefits for families. Beyond the Basics: From retail discounts to referral incentive awards, small perks can make a big difference. For more information on benefits by country visit https://spgbenefits.com/benefit-summaries Global Hiring and Opportunity at S&P Global: At S&P Global, we are committed to fostering a connected and engaged workplace where all individuals have access to opportunities based on their skills, experience, and contributions. Our hiring practices emphasize fairness, transparency, and merit, ensuring that we attract and retain top talent. By valuing different perspectives and promoting a culture of respect and collaboration, we drive innovation and power global markets. ---- Equal Opportunity Employer: S&P Global is an equal opportunity employer and all qualified candidates will receive consideration for employment without regard to race/ethnicity, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, marital status, military veteran status, unemployment status, or any other status protected by law.
Only electronic job submissions will be considered for employment. If you need an accommodation during the application process due to a disability, please send an email to EEO.Compliance@spglobal.com and your request will be forwarded to the appropriate person. US Candidates Only The EEO is the Law Poster http://www.dol.gov/ofccp/regs/compliance/posters/pdf/eeopost.pdf describes discrimination protections under federal law. Pay Transparency Nondiscrimination Provision - https://www.dol.gov/sites/dolgov/files/ofccp/pdf/pay-transp_%20English_formattedESQA508c.pdf ---- IFTECH202.2 - Middle Professional Tier II (EEO Job Group)

Posted 1 week ago

Apply

6.0 - 11.0 years

25 - 40 Lacs

Bengaluru

Work from Office

Hi, Greetings from Thales India Pvt Ltd! We are hiring a Technical Lead - DevOps Engineer for our Engineering competency center in Bangalore. Experience: 8 to 12 years. Notice Period: Immediate to Max 30 Days. Location: Thales India Private Limited, Richmond Town, Bengaluru, Karnataka 560025. About Thales: Thales people architect identity management and data protection solutions at the heart of digital security. Businesses and governments rely on us to bring trust to the billions of digital interactions they have with people. Our technologies and services help banks exchange funds, people cross borders, energy become smarter and much more. More than 30,000 organizations already rely on us to verify the identities of people and things, grant access to digital services, analyze vast quantities of information and encrypt data to make the connected world more secure. Present in India since 1953, Thales is headquartered in Noida, Uttar Pradesh, and has operational offices and sites spread across Bengaluru, Delhi, Gurugram, Hyderabad, Mumbai, Pune among others. Over 1800 employees are working with Thales and its joint ventures in India. Since the beginning, Thales has been playing an essential role in India's growth story by sharing its technologies and expertise in Defense, Transport, Aerospace and Digital Identity and Security markets. Additional: Imperva, a Thales Company, is a cybersecurity leader. Together, we provide innovative platforms designed to reduce the complexity and risks of managing and protecting more applications, data, and identities than any other company can. Our solutions enable over 35,000 organizations to deliver trusted digital services to billions of consumers around the world every day. Job Summary: We're building a first-of-its-kind AI Firewall to protect applications using Large Language Models (LLMs).
As one of the first DevOps Engineers on the team, you'll build and maintain the CI/CD pipelines, observability stack, and deployment infrastructure for a cutting-edge AI Firewall. Your work ensures our services are secure, fast, and always available. Job Knowledge, Skill and Qualifications: BE, M.Sc. in Computer Science or equivalent. 8+ years of experience in DevOps, SRE, or Infrastructure Engineering. Proficient with Kubernetes, Docker, and cloud platforms (AWS/GCP/Azure). Experience in developing performance-oriented applications. Strong scripting skills (Bash, Python, or Groovy). Background in AI/ML and networking concepts such as TCP/UDP, HTTP, TLS, etc. Bonus: Experience with security tooling, API gateways, or LLM-related infrastructure.

Posted 1 week ago

Apply

7.0 - 11.0 years

35 - 50 Lacs

Bengaluru

Work from Office

About the Role: This role is responsible for managing and maintaining complex, distributed big data ecosystems. It ensures the reliability, scalability, and security of large-scale production infrastructure. Key responsibilities include automating processes, optimizing workflows, troubleshooting production issues, and driving system improvements across multiple business verticals. Roles and Responsibilities: Manage, maintain, and support incremental changes to Linux/Unix environments. Lead on-call rotations and incident responses, conducting root cause analysis and driving postmortem processes. Design and implement automation systems for managing big data infrastructure, including provisioning, scaling, upgrades, and patching clusters. Troubleshoot and resolve complex production issues while identifying root causes and implementing mitigating strategies. Design and review scalable and reliable system architectures. Collaborate with teams to optimize overall system/cluster performance. Enforce security standards across systems and infrastructure. Set technical direction, drive standardization, and operate independently. Ensure availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning. Resolve, analyze, and respond to system outages and disruptions and implement measures to prevent similar incidents from recurring. Develop tools and scripts to automate operational processes, reducing manual workload, increasing efficiency and improving system resilience. Monitor and optimize system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning. Collaborate with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle. Stay informed of industry technology trends and innovations, and actively contribute to the organization's technology communities. 
Develop and enforce SRE best practices and principles. Align across functional teams on priorities and deliverables. Drive automation to enhance operational efficiency. Adopt new technologies as and when the need arises and define architectural recommendations for new tech stacks. Preferred candidate profile: Over 6 years of experience managing and maintaining distributed big data ecosystems. Strong expertise in Linux including IP, iptables, and IPsec. Proficiency in scripting/programming with languages like Perl, Golang, or Python. Hands-on experience with the Hadoop stack (HDFS, HBase, Airflow, YARN, Ranger, Kafka, Pinot). Familiarity with open-source configuration management and deployment tools such as Puppet, Salt, Chef, or Ansible. Solid understanding of networking, open-source technologies, and related tools. Excellent communication and collaboration skills. DevOps tools: SaltStack, Ansible, Docker, Git. SRE logging and monitoring tools: ELK stack, Grafana, Prometheus, OpenTSDB, OpenTelemetry. Good to Have: Experience managing infrastructure on public cloud platforms (AWS, Azure, GCP). Experience in designing and reviewing system architectures for scalability and reliability. Experience with observability tools to visualize and alert on system performance. Experience in massive petabyte-scale data migrations and massive upgrades.

Posted 1 week ago

Apply

7.0 - 8.0 years

7 - 9 Lacs

Hyderabad, Telangana, India

On-site

You are passionate about driving an SRE/DevOps mindset and culture in a fast-paced, challenging environment where you get the opportunity to work with a spectrum of the latest tools and technologies to drive forward Automation, Observability and CI/CD automation. You are actively looking to improve implemented solutions, understand the efficacy of collaboration, and work with cross-functional teams to build and improve the CI/CD pipeline and improve automation (reduce Toil). As a member of this team, you possess the ability to inspire and leverage your experience to inject new knowledge and skills into an already high-performing team. Help identify areas of improvement, especially when it comes to Observability, Proactiveness, Automation & Toil Management. Strategic approach with clear objectives to improve System Availability, Performance Optimization, and improve Incident MT... Build and maintain Reliable Engineering Systems using SRE and DevSecOps models with special focus on Event Management (monitoring/alerts), Self Healing and Reliability testing. Strong programming skills with experience in API and Webhook development using Dynatrace, GitHub workflows, Ansible, CDK, TypeScript/JavaScript, Python, Node.js, Ruby, PowerShell, and Shell Scripting languages. Strong understanding of Cloud computing (AWS). Strong understanding of SDLC and DevSecOps. Experience in CI/CD pipeline tools such as JIRA, GitHub, Bitbucket, Artifactory, Ansible, or equivalent. Working knowledge of Lambda, Glue and CDK. Knowledge of cloud services: Application integration, functions, Cloud Databases, data warehouse and analytics, Machine Learning, Developer Tools, Security and identity management. Knowledge of software development practices, concepts, and technology obtained through formal training and/or work experience. Knowledge of required programming languages and can code with minimum guidance. Understand functional aspects and technical behavior of the

Posted 1 week ago

Apply

3.0 - 5.0 years

0 - 3 Lacs

Hyderabad, Telangana, India

On-site

Job description: The SRE function is a highly visible force multiplier with a growth mindset, going through a period of increased investment, where you can contribute to the delivery of a highly reliable banking solution. As part of an SRE squad, you will partner with engineering teams within Macquarie to help develop and drive the adoption of SRE best practices and tooling across the organisation. The role will require close engagement and collaboration with the whole engineering community. You will be involved in projects such as measuring, testing and improving our resilience (Chaos engineering), our capacity to deal with increasing load (Demand forecasting and capacity planning), our ability to make changes safely (Change management and System Design) and our Observability (Metrics, monitoring, and alerts). What you offer: Strong experience in software engineering and system design utilising Java, Golang or a similar language. Understand the benefits and correct use of SLOs, metrics, logs and traces. Cloud Native at heart, ready to build on the shoulders of giants. Excellent understanding of modern software development practices, tools and technologies. Strong DevOps fundamentals with preference for Java, Golang, Microservices and other cloud technologies. Experience in APM and Observability tools, such as New Relic, Datadog, Dynatrace, Grafana stack, etc.

Posted 1 week ago

Apply

8.0 - 12.0 years

0 Lacs

Karnataka

On-site

As a Site Reliability Engineering (SRE) Technical Leader on the Network Assurance Data Platform (NADP) team at Cisco ThousandEyes, you will be responsible for ensuring the reliability, scalability, and security of the cloud and big data platforms. Your role will involve representing the NADP SRE team, contributing to the technical roadmap, and collaborating with cross-functional teams to design, build, and maintain SaaS systems operating at multi-region scale. Your efforts will be crucial in supporting machine learning (ML) and AI initiatives by ensuring the platform infrastructure is robust, efficient, and aligned with operational excellence. You will be tasked with designing, building, and optimizing cloud and data infrastructure to guarantee high availability, reliability, and scalability of big-data and ML/AI systems. This will involve implementing SRE principles such as monitoring, alerting, error budgets, and fault analysis. Additionally, you will collaborate with various teams to create secure and scalable solutions, troubleshoot technical problems, lead the architectural vision, and shape the technical strategy and roadmap. Your role will also encompass mentoring and guiding teams, fostering a culture of engineering and operational excellence, engaging with customers and stakeholders to understand use cases and feedback, and utilizing your strong programming skills to integrate software and systems engineering. Furthermore, you will develop strategic roadmaps, processes, plans, and infrastructure to efficiently deploy new software components at an enterprise scale while enforcing engineering best practices. To be successful in this role, you should have relevant experience (8-12 yrs) and a bachelor's engineering degree in computer science or its equivalent. 
You should possess the ability to design and implement scalable solutions, hands-on experience in Cloud (preferably AWS), Infrastructure as Code skills, experience with observability tools, proficiency in programming languages such as Python or Go, and a good understanding of Unix/Linux systems and client-server protocols. Experience in building Cloud, Big data, and/or ML/AI infrastructure is essential, along with a sense of ownership and accountability in architecting software and infrastructure at scale. Additional qualifications that would be advantageous include experience with the Hadoop Ecosystem, certifications in cloud and security domains, and experience in building/managing a cloud-based data platform. Cisco encourages individuals from diverse backgrounds to apply, as the company values perspectives and skills that emerge from employees with varied experiences. Cisco believes in unlocking potential and creating diverse teams that are better equipped to solve problems, innovate, and make a positive impact.

Posted 1 week ago

Apply

18.0 - 22.0 years

0 Lacs

Hyderabad, Telangana

On-site

As a FinEx Service Resilience Head at HSBC, you will play a crucial role in ensuring the stability and resiliency of the production estate consisting of approximately 400 applications and services. Reporting to the CIO for Finance, Regulatory Reporting, and Cross Functions Technology, you will be responsible for maintaining effective governance and control across the FinEX Production estate. This will involve collaborating with Value Stream-aligned DevSecOps teams and the Enterprise Technology Service Management community to ensure uninterrupted business processes for users across various functions. Your responsibilities will include managing a small central team of Subject Matter Experts in the Service Management, Control, and Infrastructure domains, driving transformation in the DevSecOps teams towards automated solutions and continuous improvement. You will also be expected to maintain a diverse network of stakeholders across Global Finance, Global Risk, Procurement leadership, regional technology leads, key vendors, and various internal HSBC teams. In this role, you will focus on ensuring production stability and resiliency by implementing and reviewing governance and control processes, managing core teams, delivering high-quality production and control metrics, driving convergence of working practices, and actively participating in Communities of Practices to identify best practices in Service Management and Control domains. Additionally, you will be involved in reducing resolution time and service disruption, escalating major incidents appropriately, and ensuring the continual review of key performance indicators and objectives. Your success in this role will require 18+ years of experience as a senior technologist, particularly in providing production service management and control operation for a large, globally distributed technology estate. 
You should have a track record of DevOps and agile adoption, experience in managing technology vendors, and the ability to influence senior stakeholders effectively. Strong communication skills, attention to detail, and a passion for service management and control will be essential for driving initiatives that create a diverse and inclusive culture within the India team. Join HSBC and be part of a dynamic environment where you can drive culture change, engage with cross-cultural teams, and contribute to the continuous improvement of service performance. Learn more about this exciting opportunity at www.hsbc.com/careers. Please note that personal data related to your job application will be handled in accordance with HSBC's Privacy Statement, available on their website.

Posted 1 week ago

Apply

8.0 - 15.0 years

0 Lacs

Thane, Maharashtra

On-site

As a Senior Lead Site Reliability Engineer (SRE) in Thane / Lower Parel, you will be an integral part of the SRE practice ensuring the reliability, availability, scalability, and efficiency of systems. Your role will involve designing, implementing, and governing systems for reliability, availability, and scalability. You will be responsible for developing incident response processes, collaborating with development teams, and ensuring applications are designed with reliability in mind. Additionally, you will support continuous system improvements, manage platform tools, and ensure minimal downtime.
Key Responsibilities
- Design, implement, and govern systems for reliability, availability, and scalability.
- Develop incident response processes for quick resolution.
- Collaborate with development teams to ensure applications are designed with reliability in mind.
- Perform competitive analysis to maintain minimal downtime.
- Support IT architecture alignment, integration design, and design reviews.
- Manage platform tools for high availability and disaster recovery across on-premises and cloud infrastructure.
- Mentor team members and collaborate with vendor IT teams.
Key Result Areas
- Assist the SRE team with deployment and technical functions for DAST and IAST to detect vulnerabilities at runtime.
- Maintain and monitor application environments, including patching and upgrades.
- Ensure application availability, scalability, and fault tolerance.
- Contribute to solution architecture and identify trade-offs in cost, performance, and scalability.
- Collaborate on deployment plans and maintain adherence to security and enterprise standards.
- Lead a team of SREs, ensuring system performance meets business and technical requirements.
Qualifications & Experience
- Education: BE / B.Tech. / ME / M.Tech. / MCA
- 8-15 years of relevant experience in SRE, IT infrastructure management, and application lifecycle best practices.
- Strong knowledge of cloud infrastructure, disaster recovery, and IT security frameworks.
- Excellent collaboration skills with both internal and vendor teams.
- Leadership experience is a plus.

Posted 1 week ago

Apply

Featured Companies