System Reliability Engineer (Kubernetes)

0 years

0 Lacs

Posted: 3 months ago | Platform: Indeed

Work Mode

On-site

Job Description

Who we are:


Fulcrum Digital is an agile, next-generation digital acceleration company providing digital transformation and technology services from ideation to implementation. These services apply across a variety of industries, including banking and financial services, insurance, retail, higher education, food, health care, and manufacturing.


The Role:


  • Define strategies for application performance monitoring and optimization in the production environment.

  • Respond to incidents, improve the platform based on feedback, and measure the reduction of incidents over time.

  • Ensure that batch production scheduling and processing are accurate and timely.

  • Create and execute queries against big data platforms and relational tables to identify process issues or perform mass updates (preferred).

  • Handle ad hoc requests from users, such as data research, file manipulation and transfer, and investigation of process issues.

  • Take a holistic approach to problem solving by connecting the dots across the technology stacks that make up the platform during a production event, to optimize mean time to recover.

  • Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.

  • Analyze ITSM activities of the platform and provide a feedback loop to development teams on operational gaps or resiliency concerns.

  • Support services before they go live through activities such as system design consulting, capacity planning and launch reviews.

  • Support the application CI/CD pipeline for promoting software into higher environments through validation and operational gating, and lead in DevOps automation and best practices.

  • Maintain services once they are live by measuring and monitoring availability, latency and overall system health.

  • Scale systems sustainably through mechanisms like automation and evolve systems by pushing for changes that improve reliability and velocity.

  • Work with a global team spread across tech hubs in multiple geographies and time zones.

  • Share knowledge and explain processes and procedures to others.


Requirements

Skills:


Must Have:


  • Linux

  • Kubernetes

  • ITIL / ITSM

  • Application Troubleshooting

  • Any monitoring tool (Splunk or Dynatrace preferred)

  • Jenkins - CI/CD


Good To Have:


  • Event Framework architecture

  • Git (basic) / Bitbucket

  • Ansible/Chef - Basic

  • Shell Scripting - Basic

  • SQL

  • Groovy Scripting/YAML


Benefits

    Job Opening ID: RRF_5299

    Job Type: Permanent

    Industry: IT Services

    Date Opened: 09/05/2025

    City: Pune

    Province: Maharashtra

    Country: India

    Postal Code: 411057
