Senior Site Reliability Engineer - Storage

6 - 11 years

0 - 2 Lacs

Posted: 2 weeks ago | Platform: Naukri

Work Mode: Remote

Job Type: Full Time

Job Description

Role & responsibilities

  • Deploy, manage, and optimize storage solutions using ZFS and iSCSI across global data centers.
  • Implement and maintain automation and monitoring tools such as Puppet, Grafana, Zabbix, and Jenkins to enhance system performance and reliability.
  • Utilize storcli for managing server storage configurations.
  • Linux Systems Expertise:
      • Manage and maintain Ubuntu-based systems, ensuring security and compliance.
      • Conduct performance tuning and capacity planning for Linux servers.
      • Develop and implement self-healing systems and automated recovery processes on Linux platforms.
  • Reliability Engineering:
      • Develop and implement strategies for improving system availability and performance.
      • Conduct root-cause analysis and incident response for storage-related issues.
      • Collaborate with SDEs to support software development infrastructure and deploy new product features.

Preferred candidate profile

  • Proven experience in site reliability engineering, with a focus on storage solutions and Linux systems.
  • Strong knowledge of ZFS, iSCSI, and Ubuntu.
  • Expertise in automation and configuration management tools (e.g., Bash, Ansible, Puppet).
  • Familiarity with HashiCorp tools, SSH, and LDAP.
  • Experience with storcli for storage configuration.
  • Experience with monitoring tools such as Grafana, Zabbix, and InfluxDB.
  • Ability to conduct root-cause analysis and implement effective solutions.
  • High level of ownership for the assigned team problem space, including driving predictable delivery, continuous iteration and improvement, consistent and effective communication of team and project status, and graceful coordination with upstream and downstream stakeholders.
  • Project management skills, including experience with task estimation, scheduling, Gantt charts, unblocking dependencies, and Agile methodologies (such as sprint planning or Scrum), along with attention to detail and the ability to keep projects on track. Ability to define broad, complex problems and break them into discrete, specific tasks that can be delegated.
  • Documentation skills, including writing standard operating procedures, design docs, policy documents, and runbooks.

Kyndryl

Information Technology Services

New York
