Site Reliability Engineer

6 - 8 years

6 - 8 Lacs

Posted: 16 hours ago | Platform: Foundit

Work Mode

On-site

Job Type

Full Time

Job Description

At Constant Contact, we're looking for individuals who are well-rounded in several aspects of Technical Operations. You will take on the role of responder to operational alerts and monitoring within Constant Contact. This role requires you to work with both developers and operations personnel to address and resolve issues and requests.

We are looking for a highly skilled and motivated Site Reliability Engineer to join our team. The successful candidate will be responsible for maintaining the reliability and uptime of critical services, with a focus on CentOS servers, Java application support, incident management, change management and Kubernetes administration.

The ideal candidate will possess strong skills in ArgoCD for Kubernetes management and in Linux, basic scripting knowledge, and familiarity with modern monitoring, alerting and automation tools. We are looking for someone who is self-motivated, possesses excellent communication skills (both oral and written) and is able to work both independently and collaboratively.

What you'll do:

  • Perform routine system and application maintenance tasks, following SOPs to correct and prevent issues.
  • Monitor production systems, applications and overall performance.
  • Support observability, the process that prepares the software team for uncertainties when the software goes live for end users.
  • Use tools to detect abnormal behavior in the software and, more importantly, collect information that helps developers understand what causes the problem.
  • Conduct security checks
  • Run meetings with our business partners, following established processes and procedures.
  • Write, update and maintain policy and procedure documents.
  • Write scripts or code as necessary to develop tools and/or services in order to support the product
  • Learn from post-mortems and prevent new incidents from occurring.
  • Perform admin work on various tools and applications such as JIRA and New Relic.
  • Maintain service-level objectives (SLOs): specific, quantifiable goals for keeping our Golden Metrics within their set parameters.

Who you are:

  • 3-5+ years of experience working in a SaaS and Cloud environment.
  • Administer Kubernetes clusters, including management of applications using ArgoCD.
  • Monitor, maintain, and manage applications on CentOS servers, ensuring high availability and performance.
  • Respond to and manage active incidents, including running post-mortem meetings, performing root cause analysis and ensuring timely resolution.
  • Use basic Linux scripting to automate routine tasks and improve operational efficiency.
  • Knowledge of project management tools like JIRA and Confluence
  • Knowledge of database systems like MySQL and DB2
  • Understand and drive incidents using Incident Management processes and procedures
  • Execute change management procedures, run change management meetings and enforce safe and compliant changes to production environments.
  • Experience as a Linux (CentOS / RHEL) administrator
  • Deep knowledge of on-call responsibilities and awareness of time management, including maintaining on-call management tools such as xMatters.
  • Experience with managing deployments using Jenkins
  • Experience working with a suite of monitoring tools including New Relic, Splunk and Nagios
  • Experience with log aggregation tools like Splunk, Loki or Grafana
  • You must be comfortable troubleshooting and debugging web applications across the entire stack (i.e. the application layer, the database layer, the OS).
  • Production MySQL experience: replication, performance tuning, query optimization.
  • You should have familiarity with Ansible or other configuration management tools like Puppet.

Aeries Technology

Technology

Tech City
