Senior Developer - ELK Stack & DevOps / SRE Specialist

5 - 10 years

17 - 22 Lacs

Posted: 2 weeks ago | Platform: Naukri

Work Mode

Work from Office

Job Type

Full Time

Job Description

Job Summary

Synechron is seeking a highly skilled Senior Developer specializing in ELK Stack & DevOps / SRE (Site Reliability Engineering) to join our dynamic issue management team. In this pivotal role, you will leverage your expertise in SRE, DevOps practices, and monitoring solutions to ensure the stability, performance, and operational readiness of our applications and infrastructure. Your contributions will directly support our business objectives by enhancing system reliability, streamlining incident management, and fostering continuous improvement across technical domains.

Software Requirements

Required Skills:

  • Proven proficiency with ELK Stack (Elasticsearch, Logstash, Kibana), version 7.x or higher, with hands-on experience in building dashboards and analytics
  • Experience with CI/CD tools such as Jenkins, Ansible, or equivalent automation platforms
  • Programming/scripting proficiency in Python and Bash
  • Familiarity with monitoring and logging tools (ELK Stack essential, Splunk preferred)
  • Cloud platform experience (AWS, Azure), including practical knowledge of cloud services and deployment strategies

Preferred Skills:

  • Experience with React, Node.js, and Java application logging and monitoring strategies
  • Familiarity with additional DevOps tools and methodologies
  • Knowledge of containerization and orchestration (e.g., Docker, Kubernetes)
  • Experience with Infrastructure as Code (IaC) tools

Overall Responsibilities

  • Collaborate with the issue management team to efficiently track, analyze, and resolve incidents and Operational Readiness Evaluations (OREs), ensuring minimal disruption and swift recovery.
  • Develop, implement, and optimize monitoring and logging solutions utilizing ELK Stack, creating actionable dashboards and performance metrics.
  • Design and enforce effective logging strategies for applications built with React, Node.js, and Java to facilitate troubleshooting and performance analysis.
  • Lead continuous improvement initiatives aimed at enhancing system reliability, performance, and operational efficiency.
  • Work cross-functionally with development, infrastructure, and security teams to diagnose and address performance bottlenecks and reliability challenges.
  • Document incident processes, resolution procedures, and best practices to promote knowledge sharing and team growth.

Technical Skills (By Category)

  • Programming Languages & Scripts (Essential): Python, Bash, or equivalent scripting languages
  • Monitoring & Logging Tools (Essential): ELK Stack (Elasticsearch, Logstash, Kibana), version 7.x or higher; Splunk (preferred)
  • Cloud Technologies (Essential): AWS or Azure services such as EC2, S3, CloudWatch, or equivalent
  • Frameworks & Application Technologies: Experience in monitoring React, Node.js, and Java applications, including implementation of logging and performance metrics
  • Development & Automation Tools: Setup, maintenance, and optimization of CI/CD pipelines (Jenkins, Ansible); knowledge of containerization (Docker, Kubernetes) preferred
  • Security & Protocols (if applicable): Basic understanding of best practices in security for monitoring and logging

Experience Requirements

  • Minimum of 8 years of professional experience in DevOps, Site Reliability Engineering, or related fields
  • Demonstrated success in developing and maintaining comprehensive monitoring and logging solutions, particularly using ELK Stack
  • Proven experience implementing and refining logging strategies across diverse application stacks (React, Node.js, Java)
  • Hands-on experience working within cloud environments such as AWS or Azure
  • Experience working in large-scale, distributed systems and incident management processes

Day-to-Day Activities

  • Proactively monitor system health and incident alerts, collaborating with the issue management team for swift resolution
  • Design, configure, and enhance ELK Stack dashboards, visualizations, and analytics for operational insights
  • Implement and refine logging strategies for web and backend applications to facilitate effective troubleshooting
  • Participate in continuous improvement projects to boost application and infrastructure performance
  • Engage in cross-team meetings to discuss incident trends, system bottlenecks, and reliability enhancements
  • Document procedures, lessons learned, and best practices for ongoing knowledge sharing

Qualifications

  • Bachelor’s degree in Computer Science, Information Technology, or a related discipline; equivalent professional experience is also considered
  • Relevant certifications such as AWS Certified DevOps Engineer, Certified Kubernetes Administrator, or equivalent are preferred
  • Ongoing professional development in DevOps, SRE practices, and monitoring technologies

Professional Competencies

  • Strong analytical and problem-solving skills with a focus on system reliability and performance
  • Effective communicator capable of conveying technical information clearly to diverse audiences
  • Team-oriented collaborator with experience working across cross-functional groups
  • Adaptable learner, eager to stay current with emerging technologies and best practices
  • Demonstrates a proactive approach to incident management and continuous improvement
  • Ability to manage multiple priorities efficiently while maintaining attention to detail

Synechron

Information Technology and Services

New York

1000+ Employees

330 Jobs

Key People

  • Faisal Husain, Co-Founder & CEO
  • Maqbool Kazi, Managing Director
