We are looking for an SRE Expert (Architect) for our Pune location. Please refer to the details below:
Experience range: 15 to 19 years
Location: Pune
What does a successful Site Reliability Engineer (SRE) Expert do at Fiserv?
The Site Reliability Engineer blends the principles of software engineering with the discipline of operations to create high-performing and reliable software systems. They are tasked with designing and implementing tools, processes, and systems to improve the reliability, scalability, and performance of large-scale applications and services.
What will you do:
- Automation and toil reduction: Create sustainable systems and services through automation. Automate mundane operational jobs, health checks, releases, and deployments. Measure and optimize system performance and innovate for continuous improvement.
- Observability: Run the production environment by monitoring availability and taking a holistic view of system health. Use monitoring systems for alerting and dashboards.
- Process reengineering: Map business processes and customer journeys to find reliability gaps. Gather and analyze metrics from operating systems and applications to assist in performance tuning and fault finding.
- Development Operations partnership: Participate in system design consulting, platform management, and capacity planning.
- Documentation: Drive operations teams on documentation such as SOPs, configuration and infrastructure maps, knowledge articles, known-error resolutions, etc.
- Chaos engineering and testing: Design chaos engineering plans and test all application components and infrastructure. Document the plans to address the gaps.
- KPIs and error budget: Measure availability and downtime along with error budgets and develop strategies to maximize availability.
What you will need to have:
- Bachelor’s degree in computer science or related technical field and/or 7+ years of relevant work experience
- 14+ years of relevant work experience in Site Reliability Engineering (SRE) in a Fintech / product organization.
- 10+ years of experience in automation of toil, working with Python or Java, Ansible, PowerShell, etc.
- 10+ years of experience in observability and monitoring tools, working with Dynatrace, Splunk, Moogsoft, Grafana, etc.
- Experience in managing CI/CD pipelines and automation (GitLab, Harness, Nexus, Terraform, SonarQube, etc.).
- Experience in SDLC, including associated deployment methodologies, onboarding, QA processes, performance tuning efforts, and source code management with GitLab/GitHub.
- Strong problem-solving skills and critical thinking to analyze root causes, implement solutions, and proactively prevent future disruptions. Effective communication is also key for SREs to collaborate with cross-functional teams, share knowledge, and address incidents promptly.
- Experience interacting with customers to analyze, validate, specify, verify, document and manage solution requirements.