8.0 - 12.0 years
0 Lacs
chennai, tamil nadu
On-site
We are seeking a skilled Observability & Site Reliability Engineer to join the team in supporting large-scale, enterprise-grade infrastructure. The ideal candidate will have extensive experience with observability tools such as Grafana, Loki, Mimir, and Kubernetes metrics/logs, and a strong passion for performance, scalability, and system uptime. Candidates must have 8 to 12 years of experience and be able to join immediately or within a 30-day notice period.
Key Must-Have Skills:
- 5+ years of experience in Observability Engineering.
- Expertise in Grafana, Loki, Mimir, and the Alloy agent.
- Strong understanding of infrastructure metrics for GPU, CPU, and Kubernetes.
- Proficiency in scripting languages such as Python, Go, and Bash.
- Prior exposure to tools like Prometheus, ELK, Docker, and Terraform.
- Flexibility to collaborate with Korean stakeholders and work within the Korean time zone.
Role Highlights:
- Design and manage the observability stack across large-scale data center infrastructure.
- Build scalable telemetry systems, dashboards, alerts, and reports.
- Apply Site Reliability Engineering (SRE) best practices to ensure system reliability and performance.
- Troubleshoot real-time issues and contribute to ongoing system optimization.
Good To Have:
- Previous experience working with Korean stakeholders.
- Familiarity with cloud platforms like AWS, GCP, or Azure.
Posted 1 month ago
5.0 - 8.0 years
10 - 15 Lacs
Mumbai
Work from Office
We are seeking an experienced Observability Engineer to join our team. The ideal candidate will have over 5 years of experience in designing, automating, maintaining, and optimizing observability platforms, including logging, metrics, and tracing. You should possess an expert-level understanding of the observability landscape and a strong background in both open-source and closed-source monitoring and logging technologies.
Posted 3 months ago
3.0 - 7.0 years
15 - 20 Lacs
Pune
Work from Office
What You'll Do
- Configure and manage observability agents across AWS, Azure, and GCP
- Use IaC techniques and tools such as Terraform, Helm, and GitOps to automate deployment of the observability stack
- Work with different language stacks such as Java, Ruby, Python, and Go
- Instrument services using OpenTelemetry and integrate telemetry pipelines
- Optimize telemetry metrics storage using time-series databases such as Mimir and NoSQL databases
- Create dashboards, set up alerts, and track SLIs/SLOs
- Enable RCA and incident response using observability data
- Secure the observability pipeline
You Bring
- BE/BTech/MTech (CS/IT or MCA), with an emphasis on Software Engineering
- Strong skills in reading and interpreting logs, metrics, and traces
- Proficiency with LGTM (Loki, Grafana, Tempo, Mimir) or a similar stack, Jaeger, Datadog, Zipkin, InfluxDB, etc.
- Familiarity with log frameworks such as log4j, lograge, Zerolog, loguru, etc.
- Knowledge of OpenTelemetry, IaC, and security best practices
- Clear documentation of observability processes, logging standards, and instrumentation guidelines
- Ability to proactively identify, debug, and resolve issues using observability data
- Focus on maintaining data quality and integrity across the observability pipeline
Posted 3 months ago