Posted: 4 weeks ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

Observability Engineer

Experience: 8 to 11 years

Notice Period: 30 days

INTRODUCTION:

Effectv, the advertising sales division of Comcast Cable, helps local, regional, and national advertisers use the best of digital with the power of TV to grow their business. It provides multiscreen marketing solutions to make advertising campaigns more effective and easier to execute. Effectv has a presence in 66 markets with nearly 35 million owned and represented subscribers. We’re dedicated to helping our clients meet their business goals by connecting them with their customers through multiscreen television advertising. Working with companies from local startups to nationwide corporations, we provide support to help each business reach its target customers. By applying data to television advertising in new ways, we’re able to bring our clients the best of digital media, coupled with the power of TV.

To learn more, check out www.effectv.com.

ABOUT THE ROLE:

The Observability team builds the platform that all Effectv engineers use to investigate the health of their services at scale.

The team’s scope includes:

  • Logging
  • Distributed Tracing
  • Error Reporting
  • Profiling

What you’ll do:

You will support a team of 5-10 engineers focused on building the next generation of logging & tracing systems at Effectv, as part of the broader Observability team. In addition to helping define the roadmap for the next 3-5 years, you will be interacting with many other managers and their teams at Effectv, who rely on the Observability platform to deliver stable and scalable services to our customers.

RESPONSIBILITIES:

  • Develop and implement observability solutions to gain insights into application and infrastructure performance, availability, and reliability.
  • Collaborate with development, operations, and other teams to instrument applications and services for metrics, logs, traces, and other relevant data.
  • Design and implement monitoring solutions using industry-standard tools and practices to detect, analyze, and mitigate incidents and anomalies.
  • Create and manage dashboards, alerts, and visualization tools to provide real-time visibility into system behavior and performance.
  • Perform in-depth analysis of system behavior and trends to identify areas for improvement, optimization, and increased efficiency.
  • Troubleshoot complex issues by analyzing data from various sources to quickly diagnose and resolve incidents, minimizing downtime.
  • Continuously evaluate and recommend improvements to observability processes, tools, and practices to align with industry best practices.
  • Contribute to the development of automation scripts and tools to enhance observability and incident response.
  • Collaborate with development teams to improve application design for better observability, including implementing distributed tracing and structured logging.
  • Stay updated with emerging trends, technologies, and methodologies in observability, monitoring, and performance analysis.

QUALIFICATIONS:

  • 8+ years of experience as a Software Development Engineer, Site Reliability Engineer (SRE), or in a similar role.
  • Experience defining SLIs/SLOs.
  • Strong proficiency in implementing and maintaining observability tools, such as the Elastic (ELK) Stack (Elasticsearch, Logstash, Kibana) and APM (Application Performance Monitoring) tools.
  • Experience with OpenTelemetry.
  • Experience with data transformations and standardization.
  • Solid experience with instrumentation practices, including metrics, logging, and distributed tracing.
  • Hands-on experience with cloud platforms (AWS, GCP) and containerization technologies (Docker, Kubernetes).
  • Proficiency in scripting and automation using languages such as Python, Bash, or PowerShell.
  • Excellent problem-solving skills with the ability to analyze complex issues and provide efficient solutions.
  • Strong communication skills and the ability to collaborate effectively across teams.
  • Understanding of Agile and DevOps principles and their application in observability and monitoring contexts.

Preferred:

  • Relevant certifications in observability tools and practices (e.g., Prometheus Certified Associate).
  • Knowledge of or experience in cloud security and cloud migrations.

Live Connections

Telecommunications

Tech City

Hyderabad, Telangana, India