Observability Engineer

10 years

Posted: 2 days ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

The Internet of Things (IoT) will unlock trillions of dollars in value over the next 10 years as 50 billion devices are brought online. Aeris is at the forefront of this industry, building networks and applications that enable Fortune 500 clients like Chrysler, Honda and Bosch to fundamentally improve their businesses. We are headquartered in Silicon Valley, with offices in Bucharest, Chicago, London, Delhi, Bangalore, Helsinki, and Tokyo, as well as other markets. We rank among the top ten cellular providers for the IoT globally, powering critical projects across energy, transportation, retail, healthcare and more.

Built from the ground up for IoT and road-tested at scale, Aeris IoT Services are based on the broadest technology stack in the industry, spanning connectivity up to vertical solutions. As veterans of the industry, we know that implementing an IoT solution can be complex, and we pride ourselves on making it simpler. Our company is in an enviable spot. We’re profitable, and both our bottom line and our global reach are growing rapidly. We’re playing in an exploding market where technology evolves daily and new IoT solutions and platforms are being created at a fast pace.

A Few Things To Know About Us

  • We put our customers first. When making decisions, we always seek to do what is right for our customer first, our company second, our teams third, and individual selves last.
  • We do things differently. As a pioneer in a highly competitive industry that is poised to reshape every sector of the global economy, we cannot fall back on old models. Rather, we must chart our own path and strive to out-innovate, out-learn, out-maneuver and out-pace the competition on the way.
  • We walk the walk on diversity. We’re a brilliant and eclectic mix of ethnicities, religions, industry experiences, sexual orientations, generations and more – and that’s by design. We see diverse perspectives as a core competitive advantage.
  • Integrity is essential. We believe in doing things well – and doing them right. Integrity is a core value here: you’ll see it embodied in our staff, our management approach and growing social impact work (we have a VP devoted to it). You’ll also see it embodied in the way we manage people and our HR issues: we expect employees and managers to deal with issues directly, immediately and with the utmost respect for each other and for the Company.
  • We are owners. Strong managers enable and empower their teams to figure out how to solve problems. You will be no exception, and will have the ownership, accountability and autonomy needed to be truly creative.
Aeris is looking for an experienced and visionary Observability Engineer to join our Infrastructure and Operations team. In this role, you will be responsible for designing and implementing robust observability solutions that provide deep insights into our systems, applications, and network infrastructure.

Key Responsibilities

  • Observability Systems Management:
    • Design, deploy, and maintain observability tools and platforms, including monitoring, logging, and tracing systems.
    • Ensure optimal configuration and performance of observability tools such as Prometheus, Loki, Grafana, the ELK stack (Elasticsearch, Logstash, Kibana), Jaeger, and cloud (AWS/GCP/Azure) observability tools.
  • Monitoring and Alerting:
    • Develop and manage dashboards using Kibana/Grafana and set up alerts with ElastAlert and Prometheus Alertmanager to monitor the health and performance of applications and infrastructure.
    • Implement robust alerting mechanisms to detect and notify of anomalies, outages, and system performance issues in real time.
  • Logging and Tracing:
    • Implement centralized logging solutions to aggregate logs from various systems and applications.
    • Develop and maintain distributed tracing solutions to provide end-to-end visibility into system transactions.
  • Performance Analysis and Optimization:
    • Analyze system performance metrics to identify bottlenecks and performance degradation, drawing on an understanding of SLOs and SLIs.
    • Work with development and operations teams to remediate performance issues and optimize system performance.
  • Automation and Scripting:
    • Create automation scripts to streamline observability tasks and processes.
    • Develop self-healing mechanisms through automated incident response.
  • Collaboration and Communication:
    • Work closely with development, operations, and SRE teams to align observability solutions with business and technical requirements.
    • Provide guidance and training on observability tools and best practices to other team members.
  • Documentation and Reporting:
    • Create and maintain detailed documentation for observability systems, processes, and procedures.
    • Generate periodic reports and dashboards to provide insights into system performance and reliability.

Qualifications And Experience

  • Education: Bachelor's degree in Computer Science, Information Technology, or a related field. Advanced degree preferred.
  • Experience:
    • Minimum of 4 years of experience in IT infrastructure, with at least 3 years in an observability or monitoring role.
    • Proven experience in observability engineering, including deploying and managing observability solutions.
    • Experience with monitoring tools (e.g., Prometheus, Grafana), logging tools (e.g., ELK stack), and tracing tools (e.g., Jaeger, OpenTelemetry).
    • Experience with cloud platforms such as AWS, Azure, or Google Cloud, and with databases such as MySQL.
  • Technical Skills:
    • Strong understanding of observability concepts including metrics, logging, and tracing.
    • Proficiency in scripting languages such as Bash, Python, Perl or Go.
    • Familiarity with containerization (e.g., Docker), orchestration tools (e.g., Kubernetes), and CI/CD pipelines.
    • Understanding of IP networking and monitoring of network devices (e.g., routers, firewalls).
    • Experience with infrastructure as code tools (e.g., Terraform, Ansible).
  • Soft Skills:
    • Excellent problem-solving and analytical skills.
    • Strong communication and collaboration skills.
    • Ability to work independently and in a team-oriented environment.
  • Preferred Qualifications:
    • Experience with APM tools like New Relic, Datadog, or Dynatrace.
    • Knowledge of service mesh technologies (e.g., Istio).
    • Open-source contributions or relevant certifications in observability tools and methodologies.

What is in it for you?

  • You get to build the next leading-edge connected vehicle and Internet of Things platforms
  • The ability to collaborate with our highly skilled groups who work with cutting-edge technologies
  • High visibility as you support the systems that drive our public-facing services
  • Career growth opportunities
