0 - 4 years

0 Lacs

Posted: 2 days ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

You will be working for our client, Masai, in a dynamic and fast-paced environment. As a qualified candidate, you should possess an AWS Certification and have a strong foundation in networking and cloud computing. We are looking for candidates who are set to graduate in 2024 or 2025.

Your primary responsibilities will include monitoring the availability and performance of production environments, with a focus on improving the efficiency and reliability of SaaS services. You will optimize capacity planning for the cloud infrastructure, which is hosted entirely on AWS. During incidents, you will manage the emergency response and ensure prompt, high-quality mitigation of issues. Root cause analysis of incidents and execution of preventive actions are essential tasks, as is collaborating with the DevOps, InfoSec, and Engineering teams to improve the performance, reliability, and operability of various applications and services. You will work extensively with tools such as New Relic, Grafana, Loggly, PagerDuty, Site24x7, FreshService, Kibana, and Akamai, and with AWS services such as RDS, ESS, ECS, EC2, VPCs, Redis, and Lambda, to enhance observability and monitoring. Addressing customer concerns about infrastructure availability, performance, and security will also be part of your role.

To qualify for this position, you should hold a degree in Computer Science or a related field. A solid understanding of observability, AWS cloud services (EC2, RDS, Elasticsearch, Redis, SQS, API Gateway, Lambda, etc.), and monitoring tools is necessary. You should be able to monitor a multi-tenant SaaS environment, including web applications, database services, APIs, and backend jobs. Previous experience in handling live production incidents, debugging and troubleshooting application and infrastructure issues, following SRE best practices, and analyzing performance metrics is highly valued.

Strong documentation and interpersonal communication skills are essential, along with a proactive approach to problem identification, performance improvement, and bottleneck resolution. The ability to work effectively with DevOps and engineering teams in a diverse, team-focused environment, and to thrive amid rapid change, is crucial. Programming skills in languages like Python and an AWS or ITIL certification would be advantageous.

Key Skills: Redis, SaaS environment, networking, AWS, incident management, problem identification, reliability, EC2, engineering, cloud computing, programming skills, documentation, ITIL certification, root cause analysis, monitoring, cloud, interpersonal communication, InfoSec, troubleshooting, observability, monitoring tools, SaaS, capacity planning, communication skills, performance metrics analysis, AWS certification, debugging, availability, Python, DevOps, infrastructure.


