Salary Package - 4-6 LPA
Work location - Trivandrum
Experience Required - 0.5-1.5 years
Work from office
Technical Support Engineer
- Provide technical support and troubleshoot issues related to cloud platforms and services such as Fargate, ECS, DynamoDB, BigQuery, and SNS.
- Understand problems by consuming logs and metrics from various sources, using services such as CloudWatch, Prometheus, Grafana, Loki, Alert Manager, and Splunk.
- Analyze and resolve networking challenges, including load balancers, API gateways, reverse proxies, ingress controllers, and service-to-service communications.
- Work on issues related to client-server communications, firewalls, and virtual machines.
- Collaborate with DevOps teams to manage and troubleshoot toolchains such as Docker, Kubernetes, Jenkins, and ingress controllers.
- Act as the first point of contact for technical queries and escalate issues when necessary.
- Liaise with development and operations teams to identify root causes and resolve incidents effectively.
- Document troubleshooting steps and solutions, and maintain a knowledge base for recurring issues.
- Collaborate with cross-functional teams to implement best practices for monitoring and incident response.
- Participate in shift handovers and provide timely updates on ongoing issues.
Preferred Skills
- Hands-on knowledge of Fargate and ECS for managing and troubleshooting containerized workloads.
- Proficiency with DynamoDB and BigQuery for analyzing data and making decisions based on the analysis.
- Hands-on knowledge of SNS for debugging message delivery issues and integration workflows.
Monitoring and Logging Tools
- Proficiency in CloudWatch Logs, Loki, and Splunk for consuming and analyzing logs to identify and resolve issues.
- Hands-on knowledge of Prometheus and Grafana for analyzing metrics using dashboards and monitoring system health.
- Knowledge of Alert Manager for configuring and managing alert escalation.
- Ability to interpret metrics from various sources and create actionable insights.
Networking and Security
- Understanding of load balancers (e.g., ALB, NLB) for distributing traffic and troubleshooting connectivity issues.
- Knowledge of API gateways such as AWS API Gateway or NGINX for managing API traffic.
- Knowledge of reverse proxies and ingress controllers (e.g., NGINX Ingress, Traefik) for managing internal/external traffic.
- Understanding of service-to-service communications, including DNS, HTTP/HTTPS, and gRPC protocols.
- Hands-on knowledge of firewalls, security groups, and IAM roles for secure communications.
- Troubleshooting skills for VM-related issues on platforms like AWS EC2 or equivalent.
DevOps Toolchains
- Proficiency with Docker for managing container images and runtime debugging.
- Understanding of Kubernetes concepts for managing deployments, ingress setups, and pod-related issues, along with the related troubleshooting commands and mechanisms.
- Knowledge of CI/CD tools such as Jenkins, GitHub Actions, and ArgoCD for building, deploying, and managing automated pipelines.
- Understanding of ingress controllers (e.g., NGINX, Traefik) and SSL termination for secure routing.
Troubleshooting and Incident Management
- Strong problem-solving skills to identify root causes using logs, metrics, and system-level debugging.
- Ability to document detailed troubleshooting steps and solutions for recurring issues.
Collaboration and Communication
- Ability to work with cross-functional teams (DevOps, development, and operations) to resolve incidents.
- Effective and proactive communication skills for escalating issues and providing updates during shift handovers.
- Proficiency with tools like Slack, JIRA, Confluence, or Google Workspace for collaboration and issue tracking.