We are looking for a proactive and highly skilled Technical Support Engineer to provide exceptional technical support to overseas projects, working in rotational shifts to ensure 24/7 availability. This role involves troubleshooting complex issues across cloud platforms, networking, application architectures, and DevOps toolchains. The ideal candidate is self-motivated, collaborative, agile, and a continuous learner.
Key Responsibilities
- Provide technical support and troubleshoot issues related to cloud platforms and services such as Fargate, ECS, DynamoDB, BigQuery, and SNS.
- Diagnose problems by analyzing logs and metrics from sources such as CloudWatch, Prometheus, Grafana, Loki, Alert Manager, and Splunk.
- Analyze and resolve networking issues involving load balancers, API gateways, reverse proxies, ingress controllers, and service-to-service communications.
- Work on issues related to client-server communications, firewalls, and virtual machines.
- Collaborate with DevOps teams to manage and troubleshoot toolchains such as Docker, Kubernetes, Jenkins, and ingress controllers.
- Act as the first point of contact for technical queries and escalate issues when necessary.
- Liaise with development and operations teams to identify root causes and resolve incidents effectively.
- Document troubleshooting steps and solutions, and maintain a knowledge base for recurring issues.
- Collaborate with cross-functional teams to implement best practices for monitoring and incident response.
- Participate in shift handovers and provide timely updates on ongoing issues.
Technical Skills
Cloud Platforms and Services
- Hands-on experience with Fargate and ECS for managing and troubleshooting containerized workloads.
- Proficiency with DynamoDB and BigQuery for analyzing data and making decisions based on the analysis.
- Hands-on knowledge of SNS for debugging message delivery issues and integration workflows.
Monitoring and Logging Tools
- Proficiency in CloudWatch Logs, Loki, and Splunk for consuming and analyzing logs to identify and resolve issues.
- Hands-on knowledge of Prometheus and Grafana for analyzing metrics through dashboards and monitoring system health.
- Knowledge of Alert Manager for configuring and managing alert escalation.
- Ability to interpret metrics from various sources and derive actionable insights.
Networking and Security
- Understanding of load balancers (e.g., ALB, NLB) for distributing traffic and troubleshooting connectivity issues.
- Knowledge of API gateways such as AWS API Gateway or NGINX for managing API traffic.
- Knowledge of reverse proxies and ingress controllers (e.g., NGINX Ingress, Traefik) for managing internal/external traffic.
- Understanding of service-to-service communications, including DNS, HTTP/HTTPS, and gRPC protocols.
- Hands-on knowledge of firewalls, security groups, and IAM roles for secure communications.
- Troubleshooting skills for VM-related issues on platforms such as AWS EC2 or equivalent.
DevOps Toolchains
- Proficiency with Docker for managing container images and runtime debugging.
- Understanding of Kubernetes concepts for managing deployments, ingress setups, and pod-related issues, along with the related troubleshooting commands and mechanisms.
- Knowledge of CI/CD tools such as Jenkins, GitHub Actions, and ArgoCD for building, deploying, and managing automated pipelines.
- Understanding of ingress controllers (e.g., NGINX, Traefik) and SSL termination for secure routing.
Troubleshooting and Incident Management
- Strong problem-solving skills to identify root causes using logs, metrics, and system-level debugging.
- Ability to document detailed troubleshooting steps and solutions for recurring issues.
Collaboration and Communication
- Ability to work with cross-functional teams (DevOps, development, and operations) to resolve incidents.
- Effective, proactive communication skills for escalating issues and providing updates during shift handovers.
- Proficiency with tools like Slack, JIRA, Confluence, or Google Workspace for collaboration and issue tracking.
Experience Required
Technical Support Engineer with a minimum of 0.5 years of experience.
Job Types: Full-time, Permanent
Pay: ₹400,000.00 - ₹600,000.00 per year
Benefits:
- Health insurance
- Provident Fund
Work Location: In person