
4 Amazon EKS Jobs

Set Up a Job Alert
JobPe aggregates listings for easy access, but you apply directly on the original job portal.

7.0 - 10.0 years

18 - 33 Lacs

Pune

Hybrid

Dear Candidate, please apply via the link below: https://shorturl.at/EMbNG (copy the link, paste it into a new browser tab, and follow the instructions.)

As a Senior Site Reliability Engineer (SRE) at Privacera, you will be a foundational part of the team that ensures the reliability, availability, and security of our services and platforms for our customers. You must demonstrate an extreme-ownership mentality. A successful Senior SRE here must be a strong coder in Python as well as Bash. You will need to quickly become proficient in understanding how each design, component, configuration, and process links together to form an end-to-end solution. You should have strong experience deploying and managing first-tier monitoring, logging, and dashboarding platforms.

Your responsibilities:
- Automating the creation, deployment, testing, securing, and overall management of our infrastructure and services. This requires understanding key details about our services, the majority of which are written in Java.
- Developing quality-assurance methodologies for your code, including creating and validating your own unit tests.
- Creating and using modern Continuous Integration/Continuous Deployment (CI/CD) pipelines and tooling, specifically with cloud-native technologies, and building those pipelines so that the typical engineer can use them at scale.
- Taking responsibility for ensuring our offerings are secure and compliant with modern frameworks.
- Fixing issues in our production environments, most of the time without involving other teams.
- Mentoring junior engineers.
- Serving in an on-call rotation.
- Creating root cause analysis (RCA) documentation, and hosting and participating in meetings on such topics with multiple stakeholders.
- Designing and implementing monitoring, logging, and dashboarding platforms across cloud providers and regions.
Your experience, skills, and capabilities should include:
- 7+ years of experience as an SRE, Platform Engineer, or similar.
- 4+ years of experience managing mission-critical SaaS applications at scale.
- Deep understanding of Kubernetes for deploying microservice-based SaaS applications, including (but not limited to) vendor implementations such as AWS EKS.
- 2+ years of experience with Bash, Python, Terraform, and Helm preferred.
- Very deep experience with cloud-native monitoring, logging, and dashboarding platforms, including vendor-specific platforms such as CloudWatch and CloudTrail, and third-party platforms such as Grafana, Prometheus, Loki, and Tempo.
- A strong ability to work entirely within an infrastructure-as-code (IaC) framework; in our case this means knowing Terraform and/or CloudFormation intimately.
- Strong experience with GitLab/GitHub pipelines, AWS CodeBuild/CodeDeploy/CodePipeline, etc.
- Excellent verbal and written communication in English; explaining and documenting are key functions of this role.
- Experience working in a fast-paced startup environment.
- B.Tech./M.Tech. in Computer Science and Engineering, MCA, M.Sc. in Computer Science, or equivalent.
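The reliability focus described above typically revolves around SLOs and error budgets. As a quick illustration (the targets and numbers below are made-up, not Privacera's), a 99.9% availability SLO over a 30-day window works out like this:

```python
"""Sketch: computing an SLO error budget, a routine SRE calculation.
All figures here are illustrative."""


def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Allowed downtime (in minutes) for a given availability SLO."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1 - slo)


def budget_remaining(slo: float, downtime_minutes: float, window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative means blown)."""
    budget = error_budget_minutes(slo, window_days)
    return (budget - downtime_minutes) / budget


if __name__ == "__main__":
    # A 99.9% SLO over 30 days allows ~43.2 minutes of downtime.
    print(round(error_budget_minutes(0.999), 1))    # 43.2
    # After 21.6 minutes of downtime, half the budget remains.
    print(round(budget_remaining(0.999, 21.6), 2))  # 0.5
```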

Posted 2 months ago

Apply

5.0 - 10.0 years

5 - 10 Lacs

Chennai, Tamil Nadu, India

On-site

Design and implement scalable AI platform solutions to support machine learning workflows.

Responsibilities:
- Build and deliver software in Python; exceptional ability in other programming languages will also be considered.
- Deploy the underlying infrastructure and tooling for running machine learning or data science at scale using Infrastructure as Code (demonstrable experience required).
- Use DevOps practices to enable automation strategies.
- Apply MLOps practices and build pipelines to accelerate and automate machine learning (experience or awareness here will be looked upon favorably).
- Manage and optimize the deployment of applications on Amazon EKS (Elastic Kubernetes Service).
- Implement Infrastructure as Code using tools such as Terraform or AWS CloudFormation.
- Provision and scale AI platforms such as Domino Data Lab, Databricks, or similar systems.
- Collaborate with cross-functional teams to integrate AI solutions into the AWS cloud infrastructure.
- Drive automation and develop DevOps pipelines using GitHub and GitHub Actions.
- Ensure high availability and reliability of AI platform services.
- Monitor and troubleshoot system performance, providing quick resolutions.
- Stay current with industry trends and advancements in AI and cloud technologies.
- Experience working with GxP-compliant life science systems will be looked upon favorably.

Qualifications:
- Proven hands-on experience with Amazon EKS and AWS cloud services.
- Strong expertise in Infrastructure as Code with Terraform and AWS CloudFormation.
- Strong expertise in Python programming.
- Experience provisioning and scaling AI platforms such as Domino Data Lab, Databricks, or similar systems.
- Solid understanding of DevOps principles and experience with CI/CD tools such as GitHub Actions.
- Familiarity with version control using Git and GitHub.
- Excellent problem-solving skills and the ability to work independently and in a team.
- Strong communication and collaboration skills.
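Platform automation of the kind this role describes, such as waiting for an EKS deployment rollout to become healthy, often reduces to a retry-with-backoff loop. A minimal, stdlib-only sketch (the function names, timings, and the stand-in readiness check are illustrative, not from the posting):

```python
"""Sketch: retry with exponential backoff, a common building block in
deployment automation. Purely illustrative."""
import time


def retry(check, attempts=5, base_delay=0.01):
    """Call `check` until it returns truthy, backing off exponentially.

    Returns the truthy value, or raises RuntimeError after `attempts` tries.
    """
    for i in range(attempts):
        result = check()
        if result:
            return result
        time.sleep(base_delay * (2 ** i))  # 0.01s, 0.02s, 0.04s, ...
    raise RuntimeError(f"check did not succeed after {attempts} attempts")


if __name__ == "__main__":
    state = {"polls": 0}

    def pods_ready():
        # Stand-in for polling a rollout status via the Kubernetes API.
        state["polls"] += 1
        return state["polls"] >= 3  # "ready" on the third poll

    retry(pods_ready)
    print(state["polls"])  # 3
```

In real use, `pods_ready` would wrap an actual API call, and the backoff would usually be capped and jittered.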

Posted 2 months ago

Apply

8.0 - 13.0 years

25 - 40 Lacs

Pune

Remote

Role: Performance Test Engineer (AWS Cloud & EKS)
Location: Remote (India)
Experience: 8+ years
Job Type: Full-Time

Role Overview:
We are actively seeking a Performance Test Engineer with deep hands-on experience in performance testing of cloud-native applications deployed on AWS, particularly within EKS (Elastic Kubernetes Service) environments. The ideal candidate combines expertise in performance tools with strong knowledge of containerized infrastructure, observability, and caching strategies to ensure scalability, speed, and system health across high-traffic platforms.

Key Responsibilities:

Performance Engineering:
- Design and execute comprehensive load, stress, endurance, spike, and volume tests using tools such as Apache JMeter, LoadRunner, or NeoLoad.
- Build robust performance test frameworks tailored for microservices hosted on EKS.
- Conduct end-to-end performance benchmarking across the API, database, and UI layers.

AWS Cloud and EKS Focus:
- Work directly with AWS services such as EKS, EC2, S3, RDS, CloudWatch, and Application Load Balancer (ALB) during test planning and execution.
- Simulate real-world distributed load in EKS-based environments and evaluate auto-scaling behavior, node performance, and pod-level resource usage.
- Collaborate with DevOps to ensure test pods are deployed and scaled properly in Kubernetes clusters.

Monitoring & Observability:
- Use tools such as Grafana, CloudWatch, Prometheus, Datadog, and New Relic to monitor performance metrics in real time.
- Analyze system behavior through dashboards, logs, and alerts to identify performance bottlenecks and degradation.
- Capture pod-level metrics (CPU, memory, network) from the Kubernetes namespaces and workloads under test.

CI/CD and Automation Integration:
- Integrate performance test suites into CI/CD pipelines using tools such as Jenkins, GitHub Actions, or AWS CodePipeline.
- Ensure automated execution of performance tests as part of build validations and release gates.

Reliability & Caching Validation:
- Validate caching layers (Redis, CDN, Memcached) and their impact on response time under high load.
- Provide recommendations to improve system reliability, latency, and throughput using test results and observability data.

Must-Have Skills:
- 8+ years in performance testing and engineering.
- Strong expertise with JMeter, NeoLoad, or similar tools.
- Solid hands-on experience with AWS, especially EKS (pods, nodes, services, auto-scaling).
- Strong understanding of Kubernetes workloads, Helm charts, and ConfigMaps.
- Familiarity with monitoring and logging in cloud-native environments: Grafana, CloudWatch, Datadog, Prometheus.
- Basic scripting in Python, shell, or Groovy.
- Experience testing REST APIs, microservices, and containerized applications.

Nice-to-Have:
- AWS certification (Cloud Practitioner, Developer Associate, or DevOps Engineer).
- Experience with Terraform, Helm, or Kustomize.
- Familiarity with chaos testing or resilience validation.
- Experience with service mesh performance.
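Reporting results from load tests like those described above usually means reducing raw response times to percentiles (p50/p95/p99). A small Python sketch using only the standard library (the sample data is fabricated to mimic a slow tail under load):

```python
"""Sketch: summarising latency samples into the percentiles a performance
report typically cites. Sample data is fabricated."""
import statistics


def latency_summary(samples_ms):
    """Return p50/p95/p99 latency from a list of per-request timings (ms)."""
    # quantiles(n=100) yields 99 cut points; index k-1 is the k-th percentile.
    cuts = statistics.quantiles(samples_ms, n=100)
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


if __name__ == "__main__":
    # 1000 fabricated response times: most fast, a slow tail under load.
    samples = [20 + (i % 50) for i in range(950)] + [500] * 50
    summary = latency_summary(samples)
    print(summary)  # p50 stays low; p95/p99 expose the slow tail
```

Averages hide exactly the tail this job's caching and auto-scaling validation is meant to catch, which is why percentiles are the standard reporting unit.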

Posted Date not available

Apply

4.0 - 6.0 years

10 - 15 Lacs

Bengaluru

Hybrid

Primary skills: 3-5 years of hands-on experience with Apache Airflow, any cloud platform (AWS/Azure/GCP), and Python.
Secondary skills: GitHub, ELK, Jenkins, Docker, Terraform, Amazon EKS.

Scope and Responsibilities:
As a Senior Engineer focused on Managed Airflow Platform (MAP) support engineering, you will:
- Evangelize and cultivate adoption of Global Platforms, open-source software, and agile principles within the organization
- Ensure solutions are designed and developed using a scalable, highly resilient cloud-native architecture
- Ensure the operational stability, performance, and scalability of cloud-native platforms through proactive monitoring and timely issue resolution
- Diagnose infrastructure and system issues across cloud environments and Kubernetes clusters, and lead troubleshooting and remediation efforts
- Collaborate with engineering and infrastructure teams to manage configurations, resource tuning, and platform upgrades without disrupting business operations
- Maintain clear, accurate runbooks, support documentation, and platform knowledge bases to enable faster onboarding and incident response
- Support observability initiatives by improving logging, metrics, dashboards, and alerting frameworks
- Advocate for operational excellence and drive continuous improvement in system reliability, cost-efficiency, and maintainability
- Work with product management to support product/service scoping activities
- Work with leadership to define delivery schedules of key features through an agile framework
- Be a key contributor to the overall architecture, framework, and design of global platforms

Required Qualifications:
- Bachelor's or Master's degree in Computer Science or a related field
- 3+ years of experience in large-scale, production-grade platform support, including participation in on-call rotations
- 3+ years of hands-on experience with cloud platforms such as AWS, Azure, or GCP
- 2+ years of experience developing and supporting data pipelines with Apache Airflow, including DAG lifecycle management, scheduling best practices, and troubleshooting task failures, scheduler issues, performance bottlenecks, and error handling
- Strong programming proficiency in Python, especially for developing and troubleshooting RESTful APIs; working knowledge of Node.js is an added advantage
- 1+ years of experience in observability using the ELK stack (Elasticsearch, Logstash, Kibana) or the Grafana stack
- 2+ years of experience with DevOps and Infrastructure-as-Code tools such as GitHub, Jenkins, Docker, and Terraform
- 2+ years of hands-on experience with Kubernetes, including managing and debugging cluster resources and workloads within Amazon EKS
- Exposure to Agile and test-driven development is a plus
- Experience delivering projects in a highly collaborative, multi-discipline development team environment

Desired Qualifications:
- Exposure to Agile, ideally a strong background in the SAFe methodology
- Experience with any monitoring or observability tool is a value-add
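The DAG lifecycle management this role emphasizes rests on topological ordering of task dependencies: Airflow's scheduler only runs a task once all of its upstream tasks have finished. A minimal illustration of that ordering using the standard library's `graphlib` rather than Airflow itself (the task names are made up):

```python
"""Sketch: the dependency ordering at the heart of Airflow DAG scheduling,
shown with stdlib `graphlib` instead of Airflow. Illustrative only."""
from graphlib import TopologicalSorter

# Predecessor map: task -> set of upstream tasks that must finish first.
dag = {
    "extract": set(),
    "transform": {"extract"},
    "validate": {"transform"},
    "load": {"transform", "validate"},
}

# static_order() yields tasks in an order that respects every dependency.
order = list(TopologicalSorter(dag).static_order())
print(order)  # ['extract', 'transform', 'validate', 'load']

# Every task appears after all of its upstream dependencies.
pos = {task: i for i, task in enumerate(order)}
assert all(pos[up] < pos[task] for task, ups in dag.items() for up in ups)
```

In Airflow the same shape is expressed with operators and `>>` dependency arrows; `TopologicalSorter` also raises `CycleError` on circular dependencies, which is the same class of error Airflow rejects at DAG parse time.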

Posted Date not available

Apply

Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.

Job Application AI Bot

Apply to 20+ Portals in one click

Download Now

Download the Mobile App

Instantly access job listings, apply easily, and track applications.
