Role Overview
We are seeking a seasoned AWS CloudOps Architect to lead the design, implementation, and operational excellence of our cloud infrastructure. In this role, you will bridge the gap between high-level architecture and hands-on operational management. You will be responsible for ensuring that our cloud platforms are not only highly available and scalable but also cost-optimized and secure.
The ideal candidate possesses a deep understanding of the AWS Well-Architected Framework and brings a developer-centric mindset to operations, utilizing Java and Python to automate complex workflows and enhance system reliability.
Key Responsibilities
Architectural Leadership:
Design and deploy scalable, highly available, and fault-tolerant systems on AWS, adhering to the Well-Architected Framework.
Operational Excellence:
Lead monitoring, logging, and alerting strategies to ensure proactive system health management and rapid incident response.
Automation & IaC:
Drive a "code-first" approach to infrastructure using Terraform or CloudFormation to automate resource provisioning and configuration.
CI/CD & DevOps:
Collaborate with engineering teams to build and maintain robust CI/CD pipelines, streamlining the path from development to production.
Security & Compliance:
Implement and manage IAM policies, network security (VPC, Security Groups), and compliance guardrails to protect sensitive data.
Cost Optimization:
Conduct regular architectural reviews to identify cost-saving opportunities through rightsizing, reserved instances, and serverless transitions.
Collaboration:
Act as a technical mentor to junior engineers and provide architectural guidance to application development teams.
Required Technical Skills
- Strong AWS CloudOps and Cloud Architecture experience
- Core AWS Services:
  - Compute & Containers: Extensive experience with EKS (Kubernetes), ECS, and Lambda (Serverless).
  - Messaging & Streaming: Hands-on expertise with SQS, SNS, and MSK (Managed Streaming for Kafka).
  - Databases & Storage: Deep knowledge of RDS, DynamoDB, and S3 lifecycle management.
- Programming: Strong proficiency in Java and Python for automation, custom tooling, and script development.
- Infrastructure as Code: Proven experience with Terraform or AWS CloudFormation.
- Networking: Deep understanding of AWS global infrastructure, including Regions, Availability Zones (AZs), Edge locations, Route 53, and CloudFront.
- Observability: Proficiency with tools such as AWS CloudWatch, CloudTrail, and third-party APM/logging solutions (e.g., ELK, Datadog, Prometheus).
Good to Have
AI/ML Integration:
Exposure to AWS AI/ML services such as Amazon SageMaker, Bedrock, Rekognition, or Comprehend.
Advanced Networking:
Experience with Transit Gateway, Direct Connect, and complex VPC peering architectures.
Disaster Recovery:
Proven track record of designing and testing multi-region DR strategies.
Qualifications
- Bachelor's or Master's degree in Computer Science or a related field
- 8+ years of IT experience, with 5+ years on AWS
- Demonstrated experience in managing large-scale, production-grade Kubernetes environments (EKS).
Certifications (Optional)
AWS Certified CloudOps Engineer - Associate
AWS Certified Solutions Architect - Associate
AWS Certified DevOps Engineer - Professional