Posted: 1 day ago
On-site
Full Time
As the SRE Architect for Flipkart’s Reliability & Productivity Charter, you will own the vision and strategic roadmap for reliability, defining what “resilient at scale” means for Flipkart and how we measure success. This includes:
● Centralized Observability Stack: End-to-end design of metrics, tracing, logging, and alerting pipelines to give every engineering team a single pane of glass into system health.
● Public Cloud Management: Define best practices, guardrails, and automation for Flipkart’s multi-region GCP footprint to ensure cost-effective, secure, and compliant operations.
● SRE Platform Innovations: Lead the architecture of chaos engineering (Chaos Platform), mass code migration (CodeLift with OpenRewrite), golden-image enforcement and artifact scanning (ImageScanning), and other next-generation reliability tools.
In this role, you will collaborate closely with engineering, product, and operations stakeholders to translate high-level reliability objectives into concrete, scalable systems and processes that empower thousands of engineers to build, deploy, and operate Flipkart’s services with confidence.
Join a dynamic SRE team focused on elevating Flipkart’s platform resilience, developer productivity, and operational excellence. We build and own the platforms and tooling that enable thousands of engineers to deliver high-quality features at scale and with confidence.
○ Define the end-to-end architecture for centralized observability (metrics, tracing, logs, alerting) and ensure scalability, security, and cost-efficiency
○ Drive the technical roadmap for platforms such as Chaos Platform, CodeLift, and Image Scanning
○ Establish best-practice patterns (golden paths) for multi-region, multi-cloud deployments aligned with BCP/DR requirements
○ Lead cross-functional design reviews, proof-of-concepts, and production rollouts for new platform components
○ Ensure robust standards for API design, data modeling, and service-level objectives (SLOs)
○ Define and enforce policy as code (e.g., quota management, image enforcement, CI/CD pipelines)
○ Coach and guide SRE Engineers and Platform Engineers on system design, reliability patterns, and performance optimizations
○ Evangelize “shift-left” practices: resilience testing, security scanning (Snyk, Artifactory integration), and automated feedback loops
○ Stay abreast of industry trends (service meshes, event stores, distributed tracing backends) and evaluate their applicability
○ Collaborate with FinanceOps and CloudOps to optimize public cloud cost, capacity, and resource utilization
○ Define monitoring, alerting, and auto-remediation strategies to maintain healthy error budgets
○ 10+ years in large-scale distributed systems architecture, with at least 3 years in an SRE or platform engineering context
○ Hands-on mastery of observability stacks (Prometheus, OpenTelemetry, Jaeger/Zipkin, ELK/EFK, Grafana, Alertmanager)
○ Proven track record of designing chaos engineering frameworks and non-functional testing workflows
○ Deep knowledge of public cloud platforms (GCP preferred), container orchestration (Kubernetes), and IaC (Terraform, Helm)
○ Strong background in language-agnostic tooling (Go, Java, Python) and API-driven microservices architectures
○ Familiarity with OpenRewrite for mass code migration and vulnerability management tools (Snyk, Trivy)
○ Demonstrated ability to influence stakeholders across engineering, product, and operations teams
○ Excellent written and verbal communication—able to translate complex architectures into clear, actionable plans
○ Passion for mentoring and growing engineering talent in reliability and productivity best practices
Flipkart