14.0 - 24.0 years
50 - 60 Lacs
Noida, Hyderabad, Pune
Work from Office
Expectations: Prior experience serving as an architect in Practice, COE, and HBUs, having created service offerings, solution accelerators, and unique selling propositions. Play a critical role in driving automation, continuous integration/continuous delivery (CI/CD), and monitoring capabilities to enhance the development and operations processes. Lead and execute the design, definition, and prototyping of the end-to-end unified observability system leveraging New Relic, Splunk, and the Grafana Stack. Define build, implementation, and deployment strategies for DevOps, Observability, and Site Reliability Engineering. Market technology and domain solutions / service offerings to internal and external stakeholders. Manage business relationships with technology partners and start-up ecosystems, and demonstrate an edge over the competition. Passionate about technology and customer success, with excellent communication and articulation skills. Should have prior experience in presenting capabilities and solutions to end customers. Build initial prototypes of the observability solution and lead the demo sessions with the customer teams. Behavior Competencies: Excellent communication, interpersonal, and presentation skills; people management; conflict resolution; solutioning; customer service; accountability; judgement and decision making; ability to build and maintain relationships with stakeholders. Technical Skills: At least 4 years of pre-sales experience, working with RFIs / RFPs and developing and presenting technical designs and solutions to internal and external stakeholders. Extensive experience in assessing the SRE, DevOps, and Observability maturity state, with the ability to define a maturity improvement roadmap.
Extensive experience in defining and implementing SRE, DevOps, and Observability strategies for 3 or more large-scale projects. Experience with cloud platforms such as AWS, Azure, or GCP. Deep expertise in Time Series Database configuration and implementation on AWS cloud. Experience as an architect on observability projects at scale, covering design, implementation, and cloud deployment of observability for containerized (Azure AKS or AWS EKS) applications using New Relic, Splunk, and the Grafana Stack, or open-source Grafana and Prometheus products/tools. Deep expertise in designing and implementing end-to-end distributed tracing using DaemonSets/agents and telemetry-gathering patterns. 3+ years in monitoring and observability automation using New Relic, Splunk, and the Grafana Stack, including Prometheus-based alerting. Deep expertise in observability tools such as Splunk, New Relic, AWS CloudWatch, AWS OpenSearch, and ELK.
Posted -1 days ago
8.0 - 13.0 years
40 - 65 Lacs
Hyderabad
Remote
Technical Head of Cloud & DevOps Location: 100% Remote (India, Eastern Europe, UK, or U.S.-based candidates; occasional travel to company hubs or conferences as needed) Type: Full-time, Senior Technical Leadership Role Overview We are seeking a Head of Cloud & DevOps to lead the hands-on management, scaling, and continuous improvement of our decentralized compute infrastructure. This position will serve as the primary technical leader for cloud operations, Kubernetes orchestration, infrastructure management, and DevOps pipelines, ensuring platform reliability, performance, and scalability. You will work closely with the CTO, product management, and cross-functional engineering teams to operationalize our company's evolving platform, drive our migration to in-house Distributed Kubernetes Service (DKS), and ensure high uptime and SLA adherence for enterprise customers. This role requires deep technical expertise combined with strong leadership to guide and mentor teams, while remaining actively engaged in architecture reviews, troubleshooting, and hands-on problem solving. This role is designed for candidates who aspire to grow into a future CTOO position, taking on expanded enterprise leadership responsibilities as the platform scales globally. 
Mandatory Skills Kubernetes orchestration (multi-cluster, DKS, service mesh) Cloud infrastructure scaling (AWS, hybrid, AI workloads) DevOps & CI/CD leadership (Jenkins, GitOps, version control) Infrastructure as Code (IaC) (Terraform, Helm, Ansible) Incident response and uptime optimization (SRE, observability, 99.9%+ SLAs) Security & Compliance knowledge (SOC 2, ISO 27001, access control, encryption) Team leadership in DevOps/SRE/Cloud Ops Monitoring and alerting systems Platform reliability and SLA adherence 8+ years in Cloud Infrastructure, 4+ in Kubernetes/DevOps leadership Non-Mandatory Skills Experience with Distributed Kubernetes Service (DKS) migrations Passion for decentralized computing / Web3 / blockchain NXQ Token or similar token incentive familiarity Cloud-native architecture for AI workloads Experience with hybrid or bare-metal Kubernetes deployments Global infrastructure experience Knowledge of performance-based DevOps metrics (error budgets, SLOs) Key Responsibilities Infrastructure Ownership & Uptime Leadership Own the full operational lifecycle of our company's decentralized compute infrastructure, spanning Kubernetes, VMs, AI workloads, hybrid cloud integrations, and blockchain components. • Develop and execute infrastructure scaling plans to meet growth demands while maintaining enterprise-grade SLAs (99.9%+ uptime). • Build robust monitoring, observability, alerting, and incident response systems to proactively manage global NanoServer operations. • Maintain deep involvement in diagnosing and resolving performance, capacity, and stability issues. Kubernetes Platform Management & DKS Migration Lead the architecture, deployment, and ongoing optimization of our company's Distributed Kubernetes Service (DKS). • Manage the transition from AWS EKS to DKS with zero downtime, thorough testing, rollbacks, and security assurance. • Ensure DKS delivers parity or superiority to leading cloud providers' managed Kubernetes offerings. 
DevOps Leadership Drive maturity in CI/CD pipelines, infrastructure-as-code, configuration management, and automated testing practices. • Oversee deployment reliability, version control, rollbacks, and release management. • Lead incident response runbooks, playbooks, SRE error budgets, and continuous reliability improvements. Security & Compliance Implement strong security controls for Kubernetes clusters, network access, identity management, data privacy, and blockchain-related assets. • Collaborate with compliance teams on certifications (SOC 2, ISO 27001, etc.) as required by enterprise clients. • Maintain operational adherence to security standards and best practices. Team Leadership & Execution Lead, mentor, and grow cross-functional cloud operations teams: DevOps, SRE, infrastructure engineers, and backend developers. • Foster a culture of accountability, continuous improvement, operational excellence, and proactive ownership. • Set clear objectives, performance metrics, and technical execution roadmaps aligned to business goals. Collaboration & Stakeholder Alignment • Partner closely with the CTO, product management, and engineering leadership to translate platform objectives into actionable infrastructure projects. • Represent technical operations in cross-functional planning sessions and communicate platform health, SLAs, and operational risks. Qualifications & Experience 8+ years of experience managing complex cloud infrastructure, with at least 4+ years leading DevOps/SRE/Kubernetes operations at scale. • Strong hands-on expertise with Kubernetes orchestration, multi-cluster management, service mesh, container security, and high-scale distributed systems. • Proven success in infrastructure scaling, uptime optimization, incident response, and capacity planning. • In-depth knowledge of DevOps pipelines, CI/CD frameworks, Infrastructure-as-Code (Terraform, Helm), and automated deployments. 
• Demonstrated ability to lead migrations from managed cloud services to in-house infrastructure. • Strong understanding of cloud security, access controls, encryption, data privacy, and enterprise compliance. • Passion for decentralized cloud computing, Web3/blockchain concepts, or AI-driven infrastructure is a plus. • Excellent leadership, communication, and cross-functional collaboration skills. • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field; equivalent experience considered. Compensation & Benefits Competitive base salary depending on candidate location • Equity participation aligned to long-term growth of our company • Performance-based annual bonuses • NXQ token incentives aligned with ecosystem growth • Comprehensive healthcare coverage • Remote work flexibility with home office stipends • Opportunities for global collaboration and occasional travel • High-impact leadership role shaping the future of cloud technology • Structured career path to grow into CTOO based on organizational maturity and demonstrated leadership
Posted 3 days ago
7.0 - 10.0 years
20 - 30 Lacs
Bangalore Rural, Bengaluru
Work from Office
Role & responsibilities: Design end-to-end monitoring and observability solutions to provide comprehensive visibility into infrastructure, applications, and networks. Implement monitoring tools and frameworks (e.g., Prometheus, Grafana, OpsRamp, Dynatrace, New Relic) to track key performance indicators and system health metrics. Integrate monitoring and observability solutions with IT Service Management tools. Develop and deploy dashboards, alerts, and reports to proactively identify and address system performance issues. Architect scalable observability solutions to support hybrid and multi-cloud environments. Collaborate with infrastructure, development, and DevOps teams to ensure seamless integration of monitoring systems into CI/CD pipelines. Continuously optimize monitoring configurations and thresholds to minimize noise and improve incident detection accuracy. Automate alerting, remediation, and reporting processes to enhance operational efficiency. Utilize AIOps and machine learning capabilities for intelligent incident management and predictive analytics. Work closely with business stakeholders to define monitoring requirements and success metrics. Document monitoring architectures, configurations, and operational procedures. Required Skills: Strong understanding of infrastructure and platform development principles and experience with programming and automation tools such as Python and Ansible for developing custom scripts. Strong knowledge of monitoring frameworks, logging systems (ELK stack, Fluentd), and tracing tools (Jaeger, Zipkin), along with open-source solutions like Prometheus and Grafana. Extensive experience with monitoring and observability solutions such as OpsRamp, Dynatrace, and New Relic; must have worked on ITSM integration (e.g., with ServiceNow or BMC Remedy). Working experience with RESTful APIs and understanding of API integration with the monitoring tools. 
Familiarity with AIOps and machine learning techniques for anomaly detection and incident prediction. Knowledge of ITIL processes and Service Management frameworks. Familiarity with security monitoring and compliance requirements. Excellent analytical and problem-solving skills, with the ability to debug and troubleshoot complex automation issues. CVs to angel@anveta.com
Posted 5 days ago
12.0 - 15.0 years
30 - 35 Lacs
Bengaluru
Work from Office
We are seeking a highly experienced and technically profound Cloud Application Architect to drive our cloud-first digital transformation initiatives. This pivotal role involves leading the design, development, and modernization of our enterprise application portfolio to deliver modern, scalable, secure, and business-aligned cloud-native solutions. The ideal candidate will possess a deep, hands-on technical background in application architecture, with a focus on transforming legacy systems into agile, customer-centric, and cloud-optimized experiences within either the Microsoft or Java enterprise stack. This role is critical for shaping our application landscape, ensuring robust end-to-end design, and guiding development teams through complex architectural challenges in a dynamic, cloud-first environment. Key Responsibilities As a Senior Cloud Application Architect, you will: Define Cloud-Native Application Architectures: Lead the definition, design, and implementation of comprehensive cloud-native application architectures and strategic modernization roadmaps for critical enterprise systems, primarily leveraging AWS EKS, Azure AKS, and serverless functions (e.g., AWS Lambda, Azure Functions). Own End-to-End Application Design: Hold ultimate accountability for the end-to-end application design, ensuring solutions meet stringent requirements for scalability (handling high transaction volumes), performance (low latency), robust security (integrating DevSecOps principles like SAST/DAST, Zero Trust), and high reliability (achieving stringent uptime targets). Guide Microservices/API Architecture & Containerization: Provide senior technical guidance and mentorship to multiple distributed project teams on advanced microservices and API-first design patterns, including choreography vs. orchestration, eventual consistency, and idempotent API design. 
Lead the adoption and implementation of Docker containerization and Kubernetes orchestration (AKS/EKS) for efficient application deployment and management. Develop Deployment & Operational Strategy: Define and enforce declarative deployment strategies (e.g., GitOps with ArgoCD/FluxCD). Design application-level disaster recovery and business continuity plans, including multi-region deployments with active-active/active-passive patterns and automated failover mechanisms. Collaborate Cross-Functionally: Collaborate extensively as a strategic partner with cross-functional teams including software developers (Java/.NET), product owners, business analysts, DevOps engineers, security specialists, and infrastructure teams. Translate complex business requirements into clear, actionable technical specifications. Lead Technical Design Sessions & Governance: Lead high-stakes technical design sessions, facilitate architecture review boards (ARB), and prepare comprehensive architectural documentation (e.g., Architecture Decision Records (ADRs), sequence diagrams, data flow diagrams) to ensure alignment, maintain architectural integrity, and govern new feature implementations. Support Build vs. Buy & Tool Selection: Actively support critical build vs. buy analyses for new functionalities. Evaluate, select, and champion various cloud services (PaaS, SaaS) and third-party tools (e.g., API Management gateways, caching solutions, message brokers) based on technical fit, business needs, and cost efficiency. Conduct and present Proof-of-Concepts (PoCs) for emerging technologies and strategic platform integrations. Drive DevSecOps & Observability Integration: Champion the integration of advanced DevSecOps practices, from "shift-left" security to automated CI/CD pipelines. Implement comprehensive application observability solutions (e.g., Prometheus, Grafana, Application Insights) to monitor SLOs/SLIs, diagnose performance issues, and proactively ensure system health. 
Optimize Application-Level Costs: Design and optimize application architectures to maximize cloud cost efficiency, leveraging serverless computing, right-sizing container workloads, and implementing intelligent autoscaling policies. Mentor & Foster Innovation: Mentor junior and mid-level developers and architects on cloud-native development best practices, application refactoring techniques, and effective utilization of cloud services. Explore and prototype the integration of emerging technologies (e.g., AI/ML, Generative AI) for intelligent features and digital workflow automation. Qualifications: Education: Bachelor's or Master's degree in Computer Science, Engineering, Information Technology, or a related field. Experience: 12+ years of progressive experience in application architecture, with a significant and demonstrable focus on cloud-native application design, digital-first transformations, and modernizing enterprise software. Application Development Background: Strong application background with hands-on experience in either the Microsoft (.NET Core, ASP.NET) or Java (Spring Boot, J2EE) enterprise/product software architecture. Cloud Platform Expertise: Proven experience delivering cloud-first solutions using public cloud platforms (AWS, Azure are preferred; GCP experience is a plus), with a deep understanding of their PaaS and IaaS offerings relevant to application development. Modern Application Design Principles: Deep knowledge and hands-on experience with microservices, API-driven development, event-driven architecture, serverless computing, and domain-driven design. Containerization & Orchestration: Expertise in Docker and Kubernetes (EKS, AKS), including deployment strategies and operational best practices for containerized applications. Agile, DevOps, & CI/CD: Strong understanding and practical experience with agile delivery models, comprehensive DevOps practices, and continuous integration/deployment (CI/CD) pipelines. 
Communication & Stakeholder Management: Excellent communication, presentation, and stakeholder management skills, with a proven ability to bridge technical and business perspectives, and advise senior leadership. Leadership & Governance: Extensive experience in leading cross-functional development and architecture teams, managing architectural governance, and mentoring engineers in large-scale programs. Preferred Skills: Cloud Certifications: Relevant cloud certifications (e.g., AWS Certified Solutions Architect – Professional, Azure Solutions Architect Expert, Certified Kubernetes Application Developer - CKAD). Enterprise Architecture Frameworks: Knowledge of enterprise architecture frameworks (e.g., TOGAF) in the context of digital transformation. Observability Tools: Experience with comprehensive observability solutions for applications (e.g., Prometheus, Grafana, Datadog, Application Insights, distributed tracing tools like Jaeger). Security by Design: Direct experience implementing security best practices at the application architecture level (e.g., OWASP, threat modeling, secure coding standards). AI/ML Integration: Experience with integrating analytics, personalization, and AI/ML capabilities into application architectures. Low-Code/No-Code Platforms: Exposure to low-code/no-code development tools and digital workflow automation platforms.
Posted 1 week ago
8.0 - 12.0 years
2 - 11 Lacs
Bengaluru / Bangalore, Karnataka, India
On-site
This might be a good fit for you if enabling people to do their best resonates with you, you love platform engineering, you want to build cool things with cool people, you love automating everything, you love building high-impact tools and software which everyone depends on, and you love automating everything! What Your Responsibilities Will Be: Some areas of work are creating tools that smooth the journey from idea to running in production; learning and evangelizing best practices related to the build, test, and deployment of software; and providing tools to our fellow engineers with a high degree of reliability and quality. What You'll Need to be Successful: Qualifications: Software Engineering: Understand software engineering fundamentals and have experience developing software among a team of engineers. Strong experience in the practice of testing. Build Automation: Experience getting artifacts in a variety of languages packaged and tested so that they can be trusted to go into production. Automatically. Release Automation: Experience in getting artifacts running in production in a reliable manner. Automatically. Observability: Experience with developing service level indicators and objectives, instrumenting software, and building meaningful alerts. Troubleshooting: A passion for tracking down technical root causes of issues in distributed systems and software. Containers/Container Orchestration Systems: A solid understanding of how to manage and maintain container-based systems, especially on Kubernetes. Artificial Intelligence: A grounding in infrastructure for and the use of Agentic Systems. Infrastructure-as-Code: Experience with deploying and maintaining infrastructure as code with tools such as Terraform and Pulumi. Technical Writing: We will need to build documentation and diagrams for other engineering teams. Customer Satisfaction: Keen eye for customer satisfaction (our customers are other engineering teams and Avalara customers). 
Passion for Learning: Interest in the broader technology space with a constant desire to expand your understanding. Adaptability: Experience working on a variety of projects. Preferred Qualifications: Go: Our tooling is developed in Go. Distributed Computing: Experience architecting, developing, and deploying distributed services across regions and clouds. GitLab: Experience in working with, managing, and deploying. Artifactory: Experience in working with, managing, and deploying. Technical Writing: Writing technical documents that people love and adore. Open Source: Build side projects or contribute to other open-source projects. Experience: Minimum 8 years of experience in a SaaS environment. Bachelor's degree in computer science or equivalent. Ability to participate in an on-call rotation.
Posted 1 week ago
8.0 - 13.0 years
3 - 12 Lacs
Hyderabad / Secunderabad, Telangana, India
On-site
This might be a good fit for you if enabling people to do their best resonates with you, you love platform engineering, you want to build cool things with cool people, you love automating everything, you love building high-impact tools and software which everyone depends on, and you love automating everything! What Your Responsibilities Will Be: Some areas of work are creating tools that smooth the journey from idea to running in production; learning and evangelizing best practices related to the build, test, and deployment of software; and providing tools to our fellow engineers with a high degree of reliability and quality. What You'll Need to be Successful: Qualifications: Software Engineering: Understand software engineering fundamentals and have experience developing software among a team of engineers. Strong experience in the practice of testing. Build Automation: Experience getting artifacts in a variety of languages packaged and tested so that they can be trusted to go into production. Automatically. Release Automation: Experience in getting artifacts running in production in a reliable manner. Automatically. Observability: Experience with developing service level indicators and objectives, instrumenting software, and building meaningful alerts. Troubleshooting: A passion for tracking down technical root causes of issues in distributed systems and software. Containers/Container Orchestration Systems: A solid understanding of how to manage and maintain container-based systems, especially on Kubernetes. Artificial Intelligence: A grounding in infrastructure for and the use of Agentic Systems. Infrastructure-as-Code: Experience with deploying and maintaining infrastructure as code with tools such as Terraform and Pulumi. Technical Writing: We will need to build documentation and diagrams for other engineering teams. Customer Satisfaction: Keen eye for customer satisfaction (our customers are other engineering teams and Avalara customers). 
Passion for Learning: Interest in the broader technology space with a constant desire to expand your understanding. Adaptability: Experience working on a variety of projects. Preferred Qualifications: Go: Our tooling is developed in Go. Distributed Computing: Experience architecting, developing, and deploying distributed services across regions and clouds. GitLab: Experience in working with, managing, and deploying. Artifactory: Experience in working with, managing, and deploying. Technical Writing: Writing technical documents that people love and adore. Open Source: Build side projects or contribute to other open-source projects. Experience: Minimum 8 years of experience in a SaaS environment. Bachelor's degree in computer science or equivalent. Ability to participate in an on-call rotation.
Posted 1 week ago
8.0 - 13.0 years
3 - 11 Lacs
Delhi, India
On-site
This might be a good fit for you if enabling people to do their best resonates with you, you love platform engineering, you want to build cool things with cool people, you love automating everything, you love building high-impact tools and software which everyone depends on, and you love automating everything! What Your Responsibilities Will Be: Some areas of work are creating tools that smooth the journey from idea to running in production; learning and evangelizing best practices related to the build, test, and deployment of software; and providing tools to our fellow engineers with a high degree of reliability and quality. What You'll Need to be Successful: Qualifications: Software Engineering: Understand software engineering fundamentals and have experience developing software among a team of engineers. Strong experience in the practice of testing. Build Automation: Experience getting artifacts in a variety of languages packaged and tested so that they can be trusted to go into production. Automatically. Release Automation: Experience in getting artifacts running in production in a reliable manner. Automatically. Observability: Experience with developing service level indicators and objectives, instrumenting software, and building meaningful alerts. Troubleshooting: A passion for tracking down technical root causes of issues in distributed systems and software. Containers/Container Orchestration Systems: A solid understanding of how to manage and maintain container-based systems, especially on Kubernetes. Artificial Intelligence: A grounding in infrastructure for and the use of Agentic Systems. Infrastructure-as-Code: Experience with deploying and maintaining infrastructure as code with tools such as Terraform and Pulumi. Technical Writing: We will need to build documentation and diagrams for other engineering teams. Customer Satisfaction: Keen eye for customer satisfaction (our customers are other engineering teams and Avalara customers). 
Passion for Learning: Interest in the broader technology space with a constant desire to expand your understanding. Adaptability: Experience working on a variety of projects. Preferred Qualifications: Go: Our tooling is developed in Go. Distributed Computing: Experience architecting, developing, and deploying distributed services across regions and clouds. GitLab: Experience in working with, managing, and deploying. Artifactory: Experience in working with, managing, and deploying. Technical Writing: Writing technical documents that people love and adore. Open Source: Build side projects or contribute to other open-source projects. Experience: Minimum 8 years of experience in a SaaS environment. Bachelor's degree in computer science or equivalent. Ability to participate in an on-call rotation.
Posted 1 week ago
6.0 - 9.0 years
8 - 11 Lacs
Pune
Work from Office
We are hiring a DevOps / Site Reliability Engineer for a 6-month full-time onsite role in Pune (with possible extension). The ideal candidate will have 6-9 years of experience in DevOps/SRE roles with deep expertise in Kubernetes (preferably GKE), Terraform, Helm, and GitOps tools like ArgoCD or Flux. The role involves building and managing cloud-native infrastructure, CI/CD pipelines, and observability systems, while ensuring performance, scalability, and resilience. Experience in infrastructure coding, backend optimization (Node.js, Django, Java, Go), and cloud architecture (IAM, VPC, CloudSQL, Secrets) is essential. Strong communication and hands-on technical ability are musts. Immediate joiners only.
Posted 1 week ago
0.0 years
0 Lacs
Bengaluru / Bangalore, Karnataka, India
On-site
Ready to shape the future of work At Genpact, we don&rsquot just adapt to change&mdashwe drive it. AI and digital innovation are redefining industries, and we&rsquore leading the charge. Genpact&rsquos AI Gigafactory, our industry-first accelerator, is an example of how we&rsquore scaling advanced technology solutions to help global enterprises work smarter, grow faster, and transform at scale. From large-scale models to agentic AI, our breakthrough solutions tackle companies most complex challenges. If you thrive in a fast-moving, tech-driven environment, love solving real-world problems, and want to be part of a team that&rsquos shaping the future, this is your moment. Genpact (NYSE: G) is an advanced technology services and solutions company that delivers lasting value for leading enterprises globally. Through our deep business knowledge, operational excellence, and cutting-edge solutions - we help companies across industries get ahead and stay ahead. Powered by curiosity, courage, and innovation, our teams implement data, technology, and AI to create tomorrow, today. Get to know us at genpact.com and on LinkedIn, X, YouTube, and Facebook. Inviting applications for the role of Senior Principal Consultant- Senior Data Engineer - Databricks, Azure & Mosaic AI Role Summary: We are seeking a Senior Data Engineer with extensive expertise in Data & Analytics platform modernization using Databricks, Azure, and Mosaic AI. This role will focus on designing and optimizing cloud-based data architectures, leveraging AI-driven automation to enhance data pipelines, governance, and processing at scale. Key Responsibilities: . Architect & modernize Data & Analytics platforms using Databricks on Azure. . Design and optimize Lakehouse architectures integrating Azure Data Lake, Databricks Delta Lake, and Synapse Analytics. . Implement Mosaic AI for AI-driven automation, predictive analytics, and intelligent data engineering solutions. . 
Lead the migration of legacy data platforms to a modern cloud-native Data & AI ecosystem. . Develop high-performance ETL pipelines, integrating Databricks with Azure services such as Data Factory, Synapse, and Purview. . Utilize MLflow & Mosaic AI for AI-enhanced data processing and decision-making. . Establish data governance, security, lineage tracking, and metadata management across modern data platforms. . Work collaboratively with business leaders, data scientists, and engineers to drive innovation. . Stay at the forefront of emerging trends in AI-powered data engineering and modernization strategies. Qualifications we seek in you! Minimum Qualifications . experience in Data Engineering, Cloud Platforms, and AI-driven automation. . Expertise in Databricks (Apache Spark, Delta Lake, MLflow) and Azure (Data Lake, Synapse, ADF, Purview). . Strong experience with Mosaic AI for AI-powered data engineering and automation. . Advanced proficiency in SQL, Python, and Scala for big data processing. . Experience in modernizing Data & Analytics platforms, migrating from on-prem to cloud. . Knowledge of Data Lineage, Observability, and AI-driven Data Governance frameworks. . Familiarity with Vector Databases & Retrieval-Augmented Generation (RAG) architectures for AI-powered data analytics. . Strong leadership, problem-solving, and stakeholder management skills. Preferred Skills: . Experience with Knowledge Graphs (Neo4J, TigerGraph) for data structuring. . Exposure to Kubernetes, Terraform, and CI/CD for scalable cloud deployments. . Background in streaming technologies (Kafka, Spark Streaming, Kinesis). Why join Genpact . Be a transformation leader - Work at the cutting edge of AI, automation, and digital innovation . Make an impact - Drive change for global enterprises and solve business challenges that matter . Accelerate your career - Get hands-on experience, mentorship, and continuous learning opportunities . 
Work with the best - Join 140,000+ bold thinkers and problem-solvers who push boundaries every day. Thrive in a values-driven culture - Our courage, curiosity, and incisiveness - built on a foundation of integrity and inclusion - allow your ideas to fuel progress. Come join the tech shapers and growth makers at Genpact and take your career in the only direction that matters: Up. Let's build tomorrow together. Genpact is an Equal Opportunity Employer and considers applicants for all positions without regard to race, color, religion or belief, sex, age, national origin, citizenship status, marital status, military/veteran status, genetic information, sexual orientation, gender identity, physical or mental disability or any other characteristic protected by applicable laws. Genpact is committed to creating a dynamic work environment that values respect and integrity, customer focus, and innovation. Furthermore, please do note that Genpact does not charge fees to process job applications and applicants are not required to pay to participate in our hiring process in any other way. Examples of such scams include purchasing a 'starter kit,' paying to apply, or purchasing equipment or training.
Posted 1 week ago
9.0 - 14.0 years
20 - 35 Lacs
Chennai, Bengaluru
Work from Office
Dynatrace Specialist 9+ Years Location : Bangalore / Chennai Company : HCLTech Experience : 9 to 13 Years Employment Type : Full-Time | Permanent About the Role : HCLTech is seeking an experienced Dynatrace Specialist to join our IT Observability and AIOps team. The ideal candidate will be responsible for implementing, managing, and optimizing Dynatrace-based performance monitoring for enterprise applications. Key Responsibilities : Deploy, configure, and maintain Dynatrace for end-to-end observability. Create custom dashboards, alerts, and synthetic monitoring. Troubleshoot application and infrastructure performance issues using Dynatrace insights. Collaborate with development and DevOps teams to enhance performance tuning. Integrate Dynatrace with ITSM, CI/CD, and other APM tools. Required Skills : 9+ years of IT experience with minimum 3 years in Dynatrace (APM, DEM, RUM, Synthetic). Strong knowledge of application stacks (Java, .NET, Node.js, containers). Experience with Kubernetes, Docker, and cloud-native environments. Exposure to ServiceNow, AppDynamics, Splunk, or similar tools (preferred). Strong scripting and automation skills (Python, Shell, PowerShell preferred). Preferred Certification : Dynatrace Associate/Professional Certification (preferred)
Posted 1 week ago
3.0 - 5.0 years
0 Lacs
Bengaluru / Bangalore, Karnataka, India
On-site
The candidate is expected to write good-quality C/C++ and Java code and should be able to develop corresponding unit tests and automation. He/She must have hands-on experience with cloud-native technologies such as Docker, Kubernetes, monitoring, and observability. Additional skill sets include Perl and Python scripting, and DB and XML concepts. The candidate should be familiar with Agile methodology and the CI/CD process, and should have exposure to a messaging framework such as Kafka. The candidate should be able to understand requirements and deliver independently. Experience in the billing domain will be an added advantage. Career Level - IC2
Posted 2 weeks ago
3.0 - 5.0 years
15 - 18 Lacs
Pune
Work from Office
Experience: 3 to 5 years in cloud infrastructure operations, L1 incident management, automation support, and observability, with team coordination or mentoring experience. Location: Pune Shift: 24x7 Support (Rotational Shifts) Education: BE/B.Tech (Relevant certifications preferred: AWS Cloud Practitioner/Associate, Azure Fundamentals, CKA, Terraform Associate) Job Summary: We are seeking an L1 Lead – Site Reliability Engineer (SRE) to guide and manage the frontline SRE team in ensuring the stability, availability, and efficiency of enterprise-scale cloud infrastructure operations. This role involves supervising incident response, ensuring adherence to runbooks and SOPs, providing technical guidance to L1 engineers, and being the key escalation point for L1 issues. You will be responsible for monitoring cloud services, triaging alerts, validating remediation efforts, mentoring junior engineers, and collaborating with L2/L3 teams for escalations and root cause analysis. Responsibilities: Lead and mentor the L1 SRE team during shifts, ensuring timely response and proper handling of incidents, service requests, and alerts. Oversee infrastructure and application monitoring using tools such as Prometheus, Grafana, AWS CloudWatch, and Azure Monitor. Validate and guide remediation actions like pod restarts, disk space cleanup, scaling, and alert verification. Ensure SOPs, runbooks, and shift handover notes are followed and updated regularly. Execute and validate predefined Ansible playbooks, Terraform scripts, and CI/CD pipelines with junior team members. Act as the first point of escalation for unresolved L1 issues and coordinate with L2/L3 teams for resolution and RCA. Govern and track shift performance, including SLA compliance, FCR (First Call Resolution), and ticket hygiene. Coordinate patching, backup checks, standard changes, and validations in AWS/Azure environments.
Facilitate onboarding of new L1 engineers, and deliver knowledge-sharing and refresher training sessions. Support automation initiatives by identifying repetitive tasks and creating/reviewing simple scripts. Prepare weekly/monthly shift reports and participate in SRE governance and review calls with operations leadership. Monitor the health of Kubernetes clusters and guide the team in basic pod/node/service troubleshooting. Skills/Expertise: 3+ years of experience in cloud infrastructure operations with at least 1 year in a lead or mentoring role. Strong troubleshooting, coordination, documentation, and escalation management skills. Proven ability to lead shifts in a 24x7 support model. Familiarity with ITSM practices and SLA management (ServiceNow or similar). Proactive and structured communicator, capable of shift planning, reporting, and stakeholder updates. Technical Skills: Experience monitoring and operating cloud-based environments with basic troubleshooting for system and application-level issues. Familiarity with AWS services and concepts such as EC2, S3, IAM, and VPC, and with Azure DevOps services. Basic knowledge of container platforms such as Docker and Kubernetes (understanding pod/service basics, logs, etc.). Exposure to scripting using Shell, Bash, or Python for automation of routine tasks. Basic understanding of version control systems like Git, GitHub, or GitLab. Awareness of infrastructure-as-code and automation tools such as Ansible, Terraform, or CloudFormation (execution under guidance). Familiar with CI/CD concepts and tools like Jenkins or GitLab CI (executing builds, monitoring pipelines). Understanding of alerting and monitoring tools like Grafana, ELK, Site24x7, CloudWatch, and Prometheus. Hands-on with ITSM tools such as ServiceNow for incident and ticket tracking.
Posted 2 weeks ago
6.0 - 9.0 years
6 - 9 Lacs
Bengaluru / Bangalore, Karnataka, India
On-site
The Role: The LeadSquared platform and product suite is 100% on the cloud, currently all on AWS. The product suite comprises a large number of applications, services, and APIs built on various open-source and AWS-native tech stacks and deployed across multiple AWS accounts. The role involves leading the mission-critical responsibility of ensuring that all our online services are available, reliable, performant, and running at optimal costs. We firmly believe in a code and automation-driven approach to Site Reliability. Key Responsibilities: Build processes and platforms to ensure full observability and automated incident response management of all systems, applications, platforms, and infrastructure. Track incidents, perform RCA for every incident, and focus on prevention. Work closely with Engineering teams to improve the performance, reliability, and operability of various applications and services. Work with customers to address their concerns on infrastructure availability, performance, and security. Key Requirements: 6+ years of experience in building tools for observability and incident response management for AWS resources as well as custom applications; of this, 3+ years should be on AWS Cloud. 2+ years of experience in leading an SRE team. Deep understanding of observability of all major AWS services - EC2, RDS, Elasticsearch, Redis, SQS, API Gateway, Lambda, etc. Operational experience in deploying, operating, scaling, and troubleshooting large-scale production systems on the cloud. Strong interpersonal communication skills (including listening, speaking, and writing). Ability to create and work well in a diverse, team-focused environment with other DevOps and engineering teams. Ability to function well in a fast-paced, rapidly changing environment.
Posted 3 weeks ago
6.0 - 10.0 years
0 Lacs
Bengaluru / Bangalore, Karnataka, India
On-site
Oracle Cloud Infrastructure (OCI) is one of the fastest-growing cloud platforms, and we are assembling a world-class team to build the next generation of security products. We're seeking a Principal Software Engineer to drive the design and development of mission-critical systems that protect OCI customers at hyperscale. As a Principal Engineer in the Security Products Group, you will play a key leadership role in: Architecting and delivering complex, distributed systems with a focus on security, resiliency, and scalability. Driving strategic technical decisions and shaping the long-term vision for OCI's security offerings. Mentoring engineers, influencing cross-team engineering practices, and raising the technical bar across the organization. Leading design reviews, setting coding standards, and fostering a culture of operational excellence. What You'll Do: Lead design and development of major features and large-scale systems from concept to production. Set the direction for platform architecture and system design in areas such as identity, data protection, threat detection, and vulnerability management. Operate and improve high-scale services, driving initiatives to increase reliability, observability, and automation. Collaborate across teams and orgs to align architecture, resolve dependencies, and ensure delivery of high-impact security capabilities. What We're Looking For: Deep experience in building and operating distributed systems at scale. Proven ability to design and deliver complex features with cross-cutting impact. Hands-on experience with services operating across regions and subject to strict compliance and regulatory requirements. Strong coding skills and the ability to dive deep into technical details across the stack-from low-level systems internals to API design. A bias for simplicity, a passion for scale, and a pragmatic approach to problem-solving. 
Why Security at OCI? The OCI Security Products Group is on a mission to build the most secure cloud platform. We deliver a portfolio of cloud-native services that enable our customers to: Isolate workloads, encrypt data, and control access securely. Detect vulnerabilities and threats across applications, containers, and infrastructure. Remediate risks proactively, leveraging intelligence from CVEs, CIS benchmarks, and threat modeling. We are investing heavily in advanced security systems that detect, analyze, and block malicious activity in real time - empowering our customers to build and scale confidently on Oracle Cloud. Responsibilities: Lead the design and development of large-scale, mission-critical security services within OCI, ensuring they are reliable, scalable, and secure by default. Define technical strategy and architecture for key areas such as identity, access control, data protection, threat detection, and vulnerability management. Drive end-to-end delivery of complex features - from ideation and design through development, testing, deployment, and operational support. Mentor and guide engineers across multiple teams, fostering technical growth, improving code quality, and raising the bar for design and execution. Champion engineering excellence by setting high standards for design, code, observability, automation, and operational readiness. Collaborate across functional teams (security, platform, compliance, product management) to align on strategy, resolve architectural challenges, and accelerate delivery. Continuously improve system reliability and performance through proactive observability, incident response, chaos engineering, and root cause analysis. Evaluate and adopt new technologies and patterns to improve security posture, performance, and developer productivity. Contribute to the broader OCI engineering community through leadership in design reviews, architecture discussions, and cross-org initiatives. Career Level - IC4
Posted 3 weeks ago
10 - 13 years
18 - 25 Lacs
Bengaluru
Hybrid
Hiring: Lead Site Reliability Engineer with the following skills and expertise. What will this person do? Provide leadership in designing and implementing reliable, scalable, and secure infrastructure solutions. Develop and maintain observability solutions, ensuring visibility into system performance using native Azure Cloud solutions. Define and track SLIs, ensuring compliance with SLOs and SLAs. Lead incident response efforts, conduct root cause analysis, and implement preventive measures to minimize downtime. Automate infrastructure provisioning, configuration, and management using Terraform & Ansible. Build and maintain robust observability pipelines to support automated deployments and continuous monitoring practices. Continuously analyze system health and optimize performance by identifying and resolving bottlenecks. Work with our BCDR team to minimize business impact during failures and measure the quality of services. Work with the Cloud Governance team to monitor cloud infrastructure spending and implement cost-saving strategies. Implement centralized logging, metric collection, and distributed tracing for troubleshooting and debugging. Deploy, manage, and monitor containerized workloads. Maintain configuration consistency and compliance across cloud environments using tools like Ansible. Partner with software development teams to integrate reliability best practices into the application development lifecycle. Conduct detailed post-mortems, document learnings, and drive improvements to reduce future incidents. Develop automation scripts in Python, Bash, or other languages to reduce manual efforts and improve efficiency. Provide mentorship to junior engineers, fostering a culture of learning and continuous technical growth. Research and evaluate new technologies, tools, and methodologies to improve system reliability and efficiency. Maintain detailed documentation on infrastructure, monitoring setups, incident responses, and best practices.
Qualifications: Bachelor's degree in Computer Science, Engineering, or a related field. 10+ years in Observability, DevOps, and Site Reliability Engineering (SRE). At least 2 years of experience in defining Observability KPIs for both on-premises and cloud environments. Strong experience with cloud platforms (AWS, Azure, GCP) and cloud-native technologies. Passion for automation, reducing toil, and implementing reliability-focused best practices. Deep knowledge of services/tools like Grafana, PowerBI, Prometheus, Azure Monitor, Application Insights & Azure Metrics. Expertise in Terraform, Ansible, Chef, and CI/CD pipeline tools like GitHub Actions, Jenkins, and GitOps methodologies. Working understanding of load balancing, authentication (AAA), encryption, and network parameters monitoring. Strong troubleshooting skills and experience handling on-call incidents and post-mortem analysis. Ability to work cross-functionally, drive technical discussions, and mentor junior engineers. Ability to work in a dynamic team environment and possess time management skills to meet deadlines. Sense of ownership and pride in your performance and its impact on the company's success. Critical thinker with problem-solving skills. Good interpersonal and communication skills.
Posted 1 month ago
5 - 8 years
15 - 25 Lacs
Chennai, Bengaluru
Work from Office
We are looking for a Senior Platform Engineer (Airflow & Control-M) with 5-10 years of experience to join our team in Bangalore or Chennai. The ideal candidate will have strong expertise in Airflow, Control-M, Kubernetes, Observability (OpenTelemetry), Python, and Bash scripting. The role involves managing critical data workflows, enhancing platform automation, and ensuring system reliability and scalability. Excellent communication skills and hands-on experience in stabilizing production environments are essential.
Posted 1 month ago
8 - 12 years
16 - 27 Lacs
Kolkata
Work from Office
Role: Observability Engineer (AWS) Experience: 8+ years Essential Skills (two top skills): AWS Ecosystem – EKS, EC2, DynamoDB, Lambda, etc. Dynatrace (or similar). Monitoring Site, trend analysis, log analysis. Key Responsibilities: Design, implement, and maintain observability solutions using AWS and Dynatrace to monitor application performance and infrastructure health. Collaborate with development and operations teams to define observability requirements and ensure seamless integration of monitoring tools. Develop and manage dashboards, alerts, and reports to provide insights into system performance and user experience. Troubleshoot complex issues by analyzing logs, metrics, and traces to identify root causes and recommend solutions. Optimize existing monitoring frameworks to enhance visibility across cloud environments and applications. Stay updated on industry trends and best practices in observability, cloud technologies, and performance monitoring. Requirements: 8+ years of proven experience as an Observability Engineer or similar role with a strong focus on AWS services. Proficiency in using Dynatrace for application performance monitoring and observability. Strong understanding of cloud architecture, microservices, containers, and serverless computing. Experience with scripting languages (e.g., Python, Bash) for automation tasks. Excellent problem-solving skills with the ability to work under pressure in a fast-paced environment. Strong communication skills to effectively collaborate with cross-functional teams.
Posted 1 month ago
10 - 20 years
25 - 35 Lacs
Pune, Bengaluru, Delhi / NCR
Work from Office
Role & responsibilities: SRE Architect experienced in running large Reliability & Observability programs for large, complex infrastructure deployments / distributed systems for major banking customers. Proficiency in using the Application Performance Monitoring (APM) tools New Relic/Dynatrace for monitoring, logging, and tracing, and Splunk for log monitoring. Should have implemented solutions around Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for services. • Understanding of software delivery life cycles, particularly Agile/Lean & DevOps • Proven experience in handling large-scale and growing infrastructure across data centers and heterogeneous cloud platforms • Expert-level hands-on knowledge of cloud platforms like PCF. Preferred candidate profile: Understanding of software delivery life cycles, particularly Agile/Lean & DevOps. Proven experience in handling large-scale and growing infrastructure across data centers and heterogeneous cloud platforms.
Posted 1 month ago
8 - 13 years
30 - 45 Lacs
Bengaluru
Work from Office
Drive SRE implementation and DevOps best practices. Reduce technical debt, automate reliability workflows, and ensure performance, scalability, and observability across cloud-based digital platforms. Required candidate profile: Experienced SRE with deep knowledge of Azure cloud, CI/CD, observability, automation, and programming. Strong DevOps mindset, troubleshooting ability, and alignment with digital transformation goals.
Posted 1 month ago
7.0 - 12.0 years
12 - 22 Lacs
Pune
Work from Office
Experience: 7+ years Job Location: Pune Notice Period: 30 days Job Description: AWS Ecosystem: EKS, EC2, DynamoDB, Lambda, etc. Dynatrace (or similar): the Observability team should include some members with Dynatrace experience, while the rest can have experience with similar tools. Monitoring Site, trend analysis, log analysis. Key Responsibilities: Design, implement, and maintain observability solutions using AWS and Dynatrace to monitor application performance and infrastructure health. Collaborate with development and operations teams to define observability requirements and ensure seamless integration of monitoring tools. Develop and manage dashboards, alerts, and reports to provide insights into system performance and user experience. Troubleshoot complex issues by analyzing logs, metrics, and traces to identify root causes and recommend solutions. Optimize existing monitoring frameworks to enhance visibility across cloud environments and applications. Stay updated on industry trends and best practices in observability, cloud technologies, and performance monitoring. Requirements: 7+ years of proven experience as an Observability Engineer or similar role with a strong focus on AWS services. Proficiency in using Dynatrace for application performance monitoring and observability. Strong understanding of cloud architecture, microservices, containers, and serverless computing. Experience with scripting languages (e.g., Python, Bash) for automation tasks. Excellent problem-solving skills with the ability to work under pressure in a fast-paced environment. Strong communication skills to effectively collaborate with cross-functional teams.
Posted 1 month ago
8 - 13 years
17 - 32 Lacs
Bengaluru, Hyderabad, Gurgaon
Work from Office
We have an immediate opening for a Splunk SME. Mandatory Skills: ITSI, Observability, and complex use-case implementation (L3 and above resources). Job Title: Splunk Location: Bangalore, Hyderabad, Gurgaon, Chennai - Hybrid model. Job Summary: The Splunk Architect will be responsible for designing, implementing, and maintaining Splunk infrastructure and solutions. This role requires a deep understanding of Splunk architecture, data ingestion, and visualization techniques. The ideal candidate will have a strong background in IT operations, security, and data analytics. Key Responsibilities: Design and implement Splunk infrastructure, including indexers, search heads, and forwarders. Develop and maintain Splunk dashboards, reports, and alerts to meet business requirements. Integrate Splunk with various data sources and third-party tools. Optimize Splunk performance and troubleshoot issues. Collaborate with cross-functional teams to understand data requirements and provide Splunk solutions. Ensure data integrity, security, and compliance within the Splunk environment. Provide technical guidance and mentorship to junior team members. Stay updated with the latest Splunk features and best practices. Qualifications: Bachelor's degree in Computer Science, Information Technology, or a related field. 10+ years of total work experience, with 5+ years of experience working with Splunk, including architecture and administration. Strong knowledge of Splunk Enterprise Security (ES) and IT Service Intelligence (ITSI). Experience with scripting languages such as Python, Bash, or PowerShell. Familiarity with cloud platforms (AWS, Azure, GCP) and their integration with Splunk. Excellent problem-solving skills and attention to detail. Strong communication and interpersonal skills. Preferred Qualifications: Splunk Certified Architect or Splunk Certified Consultant. Experience with other SIEM tools and data analytics platforms. Knowledge of ITIL and other IT service management frameworks.
Posted 2 months ago
8 - 12 years
15 - 30 Lacs
Chennai, Bengaluru, Kolkata
Work from Office
About Client: Hiring for one of the most prestigious multinational corporations! Job Description Job Title: Observability Lead Qualification: BE / B.Tech Relevant Experience: 8 to 12 years Must-Have Skills: Good knowledge of various observability tools including AppDynamics, BSM, Evolven, Splunk, Aternity, Grafana, and Prometheus. Customer interaction and escalation management. Good problem-solving and communication skills. Willingness to upskill in related technology as per project need. Excellent presentation skills. Coordination with various monitoring teams. Resource management. Good-to-Have Skills: Good knowledge of the latest observability tools like OpenTelemetry. Containerized platforms like OpenShift and Kubernetes. Public cloud like AWS. Roles and Responsibilities: 1. Follow ITIL best practices and work on various ITIL processes such as Incident Management, Change Management, and ServiceNow requests for tracking SLA. 2. Onshore and offshore coordination. 3. Technically lead and groom a team of support executives who are geographically distributed. 4. Implement intelligent automation tools to reduce manual effort. 5. Understand the overall technical architecture and implement improvements. 6. Must be willing to work in shifts (no night shift) and weekends (compensatory off will be provided). Location: Chennai/Kolkata/Hyderabad/Bangalore CTC Range: As per market standards Notice period: Immediate to 90 days Shift Timing: General Shift Mode of Interview: Virtual Mode of Work: Work from office Bhuvaneshwari, Senior Analyst, Black and White outsourcing Pvt Ltd, Bangalore, Karnataka, India. bhuvaneshwari@blackwhite.in | www.blackwhite.in
Posted 2 months ago
10 - 19 years
0 - 3 Lacs
Bengaluru
Hybrid
Dear Candidate, Greetings from Encora Innovation Labs! We are hiring for Hybrid Cloud Delivery Solution Architect: IaC & DevSecOps, Monitoring, and Observability. Location: Bengaluru/Pune Job Type: Full-Time Job Description: Bachelor's degree in Computer Science, Engineering, or a related field (master's degree preferred). 10+ years of experience in solution architecture with a focus on cloud technologies and hybrid environments. Proven experience in Infrastructure as Code (IaC), using tools like Python, Ansible, Golang, Terraform, and CloudFormation. Strong understanding of DevSecOps principles, including security automation, vulnerability management, and CI/CD pipelines. Hands-on experience with monitoring and observability tools, including but not limited to Prometheus, Grafana, ELK Stack, and OpsRamp. Deep knowledge of Kubernetes and container orchestration, with experience deploying and managing production workloads. Strong understanding of networking, security, and identity management within cloud environments. Familiarity with compliance frameworks (e.g., ISO, SOC 2, GDPR) and the ability to design solutions that meet regulatory requirements. Excellent communication, collaboration, and leadership skills to engage technical and non-technical stakeholders. Preferred Skills: Experience with multi-cloud architectures and strategies. Knowledge of serverless architectures and microservices design. Experience with SRE (Site Reliability Engineering) principles and practices. If the above requirements match your profile, kindly reply with your updated resume to yeggidi.sushma@encora.com
Posted 2 months ago
7 - 12 years
20 - 30 Lacs
Bengaluru
Work from Office
Job Description: Role Title: Lead Site Reliability Engineer Position Description: Historically, the role of IT has been to provide a reliable ecosystem to run the business, drive efficiencies, and reduce costs. These areas remain integral; however, driven by the quickening pace of innovation, IT must evolve, proactively partnering with the business to enable new digital business models that power new types of customer engagement. At Elanco, our engineer roles bring an adaptive set of skills covering Software-as-a-Service (SaaS), Commercial-off-the-Shelf (COTS), and/or custom-developed applications. The role is part of our software engineering team established to deliver engineering expertise to business-facing products and services. As an Engineer you will be deployed into a multi-disciplined product team, applying your software engineering talent to Elanco's biggest opportunities. To be successful in an engineering role at Elanco requires a highly motivated individual with an innovative mindset and a willingness to drive tangible outcomes. The individual must be able to articulate complex technical topics and collaborate with the internal engineering organisation to improve engineering across the enterprise. The Role: We are seeking a skilled and motivated engineer, passionate about improving application reliability across our enterprise. As part of our Platform Engineering organization, you will join a product team focused on a suite of capabilities designed to enhance all aspects of our engineering portfolio. In this role, you will be primarily accountable for configuring and operating our observability toolset. You will also lead the charge across the enterprise, driving the transition from reactive to proactive application support. This is a fantastic opportunity to join a growing engineering team with the scope to partner across our entire enterprise of products.
Your contributions will help ensure that everything we deliver to our customers comes with top-notch reliability as standard.

Typical responsibilities:
Help define Elanco's approach to application reliability, partnering with the product manager for our portfolio health products.
Collaborate with stakeholders, such as product and platform owners, to define service-level objectives (SLOs) and service-level indicators (SLIs) for system operations, focused on the critical features of the customer's journey and experience.
Assist and coach product teams in implementing telemetry against SLIs/SLOs to ensure adequate traceability is in place.
Track and manage reliability performance against agreed SLOs, in partnership with product teams or other stakeholders, and ensure systems continue to meet SLOs over time.
Ensure key stakeholders, product owners, and platform owners are informed of reliability concerns and their potential impact on the customer's experience.
Provide expert knowledge on reliability approaches to ensure our organization achieves its reliability goals and roadmap.
Champion reliability being treated as a feature in products and platforms, and promote the concept across all phases of the software development life cycle.
Create dashboards and reports to communicate key metrics to product teams and key stakeholders.
Beyond observability, engage in initiatives across the product line, including cost, security, and adoption, helping the team drive toward a healthy portfolio throughout an application's lifecycle.
Participate in problem management activities, including post-mortem incident analysis and the provision of technical insight, documented findings, outcomes, and recommendations as part of a root cause analysis to troubleshoot priority incidents.
Implement automation to reduce the probability and/or impact of problems recurring, and target self-healing through automation of recurring incidents.
For critical applications, utilize practices such as chaos engineering and performance engineering to test in preproduction environments. This includes disaster recovery (DR) testing, performance testing, and tabletop planning exercises.
Participate and exert influence in organizational learning initiatives, such as communities of practice, to share knowledge and foster a continuous learning and improvement mindset.
Support architects working on new solutions, including analyzing requirements, supporting technical architecture activities, prototyping, designing and developing reusable infrastructure artifacts, testing, implementing, and preparing for ongoing support.
Train and mentor junior engineers to ensure SRE best practices evolve and scale successfully in the organization.
Partner with the product manager of portfolio health to build out golden paths, education, and services to package the capability in a consumable way on our developer portal.
Be a product team champion, extending into product teams and helping to deliver foundational platform engineering capabilities where applicable.
Partner with compliance teams to ensure the data we bring into observability platforms meets privacy and compliance standards.
Maintain consistent standards and set out a taxonomy of telemetry to enable future opportunities, including leveraging AI capability.

Basic Qualifications:
Experience in some of the following areas is essential.
10-15 years of hands-on engineering experience.
5 years of experience in Platform Engineering, SRE, or a similar role.
5-10 years of experience working with modern application architecture methodologies (Service-Oriented Architecture, API-Centric Design, Twelve-Factor App, FAIR, etc.).
5+ years of experience working with Cloud Native design patterns, with a preference towards Microsoft Azure / Google Cloud.
5+ years of experience designing and delivering digital solutions following a product mindset and a variety of delivery methodologies (e.g., Agile, CCPM, etc.).
5+ years of experience working within a “DevSecOps” culture, including modern software development practices covering Continuous Integration and Continuous Delivery (CI/CD), Test-Driven Development (TDD), etc.
Experience with enterprise observability platforms (e.g., Datadog, New Relic).
Experience with monitoring third-party and SaaS applications.
Experience establishing standards around MELT (Metrics, Events, Logging, and Tracing) and implementing them at an enterprise level.
Experience with OpenTelemetry is advantageous.
Experience supporting digital platforms, including Integrations, Release Management, Regression Testing, Data Obfuscation, etc.
Experience scaling an “API-Ecosystem” and designing and implementing “API-First” integration patterns.
Experience working with authentication and authorisation protocols/patterns.
Experience defining and implementing large-scale, transformative digital solutions.
Demonstrated influence and communication skills across all levels of IT and third parties.
Experience working in complex, diverse landscapes (business, technology, regulatory, partners, providers, geographies, etc.).
Strong organizational and communication skills, with multiple examples of conveying complex technical topics in a way that resulted in a definitive direction.

Education Requirements:
Bachelor’s degree in information technology.
Posted 3 months ago