
1633 Grafana Jobs - Page 22

Set up a job alert
JobPe aggregates job listings for easy access, but you apply directly on the employer's job portal.

15.0 - 20.0 years

9 - 13 Lacs

Mumbai

Work from Office

Project Role: Software Development Lead
Project Role Description: Develop and configure software systems, either end-to-end or for a specific stage of the product lifecycle. Apply knowledge of technologies, applications, methodologies, processes, and tools to support a client, project, or entity.
Must-have skills: DevOps, Docker, Kubernetes, Microsoft SQL Server
Good-to-have skills: NA
Minimum 5 year(s) of experience is required
Educational Qualification: 15 years of full-time education

Summary: As an Application Lead, you will lead the effort to design, build, and configure applications, acting as the primary point of contact. You will oversee the development process and ensure successful project delivery.

Roles & Responsibilities:
- Act as a subject matter expert (SME)
- Collaborate with and manage the team to perform
- Be responsible for team decisions
- Handle application deployment and level 1 and 2 support during SIT, UAT, and production implementation for the bank's core banking system (covering Asia, Europe, the Middle East, and Africa)
- Follow UAT issues and enquiries from business users through to appropriate closure (incident communication, root cause analysis, and permanent resolution)
- Engage with development teams on a regular basis, and escalate critical queries appropriately
- Engage with multiple teams and contribute to key decisions
- Provide solutions to problems for the immediate team and across multiple teams
- Lead the application development process
- Coordinate with stakeholders to gather requirements
- Ensure timely delivery of projects

Must-Have Skills:
- Proficiency in Java, SQL, UNIX, shell scripting, and Control-M
- Hands-on experience with Jetty, DevOps, Docker, Microsoft SQL Server, and Kubernetes
- Hands-on experience with Oracle 12/19 RDBMS and PL/SQL
- Hands-on experience with Docker EE, Kubernetes, and Argo CD
- Exposure to DevOps tools like Jenkins, Master Deploy, etc.
- Exposure to Elasticsearch (Kibana) and Grafana
- Exposure to ETL tools is an advantage
- Exposure to SRE practices
- Good knowledge of all phases of the system development and implementation life cycle (SDLC)
- Strong understanding of CI/CD pipelines
- Experience with cloud platforms such as AWS or Azure
- Knowledge of infrastructure-as-code tools like Terraform

Additional Information:
- The candidate should have a minimum of 8 years of experience in DevOps
- This position is based at our Mumbai office
- 15 years of full-time education is required

Posted 2 weeks ago

Apply

12.0 - 15.0 years

5 - 5 Lacs

Thiruvananthapuram

Work from Office

Role Proficiency: Resolve complex trouble tickets spanning different technologies and fine-tune infrastructure for optimum performance, and/or provide technical, people, and financial management (hierarchical or lateral).

Outcomes:
1) Mentor new team members in understanding customer infrastructure and processes
2) Review and approve RCAs prepared by the team and drive corrective and preventive actions for permanent resolution
3) Review problem tickets for timely closure; measure incident reduction achieved by problem records for showcasing during governance reviews
4) Provide technical leadership for change implementation
5) Review CSI initiatives and aggressively drive time-bound closure to achieve optimization and efficiency goals
6) Drive delivery finance goals to achieve project forecast numbers
7) Work on proposals and PIPs for identified greenfield opportunities to increase revenue

Measures of Outcomes:
1) SLA adherence
2) Time-bound resolution of elevated tickets (OLA)
3) Management of ticket backlog timelines (OLA)
4) Adherence to defined processes (number of NCs in internal/external audits)
5) Number of KB articles created
6) Number of incident and change tickets handled
7) Number of elevated tickets resolved
8) Number of successful change tickets
9) % completion of all mandatory training requirements
10) Overall financial goals of the project
11) % of incident reduction through problem management
12) Number of repeated escalations for the same technical issue

Outputs Expected:
- Resolution/Response: Daily review of resolution and response SLAs for early intervention in SLA management.
- Troubleshooting: Troubleshoot based on available information from previous tickets or by consulting with seniors. Participate in online knowledge forums for reference and convert new steps into KB articles. Perform logical/analytical troubleshooting and work on problem tickets to identify permanent solutions. Assist and lead technical teams, roping in technology experts for complex issues.
- Escalation/Elevation: Escalate within the organization or to customer peers in case of resolution delay. Define OLAs between delivery layers (L1, L2, L3, etc.) and enforce adherence. Act as SPOC for any customer and leadership escalations.
- Ticket Backlog/Resolution: Follow up on tickets based on agreed timelines and manage ticket backlogs/last activity as per the defined process. Resolve incidents and SRs within agreed timelines. Execute change tickets for infrastructure.
- Runbook/KB: Review KB compliance and suggest changes. Initiate and drive periodic SOP reviews with customer stakeholders.
- Collaboration: Collaborate with different delivery towers for ticket resolution within SLA. Resolve L1 tickets with help from the respective tower. Collaborate with other team members for timely resolution of tickets. Actively participate in team/organization-wide initiatives. Coordinate with UST ISMS teams to resolve connectivity-related issues.
- Stakeholder Management: Lead customer and vendor calls. Organize meetings with different stakeholders. Take ownership of the function's internal communications and related change management.
- Strategic: Define the strategy for data management, policy management, and data retention management. Support definition of the IT strategy for the function's relevant scope and be accountable for ensuring the strategy is tracked, benchmarked, and updated for the area owned.
- Process Adherence: Maintain a thorough understanding of organization- and customer-defined processes. Suggest process improvements and CSI ideas. Adhere to the organization's policies and business conduct.
- Process/Efficiency Improvement: Proactively identify opportunities to increase service levels and mitigate issues in service delivery within the function or across functions. Take accountability for overall productivity efforts within the function, including coordination of function-specific tasks and close collaboration with Finance.
- Process Implementation: Coordinate and monitor IT process implementation within the function.
- Compliance: Support information governance activities and audit preparations within the function. Act as the function SPOC for IT audits at local sites (including preparation, interfacing with the local organization, and mitigation of findings) and work closely with ISRM (Information Security Risk Management). Coordinate overall objective setting, preparation, and facilitation to achieve consistent objective setting in the function. Support coordination for CSI across all services in CIS and beyond.
- Training: Complete all mandatory organization and customer training requirements on time. Provide on-floor training and one-on-one mentorship for new joiners. Complete certification for the respective career path. Explore cross-training possibilities for improved efficiency and career growth.
- Performance Management: Update FAST goals in NorthStar; track, report, and seek continuous feedback from peers and managers. Set goals for team members and mentees and provide feedback. Assist new team members in understanding the customer environment, day-to-day operations, and people management (for example, rosters, transport, and leaves). Prepare weekly/monthly/quarterly governance review slides. Drive the finance goals of the account.

Skill Examples:
1) Good communication skills (written, verbal, and email etiquette) to interact with different teams and customers
2) Modify/create runbooks based on suggested changes from juniors or newly identified steps
3) Ability to work elevated server tickets to resolution
4) Networking:
   a. Troubleshooting skills in static and dynamic routing protocols
   b. Capable of running NetFlow analyzers across different product lines
5) Server:
   a. Skills in installing and configuring Active Directory, DNS, DHCP, DFS, IIS, and patch management
   b. Excellent troubleshooting skills in technologies like AD replication, DNS issues, etc.
   c. Skills in managing high-availability solutions like failover clustering, VMware clustering, etc.
6) Storage and Backup:
   a. Ability to give recommendations to customers; perform storage and backup enhancements; perform change management
   b. Skilled in core fabric technology, storage design, and implementation; hands-on experience with backup and storage command-line interfaces
   c. Perform hardware upgrades, firmware upgrades, vulnerability remediation, storage and backup commissioning and decommissioning, and replication setup and management
   d. Skilled in server, network, and virtualization technologies; integration of virtualization, storage, and backup technologies
   e. Review technical and architecture diagrams and modify SOPs and documentation based on business requirements
   f. Ability to perform the ITSM functions for the storage and backup team; review the quality of the ITSM process followed by the team
7) Cloud:
   a. Skilled in any one of the cloud technologies: AWS, Azure, GCP
8) Tools:
   a. Skilled in administration and configuration of monitoring tools like CA UIM, SCOM, SolarWinds, Nagios, ServiceNow, etc.
   b. Skilled in SQL scripting
   c. Skilled in building custom reports on availability and performance of IT infrastructure based on customer requirements
9) Monitoring:
   a. Skills in monitoring infrastructure and application components
10) Database:
   a. Data modeling and database design; database schema creation and management
   b. Identification of data integrity violations so that only accurate and appropriate data is entered and maintained
   c. Backup and recovery
   d. Web-specific technology expertise for e-Biz, Cloud, etc. (examples include XML, CGI, Java, Ruby, firewalls, SSL, and so on)
   e. Migrating database instances to new hardware and new software versions, from on-premise to cloud-based databases and vice versa
11) Quality Analysis:
   a. Ability to drive service excellence and continuous improvement within the framework defined by IT Operations

Knowledge Examples:
1) Good understanding of customer infrastructure and related CIs
2) ITIL Foundation certification
3) Thorough hardware knowledge
4) Basic understanding of capacity planning
5) Basic understanding of storage and backup
6) Networking:
   a. Hands-on experience with routers, switches, and firewalls
   b. Minimum knowledge of and hands-on experience with BGP
   c. Good understanding of load balancers and WAN optimizers
   d. Advanced backup and restore knowledge in backup tools
7) Server:
   a. Basic to intermediate PowerShell/Bash/Python scripting knowledge and demonstrated experience with script-based tasks
   b. Knowledge of AD group policy management, group policy tools, and troubleshooting GPOs
   c. Basic AD object creation, DNS concepts, DHCP, DFS
   d. Knowledge of tools like SCCM and SCOM administration
8) Storage and Backup:
   a. Subject matter expert in any of the storage and backup technologies
9) Tools:
   a. Proficient in understanding and troubleshooting the Windows and Linux families of operating systems
10) Monitoring:
   a. Strong knowledge of ITIL processes and functions
11) Database:
   a. Knowledge of general database management
   b. Knowledge of OS, system, and networking skills

Additional Comments: We are hiring an Observability Engineer to architect, implement, and maintain enterprise-grade monitoring solutions. You will enhance visibility into system performance, application health, and security events.

Key Responsibilities:
- Build and manage observability frameworks using LogicMonitor, ServiceNow, BigPanda, and NiFi
- Implement log analytics via Azure Log Analytics and Azure Sentinel
- Design observability dashboards using KQL, Splunk, and the Grafana suite (Alloy, Beyla, k6, Loki, Thanos, Tempo)
- Manage infrastructure observability for AKS deployments
- Automate observability workflows using GitHub, PowerShell, and API Management
- Collaborate with DevOps and platform teams to build end-to-end visibility

Core Skills:
- LogicMonitor, ServiceNow, BigPanda, NiFi
- Azure Log Analytics, Azure Sentinel, KQL
- Grafana suite (Alloy, Beyla, etc.), Splunk
- AKS, data pipelines, GitHub
- PowerShell, API Management

Preferred Skills:
- Working knowledge of Cribl
- Exposure to distributed tracing and advanced metrics

Soft Skills & Expectations:
- Cross-team collaboration and a problem-solving mindset
- Self-starter and fast learner of emerging observability tools

Required Skills: BigPanda, Azure Automation, Azure

Posted 2 weeks ago

Apply

21.0 - 31.0 years

35 - 42 Lacs

Bengaluru

Work from Office

What we're looking for: As a member of the Infrastructure team at SurveyMonkey, you will have a direct impact on designing, engineering, and maintaining our cloud, messaging, and observability platform: solutioning with best practices, deployment processes, and architecture, and supporting the ongoing operation of our multi-tenant AWS environments. This role presents a prime opportunity for building world-class infrastructure, solving complex problems at scale, learning new technologies, and mentoring other engineers.

What you'll be working on:
- Architect, build, and operate AWS environments at scale following well-established industry best practices
- Automate infrastructure provisioning, DevOps, and continuous integration/delivery
- Support and maintain AWS services such as EKS, as well as Heroku
- Write libraries and APIs that provide a simple, unified interface for other developers using our monitoring, logging, and event-processing systems
- Support and partner with other teams on improving our observability systems to monitor site stability and performance
- Work closely with developers to support new features and services
- Work in a highly collaborative team environment
- Participate in the on-call rotation

We'd love to hear from people with:
- 8+ years of relevant professional experience with cloud platforms such as AWS and Heroku
- Extensive experience with Terraform, Docker, Kubernetes, scripting (Bash/Python/YAML), and Helm
- Experience with Splunk, OpenTelemetry, CloudWatch, or tools like New Relic, Datadog, Grafana/Prometheus, or ELK (Elasticsearch/Logstash/Kibana)
- Experience with metrics and logging libraries and aggregators, and data analysis and visualization tools, specifically Splunk and OTel
- Experience instrumenting PHP, Python, Java, and Node.js applications to send metrics, traces, and logs to third-party observability tooling
- Experience with GitOps and tools like Argo CD/Flux CD
- Interest in instrumentation and optimization of Kubernetes clusters
- Ability to listen and partner to understand requirements, troubleshoot problems, and promote the adoption of platforms
- Experience with GitHub/GitHub Actions/Jenkins/GitLab in either a software engineering or DevOps environment
- Familiarity with databases and caching technologies, including PostgreSQL, MongoDB, Elasticsearch, Memcached, Redis, Kafka, and Debezium
- Preferably, experience with secrets management (for example, HashiCorp Vault)
- Preferably, experience in an agile environment and with JIRA

SurveyMonkey believes in-person collaboration is valuable for building relationships, fostering community, and enhancing our speed and execution in problem-solving and decision-making. As such, this opportunity is hybrid and requires you to work from the SurveyMonkey office in Bengaluru 3 days per week. #LI-Hybrid
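The "simple, unified interface" duty in this posting can be pictured with a small sketch. This is a hedged, minimal illustration in Python, assuming a hypothetical `Telemetry` facade over in-memory counters and the standard `logging` module; none of these names come from SurveyMonkey's actual libraries, and a real implementation would forward to a backend such as StatsD or Prometheus:

```python
import logging
import time
from collections import defaultdict


class Telemetry:
    """Tiny facade: one object for counters, timings, and logs,
    so application code never talks to backends directly."""

    def __init__(self, service: str):
        self.service = service
        self.counters = defaultdict(float)  # in-memory stand-in for a metrics backend
        self.log = logging.getLogger(service)

    def incr(self, metric: str, value: float = 1.0) -> None:
        # Accumulate under a namespaced key, e.g. "checkout.orders.created"
        self.counters[f"{self.service}.{metric}"] += value

    def timed(self, metric: str):
        # Context manager that records elapsed seconds under `metric`
        telemetry = self

        class _Timer:
            def __enter__(self):
                self.start = time.monotonic()
                return self

            def __exit__(self, *exc):
                telemetry.incr(metric, time.monotonic() - self.start)
                return False

        return _Timer()


telemetry = Telemetry("checkout")
telemetry.incr("orders.created")
telemetry.incr("orders.created")
with telemetry.timed("orders.latency_s"):
    pass  # work being timed
print(telemetry.counters["checkout.orders.created"])  # → 2.0
```

The design point is that callers depend only on `incr` and `timed`, so the team owning the platform can swap the metrics backend without touching application code.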

Posted 2 weeks ago

Apply

2.0 - 3.0 years

10 - 15 Lacs

Bengaluru

Work from Office

Site Reliability Engineer (SRE)
Experience: 2-3 years
Salary: 10-15 LPA
Preferred Notice Period: Within 30 days
Opportunity Type: Office (Bengaluru)
Placement Type: Permanent
(*Note: This is a requirement for one of Uplers' clients.)
Must-have skills: Azure OR Kubernetes OR Docker OR Terraform; Prometheus OR Grafana OR ELK Stack OR Elasticsearch OR Logstash OR Kibana; Shell Scripting OR PowerShell

About TatvaCare: TatvaCare (one of Uplers' clients) is transforming care practices to deliver positive health outcomes. A startup in the Indian health tech landscape, TatvaCare is catalyzing the transformation of care practices through digitisation. Our product portfolio includes TatvaPractice, an advanced EMR and knowledge platform for healthcare professionals, and MyTatva, a digital therapeutics application designed to manage chronic diseases like fatty liver, COPD, and asthma. Through these initial solutions and more to come, we aim to bridge the gap in healthcare, connecting professionals and patients. We are committed to revolutionizing healthcare in India, promoting efficient, patient-centric care, and optimizing outcomes across the healthcare spectrum.
- MyTatva: a DTx app that aids adherence to doctor-recommended lifestyle changes.
- TatvaPractice: an ABDM-certified EMR platform to enhance a doctor's practice.
Our vision is not just about digitizing records; it's about fostering a healthcare ecosystem where efficiency and empathy converge, ultimately leading to a health continuum.

Job Description: Are you a Site Reliability Engineer (SRE) looking to make a real impact in the healthcare space? TatvaCare, a technology-driven healthcare platform focused on delivering personalised and accessible care, is looking for an SRE with 2-3 years of experience to help scale and optimise its cloud infrastructure. If you're passionate about reliability, automation, and supporting meaningful innovation in digital health, this opportunity is for you.

What You'll Do:
- Manage and monitor infrastructure on Azure Cloud
- Work with Kubernetes and Docker for application deployment
- Use Terraform to automate infrastructure
- Set up and manage OpenVPN for secure access
- Monitor performance and reliability using observability tools like Grafana, ELK, or Azure Monitor
- Support developers with deployment and system reliability
- Explore integration of OpenAI APIs to improve operations

Skills Required:
- Experience with Azure, Kubernetes, Docker, and Terraform
- Knowledge of OpenVPN setup and usage
- Hands-on experience with monitoring tools (Grafana, Prometheus, ELK, etc.)
- Scripting in Bash, PowerShell, or Python
- Basic understanding of OpenAI or interest in AI tools

How to apply for this opportunity (easy 3-step process):
1. Click on Apply and register or log in to our portal
2. Upload your updated resume and complete the screening form
3. Increase your chances of getting shortlisted and meet the client for the interview!

About Uplers: Our goal is to make hiring and getting hired reliable, simple, and fast. Our role is to help our talents find and apply for relevant product and engineering job opportunities and progress in their careers. (Note: There are many more opportunities on the portal besides this one.) So, if you are ready for a new challenge, a great work environment, and an opportunity to take your career to the next level, don't hesitate to apply today. We are waiting for you!

Posted 2 weeks ago

Apply

2.0 - 6.0 years

12 - 15 Lacs

Navi Mumbai

Work from Office

Key Responsibilities:
- Manage and administer complex multi-cloud environments (AWS, GCP, Azure)
- Monitor infrastructure performance and troubleshoot complex issues at the L3 level
- Design and implement automation scripts using tools like Terraform, Ansible, CloudFormation, or Bicep
- Optimize cloud cost, performance, and security in line with best practices
- Support deployment and maintenance of production, development, and staging environments
- Collaborate with DevOps, Networking, and Security teams to ensure seamless operations
- Implement and monitor backup, disaster recovery, and high-availability strategies
- Conduct root cause analysis (RCA) for critical incidents and service disruptions
- Stay updated with evolving cloud technologies and recommend improvements
- Participate in a 24x7 on-call rotation for critical incident support

Technical Skills:
- Cloud Platforms: deep expertise in AWS, GCP, and Azure services (EC2, VPC, IAM, S3, AKS, GKE, App Services, etc.)
- Infrastructure as Code (IaC): hands-on with Terraform, Ansible, ARM Templates, CloudFormation
- Scripting: strong in PowerShell, Python, or Bash scripting
- CI/CD Pipelines: experience with Jenkins, GitHub Actions, Azure DevOps, or similar tools
- Monitoring Tools: proficient in tools like CloudWatch, Azure Monitor, Stackdriver, Prometheus, Grafana
- Security & Governance: knowledge of IAM, RBAC, security groups, policies, and compliance
- Containers & Orchestration: familiarity with Kubernetes, Docker, AKS, GKE, EKS

Certifications (Preferred):
- AWS Certified SysOps Administrator / Solutions Architect (Associate or Professional)
- Microsoft Certified: Azure Administrator Associate or Architect Expert
- Google Associate Cloud Engineer or Professional Cloud Architect

Posted 2 weeks ago

Apply

5.0 - 7.0 years

12 - 14 Lacs

Chennai

Work from Office

Critical Skills to Possess:
- Deep understanding of server/network infrastructure and cloud platforms (AWS/Azure/GCP)
- Strong troubleshooting across app tiers: web servers (Apache, Nginx, IIS), app servers (Tomcat, JBoss, WebLogic), and databases (SQL Server, Oracle, MySQL)
- Experience working with ITSM tools like ServiceNow, BMC Remedy, and Ivanti Service Management
- Familiarity with CI/CD and DevOps tools is a plus (Jenkins, Git, Docker)
- Sound knowledge of ITIL processes (Incident, Problem, and Change Management)
- 24x7 role

Preferred Qualifications:
- BS degree in Computer Science or Engineering, or equivalent experience

Roles and Responsibilities:
- Own high-severity and escalated incidents across network, server, application, and middleware layers
- Lead root cause analysis (RCA) efforts for recurring issues using logs, dashboards, and correlation tools (e.g., Splunk, SolarWinds)
- Troubleshoot application-level issues (e.g., API failures, integration errors, service restarts) in coordination with App Support teams
- Perform log analysis and monitor application health metrics using tools like SolarWinds, Kibana, or Grafana
- Automate alert correlation, ticket enrichment, and remediation using scripting (Python, Bash, or PowerShell)
- Coach and mentor L1 staff, and review SOPs/runbooks for operational excellence
- Ensure accurate and timely documentation of incidents, resolutions, and changes in the ITSM system
- Participate in change control, patch coordination, and system upgrades
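The "automate alert correlation and ticket enrichment" responsibility mentioned in this posting is the kind of task that often starts as a small script. A minimal sketch in Python, assuming hypothetical alert dictionaries; the field names, runbook IDs, and the correlate-by-host-within-a-time-window rule are illustrative, not tied to any specific ITSM product:

```python
def correlate(alerts, window_s=300):
    """Group alerts that share a host and fire within window_s seconds
    of the previous alert on that host, so one ticket covers one incident."""
    groups = []
    last_by_host = {}
    for alert in sorted(alerts, key=lambda a: a["ts"]):
        prev = last_by_host.get(alert["host"])
        if prev is not None and alert["ts"] - prev[-1]["ts"] <= window_s:
            prev.append(alert)          # same incident, extend the group
        else:
            group = [alert]             # new incident for this host
            groups.append(group)
            last_by_host[alert["host"]] = group
    return groups


RUNBOOKS = {"disk_full": "KB-1001", "high_cpu": "KB-1002"}  # illustrative mapping


def enrich(group):
    """Turn a correlated group into a ticket payload with a runbook link."""
    first = group[0]
    return {
        "summary": f"{first['check']} on {first['host']} ({len(group)} alerts)",
        "priority": "P1" if any(a["sev"] == "critical" for a in group) else "P3",
        "runbook": RUNBOOKS.get(first["check"], "KB-UNKNOWN"),
    }


alerts = [
    {"host": "web-1", "check": "high_cpu", "sev": "warning", "ts": 100},
    {"host": "web-1", "check": "high_cpu", "sev": "critical", "ts": 160},
    {"host": "db-1", "check": "disk_full", "sev": "critical", "ts": 120},
]
tickets = [enrich(g) for g in correlate(alerts)]  # two tickets, not three
```

In practice the enriched payload would be posted to the ITSM tool's API (ServiceNow, Remedy, etc.); the deduplication step is what keeps a burst of related alerts from opening one ticket each.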

Posted 2 weeks ago

Apply

10.0 - 15.0 years

20 - 25 Lacs

Chennai

Work from Office

Job Title: Fullstack Architect / Application Architect
Experience: 10-15 years
Location: Remote

A NASDAQ-listed company, known for its prominent position in the food and beverage sector, is seeking a Fullstack Architect with expertise in Java Spring Boot. In this role, you will understand problem statements and conduct architecture assessments, specifically in the manufacturing and warehousing industry. This is an exciting opportunity for an experienced architect eager to work with a leading multinational corporation that has a significant impact on millions of lives worldwide.

Required Skills:
- 10+ years of experience, with a minimum of 5 years as an architect
- 5+ years of experience in Java and Spring Boot
- Hands-on experience in creating monitoring dashboards and conducting infrastructure monitoring using Datadog, Splunk, or Grafana
- Proficiency with AppDynamics and Azure Databricks
- Strong problem-solving skills and the ability to navigate complex organizational environments
- Excellent verbal and written English communication skills

Posted 2 weeks ago

Apply

2.0 - 3.0 years

8 - 10 Lacs

Chennai

Work from Office

Job Summary: We are looking for a proactive and technically sound L1 Application Support Engineer with 2 to 3 years of experience and hands-on exposure to Java-based applications. The ideal candidate will be responsible for providing first-level technical support, troubleshooting application issues, and coordinating with L2/L3 teams for resolution when needed.

Key Responsibilities:
- Provide L1 support for business-critical Java applications
- Monitor application health and performance using dashboards and alerts
- Perform basic troubleshooting of issues and escalate to L2/L3 as needed
- Analyze logs and errors, and perform root cause analysis for recurring issues
- Work on incident and service request tickets within defined SLAs
- Coordinate with cross-functional teams for issue resolution
- Maintain documentation for incidents, resolutions, and standard procedures
- Participate in shift rotations (if applicable) and be available for weekend support (on-call)

Must-Have Skills:
- 2 to 3 years of application support experience
- Hands-on experience with Java-based applications (basic debugging/log analysis); the candidate should be able to write basic Java queries
- Good knowledge of SQL for querying and data validation
- Familiarity with Linux/Unix commands and basic scripting
- Excellent communication and customer service skills
- Strong analytical and problem-solving abilities

Good to Have:
- Exposure to ITIL practices (incident/problem/change management)
- Experience with ticketing tools (ServiceNow, JIRA, etc.)
- Familiarity with application monitoring tools (e.g., Grafana, Splunk, AppDynamics)

Posted 2 weeks ago

Apply

0.0 - 5.0 years

15 - 20 Lacs

Chennai

Work from Office

Job Title: Tech Lead / Cloud Architect
Experience: 0-5 years
Location: Remote

A NASDAQ-listed company that has effectively maintained its position as the front-runner in the food and beverage sector is looking to onboard a Tech Lead to guide and manage the development team on various projects. The Tech Lead will be responsible for overseeing the technical direction of the projects, ensuring the development of high-quality, scalable, and maintainable code. The talent will interact with other talents as well as an internal cross-functional team.

Required Skills:
- Cloud architecture using microservices design
- Data modelling/design
- API design and API contracts
- React, Java, Azure, ADO
- RESTful APIs, GraphQL, SQL/NoSQL databases
- Experience with ADF and Databricks
- CI/CD, SonarQube, Snyk, Prometheus, Grafana

Responsibilities:
- Collaborate with Product and Data teams and ensure a clear understanding of requirements
- Architect and design microservices-based enterprise web applications
- Build data-intensive, UI-rich, microservices-based enterprise applications that are scalable, performant, and secure, using cloud best practices in Azure

Offer Details:
- Full-time dedication (40 hours/week)
- REQUIRED: 3-hour overlap with CST (Central Standard Time)

Interview Process: 2-step interview (initial screening and technical interview)

Posted 2 weeks ago

Apply

5.0 - 6.0 years

10 - 20 Lacs

Mumbai, Delhi / NCR, Bengaluru

Work from Office

(Night shift: 6:00 pm to 3:00 am)

Key Responsibilities:
- Development with Python, TypeScript (TS), GitHub, and AWS services (Step Functions and Lambda)
- Development with Terraform and Grafana tools
- Experienced development of AWS-based API and ETL solutions
- Design and implement scalable APIs leveraging AWS services such as API Gateway, Lambda, and RDS
- Leverage Cursor AI with Playwright for automated testing of applications
- Ensure API and ETL job reliability through unit and integration testing, workflow orchestration, and event-driven automation
- Monitor, troubleshoot, and enhance performance using AWS-native tools such as CloudWatch, SQS, and EventBridge
- Collaborate with cross-functional teams to ensure alignment with business and technical objectives
- Ensure that all uptime and system performance metrics are defined and supported

Required Skills & Experience:
- Expert hands-on development in Python and TypeScript (TS)
- Expert hands-on solutions and development with AWS services (Step Functions and Lambda)
- Hands-on experience with AWS services: API Gateway, Lambda, observability, CloudWatch, SQS, EventBridge, S3, RDS, AWS Glue, Glue Crawler, Athena
- Hands-on design and development of relational databases (Oracle, Postgres, SQL)
- Knowledge of the Cursor AI code editor and Playwright automated testing
- Knowledge of workflow orchestration, including job scheduling and API integration
- Strong experience in TypeScript (TS) and Python for API development using Serverless Framework v3
- Hands-on development of Docker-based APIs
- Expertise in CRUD transactions with relational databases
- Experience implementing CI/CD pipelines for API deployments
- Strong understanding of unit and integration testing, including mocking strategies for API development

Preferred Qualifications:
- Experience with AWS observability maturity models and best practices
- Experience with serverless architecture best practices
- Experience with event-driven architecture using AWS services
- Experience with CI/CD pipeline development
- Strong debugging and performance optimization skills
- Ability to write clean, maintainable, and well-documented code
- Knowledge of security best practices in AWS, API authentication, and access control
- Python development experience for AWS Glue ETL workflows
- Strong communication and collaboration skills

Other Requirements:
- 5+ years of senior leadership and hands-on development of Python, TypeScript (TS), and API services
- 3+ years of hands-on AWS services development (Step Functions and Lambda)
- 1+ year of senior leadership and hands-on development of Grafana, Terraform, and GitHub
- Comprehensive understanding of the complete software development lifecycle
- Hands-on experience working in an agile software development environment (preferably Scrum)

Work Environment: Typical office environment, Mon-Fri during the hours of 8 A.M. to 5 P.M. EST.

Location: Remote, Mumbai, Delhi / NCR, Bengaluru, Kolkata, Chennai, Hyderabad, Ahmedabad, Pune
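The API Gateway + Lambda pattern this role centers on can be sketched with a minimal handler. A hedged Python sketch: the event shape follows the API Gateway Lambda proxy integration, but the `ITEMS` dictionary is a purely illustrative stand-in for an RDS-backed lookup, and no AWS calls are made:

```python
import json

# Illustrative stand-in for a relational store such as RDS
ITEMS = {"42": {"id": "42", "name": "demo"}}


def handler(event, context):
    """Minimal Lambda handler (API Gateway proxy integration) for
    GET /items/{id}: returns 200 with the item, or 404 if missing."""
    item_id = (event.get("pathParameters") or {}).get("id")
    item = ITEMS.get(item_id)
    if item is None:
        return {"statusCode": 404, "body": json.dumps({"error": "not found"})}
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(item),
    }


# Local smoke test with a fake proxy event — no AWS needed
resp = handler({"pathParameters": {"id": "42"}}, None)
print(resp["statusCode"])  # → 200
```

Keeping the handler a pure function of its event is also what makes the "unit and integration testing, including mocking strategies" requirement above tractable: the same handler runs under pytest locally and behind API Gateway in production.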

Posted 2 weeks ago

Apply

3.0 - 8.0 years

20 - 35 Lacs

Gurugram, Delhi / NCR, Mumbai (All Areas)

Hybrid

Job location: Mumbai/Gurugram (Hybrid)

About the role: Sun King is looking for a self-driven Infrastructure Engineer who is comfortable working in a fast-paced startup environment and balancing the needs of multiple development teams and systems. You will work on improving our current IaC, observability stack, and incident response processes. You will work with the data science, analytics, and engineering teams to build optimized CI/CD pipelines, scalable AWS infrastructure, and Kubernetes deployments.

What you would be expected to do:
- Work with engineering, automation, and data teams on various infrastructure requirements.
- Design modular and efficient GitOps CI/CD pipelines, agnostic to the underlying platform.
- Manage AWS services for multiple teams.
- Manage custom data store deployments such as sharded MongoDB clusters, Elasticsearch clusters, and upcoming services.
- Deploy and manage Kubernetes resources.
- Deploy and manage custom metrics exporters, trace data, and custom application metrics; design dashboards and query metrics from multiple resources as an end-to-end observability solution.
- Set up incident response services and design effective processes.
- Deploy and manage critical platform services such as OPA and Keycloak for IAM.
- Advocate best practices for high availability and scalability when designing AWS infrastructure, observability dashboards, IaC, Kubernetes deployments, and GitOps CI/CD pipelines.

You might be a strong candidate if you have/are:
- Hands-on experience with Docker or another container runtime, and Linux with the ability to perform basic administrative tasks.
- Experience working with web servers (nginx, Apache) and cloud providers (preferably AWS).
- Hands-on scripting and automation experience (Python, Bash), with experience debugging and troubleshooting Linux environments and cloud-native deployments.
- Experience building CI/CD pipelines, with familiarity with monitoring and alerting systems (Grafana, Prometheus, and exporters).
- Knowledge of web architecture, distributed systems, and single points of failure.
- Familiarity with cloud-native deployments and concepts like high availability, scalability, and bottlenecks.
- Good networking fundamentals: SSH, DNS, TCP/IP, HTTP, SSL, load balancing, reverse proxies, and firewalls.

Good to have:
- Experience with backend development, setting up databases, and performance tuning using parameter groups.
- Working experience in Kubernetes cluster administration and Kubernetes deployments.
- Experience working alongside SecOps engineers.
- Basic knowledge of Envoy, service mesh (Istio), and SRE concepts like distributed tracing.
- Setup and usage of OpenTelemetry, central logging, and monitoring systems.

Apply here: https://sunking.pinpointhq.com/postings/b63a7111-1b98-48de-8528-4bb4bb77436f
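The "custom metrics exporters" this posting mentions are, at their core, just HTTP endpoints serving Prometheus' text exposition format. A stdlib-only Python sketch of the idea (the metric name, the fixed queue-depth value, and the ephemeral port are illustrative assumptions; in practice you would use the prometheus_client library):

```python
# Minimal metrics-exporter sketch: serve one gauge in Prometheus'
# text exposition format over HTTP, then scrape it once.
from http.server import BaseHTTPRequestHandler, HTTPServer
import threading
import urllib.request

def render_metrics(depth: int) -> str:
    """Render one gauge in the Prometheus text exposition format."""
    return (
        "# HELP app_work_queue_depth Items waiting in the work queue\n"
        "# TYPE app_work_queue_depth gauge\n"
        f'app_work_queue_depth{{queue="default"}} {depth}\n'
    )

class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/metrics":
            body = render_metrics(depth=42).encode()  # 42 is a stand-in probe value
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; version=0.0.4")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, *args):  # keep the sketch quiet
        pass

server = HTTPServer(("127.0.0.1", 0), MetricsHandler)  # port 0 = ephemeral
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/metrics"
scraped = urllib.request.urlopen(url).read().decode()
print(scraped)
server.shutdown()
```

A Prometheus server would be pointed at this endpoint via a scrape config; Grafana dashboards then query the stored series.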

Posted 2 weeks ago

Apply

3.0 - 8.0 years

5 - 9 Lacs

Bengaluru

Work from Office

Project Role: Application Developer
Project Role Description: Design, build, and configure applications to meet business process and application requirements.
Must have skills: Spring Boot
Good to have skills: NA
Minimum 3 year(s) of experience is required
Educational Qualification: 15 years full time education

Summary: As an Application Developer, you will engage in the design, construction, and configuration of applications tailored to fulfill specific business processes and application requirements. Your typical day will involve collaborating with team members to understand project needs, developing innovative solutions, and ensuring that applications are built to the highest standards of quality and performance. You will also participate in discussions to refine project goals and contribute to the overall success of the team.

Roles & Responsibilities:
- Expected to perform independently and become an SME.
- Required active participation/contribution in team discussions.
- Contribute to providing solutions to work-related problems.
- Assist in the documentation of application specifications and design.
- Engage in code reviews to ensure adherence to best practices and standards.

Professional & Technical Skills:
- DS & Algo, Java 17/Java EE, Spring Boot, CI/CD
- RESTful web services, Spring Framework, caching techniques, PostgreSQL, JUnit for testing, and containerization with Kubernetes/Docker; Airflow, GCP, Spark, Kafka
- Hands-on experience building alerting/monitoring/logging for microservices using frameworks like OpenObserve/Splunk, Grafana, Prometheus

Additional Information:
- The candidate should have a minimum of 3 years of experience in Spring Boot.
- This position is based at our Bengaluru office.
- A 15 years full time education is required.

Qualification: 15 years full time education

Posted 2 weeks ago

Apply

12.0 - 16.0 years

37 - 42 Lacs

Bengaluru

Work from Office

Job Objective: As AVP/VP Architect, lead the design and development of scalable, reliable, and high-performance architecture for Zwayam.

Job Description: In this role you will:
- Hands-on Coding & Code Review: Actively participate in coding and code reviews, ensuring adherence to best practices, coding standards, and performance optimization.
- High-Level and Low-Level Design: Create comprehensive architectural documentation that guides the development team and ensures the scalability and security of the system.
- Security Best Practices: Implement security strategies, including data encryption, access control, and threat detection, ensuring the platform adheres to the highest security standards.
- Compliance Management: Oversee compliance with regulatory requirements such as GDPR, including data protection, retention policies, and audit readiness.
- Disaster Recovery & Business Continuity: Design and implement disaster recovery strategies to ensure the reliability and continuity of the system in case of failures or outages.
- Scalability & Performance Optimization: Ensure the system architecture can scale seamlessly and optimize performance as business needs grow.
- Monitoring & Alerting: Set up real-time monitoring and alerting systems to ensure proactive identification and resolution of performance bottlenecks, security threats, and system failures.
- Cross-Platform Deployment: Architect flexible, cloud-agnostic solutions and manage deployments on Azure and AWS platforms.
- Containerization & Orchestration: Use Kubernetes and Docker Swarm for container management and orchestration to achieve a high degree of automation and reliability in deployments.
- Data Management: Manage database architecture using MySQL, MongoDB, and Elasticsearch to ensure efficient storage, retrieval, and management of data.
- Message Queuing Systems: Design and manage asynchronous communication using Kafka and Redis for event-driven architecture.
- Collaboration & Leadership: Work closely with cross-functional teams, including developers, product managers, and other stakeholders, to deliver high-quality solutions on time.
- Mentoring & Team Leadership: Mentor, guide, and lead the engineering team, fostering technical growth and maintaining adherence to architectural and coding standards.

Required Skills:
- Experience: 12+ years of experience in software development and architecture, with at least 3 years in a leadership/architect role.
- Technical Expertise: Proficient in Java and related frameworks like Spring Boot. Experience with databases like MySQL, MongoDB, Elasticsearch, and message queuing systems like Kafka and Redis. Proficiency with containerization (Docker, Docker Swarm) and orchestration (Kubernetes). Solid experience with cloud platforms (Azure, AWS, GCP). Experience with monitoring tools (e.g., Prometheus, Grafana, ELK stack) and alerting systems for real-time issue detection and resolution.
- Compliance & Security: Hands-on experience in implementing security best practices. Familiarity with compliance frameworks such as GDPR and DPDP.
- Architecture & Design: Proven experience in high-level and low-level architectural design.
- Problem-Solving: Strong analytical and problem-solving skills, with the ability to handle complex and ambiguous situations.
- Leadership: Proven ability to lead teams, influence stakeholders, and drive change.
- Communication: Excellent verbal and written communication skills.

Our Ideal Candidate: The ideal candidate should possess a deep understanding of the latest architectural patterns, cloud-native design, and security practices. They should be adept at translating business requirements into scalable and efficient technical solutions. A proactive, hands-on approach to problem-solving and a passion for innovation are essential. Strong leadership and mentoring skills are crucial to drive a high-performance team and foster technical excellence.
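The event-driven pattern named in this posting (asynchronous communication via Kafka) reduces to producers appending events to a topic log while consumer groups read at their own offsets, so consumers are decoupled from producers in time. A deliberately minimal in-memory sketch of that idea, not a real Kafka client (all names and the event shapes are illustrative):

```python
# Kafka-style pub/sub in miniature: topics are append-only logs,
# and each consumer group tracks its own read offset per topic.
from collections import defaultdict

class MiniBroker:
    def __init__(self):
        self.topics = defaultdict(list)    # topic -> append-only event log
        self.offsets = defaultdict(int)    # (group, topic) -> next unread index

    def produce(self, topic: str, event: dict) -> None:
        self.topics[topic].append(event)

    def consume(self, group: str, topic: str) -> list:
        """Return events this group has not seen yet, advancing its offset."""
        log = self.topics[topic]
        start = self.offsets[(group, topic)]
        self.offsets[(group, topic)] = len(log)
        return log[start:]

broker = MiniBroker()
broker.produce("orders", {"id": 1, "status": "created"})
broker.produce("orders", {"id": 2, "status": "created"})

billing_first = broker.consume("billing", "orders")    # both events
billing_again = broker.consume("billing", "orders")    # nothing new
analytics = broker.consume("analytics", "orders")      # independent group: both events
print(billing_first, billing_again, analytics)
```

The independent-offset behavior is what lets, say, a billing service and an analytics pipeline each process the full order stream without coordinating with one another.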

Posted 2 weeks ago

Apply

3.0 - 8.0 years

5 - 10 Lacs

Bengaluru

Work from Office

Project Role: DevOps Engineer
Project Role Description: Responsible for building and setting up new development tools and infrastructure utilizing knowledge in continuous integration, delivery, and deployment (CI/CD), cloud technologies, container orchestration, and security. Build and test end-to-end CI/CD pipelines, ensuring that systems are safe against security threats.
Must have skills: Google Cloud Compute Services
Good to have skills: Google BigQuery, Google Kubernetes Engine
Minimum 3 year(s) of experience is required
Educational Qualification: 15 years full time education

Summary: As a DevOps Engineer, you will be responsible for building and setting up new development tools and infrastructure. A typical day involves utilizing your expertise in continuous integration, delivery, and deployment, while also focusing on cloud technologies, container orchestration, and security measures. You will engage in building and testing end-to-end CI/CD pipelines, ensuring that systems are secure and efficient, and collaborating with various teams to enhance the development process.

Roles & Responsibilities:
- Expected to perform independently and become an SME.
- Required active participation/contribution in team discussions.
- Contribute to providing solutions to work-related problems.
- Collaborate with cross-functional teams to identify and resolve issues in the development process.
- Implement best practices for CI/CD pipelines to enhance efficiency and security.

Professional & Technical Skills:
- Must Have Skills: Proficiency in Google Cloud Compute Services.
- Good To Have Skills: Experience with Google BigQuery, Google Kubernetes Engine.
- Strong understanding of container orchestration and management.
- Experience with security protocols and best practices in cloud environments.
- Familiarity with monitoring and logging tools to ensure system reliability.

Additional Information:
- The candidate should have a minimum of 3 years of experience in Google Cloud Compute Services.
- This position is based at our Bengaluru office.
- A 15 years full time education is required.

Qualification: 15 years full time education

Posted 2 weeks ago

Apply

7.0 - 12.0 years

9 - 14 Lacs

Pune

Work from Office

Job Summary
Synechron is seeking a skilled and experienced Lead Java Developer to oversee the development, deployment, and support of complex enterprise applications. This role involves leading technical initiatives, ensuring best practices in software engineering, and collaborating across teams to deliver cloud-enabled, scalable, and efficient solutions. The successful candidate will contribute to our strategic technology objectives while fostering innovation, best coding practices, and continuous improvement in a dynamic environment.

Software Requirements
Required:
- Proficiency in Java (latest stable versions), with extensive experience in building enterprise-scale applications
- Familiarity with Kettle jobs (Pentaho Data Integration)
- Operating systems: Unix/Linux
- Scripting languages: Shell scripting, Perl, Python
- Job scheduling tools: Control-M, Autosys
- Database technologies: SQL Server, Oracle, or MongoDB
- Monitoring tools such as Grafana, Prometheus, or Splunk
- Container orchestration: Kubernetes and OpenShift
- Messaging middleware: Kafka, EMS, RabbitMQ
- Big data platforms: Apache Flink, Spark, Apache Beam, Hadoop, GemFire, Ignite
- Continuous Integration/Delivery tools: Jenkins, TeamCity, SonarQube, Git
Preferred:
- Experience with cloud platforms (e.g., AWS)
- Additional data processing frameworks or cloud deployment tools
- Knowledge of security best practices in enterprise environments

Overall Responsibilities
- Lead the design, development, and deployment of scalable Java-based solutions aligned with business needs
- Analyze existing system logic, troubleshoot issues, and implement improvements or fixes
- Collaborate with business stakeholders and technical teams to gather requirements, propose solutions, and document functionalities
- Define system architecture, including APIs, data flows, and system integration points
- Develop and maintain comprehensive documentation, including technical specifications, deployment procedures, and API documentation
- Support application deployment, configurations, and release management within CI/CD pipelines
- Implement monitoring and alerting solutions using tools like Grafana, Prometheus, or Splunk for operational insights
- Ensure application security and compliance with enterprise security standards
- Mentor junior team members and promote development best practices across the team

Performance Outcomes:
- Robust, scalable, and maintainable applications
- Reduced system outages and improved performance metrics
- Clear, complete documentation supporting operational and development teams
- Effective team collaboration and technical leadership

Technical Skills (By Category)
- Programming Languages: Essential: Java. Preferred: scripting languages (Shell, Perl, Python)
- Frameworks and Libraries: Essential: Java frameworks such as Spring Boot, Spring Cloud. Preferred: microservices architecture, messaging, or big data libraries
- Databases/Data Management: Essential: SQL Server, Oracle, MongoDB. Preferred: data grid solutions like GemFire or Ignite
- Cloud Technologies: Preferred: hands-on experience with AWS, Azure, or similar cloud platforms, especially for container deployment and orchestration
- Containerization and Orchestration: Essential: Kubernetes, OpenShift
- DevOps & CI/CD: Essential: Jenkins, TeamCity, SonarQube, Git
- Monitoring & Security: Preferred: familiarity with Grafana, Prometheus, Splunk; understanding of data security, encryption, and access control best practices

Experience Requirements
- Minimum 7+ years of professional experience in Java application development
- Proven experience leading enterprise projects, especially involving distributed systems and big data technologies
- Experience designing and deploying cloud-ready applications
- Familiarity with SDLC processes, Agile methodologies, and DevOps practices
- Experience with application troubleshooting, system integration, and performance tuning

Day-to-Day Activities
- Lead project meetings, coordinate deliverables, and oversee technical planning
- Develop, review, and optimize Java code, APIs, and microservices components
- Collaborate with development, QA, and operations teams to ensure smooth deployment and operation of applications
- Conduct system analysis, performance tuning, and troubleshooting of live issues
- Document system architecture, deployment procedures, and operational workflows
- Mentor junior developers, review code, and promote best engineering practices
- Stay updated on emerging technologies, trends, and tools applicable to enterprise software development

Qualifications
- Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field
- Relevant certifications (e.g., Java certifications, cloud certifications) are advantageous
- Extensive hands-on experience in Java, microservices, and enterprise application development
- Exposure to big data, cloud deployment, and container orchestration preferred

Professional Competencies
- Strong analytical and problem-solving skills for complex technical challenges
- Leadership qualities, including mentoring and guiding team members
- Effective communication skills for stakeholder engagement and documentation
- Ability to work independently and collaboratively within Agile teams
- Continuous improvement mindset, eager to adapt and incorporate new technologies
- Good organizational and time management skills for handling multiple priorities

SYNECHRON'S DIVERSITY & INCLUSION STATEMENT
Diversity & Inclusion are fundamental to our culture, and Synechron is proud to be an equal opportunity workplace and an affirmative action employer. Our Diversity, Equity, and Inclusion (DEI) initiative "Same Difference" is committed to fostering an inclusive culture promoting equality, diversity, and an environment that is respectful to all. We strongly believe that a diverse workforce helps build stronger, successful businesses as a global company.
We encourage applicants from across diverse backgrounds, races, ethnicities, religions, ages, marital statuses, genders, sexual orientations, or disabilities to apply. We empower our global workforce by offering flexible workplace arrangements, mentoring, internal mobility, learning and development programs, and more. All employment decisions at Synechron are based on business needs, job requirements, and individual qualifications, without regard to the applicant's gender, gender identity, sexual orientation, race, ethnicity, disabled or veteran status, or any other characteristic protected by law.

Posted 2 weeks ago

Apply

4.0 - 8.0 years

15 - 25 Lacs

Bengaluru

Work from Office

Job Summary: We are looking for a skilled Apache Solr Engineer to design, implement, and maintain scalable and high-performance search solutions. The ideal candidate will have hands-on experience with Solr/SolrCloud, strong analytical skills, and the ability to work in cross-functional teams to deliver efficient search functionality across enterprise or customer-facing applications.

Experience: 4-8 years

Key Responsibilities:
- Design, develop, and maintain enterprise-grade search solutions using Apache Solr and SolrCloud.
- Develop and optimize search indexes and schemas based on use cases like product search, document search, or order/invoice search.
- Integrate Solr with backend systems, databases, and APIs.
- Implement full-text search, faceted search, auto-suggestions, ranking, and relevancy tuning.
- Optimize search performance, indexing throughput, and query response time.
- Ensure data consistency and high availability using SolrCloud and ZooKeeper (cluster coordination and configuration management).
- Monitor search system health and troubleshoot issues in production.
- Collaborate with product teams, data engineers, and DevOps teams for smooth delivery.
- Stay up to date with new features of Apache Lucene/Solr and recommend improvements.

Required Skills & Qualifications:
- Strong experience with Apache Solr and SolrCloud.
- Good understanding of Lucene, inverted indexes, analyzers, tokenizers, and search relevance tuning.
- Proficient in Java or Python for backend integration and development.
- Experience with RESTful APIs, data pipelines, and real-time indexing.
- Familiarity with ZooKeeper, Docker, and Kubernetes (for SolrCloud deployments).
- Knowledge of JSON, XML, and schema design in Solr.
- Experience with log analysis, performance tuning, and monitoring tools like Prometheus/Grafana is a plus.
- Exposure to e-commerce or document management search use cases is an advantage.

Preferred Qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- Experience with Elasticsearch or other search technologies is a plus.
- Working knowledge of CI/CD pipelines and cloud platforms (e.g., Azure).
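As a rough illustration of the full-text and faceted search features listed in this posting, the sketch below assembles a Solr `/select` request using standard Solr query parameters (`q`, `fq`, `facet`, `facet.field`); the host, core name, and field names are hypothetical:

```python
# Build a Solr select URL with a full-text query, a cached filter
# query, and field facets. Only the parameter names are Solr-standard;
# everything else (core, fields, host) is illustrative.
from urllib.parse import urlencode

def build_select_url(base: str, core: str, text: str,
                     facet_fields=(), filters=()) -> str:
    params = [
        ("q", text),             # full-text query, run through the schema's analyzers
        ("defType", "edismax"),  # common query parser for relevance tuning
        ("rows", "10"),
        ("wt", "json"),
    ]
    for fq in filters:
        params.append(("fq", fq))  # filter queries are cached independently of q
    if facet_fields:
        params.append(("facet", "true"))
        for field in facet_fields:
            params.append(("facet.field", field))
    return f"{base}/solr/{core}/select?{urlencode(params)}"

url = build_select_url(
    "http://localhost:8983", "products", "wireless headphones",
    facet_fields=["brand", "category"], filters=["in_stock:true"],
)
print(url)
```

The response's `facet_counts` section would then drive the category/brand counts shown beside product-search results.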

Posted 2 weeks ago

Apply

10.0 - 15.0 years

20 - 30 Lacs

Mumbai, Powai

Work from Office

Notice period: Immediate to 30 days, or currently serving notice period.

Job Responsibilities:
- Engineer and automate various database platforms and services.
- Assist in the ongoing process of rationalizing the technology and usage of databases.
- Participate in the creation and implementation of operational policies, procedures, and documentation.
- Database administration and production support for databases hosted on private cloud across all regions.
- Database version upgrades and security patching.
- Performance tuning.
- Database replication administration.
- Collaborate with development teams and utilize coding skills to design and implement database solutions for new and existing applications.
- Willing to work on weekends and outside office hours as part of a wider scheduled support group.
- Willingness to learn and adapt to new technologies and methodologies.

Required Skills (Mandatory)
The candidate must have the following skills and experience:
- 10+ years of experience in MS SQL DBA administration.
- Proven ability to navigate Linux operating systems and utilize command-line tools proficiently.
- Clear understanding of MS SQL availability groups.
- Exposure to scripting languages like Python and automation tools like Ansible.
- Proven effective and efficient troubleshooting skills.
- Ability to cope well under pressure.
- Strong organizational skills and practical sense.
- Quick and eager to learn and explore both technical and semi-technical work types.
- Engineering mindset.

Preferred Skills
Experience/knowledge of the following is an added advantage (but not mandatory):
- Experience in MySQL and Oracle.
- Experience in infrastructure automation development.
- Experience with monitoring systems and log management/reporting tools (e.g., Loki, Grafana, Splunk).

Posted 2 weeks ago

Apply

3.0 - 6.0 years

22 - 27 Lacs

Pune

Work from Office

We are growing and seeking a skilled DevOps Engineer to join our DevOps engineering team. You'll be responsible for building and maintaining scalable cloud infrastructure across clouds and bare-metal environments, automating deployment pipelines, and ensuring system reliability.

What You'll Do:
- Monitor and optimize: Set up and maintain observability tools (logging, alerting, metrics) to detect and resolve performance bottlenecks.
- Implement scalability solutions: Create programmatic scaling and load balancing strategies to support usage growth.
- Develop automation systems: Write production-grade code for CI/CD pipelines, deployment automation, and infrastructure tooling to accelerate shipping.
- Migrate services to Kubernetes; improve performance and security of the clusters.
- Improve data and ML pipelines; work with EMR clusters.

What You'll Need:
- Deep experience in infrastructure, DevOps, or platform engineering roles
- Deep expertise with cloud platforms (AWS preferred; GCP/Azure also welcome) and Linux environments
- Experience with Terraform
- Proficiency with CI/CD systems and deployment automation (Jenkins, Argo CD preferred)
- Experience with container orchestration using Kubernetes and Helm for application deployments
- Strong scripting capabilities in Python and Bash for automation and tooling
- Experience implementing secure systems at scale, including IAM and network security controls
- Familiarity with monitoring and observability stacks like Prometheus, Grafana, Loki
- Experience with configuration management tools: Ansible, Puppet, Chef
- Strong problem-solving skills with a bias toward resilience and scalability
- Excellent communication and collaboration across engineering teams

Shift Timing: The regular hours for this position will cover a combination of business hours in the US and India, typically 2pm-11pm IST. Occasionally, later hours may be required for meetings with teams in other parts of the world.
Additionally, for the first 4-6 weeks of onboarding and training, US Eastern time hours (IST -9:30) may be required.

Benefits:
- Medical insurance coverage is provided to our employees and their dependants, 100% covered by Comscore.
- Provident Fund is borne by Comscore, and is provided over and above the gross salary to employees.
- 26 annual leave days per annum, divided into 8 casual leave days and 18 privilege leave days.
- Comscore also provides a paid "Recharge Week" over the Christmas and New Year period, so that you can start the new year fresh.
- In addition, you will be entitled to: 10 public holidays; 12 sick leave days; 5 paternity leave days; 1 birthday leave day.
- Flexible work arrangements.
- "Summer Hours" are offered from March to May: Comscore offers employees the flexibility to work more hours from Monday to Thursday, and the hours can be offset on Friday from 2:00pm onwards.
- Employees are eligible to participate in Comscore's Sodexo meal scheme and enjoy tax benefits.

About Comscore: At Comscore, we're pioneering the future of cross-platform media measurement, arming organizations with the insights they need to make decisions with confidence. Central to this aim are our people who work together to simplify the complex on behalf of our clients & partners. Though our roles and skills are varied, we're united by our commitment to five underlying values: Integrity, Velocity, Accountability, Teamwork, and Servant Leadership. If you're motivated by big challenges and interested in helping some of the largest and most important media properties and brands navigate the future of media, we'd love to hear from you. Comscore (NASDAQ: SCOR) is a trusted partner for planning, transacting and evaluating media across platforms.
With a data footprint that combines digital, linear TV, over-the-top and theatrical viewership intelligence with advanced audience insights, Comscore allows media buyers and sellers to quantify their multiscreen behavior and make business decisions with confidence. A proven leader in measuring digital and set-top box audiences and advertising at scale, Comscore is the industry's emerging, third-party source for reliable and comprehensive cross-platform measurement. To learn more about Comscore, please visit Comscore.com. Comscore is committed to creating an inclusive culture, encouraging diversity.

Posted 2 weeks ago

Apply

1.0 - 4.0 years

4 - 7 Lacs

Pune

Work from Office

Job Summary: We are seeking a proactive and detail-oriented Site Reliability Engineer (SRE) focused on monitoring to join our observability team. The candidate will be responsible for ensuring the reliability, availability, and performance of our systems through robust monitoring, alerting, and incident response practices.

Key Responsibilities:
- Monitor the application and IT infrastructure environment.
- Drive end-to-end incident response and resolution.
- Design, implement, and maintain monitoring and alerting systems for infrastructure and applications.
- Continuously improve observability by integrating logs, metrics, and traces into a unified monitoring platform.
- Collaborate with development and operations teams to define and track SLIs, SLOs, and SLAs.
- Analyze system performance and reliability data to identify trends and potential issues.
- Participate in incident response, root cause analysis, and post-mortem documentation.
- Automate repetitive monitoring tasks and improve alert accuracy to reduce noise.

Required Skills & Qualifications:
- 2+ years of experience in application/system monitoring, SRE, or DevOps roles.
- Proficiency with monitoring tools such as Prometheus, Grafana, ELK, APM tools, Nagios, Zabbix, Datadog, or similar.
- Strong scripting skills (Python, Bash, or similar) for automation.
- Experience with cloud platforms (AWS, Azure) and container orchestration (Kubernetes).
- Solid understanding of Linux/Unix systems and networking fundamentals.
- Excellent problem-solving and communication skills.
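The SLI/SLO tracking this posting describes rests on simple error-budget arithmetic: the SLO target implies an allowed failure ratio, and the measured SLI tells you how much of that budget has been burnt. A small sketch (the 99.9% target and the request counts are made-up numbers for illustration):

```python
# Error-budget arithmetic behind availability SLIs and SLOs.
def error_budget_report(slo_target: float, total: int, failed: int) -> dict:
    """Compare a measured availability SLI against an SLO target."""
    sli = (total - failed) / total        # measured availability ratio
    budget = 1.0 - slo_target             # allowed failure ratio for the window
    burn = (failed / total) / budget      # fraction of the budget consumed
    return {"sli": sli, "budget_remaining": 1.0 - burn}

# A 99.9% SLO allows 0.1% failed requests; 500 failures out of
# 1,000,000 requests consumes half the error budget.
report = error_budget_report(0.999, 1_000_000, 500)
print(report)
```

Teams typically alert on the burn rate (how fast `budget_remaining` is shrinking) rather than on raw failure counts, which is one concrete way to "improve alert accuracy and reduce noise."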

Posted 2 weeks ago

Apply

0.0 - 2.0 years

2 - 3 Lacs

Mumbai

Work from Office

- Monitor systems via GCP tools (Stackdriver Monitoring and Logging)
- Use Linux for log analysis and health checks
- Run SQL queries for DB validation
- Generate infra/service health reports
- Work with tools like Grafana and Splunk
- Escalate issues with clear documentation

Posted 2 weeks ago

Apply

1.0 - 5.0 years

0 Lacs

karnataka

On-site

At Goldman Sachs, our Engineers are dedicated to making the impossible possible. We are committed to changing the world by bridging the gap between people and capital with innovative ideas. Our mission is to tackle the most complex engineering challenges for our clients, crafting massively scalable software and systems, designing low latency infrastructure solutions, proactively safeguarding against cyber threats, and harnessing the power of machine learning in conjunction with financial engineering to transform data into actionable insights. Join our engineering teams to pioneer new businesses, revolutionize finance, and seize opportunities in the fast-paced world of global markets. Engineering at Goldman Sachs, consisting of our Technology Division and global strategists groups, stands at the heart of our business. Our dynamic environment demands creative thinking and prompt, practical solutions. If you are eager to explore the limits of digital possibilities, your journey starts here. Goldman Sachs Engineers embody innovation and problem-solving skills, developing solutions in various domains such as risk management, big data, and mobile technology. We seek imaginative collaborators who can adapt to change and thrive in a high-energy, global setting. The Data Engineering group at Goldman Sachs plays a pivotal role across all aspects of our business. Focused on offering a platform, processes, and governance to ensure the availability of clean, organized, and impactful data, Data Engineering aims to scale, streamline, and empower our core businesses. As a Site Reliability Engineer (SRE) on the Data Engineering team, you will oversee observability, cost, and capacity, with operational responsibility for some of our largest data platforms. We are actively involved in the entire lifecycle of platforms, from design to decommissioning, employing an SRE strategy tailored to this lifecycle. 
We are looking for individuals who have a development background and are proficient in code. Candidates should prioritize reliability, observability, capacity management, DevOps, and the SDLC (software development lifecycle). As a self-driven leader, you should be comfortable tackling problems with varying degrees of complexity and translating them into data-driven outcomes. You should be actively engaged in strategy development, participate in team activities, conduct postmortems, and possess a problem-solving mindset.

Your responsibilities as a Site Reliability Engineer (SRE) will include driving the adoption of cloud technology for data processing and warehousing, formulating SRE strategies for major platforms like Lakehouse and Data Lake, collaborating with data consumers and producers to align reliability and cost objectives, and devising strategies with data using relevant technologies such as Snowflake, AWS, Grafana, PromQL, Python, Java, OpenTelemetry, and GitLab.

Basic qualifications for this role include a Bachelor's or Master's degree in a computational field, 1-4+ years of relevant work experience in a team-oriented environment, at least 1-2 years of hands-on developer experience, familiarity with DevOps and SRE principles, experience with cloud infrastructure (AWS, Azure, or GCP), a proven track record in driving data-oriented strategies, and a deep understanding of data multi-dimensionality, curation, and quality.

Preferred qualifications entail familiarity with Data Lake/Lakehouse technologies, experience with cloud databases like Snowflake and BigQuery, understanding of data modeling concepts, working knowledge of open-source tools such as AWS Lambda and Prometheus, and proficiency in coding with Java or Python. Strong analytical skills, excellent communication abilities, a commercial mindset, and a proactive approach to problem-solving are essential traits for success in this role.

Posted 2 weeks ago

Apply

3.0 - 7.0 years

0 Lacs

karnataka

On-site

As a Site Reliability Engineer III at JPMorgan Chase within Corporate Technology, you will be at the center of a rapidly growing field in technology. Your role involves applying your skillset to drive innovation and modernize the world's most complex and mission-critical systems. You will be responsible for solving complex business problems with simple solutions through code and cloud infrastructure. Your tasks will include configuring, maintaining, monitoring, and optimizing applications and their associated infrastructure. You will play a vital role in decomposing and iteratively improving existing solutions, contributing significantly to your team by sharing your knowledge of end-to-end operations, availability, reliability, and scalability of applications or platforms.

Responsibilities:
- Guide and assist others in building appropriate level designs and gaining consensus from peers
- Collaborate with software engineers and teams to design and implement deployment approaches using automated continuous integration and continuous delivery pipelines
- Design, develop, test, and implement availability, reliability, and scalability solutions in applications
- Implement infrastructure, configuration, and network as code for applications and platforms
- Collaborate with technical experts, key stakeholders, and team members to resolve complex problems
- Understand service level indicators and utilize service level objectives to proactively resolve issues
- Support the adoption of site reliability engineering best practices within the team

Required Qualifications:
- Formal training or certification on software engineering concepts with 3+ years of applied experience
- Proficiency in site reliability culture and principles, and familiarity with implementing site reliability within an application or platform
- Proficiency in at least one programming language such as Python, Java/Spring Boot, or .NET
- Knowledge of software applications and technical processes within a given technical discipline (e.g., cloud, artificial intelligence, Android, etc.)
- Experience with observability tools like Grafana, Dynatrace, Prometheus, Datadog, Splunk, etc.
- Experience with continuous integration and continuous delivery tools like Jenkins, GitLab, or Terraform
- Familiarity with containers and container orchestration such as ECS, Kubernetes, and Docker
- Familiarity with troubleshooting common networking technologies and issues
- Ability to contribute to large and collaborative teams with limited supervision
- Proactive recognition of roadblocks and interest in learning innovative technologies
- Ability to identify new technologies and relevant solutions to meet design constraints

Preferred Qualifications:
- Familiarity with popular IDEs for software development
- General knowledge of the financial services industry

Posted 2 weeks ago

Apply

5.0 - 9.0 years

0 Lacs

Pune, Maharashtra

On-site

As a PLM Infrastructure Engineer at our company, you will play a crucial role in operating and managing enterprise-grade 3DEXPERIENCE platforms used in engineering and manufacturing. You will maintain multiple 3DX environments (Production, Development, Sandbox) both on-premises and on AWS, and collaborate with product and infrastructure teams to ensure the smooth operation of the Dassault Systèmes ecosystem.

Key Responsibilities:
- Maintain and operate 3DEXPERIENCE environments, including production, integration, development, and AWS sandboxes.
- Perform 3DX upgrades, set up new environments, and roll out software updates.
- Administer and troubleshoot components such as Linux, Oracle DB, and Windows Server.
- Manage CATIA client deployments (.msi packaging) and oversee the license infrastructure.
- Use Ansible for automation and GitHub for CI/CD, and monitor systems with Nagios and Grafana.
- Implement security policies, backup procedures, and system-level controls.
- Support interfaces with Active Directory, IAM, and People & Organization systems.

Requirements:
- Hands-on experience with the 3DEXPERIENCE platform, CATIA V5, and PowerBY.
- Proficiency in Linux, Oracle DB, and Windows Server infrastructure.
- Knowledge of Ansible, GitHub CI/CD, and basic cloud platforms (AWS/Azure).
- Deep understanding of PLM ecosystem dynamics and platform lifecycle.
- Proven experience operating on-premises and hybrid 3DX environments.

Good to Have:
- Familiarity with Agile tools such as Jira and ServiceNow.
- Experience with UFT and Selenium for component testing.
- Knowledge of Infrastructure as Code (IaC) using Terraform or similar tools.

Why Join Us:
- Be part of a team that drives enterprise engineering platforms.
- Enjoy real platform ownership rather than just executing scripts.
- Contribute to a mission-critical environment used by global engineering teams.

Posted 2 weeks ago

Apply

4.0 - 8.0 years

0 Lacs

Pune, Maharashtra

On-site

As a Data Engineer at our organization, you will have the opportunity to work on building smart, automated testing solutions. We are seeking individuals who are passionate about data engineering and eager to contribute to our growing team.

Qualifications:
- Bachelor's or Master's degree in Computer Science, IT, or an equivalent field; for junior profiles, a similar educational background is preferred.
- 4 to 8 years of experience building and deploying complex data pipelines and data solutions.
- Hands-on experience deploying data pipelines using technologies such as Databricks, and hands-on experience with Java and Databricks.
- Experience with visualization software such as Splunk (or alternatives like Grafana, Prometheus, Power BI, Tableau).
- Proficiency in SQL and Java, along with hands-on experience in data modeling.
- Familiarity with PySpark or Spark for managing distributed data.
- Knowledge of Splunk (SPL), data schemas (e.g., JSON/XML/Avro), and deploying services as containers (e.g., Docker, Kubernetes) is beneficial.
- Experience with cloud services, particularly Azure, is advantageous.
- Familiarity with streaming and/or batch storage technologies such as Kafka, and with data quality management and monitoring, is a plus.
- Strong communication skills in English for effective collaboration within our team.

If you are excited about this opportunity and possess the required qualifications, we encourage you to connect with us by sending your updated CV to nivetha.s@eminds.ai. Join us and become a part of our exciting journey!

Posted 2 weeks ago

Apply

5.0 - 10.0 years

10 - 20 Lacs

Gurugram, Chennai, Bengaluru

Work from Office

Work Model: Hybrid (one day a week in office)
Locations: Chennai, Bengaluru, Gurgaon, Pune

Job Purpose: Analyze, design, develop, and manage the infrastructure used to release scalable data science models. The ML Engineer is expected to deploy, monitor, and operate production-grade AI systems in a scalable, automated, and repeatable way.

Job Responsibilities:
- Create and maintain a scalable infrastructure to deliver AI/ML processes, responding to user requests in near real time.
- Design and implement pipelines for training and deploying ML models.
- Design dashboards to monitor systems; collect metrics and create alerts based on them.
- Design and execute performance tests.
- Perform feasibility studies and analyses with a critical point of view.
- Support and maintain applications, troubleshooting issues with data and applications.
- Develop technical documentation for applications, including diagrams and manuals.
- Work on many different software challenges, always ensuring a combination of simplicity and maintainability within the code.
- Contribute to architectural designs of large complexity and size, potentially involving several distinct software components.
- Mentor other engineers, fostering good engineering practices across the department.
- Work closely with data scientists and a variety of end users (across diverse cultures) to ensure technical compatibility and user satisfaction.
- Work as a member of a team, encouraging team building, motivation, and effective team relations.

Role Requirements (E = essential, P = preferred):
- P: Bachelor's degree in Computer Science or a related field
- P: Master's degree in data engineering or a related field
- E: Demonstrated experience and knowledge of Linux and Docker containers
- E: Demonstrated experience and knowledge of at least one major cloud provider (Azure, GCP, or AWS)
- P: Demonstrated experience and knowledge of distributed systems
- E: Proficiency in Python
- E: Experience with MLOps technologies such as Azure ML
- E: Self-driven, with good communication skills
- P: Experience with AI/ML frameworks: Torch, ONNX, TensorFlow
- E: Experience designing and implementing CI/CD pipelines for automation
- P: Experience designing monitoring dashboards (Grafana or similar)
- P: Experience with container orchestrators (Kubernetes, Docker Swarm)
- E: Experience using collaborative development tools such as Git, Confluence, Jira, etc.

Posted 2 weeks ago

Apply
cta

Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.

Job Application AI Bot

Apply to 20+ Portals in one click

Download Now

Download the Mobile App

Instantly access job listings, apply easily, and track applications.

Featured Companies