2943 Datadog Jobs - Page 29

JobPe aggregates job listings for easy access, but you apply directly on the original job portal.

0 years

9 - 10 Lacs

Noida

Remote

Job Description
Job ID: LEADC014754
Employment Type: Regular
Work Style: On-site
Location: Noida, UP, India
Role: Lead Cloud Operations Specialist

Company Overview
With 80,000 customers across 150 countries, UKG is the largest U.S.-based private software company in the world. And we’re only getting started. Ready to bring your bold ideas and collaborative mindset to an organization that still has so much more to build and achieve? Read on. At UKG, you get more than just a job. You get to work with purpose. Our team of U Krewers are on a mission to inspire every organization to become a great place to work through our award-winning HR technology built for all. Here, we know that you’re more than your work. That’s why our benefits help you thrive personally and professionally, from wellness programs and tuition reimbursement to U Choose — a customizable expense reimbursement program that can be used for more than 200 needs that best suit you and your family, from student loan repayment, to childcare, to pet insurance. Our inclusive culture, active and engaged employee resource groups, and caring leaders value every voice and support you in doing the best work of your career. If you’re passionate about our purpose — people — then we can’t wait to support whatever gives you purpose. We’re united by purpose, inspired by you.

Key Responsibilities:
- Monitor and support Kronos Private Cloud and hosted environments remotely.
- Perform remote monitoring of Microsoft Windows (2003/2008/2012/2016) and Linux servers for:
  - System performance and uptime
  - SQL database health
  - Application service and web application status
  - Server resource utilization
- Respond to alerts from monitoring tools and take corrective actions.
- Troubleshoot and identify root causes of server and application performance issues.
- Handle Level 1 escalations and follow the defined escalation matrix.
- Administer and maintain Windows and Linux operating systems.
- Support web applications and hosting services, including IIS, JBoss, and Apache Tomcat.
- Understand and troubleshoot server-client architecture issues.
- Collaborate with internal teams to ensure high availability and performance of hosted services.
- Document incidents, resolutions, and standard operating procedures.
- Participate in 24/7 rotational shifts, including nights and weekends.

Preferred Requirements and Skills:
- Experience with the UKG Workforce Central (WFC) application.
- Familiarity with ServiceNow for incident, problem, and change management.
- Strong understanding of cloud infrastructure, virtualisation (VMware), and hybrid environments.
- Knowledge of web server configurations, deployments, and troubleshooting.
- Excellent communication, analytical, and problem-solving skills.
- Familiarity with monitoring tools (Datadog, Grafana, Splunk) and alert management.
- Willingness to work in rotational shifts, including nights and weekends.

Where we’re going
UKG is on the cusp of something truly special. Worldwide, we already hold the #1 market share position for workforce management and the #2 position for human capital management. Tens of millions of frontline workers start and end their days with our software, with billions of shifts managed annually through UKG solutions today. Yet it’s our AI-powered product portfolio designed to support customers of all sizes, industries, and geographies that will propel us into an even brighter tomorrow! UKG is proud to be an equal opportunity employer and is committed to promoting diversity and inclusion in the workplace, including the recruitment process.
Disability Accommodation in the Application and Interview Process: For individuals with disabilities who need additional assistance at any point in the application and interview process, please email UKGCareers@ukg.com.

Posted 3 weeks ago

Apply

5.0 years

0 Lacs

India

Remote

At TechBiz Global, we provide recruitment services to top clients from our portfolio. We are currently seeking four DevOps Support Engineers to join one of our clients' teams in India, with a start date by 20 July. If you're looking for an exciting opportunity to grow in an innovative environment, this could be the perfect fit for you.

Key Responsibilities
- Monitor and troubleshoot Azure and AWS environments to ensure optimal performance and availability
- Respond promptly to incidents and alerts, investigating and resolving issues efficiently
- Perform basic scripting and automation tasks to streamline cloud operations (e.g., Bash, Python); a brief illustrative sketch follows this listing
- Communicate clearly and fluently in English with customers and internal teams
- Collaborate closely with the Team Lead, following Standard Operating Procedures (SOPs) and escalation workflows
- Work in a rotating shift schedule, including weekends and nights, ensuring continuous support coverage

Shift Detail
Each engineer works about 4 to 5 shifts per week, rotating through morning, evening, and night shifts—including weekends—to cover 24/7 support evenly among the team. Rotation ensures no single engineer is always working nights or weekends; the load is shared fairly.

Qualifications
- 2–5 years of experience in DevOps or cloud support roles
- Strong familiarity with AWS and Azure cloud environments
- Experience with CI/CD tools such as GitHub Actions or Jenkins
- Proficiency with monitoring tools like Datadog, CloudWatch, or similar
- Basic scripting skills in Bash, Python, or a comparable language
- Excellent communication skills in English
- Comfortable and willing to work in a shift-based support role, including night and weekend shifts
- Prior experience in a shift-based support environment is preferred

What We Offer
- Remote work opportunity — work from anywhere in India with a stable internet connection
- Comprehensive training program, including shadowing existing processes to gain hands-on experience and learning internal tools, Standard Operating Procedures (SOPs), ticketing systems, and escalation paths to ensure smooth onboarding and ongoing success
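As one illustration of the kind of basic Python automation this role describes (our own hedged sketch, not part of the original posting), the snippet below uses boto3 to list CloudWatch alarms currently in the ALARM state so an on-call engineer can triage them; the region is a placeholder and credentials are assumed to be configured already.

```python
# Minimal sketch: list CloudWatch alarms that are currently firing.
# Assumes AWS credentials are available via environment, shared config, or an IAM role.
import boto3

def alarms_in_alarm_state(region_name="ap-south-1"):
    """Return (name, reason) pairs for every CloudWatch alarm in ALARM state."""
    cloudwatch = boto3.client("cloudwatch", region_name=region_name)
    paginator = cloudwatch.get_paginator("describe_alarms")
    firing = []
    for page in paginator.paginate(StateValue="ALARM"):
        for alarm in page["MetricAlarms"]:
            firing.append((alarm["AlarmName"], alarm.get("StateReason", "")))
    return firing

if __name__ == "__main__":
    for name, reason in alarms_in_alarm_state():
        print(f"{name}: {reason}")
```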

Posted 3 weeks ago

Apply

5.0 - 7.0 years

10 - 19 Lacs

Bengaluru

Hybrid

SRE Engineer with Java skills
- BS degree in Computer Science, Engineering, or a related technical subject area
- 5+ years of hands-on AWS experience integrating, developing, and managing applications
- 5+ years of relevant work experience in a high-volume and/or critical production software environment
- 5+ years of hands-on software engineering or supporting/maintaining software systems experience (Java and/or C++ services)
- 3+ years of experience building automation into daily operational processes through one or more programming languages
- Experience with container technologies and orchestration (e.g., Docker, Kubernetes, EKS)
- Experience configuring, tuning, and automating operational responsibilities for AWS managed data services, including RDS, DynamoDB, and ElastiCache
- Experience with monitoring and log management tools (e.g., Datadog, CloudWatch, Splunk)
- Hands-on experience triaging and tuning Java cloud applications with integration into AWS managed services
- Solid understanding of AWS networking systems and protocols (e.g., ALB, Route 53, API Gateway, TCP/IP, HTTP/HTTPS, DNS)
- Experience developing or supporting Continuous Integration and Continuous Delivery/Deployment (CI/CD) pipelines

Education Qualification: BS degree in Computer Science, Engineering, or a related technical subject area.

Posted 3 weeks ago

Apply

15.0 years

0 Lacs

Greater Bengaluru Area

On-site

Key Responsibilities:
- Manage a QA team of 6 engineers focused on validating data pipelines, APIs, and front-end applications.
- Define, implement, and maintain test automation strategies for: ETL workflows (Airflow DAGs); API contracts, performance, and data sync; UI automation for internal and external portals.
- Collaborate with Data Engineering and Product teams to ensure accurate data ingestion from third-party systems such as Workday, ADP, Greenhouse, Lever, etc.
- Build and maintain robust automated regression suites for API and UI layers using industry-standard tools.
- Implement data validation checks, including row-level comparisons, schema evolution testing, null/missing value checks, and referential integrity checks (a brief illustrative sketch follows this listing).
- Own and evolve CI/CD quality gates and integrate automated tests into GitHub Actions, Jenkins, or equivalent.
- Ensure test environments are reproducible, version-controlled, and equipped for parallel test execution.
- Mentor QA team members in advanced scripting, debugging, and root-cause analysis practices.
- Develop monitoring/alerting frameworks for data freshness, job failures, and drift detection using Airflow and observability tools.

Technical Skills:
Core QA & Automation
- Strong hands-on experience with Selenium, Playwright, or Cypress for UI automation.
- Deep expertise in API testing using Postman, REST-assured, Karate, or similar frameworks.
- Familiarity with contract testing using Pact or similar tools.
- Strong understanding of BDD/TDD frameworks (e.g., Pytest-BDD, Cucumber).

ETL / Data Quality
- Experience testing ETL pipelines, preferably using Apache Airflow.
- Hands-on experience with SQL and data validation tools such as Great Expectations or custom Python data validators.
- Understanding of data modeling, schema versioning, and data lineage.

Languages & Scripting
- Strong programming/scripting skills in Python (required), with experience using it for test automation and data validation.
- Familiarity with Bash, YAML, and JSON for pipeline/test configurations.

DevOps & CI/CD
- Experience integrating tests into pipelines using tools like GitHub Actions, Jenkins, CircleCI, or GitLab CI.
- Familiarity with containerized environments using Docker and possibly Kubernetes.

Monitoring & Observability
- Working knowledge of log aggregation and monitoring tools like Datadog, Grafana, Prometheus, or Splunk.
- Experience with Airflow monitoring, job-level metrics, and alerts for test/data failures.

Qualifications:
- 15+ years in QA/QE roles with 3+ years in a leadership or management capacity.
- Strong foundation in testing data-centric and distributed systems.
- Proven ability to define and evolve automation strategies in agile environments.
- Excellent analytical, communication, and organizational skills.

Preferred:
- Experience with data graphs, knowledge graphs, or employee graph modeling.
- Exposure to cloud platforms (AWS/GCP) and data services (e.g., S3, BigQuery, Redshift).
- Familiarity with the HR tech domain and its integration challenges.
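To make the row-level, null-value, and referential-integrity checks mentioned above concrete, here is a minimal hedged sketch using pandas; the column names and sample data are hypothetical, and a production suite would more likely live in a framework such as Great Expectations or Pytest.

```python
# Minimal sketch of simple data-quality checks comparing a source extract to a loaded target.
# Column names ("employee_id") and the sample frames are illustrative assumptions.
import pandas as pd

def validate_load(source: pd.DataFrame, target: pd.DataFrame) -> list:
    """Return a list of human-readable data-quality failures."""
    failures = []

    # Row-level comparison: the load should not drop or duplicate rows.
    if len(source) != len(target):
        failures.append(f"row count mismatch: source={len(source)} target={len(target)}")

    # Null / missing value check on a required key column.
    if target["employee_id"].isna().any():
        failures.append("null employee_id values found in target")

    # Referential integrity: every target key must exist in the source.
    missing = set(target["employee_id"]) - set(source["employee_id"])
    if missing:
        failures.append(f"{len(missing)} employee_id value(s) not present in source")

    # Schema check: the target must retain the source's columns.
    dropped = set(source.columns) - set(target.columns)
    if dropped:
        failures.append(f"columns dropped during load: {sorted(dropped)}")

    return failures

if __name__ == "__main__":
    src = pd.DataFrame({"employee_id": [1, 2, 3], "email": ["a@x.com", "b@x.com", "c@x.com"]})
    tgt = pd.DataFrame({"employee_id": [1, 2, None]})
    for problem in validate_load(src, tgt):
        print("FAILED:", problem)
```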

Posted 3 weeks ago

Apply

8.0 years

0 Lacs

India

Remote

Job Details
Title: Engineering Manager (Java / Spring Boot, AWS)
Location: Remote
Employment Type: Full-time

Role Summary
We're looking for a passionate and technically adept leader with a deep understanding of modern software development to join our leadership team and guide two critical teams:
- Market Positioning Team: owns feature development that defines our market position.
- Integrations Team: focuses on seamless integration with partners and third-party applications.
As Engineering Manager, you’ll guide your teams to achieve ambitious goals with clarity and vision. You'll set the tone for technical excellence, collaboration, and continuous learning.

About the Opportunity
Guide our microservice platform and mentor a remote backend team. You’ll blend hands-on technical ownership with people leadership—shaping architecture, driving cloud best practices, and coaching engineers.

Key Responsibilities
1. Architecture & Delivery
- Define and evolve backend architecture (Java 17+, Spring Boot 3, AWS, Elasticsearch, PostgreSQL/MySQL, Redis, etc.)
- Lead design/code reviews; enforce best practices (CI/CD, observability, security, etc.)
- Drive scalability and uptime.
2. Team Leadership & Growth
- Manage a team of 6–10 backend engineers.
- Set objectives, give feedback, and coach in AI-assisted development (e.g., GitHub Copilot).
3. Stakeholder Collaboration
- Act as the liaison between Product, Frontend, SRE, and Data teams.
- Communicate technical concepts to all audiences.
4. Technical Vision & Governance
- Own standards, architectural principles, and tool evaluation (GenAI, cloud-native).
- Balance tech debt vs. feature delivery using data-driven decisions.

Required Qualifications
- 8+ years of backend experience with Java & Spring Boot.
- Experience mentoring or managing engineers.
- Expert in AWS and cloud-native design patterns.
- Proficiency with Elasticsearch, PostgreSQL/MySQL, Redis.
- Has scaled systems to millions of users / billions of events.
- Strong DevOps practices (Docker, CI/CD, observability).
- Excellent communication skills in remote environments.

Nice-to-Have
- Experience with Datadog (APM, Logs, RUM).
- Startup exposure; multitasking across projects.
- Previous title: Principal Engineer, Staff Engineer, or Engineering Manager.
- Experience with AI-assisted dev tools (e.g., Copilot, Cursor).

Posted 3 weeks ago

Apply

7.0 years

0 Lacs

Hyderabad, Telangana, India

On-site

Backend Developer (Python)
Experience: 7+ years
Location: Hyderabad, Bangalore & Chennai
Shift Timings: 2 PM to 11 PM

Mandatory Programming Skills
• Experience in Python development
• API endpoint creation (Flask / Django / FastAPI); a brief illustrative sketch follows this listing
• NumPy & Pandas
• S3 / AWS
• Postgres DB / Mongo DB basics

Good to have
• asyncio
• Logging libraries & Datadog
• Any API management tool (Postman / MuleSoft)
• gRPC

Other requirements
• Good communication
• Strong problem-solving aptitude
• Analytical mind and great business sense
• A degree in Computer Science, Engineering, or a relevant field is preferred
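As a small illustration of the API endpoint creation and Pandas skills listed above, here is a minimal hedged sketch of a Flask endpoint that returns summary statistics for posted JSON records; the route and payload shape are hypothetical, not part of the posting.

```python
# Minimal sketch: a Flask endpoint that accepts a JSON list of records and
# returns per-column summary statistics computed with pandas.
# The route name and expected payload shape are illustrative assumptions.
from flask import Flask, jsonify, request
import pandas as pd

app = Flask(__name__)

@app.route("/api/v1/summary", methods=["POST"])
def summarize():
    records = request.get_json(silent=True)
    if not isinstance(records, list) or not records:
        return jsonify({"error": "expected a non-empty JSON list of records"}), 400

    frame = pd.DataFrame(records)
    numeric = frame.select_dtypes("number")
    summary = {
        "rows": len(frame),
        "columns": list(frame.columns),
        "means": numeric.mean().round(3).to_dict(),
    }
    return jsonify(summary), 200

if __name__ == "__main__":
    # Development server only; a production deployment would sit behind a WSGI server.
    app.run(debug=True)
```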

Posted 3 weeks ago

Apply

3.0 - 8.0 years

5 - 10 Lacs

Pune

Work from Office

Sarvaha would like to welcome a skilled Observability Engineer with a minimum of 3 years of experience to contribute to designing, deploying, and scaling our monitoring and logging infrastructure on Kubernetes. In this role, you will play a key part in enabling end-to-end visibility across cloud environments by processing petabyte-scale data, helping teams enhance reliability, detect anomalies early, and drive operational excellence.

What You'll Do
- Configure and manage observability agents across AWS, Azure & GCP
- Use IaC techniques and tools such as Terraform, Helm & GitOps to automate deployment of the observability stack
- Work with different language stacks such as Java, Ruby, Python, and Go
- Instrument services using OpenTelemetry and integrate telemetry pipelines (a brief illustrative sketch follows this listing)
- Optimize telemetry metrics storage using time-series databases such as Mimir & NoSQL DBs
- Create dashboards, set up alerts, and track SLIs/SLOs
- Enable RCA and incident response using observability data
- Secure the observability pipeline

You Bring
- BE/BTech/MTech (CS/IT or MCA), with an emphasis on Software Engineering
- Strong skills in reading and interpreting logs, metrics, and traces
- Proficiency with the LGTM stack (Loki, Grafana, Tempo, Mimir) or similar tools such as Jaeger, Datadog, Zipkin, InfluxDB, etc.
- Familiarity with log frameworks such as log4j, lograge, Zerolog, loguru, etc.
- Knowledge of OpenTelemetry, IaC, and security best practices
- Clear documentation of observability processes, logging standards & instrumentation guidelines
- Ability to proactively identify, debug, and resolve issues using observability data
- Focus on maintaining data quality and integrity across the observability pipeline
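To make the OpenTelemetry instrumentation responsibility above concrete, here is a minimal hedged sketch in Python that sets up a tracer and wraps one operation in a span; the service name and attributes are illustrative, and a real pipeline would export to a collector rather than the console.

```python
# Minimal sketch: manual OpenTelemetry tracing for one operation.
# Exports spans to the console for demonstration; a real deployment would
# use an OTLP exporter pointed at a collector (an assumption, not Sarvaha's setup).
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

provider = TracerProvider(resource=Resource.create({"service.name": "demo-service"}))
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)

def handle_order(order_id: str) -> None:
    # Each request becomes a span; attributes make it searchable in the tracing backend.
    with tracer.start_as_current_span("handle_order") as span:
        span.set_attribute("order.id", order_id)
        # ... business logic would run here ...

if __name__ == "__main__":
    handle_order("ord-123")
```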

Posted 3 weeks ago

Apply

5.0 years

0 Lacs

Pune, Maharashtra, India

Remote

As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed — we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you.

About The Team
The Customer Success Automation team offers opportunities to create a meaningful impact and expand your skill set through a variety of experiences building Customer Success workflows, applications and infrastructure within CrowdStrike’s Falcon platform and its ecosystem. The Automation team is a strategic enabler for the Customer Success business (Technical Support, Technical Account Management and Provisioning teams). As part of this team, you will have the opportunity to be a trailblazer leading with technology as a solution to accelerate the digital transformation and self-service capabilities for both CrowdStrike’s customers and for CrowdStrikers (internal customers).

About The Role
As an Automation Engineer, you will perform a hybrid role of Software Engineer / DevOps Engineer within our engineering team and help build tools, integrations and infrastructure that enable our Customer Success teams to deliver world-class Technical Support and Customer Success at scale.

What You’ll Do
- Work as a Go and Python software engineer on the Customer Success Automation team
- Design and implement Kubernetes clusters in AWS for scalable and resilient application infrastructure
- Design, implement, deploy and maintain scalable solutions (applications, microservices and infrastructure) supporting the business needs of our Technical Support and Customer Success teams
- Design, implement and maintain integration flows using an array of different platforms, APIs, databases, protocols and data formats
- Integrate UI/UX frameworks for custom-built tools
- Handle system integration to ensure the tools are well integrated within the Customer Success ecosystem
- Operate CI/CD pipelines for the Support Automation team
- Collaborate with technical peers within the Customer Success, IT and Product teams
- Train and enable teams within the CrowdStrike Customer Success organisation on the various tools and technologies built and deployed by the Support Automation team

What You’ll Need
- 5+ years of solid hands-on experience as a software engineer in production-grade projects, with proficiency in Go (Golang) and Python
- Basic UI development, with the ability to integrate an existing UI/UX framework (any frontend stack)
- Experience working with CI/CD (Jenkins, GitLab, Bitbucket or similar CI/CD platforms)
- Experience with Kubernetes deployments in AWS (EKS or self-managed clusters)
- Familiarity with cloud infrastructure components (VPCs, IAM, EC2, RDS, etc.) in AWS
- Strong understanding of containerization technologies (Docker) and orchestration
- Strong problem-solving, team collaboration and communication skills, and the ability to thrive in fast-paced environments

Bonus Points
- Experience with infrastructure as code (IaC) practices and tools
- Full-stack development experience will be a plus for this role
- Prior Technical Support / Customer Success tool development experience
- Experience building web services with data processing pipelines
- Frontend experience (Ember, React, Bootstrap)
- Hands-on experience with Python libraries for text data and log parsing, and Python frameworks (Django / FastAPI experience preferred)
- Strong working knowledge of SIEM / log analysis tools (LogScale, Logstash, Datadog, Dynatrace or Splunk)
- Integration with Customer Success platforms such as Salesforce Service Cloud and Gainsight Customer Success
- Strong understanding of the day-to-day operations and challenges of enterprise software / SaaS Customer Success and Technical Support functions

Benefits Of Working At CrowdStrike
- Remote-friendly and flexible work culture
- Market leader in compensation and equity awards
- Comprehensive physical and mental wellness programs
- Competitive vacation and holidays for recharge
- Paid parental and adoption leaves
- Professional development opportunities for all employees regardless of level or role
- Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections
- Vibrant office culture with world-class amenities
- Great Place to Work Certified™ across the globe

CrowdStrike is proud to be an equal opportunity employer. We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program. CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment. The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy-related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law. We base all employment decisions--including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay-offs, return from lay-off, terminations and social/recreational programs--on valid job requirements.

If you need assistance accessing or reviewing the information on this website or need help submitting an application for employment or requesting an accommodation, please contact us at recruiting@crowdstrike.com for further assistance.

Posted 3 weeks ago

Apply

5.0 years

0 Lacs

Mumbai Metropolitan Region

Remote

As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn’t changed — we’re here to stop breaches, and we’ve redefined modern security with the world’s most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward. We’re also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We’re always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters? The future of cybersecurity starts with you.

About The Team
The Customer Success Automation team offers opportunities to create a meaningful impact and expand your skill set through a variety of experiences building Customer Success workflows, applications and infrastructure within CrowdStrike’s Falcon platform and its ecosystem. The Automation team is a strategic enabler for the Customer Success business (Technical Support, Technical Account Management and Provisioning teams). As part of this team, you will have the opportunity to be a trailblazer leading with technology as a solution to accelerate the digital transformation and self-service capabilities for both CrowdStrike’s customers and for CrowdStrikers (internal customers).

About The Role
As an Automation Engineer, you will perform a hybrid role of Software Engineer / DevOps Engineer within our engineering team and help build tools, integrations and infrastructure that enable our Customer Success teams to deliver world-class Technical Support and Customer Success at scale.

What You’ll Do
- Work as a Go and Python software engineer on the Customer Success Automation team
- Design and implement Kubernetes clusters in AWS for scalable and resilient application infrastructure
- Design, implement, deploy and maintain scalable solutions (applications, microservices and infrastructure) supporting the business needs of our Technical Support and Customer Success teams
- Design, implement and maintain integration flows using an array of different platforms, APIs, databases, protocols and data formats
- Integrate UI/UX frameworks for custom-built tools
- Handle system integration to ensure the tools are well integrated within the Customer Success ecosystem
- Operate CI/CD pipelines for the Support Automation team
- Collaborate with technical peers within the Customer Success, IT and Product teams
- Train and enable teams within the CrowdStrike Customer Success organisation on the various tools and technologies built and deployed by the Support Automation team

What You’ll Need
- 5+ years of solid hands-on experience as a software engineer in production-grade projects, with proficiency in Go (Golang) and Python
- Basic UI development, with the ability to integrate an existing UI/UX framework (any frontend stack)
- Experience working with CI/CD (Jenkins, GitLab, Bitbucket or similar CI/CD platforms)
- Experience with Kubernetes deployments in AWS (EKS or self-managed clusters)
- Familiarity with cloud infrastructure components (VPCs, IAM, EC2, RDS, etc.) in AWS
- Strong understanding of containerization technologies (Docker) and orchestration
- Strong problem-solving, team collaboration and communication skills, and the ability to thrive in fast-paced environments

Bonus Points
- Experience with infrastructure as code (IaC) practices and tools
- Full-stack development experience will be a plus for this role
- Prior Technical Support / Customer Success tool development experience
- Experience building web services with data processing pipelines
- Frontend experience (Ember, React, Bootstrap)
- Hands-on experience with Python libraries for text data and log parsing, and Python frameworks (Django / FastAPI experience preferred)
- Strong working knowledge of SIEM / log analysis tools (LogScale, Logstash, Datadog, Dynatrace or Splunk)
- Integration with Customer Success platforms such as Salesforce Service Cloud and Gainsight Customer Success
- Strong understanding of the day-to-day operations and challenges of enterprise software / SaaS Customer Success and Technical Support functions

Benefits Of Working At CrowdStrike
- Remote-friendly and flexible work culture
- Market leader in compensation and equity awards
- Comprehensive physical and mental wellness programs
- Competitive vacation and holidays for recharge
- Paid parental and adoption leaves
- Professional development opportunities for all employees regardless of level or role
- Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections
- Vibrant office culture with world-class amenities
- Great Place to Work Certified™ across the globe

CrowdStrike is proud to be an equal opportunity employer. We are committed to fostering a culture of belonging where everyone is valued for who they are and empowered to succeed. We support veterans and individuals with disabilities through our affirmative action program. CrowdStrike is committed to providing equal employment opportunity for all employees and applicants for employment. The Company does not discriminate in employment opportunities or practices on the basis of race, color, creed, ethnicity, religion, sex (including pregnancy or pregnancy-related medical conditions), sexual orientation, gender identity, marital or family status, veteran status, age, national origin, ancestry, physical disability (including HIV and AIDS), mental disability, medical condition, genetic information, membership or activity in a local human rights commission, status with regard to public assistance, or any other characteristic protected by law. We base all employment decisions--including recruitment, selection, training, compensation, benefits, discipline, promotions, transfers, lay-offs, return from lay-off, terminations and social/recreational programs--on valid job requirements.

If you need assistance accessing or reviewing the information on this website or need help submitting an application for employment or requesting an accommodation, please contact us at recruiting@crowdstrike.com for further assistance.

Posted 3 weeks ago

Apply

0 years

0 Lacs

Bengaluru, Karnataka, India

On-site

Role Designation: Technology Lead
Job Location: Bangalore

Key Responsibilities
- Work as a techno-functional SME to provide technical troubleshooting and product support for customers using our products.
- Take ownership of user problems and be proactive when dealing with user issues.
- Lead day-to-day production support for applications running in GCP (or any other cloud service) and Kubernetes environments.
- Monitor application health, respond to incidents, and ensure timely resolution within SLAs.
- Act as the primary escalation point for high-severity incidents and coordinate with engineering and cloud teams.
- Drive root cause analysis (RCA), post-incident reviews, and long-term problem resolution.
- Oversee implementation and maintenance of monitoring, alerting, and logging tools.
- Maintain operational documentation, runbooks, and knowledge base for support teams.
- Implement automation tools to streamline operations and reduce the frequency of errors.
- Mentor and guide L1/L2 support teams and ensure effective knowledge transfer.

Technical Skills
1. Java & Application Layer
- Strong knowledge of Java/J2EE applications and microservices architecture
- Familiarity with REST APIs
- Experience with application performance troubleshooting and profiling
2. Kubernetes (K8s)
- Understanding of Kubernetes objects: Deployments, Pods, Services, ConfigMaps, Secrets, etc.
- Hands-on with kubectl, Kubernetes troubleshooting, and log analysis (a brief illustrative sketch follows this listing)
3. GCP (Google Cloud Platform)
- Experience with key GCP services: GKE (Google Kubernetes Engine), Cloud Logging, Cloud Monitoring, Cloud SQL
4. Monitoring & Observability
- Tools: Datadog or any other similar tool
- Proficient in setting up alerts, dashboards, and log aggregation
- Root cause analysis from logs and metrics
5. Incident & Problem Management
- Strong skills in triaging production incidents and leading RCA efforts
- Familiar with ITIL practices (especially Incident, Change, and Problem Management)
- Tools: Jira or any other similar tool

Soft Skills & Leadership
1. Team Leadership
- Leading L1/L2/L3 support teams
- Incident bridge management and escalations handling
- Mentoring and upskilling junior team members
2. Stakeholder Communication
- Effective communication with developers, QA, DevOps, and business stakeholders
- Post-incident reporting and communication
3. Operational Excellence
- SLAs, SLOs, error budgets, service reliability
- Defining support processes, runbooks, and knowledge bases
4. Change Management
- Coordinating deployments, patches, hotfixes
- Managing go-live and rollback plans

Nice to Have
- Knowledge of SRE practices
- Familiarity with security & compliance in cloud environments (e.g., vulnerability scanning, IAM best practices)
- Automation using Python, Bash, or other scripting languages
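As an illustration of the Kubernetes troubleshooting skills described above, here is a minimal hedged sketch using the official Kubernetes Python client to flag pods that are not running or whose containers keep restarting; the restart threshold is an arbitrary example, and day-to-day triage would more often use kubectl directly.

```python
# Minimal sketch: flag pods that are not Running/Succeeded or whose containers
# have high restart counts. Assumes a kubeconfig is available (e.g., ~/.kube/config);
# the restart threshold below is an arbitrary illustrative value.
from kubernetes import client, config

RESTART_THRESHOLD = 5

def unhealthy_pods():
    config.load_kube_config()  # use config.load_incluster_config() inside a cluster
    v1 = client.CoreV1Api()
    findings = []
    for pod in v1.list_pod_for_all_namespaces(watch=False).items:
        name = f"{pod.metadata.namespace}/{pod.metadata.name}"
        if pod.status.phase not in ("Running", "Succeeded"):
            findings.append((name, f"phase={pod.status.phase}"))
            continue
        for cs in pod.status.container_statuses or []:
            if cs.restart_count >= RESTART_THRESHOLD:
                findings.append((name, f"{cs.name} restarted {cs.restart_count} times"))
    return findings

if __name__ == "__main__":
    for pod_name, reason in unhealthy_pods():
        print(f"CHECK {pod_name}: {reason}")
```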

Posted 3 weeks ago

Apply

2.0 - 4.0 years

0 Lacs

Bengaluru, Karnataka, India

On-site

Infrastructure Support
Experience: 2 to 4 years
Location: Bangalore
5 days WFO (24/7 rotational shifts)

Job Description: Technical Skills
- You have hands-on experience in using CI/CD tools such as Jenkins, CircleCI or GitLab for executing deployments.
- You have knowledge of Infrastructure as Code (IaC) tech stacks such as Terraform, Ansible, ARM or CloudFormation to provision and manage infrastructure.
- You have working experience in using observability tools for logging, monitoring, tracing and alerting, e.g. Datadog, Prometheus/Grafana, ELK/EFK/Splunk.
- You have experience in supporting at least one public cloud, e.g. AWS, Azure or GCP.
- You have hands-on experience executing the most common operations for managing workloads on any container ecosystem tech stack, e.g. Docker, Kubernetes, OpenShift, etc.
- You understand system performance tuning and scaling to handle common heavy-load scenarios, along with concepts of highly available systems and the basics of disaster recovery solutions, and are familiar with failover, backup and recovery concepts.
- You have experience operating a Linux OS such as RHEL or a Debian-based OS and are familiar with the most common Linux operations and commands, reading and tweaking Bash scripts, and managing runtime environment configurations such as environment variables, logs, etc.
- You have experience supporting backend storage solutions such as SQL and NoSQL databases, e.g. Postgres and MongoDB, and caching solutions such as Redis and Memcached.
- You have experience in networking configuration and security, and are familiar with common networking setups and security practices, e.g. load balancing, proxies, transport layer security (TLS) and certificate management, and have an understanding of standard network protocols and configurations.
- You have a good understanding of fundamental API concepts such as request, response, headers, authentication, and JSON payloads (a brief illustrative sketch follows this listing).

Other things to know
Learning & Development: There is no one-size-fits-all career path at Thoughtworks: however you want to develop your career is entirely up to you. But we also balance autonomy with the strength of our cultivation culture. This means your career is supported by interactive tools, numerous development programs and teammates who want to help you grow. We see value in helping each other be our best, and that extends to empowering our employees in their career journeys.
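To make the API fundamentals above concrete, here is a minimal hedged sketch in Python using the requests library: it sends an authenticated JSON request and inspects the response status, headers, and body. The endpoint URL and token are placeholders, not a real service.

```python
# Minimal sketch: an authenticated HTTP request with a JSON payload.
# The endpoint URL and bearer token are placeholders, not a real service.
import requests

API_URL = "https://api.example.com/v1/incidents"   # placeholder endpoint
TOKEN = "REPLACE_ME"                               # placeholder credential

def create_incident(title: str, severity: str) -> dict:
    response = requests.post(
        API_URL,
        headers={
            "Authorization": f"Bearer {TOKEN}",       # authentication header
            "Content-Type": "application/json",
        },
        json={"title": title, "severity": severity},  # JSON payload
        timeout=10,
    )
    # Status code and headers tell you whether and how the request succeeded.
    print("status:", response.status_code)
    print("content-type:", response.headers.get("Content-Type"))
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    print(create_incident("Disk usage above 90% on db-01", "high"))
```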

Posted 3 weeks ago

Apply

0 years

0 Lacs

India

On-site

Job description
Company Description
Evallo is a leading provider of a comprehensive SaaS platform for tutors and tutoring businesses, revolutionizing education management. With features like advanced CRM, profile management, standardized test prep, automatic grading, and insightful dashboards, we empower educators to focus on teaching. We're dedicated to pushing the boundaries of ed-tech and redefining efficient education management.

Why this role matters
Evallo is scaling from a focused tutoring platform to a modular operating system for all service businesses that bill by the hour. As we add payroll, proposals, whiteboarding, and AI tooling, we need a Solution Architect who can translate product vision into a robust, extensible technical blueprint. You’ll be the critical bridge between product, engineering, and customers, owning architecture decisions that keep us reliable at 5k+ concurrent users and cost-efficient at 100k+ total users.

Outcomes we expect
- Map the current backend and frontend, flag structural debt, and publish an Architecture Gap Report.
- Define naming and layering conventions, linter/formatter rules, and a lightweight ADR process.
- Ship reference architecture for new modules.
- Lead cross-team design reviews; no major feature ships without architecture sign-off.
- The eventual goal is to have Evallo run in a fully observable, autoscaling environment with less than 10% infra cost waste; monitoring dashboards should trigger fewer than 5 false positives per month.

Day-to-day
- Solution Design: Break down product epics into service contracts, data flows, and sequence diagrams. Choose the right patterns—monolith vs. microservice, event vs. REST, cache vs. DB index—based on cost, team maturity, and scale targets.
- Platform-wide Standards: Codify review checklists (security, performance, observability) and enforce them via GitHub templates and CI gates. Champion a shift-left testing mindset; critical paths reach 80% automated coverage before QA touches them.
- Scalability & Cost Optimization: Design load-testing scenarios that mimic 5k concurrent tutoring sessions (a brief illustrative sketch follows this listing); guide DevOps on autoscaling policies and CDN strategy. Audit infra spend monthly; recommend serverless, queuing, or data-tier changes to cut waste.
- Release & Environment Strategy: Maintain clear promotion paths: local → sandbox → staging → prod with one-click rollback. Own schema-migration playbooks; zero-downtime releases are the default, not the exception.
- Technical Mentorship: Run fortnightly architecture clinics; level up engineers on domain-driven design and performance profiling. Act as tie-breaker on competing technical proposals, keeping debates respectful and evidence-based.

Qualifications
- 5+ years of engineering experience, with 2+ years in a dedicated architecture or staff-level role on a high-traffic SaaS product.
- Proven track record designing multi-tenant systems that scaled beyond 50k users or 1k RPM.
- Deep knowledge of Node.js / TypeScript (our core stack), MongoDB or similar NoSQL, plus comfort with event brokers (Kafka, NATS, or RabbitMQ).
- Fluency in AWS (preferred) or GCP primitives—EKS, Lambda, RDS, CloudFront, IAM.
- Hands-on with observability stacks (Datadog, New Relic, Sentry, or OpenTelemetry).
- Excellent written communication; you can distill technical trade-offs in one page for execs and in one diagram for engineers.
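As one possible way to express the load-testing scenarios mentioned above, here is a minimal hedged sketch using Locust; the choice of tool, the endpoints, and the payloads are our own illustrative assumptions, not something the posting specifies.

```python
# Minimal sketch of a Locust load test simulating tutoring-session traffic.
# Endpoints and payloads are placeholders; run with, for example:
#   locust -f locustfile.py --host https://staging.example.com --users 5000 --spawn-rate 50
from locust import HttpUser, task, between

class TutoringSessionUser(HttpUser):
    # Each simulated user waits 1-5 seconds between actions.
    wait_time = between(1, 5)

    @task(3)
    def view_dashboard(self):
        # Weighted 3x: dashboards are read far more often than sessions are booked.
        self.client.get("/api/dashboard")

    @task(1)
    def book_session(self):
        self.client.post(
            "/api/sessions",
            json={"tutorId": "tutor-123", "slot": "2025-01-15T10:00:00Z"},
        )
```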

Posted 3 weeks ago

Apply

8.0 - 12.0 years

0 Lacs

Chennai, Tamil Nadu, India

On-site

TCS Hiring for Observability Tools Tech Lead_PAN India
Experience: 8 to 12 Years Only
Job Location: PAN India

Required Technical Skill Set

Core Responsibilities:
- Designing and Implementing Observability Solutions: selecting, configuring, and deploying tools and platforms for collecting, processing, and analyzing telemetry data (logs, metrics, traces).
- Developing and Maintaining Monitoring and Alerting Systems: creating dashboards, setting up alerts based on key performance indicators (KPIs), and ensuring timely notification of issues.
- Instrumenting Applications and Infrastructure: working with development teams to add instrumentation code to applications to generate meaningful telemetry data, often using open standards like OpenTelemetry.
- Analyzing and Troubleshooting System Performance: investigating performance bottlenecks, identifying root causes of issues, and collaborating with development teams to resolve them.
- Defining and Tracking Service Level Objectives (SLOs) and Service Level Indicators (SLIs): working with stakeholders to define acceptable levels of performance and reliability and tracking these metrics.
- Improving Incident Response and Post-Mortem Processes: using observability data to understand incidents, identify contributing factors, and implement preventative measures.
- Collaborating with Development, Operations, and SRE Teams: working closely with other teams to ensure observability practices are integrated throughout the software development lifecycle.
- Educating and Mentoring Teams on Observability Best Practices: promoting a culture of observability within the organization.
- Managing and Optimizing Observability Infrastructure Costs: ensuring the cost-effectiveness of observability tools and platforms.
- Staying Up to Date with Observability Trends and Technologies: continuously learning about new tools, techniques, and best practices.

Key Skills:
- Strong Understanding of Observability Principles: deep knowledge of logs, metrics, and traces and how they contribute to understanding system behavior.
- Proficiency with Observability Tools and Platforms, e.g.:
  - Logging: Elasticsearch, Splunk, Fluentd, Logstash, etc.
  - Metrics: Prometheus, Grafana, InfluxDB, Graphite, etc.
  - Tracing: OpenTelemetry, Datadog APM, etc.
  - APM (Application Performance Monitoring): Datadog, New Relic, AppDynamics, etc.
- Programming and Scripting Skills: proficiency in languages like Python, Go, Java, or scripting languages like Bash for automation and tool integration.
- Experience with Cloud Platforms: familiarity with cloud providers like AWS, Azure, or GCP and their monitoring and logging services.
- Understanding of Distributed Systems: knowledge of how distributed systems work and the challenges of monitoring and troubleshooting them.
- Troubleshooting and Problem-Solving Skills: strong analytical skills to identify and resolve complex issues.
- Communication and Collaboration Skills: ability to effectively communicate technical concepts to different audiences and work collaboratively with other teams.
- Knowledge of DevOps and SRE Practices: understanding of continuous integration/continuous delivery (CI/CD), infrastructure as code, and site reliability engineering principles.
- Data Analysis and Visualization Skills: ability to analyze telemetry data and create meaningful dashboards and reports.
- Experience with Containerization and Orchestration: familiarity with Docker, Kubernetes, and related technologies.
Kind Regards, Priyankha M

Posted 3 weeks ago

Apply

8.0 - 12.0 years

0 Lacs

Kolkata, West Bengal, India

On-site

TCS Hiring for Observability Tools Tech Lead_PAN India
Experience: 8 to 12 Years Only
Job Location: PAN India

Required Technical Skill Set

Core Responsibilities:
- Designing and Implementing Observability Solutions: selecting, configuring, and deploying tools and platforms for collecting, processing, and analyzing telemetry data (logs, metrics, traces).
- Developing and Maintaining Monitoring and Alerting Systems: creating dashboards, setting up alerts based on key performance indicators (KPIs), and ensuring timely notification of issues.
- Instrumenting Applications and Infrastructure: working with development teams to add instrumentation code to applications to generate meaningful telemetry data, often using open standards like OpenTelemetry.
- Analyzing and Troubleshooting System Performance: investigating performance bottlenecks, identifying root causes of issues, and collaborating with development teams to resolve them.
- Defining and Tracking Service Level Objectives (SLOs) and Service Level Indicators (SLIs): working with stakeholders to define acceptable levels of performance and reliability and tracking these metrics.
- Improving Incident Response and Post-Mortem Processes: using observability data to understand incidents, identify contributing factors, and implement preventative measures.
- Collaborating with Development, Operations, and SRE Teams: working closely with other teams to ensure observability practices are integrated throughout the software development lifecycle.
- Educating and Mentoring Teams on Observability Best Practices: promoting a culture of observability within the organization.
- Managing and Optimizing Observability Infrastructure Costs: ensuring the cost-effectiveness of observability tools and platforms.
- Staying Up to Date with Observability Trends and Technologies: continuously learning about new tools, techniques, and best practices.

Key Skills:
- Strong Understanding of Observability Principles: deep knowledge of logs, metrics, and traces and how they contribute to understanding system behavior.
- Proficiency with Observability Tools and Platforms, e.g.:
  - Logging: Elasticsearch, Splunk, Fluentd, Logstash, etc.
  - Metrics: Prometheus, Grafana, InfluxDB, Graphite, etc.
  - Tracing: OpenTelemetry, Datadog APM, etc.
  - APM (Application Performance Monitoring): Datadog, New Relic, AppDynamics, etc.
- Programming and Scripting Skills: proficiency in languages like Python, Go, Java, or scripting languages like Bash for automation and tool integration.
- Experience with Cloud Platforms: familiarity with cloud providers like AWS, Azure, or GCP and their monitoring and logging services.
- Understanding of Distributed Systems: knowledge of how distributed systems work and the challenges of monitoring and troubleshooting them.
- Troubleshooting and Problem-Solving Skills: strong analytical skills to identify and resolve complex issues.
- Communication and Collaboration Skills: ability to effectively communicate technical concepts to different audiences and work collaboratively with other teams.
- Knowledge of DevOps and SRE Practices: understanding of continuous integration/continuous delivery (CI/CD), infrastructure as code, and site reliability engineering principles.
- Data Analysis and Visualization Skills: ability to analyze telemetry data and create meaningful dashboards and reports.
- Experience with Containerization and Orchestration: familiarity with Docker, Kubernetes, and related technologies.
Kind Regards, Priyankha M

Posted 3 weeks ago

Apply

8.0 - 12.0 years

0 Lacs

Hyderabad, Telangana, India

On-site

TCS Hiring for Observability Tools Tech Lead_PAN India
Experience: 8 to 12 Years Only
Job Location: PAN India

Required Technical Skill Set

Core Responsibilities:
- Designing and Implementing Observability Solutions: selecting, configuring, and deploying tools and platforms for collecting, processing, and analyzing telemetry data (logs, metrics, traces).
- Developing and Maintaining Monitoring and Alerting Systems: creating dashboards, setting up alerts based on key performance indicators (KPIs), and ensuring timely notification of issues.
- Instrumenting Applications and Infrastructure: working with development teams to add instrumentation code to applications to generate meaningful telemetry data, often using open standards like OpenTelemetry.
- Analyzing and Troubleshooting System Performance: investigating performance bottlenecks, identifying root causes of issues, and collaborating with development teams to resolve them.
- Defining and Tracking Service Level Objectives (SLOs) and Service Level Indicators (SLIs): working with stakeholders to define acceptable levels of performance and reliability and tracking these metrics.
- Improving Incident Response and Post-Mortem Processes: using observability data to understand incidents, identify contributing factors, and implement preventative measures.
- Collaborating with Development, Operations, and SRE Teams: working closely with other teams to ensure observability practices are integrated throughout the software development lifecycle.
- Educating and Mentoring Teams on Observability Best Practices: promoting a culture of observability within the organization.
- Managing and Optimizing Observability Infrastructure Costs: ensuring the cost-effectiveness of observability tools and platforms.
- Staying Up to Date with Observability Trends and Technologies: continuously learning about new tools, techniques, and best practices.

Key Skills:
- Strong Understanding of Observability Principles: deep knowledge of logs, metrics, and traces and how they contribute to understanding system behavior.
- Proficiency with Observability Tools and Platforms, e.g.:
  - Logging: Elasticsearch, Splunk, Fluentd, Logstash, etc.
  - Metrics: Prometheus, Grafana, InfluxDB, Graphite, etc.
  - Tracing: OpenTelemetry, Datadog APM, etc.
  - APM (Application Performance Monitoring): Datadog, New Relic, AppDynamics, etc.
- Programming and Scripting Skills: proficiency in languages like Python, Go, Java, or scripting languages like Bash for automation and tool integration.
- Experience with Cloud Platforms: familiarity with cloud providers like AWS, Azure, or GCP and their monitoring and logging services.
- Understanding of Distributed Systems: knowledge of how distributed systems work and the challenges of monitoring and troubleshooting them.
- Troubleshooting and Problem-Solving Skills: strong analytical skills to identify and resolve complex issues.
- Communication and Collaboration Skills: ability to effectively communicate technical concepts to different audiences and work collaboratively with other teams.
- Knowledge of DevOps and SRE Practices: understanding of continuous integration/continuous delivery (CI/CD), infrastructure as code, and site reliability engineering principles.
- Data Analysis and Visualization Skills: ability to analyze telemetry data and create meaningful dashboards and reports.
- Experience with Containerization and Orchestration: familiarity with Docker, Kubernetes, and related technologies.
Kind Regards, Priyankha M

Posted 3 weeks ago

Apply

8.0 - 12.0 years

0 Lacs

Kochi, Kerala, India

On-site

TCS Hiring for Observability Tools Tech Lead_PAN India
Experience: 8 to 12 Years Only
Job Location: PAN India

Required Technical Skill Set

Core Responsibilities:
- Designing and Implementing Observability Solutions: selecting, configuring, and deploying tools and platforms for collecting, processing, and analyzing telemetry data (logs, metrics, traces).
- Developing and Maintaining Monitoring and Alerting Systems: creating dashboards, setting up alerts based on key performance indicators (KPIs), and ensuring timely notification of issues.
- Instrumenting Applications and Infrastructure: working with development teams to add instrumentation code to applications to generate meaningful telemetry data, often using open standards like OpenTelemetry.
- Analyzing and Troubleshooting System Performance: investigating performance bottlenecks, identifying root causes of issues, and collaborating with development teams to resolve them.
- Defining and Tracking Service Level Objectives (SLOs) and Service Level Indicators (SLIs): working with stakeholders to define acceptable levels of performance and reliability and tracking these metrics.
- Improving Incident Response and Post-Mortem Processes: using observability data to understand incidents, identify contributing factors, and implement preventative measures.
- Collaborating with Development, Operations, and SRE Teams: working closely with other teams to ensure observability practices are integrated throughout the software development lifecycle.
- Educating and Mentoring Teams on Observability Best Practices: promoting a culture of observability within the organization.
- Managing and Optimizing Observability Infrastructure Costs: ensuring the cost-effectiveness of observability tools and platforms.
- Staying Up to Date with Observability Trends and Technologies: continuously learning about new tools, techniques, and best practices.

Key Skills:
- Strong Understanding of Observability Principles: deep knowledge of logs, metrics, and traces and how they contribute to understanding system behavior.
- Proficiency with Observability Tools and Platforms, e.g.:
  - Logging: Elasticsearch, Splunk, Fluentd, Logstash, etc.
  - Metrics: Prometheus, Grafana, InfluxDB, Graphite, etc.
  - Tracing: OpenTelemetry, Datadog APM, etc.
  - APM (Application Performance Monitoring): Datadog, New Relic, AppDynamics, etc.
- Programming and Scripting Skills: proficiency in languages like Python, Go, Java, or scripting languages like Bash for automation and tool integration.
- Experience with Cloud Platforms: familiarity with cloud providers like AWS, Azure, or GCP and their monitoring and logging services.
- Understanding of Distributed Systems: knowledge of how distributed systems work and the challenges of monitoring and troubleshooting them.
- Troubleshooting and Problem-Solving Skills: strong analytical skills to identify and resolve complex issues.
- Communication and Collaboration Skills: ability to effectively communicate technical concepts to different audiences and work collaboratively with other teams.
- Knowledge of DevOps and SRE Practices: understanding of continuous integration/continuous delivery (CI/CD), infrastructure as code, and site reliability engineering principles.
- Data Analysis and Visualization Skills: ability to analyze telemetry data and create meaningful dashboards and reports.
- Experience with Containerization and Orchestration: familiarity with Docker, Kubernetes, and related technologies.
Kind Regards, Priyankha M

Posted 3 weeks ago

Apply

8.0 - 12.0 years

0 Lacs

Pune, Maharashtra, India

On-site

TCS Hiring for Observability Tools Tech Lead_PAN India
Experience: 8 to 12 Years Only
Job Location: PAN India

Required Technical Skill Set

Core Responsibilities:
- Designing and Implementing Observability Solutions: selecting, configuring, and deploying tools and platforms for collecting, processing, and analyzing telemetry data (logs, metrics, traces).
- Developing and Maintaining Monitoring and Alerting Systems: creating dashboards, setting up alerts based on key performance indicators (KPIs), and ensuring timely notification of issues.
- Instrumenting Applications and Infrastructure: working with development teams to add instrumentation code to applications to generate meaningful telemetry data, often using open standards like OpenTelemetry.
- Analyzing and Troubleshooting System Performance: investigating performance bottlenecks, identifying root causes of issues, and collaborating with development teams to resolve them.
- Defining and Tracking Service Level Objectives (SLOs) and Service Level Indicators (SLIs): working with stakeholders to define acceptable levels of performance and reliability and tracking these metrics.
- Improving Incident Response and Post-Mortem Processes: using observability data to understand incidents, identify contributing factors, and implement preventative measures.
- Collaborating with Development, Operations, and SRE Teams: working closely with other teams to ensure observability practices are integrated throughout the software development lifecycle.
- Educating and Mentoring Teams on Observability Best Practices: promoting a culture of observability within the organization.
- Managing and Optimizing Observability Infrastructure Costs: ensuring the cost-effectiveness of observability tools and platforms.
- Staying Up to Date with Observability Trends and Technologies: continuously learning about new tools, techniques, and best practices.

Key Skills:
- Strong Understanding of Observability Principles: deep knowledge of logs, metrics, and traces and how they contribute to understanding system behavior.
- Proficiency with Observability Tools and Platforms, e.g.:
  - Logging: Elasticsearch, Splunk, Fluentd, Logstash, etc.
  - Metrics: Prometheus, Grafana, InfluxDB, Graphite, etc.
  - Tracing: OpenTelemetry, Datadog APM, etc.
  - APM (Application Performance Monitoring): Datadog, New Relic, AppDynamics, etc.
- Programming and Scripting Skills: proficiency in languages like Python, Go, Java, or scripting languages like Bash for automation and tool integration.
- Experience with Cloud Platforms: familiarity with cloud providers like AWS, Azure, or GCP and their monitoring and logging services.
- Understanding of Distributed Systems: knowledge of how distributed systems work and the challenges of monitoring and troubleshooting them.
- Troubleshooting and Problem-Solving Skills: strong analytical skills to identify and resolve complex issues.
- Communication and Collaboration Skills: ability to effectively communicate technical concepts to different audiences and work collaboratively with other teams.
- Knowledge of DevOps and SRE Practices: understanding of continuous integration/continuous delivery (CI/CD), infrastructure as code, and site reliability engineering principles.
- Data Analysis and Visualization Skills: ability to analyze telemetry data and create meaningful dashboards and reports.
- Experience with Containerization and Orchestration: familiarity with Docker, Kubernetes, and related technologies.
Kind Regards, Priyankha M

Posted 3 weeks ago

Apply

8.0 - 12.0 years

0 Lacs

Noida, Uttar Pradesh, India

On-site

TCS Hiring for Observability Tools Tech Lead_PAN India
Experience: 8 to 12 Years Only
Job Location: PAN India

Required Technical Skill Set

Core Responsibilities:
- Designing and Implementing Observability Solutions: selecting, configuring, and deploying tools and platforms for collecting, processing, and analyzing telemetry data (logs, metrics, traces).
- Developing and Maintaining Monitoring and Alerting Systems: creating dashboards, setting up alerts based on key performance indicators (KPIs), and ensuring timely notification of issues.
- Instrumenting Applications and Infrastructure: working with development teams to add instrumentation code to applications to generate meaningful telemetry data, often using open standards like OpenTelemetry.
- Analyzing and Troubleshooting System Performance: investigating performance bottlenecks, identifying root causes of issues, and collaborating with development teams to resolve them.
- Defining and Tracking Service Level Objectives (SLOs) and Service Level Indicators (SLIs): working with stakeholders to define acceptable levels of performance and reliability and tracking these metrics.
- Improving Incident Response and Post-Mortem Processes: using observability data to understand incidents, identify contributing factors, and implement preventative measures.
- Collaborating with Development, Operations, and SRE Teams: working closely with other teams to ensure observability practices are integrated throughout the software development lifecycle.
- Educating and Mentoring Teams on Observability Best Practices: promoting a culture of observability within the organization.
- Managing and Optimizing Observability Infrastructure Costs: ensuring the cost-effectiveness of observability tools and platforms.
- Staying Up to Date with Observability Trends and Technologies: continuously learning about new tools, techniques, and best practices.

Key Skills:
- Strong Understanding of Observability Principles: deep knowledge of logs, metrics, and traces and how they contribute to understanding system behavior.
- Proficiency with Observability Tools and Platforms, e.g.:
  - Logging: Elasticsearch, Splunk, Fluentd, Logstash, etc.
  - Metrics: Prometheus, Grafana, InfluxDB, Graphite, etc.
  - Tracing: OpenTelemetry, Datadog APM, etc.
  - APM (Application Performance Monitoring): Datadog, New Relic, AppDynamics, etc.
- Programming and Scripting Skills: proficiency in languages like Python, Go, Java, or scripting languages like Bash for automation and tool integration.
- Experience with Cloud Platforms: familiarity with cloud providers like AWS, Azure, or GCP and their monitoring and logging services.
- Understanding of Distributed Systems: knowledge of how distributed systems work and the challenges of monitoring and troubleshooting them.
- Troubleshooting and Problem-Solving Skills: strong analytical skills to identify and resolve complex issues.
- Communication and Collaboration Skills: ability to effectively communicate technical concepts to different audiences and work collaboratively with other teams.
- Knowledge of DevOps and SRE Practices: understanding of continuous integration/continuous delivery (CI/CD), infrastructure as code, and site reliability engineering principles.
- Data Analysis and Visualization Skills: ability to analyze telemetry data and create meaningful dashboards and reports.
- Experience with Containerization and Orchestration: familiarity with Docker, Kubernetes, and related technologies.
Kind Regards, Priyankha M

Posted 3 weeks ago

Apply

3.0 - 7.0 years

0 Lacs

Haryana

On-site

As a Node.js Developer at Fitelo, a fast-growing health and wellness platform, you will play a pivotal role in leading the data strategy. Working alongside a team of innovative thinkers, front-end experts, and domain specialists, you will be responsible for designing robust architectures, implementing efficient APIs, and ensuring the platform's systems are both lightning-fast and rock-solid. Your contributions will involve building new features, tackling complex challenges, and enhancing performance to shape the future of health and wellness technology. If you are passionate about crafting elegant solutions, thinking creatively, and making a significant impact, we welcome you to join our team.

Your responsibilities in this role will include:
- Taking complete ownership of designing, developing, deploying, and maintaining server-side components and APIs using Node.js.
- Managing the entire lifecycle of database operations with MongoDB and PostgreSQL, from schema design to performance tuning and troubleshooting.
- Collaborating with front-end developers to ensure seamless integration of user-facing elements with server-side logic, delivering a flawless user experience.
- Optimizing application performance and scalability through proactive monitoring, debugging, and implementing best practices.
- Implementing and maintaining security protocols to safeguard data integrity and protect sensitive information, ensuring compliance with industry standards.
- Overseeing the entire development process, including requirement gathering, technical planning, and execution to final deployment.
- Conducting thorough code reviews to maintain high-quality standards, promote best practices, and guide team members in achieving excellence.
- Maintaining comprehensive documentation for APIs, codebases, and processes to support scalability and team collaboration.
- Continuously researching and integrating the latest technologies to ensure the application architecture remains innovative and future-proof.
- Driving collaboration across teams to solve challenges, adapt to changing priorities, and ensure the successful delivery of projects from start to finish.

The ideal candidate for this role will possess:
- 3+ years of experience in backend development or a similar role, primarily with Node.js.
- Advanced proficiency in JavaScript and TypeScript, with experience in frameworks like Express.js or Nest.js.
- Strong grasp of asynchronous programming, event-driven architecture, and advanced concepts such as streams and worker threads.
- In-depth experience with both SQL databases (e.g., PostgreSQL, MySQL) and NoSQL databases (e.g., MongoDB), including query optimization and schema design.
- Expertise in building and consuming RESTful APIs and GraphQL services, with a solid understanding of API versioning and security best practices (e.g., OAuth2, JWT); a small JWT sketch follows this listing.
- Knowledge of microservices architecture and experience with tools like Docker, Kubernetes, and message brokers such as RabbitMQ or Kafka.
- Familiarity with front-end integration and technologies (e.g., HTML, CSS, and JavaScript frameworks like React or Angular).
- Proficiency in version control tools (e.g., Git) and familiarity with CI/CD pipelines using tools like Jenkins and GitLab CI/CD.
- Hands-on experience with cloud platforms (e.g., AWS, GCP, or Azure), including deployment and monitoring services like EC2, CloudWatch, or Kubernetes Engine.
- Strong problem-solving skills, with experience in debugging and performance tuning of backend systems using tools like New Relic, Datadog, or the ELK Stack.
- Understanding of testing frameworks (e.g., Mocha, Chai, Jest) and best practices for unit, integration, and performance testing.

Qualifications required for this role:
- Bachelor's degree in technology.

This is a full-time position with a day shift schedule located in Gurugram. Join us at Fitelo and be part of our mission to revolutionize the health and wellness industry.
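The security item above mentions OAuth2/JWT. Below is a conceptual sketch of the token issue/verify flow, written in Python with PyJWT to stay consistent with the other examples on this page even though the role itself is Node.js; the secret and claims are placeholders only.

```python
# Conceptual JWT issue/verify flow (PyJWT); secret and claims are placeholders.
import datetime
import jwt  # PyJWT

SECRET = "change-me"  # in practice this would come from a secret store

def issue_token(user_id: str) -> str:
    payload = {
        "sub": user_id,
        "exp": datetime.datetime.now(tz=datetime.timezone.utc) + datetime.timedelta(hours=1),
    }
    return jwt.encode(payload, SECRET, algorithm="HS256")

def verify_token(token: str) -> dict:
    # Raises jwt.ExpiredSignatureError / jwt.InvalidTokenError on bad tokens.
    return jwt.decode(token, SECRET, algorithms=["HS256"])

if __name__ == "__main__":
    token = issue_token("user-42")
    print(verify_token(token)["sub"])
```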

Posted 3 weeks ago

Apply

3.0 years

0 Lacs

Chennai, Tamil Nadu, India

On-site

Freshworks makes it fast and easy for businesses to delight their customers and employees. We do this by taking a fresh approach to building and delivering software that is affordable, quick to implement, and designed for the end user. Headquartered in San Mateo, California, Freshworks has a global team operating from 13 global locations to serve more than 65,000 companies -- from startups to public companies -- that rely on Freshworks software-as-a-service to enable a better customer experience (CRM, CX) and employee experience (ITSM). Freshworks' cloud-based software suite includes Freshdesk (omni-channel customer support), Freshsales (sales automation), Freshmarketer (marketing automation), Freshservice (IT service desk), and Freshchat (AI-powered bots), supported by Neo, our underlying platform of shared services. Freshworks is featured in global and national press including CNBC, Forbes, Fortune, and Bloomberg, and has been a BuiltIn Best Place to Work in San Francisco and Denver for the last 3 years. Our customer ratings have earned Freshworks products TrustRadius Top Rated Software ratings and G2 Best of Awards for Best Feature Set, Best Value for the Price, and Best Relationship.

Job Description
As a Lead Software Engineer - Systems, you will focus on building next-generation platform services for Freshworks with your strong background in distributed systems, and mentor your team to achieve this. You will have an opportunity to redefine customer experiences by building systems that are millisecond-efficient, always available, and working at internet scale. If you are the kind of engineer who is passionate about building systems, has a good eye for analysis, and a mind that can think outside the box, we want to talk to you.

Do you want to take on some cool and complex distributed-systems and big-data problems at scale? At Freshworks we are building next-gen CRM, Support & IT Automation, and Sales & Marketing SaaS products/services and related platform/foundation services for small and mid-market customers across the globe. We have about 32K+ customers (small and medium-size organizations) across 140 countries, with ~10 SaaS product offerings. We also deal with 20TB of logs/day, where we have some really interesting problems to solve with our Search / Relevance Engineering. We deal with ~1B messages at ~300K/min and ~5B conversations at ~6M/day, where our Chat, Bot, and Messaging solutions have to compete with the best in the world. On the Data Engineering and Analytics side, we have complex problems to solve at the rate we are growing, with challenges like ~5M DB reads/min, ~700K reqs/min, and 600M users pushing the limits of cloud services.

The Freshworks (FW) Engineering Platforms group today broadly serves as a key stakeholder to the FW product teams, developers, and customers. The Freshworks platform enables developers, partners, and customers to customize, integrate, and automate business workflows for support, CRM, and IT use cases. The purpose of the FW Platforms team is to build efficiency, bring agility into product development, and enable services to scale and improve performance, thereby providing a seamless experience to our customers. To achieve this, the Platforms teams work very closely with our internal stakeholders and align to their goals - the Product teams and the customer-facing teams (Sales, Customer Success, Onboarding teams).

Some of the key themes include providing a “Unified Freshworks Experience”, being mid-market ready, and providing smart analytics. This group is looking for a Lead Systems Engineer who is highly solution-oriented, with a vision of the impact of their code across the overall software development life cycle. Our systems engineers build the APIs, services, and features that support these complex scenarios and that scale and perform seamlessly for the rapid growth we are experiencing now and in the future. We solve common platform/foundation-services engineering problems that cut across products, from SSO and containerization to reliable deployments, working in Agile mode. Our engineering takes pride in delivering inspiring and fresh experiences for our customers and their business/customers. As a Lead Systems Engineer you will design and implement multi-tier (DB, services, and web) software applications, and document, test, fix, and enhance systems when needed. In your agile team, you will work closely with engineers, architects, managers, design, QA, and operations teams, and create solutions that meet business requirements. You will spend most of your time developing clean code with limited abstraction. In this role, you will also lead and mentor team members across functions. You will also implement and support Freshworks compliance and information security processes.

Responsibilities:
● Platform teams tend to be small but self-sufficient; you will have a large scope of responsibilities. They also tend not to have any QA or Ops personnel.
● Design, develop, and maintain software
● Be able to plan and execute goals
● Assist Product Owners with planning and roadmaps
● Lead a team of 2-4 engineers
● Strong communication skills are a must
○ Platform services exist to be used by other teams in Freshworks
○ Platform Leads will be the face of their service
○ An important goal of a platform service is increasing its adoption
● Leads will communicate and coordinate with other teams across Freshworks
● Mentor other engineers in the team
● Hold strong opinions on engineering best practices
● You will own systems that handle high scale and are capable of scaling to greater heights
● Ensure 99.99% availability of your production systems
● Ensure 99.999% uptime of your production systems (see the worked example after this listing)

Must Have:
● Overall 6-10 years of experience
● Good knowledge of OOPS concepts; must be comfortable with design patterns and SOLID principles
● Strong testing habits, passionate about unit testing and TDD
● Extensive experience in Agile methodologies
● Expertise in one or more programming languages like Java, C, C++, C#, Ruby, Python, Golang
● Good understanding of data structures
● Strong understanding of HTTP and REST principles
● Must have experience with Inter-Process Communication: this can be Unix IPC, SOAP web services, or microservices
● Experience handling production workloads and production issues
● Strong CI/CD experience
● DevOps knowledge
● Infra knowledge of popular internet-serving applications
● Good understanding of multithreading and concurrency primitives
● Strong design skills
● Ability to break down a problem
● Cloud/SaaS experience
● Good understanding of RDBMS like MySQL, PostgreSQL, MSSQL, OracleDB
● Strong knowledge of Git
● Strong analytical and problem-solving skills

Qualifications
Good to have:
● Prior experience leading a team
● Experience with NoSQL technologies like MongoDB, Cassandra, DynamoDB
● Supporting production issues raised by end customers
● Keeping up to date with the cutting edge of technologies
● Familiarity with GitHub a plus
● Experience using static code analyzer tools like SonarQube, RuboCop, Checkstyle
● Experience using APM tools like Datadog, New Relic
● Expertise in Java

Additional Information
At Freshworks, we are creating a global workplace that enables everyone to find their true potential, purpose, and passion irrespective of their background, gender, race, sexual orientation, religion and ethnicity. We are committed to providing equal opportunity for all and believe that diversity in the workplace creates a more vibrant, richer work environment that advances the goals of our employees, communities and the business.
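To make the availability targets above concrete, here is a quick back-of-the-envelope calculation of how much downtime each target allows, assuming a 30-day month; the figures follow directly from the percentages.

```python
# Back-of-the-envelope error budget for availability targets (30-day month assumed).
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes

for target in (0.9999, 0.99999):
    budget_minutes = MINUTES_PER_MONTH * (1 - target)
    print(f"{target:.3%} availability -> {budget_minutes:.2f} min of allowed downtime per month")

# 99.990% -> 4.32 minutes/month
# 99.999% -> 0.43 minutes/month (roughly 26 seconds)
```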

Posted 3 weeks ago

Apply

5.0 - 9.0 years

0 Lacs

Chennai, Tamil Nadu

On-site

As a Lead Engineer, DevOps at Toyota Connected India, you will be part of a dynamic team dedicated to creating innovative infotainment solutions on embedded and cloud platforms. You will play a crucial role in shaping the future of mobility by leveraging your expertise in cloud platforms, containerization, infrastructure automation, scripting languages, monitoring solutions, networking, security best practices, and CI/CD tools.

Your responsibilities will include:
- Demonstrating hands-on experience with cloud platforms such as AWS or Google Cloud Platform.
- Utilizing strong expertise in containerization (e.g., Docker) and Kubernetes for container orchestration (a small monitoring sketch follows this listing).
- Implementing infrastructure automation and configuration management tools like Terraform, CloudFormation, Ansible, or similar.
- Using scripting languages such as Python, Bash, or Go for efficient workflows.
- Working with monitoring and logging solutions such as Prometheus, Grafana, the ELK Stack, or Datadog to ensure system reliability.
- Applying knowledge of networking concepts, security best practices, and infrastructure monitoring to maintain a secure and stable environment.
- Using CI/CD tools such as Jenkins, GitLab CI, CircleCI, Travis CI, or similar for continuous integration and delivery.

At Toyota Connected, you will enjoy top-of-the-line compensation, autonomy in managing your time and workload, yearly gym membership reimbursement, free catered lunches, and a casual dress code. You will have the opportunity to work on products that enhance the safety and convenience of millions of customers, all within a collaborative, innovative, and empathetic work culture that values customer-centric decision-making, passion for excellence, creativity, and teamwork. Join us at Toyota Connected India and be part of a team that is redefining the automotive industry and making a positive global impact!
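As a small example of the kind of monitoring automation this role touches, the sketch below uses the official Kubernetes Python client to list pods that are not in a Running or Succeeded phase; the namespace is illustrative and cluster access via a local kubeconfig is assumed.

```python
# Sketch: list pods that are not Running/Succeeded, using the Kubernetes Python client.
from kubernetes import client, config

def unhealthy_pods(namespace: str = "default") -> list[str]:
    config.load_kube_config()  # or config.load_incluster_config() when running inside a cluster
    v1 = client.CoreV1Api()
    bad = []
    for pod in v1.list_namespaced_pod(namespace).items:
        if pod.status.phase not in ("Running", "Succeeded"):
            bad.append(f"{pod.metadata.name}: {pod.status.phase}")
    return bad

if __name__ == "__main__":
    for line in unhealthy_pods("default"):
        print(line)
```

In a real setup this check would usually live in Prometheus alert rules or a Datadog monitor rather than an ad-hoc script, but the API calls are the same building blocks.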

Posted 3 weeks ago

Apply

7.0 years

0 Lacs

Hyderabad, Telangana, India

On-site

Job Title: Sr DevOps Engineer
Location: Hyderabad & Ahmedabad
Employment Type: Full-Time
Work Model: 3 days from office
Experience: 7+ years

Summary: The Senior DevOps Engineer is responsible for designing and managing robust, scalable CI/CD pipelines, automating infrastructure with Terraform, and improving deployment efficiency across GCP-hosted environments.

Experience Required: 5-8 years in DevOps engineering roles with proven expertise in CI/CD, infrastructure automation, and Kubernetes.

Mandatory:
• OS: Linux
• Cloud: GCP (Compute Engine, Load Balancing, GKE, IAM)
• CI/CD: Jenkins, GitHub Actions, Argo CD
• Containers: Docker, Kubernetes
• IaC: Terraform, Helm
• Monitoring: Prometheus, Grafana, ELK
• Security: Vault, Trivy, OWASP concepts

Nice to Have:
• Service Mesh (Istio), Pub/Sub, API Gateway (Kong)
• Advanced scripting (Python, Bash, Node.js)
• SkyWalking, Rancher, Jira, Freshservice

Scope:
• Own CI/CD strategy and configuration
• Implement DevSecOps practices
• Drive an automation-first culture

Roles and Responsibilities:
• Design and implement end-to-end CI/CD pipelines using Jenkins, GitHub Actions, and Argo CD for production-grade deployments.
• Define branching strategies and workflow templates for development teams.
• Automate infrastructure provisioning using Terraform, Helm, and Kubernetes manifests across multiple environments.
• Implement and maintain container orchestration strategies on GKE, including Helm-based deployments.
• Manage the secrets lifecycle using Vault and integrate it with CI/CD for secure deployments.
• Integrate DevSecOps tools like Trivy, SonarQube, and JFrog into CI/CD workflows (see the gate sketch after this listing).
• Collaborate with engineering leads to review deployment readiness and ensure quality gates are met.
• Monitor infrastructure health and capacity planning using Prometheus, Grafana, and Datadog; implement alerting rules.
• Implement auto-scaling, self-healing, and other resilience strategies in Kubernetes.
• Drive process documentation, review peer automation scripts, and mentor junior DevOps engineers.

Notice Period: Immediate to 30 days
Email to: sharmila.m@aptita.com
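As one possible shape for the DevSecOps gate mentioned above, this sketch fails a CI step when a Trivy JSON report contains HIGH or CRITICAL findings. The report path and the Results/Vulnerabilities/Severity field names reflect my understanding of Trivy's JSON output and should be verified against the Trivy version actually in use.

```python
# CI gate sketch: exit non-zero if a Trivy JSON report contains HIGH/CRITICAL findings.
# The report structure ("Results" -> "Vulnerabilities" -> "Severity") is assumed here.
import json
import sys

BLOCKING = {"HIGH", "CRITICAL"}

def blocking_findings(report_path: str) -> int:
    with open(report_path) as fh:
        report = json.load(fh)
    count = 0
    for result in report.get("Results", []):
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in BLOCKING:
                count += 1
    return count

if __name__ == "__main__":
    findings = blocking_findings(sys.argv[1] if len(sys.argv) > 1 else "trivy-report.json")
    print(f"Blocking vulnerabilities: {findings}")
    sys.exit(1 if findings else 0)
```

A pipeline step would run Trivy with JSON output first, then call this script so the build fails automatically when the threshold is exceeded.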

Posted 3 weeks ago

Apply

3.0 years

0 Lacs

India

On-site

Construction is the 2nd largest industry in the world (4x the size of SaaS!). But unlike software (with observability platforms such as AppDynamics and Datadog), construction teams lack automated feedback loops to help projects stay on schedule and on budget. Without this observability, construction wastes a whopping $3T per year because glitches aren’t detected fast enough to recover. Doxel AI exists to bring computer vision to construction, so the industry can deliver what society needs to thrive. From hospitals to data centers, from foremen to VPs of construction, teams use Doxel to make better decisions every day. In fact, Doxel has contributed to the construction of the facilities that provide many of the products and services you use every day. We have LLM-driven automation, classic computer vision, deep-learning ML object detection, a low-latency 3D three.js web app, and a complex data pipeline powering it all in the background. We’re building out new workflows, analytics dashboards, and forecasting engines. We're at an exciting stage of scale as we build upon our growing market momentum. Our software is trusted by Shell Oil, Genentech, HCA Healthcare, Kaiser, Turner, Layton, and several others. Join us in bringing AI to construction!

The Role
As a Navisworks Engineer, your mission is to revolutionize construction job sites by creating powerful tools within Navisworks that capture, process, and visualize project data for Doxel customers. You will collaborate with VDC engineers and Product, Design, Backend, and CV/ML teams to deliver seamless, responsive, and visually compelling user experiences. Your work will directly influence how foremen, project managers, and executives make mission-critical decisions on job sites worldwide.

What You’ll Do
Develop and maintain AEC plugins using .NET (C#) and other relevant technologies to enhance construction workflows
Design and implement robust Windows-based applications that integrate with Doxel’s backend APIs and data pipelines (a generic API-integration sketch follows this listing)
Collaborate closely with Product Managers, Designers, Backend Engineers, VDC engineers, and operations to deliver seamless end-to-end solutions
Optimize Navisworks plugin performance for handling large, complex 3D models efficiently
Ensure robust testing, monitoring, and debugging practices to maintain high software quality
Stay up to date with Navisworks API updates and best practices for plugin development
Mentor engineers and promote best practices in Windows and plugin development

What You’ll Bring To The Team
3+ years of professional experience in software development, with expertise in Navisworks plugin development
Strong C#/.NET programming skills, with experience in the Navisworks API and Windows application development
Experience with Autodesk Forge, the Revit API, or BIM data integration is a plus
Strong understanding of 3D model visualization, performance optimization, and state management
Experience with RESTful APIs, database integrations, and modern software development practices
Proficiency in debugging, profiling, and optimizing Windows-based applications
Experience with CI/CD pipelines, automated testing, and deployment
Independent, with strong problem-solving and soft skills and the ability to debug and fix issues efficiently
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical field

Doxel also provides comprehensive health/dental/vision benefits for employees and their families, an unlimited PTO policy, and a flexible work environment, among other benefits. Doxel is an equal opportunity employer and actively seeks diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
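The plugin itself would be written in C#/.NET against the Navisworks API; as a language-neutral sketch (kept in Python to match the other examples on this page), the snippet below shows the retry-with-backoff pattern a plugin might use when uploading captured model data to a backend REST endpoint. The URL and payload shape are entirely hypothetical.

```python
# Sketch: upload captured model metadata to a backend API with retry/backoff.
# The endpoint URL and payload shape are hypothetical placeholders.
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def make_session() -> requests.Session:
    # Retry POSTs on transient 5xx responses with exponential backoff.
    retry = Retry(
        total=5,
        backoff_factor=0.5,
        status_forcelist=(500, 502, 503, 504),
        allowed_methods=frozenset({"POST"}),
    )
    session = requests.Session()
    session.mount("https://", HTTPAdapter(max_retries=retry))
    return session

def upload_capture(session: requests.Session, payload: dict) -> None:
    resp = session.post("https://api.example.com/v1/captures", json=payload, timeout=10)
    resp.raise_for_status()

if __name__ == "__main__":
    upload_capture(make_session(), {"model_id": "demo", "element_count": 12045})
```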

Posted 3 weeks ago

Apply

0.0 - 2.0 years

0 Lacs

Pune, Maharashtra

Remote

Job Description

What You’ll Be Doing
Support and monitor production systems to ensure optimal performance and uptime.
Assist with troubleshooting, incident response, and resolution of outages and performance issues.
Collaborate with senior SREs to automate operational tasks and improve deployment workflows.
Contribute to the development and maintenance of CI/CD pipelines.
Set up and manage monitoring, alerting, and logging tools to improve observability.
Write and maintain runbooks, documentation, and system architecture diagrams.
Participate in post-incident reviews, learning from failures and identifying opportunities for improvement.
Continuously apply and learn SRE best practices such as reducing toil, setting SLOs, and designing for resilience.

Required Experience
0-2 years of experience in SRE, DevOps, or system administration.
Familiarity with Linux/Unix environments and shell scripting.
Understanding of cloud platforms (AWS, GCP, or Azure).
Knowledge of containerisation and orchestration tools (e.g., Docker, Kubernetes).
Basic understanding of CI/CD systems (e.g., Jenkins, GitHub Actions, GitLab CI).
Basic exposure to monitoring and logging tools (e.g., Prometheus, Grafana, ELK, Datadog).
Experience with version control systems like Git.
Strong communication and collaboration skills.

Preferred Qualifications
Basic experience with SRE/DevOps tooling and automation.
Basic familiarity with application monitoring tools (e.g., Datadog).
Basic experience writing automation scripts in Python, Go, or Bash.
Basic exposure to incident management tools (e.g., PagerDuty, Opsgenie).
Awareness of core SRE concepts (SLIs, SLOs, error budgets, toil reduction); a small worked example follows this listing.
Eagerness to learn new technologies and improve engineering processes.

Why Join Us?
Work with experienced engineers in a supportive, learning-first environment.
Gain hands-on experience with real-world production systems and modern infrastructure tools.
Be part of a collaborative team driving reliability and efficiency at scale.

Please note that Zendesk can only hire candidates who are physically located and plan to work from Karnataka or Maharashtra. Please refer to the location posted on the requisition for where this role is based.

Hybrid: In this role, our hybrid experience is designed at the team level to give you a rich onsite experience packed with connection, collaboration, learning, and celebration - while also giving you flexibility to work remotely for part of the week. This role must attend our local office for part of the week. The specific in-office schedule is to be determined by the hiring manager.

The intelligent heart of customer experience
Zendesk software was built to bring a sense of calm to the chaotic world of customer service. Today we power billions of conversations with brands you know and love. Zendesk believes in offering our people a fulfilling and inclusive experience. Our hybrid way of working enables us to purposefully come together in person, at one of our many Zendesk offices around the world, to connect, collaborate and learn whilst also giving our people the flexibility to work remotely for part of the week. Zendesk is an equal opportunity employer, and we’re proud of our ongoing efforts to foster global diversity, equity, & inclusion in the workplace.
Individuals seeking employment and employees at Zendesk are considered without regard to race, color, religion, national origin, age, sex, gender, gender identity, gender expression, sexual orientation, marital status, medical condition, ancestry, disability, military or veteran status, or any other characteristic protected by applicable law. We are an AA/EEO/Veterans/Disabled employer. If you are based in the United States and would like more information about your EEO rights under the law, please click here . Zendesk endeavors to make reasonable accommodations for applicants with disabilities and disabled veterans pursuant to applicable federal and state law. If you are an individual with a disability and require a reasonable accommodation to submit this application, complete any pre-employment testing, or otherwise participate in the employee selection process, please send an e-mail to peopleandplaces@zendesk.com with your specific accommodation request.
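To illustrate the SRE concepts listed in this role (SLIs, SLOs, error budgets), here is a small worked example that computes an availability SLI from request counts and how much of the error budget it consumes; all numbers are made up for the example.

```python
# Sketch: availability SLI and error-budget burn from request counts (numbers are made up).
def error_budget_report(total_requests: int, failed_requests: int, slo: float = 0.999) -> str:
    sli = 1 - failed_requests / total_requests   # measured availability
    budget = 1 - slo                             # allowed failure ratio under the SLO
    burned = (failed_requests / total_requests) / budget
    return f"SLI={sli:.4%}, SLO={slo:.3%}, error budget burned={burned:.1%}"

if __name__ == "__main__":
    # 450 failures out of 1,000,000 requests against a 99.9% SLO burns 45% of the budget.
    print(error_budget_report(total_requests=1_000_000, failed_requests=450))
```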

Posted 3 weeks ago

Apply

0.0 - 8.0 years

0 Lacs

Bengaluru, Karnataka

On-site

ABOUT:
Avis Budget Group is a leading provider of mobility options, with brands including Avis, Budget & Budget Truck, and Zipcar. With more than 70 years of experience and 11,000 locations in 180 countries, we are shaping the future of our industry and want you to join us in our mission.

Location: Bangalore
Work mode: Hybrid (3 days in office)

JOB DESCRIPTION:
5-8 years of professional experience designing, writing, and supporting highly available web services.
5+ years of experience writing Java applications.
Extensive experience with event-driven architecture (3+ years); a small messaging sketch follows this listing.
Must have experience analyzing complex data flows, batch or real-time processing (3+ years).
Worked on RabbitMQ or similar (3+ years).
Experience with Postgres or similar (3+ years).
Strong experience with NoSQL DBs, MongoDB and Cassandra/DataStax (2+ years).
Strong experience with AWS and CI/CD environments (2+ years).
Experience writing and consuming web services using Java/Spring Boot.
Experience developing React/NodeJS apps and services is a plus.
Understanding of distributed systems: performance bottlenecks, fault tolerance, and data consistency concerns.
Experience in Kubernetes for containerization.
Experience building mission-critical systems running 24x7.
Desire to work within a team of engineers at all levels of experience.
Desire to mentor junior developers, maximizing their productivity.
Familiarity with tools like Datadog, Grafana, or similar products to monitor web services.
Good written and spoken communication skills.

Bangalore, Karnataka, India
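As a minimal sketch of the event-driven/RabbitMQ experience this role asks for (written in Python with pika for consistency with the other examples on this page, although the role's primary stack is Java/Spring Boot), the snippet below publishes a durable event to a queue; the queue name and payload are placeholders.

```python
# Sketch: publish a booking event to RabbitMQ using pika; queue and payload are placeholders.
import json
import pika

def publish_event(event: dict, queue: str = "booking-events") -> None:
    connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
    channel = connection.channel()
    channel.queue_declare(queue=queue, durable=True)  # survive broker restarts
    channel.basic_publish(
        exchange="",
        routing_key=queue,
        body=json.dumps(event),
        properties=pika.BasicProperties(delivery_mode=2),  # persist the message
    )
    connection.close()

if __name__ == "__main__":
    publish_event({"type": "reservation.created", "reservation_id": "RSV-001"})
```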

Posted 3 weeks ago

Apply