Software Developer

5 years

0 Lacs

Posted: 3 hours ago | Platform: Glassdoor

Work Mode

On-site

Job Type

Part Time

Job Description

  • Bachelor’s or Master’s degree in Computer Science & Engineering
  • 5+ years of professional experience in full stack development, with a proven track record of deploying web applications in production environments.
  • Strong fundamentals in data structures & algorithms, demonstrated through complex system design and problem-solving in the computer networking domain.
  • Experience with at least one backend language (e.g., Java, Python, Go) and one frontend framework, as well as SQL/NoSQL databases and cloud platforms (e.g., AWS, Azure, GCP).
  • Hands-on experience with observability tools (e.g., Prometheus, Grafana, ELK/EFK, OpenTelemetry).
  • Experience with containerization and container orchestration (e.g., Docker, Kubernetes) and CI/CD tools (e.g., Jenkins, GitHub Actions).
  • Exposure to GPU cluster operations and high-performance networking (RoCE, InfiniBand).
  • Excellent communication and teamwork skills, thriving in a fast-paced, collaborative environment.
  • Experience in data center operations, particularly managing & monitoring GPU clusters for AI/ML or HPC workloads.
  • Familiarity with GPU cluster software (e.g., NCCL for collective communication, Slurm for job scheduling) and high-performance computing frameworks.
  • Knowledge of hybrid cloud and on-premises deployments (AWS, GCP, Azure, or private data centers).
  • Familiarity with security best practices for large-scale distributed systems.

  • Design and implement full stack applications for data center management, including RESTful APIs, microservices, and responsive UIs using OCI frameworks.
  • Develop and maintain observability solutions, including distributed tracing, logging pipelines, and metrics collection (e.g., Prometheus, Grafana), to monitor GPU clusters and data center infrastructure in real time.
  • Implement operational workflows and CI/CD pipelines to streamline deployment, scaling, and maintenance of data center resources.
  • Optimize GPU cluster networking configurations, integrating high-speed interconnects (e.g., InfiniBand, RoCE, Ethernet fabrics) to support AI/ML workloads, ensuring low-latency communication and fault-tolerant designs.
  • Leverage strong knowledge of data structures & algorithms to optimize large-scale data processing and network topologies.
  • Build secure, scalable dashboards and APIs for visualizing data center metrics, alerting on anomalies, and automating incident response in GPU-accelerated environments.
  • Perform performance tuning and troubleshooting of full stack systems to ensure reliability and efficiency in mission-critical infrastructure.
  • Leverage ML/LLM techniques to analyze high-volume telemetry data, detect anomalies, automate mitigation actions, and deliver intelligent reporting to stakeholders.
  • Contribute to code reviews and documentation, help establish best practices for observability, automation, and data center operations, and stay current with emerging technologies.
  • Work in an agile environment, participating in sprint planning, retrospectives, and on-call rotations to maintain 24/7 operational uptime.
  • Independently navigate unfamiliar challenges, align with stakeholders, shape the technical roadmap, drive operational excellence, and mentor peers and junior engineers.

Oracle

Information Technology

Redwood City