System Administrator

3 years

0 Lacs

Posted: 1 day ago | Platform: SimplyHired

Work Mode

On-site

Job Type

Full Time

Job Description

We aim to bring about a new paradigm in medical image diagnostics, providing intelligent, holistic, ethical, explainable, and patient-centric care. We are looking for innovative problem solvers who can empathize with the consumer, understand business problems, and design and deliver intelligent products.

We are looking for a System Administrator to manage and optimize our on-premises and cloud infrastructure, ensuring reliability, security, and scalability for high-throughput AI workloads. As a System Administrator, you will be responsible for managing the servers, storage, network, and compute infrastructure that power our AI development and deployment pipelines. You will ensure seamless handling of large medical imaging datasets (DICOM/NIfTI) and maintain high availability for research and production systems.

Key Responsibilities

Infrastructure & Systems Management

  • Manage Linux-based servers, GPU clusters, and network storage for AI training and inference workloads.
  • Configure and maintain message queue systems (RabbitMQ, ActiveMQ, Kafka) for large-scale, asynchronous AI pipeline execution.
  • Set up and maintain service beacons and health checks to proactively monitor the state of critical services (XNAT pipelines, FastAPI endpoints, AI model inference servers).
  • Maintain PACS integration, DICOM routing, and high-throughput data transfer for medical imaging workflows.
  • Manage hybrid infrastructure (on-prem + cloud) including auto-scaling compute for large training tasks.

Service Monitoring & Reliability

  • Implement automated service checking for all production and development services using Prometheus, Grafana, or similar tools.
  • Configure beacon agents to trigger alerts and self-healing scripts for service restarts when anomalies are detected.
  • Set up log aggregation and anomaly detection to catch failures in AI processing pipelines early.
  • Ensure 99.9% uptime for mission-critical systems and clinical services.

Security & Compliance

  • Enforce secure access control (IAM, VPN, RBAC, MFA) and maintain audit trails for all system activities.
  • Ensure compliance with HIPAA, GDPR, and ISO 27001 for medical data storage and transfer.
  • Encrypt medical imaging data (DICOM/NIfTI) at rest and in transit.

Automation & DevOps

  • Develop automation scripts for service restarts, scaling GPU resources, and pipeline deployments.
  • Work with DevOps teams to integrate infrastructure monitoring with CI/CD pipelines.
  • Optimize AI pipeline orchestration with MQ-based task handling for scalable performance.

Backup, Disaster Recovery & High Availability

  • Manage data backup policies for medical datasets, AI model artifacts, and PostgreSQL/MongoDB databases.
  • Implement failover systems for MQ brokers and imaging data services to ensure uninterrupted AI processing.

Collaboration & Support

  • Work closely with AI engineers and data scientists to optimize compute resource utilization.
  • Support teams in troubleshooting infrastructure and service issues.
  • Maintain license servers and specialized imaging software environments.

Skills and Qualifications

Required:

  • 3+ years of Linux systems administration experience with a focus on service monitoring and high-availability environments.
  • Experience with message queues (RabbitMQ, ActiveMQ, Kafka) for distributed AI workloads.
  • Familiarity with beacons, service health monitoring, and self-healing automation.
  • Hands-on experience managing GPU clusters and GPU-enabled servers (NVIDIA CUDA, drivers, Dockerized AI workflows).
  • Hands-on experience with cloud platforms (AWS, GCP, Azure), including services such as AWS EC2, S3, and EKS or their equivalents.
  • Networking fundamentals (firewalls, VPNs, load balancers).
  • Experience managing large datasets (100 GB to TB scale), preferably in healthcare or scientific research.
  • Knowledge of cybersecurity best practices and compliance frameworks (HIPAA, ISO 27001).

Preferred:

  • Experience with PACS, XNAT, or medical imaging servers.
  • Familiarity with Prometheus, Grafana, ELK stack, SaltStack beacons, or similar monitoring tools.
  • Knowledge of Kubernetes or Docker Swarm for container orchestration.
  • Scripting skills (Bash, Python, PowerShell) for task automation and troubleshooting.
  • Exposure to database administration (PostgreSQL, MongoDB), including databases used in AI pipelines.

Education: BE/B.Tech (MS/M.Tech will be a bonus)

Experience: 3-5 Years

Job Type: Full-time

Work Location: In person
