
Scubyt

2 Job openings at Scubyt
Senior Reporting Analyst - Infrastructure | India | 5 years | Not disclosed | Remote | Contractual

Senior Reporting Analyst - Infrastructure
Location: 100% Remote in India (EST)
Contract

About the Role:
We are looking for an experienced Senior Reporting Analyst with a passion for transforming technical infrastructure data into actionable business intelligence. This role is perfect for a highly analytical professional with a Big 5 consulting background who excels at crafting compelling executive presentations and partnering with IT leaders to drive informed decisions.

As a strategic member of the Infrastructure & Reporting team, you will play a key role in turning raw data from network operations, cybersecurity, and IT systems into clear, impactful insights for C-level stakeholders. You’ll help shape technology strategy by enhancing infrastructure performance visibility and reporting maturity across the organization.

Key Responsibilities:
- Develop and deliver sophisticated dashboards and reports covering critical infrastructure areas such as networks, firewalls, VPNs, and system performance.
- Translate complex data into compelling narratives for executive audiences, including CIOs, CTOs, and business leaders.
- Gather, analyze, and interpret data from tools such as Jira, Confluence, ITSM platforms, and infrastructure monitoring systems to support incident, change, and asset management reporting.
- Partner closely with network engineering and cybersecurity teams to ensure data accuracy, reliability, and consistency.
- Define and track Key Performance Indicators (KPIs), including system uptime, security event trends, network throughput, and infrastructure health metrics.
- Lead the continuous improvement of infrastructure reporting processes, with an emphasis on automation and proactive performance monitoring.
- Provide regular updates and data-driven recommendations to senior leadership, influencing operational strategy and investment decisions.
- Identify trends in infrastructure operations, propose enhancements, and develop reports that help mitigate risks and drive resilience.
- Support incident post-mortem reporting, helping teams learn from outages and strengthen operational processes.

Required Skills & Experience:
- 5+ years of experience in data analytics, business intelligence, or IT reporting roles with a focus on infrastructure or IT services.
- Proven track record with a Big 5 consulting firm (Accenture, Deloitte, PwC, EY, KPMG) in delivering high-impact reporting or advisory services.
- Strong understanding of IT infrastructure components: routers, switches, firewalls, VPNs, and network performance indicators.
- Proficiency in Power BI, Tableau, and other data visualization tools, with the ability to create impactful dashboards and executive-level presentations.
- Skilled in Excel and PowerPoint for rapid analysis and visual storytelling.
- Familiarity with Jira, Confluence, ITSM tools, and infrastructure monitoring solutions.
- Excellent stakeholder management, communication, and storytelling skills, with the ability to translate technical information into actionable insights.
- Hands-on experience in dashboard automation, KPI development, and IT operations reporting.

Preferred Qualifications:
- Previous exposure to the pharmaceutical industry is highly desirable.
- Bachelor’s or Master’s degree in Computer Science, Data Analytics, Information Systems, or a related technical discipline.
- Knowledge of ITIL practices, change management, and infrastructure lifecycle processes.
- Familiarity with infrastructure monitoring platforms such as Splunk, SolarWinds, Datadog, or similar solutions.
- Experience with Smartsheet, Canva, or other collaborative reporting tools is an added advantage.

What You'll Achieve:
- Your dashboards and insights will become the backbone of executive decision-making.
- You’ll help optimize infrastructure stability, security, and performance through meaningful data analysis.
- You will be a trusted advisor to senior leadership, driving infrastructure strategy with evidence-based recommendations.

AI Infrastructure Engineers – Network Design, Deployment & Operations (NCP-AIN/AII/AIO Certified) | India | 0 years | Not disclosed | Remote | Contractual

We're hiring NCP-Certified Engineers! Join us as a Network (AIN), Deployment (AII), or Operations (AIO) Engineer and help power next-gen AI infrastructure with NVIDIA H100 racks. Apply now to be part of cutting-edge AI deployments and scalable data center innovation!

1. Network Design & Installation Engineer (NCP-AIN Certified)
Location: India (Remote)
Duration: Long-Term Contract

Overview:
We are seeking a certified Network Design & Installation Engineer with deep expertise in InfiniBand and Ethernet-based networking solutions. This role is pivotal in architecting and deploying robust, high-performance network fabrics for NVIDIA H100 GPU-powered AI racks.

Key Responsibilities:
- Design and implement scalable InfiniBand/Ethernet networks to support large-scale H100 GPU clusters.
- Configure Spectrum-X switches, BlueField DPUs, and Cumulus Linux-based environments.
- Integrate networking architecture with existing data center infrastructure.
- Perform on-site installations, including racking, cable management, and connectivity validation.
- Utilize tools such as UFM and IBDiagnet to run diagnostics and optimize network performance.
- Collaborate with infrastructure and operations teams to ensure seamless deployment and expansion.

Qualifications:
- NCP-AIN certification (required) or strong equivalent hands-on experience.
- In-depth knowledge of InfiniBand, RoCE v2, Spectrum switches, BlueField DPUs, and Cumulus Linux.
- Proven experience in designing and deploying high-performance or HPC network environments.
- Willingness to travel for on-site deployments and hands-on hardware installation.
- Experience with telemetry, diagnostics, and fabric tuning tools.

2. AI Infrastructure Deployment Engineer (NCP-AII Certified)
Location: India (Remote)
Duration: Long-Term Contract

Overview:
We are hiring an experienced AI Infrastructure Deployment Engineer to lead the deployment of full-stack AI infrastructure powered by NVIDIA H100 GPUs. This role focuses on validating and configuring the entire stack, from bare-metal systems to orchestration platforms, to ensure production-ready AI environments.

Key Responsibilities:
- Lead end-to-end deployment of AI racks, including servers, GPUs, switches, and interconnects.
- Validate bare-metal hardware, Spectrum-X switches, routers, and storage systems.
- Configure multi-tenant GPU environments using MIG, MPS, and virtualization tools.
- Deploy NVIDIA Base Command, DGX OS, and associated AI/ML software stacks.
- Integrate systems with Kubernetes, Helm, and other orchestration platforms.
- Implement monitoring and telemetry using DCGM, UFM, and performance benchmarking tools.

Qualifications:
- NCP-AII certification (required) or equivalent hands-on infrastructure experience.
- Expertise in GPU server configurations, MIG/MPS, Base Command, and virtualization (K8s, vSphere).
- Experience with BIOS/firmware updates, system burn-in, and power/cooling validation.
- Strong understanding of data center infrastructure and AI workload requirements.
- Experience integrating AI infrastructure with cloud-native tools and container environments.

3. AI Infrastructure Operations Engineer (NCP-AIO Certified)
Location: India (Remote)
Duration: Long-Term Contract

Overview:
We are looking for a proactive and skilled AI Infrastructure Operations Engineer to manage and optimize large-scale AI clusters built with NVIDIA H100 GPUs. This role focuses on post-deployment operations: ensuring the performance, reliability, and maintainability of AI infrastructure environments.

Key Responsibilities:
- Manage day-to-day operations of GPU clusters, networking fabric, and server infrastructure.
- Monitor and maintain the health of InfiniBand/Ethernet networks and DGX/H100 nodes.
- Apply firmware upgrades and OS patches, and handle infrastructure lifecycle management.
- Troubleshoot hardware, network, and container-level failures using telemetry tools like UFM and DCGM.
- Create and maintain operational runbooks, automate workflows, and improve incident response.
- Support infrastructure scaling and upgrades, and collaborate with deployment teams.

Qualifications:
- NCP-AIO certification (required) or comparable operational experience in large-scale AI environments.
- Strong troubleshooting skills across compute, network, and storage domains.
- Experience with monitoring and telemetry tools (Prometheus, Grafana, DCGM, UFM).
- Familiarity with log aggregation and alerting systems.
- Background in data center operations, capacity planning, and support automation.

How These Roles Collaborate
- NCP-AIN (Design & Install): Builds and installs the high-speed network fabric that powers AI workloads.
- NCP-AII (Deploy): Deploys and validates the full AI infrastructure stack, including hardware and software integration.
- NCP-AIO (Operate): Ensures continuous, reliable, and optimized operations of deployed AI environments.