Job Description
As a Senior Server Administrator at our organization, you will play a crucial role in the day-to-day management, maintenance, and troubleshooting of the Linux-based server infrastructure that serves our US-based clients. You will be responsible for the stability, security, and performance of our systems, and for automating tasks to improve efficiency. Working closely with team members, you will support both development and production environments and contribute to the overall improvement of our infrastructure.

At our company, we hold certain corporate values in high regard. Respectful communication and cooperation are key aspects of our work culture, where every individual is treated with dignity and respect. We foster teamwork and employee participation by embracing diverse perspectives within our teams and in our interactions with customers. We value a work/life balance that accommodates the varying needs of our employees, recognizing its importance for our collective success. Additionally, we are committed to embracing and supporting the communities that nurture us, and we appreciate our employees' dedication to positive change.

Diversity, inclusion, and belonging are fundamental aspects of our organizational culture. ePlus is dedicated to creating a work environment that celebrates diversity, promotes inclusion, and encourages employees to bring their authentic selves to work.

In this role, your impact will be significant. Your responsibilities will include administering and troubleshooting Linux servers, automating server provisioning and infrastructure operations using Ansible, performing basic network and storage troubleshooting, managing and monitoring Nvidia GPUs on servers, maintaining server documentation, collaborating with other teams to resolve technical issues, and contributing to the continuous improvement of our infrastructure and processes.
To excel in this position, you should possess strong Linux administration skills, proficiency in using Ansible for automation, expertise in GitHub, a basic understanding of Nvidia GPU management, experience with container technologies, basic network and storage troubleshooting skills, excellent problem-solving and analytical capabilities, the ability to work independently and as part of a team (especially in a remote setting), and strong communication and documentation skills.

Preferred skills for this role include experience with Dell and Supermicro servers, familiarity with the MAAS tool for GPU node systems management and provisioning, creating Ansible playbooks for bare metal systems management, scripting skills (e.g., Bash, Python), experience with monitoring tools (e.g., Nagios, Zabbix), and knowledge of virtualization technologies (e.g., KVM, VMware).

As you carry out your duties, you may engage in both seated and occasional standing or walking activities. We provide reasonable accommodations, as required by relevant laws, to support your success in this position.

By embracing our values and demonstrating your skills and expertise, you will contribute to our shared mission of making a positive impact within our organization and the broader community. Kindly note that this job description serves as a guide and is not an employment contract.