-
in deep learning at scale, familiarity with the “alphabet soup” of distributed computing (DP, TP, SP, CP, EP). Experience with production environments, including Git-based workflows. Experience working
-
including workload schedulers, storage systems, and distributed compute nodes. Applies analytical methods to evaluate system performance, identify bottlenecks, and implement corrective actions to improve
-
tools (Open XDMoD), distributed and parallel file systems (CephFS, NFS, Lustre, BeeGFS), and virtualization platforms (OpenStack). Rewards and Benefits: This position is located in Ithaca, New York
-
of analytical reports to achieve client, program, and business objectives for resource optimization. Work closely with internal technology teams and vendors to deliver tools and system solutions. Lead the
-
, or workshops. Knowledge, Skills and Ability: Ability to program in multiple programming languages such as C/C++, Fortran, Python, R, or similar scientific programming languages. Knowledge of parallel programming