-
-year project with several subcomponents that will be developed in parallel. This role will play a crucial role in collating requirements from the program managers for the project subcomponents and
-
Community with emphasis on diverse technical environments. High-performance computing (HPC) specialization such as cluster management, parallel computing, and performance optimization. Hardware and software
-
, Mixture-of-Experts; distributed training/inference (e.g. FSDP, DeepSpeed, Megatron-LM, tensor/sequence parallelism); scalable evaluation pipelines for reasoning and agents. Federated & Collaborative
-
workloads. Deep expertise with high-performance parallel file systems (Lustre, GPFS/Spectrum Scale, BeeGFS, WEKA). Knowledge of storage networking (InfiniBand, NVMe-oF, SAN/NAS architectures). Familiarity
-
InfiniBand networks and diagnostics. Extensive experience with high-performance parallel file systems (Lustre, WEKA, GPFS, etc.). Experience with performance and diagnostic tools for benchmarking, analysis and
-
leading peer-reviewed journals and conferences. Researching and developing parallel/scalable uncertainty visualization algorithms using HPC resources. Collaboration with domain scientists for demonstration
-
diagnostics. Experience with InfiniBand networks and diagnostics. Extensive experience with high-performance parallel file systems (Lustre, WEKA, GPFS, etc.). Experience with performance and diagnostic tools