-
Language Model (LLM) GPU cluster to ensure stable and reliable operation of training tasks; (b) handle GPU node failures, InfiniBand (IB) network anomalies, CUDA/NCCL errors, and Kubernetes scheduling failures, perform
-
to support efficient model training iteration; (b) lead the construction of the GPU computing cluster built on Kubernetes and the NVIDIA GPU Operator, including node planning, resource management
-
outside normal office hours for activities organised by the University and/or the Department.
Qualifications
Applicants should: (a) have a recognised degree or higher diploma in Computing