-
Engineers. Serve as liaison with Princeton Research Computing staff on GPU cluster related issues. Professional Development Learn the underlying science, mathematics, statistics, data analysis, and algorithms
-
. Evaluates and selects appropriate foundational models (open-source vs. proprietary) and hosting strategies (Azure AI Foundry, AWS Bedrock, local GPU/TPU), directly influencing the University's cloud spend and
-
pipelines for complex decision‑making. Conducting adversarial testing, implementing input sanitization, and contributing to AI‑safety research. Utilizing GPU/TPU resources, mixed‑precision training, and
-
Knowledge of scaling and optimising software to take advantage of GPU / HPC infrastructure. Desirable: B1 Knowledge of Trusted Research Environments outwith or within an HPC environment. Skills Essential: C1
-
, PyTorch) for ML applications, training, evaluation, and deployment of models Use of GPU-based servers and modern IT infrastructure for training and inference Application of classical ML methods (e.g
-
. Additional languages or experience with libraries for utilizing GPU hardware efficiently, e.g., CUDA, are a plus. Experience in AI programming with, e.g., PyTorch(-DDP), Horovod, or DeepSpeed, and in
-
storage and archiving solutions to collaboration and analytics tools. ARC also delivers Baskerville, a leading GPU-accelerated National Compute Resource (NCR), and supports researchers using specialist
-
, implementing input sanitization, and contributing to AI‑safety research. Utilizing GPU/TPU resources, mixed‑precision training, and distributed training frameworks such as DeepSpeed or ZeRO. Prior work
-
these resources through a cloud-native Kubernetes environment integrating large-scale CPU and GPU resources, Ceph object storage, BinderHub, Coffea-Casa, Dask, and ServiceX. This platform supports more than 500
-
GHz with 24 cores each) and 4 GPUs (NVIDIA Ampere A100 80 GB PCIe) with 512 GB RAM. The compute nodes are interconnected via a fast InfiniBand network that also connects to ~360 TB of compute storage