- Optimize large-scale distributed training frameworks (e.g., data parallelism, tensor parallelism, pipeline parallelism). Develop high-performance inference engines, improving latency, throughput, and memory efficiency
- toolchains
- Familiarity with distributed training/inference, AI system bottlenecks, and performance tuning
- Prior experience with cloud computing and AI system deployment in production settings
- to well-known open-source projects or a personal portfolio of impactful open-source research code
- Experience with large-scale distributed training and high-performance computing (HPC) environments
- , copyrighted, or biased. By studying brain data recordings and building computational models that mimic real populations of neurons, the project aims to uncover active unlearning: how the brain learns
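The data-parallelism strategy named in the first bullet can be illustrated with a minimal sketch: each worker holds a replica of the model parameters, computes gradients on its own shard of the batch, and the gradients are averaged (the all-reduce step) before a synchronized update. The function names below are invented for illustration and do not correspond to any specific framework's API.

```python
import numpy as np

def grad_mse_linear(w, X, y):
    # Gradient of mean squared error for a linear model y_hat = X @ w.
    return 2.0 * X.T @ (X @ w - y) / len(y)

def data_parallel_step(w, X, y, num_workers=4, lr=0.1):
    # Shard the batch across workers (equal-size shards assumed, so the
    # averaged gradient matches the full-batch gradient exactly).
    shards = zip(np.array_split(X, num_workers), np.array_split(y, num_workers))
    # Each worker computes a gradient on its local shard of the data.
    local_grads = [grad_mse_linear(w, Xs, ys) for Xs, ys in shards]
    # All-reduce: average the per-worker gradients, then apply one
    # synchronized SGD update to every replica.
    avg_grad = np.mean(local_grads, axis=0)
    return w - lr * avg_grad
```

With equal-size shards, one data-parallel step is numerically identical to a single-worker step on the full batch; real systems (e.g., PyTorch DDP) implement the same averaging with collective communication across devices.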