An implementation of ResNeXt, from the paper "Aggregated Residual Transformations for Deep Neural Networks", in modern C++ using NVIDIA cuBLAS and cuDNN. The model is trained with mixed precision on NVIDIA Ampere architecture GPUs.
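For orientation, here is a minimal sketch of the weight count of a single ResNeXt bottleneck block (1x1 reduce, 3x3 grouped convolution, 1x1 expand), following the per-path formula given in the paper. It is not taken from this repository: the function name `resnext_block_params` and its parameter names are hypothetical, and biases and batch-norm parameters are left out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of this repository's code.
// Counts convolution weights only (no biases, no batch-norm parameters)
// for one ResNeXt bottleneck block with `cardinality` parallel paths,
// each `group_width` channels wide, on `channels` input/output channels.
inline std::int64_t resnext_block_params(std::int64_t channels,
                                         std::int64_t cardinality,
                                         std::int64_t group_width) {
    assert(channels > 0 && cardinality > 0 && group_width > 0);
    const std::int64_t reduce  = channels * group_width;            // 1x1 conv, per path
    const std::int64_t grouped = 3 * 3 * group_width * group_width; // 3x3 conv, per path
    const std::int64_t expand  = group_width * channels;            // 1x1 conv, per path
    return cardinality * (reduce + grouped + expand);
}
```

Calling `resnext_block_params(256, 32, 4)` gives the 32x4d block from the paper, and `resnext_block_params(256, 1, 64)` gives the plain ResNet bottleneck it is compared against. The two counts are close by design: cardinality is increased while the parameter budget stays roughly fixed.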