Harness the power of GPUs to easily accelerate your data science, machine learning, and AI workflows.
Run entire data science workflows with high-speed GPU compute and parallelize data loading, data manipulation, and machine learning for 50X faster end-to-end data science pipelines.
Data science and machine learning are the world's largest compute segment. Modest improvements in the accuracy of analytics models translate into billions to the bottom line. To build the best models, data scientists toil to train, evaluate, iterate, and retrain for highly accurate results and performant models. With RAPIDS™, processes that took days take minutes, making it easier and faster to build and deploy value-generating models. With NVIDIA LaunchPad you can go hands-on with RAPIDS labs, and with NVIDIA AI Enterprise we can support your enterprise across all aspects of your AI projects.
Workflows have many iterations of transforming Raw Data into Training Data, which gets fed into many algorithm combinations, which undergo hyperparameter tuning to find the right combinations of models, model parameters, and data features for optimal accuracy and performance.
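The iteration loop described above can be sketched as a small grid search. This is only an illustration: `toy_score` is a hypothetical stand-in for training and evaluating a real model, and the feature names and parameter values are made up for the example.

```python
from itertools import product


def toy_score(features, max_depth, learning_rate):
    """Hypothetical stand-in for 'train a model and return validation accuracy'."""
    feature_bonus = 0.02 * len(features)          # more features help a little
    depth_penalty = 0.01 * abs(max_depth - 6)     # pretend depth 6 is ideal
    lr_penalty = abs(learning_rate - 0.1)         # pretend 0.1 is ideal
    return 0.80 + feature_bonus - depth_penalty - lr_penalty


# Candidate data features and model parameters (all illustrative)
feature_sets = [("age",), ("age", "income"), ("age", "income", "region")]
depths = [4, 6, 8]
learning_rates = [0.05, 0.1, 0.3]

# Evaluate every combination of features and hyperparameters
trials = [
    {"features": f, "max_depth": d, "learning_rate": lr, "score": toy_score(f, d, lr)}
    for f, d, lr in product(feature_sets, depths, learning_rates)
]

# Keep the best-scoring combination
best = max(trials, key=lambda t: t["score"])
print(len(trials), "combinations tried; best:", best)
```

Even this tiny grid produces 27 training runs, which is why the time per run dominates the overall workflow.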
RAPIDS is a suite of open-source software libraries and APIs for executing data science pipelines entirely on GPUs—and can reduce training times from days to minutes. Built on NVIDIA® CUDA-X AI™, RAPIDS unites years of development in graphics, machine learning, deep learning, high-performance computing (HPC), and more.
In data science, more compute allows you to gain insights faster. RAPIDS leverages NVIDIA CUDA® under the hood to accelerate your workflows by running the entire data science training pipeline on GPUs. This can reduce your model training time from days to minutes.
By hiding the complexities of working with the GPU and even the behind-the-scenes communication protocols within the data center architecture, RAPIDS creates a simple way to get data science done. As more data scientists use Python and other high-level languages, providing acceleration without code change is essential to rapidly improving development time.
RAPIDS can run anywhere, in the cloud or on-premises. You can easily scale from a workstation to multi-GPU servers to multi-node clusters, and deploy it in production with Dask, Spark, MLflow, and Kubernetes.
Access to reliable support is often vital to organizations using data science for mission-critical insights. Global NVIDIA Enterprise Support is available with NVIDIA AI Enterprise, an end-to-end AI software suite, and includes guaranteed response times, priority security notifications, regular updates, and access to NVIDIA AI experts.
Results show that GPUs provide dramatic cost and time savings for both small and large-scale big data analytics problems. Using familiar APIs like pandas and Dask, at 10 terabyte scale, RAPIDS performs up to 20X faster on GPUs than the top CPU baseline. Using just 16 NVIDIA DGX A100s to achieve the performance of 350 CPU-based servers, NVIDIA’s solution is 7X more cost effective while delivering HPC-level performance.
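As a quick back-of-the-envelope check on those figures, the consolidation ratio can be worked out directly. The relative-cost line is an interpretation: it assumes "7X more cost effective" means one seventh of the cost at equal delivered performance, which the text does not spell out.

```python
# Figures quoted in the paragraph above
cpu_servers = 350
dgx_systems = 16
cost_effectiveness = 7.0  # "7X more cost effective"

# How many CPU servers each DGX A100 stands in for
servers_per_dgx = cpu_servers / dgx_systems

# Assumption: equal delivered performance, so relative cost is the
# inverse of the cost-effectiveness factor
relative_cost = 1.0 / cost_effectiveness

print(f"{servers_per_dgx:.1f} CPU servers per DGX A100")
print(f"GPU solution cost is roughly {relative_cost:.0%} of the CPU baseline")
```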
Common data processing tasks have many steps (data pipelines), which Hadoop can’t handle efficiently. Apache Spark solved this problem by holding all the data in system memory, which allowed more flexible and complex data pipelines, but introduced new bottlenecks. Analyzing even a few hundred gigabytes (GB) of data could take hours if not days on Spark clusters with hundreds of CPU nodes. To tap the true potential of data science, GPUs have to be at the center of a data center design that consists of five elements: compute, networking, storage, deployment, and software. Generally speaking, end-to-end data science workflows on GPUs are 10X faster than on CPUs.
RAPIDS provides a foundation for a new high-performance data science ecosystem and lowers the barrier of entry for new libraries through interoperability. Integration with leading data science frameworks like Apache Spark, CuPy, Dask, and Numba, as well as numerous deep learning frameworks, such as PyTorch, TensorFlow, and Apache MXNet, helps broaden adoption and encourages integration with others. You can find RAPIDS and the corresponding frameworks in the NGC catalog.
Integrated with RAPIDS, Plotly Dash enables real-time, interactive visual analytics of multi-gigabyte datasets even on a single GPU.
The RAPIDS Accelerator for Apache Spark provides a set of plug-ins for Apache Spark that leverage GPUs to accelerate processing via RAPIDS and UCX software.
RAPIDS relies on CUDA primitives for low-level compute optimization but exposes that GPU parallelism and high-memory bandwidth through user-friendly Python interfaces. RAPIDS supports end-to-end data science workflows, from data loading and preprocessing to machine learning, graph analytics, and visualization. It’s a fully functional Python stack that scales to enterprise big-data use cases.
RAPIDS’s data loading, preprocessing, and ETL features are built on Apache Arrow for loading, joining, aggregating, filtering, and otherwise manipulating data, all in a pandas-like API familiar to data scientists. Users can expect typical speedups of 10X or greater.
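A minimal sketch of what a pandas-like API means in practice, assuming the RAPIDS DataFrame library cuDF is importable as `cudf`. When it is not installed, the same calls run on pandas, so the example also works on a CPU-only machine. The tables and column names are invented for illustration.

```python
# Use cuDF when a GPU build is installed; otherwise fall back to pandas.
try:
    import cudf as xdf  # GPU DataFrame library from RAPIDS
except ImportError:
    import pandas as xdf  # CPU fallback exposing the same calls used below

# Two small illustrative tables
orders = xdf.DataFrame(
    {"customer_id": [1, 2, 1, 3, 2], "amount": [20.0, 35.5, 12.5, 8.0, 4.5]}
)
customers = xdf.DataFrame(
    {"customer_id": [1, 2, 3], "region": ["east", "west", "east"]}
)

# Join, filter, and aggregate with the familiar DataFrame calls
joined = orders.merge(customers, on="customer_id", how="inner")
large = joined[joined["amount"] > 5.0]
totals = large.groupby("region")["amount"].sum().sort_index()

# Bring the result back to the host as a plain dict for display
totals_host = totals.to_pandas() if hasattr(totals, "to_pandas") else totals
result = totals_host.to_dict()
print(result)
```

The only line that differs between the CPU and GPU versions is the import, which is the sense in which the API is familiar to pandas users.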
RAPIDS’s machine learning algorithms and mathematical primitives follow a familiar scikit-learn-like API. Popular tools like XGBoost, Random Forest, and many others are supported for both single GPU and large data center deployments. For large datasets, these GPU-based implementations can complete 10-50X faster than their CPU equivalents.
RAPIDS’s graph algorithms, such as PageRank, are exposed through a NetworkX-like API and make efficient use of the massive parallelism of GPUs to accelerate analysis of large graphs by over 1000X. Explore up to 200 million edges on a single NVIDIA A100 Tensor Core GPU and scale to billions of edges on NVIDIA DGX™ A100 clusters.
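For readers unfamiliar with PageRank, here is a plain-Python reference sketch of the power-iteration form of the algorithm. It is not the RAPIDS implementation, runs on the CPU, and uses a made-up four-node graph. It only shows the per-edge work that a GPU version would parallelize.

```python
def pagerank(edges, damping=0.85, tol=1e-12, max_iter=200):
    """Power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}  # start from a uniform distribution

    for _ in range(max_iter):
        # Rank held by nodes without outgoing edges is spread evenly
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}

        # Each node passes an equal share of its rank along every outgoing edge
        for src in nodes:
            if out_links[src]:
                share = damping * rank[src] / len(out_links[src])
                for dst in out_links[src]:
                    new_rank[dst] += share

        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Illustrative graph: a three-node cycle plus one extra node pointing into it
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "c")]
ranks = pagerank(edges)
for node, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(node, round(score, 4))
```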
RAPIDS’s visualization features support GPU-accelerated cross-filtering. Inspired by the original JavaScript crossfilter library, it enables interactive, fast multidimensional filtering of tabular datasets with over 100 million rows.
While deep learning is effective in domains like computer vision, natural language processing, and recommenders, there are areas where its use isn’t mainstream. Tabular data problems, which consist of columns of categorical and continuous variables, commonly make use of techniques like XGBoost, gradient boosting, or linear models. RAPIDS streamlines preprocessing of tabular data on GPUs and provides a seamless handoff of data directly to any framework that supports DLPack, like PyTorch, TensorFlow, and MXNet. These integrations open up new opportunities for creating rich workflows, even those previously out of reach, like feeding new features created from deep learning frameworks back into machine learning algorithms.
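The DLPack handoff mentioned above is a zero-copy exchange: the consumer receives a view of the producer's memory rather than a copy. The sketch below uses NumPy on the CPU as a stand-in for both sides, assuming NumPy 1.22 or newer. GPU libraries expose their own DLPack entry points, which are not shown here.

```python
import numpy as np

# Producer side: any array object that implements the DLPack protocol
source = np.arange(6, dtype=np.float32)

# Consumer side: import the same buffer through DLPack, without copying
shared = np.from_dlpack(source)

# A write through the producer is visible to the consumer,
# because both objects refer to the same memory
source[0] = 42.0
print(shared[0], np.shares_memory(source, shared))
```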
There are five key ingredients to building AI-optimized data centers in the enterprise. The key to the design is placing GPUs at the center.
With their tremendous computational performance, systems with NVIDIA GPUs are the core compute building block for AI data centers. NVIDIA DGX systems deliver groundbreaking AI performance and can replace, on average, 50 dual-socket CPU servers. This is the first step to giving data scientists the industry’s most powerful tools for data exploration.
By hiding the complexities of working with the GPU and the behind-the-scenes communication protocols within the data center architecture, RAPIDS creates a simple way to get data science done. As more data scientists use Python and other high-level languages, providing acceleration without code change is essential to rapidly improving development time.
Remote direct memory access (RDMA) in NVIDIA Mellanox® network interface controllers (NICs), NCCL2 (NVIDIA Collective Communications Library), and OpenUCX (an open-source point-to-point communication framework) have led to tremendous improvements in training speed. With RDMA allowing GPUs to communicate directly with each other across nodes at up to 100 gigabits per second (Gb/s), they can span multiple nodes and operate as if they were on one massive server.
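To put the 100 Gb/s figure in more familiar units, the arithmetic below converts it to gigabytes per second and estimates an ideal transfer time. The 10 GB payload is a hypothetical example, and the estimate ignores protocol overhead and latency.

```python
link_gbps = 100  # link speed in gigabits per second, from the text above

# 8 bits per byte: convert gigabits per second to gigabytes per second
bytes_per_sec = link_gbps * 1e9 / 8
gb_per_sec = bytes_per_sec / 1e9


def transfer_seconds(payload_gb, link_gb_per_sec=gb_per_sec):
    """Ideal time to move payload_gb gigabytes at the peak link rate."""
    return payload_gb / link_gb_per_sec


example_payload_gb = 10.0  # hypothetical payload size
seconds = transfer_seconds(example_payload_gb)
print(f"{gb_per_sec} GB/s peak; {example_payload_gb} GB takes about {seconds:.2f} s")
```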
Enterprises are moving to Kubernetes and Docker containers for deploying pipelines at scale. Combining containerized applications with Kubernetes enables businesses to change priorities on what task is the most important and adds resiliency, reliability, and scalability to AI data centers.
GPUDirect® Storage allows both NVMe and NVMe over Fabric (NVMe-oF) to read and write data directly to the GPU, bypassing the CPU and system memory. This frees up the CPU and system memory for other tasks, while giving each GPU access to orders of magnitude more data at up to 50 percent greater bandwidth.
NVIDIA is committed to simplifying, unifying, and accelerating data science for the open-source community. By optimizing the whole stack—from hardware to software—and by removing bottlenecks for iterative data science, NVIDIA is helping data scientists everywhere do more than ever with less. This translates into more value for enterprises from their most precious resources: their data and data scientists. As Apache 2.0 open-source software, RAPIDS brings together an ecosystem on GPUs.