Search Results

Showing 1-20 of 9,967 results
  1. LICOM3-CUDA: a GPU version of LASG/IAP climate system ocean model version 3 based on CUDA

    The ocean general circulation model (OGCM) is an essential tool for researching oceanography and atmospheric science. The LASG/IAP climate system...

    Junlin Wei, Jinrong Jiang, ... Yuzhu Wang in The Journal of Supercomputing
    Article 18 January 2023
  2. The Use of Functional Programming Library for Parallel Computing on CUDA

    Modern graphics accelerators (GPUs) can significantly speed up the execution of numerical problems. However, porting programs to graphics...

    M. M. Krasnov, O. B. Feodoritova in Programming and Computer Software
    Article 01 February 2024
  3. Improving CUDA performance of an unstructured high-order CFD application under OP2 framework

    OP2 is a domain-specific language-based programming framework for unstructured mesh applications. It supports automatic code generation targeting...

    Kangjin Huang, Yonggang Che, ... Jian Zhang in The Journal of Supercomputing
    Article 07 October 2023
  4. Many-BSP: an analytical performance model for CUDA kernels

    The unknown behavior of GPUs and the differing characteristics among their generations present a serious challenge in the analysis and optimization...

    Ali Riahi, Abdorreza Savadi, Mahmoud Naghibzadeh in Computing
    Article 26 February 2024
  5. Migrating CUDA Code

    Chapter 21 describes terminology, concepts, techniques, and tools to keep in mind when migrating CUDA code to C++ with SYCL. It describes places...
    James Reinders, Ben Ashbaugh, ... Xinmin Tian in Data Parallel C++
    Chapter Open access 2023
  6. CUDA-aware MPI implementation of Gibbs sampling for an IRT model

    Item response theory (IRT) is a popular approach for addressing large-scale assessment problems in psychometrics and other areas of applied research....

    William S. Welling, Yanyan Sheng, Michelle M. Zhu in Cluster Computing
    Article 06 June 2023
  7. swCUDA: Auto parallel code translation framework from CUDA to ATHREAD for new generation sunway supercomputer

    Since specific hardware characteristics and low-level programming model are adapted to both NVIDIA GPU and new generation Sunway architecture,...

    Maoxue Yu, Guanghao Ma, ... Zhiqiang Wei in CCF Transactions on High Performance Computing
    Article Open access 11 January 2024
  8. A CUDA-based parallel optimization method for SM3 hash algorithm

    Hash algorithms are among the most crucial algorithms in cryptography. The SM3 algorithm is a hash cryptographic standard of China. Because of the...

    Jichang Han, Tao Peng, Xuesong Zhang in The Journal of Supercomputing
    Article 10 June 2024
  9. A novel video compression model based on GPU virtualization with CUDA platform using bi-directional RNN

    The exponential increase of superfluous video content across the web applications has provoked the evolution of proficient video compression...

    N. J. Satheesh Kumar, C. H. Arun in International Journal of Information Technology
    Article 23 October 2023
  10. StreamRec: A Recommendation Inference System with CUDA Stream Acceleration

    Deep learning based recommendation models are widely used in various applications. There are often dozens of groups of sparse features in the input...
    Yuean Niu, Zhizhen Xu, ... Chen Xu in Database Systems for Advanced Applications
    Conference paper 2024
  11. Fast CUDA Geomagnetic Map Builder

    In this paper, we use kriging techniques and inverse distance weighting (IDW) to generate geomagnetic maps in Romania. Kriging is a method of spatial...
    Delia Spridon, Adrian Marius Deaconu, Laura Ciupala in Computational Science and Its Applications – ICCSA 2023
    Conference paper 2023
  12. An Empirical Study of Memory Pool Based Allocation and Reuse in CUDA Graph

    As the size of deep neural network models continues to increase, it places higher demands for memory capacity and allocation efficiency. NVIDIA GPUs...
    Ruyi Qian, Mengjuan Gao, ... Yuanchao Xu in Algorithms and Architectures for Parallel Processing
    Conference paper 2024
  13. Porting Numerical Integration Codes from CUDA to oneAPI: A Case Study

    We present our experience in porting optimized CUDA implementations to oneAPI. We focus on the use case of numerical integration, particularly the...
    Ioannis Sakiotis, Kamesh Arumugam, ... Mohammad Zubair in High Performance Computing
    Conference paper 2023
  14. GPU-CUDA Implementation of the Third Order Gaussian Recursive Filter

    Gaussian convolution operation is a fundamental procedure in several data analysis tasks and scientific fields. For example, Gaussian convolution is...

    Pasquale De Luca, Ardelio Galletti, Livia Marcellino in SN Computer Science
    Article 19 November 2021
  15. An efficient parallelization method of Dempster–Shafer evidence theory based on CUDA

    The Dempster–Shafer (D–S) evidence theory is effective for uncertain reasoning; it does not require advanced information. The theory has been widely...

    Kaiyi Zhao, Li Li, ... Gang Yuan in The Journal of Supercomputing
    Article 28 September 2022
  16. Teaching High–performance Computing Systems – A Case Study with Parallel Programming APIs: MPI, OpenMP and CUDA

    High performance computing (HPC) education has become essential in recent years, especially that parallel computing on high performance computing...
    Pawel Czarnul, Mariusz Matuszek, Adam Krzywaniak in Computational Science – ICCS 2024
    Conference paper 2024
  17. Multidimensional adaptative and deterministic integration in CUDA and OpenMP

    Parallelization schemes on many-core architectures, in this case, CUDA and OpenMP, are used to accelerate and improve the accuracy of adaptive...

    R. Quintero-Monsebaiz, A. Meneses-Viveros, ... A. Vela in The Journal of Supercomputing
    Article 01 April 2021
  18. Improving detection and classification of diabetic retinopathy using CUDA and Mask RCNN

    Diabetic retinopathy (DR) is an eye disease caused by diabetes and can progress to certain degrees. Because the final stage of DR can cause blindness,...

    Abdüssamed Erciyas, Necaattin Barışçı, ... Hüseyin Polat in Signal, Image and Video Processing
    Article 14 August 2022
  19. cuRCD: Region covariance descriptor CUDA implementation

    Region covariance is a robust feature descriptor that allows the use of even the simplest image features like intensity and gradient...

    M. Ali Asan, Adnan Ozsoy in Multimedia Tools and Applications
    Article 01 March 2021
  20. A hybrid CUDA, OpenMP, and MPI parallel TCA-based domain adaptation for classification of very high-resolution remote sensing images

    Domain Adaptation (DA) is a technique that aims at extracting information from a labeled remote sensing image to allow classifying a different image...

    Alberto S. Garea, Dora B. Heras, ... Begüm Demir in The Journal of Supercomputing
    Article Open access 29 November 2022