[BUG]: single_device_tls_caching_allocator doesn't free memory on failed allocation #4222

Open
1 task done
gevtushenko opened this issue Mar 20, 2025 · 1 comment
Labels
bug Something isn't working right.

Comments

@gevtushenko
Collaborator

Is this a duplicate?

Type of Bug

Runtime Error

Component

Thrust

Describe the bug

Users currently have to work around an issue in thrust::detail::single_device_tls_caching_allocator by reaching into the detail namespace and releasing unused memory themselves when an allocation fails. Without the try-catch block, the following code fails:

#include <thrust/detail/caching_allocator.h>
#include <thrust/device_vector.h>
#include <thrust/execution_policy.h>
#include <thrust/sort.h>

#include <iostream>
#include <string>

int main(int argc, char const *argv[])
{
  size_t mode = 0;
  if (argc > 1)
  {
    mode = std::stoi(argv[1]);
  }

  size_t nbSteps   = 100000;
  size_t deltaStep = 5;

  thrust::device_vector<size_t> target(nbSteps * deltaStep, 1);

  for (size_t i = 0; i < nbSteps; ++i)
  {
    std::cerr << "Steps " << i << "\r";

    if (mode == 0)
    {
      thrust::sort(thrust::cuda::par(thrust::detail::single_device_tls_caching_allocator()),
                   target.begin(),
                   target.end());
    }
    else
    {
      try
      {
        thrust::sort(thrust::cuda::par(thrust::detail::single_device_tls_caching_allocator()),
                     target.begin(),
                     target.begin() + (deltaStep * (i + 1)));
      }
      catch (...)
      {
        thrust::detail::single_device_tls_caching_allocator().resource()->release();
        thrust::sort(thrust::cuda::par(thrust::detail::single_device_tls_caching_allocator()),
                     target.begin(),
                     target.begin() + (deltaStep * (i + 1)));
      }
    }
  }
}
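
The try-catch workaround above can be factored into a small helper so that call sites do not repeat the retry logic. The following is only a sketch: `retry_after_release` and `fake_resource` are hypothetical names introduced for illustration and are not part of Thrust. The helper assumes nothing about the resource beyond a `release()` member function.

```cpp
#include <cstddef>
#include <new>

// Hypothetical stand-in for a caching memory resource (not a Thrust type).
// Only the release() member matters for this sketch.
struct fake_resource
{
  std::size_t cached_bytes = 0;

  void release()
  {
    cached_bytes = 0;
  }
};

// Runs `op`. If it throws, asks `resource` to drop its cached memory and
// runs `op` one more time. A failure on the second attempt propagates.
template <class Resource, class Op>
void retry_after_release(Resource& resource, Op&& op)
{
  try
  {
    op();
  }
  catch (...)
  {
    resource.release();
    op();
  }
}
```

With the code from this report, the call would presumably look like `retry_after_release(*thrust::detail::single_device_tls_caching_allocator().resource(), [&] { thrust::sort(...); });`. This still reaches into the detail namespace, so it only tidies the workaround and does not replace a fix in the allocator.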

How to Reproduce

Remove the try-catch block from the code above.

Expected behavior

An unsuccessful do_allocate in disjoint_pool should free unused memory and attempt the allocation again.
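
To make the expected behavior concrete, here is a toy model of a caching pool that retries once after dropping its cache. It is not Thrust code: `toy_caching_pool` is a hypothetical class that only counts bytes and never touches real memory. The exact-size reuse policy is an assumption chosen to mimic the growing request sizes in the reproducer, not a description of how disjoint_pool actually caches.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Toy model (hypothetical, not Thrust): tracks byte counts against a fixed
// capacity and keeps freed blocks in a cache for later reuse.
class toy_caching_pool
{
public:
  explicit toy_caching_pool(std::size_t capacity)
      : capacity_(capacity)
  {}

  // Expected behavior: on failure, release cached blocks and try once more.
  // A second failure propagates as std::bad_alloc.
  void allocate(std::size_t bytes)
  {
    try
    {
      try_allocate(bytes);
    }
    catch (const std::bad_alloc&)
    {
      release();
      try_allocate(bytes);
    }
  }

  // Returns a block to the cache instead of to the upstream capacity.
  void deallocate(std::size_t bytes)
  {
    assert(bytes <= in_use_);
    in_use_ -= bytes;
    cached_.push_back(bytes);
  }

  // Drops every cached block, making its bytes available again.
  void release()
  {
    cached_.clear();
  }

  std::size_t in_use() const
  {
    return in_use_;
  }

  std::size_t cached_bytes() const
  {
    std::size_t total = 0;
    for (std::size_t block : cached_)
    {
      total += block;
    }
    return total;
  }

private:
  void try_allocate(std::size_t bytes)
  {
    // Assumed policy: reuse a cached block only if the size matches exactly.
    for (auto it = cached_.begin(); it != cached_.end(); ++it)
    {
      if (*it == bytes)
      {
        cached_.erase(it);
        in_use_ += bytes;
        return;
      }
    }
    // Cached blocks still occupy capacity until release() is called.
    if (in_use_ + cached_bytes() + bytes > capacity_)
    {
      throw std::bad_alloc();
    }
    in_use_ += bytes;
  }

  std::size_t capacity_;
  std::size_t in_use_ = 0;
  std::vector<std::size_t> cached_;
};
```

In this model, a sequence of growing requests fills the cache with blocks that are never reused, which is the pattern the reproducer exercises. With the retry inside `allocate`, the caller no longer needs a try-catch of its own.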

Reproduction link

No response

Operating System

No response

nvidia-smi output

No response

NVCC version

No response

@gevtushenko added the bug label Mar 20, 2025
github-project-automation bot moved this to Todo in CCCL Mar 20, 2025
@gevtushenko
Collaborator Author

CC @dkolsen-pgi
