[QST] Read entire images through "CuFileDriver" pread #446

Closed
sohomb91 opened this issue Nov 18, 2022 · 2 comments
Assignees: gigony
Labels: question (Further information is requested)

Comments


sohomb91 commented Nov 18, 2022

Hi Team,

I am currently working on a project to test the performance of different storage systems and their corresponding I/O, leveraging the GPU for AI workloads.
This brought me to cuCIM (I am a bit new here), which utilizes GDS for image processing (just what I was looking for!).

I was playing around a bit with the APIs; however, I got stuck at a point and am not sure how to proceed.

Basically, I am looking to read entire images using CuFileDriver. I know there is also the CuImage API built for this purpose; however, I am reluctant to use it because I plan to include image reads from S3 in my project in the future, and CuFileDriver gives me that flexibility.

So when I try CuFileDriver's pread, specifying the entire byte length with offset 0 and storing the result into a CuPy ndarray (shape: [375, 500, 3]), I do not think I am getting the exact image 3D array back: when I try to plot the image from that array, I do not get the original image. Have a look at the following capture:

[screenshot: cp_arr_cap]
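
Roughly what I am doing, as a simplified sketch (the file name and the 375x500x3 shape are just from my test image; my actual code may differ slightly):

import os
import cupy
from cucim.clara.filesystem import CuFileDriver

fno = os.open("test.jpg", os.O_RDONLY | os.O_DIRECT)
leng = os.stat("test.jpg").st_size

fd = CuFileDriver(fno, False)
# Trying to read the whole file straight into an (H, W, C)-shaped CuPy array
cp_arr = cupy.ndarray((375, 500, 3), dtype=cupy.uint8)
fd.pread(cp_arr, leng, 0)  # entire byte length, offset 0
fd.close()

# Plotting cp_arr does not reproduce the original image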

But this is not the case when I use os.pread() and return the data as a byte string (I tried os.pread() since fd.pread() is similar to its POSIX counterpart):

[screenshot: os_pread]
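
For comparison, something along these lines works for me (again a simplified sketch with a placeholder file name):

import io
import os
from PIL import Image

fno = os.open("test.jpg", os.O_RDONLY)
leng = os.stat("test.jpg").st_size

# os.pread returns the raw bytes as a Python bytes object (CPU memory)
raw = os.pread(fno, leng, 0)
os.close(fno)

# Decoding the byte string gives back the original image
Image.open(io.BytesIO(raw))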

Therefore, I am looking for some help here. I must be doing something wrong with CuFileDriver's pread, and I would very much appreciate any hints: how do I read the entire image into a CuPy ndarray?
Even if there were some way to get a byte string from CuFileDriver's pread, just like os.pread(), that would work for me as well!

Thank you in advance.

@sohomb91 sohomb91 added the question Further information is requested label Nov 18, 2022
@gigony gigony self-assigned this Nov 22, 2022
gigony (Contributor) commented Nov 22, 2022

Hi @sohomb91 ,

What the pread() method returns is the raw (compressed) data, not decoded image data.

You will need to convert the CuPy array (in GPU memory) to a bytes object (in CPU memory) to decode the JPEG data.

from cucim.clara.filesystem import is_gds_available
print(is_gds_available())

import os

import cupy
from cucim.clara.filesystem import CuFileDriver

fno = os.open("test.jpg", os.O_RDONLY | os.O_DIRECT)
leng = os.stat("test.jpg").st_size

fd = CuFileDriver(fno, False)
# 1D buffer in GPU memory that will hold the raw (compressed) JPEG bytes
cp_arr = cupy.ndarray((leng,), dtype=cupy.uint8)
fd.pread(cp_arr, leng, 0)  # read `leng` bytes from offset 0
fd.close()

import io

from PIL import Image

# Copy the raw bytes back to CPU memory and decode them with Pillow
Image.open(io.BytesIO(cp_arr.tobytes()))

[screenshot: decoded image output]

Using PyTorch

PyTorch (torchvision) has a decode_jpeg() method that decodes JPEG data and places the decompressed image into a PyTorch tensor (in GPU memory, though it has a memory leak problem), but it does not accept compressed JPEG data that is already in GPU memory.

import os

import cupy
from cucim.clara.filesystem import CuFileDriver

fno = os.open("test.jpg", os.O_RDONLY | os.O_DIRECT)
leng = os.stat("test.jpg").st_size

fd = CuFileDriver(fno, False)
# Read the raw JPEG bytes into a 1D CuPy array (GPU memory), as above
cp_arr = cupy.ndarray((leng,), dtype=cupy.uint8)
fd.pread(cp_arr, leng, 0)
fd.close()

import torch
from torchvision.io import decode_jpeg

# Wrap the CuPy array as a CUDA tensor and flatten it to one dimension
torch_arr = torch.as_tensor(cp_arr, device='cuda')
torch_one_dim_gpu = torch.flatten(torch_arr)

# `decode_jpeg()` cannot accept a CUDA tensor as input, so copy it to the CPU first
torch_one_dim_cpu = torch_one_dim_gpu.cpu()
# The output shape of `decode_jpeg()` is (C, H, W), so transpose it to (H, W, C)
torch_image = decode_jpeg(torch_one_dim_cpu, device='cuda').permute(1, 2, 0)
print(torch_image.shape)

from PIL import Image
Image.fromarray(torch_image.cpu().numpy(), 'RGB')

# Equivalent CPU-side loading with torchvision, for comparison:
# from torchvision.io import read_file, decode_jpeg
# img_u8 = read_file('test.jpg')
# img_nv = decode_jpeg(img_u8, device='cuda')
# print(img_nv.shape)

[screenshot: decoded image output]

For this reason, using GDS to load a JPEG image into GPU memory does not provide a benefit unless there is a JPEG-decoding method that accepts JPEG data already in GPU memory. (cuCIM uses such a method internally when it reads JPEG-compressed TIFF images, though its performance is not good because it does not have a GPU memory pool.)

cuCIM may expose that method in the future (#427), but it is not planned yet.

For using GDS, please also consider RAPIDS kvikio (it only has a Conda package for now; a PyPI package should be available early next year).

cuCIM may use it internally in the future to provide better features (#221).
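
As a rough sketch (based on kvikio's CuFile API; the file name is just an example and this is not tested here), reading a whole file into GPU memory with kvikio would look like this:

import os

import cupy
import kvikio

leng = os.stat("test.jpg").st_size
cp_arr = cupy.empty((leng,), dtype=cupy.uint8)

# CuFile uses GDS when it is available and falls back to POSIX I/O otherwise
f = kvikio.CuFile("test.jpg", "r")
f.read(cp_arr)  # blocking read of the whole file from offset 0
f.close()

The raw bytes in cp_arr would still need to be decoded on the CPU (e.g., cp_arr.tobytes() plus PIL), for the same reason as above.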

Thanks,
Gigon

sohomb91 (Author)

Many thanks for confirming this. :)
I would really love it if the JPEG-decoding feature were introduced to support this format (I will follow the progress of #427).

Also, kvikio looks promising; I will give it a try! Thanks...
