What steps did you take and what happened:
We are having trouble with the csi-blob-driver in an AKS cluster, where an Azure Blob container is mounted as a volume. The problem shows up when handling files.
There are two processes in my application. Process 1 writes files to the blob container through the csi-blob-driver mount; process 2 reads files from the blob container using the SDK. If process 2 tries to read a file shortly (about 30 ms) after process 1 finishes writing it (i.e. after calling fileWriter.Close()), it SOMETIMES reports that the file is not available. Based on my analysis so far, the file is available when process 2 starts after a longer delay, whereas it is not available when process 2 starts right after process 1 (within 30-40 ms).
Process 1 is implemented using Apache Spark as part of an ETL pipeline.
This issue is reproducible only for large datasets containing approximately 1000 tables. Smaller datasets (50–100 tables) work reliably without encountering this behavior.
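
To measure how long a freshly written file takes to become readable on the process 2 side, a small polling helper could be used. This is only a minimal sketch, not our actual code: `wait_until_visible` and its parameters are hypothetical names, and `exists` stands for whatever check the SDK in use provides for "is this blob readable yet".

```python
import time
from typing import Callable, Optional


def wait_until_visible(
    exists: Callable[[], bool],
    timeout_s: float = 5.0,
    initial_delay_s: float = 0.05,
    max_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[float]:
    """Poll exists() with exponential backoff.

    Returns the seconds elapsed until exists() first returned True,
    or None if the timeout expired first.
    """
    start = clock()
    delay = initial_delay_s
    while True:
        # Hypothetical check supplied by the caller, e.g. an SDK "blob exists" call.
        if exists():
            return clock() - start
        remaining = timeout_s - (clock() - start)
        if remaining <= 0:
            return None
        # Never sleep past the deadline; double the delay up to max_delay_s.
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay_s)
```

If process 2 used the Python `azure-storage-blob` package (an assumption, the SDK language is not stated above), `exists` could be something like `lambda: blob_client.exists()`. Logging the returned delay per file would show whether the lag grows with the number of tables.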