Description
What happened:
When writing a file on a host, sometimes a zero-filled region may be written instead of the actual data.
What you expected to happen:
No data corruption.
How to reproduce it (as minimally and precisely as possible):
On one host: pv -L 3k /dev/urandom > /mnt/jfs-partition/test-file
On another host: while ! hexdump -C /mnt/jfs-partition/test-file | grep '00 00 00 00'; do sleep 1; done
After some time, the command on the second host will find a large cluster of zeroes (>1k) and stop. In the file, you see, for instance:
00033130 26 32 91 00 00 00 00 00 00 00 00 00 00 00 00 00 |&2..............|
00033140 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00036000 88 ff a0 0b c6 17 26 95 78 b7 e3 28 f5 35 8b 98 |......&.x..(.5..|
Anything else we need to know?
More details:
- The network on the writing host is quite overloaded/unstable; I believe this may be related because the issue only occurs when the network is overloaded. No other (hardware, disk) failures are observed on the host;
- The zero-byte clusters often, but not always, have a size divisible by 1 KiB;
- EIO errors sometimes happen on the writing side (because of the bad networking), but they don't seem to correlate directly with the corruption;
- Zero clusters appear in the middle of the file, not at the beginning or end;
- Compression and encryption are enabled.
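To make the zero-cluster observations above reproducible, here is a small diagnostic sketch (not part of JuiceFS; `find_zero_runs` is a hypothetical helper) that scans a file for NUL runs and reports whether each run's size is 1 KiB-divisible, matching the >1k threshold used in the repro:

```python
#!/usr/bin/env python3
"""Scan a file for runs of zero bytes and report their offsets and sizes."""
import sys

def find_zero_runs(data: bytes, min_len: int = 1024):
    """Return [(offset, length), ...] for every run of NUL bytes >= min_len."""
    runs = []
    start = None
    for i, b in enumerate(data):
        if b == 0:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i - start))
            start = None
    # Handle a run that extends to the end of the file.
    if start is not None and len(data) - start >= min_len:
        runs.append((start, len(data) - start))
    return runs

if __name__ == "__main__":
    data = open(sys.argv[1], "rb").read()
    for off, length in find_zero_runs(data):
        print(f"offset=0x{off:08x} len={length} 1k-divisible={length % 1024 == 0}")
```

Running it over the affected file should list each corrupted region once, which is easier to tabulate than grepping hexdump output.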
Unfortunately, the writing server is a production host, so its logs contain a lot of unrelated noise. The following entries may be relevant (the file inode is 2439005):
2024/07/30 07:53:45.891705 juicefs[4019926] <ERROR>: write inode:2439005 indx:0 input/output error [writer.go:211]
2024/07/30 07:53:52.457389 juicefs[4019926] <WARNING>: slow request: PUT chunks/20/20750/20750481_3_4194304 (req_id: "", err: Put "https://somehost/somedb/%2Fchunks%2F20%2F20750%2F20750481_3_4194304": write tcp 10.42.43.15:60050->100.118.102.36:443: use of closed network connection, cost: 59.994437087s) [cached_store.go:667]
[mysql] 2024/07/30 07:54:35 packets.go:37: read tcp 10.42.43.15:47762->10.0.8.236:3306: read: connection reset by peer
2024/07/30 07:54:35.379260 juicefs[4019926] <WARNING>: Upload chunks/20/20750/20750498_5_4194304: timeout after 1m0s: function timeout (try 1) [cached_store.go:407]
2024/07/30 07:54:39.800111 juicefs[4019926] <INFO>: slow operation: flush (2439005,17488,6D9AF143D1C50E5B) - input/output error <53.869292> [accesslog.go:83]
What else do we plan to try:
- Writing from different hosts with similar networking to rule out hardware issues;
- Checking the metadata to see if the zero clusters correspond to single chunks.
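For the second item, a sketch of the arithmetic we plan to use: mapping a zero run's file offset to the chunk and block it falls in. This assumes the default JuiceFS layout of 64 MiB chunks split into 4 MiB blocks (the `_4194304` suffix in the log's object keys matches a 4 MiB block size); adjust the constants if the volume was created with different settings.

```python
# Hypothetical helper, assuming JuiceFS defaults: 64 MiB chunks, 4 MiB blocks.
CHUNK_SIZE = 64 << 20   # 64 MiB per chunk
BLOCK_SIZE = 4 << 20    # 4 MiB per object-store block (4194304 bytes)

def locate(offset: int):
    """Return (chunk_index, block_index_within_chunk, offset_within_block)
    for a byte offset in the file."""
    chunk = offset // CHUNK_SIZE
    in_chunk = offset % CHUNK_SIZE
    return chunk, in_chunk // BLOCK_SIZE, in_chunk % BLOCK_SIZE

# Example: the zero run observed at file offset 0x33130
print(locate(0x33130))  # -> (0, 0, 209200)
```

The resulting chunk indexes can then be compared against the slice list that the metadata engine holds for inode 2439005, to see whether each zero cluster is confined to a single chunk.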
Environment:
- JuiceFS version (use juicefs --version) or Hadoop Java SDK version: juicefs version 1.2.0+2024-06-18.873c47b922ba (both hosts)
- Cloud provider or hardware configuration running JuiceFS: Bare-metal host on the writing side, Aliyun ECS on the reading side
- OS (e.g. cat /etc/os-release): NixOS 24.11 (Vicuna) (both hosts)
- Kernel (e.g. uname -a): 6.1.90 (both hosts)
- Object storage (cloud provider and region, or self maintained): Aliyun OSS
- Metadata engine info (version, cloud provider managed or self maintained): Aliyun RDS, MySQL
- Network connectivity (JuiceFS to metadata engine, JuiceFS to object storage): Private Aliyun networking on the reading side, public Internet on the writing side
- Others: