Dart File.writeAsString() method does not write to file if await is not done immediately #36087

Open
renatoathaydes opened this issue Mar 3, 2019 · 23 comments
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. library-io P2 A bug or feature request we're likely to work on triaged Issue has been triaged by sub team type-bug Incorrect behavior (everything from a crash to more subtle misbehavior)

Comments

@renatoathaydes

  • Dart SDK Version (dart --version)

Dart VM version: 2.2.0 (Unknown timestamp) on "linux_x64"

  • Whether you are using Windows, MacOSX, or Linux (if applicable)

Linux 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

  • Whether you are using Chrome, Safari, Firefox, Edge (if applicable)

N/A


I have the following Dart code that doesn't behave as I expected:

import 'dart:io';

final File file = File("result.csv");

Future send(String message) async {
  try {
    await file.writeAsString(message + '\n',
        mode: FileMode.append, flush: true);
  } catch (e) {
    print("Error: $e");
  }
  return await file.length();
}

main() async {
  final futures = <Future>[];
  for (int i = 0; i < 100; i++) {
    futures.add(send("$i"));
  }
  for (int i = 0; i < 100; i++) {
    print(await futures[i]);
  }
}

Expected behaviour

I expected the file to be written as soon as each call to await futures[i] in the second loop returned. However this does not seem to be happening.

The file should contain one line for each index from 0 to 99.

The length of the file that is printed on each iteration of the await loop should show the file's length increasing on each call. Example:

3
6
9
12
...

Observed behaviour

Only the last call in the loop seems to write to the file. The resulting file contains a line with 99 followed by an empty line.

The print calls in the second loop always print the same file length, 3:

3
3
3
3
...

The event loop seems to be somehow merging the calls and only actually executing the last call, even though I still get 100 different futures that I await in the second loop.

The argument for each call to file.writeAsString() is different on every call, so there should be no merging happening.

If the code is modified to await on each call to send(), then the expected behaviour is observed, but that means the caller of send() cannot proceed without waiting for the file to be written to, which is not the desired behaviour (the caller does wait later, in the second loop).

Posted on StackOverflow: https://stackoverflow.com/questions/54958346/dart-file-writeasstring-method-does-not-write-to-file-if-await-is-not-done-imm


@lrhn
Member

lrhn commented Mar 4, 2019

You are asking the operating system to append to a file 100 times, before any of them get the chance to actually do the append. So, all of them see the existing zero-length file, then all of them increase the length, and then all of them write into the newly allocated space, overwriting each other.
Which write finally succeeds is random (mine was 97, then 99).

The documentation for O_APPEND says that a file opened with that flag seeks to the end of the file before each write, and that the seek and write are atomic. It seems that the dart:io write operations with append mode are not using O_APPEND (a relatively safe bet since O_APPEND does not occur in the source code).
We should probably make it do that, so appending has the expected behavior.

@lrhn lrhn added area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. library-io labels Mar 4, 2019
@renatoathaydes
Author

Thank you very much for the answer. I had already opened a question on StackOverflow, if you'd like to add details there later (or when the issue is resolved) it could be helpful for others: https://stackoverflow.com/questions/54958346/dart-file-writeasstring-method-does-not-write-to-file-if-await-is-not-done-imm

@sortie sortie added P2 A bug or feature request we're likely to work on type-bug Incorrect behavior (everything from a crash to more subtle misbehavior) labels Mar 4, 2019
@sortie
Contributor

sortie commented Mar 4, 2019

Yeah it doesn't look like appending is actually implemented in the Dart SDK. I'll turn this bug into a request for implementing the appending behavior.

@sortie
Contributor

sortie commented Mar 4, 2019

Alright, I understand how FileMode.append is implemented. It is implemented, but not quite as I expected. The file handle is seeked to the end once, right after it's opened. That's why it's possible that the send() calls all write to the start of the file, which is where the end was when they were opened. If they are done sequentially, the correct file offset is used.

The current documentation for FileMode.append is:

/// Mode for opening a file for reading and writing to the
/// end of it. The file is created if it does not already exist.
static const append = const FileMode._internal(2);

This technically doesn't mention the append semantics, though "writing to the end of it" does somewhat suggest the O_APPEND POSIX semantics.

The action items I see here are perhaps clarifying the documentation and potentially implementing the O_APPEND mode (and its Windows equivalent), as this is a useful feature for things like log files (although ensuring that concurrent long writes to those files are atomic and don't interleave may not be possible without file locking).

@attdona

attdona commented Mar 4, 2019

If I understand correctly the problem is in the actual File implementation.

File.writeAsString calls writeAsBytes, which opens the file on every call:

Future<File> writeAsBytes(List<int> bytes,
    {FileMode mode: FileMode.write, bool flush: false}) {
  return open(mode: mode).then((file) {
    return file.writeFrom(bytes, 0, bytes.length).then<File>((_) {
      if (flush) return file.flush().then((_) => this);
      return this;
    }).whenComplete(file.close);
  });
}

If open allocates a new file descriptor with open(2) then synchronization is not possible.

@sortie
Contributor

sortie commented Mar 4, 2019

Synchronization is possible if using O_APPEND at the open(2) system call level, since that atomically sets the file offset to the file's length and does the write.

@renatoathaydes
Author

I experienced this problem also when I was calling file.open(FileMode.write) ... before... I changed to writeAsString() imagining it would help.... pretty sure the same thing happened in both cases.

@renatoathaydes
Author

renatoathaydes commented Mar 4, 2019

This program in Go, which should be roughly equivalent to the Dart one, writes every line (though out-of-order, as I expected) and prints the expected output:

package main

import (
	"fmt"
	"os"
	"sync"
)

const filename = "results.txt"

func main() {
	var wg sync.WaitGroup
	wg.Add(100)
	for i := 0; i < 100; i++ {
		go send(fmt.Sprintln(i), &wg)
	}
	wg.Wait()
	print("Done")
}

func send(msg string, wg *sync.WaitGroup) {
	defer wg.Done()
	f, err := os.OpenFile(filename, os.O_APPEND|os.O_WRONLY|os.O_CREATE, 0600)
	if err != nil {
		panic(err)
	}
	defer f.Close()
	_, err = f.WriteString(msg)
	if err != nil {
		panic(err)
	}
	stat, err := f.Stat()
	if err != nil {
		panic(err)
	}
	fmt.Println(stat.Size())
}

@attdona

attdona commented Mar 5, 2019

Synchronization is possible if using O_APPEND at the open(2) system call level, since that atomically sets the file offset to the file's length and does the write.

Yes, I confirm. I was reproducing the problem in C using a wrong combination of open(2) flags/mode, which misled me about open(2) behavior.

Sorry for the noise!

@b-cancel

I believe this might help someone
The example is for writing to a file but can easily be adapted to appending

Basically here I'm handling things myself so that I don't have to rely on the system to handle it for me
I needed this because, in my program, I am sometimes writing to the file many times every 2 or 3 seconds
Every so often I would get an invalid data format (since I am encoding into JSON)
Since I am writing multiple times within a short span of time and since what I am writing is at times large
Sometimes I would request a write before the previous one had completed
And this would cause my file to have a weird mesh of both requests and break things

I could have used writeAsStringSync (which I imagine would have fixed this problem) but I didn't want to slow anything down since the data being saved is just for the user's convenience
So, If they lose their last edit then no big deal
BUT if the app slows down then I'm DEFINITELY making things LESS convenient NOW just so that I can MAYBE make things MORE convenient LATER...
Which is not a good trade-off (for me)

So I instead opted for simply creating a request-write function that I called "safeSave"
It simply saves all the requests to a queue and then another function is constantly trying to empty the queue

You could just finish the current request and then process the next newest one
EX: REQ1 [processing] , REQ2[no longer in memory], REQ 3[processing after REQ1]
which is what I'll end up doing after this but this worked for me

import 'dart:collection';
import 'dart:io';

Map<File, Queue<String>> fileToWriteRequests = new Map<File, Queue<String>>();
Map<File, bool> fileToIsWriting = new Map<File, bool>();

//NOTE: this assumes that the file at least exists
safeSave(File file, String data){
  //this is the first time we are told to write to this particular file
  //so we create the slot for it
  if(fileToWriteRequests.containsKey(file) == false){
    fileToWriteRequests[file] = new Queue<String>();
    fileToIsWriting[file] = false;
  }

  //add this to the queue
  //(the `!` is needed under null safety; the slot was created above)
  fileToWriteRequests[file]!.add(data);
  
  //check if this file is already being written to asynchronously
  //IF it is then eventually this write request will process
  //ELSE begin the process
  if(fileToIsWriting[file] == false){
    fileToIsWriting[file] = true;
    startProcessingRequests(file);
  }
}

startProcessingRequests(File file)async{
  //keep processing write requests until the queue is empty
  while(fileToWriteRequests[file]!.isNotEmpty){
    //grab the next bit of data to write
    String dataToWrite = fileToWriteRequests[file]!.removeFirst();

    //write it
    //1. keeps things asynchronous for less app jitter
    //2. in the worst case we lose some data (very little)
    await file.writeAsString(
      dataToWrite,
      //overwrite the file if it's already been written to, and open it only for writing
      mode: FileMode.writeOnly,
      //ensure data integrity but takes a bit longer
      flush: true,
    );
  }

  //queue is empty and writing is complete
  fileToIsWriting[file] = false;
}

@b-cancel

Here is the version that just keeps track of 2 things

  1. whether the file is being written to or not
  2. the last string that was waiting to be written because the file was already being written to

import 'dart:io';

//keep track of the files currently being written to
Set<File> filesWeAreWritingTo = new Set<File>();

//keep track of the newest waiting data
Map<File, String> fileToNextDataToBeWritten  = new Map<File, String>();

//NOTE: this assumes that the file at least exists
safeSave(File file, String data){
  //If the file is already being written to
  if(filesWeAreWritingTo.contains(file)){
    //save our data for writing after it completes
    //NOTE: may have overwritten old waiting data
    fileToNextDataToBeWritten[file] = data;
  }
  else _writeToFile(file, data);
}

//write to file is a separate function so we can easily recurse
_writeToFile(file, data) async {
  //mark this file as being written into
  filesWeAreWritingTo.add(file);

  //write into it
  await file.writeAsString(
    data,
    //overwrite the file if it's already been written to, and open it only for writing
    mode: FileMode.writeOnly,
    //ensure data integrity but takes a bit longer
    flush: true,
  );

  //once finished check if something else was waiting
  if(fileToNextDataToBeWritten.containsKey(file)){
    //grab data waiting
    //(the `!` is needed under null safety; containsKey was checked above)
    String data = fileToNextDataToBeWritten.remove(file)!;
    //NOTE: we keep the being written to flag on
    _writeToFile(file, data);
  }
  else{ //we finished writing to this file (for now)
    filesWeAreWritingTo.remove(file);
  }
}

@brianquinlan
Contributor

brianquinlan commented Jan 24, 2023

Maybe we should document the behavior like NodeJS does i.e.

It is unsafe to use filehandle.write() multiple times on the same file without waiting for the promise to be resolved (or rejected). For this scenario, use filehandle.createWriteStream().
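
Dart's closest counterpart to a Node.js write stream is probably the IOSink returned by File.openWrite, which keeps one handle open and queues writes in call order. A minimal sketch, assuming that fits the use case; appendViaSink and the temp-file names are hypothetical and not from this thread:

```dart
import 'dart:io';

/// Hypothetical helper: appends every line through a single IOSink, so the
/// file is opened once and the writes are queued in call order.
Future<List<String>> appendViaSink(File file, Iterable<String> lines) async {
  final sink = file.openWrite(mode: FileMode.append);
  for (final line in lines) {
    sink.writeln(line); // buffered by the sink, no await between writes
  }
  await sink.flush();
  await sink.close();
  return file.readAsLines();
}

Future<void> main() async {
  final dir = await Directory.systemTemp.createTemp('sink_append');
  final file = File('${dir.path}/result.csv');
  final written =
      await appendViaSink(file, [for (var i = 0; i < 100; i++) '$i']);
  print(written.length); // 100
  await dir.delete(recursive: true);
}
```

The trade-off is that the caller has to own the sink's lifetime (flush and close it), which is more bookkeeping than a one-shot writeAsString call.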

@denisgl7

Is there any solution?

@lrhn
Member

lrhn commented Jun 19, 2023

The safest solution is to wait for the previous future before writing the next string.
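
One way to wait for the previous future without making every caller await immediately is to chain each write onto the one before it. A minimal sketch under that assumption; SerialAppender and its append method are hypothetical names, not part of dart:io:

```dart
import 'dart:io';

/// Hypothetical helper: chains each append onto the previous one, so writes
/// never overlap even when callers do not await right away.
class SerialAppender {
  final File file;
  Future<void> _tail = Future.value();

  SerialAppender(this.file);

  /// Completes with the file length observed right after this write.
  Future<int> append(String line) {
    final result = _tail.then((_) async {
      await file.writeAsString('$line\n', mode: FileMode.append, flush: true);
      return file.length();
    });
    // Keep the chain usable even if this particular write fails.
    _tail = result.then<void>((_) {}, onError: (_) {});
    return result;
  }
}

Future<void> main() async {
  final dir = await Directory.systemTemp.createTemp('serial_append');
  final appender = SerialAppender(File('${dir.path}/result.csv'));
  // Start all 100 writes up front, as in the original report.
  final futures = [for (var i = 0; i < 100; i++) appender.append('$i')];
  final lengths = await Future.wait(futures);
  print(lengths.take(4).toList()); // [2, 4, 6, 8]
  print((await appender.file.readAsLines()).length); // 100
  await dir.delete(recursive: true);
}
```

This only serializes writes that go through the same SerialAppender instance; other code writing to the same path is not coordinated.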

@alexvoina

Can someone help me with an answer here pretty please?

what happens if i call this function multiple times in a row, without awaiting for it? Is it safe?


Future<void> saveData({int? num}) async {
    try {
      await _file.writeAsString(_encryptString(json.encode(_data)));
    } catch (exception, stackTrace) {
      if (kDebugMode) print('Error saving data to file: $exception');
    }
  }

here's an answer from ChatGPT. Is it possible to get concurrent writes?

Concurrent Writes: Since the saveData function is asynchronous and not being awaited, multiple instances of it will run concurrently. This can lead to race conditions where multiple writes to the file might overlap, causing data corruption or unexpected results.

@lrhn
Member

lrhn commented Jun 7, 2024

Seems ChatGPT stumbled on the truth here.

Not awaiting the writes directly, or wrapping them in an async function which does await the write but then not awaiting that function call, gives the same result: the next operation is started before the former has completed.

@alexvoina

alexvoina commented Jun 7, 2024

@lrhn thanks! Can you elaborate a bit on "the next operation is started before the former has completed" ?

I've added some prints to saveData and written this test to understand what's happening

// save data with prints
Future<void> saveData({int? num}) async {
    try {
      print("Save data: $num");
      await _file.writeAsString(_encryptString(json.encode(_data)));
      print("End save data: $num");
    } catch (exception, stackTrace) {
      print('Error saving data to file: $exception');
    }
  }

  // test - save 100 times without awaiting 
  
  var count = 100;
    while (count-- > 0) {      
      manager.saveData(num: count);
    }

When running this test, I can see in the output that the prints before file.writeAsString always come in consecutive order, the prints that come after don't:

// random output example  

Save data: 99
Save data: 98
Save data: 97
Save data: 96
Save data: 95
Save data: 94
Save data: 93
Save data: 92
Save data: 91
Save data: 90
Save data: 89
Save data: 88
Save data: 87
Save data: 86
Save data: 85
Save data: 84
Save data: 83
Save data: 82
... 
Save data: 0


End save data: 99
End save data: 92
End save data: 93
End save data: 91
End save data: 90
End save data: 89
End save data: 88
End save data: 87
End save data: 85
End save data: 83
End save data: 86
End save data: 84
End save data: 82
End save data: 80

So getting back to your statement: "the next operation is started before the former has completed"

What exactly does it mean in this context, and why did ChatGPT stumble on the truth here?

@lrhn
Member

lrhn commented Jun 7, 2024

An async function is like a normal, synchronous function that runs until the first await, then returns a Future.

Everything after that first await runs as a callback on the future being awaited, so not until that future completes.

The rest of the body code needs to complete before the returned future is completed.

That's why everything before the await happens in order, because it happens synchronously. Then the loop is done, control returns to the event loop, and then the awaited futures can start competing in whichever order they complete in.
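
The ordering described above can be reproduced without any file I/O. In this sketch, Completers stand in for the OS operations and are completed in a chosen order; runDemo and task are hypothetical names used only for illustration:

```dart
import 'dart:async';

/// Hypothetical demo: starts one async task per completer, then completes
/// the completers in [completionOrder] and returns the resulting event log.
Future<List<String>> runDemo(List<int> completionOrder) async {
  final log = <String>[];

  Future<void> task(int n, Future<void> operation) async {
    log.add('start $n'); // runs synchronously, inside the call itself
    await operation; // the call returns its Future here
    log.add('end $n'); // runs later, once `operation` has completed
  }

  final ops =
      List.generate(completionOrder.length, (_) => Completer<void>());
  final tasks = [for (var i = 0; i < ops.length; i++) task(i, ops[i].future)];
  log.add('loop done');

  // Stand-in for the OS reporting completions out of order.
  for (final i in completionOrder) {
    ops[i].complete();
  }
  await Future.wait(tasks);
  return log;
}

Future<void> main() async {
  print(await runDemo([2, 0, 1]));
  // [start 0, start 1, start 2, loop done, end 2, end 0, end 1]
}
```

The "start" entries always follow call order, while the "end" entries follow whatever order the awaited operations finish in.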

@alexvoina

Thanks! The part that gets me intrigued is:

and then the awaited futures can start competing in whichever order they complete in.

What determines the order they complete in? In my test, the data that needs to be written to file as text does not change between the calls to saveData(). If the data did change, I would imagine that the call to saveData() that needed to write the least amount of data would complete first. I'm sure in practice it is not like that, and there are other things to consider.

I have a weird bug in production which occurs very rarely, and I'm trying to understand whether this could be the problem. There I'm calling saveData() 4 times in a row without awaiting them, and the data that needs to be written DOES change between consecutive calls. It's a JSON that grows with one field between each call. E.g.

update json:
"key1" : "value 1"

// first call to save data
saveData()

update json:
"key1" : "value 1"
"key2" : "value 2"

// 2nd call to save data
saveData()

update json:
"key1" : "value 1"
"key2" : "value 2"
"key3" : "value 3"

// 3rd call to save data
saveData()

the file ends up with a state of this sort

"key1" : "value 1"
"key1" : "value 1"
"key2" : "value 2"
"key3" : "value 3"

I'm looking at this package & example, and I feel I'm hitting the same problem. Which can only mean that there could be concurrent writes happening and chatGPT is right.

@lrhn thanks for taking the time to reply

@lrhn
Member

lrhn commented Jun 7, 2024

What determines the order they complete in?

That depends on the operation. In this case the operation is an I/O operation, File.writeAsString, so the order of completion is the order in which the underlying OS operations complete and report back that they are completed.

Which basically means "any order is possible", likely with a tendency towards operations that started earlier also ending earlier, but outliers can and will happen.

If the async operations run entirely in Dart code inside the same isolate, no I/O operations and no timers (which are effectively I/O operations because they depend on the underlying OS), then interleaving may be deterministic ... or it may not, no promises are made. It'll be entirely up to the details of the implementation. Current native behavior doesn't add randomness deliberately.

For your code, you may want to have a "task scheduler" that performs operations when ready, and maybe even discards intermediate writes.

import "dart:async" show Completer;
import "dart:collection" show Queue;
import "dart:convert" show Encoding, utf8;
import "dart:io" show File;

class AsyncWriter {
  final File file;
  final Queue<(Completer<void> result, Future<void> Function() write)>
      _pendingWrites = Queue();

  AsyncWriter(this.file);

  void writeAsString(String string, {Encoding encoding = utf8}) {
    var wasEmpty = _pendingWrites.isEmpty;
    _pendingWrites.add((
      Completer<void>.sync(),
      () => file.writeAsString(string, encoding: encoding)
    ));
    if (wasEmpty) _writeNext();
  }

  void _writeNext() async {
    do {
      // Skip to the last write request, ignore intermediate write requests.
      var (completer, write) = _pendingWrites.last;
      try {
        try {
          await write();
        } finally {
          _completeUntil(completer);
        }
        completer.complete(null);
      } catch (e, s) {
        completer.completeError(e, s);
      }
      // Continue if any new writes were added while writing this one.
    } while (_pendingWrites.isNotEmpty);
  }

  // Removes pending writes up to and including the actual write just performed.
  // Completes all prior operations.
  void _completeUntil(Completer<void> actualCompleter) {
    while (true) {
      var (nextCompleter, _) = _pendingWrites.removeFirst();
      if (identical(actualCompleter, nextCompleter)) return;
    }
  }
}

Warning: Untested code.

@alexvoina

alexvoina commented Jun 7, 2024

Thanks for the code example & explanation. In my case there was a much easier solution to fix the problem (there's a writeAsStringSync() which I can call just once).
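
A minimal sketch of that approach, assuming all updates can be applied in memory before saving; saveDataSync and the key names are hypothetical stand-ins for the real code:

```dart
import 'dart:convert';
import 'dart:io';

/// Hypothetical helper: encodes the whole map and writes it in one blocking
/// call that finishes before the function returns.
void saveDataSync(File file, Map<String, Object?> data) {
  file.writeAsStringSync(json.encode(data), flush: true);
}

void main() {
  final dir = Directory.systemTemp.createTempSync('sync_save');
  final file = File('${dir.path}/data.json');
  final data = <String, Object?>{};
  for (var i = 1; i <= 3; i++) {
    data['key$i'] = 'value $i'; // apply all updates first
  }
  saveDataSync(file, data); // then save once
  print(file.readAsStringSync());
  // {"key1":"value 1","key2":"value 2","key3":"value 3"}
  dir.deleteSync(recursive: true);
}
```

The synchronous call blocks the isolate while it runs, which is the cost mentioned earlier in the thread for writeAsStringSync.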

I was more interested to understand if it is possible to end up with data in a file that is not representative of any single saveData() call.

e.g.

"key1" : "value 1"    }  1st call to saveData()    }
                                                   }
"key1" : "value 1"    }                            }  final result of calling saveData() 3 times in a row
"key2" : "value 2"    }  last call to saveData()   }
"key3" : "value 3"    }                            }

If there's a yes or no answer to this question, I would be interested to find out.

@Jordan-Nelson

Maybe we should document the behavior like NodeJS does i.e.

It is unsafe to use filehandle.write() multiple times on the same file without waiting for the promise to be resolved (or rejected). For this scenario, use filehandle.createWriteStream().

I agree that this should be documented, especially as it seems to be platform specific (I can easily reproduce this issue on Windows, but cannot reproduce on MacOS)

@renatoathaydes
Author

@Jordan-Nelson strange, I reported this issue originally on Linux, but just tried it on Mac M1 and the result is the same (and I am pretty sure it's the same on x86 arch).

Tested on DartVM:

Dart SDK version: 3.4.4 (stable) (Wed Jun 12 15:54:31 2024 +0000) on "macos_arm64"

What behaviour did you see on Mac?

@lrhn lrhn added area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. and removed area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. labels Feb 13, 2025