-
Notifications
You must be signed in to change notification settings - Fork 1.7k
rust: add event file reading module #4315
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Summary: We implement a module for reading a TensorFlow event file. This wraps the TFRecord reading module added in #4307 and parses records as protos. The reader also stores the last-read event wall time, so that we can stop reading from event files after a while (cf. `--reload_multifile`). Along the way, we add a couple utility methods to `TfRecord` that make it easier to use records in test code. Test Plan: Unit tests included. wchargin-branch: rust-event-file-reading wchargin-source: 18183bce70f95cdef820853b67f5a0930076ba15
Hi! Here’s a code review to get us started. Diffstat says +300, but that
…so I hope it’s not too intimidating. As always, happy to discuss or |
wchargin-branch: rust-event-file-reading wchargin-source: 34ffb0d9cc99ba00c0b954a70c281b862d6f6411
/// | ||
/// As with [`TfRecordReader`], an event may be read over one or more underlying reads, to support | ||
/// growing, partially flushed files. | ||
pub struct EventFileReader<R> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per our offline chat, we may want a #[derive(Debug)]
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done; thanks.
@@ -128,6 +154,11 @@ impl<R: Read> TfRecordReader<R> { | |||
} | |||
} | |||
|
|||
/// Consumes this `TfRecordReader<R>`, returning the underlying reader `R`. | |||
pub fn into_inner(self) -> R { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No action needed, this comment is mainly to document what I learned from our offline discussion.
While we don't expect this fn to be used outside of this crate's tests, it's reasonable to keep as is because
- public
into_inner
is a fairly common pattern in Rust's std library - it must be called with
self
, which means the caller already has ownership of thisTfRecordReader
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. You can imagine something like:
// parse a file that has one TFRecord as a header and then just a bunch of data
let mut f = File::open("...")?;
let mut reader = EventFileReader::new(f);
let header = reader.read_event()?;
let mut f = reader.into_inner(); // give me the normal file back
let mut buf = Vec::new();
f.read_to_end(&mut buf)?;
Ok(ev) => ev, | ||
Err(err) => { | ||
// On proto decoding failure, check the record checksum first. | ||
record.checksum()?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Skipping the checksum on 'successful Event decoding' cases is a neat optimization. Perhaps for now, we should perform the checksum check in all cases though? I feel more comfortable if we stick to the behavior as traditional TB data loading in this respect.
Definitely open to adding some sort of flag later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure; that’s reasonable. Done, with test.
let written_len = cursor.position(); | ||
cursor.set_position(0); | ||
let mut reader = TfRecordReader::new(cursor); | ||
let output_record = reader.read_record().expect("read_record"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: ...expect("read_record");
--> expect("failed to read record");
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m not certain what the prevailing style is here, but it renders like
this:
thread '<test_name>' panicked at 'read_record: BadLengthCrc(Truncated)', tensorboard/data/server/tf_record.rs:123:45
so I think it’s already clear enough. The fact that it’s panicking and
the test fails means that it’s failed.
If this were in non-test code and I were using expect
for a condition
that should never happen (i.e., an assertion), I’d be more careful to
provide clear information and context. But for a test that’s completely
self-contained, I think that this is fine.
wchargin-branch: rust-event-file-reading wchargin-source: 28973ef2acb8b9dfb6a968749cfd771fe14ca448
wchargin-branch: rust-event-file-reading wchargin-source: 28973ef2acb8b9dfb6a968749cfd771fe14ca448
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, thanks for walking me through this!
Summary:
We implement a module for reading a TensorFlow event file. This wraps
the TFRecord reading module added in #4307 and parses records as protos.
The reader also stores the last-read event wall time, so that we can
stop reading from event files after a while (cf.
--reload_multifile
).Along the way, we add a couple utility methods to
TfRecord
that makeit easier to use records in test code.
Test Plan:
Unit tests included.
wchargin-branch: rust-event-file-reading