Skip to content

Add timespec structured mtime metadata #236

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jan 7, 2020
Merged

Conversation

ribasushi
Copy link
Contributor

@ribasushi ribasushi commented Dec 23, 2019

Augment the mtime specification with a well-defined universal datastructure capable of expressing both arbitrarily distant points in time, and sufficiently arbitrary ( in computing terms ) sub-second precision. The very slight increase in on-wire size and the mild extra parsing complexity allows for:

  • A structure representing time, that is unlikely to change any time in the future, aiding implementation convergence
  • A way to encode any filesystem-related timestamp within the wider problem domain of information systems

UNIXFS.md Outdated
@@ -90,7 +103,9 @@ UnixFS currently supports two optional metadata fields:
- The remaining 20 bits are reserved for future use, and are subject to change. Spec implementations **MUST** handle bits they do not expect as follows:
- For future-proofing the (de)serialization layer must preserve the entire uint32 value during clone/copy operations, modifying only bit values that have a well defined meaning: `clonedValue = ( modifiedBits & 07777 ) | ( originalValue & 0xFFFFF000 )`
- Implementations of this spec must proactively mask off bits without a defined meaning in the implemented version of the spec: `interpretedValue = originalValue & 07777`
* `mtime` -- The modification time in seconds since the epoch. This defaults to the unix epoch if unspecified
* `mtime` -- A two-element structure ( `EpochSeconds`, `EpochNanoseconds` ) representing the modification time in seconds relative to the unix epoch `1970-01-01T00:00:00Z`. In contexts where an mtime is mandatory ( e.g. FUSE interfaces ) implementations must treat an unspecified mtime as `0`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just default this to 0 as it is currently. The go-ipfs and js-ipfs implementations cannot control which context they will be used from so should default to the most conservative approach so that might as well be in the spec.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My hesitation placing this in the spec is that it implicitly means unknown mtime is not a thing. E.g. a web gateway would be obligated to send out a 0 Last-Modified header, and so on...

What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's fine. At least, I can't think of a reason that makes it not fine, given that we need a default value and we're happy with seeing 1970 everywhere when we ipfs files ls -l.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make this more concrete, I think a web gateway sending a 0 Last-Modified header is ok.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:(

I will relent if this is a wider consensus, but then it makes me sad as this part of the spec becomes unusable to me as well. In my use case having a gateway render "this is the mtime" vs "mtime was not supplied" is really crucial.

@alanshaw
Copy link
Member

alanshaw commented Dec 26, 2019

Please can we have some context in the PR description to explain to people why this is required?

@achingbrain
Copy link
Member

Not sure I understand the maths behind why they’ll often be this big?

2^28 is roughly 1/4 of the total range of nanosecond values so assuming a uniform distribution they'll be bigger than that 3/4 of the time.

UNIXFS.md Outdated
@@ -90,7 +103,9 @@ UnixFS currently supports two optional metadata fields:
- The remaining 20 bits are reserved for future use, and are subject to change. Spec implementations **MUST** handle bits they do not expect as follows:
- For future-proofing the (de)serialization layer must preserve the entire uint32 value during clone/copy operations, modifying only bit values that have a well defined meaning: `clonedValue = ( modifiedBits & 07777 ) | ( originalValue & 0xFFFFF000 )`
- Implementations of this spec must proactively mask off bits without a defined meaning in the implemented version of the spec: `interpretedValue = originalValue & 07777`
* `mtime` -- The modification time in seconds since the epoch. This defaults to the unix epoch if unspecified
* `mtime` -- A two-element structure ( `EpochSeconds`, `EpochNanoseconds` ) representing the modification time in seconds relative to the unix epoch `1970-01-01T00:00:00Z`. In contexts where an mtime is mandatory ( e.g. FUSE interfaces ) implementations must treat an unspecified mtime as `0`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's fine. At least, I can't think of a reason that makes it not fine, given that we need a default value and we're happy with seeing 1970 everywhere when we ipfs files ls -l.

Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it's really worth avoiding an extra tag, but this seems like a clean solution anyways.

@ribasushi ribasushi requested a review from alanshaw January 6, 2020 14:41
- Settle on UnixTime, remove mention of epoch entirely
- Better explain fixed32 rationale for the fractional part
@ribasushi
Copy link
Contributor Author

Ok so... this tortured PR is now blocked on last two issues:

  • Range of the fractional nanosecond value. I read this as protocol-level spec, I am advocating for "0 nanoseconds" to be an invalid value. @achingbrain comes from implementation point of view, and correctly points out that a user should not care whether a value is 0 or not ( feat: store time as timespec js-ipfs-unixfs#40 (comment) )

    • As resolution I propose adding verbiage to the spec that spells out what is valid on the wire vs what is valid on the API side. I have not added to this PR, as there seems to be aversion to a wordy doc
  • The question over whether "no mtime" automatically implies 0 for all intents and purposes, or whether implementations ( e.g. a web gateway ) may elect to render actual 0 and "no data" in distinct ways. @achingbrain advocates for 0/1970-01-01 to be mandated by the spec, including web-gateways being obligated to always supply a Last-Modified: header. I find this unworkable for the archival use-cases I have in mind, and am strongly for leaving this at the implementation's discretion ( Add timespec structured mtime metadata #236 (comment) )

    • Resolution is difficult, as the outcome is zero-sum. I guess requesting for @alanshaw and @Stebalien to break the tie ;)

@Stebalien
Copy link
Member

As resolution I propose adding verbiage to the spec that spells out what is valid on the wire vs what is valid on the API side. I have not added to this PR, as there seems to be aversion to a wordy doc

I agree that we should not encode 0 nanoseconds on the wire.

The question over whether "no mtime" automatically implies 0 for all intents and purposes, or whether implementations ( e.g. a web gateway ) may elect to render actual 0 and "no data" in distinct ways.

I'd let implementations decide what to do here. However, we should specify what they can do. They can either:

  • Preferred: indicate that the time is unset.
  • Otherwise: return the epoch. Unfortunately, some APIs make it difficult to indicate that no mtime is set.

That is, they can't just return any random time. They should also document what they do.

@Stebalien
Copy link
Member

Merging per @ribasushi's suggestion as we can address the final two points in a followup PR.

@Stebalien Stebalien merged commit b9ca0be into master Jan 7, 2020
@Stebalien Stebalien deleted the optional_hires_mtime2 branch January 7, 2020 21:12
@achingbrain
Copy link
Member

On mtime:

@achingbrain advocates for 0/1970-01-01 to be mandated by the spec

The reason for this is primarily this comment and that prior metadata PRs have specified a default value.

I am more than happy to make this field optional, and solid use cases for this definitely strengthen that argument. It also means the mtime field in the protobuf definition doesn't need extra exposition around it which is good too.

@ribasushi
Copy link
Contributor Author

@achingbrain @Stebalien see #237 for the next attempt at expressing this.

Stebalien added a commit that referenced this pull request Jan 9, 2020
Attempt to address mtime remnants of #236
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants