Skip to content

Add attributes to os.stat results #35189

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
nickm mannequin opened this issue Sep 17, 2001 · 35 comments
Closed

Add attributes to os.stat results #35189

nickm mannequin opened this issue Sep 17, 2001 · 35 comments
Assignees
Labels
stdlib Python modules in the Lib dir

Comments

@nickm
Copy link
Mannequin

nickm mannequin commented Sep 17, 2001

BPO 462296
Nosy @mwhudson, @gvanrossum, @loewis, @freddrake
Files
  • stat_return.diff: Patch for new os.stat behavior.
  • patch2.diff: EXAMPLE ONLY patch for extended behavior on posix for st_{blksize
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/freddrake'
    closed_at = <Date 2002-03-05.16:46:33.000>
    created_at = <Date 2001-09-17.17:57:02.000>
    labels = ['library']
    title = 'Add attributes to os.stat results'
    updated_at = <Date 2002-03-05.16:46:33.000>
    user = 'https://bugs.python.org/nickm'

    bugs.python.org fields:

    activity = <Date 2002-03-05.16:46:33.000>
    actor = 'loewis'
    assignee = 'fdrake'
    closed = True
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2001-09-17.17:57:02.000>
    creator = 'nickm'
    dependencies = []
    files = ['3622', '3623', '3624', '3625', '3626', '3627', '3628', '3629', '3630']
    hgrepos = []
    issue_num = 462296
    keywords = ['patch']
    message_count = 35.0
    messages = ['37622', '37623', '37624', '37625', '37626', '37627', '37628', '37629', '37630', '37631', '37632', '37633', '37634', '37635', '37636', '37637', '37638', '37639', '37640', '37641', '37642', '37643', '37644', '37645', '37646', '37647', '37648', '37649', '37650', '37651', '37652', '37653', '37654', '37655', '37656']
    nosy_count = 5.0
    nosy_names = ['mwh', 'gvanrossum', 'loewis', 'fdrake', 'nickm']
    pr_nums = []
    priority = 'normal'
    resolution = 'accepted'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue462296'
    versions = []

    @nickm
    Copy link
    Mannequin Author

    nickm mannequin commented Sep 17, 2001

    See bug bpo-111481, and PEP-0042. Both suggest that the
    return values for os.{stat,lstat,statvfs,fstatvfs}
    ought to be struct-like objects rather than simple tuples.

    With this patch, the os module will modify the
    aformentioned functions so that their results still
    obey the previous tuple protocol, but now have
    read-only attributes as well. In other words,
    "os.stat('filename')[0]" is now synonymous with
    "os.stat('filename').st_mode.

    The patch also modifies test_os.py to test the new
    behavior.

    In order to prevent old code from breaking, these new
    return types extend tuple. They also use the new
    attribute descriptor interface. (Thanks for
    PEP-025[23], Guido!)

    Backward compatibility: Code will only break if it
    assumes that type(os.stat(...)) == TupleType, or if it
    assumes that os.stat(...) has no attributes beyond
    those defined in tuple.

    @nickm nickm mannequin closed this as completed Sep 17, 2001
    @nickm nickm mannequin assigned freddrake Sep 17, 2001
    @nickm nickm mannequin added the stdlib Python modules in the Lib dir label Sep 17, 2001
    @nickm nickm mannequin closed this as completed Sep 17, 2001
    @nickm nickm mannequin assigned freddrake Sep 17, 2001
    @nickm nickm mannequin added the stdlib Python modules in the Lib dir label Sep 17, 2001
    @nickm
    Copy link
    Mannequin Author

    nickm mannequin commented Sep 17, 2001

    Logged In: YES
    user_id=499

    BTW, if this gets in, I have another patch that adds support
    for st_blksize, st_blocks, and st_rdev on platforms that
    support them. It don't expose these new fields in the
    tuple, as that would break all the old code that tries to
    unpack all the fields of the tuple. Instead, these fields
    are only accessible as attributes.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Sep 17, 2001

    Logged In: YES
    user_id=21627

    I second the request for supporting additional fields
    where available. At the same time, it appears
    unimplementable using pure Python.

    Consequently, I'd like to see this patch redone in C. The
    implementation strategy could probably remain the same,
    i.e. inherit from tuple for best compatibility; add the
    remaining fields as slots. It may be reasonable to
    implement attribute access using a custom getattr
    function, though.

    I have also my doubts about the naming of the fields. The
    st_ prefix originates from the time where struct fields
    were living in the global namespace (i.e. across different
    structures), so prefixing them for uniqueness was
    essential. I'm not sure whether we should inherit this
    into Python...

    @nickm
    Copy link
    Mannequin Author

    nickm mannequin commented Sep 17, 2001

    Logged In: YES
    user_id=499

    Martin: I'm not entirely sure what you mean here; while my
    patch for extra fields requires a minor chunk of C (to
    access the struct fields), the rest still works in pure
    python. I'm attaching this second version for reference.

    I'm not sure it makes much sense to do this with pure C; it
    would certainly take a lot more code, with little benefit I
    can descern. But you're more experienced than I; what am I
    missing?

    I agree that the field naming is suboptimal; I was taking my
    lead from the stat and statvfs modules. If people prefer,
    we can name the fields whatever we like.

    @nickm
    Copy link
    Mannequin Author

    nickm mannequin commented Sep 17, 2001

    Logged In: YES
    user_id=499

    On further consideration, the approach taken in the second
    (example only) patch is indeed too fragile. The C code
    should not lengthen the tuple arbitrarily and depend on the
    Python code to decode it; instead, it should return a
    dictionary of extra fields. I think that this approach uses
    a minimum of C, is easily maintainable, and very extensible.

    @nickm
    Copy link
    Mannequin Author

    nickm mannequin commented Sep 18, 2001

    Logged In: YES
    user_id=499

    Here's the revised (example only) patch that takes the
    more portable approach I mention below.

    @gvanrossum
    Copy link
    Member

    Logged In: YES
    user_id=6380

    Haven't had time to review the patch yet, but the idea of
    providing a structure with fields that doubles as a tuple is
    a good one. It's been tried before and can be done in pure
    Python as well.

    Regarding the field names: I think the field names should
    keep their st_ prefix -- IMO this makes the code more
    recognizable and hence readable.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Sep 18, 2001

    Logged In: YES
    user_id=21627

    The problem with your second and third patch is that it
    includes an incompatibility for users of posix.stat (and
    friends), since it changes the siye of the tuple. If you
    want to continue to return a tuple (as the top-level data
    structure), you'll break compatibility for applications
    using the C module directly. An example of code that would
    be broken is

    mode, ino, dev, nlink, uid, gid, size, a, c, m =
    posix.stat(filename)

    To pass the additional fields, you already need your class
    _StatResult available in C.
    You may find a way to define it in Python and use it in C,
    but that has proven to be very fragile in the past.

    @nickm
    Copy link
    Mannequin Author

    nickm mannequin commented Sep 18, 2001

    Logged In: YES
    user_id=499

    Ah! Now I see. I hadn't realized that anybody used the
    posix module directly. (People really do this?)

    I'll try to write up a patch in C tonight or tomorrow
    morning. A couple of questions on which I could use advice:
    (1) Where is the proper place to put this kind of
    tuple-with-fields hybrid? Modules? Objects? In a new file
    or an existing one?
    (2) Should I try to make it general enough for non-stat use?

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Sep 19, 2001

    Logged In: YES
    user_id=21627

    Using posix.stat is common, see

    http://groups.yahoo.com/group/python-list/message/4349
    http://www.washington.edu/computing/training/125/mkdoc.html
    http://groups.google.com/groups?th=7d7d118fed161e0&seekm=5qdjch%24dci%40nntp6.u.washington.edu

    for examples. None of these would break with your change,
    though, since they don't rely on the lenght of the tuple.

    If you are going to implement the type in C, I'd put it in
    the posix module. If you are going to implement it in Python
    (and only use it from the Posix module), making it
    general-purpose may be desirable. However, a number of
    things would need to be considered, so a PEP might be
    appropriate. If that is done, I'd propose an interface like

    tuple_with_attrs((value-tuple), (tuple-of-field-names),
    exposed-length-of-tuple))

    @nickm
    Copy link
    Mannequin Author

    nickm mannequin commented Sep 19, 2001

    Logged In: YES
    user_id=499

    I'm becoming more and more convinced that doing it in C is
    the right thing, but I have issue with doing it in the posix
    module. The stat function is provided on (nearly?) all
    platforms, and doing it in C will require minor changes to
    all of these modules. We can probably live with this, but I
    don't think we should duplicate code between all of the os
    modules.

    Is there some other appropriate place to put it in C?

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Sep 19, 2001

    Logged In: YES
    user_id=21627

    There aren't actually so many copies of the module, since
    posixmodule implements "posix","nt", and "os2". I found
    alternative implementations in riscosmodule and macmodule.

    Still, putting the support type into a shared C file is
    appropriate. I can think of two candidate places:
    tupleobject.c and fileobject.c. It may be actually
    worthwhile attempting to share the stat() implementations
    as well, but that could be an add-on.

    @gvanrossum
    Copy link
    Member

    Logged In: YES
    user_id=6380

    Or you could put it in modsupport.c, which is already a
    grab-bag of handy stuff.

    @nickm
    Copy link
    Mannequin Author

    nickm mannequin commented Sep 21, 2001

    Logged In: YES
    user_id=499

    Well, here's a posixmodule-only, all-C version. If this
    seems like a good approach, I'll add some better docstrings,
    move it into whichever module you like, and make
    riscosmodule.c and macmodule.c use it too.

    @nickm
    Copy link
    Mannequin Author

    nickm mannequin commented Sep 21, 2001

    Logged In: YES
    user_id=499

    And here's an even better all-C version. (This one doesn't
    use a dictionary to store optional attributes.)

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Sep 24, 2001

    Logged In: YES
    user_id=21627

    The patch looks good to me. Are you willing to revise it one
    more time to cover all the stat implementations?
    A few comments on the implementation:

    • Why do you try to have your type participate in GC? they
      will never be part of a cycle. If that ever becomes an
      issue, you probably need to implement a traversal function
      as well.
    • I'd avoid declaring PosixStatResult, since the field
      declarations are misleading. Instead, you should just add
      the right number of additional in the type declaration.

    @nickm
    Copy link
    Mannequin Author

    nickm mannequin commented Sep 28, 2001

    Logged In: YES
    user_id=499

    I've fixed it with the suggestions you made, and also

    1. Added docstrings
    2. Fixed a nasty segfault bug that would be triggered by
      os.stat("/foo").__class__((10,)).st_size
      and added tests to keep it from reappearing.

    I'm not sure I know how to cover Mac and RISCOS properly:
    riscos.stat returns a 13-element tuple, and is hence already
    incompatible with posix.stat; whereas mac.{stat|xstat}
    return differing types.

    If somebody with experience with these modules could let
    give me guidance as to the Right Thing, I'll be happy to
    give it a shot... but my shot isn't likely to be half as
    good as somebody who knew the modules better. (For example,
    I don't have the facilities to compile macmodule or
    riscmodule at all, much less test them.)

    I'd also be glad to make any changes that would help
    maintainers of those modules.

    @gvanrossum
    Copy link
    Member

    Logged In: YES
    user_id=6380

    One comment on the patch: beautiful use of the new type
    stuff, but there's something funky with the constructors
    going on. It seems that the built-in __new__ (inherited from
    the tuple class) requires exactly one argument -- a sequence
    to be tuplified -- but your __init__ requires 13 arguments.
    So construction by using posix.stat_result(...) always
    fails. It makes more sense to fix the init routine to
    require a 13-tuple as argument. I would also recommend
    overriding the tp_new slot to require a 13-tuple: right now,
    I can cause an easy core dump as follows:

    >>> import os
    >>> a = os.stat_result.__new__(os.stat_result, ())
    >>> a.st_ctime
    Segmentation fault (core dumped)
    $

    @gvanrossum
    Copy link
    Member

    Logged In: YES
    user_id=6380

    Another comment: we should move this to its own file so that
    other os.stat() implementations (esp. MacOS, maybe RiscOS)
    that aren't in posixmodule.c can also use it, rather than
    having to maintain three separate versions of the code.

    @nickm
    Copy link
    Mannequin Author

    nickm mannequin commented Oct 1, 2001

    Logged In: YES
    user_id=499

    I think this might be the one... or at least, the
    next-to-last-one. This version of the patch:

    (1) moves the shared C code into a new module, "_stat", for
    internal use.
    (2) updates macmodule and riscosmodule to use the new code.
    (3) fixes a significant reference leak in previous versions.
    (4) is immune to the __new__ and __init__ bugs in previous
    versions.

    Things to note:
    (A) I've tried to make sure that my Mac/RISCOS code was
    correct, but I don't have any way to compile or test it.
    (B) I'm not sure my use of PyImport_ImportModule is legit.
    (C) I've allowed users to construct instances of stat_result
    with < or > 13 arguments. When this happens, attempts to
    get nonexistant attributes now raise AttributeError.
    (D) When dealing with Mac.xstat and RISCOS.stat, I chose to
    keep backward compatibility rather than enforcing the
    10-tuple rule in the docs.

    Because there are new files, I can't make 'cvs diff' get
    everything. I'm uploading a zip file that contains
    _statmodule.c, _statmodule.h, and a unified diff. Please
    let me know if you'd prefer a different format.

    @gvanrossum
    Copy link
    Member

    Logged In: YES
    user_id=6380

    Nick, what's your real email? I have a bunch of feedback
    related to your use of the new type stuff -- this is
    uncharted territory for me too, and this SF box is too small
    to type comfortably.

    @nickm
    Copy link
    Mannequin Author

    nickm mannequin commented Oct 1, 2001

    Logged In: YES
    user_id=499

    I've sent my email address to 'guido at python.org'. For
    reference, it's 'nickm at alum.mit.edu'.

    @nickm
    Copy link
    Mannequin Author

    nickm mannequin commented Oct 2, 2001

    Logged In: YES
    user_id=499

    The fifth all-C (!) version, with changes as suggested by
    Guido's comments via email.

    Big changes: This version no longer subclasses tuple.
    Instead, it creates a general-purpose mechanism for making
    struct/sequence hybrids in C.

    It now includes a patch for timemodule.c as well.
    Shortcomings:
    (1) As before, macmodule and riscosmodule aren't tested.
    (2) These new classes don't participate in GC and aren't
    subclassable. (Famous last words: "I don't think this will
    matter." :) )
    (3) This isn't a brand-new metaclass; it's just a quick bit
    of C. As such, you can't use this mechanism to create new
    struct/tuple hybrids from Python. (I claim this isn't a
    drawback, since it's way easier to reimplement this in
    python than it is to make it accessible from python.)

    So, how's *this* one?

    @mwhudson
    Copy link

    mwhudson commented Oct 3, 2001

    Logged In: YES
    user_id=6656

    If this goes in, I'd like to see it used for termios.tc
    {get,set}attr too.

    I could probably implement this (but not *right* now...).

    @gvanrossum
    Copy link
    Member

    Logged In: YES
    user_id=6380

    Patience, please. I'm behind reviewing this, probably won't
    have time today either.

    @gvanrossum
    Copy link
    Member

    Logged In: YES
    user_id=6380

    I'm looking at this now.

    @gvanrossum
    Copy link
    Member

    Logged In: YES
    user_id=6380

    Thanks, Nick! Good job.

    Checked in, just in time for 2.2b1. I'm passing this
    tracker entry on to Fred for documentation. (Fred, feel
    free to pester Nick for docs. Nick, feel free to upload
    approximate patches to Doc/libos.tex and Doc/libtime.tex.
    :-)

    @nickm
    Copy link
    Mannequin Author

    nickm mannequin commented Oct 18, 2001

    Logged In: YES
    user_id=499

    Here's a documentation patch for libos.tex. I don't know
    the TeX macros well enough to write an analogous one for
    libtime.tex; fortunately, it should be fairly easy to
    extrapolate from the included patch.

    @freddrake
    Copy link
    Member

    Logged In: YES
    user_id=3066

    This has been checked in, edited, and checked in again.

    @mwhudson
    Copy link

    mwhudson commented Mar 5, 2002

    Logged In: YES
    user_id=6656

    I know this patch is closed, but it seems a vaguely sane
    place to ask the question: why do we vary the number of
    field of os.stat_result across platforms? Wouldn't it be
    better to let it always have the same values & fill in one's
    that don't exists locally with -1 or something?

    It's hard to pickle os.stat_results portably the way things
    are at the moment...

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Mar 5, 2002

    Logged In: YES
    user_id=21627

    Adding all fields is both difficult and undesirable. It is
    difficult because you may not know in advance what fields
    will be added in future versions, and it is undesirable
    because applications may think that there is a value even
    though the is none.

    What problem does that cause for pickling, and why would a
    complete list of all attributes solve this problem?

    @mwhudson
    Copy link

    mwhudson commented Mar 5, 2002

    Logged In: YES
    user_id=6656

    I'm not worried about cross version problems.

    The problem with pickling is that stat_results (as of today)
    get pickled as "os.stat_result" and a tuple of arguments.
    The number of arguments os.stat_result takes varies by
    platform (it seems to be 10 on this NT box, but it's 13 on
    the starship, f'ex). So if a stat_result gets pickled on
    the starship and shoved down a socket to an NT machine, it
    can't be unpickled. I don't know if this sort of thing ever
    happens, but I could see it being surprising & annoying if I
    ran into it.

    If os.stat_result took 13 arguments everywhere, this problem
    obviously wouldn't arise.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Mar 5, 2002

    Logged In: YES
    user_id=21627

    To support pickling, I think structseq objects should
    implement a __reduce__ method, returning the type and a
    dictionary. The type's tp_new should accept dictionaries,
    and reconstruct the instance from the dictionary.

    Alternatively, copy_reg could grow support for stat_result,
    which seems desirable anyway, since os.stat returns a
    'nt.stat_result' instance on Windows.

    Furthermore, fixing the number of arguments does not help at
    all in pickling; __reduce__ will return an argument tuple
    which includes the original object; in turn, pickle will
    recurse until the stack overflows.

    @mwhudson
    Copy link

    mwhudson commented Mar 5, 2002

    Logged In: YES
    user_id=6656

    Martin, I may not have been 100% clear in my last note, but
    please run

    cvs up Objects/structseq.c

    structseq objects *do* now implement a __reduce__ method,
    but it returns a tuple. Using a dictionary would be more
    complicated, and not solve the issue completely: what
    happens when you go from a platform with less fields to one
    with more? What value does the not-prepared-for field have?

    Hmm, the point about nt.stat_result is a good one.

    Getting support into copy_reg.py leads to interesting
    bootstrapping problems when using uninstalled builds,
    unfortunately (site.py imports distutils imports re imports
    copy_reg; try to import, say, time, and you can't, because
    the whole reason to import distutils was to set up the path
    to find dynamically linked libraries...).

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Mar 5, 2002

    Logged In: YES
    user_id=21627

    I'd not put the copyreg support into copy_reg, but into
    os.py. Pickling would save a reference to
    os._load_stat_result (or some such). When pickle tries to
    restore the value, it would first restore
    os.load_stat_result. For that, it would import os, which
    would register the copy_reg support.

    As for constructing structseq objects from dictionaries: it
    would be a ValueError if fields within [:n_sequence_fields]
    are not filled out; leaving out other fields is fine.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 9, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants