ENH: Make metadata from read_spss available #34682
Comments
So in theory this shouldn't cause too much trouble? I'm going to give it a try and get back with the results.
Yeah, it won't cause any trouble. Some operations will fail to propagate the metadata, and we still need to determine how metadata propagates when multiple DataFrames (possibly with different metadata) are involved, but it won't cause any issues. Thanks for working on this.
Hi. I'm so sorry for the late reply. I haven't been able to work on this because this last year has been very hard. I hope to work on it sometime in the near future, so I will keep in touch.
Closed by #55472?
Is your feature request related to a problem?
I would like to have the metadata that pyreadstat provides available when reading files from SPSS.
This would be really helpful because it would provide an easy way to have variable labels (descriptions), value labels, and other important metadata available to format results and reports (for example, by manually replacing the variable names with the `.replace` function).
These kinds of metadata are widely used in the social sciences because they make results much easier to understand. For example, SPSS replaces variable names with variable labels in the output of analyses. Users could do this manually if the metadata were available.
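To illustrate the kind of formatting this would enable, here is a minimal sketch. The `variable_labels` and `value_labels` dictionaries and the sample data are hypothetical stand-ins for what an SPSS reader might expose; they are not an existing pandas API.

```python
import pandas as pd

# Hypothetical metadata, shaped like what an SPSS file typically carries:
# variable name -> variable label, and variable name -> {code -> value label}.
variable_labels = {"q1": "Satisfaction with service", "sex": "Respondent sex"}
value_labels = {
    "q1": {1: "Low", 2: "High"},
    "sex": {1: "Female", 2: "Male"},
}

# Hypothetical coded data, as it would look after reading the file.
df = pd.DataFrame({"q1": [1, 2, 2], "sex": [1, 1, 2]})

# Map the numeric codes to their value labels, column by column.
labelled = df.copy()
for column, mapping in value_labels.items():
    labelled[column] = labelled[column].map(mapping)

# Swap the short variable names for their descriptive labels.
labelled = labelled.rename(columns=variable_labels)

print(labelled)
```

With the metadata at hand, the same two steps could be applied to any table of results before it goes into a report.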
Describe the solution you'd like
The metadata read by pyreadstat could be stored in the df's `_metadata` attribute, which would make it readily available.
API breaking implications
I don't think there would be any implications if it's stored in the `_metadata` attribute, because it was developed for this kind of use case. Am I right?
Describe alternatives you've considered
I could use pyreadstat directly instead of `pd.read_spss`. I can't think of any other options.
Additional context
This is related to issues #11179 and #39.
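For reference, the alternative mentioned above (calling pyreadstat directly) might look roughly like the sketch below. The helper name `read_spss_with_metadata` and the choice to keep the labels in the experimental `DataFrame.attrs` dictionary are assumptions made for illustration; this is not necessarily the mechanism proposed in this issue, and the file path in the usage note is a placeholder.

```python
import pandas as pd


def read_spss_with_metadata(path: str) -> pd.DataFrame:
    """Hypothetical helper: read an SPSS file and keep its labels with the frame."""
    # Imported lazily so this module still loads when pyreadstat is not installed.
    import pyreadstat

    # pyreadstat.read_sav returns the data and a metadata container.
    df, meta = pyreadstat.read_sav(path)

    # Assumption: stash the labels in DataFrame.attrs (experimental in pandas).
    df.attrs["variable_labels"] = dict(meta.column_names_to_labels)
    df.attrs["value_labels"] = dict(meta.variable_value_labels)
    return df
```

Usage would be along the lines of `df = read_spss_with_metadata("survey.sav")`, after which `df.attrs["variable_labels"]` holds the mapping from variable names to labels. As noted in the comments, not every operation propagates such metadata to its result.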