Fix capitalization among headings in documentation files #32550
Comments
@tonywu1999 do you mind editing the description and providing more context? Imagine a random user wanting to contribute to pandas lands here. We would like to explain what's the problem, why it's useful to fix it, and step by step information on what to do (e.g. We want to add fixes files to Also, if you want to get the list of files to check, and add it in the description (you can use Thanks! |
take |
@tonywu1999 working on the issue I am getting some outputs that I am not sure are valid. If I run the script on Can you confirm that this is an expected output? |
We just developed this validation script, so it's expected that we find some false positives. Can you find where this error is being generated, so we can see what's the problem? |
It looks like those lines in the .rst files are used as bullet points rather than headings. However, those bullet points appear to be empty (i.e. they may have been inserted into the .rst file by accident). You can refer to the following website to see what I mean by empty bullet points: https://pandas.pydata.org/pandas-docs/stable/whatsnew/v0.25.0.html control-f for
to find the empty bullet points and to give context on what's going on. Hope this helps. |
There is a condition in the script where we check that a line contains only dashes (or other specific characters), and that the analysed line and the previous line have the same length. I guess we want to add another condition: the length should be greater than one, and the previous line shouldn't consist of these specific characters. Maybe that can be implemented in a separate PR, or together with fixing a single line where this happens. |
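The condition described in the comment above can be sketched roughly as follows. This is a minimal illustration, not the code in the pandas script: the function name, the `UNDERLINE_CHARS` set, and the reading of "the previous line shouldn't have one of these specific characters" as "the previous line must not consist only of those characters" are all assumptions.

```python
# Hypothetical sketch; names and the exact character set are assumptions,
# not taken from scripts/validate_rst_title_capitalization.py.
UNDERLINE_CHARS = frozenset("=-~^*+#")


def is_heading_underline(line: str, previous_line: str) -> bool:
    """Return True if `line` looks like an RST underline for `previous_line`."""
    line = line.rstrip()
    previous_line = previous_line.rstrip()
    # Proposed extra condition: a single "-" is an (empty) bullet, not an underline.
    if len(line) <= 1:
        return False
    # The line must be one underline character repeated.
    if len(set(line)) != 1 or line[0] not in UNDERLINE_CHARS:
        return False
    # Existing condition: underline and title have the same length.
    if len(line) != len(previous_line):
        return False
    # Proposed extra condition: the "title" must be real text,
    # not blank and not itself a run of underline characters.
    if not previous_line.strip() or set(previous_line) <= UNDERLINE_CHARS:
        return False
    return True
```

With these checks, two consecutive lines that each contain a lone `-` (the empty bullet points found in the whatsnew files) are no longer reported as a heading.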
@tonywu1999 @datapythonista I believe there are a few capitalization exceptions missing like Also in
|
No problem on changing whatever is needed in the script. |
Hello, I'd like to work on it |
Though I don't understand exactly what the issue or the goal is here. The script does well at finding all occurrences of titles that need to be decapitalized; is the goal to actually make the changes to the documentation? |
The goal of this issue is to actually make the changes to the documentation. |
Yes but there are exceptions that you don't want to lower and that are not in CAPITALIZATION_EXCEPTIONS. What do you do with it? Should you extend it? |
Yes, the script will validate most cases all right, but if there is anything that needs to be changed there, like adding new keywords, you can do it. |
Better not to open a huge PR; take a few documents (e.g. five), and just fix those. If you want to fix more (surely appreciated), then keep opening PRs; there's no problem in opening many. Thanks! |
I am not very used to git yet, how do I push to the remote repository? I have pulled the repository onto my local machine. I have modified some files in doc and committed, but it doesn't work when I push to the GitHub url. What is the url where I should push to? |
@cleconte987 you need to open a pull request. It's a bit tricky the first time, but there are resources out there to help you know how it works. If you don't find anything better, you can see these slides https://docs.google.com/presentation/d/1rOSYXZPyMe9KXnbVK_xbJzw_-ijxd6bIxndmvPU6L2o/edit?usp=sharing and this video (sorry the audio is awful): https://www.youtube.com/watch?v=LCTk0leNH1g |
https://dev.pandas.io/docs/development/contributing.html I started contributing 2 months ago, and I found that this link helped me a lot. |
Ok thank you |
@cleconte987 I am already on the issue. Will do a pull request with all the updated documentation soon |
Well, what should I do now? @tonywu1999 @datapythonista. I started to commit to the documentation. I guess you are the assignee. I'm here if I can help |
As said earlier, you should be working on small batches, so keep opening small pull requests with the fixes, and we'll be merging them. There are many titles to fix, so try to coordinate if possible, but more than one person can work on this, no problem. |
And I think it's not correct to lower words like DataFrame to Dataframe, shouldn't it be kept with capitalization? |
Hey @datapythonista, I was starting to work on this issue and came across a weird scenario, specifically with the stumpy package mentioned in ecosystem.rst. So, ecosystem.rst refers to the package in all caps, "STUMPY". The script catches this of course and says to correct it to "Stumpy". In situations like this, should I use the capitalization the script suggests, correct the capitalization to all lowercase to match how it's imported, add the package name to the list of exceptions in the script itself, or the last 2 combined? |
You can add it to the list of exceptions. Or, if you think it's reasonable and not too complicated, just skip that level of header (probably h3) of the ecosystem page, as everything in it should be a package name if I'm not wrong. |
@datapythonista Thanks, I had another question though. It looks like the script is asking me to change the capitalization in one of the urls. For reference this is the original url: https://github.com/TDAmeritrade/stumpy and it wants me to make the link all lowercase. The link works fine as it is, but weirdly enough putting the link in all lowercase also seems to work fine, and I have no idea why. Is there some kind of weird behavior that means I shouldn't change the capitalization in links or am I good to go? |
URLs are not case sensitive afaik. So, making the url all lowercase shouldn't be a problem when clicking on it. I guess the capitalization is more for branding, and it'd probably be nice to keep it and not validate links in the titles. If it doesn't introduce much extra complexity to the validation, and you want to give it a try, that would be great. |
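One possible way to avoid validating links in titles is to drop the link target before the capitalization check runs, so only the visible link text is checked. The sketch below is an assumption about how that could be done, not part of the existing script; `strip_link_targets` and the regular expression are hypothetical and only handle the inline RST form with a text and a URL in angle brackets.

```python
import re

# Hypothetical helper, not taken from the pandas script.
# Matches inline RST links of the form `text <url>`_ or `text <url>`__
_RST_LINK = re.compile(r"`([^`<]*?)\s*<[^`>]*>`__?")


def strip_link_targets(title: str) -> str:
    """Replace each inline RST link in `title` with its visible text only."""
    return _RST_LINK.sub(r"\1", title)


# The URL, including its original capitalization, never reaches the validator.
print(strip_link_targets("`STUMPY <https://github.com/TDAmeritrade/stumpy>`__"))
```

The remaining text ("STUMPY" in this case) would still need an entry in the exceptions list, or the heading level would need to be skipped, as suggested earlier in the thread.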
Can I work on this issue? |
take |
Take |
Hi, can an admin take a look at #55685? Not sure how to make the tests pass. I didn't make any changes to anything that's being tested in the checks. |
…snew doc files. Sorted exceptions list alphabetically, for better maintainability, proposed name change from CAPITALIZATION_EXCEPTIONS to CAPITALIZATION_EXCLUSIONS. (pandas-dev#32550)
Hey everyone - In the original comment in that issue, I saw that proposed way of running that script was: I also tried to reuse exclusions wherever it was possible, i.e. instead of adding "I/O" to the list I've edited the rst to use "IO", as the second one was already on the list. I also think that there's a need for suppressing some of the validations, and exclusions may not be enough. E.g. "pandas" is added to exclusions with an underscore, however it can also be used at the beginning of the title, and then this particular entry in the exclusions doesn't work as expected. I'll be happy to pick up other files as well and trigger some discussions, but before I do so, I just wanted to confirm with you if that's the expected way of working. Potential future stories: |
Corrected title capitalization in various .rst files to match the standard of capitalizing only the first word, unless a term like DataFrame or Series is involved. Ran the script to find and correct heading issues in the following files: - doc/source/user_guide/timedeltas.rst - doc/source/whatsnew/v0.7.0.rst - doc/source/whatsnew/v0.23.4.rst - (… and so on) Fixes part of issue pandas-dev#32550.
In #26933, we made the capitalization of titles consistent. For example, a title used to be capitalized like "This is the Section Title", and many of the titles in the pandas documentation were changed to the correct format, like "This is the section title".
In #31114, we made a script called scripts/validate_rst_title_capitalization.py that extracts all titles in the documentation and checks that only the first letter of each title is uppercase, apart from words defined in a short list, like Series, DataFrame, etc. The script also outputs how to fix the title. We validate that capitalization is correct by integrating this script into CI (continuous integration). The idea is that we should run this script through ci/code_checks.sh, and when title capitalization errors show up on CI, the user should fix those errors in the specified files.
To verify the code is working on your side, the command below instructs the program to validate the doc/source/development/contributing.rst file. There should be no output from this command, as this file has no capitalization errors:
./scripts/validate_rst_title_capitalization.py doc/source/development/contributing.rst
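The rule the script enforces, as described above, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the actual implementation: the function name is hypothetical, the exceptions set is a small illustrative subset built from names mentioned in this thread, and punctuation handling is ignored.

```python
# Illustrative subset only; the real script keeps its own, longer list.
CAPITALIZATION_EXCEPTIONS = {"pandas", "DataFrame", "Series", "IO", "STUMPY"}


def correct_title_capitalization(title: str) -> str:
    """Return `title` in sentence case, leaving exception words untouched."""
    fixed = []
    for position, word in enumerate(title.split(" ")):
        if word in CAPITALIZATION_EXCEPTIONS:
            # Known names keep their own capitalization, even as the first word.
            fixed.append(word)
        elif position == 0:
            # Only the first letter of the title is uppercase.
            fixed.append(word[:1].upper() + word[1:].lower())
        else:
            fixed.append(word.lower())
    return " ".join(fixed)


print(correct_title_capitalization("This is the Section Title"))
print(correct_title_capitalization("Working with DataFrame Objects"))
```

A title is reported as incorrect when it differs from the corrected form, which is also why the script is able to print the suggested fix.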
The command below instructs the program to validate both the doc/source/index.rst and doc/source/development/policies.rst files. This command produces the output below:
The goal of this issue is to correct the title capitalization of all files in the pandas documentation.
To see all titles that need to be validated in the documentation folder, run the following command on the command line.
./scripts/validate_rst_title_capitalization.py doc/source
This program validates all RST files in the doc/source folder. Once all titles are correctly validated, we would like to add the above command to the ci/code_checks.sh file.
Here's a checklist of all the files that had at least one incorrectly capitalized heading: