Various name fixes, in particular regarding von-prefix (prelast) & lineage (Jr./Sr./I/II/III/IV/V) #260

Merged: 22 commits, Oct 18, 2024

@ludopulles commented Oct 1, 2024

This pull request contains multiple changes:

  • von prefix: A prefix of the last name (van / van de / de la / von / d', but not Van / Van de / O') was sometimes incorrectly treated as part of the last name, which made the abbreviation generated by e.g. alpha.bst come out wrong, such as [van ] or [de ]. As explained in "rationale behind some particle names" #107 (comment), a 'good' .bst file in your LaTeX document that sorts on the last name without the prefix gives the expected alphabetical order. Even alpha.bst already produces abbreviations such as [DvW21], which is as expected.
  • Example: I changed "Lo{\"i}c {van Oldeneel tot Oldenzeel}" to "Lo{\"i}c van {Oldeneel tot Oldenzeel}", because without the braces BibTeX takes "van Oldeneel tot" to be the von-prefix (according to the Tame the Beast manual), which it is not.
  • Lineage: the lineage was sometimes parsed as part of the last name when it is a number rather than Jr./Sr.; e.g. William E. Skeith III used to be wrong. I forced the encoding "von Last, Jr., First" whenever a lineage appears, to make sure it is parsed correctly.
  • Many Belgian names were not parsed with their full last name, e.g. I changed "Gilles Van Assche" to "Gilles {Van Assche}".
  • Many Spanish/Italian/Portuguese names had a similar issue, e.g. I changed "Alfredo De Santis" to "Alfredo {De Santis}" (similar for "Ivan {De Oliveira Nunes}").
  • Fixed Moroccan names in the same way, e.g. I changed "Youssef El Housni" to "Youssef {El Housni}".
  • Many ePrint entries lack spaces between initials, e.g. in [EPRINT:Lu23] I changed "Frank Y.C. Lu" to "Frank Y. C. Lu".
  • Various name fixes that replace unparsed special characters, which appeared as "?", e.g. [EPRINT:GaiKiaRus2] now has author "Peter Gaži".
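
To make the brace examples above concrete, here is a minimal Python sketch of how BibTeX splits a comma-free name into First, von and Last, following the rules described in the Tame the Beast manual. It is an illustration only, not the script used for this PR: the function names are hypothetical, and BibTeX's handling of special characters such as `{\"i}` at the start of a word is deliberately left out.

```python
def split_top_level(name):
    """Split a name on whitespace that is outside any {...} group."""
    words, current, depth = [], [], 0
    for ch in name:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch.isspace() and depth == 0:
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words


def is_lowercase_word(word):
    """A word counts as lowercase if its first letter outside braces is lowercase.

    A fully braced word has no such letter and is treated as not lowercase.
    """
    depth = 0
    for ch in word:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        elif depth == 0 and ch.isalpha():
            return ch.islower()
    return False


def parse_first_von_last(name):
    """Split a 'First von Last' name (no commas) into its three parts."""
    words = split_top_level(name)
    if not words:
        return {"first": "", "von": "", "last": ""}
    # The final word always belongs to Last, so only earlier words can be von.
    lower = [i for i, w in enumerate(words[:-1]) if is_lowercase_word(w)]
    if not lower:
        return {"first": " ".join(words[:-1]), "von": "", "last": words[-1]}
    start, end = lower[0], lower[-1]
    return {
        "first": " ".join(words[:start]),
        "von": " ".join(words[start:end + 1]),
        "last": " ".join(words[end + 1:]),
    }


unbraced = parse_first_von_last(r'Lo{\"i}c van Oldeneel tot Oldenzeel')
braced = parse_first_von_last(r'Lo{\"i}c van {Oldeneel tot Oldenzeel}')
belgian = parse_first_von_last("Gilles Van Assche")
belgian_braced = parse_first_von_last("Gilles {Van Assche}")
```

Under these rules the unbraced Dutch name gets the von part "van Oldeneel tot", while the braced variant keeps "van" as the only prefix. Likewise, "Gilles Van Assche" ends up with "Van" in the first name unless the last name is braced.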

I used scripts to more or less automate these changes. I can make PRs for these scripts in the db_tools repository and in the lib repository (for the changes to the Person class) if wanted. I believe the import script db_import/import.py needs to be revised so that it handles these names correctly in the future.
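
As an illustration of what one such automated fix could look like, the following Python sketch inserts the missing space between adjacent initials, as in the "Frank Y.C. Lu" example. It is not the actual script: the function name `space_initials` is hypothetical, and the pattern only covers ASCII capitals.

```python
import re

# Zero-width match: right after "X." and right before another capital letter.
_INITIALS_GAP = re.compile(r"(?<=[A-Z]\.)(?=[A-Z])")


def space_initials(name):
    """Insert a space between adjacent initials, e.g. 'Y.C.' becomes 'Y. C.'."""
    return _INITIALS_GAP.sub(" ", name)


fixed = space_initials("Frank Y.C. Lu")
```

Names that are already spaced correctly pass through unchanged, so the function can be applied to every entry.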

Note: this does not update the ePrint labels, partly because pull request #252 was still open. Once this pull request is merged, I have a fix for all labels longer than 6 letters and for outdated labels, available here: https://github.com/ludopulles/crypto_db/tree/fix-252
I'll create a PR for that once/if this is merged.

Closes: #107

Also some other changes, as some authors also fixed their name on ePrint online, but this was not in cryptobib.
Let LaTeX handle the von prefixes, and hope that the .bst file sorts only by last name.
In many cases, you can just remove the braces.
Make sure your .bst file sorts by last name, excluding von prefix!
This makes sure that lineages such as II, III, IV are parsed correctly by BibTeX!
See e.g. Section 11 "The author field" of the BibTeX Tame the Beast manual.
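
The reason for the explicit encoding is that, without commas, BibTeX treats the final word as the last name, so "William E. Skeith III" would get the last name "III". A minimal Python sketch of the re-encoding is shown below. It is not the code from this PR: the names `LINEAGES` and `encode_lineage` are hypothetical, and it assumes a single-word last name with no von-prefix.

```python
# Lineage suffixes recognised by this sketch (an assumption, not an official list).
LINEAGES = {"Jr.", "Sr.", "II", "III", "IV"}


def encode_lineage(name):
    """Rewrite 'First Last Lineage' as 'Last, Lineage, First'.

    Names without a recognised trailing lineage are returned unchanged.
    """
    words = name.split()
    if len(words) < 3 or words[-1] not in LINEAGES:
        return name
    lineage = words[-1]
    last = words[-2]                 # assumes a one-word last name
    first = " ".join(words[:-2])
    return f"{last}, {lineage}, {first}"


encoded = encode_lineage("William E. Skeith III")
```

With two commas in the name, BibTeX reads the parts positionally as "von Last", "Jr" and "First", so the lineage can no longer be mistaken for the last name.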
There are also some names that are written without capitalization, but those occur less frequently than these names, which are clearly inconsistent with the usual way these names are written.
@ludopulles

The scripts I used are found in https://github.com/ludopulles/cryptobib_tools/ . In particular the files:

Note that to use the "Person" object from mybibtex I modified the cryptobib/lib repo to specify the name output format.
