Naming convention for popgen stats and variables #239
We should probably change
Good call.
tomwhite added a commit to tomwhite/sgkit that referenced this issue Nov 30, 2020
tomwhite added a commit to tomwhite/sgkit that referenced this issue Dec 1, 2020
Any objections to closing this one now?
Nope, done.
#100 added functions `Fst` and `Tajimas_D` to the API. It's not clear that these are the right naming choices, and we should make a conscious choice about what the variable/function naming conventions are. The most obvious choice is to make everything lowercase (i.e., `f_st`/`stats_f_st` and `tajimas_d`/`stats_tajimas_d`), but arguably this is less readable than `Fst`/`stats_Fst`. We'll probably end up with a whole pile of stats like Garud's H stats (#231), Patterson's F[2-4] stats, and the D, R and R2 LD stats.

When we use letters to describe statistics, the case does matter, and it may be artificial to force them to be lowercase just for the sake of maintaining conventions ("A foolish consistency is the hobgoblin of little minds").
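
To make the two options concrete, here is a minimal sketch that puts the candidate spellings side by side. The `CANDIDATES` table and the `variable_name` helper are hypothetical and exist only for illustration; they are not part of the sgkit API. `stats_Tajimas_D` is inferred by analogy with `stats_Fst` rather than quoted from anywhere.

```python
# Candidate spellings taken from the discussion above; none of this is sgkit API.
CANDIDATES = {
    "Fst": {"lowercase": "f_st", "case_preserving": "Fst"},
    "Tajima's D": {"lowercase": "tajimas_d", "case_preserving": "Tajimas_D"},
}


def variable_name(stat: str, convention: str, prefix: str = "stats_") -> str:
    """Return the output-dataset variable name for a statistic (hypothetical helper)."""
    return prefix + CANDIDATES[stat][convention]


for stat in CANDIDATES:
    for convention in ("lowercase", "case_preserving"):
        name = variable_name(stat, convention)
        # Every candidate name here is a valid Python identifier.
        assert name.isidentifier()
        print(f"{stat:12} {convention:16} {name}")
```

Both spellings are legal identifiers, so the choice comes down to readability versus consistency rather than anything the language forces on us.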
Ideally, we'd have descriptive names for things, like we have for `diversity` and `divergence`, but I don't think that's practical. The established convention in the field is for "[authors] [combination of letters]" to denote a statistic. The question is: what convention should we adopt for rendering "[authors] [combination of letters]" as a function name/variable name in the output dataset?

Related to #232 and #226