Conversation

@ccoVeille
Collaborator

@ccoVeille ccoVeille commented May 8, 2025

Add/Fix Dictionary

Dictionary: software-terms

Description

  • feat: add variations of marshal to software-terms.txt
  • feat: add encode/decode/unencode to software-terms
  • feat: add (un)package to software-terms
  • feat: add (un)wrap to software-terms
  • feat: add (un|de)serialize to software-terms
  • feat: add (un)review to software-terms
  • feat: add (un)revert to software-terms

References

  • Any source references.

Checklist

  • By submitting this pull-request, you agree to follow our Code of Conduct
  • Verify that the title starts with the correct prefix:
    • fix: - for minor changes like adding words or fixing spelling issues.
    • feat: - for a significant change like adding a whole new set of words to a dictionary.
    • feat!: - for breaking changes, like file format or licensing changes.
    • chore: - for changes that do not impact the content of dictionaries.

ccoVeille added 2 commits May 8, 2025 22:36
The idea is to add existing variations:
- marshal (verb and noun)
- marshals (present tense and plural noun)

British English:
- marshallable (adjective)
- marshalling (gerund)
- marshalled (past tense)
- marshaller (noun)
- marshallers (plural noun)

American English:
- marshalable (adjective)
- marshaling (gerund)
- marshaled (past tense)
- marshaler (noun)
- marshalers (plural noun)
- variations of encode are added to the software-terms dictionary
    - encode (verb)
    - encodes (present tense)
    - encoding (gerund and noun)
    - encodings (plural noun)
    - encoded (past tense)
    - encoder (noun)
    - encoders (plural noun)
    - encodable (adjective)
    - encodability (noun)

- variations of decode are also added

    - decode (verb)
    - decodes (present tense)
    - decoding (gerund and noun)
    - decodings (plural noun)
    - decoded (past tense)
    - decoder (noun)
    - decoders (plural noun)
    - decodable (adjective)
    - decodability (noun)

- variations of unencode are added to the software-terms dictionary
    - unencode (verb)
    - unencodes (present tense)
    - unencoded (past tense)

unencode is a neologism, but it is used in the wild.
ccoVeille added 5 commits May 8, 2025 23:19
- package (verb & noun)
- packages (plural noun and present tense)
- packageability (noun)
- packageable (adjective)
- packaged (past tense)
- packager (noun)
- packagers (plural noun)
- packaging (gerund and noun)
- packagings (plural noun)

- unpackage (verb)
- unpackages (present tense)
- unpackageability (noun)
- unpackageable (adjective)
- unpackaged (past tense)
- unpackager (noun)
- unpackagers (plural noun)
- wrap (noun and verb)
- wraps (plural noun and present tense)
- wrappability (noun)
- wrappable (adjective)
- wrapped (adjective and past tense)
- wrapper (noun)
- wrappers (plural noun)
- wrapping (noun and present participle)
- wrappings (plural noun)

- unwrap (verb)
- unwraps (present tense)
- unwrappability (noun)
- unwrappable (adjective)
- unwrapped (adjective and past tense)
- unwrapper (noun)
- unwrappers (plural noun)
- unwrapping (noun and present participle)
- unwrappings (plural noun)
- serializability (noun)
- serializable (adjective)
- serialization (noun)
- serializations (plural noun)
- serialize (verb)
- serializes (present tense)
- serialized (past tense)
- serializing (noun and gerund)
- serializer (noun)
- serializers (plural noun)

- unserializability (noun)
- unserializable (adjective)
- unserialization (noun)
- unserializations (plural noun)
- unserialize (verb)
- unserializes (present tense)
- unserialized (past tense)
- unserializing (gerund)
- unserializer (noun)
- unserializers (plural noun)

- deserializability (noun)
- deserializable (adjective)
- deserialization (noun)
- deserializations (plural noun)
- deserialize (verb)
- deserializes (present tense)
- deserializing (gerund)
- deserialized (past tense)
- deserializer (noun)
- deserializers (plural noun)
- review (noun and verb)
- reviews (plural noun and present tense)
- reviewed (adjective)
- reviewer (noun)
- reviewers (plural noun)
- reviewing (gerund)
- reviewable (adjective)
- reviewability (noun)

- unreview (noun and verb)
- unreviews (plural noun and present tense)
- unreviewed (adjective)
- unreviewable (adjective)
- revert (noun and verb)
- reverts (plural noun and present tense)
- reverted (past tense)
- reverting (gerund)
- revertible (adjective)
- revertable (adjective, less common)
- revertability (noun)

- unrevert (verb)
- unreverts (present tense)
- unreverted (past tense)
- unreverting (gerund)
- unrevertible (adjective)
@Jason3S Jason3S changed the title from "feat: add software-terms" to "fix: add software-terms" May 9, 2025
@Jason3S
Collaborator

Jason3S commented May 9, 2025

🤔 It might be better to just add these to the appropriate English dictionary.

At the moment, the software terms dictionary is always included. Adding mixed-language terms impacts everyone.

@ccoVeille
Collaborator Author

I thought about it this morning, yes.

Most will go to en_shared/additional.txt as they are missing in en_CA and en_AU

Some might go in en_shared/additional-ise.txt as they are specific to one English flavor.

@ccoVeille
Collaborator Author

The main reason I added them to software terms initially is because some words are specific to software. You might use a language that is not English, but use these words anyway. But maybe I'm going too far.

Also, some of the words I'm adding are simply variations of words that are already in software terms.

Do you think I should consider cleaning up the software-terms file by moving some of these words out of it and into en_shared?

Collaborator Author

@ccoVeille ccoVeille left a comment

I was wondering why you were not coming back to me 😱 😭, so I came here and saw my comments were still in pending state 🤦‍♂️.

So here we go, I pressed the send button 😜😂

Comment on lines 1056 to 1060
marshallable
marshalled
marshaller
marshallers
marshalling
Collaborator Author

Question:

Are these good candidates to be added to en_shared/additional-ise.txt?

I'm unsure about the purpose of the en_GB-ise dictionary and the en_shared/additional-ise.txt file.

Collaborator

Yes, including marshalling and marshallable.

Comment on lines +2294 to +2325
serializability
serializable
serialization
serializations
serialize
serialized
serializer
serializers
serializes
serializing

unserializability
unserializable
unserialization
unserializations
unserialize
unserialized
unserializer
unserializers
unserializes
unserializing

deserializability
deserializable
deserialization
deserializations
deserialize
deserialized
deserializer
deserializers
deserializes
deserializing
Collaborator Author

About these words, they are all in the -ize form.

What do you think about having an en_shared/additional-ize.txt?

It would be included in en_US, but words from this list would be excluded from en_GB-ise, and maybe others TBD.

I feel like someone enabling en_GB-ise only doesn't want serialize to be accepted as valid
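
To make the -ize/-ise idea concrete, here is a minimal Python sketch, not part of the PR. It assumes a naive suffix rule: the helper names `to_ise` and `split_by_flavor` are hypothetical, and the regular expression would wrongly match words such as "size" or "prize", so a real word list would still need manual review.

```python
import re

# Assumption: a word is "-ize flavored" if it ends in "iz" + one of these suffixes.
# This is a rough heuristic for illustration, not the dictionaries' actual rule.
_IZE = re.compile(r"iz(e|es|ed|er|ers|ing|ation|ations|able|ability)$")


def to_ise(word: str) -> str:
    """Return the -ise spelling of an -ize word; other words are returned unchanged."""
    return _IZE.sub(lambda m: "is" + m.group(1), word)


def split_by_flavor(words):
    """Split words into (-ize forms, everything else), each sorted."""
    ize = sorted(w for w in words if _IZE.search(w))
    other = sorted(w for w in words if not _IZE.search(w))
    return ize, other


if __name__ == "__main__":
    ize_words, shared_words = split_by_flavor(["wrap", "serialize", "unserializable"])
    # ize_words would be candidates for the proposed en_shared/additional-ize.txt
    for w in ize_words:
        print(w, "->", to_ise(w))
```

Under this sketch, only the first list would be excluded when someone enables en_GB-ise alone.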

Collaborator

Yes, I think it is a good idea. Especially since the British dictionary also has the -ize versions.

Collaborator Author

@ccoVeille ccoVeille May 27, 2025

The -ize word list is being created in #4438.

Later, a PR will be opened to add the variants of serialize to this word list.

@Jason3S
Collaborator

Jason3S commented May 11, 2025

> I was wondering why you were not coming back to me 😱 😭, so I came here and saw my comments were still in pending state 🤦‍♂️.

> So here we go, I pressed the send button 😜😂

Thank you for the comments. I'll try to answer them.

@Jason3S
Collaborator

Jason3S commented May 11, 2025

> The main reason I added them to software terms initially is because some words are specific to software. You might use a language that is not English, but use these words anyway. But maybe I'm going too far.

It is a balance. That logic is applied to the programming language dictionaries since most of their keywords are English words.

> Also, some of the words I'm adding are simply variations of words that are already in software terms.

Understood. There are many examples already in the list. I too debated on what to do with derivative words.

> Do you think I should consider cleaning up the software-terms file by moving some of these words out of it and into en_shared?

Cleaning up software terms is an on-going project. Please remember, removing words tends to get people upset, especially if it breaks their CI/CD pipeline. For this reason, I have started splitting out the word lists, but not deleting terms.
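
The "split, but do not delete" approach can be sketched as follows. This is only an illustration under stated assumptions: `split_word_list` and the `belongs_elsewhere` predicate are hypothetical names, and the actual repository tooling may work differently.

```python
def split_word_list(words, belongs_elsewhere):
    """Partition words into (kept, moved) without dropping any term.

    `belongs_elsewhere` is a hypothetical predicate deciding which words
    move to another list (for example, a shared English word list).
    """
    kept, moved = [], []
    for word in words:
        (moved if belongs_elsewhere(word) else kept).append(word)
    # Safety check: every original term must still exist in one of the two lists,
    # so a dictionary that includes both lists accepts the same words as before.
    assert set(kept) | set(moved) == set(words)
    return kept, moved


if __name__ == "__main__":
    kept, moved = split_word_list(
        ["marshal", "serialize", "wrap"],
        lambda w: w.endswith("ize"),  # illustrative rule only
    )
    print("kept:", kept)
    print("moved:", moved)
```

As long as the dictionary keeps including both resulting lists, no previously accepted word starts failing in someone's CI/CD pipeline.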

@ccoVeille
Collaborator Author

> Do you think I should consider cleaning up the software-terms file by moving some of these words out of it and into en_shared?

> Cleaning up software terms is an on-going project. Please remember, removing words tends to get people upset, especially if it breaks their CI/CD pipeline. For this reason, I have started splitting out the word lists, but not deleting terms.

I feel like we could add a legacy.txt to software-terms.txt. What do you think?

@ccoVeille
Collaborator Author

Because this PR can be used as a future reference, and because it's still unclear to me where each word I added should go, I plan to close this PR and open smaller ones, each focusing on adding one word and its variations.

@ccoVeille ccoVeille marked this pull request as draft May 11, 2025 18:28
@Jason3S
Collaborator

Jason3S commented May 13, 2025

> Do you think I should consider cleaning up the software-terms file by moving some of these words out of it and into en_shared?

> Cleaning up software terms is an on-going project. Please remember, removing words tends to get people upset, especially if it breaks their CI/CD pipeline. For this reason, I have started splitting out the word lists, but not deleting terms.

> I feel like we could add a legacy.txt to software-terms.txt. What do you think?

I think it is a good idea.

@ccoVeille
Collaborator Author

ccoVeille commented May 13, 2025

> Do you think I should consider cleaning up the software-terms file by moving some of these words out of it and into en_shared?

> Cleaning up software terms is an on-going project. Please remember, removing words tends to get people upset, especially if it breaks their CI/CD pipeline. For this reason, I have started splitting out the word lists, but not deleting terms.

> I feel like we could add a legacy.txt to software-terms.txt. What do you think?

> I think it is a good idea.

Opened #4443

Replaced by #4453

@ccoVeille
Collaborator Author

I still need to review this one.
