This repository contains Python scripts for analysing Bluesky posts stored in the database(s) collected with the Bluesky Language Feed Generator.
The analysis covers the following features:
- negations (no, nah, nope, naw)
- affirmations (yes, yeah, yep, yeh)
- profanities (shit, fuck, ass)
- personal pronouns (I, you, he, she, it)
- position in the discourse
- capitalized / lowercased
- added vowels / consonants
- surrounding words
- word count
- emojis
- sentiment
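To make a few of these features concrete, here is a minimal sketch of how position, capitalization, added letters, surrounding words, and word count could be extracted for one pronoun in one post. It is not taken from the repository's scripts: the function names `elongated_pattern` and `describe_usage`, the whitespace tokenization, and the "initial / medial / final" labels are all assumptions made for illustration.

```python
import re


def elongated_pattern(word: str) -> re.Pattern:
    """Build a regex matching `word` with any of its letters repeated."""
    # "bro" becomes b+r+o+, so "brooo" and "brrro" both match, but "bros" does not.
    body = "".join(re.escape(ch) + "+" for ch in word.lower())
    return re.compile(rf"\b{body}\b", re.IGNORECASE)


def describe_usage(post: str, word: str) -> list[dict]:
    """Return one record per occurrence of `word` (or a lengthened form) in `post`."""
    tokens = post.split()  # assumption: naive whitespace tokenization
    pattern = elongated_pattern(word)
    records = []
    for index, token in enumerate(tokens):
        match = pattern.search(token)
        if match is None:
            continue
        form = match.group(0)
        if index == 0:
            position = "initial"
        elif index == len(tokens) - 1:
            position = "final"
        else:
            position = "medial"
        records.append({
            "form": form,
            "position": position,                  # position in the post
            "capitalized": form[0].isupper(),      # capitalized / lowercased
            "lengthened": len(form) > len(word),   # added vowels / consonants
            "previous": tokens[index - 1] if index > 0 else None,
            "next": tokens[index + 1] if index < len(tokens) - 1 else None,
            "word_count": len(tokens),
        })
    return records
```

Calling `describe_usage("Bro that was wild brooo", "bro")` yields two records, one for the capitalized initial form and one for the lengthened final form.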
First, get the database(s) from the OSF project and put them in a folder.
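Before running the analysis, it can help to check that a downloaded file opens correctly. The sketch below assumes the `.db` files are SQLite databases, which this README does not state explicitly; the helper name `list_tables` is hypothetical, and no table names are assumed.

```python
import sqlite3
from pathlib import Path


def list_tables(database_path: str) -> dict[str, int]:
    """Map each table name in the database to its row count."""
    # sqlite3.connect would silently create a missing file, so check first.
    if not Path(database_path).is_file():
        raise FileNotFoundError(database_path)
    connection = sqlite3.connect(database_path)
    try:
        names = [
            row[0]
            for row in connection.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        return {
            name: connection.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        connection.close()
```

For example, `list_tables("bluesky.db")` would return the tables found in the default database along with how many rows each one holds.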
Run this command:
`python analyse.py -p <pronoun>`
Expect:
`<pronoun>.results.md` to be created in the results directory, containing details about the pronoun.
Optional arguments:
- `-d, --database <database>`: Specify the database to use (default is `bluesky.db`).
- `-o, --outputFile <output_file>`: Specify the output file name (default is `<pronoun>.results.md`).
- `-p <pronoun>`: Specify the pronoun file name (default is `bro`); options include dude, bro, bruh, chat, sis, fam.
- `-s, --profanities`: Include profanities in the analysis (default is `False`).
- `-n, --negations`: Include negations in the analysis (default is `False`).
- `-a, --affirmations`: Include affirmations in the analysis (default is `False`).
- `-e, --emojis`: Include emojis in the analysis (default is `False`).
- `-u, --usage`: Display the posts for any of the data parameters provided (default is `False`).
- `--allRows`: Display every post using the pronoun in the analysis (default is `False`).
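The way these options and their defaults fit together can be sketched with `argparse`. This is an illustration only: it assumes the script parses its arguments with `argparse` and that the boolean options are simple on/off flags, neither of which is confirmed here. The helper names `build_parser` and `parse` are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the options documented for analyse.py."""
    parser = argparse.ArgumentParser(prog="analyse.py")
    parser.add_argument("-d", "--database", default="bluesky.db")
    parser.add_argument("-o", "--outputFile", default=None)
    parser.add_argument("-p", dest="pronoun", default="bro")
    # Assumption: each of these is a flag that is False unless passed.
    parser.add_argument("-s", "--profanities", action="store_true")
    parser.add_argument("-n", "--negations", action="store_true")
    parser.add_argument("-a", "--affirmations", action="store_true")
    parser.add_argument("-e", "--emojis", action="store_true")
    parser.add_argument("-u", "--usage", action="store_true")
    parser.add_argument("--allRows", action="store_true")
    return parser


def parse(argv: list[str]) -> argparse.Namespace:
    """Parse argv and fill in the output file name derived from the pronoun."""
    args = build_parser().parse_args(argv)
    if args.outputFile is None:
        args.outputFile = f"{args.pronoun}.results.md"
    return args
```

With this sketch, `parse(["-p", "dude", "-n"])` selects the pronoun `dude`, enables negations, and leaves the output file at `dude.results.md`.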
For a summary of all pronouns in the database, run this command:
`python summarize.py -d <database> -o <output_file>`
Expect:
`<output_file>` to be created in the results directory, containing a summary of all pronouns in the database.
Optional arguments:
- `-d, --database <database>`: Specify the database to use (default is `bluesky.db`).
- `-o, --outputFile <output_file>`: Specify the output file name (default is `summary.results.md`).
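If you want results for every listed pronoun plus the summary in one go, the two scripts can be chained from a small driver. The sketch below uses only the `-p` and `-d` options documented above; the helper names `build_commands` and `run_all` are hypothetical, and it assumes it is run from the directory containing both scripts.

```python
import subprocess
import sys

# Pronouns listed in this README as options for -p.
PRONOUNS = ["dude", "bro", "bruh", "chat", "sis", "fam"]


def build_commands(database: str = "bluesky.db") -> list[list[str]]:
    """Build one analyse.py command per pronoun, followed by summarize.py."""
    commands = [
        [sys.executable, "analyse.py", "-p", pronoun, "-d", database]
        for pronoun in PRONOUNS
    ]
    commands.append([sys.executable, "summarize.py", "-d", database])
    return commands


def run_all(database: str = "bluesky.db") -> None:
    """Run every command in order, stopping at the first failure."""
    for command in build_commands(database):
        subprocess.run(command, check=True)
```

Calling `run_all("bluesky.db")` would then run the six analyses and the summary in order, each using its default output file name.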