Bibliometric project

Before you run the project, please download the data folder, unzip it, and add it to the project root folder.

pip install -r requirements.txt

A1. Basic data cleansing, author id and name disambiguation
- remove noise
- merge multiple ids
- unify names
A2. CREDS members author-topic Sankey diagram
- visualise author-concept distributions
A3. CREDS members author-topic & correlation heatmaps
- identify which members share similar research interests
A4. TSNE visualisation for CREDS members
- show every CREDS member's position on a 2D plot (research interest based)
A5. CREDS research concept vector
- A pooled vector representation of the CREDS research direction
A6. Retrieve citation data for CREDS members
- Data collection via API
- for workflow purposes
- generate 'data/ref_author.RDS' and 'data/ref_concept.RDS'
A7. How do CREDS members cite each other?
- An internal citation map of CREDS members
A8. Who do CREDS members commonly cite the most?
- The authors most commonly cited by CREDS members (common_cited_authors.xlsx)
A9. What concepts are cited most by CREDS members?
- The concepts (topics) most commonly cited by CREDS members (common_cited_topics.xlsx)
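
As a rough illustration of A3 and A5, the sketch below pools per-author concept vectors into one group vector and compares authors pairwise. Everything in it is an assumption made for illustration: the author names, the scores, the use of mean pooling, and the use of cosine similarity as a stand-in for whatever measure the heatmaps actually use. The notebooks may do this differently.

```python
from itertools import combinations
from math import sqrt

# Hypothetical author-concept scores; each list has one entry per concept.
AUTHOR_VECTORS = {
    "author_a": [0.8, 0.1, 0.1],
    "author_b": [0.6, 0.2, 0.2],
    "author_c": [0.0, 0.9, 0.1],
}


def mean_pool(vectors):
    """Average equal-length vectors element-wise (mean pooling is an assumption)."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity of two vectors; returns 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# One vector summarising the whole group (A5).
group_vector = mean_pool(list(AUTHOR_VECTORS.values()))

# Pairwise similarity between authors, the raw material for a heatmap (A3).
similarity = {
    (a, b): cosine(AUTHOR_VECTORS[a], AUTHOR_VECTORS[b])
    for a, b in combinations(AUTHOR_VECTORS, 2)
}
```

With these made-up scores, author_a and author_b come out far more similar to each other than either is to author_c.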

B1. Citation performance of CREDS members
- CREDS members' citation performance
- total citations per paper (pp.), yearly citations pp., and 3-year citations pp.
- Box plot of the three indicators
B2. Constructing a benchmark based on the concept Education
- Works on the Education topic that have AU authors.
- Scalability issue: the API returns at most 10,000 records per query.
- We can split the query by year when each year has fewer than 10k records.
- Alternatively, we can download the database snapshot, at a greater cost (https://docs.openalex.org/download-snapshot).
- Perform analysis on the three citation indicators and compare.
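
The year-by-year approach in B2 could be sketched as follows: build one query URL per publication year so that each query stays under the record limit. This only constructs URLs and makes no network calls. The concept ID is a placeholder rather than the real Education ID, treating AU as a country code is an assumption, and the filter names should be checked against the OpenAlex documentation before use.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.openalex.org/works"
EDUCATION_CONCEPT_ID = "C_EDUCATION_ID"  # placeholder, not a real concept ID


def yearly_query_urls(start_year, end_year, concept_id, country_code="AU"):
    """Return {year: url}, one filtered works query per publication year."""
    urls = {}
    for year in range(start_year, end_year + 1):
        filters = ",".join(
            [
                f"concepts.id:{concept_id}",
                f"authorships.institutions.country_code:{country_code}",
                f"publication_year:{year}",
            ]
        )
        query = urlencode({"filter": filters, "per-page": 200})
        urls[year] = f"{BASE_URL}?{query}"
    return urls


urls = yearly_query_urls(2019, 2021, EDUCATION_CONCEPT_ID)
```

Each URL can then be fetched and paged independently. Any year that still exceeds 10k records would need a finer split or the snapshot.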
B3. Constructing a benchmark based on the related works
- Use the related works as a benchmark.
- Perform analysis on the three citation indicators and compare.
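
For reference, here is a minimal sketch of how the three citation indicators might be computed per paper and then averaged. The definitions are assumptions, since this README does not spell them out: "yearly" is taken as total citations divided by the paper's age in years, and "3-year" as citations received in the publication year and the two years after it. The sample papers and the reference year are made up.

```python
from statistics import mean

REFERENCE_YEAR = 2023  # assumed year in which the analysis is run

# Hypothetical papers with citation counts broken down by year.
papers = [
    {"year": 2018, "citations_by_year": {2018: 1, 2019: 4, 2020: 5, 2021: 2}},
    {"year": 2020, "citations_by_year": {2020: 0, 2021: 3, 2022: 3}},
]


def indicators(paper, reference_year=REFERENCE_YEAR):
    """Compute the three (assumed) citation indicators for one paper."""
    counts = paper["citations_by_year"]
    published = paper["year"]
    total = sum(counts.values())
    age = max(reference_year - published + 1, 1)  # age in years, at least 1
    three_year = sum(
        count for year, count in counts.items() if published <= year < published + 3
    )
    return {"total": total, "yearly": total / age, "three_year": three_year}


def per_paper_means(paper_list):
    """Average each indicator over a list of papers."""
    rows = [indicators(paper) for paper in paper_list]
    return {key: mean(row[key] for row in rows) for key in rows[0]}


summary = per_paper_means(papers)
```

Running the same function on the CREDS set and on a benchmark set gives two summaries that can be compared indicator by indicator.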
