Talent Snapshot: Keyword Extraction and Analysis

Description

My first attempt at keyword extraction and analysis, using the spaCy, pandas, and NumPy libraries together with Xu Liang's implementation of TextRank, which was a great help.

Dataset used

The dataset contains FORKAIA interns' responses to their application questions. It could not be made publicly available.

Main procedure

I split the project into two parts: extraction and analysis. The extraction code is in "keyword_extraction.py" and the collected keywords are in "Keywords.xlsx". The main idea of the extraction was to filter out stopwords and use the TextRank algorithm to identify keywords that could later be compared to roles of interest during the analysis. This procedure was used for every section except the education section. For education keywords, I collected data on the different types of degrees and checked for keywords related to that data.
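To give a feel for the extraction step, here is a minimal sketch of the TextRank idea in plain Python. It is not the code in "keyword_extraction.py" or Xu Liang's implementation: the tiny stopword list, the function name textrank_keywords, and the window, damping, and steps parameters are all assumptions made for illustration. Words that occur close together are linked in a graph, and a PageRank-style iteration scores each word by how well connected it is.

```python
from collections import defaultdict

# Tiny illustrative stopword list (an assumption; a real run would use a full list).
STOPWORDS = {"a", "an", "and", "the", "in", "of", "to", "i", "with", "for", "on", "at", "my", "is"}


def textrank_keywords(text, window=4, damping=0.85, steps=30, top_n=5):
    """Rank words with a simplified TextRank over a co-occurrence graph."""
    # Lowercase, strip surrounding punctuation, and drop stopwords.
    tokens = [w.strip(".,;:!?()\"'").lower() for w in text.split()]
    tokens = [w for w in tokens if w and w not in STOPWORDS]

    # Undirected graph: link each word to the next (window - 1) words.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    # PageRank-style update: each word shares its score equally among its neighbors.
    scores = {w: 1.0 for w in neighbors}
    for _ in range(steps):
        scores = {
            w: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[w])
            for w in neighbors
        }

    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_n]


if __name__ == "__main__":
    sample = (
        "Python developer with Python data analysis experience. "
        "Built Python data pipelines and data dashboards."
    )
    print(textrank_keywords(sample))
```

Words that co-occur with many different neighbors rise to the top, which is why frequently repeated skills in a response tend to be picked as keywords.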

The analysis code is in "keyword_analysis.py" and the results are in "Snapshot Figures and Analysis.xlsx". For the analysis, I used spaCy's most_similar function with the large model to compare the extracted keywords to a given role of interest and gauge their similarity. If they were similar, the person was counted as holding that role within FORKAIA.
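The comparison above can be sketched roughly as follows. This is not the code in "keyword_analysis.py": it swaps spaCy's most_similar for a plain cosine similarity over hand-made toy vectors so that it runs without a model download. The names cosine_similarity, matches_role, and TOY_VECTORS, and the 0.7 threshold, are all hypothetical. In the project, the vectors would come from spaCy's large model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def matches_role(keyword_vectors, role_vector, threshold=0.7):
    """Return True if any keyword vector is at least `threshold` similar to the role vector."""
    return any(
        cosine_similarity(vec, role_vector) >= threshold
        for vec in keyword_vectors.values()
    )


# Toy 3-dimensional vectors, invented for illustration only.
TOY_VECTORS = {
    "python": [0.9, 0.1, 0.0],
    "painting": [0.0, 0.1, 0.9],
}
DEVELOPER_ROLE = [1.0, 0.1, 0.0]

if __name__ == "__main__":
    # A person whose keywords include "python" is counted as matching the role.
    print(matches_role({"python": TOY_VECTORS["python"]}, DEVELOPER_ROLE))
    print(matches_role({"painting": TOY_VECTORS["painting"]}, DEVELOPER_ROLE))
```

The threshold decides how strict the role assignment is: a higher value counts fewer people as holding a role, so it is worth tuning against a few hand-checked responses.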

Results

Below are the results I obtained:

