Project author: Sumit2514

Project description:
Downloads 50 public profile PDFs from LinkedIn, converts them to text, then finds the most frequent and most essential words in each profile.
Language: Python
Project URL: git://github.com/Sumit2514/Resume-Most-Frequent-Words-And-Keywords.git


AI_CHAMP

Task 1:

Download 50 public profile PDFs from my LinkedIn and store them in the TASK 1 folder.

Task 2:

In Task 2, the public profile PDFs are converted to text using the slate library. A while loop loads and converts the PDFs one at a time. The text from each PDF is appended to a list, the list is converted to a DataFrame, and the DataFrame is stored in a CSV named Profile_text.
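
The Task 2 flow could look roughly like the sketch below. It is not the repository's actual script: the function names (extract_with_slate, pdfs_to_csv), the column names "file" and "text", and the .csv extension are assumptions made for illustration. To keep the sketch dependency-free it writes the CSV with the standard csv module instead of a pandas DataFrame, and it assumes slate's documented slate.PDF(file_object) call, which yields one string per page.

```python
import csv
from pathlib import Path


def extract_with_slate(pdf_path):
    """Return the text of one PDF using slate (assumed to be installed)."""
    import slate  # third-party; imported lazily so the rest runs without it

    with open(pdf_path, "rb") as handle:
        pages = slate.PDF(handle)  # list-like object, one string per page
    return "\n".join(pages)


def pdfs_to_csv(pdf_dir, csv_path, extract=extract_with_slate):
    """Convert every PDF in pdf_dir to text and store the rows in a CSV."""
    pdf_paths = sorted(Path(pdf_dir).glob("*.pdf"))
    rows = []
    index = 0
    # A while loop over the file list, mirroring the description above.
    while index < len(pdf_paths):
        path = pdf_paths[index]
        rows.append({"file": path.name, "text": extract(path)})
        index += 1

    # Stand-in for "list -> DataFrame -> CSV": one row per profile.
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["file", "text"])
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

A call such as pdfs_to_csv("TASK 1", "Profile_text.csv") would then produce one row per profile. The extract parameter is there so a different PDF reader can be swapped in if slate is unavailable.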

Task 3:

In Task 3, the text generated in Task 2 is loaded and tokenized into words. Stop words are then removed row by row. The 5 most frequent words of each profile are extracted, and the RAKE library is used to extract the essential keywords of each profile. Finally, the result is stored in a CSV named Profile_keyword.
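
The two extraction steps in Task 3 can be illustrated with the standard library alone. The sketch below is an assumption-laden stand-in, not the project's code: STOP_WORDS is a tiny hypothetical list (a real run would use a full stop-word list), top_words and rake_phrases are made-up names, and rake_phrases is a simplified re-implementation of the RAKE idea (split text into phrases at stop words and punctuation, score each word by degree divided by frequency, rank phrases by the sum of their word scores) rather than a call into the RAKE library the project uses.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "for",
              "with", "on", "at", "is", "are", "as", "by"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def top_words(text, n=5):
    """Return the n most frequent non-stop words of one profile."""
    words = [w for w in tokenize(text) if w not in STOP_WORDS]
    return [word for word, _ in Counter(words).most_common(n)]


def rake_phrases(text, n=5):
    """Return the n highest-scoring candidate phrases (simplified RAKE)."""
    phrases = []
    # Punctuation and digits end a phrase, and so does any stop word.
    for fragment in re.split(r"[^a-z\s]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in STOP_WORDS:
                if current:
                    phrases.append(current)
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(current)

    freq = Counter()
    degree = Counter()
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)  # co-occurrences, counting itself

    score = {word: degree[word] / freq[word] for word in freq}
    unique = {" ".join(phrase) for phrase in phrases}
    # Highest total score first; ties broken alphabetically.
    ranked = sorted(unique,
                    key=lambda p: (-sum(score[w] for w in p.split()), p))
    return ranked[:n]
```

Applied to each row of the Task 2 output, top_words(text) gives the frequent-word column and rake_phrases(text) gives a keyword column. Frequency counts favour single repeated words, while the RAKE-style score favours longer multi-word phrases, which is why the two columns complement each other.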