Project author: MhmdSyd
Project description:
Wuzzuf data analysis in Java using Spark SQL, Spring, XChart, and Spark ML
Language: Java
Project URL: git://github.com/MhmdSyd/Wuzzuf_Jobs_DataAnalysis.git
Wuzzuf_Jobs_DataAnalysis
Java project, ITI team
Project Details:
Java Final Project:
Task:
• Build all the needed Java classes (POJO, DAO, web service, and a test
client for the web service)
• Make a web service to get the following from the data set:
- Read the data set, convert it to a DataFrame or Spark RDD, and display some of it.
- Display the structure and summary of the data.
- Clean the data (nulls, duplicates).
- Count the jobs for each company and display the counts in order (which companies have the most job postings?).
- Show step 4 in a pie chart.
- Find out the most popular job titles.
- Show step 6 in a bar chart.
- Find out the most popular areas.
- Show step 8 in a bar chart.
- Print the skills one by one with how many times each is repeated, ordered to find the most important required skills.
- Factorize the YearsExp feature and convert it to numbers in a new column. (Bonus)
- Apply K-means to job titles and companies. (Bonus)
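Several of the steps above (jobs per company, job titles, areas, skills) come down to grouping values, counting them, and sorting by count. The sketch below shows that logic in plain Java collections rather than Spark, so it can run without a cluster. It is only an illustration: the class name `JobStats` and its methods are hypothetical and not taken from the project, and it assumes the skills column holds comma-separated values. In the Spark version the same ideas would likely map to `groupBy(...).count()` for the counts and to Spark ML's `StringIndexer` for factorizing YearsExp.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper class; names are illustrative, not from the project.
final class JobStats {

    private JobStats() {}

    // Counts each distinct value and returns the counts ordered from most to
    // least frequent. Ties are broken alphabetically for a stable ordering.
    static Map<String, Long> countDescending(List<String> values) {
        Map<String, Long> counts = values.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder())
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (a, b) -> a, LinkedHashMap::new));
    }

    // Assumes each row is a comma-separated skill list. Splits the rows,
    // trims and lower-cases every skill, then counts them across all rows.
    static Map<String, Long> skillCounts(List<String> skillRows) {
        List<String> skills = skillRows.stream()
                .flatMap(row -> Arrays.stream(row.split(",")))
                .map(s -> s.trim().toLowerCase(Locale.ROOT))
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
        return countDescending(skills);
    }

    // Factorizes labels (for example YearsExp values) by assigning each
    // distinct label an integer code in order of first appearance.
    static Map<String, Integer> factorize(List<String> labels) {
        Map<String, Integer> codes = new LinkedHashMap<>();
        for (String label : labels) {
            codes.putIfAbsent(label, codes.size());
        }
        return codes;
    }
}
```

For example, `JobStats.skillCounts(List.of("Java, SQL", "sql, Spark"))` returns a map whose first entry is `sql` with a count of 2. The returned maps are `LinkedHashMap`s, so iterating them preserves the sorted order, which makes them convenient to feed into a pie or bar chart.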
Team:
Group of three students.
Deliverables:
• Each team must share with us a GitHub link for a Maven EE
application.
• Each team must be ready to present its work on the 6th of July
Wuzzuf jobs in Egypt data set at Kaggle
https://www.kaggle.com/omarhanyy/wuzzuf-jobs