Problem Statement: A mall owner needs to identify which customers to focus on in order to grow sales.
Language Used:
R programming
R is a programming language for statistical computing and data analysis. It is widely used by statisticians and data miners for developing statistical software and analysing data. It has many built-in functions and libraries and is extensible, allowing users to define their own functions and procedures.
Introduction of the Data Set selected (Different traits of the dataset)
This data set contains basic customer information: ID, age, gender, annual income, and spending score. The data covers various customers and was collected through membership cards and surveys.
Traits:
Customer ID: It is the unique ID given to a customer
Gender: Gender of the customer
Age: The age of the customer
Annual Income: It is the annual income of the customer
Spending score: It is the score (out of 100) given to a customer by the mall authorities, based on the money spent and the behaviour of the customer.
Credits: https://www.kaggle.com/
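The traits listed above can be pictured as one data frame with five columns. The following is only a minimal sketch: the column names and the sample rows are hypothetical and are not taken from the actual data set. With the real file, the data frame would instead be read with read.csv().

```r
# Hypothetical sample rows standing in for the real data set.
# Column names are assumptions made for illustration only.
customers <- data.frame(
  CustomerID    = 1:5,                                   # unique ID per customer
  Gender        = c("Male", "Male", "Female", "Female", "Female"),
  Age           = c(19, 21, 20, 23, 31),
  AnnualIncome  = c(15, 15, 16, 16, 17),
  SpendingScore = c(39, 81, 6, 77, 40)                   # score out of 100
)

# Inspect column types and basic statistics of the numeric traits.
str(customers)
summary(customers[, c("Age", "AnnualIncome", "SpendingScore")])

# Sanity check: every spending score should lie between 0 and 100.
score_in_range <- all(customers$SpendingScore >= 0 &
                      customers$SpendingScore <= 100)
```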
List of the operations like Data Pre-processing & ML Algorithm used for prediction:
Data Visualization
• Tableau
Tableau is a business intelligence tool used for data visualization. With Tableau you
can gain insights by visualizing the data you already have and use them to
develop your business. Here, we have used Tableau to better understand
customers' annual income and spending amount so that we can predict which customers to
target.
Data pre-processing
● Converted Gender into a factor
● Removed all NA values
● Validated the amount of missing data
● Split the dataset into training and testing sets with an 80-20 split ratio
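The pre-processing steps above could look roughly like the sketch below in base R. It is not the project's actual code: the data frame, its column names, and the seed are hypothetical. The missing-data count is done before the NA rows are dropped so that the count is meaningful.

```r
# Hypothetical stand-in for the mall customer data; column names are assumed.
customers <- data.frame(
  CustomerID    = 1:10,
  Gender        = c("Male", "Female", "Female", "Male", "Female",
                    "Male", "Female", "Male", "Female", "Male"),
  Age           = c(19, 21, NA, 23, 31, 22, 35, 23, 64, 30),
  AnnualIncome  = c(15, 15, 16, 16, 17, 17, 18, 18, 19, 19),
  SpendingScore = c(39, 81, 6, 77, 40, 76, 6, 94, 3, 72)
)

# Validate the amount of missing data: number of NA values per column.
missing_per_column <- colSums(is.na(customers))

# Remove every row that contains at least one NA value.
customers <- na.omit(customers)

# Convert Gender from character to factor.
customers$Gender <- factor(customers$Gender)

# Split into training and testing sets with an 80-20 ratio.
set.seed(42)  # arbitrary seed, only to make the split reproducible
n         <- nrow(customers)
train_idx <- sample(seq_len(n), size = floor(0.8 * n))
train_set <- customers[train_idx, ]
test_set  <- customers[-train_idx, ]
```

With the ten sample rows, one row is dropped for its missing Age, leaving nine rows that are split into seven for training and two for testing.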
ML Algorithm
• K-Means clustering
K-means clustering is one of the simplest and most popular unsupervised machine learning algorithms. We have used this K means clustering