Implementation of FP-Growth Data Mining Algorithm
Project Overview: The CSV file Groceries_dataset records a customer ID and an item purchased by that customer on a given date. Sorting the data by customer ID and then by date of purchase yields a receipt for each customer's order. The organized customer data is shown in the XML file Transactions. With all the receipts organized, we can apply data mining techniques to find relationships within the data.
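The grouping step described above can be sketched in Java as follows. This is only an illustrative sketch, not the project's actual code: the class name ReceiptGrouper, the Purchase type, and its field names are hypothetical, and CSV parsing is left out. Nested TreeMaps keep customers and dates in sorted order, so each inner list is one receipt.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: groups purchase rows into per-customer, per-date receipts.
class ReceiptGrouper {

    // One row of the dataset (field names are assumptions, not the CSV headers).
    static final class Purchase {
        final int customerId;
        final LocalDate date;
        final String item;

        Purchase(int customerId, LocalDate date, String item) {
            this.customerId = customerId;
            this.date = date;
            this.item = item;
        }
    }

    // Returns customerId -> (purchase date -> items bought on that date).
    // TreeMap keeps both levels sorted, mirroring "sort by ID, then by date".
    static Map<Integer, Map<LocalDate, List<String>>> groupIntoReceipts(List<Purchase> purchases) {
        Map<Integer, Map<LocalDate, List<String>>> receipts = new TreeMap<>();
        for (Purchase p : purchases) {
            receipts.computeIfAbsent(p.customerId, id -> new TreeMap<>())
                    .computeIfAbsent(p.date, d -> new ArrayList<>())
                    .add(p.item);
        }
        return receipts;
    }

    public static void main(String[] args) {
        // Made-up sample rows, for illustration only.
        List<Purchase> rows = new ArrayList<>();
        rows.add(new Purchase(2, LocalDate.of(2015, 3, 2), "soda"));
        rows.add(new Purchase(1, LocalDate.of(2015, 1, 5), "whole milk"));
        rows.add(new Purchase(1, LocalDate.of(2015, 1, 5), "rolls/buns"));
        rows.add(new Purchase(1, LocalDate.of(2014, 7, 9), "yogurt"));

        System.out.println(groupIntoReceipts(rows));
    }
}
```

Each (customer, date) pair then corresponds to one transaction that a frequent-pattern miner can consume.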
The source code can be compiled and run in one step by executing run.sh, for example with the terminal command ./run.sh
For organization purposes, cleanup.sh removes the *.class files generated by run.sh
The algorithm breakdown and explanation can be found here.
Programming Assignment 2 gregory$ ./run.sh
[rolls/buns]->[whole milk]
Count: 209 | Support: 1.3967787208447504 | Confidence: 12.697448359659782
[whole milk]->[rolls/buns]
Count: 209 | Support: 1.3967787208447504 | Confidence: 8.844688954718578
[soda]->[whole milk]
Count: 174 | Support: 1.1628684087415626 | Confidence: 11.97522367515485
[whole milk]->[soda]
Count: 174 | Support: 1.1628684087415626 | Confidence: 7.363520947947524
[whole milk]->[yogurt]
Count: 167 | Support: 1.116086346320925 | Confidence: 7.0672873465933135
[yogurt]->[whole milk]
Count: 167 | Support: 1.116086346320925 | Confidence: 12.996108949416344
[other vegetables]->[whole milk]
Count: 222 | Support: 1.4836596939116486 | Confidence: 12.151067323481117
[whole milk]->[other vegetables]
Count: 222 | Support: 1.4836596939116486 | Confidence: 9.394837071519255
[other vegetables]->[rolls/buns]
Count: 158 | Support: 1.0559379803515339 | Confidence: 8.648056923918993
[rolls/buns]->[other vegetables]
Count: 158 | Support: 1.0559379803515339 | Confidence: 9.59902794653706
[whole milk]->[rolls/buns]
Count: 209 | Support: 1.3967787208447504 | Confidence: 8.844688954718578
[rolls/buns]->[whole milk]
Count: 209 | Support: 1.3967787208447504 | Confidence: 12.697448359659782
[other vegetables]->[rolls/buns]
Count: 158 | Support: 1.0559379803515339 | Confidence: 8.648056923918993
[rolls/buns]->[other vegetables]
Count: 158 | Support: 1.0559379803515339 | Confidence: 9.59902794653706
[other vegetables]->[whole milk]
Count: 222 | Support: 1.4836596939116486 | Confidence: 12.151067323481117
[whole milk]->[other vegetables]
Count: 222 | Support: 1.4836596939116486 | Confidence: 9.394837071519255
[whole milk]->[soda]
Count: 174 | Support: 1.1628684087415626 | Confidence: 7.363520947947524
[soda]->[whole milk]
Count: 174 | Support: 1.1628684087415626 | Confidence: 11.97522367515485
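The Support and Confidence columns above appear to be percentages under the standard association-rule definitions: support is the pair count divided by the total number of transactions, and confidence is the pair count divided by the count of the left-hand item. The sketch below assumes those definitions; the class and method names are hypothetical, and the totals 14963 (transactions) and 1646 (receipts containing rolls/buns) are not stated anywhere in this README but were inferred by dividing the printed count by the printed percentages.

```java
// Hypothetical helper illustrating the standard support/confidence formulas.
class RuleMetrics {

    // Support (%) of a rule A -> B: share of all transactions containing both A and B.
    static double supportPercent(int pairCount, int totalTransactions) {
        return 100.0 * pairCount / totalTransactions;
    }

    // Confidence (%) of a rule A -> B: share of transactions with A that also contain B.
    static double confidencePercent(int pairCount, int antecedentCount) {
        return 100.0 * pairCount / antecedentCount;
    }

    public static void main(String[] args) {
        int pairCount = 209;            // printed Count for [rolls/buns]->[whole milk]
        int totalTransactions = 14963;  // assumption: inferred from the printed Support
        int rollsBunsCount = 1646;      // assumption: inferred from the printed Confidence

        System.out.println("Support: " + supportPercent(pairCount, totalTransactions));
        System.out.println("Confidence: " + confidencePercent(pairCount, rollsBunsCount));
    }
}
```

Because confidence divides by the count of the left-hand item only, a rule and its reverse share the same count and support but generally differ in confidence, as the output pairs above show.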