This (for now) console application aims to help with analyzing your messenger conversations. You can search and filter
your messages, count words/regex in them, check how many messages you sent to each other, how many characters
you used, what emojis were used, who responded faster on average or even make beautiful charts like the ones below.
You need Python 3 for this application to work, and you need to download your Facebook messages as JSON files:
1. Go to your Facebook settings
2. Choose Your Facebook Information
3. Choose View on Download Your Information
4. Download your messages in JSON format
It needs some time to get ready; mine was ready in a few hours, so it’s not too bad. After you have downloaded it, unzip it,
then look for the messages folder and copy its absolute path (with the folder included). That’s what you’ll need to use
this application.
Start main.py with the path of the messages folder given to it as a parameter. After that you’ll be taken to a console
(the choose chat console) where you can choose which chat you want to analyze.
Pipelines: you can easily chain commands via the || pipe symbol. For example, filter -d 2018.1.1 || write
(or, using the aliases, f -d 2018.1.1 || w) pipes the result of filter to the write function, which writes it out to the console.
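To make the pipeline idea concrete, here is a minimal sketch of how a line could be split on || and how each stage could hand its result to the next one. It is not the application's actual code: split_pipeline, run_pipeline, filter_cmd, write_cmd and the sample messages are all made up for illustration, and the meaning given to -d (keep messages from that year onwards) is an assumption.

```python
import shlex


def split_pipeline(line):
    """Split 'a -x || b' into [['a', '-x'], ['b']].

    Naive on purpose: a literal || inside quotes would also be split.
    """
    return [shlex.split(stage) for stage in line.split("||") if stage.strip()]


def run_pipeline(line, commands):
    """Run each stage in order, merging any returned dict into shared data."""
    data = {}
    for name, *args in split_pipeline(line):
        result = commands[name](args, data)
        if result:
            data.update(result)
    return data


# Hypothetical stand-ins for the real filter and write commands.
def filter_cmd(args, data):
    messages = data.get("messages", ["2017.5.1 hi", "2018.2.3 apple", "2018.6.9 pear"])
    # Assumption: "-d YEAR.MONTH.DAY" keeps messages from that year onwards.
    year = int(args[1].split(".")[0]) if args[:1] == ["-d"] else 0
    return {"messages": [m for m in messages if int(m.split(".")[0]) >= year]}


def write_cmd(args, data):
    print("\n".join(data.get("messages", [])))


commands = {"filter": filter_cmd, "f": filter_cmd, "write": write_cmd, "w": write_cmd}
result = run_pipeline("f -d 2018.1.1 || w", commands)
```

Using shlex for the per-stage split keeps quoted arguments such as "out.txt" together as one token.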
Note that if a command opens another command line, you can pipe another command’s output into it, and only the
remaining data will be used to start that command line. This is useful, for example, when you only want to
enter with this year’s data: filter -d 2018.1.1 || cmd_line.
All consoles are capable of auto completion via the tab key. Command history is also available.
In this application a command won’t write its contents out just by being called; you usually need to pipe it to
the write command.
write, help and quit commands work in all consoles.
In this console you can choose which conversation you want to analyze.
filter apple || write
This is where most of the fun happens.
chart -m -c -em -s 1000x2000 -sa 2000x1000
This console makes a Markov chain from your conversation. A Markov chain analyzes the words you’ve used so far in this
conversation and can make up its own sentences randomly based on that. Most of the time it makes funny incoherent
text, but if you choose your layer count right you can get some pretty good text.
Layer count: If you haven’t spoken much in a conversation, I would advise you to use a smaller layer count first. Start off with 2,
then lower it if the output only repeats whole sentences from the conversation, and increase it if
the output is too incoherent.
With this console you can analyze the emojis in your chat.
The MVC approach was used to divide the code into parts.
The data classes are contained in data.data.py.
There are also some data classes which contain data about Facebook emojis. These are read from img/data.txt.
There are 2 types of controller classes: ones that get data from the files and ones that provide data for the view.
The ones that get data from the files are:
The ones that provide data for the view:
Markov chains are implemented in the controller.markov.markov_chain.py file. A MarkovChain has a MarkovState
variable which has transitions to other MarkovStates. The layer_count decides how many times we can go
down this tree of MarkovStates. For example, if the layer_count is 3, then there are 2 MarkovStates which have
other MarkovStates in their transitions, but the last one has no transitions, only states.
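A rough sketch of such a tree is shown below. It is not the code from markov_chain.py: this MarkovState class, its count and transitions attributes and the build_tree helper are invented for illustration. The layer_count here simply means the number of words stored along each path, which may not match how the project counts layers.

```python
class MarkovState:
    """Hypothetical stand-in for the MarkovState described above."""

    def __init__(self, word=None):
        self.word = word
        self.count = 0          # how often this state was reached
        self.transitions = {}   # next word -> MarkovState

    def child(self, word):
        """Return the state for `word`, creating it on first use."""
        if word not in self.transitions:
            self.transitions[word] = MarkovState(word)
        return self.transitions[word]


def build_tree(words, layer_count):
    """Insert every run of `layer_count` consecutive words as one path."""
    root = MarkovState()  # plays the role of MState(None)
    for start in range(len(words) - layer_count + 1):
        state = root
        state.count += 1
        for word in words[start:start + layer_count]:
            state = state.child(word)
            state.count += 1
    return root


def depth(state):
    """Number of transitions on the longest path below `state`."""
    if not state.transitions:
        return 0
    return 1 + max(depth(child) for child in state.transitions.values())


words = "nice code not nice code but nice face you".split()
root = build_tree(words, layer_count=3)
nice = root.transitions["nice"]
```

Every path below root is exactly layer_count words long, so the states at the bottom have no transitions of their own.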
    MC
    +--95--> MState(None)
        +--19--> MState(apple)
        |   +--11--> MState(eater)
        |   |   +--5--> MState(boy)
        |   |   +--6--> MState(girl)
        |   +--3--> MState(hater)
        |   |   +--3--> MState(please)
        |   +--5--> MState(good)
        |       +--1--> MState(.)
        |       +--4--> MState(-)
        +--47--> MState(why)
        |   +--30--> MState(would)
        |   |   +--30--> MState(we)
        |   +--5--> MState(am)
        |   |   +--2--> MState(I)
        |   |   +--3--> MState(is)
        |   +--12--> MState(have)
        |       +--12--> MState(you)
        +--29--> MState(nice)
            +--5--> MState(boots)
            |   +--1--> MState(man)
            |   +--4--> MState(hehh)
            +--9--> MState(face)
            |   +--9--> MState(you)
            +--15--> MState(code)
                +--10--> MState(not)
                +--3--> MState(but)
                +--1--> MState(you)
Here you can see a MarkovChain, where each arrow is a MarkovTransition and each number shows the transition’s
chance to happen. For example, if we start from the MC (MarkovChain) node, we can go to MState(nice)
with a chance of 29/(47 + 19 + 29), from there we can go to MState(code) with a chance of 15/(5 + 9 + 15),
and from there we can go to MState(not) with a chance of 10/(10 + 3 + 1). So we can reach nice code not
with a chance of (29 / 95) * (15 / 29) * (10 / 14) = 15 / 133. After we have these words, we can traverse down the tree
via code not to get a MarkovState that has these two as starting words, and ask it for the next word randomly. Then do
the traversing down again and ask for a random word again. Rinse and repeat and you have yourself a text generator.
Note: because of the inner workings of a Markov chain, if you set its layer count to 2, it
will result in a layer count of 3, because the first one (MState(None)) is counted as a layer as well.
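The walk through the tree can be sketched in a few lines. This is only an illustration, not the project's implementation: the nested-dict TREE, path_probability and random_next_word are hypothetical names, and only the nice branch of the diagram is filled in. Each step's chance is the transition's number divided by the sum of the numbers leaving the same state.

```python
import random
from fractions import Fraction

# word -> (count, children). Counts are copied from the diagram; the
# subtrees of "apple" and "why" are left out to keep the sketch short.
TREE = {
    "apple": (19, {}),
    "why": (47, {}),
    "nice": (29, {
        "boots": (5, {}),
        "face": (9, {}),
        "code": (15, {"not": (10, {}), "but": (3, {}), "you": (1, {})}),
    }),
}


def path_probability(transitions, path):
    """Multiply count / (sum of sibling counts) along the path."""
    probability = Fraction(1)
    for word in path:
        total = sum(count for count, _ in transitions.values())
        count, transitions = transitions[word]
        probability *= Fraction(count, total)
    return probability


def random_next_word(transitions, rng=random):
    """Pick one outgoing word, weighted by its count."""
    words = list(transitions)
    weights = [count for count, _ in transitions.values()]
    return rng.choices(words, weights=weights, k=1)[0]


first_step = path_probability(TREE, ["nice"])             # 29 out of 47 + 19 + 29
next_word = random_next_word(TREE["nice"][1]["code"][1])  # "not", "but" or "you"
```

Using Fraction keeps the arithmetic exact, so the result can be compared directly with the hand calculation.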
All consoles share a single parent class, which lives in view.console_input.py.
It’s a basic command interpreter that implements the basic write, quit and help functions. We can easily
add new commands via the add_command() or add_commands() function. We need to provide aliases for the command
along with two lambdas. The first one gets a console instance, the switches of the command and the kwargs for the command, and has
to execute the command. It can also return a dictionary, which is later added to the kwargs. This is how the
pipeline works. The other lambda we need to provide is the help lambda, which gets nothing but needs to write out
the help for the command.
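The registration scheme described above might look roughly like the following. Treat it as a sketch under assumptions: the real add_command() and add_commands() signatures are not shown in this document, so the argument order, the Console class and its execute() and help() methods are all guesses made for illustration.

```python
class Console:
    """Hypothetical minimal command interpreter, not the real parent class."""

    def __init__(self):
        self._commands = {}  # alias -> (execute lambda, help lambda)

    def add_command(self, aliases, execute, help_fn):
        # Every alias points at the same pair of callables.
        for alias in aliases:
            self._commands[alias] = (execute, help_fn)

    def add_commands(self, *specs):
        for aliases, execute, help_fn in specs:
            self.add_command(aliases, execute, help_fn)

    def execute(self, alias, switches, kwargs):
        execute, _ = self._commands[alias]
        result = execute(self, switches, kwargs)
        # A returned dict is merged into the kwargs for the next command.
        if isinstance(result, dict):
            kwargs = {**kwargs, **result}
        return kwargs

    def help(self, alias):
        self._commands[alias][1]()


console = Console()
console.add_command(
    ["count", "c"],
    lambda console, switches, kwargs: {"count": len(kwargs.get("messages", []))},
    lambda: print("count: counts the messages in the pipeline"),
)
out = console.execute("c", [], {"messages": ["a", "b"]})
```

Returning a new dict from execute() instead of mutating the caller's kwargs keeps each stage's input unchanged, which makes pipelines easier to test.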
This console implementation also has some neat features: _get_write_string() can be implemented, which is then
later used to write out the result of a pipeline. It gets passed the kwargs used along the pipeline.
You can also add a welcome and a quit message to the console.
The other consoles build on the console above, with help functions as static functions and command
execution functions as member functions. At the top of each class all commands can be seen, and then easily followed
to the help or execution function.
The console manager can be used to manage multiple consoles easily. With this the consoles have no need to start their own
child consoles, because the manager starts them for them. This has the advantage of being able to reach the current and
other consoles from the whole application and not just from the parent console (good for testing, for example).
The manager runs on a separate thread and does the input handling instead of the consoles themselves. You can also
switch between consoles on the fly, and simply put the other console in the background.
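One way to picture such a manager is a stack of consoles with a single input loop on its own thread. The sketch below is only a guess at the general shape: ConsoleManager, EchoConsole, the handle() method and the "open" command are hypothetical and do not come from the project.

```python
import threading


class EchoConsole:
    """Hypothetical console that records the lines it receives."""

    def __init__(self, name):
        self.name = name
        self.seen = []

    def handle(self, line, manager):
        self.seen.append(line)
        if line == "quit":
            manager.pop()
        elif line.startswith("open "):
            # The manager, not the console, owns the child console.
            manager.push(EchoConsole(line.split(maxsplit=1)[1]))


class ConsoleManager:
    """Reads input on its own thread and forwards it to the top console."""

    def __init__(self, root, read_line):
        self._stack = [root]
        self._read_line = read_line
        self._thread = threading.Thread(target=self._loop, daemon=True)

    @property
    def current(self):
        return self._stack[-1] if self._stack else None

    def push(self, console):
        self._stack.append(console)

    def pop(self):
        return self._stack.pop()

    def start(self):
        self._thread.start()

    def join(self):
        self._thread.join()

    def _loop(self):
        while self._stack:
            line = self._read_line()
            if line is None:  # input exhausted
                break
            self.current.handle(line, self)


lines = iter(["open markov", "generate", "quit", "quit"])
root = EchoConsole("chat")
manager = ConsoleManager(root, lambda: next(lines, None))
manager.start()
manager.join()
```

Because the input source is just a callable, a test can feed scripted lines instead of real keyboard input.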
Currently the built-in Python command history functions are being used, which are in readline. The setup happens in
console_init.py, and all the commands are written to .msg_parser_history. All consoles share one command history;
there is no separate command history for each console.
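A history setup with the standard readline module usually comes down to a few calls. The following is a sketch of that general pattern rather than the contents of console_init.py: init_history, the history length of 1000 and the throwaway demo directory are assumptions.

```python
import atexit
import os
import tempfile

try:
    import readline
except ImportError:  # some platforms ship Python without readline
    readline = None


def init_history(path, length=1000):
    """Load an existing history file and save it again on exit."""
    if readline is None:
        return False
    if os.path.exists(path):
        readline.read_history_file(path)
    readline.set_history_length(length)
    # One shared file, so every console sees the same history.
    atexit.register(readline.write_history_file, path)
    return True


# Demo in a throwaway directory so the real history file is not touched.
demo_path = os.path.join(tempfile.mkdtemp(), ".msg_parser_history")
enabled = init_history(demo_path)
```

Guarding the import keeps the rest of the application usable where readline is missing, at the cost of losing history there.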
Auto completion is also available in the console via readline. All commands have a separate auto complete function,
and that gets called in console_input.py based on what we want to complete. If we only want to complete the current
command, then that gets handled by itself, but if we want to complete one of the parameters of the command, then
the function implemented by the user gets called, which should return a list of possible candidates for completion.
The console manager also supports auto completion by setting and removing the correct auto completer when a console
switch happens.
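The two-level completion described above fits readline's completer protocol, where a function is called with (text, state) for state 0, 1, 2, ... until it returns None. Below is a sketch under assumptions: make_completer, the param_completers mapping and the sample switches are hypothetical, and the current line is passed in as a callable so the logic can be tried without a terminal.

```python
def make_completer(param_completers, get_line):
    """Build a readline-style completer.

    param_completers maps a command name to a function that takes the
    text typed so far and returns a list of candidates.
    """
    def completer(text, state):
        # Only the last pipeline stage matters for completion.
        stage = get_line().split("||")[-1].lstrip()
        words = stage.split()
        completing_command = not words or (len(words) == 1 and not stage.endswith(" "))
        if completing_command:
            candidates = [name for name in sorted(param_completers) if name.startswith(text)]
        else:
            complete_params = param_completers.get(words[0], lambda _text: [])
            candidates = complete_params(text)
        return candidates[state] if state < len(candidates) else None

    return completer


def complete_filter(text):
    # Hypothetical switches, for illustration only.
    return [switch for switch in ["-d", "-r"] if switch.startswith(text)]


param_completers = {"filter": complete_filter, "write": lambda text: []}
completer = make_completer(param_completers, lambda: "filter -d 2018.1.1 || wr")
first = completer("wr", 0)

# With a real terminal this would be wired up roughly as:
#   readline.set_completer(make_completer(param_completers, readline.get_line_buffer))
#   readline.parse_and_bind("tab: complete")
```

Swapping the completer on a console switch then only needs another readline.set_completer() call with the new console's mapping.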
There are commands which can be used in more than one console. Those are grouped in the view.commands package. They all have
every function needed (command function itself, help function, auto complete, others if needed in the future)
in the file itself.
In this application testing is used to test the commands in the command lines with edge cases and more complex commands.
Python unittests are used and all of them are in the test directory, with the conversations used for testing in test.messages.
All tests can be executed with all_tests.py.
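For readers new to unittest, a command-level test could look something like this. The example is self-contained and invented: count_words and CountCommandTest are not part of the project, and the real tests drive the consoles rather than a helper function.

```python
import unittest


def count_words(messages, word):
    """Hypothetical helper: case-insensitive whole-word count."""
    return sum(message.lower().split().count(word.lower()) for message in messages)


class CountCommandTest(unittest.TestCase):
    def test_counts_case_insensitively(self):
        messages = ["Apple pie", "an apple a day, apple"]
        self.assertEqual(count_words(messages, "apple"), 3)

    def test_empty_conversation(self):
        self.assertEqual(count_words([], "apple"), 0)


# A runner script could instead collect a whole directory with
# unittest.defaultTestLoader.discover("test"); that it works this way in
# all_tests.py is an assumption.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CountCommandTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```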
For now testing the algorithms themselves is not implemented because in my opinion it’s unnecessary to write test cases
for simple counting, max and min search algorithms. All the edge cases are tested via the command line.
With the help of piping you can chain commands together in ways you may not have expected to work:
basic || write || filter -d 2018.1.1 || write -f "out.txt" || search -i -r "p[eo]*p" || write || markov 2 || count -p || write
This gets executed and does the following in order:
basic || write
f -d 2018.1.1 || w -f "out.txt"
Searches for p[eo]*p with ignored case and then writes out the result (note that this only searches …)
s -e -r "p[eo]*p" || w
markov 2
count -p || write
I’m not completely sure why you would do something like the command above (maybe this tool gets taught somewhere and
the command above is a good test question), but hey, it works, so please use stupid commands like the one above.
You can do
markov 1 || markov 2 || markov 3 || markov 4 || markov 5 || markov 6 (...)
as many times as you want. You can also sprinkle other consoles in there. First of all, if you do it enough times,
this will result in your computer crashing. After the restart you do it too many times again, so it crashes again.
The third time you get the number right and you have yourself what I call a console-ception. You have a console inside
another console inside another console inside… You get it. The console you are currently in is the last one.
You can quit from them via quit/q. Have fun and be safe!