Project author: RUB-SysSec

Project description:
Modeling Password Guessability Using Markov Models
Primary language: Python
Project address: git://github.com/RUB-SysSec/NEMO.git
Created: 2019-07-11T15:12:21Z
Project community: https://github.com/RUB-SysSec/NEMO


NEMO: Modeling Password Guessability Using Markov Models

tl;dr

This is our ongoing effort to use Markov models to build probabilistic password models.
Common use cases include:

  • Strength estimation
  • Guessing
  • (Adaptive) Natural Language Encoders

WARNING

  • This is research-quality code that should only be used for a proof of concept (PoC).
  • We share this code in the hope that the research community can benefit from it. Please share your code, too! :heart_eyes:
  • We recommend running this software using PyPy (see the performance stats below).

About NEMO

The scope of this project is not limited to passwords; this software has also been used in the context of other human-chosen secrets like Emoji, PINs, and Android unlock patterns.

The architecture of the software is inspired by OMEN. More background information about OMEN can be found here and here. An excellent Python implementation of OMEN, called py_omen, by Matthew Weir (@lakiw) can be found here.

Difference to OMEN

OMEN makes use of so-called levels (a form of binning). This implementation does not. Thus, efficient enumeration of password candidates (guessing passwords the way OMEN does) is not possible out of the box once the key space becomes too big. However, because of the non-binned output, this software has other advantages; for example, it can produce more accurate strength estimates.

Overview: Markov Model-Based Password Guessing

Publications

In the past, we used different versions of this code in the following publications: :bowtie:

A simpler version of this code has been used for other user-chosen secrets such as Emoji and Android unlock patterns.

Design Decisions

Warning: Markov models are memory-eating monsters! :smiling_imp:

We use three copies of a data structure (in the past: Python OrderedDicts, today: plain Python lists) to store the frequencies of the n-grams in the training corpus. We use:

  • IP: Initial probabilities (n-grams of length ngram_size - 1)
  • CP: Conditional probabilities (n-grams of length ngram_size)
  • EP: End probabilities (n-grams of length ngram_size - 1)

Here is an example for 3-grams:

```
password  PW
pa        IP   (some literature uses this annotation: ^pa)
pas       CP1
ass       CP2
ssw       CP3
swo       CP4
wor       CP5
ord       CP6
rd        EP   (some literature uses this annotation: rd$)
```
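
As a minimal illustration (a sketch, not the repository's actual code), this decomposition can be written in a few lines of Python:

```python
def split_ngrams(password, ngram_size=3):
    """Decompose a password into its IP, CP, and EP n-grams."""
    n = ngram_size
    ip = password[:n - 1]                                            # "pa"
    cps = [password[i:i + n] for i in range(len(password) - n + 1)]  # "pas" ... "ord"
    ep = password[-(n - 1):]                                         # "rd"
    return ip, cps, ep

print(split_ngrams("password"))
# ('pa', ['pas', 'ass', 'ssw', 'swo', 'wor', 'ord'], 'rd')
```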

How Big Are They?

  1. IP: alphabet_length ^ (ngram_size - 1)
  2. CP: alphabet_length ^ ngram_size
  3. EP: alphabet_length ^ (ngram_size - 1)
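
For example, with a 93-character alphabet and 4-grams (the setting of the training log further below), len(IP) = len(EP) = 93 ^ 3 = 804,357 and len(CP) = 93 ^ 4 = 74,805,201 entries, which is exactly what the len(IP)/len(CP)/len(EP) theo lines in the log report.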

Some Details For Those Interested:

:nerd_face:

n-gram size: Currently, we support 2-, 3-, 4-, and 5-grams. The higher the order of the Markov chain, the more accurate the model becomes. Unfortunately, a higher order also introduces the risk of overfitting and data sparsity: if one does not have enough training data, e.g., when using the model with Android unlock patterns, the transition probabilities are computed from very small counts and become too noisy. While we only support fixed-order Markov chains, we recommend Dell’Amico and Filippone’s backoff model for variable-order Markov chains.

Smoothing: Currently, we only support additive smoothing (add ‘1’ to the counts), also known as Laplace smoothing.
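
As a minimal sketch of what add-1 smoothing does (illustrative only, kept Python 2 compatible):

```python
def laplace_smooth(counts):
    """Turn raw n-gram counts into probabilities with add-1 (Laplace) smoothing.

    Every possible n-gram gets one extra pseudo-count, so unseen n-grams
    (count 0) still receive a small non-zero probability.
    """
    total = float(sum(counts)) + len(counts)
    return [(count + 1) / total for count in counts]

probs = laplace_smooth([5, 0, 3])  # one entry per possible n-gram
print(probs)                       # [0.545..., 0.0909..., 0.3636...], sums to 1.0
```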

Alphabet: We tested this software with ASCII passwords only. Using non-ASCII passwords likely requires dropping support for Python 2 first. Hint: You can use the info.py script in the utils folder to determine the alphabet.

Development

In early versions of this code, we made heavy use of Python’s (Ordered)-Dictionary class. Fun fact: As of Python 3.7 dictionaries are always ordered :)

```
cp_dict_full:
    key: "aaaa", value: 0.0071192...
    key: "aaab", value: 0.0034128...
    ...
```

A few months later, we optimized the memory consumption by only storing n-grams that actually occur in the training corpus. If a rare n-gram like the 4-gram o9py did not occur in the training file, we returned a very small default probability instead. This helped quite a lot to reduce the required memory; still, like Google Chrome, our solution easily occupied more than 20GB of RAM. :poop:

```
cp_dict_sparse:
    key: "assw", value: 0.0838103...
    key: "sswo", value: 0.0954193...
    ...
```

Thus, we decided to refactor the code again to further limit the amount of required memory, nowadays to approx. 16GB of RAM.
Today, we use simple Python lists to store the n-gram probabilities in memory.
However, this forced us to come up with an n-gram-to-list-index function, which is different for CP than for IP/EP (their n-grams differ in length).

```
_n2i(): n-gram to list index, e.g., "assw" -> ngram_cp_list[87453]
_i2n(): list index to n-gram, e.g., ngram_cp_list[87453] -> "assw"

cp_list_full:
    index: 0,       value: 0.0071192... | ("aaaa")
    index: 1,       value: 0.0034128... | ("aaab")
    ...
    index: 87453,   value: 0.0838103... | ("assw")
    ...
    index: 8133135, value: 0.0954193... | ("sswo")
    ...
```

The current version of the code supports this operation for 2-, 3-, 4-, and 5-grams.
Fortunately, while this approach achieves the desired memory savings, the additional function call does not increase the runtime significantly in comparison to the O(1) hash map access offered by Python dictionaries.
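
Conceptually, the n-gram-to-list-index mapping is a positional base-alphabet_length encoding of the n-gram. Here is a minimal sketch under that assumption (the repository's actual implementation may differ in details):

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # illustrative; NEMO uses a much larger alphabet
CHAR2POS = {c: i for i, c in enumerate(ALPHABET)}

def _n2i(ngram):
    """n-gram -> list index: read the n-gram as a base-len(ALPHABET) number."""
    index = 0
    for char in ngram:
        index = index * len(ALPHABET) + CHAR2POS[char]
    return index

def _i2n(index, ngram_size):
    """List index -> n-gram: the inverse base conversion."""
    chars = []
    for _ in range(ngram_size):
        index, pos = divmod(index, len(ALPHABET))
        chars.append(ALPHABET[pos])
    return "".join(reversed(chars))

assert _i2n(_n2i("assw"), 4) == "assw"
```

Since CP n-grams are one character longer than IP/EP n-grams, the only difference between the CP and the IP/EP mappings is the number of base-alphabet_length "digits".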

Performance Testing

We highly recommend replacing Python with PyPy before using this software. :100: :thumbsup:

```
                                        MEMORY   TIME
# PYTHON 2
CPython 2.7.10                          15.36GB  53m 27s
PyPy2 7.1.1 (based on Python 2.7.13)     5.88GB   3m 8s   <- Highly recommended
# PYTHON 3
CPython 3.7.3                           14.47GB  12m 34s
CPython 3.6.5                           14.49GB  13m 13s
PyPy3 7.1.1 (based on Python 3.6.1)      7.33GB   2m 13s  <- Highly recommended
```

Getting Started

Folder Structure

```
.
├── README.md
├── configs
│   ├── configure.py
│   ├── dev.json
│   └── main.json
├── input
├── log
│   └── multiprocessinglog.py
├── meter.py
├── ngram
│   └── ngram_creator.py
├── requirements.txt
├── results
├── train.py
├── trained
│   ├── training_cp_list_<ngram-size>_<pw-length>.pack
│   ├── training_ep_list_<ngram-size>_<pw-length>.pack
│   └── training_ip_list_<ngram-size>_<pw-length>.pack
└── utils
    ├── info.py
    └── sortresult.py
```

Installation

Install PyPy (for Python 2 or, better, Python 3) and create a virtual environment to keep your system light and clean:

```
$ virtualenv -p $(which pypy) nemo-venv
Running virtualenv with interpreter /usr/local/bin/pypy
New pypy executable in /home/<username>/nemo-venv/bin/pypy
Installing setuptools, pip, wheel...
done.
```

Activate the new virtual environment:

```
$ source nemo-venv/bin/activate
```

Now clone the repo:

```
(nemo-venv) $ git clone https://github.com/RUB-SysSec/NEMO.git
```

Change into the newly cloned folder:

```
(nemo-venv) $ cd NEMO
```

Now install the requirements:

```
(nemo-venv) $ pip install -r requirements.txt
```

This includes:

  • tqdm # for a fancy progress bar
  • u-msgpack-python # required to store/load the trained model to/from disk
  • rainbow_logging_handler # for colorful log messages

Dataset

While the Markov model can be used for a variety of things, in the following we focus on a simple strength meter use case.

For this, you will need two files:

  • input/training.txt: Contains the passwords you would like to use to train your Markov model.
  • input/eval.txt: Contains the passwords whose guessability you would like to estimate.

I will not share any of those password files, but using the “RockYou” or the “LinkedIn” password leak sounds like a great idea. Make sure to clean and (ASCII-)filter the files to optimize performance; see the pre-processing sketch below.

For optimal accuracy, consider training with a password distribution that is similar to the one you would like to evaluate (e.g., a 90%/10% split of the same leak). Please do not train on a dictionary / word list; this won’t work. :stuck_out_tongue_winking_eye: You need a real password distribution, i.e., one including duplicates.

  • The file must be placed in the input folder.
  • One password per line.
  • The file must be a real password distribution (not a dictionary / word list), i.e., it must contain multiplicities.
  • All passwords that are shorter or longer than the specified lengths will be ignored.
  • All passwords that contain characters which are not in the specified alphabet will be ignored.

During development, we tested our code with a file that contained ~10m passwords.
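
Here is a minimal pre-processing sketch that applies these rules and produces a 90%/10% split. The raw leak file name is hypothetical; the alphabet and lengths are taken from the example configuration below:

```python
import random

# alphabet and lengths as in the dev.json example below (illustrative)
ALPHABET = set("aeio1nrlst20mcuydh93bkgp84576vjfwzxAEqORLNISMTBYCP!.UGHDJ F-K*#V_\\XZW';Q],@&?~+$={^/%")
LENGTHS = (4, 6, 8)

def keep(pw):
    # drop passwords with unsupported lengths or out-of-alphabet characters
    return len(pw) in LENGTHS and all(c in ALPHABET for c in pw)

with open("leak.txt") as infile:  # hypothetical raw leak, one password per line
    passwords = [pw for pw in (line.rstrip("\r\n") for line in infile) if keep(pw)]

random.shuffle(passwords)              # keep duplicates: it is a distribution!
split = int(len(passwords) * 0.9)      # 90% training / 10% evaluation
with open("input/training.txt", "w") as out:
    out.write("\n".join(passwords[:split]) + "\n")
with open("input/eval.txt", "w") as out:
    out.write("\n".join(passwords[split:]) + "\n")
```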

Configuration

Before training, you need to provide a configuration file.
You can specify which configuration file to use by editing the following line in configure.py in the configs folder:

```python
with open('./configs/dev.json', 'r') as configfile:
```

Here is the default content of dev.json; feel free to edit the file as you like.

```json
{
    "name" : "Development",
    "eval_file" : "eval.txt",
    "training_file" : "training.txt",
    "alphabet" : "aeio1nrlst20mcuydh93bkgp84576vjfwzxAEqORLNISMTBYCP!.UGHDJ F-K*#V_\\XZW';Q],@&?~+$={^/%",
    "lengths" : [4,6,8],
    "ngram_size" : 4,
    "no_cpus" : 8,
    "progress_bar": true
}
```

Please note: You can use the info.py script in the utils folder to learn the alphabet of your training / evaluation file.

For example run:

```
(nemo-venv) $ pypy utils/info.py input/eval.txt
```

```
File: eval.txt
Min length: 3
Max length: 23
Observed password lengths: [3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,20,21,22,23]
Alphabet (escaped for Python, but watch out for the space char): "aeio1nrlst20mcuydh93bkgp84576vjfwzxAEqORLNISMTBYCP!.UGHDJ F-K*#V_\\XZW';Q],@&?~+$={^/%"
Alphabet length: 85
ASCII only: Yes
```
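
Conceptually, "learning the alphabet" boils down to counting characters and password lengths. The following is a rough, hypothetical sketch of the idea, not the actual info.py (the alphabets shown in this README appear to list the most frequent characters first):

```python
from collections import Counter

char_counts = Counter()
length_counts = Counter()
with open("input/eval.txt") as f:
    for line in f:
        pw = line.rstrip("\r\n")
        length_counts[len(pw)] += 1
        char_counts.update(pw)

alphabet = "".join(char for char, _ in char_counts.most_common())  # most frequent first
print("Alphabet length:", len(alphabet))
print("Observed password lengths:", sorted(length_counts))
```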

If you encounter any issues, go to train.py and change the train() function from multi to single processing. This way, it is easier to debug the actual problem.

Training

Training Requirements

  • ~2-5 minutes
  • ~8 threads (4 cores + hyper-threading)
  • ~16GB of RAM
  • ~6GB of disk space

How to Train the Model

To train the model run:

```
(nemo-venv) $ pypy train.py
```

Once the training is done, you should have multiple *.pack files in the trained folder. We use a lightweight MessagePack implementation to serialize the model.
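
Here is a minimal serialization round trip with u-msgpack-python. The file name and list content are illustrative; the exact structure of NEMO's .pack files may differ:

```python
import umsgpack

probabilities = [0.0071192, 0.0034128]  # toy stand-in for an n-gram probability list

with open("trained/example_list.pack", "wb") as f:
    umsgpack.pack(probabilities, f)     # serialize to MessagePack

with open("trained/example_list.pack", "rb") as f:
    restored = umsgpack.unpack(f)       # deserialize from MessagePack

assert restored == probabilities
```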

A successful training looks like this:

```
Start: 2019-07-06 15:54:13
Press Ctrl+C to shutdown
[15:54:13.239] configure.py Line 30 __init__(): Constructor started for 'My Config'
[15:54:13.241] train.py Line 76 train(): Training started ...
[15:54:13.242] ngram_creator.py Line 24 __init__(): Constructor started for 'NGramCreator, Session: Development, Length: 4, Progress bar: True'
[15:54:13.242] ngram_creator.py Line 33 __init__(): Used alphabet: ae10i2onrls9384t5m67cdyhubkgpjvfzwAxEONIRSLM.TC_DqBHYUKPJG!-*F @VWXZ/,#+&?$Q)<'=;^[(%\~]`:|">
[15:54:13.242] ngram_creator.py Line 35 __init__(): Model string length: 4
[15:54:13.242] ngram_creator.py Line 38 __init__(): NGram size: 4
[15:54:15.315] ngram_creator.py Line 48 __init__(): len(IP) theo: 804357
[15:54:15.315] ngram_creator.py Line 49 __init__(): len(CP) theo: 74805201 => 804357 * 93
[15:54:15.315] ngram_creator.py Line 50 __init__(): len(EP) theo: 804357
[15:54:15.315] train.py Line 29 worker(): ip_list init() ...
[15:54:15.343] train.py Line 32 worker(): ip_list count() ...
input/training.txt: 100%|███████████████████████████████████████████████████████████████████| 10000000/10000000 [00:03<00:00, 2916292.68pw/s]
[15:54:18.776] train.py Line 35 worker(): ip_list prob() ...
[15:54:18.794] ngram_creator.py Line 213 _prob(): IP probability sum: 1.0000000000141687
[15:54:18.794] train.py Line 38 worker(): ip_list save() ...
[15:54:18.794] ngram_creator.py Line 265 save(): Start: Writing result to disk, this gonna take a while ...
[15:54:19.022] ngram_creator.py Line 276 save(): Done! Everything stored on disk.
[15:54:19.023] ngram_creator.py Line 277 save(): Storing the data on disk took: 0:00:00.228256
[15:54:19.023] train.py Line 41 worker(): Training IP done ...
[15:54:19.023] train.py Line 44 worker(): cp_list init() ...
[15:54:21.722] train.py Line 47 worker(): cp_list count() ...
input/training.txt: 100%|███████████████████████████████████████████████████████████████████| 10000000/10000000 [00:03<00:00, 2995344.77pw/s]
[15:54:25.063] train.py Line 50 worker(): cp_list prob() ...
[15:54:25.893] train.py Line 53 worker(): cp_list save() ...
[15:54:25.893] ngram_creator.py Line 265 save(): Start: Writing result to disk, this gonna take a while ...
[15:54:45.189] ngram_creator.py Line 276 save(): Done! Everything stored on disk.
[15:54:45.189] ngram_creator.py Line 277 save(): Storing the data on disk took: 0:00:19.295808
[15:54:45.190] train.py Line 56 worker(): Training CP done ...
[15:54:45.190] train.py Line 59 worker(): ep_list init() ...
[15:54:45.211] train.py Line 62 worker(): ep_list count() ...
input/training.txt: 100%|███████████████████████████████████████████████████████████████████| 10000000/10000000 [00:03<00:00, 3005917.73pw/s]
[15:54:48.542] train.py Line 65 worker(): ep_list prob() ...
[15:54:48.553] ngram_creator.py Line 242 _prob(): EP probability sum: 1.0000000000141684
[15:54:48.553] train.py Line 68 worker(): ep_list save() ...
[15:54:48.554] ngram_creator.py Line 265 save(): Start: Writing result to disk, this gonna take a while ...
[15:54:48.781] ngram_creator.py Line 276 save(): Done! Everything stored on disk.
[15:54:48.782] ngram_creator.py Line 277 save(): Storing the data on disk took: 0:00:00.227519
[15:54:48.782] train.py Line 71 worker(): Training EP done ...
[15:54:55.686] ngram_creator.py Line 53 __del__(): Destructor started for 'NGramCreator, Session: Development, Length: 4, Progress bar: True'
...
Done: 2019-07-06 15:56:11
```

Strength Estimation

After training, we can use the model, for example, to estimate the strength of a list of passwords that originates from a similar password distribution.
To do so, please double-check that your eval_file is specified correctly in your configuration .json.

For the strength estimation, we will read the trained n-gram frequencies from disk and then evaluate all passwords from the specified eval_file.
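
Under the hood, the estimate for a single password is the product of its IP probability, all of its CP probabilities, and its EP probability (cf. the 3-gram decomposition above). A minimal sketch, where ip_prob, cp_prob, and ep_prob are hypothetical lookup functions over the trained lists:

```python
def password_probability(password, ngram_size, ip_prob, cp_prob, ep_prob):
    """Markov probability of a password: P(IP) * prod(P(CP_i)) * P(EP)."""
    n = ngram_size
    p = ip_prob(password[:n - 1])
    for i in range(len(password) - n + 1):
        p *= cp_prob(password[i:i + n])
    return p * ep_prob(password[-(n - 1):])
```

To run the meter: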

```
(nemo-venv) $ pypy meter.py
```

The result of this strength estimation can be found in the results folder in a file called eval_result.txt.

```
...
1.7228127641947414e-13    funnygirl2
4.03572701534676e-13      single42
3.669804567773374e-16     silkk
3.345752850966769e-11     car345
6.9565427286338e-11       password1991
4.494395283171681e-12     abby28
3.1035094651948957e-13    1595159
7.936477209731241e-13     bhagwati
1.3319042593247044e-22    natt4evasexy
1.5909371909986554e-15    curbside
...
```

The values are <TAB> separated.
You can use sortresult.py from the utils folder to sort the passwords.

For example run:

```
(nemo-venv) $ pypy utils/sortresult.py results/eval_result.txt > results/eval_result_sorted.txt
```
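
Conceptually, this just orders the TAB-separated lines by their probability column; a rough, hypothetical sketch of such a sort (not the actual sortresult.py):

```python
# sort result lines by probability, most likely (= weakest) passwords first
with open("results/eval_result.txt") as f:
    rows = [line.rstrip("\n").split("\t", 1) for line in f if "\t" in line]
rows.sort(key=lambda row: float(row[0]), reverse=True)
for prob, password in rows:
    print(prob + "\t" + password)
```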

A successful strength estimation looks like this:

```
Start: 2019-07-06 16:07:58
Press Ctrl+C to shutdown
[16:07:58.346] configure.py Line 30 __init__(): Constructor started for 'My Config'
[16:07:58.349] ngram_creator.py Line 24 __init__(): Constructor started for 'Development'
[16:07:58.349] ngram_creator.py Line 33 __init__(): Used alphabet: ae10i2onrls9384t5m67cdyhubkgpjvfzwAxEONIRSLM.TC_DqBHYUKPJG!-*F @VWXZ/,#+&?$Q)<'=;^[(%\~]`:|">
[16:07:58.349] ngram_creator.py Line 35 __init__(): Model string length: 8
[16:07:58.349] ngram_creator.py Line 38 __init__(): NGram size: 4
[16:08:00.253] ngram_creator.py Line 48 __init__(): len(IP) theo: 804357
[16:08:00.253] ngram_creator.py Line 49 __init__(): len(CP) theo: 74805201 => 804357 * 93
[16:08:00.253] ngram_creator.py Line 50 __init__(): len(EP) theo: 804357
[16:08:00.253] meter.py Line 23 worker(): Thread: 8 - ip_list load() ...
[16:08:00.438] ngram_creator.py Line 291 load(): Done! Everything loaded from disk.
[16:08:00.439] ngram_creator.py Line 292 load(): Loading the data from disk took: 0:00:00.184483
[16:08:00.439] meter.py Line 25 worker(): Thread: 8 - cp_list load() ...
[16:08:14.075] ngram_creator.py Line 291 load(): Done! Everything loaded from disk.
[16:08:14.076] ngram_creator.py Line 292 load(): Loading the data from disk took: 0:00:13.635805
[16:08:14.076] meter.py Line 27 worker(): Thread: 8 - ep_list load() ...
[16:08:14.224] ngram_creator.py Line 291 load(): Done! Everything loaded from disk.
[16:08:14.224] ngram_creator.py Line 292 load(): Loading the data from disk took: 0:00:00.148400
[16:08:14.224] meter.py Line 29 worker(): Thread: 8 - Loading done ...
[16:08:14.225] meter.py Line 55 eval(): Training loaded from disk ...
...
Info: No Markov model for this length: 13 jake1password
Info: No Markov model for this length: 16 marasalvatrucha3
...
Done: 2019-07-06 16:08:14
```

FAQ

  • Usage: ASCII pre-filter your input / eval files.

  • Usage: Limit the alphabet (alphabet, e.g., lower + upper + digits), the n-gram size (ngram_size, e.g., 3- or 4-grams), and the password lengths (lengths, e.g., 6 or 8 character long passwords).

  • Usage: Make sure you train on a real password distribution, not the kind of word list / dictionary one normally uses with tools like Hashcat / John the Ripper.

  • Debugging: If you encounter any issues, go to train.py and change the train() function from multi- to single-processing. This way, it is easier to debug the actual problem.

  • Debugging: In configure.py you can change the verbosity of the rainbow_logging_handler from DEBUG to INFO or CRITICAL.

License

NEMO is licensed under the MIT license. Refer to docs/LICENSE for more information.

Third-Party Libraries

  • tqdm is a library that can be used to display a progress meter. It is a “product of collaborative work” from multiple authors and is using the MIT license. The license and the source code can be found here.
  • u-msgpack-python is a lightweight MessagePack serializer developed by Ivan (Vanya) A. Sergeev and is using the MIT license. The source code and the license can be downloaded here.
  • rainbow_logging_handler is a colorized logger developed by Mikko Ohtamaa and Sho Nakatani. The authors released it as “free and unencumbered public domain software”. The source code and “license” can be found here.

Contact

Visit our website and follow us on Twitter. If you are interested in passwords, consider contributing and attending the International Conference on Passwords (PASSWORDS).