Towards deepfake detection that actually works
Read the technical deep dive: https://www.dessa.com/post/deepfake-detection-that-actually-works
In our recent article, we make the following contributions:
Our PyTorch implementation conducts extensive experiments to demonstrate that the datasets produced by Google and detailed in the FaceForensics++
paper are not sufficient for making neural networks generalize to detect real-life face manipulation techniques. It also provides a current solution for this
behavior, which relies on adding more data.
Our PyTorch model is based on a ResNet18 pre-trained on ImageNet, which we fine-tune to solve the deepfake detection problem.
We also conduct large-scale experiments using Dessa’s open source scheduler + experiment manager Atlas.
To run the code, your system should meet the following requirements: RAM >= 32 GB, GPUs >= 1
sudo apt install ffmpeg
That’s it. You’re ready to go!
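A quick way to check these prerequisites before starting is sketched below. It is an illustrative helper, not part of the repository: the function names are hypothetical, the RAM check relies on `os.sysconf` values available on Linux, and the presence of `nvidia-smi` on the PATH is used only as a rough proxy for "at least one GPU".

```python
import os
import shutil


def total_ram_gb() -> float:
    """Total physical memory in GiB, read from sysconf (Linux/Unix)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024 ** 3


def check_requirements(min_ram_gb: float = 32.0) -> dict:
    """Report whether each prerequisite appears to be met."""
    return {
        "ram": total_ram_gb() >= min_ram_gb,
        # nvidia-smi on the PATH is only a proxy for a usable GPU.
        "gpu": shutil.which("nvidia-smi") is not None,
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```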
Half of the dataset used in this project is from the FaceForensics deepfake detection dataset.
To download this data, please fill out the Google form to request access.
The dataset that we collected from YouTube is available for download on S3.
To automatically download and restructure both datasets, please execute:
bash restructure_data.sh faceforensics_download.py
Note: You need to have received the download script from the FaceForensics++ team before executing the restructure script.
Note 2: We created restructure_data.sh to produce a split that replicates our exact experiments, available in the UI above. Please feel free to change the
splits as you wish.
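If you do change the splits, one reasonable approach is a seeded split at the video level, so that frames from the same video never land in both training and validation. The sketch below is an assumption about how one might do this; it does not reproduce the logic of restructure_data.sh, and `split_videos`, `val_fraction`, and `seed` are hypothetical names.

```python
import random


def split_videos(video_ids, val_fraction=0.2, seed=0):
    """Shuffle video IDs with a fixed seed and split them into train/val lists."""
    ids = sorted(video_ids)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(ids)
    n_val = int(len(ids) * val_fraction)
    return ids[n_val:], ids[:n_val]


if __name__ == "__main__":
    train, val = split_videos([f"video_{i:03d}" for i in range(10)])
    print(len(train), "train videos,", len(val), "validation videos")
```

Using a fixed seed makes the split reproducible across runs, which matters when comparing experiments.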
Before training or evaluating models, we first need to create the Docker image that our experiments will run in. We have already prepared
a Dockerfile for this inside custom_docker_image. To build the Docker image, execute the following commands in a terminal:
cd custom_docker_image
nvidia-docker build . -t atlas_ff
Note: If you change the image name, please make sure you also modify line 16 of job.config.yaml to match the Docker image name.
Inside job.config.yaml, please change the data path on the host from /media/biggie2/FaceForensics/datasets/ to the absolute path of your datasets folder.
The folder containing your datasets should have the following structure:
datasets
├── augment_deepfake (2)
│   ├── fake
│   │   └── frames
│   ├── real
│   │   └── frames
│   └── val
│       ├── fake
│       └── real
├── base_deepfake (1)
│   ├── fake
│   │   └── frames
│   ├── real
│   │   └── frames
│   └── val
│       ├── fake
│       └── real
├── both_deepfake (3)
│   ├── fake
│   │   └── frames
│   ├── real
│   │   └── frames
│   └── val
│       ├── fake
│       └── real
├── precomputed (4)
└── T_deepfake (0)
    ├── manipulated_sequences
    │   ├── DeepFakeDetection
    │   ├── Deepfakes
    │   ├── Face2Face
    │   ├── FaceSwap
    │   └── NeuralTextures
    └── original_sequences
        ├── actors
        └── youtube
Notes:
The frames folders contain frames collected using ffmpeg.
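Before launching experiments, it can help to confirm that the three training sets follow the layout shown in the tree. The following is a small sketch of such a check; the helper name `missing_dirs` is hypothetical, and the sketch deliberately ignores precomputed and T_deepfake.

```python
from pathlib import Path

# Sub-folders expected inside each training set, taken from the tree in this README.
TRAINING_SETS = ("base_deepfake", "augment_deepfake", "both_deepfake")
EXPECTED_SUBDIRS = ("fake/frames", "real/frames", "val/fake", "val/real")


def missing_dirs(root):
    """Return the expected directories (relative to root) that do not exist."""
    root = Path(root)
    expected = [root / name / sub for name in TRAINING_SETS for sub in EXPECTED_SUBDIRS]
    return [p.relative_to(root).as_posix() for p in expected if not p.is_dir()]


if __name__ == "__main__":
    import sys

    problems = missing_dirs(sys.argv[1] if len(sys.argv) > 1 else "datasets")
    print("layout looks complete" if not problems else "missing: " + ", ".join(problems))
```

Run it with the same absolute path you put in job.config.yaml, so the check covers the folder the container will actually mount.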
Then, to run all the experiments presented in the article, launch the script hparams_search.py using:
python hparams_search.py
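For readers unfamiliar with hyperparameter searches, the general idea is to draw many configurations and run one experiment per configuration. The sketch below illustrates random sampling only; it is not the content of hparams_search.py, and every parameter name and range in it is hypothetical.

```python
import random


def sample_hparams(rng):
    """Draw one hypothetical configuration; names and ranges are illustrative only."""
    return {
        "learning_rate": 10 ** rng.uniform(-5, -2),  # log-uniform between 1e-5 and 1e-2
        "batch_size": rng.choice([32, 64, 128]),
        "train_data": rng.choice(["base_deepfake", "augment_deepfake", "both_deepfake"]),
    }


def make_trials(n_trials, seed=0):
    """Return a reproducible list of sampled configurations."""
    rng = random.Random(seed)
    return [sample_hparams(rng) for _ in range(n_trials)]


if __name__ == "__main__":
    for trial in make_trials(3):
        print(trial)
```

In a real search, each configuration would be handed to the scheduler as a separate job rather than printed.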
In the following pictures, the title of each subplot is in the form real_prob, fake_prob | prediction | label.
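As an illustration of how to read that title format, the sketch below builds such a string from two probabilities and a ground-truth label. It assumes the prediction is simply the class with the higher probability and that values are shown with two decimals; the function name `subplot_title` is hypothetical.

```python
def subplot_title(real_prob, fake_prob, label):
    """Build 'real_prob, fake_prob | prediction | label' for one frame."""
    # Assumption: the prediction is whichever class has the higher probability.
    prediction = "fake" if fake_prob > real_prob else "real"
    return f"{real_prob:.2f}, {fake_prob:.2f} | {prediction} | {label}"


if __name__ == "__main__":
    print(subplot_title(0.10, 0.90, "fake"))
```

A title whose prediction and label fields differ marks a misclassified frame.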
For models trained on the paper dataset alone, we notice that the model only learns to detect the manipulation techniques mentioned in the paper and misses
all the manipulations in real-world data (from data).
Models trained on the YouTube data alone learn to detect real-world deepfakes, and also learn to detect easy deepfakes in the paper dataset. These
models, however, fail to detect any other type of manipulation (such as NeuralTextures).
Finally, models trained on the combination of both datasets learn to detect both real-world manipulation techniques and the other methods
mentioned in the FaceForensics++ paper.
For a more in-depth explanation of these results, please refer to the article we published. More results can be seen in the
interactive UI.
Please feel free to fork this work and keep pushing on it.
If you also want to help improve the deepfake detection datasets, please share your real/forged samples at foundations@dessa.com.
© 2020 Square, Inc. ATLAS, DESSA, the Dessa Logo, and others are trademarks of Square, Inc. All third party names and trademarks are properties of their respective owners and are used for identification purposes only.