Project author: nebula-beta

Project description:
A PyTorch Toolbox for creating adversarial examples that fool neural networks.
Primary language: Python
Repository: git://github.com/nebula-beta/torchadver.git
Created: 2019-08-02T14:30:28Z
Project community: https://github.com/nebula-beta/torchadver

License:


Introduction

torchadver is a PyTorch toolbox for generating adversarial images. It implements the basic adversarial attacks, such as FGSM, I-FGSM, MI-FGSM, M-DI-FGSM, and C&W.
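To make the attack family concrete, here is a minimal FGSM sketch: a generic illustration of the one-step signed-gradient attack, not torchadver's own implementation (the function name and signature are ours):

```python
import torch
import torch.nn as nn

def fgsm(model, images, labels, eps):
    # One signed-gradient ascent step on the classification loss.
    images = images.clone().detach().requires_grad_(True)
    loss = nn.CrossEntropyLoss()(model(images), labels)
    loss.backward()
    # Move every pixel by eps in the direction that increases the loss.
    return (images + eps * images.grad.sign()).detach()
```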

Installation
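If the package is not published on PyPI, a plausible route (an assumption, not documented by the author) is installing directly from the repository, e.g. `pip install git+https://github.com/nebula-beta/torchadver.git`, or cloning it and adding the checkout to your PYTHONPATH.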

How to Use

A brief attack workflow is shown below; for a more detailed walkthrough, refer to ./examples/toturial.py.

Generate adversarial images under an L2 norm constraint

Non-targeted attack

```python
import torch.nn as nn

from torchadver.attacker.iterative_gradient_attack import FGM_L2, I_FGM_L2, MI_FGM_L2, M_DI_FGM_L2

mean = [0.5, 0.5, 0.5]
std = [0.5, 0.5, 0.5]

# images normalized by mean and std
images, labels = ...
model = ...

# mean and std determine the valid pixel range of each image channel
attacker = FGM_L2(model, loss_fn=nn.CrossEntropyLoss(),
                  mean=mean, std=std,
                  max_norm=4.0,  # L2 norm bound
                  random_init=True)

# non-targeted attack
adv_images = attacker.attack(images, labels)  # or adv_images = attacker.attack(images)
```
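The mean/std arguments matter because the attacker operates on normalized tensors. A plausible reading of the "valid pixel range" comment above (an assumption about the library's internals, illustrated with plain PyTorch) is that the legal per-channel range is recovered from the normalization parameters:

```python
import torch

mean = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)
std = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)

# A raw pixel in [0, 1] maps to (x - mean) / std after normalization,
# so the valid normalized range per channel is:
lower = (0.0 - mean) / std   # -1.0 in every channel for these values
upper = (1.0 - mean) / std   # +1.0 in every channel for these values

# Clamping adversarial images to [lower, upper] keeps them decodable
# back into legal [0, 1] pixels.
adv_images = torch.rand(8, 3, 32, 32)  # stand-in batch
adv_images = torch.max(torch.min(adv_images, upper), lower)
```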

Targeted attack

```python
import torch.nn as nn

from torchadver.attacker.iterative_gradient_attack import FGM_L2, I_FGM_L2, MI_FGM_L2, M_DI_FGM_L2

mean = [0.5, 0.5, 0.5]
std = [0.5, 0.5, 0.5]

# images normalized by mean and std
images, labels = ...
model = ...
targeted_labels = ...

# mean and std determine the valid pixel range of each image channel
attacker = FGM_L2(model, loss_fn=nn.CrossEntropyLoss(),
                  mean=mean, std=std,
                  max_norm=4.0,  # L2 norm bound
                  random_init=True, targeted=True)

# targeted attack
adv_images = attacker.attack(images, targeted_labels)
```
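The targeted=True flag is what separates this from the non-targeted call above: presumably the attacker then minimizes the loss toward targeted_labels (pushing predictions into the target class) rather than maximizing it away from the true labels, so attack() receives the target labels instead of the ground truth.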

Generate adversarial images under an Linf norm constraint

Non-targeted attack

```python
import torch.nn as nn

from torchadver.attacker.iterative_gradient_attack import FGM_LInf, I_FGM_LInf, MI_FGM_LInf, M_DI_FGM_LInf

mean = [0.5, 0.5, 0.5]
std = [0.5, 0.5, 0.5]

# images normalized by mean and std
images, labels = ...
model = ...

# mean and std determine the valid pixel range of each image channel
attacker = FGM_LInf(model, loss_fn=nn.CrossEntropyLoss(),
                    mean=mean, std=std,
                    max_norm=0.1,  # Linf norm bound
                    random_init=True)

# non-targeted attack
adv_images = attacker.attack(images, labels)  # or adv_images = attacker.attack(images)
```
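The only conceptual difference between the _L2 and _LInf variants is the norm ball the perturbation is projected onto at each step; a generic sketch of the two projections (illustrative, not torchadver's internal code):

```python
import torch

def project_l2(delta, max_norm):
    # Rescale each example's perturbation back onto the L2 ball of radius max_norm.
    flat = delta.view(delta.size(0), -1)
    norms = flat.norm(p=2, dim=1).clamp(min=1e-12)
    factor = (max_norm / norms).clamp(max=1.0)
    return delta * factor.view(-1, 1, 1, 1)

def project_linf(delta, max_norm):
    # Clip every coordinate independently to [-max_norm, max_norm].
    return delta.clamp(-max_norm, max_norm)
```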

Targeted attack

```python
import torch.nn as nn

from torchadver.attacker.iterative_gradient_attack import FGM_LInf, I_FGM_LInf, MI_FGM_LInf, M_DI_FGM_LInf

mean = [0.5, 0.5, 0.5]
std = [0.5, 0.5, 0.5]

# images normalized by mean and std
images, labels = ...
model = ...
targeted_labels = ...

# mean and std determine the valid pixel range of each image channel
attacker = FGM_LInf(model, loss_fn=nn.CrossEntropyLoss(),
                    mean=mean, std=std,
                    max_norm=0.1,  # Linf norm bound
                    random_init=True, targeted=True)

# targeted attack
adv_images = attacker.attack(images, targeted_labels)
```
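The I_FGM_*, MI_FGM_*, and M_DI_FGM_* classes imported alongside the FGM_* ones mirror I-FGSM, MI-FGSM, and M-DI-FGSM from the literature, i.e. iterated steps, added momentum, and added input diversity. Judging by the shared import path (an inference, not documented behavior), they should be drop-in replacements: swapping the class name in the snippets above is the only change needed.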

Citations

For more information about adversarial attacks on deep learning, refer to awesome-adversarial-deep-learning.