Hello, this is futabato.

This time I read through Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks (Xu, Weilin, David Evans, and Yanjun Qi, 2017), so I am leaving my notes on the paper here on the blog.

Feature Squeezing: Detecting Adversarial Examples in Deep Neural Networks

Paper overview

Authors: Xu, Weilin, David Evans, and Yanjun Qi
Year: 2017
Paper URL: https://arxiv.org/abs/1704.01155
Citations: 1167
Tags: Preprocess, Supplementing the Network

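As this paper's method exemplifies, one family of defenses prevents misclassification by preprocessing the data that is fed to the model.

The paper starts from the observation that the input feature space is unnecessarily large, and that this vastness gives attackers the opportunity to construct adversarial examples. By squeezing out the unnecessary parts of the input, it aims to reduce the degrees of freedom available to an attacker.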

Fig. 1: Feature-squeezing framework for detecting adversarial examples.

Abstract

Although deep neural networks (DNNs) have achieved great success in many tasks, they can often be fooled by adversarial examples that are generated by adding small but purposeful distortions to natural examples. Previous studies to defend against adversarial examples mostly focused on refining the DNN models, but have either shown limited success or required expensive computation. We propose a new strategy, feature squeezing, that can be used to harden DNN models by detecting adversarial examples. Feature squeezing reduces the search space available to an adversary by coalescing samples that correspond to many different feature vectors in the original space into a single sample. By comparing a DNN model’s prediction on the original input with that on squeezed inputs, feature squeezing detects adversarial examples with high accuracy and few false positives. This paper explores two feature squeezing methods: reducing the color bit depth of each pixel and spatial smoothing. These simple strategies are inexpensive and complementary to other defenses, and can be combined in a joint detection framework to achieve high detection rates against state-of-the-art attacks.

What makes it better than prior work?

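Previous work focused mainly on improving the DNN model itself, with limited success and, on top of that, expensive computation. The feature squeezing introduced in this paper is relatively cheap and complementary to other defenses, so it can be combined with them in a joint detection framework as a form of defense in depth, and it achieved high detection rates against state-of-the-art attacks.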
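To make the detection idea from the abstract concrete, here is a minimal sketch of comparing the model's prediction on the original input with its predictions on squeezed inputs. It assumes the comparison is an L1 distance between probability vectors and that the largest distance over all squeezers is checked against a threshold. The names `is_adversarial`, `toy_predict`, and `one_bit_squeeze`, as well as the threshold value `1.0`, are hypothetical and chosen only for illustration; a real setup would use a trained DNN and a threshold tuned on validation data.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def l1_distance(p: Vector, q: Vector) -> float:
    """Sum of absolute differences between two probability vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))


def is_adversarial(
    x: Vector,
    predict: Callable[[Vector], Vector],
    squeezers: Sequence[Callable[[Vector], Vector]],
    threshold: float,
) -> bool:
    """Flag x when squeezing changes the prediction by more than threshold."""
    original = predict(x)
    # Assumption: the score is the largest L1 distance over all squeezers.
    score = max(l1_distance(original, predict(squeeze(x))) for squeeze in squeezers)
    return score > threshold


def toy_predict(x: Vector) -> Vector:
    """Hypothetical two-class 'model': class 0 if the mean intensity exceeds 0.5."""
    mean = sum(x) / len(x)
    return [1.0, 0.0] if mean > 0.5 else [0.0, 1.0]


def one_bit_squeeze(x: Vector) -> Vector:
    """Hypothetical squeezer: reduce every value in [0, 1] to 1 bit (0 or 1)."""
    return [float(round(v)) for v in x]


clean_input = [0.9, 0.8, 0.95]      # prediction is stable under squeezing
suspect_input = [0.45, 0.45, 0.62]  # prediction flips once the input is squeezed

clean_flag = is_adversarial(clean_input, toy_predict, [one_bit_squeeze], threshold=1.0)
suspect_flag = is_adversarial(suspect_input, toy_predict, [one_bit_squeeze], threshold=1.0)
```

The point of the design is that the model itself is left untouched: the detector only needs extra forward passes on cheaply transformed copies of the input, which is why it is easy to stack with other defenses.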

What is the key to the technique?

Color Depth

Fig. 2: Image examples with bit depth reduction.

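Looking at Figure 2 above, for the images in the middle and bottom rows it is hard to tell the original 8-bit image from the one squeezed to 4 bits. Unlike the top row, however, they show human-observable loss at color depths below 4 bits. This is because, even with the same number of bits per channel, information is lost in all three RGB channels.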

Even so, compressing from 8 bits to 4 bits turned out to be strong enough to mitigate many adversarial examples while preserving accuracy on legitimate examples.
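As a rough illustration of the 8-bit to 4-bit squeezing discussed above, the following sketch quantizes each channel to `2**bits` levels. It assumes pixel values are floats scaled to [0, 1] and uses NumPy; the function name `reduce_bit_depth` and the random test image are hypothetical and are not taken from the paper's code.

```python
import numpy as np


def reduce_bit_depth(image: np.ndarray, bits: int) -> np.ndarray:
    """Quantize pixel values in [0, 1] to 2**bits levels per channel."""
    if not 1 <= bits <= 8:
        raise ValueError("bits must be between 1 and 8")
    levels = 2 ** bits - 1  # e.g. 15 steps for a 4-bit depth
    # Snap every value to the nearest representable level, then rescale to [0, 1].
    return np.round(image * levels) / levels


# Hypothetical 8-bit RGB image, rescaled to [0, 1] before squeezing.
rng = np.random.default_rng(0)
image_8bit = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
image = image_8bit.astype(np.float64) / 255.0

squeezed = reduce_bit_depth(image, bits=4)
```

After squeezing, every channel can take at most 16 distinct values, so small perturbations that do not push a pixel across a quantization boundary are simply erased.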