CrossCert: A Cross-Checking Detection Approach to Patch Robustness Certification for Deep Learning Models
Patch robustness certification is an emerging defense technique for deep learning models. It aims to enhance the reliability of these models against adversarial patch attacks with provable guarantees. There are two lines of research: certified recovery and certified detection. Certified recovery aims to label malicious samples correctly with provable guarantees, whereas certified detection aims to issue warnings, also with provable guarantees, for malicious samples predicted to have non-benign labels. However, existing certified detection defenders produce labels that are subject to manipulation, and existing certified recovery defenders cannot issue warnings about the labels they produce. A certified defense that simultaneously offers robust labels and systematic warning protection against patch attacks is therefore desirable. This paper proposes a novel certified defense technique called \textit{CrossCert}. \textit{CrossCert} cross-checks two certified recovery defenders to provide both unwavering certification and detection certification. Unwavering certification provably ensures that a certified sample, when subjected to a patch perturbation, is always returned with its benign label and without triggering any warning. To our knowledge, \textit{CrossCert} is the first certified detection technique to offer this guarantee. Our experiments show that, with detection certification performance slightly lower than that of ViP and comparable to that of PatchCensor, \textit{CrossCert} certifies a significant proportion of samples with the guarantee of unwavering certification.