To expose the hidden training data, Ryan Webster and his colleagues at the University of Caen Normandy in France used a type of attack called a membership attack, which can be used to find out whether certain data was used to train a neural network model. These attacks typically take advantage of subtle differences between the way a model treats data it was trained on (and has thus seen thousands of times before) and the way it treats unseen data. For example, a model might identify a previously unseen image accurately, but with slightly less confidence than one it was trained on. A second, attacking model can learn to spot such tells in the first model’s behavior and use them to predict whether certain data, such as a photo, is in the training set.
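The confidence gap described above can be illustrated with a toy sketch. This is not the researchers' attack: it swaps the learned attacking model for a single confidence cutoff, and the function names (`fit_threshold`, `predict_member`) and all confidence values are made up for illustration.

```python
from typing import Sequence


def fit_threshold(member_conf: Sequence[float],
                  nonmember_conf: Sequence[float]) -> float:
    """Pick the confidence cutoff that best separates known training
    examples (members) from known unseen examples (non-members)."""
    candidates = sorted(set(member_conf) | set(nonmember_conf))
    total = len(member_conf) + len(nonmember_conf)
    best_t, best_acc = candidates[0], -1.0
    for t in candidates:
        # Guess "member" when the model's confidence is at least t.
        correct = (sum(c >= t for c in member_conf)
                   + sum(c < t for c in nonmember_conf))
        acc = correct / total
        if acc > best_acc:  # keep the lowest cutoff among ties
            best_t, best_acc = t, acc
    return best_t


def predict_member(confidence: float, threshold: float) -> bool:
    """Guess whether an input was in the training set."""
    return confidence >= threshold


# Hypothetical confidences the target model assigns to its top prediction.
member_conf = [0.99, 0.97, 0.98, 0.95]      # inputs known to be in the training set
nonmember_conf = [0.90, 0.93, 0.85, 0.96]   # inputs known to be unseen

threshold = fit_threshold(member_conf, nonmember_conf)
print(threshold)                        # 0.95
print(predict_member(0.99, threshold))  # True
print(predict_member(0.90, threshold))  # False
```

In practice the difference in confidence is usually far smaller and noisier than in these toy numbers, which is why a trained attacking model is typically used rather than one fixed cutoff.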
Such attacks can lead to serious security leaks. For example, finding out that someone’s medical data was used to train a model associated with a disease might reveal that this person has that disease. Webster’s team extended this idea so that instead of identifying the exact photos used to train a GAN, they identified photos in the GAN’s training set that were not identical but appeared to portray the same individual — in other words, faces with the same identity. To do this, the researchers first generated faces with the GAN and then used a separate facial-recognition AI to detect whether the identity of these generated faces matched the identity of any of the faces seen in the training data. The results are striking. In many cases, the team found multiple photos of real people in the training data that appeared to match the fake faces generated by the GAN, revealing the identity of individuals the AI had been trained on.
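The identity-matching step can be sketched in a few lines, under the assumption that the facial-recognition system turns each face into a numeric embedding and that two faces share an identity when their embeddings are similar enough. The embeddings, photo names, `find_identity_matches` helper, and the 0.9 cutoff below are all hypothetical; the article does not say how the team's system scores a match.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def find_identity_matches(generated: Sequence[float],
                          training: Dict[str, Sequence[float]],
                          threshold: float = 0.9) -> List[str]:
    """Return the training photos whose embedding is close enough to the
    generated face's embedding to count as the same identity."""
    return [name for name, emb in training.items()
            if cosine_similarity(generated, emb) >= threshold]


# Toy 3-dimensional embeddings; real face embeddings have far more dimensions.
training_embeddings = {
    "person_a_photo_1": [0.9, 0.1, 0.0],
    "person_a_photo_2": [0.8, 0.2, 0.1],
    "person_b_photo_1": [0.0, 0.1, 0.9],
}
generated_face = [0.85, 0.15, 0.05]  # embedding of a GAN-generated face

matches = find_identity_matches(generated_face, training_embeddings)
print(matches)  # ['person_a_photo_1', 'person_a_photo_2']
```

Here one generated face matches two different training photos of the same person, mirroring the article's point that the matched photos need not be identical to each other or to the generated image.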