I recently stumbled upon an article reporting that Facebook is using Artificial Intelligence to fight “revenge porn” - the phenomenon of sharing intimate photos on social networks without the subject’s permission. This is one of the downsides of the development of new technologies and social networks. This great TED talk from one of the victims explains why it is a serious problem that needs strong measures.
Facebook’s AI tool to fight revenge porn
As important as the problem undoubtedly is, it struck me as non-trivial to solve. I followed the original Facebook blog post to find more details about the implementation of this solution. It says:
Finding these images goes beyond detecting nudity on our platforms. By using machine learning and artificial intelligence, we can now proactively detect near nude images or videos that are shared without permission on Facebook and Instagram.
This unfortunately doesn’t clear up much about how it works in practice. The article does reference the Australian pilot program for fighting the sharing of Non-Consensual Intimate Images. It brings more information about Facebook’s actions, but the method described relies on simple image hashing. This classical technique has two great advantages here - it enables efficient validation of every uploaded photo against the database of reported images, and it transforms each photo into a representation that is meaningless to people, so those fingerprints can be stored instead of the original photos.
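To make the hashing idea concrete, here is a toy sketch. A cryptographic hash only matches byte-identical files, so systems of this kind typically use a perceptual hash that survives small edits; the average-hash below is a deliberately simplified stand-in (real systems such as Microsoft’s PhotoDNA are far more robust), and the pixel data is synthetic:

```python
import hashlib

def exact_hash(image_bytes: bytes) -> str:
    """Cryptographic hash: matches only byte-identical files."""
    return hashlib.sha256(image_bytes).hexdigest()

def average_hash(pixels, hash_size=8):
    """Toy perceptual hash over a grayscale pixel grid.
    Downscales to hash_size x hash_size by block-averaging, then
    emits one bit per block: 1 if the block is brighter than the mean.
    The resulting fingerprint is meaningless to a human viewer,
    but similar images produce similar fingerprints."""
    h, w = len(pixels), len(pixels[0])
    blocks = []
    for i in range(hash_size):
        for j in range(hash_size):
            block = [pixels[y][x]
                     for y in range(i * h // hash_size, (i + 1) * h // hash_size)
                     for x in range(j * w // hash_size, (j + 1) * w // hash_size)]
            blocks.append(sum(block) / len(block))
    mean = sum(blocks) / len(blocks)
    return ''.join('1' if b > mean else '0' for b in blocks)

def hamming(a: str, b: str) -> int:
    """Number of differing bits between two fingerprints."""
    return sum(x != y for x, y in zip(a, b))

# A "reported" image and a slightly brightened re-upload of it.
original = [[(x * y) % 256 for x in range(64)] for y in range(64)]
brightened = [[min(255, p + 5) for p in row] for row in original]

h1, h2 = average_hash(original), average_hash(brightened)
print(hamming(h1, h2))  # small distance -> likely the same image
```

An uploaded photo can thus be fingerprinted once and compared against the database of reported fingerprints; a Hamming distance below some threshold flags a probable match without ever storing the original image.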
Are there any other solutions?
If a new method were to bring some novelty, what could it be? It's hard to imagine an algorithm that would rely on the images alone and be more than a “nudity detector”, but Facebook has a broad range of data that it could incorporate to provide a wider context.
If we moved away from computational constraints for a moment (which are not to be neglected in the real world for a company with such scale) and assumed that we can train any model on their data, what data would bring the most value?
To sketch some potential approaches: we could check whether the publisher of the suspicious photo has recently been in a relationship that ended. If so, we could try to validate whether the person in the published video or photo corresponds to a previous partner, based on their profile photos. If we were not limited by privacy (for the sake of the higher purpose of defending people from “revenge porn”), we could analyse the messaging pattern between the publisher and their previous partners, and perhaps even analyse the sentiment of their recent messages, looking for threats.
These ideas could sometimes help to pinpoint the exact situations in which naked photos are shared without consent - but we have to be aware that this would still not be a perfect solution across the broad range of real situations. This probably explains why Facebook’s current solution focuses on detecting the broader class of all nude content, while still scouring every upload in search of known cases of non-consensual intimate image sharing. What’s also very important here is the whole procedure designed to help the victims.
Unfortunately, machine learning methods are no exception - they too can be used maliciously. DeepFake is one example. A technique for human image synthesis, it gained public attention at the end of 2017, after the publication of AI-constructed pornography in which celebrity faces were inserted into videos. But celebrities are not the only ones at risk - it is terrifying how this can be used to create “revenge porn” that never took place.
This is not the only example of an algorithm that can have a huge negative impact on our society. Recently, OpenAI, one of the most important research institutions in Artificial Intelligence, decided not to publish the full language model and codebase for its latest invention because, as they stated, “it’s clear that the ability to generate synthetic text that is conditioned on specific subjects has the potential for significant abuse.”
ThisPersonDoesNotExist.com - a website presenting recent results of Nvidia’s research on generating realistic-looking faces - has also caused a lot of debate about potential misuse of state-of-the-art technology, since many claim these synthetically generated photos could be used to create fake identities that are completely untraceable.
With great power comes great responsibility
The above examples are provided not to frighten, but to raise awareness - there has always been a race between researchers, regulators, and those trying to find loopholes and cause harm.