Have you ever wondered what people are saying in a muted video or a photo? You might think that it is impossible to get any sound from a silent image, but a new technique called Side Eye can do just that.
Side Eye is a machine learning tool developed by researchers from four US universities. It can extract audio from static photos and silent or muted videos by analyzing the tiny vibrations of the camera lens caused by sound waves.
The researchers were inspired by an episode of the sci-fi show “Fringe” in which the characters recovered audio from melted glass. They decided to challenge themselves and see if they could achieve something similar with modern cameras.
They found that most phone cameras use a rolling shutter, which scans the image one row at a time instead of capturing it all at once. Because each row is recorded at a slightly different moment, sound-induced vibrations of the lens shift the rows by slightly different amounts, and this distortion can be used to reconstruct the frequencies of the sound.
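To make the rolling-shutter idea concrete, here is a minimal simulation sketch in Python. It is not the researchers' method, which relies on machine learning; it only illustrates why row-by-row capture lets one frame carry information about a tone. Everything in it is an assumption chosen for illustration: the 1080-row sensor, the 1/30 s readout time, the pure 440 Hz tone, the perfectly measured per-row shifts, and the hypothetical function names `simulate_row_shifts` and `dominant_frequency`.

```python
import math

ROWS = 1080                      # assumed number of sensor rows
ROW_TIME_S = 1.0 / (30 * ROWS)   # assumed: the whole frame is read out in 1/30 s


def simulate_row_shifts(tone_hz, rows=ROWS, row_time_s=ROW_TIME_S, amplitude_px=0.5):
    """Shift (in pixels) of each row while the lens vibrates at tone_hz.

    Row r is captured at time r * row_time_s, so the rows of one frame
    sample the vibration 1 / row_time_s times per second.
    """
    return [amplitude_px * math.sin(2 * math.pi * tone_hz * r * row_time_s)
            for r in range(rows)]


def dominant_frequency(samples, sample_rate_hz, candidates_hz):
    """Return the candidate frequency with the largest Fourier magnitude."""
    def magnitude(freq):
        re = sum(s * math.cos(2 * math.pi * freq * n / sample_rate_hz)
                 for n, s in enumerate(samples))
        im = sum(s * math.sin(2 * math.pi * freq * n / sample_rate_hz)
                 for n, s in enumerate(samples))
        return math.hypot(re, im)
    return max(candidates_hz, key=magnitude)


# One simulated frame: 1080 row shifts act as 1080 samples of the tone.
shifts = simulate_row_shifts(440.0)
estimate = dominant_frequency(shifts, 1.0 / ROW_TIME_S, range(100, 1001, 10))
print(estimate)
```

Under these assumptions a single frame yields 1080 samples spread over 1/30 of a second, an effective sampling rate of 32,400 samples per second rather than 30 frames per second. Real footage is far noisier, and the row shifts would have to be estimated from the image content itself.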
Using Side Eye, the researchers were able to determine the gender and the words of someone speaking in the room where a photo was taken, or in a muted video. They also demonstrated that they could get audio from off-camera sources, such as someone speaking behind the person taking the photo.
The technique has many potential applications in areas such as security, forensics, and accessibility. However, it also raises privacy and ethical concerns, as anyone with a camera could potentially eavesdrop on others without their consent.
The researchers plan to present their work at the ACM Conference on Computer and Communications Security in November 2023. They hope that their findings will raise awareness and spark a discussion about the implications of this technology.