Or possibly the exact opposite of that, I can't tell if it's a one-to-one mapping on mobile: https://www2.b3ta.com/buffyswear/
(Also, I'm feeling my age now, given how many years have elapsed since that kind of thing passed for internet culture…)
https://www.theverge.com/23810061/zenith-space-command-remot...
A couple years ago for a weekend project I made a simple "audio-mnist" dataset from handwritten digit audio recordings. I never got past a few days worth of work, but open-sourcing it has been on my mind for a minute. This post kicked me into action. Getting some more data, basic CNN examples, etc. could provide a nice starting point for a lot of research and tools.
There is still separate code I'd have to find and make intelligible to create the recordings and split the audio.
Anyway, in case anyone finds part of this process interesting or useful.
https://en.m.wikipedia.org/wiki/Tempest_(codename)
TEMPEST considered almost everything from electromagnetic leakage to exactly the attack described here.
Here's a few random papers I read along the way:
https://doi.org/10.1007/s10207-019-00449-8 - SonarSnoop, which uses a phone's speaker to produce ultrasonic audio that can be used to profile the user's interaction (e.g. entering swipe-based passcodes).
https://people.eecs.berkeley.edu/~daw/papers/ssh-use01.pdf - "Timing Analysis of Keystrokes and Timing Attacks on SSH", a paper from 2001 that uses statistical models of keystroke timings to retrieve passwords from encrypted SSH traffic.
https://doi.org/10.1145/1609956.1609959 - "Keyboard acoustic emanations revisited", which uses hidden Markov models and some other English language features to recover text based on classification via cepstrum features.
https://doi.org/10.1145/2660267.2660296 - "Context-free Attacks Using Keyboard Acoustic Emanations" which uses a geometric approach, using time-difference-of-arrival to estimate physical locations probabilistically.
In 2005 ACM's CCS Zhuang, Zhou and Tygar presented Keyboard Acoustic Emanations Revisited [1]
We examine the problem of keyboard acoustic emanations. We
present a novel attack taking as input a 10-minute sound recording
of a user typing English text using a keyboard, and then recovering
up to 96% of typed characters. There is no need for a labeled
training recording. Moreover the recognizer bootstrapped this way
can even recognize random text such as passwords: In our experiments,
90% of 5-character random passwords using only letters can
be generated in fewer than 20 attempts by an adversary; 80% of 10-
character passwords can be generated in fewer than 75 attempts.
Our attack uses the statistical constraints of the underlying content,
English language, to reconstruct text from sound recordings
without any labeled training data. The attack uses a combination
of standard machine learning and speech recognition techniques,
including cepstrum features, Hidden Markov Models, linear classification,
and feedback-based incremental learning
which builds up on Asonov & Agrawal's work [2] who came up with the idea the previous year (2004). We show that PC keyboards, notebook keyboards, telephone
and ATM pads are vulnerable to attacks based on
differentiating the sound emanated by different keys. Our
attack employs a neural network to recognize the key being
pressed. We also investigate why different keys produce
different sounds and provide hints for the design of homophonic
keyboards that would be resistant to this type of attack.
[1] https://dl.acm.org/doi/10.1145/1609956.1609959The locations of the numbers move around to prevent mouseloggers from recording your movements.
It seems like any way of doing it would end up slowing down the typist though. If it is just for the password, I could see it being possible, but if you're dealing with lots of information that needs to be protected, then it seems impossible.
Lagniappe: “To temporarily silence bucklespring, for example to enter secrets, press ScrollLock twice”
On screen pin entry with jumbled number mappings does the same thing. It also makes the inter-stroke delay rather independent of position, because the brain has to search the screen (although repeated digits and previously occuring digits are quicker, which is why some jumble at every keystroke).
Keyboards with OLED keys (like the Apple Touchbar or the Optimus[1]) might also work.
As far as I know, Cherry blues only click once and the second sound you hear on a keypress is just the topping out sound.
This attack is about as realistic as the film: a parallel universe where million to one chances happen nine times out of ten.