My son and his friend Paul were playing video games in the living room as I prepared a late breakfast for myself. While the bacon sizzled, I searched the refrigerator for eggs I knew were in there somewhere.
“Daniel,” I called out to my son. “Have you seen the eggs box? Paul, can I get you something to drink?”
“Hey, what happened?” cried Paul. “My guy just froze up on screen.”
“Dad,” Daniel added. “You messed up our game.”
How did I do that? I’m standing way over here in the kitchen. They’re the ones clutching the Xbox controllers. I’m just trying to find the eggs box.
Ohhhh … I think I know what happened.
While most of my family’s interaction with the television is done via remote control (including the stuff I throw at the screen whenever a Real Housewife appears), Daniel has set up his game console so he can control certain aspects of its operation with voice commands. If he needs to step away from the zombie-killing on “Dead Island” to deal with concerns more pedestrian than the undead, he simply says “Xbox pause” and the game stops.
When I asked “have you seen the eggs box? Paul…”, the voice recognition software heard “Xbox pause” and halted the action.
Daniel also realized what had happened. He turned to the TV and said “Xbox play,” and the automaton slaughter resumed.
Since the dawn of the Industrial Revolution, man’s interaction with machinery has come about primarily by pushing buttons and pulling levers. Occasionally, a factory worker was able to halt his equipment by fatally falling into it, though technically that still involved physical contact with the controls.
Using your voice to issue commands is a relatively new development. Like many technological innovations, it was first conceived in the realm of science fiction. I still remember watching “Star Trek” in the 1960s, when Kirk or Spock or whatever Expendoid had been cast that week for the specific purpose of being killed by Klingons would say “Computer! Bring me a ham sandwich” or “Computer! Save the universe.” And sure enough, the computer would do it.
Now, like the teleportation device that rockets people instantly through space or whatever innovation made it possible for second-rate sci-fi to proliferate into countless remakes, voice-recognition technology is part of our everyday lives.
Hands-free cellphones make it possible to apply makeup while driving. Cars themselves now respond to imperatives like “change the radio station” or “lower the AC.” You can order fast food at a drive-through speaker box, which activates robots inside to sneeze on your hamburger, drop it on the floor, bag it, and pass it through the window.
Like everything else introduced in the last 25 years (with the possible exception of my son), I don’t like it. It just seems fundamentally wrong that we speak to inanimate objects when we already have enough trouble talking to fellow humans. Machines should be controlled by interfaces like keyboards and touchpads. Humans should be controlled by verbal threats and menacing glances.
Chatting up the Xerox WorkCentre 5755 in an attempt to convince it to make 50 two-sided color copies is just too much effort. You have to ask how its family is doing and everything.
I’ve overcome my initial reservations about the flip side of voice recognition: employing keypads to interact with people. Fewer and fewer individuals are using face-to-face conversations or their smartphones to talk to loved ones. Texting, tweeting, instant-messaging and posting pictures of your drunken self on Facebook are now the preferred ways to communicate.
And I’m fine with that. In fact, I prefer it. Concise written messages — even those strewn with emoticons and misspelled into a wireless device — may take longer to key than the spoken word, but they last longer than our ephemeral grunts. “Conversations” held days ago are now fully documented, great news if you have a dispute with your wife about what you were supposed to get at the grocery store, not so great if you’ve been caught conspiring to commit murder-for-hire.
But I suspect my objections to voice-recognition interfaces are based more on what I perceive to be a threat to my livelihood. I make my living as a proofreader and typesetter for a printing company. After over 30 years in this and similar roles, I’ve become very good at my job, especially the typing part. I can key over 100 words per minute with 98% accuracy, according to the man-eating sea creatures at Typer Shark. Even charging hammerheads recoil before my onslaught of ampersands and semicolons.
If voice-detection technology is introduced in my workplace, my typing skills could become useless. Instead of spending the day covering my keyboard with a blur of digits, I’d be reduced to mouthing the words into an input device. Instead of zipping through some of my favorite words to type — “administration,” “facilities,” “constipation,” to name a few — in a matter of nanoseconds, I’d have to say each individual letter. Speaking “m-a-s-t-u-r-b-a-t-o-r-y” into a cone is nowhere near as fun as fingering it into a QWERTY keyboard faster than you can moan orgasmically.
And if there are dozens of fellow workers engaged in the same activity, the collective drone would be enough to knock you flat out. Though sore throats might be easier to treat than chronic carpal tunnel syndrome, you’d end the day vocally exhausted, unable to talk your car into starting.
So today I’m saying “Speech Recognition, Pause.” Take your high-tech audio analysis and consign it to the scrap heap of futuristic-but-ill-conceived ideas, along with jetpacks and rocket cars and Lady Gaga. When I call a customer service help line, I want to press the “0” key, not say “representative,” then say it again, then say it again.
Only when this scourge of needless modernity is eliminated from our lives will I be prepared to again say “Progress, Play.”