Project Portfolio

Dan Maynes Aminzade

Hacking Stone Cold

 

Jakks Pacific is a major developer and manufacturer of action figures based on characters from the World Wrestling Federation (WWF).  The company recently released a new line of so-called "interactive" talking action figures.  The first in this series of dolls features Stone Cold Steve Austin, perhaps the most popular wrestler in the WWF.
Talking dolls are nothing new, but this doll is unique in that its sayings can constantly be refreshed with clips downloaded from the internet.  The doll comes with a cable that plugs into a PC serial port, allowing you to upload up to 45 seconds of audio to the doll's flash memory using an included Windows application called the "Stone Cold Steve Austin Rant Manager".

You can download audio samples from the official WWF Wired website in a format that Jakks Pacific calls a "Rant Pakk" (sic). I'm sure that this excites wrestling fans to no end; when they hear a snappy catch phrase or devastating insult on TV while watching WWF "Monday Night Raw", they can visit the website the next day, download the clip, and upload it to their talking action figure.

I'm interested in the doll for another reason: I want to reverse engineer the "Rant Pakk" format to make the doll say anything I want.  Imagine a stuttering Stone Cold, a cowardly Stone Cold, or even an effeminate Stone Cold.

A Rant Pakk is really nothing more than a self-extracting ZIP archive containing a series of audio clips and their textual descriptions.  Each clip comes in two formats: an 8-bit mono 11 kHz WAV file, and a mysterious audio file with the extension "WWF".  The WAV file is used to preview sound clips in the Rant Manager application; the WWF file is what is actually uploaded to the doll.  Reverse engineering the doll is essentially a matter of working out the WWF file format.
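As a rough sketch of how the contents of a Rant Pakk could be inspected, the Python below pairs up the WAV and WWF members of an archive.  It assumes that the two files for a clip share a base name, which is a guess and not something the format guarantees; the function names are made up for illustration.

```python
import zipfile
from pathlib import PurePosixPath


def pair_clips(names):
    """Group member names into {stem: {"wav": name, "wwf": name}}.

    Assumption: a clip's preview (WAV) and upload (WWF) files share a
    base name and differ only in extension.
    """
    clips = {}
    for name in names:
        path = PurePosixPath(name)
        ext = path.suffix.lower().lstrip(".")
        if ext in ("wav", "wwf"):
            clips.setdefault(path.stem.lower(), {})[ext] = name
    # Keep only clips that have both the preview and the upload file.
    return {stem: files for stem, files in clips.items() if len(files) == 2}


def list_rant_pakk(path):
    """Return the WAV/WWF pairs found inside a Rant Pakk."""
    # zipfile locates the archive directory from the end of the file, so a
    # self-extracting executable with a ZIP payload can usually be opened
    # without stripping the extractor stub first.
    with zipfile.ZipFile(path) as archive:
        return pair_clips(archive.namelist())
```

Calling `list_rant_pakk` on a downloaded Rant Pakk should give one entry per clip, which is a convenient starting point for comparing each WAV file against its WWF counterpart.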

If you select one of the sound clips in a Rant Pakk, open the WAV file in a sound editor, downgrade its quality to 8-bit mono, 3000 Hz, and then strip the file of its WAV header, you'll find that it is exactly 10 bytes smaller than the corresponding WWF file.  This discrepancy can be explained by the presence of a 5-byte header (00FF00FF01) and a 5-byte footer (1000FF00FF) common to every WWF file.  This suggests that a WWF file is nothing more than a low-quality uncompressed WAV file, but the file formats differ in unusual ways.  For example, large values in the WAV file don't necessarily correspond to large data values in the WWF file.  When you load a WWF file into a sound editor and play it, you hear a warped, raspy sound that barely matches the sound in the WAV file.
Original WAV file.
WWF file, interpreted as 8-bit mono 3000 Hz signed WAV file.
WWF file, interpreted as 8-bit mono 3000 Hz unsigned WAV file.
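The framing described above can be sketched in a few lines of Python.  The header and footer bytes are the ones observed in every WWF file; the function names are hypothetical, and the raw PCM data is assumed to be the WAV file already converted to 8-bit mono at 3000 Hz with its WAV header removed.

```python
WWF_HEADER = bytes.fromhex("00FF00FF01")  # first 5 bytes of every WWF file
WWF_FOOTER = bytes.fromhex("1000FF00FF")  # last 5 bytes of every WWF file
FRAMING_SIZE = len(WWF_HEADER) + len(WWF_FOOTER)


def strip_framing(wwf_bytes):
    """Return the WWF payload with the fixed header and footer removed."""
    if (len(wwf_bytes) < FRAMING_SIZE
            or not wwf_bytes.startswith(WWF_HEADER)
            or not wwf_bytes.endswith(WWF_FOOTER)):
        raise ValueError("data does not have the expected WWF header/footer")
    return wwf_bytes[len(WWF_HEADER):len(wwf_bytes) - len(WWF_FOOTER)]


def add_framing(payload):
    """Wrap a payload in the header and footer so it can be uploaded."""
    return WWF_HEADER + payload + WWF_FOOTER


def size_matches(raw_pcm, wwf_bytes):
    """Check the 10-byte size difference between raw PCM and WWF data."""
    return len(wwf_bytes) - len(raw_pcm) == FRAMING_SIZE
```

`add_framing` is also what makes it possible to crop a section out of a WWF payload and upload it on its own.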

I've been experimenting with various theories on how to decode the WWF format.  Here are some descriptions of my experiments:

  • Masking.  One possibility is that the WWF file is simply the WAV data masked with a special key, a simple encoding that could easily be reversed in hardware.  However, I noticed that after removing small chunks of data from the file, it could still be uploaded and played correctly by the figure.  Subsections of a WWF file can also be cropped out and uploaded without a problem (provided the header and footer are added).  This leads me to conclude that the encoding operates locally, on individual bytes or small groups of bytes, rather than on the file as a whole.

  • Bit Muddling.  I considered the possibility that small sections of the file (or even single bytes) were being rotated or swapped.  I wrote several programs that loaded WWF and WAV files, mangled the bytes in various ways, and compared the data.  Unfortunately, none of these efforts bore fruit.

  • Modulation.  One possibility is that the sound is modulated onto a carrier sine wave.  To isolate any frequency components that had been added to the original signal, I compared the Fourier transforms of the WWF data and the WAV data, but I didn't find any obvious additions.


    FFT of PCM data.


    FFT of WWF data.

  • Bleeps.  Some of the audio clips available from the WWF Wired website include bleeping noises that cover profanities spoken by Stone Cold Steve Austin.  One would expect these sections of the audio clip to be telling, since a bleep is a regular sine wave.  Strangely enough, the WWF data corresponding to a bleep doesn't contain a regularly repeating pattern.
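The bit-muddling experiments can be illustrated with a small search harness like the one below.  This is a sketch of the general idea and not the programs I actually wrote: it applies a handful of candidate per-byte transforms to the PCM data and reports how often each one reproduces the WWF payload.  The list of transforms and all of the function names are assumptions chosen for illustration.

```python
def rotl8(b, n):
    """Rotate an 8-bit value left by n bits (1 <= n <= 7)."""
    return ((b << n) | (b >> (8 - n))) & 0xFF


def reverse_bits(b):
    """Reverse the bit order of an 8-bit value."""
    return int(f"{b:08b}"[::-1], 2)


def candidate_transforms():
    """Yield (name, function) pairs for simple per-byte encodings."""
    yield "identity", lambda b: b
    yield "complement", lambda b: b ^ 0xFF
    yield "toggle sign bit", lambda b: b ^ 0x80
    yield "reverse bits", reverse_bits
    for n in range(1, 8):  # n = 4 is a nibble swap
        yield f"rotate left {n}", lambda b, n=n: rotl8(b, n)


def score(pcm, wwf_payload, transform):
    """Fraction of positions where transform(pcm byte) equals the WWF byte."""
    pairs = list(zip(pcm, wwf_payload))
    if not pairs:
        return 0.0
    hits = sum(1 for p, w in pairs if transform(p) == w)
    return hits / len(pairs)


def rank_transforms(pcm, wwf_payload):
    """Return (score, name) for every candidate, best match first."""
    results = [(score(pcm, wwf_payload, fn), name)
               for name, fn in candidate_transforms()]
    return sorted(results, reverse=True)
```

Because the PCM data has been resampled, exact byte-for-byte matches are unlikely even if one of these transforms is close to the truth, so a score well above the others is more meaningful than a perfect score.  A tolerance-based comparison would be a natural refinement.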

Although I haven't been able to crack the WWF format specification yet, I feel that I've made good progress.  I'll continue to post my findings to this site; be sure to e-mail me if you discover anything new.  Let's hack Stone Cold!