[games_access] Answers from Valve about Closed Captioning

Tue Jun 28 23:25:01 EDT 2005

Below you'll find 5 questions I sent to Valve that are answered by
Yahn Bernier about their closed captioning experience with HL2. He's
got some really good answers and he gave us permission to post them.
Feel free to put them on the game accessibility site, wiki, etc.

His answers are very similar to my experience with the Doom3[CC] mod.
It also took us 2 weeks (with two programmers on the weekends) to get
the fundamentals working and then it took us a lot longer to fine tune
the system.

-Reid 

-----Original Message-----
 From: Yahn Bernier [mailto:YahnBernier at valvesoftware.com]
 Sent: Mon 6/27/2005 1:18 PM
 To: Reid Kimball
 Cc: Ken Birdwell; Bill Van Buren; Greg Coomer; Marc Laidlaw
 Subject: RE: Questions regarding closed captioning system

 1. How long did it take to design and program the captioning system,
the system that recognizes that a sound has played and displays its
relevant text on screen given various criteria (if any, such as
distance from the player)?

 It probably took two weeks of my time to implement and refine the
systems code part of this. Because we changed all of our sounds to
play via an EmitSound() call and that call takes a shorthand 'name'
for the sound, it was easy to make that name be the captioning lookup
key as well. We also knew that we wanted certain sounds to display in
different colors (effects versus NPCs speaking, etc.) so we added the
ability to embed simple HTML-like tags into the actual localized
caption text (including coloration, bold/italics, line breaks,
non-default "linger" time, etc.). All of the code for the system is
built into the game and client .dlls and is in the public part of the
SDK code. I'd look there to see how we implemented this stuff.
Specifically hud_closecaption.cpp/.h in the cl_dll and the EmitSound
code and EmitCloseCaption code in the game .dll (sceneentity.cpp makes
use of this for acting scenes). Also we drew a distinction between
subtitling and closed captioning (subtitling just being dialogue, as if
you weren't hearing impaired but were listening to the dialogue in a
foreign language, for instance).
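
A minimal sketch of the idea described above, in Python rather than the
SDK's C++: the shorthand sound name doubles as the caption lookup key,
and simple HTML-like tags embedded in the caption text carry the color,
line breaks, and linger time. The sound names, the tag names
(`color`, `linger`, `sfx`, `br`, `b`), and the functions are all
hypothetical and chosen for illustration; they are not taken from
hud_closecaption.cpp, which is the place to look for the real format.

```python
import re

# Hypothetical caption table: sound shorthand name -> localized caption text.
CAPTIONS = {
    "npc.greeting": "<color:255,200,0>Guard: <b>Halt!</b><br>Who goes there?",
    "weapon.smg_fire": "<sfx><linger:2.5>[Machine gun fire]",
}

# Matches an opening tag with an optional ":argument", or a closing tag.
TAG_RE = re.compile(r"<([a-z]+)(?::([^>]*))?>|</([a-z]+)>")


def parse_caption(text):
    """Strip the (hypothetical) tags and return (plain_text, attributes)."""
    attrs = {"color": (255, 255, 255), "linger": None, "sfx": False}
    pieces = []
    pos = 0
    for match in TAG_RE.finditer(text):
        pieces.append(text[pos:match.start()])
        pos = match.end()
        name, arg = match.group(1), match.group(2)
        if name == "color" and arg:
            attrs["color"] = tuple(int(c) for c in arg.split(","))
        elif name == "linger" and arg:
            attrs["linger"] = float(arg)
        elif name == "sfx":
            attrs["sfx"] = True
        elif name == "br":
            pieces.append("\n")
        # Bold/italic and closing tags are simply dropped in this sketch.
    pieces.append(text[pos:])
    return "".join(pieces), attrs


def caption_for_sound(sound_name):
    """Look up a caption using the same name the sound was emitted with."""
    text = CAPTIONS.get(sound_name)
    return None if text is None else parse_caption(text)
```

Because the lookup key is the same name passed when the sound is
emitted, a sound with no entry in the table just produces no caption,
and no separate mapping from sounds to captions has to be maintained.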

 2. What was the most difficult aspect of the system to implement?

 Tuning the system was much harder than actually programming it. Marc
Laidlaw, our writer, and Bill Van Buren, who worked closely with our
voice actors, had to go through and tag a lot of the dialog in the
script. Luckily everything was in one centralized Unicode text file so
they could work in there as needed. Marc and Bill had to spend a lot of
time watching the captions in the game and tuning them, especially
captions for weapon and environmental sounds. We allowed each caption
to specify how much time had to elapse before the same caption would be
seen again. Something like a machine gun, therefore, would show up
with a single line caption every few seconds instead of a steady
scroll of captions.
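
The per-caption repeat delay can be sketched roughly as follows. This
is an illustration in Python under assumed names (`CaptionThrottle`,
`should_show`, and the 3-second delay are all hypothetical), not the
code from the SDK.

```python
class CaptionThrottle:
    """Suppress a caption that was already shown too recently."""

    def __init__(self):
        # Caption name -> time (in seconds) at which it was last displayed.
        self._last_shown = {}

    def should_show(self, name, now, no_repeat_seconds):
        last = self._last_shown.get(name)
        if last is not None and now - last < no_repeat_seconds:
            return False  # Still inside the caption's no-repeat window.
        self._last_shown[name] = now
        return True


# A machine gun emitting the same sound repeatedly, with a hypothetical
# 3-second no-repeat window on its caption.
throttle = CaptionThrottle()
fire_times = (0.0, 0.1, 0.2, 3.5, 3.6, 7.0)
shown = [t for t in fire_times
         if throttle.should_show("weapon.smg_fire", t, 3.0)]
```

With these inputs only the shots at 0.0, 3.5, and 7.0 seconds produce a
caption, so the player sees one line every few seconds rather than one
per shot.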

 3. I believe Valve has a custom tool that allows captions to be
created for a sound file. How long did that take to develop? Was the
development worth it? Did it save time in the creation of the closed
captions?

 We used faceposer to extract phoneme data for sound files (the
extractor part of it is actually a separate .dll, so we have some
command line tools to do batch processing which we used for phonemes
in the localized versions of the game). One of the steps was to type
in the text of a .wav as an initial hint to the extraction system.
That system drives the facial animation. The text and phonemes are
stored directly in the .wav file (the .wav file format is RIFF which
allows custom chunks to be embedded in the .wav). We were able to use
one of our tools to extract the phoneme-related text from these .wav
files and use that as a first pass at the English captioning data. It
was an unintentional benefit of the facial animation system that we
had most of the English captions roughed out automatically.
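
As a rough illustration of how text can ride along inside a .wav file,
the Python sketch below walks the chunks of a RIFF/WAVE file and pulls
out a custom one. The RIFF layout (a 4-byte chunk ID, a 4-byte
little-endian size, the data, and a pad byte after odd-sized data) is
standard; the chunk ID `capt` and the helper names are hypothetical and
are not the ID or tools Valve uses.

```python
import struct


def iter_riff_chunks(data):
    """Yield (chunk_id, body) for each top-level chunk in a RIFF/WAVE file."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        yield chunk_id, data[pos + 8:pos + 8 + size]
        # Chunks are word-aligned: odd-sized data is followed by a pad byte.
        pos += 8 + size + (size & 1)


def make_chunk(chunk_id, body):
    """Serialize one chunk, adding the pad byte when the body size is odd."""
    pad = b"\x00" if len(body) & 1 else b""
    return chunk_id + struct.pack("<I", len(body)) + body + pad


def build_demo_wav():
    """Build a tiny in-memory .wav carrying a hypothetical 'capt' text chunk."""
    # PCM, mono, 8000 Hz, 8000 bytes/s, block align 1, 8 bits per sample.
    fmt = make_chunk(b"fmt ", struct.pack("<HHIIHH", 1, 1, 8000, 8000, 1, 8))
    audio = make_chunk(b"data", b"\x80" * 5)  # Odd length exercises padding.
    text = make_chunk(b"capt", b"Halt! Who goes there?")
    payload = b"WAVE" + fmt + audio + text
    return b"RIFF" + struct.pack("<I", len(payload)) + payload


chunks = dict(iter_riff_chunks(build_demo_wav()))
first_pass_caption = chunks[b"capt"].decode("ascii")
```

Audio players that do not recognize a chunk ID skip over it using the
size field, which is why extra data such as text can be stored in the
file without breaking playback.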

 4. Is there anything you would design/implement differently if you
were to design another captioning system?

 There were a few things I read about on-line at deafgamers.com. The
main thing we didn't put in the UI was a history view or a way to dump
out the captions to a text file so you could read through the
transcript like a screenplay.

 5. Was there any discussion about the use of colors in the captioning
text and how other cultures may perceive those color assignments?

 Yes, actually. We initially only colored world effects differently
from speech. All speech was white. When we had hearing-impaired
testers come in, their feedback was that it was difficult to figure
out who was speaking since all of the text was white. At that point,
we went back into the captions and added coloration tags to each main
NPC in the game to differentiate them from each other during acted
scenes. I don't believe that we looked at cultural perception issues
with the colors when they were chosen. That would be a good question
for Greg Coomer and Marc Laidlaw.