Speech Synthesis

History Of Speech Synthesis

     Sitting in a movie theater in the 1960s watching a space odyssey about two astronauts and a computer, the audience encountered a computer that could speak. The computer, named HAL, not only spoke; he was also friendly and understanding.
     HAL was definitely ahead of its time. For most of the people in the audience a computer was something out of science fiction. Its typical embodiment was an array of tall cases containing spinning tapes, a large box for the computer's memory and CPU (central processing unit), and machines that printed out pages and pages of wide sheets filled with numbers and obscure symbols. A few of these viewers may have been familiar with punched computer cards for bills. In all likelihood, this was the extent of their real-life experiences with computers. Even those who used one in the sixties interacted with machines in an extremely cumbersome way, using decks of punched cards to submit information and receiving large printouts in return. The way we interact with computers today -- by typing on a keyboard to input information and receiving responses on a video screen -- was just being designed. Spoken communication with a computer was a luxury that existed only in science-fiction books and the movies.
     Another novel aspect of HAL was his voice. Before HAL, an actor speaking as a computer deliberately created a stylized, mechanical, "robotic" voice. That mechanical sound was the viewer's cue that a computer or robot was speaking. 2001, however, featured a different kind of talking computer, a computer who spoke in a friendly, warm, and (often) emotional voice. Rather than conforming to the expectations about computer voices, 2001 presented the possibility that future computers would speak and function like human beings. HAL's warm emotional nature was even more striking when contrasted with the demeanors of his traveling companions. The actors portrayed astronauts Frank Poole and Dave Bowman as cool scientists whose faces and voices were devoid of emotional expression. Their lack of human emotions accentuated the effect of HAL's amiable voice (see chapter 14).
     Given the limited familiarity most people had with computers in the sixties, was Arthur Clarke's conception of the talking computer visionary, or was it simply a futuristic fantasy? What was the state of the talking machine in the sixties? Could computers sing? What challenges does the creation of a talking or singing machine present? What is the present status of the speaking machine?
     I saw the movie 2001 when I was a physics graduate student at the University of Chicago computing the orbits of electrons. Simultaneously, I was studying music theory and composition in the music department. The film had no profound impact on my chosen career path. I remember thinking that HAL's voice sounded "too good" to be mistaken for a machine. When I finished my graduate work, I chose to do research in sound creation, and in particular speech sounds, because of my interest in computers, sound, and music. Consequently, I began working on speech analysis and synthesis at Bell Laboratories. Speech synthesis, or sound synthesis, refers to the process of creating a sound by machine or computer, rather than by such natural means as the human voice or a musical instrument. My initial interest involved modeling the intonation patterns (melodies) of speech, which led me to devote the next twenty-seven years to trying to create a talking machine. Later in my career, however, the movie 2001, and especially HAL, assisted me. Often, at social gatherings, people asked me what I did at Bell Labs. My standard reply, that I worked on talking computers, generally drew a blank until I referred to HAL in 2001. HAL provided a better explanation of my work than I could devise myself.


History of the Talking Machine

Early Mechanical Models
     Human fascination with talking machines is not new. For centuries, people have tried to empower machines with the ability to speak; prior to the machine age, humans even hoped to create speech for inanimate objects. The ancients attempted to show that their idols could speak, usually by hiding a person behind the figure or channeling voices through air tubes. The same method was used to produce the speech of inanimate objects in the movie 2001. Actor Douglas Rain gave HAL his voice, and recorded all his lines over one weekend without knowing the complete story of the film. Rain spoke in an attractive, mellow, expressive tone quite different from the usual mechanical monotone attributed to computers in that period.
     The first scientific attempts to construct talking machines were recorded in the eighteenth century. One such device was built in 1779 by C.G. Kratzenstein for the Imperial Academy of St. Petersburg. This device produced vowel sounds (/a/, /i/, /o/, ...) by blowing air through a reed into a variable-resonance chamber that resembled a human vocal tract, starting at the vocal cords and continuing through the mouth. In 1791, W. von Kempelen constructed a device capable of speaking whole utterances. It consisted of bellows that forced air through a reed to excite a resonance chamber. The shape of the resonance chamber was manipulated by the fingers of one hand to produce different vowel sounds. Consonant sounds were produced by different chambers controlled by the other hand.
By: Joseph P. Olive
