Design Considerations for Creating
Avatars for a 3D Virtual Environment
Stasia McGehee, 9/22/2001
I. The Creation of Avatars for OnLive! Technologies' Traveler - http://www.onlive.com
Briefly, I will discuss the decision-making process that resulted in a rather idiosyncratic approach to avatar design, and how our team at OnLive! adopted a "form follows function" approach, designing avatars and environments that were compelling while meeting the strict requirements of the technology. By abandoning the literal paradigms of reality, such as full bodies and texture maps, we created avatars that function not as real-life representations of self, but rather as totems, powerful spirit beings that we can identify with, gaining strength and power from the association. Yet an avatar may also function as a mask, obscuring one's identity as well as augmenting it. As very stylized representations of self, the 200+ polygon avatars in Traveler, circa 1996, were also informed by the notion of totems, archetypes, alter egos, and ceremonial masks, relying upon the simultaneous magic of audio, real-time lip sync, and autonomous gestures to provide the suspension of disbelief.
User Customization
I will also talk about the importance of user
customization. Within a rather limited color palette we were able
to offer a high degree of customization just by allowing users to color
their avatars according to specially designed color regions, specified
by the designer. Users can also squash and stretch their avatar;
choose from a palette of four transitory emotions that can vary from avatar
to avatar; and alter their voice. Although we thought that the ability
to disguise one's voice would be a necessity, a feature that obscured
or hindered communication proved to undermine the whole purpose of the
application. Ultimately, voice disguising was limited to altering
one's pitch, and even this was viewed as a novelty.
II. Design by Necessity - The Decision-Making Process that Led to the OnLive! Severed Heads
I joined OnLive! Technologies in February 1994, responsible for avatar design and development. Our team began with the notion that we would create huge virtual communities with fully articulated walking and talking avatars. Feverishly we worked over Christmas, so that by January 1995 we had an actual prototype of a textured avatar walking, talking, and blinking through a heavily texture-mapped cyberspace, ready for CES (the Consumer Electronics Show) in Las Vegas, NV. From our hotel suite we listened excitedly through the walls as Geffen, Katzenberg, and Spielberg of DreamWorks fame ogled our lone prototype in the adjacent room.
But the real test of our abilities occurred a few months later, when we realized that our voice-enabled creation took up the entire bandwidth of a Pentium 60. We had a community of One. Clearly our well-wrought plan was undeployable. The day before an important board meeting, one of our founders called a mandatory meeting with the entire product development team, including about 15 engineers and designers. He admitted his worst fear - that the idea upon which our tiny company was based was not feasible. After an unproductive interlude, he left us, admonishing, "When I come back in an hour, I want you to give me a proposal that I can present to the board." Left to our own conjectures, one engineer flippantly advised, "Let’s just lop off the heads. Who needs bodies anyway?" When the founder returned he was none too enamored with the idea, but no other suggestions surfaced, so we asked the board to give us a few days to come up with a streamlined version of our current prototype. This was accomplished by eliminating all texture maps and stripping out the now-extraneous body parts, so that, before long, we could accommodate up to five simultaneous users in a single room, each sporting a streamlined, untextured, Gouraud-shaded head.
And that’s how, after 14 months of R&D, we settled on an idea offered in jest, the glib by-product of a harried brainstorming session. But it was actually a wise decision. The elimination of the bodies also eliminated many opportunities for feature creep, allowing us to focus on our core technology – a 3D spatialized audio product with avatars that boasted real-time lip sync and emotions. By avoiding extraneous detail, like body parts, we never set up false expectations of being able to manipulate objects in the universe, a feature we were not prepared to implement. Nor did we need to worry about integrating the emotional content of facial expressions with the ongoing gestures of the body.
Internal Debates
Needless to say, there was much dissent throughout
the company, and we lost a few key folks over a decision that seemed absurd.
There was already a schism in the company - between those who strictly
wanted to develop a chat application, a harmonious environment that would
facilitate communication; and those who dreamed of creating a vast multi-user
gaming environment, where one could blow people up. By eliminating
bodies, and the ability for users to manipulate their environments, we
were postponing indefinitely the ability to tailor our application to game
developers, a major source of disappointment for some. By emphasizing
face-to-face contact, we were positioning ourselves almost exclusively
as a 3D chat product, competing with the likes of AOL, or even with the
phone company. But, despite internal opposition, a prototype was
created; and a few company meetings later, our CEO spoke to us virtually
from the comfort of her office - her silver head addressing us from a monitor
set at the head of the table in the board room.
The Role of Gesture in a Disembodied Universe
Despite our intentions to focus entirely on lip
sync and facial expressions, gesture and body language do play a crucial
role within OnLive!’s 3D environments. Even in the absence of bodies,
users developed a lexicon of motions to express levels of excitement and
enthusiasm. Users quickly noted that certain combinations of keys allow
one’s avatar to spin, roll, and twirl through space. Collision detection,
augmented by the clonking sound of two skulls colliding, provides a game-like
element, and is often employed by unruly members, to the annoyance of those
more inclined to converse. Users also take care to position more engaging
users within the center of their 90 degree field of view. Thus, body language
and gesture seem to be an inherent aspect of 3D immersive environments,
whether you provide for it or not.
III. Aesthetic Influences
As an avatar designer, my most important task was to
provide the world with attractive representations of self within a
200-polygon limit. In my effort to tackle the dilemma of providing widely
appealing content within a low poly budget, I drew upon various sources.
In terms of content, I strove to provide a range of entities, from realistic
(see Bruce Damer’s unintentional likeness) to more archetypal and mythological,
like the Pharaoh, Vampire, Cyclops, and Pirate. I also noticed that
animals tended to be popular, so I created a variety of them. Avatar
creation became an iterative process, as the initial avatars generated
user input which influenced further development.
In addressing the low-poly problem, I looked
at the geometric patterns of African, Native American, or tribal art with
their stylized representations. Although working for a different
audience and purpose, these artists faced similar design constraints -
representing complex facial features through woodcarving
techniques that did not afford the luxury of minute refinements.
In essence, they too created highly expressive low poly characters, using
clean lines and flat shades of color.
For reference, The National Museum of African Art at the Smithsonian, in Washington D.C. proved to be an invaluable resource. In addition, the Smithsonian sells inexpensive cut-outs and coloring books of masks. I also studied the flat planes of Cubist and Italian Futurist Art, which also had its antecedents in indigenous art forms. The more I observed the design solutions offered by other forms of sculpture and painting, the easier it was to arrive at my own unique solutions.
Constructing Morph Targets for Speech
Most of the vertices tend to cluster around the
mouth region, to accommodate lip-syncing. As the artist, I had to
provide the extreme mouth positions, and the speech engine interpolated
between these positions. Later I also noticed that slight head movements
could be incorporated into the speech positions, so that the avatar’s head
could bob around as it spoke. An emotional palette was provided as well:
happy, sad, angry, and mad. While idle, the avatars randomly cycle
through the emotional palette at about 1-2% intensity, giving the impression
that the avatar is breathing.
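The blending described above can be sketched in a few lines. This is a minimal illustration, not the actual OnLive! speech engine: it assumes simple linear interpolation between the neutral pose and an artist-authored extreme mouth position, plus a hypothetical low-intensity idle weight for the "breathing" effect.

```python
import random

def blend_vertices(neutral, target, weight):
    """Linearly interpolate every vertex between the neutral pose
    and a morph target (an extreme mouth position)."""
    return [
        tuple(n + weight * (t - n) for n, t in zip(nv, tv))
        for nv, tv in zip(neutral, target)
    ]

def idle_emotion_weight():
    """While idle, apply an emotion at roughly 1-2% intensity so the
    head appears to breathe (illustrative range)."""
    return random.uniform(0.01, 0.02)

# Usage: a three-vertex "mouth" blended halfway toward an open pose.
neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, 0.2, 0.0)]
open_mouth = [(0.0, -0.1, 0.0), (1.0, -0.1, 0.0), (0.5, 0.6, 0.0)]
frame = blend_vertices(neutral, open_mouth, 0.5)
```

The engine would evaluate such a blend every frame, with the weight driven by the phoneme detected in the audio stream.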
User Customization
One of the most compelling aspects of the OnLive!
Avatars is the users’ ability to customize their avatars. Although we had
grand schemes of creating fully customizable parameterized heads, for the
first pass we merely allowed users to alter their colors and to do a global
squash and stretch. In designing avatars with customizable color regions,
I paid close attention to the orientation of each and every facet, conscious
of how faces could be tailored into customizable patterns. Used in conjunction
with the global squash and stretch, these regions let users come up with
creations that I barely recognized. This small degree of customization
keeps users occupied for hours, and allows ample variation within a fairly
limited range of choices. Originally we also offered several variations
of voice disguising, but this ended up not being nearly as important as
we had surmised; ultimately we only allowed users to alter their pitch.
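The two customizations above reduce to simple operations on the mesh data. The sketch below is hypothetical (region names and colors are invented, and the real client's data layout is not documented here): each facet carries a designer-assigned region tag that the user's palette choice colors, and the squash and stretch is a global non-uniform scale.

```python
def apply_region_colors(facet_regions, region_palette):
    """Map each facet to the user's chosen color for its
    designer-specified region."""
    return [region_palette[region] for region in facet_regions]

def squash_stretch(vertices, sx=1.0, sy=1.0, sz=1.0):
    """Globally scale the head; sy > 1 stretches it taller,
    sy < 1 squashes it flatter."""
    return [(x * sx, y * sy, z * sz) for x, y, z in vertices]

# Usage: three facets in two hypothetical regions, then a vertical stretch.
facets = ["crest", "face", "crest"]
palette = {"crest": (200, 30, 30), "face": (240, 210, 180)}
colors = apply_region_colors(facets, palette)
head = squash_stretch([(0.0, 1.0, 0.0)], sy=1.5)
```

Because the regions follow facet orientation rather than anatomical boundaries, two users with the same palette can still produce strikingly different faces.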
IV. Future Directions for Avatar Design
Higher Poly Avatars
Future directions for avatar design include creating
higher poly avatars and adding textures for realism, possibly appealing
to business applications or one-on-one contexts. This was
accomplished with some degree of success when Traveler was revived at
Communities.com in 2000, with the addition of several new 800-poly avatars.
Although the higher poly avatars were more difficult to author, the more
powerful CPUs accommodated them with ease.
User Customization
High on the list is User Customization.
Currently the users can color their avatar, but they are limited to a 24-color
palette. Giving them a 32-bit color picker could be a next step.
We have always discussed the need for parameterized heads. This would
presume a single head with a fixed topology that could be tweaked in an
infinite number of ways. This could compensate for the fact that
3D authoring is a difficult skill; very few of our users have been able
to make compelling 3D avatars on their own. But if they were given
a head with selectable regions that could be controlled by various slider
bars, they could author an anthropomorphic avatar without having to master
the medium. In 2001, the parameterized facial system once under
consideration at OnLive! resurfaced under the guise of www.appeerance.com,
after further development by Creative Director Steve DiPaola and 3D Graphics
Engineer Dave Collins.
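A parameterized head of this kind is commonly built as a base mesh of fixed topology plus a set of precomputed per-vertex deltas, one per slider. The sketch below is a minimal illustration of that idea, not the appeerance.com system; the slider names and delta values are invented.

```python
def parameterize(base, deltas, sliders):
    """Build a head from a fixed-topology base mesh plus slider-weighted
    vertex deltas.

    base:    list of (x, y, z) vertices
    deltas:  slider name -> per-vertex (dx, dy, dz) offsets
    sliders: slider name -> weight in [0, 1]
    """
    result = list(base)
    for name, weight in sliders.items():
        result = [
            (x + weight * dx, y + weight * dy, z + weight * dz)
            for (x, y, z), (dx, dy, dz) in zip(result, deltas[name])
        ]
    return result

# Usage: two hypothetical sliders acting on a two-vertex "mesh".
base = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
deltas = {
    "nose_length": [(0.0, 0.0, 0.5), (0.0, 0.0, 0.0)],
    "jaw_width":   [(0.0, 0.0, 0.0), (0.3, 0.0, 0.0)],
}
head = parameterize(base, deltas, {"nose_length": 1.0, "jaw_width": 0.5})
```

Because the topology never changes, every slider combination yields a mesh that still works with the existing lip-sync morph targets, which is what makes this approach attractive for users who cannot author 3D content themselves.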
Increase the Emotional Range
We could also increase the emotional range by
providing Attitudes. Unlike our current emotions, attitudes would
not be transitory, but could color the avatar’s whole personality.
In addition to the ability to select a fixed attitude, we could also offer
a palette of transitory emotions, more complex than our current selection
of four.
More Convincing Lip Sync
The speech engine that drives the real-time lip
sync was developed by Shankar Narayan over a period of several years while
at Apple Computer. Shankar's goal was to create a speech engine
so accurate that one could read an avatar's lips. However, the
OnLive! implementation of this speech engine was not so granular.
For the short term, we could add another parameter to accommodate the "oo"
or "w" sound. This could be calculated as a negative value of the
widest lip position, so that a trained 3D artist would not have to be conscripted
to re-author this morph target for existing content.
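The proposed shortcut can be sketched directly: the "oo" pose is derived by reflecting the widest-lip pose's vertex offsets back through the neutral pose, so existing avatars gain the new target without re-authoring. The data below is hypothetical and the real morph format may differ.

```python
def derive_oo_target(neutral, widest):
    """Derive an 'oo'/'w' morph target as the negative of the widest
    lip position: where the wide pose pulls a vertex outward from
    neutral, the derived pose pulls it inward by the same amount."""
    return [
        tuple(n - (w - n) for n, w in zip(nv, wv))
        for nv, wv in zip(neutral, widest)
    ]

# Usage: two mouth-corner vertices pulled outward in the widest pose...
neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
widest  = [(-0.2, 0.0, 0.0), (1.2, 0.0, 0.0)]
# ...become corners pushed inward toward the center in the derived pose.
oo = derive_oo_target(neutral, widest)
```

The approximation is crude - a real "oo" pose also purses the lips forward - but it is computable from data every existing avatar already has, which is the point of the proposal.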