Realistic Audio Communication in a Three Dimensional Computer-Generated Virtual Environment

- Nortel Networks Limited

A participant in a three dimensional computer-generated virtual environment is able to control a dispersion pattern of his Avatar's voice such that the Avatar's voice may be directionally enhanced using simple controls. The audio dispersion envelope is designed to extend further in front of the Avatar and less to the sides and rear of the Avatar. The audio dispersion envelope may be static or controllable by the participant to enable the distance that the Avatar's voice travels within the virtual environment to be adjusted. This enables the Avatar to whisper or “shout” in the virtual environment such that other Avatars normally outside of hearing range of the Avatar may selectively receive audio generated by the user. Separate audio streams are mixed for each user from audio generated by users with Avatars within their Avatar's dispersion envelope. The volume of audio from a user in the mixed audio stream depends on the separation of the Avatars within the virtual environment.

Description
CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to U.S. Provisional Patent Application No. 61/037,447, filed Mar. 18, 2008, entitled “Method and Apparatus For Providing 3 Dimensional Audio on a Conference Bridge”, the content of which is hereby incorporated herein by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to virtual environments and, more particularly, to a method and apparatus for implementing realistic audio communications in a three dimensional computer-generated virtual environment.

2. Description of the Related Art

Virtual environments simulate actual or fantasy 3-D environments and allow for many participants to interact with each other and with constructs in the environment via remotely-located clients. One context in which a virtual environment may be used is in connection with gaming, although other uses for virtual environments are also being developed.

In a virtual environment, an actual or fantasy universe is simulated within a computer processor/memory. Multiple people may participate in the virtual environment through a computer network, such as a local area network or a wide area network such as the Internet. Each player selects an “Avatar” which is often a three-dimensional representation of a person or other object to represent them in the virtual environment. Participants send commands to a virtual environment server that controls the virtual environment to cause their Avatars to move within the virtual environment. In this way, the participants are able to cause their Avatars to interact with other Avatars and other objects in the virtual environment.

A virtual environment often takes the form of a virtual-reality three dimensional map, and may include rooms, outdoor areas, and other representations of environments commonly experienced in the physical world. The virtual environment may also include multiple objects, people, animals, robots, Avatars, robot Avatars, spatial elements, and objects/environments that allow Avatars to participate in activities. Participants establish a presence in the virtual environment via a virtual environment client on their computer, through which they can create an Avatar and then cause the Avatar to “live” within the virtual environment.

As the Avatar moves within the virtual environment, the view experienced by the Avatar changes according to where the Avatar is located within the virtual environment. The views may be displayed to the participant so that the participant controlling the Avatar may see what the Avatar is seeing. Additionally, many virtual environments enable the participant to toggle to a different point of view, such as from a vantage point outside of the Avatar, to see where the Avatar is in the virtual environment.

The participant may control the Avatar using conventional input devices, such as a computer mouse and keyboard. The inputs are sent to the virtual environment client, which forwards the commands to one or more virtual environment servers that are controlling the virtual environment and providing a representation of the virtual environment to the participant via a display associated with the participant's computer.

Depending on how the virtual environment is set up, an Avatar may be able to observe the environment and optionally also interact with other Avatars, modeled objects within the virtual environment, robotic objects within the virtual environment, or the environment itself (i.e. an Avatar may be allowed to go for a swim in a lake or river in the virtual environment). In these cases, client control input may be permitted to cause changes in the modeled objects, such as moving other objects, opening doors, and so forth, which optionally may then be experienced by other Avatars within the virtual environment.

“Interaction” by an Avatar with another modeled object in a virtual environment means that the virtual environment server simulates an interaction in the modeled environment, in response to receiving client control input for the Avatar. Interactions by one Avatar with any other Avatar, object, the environment or automated or robotic Avatars may, in some cases, result in outcomes that may affect or otherwise be observed or experienced by other Avatars, objects, the environment, and automated or robotic Avatars within the virtual environment.

A virtual environment may be created for the user, but more commonly the virtual environment may be persistent, in which it continues to exist and be supported by the virtual environment server even when the user is not interacting with the virtual environment. Thus, where there is more than one user of a virtual environment, the environment may continue to evolve when a user is not logged in, such that the next time the user enters the virtual environment it may be changed from what it looked like the previous time.

Virtual environments are commonly used in on-line gaming, such as for example in online role playing games where users assume the role of a character and take control over most of that character's actions. In addition to games, virtual environments are also being used to simulate real life environments to provide an interface for users that will enable on-line education, training, shopping, and other types of interactions between groups of users and between businesses and users.

As Avatars encounter other Avatars within the virtual environment, the participants represented by the Avatars may elect to communicate with each other. For example, the participants may communicate with each other by typing messages to each other or audio may be transmitted between the users to enable the participants to talk with each other.

Although great advances have been made in connection with the visual rendering and animation of Avatars, audio implementations have lagged, and the audio characteristics of a virtual environment are often not very realistic. Accordingly, it would be advantageous to be able to provide a method and apparatus for implementing more realistic audio communications in a three dimensional computer-generated virtual environment.

SUMMARY OF THE INVENTION

A method and apparatus for implementing realistic audio communications in a three dimensional computer-generated virtual environment is provided. In one embodiment, a participant in a three dimensional computer-generated virtual environment is able to control a dispersion pattern of his Avatar's voice such that the Avatar's voice may be directionally enhanced using simple controls. In one embodiment, an audio dispersion envelope is designed to extend further in front of the Avatar and to a lesser extent to the sides and rear of the Avatar. The shape of the audio dispersion envelope may be affected by other aspects of the virtual environment such as ceilings, floors, walls, and other logical barriers. The audio dispersion envelope may be static or optionally controllable by the participant to enable the Avatar's voice to be extended outward in front of the Avatar. This enables the Avatar to “shout” in the virtual environment such that other Avatars normally outside of hearing range of the Avatar can still hear the user. Similarly, the volume level of the audio may be reduced to allow the Avatar to whisper, or adjusted based on the relative positions of the Avatars and the directions in which the Avatars are facing. Individual audio streams may be mixed for each user of the virtual environment depending on the position and orientation of the user's Avatar in the virtual environment, the shape of the user's dispersion envelope, and which other Avatars are located within the user's dispersion envelope.

BRIEF DESCRIPTION OF THE DRAWINGS

Aspects of the present invention are pointed out with particularity in the appended claims. The present invention is illustrated by way of example in the following drawings in which like references indicate similar elements. The following drawings disclose various embodiments of the present invention for purposes of illustration only and are not intended to limit the scope of the invention. For purposes of clarity, not every component may be labeled in every figure. In the figures:

FIG. 1 is a functional block diagram of a portion of an example system enabling users to have access to a three dimensional computer-generated virtual environment;

FIG. 2 is a two dimensional view of users in an example three dimensional computer-generated virtual environment and showing a normal audio dispersion envelope of the users in the three dimensional computer generated virtual environment;

FIG. 3 is a two dimensional view of users in an example three dimensional computer-generated virtual environment and showing a directional audio dispersion envelope of the users in the three dimensional computer generated virtual environment;

FIGS. 4-5 show two examples of user controllable audio dispersion envelopes to enable the user to project his voice in the three dimensional computer-generated virtual environment according to an embodiment of the invention;

FIGS. 6 and 7 show interaction of the audio dispersion envelope with obstacles in an example three dimensional computer-generated virtual environment according to an embodiment of the invention;

FIG. 8 is a flow chart showing a process of implementing realistic audio communications in a three dimensional computer-generated virtual environment;

FIG. 9 is a functional block diagram showing components of the system of FIG. 1 interacting to enable audio to be transmitted between users of the three dimensional computer-generated virtual environment according to an embodiment of the invention;

FIG. 10 is a diagram of three dimensional coordinate space showing dispersion envelopes in three dimensional space; and

FIG. 11 is a three dimensional view of a virtual environment.

DETAILED DESCRIPTION

The following detailed description sets forth numerous specific details to provide a thorough understanding of the invention. However, those skilled in the art will appreciate that the invention may be practiced without these specific details. In other instances, well-known methods, procedures, components, protocols, algorithms, and circuits have not been described in detail so as not to obscure the invention.

FIG. 1 shows a portion of an example system 10 showing the interaction between a plurality of users 12 and one or more virtual environments 14. A user may access the virtual environment 14 from their computer 22 over a packet network 16 or other common communication infrastructure. The virtual environment 14 is implemented by one or more virtual environment servers 18. Audio may be transmitted between the users 12 by one or more communication servers 20.

The virtual environment may be implemented using one or more instances, each of which may be hosted by one or more virtual environment servers. Where there are multiple instances, the Avatars in one instance are generally unaware of Avatars in the other instances; however, a user may have a presence in multiple worlds simultaneously through several virtual environment clients. Conventionally, each instance of the virtual environment may be referred to as a separate World. In the following description, it will be assumed that the Avatars are instantiated in the same world and hence can see and communicate with each other. A world may be implemented by one virtual environment server 18, or may be implemented by multiple virtual environment servers. The virtual environment is designed as a visual representation of a real-world environment that enables humans to interact with each other and communicate with each other in near-real time.

Generally, a virtual environment will have its own distinct three dimensional coordinate space. Avatars representing users may move within the three dimensional coordinate space and interact with objects and other Avatars within the three dimensional coordinate space. The virtual environment servers maintain the virtual environment and generate a visual presentation for each user based on the location of the user's Avatar within the virtual environment. The view may also depend on the direction in which the Avatar is facing and the selected viewing option, such as whether the user has opted to have the view appear as if the user was looking through the eyes of the Avatar, or whether the user has opted to pan back from the Avatar to see a three dimensional view of where the Avatar is located and what the Avatar is doing in the three dimensional computer-generated virtual environment.

Each user 12 has a computer 22 that may be used to access the three-dimensional computer-generated virtual environment. The computer 22 will run a virtual environment client 24 and a user interface 26 to the virtual environment. The user interface 26 may be part of the virtual environment client 24 or implemented as a separate process. A separate virtual environment client may be required for each virtual environment that the user would like to access, although a particular virtual environment client may be designed to interface with multiple virtual environment servers. A communication client 28 is provided to enable the user to communicate with other users who are also participating in the three dimensional computer-generated virtual environment. The communication client may be part of the virtual environment client 24, the user interface 26, or may be a separate process running on the computer 22.

The user may see a representation of a portion of the three dimensional computer-generated virtual environment on a display/audio 30 and input commands via a user input device 32 such as a mouse, touch pad, or keyboard. The display/audio 30 may be used by the user to transmit/receive audio information while engaged in the virtual environment. For example, the display/audio 30 may be a display screen having a speaker and a microphone. The user interface generates the output shown on the display under the control of the virtual environment client, and receives the input from the user and passes the user input to the virtual environment client. The virtual environment client enables the user's Avatar 34 or other object under the control of the user to execute the desired action in the virtual environment. In this way the user may control a portion of the virtual environment, such as the person's Avatar or other objects in contact with the Avatar, to change the virtual environment for the other users of the virtual environment.

Typically, an Avatar is a three dimensional rendering of a person or other creature that represents the user in the virtual environment. The user selects the way that their Avatar looks when creating a profile for the virtual environment and then can control the movement of the Avatar in the virtual environment such as by causing the Avatar to walk, run, wave, talk, or make other similar movements. Thus, the block 34 representing the Avatar in the virtual environment 14 is not intended to show how an Avatar would be expected to appear in a virtual environment. Rather, the actual appearance of the Avatar is immaterial since the appearance of each user's Avatar may be expected to be somewhat different and customized according to the preferences of that user. Since the actual appearance of the Avatars in the three dimensional computer-generated virtual environment is not important to the concepts discussed herein, Avatars have generally been represented herein using simple geometric shapes such as cubes and diamonds, rather than complex three dimensional shapes such as people and animals.

FIG. 2 shows a portion of an example three dimensional computer-generated virtual environment, showing normal audio dispersion envelopes associated with Avatars in the three dimensional computer generated virtual environment. FIG. 2 has been shown in two dimensions for ease of illustration and to more clearly show how audio dispersion occurs; audio dispersion occurs in the same manner in the vertical direction as well. To simplify the explanation, most of the description uses two-dimensional figures to explain how audio dispersion may be affected in the virtual environment. It should be remembered that many virtual environments are three dimensional and, hence, the audio dispersion envelopes will extend in all three dimensions. Extension of the two dimensional audio dispersion envelopes to three dimensions is straightforward, and several examples of this are provided in connection with FIGS. 10 and 11. Thus, although the description may focus on two dimensions, the invention is not limited in this manner, as the same principles used to control the X and Y coordinate dispersion properties may be used to control the vertical (Z coordinate) dispersion.

For example, FIG. 10 shows another example in which a first Avatar is located at XYZ coordinates (2, 2, 25), and a second Avatar is located at XYZ coordinates (15, 15, 4). Viewing only the X and Y coordinates, the two Avatars are approximately 18.4 units apart. If the audio dispersion envelope is 20 units, then looking only at the X and Y coordinates the two Avatars should be able to talk with each other. However, when the Z coordinate is factored in, the separation between the two Avatars in three dimensional space is approximately 27.9 units, which is well beyond the 20 unit reach of the dispersion envelope. Accordingly, in a three dimensional virtual environment it may often be necessary to consider the Z coordinate separation of the two Avatars as well as the X and Y coordinate separation when determining whether the two Avatars can talk with each other. A mixed audio stream will be created for each user based on the position of the user's Avatar and the dispersion envelope for the Avatar, so that audio from each of the users represented by an Avatar within the user's dispersion envelope can be mixed and provided to the user. In this way individual audio streams may be created for each user so that the user can hear audio from other users proximate their Avatar in the virtual environment. The volume of a particular user will be adjusted during the mixing process so that audio from close Avatars is louder than audio from Avatars that are farther away.
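For illustration only, the distance test above reduces to comparing Euclidean separation against the envelope radius. A minimal sketch in Python, using the example's coordinates and 20 unit radius (the variable names are illustrative, not part of the described system):

```python
import math

first = (2, 2, 25)    # first Avatar's XYZ position
second = (15, 15, 4)  # second Avatar's XYZ position
RADIUS = 20.0         # reach of the audio dispersion envelope, in units

xy_gap = math.dist(first[:2], second[:2])   # ~18.4: inside the envelope
xyz_gap = math.dist(first, second)          # ~27.9: outside the envelope

print(f"XY separation  {xy_gap:.2f} -> audible: {xy_gap <= RADIUS}")
print(f"XYZ separation {xyz_gap:.2f} -> audible: {xyz_gap <= RADIUS}")
```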

As shown in FIG. 2, in this example it has been assumed that five Avatars 34 are present in the viewable area of virtual environment 14. The Avatars have been labeled A through E for purposes of discussion. Typically each Avatar would be controlled by a separate user, although there may be instances where a user could control more than one Avatar. In the figure, the arrow shows the direction that the Avatar is facing in the virtual environment.

In the example shown in FIG. 2, users associated with Avatars are generally allowed to talk with each other as long as they are within a particular distance of each other. For example, Avatar A is not within range of any other Avatar and, accordingly, the user associated with Avatar A is not able to talk to the users associated with any of the other Avatars. Similarly, Avatars D and E are not sufficiently close to talk with each other, even though they are looking at each other. Avatars B and C can hear each other, however, since they are sufficiently close to each other. In the example shown in FIG. 2, the audio dispersion envelopes for each of the Avatars are spherical (circular in two dimensional space) such that an Avatar can communicate with any Avatar that is within a particular radial distance. When the user associated with the Avatar speaks, the other Avatars within the radial distance will hear the user. If the Avatars are too far apart they are not able to “talk” within the virtual environment and, hence, audio packets/data will not be transmitted between the users.

If users are closer together, the users will be able to hear each other more clearly, and as the users get farther apart the volume of the audio tapers off until, at the edge of the dispersion envelope, the contribution of a user's audio is reduced to zero. In one embodiment, the volume of a user's contribution is determined on a linear basis. Thus, in the example shown in FIG. 3, user A is located within a dispersion envelope. The volume is highest close to user A and tapers off linearly with distance from user A until the edge of the dispersion envelope is reached.
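A minimal sketch of this linear taper, assuming gain is expressed on a 0.0 to 1.0 scale (the function name and scale are illustrative):

```python
def linear_gain(distance: float, radius: float) -> float:
    """Gain of 1.0 at the speaker, tapering linearly to 0.0 at the
    edge of the dispersion envelope, and zero beyond it."""
    return max(0.0, 1.0 - distance / radius)
```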

Audio is mixed individually for each user of the virtual environment. The particular mix of audio will depend on which other users are within the dispersion envelope for the user, and their location within the dispersion envelope. The location within the dispersion envelope affects the volume with which that user's audio will be presented to the user associated with the dispersion envelope. Since the audio is mixed individually for each user, a shared audio bridge is not required; rather, an audio stream may be created individually for each user based on which other users are proximate that user's Avatar in the virtual environment.
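One way such a per-user mix could be assembled, sketched under the assumption that each user's audio arrives as frames of PCM samples (the data layout and helper names are illustrative):

```python
import math

def linear_gain(distance, radius):
    # same linear taper sketched above
    return max(0.0, 1.0 - distance / radius)

def mix_for(listener_pos, speakers, radius):
    """Build one mixed frame for a single listener; `speakers` is a
    list of (position, samples) pairs, one per other user.  Only
    speakers inside the dispersion envelope contribute, weighted by
    their distance from the listener."""
    mixed = None
    for position, samples in speakers:
        gain = linear_gain(math.dist(listener_pos, position), radius)
        if gain == 0.0:
            continue  # outside the envelope: no contribution
        if mixed is None:
            mixed = [0.0] * len(samples)
        for i, sample in enumerate(samples):
            mixed[i] += gain * sample
    return mixed if mixed is not None else []
```

Because the loop runs once per listener, no shared bridge is needed; each user simply receives the mix computed from their own position.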

FIG. 3 shows an embodiment of the invention in which the audio dispersion envelope may be shaped to extend further in front of the Avatar and to a lesser distance behind and to the left and right sides of the Avatar. This enables the Avatar to talk in a particular direction within the virtual environment while minimizing the amount of noise the Avatar generates for other users of the virtual environment. Thus, the ability of a user to talk with other users is dependent, in part, on the orientation of the user's Avatar within the virtual environment. This mirrors real life, where it is easier to hear someone who is facing you than to hear the same person talking at the same volume, but facing a different direction. The audio dispersion envelope may thus be controlled by the user using simple controls, such as those used to control the direction in which the Avatar is facing. Optionally, as discussed below, other controls may also be used to control the volume or distance of the audio dispersion envelope to further enhance the user's audio control in the virtual environment.
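The forward-weighted envelope can be modeled in many ways; the cardioid-like lobe below is one illustrative assumption, not a shape required by the description:

```python
import math

def in_directional_envelope(speaker_pos, facing_rad, listener_pos,
                            front_range=20.0, rear_range=6.0):
    """True if the listener falls inside a forward-weighted envelope.
    The reach interpolates between front_range straight ahead and
    rear_range straight behind; the range values are assumptions."""
    dx = listener_pos[0] - speaker_pos[0]
    dy = listener_pos[1] - speaker_pos[1]
    distance = math.hypot(dx, dy)
    if distance == 0.0:
        return True
    offset = math.atan2(dy, dx) - facing_rad  # angle off the facing axis
    reach = rear_range + (front_range - rear_range) * (1 + math.cos(offset)) / 2
    return distance <= reach
```

With these numbers a listener 15 units directly ahead is inside the envelope, while the same listener directly behind the speaker is not.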

In the example shown in FIG. 3, the Avatars are in the same position as they are in the example shown in FIG. 2. However, the audio dispersion envelopes have been adjusted to extend further in the direction that the Avatar is facing and to extend to a lesser extent to the sides of the Avatar and to the rear of the Avatar. This affects which Avatars are able to communicate with each other. For example, as shown in FIG. 3, Avatars B and C are not able to talk with each other, even though they are standing relatively close to each other, since they are not facing each other. Similarly, Avatars D and E are able to talk with each other since they are facing each other, even though they are somewhat farther apart than Avatars B and C. Avatar A is still alone and, hence, cannot talk to any other Avatars.

In the embodiment shown in FIG. 3, the Avatars must be within each other's audio dispersion envelope for audio to be transmitted between each other. An alternative may be to enable audio to be transmitted where at least one of the Avatars is within the dispersion envelope of the other Avatar. For example, if Avatar B is within the dispersion envelope of Avatar C, but Avatar C is not in the dispersion envelope of Avatar B, audio may be transmitted between the users associated with Avatars C and B. Optionally, audio may be transmitted in one direction such that the user associated with Avatar C can hear the user associated with Avatar B, but the user associated with Avatar B cannot hear the user associated with Avatar C.
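These alternatives amount to three audibility policies. A compact sketch (the policy names are illustrative):

```python
def audible(a_in_b_env, b_in_a_env, policy="mutual"):
    """Return (A heard by B, B heard by A).

    a_in_b_env: Avatar A lies inside Avatar B's dispersion envelope.
    b_in_a_env: Avatar B lies inside Avatar A's dispersion envelope.
    """
    if policy == "mutual":   # both containments required
        both = a_in_b_env and b_in_a_env
        return both, both
    if policy == "either":   # one containment opens audio both ways
        either = a_in_b_env or b_in_a_env
        return either, either
    # "one_way": each user hears exactly the speakers inside their own
    # envelope, so C hears B when B is inside C's envelope even though
    # B cannot hear C, matching the example above.
    return a_in_b_env, b_in_a_env
```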

The shape of the audio dispersion envelope may depend on the preferences of the user as well as the preferences of the virtual environment provider. For example, the virtual environment provider may provide the user with an option to select an audio dispersion shape to be associated with their Avatar when they enter the virtual environment. The audio dispersion shape may be persistent until adjusted by the user. For example, the user may select a voice for their Avatar such that some users will have robust loud voices while others will have more meek and quiet voices. Alternatively, the virtual environment provider may provide particular audio dispersion profiles for different types of Avatars, for example police Avatars may be able to be heard at a greater distance than other types of Avatars. Additionally, the shape of the audio dispersion envelope may depend on other environmental factors, such as the location of the Avatar within the virtual environment and the presence or absence of ambient noise in the virtual environment.

FIG. 4 shows an embodiment in which the Avatar is provided with an option to increase the volume of their voice so that their voice may reach farther within the virtual environment. This allows the Avatar to “shout” within the virtual environment. As shown in FIG. 4, assume that the user associated with Avatar A would like to talk to the user associated with Avatar B. User A may face Avatar B in the virtual environment by causing his Avatar A to face Avatar B. As discussed above in connection with FIG. 3, where the audio dispersion profile is designed to extend further in front of the Avatar, this action will cause the main audio dispersion profile lobe to extend toward Avatar B. However, as shown in FIG. 4, the main audio dispersion lobe of the audio dispersion profile may be insufficiently large to encompass Avatar B if Avatars A and B are too far apart in the virtual environment.

According to an embodiment of the invention, user A may “shout” toward Avatar B to cause the audio dispersion profile to extend further in the direction of Avatar B. The user's intention to shout at Avatar B may be indicated through the manipulation of simple controls. For example, on a wheeled mouse, the mouse wheel may serve as a shout control that the user uses to extend the audio dispersion profile of their Avatar. In this embodiment, the user may simply roll the mouse wheel forward, the motion commonly used to scroll up in most computer user interfaces. Conversely, if the user no longer wants to shout, the user may roll the mouse wheel back, in a motion similar to scrolling down in most common user interfaces.

There are times where the user may want to communicate with every user within a given volume of the virtual environment. For example, a person may wish to make a presentation to a room full of Avatars. To do this, the user may cause their audio dispersion envelope to expand in all directions to fill the entire volume. This is shown in FIGS. 4 and 5 as dashed line 5. The loudness of the user's voice in this embodiment does not increase; rather, all users within the volume will be able to hear audio generated by the user. This feature will be referred to herein as “OmniVoice”. When the user invokes OmniVoice, audio from the user is mixed into the audio stream presented to each of the other users within the volume so that those users can hear the user invoking OmniVoice. When the user invokes OmniVoice, the appearance of the user's Avatar may be altered somewhat so that other people engaged in the virtual environment are aware of who has invoked the OmniVoice feature.

A user may use explicit controls to invoke OmniVoice or, preferably, OmniVoice may be invoked intrinsically based on the location of the Avatar within the virtual environment. For example, the user may walk up to a podium on a stage, and the user's presence on the stage may cause audio provided by the user to be included in the mixed audio stream of every other Avatar within a particular volume of the virtual environment.
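A sketch of such intrinsic invocation, assuming a rectangular stage region (the stage bounds and function names are hypothetical):

```python
def on_stage(avatar_pos, stage=(10.0, 10.0, 14.0, 12.0)):
    """True if the Avatar stands inside a designated (x0, y0, x1, y1)
    stage rectangle; the bounds here are hypothetical."""
    x0, y0, x1, y1 = stage
    return x0 <= avatar_pos[0] <= x1 and y0 <= avatar_pos[1] <= y1

def contribution(speaker_pos, listener_in_room, normal_gain):
    """OmniVoice overrides the distance taper: every listener in the
    room hears the speaker at a nominal level (no loudness boost)."""
    if listener_in_room and on_stage(speaker_pos):
        return 1.0
    return normal_gain
```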

The mouse wheel may have multiple uses in the virtual environment, depending on the particular virtual environment and the other activities available to the Avatar. If this is the case, then a combination of inputs may be used to control the audio dispersion profile. For example, left clicking on the mouse combined with scrolling of the mouse wheel, or depressing a key on the keyboard along with scrolling of the mouse wheel, may be used to signal that the mouse scrolling action is associated with the Avatar's voice rather than another action. Alternatively, the mouse wheel may not be used, and a keystroke or combination of keystrokes on the keyboard may be used to extend the audio dispersion envelope and cause the audio dispersion envelope to return to normal. Thus, in addition to implementing directional audio dispersion envelopes based on the orientation of the Avatar within the three dimensional virtual environment, the user may also warp the Avatar's audio dispersion envelope in real time.

When the user elects to shout toward another user in the virtual environment, the facial expression of the Avatar may change or another visual indication may be provided as to who is shouting. This enables the user to know that they are shouting, and enables other users of the virtual environment to understand why the physics have changed. For example, the Avatar may cup their hands around their mouth to provide a visual clue that they are shouting in a particular direction. A larger extension of the Avatar's voice may also be indicated by displaying a ghost of the Avatar moved closer to the new center of the voice range, so that other users can determine who is yelling. Other visual indications may be provided as well.

The user controlled audio dispersion envelope warping may toggle, as shown in FIG. 4, such that the user is either talking normally or shouting depending on the state of the toggle control. Alternatively, the user controlled audio dispersion profile warping may be more gradual and have multiple discrete levels as shown in FIG. 5. Specifically, as shown in FIG. 5, the user may increase the directionality and range of the projection of their voice in the three-dimensional computer generated virtual environment so that the reach of their voice may extend different distances depending on the extent to which the user would like to shout in the virtual environment. In FIG. 5, the user has been provided with four discrete levels, and then a fifth level for use in connection with OmniVoice. Other numbers of levels may be used as well. Additionally, the discretization may be blurred such that the control is closer to continuous depending on the implementation.
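One way the discrete levels might be wired to the wheel control described earlier; the multipliers and level count are illustrative assumptions:

```python
class VoiceControl:
    """Discrete shout levels driven by, e.g., mouse-wheel ticks: four
    levels scale the forward reach, and a fifth selects OmniVoice."""

    LEVELS = (1.0, 1.5, 2.0, 3.0)   # multipliers on the forward range
    OMNIVOICE = len(LEVELS)         # fifth position: fill the volume

    def __init__(self, base_range=20.0):
        self.base_range = base_range
        self.level = 0

    def wheel(self, ticks):
        """Positive ticks extend the envelope; negative ticks retract it."""
        self.level = max(0, min(self.OMNIVOICE, self.level + ticks))

    def forward_range(self):
        if self.level == self.OMNIVOICE:
            return float("inf")      # reach every Avatar in the volume
        return self.base_range * self.LEVELS[self.level]
```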

In another embodiment, the selection of a person to shout to may be implemented when the user mouses over another Avatar and depresses a button, such as by left clicking or right clicking on the other Avatar. If the other Avatar is within normal talking distance, audio from that other Avatar will be mixed into the audio stream presented to the user. Audio from other Avatars within listening distance will similarly be included in the mixed audio stream presented to the user. If the other Avatar is not within listening distance, i.e. is not within the normal audio dispersion envelope, the user may be provided with an option to shout to the other Avatar. In this embodiment the user may be provided with an instruction to double click on the other Avatar to shout to them. Many different ways of implementing the ability to shout may be possible depending on the preferences of the user interface designer.

In one embodiment, the volume of audio from other Avatars may be adjusted based on their proximity when that audio is mixed into the audio stream for the user, so that the user is presented with an audio stream that more closely resembles normal, realistic audio. Other environmental factors may similarly affect the communication between Avatars in the virtual environment.

Although the previous description has focused on enabling the Avatar to increase the size of the dispersion envelope to shout in the virtual environment, the same controls may optionally also be used to reduce the size of the dispersion envelope so that the user can whisper in the virtual environment. In this embodiment, the user may control their voice in the opposite direction to reduce the size of the dispersion envelope so that users must be closer to the user's Avatar to communicate with the user. The mouse controls or other controls may be used in this manner to reduce the size of the audio dispersion envelope as desired.

FIGS. 6 and 7 show an embodiment in which the profile of the audio dispersion envelope is affected by obstacles in the virtual environment. Commonly, two Avatars in a virtual environment will be able to hear each other if they are within talking distance of each other, regardless of the other objects in the virtual environment. For example, assume that Avatar A is in one room and that Avatar B is in another room as shown in FIG. 7. Normally, the virtual environment server would enable the Avatars to communicate with each other (as shown by the dashed line representing the dispersion envelope) since they are proximate each other in the virtual environment. FIG. 11 shows a similar situation where two Avatars are close to each other, but are on different floors of the virtual environment. From the Avatars' perspective, they are unlikely to be able to see through the wall or floor/ceiling and, hence, enabling the Avatars to hear each other through an obstacle of this nature is unnatural. This conventional behavior also enables Avatars to listen in on conversations that are occurring within the virtual environment between other Avatars, without being seen by the other Avatars.

According to an embodiment of the invention, the audio dispersion profile may be adjusted to account for obstacles in the virtual environment. Obstacles may be thought of as creating shadows on the profile to reduce the distance of the profile in a particular direction. This prevents communication between Avatars where they would otherwise be able to communicate if not for the imposition of the obstacle. In FIG. 6, for example, Avatar A is in a room and talking through a doorway. The jambs and walls on either side of the door serve as obstacles that partially obstruct the audio dispersion profile of the user. If the door is closed, this too may affect the dispersion envelope. Thus, Avatar B, who is standing next to the wall, will be unable to hear or talk to Avatar A since Avatar B is in the shadow of the wall and outside of Avatar A's audio dispersion profile. Avatar C, by contrast, is in direct line of sight of Avatar A through the door and, accordingly, may communicate with Avatar A. Of course, Avatar B may communicate with Avatar C and, hence, may still hear Avatar C's side of the conversation between Avatars A and C. Thus, a particular Avatar may be only partially included in communications between other parties. To implement an embodiment of this nature, a discrete mixer is implemented per client, where essentially all of these described attributes and properties are calculated on a per client basis. The mixer determines which audio from adjacent Avatars is able to be heard by the particular user and mixes the available audio streams accordingly. Implementing this on a per-client basis is advantageous because no two users are ever in exactly the same position at a particular point in time, and hence the final output for each of the clients will necessarily be different.
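A per-client sketch of the shadow test, treating walls as fully opaque line segments in the two dimensional plane (the segment-crossing helper is a standard orientation test; partial attenuation is sketched further below):

```python
def _ccw(p, q, r):
    # True if the points p, q, r wind counter-clockwise
    return (r[1] - p[1]) * (q[0] - p[0]) > (q[1] - p[1]) * (r[0] - p[0])

def crosses(a, b, c, d):
    """True if segment a-b strictly crosses segment c-d."""
    return _ccw(a, c, d) != _ccw(b, c, d) and _ccw(a, b, c) != _ccw(a, b, d)

def in_shadow(speaker, listener, walls):
    """A listener is shadowed when the straight path from the speaker
    crosses any wall segment, and is then excluded from the mix."""
    return any(crosses(speaker, listener, w0, w1) for w0, w1 in walls)

# Avatar A at (2, 2) talks through a door gap above y = 4 in a wall at x = 5
wall = ((5.0, 0.0), (5.0, 4.0))
print(in_shadow((2.0, 2.0), (8.0, 2.0), [wall]))  # True: Avatar B is shadowed
print(in_shadow((2.0, 2.0), (8.0, 7.0), [wall]))  # False: Avatar C is reachable
```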

FIG. 11 shows two Avatars that are separated vertically as they are on separate floors. The floor may function in the same manner as the walls did in FIGS. 6 and 7. Specifically, if the rooms on the separate floors are defined as separate volumes, sound from one floor will not enter the other floor so that, even though the two Avatars are very close to each other in the X and Y coordinate space, they are separated vertically such that the two Avatars cannot talk with each other.

The shadow objects may be wholly opaque to transmission of sound or may simply be attenuating objects. For example, a concrete wall may attenuate sound 100%, a normal wall may attenuate 90% of sound while allowing some sound to pass through, and a curtain may attenuate sound only modestly such as 10%. The level of attenuation may be specified when the object is placed in the virtual environment.

Audio may be implemented using a communication server that is configured to mix audio individually for each user of the virtual environment. The communication server will receive audio from all of the users of the virtual environment and create an audio stream for a particular user by determining which of the other users have an Avatar within the user's dispersion envelope. To enable participants to be selected, a notion of directionality needs to be included in the selection process, such that the selection process does not simply look at the relative distance of the participants, but also looks to see what direction the participants are facing within the virtual environment. This may be done by associating a vector with each participant and determining whether the vector extends sufficiently close to the other Avatar to warrant inclusion of the audio from that user in the audio stream. Additionally, if shadows are to be included in the determination, the process may look to determine whether the vector traverses any shadow objects in the virtual environment. If so, the extent of the shadow may be calculated to determine whether the audio should be included. Other ways of implementing the audio connection determination process may be used as well and the invention is not limited to this particular example implementation.

In another embodiment, whether audio is to be transmitted between two Avatars may be determined by integrating attenuation along a vector between the Avatars. In this embodiment, the normal “air” or empty space in the virtual environment may be provided with an attenuation factor, such as 5% per unit distance. Other objects within the environment may be provided with other attenuation factors depending on their intended material. Transmission of audio between Avatars, in this embodiment, may depend on the distance between the Avatars, and hence the amount of air the sound must pass through, and on the objects the vector passes through. The strength of the vector, and hence the attenuation able to be accommodated while still enabling communication, may depend on the direction the Avatar is facing. Additionally, the user may temporarily increase the strength of the vector by causing the Avatar to shout.
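A sketch of the integration, assuming the 5% per unit air loss named above and per-obstacle attenuation factors; the two dimensional obstacle representation and names are illustrative:

```python
import math

def _ccw(p, q, r):
    return (r[1] - p[1]) * (q[0] - p[0]) > (q[1] - p[1]) * (r[0] - p[0])

def _crosses(a, b, c, d):
    # True if segment a-b strictly crosses segment c-d
    return _ccw(a, c, d) != _ccw(b, c, d) and _ccw(a, b, c) != _ccw(a, b, d)

def path_gain(speaker, listener, obstacles, air_loss=0.05):
    """Accumulate attenuation along the straight line between Avatars:
    air_loss per unit of empty space, plus a factor for each obstacle
    the line passes through; audio is cut once the gain reaches zero."""
    gain = max(0.0, 1.0 - air_loss * math.dist(speaker, listener))
    for w0, w1, attenuation in obstacles:
        if _crosses(speaker, listener, w0, w1):
            gain *= 1.0 - attenuation   # a 90% wall passes 10% of sound
    return gain

# a curtain (10%) and a normal wall (90%) between the two Avatars
obstacles = [((5, 0), (5, 4), 0.10), ((7, 0), (7, 4), 0.90)]
print(path_gain((2, 2), (9, 2), obstacles))  # 0.65 * 0.90 * 0.10 = 0.0585
```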

In the preceding description, it was assumed that the user could elect to shout within the virtual environment. Optionally, that privilege may be reserved for Avatars possessing particular items within the virtual environment. For example, the Avatar may need to find or purchase a particular item such as a virtual bull horn to enable the Avatar to shout. Other embodiments are possible as well.

FIG. 8 shows a flow chart of a process, portions of which may be used by one or more entities, to determine whether audio should be transmitted between particular participants in a three dimensional computer-generated virtual environment. In the process shown in FIG. 8, it will be assumed that the virtual environment server(s) have rendered the virtual environment (100) so that it is available for use by participants. Thus, the virtual environment servers will enable user A to enter the virtual environment and will render Avatar A for user A within the virtual environment (102). Similarly, the virtual environment servers will enable user B to enter the virtual environment and will render Avatar B for user B within the virtual environment (102′). Other users may have Avatars within the virtual environment as well.

The virtual environment servers will also define an audio dispersion envelope for Avatar A which specifies how the Avatar will be able to communicate within the virtual environment (104). Each Avatar may have a pre-defined audio dispersion envelope that is a characteristic of all Avatars within the virtual environment, or the virtual environment servers may define custom audio dispersion envelopes for each user. Thus, the step of defining audio dispersion envelopes may be satisfied by specifying that the Avatar is able to communicate with other Avatars that are located a greater distance in front of the Avatar than other Avatars located in other directions relative to the Avatar.

Automatically, or upon initiation of the user controlling Avatar A, the virtual environment server will determine whether Avatar B is within the audio dispersion envelope for Avatar A (106). This may be implemented, for example, by looking to see whether Avatar A is facing Avatar B, and then determining how far Avatar B is from Avatar A in the virtual environment. If Avatar B is within the audio dispersion envelope of Avatar A, the virtual environment server will enable audio from Avatar B to be included in the audio stream transmitted to the user associated with Avatar A.

If Avatar B is not within the audio dispersion envelope of Avatar A, the user may be provided with an opportunity to control the shape of the audio dispersion envelope such as by enabling the user to shout toward Avatar B. In particular, user A may manipulate their user interface to cause Avatar A to shout toward Avatar B (110). If user A properly signals via their user interface that he would like to shout toward Avatar B, the virtual environment server will enlarge the audio dispersion envelope for the Avatar in the virtual environment in the direction of the shout (112).

The virtual environment server will then similarly determine whether Avatar B is within the enlarged audio dispersion envelope for Avatar A (114). If so, the virtual environment server will enable audio to be transmitted between the users associated with the Avatars (116). If not, the two Avatars will need to move closer toward each other in the virtual environment to enable audio to be transmitted between the users associated with Avatars A and B (118).
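The decision logic of FIG. 8 can be condensed into a single predicate, reusing the in_directional_envelope() sketch from above; the enlarged shout range is an illustrative assumption:

```python
def can_talk(a_pos, a_facing, b_pos, shout_requested=False):
    """Steps (106)-(118) of FIG. 8: test the normal envelope first,
    then retry with a shout-enlarged envelope if the user asks."""
    if in_directional_envelope(a_pos, a_facing, b_pos):
        return True                        # (106)-(108): within normal reach
    if shout_requested:                    # (110)-(112): enlarge the lobe
        return in_directional_envelope(a_pos, a_facing, b_pos,
                                       front_range=40.0)
    return False                           # (118): Avatars must move closer
```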

FIG. 9 shows a system that may be used to implement realistic audio within a virtual environment according to an embodiment of the invention. As shown in FIG. 9, users 12 are provided with access to a virtual environment 14 that is implemented using one or more virtual environment servers 18.

Users 12A, 12B are represented by Avatars 34A, 34B within the virtual environment 14. When the users are proximate each other and facing each other, an audio position and direction detection subsystem 64 will determine that audio should be transmitted between the users associated with the Avatars. Audio will be mixed by mixing function 78 to provide individually determined audio streams to each of the users.

In the embodiment shown in FIG. 9, the mixing function is implemented at the server. The invention is not limited in this manner as the mixing function may instead be implemented at the virtual environment client. In this alternative embodiment, audio from multiple users would be transmitted to the user's virtual environment client and the virtual environment client would select particular portions of the audio to be presented to the user. Implementing the mixing function at the server reduces the amount of audio that must be transmitted to each user, but increases the load on the server. Implementing the mixing function at the client distributes the load for creating individual mixed audio streams and, hence, is easier on the server. However, it requires multiple audio streams to be transmitted to each of the clients. The particular solution may be selected based on available bandwidth and processing power.

The functions described above may be implemented as one or more sets of program instructions that are stored in a computer readable memory and executed on one or more processors within one or more computers. However, it will be apparent to a skilled artisan that all logic described herein can be embodied using discrete components, integrated circuitry such as an Application Specific Integrated Circuit (ASIC), programmable logic used in conjunction with a programmable logic device such as a Field Programmable Gate Array (FPGA) or microprocessor, a state machine, or any other device including any combination thereof. Programmable logic can be fixed temporarily or permanently in a tangible medium such as a read-only memory chip, a computer memory, a disk, or other storage medium. All such embodiments are intended to fall within the scope of the present invention.

It should be understood that various changes and modifications of the embodiments shown in the drawings and described in the specification may be made within the spirit and scope of the present invention. Accordingly, it is intended that all matter contained in the above description and shown in the accompanying drawings be interpreted in an illustrative and not in a limiting sense. The invention is limited only as defined in the following claims and the equivalents thereto.

Claims

1. A method of selectively enabling audio to be transmitted between users of a three dimensional computer-generated virtual environment, the method comprising the steps of:

determining a position of a first Avatar in the virtual environment;
determining an orientation of the first Avatar in the virtual environment to determine which way the first Avatar is facing in the virtual environment;
enabling audio from a second user associated with a second Avatar to be included in a mixed audio stream for a first user associated with the first Avatar if the second Avatar is located within a first distance of the first Avatar in the virtual environment and the first Avatar is facing the second Avatar; and
not enabling audio from the second user associated with the second Avatar to be included in the mixed audio stream for the first user associated with the first Avatar if the second Avatar is located within the first distance of the first Avatar in the virtual environment but the first Avatar is not facing the second Avatar.

2. The method of claim 1, further comprising the step of defining an audio dispersion envelope for the first Avatar, such that if the second Avatar is located within the audio dispersion envelope then audio from the user associated with the second Avatar will be included in the mixed audio stream for the first user, and if the second Avatar is located outside the audio dispersion envelope, then audio from the user associated with the second Avatar will not be included in the mixed audio stream for the first user.

3. The method of claim 2, wherein the audio dispersion envelope for the first Avatar is controllable by the first user associated with the first Avatar.

4. The method of claim 3, wherein the audio dispersion envelope for the first Avatar may be increased to extend a greater distance in a direction that the first Avatar is facing to enable the first Avatar to shout in the virtual environment.

5. The method of claim 3, wherein the audio dispersion envelope for the first Avatar may be decreased to extend a lesser distance in the direction that the first Avatar is facing to enable the first Avatar to whisper in the virtual environment.

6. The method of claim 3, wherein the audio dispersion envelope for the first Avatar may be increased to extend to coincide with a volume of the virtual environment, such that any audio generated by the first user associated with the first Avatar will be included in audio streams mixed for all other users associated with Avatars that are located within the volume of the virtual environment.

7. The method of claim 6, wherein the audio dispersion envelope for the first Avatar is automatically increased to extend to coincide with the volume of the virtual environment when the first Avatar enters a particular location within the virtual environment.

8. The method of claim 3, wherein the audio dispersion envelope is controllable by the first user via interaction with a mouse wheel.

9. The method of claim 1, the method further comprising the step of determining whether there are obstacles between the first Avatar and the second Avatar in the virtual environment, and selectively not including audio from the second user associated with the second Avatar in the mixed audio stream for the first user associated with the first Avatar if there are sufficient obstacles between the first and second Avatars.

10. The method of claim 9, wherein the obstacle is a wall.

11. The method of claim 2, wherein a shape of the audio dispersion envelope for the first Avatar is dependent on the first Avatar's location within the virtual environment.

12. A computer program including data and instructions that are stored on a computer readable memory which, when loaded on a computer processor, enables the computer to implement a method of enabling realistic audio to be simulated in a three dimensional computer-generated virtual environment, the method comprising the steps of:

defining a first audio dispersion envelope within the three dimensional computer-generated virtual environment for a first Avatar associated with a first user; and
selecting audio from other users associated with Avatars located within the audio dispersion envelope for inclusion in a mixed audio stream to be presented to the first user.

13. The computer program of claim 12, wherein the method further comprises the step of adjusting a shape of the audio dispersion envelope based on obstacles in the three dimensional computer-generated virtual environment.

14. The computer program of claim 12, further comprising defining an attenuation factor for items in the virtual environment to determine a size and shape of the dispersion envelope.

15. The computer program of claim 14, further comprising enabling the first user to control the audio dispersion envelope.

16. The computer program of claim 15, wherein the step of enabling the user to control the audio dispersion envelope enables the user to enlarge the audio dispersion envelope to include audio from Avatars that are farther away so that the Avatar can shout in the three dimensional computer-generated virtual environment.

17. The computer program of claim 16, wherein the step of enabling the user to control the audio dispersion envelope enables the user to reduce the audio dispersion envelope to prevent Avatars from being included in the audio dispersion envelope so that the Avatar can whisper in the three dimensional computer-generated virtual environment.

18. The computer program of claim 12, wherein if a second Avatar is within the dispersion envelope for the first Avatar, then a user associated with the second Avatar can hear audio from the first user.

19. The computer program of claim 18, further comprising enabling the user to invoke a feature that will enable audio from the first user to be included in audio streams provided to all users associated with Avatars within a volume of the virtual environment.

20. The computer program of claim 19, wherein the feature is intrinsically invoked when the first Avatar enters a defined area of the virtual environment.

Patent History
Publication number: 20090240359
Type: Application
Filed: Dec 28, 2008
Publication Date: Sep 24, 2009
Applicant: Nortel Networks Limited (St. Laurent)
Inventors: Arn Hyndman (Ottawa), Andrew Lippman (Salem, MA), Nicholas Sauriol (Ottawa)
Application Number: 12/344,542
Classifications
Current U.S. Class: Digital Audio Data Processing System (700/94); Virtual 3d Environment (715/757)
International Classification: G06F 17/00 (20060101); G06F 3/048 (20060101);