The Problem

Something I have noticed in my 10+ years of experience in MMOs, up to and including “hardcore” raiding in various titles, is a very real limit on game immersion and encounter dynamics: the inability to effectively implement, coordinate, or get people in the community on board with real-time voice communications.

WoW has attempted to integrate this in-game with little success. Even in the more hardcore raids I have attended and helped to organize, it always seems to end up being a handful of “core” players who effectively communicate in this medium, while the majority, for various reasons not necessarily indicating fault on their part, prefer to stick to text chat.

The problems presented by this trend are obvious. In more complex fights with several active mechanics and phase transitions, it is simply not practical to “pause” the action to type out instructions or responses in real time via “snail chat.” Yes, eventually people learn the fights, but then it becomes more of a “rinse and repeat” than an “increase the intelligence of AI” sort of evolution.

Furthermore, this dramatically takes away from the element of spontaneous action and immersion, which is what I, and I'd venture most who play, find enjoyable, entertaining, and rewarding about the genre. It therefore makes up a large part of what I refer to as “the wall,” the very real limit on how creatively even clever and forward-thinking game developers can develop new content.

It also places limits on the “time to burnout” factor. Ever been in a 20-man raid months after its initial release, once all the "hardcores" have it on "farm," where people do intentionally stupid things just to get a chuckle (Leeeeeroy!), because it has all become as repetitive and boring as the life we play to escape? This is the ugliness of digital burnout! =P

The Dilemma

I have a fairly good idea why this community resistance to voice chat exists.

As a developer or business professional one must always ask oneself, “why do most of my customers purchase my services?” Time and again we see the same result from such surveys: most people play, at least on some level, to escape.

This doesn’t necessarily mean full RP, where our Avatar becomes our second identity much like Jake Sully's in the film Avatar, though it may. We all have the potential to be imaginative and highly creative beings, however much our antiquated, profit- and status-driven social models attempt to beat it out of us with fear, constant inundation, condemnation, and utter hopelessness.

At the same time, "escape" could be as simple as playing to relax, to distract ourselves from the harsher realities of our “real lives” and hopefully, in the genre of MMOs, to share experiences with a community of at least similarly inclined people, which can help to fill in where those more mundane aspects are found wanting. That is always important.

Angry elitist "alpha-sheep" can belittle and condemn the creative majority all they like, typically until their vocal minority get their way, but there is nothing wrong with an objective approach to what constitutes the real genre-transcending "draw" of these virtual worlds.

From a business perspective, it really is as simple as this:

When you have a majority who play to escape, to role play, who go to great lengths creating these elaborate virtual fantasy characters and invest time and effort in achievements, gear, and power, having to ground that Avatar in the real-life rigidity of hearing your own human voice through its virtual mouth is immersion breaking. It is counter to the idea of the entire experience, and it is, quite simply, WRONG.

Yet, at the same time, the fact remains that more intelligent and complex group content simply cannot push much past the well-established limitations of text-based communication in MMOs without it.

Again, the elitists would simply dismiss the majority, saying "let the typers fail." However, anyone with any business sense whatsoever would tell such a person they are dreaming. No company in its right mind would cater to a minority demographic, and elitists will always be the minority, sorry to say. If most people don’t want voice chat and real-time communication, developers aren’t going to make tougher encounters that require it to coordinate and then say “too bad, adapt.”

They’d quickly be out of business.

No, instead we will see game companies continue to balance difficulty and complexity around the standard.

As a scientist I believe that instead of arguing over who is right or wrong, it is more productive to explore what about voice communications most turns people away, and attempt to devise an equitable workaround.

The Solution

Trion does a better job than many MMO developers of giving you control over the customization of your character, without giving you TOO much control. Rift strikes a healthy balance here: there are enough options to personalize appearance without those same options seeming daunting, requiring hours of trial and error, producing results that don't look as expected or, worse, leaving you stuck with a high level character you can’t stand to look at anymore! (I’m sure they will also implement a character re-customization feature at a later date.)

My point is, why not implement a system where you can also choose your character’s voice profile? Why not have a few simple, recognizable lines from the game that you can hear each gender- and race-specific voice recite during character creation, to get a feel for what it actually sounds like?

Then, allow some simple filters like pitch, racial “accentification,” and best of all, INTEGRATE this with in-game voice chat!
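To make the idea a bit more concrete, here is a rough Python sketch of what a per-character voice profile and a pitch filter might look like. Everything in it is my own assumption for illustration: the names VoiceProfile, pitch_ratio and naive_pitch_shift are hypothetical, and the filter is the crudest possible one (plain resampling, which also changes how long the clip lasts). A real voice changer would keep the timing intact and do far more than this.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VoiceProfile:
    """Hypothetical voice settings picked at character creation."""
    race: str
    gender: str
    pitch_semitones: float = 0.0  # positive = higher voice, negative = deeper
    accent: str = "none"          # placeholder only; accent filtering is not shown


def pitch_ratio(semitones: float) -> float:
    """Frequency ratio for a shift in semitones (12 semitones = one octave)."""
    return 2.0 ** (semitones / 12.0)


def naive_pitch_shift(samples: List[float], semitones: float) -> List[float]:
    """Shift pitch by resampling with linear interpolation.

    This is the simplest illustration only: raising the pitch this way
    also shortens the clip, and lowering it stretches the clip.
    """
    if not samples:
        return []
    ratio = pitch_ratio(semitones)
    out_len = max(1, int(len(samples) / ratio))
    last = len(samples) - 1
    out = []
    for i in range(out_len):
        pos = i * ratio                 # read position in the input
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


# Example: a deep-voiced character, one octave below the player's voice.
profile = VoiceProfile(race="dwarf", gender="male", pitch_semitones=-12.0)
deeper = naive_pitch_shift([0.0, 1.0, 2.0, 3.0], profile.pitch_semitones)
```

The point of keeping the profile as plain data is that the same settings could drive the preview lines at character creation and the live voice chat filter.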

Modern voice emulation or voice changing technology has far surpassed the "fitter, happier, more productive" (for Radiohead fans!) equivalent of emotionless, monotone, and robotic "machine speak." In fact, having sampled several offerings of the major players in the industry, it has indeed gotten to the point of being so good it is often difficult to notice at first.

This is especially true of more sophisticated software that combines aspects of voice recognition's "learn as you speak" components.

Yes, this would involve coding such an engine into the game, though perhaps Trion could simply license the technology from any number of known distributors.

Basically, the end result would be that when you talk in voice chat, everyone else hears your avatar’s voice, not your real one (though they may be as close or different as you choose to configure them.)

This could be strictly enforced such that all players using the in-game voice communications would have their own emulated avatar voice, to avoid the sort of discrimination that so often occurs.

There would be a toggle to disable hearing your own character's voice, to avoid the confusion that may result from hearing it on top of your real one.

Guild leaders who simply refuse, for whatever reason, to make use of such technology could opt to keep to the various third-party applications presently employed, such as Ventrilo or TeamSpeak, and players who prefer not to use such real-voice methods but rather enjoy the immersion of anonymous voice chat could simply avoid those guilds.

Processing would be done on the client side before transmission, which is something modern multi-threaded CPUs are perfect for.
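As a minimal sketch of that client-side flow, assuming audio arrives from the microphone in small frames: each frame runs through the configured filters first, and only the processed result is handed to the network. The names VoiceClient, on_mic_frame, gain and the send callback are all hypothetical, and the gain filter is just a stand-in for a real voice filter.

```python
from dataclasses import dataclass, field
from typing import Callable, List

Frame = List[float]                  # one short chunk of audio samples
Filter = Callable[[Frame], Frame]    # any function that transforms a frame


def gain(factor: float) -> Filter:
    """Stand-in filter that just scales volume; a real one would change the voice."""
    def apply(frame: Frame) -> Frame:
        return [s * factor for s in frame]
    return apply


@dataclass
class VoiceClient:
    """Hypothetical client-side voice pipeline."""
    filters: List[Filter]
    send: Callable[[Frame], None]    # assumed network hook, supplied by the game
    hear_own_voice: bool = False     # toggle for hearing your avatar's voice
    monitor: List[Frame] = field(default_factory=list)  # local playback queue

    def on_mic_frame(self, raw: Frame) -> None:
        processed = raw
        for f in self.filters:
            processed = f(processed)
        # Only the processed frame leaves the machine; the raw voice never does.
        self.send(processed)
        if self.hear_own_voice:
            self.monitor.append(processed)


# Example: collect "sent" frames in a list instead of a real network socket.
sent: List[Frame] = []
client = VoiceClient(filters=[gain(0.5)], send=sent.append)
client.on_mic_frame([1.0, -1.0])
```

Since the raw frame is never passed to send, nobody else could hear a player's real voice even if they wanted to, which is what makes strict enforcement possible.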

I predict this would VASTLY expand the quality and experience of community-defining RP (not to mention re-playability and therefore revenue) of ANY MMO, while at the same time drastically increasing the number of people willing to participate in voice chat, because it is no longer so counter to the reason many play.

Developers could continue creating more complex and interesting encounters, because the barrier of real-time communication would have finally been breached in an acceptable way, to where it is actually FUN for those who value their relative creative anonymity to participate!

I see this as a win-win minus a modest (based on Trion’s revenue to date) initial investment in the technology to implement. So, I propose this as the next major genre-defining advancement in the MMO experience:

In-game voice processing communications, with options for text-to-speech through the same Avatar voice, probably through the use of a standard emote system passed through the configured filters, which might even include a capacity for rudimentary language translation through the freeware Google API.
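Here is one way the emote-to-speech part might be wired up, again only as a sketch under my own assumptions. The emote table, the speak_emote function and the synth callback are hypothetical; synth stands for whatever licensed text-to-speech engine would actually turn a phrase into audio, and translation is left out entirely.

```python
from typing import Callable, Dict, List, Optional

Samples = List[float]
Synth = Callable[[str], Samples]        # assumed TTS hook: text in, audio out
Filter = Callable[[Samples], Samples]   # the same kind of filter used for live voice

# Hypothetical emote table; a real game would ship its own phrases.
EMOTE_PHRASES: Dict[str, str] = {
    "/incoming": "Incoming!",
    "/heal": "I need healing!",
    "/ready": "Ready when you are.",
}


def speak_emote(command: str, synth: Synth,
                voice_filters: List[Filter]) -> Optional[Samples]:
    """Turn a typed emote into audio in the avatar's configured voice.

    Returns None for unknown emotes, so ordinary chat text is never spoken.
    """
    phrase = EMOTE_PHRASES.get(command.strip().lower())
    if phrase is None:
        return None
    audio = synth(phrase)
    for f in voice_filters:             # same filters as live voice chat
        audio = f(audio)
    return audio


# Example with a fake synthesizer that returns one sample per character.
def fake_synth(text: str) -> Samples:
    return [1.0] * len(text)


audio = speak_emote("/incoming", fake_synth, [lambda a: [s * 0.5 for s in a]])
```

Running the synthesized phrase through the same filters as live speech is what would let a player who only types still sound like the same character as one who talks.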

So, is Trion up to the challenge?