Last week, we lost a true leader for rural communities, a true champion of social justice in communications policy, and a personal friend and inspiration. Wally Bowen, founder of Mountain Area Information Network, died of ALS (aka “Lou Gehrig’s Disease”) on November 17 at the age of 63.
You can read his official obituary here. As always, such things give you the what and the where, but no real sense of what made Wally such an amazing person. I don’t have a lot of personal heroes, but Wally was one. Simply put, he gave the work I do meaning.
It’s almost Thanksgiving, and I am truly thankful for the time we had with Wally on Earth, even if I am sorry that it ended too soon. I elaborate below . . .
Today one of the guys showed off his recurve bow. As long as you hold it, it re-arms with arrows. Pull the string back with your physical hands, aim and let go, and the arrow flies. It’s basic fun with weaponry. People took turns playing with it.
But quickly people started saying, “Oh, shoot me between the eyes!” (including our CEO). It’s pretty cool to see the arrow coming right for you. Can’t do that IRL! (In Real Life.)
But I never said this. Instead, I stood among the circle of avatars in front of the shooter, and kind of stared him down. Without thinking, I conveyed, “Come at me, bro”, but with nothing so obvious as arms wide or a Neo-like “come on” finger-wave. My avatar could do that with just a stretch of my physical arm or squeeze of my hand, but I didn’t gesture or say a word. Our team is pretty competitive, but we’re not the kind of folks who would shoot the person in front of us without being invited. We need to be invited. But multiple shooters did shoot me. Non-verbal communication works!
Some terrific initial results of our Avatar IK. This is not from post-processing of someone wearing a motion-capture suit with a bunch of ping-pong balls stuck to it. All the movement is instead driven by a (developer’s edition of a soon-to-be-retailed) gaming VR headset (capturing head position and orientation in 3D), and two (somewhat dodgy) gaming hand-controllers (capturing each hand’s position and orientation, plus a trigger). As described here, we interpret these 3 measurements to trigger an appropriate underlying base animation, which then provides more measurements to drive a realtime inverse-kinematics skeleton.
We’ve come up with something that I think is quite cool for conveying presence through our avatars. Interpretive Motion IK is a technique in which all kinds of inputs drive a small set of animated motions, and then the animation results and additional inputs both drive an Inverse Kinematics system. This gives us a rich set of built-in avatar behaviors, while also making use of an ever-evolving set of input devices to produce a dynamic and life-like result.
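To make the idea concrete, here is a rough sketch of one frame of that pipeline: coarse sensor readings pick an underlying base animation, and the same readings drive an IK solve that plants the hand where the controller actually is. This is not High Fidelity's actual code; the function names, thresholds, and bone lengths are all made up for illustration, and the IK here is just a flat two-bone (law of cosines) solve rather than a full skeleton.

```python
import math

def choose_base_animation(head_y, left_hand_y, right_hand_y):
    """Pick an underlying base animation from coarse sensor heights.

    The categories and thresholds are illustrative, not High Fidelity's.
    """
    if max(left_hand_y, right_hand_y) > head_y:
        return "reach_up"
    if head_y < 1.2:          # metres; a rough crouch threshold
        return "crouch"
    return "idle_stand"

def two_bone_ik(shoulder, target, upper_len=0.3, fore_len=0.25):
    """Bend the elbow so the hand lands on (or reaches toward) the target.

    Treats the arm as two rigid segments in a plane and uses the law of
    cosines on the shoulder-to-target distance.  Returns the elbow bend
    in radians (0 = straight arm).
    """
    dx = target[0] - shoulder[0]
    dy = target[1] - shoulder[1]
    dist = min(math.hypot(dx, dy), upper_len + fore_len)  # clamp to reach
    cos_elbow = ((upper_len**2 + fore_len**2 - dist**2)
                 / (2 * upper_len * fore_len))
    return math.pi - math.acos(max(-1.0, min(1.0, cos_elbow)))

# One frame: the sensors select an animation, then the same sensors
# drive the IK correction on top of it.
anim = choose_base_animation(head_y=1.6, left_hand_y=1.0, right_hand_y=1.9)
elbow = two_bone_ik(shoulder=(0.0, 1.5), target=(0.2, 1.9))
print(anim, round(elbow, 2))  # prints: reach_up 1.25
```

In the real system the base animation would also feed joint positions back as additional IK measurements each frame, so the built-in motion and the live controller data blend rather than one simply overwriting the other.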
Why Aren’t Avatars More Like Us?
From static “head shots” in a text chat, to illustrations in a book and famous faces in a movie, avatars give us an intuitive sense of who is present and who is doing what. The story of an animated movie (or high end game “cut scene”) is largely shown through the fluid motion of anthropoids. A movie studio can hire an army of artists, or record body actors in motion capture suits. But a virtual world does not follow a set script in which all activity can be identified and animated before use. Avatar animation must instead be generated in real time, in response to a world of possible activities.
This challenge leads some systems to show users as simplified images, as just a head, as a disembodied “mask and gloves”, or as a mostly un-animated “tin can” robot. This may be appropriate for specialized situations, but in the general case of unlimited high fidelity virtual worlds, the lack of whole-body humanoid animation fails to provide a fulfilling sense of personal immersion.
When it works well, the interpretation of motion is so strong that when another avatar turns to face your avatar, we describe it as “YOU can see ME”. In fact, the pixels on the screen have not turned, and cannot see. Think of the personality conveyed by the Pixar desk lamp hopping across the scene and looking at the camera, or the dancing skeletons of Disney’s early Silly Symphony. Unlike Disney and Pixar, High Fidelity aims to capture this rich whole-body movement as a realtime result of dynamic user input. Alas, today’s input devices give us only incomplete data. Interpretive Motion IK allows us to integrate these clumsy signals into a realistic sense of human personality and action.
The work was done with http://www.tiltbrush.com, which appears to be a couple of guys who got bought by Google. The project – I’m not sure that it’s actually available for sale – appears to be evolving along several dimensions:
3D Model Definition: covering stroke capture (including symmetric stroke duplication), “dry” and “wet” (oil paint-mixing/brushwork effects), texture patterns, volumetric patterns, emission, particles.
Interactive Model Creation: tool palette in one hand, and brush in the other.
Videos at tiltbrush.com suggest an additional movable flat virtual canvas (a “tilt canvas”?) that one can hold, move, and paint against. The art on display was clearly made this way, as it all felt like a sort of rotational 2.5D — the brush strokes were thin layers of (sometimes curved) surfaces.
The artists last night appeared to be working directly in 3D, without the tilt canvas.
The site mentions an Android app for creation. I don’t know if it uses one of these techniques or a third.
Viewing: HMD, static snapshots, animated .gifs that oscillate between several rotated viewpoints (like the range of an old-fashioned lenticular display).
I haven’t seen any “drive around within a 3D scene on your desktop” displays (like standard-desktop/non-HMD versions of High Fidelity).
The displays were all designed so that you observed from a pretty limited spot. Really more “sitting”/“video game” HMD viewing than “standing”/“cave” exploration.
My reactions to the art:
Emission and particle effects are fun in the hands of an artist.
“Fire… Light… It’s so Promethean!”
With the limited movement during display, mostly the art was “around you” like a sky box, rather than something you wandered around in. In this context, the effect of layering (e.g., a star field) – as if a Russian doll-set of sky boxes (though presumably not implemented that way) – was very appealing.
Tip for using caves: Put down a sculpted rug and go barefoot!
Philip gave a terrific quick demo and future roadmap at MIT Technology Review’s conference last week. See the video at their EmTech Digital site.
Today we put up a progress report (with even more short videos!) about the accomplishments of the past half year. Check it out.
I wonder if we’ll find it useful to have such material in-world. Of course, the material referenced above is intended for people who are not yet participating in our open Alpha, so requiring a download to view is a non-starter. But what about discussion and milestone-artifacts as we proceed? At High Fidelity we all run a feed that shows us some of what is discussed on the interwebs, and there are various old-school IRC and other discussions. It’s great to jump on, but it kind of sucks to have an engaging media-rich discussion with someone in realtime via Twitter. Or Facebook. OR ANYTHING ELSE in popular use today.
William Gibson said that Cyberspace is the place where a phone call takes place. I have always viewed virtual worlds as a meta-medium in which people could come together, introduce any other media they want, arrange it, alter it, and discuss it. Like any good museum on a subject, each virtual forum would be a dynamic place not only for individuals to view what others had collected there, but to discuss and share in the viewing. The WWW allows for all of this, but it doesn’t combine it in a way that lets you do it all at once. Years ago I made this video about how we were then using Qwaq/Croquet forums in this way. It worked well for the big enterprises we were selling to, but they weren’t distractible consumers. High Fidelity could be developed to do this, but should we? When virtual worlds are ubiquitous, I’m certain that they’ll be used for this purpose as well as other uses. But I’m not sure whether this capability is powerful enough to be the thing that makes them ubiquitous. Thoughts?
In High Fidelity, running your own virtual world is trivially easy, and I often do development work using a mostly empty workspace on my laptop. Nine years after playing hide-and-seek with my son (first link above), I played air hockey with him over the network. I was thrilled that this now jaded teenager was still able to giggle at the unexpected realtime correctness of the experience. But we had set out to get together online, and this time he wasn’t surprised to find me there.
It turns out that my online visibility had been set to “visible to everyone”. The next weekend I was online and someone clicked on my name and ended up in my world. I was startled to not only see another avatar, but to hear such a clear voice saying, “Hi!” Despite the cartoon avatar, it was as though she were in the room with me. I explained that I was “just working on some development” and that this space would be going up and down a lot, and her voice sounded crushed as she said, “Oh. Ok. Bye…”
An hour later, another visitor came. When I told him the same thing, he left immediately without saying goodbye. Then I was the one who felt crushed.
Of course, you can control access to your own domain — I can’t quite explain why I don’t feel like doing that. But I have turned my online presence visibility to “friends only”.
We’ve just started our open Alpha metaverse at High Fidelity. It works! It’s sure not feature-complete, and just about everything needs improvement, but there’s enough to see what the heck this is all about. It’s pretty darn amazing.
It’s all open source and we do take developer submissions. There’s already a great alpha community of both users and developers — and lots of developer-users. We even contract out to the community for paid improvements — some of which are proposed directly by the community.
So, suppose you’re reading about High Fidelity and seeing videos, and you jump right in. What is the experience like? To participate, you need one medium-sized download called “interface”. Getting and using that is not difficult. To run your own world from your own machine, accessible to others, you need a second download called “stack manager”, which is also easy to get and use. It’s really easy to add content, change your avatar, use voice and lip-sync’d facial capture with your laptop camera, etc. (Make sure you’ve got plenty of light on your face, and that you don’t have your whole family sticking their heads in front of yours as they look at what your laptop is doing. Just saying.)
The biggest problem I encountered — and this is a biggie — is that the initial content is not optimized. You jump in world and the first thing it starts doing is downloading a lot of content. While it’s doing that, the system isn’t responsive. Sound is bad. You can’t tell what the heck is going on or what you should be seeing. We’ve got to do a better job of that initial experience. However, once you’ve visited a place, your machine will cache the content and subsequent visits should be much smoother.
Also, your home bandwidth is probably plenty, but your home wifi might not be. If your family are all on Skype, Youtube, and WoW on the same wifi while you’re doing this, it could make things a bit glitchy.
Our Philip Rosedale gave a talk this week at the Silicon Valley Virtual Reality Conference. The talk was on what the Metaverse will be, comprising roughly the following points. Each point was illustrated with what we have so far in the Alpha version of High Fidelity today. There are a couple of bugs, but it’s pretty cool to be able to illustrate the future with your laptop and an ad-hoc network on your cell phone. It’ll blow you away.
The Metaverse subsumes the Web — includes it, but with personal presence and a shared experience.
The Metaverse has user generated content, like the Web. Moreover, it’s editable while you’re in it, and persistent. This is a consequence of being a shared experience, unlike the Web.
A successful metaverse is likely to be all open source, and use open content standards.
Different spaces link to each other, like hyperlinks on the Web.
Everyone runs their own servers, with typable names.
The internet now supports low latency, and the Metaverse has low latency audio and matching capture of lip sync, facial expressions, and body movement.
The Metaverse will change education. Online videos have great content, but the Metaverse has the content AND the student AND the teacher, and the students and teachers can actually look at each other. The teacher/student gaze is a crucial aspect of learning.
The Metaverse scales by using participating machines. There are 1000 times more desktops on the Web than there are in all the servers in the cloud.
We were featured at last week’s NeuroGaming Conference in San Francisco. Philip’s presentation is the first 30 minutes of this, and right from the start it pulls you in with the same kind of fact-based insight-and-demonstration as Alan Kay’s talks. (Alas, the 100ms lag demo doesn’t quite work on video-of-video.)
But everyone has their own ideas of what the metaverse is all about. This Chinese news coverage (in English) emphasized a bunch of “full dive” sorts of things that we don’t do at all. The editor also chose to end Philip’s interview on a scary note, which is the opposite of his presentation comments (link above), in which he shared his experience of VR serving to better one’s real-life nature.