O'Reilly Network    


 Published on The O'Reilly Network (http://www.oreillynet.com/)
 http://www.oreillynet.com/pub/wlg/5171

Beyond the dimension

by Jono Bacon
Jul. 7, 2004

We are facing an interesting time in the Open Source desktop world. Not only are a number of interesting technologies being developed to make our computers work more transparently, but a technology has recently been Open Sourced that provides a playground for a new way of thinking about the Open Source desktop: Project Looking Glass.

Project Looking Glass (PLG) is a technology developed by Sun to provide a 3D desktop environment. The environment lets you perform simple operations such as flipping windows and changing the perspective and view of an object. Sun created the software to explore the possibilities of 3D applications and a 3D desktop, and although it is fairly useless in its current incarnation, the prototype provides a usable framework for creating 3D applications and experimenting with a new way of interacting with software.

The aim of this article is to discuss some ideas and concepts for making use of a 3D environment. Before I continue, a few disclaimers are in order. First of all, I am no usability expert, and I am actually fairly cynical about certain aspects of usability theory. As such, you should take my ideas here as simply ideas - they were in no way researched and are not backed up with data to prove their usefulness. Secondly, the ideas here can apply to any 3D environment or software, not specifically PLG. Feel free to make use of these ideas in your own 3D environment.

I believe that a 3D environment could be useful. There has been much discussion on the net about the worth of a 3D interface, particularly considering that it is confined to your 2D screen and typical 2D input devices: keyboard and mouse. Although I share some of this cynicism, I also believe that people can perceive 3D sufficiently on a screen to interact with it. You only have to look at how we perceive 3D in video games and movies to see this. I think the biggest challenges we face are not with perception, but with the input and architecture of the environment.

Input

I think it is fair to say that it is unreasonable to expect users of a 3D interface to go out and buy a special input device for their computer. We are not aiming to build a Minority Report type system here; the aim is to create a level of useful 3D interaction that is as familiar and intuitive as possible. I do believe that the mouse is useful here.

3D interfaces are based around three axes:

When considering our input mechanism, we need to take these axis requirements into account. In addition, we need to consider the selection requirements. I believe that selection will be as simple in the 3D space as it is in a 2D space: you need to be able to select something (such as loading an application by double-clicking an icon) and you need to be able to hold something (such as dragging an icon by single-clicking, dragging and releasing). The only other possible requirement is a context menu, but I am rather skeptical of these, and I think a better solution can be achieved in the 3D space with semi-transparent overlays.

With these considerations, one choice of input could be:

Although I have suggested which button can do what, these combinations can obviously be changed. The main point I am making is that you need a selection button and a means to control each axis. Some people have suggested using a Shift/Ctrl/Alt key in combination with the mouse, but I think this feels a little clumsy.
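To make the requirement of one selection button plus a control for each axis concrete, here is a minimal sketch in Python. The specific mapping is my own assumption for illustration, not a recommendation: horizontal mouse motion drives x, vertical motion drives y, the scroll wheel drives z, and a single button selects or holds. The names `Cursor3D` and `apply_mouse_event` are hypothetical and do not come from PLG or any other toolkit.

```python
from dataclasses import dataclass


@dataclass
class Cursor3D:
    """Position of a pointer in the 3D scene."""
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0
    holding: bool = False  # True while the selection button is held down


def apply_mouse_event(cursor, dx=0.0, dy=0.0, wheel=0.0, button_down=None):
    """Return a new cursor after one mouse event.

    Assumed mapping (illustrative only): horizontal motion -> x,
    vertical motion -> y, scroll wheel -> z, one button -> select/hold.
    """
    # Keep the previous button state unless this event changes it.
    holding = cursor.holding if button_down is None else button_down
    return Cursor3D(cursor.x + dx, cursor.y + dy, cursor.z + wheel, holding)


cursor = Cursor3D()
cursor = apply_mouse_event(cursor, dx=5, dy=-2)       # move across the screen plane
cursor = apply_mouse_event(cursor, wheel=3)           # push deeper into the scene
cursor = apply_mouse_event(cursor, button_down=True)  # grab whatever is under the cursor
print(cursor)
```

Because the mapping lives in one small function, swapping the wheel for a second button or a modifier key would only change how `dx`, `dy` and `wheel` are filled in, which fits the point that the combinations themselves are interchangeable.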

3D representation of objects

The 3D interface will never amount to anything if we don't consider some specific use cases and how the interface can be best used. I think the key to defining 'best used' is to clearly separate out 2D and 3D functions. I see no point in making everything 3D; some things are inherently 2D (such as creating a word processed document) and the interface should allow you to edit your document in a 2D window as if you were using KDE/GNOME.

I think the true value of 3D comes in when we consider how we interact with objects. A while back a friend of mine told me about John Siracusa's analysis of the spatial Finder, and I found his commentary on how we interact with objects interesting. A 3D interface really allows us to take this concept to the next level - in the 3D space we can truly interact with the object itself and not simply with iconic representations of objects.

Let us take, for example, a file. In most current GUIs, a file is represented by an icon. This icon can be interacted with in the sense of moving it to different locations and clicking on it to load the file into a viewer. In the 3D space the object on screen could literally be an accurate representation of the file itself. In this sense we could represent some of the following types of file:

Some icons will obviously be 2D by their very nature. A .png or .jpg image is obviously a flat 2D image and is represented as such, but the key is in providing the user with a realistic representation of what type of content the object is. As an example, the user needs to see an intrinsic link between the document they type into and the document that comes out of their printer.

Application use cases

Before we can consider any kind of development effort, we need to come up with some ideas for how 3D applications will work. We need to formulate these ideas into use cases that can be clearly discussed and debated. Here are some ideas:

File management

If there is something that humans seem to have no problem understanding, it is drawers, cupboards, fridges and other square boxes with a door on the front. We also understand pigeon holes, boxes, containers and other methods of putting one object in another. We also innately understand that if you put two objects in a box you only need to move the box to move both objects. This can be useful for dealing with directories and moving files around.

I think what we need to create in this kind of interface is a number of visual representations of real world storage containers. As an example, a hard disk could be represented as an office or storage room (we need to visually suggest that the hard disk is bigger than anything on it, so we represent the disk itself as a larger room). Within this room we then have a number of storage cabinets (directories) in which the files can be stored. Moving a cabinet from one room to another should be as simple as dragging it from one room to the other. With the metaphor of cabinets we can also have different types of storage container for different types of information. A typical My Documents type directory could be a filing cabinet, for example.
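The rooms-and-cabinets metaphor maps naturally onto a tree of containers. As a rough sketch, assuming a hypothetical `Container` class of my own invention (nothing here is a PLG API), the Python below shows how moving one cabinet between two rooms carries every file inside it along, just as moving a box moves both objects in it.

```python
class Container:
    """A room, cabinet, or other box that can hold files and other containers."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []  # nested containers
        self.files = []     # file names stored directly in this container

    def add(self, child):
        """Place a container inside this one, removing it from its old parent."""
        if child.parent is not None:
            child.parent.children.remove(child)
        child.parent = self
        self.children.append(child)

    def all_files(self):
        """List every file in this container and in everything nested inside it."""
        found = list(self.files)
        for child in self.children:
            found.extend(child.all_files())
        return found


disk_a = Container("disk A")         # a hard disk, shown as a room
disk_b = Container("disk B")
cabinet = Container("My Documents")  # a directory, shown as a filing cabinet
cabinet.files += ["letter.txt", "photo.png"]

disk_a.add(cabinet)
disk_b.add(cabinet)                  # drag the cabinet into the other room

print(disk_a.all_files())  # []
print(disk_b.all_files())  # ['letter.txt', 'photo.png']
```

The user only ever moves the cabinet; the files are never touched individually, which is the behaviour the physical metaphor promises.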

With this kind of metaphor I want to steer clear of someone walking into a 3D room in a Doom III style manner and moving a hand around to pick up files. This whole metaphor is based on iconic meaning tied in with a real world relationship between the objects. Here is a use case:

Creating content. E.g. burning a CD

The concept of burning a CD follows my ideas for creating any type of simple content. For this we need to identify the core components of the object we are creating, and put on the screen a simple template that allows the user to click on the relevant part of the object to change it. For a burnable CD we will typically have the CD itself and a cover. We may also have a cover for the back of a CD case. Here is the use case:

What could be useful for this case is that when the user buys some new CDs he/she is encouraged to add them to the media store - this way the computer can let the user know when he/she is running out of media. This is particularly useful if the computer also checks whether the CDs are working or damaged when the burning process is finished.
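The media store idea amounts to a small piece of bookkeeping. The Python sketch below is one possible shape for it, with everything in it assumed for illustration: the `MediaStore` class, its method names and the low-stock threshold are all hypothetical, and the verification result is simply passed in rather than read from a real drive.

```python
class MediaStore:
    """Tracks the blank discs the user has told the computer about."""

    def __init__(self, low_threshold=3):
        self.blank_discs = 0
        self.damaged_discs = 0
        self.low_threshold = low_threshold  # assumed warning level

    def add_discs(self, count):
        """Record newly bought discs."""
        if count < 0:
            raise ValueError("count must not be negative")
        self.blank_discs += count

    def record_burn(self, verified_ok):
        """Use up one disc and note whether the post-burn check passed."""
        if self.blank_discs == 0:
            raise RuntimeError("no blank discs left in the media store")
        self.blank_discs -= 1
        if not verified_ok:
            self.damaged_discs += 1

    def running_low(self):
        """True when the user should be told to buy more media."""
        return self.blank_discs <= self.low_threshold


store = MediaStore(low_threshold=2)
store.add_discs(4)
store.record_burn(verified_ok=True)
store.record_burn(verified_ok=False)  # the check after burning found a bad disc
print(store.blank_discs, store.damaged_discs, store.running_low())  # 2 1 True
```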

Device handling

When a user plugs in a device, it should be visually represented on the screen. This will make an intrinsic link between the physical device and the virtual device, although they may look different physically (this is the biggest problem). With this device on screen, the user should be able to interact with it in a similar way to the real device. Let us assume we are plugging in a digital camera:

This system is not radically different to the current method of viewing pictures on a drive, but we are connecting together the concept of pictures on the device and actually dragging them to somewhere useful.

These use cases are not necessarily the right way to do things, but they provide a starting point for discussion. With more consideration and some prototypes we can better target the 3D aspects of the interface in the applications and make these use cases more representative of how we physically interact with the world.

Conclusion

I firmly believe that the 3D desktop environment has some great potential, but it needs to combine the best elements of the 2D methods we currently use and the innovative 3D ideas we will consider in the desktop of the future. This article has been written to hopefully pique the interest and ideas of people to think about how we can create an interface that is far easier to use and more representative of the real world.

The biggest challenge when implementing an interface such as this is deciding how far to go in representing reality. As an example, when you plug your camera in and look at the pictures on the virtual screen, you should really be able to use the functions on the camera as if it were the physical device, but the software limits this potential to merely grabbing pictures and maybe taking a few shots. In this sense the physical representation cannot be fully imitated - we simply need to get a good batting average.

I would love to hear your thoughts on all of this, so feel free to get in touch with me or scribe your thoughts down in the comments box below. I am as interested in learning new ideas as I am in coming up with them; this could really mark a new wave in the Open Source desktop revolution.

Jono Bacon is an award-winning community manager, author and consultant, who has authored four books and acted as a consultant to a range of technology companies. Bacon's weblog (http://www.jonobacon.org/) is one of the most widely read Open Source weblogs.

oreillynet.com Copyright © 2006 O'Reilly Media, Inc.