Longhorn introduces significant new graphics technology, currently known by its codename, "Avalon." Avalon renders an application's visual elements onto the screen using a much more sophisticated approach than Windows has previously used. In this article, I show how this new graphical composition model solves various limitations of Win32, what new user interface design techniques this enables, and what it means to developers.
Note: the content of this article applies to the Longhorn Developer Preview (build 4051) released at the 2003 PDC. Although this release does not contain the DCE (Desktop Composition Engine), all of the composition techniques described in this article can still be used. The absence of the DCE simply means that Avalon's graphical composition support can currently only be used inside of an application's windows.
In computer graphics, composition is the process of combining multiple images into a single image. It can mean anything from the very simple process of showing two pictures next to each other right up to the sophisticated techniques used in the cinema industry, where each frame may be made up of hundreds of images combined using a wide range of visual effects.
All windowing systems support some level of composition. The image you see on your computer screen is composed from many discrete elements. For example, a web browser window typically has many toolbar buttons, an address field, a menu, and a status bar, as well as the main browsing panel. These elements are combined with the window title bar and borders supplied by the OS to form the appearance of the whole window. This in turn will be composed with any other windows that are currently visible to form the image on your screen.
The set of composition techniques a windowing system offers influences the way in which user interfaces can be built. Avalon offers much more sophisticated graphical composition than has previously been available under Windows. To see why, we'll take a quick look at where current versions of Windows fall short.
Windows has always used a fairly simple form of composition. The screen is divided up into regions, one for each visible window. (Note that in Win32, controls such as buttons and textboxes are all treated as windows. So for the purposes of this discussion, a window is anything with an HWND.) Whenever part of the screen needs to be updated, Windows sends a message (WM_PAINT) to each of the windows whose regions are in the area to be redrawn. The programs that own these windows process this message by using Win32's drawing APIs to construct the appearance of the window on screen. Windows clips this drawing output so that each window only draws into its allocated region.
The shapes of these regions change dynamically as you move or resize windows. Although programmers can define a region for a window in order to control its shape, the visible region may not be the same as the region specified by the programmer. For example, if any part of the window is obscured by some other window, the effective region will be whatever remains visible.
Figure 1. Two overlapping windows
For example, consider the main client areas of the two windows in the picture above. These areas are both nominally rectangular, but only Window 2's contents currently occupy a rectangular region on screen. The grey client area of Window 1 is partially obscured, so for composition purposes, its visible region is a rotated L shape. (And because Windows XP has added a title bar with curved corners, the L-shaped region Window 1 gets to draw into has a curved inner corner.) Behind that, you can see part of the desktop, and while the desktop window is notionally rectangular, it has quite a convoluted visible region here.
A crucial feature of this style of composition is that none of the visible regions overlap. The notional regions may coincide -- the two windows above clearly appear to overlap. But for composition purposes, only the visible regions count, and visible regions are always disjoint.
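The disjointness rule can be made concrete with a small model. The following is only an illustrative sketch, not how Windows implements regions: it treats each window as a rectangle on a tiny pixel grid, walks the windows from front to back, and gives each one whatever pixels nearer windows have not already claimed. The window names and coordinates are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom); right and bottom are exclusive
Cell = Tuple[int, int]


def cells(rect: Rect) -> Set[Cell]:
    """Return every pixel coordinate covered by a rectangle."""
    left, top, right, bottom = rect
    return {(x, y) for x in range(left, right) for y in range(top, bottom)}


def visible_regions(windows: List[Tuple[str, Rect]]) -> Dict[str, Set[Cell]]:
    """Compute each window's visible region; windows are listed front to back."""
    claimed: Set[Cell] = set()
    regions: Dict[str, Set[Cell]] = {}
    for name, rect in windows:
        region = cells(rect) - claimed  # whatever nearer windows have not covered
        regions[name] = region
        claimed |= region
    return regions


# Hypothetical layout: window2 sits in front of window1, and both sit over the desktop.
windows = [
    ("window2", (4, 3, 10, 8)),
    ("window1", (1, 1, 7, 6)),
    ("desktop", (0, 0, 12, 10)),
]
regions = visible_regions(windows)
```

In this layout window2 keeps its full rectangle, window1 is left with an L-shaped region, and the desktop gets everything else. No pixel belongs to more than one region, which is the property the rest of this discussion relies on.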
The Visual Styles feature in Windows XP and Windows Forms are both able to give the impression of supporting more advanced composition. They each allow a form of transparency, where a parent window's background is visible through a child window. For example, this picture shows a window with a gradient fill background and three child radio-button windows:
Figure 2. A window with pseudo-transparent child controls
The top control works in the old-fashioned way -- it occupies its own rectangle entirely. It has attempted to blend in with its surroundings by filling in its region with what it thinks is the parent window's background color, but it has failed because the parent window doesn't have one single background color. This illustrates one of the shortcomings of a composition approach where any given region of the screen is filled by just one control. However, the other two controls appear to work just fine. Not only does the background show through, but the controls have successfully drawn the text using ClearType rendering, which requires pixel-level blending effects. (And you can even write custom controls that use GDI+ partial transparency to draw see-through things over the form background.)
However, the way this is done is a bit of a hack -- these controls are not really being composed in the way that it appears. They are occupying rectangular regions, just like the control at the top. Visual Styles and Windows Forms both use the same basic trick: "transparent" controls arrange for the parent window to draw their backgrounds for them, passing in their own graphical context. This means that these controls' backgrounds look like that part of the main window would have looked if the child controls had not been there, providing the illusion that we can see the main window behind the controls. However, if any of the child windows overlap each other, it becomes clear that each control still occupies its own region of the screen exclusively, and the illusion is destroyed:
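A toy model shows why the trick breaks down when siblings overlap. This is a simplified sketch under stated assumptions, not the actual Visual Styles or Windows Forms code: `parent_background` and `paint_child` are hypothetical helpers, and the screen is just a dictionary of pixel labels.

```python
from typing import Dict, Tuple

Cell = Tuple[int, int]
Rect = Tuple[int, int, int, int]  # (left, top, right, bottom); right and bottom are exclusive


def parent_background(x: int, y: int) -> str:
    """Hypothetical gradient fill: the parent's color depends on the row."""
    return f"gradient-row-{y}"


def paint_child(screen: Dict[Cell, str], rect: Rect, content: Dict[Cell, str]) -> None:
    """Paint a pseudo-transparent child: borrowed background first, then its own content."""
    left, top, right, bottom = rect
    for x in range(left, right):
        for y in range(top, bottom):
            # The parent paints this pixel on the child's behalf.
            screen[(x, y)] = parent_background(x, y)
    for cell, label in content.items():
        screen[cell] = label  # the child's own foreground


screen: Dict[Cell, str] = {}
paint_child(screen, (0, 0, 4, 2), {(3, 1): "child1-text"})
paint_child(screen, (3, 0, 6, 2), {(5, 0): "child2-text"})  # overlaps child 1 at x == 3
```

Where the two children overlap, the second child repaints the parent's background over the first child's text, so the first child's content disappears instead of showing through.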
Figure 3. The shortcomings of pseudo-transparency
If you are familiar with the window styles available in Win32, you might be wondering why pseudo-transparency is done this way rather than by adjusting the window clipping styles, such as WS_CLIPSIBLINGS. These control whether the visible regions of windows may overlap, which sounds like precisely what we want. However, letting regions overlap makes it impossible for an application to update its display without flickering. The key to eliminating flicker is to perform updates in a single step (typically by doing all the drawing into an off-screen buffer, and then copying that to the screen once all drawing is complete). But overlapping regions force the painting to be done in separate stages -- in order to update an area of the screen where multiple controls overlap, Windows must send a WM_PAINT message to each control in turn, which precludes the use of flicker elimination techniques.
By faking transparency the way that they do, the Visual Styles system and Windows Forms permit child windows to appear to be transparent and yet provide perfectly flicker-free updates.
(Windows 2000 introduced "layered" windows, which really can overlap, and they even support semi-transparent blending. However, this can only be applied to top-level windows. This facility is of no use for composition of child windows.)
Pseudo-transparency works pretty well, in many cases. But it does impose constraints on what kinds of user interfaces you can build. For example, developers who are new to Windows often ask how to use controls to layer adornments, such as selection outlines or draggable items, on top of the contents of a form. Attempting to use pseudo-transparency in these scenarios usually results in overlaps that destroy the illusion.
The inability to handle overlapping siblings also tends to constrain the visual design of forms. The only way for applications to break free of these constraints is to abandon window-based composition entirely -- if you do all of your own drawing into one big window, you can control the composition process yourself, breaking free of these restrictions. Try looking at some of the more visually capable applications that Microsoft produces with a window examining tool, such as Spy++. Internet Explorer, MSN Messenger, and the task list in Windows XP's Explorer all use one big window, and perform their own composition rather than using child windows. One of the giveaways in MSN Messenger is that the scrollbars don't behave quite like Windows scrollbars. Normally, if you move the mouse pointer far enough away from a scrollbar while scrolling, it snaps back to its starting position. But the MSN Messenger scrollbars don't support this behavior. This kind of inconsistency invariably creeps in when applications try to recreate standard components from scratch.
While these applications illustrate that it is possible to break free of the constraints by performing your own UI composition, developers shouldn't have to spend their time doing this. Composition is one of the basic facilities that should be provided by the platform, and if a platform feature no longer meets developers' needs, it's time for the platform to improve. Enter Avalon.
Avalon introduces a new model for graphical composition. It does two things in a fundamentally different way from Win32 that enable much richer composition without risking flicker, while making developers' lives much easier. First, visual elements are free to overlap. Second, composition is all done off-screen.
Because Avalon is designed to allow visual elements to overlap as much as they like, elements are no longer required to supply their own backgrounds. So while in Win32, the first thing a control typically draws is its background, in Avalon, the background is whatever happens to be underneath. A control can choose to be opaque and fill in its background if it wants to, but this is no longer a requirement. So if you choose to put some text in a UI, it will draw directly onto whatever is beneath it, rather than appearing in its own solid box as a label control would in Windows. For example, in this XAML snippet:
    <Canvas DockPanel.Dock="Fill">
      <Rectangle Fill="VerticalGradient Navy Cyan" Stroke="DarkGreen" StrokeThickness="3px"
                 Canvas.Left="5" Canvas.Top="20" RectangleWidth="185" RectangleHeight="50" />
      <Ellipse Fill="PaleGreen" Opacity="0.5" Stroke="Green" StrokeThickness="2px"
               Canvas.Left="62" Canvas.Top="10" RadiusX="30" RadiusY="40" />
      <Text Canvas.Left="10" Canvas.Top="0" FontSize="24pt" Foreground="Red">Hello, world!</Text>
    </Canvas>
the text element is drawn over the ellipse and rectangle, but these remain visible behind the text:
Figure 4. Text over an ellipse and a rectangle
This example also illustrates Avalon's support for partial transparency. The ellipse's Opacity property has been set to 0.5, indicating that the composition engine should blend the ellipse with whatever is behind it. The value of Opacity indicates the ratio in which to blend. The default value of 1 indicates that the element should not be see-through at all, while a value of 0 would make the item invisible. The composition engine can also apply other effects; for example, you can clip elements against any path, or you can specify transformations such as scaling, shearing, rotation, and translation. You will also be able to apply image effects such as blurring or a "glow." (The Longhorn preview distributed at the PDC only allows image effects to be applied to bitmaps, but the plan is to be able to apply these effects to any part of the UI.)
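To make "the ratio in which to blend" concrete, here is a minimal sketch of the conventional linear blend used for partial transparency. It is an assumption that Avalon's composition engine behaves like this per color channel; the `blend` function and the color values are purely illustrative.

```python
from typing import Tuple

Color = Tuple[float, float, float]  # red, green, blue, each in the range 0.0 to 1.0


def blend(source: Color, backdrop: Color, opacity: float) -> Color:
    """Mix a source color over a backdrop; opacity is the share taken from the source."""
    if not 0.0 <= opacity <= 1.0:
        raise ValueError("opacity must be between 0 and 1")
    red, green, blue = (
        opacity * s + (1.0 - opacity) * b for s, b in zip(source, backdrop)
    )
    return (red, green, blue)


green_ellipse = (0.0, 1.0, 0.0)
blue_backdrop = (0.0, 0.0, 1.0)

half = blend(green_ellipse, blue_backdrop, 0.5)       # equal parts of each
opaque = blend(green_ellipse, blue_backdrop, 1.0)     # the backdrop is hidden
invisible = blend(green_ellipse, blue_backdrop, 0.0)  # only the backdrop remains
```

With an opacity of 0.5 each channel ends up halfway between the two colors, while the extremes of 1 and 0 reproduce the opaque and invisible cases described above.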
Given this support for overlapping and partially-transparent objects, you might think that Avalon would suffer from the same flickering issues mentioned earlier for window styles such as WS_CLIPSIBLINGS in Win32. After all, if our program does something to change any of the elements in this example, the chances are that all three elements will need to be redrawn in order to generate the correct final picture. For example, if we change the font size of the text, the text will clearly need to be redrawn, but the change will uncover parts of the ellipse and rectangle that were previously covered, so those will need to be at least partially redrawn, as well. However, Avalon manages to do this without causing any flicker. The key to this is the second fundamental change between Win32 and Avalon -- the off-screen composition.
One of the well-known but often misunderstood facts about Avalon is that it makes use of modern graphics cards. This is often described somewhat misleadingly as a "3D API," because the GPUs (Graphics Processing Units) on modern graphics cards are designed mainly to render 3D graphics very quickly. Moreover, the API Avalon uses to control the graphics hardware is DirectX, the API used by most 3D games. This tends to bring to mind futuristic movie-like, virtual-reality user interfaces. However, while 3D user interfaces are certainly a possibility, Avalon actually makes much more pragmatic use of the GPU -- it can put features designed for 3D use to work in generating 2D images.
For example, high-end graphics cards have much more memory than is necessary to manage the screen's contents. A 1600-by-1200 pixel, full-color display needs only about 7.3MB of video memory at 32 bits per pixel, so even a dual-monitor setup, which is high-end by today's standards, requires under 15MB. And yet high-end laptops are shipping today with 128MB of video memory, more than eight times as much memory as would seem to be required. The reason for all of this extra memory is that it allows games to run faster -- it is typically used to hold the "texture" images used to make 3D surfaces look interesting or realistic (or both). Increasing the memory available for textures allows games to provide either more detail or greater variety, with no loss of performance.
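As a rough check on the scale involved, the sketch below works out how much memory the visible frame buffer itself needs. It assumes 32 bits per pixel, counts a megabyte as 1024 × 1024 bytes, and ignores any per-display overhead; the exact figures depend on those assumptions, and the function name is illustrative.

```python
MIB = 1024 * 1024


def framebuffer_bytes(width: int, height: int, bits_per_pixel: int = 32, displays: int = 1) -> int:
    """Bytes needed to hold one full frame for the given number of displays."""
    return width * height * (bits_per_pixel // 8) * displays


single_mib = framebuffer_bytes(1600, 1200) / MIB            # about 7.3
dual_mib = framebuffer_bytes(1600, 1200, displays=2) / MIB  # about 14.6
video_memory_mib = 128
headroom = video_memory_mib / dual_mib                      # between 8 and 9 times
```

Whichever way the figures are rounded, the visible screen contents account for only a small fraction of the memory on a 128MB card.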
In normal non-gaming activity, while some of this video memory is used for caching fonts, most of it typically goes unused today. However, Avalon will make extensive use of this memory. When a GUI needs to be rendered on screen, Avalon does not use the Win32 approach of getting each visible item to draw directly into its on-screen region. Instead, drawing is done into off-screen buffers, using all of that extra video memory normally used for 3D textures. This means that applications are no longer obliged to perform their own double buffering to avoid flicker -- Avalon will do it for them. Moreover, because Avalon is managing the buffering, it can choose the optimum moment to transfer the buffer onto the screen. (In forthcoming builds of Longhorn, the DCE will be responsible for this final transfer to the screen, enabling it to apply any further desktop-level transformations such as scaling, or fancy transition effects.) It can wait until all of the visual elements have made their contributions before updating the screen. So unlike in Win32, where double buffering can only normally be applied at the level of a single window, Avalon can double buffer entire trees of UI elements.
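The difference between painting each element straight onto the screen and buffering a whole tree of elements can be sketched with a toy screen that counts how many distinct pictures the user gets to see. This is only an illustration of the double-buffering idea, not Avalon's implementation; the `Screen` class and the painter functions are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Cell = Tuple[int, int]
Buffer = Dict[Cell, str]
Painter = Callable[[Buffer], None]


class Screen:
    """A toy display that counts how many separate updates become visible."""

    def __init__(self) -> None:
        self.pixels: Buffer = {}
        self.frames_shown = 0

    def present(self, buffer: Buffer) -> None:
        self.pixels.update(buffer)
        self.frames_shown += 1


def paint_directly(screen: Screen, painters: List[Painter]) -> None:
    """Each element's output reaches the screen as a separate, visible step."""
    for paint in painters:
        staged: Buffer = {}
        paint(staged)
        screen.present(staged)


def paint_buffered(screen: Screen, painters: List[Painter]) -> None:
    """Every element draws off-screen; the finished picture is shown once."""
    back_buffer: Buffer = {}
    for paint in painters:
        paint(back_buffer)
    screen.present(back_buffer)


def rectangle(buffer: Buffer) -> None:
    buffer.update({(0, 0): "rect", (1, 0): "rect", (2, 0): "rect"})


def ellipse(buffer: Buffer) -> None:
    buffer.update({(1, 0): "ellipse", (2, 0): "ellipse"})


def text(buffer: Buffer) -> None:
    buffer[(2, 0)] = "text"


direct, buffered = Screen(), Screen()
paint_directly(direct, [rectangle, ellipse, text])
paint_buffered(buffered, [rectangle, ellipse, text])
```

Both screens end up with the same pixels, but the direct version exposes two incomplete pictures along the way, which is what the user perceives as flicker. The buffered version shows only the finished result.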
There are more clever tricks that Avalon can do with this off-screen buffering. For example, it doesn't necessarily have to use a single buffer for a particular window. It could render subsections of the visual tree into their own buffers, and then compose these buffers into the main window's buffer. The big advantage of this is that it allows sections of the UI to change without having to redraw everything.
For example, consider a complex and graphically rich document -- even exploiting the full power of a modern GPU, it might take a significant amount of time to draw. Bear in mind that when elements are being dragged by the mouse or animated, for the movement to look totally smooth, updates should be performed at the refresh rate of the monitor. With the 60Hz refresh used by most LCD panels, this gives you just under 17ms between updates. A complex document might have many thousands of details (particularly if it contains a lot of small text), at which point 17ms starts to look like an aggressive target. So if you wanted to be able to drag anything like a selection rectangle over the document, it might not be able to redraw the entire document at the frame rate, so the drag operation will end up feeling rather sluggish. However, if the composition engine renders the layer containing the drag rectangle in a separate off-screen buffer from the document, all it has to do in order to update the UI is redraw the drag rectangle in its off-screen buffer, and then compose that with the pre-drawn document off-screen buffer, and copy the result to the screen. This is likely to be very much faster than redrawing everything from scratch.
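The saving from keeping the document and the drag rectangle in separate buffers can be sketched as follows. This is a hedged illustration of the caching idea only; the `Layer` class, the `compose` function, and the one-row "document" are hypothetical stand-ins rather than anything Avalon exposes.

```python
from typing import Callable, Dict, List, Optional, Tuple

Cell = Tuple[int, int]
Buffer = Dict[Cell, str]

FRAME_BUDGET_MS = 1000 / 60  # time available per update at a 60Hz refresh rate


class Layer:
    """An off-screen buffer that is only redrawn after it has been invalidated."""

    def __init__(self, name: str, draw: Callable[[], Buffer]) -> None:
        self.name = name
        self._draw = draw
        self._cache: Optional[Buffer] = None
        self.redraws = 0

    def invalidate(self) -> None:
        self._cache = None

    def pixels(self) -> Buffer:
        if self._cache is None:
            self._cache = self._draw()
            self.redraws += 1
        return self._cache


def compose(layers: List[Layer]) -> Buffer:
    """Combine layers back to front; nearer layers overwrite farther ones."""
    frame: Buffer = {}
    for layer in layers:
        frame.update(layer.pixels())
    return frame


drag_state = {"x": 0}
document = Layer("document", lambda: {(x, 0): "doc" for x in range(100)})
selection = Layer("selection", lambda: {(drag_state["x"], 0): "outline"})

frame: Buffer = {}
for step in range(60):  # one second of dragging at 60 updates per second
    drag_state["x"] = step
    selection.invalidate()  # only the drag rectangle has changed
    frame = compose([document, selection])
```

Over the sixty updates the expensive document layer is drawn once, while the cheap selection layer is redrawn every time. That is what keeps each update comfortably inside the budget of roughly 16.7ms.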
The use of off-screen buffers also allows Avalon to exploit the graphics card's ability to transform images. All modern GPUs are able to perform scaling, rotation, and shearing of images as part of their basic 3D functionality. This means that if you have an animation that is rotating or zooming part of the UI, those parts don't necessarily need to be redrawn each time the transformation changes -- the animation can simply use the graphics card to transform the pre-drawn image during the composition process. There are limits to this technique -- if you scale an off-screen buffer too much, the image quality will suffer. However, this technique can be used to reduce the frequency with which a full redraw is required, reducing the load on the CPU.
If there is sufficient memory available on the graphics card, Avalon does not need to free up the off-screen buffers it allocates once it has finished updating the display. It can keep them, so that if an application's window needs to be redrawn due to other display activity, it can just copy the image straight out of the buffer back onto the screen without needing to get the application code involved at all.
Even if there is insufficient video memory to buffer all open windows, Avalon doesn't need to pester the application, because it has another way of retaining the information needed to rebuild the display. Avalon maintains a scene graph, which is a data structure representing all of the elements of an application's UI. This is a private structure that your code cannot access directly, and is not the same thing as the logical tree or visual tree that you program against, although its structure will be closely related to theirs. It contains all of the information required to represent the visible aspects of the UI, including animation. If you make changes to the logical tree that cause visible changes, this will result in the scene graph being updated appropriately. But because the scene graph is a distinct data structure that the system controls, Avalon can keep the display up to date autonomously -- animations and video playback can run without continuous attention from the application.
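The idea of a retained description of the UI can be sketched with a small tree that the "system" side can re-render whenever it likes. This is a deliberately simplified model with hypothetical names: the real scene graph is private to Avalon, and here the application edits the tree directly, whereas in Avalon changes flow in from the logical tree.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SceneNode:
    """One retained element: enough information to redraw it without asking the app."""

    description: str
    opacity: float = 1.0
    children: List["SceneNode"] = field(default_factory=list)


def render(node: SceneNode, depth: int = 0) -> List[str]:
    """Walk the retained tree and emit one line per element, parents first."""
    lines = [f"{'  ' * depth}{node.description} (opacity {node.opacity})"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


class RetainedDisplay:
    """Stands in for the system side: it keeps the scene and repaints from it."""

    def __init__(self, scene: SceneNode) -> None:
        self.scene = scene
        self.repaints = 0

    def repaint(self) -> List[str]:
        self.repaints += 1
        return render(self.scene)  # no application paint handler is involved


scene = SceneNode("canvas", children=[
    SceneNode("rectangle"),
    SceneNode("ellipse", opacity=0.5),
    SceneNode("text"),
])
display = RetainedDisplay(scene)

first = display.repaint()
second = display.repaint()       # for example, after another window uncovers this one
scene.children[1].opacity = 0.8  # a change to the UI updates the retained description
third = display.repaint()
```

The second repaint reproduces the first without running any application drawing code. The third picks up the changed opacity simply because the retained description was updated.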
This retained scene graph lets the operating system make fewer demands on the program. In classic Win32 applications, the system sends lots of WM_PAINT messages to the application in order to keep the screen up to date, but with Avalon, this is not necessary. This offers several benefits. The display remains intact even if the application is busy. The load on the CPU is reduced because far fewer messages need to be sent to the application. And this technique scales better in terminal server scenarios, both because of lower CPU load and the fact that the scene graph can be sent to the client machine in a reasonably compact form. This enables most of the repainting to be offloaded to the client, allowing high-quality visuals and even animation to be used without using too much network bandwidth.
The new composition model in Avalon removes many of the visual design constraints that applied to most Win32 applications. It also improves performance by making more effective use of modern graphics cards, reducing the frequency with which the OS has to call the application back to keep the display up to date, and enabling a much higher quality of user interface.
Ian Griffiths is an independent consultant specializing in medical imaging applications and digital video. He also works as an instructor, teaching courses on .NET for DevelopMentor. Ian holds a degree in computer science from Cambridge University.
Copyright © 2009 O'Reilly Media, Inc.