Is 2-D graphics the next killer app for Linux?
Raph Levien
artofcode LLC

Abstract:

This paper describes some of the technical issues and challenges involved in high quality 2D graphics displays. We further describe a number of projects that provide infrastructure for 2D graphics applications under Linux and other free operating systems.

1 Introduction

The compelling technical strengths of Linux and other free software systems in multitasking and networking have brought them considerable success in the area of Web servers. In this presentation, I demonstrate the similar technical strengths free software is gaining in 2D graphics, and argue that this area could be the next ``killer app'' for Linux.

Advances in display technology present an opportunity and a challenge for 2D graphics software. As resolutions pass 140 dpi for CRT's and 200 dpi for LCD's, the user experience becomes a sharp, clear display of information rather than a field of pixels, with a dilemma between smooth but fuzzy antialiasing, or sharp but jaggy edges. These displays pose a formidable challenge to drive well. First, software must be designed to be scalable, or resolution independent. Second, at these resolutions, good antialiasing is crucial for the highest quality. Third, these displays require generating pixels at a very high bandwidth.

Fortunately, between existing technology and new projects in development, Linux will meet these challenges. We will present and demonstrate four particularly instructive technologies:

Ghostscript is a mature program that has long provided scalable, resolution-independent rendering for both screen and printer. We will demonstrate the new transparency and blending capabilities of PDF 1.4, scheduled for the next major release.
Nautilus, the new Gnome file manager from Eazel, has the core technology in place for scalable display, including vector graphics icons and scalable antialiased text. Nautilus uses the Libart rendering engine and the Gnome Canvas.
The XFree86 Render extension, currently in development, will provide hardware acceleration for 2D graphics primitives including antialiased vector path and text rendering, alpha compositing, and semitransparent images.
Keith Packard has developed an experimental X server that can alpha composite between windows. User interfaces built on top of this technology have the potential to rival the most advanced proprietary systems, including Mac OS X's Aqua user interface.

The core technology is coming. The challenge now is to make sure that it is integrated throughout all applications. This presentation will help show the way for users, developers of applications, and those with an interest in where the technology is going.

2 The evolution of resolution

The resolution of computer displays has been experiencing steady improvement for quite some time. While a thorough examination of historical data is beyond the scope of this paper, we examine the trends and underlying technology, and offer some predictions. We also take a look at the theoretically useful limits of resolution in terms of its ability to be perceived by the human visual system.

First, what is resolution? The fundamental parameters are pixels per inch and color depth. In pure information-theoretic terms, the maximum information carrying capacity of a square inch of display is the square of the resolution in pixels per inch multiplied by the color depth in bits.
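As a rough illustration of this formula, the following sketch computes the capacity bound for two displays. The function name and the two sample configurations are my own choices for illustration, not measurements of any particular device.

```python
def display_capacity_bits(ppi: float, bits_per_pixel: int) -> float:
    """Upper bound on information per square inch of display, in bits.

    This is the square of the resolution (pixels per inch) multiplied by
    the color depth, and ignores any degradation added by the device.
    """
    return ppi ** 2 * bits_per_pixel


# A one-bit display at 75 ppi versus a 24-bit display at 200 ppi.
mono_75 = display_capacity_bits(75, 1)
color_200 = display_capacity_bits(200, 24)

print(f"75 ppi, 1 bit:    {mono_75} bits per square inch")
print(f"200 ppi, 24 bits: {color_200} bits per square inch")
print(f"ratio: {color_200 / mono_75:.1f}x")
```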

These figures don't tell the whole story, however. Display devices don't display these bits of information perfectly and purely; they almost always add some form of degradation to the image. The form and amount of degradation varies with the type of the device. The primary form of degradation in CRT's tends to be blurring, primarily caused by the aperture grille or shadow mask used to make the device capable of color. In inkjet printers, the degradation is primarily inaccurate positioning of ink drops, combined with ink spreading on the page. In laser printers, it's toner spreading and noise caused by the xerographic process. Good quality LCD's offer quite little degradation, although distortion of the tone range is common, especially for off-axis viewing.

Nonetheless, most displays in widespread use today operate fairly close to the theoretical limits imposed by their resolution and color depth.

The introduction of the Macintosh in 1984 set a standard for displays in consumer computers: 75 pixels per inch, albeit with only a single bit per pixel of color depth.

As of the mid-90's, the resolution on PC's was not considerably greater, typically 96 pixels per inch, although color depth had increased to 16 or 24 bits truecolor.

Today, most people still run their displays at around 96 pixels per inch, largely due to software limitations. However, higher resolution consumer devices are available. The Dell Inspiron 8000 laptop, for example, is available with a 1400 x 1050 pixel 14.1" LCD display, for a resolution of 124 dpi. Similarly, higher-end monitors such as the ViewSonic PF815 typically have maximum resolutions in the 120 dpi range. The Nokia 446PRO has a maximum resolution of 133 dpi.
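The 124 dpi figure for the 1400 x 1050 laptop panel follows from dividing the diagonal pixel count by the diagonal size. A minimal sketch of that arithmetic, assuming square pixels (the function name is mine):

```python
import math


def pixels_per_inch(width_px: int, height_px: int, diagonal_in: float) -> float:
    """Resolution of a display with square pixels, from its pixel
    dimensions and its diagonal size in inches."""
    diagonal_px = math.hypot(width_px, height_px)
    return diagonal_px / diagonal_in


# The 1400 x 1050 pixel, 14.1 inch LCD mentioned above.
inspiron = pixels_per_inch(1400, 1050, 14.1)
print(f"{inspiron:.1f} dpi")
```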

200 dpi LCD screens have been fabricated as research prototypes (see http://www.research.ibm.com/roentgen/) and have just begun low-volume production [IBM00]. It is reasonable to expect that displays of this resolution will be available commercially soon.

Display resolutions beyond 200 dpi are feasible, but won't provide much improvement in actual display quality. This assertion may seem surprising to those who note the dramatic difference in quality between 300 and 600 dpi laser printers, for example. However, laser printers have one bit per pixel color depth. As such, they produce jaggy edges and strokes quantized to integer pixel widths.

The human visual system has no hard resolution cutoff. Instead, contrast sensitivity decreases as spatial frequencies increase. Sensitivity peaks at approximately 3 cycles per degree (corresponding to roughly 18 dpi at an 18 inch viewing distance), falls to roughly 10% of the peak at 10 cycles per degree (corresponding to 60 dpi), and trails off to 1% of the peak at 30 cycles per degree (corresponding to 190 dpi).
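The correspondence between cycles per degree and dpi can be checked with a little trigonometry. The sketch below assumes that one cycle needs two pixels (one light, one dark) and that the display is viewed head-on; under those assumptions it lands close to the rounded figures quoted above. The function name is hypothetical.

```python
import math


def cycles_per_degree_to_dpi(cpd: float, viewing_distance_in: float = 18.0) -> float:
    """Pixel density needed to show `cpd` cycles per degree of visual angle.

    Assumes two pixels per cycle and a flat display viewed head-on from
    `viewing_distance_in` inches.
    """
    inches_per_degree = viewing_distance_in * math.tan(math.radians(1.0))
    pixels_per_degree = 2.0 * cpd
    return pixels_per_degree / inches_per_degree


for cpd in (3, 10, 30):
    print(f"{cpd:2d} cycles/degree -> {cycles_per_degree_to_dpi(cpd):.0f} dpi")
```

Moving the viewer closer raises the required density in proportion, which is why a shorter preferred viewing distance shifts the point of diminishing returns upward.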

Of course, individuals vary in visual acuity, and some may also prefer viewing distances closer than 18 inches. However, it is reasonable to conclude that a high quality 200 dpi display with good color depth represents a ``magic point'' in display technology, beyond which the gains are marginal.

3 Hinting and antialiasing

Hinting is a technique for improving the quality of text rendering [Mic98]. It generally works by gently distorting the shape so that it is aligned to the pixel grid before rendering. Hinting improves rendering in a number of ways, including:

Making stroke widths uniform.
Making characters symmetrical.
Avoiding flat or ``pimply'' curve extrema.

Thus for one bit deep displays, hinting is almost always a win. A large number of technologies for font hinting exist, including Metafont [Knu86], Adobe's Type 1 format, and TrueType. All of these techniques embed ``hints'' in the font to direct the renderer how to distort the font shape (hence the name). Some renderers work on the basis of ``autohinting'' and do not require hints embedded in the font. These include John Hobby's groundbreaking research on automatic glyph rendering [Hob93], as well as the new autohinting module in FreeType.

Figure: Unhinted vs. hinted glyph rasterization. From [Hob93].

Antialiasing is a technique for replacing jagged edges with a grayscale representation. As such, it is only applicable to displays with more than one bit of color depth. In a non-antialiased display of a shape, each pixel is colored according to whether or not its center is inside the shape. Thus, a diagonal edge is rendered with a jagged edge as the edge steps from one row of pixels to the next. In an antialiased display, the pixel is colored according to the fraction of the pixel inside the shape. If the entire pixel is inside, it is colored black. If a portion of the pixel is inside, it is colored gray. Thus, diagonal edges are rendered with soft ramps of gray rather than jaggies.
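The difference between the two rules can be sketched in a few lines. The example below is illustrative only: it estimates pixel coverage by supersampling on a regular 4 x 4 grid, which is one simple way to approximate the covered fraction and not necessarily what any of the renderers discussed here do. The shape (a half-plane under a diagonal edge) and all names are assumptions of the sketch.

```python
def inside(x: float, y: float) -> bool:
    """Example shape: the half-plane on one side of the diagonal edge y = x / 2."""
    return y < x / 2.0


def aliased(px: int, py: int) -> float:
    """Non-antialiased rule: the pixel is fully on iff its center is inside."""
    return 1.0 if inside(px + 0.5, py + 0.5) else 0.0


def antialiased(px: int, py: int, n: int = 4) -> float:
    """Antialiased rule: fraction of the pixel inside the shape,
    estimated from an n x n grid of sample points."""
    hits = sum(
        inside(px + (i + 0.5) / n, py + (j + 0.5) / n)
        for i in range(n)
        for j in range(n)
    )
    return hits / (n * n)


# Walk along the row of pixels that the edge crosses.
for px in range(3):
    print(px, aliased(px, 0), antialiased(px, 0))
```

With center sampling the row jumps straight from off to on, while the coverage values ramp through intermediate grays, which is the soft ramp described above.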

On multibit displays, antialiasing is a tradeoff. It gets rid of the ``jaggies'' and improves the accuracy of rendering overall, but at a price: softness or fuzziness of the edges, and also lower contrast.

One technique used to mitigate these effects is to combine antialiasing and hinting. The most important effect of this technique is to align vertical and horizontal strokes to the pixel grid, so that the edges of these strokes have full contrast. However, diagonal and curved edges still suffer from softness.

The biggest drawback to hinting is that it distorts the letter shapes. This prevents the display from being smoothly scalable, and also detracts from the accuracy of the displayed text, which is particularly annoying in desktop publishing applications. This problem acutely affects the spacing between letters, as hinting quantizes widths and offsets to integer coordinates. Yet, at today's resolutions, the improvement in edge sharpness and contrast makes hinting in conjunction with antialiasing an appealing tradeoff.

As resolution increases, however, the tradeoff tilts in favor of not performing hinting. For glyphs rendered to sizes of 20 pixels, stroke width for normal fonts increases to two pixels. Strokes two pixels or wider always have a fully-on pixel in their interior, while for thinner strokes this is not guaranteed unless the stroke edges are aligned to pixel boundaries. Thus, contrast of unhinted antialiased strokes is fairly poor at low resolutions, steadily improves as resolution increases, and then achieves full contrast at approximately 140 dpi and higher. Consequently, at high resolutions, the distortion caused by hinting letterforms is not offset by any improvement in contrast.
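The claim about two-pixel strokes can be verified with a one-dimensional model. The sketch below treats a vertical stroke as an interval on a row of unit pixels, slides it across sub-pixel offsets, and reports the darkest pixel in the worst case. The function names and the choice of 16 offsets per pixel are assumptions made for illustration.

```python
def max_pixel_coverage(left: float, width: float) -> float:
    """Largest coverage of any unit pixel by the stroke [left, left + width).

    `left` is assumed to be non-negative.
    """
    right = left + width
    return max(
        max(0.0, min(right, i + 1) - max(left, i))
        for i in range(int(left), int(right) + 1)
    )


def worst_case_contrast(width: float, steps: int = 16) -> float:
    """Darkest-pixel coverage for the least favorable sub-pixel offset."""
    return min(max_pixel_coverage(k / steps, width) for k in range(steps))


for width in (1.0, 1.5, 2.0, 3.0):
    print(f"stroke width {width}: worst-case peak coverage "
          f"{worst_case_contrast(width)}")
```

In this model an unhinted one-pixel stroke can drop to half contrast when it straddles a pixel boundary, while any stroke of two pixels or more keeps at least one fully-on pixel regardless of its position.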

I conclude, then, that when high resolution displays become commonplace, the battles today over hinting technology will become as irrelevant as once-fierce battles over 8-bit pseudocolor dithering are today.

3.1 Software support for scalability

I found this quote from an ad for Apple's Cinema Display amusing:

Easy on your eyes

Pixel density is something else to watch out for. After all, you don't want to have to squint to see images because the pixel density at high resolution makes them too small to see without magnification. With the Apple Cinema Display, the pixel density allows you to use the display at the maximum resolution all the time - and still view everything at a size and sharpness that's easy on your eyes.

If I'm reading it correctly, it's praising the low resolution of the display (86 dpi) as being easy on the eyes. This is counterintuitive: all other things being equal, a higher resolution should always be a higher quality display in all respects.

The problem here, of course, is that not all other things are equal. In particular, most software written today is not designed for scalable displays. With unscalable software, elements of the display are forced to become smaller as the resolution increases---a state of affairs that definitely can lead to squinting.
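To make the shrinking effect concrete, the following sketch computes the physical size of an element with a hardwired pixel size at several resolutions, and the scale factor a resolution-independent program would apply instead. The 48 pixel icon and the 96 dpi design resolution are assumptions chosen for the example.

```python
def physical_size_in(size_px: float, dpi: float) -> float:
    """Physical size in inches of an element drawn `size_px` pixels wide."""
    return size_px / dpi


def scale_factor(target_dpi: float, design_dpi: float = 96.0) -> float:
    """Factor by which a scalable program must enlarge its drawing so that
    elements keep the physical size they had at `design_dpi`."""
    return target_dpi / design_dpi


icon_px = 48  # hypothetical icon designed for a 96 dpi display
for dpi in (96, 124, 200):
    unscaled = physical_size_in(icon_px, dpi)
    scaled_px = round(icon_px * scale_factor(dpi))
    print(f"{dpi} dpi: hardwired icon is {unscaled:.2f} in wide; "
          f"scalable icon uses {scaled_px} px")
```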

Designing software for scalable displays is a considerable challenge. The infrastructure technology has been available for some time (including Display PostScript, Display PDF, Libart, etc.), but most applications are written in terms of hardwired pixels. Surprisingly, this is true even for UI applications written for the supposedly ``next-generation'' Aqua UI platform in Mac OS X, even though the underlying Display PDF technology is quite scalable [App00].

In the world of the Linux desktop, things are slightly better. For one, the configurability of apps (while creating a usability mess beyond the scope of this paper) generally means you can use larger fonts and graphic elements. The Gnome Canvas goes one step further by providing an adjustable zoom parameter, capable of scaling the entire display by a constant scale factor. Applications using the Gnome Canvas for display thus gain the ability to scale quite easily.

One such application, Nautilus, is particularly appealing because it exposes the zoom control to the user. It also provides more detail as the zoom factor increases. As such, Nautilus will probably yield excellent results when used on higher resolution displays.

In many ways, the need for scalable displays parallels the ``Y2K'' problem. Just as 20th century programmers knew (or should have known) that the year 2000 would arrive, GUI programmers today know (or should know) that high resolution displays are coming. To avoid lots of expensive retrofitting when this happens, simply design for it now.

Unfortunately, I expect that the unscalability of today's popular GUI software will hold back the commercial market for high resolution displays. Most likely, specialized applications, such as supercomputer visualization, medical display devices, and map readers, will drive the market for these devices in the short term. Many of these applications will be running under Linux and will be able to take full advantage of the 2D rendering infrastructure available.

4 2D Graphics Infrastructure

Fortunately, free software provides quite a bit of infrastructure for 2D graphics, with development continuing rapidly.

This paper will discuss in more detail three projects which I feel are especially valuable resources to 2D graphics applications: Ghostscript, Libart, and the XFree86 Render extension. No slight is intended to other projects, some of which may be quite promising.

4.1 Ghostscript

Ghostscript's development dates back to 1986, as a project of L. Peter Deutsch. Version 1.0 was released in August of 1988. Since then, it has become a fixture in Unix systems. I took over the maintainership of Ghostscript in August 2000.

Ghostscript tracks the development of Adobe's PostScript and PDF standards quite closely. In particular, it has tracked the development of the PostScript/PDF imaging model. Up to PostScript LanguageLevel 3, this imaging model was based on the ``cut and stencil'' model, in which an object is either fully transparent or fully opaque at each pixel. Even so, the base imaging model is quite powerful for most printing applications, including Bezier paths, a wide range of stroking parameters, and both Type 1 and TrueType fonts (as well as the myriad variants of each of these font formats). It also supports RGB and CMYK color spaces (Adobe's DeviceN and Separation color spaces for ``spot colors'' are not yet implemented in Ghostscript). The LanguageLevel 3 spec added sophisticated gradients, based on Gouraud triangle meshes, Coons patches, or tensor product splines, in addition to the more usual linear and radial gradients.

However, for many graphical applications, the underlying ``cut and stencil'' model is simply too limiting. Thus, most 2D graphics efforts have extended the basic PostScript imaging model to include compositing of semi-transparent objects. With this extension, drop shadow and ``glow'' effects are quite straightforward. Compositing RGBA images (the usual red, green, blue plus an added ``alpha'' channel for controlling transparency) renders soft (or ``feathered'') edges realistically, and also allows fine control over antialiasing.
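As a minimal sketch of what compositing a semi-transparent RGBA pixel involves, the code below implements the standard Porter-Duff ``over'' operator on non-premultiplied colors with components in the range 0 to 1. It is meant only to illustrate the idea; the PDF 1.4 model is considerably richer, and the function name and pixel representation are assumptions of the example.

```python
def over(src, dst):
    """Composite RGBA pixel `src` over RGBA pixel `dst`.

    Both pixels are (r, g, b, a) tuples with non-premultiplied color
    components and alpha in [0, 1].
    """
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    out_a = sa + da * (1.0 - sa)
    if out_a == 0.0:
        # Both pixels fully transparent: the color is undefined, use zero.
        return (0.0, 0.0, 0.0, 0.0)

    def blend(s, d):
        return (s * sa + d * da * (1.0 - sa)) / out_a

    return (blend(sr, dr), blend(sg, dg), blend(sb, db), out_a)


# A 50% black drop shadow over an opaque white page gives mid-gray.
shadow = (0.0, 0.0, 0.0, 0.5)
page = (1.0, 1.0, 1.0, 1.0)
print(over(shadow, page))
```

An antialiased edge falls out of the same operation: the coverage fraction of each edge pixel is used as its alpha.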

With the imminent introduction of the PDF 1.4 standard, the PDF imaging model surpasses any other 2D graphics system in wide use. At the time of this writing, the full PDF 1.4 specification is not yet published, but Adobe has released a technical note describing the transparency and blending operations [Ado00].

4.2 Libart

Libart is a 2D graphics library optimized for interactive displays. It supports most of the PostScript imaging model, and has full support for antialiasing and transparency. Libart is the engine for the antialiased renderer in the Gnome Canvas [MqL00].

One special Libart feature is microtile arrays, a lightweight and efficient approximate representation of specific regions. These microtile arrays are particularly useful for minimizing the re-display area on incremental display, speeding up response time and making motion smoother.
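The following sketch conveys the general idea of such a representation; it is not Libart's actual data structure or API. It divides the display into fixed-size tiles and keeps a single bounding box per tile, so that a set of scattered changes is approximated far more tightly than by one enclosing rectangle. The 32 pixel tile size, the class name, and the method names are all assumptions of the example.

```python
TILE = 32  # tile edge in pixels; an assumption for this sketch


class MicrotileArray:
    """Approximate region: at most one bounding box per tile."""

    def __init__(self):
        self.tiles = {}  # (tile_x, tile_y) -> (x0, y0, x1, y1)

    def add_rect(self, x0, y0, x1, y1):
        """Add the half-open rectangle [x0, x1) x [y0, y1) to the region."""
        for ty in range(y0 // TILE, (y1 - 1) // TILE + 1):
            for tx in range(x0 // TILE, (x1 - 1) // TILE + 1):
                # Clip the rectangle to this tile.
                box = (
                    max(x0, tx * TILE),
                    max(y0, ty * TILE),
                    min(x1, (tx + 1) * TILE),
                    min(y1, (ty + 1) * TILE),
                )
                old = self.tiles.get((tx, ty))
                if old is not None:
                    # Merge with the box already stored for this tile.
                    box = (
                        min(old[0], box[0]),
                        min(old[1], box[1]),
                        max(old[2], box[2]),
                        max(old[3], box[3]),
                    )
                self.tiles[(tx, ty)] = box

    def area(self):
        """Total number of pixels that would need to be redrawn."""
        return sum((x1 - x0) * (y1 - y0) for x0, y0, x1, y1 in self.tiles.values())


# Two small changes in opposite corners of a window.
uta = MicrotileArray()
uta.add_rect(0, 0, 4, 4)
uta.add_rect(100, 100, 104, 104)
single_box_area = (104 - 0) * (104 - 0)
print(uta.area(), "pixels versus", single_box_area, "for one bounding box")
```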

Libart is basically designed to the SVG imaging model. It is used as the basis for librsvg, the batch SVG renderer embedded in Nautilus. At present, it lacks the advanced gradients of PostScript LanguageLevel 3, as well as the advanced blending options of PDF 1.4. In addition, it is optimized for RGB displays and does not support CMYK or other color spaces needed for printing.

The future plan is for Libart and the Ghostscript graphics library to merge. This merged library will hopefully provide the smooth antialiasing features and ease of integration of the current Libart, along with a complete implementation of the PDF 1.4 imaging model.

4.3 XFree86 Render

The XFree86 Render extension is one of the more exciting developments in the land of free 2D graphics in a while. The basic idea is simple: provide access to the incredibly high speed of modern video hardware, through an interface consistent with the X11 base.

The primitives offered by Render are more or less exactly those needed to do sophisticated 2D graphics: RGBA images, antialiased glyph compositing, antialiased shapes (rendered as lots of small triangles), and gradients (rendered as lots of Gouraud-shaded triangles).
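For readers unfamiliar with Gouraud shading, the sketch below shows the interpolation performed inside each triangle: the color at a point is the average of the three vertex colors, weighted by the point's barycentric coordinates. This is a plain software illustration with hypothetical names, not the Render protocol or a description of how any hardware implements it.

```python
def barycentric(p, a, b, c):
    """Barycentric weights of point p with respect to triangle (a, b, c)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    wa = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    wb = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return wa, wb, 1.0 - wa - wb


def gouraud(p, verts, colors):
    """Color at point p: vertex colors blended by barycentric weights."""
    weights = barycentric(p, *verts)
    channels = len(colors[0])
    return tuple(
        sum(w * color[k] for w, color in zip(weights, colors))
        for k in range(channels)
    )


verts = ((0, 0), (4, 0), (0, 4))
colors = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))  # red, green, blue

print(gouraud((0, 0), verts, colors))  # at the red vertex
print(gouraud((2, 0), verts, colors))  # halfway between red and green
```

A smooth gradient is then approximated by tiling its area with many such triangles and assigning each vertex the gradient's color at that position.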

The speed of hardware acceleration for these primitives is impressive. Keith Packard reports speedups of at least one order of magnitude, and in many cases two. This opens up many new possibilities. For example, real-time scrolling has become expected. Carefully implemented, real-time zooming may become realistic. For some applications, such as map display, it could be a very useful feature.

The greatest challenge posed by the Render extension is integration with applications. This work has already begun: Keith Packard has posted patches to KDE that make it display antialiased text. There is a considerable amount of work remaining, but I am confident that it will proceed apace, fueled by the astonishing performance improvements offered by hardware.

5 2D Graphics Applications

The Gimp, an image manipulation program, is one of the highest profile graphics applications available. It is one of the first free software applications with a modern user interface. Its success is partly responsible for inspiring the Gnome desktop project.

While there are other image manipulation programs (notably KImageShop), the Gimp is the tool of choice for those doing serious graphics work. Unfortunately, the situation is markedly different in other fields.

There are over a dozen projects to create a usable vector graphics editor. None of them are anywhere near their commercial counterparts, such as Adobe Illustrator, Corel Draw, or Macromedia FreeHand. The list of free vector apps includes: xfig, tgif, Gill, Sodipodi, GYVE, KIllustrator, Sketch, ivtools, ImPress, geist, GILT, and possibly others.

I believe a big part of the problem is the lack of a good standard data format for vector graphics. While images are widely interchangeable in standard formats, each vector illustration program, proprietary or free, has tended to have its own format. I had great hopes that SVG would become such a standard, but based on my own experiences trying to implement Gill, it is likely its complexity and dependence on other bleeding edge technologies will hamper its widespread adoption.

6 Conclusion

Expectations for the 2D imaging model have been steadily rising. Where the PostScript ``cut and stencil'' imaging model was once considered sufficient, antialiasing and alpha transparency are requirements for any modern 2D graphics application. High resolution displays and accelerated video hardware will bring advanced 2D graphics capabilities to ordinary users over the next few years. New applications can and should be written to be scalable, and to make good use of the free software infrastructure available.

References

[Ado00] Adobe, Transparency in PDF. Technical Note #5407, May 2000.

[App00] Apple Computer, Inside Mac OS X, Adopting the Aqua Interface, 2000.

[Hob93] Hobby, J. D. Generating Automatically Tuned Bitmaps from Outlines, JACM 40(1), 1993.

[IBM00] IBM ships world's highest-resolution computer display, November 2000. http://www.ibm.com/news/2000/11/10.phtml

[Kan99] Kang, H. Digital Color Halftoning. SPIE and IEEE Press, 1999.

[Knu86] Knuth, D. E. The Metafont Book. Addison-Wesley, 1986.

[Mic98] Microsoft Corp. Introduction to hinting. http://www.microsoft.com/typography/hinting/hinting.htm

[MqL00] Mena-Quintero, F. and R. Levien, The GNOME Canvas: a Generic Engine for Structured Graphics, Usenix Technical Conference, San Diego June 2000.

[Nai91] Naiman, A. C., The Use of Grayscale for Improved Character Presentation. PhD thesis, Technical Report CSRI-253, U. Toronto, March 1991.

[Pac00] Packard, K. A New Rendering Model for X. Usenix Technical Conference 2000.


Appendix A

This appendix shows samples of text rendered at different resolutions, and with different options for hinting and antialiasing. All samples were rendered with Ghostscript 6.50.

Figure (75 dpi): no antialiasing or hinting, hinting only, antialiasing only, hinting and antialiasing.

Figure (100 dpi): no antialiasing or hinting, hinting only, antialiasing only, hinting and antialiasing.

Figure (140 dpi): no antialiasing or hinting, hinting only, antialiasing only, hinting and antialiasing.

Figure (200 dpi): no antialiasing or hinting, hinting only, antialiasing only, hinting and antialiasing.
