INTRODUCTION

Digital radiology has many advantages compared to film-based radiology. One is that the functions that used to be performed by the film now can be divided into four separate steps: data acquisition, image processing, data storage, and image display. Each of these steps can and should be optimized separately. In the last of these four steps, the information in the digital image is transferred to the observer, usually as variations in light and color from a display. It is important to have displays of high quality in order not to degrade the last step in the image-forming process. In the literature, medical-grade monochrome displays are usually recommended, mostly because of their higher luminance.1 The major drawback of the monochrome displays is their very high cost, which has prompted some institutions to use standard color displays, which are considerably less expensive because they are mass-produced for the general computer market.

In recent years, there has been a trend toward switching from displays based on cathode-ray tubes (CRT) to flat panels based on liquid crystal displays (LCD). This is supported by several studies.2,3

We wished to test the null hypothesis that there is no significant difference in a calculated image quality factor between contrast-detail phantom images displayed on a consumer-grade color LCD display and a medical-grade monochrome LCD display having the same resolution. We also wished to test the null hypothesis that there is no significant difference in diagnostic image quality between clinical radiographs of the lumbar spine displayed on the same monitors.

MATERIALS AND METHODS

The study compared three types of displays. The main comparison was made between a 20-inch color LCD display (2000 FP UltraSharp, Dell, Round Rock, TX, USA) with a resolution of 1,200 × 1,600 pixels, 2 megapixels (MP), and a 20-inch monochrome LCD display (MFGD 2320, Barco, Kortrijk, Belgium) with a resolution of 1,200 × 1,600 pixels, 2 MP. Some comparisons were also made using a 20-inch monochrome LCD display (MFGD 3220 D, Barco) with a resolution of 1,536 × 2,048 pixels, 3 MP. All displays were connected to a PACS work station using a web interface (Centricity Enterprise Web v2.1, GE Medical Systems) for image reproduction. The monochrome displays were calibrated between 1 and 300 cd/m2 according to the digital imaging and communications in medicine (DICOM) part 14 grayscale standard display function using the built-in photometer (I-guard) and Medical Pro software. The 2-MP color display was left uncalibrated as a calibration would require installation of additional software on a validated medical PACS work station. All three displays were characterized with a luminance spot meter (Minolta LS-100, Minolta Co. Ltd., Osaka, Japan) and AAPM TG-18 test targets (LN12-01 through LN12-18). Because the measuring distance was approximately 1 m, ambient light was always included in each measurement. The ambient light settings were measured with a lux-meter (Elvos LM-1010, Elvos GmbH, Ludwigsburg, Germany).
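As background on the calibration target, the DICOM part 14 grayscale standard display function (GSDF) relates luminance to a just-noticeable-difference (JND) index. The Python sketch below evaluates that relation and estimates how many JND steps fit within a luminance range such as the 1 to 300 cd/m2 used for the monochrome displays. It is an illustration only and not the procedure carried out by the I-guard photometer or the Medical Pro software. The function names are hypothetical, and the coefficients are quoted from the standard and should be verified against its published text before any real use.

```python
import math

# GSDF coefficients as given in DICOM PS3.14 (assumption: verify against the standard).
A, B, C, D, E = -1.3011877, -2.5840191e-2, 8.0242636e-2, -1.0320229e-1, 1.3646699e-1
F, G, H, K, M = 2.8745620e-2, -2.5468404e-2, -3.1978977e-3, 1.2992634e-4, 1.3635334e-3


def gsdf_luminance(j):
    """Luminance in cd/m2 for JND index j (valid for 1 <= j <= 1023)."""
    x = math.log(j)
    numerator = A + C * x + E * x**2 + G * x**3 + M * x**4
    denominator = 1 + B * x + D * x**2 + F * x**3 + H * x**4 + K * x**5
    return 10 ** (numerator / denominator)


def jnd_index(luminance, lo=1.0, hi=1023.0):
    """Invert gsdf_luminance by bisection; the GSDF increases monotonically."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if gsdf_luminance(mid) < luminance:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def jnd_span(l_min, l_max):
    """Approximate number of JND steps between two luminance levels."""
    return jnd_index(l_max) - jnd_index(l_min)
```

Calling `jnd_span(1.0, 300.0)` gives the approximate number of JND steps available across the calibrated range; a display with a narrower luminance range spans fewer steps.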

The 2-MP color display had a minimum luminance of 0.89 cd/m2 and a maximum luminance of 143 cd/m2. The 2-MP monochrome display had a minimum luminance of 1.59 cd/m2 and a maximum luminance of 295 cd/m2. All values were measured at 23 lx illuminance.
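As a rough point of comparison, the ratio of maximum to minimum luminance can be worked out directly from these measurements. Both values include the reflected ambient light at 23 lx, so this is an effective ratio under the reading conditions rather than a manufacturer's specification. The symbols L_max and L_min are introduced here only for this illustration.

$$\left.\frac{L_{\max}}{L_{\min}}\right|_{\text{color}} = \frac{143}{0.89} \approx 161, \qquad \left.\frac{L_{\max}}{L_{\min}}\right|_{\text{monochrome}} = \frac{295}{1.59} \approx 186.$$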

The cost of the displays at installation during 2003–2004 was about $2,400 for the color display, $10,000 for the 2-MP monochrome display, and $16,000 for the 3-MP display. The prices have been reduced considerably since installation, but the relationship is similar.

Two types of comparisons were made: one using a contrast-detail phantom and one using clinical radiographs of the lumbar spine.

Comparison using the contrast-detail phantom:

Images of a CDRAD 2.0 contrast-detail phantom (Artinis Medical Systems, Zetten, the Netherlands) were viewed on all displays. The phantom consists of a 265 × 265 × 10 mm polymethylmethacrylate (PMMA) sheet, with drilled holes of different depth and diameter. Using this phantom, a four-alternative forced choice is performed with the task being to detect as many targets as possible. All detection results were corrected according to the user manual for the CDRAD phantom. From the resulting data, a numerical value, the Image Quality Figure (IQF), can be calculated. The IQF is defined as

$$\mathrm{IQF} = \sum_{i=1}^{15} C_{i} \times D_{i,\mathrm{th}},$$

where i = contrast-column number, C_i = contrast (depth of hole), and D_{i,th} = threshold diameter in contrast-column i.
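The definition above can be sketched in a few lines of Python. The function name and the three-column example values are hypothetical and are not data from this study; the real phantom has 15 contrast columns, and the detection results would first be corrected as described in the CDRAD user manual.

```python
def image_quality_figure(contrasts, threshold_diameters):
    """Sum of C_i * D_i,th over the contrast columns (lower means better)."""
    if len(contrasts) != len(threshold_diameters):
        raise ValueError("need one threshold diameter per contrast column")
    return sum(c * d for c, d in zip(contrasts, threshold_diameters))


# Hypothetical three-column example (NOT measured data):
# hole depths in mm and the smallest diameter in mm detected in each column.
example_depths = [0.5, 1.0, 2.0]
example_thresholds = [4.0, 2.0, 1.0]
example_iqf = image_quality_figure(example_depths, example_thresholds)
# 0.5 * 4.0 + 1.0 * 2.0 + 2.0 * 1.0 = 6.0
```

Because each term multiplies a hole depth by the smallest diameter still detected at that depth, an observer who detects smaller targets obtains a lower IQF.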

Two images were used: one acquired using a flat-panel detector (Digital Diagnost, Philips Medical Systems, Best, the Netherlands) with a pixel size of 143 μm giving a resolution of 3.5 lp/mm, exposed with automatic exposure control (AEC) at 70 kV, with a simulated system speed of 400. Twenty centimeters of PMMA was used as attenuator and the entrance dose was 706 μGy. The other image was acquired using storage phosphor plates (AC-3, Philips Medical Systems) with a pixel size of 200 μm giving a resolution of 2.5 lp/mm, exposed with AEC at 70 kV with a simulated system speed of 200. Fifteen centimeters of PMMA was used as attenuator, and the entrance dose was 638 μGy. The latter system was used as a representative of a radiographic system with lower inherent image quality.

The images were read on the various types of displays independently by four radiologists with several years’ experience with digital radiography. During the course of their ordinary work, three of the radiologists were doing their image reading mainly on monochrome displays, the fourth on color displays. The following settings were evaluated:

  1. Flat-panel detector image with 2-MP color and 2-MP monochrome displays, displaying the images at a 1.0 zoom (ie, 1 pixel of the display displaying 1 pixel of the digital image) and low illumination (23 lx as measured at the face of the displays).

  2. Flat-panel detector image with the same displays but higher ambient illumination, 90 lx. The same zoom settings as above.

  3. Storage phosphor plate image with the same displays. The same zoom settings as above, low illumination (23 lx).

  4. Flat-panel detector image with 2-MP color and 3-MP monochrome displays with no zooming allowed, ie, the images were scaled to fit the displays. Low illumination (23 lx).

The low level, 23 lx, is the level that is commonly used in the reading room. The high level, 90 lx, is normally only used for other tasks in the room and not when image reading is performed. Throughout the study, the grayscale could be adjusted at will using a linear window/level function.
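A linear window/level function of the kind mentioned above can be sketched as follows. This is a generic, simplified mapping and not the implementation used by the PACS web interface, which is not described here; the function names and the 256 output levels are assumptions for illustration.

```python
def window_level(value, level, width):
    """Map a pixel value to a display fraction in [0, 1] with a linear window."""
    if width <= 0:
        raise ValueError("window width must be positive")
    lower = level - width / 2.0          # pixel value shown as black
    fraction = (value - lower) / width   # linear ramp across the window
    return min(1.0, max(0.0, fraction))  # clip values outside the window


def apply_window(pixels, level, width, levels=256):
    """Convert pixel values to integer display levels 0..levels-1."""
    return [int(window_level(p, level, width) * (levels - 1) + 0.5) for p in pixels]
```

For example, `apply_window([90, 100, 110], level=100, width=20)` spreads a 20-unit range of pixel values over the full output range, which is how a narrow window raises the displayed contrast of a low-contrast target.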

Comparison using clinical radiographs:

Thirty clinical anteroposterior lumbar spine radiographs were evaluated by the same four radiologists in a visual grading analysis (VGA). The patients had a mean age of 57.6 years, range 17 to 91 years. Eighteen images were acquired using storage phosphor plates (FCR5000, Fuji, Tokyo, Japan) and twelve using a flat-panel detector (CXDI-40 G, Canon, Tokyo, Japan). Another lumbar spine radiograph of good quality, acquired with the Canon flat-panel detector, was chosen as the reference image and was displayed on a separate 2-MP monochrome display of the same model as described above. All images were then compared to this image in random order on 2-MP color and 2-MP monochrome displays. One comparison was made per observer and image. Image quality was rated on a five-grade scale from −2 to +2 compared to the reference image (much worse, worse, equivalent, better, or much better) for seven criteria from the European guidelines on quality criteria for AP lumbar spine4 (Table 1). Based on these results, a VGA score was calculated for each criterion using the formula

$$\mathrm{VGAscore} = \frac{\sum_{o=1}^{O} \sum_{i=1}^{I} \sum_{c=1}^{C} G_{o,i,c}}{O \times I \times C},$$

where G_{o,i,c} = grading for observer o, image i, and criterion c, O = number of observers, I = number of images, and C = number of criteria.
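The VGA score is simply the mean grade over all observers, images, and criteria. A minimal Python sketch follows; the nested-list layout and the function names are assumptions for illustration, and the example grades are hypothetical rather than study data. The per-criterion variant corresponds to restricting the sum to a single criterion.

```python
def vga_score(gradings):
    """Mean grade over observers, images, and criteria.

    gradings[o][i][c] is the grade (-2..+2) given by observer o to image i
    for criterion c, relative to the reference image.
    """
    total = 0
    count = 0
    for per_observer in gradings:
        for per_image in per_observer:
            for grade in per_image:
                total += grade
                count += 1
    return total / count  # count equals O * I * C for a complete data set


def vga_score_for_criterion(gradings, c):
    """Mean grade over observers and images for one criterion index c."""
    grades = [per_image[c] for per_observer in gradings for per_image in per_observer]
    return sum(grades) / len(grades)


# Hypothetical grades: 2 observers x 2 images x 2 criteria (NOT study data).
example = [[[1, 0], [0, -1]], [[2, 1], [0, 1]]]
```

With the hypothetical grades above, the overall score is 4/8 = 0.5, while the two criteria score 0.75 and 0.25 separately. A positive score means the images were on average rated better than the reference image.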

Table 1 Visual Grading Analysis Scores for Anteroposterior Lumbar Spine Images Compared to a Reference Image Displayed on a 2-MP Monochrome Display; Four Observers and 30 Images for Each Criterion

Statistical methods:

In the clinical radiographs part of the study, there was one observation per observer and image, ie, no double reading. All results were treated as paired observations. The VGA scores were evaluated with the Wilcoxon signed ranks test.
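For readers who want to see how a paired comparison of this kind works, the following Python sketch implements the Wilcoxon signed ranks test using only the standard library. Several choices here are assumptions, since the text does not specify them: zero differences are dropped, tied absolute differences receive average ranks, and the two-sided P value comes from the normal approximation with a tie correction. The function name is hypothetical, and a statistics package would normally be used instead.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Return (W_plus, W_minus, z, two-sided p) for paired samples x and y."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda k: abs(diffs[k]))
    ranks = [0.0] * n
    tie_term = 0.0
    start = 0
    while start < n:
        end = start
        while end + 1 < n and abs(diffs[order[end + 1]]) == abs(diffs[order[start]]):
            end += 1
        average_rank = (start + end) / 2 + 1  # mean of 1-based ranks in the tie group
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        group_size = end - start + 1
        tie_term += group_size**3 - group_size
        start = end + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (min(w_plus, w_minus) - mean) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal approximation
    return w_plus, w_minus, z, p
```

Here `x` and `y` would hold the paired gradings for the two display types, one pair per observer and image. The normal approximation is coarse for small samples, where an exact test is preferable.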

RESULTS

Calibration curves for the 2-MP color and 2-MP monochrome displays are shown in Figure 1.

Fig 1. Calibration curves for the 2-MP color and 2-MP monochrome displays. JND = just noticeable difference.

The comparisons between color and monochrome displays with different images of the contrast-detail phantom and levels of ambient illuminance are shown in Figure 2. Using a flat-panel image at low illumination, the mean IQF was 40 for the 2-MP color and 42 for the 2-MP monochrome display. At high illumination, the corresponding IQF values were 44 and 42. With the storage phosphor plate image, the IQF values increased to 51 and 52, indicating inferior image quality, still with a very small difference between the two displays. When the monochrome display was exchanged for a 3-MP unit with no zoom allowed, the IQF values were 44 for the 2-MP color and 40 for the 3-MP monochrome display.

Fig 2. Image quality figures for a CDRAD contrast-detail phantom. Lower IQF values indicate better image quality. i) Comparison of a 2-MP color and a 2-MP monochrome display using a flat-panel detector image at low ambient illumination (23 lx). ii) Comparison of the same displays and image at higher ambient illumination (90 lx). iii) Comparison of the same displays using a storage phosphor plate image with lower inherent image quality at 23 lx. iv) Comparison of the 2-MP color display and a 3-MP monochrome display with no zoom allowed at 23 lx.

The VGA of clinical images resulted in very small differences between the two display types and no significant difference for the overall score (Table 1). The 2-MP color display performed significantly better in “reproduction of the spinous and transverse processes,” whereas the 2-MP monochrome display performed better in “visually sharp reproduction of the pedicles” and “reproduction of the intervertebral joints.” All other comparisons were nonsignificant.

DISCUSSION

This study did not show any significant difference in image quality between a standard 2-MP color LCD display and a medical-grade 2-MP monochrome LCD display, either with the contrast-detail phantom or in the visual grading study. Our findings are in accordance with several studies that have shown similar performances for color and monochrome displays in a variety of clinical tasks such as brain CT,5 radiography of wrist fractures,6,7 computed radiographs of the hands in early rheumatoid arthritis,8 and chest radiographs in interstitial lung disease.9 In another study, Goo et al10 found that for chest radiographs, a display luminance as low as 86 cd/m2 was acceptable provided that the ambient illuminance was low.

The main purpose of calibrating a monitor according to DICOM part 14 is to obtain similar image presentation on all displays. A calibration distributes the total contrast of the display equally across the entire grayscale, and objects will thus be presented with the same contrast regardless of whether they are present in bright or dark parts of the image. When the task is to find known objects in an image, such as targets in a contrast-detail phantom, the window/level controls can be used to optimize image contrast. The display's contrast characteristics then become less important, and the noise properties, from both the image detector and the image display, become more important. However, this does not mean that calibrating a display is meaningless. Clinical images have little resemblance to images of a contrast-detail phantom in that pathology might also be present in the bright or dark parts of the image. A consistent display of images is even more important when, for example, a current image is compared to a previous image on another display. Any differences between the images should be caused by the imaged object and not by the displays.

The main advantage of medical-grade monochrome displays is their high luminance, which makes it easier to see the entire grayscale from black to white in an image. A recent report11 lists high luminance as a requirement for displays used in diagnostic radiology. Medical-grade displays are usually also equipped with controls to facilitate grayscale calibration. A disadvantage of medical-grade displays is their high cost. In our study, the cheaper of the two monochrome displays cost about four times as much as the color display. However, if the life span of the monochrome displays is longer, this will help to offset the price difference.

The major drawback of color displays is their lower maximum luminance: 143 cd/m2 in our study compared to 295 cd/m2 for the monochrome display. A low luminance has been stated to increase the time for diagnosis.1 Krupinski et al12 found no significant difference in performance between high and low luminance, although the dwell time was longer with the lower-luminance displays. In another study, it was stated that observers took more time to make less accurate decisions using the color display.13 Apart from their lower cost, a great advantage of color displays is their ability to show color information. In modern digital radiology, color is used more and more in various modalities such as color Doppler ultrasound, 3D reconstructions in computed tomography (CT), functional magnetic resonance (MR) imaging, and nuclear medicine including PET. Another advantage is the possibility to exchange color displays four times as often as monochrome displays within a fixed budget.

The tests with the contrast-detail phantom showed very small differences in image quality between the two types of displays. There was in fact a larger difference in image quality between the flat-panel detector and the storage phosphor plates (Fig. 2). It might thus be more appropriate to choose a better (more expensive) imaging system such as a flat-panel detector and use (cheaper) color displays than the opposite. Irrespective of the detector being used, there was a large interobserver variability, similar to what has been reported previously.14 This can probably be attributed to varying levels of confidence in deciding whether a lesion is seen or not. However, the intraobserver variability is much lower than the interobserver variability, which is also shown in Figure 2.

The higher ambient illuminance setting resulted in slightly poorer lesion detection on the 2-MP color display but made no difference with the 2-MP monochrome display. It is known that ambient illuminance should be low, as reflected ambient light elevates the black level of the display15 and thus reduces the effective contrast ratio. In our study, the low level of illuminance was 23 lx, which is higher than in some studies. Our high level of illuminance was 90 lx, which we consider too high for diagnostic work, but still lower than in other studies where up to 200 lx has been used.9,16 The relatively small difference between our “low” and “high” levels might explain the rather small difference in lesion detection. The lower image quality for the 2-MP color display under high illuminance might, however, be a result of the lower luminance of this display. The loss of contrast in the dark areas of the image in higher ambient illuminance can be restored by calibration, but then the total contrast of the display is reduced because of the smaller contrast span of the color display.
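The effect of reflected ambient light on the effective contrast ratio can be illustrated with a simple additive model, in which reflected luminance is taken as a diffuse reflection coefficient multiplied by the illuminance. The Python sketch below applies that model to the luminance values measured at 23 lx. The coefficient of 0.01 cd/m2 per lx is a hypothetical value chosen only for illustration, since the reflection properties of the displays were not measured, and the function names are likewise assumptions.

```python
def effective_luminance_ratio(l_min, l_max, extra_reflected=0.0):
    """Ratio of brightest to darkest luminance after adding reflected light."""
    if l_min + extra_reflected <= 0:
        raise ValueError("darkest luminance must be positive")
    return (l_max + extra_reflected) / (l_min + extra_reflected)


def extra_reflected_luminance(illuminance_lx, baseline_lx, diffuse_reflection):
    """Additional reflected luminance (cd/m2) relative to a baseline illuminance."""
    return diffuse_reflection * (illuminance_lx - baseline_lx)


ASSUMED_RD = 0.01  # hypothetical diffuse reflection coefficient, cd/m2 per lx

# Luminance values below were measured at 23 lx and already include ambient light.
extra = extra_reflected_luminance(90, 23, ASSUMED_RD)
color_low = effective_luminance_ratio(0.89, 143)
color_high = effective_luminance_ratio(0.89, 143, extra)
mono_low = effective_luminance_ratio(1.59, 295)
mono_high = effective_luminance_ratio(1.59, 295, extra)
```

Under this model the added light is a much larger fraction of the minimum luminance than of the maximum, so the ratio falls for both displays as the illuminance rises. The size of the drop depends entirely on the assumed reflection coefficient.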

The visual grading study using clinical images showed significantly higher image quality for the 2-MP monochrome display for reproduction of the pedicles and intervertebral joints, and significantly lower image quality for reproduction of the spinous and transverse processes. Overall, there was no significant difference between the displays in the visual grading part of the study.

Free adjustment of window width and level was allowed in our study, as that is the way radiologists work in everyday practice. Windowing is easily performed by moving the computer mouse. If this type of image processing is not done, the full potential of digital imaging is not used. We consider image adjustment and manipulation to be a natural part of reading a digital image, and indeed a necessity to view all information in the image. Consequently, a comparison between monochrome and color displays without free adjustment of window and level was not included in this study. This is probably one reason why the 2-MP color display performed so well. All information in the image could be placed in the middle (gray) area of the contrast span where the two display types were almost equal. A drawback is that the user’s performance efficiency might be reduced.17 Another drawback is that with a narrow window setting, it might be difficult to compare the contrast of an object with that of other areas that are lighter or darker than the object. In many studies, there has been no provision for image manipulation such as zooming and alteration of window width and level, which might explain the varying results. The very stringent requirements on displays in recommendations11 might, in fact, have been set with the ambition that the user should not be required to manipulate the image.

For all PACS stations in a radiology department to be capable of displaying all types of images, they must be equipped with display units that can also display images with color information such as Doppler ultrasound, 3D volume rendered CT images, PET images, and SPECT images. It is costly to furnish an entire radiology department with the more expensive monochrome displays, and color displays might also, for economic reasons, be a better alternative. The new users of digital radiological image information, the clinicians, usually opt for color displays, which may be a conscious cost-saving decision or simply the effect of old habits.

The spatial resolution of the displays was not evaluated specifically in this study because the two displays used in the majority of tests had the same resolution. When used without magnification, the 3-MP monochrome display showed a trend toward higher image quality compared to the 2-MP color display. This is not surprising because the images were scaled to fit the display in that particular test. None of the displays managed to show all of the five megapixels that the test image consisted of, but the 3-MP display did show a larger proportion of the image information than the 2-MP displays.

The contrast-detail phantom has been used previously in evaluations of displays,18,19 and it has also been criticized for having too large intraobserver and interobserver variability leading to a low sensitivity for changes in display performance.20 Still, contrast-detail phantoms have often been used for comparison of image quality of various radiographic methods, and we believe that they are also a reasonably good way of comparing displays. This way, the influence of “anatomical noise” is excluded. A drawback is that only the central (gray) part of the contrast range is evaluated and not the dark or light parts of the image, as there is no such information in the CDRAD image. Our results with the contrast-detail phantom were, however, in accordance with the results from the visual grading part of the study. In the central part of the contrast span, the 2-MP color display, which was uncalibrated, had in fact slightly better contrast resolution than the 2-MP monochrome display, whereas it was much worse in the dark and light parts, a result of not being calibrated according to DICOM part 14. However, we do consider it mandatory to calibrate all displays according to the DICOM standard, as this is a good way to ensure consistent image quality over the whole radiology department.

There were some limitations of the study. The number of images is somewhat limited, both for the contrast-detail phantom and for the clinical images. For the contrast-detail phantom, the limitations regarding intra- and interobserver variability are well known, and the low number of images does not seem to be a major problem. For the VGA, the number of images seems to be adequate as we found significant differences in image quality for some criteria, whereas there was no significant difference for the overall score.

CONCLUSIONS

In summary, we did not find any significant difference in image quality between a medical-grade monochrome LCD display and a color LCD display of equal spatial resolution, either with a contrast-detail phantom or in a visual grading analysis, when adjustment of the grayscale was used to its full potential.