Introduction

New technological advancements, including multislice computed tomography (CT) and functional magnetic resonance imaging (MRI), have dramatically increased the size and number of digital images generated by medical imaging departments. Data produced by CT may represent up to 50% of the storage capacity of a Picture Archiving and Communication System (PACS), with some studies (e.g., cardiac CT) holding 5,000 images. Storage needs are predicted to increase further when departments are required to archive 3D volume renderings and other complex reformats of the original images. Although the cost of storage is dropping, the savings are more than offset by the increasing volume of data being generated. In addition, the cost of operation remains high, considering mandatory data migration and legal retention periods for patient data, which can be even longer for pediatric cases. Considering that Canada annually generates 1.5 PB of radiology imaging data (41 million studies for 2006), the potential national metrics are significant.1 Another important issue is data transmission. While local area network bandwidth within a hospital is adequate for timely access to imaging data, efficiently moving the data between institutions requires wide area network bandwidth that may not be available at the regional or national level.1

Data compression can address the storage and transmission needs by enabling more efficient distribution and optimizing archiving of imaging data. Lossy compression can address these issues, as long as there is no loss of relevant information. Lossy compression allows a greater size reduction, ideally with no significant loss of visual quality; the severity of any degradation depends strictly on the compression ratio. JPEG is the most widely accepted compression tool, but it has been shown that the newer JPEG 2000 algorithm may allow higher compression levels than JPEG for equivalent or better image quality.2,3 The Canadian Association of Radiologists PACS/Teleradiology committee has accepted the principle of irreversible (“lossy”) compression for use in primary diagnosis and clinical review, using DICOM JPEG or JPEG 2000 compression algorithms, at specific compression ratios set by image type and based upon the results of an extensive clinical evaluation.4

The goal of this study was to evaluate current levels at which lossy compression can be confidently used in diagnostic imaging applications. In order to provide a fair assessment of existing compression tools, we tested and compared the two most commonly adopted DICOM compression algorithms: JPEG and JPEG 2000. CAD was excluded from this study but will be the object of a separate evaluation, as this is a very important issue with significant implications for radiology workflow.

We hypothesized that, at low compression ratios, there would be no detectable difference in the performance of the two compression algorithms. The compression levels investigated were based on accepted levels extracted from the literature. Rather than finding an absolute compression threshold for the five modalities and seven anatomical areas tested, our objective was to establish a range in which compression could confidently be applied.

Materials and Methods

After institutional Research Ethics Board approval was obtained, we conducted an extensive pan-Canadian evaluation of seven anatomical areas and five modalities, in which readers were recruited from nine of the ten Canadian provinces (Prince Edward Island was not represented). We enrolled 100 readers across Canada to participate in our study, all experienced staff radiologists with more than 3 years in practice; no residents or fellows were included. To ensure an adequate sample size, we required a minimum of three readers for each of our reading sessions, but some sessions had up to six readers. Each radiologist read only in his/her subspecialty.

A collection of images consisting of a mix (1:4) of normal cases and identified subtle pathologies was presented to reviewers for their assessment, with pathologies as follows:

  • CR/DR (mainly Agfa QS 2.1.72 with CR25, CR75, ADC compact digitizer, from 7.8 to 28 MB)

    • Chest: pneumothorax, tiny nodules less than 3 mm, Kerley lines, PICC lines, rib fractures;

    • Body: kidney stones, calcifications, peritoneal dialysis catheters, surgical clips, sutures;

    • MSK: cyst, fracture, erosion, lytic lesion, periosteal reaction, soft tissue calcification;

    • Breast: calcification, mass, nodule, architectural distortion.

  • CT (mainly GE Lightspeed VCT and Toshiba Aquilion 64)

    • Neuro: calcification, mass, hemorrhage, ischemic lesion;

    • Chest: ground glass opacity, nodule, tree in bud, pneumothorax;

    • Body: cyst, hematoma, metastasis, node, stone;

    • MSK: fracture, lytic lesion, mass, calcified synovial facet joint;

    • Angio: pulmonary embolus, intramural hematoma, filling defect, calcification.

  • MR (mainly GE HDX Twinspeed, T1 and T2)

    • Neuro: mass, ischemic lesion, herniated disc;

    • Body: cyst, soft tissue mass;

    • MSK: cyst, full or partial tears, stress fracture, tendinosis;

    • Breast: cyst, mass, enhancement, node;

    • Angio: plaques, stenosis.

  • US (mainly from ATL 5000 and Toshiba Aplio XG)

    • Body: calcification, cyst, fatty sparing, hemangioma, metastasis;

    • MSK: cyst, tear, tendinosis;

    • Breast: calcification, cyst, mass, architectural distortion;

    • Pediatrics: nodule, mass, hemangioma, metastasis, polyp.

  • NM

    • Increased uptake, decreased uptake, lesion, scar.

We had approximately 2,000 anonymized patients in our database and selected the best images possible. We did not look specifically at image degradation linked to excess body fat, but this may be the object of a subsequent study, as we are implementing a bariatric center in our institution.

Readers were given a CD or DVD that contained an auto-run software application which displayed the collection of images. The readers were categorized according to subspecialty and were assigned an appropriate collection of studies.

Readers never saw the same case twice, to avoid the bias of remembering a specific condition, which means that compression (or no compression) was applied to different images. Readers responded to a series of questions by selecting their responses from drop-down menus embedded in the software application. The responses were transmitted to a centralized server at Sunnybrook Health Science Centre via the Internet. Readers could interrupt the session at any time; when they resumed, the session restarted automatically where they had left off.

Compression Technique

In order to provide a fair assessment of existing compression tools, we tested and compared the two most commonly adopted DICOM compression algorithms: JPEG and JPEG 2000. We speculated that, at low compression ratios, there would be no detectable difference in the performance of the two compression algorithms.

The new features in JPEG 2000 compared to JPEG are:

  1. Efficient lossy and lossless compression within a single unified coding framework;

  2. Progressive transmission and spatial scalability (thumbnails);

  3. Superior image quality; broad range of image types;

  4. Support for Region of Interest coding;

  5. Support for continuous-tone and bi-level compression (BW and color);

  6. Robustness to bit-errors (wireless communication applications);

  7. Avoidance of excessive memory usage.

Study Design

Many different techniques have been suggested to evaluate the quality of compression including numerical analysis of pixel values before and after compression, subjective observer evaluation focusing on aesthetic acceptability and estimated diagnostic value, and objective measurement of diagnostic accuracy using blinded evaluation methods.5

Two techniques were selected to evaluate the quality of the compressed images: an objective method based on diagnostic accuracy and a subjective method based on the concept of Just Noticeable Difference. Thus, readers followed a two-step process in evaluating each case within their allotted series of studies.

First, images were displayed with a grid overlay that divided the image into four equal quadrants, so that the reader could state in which quadrant the abnormality was visible. The quadrant was chosen from a drop-down menu on the screen, and the grid could be toggled on and off by the reader during the evaluation. The reader was then required to identify the type of pathology (or absence of pathology) in a second drop-down menu listing the conditions described above, and to associate a confidence rating of 1 to 5 with the assessment in a third drop-down menu (1 = definite absence of lesion, 2 = probably no lesion, 3 = unsure, 4 = probable presence of lesion, and 5 = definite presence). We considered a result positive if the reader correctly stated whether pathology was present and, if so, located it in the correct quadrant. For the statistical analysis, we were interested in the consistency of answers, which means that we expected the same ratio of errors in the pathology assessment for the compressed and noncompressed images.

Second, an original-revealed forced choice evaluation technique was implemented, in which each compressed image was paired with its original, and the observer was asked to compare both images. The reader was asked to rate the perceptible difference on a scale from 1 to 6, where 6 represents no visible difference and 1 is unacceptable (Table 1).7 Incorporating both diagnostic accuracy and subjective evaluation techniques enabled us to define a range of compression for each modality and body part tested. Our study considered the effects of compression on seven anatomical areas and five modalities (Table 2).

Table 1. Quality of the Compression Levels and Types was Assessed Using a Six-Point Likert Scale
Table 2. Radiological Areas and Modalities Investigated

Compression Ratios

The range of compression ratios applied to our images was extrapolated from an extensive literature review on medical image compression commissioned by Canada Health Infoway,8 as shown in Table 3. We tested three different levels of compression for both the JPEG and JPEG 2000 compression algorithms.

Table 3. The Range of Compression Ratio we Applied to Each Modality/Body Part was Based on the Results of an Extensive Literature Review

In addition to compressed images, each set of images also contained uncompressed images for evaluation. Each work list included 70 images (or stacks of images for CT scans), with a ratio of six compressed images to one uncompressed image. The entire set of images was randomized and, within each reading session, readers were not shown the same image twice. In selecting images for this study, we created a collection of more than 2,000 anonymized studies, which can be used for future studies. Collecting images from other collaborating hospitals and research institutes ensured that our findings could be generalized to images generated by different brands of acquisition equipment.

Lossy JPEG uses a quality factor (Q factor) expressed as a percentage, which does not correspond to a fixed compression ratio.

The compression ratio obtained for a given Q factor depends on the properties of each image, including the amount of black background present, which makes some images more suitable for higher compression ratios than others. In some cases, the JPEG Q factor prevented us from obtaining the compression ratios that we had originally decided to use in this study.

The Q factor determines the divisor used in the quantization tables. In order to achieve the desired compression ratios, we used an iterative approach of Q-factor reduction until the desired compression ratio was achieved.
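The iterative approach described above can be sketched as follows. This is a minimal illustration, not the study's implementation: the function name `find_quality_for_ratio`, the step size of one quality point, and the stand-in `toy_encoder` are all assumptions made for the example. A real run would pass a function that calls an actual JPEG encoder and returns the compressed bytes.

```python
from typing import Callable, Optional, Tuple


def find_quality_for_ratio(
    encode: Callable[[int], bytes],
    raw_size: int,
    target_ratio: float,
    start_quality: int = 100,
    min_quality: int = 1,
) -> Optional[Tuple[int, float]]:
    """Lower the quality factor step by step until the target ratio is reached.

    `encode` is a hypothetical callable mapping a quality factor to the
    compressed bytes of one image. Returns (quality, achieved_ratio), or
    None when even the lowest quality cannot reach the target.
    """
    for quality in range(start_quality, min_quality - 1, -1):
        compressed_size = len(encode(quality))
        ratio = raw_size / compressed_size
        if ratio >= target_ratio:
            return quality, ratio
    return None  # the target ratio is not reachable for this image


def toy_encoder(quality: int) -> bytes:
    """Stand-in encoder (assumption): output size grows linearly with quality."""
    return bytes(10 * quality)


if __name__ == "__main__":
    print(find_quality_for_ratio(toy_encoder, raw_size=10_000, target_ratio=20.0))
```

Returning `None` mirrors the situation reported above, in which the Q factor did not allow some of the planned compression ratios to be reached for certain images.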

Lossy JPEG Codec Limitations

Many of the challenges we faced during the course of our development arose from limitations of the lossy JPEG codec. The lossy JPEG codec does not accommodate images with more than 12 bits of data, which proved to be particularly problematic, as an increasing proportion of the MR and CT images generated by the equipment in our Medical Imaging department are 16-bit. As it was a mandatory requirement for our study to keep the DICOM information attached to the image, we investigated several solutions, including selecting only 8- or 12-bit images. However, considering that an increasing proportion of our images are 16-bit, this was unrealistic in a modern medical imaging department.

An option that was deemed unacceptable was to manually window the image on the server, save the window-level presets, and then convert the image into an 8-bit file. This would result in an 8-bit image in which the reader would have no ability to manipulate the window level. This option was dismissed, as it would not allow readers to view images in a manner that reflects real-life conditions.

An option that we used in some cases was converting 16-bit images to 8 bits. While this resolved our lossy JPEG codec limitation, it could also limit readers’ ability to manipulate the images.

Another strategy that we considered was converting the signed data to unsigned, shifting 4 bits right, compressing the 12 used bits, decompressing to 12 bits, and shifting left 4 bits to generate 16 bits again.
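The bit-shifting strategy can be illustrated with a short sketch. It makes two assumptions that the text does not state: that the signed-to-unsigned conversion is a fixed offset of 32768, and that the codec step can be left out (treated as lossless) so that only the effect of the shifts is visible. The function names are hypothetical.

```python
def to_12_bit(sample: int) -> int:
    """Map a signed 16-bit sample to unsigned, then keep the top 12 bits."""
    unsigned = sample + 32768  # signed [-32768, 32767] -> unsigned [0, 65535]
    return unsigned >> 4       # drop the 4 least significant bits


def back_to_16_bit(value_12: int) -> int:
    """Shift a 12-bit value back into a signed 16-bit range."""
    unsigned = value_12 << 4   # the 4 low bits come back as zeros
    return unsigned - 32768


def round_trip(sample: int) -> int:
    """Shift down and back up, with the compression step omitted."""
    return back_to_16_bit(to_12_bit(sample))


if __name__ == "__main__":
    for sample in (-32768, 0, 1000, 32767):
        print(sample, round_trip(sample))
```

Under these assumptions, any information carried in the four lowest bits is lost before compression even starts, so the round trip can differ from the original by up to 15 grey levels.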

The solution we engineered was to rescale the images by replacing any pixel equal to the Pixel Padding Value with the lowest valid pixel value (which becomes zero), normalizing all pixel values to a 0–4095 range by adding the necessary offset from Rescale Intercept, checking that no value exceeds 4095, and updating the DICOM tags in the header to reflect the changes.
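The rescaling steps can be sketched on a flat list of pixel values. This is an illustration under stated assumptions rather than the code used in the study: the function name `rescale_to_12_bit` is hypothetical, the offset is taken to be the lowest valid pixel value, and the updated intercept follows the usual DICOM relation output = slope × stored value + intercept. Writing the updated tags back into a DICOM header is left out.

```python
from typing import List, Tuple

MAX_12_BIT = 4095


def rescale_to_12_bit(
    pixels: List[int],
    padding_value: int,
    rescale_intercept: float,
    rescale_slope: float = 1.0,
) -> Tuple[List[int], float]:
    """Return (pixels shifted into 0..4095, updated rescale intercept)."""
    valid = [p for p in pixels if p != padding_value]
    if not valid:
        raise ValueError("image contains only padding pixels")
    lowest = min(valid)

    # Padding pixels take the lowest valid value, which then maps to zero.
    shifted = [(lowest if p == padding_value else p) - lowest for p in pixels]

    # The shifted data must fit in 12 bits for the lossy JPEG codec.
    if max(shifted) > MAX_12_BIT:
        raise ValueError("pixel range does not fit in 12 bits")

    # Keep real-world values unchanged: slope * (new + lowest) + intercept.
    new_intercept = rescale_intercept + rescale_slope * lowest
    return shifted, new_intercept


if __name__ == "__main__":
    print(rescale_to_12_bit([-2000, -1024, 0, 3071], padding_value=-2000,
                            rescale_intercept=0.0))
```

Raising an error when the range exceeds 4095 corresponds to the check described above; how such images were then handled is not specified in the text.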

The technical issues encountered during the implementation of our study confirm that JPEG 2000 is much more flexible.10 Its features, including support for more image formats, embedded progressive lossy-to-lossless coding, ROI coding, and an interactive protocol, appear to better address the new requirements of modern and more demanding medical imaging studies.

Technical Developments

To enable radiologists across Canada to evaluate images for our study, we developed a dedicated software application that was synchronized to a centralized server, which allowed results to be reported in real time to the central database via the Internet.

Server Application

The server application for this study was designed to collect images exported from our PACS using a DICOM communication tool (Merge DICOM3 Toolkit, Merge-Emed, Milwaukee, WI, USA), import images from CD-ROM, and store anonymized images in a database. The application needed to create and manage a database of radiologists, generate specific work lists of images, and store image information including the pathology and its location. The server compressed images using an industry-recognized compression package (PICTools JPEG 2000 from Pegasus Imaging, Tampa, FL).2 Each image was randomly assigned a compression ratio and a compression algorithm. Results were collected online from the remote workstations through synchronization with the client software.

Client Application

Our client application needed to retrieve and remotely display images in a manner consistent with readers’ everyday reading experiences. Our software provided readers with the essential tools for clinical evaluation, including window/leveling, zoom, pan, and reset functions. The software consisted of an auto-run program that displayed the images side by side on a dual-monitor workstation. By using a grid to divide the images into four quadrants, readers were able to specify the location of the abnormality that they found. Readers were also able to select their responses to questions regarding the type of pathology and provide a confidence rating of their assessment from a drop-down menu. The client software was synchronized to our server over the Internet, which allowed readers’ responses to be reported in real time. The software was distributed to readers on CDs or DVDs, which contained a connectivity test with which readers could determine whether they were able to access our server via the Internet. Our technology enables online results collection, which can be used for worldwide research. This has the potential to alter the way that data are collected, and it will be implemented in future studies.

Viewing Environment

In order to obtain findings that were relevant to everyday clinical evaluation, images were not viewed in a strict laboratory environment; rather, they were read on hospital/clinic workstations that complied with CAR and ACR practice standards. We contacted PACS administrators at each participating institution by phone to ensure that workstations were DICOM compliant and regularly tested and calibrated, that the resolution and video cards were appropriate for the types of images being read (i.e., a minimum of 2 MP for computed radiography (CR)), that monitors had a minimum luminance of 50 foot-lamberts, and that the lighting environment was appropriate.9

Quality Control

To verify that our compression ratios were within the acceptable range of accuracy, we had the compression ratios externally validated by two engineering companies: Pegasus Imaging (Tampa, FL, USA), and Khademi Consulting (Toronto, Canada). Those two companies made sure that the compression engine was correctly implemented, that the compression was properly applied to our images, and that the ratio published was in accordance with the final image size for both algorithms.

Results

Statistical Technique

Diagnostic accuracy was assessed by comparing rater sensitivity (proportion of abnormal images correctly identified) and specificity (proportion of normal images correctly identified) across three compression levels and both types of compression using a two within-factor analysis of variance (ANOVA) for each of the seven anatomical areas and five modalities. The Bonferroni alpha adjustment for multiple testing (α = 0.05/22 = 0.0023) was used for all comparisons.
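As a worked illustration of the quantities defined above, the short sketch below computes sensitivity, specificity, and the Bonferroni-adjusted threshold. The function names and the example counts are assumptions chosen for the example; only the 0.05/22 adjustment comes from the text, and the ANOVA itself is not reproduced here.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Proportion of abnormal images correctly identified."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Proportion of normal images correctly identified."""
    return true_negatives / (true_negatives + false_positives)


def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold after Bonferroni adjustment."""
    return alpha / n_comparisons


if __name__ == "__main__":
    # Hypothetical counts for one reader and one compression condition.
    print(sensitivity(true_positives=45, false_negatives=5))
    print(specificity(true_negatives=9, false_positives=1))
    # Threshold used in the text: 0.05 / 22, which rounds to 0.0023.
    print(round(bonferroni_alpha(0.05, 22), 4))
```

With this threshold, a P value of 0.002 counts as significant, while a value such as 0.0404 does not.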

The subjective quality of the comparison of each of the compression levels and two types versus the uncompressed image was scored using a six-point ordinal scale (1 = Unacceptable, 2 = Significant, 3 = Intermediate, 4 = Conspicuous, 5 = Just noticeable, 6 = None). For each of the seven anatomical areas and five modalities, these scores were tested using the Mantel–Haenszel trend statistic for the tendency of ratings in one level of compression and type to have better scores than ratings in another level of compression and type.

The Effects of Compression on CR Images

Diagnostic accuracy results are summarized in Table 4. There was no effect of compression level or type on diagnostic accuracy of CR chest, CR pediatric, or CR body images. While there was no effect of compression type on specificity of CR musculoskeletal images (Fig. 1), there was a significant effect of compression level on sensitivity (P = 0.002). There was no effect of compression on readers’ subjective assessments of CR chest (df = 6, P = 0.5452), CR pediatric (df = 6, P = 0.1275), CR body (df = 6, P = 0.5240), or CR musculoskeletal images (df = 6, P = 0.0566).

Fig. 1. The effect of compression on sensitivity of CR musculoskeletal images: compression had a significant effect, mainly with JPEG 2000 (P = 0.002).

Table 4. Summary of Diagnostic Accuracy Results

The Effects of Compression on CT Images

There was no effect of compression level or type on diagnostic accuracy of CT vascular, CT body, CT chest, CT pediatric, or CT MSK images. While there was no effect of compression level on specificity of CT neurological images (Fig. 2), there was a significant effect of compression type on sensitivity (P < 0.0001). There was no effect of compression on readers’ subjective assessments of CT vascular images (df = 6, P = 0.0896). There was, however, a significant effect of compression on readers’ subjective assessments of CT body images (Fig. 3) (df = 6, P < 0.0001), with a greater proportion of readers choosing categories 1 (unacceptable), 2 (significant), and 3 (intermediate) for images compressed with JPEG 2000. There was also a significant effect of compression on readers’ subjective assessments of CT neurological images (df = 6, P < 0.0001), with readers rating images compressed with JPEG 2000 more frequently as 1 (unacceptable) and 2 (significant).

Fig. 2. The effect of compression on subjective assessment of CT neurological images. Lossy JPEG performed better than JPEG 2000 at the highest levels of compression tested (df = 6, P < 0.0001).

Fig. 3. The effect of compression on subjective assessment of CT body images. Lossy JPEG performed better than JPEG 2000 at the highest levels of compression tested (df = 6, P < 0.0001).

The Effects of Compression on US Images

There was no effect of compression level or type on diagnostic accuracy of US body, US breast, US pediatric, or US musculoskeletal images. There was no effect of compression on readers’ subjective assessments of ultrasound musculoskeletal images (df = 6, P = 0.4244), ultrasound body images (df = 6, P = 0.9722), ultrasound breast images (df = 6, P = 0.4038), or ultrasound pediatric images (df = 6, P = 0.4038).

The Effects of Compression on MR Images

There was no effect of compression level or type on diagnostic accuracy of MR breast, MR Angio, MR musculoskeletal, or MR pediatric images. While there was no effect of compression level on sensitivity of MR neurological images, there was a significant effect of compression type on specificity (P = 0.002). There was no effect of compression on readers’ subjective assessments of MR breast (df = 6, P = 0.0404), MR body (df = 6, P = 0.4232), MR musculoskeletal (df = 6, P = 0.5681), or MR neurological images (df = 6, P = 0.5167).

The Effects of Compression on Nuclear Medicine Images

The images tested included cardiac imaging (MUGA, Planar), bone, thyroid, and kidney. There was no effect of compression level or type on diagnostic accuracy of NM pediatric or adult images. There was also no effect of compression on readers’ subjective assessments of nuclear medicine adult (df = 6, P = 0.6714) or nuclear medicine pediatric images (df = 6, P = 0.8033).

The Effects of Compression on Mammography Images

There was no effect of compression level or type on diagnostic accuracy of mammography images. There was also no effect of compression on readers’ subjective assessment of mammography images (df = 6, P = 0.9502).

Discussion

This was an ambitious study in terms of both the number of anatomical areas and modalities considered and the technical developments. During the course of this study, we encountered many challenges which have been described in detail to assist individuals pursuing research in the compression of medical images.

Diagnostic Accuracy

With the exception of two subspecialties, compression had no effect on readers’ specificity and sensitivity. Compression level had a significant effect on sensitivity of CR musculoskeletal images, with readers performing more poorly on images compressed at higher levels. Compression type had a significant effect on sensitivity for CT neurological images, with JPEG performing better than JPEG 2000. The effect of compression level on specificity of MR neurological images showed poor results at low levels of compression (<40% for JPEG 2000 and <60% for JPEG) but 100% correct answers at the upper levels (20 and 24:1) for JPEG 2000 and at 20:1 for JPEG, whereas there was no effect on sensitivity.

ROC Analysis

We had originally intended to perform ROC analysis15 for each modality and anatomical specialty included in this study. However, the lack of variability in readers’ responses made it impossible for us to carry out this analysis. The lack of variability is likely attributable to the conservative compression ratios we selected for our study. For the most part, readers did not report a detectable difference between compressed and uncompressed images. Future studies wishing to include ROC analysis may consider selecting compression levels high enough to allow readers to detect a difference between the uncompressed and compressed images. This did not impact our study, as we were not trying to establish a threshold but rather to define a comfort zone in which compression has no significant effect on image quality.

Table 5. Recommended Compression Ratios for each Modality and Anatomical Area Investigated

Subjective Assessment

For the majority of compression levels and modalities tested, readers found no detectable difference between uncompressed images and images compressed with either JPEG or JPEG 2000. There were two exceptions: readers evaluating CT neuro and CT body images were able to distinguish between the uncompressed image and a copy of the image compressed with JPEG 2000. Readers more frequently rated JPEG 2000 CT neuro and CT body images as unacceptable or as showing a significant loss of information, indicating that subtle abnormalities may be overlooked as a result of compression.

External Factors Impacting Image Quality

Radiation Dose

Special consideration is required when working with CT pediatric images, where radiologists try to reduce the amount of radiation delivered to young patients. This results in increased background noise in the image, which can make images less tolerant to compression.

Short Acquisition Times

Quick runtimes are often employed during pediatric CT and MR exams in an effort to counteract motion artifacts due to children’s agitation or to decrease the anesthetic times required to perform the exams. Quick runtimes also increase background noise, resulting in images that are less tolerant to compression.

Slice Thickness

Previous studies9 suggest that thin CT slices may modify image tolerance to lossy compression. Our study was restricted to images of 2.5 mm thickness and greater. Assessment of thin slices and different filters will be the object of a subsequent dedicated evaluation.

Multiphasic Nuclear Medicine Images

The complexity of displaying multiphasic nuclear medicine images posed substantial challenges for our development team. Nuclear medicine images are variable in size, with some files being large and others relatively small. As a result of these size discrepancies, we could not apply one compression ratio across all NM images; some images would inevitably be overcompressed, while others would be undercompressed. In addition to the challenges of displaying these images, there were no previous studies in the literature on which we could base the compression ratios required for our study.

JPEG 2000 Degradation for CT Body and CT Neurological Images

Some images proved to be less tolerant to compression by JPEG 2000 than by JPEG. Although the literature suggests that, regardless of compression level, JPEG 2000 typically outperforms JPEG,5,12,13 it has been described that fine, irregular textures (white matter in brain CT, trabecular bone pattern, ultrasound speckle) contain many small high-frequency coefficients and tend to exhibit blurring artifacts at moderate levels of compression.14 We worked with leading industry experts, Aware Inc. (Bedford, MA, USA) and Pegasus Imaging (Tampa, FL, USA), to verify our results and to ensure that the compression ratios (8, 10, and 12:1) we selected were appropriate for this radiological specialty. The main source of degradation in JPEG 2000 compressed CT neurological images occurred in speckled regions (which are predominantly acquisition noise). Speckle patterns are represented by many low-amplitude, high-frequency wavelet coefficients, which are discarded by quantization. This causes local blurring and ringing artifacts, since each discrete wavelet transform (DWT) coefficient affects the frequency content of the reconstructed image at a specific location. This is in contrast to JPEG, where discrete cosine transform (DCT) coefficients are representative of the global frequency content of the image (i.e., they are not spatially localized); hence, features with similar frequency content, such as speckles and other image objects, have a combined frequency representation in the DCT domain. Consequently, these features are less affected by quantization because they are combined to produce larger valued DCT coefficients. We could confirm that low-energy, high-frequency speckles are discarded by JPEG 2000 and therefore create blurring artifacts that do not appear on lossy JPEG images.
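The wavelet side of this mechanism can be illustrated with a deliberately simplified sketch. It is not the JPEG 2000 codec: it assumes a single-level Haar transform on one row of pixels and a plain dead-zone threshold in place of the real quantizer, and all function names are hypothetical. It does not model the DCT behavior of JPEG.

```python
from typing import List, Tuple


def haar_forward(signal: List[float]) -> Tuple[List[float], List[float]]:
    """One Haar level: pairwise averages (low-pass) and half-differences (high-pass)."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / 2 for a, b in pairs]
    detail = [(a - b) / 2 for a, b in pairs]
    return approx, detail


def dead_zone(coeffs: List[float], threshold: float) -> List[float]:
    """Discard coefficients whose magnitude falls below the threshold."""
    return [c if abs(c) >= threshold else 0.0 for c in coeffs]


def haar_inverse(approx: List[float], detail: List[float]) -> List[float]:
    """Rebuild the signal from averages and half-differences."""
    out: List[float] = []
    for s, d in zip(approx, detail):
        out.extend([s + d, s - d])
    return out


def compress_line(signal: List[float], threshold: float) -> List[float]:
    """Transform, discard small high-frequency coefficients, reconstruct."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, dead_zone(detail, threshold))


if __name__ == "__main__":
    # Low-amplitude speckle (100/104) followed by a strong edge (100 -> 160).
    row = [100, 104, 100, 104, 100, 160, 160, 160]
    print(compress_line(row, threshold=5))
```

In this toy example, the small alternating values standing in for speckle are flattened to their local mean, while the strong edge survives because its coefficient is large. This is the kind of local smoothing of low-amplitude texture described above.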

Compression Ratios

The compression ratios used in this experiment were intentionally conservative. We were not interested in determining the threshold at which images could be compressed. Instead, we were interested in providing the CAR with a range of compression levels that their readers could confidently use in their everyday reading (Table 5). In the majority of the modalities and specialties included in this study, we did not detect a difference between the uncompressed images and images compressed at any of the levels tested.

The acceptable compression ratios we found are in some cases higher than those published in previous studies, as demonstrated in the extensive literature review on which our study was based, but we suggest using the lowest values of the range we propose. In any case, the supervision of a qualified radiologist is mandatory in order to ultimately determine whether the image quality after compression is acceptable.

Conclusions

The results of our large-scale pan-Canadian evaluation study of lossy compression suggest that, at low levels of lossy compression, there was no significant difference between the performance of lossy JPEG and lossy JPEG 2000, and that both are appropriate to use for reporting on medical images. At higher levels, lossy JPEG proved to be more effective than JPEG 2000 in some cases of MSK CR, brain, and abdominal CT images.

Validation of irreversible compression for slices with a thickness of less than 2.5 mm has not been evaluated at the time of writing. When such evaluation is completed, the guidelines will be updated accordingly.

We have developed a reproducible evaluation methodology, which will allow us to assess more imaging modalities and processes, such as CAD, that we could not include in this study.

We recommend a range of acceptable conservative compression values by anatomical area and modality that can be used in primary reporting. The adoption of irreversible compression by an organization or group of radiologists must be subject to the supervision of a qualified radiologist, who must ultimately determine whether the image quality after compression is acceptable.

Our Canadian initiative is not isolated, and there is considerable interest in other countries in implementing similar guidelines, among them the United Kingdom, where a standard should be released soon, France, and Germany.