1 Introduction

Public databases of X-ray images can be found for medical imaging;Footnote 1 however, to the best of the authors' knowledge, until now there have been no public databases of digital X-ray images for X-ray testing.Footnote 2

Fig. 1 Random X-ray images of the \(\mathbb {GDX}\)ray database

As a service to the X-ray testing community, we collected more than 19,400 X-ray images for the development, testing and evaluation of image analysis and computer vision algorithms. The images are organized in a public database called \(\mathbb {GDX}\)ray: The Grima X-ray database.Footnote 3 To illustrate the database, a random selection of 70 X-ray images is shown in Fig. 1. The database includes five groups of X-ray images: castings, welds, baggage, natural objects and settings. Each group has several series, and each series contains several X-ray images. Some samples of each series are illustrated in Fig. 2. Most of the series are annotated or labeled; in those cases, the coordinates of the bounding boxes of the objects of interest or the labels of the images are available. Some statistics are given in Table 1. The size of \(\mathbb {GDX}\)ray is 3.5 GB and it can be downloaded from our website (see Fig. 2).
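As a quick consistency check, the per-group counts reported in Sections 3 to 7 can be added up and compared with the total quoted above. The following sketch is only illustrative arithmetic; the name `GROUP_COUNTS` is chosen here for illustration and is not part of the database.

```python
# (images, series) per group, as stated in Sections 3-7 of this paper.
# The dictionary name is hypothetical; it is not a file shipped with the database.
GROUP_COUNTS = {
    "Castings": (2727, 67),
    "Welds": (88, 3),
    "Baggage": (8150, 77),
    "Nature": (8290, 13),
    "Settings": (151, 7),
}

# Sum the first and second entry of each pair separately.
total_images = sum(images for images, _ in GROUP_COUNTS.values())
total_series = sum(series for _, series in GROUP_COUNTS.values())

print(total_images, total_series)  # prints: 19406 167
```

The sum of 19,406 images agrees with the statement that the database holds more than 19,400 X-ray images.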

In this paper, we describe the structure of the \(\mathbb {GDX}\)ray database, give a description of each group (with some series examples), and present some examples of applications that have been published using images of \(\mathbb {GDX}\)ray.

Fig. 2 Screenshot of the \(\mathbb {GDX}\)ray website. Some X-ray images of ten series are shown on the right: C0001 and C0034 for castings, W0001 and W0003 for welds, B0001 and B0046 for baggage, N0006 (cherry), N0010 (wood) and N0011 (salmon) for natural objects, and S0001 for settings (a calibration pattern)

2 Structure of the Database

\(\mathbb {GDX}\)ray is available in a public repository. The repository contains five group folders, one for each group: Castings, Welds, Baggage, Nature and Settings. For each group we define an initial: C, W, B, N and S, respectively. As shown in Table 1, each group has several series. Each series is stored in an individual sub-folder of the corresponding group folder. The sub-folder name is Xssss, where X is the initial of the group and ssss is the number of the series. For example, the third series of group Castings is stored in sub-folder C0003 of folder Castings (see more examples in Fig. 2). The X-ray images of a series are stored in files named Xssss_nnnn.png, where Xssss is again the name of the series and nnnn is the number of the X-ray image within this series. For example, the fifth X-ray image of series C0003 is C0003_0005.png and is stored in sub-folder Castings/C0003. The whole structure is summarized in Table 2. It is worth mentioning that all X-ray images of \(\mathbb {GDX}\)ray are stored in ‘png’ (Portable Network Graphics)Footnote 4 8-bit grayscale format. Additional metadata for each series (such as a description of the objects and the parameters and description of the X-ray imaging system) are given in an ASCII file called Xssss_readme.txt included in sub-folder Xssss, e.g., C0003_readme.txt for series Castings/C0003.
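The naming convention above can be captured in a few lines of code. The following is a minimal sketch, assuming paths are taken relative to the root of the downloaded repository; the helper names `image_path`, `readme_path` and `parse_image_name` are hypothetical and are not part of any tool distributed with the database.

```python
import re
from pathlib import PurePosixPath

# Group initial -> group folder, as listed in this section.
GROUP_FOLDERS = {"C": "Castings", "W": "Welds", "B": "Baggage",
                 "N": "Nature", "S": "Settings"}

def image_path(initial: str, series: int, image: int) -> PurePosixPath:
    """Relative path Group/Xssss/Xssss_nnnn.png (hypothetical helper)."""
    series_name = f"{initial}{series:04d}"  # Xssss
    return PurePosixPath(GROUP_FOLDERS[initial], series_name,
                         f"{series_name}_{image:04d}.png")

def readme_path(initial: str, series: int) -> PurePosixPath:
    """Relative path Group/Xssss/Xssss_readme.txt (hypothetical helper)."""
    series_name = f"{initial}{series:04d}"
    return PurePosixPath(GROUP_FOLDERS[initial], series_name,
                         f"{series_name}_readme.txt")

# Pattern for file names of the form Xssss_nnnn.png.
_IMAGE_NAME = re.compile(r"^([CWBNS])(\d{4})_(\d{4})\.png$")

def parse_image_name(filename: str) -> tuple:
    """Split a file name into (group initial, series number, image number)."""
    match = _IMAGE_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a valid image file name: {filename!r}")
    return match.group(1), int(match.group(2)), int(match.group(3))

print(image_path("C", 3, 5))   # prints: Castings/C0003/C0003_0005.png
print(readme_path("C", 3))     # prints: Castings/C0003/C0003_readme.txt
```

For instance, `parse_image_name("C0003_0005.png")` returns `("C", 3, 5)`, i.e., the fifth image of the third series of group Castings.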

Table 1 Statistics of \(\mathbb {GDX}\)ray database

3 Castings

The group Castings contains 2727 X-ray images arranged in 67 series. The X-ray images were taken mainly from automotive parts (aluminum wheels and knuckles) using an image intensifier. Some examples are illustrated in Figs. 3 and 4. The details of each series are given in Table 3. Experiments on these data can be found in several publications as shown in Table 4. It is worth highlighting that series C0001 (see Fig. 3) contains not only a sequence of 72 X-ray images taken from an aluminum wheel by rotating it about its central axis in steps of 5\(^\circ\), but also annotations of bounding boxes of the ground truth of 226 small defects and the calibration matrix of each image, which relates the 3D coordinates of the aluminum wheel to the 2D coordinates of the X-ray image.
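To make the role of such a calibration matrix concrete, the following minimal sketch projects a 3D point onto an image. It assumes that the matrix is a 3x4 projective matrix acting on homogeneous coordinates; this layout is an assumption made for illustration, and the actual format stored with series C0001 should be taken from the series readme file. The function name `project` and the example matrix are hypothetical.

```python
from typing import Sequence, Tuple

def project(P: Sequence[Sequence[float]],
            point_3d: Sequence[float]) -> Tuple[float, float]:
    """Project a 3D point to 2D image coordinates with a 3x4 matrix P.

    Assumed model (not confirmed by the database documentation):
    lambda * (u, v, 1)^T = P * (X, Y, Z, 1)^T.
    """
    X, Y, Z = point_3d
    homogeneous = (X, Y, Z, 1.0)
    # One dot product per matrix row gives the homogeneous image point.
    u, v, w = (sum(p * h for p, h in zip(row, homogeneous)) for row in P)
    if w == 0:
        raise ValueError("point has no finite projection")
    return u / w, v / w

# Hypothetical matrix: focal length 1000 pixels, principal point (320, 240),
# object coordinate system aligned with the imaging system.
P_example = [
    [1000.0, 0.0, 320.0, 0.0],
    [0.0, 1000.0, 240.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

print(project(P_example, (0.5, -0.25, 2.0)))  # prints: (570.0, 115.0)
```

With one such matrix per view, the same 3D point can be projected into every image of a sequence, which is what makes the annotated defects traceable across views.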

Table 2 Structure of \(\mathbb {GDX}\)ray
Fig. 3 Some X-ray images of an aluminum wheel (group Castings, series C0001)

Fig. 4 Some annotated images showing bounding boxes of casting defects

Table 3 Description of group ‘Castings’ of \(\mathbb {GDX}\)ray
Table 4 Applications of series castings
Fig. 5 Some images of group Welds, series W0001 (X-ray images) and W0002 (ground truth)

Table 5 Description of group ‘Welds’ of \(\mathbb {GDX}\)ray
Table 6 Applications of series welds
Fig. 6 Some X-ray images of a bag containing handguns, shuriken and razor blades (group Baggage, series B0048)

Fig. 7 Some X-ray images of handguns (series B0049), shuriken (series B0050) and razor blades (series B0051) of group Baggage

Fig. 8 A knife was rotated in steps of \(1^\circ\), and an X-ray image was captured at each position. In this figure, the X-ray images at \(0^\circ, 10^\circ, 20^\circ, \dots, 350^\circ\) are shown (see series B0008 of group Baggage)

Table 7 Description of group ‘Baggage’ of \(\mathbb {GDX}\)ray
Table 8 Applications of series baggage
Fig. 9 Some X-ray images of salmon filets (group Nature, series N0011)

Fig. 10 Some X-ray images of wood (group Nature, series N0010)

4 Welds

The group Welds contains 88 images arranged in 3 series. The X-ray images were taken by the BAM Federal Institute for Materials Research and Testing, Berlin, Germany.Footnote 5 Some examples are illustrated in Fig. 5. The details of each series are given in Table 5. Experiments on these data can be found in several publications as shown in Table 6. It is worth highlighting that series W0001 and W0002 (see Fig. 5) contain not only 10 X-ray images selected from the whole BAM database (series W0003), but also annotations of bounding boxes and the binary images of the ground truth of 641 defects.

Series W0003 contains a collection of 67 digitized radiographs from a round robin test on flaw recognition in welding seams. The NDT films (used with lead screens) were exposed according to ISO 17636-1, testing class A. After development, they were scanned with a Lumisys LS85 SDR laser scanner using digitization class DB-9 according to ISO 14096-2. The original 12-bit data depth was rescaled to 8 bits with a linear LUT proportional to optical film density, adjusted visually to the image content. This ensures that all necessary flaw information is retained in the 8-bit images.Footnote 6 The pixel size is 40.3 µm (630 dpi), and the images are stored as 8-bit gray values. In addition, the file ‘real-values.xls’ in this directory contains the true data and the flaw designations according to ISO 6520 and ISO 5817. These true data were generated using weld sections of 1 cm width starting from the indicated zero point.
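The figures quoted for series W0003 can be related to each other with a little arithmetic. The sketch below converts the scan resolution into a pixel size, expresses a 1 cm weld section in pixels, and builds a linear 12-bit to 8-bit look-up table. The function names are hypothetical, and the window limits `low` and `high` are an assumption standing in for the visual adjustment mentioned above; the sketch does not reproduce the procedure actually used to create the images.

```python
MICRONS_PER_INCH = 25_400.0

def pixel_size_um(dpi: float) -> float:
    """Pixel size in micrometres for a given scan resolution in dpi."""
    return MICRONS_PER_INCH / dpi

def length_to_pixels(length_mm: float, pixel_um: float) -> int:
    """Number of pixels (rounded) covering a length given in millimetres."""
    return round(length_mm * 1000.0 / pixel_um)

def linear_lut(low: int, high: int) -> list:
    """8-bit output for every 12-bit input value (0..4095).

    Values are mapped linearly between `low` and `high` and clipped outside;
    the window limits are illustrative assumptions.
    """
    if not 0 <= low < high <= 4095:
        raise ValueError("require 0 <= low < high <= 4095")
    scale = 255.0 / (high - low)
    return [min(255, max(0, round((v - low) * scale))) for v in range(4096)]

size = pixel_size_um(630)
print(round(size, 1))                # prints: 40.3
print(length_to_pixels(10.0, size))  # prints: 248

lut = linear_lut(0, 4095)            # full-range window, for illustration only
print(lut[0], lut[4095])             # prints: 0 255
```

At 630 dpi, a weld section of 1 cm width therefore spans roughly 248 pixels, which is useful when relating the tabulated true data to image coordinates.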

Table 9 Description of group ‘Nature’ of \(\mathbb {GDX}\)ray
Table 10 Applications of series Nature
Fig. 11 Some images of group Nature, series N0012 (X-ray images of salmon filets) and N0013 (ground truth for fish bones)

Fig. 12 Some X-ray images of a copper checkerboard used for calibration (group Settings, series S0001)

5 Baggage

The group Baggage contains 8150 X-ray images arranged in 77 series. The X-ray images were taken from different containers such as backpacks, pen cases, wallets, etc. Some examples are illustrated in Figs. 6, 7 and 8. The details of each series are given in Table 7. Experiments on these data can be found in several publications as shown in Table 8. It is worth highlighting that series B0046, B0047 and B0048 (see for example Fig. 6) contain 600 X-ray images that can be used for automated detection of handguns, shuriken and razor blades (bounding boxes for these objects of interest are available as well). In this case, training can be performed using series B0049, B0050 and B0051, which include X-ray images of individual handguns, shuriken and razor blades, respectively, taken from different points of view as shown in Fig. 7.

Table 11 Description of group ‘Settings’ of \(\mathbb {GDX}\)ray

6 Natural Objects

The group Nature contains 8290 X-ray images arranged in 13 series. The X-ray images were taken from different natural objects such as salmon filets, fruit and wood pieces. Some examples are illustrated in Figs. 9 and 10. The details of each series are given in Table 9. Experiments on these data can be found in several publications as shown in Table 10. It is worth highlighting that series N0012 and N0013 (see Fig. 11) contain not only 6 X-ray images of salmon filets, but also annotations of bounding boxes and the binary images of the ground truth of 73 fish bones. For training purposes, series N0003 provides more than 7500 labeled small crops (\(10 \times 10\) pixels) of regions of X-ray images of salmon filets with and without fish bones.

Table 12 Applications of series Settings

7 Settings

The group Settings contains 151 X-ray images arranged in 7 series. The X-ray images were taken from different calibration objects such as checkerboards and 3D objects with regular patterns. Some examples are illustrated in Fig. 12. The details of each series are given in Table 11. Experiments on these data can be found in several publications as shown in Table 12. It is worth highlighting that series S0001 (see Fig. 12) contains not only 18 X-ray images of a copper checkerboard, but also the calibration matrix of each view. In addition, series S0007 can be used for modeling the distortion of an image intensifier. The coordinates of each hole of the calibration pattern in each view are available, and the coordinates of the 3D model are given as well.

8 Conclusions

In this paper, we presented the details of a new public dataset called \(\mathbb {GDX}\)ray. It consists of more than 19,400 X-ray images. The database includes five groups of X-ray images: castings, welds, baggage, natural objects and settings. Each group has several series and X-ray images with many labels and annotations that can be used for training and testing purposes in computer vision algorithms. To the best knowledge of the authors, up until now there have not been any public databases of digital X-ray images for X-ray testing.

In this paper, we explained the structure of the \(\mathbb {GDX}\)ray database, we gave a description for each group (with some series examples), and we presented some examples of applications that have been published using images of \(\mathbb {GDX}\)ray.

We believe that \(\mathbb {GDX}\)ray represents a relevant contribution to the X-ray testing community. On the one hand, students, researchers and engineers can use these X-ray images to develop, test and evaluate image analysis and computer vision algorithms without purchasing expensive X-ray equipment. On the other hand, these images can be used as a benchmark in order to test and compare the performance of different approaches on the same data. Moreover, the database can be used in the training programs of human inspectors.