Abstract
In this survey, we compile a list of publicly available infrared image and video sets for artificial intelligence and computer vision researchers. We mainly focus on IR image and video sets, which are collected and labelled for computer vision applications such as object detection, object segmentation, classification, and motion detection. We categorise 109 publicly available or private sets according to their sensor types, image resolution, and scale. We describe each set in detail regarding their collection purpose, operation environment, optical system properties, and application area. We also cover a general overview of fundamental concepts related to IR imagery, such as IR radiation, IR detectors, IR optics and application fields. We analyse the statistical significance of the entire corpus from different perspectives. This survey will be a guideline for computer vision and artificial intelligence researchers who want to delve into working with the spectra beyond the visible domain.
1 Introduction
Artificial intelligence is based on data, the new defining element of science. In the past decade, machine learning techniques have evolved to the point where they can process larger data sets than humans could ever imagine or possess. Particularly in the field of computer vision, large-scale data sets improve machine learning performance so dramatically that deep neural networks can perform as well as humans, especially on high-quality images [31]. The amount of labelled visual data available for various computer vision tasks (such as image classification, segmentation, detection, and tracking) has reached billions of high-quality images [125] available worldwide for use by researchers and engineers.
Publicly accessible visual data comes in a variety of formats. Although the available data is overwhelmingly composed of visible band, or in other words “RGB”, images, public access to images of other modalities, such as multi/hyperspectral, magnetic resonance (MR), computerised tomography (CT), and synthetic aperture radar (SAR), to name a few, is also possible. One relatively less public imaging modality is infrared (IR) imagery, which corresponds to images constructed from the radiation of an invisible portion of the electromagnetic spectrum, known as the infrared band.
All kinds of objects emit infrared radiation [46]. With its low radiation absorption, high contrast, and capacity for hot-target detection, the IR band is popular and practical for civil and military use [84]. IR imaging is used in many applications, such as object detection, object segmentation, classification, and motion detection. However, in contrast to visible band imagery, IR images are difficult to access for several reasons. To begin with, the technology of most IR imaging systems is too expensive for consumer electronics. Besides, since most IR vision systems are utilised in military or medical settings, they are inaccessible due to either security reasons or intellectual property rights. As a result, publicly available infrared image and video sets are limited compared to the large-scale labelled visible band image and video sets.
The primary purpose of this article is to compile a list of publicly available infrared image and video sets for artificial intelligence and computer vision researchers. We mainly focusFootnote 1 on IR image and video sets that are collected and labelled for computer vision applications such as object detection, object segmentation, classification, and motion detection. We categorise 109 different publicly available or private sets according to their sensor types, image resolution, and scale. We describe each set in detail regarding its collection purpose, operation environment, optical system properties, and area of application.
The number of survey studies on IR vision algorithms and IR vision technologies is increasing [48, 49, 83, 107]. However, to the best of our knowledge, no published survey that reviews IR image or video sets exists. Our aim is to compile a collection of sets so that researchers in the fields of computer vision and deep learning can identify a visual corpus with the necessary properties and compare it with other sets already available. As a result, we believe that this survey can contribute to new algorithms in deep learning and vision research using the spectra beyond the visible spectrum. By scanning public academic sources, we compile this list of image and video sets collected using IR imaging equipment. Moreover, so that the reader can fully evaluate the different properties of IR image and video sets, we also provide a background on the fundamentals of infrared imagery, covering the principles of infrared radiation, infrared sensors, infrared optics, and the application fields of IR imagery.
The remainder of this paper is organised as follows: Section 2 covers a general overview of IR radiation, IR detectors, IR optics and related applications. Section 3 starts with an analysis of the statistical significance of the entire corpus and follows by providing the compiled sets as a list with brief descriptions. Finally, Section 4 draws conclusions and sets future directions.
2 Fundamentals of infrared imagery
2.1 Infrared radiation
The discovery of IR radiation dates back more than 200 years, to an experiment in which Frederick William Herschel used prisms and basic temperature sensors to measure the wavelength distribution of the stellar spectrum [23]. However, its widespread use is relatively new, starting in the early 20th century with the understanding of Planck’s law and blackbody radiation, aided by modern physics and quantum theory [20, 57]. Today it is almost common knowledge that, according to known laws of physics, objects emit radiation across a broad region of wavelengths called the electromagnetic spectrum (ES). The IR region of this spectrum corresponds to wavelengths from the nominal red edge of the visible spectrum, around 700 nanometres, to 1 millimetre. IR wavelengths in this region are conventionally categorised into five spectral sub-bands: 0.7 μm to 1.4 μm is called the near-infrared (NIR), 1.4 μm to 3 μm the short-wave infrared (SWIR), 3 μm to 8 μm the mid-wave infrared (MWIR), 8 μm to 15 μm the long-wave infrared (LWIR), and finally 15 μm to 1000 μm the far-infrared (FIR) (see Table 1).
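For quick bookkeeping when mixing data sets from different sub-bands, the conventional boundaries can be encoded as a simple lookup. This is a minimal sketch; band edges vary slightly between sources, and here SWIR is taken as 1.4-3 μm per the conventional categorisation:

```python
def ir_subband(wavelength_um: float) -> str:
    """Return the conventional IR sub-band for a wavelength in micrometres."""
    bands = [
        (0.7, 1.4, "NIR"),      # near-infrared
        (1.4, 3.0, "SWIR"),     # short-wave infrared
        (3.0, 8.0, "MWIR"),     # mid-wave infrared
        (8.0, 15.0, "LWIR"),    # long-wave infrared
        (15.0, 1000.0, "FIR"),  # far-infrared
    ]
    for lo, hi, name in bands:
        if lo <= wavelength_um < hi:
            return name
    raise ValueError("wavelength outside the conventional IR range")

print(ir_subband(10.0))  # LWIR
```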
The conventional categorisation of IR sub-bands defined in Table 1 is correlated with how IR radiation is absorbed, reflected or transmitted by the atmosphere. The region of the IR spectrum where there is relatively little absorption of terrestrial thermal radiation by atmospheric gases is called the IR atmospheric window, which spans roughly 1 to 15 μm. The absorption of IR radiation depends on various atmospheric conditions such as altitude, latitude, solar zenith angle, water vapour, etc. Figure 1 depicts a synthetically created spectrum of atmospheric transmission between 0.7 and 30 μm, generated using the ATRAN moduleFootnote 2 [69]. For instance, as seen in Fig. 1, the atmospheric transmittance of the NIR sub-band is relatively high, which makes it an effective spectrum for active night vision systems (i.e. systems in which a radiation source illuminates the scene).
It is also seen in Fig. 1 that much of the IR spectrum is not suitable for everyday applications because IR radiation is absorbed by water or carbon dioxide in the atmosphere. However, there are a number of wavelength bands with low absorption, which actually create the IR sub-bands known as the short, medium and long-wavelength IR bands, abbreviated as SWIR, MWIR and LWIR respectively.
Visible, NIR and SWIR light (0.35-3 μm) correspond to a band of high atmospheric transmission and peak solar illumination. This is why most optical systems include detectors sensitive to these bands for the best clarity and resolution. However, without moonlight or artificial illumination, SWIR imaging systems are known to provide poor or no imagery of objects below 300 K. Since SWIR imaging systems predominantly use reflected light, their images are comparable to grey-scale visible images in resolution and detail.
The MWIR band (also referred to as the “MIR”) also provides partial regions of nearly lossless atmospheric transmission, with the added benefit of reduced ambient and background noise. This region is referred to as the “thermal infrared”. Radiation in this sub-band is emitted by the object itself; hence passive imaging is utilised. Two principal factors determine how bright an object appears in the MWIR spectrum: the object’s temperature and its emissivity. Emissivity is a physical property of a material that describes how efficiently it radiates energy compared to an ideal blackbody at the same temperature.
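The dependence of apparent brightness on temperature and emissivity follows directly from Planck’s law. The snippet below sketches a grey-body spectral radiance calculation; the 300 K temperature and 0.95 emissivity are illustrative assumptions, not values from the survey:

```python
import math

H = 6.62607015e-34  # Planck constant (J*s)
C = 2.99792458e8    # speed of light in vacuum (m/s)
K = 1.380649e-23    # Boltzmann constant (J/K)

def spectral_radiance(wavelength_m: float, temperature_k: float,
                      emissivity: float = 1.0) -> float:
    """Grey-body spectral radiance via Planck's law, in W / (sr * m^3)."""
    num = 2.0 * H * C**2 / wavelength_m**5
    den = math.exp(H * C / (wavelength_m * K * temperature_k)) - 1.0
    return emissivity * num / den

# A ~300 K terrestrial object radiates far more strongly at 10 um (LWIR)
# than at 4 um (MWIR), which is why passive thermal imaging of room-
# temperature scenes favours the longer wavelengths.
lwir = spectral_radiance(10e-6, 300.0, emissivity=0.95)
mwir = spectral_radiance(4e-6, 300.0, emissivity=0.95)
print(lwir > mwir)  # True
```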
The LWIR band spans roughly 8-15 μm, with almost no atmospheric absorption in the 9-12 μm region. Because LWIR sensors can construct an image of a scene from passive thermal emissions alone, requiring no active illumination, this region is also considered “thermal infrared”. The LWIR band is better than MWIR for imaging through smoke or atmospheric particles (aerosols); therefore, surveillance applications usually prefer LWIR technology. On the other hand, for very long-range detection (10 km or more), MWIR has greater atmospheric transmission than LWIR in most atmospheric conditions.
Although the FIR spectrum is defined between 15 μm and 1 mm, the atmosphere absorbs almost all IR radiation with wavelengths above 25 μm. Hence, atmospheric FIR spectroscopy can only be effectively utilised in the limited region between 15 and 25 μm. This region is also an atmospheric thermal band, which we can experience in the form of heat waves. For astronomical observation outside of the atmosphere, the entire FIR spectrum is utilised.
For a general overview of the subject and the fundamentals of radiometry, the reader may refer to [81].
2.2 Infrared detectors
One of the fundamental parts of an IR electro-optical system is the detecting sensor. In order to capture the IR signature of a scene, a detector sensitive to IR radiation is needed. IR-sensitive detectors capture the IR radiation emitted by objects and the scene, and convert it into electrical signals. Objects with different temperatures and emissivities emit different levels of radiation, so the camera produces electrical signals of different amplitudes. These electrical signals are used to produce the IR image.
Detectors are the core of an IR imaging system. Historically, IR detectors can be examined in three generations. The first generation consists of single-cell detectors: in order to create an image plane, the infrared beam emitted from a scene reaches a reflective surface (i.e. a mirror), and as the position of the mirror is deflected by two-dimensional rotary actuators, the focused infrared beam traces a two-dimensional pattern of the target image plane. In contrast, second-generation systems comprise an array of detectors with an optical mirror system that rotates on a single axis only. Finally, modern third-generation IR optical systems have two-dimensional array detectors, known as focal plane arrays (FPA), so the system needs no mirror system to scan different parts of the scene [10, 46]. Third-generation IR detectors are, in principle, quite similar to modern digital cameras.
In order to measure IR detector performance, three principal metrics are utilised: photosensitivity (or responsivity), noise-equivalent power (NEP), and detectivity (D*).
Photosensitivity, or responsivity, is defined as the output signal per watt of incident energy. The output varies according to the type of detector: for example, photovoltaic detectors output a photocurrent (i.e. amperes), whereas photoconductive detectors output a voltage. Photosensitivity is related to the magnitude of the sensor’s response and is expressed as follows:

R = S / (P × A)    (1)

where S is the signal output, P is the incident energy per unit area and A is the detector’s active area [46, 115].
The signal-to-noise ratio (SNR) for a given input flux level is an important parameter used to determine IR image sensitivity [18]. NEP is the quantity of incident light at which the SNR equals 1, and is expressed as follows:

NEP = (P × A) / ((S/N) × √Δ)    (2)

where N is the noise output and Δ is the noise bandwidth (S, P and A are the same as in (1)).
Detectivity D* (normalised detectivity) is the photosensitivity per unit active area of a detector and is expressed as follows:

D* = √A / NEP    (3)
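The three metrics of (1)-(3) can be computed directly. This is a sketch assuming the common unit conventions (P in W/cm², A in cm², Δ in Hz); manufacturers’ exact normalisations vary:

```python
import math

def responsivity(s: float, p: float, a: float) -> float:
    """Responsivity R = S / (P * A): output signal per watt of incident power.
    s: output signal (A or V), p: incident energy density (W/cm^2),
    a: detector active area (cm^2)."""
    return s / (p * a)

def nep(s: float, n: float, p: float, a: float, df: float) -> float:
    """Noise-equivalent power: the incident power at which SNR = 1,
    normalised to the noise bandwidth df, in W / Hz^0.5."""
    return (p * a) / ((s / n) * math.sqrt(df))

def d_star(s: float, n: float, p: float, a: float, df: float) -> float:
    """Normalised detectivity D* = sqrt(A) / NEP, in cm * Hz^0.5 / W."""
    return math.sqrt(a) / nep(s, n, p, a, df)

# Hypothetical detector readings, purely for illustration:
print(d_star(s=1.0, n=0.01, p=1e-6, a=0.01, df=1.0))  # 1e9 cm*Hz^0.5/W
```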
Technologically, IR detectors are classified into two main groups: thermal detectors and photon (quantum) detectors (see Table 2) [93]. Thermal detectors include thermocouples, thermopiles, pyroelectric detectors and bolometers, which use the heating effect of infrared energy for detection. They are constructed using metal compounds or semiconductor materials and are low-cost. These detectors operate at room temperature, and their sensitivity is independent of wavelength; consequently, they are capable of capturing scenes in all IR sub-bands. However, they suffer from slow response times, low sensitivity, and low resolution.
In contrast to thermal detectors, photon detectors simply count photons of IR radiation. Different technologies implement these types of sensors, such as photoconductors, photodiodes, Schottky barrier detectors, and quantum well detectors [93]. Compared to thermal sensors, they are more sensitive and operate faster. However, these detectors do not operate at room temperature and require cooling. In addition, they are made from materials such as InSb, HgCdTe, and GaAs/AlGaAs, whose sensitivity depends on photon absorption, and are therefore more expensive. They also cover a limited portion of the IR spectrum. Photon detectors are usually utilised when a high-sensitivity response is required at a specific wavelength.
Comparative studies on thermal and photon detectors show that both sensor types have their pros and cons [51, 93, 115]. Photon detectors are favoured at specific wavelengths and lower operating temperatures, whereas thermal detectors are favoured at a very long spectral range [92]. Photon detectors are fundamentally limited by generation-recombination noise arising from photon exchange with a radiating background. Thermal detectors are fundamentally limited by temperature fluctuation noise arising from radiant power exchange with a radiating background [62].
2.2.1 IR detector raw output
The raw pixel output of an IR detector is the irradiance (i.e. the flux of infrared energy per unit area) transformed into quantised n-bit values. These values lie within the limits of the so-called “dynamic range”: the difference between the largest and smallest signal value the detector can record or reproduce. The raw pixel values are usually not uniformly distributed within the dynamic range; in practice, a raw IR detector output is usually confined to a very limited range. In Fig. 2, a 16-bit IR detector raw output (taken from [13]), its 16-bit raw pixel histogram and the enhanced image are depicted.
IR electro-optical systems that provide a visual output for human users enhance the raw detector output using contrast-enhancing histogram-shaping methods [101]. These systems usually provide 8-bit contrast-enhanced images as output. The aim of such a process is to increase the contrast of the raw IR image for the human observer; as seen in Fig. 2a, the raw image is barely visible to the human eye. Because most image enhancement algorithms are irreversible, the bit range decreases at the price of sacrificing information. This enhancement is usually a default process for visible-band cameras. On the other hand, systems that run intelligent IR image processing algorithms, such as tracking, detection, and recognition, utilise the raw pixel output, since it is representative of the actual irradiance values collected from the scene and has a higher dynamic bit range. The raw output of the electro-optical system usually has the same bit depth as the IR detector, such as 11 or 14 bits. In the following section, when analysing the various image and video sets, we specifically indicate whether the pixel values of a given set are raw or enhanced.
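The 16-bit-to-8-bit enhancement step described above can be sketched as a simple histogram-equalisation pass. This is a minimal illustration on synthetic data; real systems apply proprietary histogram-shaping variants:

```python
import numpy as np

def enhance_raw_ir(raw: np.ndarray, out_bits: int = 8) -> np.ndarray:
    """Histogram-equalise a raw high-bit-depth IR frame down to out_bits.
    The mapping is irreversible: radiometric information is sacrificed
    for contrast, as discussed above."""
    levels = 2 ** out_bits
    hist, bin_edges = np.histogram(raw.ravel(), bins=levels)
    cdf = hist.cumsum().astype(np.float64)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())   # normalise to [0, 1]
    bin_idx = np.digitize(raw.ravel(), bin_edges[1:-1])  # bin index per pixel
    out = (cdf[bin_idx] * (levels - 1)).astype(np.uint8)
    return out.reshape(raw.shape)

# Simulate a raw 16-bit frame confined to a narrow band of the dynamic range,
# as is typical of raw detector output.
rng = np.random.default_rng(0)
raw = rng.normal(21000, 300, size=(64, 64)).clip(0, 65535).astype(np.uint16)
enhanced = enhance_raw_ir(raw)
print(enhanced.dtype, enhanced.min(), enhanced.max())  # uint8 0 255
```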
Some thermal cameras utilise false colours for their 8-bit contrast-enhanced output. This is usually done for temperature mapping in cameras used for temperature measurement. In Fig. 2d, an example of a false-colour contrast-enhanced infrared image is depicted.
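A false-colour output like that of Fig. 2d can be approximated by mapping grey levels through a colour palette. The anchor colours below are assumed purely for illustration; real thermal cameras ship their own calibrated palettes:

```python
import numpy as np

# Hand-picked anchor colours for an "iron"-style palette (assumed values).
ANCHORS = np.array([
    [0, 0, 0],        # cold  -> black
    [32, 0, 128],     #       -> purple
    [255, 64, 0],     #       -> orange
    [255, 255, 255],  # hot   -> white
], dtype=np.float64)

def false_colour(grey: np.ndarray) -> np.ndarray:
    """Map an (H, W) uint8 image to an (H, W, 3) uint8 false-colour image
    by linear interpolation between the anchor colours."""
    xp = np.linspace(0, 255, len(ANCHORS))
    rgb = np.stack([np.interp(grey, xp, ANCHORS[:, c]) for c in range(3)],
                   axis=-1)
    return rgb.astype(np.uint8)

grey = np.arange(256, dtype=np.uint8).reshape(16, 16)
coloured = false_colour(grey)
print(coloured.shape)  # (16, 16, 3)
```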
2.3 Infrared optics
IR imaging technology was founded in the late 1920s with the understanding of photon emission, and improvements continue even today [20]. IR imaging is based on a fundamental concept in geometrical optics called the ray model. The ray model ignores diffraction and assumes that light travels in straight lines from a source point. Each location in the scene can be regarded as a source point, and the source points emit different levels of radiation that create the IR scene [18].
In geometrical optics an image is constructed via an optical material, by focusing the rays collected from the scene onto an image plane. Hence, the optical material used in an infrared system needs to be transparent (i.e. with transmittance closer to 1.0) at the wavelength the detector is sensitive to. The percentage of incident light that passes through a material for a given wavelength of radiation is defined as electromagnetic transmission, also known as transmittance.
When choosing the correct optical material for an IR imaging system, there are three main points to consider. The first is the thermal properties of the material. Optical materials are typically placed in environments with varying temperatures and, as a result, can absorb a significant amount of heat. To ensure that the user receives the desired performance, the coefficient of thermal expansion (CTE) of the material should be evaluated. Secondly, as mentioned above, sufficient transmittance of the material at the given wavelength is a must. In Fig. 3, the transmittance of different materials in the IR sub-bands is depicted. For example, if the system is intended to operate in the LWIR band, germanium (Ge) optics with a thickness of 1 mm are preferable to sapphire optics of the same thickness.
Another factor in choosing a suitable optical material is the refractive index, a measure of how fast radiation travels through a material. The refractive index in the IR varies considerably among materials, allowing more flexibility in system design; however, materials with a high refractive index suffer larger reflection losses at their surfaces. As a solution, anti-reflection coatings are applied to materials used for IR optics, which also limits them to a desired band within the IR spectrum.
For more information on the subject, the reader may refer to [28].
2.4 IR electro-optical system properties
There are several important parameters used for selecting appropriate equipment and characterising the performance of IR systems. The parameters that measure the performance of an IR electro-optical system depend on its ability to detect IR radiation and resolve the temperature differences in the scene. The contrast in an IR image arises from variations in temperature and emissivity. The parameters that may affect the performance of an IR electro-optical system include, in general, spectral range, normalised detectivity, temperature range, absolute accuracy, repeatability, frame rate, spatial resolution and thermal sensitivity [113]. These parameters are briefly explained below:
-
Spectral range: refers to the wavelength range in which the IR system will operate.
-
Normalised detectivity (D*): as defined in (3), is one of the widely used parameters to compare the performance of IR detectors.
-
Temperature range: or the operating temperature, is the minimum and maximum temperatures that can be measured by the IR electro-optical system. It is expressed in K, °C, or °F.
-
Absolute Accuracy: is a measure of how accurately the system detects the actual temperature and is denoted by temperature units. Related to this measure, Repeatability is defined as the consistency of the system accuracy.
-
Frame rate: is the number of frames displayed per second. For monitoring moving objects, higher frame rate cameras are mostly preferred [10]. It has a unit of Hz.
-
Spatial resolution: also referred to as the “instantaneous field-of-view” (IFOV), is the imaging system’s ability to differentiate the details of objects within a single pixel-sized FOV. It is a measure of solid angle, hence represented in steradians. As spatial resolution increases, so does the image quality [10].
-
Thermal sensitivity: is the smallest temperature change detectable by the IR imaging system. The three most common parameters used as a measure of thermal sensitivity are the “Noise Equivalent Temperature Difference” (NETD), the “Minimum Resolvable Temperature Difference” (MRTD) and the “Minimum Detectable Temperature Difference” (MDTD) [113]. It is expressed in temperature units (i.e. K, °C, or °F).
In order to choose the right camera for the right application, all of the aforementioned parameters should be taken into account. There are numerous commercial IR electro-optical systems available in the market. In Table 3, we provide a selection of six different near-infrared electro-optical systems, with their comparative parameters, so as to give the reader a sense of the systems engineering perspective of IR electro-optical system selection.
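As an illustration of how two of these parameters interact, the linear IFOV of a staring-array system is often approximated as pixel pitch divided by focal length. The pitch, focal length and range below are assumed example figures, not values from Table 3:

```python
import math

def ifov_rad(pixel_pitch_m: float, focal_length_m: float) -> float:
    """Linear instantaneous field of view (small-angle approximation, radians)."""
    return pixel_pitch_m / focal_length_m

def footprint_m(pixel_pitch_m: float, focal_length_m: float,
                range_m: float) -> float:
    """Approximate size of the scene patch covered by one pixel at a range."""
    return ifov_rad(pixel_pitch_m, focal_length_m) * range_m

# A 15 um pixel pitch behind a 100 mm lens (assumed example values):
ifov = ifov_rad(15e-6, 0.1)            # ~150 microradians
patch = footprint_m(15e-6, 0.1, 1000)  # one pixel covers ~0.15 m at 1 km
print(ifov, patch)
```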
2.5 Applications of IR electro-optical systems
The development of IR sensing technologies has resulted in countless applications, which we divide into four major categories: military & surveillance, industrial, medical, and scientific. Each category is briefly explained below. The IR image and video sets provided in the next section are categorised according to these application titles.
-
Military & Surveillance Applications: The military and surveillance field, which also encapsulates law enforcement and rescue applications, covers a wide variety of applications in all IR sub-bands. Warfare applications include target tracking/detection/acquisition on various platforms, such as missile seeker heads, forward-looking infrared (FLIR) systems, infrared search and track (IRST) systems, and directed infrared countermeasure (DIRCM) systems. Regarding law enforcement and rescue applications, night vision systems, reconnaissance and surveillance, fire fighting and rescue in smoke, locating earthquake victims, forest fire detection, and radiation thermometry are prime examples.
-
Industrial Applications: Industrial applications of IR imaging systems include the utilisation of IR sensing technology in various industrial fields, such as infrared heating in process control, nondestructive inspection of thermal insulators, hidden piping location detection, diseased tree and crop detection, hot spot detection, brake lining inspection, industrial temperature measurement, clear-air turbulence detection, and pipeline leak and petrol spill detection, to name a few.
-
Medical Applications: In medicine, IR technology is fundamentally used for diagnosis, such as early cancer detection, determining the optimum site for amputation, determining the location of the placenta, detecting strokes and vein blockages before they occur, monitoring wound healing, and detecting infection. Due to its non-invasive nature, IR technology in medicine provides information about conditions that are directly or indirectly related to the focused region of the body (such as hands [102]), as well as facilitating the assessment of treatment.
-
Scientific Applications: IR imaging technologies are used in nearly every scientific field, from remote sensing and meteorology to materials science and microbiology, and from engineering to biology. In this paper, when categorising a set as “scientific”, we take into account its use outside the other categorised sectors, namely military & surveillance, industrial, and medical. An image or video set is classified as both Military & Surveillance and Scientific, for example, if it can support both types of applications.
3 IR image & video sets
The paper analyses 109 IR image and/or video sets and provides a list of the sets in Table 4. A total of 77 are public, in other words, they offer public download links, while 3 are private and require payment. The remaining 29 sets can be downloaded for free but require manual registration by contacting the institution that owns them. The entire corpus of sets includes nearly 20 million still images and video frames. In the following, we provide the statistical details of the compiled list of IR image and video sets in terms of application fields, included object categories, resolution, annotation, and preprocessing, before presenting the list with brief descriptions.
Table 4 provides a sample image and a brief description for every set. There are also separate columns for technical details, such as types of annotated classes, number of total frames, image resolutions, sensor types, image bit depths, and application fields. The description section additionally specifies whether the collection is accessible to everyone (pub), accessible to paid users only (pri), or needs registration (rr). For more details, we suggest that the reader consult the References section for an online link to the image set.
3.1 Application fields
As mentioned in the previous section, application fields for IR image and video sets are grouped under four main titles: Military & Surveillance (Mil. & Sur.), Industrial, Medical and Scientific. As seen in Fig. 4a, Military & Surveillance comprises 65.2% of the total volume of images and video frames, clearly demonstrating the importance of IR imaging in this field. Sets collected for scientific applications cover 25.9% of the corpus, while medical applications cover 8.6%, most likely due to the legal challenges involved in collecting or publishing health informatics data. Industrial applications account for a marginal share, probably because industrial actors do not publish their data in the public domain. In Table 4, the application field of every individual set is indicated in the right-most column (titled “App.”).
3.2 Resolution and sensor
The IR image and video sets listed in this survey range from lower definition (LD--), i.e. resolutions lower than LD, to ultra-high definition (HD++), i.e. resolutions higher than UHD. Depending on the application, resolution plays a significant role: most surveillance systems require HD or better resolutions for accuracy, whereas LD and standard definition (SD) systems are ideal when the computational capabilities of the system are limited. Figure 4b shows that while three-quarters of the corpus is SD++ or lower, the remainder is nearly UHD or better. It is important to note that the sets with UHD or better resolutions are recent, showing a clear future trend. In Table 4, the actual resolution of every individual set is indicated in column four (titled “res”).
In addition, the optical equipment used to collect each set is provided in column five (titled “Sensor”). Compared to RGB cameras, IR optical systems permit different kinds of calibration and can produce characteristic output that may not be replicated even with similar equipment. Today’s RGB cameras usually apply the same preprocessing and aim to produce nearly identical output, whereas with IR vision it becomes important to know the parameters of the equipment in order to recreate similar scenes or images. Therefore, in the “Sensor” column of Table 4, we provide details of the collection equipment for the sets that openly specify them.
3.3 Annotations and object categories
Many computer vision applications annotate data with labels for specific purposes, such as detection, tracking, and recognition. Data annotation/labelling is an expensive effort that enables supervised learning, and hence deep learning if the annotated data are sufficiently large in scale. Similarly, some of the IR sets listed in this survey are annotated with various labels. As shown in Fig. 4c, about 33.5% of the entire corpus is annotated. For some sets, these annotations are bounding-box locations of objects, whereas for others they are global labels for entire images. A majority of the corpus is not labelled; we believe most annotations may not be shared publicly due to their commercial implications. Once again, it is important to note that the labelled sets are recent, showing another future trend.
In Table 4 (column three, titled “Classes”), the categories of any existing annotations of a given set are provided. The entire collection of sets includes a wide range of object annotation categories. The objects are categorised under seven titles in Fig. 4d, namely biometrics, environments, humans, vehicles, animals, unknown and uncategorised. Biometric annotations include IR images of faces, irises, ears and/or fingerprints, and cover the majority of the annotations with a 52.1% share. Human annotations, including pedestrians, runners, sportsmen, etc., cover 37.7% of the annotations. Vehicles of different sorts, such as cars, bicycles, motorcycles, aircraft, and boats, are also included and cover 8.6% of the label annotations. A small number of animal class annotations account for 1.4%. There is also a marginal share of annotations related to environmental objects (terrain, roads, clouds), various objects such as food, and uncategorised application-specific labels.
3.4 Image enhancement
As mentioned previously in Section 2.2.1, IR electro-optical systems that provide a visual output for human users usually enhance the raw detector output using contrast-enhancing histogram-shaping methods, whereas IR image processing systems that run algorithms such as tracking, detection, and recognition utilise the raw pixel output, which usually has the same bit depth as the IR detector. The histogram-enhanced image is, in most cases, the only accessible output of an IR optical system, and for such systems the details of the enhancement algorithms are rarely provided to the user. Most systems apply different algorithms that suit their design requirements, such as level of contrast and real-time operation, to name a few. As seen in Fig. 4e, only a minority of 5.3% of the entire corpus of collected frames are raw detector outputs. The “bit” column in Table 4 gives the bit depth of an image/frame for a given set; the number (8, 11, 16, etc.) corresponds to the image bit depth. For some sets, the bit depth is indicated as “8*”, showing that the images/frames are in 24-bit RGB (i.e. 8 bits per channel) format. The abbreviation “HE” indicates the existence of a histogram enhancement process, whereas “RAW” indicates the accessibility of the raw detector output. The type of enhancement technique is not indicated in the table because this information is not available for most of the collection equipment.
4 Conclusions and future directions
In this survey, we compile a list of publicly available IR image and video sets for artificial intelligence and computer vision researchers. We mainly focus on IR image and video sets that are collected and labelled for computer vision applications such as object detection, object segmentation, classification, and motion detection. We categorise 109 publicly available or private sets according to their sensor types, image resolution, and scale. The list includes a brief description of each set. The statistical details of the entire corpus of IR image & video sets are provided in terms of application fields, included object categories, resolution, annotations, sensor types and preprocessing details.
We believe that this survey, with its solid introductory references to the fundamentals of IR imagery, will serve as a guideline for computer vision and artificial intelligence researchers who want to delve into working with the spectra beyond the visible domain. Today, consumer electronics manufacturers are integrating IR cameras into smartphones, making IR imaging a reality in the consumer market. Within a short time, the IR domain will host a large number of pre-trained deep learning models. Therefore, this collection can serve as a basis for future research on deep learning models for vision problems such as IR domain adaptation, multi-modal vision, and fusion. Such an approach may result in IR subband-specific deep feature extractors that can be used for a variety of vision tasks, although these models would require very large-scale sets. A crucial practice in the future would be the ongoing updating of this survey, especially in light of the possibility that annotated IR sets may soon become available in vast quantities.
Data Availability Statement
The dataset generated during the current study is available from the corresponding author upon reasonable request.
Notes
Multispectral image sets collected with satellites are left out of the scope of this survey paper. We believe that multispectral satellite imagery is a category that requires a unique focus, because it differs from ground-based IR imaging in vision practices, perspective, atmospheric effects and applications.
ATRAN module input parameters are selected as follows. Observatory altitude: 13,800 feet (Mauna Kea, with 3.4 mm water vapour); observatory latitude: 39 degrees; water vapour overburden: 0 microns; standard atmosphere with 2 layers; zenith angle: 45 degrees; smoothing resolution: 1000.
References
(2018) Multi-modal dataset for hand gesture recognition. Available at https://www.kaggle.com/gti-upm/multimodhandgestrec
(2020) Thermal images - diseased & healthy leaves - paddy. Available at https://www.kaggle.com/sujaradha/thermal-images-diseased-healthy-leaves-paddy?select=thermal+images+UL
Akula A, Khanna N, Ghosh R et al (2014) Adaptive contour-based statistical background subtraction method for moving target detection in infrared video sequences. Infrared Phys Technol 63:103–109. Available at http://vcipl-okstate.org/pbvs/bench/
Alaska Fisheries Science Center (accessed on 2022) A dataset for machine learning algorithm development. Available at https://lila.science/datasets/noaa-arctic-seals-2019/
Alqattan M (2020) A dataset of raw thermal, visible and night vision images for illegal fishers in the kuwaiti bay. https://doi.org/10.17632/69ncy4nxsg.1, Available at https://data.mendeley.com/datasets/69ncy4nxsg/1
Aniket A (2022) bird dataset. Available at https://universe.roboflow.com/antiuav-9-aniket/bird-6le8u
Ariffin S M Z S Z, Jamil N, Rahman P N M A (2016) Diast variability illuminated thermal and visible ear images datasets. In: 2016 Signal processing: algorithms, architectures, arrangements, and applications (SPA), pp 191–195. https://doi.org/10.1109/SPA.2016.7763611, Available at http://vcipl-okstate.org/pbvs/bench/
Ashfaq Q, Akram U, Zafar R (2021) Thermal image dataset for object classification. https://doi.org/10.17632/btmrycjpbj.1, Available at https://data.mendeley.com/datasets/btmrycjpbj/1
AV-Public (2022) All thermal dataset. Available at https://universe.roboflow.com/avpublic/all_ther
Bagavathiappan S, Lahiri BB, Saravanan T et al (2013) Infrared thermography for condition monitoring - a review. Infrared Phys Technol 60:35–55
Bahnsen C H, Moeslund T B (2018) Rain removal in traffic surveillance: does it matter? IEEE Trans Intell Transp Syst, 1–18. https://doi.org/10.1109/TITS.2018.2872502. Available at https://www.kaggle.com/aalborguniversity/aau-rainsnow/
Benes R, Dvorak P, Faundez-Zanuy M et al (2013) Multi-focus thermal image fusion. Pattern Recogn Lett 34(5):536–544. Available at http://splab.cz/en/download/databaze/multi-focus-thermal-image-database
Berg A, Ahlberg J, Felsberg M (2015) A thermal object tracking benchmark. In: 2015 12th IEEE international conference on advanced video and signal based surveillance (AVSS). Available at http://www.cvl.isy.liu.se/en/research/datasets/ltir/version1.0/
Bernhard J, Barr J, Bowyer K W et al (2015) Near-ir to visible light face matching: Effectiveness of pre-processing options for commercial matchers. In: 2015 IEEE 7th International conference on biometrics theory, applications and systems (BTAS), pp 1–8. https://doi.org/10.1109/BTAS.2015.7358780, Available at https://cvrl.nd.edu/projects/data/
Bertozzi M, Broggi M V G D R M (2006) Low-level pedestrian detection by means of visible and far infra-red tetra-vision. Maintained by http://vislab.it/
Bilodeau G-A, Torabi A, St-Charles P-L et al (2014) Thermal–visible registration of human silhouettes: a similarity measure performance evaluation. Infrared Phys Technol 64:79–86. Available at http://vcipl-okstate.org/pbvs/bench/
Bondi E, Jain R, Aggrawal P et al (2020) Birdsai: a dataset for detection and tracking in aerial thermal infrared videos. In: WACV. Available at https://sites.google.com/view/elizabethbondi/dataset
Boreman G D (1998) Basic electro-optics for electrical engineers, vol 31. SPIE Press
Brown M, Süsstrunk S (2011) Multispectral SIFT for scene category recognition. In: Computer Vision and Pattern Recognition (CVPR11), Colorado Springs, pp 177–184. Available at https://ivrlwww.epfl.ch/supplementary_material/cvpr11/index.html
Buser R G, Tompsett M F (1997) Historical overview. In: Semiconductors and Semimetals, vol 47. Elsevier, pp 1–16
Chen X, Flynn P, Bowyer K (2005) Ir and visible light face recognition. Comput Vis Image Underst 99:332–358. https://doi.org/10.1016/j.cviu.2005.03.001. Available at https://cvrl.nd.edu/projects/data/
Chingovska I, Erdogmus N, Anjos A et al (2016) Face recognition systems under spoofing attacks. Springer International Publishing, Cham, pp 165–194. https://doi.org/10.1007/978-3-319-28501-6_8, Available at https://www.idiap.ch/en/dataset/msspoof
Clerke A M (2003) A popular history of astronomy during the nineteenth century. Sattre Pr
Coşar S, Yan Z, Zhao F et al (2018) Thermal camera based physiological monitoring with an assistive robot. In: 2018 40th Annual international conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp 5010–5013. https://doi.org/10.1109/EMBC.2018.8513201, Available at https://lcas.lincoln.ac.uk/wp/research/data-sets-software/
Computer Vision and Biometrics Lab. (2022) Multimodal biometrics dataset thermal face images. Available at https://cvbl.iiita.ac.in/dataset.php
Cosar S, Bellotto N (2019) Human re-identification with a robot thermal camera using entropy-based sampling. Journal of Intelligent & Robotic Systems. https://doi.org/10.1007/s10846-019-01026-w, Available at https://lcas.lincoln.ac.uk/wp/research/data-sets-software/l-cas-rgb-d-t-re-identification-dataset/
D’Angelo E, Herbin S, Ratieville M (2006) Robin challenge. Available at https://robin.inrialpes.fr/testsdefinitions.php
Daniels A (2018) Field guide to infrared optics, materials, and radiometry, vol FG39. SPIE
Davis J W, Keck M A (2005) A two-stage template approach to person detection in thermal imagery. In: 2005 Seventh IEEE workshops on applications of computer vision (WACV/MOTION’05), vol 1. IEEE, pp 364–369. Available at http://vcipl-okstate.org/pbvs/bench/
Davis J W, Sharma V (2007) Background-subtraction using contour-based fusion of thermal and visible imagery. Comput Vis Image Understand 106(2-3):162–182. Available at http://vcipl-okstate.org/pbvs/bench/
Dodge S F, Karam L J (2017) A study and comparison of human and deep learning recognition performance under visual distortions. arXiv:1705.02498
Erazo-Aux J, Loaiza-Correa H, Restrepo-Giron A D et al (2020) Thermal imaging dataset from composite material academic samples inspected by pulsed thermography. Data Brief 32:106313. https://doi.org/10.1016/j.dib.2020.106313, https://europepmc.org/articles/PMC7508994, Available at https://data.mendeley.com/datasets/v4knrwgj9y/2
Faundez-Zanuy M, Mekyska J, Espinosa-Duró V (2011) On the focusing of thermal images. Pattern Recogn Lett 32:1548–1557. https://doi.org/10.1016/j.patrec.2011.04.022, Available at http://splab.cz/en/download/databaze/thermal-focus-image-database
Faundez-Zanuy M, Mekyska J, Font X (2013) A new hand image database simultaneously acquired in visible, near-infrared and thermal spectrums. Cogn Comput, 6. https://doi.org/10.1007/s12559-013-9230-3, Available at http://splab.cz/en/download/databaze/carl-database
FLIR (2022) Free flir thermal dataset for algorithm training. Available at https://www.flir.com/oem/adas/adas-dataset-form/
Gade R, Moeslund T B (2018) Constrained multi-target tracking for team sports activities. IPSJ Trans Comput Vis Applic 10(1):1–11. Available at https://www.kaggle.com/aalborguniversity/thermal-soccer-dataset
Gao C, Du Y, Liu J et al (2016) Infar dataset: infrared action recognition at different times. Neurocomputing 212:36–47. https://doi.org/10.1016/j.neucom.2016.05.094, Available at https://drive.google.com/file/d/0B8URzo24xElURU1Oa0ctYmpaTlk/view?usp=sharing&resourcekey=0-6EOSjRX7_Ea-14tJorumrg
Garcia L, Diaz J, Loaiza Correa H et al (2020) Thermal and visible aerial imagery. https://doi.org/10.17632/ffgxxzx298.2, Available at https://data.mendeley.com/datasets/ffgxxzx298/2
Gebhardt E, Wolf M (2018) Camel dataset for visual and thermal infrared multiple object detection and tracking. In: 2018 15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS). IEEE, pp 1–6. Available at https://camel.ece.gatech.edu/
Ghayoumi zadeh H, Haddadnia J, Seryasat OR et al (2016) Segmenting breast cancerous regions in thermal images using fuzzy active contours. https://doi.org/10.17877/DE290R-17666, Available at http://database.irthermo.ir/
Ghayoumi zadeh H, Namdari F, Dadpay M et al (2017) Evaluation of thermal imaging in the diagnosis and classification of varicocele. Iran J Med Phys 14:114–121. https://doi.org/10.22038/ijmp.2017.20753.1200, Available at http://database.irthermo.ir/
Ghiass R, Bendada H, Maldague X (2018) Université laval face motion and time-lapse video database (ul-fmtv). https://doi.org/10.21611/qirt.2018.051. Available at http://www.qirt.org/liens/FMTV.htm
Gonzalez Alzate A, Fang Z, Socarras Y et al (2016) Pedestrian detection at day/night time with visible and fir cameras: A comparison. Sensors 16:820. https://doi.org/10.3390/s16060820
Ha Q, Watanabe K, Karasawa T et al (2017) Mfnet: towards real-time semantic segmentation for autonomous vehicles with multi-spectral scenes. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp 5108–5115. https://doi.org/10.1109/IROS.2017.8206396, Available at https://www.mi.t.u-tokyo.ac.jp/static/projects/mil_multispectral/
HACARUS Inc. (2020) Near infrared hyperspectral image dataset. Available at https://www.kaggle.com/hacarus/near-infrared-hyperspectral-image
HAMAMATSU PHOTONICS K.K., Solid State Division (2011) Characteristics and use of infrared detectors. Tech. rep.
Haque M A, Bautista R B, Noroozi F et al (2018) Deep multimodal pain recognition: a database and comparison of spatio-temporal visual modalities. In: 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018). IEEE, pp 250–257. Available at https://vap.aau.dk/mintpain-database/
He Y, Deng B, Wang H et al (2021) Infrared machine vision and infrared thermography with deep learning: a review. Infrared Phys Technol 116
Hou F, Zhang Y, Zhou Y et al (2022) Review on infrared imaging technology. Sustainability 14:18. https://doi.org/10.3390/su141811161, https://www.mdpi.com/2071-1050/14/18/11161
Huda N U, Hansen B D, Gade R et al (2020) The effect of a diverse dataset for transfer learning in thermal person detection. Sensors 20:7. Available at https://www.kaggle.com/noorulhuda90/aaupdt
Hudson RD, Hudson JW, Levinstein H (1976) Infrared detectors. Phys Today 29(3):59
Hui B, Song Z, Fan H et al (2019) A dataset for infrared image dim-small aircraft target detection and tracking under ground / air background. https://doi.org/10.11922/sciencedb.902, Available at https://www.scidb.cn/en/detail?dataSetId=720626420933459968&dataSetType=journal
Hwang S, Park J, Kim N et al (2015) Multispectral pedestrian detection: benchmark dataset and baseline. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1037–1045. Available at https://soonminhwang.github.io/rgbt-ped-detection/
Iwashita Y, Nakashima K, Stoica A et al (2019) Tu-net and tdeeplab: deep learning-based terrain classification robust to illumination changes, combining visible and thermal imagery, pp 280–285. https://doi.org/10.1109/MIPR.2019.00057, Available at http://robotics.ait.kyushu-u.ac.jp/~yumi/db/jpl_marsyard_db.html
Jia X, Zhu C, Li M et al (2021) Llvip: a visible-infrared paired dataset for low-light vision. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 3496–3504. Available at https://bupt-ai-cz.github.io/LLVIP/
Karasawa T, Watanabe K, Ha Q et al (2017) Multispectral object detection for autonomous vehicles. Proceedings of the on Thematic Workshops of ACM Multimedia 2017. Available at https://www.mi.t.u-tokyo.ac.jp/static/projects/mil_multispectral/
Karim A, Andersson J Y (2013) Infrared detectors: advances, challenges and new technologies. In: IOP Conference series: materials science and engineering, vol 51. IOP Publishing, p 012001
Kong S, Heo J, Boughorbel F et al (2007) Multiscale fusion of visible and thermal ir images for illumination-invariant face recognition. Int J Comput Vision 71:215–233. https://doi.org/10.1007/s11263-006-6655-0, Available at http://vcipl-okstate.org/pbvs/bench/
Korki14 (2022) Drones dataset. Available at https://universe.roboflow.com/korki14/drones-srdze
Kristan M, Matas J, Leonardis A et al (2016) A novel performance evaluation methodology for single-target trackers. IEEE Trans Pattern Anal Mach Intell 38(11):2137–2155. https://doi.org/10.1109/TPAMI.2016.2516982, Available at https://www.votchallenge.net/vot2019/dataset.html
Krišto M, Ivasic-Kos M, Pobar M (2020) Thermal object detection in difficult weather conditions using yolo. IEEE Access 8:125459–125476. https://doi.org/10.1109/ACCESS.2020.3007481, Available at https://dx.doi.org/10.21227/yec9-yy29
Kruse PW (1995) A comparison of the limits to the performance of thermal and photon detector imaging arrays. Infrared Phys Technol 36(5):869–882. https://doi.org/10.1016/1350-4495(95)00014-P, https://www.sciencedirect.com/science/article/pii/135044959500014P
Kumar A, Srikanth T (2008) Online personal identification in night using multiple face representations. In: 2008 19th International conference on pattern recognition, pp 1–4. https://doi.org/10.1109/ICPR.2008, Available at https://www4.comp.polyu.edu.hk/~csajaykr/IITD/FaceIR.htm
Lee A J, Cho Y, Shin Ys et al (2019) Vivid: vision for visibility dataset. Available at https://visibilitydataset.github.io/
Li S Z, Chu R, Liao S et al (2007) Illumination invariant face recognition using near-infrared images. IEEE Trans Pattern Anal Mach Intell 29 (4):627–639. Available at http://vcipl-okstate.org/pbvs/bench/
Liu H, Bao C, Xie T et al (2019) Research on the intelligent diagnosis method of the server based on thermal image technology. Infrared Phys Technol 96:390–396. Available at https://www.kaggle.com/liuhangaz/thermal-images-of-the-server
Liu Q, He Z (2018) PTB-TIR: a thermal infrared pedestrian tracking benchmark. arXiv:1801.05944. Available at https://github.com/QiaoLiuHit/PTB-TIR_Evaluation_toolkit
Liu Q, Li X, He Z et al (2020) Lsotb-tir: a large-scale high-diversity thermal infrared object tracking benchmark. https://doi.org/10.1145/3394171.3413922, Available at https://github.com/QiaoLiuHit/LSOTB-TIR
Lord S D (1992) A new software tool for computing Earth’s atmospheric transmission of near- and far-infrared radiation. NASA Technical Memorandum 103957
Mantecon T, Del-Blanco C, Jaureguizar F et al (2016) Hand gesture recognition using infrared imagery provided by leap motion controller. 10016, 47–57. https://doi.org/10.1007/978-3-319-48680-2_5, Available at https://www.kaggle.com/gti-upm/leapgestrecog
Miezianko R (accessed on 2022) Terravic research infrared database. Available at http://vcipl-okstate.org/pbvs/bench/
Miron A (2014) Multi-modal, multi-domain pedestrian detection and classification: proposals and explorations in visible over stereovision, fir and swir. Available at https://zenodo.org/record/3754168#.YIvye7UzZPa
Mohd Asaari M S, Suandi S A, Rosdi B (2014) Fusion of band limited phase only correlation and width centroid contour distance for finger based biometrics. Expert Syst Appl 41:3367–3382. https://doi.org/10.1016/j.eswa.2013.11.033, Available at http://drfendi.com/fv_usm_database/
Morris N, Avidan S, Matusik W et al (2007) Statistics of infrared images, 1–7. https://doi.org/10.1109/CVPR.2007.383003, Available at http://www.dgp.toronto.edu/~nmorris/IR/
Naik S (2019) Thermal mango image dataset - flir one. https://doi.org/10.17632/vksfkmphzs.1, Available at https://data.mendeley.com/datasets/vksfkmphzs/1
Najafi M, Baleghi Y, Mirimani S M (2021) Thermal images dataset, transformer, 1 phase dry type. https://doi.org/10.17632/8mg8mkc7k5.2, Available at https://data.mendeley.com/datasets/8mg8mkc7k5/2
Nelson J (2020) Thermal dogs and people object detection dataset. Available at https://public.roboflow.com/object-detection/thermal-dogs-and-people
Olmeda D, Premebida C, Nunes U et al (2013) Pedestrian detection in far infrared images. Integr Comput-Aided Eng, 20. https://doi.org/10.3233/ICA-130441, Available at https://e-archivo.uc3m.es/handle/10016/17370
Palmero C, Clapés A, Holmberg Bahnsen C et al (2016) Multi-modal rgb-depth-thermal human body segmentation. Int J Comput Vision, 118. https://doi.org/10.1007/s11263-016-0901-x, Available at https://vap.aau.dk/vap-trimodal-people-segmentation-dataset/
Panetta K, Wan Q, Agaian S et al (2018) A comprehensive database for benchmarking imaging systems. IEEE Trans Pattern Anal Mach Intell 42(3):509–520. Available at https://www.kaggle.com/kpvisionlab/tufts-face-database?select=file_1
Parr A C, Datla R, Gardner J (2005) Optical radiometry, vol 41. Elsevier
Patino L, Cane T, Vallee A et al (2016) Pets 2016: dataset and challenge. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp 1–8. Available at http://www.cvg.reading.ac.uk/PETS2016/a.html
Perpetuini D, Filippini C, Cardone D et al (2021) An overview of thermal infrared imaging-based screenings during pandemic emergencies. Int J Environ Res Public Health 18:6
Piñeiro-Ave J, Blanco-Velasco M, Cruz-Roldán F et al (2014) Target detection for low cost uncooled mwir cameras based on empirical mode decomposition. Infrared Phys Technol 63:222–231
Pini S, D’Eusanio A, Borghi G et al (2020) Baracca: a multimodal dataset for anthropometric measurements in automotive. In: Proceedings of the International joint Conference on Biometrics (IJCB). Available at https://aimagelab.ing.unimore.it/imagelab/page.asp?IdPage=37
Portmann J, Lynen S, Chli M et al (2014) People detection and tracking from aerial thermal views. In: 2014 IEEE International Conference on Robotics and Automation (ICRA), pp 1794–1800. Available at https://projects.asl.ethz.ch/datasets/doku.php?id=ir%3Airicra2014
Prasad D K, Rajan D, Rachmawati L, Rajabally E et al (2017) Video processing from electro-optical sensors for object detection and tracking in a maritime environment: a survey. IEEE Trans Intell Transp Syst 18(8):1993–2016. https://doi.org/10.1109/TITS.2016.2634580, Available at https://sites.google.com/site/dilipprasad/home/singapore-maritime-dataset
Projects R U (2022) People detection - thermal dataset. Available at https://universe.roboflow.com/roboflow-universe-projects/people-detection-thermal
Rivadeneira R E, Sappa A D, Vintimilla B X (2020) Thermal image super-resolution: a novel architecture and dataset. In: International conference on computer vision theory and applications, pp 1–2. Available at https://github.com/rafariva/ThermalDatasets
Rivadeneira R E, Suárez P L, Sappa A D, Vintimilla B X (2019) Thermal image superresolution through deep convolutional neural network. In: International conference on image analysis and recognition. Springer, pp 417–426. Available at https://github.com/rafariva/ThermalDatasets
Roboflow (2020) Thermal cheetah object detection dataset. Available at https://public.roboflow.com/object-detection/thermal-cheetah
Rogalski A (1997) Infrared thermal detectors versus photon detectors: I. Pixel performance. In: Sizov F F, Tetyorkin V V (eds) Material science and material properties for infrared optoelectronics, vol 3182. SPIE, pp 14–25. https://doi.org/10.1117/12.280417
Rogalski A (2002) Infrared detectors: an overview. Infrared Phys Technol 43(3-5):187–210
Schneider P, Anisimov Y, Islam R et al (2022) Timo—a dataset for indoor building monitoring with a time-of-flight camera. Sensors 22:11. https://doi.org/10.3390/s22113992, https://www.mdpi.com/1424-8220/22/11/3992, Available at https://vizta-tof.kl.dfki.de/timo-dataset-overview/
Sedik A, Abd El-Rahiem B, Abd El-Samie F et al (2020) Mbd: multi-biometric dataset. https://doi.org/10.17632/94ksjgbwnz.1, Available at https://data.mendeley.com/datasets/94ksjgbwnz/1
SENSIAC (2008) Military sensing information analysis center (sensiac). Available at https://www.sensiac.org/external/products/list_databases/
Shahroudy A, Liu J, Ng T-T et al (2016) Ntu rgb+ d: a large scale dataset for 3d human activity analysis. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1010–1019. Available at https://rose1.ntu.edu.sg/dataset/actionRecognition/
Shamsoshoara A, Afghah F, Razi A et al (2021) Aerial imagery pile burn detection using deep learning: The flame dataset. Comput Netw 193:108001. https://doi.org/10.1016/j.comnet.2021.108001, Available at https://dx.doi.org/10.21227/qad6-r683
Silva A, Calado C (2020) Thermal and optical behavior dataset of surfaces coated with high reflectance and common materials under different conditions, used in brazil. Data Brief 30:105445. https://doi.org/10.1016/j.dib.2020.105445, Available at https://data.mendeley.com/datasets/gnhjwsf6jf/2
Socarras Y, Ramos S, Vazquez D et al (2013) Adapting pedestrian detection from synthetic to far infrared images. Available at http://adas.cvc.uab.es/elektra/enigma-portfolio/item-1/
Soundrapandiyan R, Satapathy S C, P.V.S.S.R. C M et al (2022) A comprehensive survey on image enhancement techniques with special emphasis on infrared images. Multimed Tools Applic 81(7):9045–9077. https://doi.org/10.1007/s11042-021-11250-y
Sousa E, Vardasca R, Teixeira S et al (2017) A review on the application of medical infrared thermal imaging in hands. Infrared Phys Technol 85:315–323. https://doi.org/10.1016/j.infrared.2017.07.020, https://www.sciencedirect.com/science/article/pii/S1350449517304024
Speth J, Vance N, Czajka A et al (2021) Deception detection and remote physiological monitoring: a dataset and baseline experimental results. Available at https://cvrl.nd.edu/projects/data/
Strat T (2005) Vivid tracking evaluation web site. Available at http://vision.cse.psu.edu/data/vividEval/datasets/datasets.html
Strohmayer J, Pramerdorfer C, Kampel M (2020) Sdt: a synthetic multi-modal dataset for person detection and pose classification. Available at https://zenodo.org/record/4124309#.YWlGKRpBxPZ
Sun X, Guo L, Zhang W et al (2021) A dataset for small infrared moving target detection under clutter background. v1. Available at https://datapid.cn/31253.11.sciencedb.j00001.00231
Teutsch M, Sappa A D, Hammoud R I (2021) Computer vision in the infrared spectrum: challenges and approaches. Synth Lect Comput Vis 10(2):1–138
Toet A, IJspeert JK, Waxman AM, Aguilar M (1997) Fusion of visible and thermal imagery improves situational awareness. Displays 18(2):85–95. https://doi.org/10.1016/S0141-9382(97)00014-0, Available at https://figshare.com/articles/dataset/TNO_Image_Fusion_Dataset/1008029?file=37872186
Toet A (2002) Detection of dim point targets in cluttered maritime backgrounds through multisensor image fusion. In: Targets and Backgrounds VIII: Characterization and Representation, vol 4718. International Society for Optics and Photonics, pp 118–129. Available at https://figshare.com/articles/dataset/Kayak_image_fusion_sequence_Part_I/1007650
Toet A, Hogervorst M A, Pinkus A R (2016) The triclobs dynamic multiband image dataset. Available at https://figshare.com/articles/dataset/The_TRICLOBS_Dynamic_Multiband_Image_Dataset/3206887/1
Tu Z, Ma Y, Li Z et al (2020) Rgbt salient object detection: a large-scale dataset and benchmark. arXiv:2007.03262. Available at https://github.com/lz118/RGBT-Salient-Object-Detection
UMDAMAV-Dataset (2022) Thermal overhead dataset. Available at https://universe.roboflow.com/umdamavdataset/thermal_overhead
Venkataraman B, Raj B (2003) Performance parameters for thermal imaging systems. Insight-Non-Destructive Testing and Condition Monitoring 45 (8):531–535
Visual Lab. (accessed on 2022) Thermal images for breast cancer diagnosis. Available at http://712visual.ic.uff.br/en/proeng/thiagoelias/
Vollmer M, Möllmann K-P (2017) Infrared thermal imaging: fundamentals, research and applications. Wiley
Wang Y, Jodoin P-M, Porikli F et al (2014) Cdnet 2014: an expanded change detection benchmark dataset. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp 387–394. Available at http://jacarini.dinf.usherbrooke.ca/dataset2014/
Treible W, Saponaro P, Sorensen S et al (2017) Cats: a color and thermal stereo benchmark. In: Conference on Computer Vision and Pattern Recognition (CVPR). Available at http://bigdatavision.org/CATS/download.html
Westlake S T, Volonakis T N, Jackman J et al (2020) Deep learning for automatic target recognition with real and synthetic infrared maritime imagery. In: Artificial intelligence and machine learning in defense applications II, vol 11543. International Society for Optics and Photonics, p 1154309. Available at https://cord.cranfield.ac.uk/articles/dataset/IRShips/12800324
Wu Z, Fuller N, Theriault D et al (2014) A thermal infrared video benchmark for visual analysis. In: 2014 IEEE Conference on computer vision and pattern recognition workshops, pp 201–208. https://doi.org/10.1109/CVPRW.2014.39, Available at http://csr.bu.edu/BU-TIV/BUTIV.html
Xiang S (2020) Spindle thermal error prediction approach based on thermal infrared images: a deep learning method. https://doi.org/10.21227/vwp1-q708, Available at https://dx.doi.org/10.21227/vwp1-q708
Xu Z, Zhuang J, Liu Q et al (2019) Benchmarking a large-scale fir dataset for on-road pedestrian detection. Infrared Phys Technol 96:199–208. https://doi.org/10.1016/j.infrared.2018.11.007, Available at https://github.com/SCUT-CV/SCUT_FIR_Pedestrian_Dataset
Yaman M, Kalkan S (2015) An iterative adaptive multi-modal stereo-vision method using mutual information. Available at https://kovan.ceng.metu.edu.tr/MMStereoDataset/
Yoon J S, Park K, Hwang S et al (2016) Thermal-infrared based drivable region detection. In: Intelligent Vehicles Symposium (IV), 2016 IEEE. IEEE, pp 978–985. Available at https://sites.google.com/site/drivableregion/
Zhang H, Luo C, Wang Q et al (2018) A novel infrared video surveillance system using deep learning based techniques. Multimed Tools Applic 77 (20):26657–26676. Available at http://www.lpi.tel.uva.es/AALARTDATA
Zhang L, Rui Y (2013) Image search—from thousands to billions in 20 years. ACM Trans Multimed Comput Commun Appl 9:1s. https://doi.org/10.1145/2490823
Zhang M M, Choi J, Daniilidis K et al (2015) Vais: a dataset for recognizing maritime imagery in the visible and infrared spectrums. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp 10–16. https://doi.org/10.1109/CVPRW.2015.7301291, Available at http://vcipl-okstate.org/pbvs/bench/
Zukal M, Mekyska J, Cika P, Smekal Z (2013) Interest points as a focus measure in multi-spectral imaging. Radioengineering 22:68–81. Available at http://splab.cz/en/download/databaze/multispec
Ethics declarations
Conflict of Interests
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
A.1 List of Abbreviations
CT: Computerised Tomography
CTE: Coefficient of Thermal Expansion
D*: Detectivity
E: Emissivity
ES: Electromagnetic Spectrum
FHD: Full High Definition
FIR: Far-Infrared
FLIR: Forward Looking Infrared
FOV: Field-of-View
FPA: Focal Plane Array
HD: High Definition
HE: Histogram Equalization
IR: Infrared
LD: Low Definition
LWIR: Long-Wave Infrared
Mil.&Sur.: Military & Surveillance
MR: Magnetic Resonance
MWIR: Mid-Wave Infrared
NEP: Noise-Equivalent-Power
NIR: Near-Infrared
pri: Private Dataset
pub: Public Dataset
RGB: Red-Green-Blue
rr: Dataset that Requires Registration
SAR: Synthetic Aperture Radar
SD: Standard Definition
SNR: Signal-to-Noise Ratio
SWIR: Short-Wave Infrared
UHD: Ultra High Definition
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Danaci, K.I., Akagunduz, E. A survey on infrared image & video sets. Multimed Tools Appl 83, 16485–16523 (2024). https://doi.org/10.1007/s11042-023-15327-8