Abstract
This paper proposes two novel time-of-flight based methods for indoor and outdoor fire detection. The indoor detector is based on the depth and amplitude images of a time-of-flight camera. Using this multi-modal information, flames can be detected very accurately by detecting fast depth changes and amplitude disorder. In order to detect the fast changing depth, depth differences between consecutive frames are accumulated over time. Regions containing multiple pixels with a high accumulated depth difference are labeled as candidate flame regions. Simultaneously, the amplitude disorder is investigated. Regions with high accumulated amplitude differences and high values in all detail images of the discrete wavelet transform of the amplitude image are also labeled as candidate flame regions. Finally, if a depth candidate flame region overlaps an amplitude candidate flame region, a fire alarm is given. The outdoor detector, on the other hand, differs from the indoor detector in only one of its multi-modal inputs: as depth maps are unreliable in outdoor environments, it uses a visual flame detector instead of the fast changing depth detection. Experiments show that the proposed detectors achieve an average flame detection rate of 94% with no false positive detections.
1 Introduction
Fire is one of the leading hazards affecting everyday life around the world. The sooner a fire is detected, the better the chances for survival. Today’s fire alarm systems, however, still pose many problems. In order to cope with these problems, research on video-based fire detection (VFD) started at the beginning of this century. This has resulted in a large number of vision-based detection techniques that can be used to detect fire at an early stage [44]. Owing to the numerous advantages of video-based sensors, e.g. fast detection (no transport delay) and the ability to provide fire progress information, VFD has recently become a viable alternative or complement to more traditional fire sensors.
Although it has been shown that ordinary video yields good fire detection and analysis results, vision-based detectors still suffer from a significant number of missed detections and false alarms. The main cause of both problems is that visual detection is often subject to constraints regarding the scene under investigation, e.g. changing environmental conditions, and the target characteristics. To avoid the disadvantages of visual sensors, the use of other types of sensors has been explored over the last decade.
Thanks to improvements in the resolution, speed and sensitivity of infrared (IR) imaging, this newer type of imagery is already used successfully in many video surveillance applications, e.g. traffic safety, airport security and material inspection. Recently, IR imaging approaches for flame detection have also been proposed [6, 18, 31, 41, 45]. When lighting conditions are poor or the target’s color is similar to the background, IR vision is a fundamental aid. Even other visual-specific object detection problems, such as shadows, do not cause problems in IR [20]. Nevertheless, IR has its own specific limitations, such as thermal reflections, IR-blocking and thermal-distance problems. Furthermore, the cost of IR cameras is still very high.
Recently, as an alternative to IR and visual sensors, time-of-flight (TOF) imaging sensors have started to be used as a way to improve everyday video analysis tasks. TOF cameras are a relatively new innovation capable of providing three-dimensional image data from a single sensor. TOF imaging takes advantage of the different kinds of information produced by TOF cameras, i.e. depth and amplitude information. The ability to describe scenes using a depth map and an amplitude image provides new opportunities in different applications, including visual monitoring (object detection, tracking, recognition and image understanding), human computer interaction (e.g. gaming) and video surveillance. As the cost of TOF sensors is decreasing, the number of applications is expected to increase significantly in the near future.
The possibilities of TOF based fire detection have not yet been investigated. As such, this paper is the first attempt in this direction. Based on our preliminary experiments with a Panasonic D-Imager [32], of which some exemplary TOF flame images are shown in Fig. 1, it is already possible to state that TOF cameras have great potential for fire detection. Flames produce many measurement artifacts (~TOF noise), which can most likely be attributed to the infrared (IR) light emitted by the flames themselves. Contrary to ordinary objects, like people, the depth of flames changes very fast over time and ranges over the entire depth range of the camera. Furthermore, the amplitude of the boundary pixels of flames shows a high degree of disorder. As our experiments revealed that the combination of these TOF characteristics is unique to flames, they are the most appropriate features for TOF fire detection.
In outdoor situations, outside the range of the TOF camera, or when smoke appears in the field of view of the TOF camera, the TOF depth map becomes unreliable and can no longer be used for accurate flame detection. This can also be seen in Fig. 1. In order to cope with this problem, a visual detector is used instead of the fast changing depth detector. The proposed visual flame detector is based on a set of low-cost visual flame features which have proven to be useful in distinguishing flames from ordinary moving objects. The main benefit of using TOF amplitude data as well as visible information in outdoor environments (or outside the range of the TOF camera) is that mis-detections in visual images can be corrected by TOF detections and vice versa. As such, fewer false alarms will occur when both are combined. Important to mention is that, in order to do this, the visual and TOF images need to be registered, i.e. aligned with each other.
The remainder of this paper is organized as follows. Section 2 lists the related work in TOF video analysis and video fire detection. Very briefly, this section also reflects on conventional (3D) depth acquisition using stereo-vision. Next, Section 3 describes the working principle and the advantages and disadvantages of TOF imaging. Subsequently, Section 4 proposes the novel TOF based indoor flame detection algorithm. Flames are detected by looking for regions which contain multiple pixels with high amplitude disorder and high accumulated depth differences. Section 5, on the other hand, presents the novel multi-sensor visual-TOF flame detector. Next, Section 6 reports the performance results of our primary experiments. Finally, Section 7 lists the conclusions and points out directions for future work.
2 Related work
2.1 Time-of-flight based video analysis
To the best of our knowledge, the work described in this paper is the first attempt to develop a fire detection system based on the use of a TOF depth sensor. Nevertheless, the use of TOF cameras for video analysis is not new. Recently, TOF has started to be explored as a way to improve many conventional video based applications. The results of these first approaches already seem very promising and support the feasibility of TOF imaging in other domains, such as fire detection. So far, TOF imaging devices have been used for:
-
Video surveillance: Hügli and Zamofing [22] explore a remedy to shadows and illumination problems in ’conventional’ video surveillance by using range cameras. Tanner et al. [37] and Bevilacqua et al. [3] propose a TOF-based improvement for the detection, tracking and counting of people. Similarly, Grassi et al. [17] fuse TOF and infrared images to detect pedestrians and to classify them according to their moving direction and relative speed; Tombari et al. [38] detect graffiti by looking for stationary changes of brightness that do not correspond to changes in depth.
-
Image/video segmentation: In [4, 34] fusion of depth and color images results in significant improvements in segmentation of challenging sequences.
-
Face detection/recognition: Hansen et al. [21] improve the performance of face detection by using both depth and gray scale images; Meers et al. [28] generate accurate TOF-based 3D face prints, suitable for face recognition.
-
Human Computer Interaction: TOF cameras also pave the way to new types of interfaces that make use of gesture recognition [7] or the user’s head pose and facial features [28]. These novel interfaces can be used in a lot of systems, e.g. view control in 3D simulation programs, video conferencing, interactive tabletops [51], and support systems for the disabled.
-
(Deviceless) gaming: TOF imaging also increases the gaming immersion, as with this technology, people can play video games using their bodies as controllers. This is done by markerless motion capture, i.e. tracking and gesture recognition, using a single depth sensor. The sensor smoothly mirrors the player’s movements onto the gaming character. Recently, several companies, e.g. Omek Interactive and Softkinetic, started to provide commercially available TOF technology for gesture-based video gaming. Furthermore, Microsoft also focuses on this new way of gaming with its recently launched TOF-like Kinect.
-
Other applications: e-health (e.g fall detection [25]), medical applications, interactive shopping and automotive applications (e.g driving assistance and safety functions such as collision avoidance [35, 43]).
2.2 Video fire detection
Numerous vision-based fire and smoke detection algorithms have been proposed in the literature, yielding a large collection of VFD algorithms that can be used to detect the presence of fire. Each of these algorithms detects flames or smoke by analyzing one or more fire features in visible light [44].
Color was one of the first features used in VFD and is still by far the most popular [9]. The majority of the color-based approaches in VFD makes use of RGB color space, sometimes in combination with the saturation of HSI (Hue–Saturation–Intensity) color space [10, 33]. The main reason for using RGB is the equality in RGB values of smoke pixels and the easily distinguishable red–yellow range of flames.
Other frequently used fire features are flickering [27, 33] and energy variation [8, 40, 44]. Both focus on the temporal behavior of flames and smoke. Flickering refers to the temporal intermittency with which pixels appear and disappear at the edges of turbulent flames. Energy variation refers to the temporal disorder of pixels in the high-pass components of the discrete wavelet transformed images of the camera.
Fire also has the unique characteristic that it does not remain a steady color, i.e., the flames are composed of several varying colors within a small area. Spatial difference analysis [33, 39] focuses on this feature and analyses the spatial color variations in pixel values to eliminate ordinary fire-colored objects with a solid flame color.
Also an interesting feature for fire detection is the disorder of smoke and flame regions over time. Some examples of frequently used metrics to measure this disorder are randomness of area size [5], boundary roughness [40], and turbulence variance [52]. Although not directly related to fire characteristics, motion is also used in most VFD systems as a feature to simplify and improve the detection process, i.e., to eliminate the disturbance of stationary non-fire objects. In order to detect possible motion, possibly caused by the fire, the moving part in the current video frame is detected by means of a motion segmentation algorithm [8, 39, 40, 52].
Based on the analysis of the discussed state-of-the-art and our own experiments [49], a low-cost visual flame detector is presented in Section 5. This detector is used in conjunction with the TOF amplitude disorder detector to detect flames in outdoor situations and outside the range of the TOF camera.
2.3 Conventional depth acquisition using stereo-vision
The acquisition of high quality real time depth information is a very challenging issue. Traditionally this problem has been tackled by means of stereo vision systems, which exploit the information coming from two or more conventional video cameras. Stereo vision provides a direct way of inferring the depth information by using two images, i.e., the stereo pair, destined for the left and right eye, respectively. When each image of a stereo pair is viewed by its respective eye, the human brain can process subtle differences between the images to yield 3D perception of the scene being viewed [42]. Currently, stereo vision systems are used in a wide range of application domains such as remote machine control, medical imaging, automatic surveillance and multi-modal (depth/color) content-based segmentation [1, 14].
Although stereo vision systems have greatly improved in recent years and have obtained interesting results, they cannot handle all scene situations (~aperture problem). Moreover, the algorithmic complexity of the most advanced stereo-vision systems is quite high, i.e., they are very time-consuming algorithms, not suited for real-time operation. Hence, stereo vision systems do not provide completely satisfactory solutions for the extraction of depth information from generic scenes [12]. This is also confirmed by the quantitative comparison of Beder et al. [2], who found that the TOF system outperformed the stereo system in terms of achievable accuracy for distance measurements. TOF cameras can as such be considered a competing technique for stereo-vision based video surveillance applications. Furthermore, since TOF systems directly yield accurate 3D measurements, they can (contrary to stereo-vision systems) be used in highly dynamic environments, such as large and open spaces like car parks, atria and airports, i.e., our use cases.
3 Time-of-flight imaging
The working principle of TOF imaging is shown in Fig. 2. In order to measure the depth for every pixel in the image, the TOF camera is surrounded by infrared LEDs which illuminate the scene with a frequency modulated IR signal. This signal is reflected by the scene, and the camera measures the time Δt needed by the signal to travel to the scene and back. If the emitter and the receiver are punctual and located at the same place, then Δt allows us to measure the depth of each pixel as d = c Δt / 2, where c is the signal’s speed (c ≃ 3 × 10^8 m/s for light). Simultaneously, the camera also measures the strength of the reflected infrared signal, i.e. its amplitude, which is an indicator of the accuracy of the distance estimation [7].
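As a small worked illustration of this relation, the depth computation can be sketched as follows; the round-trip time used below is a hypothetical example value, not a D-Imager measurement:

```python
# Illustrative sketch of the time-of-flight depth relation d = c * dt / 2.
C = 3.0e8  # propagation speed of the modulated IR signal (speed of light), m/s

def tof_depth(delta_t):
    """Return the depth in meters for a measured round-trip time delta_t (s)."""
    return C * delta_t / 2.0

# A round trip of roughly 26.7 nanoseconds corresponds to a depth of about 4 m,
# i.e. well inside the working range of the camera used in this paper.
print(tof_depth(26.7e-9))
```

The factor 2 accounts for the signal travelling to the scene and back.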
As the depth and amplitude information is obtained using the same sensor, the depth map (Fig. 3a) and the amplitude image (Fig. 3b) are registered (Fig. 3c). Compared to other multi-modal detectors, no additional processing is required for correspondence matching, i.e., one of the strengths of the TOF sensor. Other advantages of TOF imaging are:
-
Not sensitive to light changes/shadows: the TOF camera uses its own (invisible) light source, which greatly simplifies moving object detection.
-
Minimal amount of post-processing, giving application-processing more time for real time detection.
-
The depth map, of which the information represents the physical properties of object location and shape, can help in dividing the objects during occlusion or partial overlapping [13].
-
Low price compared to other IR-based sensors.
In general, one can conclude that time-of-flight data compensates for the disadvantages and weaknesses, e.g. noise and other problematic artifacts, present in other data [4]. However, time-of-flight imaging also has its disadvantages, i.e., its advantages come at a price. The disadvantages associated with TOF imaging cameras are:
-
Low spatial resolution: The average commercially available TOF camera has a QCIF resolution (176 ×144 pixels), which is rather low. However, as with traditional imaging technology, resolution is increasing steadily with each new model offering higher resolution as the technology matures.
-
Measurement artifacts: Objects outside the camera’s working range (closer than 1.2 m for the type of camera used in our experiments) can be poorly illuminated, leading to low-quality range/depth measurements. This is also illustrated by the experiments shown in Fig. 4a; Significant motion can also corrupt range/amplitude data, because the scene may change during consecutive range acquisitions. In order to cope with these problems, the discrete wavelet transform (DWT) detail images are also investigated to distinguish flames from other ‘fast’ moving objects; The sensor also has a limited ‘non-ambiguity range’ before the signals get out of phase. In small rooms this is no problem, but in large rooms it can raise problems, as shown by the experiment in Fig. 4b. For this reason, the outdoor detector, which can also be used for detection outside the range of the TOF camera, uses a visual flame detector instead of the ’unreliable’ depth maps.
-
Need for active illumination: This increases power consumption and physical size, complicates thermal dissipation, and, perhaps most importantly, limits the useful operating distance of the cameras. However, as the proposed detectors mainly focus on the IR emitted by the flames themselves, this active illumination can (probably) be switched off.
4 Time-of-flight based flame detection (indoor, distance < 10 m)
A general scheme of the indoor TOF based flame detector is shown in Fig. 5. The proposed algorithm consists of three stages. The first two stages, i.e., the fast changing depth detection and the amplitude disorder detection, are processed simultaneously. The last stage, i.e., the region overlap detection, investigates the overlap between the resulting candidate flame regions of the prior stages. If there is an overlap, a fire alarm is given. Because the proposed flame detection algorithm requires reliable depth maps, its detection distance is limited to the range of the TOF camera, i.e., less than ten meters.
4.1 Fast changing depth detection
The fast changing depth detection starts with calculating the accumulated frame difference \(AFD^{\rm depth}_n\) (1) between the current depth frame \(F^{\rm depth}_n\) and the previous and next depth frames, i.e., \(F^{\rm depth}_{n-1}\) and \(F^{\rm depth}_{n+1}\) respectively. By rounding the absolute frame differences, the \(AFD^{\rm depth}_n\) is able to distinguish fast changing flames from more slowly moving ordinary objects. Pixels whose \(AFD^{\rm depth}_n\) is greater than zero get a label 1 in the candidate flames image \(Flames^{\rm depth}_n\) (2). Other pixels get a label 0.
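This accumulation and labeling step can be sketched as follows; the code is a plausible reading of (1) and (2), assuming the depth frames are NumPy arrays in meters and that rounding the absolute differences is what suppresses slowly moving objects:

```python
import numpy as np

def depth_candidates(f_prev, f_cur, f_next):
    """Label pixels whose accumulated (rounded) depth difference is non-zero."""
    afd = np.round(np.abs(f_cur - f_prev)) + np.round(np.abs(f_cur - f_next))
    return (afd > 0).astype(np.uint8)  # 1 = candidate flame pixel

# A pixel whose depth jumps by several meters between frames is labeled,
# while a pixel that shifts by a few centimeters is rounded away.
fast = depth_candidates(np.array([[2.0]]), np.array([[5.3]]), np.array([[1.1]]))
slow = depth_candidates(np.array([[2.0]]), np.array([[2.02]]), np.array([[2.01]]))
print(fast, slow)
```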
Next, a morphological closing with a 3 ×3 structuring element connects neighboring candidate flame pixels, i.e. pixels with a label 1 in \(Flames^{\rm depth}_n\). Subsequently, a morphological opening filters out isolated candidate flame pixels using the same structuring element. The resulting connected flame pixel group(s) of \(Flames^{\rm depth}_n\) form the depth candidate flame region(s). An example of the fast changing depth detection is shown in Fig. 6.
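The closing/opening clean-up can be sketched with SciPy's binary morphology routines; the 3 × 3 structuring element matches the text, while the toy candidate image is an invented example:

```python
import numpy as np
from scipy import ndimage

SE = np.ones((3, 3), dtype=bool)  # 3x3 structuring element from the text

def clean_candidates(flames):
    """Connect neighboring candidate pixels, then drop isolated ones."""
    closed = ndimage.binary_closing(flames.astype(bool), structure=SE)
    opened = ndimage.binary_opening(closed, structure=SE)
    return opened.astype(np.uint8)

img = np.zeros((8, 8), dtype=np.uint8)
img[0, 0] = 1        # isolated noise pixel: removed by the filtering
img[3:6, 3:6] = 1    # compact candidate region: preserved
cleaned = clean_candidates(img)
print(cleaned)
```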
Important to mention is that, as an alternative to the conventional morphological operators, we have also evaluated the added value of more advanced morphological alternatives, such as the opening by reconstruction proposed by Doulamis et al. [15]. According to the authors, this more advanced operator better retains the contours of image objects. In the experimental results (Section 6), the performance of this more advanced technique is compared to the conventional morphological operations and the results of tests on realistic fire/non-fire video sequences are discussed.
4.2 Amplitude disorder detection
The amplitude disorder detection starts with a similar accumulated frame differencing (3) as the one which was used for the fast changing depth detection. However, as high \(AFD^{\rm amp}_n\) frame differences also occur at the boundary pixels of ordinary moving objects which are close to the TOF sensor, this feature alone is not enough for accurate flame detection.
In order to distinguish flame pixels from the boundary pixels of ordinary ‘close’ moving objects, the discrete wavelet transform (DWT) [26] of the amplitude image is also investigated. Experiments (Fig. 7) revealed that flame regions are uniquely characterized by high values in the horizontal H, vertical V and diagonal D detail images of the DWT. Ordinary ‘close’ moving objects do not have this characteristic. For this reason, an \(AFD^{\rm amp}_n\) region R with high accumulated amplitude differences is only labeled as a candidate flame region, i.e. gets a value of 1 in \(Flames^{\rm amp}_n\) (5), if it contains high H, V and D values in the DWT detail images, i.e. \(DWT^{\rm detail}_R = 1\) (4).
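This combined test can be sketched with a one-level Haar transform, one common choice of DWT; the subband sign conventions, the uniform-noise stand-in for flame disorder, and the threshold are illustrative assumptions rather than the paper's exact settings:

```python
import numpy as np

def haar_details(img):
    """One-level 2D Haar DWT detail subbands of an even-sized image."""
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 0::2]; d = img[1::2, 1::2]
    H = (a + b - c - d) / 4.0  # horizontal detail
    V = (a - b + c - d) / 4.0  # vertical detail
    D = (a - b - c + d) / 4.0  # diagonal detail
    return H, V, D

def amplitude_disorder(region, thresh=0.5):
    """True only if all three detail subbands contain high values."""
    return all(np.abs(sub).max() > thresh for sub in haar_details(region))

smooth = np.full((8, 8), 5.0)                            # ordinary 'close' object
noisy = np.random.default_rng(0).uniform(0, 10, (8, 8))  # flame-like disorder
print(amplitude_disorder(smooth), amplitude_disorder(noisy))
```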
Analogously as in the fast changing depth detection, the morphological filtering connects neighboring candidate flame pixels in \(Flames^{\rm amp}_n\) and filters out isolated candidate flame pixels. The resulting connected flame pixel group(s) of \(Flames^{\rm amp}_n\) (Fig. 8) form the amplitude candidate flame region(s).
4.3 Region overlap detection
This last stage investigates the overlap between the depth and the amplitude candidate flame region(s), i.e. \(Flames^{\rm depth}_n\) and \(Flames^{\rm amp}_n\) respectively. Important to mention is that, in order to do this, the depth map and the amplitude image need to be registered. However, as they are both obtained using the same sensor, both TOF outputs are already aligned with each other. In order to detect the overlap, it is sufficient to perform a logical AND operation between \(Flames^{\rm amp}_n\) and \(Flames^{\rm depth}_n\). If the resulting binary image contains one or more ’common’ pixels, i.e. pixels with a value of 1, a fire alarm is given. In Fig. 9, an example of this region overlap detection is shown.
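Since both candidate maps live on the same pixel grid, the overlap stage reduces to a pixel-wise AND; a minimal sketch with invented candidate maps:

```python
import numpy as np

def fire_alarm(flames_depth, flames_amp):
    """Raise the alarm if the depth and amplitude candidates share any pixel."""
    return bool(np.logical_and(flames_depth, flames_amp).any())

depth_c = np.array([[0, 1, 1],
                    [0, 1, 0],
                    [0, 0, 0]])
amp_c = np.array([[0, 0, 1],
                  [0, 0, 0],
                  [1, 0, 0]])
print(fire_alarm(depth_c, amp_c))  # the two regions share pixel (0, 2)
```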
5 Visual-TOF flame detection (outdoor, distance ≥ 10 m)
In an outdoor environment or over longer distances, i.e., out of the range of the TOF camera, the depth information of the TOF sensor becomes unreliable. For this reason, the indoor TOF-based flame detector, which is introduced in the previous section, cannot be used under these circumstances. One could think of only using the TOF amplitude information, however, relying on this feature alone can cause mis-detections, as high values in the H, V, and D detail images of the DWT amplitude images can also occur due to amplitude measurement artifacts. As such, we propose to use a visual detector in addition to the TOF amplitude disorder detection. The proposed visual flame detector is based on a set of low-cost visual flame features which have proven useful in distinguishing fire from ordinary moving objects [48].
A general scheme of the ’outdoor’ visual-TOF based flame detector is shown in Fig. 10. The proposed algorithm consists of three stages and is similar to the previously described ‘indoor’ TOF based flame detection algorithm. The first two stages, i.e., the low-cost visual flame detection and the amplitude disorder detection, are processed simultaneously. The last stage, i.e., the region overlap detection, investigates the overlap between the resulting candidate flame regions of the prior stages. If there is an overlap between these flame regions, a fire alarm is given.
Important to mention is that, in order to perform the region overlap detection, the visual RGB image and the amplitude image need to be registered. Some types of TOF cameras, e.g. the OptriCam [30], already offer both TOF sensing and RGB capabilities, and their visual and TOF images are already registered. The majority of TOF cameras, however, still lack RGB capabilities. As such, visual-TOF registration, i.e. the calculation of the visual-TOF transformation parameters, is necessary.
5.1 Low-cost visual flame detector
The low-cost visual flame detector (Fig. 11) starts with a dynamic background subtraction [39, 44], which extracts moving objects by subtracting the video frames with everything in the scene that remains constant over time, i.e. the estimated background. To avoid unnecessary computational work and to decrease the false alarms caused by noisy objects, a morphological opening, which filters out the noise, is performed after the dynamic background subtraction. Each of the remaining foreground objects is further analyzed using a set of visual flame features.
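The background subtraction step can be sketched with a simple running-average background model; the learning rate and threshold below are hypothetical tuning values, not the ones from [39, 44]:

```python
import numpy as np

class BackgroundSubtractor:
    """Running-average background model with absolute-difference thresholding."""

    def __init__(self, first_frame, alpha=0.05, thresh=25.0):
        self.bg = first_frame.astype(float)  # estimated background
        self.alpha = alpha                   # background learning rate
        self.thresh = thresh                 # foreground decision threshold

    def apply(self, frame):
        frame = frame.astype(float)
        fg = np.abs(frame - self.bg) > self.thresh                 # moving foreground
        self.bg = (1 - self.alpha) * self.bg + self.alpha * frame  # update model
        return fg.astype(np.uint8)

bs = BackgroundSubtractor(np.zeros((2, 2)))
static = bs.apply(np.zeros((2, 2)))        # unchanged scene: no foreground
moving = bs.apply(np.full((2, 2), 100.0))  # sudden change: all foreground
print(static.sum(), moving.sum())
```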
In case of a fire object, the selected features, i.e. spatial flame color disorder, principal orientation disorder and bounding box disorder, vary considerably over time. Due to this high degree of disorder, extrema analysis is chosen as a technique to easily distinguish between flames and other objects. It is related to the number of local maxima and minima in the set of data points. For more detailed information the reader is referred to [49].
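The extrema analysis can be sketched by counting strict local maxima and minima in a per-object feature time series; the two series below are invented examples of a disordered flame feature and a smoothly growing object feature:

```python
def count_extrema(series):
    """Count strict local maxima and minima in a 1D feature series."""
    n = 0
    for i in range(1, len(series) - 1):
        is_max = series[i] > series[i - 1] and series[i] > series[i + 1]
        is_min = series[i] < series[i - 1] and series[i] < series[i + 1]
        if is_max or is_min:
            n += 1
    return n

flame_area = [10, 14, 9, 15, 8, 13, 7, 12]      # disordered: many extrema
person_area = [10, 10, 11, 11, 12, 12, 13, 13]  # smooth: no extrema
print(count_extrema(flame_area), count_extrema(person_area))  # 6 0
```

A high extrema count over a sliding window would then mark the object as flame-like.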
5.2 Multi-sensor image registration
In order to combine the information in our multi-sensor visual-TOF setup, the corresponding objects in the scene need to be aligned, i.e. registered. The goal of registration is to establish geometric correspondence between the multi-sensor images so that they may be transformed, compared, and analyzed in a common reference frame [36]. Because corresponding objects in the visual and TOF amplitude image may have different sizes, shapes, features, positions and intensities, as is shown in Fig. 12, the fundamental question to address during registration is: what is a good image representation to work with, i.e. what representation will bring out the common information between the two multi-sensor images, while suppressing the non common information between those images [23]?
When choosing an appropriate registration method, a first distinction can be made between automatic and manual registration. In applications with manual registration, e.g. using a calibration checkerboard [24], a set of corresponding points is manually selected from the two images in order to compute the parameters of the transformation, and the registration performance is evaluated by subjectively comparing the registered images. This is repeated several times until the registration performance is satisfactory, i.e., the registration criterion is met. If the background changes, e.g. due to camera movement, the entire procedure needs to be repeated. Because this manual process is labor intensive, automatic registration is more desirable. Therefore, we adopt the latter in our system.
A second distinction for an appropriate (automatic) registration method, is between region, line and point feature-based methods [54]. It is necessary to use features that are stable with respect to the sensors, i.e. the same physical artifact produces features in both images. Compared to the correspondence of individual points and lines, region-based methods, such as silhouette mapping, provide more reliable correspondence between color and TOF amplitude image pairs [11].
For example, comparing the visual and TOF images in Fig. 12, one can see that some information varies a lot, but what is most similar are the silhouettes. Therefore, the proposed image registration method performs a match of the transformed color silhouette of the calibration object, i.e. a moving person, to its amplitude silhouette. The mutual information, i.e. the silhouette coverage, is assumed to reach its maximal value when both images are registered. However, knowing that the same silhouettes extracted from TOF and visual images can still have different details (as shown in the experiments), a complete exact match is (quasi) impossible. It is also important to mention that, instead of using a person as the calibration object, also other (moving) objects in the scene can be used.
The proposed silhouette contour based image registration algorithm, shown in Fig. 13, coarsely registers the images taken simultaneously from the parallel TOF and visual sensors, whose lines of sight are close to each other. The registration starts with a moving object silhouette extraction [11] in both the visual and the TOF image to separate the calibration objects, i.e. the moving foreground, from the background, which is assumed to be static. Key components of the moving object silhouette extraction are the dynamic background subtraction, automatic thresholding and morphological filtering with growing structuring elements, which grow iteratively until a resulting silhouette is suitable for visual-TOF silhouette matching. After silhouette extraction, 1D contour vectors are generated from the resulting TOF and visual silhouettes using silhouette boundary extraction, a Cartesian to polar transform and radial vector analysis. Next, in order to retrieve the rotation angle and the scale factor between the TOF amplitude and visual image, these contours are mapped onto each other using circular cross correlation [19] and contour scaling. Finally, the translation between the two images is calculated by maximization of binary correlation. The retrieved transformation parameters are used in the region overlap detection to align the visual image with the TOF amplitude image.
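The rotation-recovery step of this pipeline can be sketched as follows, assuming the 1D contour vectors are radial distance profiles sampled at one bin per degree (an illustrative choice; scale and translation recovery are omitted, and the synthetic contour is an invented example):

```python
import numpy as np

def rotation_angle(contour_a, contour_b):
    """Rotation (in one-degree bins) that best maps contour_a onto contour_b,
    found as the peak of the circular cross correlation."""
    a = contour_a - contour_a.mean()
    b = contour_b - contour_b.mean()
    # circular cross correlation via explicit shifts (an FFT would also work)
    scores = [np.dot(np.roll(a, k), b) for k in range(len(a))]
    return int(np.argmax(scores))

theta = np.linspace(0, 2 * np.pi, 360, endpoint=False)
contour = 5.0 + np.sin(theta)   # synthetic silhouette radius profile
rotated = np.roll(contour, 40)  # the same silhouette rotated by 40 degrees
print(rotation_angle(contour, rotated))  # 40
```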
For a more detailed description of this silhouette based registration method, the reader is referred to [46, 47], in which the same registration process is used for LWIR-visual image registration. The experiments in Fig. 14 (and the referred work) show that the proposed method automatically finds the correspondence between silhouettes from synchronous multi-sensor images.
6 Experimental results
The TOF camera used in this work is the Panasonic D-Imager [32]. The D-imager is one of the leading commercial products of its kind. Other appropriate TOF cameras are the CanestaVision from Canesta, the SwissRanger from Mesa Imaging, the PMD[vision] CamCube and the Optricam from Optrima [30]. The technical specifications of the D-Imager are shown in Fig. 15. The image processing code was written in MATLAB, and is sufficiently simple to operate in real-time on a standard desktop or portable personal computer.
6.1 Indoor experiments
To illustrate the potential use of the proposed indoor TOF based flame detector, several realistic fire and non-fire indoor experiments were performed. An example of these experiments, i.e., the paper fire test, is shown in Fig. 16. As can be seen in the depth maps, the measured depth of flames changes very fast. Even between two consecutive frames, very high depth differences are noticeable. In the amplitude images, on the other hand, it can also be seen that the boundaries of the flames have a very high amplitude. Simultaneously to the TOF recording with the Panasonic D-Imager, we also recorded the experiments with an ordinary video camera. As such, the TOF detection results can be compared to state-of-the-art VFD methods.
In order to objectively evaluate the detection results of the proposed algorithm, the detection rate metric (4) is used. This metric is comparable to the evaluation methods used by Celik et al. [9] and Toreyin et al. [40]. The detection rate equals the ratio of the number of correctly detected fire frames, i.e. the detected fire frames minus the number of falsely detected frames, to the number of frames with fire in the manually created ground truth (GT).
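As a sketch, this metric reduces to the following computation; the frame counts are invented example numbers, not results from Table 1:

```python
def detection_rate(detected_fire_frames, false_positive_frames, gt_fire_frames):
    """Correctly detected fire frames divided by the ground-truth fire frames."""
    return (detected_fire_frames - false_positive_frames) / gt_fire_frames

# e.g. 290 detected frames with 5 false positives against 300 GT fire frames
print(detection_rate(290, 5, 300))  # 0.95
```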
The results in Table 1 show that robust fire detection can be obtained with relatively simple TOF image processing. Compared to the VFD detection results, i.e., an average detection rate of 93% and an average false positive rate of 2%, the proposed TOF-based flame detector, with its 96% detection rate and no false positive detections, performs better for these primary experiments. The indoor detector, however, is not able to detect fire in outdoor situations or outside the range of the TOF camera. The main reason for this failure is that its depth maps become unreliable under these circumstances. In order to cope with this problem, an outdoor visual-TOF flame detector was introduced in Section 5. The following subsection discusses its performance.
It is important to mention that, as an alternative to the conventional morphological operators, we have also evaluated the added value of the more advanced morphological technique proposed by Doulamis et al. [15]. As can be seen in the last column of Table 1, the results achieved with the proposed conventional morphological operators are comparable to, i.e., differ only marginally from, those of the advanced opening by reconstruction (shown between brackets). As such, due to their lower computational complexity, the conventional morphological operators still seem a good choice.
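The difference between the two post-processing variants compared in Table 1 can be illustrated with a small sketch. This is not the authors' code; the toy mask and the 3×3 structuring element are assumptions. A conventional opening (erosion then dilation) removes small noise but can also clip thin parts of a blob, whereas opening by reconstruction (erosion followed by geodesic dilation inside the original mask) restores the exact shape of every blob that survives the erosion.

```python
import numpy as np
from scipy import ndimage

# Toy candidate-region mask: a 5x5 blob with a thin one-pixel protrusion,
# plus an isolated noise pixel in the corner.
mask = np.zeros((9, 9), dtype=bool)
mask[2:7, 2:7] = True       # candidate flame blob
mask[4, 7] = True           # thin protrusion attached to the blob
mask[0, 0] = True           # isolated noise pixel

structure = np.ones((3, 3), dtype=bool)

# Conventional opening: removes the noise pixel, but also the protrusion.
opened = ndimage.binary_opening(mask, structure=structure)

# Opening by reconstruction: erode to a seed, then propagate (geodesic
# dilation) within the original mask; the surviving blob is restored
# exactly, protrusion included, while the noise pixel stays removed.
seed = ndimage.binary_erosion(mask, structure=structure)
reconstructed = ndimage.binary_propagation(seed, structure=structure, mask=mask)
```

Both variants suppress the isolated pixel; they differ only in how faithfully the surviving blob's shape is preserved, which is why the detection rates in Table 1 are so close.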
6.2 Outdoor experiments > 10 m
Analogously to the evaluation of the indoor detector, several realistic fire and non-fire experiments were performed to illustrate the potential use of the outdoor visual-TOF flame detector. An example of these experiments, i.e., the Christmas tree fire, is shown in Fig. 17. In order to test the detection range of the proposed multi-sensor detector, the distance between the sensors and the fire/moving objects was also varied during these experiments.
As the results in Table 2 show, robust flame detection can be obtained with the proposed multi-sensor visual-TOF image processing. Compared to the VFD detection results, i.e., an average detection rate of 88% and an average false positive rate of 4%, the outdoor detector, with its average detection rate of 92% and no false positive detections, performs better.
By further inspecting the tests in Table 2, one can also see that increasing the distance between the cameras and the fire source does not significantly influence the detection results. For example, the detection rate of the outdoor wood fire test at 22 m is around 89%, which is almost as good as the 91% of the straw fire test at 7 m. Similar to the indoor detector results, the use of more advanced morphological techniques also did not have a significant effect on the flame detection rate.
The results also show that, compared to the indoor detector, the average detection rate of the outdoor detector is a little lower. This can mainly be attributed to the fact that the resolution of the TOF camera is, at present, too low to detect small objects over long distances. Very small flames (e.g., in the beginning of the fire) are, as such, not detected.
7 Conclusions
Two novel time-of-flight based fire detection methods for indoor and outdoor fire detection are proposed in this paper. The indoor detector focuses on the most appropriate TOF flame features, i.e., a fast changing depth and a high amplitude disorder, a combination that is unique to flames. The fast changing depth is detected by accumulated frame differencing over three consecutive depth frames. Regions which have multiple pixels with a high accumulated depth difference are labeled as candidate flame regions. Simultaneously, regions with high accumulative amplitude differences and high values in all detail images of the discrete wavelet transform of the amplitude image are also detected. These regions are labeled as the candidate flame regions of the amplitude images. If the resulting candidate flame regions of the fast changing depth detection and the amplitude disorder detection overlap, a fire alarm is given. Experiments show that the proposed indoor detector yields an average flame detection rate of 96% with no false positive detections.
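The fast-changing-depth cue summarized above can be sketched as follows. This is an illustrative Python fragment, not the paper's MATLAB implementation; the synthetic depth values and the threshold are assumptions, not the paper's parameters.

```python
import numpy as np

def accumulated_depth_difference(frames):
    """Accumulate absolute depth differences between consecutive frames,
    as in the accumulated frame differencing described above."""
    acc = np.zeros_like(frames[0], dtype=float)
    for prev, cur in zip(frames[:-1], frames[1:]):
        acc += np.abs(cur.astype(float) - prev.astype(float))
    return acc

# Three consecutive synthetic depth frames (meters): a static wall at 3 m
# with one pixel whose measured depth flickers, as flames do.
d0 = np.full((4, 4), 3.0)
d1 = d0.copy(); d1[1, 1] = 1.2      # flame pixel jumps closer
d2 = d0.copy(); d2[1, 1] = 2.6      # and jumps again in the next frame

acc = accumulated_depth_difference([d0, d1, d2])
candidates = acc > 1.0               # assumed threshold on accumulated depth
```

Static scene pixels accumulate (nearly) zero difference, so only the flickering pixel survives the threshold; grouping such pixels into regions yields the depth candidate flame regions.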
The outdoor detector, on the other hand, only differs from the indoor detector in one of its multi-modal inputs. As depth maps are unreliable in outdoor environments and outside the range of the camera, the outdoor detector uses a visual flame detector instead of the fast changing depth detection. Outdoor experiments show that this multi-sensor detector has an average flame detection rate of 92% and also has no false positive detections.
Both of the proposed detectors can be essential tools for future environmental crisis management systems. An example of such a system is described in the work of Vescoukis et al. [50]. Currently, the proposed work is also being evaluated within test cases of a Belgian and a European project, i.e., the car park fire safety project [29] and the Firesense project [16]. Both projects have objectives similar to the work of Vescoukis et al., i.e., they aim to deliver a system in which several sensors are monitored simultaneously and combined/fused with context information about the environment. Preliminary test results within those projects show the practical applicability, i.e., the valorization potential, of the proposed time-of-flight work. As the proposed techniques can easily be translated to other domains, e.g., to detect dynamic regions within the context of crowd analysis [53], our work on multi-modal TOF based video analysis will only increase in value in the coming years.
Finally, it is important to remark that the proposed algorithms are also able to detect multiple fire seeds within the same ‘view’, i.e., a requirement for many applications. As the proposed algorithms work on a ‘blob’ level (~flame regions), they have no difficulty detecting multiple fireplaces. Furthermore, as the multi-modal images are registered, a candidate flame region at position (x, y) in one image will not be able to influence a candidate flame region at a ‘non-overlapping’ position (x′, y′) in the other multi-modal image.
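The blob-level fusion described above reduces to a pixel-wise intersection test on the registered candidate masks. The following is a minimal sketch under that assumption; the mask shapes and the function name are illustrative, not the paper's code.

```python
import numpy as np

def fire_alarm(depth_candidates, amplitude_candidates):
    """Raise the alarm only if a depth candidate region and an amplitude
    candidate region overlap in the registered image coordinates."""
    return bool(np.logical_and(depth_candidates, amplitude_candidates).any())

# Registered candidate masks: the depth blob overlaps the first amplitude
# blob but is disjoint from the second one.
depth_mask = np.zeros((6, 6), dtype=bool); depth_mask[1:3, 1:3] = True
amp_overlap = np.zeros((6, 6), dtype=bool); amp_overlap[2:4, 2:4] = True
amp_disjoint = np.zeros((6, 6), dtype=bool); amp_disjoint[4:6, 4:6] = True
```

Because the test is purely local, a candidate blob at (x, y) in one modality cannot trigger an alarm together with a non-overlapping blob at (x′, y′) in the other, and any number of fire seeds can be handled independently.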
References
Alatan A, Onural L, Wollborn M, Mech R, Tuncel E, Sikora T (1998) Image sequence analysis for emerging interactive multimedia services - the European cost 211 framework. IEEE Trans Circuits Syst Video Technol 8(7):802–813
Beder C, Bartczak B, Koch R (2007) A comparison of PMD-cameras and stereo-vision for the task of surface reconstruction using patchlets. In: IEEE conference on computer vision and pattern recognition (CVPR), pp 1–8
Bevilacqua A, Stefano LD, Azzari P (2006) People tracking using a time-of-flight depth sensor. In: IEEE int. conf. on video and signal based surveillance, pp 89–95
Bleiweiss A, Werman M (2009) Fusing time-of-flight depth and color for real-time segmentation and tracking. In: DAGM workshop on dynamic 3D imaging, pp 58–69
Borges PVK, Mayer J, Izquierdo E (2008) Efficient visual fire detection applied for video retrieval. In: European signal processing conference
Bosch I, Gomez S, Molina R, Miralles R (2009) Object discrimination by infrared image processing. In: International work-conference on the interplay between natural and artificial computation (IWINAC), pp 30–40
Breuer P, Eckes C, Muller S (2007) Hand gesture recognition with a novel ir time-of-flight range camera - a pilot study. In: 3rd int. conf. on computer vision/computer graphics collaboration techniques, pp 247–260
Calderara S, Piccinini P, Cucchiara R (2008) Smoke detection in video surveillance: a MoG model in the wavelet domain. In: International conference on computer vision systems, pp 119–128
Celik T, Demirel H (2008) Fire detection in video sequences using a generic color model. Fire Saf J 44(2):147–158
Chen T-H, Wu P-H, Chiou Y-C (2004) An early fire-detection method based on image processing. In: International conference on image processing, pp 1707–1710
Chen H-M, Lee S, Rao RM, Slamani M-A, Varshney PK (2005) Imaging for concealed weapon detection. IEEE Signal Process Mag 22:52–61
Dal Mutto C, Zanuttigh P, Cortelazzo GM (2010) Accurate 3D reconstruction by stereo and ToF data fusion. In: Proceedings of gruppo telecomunicazioni e tecnologie dell’informazione (GTTI) meeting
Dorrington A, Kelly C, McClure S, Payne A, Cree M (2009) Advantages of 3D time-of-flight range imaging cameras in machine vision applications. In: 16th electronics New Zealand conference (ENZCon), pp 95–99
Doulamis A, Doulamis N, Ntalianis K, Kollias S (2000) Efficient unsupervised content-based segmentation in stereoscopic video sequence. Int J Artif Intell Tools 9(2):277–303
Doulamis A, Doulamis N, Maragos P (2001) Generalized multiscale connected operators with applications to granulometric image analysis. In: International conference on image processing, vol 3, pp 684–687
FIRESENSE project (2011) Protection of cultural heritage. http://www.firesense.eu/
Grassi A, Frolov V, Leon FP (2010) Information fusion to detect and classify pedestrians using invariant features. In: Information fusion, pp 1–9
Gunay O, Tasdemir K, Toreyin BU, Cetin AE (2009) Video based wildfire detection at night. Fire Saf J 44:860–868
Hamici Z (2006) Real-time pattern recognition using circular cross-correlation: a robot vision system. Int J Rob Res 21(3):174–183
Han J, Bhanu B (2007) Fusion of color and infrared video for moving human detection. Pattern Recogn 40:1771–1784
Hansen DW, Larsen R, Lauze F (2007) Improving face detection with TOF cameras. In: International symposium on signals, circuits & systems, pp 225–228
Hugli H, Zamofing T (2007) Pedestrian detection by range imaging. In: Conference on computer vision theory and applications, pp 18–22
Irani M, Anandan P (1998) Robust multi-sensor image alignment. In: IEEE international conference on computer vision, pp 959–966
Krotosky SJ, Trivedi MM (2007) Mutual information based registration of multimodal stereo videos for person tracking. Comput Vis Image Underst 106:270–287
Leone A, Diraco G, Distante C, Siciliano P, Malfatti M, Gonzo L, Grassi M, Lombardi A, Rescio G, Malcovati P, Libal V, Huang J, Potamianos G (2008) A multi-sensor approach for people fall detection in home environment. In: Workshop on multi-camera and multi-modal sensor fusion algorithms and applications, pp 1–12
Mallat SG (1989) A theory for multiresolution signal decomposition: the wavelet representation. IEEE Trans Pattern Anal Mach Intell 11(7):674–693
Marbach G, Loepfe M, Brupbacher T (2006) An image processing technique for fire detection in video images. Fire Saf J 41:285–289
Meers S, Ward K (2008) Head-pose tracking with a timeof-flight camera. In: Australian conference on robotics and automation, pp 1–7
Merci B (2011) Fire safety and explosion safety in car parks. http://www.carparkfiresafety.be/
Optrima (2010) 3D time-of-flight camera systems. http://www.optrima.com/
Owrutsky JC, Steinhurst DA, Minor CP, Rose-Pehrsson SL, Williams FW, Gottuk DT (2006) Long wavelength video detection of fire in ship compartments. Fire Saf J 41:315–320
Panasonic 3D image sensor. http://panasonic-electric-works.net/D-IMager/
Qi X, Ebert J (2009) A computer vision based method for fire detection in color videos. Int J Imaging 2:22–34
Sabeti L, Parvizi E, Wu QMJ (2008) Visual tracking using color cameras and time-of-flight range imaging sensors. J Multimed 3:28–36
Schamm T, Strand M, Gumpp T, Kohlhaas R, Zollner JM, Dillmann R (2009) Vision and ToF-based driving assistance for a personal transporter. In: International conference on advanced robotics, pp 1–6
Shah M, Kumar R (2003) Video registration. Kluwer Academic Publishers, Dordrecht
Tanner R, Studer M, Zanoli A, Hartmann A (2008) People detection and tracking with TOF sensor. In: 5th int. conf. on advanced video and signal based surveillance, pp 356–361
Tombari F, Di Stefano L, Mattoccia S, Zanetti A (2008) Graffiti detection using a time-of-flight camera. In: 10th int. conf. on advanced concepts for intelligent vision systems, pp 645–654
Toreyin BU, Dedeoglu Y, Gudukbay U, Cetin AE (2005) Computer vision based method for real-time fire and flame detection. Pattern Recogn Lett 27:49–58
Toreyin BU, Dedeoglu Y, Cetin AE (2006) Contour based smoke detection in video using wavelets. In: European signal processing conference
Toreyin BU, Cinbis RG, Dedeoglu Y, Cetin AE (2007) Fire detection in infrared video using wavelet analysis. SPIE Opt Eng 46:1–9
Triantafyllidis GA, Tzovaras D, Strintzis MG (2000) Occlusion and visible background and foreground areas in stereo: a Bayesian approach. IEEE Trans Circuits Syst Video Technol 10(4):563–576 (Special Issue on 3D Video Technology)
Vacek S, Schamm T, Schroder J, Dillmann R (2007) Collision avoidance for cognitive automobiles using a 3D PMD camera. In: 6th IFAC symposium on intelligent autonomous vehicles, pp 1–6
Verstockt S, Lambert P, Van de Walle R, Merci B, Sette B (2009) State of the art in vision-based fire and smoke detection. In: 14th int. conf. on automatic fire detection, vol 2, pp 285–292
Verstockt S, Dekeerschieter R, Vanoosthuyse A, Merci B, Sette B, Lambert P, Van de Walle R (2010) Video fire detection using non-visible light. In: 6th international seminar on fire and explosion hazards
Verstockt S, Poppe C, De Potter P, Van Hoecke S, Hollemeersch C, Lambert P, Van de Walle R (2010) Silhouette coverage analysis for multi-modal video surveillance. In: 29th progress in electromagnetics research symposium (PIERS), pp 1–5
Verstockt S, Poppe C, Van Hoecke S, Hollemeersch C, Merci B, Sette B, Lambert P, Van de Walle R (2011) Silhouette-based multi-sensor smoke detection: coverage analysis of moving object silhouettes in thermal and visual registered images. Mach Vis Appl. doi:10.1007/s00138-011-0359-3
Verstockt S, Van Hoecke S, Tilley N, Merci B, Sette B, Lambert P, Hollemeersch C, Van de Walle R (2011) FireCube: a multi-view localization framework for 3D fire analysis. Fire Saf J 46(5):262–275
Verstockt S, Vanoosthuyse A, Van Hoecke S, Lambert P, Van de Walle R (2010) Multi-sensor fire detection by fusing visual and non-visual flame features. In: 4th international conference on image and signal processing, pp 333–341
Vescoukis V, Doulamis N, Karagiorgou S (2012) A service oriented architecture for decision support systems in environmental crisis management. Future Gener Comput Syst 28(3):593–604
Wilson AD (2010) Using a depth camera as a touch sensor. In: ACM international conference on interactive tabletops and surfaces, pp 69–72
Xiong Z, Caballero R, Wang H, Finn AM, Lelic MA, Peng P-Y (2007) Video-based smoke detection: possibilities, techniques, and challenges. IFPA fire suppression and detection research and applications. A technical working conference
Zhan B, Monekosso D, Remagnino P, Velastin S, Xu L (2008) Crowd analysis: a survey. Mach Vis Appl 19(5):345–357
Zitova B, Flusser J (2003) Image registration methods: a survey. Image Vis Comput 21:977–1000
Acknowledgements
The research activities as described in this paper were funded by Ghent University, the Interdisciplinary Institute for Broadband Technology (IBBT), University College West Flanders (HOWEST), Warringtonfiregent (WFRGent NV), the Institute for the Promotion of Innovation by Science and Technology in Flanders (IWT), the Fund for Scientific Research-Flanders, the Belgian Federal Science Policy Office (BFSPO) and the European Union.
Cite this article
Verstockt, S., Van Hoecke, S., De Potter, P. et al. Multi-modal time-of-flight based fire detection. Multimed Tools Appl 69, 313–338 (2014). https://doi.org/10.1007/s11042-012-0991-6