1 Introduction

Emergency vehicle accidents, particularly those involving ambulances and fire trucks, pose operational and financial challenges. Immediate consequences include vehicle downtime, increased operational costs, and potential legal liabilities. The total number of ambulance crashes, including minor “fender benders,” has been estimated at 6,500 per year [1]. Fire truck crashes occur at a rate of approximately 30,000 per year, having potentially dire consequences for the vehicle occupants and the community if the fire truck was traveling to provide emergency service [2]. Each year, there are approximately 300 fatalities in the U.S. that occur during police pursuits [3]. These crashes often occur at high speeds, at night, and on local roads. The economic impact of vehicle crashes is substantial, with the government paying an estimated $35 billion annually [4].

These accidents lead to issues such as inflated repair costs, delays in vehicle recovery due to inefficient roadside assistance, and prolonged insurance processes that extend vehicle downtime. These challenges strain emergency services financially and impede their ability to provide timely and effective responses.

In this paper, we introduce the Smart Auto Assessment and Roadside Technical Help Interface (SAARTHI), an end-to-end framework that addresses these challenges. SAARTHI leverages artificial intelligence (AI) technologies, including object detection, instance segmentation, and salient object detection, to improve the way emergency vehicle damages are assessed and repairs are managed. The platform assesses damage from user-uploaded images, classifies the damages into six categories (dent, scratch, crack, lamp broken, glass shatter, and tire flat), further distinguishes them as major or minor, and provides immediate repair cost estimates. While current processes for dealing with emergency vehicle damages are generally functional, SAARTHI aims to enhance existing protocols by offering rapid damage assessment tools that support decision-making for fleet managers and maintenance departments.

One of the issues in emergency vehicle operations is the variability and inconsistency in repair estimates for similar damages. Repair costs are often inflated because pricing is not standardized. For example, two similar ambulance accidents might result in vastly different repair quotes from different service providers, causing potential overpayment. A standardized repair cost estimation system can help government agencies plan their budgets for upcoming years by reducing overpayment for repairs.

Another challenge is the efficiency of roadside assistance, which is crucial for keeping emergency vehicles operational. Delays in returning a fire truck to service after an accident can leave the fire department short of essential resources, impacting their ability to respond effectively to emergencies.

To address these issues, we introduce SAARTHI, a comprehensive framework that manages the entire lifecycle of an emergency vehicle from accident to return to service. This end-to-end AI-powered framework can be directly implemented to expedite vehicle damage management. Key features of SAARTHI include:

  • Real-time damage assessment. We implement a ResNet-based [5] Mask R-CNN [6] model using MMDetection [7], together with a U-Net model, to perform real-time damage assessments from user-uploaded images.

  • Detailed reporting and documentation. The system generates detailed reports and documentation of the damage, including images, statistical graphs, estimated repair costs, and other critical details.

  • Repair cost estimation. By integrating Non-Maximum Suppression [8, 9] and considering the base prices of different types of damages along with their impact factors, we provide accurate repair cost estimations.

  • Immediate assistance via chatbot. SAARTHI includes a chatbot feature that provides immediate assistance to users.

The rest of the paper is structured as follows: Sect. 2 covers related work. Section 3 presents our approach and algorithms for damage detection and repair cost estimation. Section 4 details the SAARTHI framework. Section 5 provides experimental results, and Sect. 6 concludes the paper.

2 Background

Advanced AI technologies, such as object detection [10] and instance segmentation [11], offer robust solutions for assessing vehicle damage [12]. Many studies have used Convolutional Neural Networks (CNNs) to detect damages in vehicles [13, 14], and CNNs have shown great promise in accurately identifying various types of vehicle damage. However, one significant challenge in this field has been the accurate detection of overlapping damages. Overlapping damages complicate the task of isolating individual damages, making it difficult to provide precise repair cost estimates.

In the SAARTHI framework, after implementing the initial damage assessment using object detection and instance segmentation, we apply the Non-Maximum Suppression (NMS) technique [15] to eliminate the overlapping detection of damages. NMS is crucial for refining the results of damage detection by ensuring that only the most significant damages are considered. By focusing on the most significant damage, we can more accurately identify the estimated repair cost. Detecting vehicle damage, particularly with irregular shapes and flexible boundaries, poses significant challenges. Scratches and cracks often have similar contours and colors, leading to misclassification [16]. To address this, we use Salient Object Detection (SOD) methods, which refine boundaries and segment objects with irregular shapes. SOD locates all salient objects without classifying them, ensuring accurate assessment of dents, scratches, and cracks by focusing on the location and extent of the damage.

Fig. 1. Network architecture of the Mask R-CNN model with a ResNet-50 backbone, used for damage detection and segmentation in emergency vehicles

3 Methodology

3.1 Damage Detection and Segmentation for Emergency Vehicles

The aim of car damage assessment is to accurately detect, classify, and contour damages on vehicles, aligning with the objectives of instance segmentation and object detection. We implement a Mask R-CNN [6] model with a ResNet-50 [5] backbone using the MMDetection [7, 17] toolbox (Fig. 1). The model is initially pre-trained on the COCO dataset [18] and fine-tuned on the CarDD dataset [12], which consists of approximately 4000 images manually annotated with bounding boxes and masks, to improve performance on vehicle damage assessment.

To enhance the training process, we implement a custom hook that dynamically modifies the data augmentation pipeline, incorporating steps to increase model robustness. Images are loaded with their annotations (bounding boxes and masks) and randomly resized with scales from 0.1 to 2.0. They are then randomly cropped, flipped to handle orientation variations, and padded to 640\(\,\times \,\)640 pixels. This augmentation strategy introduces diverse transformations, improving the model’s generalization.
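The augmentation steps described above can be written as an MMDetection-style training pipeline. The following is a minimal sketch, assuming MMDetection 3.x transform names; the crop settings and the flip probability are illustrative assumptions rather than values reported in this paper, and the custom hook that swaps the pipeline in during training is not shown.

```python
# Sketch of the described augmentation pipeline as an MMDetection 3.x config
# fragment. Only the 0.1-2.0 scale range and the 640x640 target size come
# from the text; every other value is an assumption for illustration.
image_size = (640, 640)

train_pipeline = [
    # Load the image and its annotations (bounding boxes and masks).
    dict(type="LoadImageFromFile"),
    dict(type="LoadAnnotations", with_bbox=True, with_mask=True),
    # Random resize with a scale factor drawn from [0.1, 2.0].
    dict(type="RandomResize", scale=image_size,
         ratio_range=(0.1, 2.0), keep_ratio=True),
    # Random crop back to at most the target size (assumed crop settings).
    dict(type="RandomCrop", crop_type="absolute_range",
         crop_size=image_size, recompute_bbox=True,
         allow_negative_crop=True),
    # Random horizontal flip (probability is an assumption).
    dict(type="RandomFlip", prob=0.5),
    # Pad smaller results up to 640x640.
    dict(type="Pad", size=image_size),
    # Pack everything into the format expected by the detector.
    dict(type="PackDetInputs"),
]
```

In a full configuration, a list like this would be assigned to the training dataset's `pipeline` field.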

3.2 Non-Maximum Suppression for Bounding Box Filtering

To refine our object detection results and prepare data for repair cost estimation, we apply the Non-Maximum Suppression (NMS) [8, 9] algorithm. NMS reduces the number of overlapping bounding boxes by retaining only the most relevant ones, ensuring each detected object is represented by a single bounding box, thus enhancing the accuracy of subsequent cost estimation.

The NMS algorithm computes the Intersection over Union (IoU) between bounding boxes to measure overlap. It selects the box with the highest confidence score and suppresses all other boxes with an IoU greater than a specified threshold. This process is repeated until only the most significant bounding boxes remain (Algorithm 1). Figure 2 shows the NMS results on a sample input.

Algorithm 1. Non-Maximum Suppression (NMS)

For bounding boxes \(\text {boxA}\) and \(\text {boxB}\) defined by corners \((x_1, y_1, x_2, y_2)\):

$$\begin{aligned} \text{Intersection Area} &= \max \left( 0, \min (x_2^A, x_2^B) - \max (x_1^A, x_1^B)\right) \times \max \left( 0, \min (y_2^A, y_2^B) - \max (y_1^A, y_1^B)\right) \\ \text{Area of boxA} &= (x_2^A - x_1^A) \times (y_2^A - y_1^A)\\ \text{Area of boxB} &= (x_2^B - x_1^B) \times (y_2^B - y_1^B) \end{aligned}$$
$$\begin{aligned} \text{IoU} = \frac{\text{Intersection Area}}{\text{Area of boxA} + \text{Area of boxB} - \text{Intersection Area}} \end{aligned} \tag{1}$$

  • Intersection Area is the overlapping area between the two bounding boxes.

  • The denominator is the Area of Union, the total area covered by the two bounding boxes.
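The IoU computation and the greedy suppression loop can be sketched in a few lines of Python. This is an illustrative, class-agnostic version under stated assumptions: the function names `iou` and `nms` and the default threshold of 0.5 are hypothetical, and the source does not state whether suppression is applied per class or across classes.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(box_a: Box, box_b: Box) -> float:
    """Intersection over Union of two corner-format boxes, as in Eq. (1)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Return indices of the boxes kept by greedy NMS, best score first."""
    # Visit candidates in order of decreasing confidence.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Suppress every remaining box that overlaps the kept box too much.
        order = [i for i in order
                 if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # the second box overlaps the first and is dropped
```

In this example the first two boxes have an IoU of 81/119 (about 0.68), so only the higher-scoring one survives, while the distant third box is kept.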

Fig. 2. Non-Maximum Suppression (NMS) results applied on a police vehicle

3.3 Estimating Repair Cost Using Detected Damages

To estimate repair costs for car damages, we employ an algorithm that processes detected bounding boxes, labels, and confidence scores. The methodology involves two primary steps: NMS (Sect. 3.2) and cost calculation.

NMS filters redundant bounding boxes to ensure each damage type is uniquely represented. Subsequently, the repair cost is computed based on predefined base costs for each damage type. This base cost is adjusted by a severity factor, which is derived from the area of the bounding box, the associated confidence score, and a normalization factor that scales the adjustment. The final repair cost is the sum of these adjusted costs. The cost of each individual damage is calculated as:

$$\begin{aligned} \text{Cost} = \text{Base Cost} \times \left( 1 + \frac{\text{Area} \times \text{Score}}{\text{Normalization Factor}}\right) \end{aligned} \tag{2}$$

Because the adjustment grows with both the area and the confidence score of each detection, the estimate reflects the severity of the detected damages.
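As a worked illustration of Eq. (2), the sketch below computes a per-damage cost and sums the results over all detections that survive NMS. The base costs and the normalization factor are hypothetical placeholders chosen only to make the arithmetic visible; the paper does not report the actual values.

```python
from typing import Dict, Iterable, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2) in pixels
Detection = Tuple[str, Box, float]        # (label, box, confidence score)

# Hypothetical base costs per damage type (not taken from the paper).
BASE_COSTS: Dict[str, float] = {
    "dent": 200.0, "scratch": 100.0, "crack": 150.0,
    "lamp broken": 250.0, "glass shatter": 400.0, "tire flat": 120.0,
}
# Hypothetical normalization factor (not taken from the paper).
NORMALIZATION_FACTOR = 100_000.0


def damage_cost(label: str, box: Box, score: float) -> float:
    """Cost of one detected damage following Eq. (2)."""
    area = max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])
    severity = (area * score) / NORMALIZATION_FACTOR
    return BASE_COSTS[label] * (1.0 + severity)


def total_repair_cost(detections: Iterable[Detection]) -> float:
    """Sum of the adjusted costs over all detections kept after NMS."""
    return sum(damage_cost(label, box, score)
               for label, box, score in detections)


detections = [
    ("dent", (0, 0, 100, 100), 0.5),     # area 10,000 -> 200 * 1.05
    ("scratch", (0, 0, 200, 100), 1.0),  # area 20,000 -> 100 * 1.20
]
estimate = total_repair_cost(detections)
```

With these placeholder values, the dent contributes about 210 and the scratch about 120, giving a total of about 330.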

3.4 Salient Object Detection for Emergency Vehicle Damages

Salient Object Detection (SOD) enhances the detection of car damages by refining the boundaries of irregular and slender shapes. SOD focuses on locating all salient objects in an image, highlighting the most noticeable and severe areas of damage, and reducing noise by filtering out less relevant parts. This is useful in complex scenes where it is more important to identify key areas of damage than to classify them.

We apply a modified U-Net [19] model to refine the boundaries of detected damages. Using the CarDD [20] dataset, images are resized to 256\(\,\times \,\)256 pixels and augmented with random resizing and flipping. The U-Net model was trained on an NVIDIA Tesla T4 GPU with a batch size of 32 for 250 epochs, using an Adam optimizer with a learning rate of 0.001 and weight decay of 1e-5. The loss function is Binary Cross-Entropy with Logits Loss (BCEWithLogitsLoss).
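For reference, the quantity computed by a binary cross-entropy loss on logits can be written out directly. The following is a plain-Python sketch of the standard numerically stable form, averaged over pixels; it illustrates the loss itself and is not the training code used for the U-Net. The function name `bce_with_logits` is hypothetical.

```python
import math
from typing import Sequence


def bce_with_logits(logits: Sequence[float],
                    targets: Sequence[float]) -> float:
    """Mean binary cross-entropy between raw logits and 0/1 targets.

    Uses the stable identity
        loss(x, y) = max(x, 0) - x * y + log(1 + exp(-|x|)),
    which equals -[y * log(sigmoid(x)) + (1 - y) * log(1 - sigmoid(x))]
    but avoids overflow for logits of large magnitude.
    """
    if len(logits) != len(targets):
        raise ValueError("logits and targets must have the same length")
    total = 0.0
    for x, y in zip(logits, targets):
        total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
    return total / len(logits)


# A logit of 0 corresponds to a predicted probability of 0.5,
# which gives a loss of ln 2 regardless of the target.
uninformed = bce_with_logits([0.0, 0.0], [1.0, 0.0])
# A confident, correct prediction gives a loss close to zero.
confident = bce_with_logits([10.0, -10.0], [1.0, 0.0])
```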

4 SAARTHI Framework

The SAARTHI framework (Fig. 3) integrates advanced AI techniques for vehicle damage assessment and repair coordination, streamlining the process from accident to repair completion. In this context, “users” refers to emergency vehicle drivers or fleet managers, depending on the organization’s protocol.

Fig. 3. SAARTHI Workflow. Please refer to Sects. 4.1–4.6 for more details

4.1 User Registration and Login

Users register on the SAARTHI platform. Once logged in, they can view their damage assessment history and past help requests on a personalized dashboard. Here they can also start a new assessment by uploading an image of the vehicle.

4.2 Image Upload and AI Operations

Users can upload an image of their damaged emergency vehicle along with optional details such as the car model and year. This image serves as the input for AI model operations, which include the following steps:

  1. Instance Segmentation and Object Detection (Sect. 3.1): The uploaded image undergoes instance segmentation and object detection to identify and classify different types of damage on the vehicle, including dent, scratch, crack, lamp broken, glass shatter, and tire flat.

  2. Non-Maximum Suppression (NMS) (Sect. 3.2): NMS is applied to eliminate redundant bounding boxes, ensuring each type of damage is represented only once.

  3. Cost Estimation (Sect. 3.3): The system estimates the repair cost based on the identified damages, using predefined base costs adjusted by the area and confidence score of each detected damage.

The assessment allows users to view the original image, the damage-detected image, the image after applying NMS, and the result of SOD. The assessment report includes statistical graphs, providing a comprehensive understanding of the damages. The report also includes a table listing the estimated repair cost and label of each individual damage, together with the total estimated repair cost.

4.3 Request Handling and Agent Coordination

Users can initiate a chatbot conversation for further assistance. Developed using JavaScript and the BotUI library [21], the chatbot offers options such as creating and downloading a PDF of the assessment report and connecting with the nearest SAARTHI agent based on the user’s location. Users can request a tow or on-the-spot repairs. The system uses the integrated Google Maps API [22] to fetch the user’s location and create a request on the SAARTHI agent dashboard.

4.4 SAARTHI Agent

SAARTHI agents are emergency vehicle repair experts who have their own accounts on the SAARTHI portal. They are available 24/7 to provide assistance. Agents are registered on the portal by an administrator after verifying their identity and credentials. During registration, the agent’s details, including shop location (latitude and longitude), city, phone number, and personal information, are collected. Each agent is assigned a unique ID to facilitate tracking and coordination.

4.5 Nearest Agent Selection

When a user requests assistance, the system identifies the nearest available agent based on distance as follows:

  1. Calculate Distance: The distance is calculated with the Haversine formula [23, 24]:

    $$\begin{aligned} d = 2r \cdot \operatorname{atan2}\left( \sqrt{a}, \sqrt{1-a}\right) \end{aligned} \tag{3}$$

    where \(a = \sin ^2\left( \frac{\Delta \phi }{2}\right) + \cos (\phi _1) \cdot \cos (\phi _2) \cdot \sin ^2\left( \frac{\Delta \lambda }{2}\right) \), \(\phi _1\) and \(\phi _2\) are the latitudes of the two points, \(\Delta \phi \) and \(\Delta \lambda \) are the differences in latitude and longitude, and \(r\) is the Earth’s radius.

  2. Sort Agents: Agents are sorted based on their distance from the user. The closest agent is contacted first.

  3. Handle Availability: If an agent rejects the request or is at full capacity, the system moves to the next closest agent.
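The three steps above can be sketched as follows. This is a minimal illustration, not the deployed implementation: the `Agent` fields, the `available` flag (standing in for "has not rejected and is not at full capacity"), and the mean Earth radius of 6371 km are assumptions made for the example.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Optional

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value of r)


@dataclass
class Agent:
    agent_id: str
    lat: float               # shop latitude in degrees
    lon: float               # shop longitude in degrees
    available: bool = True   # hypothetical flag: can accept the request


def haversine_km(lat1: float, lon1: float,
                 lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres, following Eq. (3)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp 1 - a at zero to guard against rounding for antipodal points.
    return 2 * EARTH_RADIUS_KM * math.atan2(math.sqrt(a),
                                            math.sqrt(max(0.0, 1 - a)))


def nearest_available_agent(user_lat: float, user_lon: float,
                            agents: Iterable[Agent]) -> Optional[Agent]:
    """Sort agents by distance and return the closest one that is available."""
    ranked = sorted(
        agents,
        key=lambda ag: haversine_km(user_lat, user_lon, ag.lat, ag.lon),
    )
    for agent in ranked:
        if agent.available:
            return agent
    return None  # no agent can take the request


agents = [
    Agent("A", 0.0, 0.1, available=False),  # closest, but at capacity
    Agent("B", 0.0, 0.5),
    Agent("C", 0.0, 2.0),
]
chosen = nearest_available_agent(0.0, 0.0, agents)  # falls through to "B"
```

In practice, rejection is an interactive event rather than a stored flag, so a real system would contact each agent in ranked order and wait for a response before moving on.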

4.6 Report

Once the repair is completed, the user is notified and the detailed cost breakdown, including repair costs and service charges, is presented for review and then added to the monthly report of the user’s organization.

5 Results

5.1 Object Detection and Instance Segmentation

For this research, we used the CarDD dataset [20]. Models were trained on an NVIDIA Tesla T4 GPU with a batch size of 4 for 10 epochs. The learning rate was set to 8e-05, using the Adam optimizer with a weight decay of 0.05. The loss functions included RPN classification loss, RPN bounding box regression loss, main classification loss (Cross-Entropy Loss), main bounding box regression loss (L1 Loss), and mask prediction loss (Mask Loss). The total loss is the sum of these individual losses and serves as a single measure of the model’s performance across the different stages of object detection and segmentation. Figure 4 presents key metrics evaluating the model’s performance per batch during training.

Fig. 4. Training Phase Result Metrics. Accuracy (Fig. 4a) shows a consistent upward trend, indicating improving model performance. The loss (Fig. 4b) declines steadily, reflecting increasingly accurate predictions and convergence. Bounding Box Mean Average Precision (bbox mAP) (Fig. 4c) indicates improving object detection performance, with steady increases in precision and recall. Segmentation Mean Average Precision (seg. mAP) (Fig. 4d) shows consistent improvement in identifying and segmenting objects.

Testing Phase. For the testing phase of the SAARTHI framework we evaluated the model’s performance using Mean Average Precision (mAP) and Intersection over Union (IoU) metrics. mAP measures the average precision across different classes, indicating better model accuracy with higher values. IoU measures the overlap between predicted and ground truth bounding boxes, with specific thresholds (e.g., 0.5, 0.75) determining correct predictions.

The overall bounding box mAP (bbox mAP) was 0.538 (Table 1), with high precision at IoU thresholds of 0.5 and 0.75. The model showed good accuracy for large, medium, and small objects. For segmentation masks (seg. mAP), the overall mAP was 0.519 (Table 2), also demonstrating high precision at IoU thresholds of 0.5 and 0.75, and good performance across different object sizes. Figures 6 and 8 illustrate the model’s performance with examples of original images and damage detection outputs. These results indicate that the model performs well, particularly for larger objects and at an IoU threshold of 0.5. Figure 5 represents results of damage detection and segmentation for an input image.

Table 1. bbox mAP
Table 2. seg. mAP
Fig. 5. Results

5.2 Salient Object Detection

Table 3 summarizes the model’s performance for salient object detection during testing. The average loss of 0.5826 indicates a good match between predicted and actual values. The model achieves an accuracy of 0.8645, reflecting a high proportion of correct predictions. Precision and recall, both at 0.87, suggest the model identifies true positives effectively while keeping false positives and false negatives in balance. The F1-score of 0.87 confirms the overall robust performance of the model in detecting salient objects. Figure 6 shows the training-phase results (loss and accuracy) and the predicted segmentation mask for a sample input image.
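To make the relationship between these metrics concrete, the sketch below computes accuracy, precision, recall, and F1 from a predicted and a ground-truth binary mask. It assumes pixel-wise evaluation on flattened 0/1 masks, which the paper does not state explicitly; the function name `binary_mask_metrics` is hypothetical.

```python
from typing import Dict, Sequence


def binary_mask_metrics(pred: Sequence[int],
                        truth: Sequence[int]) -> Dict[str, float]:
    """Pixel-wise accuracy, precision, recall, and F1 for flattened 0/1 masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy example: 2 true positives, 1 false positive, 1 false negative,
# and 1 true negative.
metrics = binary_mask_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

Because F1 is the harmonic mean of precision and recall, equal precision and recall (as reported above) always yield an F1-score of the same value.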

Fig. 6. SOD training-phase results (6a, 6b) and sample input image results (6c, 6d)

Table 3. Salient Object Detection results metrics (Testing Phase)

5.3 SAARTHI User Interface

Figure 7 shows a few snippets of the end-to-end SAARTHI user interface. For more details, a demo, and the code, please refer to sites.google.com/view/saarthi-home

Fig. 7. SAARTHI User Interface

6 Conclusion

The SAARTHI framework effectively addresses the challenges of emergency vehicle accidents by utilizing advanced AI technologies for damage assessment, standardized repair cost estimations, and expedited vehicle recovery. This framework reduces downtime and enhances operational readiness through real-time assessment, detailed reporting, and chatbot assistance. While the focus is on external damages visible in images, future work will integrate telematics data and sensor readings for a holistic approach, including internal damages.

Integrating SAARTHI with IoT technologies for real-time monitoring and predictive maintenance will further improve emergency vehicle readiness and efficiency.