
1 Introduction

Developing software that is secure and bug-free is an extraordinarily challenging task. Because of the devastating effects vulnerabilities may have, whether financial or on an individual's well-being, software verification is a necessity [1]. For example, Airbus found a software vulnerability in the A400M aircraft that caused a crash in 2015: the vulnerability created a fault in the engine control units, which caused the engines to power off shortly after take-off [2]. A software vulnerability is best described as a defect or weakness in software design [3]. Such a design can be verified by model checking [4] or fuzzing [5], two techniques that are well suited to finding bugs. In particular, model checking has proven to be one of the most successful techniques, judging by its use in research and industry [6]. This paper focuses on fuzzing and bounded model checking (BMC) techniques for code coverage and vulnerability detection. Code coverage is challenging due to the state-space problem, where the search space to be explored becomes extremely large [6]. For example, vulnerabilities are hard to detect in network protocols because the state space of sophisticated protocol software is too large to be explored [7]. Vulnerability detection is a further challenge beyond code coverage: some vulnerabilities cannot be detected without going deep into the software implementation. Many reasons thus motivate us to formally verify software for coverage and to detect security vulnerabilities, and these problems have attracted many researchers' attention to developing automated tools.

Researchers have been advancing the state-of-the-art to detect software vulnerabilities, as observed in the recent edition of the International Competition on Software Testing (Test-Comp 2021) [8]. Test-Comp is a competition that aims to reflect the state-of-the-art in software testing to the community and to establish a set of benchmarks for software testing. Test-Comp 2021 [8] had two categories: Error Coverage (or Cover-Error) and Branch Coverage (or Cover-Branches). The Error Coverage category tests a tool's ability to discover bugs, where every C program in the benchmarks contains a bug. The aim of the Branch Coverage category is to cover as many program branches as possible. Test-Comp 2021 works as follows: each task is a pair of an input program (a program under test) and a test specification. The tool should then generate a test suite according to the test specification. A test suite is a sequence of test cases, given as a directory of files according to the format for exchangeable test suites (see Footnote 1). The specification for testing a program is given to the test generator as an input file (either coverage-error-call.prp or coverage-branches.prp for Test-Comp 2021) [8].

Techniques such as fuzzing [9], symbolic execution [10], static code analysis [11], and taint tracking [12] are the most common techniques employed in Test-Comp 2021 to cover branches and detect security vulnerabilities [8]. Fuzzing is generally unable to create the diverse inputs needed to exercise all paths in the software execution. Symbolic execution might also not achieve high path coverage because of its dependence on Satisfiability Modulo Theories (SMT) solvers and the path-explosion problem. Consequently, fuzzing and symbolic execution by themselves often cannot reach deep software states; in particular, vulnerabilities in deep states cannot be identified and detected by these techniques in isolation [13]. Therefore, a hybrid technique involving fuzzing and symbolic execution might achieve better code coverage than fuzzing or symbolic execution alone. VeriFuzz [14] and LibKluzzer [15] are the most prominent tools that combine these techniques: VeriFuzz combines the power of feedback-driven evolutionary fuzz testing with static analysis, while LibKluzzer combines the strengths of coverage-guided fuzzing and dynamic symbolic execution.

This paper proposes a novel method for detecting security vulnerabilities in C programs that combines fuzzing with symbolic execution via bounded model checking. We make use of coverage-guided fuzzing to produce random inputs that locate security vulnerabilities in C programs. Separately, we make use of BMC techniques [16, 17]. BMC unfolds a program up to depth k by evaluating both sides of each (conditional) branch and merging states after the branch. It builds one logical formula expressed in a fragment of first-order theories and checks the satisfiability of the resulting formula using SMT solvers. These two methods are combined in our tool FuSeBMC, which can consequently handle the two main features of software testing: bug detection and code coverage, as defined by Beyer et al. [18]. We also manage each engine's execution time to improve FuSeBMC's efficiency in terms of verification time. As a result, we raise the chance of bug detection thanks to FuSeBMC's ability to cover blocks of a C program that other tools, e.g., KLEE [19], CPAchecker [20], VeriFuzz [14], and LibKluzzer [15], could not reach.
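For a program modeled as a transition system with initial states \(I\) and transition relation \(T\), the standard BMC encoding (cf. [16, 17]) of a safety property \(P\) up to bound \(k\) is

\[ \varphi _k \;=\; I(s_0) \wedge \bigwedge _{i=0}^{k-1} T(s_i, s_{i+1}) \wedge \bigvee _{i=0}^{k} \lnot P(s_i), \]

which is satisfiable exactly when some execution of length at most \(k\) violates \(P\); a satisfying assignment extracted from the SMT solver constitutes the counterexample.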

Contributions. This paper extends our prior work [21] by making the following original contributions.

  • We detail how FuSeBMC guides its fuzzing and BMC engines to produce test cases that can detect security vulnerabilities and achieve high code coverage while significantly reducing both CPU and memory consumption. Furthermore, we discuss a custom fuzzer we refer to as the selective fuzzer, a third engine that learns from the test cases produced by fuzzing/BMC to produce new test cases for the uncovered goals.

  • We provide a detailed analysis of the results from FuSeBMC's successful participation in Test-Comp 2021. FuSeBMC achieved first place in the Cover-Error category and second place in the Overall category. It also achieved first place in the subcategories ReachSafety-BitVectors, ReachSafety-Floats, ReachSafety-Recursive, ReachSafety-Sequentialized, and ReachSafety-XCSP. We analyse the results in depth and explain how our research has enabled FuSeBMC's success across these categories, as well as its low energy consumption.

2 Preliminaries

2.1 Fuzzing

Fuzzing is a cost-effective software testing technique for exposing vulnerabilities in software systems [22]. The basic idea is to generate random inputs and check whether an application crashes; fuzzing does not test functional correctness (compliance). Critical security flaws most often occur because program inputs are not adequately checked [23]. Therefore, fuzzing prepares random or semi-random inputs, which might consider, e.g., (1) very long or completely blank strings; (2) minimum/maximum values of integers, or only zero and negative values; and (3) unique values or characters likely to trigger bugs. Software systems that cannot endure fuzzing could potentially lead to security holes. For example, a bug was found in an Apple wireless driver by fuzzing: the driver could not handle certain beacon frames, which led to out-of-bounds memory access.
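To make the idea concrete, the following minimal C sketch (our illustration, not part of FuSeBMC; function_under_test is an assumed harness entry point) drives a target with random inputs and relies on crashes or sanitizer reports to flag bugs:

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed entry point of the code being fuzzed. */
extern void function_under_test(const uint8_t *data, size_t len);

int main(void) {
    srand(12345);                                  /* fixed seed: reproducible runs */
    static uint8_t buf[4096];
    for (int iter = 0; iter < 100000; ++iter) {
        size_t len = (size_t)rand() % sizeof(buf); /* includes empty inputs */
        for (size_t i = 0; i < len; ++i)
            buf[i] = (uint8_t)(rand() % 256);      /* random bytes; edge values arise by chance */
        function_under_test(buf, len);             /* a crash here signals a bug */
    }
    return 0;
}
```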

2.2 Symbolic Execution

Introduced in the 1970s, symbolic execution [24] is a software analysis technique that allows developers to test specific properties of their software. The main idea is to execute a program symbolically, using a symbolic execution engine that keeps track of every path the program may take for every input [24]. Inputs are represented as symbolic values instead of concrete ones. The method treats each path as a set of symbolic constraints and solves these constraints to output a concrete input as a test case. Symbolic execution is widely used to find security vulnerabilities by analyzing program behavior and generating test cases [25]. BMC is an instance of symbolic execution that merges all execution paths into one single logical formula instead of exploring them individually.
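As a small illustration (ours, not from the paper), consider how a symbolic executor would explore this C function, treating the input as a symbol a and collecting one path constraint per branch:

```c
/* Symbolic execution of classify() with symbolic input x = a:
 *   path 1: a > 10 && a < 20   -> a solver model, e.g., a = 15, becomes a test case
 *   path 2: a > 10 && a >= 20  -> e.g., a = 20
 *   path 3: a <= 10            -> e.g., a = 0
 * BMC would instead merge all three paths into one formula and discharge
 * it with a single SMT query. */
int classify(int x) {
    if (x > 10) {
        if (x < 20)
            return 1;
        return 2;
    }
    return 0;
}
```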

2.3 Types of Vulnerabilities

Software, in general, is prone to vulnerabilities caused by developer errors, which include: buffer overflow, where a running program attempts to write data outside the memory buffer intended to store this data [26]; memory leak, which occurs when programmers allocate memory on the heap and forget to free it [27]; and integer overflow, when the value of an integer exceeds the integer's maximum size in memory or falls below its minimum value, which usually occurs when converting a signed integer to an unsigned integer and vice versa [28]. Another example is string manipulation, where a string containing malicious code is accepted as an input; this is reasonably common in the C programming language [29]. A denial-of-service (DoS) attack is a security event that occurs when an attacker prevents legitimate users from accessing specific computer systems, devices, services, or other IT resources [30]. For example, a vulnerability in the Cisco Discovery Protocol (CDP) module of Cisco IOS XE Software Releases 16.6.1 and 16.6.2 could have allowed an unauthenticated, adjacent attacker to cause a memory leak, which could have led to a DoS condition [31]. Part of our motivation is to mitigate the harm done by these vulnerabilities with our proposed method, FuSeBMC.
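The following deliberately buggy C fragment (our illustration) condenses several of these defect classes into a few lines:

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

void vulnerable(const char *attacker_controlled) {
    char buf[8];
    strcpy(buf, attacker_controlled);  /* buffer overflow if the input exceeds 7 chars [26] */

    int *data = malloc(100 * sizeof *data);
    if (data == NULL) return;
    /* ... data is never freed: memory leak [27] */

    int n = INT_MAX;
    n = n + 1;                         /* signed integer overflow: undefined behavior [28] */

    unsigned u = (unsigned)n;          /* signed-to-unsigned conversion surprise */
    (void)u;
}
```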

3 FuSeBMC: An Energy-Efficient Test Generator for Finding Security Vulnerabilities in C Programs

We propose a novel verification method named FuSeBMC (cf. Fig. 1) for detecting security vulnerabilities in C programs using fuzzing and BMC techniques. FuSeBMC builds on top of the Clang compiler [32] to instrument the C program, uses Map2Check [33, 34] as its fuzzing engine, and ESBMC (Efficient SMT-based Bounded Model Checker) [35, 36] as its BMC and symbolic execution engine, thus combining dynamic and static verification techniques.

Fig. 1. FuSeBMC: an energy-efficient test generator framework.

The method proceeds as follows. First, FuSeBMC takes a C program and a test specification as input. Then, FuSeBMC invokes the fuzzing and BMC engines sequentially to find an execution path that violates a given property. It uses an iterative BMC approach that incrementally unwinds the program until it finds a property violation or exhausts the time or memory limits. In code coverage mode, FuSeBMC explores and analyzes the target C program using the Clang compiler to inject labels incrementally: it traverses every branch of the Clang AST and injects into each a label of the form \(GOAL_i\) for \(i \in \mathbb {N}\). Both engines then check whether these injected labels are reachable, producing test cases for branch coverage. After that, FuSeBMC analyzes the counterexamples and saves them as GraphML files. It checks whether the fuzzing and BMC engines could produce counterexamples for both categories, Cover-Error and Cover-Branches. If that is not the case, FuSeBMC employs a second fuzzing engine, the so-called selective fuzzer (cf. Sect. 3.6), which attempts to produce test cases for the remaining labels. The selective fuzzer produces test cases by learning from the two previous engines' output.

FuSeBMC introduces a novel algorithm for managing the time allocated to its component engines. In particular, FuSeBMC manages the time allocated to each engine to avoid wasting time on finding test cases for particularly challenging goals. For example, let us assume we have 100 goals injected by FuSeBMC and 1000 s to produce test cases. In this case, FuSeBMC distributes the time per engine per goal so that each goal initially has 10 s, recalculating the time for the remaining goals after each goal is processed. If an engine succeeds on a particular goal within the time limit, the extra time is redistributed to the other goals; otherwise, FuSeBMC kills the corresponding process once it exceeds its allotted time.

Furthermore, FuSeBMC has a minimum time that each goal must be allocated. If there are too many goals for all to receive this minimum time, FuSeBMC will select a subset to attempt using a quasi-random strategy (e.g., all even-numbered goals). FuSeBMC also manages the global time of the fuzzing, BMC, and selective fuzzing engines: it gives 13% of the time to fuzzing, 77% to BMC, and 10% to selective fuzzing. FuSeBMC further carries out time management at this global level to maximize engine usage. If, for example, the fuzzing engine finishes before its allocated time, the remaining time is carried over and added to the allocated time of the BMC engine. Similarly, we add the remaining time from the BMC engine to the selective fuzzer's allocated time.
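A minimal C sketch of this budgeting scheme (our reconstruction; names and structure are hypothetical, not FuSeBMC's actual code):

```c
/* Split the global budget across the three engines (13% / 77% / 10%);
 * in FuSeBMC, time an engine leaves unused cascades to the next engine. */
void split_budget(double total, double *fuzz, double *bmc, double *sel) {
    *fuzz = 0.13 * total;
    *bmc  = 0.77 * total;
    *sel  = 0.10 * total;
}

/* Per-goal slice: e.g., 1000 s and 100 goals gives 10 s per goal; after
 * each goal the slice is recomputed, so time saved on easy goals flows
 * to the remaining ones. If goals outnumber the minimum slices available,
 * only a quasi-random subset of goals would be attempted. */
double next_slice(double time_left, int goals_left, double min_slice) {
    double slice = time_left / goals_left;
    return slice < min_slice ? min_slice : slice;
}
```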

FuSeBMC prepares valid test cases with metadata to test a target C program using TestCov [37] as a test validator. The metadata file is an XML file that describes the test suite and is consistently named metadata.xml. Figure 2 illustrates an example metadata file with all available fields [37]. Some essential fields include the program function tested by the test suite \(\langle entryfunction \rangle \), the coverage criterion for the test suite \(\langle specification \rangle \), the programming language of the program under test \(\langle sourcecodelang \rangle \), the system architecture the program tests were created for \(\langle architecture \rangle \), the creation time \(\langle creationtime \rangle \), the SHA-256 hash of the program under test \(\langle programhash \rangle \), the producer of the test suite \(\langle producer \rangle \), and the name of the target program \(\langle programfile \rangle \). A test case file contains a sequence of \(\langle input \rangle \) tags that describes the sequence of input values. Figure 3 illustrates an example of a test case file.

Algorithm 1 describes the main steps we implemented in FuSeBMC. It starts by extracting all goals of a C program (line 1). For each goal, the instrumented C program containing the goals (line 2) is executed on our verification engines (fuzzing and BMC) to check the reachability property produced by REACH(G) for that goal (lines 8 & 20). REACH is a function that takes a goal (G) as input and produces a corresponding property for fuzzing/BMC (lines 7 & 19). If our engines find that the property is violated, meaning that there is a valid execution path that reaches the goal (a counterexample), then the goal is marked as covered, and the test case is saved for later (lines 9–11). We then continue as long as we still have time allotted for each engine. Otherwise, if our verification engines could not reach some goals, we employ the selective fuzzer in an attempt to reach these as yet uncovered goals. In the end, we return all test cases for all the goals we have found in the specified XML format (line 41).

[Algorithm 1 — listing not reproduced]
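Since the listing itself is not reproduced above, the following C-style sketch reconstructs Algorithm 1 from the prose; all types and helper functions are hypothetical, and its lines do not correspond to the line numbers cited in the text:

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct Goal Goal;                                 /* hypothetical types */
typedef struct Suite Suite;
extern size_t extract_goals(const char *prog, Goal ***g); /* inject GOAL_i labels */
extern const char *REACH(const Goal *g);                  /* reachability property */
extern bool engine_covers(const char *engine, const char *prop,
                          double budget, Suite *suite);   /* runs engine, saves test case */

Suite *fusebmc(const char *prog, Suite *suite,
               double t_fuzz, double t_bmc, double t_sel) {
    Goal **goals;
    size_t n = extract_goals(prog, &goals);
    for (size_t i = 0; i < n; ++i) {
        const char *prop = REACH(goals[i]);
        bool covered = engine_covers("fuzzing", prop, t_fuzz / n, suite)
                    || engine_covers("bmc",     prop, t_bmc  / n, suite);
        if (!covered)  /* uncovered goals go to the selective fuzzer */
            engine_covers("selective", prop, t_sel / n, suite);
    }
    return suite;      /* test cases are emitted in the XML exchange format */
}
```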
Fig. 2. An example of a metadata file.
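Because the figure is not reproduced here, the sketch below shows what such a metadata.xml might look like, using the fields listed above (all values are illustrative; the \(\langle specification \rangle \) value follows the Test-Comp convention for branch coverage):

```xml
<test-metadata>
  <sourcecodelang>C</sourcecodelang>
  <producer>FuSeBMC</producer>
  <specification>COVER( init(main()), FQL(COVER EDGES(@DECISIONEDGE)) )</specification>
  <programfile>program.c</programfile>
  <programhash><!-- SHA-256 hash of program.c --></programhash>
  <entryfunction>main</entryfunction>
  <architecture>32bit</architecture>
  <creationtime><!-- e.g., 2021-01-01T12:00:00Z --></creationtime>
</test-metadata>
```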

Fig. 3. An example of a test case file.
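Likewise, a test case file is simply an ordered list of \(\langle input \rangle \) tags, one per nondeterministic input consumed during execution (values illustrative):

```xml
<testcase>
  <input>0</input>     <!-- value of the first nondeterministic input -->
  <input>42</input>    <!-- value of the second -->
  <input>-1</input>    <!-- and so on, in execution order -->
</testcase>
```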

3.1 Analyze C Code

FuSeBMC explores and analyzes the target C programs as the first step, using Clang [38]. In this phase, FuSeBMC analyzes every single line in the C code, considering conditional statements such as if-conditions and for, while, and do-while loops. FuSeBMC takes all these branches as path conditions, containing different values due to the condition sets used to produce the counterexamples, thus helping increase code coverage. It supports blocks, branches, and conditions. All the values of the variables within each path are taken into account. Braces and else-branches are added where needed so that the instrumented code compiles without errors.

3.2 Inject Labels

As the second step, FuSeBMC injects labels of the form \(GOAL_i\) into every branch of the C code. In particular, FuSeBMC adds an else-branch to every if-condition that lacks one. FuSeBMC treats this implicit else as another branch for which a counterexample should be produced, increasing the chance of detecting bugs and covering more statements in the program. For example, the code in Fig. 4 consists of two branches: the if-branch is entered if condition \(x < 0\) holds; otherwise, the else-branch is entered implicitly, which can exercise the remaining execution paths. Figure 4 also shows how FuSeBMC injects the labels and considers the implicit else a new branch.

Fig. 4. Original C code vs. instrumented code.
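As the figure is not reproduced, the following sketch conveys the transformation (the exact label syntax is our assumption):

```c
/* Original: one explicit branch, one implicit one. */
if (x < 0) {
    x = -x;
}

/* Instrumented (sketch): FuSeBMC adds the missing else-branch and
 * injects one reachability goal per branch. */
if (x < 0) {
    GOAL_1:;
    x = -x;
} else {
    GOAL_2:;
}
```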

3.3 Produce Counterexamples

FuSeBMC uses its verification engines to generate test cases that can reach the goals \(GOAL_1\), \(GOAL_2\), ..., \(GOAL_n\) inserted in the previous phase. FuSeBMC then checks whether all goals within the C program are covered. If so, FuSeBMC continues to the next phase; otherwise, FuSeBMC passes the uncovered goals to the selective fuzzer, which produces test cases for them using randomly generated inputs learned from the test cases produced by both engines. Figure 5 illustrates how the method works.

Fig. 5. Produce counterexamples.

3.4 Create Graphml

FuSeBMC generates a GraphML file for each injected goal and names it after the goal's number, extended by the graphml extension, e.g., GOAL1.graphml. The GraphML file contains data about the counterexample, such as the data types, values, and line numbers of the variables, which are used to obtain the values of the target variables.

3.5 Produce Test Cases

In this phase, FuSeBMC analyzes all the GraphML files produced in the previous phase. In practice, FuSeBMC focuses on the \(\texttt {{<}edge{>}}\) tags in the GraphML that refer to variables of non-deterministic type. The values of these variables are stored in a file named, for example, testcase1.xml. Figure 6 illustrates the edges and values used to create the test cases.

Fig. 6. An example of target edges.
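Since the figure is not reproduced, the following sketch (format details are illustrative, loosely following GraphML witness conventions) shows the kind of edge FuSeBMC extracts a value from:

```xml
<edge source="N3" target="N4">
  <data key="assumption">x == 42;</data>  <!-- concrete value of nondeterministic x -->
  <data key="startline">7</data>          <!-- source line of the assignment -->
</edge>
<!-- the value 42 would then appear in, e.g., testcase1.xml as <input>42</input> -->
```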

3.6 Selective Fuzzer

In this phase, we apply the selective fuzzer to learn from the test cases produced by either the fuzzing or BMC engine, producing test cases for the goals that neither of the two has covered. The selective fuzzer uses the previously produced test cases by extracting from each the number of assignments required to reach an error. For example, in Fig. 7, we assume that fuzzing/BMC produced a test case containing the value 18 repeated 1000 times, generated from a random seed. The selective fuzzer will then produce 1000 random numbers, matching the structure of the test case produced by the fuzzer. In several cases, the BMC engine can exhaust the time limit before providing the information needed by the selective fuzzer, such as the number of inputs, e.g., when large arrays need to be initialized at the beginning of the program.

Fig. 7. The selective fuzzer.
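A minimal C sketch of this idea (ours; count_inputs is a hypothetical helper that parses an existing test case):

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns how many <input> values an existing test
 * case contains (1000 in the Fig. 7 scenario). */
extern size_t count_inputs(const char *testcase_path);

int main(void) {
    size_t n = count_inputs("testcase1.xml"); /* learn the input count */
    srand(42);                                /* fresh random seed */
    for (size_t i = 0; i < n; ++i)            /* same shape, new values */
        printf("  <input>%d</input>\n", rand());
    return 0;
}
```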

3.7 Test Validator

The test validator takes as input the test cases produced by FuSeBMC and validates them by executing the program on all test cases. It checks whether the bug is exposed if the goal was bug detection, and it reports the code coverage if the goal was measuring coverage. In our experiments, we use the tool TestCov [37] as the test validator. The tool provides coverage statistics per test. It supports block, branch, and condition coverage, as well as covering calls to an error function. TestCov uses the XML-based exchange format for test-case specifications defined by Test-Comp [16]. TestCov was successfully used in the 2019, 2020, and 2021 editions of Test-Comp to execute almost 9 million tests on 1720 different programs [37].

4 Evaluation

4.1 Description of Benchmarks and Setup

We conducted experiments with FuSeBMC on the benchmarks of Test-Comp 2021 [39] to check the tool's ability with respect to the previously mentioned criteria. Our evaluation benchmarks are taken from the largest and most diverse open-source repository of software verification tasks; the same benchmark collection is used by SV-COMP [40]. These benchmarks yield 3173 test tasks, namely 607 test tasks for the category Error Coverage and 2566 test tasks for the category Code Coverage. Both categories contain C programs with loops, arrays, bit-vectors, floating-point numbers, dynamic memory allocation, and recursive functions.

The experiments were conducted on the servers of Test-Comp 2021 [39]. Each run was limited to 8 processing units, 15 GB of memory, and 15 min of CPU time. The test-suite validation was limited to 2 processing units, 7 GB of memory, and 5 min of CPU time. Each test node had the following specification: one Intel Xeon E3-1230 v5 CPU with 8 processing units, a frequency of 3.4 GHz, 33 GB of RAM, and a GNU/Linux operating system (x86-64-Linux, Ubuntu 20.04 with Linux kernel 5.4).

The FuSeBMC source code is written in C++ and is available for download on GitHub (see Footnote 2), which includes the latest release, FuSeBMC v3.6.6. FuSeBMC is publicly available under the terms of the MIT license. Instructions for building FuSeBMC from the source code are given in the file README.md.

4.2 Objectives

This evaluation's main goal is to check the performance of FuSeBMC and its suitability for detecting security vulnerabilities in open-source C programs. Our experimental evaluation aims to address three experimental goals:

[List of experimental goals not reproduced]

4.3 Results

First, we evaluated FuSeBMC on the Error Coverage category. Table 1 shows the experimental results compared with the other tools in Test-Comp 2021 [39]: FuSeBMC achieved 1st place in this category by solving 500 out of 607 tasks, an 82% success rate.

In detail, FuSeBMC achieved 1st place in the subcategories ReachSafety-BitVectors, ReachSafety-Floats, ReachSafety-Recursive, ReachSafety-XCSP and ReachSafety-Sequentialized. FuSeBMC solved 10 out of 10 tasks in ReachSafety-BitVectors, 32 out of 33 tasks in ReachSafety-Floats, 19 out of 20 tasks in ReachSafety-Recursive, 53 out of 59 tasks in ReachSafety-XCSP and 101 out of 107 tasks in ReachSafety-Sequentialized.

FuSeBMC outperformed the top tools in Test-Comp 2021, such as KLEE [19], CPAchecker [20], Symbiotic [41], LibKluzzer [15], and VeriFuzz [14], in these subcategories. However, FuSeBMC did not perform as well in the ReachSafety-ECA subcategory compared with the leading tools in the competition. We suspect that this is due to the prevalence of nested branches in these benchmarks: FuSeBMC's verification engines and the selective fuzzer could not produce test cases to reach the error because too many path conditions make the logical formula hard to solve and make it difficult to create random inputs that reach the error.

Table 1. Cover-Error results\(^\mathrm{{a}}\). We identify the best results in bold.

FuSeBMC also participated in the Branch Coverage category of Test-Comp 2021. Table 2 shows the experimental results for this category. FuSeBMC achieved 4th place with a score of 1161 out of 2566, only 8 points behind the 3rd-place tool. In the subcategory ReachSafety-Floats, FuSeBMC obtained first place with a score of 103 out of 226, outperforming the top tools in Test-Comp 2021. Further, FuSeBMC obtained first place in the subcategory ReachSafety-XCSP with a score of 97 out of 119. However, FuSeBMC did not perform well in the subcategory ReachSafety-ECA compared with the leading tools in Test-Comp 2021. Again, we suspect the cause to be the prevalence of nested branches in these benchmarks.

Table 2. Cover-Branches results\(^\mathrm{{a}}\). We identify the best results in bold.

FuSeBMC achieved 2nd place overall in Test-Comp 2021, with a score of 1776 out of 3173. Table 4 and Fig. 8 show the overall results compared with the other tools in the competition. Overall, FuSeBMC performed well compared with the top tools in the subcategories ReachSafety-BitVectors, ReachSafety-Floats, ReachSafety-Recursive, ReachSafety-Sequentialized, and ReachSafety-XCSP.

Fig. 8. Quantile functions for category Overall [8].

Test-Comp 2021 also considers energy efficiency in its rankings, since a large part of the cost of test generation is caused by energy consumption. FuSeBMC is classified as a green-testing (low energy consumption) tool (see Table 3), consuming less energy than many other tools in the competition. This ranking category uses the energy consumption per score point as its rank measure: CPU Energy Quality, with the unit kilo-joule per score point (kJ/sp), measured with CPU Energy Meter [42].

Table 3. The consumption of CPU and memory [8].
Table 4. Test-Comp 2021 Overall results\(^\mathrm{{a}}\).

5 Related Work

For more than 20 years, software vulnerabilities have been identified mainly by fuzzing [43]. American fuzzy lop (AFL) [44, 45] is a tool that aims to find software vulnerabilities; it increases the coverage of test cases by utilizing genetic algorithms (GA) with guided fuzzing. Another fuzzing tool is LibFuzzer [46], which generates test cases by using the code coverage information provided by LLVM's SanitizerCoverage instrumentation. It is best used for programs with small inputs that run in under a fraction of a second per input and are guaranteed not to crash on invalid inputs. AutoFuzz [47] is a tool that verifies network protocols using fuzzing: it first determines the specification for the protocol, then utilizes fuzzing to find vulnerabilities. Additionally, Peach [48] is an advanced and robust fuzzing framework that provides an XML file to create a data-model and state-model definition.

Symbolic execution has also been used to identify security vulnerabilities. One of the most popular symbolic execution engines is KLEE [19]. It is built on top of the LLVM compiler infrastructure and employs dynamic symbolic execution to explore the search space path by path. KLEE has proven to be a reliable symbolic execution engine through its use in many specialized tools, such as TracerX [49] and Map2Check [33] for software verification, as well as SymbexNet [50] and SymNet [51] for verifying network protocol implementations.

The combination of symbolic execution and fuzzing has been proposed before. It started with VeriFuzz [14], the tool that earned first place in Test-Comp 2020 [18]. VeriFuzz is a state-of-the-art tool that we have compared to FuSeBMC: a program-aware fuzz tester that combines the power of feedback-driven evolutionary fuzz testing with static analysis. It is built on grey-box fuzzing, exploiting lightweight instrumentation to observe the behaviors that occur during test runs. There is also LibKluzzer [15], a novel implementation that combines the strengths of coverage-guided fuzzing and white-box fuzzing; LibKluzzer is a combination of LibFuzzer and an extension of KLEE called KLUZZER [52]. Driller [53] is a hybrid vulnerability excavation tool that leverages fuzzing and selective concolic execution in a complementary manner to find deeply embedded bugs. The authors avoid the path explosion inherent in concolic analysis and the incompleteness of fuzzing by combining the two techniques' strengths and mitigating their weaknesses. Driller splits the application into compartments based on checks of particular values of a specific input. The proficiency of fuzzing allows it to explore possible values of general input within a compartment; however, when it comes to values that satisfy checks on an input guiding the execution between compartments, fuzzing struggles to identify such values. In contrast, selective concolic execution excels at identifying the values required by such checks and driving the execution between compartments.

Another example is the hybrid fuzzer of [54], which provides an efficient way to generate provably random test cases that guarantee the execution of unique paths. It uses symbolic execution to determine frontier nodes that lead to unique execution paths. Given some resource constraints, the tool collects as many frontier nodes as possible; with these nodes, fuzzing is employed with provably random input, preconditioned to lead to each frontier node. Badger [55] is a hybrid testing approach for complexity analysis: it uses Symbolic PathFinder [56] to generate new inputs and provides the Kelinci fuzzer with worst-case analysis. Munch [57] is a hybrid tool introduced to increase function coverage; it employs fuzzing with seed inputs generated by symbolic execution, and targeted symbolic execution when fuzzing saturates. SAGE (Scalable Automated Guided Execution) [58] is a hybrid fuzzer developed at Microsoft Research. It extends dynamic symbolic execution with a generational search, negating and solving the path predicates to increase code coverage. SAGE is used extensively at Microsoft, where it has been successful at finding many security-related bugs. SAFL [59] is an efficient fuzzer for C/C++ programs: it generates initial seeds that establish an appropriate fuzzing direction by employing symbolic execution in a lightweight approach. He et al. [60] describe a new approach for learning a fuzzer from symbolic execution, instantiated in the domain of smart contracts: it first learns a fuzzing policy using neural networks and then generates inputs for fuzzing unseen smart contracts using this learned policy. In summary, many tools have combined fuzzers with BMC and symbolic execution to perform software verification. However, our approach's novelty lies in the addition of the selective fuzzer and the time-management algorithm across engines and goals. These features are what distinguished FuSeBMC from the other tools at Test-Comp 2021.

6 Conclusions and Future Work

We proposed a novel test-case generation approach, implemented in the FuSeBMC tool, that combines fuzzing and BMC. FuSeBMC explores and analyzes the target C programs by incrementally injecting labels to guide the fuzzing and BMC engines in producing test cases. We inject labels into every program branch to check their reachability, producing test cases for the labels that are reachable. We also exploit the selective fuzzer to produce test cases for the labels for which fuzzing and BMC could not produce test cases. FuSeBMC achieved two significant awards at Test-Comp 2021: first place in the Cover-Error category and second place in the Overall category. FuSeBMC outperformed the leading state-of-the-art tools because of two main factors. Firstly, we used the selective fuzzer as a third engine that learns from the test cases of fuzzing/BMC to produce new test cases for the as-yet uncovered goals; overall, it substantially increased the percentage of successful tasks. Secondly, we applied a novel algorithm for managing the time allocated to each engine and goal. This algorithm prevents FuSeBMC from wasting time on finding test cases for difficult goals: if the fuzzing engine finishes before its allocated time, the remaining time is carried over and added to the allocated time of the BMC engine, and similarly the remaining time from the BMC engine is added to the selective fuzzer's allocated time. As a result, FuSeBMC raised the bar for the competition, thus advancing the state of the art in software testing. Future work will investigate extending FuSeBMC to test multi-threaded programs [61, 62] and using reinforcement learning techniques to guide our selective fuzzer to find test cases that path-based fuzzing and BMC could not find.