20.1 Introduction

Video processing and computer vision methods are becoming increasingly important, not only in industrial applications but also in daily life [1]. Video processing generally involves tasks with very high computational demands. Such tasks can be handled by standard processors and computers, or by computers connected to computational networks [1]. However, this approach is not always suitable, which is why specialized hardware solutions based on digital signal processors (DSPs) or field programmable gate arrays (FPGAs) are usually used in embedded systems [2, 3]. Xilinx System Generator allows a hardware system to be designed starting from the graphical, high-level Simulink environment [3, 4]. System Generator extends the traditional Hardware Description Language (HDL) design flow by providing graphical modules, and thus does not require detailed knowledge of this complex language. The Simulink graphical language allows the design to be abstracted through the use of the available System Generator blocks and subsystems [3]. This reduces the time needed between the derivation of the control design and its hardware implementation. In addition, the software provides hardware simulation and hardware-in-the-loop verification, referred to as hardware co-simulation [2, 4], from within this environment. This methodology makes hardware verification and implementation easier than with an HDL-based approach. The Simulink simulation and hardware-in-the-loop approach is a more cost-efficient solution than other methodologies. The ability to quickly and directly realize a control system design as a real-time embedded system greatly facilitates the design process.

The remainder of this paper is organized as follows. Section 20.2 describes the design methodology for implementation on an FPGA with Xilinx System Generator. Section 20.3 presents a case study, the color space conversion application. Section 20.4 details the experimental results and software performance. Section 20.5 provides discussion and comparison. Section 20.6 concludes the paper.

20.2 Design Methodology for Implementation on FPGA with Xilinx System Generator

An efficient rapid prototyping system requires a development environment that targets the hardware design platform. The tools used here, MATLAB R2007a with Simulink from MathWorks [4, 5] and System Generator 10.1 for DSP and ISE 10.1 from Xilinx, provide such capabilities (Fig. 20.1). Although the Xilinx ISE 10.1 [2, 5] Foundation software is not used directly, it is required because it runs in the background when the System Generator blocks are implemented. The System Generator [2] environment allows the Xilinx line of FPGAs to be interfaced directly with Simulink. In addition, several cost-effective development boards are available on the market that can be used during the software design development phase.

Fig. 20.1 Design methodology with Xilinx System Generator

MATLAB is interactive software for numerical computation that simplifies the implementation of linear algebra routines. Powerful operations can be performed using the provided MATLAB commands. Simulink [2, 3] is an additional MATLAB toolbox for modeling, simulating and analyzing dynamic systems within a graphical environment. The software allows both modular and hierarchical models to be developed, so that a complex system design can be kept conceptually simple.

Xilinx System Generator is a MATLAB-Simulink based design tool for Xilinx's line of FPGAs. Complex digital circuits have traditionally been developed using multiple Hardware Description Language (HDL) modules. Because the abstraction level is very low within the HDL environment, the difficulty increases as the design becomes more complex [5].

The Xilinx Integrated Software Environment (ISE) is a powerful design environment that works in the background when System Generator blocks are implemented. The ISE environment consists of a set of program modules, written in HDL, that are used to create, capture, simulate and implement digital designs in an FPGA or CPLD target device [1, 2]. Synthesizing these modules creates netlist files, which serve as the input to the implementation module. After these files are generated, the logic design is converted into a physical file that can be downloaded to the target device.

20.3 Study Case: Color Space Conversion RGB to YCbCr

20.3.1 Overview

Color Space Conversion (CSC) [6, 7] is an important application in image and video processing systems. CSC has been implemented in software and various kinds of hardware. Hardware implementations can achieve a higher performance compared to software-only solutions. Application specific integrated circuits (ASICs) are efficient and have good performance. However, they lack the programmability of devices such as field programmable gate arrays (FPGAs) [8, 9].

Many video applications require converting video and image content from one color space to another [10–12]. Images and motion images (video) use a wide variety of color spaces, including RGB, YCrCb, HSI, and other formats, to represent the colors within the image [13]. Each of these color space representations has its own advantages and disadvantages. For example, RGB is often used for the most demanding applications, where ultimate color fidelity must be maintained. Any color that the human eye can see may be represented by a combination of the primary colors (red, R; green, G; and blue, B). The human eye does not see equally well in the different color bands: the human vision system [12, 14] is optimized for the red and green bands and is less sensitive to changes in blue. Scientists and engineers looking for ways to reduce the bandwidth and/or bit rate of a video system have created other color spaces (and sampling spaces) that reduce the amount of blue information in a system while maintaining subjectively high picture quality. Furthermore, human vision is more sensitive to changes in brightness (black-and-white or gray-scale changes) than to changes in hue (changes from one color to another at the same brightness). Therefore, many video systems sub-sample the color information (chrominance) [12] while transmitting the black-and-white information (luminance) at full resolution. This sub-sampling is often applied to luminance-chrominance color spaces such as YCrCb, where Y represents the luminance information and Cr and Cb are color-difference signals that represent the chrominance information. In these systems, all of the Y samples are used but every other color sample is dropped. This scheme is referred to as 4:2:2 sampling: for every 4 Y samples, only 2 Cr and 2 Cb samples are kept.

Owing to the bandwidth savings offered by these image formats, different video equipment adopts different color space encodings. Interoperability between such equipment often requires a device that converts the output of one video device in a given color space to the color space needed as input by the downstream device. One example of color space conversion is converting the RGB video output of a computer VGA card to the YCrCb input of a TV monitor [6, 13]. The opposite conversion path is also common, where a video device such as a DVD player outputs YCrCb and the video needs to be converted to RGB to drive a monitor [13].
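
The 4:2:2 sampling described above can be illustrated with a minimal Python sketch. It assumes that a video line is given as a list of (Y, Cb, Cr) tuples and that chroma is kept for even-indexed pixels only; the function name `subsample_422` and the sample values are hypothetical. Practical systems may filter the chroma before dropping samples, which is not shown here.

```python
def subsample_422(pixels):
    """Convert one line of (Y, Cb, Cr) pixels to 4:2:2.

    Every Y sample is kept; Cb and Cr are kept only for every other
    pixel (even indices), i.e. every other color sample is dropped.
    Returns three lists: (y_samples, cb_samples, cr_samples).
    """
    y = [p[0] for p in pixels]
    cb = [p[1] for p in pixels[::2]]
    cr = [p[2] for p in pixels[::2]]
    return y, cb, cr


# Four illustrative pixels (values are made up for the example).
row = [(50, 100, 150), (60, 110, 160), (70, 120, 170), (80, 130, 180)]
y, cb, cr = subsample_422(row)
print(len(y), len(cb), len(cr))  # prints: 4 2 2
```

For four input pixels the sketch keeps 4 Y, 2 Cb and 2 Cr samples, which is where the 4:2:2 name comes from.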

20.3.2 YCbCr Color Model

The YCbCr color model belongs to the family of television transmission color models. In this color model, the luminance component is separated from the color components. The Y component represents luminance, and chrominance information is stored as two color-difference components. The Cb component represents the difference between the blue component and a reference value, and the Cr component represents the difference between the red component and a reference value. The conversion used to separate an RGB image into its Y, Cb and Cr components can be expressed as the matrix equation in Eq. (20.1) [13].

$\left[\begin{array}{c} Y \\ Cb \\ Cr \end{array}\right] = \left[\begin{array}{ccc} 0.299 & 0.587 & 0.114 \\ -0.169 & -0.331 & 0.5 \\ 0.5 & -0.419 & -0.081 \end{array}\right] \left[\begin{array}{c} R \\ G \\ B \end{array}\right] + \left[\begin{array}{c} 16 \\ 128 \\ 128 \end{array}\right]$
(20.1)
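
As an illustration, Eq. (20.1) can be applied to a single pixel with a few lines of Python. This is a minimal floating-point sketch that follows the equation literally; the function names `rgb_to_ycbcr` and `clamp8` are hypothetical, and the 8-bit rounding and clamping step is an assumption that is not part of Eq. (20.1). Clamping is needed because, with the coefficients and offsets as printed, bright inputs can exceed 255 (for example, white gives Y = 255 + 16). The sketch is not intended to reproduce the output of any particular library routine, which may use different scaling.

```python
# Coefficients and offsets copied from Eq. (20.1).
CSC_MATRIX = (
    (0.299, 0.587, 0.114),    # Y row
    (-0.169, -0.331, 0.5),    # Cb row
    (0.5, -0.419, -0.081),    # Cr row
)
CSC_OFFSET = (16, 128, 128)


def rgb_to_ycbcr(r, g, b):
    """Apply Eq. (20.1) to one pixel; returns unrounded (Y, Cb, Cr) floats."""
    rgb = (r, g, b)
    return tuple(
        sum(c * v for c, v in zip(row, rgb)) + off
        for row, off in zip(CSC_MATRIX, CSC_OFFSET)
    )


def clamp8(x):
    """Round to an integer and clamp to 0..255 (assumed 8-bit output)."""
    return max(0, min(255, int(round(x))))


if __name__ == "__main__":
    for name, pixel in [("black", (0, 0, 0)), ("gray", (128, 128, 128))]:
        print(name, [clamp8(v) for v in rgb_to_ycbcr(*pixel)])
```

Because each chrominance row of the matrix sums to zero, any gray input (R = G = B) yields Cb = Cr = 128, which is a quick sanity check on the coefficients.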

Among the color models considered, YCbCr appears better suited to skin detection, since colors in YCbCr are specified in terms of luminance (Y channel) and chrominance (Cb and Cr channels). The main advantage of converting the image from the RGB color model to the YCbCr color model is that the influence of luminance can be removed during video processing. Figure 20.2 shows the conversion of an RGB image into the YCbCr color model using the MATLAB function rgb2ycbcr.

Fig. 20.2 MATLAB implementation for rgb2ycbcr

20.4 Implementation Results, Simulation and Comparisons

20.4.1 Hardware Co-simulation

Figure 20.3 shows the model that uses the top-level HDL module and its Xilinx blockset for the RGB to Y component conversion. This model can be used for co-simulation.

Fig. 20.3 System Generator project for simulation

Once the design is verified, a hardware co-simulation block can be generated and then used to program the FPGA for the CSC design implementation. Figure 20.4 shows the model with the hardware co-simulation block. The bitstream is downloaded over a JTAG cable.

Fig. 20.4 System Generator project for hardware-in-the-loop testing

20.4.2 Simulation

After the co-simulation step, the VHDL code was automatically generated from the System Generator blocksets. Behavioral and post simulations are supported by the Mentor Graphics ModelSim tool (Fig. 20.5).

Fig. 20.5 Simulation results of the VHDL RGB to YCbCr conversion

The VHDL code was then synthesized using Xilinx ISE 10.1i, targeting the Xilinx Spartan-3 and Virtex-II Pro families [2]. The optimization setting was maximum clock speed. Table 20.1 details the resource requirements of the design. Note that, in practice, additional blocks are needed for input/output interfaces and synchronization.

Table 20.1 FPGA resources used in the implementation for the CSC

The HDL-based circuit design flow is completed with the Xilinx ISE tool, which performs synthesis, implementation, place and route, and device programming for the whole design. For the arithmetic units, an unsigned pipelined integer divider with both quotient and remainder outputs is parameterized and generated by the Xilinx Core Generator tool [5, 9]. Multiplication uses the embedded multipliers in the hardware. The target FPGA chips are the Xilinx Virtex-II Pro xc2vp7-6ff672 and the Spartan-3 xc3s200-5ft256. In the Simulink-to-FPGA design flow, the circuit model is built from basic Simulink blocks and Xilinx-specific blocks. Input and output data are exchanged with the MATLAB workspace, which makes it convenient to convert number formats and to debug. Figure 20.6 shows the software and hardware simulation results of the CSC design for the input image.

Fig. 20.6 Outputs from different implementations
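
Because the hardware multiplies with embedded integer multipliers, the coefficients of Eq. (20.1) have to be represented in a fixed-point number format. The following Python sketch shows one way such a conversion could be modeled in integer arithmetic. The word length is an assumption: the source does not state the number format used in the design, so 8 fractional bits (`FRAC_BITS`) are chosen here purely for illustration, and the names `quantize` and `rgb_to_ycbcr_fixed` are hypothetical. It is not the generated VHDL or the System Generator model.

```python
FRAC_BITS = 8  # assumed number of fractional bits; not taken from the source


def quantize(coeff, frac_bits=FRAC_BITS):
    """Scale a real coefficient by 2**frac_bits and round to an integer."""
    return int(round(coeff * (1 << frac_bits)))


# Coefficients of Eq. (20.1), quantized to integers.
COEFFS = [
    [quantize(c) for c in row]
    for row in (
        (0.299, 0.587, 0.114),    # Y row
        (-0.169, -0.331, 0.5),    # Cb row
        (0.5, -0.419, -0.081),    # Cr row
    )
]
OFFSETS = (16, 128, 128)


def rgb_to_ycbcr_fixed(r, g, b):
    """Integer-only model of Eq. (20.1) for 8-bit R, G, B inputs."""
    half = 1 << (FRAC_BITS - 1)  # added before the shift to round to nearest
    out = []
    for row, off in zip(COEFFS, OFFSETS):
        acc = row[0] * r + row[1] * g + row[2] * b   # multiply-accumulate
        val = ((acc + half) >> FRAC_BITS) + off      # rescale, add offset
        out.append(max(0, min(255, val)))            # clamp to 8 bits
    return tuple(out)


if __name__ == "__main__":
    print(COEFFS)
    print(rgb_to_ycbcr_fixed(128, 128, 128))
```

With 8 fractional bits the quantized Y coefficients sum to exactly 256 and each chrominance row sums to 0, so gray inputs are converted without error; for other inputs the result can differ from the floating-point value by one least significant bit. More fractional bits reduce this error at the cost of wider multipliers.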

20.5 Discussion

To provide a proper performance evaluation, the CSC architecture was implemented on a low-cost, readily available Spartan-II development system with the Xilinx 2S200PQ208 chip. The properties of other designs, along with ours, are listed in Table 20.2. As seen from this table, the CSC design proposed in [8] requires 380 CLBs at a clock rate of 55.159 MHz.

Table 20.2 Performance comparison

In contrast, our architecture uses about 323 CLBs with a working frequency of up to 83.271 MHz. Our proposed architecture therefore has lower complexity and better area efficiency, making it a good choice for low-cost hardware.
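
The size of these differences can be worked out directly from the quoted figures. The short Python calculation below uses only the CLB counts and clock rates given in the text; the variable names are illustrative.

```python
ref_clb, ref_mhz = 380, 55.159   # design from [8], as quoted in the text
our_clb, our_mhz = 323, 83.271   # proposed design, as quoted in the text

area_saving = (ref_clb - our_clb) / ref_clb   # fraction of CLBs saved
speedup = our_mhz / ref_mhz                   # ratio of clock rates

print(f"CLB reduction: {area_saving:.1%}")
print(f"Clock-rate ratio: {speedup:.2f}x")
```

On these numbers the proposed design uses about 15% fewer CLBs and runs at roughly 1.5 times the clock rate of the design in [8].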

With the development of FPGA technology, design methodology challenges the various EDA tools to keep pace [11]. Building on the standard development flow, efforts have shifted toward high-level design and synthesis. There are many conversion tools, such as C-to-FPGA, Stateflow-diagram-to-VHDL, and MATLAB-to-FPGA. The features of the Simulink/Xilinx System Generator-to-FPGA [2, 4] flow are discussed as follows.