1 Introduction

Cardiovascular disease was the leading cause of death among chronic non-communicable diseases in 2012, responsible for 17.5 million deaths, or 31% of all global deaths. Of these, an estimated 7.4 million were due to coronary heart disease and 6.7 million to stroke [1]. Cardiac magnetic resonance imaging (MRI) has proven to be a versatile and noninvasive imaging modality. MRI of the left ventricle (LV) is important for assessing stroke volume, ejection fraction, and myocardial mass, as well as regional function parameters such as wall motion and wall thickening [2]. To perform a quantitative analysis of the LV, clinicians need an accurate segmentation, which provides the anatomical and functional information of the heart and can therefore be widely applied in clinical diagnosis [3]. LV segmentation in cardiac MRI images is thus one of the most critical prerequisites for quantitative study of the LV.

So far, in clinical practice, LV segmentation is still largely performed manually. Manual segmentation, however, is heavy and time-consuming work, subjective, and prone to intra- and inter-observer variability. It is therefore attractive to develop accurate, automatic segmentation algorithms for clinical diagnosis and treatment. However, automatic LV segmentation from cardiac MRI images faces several challenges: blood flow causes heterogeneous brightness within the heart chambers; the intensities of papillary muscles are similar to those of the myocardium; apical and basal slices are difficult to segment; the limited resolution of cardiac MRI introduces partial volume effects; cardiac MRI carries intrinsic noise; the heart moves and intensity is inhomogeneous; and the shape and intensity of the heart chambers vary considerably across patients, notably in pathological cases [4,5,6,7]. Owing to these technical barriers, automatic LV segmentation from cardiac MRI remains a challenging problem. Shortcomings of classical LV segmentation methods, i.e., shrinkage, leakage, and sensitivity to initialization, still need to be solved. Furthermore, existing LV segmentation methods focus mainly on segmentation accuracy; in terms of accuracy, our proposed method is comparable to those reported in previous studies. In the context of big data, however, improving accuracy alone is not enough: the training and segmentation times of our proposed method are also extremely short.

Machine learning methods for medical image analysis have addressed this issue by estimating more complex shape and appearance models from annotated training data [8,9,10]. However, the accuracy and time requirements of clinical applications usually mean that these models must be quite complex so that they can learn all appearance and shape variations found in the annotated training data; as a result, the training data have to be large and rich. But acquiring comprehensive annotated training data is a particularly difficult task [11, 12]. Therefore, to reduce both the model complexity and the requirement for large and rich training data, we propose an automatic, accurate and robust LV segmentation method that combines a Hierarchical Extreme Learning Machine (H-ELM) algorithm [13] with an automatic LV location technique for cardiac MRI. In recent years, the Extreme Learning Machine (ELM) [14, 15] has attracted considerable attention. As a powerful classification algorithm, it offers faster learning speed and better generalization performance than traditional feedforward network learning algorithms. The ELM and its variants [16,17,18] have been widely applied in many fields. However, due to its shallow architecture, its feature learning may not be effective for natural signals (e.g., images/videos) [13]. To tackle this problem, the H-ELM framework [13] was proposed, which is composed of two main parts, (1) self-taught feature extraction followed by (2) supervised feature classification, bridged by randomly initialized hidden weights. The H-ELM produces more compact and meaningful feature representations than the original ELM and achieves better generalization with faster learning speed [13]. Meanwhile, the automatic location method makes the best of the spatiotemporal continuity of MRI images to improve segmentation accuracy and reduce segmentation cost. The contributions of the proposed work are as follows.

  1. A new automatic LV location method is proposed. Once the segmentation result of the previous slice is obtained, the segmentation scope of the current slice is determined by applying morphological dilation with a \(3\times 3\) disk structuring element (empirically selected based upon 20 trials), proceeding from the mid-slice image of the LV to the apical and basal slice images, respectively. The same method is applied to the end diastole (ED) and end systole (ES) slices, respectively [19].

  2. A new automatic LV segmentation method based on an H-ELM is developed. To the best of our knowledge, this is the first time the H-ELM, which has been proven to offer better generalization and classification performance with faster learning speed, has been utilized for segmenting LV MRI images.

  3. The average computation time of our LV segmentation is far shorter than that of existing methods.

The remainder of this paper is organized as follows. Section 2 briefly reviews related work on LV segmentation. Section 3 introduces the basic theory of the ELM and the H-ELM. Section 4 describes the image segmentation method in detail. The experimental results of LV segmentation based on the H-ELM are presented in Sect. 5. Section 6 gives the discussion, and Sect. 7 concludes the paper.

2 Related works

In recent years, many methods have been proposed for LV segmentation. They can be classified into two types, according to whether they rely on no or weak prior knowledge, or on strong prior knowledge [5].

2.1 LV segmentation without or with weak prior

LV segmentation with weak or no prior includes image-based methods, pixel (voxel) classification-based methods, region-based methods, edge-based methods, a combined deep-learning and deformable-model method [4], a combined deep-learning and level-set method [11], and deformable models.

Image-based methods include thresholding [20], dynamic programming (DP) [2, 21,22,23,24], and methods based on spatiotemporal continuity and myocardium information [19]. However, Otsu methods [25, 26] can deviate from the optimal threshold, and the DP method sometimes performs poorly in boundary extraction [23, 24].

Pixel (voxel) classification-based methods include statistical models and artificial intelligence-based methods. Statistical models [27, 28] take full advantage of the characteristics of the image gray-level histogram and fit an approximating distribution function, but establishing the distribution function and estimating its parameters are challenging problems. Artificial intelligence-based methods comprise clustering and classification methods. Clustering methods are unsupervised, but clustering may yield non-optimal solutions. Classification methods usually employ artificial neural network (ANN) algorithms such as back-propagation (BP) networks and support vector machines (SVM) [29], but their performance relies on the selection of samples and the extraction of features.

Region-based methods include region growing, splitting, the watershed algorithm, and so on [30]. Region growing depends on the selection of seed points, and the watershed algorithm offers stability and speed, but it is difficult to define stopping criteria for region-based methods.

Edge-based methods exploit gray-level differences between organs to find edges. Their limitation is that performance is affected by noise, spurious edges and weak edges.

Deformable models include snakes [20, 31,32,33], level sets [34,35,36,37], and their variants [38,39,40]. A random active contour scheme for automatic image segmentation has been proposed [41]; it utilizes a parametric shape prior and integrates region and boundary information into a generalized energy function to be minimized. This method, however, requires prior knowledge.

2.2 LV segmentation with strong prior

This category consists of shape-prior-based deformable models, active shape models (ASM) and active appearance models (AAM), and atlas-based methods.

Deformable models with a strong prior adopt the variational framework and modify the energy functional to be minimized by introducing a new term that embeds an anatomical constraint on the deforming contour [42].

The ASM consists of a statistical shape model, called the Point Distribution Model (PDM), and a method for searching for the model in an image [8, 43]. Combinations of the AAM and the ASM have also been used [44].

In atlas-based methods, an atlas can be generated by manually segmenting an image or by integrating information from multiple segmented images of different individuals [45,46,47]. This approach ensures that the segmentation does not leak excessively, but it also limits flexibility.

3 Brief introduction to H-ELM

3.1 ELM theory

The ELM is a learning algorithm whose speed can be thousands of times faster than that of traditional feed-forward network learning algorithms, and which achieves better generalization performance [48].

Given N arbitrary distinct samples \(({\rm X}_{\rm i},{\rm t}_{\rm i}), i=1,\dots ,N,\) where \({\rm X}_{\rm i} =\left[ x_{i1},x_{i2},\dots ,x_{in}\right] ^{\rm T} \in R^n\) and \({\rm t}_{\rm i}=\left[ t_{i1},t_{i2},\dots ,t_{im} \right] ^{\rm T} \in R^m,\) standard SLFNs with M hidden nodes and activation function g(x) are modeled as

$$\begin{aligned} \sum \limits _{i=1}^M \beta _i g_i (\rm {X}_j)=\sum \limits _{i=1}^M\beta _i g(\rm {W}_i \cdot \rm {X}_j + b_i)=o_j,\quad (j=1,\dots ,N) \end{aligned}$$
(1)

where M is the number of hidden layer nodes, \(\rm {W}_i=\left[ w_{i1},w_{i2},\dots ,w_{in} \right]\) is the input weight vector, \(\beta _i=\left[ \beta _{i1},\beta _{i2},\dots ,\beta _{im} \right] ^{\rm T}\) is the output weight vector, and \(b_i\) is the threshold of the ith hidden node. \(\rm {W}_i \cdot \rm {X}_j\) is the inner product of \(\rm {W}_i\) and \(\rm {X}_j\) [49]. The output of the ELM can be written compactly as

$$\begin{aligned} f(\rm {X}_j)=\sum \limits _{i=1}^M \beta _i g(\rm {W}_i \cdot \rm {X}_j+b_i) \end{aligned}$$
(2)

These N equations can be written compactly as \(\rm {H}\beta =\rm {T},\) where

$$\begin{aligned} & \rm{H}(\rm {W}_1,\dots ,\rm {W}_M,b_1,\dots ,b_M,\rm {X}_1,\dots ,\rm {X}_N) \\ & \quad = \begin{bmatrix} g(\rm {W}_1 \cdot \rm {X}_1+b_1)&\dots&g(\rm {W}_M \cdot \rm {X}_1+b_M)\\ \vdots&\dots&\vdots \\ g(\rm {W}_1 \cdot \rm {X}_N+b_1)&\dots&g(\rm {W}_M \cdot \rm {X}_N+b_M) \end{bmatrix} _{N \times M} \end{aligned}$$
(3)
$$\begin{aligned} \beta = \begin{bmatrix} \beta _1^{\rm T}\\ \vdots \\ \beta _M^{\rm T} \end{bmatrix} _{M \times m}\quad \rm{and}\quad {\rm T}= \begin{bmatrix} \rm {t}_1^{\rm T}\\ \vdots \\ \rm {t}_N^{\rm T} \end{bmatrix} _{N \times m} \end{aligned}$$
(4)

As long as the activation function is infinitely differentiable, the input weights and hidden layer biases can be randomly generated [49]. They then require no further adjustment, and training an SLFN is simply equivalent to finding a least-squares solution \(\hat{\beta }\) of the linear system \(\rm {H}\beta =\rm {T}:\)

$$\begin{aligned}&\left\| \rm {H}(\rm {W}_1,\dots ,\rm {W}_M,b_1,\dots ,b_M)\hat{\beta }-\rm {T} \right\| \nonumber \\&\quad =\min \limits _{\beta } \left\| \rm {H}(\rm {W}_1,\dots ,\rm {W}_M, b_1,\dots ,b_M)\beta -\rm {T} \right\| \end{aligned}$$
(5)

If the number M of hidden nodes equals the number N of distinct training samples, the matrix \(\rm {H}\) is square and invertible, and the SLFN can approximate the training samples with zero error. In most cases, however, the number of hidden nodes is much smaller than the number of distinct training samples, \(M\ll N;\) then \(\rm {H}\) is a non-square matrix and there may not exist \(\rm {W}_i,b_i,\beta _i\) such that \(\rm {H} \beta = \rm {T}.\) The smallest-norm least-squares solution of the linear system is then

$$\begin{aligned} \hat{\beta }=\rm {H}^\dagger \rm {T} \end{aligned}$$
(6)

where \(\rm {H}^{\dagger }\) is the Moore–Penrose generalized inverse of matrix \(\rm {H}.\)
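To make this concrete, the following is a minimal NumPy sketch of ELM training and prediction under the equations above; it is an illustrative sketch, not the authors' MATLAB implementation.

```python
import numpy as np

def elm_train(X, T, M, seed=0):
    """X: (N, n) inputs; T: (N, m) targets; M: hidden nodes. Returns (W, b, beta)."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], M))   # random input weights, never tuned
    b = rng.standard_normal(M)                 # random hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))     # hidden layer output matrix, Eq. (3)
    beta = np.linalg.pinv(H) @ T               # Moore-Penrose solution, Eq. (6)
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Evaluate the trained SLFN, Eq. (2)."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta
```

Because W and b are fixed once randomly generated, training reduces to the single pseudoinverse of Eq. (6), which is the source of the ELM's speed.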

3.2 H-ELM theory

The H-ELM training architecture is structurally separated into two independent phases: (1) unsupervised hierarchical feature representation and (2) supervised feature classification. In the first phase, a new ELM-based autoencoder is designed to extract multilayer sparse features of the input data; in the second, the original ELM-based regression is applied for the final decision [13].

3.2.1 Unsupervised feature learning

First, the original input data are transformed into an ELM random feature space, which helps exploit hidden information among the training data. Then unsupervised learning is performed to obtain high-level sparse features [50]. The output of the ith hidden layer can be represented mathematically as

$$\begin{aligned} \rm {H}_i= g (\rm {H}_{i-1}\cdot \beta ) \end{aligned}$$
(7)

where \(\rm {H}_i\) denotes the output of the ith hidden layer (\(i\in {[1, \rm {K}]}\)), \(\rm {H}_{i-1}\) is the output of the (i-1)th layer, \(g(\cdot )\) is the activation function of the hidden layers, and \(\beta\) is the output weight matrix. Note that each hidden layer of the H-ELM is independent of the others and acts as a separate feature extractor; the more layers, the more compact the resulting features. Unlike deep learning frameworks, in which all the hidden layers are gathered together as a whole after unsupervised initialization and the whole system is retrained iteratively by BP-based NNs, the H-ELM requires no such fine-tuning. After unsupervised hierarchical training, the outputs of the Kth layer, i.e., \(\rm {H}_K,\) are regarded as the high-level features extracted from the input data. Before classification, they are randomly projected and then used as the inputs of the supervised ELM-based regression to obtain the final classification result. To accelerate learning, the H-ELM framework is constructed on random mapping and makes full use of the universal approximation capability of the ELM in both phases of the framework. According to [13, Theorem 2.1], using randomly mapped features as inputs, the H-ELM can approximate or classify any input data.
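As a concrete illustration of Eq. (7), the following minimal Python sketch propagates features forward through the stacked layers; each weight matrix `beta` is assumed to come from an ELM sparse autoencoder as described in Sect. 3.2.2 and is applied transposed so that the matrix shapes match. This is an illustrative assumption, not the authors' code.

```python
import numpy as np

def helm_features(X, betas):
    """Forward pass of Eq. (7): H_i = g(H_{i-1} . beta_i), with sigmoid g."""
    H = X
    for beta in betas:                            # each beta learned independently,
        H = 1.0 / (1.0 + np.exp(-(H @ beta.T)))   # no BP fine-tuning between layers
    return H                                      # H_K: inputs to the final ELM
```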

3.2.2 ELM sparse autoencoder

As mentioned above, the H-ELM comprises two independent phases: (1) unsupervised training and (2) supervised training. Since the latter phase is performed by the original ELM, we focus here on the former, the autoencoder. The autoencoder aims to approximate the input data by making the reconstructed outputs as similar to the inputs as possible [13].

The universal approximation capability of the ELM is used to construct the autoencoder [51]; meanwhile, a sparsity constraint is imposed on the autoencoder optimization, hence the name ELM sparse autoencoder. Following ELM theory [52], the autoencoder is initialized without fine-tuning. To obtain sparser and more compact features of the inputs, the optimization model of the ELM sparse autoencoder is expressed as follows:

$$\begin{aligned} O_\beta =\rm {argmin}\{\left\| H\beta -X\right\| ^2+\left\| \beta \right\| _{\ell _1}\} \end{aligned}$$
(8)

where \(\rm {X}\) is the input data, \(\rm {H}\) is the random-mapping output, which need not be optimized [15], and \(\beta\) denotes the hidden layer weights. \(\ell _1\) optimization has been proven to be a better solution for data recovery and other applications [53, 54].
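A minimal sketch of one way to solve Eq. (8) is the iterative shrinkage-thresholding algorithm (ISTA), shown below; the original work uses a fast \(\ell _1\) solver, and the 1/2 scaling of the data term here is an implementation convenience, not taken from the paper.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of the l1 norm."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def elm_sparse_autoencoder(X, M, lam=1e-3, iters=200, seed=0):
    """ISTA for beta = argmin (1/2)||H beta - X||^2 + lam*||beta||_1, cf. Eq. (8)."""
    rng = np.random.default_rng(seed)
    N, n = X.shape
    W = rng.standard_normal((n, M))            # random mapping: never optimized [15]
    b = rng.standard_normal(M)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))     # random feature space, shape (N, M)
    L = np.linalg.norm(H, 2) ** 2              # Lipschitz constant of the gradient
    beta = np.zeros((M, n))
    for _ in range(iters):
        grad = H.T @ (H @ beta - X)            # gradient of the data term
        beta = soft_threshold(beta - grad / L, lam / L)
    return beta                                # sparse weights; applied as beta.T in Eq. (7)
```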

4 Methods

The whole segmentation algorithm comprises pre-processing the training data, training the H-ELM, pre-processing the testing data, classification, and post-processing, as shown in Fig. 1.

Fig. 1

Workflow of the proposed segmentation algorithm. a The workflow of the training procedure. b The workflow of the testing procedure

4.1 Pre-processing training data

The procedure of pre-processing training data consists of the following steps (as shown in Fig. 1a):

  1. Based on image clarity and on whether varying amounts of endocardial trabeculae and papillary muscles were included, 186 images from 10 cardiac MRI cases were selected as sample images, for which the ground truth had been acquired.

  2. For each sample image, all pixels within the LV region of the ground truth were selected and labeled as 1.

  3. The LV region was extended by morphological dilation with a \(5\times 5\) disk structuring element (empirically selected based upon 20 trials); all pixels in the new region adjacent to the LV of the ground truth were then selected and labeled as 0.

  4. Four kinds of features were extracted for each pixel (see the sketch after this list): 3-dimensional gray-level features (gray-level value, gray mean value and gray median); 20-dimensional gray-level co-occurrence matrix features [55] (energy, contrast, correlation, entropy and inverse moment in four directions) computed over an \(11\times 11\) window; 9-dimensional histogram-of-oriented-gradients features calculated within \(17 \times 17\) cell blocks with nine histogram bins, similar to [56]; and 18-dimensional local binary pattern features [57, 58] over a \(5\times 5\) neighborhood chosen empirically, amounting to 50-dimensional features in total [59].

  5. The feature vectors of all selected pixels of an image were concatenated to generate a feature matrix.

  6. Steps 2–5 were applied to every sample image, and all the resulting feature matrices were merged into a single feature matrix, each value of which was normalized to [0, 1].
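The following Python sketch illustrates the feature extraction of step 4 for a single pixel using scikit-image. The window sizes and feature counts follow the text, but the exact statistics and settings (e.g., taking the GLCM "inverse" feature as homogeneity, and a 16-neighbor uniform LBP giving exactly 18 codes) are our assumptions, not the authors' MATLAB code.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops, hog, local_binary_pattern

def pixel_features(img, r, c):
    """50-dimensional feature vector for pixel (r, c); img is a uint8 grayscale
    image and (r, c) is assumed to lie at least 8 pixels from the border."""
    feats = []
    # 3 gray-level features from an 11x11 neighborhood
    patch = img[r - 5:r + 6, c - 5:c + 6].astype(float)
    feats += [float(img[r, c]), patch.mean(), float(np.median(patch))]
    # 20 GLCM features: 5 statistics x 4 directions over the same 11x11 window
    glcm = graycomatrix(img[r - 5:r + 6, c - 5:c + 6], distances=[1],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=256, symmetric=True, normed=True)
    for prop in ('energy', 'contrast', 'correlation', 'homogeneity'):
        feats += list(graycoprops(glcm, prop).ravel())  # homogeneity ~ inverse moment
    p = glcm[:, :, 0, :]                                # entropy is not in graycoprops
    feats += list(-np.sum(p * np.log2(p + 1e-12), axis=(0, 1)))
    # 9 HOG features: a single 17x17 cell with nine orientation bins
    cell = img[r - 8:r + 9, c - 8:c + 9].astype(float)
    feats += list(hog(cell, orientations=9, pixels_per_cell=(17, 17),
                      cells_per_block=(1, 1)))
    # 18 LBP features: uniform LBP with 16 neighbors has 18 distinct codes;
    # the histogram is taken over the central 5x5 neighborhood
    lbp = local_binary_pattern(img[r - 6:r + 7, c - 6:c + 7], P=16, R=2,
                               method='uniform')
    feats += list(np.histogram(lbp[4:9, 4:9], bins=18, range=(0, 18))[0])
    return np.asarray(feats, dtype=float)
```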

4.2 Training H-ELM

Training the H-ELM aims to find optimal parameters using the obtained feature matrix. The \(\ell _1\) penalty of the last-layer ELM was \(2^{-50}\) and the scaling factor was \(S =0.8.\) The ELM kernel used in the proposed algorithm was the sigmoid function and the number of hidden nodes was 100; these values were selected empirically based upon 20 trials, owing to the randomness of the input weights and hidden layer biases.
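Combining the sketches from Sects. 3.1 and 3.2.2, the training phase could look roughly as follows. The hyperparameters (penalty \(2^{-50}\), scaling factor 0.8, sigmoid activation, 100 hidden nodes) follow the text, while the two-layer depth and the names `feature_matrix` and `labels_one_hot` are illustrative assumptions.

```python
import numpy as np

# feature_matrix: (num_pixels, 50) training features, normalized to [0, 1] (Sect. 4.1)
# labels_one_hot: (num_pixels, 2) one-hot labels, LV vs. non-LV  (hypothetical names)
X_train, T_train = feature_matrix, labels_one_hot

betas = []
H = 0.8 * X_train                                    # scaling factor S = 0.8
for _ in range(2):                                   # assumed two autoencoder layers
    beta = elm_sparse_autoencoder(H, M=100, lam=2**-50)
    H = 1.0 / (1.0 + np.exp(-(H @ beta.T)))          # Eq. (7)
    betas.append(beta)
W, b, beta_out = elm_train(H, T_train, M=100)        # final supervised ELM
```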

Fig. 2

The images within the procedure of segmentation. a The original cardiac MRI mid-slice image. b A circle centered on the MRI image with a radius of 50 pixels. c The binary image of the original image. d The initial location image of the LV. e The final location image of the LV after dilation; the red contour includes almost all pixels of the LV. f The segmentation result, with LV pixels in red

4.3 Pre-processing testing data

The procedure of pre-processing testing data consists of the following steps (as shown in Fig. 1b):

  1. At the same temporal phase, the LV in the mid-slice image (Fig. 2a) is always the biggest and roundest, so it was segmented first. To reduce computational complexity and time [4], a circle centered on the MRI image with a radius of 50 pixels was drawn, and all pixels outside the circle were set to 0 (Fig. 2b). A fitting threshold was then found using the Otsu method, and the image (Fig. 2b) was converted into a binary image (Fig. 2c).

  2. The roundness, area and centroid of each object were calculated; based on an overall consideration of these properties, the LV was located approximately (Fig. 2d; a code sketch of steps 1–2 follows this list).

  3. As shown in Fig. 2d, owing to intensity inhomogeneity and the presence of endocardial trabeculae and papillary muscles in the LV cavity, not all LV contours are satisfactory; therefore, the LV was extended by the same method as in step 3 of Sect. 4.1. The extended region included almost all pixels of the LV and was regarded as the testing pixel set, as shown in Fig. 2e.

  4. By the same method as in step 4 of Sect. 4.1, 50-dimensional features were extracted for each pixel of the testing pixel set to generate a feature matrix.
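The sketch below illustrates steps 1–2 in Python with scikit-image. The circle radius and the use of Otsu thresholding follow the text, while the scoring that trades off roundness, area and centroid distance is an assumption standing in for the paper's "overall consideration".

```python
import numpy as np
from skimage.filters import threshold_otsu
from skimage.measure import label, regionprops

def locate_lv(img, radius=50):
    """Approximate LV location in a mid-slice image (Fig. 2a-d)."""
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    circle = (yy - h / 2) ** 2 + (xx - w / 2) ** 2 <= radius ** 2
    masked = np.where(circle, img, 0)                 # Fig. 2b
    binary = masked > threshold_otsu(masked[circle])  # Fig. 2c
    best, best_score = None, -np.inf
    for region in regionprops(label(binary)):
        # roundness = 4*pi*area / perimeter^2 (1 for a perfect disk)
        roundness = 4 * np.pi * region.area / (region.perimeter ** 2 + 1e-12)
        dist = np.hypot(region.centroid[0] - h / 2, region.centroid[1] - w / 2)
        score = roundness + 0.01 * region.area - 0.02 * dist  # assumed weighting
        if score > best_score:
            best, best_score = region, score
    return best                                       # approximate LV object, Fig. 2d
```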

4.4 Classification

The feature matrix was input into the trained H-ELM, and all pixels were classified into two classes: one belonging to the LV and the other to the non-LV area.
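Continuing the sketches above, classification of the test pixels could look as follows; `roi_pixels` (the coordinates inside the dilated location mask) and the reuse of `pixel_features`, `betas`, `W`, `b` and `beta_out` from the earlier sketches are illustrative assumptions.

```python
import numpy as np

# roi_pixels: list of (row, col) coordinates inside the dilated LV location mask
feats = np.vstack([pixel_features(img, r, c) for (r, c) in roi_pixels])
H = 0.8 * feats                                   # same scaling as in training
for beta in betas:
    H = 1.0 / (1.0 + np.exp(-(H @ beta.T)))       # hierarchical features, Eq. (7)
scores = elm_predict(H, W, b, beta_out)
is_lv = scores.argmax(axis=1) == 1                # class 1 = LV, class 0 = non-LV
```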

4.5 Post-processing

The contour of the LV was depicted and smoothed using open-close operations from mathematical morphology; the segmentation result is shown in Fig. 2f. To segment the adjacent slice image in the superior and/or inferior direction, the contour of the LV was extended by the same method as in step 3 of Sect. 4.1, and the extended LV was used as a binary mask to locate the LV in the next image. The next image was then segmented by the methods of Sects. 4.3, 4.4 and 4.5. That is, the contour region derived from the preceding slice image was dilated and used to locate the LV of the next slice image up to the apical slice, and likewise from the mid-slice to the basal slice. In the dataset, all slices of each case were divided into two classes, the end diastole (ED) and end systole (ES) slices, and the above method was applied to each class separately.
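A minimal sketch of this post-processing and slice-to-slice propagation using scipy.ndimage follows; the square structuring elements stand in for the \(3\times 3\) and \(5\times 5\) disks mentioned in the text, which is an approximation on our part.

```python
import numpy as np
from scipy import ndimage

disk3 = np.ones((3, 3), dtype=bool)   # square stand-in for the 3x3 disk
disk5 = np.ones((5, 5), dtype=bool)   # square stand-in for the 5x5 disk

def postprocess(mask):
    """Smooth the LV contour with morphological open-close (Fig. 2f)."""
    mask = ndimage.binary_opening(mask, structure=disk3)
    return ndimage.binary_closing(mask, structure=disk3)

def next_slice_roi(prev_mask):
    """Dilate the previous result into the search mask for the adjacent slice."""
    return ndimage.binary_dilation(prev_mask, structure=disk5)

def apex_fallback(prev_mask):
    """If classification fails near the apex (segmented area == 0), fall back
    to the eroded previous segmentation (see Sect. 6)."""
    return ndimage.binary_erosion(prev_mask, structure=disk3)
```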

5 Results

5.1 Data set

In this paper, the dataset consists of cardiac cine MRI short-axis images from the General Hospital of Shenyang Military. There are a total of 30 cases (19 males, 11 females, mean age 48.4\(\pm {15.6}\) years), including heart failure, coronary heart disease, hypertrophy and normal cases. Cine CMR image data were acquired with a 2D SSFP pulse sequence on 1.5 T (General Electric) scanners. In each case, the LV was imaged in 8–16 short-axis slices from the level of the mitral valve annulus through the LV apex. Typical CMR parameters were: repetition time (TR) 3.5 ms, echo time (TE) 1.6 ms, flip angle 60°, matrix size \(256\times 256,\) image dimensions \(256\times 256,\) receiver bandwidth 125 kHz, FOV 360 mm, slice thickness 6–8 mm, slice gap 2–4 mm.

5.2 Evaluation measures

In this paper, several measures were used in our experiments to test and compare the segmentation results of our proposed method, the level set method, the SVM method and Hu's method [27]: mean absolute deviation (Mad), maximum absolute deviation (Maxd), Dice similarity coefficient (Dsc) and segmentation time.
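For reference, these measures are commonly defined as follows (standard definitions, not restated in the original):

$$\begin{aligned} \rm{Dsc}=\frac{2\left| A\cap B \right| }{\left| A \right| +\left| B \right| },\quad \rm{Mad}=\frac{1}{N}\sum \limits _{i=1}^{N} d(p_i, C),\quad \rm{Maxd}=\max \limits _{i} d(p_i, C) \end{aligned}$$

where A and B denote the automatically segmented and ground-truth LV regions, \(p_i\) (\(i=1,\dots ,N\)) are the points of the automatic contour, C is the ground-truth contour, and d(p, C) is the minimum distance from point p to contour C.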

5.3 Performance evaluation

In this section, the performance of LV segmentation based on an H-ELM was studied by evaluating its efficiency and effectiveness. The algorithm was coded in MATLAB 2014a. All experiments were conducted on a 2.0-GHz PC with 16 GB of memory running Windows 7. The same pre-processing and post-processing methods were used for the proposed method and the SVM method, and a similar location method was used for the level set method [36] and Hu's method [27].

Table 1 lists the Mad, Maxd and Dsc of our proposed method, the level set, the SVM and Hu's method over 20 patients. From Table 1, the mean Mad of images segmented by our proposed method was about 67.9, 81.3 and 98.7% of that of the level set, the SVM and Hu's method, respectively, and the Mad standard deviation (Std) of our proposed method was the lowest of the four methods. The mean Maxd of our proposed method was about 63.5, 77.3 and 98.0% of that of the level set, the SVM and Hu's method, respectively, and its Maxd Std was likewise the lowest of the four. The mean Dsc of our proposed method was about 13.7, 9.3 and 0.5% higher than that of the level set, the SVM and Hu's method, respectively, and its Dsc Std was again the lowest. Similar results are seen in Table 2: the mean speed of our proposed method was about 38.3, 6.7 and 23.8 times faster than that of the level set, the SVM and Hu's method, respectively, and the Std of its segmentation time was the lowest of the four methods.

Fig. 3

Box plots of evaluation measures. a Box plots of Mad for our proposed method, the level set, the SVM and Hu's method. b Box plots of Maxd for our proposed method, the level set, the SVM and Hu's method. c Box plots of Dsc for our proposed method, the level set, the SVM and Hu's method. d Box plots of segmentation time for our proposed method, the level set, the SVM and Hu's method; vertical coordinates are logarithmic

Table 1 Mean segmentation accuracy of the H-ELM method (H), the level set method (L), the SVM method (S) and Hu's method (U) (a total of 20 patients)
Table 2 Mean segmentation time of the H-ELM method (H), the level set method (L), the SVM method (S) and Hu’s method (U) (a total of 20 patients)

To further evaluate the performance of our proposed method, the local distributions of segmentation errors and the similarity between the segmentation results and the ground truth are illustrated in Fig. 3. The boxplots indicate the median and the lower and upper quartiles of the Mad, Maxd, Dsc and time of our proposed method, the level set, the SVM and Hu's method. It is notable from Fig. 3 that our proposed method outperformed the level set, the SVM and Hu's method, obtaining higher Dsc and lower Mad, Maxd and time.

6 Discussion

In this study, an automatic LV segmentation method based on an H-ELM model is developed and validated. The whole framework is divided into two components: training an H-ELM, and segmenting LVs from cardiac MRI images with the trained H-ELM. The former is composed of self-taught feature extraction and supervised feature classification [13]: the self-taught feature extraction yields more compact and meaningful feature representations than the original ELM; the hierarchically encoded outputs are then randomly projected to achieve better generalization with faster learning speed; the supervised feature classification is performed by an original ELM; and the hidden layers of this framework are trained in a forward manner. The latter component consists of LV location followed by LV segmentation. Owing to the complexity of cardiac MRI, LV location is a prerequisite for H-ELM-based LV segmentation, and the segmentation result is directly affected by the location accuracy. First, the mid-slice, in which the LV is biggest and roundest, is located using a fitting threshold from the Otsu method. Second, the LV is segmented by the trained H-ELM; the segmentation result is dilated with a \(5\times 5\) disk structuring element (empirically selected), and the contour of the dilated region is used as a binary mask to locate the LV of the adjacent image, from the mid-slice to the apical slice and to the basal slice, respectively. The same method is performed for the end diastole (ED) and end systole (ES) slices, respectively. When the segmentation approaches the most apical slice, an unsuccessful classification leads to failure not only of the current segmentation but also of subsequent segmentations, for example when the area of the currently segmented LV equals 0. Accordingly, the current segmentation result is then replaced by the previous segmentation result eroded with a \(3\times 3\) disk structuring element (empirically selected). The previous segmentation result is utilized as the segmentation mask of the current slice, which may prevent leakage into surrounding areas to some extent. In a word, the segmentation result of the previous slice is crucial to the segmentation of the current slice. Because all parameters in our proposed method are empirically selected, they cannot be guaranteed to be optimal for feature extraction and classification in all cases.

In the future, several points can be explored to improve the LV segmentation results. First, the LV should be segmented from a 3D point of view, which may make the best of the features of each voxel while reducing problem complexity and computational time. Second, all parameters of the H-ELM are currently selected empirically; ideally, a systematic analytic approach is needed to obtain optimal values for all parameters. Finally, as the training data grow, all parameters of the H-ELM could be updated incrementally to further produce more accurate segmentation results.

7 Conclusions

In summary, a new method for automatic LV segmentation from cardiac MRI images is proposed. The method takes into account the intensity inhomogeneity that often occurs in the LV cavity and causes many difficulties in image segmentation, as well as cardiac MRI image scaling, shifts and spatiotemporal continuity. Experimental results show that our proposed method outperforms the level set, the SVM and Hu's method. The results of this study demonstrate that our proposed method is fast, robust, efficient and satisfactory for LV segmentation.