Introduction

Economic growth and recent developments in the hospitality and tourism industry have affected the culinary field, one of the industry’s important niche areas. Demand is increasing for highly skilled, competent culinary professionals in the restaurant, catering, and hotel sectors. However, the high turnover rate of culinary professionals is one of the most confounding employment issues for the profession. Additionally, recent studies have shown that culinary graduates and young chefs lack the knowledge, skills, and abilities needed to perform their jobs. This study is therefore concerned with measuring the mastery level of the competencies required for a successful career in the culinary profession. A comprehensive competency measurement instrument developed specifically for the profession would be beneficial because it can provide data on the competencies of current employees in the industry. The establishment of a genuinely valid competency-based assessment approach can yield great benefit, not only to the professions, but to the whole community (Greenstein 2012; Gonczi et al. 1993).

Assessment should be precise and technically sound, producing accurate information for decision making in all circumstances (Dubois and Rothwell 2000; Stetz and Chmielewski 2015). Survey techniques are common among social science researchers because most data collection relies on self-reported data. Nevertheless, the credibility of the instrument as a reliable and valid measurement tool is often overlooked. Establishing the reliability and validity of an instrument is crucial for maintaining its accuracy. To improve the survey instrumentation, the Rasch measurement model was used as a data-driven model to measure culinary professionals’ self-assessment of their competencies. The Rasch measurement model was chosen for this study because it offers a more sophisticated approach to evaluating patterns of item responses, scale performance, and item performance (Linacre 2002; Bond and Fox 2007, 2015; Chen et al. 2014).

Methodology

The instrument was tested among culinary professionals in 20 commercial kitchens of 4-star and 5-star hotels in Peninsular Malaysia. The SC-SAT instrument was administered to 111 culinary professionals through a survey conducted over three months.

Data Analysis and Findings

This section describes the data analysis and findings from the instrument testing. The questionnaire data were analyzed using Winsteps, a software package based on the Rasch measurement model, to test reliability and validity.

Demographic Profile

A total of 111 respondents from 20 hotels in Peninsular Malaysia were involved in the study. Table 1 shows the demographic profile of the respondents. Sixty-one respondents work in 5-star hotels and 50 work in 4-star hotels. The majority of the respondents are Malay (83.8 %), followed by Chinese (9 %), Indian (4.5 %), and others (2.7 %).

Table 1 Respondents’ demographic profile

By gender, there are 74 (66.7 %) male and 37 (33.3 %) female culinary professionals. Among these 111 respondents, 52.3 % hold managerial-level positions (Chef de Partie and above) while 47.7 % work at the nonmanagerial level (Commis and Cook). Job titles vary across hotel managements; there are 16 different job titles among the respondents. In terms of educational background, the majority are Diploma holders (53.2 %), followed by high school graduates (43.2 %). Three respondents hold a Degree and one holds a postgraduate qualification. As for culinary training and education, 66 respondents (59.5 %) reported that they learnt culinary skills at culinary schools or institutions, while the other 40.5 % learnt through experience. In total, 83.8 % of the respondents have experience working in foreign countries. A large percentage of the respondents (84.7 %) do not have the MOSQ certification.

Person and Item Reliability and Separation Index

In the third instrument testing, the person reliability is 0.99 with a person separation index of 8.78. Person reliability is interpreted in the same way as Cronbach’s alpha (KR-20), which is also 0.99. The person separation index of 8.78 indicates that the instrument can distinguish about nine levels of person ability. With 111 persons responding to the 159 items in the SC-SAT instrument, Table 2 shows an item reliability of 0.94 with an item separation index of 4.02.
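The reported reliability and separation values can be cross-checked against each other. The following is a minimal sketch, assuming the standard Rasch relationship R = G² / (1 + G²) between a separation index G and its reliability R; the function name is illustrative and is not part of Winsteps.

```python
def reliability_from_separation(separation: float) -> float:
    """Reliability implied by a separation index G, assuming R = G^2 / (1 + G^2)."""
    return separation ** 2 / (1.0 + separation ** 2)


# Separation indexes reported in the text for persons and items.
person_reliability = reliability_from_separation(8.78)
item_reliability = reliability_from_separation(4.02)

print(round(person_reliability, 2))
print(round(item_reliability, 2))
```

After rounding to two decimals, both values agree with the reliabilities of 0.99 and 0.94 reported above.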

Table 2 Item reliability and separation index for each construct in SC-SAT

This finding suggests that, if the SC-SAT instrument were given to another sample with the same characteristics, the item ordering would be expected to replicate with a reliability of 0.94. The separation index of 4.02 means that the items in the SC-SAT can be categorized into four levels of difficulty. Table 3 shows the item reliability and separation values obtained for the six constructs in SC-SAT. Most of the constructs show an item reliability of 0.70 or greater, with values ranging from 0.70 to 0.96. Physical state and self-concept show item separation indexes below 2 (1.66 and 1.54).

Table 3 Person reliability and separation index for each construct in SC-SAT

Based on Table 3, all of the constructs are accepted because their item separation indexes are equal to or higher than 2, which is considered acceptable, except for the physical state construct, which needs to be revised because its item separation is 1.72. However, the person reliability of 0.75 for the physical state construct is satisfactory for further analysis.

Item Polarity

Based on Table 4, all of the correlation coefficients are positive for each construct, showing that the items validly measure the competencies (Linacre 2002). No items need to be dropped on the basis of polarity because all items move in the same direction as their constructs.

Table 4 Polarity of items

Fit Statistics

Table 5 summarizes the Item Fit and Person Fit analysis for the instrument testing. Based on the table, ten respondents were identified as misfitting persons in measuring the six constructs of competency: persons ID88, ID104, ID112, ID110, ID76, ID41, ID49, ID0, ID64, and ID40.

Table 5 Item and person fit for SC-SAT

Accordingly, Table 6 shows the detailed analysis of Person Fit. The analysis shows that these persons do not meet the fit requirements of the Rasch model. It is therefore suggested that they be removed from the analysis.

Table 6 Analysis of person misfit SC-SAT

Further, Table 7 shows the misfitting items in the SC-SAT instrument. Nine items were found to have an Infit MNSQ above 1.4 and a ZSTD above 2. Only one item, SVC5 (I can apply stalls arrangement concept), has an Infit MNSQ below 0.6 and a ZSTD below −2.00.
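The screening rule described above can be sketched as follows. This is a minimal illustration, assuming an item is flagged only when both its Infit MNSQ and its ZSTD fall outside the stated limits; the function name and the example values are hypothetical and are not taken from Table 7.

```python
def classify_item_fit(infit_mnsq, zstd, mnsq_low=0.6, mnsq_high=1.4, z_limit=2.0):
    """Label an item from its Infit MNSQ and ZSTD using the cut-offs in the text."""
    if infit_mnsq > mnsq_high and zstd > z_limit:
        return "misfit: too erratic"
    if infit_mnsq < mnsq_low and zstd < -z_limit:
        return "misfit: too predictable"
    return "acceptable"


# Hypothetical (Infit MNSQ, ZSTD) pairs for illustration only.
examples = {
    "item_x": (1.62, 3.1),
    "item_y": (0.55, -2.4),
    "item_z": (1.05, 0.4),
}
labels = {name: classify_item_fit(*stats) for name, stats in examples.items()}
print(labels)
```

Requiring both statistics to exceed their limits is a conservative choice; a stricter screen could flag an item on either statistic alone.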

Table 7 Analysis of item misfit for SC-SAT instrument

Figure 1 presents the bubble chart generated by the Rasch analysis. Items that fit the model’s expectations are located within the acceptable range between −2.0 and +2.0. Items located on the right (>+2.0) are too erratic to be useful, whereas items on the left (<−2.0) are too good to be true.

Fig. 1 Visual presentations of fit (quality control) for SC-SAT instrument

Item Dimensionality

Based on Table 8, the raw variance explained by measures is 41.8 %, whereas the unexplained variance in first contrast is 5.5 %.

Table 8 Standardized residual variance (in eigenvalue)

Table 9 shows the raw variance explained by measures and the unexplained variance in the first contrast for each construct. The raw variance explained by measures ranges from 42.6 % (nontechnical) to 74.7 % (physical state), which is above the Rasch measurement requirement of 40 %. The unexplained variance in the first contrast ranges from 7.1 % (nontechnical) to 13.3 % (physical state), which is also acceptable because it is below 15 %.
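The two dimensionality criteria can be expressed as a simple check. The sketch below is illustrative only: it applies the 40 % and 15 % cut-offs stated in the text to the values reported for the whole instrument and for the nontechnical and physical state constructs; the function name is hypothetical.

```python
def meets_dimensionality_criteria(explained_pct, first_contrast_pct,
                                  min_explained=40.0, max_first_contrast=15.0):
    """True when variance explained exceeds 40 % and first-contrast variance is below 15 %."""
    return explained_pct > min_explained and first_contrast_pct < max_first_contrast


# (raw variance explained by measures, unexplained variance in first contrast), in percent.
reported = {
    "whole instrument": (41.8, 5.5),
    "nontechnical": (42.6, 7.1),
    "physical state": (74.7, 13.3),
}
results = {name: meets_dimensionality_criteria(*values) for name, values in reported.items()}
print(results)
```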

Table 9 Standardized residual variance (in eigenvalue) for each construct

Standardized Residual Correlation

The largest standardized residual correlation that is used to identify dependent items is displayed in Table 10. There are ten pairs of items that need to be revised because the value is more than 0.70, meaning that these items are highly correlated with each other.
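As an illustration of how dependent item pairs can be screened, the sketch below computes Pearson correlations between the standardized residuals of each item pair and flags pairs above 0.70. The residual values and item labels are hypothetical and are not taken from Table 10.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of residuals."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def dependent_pairs(residuals, cutoff=0.70):
    """Return (item, item, correlation) for every pair correlated above the cutoff."""
    flagged = []
    for item_a, item_b in combinations(residuals, 2):
        r = pearson(residuals[item_a], residuals[item_b])
        if r > cutoff:
            flagged.append((item_a, item_b, round(r, 2)))
    return flagged


# Hypothetical standardized residuals for three items across five persons.
residuals = {
    "item_a": [0.5, -1.2, 0.3, 1.1, -0.7],
    "item_b": [0.6, -1.0, 0.2, 1.3, -0.9],
    "item_c": [-0.4, 0.9, -1.1, 0.2, 0.5],
}
flagged = dependent_pairs(residuals)
print(flagged)
```

Pairs flagged in this way are candidates for revision because they may be measuring the same thing twice.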

Table 10 Standardized residual correlations

Item Calibration

Item calibration was analyzed to investigate whether an appropriate rating scale was applied. The observed count is the number of times a category was selected across all items and persons. As shown in Table 11, the questionnaire uses a 5-point scale: 1: Not at All True of Me, 2: Slightly True of Me, 3: Moderately True of Me, 4: Very True of Me, 5: Completely True of Me. Category 4 (“Very True of Me”) is the response most often selected by the respondents (48 %). The least selected is category 1 (“Not at All True of Me”), with 0 % of responses.

Table 11 Observed average at 5-point scale (12345)

The observed averages increase in the expected order, from negative to positive values, starting at −0.38 and rising to +3.26 logits. The category probability curve is shown in Fig. 2. Bond and Fox (2007) stated that each rating category should have a distinct peak in the probability curve graph. However, not every peak can be clearly seen.

Fig. 2 Category probability curve at 5-point scale (12345)

Further analysis of the Structure Calibration (SC) shows that the distance between adjacent thresholds does not meet the requirement of 1.4 < SC < 5, since −1.55 − (−2.42) = 0.87; thus, categories 1 and 2 need to be collapsed. After collapsing, the Structure Calibration improves, and the observed average increases steadily and consistently, as shown in Table 12.
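The threshold check above amounts to taking the difference between adjacent structure calibrations and comparing it with the 1.4 to 5 logit window. A minimal sketch follows, using only the two 5-point calibrations quoted in the text; the function names are illustrative.

```python
def threshold_gaps(calibrations):
    """Differences between adjacent structure calibrations, in logits."""
    return [round(upper - lower, 2) for lower, upper in zip(calibrations, calibrations[1:])]


def gaps_acceptable(calibrations, low=1.4, high=5.0):
    """True for each adjacent pair whose gap lies strictly inside (low, high)."""
    return [low < gap < high for gap in threshold_gaps(calibrations)]


# The two adjacent calibrations quoted for the 5-point scale.
five_point = [-2.42, -1.55]
print(threshold_gaps(five_point))
print(gaps_acceptable(five_point))
```

A gap below 1.4 logits suggests that the two neighbouring categories are not being distinguished by respondents, which is why collapsing is considered.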

Table 12 Observed average at 4-point scale

The observed average, that is, the average of the logit positions modeled in each category, increases in the expected order from negative to positive values. It starts at −0.80 and rises to +2.45 logits, increasing with each category. The new category probability curve is shown in Fig. 3.

Fig. 3 Category probability curve at 4-point scale (1234)

The figure shows that every peak is now distinct. Further analysis of the Structure Calibration shows that the values meet the requirement of 1.4 < SC < 5, since −0.15 − (−2.31) = 2.16. To evaluate the revised categorization, the item and person reliability and separation indexes were reanalyzed for the collapsed scale (11234). Table 13 compares the separation index values before and after scale calibration. After category collapsing, the item separation index increases from 4.02 to 4.03, whereas the person separation index remains at 8.78. The person mean decreases from 1.81 to 0.98 (standard deviation 1.30).
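The recoding implied by the scale string 11234 can be illustrated with a short sketch: each original category (1 to 5) is replaced by the digit in the corresponding position of the string, so that categories 1 and 2 are merged. The function name and the sample responses are hypothetical.

```python
def collapse_categories(responses, recode="11234"):
    """Recode responses 1..5 using the digit at the matching position of `recode`."""
    mapping = {original: int(new) for original, new in enumerate(recode, start=1)}
    return [mapping[response] for response in responses]


# Hypothetical responses from one person on the original 5-point scale.
original = [1, 2, 3, 4, 5, 4, 2]
collapsed = collapse_categories(original)
print(collapsed)
```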

Table 13 Comparison before and after scale calibration

The 5-point Likert-type rating scale is nevertheless suggested for the next stage of instrument testing, because the changes in person and item separation after collapsing are negligibly small.

Differential Item Functioning (DIF)

It is crucial that the items in the SC-SAT instrument do not advantage (or disadvantage) culinary professionals from different groups. The differential item functioning (DIF) analysis was conducted to strengthen the psychometric evaluation of the instrument. The main purpose of the DIF analysis is to identify whether any items in the SC-SAT instrument are biased with respect to gender. To analyze DIF, Winsteps performs a two-tailed t-test of the difference between the item difficulty indexes of the groups. Items are considered free of DIF when the t value lies within −2.0 < t < +2.0 and the DIF contrast lies within −0.5 < DIF contrast < +0.5, at the 95 % confidence level.

Items with a DIF contrast above +0.5 or below −0.5 need to be revised after considering the t value. Results of the DIF analysis of SC-SAT items by gender are presented in Table 14. Fifteen items were detected as biased between male and female culinary professionals. The analysis shows that, in most cases, the DIF measure for Person Class 1 (male) is smaller than the DIF measure for Person Class 2 (female), indicating that male culinary professionals find it easier to endorse these competency items.
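The DIF screening rule can be sketched as follows. This is a minimal illustration, assuming the DIF contrast is the difference between the two group-specific item measures and the t value is that contrast divided by the joint standard error of the two measures; the function names, measures, and standard errors are hypothetical and are not taken from Table 14.

```python
import math


def dif_contrast_and_t(measure_1, se_1, measure_2, se_2):
    """DIF contrast between two groups and its t value (contrast / joint standard error)."""
    contrast = measure_1 - measure_2
    joint_se = math.sqrt(se_1 ** 2 + se_2 ** 2)
    return contrast, contrast / joint_se


def shows_dif(contrast, t_value, contrast_limit=0.5, t_limit=2.0):
    """Flag an item when both the contrast and the t value fall outside the cut-offs."""
    return abs(contrast) > contrast_limit and abs(t_value) > t_limit


# Hypothetical item: easier to endorse for group 1 (male) than for group 2 (female).
contrast, t_value = dif_contrast_and_t(measure_1=0.20, se_1=0.15, measure_2=0.90, se_2=0.20)
print(round(contrast, 2), round(t_value, 2), shows_dif(contrast, t_value))
```

A negative contrast here means the item measure is lower for group 1, that is, the item is easier for that group to endorse.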

Table 14 DIF analysis of SC-SAT items

Item Targeting

The data were examined further to determine the Malaysian Culinary Professional’s Competency Profile based on the SC-SAT instrument. Additional analyses were conducted to demonstrate how Rasch analysis diagnostics can support competency profiling based on item difficulty and person ability. The heart of the Rasch analysis is presented in Fig. 4, where persons and items are mapped in tandem. The mean of all items, marked “M” (Item Mean), is set at 0.00 logit, while the Person Mean (also marked “M”) is observed at +1.81 logit. “S” marks one standard deviation from the mean, and “T” marks two standard deviations from the mean. The Item-Person map shows that the Person Mean is above the Item Mean. Respondents are arranged in ascending order, from the lowest to the highest ability in performing the items.

Fig. 4 Item-person Wright map based on the SC-SAT

As shown in Table 15, the most difficult item is located at +1.72 logit and the easiest item at −1.74 logit, with a standard deviation of 1.29, implying a small spread within the data. Although the items still target persons of moderate ability and below, persons are evenly distributed along the logit scale according to their abilities. This shows a slight improvement in item targeting. The most difficult item to endorse is SCI1, from the technical competency construct (“knowledge of cooking chemistry”). The easiest item is M14, from the motive construct (“career as Chef brings the utmost satisfaction”). There are 21 off-target items with no respondents at their level, which means that these items are too easy. The highest person measure is +6.18 logit (person ID83), followed by two persons at +5.67 logit (ID10 and ID82). The lowest person measure is −0.62 logit (person ID46).
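To give a sense of what these logit positions imply, the sketch below evaluates the dichotomous Rasch model, P = exp(B − D) / (1 + exp(B − D)), for a person at the reported Person Mean (+1.81 logit). This is a simplification for illustration only: the SC-SAT uses a polytomous rating scale, so the numbers indicate relative targeting rather than actual category probabilities. The function name is hypothetical.

```python
import math


def endorsement_probability(person_ability, item_difficulty):
    """Dichotomous Rasch probability that a person endorses an item (both in logits)."""
    return 1.0 / (1.0 + math.exp(-(person_ability - item_difficulty)))


person_mean = 1.81     # reported Person Mean
hardest_item = 1.72    # reported most difficult item
easiest_item = -1.74   # reported easiest item

p_hardest = endorsement_probability(person_mean, hardest_item)
p_easiest = endorsement_probability(person_mean, easiest_item)
print(round(p_hardest, 2), round(p_easiest, 2))
```

Under this simplification, a person of average ability has a slightly better than even chance of endorsing even the most difficult item, which is consistent with the observation that the items target persons of moderate ability and below.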

Table 15 Item difficulty level and person response level

Figure 4 shows two categories of items, spanning the positive and negative sides of the logit scale. There are 72 items above the mean (45 %) and 87 items below the mean (55 %). This indicates a positive person response level of only 45 %, that is, the proportion for which respondents tend to agree with the items. Accordingly, the findings suggest that the respondents are capable of carrying out these competencies. Person ID83, who has the highest measure (+6.18 logit), is a Chinese male who holds the position of Sous Chef in a 5-star hotel. He has held the position for about 6–10 years. He is aged between 26 and 35 and has 11–15 years of experience in the culinary industry. He also has experience working in a foreign country. Nevertheless, he does not have MOSQ certification.

Discussion

As discussed earlier, the Rasch measurement analysis was initially conducted with 203 items, conceptually ordered from low to high level of difficulty. It was concluded that a reliable, linear, unidimensional scale of competencies for superior work performance was created using the views of culinary professionals. Because the scale data were shown to be reliable with this level of precision, valid inferences could be drawn from the SC-SAT instrument. The findings have shed light on the construct validity of the scales. The study emphasized six aspects of Rasch analysis diagnostics: (i) item and person reliability and separation index, (ii) item fit, (iii) item polarity, (iv) item dimensionality, (v) item calibration, and (vi) differential item functioning. Item targeting and, consequently, competency profiling based on the SC-SAT instrument were discussed further. A closer look at the responses given by the culinary professionals to the SC-SAT may indicate which specific aspects are sound and which may need attention to further develop their competence at work. The development of SC-SAT has put forward a better technique of competency measurement that is purposely developed for employees’ professional development.

Assessment should include a spectrum of strategies in which both process and products are emphasized. Assessment should be communicated, integrated on a day-to-day basis, stimulate thinking, build on prior knowledge, and construct meaning. Assessment results should be routinely revised and provide a proper database (Dubois and Rothwell 2000; Stetz and Chmielewski 2015). Formative assessment should be responsive, providing opportunities for self-reflection and revision. Hence, a better model of assessment for professional development is one that provides feedback. Workers should receive feedback routinely, recognizing achievement beyond the scores.

Beyond the culinary profession, studies on the assessment of professional employees’ competence and performance, addressing the question of self-assessment and the means to assure more objective measurement of competence and performance, have been conducted widely among vocational professionals (Winther and Klotz 2013), health professionals (Bashook 2005; Nicholson et al. 2012), information technology professionals (Azrilah et al. 2008), sales professionals (Lambert et al. 2014), and management professionals (Sisson and Adams 2013). These studies also attempt to apply their findings to identifying workplace performance using a variety of assessment methods.

Conclusion

Analyses of the SC-SAT instrument items fully support its function as a useful measure of competencies. All items were analyzed and a minor modification was made to achieve adequate model fit. Data from the instrument testing provide evidence that the psychometric properties of the instrument improved from one stage to the next. The newly developed SC-SAT gives culinary professionals the opportunity to identify and measure their own competencies, and the results can be used to identify how well they are doing. SC-SAT is considered a norm-referenced measurement tool that is expected to possess a high degree of accuracy, discriminating between those who perform excellently and those who are low performers. Rasch analysis has assisted the researcher in improving the quality of the SC-SAT instrument and provided evidence to support its validity. The instrument could be improved further in certain areas, such as item targeting.