1 Introduction

Teaching quality is a relatively subjective index in the teaching process and an important measure of teaching effect. Improving teaching quality is an important way for a school to achieve leapfrog development [1]. Colleges and universities are paying increasing attention to teaching quality. Existing approaches include evaluation mechanisms based on expert groups and methods that reflect teaching quality through simple statistics. However, because teaching is a two-way interactive process and different subjects use different teaching methods, teachers understand the classroom differently and therefore hold different views on teaching quality. Objective evaluation of the teaching process is much more complex than evaluating the performance or quality of a product [2].

In the current evaluation of teaching quality in colleges and universities, the expert-group mechanism is the core: experts score quantitative indicators, and these scores are combined into a comprehensive evaluation of a given teaching process. If the characteristics of these data can be well analyzed and extracted, the factors or underlying patterns that affect teaching quality can be identified, which helps to ensure teaching quality. However, due to the diversity of disciplines in colleges and universities, it is quite difficult to evaluate the teaching quality of a course or a discipline [3]. The teaching quality evaluation system is composed of a series of subjective and objective factors, and the quantitative indicators include both continuous and discrete quantities. Such heterogeneous index values place high demands on the expressive ability and performance of the evaluation model, so some traditional models based on simple statistics either fail or cannot reach the accuracy required in practice. Colleges and universities should seek their own way to evaluate teaching quality while cooperating with the Ministry of Education in its evaluation of them. When that evaluation is completed, they can continue to maintain their teaching quality through internal evaluation [4]. At present, the evaluation methods of teaching quality in colleges and universities in China mainly include evaluation by a teaching supervision group led by the competent teaching department, mutual evaluation among teachers, teachers' self-evaluation, and students' evaluation of teachers. The evaluation forms include lectures, discussions with students, questionnaire surveys, student scoring, and peer evaluation. Because the main participants in teaching are teachers and students, it is students, rather than teachers' peers, experts or managers, who directly experience the effect of teaching. Therefore, it is more comprehensive and objective for students to evaluate the quality and attitude of teachers' teaching work. The evaluation of teaching quality with students as the main body can more objectively and reasonably reflect teachers' teaching level and teaching quality [5].

At present, there are many evaluation methods, such as organized expert evaluation, leadership evaluation, peer evaluation, and student evaluation. The evaluation contents include teaching preparation, teaching attitude, teaching methods, teaching content, teaching language, and teaching effect. The advantage of this kind of evaluation is its diversity, which allows a more comprehensive, objective and scientific evaluation of teachers' teaching activities from multiple directions and angles. It has contributed to the scientific and effective evaluation of school teaching quality and to its improvement. However, the current teaching quality evaluation system also has many problems [6]. As one of the most important means to ensure and promote the continuous improvement of teaching quality in higher education, teaching quality evaluation is being widely studied. Effectively processing and analyzing the large volume of raw data collected in the teaching process of colleges and universities can provide decision support for teaching quality evaluation and for the formulation of related improvement measures. Teaching quality can be defined as the degree to which the talents trained by a school are consistent with the school's educational objectives and school-running objectives in terms of quality, knowledge and ability. The evaluation of teaching quality in colleges and universities is essentially the evaluation of the degree of change in students' quality, knowledge and ability [7, 8].

Introducing machine learning into teaching quality evaluation has become a hotspot for solving this challenging problem. Machine learning improves the performance of a system by using previous empirical data, thereby giving the system a form of human-like intelligence. This concept is applicable to the evaluation of teaching quality. We believe that a learner can acquire a certain ability to judge the quality of teaching by learning from historical data. There have been many breakthroughs in machine learning research in recent years, including research on the essence of machine learning, semi-supervised learning, and the combination of machine learning with other mathematical tools. However, it should be pointed out that the analysis of teaching quality evaluation has not yet been truly combined with machine learning. At least at present, there are few successful applications and no relevant theoretical elaboration. Therefore, we expound the application of machine learning methods in teaching quality evaluation and use deep learning to demonstrate it.

Content-centric Data Center Networks (CCDCNs) have become a popular network for Internet services and applications. With the development of the Internet of Things, smart cities and Industry 4.0, a large number of sensing devices produce growing volumes of big data, which have grown from Gigabytes (GB) to Terabytes (TB) and may grow to Petabytes (PB) or even Exabytes (EB) in the future. Capturing hot content from these big data and distributing it to users not only enhances the service/application quality [9,10,11] of CCDCNs, but also provides CCDCNs with the quality of experience (QoE) of big data services/applications. Therefore, this paper enhances QoE through research on big data. At present, some researchers study big data driven by QoE, for example big data mining, cleaning, and classification. Because the data are limited by dimension, bandwidth and diversity, traditional approaches cannot easily extract the hot content in big data. Machine learning [12,13,14,15] (ML) technology has some unique advantages in capturing hot content. A typical example is the use of convolutional neural networks (CNN) to capture hot content in big data [16], in tasks such as image processing, speech recognition, and natural language processing. Another example is the use of deep CNN (D-CNN) to perform big data analysis. Compared with CNN, deep CNN can obtain more hot content. Tensor CNN [17] (T-CNN) can obtain more hot content in the analysis of big data than other methods. When analyzing big data, many deep learning (DL) algorithms have high complexity, which makes training neural networks too time-consuming. To improve the training speed of neural networks, the fast CNN [18] (F-CNN) algorithm has been proposed and applied to the analysis of big data. In general, some studies have examined hot content, and others use CNN variants to improve the training speed of neural networks when analyzing big data.

At present, multi-level research has been carried out on the evaluation of teaching quality in colleges and universities [19]. The evaluation index system is mainly used to score teaching activities, students' learning effect and feedback, and the score representing the teaching quality of colleges and universities is then calculated by weighting the index system or applying a mathematical model. There are two main difficulties in the evaluation of teaching quality in colleges and universities. One is that the diversity and quantity of the original teaching data are huge, so it is difficult to obtain an effective evaluation with the traditional method of calculating indices from fixed formulas. The other is that the teaching process is highly subjective and contains very complex internal laws. Traditional teaching evaluation is carried out under a local concept and often lacks overall consideration [20]. First, when formulating the evaluation objectives, it cannot screen the elements and find the best combination of them from the whole, and there is no connection between the objectives. In addition, the evaluation content focuses only on local factors, such as the evaluation of classroom teaching quality. Finally, the methods used to construct each link of the evaluation index system are often isolated and lack organic connection. Moreover, a complete plan is lacking when the evaluation is implemented [21].

To deal with the above difficulties and challenges, we propose an English teaching quality evaluation model based on deep reinforcement learning in a content distribution network. Combined with the actual situation of undergraduate education in colleges and universities, we design quality evaluation indicators in line with the characteristics of English teaching, and use CCDCNs to classify the level of teaching quality. To improve the accuracy of classification, we construct three cache scheduling algorithms in CCDCNs. First, an approximate dynamic algorithm is proposed, which has high complexity. Then, based on the characteristics of node centralization, we propose an improved approximate dynamic scheduling algorithm. Although this algorithm schedules both the cached content and the content transmission rate, it has low complexity. Finally, a cache scheduling algorithm based on Deep Reinforcement Learning (DRL) is proposed. Although this algorithm has high complexity, its scheduling accuracy is also high. Experiments show that the proposed architecture and method achieve a more satisfactory Quality of Experience (QoE).

The rest of this paper is organized as follows. Section 2 presents the related works of teaching quality evaluation method. Section 3 presents the detailed design on teaching quality evaluation method based on DRL in CCDCNs. Experimental results and discussion are reported in Sect. 4. Finally, the conclusion of this paper is in Sect. 5.

2 Method

2.1 Teaching quality evaluation method

Teaching quality evaluation judges whether the teaching process and its results meet certain quality requirements, using the theory and technology of educational evaluation. Its purpose is to promote the continuous improvement of teaching quality and to provide a form of qualification certification for the evaluated objects, and it is a basic link of teaching work. A basic principle of modern education evaluation is to evaluate the object as a whole, in an all-round and dynamic way, rather than evaluating only the results, because the results come from the process. Evaluation that starts from the process more effectively promotes the generation of ideal results. It is incorrect to regard teaching and evaluation as two separate parts. Modern teaching evaluation has been integrated with the teaching process and has become an important factor in it. The view that evaluating only the learning quality of students or the classroom teaching quality of teachers completes the evaluation of teaching quality is one-sided, because modern teaching evaluation covers all aspects of teaching activities.

Teaching quality has two meanings: quality and quantity. Accordingly, teaching quality evaluation also has two meanings, i.e., qualitative evaluation and quantitative evaluation. Quantitative evaluation is based on a specific index system reflecting the degree to which teaching objectives are achieved. Evaluation indicators are usually operable, measurable, quantitative and specific. The index system of teaching quality evaluation is used to judge the degree of realization of teaching objectives in terms of both quality and quantity; it is not itself the objective pursued by teaching, which is the fundamental difference between the two. For example, we usually use examinations and take the examination score as an index to measure and judge the degree of students' mastery of knowledge and the level of their ability development. The examination score is an operable, measurable and quantitative scale. It is only a means of understanding the degree of realization of teaching objectives, not the teaching objectives themselves. The concept of evaluation in modern education is a more general concept that includes the concepts of test, measurement and evaluation. Because teaching quality is evaluated for different purposes, a variety of evaluation theories have been developed. In this work, we mainly focus on the following two teaching quality evaluation theories:

(1) Developmental evaluation theory. It aims at promoting teachers' professional development, and is a formative evaluation that is based on objectives, pays attention to process, gives timely feedback and promotes teachers' development. Developmental teacher evaluation is a formative evaluation carried out in the course of teachers' work. It diagnoses the teaching plan, process, progress and existing problems, feeds back in time, and improves, regulates and corrects in time, and thus can improve the quality of education. It can reflect the development trend of the evaluation object during its activities and the specific reasons affecting the final evaluation results. Its main function is to clarify the existing problems and the direction that needs improvement, so as to provide a basis for improvement and obtain a better work effect. The process of developmental teacher evaluation is a dynamic process in which teachers transform social requirements into self-realization goals. It pays attention not only to the actual performance of teachers, but even more to their future development. In the process of evaluation, it advocates promoting teachers' conscious and active development in a relaxed environment, and pays attention to cultivating teachers' subject consciousness and creative spirit. It is a teacher evaluation built on the concept of promoting teachers' development, which takes teachers as the core and aims to develop teachers as individuals. Developmental teacher evaluation holds that teachers are the main body of development. The evaluation process judges the value of the work of schools and teachers through the systematic collection and analysis of evaluation information, and thus can realize the coordinated development of schools and teachers.

Developmental teacher evaluation originates from the development of teacher evaluation practice and is an objective requirement of that development. A theory derived from practice also has a theoretical inheritance: it enriches and develops the earlier theory, and it rests on other relevant theoretical foundations. Developmental teacher evaluation is formed on the basis of the original teacher evaluation theory, and draws on the research results of philosophy, sociology, pedagogy, management and other related disciplines.

(2) Pluralistic evaluation theory. It is based on extensive support and participation. The diversity of evaluation subjects allows the evaluation results to confirm one another. Combining them reduces errors and makes the evaluation results more authoritative. Therefore, teachers' evaluation, students' evaluation, peer evaluation and the evaluation of teaching departments should be combined in the evaluation.

In general, the main body of teaching quality evaluation is students, because students are the direct beneficiaries of the teaching effect. Students' evaluation of the teaching effect can truly reflect the quality of teachers' classroom teaching. At the same time, teachers' self-evaluation, peer evaluation, leadership evaluation and expert evaluation, together with students' evaluation, form a diversified system for monitoring and evaluating teaching quality from different aspects, and thus achieve multi-angle, multi-directional and multi-level monitoring and evaluation. Within this system, different evaluation models can be formulated according to the nature of the curriculum and the evaluation objectives, so that teaching quality is evaluated in an all-round way from different angles. The evaluation of teachers' teaching quality takes teachers and their teaching activities as the main evaluation object, and highlights the quality of teaching activities. Its purpose is to continuously promote the improvement of teachers' own quality and teaching quality, and to make a quality evaluation of teachers' labor, which can be used as the basis for promotion, training, reward and punishment. The evaluation of teachers' teaching quality focuses on classroom teaching quality, and takes into account other aspects, such as quality evaluation, pre-class preparation and after-class guidance. The results of students' learning quality evaluation are used as important information indicators. Students' evaluation takes students and their learning activities as the main evaluation object, and highlights the quality of learning results. Its purpose is to continuously promote students' learning, achieve training objectives, and make a quality evaluation of their learning achievements, which can be used as the basis for educational decision-making.

Experts and supervisory leaders evaluate the teaching quality from a comprehensive and meticulous perspective. They evaluate teachers, students and management departments to help find the factors affecting the teaching quality, and then formulate measures to address them. Teaching quality evaluation has multiple functions, but its primary function is to promote the continuous improvement of teaching quality. The focus of evaluation is not on the evaluation of results, but on the formation process. Depending on actual needs, a particular evaluation model may be emphasized. In summary, teaching quality evaluation should play the following five roles:

1) Feedback guidance. The feedback guidance function guides and adjusts the teaching and learning activities of teachers and students through the feedback information of teaching evaluation, in order to increase the effectiveness of teaching activities. It has two meanings: the first is feedback guidance for teachers' teaching work, and the second is feedback incentive and reinforcement for students' learning. Using the results of evaluation, teachers can understand the actual situation of students, find the problems existing in teaching, and reflect on and improve their own teaching plans and teaching methods.

2) Management role. Teaching evaluation can reflect the quality and level of teachers' teaching. It can supervise teachers' teaching labor and provide a basis for certain personnel decisions. The evaluation of students' learning results can provide evidence of the quality and level of students' learning, and become the basis for decisions such as selection, elimination, and whether to graduate.

3) Strengthening the role of learning. Whether for teachers or students, teaching quality evaluation provides an important learning experience. Objective and fair evaluation can make clear to teachers the direction of their teaching efforts and help them build on their own strengths. The evaluation of students' learning quality can prompt students to review the learning content before the evaluation, strengthen its consolidation, and clarify the content that has not been fully mastered during the evaluation process.

4) Guiding effect. The contents, standards and indicators of teaching quality evaluation affect the efforts of teachers and students to a considerable extent, and they play a guiding role in the ways and methods of teaching and learning. Therefore, the contents, standards, indicators and methods of teaching quality evaluation should be thoroughly evaluated, otherwise a wrong orientation of evaluation will cause serious adverse consequences.

5) Role of scientific research. Teaching quality evaluation can test the success or failure of the teaching process, and make comparative judgments on teaching methods and teaching effects. In addition, the development of teaching materials, multimedia courseware and curriculum, the investigation of teachers' quality and the investigation of students' ability are inseparable from the help of teaching quality evaluation. Teaching evaluation has become an important tool in educational science research.

Teaching quality evaluation plays an important and positive role in the teaching process, and has become an important link and organic part of systematic teaching activities. The quality and level of the teaching evaluation index system itself must also be continuously improved. Based on the above evaluation theories and principles, the teaching quality evaluation indicators designed in this work are shown in Table 1.

Table 1 Evaluation index system of teachers' teaching quality

2.2 Cache scheduling algorithms in CCDCNs based on RL

In this part, we construct three cache scheduling algorithms in CCDCNs to address the cache scheduling problem. First, an approximate dynamic algorithm, C-SPT, is proposed, which has high complexity. Then, based on the characteristics of node centralization, we propose an approximate dynamic scheduling algorithm, C-SPTC. This algorithm schedules both the cached content and the content transmission rate, yet has low complexity in processing content scheduling. Finally, a cache scheduling algorithm based on DRL is proposed. Although this algorithm has high complexity, its scheduling accuracy is also high. Together, the three algorithms can further improve QoE.

To capture the experience delivered by CCDCNs, QoE is defined as the quality perceived by the network and by users. In this work, we use two common methods for evaluating QoE. The first method is based on the network QoE: the smaller the network consumption, the higher the quality of network experience, which is denoted \(QoE_{Net}\). The second method uses the mean opinion score (MOS) model for different applications and is denoted \(QoE_{User}\). A higher MOS means a higher user experience quality. The QoE of the whole CCDCN is affected by these two items. The detailed evaluation method is described below. Network traffic can be regarded as content forwarding between network nodes, and the path length can be defined as the consumption at timestamp t. When the requested content is not cached in the local node, the request is forwarded to the resource node that caches the content. In homogeneous networks, the transmission consumption of content blocks between adjacent nodes is approximately the same. Therefore, the network consumption can be expressed by the consumption generated by forwarding content, which is related to the path length and to the consumption of each transmission between adjacent nodes. It is assumed that the forwarding consumption between adjacent nodes is determined by the transmission rate. To measure the subjective experience of users, the MOS model is used to evaluate different applications.

To find the optimal strategy and design an algorithm to solve the QoE maximization problem, we propose a discretization method for \({R}_{a\to t}^{i}\). Because the second term of the objective function is continuous and the variation range of \({R}_{a\to t}^{i}\) is \([R_{min}, R_{max}]\), this interval is divided into k parts of equal size Z. We use the division points as the discrete values of \({R}_{a\to t}^{i}\), and all discrete points are included in the set Q. When k tends to infinity and Z approaches 0, the value of \({R}_{a\to t}^{i}\) tends to be continuous. For example, when k = 4, Q contains 5 elements, three of which are interior division points. We propose a cache scheduling algorithm for calculating the optimal solution. A larger k gives a better solution, but the computational complexity increases accordingly. First, we define \(\sum_{{v}_{t}^{i},{v}_{a}^{i}\in V}{q}^{i}\cdot {x}_{a}^{i}\cdot {b}_{a\to t}^{i}\cdot g({R}_{a\to t}^{i})\) as \(\mathrm{cost}_{c}^{i}\), i.e., the forwarding consumption after scheduling fi with cache size ci between network nodes. With this definition, the subproblem reduces to the knapsack problem. Therefore, the optimal cache scheduling problem in CCDCNs is approximately expressed as a general knapsack problem. Assuming that \({q}^{i}\) of the requested content is known, the objective function of the subproblem can be described as follows:

$$F_{t} = \min \sum_{v_{t}^{i}, v_{a}^{i} \in V} x_{a}^{i} \cdot b_{a \to t}^{i}$$
(1)

\(F_{t}\) in Eq. (1) represents the total number of times that all content blocks are forwarded between adjacent nodes. Because CCDCNs support in-path caching, the network consumption depends on the cache location (\({X}^{i}=\{{x}_{a}^{i},{v}_{a}^{i}\}\)) and on the location of the source service node. For simplicity, the path is assumed to follow the shortest path from fi to s(fi) in the network. Therefore, the mapping of fi and source service nodes is close to the optimal solution.
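The discretization of the transmission rate described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `discretize_rate` and the numeric bounds are assumptions chosen for the example, and the grid includes both endpoints of the interval so that k parts yield k + 1 candidate rates.

```python
def discretize_rate(r_min: float, r_max: float, k: int) -> list[float]:
    """Split [r_min, r_max] into k equal parts and return the k + 1 grid points.

    The returned list plays the role of the discrete set Q in the text.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if r_max <= r_min:
        raise ValueError("r_max must exceed r_min")
    step = (r_max - r_min) / k  # the part size Z in the text
    return [r_min + j * step for j in range(k + 1)]


# Hypothetical bounds: rates between 1.0 and 5.0, split into k = 4 parts.
q_set = discretize_rate(1.0, 5.0, 4)
print(q_set)  # [1.0, 2.0, 3.0, 4.0, 5.0]
```

With k = 4 the set has five elements, three of which are interior division points. Increasing k refines the grid at the price of a larger search space for the scheduling algorithm.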

Figure 1 is an example of a shortest path tree (SPT) with service node a as the root node. When the content block is cached in node e, it needs to be forwarded only once, as in Fig. 1(a). When the content block is cached in node f, it needs to be forwarded twice, as in Fig. 1(b). Traffic is defined as the forwarding of content blocks between nodes, including not only the traffic caused by a node's own content request, but also the traffic that downstream nodes transmit through that node. We assume that each node generates one content request at timestamp t. Therefore, the overall traffic can be calculated by accumulating the traffic of each node, and the consumption is equal to the overall traffic for the given placement of cached content blocks. We can decompose the green cache scheduling problem in CCDCNs into two problems: the first is the cache scheduling problem in the SPT, and the second is the knapsack problem.

Fig. 1 Cache scheduling based on shortest path tree

Chen and Han have developed their own models based on the traditional Markov Decision Process (MDP) model, which includes action A, transition probability P, state S and reward R, where A and R are determined by the model and S comes from observation. The goal of the MDP model is to learn the policy that each state prefers in terms of expected reward. This section defines the MDP for CCDCNs as follows.

MDP state: the state of the network within the time interval of the tth scheduling can be defined as S(t) = [L(t), Q(t)] ∈ S, where L(t) denotes the state set of the cache nodes, Q(t) denotes the transmission rate set, and S denotes the network state set. Each scheduling interval corresponds to a change in state. At the beginning of each scheduling interval, the Internet Service Provider (ISP) makes some preparations, such as collecting service information, recording which content blocks are cached in the nodes, and calculating the transmission rate of content blocks. Then it reschedules services by mapping control policies to migration operations. Although the network consumption is not explicitly represented in the state, it can be calculated from the aggregate of services running on the network. Therefore, the definition of the state is comprehensive.

MDP action: it can be assumed that the transmission rate between adjacent nodes is the same under certain conditions, but the transmission rate may be different under different conditions. The cache node can cache multiple content blocks. If a content block is cached in a node, another content block will be deleted from the node's original location. The process of change is represented as an action.

Reward: it is an incentive mechanism for scheduling tasks after actions A and B have been taken. The goal of the model is to minimize network consumption and obtain an appropriate transmission rate. At the same time, it also requires network stability and maximizes the final QoE by rewarding the current state. The transmission rate \({R}_{a\to t}^{i}\) must stay between the minimum rate and the maximum rate, because it needs to meet the service requirements of users while avoiding excessive cost. There is thus a partial conflict between cost saving and the above constraints. Therefore, the model optimizes a linear combination of network-related and user-related rewards to find a suitable trade-off between consumption and MOS. The transition probability P can be learned through maximum likelihood estimation, by considering the demand of each distributed task and combining it with a sliding window scheme. It should be noted, however, that this method does not work in a new environment and needs a long training time to update all relevant states. Additionally, the window size is chosen heuristically, which may lead to inaccurate estimates at key points and serious errors in judgment. Considering these problems, we develop a model based on MDP that includes only state, action and reward. Instead of relying on a pre-trained transition probability P, the model learns the optimal decision and control strategy online.
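The state and reward definitions above can be sketched as follows. This is only an illustration under stated assumptions: the class `NetworkState`, the helper `clamp_rate`, the trade-off parameter `weight` and the sign convention (user MOS counted positively, network consumption negatively) are hypothetical, since the text specifies only that the reward is a linear combination of network-related and user-related terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkState:
    """S(t) = [L(t), Q(t)]: cache-node states and transmission rates."""

    cache_state: tuple[frozenset[str], ...]  # L(t): content blocks held by each cache node
    rates: tuple[float, ...]                 # Q(t): transmission rates


def clamp_rate(rate: float, r_min: float, r_max: float) -> float:
    """Keep a transmission rate between the minimum and maximum rate."""
    return max(r_min, min(rate, r_max))


def reward(network_cost: float, mos: float, weight: float = 0.5) -> float:
    """Linear combination of a user-related term (MOS) and a network-related term.

    `weight` is an assumed trade-off parameter in [0, 1]; lower network cost
    and higher MOS both increase the reward.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * mos - (1.0 - weight) * network_cost


# Hypothetical state with two cache nodes and two rates.
state = NetworkState(
    cache_state=(frozenset({"block1"}), frozenset()),
    rates=(clamp_rate(2.0, 1.0, 5.0), clamp_rate(7.0, 1.0, 5.0)),
)
print(state.rates)                             # (2.0, 5.0)
print(reward(network_cost=2.0, mos=4.0))       # 1.0
```

Making the state immutable and hashable is a design choice that allows it to be used as a key in a replay buffer or a lookup table.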

DRL has an excellent learning effect in solving MDP problems, and the Deep Q-Network (DQN) is an appropriate choice for the network scheduling problem. First, the decisions in the network are highly repetitive, which provides a large amount of data for training the DQN. Second, DQN can be applied in an online stochastic environment. After training with more up-to-date data, DQN can become more intelligent. Finally, DQN-based Q-learning is highly flexible and can capture more features. Specifically, the approximation error of DQN is very small. Two neural network function approximators are used, called the prediction network and the target network, each with its own weights. Traditional Q-learning suffers from overestimation: when the Q value is updated, the estimation error is passed from one iteration to the next because of the maximum operation over Q(S’, A’, B’). In addition, the estimation in DQN is a regression task because the output Q value can be any real number. Therefore, the approximator can be trained by minimizing a sequence of loss functions \(L_{i}(\theta)\). The loss function is defined as follows:

$$L_{i}\left(\theta\right) = E_{S,A,B\sim \rho\left(\cdot\right)}\left[\left(\varphi\left(S,A,B\right) + \gamma \max_{A^{\prime},B^{\prime}} Q\left(S^{\prime},A^{\prime},B^{\prime},\theta\right) - Q\left(S,A,B,\theta\right)\right)^{2}\right]$$
(2)

where \(\rho \left(S,A,B\right)\) is the behavior distribution over the state S and the actions A and B, and i is the iteration index. Note that the target value depends on the network weights \(\theta\), and the estimate is computed with the current \(\theta\). The weights are obtained by minimizing the loss function in each iteration. Because the number of parameters is large, we use stochastic gradient descent to optimize the loss function.
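To make the update in Eq. (2) concrete, the sketch below computes the target value and performs one stochastic gradient step for a single sampled transition. It uses a linear function in place of the neural network so that it stays self-contained; the feature vectors, learning rate and helper names are assumptions for illustration only, not the implementation used in the experiments.

```python
def q_value(theta, features):
    """Linear stand-in for Q(S, A, B, theta)."""
    return sum(t * f for t, f in zip(theta, features))


def td_target(reward, gamma, theta, next_features):
    """reward + gamma * max over (A', B') of Q(S', A', B', theta)."""
    return reward + gamma * max(q_value(theta, f) for f in next_features)


def sgd_step(theta, features, target, learning_rate):
    """One stochastic gradient step on the squared error of one sample."""
    error = target - q_value(theta, features)
    # With the target held fixed, d(error**2)/d(theta) = -2 * error * features.
    new_theta = [t + learning_rate * 2.0 * error * f
                 for t, f in zip(theta, features)]
    return new_theta, error ** 2


theta = [0.0, 0.0]                          # assumed initial weights
features = [1.0, 0.0]                       # assumed features of (S, A, B)
next_features = [[0.0, 1.0], [1.0, 0.0]]    # one vector per joint action (A', B')
target = td_target(1.0, 0.9, theta, next_features)
theta, loss = sgd_step(theta, features, target, learning_rate=0.1)
```

The expectation in Eq. (2) is approximated here by a single sample; in practice the step would be repeated over many transitions drawn at random.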

3 Experiment and results

The data used in the experiment come from the research and practice of a teaching process quality assurance and teaching quality evaluation system for colleges and universities, a teaching reform project approved for colleges and universities in Shandong Province. The system is implemented in C# on the .NET framework with a Microsoft SQL Server database, and it is deployed on the web application server of the campus network. The reinforcement learning toolbox is embedded into the system and introduced into the content distribution network. All experiments are conducted on a machine with 8 GB of RAM, a 3.60 GHz Intel Core i7 processor, and the Windows 10 operating system.

3.1 Data acquisition

The evaluation model divides teaching quality into three categories according to the evaluation scores: I is excellent, II is good, and III is poor. First, 337 samples are selected according to the expert evaluations in the system, of which 287 are used to train the reinforcement learning model and 50 are used for testing. Each sample consists of the scores given by the evaluators for the 14 evaluation factors in the index system. To facilitate measurement and match common scoring practice, the score range of each evaluation factor is [0, 100]. An evaluation grade is then assigned to each teacher according to all scores, combined with the opinions of experts and the teacher's usual performance. The grade has three levels: poor, good and excellent. In the final evaluation result, −1 means poor, 0 means good and 1 means excellent. In this way, 287 accurately labeled samples are obtained according to the evaluation index. To center the data, we normalize each score x according to the formula (x − 85)/85 and store the processed data in the corresponding training sample table of the database. The index system can be described in mathematical form as follows. X is the input index set, X = (x1, x2, …, x14), and Y is the output index, which is the evaluation result corresponding to X. In our system, X is the normalized evaluation index value, and Y is the evaluation result of the teacher according to the experts' opinions and the teacher's usual performance.
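The preprocessing described above can be sketched as follows. The normalization formula and the label encoding are taken from the text; the function names, the validation checks and the example scores are assumptions for illustration.

```python
# Label encoding stated in the text: -1 poor, 0 good, 1 excellent.
GRADE_TO_LABEL = {"poor": -1, "good": 0, "excellent": 1}

NUM_FACTORS = 14  # number of evaluation factors in the index system


def normalize_score(x):
    """Normalize a raw score in [0, 100] with the formula (x - 85) / 85."""
    return (x - 85) / 85


def build_sample(scores, grade):
    """Return (X, Y): the normalized feature vector and the encoded grade."""
    if len(scores) != NUM_FACTORS:
        raise ValueError("expected 14 evaluation factor scores")
    if any(s < 0 or s > 100 for s in scores):
        raise ValueError("scores must lie in [0, 100]")
    return [normalize_score(s) for s in scores], GRADE_TO_LABEL[grade]


# Hypothetical example: a teacher scored 85 on every factor and graded "good".
features, label = build_sample([85] * 14, "good")
```

Note that this formula maps the score range [0, 100] to roughly [−1, 0.176], so a score of 85 becomes the zero point of each feature.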

3.2 Cache scheduling algorithms experiment

The data generated in Table 1 are imported into Matlab, where the simulator configuration consists of a topology and a request pattern. To explore these factors, we rely on aggregate properties that can be varied, such as clustering and horizontal distribution. To simulate the topology, we use the Barabási-Albert (BA) model with a limited number of high-centrality nodes, which provides an idealized scenario for caching, such as high request iteration. The generated topology has 200 nodes. Once the topology is generated, 20 service nodes are attached and shared among 2000 users. After the targets are evenly distributed among the source service nodes, their respective connection points are selected at random. In each time interval, the objects requested by each node follow a Zipf distribution [22], whose normalizing constant c satisfies \(\sum_{i=1}^{N}\left(\frac{c}{{i}^{\beta }}\right)=1\). In addition, the global cache capacity \({C}_{total}\) is normalized to 0.01 in the experiment. We choose the best parameter setting according to the test results (see Table 2 for details).
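The Zipf request pattern can be illustrated with a short sketch that computes the normalizing constant c from the condition above and then samples object requests by popularity rank. The values of N and β used here are placeholders, not the settings of Table 2, and the function names are assumptions.

```python
import random


def zipf_probabilities(n_objects, beta):
    """Return p_i = c / i**beta for i = 1..N, with c chosen so that sum(p_i) = 1."""
    weights = [1.0 / (i ** beta) for i in range(1, n_objects + 1)]
    c = 1.0 / sum(weights)
    return [c * w for w in weights]


def sample_requests(probabilities, n_requests, seed=None):
    """Draw object ranks (1 = most popular) according to the Zipf probabilities."""
    rng = random.Random(seed)
    ranks = list(range(1, len(probabilities) + 1))
    return rng.choices(ranks, weights=probabilities, k=n_requests)


# Placeholder values: 4 objects and beta = 1.0 (assumed, not from Table 2).
probs = zipf_probabilities(4, 1.0)
requests = sample_requests(probs, 10, seed=0)
```

A larger β concentrates requests on the most popular objects, while β = 0 reduces to a uniform request pattern.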

Table 2 Simulation parameter setting

To study the importance of cache scheduling to QoE, we examine whether cache scheduling matters to performance beyond what cache capacity alone provides. Different allocation schemes are simulated, including cache scheduling based on the DQN algorithm, Local Neighborhood Degree Centrality (LNDC) [23] and mobile edge computing (MEC) [24]. The randomized least frequently used (RLFU) [25] method is the cache replacement strategy used in all simulation groups.

The x-axis is the total cache capacity, normalized by the product of the number of nodes and the number of content blocks. The y-axis shows the remaining network consumption. To improve comparability, the remaining network consumption under the different schemes is also normalized. In contrast, DC and BC contribute more to reducing network consumption. Note that DC is effective when the overall budget is greater than 0.01, whereas BC performs better when the cache capacity is small. Regardless of the overall capacity budget, BC never allocates cache space to the root node. BC is effective when the cache is scheduled to the central node and the overall budget is small; however, it is not effective when free cache space is available to users. From Fig. 2, we can see that, across the different schemes and transmission rates, the QoE of all scheduling methods is very low at very low transmission rates. Therefore, when the transmission rate is low, the scheduling method is not the main factor affecting QoE. This finding is consistent with practical experience.

Fig. 2 Comparison of network consumption of different mechanisms

The teaching feature vector in Sect. 3.1 can be described in mathematical form as follows: X is the input index set, X = (x1, x2, …, x14), and y is the output index, which is the evaluation result corresponding to X. In this system, X is the normalized evaluation index value, and y is the evaluation result assigned to the teacher by the experts. The final teaching performance results fall into three categories: excellent, good and poor. To evaluate the effect of each algorithm on teaching performance objectively, Table 3 reports the accuracy, recall and specificity of each algorithm.
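For reference, the three metrics can be computed for a three-class result as in the sketch below. The text does not state how the metrics are aggregated over the classes, so a one-vs-rest treatment of a single class is assumed here; the function name and the example labels are hypothetical.

```python
def one_vs_rest_metrics(y_true, y_pred, positive):
    """Accuracy, recall and specificity, treating `positive` as the positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth != positive and pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / len(y_true),
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


# Hypothetical labels: -1 poor, 0 good, 1 excellent.
y_true = [1, 1, 0, -1, -1, 0]
y_pred = [1, 0, 0, -1, 0, 0]
metrics = one_vs_rest_metrics(y_true, y_pred, positive=1)
```

Repeating the call for each of the three classes and averaging the results is one common way to obtain a single figure per metric.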

Table 3 Comparison results of teaching performance evaluation

The preceding simulation results and analysis can be summarized as follows:

  1. In CCDCNs, allocating cache space arbitrarily is not ideal, and using a better cache scheduling method has a great impact on QoE. When C-SPT is used to schedule the cache, the result is better than with the isomorphic method.

  2. Transmission rate is the main factor affecting QoE. The set of Q determines whether the maximum QoE value can be obtained. In practice, a transmission rate that is too high or too low leads to poor QoE. Therefore, the ISP should select an appropriate content block transmission rate for users.

  3. The popularity type of the content has little impact on MOS. However, it changes the optimal deployment, resulting in different QoE. When the number of requests is too large, the cached content should be scheduled to the edge to meet them. Surprisingly, QoE does not change significantly with content popularity.

  4. We abstract teaching performance into 14 features in this work, so that the network can distribute it more accurately. The comparison results show that our method obtains higher accuracy. However, a larger number of features affects the real-time performance of the network and reduces its efficiency.

4 Conclusions

In this work, we introduce a content distribution network and reinforcement learning to improve the evaluation of English teaching quality. For the cache scheduling problem, we construct three cache scheduling algorithms in CCDCNs. Firstly, an approximate dynamic algorithm, C-SPT, is proposed, which has high complexity. Then, based on the characteristics of node centralization, an approximate dynamic scheduling algorithm, C-SPT, is also proposed. This algorithm covers the scheduling of both cached content and content transmission rate, and it has low complexity in processing content scheduling. Finally, a cache scheduling algorithm based on DRL is proposed. Although this algorithm has high complexity, its scheduling accuracy is also high. To further improve the performance and service quality of CCDCNs, future work will address the following aspects. Because content in big data is diverse and updated rapidly, collecting samples in which the cached content of CCDCNs closely matches user target requirements becomes more complex and difficult. Therefore, researchers need to further solve the problem of hot content migration or explore more efficient big data technology to improve the prediction accuracy of content popularity and further improve cache performance. In addition, CCDCNs are far away from users, and centralized processing of users' needs is very resource-consuming, so placing the data center at the edge has high research value.