Abstract
When autonomous vehicles (AVs) are commercialized, people will be able to engage in various activities in the vehicle, such as reading books and using mobile devices. However, about two-thirds of passengers suffer from carsickness when looking at a still scene in a moving vehicle. Carsickness negates the advantages of AVs and must therefore be overcome. This paper proposes a method that cancels out the acceleration generated by the AV by actuating the motor-based power seat, together with a method that determines the actuation signal of the power seat through reinforcement learning (RL). The effectiveness of the method was then evaluated through a simulation, and the results indicate that the proposed method is effective in reducing carsickness. Future work includes improving performance through RL optimization and verifying the actual effect through human studies.
1 Introduction
According to the literature, about two-thirds of passengers suffer from carsickness [7]. Once autonomous vehicles (AVs) become commercially available, everyone in an AV becomes a passenger, so the probability that an occupant suffers from carsickness increases. Carsickness makes passengers uncomfortable and negates the benefits of AVs. It is therefore an important problem to solve.
Several countermeasures have been proposed in previous studies. One approach installed a webcam on the dashboard of a vehicle and used the captured scene as the background of a mobile device (e.g., smartphone or tablet) [5]. Another conveyed the rotation direction of the vehicle through vibration from haptic devices composed of 7 mini vibration motors attached to both of the passenger’s forearms [3]. A third conveyed the vehicle’s rotation direction using 32 light-emitting diodes (LEDs) installed around the visual display device [2]. Finally, one study used the border of the smartphone screen as a visualization area and presented bubbles that move in this area according to the direction and magnitude of the vehicle’s acceleration [4]. However, some of these approaches require an additional device (e.g., webcam, haptic device, or LEDs) [2, 3, 5], and some are effective only while a mobile device is in use [2, 4]. This paper therefore develops a method that cancels out the acceleration generated by the AV using a power seat. The method requires no additional device because it uses the power seat already present in the vehicle, and it is applicable not only when using a mobile device but also when reading a book. In the proposed system, carsickness increases or decreases depending on the signal applied to the power seat. The actuation signal is therefore determined through reinforcement learning (RL), which finds the best choice by trial and error [8].
To validate the proposed method, a simulation was performed to compare the otolith response in two cases: (i) when the vehicle acceleration was not canceled out, and (ii) when the vehicle acceleration was canceled out by applying the actuation signal generated by RL to the power seat. The results support the feasibility of RL-based power seat actuation for mitigating carsickness.
This paper is organized as follows. Sect. 2 introduces RL and its application to power seat actuation. Sect. 3 presents the simulation conditions, measurements, learning environment and hyper parameters for RL, and simulation results. Finally, Sect. 4 closes the paper with conclusions and future work.
2 Reinforcement Learning
This section briefly introduces RL and the configuration for applying RL to power seat actuation.
2.1 Reinforcement Learning
RL is a machine learning method that improves performance through trial and error. As shown in Fig. 1, RL consists of two components, the agent and the environment, and three kinds of information (action, state, and reward) are exchanged between them. During training, the agent performs various actions in various states and receives the resulting rewards, and thereby learns which action yields a higher reward in a given state.
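The agent-environment loop described above can be illustrated with a deliberately small sketch. Everything in it is hypothetical and chosen only for illustration: `ToyEnvironment`, `AveragingAgent`, the two-state task, and the reward rule are not part of this paper, and the agent simply averages observed rewards rather than using the PPO algorithm adopted in Sect. 3.3.

```python
import random
from collections import defaultdict


class ToyEnvironment:
    """Hypothetical two-state task: reward is 1.0 when the action equals the state."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self.state = 0

    def step(self, action):
        reward = 1.0 if action == self.state else 0.0
        self.state = self._rng.randrange(2)  # next state is drawn at random
        return self.state, reward


class AveragingAgent:
    """Keeps a running mean of the reward observed for each (state, action) pair."""

    def __init__(self, n_actions=2, seed=1):
        self._rng = random.Random(seed)
        self.n_actions = n_actions
        self.mean_reward = defaultdict(float)
        self.count = defaultdict(int)

    def explore(self):
        # Pure random exploration: the "trial" part of trial and error.
        return self._rng.randrange(self.n_actions)

    def update(self, state, action, reward):
        key = (state, action)
        self.count[key] += 1
        self.mean_reward[key] += (reward - self.mean_reward[key]) / self.count[key]

    def best_action(self, state):
        # The action with the highest average reward seen so far in this state.
        return max(range(self.n_actions), key=lambda a: self.mean_reward[(state, a)])


def train(steps=200):
    env, agent = ToyEnvironment(), AveragingAgent()
    state = env.state
    for _ in range(steps):
        action = agent.explore()                 # agent -> environment: action
        next_state, reward = env.step(action)    # environment -> agent: state, reward
        agent.update(state, action, reward)
        state = next_state
    return agent
```

After `train()`, calling `best_action(state)` returns the action whose average reward was highest in that state, which is the sense in which training reveals the better action for a given state.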
2.2 Reinforcement Learning for Power Seat Actuation
The objective of this paper is to cancel out the acceleration generated by the AV by actuating the power seat. In this system, the actuation signal of the power seat is adjusted while observing the vehicle state, the passenger state, and the power seat state. Therefore, the RL agent is the power seat controller that generates the actuation signal, and the RL environment consists of the vehicle, the passenger, and the power seat, which are the observation targets.
Firstly, the agent performs an action that generates a power seat actuation signal within a specific range (between −1 m/s² and 1 m/s²). The velocity of the power seat is limited to between −0.5 m/s and 0.5 m/s, and the workspace of the power seat is limited to 1 m because the space inside the AV is limited.
Secondly, the power seat receiving the actuation signal changes the passenger acceleration (vehicle acceleration minus power seat acceleration), otolith response (perceived vestibular acceleration), and power seat position. Among them, the otolith response can be obtained by a mathematical model [10].
Thirdly, states of vehicle acceleration, passenger acceleration, otolith response of the passenger, and normalized power seat position are provided to the agent.
Finally, the environment generates the reward from the otolith response of the passenger as follows:
\[ r = \begin{cases} -|\hat{f}_{i}| + 2, & |\hat{f}_{i}| < 2 \\ -5\,|\hat{f}_{i}|, & \text{otherwise} \end{cases} \]
where \(r\) and \(\hat{f}_{i}\) are the reward and the currently perceived force, respectively. The greater the vestibular acceleration sensed by the passenger, the greater the motion sickness [6]. Therefore, to obtain an agent that produces a smaller vestibular acceleration, the reward is generated by multiplying the magnitude of the otolith response by \((-1)\). In addition, the workspace limitation of the power seat gives rise to a special case. If the power seat reaches the workspace limit during actuation, it stops with an impact (a large acceleration), so the reward must take this impact into account. If there is no impact (\(|\hat{f}_{i}|\) is smaller than 2), 2 is added to keep the reward positive. When there is an impact, the perceived vestibular acceleration is instead multiplied by 5 to give a large penalty.
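The action limits, the passenger acceleration, and the reward rule described above can be put together in a short sketch. The limits (1 m/s², 0.5 m/s, 1 m), the threshold of 2, the offset of 2, and the penalty factor of 5 come from the text. The time step `DT`, the centred position origin, the normalization to [−1, 1], and all function names are assumptions made for illustration, and the use of the magnitude of the perceived force in the reward is one reading of the description. The otolith model of [10] is not implemented here.

```python
from dataclasses import dataclass

ACC_LIMIT = 1.0         # m/s^2, actuation signal range (from the text)
VEL_LIMIT = 0.5         # m/s, seat velocity limit (from the text)
WORKSPACE = 1.0         # m, total seat travel (from the text)
IMPACT_THRESHOLD = 2.0  # threshold on |f_hat| separating impact from no impact (from the text)
DT = 0.02               # s, integration step (assumption, not given in the paper)


def clamp(x, lo, hi):
    return max(lo, min(hi, x))


@dataclass
class SeatState:
    position: float = 0.0  # m, measured from the centre of the workspace (assumption)
    velocity: float = 0.0  # m/s


def step_seat(seat, action, dt=DT):
    """Advance the seat by one step; return (new state, realised seat acceleration)."""
    acc_cmd = clamp(action, -ACC_LIMIT, ACC_LIMIT)
    vel = clamp(seat.velocity + acc_cmd * dt, -VEL_LIMIT, VEL_LIMIT)
    pos = seat.position + vel * dt
    half = WORKSPACE / 2
    if abs(pos) >= half:
        # End of travel: the seat stops abruptly, which appears as a large acceleration.
        pos = clamp(pos, -half, half)
        vel = 0.0
    realised_acc = (vel - seat.velocity) / dt
    return SeatState(pos, vel), realised_acc


def passenger_acceleration(vehicle_acc, seat_acc):
    # Vehicle acceleration minus power seat acceleration, as defined in the text.
    return vehicle_acc - seat_acc


def normalized_position(seat):
    # Maps the seat position to [-1, 1]; the exact normalization is an assumption.
    return seat.position / (WORKSPACE / 2)


def reward(f_hat):
    """Reward from the perceived force f_hat (otolith response)."""
    magnitude = abs(f_hat)
    if magnitude < IMPACT_THRESHOLD:
        return 2.0 - magnitude   # no impact: negated response, shifted to stay positive
    return -5.0 * magnitude      # impact: large penalty
```

With these assumed values, a seat moving at 0.5 m/s that hits the end of its travel produces a realised acceleration of −25 m/s² in a single step, which illustrates why the impact case needs its own penalty.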
3 Simulation
This section presents the simulation conditions, measurements, learning environment and hyper parameters of RL, and simulation results, in order to determine whether the RL-based power seat actuation method reduces the sensory conflict of the AV passenger.
3.1 Simulation Conditions
Several values must be selected before the simulation can be performed, including the values related to the otolith response and the acceleration, velocity, and position ranges of the AV. These values were selected so that the simulation environment resembles an AV driving environment. Firstly, the vehicle acceleration provided by the environment to the agent is a value randomly selected between −3 m/s² and 3 m/s², so the simulated AV accelerates and decelerates at random. Secondly, there is no restriction on the position of the vehicle.
To check the feasibility of RL-based power seat actuation, two conditions were compared: the baseline condition, in which the power seat does not move, and the proposed condition, in which the power seat is driven using RL. The comparison was made for an AV driven for 60 s.
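As a minimal sketch of these conditions, the following generates a random vehicle acceleration profile and the passenger acceleration for the baseline condition. The ±3 m/s² range and the 60 s duration come from the text. The time step `DT`, the choice to draw a new acceleration at every step, the uniform distribution, and the function names are assumptions for illustration only.

```python
import random

DURATION_S = 60.0  # s, length of the simulated drive (from the text)
ACC_RANGE = 3.0    # m/s^2, vehicle acceleration is drawn from [-3, 3] (from the text)
DT = 0.02          # s, simulation step (assumption, not given in the paper)


def vehicle_acceleration_profile(seed=0, duration=DURATION_S, dt=DT):
    """Return one random acceleration sample per simulation step."""
    rng = random.Random(seed)
    n_steps = round(duration / dt)
    return [rng.uniform(-ACC_RANGE, ACC_RANGE) for _ in range(n_steps)]


def passenger_accelerations(vehicle_acc, seat_acc=None):
    """Vehicle acceleration minus seat acceleration; a stationary seat contributes zero."""
    if seat_acc is None:
        seat_acc = [0.0] * len(vehicle_acc)
    return [v - s for v, s in zip(vehicle_acc, seat_acc)]
```

In the baseline condition the passenger experiences the vehicle acceleration unchanged, while in the proposed condition `seat_acc` would hold the seat accelerations produced by the trained policy.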
3.2 Measurements
As shown in Fig. 2, if the passenger reads a book or uses a smartphone in the AV, the passenger receives fixed visual feedback. That is, the perceived visual acceleration is zero. On the other hand, the remainder obtained by subtracting the power seat acceleration from the vehicle acceleration is transmitted to the vestibular system. If this remaining acceleration is not zero, the perceived vestibular acceleration of the passenger is also nonzero. Carsickness arises from the difference between these two perceived accelerations [6], and it is intuitive that carsickness increases as this difference increases [9]. Because the perceived visual acceleration is assumed to be zero, the difference equals the perceived vestibular acceleration: the larger the perceived vestibular acceleration, the greater the carsickness. The magnitude of the perceived vestibular acceleration was therefore used as the measurement for performance comparison.
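The measurement can be written down in a few lines. This is a sketch under the assumption stated in the text that the perceived visual acceleration is zero; the function names are hypothetical, and the input is taken to be a series of perceived vestibular accelerations that has already passed through the otolith model of [10], which is not implemented here.

```python
def sensory_conflict(perceived_vestibular, perceived_visual=0.0):
    """Difference between the two perceived accelerations."""
    return perceived_vestibular - perceived_visual


def mean_conflict_magnitude(perceived_vestibular_series):
    """Mean magnitude of the conflict over a run, used to compare conditions."""
    series = list(perceived_vestibular_series)
    if not series:
        raise ValueError("at least one sample is required")
    return sum(abs(sensory_conflict(f)) for f in series) / len(series)
```

Taking the magnitude before averaging matters here: positive and negative accelerations both contribute to the conflict and should not cancel each other in the mean.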
3.3 Learning Environment and Hyper Parameters
The learning environment was configured using the Unity ML-Agents Toolkit. For training, the proximal policy optimization (PPO) algorithm [8] was used, which is reported to perform well and is the most commonly used [1]. The hyper parameters used for learning are shown in Table 1. Training was run for 4 million steps, by which point the reward no longer increased and the loss no longer decreased. After training was completed, the actuation signal of the power seat was generated using the learned model.
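For reference, the clipped surrogate objective that PPO maximizes, as given in [8], is reproduced below. The probability ratio is written here as \(\rho_{t}(\theta)\) rather than the \(r_{t}(\theta)\) of [8] so that it is not confused with the reward \(r\) of Sect. 2.2; \(\hat{A}_{t}\) is the advantage estimate and \(\epsilon\) is the clipping range. The values used for training in this paper are those of Table 1 and are not restated here.
\[ L^{\mathrm{CLIP}}(\theta) = \hat{\mathbb{E}}_{t}\left[ \min\left( \rho_{t}(\theta)\,\hat{A}_{t},\ \operatorname{clip}\left(\rho_{t}(\theta),\, 1-\epsilon,\, 1+\epsilon\right)\hat{A}_{t} \right) \right], \qquad \rho_{t}(\theta) = \frac{\pi_{\theta}(a_{t} \mid s_{t})}{\pi_{\theta_{\mathrm{old}}}(a_{t} \mid s_{t})} \]
Clipping the ratio removes the incentive to move the updated policy far from the policy that collected the data within a single update.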
3.4 Simulation Results
Figure 3 shows the mean otolith response when RL-based power seat actuation is applied and when the power seat is stationary. As seen in the figure, applying RL-based power seat actuation reduced the mean otolith response by about 38.44%. A statistical analysis was performed to check whether the difference in mean otolith response between the two conditions was statistically significant. The analysis showed a statistically significant difference between the mean otolith responses of the two conditions (F(1, 98) = 481.039, p < 0.001).
4 Conclusions and Future Work
This paper proposed an RL-based power seat actuation method to alleviate the carsickness that AV passengers may experience, and a simulation was performed to verify the proposed method. The simulation showed that the otolith response decreased by about 38% when the proposed method was applied. In future work, the authors will search for the reward that minimizes the otolith response and will verify through human studies whether the method is effective in practice.
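The two quantities reported above can be computed as in the following sketch. It assumes the test was a one-way analysis of variance with the condition as the single factor, which the degrees of freedom F(1, 98) are consistent with (two conditions and 100 observations in total), although the paper does not name the test or say how the observations were obtained. The function names are hypothetical, and no data from the paper are reproduced.

```python
from statistics import fmean


def percent_reduction(baseline_mean, treated_mean):
    """Relative reduction of the treated mean with respect to the baseline, in percent."""
    return 100.0 * (baseline_mean - treated_mean) / baseline_mean


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over the given groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean(x for g in groups for x in g)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the samples around their own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

For two groups of 50 observations each, the function returns degrees of freedom of 1 and 98, matching the form of the reported statistic; the p-value would additionally require the F distribution, which is left out of this sketch.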
References
1. Andrychowicz, M., et al.: What matters in on-policy reinforcement learning? A large-scale empirical study. arXiv preprint arXiv:2006.05990 (2020)
2. Karjanto, J., Yusof, N.M., Wang, C., Terken, J., Delbressine, F., Rauterberg, M.: The effect of peripheral visual feedforward system in enhancing situation awareness and mitigating motion sickness in fully automated driving. Transport. Res. F: Traffic Psychol. Behav. 58, 678–692 (2018)
3. Md. Yusof, N., Karjanto, J., Kapoor, S., Terken, J., Delbressine, F., Rauterberg, M.: Experimental setup of motion sickness and situation awareness in automated vehicle riding experience. In: Proceedings of the 9th International Conference on Automotive User Interfaces and Interactive Vehicular Applications Adjunct, pp. 104–109 (2017)
4. Meschtscherjakov, A., Strumegger, S., Trösterer, S.: Bubble margin: motion sickness prevention while reading on smartphones in vehicles. In: Lamas, D., Loizides, F., Nacke, L., Petrie, H., Winckler, M., Zaphiris, P. (eds.) INTERACT 2019. LNCS, vol. 11747, pp. 660–677. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-29384-0_39
5. Miksch, M., Steiner, M., Miksch, M., Meschtscherjakov, A.: Motion sickness prevention system (MSPS): reading between the lines. In: Adjunct Proceedings of the 8th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, pp. 147–152 (2016)
6. Reason, J.T.: Motion sickness adaptation: a neural mismatch model. J. R. Soc. Med. 71(11), 819–829 (1978)
7. Schmidt, E.A., Kuiper, O.X., Wolter, S., Diels, C., Bos, J.E.: An international survey on the incidence and modulating factors of carsickness. Transport. Res. F: Traffic Psychol. Behav. 71, 76–87 (2020)
8. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347 (2017)
9. Stoffregen, T.A., Riccio, G.E.: An ecological critique of the sensory conflict theory of motion sickness. Ecol. Psychol. 3(3), 159–194 (1991)
10. Young, L.R., Meiry, J.L.: A revised dynamic otolith model. In: Third Symposium on the Role of the Vestibular Organs in Space Exploration, NASA SP-152, pp. 363–368 (1968)
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Lee, CG., Kwon, O. (2023). Reinforcement Learning Based Power Seat Actuation to Mitigate Carsickness of Autonomous Vehicles. In: Stephanidis, C., Antona, M., Ntoa, S., Salvendy, G. (eds) HCI International 2023 Posters. HCII 2023. Communications in Computer and Information Science, vol 1836. Springer, Cham. https://doi.org/10.1007/978-3-031-36004-6_6
DOI: https://doi.org/10.1007/978-3-031-36004-6_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36003-9
Online ISBN: 978-3-031-36004-6
eBook Packages: Computer Science (R0)