Facial expression monitoring system for predicting patient’s sudden movement during radiotherapy using deep learning

Abstract

Purpose: Imaging, breath-holding/gating, and fixation devices have been developed to minimize setup errors so that the prescribed dose can be delivered exactly to the target volume in radiotherapy. Despite these efforts, additional patient monitoring devices have been installed in treatment rooms to view patients' whole-body movement. We developed a facial expression recognition system using deep learning with a convolutional neural network (CNN) to predict patients' movement in advance, enhancing the stability of radiation treatment by giving warning signs to radiation therapists.

Materials and methods: A convolutional neural network model and the extended Cohn-Kanade dataset, with 447 facial expression source images, were used for training. Additionally, a user interface operable from the treatment control room was developed to monitor the patient's facial expression in the treatment room in real time, and the entire system was constructed by installing a camera in the treatment room. To predict the possibility of a patient's sudden movement, we categorized facial expressions into two groups: (a) uncomfortable expressions and (b) comfortable expressions. A warning sign about possible sudden movement was given when an uncomfortable expression was recognized.

Results: We constructed the facial expression monitoring system; the training and test accuracy were 100% and 85.6%, respectively. In ten patients, emotions were recognized from their comfortable and uncomfortable expressions with a 100% detection rate. The detected emotions were represented by a heatmap, and motion prediction accuracy was analyzed for each patient.

Conclusion: We developed a system that monitors the patient's facial expressions and predicts the patient's movement in advance during treatment. We confirmed that our patient monitoring system can be used to complement existing monitoring systems. This system will help maintain the initial setup and improve the accuracy of radiotherapy using deep learning.

Images such as cone-beam computed tomography and megavoltage computed tomography are obtained periodically at the beginning of and during treatment, and respiratory-gated radiotherapy is used to minimize errors caused by internal organ motion.4,5 Moreover, various immobilizers have been used to minimize body movement and improve the accuracy of radiation treatment.6,7 Additionally, patients are monitored from the radiation therapy control room using a video imaging device installed inside the treatment room. However, it is difficult to prevent patients from shifting their posture by several millimeters to several centimeters or from moving suddenly.
Therefore, a system is needed that informs the radiation therapist of a patient's emotional state by providing an audiovisual alarm. Certainly, various surface imaging tools are available that can detect a patient's movement in real time; however, these tools are expensive and difficult to install in every treatment room.
Recognition of facial expressions can be applied to a wide range of research areas, such as diagnosing mental disorders and detecting human social/physiological interactions. Affect is shown through emotions, and these emotions are expressed in various ways by the face, gestures, posture, voice, and actions; they can also affect physiological parameters. Therefore, understanding emotional expression is important when performing diagnostic and therapeutic procedures.8-10 Fantoni and Gerbino11 conducted a cognitive-psychology experiment in which subjects touched a visually induced object either comfortably or uncomfortably while their facial emotions were identified. This indicates an association between a patient's facial expression and the comfortable or uncomfortable posture of the body. Because the emotion identified from the facial expression is associated with physical movement, the patient's physical condition can be inferred by detecting uncomfortable emotions, which in turn enables patient monitoring. Shakya et al. studied the prediction of human behavior through the recognition of facial expressions.12 In their study, an artificial intelligence (AI) algorithm was used for facial expression recognition, and appropriate actions were predicted from series of emotions.
Several face recognition research projects use facial image databases (Table 1). In the early 1970s, Paul Ekman introduced specific guidelines for research on the face-emotion connection and its judgment and analysis in psychology, anthropology, ethology, sociology, and biology. Recently, facial recognition has been widely used in practical life, for example in high-security applications.12 The main purpose of research in the 2000s was to build vast facial expression databases; these databases were ultimately applied to predict changes in human behavior through facial expression recognition and to issue warnings.13-16 In the radiation treatment process, before treatment, the patient is advised by the radiation therapist not to move while treatment is in progress. In reality, however, the patient's movement during treatment cannot be controlled directly, apart from auditory instructions given via a microphone in the control room. An AI algorithm using facial expression recognition can therefore be applied in this situation. In this study, we aimed to develop a system that recognizes a patient's facial expression, detects the patient's discomfort using deep learning, alerts radiation therapists in advance to the possibility of sudden movement due to that discomfort, and confirms the feasibility of its use in patient treatment.

| MATERIALS AND METHODS
In this study, it was necessary to construct a system that recognizes patients' facial expressions, detects uncomfortable feelings, and raises alarms to predict sudden patient movements. That is, in order

2.B | Convolutional neural network
For the machine learning algorithm, we used a CNN to classify patients' facial expressions from images (Fig. 3). The CNN is widely used in computer vision and is one of the machine learning algorithms used for image-based pattern recognition.18-20 A convolutional neural network is modeled on a computer in a way that
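To make the image-based pattern recognition concrete, the building blocks of one CNN stage (convolution, ReLU activation, max pooling) can be sketched with plain NumPy. This is an illustrative sketch, not the model used in the study; the 48x48 patch size and the edge-detecting kernel are assumptions for demonstration only.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation: slide the kernel over the image."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Element-wise rectified linear activation."""
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Non-overlapping max pooling that halves each spatial dimension."""
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

# A hypothetical 48x48 grayscale face patch through one conv + ReLU + pool stage.
face = np.random.default_rng(0).random((48, 48))
edge_kernel = np.array([[1.0, 0.0, -1.0]] * 3)  # simple vertical-edge filter
feature_map = max_pool(relu(conv2d(face, edge_kernel)))
print(feature_map.shape)  # (23, 23)
```

A full CNN stacks several such stages and ends with fully connected layers that map the pooled feature maps to class scores, one per expression group.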

2.C | Training data and test dataset
The training dataset used for learning the CNN was the extended Cohn-Kanade (CK+) dataset.24
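Since the system collapses expressions into the two groups described in the abstract (comfortable vs. uncomfortable), the CK+ emotion labels must be mapped onto those groups. The paper does not enumerate the exact assignment, so the mapping below is an illustrative assumption; only the CK+ label set itself is standard.

```python
# CK+ emotion labels, plus the neutral state.
CKPLUS_EMOTIONS = ["neutral", "anger", "contempt", "disgust",
                   "fear", "happiness", "sadness", "surprise"]

# Assumed assignment to the paper's two groups (illustrative only).
COMFORTABLE = {"neutral", "happiness"}

def expression_group(label):
    """Map a CK+ emotion label to the binary comfort class used for warnings."""
    if label not in CKPLUS_EMOTIONS:
        raise ValueError(f"unknown CK+ label: {label}")
    return "comfortable" if label in COMFORTABLE else "uncomfortable"

print(expression_group("fear"))     # uncomfortable
print(expression_group("neutral"))  # comfortable
```

Under this scheme, a warning sign would be raised whenever a frame is classified into the "uncomfortable" group.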

2.D | Real test sample with the patients
Ten patients (male = 7, female = 3, average age = 67.3 ± 6.72) were tested for recognition rate using our system in the treatment room.
The patients could not wear a thermoplastic mask, so patients treated without a mask were enrolled.

Fig. 4. Implemented system algorithm using the convolutional neural network and the entire process for the whole treatment.

Table 3 shows that eight of the recorded patients displayed an uncomfortable expression during treatment, yet no sudden movement was detected. Stability was the dominant emotion relative to the other emotions during monitoring, which means that most patients were treated with varying expressions rather than maintaining a constant expression for a sustained time. The expected non-motion score represents, for each patient, the extent to which comfortable emotions across all frames predict no movement; thus, patients showing comfortable and neutral emotions had higher scores. The expected motion score is the complement of the non-motion score. Motion prediction accuracy therefore follows the expected non-motion score, since no patient movement actually occurred (Table 3).
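The expected non-motion score described above can be sketched as the fraction of monitored frames whose recognized emotion predicts no movement. This is a minimal sketch assuming a per-frame label stream and the illustrative comfortable/uncomfortable grouping from earlier; the paper's exact scoring formula is not given, so treat this as an interpretation.

```python
COMFORTABLE = {"neutral", "happiness"}  # assumed comfortable classes

def expected_non_motion(frame_labels):
    """Fraction of frames whose recognized emotion predicts no movement;
    the expected motion score is its complement."""
    if not frame_labels:
        raise ValueError("no frames recorded")
    comfortable = sum(1 for lab in frame_labels if lab in COMFORTABLE)
    return comfortable / len(frame_labels)

# A hypothetical patient monitored over six frames, mostly neutral.
frames = ["neutral", "neutral", "fear", "neutral", "happiness", "neutral"]
score = expected_non_motion(frames)
print(round(score, 3))       # 0.833
print(round(1 - score, 3))   # expected motion: 0.167
```

A patient who stays mostly neutral or comfortable thus scores near 1, matching the observation that such patients had higher expected non-motion scores.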

3.B | Accuracy test
To evaluate the accuracy of the system, we analyzed the training and validation accuracy of the CNN model, and receiver operating characteristic (ROC) curve analysis was performed for each facial expression group. To evaluate the performance of the system, the recognition rate was tested using a camera with the developed CNN model. Subsequently, patients in the actual treatment room were monitored to analyze the recognition rate. The results of the accuracy analysis of the developed CNN model are shown in Fig. 7. Training accuracy reached a maximum of 100%, and validation accuracy was 82.2% [Fig. 7(a)]. Test accuracy was 85.6%. The ROC curves for each emotion identified by the output are shown in Fig. 7(b), with an average of 95.8%, a measure of how accurately each emotion is identified by the model. Additionally, statistical analysis and follow-up correlation studies are required to identify the emotional states in which patients actually move. In the monitoring scheme shown in Fig. 4(b), as the treatment fractions progress, the patient's facial expression results from the previous treatment are fed back to the monitoring system at the next treatment, yielding an increasingly patient-specific system as the number of treated fractions increases.
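The per-emotion ROC figure reported above summarizes one-vs-rest discrimination. As a reminder of what that number measures, the area under a ROC curve can be computed directly from the rank-sum (Mann-Whitney) formulation: the probability that a positive sample is scored above a negative one. The labels and scores below are made up for illustration.

```python
import numpy as np

def roc_auc(labels, scores):
    """AUC via pairwise comparison: probability a positive outscores a negative."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    wins = (pos[:, None] > neg[None, :]).sum()   # positive scored higher
    ties = (pos[:, None] == neg[None, :]).sum()  # ties contribute half
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

# One-vs-rest AUC for a single emotion class, with made-up scores.
y = [1, 0, 1, 1, 0, 0]
p = [0.9, 0.2, 0.7, 0.6, 0.4, 0.3]
print(roc_auc(y, p))  # 1.0 (perfect separation in this toy example)
```

An average AUC of 0.958 across the emotion classes, as reported, means positives are ranked above negatives about 95.8% of the time per class.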

| DISCUSSION
The recognition rate of the patient's facial expression depends on the image angle. The source images were obtained at angles of up to 30° from the frontal direction, and expressions were not recognized when the face was turned more than 60° out of the camera view. Therefore, we selected a camera angle such that the eyes, nose, and mouth were captured from the direction of the forehead.

Fig. 6. Facial expression monitoring result by frame sequence (n = 10).
If training were performed using a dataset with unconstrained viewing angles, the recognition rate of the system would be expected to improve; further study on this matter is required.
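The angle constraints above suggest a simple gating rule for incoming frames. The sketch below is an illustration only: the 30°/60° thresholds come from the text, but the three-way decision and the idea of estimating head yaw per frame are assumptions (yaw estimation itself is outside this sketch).

```python
def frame_usable(yaw_deg):
    """Gate a video frame by estimated head yaw in degrees.
    Thresholds follow the text: reliable up to ~30 deg, unrecognized beyond ~60 deg."""
    yaw = abs(yaw_deg)
    if yaw <= 30:
        return "recognize"   # within the trained angle range
    if yaw <= 60:
        return "degraded"    # may still be detected, lower confidence
    return "reject"          # outside the recognizable range

print(frame_usable(15))   # recognize
print(frame_usable(45))   # degraded
print(frame_usable(75))   # reject
```

Such a gate would let the monitoring system suppress unreliable classifications instead of issuing false warnings when the patient's head is turned away.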
When applied clinically, the current system is limited for patients undergoing head and neck stereotactic radiosurgery who wear a thermoplastic head mask or a stereotactic head frame, and for patients in the prone position. Moreover, only ten patients were tested in the treatment room during their treatment, as a preliminary study. In further work, more patient cases will be analyzed after final installation in the treatment room.
Commercial surface imaging systems can monitor a patient's motion in real time. However, in treatment environments where surface imaging is difficult, the system developed and proposed in this study can be implemented efficiently.
The patient's face recognition could also be used to identify the patient, in conjunction with the image taken at the time of the patient's visit. It could be implemented as part of an intelligent patient treatment system by integrating it into the overall information process of the treatment stages, such as treatment setup.

| CONCLUSIONS
We have developed a system that recognizes a patient's facial expressions, detects the patient's uncomfortable emotions using an AI algorithm, and gives the radiation therapist advance warning of possible patient movement resulting from the patient's discomfort. If the recognition rate and accuracy of the system are improved through further studies, we expect that the system can be actively used in patient treatment.