A multi-institutional machine learning algorithm for prognosticating facial nerve injury following microsurgical resection … – Nature.com

Posted: June 11, 2024 at 2:48 am


Facial nerve injury is a morbid complication of treatment for VS, with downstream effects ranging from social stigma, patient depression and reduced quality of life16,17, to corneal abrasions and ulcers from incomplete eye closure and loss of corneal sensation18. Other than tumor size, relatively little is understood about factors that may influence facial nerve outcomes in microsurgery for VS. The clinical impact of facial nerve injury and the importance of facial nerve preservation are highlighted by the extensive literature exploring predictors of facial nerve injury19,20,21,22,23. We leveraged our multi-institutional experience at two centers with high volumes of VS patients and applied machine learning techniques to identify novel predictors of facial nerve injury in patients treated with microsurgery.

Machine learning technologies have recently undergone a resurgence alongside the development of computational tools for handling and storing the large amounts of data required for their meaningful and broad scale utilization13,24. The recognition that such tools can be used to glean novel trends from data that are not readily apparent from common descriptive statistical approaches makes their application within the clinical domain a valuable and ongoing endeavor25. Such a phenomenon can be seen in the present study where tests of association, comparing measures of centrality between outcome groups, did not identify any factors which significantly differed between patients with and without preserved facial function. In contrast, random forest feature importance analysis discerned four features, namely BMI, case length, age and the tumor dimension representing growth towards the brainstem (measurement B), as being relevant in predicting 6-month facial nerve status. While further studies must be carried out to fully characterize the mechanistic role of these factors in facial nerve outcome, this demonstrates the utility of applying novel data science techniques to uncover non-linear interactions between variables which may have real-world, clinical relevance.
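As a toy illustration of this point (not the study's data or model), the following Python sketch builds a hypothetical feature whose mean is identical in both outcome groups, so a comparison of group means sees nothing, yet a simple two-threshold rule of the kind a decision tree can express separates the groups perfectly. All values and function names are assumptions made for illustration.

```python
from statistics import mean

# Hypothetical feature values, symmetric about zero (illustrative, not study data).
values = list(range(-10, 11))
# Hypothetical outcome: 1 when the feature is extreme in either direction.
outcomes = [1 if abs(v) > 5 else 0 for v in values]


def group_means(xs, ys):
    """Mean feature value in the outcome-positive and outcome-negative groups."""
    positive = [x for x, y in zip(xs, ys) if y == 1]
    negative = [x for x, y in zip(xs, ys) if y == 0]
    return mean(positive), mean(negative)


def best_two_threshold_accuracy(xs, ys):
    """Best accuracy of a rule 'predict 1 when x < low or x > high'.

    This mimics what a depth-two decision tree on a single feature can express.
    """
    candidates = sorted(set(xs))
    best = 0.0
    for low in candidates:
        for high in candidates:
            if low > high:
                continue
            predictions = [1 if (x < low or x > high) else 0 for x in xs]
            correct = sum(p == y for p, y in zip(predictions, ys))
            best = max(best, correct / len(ys))
    return best


mean_positive, mean_negative = group_means(values, outcomes)
tree_like_accuracy = best_two_threshold_accuracy(values, outcomes)
print(mean_positive, mean_negative, tree_like_accuracy)
```

In this constructed example the two group means coincide while the threshold rule classifies every case correctly, which is the kind of non-linear relationship that a comparison of central tendency cannot detect.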

As previously noted, tumor measurements utilized in our study were selected due to their relationships to surgical corridors, as well as having been shown to correlate well with tumor size by volumetric analysis in previous literature10. We found high ICC for all measurements, which was comparable to other reports in the literature on similar VS measurement tasks26,27. Although historically, an overall larger tumor size has been demonstrated to portend worse facial nerve function after microsurgical resection19,20,28,29,30, results of the present study identified the tumor dimension representing growth within the cerebellopontine angle between the mid-axis of the tumor and the brainstem as most predictive of facial nerve outcome. Our findings are consistent with prior literature, while providing further insight into possible mechanisms by which tumor size may influence facial nerve injury. A relatively larger tumor dimension within the cerebellopontine angle, between the brainstem and porus acusticus, is postulated to result in more thinning and splaying of the facial nerve. This may cause direct mechanical injury and make the facial nerve more difficult to distinguish from tumor capsule and surrounding adherent arachnoid, placing the facial nerve at greater risk of iatrogenic injury31. Thus, our study builds on prior literature reporting greater tumor size as a predictor of facial nerve injury following vestibular schwannoma microsurgery, by suggesting that the tumor dimension representing growth within the cerebellopontine angle from the mid-axis of the tumor towards the brainstem has the greatest bearing on facial nerve outcome. We did not identify any difference between our facial nerve preservation and facial nerve dysfunction groups when comparing this dimension. It is worth noting that we observed a relatively higher rate of Koos grade III and IV tumors compared to other published series, suggesting that this series may be skewed towards larger tumors overall. This may partially explain our inability to detect a difference between facial nerve preservation and facial nerve injury groups based on tumor size. We anticipate that future studies including larger cohorts of patients might capture a relationship between this tumor dimension and facial nerve susceptibility to injury.
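For readers unfamiliar with the intraclass correlation coefficient (ICC) mentioned above, the sketch below computes a one-way random-effects ICC, often written ICC(1,1), in plain Python. The source does not state which ICC form was used, so this choice, the function name and the example measurements are assumptions for illustration only.

```python
def icc_one_way(ratings):
    """One-way random-effects ICC(1,1).

    `ratings` holds one row per tumor; each row contains the same
    measurement taken by k raters. Undefined (division by zero) when
    every value in `ratings` is identical.
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 tumors and 2 raters, with equal row lengths")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]

    # Between-tumor and within-tumor mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical measurements in millimetres from two raters (not study data).
perfect_agreement = [[10.0, 10.0], [20.0, 20.0], [30.0, 30.0]]
close_agreement = [[10.0, 11.0], [20.0, 19.0], [30.0, 31.0]]

icc_perfect = icc_one_way(perfect_agreement)
icc_close = icc_one_way(close_agreement)
print(icc_perfect, round(icc_close, 3))
```

Values near 1 indicate that most of the variation in a measurement comes from real differences between tumors rather than from disagreement between raters.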

Older patient age has been previously shown to be predictive of facial nerve dysfunction, similar to our own findings20,29, though this remains controversial. While some studies have found no significant relationship between post-operative facial nerve function and age32, our study and others have identified a trend towards increasing age influencing unfavorable facial nerve outcomes following vestibular schwannoma microsurgery33. Others reporting on this finding have hypothesized about the influence of frailty, burden of comorbidities, decreased neurologic reserve resulting in reduced facial nerve rehabilitation potential33, and the confounding influence of age itself on facial nerve grading, given that skin laxity and thinning may contribute to worse grading and/or worsened manifestations of facial nerve paralysis in elderly patients34. We further hypothesize that the basis of this relationship might be less favorable tissue dissection planes in patients of advanced age, placing older patients at greater risk of iatrogenic facial nerve injury. Although detailed analysis of the role of age in facial nerve outcome in patients undergoing vestibular schwannoma microsurgery is beyond the scope of the current study, further study would be valuable to confirm and better characterize the nature of this relationship. Our study also demonstrated additional features predictive of facial nerve outcomes which have not been previously identified. Our hypotheses regarding the role of BMI and case length are discussed further below.

Interestingly, our model identified BMI and operative case length as being highly predictive of facial nerve outcome at 6 months post-operatively. To the best of our knowledge, these associations have not been clearly delineated in previous studies. One study examined facial nerve injury in the context of post-operative complications and the need for readmission or re-operation, finding no significant association with BMI35. However, as the authors note, facial nerve injury often occurs without the requirement for reoperation and readmission, and thus is likely underrepresented in their analysis. Another study evaluated the influence of BMI on mean HB score pre-operatively (1.1 non-obese vs. 1.0 obese, p = 0.16) and post-operatively (1.9 non-obese vs. 1.7 obese, p = 0.32), finding no difference between obese and non-obese groups36. However, the timing of facial nerve function assessment is not clearly specified in this study, and when facial function is modelled as a categorical variable (rather than continuous, summarized with mean HB scores), obese patients were more likely than non-obese patients to have HB scores equal to or greater than III (9.2% non-obese vs. 17.7% obese). The observed association between BMI and facial nerve dysfunction in our study may be seen as hypothesis-generating, and should be explored in future studies. It is possible that difficult surgical ergonomics in high-BMI patients make tumor dissection off of the facial nerve more difficult, placing patients at higher risk of dysfunction37,38,39. For example, in higher BMI patients, relatively higher mass of the neck and shoulder may further narrow an already small operative working corridor, which in addition to requiring less ergonomic positioning for tumor access, limits the dissection vectors and angles, and reduces range of motion and visibility. The increased utilization of endoscopes40 and exoscopes41 in lateral skull base surgery may eventually mitigate some of these constraints.
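The contrast between the two ways of modelling facial function can be made concrete with a small sketch. The House-Brackmann (HB) scores below are invented so that both groups have the same mean, yet the proportion of patients at HB III or worse differs twofold. They are not the data from the cited study, and the function name is an assumption.

```python
from statistics import mean

# Hypothetical HB scores for 20 patients per group (not data from any study).
non_obese = [1] * 12 + [2] * 6 + [3] * 2
obese = [1] * 15 + [2] * 1 + [3] * 3 + [4] * 1


def proportion_at_or_above(scores, threshold=3):
    """Fraction of patients whose HB score is at or above the threshold."""
    return sum(score >= threshold for score in scores) / len(scores)


mean_non_obese = mean(non_obese)
mean_obese = mean(obese)
poor_non_obese = proportion_at_or_above(non_obese)
poor_obese = proportion_at_or_above(obese)

print(f"mean HB: {mean_non_obese} vs {mean_obese}")
print(f"HB >= III: {poor_non_obese:.0%} vs {poor_obese:.0%}")
```

Because HB grade is ordinal and most patients cluster at grade I, a mean can hide a shift in the tail of the distribution that a dichotomized outcome makes visible.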

Operative duration is identified in the present study as a key factor associated with facial nerve outcome in microsurgical resection of vestibular schwannomas. To our knowledge, this is the first description of this association, although it is consistent with previous studies in which prolonged operative duration has been shown to be associated with a higher rate of complications42. The observed association between increased operative length and a higher likelihood of facial nerve dysfunction may partly reflect the known association between tumor size and facial nerve outcomes, because larger tumors have longer average operative durations. However, given that larger overall tumor size and individual tumor measurements in three dimensions (parallel to the posterior petrous bone, between central axis of tumor and porus acusticus, and from porus acusticus to distalmost extent of tumor growth within the IAC) were not found to be predictive of facial nerve dysfunction, other factors which may increase case length should be considered and investigated in future studies as the underlying mechanism of this association. Factors such as tumor hypervascularity43, adherence to the facial nerve perineurium, and the direction of facial nerve displacement may be reflected in differences in operative length across patients, and thus contribute to the observed differential risk of facial nerve dysfunction as it relates to case length20. These factors may serve as a surrogate for dissection complexity. Lastly, it is important to recognize that this algorithm, like any machine learning/artificial intelligence tool, is limited by its inputs. As such, there may be other confounding variables that influence facial nerve injury risk which were not captured in our data or analysis. Further study will be critical to better understand the myriad factors which may influence the role of case length in facial nerve outcome in vestibular schwannoma microsurgery.

A major strength of this study is the inclusion of patient cohorts from three hospitals across two health systems, increasing the generalizability of the resulting model. The model demonstrates an expected performance decay from 90.5% to 84% when assessed on unseen data from one of the included institutions. This level of performance decay demonstrates both the low likelihood of overfitting of this model and the relative reliability of the model in the real-world (clinical) context. While the current model demonstrates good accuracy while avoiding overfitting, we anticipate that performance will continue to improve in the deployment phase as further data are collected at external sites and through future prospective validation with patient data from the participating institutions (Supplementary Fig. 2). While we appreciate the tremendous benefit of multi-center data collection to enhance reproducibility, generalizability and clinical translation of our algorithm, we also recognize that as we increase the number of participating centers and expand to include institutions outside of our region, hospital-related factors (setting, level of care, equipment, etc.) and surgeon-related factors (patient selection, preferred surgical approach, years of experience, etc.) will need to be considered and evaluated in this stage of deployment44.
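The performance decay described above is simple arithmetic, sketched here for clarity. The 90.5% and 84% figures come from the text; the function name and the decision to report both an absolute and a relative decay are assumptions made for illustration.

```python
def performance_decay(development_score, unseen_score):
    """Return the absolute decay (percentage points) and relative decay (fraction)."""
    if development_score <= 0:
        raise ValueError("development_score must be positive")
    absolute = development_score - unseen_score
    relative = absolute / development_score
    return absolute, relative


# Scores in percent, as reported in the text.
absolute_drop, relative_drop = performance_decay(90.5, 84.0)
print(f"{absolute_drop:.1f} percentage points, {relative_drop:.1%} relative")
```

This corresponds to a drop of 6.5 percentage points, or roughly 7% in relative terms, between development and unseen data.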

A limitation of the present study is an overall small proportion of patients with facial nerve dysfunction, which likely limited the statistical significance of associations which may have clinical relevance, as well as our ability to further stratify patients into different grades of facial function (i.e. HB I to VI). As vestibular schwannoma is a relatively rare disease entity, expanding our database with each currently participating institution will occur at a rate of roughly 30 to 60 patients per year, thus increasing the time to build a dataset robust enough to meaningfully improve the model metrics and generalizability. However, we aim to overcome this limitation through dissemination of our results and the current iteration of the algorithm. We aim to expand this work to include additional institutions both nationally and internationally, with the goals of improving statistical power and further increasing the generalizability of this work. As additional validation is performed, we anticipate that the machine learning lifecycle will re-start, including further iterations of model evaluation and tuning to further improve performance.

As previously noted, the current iteration of this algorithm was developed based on manual tumor measurements that have been shown to have strong reproducibility and correlation with volumetric analysis throughout the vestibular schwannoma literature. However, deployment could be expedited through automated tumor segmentation. Several such promising tools have recently been developed for vestibular schwannoma, although in all cases the authors acknowledge that these will require further validation before implementation45,46,47,48. This approach has shown significant promise in other medical contexts, particularly in developing strategies for automating chest X-ray review during the COVID-19 pandemic49,50, and in the identification of concerning vs. benign gastrointestinal polyps51,52. Lastly, as data science techniques are increasingly applied in medicine, no discussion of their implementation in this context is complete without considering the protection of patient privacy and confidentiality. The algorithm we present here is run locally and completely offline. However, cloud-based automation offers several advantages that must be weighed against the potential for data leakage; strategies for obviating security concerns while maintaining the flexibility, reliability, and accelerated deployment afforded by these tools are under development. A full discussion of such methods is beyond the scope of this paper, but can be further explored in recent works by Mei et al.53 and Wu et al.54, among others.

It is our goal that this algorithm will ultimately be utilized as a clinically valuable tool for stratifying an individual patient's risk of facial nerve injury, aiding in pre-operative counseling about treatment approach (watchful waiting vs. radiosurgery vs. microsurgical resection) and timing. Importantly, the model was evaluated via accuracy, sensitivity and specificity given the common utilization of these as metrics of test performance in the clinical setting. In this specific context, we interpret the 90% accuracy to be excellent compared to the 85% accuracy which has been referenced as a benchmark of acceptable performance15. We further anticipate improved accuracy and generalizability (less performance decay) with the addition of validation examples during deployment. In addition, the sensitivity and specificity, both 90%, indicate that the model performs equally well at predicting which patients are likely to have complete facial nerve preservation as it does at predicting which patients are likely to have facial nerve dysfunction. We anticipate that further validation through collaboration with additional centers which treat high volumes of vestibular schwannomas will continue to improve the model's performance.
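The three metrics named above follow directly from a confusion matrix. The sketch below uses hypothetical counts (not the study's results), treating facial nerve dysfunction as the positive class and choosing the counts so that sensitivity and specificity are both 90%. Function and variable names are assumptions.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: patients with dysfunction predicted correctly / missed.
    tn/fp: patients with preserved function predicted correctly / misflagged.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both outcome classes must be present")
    sensitivity = tp / (tp + fn)  # share of dysfunction cases detected
    specificity = tn / (tn + fp)  # share of preserved cases recognized
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Hypothetical cohort: 10 patients with dysfunction, 20 with preserved function.
accuracy, sensitivity, specificity = classification_metrics(tp=9, fn=1, tn=18, fp=2)
print(f"accuracy={accuracy:.0%} sensitivity={sensitivity:.0%} specificity={specificity:.0%}")
```

Accuracy is the prevalence-weighted average of sensitivity and specificity, so when the two are equal, accuracy takes that same value whatever the balance between outcome groups.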

Recognizing that clinicians and patients with little to no computer programming background may find it cumbersome to implement the algorithm, we plan to develop a graphical user interface to facilitate ease of use in both exploratory and clinical settings. This concept has been applied in other areas of medicine to facilitate a user-friendly implementation of artificial intelligence in the clinical environment55,56.

Written by admin |

June 11th, 2024 at 2:48 am

Posted in Machine Learning
