Is Prognostication Possible in Patients with Aneurysmal Subarachnoid Haemorrhage Post Endovascular Treatment?

Introduction: Subarachnoid haemorrhage due to aneurysm rupture is a major cause of death and disability. Accurately predicting the outcome of patients who undergo endovascular treatment from a set of predictive variables may identify high-risk patients and guide treatment approaches, leading to decreased morbidity. Logistic regression models allow for the identification and validation of predictive variables. However, advanced machine learning algorithms offer an alternative, in particular for large-scale multi-institutional data, with the advantage of easily incorporating newly available data to improve prediction performance. Our aim was to design and compare different machine learning methods capable of predicting the outcome of endovascular intervention in acute subarachnoid haemorrhage and aneurysm rupture.
Method: We conducted a retrospective study on a prospectively collected database of patients with acute subarachnoid haemorrhage due to aneurysm rupture who underwent endovascular intervention. All demographic, clinical and procedural data were collated, including information from follow-up imaging studies. Using SPSS®, MATLAB® and RapidMiner®, classical statistics as well as machine learning algorithms were applied to design a supervised machine capable of classifying these predictors into potential good and poor outcomes. We attempted to predict the patients' final outcome based on the modified Rankin Scale (mRS) and a dichotomised outcome (good or bad), as well as mortality, recanalization rate and need for retreatment. Subsequently, these algorithms were trained, validated and tested using randomly divided data. Finally, the filtered and optimized dataset was introduced into a decision induction module and a simplified prognostication tree was designed, representing a pictorial relationship between the predictors and the final outcome in a relatively easy to interpret way.
Results: We included 236 consecutive acute subarachnoid haemorrhage patients with ruptured intracerebral aneurysm treated by endovascular technique, with a mean age of 52.7 (SD=13.7). All the available demographic, procedural and clinical factors were included in the models. The overall accuracy in predicting the exact mRS was just below 50%, which increased to above 75% in prediction of the dichotomised (good or bad) outcome, and approximately 85% in prediction of mortality. Prediction of recanalization had an overall accuracy of just below 50%; however, there was approximately 90% accuracy in prediction of those patients requiring retreatment.
Discussion: We showed promising accuracy of outcome prediction using supervised machine learning algorithms, in particular in prediction of the final outcome as good or bad as well as the probability of needing retreatment in future, with potential for incorporation of larger multicenter datasets, likely further improving predictive accuracy.

Incidence and demographics
The overall prevalence of cerebral aneurysms has been reported variably from 3.6 to 6 percent of the population worldwide [1], with an estimated 6 million people in the United States with an unruptured aneurysm, or approximately 1 in 50 people [1,2].
They are most prevalent at 35-60 years of age, with newly diagnosed aneurysms mostly developing in patients older than 40. Paediatric cases are also diagnosed, mainly in the context of underlying syndromic conditions. There is a slight female predilection for aneurysms, with a female to male ratio of approximately 3:2 [1-4]. The majority of aneurysms (65%-85%) measure between 3 mm and 2.5 cm in diameter and carry a low rupture risk (less than or equal to 1% per year) [5,6]. Aneurysms larger than 2.5 cm are referred to as "giant". Ten to 15% of patients diagnosed with a brain aneurysm have another aneurysm [1,2].

Aneurysm rupture
Rupture of cerebral aneurysms, also known as aneurysmal subarachnoid haemorrhage (aSAH), is a significant cause of death and disability worldwide, causing an estimated half a million deaths every year, with approximately 40% overall mortality [7,8]; however, an estimated 50-80 percent of all aneurysms never rupture or cause any symptoms during a person's lifetime [9].
On the contrary, large and in particular giant aneurysms can pose a significant risk of complications. The annual rate of rupture has been reported variably in different studies, estimated at 6-12, or more precisely 8-10, per 100,000 people, equating to 30,000 patients in the USA per year, or approximately one aneurysm rupture every 18 minutes. Giant aneurysms have more than a 40% risk of rupture over 5 years [2,4,5,8].
The median age at aneurysmal rupture is 50 years, typically with no warning signs, and rupture accounts for approximately 3-5% of all new strokes within the whole population [3,8,9].

Outcome of SAH
About 10-15% of patients with aneurysmal haemorrhage will die before reaching hospital, with 25% mortality within the first 24 hours of the haemorrhage. The overall survival rate is estimated at less than 50% [9]. Amongst survivors, approximately 66% suffer a permanent neurological deficit, and approximately 4 out of 7 people who recover from a ruptured brain aneurysm will have disabilities. A study in 2004 estimated the combined financial impact on these survivors and their carers at up to $138,000,000 annually in the USA alone [2,3,6,8]. Not only is there high morbidity and mortality associated with aneurysm rupture, but aneurysm treatment costs also increase significantly after a rupture [1-3,6,7,9].

Treatment
The ultimate goal in treating an unruptured intracranial aneurysm is to exclude it from the circulation to prevent rupture; however, if an aneurysm has already bled, it still needs to be secured to prevent rebleeding and to make more intensive treatment of potential vasospasm possible and safer. Prior to the introduction of the Guglielmi detachable coils (GDC) in 1990, which made endovascular coiling of aneurysms possible, the only treatment option available was open surgical clipping [10,11].
Advances in equipment and techniques, including the invention of three-dimensional coils, micro-stents and balloons as well as flow diverter stents, parallel to the introduction of balloon and stent assisted coiling, have made endovascular treatment of wide neck or dissecting aneurysms possible [12,13]. Endovascular intervention is now considered the first line treatment for intracranial aneurysms, consistent with the International Subarachnoid Aneurysm Trial (ISAT), which revealed that endovascular coiling is associated with lower morbidity and mortality rates compared with open neurosurgical clipping [14].
The timing of definitive management of acutely ruptured intracranial aneurysms has been the subject of considerable debate, and over the last few decades there has been significant controversy around the optimum timing for treatment of an acutely ruptured aneurysm [15]. Although it was always believed that early treatment decreases the risk of rebleeding, it was historically regarded as carrying a higher risk than a delayed procedure [15-18]. However, over time there has been a move from advocacy of "late intervention", defined as more than ten days post aneurysmal rupture and SAH, towards "early" surgery, defined as one to three days post SAH; and recently a few studies have even proposed "ultra-early" intervention, defined as within 24 hours post SAH, with improved clinical outcome [19-22].

Long-term prognosis
Overall trends show that survival rates are increasing, despite the fact that the incidence of aSAH appears to be largely unchanged [23]; however, because of all the advances in the management of aneurysms and post procedural care, an ever increasing number of survivors are potentially left with serious physical and cognitive deficits associated with significant financial burden, and perhaps questionable quality of life [9]. This highlights the need for prognostication in patients presenting with aneurysm rupture and subarachnoid haemorrhage. This question is not only important from the acute management point of view, but could also make a substantial difference to long-term planning in rehabilitation and, more specifically, follow-up assessments, in particular imaging surveillance. Therefore, in addition to mortality, predicting the degree of morbidity is of great importance. Additionally, there are increasing numbers of patients with coiled aneurysms; however, there is no clear protocol for their clinical and imaging follow-up for early detection of aneurysm recurrence or recanalization, which is undoubtedly a multifactorial problem requiring a more individualised approach.

Aims of this study
In this study we aimed to investigate the possibility of prognostication in patients with subarachnoid haemorrhage due to aneurysm rupture who underwent endovascular treatment, and to assess the ability to predict their intermediate-term clinical outcome as well as the subsequent chance of recanalization requiring retreatment.

Method
This is a retrospective study using a prospectively collected, de-identified clinical database.
Demographics and clinical details of 236 patients with acute subarachnoid haemorrhage due to ruptured aneurysm who underwent endovascular treatment in our institution, over a period of approximately six years, were extracted from a prospectively maintained aneurysm database.
Patients' age at the time of presentation was recorded, as well as any history of smoking. Neurological examination was performed for all of the patients prior to any intervention, and the baseline subarachnoid haemorrhage grading was estimated based on the World Federation of Neurosurgical Societies scale (WFNS Grade) [24].
All procedures were performed while the patients were under general anaesthesia. From the initial diagnostic angiogram the location of the aneurysm was identified, and its size was recorded. In the case of multiple aneurysms, depending on the angiographic morphology and distribution of the blood on the CT the most likely culprit was selected and subsequently treated.
Following endovascular treatment, the number of coils and, if required, the use of balloons and stents were all documented. The level of success in excluding the aneurysm was assessed based on the Raymond-Roy Occlusion Scale [25,26]. It was also recorded whether the procedure was abandoned because of technical issues or the patient's clinical condition, as well as possible procedural complications including on-table aneurysmal rupture or thromboembolic events, ischemic stroke or intraparenchymal haemorrhage.
Subsequently, the 90-day mortality was collected, followed by assessment of the functional outcome of the survivors based on the modified Rankin Scale (mRS), which was then dichotomised as a good or bad outcome, with a score of 2 or less considered good. In addition, where applicable, recanalization or recurrence of the coiled aneurysm was graded and documented on follow-up imaging, together with the history of any further endovascular treatment.
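The dichotomisation rule described above (mRS of 2 or less counted as good) can be sketched in a few lines. This is only an illustrative sketch: the function name `dichotomise_mrs` and the sample scores are hypothetical and are not taken from the study database.

```python
def dichotomise_mrs(mrs: int) -> str:
    """Map a modified Rankin Scale score (0-6) to 'good' (<= 2) or 'bad' (>= 3)."""
    if not 0 <= mrs <= 6:
        raise ValueError(f"mRS must be between 0 and 6, got {mrs}")
    return "good" if mrs <= 2 else "bad"


# Hypothetical scores, for illustration only.
scores = [0, 2, 3, 6]
labels = [dichotomise_mrs(s) for s in scores]
print(labels)  # ['good', 'good', 'bad', 'bad']
```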
Using SPSS® (IBM Corporation), a Classical Linear Model was designed, using Forward-Stepwise as the model selection method and the corrected Akaike information criterion (AICC) as the criterion for entry. Machine learning (ML) methods were then attempted in MATLAB® (MathWorks Inc.) and RapidMiner® (Rapid-I Inc.) using Neural Network (NN) and Support Vector Machine (SVM) algorithms with different kernels. Potential predictors of the mRS, Dichotomised mRS (DmRS), mortality, recanalization and the need for further retreatment were then identified, and a prediction model was formed and compared with the observed outcome for validation.
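For reference, the AICC entry criterion named above has the following general textbook form. Here k is the number of estimated parameters, n the sample size and L-hat the maximised likelihood of the candidate model; these symbols are introduced for illustration, and how the software evaluates the criterion internally is not stated in this manuscript.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}, \qquad
\mathrm{AICc} = \mathrm{AIC} + \frac{2k(k+1)}{n - k - 1}
```

The correction term grows as k approaches n, so with a modest sample the criterion penalises adding further predictors more heavily than the uncorrected AIC.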
Initially, a two-layer Feed-Forward network with sigmoid hidden and linear output neurons was designed. The data were then randomly divided into 70, 15 and 15 percent subsets, and the network was trained using the Levenberg-Marquardt algorithm, validated and tested, with the performance of the model monitored using Mean Squared Error. Subsequently, a Regression Classifier Algorithm was used to model the outcomes, and finally the two models were combined and the accuracy and precision of the combined model were assessed in predicting each of the desired outcomes separately. In addition, a Bayesian model was applied, but only to predict the need for retreatment; it was not tried for the other predictions.
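A random 70/15/15 division of the kind described above might look like the following sketch. It is written in Python rather than MATLAB, the helper name `split_70_15_15` and the fixed seed are assumptions made for reproducibility, and the rounding of subset sizes is a choice of this sketch, not necessarily that of the original software.

```python
import random


def split_70_15_15(n_samples, seed=0):
    """Shuffle sample indices and cut them into training, validation and test subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_train = round(0.70 * n_samples)
    n_val = round(0.15 * n_samples)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains after the first two subsets
    return train, val, test


train, val, test = split_70_15_15(236)
print(len(train), len(val), len(test))  # 165 35 36
```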
For those outcome measures with a scaled value (e.g. mRS or recanalization), linear regressions were also performed between the observed and estimated outcome over the training, validation and test datasets independently, using the Theil-Sen estimator. For the dichotomised outcome models (e.g. DmRS, mortality, etc.), which are binary classifiers, Receiver Operating Characteristic curves were calculated to illustrate the performance of the system over each dataset as its discrimination threshold is varied. In addition, confusion matrices or contingency tables were calculated, allowing better representation of the performance of the network.
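The two evaluation tools mentioned above can be illustrated with a small pure-Python sketch: a Theil-Sen fit (median of all pairwise slopes) and a ROC curve obtained by sweeping the decision threshold. The function names and the toy numbers are hypothetical; they are not the study's data, and the original analysis was carried out in MATLAB and RapidMiner rather than with this code.

```python
from itertools import combinations
from statistics import median


def theil_sen(x, y):
    """Return (slope, intercept): median pairwise slope, then median of y - slope * x."""
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i, j in combinations(range(len(x)), 2)
              if x[j] != x[i]]
    slope = median(slopes)
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


def roc_points(scores, labels):
    """(FPR, TPR) pairs as the threshold sweeps from high to low; label 1 is positive."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Trapezoidal area under ROC points ordered by increasing FPR."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy data: a straight line y = 2x + 1 with one gross outlier at the end.
slope, intercept = theil_sen([0, 1, 2, 3, 4], [1, 3, 5, 7, 50])
print(slope, intercept)  # 2.0 1.0

# Toy classifier scores with binary ground truth.
area = auc(roc_points([0.9, 0.8, 0.7, 0.4, 0.3, 0.1], [1, 1, 0, 1, 0, 0]))
print(round(area, 3))  # 0.889
```

The outlier leaves the Theil-Sen line unchanged, which is the usual reason for preferring it over ordinary least squares when a few predictions are far off.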
Finally the filtered and optimized dataset was introduced into a decision induction module and a simplified prognostication tree was designed as an inverted tree-like graph representing a pictorial relationship between the predictors and the final outcome in a relatively easy to interpret way. These trees are usually generated by recursive partitioning, repeatedly splitting based on the possible values of the predictors' attributes.
The information criterion for the main tree shown in this manuscript was "Gain Ratio", which adjusts the information gain of each predictor for the breadth and uniformity of its values, with the maximum depth of the tree set to 20. No pre- or post-pruning was performed on the input data or the calculated tree.
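As a rough illustration of the "Gain Ratio" criterion, the sketch below divides the information gain of a split by the entropy of the attribute's own value distribution (the split information), following the standard definition. The attribute names and labels are made up for the example, and the decision induction module used in the study may differ in detail.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gain_ratio(attribute_values, labels):
    """Information gain of splitting on the attribute, divided by its split information."""
    n = len(labels)
    groups = {}
    for value, label in zip(attribute_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(attribute_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: both attributes separate the outcomes perfectly,
# but the one with a distinct value per patient is penalised for its breadth.
outcome = ["good", "good", "bad", "bad"]
grade = ["low", "low", "high", "high"]
patient_id = ["a", "b", "c", "d"]
print(gain_ratio(grade, outcome))       # 1.0
print(gain_ratio(patient_id, outcome))  # 0.5
```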

Results
Two hundred and thirty-six patients were included in the study, with an average age of 52.7 years (SD=13.7) (Figure 1). Anterior communicating artery (ACOM) and posterior communicating artery (PCOM) aneurysms were the most common and second most common respectively (Figure 2). The average aneurysm size was approximately 7 mm, with a standard deviation of 4 mm (Figure 3).

Prediction of mRS
Using classical linear modelling, predictive accuracy was around 30%, with the WFNS grading at presentation the most important predictor and the patient's age the second most important (Figure 4). The best class recall of 64% was for mRS of 0, with 59% precision; the second best class recall was 27% for mRS 6, with 36% precision. Accuracy and precision for the remainder of the classes (mRS) were quite low.
When the model was trained using a Regression Classifier algorithm, the overall accuracy rose to 53% (+/-5.53), with the best recall of 94% for mRS of 0 with 57% precision; however, the accuracy and precision were quite poor for the remainder.
An averaged overall accuracy of 47% (+/-8.2) was achieved when the models were combined, with the highest recall of 67% for mRS of 0, and 64% precision. The second most accurate recall was for mRS of 1 with 27% precision (Figure 5).

Prediction of dichotomised mRS (DmRS)
Using classical linear modelling, predictive accuracy was around 30%, with the WFNS grading at presentation the most important predictor and the patient's age the second most important (Figure 6). The Neural Network model designed using MATLAB demonstrated a Pearson correlation coefficient of approximately 0.7 between the predicted and true outcome, with a favorable ROC curve and a confusion matrix with an accuracy of approximately 86% (Figure 7). Using the network designed in RapidMiner, an overall accuracy of 77.5% (+/-6.4) was achieved. The "good outcome" had the better recall of 88% with high precision, approximately 84%, while recall of the "bad outcome" was around 40% with 49% precision.
When the model was trained using a Regression Classifier algorithm, the overall accuracy rose to 79% (+/-3.4), with the best recall of 99% for "good outcome" with 79% precision.
Although the class recall for the "bad outcome" was approximately 10%, its precision was around 71%. An averaged overall accuracy of 76% (+/-8) was achieved when the models were combined, with 86% accuracy and 84% precision for the "good outcome", and a marked improvement in the "bad outcome" class recall to more than 40%, with 46% precision.

Prediction of mortality
Using classical linear modelling, predictive accuracy was around 24%, with the WFNS grading at presentation the most important predictor and the presence or absence of intraparenchymal haemorrhage the second most important (Figure 8).
The Neural Network model designed using MATLAB demonstrated a Pearson correlation coefficient of approximately 0.7 between the predicted and true outcome, with a favorable ROC curve and a confusion matrix with an accuracy of approximately 89% (Figure 9). Using the network designed in RapidMiner, an overall accuracy of 82% (+/-4.8) was achieved. The "good outcome" had the better recall of 91% with high precision, approximately 89%, while recall of the "bad outcome" was around 20% with only 24% precision.
When the model was trained using a Regression Classifier algorithm, the overall accuracy rose to 89% (+/-1.2), with 99.5% recall for "good outcome" with 87% precision, but extremely poor class recall and precision for the "bad outcome". An averaged overall accuracy of 84% (+/-3.2) was achieved when the models were combined, with 92% accuracy and 90% precision for the "good outcome", and an improved "bad outcome" class recall of 30%, with 35% precision.

Prediction of follow up recanalization (Raymond-Roy Occlusion Classification)
Using classical linear modelling, predictive accuracy was 25.6%, with the location of the aneurysm the most important predictor and whether or not the procedure was abandoned the second most important (Figure 10). The Neural Network model designed using MATLAB demonstrated a Pearson correlation coefficient of approximately 0.5 between the predicted and true outcome, with an overall accuracy of 46% (+/-10) using the network designed in RapidMiner (Figure 11). The best class recall of 53% was for a Raymond score of 1, with 45% precision; the second best class recall was 46% for Raymond 2, with 50% precision. Accuracy and precision for Raymond 3 were the poorest.
When the model was trained using a Regression Classifier algorithm, the overall accuracy rose to 57% (+/-10.2), with the best recall of 62% for Raymond 1 with 57% precision and the second best class recall of 60% for Raymond 2 with 56% precision; again, accuracy and precision for Raymond 3 were the poorest. An averaged overall accuracy of 48% (+/-8.4) was achieved when the models were combined, with the highest recall of 53% for Raymond 1, and 46% precision. The second best class recall was 51% for Raymond 2 with 51% precision, and again accuracy and precision for Raymond 3 were the poorest.

Figure 10. Relative importance of the predictors for subsequent recanalization using the classical linear model.
Figure 11. Linear fit between the estimated and observed outcome for subsequent recanalization using the neural network model.

Prediction of need for retreatment
Only 218 patients were included in this part of the analysis; 18 patients were excluded as they were either lost to follow-up with our institution, or a consensus regarding the decision for retreatment had yet to be reached by the neurointerventional and neurosurgical teams at the time of writing. Using classical linear modelling, predictive accuracy was around 34%, with the location of the aneurysm by far the most important predictor. The second most important predictor was the WFNS grading at presentation, which was almost as important as insertion of a stent, the third predictor (Figure 12). The Neural Network model designed using MATLAB demonstrated a Pearson correlation coefficient of approximately 0.5 between the predicted and true outcome, with a relatively favorable ROC curve and a confusion matrix with an accuracy of approximately 94% (Figure 13). Using the network designed in RapidMiner, an overall accuracy of 88% (+/-5.4) was achieved. The "no retreatment" class had a much better recall of 94% with high precision, approximately 93%, while recall of the "retreatment" class was quite low, around 17% with only 20% precision.
When the model was trained using a Regression Classifier algorithm, although the overall accuracy rose to 92% (+/-2), the overall performance of the model was quite poor, with "retreatment" recall approaching 0%. With a Bayesian approach, however, the accuracy reached 90% (+/-2.6), with "no retreatment" recall of 95% with 94% precision, and approximately 19% "retreatment" recall with 25% precision, with no significant change when these two models were combined.
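The combination of high overall accuracy with near-zero "retreatment" recall is what one would expect when one class is rare. The sketch below uses purely hypothetical counts (100 patients, 8 of whom need retreatment, and a classifier that always predicts "no retreatment"); these numbers are not the study's data and the helper `binary_metrics` is introduced only for illustration.

```python
def binary_metrics(y_true, y_pred, positive):
    """Accuracy, recall and precision for the class named by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Recall and precision are reported as 0.0 when their denominator is zero.
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }


# Hypothetical cohort: 8 "retreatment" cases among 100 patients.
y_true = ["retreatment"] * 8 + ["no retreatment"] * 92
y_pred = ["no retreatment"] * 100  # a classifier that never predicts retreatment
metrics = binary_metrics(y_true, y_pred, positive="retreatment")
print(metrics)  # {'accuracy': 0.92, 'recall': 0.0, 'precision': 0.0}
```

This is why class-wise recall and precision are reported alongside overall accuracy throughout the results.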

Figure 12. Relative importance of the predictors for need for retreatment using the classical linear model.
Figure 13. Overall ROC curve and confusion matrix in prediction of need for retreatment using the neural network model.

Prognostication Tree
From a variety of decision trees processed and produced for this dataset, the most concise tree was extracted and optimised (Figure 14). It demonstrates the step by step filtering of any individual subject through each important predictor of the respective model, stratifying based on the predictors' parameters, with age at the root of the tree. This was followed by the grade of the subarachnoid haemorrhage based on the WFNS classification, the size of the aneurysm based on the largest dimension, the number of coils used during the procedure, and finally the status of the post-procedural residuum based on the Raymond-Roy grading system.

Discussion
Cerebral aneurysms and subarachnoid haemorrhage have a complex and somewhat poorly understood natural course. An estimated 6 million people (approximately 1 in 50 people) in the United States have unruptured aneurysms [1,2], with the majority of the aneurysms measuring between 3 mm and 2.5 cm in diameter and carrying a risk of rupture of less than 1% per year [5,6]. Rupture carries an approximately 40% overall mortality [7,8], making immediate treatment necessary to save the patient's life and minimise potential morbidities.
The ultimate goal in treating a ruptured intracranial aneurysm is to exclude it from the circulation to prevent further bleeding, which if done with endovascular techniques is associated with lower morbidity and mortality compared with open surgery [14]. However, despite all the advances in the management of ruptured aneurysms, in particular the endovascular treatments available, there is significant mortality and morbidity associated with this condition, imposing significant social and economic burdens. This makes proper prognostication essential in treatment and rehabilitation planning, as well as in strategic decision making regarding follow-up and potential further interventions for these patients, especially given that there is no well-established protocol for the clinical and imaging surveillance of these patients, who require a more individualised approach.
Allowing for all of the complexities of the clinical course and perplexing therapeutic strategies, with unknown prognosticating and outcome predictors, in this study we tried to have an in depth look into the details of our series of patients with aneurysmal subarachnoid haemorrhage who underwent endovascular treatment, to assess for potential factors predicting post intervention behaviour of the disease, its morbidity and mortality as well as the overall chance of recanalization and the need for further treatment.

Figure 14. Prognostication tree based on the mortality predictors.

Overall, the accuracy of our classical linear model for prediction of the final mRS was quite low, although it demonstrated significant correlation with the grade of the subarachnoid haemorrhage at the time of presentation, consistent with the literature. Prediction accuracy improved using neural network modelling and approached 53% when a regression classifier algorithm was applied, with no further improvement when the two models were combined. This mediocre accuracy in predicting the exact outcome out of seven possibilities can be partly explained by the number of possible classes, in contrast to binary classifiers, which decide between only two possible outcomes. This is consistent with our findings when the models were applied to predict the dichotomized outcome of good or bad, with accuracy of up to approximately 80% and quite high precision, in particular in forecasting a good outcome. Our neural network model also showed high accuracy in predicting mortality in patients treated with endovascular techniques, with a striking accuracy of more than 90% in predicting the survivors when regression classifiers were used, which can be of great clinical benefit. The most important factor for prediction of mortality was the grade of the disease at the time of presentation, with the second most important predictor being the presence or absence of intraparenchymal haemorrhage.
The models showed moderate performance in predicting the chance of recanalization, with an accuracy of around 60%, and with the best precision for those aneurysms with trivial subsequent recurrence compared to poor detection of those with large recanalization. The most important predictor was the location of the aneurysm, which likely dictates the technical liberty in aneurysm packing without compromising the parent or en-route arteries. The second most important factor was whether the procedure was abandoned, indicating the potential difficulty in treating the aneurysm completely.
This was different when the models were applied from the perspective of the need to reintervene, with, interestingly, the grade of the disease as the most important predictor of the need to retreat and stent assisted coiling as the second most important predictor. Although these are somewhat difficult to reconcile with the proposed predictors of recanalization, insertion of a stent can intuitively be interpreted as indicating a potentially difficult procedure with technical difficulties regarding the neck of the aneurysm. Despite the low accuracy of the classical linear model, neural networks designed in both MATLAB and RapidMiner environments demonstrated a very notable accuracy of approximately 94% in predicting the need for retreatment, with the highest precision in detecting those who would not need it, compared to a poor recall rate for those who would.
Finally, a relatively easy to follow prognostication tree with interesting properties was calculated for our dataset. Age of the patient at the time of presentation was characterised as the root of the tree, followed by the clinical severity of the subarachnoid haemorrhage based on WFNS grading. This tree can potentially be considered a useful tool in patient stratification and prognostication. It is worth mentioning, however, that this tree is one of many potential prognostication trees that could be extracted from the dataset, depending on the selection criterion determining which attributes influence the splitting of the tree branches, as well as a few other minor parametric adjustments; each tree will therefore place particular emphasis on different predictors, resulting in variable final shapes and configurations.

Conclusion
In this study we tried to investigate the possibility of prognostication in patients with subarachnoid haemorrhage due to ruptured aneurysm who underwent endovascular treatment in our institution, and assess the ability to predict their intermediate-term clinical outcome as well as subsequent chance of recanalization requiring retreatment, by in-depth analyses of potential factors and parameters.
There has been recent interest in adopting artificial intelligence and machine learning techniques in diagnostic neuroradiology [27,28] and in the prediction of the outcome in patients post neurointerventional procedures [29,30], with two recent studies demonstrating promising results in modelling the outcome in patients with acute ischemic stroke post endovascular intervention, with relatively good accuracy and precision in prediction of the final mRS using artificial neural network modelling and support vector machine algorithms.
However, to our knowledge there is no multifactorial study published in the literature attempting to apply machine learning algorithms to prediction of the outcome in patients with ruptured intracerebral aneurysm post endovascular treatment. Numerous factors, including extensive clinical heterogeneity, can influence the prognosis of a patient with a ruptured aneurysm treated endovascularly, making conventional modelling challenging and perhaps inaccurate. Machine learning models, however, being relatively independent of the unknown potential underlying interactions between these factors, are probably able to simulate the eventual result of such a complex system.
Despite a small dataset, in this study we were able to show that an acceptable prediction accuracy is attainable using machine learning techniques which is superior to conventional counterparts, and there is the likely potential of further improving prediction by the incorporation of larger multicenter datasets. Such models may be of use not only for prognostication and in predicting the outcome under different circumstances, but can potentially assist in clinical decision making, in particular identifying those patients who may benefit from a variety of possible treatment options, including more aggressive management, such as follow-ups in shorter intervals or reintervention.

Limitations
Alongside all the abovementioned advantages of machine learning algorithms, there are important underlying assumptions and limitations that should not be forgotten. Although these models can be accurate, and perhaps useful in answering the primary question, they more or less behave as a complex set of algorithms requiring large training datasets to improve their performance, with the true underlying relationships between influential factors remaining undiscovered to the user [31-33].
This inherent need for large training datasets may affect the accuracy of the machines in studies, like the current study, when only representative training data is used. In addition, with no clear understanding of the true predictors, an overcorrected conservative design may lead to the models being over-fitted by irrelevant demographic or clinical factors, thus increasing the random error and covering the desired signal with noise. To avoid this, techniques like cross-validation, regularization, or pruning can be used to indicate the tipping point when further training no longer results in better performance [31-33].
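As a minimal illustration of the cross-validation technique mentioned above, the sketch below generates k-fold train/test index sets so that every patient is used for testing exactly once. The function name `k_fold_indices`, the choice of k = 10 and the seed are assumptions made for this example; the manuscript does not specify these settings.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded so the folds are reproducible
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(236, k=10))
print(len(splits))                                # 10
print(sorted({len(test) for _, test in splits}))  # [23, 24]
```

A model would be fitted on each `train` list and scored on the matching `test` list, and the spread of the fold scores gives a sense of how stable the estimated accuracy is.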