ABSTRACT
Bone age is one of the biological indicators of maturity used in clinical practice and is a very important parameter in the assessment of a child, especially in paediatric endocrinology. The most widely used method of bone age assessment is a hand and wrist radiograph analysed with the Greulich-Pyle or Tanner-Whitehouse atlas, although about 60 years have passed since these atlases were published. Owing to progress in Computer-Aided Diagnosis and the application of artificial intelligence in medicine, numerous programs for automatic bone age assessment have recently been created. Most of them have been verified in clinical studies against traditional methods, showing good precision while eliminating inter- and intra-rater variability and significantly reducing the time of assessment. In addition, methods are available that assess bone age without X-ray exposure, using modalities such as ultrasound or magnetic resonance imaging.
Introduction
Conclusion
For clinicians, especially paediatric endocrinologists, it is very important to assess BA as precisely as possible in order to make the right diagnosis and to monitor closely the development of a child, the progress of a disease or the effects of treatment. The traditional methods used to date have significant drawbacks: they are highly time consuming, show high inter- and intra-rater variability, make comparison of chronologically sequential examinations of one patient difficult and require a physical copy of the atlas. The new automated BAA techniques provide instant results, eliminate inter- and intra-rater variability and require only access to the software. Much research in this field is currently underway and the results are very promising. Most of the programs described herein have been validated in clinical studies against traditional BAA, and they show very good precision while retaining the benefits of automated BAA systems. Some options are already widely available for clinical use, including BoneXpert and the Paediatric Bone Age Calculator from 16Bit.ai. It is to be expected that these automated tools will continue to gain acceptance and widespread usage, making traditional atlas-based BAA a thing of the past.
Maturation Indicators
The processes of growth and maturation in children are usually correlated, but they cannot be treated as one process, as they may not be linear and may proceed at different paces. Owing to numerous disturbances, such as growth hormone (GH) deficiency, deficiency of thyroid hormones or delayed puberty, but sometimes also in healthy children, the chronological age (CA) does not match the biological age. This is because growth and maturation are regulated by various factors, which include genes and nutrition as well as many hormones, including GH, insulin-like growth factor-1, sex hormones and adrenal steroids such as cortisol, dehydroepiandrosterone, and testosterone (1,2). In paediatric endocrinology, it is especially important to assess the child’s growth and puberty in relation to biological age, rather than CA. Thus, clinicians have been looking for a good marker of maturation rate in children for decades (3).
Age at menarche is a solid biological indicator of maturity, but it is a one-off event and relates to only half of the population (3). Dentists, mainly orthodontists, use dental age, judged using the Demirjian or Willems scale, in daily practice, but it has not been established as a reliable tool for other clinicians (3,4,5). Sexual characteristics, such as those assessed by position on the Tanner scale, are useful only in the adolescent period and their assessment is very subjective. The only biological indicator of maturity that is available from birth to adulthood is bone age (BA) (3).
Bone Age
In paediatric endocrinology, BA is an important tool used in the clinical assessment of patients, mainly those suffering from growth and puberty disorders. Many parameters correlate better with BA than with CA, including height velocity, menarche, muscle mass and bone mineral mass (6). Delayed BA is typical for GH deficiency, constitutional delay of growth, hypothyroidism, malnutrition and chronic illness (6,7). On the other hand, BA is advanced in many conditions that include precocious puberty and congenital adrenal hyperplasia, when there is a prolonged elevation of sex steroid levels (6,7,8). BA may also be marginally advanced in overweight children, children with tall stature or children with premature adrenarche (1,6,8). In genetic overgrowth syndromes, for example Sotos syndrome, Beckwith-Wiedemann syndrome and Marshall-Smith syndrome, BA is usually significantly advanced (6). In all cases it is important to remember that advancement or delay of BA in relation to CA is a slow process. BA may therefore be unaltered in examinations performed shortly after the first manifestations of a disorder, and it should be assessed serially over time (7).
What is more, BA is used in forensic and legal medicine to estimate CA, for example in asylum seekers or unaccompanied minors without documents. In such cases an adequate assessment of age using precise methods is crucial. Incorrectly assessing a child as an adult may result in more restricted access to education, medical care or other forms of support provided for children (9).
This article considers different methods of BA assessment from the perspective of a paediatrician or paediatric endocrinologist (Table 1).
Traditional Methods
Although there have been attempts to assess BA by examinations of specific bones, such as the clavicle or iliac bone (Risser sign) (10,11,12,13,14,15), in paediatrics and paediatric endocrinology, the established way to obtain BA is by performing a radiograph of the hand and wrist of the non-dominant hand. Assessment of the development of the bones can be performed in the traditional, manual way or using one of the automated methods. The manual method involves a comparison of the obtained radiograph with radiographs in atlases. The manual methods can be divided into two groups depending on the type of atlas – holistic or analytic.
The first atlases were published shortly after the discovery of X-rays in 1895. In 1898, John Poland published the first one: “skiagraphic atlas showing the development of bones of the wrist and hand” (16). In his atlas, he depicted skiagraphs (positive reprints) of hand radiographs of 19 British children, aged between 1 and 17 years, with an attached description of each radiograph (16). However, the two most important publications in this field were issued in 1959 by Greulich and Pyle (17) and in 1962 by Tanner, Whitehouse and Healy (18).
Greulich-Pyle Atlas
‘The Radiographic Atlas of Skeletal Development of the Hand and Wrist’ by Greulich and Pyle (17) (GP) is widely recognized and is currently used in many centers. This atlas was created based on radiographs of hands of paediatric patients referred to endocrinologists William Walter Greulich and Sarah Idell Pyle by paediatricians between 1931 and 1942. These patients were Caucasian children from a generally upper middle class background, living in Cleveland, Ohio, United States (19,20). The atlas consists of separate reference images for boys and girls aged 0-18 (boys) or 0-19 years (girls) at various intervals (3 months-1 year). Images are accompanied by an explanation of the gradual age-related changes in the bones at a given age and separate BAs calculated for each bone. Owing to the natural variability of the BA of different bones in one individual, the BA of some bones is often more or less advanced than the standard the image is intended to represent. For example, a radiograph representing the age of 3 years 6 months (42 months) includes a 36-month first metacarpal and a 54-month lunate (17). BA is calculated by comparing the non-dominant wrist radiographs of the subject with the nearest matching reference radiographs provided in the atlas. Thus, this method is termed a holistic method. Figure 1 presents the GP atlas.
GP is the most popular method among clinicians and radiologists, as the assessment by GP is relatively quick and easy to learn. Although widely used, this method has significant drawbacks. BA assessment (BAA) using GP shows high inter- and intra-observer variability. In addition, given the reference population used in GP, this method may not be an appropriate, universal tool for use in various populations.
BAA by GP is very subjective and the standard error on a single determination in inter-observer studies ranges from 0.45 to 0.83 years (21,22,23,24,25). There is no standardization in how the bones are weighted. In clinical practice, different raters may assign different weights to different bones: some raters may ignore the carpals, while others may assign as much as half of the weight to the carpals during the assessment. Raters who use the carpals reduce their importance at higher maturity but, again, not in a standardized manner (24).
It has been reported that boys and girls in the United States now develop secondary sex characteristics earlier than in past decades (26,27). Thus current use of the GP atlas, even in a population similar to the original source, may not be as precise as when it was created.
What is more, it has been shown that the correlation of BA with CA, and consequently the applicability of GP, depends on ethnic origin (28,29). A recent meta-analysis found that in African females BA is significantly advanced in comparison to GP standards. Conversely, in Asian males, BA is significantly delayed between 6 and 9 years of age and significantly advanced at 17 years (28). This should be taken into consideration when assessing BA in these populations using the GP atlas.
There is an online version of GP uploaded by the Brazilian Instituto Mineiro de Endocrinologia (28).
Tanner-Whitehouse Atlas
The second most popular tool for BA assessment is the Tanner-Whitehouse atlas (TW). The first version of TW was created in 1962 based on 2600 radiographs, collected in the 1950s and 1960s, of British children from an average socio-economic class (18). It was later updated in 1983 to Tanner-Whitehouse 2 (TW2) and in 2001 the latest updated version was published - Tanner-Whitehouse 3 (TW3). These updates have attempted to adjust for the secular trends that influence the relationship between the total bone maturity score and BA (30). In several countries standardized TW methods have been created which change the relationship between the total maturity score and BA to make it suitable for different ethnic groups (31,32,33).
TW2 is an analytic or scoring method and it is based on the maturity levels of 20 regions of interest (ROI) in different bones of the hand and wrist. The level of development of each ROI is labeled as a given stage, which is then converted to a numerical score. A total maturity score is calculated by adding the scores of the ROIs and it is matched with the age of boys and girls separately.
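The scoring procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stage scores and the score-to-age tables below are hypothetical placeholders rather than the published TW values, only three ROIs are shown instead of 20, and linear interpolation between table entries is a simplification chosen for the example.

```python
from bisect import bisect_right

# Hypothetical stage-to-score tables (NOT the published TW values).
STAGE_SCORES = {
    "radius": {"B": 16, "C": 21, "D": 30},
    "ulna": {"B": 27, "C": 30, "D": 32},
    "metacarpal_1": {"B": 6, "C": 9, "D": 14},
}

# Hypothetical sex-specific lookup tables, sorted by total maturity score:
# (total maturity score, bone age in years).
SCORE_TO_AGE = {
    "male": [(40, 2.0), (60, 4.0), (80, 6.0)],
    "female": [(40, 1.5), (60, 3.0), (80, 5.0)],
}


def total_maturity_score(stages):
    """Convert each ROI stage to its numerical score and add the scores."""
    return sum(STAGE_SCORES[roi][stage] for roi, stage in stages.items())


def bone_age(score, sex):
    """Match a total maturity score with an age using the table for that sex."""
    table = SCORE_TO_AGE[sex]
    scores = [s for s, _ in table]
    if score <= scores[0]:
        return table[0][1]
    if score >= scores[-1]:
        return table[-1][1]
    # Interpolate linearly between the two neighbouring table entries.
    i = bisect_right(scores, score)
    (s0, a0), (s1, a1) = table[i - 1], table[i]
    return a0 + (a1 - a0) * (score - s0) / (s1 - s0)


if __name__ == "__main__":
    rated = {"radius": "C", "ulna": "B", "metacarpal_1": "D"}
    score = total_maturity_score(rated)
    print(score, bone_age(score, "male"), bone_age(score, "female"))
```

The same total score maps to different ages for boys and girls, which is why the lookup is keyed by sex.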
The TW method is considered to be more objective than the holistic GP method and to exhibit higher reproducibility. Bull et al (21) reported that the intra-observer variation was greater using GP than TW (95% confidence interval, -2.46 to 2.18 vs -1.48 to 1.43, respectively). However, assessment using the TW method is more time-consuming. In a study performed by King et al (34) the average time required for TW assessment was 7.9 min, compared with 1.4 min for GP assessment. In this study the difference in intra-observer variation between GP and TW assessment was found not to be significant (the average spread of results was 0.74 years for TW and 0.96 years for GP). It should be noted that the sample size assessed by King et al (34) was considerably smaller than that assessed by Bull et al (21). A comparison of GP and TW methods is presented in Table 2.
Other Atlases
The FELS method was developed in 1988 using 13,823 serial radiographs of the left hand-wrist of boys and girls in the Fels Longitudinal Study performed by William Cameron Chumlea, Alex F. Roche and David Thissen from two universities in Kansas and Ohio, US (35). It is based upon maturity indicators that represent radiographic features that occur during the maturation of every child (35). The set of maturity indicators is analysed with a computer program that provides the BA and the standard error for that assessment (35). However, the FELS method has not gained wide recognition.
In 2005 a digital atlas created by Vicente Gilsanz and Osman Ratib (GR) was published. It consists of artificially created, idealised images of hands and wrists, specific for age and sex. These images were produced by an analysis of the size, shape, morphology and density of ossification centres of 522 hand radiographs from healthy Caucasian children from Los Angeles, US (50% girls and 50% boys). Each image includes typical characteristics of development for each of the ossification centres (36). The images are of better quality and precision in comparison to GP. Another advantage is the regular spacing of the images at 6-monthly intervals from the ages of 2 to 6 years and yearly intervals from the age of 7 to 17 years (37). In one study the GR atlas was compared to GP and it was concluded that they were comparable in terms of precision. Yet again, however, the study was performed on a small number of examinations (38).
Ultrasound Assessment
Other imaging modalities, which have developed considerably over the years, now offer some advantages over the ubiquitous radiograph for assessment of BA. One of these is ultrasound (USG), the major advantage of which is that it does not expose the patient to any ionizing radiation, important when patients receive sequential assessment of BA. Some studies have been performed to establish different methods of BAA, including by performing USG (39).
A result of one of these trials is BonAge® (Sunlight Medical Ltd, Tel Aviv, Israel) which consists of a device that performs an ultrasonographic examination and software that calculates the BA on the basis of this examination (19,40,41,42,43). BonAge® measures the ossifying cartilage structures of the wrist as an ultrasonic wave passes through the subject’s distal radius and ulnar epiphysis. According to the producer, BonAge® provides on-the-spot, easy-to-read, immediate results, without exposing children and adolescents to ionizing X-ray radiation, and moreover, it is objective and safe (40). The time of the examination is approximately five minutes although this can prove problematic in the smallest children (41).
Several studies have been performed to assess the precision of this instrument. Mentzel et al (41) and Shimura et al (42) concluded that the results of BonAge® examinations correlate closely with BA evaluated conventionally using the GP or TW2 method. However, a more recent study performed by Khan et al (43) on a larger number of patients showed that BonAge® tended to over-read delayed BA and under-read advanced BA, and the authors concluded that ultrasonographic assessment should not yet be considered a valid replacement for radiographic BAA.
There has also been a report of ultrasonographic assessment of the thickness of anterior femoral head cartilage, which correlates strongly with the child’s CA and BA, standing height and body weight, according to the authors of the study (44). Ultrasonic examination of ossification of the iliac crest apophysis (Risser’s sign) was also studied and showed high accuracy, specificity and sensitivity in comparison to hand X-ray examination and GP assessment (45).
Although the majority of the authors of these studies conclude that the USG methods investigated are of good accuracy in comparison to hand X-ray, USG-based BAA is rarely used in daily practice. This may be because the examination needs to be performed by a trained specialist or because a specific device is needed. In both cases, it takes more time to perform than an X-ray. Taking into consideration that most studies investigating the utility of USG in BAA were performed on relatively small groups of patients, the clinical utility of USG examination is as yet unproven. Moreover, isolation of the forearm allows for minimal radiation exposure and the radiation dose during hand X-ray is very low (0.0005 mSv). Nevertheless, in the future, USG may be an advantageous method that may allow total elimination of children’s exposure to ionizing radiation during BAA.
Magnetic Resonance Imaging Assessment
The first research in the field of BAA using magnetic resonance imaging (MRI) was performed in 2007 to find a tool suitable to establish the age of male football players without unnecessary radiation exposure (29). Since in some Asian and African countries registration at birth is not compulsory, age determination is crucial to prevent participation in the incorrect age group (29).
In 2012 Terada et al (46) reported a technique for BAA based on MRI examination. BA was determined using an open, compact, newly designed MR imager optimized for evaluation of a child’s hand and wrist and it was scored by two raters using the TW system adapted for the Japanese population. Evaluation of this method was performed on a group of 93 healthy Japanese children and a strong positive correlation with BA and CA was demonstrated. What is more, the intra- and inter-rater reproducibility rates were high (46). Another study by the same authors was performed in 2014 to improve the performance of this method (47). This was conducted on a group of 88 healthy children with three raters assessing BA and it confirmed the reliability and validity of this method (47). However, a disadvantage of MRI is that the examination takes a relatively long time (2 min and 44 sec) and therefore may not be suitable for the youngest children, owing to body movement.
Another study was performed by Tomei et al (48) and this was published in 2014. They performed hand and wrist MRIs on 179 healthy children aged 11-16 years old and analyzed the correlation with CA. It was concluded that BAA with MRI was feasible and showed good inter-observer reproducibility (48).
In 2017 the results of another study were published regarding the use of MRI in BAA. Hojreh et al (49) performed hand MRI and X-ray examinations in 50 healthy volunteers and 10 patients, all of whom were adolescents (aged 15±2 years and 13.5±2.6 years, respectively) and assessed both examinations according to GP criteria. This study concluded that the correlation between estimated patients’ ages on radiographs assessed by GP and MRI was high with the average estimated age difference between the MRIs and radiographs being −0.05/−0.175 years. However, larger, multicenter studies are necessary to confirm the usefulness of this method. There have also been attempts to automate the BAA using MRI instead of radiography (50,51). The comparison of X-ray, USG and MRI methods is presented in Table 3.
Automated Techniques
Owing to the problems associated with BAA when using traditional methods, such as inter- and intra-observer variability and the fact that it is time-consuming, a need emerged for new, objective tools that would provide immediate results. As Computer-Aided Diagnosis (CAD) emerged and started to be used in clinical practice, BAA was one obvious procedure suitable for adaptation to CAD, and BA was one of the first radiologic examinations to be automated. This is not recent, however. The first trials of CAD in BAA date back to 1989, when a semi-automated system called HANDX was introduced by Michael and Nelson (52). Shortly afterwards, in 1991, Pietka et al (53) published work on a system based on the assessment of phalangeal regions of interest (PROI). In this method, the PROI were detected and the lengths of the distal, middle, and proximal phalanx were measured automatically. BA was estimated using the standard phalangeal length table, presented earlier by Garn et al (54).
CASAS
However, the first system to be used by different authors in studies was CASAS - a computerized image analysis system for estimating TW2 BA (55). This semi-automated system was introduced by Tanner and Gibbons in 1994 and it used the 13 bones of the TW RUS system (radius, ulna and short bones) for BAA. These bones had to be located manually on the screen by a rater (correct positioning was assured by computer templates of each bone stage) and then automatic scoring was performed. Tanner and Gibbons (55) concluded that CASAS was more reliable and valid than manual TW RUS rating (56). Although other researchers have also reported that CASAS was useful and reliable (57,58), this system has not been widely adopted. The major drawback was that it took more time to estimate BA with CASAS than with a manual TW assessment. In addition, difficulties with BAA in cases of abnormally shaped bones restricted the use of CASAS in some pathological conditions.
More recently there have been numerous approaches to BAA automation (58,59,60,61,62,63,64,65,66,67,68,69,70,71) and the most important ones are described below.
BoneXpert
This automated tool for BAA was created in 2008 by the Visiana company, based in Holte, Denmark (72,73,74). The program analyses BA automatically, in several steps. The first step is the definition of borders and intensity of the radiologic image of 13 points of interest in the same 13 bones used in the TW RUS system, that is the radius, ulna and 11 short bones. During this first step the system also determines whether the image is complete and of appropriate technical quality. In the next step, BA is assessed for each of the 13 bones separately. The last step is the transformation of the summary BA according to GP and TW criteria (72,73). Figure 2 presents BAA by BoneXpert. BAA is available for ages 2.5-19 years for boys and 2-18 years for girls (version 2.4.7.6.) (75). The data set used for the creation of this program consisted of 1678 hand radiographs of healthy Danish children and children from Belgium diagnosed with a range of disorders, such as Turner syndrome (73).
To date several papers have been published that verify the reliability and precision of BAA using BoneXpert in comparison to GP in different populations (Table 4). In European populations, studies have been conducted among healthy children from the Netherlands (405 patients), German children with short stature (1,097 patients), precocious or early puberty (116 patients), congenital adrenal hyperplasia (100 patients) and with various other endocrinological disturbances (514 patients) (75,76,77,78,79). Moreover, there was a study conducted with 1100 healthy American children from four different ethnic groups (Caucasian, African American, Asian and Hispanic) (22) and another on 515 eutrophic, overweight and obese children from Brazil (80). Research into the validity of BoneXpert has also been performed in Asian populations, including a study on 397 healthy children from Shanghai, China (81), in a large population of 6026 healthy children from five different cities in China (82) and among Japanese children, using 185 radiographs from 22 healthy children and 284 radiographs from 22 patients diagnosed with GH deficiency (83).
What is more, studies have confirmed the validity of BAA via BoneXpert in groups of children suffering from different disorders, including juvenile idiopathic arthritis (84), in severely disabled children (85) and, as previously noted, children with short stature (76), precocious puberty (77) and congenital adrenal hyperplasia (78). All these studies conclude that BoneXpert is a suitable tool to perform BAA, it is faster than traditional methods and eliminates rater variability. However, it should be noted that one of the authors of most of these studies is a person connected to the commercial activity of Visiana company, the producer of BoneXpert.
BoneXpert has several critical limitations. BA is not identified directly; the prediction depends on the relationship between CA, which is an input to the system, and BA (62). The system is brittle and will reject radiographs when there is excessive noise; in one study it rejected 4.5% of individual bones (81). Finally, until recently BoneXpert did not take the carpal bones into consideration, although in younger children they contain discriminative features. This has been changed in the latest version - BoneXpert 3.0 released in September 2019 - which now does include carpal bones in the analysis.
An additional feature that BoneXpert offers is measurement of a parameter called the Bone Health Index (BHI) (86). BHI is a measurement of bone mass calculated as a function of the cortical thickness, width and length of the three central metacarpals. The program also automatically calculates standard deviation (SD) values for BHI, based on cohort data of Caucasian children (86). Several research studies have compared BHI values with traditional methods of bone mass measurement. In one study BHI was compared to dual-energy X-ray absorptiometry (DXA) and peripheral quantitative computed tomography (pQCT) in a cohort of paediatric patients from paediatric endocrine or paediatric oncology outpatient clinics. It was concluded that BHI values showed a strong positive correlation with DXA readings, and that total bone mineral density, as assessed via pQCT, also correlated positively with the BHI (87,88). In another study on a group of patients with juvenile idiopathic arthritis, BHI measured by BoneXpert was correlated with measurements of bone mineral density by DXA; however, the correlation of Z-scores of bone mineral density measured by the two methods was weaker (89). The authors of these studies noted that a significant advantage of using BHI, in comparison to DXA or pQCT, was that radiation exposure was lower and confined to low-risk peripheral areas. Also, BHI has already been used in research studies of BA in patients with juvenile idiopathic arthritis (89). There is an extension to BoneXpert, known as digital X-ray radiogrammetry (DXR). DXR measures the cortical bone thickness in the shafts of the metacarpals and has been shown to be effective in the assessment of hand bone loss caused by rheumatoid arthritis (90).
Another advantage of BoneXpert is the prediction of the final height of a child (91,92), which is a vital element of the clinical assessment of a child with short stature. Methods in current routine use rely on BAA by traditional methods – GP or TW. The variability of these assessments is the main reason for the variability of predicted final height. When BAA derived from BoneXpert is used, it is possible to predict final height in an objective, precise way. The program takes into consideration the sex, CA, height and BA of a child in order to predict their final height. One can also add the height of parents and height at menarche to obtain an even more reliable outcome. It is also compulsory to classify the child into one of nine population groups: five within the Caucasian ethnicity, Asian Chinese, Asian American, Hispanic and African American. The result of these calculations is accompanied by an SD value and the true height values will be within the indicated range with 68% probability (93). This method’s accuracy has been validated in a clinical study (91).
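The 68% figure corresponds to a range of one SD on either side of the predicted value, if prediction errors are assumed to be normally distributed. The short Python sketch below illustrates this relationship only; it is not BoneXpert's calculation, the function names are hypothetical, and the example height and SD are invented for illustration.

```python
from statistics import NormalDist


def height_interval(predicted_cm, sd_cm, n_sd=1.0):
    """Range of predicted height plus or minus n_sd standard deviations."""
    return predicted_cm - n_sd * sd_cm, predicted_cm + n_sd * sd_cm


def coverage(n_sd=1.0):
    """Probability that a normally distributed error lies within n_sd SDs."""
    standard = NormalDist()  # mean 0, SD 1
    return standard.cdf(n_sd) - standard.cdf(-n_sd)


if __name__ == "__main__":
    # Invented example values: predicted adult height 170 cm, SD 3 cm.
    low, high = height_interval(170.0, 3.0)
    print(f"{low:.1f}-{high:.1f} cm, coverage {coverage():.1%}")
```

Under the same assumption, widening the range to two SDs raises the coverage to roughly 95%.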
Artificial Intelligence and Machine Learning
New possibilities of automating BAA emerged with the use of artificial intelligence (AI) and machine learning, especially the specific type of machine learning known as deep learning. The most popular approach uses a convolutional neural network (CNN), which has already found application in areas such as detection of patterns of interstitial lung disease on CT imaging (94) or segmenting the vascular network of the human eye on fundus photographs (95). In recent years there has been tremendous progress in this field and there have been numerous publications reporting the automation of BAA using CNN (96,97,98,99,100,101,102,103,104,105,106,107,108).
In 2017 the Radiological Society of North America (RSNA) conducted a challenge to assess BA from paediatric hand radiographs (RSNA Pediatric Bone Age Machine Learning Challenge 2017), as part of efforts to spur the creation of AI tools for radiology (109,110). The goal of the RSNA 2017 Machine Learning Challenge was to develop an algorithm which can most accurately determine BA using a validation set of paediatric hand radiographs. The results were evaluated by determining the mean difference and the mean absolute difference (MAD) between the performance of each system and the mean of all reviewers’ estimates. The company 16 Bit was placed first in the competition with a MAD of 4.265 months and a concordance correlation coefficient of 0.991 (111). The training data set available for competitors contained 12612 images from two American hospitals with a minimum age of 1 month, maximum age of 19 years and mean (SD) age of 10 years and 7 months (3 years 6 months) (111). Their Paediatric Bone Age Calculator is freely available on the website 16Bit.ai, although it is provided with the rider that the application is strictly for demonstration purposes and should not be used for clinical decision making (111). However, this tool has already been validated by a group of Canadian researchers, who compared its results to BAA using the GP atlas in a group of 213 male and 213 female patients. They found that the differences between BA assessed by these two methods were not statistically significant (median difference was 0.33 years) and concluded that the tool created by 16 Bit is suitable for clinical use (112).
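The evaluation metrics named above can be illustrated with a minimal Python sketch. It assumes that the reference for each radiograph is the mean of the reviewers' estimates, as described, and that the concordance correlation coefficient is Lin's coefficient; the function names and the example values (in months) are hypothetical and are not challenge data.

```python
from statistics import fmean


def reference_from_reviewers(reviews):
    """Mean of all reviewers' estimates for each radiograph."""
    return [fmean(per_image) for per_image in zip(*reviews)]


def mean_difference(predicted, reference):
    """Signed mean difference; shows systematic over- or under-estimation."""
    return fmean(p - r for p, r in zip(predicted, reference))


def mean_absolute_difference(predicted, reference):
    """MAD: average size of the error, ignoring its direction."""
    return fmean(abs(p - r) for p, r in zip(predicted, reference))


def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient (population variances)."""
    mx, my = fmean(x), fmean(y)
    n = len(x)
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * cov / (var_x + var_y + (mx - my) ** 2)


if __name__ == "__main__":
    # Invented estimates in months: two reviewers, three radiographs.
    reviews = [[120, 96, 150], [124, 100, 146]]
    reference = reference_from_reviewers(reviews)
    predicted = [125, 95, 148]
    print(mean_difference(predicted, reference))
    print(mean_absolute_difference(predicted, reference))
    print(concordance_correlation(predicted, reference))
```

Reporting both the signed and the absolute difference is useful because errors in opposite directions cancel in the former but not in the latter.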
Another attempt to automate BAA using CNN was described in 2016 by Spampinato et al (113). They compared the performance of several approaches, ranging from existing off-the-shelf CNNs, through existing CNNs pre-trained on general imagery and then fine-tuned, to a custom CNN trained from scratch only on BA radiographs (113). All of these CNNs were tested on the same public data set, the Digital Hand Atlas Database System, provided in 2007 by Gertych et al (114). This atlas includes 1391 digitized left-hand radiographs from evenly distributed, normally developed children of Caucasian, Asian, African-American and Hispanic origin, both male and female, with an age range from 1 to 18 years. Spampinato et al (113) concluded that the best performance was observed with BoNet, an original, new CNN trained from scratch specifically to assess hand radiographs (114).
Another study in this area deserves attention, as it is especially thorough and the methods used have been precisely described. It concerns a system called the Fully Automated Deep Learning System for BAA, which was created in 2017 by a group of researchers from Massachusetts General Hospital, Harvard Medical School. They used a pre-trained, fine-tuned CNN to create a new tool for BAA, using a large number of hand radiographs that included 4278 for females and 4047 for males but excluded children aged 0-4 years (115). This system calculates BA and provides the result as a number with a representative image, and presents four more images for BA +1, +2, -1 and -2 years. Thus the radiologist can verify the result and compare it with the closest ones. It achieved an accuracy of 57.32% and 61.4% for the female and male cohorts on held-out test images. Female test radiographs were assigned a BAA within 1 year 90.39% of the time and within 2 years 98.11% of the time. Male test radiographs were assigned 94.18% within 1 year and 99.00% within 2 years. It should be noted that this system does not reject malformed images (115). These authors also compared the BAA performance of a cohort of paediatric radiologists with and without the assistance of their tool for automatic BAA (116). They concluded that AI improves the radiologist’s performance for BAA by increasing accuracy and decreasing variability and root mean squared error. The best results were achieved when radiological assessment was assisted by AI and this was better than using AI alone, a radiologist alone, or a pooled cohort of experts (116).
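Figures such as "within 1 year" and "within 2 years" can be read as the proportion of test radiographs whose assigned BA differs from the reference by no more than a given tolerance. The following Python sketch shows that calculation under the assumption that the plain "accuracy" figure means exact agreement with the reference label; the function name and the example ages are hypothetical and do not come from the study.

```python
def fraction_within(predicted, reference, tolerance_years):
    """Share of cases whose absolute error is at most the tolerance."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        abs(p - r) <= tolerance_years for p, r in zip(predicted, reference)
    )
    return hits / len(predicted)


if __name__ == "__main__":
    # Invented whole-year BA labels for four test radiographs.
    predicted = [10, 12, 7, 15]
    reference = [10, 11, 9, 12]
    for tolerance in (0, 1, 2):
        share = fraction_within(predicted, reference, tolerance)
        print(f"within {tolerance} year(s): {share:.0%}")
```

Because each wider tolerance includes all cases counted by the narrower one, the proportions can only stay the same or increase as the tolerance grows.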
A comparison of chosen AI methods and BoneXpert is presented in Table 5. Due to the small number of radiographs in training and validating data sets, all the systems based on CNNs used data augmentation (increasing the number of radiographs by rotating the pictures, adding noise, etc.). In some studies authors tested more than one type of CNN. In these studies the CNN with the best performance is presented in the table.
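Data augmentation of the kind mentioned above can be sketched as follows. This is a simplified, pure-Python illustration and not the pipeline of any of the cited systems: images are represented as lists of rows of 8-bit grey values, rotation is restricted to 90-degree steps, and the function names, noise level and seed are hypothetical choices made for the example.

```python
import random


def rotate_90(image):
    """Rotate a 2-D image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise and clip the result to the 0-255 range."""
    return [
        [min(255, max(0, round(pixel + rng.gauss(0, sigma)))) for pixel in row]
        for row in image
    ]


def augment(image, n_copies, sigma=5.0, seed=0):
    """Create n_copies randomly rotated, noisy variants of one image."""
    rng = random.Random(seed)  # seeded so that the example is reproducible
    variants = []
    for _ in range(n_copies):
        candidate = image
        for _ in range(rng.randrange(4)):  # 0 to 3 quarter turns
            candidate = rotate_90(candidate)
        variants.append(add_gaussian_noise(candidate, sigma, rng))
    return variants


if __name__ == "__main__":
    tiny_image = [[0, 128], [255, 64]]
    for variant in augment(tiny_image, n_copies=3):
        print(variant)
```

In practice, imaging libraries allow small-angle rotations and other transformations; quarter turns are used here only to keep the sketch free of external dependencies.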