Evaluation of the quality of transesophageal echocardiography images and verification of proficiency

Various metrics have been used in curriculum-based transesophageal echocardiography (TEE) training programs to evaluate acquisition of proficiency. However, the quality of task completion, that is, the final image quality, was subjectively evaluated in these studies. Ideally, the endpoint metric should be an objective comparison of the trainee-acquired image with a reference ideal image. Therefore, we developed a simulator-based methodology of preclinical verification of proficiency (VOP) in trainees based on objective evaluation of the final acquired images. We utilized geometric data from the simulator probes to compare image acquisition by anesthesia residents who participated in our structured longitudinal simulator-based TEE educational program with ideal image planes determined from a panel of experts. Thirty-three participants completed the study (15 experts, 7 postgraduate year (PGY)-1 and 11 PGY-4). The results of our study demonstrated a significant difference in image capture success rates between learners and experts (χ² = 14.716, df = 2, P < 0.001), with the difference between learners (PGY-1 and PGY-4) not being statistically significant (χ² = 0, df = 1, P = 1.000). Therefore, our results suggest that novices (i.e. PGY-1 residents) are capable of attaining a level of proficiency comparable to those with modest training (i.e. PGY-4 residents) after completion of a simulation-based training curriculum. However, professionals with years of clinical training (i.e. attending physicians) exhibit a superior mastery of such skills. It is hence feasible to develop a simulator-based VOP program for the performance of TEE by junior anesthesia residents.


Introduction
Accurate and reliable evaluation of a trainee's actions is the highest level of clinical performance assessment (1). It is an important component of trainee feedback and critical to identification of opportunities for curricular interventions (2). Mixed simulators that are able to accurately simulate an operative or clinical imaging environment have been increasingly incorporated into clinician training (3,4). With widespread implementation of these programs, preclinical verification of proficiency (VOP) is now considered an integral component of surgical training (5). Task performance in these programs is evaluated with observer- and endpoint-based metrics and motion analysis (1,2,5,6,7). Similarly, in transesophageal echocardiography (TEE) teaching, the role of simulators in enhancing training has been established (8,9,10). While various metrics have been used in curriculum-based TEE training programs to evaluate acquisition of proficiency, the quality of task completion, that is, the final image quality, has usually been subjectively evaluated in these studies (10,11,12,13,14). Ideally, the endpoint metric should be an objective comparison of the trainee-acquired image with a reference ideal image. Motion tracking software can allow positional data capture with a narrow range of error. It is therefore possible that the range of geometric scan plane positions for an ideal image can also be established (10,14). Comparison of the geometric position of the reference/ideal scan plane and the trainee's scan plane could then be used as an objective endpoint metric of image quality. Our primary hypothesis is that the novel VOP metric will allow for differentiation of trainee performance from expert performance in image acquisition.
Secondly, we hypothesize that this VOP metric can objectively compare performance between differently trained cohorts of learners; consistent with comparisons made previously with other metrics, we expect that the PGY-1 residents who have completed an intensive fundamentals of ultrasound (FUS) course will have similar VOP results to longitudinally trained PGY-4 residents (15).

Study design
This study was conducted between October 2015 and March 2016 with institutional review board approval and a waiver of informed consent. Anesthesia faculty members who were Diplomates of the National Board of Echocardiography (NBE) in Advanced Perioperative TEE (PTEeXAM), PGY-1 and PGY-4 anesthesia residents were invited to participate in this study as part of a departmental education initiative. The goal was to compare the image acquisition of the PGY-1 and PGY-4 trainees throughout the training program with that of the trained anesthesia staff.

Training programs
PGY-1 anesthesia categorical interns undergo an intensive, 13-day multimodal basic US course. This course has been integrated in the educational program of our hospital (14). This course has different teaching modalities such as live lectures, online components, electronic books, hands-on sessions on phantom models, simulation and case discussions. During the course, three full days are dedicated to TEE, with approximately 14 h of TEE hands-on simulator training; the course schedule has been described in earlier publications.
PGY-4 residents underwent an eight-session online and simulator-based TEE course during their PGY-2 year. This training program was organized into eight modules imparted over 4 weeks. Each training session started with 15 min of discussion or lectures about the topic, followed by 75 min of hands-on training, for approximately 10 h of hands-on training over 4 weeks. Other TEE exposure during the residency program included 12 annual echo simulator sessions during their PGY-3 and 4 years, a cardio-thoracic rotation of 1 month and an optional TEE rotation of 1 month.

Traditional transesophageal echo training
Anesthesia residents participated in a structured longitudinal simulator-based TEE educational program during the PGY-3 and 4 years. It consists of an ultrasound training session (60 min) of formal didactics and hands-on instruction with the TEE simulator every 3-4 weeks (five total sessions). During this program, each trainee had approximately 20 h of hands-on TEE simulator training as well as access to our web-based training modules. Additionally, residents are exposed to clinical TEE training during a 1-month cardio-thoracic rotation in the PGY-3 year, and a dedicated 1-month clinical TEE training curriculum for each PGY-4 year resident.

FUS program
As an educational initiative, our department established an intensive FUS program for 'categorical anesthesia interns' during their PGY-1 period prior to start of anesthesia training (Supplementary Table 1, see section on supplementary data given at the end of this article). Components of the curriculum of the FUS educational program have been published previously (Supplementary Table 2). During the FUS program, the PGY-1 trainees were exposed to a total of 14 h of dedicated and supervised hands-on TEE simulator training at the departmental ultrasound simulation laboratory as well as live and online didactics. Image accuracy assessment was performed on the final day of the course.

PGY-1 group: This group consisted of categorical anesthesia interns who participated in the FUS educational program during their internship year.
PGY-4 group: This group comprised graduating anesthesia residents who received the traditional TEE training (as described previously) during their residency along with exposure to ultrasound training during subspecialty rotations but not the FUS program. Image accuracy testing was performed individually in the last month of their anesthesia training prior to their graduation from the residency program.
Expert group: This group consisted of anesthesia faculty certified by NBE perioperative TEE having successfully completed the PTEeXAM and performed and reviewed the relevant required echocardiograms. Images were determined using guidelines and common imaging planes used for assessment.

Study protocol
The study consisted of the following steps:
1. An expert acquired standardized TEE images on a simulator as the reference images.
2. The final image was observed and agreed upon by consensus of experts as the ideal standard image during acquisition.
3. The motion metric and the geometric positional data of the scan plane in the 3D space for each reference image were captured and stored.
4. Experts were invited to acquire the same images on the TEE simulator with simultaneous acquisition of the motion metric and final positional data of the scan plane.
5. The final positional data of experts were analyzed to create a range of acceptable scan plane final positions in order to define an acceptable range of accuracy.
6. PGY-1 and PGY-4 trainees were invited to acquire the selected images on the TEE simulator with capture of the motion metric and final positional data.
7. The motion metric and final positional data were compared between the groups.

Acquisition of positional and orientation data
A Vimedix TEE Simulator (CAE Healthcare, Montreal, Canada) capable of capturing motion metrics during TEE probe manipulation was used for training and evaluation. The methodology used to capture the motion data has been described in previous studies (10). Briefly, an experienced echocardiographer acquired 12 TEE images (Supplementary Table 3), that is, target cut planes (TCPs), which consisted of multiple upper, mid-esophageal and transgastric windows corresponding to standard TEE images. These TCPs were observed and approved by consensus of five experts as images representative of the standard basic exam plus an additional two TCPs that our group finds useful for ventricular function and hemodynamic assessment. The 4-chamber (4C) view was omitted, as it is the starting point for motion tracking in each exam. The TCPs were (1) mid-esophageal (ME) two chamber, (2) ME long axis, (3) ME ascending aorta long axis, (4) ME ascending aorta short axis, (5) ME aortic valve short axis, (6) ME right ventricular inflow-outflow, (7) ME bicaval, (8) transgastric mid-short axis, (9) descending aorta short axis, (10) descending aorta long axis, (11) deep transgastric long axis and (12) transgastric long axis.

Motion metric data collection and position tracking
A participant was first asked to obtain a ME-4C view so that the starting point for all TCPs, across all participants, was similar. Metrics tracking was started by the instructor, and the participant was asked to capture the first TCP. Metrics tracking was stopped when the participant was satisfied with the quality of his/her TCP. Participants were asked to return to the ME-4C view, and the procedure was repeated for each of the remaining 11 TCPs.
The simulator recorded the probe's position and orientation in the form of x, y and z coordinates and roll, pitch and yaw degrees (Supplementary Fig. 1). The data were exported from the simulator as a comma-separated values (.csv) file, and the final positions and orientations of each image capture were imported to the geometric analytical software 'R' (R Core Group, Vienna, Austria) for further analyses.

Data collected
In addition to tracking the probe position as Cartesian coordinates (x, y, z), and the probe orientation in aircraft principal axes (roll, pitch and yaw), the following motion metrics were also recorded during image acquisition:
Capture time: The total time, in seconds, from the start of metrics tracking to the time of image acquisition.
Lag time: The time, in seconds, from the start of metrics tracking to the time of the first probe acceleration peak.
Path length: The total distance, in cm, traveled by the probe tip during metrics tracking.
Probe accelerations: The number of times the probe accelerated >0.5 cm/s² during metrics tracking.
The metrics data were exported from the simulator as .csv file and imported to R (R Core Group) for further analyses.
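The simulator reports these metrics directly, so the following is only an illustrative sketch of how two of them could be derived from raw samples. It is written in Python rather than the R used in the study, and the function names, the sample layout (a list of (x, y, z) tuples in cm) and the timestamp list are assumptions made for the example, not the simulator's export format.

```python
import math

def path_length(samples):
    """Total distance (cm) traveled by the probe tip.

    `samples` is a hypothetical list of (x, y, z) tip positions in cm,
    ordered in time. The path is approximated by straight segments
    between consecutive samples.
    """
    return sum(math.dist(a, b) for a, b in zip(samples, samples[1:]))

def capture_time(timestamps):
    """Seconds from the start of metrics tracking to image acquisition.

    `timestamps` is a hypothetical list of sample times in seconds;
    the first entry is the start of tracking, the last is the capture.
    """
    return timestamps[-1] - timestamps[0]

# Hypothetical three-sample trace: a 5 cm move followed by a 12 cm move.
trace = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
total_cm = path_length(trace)
elapsed_s = capture_time([0.0, 1.5, 4.0])
```

A finer sampling interval gives a closer approximation of the true path; the segment-sum approach never overestimates the distance along a smooth trajectory.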

Assessment of image accuracy
Each participant's probe manipulation at the time of image capture was defined as a unit vector with the following endpoints:
Position: $(x,\ y,\ z)$
Orientation: $(x + \cos(\mathrm{yaw})\cos(\mathrm{pitch}),\ y + \cos(\mathrm{pitch})\sin(\mathrm{yaw}),\ z + \sin(\mathrm{pitch}))$
where x, y, z, pitch and yaw were obtained from the .csv output.
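As a minimal sketch of the endpoint definition above, the orientation endpoint can be computed from the exported position and angles. The example is in Python rather than the R used in the study; the function name is hypothetical, and it assumes that pitch and yaw are exported in degrees and that all three offset terms are added to the position.

```python
import math

def orientation_endpoint(x, y, z, pitch_deg, yaw_deg):
    """Return the tip of the unit vector that starts at (x, y, z).

    Assumes pitch and yaw are given in degrees, as described for the
    simulator output; roll does not change the direction of the vector.
    """
    pitch = math.radians(pitch_deg)
    yaw = math.radians(yaw_deg)
    dx = math.cos(yaw) * math.cos(pitch)
    dy = math.cos(pitch) * math.sin(yaw)
    dz = math.sin(pitch)
    return (x + dx, y + dy, z + dz)

# With zero pitch and yaw the vector points along the x axis.
start = (1.0, 2.0, 3.0)
tip = orientation_endpoint(*start, pitch_deg=0.0, yaw_deg=0.0)
```

Because cos²(yaw)cos²(pitch) + cos²(pitch)sin²(yaw) + sin²(pitch) = 1, the offset always has length one, which is what makes the position and orientation endpoints a unit vector.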
For each TCP, the set of probe manipulations of the expert group were plotted in 3D space in R (R Core Group) (Fig. 1A), and ranges of acceptable probe position (Fig. 1B) and orientation (Fig. 1C) were defined.
These ranges were developed following the concept of a 95% CI; the sample of expert positions yielded an acceptable range sphere centered at their geometric center with a radius equal to the product of their s.d. and the 95% critical z score (z score ≈ 1.96). The range of acceptable probe orientations was constructed similarly.
The range of acceptable probe positions was a sphere centered at point p with radius r1. The range of acceptable probe orientations was a sphere centered at point o with radius r2, where p is the geometric centroid of the expert positions, o is the geometric centroid of the expert orientation endpoints, and r1 and r2 are the products of the respective s.d. and the critical z score, as described above.
The probe manipulations of the PGY-1 and the PGY-4 groups were then overlaid on the same plot (Fig. 1D), and each participant's probe manipulation was deemed successful if, and only if, their position and orientation fell within the respective acceptable ranges.
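The acceptance rule can be sketched as follows, in Python rather than the R used in the study. The text does not specify how the s.d. of a set of 3D points was computed, so this example assumes the root-mean-square distance from the centroid with an n − 1 denominator; that choice, and all function names, are assumptions for illustration only.

```python
import math

def centroid(points):
    """Geometric centroid of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def acceptance_sphere(points, z=1.96):
    """Return (center, radius) of the acceptable range for expert points.

    Assumption: the s.d. is the root-mean-square distance from the
    centroid using an n - 1 denominator; the radius is z times that.
    """
    c = centroid(points)
    variance = sum(math.dist(p, c) ** 2 for p in points) / (len(points) - 1)
    return c, z * math.sqrt(variance)

def is_successful(position, orientation, position_sphere, orientation_sphere):
    """True only if both endpoints fall inside their respective spheres."""
    (p, r1), (o, r2) = position_sphere, orientation_sphere
    return math.dist(position, p) <= r1 and math.dist(orientation, o) <= r2

# Hypothetical expert endpoints for one target cut plane.
expert_positions = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
expert_orientations = [(1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
pos_sphere = acceptance_sphere(expert_positions)
ori_sphere = acceptance_sphere(expert_orientations)
ok = is_successful((1.0, 1.0, 0.0), (2.0, 1.0, 0.0), pos_sphere, ori_sphere)
```

Requiring both conditions means a probe that is in the right place but pointed the wrong way, or the reverse, is counted as an unsuccessful capture.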

Statistical methods
A Pearson's chi-square test of independence was used to assess whether there was a significant difference in the success rate between the PGY-1 group, the PGY-4 group and the expert group. A Pearson's chi-square test of independence was then applied to the PGY-1 group and the PGY-4 group alone in order to assess whether the PGY-1 group had attained a level of proficiency comparable to the PGY-4 group. One-way ANOVA was used to evaluate whether there was a difference in the path length, capture time, number of accelerations from rest and lag time between the three groups. A post hoc Tukey HSD test quantified the significance of pairwise differences between the PGY-1 group and the PGY-4 group in order to assess whether the PGY-1 group had attained a level of proficiency comparable to the PGY-4 group.
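For readers unfamiliar with the one-way ANOVA named above, the F statistic can be sketched in a few lines. The example is in Python rather than the R used in the study, the function name is hypothetical, and the three small groups are made-up numbers chosen only so the arithmetic is easy to follow; they are not study data. The post hoc Tukey HSD step is not shown.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical values for three groups (not study data).
example_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(example_groups)
```

A large F indicates that the spread between group means is large relative to the spread within groups; the corresponding P value comes from the F distribution with the two returned degrees of freedom.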

Successful image captures
The proportion of successful image captures, categorized by training level, is summarized in Table 1.
Within the PGY-1 group, 68 out of 82 captured images were acceptable, yielding an aggregate success rate of 82.9%. Within the PGY-4 group, 103 out of 124 captured images were acceptable, yielding an aggregate success rate of 83.1%. Within the expert group, 168 out of 176 captured images were acceptable under the defined spheres for each respective target cut plane, yielding an aggregate success rate of 95.5%.
A Pearson's chi-square test of independence suggests that there is a significant difference in success rate between the PGY-1 group, the PGY-4 group and the expert group (χ² = 14.716, df = 2, P < 0.001). However, the difference between the PGY-1 and PGY-4 groups alone was not significant (χ² = 0, df = 1, P = 1.000).
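The reported statistics can be checked from the success counts given above. The sketch below is in Python rather than the R used in the study, and its function names are hypothetical. The text does not say whether a continuity correction was applied to the two-group comparison; the reported χ² = 0 is consistent with a Yates-corrected 2 × 2 test (the default in R's chisq.test), so the correction is applied there as an assumption.

```python
import math

def pearson_chi_square(table, yates=False):
    """Return (chi2, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            diff = abs(observed - expected)
            if yates:
                # Continuity correction, capped so it never overshoots zero.
                diff -= min(0.5, diff)
            chi2 += diff ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

def chi_square_p_value(chi2, df):
    """Upper-tail P value using the closed forms for df = 1 and df = 2."""
    if df == 1:
        return math.erfc(math.sqrt(chi2 / 2))
    if df == 2:
        return math.exp(-chi2 / 2)
    raise ValueError("only df = 1 or 2 are handled in this sketch")

# Rows: PGY-1, PGY-4, expert; columns: successful, unsuccessful captures.
counts = [[68, 82 - 68], [103, 124 - 103], [168, 176 - 168]]
chi2_all, df_all = pearson_chi_square(counts)
p_all = chi_square_p_value(chi2_all, df_all)

# PGY-1 versus PGY-4 only, with the assumed continuity correction.
chi2_pair, df_pair = pearson_chi_square(counts[:2], yates=True)
p_pair = chi_square_p_value(chi2_pair, df_pair)
```

Under these assumptions the three-group comparison gives χ² ≈ 14.716 with df = 2 and P < 0.001, and the corrected two-group comparison gives χ² = 0 with P = 1, matching the values reported in the text.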
The distribution of individual subjects' proportion of successful image captures within each of the three groups is summarized in Table 2.
Motion metrics of proficiency (independent of image accuracy; i.e. capture time, lag time, number of accelerations from rest and path length) are summarized in Table 3.
When comparing all three groups, capture time (P < 0.001), lag time (P = 0.002), number of accelerations from rest (P < 0.001) and path length (P < 0.001) differed significantly with training level.
When comparing the PGY-1 group to the PGY-4 group alone, capture time (P < 0.001) and lag time (P = 0.006) differed significantly with training level. The differences in average accelerations from rest (P = 0.721) and path length (P = 0.728) were not significant.

Discussion
It has been previously demonstrated by multiple investigators that a foundation of basic knowledge and psychomotor skills for perioperative TEE can be acquired in the skills laboratory (15,16,17). The VOP metric allowed for objective differentiation between experts and trainees in terms of final images acquired. The establishment of an endpoint image provides a critical approach to the assessment of image quality, which was previously subjective. In this study, echo-naïve junior residents who underwent the intensive FUS program demonstrated a proficiency level in TEE that was comparable to that of graduating residents who underwent traditional training. This finding was verified both with the new VOP metric and previously described hand motion metrics (15,16). Future studies should explore how to combine the VOP metric and existing hand motion, knowledge and workflow metrics into a comprehensive proficiency index that could be used to track the progression of learners toward proficiency over time. With the ability of simulators to incorporate pathologies, Doppler information and 3D imaging, advanced evaluations for certification could also be developed.
Expertise is a trait acquired through repetitive high-level clinical exposure. While we have previously had some limited data on expert performance, this is the first study in which we had a large enough group of experts to do a thorough comparison of all metrics between experts and trainees; it is also the first time objective endpoint data were quantifiable with the new VOP metric. Neither group of residents achieved the level of accuracy demonstrated by the experts. This implies that simulation training is not a substitute for clinical training, and that our program establishes readiness to learn and perform rather than expertise.
There are a few limitations to this study. First, while these simulators do an excellent job of reproducing TEE movements and representations of the anatomy, the imaging in them is more idealized and not completely concordant with human anatomy. Secondly, while the ability to capture motion metrics and assess image quality is available, not all commercially available TEE simulators offer such features. Thirdly, the determination of image quality is based on complex geometric algorithms, which most centers will not yet be able to employ due to lack of requisite hardware or software. This process can, however, be automated and possibly incorporated into simulators themselves, rendering assessment less cumbersome. Finally, while essential, image acquisition skills are only one component of clinical proficiency; other techniques exist and must be employed to assess workflow and knowledge as well.
In conclusion, based on the positional data of the TEE probe and the scan plane, we demonstrated that the quality of the acquired image could also be objectively evaluated and mathematically compared to an acceptable range of expert images. Possession of psychomotor skills for an invasive procedure is considered an integral component of VOP, and we have successfully demonstrated acquisition of psychomotor skills for TEE utilizing our training protocols. After a simulator-based training, echo-naïve junior residents were able to demonstrate psychomotor skills and image quality for TEE image acquisition that were comparable to senior residents who underwent traditional training.

Supplementary data
This is linked to the online version of the paper at https://doi.org/10.1530/ERP-18-0002.
A one-way ANOVA was used to assess the difference in mean capture time between all three groups, and a post hoc Tukey HSD test was used to assess the difference between the PGY-1 group and the PGY-4 group alone. PGY, postgraduate year.