GM2 ORO.FC.231(d)(2) Evidence-based training

CAA ORS9 Decision No. 32

VERIFICATION OF THE ACCURACY OF THE GRADING SYSTEM — FEEDBACK PROCESS

The verification of the accuracy of the grading system provides valuable data for training system performance and concordance assurance. The verification is therefore necessary from a systemic point of view; the intention is not to measure individual pilots against Appendix 9 criteria.

Concordance (agreement between instructors) may be high; however, the whole community of instructors may still be grading too low or too high (accuracy).
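The distinction between concordance and accuracy can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the grades are invented, the 'reference grades' stand in for whatever criterion-referenced outcome the operator uses, and the two measures (pairwise agreement and mean bias) are only one simple way of expressing the two concepts. This GM does not prescribe any particular metric.

```python
from itertools import combinations
from statistics import mean

# Hypothetical data: each row is one observed performance,
# graded on the 1-5 scale by three instructors.
instructor_grades = [
    [3, 3, 3],
    [2, 2, 2],
    [4, 4, 3],
    [3, 3, 3],
]
# Hypothetical reference grades for the same four performances.
reference_grades = [2, 1, 3, 2]


def pairwise_agreement(rows):
    """Fraction of instructor pairs giving the same grade (concordance)."""
    pairs = [a == b for row in rows for a, b in combinations(row, 2)]
    return sum(pairs) / len(pairs)


def mean_bias(rows, reference):
    """Mean of (instructor grade - reference grade).

    A positive value means the instructors grade too high on average,
    a negative value means they grade too low (accuracy).
    """
    return mean(g - ref for row, ref in zip(rows, reference) for g in row)


concordance = pairwise_agreement(instructor_grades)
bias = mean_bias(instructor_grades, reference_grades)
print(f"concordance={concordance:.2f}, mean bias={bias:+.2f}")
```

In this invented data set the instructors agree with each other in most cases, yet they grade almost one full point above the reference on average: high concordance, poor accuracy.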

The statistical result of the verification against Appendix 9 criteria can provide the operator with a criterion-referenced system to adjust the accuracy of the grading system. The verification does not require an examiner; EBT instructors may provide the necessary data.

Example 1: Over the last 36 months, the operator has had a rate of 3 % of pilots scoring 1 (assuming the data is statistically relevant). In this example, the 3 % rate of pilots scoring 1 holds across all the technical competencies. When the operator performs a verification against Appendix 9 criteria, the failure rate is only 0.5 %. This may indicate that instructors are grading too low in EBT and that some of the pilots who scored 1 should have received a higher grade. This may be economically negative for the operator. On the other hand, it could be that the operator has decided to implement higher standards.

Example 2: The operator has an EBT programme with a negligible rate of pilots scoring 1 and 1 % of pilots scoring 2 in two consecutive recurrent modules. The verification of the technical competencies against Appendix 9 criteria yields a failure rate of 5 %. The EBT manager should further investigate the reason behind this mismatch between EBT and Appendix 9 in the technical competencies. There may be factors influencing this mismatch (e.g. statistical issues, or the events in the EBT modules being too benign compared to the events in Appendix 9), which may lead to a corrective action (e.g. redesign of the EBT modules). If the difficulty of the EBT scenarios is equivalent to Appendix 9 and the concordance between instructors is high, then the discrepancy in outcomes might be because the community of instructors is grading too high in the technical competencies (they are grading 2 when they should have graded 1). Further instructor standardisation will be needed to address this.
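The comparisons in Examples 1 and 2 can be sketched in Python as follows. This is only an illustration under stated assumptions: the sample sizes are invented, the two-proportion z-test is one possible way of checking whether a difference is 'statistically relevant', and the function names and the 0.05 threshold are hypothetical. This GM does not prescribe a statistical method, and a flagged mismatch is only a prompt for the investigation described in Example 2, not a conclusion.

```python
from math import erf, sqrt


def two_proportion_z(x1, n1, x2, n2):
    """Two-sided two-proportion z-test; returns (z, p_value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:  # both samples all-zero or all-one: nothing to compare
        return 0.0, 1.0
    z = (p1 - p2) / se
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - normal_cdf)


def compare(ebt_low, ebt_total, app9_fail, app9_total, alpha=0.05):
    """Compare the EBT low-grade rate with the Appendix 9 failure rate.

    ebt_low   : pilots with the EBT grade(s) treated as equivalent to a failure
    app9_fail : pilots who failed the verification against Appendix 9 criteria
    alpha     : hypothetical significance threshold (an assumption)
    """
    z, p = two_proportion_z(ebt_low, ebt_total, app9_fail, app9_total)
    if p >= alpha:
        return "no significant mismatch"
    if z > 0:
        return "EBT grading may be too low"
    return "EBT grading may be too high"


# Example 1 with invented sample sizes: 3 % scoring 1 versus 0.5 % failures.
print(compare(60, 2000, 10, 2000))
# Example 2 with invented sample sizes: 1 % scoring 2 versus 5 % failures.
print(compare(20, 2000, 100, 2000))
```

With small samples the same percentages may not produce a significant result, which is why both examples depend on the data being statistically relevant.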

The implementation of mixed EBT following GM1 ORO.FC.230(a);(b);(f) provides a good opportunity to fine-tune and verify the accuracy of the grading system because an Appendix 9 licence proficiency check is carried out every year. The CAA may not allow full EBT unless the accuracy of the grading system is demonstrated.