The probability of a chance overall agreement is the probability that the raters happened to agree on a yes or no judgment by luck alone. As Marusteri and Bacarea (9) have noted, there is never 100% certainty about research results, even when statistical significance is achieved. Statistical results used to test hypotheses about the relationship between independent and dependent variables become meaningless if the raters score the variables inconsistently. If agreement is less than 80%, more than 20% of the data being analyzed are erroneous. With a reliability of only 0.50 to 0.60, it must be understood that 40% to 50% of the data being analyzed are erroneous. If kappa values are below 0.60, the confidence intervals around the obtained kappa are wide enough that one can surmise that about half of the data may be incorrect (10). Clearly, statistical significance means little when so much error exists in the results being tested. In addition, Cohen's kappa assumes that the raters are deliberately chosen. If your raters are drawn at random from a population of raters, use Fleiss' kappa instead. The agreement between judges considered so far was binary: they either agreed completely or disagreed completely. For graded judgments, such as rating shades of red, agreement cannot be treated as binary.
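To make the quantities discussed above concrete, the following sketch computes Cohen's kappa for two fixed raters from first principles (observed agreement versus chance agreement from the raters' marginal frequencies). The rating data and labels are illustrative only and do not come from the article; they merely show how a binary agree/disagree judgment feeds the statistic.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two deliberately chosen raters rating the same items."""
    assert len(ratings_a) == len(ratings_b)
    n = len(ratings_a)
    # Observed agreement: proportion of items both raters labeled identically.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected chance agreement from each rater's marginal label frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    p_e = sum((freq_a[k] / n) * (freq_b[k] / n) for k in freq_a)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical example: two raters, eight binary judgments.
rater1 = ["yes", "yes", "no", "yes", "no", "no", "yes", "no"]
rater2 = ["yes", "no", "no", "yes", "no", "yes", "yes", "no"]
print(round(cohens_kappa(rater1, rater2), 3))  # 0.5
```

Note that the raters above agree on 6 of 8 items (75% agreement), yet kappa is only 0.50 once chance agreement is removed, which is exactly the gap between percent agreement and kappa that the discussion below turns on.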
Cohen suggested interpreting the kappa result as follows: values ≤ 0 indicate no agreement, 0.01–0.20 none to slight, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 substantial, and 0.81–1.00 almost perfect agreement. This interpretation, however, allows very little agreement among raters to be described as "substantial". In terms of percent agreement, a rate of 61% would immediately be seen as problematic: nearly 40% of the data in the dataset would be faulty. In healthcare research, this could lead to recommendations to change practice on the basis of erroneous evidence. For a clinical laboratory, having 40% of sample evaluations be wrong would be an extremely serious quality problem. This is why many texts recommend 80% agreement as the minimum acceptable interrater agreement. Given the reduction from percent agreement that is typical of kappa results, some lowering of standards relative to percent agreement makes sense. However, accepting 0.40 to 0.60 as "moderate" may imply that the lower value (0.40) represents adequate agreement.
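The interpretation bands Cohen suggested can be encoded directly; this small sketch (illustrative, not part of the article) shows how a kappa of 0.61 lands in the "substantial" band even though, as argued above, it can correspond to a large share of faulty data.

```python
def cohen_interpretation(kappa):
    """Map a kappa value to Cohen's suggested interpretation band."""
    if kappa <= 0:
        return "no agreement"
    if kappa <= 0.20:
        return "none to slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"

# A kappa of 0.61 is labeled "substantial" under Cohen's scheme.
print(cohen_interpretation(0.61))  # substantial
```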
A more logical interpretation is proposed in Table 3. Considering that any agreement short of perfect (1.0) is a measure not only of agreement but also of its inverse, disagreement among the raters, the interpretations in Table 3 can be simplified as follows: any kappa below 0.60 indicates inadequate agreement among the raters, and little confidence should be placed in the study results.