The methods differ in several minor ways. The vector method penalizes disagreements relatively heavily, because the denominator includes every code used by any coder. This can be an advantage in certain situations, for example when the researcher is trying to identify disagreements and potential problems with the coding technique. In addition, this method is less forgiving when the two coders choose different codes (as in row 4 of Table 6 above) than when one coder selects a subset of the other coder's codes and no codes differ (as in row 3 of Table 6). The method may also be easier to conceptualize for computer scientists trained in information retrieval. However, when disagreement is severe, or when an individual coder tends to use more codes per line than the others, the denominator becomes much larger and the agreement statistic much smaller. Similarly, adding more coders tends to inflate the denominator when agreement is low, yielding lower statistics.

The qualitative data used in this example are transcripts of 6 focus groups conducted to study the attitudes, knowledge, and perceptions of Harlem residents regarding health and technology.

The project was part of a larger CDC-funded digital health partnership, which has been described elsewhere.5, 6

The Kupper-Hafner statistic is a kappa-like statistic that allows for a unique 2×2 table for each line of the document, defining "positive" for each unit of analysis (in our case, lines in the document) as the intersection of the codes used by the two coders. The per-line agreement values are then averaged over the entire group (in our case, the entire document).

To compare the observed and maximum agreement vectors, we first calculate their dot product and then normalize by the dot product of the maximum agreement vector with itself (i.e., the square of its length), producing a number between 0 (orthogonal vectors, indicating no agreement) and 1 (identical vectors, indicating total agreement). The per-line agreement for line i (pla_i) is as follows:

Table 7 gives additional insight into the reasons for the low overall agreement in this data set. . . .
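
The dot-product normalization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text here does not specify how the observed and maximum agreement vectors are constructed from the coders' codes, so the sketch takes both vectors as given. The function names (`dot`, `per_line_agreement`, `document_agreement`) and the example vectors are hypothetical and are not taken from the study.

```python
from statistics import mean


def dot(u, v):
    """Dot product of two equal-length numeric sequences."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def per_line_agreement(observed, maximum):
    """Normalize the observed vector against the maximum agreement vector.

    Computes (observed . maximum) / (maximum . maximum), i.e. the dot
    product divided by the squared length of the maximum agreement vector.
    """
    denom = dot(maximum, maximum)
    if denom == 0:
        raise ValueError("maximum agreement vector must be non-zero")
    return dot(observed, maximum) / denom


def document_agreement(lines):
    """Average the per-line values over all (observed, maximum) pairs."""
    return mean(per_line_agreement(o, m) for o, m in lines)


# Hypothetical vectors, chosen only to exercise the two endpoints:
identical = ([2, 2], [2, 2])    # observed equals maximum
orthogonal = ([0, 3], [2, 0])   # no overlap between the two vectors

print(per_line_agreement(*identical))               # 1.0
print(per_line_agreement(*orthogonal))              # 0.0
print(document_agreement([identical, orthogonal]))  # 0.5
```

Because the denominator is the squared length of the maximum agreement vector, any change that enlarges that vector (more codes per line, or more coders) lowers the resulting value for the same amount of observed overlap.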