Reliability in psychometry: what it is and how it is estimated in tests

If you have studied psychology or a related field, the concept of reliability is sure to ring a bell. But what exactly is it? Reliability in psychometry is a property of measuring instruments (for example, tests) that tells us whether they are accurate, consistent and stable in their measurements.

In this article, we explain what this property is, give some examples to clarify the concept, and describe the different ways of calculating the reliability coefficient in psychometry.

    What is reliability in psychometry?

    Reliability is a concept that belongs to psychometry, the discipline responsible for measuring human psychological variables through different techniques, methods and tools. Reliability is therefore a psychometric property, one that implies the absence of measurement error in a given instrument (for example, a test).

    It is also defined as the degree of consistency and stability of the scores obtained across different measurements with the same instrument or test. Another synonym for reliability in psychometry is “precision”. Thus, a test is said to be reliable when it is precise, free of measurement error, and gives stable and consistent results across repeated measurements.

    Beyond psychology, where else does this concept appear and get used? In various fields, such as social research and education.


    To better illustrate this psychometric concept, consider the following example: we use a thermometer to measure the daily temperature in a classroom. We take the measurement at ten in the morning every day for a week.

    We would say that the thermometer is reliable (it has high reliability) if, when the temperature is more or less the same every day, the thermometer shows this (that is, the measurements are close to one another, with no big jumps or large differences).

    However, if the measurements are completely different from one another (while the temperature is more or less the same every day), this would mean that the instrument does not have good reliability (because its measurements are not stable or consistent over time).

    Another example to help understand the concept of reliability in psychometry: imagine that we weigh a basket with three apples every day, for several days, and write down the results. If these results vary considerably over successive measurements (that is, as we repeat them), this would indicate that the reliability of the scale is not good, as the measurements would be inconsistent and unstable (the opposite of reliability).

    Thus, a reliable instrument is one that shows consistent and stable results in repeated measuring processes of a given variable.

    The variability of measurements

    How do you know if an instrument is reliable? One way is to look at the variability of its measurements. In other words, if the scores we obtain by measuring repeatedly with the instrument vary widely from one another, we consider that its values are not precise, and therefore that the instrument does not have good reliability (it is not trustworthy).

    Extending this to psychological tests and a subject’s responses to one of them: if the subject answers the same test repeatedly under the same conditions, the variability of the scores obtained gives us an indicator of the test’s reliability.
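    The idea of judging an instrument by the spread of its repeated measurements can be sketched in a few lines of Python. This is only an illustration: the function name `summarize_repeated_measures` and both sets of thermometer readings are hypothetical and are not taken from any real data set.

```python
from statistics import mean, stdev


def summarize_repeated_measures(readings):
    """Return the mean and the sample standard deviation of repeated readings."""
    return mean(readings), stdev(readings)


# Hypothetical classroom temperatures (degrees Celsius), one reading per day
# at ten in the morning, taken with two different thermometers.
stable_thermometer = [21.0, 21.2, 20.9, 21.1, 21.0, 20.8, 21.0]
erratic_thermometer = [21.0, 25.4, 17.3, 23.8, 19.1, 26.0, 18.2]

for name, readings in [("stable", stable_thermometer),
                       ("erratic", erratic_thermometer)]:
    average, spread = summarize_repeated_measures(readings)
    # A small spread suggests consistent measurements; a large one suggests
    # an unreliable instrument, assuming the true temperature barely changes.
    print(f"{name}: mean={average:.2f}, sd={spread:.2f}")
```

    The standard deviation is used here only as a simple descriptive index of variability; it is not itself a reliability coefficient.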

      The calculation: coefficient of reliability

      How is reliability calculated in psychometry? Through the reliability coefficient, which can be obtained in two ways: from procedures involving two applications of the test, or from procedures involving only one. Let’s look at the different ways of calculating it within these two broad blocks:

      1. Two applications

      In the first group we find the procedures that make it possible to calculate the reliability coefficient from two applications of a test. Let’s get to know them, along with their disadvantages:

      1.1. Parallel or equivalent forms

      This method yields a measure of reliability that in this case is also called “equivalence”. It consists of applying the two tests at the same time: X (the original test) and X’ (the equivalent test that we have constructed). This procedure has two main disadvantages: examinee fatigue and the need to construct two tests.

      1.2. Test-retest

      The second method for calculating the reliability coefficient from two applications is the test-retest, which provides a measure of the test’s stability. It basically consists of applying a test X, allowing some time to elapse, and reapplying the same test X to the same sample.

      The disadvantages of this procedure include the learning that the examinee may have acquired in the interval and changes in the person over time, both of which may alter the results.

      1.3. Test-retest with alternative forms

      Finally, another way to calculate reliability in psychometry is the test-retest with alternative forms. It combines the two previous procedures, so although it can be useful in some cases, it accumulates the disadvantages of both.

      The procedure involves administering test X, allowing a period of time to pass, and then administering test X’ (i.e., the equivalent test created from the original, X).
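      In all three two-application procedures, the reliability coefficient is usually obtained as the correlation between the two sets of scores produced by the same people. The sketch below shows that computation with a hand-written Pearson correlation. The function name `pearson_r` and the scores are hypothetical, chosen only for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores.

    Scores that do not vary at all (zero variance) are not handled here.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores of six people. In a test-retest design both lists come
# from test X; with parallel or alternative forms the second list would come
# from the equivalent test instead.
first_application = [12, 15, 9, 20, 17, 11]
second_application = [13, 14, 10, 19, 18, 10]

reliability = pearson_r(first_application, second_application)
print(f"reliability coefficient: {reliability:.2f}")
```

      The closer the coefficient is to 1, the more similarly the two applications rank the same people.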

      2. One application

      In contrast, the procedures for calculating the reliability coefficient from a single application of the test or measuring instrument fall into two subgroups: the two halves and the covariance between items. Let’s look at each in more detail:

      2.1. Two halves

      In this case, the test is simply divided into two halves. In this section we find three types of procedures (ways of dividing the test):

      • Parallel forms: the Spearman-Brown formula is applied.
      • Equivalent (tau-equivalent) forms: the Rulon or the Guttman-Flanagan formula is applied.
      • Congeneric forms: Raju’s formula is applied.
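      As a rough sketch of the two-halves approach, the code below splits a test into odd- and even-positioned items and applies the Spearman-Brown, Rulon and Guttman-Flanagan formulas in their usual textbook form. Raju’s formula is not covered. The odd/even split, the function names and the response matrix are assumptions made for this example only.

```python
from math import sqrt
from statistics import mean, pvariance


def split_halves(item_scores):
    """Sum the odd- and the even-positioned items for each person."""
    half_a = [sum(person[0::2]) for person in item_scores]
    half_b = [sum(person[1::2]) for person in item_scores]
    return half_a, half_b


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mean_x, mean_y = mean(x), mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_brown(half_a, half_b):
    """Step the half-test correlation up to full test length."""
    r = pearson_r(half_a, half_b)
    return 2 * r / (1 + r)


def rulon(half_a, half_b):
    """1 minus the variance of the half differences over the total variance."""
    diffs = [a - b for a, b in zip(half_a, half_b)]
    totals = [a + b for a, b in zip(half_a, half_b)]
    return 1 - pvariance(diffs) / pvariance(totals)


def guttman_flanagan(half_a, half_b):
    """2 * (1 - sum of the half variances over the total variance)."""
    totals = [a + b for a, b in zip(half_a, half_b)]
    return 2 * (1 - (pvariance(half_a) + pvariance(half_b)) / pvariance(totals))


# Hypothetical responses: 6 people (rows) x 4 dichotomous items (columns).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]

half_a, half_b = split_halves(responses)
print(f"Spearman-Brown:   {spearman_brown(half_a, half_b):.3f}")
print(f"Rulon:            {rulon(half_a, half_b):.3f}")
print(f"Guttman-Flanagan: {guttman_flanagan(half_a, half_b):.3f}")
```

      The Rulon and Guttman-Flanagan formulas are algebraically equivalent, so they return the same value for any given split; Spearman-Brown matches them only when the two halves have equal variances.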

      2.2. Covariance between items

      The covariance between items involves analyzing the relationship between all the items of the test. Within this subgroup we also find three psychometric methods or formulas:

      • Cronbach’s alpha coefficient: its value usually ranges from 0 to 1.
      • Kuder-Richardson (KR20): applies when the items are dichotomous (i.e., when they can take only two values).
      • Guttman’s coefficient.
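      A minimal sketch of the first two coefficients, using their standard formulas: alpha compares the sum of the item variances with the variance of the total scores, and KR20 replaces each item variance with p·q (the proportion of people answering 1 times the proportion answering 0). The function names and the response matrix are hypothetical, and population variances are assumed throughout.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a people x items matrix of scores."""
    k = len(item_scores[0])                      # number of items
    columns = list(zip(*item_scores))            # one tuple per item
    item_variance_sum = sum(pvariance(col) for col in columns)
    total_variance = pvariance([sum(person) for person in item_scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


def kr20(item_scores):
    """Kuder-Richardson 20 for a people x items matrix of 0/1 scores."""
    k = len(item_scores[0])
    n = len(item_scores)                         # number of people
    pq_sum = 0.0
    for col in zip(*item_scores):
        p = sum(col) / n                         # proportion scoring 1
        pq_sum += p * (1 - p)
    total_variance = pvariance([sum(person) for person in item_scores])
    return k / (k - 1) * (1 - pq_sum / total_variance)


# Hypothetical responses: 6 people (rows) x 4 dichotomous items (columns).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]

print(f"Cronbach's alpha: {cronbach_alpha(responses):.3f}")
print(f"KR20:             {kr20(responses):.3f}")
```

      With dichotomous items the two functions return the same value, because the population variance of a 0/1 item is exactly p·q. KR20 can be seen as alpha specialized to that case.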

      3. Other methods

      Beyond the procedures involving one or two applications of the test, there are other methods for calculating the reliability coefficient, such as inter-rater reliability (which measures the consistency between the scores that different raters assign to the same test), Hoyt’s method, etc.
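      Hoyt’s method estimates reliability from a two-way analysis of variance (people by items): the coefficient is 1 minus the ratio of the residual mean square to the between-people mean square. The sketch below follows that textbook formulation. The function name `hoyt_reliability` and the data are hypothetical.

```python
def hoyt_reliability(item_scores):
    """Hoyt's ANOVA-based reliability for a people x items score matrix."""
    n = len(item_scores)          # number of people
    k = len(item_scores[0])       # number of items
    grand_mean = sum(sum(person) for person in item_scores) / (n * k)

    # Total sum of squares around the grand mean.
    ss_total = sum((x - grand_mean) ** 2
                   for person in item_scores for x in person)
    # Sum of squares between people (row means).
    ss_people = k * sum((sum(person) / k - grand_mean) ** 2
                        for person in item_scores)
    # Sum of squares between items (column means).
    ss_items = n * sum((sum(col) / n - grand_mean) ** 2
                       for col in zip(*item_scores))
    # What is left is the people x items residual.
    ss_residual = ss_total - ss_people - ss_items

    ms_people = ss_people / (n - 1)
    ms_residual = ss_residual / ((n - 1) * (k - 1))
    return 1 - ms_residual / ms_people


# Hypothetical responses: 6 people (rows) x 4 dichotomous items (columns).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]

print(f"Hoyt reliability: {hoyt_reliability(responses):.3f}")
```

      Computed this way, Hoyt’s coefficient gives the same value as Cronbach’s alpha for the same data; the two differ in how the calculation is organized, not in the result.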

