What is the genetic code and how does it work?

Whatever morphological diversity living beings display, we are all united under one umbrella: our basic functional unit is the cell. If a living being consists of a single cell on which its entire morphological structure rests, it is said to be unicellular (as with protozoa or bacteria), while those of us made up of many cells (from a few hundred to many trillions) are multicellular beings.

Thus, every organism is built from cells, which is why some molecular entities such as viruses are not considered strictly “living” from a biological point of view. In turn, one study has estimated that a single cell (a yeast cell, in the study in question) contains around 42 million protein molecules. It is therefore not surprising that proteins are estimated to make up about 50% of the dry weight of living tissue.

Why are we providing all this seemingly unrelated data? Today we set out to unravel the secret of life: the genetic code. As mysterious as it may seem at first glance, we assure you that you will grasp the concept quickly. It all comes down to cells, proteins and DNA. Stay tuned.

    What is the genetic code?

    Let’s start clearly and concisely: the genetic code is nothing more than the set of instructions that tells the cell how to make a specific protein. We have already said that proteins are the essential structural unit of living tissues, so this is no trivial matter: without proteins there is no life. It is that simple.

    The characteristics of the genetic code were established in 1961 by Francis Crick, Sydney Brenner and other collaborating molecular biologists. The concept rests on a number of premises, but first we need to clarify a few terms in order to understand them:

    • DNA: nucleic acid containing the genetic instructions used in the development and function of all existing living organisms.
    • RNA: nucleic acid that performs several functions, including directing the intermediate steps in protein synthesis.
    • Nucleotides: the organic molecules that together give rise to the DNA and RNA chains of living things.
    • Codon or triplet: every 3 consecutive nucleotides of RNA form a codon, i.e. a triplet of genetic information.
    • Amino acid: organic molecules which, in a certain order, give rise to proteins. 20 amino acids are encoded in the genetic code.

    The basics of the genetic code

    Once we’re clear on these basic terms it’s time to explore the main characteristics of the genetic code, established by Crick and his colleagues. These are:

    • The code is organized in triplets or codons: each group of three nucleotides (a codon or triplet) codes for one amino acid.
    • The genetic code is degenerate: there are more triplets or codons than amino acids. This means that an amino acid is usually encoded by more than one triplet.
    • The genetic code does not overlap: a nucleotide belongs to only one triplet. In other words, a specific nucleotide is not in two codons at the same time.
    • The code is read “without commas”: to avoid overly complex terminology, we will simply say that there are no gaps or separators between codons.
    • The nuclear genetic code is (nearly) universal: with very few known exceptions, the same triplet codes for the same amino acid in different species.
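The triplet, non-overlapping and “comma-free” properties listed above can be illustrated with a short Python sketch. The function name `split_into_codons` and the example sequence are hypothetical, chosen only for illustration, and the sketch assumes reading starts at the first letter of the string.

```python
def split_into_codons(rna: str) -> list[str]:
    """Split an RNA string into consecutive, non-overlapping triplets.

    Reading starts at the first letter (an assumption of this sketch).
    Trailing letters that do not complete a triplet are ignored.
    """
    rna = rna.upper()
    usable = len(rna) - len(rna) % 3  # drop an incomplete final triplet, if any
    return [rna[i:i + 3] for i in range(0, usable, 3)]


# Hypothetical example sequence: no separators, each letter used exactly once.
example = "AUGCCUUGGUAA"
print(split_into_codons(example))
```

Because the slices advance three letters at a time, every nucleotide ends up in exactly one codon, and joining the codons back together returns the original string.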

    Unraveling the genetic code

    We now have the terminology and the theoretical pillars in place. It is time to put them into practice. First of all, each nucleotide is named with a letter determined by the nitrogenous base it carries. The nitrogenous bases are adenine (A), cytosine (C), guanine (G), thymine (T) and uracil (U). Adenine, cytosine and guanine occur in both DNA and RNA, while thymine is specific to DNA and uracil to RNA. If you see this, what do you think it means?

    CCT

    CCU

    Time to recall the terms described above. CCT is part of a DNA strand, i.e. 3 nucleotides: one with a cytosine base, another with a cytosine base, and a third with a thymine base. The second sequence, CCU, is a codon, because it is the genetic information of the DNA “transcribed” (hence a uracil where there was previously a thymine) into an RNA strand.

    So we can say that CCU is a codon encoding the amino acid proline. As we said before, the genetic code is degenerate. Thus, proline is also encoded by other codons with different nucleotides: CCC, CCA and CCG. In total, then, proline is encoded by 4 codons or triplets.

    Note that the 4 codons are not all needed together to encode the amino acid: any one of them is valid on its own. In general, the amino acids of the genetic code are encoded by 2, 3, 4 or 6 different codons, with the exception of methionine and tryptophan, which correspond to a single codon each.
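The degeneracy described above can be checked with a small Python sketch. It assumes the standard codon table written in the compact one-letter notation (P = proline, M = methionine, W = tryptophan, and "*" for stop codons, which the article does not discuss). The names `CODON_TABLE` and `synonymous_codons` are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One-letter amino acid symbols for the 64 codons, in UCAG order
# (UUU, UUC, UUA, UUG, UCU, ...). "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 triplets to its amino acid symbol.
CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid: str) -> list[str]:
    """Return every codon that encodes the given one-letter amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


# Number of codons per amino acid, leaving out the stop codons.
codons_per_amino_acid = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

print(synonymous_codons("P"))                        # the four proline codons
print(codons_per_amino_acid["M"], codons_per_amino_acid["W"])
print(sorted(set(codons_per_amino_acid.values())))   # possible codon counts
```

Run as written, this lists CCU, CCC, CCA and CCG for proline, reports a single codon each for methionine and tryptophan, and shows that the only codon counts that occur are 1, 2, 3, 4 and 6.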

      Why so much complexity?

      Let’s do the math. If each codon consisted of a single nucleotide, only 4 different amino acids could be encoded. That would make protein synthesis as we know it impossible, since a typical protein is made up of around 100 to 300 amino acids drawn from a much larger repertoire. Only 20 amino acids are included in the genetic code, but they can be arranged in different orders along the “assembly line” to give rise to the different proteins present in our tissues.

      If, on the other hand, each codon consisted of two nucleotides, the total number of possible combinations would be 16 (4 × 4). We would still fall short of the goal. However, with three nucleotides per codon (as is actually the case), the number of possible combinations rises to 64 (4 × 4 × 4). Since the code covers 20 amino acids, 64 codons are enough to encode each of them and, on top of that, to offer several alternatives in most cases.
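The arithmetic above is simply 4 raised to the codon length. A minimal Python sketch follows; the function names `codon_count` and `min_codon_length` are hypothetical.

```python
def codon_count(length: int, n_bases: int = 4) -> int:
    """Number of distinct codons of a given length over n_bases letters."""
    return n_bases ** length


def min_codon_length(n_amino_acids: int, n_bases: int = 4) -> int:
    """Smallest codon length that gives at least n_amino_acids combinations."""
    length = 1
    while codon_count(length, n_bases) < n_amino_acids:
        length += 1
    return length


for length in (1, 2, 3):
    combos = codon_count(length)
    verdict = "enough" if combos >= 20 else "not enough"
    print(f"{length} nucleotide(s) per codon: {combos} combinations, {verdict} for 20 amino acids")

print("Shortest workable codon length:", min_codon_length(20))
```

With 4 letters and 20 amino acids, three nucleotides is the shortest codon length that works, which matches the reasoning in the text.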

      An applied look

      We are running short of space, and it is genuinely hard to pack so much information into a few lines. Follow the diagram below, because we promise that making sense of all this terminology is much easier than it looks:

      CCT (DNA) → CCU (RNA) → Proline (ribosome)

      This little diagram tells us the following: cellular DNA contains the 3 nucleotides CCT, but it cannot “express” its genetic information directly, because it is kept in the nucleus, separate from the cell’s protein-making machinery. Therefore, the enzyme RNA polymerase is responsible for copying the DNA nucleotides into RNA nucleotides (a process known as transcription), which form messenger RNA.

      We now have the CCU codon in messenger RNA, which travels out of the nucleus through its pores to the cytosol, where the ribosomes are located. In short, the messenger RNA hands this information to the ribosome, which “understands” that it must add the amino acid proline to the amino acid chain built so far in order to give rise to a particular protein.
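The CCT → CCU → proline path can be mimicked in a few lines of Python. This is only a sketch of the diagram and not a model of the real cellular machinery. Like the article, it treats transcription as swapping T for U in the DNA letters. The lookup table is deliberately partial and holds only the four proline codons named in the text. The names `transcribe` and `translate_codon` are hypothetical.

```python
# Partial lookup table: only the four proline codons mentioned in the text.
# A complete table would have 64 entries.
PROLINE_CODONS = {
    "CCU": "Proline",
    "CCC": "Proline",
    "CCA": "Proline",
    "CCG": "Proline",
}


def transcribe(dna: str) -> str:
    """Rewrite a DNA sequence in RNA letters (T becomes U), as in the diagram."""
    return dna.upper().replace("T", "U")


def translate_codon(codon: str) -> str:
    """Look up one codon; anything outside the partial table is reported as unknown."""
    return PROLINE_CODONS.get(codon, "unknown (not in this partial table)")


mrna = transcribe("CCT")
print("CCT (DNA) ->", mrna, "(RNA) ->", translate_codon(mrna))
```

Keeping the two steps as separate functions mirrors the article's split between transcription in the nucleus and translation at the ribosome.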

      As we said above, a protein is made up of around 100 to 300 amino acids. Thus, a protein formed from 300 amino acids will be encoded by a total of 300 triplets (one per amino acid) or, if you prefer, by 900 nucleotides (300 × 3). Now imagine the letters of each of those 900 nucleotides, something like: AAAUCCCCGGUGAUUUAUAAGG (…) It is this arrangement, this string of letters, that is really the genetic code. Easier than it seemed at first, right?


      If you ask a biologist interested in molecular biology about the genetic code, you are likely in for a conversation of 4 to 5 hours. It is truly fascinating to know that the secret of life, as unreal as it may seem, is enclosed in a specific succession of “letters”.

      Thus, the genome of every living being can be written out with these 4 letters. For example, according to the Human Genome Project, the genetic information of our species is made up of about 3 billion base pairs (nucleotides), distributed across the 23 pairs of chromosomes in the nucleus of our cells. Of course, as different as living beings are from one another, we all share a common “language”.

      Bibliographical references:

      • What is the genetic code? genotipia.com. Retrieved from: https://genotipia.com/codigo-genetico/
      • Asimov, I., and de la Font, AM (1982). The genetic code (Sirsi no. I9789688561034). Plaza & Janés.
      • Genetic code, National Human Genome Research Institute (NHGRI). Retrieved from: https://www.genome.gov/es/genetics-glossary/Codigo-genetico
      • Genetic code: characteristics and decryption, Complutense University of Madrid (UCM). Retrieved from: https://www.ucm.es/data/cont/media/www/pag-56185/08-C%C3%B3digo%20Gen%C3%A9tico-caracter%C3%ADsticas%20y%20desciframiento.pdf
      • The genetic code, Khanacademy.org. Retrieved from: https://es.khanacademy.org/science/ap-biology/gene-expression-and-regulation/translation/a/the-genetic-code-discovery-and-properties
      • It’s official: there are 42 million protein molecules in every cell, europapress.com. Retrieved from: https://www.europapress.es/ciencia/laboratorio/noticia-oficial-hay-42-millones-moleculas-proteina-cada-celula-20180117181506.html
      • Lee, TF (1994). The Human Genome Project: Breaking the Genetic Code of Life (Sirsi no. i9788474325072).
