1. Explore: From Genes to Complexity
What is the Human Genome?
The human genome is the complete set of genetic instructions encoded in DNA. DNA is composed of four types of nucleotides: adenine (A), thymine (T), cytosine (C), and guanine (G). These nucleotides are arranged in a specific sequence of more than three billion base pairs. Together, they store the information required to direct all biological processes in the human body, including growth, development, metabolism, and response to the environment (Lander et al., 2001).
Deoxyribonucleic acid (DNA) is organized into structures called chromosomes, which are located in the nucleus of each cell. Each chromosome contains many genes. A gene is a defined sequence of nucleotides that usually encodes a functional product, most commonly a protein. Proteins perform a wide range of essential functions in cells, such as building cellular structures, catalyzing biochemical reactions, and transmitting signals between cells.
The completion of the Human Genome Project in 2001 provided the first comprehensive overview of human genetic information. One of its most unexpected findings was that the human genome contains approximately 21,000 protein-coding genes. This number is surprisingly similar to that of much simpler organisms, such as the nematode Caenorhabditis elegans (Claverie, 2001). This discovery challenged the long-standing assumption that organismal complexity is primarily determined by the number of protein-coding genes.
As a result, scientists began to investigate other sources of biological complexity within the genome. Attention shifted toward the large proportion of DNA that does not encode proteins. These regions, referred to as noncoding DNA, make up the majority of the human genome. Although they do not directly produce proteins, they play essential roles in regulating gene expression and maintaining genome structure (ENCODE Project Consortium, 2007).
Noncoding regions include a wide range of functional elements. Among these are promoters, which mark the starting point of transcription; enhancers, which increase the level of gene expression; silencers, which repress transcription; and insulators, which control interactions between regulatory elements and genes. In addition, the genome contains various types of noncoding RNA molecules, such as long noncoding RNAs (lncRNAs) and microRNAs (miRNAs). These molecules influence gene regulation by affecting chromatin structure, RNA stability, and translation.
Together, protein-coding genes, regulatory DNA elements, and noncoding RNAs form a highly coordinated regulatory system. This system allows the same genome to generate a wide diversity of cell types, such as neurons, muscle cells, and immune cells, each with distinct functions. It also enables cells to respond to developmental signals and environmental changes. Understanding this regulatory complexity is fundamental to explaining human development, adaptation, and disease.