Junk DNA: A Journey Through the Dark Matter of the Genome
Authors: Nessa Carey
The problem is that our genome is constantly bombarded by potentially damaging stimuli in our environment. We sometimes think of this as a modern phenomenon, especially when we consider radiation from disasters such as those at the Chernobyl or Fukushima nuclear plants. But in reality this has been an issue throughout human existence. From ultraviolet radiation in sunlight to carcinogens in food, or emission of radon gas from granite rocks, we have always been assailed by potential threats to our genomic integrity. Sometimes these don’t matter that much. If ultraviolet radiation causes a mutation in a skin cell, and the mutation results in the death of that cell, it’s not a big deal. We have lots of skin cells; they die and are replaced all the time, and the loss of one extra is not a problem.
But if the mutation causes a cell to survive better than its neighbours, that’s a step towards the development of potential cancer, and the consequences of that can be a very big deal indeed. For example, over 75,000 new cases of melanoma are diagnosed every year in the United States, and there are nearly 10,000 deaths per year from the condition.
14
Excessive exposure to ultraviolet radiation is a major risk factor. In evolutionary terms, mutations would be even worse if they occurred in eggs or sperm, as they may be passed on to offspring.
If we think of our genome as constantly under assault, the insulation theory of junk DNA has definite attractions. If only one in 50 of our base pairs is important for protein sequence because the other 49 base pairs are simply junk, then there’s only a one in 50 chance that a damaging stimulus that hits a DNA molecule will actually strike an important region.
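The one-in-50 figure can be checked with a minimal sketch in Python. It assumes, purely for illustration, that each damaging hit lands on a base pair chosen uniformly at random and independently of every other hit; real damage is unlikely to be spread so evenly. The function name `simulate_hits` and the number of hits are invented for this example.

```python
import random

CODING_FRACTION = 1 / 50  # from the text: one base pair in 50 matters for protein sequence


def simulate_hits(n_hits, coding_fraction=CODING_FRACTION, seed=0):
    """Return the fraction of random hits that land in a coding region.

    Assumption: hits are uniform and independent, so each one strikes
    a coding base pair with probability `coding_fraction`.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    coding_hits = sum(1 for _ in range(n_hits) if rng.random() < coding_fraction)
    return coding_hits / n_hits


observed = simulate_hits(100_000)
print(f"expected {CODING_FRACTION:.3f}, simulated {observed:.3f}")
```

Under these assumptions the simulated fraction should settle close to 0.02, which is the insulation argument in numerical form: most hits fall on the 49 base pairs that do not code for protein.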
It’s also consistent with why the human genome contains so much junk DNA compared with the relatively tiny amounts present in less complex species such as the worm and yeast, as we saw in Figure 3.1. Worms and yeast have short life cycles, and can produce large numbers of offspring. The cost–benefit equation for them is different from that of a species such as humans, who take a long time to reproduce and only have small numbers of offspring. For worms and yeast there probably isn’t much point putting a large amount of effort into protecting the protein-coding genes so extensively. Even if a few of their offspring carry mutations that make them less fit for their environment, the majority will probably be OK. But if you get very few shots at passing your genetic material on to the next generation, protecting those important protein-coding genes makes good evolutionary sense.
Nature, as we have seen, is nothing if not adaptive, and so even though the insulation theory makes good sense, it raises another couple of questions. Is insulation the only role of junk DNA? And where did all this insulating material come from in the first place?
4. Outstaying an Invitation
Every British schoolchild knows the date 1066. It’s the year that William the Conqueror and his troops from Normandy in what is modern-day France invaded England. This wasn’t some temporary raiding party. The invaders stayed, brought their families over and expanded in numbers and influence. They ultimately assimilated, becoming an integrated part of the English political, cultural, social and linguistic landscape.
Every American schoolchild knows the date 1620. It’s the year that the Mayflower anchored at Cape Cod, triggering the great wave of European migration and settlement to North America. Like the Normans in Britain over 500 years before them, these early settlers expanded in numbers rapidly, altering the landscape forever.
A similar event happened in the human genome many millennia ago. It was invaded by foreign DNA elements, which then multiplied hugely in number, finally becoming stable integral parts of our genetic heritage. These foreign elements act as a kind of fossil record in our genome, which can be compared with the records from other species. But they also can affect the function of our protein-coding genes, influencing health and disease.
Although they can affect expression of protein-coding genes, these foreign elements don’t code for proteins themselves. This makes them an example of junk DNA.
When the draft human genome sequence was released, it was astonishing to realise just how widely these genetic interlopers have spread through our DNA.
1
Over 40 per cent of the human genome is composed of these parasitic elements. They are called interspersed repetitive elements, and there are four main classes.
a
As their name suggests, they are DNA stretches in which particular sequences are repeated. The sheer numbers are extraordinary. There are over 4 million of these interspersed repetitive elements in the human genome. One class alone is present 850,000 times throughout the genome and constitutes over 20 per cent of our DNA.
Most of these sequences found ways in the past of increasing their numbers within the genome. Often they mimicked the action of certain types of viruses, similar to the virus that causes AIDS. The basics of this are shown in Figure 4.1. It provides a mechanism whereby a cellular sequence can be copied over and over again and reinserted back into the genome. This creates an amplification cycle that results in the repetitive sequences increasing in number faster than the rest of the genome.
Figure 4.1
A single DNA element is copied to create multiple RNA copies. In a relatively unusual process, these multiple RNA molecules can be copied back into DNA and reinserted into the genome. This amplifies the number of these elements. This may have happened multiple times in early evolution, but just one round is shown here for clarity.
In many ways, the repeats have undergone the equivalent of copy-and-paste in the genome. This is what has allowed them to spread all over our chromosomes.
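The copy-and-paste cycle can be sketched as a toy calculation. The assumption here, which is not taken from any measurement, is that every existing element gives rise to a fixed number of new inserted copies in each round, and that no copies are lost. The name `amplify` and its parameters are hypothetical.

```python
def amplify(n_elements, copies_per_round, rounds):
    """Return the element count after each round of copy-and-paste.

    Assumption: each existing element is copied into RNA, and
    `copies_per_round` of those RNA copies are converted back into
    DNA and reinserted. The original element stays where it was.
    """
    counts = [n_elements]
    for _ in range(rounds):
        n_elements += n_elements * copies_per_round  # originals plus new insertions
        counts.append(n_elements)
    return counts


counts = amplify(n_elements=1, copies_per_round=3, rounds=4)
print(counts)
```

Because the originals stay in place while the copies land elsewhere, the count grows multiplicatively from round to round, which is how a single element could in principle give rise to a very large family of repeats.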
As a consequence of these amplifications, we carry enormous numbers of these elements in our genome. The question is whether or not this actually matters. Do these sequences have any effect, or are they just passengers in the genome, with neither positive nor negative impacts?
There are various ways in which we can consider this question. Most of the repeats are very old in evolutionary terms. Comparisons with other animals show that the majority of the repeats arose before placental mammals separated from other animal lineages, over 125 million years ago. For at least one of the classes of repeats, we haven’t developed any new insertions since we separated from the Old World monkeys about 25 million years ago. So there seems to have been a huge expansion in repeats in the human genome in our distant past. After that, the numbers didn’t increase significantly, which might suggest that there is an upper limit to the number of these repeats we can tolerate. But they also seem to be cleared out of the genome very slowly, which in turn suggests that as long as the number of repeats is below this limit, we can put up with them.
And yet there does seem to be some difference in the ways that the human genome copes with such repeats, compared with other species. Mammals in general seem to have a more diverse range of certain repeats than other species. But in mammals, these are based on very ancient sequences that have stuck around for a long time. In other organisms, the old repeats have been cleared out to some extent, and newer ones have taken their places. The authors of the draft human genome sequence calculated that in the fruit fly, a non-functional DNA element has a half-life of about 12 million years. In mammals, the half-life is about 800 million years.
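To see what the two half-lives imply, here is a rough sketch that treats the clearing of non-functional elements as simple exponential decay. That model is an assumption made for illustration, as are the function name `fraction_remaining` and the 100-million-year time span; only the two half-life figures come from the text.

```python
def fraction_remaining(years, half_life_years):
    """Fraction of non-functional elements still present after `years`,
    assuming simple exponential decay with the given half-life."""
    return 0.5 ** (years / half_life_years)


FLY_HALF_LIFE = 12e6      # years, figure quoted in the text
MAMMAL_HALF_LIFE = 800e6  # years, figure quoted in the text

elapsed = 100e6  # an arbitrary illustrative time span, in years
fly = fraction_remaining(elapsed, FLY_HALF_LIFE)
mammal = fraction_remaining(elapsed, MAMMAL_HALF_LIFE)
print(f"fruit fly: {fly:.1%} remaining, mammal: {mammal:.1%} remaining")
```

On this simplified model, a fruit fly would have cleared nearly all of its old non-functional elements over such a span, while a mammal would still be carrying the great majority of them.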
But even among mammals, humans seem to be unusual. Repeat elements have been decreasing in number in the hominid lineage since the expansion in the number of mammalian species. This hasn’t happened in rodents. The majority of the repeats in the human genome also no longer undergo copy-and-paste. Essentially, the repeats are more active in rodents than in primates.
Perhaps as a consequence, repeats are a bigger cause of problems in rodents than in humans. If repeats replicate in the genome, they may insert into or near functional protein-coding genes and interfere with their normal roles. In some cases they may prevent the correct protein from being expressed. In others, they may drive increased expression of the protein. In mice, insertion of repeats into novel regions of the genome is 60 times more likely to be the cause of a new genetic condition than is the case in human cells. In mice, these account for 10 per cent of all new genetic mutations, whereas the figure is one in 600 for humans. We seem to have our genomes under tighter control than our rodent cousins.
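The factor of 60 follows directly from the two proportions quoted above, as this short check shows. The variable names are invented for the example, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Share of new genetic mutations attributed to repeat insertions, as quoted in the text
mouse_share = Fraction(10, 100)  # 10 per cent in mice
human_share = Fraction(1, 600)   # one in 600 in humans

ratio = mouse_share / human_share
print(f"Repeat insertions are {ratio} times more prominent in mice than in humans")
```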
Dangerous repetition
Perhaps this is just as well, when we look at some of the consequences of this kind of mutation mechanism in rodents. There’s a mouse strain in which such a mutation results in no tail. This in itself might not be too problematical, but the kidneys also fail to develop, and that’s a very bad thing indeed.
2
This is because the insertion leads to over-expression of a nearby gene. In a different strain, the insertion switches off an important gene in the central nervous system. This results in mice that spasm if they are handled, and have a lifespan of just two weeks.
3
We can also draw a similar conclusion about the potential impact of such repeats from the opposite phenomenon, i.e. by looking at regions of the genome where these repeats hardly ever occur.
There is a group of genes called the HOX cluster, which is very important in driving the correct development of complex cellular organisms. The genes in the cluster are switched on in a specific order during development, and expressed at highly regulated levels. If anything goes wrong with this order, the effects can be very profound. The importance of the HOX cluster was first shown in fruit flies. Flies with mutations in these genes developed some extraordinary characteristics. In the most famous example, the flies didn’t have antennae on their heads. Instead, their heads had a pair of legs on them.
4
Just like flies, mammals also rely on the appropriate expression patterns of HOX genes for the development of the correct body patterns. Mutations at the HOX cluster are rare in humans, probably because these genes are so important. But it has been shown that a mutation in at least one HOX gene results in defects in the ends of the limbs.
5
The HOX cluster is one of the few places in the human genome that is almost completely clear of interspersed repetitive elements. This suggests that even relatively benign genetic interlopers have the potential to affect gene expression, and that there are some regions of the genome where evolution has ensured that they are kept at bay. This repeat-free aspect of the HOX cluster is also found in other primates and in rodents.
The presence of interspersed repeats in the genome can have unexpected consequences. There’s an unusual class of repeats called ERVs. ERV stands for endogenous retrovirus. The human immunodeficiency virus (HIV, the causative agent of AIDS) is an example of a retrovirus. Such viruses are characterised by the genetic material being made of RNA, not DNA. The viral RNA is copied to form DNA, which can then integrate into the host genome. The host treats the DNA like its own, producing new viral components and ultimately new viruses.
Long ago in our evolutionary history, some retroviruses became fully established in our genomes. Many are now genomic fossils. Certain parts of the retroviral sequences have been lost, and so they can never again produce viral particles. But some still contain all the components required to make new viruses. These are normally kept under very tight control by the cell.
6
Scientists have also discovered that the immune system doesn’t just fight off viruses that infect us from the outside world; it also plays a role in keeping these endogenous viruses under control. Genetically engineered mice which lack certain components of the normal immune system suffer problems through the reactivation of these viruses lurking in their own genomes.
7
This control of endogenous retroviruses is a potential issue in one approach to tackling a problematic area of human health. Every year, thousands of people die on waiting lists for organ transplants because there aren’t enough donors. For example, approximately one in three of the people whose lives could potentially be saved by a heart transplant dies while still on the waiting list.
8
One potential way around this would be if we could use hearts from animals as replacement organs. This is known as xenotransplantation (‘xeno’ is derived from the Greek for ‘foreign’). For cardiac transplants, the animal of choice is the pig. Its heart is about the same size and strength as the human organ.
There are a number of technical hurdles to overcome (in addition to ethical issues around the use of pigs that matter to certain religious groups).
9
Some of these are being addressed by the creation of genetically modified pigs that don’t provoke the very aggressive immune response that is a problem when introducing pig cells into the human cardiovascular system. But there may be another issue. The pig genome contains endogenous retroviruses, just as the human genome does. But the ones in pigs are different from the ones in humans. Work at the end of the 20th century showed that some of these pig retroviruses can infect human cells, given the right conditions.
10