Junk DNA: A Journey Through the Dark Matter of the Genome
Authors: Nessa Carey
The most forthright responses were mainly from evolutionary biologists. This wasn’t altogether surprising. Evolution is the biological discipline where emotions tend to run highest. Normally the bullets are targeted at creationists, but the Gatling guns may also be turned on other scientists. Epigeneticists working on the transmission of acquired characteristics from parent to offspring were probably quite relieved that ENCODE took them out of the firing line for a while.[14]
Figure 14.6: Scientists are usually outwardly polite (left-hand statements), but are sometimes just speaking in barely disguised code (right-hand thoughts) …
The angriest critique of ENCODE included the expressions ‘logical fallacy’, ‘absurd conclusion’, ‘playing fast and loose’ and ‘used the wrong definition wrongly’. Just in case we were still in doubt about their direction of travel, the authors concluded their paper with the following damning blast:
The ENCODE results were predicted by one of its lead authors to necessitate the rewriting of textbooks. We agree, many textbooks dealing with marketing, mass media hype, and public relations may well have to be rewritten.[15]
The main criticisms from this counter-blast centred around the definition of function, the way that the ENCODE authors analysed their data, and the conclusions drawn about evolutionary pressures. The first of these applied to the problems we have already described, using our Jackson Pollock and Downton Abbey analogies. In some ways, these problems derive in large part from difficulties in separating mathematics from biology. The ENCODE data sets were predominantly interpreted by the original authors through the use of statistical and mathematical approaches. The sceptics argue that this leads us down a blind alley, because it doesn’t take into account biological relationships, and that these are critically important. They use a very helpful analogy to explain this. The reason the heart is important is that it pumps blood around the body. That’s the biologically important relationship. But if we analysed the actions of the heart just by a mathematically derived map of its interactions, we would draw some ridiculous conclusions. These could include that the heart is present so that it can add weight to the body, and to produce the sound ‘lub-dub’. These are both things that the heart undoubtedly does, but they are not its function. They are just contingent on its genuine role.
The authors criticised the analytical methods because they felt that the ENCODE teams had not been consistent in the way they applied their algorithms. One consequence of this was that effects seen in a large region might weigh down an analysis inappropriately. For example, if a block of 600 base pairs was classified as being functional, when all the work was actually carried out by just ten of them, this would dramatically skew the percentage of the genome that would be designated as having a function.
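The arithmetic behind this objection can be made concrete. The following is a minimal sketch, not the ENCODE pipeline or the critics’ own calculation: the function name and the toy genome size are hypothetical, and only the 600 and ten base-pair figures come from the example above.

```python
def functional_fraction(functional_bp: int, genome_bp: int) -> float:
    """Return the fraction of a genome counted as functional."""
    if genome_bp <= 0:
        raise ValueError("genome_bp must be positive")
    return functional_bp / genome_bp


# Hypothetical toy genome of 6,000 base pairs containing one annotated block.
GENOME_BP = 6_000
BLOCK_BP = 600   # whole block labelled functional (figure from the text)
ACTIVE_BP = 10   # bases actually doing the work (figure from the text)

block_level = functional_fraction(BLOCK_BP, GENOME_BP)
base_level = functional_fraction(ACTIVE_BP, GENOME_BP)

# The overstatement depends only on the block, not on the genome size.
inflation = BLOCK_BP / ACTIVE_BP

print(f"block-level estimate: {block_level:.2%}")
print(f"base-level estimate:  {base_level:.2%}")
print(f"inflation factor:     {inflation:.0f}x")
```

Under these assumptions, counting the whole block overstates the functional share sixty-fold, whatever the size of the genome it sits in.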
The evolutionary argument was that the ENCODE authors ignored the standard model that regions with large amounts of variation are reflective of a lack of evolutionary selection, which in turn means they are relatively unimportant. If you want to overturn such a long-held principle, you need to have very strong grounds for doing so. But the critics claimed that the ENCODE papers, although containing huge amounts of data, had only focused on an inappropriately small number of regions when drawing evolutionary conclusions from the sequences of humans and other primates.
There are interesting scientific arguments on both sides, but it would be disingenuous to believe that the amount of heat and emotion generated by ENCODE has been purely about the science. We can’t ignore other, very human factors. ENCODE was an example of Big Science. These are typically huge collaborations costing millions and millions of dollars. The science budget is not infinite and when funds are used for these Big Science initiatives, there is less money to go around for the smaller, more hypothesis-driven research.
Funding agencies work hard to get the balance right between the two types of research. In many cases, Big Science is funded if it generates a resource that will stimulate a great deal of other science. The original sequencing of the human genome would be a clear example of this, although we should recognise that even that was not without its critics. But with ENCODE the controversy is not around the raw data that were generated, it’s about how those data are interpreted. That makes it different from a pure infrastructure investment in the eyes of the critics.
When all stages and aspects of ENCODE are added up, it cost in the region of a quarter of a billion dollars. The same amount of money could have funded at least 600 average-sized single research grants focusing on investigation of individual hypotheses. Choosing how to distribute funding is a balancing act, and at these levels of funding it is guaranteed to create division and concern.
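As a rough check on those figures, the sketch below divides the quoted total by the quoted number of grants. It assumes that ‘a quarter of a billion’ means exactly 250 million dollars and uses the lower bound of 600 grants; the variable names are illustrative, and the result is an implied ceiling on the average grant size rather than a figure stated in the text.

```python
ENCODE_COST_USD = 250_000_000  # assumption: "a quarter of a billion dollars", taken literally
MIN_GRANTS = 600               # "at least 600" single research grants (from the text)

# With at least 600 grants, the average grant can be no larger than this.
implied_max_avg_grant = ENCODE_COST_USD / MIN_GRANTS

print(f"Implied average grant size: at most ${implied_max_avg_grant:,.0f}")
```

On these assumptions an ‘average-sized’ grant comes out at a little over 400,000 dollars.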
A company called Gartner created a graphic that shows how new technologies are perceived. It is known as the Hype Cycle. At first everyone is very excited – ‘the peak of inflated expectations’. When the new tech fails to transform everything about your life there is a crash leading to the ‘trough of disillusionment’. Eventually, everyone settles down, there is a steady growth in rational understanding and finally a productive plateau is reached.
With something like ENCODE this cycle is extraordinarily compressed, because of the polarisation from the most vocal groups. Those scientists with inflated expectations are operating at exactly the same time as those in the trough. Pretty much everyone else is pragmatic, and will use the data from ENCODE when it is useful to do so. Which is usually when it can help inform a specific question that an individual scientist finds interesting.
Footnote
a. These were typically measures of accessibility to enzymes that can cut DNA molecules, which is a sign of an open structure that may be able to be copied into RNA.
15. Headless Queens, Strange Cats and Portly Mice
The ENCODE consortium identified a daunting abundance of potentially functional elements in the human genome. Given the huge numbers, it’s hard to define a sensible strategy for deciding which candidate regions to experiment on first. But the task may not be quite as difficult as it seems, and that’s because, as always, nature has decided to point the way. In recent years scientists have begun to identify human diseases that are caused by tiny changes to regulatory regions of the genome. Previously, these might have been dismissed as harmless random variations in junk DNA. But we now know that in some cases just a single base-pair change in an apparently irrelevant region of the genome can have a definite effect on an individual. In rare cases, the effect is so severe that life itself is impossible.
We’ll start with a less dramatic example, but one that takes us back about 500 years, to the reign of King Henry VIII in England. Most British schoolchildren are at some point taught a useful rhyme to help them remember what happened to the six wives of this notorious monarch:
Divorced, beheaded, died,
Divorced, beheaded, survived.
(Feel free to send a thank-you email when this handy little ditty helps you in a quiz.)
The first wife to be beheaded was Anne Boleyn, the mother of the future Queen Elizabeth I. After her death, the Tudor spin doctors launched quite a smear campaign and Anne Boleyn’s physical appearance was described in such a way that she sounded like the 16th-century image of a witch. She was characterised as having a projecting tooth, a large mole under her chin and six fingers on her right hand. The story of that extra finger has passed down in folklore, although there is little if any evidence that it was true.[1]
Perhaps one of the reasons that the story has been accepted is that it’s not completely ludicrous. It’s not as if the chroniclers claimed that the former queen had three legs. There are people who are born with an extra finger, although usually they have an extra finger on each hand rather than just one.
There is a protein-coding gene that is very important in the correct development of the hands and feet.[a] The protein acts as a morphogen, meaning that it governs patterns of tissue development. The effects of the protein are very dependent on its concentration, and in the developing embryo there is a gradient effect, where high levels in one region gradually fade away to lower levels in adjacent tissues.
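The idea of a concentration gradient can be illustrated with a toy calculation. This is a sketch under loud assumptions, not a model taken from the research described here: it assumes the morphogen level decays exponentially with distance from its source, and the function names, units and numbers are all hypothetical.

```python
import math


def concentration(x: float, c0: float, decay_length: float) -> float:
    """Morphogen level at distance x from the source (assumed exponential decay)."""
    return c0 * math.exp(-x / decay_length)


def threshold_position(c0: float, threshold: float, decay_length: float) -> float:
    """Distance at which the level falls to `threshold` (0.0 if it never exceeds it)."""
    if c0 <= threshold:
        return 0.0
    return decay_length * math.log(c0 / threshold)


# Hypothetical values in arbitrary units.
C0 = 100.0        # level at the source
DECAY = 50.0      # distance over which the level drops by a factor of e
THRESHOLD = 10.0  # level a tissue needs in order to respond

normal = threshold_position(C0, THRESHOLD, DECAY)
raised = threshold_position(1.2 * C0, THRESHOLD, DECAY)  # 20% more morphogen at the source

print(f"responding zone, normal expression: {normal:.1f}")
print(f"responding zone, raised expression: {raised:.1f}")
```

In this toy model a modest rise in expression at the source pushes the boundary of the responding zone outwards, which is one way to picture how a change in morphogen level could alter a body pattern.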
Mittens and kittens
One of the features controlled by this morphogen is the number of fingers. If the expression levels of the protein are wrong, babies are born with extra fingers. Over ten years ago researchers discovered that some cases of extra fingers were caused by a tiny genetic change. This wasn’t in the morphogen gene, but in a region of junk DNA about a million base pairs away. They identified the change in a huge Dutch family where the presence of extra fingers was clearly inherited as a genetic trait. All 96 affected individuals had a change of just one base in the junk. Instead of a C base, these patients had a G base. None of the relatives with the normal number of digits had a G in this position. Single base changes were also found in other families where some individuals had extra fingers. These changes were in the same general region of the genome as in the Dutch family but 200–300 base pairs away from that alteration.[2]
The junk region that carries these single base changes is an enhancer of the morphogen gene.[b] In order to create the correct body pattern, the spatial and temporal expression of the morphogen is very tightly controlled by a whole slew of regulators. In the people with the mutation and the extra digit, the enhancer activity was slightly abnormal. The impact of the tiny change in this one regulator shows just how important and finely tuned this control is.
Here’s some help with another quiz. What’s the connection between Dutch people who have trouble buying gloves, and one of the great figures of 20th-century American literature? No? Give up? Well, in the 1930s Ernest Hemingway was given a cat by a ship’s captain. Instead of having five toes on its front paws, this cat had six. There are now about 40 descendants of this cat at Hemingway’s home, about half of whom have six toes on their front paws. It’s easy to find pictures of these cats on the internet[3] and they are simultaneously cute and a little bit scary. The extra toe looks like a thumb, rendering the cats slightly too capable-looking for comfort.
The same group that identified the change in the enhancer region in humans with extra fingers showed that the same region was altered in Hemingway’s cats. By inserting the enhancer into another animal’s genome they confirmed that the alteration changed the expression of the morphogen. The experimental animal over-expressed the morphogen and developed an extra digit on each front paw. Rather delightfully, this effect was demonstrated by inserting feline DNA into a murine embryo. A genuine cat-and-mouse game.[4]
Cats with extra front paw toes have also been found in other countries, including the UK. In the British cats there is also a change in the same enhancer, but it’s not exactly the same change. It is two base pairs away from the Hemingway change, in a three-base-pair motif that is very highly conserved in evolution. The enhancer region that is involved in the extra digits on the forelimbs of humans and cats is about 800 base pairs in length and most of it is highly conserved from humans all the way down to fish. This suggests that the control of limb development is a very ancient system.
Morphogens and facial development
The morphogen that is responsible for finger formation is also critical for other developmental processes. One of these is the process whereby the structures of the front of the brain and the face are formed. If this process goes wrong, the effect can be very mild: simply a cleft lip. But at the other extreme, where the morphogen expression is more severely disrupted, the effects can be devastating. The brain and face may be completely abnormal, with no proper formation of brain structures. In the most severe cases the babies are born with just one malformed eye in the middle of the forehead and with severely impaired brain development. The babies never survive.
This spectrum of conditions is known as holoprosencephaly.[5] A number of different protein-coding genes have been shown to be mutated in different families with this condition. Many of these genes are involved in the regulation of the same morphogen that is required for correct digit formation. In some cases, the gene for the morphogen protein itself is mutated. The developing embryo only produces half of the normal amount of the morphogen, because the functional protein is only produced from one chromosome, not two. The abnormalities in the affected individuals show that it is critical that the morphogen levels hit the right thresholds at key points in development.