The new era of health and medicine as an information technology is broader than individual genes
February 4, 2011 by Ray Kurzweil
Is it time to rethink the promise of genomics?
Disappointment has recently been expressed about the progress of genomics. In my view, this results from an overly narrow view of the science of genes and biological information processing in general. It reminds me of the time when the field of “artificial intelligence” (AI) was equated with the methodology of “expert systems.” Anyone who referred to AI was actually referring to expert systems, and there were many articles on how limited this technique was and on all the things that it could not do and never would be able to do.
At the time, I expressed my view that although the expert-systems approach was useful for a certain limited class of problems, it did indeed have restrictions, and that the field of AI was far broader.
The human brain works primarily by recognizing patterns (we have about a billion pattern recognizers in the neocortex, for example), and there were at the time many emerging methods in the field of pattern recognition that were solving real-world problems and that should properly be considered part of the AI field. Today, no one talks much about expert systems, and there is a thriving multi-hundred-billion-dollar AI industry and a consensus in the AI field that nonbiological intelligence will continue to grow in sophistication, flexibility, and diversity.
The same thing is happening here. The problem starts with the word “genomics.” The word sounds like it refers to “all things having to do with genes.” But as practiced, it deals almost exclusively with single genes and their ability to predict traits or conditions, which has always been a narrow concept. The idea of sequencing genes of an individual is even narrower and typically involves individual single-nucleotide polymorphisms (SNPs), which are variations in a single nucleotide (A, T, C or G) within a gene: basically a two-bit alteration.
I have never been overly impressed with this approach and saw it as a first step based on the limitations of early technology. There are some useful SNPs such as Apo E4 but even here it only gives you statistical information on your likelihood of such conditions as Alzheimer’s Disease and macular degeneration based on population analyses. It is certainly not deterministic and has never been thought of that way. As Dr. Venter points out in his Der Spiegel interview, there are hundreds of diseases that can be traced to defects in individual genes, but most of these affect developmental processes. So if you provide a medication that reverses the effect of the faulty gene you still have the result of the developmental process (of, say, the nervous system) that has been going on for many years. You would need to detect and reverse the condition very early, which of course is possible and a line of current investigation.
To put this narrow concept of genomics into perspective, think of genes as analogous to lines of code in a software program. If you examine a software program, you generally cannot assign each line of code to a property of the program. The lines of code work together in a complex way to produce a result. Now it is possible that in some circumstances you may be able to find one line of code that is faulty and improve the program’s performance by fixing that one line or even by removing it. But such an approach would be incidental and accidental; it is not the way that one generally thinks of software. To understand the program you would need to understand the language it is written in and how the various lines interact with each other. In this analogy, a SNP would be comparable to a single letter within a single line (actually a quarter of one letter, to be precise, since a letter is usually represented by 8 bits and a nucleotide by 2 bits). You might be able to find a particularly critical letter in a software program, but again that is not a well-motivated approach.
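The bit arithmetic behind that comparison can be made concrete with a small sketch. It assumes an arbitrary 2-bit code for the four nucleotides and an 8-bit character; the mapping, the function names, and the sample sequences are illustrative assumptions, not anything taken from a sequencing tool.

```python
# Hypothetical 2-bit code for the four nucleotides (the mapping is arbitrary).
NUCLEOTIDE_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}

BITS_PER_NUCLEOTIDE = 2
BITS_PER_LETTER = 8  # assumes one 8-bit character per letter of source code


def pack(sequence: str) -> int:
    """Pack a nucleotide string into an integer, 2 bits per base."""
    value = 0
    for base in sequence:
        value = (value << BITS_PER_NUCLEOTIDE) | NUCLEOTIDE_BITS[base]
    return value


def bits_changed(a: str, b: str) -> int:
    """Count the bits that differ between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return bin(pack(a) ^ pack(b)).count("1")


# Made-up sequences that differ by a single nucleotide (T -> C), like a SNP.
reference = "GATTACA"
variant = "GATCACA"

# A nucleotide carries a quarter of the information of an 8-bit letter.
fraction_of_letter = BITS_PER_NUCLEOTIDE / BITS_PER_LETTER

print(bits_changed(reference, variant), fraction_of_letter)
```

Under this encoding a single-nucleotide change can flip at most two bits, which is the sense in which a SNP is a "two-bit alteration" and a quarter of one letter.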
The collection of the human genome was indeed an exponential process, with the amount of genetic data doubling each year and the cost of sequencing coming down by half each year. But its completion around 2003 was just the beginning of another even more daunting process, which is to understand it. The language is the three-dimensional properties and interaction of proteins. We started with individual genes as a reasonable place to start, but that was always going to be inherently limited if you consider my analogy above to the role of single lines in a software program.
The structure of DNA. (Image: The U.S. National Library of Medicine)
As we consider the genome, the first thing we notice is that only about 3 percent of the human genome codes for proteins. With about 23,000 genes, there are over 23,000 proteins (as some portions of genes also produce proteins) and, of course, these proteins interact with each other in complicated pathways.
A trait in a complex organism such as a human being is actually an emergent property of this complex and organized collection of proteins. The 97 percent of the genome that does not code for proteins was originally called “junk DNA.”
We now understand that this portion of the genome has an important role in controlling and influencing gene expression. It is the case that there is less information in these non-coding regions and they are replete with redundancies that we do not see in the coding regions.
For example, one lengthy sequence called ALU is repeated hundreds of thousands of times. Gene expression is a vital aspect of understanding these genetic processes. The noncoding DNA plays an important role in this, but so do environmental factors. Even ignoring the concept that genes work in networks, not as individual entities, genes have never been thought of as deterministic.
The “nature versus nurture” discussion goes back eons. What our genetic heritage describes (and by genetic heritage I include the epigenetic information that influences gene expression) is an entity (a human being) that is capable of evolving in and adapting to a complex environment. Our brain, for example, only becomes capable of intelligent decision making through its constant adaptation to and learning from its environment.
To reverse-engineer biology we need to examine phenomena at different levels, especially looking at the role that proteins (which are coded for in the genome) play in biological processes. In understanding the brain, for example, there is indeed exponential progress being made in simulating neurons, neural clusters, and entire regions. This work includes understanding the “wiring” of the brain (which incidentally includes massive redundancy) and how the modules in the brain (which involve multiple neuron types) process information. Then we can link these processes to biochemical pathways, which ultimately link back to genetic information. But in the process of reverse-engineering the brain, genetic information is only one source, and not the most important one at that.
So genes are one level of understanding biology as an information process, but there are other levels as well, and some of these other levels (such as actual biochemical pathways, or mechanisms in organs including the brain) are more accessible than genetic information. In any event, just examining individual genes, let alone SNPs, is like looking through a very tiny keyhole.
As another example of why the idea of examining individual genes is far from sufficient, I am currently involved with a cancer stem cell project with MIT scientists Dr. William Thilly and Dr. Elena Gostjeva. What we have found is that mutations in certain stem cells early in life will turn that stem cell into a cancer stem cell which in turn will reproduce and ultimately seed a cancer tumor. It can take years and often decades for the tumor to become clinically evident. But you won’t find these mutations in a blood test because they are mutations originally in a single cell (which then reproduces to create nearby cells), not in all of your cells. However, understanding the genetic mutations is helping us to understand the process of metastasis, which we hope will lead to treatments that can inhibit the formation of new tumors. This is properly part of gene science but is not considered part of the narrow concept of “genomics,” as that term is understood.
Indeed there is a burgeoning field of stem cell treatments using adult stem cells in the positive sense of regenerating needed tissues. This is certainly a positive and clinically relevant result of the overall science and technology of genes.
If we consider the science and technology of genes and information processing in biology in its proper broad context, there are many exciting developments that have current or near term clinical implications, and enormous promise going forward.
A few years ago, Joslin Diabetes Center researchers showed that inhibiting a particular gene (which they called the fat insulin receptor gene) in the fat cells (but not the muscle cells, as that would negatively affect muscles) produced the benefits of caloric restriction without the restriction. The test animals ate ravenously and remained slim. They did not get diabetes or heart disease and lived 20 percent longer, getting most of the benefit of caloric restriction. This research is continuing, now focusing on doing the same thing in humans, and the researchers, with whom I spoke recently, are optimistic.
A new technology that can turn genes off has emerged since the completion of the human genome project (and has already been recognized with the Nobel Prize): RNA interference (RNAi). There are hundreds of drugs and other processes in the development and testing pipeline using this methodology. As I said above, human characteristics, including disease, result from the interplay of multiple genes. There are often individual genes which, if inhibited, can have a significant therapeutic effect (much as we might disable a rogue software program by overwriting one line of code or one machine instruction).
There are also new methods of adding genes. I am an advisor (and board member) to United Therapeutics, which has developed a method to take lung cells out of the body, add a new gene in vitro (so that the immune system is not triggered — which was a downside of the old methods of gene therapy), inspect the new cell, and replicate it several million fold. You now have millions of cells with your DNA but with a new gene that was not there before. These are injected back into the body and end up lodged in the lungs. This has cured a fatal disease (pulmonary hypertension) in animal trials and is now undergoing human testing. There are also hundreds of such projects using this and other new forms of gene therapy.
As we understand the network of genes that are responsible for human conditions, especially reversible diseases, we will have the means of changing multiple genes, and turning some off or inhibiting them, turning others on or amplifying them. Some of these approaches are entering human trials. More complex approaches involving multiple genes will require greater understanding of gene networks but that is coming.
There is a new wave of drugs entering trials, some of them late-stage trials, that are based on genetic findings. For example, an experimental drug, PLX4032 from Roche, is designed to attack tumor cells with a mutation in a particular gene called BRAF. Among patients with advanced melanoma who carry this genetic variant, 81 percent had their tumors shrink (rather than grow), which is an impressive result for a form of cancer that is generally resistant to conventional treatment.
There is the whole area of regenerative medicine from stem cells. Some of this is now being done from adult autologous stem cells. Particularly exciting is the recent breakthrough in induced pluripotent stem cells (IPSCs). This involves using in-vitro genetic engineering to add genes to normal adult cells (such as skin cells) to convert them into the equivalent of embryonic stem cells which can subsequently be converted into any type of cell (with your own DNA). IPSCs have been shown to be pluripotent, to have efficacy, and to not trigger the immune system because they are genetically identical. IPSCs offer the potential to repair essentially any organ from hearts to the liver and pancreas. These methods are part of genetic engineering which in turn is part of gene science and technology.
And then of course there is the entire new field of synthetic biology which is based on synthetic genomes. A major enabling breakthrough was recently announced by Craig Venter’s company in which an organism with a synthetic genome (which previously existed only as a computer file) was created. This field is based on entire genomes not just individual genes and it is certainly part of the broad field of gene science and technology. The goal is to create organisms that can do useful work such as produce vaccines and other medicines, biofuels and other valuable industrial substances.
You could write a book (or many books) about all of the advances that are being made in which knowledge of genetic processes and other biological information processes play a critical role. Health and medicine used to be entirely hit or miss without any concept of how biology worked on an information level. Our knowledge is still very incomplete, but our knowledge of these processes is growing exponentially and that is feeding into medical research which is already bearing fruit. To focus just on the narrow concepts that were originally associated with “genomics” is as limited a view as the old idea of AI being just expert systems.