Determined but not complacent, grounded but hopeful, Dr. Leonid Peshkin is one of the scientists working on understanding aging so that it may one day be treated like we treat any other ailment.
As he revealed in an interview with the Boston Globe in mid-2018, the idea of having to lose oneself and one’s loved ones to aging never made any sense to him. Ever since he was a child, he has been preoccupied with aging and the fear that it might take away his father, who was almost 60 when Leon was 10 and who, sadly, passed away in July 2018 at the age of 96.
Dr. Peshkin, a 48-year-old from Moscow, Russia, possesses a master’s degree in applied mathematics and a Ph.D. in machine learning. He currently works at the Systems Biology Department at Harvard Medical School; his primary interests are embryology, evolution, and aging, which he has studied for over a decade.
In this interview with Lifespan.io, he makes no mystery of his wish for aging to be defeated as soon as possible, but unlike other scientists in the field, who tend to prefer specific theories of aging, he thinks that our understanding isn’t yet deep enough to take sides, and in this, he is somewhat reminiscent of João Pedro de Magalhães.
As he makes clear, Leon is not particularly fond of the overly optimistic attitude of some people in the longevity community and some singularitarians, as he finds that complacency may well hinder progress in a battle that, at this point, is far from won. He began as follows:
Thank you for the opportunity to share my views. I must mention that my Ph.D. dissertation was done in the field of artificial intelligence, and for the past 12 years, I have worked at the Systems Biology department at Harvard Medical School, yet I am a relative novice in the field of aging biology. As a way of introduction, I’d like to offer a caricature of the currently popular sensationalist view in the field of aging:
“We are the chosen generation. Singularity is near. Rejuvenation therapy is almost here. Not one, but several à la carte options: stem cells, factors from young blood, senolytics, Skulachev’s ions, NAD, etc. Companies backed by luminaries from business and science are already sorting out the remaining details, helped by the formidable force of the AI technology called ‘deep learning’.”
This fairy tale is beautiful, and deep in my heart, I hope I am mistaken, but I think that at the moment, this positive mysticism is not justified and is rather counterproductive. The excessive optimism is, unfortunately, standing in the way of progress, as I will try to explain.
There are many proposed models of aging, such as the Hallmarks of Aging, SENS, and the deleteriome. Which, if any, of these models do you believe reflects the reality of aging?
I would not want to take part in religious wars. People get very passionate and clash about often vaguely defined terms. Which of the observed hallmarks of aging, from the molecular to the organism levels, are correlates and which are causes of aging is hard to say. Biology has not yet matured to become an exact science. Perhaps owing to my training in quantitative science (M.Sc. in Applied Math, Ph.D. in Machine Learning), I take a “model” to mean a level of quantitative understanding that allows for “modeling”; that is, forecasting and answering “what if” questions. Such a model might not ultimately be expressed by a set of crisp, human-readable mathematical formulae but rather by a large set of tuned parameters in an artificial neural net or some other representation that has not yet been invented.
It must, however, provide a way to assess the current state of an organism and predict its lifespan and healthspan in a stable environment, absent major perturbations, and then go further to allow for perturbations and adjust the predictions. Today, I can’t even say that there is agreement in the field on what a useful definition of “aging” is. I like “an increase of the hazard rate (i.e., the probability of dying) with time”, which is admittedly a very mathematical notion — precise but not terribly useful. Inverting this formula, we get a curious metaphor: a life without aging can be imagined as a life where, say, once a year, you undergo a treatment that either rejuvenates you by a year of biological age or, with some small but non-negligible probability, kills you. Life is a game of chance.
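The “increase of hazard rate with time” definition has a standard mathematical form, the Gompertz law, in which mortality risk grows exponentially with age. A minimal sketch follows; the parameter values are hypothetical, chosen only to illustrate the shape of the curve, not fitted to real demographic data:

```python
import math

def gompertz_hazard(age_years, a=1e-4, b=0.085):
    """Gompertz hazard: instantaneous mortality risk rising
    exponentially with age. The parameters a and b here are
    illustrative placeholders, not real demographic estimates."""
    return a * math.exp(b * age_years)

# Under this toy model, the hazard doubles roughly every
# ln(2)/b ~ 8 years, which is the qualitative pattern the
# Gompertz law describes for human adults.
for age in (30, 60, 90):
    print(age, gompertz_hazard(age))
```

A life “without aging”, in this framing, is one where the hazard stays flat instead of climbing: the yearly risk of dying would be the same at 90 as at 30.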
Do you believe that aging is a one-way process or something that is flexible and amenable to intervention?
It is both. I think that it is a one-way process but also flexible and amenable to intervention. Imagine one dramatic intervention: one day, we invent a way to cryo-protect a warm-blooded organism like ours so that it can undergo a freeze-thaw cycle without damage. Now, you are faced with a challenge to design a schedule that determines when, and in what size fractions, you’d like to use up your lifespan. While you are frozen, time stops. While you are alive, you age: the “deleteriome” kicks in, ionizing radiation wrecks your DNA, your defrosted friends and family du jour stress you out, etc. That’s what things would look like ad absurdum, illustrating the tradeoffs. Now, back to the interventions: I imagine a process not unlike a beauty salon, in which you do your nails and hair and get an occasional facelift; all of these are tradeoffs, even if people do not recognize it. Beauty treatments make you look younger at the moment, but cosmetics products may poison your skin and accelerate actual aging.
There is evidence of such tradeoffs across organisms in nature; extending lifespan in many species can be accomplished at the expense of reproduction, and in cold-blooded organisms, you can multiply the lifespan several-fold just by cooling the environment down or slowing down metabolic processes in other ways. I believe that the first results will be not so much in giving people free tickets to longer lives but in making the tradeoffs more explicit, educating people, and putting them in control of decision making. These are never easy choices, not unlike the one that a cancer patient faces when given a choice between a quick and painless death and taking a chance on survival at the cost of painful agony. Life is a game of chance.
What are some of the studies that have convinced you that this is the case?
I did not have to be convinced; it follows from well-known observations. There are closely related species with drastically different lifespans, the germ-line reset that we discuss below, the dependency of lifespan on diet, and other obvious environmental conditions, such as ambient temperature for cold-blooded animals. I was tangentially involved in the effort to obtain the complete genome of the naked mole rat, which lives ten times longer than closely related species.
I currently study aging in daphnia – a small aquatic organism that goes from infancy to frailty in one month. We all know about aging, but this familiarity becomes understanding as you closely watch one organism being created from a single cell, form organs, grow, mature, procreate by itself, then exist for a while in a perfect environment with the right temperature, water quality, nutrients and light, just to decay and fall dead, all in one month. This organism is not hard to relate to, as it has very recognizable parts of common anatomy with us and lots of recognizable cell types. Yes, it is an invertebrate, but the muscle, heart, blood, primitive immune system, gut and auxiliary digestive system, eyes, and other sensory organs made of sensory and control neurons are easily observed using an ordinary dissection microscope. We are looking for signs of aging that are similar to those known in people, such as formation of cataracts, changes in bone density, hardening of blood vessels, etc. With such a short lifespan, we are able to manipulate the conditions with a variety of perturbations in order to hopefully build a detailed understanding and a causal picture of aging.
Eventually, we will even build a “model” in a rigorous sense of the word. We and others already see that calorie intake has a strong effect on the lifespan of daphnia; yet, even here, we do not have a clear, causal picture. Our approach to teasing out causality builds upon machine learning and was developed and published with a focus on cancer and metastasis, specifically cell motility. Roughly, the idea is to use broad-specificity drugs (polypharmacology) with well-characterized molecular targets as “twenty questions” in order to systematically explore which pathways do and do not affect a phenotype, which is lifespan in our case. Unlike with a drug screen, we do not expect to find a magic pill out of thousands or hundreds of thousands of candidates; rather, we use dozens of specifically selected drugs and respective doses to get a complex system to reveal its modularity to us by targeting modules in a disjointed fashion.
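The “twenty questions” idea described above — using broad-specificity drugs with known targets to probe which pathways drive a phenotype — amounts to a regularized regression from drug–pathway annotations to phenotype readouts. Here is a minimal sketch of that logic; the annotation matrix, lifespan numbers, and pathway names are all invented for illustration, and plain ridge regression stands in for the published method:

```python
import numpy as np

# Hypothetical annotation matrix: rows = drugs, columns = pathways.
# X[i, j] = 1 if drug i is known to hit pathway j (polypharmacology).
X = np.array([
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
    [0, 1, 0, 1],
], dtype=float)

# Observed phenotype shift (e.g., change in median lifespan) per drug;
# these values are made up purely to demonstrate the computation.
y = np.array([2.1, 1.9, 0.2, 3.8, 2.0, 1.8])

def ridge_fit(X, y, lam=0.1):
    """Ridge (L2-regularized) regression in closed form:
    w = (X^T X + lam*I)^{-1} X^T y. Regularization keeps the
    pathway attribution stable when drugs share targets."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

weights = ridge_fit(X, y)
# Pathways with large positive weights are the ones the drug panel
# implicates in the phenotype.
print(dict(zip(["pathway_A", "pathway_B", "pathway_C", "pathway_D"],
               np.round(weights, 2))))
```

The point of the design is that overlapping drug–target profiles, attacked jointly by a regularized fit, let a small, deliberately chosen drug panel disentangle pathway contributions that no single-drug screen could.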
What is the epigenome, and how does it relate to the genome?
I like to think of the genome as an encyclopedic-scale cookbook, whereas the epigenome is a collection of bookmarks and handwritten notes in the margins as well as sauce stains and syrup spills. The genome is often likened to a book, but I insist on having a functional aspect for this metaphor to really work. We must use that book to create a product according to a recipe. It adds a lot to the biological metaphor; branches of the same fast-food chain use the same pages out of this book in a consistent, perhaps ever-so-slightly different, way. These are akin to cells of the same type. Worn-out pages lead to omitted or incorrectly interpreted steps, et cetera. Trying to implement a recipe that is already somewhat damaged might lead to a cook spending more time with it; in turn, this leads to more damage to this and adjacent pages, a process that greatly amplifies initial small random differences in a chaotic way, leading to naively inexplicable dramatic differences in the lifespan of originally twin-identical systems in near-identical environments.
Apparently, the epigenome can be used to accurately “age” the specimen, that is, to tell how old an organism is from a sample. This will be very useful in aging research, as to this day, we know how to age only a few selected species that keep track of their age in a visible way, like the annual rings on a tree stump, the similar layers in an otolith, and the scales of some fish. Having access to an organism’s age allows us to correlate its age with other markers, such as gene and protein expression in various tissues, behavioral changes, and the physical wear of structures.
It has been proposed that epigenetic alterations are a primary cause of aging and that both genetic and epigenetic interventions are keys to potentially controlling aging. Do you consider epigenetic alterations as a cause of aging or a downstream consequence?
Neither cause nor downstream. There is no linear causal chain with the two links of “aging” and “epigenetic alterations”; instead, there are loops and amplifiers in the circuits of aging. Epigenetic alterations have to be caused by something else; these, in turn, control many things. On the other hand, DNA damage is clearly pretty early in the causal network but is hard to undo. There is more hope to proofread and fix “epigenetic alterations”. Going back to my fast-food chain metaphor, you would imagine a quality inspector, examining a cookbook, reducing the stains and shaking off crumbs and accidental bookmarks while completely shutting down the restaurants with hopelessly damaged cookbooks.
While I am not sure about the arrow of causality, I am very much interested in this direction of research, so much so that we are planning an experiment around it in daphnia, looking at changes in the distributions of cell types in cell populations that make up young and old individuals. The expectation is that epigenetic alterations lead to de-differentiation and mis-differentiation of cells in old organisms, which could be characterized and further used as end-points for aging interventions. We would need to find a reliable and affordable way to profile the epigenetic state at a single-cell level for tens of thousands of cells, which is a big challenge.
In December 2016, researcher Juan Carlos Izpisua Belmonte reset epigenetic aging markers in living animals using partial cellular reprogramming, similar to how we make induced pluripotent stem cells and how embryos “reset” their aging. This appeared to reverse multiple aspects of aging and increased healthy lifespan in mice. Are you optimistic that resetting the epigenetic biomarkers of aging in aged adults is a promising direction of research, and why?
I do think that it is a very promising direction of research, but I wouldn’t rush to inject myself with Yamanaka factors just yet. There is way too much hype surrounding research in the field of aging nowadays, which is simply harmful to the field. Biology in general is suffering from the reproducibility problem. Very few results translate from mice to humans. “Normal” laboratory mice are far from “wild type”; they are inbred and live in unhealthy conditions. It would be great to create a natural, healthy, predator-free environment for mice and many other species and get good lifespan statistics someday, both natural and under perturbations.
For now, we have very scarce data on “natural” aging of “natural” animals and have to rely on statistics for lab strains of animals that are cheaper to keep in large numbers, such as worms and flies. Even for these “unnatural” mice, I do not think that we yet have the lifespan data on “Yamanaka infusion”; instead, we saw encouraging results on some overall health markers in progeric mice. In order to rationalize the hypothetical rejuvenation reset, one has to imagine that Yamanaka factors work differently in different cell types, re-aligning each cell to its original profile. That sounds like too much to hope for as a solution to rejuvenation.
Speaking of resetting epigenetics and thus aging, there is already solid evidence that this happens every day when children are born. If aging were a one-way process, then our children would be born old, but they are not. There is a strong connection between aging and embryology, so why is it that an old oocyte cell grows into a young organism rather than an already aged one?
I love this “germ line reset” question. I see two and a half possibilities here:
1: The germline needs no reset. The germline is created early; very few cell cycles happen between generations. If some damage does occur, it is selected against at the cellular level or, later, at the organism level.
2: The germline does age, but
(a): it rejuvenates in an intense cleansing event somewhere during germ cell maturation and fertilization. The dream is to identify this mechanism and get it to work elsewhere. One of my projects has to do with precisely this magic, as it looks at the progesterone-induced maturation of frog oocytes. In this project, we use quantitative mass spectrometry to analyze the dynamics of protein abundances and phosphorylation events in order to look for the hints of a proverbial reset.
(b): age-related damage is not a problem. It is ignored and then gets diluted in the massive expansion from a single cell to an organism.
My own established line of research on protein expression in early embryogenesis makes a strong case against this. I established that, up to a point in vertebrate development when hundreds of thousands of cells are present and many complex tissues have formed, the vast majority of protein molecules in use were deposited by the mother into the oocyte rather than synthesized de novo in the embryo. Simply keeping “bad” old protein until it gets diluted, and not using it in development, would be a tricky proposition, so the “reset” (abrupt degradation of protein aggregates and carbonylated proteins) is a very reasonable theory.
The use of AI, specifically deep learning, in research has been in the news a great deal lately; how are you using it as part of your research?
I am not. Even though my Ph.D. is in AI, or precisely because it is, I am not of a mind to “unleash AI onto the problem and let it figure out aging”. There are some encouraging methods being developed, such as “deep learning”, but these are just that – tools, technologies that are only powerful when correctly applied in specific circumstances. Yes, they are useful for board games, image analysis, and self-driving cars, but not yet for biology or aging.
Some of the aging-related projects that I am involved with produce image data and therefore might use machine learning someday. One example is developing “healthspan assays” for the short-lived organism daphnia, which involves collecting a lot of image data. We would like to develop a standardized platform for testing many perturbations’ effects on lifespan and healthspan. These could be drugs, changes to the environment such as light cycle and ambient temperature, diet, etc. Even though there are related initiatives, I am surprised that such a platform has not been developed yet; it’s badly needed and could be crowdsourced. Some machine learning methods will be useful there.
Also, I already mentioned one example of where machine learning tools are used in my work: a method we developed called “KIR – Kinome Regularization”, which uses drugs as perturbations and regularized regression as a way to uncover which pathways are responsible for a particular phenotype.
I am a lead co-author of what is, it is fair to say, a very highly cited paper that uses AI in biology. The paper is about classifying variants in protein-coding genes as deleterious or neutral. It’s definitely the best-known and most used method in this field and is licensed by many groups in academia and industry. The approach in that pipeline is the most basic AI method, one that has been a workhorse of AI for decades: Naive Bayes. What’s important is that it is robust to outliers and that the results are understandable and interpretable by human experts. Every step is important for machine learning to work: the quality of the data, correct annotation of the training set, informative representation of the features relevant to that particular domain, and a good algorithm. Machine learning has come a long way towards producing methods that are robust to noise in feature discovery and selection, but if you start with poor, inconsistent, and mislabeled data, it’s hard to accomplish anything. Biology is indeed at the point of having produced Big Data, which generates a lot of excitement for applying machine learning to it; the issue is that, unfortunately, a substantial infusion of low-quality data turns the whole treasure trove into Big Bad Data.
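To make the Naive Bayes idea concrete: the classifier combines a class prior with per-feature likelihoods learned from labeled examples, and every factor in the resulting score is inspectable by a human expert. The sketch below uses invented binary features and toy training data; the feature names are hypothetical and stand in for the kind of evidence (conservation, structural context) such a pipeline might use:

```python
import math
from collections import defaultdict

# Toy training set: each variant is a dict of binary features.
# Feature names and labels are hypothetical, for illustration only.
train = [
    ({"conserved_site": 1, "buried_residue": 1}, "deleterious"),
    ({"conserved_site": 1, "buried_residue": 0}, "deleterious"),
    ({"conserved_site": 0, "buried_residue": 0}, "neutral"),
    ({"conserved_site": 0, "buried_residue": 1}, "neutral"),
    ({"conserved_site": 0, "buried_residue": 0}, "neutral"),
]
features = ["conserved_site", "buried_residue"]

def train_naive_bayes(data, features):
    """Bernoulli Naive Bayes with Laplace smoothing: estimates
    P(class) and P(feature=1 | class) from labeled examples."""
    class_counts = defaultdict(int)
    feat_counts = defaultdict(lambda: defaultdict(int))
    for x, label in data:
        class_counts[label] += 1
        for f in features:
            feat_counts[label][f] += x.get(f, 0)
    model = {}
    for c, n in class_counts.items():
        prior = n / len(data)
        likelihoods = {f: (feat_counts[c][f] + 1) / (n + 2) for f in features}
        model[c] = (prior, likelihoods)
    return model

def classify(model, x, features):
    """Pick the class maximizing log P(class) + sum log P(feature | class).
    Each log term is directly interpretable as evidence for or against."""
    best, best_score = None, -math.inf
    for c, (prior, lik) in model.items():
        score = math.log(prior)
        for f in features:
            p = lik[f] if x.get(f, 0) else 1 - lik[f]
            score += math.log(p)
        if score > best_score:
            best, best_score = c, score
    return best

model = train_naive_bayes(train, features)
print(classify(model, {"conserved_site": 1, "buried_residue": 1}, features))
```

The interpretability mentioned above comes from the model’s structure: each feature contributes an independent, human-readable log-likelihood term, so an expert can see exactly which evidence drove a “deleterious” call.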
AI in aging research is still in its early stages; how do you see it developing in the next decade, and what needs to happen for it to become an optimal tool in the fight against aging and age-related diseases?
In order for AI to start being applicable to aging and biology in general, the data standards in biology must catch up to the data standards and quality of traditional AI domains. So far, AI, whether deep or otherwise, has performed impressively within very constrained and homogeneous settings, such as board games. The number and types of objects are very well defined, the data is curated, all situations are mostly at the same scale, and so forth. Even in voice recognition, self-driving cars, and robotics, the environments are extremely consistent and curated. Biology is very different in this respect; or, rather, our perception of biology at this point is different. We have not found invariant ways to describe phenomena, so every situation is unique.
Let’s look at one simple question: lifespan. For most species, we only have anecdotal, unreliable data. We would want to know how long species live in a setting where there is no predation and death comes “of old age”. That only happens in zoos, and zoo data is a pretty good source, but making these records consistent across various countries is tricky. It’s perhaps shocking that even the answer to the question of how long people live has a lot of room for improvement. What’s the maximum registered lifespan? Jeanne Calment’s famous record of 122 years has recently come into question. What is the median? That depends on the country and many other things. American life expectancy has been slightly falling in recent years. Is at least the overall distribution a Gompertz distribution? It’s not clear. There is a discussion of a flattening of the hazard rate in old age; this is seen in other species, such as flies, but whether it occurs in humans remains unsettled.
I recently lost my father to old age; he passed away one week short of 97 after many months at hospitals and rehabs. I personally witnessed what I’d call an “assisted homicide” phenomenon – there is strong pressure not to provide elderly patients with the level of medical care that is standard for younger patients with the same conditions. Doctors avoid procedures when their perceived risk is high, as it reflects negatively on the practice (spoils statistics); they like to call it “useless care”. Social workers push relatives to “pull the plug” to “avoid needless suffering”, exaggerating risks and downplaying chances of recovery; nurses and unskilled helpers quietly withdraw medical care and assistance, projecting their own religious beliefs of the afterlife and other superstitions onto what’s good for the patient. Such norms are likely country-specific and naturally skew lifespan statistics; we ignore this component of mortality risk in old age, which has to do with the medical system’s bias against elderly patients.
Another example involves data on the effects of drug perturbations on lifespan. There is a vast literature, and there are databases that attempt to agglomerate the results. However, not all the parameters are recorded, and you often find the same drug and dose having different effects in the same species. Drugs themselves are surprisingly hard to study because there is no consistent, widely available information on which drug is called what or has which chemical structure and CAS identifier; there are rampant inconsistencies and misannotations. Getting the “same drug” from two different vendors leads to different results because the drugs have different purity, etc. Overall, the vision that we can fund large, optimized data factories for producing Big Data in biology and then unleash AI on it to discover how biology works has not paid off, probably for very fundamental reasons.
We do not know where to look for invariants in biology, so measuring, for example, gene and protein expression in a cancer cell line and assuming that your measurement captures the “normal” state and its reaction to drug perturbations is naive. So many factors influence what things look like; time of day, cell density, and microenvironment are just a few. Tomorrow, you can remeasure and get very different results. So, pooling the results of multiple studies from these giant data repositories and hoping to find things in common could be very disappointing.
I mentioned that I was involved with a project of sequencing the naked mole rat, helping the Gladyshev laboratory at Harvard. The idea was to look for genes explaining its exceptional longevity and health. Such analysis has to be done by comparison to other species, so, naturally, it relies on having accurate and complete information on several other genomes. However, the quality of most so-called published complete genomes is such that you can never be sure if a certain gene is missing because it is really not there in these species or because it was missed by the genome sequencing and annotation pipeline.
One of my projects has to do with the quality control and attributes of complete genomes, so such analyses can be informed by the overall quality of the genome of each species, and gene sequences that are likely incomplete, bad, or represent the fusion of several real genes are labeled as such. I do not mean to downplay the work of the other teams that produced the complete genomes; there are hard challenges that are organism-specific, and we should be grateful to have as much information as we do, yet an objective way to assign confidence and take it into account has been missing.
You can probably see that the data quality in biology is a personal crusade for me.
AI is part of broader advances in research; automated microfluidics, high-throughput assays, and other automation are also improving research quality and speed. What other ways do you see automation in the lab helping to speed up research?
I am fortunate to have been involved with method development and to teach people about some of the cutting-edge technologies for high-throughput measurements, such as quantitative mass-spec proteomics and single-cell transcriptomics. These are very promising technologies, but here, again, one has to be very careful with the interpretation of the data and particularly careful when expecting to merge data from multiple, disjointed studies. In single-cell studies, everything depends on which tissue you work with, how successfully you managed to rapidly dissociate the sample into single cells without killing a lot of cells (particularly in some biased way that masks some cell types entirely), whether (or rather to what extent) cells lyse in the device, and with what parameters you run the device and make and sequence libraries.
The same goes for proteomics: it’s easy to miss whole classes of proteins, such as low-abundance transcription factors or lipid-soluble proteins. In both cases, splitting the same exact sample in half and running it through a pipeline would often give you dramatically different outcomes. So while there is indeed some degree of automation, and automation is helpful, it is only practical to a very small extent, and even where it is used, it has to be used with extreme care.
I already mentioned that the data quality in biology is a personal crusade for me. I myself work on integrating AI into these methods. One project, which is just now being submitted for publication, is about assigning confidence to quantitative mass spectrometry measurements of relative protein abundance. Comparing protein expression across samples is currently done by reporting the most likely value, which does not allow us to notice significant but small changes or to rank candidate genes for follow-up research in a statistically sound way. With colleagues from Princeton, we developed a rigorous statistical model, which allows us to confidently judge a shift of protein expression down to 1%. This is an application of a Bayesian statistical approach borrowed from machine learning, and it will empower many new studies and allow us to re-analyze already published data to get to new findings. We will immediately plug this into our aging project, which involves life-long profiling of protein levels in daphnia.
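The core idea of assigning confidence to a small abundance shift, rather than reporting only a point estimate, can be sketched in a few lines. This is a deliberately simplified stand-in for the published model: it assumes replicate log2 fold-change measurements with known Gaussian noise and a flat prior, which is far cruder than the real method, but it shows why replication turns a tiny shift into a confident call:

```python
import math

def posterior_shift_probability(log_ratios, noise_sd):
    """Simplified sketch: treat replicate log2 fold-change measurements
    as draws from Normal(mu, noise_sd) with a flat prior on mu. The
    posterior of mu is then Normal(mean, noise_sd/sqrt(n)), and we
    report P(mu > 0), i.e., the confidence that abundance increased.
    This is an illustration, not the published model."""
    n = len(log_ratios)
    mean = sum(log_ratios) / n
    se = noise_sd / math.sqrt(n)
    z = mean / se
    # Standard normal CDF via the error function.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Three replicates showing a small increase of a couple of percent
# (log2(1.015) is about 0.0215); values invented for illustration.
reps = [0.020, 0.023, 0.021]
print(round(posterior_shift_probability(reps, noise_sd=0.01), 3))
```

With a single noisy measurement, a 1–2% shift is indistinguishable from noise; with consistent replicates, the posterior probability of a real shift approaches one, which is the kind of statistically sound ranking the project aims at.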
Progress in aging research seems to have been speeding up in the last few years; however, there are still a number of bottlenecks holding things back. As a researcher, what is the biggest bottleneck holding us back from making rapid progress in the field?
While I, very humanly, would love to believe that we are the fortunate generation to witness and benefit from the awesome moment of the maturation of science and technology known as the “singularity”, if I try to soberly consider this “speed-up”, I find it hard to believe. What I mean is that I am not sure if our time will seem like the singularity in retrospect. I also suspect that 10, 100, and 1000 years ago, there were people who lived through the same breathtaking experience of an imminent “singularity”; new tools tend to give us an exaggerated sense of control over nature.
What is holding us back falls into two broad categories: things that are not in our power to change and things that are. The current system of incentives in science, funding and publication puts a lot of pressure on scientists to compete, not to share their best data and code, and to obfuscate even when they do. Competition can be extremely useful at the stages when cures need to be brought to the market, but the current state of aging research badly needs cooperation. We need platforms for the uniform profiling of perturbations. We need consistent, clean information on drugs and lifespan in species.
Perhaps most importantly, we need a platform for crowdsourcing efforts and doing citizen science. There is enormous enthusiasm, which is largely wasted in social media groups that discuss food supplements and cheer. I know many bright, generous, and resourceful non-specialists who would be happy to contribute their time to curating data, coding up useful snippets of software, and following instructions to collect samples. Organizing this force of nature will take a TaskRabbit or Mechanical Turk kind of platform. Also, it amazes me that, to this day, in the United States, someone cannot get their own blood work done “out of curiosity”. People are already getting armed with technology to run some tests at home. If we trust people to spit into a tube to get their DNA sequenced, we can surely develop instructions to run what would essentially amount to a “citizen-run clinical trial” in which interventions are consistently tested.
All in all, whether an atmosphere of competition or of cooperation prevails in the field really depends on whether the field is perceived as very young and far from mature results and “products” or as very close to getting the first drugs and therapies to market. That is mainly why a false sense of accomplishment is so harmful to progress. The field is in its infancy, and it is way too early to worry about marketing, patents, and profits and to build walls; we have to find ways to be very open and collaborative.