How is the history of a gene determined?


While a rudimentary understanding of heritable traits is traceable as far back as biblical tales (notably the story of Jacob and his speckled goats in Genesis 30 and 31), as with much else in human history, knowledge of how heredity works didn’t really take off until the last century and a half. Darwin’s work on evolutionary theory culminated in On the Origin of Species in 1859, a thorough examination of his observations some thirty years in the making, which shortly preceded the work of a curious monk by the name of Gregor Mendel. A combination of isolation, contemplation and love of horticulture led to Mendel’s profound understanding of heritable traits and his eventual theories of inheritance. Mendel was lucky to have selected peas for his work: simple plants that they are, the traits he studied split cleanly into dominant and recessive forms, allowing them to fall into clear patterns.

             A (male)   a (male)
A (female)   AA         Aa
a (female)   aA         aa

Wherever a dominant version of a gene is present, it masks the recessive one. Mendel studied a wide variety of traits in his pea plants, but here we’ll only consider a single trait: height, with a tall A dominating the smaller a. A quick look at the chart reveals that three times out of four, when both parents carry one copy of each version, you’ll raise tall plants. In humans, things aren’t necessarily as clear cut, but eye color makes for a good simplified example. The “pure Aryan” of Nazi mythology had bright blue eyes; because brown is treated as the dominant trait, blue eyes would appear only in someone carrying two recessive copies, which made eye color look like a simple test of who was “genetically pure” and who was not. (In reality, eye color is influenced by several genes, so the test was never that simple.)
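The chart above can be reproduced in a few lines of code. What follows is a minimal sketch in Python, assuming a single gene with two versions (A and a) and two parents who each carry one of each; the function names are made up for illustration.

```python
from collections import Counter
from itertools import product


def punnett(parent1, parent2):
    """Count offspring genotypes for a single-gene cross such as 'Aa' x 'Aa'."""
    # Each parent passes on one of its two copies; sorting makes 'aA' and 'Aa' the same.
    return Counter("".join(sorted(pair)) for pair in product(parent1, parent2))


def fraction_dominant(parent1, parent2, dominant="A"):
    """Share of squares in the chart that show the dominant trait."""
    squares = punnett(parent1, parent2)
    total = sum(squares.values())
    showing = sum(n for genotype, n in squares.items() if dominant in genotype)
    return showing / total


if __name__ == "__main__":
    print(punnett("Aa", "Aa"))            # one AA, two Aa, one aa
    print(fraction_dominant("Aa", "Aa"))  # 0.75
```

Changing the parents to "AA" and "aa", or "Aa" and "aa", shows how quickly the three-in-four figure changes once the parentage does.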

DNA (deoxyribonucleic acid) was first isolated during Darwin’s lifetime, but its structure was not determined until nearly a hundred years after the Origin of Species was published, through the concentrated efforts of Watson and Crick. Stored in the nuclei of the cells of all animals, plants and fungi (and in the cells of simpler organisms that have no nucleus), DNA is inseparable from what we know as living creatures. DNA forms the backbone of each of the 23 chromosome pairs that make up the human genome. It stores all of the instructions necessary to build and maintain an organism, with repair mechanisms and redundancies built in to buffer against many kinds of damage, including some of the damage caused by radiation.

Each chromosome contains the familiar double helix, made up of a long chain of chemical bases called (for simplicity) A, C, G and T. Each side of the twisted ladder is the complement of the other, with As and Ts joining hands, and Cs and Gs joining hands, all the way up the helix, forming molecular couples known as “base pairs”. Genes are specific sections of this code, used to build the molecules that do the actual work, such as building cell structures, regulating other genes and fending off invaders like viruses. DNA is “read” in triplets of bases, where each triplet specifies one of the amino acids that make up a protein. There are 20 standard amino acids but 64 possible triplets, and this built-in redundancy allows for a lot of leeway in mutations in the DNA. For example, the sequences TTA, TTG, CTA, CTT, CTC and CTG all code for the same amino acid, leucine.
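To make the pairing and the triplet reading concrete, here is a small sketch in Python. The codon table is deliberately partial (only the leucine triplets named above plus a few standard entries), and the function names are invented for this example.

```python
# Which base sits opposite which on the two sides of the ladder.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Partial codon table, written as DNA triplets. Only a handful of the 64 entries.
CODON_TABLE = {
    "TTA": "Leu", "TTG": "Leu", "CTA": "Leu",
    "CTT": "Leu", "CTC": "Leu", "CTG": "Leu",
    "ATG": "Met", "TGG": "Trp",
    "TAA": "Stop", "TAG": "Stop", "TGA": "Stop",
}


def partner_strand(strand):
    """Return the base on the opposite side of the ladder, position by position."""
    return "".join(PAIRS[base] for base in strand)


def translate(dna):
    """Read DNA three bases at a time; triplets missing from the table become '?'."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial triplet
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    return [CODON_TABLE.get(codon, "?") for codon in codons]


if __name__ == "__main__":
    print(partner_strand("ATGC"))     # TACG
    print(translate("ATGTTACTGTGA"))  # ['Met', 'Leu', 'Leu', 'Stop']
```

Swapping TTA for CTG in the example sequence changes the DNA but not the protein, which is exactly the leeway described above.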

DNA holds a record of many of the mutations its lineage has undergone. Traits like eyes, red blood cells or the ability of formerly vegetarian great apes to digest meat are developed and lost through small changes in its code. The old adage, “use it or lose it”, is just as true in keeping genetic traits alive as it is in keeping one’s math skills up to date. Indeed, some of our genes are shared throughout the animal kingdom, preserved nearly unchanged for hundreds of millions of years: our immortal genes. Other genes performed their function and then, when circumstances no longer required their use, became fossilized in our DNA, no longer able to code for functional proteins. As humans became more reliant on vision, we lost much of our ability to smell, while other animals have only refined theirs.

Popular reports on genetics and biology often mention in passing their estimates of when certain changes in our DNA took place. The ability of adult humans to digest milk is one of the fastest-spreading changes our species has ever known; DNA analysts estimate that the change took place a mere 9,000 years ago. We used to be vegetarians; DNA analysis suggests that we developed the ability to eat meat in two stages, first a change 2.5 million years ago that allowed us to fight infectious diseases caused by rancid meat, and then over 2 million years later a change that allowed us to clear out the fat and cholesterol buildup that had been drastically shortening (ha!) our lives.

These popular reports tend to avoid the question of how a researcher determines the age of a particular mutation, or change in a base pair. Truth be told, there are fairly wide margins of error in the dates provided, even for fairly recent mutations. For most of the last 60 years it was assumed, for simplicity, that the mutation rate of DNA was constant across time and species. Laurence Moran describes three methods for estimating this rate in humans: the biochemical method (130 new mutations per person per generation), the phylogenetic method (100-150) and the direct method (77). Considering that the entire genome contains approximately 3.2 x 10^9 base pairs (of A, C, G and T), these estimates are pretty consistent with one another and touch only a tiny fraction of the genome, though a single mutation can bring truly mind-boggling changes to a given individual. Of course, not all mutations are hugely beneficial, nor excruciatingly detrimental. Most of the time, a mutation has no discernible effect on a given person.
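To see how small these numbers are next to the size of the genome, the three estimates can be turned into a rate per base pair. This is a back-of-the-envelope sketch in Python: it assumes the counts are new mutations per generation, and that each person carries two copies of the 3.2 x 10^9 base-pair genome (one from each parent). Both assumptions, and all the names, are mine rather than taken from Moran.

```python
HAPLOID_GENOME_BP = 3.2e9                   # figure quoted in the text
DIPLOID_GENOME_BP = 2 * HAPLOID_GENOME_BP   # assumption: count both inherited copies

# (low, high) new mutations per person per generation, as quoted in the text.
ESTIMATES = {
    "biochemical": (130, 130),
    "phylogenetic": (100, 150),
    "direct": (77, 77),
}


def per_site_rate(mutations_per_generation, genome_bp=DIPLOID_GENOME_BP):
    """Average chance that any one base pair changes in one generation."""
    return mutations_per_generation / genome_bp


if __name__ == "__main__":
    for method, (low, high) in ESTIMATES.items():
        print(f"{method}: {per_site_rate(low):.1e} to {per_site_rate(high):.1e}")
```

Under these assumptions every method lands at roughly one or two changes per hundred million base pairs per generation, which is why the three figures count as consistent even though 77 and 150 look far apart.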

With a known mutation rate, it’s a relatively simple process to compare the DNA collected from an ancient human skeleton to that of someone living today, and count the number of mutations that have taken place. Divide that count by the mutation rate, and *boom* there’s your elapsed time. Nothing in science is really quite that simple, however, and there has been a lot of discussion in biology circles about the accuracy of the mutation rate and its related term, the genetic clock. Species migrate, exposure to mutagens is not constant, etcetera.
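The division described above looks something like the following. This is a toy sketch in Python of a constant-rate clock, not how a real study is run: the mutation count, the 25-year generation time and the function names are all hypothetical, and only the rates come from the estimates quoted earlier.

```python
def years_elapsed(mutations_counted, mutations_per_generation, years_per_generation=25.0):
    """Constant-rate clock: mutation count -> generations -> years."""
    generations = mutations_counted / mutations_per_generation
    return generations * years_per_generation


def age_range(mutations_counted, candidate_rates, years_per_generation=25.0):
    """Youngest and oldest age implied by a set of plausible mutation rates."""
    ages = [years_elapsed(mutations_counted, rate, years_per_generation)
            for rate in candidate_rates]
    return min(ages), max(ages)


if __name__ == "__main__":
    counted = 100_000  # hypothetical number of accumulated mutations
    print(years_elapsed(counted, 100))        # 25000.0
    low, high = age_range(counted, [77, 130, 150])
    print(f"between {low:,.0f} and {high:,.0f} years")
```

Even in this toy version, choosing 77 rather than 150 mutations per generation nearly doubles the estimated age, which gives a feel for where the wide margins of error come from.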

Always, always, these estimates must be compared to dating methods used by the other sciences, such as radiometric dating, stratigraphic methods and written histories. Written histories have made the Ashkenazi Jews a boon to DNA studies, as they have a very long and very well known historical record. With medieval roots in Germany, they have largely married within their own community for hundreds of years. Genetic disorders that are common among the Ashkenazim can be traced back in time to when the first person in that group carried the responsible mutation.

At present, the most widely used methods for estimating variable rates of mutation are the Bayesian method described by Thorne, and the Penalized Likelihood method described by Sanderson. Computer programs have been developed to run such analyses, the most popular being the appropriately named BEAST (Bayesian Evolutionary Analysis by Sampling Trees), made freely available through the University of Auckland. In a 2013 study, Paradis tweaked the original Penalized Likelihood method and found that the traditional constant-rate genetic clock still turns out to be a fairly accurate approach, as it simply represents a special case of the other methods used.

So, yes, there are uncertainties in the dating methods used by DNA researchers. That is to be expected. But it’s good to know that when they say something happened approximately 220,000 years ago, there really are good reasons to believe their results.


Drummond, AJ and Rambaut, A. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7:214. 2007.

Labuda, D et al. The Genetic Clock and the Age of the Founder Effect in Growing Populations: A Lesson from French Canadians and Ashkenazim. Am. J. Hum. Genet. 61:768–771. 1997.

Moran, LA. Estimating the Human Mutation Rate. 2013.

Paradis, E. Molecular dating of phylogenies by likelihood methods: A comparison of models and a new information criterion. Mol. Phyl. Evol. 67:436-444. 2013.

Sanderson, MJ. Estimating Absolute Rates of Molecular Evolution and Divergence Times: A Penalized Likelihood Approach. Mol. Biol. Evol. 19(1):101–109. 2002.

Thorne, JL, Kishino, H and Painter, IS. Estimating the Rate of Evolution of the Rate of Molecular Evolution. 1998.


Carroll, S. The Making of the Fittest. 2006.

Gonick, L. The Cartoon Guide to Genetics (2nd Edition). 2005.

Kean, S. The Violinist’s Thumb: Tales of Love, War and Genius as Written by Our Genetic Code. 2011.


UCSC Genome Bioinformatics

UCSC Genome Browser

National Human Genome Research Institute

Genetics Home Reference [Includes Handbook]

BEAST Software


One Response to How is the history of a gene determined?

  1. Margaret says:

    read this a week ago actually. You made it seem pretty easy to understand. Amazing leucine can be made so many ways!
