The genetic alphabet

20 Oct

Illustration by D. Leja, courtesy of the National Human Genome Research Institute, http://www.genome.gov.

The genes in our chromosomes are often compared to recipes inside of a cookbook.

But if we were to open one of these cookbooks, we would find that the recipes are all written in a unique language.

Whereas we have 26 letters in our alphabet, the DNA alphabet is only four letters long: A, T, G, C.

Each letter represents a chemical base: adenine (A), thymine (T), guanine (G), and cytosine (C).

Looking at DNA directly, we see the As, Ts, Gs, and Cs arranged on that most famous genetic shape, the double helix.

To picture this shape, imagine you are holding a toy rubber ladder that you twist at both ends. The sides of the ladder are sugar and phosphate molecules; the rungs of the ladder are As paired with Ts, or Cs paired with Gs, fitted together in units called base pairs.
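For readers who like to tinker, the pairing rule for the rungs can be sketched in a few lines of Python. This is only an illustration of the A-with-T, C-with-G rule described above; the names `PAIRS` and `partner_strand` are made up for this example.

```python
# Each rung of the ladder is a base pair: A with T, C with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def partner_strand(strand: str) -> str:
    """Return the base sitting across the rung from each letter.

    Hypothetical helper for illustration: it only applies the pairing
    rule letter by letter and ignores everything else about real DNA.
    """
    return "".join(PAIRS[base] for base in strand.upper())


print(partner_strand("ATGC"))  # TACG
```

Knowing one side of the ladder is enough to write out the other, which is why a single string of letters is all we need to describe a stretch of DNA.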

If you unravel the DNA, you reveal the recipes of As, Ts, Cs, and Gs that are written into genes in the form of three-letter words called codons (so named because they are little codes).

In the dictionary of codons there are only 64 words: with four letters to choose from in each of three positions, there are exactly 4 × 4 × 4 = 64 possible three-letter words. Because so many of these words are synonyms, they have only about 21 meanings. Sixty-one codons are recipes for 20 amino acids, the building blocks of proteins, and three mean “end the protein,” in the way that “stop” marked the end of a sentence in that bygone communiqué, the telegram.
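The count is easy to check by listing every three-letter word. The sketch below does that in Python; the three "stop" words it sets aside (TAA, TAG, TGA) are the stop codons of the standard genetic code, which this post does not spell out, and the variable names are just for illustration.

```python
from itertools import product

BASES = "ATGC"

# Every way of choosing one of four letters, three times over.
codons = ["".join(letters) for letters in product(BASES, repeat=3)]

# Stop codons in the standard genetic code (not listed in the post itself).
STOP_CODONS = {"TAA", "TAG", "TGA"}
coding_codons = [c for c in codons if c not in STOP_CODONS]

print(len(codons))         # 64
print(len(coding_codons))  # 61
```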

Still, there is no limit to the number of proteins that could be spelled out by different sequences of codons.
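A little arithmetic shows why. If each position in a protein chain can hold any of the 20 amino acids, a chain of a given length has 20 multiplied by itself that many times possible spellings. The Python sketch below is a back-of-the-envelope count only; the function name `possible_chains` is invented for this example, and it makes no claim about which of those chains would work as real proteins.

```python
AMINO_ACIDS = 20  # the 20 building blocks mentioned above


def possible_chains(length: int) -> int:
    """Count the distinct amino acid sequences of a given length."""
    return AMINO_ACIDS ** length


for length in (1, 2, 10, 100):
    print(length, possible_chains(length))
```

A chain just two amino acids long already has 400 possible spellings, and the count keeps multiplying by 20 with every amino acid added.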

To illustrate this point, let’s play a quick word game.

Write a poem or just a few sentences limited to the words in the following “dictionary” of 64 three-letter words. The words are sorted into groups of very rough synonyms (we cheated a little) to mirror the limited vocabulary of codons.

YES, YEP, YUP, YEA, YAH, AYE

NIX, NIL, NOT, NAY, NON, NOR

PIP, BIT, TAD, PAT, DAB, NIP

CRY, SAY, ASK, PUT

TOP, CAP, SKY, LID

BUD, PAL, BRO, CUZ

ATE, EAT, HAS, HAD

SHE, HIM, HER, HIS

WHY, WHO, HOW

END, DOT, FIN (These are the “stop” codons)

YOU, THY

CAN, MAY

BUT, YET

ARE, WAS

AND, TOO

POO, DOO

DOG, PUP

MAN, GUY

CUP, MUG

DID

THE

From this list of words will not emerge Homer’s Odyssey.  But the poem (or sentences) that you come up with will not be the same as mine:

The dog ate poo.

“But why?” you cry.

Ask the dog.

Ask the sky.

In the language of the genome, the possibilities are even greater than this model allows, since genes, the “sentences” of codons that spell out instructions for building a protein molecule, care nothing for syntax.*

They do, however, allow for punctuation: certain codons begin and end “sentences” by carrying the chemical instructions to start or stop manufacturing a protein.
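Here is a rough sketch of that punctuation at work, in Python. It assumes the standard genetic code, in which ATG is the usual "start" word and TAA, TAG, and TGA are the "stop" words; the post does not name them. The function `read_sentence` is a made-up helper, and real cells use more context than this to decide where a gene begins.

```python
START = "ATG"                  # usual start codon in the standard code
STOPS = {"TAA", "TAG", "TGA"}  # stop codons in the standard code


def read_sentence(dna: str) -> list[str]:
    """Return the codons from the first ATG up to the first stop codon.

    Simplified for illustration: reads three letters at a time from the
    first ATG found, and keeps going to the end if no stop turns up.
    """
    dna = dna.upper()
    start = dna.find(START)
    if start == -1:
        return []  # no "capital letter", so no sentence
    sentence = []
    for i in range(start, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon in STOPS:
            break  # the full stop
        sentence.append(codon)
    return sentence


print(read_sentence("CCATGGCATTTTAAGG"))  # ['ATG', 'GCA', 'TTT']
```

The letters before ATG and after TAA are simply skipped, much as we ignore the margins around a sentence.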

Deciphering the DNA code is as simple as a children’s game.  On the back of a cereal box you might find a coding game where you have a set of numbers and a look-up table that specifies the letters they represent.

Where:

5=G

6=E

and 13=N,

“5-6-13-6”

would spell GENE.
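The cereal-box game fits in a few lines of Python. The look-up table below holds only the three number-letter pairs given above; the function name `decode` is just for this example.

```python
# The look-up table from the back of our imaginary cereal box.
LOOKUP = {5: "G", 6: "E", 13: "N"}


def decode(message: str) -> str:
    """Turn a dash-separated string of numbers into letters."""
    return "".join(LOOKUP[int(number)] for number in message.split("-"))


print(decode("5-6-13-6"))  # GENE
```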

Similarly, geneticists, or anybody else, can simply look up the amino acids represented by codons, so that a “recipe” that reads ATG, ATG, ATG, ATG means, according to the look-up table, methionine, methionine, methionine, methionine. (Methionine is an essential amino acid that is especially prevalent in sesame seeds.)
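The same trick works for codons. The Python sketch below uses a deliberately tiny slice of the standard codon table (the full table has all 64 entries), so any codon it does not know is shown as "?". The names `CODON_TABLE` and `translate` are invented for this example.

```python
# A small, partial slice of the standard codon look-up table.
CODON_TABLE = {
    "ATG": "methionine",
    "TGG": "tryptophan",
    "TTT": "phenylalanine", "TTC": "phenylalanine",  # synonyms
    "AAA": "lysine", "AAG": "lysine",                # synonyms
    "GGT": "glycine", "GGC": "glycine",
    "GGA": "glycine", "GGG": "glycine",              # four synonyms
    "TAA": "stop", "TAG": "stop", "TGA": "stop",
}


def translate(dna: str) -> list[str]:
    """Read DNA three letters at a time and look up each codon."""
    dna = dna.upper()
    codons = [dna[i:i + 3] for i in range(0, len(dna) - 2, 3)]
    # Codons missing from this partial table come back as "?".
    return [CODON_TABLE.get(codon, "?") for codon in codons]


print(translate("ATGATGATGATG"))
# ['methionine', 'methionine', 'methionine', 'methionine']
print(translate("TTTTTC"))
# ['phenylalanine', 'phenylalanine']
```

The second example shows the synonyms at work: two different three-letter words, one meaning.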

Proteins range in size from as few as eight or 10 amino acids up to thousands.

Through the alphabet of As, Gs, Ts, and Cs, genes encode all of the proteins that make all life on earth go.

*ADDENDUM: Coincidentally, within hours of writing the words “genes care nothing for syntax,” the author of this post heard a scientist on NPR convincingly make the opposite case. A study appearing in this week’s issue of the journal Nature reports that lifestyle factors, such as overeating, may alter genetic traits. The focus of this study is on epigenetics: the things around our DNA, such as the chemical markers that turn genes on and off. The epigenome, says Andy Feinberg, professor of medicine and molecular biology at the Johns Hopkins School of Medicine, is the grammar that “helps to tell what the genes are actually supposed to do, and puts them in context.” Read the story from All Things Considered.
