
The Human Genome Project

So then, we have a “working draft” of the human genome. Dr Ewan Birney, one of the lead researchers at the publicly funded European Bioinformatics Institute (EBI) in Cambridge, UK, told the BBC that: "The public project decided last year to accelerate its rate of discovery to match the private project and on 15 June we will say that we're effectively 90% done - 90% of the interesting bits." But what constitutes ‘an interesting bit’? As Dr Birney doesn’t say, we’re left guessing. Presumably this includes protein-coding sequences and regulatory regions such as transcription factor binding sites.

Still, this is only about 3% of the genome. So what exactly is the rest of it? Well, a significant proportion is just repeated DNA. Repeats come in two flavours, mini & micro. Each is made up of a ‘unit’ that is repeated end to end. For example, "ATATATATATATAT" has a 2bp unit of "AT". Micro-repeats have units of up to 5-6bp. Mini-repeats have units of 10bp or more. Virtually no 7, 8 or 9bp units are observed. What is the difference, you might ask? Well, micro-repeats are thought to be the result of DNA polymerase ‘slippage’. Basically, the copied DNA ends up with a few more repeats than the original. After a few million years, you get a lot of repeats. Mini-repeats are thought to be caused by inaccurate crossing over in meiosis.
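
To make the idea of a repeat ‘unit’ concrete, here is a minimal Python sketch that finds the shortest unit and the number of copies in a pure tandem repeat. The function name `repeat_unit` is made up for illustration, and it assumes the whole string is one perfect repeat, which real genomic sequence rarely is.

```python
def repeat_unit(seq):
    """Return (unit, copies) for the shortest unit whose tandem
    repetition spells out seq exactly.

    Assumption: seq is a perfect tandem repeat. A sequence with no
    internal repetition simply comes back as (seq, 1).
    """
    if not seq:
        raise ValueError("sequence must not be empty")
    for size in range(1, len(seq) + 1):
        # A candidate unit must divide the sequence length evenly.
        if len(seq) % size == 0:
            copies = len(seq) // size
            if seq[:size] * copies == seq:
                return seq[:size], copies


unit, copies = repeat_unit("ATATATATATATAT")
print(unit, copies)
```

For the example in the text this reports a 2bp unit of "AT" repeated 7 times.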

Although repeats are considered ‘junk’, they do have their uses. VNTRs (Variable Number Tandem Repeats) allow DNA samples to be told apart; they are the basis of the DNA fingerprint. Generally, genes are no good for fingerprinting. Although most genes have several ‘versions’ or alleles, most of those alleles are not very common. Any change in a gene will probably result in a change in the function of the gene product, so it is almost always selected against. For fingerprinting to be useful, there need to be 6 or more common alleles; this is very rare in genes but frequent in VNTRs. Analysis of 10 or more VNTRs is usually enough for a fingerprint.
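
A rough calculation shows why 6 alleles and 10 VNTRs go a long way. The Python sketch below estimates the chance that two unrelated people share the same genotype. It rests on simplifying assumptions that are mine, not the article's: all alleles at a locus are equally common, genotypes follow Hardy-Weinberg proportions, and the loci are independent. The function names are illustrative only.

```python
from itertools import combinations_with_replacement


def genotype_match_probability(n_alleles):
    """Chance that two random people share a genotype at one locus.

    Assumptions: n_alleles equally common alleles, Hardy-Weinberg
    genotype proportions.
    """
    freq = 1.0 / n_alleles
    total = 0.0
    # Enumerate every unordered pair of alleles (i.e. every genotype).
    for a, b in combinations_with_replacement(range(n_alleles), 2):
        # Homozygotes occur at p*p, heterozygotes at 2*p*q.
        genotype_freq = freq * freq if a == b else 2 * freq * freq
        # Both people must independently carry this genotype.
        total += genotype_freq ** 2
    return total


def profile_match_probability(n_alleles, n_loci):
    """Chance of a match at every one of n_loci independent loci."""
    return genotype_match_probability(n_alleles) ** n_loci


print(genotype_match_probability(6))
print(profile_match_probability(6, 10))
```

Under these assumptions a single 6-allele locus matches by chance about 5% of the time, while ten such loci together match less than once in a trillion comparisons. Real allele frequencies are uneven, so real figures are less dramatic.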

Repeats also have another interesting consequence for sequencing. The chemical reactions used to read off the bases are stalled by highly repetitive regions. This means that about 10% of the genome is virtually impossible to sequence! Chromosome 22, the first to be sequenced, does in fact have a section of about 15Mb (3%) that is unreadable.

The situation is worse in the case of the privately funded Celera Genomics sequencing program. Celera uses a shotgun sequencing technique. This breaks the DNA into short fragments & sequences them. Overlaps between the ends of these sequences are then looked for so that they can be fitted together like a jigsaw. The nature of this technique is that there will be gaps where a section hasn’t been successfully sequenced. Then add to this the problem of unsequenceable repeats.
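
The jigsaw idea can be sketched in a few lines of Python. This is a toy greedy assembler, not Celera's actual method: it repeatedly joins the two fragments with the longest end-to-end overlap and stops when no overlap of at least `min_len` bases remains. The function names, the `min_len` parameter and the example fragments are all invented for illustration, and the sketch ignores sequencing errors and reverse-complement reads.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that is also a prefix of b.

    Returns 0 if no overlap of at least min_len bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments, min_len=3):
    """Merge fragments by best overlap until none can be joined.

    Returns a list of contigs. More than one contig means there is
    a gap that the fragments do not bridge.
    """
    contigs = list(fragments)
    while len(contigs) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing left to join: the remaining pieces are separated by gaps
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


print(greedy_assemble(["GTACGTTAGC", "ATGGCGTACG", "TTAGCCTGA"]))
print(greedy_assemble(["ATGGCG", "TTTTAA"]))
```

The first call stitches the three fragments into a single sequence. The second has no usable overlap, so it is left as two separate pieces with a gap between them.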

Finally there are ‘short repeated sequences’, which afflict both programs. While they can be sequenced, they all look exactly the same as each other, so we cannot pinpoint where in the mass of repeats a segment fits.
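
The placement problem is easy to demonstrate. In the Python sketch below, a made-up reference sequence contains a run of "AT" repeats; a read taken from inside the run fits in several places, while a read that includes unique flanking sequence fits in only one. The function name `placements` and both sequences are hypothetical.

```python
def placements(reference, read):
    """Return every start position at which read matches reference exactly.

    Overlapping matches are included, which is what matters inside a repeat.
    """
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        # Step forward by one base so overlapping matches are not skipped.
        start = reference.find(read, start + 1)
    return positions


reference = "GGCATATATATCC"
print(placements(reference, "ATAT"))  # read from inside the repeat
print(placements(reference, "GCAT"))  # read anchored in unique sequence
```

The repeat-only read matches at three different positions, so there is no way to say which one it came from; the anchored read matches once.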

The importance of these regions is debatable. It is generally agreed that these regions are the least likely to contain any genes. However, Evan Eichler at Case Western Reserve University in Cleveland has discovered arrays of genes at the edges of repetitive regions. He believes that by ignoring these regions, researchers are missing important genes. This may prove to be very important, or of little relevance; we just don’t know yet.


So then, while there is uproar over Celera sequencing the human genome in a matter of months and at less than a tenth of the cost of the public project, neither effort will be complete. But they are also looking for different things. Celera want to uncover as many genes of commercial value as quickly as possible. The HGP aims to sequence as much of the genome as possible and make that information available to all. It is much more thorough than Celera’s approach. That said, Celera will discover more genes in less time; there is a place for both, although for my money, Celera comes across as a rushed job.


theLabRat.com 2005. All Rights Reserved.