At the Human Genome Science Conference, 120 researchers from around the world gathered to design a framework for building a transcriptome database. They hope the database will eventually include every expressed gene sequence in the human genome.
As is well known, the first step of protein synthesis in vivo is transcription, in which the genetic information in a gene (DNA) is copied into messenger RNA (mRNA). In this process, the sequence of the gene is copied out separately from the rest of the genome. The transcriptome is the collective term for all of the mRNA. In recent years, researchers have used mRNA as a template to obtain complementary DNA (cDNA) by reverse transcription, and have then used the cDNA to study the transcription products. mRNA can therefore be studied through cDNA, which is more convenient and easier to handle.
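The relationship between a gene's template strand, its mRNA, and the cDNA obtained by reverse transcription can be illustrated with a minimal Python sketch. It is a simplification that only applies base-pairing rules; the function names are hypothetical, and real transcripts are additionally spliced and processed, which this example ignores.

```python
# Illustrative base-pairing only; splicing, capping and polyadenylation are ignored.
DNA_TO_RNA = str.maketrans("ACGT", "UGCA")  # template DNA base -> complementary RNA base
RNA_TO_DNA = str.maketrans("ACGU", "TGCA")  # RNA base -> complementary DNA base


def transcribe(template_dna: str) -> str:
    """Return the mRNA (written 5'->3') for a template strand written 3'->5'."""
    return template_dna.upper().translate(DNA_TO_RNA)


def reverse_transcribe(mrna: str) -> str:
    """Return the first-strand cDNA (written 5'->3') for an mRNA written 5'->3'.

    The cDNA is the reverse complement of the mRNA, with T in place of U.
    """
    return mrna.upper().translate(RNA_TO_DNA)[::-1]


if __name__ == "__main__":
    mrna = transcribe("TACGGT")      # template strand, 3'->5'
    cdna = reverse_transcribe(mrna)  # first-strand cDNA, 5'->3'
    print(mrna, cdna)
```

Because the cDNA is fully determined by the mRNA sequence, sequencing the cDNA recovers the sequence of the transcript it was copied from.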
At present, most cDNA data can be obtained from a variety of public databases, but many entries are not complete cDNAs, only cDNA fragments. The data also have many defects, such as misclassification and inconsistency, and these defects have hindered the practical use of cDNA data in research. Scientists have therefore long hoped to collect all human cDNA sequences, curate them, and place them in a single database, so that researchers' work can be more standardized and accurate.
At present, researchers identify genes from the human genome sequence. The usual method is to search the whole genome sequence for specific sequences and to infer from them that a fragment is expressed. Such predictions generally contain some errors. If researchers work from cDNA instead, the procedure becomes simpler and more accurate.
To construct the database, the researchers have collected 42,000 cDNA records from six databases around the world. Over the next few months they will map these cDNAs and mark them one by one in 23,000 different regions of the human genome. The researchers have found that several cDNAs may overlap in the same region. Further study of this phenomenon may also reveal one of the mysteries of the human genome: how so few genes can produce such a large variety of proteins, and how they can give rise to the complex differences in human genetic traits.
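The idea of several cDNAs falling in the same genomic region can be sketched as a simple interval-grouping step. This is only an illustration under stated assumptions, not the procedure the researchers use: the `Alignment` record, its field names, and the coordinates in the usage example are hypothetical, and strand is ignored.

```python
from typing import NamedTuple


class Alignment(NamedTuple):
    """Hypothetical record of one cDNA mapped to the genome."""
    cdna_id: str
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def group_overlapping(alignments):
    """Group alignments on the same chromosome whose intervals overlap.

    Returns a list of regions, each a dict with the merged coordinates
    and the IDs of the cDNAs that fall inside it.
    """
    regions = []
    # Sorting by chromosome and start lets one pass merge neighbours.
    for aln in sorted(alignments, key=lambda a: (a.chrom, a.start, a.end)):
        last = regions[-1] if regions else None
        if last and last["chrom"] == aln.chrom and aln.start < last["end"]:
            # Overlaps the current region: extend it and record the cDNA.
            last["end"] = max(last["end"], aln.end)
            last["cdna_ids"].append(aln.cdna_id)
        else:
            # No overlap: start a new region.
            regions.append({
                "chrom": aln.chrom,
                "start": aln.start,
                "end": aln.end,
                "cdna_ids": [aln.cdna_id],
            })
    return regions


if __name__ == "__main__":
    example = [
        Alignment("cDNA_1", "chr1", 100, 500),
        Alignment("cDNA_2", "chr1", 400, 900),
        Alignment("cDNA_3", "chr1", 1500, 1800),
    ]
    for region in group_overlapping(example):
        print(region)
```

Regions that collect more than one cDNA are the cases of interest here, since distinct transcripts from one region are one way a limited number of genes could yield many different products.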