One difference from the CMAP database is necessitated by the multiple origins of the expression profile data, which are represented by multiple probe ID definitions. The GEM-TREND database solves the problem of multiple probe IDs by mapping expression profiles onto UniGene IDs. That database consists of experimental series where samples can be clearly assigned to treatment and control groups. Of course, this is not always the case, and this limits the scope of the database. In compiling the expression database SPIED we sought to loosen the restraints inherent in previous treatments and thereby open up a larger set of data for interrogation. In many expression series there is no clear control/treatment assignment, or there could be multiple alternative reference profile definitions.

To address the problem of generating fold change profiles without reference to a defined control, an effective fold (EF) has been introduced, corresponding to the expression level relative to the experimental series average. In this way, data can be compiled automatically without the need for manual inspection. In cases where the experimental series consists of well-defined multiple treatment and control samples, the fold profiles are usually given by the ratio of the average treatment to average control values. In general this fold profile will have a high positive correlation with the EF profiles from the treatment set and a high negative correlation with those from the control set. In cases where there is no obvious way of separating samples into control and treatment sets, as with samples from multiple organ types or cell types, the EF representation can be viewed as a normalized expression value.
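The effective fold idea can be illustrated with a minimal sketch. The text does not state the exact formula, so the code assumes EF is the log2 ratio of a sample's expression value to the series average for that probe. The function names `effective_fold` and `pearson`, the sample and probe names, and the toy values are all hypothetical and are not taken from SPIED.

```python
import math

def effective_fold(series):
    """Return log2(value / series mean) per sample and probe.

    `series` maps sample name -> {probe ID -> positive expression value}.
    The log2 scale is an assumption; the text only says "relative to
    the experimental series average".
    """
    samples = list(series)
    probes = list(series[samples[0]])
    mean = {p: sum(series[s][p] for s in samples) / len(samples) for p in probes}
    return {s: {p: math.log2(series[s][p] / mean[p]) for p in probes}
            for s in samples}

def pearson(x, y):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy series (made-up values): two controls, two treated samples.
series = {
    "ctrl_1":  {"p1": 100, "p2": 200, "p3": 50, "p4": 80},
    "ctrl_2":  {"p1": 110, "p2": 190, "p3": 55, "p4": 80},
    "treat_1": {"p1": 400, "p2": 100, "p3": 52, "p4": 20},
    "treat_2": {"p1": 380, "p2": 90,  "p3": 50, "p4": 25},
}
probes = ["p1", "p2", "p3", "p4"]
ef = effective_fold(series)

# Conventional fold profile: log2(mean treatment / mean control).
fold = [math.log2((series["treat_1"][p] + series["treat_2"][p]) /
                  (series["ctrl_1"][p] + series["ctrl_2"][p])) for p in probes]

r_treat = pearson(fold, [ef["treat_1"][p] for p in probes])
r_ctrl = pearson(fold, [ef["ctrl_1"][p] for p in probes])
print(r_treat, r_ctrl)
```

On this toy series the conventional fold profile correlates positively with the EF profile of a treated sample and negatively with that of a control sample, which is the behaviour described above. No control/treatment labels are needed to compute the EF values themselves.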

In searching SPIED with a query profile, one does not derive any biological significance from non-correlating profiles, as lack of correlation can be attributed to multiple factors, such as bad experimental data or a genuine lack of biological relevance. Rather, significantly correlating or anti-correlating profiles are posited as having biological significance. The next objective was to reduce the expression profiles to non-redundant EF gene profiles by associating each gene with just one probe ID, so that the database can then be searched with gene set data alone. Here, for a given chip platform, the distribution of each probe ID's EF values across the totality of series was compiled, and each gene was then assigned to the probe having the highest average fold magnitude. The gene names were unambiguously associated with the Entrez human gene list, consisting of 24,764 genes, and these were matched to probe IDs by inspection of the given platform annotation files.
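The probe-to-gene reduction can be sketched as follows. This is only an illustration: it assumes "average fold magnitude" means the mean absolute EF value of a probe over all profiles on a platform. The function `select_probe_per_gene`, the probe IDs, the gene names, and the values are hypothetical.

```python
from collections import defaultdict

def select_probe_per_gene(ef_profiles, probe_to_gene):
    """Pick one probe per gene: the probe with the highest mean |EF|.

    `ef_profiles` is a list of {probe ID -> EF value} dicts, one per sample,
    pooled across all series on a platform. `probe_to_gene` stands in for
    the platform annotation file; probes missing from it are skipped.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for profile in ef_profiles:
        for probe, ef in profile.items():
            sums[probe] += abs(ef)
            counts[probe] += 1

    best = {}  # gene -> (probe, mean |EF|)
    for probe, total in sums.items():
        gene = probe_to_gene.get(probe)
        if gene is None:
            continue  # probe not annotated to a gene in the list
        avg = total / counts[probe]
        if gene not in best or avg > best[gene][1]:
            best[gene] = (probe, avg)
    return {gene: probe for gene, (probe, _) in best.items()}

# Toy annotation and EF profiles (made-up values).
probe_to_gene = {"probe_a1": "GENE_A", "probe_a2": "GENE_A", "probe_b1": "GENE_B"}
ef_profiles = [
    {"probe_a1": 0.2,  "probe_a2": -1.5, "probe_b1": 0.4,  "probe_x": 2.0},
    {"probe_a1": -0.1, "probe_a2": 1.1,  "probe_b1": -0.3, "probe_x": 1.0},
]
chosen = select_probe_per_gene(ef_profiles, probe_to_gene)
print(chosen)
```

Taking the absolute value before averaging matters here: EF values are centred on the series average, so a signed mean would tend towards zero for every probe and could not separate responsive probes from flat ones.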
