Abstract: | Using two yeast and two bacterial microarray datasets, we validate and compare three kinetic models (first-order, second-order, and averaged) for studying causal regulatory relationships between genes. For each regulated gene of interest, we identify the gene with the highest posterior probability of being its dominant regulator. We then check the annotations of these probability-maximizing genes in EchoBASE, GeneDB, and the Saccharomyces Genome Database and note which of them are putative transcription factors. To quantify performance, we estimate the area under the receiver operating characteristic curve (AUC), comparing the probability-maximizing genes that encode putative transcription factors with those that do not. The second-order model performs best overall in validation. One of its AUC estimates is 100%, reflecting the case in which the only putative transcription factor has a higher posterior probability of being the dominant regulator of a gene of interest than any of the other 42 genes. Based on this study, we suggest that the second-order model is better than the other two kinetic models at predicting transcription factors. |
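To make the 100% AUC case concrete, the following is a minimal sketch, not the authors' implementation. It assumes the AUC is estimated as the fraction of (transcription factor, non-transcription factor) pairs in which the transcription factor has the higher posterior probability, with ties counted as one half. The function name `auc` and all probability values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def auc(pos: Sequence[float], neg: Sequence[float]) -> float:
    """Estimate the AUC as the fraction of (positive, negative) pairs in
    which the positive score is higher; ties count as one half."""
    if not pos or not neg:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical posterior probabilities (illustrative values only):
# one putative transcription factor and 42 other genes.
tf_scores = [0.90]
non_tf_scores = [0.10 + 0.01 * i for i in range(42)]  # all below 0.90

# The single transcription factor outranks every other gene, so AUC is 1.0.
print(auc(tf_scores, non_tf_scores))
```

With a single putative transcription factor, the estimate reduces to the fraction of the 42 other genes that it outranks, so it reaches 100% only when it outranks all of them.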