Measurement of gene expression using microarrays has become an extremely important research tool in biology and medicine. However, poor reproducibility of array-based results remains a long-standing issue. Although the cause of the problem has not been firmly identified, platform design and test site have been ruled out in a large-scale study by the MicroArray Quality Control project. In such measurements, prehybridization error (biological variance, or BV), introduced during sample processing (e.g., culture and treatment) and platform-specific sample preparation, and the inherent random error of the technology (technical variance, or TV) are coupled and difficult to quantify separately. Increasing evidence points to BV as the primary cause, but the lack of a method for assessing BV keeps experimentalists in constant doubt about data reliability. Here, we developed a procedure, Measuring Improper Sample Handling (MISH), that assesses the BV of samples and the sensitivity of a data set, and we produced a computer package for its implementation. MISH is a novel, purely statistical procedure that does not require normalization of array signals. For demonstration, we applied MISH to study the BV in 350 public data sets. Part of the result may be taken as a characterization of the BV of the Affymetrix GeneChip Human Genome U133 Plus 2.0 Array platform. We found that BV was the dominant error in the data sets studied and that, for data sets from biological replicates, sample processing introduced the most error.
Our analysis showed that a large number of public cohort data sets had low sensitivity on contrasts, which may well explain why studies of the same disease have yielded highly dissimilar lists of differentially expressed genes (DEGs). This suggests that the reproducibility issue will remain a concern for measurements based on next-generation sequencing, or on any future technology, unless sample processing is improved.