Applying Data Mining Techniques to Construct the Sale Forecast Model for Multiple Function Devices

Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/64555

Title:	Applying Data Mining Techniques to Construct the Sale Forecast Model for Multiple Function Devices (original title: 利用資料探勘技術建立商用複合機銷售預測模型)
Author:	Hung, Yen-Chun (洪彥群)
Contributor:	In-service Master's Program, Department of Information Management
Keywords:	Data Mining; Sales Forecast; Single Classifiers; Multiple Classifiers
Date:	2014-05-15
Uploaded:	2014-08-11 18:38:01 (UTC+8)
Publisher:	National Central University
Abstract:	(Translated from the Chinese abstract) A commercial multifunction device combines copying, printing, faxing, and scanning in a single unit. Its simple, intuitive operation gives users a one-stop service and improves work efficiency; for office purchasers, it reduces the need to buy and deploy other devices, so office floor space can be used more flexibly. Many theses in Taiwan address sales forecasting, but few compare forecasting performance across "continuous versus discrete data" and "single versus multiple classifiers". This study therefore uses real sales data from a case company to identify the tool best suited to the company's needs, and its results are also intended as a reference for academia. The source data were processed step by step: dimension selection, removal of invalid records, dimension consolidation, and data preprocessing. In the experiments, the data were split into a continuous data set and a discrete data set, and different classifiers were run on each with the data mining tool Weka 3.6.9 in an attempt to obtain the best sales forecast model. The discrete data were produced by dividing the case company's monthly sales quantities into 3 classes using a normal-distribution method. To find the influential dimensions in the case company's data, the study further compares the continuous and discrete prediction results obtained on the dimensions selected by PCA (principal component analysis). For continuous data, the study uses four single classifiers (Linear Regression, MultilayerPerceptron, SMOreg, and kNN) and verifies them with the multiple classifiers Additive Regression and Bagging. For discrete data, it uses six single classifiers (MultilayerPerceptron, SMO, LibSVM, kNN, CART, and BayesNet) and verifies them with the multiple classifiers AdaBoost and Bagging. The results show that PCA has little effect on the prediction results for either continuous or discrete data. On continuous data, SMOreg performs best, with the lowest overall error; on discrete data, LibSVM achieves the higher accuracy.
(Author's English abstract) Multiple function devices are office machines that combine e-mail, fax, copy, printing, and scanning functions. They are designed to give users easy and prompt operation. In the literature on data mining applications, very few studies focus on B2B sales forecasting in Taiwan. Moreover, there is no comparative study of the applicability of data mining techniques to different types of forecasting output, namely continuous and discrete prediction outputs. The objective of this thesis is therefore to compare different supervised learning techniques for the sales forecast of multiple function devices. The thesis offers guidelines for the case company's sales forecasting and gives academics a reference on the B2B industry. In the experiments, sales-related attributes are collected from historical data, and the data completeness of each attribute is also taken into account. The historical selling quantity (i.e., continuous values) is used as the prediction output. In addition, the selling quantity is divided into 3 classes by normal distribution for comparison. To examine the effect of feature selection on the forecasting result, PCA (principal component analysis) is used to select more representative attributes from the original data set. For model construction, different single and multiple classification techniques are compared. The experimental results show that feature selection does not significantly affect the final prediction results for either continuous or discrete prediction output. For continuous prediction without PCA, the support vector machine (SVM) performs best in terms of MAE (mean absolute error). For discrete prediction without PCA, the SVM outperforms the other models in terms of prediction accuracy.
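Two steps named in the abstract, dividing the selling quantity into 3 classes by normal distribution and scoring continuous predictions by MAE, can be illustrated with a minimal Python sketch. This is not the thesis's Weka workflow. The cut points (mean ± 1 standard deviation), the function names, the class labels, and the sample numbers are all assumptions made for illustration; the abstract does not state the thresholds actually used.

```python
from statistics import mean, pstdev


def discretize_by_normal(values, k=1.0):
    """Label each value 'low', 'medium', or 'high' using mean ± k*std cut points.

    Assumption: k = 1 standard deviation. The abstract only says the classes
    were formed "by normal distribution" and gives no thresholds.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation of the sample
    lower, upper = mu - k * sigma, mu + k * sigma
    labels = []
    for v in values:
        if v < lower:
            labels.append("low")
        elif v > upper:
            labels.append("high")
        else:
            labels.append("medium")
    return labels


def mean_absolute_error(actual, predicted):
    """MAE: the average of |actual - predicted| over all observations."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical monthly selling quantities (made up, not from the thesis).
monthly_sales = [12, 15, 14, 30, 13, 5, 16, 14]
labels = discretize_by_normal(monthly_sales)
print(labels)

# Hypothetical continuous forecasts for three months.
print(mean_absolute_error([10, 20, 30], [12, 18, 33]))
```

With these made-up numbers, only the month with 30 units falls above the upper cut point and only the month with 5 units falls below the lower one; a wider or narrower `k` changes how balanced the three classes are, which in turn affects how classification accuracy should be read.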
Appears in Collections:	[In-service Master's Program, Department of Information Management] Theses and Dissertations

Files in This Item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	636

All items in NCUIR are protected by original copyright.
