Thesis title (English): The study of the Exponentiated Geometric distribution and related applications
Advisor (English): Lee, Chiang-Sheng
Oral defense committee (English): Lee, Chiang-Sheng; Shi-Woei Lin
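In statistics, numerical data fall into two classes: continuous data and count data. By definition, continuous data can take any numerical value, such as height or weight, whereas count data are integer-valued, such as the number of students in a class.
In past data studies, researchers have often used traditional discrete distributions to analyze ordinary count data. In practice, however, data frequently contain a large number of zero observations, which statistics calls zero-inflated data. Studies have found that traditional discrete distributions are not suitable for this type of data, so various authors have proposed different methods for analyzing it.
The main purpose of this thesis is to present the statistical properties of the exponentiated geometric distribution and its related applications. The distribution was first proposed by Nadarajah & Baker (2015), but that article contains a number of erroneous discussions; we correct the errors in the original article and extend the study of the distribution.
This thesis collects different types of zero-inflated data, including automobile insurance claim counts, the number of hospitalizations per family, and fetal lamb movement counts, to verify whether the exponentiated geometric distribution is applicable, and compares its goodness of fit with that of ZIP, ND, and Adjusted GPD.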
In statistics, numerical data are divided into two classes: continuous data and count data. By definition, continuous data can take any value, such as height and weight. Count data can take only integer values, such as the number of students in a class.
In earlier data research, practitioners usually used traditional discrete distributions to analyze ordinary count data. In practice, however, data often contain a large number of zero observations; in statistics these are called zero-inflated data. Research has found that traditional discrete distributions are not suitable for this kind of data, so different authors have proposed different methods to analyze it.
This paper aims to present the exponentiated geometric distribution and its related applications. The distribution was originally proposed by Nadarajah & Baker (2015), but that article contains some incorrect discussions. We correct the mistakes in the original paper and extend the study of the distribution.
In this paper, besides discussing in detail the statistical properties of the exponentiated geometric distribution (CDF, PMF, hazard function, order statistics, quantile), we also examine the revised distribution to judge whether it is suitable for zero-inflated data and whether it fits better than the distributions commonly used before.
This paper collects different kinds of zero-inflated data, such as automobile claim data, hospitalization data, and fetal movement data, to verify whether the exponentiated geometric distribution is suitable. We also compare the goodness of fit of the exponentiated geometric distribution with that of ZIP, ND, and Adjusted GPD.
The results show that the exponentiated geometric distribution is suitable for zero-inflated data and can serve as an alternative for analyzing such data.
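The statistical properties named in the abstract (CDF, PMF, quantile) can be illustrated with a minimal Python sketch. It assumes the usual exponentiation construction applied to a geometric distribution on {0, 1, 2, ...}, that is, F(x) = (1 - (1 - p)^(x+1))^alpha. This parameterization, the symbols `p` and `alpha`, and the function names are assumptions made for illustration; the corrected form derived in the thesis may differ.

```python
import math


def eg_cdf(x, p, alpha):
    """CDF under the ASSUMED form F(x) = (1 - (1 - p)**(x + 1))**alpha, x = 0, 1, 2, ..."""
    if x < 0:
        return 0.0
    k = int(math.floor(x))
    return (1.0 - (1.0 - p) ** (k + 1)) ** alpha


def eg_pmf(x, p, alpha):
    """PMF obtained as the difference of consecutive CDF values."""
    return eg_cdf(x, p, alpha) - eg_cdf(x - 1, p, alpha)


def eg_quantile(u, p, alpha):
    """Smallest integer x >= 0 with F(x) >= u, for 0 <= u < 1."""
    if not 0.0 <= u < 1.0:
        raise ValueError("u must satisfy 0 <= u < 1")
    q = 1.0 - p
    # Closed-form starting point from inverting the assumed CDF.
    x = max(math.ceil(math.log(1.0 - u ** (1.0 / alpha)) / math.log(q) - 1.0), 0)
    # Correct any off-by-one caused by floating-point rounding.
    while x > 0 and eg_cdf(x - 1, p, alpha) >= u:
        x -= 1
    while eg_cdf(x, p, alpha) < u:
        x += 1
    return x
```

Under this assumed form, P(X = 0) = p^alpha, so choosing alpha < 1 places more mass at zero than the plain geometric distribution (alpha = 1), which is the kind of behavior relevant to zero-inflated counts.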
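Oral Defense Committee Approval Certificate
Acknowledgements
Chinese Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 Literature Review
2.1 Common distributions for zero-inflated data
Chapter 3 Research Methods
3.1 Statistical properties of the exponentiated geometric distribution
3.1.1 Probability function, cumulative distribution function, and related plots
3.1.2 Reliability function, hazard function, and related plots
3.1.3 Quantile
3.1.4 Random variate generation
3.1.5 Expansion of the exponentiated geometric distribution
3.1.6 Moment functions
3.1.7 Order statistics
3.2 Parameter estimation for the exponentiated geometric distribution
3.3 Model comparison methods
Chapter 4 Data Analysis and Results
4.1 Data study
Chapter 5 Conclusion
5.1 Conclusion
5.2 Research limitations
5.3 Suggestions for future research
References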
