Logistic regression: summary notes
Posted by: Chih-Hao Chang
Monday, January 26, 2015
ln(p / (1 - p)) = b0 + b1X
p = e^(b0 + b1X) / (1 + e^(b0 + b1X))
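As a quick numerical check of the two formulas above, the sketch below turns a linear predictor into a probability and then takes the log-odds of that probability to get the linear predictor back. The coefficient values and function names are made up for illustration and do not come from the book.

```python
import math

def predicted_probability(b0, b1, x):
    """p = e^(b0 + b1*x) / (1 + e^(b0 + b1*x))."""
    z = b0 + b1 * x
    return math.exp(z) / (1.0 + math.exp(z))

def logit(p):
    """ln(p / (1 - p)), i.e. the log-odds of p."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficients and predictor value, for illustration only.
b0, b1 = -1.5, 0.8
x = 2.0

p = predicted_probability(b0, b1, x)
print(p)         # a probability between 0 and 1
print(logit(p))  # recovers b0 + b1*x, up to rounding
```

The two functions are inverses of each other, which is why the model can be written either on the log-odds scale or on the probability scale.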
According to the book, the following can be reported for a logistic regression:
beta values (B), standard errors, p-values, R square (Hosmer & Lemeshow) and goodness-of-fit statistics, odds ratios, confidence intervals, and the constant.
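Several of the reported quantities can be derived from just the beta value and its standard error. The sketch below assumes a Wald-type test with a normal approximation and a 95% confidence interval built on the log-odds scale and then exponentiated; the function name and the input numbers are hypothetical.

```python
import math

def summarize_coefficient(b, se, z_crit=1.96):
    """Derive commonly reported statistics from a coefficient and its SE.

    Assumes a Wald test with a normal approximation (an assumption of this
    sketch, not something taken from the original notes).
    """
    wald_z = b / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(wald_z) / math.sqrt(2.0))
    return {
        "wald_z": wald_z,
        "p_value": p_value,
        "odds_ratio": math.exp(b),
        "ci_low": math.exp(b - z_crit * se),
        "ci_high": math.exp(b + z_crit * se),
    }

# Hypothetical coefficient and standard error.
result = summarize_coefficient(b=0.8, se=0.3)
for name, value in result.items():
    print(name, round(value, 4))
```

If the confidence interval for the odds ratio excludes 1, the predictor is significant at the corresponding level under these assumptions.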
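Omnibus Tests of Model Coefficients table
Check the chi-square to see whether the model fits: p < 0.05 means the model is acceptable, i.e. it predicts significantly better than the baseline model. When comparing several different models, however, look for the most parsimonious model.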
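The chi-square in this table is the drop in -2 log-likelihood (-2LL) from the baseline model to the new model. The sketch below computes it and its p-value for the simplest case of one added predictor (1 degree of freedom), where the chi-square tail can be obtained from the normal distribution. The -2LL values are invented for illustration, and the helper names are hypothetical; for more than one degree of freedom a statistics library would be needed.

```python
import math

def omnibus_chi_square(neg2ll_baseline, neg2ll_new):
    """Model chi-square: improvement in -2LL over the baseline model."""
    return neg2ll_baseline - neg2ll_new

def chi_square_p_value_df1(chi_square):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    Uses the identity chi2(1) = Z^2, so P(chi2 > x) = P(|Z| > sqrt(x)).
    """
    return math.erfc(math.sqrt(chi_square / 2.0))

# Hypothetical -2LL values for a baseline model and a one-predictor model.
neg2ll_baseline = 100.0
neg2ll_new = 80.0

chi_sq = omnibus_chi_square(neg2ll_baseline, neg2ll_new)
p_value = chi_square_p_value_df1(chi_sq)
print(chi_sq, p_value)
```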
odds = P(cure) / (1 - P(cure))
odds ratio (the exponential of B, exp(B)) = (odds after a one-unit change in the predictor) / (original odds)
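To see why the odds ratio equals exp(B), the sketch below computes the odds at two predictor values one unit apart and compares their ratio with exp(b1). The coefficients are hypothetical, and "cure" is only the example outcome used in the formulas above.

```python
import math

def prob_cure(b0, b1, x):
    """Predicted probability of the outcome under the logistic model."""
    z = b0 + b1 * x
    return math.exp(z) / (1.0 + math.exp(z))

def odds(p):
    """odds = P(cure) / (1 - P(cure))."""
    return p / (1.0 - p)

# Hypothetical coefficients, for illustration only.
b0, b1 = -1.5, 0.8

original_odds = odds(prob_cure(b0, b1, 1.0))
odds_after_unit_change = odds(prob_cure(b0, b1, 2.0))

odds_ratio = odds_after_unit_change / original_odds
print(odds_ratio, math.exp(b1))  # the two values agree up to rounding
```

The result does not depend on the starting value of the predictor: any one-unit increase multiplies the odds by the same factor.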
R square = (-2LL(baseline) - (-2LL(new))) / (-2LL(baseline))
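The R square formula above can be tried out on a tiny data set. In this sketch the outcomes and the predicted probabilities of the "new" model are invented, and the baseline model is assumed to be the constant-only model, which predicts the overall proportion of cases for everyone. The function names are hypothetical.

```python
import math

def neg2ll(y, p):
    """-2 * log-likelihood for binary outcomes y and predicted probabilities p."""
    return -2.0 * sum(
        yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
        for yi, pi in zip(y, p)
    )

def r_square_hl(neg2ll_baseline, neg2ll_new):
    """R square = (-2LL(baseline) - (-2LL(new))) / (-2LL(baseline))."""
    return (neg2ll_baseline - neg2ll_new) / neg2ll_baseline

# Hypothetical outcomes and predicted probabilities from some fitted model.
y = [0, 0, 1, 1]
p_new = [0.2, 0.3, 0.7, 0.8]
# Baseline (constant-only) model: same probability for every case.
p_baseline = [sum(y) / len(y)] * len(y)

r2 = r_square_hl(neg2ll(y, p_baseline), neg2ll(y, p_new))
print(round(r2, 4))
```

The value is 0 when the new model is no better than the baseline and approaches 1 as the new model's -2LL approaches 0.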
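Ideally, every cell (each combination of the variables' categories) should contain cases.
Complete separation: if a single predictor can perfectly predict the outcome, logistic regression is not suitable.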
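A simple way to screen for complete separation is to check whether some cut-off on a predictor splits the two outcome groups with no overlap. The sketch below does this for a single numeric predictor and a 0/1 outcome; the function name and the data are hypothetical, and it does not cover separation by a combination of predictors.

```python
def is_completely_separated(x, y):
    """Return True if one threshold on x perfectly splits outcomes 0 and 1."""
    x_for_0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x_for_1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x_for_0 or not x_for_1:
        # Only one outcome class is present, so there is nothing to separate.
        return False
    # Separated if all values of one group lie strictly below the other group.
    return max(x_for_0) < min(x_for_1) or max(x_for_1) < min(x_for_0)

# Hypothetical data: the first predictor separates the outcome, the second does not.
print(is_completely_separated([1, 2, 3, 4], [0, 0, 1, 1]))  # True
print(is_completely_separated([1, 2, 3, 4], [0, 1, 0, 1]))  # False
```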
Each predictor needs 10 cases (? or 50?)
Methods of regression
Enter: every variable is put into the model.
Forward, backward: as the names suggest, variables are added to the model or removed from it.
Differences between logistic regression and discriminant analysis
Discriminant analysis: requires normally distributed independent variables
Logistic regression: requires a sufficiently large sample
Multivariate analysis: two or more dependent variables
Multivariable analysis: multiple independent variables
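Common regression mistakes (from the book 聰明學統計):
analyzing a nonlinear relationship, treating correlation as causation, reverse causality, omitted variable bias, highly correlated explanatory variables, extrapolating beyond the range of the data, and data mining (too many variables)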
References:
Discovering Statistics using IBM SPSS Statistics (4th edition)
Exploratory regression analysis: A tool for selecting models and determining predictor importance
What is the difference between Logistic Regression and Discriminant Analysis?
Multivariate or Multivariable Regression?
2015-8-10 update:
Why does a previously significant variable become non-significant after another variable is added?