Discrete Choice Models (PPT slides)

Uploaded by: zhu****ng · Document ID: 225768295 · Uploaded: 2023-08-03 · Format: PPT · Pages: 50 · Size: 490.47 KB

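Discrete Choice Models
Wang Zhigang, 2003.12

Linear probability model (LPM)

- $Y_i = E(Y_i \mid X_i) + u_i$, with $p_i = P(Y_i = 1 \mid X_i)$.
- Y can take only two values: it is 1 with probability $p_i$ and 0 with probability $1 - p_i$, so
  $E(Y_i \mid X_i) = 1 \cdot p_i + 0 \cdot (1 - p_i) = p_i = \beta_1 + \beta_2 X_i$.
- This means we can rewrite the model as $Y_i = \beta_1 + \beta_2 X_i + u_i$.
- Marginal effects: $\beta_k$.
- Problems:
  1) the fitted $p_i$ is not confined to the unit interval ($p_i < 0$ or $p_i > 1$ can occur);
  2) the error is heteroskedastic: $\mathrm{var}(u_i \mid X_i) = (\beta_1 + \beta_2 X_i)(1 - \beta_1 - \beta_2 X_i)$.

Probit model and logit model

Rough relationship among the estimated coefficients of the three models

- LPM × 2.5 ≈ probit
- LPM × 4 ≈ logit
- Probit × 1.6 ≈ logit
- These are only rough relationships.

Estimation method: maximum likelihood (MLE)

Goodness-of-fit measures

- LogL1 is the maximized log likelihood of the model of interest; LogL0 is the maximized log likelihood when every parameter except the intercept is 0. Clearly LogL0 ≤ LogL1 ≤ 0.
- Percent correctly predicted: if the predicted probability exceeds 0.5, $Y_i$ is predicted as 1, otherwise as 0. The share of observations whose predicted $Y_i$ matches the actual $Y_i$ is the percent correctly predicted.

Meaning of the estimated probit coefficients

Odds and odds ratios from the estimated model

- Odds ratio for a continuous regressor
- Odds ratio for a discrete regressor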
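The rough conversion factors between LPM, probit and logit coefficients can be motivated by comparing marginal effects near the middle of the distribution. The sketch below assumes the standard forms $p = \Phi(x'\beta)$ for probit and $p = 1/(1 + e^{-x'\beta})$ for logit (the slide's own formulas did not survive extraction). Under that assumption the marginal effect of $x_k$ is $\beta_k$ times the density of the index, while in the LPM it is $\beta_k$ itself. Equating the effects at index 0 gives factors close to 2.5, 4 and 1.6. The function and key names are illustrative only.

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal density: the scale of the probit marginal effect."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def logistic_pdf(z: float) -> float:
    """Logistic density p(1 - p): the scale of the logit marginal effect."""
    p = 1.0 / (1.0 + math.exp(-z))
    return p * (1.0 - p)


def rule_of_thumb_factors(z: float = 0.0) -> dict:
    """Factors that equate LPM, probit and logit marginal effects at index z."""
    phi, lam = normal_pdf(z), logistic_pdf(z)
    return {
        "lpm_to_probit": 1.0 / phi,    # beta_probit is roughly factor * beta_lpm
        "lpm_to_logit": 1.0 / lam,     # beta_logit is roughly factor * beta_lpm
        "probit_to_logit": phi / lam,  # beta_logit is roughly factor * beta_probit
    }


factors = rule_of_thumb_factors()
for name, value in factors.items():
    print(f"{name}: {value:.3f}")
```

The factors are exact only at index 0, that is, at a predicted probability of 0.5. When most predicted probabilities lie far from 0.5 the two densities shrink at different rates, which is one reason the slide calls the relationship only approximate.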

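Ordered response models (probit as the example)

- Identification problem: different parameter combinations can produce the same maximized likelihood, as long as certain ratios among the parameters are preserved;
- so a normalization is required.

Multinomial choice models

- When there are several alternatives that do not differ in degree and involve no ordering, use the multinomial probit or multinomial logit model.
- The error terms are assumed independent. This means that, conditional on the observables, the utilities of any two alternatives are independent. The assumption breaks down when some alternatives are similar. Take the choice of transport mode: car, boat, or plane. Suppose people also have colour preferences, so each mode is further split into red, yellow and blue variants (six alternatives in the slide's count). Alternatives within the same transport mode are then similar to each other, so the multinomial choice model is not appropriate and a nested (hierarchical) choice model should be used instead.

Probit model using Stata

    probit depvar [indepvars] [weight] [if exp] [in range] [, level(#) nocoef noconstant robust cluster(varname) score(newvarname) asis offset(varname) maximize_options]
    dprobit depvar indepvars [weight] [if exp] [in range] [, at(matname) classic probit_options]
    predict [type] newvarname [if exp] [in range] [, {p | xb | stdp} rules asif nooffset]

Explanation

- probit fits the probit model by maximum likelihood.
- dprobit fits the same model by maximum likelihood but reports the change in probability for an infinitesimal change in each continuous independent variable (the default), and the discrete change for dummy variables.
- score(newvarname): creates a new variable holding the score.
- asis: requests that all specified variables and observations be retained in the maximization process.
- offset(varname): specifies a variable to be included in the model with its coefficient constrained to be 1.

Explanation for options

Options for dprobit

- at(matname): specifies the point around which the transformation of results is made. The default is to perform the transformation around x-bar, the sample means.
- classic: requests that the mean effects be calculated using the formula for continuous variables; without it, dummy variables use the discrete change from 0 to 1.

Prediction:

- p: probability of a positive outcome.
- xb: the linear prediction.
- stdp: standard error of the linear prediction.

Output after dprobit

- ereturn list (run after dprobit) displays the stored results, among them:
- e(xbar): the index value at the sample means (its normal distribution function value is the corresponding predicted probability);
- e(r2_p): pseudo R2;
- e(pbar): fraction of successes observed in the data.
- e(dfdx): marginal effects.
- e(se_dfdx): standard errors of the marginal effects.

Example: union membership model

- Dependent variable: union membership, union
- Regressors: potential experience potexp, experience squared exp2, years of education grade, marital status married, degree of unionization high.
- The model is estimated by the linear probability model, probit (with mfx), dprobit, and logit. The results follow.

LPM results

    . reg union potexp exp2 grade married high   /* LPM */

       union |     Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    ---------+----------------------------------------------------------------
      potexp |  .0200388   .0038969     5.14   0.000     .0123916    .0276859
        exp2 | -.0003706   .0000819    -4.53   0.000    -.0005313   -.0002099
       grade | -.0124636   .0051005    -2.44   0.015    -.0224725   -.0024547
     married |  .0133428    .030001     0.44   0.657    -.0455298    .0722153
        high |  .1439396   .0256785     5.61   0.000     .0935492    .1943299
       _cons |  .1021368   .0749337     1.36   0.173    -.0449096    .2491832

Probit results

    Probit estimates        Number of obs  = 1000
                            LR chi2(5)     = 93.09
                            Prob > chi2    = 0.0000
    Log likelihood = -475.2514        Pseudo R2 = 0.0892

       union |     Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ---------+----------------------------------------------------------------
      potexp |  .0835091   .0156087     5.35   0.000     .0529166    .1141016
        exp2 | -.0015308   .0003179    -4.82   0.000    -.0021538   -.0009078
       grade |  -.042078   .0189089    -2.23   0.026    -.0791388   -.0050171
     married |  .0622516   .1125836     0.55   0.580    -.1584083    .2829115
        high |  .5612953    .099662     5.63   0.000     .3659613    .7566292
       _cons | -1.468412   .2958112    -4.96   0.000    -2.048192   -.8886332

Using mfx after probit estimation

    Marginal effects after probit
          y = Pr(union) (predict) = .19047616

    variable |     dy/dx   Std. Err.     z    P>|z|   [    95% C.I.    ]        X
    ---------+---------------------------------------------------------------------
      potexp |  .0226964     .00415    5.47   0.000    .014557   .030836   18.884
        exp2 |  -.000416     .00008   -4.90   0.000   -.000583   -.00025  519.882
       grade | -.0114361     .00514   -2.23   0.026   -.021506  -.001366   13.014
    married* |  .0167881     .03011    0.56   0.577   -.042234    .07581     .641
       high* |  .1470987      .0247    5.96   0.000    .098687   .195511     .568
    (*) dy/dx is for discrete change of dummy variable from 0 to 1

Marginal effects estimated with dprobit (same results as on the previous slide)

       union |     dF/dx   Std. Err.     z    P>|z|     x-bar   [    95% C.I.    ]
    ---------+---------------------------------------------------------------------
      potexp |  .0226964   .0041529    5.35   0.000    18.884    .014557   .030836
        exp2 |  -.000416    .000085   -4.82   0.000   519.882   -.000583   -.00025
       grade | -.0114361   .0051379   -2.23   0.026    13.014   -.021506  -.001366
    married* |  .0167881   .0301137    0.55   0.580      .641   -.042234    .07581
       high* |  .1470987   .0247005    5.63   0.000      .568    .098687   .195511
    ---------+---------------------------------------------------------------------
      obs. P |      .216
     pred. P |  .1904762  (at x-bar; this is the default, other points can be specified)
    (*) dF/dx is for discrete change of dummy variable from 0 to 1
        z and P>|z| are the test of the underlying coefficient being 0

Marginal index effects vs marginal probability effects evaluated at the sample mean

- Marginal index effect: the estimated coefficient itself;
- The marginal probability effect at the sample mean equals the estimated coefficient multiplied by the probability density function evaluated at the sample mean.

Results, using education as the example

    scalar pdfxallbar = normden(e(xbar))

  (Note: this form applies when dprobit was run without at(.). With at(.), write: scalar pdfxallbar = normden(e(at)).)

    lincom _b[grade]*pdfxallbar      (run after dprobit)

    ( 1)  .2717829 grade = 0

       union |     Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ---------+----------------------------------------------------------------
         (1) | -.0114361     .00513    -2.23   0.026      -.02150     -.00136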
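The relation between the index coefficients and the marginal probability effects can be checked by hand. The sketch below is written in Python rather than Stata and uses only numbers printed in the probit and dprobit outputs for the union example: pred. P = .1904762 and the probit coefficients of the continuous regressors. It recovers the index at the sample mean as $\Phi^{-1}(\text{pred. P})$, evaluates the normal density there, and multiplies by each coefficient. Variable names are illustrative. The dummies married and high are left out because their reported effects are discrete changes from 0 to 1, not derivatives.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

pred_p = 0.1904762                           # pred. P at x-bar (dprobit output)
index_at_mean = std_normal.inv_cdf(pred_p)   # x-bar * b, because pred. P = Phi(x-bar * b)
pdf_at_mean = std_normal.pdf(index_at_mean)  # plays the role of normden(e(xbar))

# Probit index coefficients and the dF/dx values reported by dprobit.
probit_coef = {"potexp": 0.0835091, "exp2": -0.0015308, "grade": -0.042078}
reported_dfdx = {"potexp": 0.0226964, "exp2": -0.000416, "grade": -0.0114361}

# Marginal probability effect = coefficient * density at the sample-mean index.
marginal_effect = {name: b * pdf_at_mean for name, b in probit_coef.items()}

print(f"density at the sample-mean index: {pdf_at_mean:.7f}")
for name, effect in marginal_effect.items():
    print(f"{name}: computed {effect:.7f}, reported {reported_dfdx[name]:.7f}")
```

The computed density agrees with the factor .2717829 shown in the lincom output, and the products agree with the reported dF/dx values up to rounding.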

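In the slide's colour coding, blue marks the probability partial-effect coefficients and green marks the index partial-effect (probit) coefficients. See the program notes for details.

t tests and F tests

    . test potexp=grade
    ( 1)  potexp - grade = 0
          chi2(1) = 24.58;  Prob > chi2 = 0.0000

    . test potexp grade exp2 married high
    ( 1)  potexp = 0
    ( 2)  grade = 0
    ( 3)  exp2 = 0
    ( 4)  married = 0
    ( 5)  high = 0
          chi2(5) = 81.40;  Prob > chi2 = 0.0000

Ordered probit model

    oprobit depvar [varlist] [weight] [if exp] [in range] [, table robust cluster(varname) score(newvarlist) level(#) maximize_options]
    predict [type] newvarname(s) [if exp] [in range] [, {p | xb | stdp} outcome(outcome) nooffset]

Options:

- k: number of categories.
- table: displays a table showing how the probabilities are computed from the fitted equation.
- cluster: specifies that observations are independent across groups (clusters) but not necessarily within groups; it implies robust.
- score: the first new variable is $\partial \ln L_j / \partial (x_j b)$; the second is $\partial \ln L_j / \partial(\text{\_cut1})$; and so on, up to the k-th, $\partial \ln L_j / \partial(\text{\_cut}(k-1))$.

Options for prediction

- p: the default; calculates predicted probabilities. If outcome() is not specified, you must specify k new variables, where k is the number of categories of the dependent variable; if outcome() is specified, you must specify one new variable.
- xb, stdp, nooffset: the same as for probit.
- outcome(outcome): specifies the outcome for which the predicted probabilities are to be computed. It should contain either a single value of the dependent variable or one of #1, #2, ..., where #1 represents the first category of the dependent variable.

Example: willingness to pay for natural resources

- Each respondent faces three bids: an initial bid, plus an upper and a lower bid. The latent variable is the individual's unobserved willingness to pay. Dependent variable: depvar. Regressors: age and income, both grouped variables.

Output

    Ordered probit estimates          LR chi2(3)   = 37.90
                                      Prob > chi2  = 0.0000
    Log likelihood = -359.24085       Pseudo R2    = 0.0501

      depvar |     Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ---------+----------------------------------------------------------------
         age | -.1750912   .0437968    -4.00   0.000    -.2609313   -.0892511
      female | -.2436467    .128588    -1.89   0.058    -.4956745    .0083811
      income |  .1507535   .0507301     2.97   0.003     .0513242    .2501827

      depvar |  Probability                      Observed
    ---------+---------------------------------------------
           1 |  Pr(xb+u < _cut1)                   0.3942
           2 |  Pr(_cut1 < xb+u < _cut2)           0.0577
           3 |  Pr(_cut2 < xb+u < _cut3)           0.3622
           4 |  Pr(_cut3 < xb+u)                   ...

[Text missing in the source: the end of the ordered probit output and the start of the logit output for the union example.]

    Prob > chi2 = 0.0000
    Log likelihood = -475.55411       Pseudo R2 = 0.0886

       union | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ---------+-----------------------------------------------------------------
      potexp |    1.15882   .0325594     5.25   0.000      1.09673    1.224425
        exp2 |   .9973167   .0005639    -4.75   0.000     .9962121    .9984225
       grade |   .9320947   .0299594    -2.19   0.029     .8751866    .9927031
     married |   1.122393   .2208633     0.59   0.557     .7632141    1.650606
        high |   2.664832   .4798005     5.44   0.000     1.872457     3.79252

Pearson goodness-of-fit test

    . lfit
    Logistic model for union, goodness-of-fit test
          number of observations       = 1000
          number of covariate patterns = 556
          Pearson chi2(550)            = 529.72
          Prob > chi2                  = 0.7255

- The test does not reject adequate fit (p = 0.7255), so the model is taken to fit the data adequately.

Correct prediction rate

    . lstat
    Logistic model for union

                 -------- True --------
    Classified |       D        ~D |   Total
    -----------+-------------------+--------
         +     |       0         3 |       3
         -     |     216       781 |     997
    -----------+-------------------+--------
       Total   |     216       784 |    1000

    Correctly classified                 78.10%

- 78.10% = 1 - (216 + 3)/1000

Receiver operating characteristic (ROC) curve

    Logistic model for union: area under ROC curve = 0.7083

ROC

- Plot sensitivity against (1 - specificity) as the cutoff c varies, and calculate the area under the curve;
- the point (0,0) corresponds to c = 1; the point (1,1) corresponds to c = 0;
- Horizontal axis: 1 - specificity, where specificity is the fraction of observed negative-outcome cases that are correctly classified.
- Vertical axis: sensitivity, the fraction of observed positive-outcome cases that are correctly classified.
- The more explanatory power the model has, the more the curve bows away from the diagonal. A model with no explanatory power lies on the 45-degree line (area 0.5); a model with perfect explanatory power has area 1.
- In this example the model's explanatory power is acceptable (area 0.7083).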
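The lstat summary can be reproduced from the four cell counts of the classification table. The sketch below (Python, with illustrative variable names) computes the share correctly classified, the sensitivity and the specificity at the 0.5 cutoff, using only the counts printed in the table.

```python
# Cell counts from the lstat table (cutoff 0.5).
# Rows are the model's classification, columns are the observed outcome.
true_pos = 0     # classified +, observed union member (D)
false_pos = 3    # classified +, observed non-member (~D)
false_neg = 216  # classified -, observed union member
true_neg = 781   # classified -, observed non-member

total = true_pos + false_pos + false_neg + true_neg

accuracy = (true_pos + true_neg) / total         # share correctly classified
sensitivity = true_pos / (true_pos + false_neg)  # share of observed positives caught
specificity = true_neg / (true_neg + false_pos)  # share of observed negatives caught

print(f"correctly classified: {accuracy:.2%}")
print(f"sensitivity: {sensitivity:.2%}")
print(f"specificity: {specificity:.2%}")
```

At this cutoff the model classifies no union member correctly, and its 78.10% is slightly below the 78.4% (784/1000) that would result from classifying everyone as a non-member. A single cutoff therefore says little about discrimination here, which is why the area under the ROC curve, computed over all cutoffs, is the more informative summary.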

dbeta: a distance statistic for detecting outliers

Wald test

    . test exp2
    ( 1)  exp2 = 0
          chi2(1) = 22.58;  Prob > chi2 = 0.0000

    . test potexp=grade
    ( 1)  potexp - grade = 0
          chi2(1) = 24.12;  Prob > chi2 = 0.0000

Logit model

    logit depvar [indepvars] [weight] [if exp] [in range] [, nocoef noconstant robust cluster(varname) score(newvarlist) level(#) maximize_options]
    predict [type] newvarname [if exp] [in range] [, statistic rules asif nooffset]

Statistic is:

- p | xb | stdp: the same as for probit.

Predict

- dbeta: Pregibon (1981) influence statistic.
- deviance: deviance residual; the larger it is, the worse the fit.
- dx2: Hosmer and Lemeshow influence statistic.
- ddeviance: Hosmer and Lemeshow influence statistic.
- hat: Pregibon leverage.
- residuals: Pearson residuals.
- rstandard: standardized Pearson residuals; in large samples they follow the N(0,1) distribution.
- or (estimation option): reports the estimated coefficients transformed into odds ratios, exp(beta) instead of beta.

Mlogit procedure

    mlogit depvar [indepvars] [weight] [if exp] [in range] [, basecategory(#) constraints(clist) noconstant robust cluster(varname) score(newvarlist) rrr level(#) maximize_options]
    predict [type] newvarname(s) [if exp] [in range] [, {p | xb | stdp | stddp} outcome(outcome)]

Explanation

- basecategory(#): specifies the value of depvar to be treated as the base category; the default is the most frequent category.
- constraints(clist): linear constrained estimation.
- score(newvarlist): creates k-1 variables, where k is the number of observed outcomes; the m-th is $\partial \ln L_j / \partial (x_j b_m)$.
- Predict: p, xb, stdp as before; stddp calculates the standard error of the difference in two linear predictions and requires outcome(outcome), e.g. predict sed1, stddp outcome(1,3)
- rrr: reports the estimated coefficients transformed to relative risk ratios, i.e. exp(b) instead of b.

Example: choice of industry

- The data are the same as in the union example; the first sector (natural resources) is taken as the base category;
- Dependent variable: industry choice ind1;
- Regressors: potexp, exp2, grade, married, union, high

Multinomial logit model results: output

    (earlier outcomes omitted)
    13       |
      potexp | -.2179493   .0665262    -3.28   0.001    -.3483383   -.0875603
        exp2 |   .004946   .0015428     3.21   0.001     .0019222    .0079698
       union |  .7321836   .8778095     0.83   0.404    -.9882914    2.452659
       grade |   .376377   .0877074     4.29   0.000     .2044736    .5482805
        high | -59.32135    4738540    -0.00   1.000     -9287427     9287308
     married | -.8431111   .4845398    -1.74   0.082    -1.792792    .1065695
       _cons |  22.52589   1.349662    16.69   0.000      19.8806    25.17118
    ---------+----------------------------------------------------------------
    14       |
      potexp | -.1024596   .1043477    -0.98   0.326    -.3069773     .102058
        exp2 |  .0023012   .0021146     1.09   0.276    -.0018433    .0064458
       union |    .47805   .5203087     0.92   0.358    -.5417363    1.497836
       grade |  .3177647   .1038616     3.06   0.002     .1141998    .5213296
        high |  .5012028   1.846817     0.27   0.786    -3.118492    4.120898
     married | -1.060803   .7061995    -1.50   0.133    -2.444929    .3233226
       _cons | -2.141218          .
    (Outcome ind1==1 is the comparison group)

Example: choice of industry (logit ..., rrr)

             |  rrr (odds ratio)
    ---------+----------------------------------------------------------------
    14       |
      potexp |  .8988697   .0933506    -1.03   0.305     .7333253    1.101785
        exp2 |  1.001984   .0021047     0.94   0.345     .9978678    1.006118
       union |  1.284149   .6549112     0.49   0.624     .4726132    3.489193
        high |  1.233808   1.457804     0.18   0.859     .1217621    12.50211
     married |  .4224411    .295379    -1.23   0.218     .1072975    1.663193
    (Outcome ind1==1 is the comparison group)

Other procedures

- Ordered logit: ologit
- Conditional logistic regression: clogit
- Weighted least squares for grouped data: glogit; gprobit
- Nested logit model: nlogit
- Panel logit/probit: xtlogit, xtprobit
- SAS/STAT: PROC LOGISTIC

Recommended references:

1. Logistic Regression Models: Methods and Applications (Logistic 回归模型:方法与应用), Higher Education Press.
2. Statistics with Stata 5.0
3. Handbook of Stata
