Six Sigma Regression Analysis (高教书苑)

Uploaded by: 仙*** | Document ID: 43259173 | Uploaded: 2021-11-30 | Format: PPT | Pages: 75 | Size: 2.30 MB

Regression Analysis

Objectives
- Introduce the basic concepts of correlation and regression
- Link regression to the Six Sigma roadmap
- Review the use of multiple regression

Advanced Education

Project Tracking Chart (5th edition)
[Figure: DMAIC improvement project timeline. Project start date 21/01/2004; milestone dates shown: 04/02/2004, 11/02/2004, 25/02/2004, 09/03/2004, 22/03/2004, 15/04/2004, 31/05/2004, 21/06/2004, 29/06/2004, 05/07/2004, 09/07/2004, 19/07/2004.]
Activities shown, in chart order:
- Project category; "Y" variable data collection plan; project schedule; launch of the project charter
- Define: identify the "Y" variable and draft the project charter; charter approved
- Process map; C&E matrix or fault tree analysis (FTA); day-30 MBB review
- FMEA or FTA; measurement system analysis (MSA); data collection plan for the key "X" variables; MBB review
- Measure
- Initial capability study; multi-vari process analysis; MBB review; contract approval
- Analyze
- Single-factor or multi-factor tests; design of experiments (DOE); MBB review
- Improve
- Control plan; final capability study; control-phase FMEA review with revised RPN; MBB review; final project presentation and report; project audit and close-out
- Control
- Used as needed: Voice of the Customer / Voice of the Business (VOC/VOB) survey; needs analysis; process redesign; solution design; select the improvement plan; implement the improvement; handover training and process-owner sign-off
Notes on the chart: enter the start date here; the project entry is completed by the project sponsor in the candidate project database; search the Six Sigma database for similar projects; the redesign roadmap schedule is calculated independently and is not tied to the DMAIC dates above.
Legend: actual completion date; planned completion date; tick when complete.

Analyze Roadmap: Single X, Single Y

                      X data: Discrete              X data: Continuous
Y data: Discrete      Chi-Square                    Logistic Regression
Y data: Continuous    ANOVA, Means/Medians Tests    Regression

Scenario #1
A manager wants to know whether an operator's experience (measured in months) affects the time needed to answer customer hotline calls.
- What is Y? ____ Data type? ____
- What is X? ____ Data type? ____
- Which tool should be used? ____

Correlation
- What is correlation?
- Have you ever measured something and then shipped it to your customer, only for them to tell you it doesn't meet spec?
- How well correlated do you think two ice skating judges are at the Olympics?

Correlation: Analyze Roadmap
- Produce a scatter plot
- Calculate the correlation
- Evaluate r and the P-value

Correlation Coefficients
- So what is the correlation coefficient supposed to be anyway?
- The correlation coefficient (r) lies between -1 and 1.
- General rules: correlation coefficient (r) > 0.80 or < -0.80 ...

Minitab: More Output

Braking Distance = 182.8 + 0.4763 Speed

S = 13.5571   R-Sq = 69.5%   R-Sq(adj) = 67.9%

Analysis of Variance
Source       DF       SS       MS      F      P
Regression    1   7955.9  7955.91  43.29  0.000
Error        19   3492.1   183.79
Total        20  11448.0

R² (same one as before)

R²: What Does It Mean?
R² and the P-value help us put some statistical backing behind our decisions. R² is called the coefficient of determination.
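The summary statistics in the Minitab output above follow directly from the ANOVA table. The sketch below is not part of the original deck; it assumes plain Python and simply redoes the arithmetic with the sums of squares and degrees of freedom copied from the table, using the standard definitions R² = SS(Regression) / SS(Total) and R²(adj) = 1 - MS(Error) / (SS(Total) / DF(Total)).

```python
import math

# Values copied from the Minitab ANOVA table (Braking Distance vs. Speed).
ss_regression = 7955.9
ss_error = 3492.1
ss_total = 11448.0
df_regression, df_error, df_total = 1, 19, 20

# R-Sq: share of the total variation in Y explained by the regression.
r_sq = ss_regression / ss_total

# R-Sq(adj): same idea, with each sum of squares divided by its degrees of freedom.
ms_error = ss_error / df_error
r_sq_adj = 1 - ms_error / (ss_total / df_total)

# F statistic (MS regression / MS error) and S (standard error of the regression).
f_stat = (ss_regression / df_regression) / ms_error
s = math.sqrt(ms_error)

print(f"R-Sq      = {r_sq:.1%}")
print(f"R-Sq(adj) = {r_sq_adj:.1%}")
print(f"F         = {f_stat:.2f}")
print(f"S         = {s:.4f}")
```

The printed values should agree with the R-Sq, R-Sq(adj), F and S figures reported by Minitab, up to rounding of the table entries.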

R² is a measure of the amount of variation in the output that is explained by the regression model. It is always a value between 0 and 1 (0% to 100%). The higher the value, the greater the confidence we have in the model itself.
[Scale graphic: R² from 0% to 100%]

R²: What Does It Mean?
R² = 69.5%
- This means 69.5% of the variation in Y (Braking Distance) can be explained by X (Speed).
- The remaining 30.5% is due to something else.
- What is your decision?
[Figure: Fitted Line Plot, Braking Distance = 182.8 + 0.4763 Speed; x-axis Speed, y-axis Braking Distance; S = 13.5571, R-Sq = 69.5%, R-Sq(adj) = 67.9%]

R²: How Big Should It Be?
- The answer depends on what you are studying, e.g. safety systems or paper clips.
- If you are experimenting with a new safety restraint system, your numbers will probably be reviewed by the Department of Transportation. How "good" should you be?
- Different texts suggest different decision criteria (usually 80% or more). The important thing to realize is that the higher the R², the better the model.

Regression Analysis: Braking Dist versus Speed

The regression equation is
Braking Distance = 182.8 + 0.4763 Speed

S = 13.5571   R-Sq = 69.5%   R-Sq(adj) = 67.9%

Analysis of Variance
Source       DF       SS       MS      F      P
Regression    1   7955.9  7955.91  43.29  0.000
Error        19   3492.1   183.79
Total        20  11448.0

What is going on here? Another P-value!

P-Value: Another Hypothesis Test
- Ho: slope of the line = 0 (no correlation)
- Ha: slope of the line ≠ 0 (there is correlation)
- Remember: when P is low, Ho must go!

Minitab Regression: Residuals & Fits
[Screenshot: Minitab regression dialog with Residuals and Fits selected for storage]

Minitab: More Output

Speed  Distance     RESI1    FITS1
  336       325  -17.8392  342.839
  418       375   -6.8948  381.895
  355       367   15.1113  351.889
  445       385   -9.7546  394.755
  365       375   18.3484  356.652
  455       395   -4.5175  399.517
  395       395   24.0598  370.940
  405       365  -10.7031  375.703
  346       355    7.3979  347.60
  ...       ...       ...      ...

Residual & Fit: What Are They?

Speed  Distance     RESI1    FITS1
  336       325  -17.8392  342.839

[Figure: fitted line with the actual point (336, 325), the theoretical fit on the line (342), and the residual distance (-17.8392) between them]
- Residual: the vertical distance from the point to the fitted line. It is negative when the point is below the line and positive when it is above.
- Fit: the theoretical Y value on the fitted line.

Regression: Residuals & Fits Graphical Summary
[Figure: graphical summary of residuals and fits]

Residual Analysis
- Data should pass the "fat pencil test".
- Data should fit a bell-shaped curve.
[Figure: Residual Plots for Braking Distance, four panels: Normal Probability Plot of the Residuals; Residuals Versus the Fitted Values; Histogram of the Residuals; Residuals Versus the Order of the Data]
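The RESI1 and FITS1 columns listed earlier can be reproduced by hand from the fitted line. The sketch below is not from the original deck; it assumes plain Python and uses the rounded coefficients printed in the regression equation, so its results differ from Minitab's stored values in the third decimal place (Minitab works with unrounded coefficients).

```python
# Rounded coefficients from the fitted line in the slides:
# Braking Distance = 182.8 + 0.4763 * Speed
INTERCEPT = 182.8
SLOPE = 0.4763

# (speed, distance, Minitab RESI1, Minitab FITS1) for the first rows of the table.
rows = [
    (336, 325, -17.8392, 342.839),
    (418, 375, -6.8948, 381.895),
    (355, 367, 15.1113, 351.889),
    (445, 385, -9.7546, 394.755),
]


def fit(speed):
    """Theoretical Y value on the fitted line for a given speed."""
    return INTERCEPT + SLOPE * speed


def residual(speed, distance):
    """Vertical distance from the actual point to the fitted line."""
    return distance - fit(speed)


for speed, distance, resi1, fits1 in rows:
    print(
        f"speed={speed} distance={distance} "
        f"fit={fit(speed):.3f} (Minitab {fits1}) "
        f"residual={residual(speed, distance):.4f} (Minitab {resi1})"
    )
```

A negative residual means the observed braking distance lies below the fitted line, as for the first row (speed 336, distance 325).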

- Check the P-value with a normality test on the residuals.

Residual Analysis
- Data should be in control; investigate outliers.
- Data should exhibit no patterns.
[Figure: Residual Plots for Braking Distance, four panels: Normal Probability Plot of the Residuals; Residuals Versus the Fitted Values; Histogram of the Residuals; Residuals Versus the Order of the Data]

Other Examples
Using the Minitab project:
- Exercise #1: Analyze worksheet
  Y = Paint Thickness; X1 = Air Pressure; X2 = Viscosity
- Exercise #2: Analyze worksheet
  Y = Customer Response Time; X1 = Experience Level of Agent; X2 = Distance From Customer Site
- Exercise #3: Analyze

Beware of Stating Causality
- If we establish a correlation between Y and an X, that doesn't necessarily mean variation in X caused variation in Y.
- Other variables may be lurking that cause both X and Y to vary.

Regression Issues: Missing Xs
Research has consistently shown that as hospital size increases, the death rate of patients dramatically increases. So, should we avoid large hospitals?
[Plot: X = hospital size, Y = death rate]

Regression Issues: Missing Xs
Data on a city showed that as the population of storks increased, so did the town's population. Did storks influence the population?
[Plot: X = number of storks, Y = city population]

Regression Issues: Range of Study Too Small
[Plot: X = Age of Car, Y = Sales Value]

Regression Issues: Range of Study Too Small
What would this look like now?
[Plot: Value of Car ($) versus Age of Car, with the age axis extended to 50]

Analyze Roadmap

Single X, single Y:
                      X data: Discrete                      X data: Continuous
Y data: Discrete      Chi-Square                            Logistic Regression
Y data: Continuous    T-test, ANOVA, Means/Medians Tests    Regression

Multiple Xs, single Y:
                      X data: Discrete                      X data: Continuous
Y data: Discrete      Multiple Logistic Regression          Multiple Logistic Regression
Y data: Continuous    2-, 3-, 4-way ANOVA; Medians Tests    Multiple Regression

Multiple Ys: Multivariate Analysis (note: this is not the same as Multi-Vari Analysis)

Multiple Regression Analysis
- Two or more process variables (Xs) may have an influence upon process performance (Y).
- Multiple regression is used whenever there are two or more possible predictor variables.
- The general form of the multiple regression equation is
  $Y = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_n X_n$

Example: Brake Sales
Twenty observations regarding brake sales are given. There are five known process variables and one performance variable, Y:
- X1 = Year
- X2 = Mktg$ (marketing spend)
- X3 = Sales Rep (number of sales reps this year)
- X4 = LY(Sales Rep) (number of sales reps last year)
- X5 = Product
- Y = Sales
Use the data to mine for the "vital few" process variables that may influence Sales.

Brake Sales Data

Year  Mktg$  SalesRep  LY(SalesRep)  Product  Sales
   1    9.6        30            20       18    130
   2   10.3        20            30       17    157
   3   10.2        15            20       19    129
   4   10.4        25            15       22    129
   5   10.6        30            25       24    162
   6   10.7        15            30       18    154
   7   10.5        25            15       17    132
   8   10.9        35            25       16    172
   9   11.0        40            35       14    207
  10   11.1        20            40       18    204
  11   11.2        25            20       22    144
  12   11.2        35            25       25    175
  13   11.4         5            35       27    167
  14   11.2        12             5       28     97
  15   11.6        16            12       18    122
  16   11.7        21            16       16    139
  17   11.8        22            21       15    153
  18   11.8        24            22       16    156
  19   11.8        26            24       10    172
  20   12.1        28            26       18    178

Brake Sales
Our goal is to fit a multiple regression of the following form:
  $Y = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3 + b_4 X_4 + b_5 X_5$
This problem will illustrate the following additional aspects of multiple regression:
(1) elimination of X-variables that have no explanatory power;
(2) residual analysis.
What stays in the model must have controls. In Six Sigma, the goal is to control a few.

Multiple Regression: Analyze Roadmap
[Roadmap figure. Boxes shown:]
- Plan the analysis
- Collect data
- Analyze using Regression or Best Subsets
- Evaluate residuals
- Make a decision
- Evaluate R² and the significance of the P-values
- Multicollinearity "X" check (correlation)
- Run the multiple regression reduced model
- No longer a fitted line plot, because there are multiple lines

Correlated Predictor Variables (Multicollinearity)
  $Y = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_n X_n$
- Correlation between the process output (Y) and the predictor variables (Xs) is good: it helps us identify possible cause-and-effect relationships.
- Correlation between predictors, in contrast, is a problem.
- The calculated signs and magnitudes of the coefficients of correlated predictors may be wrong.
- Calculated P-values may be large.
- High correlation between predictor variables is called "collinearity".

Multicollinearity: Brake Sales
Predictor variables (the brake sales data table above is shown again alongside this list):
(1) Year
(2) Marketing $
(3) How many sales reps this year
(4) How many sales reps last year
(5) Product

Multicollinearity: Brake Sales
- Select all five predictor variables and the response variable.
- Use the Minitab menu Stat > Basic Stats > Correlation.
- Uncheck the P-value option.

The Year and Marketing$ variables are highly correlated! We will have to choose one or the other of the correlated predictor variables (but not both) to use in a regression fit. It is possible that Marketing$ is a function of the year, so keep Marketing$ and eliminate Year.
Rule of thumb: if the correlation is > 0.8 or < -0.8 ...
... Stat > Regression > Best Subsets.

Best Subsets Regression: Brake Sales
Note that "Year" has been removed from the model.

Best Subsets Regression: Sales versus Mktg$, Sales Rep, ...
Response is Sales
[Predictor columns: Mktg$, Sales Rep, LY(SalesRep), Product. Each X marks a predictor included in the model; the column positions of the marks were lost in extraction.]

Vars  R-Sq  R-Sq(adj)    C-p       S   Predictors
   1  79.0       77.8  156.0  12.841   X
   1  20.9       16.6  631.3  24.910   X
   2  90.1       89.0   66.8  9.0570   X X
   2  85.2       83.5  107.0  11.084   X X
   3  98.2       97.8    3.0  4.0222   X X X
   3  90.5       88.7   65.8  9.1570   X X X
   4  98.2       97.7    5.0  4.1540   X X X X

Multiple Regression: Analyze Roadmap
[Roadmap figure repeated; same boxes as on the earlier roadmap slide]

Regression: Brake Sales
- Select all four predictor variables and the response variable.
- Use the Minitab menu Stat > Regression > Regression.

Regression Analysis: Brake Sales
Ho: no relationship between variables
Ha: some relationship exists between variables

Regression Analysis: Sales versus Mktg$, Sales Rep, ...

The regression equation is
Sales = - 66.6 + 11.8 Mktg$ + 1.18 Sales Rep + 2.70 LY(SalesRep) - 0.007 Product

Predictor     Coef  SE Coef      T      P
Constant    -66.64    19.17  -3.48  0.003
Mktg$       11.838    1.494   7.92  0.000   Ha
Sales Re    1.1751   0.1224   9.60  0.000   Ha
LY(Sales    2.7023   0.1154  23.42  0.000   Ha
Product    -0.0068   0.2337  -0.03  0.977   Ho

S = 4.154   R-Sq = 98.2%   R-Sq(adj) = 97.7%

Regression/Reduced Model: Brake Sales
- Select the three remaining predictor variables and the response variable.
- Use the Minitab menu Stat > Regression > Regression.
- Remember to check your residual plots.

Regression Analysis: Brake Sales
Ho: no relationship between variables
Ha: some relationship exists between variables

Regression Analysis: Sales versus Mktg$, Sales Rep, LY(SalesRep)

The regression equation is
Sales = - 66.9 + 11.8 Mktg$ + 1.18 Sales Rep + 2.70 LY(SalesRep)

Predictor     Coef  SE Coef      T      P
Constant    -66.91    16.22  -4.12  0.001
Mktg$       11.847    1.414   8.38  0.000   Ha
Sales Re    1.1764   0.1106  10.64  0.000   Ha
LY(Sales    2.7027   0.1106  24.44  0.000   Ha

S = 4.022   R-Sq = 98.2%   R-Sq(adj) = 97.8%

The Rest of the Minitab Output: Brake Sales

Analysis of Variance
Source           DF       SS      MS       F      P
Regression        3  13870.1  4623.4  285.78  0.000
Residual Error   16    258.8    16.2
Total            19  14128.9

Source     DF  Seq SS
Mktg$       1   893.9
Sales Re    1  3313.2
LY(Sales    1  9663.0

Unusual Observations
Obs  Mktg$    Sales      Fit  SE Fit  Residual  St Resid
 10   11.1  204.000  196.236   2.161     7.764     2.29R

R denotes an observation with a large standardized residual

Brake Sales: R-Sq (Adjusted)
- R-Sq(adj) = 97.8%: this share of the variation in Y is explained by the three factors included in the regression.
- While good, this still means that about 2.2% of the variation in Brake Sales is unexplained.

S = 4.022   R-Sq = 98.2%   R-Sq(adj) = 97.8%

Multiple Regression: Analyze Roadmap
[Roadmap figure repeated; same boxes as on the earlier roadmap slide]
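The multicollinearity check and the reduced-model fit described above can also be reproduced outside Minitab. The following is a minimal sketch, not the deck's own procedure: it assumes Python with NumPy, uses ordinary least squares via `numpy.linalg.lstsq`, and takes the 20 data rows as reconstructed from the brake sales table in the slides. The variable names are illustrative.

```python
import numpy as np

# Brake sales data from the slides:
# Year, Mktg$, SalesRep, LY(SalesRep), Product, Sales
data = np.array([
    [1, 9.6, 30, 20, 18, 130],
    [2, 10.3, 20, 30, 17, 157],
    [3, 10.2, 15, 20, 19, 129],
    [4, 10.4, 25, 15, 22, 129],
    [5, 10.6, 30, 25, 24, 162],
    [6, 10.7, 15, 30, 18, 154],
    [7, 10.5, 25, 15, 17, 132],
    [8, 10.9, 35, 25, 16, 172],
    [9, 11.0, 40, 35, 14, 207],
    [10, 11.1, 20, 40, 18, 204],
    [11, 11.2, 25, 20, 22, 144],
    [12, 11.2, 35, 25, 25, 175],
    [13, 11.4, 5, 35, 27, 167],
    [14, 11.2, 12, 5, 28, 97],
    [15, 11.6, 16, 12, 18, 122],
    [16, 11.7, 21, 16, 16, 139],
    [17, 11.8, 22, 21, 15, 153],
    [18, 11.8, 24, 22, 16, 156],
    [19, 11.8, 26, 24, 10, 172],
    [20, 12.1, 28, 26, 18, 178],
], dtype=float)
year, mktg, sales_rep, ly_sales_rep, product, sales = data.T

# Step 1: multicollinearity check between two predictors (Pearson r).
r_year_mktg = np.corrcoef(year, mktg)[0, 1]

# Step 2: reduced model Sales ~ Mktg$ + SalesRep + LY(SalesRep),
# with Year and Product dropped as in the slides.
X = np.column_stack([np.ones(len(sales)), mktg, sales_rep, ly_sales_rep])
coef, *_ = np.linalg.lstsq(X, sales, rcond=None)

# Step 3: goodness-of-fit summary.
fitted = X @ coef
n, p = X.shape
ss_error = float(np.sum((sales - fitted) ** 2))
ss_total = float(np.sum((sales - sales.mean()) ** 2))
r_sq = 1 - ss_error / ss_total
r_sq_adj = 1 - (ss_error / (n - p)) / (ss_total / (n - 1))
s = float(np.sqrt(ss_error / (n - p)))

print(f"r(Year, Mktg$) = {r_year_mktg:.3f}")
print("Constant, Mktg$, SalesRep, LY(SalesRep):", np.round(coef, 4))
print(f"S = {s:.3f}  R-Sq = {r_sq:.1%}  R-Sq(adj) = {r_sq_adj:.1%}")
```

If the data rows were reconstructed correctly, the coefficients, S and R-Sq values should line up with the reduced-model Minitab output above, and the Year/Mktg$ correlation should exceed the 0.8 rule of thumb.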

Brake Sales Residuals
Not to be overlooked is residual analysis. Careful analysis of the residuals tells us whether any assumptions of the least squares fit are violated. This will guide us in improving the regression fit.

Least Squares Assumptions:
1. The variance of the residuals does not depend on any predictor variable, X.
2. Residuals are normally distributed.
3. Arranged in time order, the residuals are independent of each other.

Brake Sales: Fits & Residuals
[Figure: fits and residual plots for the brake sales model]

Multiple Regression Summary
- A powerful statistical tool that is used to build models of the form Y = f(X1, X2, X3, ...).
- The best model is the one with the fewest factors that explains the most variation in the response Y.
- Best Subsets is a useful technique to quickly zero in on potentially good models.
- Pitfalls to avoid in regression are poorly behaved residuals and factor collinearity.

Multiple Regression: Analyze Roadmap
[Roadmap figure repeated; same boxes as on the earlier roadmap slides]

Multiple Regression Exercise
In groups of 3 to 4 people, record the following data:
- Y = height of each person in the group (cm)
- X1 = length from knee to ankle (cm)
- X2 = length from elbow to wrist (cm)
- X3 = length of shoe (cm)
- X4 = arm span (cm), measured from the fingertip of the left hand to the fingertip of the right hand with the arms stretched out to the side
Use the Multiple Regression tools in Minitab to analyze the data.
- Which Xs are most important to Y?
- What should your final model look like? Don't forget to do residual diagnostics on the final model.

Belt Hints
- There are some appropriate cases where attribute data might perform well with this tool. See your MBB for advice.
- The R-squared values don't have to be "perfect".

Review
- Introduce the basic concepts of correlation and regression
- Link regression to the Six Sigma roadmap
- Review the use of multiple regression
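The residual diagnostics that the exercise asks for can be sketched outside Minitab as well. The example below is an illustration under stated assumptions, not the deck's method: it assumes Python with NumPy, reuses the brake sales rows from the earlier slides (Mktg$, SalesRep, LY(SalesRep), Sales), and applies the usual textbook definitions of leverage and standardized residual. The threshold of 2 mirrors the "large standardized residual" flag in the Unusual Observations output.

```python
import numpy as np

# Brake sales rows from the deck: Mktg$, SalesRep, LY(SalesRep), Sales.
rows = np.array([
    [9.6, 30, 20, 130], [10.3, 20, 30, 157], [10.2, 15, 20, 129],
    [10.4, 25, 15, 129], [10.6, 30, 25, 162], [10.7, 15, 30, 154],
    [10.5, 25, 15, 132], [10.9, 35, 25, 172], [11.0, 40, 35, 207],
    [11.1, 20, 40, 204], [11.2, 25, 20, 144], [11.2, 35, 25, 175],
    [11.4, 5, 35, 167], [11.2, 12, 5, 97], [11.6, 16, 12, 122],
    [11.7, 21, 16, 139], [11.8, 22, 21, 153], [11.8, 24, 22, 156],
    [11.8, 26, 24, 172], [12.1, 28, 26, 178],
], dtype=float)

X = np.column_stack([np.ones(len(rows)), rows[:, :3]])  # intercept + 3 predictors
y = rows[:, 3]

# Ordinary least squares fit of the reduced model.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ coef
resid = y - fitted

n, p = X.shape
s = np.sqrt(np.sum(resid ** 2) / (n - p))  # standard error of the regression

# Leverage h_ii: diagonal of the hat matrix X (X'X)^-1 X'.
leverage = np.einsum("ij,jk,ik->i", X, np.linalg.inv(X.T @ X), X)

se_fit = s * np.sqrt(leverage)                   # standard error of each fit
std_resid = resid / (s * np.sqrt(1 - leverage))  # standardized residuals

# Flag observations (1-based) whose standardized residual exceeds 2 in size.
flagged = [int(i) + 1 for i in np.flatnonzero(np.abs(std_resid) > 2)]

for obs in flagged:
    i = obs - 1
    print(
        f"Obs {obs}: fit={fitted[i]:.3f} se_fit={se_fit[i]:.3f} "
        f"residual={resid[i]:.3f} st_resid={std_resid[i]:.2f}"
    )
```

Plotting `resid` against `fitted`, against each predictor, and in observation order would cover the three least squares assumptions listed earlier; the flagged list only addresses outliers.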
