Stata: Bias-Corrected Propensity Score Matching and PSM in Practice
1. Command overview
The bias-corrected matching estimator is implemented by the nnmatch command.
Syntax:
nnmatch depvar treatvar varlist_nnmatch [if exp] [in range] [pw] [, tc(ate|att|atc) m(#) metric(maha|matname) exact(varlist_ex) biasadj(bias|varlist_adj) robust(#_v) population level(#) keep(filename) replace]
The arguments and options are:
depvar: the outcome variable
treatvar: the treatment variable
varlist_nnmatch: the matching variables
tc(ate|att|atc) specifies which treatment effect is to be estimated:
ate: the average treatment effect,
att: the average treatment effect for the treated, or
atc: the average treatment effect for the controls.
metric(maha|matname): metric(maha) uses the Mahalanobis distance, i.e. the weighting matrix is the inverse of the sample covariance matrix
m(#): match each observation to its # nearest neighbors; the default is # = 1
robust(#_v): compute heteroskedasticity-robust standard errors
Examples:
nnmatch y t x1, m(3)
nnmatch y t x1 x2, tc(att)
nnmatch y t x1 x2, tc(atc) met(maha) bias(bias) robust(4)
nnmatch y t x1 x2, met(matname) bias(x1 x3) keep(artdata) replace
nnmatch y t x1 x2 [w=w], met(matname) bias(x1 x3) exact(x4) pop
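2. Applying the bias-corrected matching estimator
This article again uses the example dataset from the propensity score matching tutorial. Its variables are: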
. desc
Contains data from E:\2022年8月Stata课程2022.08.13--2022.08.15\data\ldw_exper.dta
  obs:           445
 vars:            12                          30 Jan 2013 12:47
 size:        12,015
------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
------------------------------------------------------------------------------
t               byte    %8.0g                 participation in job training program
age             byte    %8.0g                 age
educ            byte    %8.0g                 years of education
black           byte    %8.0g                 indicator for African-American
hisp            byte    %8.0g                 indicator for Hispanic
married         byte    %8.0g                 indicator for married
nodegree        byte    %8.0g                 indicator for more than grade school but
                                                less than high-school education
re74            float   %9.0g                 real earnings in 1974 (in thousands of
                                                1978 $)
re75            float   %9.0g                 real earnings in 1975 (in thousands of
                                                1978 $)
re78            float   %9.0g                 real earnings in 1978 (in thousands of
                                                1978 $)
u74             float   %9.0g                 indicator for unemployed in 1974
u75             float   %9.0g                 indicator for unemployed in 1975
------------------------------------------------------------------------------
Sorted by:
First, use one-to-one matching without bias correction, but with heteroskedasticity-robust standard errors:
nnmatch re78 t age edu black his married re74 re75 u74 u75, tc(atc) m(1) robust(1)
The output is:
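. nnmatch re78 t age edu black his married re74 re75 u74 u75, tc(atc) m(1) robust(1)

Matching estimator: Average Treatment Effect for the Controls

Weighting matrix: inverse variance       Number of obs             =       445
                                         Number of matches  (m)    =         1
                                         Number of matches,
                                           robust std. err. (h)    =         1

------------------------------------------------------------------------------
        re78 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        SATC |   2.262412   .9681856     2.34   0.019     .3648029    4.160021
------------------------------------------------------------------------------
Matching variables: age educ black hisp married re74 re75 u74 u75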
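The output shows that the default weighting matrix is used, namely the inverse of the diagonal matrix whose diagonal elements are the sample variances of the matching variables. The ATC estimate is 2.2624 and is significant at the 5% level.
Next, apply the bias correction: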
nnmatch re78 t age edu black his married re74 re75 u74 u75, tc(atc) m(1) robust(1) bias(bias)
The output is:
. nnmatch re78 t age edu black his married re74 re75 u74 u75, tc(atc) m(1) robust(1)
> bias(bias)

Matching estimator: Average Treatment Effect for the Controls

Weighting matrix: inverse variance       Number of obs             =       445
                                         Number of matches  (m)    =         1
                                         Number of matches,
                                           robust std. err. (h)    =         1

------------------------------------------------------------------------------
        re78 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        SATC |   2.160295   .9681856     2.23   0.026      .262686    4.057904
------------------------------------------------------------------------------
Matching variables: age educ black hisp married re74 re75 u74 u75
Bias-adj variables: age educ black hisp married re74 re75 u74 u75
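The ATC estimate falls to 2.1603 and remains significant at the 5% level.
Next, use the inverse of the sample covariance matrix as the weighting matrix; metric(maha) requests the Mahalanobis distance: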
nnmatch re78 t age edu black his married re74 re75 u74 u75, tc(atc) m(1) robust(1) bias(bias) metric(maha)
The output is:
. nnmatch re78 t age edu black his married re74 re75 u74 u75, tc(atc) m(1) robust(1)
> bias(bias) metric(maha)

Matching estimator: Average Treatment Effect for the Controls

Weighting matrix: Mahalanobis            Number of obs             =       445
                                         Number of matches  (m)    =         1
                                         Number of matches,
                                           robust std. err. (h)    =         1

------------------------------------------------------------------------------
        re78 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        SATC |    2.24162   .9538869     2.35   0.019     .3720359    4.111204
------------------------------------------------------------------------------
Matching variables: age educ black hisp married re74 re75 u74 u75
Bias-adj variables: age educ black hisp married re74 re75 u74 u75
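As a cross-check on the nnmatch runs above, a rough analogue can be sketched with the official teffects nnmatch command. This is not the article's method. It assumes Stata 13 or later and ldw_exper.dta in memory, and the local macro covs is a name introduced here for illustration. teffects reports only the ATE or the ATET, not the ATC, so the estimates will not match the tables above.

```stata
* Sketch with the built-in teffects nnmatch (assumes Stata 13+),
* not the user-written nnmatch used in the article.
* Assumes ldw_exper.dta is in memory. Run as a do-file so that /// continues lines.

* Hypothetical helper macro holding the matching covariates
local covs age educ black hisp married re74 re75 u74 u75

* 1) One-to-one matching on the inverse-variance metric, robust standard errors
teffects nnmatch (re78 `covs') (t), atet nneighbor(1) metric(ivariance) vce(robust)

* 2) Same, adding a linear bias adjustment in all matching covariates
teffects nnmatch (re78 `covs') (t), atet nneighbor(1) metric(ivariance) ///
    biasadj(`covs') vce(robust)

* 3) Bias-adjusted matching on the Mahalanobis metric
teffects nnmatch (re78 `covs') (t), atet nneighbor(1) metric(mahalanobis) ///
    biasadj(`covs') vce(robust)
```

Comparing steps 1 and 2 shows how much the bias adjustment moves the point estimate, and comparing steps 2 and 3 shows how sensitive it is to the distance metric.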
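Further reading: Propensity Score Matching (PSM) Explained, with Examples and Stata Implementation (Part 1)
1. A PSM application: the effect of training on wages
Policy background: the National Supported Work (NSW) demonstration program.
Research goal: to test how receiving the program (training), compared with not receiving it, affects wages. Basic idea: for the trained group (the treatment group), compare wage outcomes with training against wage outcomes without training. In practice, however, we only observe the treatment group after it has received training; what would have happened to the same group without training cannot be observed. This unobserved state is called the counterfactual.
Matching is a method for dealing with this unobservable state. In propensity score matching, the sample is split into two groups by the treatment indicator: the treatment group, which here is the group that received training after NSW was implemented, and the comparison group, which here is the group that did not. The basic idea is that, once treated and comparison observations have been matched so that they are alike in all other respects, the difference in wage outcomes between the trained (treatment) group and the untrained (comparison) group measures the causal effect of training on wages.
Note: this example is excerpted from Cameron & Trivedi, Microeconometrics: Methods and Applications (Chinese translation, Shanghai University of Finance and Economics Press, 2010), pp. 794-800. All data and programs come from the book's companion website.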
Descriptive analysis:
tabulate t, summarize(re78) means standard
The output is:
. tabulate t, summarize(re78) means standard

participati |  Summary of real
  on in job | earnings in 1978 (in
   training | thousands of 1978 $)
    program |        Mean   Std. Dev.
------------+------------------------
          0 |   4.5548023   5.4838368
          1 |   6.3491454   7.8674047
------------+------------------------
      Total |   5.3007651   6.6314934
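The two group means in this table already give the raw, unmatched earnings gap: 6.3491454 - 4.5548023, or about 1.794 thousand 1978 dollars. A minimal sketch of that arithmetic follows, together with a simple regression that attaches a standard error to the same gap. It assumes ldw_exper.dta is in memory.

```stata
* Raw difference in mean 1978 earnings, treated minus controls,
* using the two means reported by tabulate
display 6.3491454 - 4.5548023

* The same gap as the coefficient on t, with a standard error and t statistic
* (assumes ldw_exper.dta is in memory)
regress re78 t
```

This unadjusted gap is the benchmark that the matching estimators are compared against.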
3. Running propensity score matching
Data description: the data used by Lalonde (1986). We are interested in the possible effect of participation in a job training program on individuals' earnings in 1978. This dataset has been used by many authors (Abadie et al., 2004; Becker and Ichino, 2002; Dehejia and Wahba, 1999).
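gen u = runiform()
sort u // sort the observations by u
The commands above generate pseudo-random numbers that are uniformly distributed on (0,1) and sort the data by them, which puts the observations in random order. Note that order u is not a substitute for sort u: order only moves the variable u to the front of the variable list and leaves the observations where they are.
local v1 "t"
local v2 "age edu black hisp married re74 re75 u74 u75"
global x "`v1' `v2' "
psmatch2 $x, out(re78) neighbor(1) ate ties logit common // 1:1 matching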
The $ prefix dereferences a global macro, so
psmatch2 $x, out(re78) neighbor(1) ate ties logit common // 1:1 matching
is equivalent to
psmatch2 t age edu black hisp married re74 re75 u74 u75, out(re78) neighbor(1) ate ties logit common
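To see the macro expansion at work, the contents of the global can be printed before it is used. The sketch below reuses the macro names from the text. It should be run in one go (for example as a single do-file), because local macros disappear once the do-file or selection that defined them has finished.

```stata
* Build the variable list from two locals, then store it in a global.
local v1 "t"
local v2 "age edu black hisp married re74 re75 u74 u75"
global x "`v1' `v2'"

* Show what $x expands to before passing it to a command.
display "$x"
```

If display prints only part of the list, the locals were probably defined in a separate run from the global.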
Next, use pstest to check whether matching has balanced the covariates well:
psmatch2 t age edu black hisp married re74 re75 u74 u75, out(re78) neighbor(1) ate ties logit common // 1:1 matching
pstest age edu black hisp married re74 re75 u74 u75, both graph
psgraph
[Figure: psgraph output]
The complete output is:
. set seed 20180105
. gen u = runiform()
. sort u
. local v1 "t"
. local v2 "age edu black hisp married re74 re75 u74 u75"
. global x "`v1' `v2' "
. psmatch2 $x, out(re78) neighbor(1) ate ties logit common
Logistic regression                             Number of obs     =        445
                                                LR chi2(9)        =      11.70
                                                Prob > chi2       =     0.2308
Log likelihood = -296.25026                     Pseudo R2         =     0.0194

------------------------------------------------------------------------------
           t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0142619   .0142116     1.00   0.316    -.0135923    .0421162
        educ |   .0499776   .0564116     0.89   0.376     -.060587    .1605423
       black |   -.347664   .3606532    -0.96   0.335    -1.054531    .3592032
        hisp |   -.928485     .50661    -1.83   0.067    -1.921422    .0644523
     married |   .1760431   .2748817     0.64   0.522    -.3627151    .7148012
        re74 |  -.0339278   .0292559    -1.16   0.246    -.0912683    .0234127
        re75 |     .01221   .0471351     0.26   0.796    -.0801731    .1045932
         u74 |  -.1516037   .3716369    -0.41   0.683    -.8799987    .5767913
         u75 |  -.3719486    .317728    -1.17   0.242    -.9946841    .2507869
       _cons |  -.4736308   .8244205    -0.57   0.566    -2.089465    1.142204
------------------------------------------------------------------------------
There are observations with identical propensity score values.
The sort order of the data could affect your results.
Make sure that the sort order is random before calling psmatch2.
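----------------------------------------------------------------------------------------
        Variable     Sample |    Treated     Controls   Difference         S.E.   T-stat
----------------------------+-----------------------------------------------------------
            re78  Unmatched | 6.34914538   4.55480228   1.79434311   .632853552     2.84
                        ATT | 6.40495818   4.99436488    1.4105933   .839875971     1.68
                        ATU | 4.52683013   6.15618973    1.6293596            .        .
                        ATE |                           1.53668776            .        .
----------------------------+-----------------------------------------------------------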
Note: S.E. does not take into account that the propensity score is estimated.
 psmatch2: |   psmatch2: Common
 Treatment |        support
assignment | Off suppo  On suppor |     Total
-----------+----------------------+----------
 Untreated |        11        249 |       260
   Treated |         2        183 |       185
-----------+----------------------+----------
     Total |        13        432 |       445
. pstest age edu black hisp married re74 re75 u74 u75, both graph
----------------------------------------------------------------------------------------
                Unmatched |       Mean               %reduct |     t-test    |  V(T)/
Variable          Matched | Treated Control    %bias  |bias| |    t    p>|t| |  V(C)
--------------------------+----------------------------------+---------------+----------
age                    U  |  25.816   25.054     10.7         |   1.12  0.265 |  1.03
                       M  |  25.781   25.383      5.6    47.7 |   0.52  0.604 |  0.91
educ                   U  |  10.346   10.088     14.1         |   1.50  0.135 |  1.55*
                       M  |  10.322   10.415     -5.1    63.9 |  -0.49  0.627 |  1.52*
black                  U  |  .84324   .82692      4.4         |   0.45  0.649 |     .
                       M  |  .85246   .86339     -2.9    33.0 |  -0.30  0.765 |     .
hisp                   U  |  .05946   .10769    -17.5         |  -1.78  0.076 |     .
                       M  |  .06011   .04372      5.9    66.0 |   0.71  0.481 |     .
married                U  |  .18919   .15385      9.4         |   0.98  0.327 |     .
                       M  |  .18579   .19126     -1.4    84.5 |  -0.13  0.894 |     .
re74                   U  |  2.0956    2.107     -0.2         |  -0.02  0.982 |  0.74*
                       M  |  2.0672   1.9222      2.7 -1166.6 |   0.27  0.784 |  0.88
re75                   U  |  1.5321   1.2669      8.4         |   0.87  0.382 |  1.08
                       M  |  1.5299   1.6446     -3.6    56.7 |  -0.32  0.748 |  0.82
u74                    U  |  .70811      .75     -9.4         |  -0.98  0.326 |     .
                       M  |  .71038   .75956    -11.1   -17.4 |  -1.06  0.288 |     .
u75                    U  |      .6   .68462    -17.7         |  -1.85  0.065 |     .
                       M  |  .60656   .63388     -5.7    67.7 |  -0.54  0.591 |     .
----------------------------------------------------------------------------------------
* if variance ratio outside [0.75; 1.34] for U and [0.75; 1.34] for M

-----------------------------------------------------------------------------------
 Sample    | Ps R2   LR chi2   p>chi2   MeanBias   MedBias      B       R     %Var
-----------+-----------------------------------------------------------------------
 Unmatched | 0.019     11.75    0.227       10.2       9.4    33.1*   0.82      50
 Matched   | 0.008      3.87    0.920        4.9       5.1    20.6    1.09      25
-----------------------------------------------------------------------------------
* if B>25%, R outside [0.5; 2]
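The %bias column can be checked by hand. For each covariate, pstest reports the standardized percentage bias: 100 times the difference in means between treated and controls, divided by the square root of the average of the two group variances. The sketch below reproduces the unmatched figure for age, assuming ldw_exper.dta is in memory; the scalar names m1, v1, m0 and v0 are introduced here for illustration.

```stata
* Hand computation of the unmatched standardized % bias for age.
* Assumes ldw_exper.dta is in memory; t is the treatment indicator.
quietly summarize age if t == 1
scalar m1 = r(mean)
scalar v1 = r(Var)
quietly summarize age if t == 0
scalar m0 = r(mean)
scalar v0 = r(Var)
display "unmatched %bias for age = " 100 * (m1 - m0) / sqrt((v1 + v0) / 2)

* Percentage reduction in |bias| from the rounded table entries for age
display 100 * (1 - 5.6 / 10.7)
```

The first display should be close to the 10.7 reported for age in the U row. The second gives roughly 47.7, the %reduct |bias| entry, up to rounding of the table values.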
5. Overview of PSM commands
Older versions of Stata had no built-in command for propensity score matching, a non-experimental method of sampling that produces a control group whose distribution of covariates is similar to that of the treated group (the official teffects psmatch and teffects nnmatch commands were added in Stata 13). However, there are several user-written modules for this method. The following modules are among the most popular:
psmatch2.ado
pscore.ado
nnmatch.ado
psmatch2.ado was developed by Leuven and Sianesi (2003) and pscore.ado by Becker and Ichino (2002). More recently, Abadie, Drukker, Herr, and Imbens (2004) introduced nnmatch.ado. All three modules support pair-matching as well as subclassification.
You can find these modules using the net command as follows:
net search psmatch2
net search pscore
net search nnmatch
You can install these modules using the ssc or net command, for example:
ssc install psmatch2, replace
After installation, read the help files to find the correct usage, for example:
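help psmatch2
The above explains how to obtain the PSM-related commands. In summary, psmatch2 is currently the most widely used of them.
Help files for the PSM-related commands:
help psmatch2
help nnmatch
help psmatch
help pscore
To keep up with the latest PSM information and programs: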
findit propensity score
findit matching
psmatch2 is being continuously improved and developed. Make sure to keep your version up to date as follows:
ssc install psmatch2, replace
You can check your version as follows:
which psmatch2
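Alongside the user-written modules, recent Stata versions ship an official propensity score matching command, teffects psmatch. The sketch below is an assumption-laden analogue of the 1:1 logit matching used earlier, not a replacement for psmatch2. It assumes Stata 13 or later (Stata 14 or later for tebalance) and ldw_exper.dta in memory.

```stata
* Sketch with the built-in teffects psmatch (assumes Stata 13+ and ldw_exper.dta in memory).
* Logit propensity score, one nearest neighbor, effect on the treated (ATET).
* Run as a do-file so that /// continues the line.
teffects psmatch (re78) (t age educ black hisp married re74 re75 u74 u75, logit), ///
    atet nneighbor(1)

* Covariate balance before and after matching (assumes Stata 14+ for tebalance).
tebalance summarize
```

According to the official documentation, the standard errors from teffects psmatch allow for the fact that the propensity score is estimated, which the psmatch2 output explicitly notes it does not do.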
Syntax
help psmatch2
psmatch2 depvar [indepvars] [if exp] [in range] [, outcome(varlist)
pscore(varname) neighbor(integer) radius caliper(real)
mahalanobis(varlist) ai(integer) population altvariance
kernel llr kerneltype(type) bwidth(real) spline
nknots(integer) common trim(real) noreplacement
descending odds index logit ties quietly w(matrix) ate]
where indepvars and mahalanobis(varlist) may contain factor variables;
see fvvarlist.
psmatch2 D x1 x2 x3, outcome(y)
Kernel matching
Other matching methods:
Coarsened exact matching (CEM) || help cem
Local linear regression matching
Spline matching
Mahalanobis matching
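Some of the methods listed above map directly onto psmatch2 options from the syntax diagram. The calls below are illustrative only. They assume psmatch2 is installed and ldw_exper.dta is in memory; the local macro covs is a name introduced here, and the bandwidth and caliper values are placeholders rather than recommendations.

```stata
* Illustrative psmatch2 calls for some of the matching methods listed above.
* Assumes psmatch2 is installed and ldw_exper.dta is in memory.

* Hypothetical helper macro holding the covariates of the propensity score model
local covs age educ black hisp married re74 re75 u74 u75

* Kernel matching with an Epanechnikov kernel (bandwidth is a placeholder value)
psmatch2 t `covs', outcome(re78) kernel kerneltype(epan) bwidth(0.06) logit common

* Local linear regression matching with the default kernel and bandwidth
psmatch2 t `covs', outcome(re78) llr logit common

* Radius matching within a caliper (caliper is a placeholder value)
psmatch2 t `covs', outcome(re78) radius caliper(0.01) logit common
```

Running pstest after each call, as in the 1:1 example, gives a like-for-like comparison of how well each method balances the covariates.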