Dialysis
This package contains functions useful for reproducing Grieco & McDevitt (2017). See assignments 1 and 2.
Function Reference
Dialysis.clustercov — Method
clustercov(x::Array{Real,2}, clusterid)
Computes a clustered, heteroskedasticity-robust covariance matrix estimate for Var(∑ x). Assumes that observations with different clusterid are independent, and observations with the same clusterid may be arbitrarily correlated. Uses number of observations - 1 as the degrees of freedom.
Arguments
- x: number of observations by dimension of x matrix
- clusterid: vector of length number of observations
Returns
- V: dimension of x by dimension of x matrix
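The description above can be illustrated with a minimal sketch. This is not the package's source: the name `clustercov_sketch`, the demeaning step, and the way the n - 1 degrees-of-freedom correction enters (a factor n/(n - 1)) are all assumptions made for illustration.

```julia
# Illustrative sketch only; `clustercov_sketch` is a hypothetical name.
function clustercov_sketch(x::AbstractMatrix{<:Real}, clusterid::AbstractVector)
    n, d = size(x)
    length(clusterid) == n ||
        throw(ArgumentError("clusterid must have one entry per row of x"))
    xc = x .- sum(x, dims=1) ./ n            # demean each column (assumption)
    # Sum the demeaned rows within each cluster.
    sums = Dict{eltype(clusterid), Vector{Float64}}()
    for i in 1:n
        s = get!(sums, clusterid[i]) do
            zeros(d)
        end
        s .+= xc[i, :]
    end
    # Clusters are treated as independent, so add up their outer products.
    V = zeros(d, d)
    for s in values(sums)
        V .+= s * s'
    end
    return V .* (n / (n - 1))                # assumed form of the n - 1 correction
end
```

When every observation is its own cluster, this reduces to a heteroskedasticity-robust estimate with no within-cluster correlation terms.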
Dialysis.downloadDFR — Method
downloadDFR(;redownload=false)
Downloads Dialysis Facility Reports. Saves zipfiles in Dialysis/data/.
Dialysis.errors_gm — Method
errors_gm(y::Symbol, k::Symbol, l::Symbol, Φ::Symbol,
id::Symbol, t::Symbol, data::DataFrame;
npregress::Function=polyreg)
Returns functions that, given β, calculate ω(β) and η(β).
Arguments
- y: Symbol for y variable in data
- k: Symbol for k variable in data
- l: Symbol for l variable in data
- q: Symbol for q variable in data
- Φ: Symbol for Φ variable in data
- id: Symbol for id variable in data
- t: Symbol for t variable in data
- data: DataFrame containing variables
- α: estimate of α
Returns
- ωfunc(β) computes ω given β for the data and α passed in as input. length(ωfunc(β)) == nrow(data). ωfunc(β) will contain missings if the data does.
- ηfunc(β) computes η given β for the data and α passed in as input. length(ηfunc(β)) == nrow(data). ηfunc(β) will contain missings.
Warning: this function is not thread safe!
Dialysis.loadDFR — Method
loadDFR(;recreate=false)
If Dialysis/data/dfr.zip exists, load it from disk. Otherwise, create Dialysis/data/dfr.zip by loading Dialysis Facility Reports from zipfiles in Dialysis/data/.
Dialysis.loaddata_old — Method
loaddata_old()
Loads "dialysisFacilityReports.rda". Returns a DataFrame.
Dialysis.locallinear — Method
locallinear(xpred::AbstractMatrix,
xdata::AbstractMatrix,
ydata::AbstractMatrix)
Computes local linear regression of ydata on xdata. Returns predicted y at x=xpred. Uses Scott's rule of thumb for the bandwidth and a Gaussian kernel. xdata should not include an intercept.
Arguments
- xpred: x values to compute fitted y
- xdata: observed x
- ydata: observed y, must have size(y)[1] == size(xdata)[1]
- bandwidth_multiplier: multiply Scott's rule of thumb bandwidth by this number
Returns
- Estimates of f(xpred)
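A local linear fit of this kind can be sketched as follows. The function name `locallinear_sketch`, the product form of the Gaussian kernel, and the exact Scott's-rule constant (standard deviation times n^(-1/(d+4)) per column) are assumptions for illustration, not the package's implementation.

```julia
using Statistics

# Illustrative sketch only; `locallinear_sketch` is a hypothetical name.
function locallinear_sketch(xpred::AbstractMatrix, xdata::AbstractMatrix,
                            ydata::AbstractMatrix; bandwidth_multiplier=1.0)
    n, d = size(xdata)
    size(ydata, 1) == n ||
        throw(ArgumentError("ydata and xdata need the same number of rows"))
    # Scott's rule of thumb, one bandwidth per column of xdata (assumed form).
    h = bandwidth_multiplier .* vec(std(xdata, dims=1)) .* n^(-1 / (d + 4))
    ypred = zeros(size(xpred, 1), size(ydata, 2))
    for j in 1:size(xpred, 1)
        dx = xdata .- xpred[j:j, :]                 # n × d deviations from the evaluation point
        w = exp.(-0.5 .* vec(sum((dx ./ h') .^ 2, dims=2)))  # Gaussian product kernel weights
        X = hcat(ones(n), dx)                       # intercept is added here, not by the caller
        sw = sqrt.(w)
        coef = (sw .* X) \ (sw .* ydata)            # weighted least squares
        ypred[j, :] = coef[1, :]                    # local intercept is the fitted value
    end
    return ypred
end
```

Because the regressors are centered at each evaluation point, the local intercept is the prediction, which is why xdata is passed without an intercept column.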
Dialysis.objective_gm — Method
objective_gm(y::Symbol, k::Symbol, l::Symbol, q::Symbol,
Φ::Symbol, id::Symbol, t::Symbol,
instruments::Array{Symbol,1}, data::DataFrame
; W=UniformScaling(1.),
npregress::Function=(xp,xd,yd)->polyreg(xp,xd,yd,degree=1))
Dialysis.panellag — Function
panellag(x::Symbol, data::AbstractDataFrame, id::Symbol, t::Symbol,
lags::Integer=1)
Create lags of variables in panel data.
Arguments
- x: variable to create lag of
- data: DataFrame containing x, id, and t
- id: cross-section identifier
- t: time variable
- lags: number of lags. Can be negative, in which case leads will be created
Returns
- A vector containing lags of data[x]. Will be missing for id and t combinations where the lag is not contained in data.
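The lookup logic described above can be sketched on plain vectors, without a DataFrame. The name `panellag_sketch` is hypothetical, and the sketch assumes t takes values for which `t - lags` identifies the lagged period (for example, integer years).

```julia
# Illustrative sketch only; operates on vectors rather than a DataFrame.
function panellag_sketch(x::AbstractVector, id::AbstractVector,
                         t::AbstractVector, lags::Integer=1)
    # Map each (id, t) pair to its row so the lagged row can be looked up directly.
    row = Dict((id[i], t[i]) => i for i in eachindex(x))
    out = Vector{Union{Missing, eltype(x)}}(missing, length(x))
    for i in eachindex(x)
        j = get(row, (id[i], t[i] - lags), nothing)
        if j !== nothing
            out[i] = x[j]        # stays missing when the (id, t - lags) pair is absent
        end
    end
    return out
end
```

Looking up by (id, t) rather than shifting rows means gaps in the panel produce missing values instead of silently pairing non-adjacent periods. A negative `lags` looks forward in time, which gives leads.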
Dialysis.partiallinear — Method
function partiallinear(y::Symbol, x::Array{Symbol, 1}, controls::Array{Symbol,1}, data::DataFrame; npregress::Function=polyreg, clustervar::Symbol=Symbol())
Estimates a partially linear model. That is, estimate
\[ y = xβ + f(controls) + ϵ\]
assuming that E[ϵ|x, controls] = 0.
Arguments
- y: symbol specifying y variable
- x
- controls: list of control variables entering f
- data: DataFrame where all variables are found
- npregress: function for estimating E[w|x] nonparametrically. Used to partial out E[y|controls] and E[x|controls] and E[q|controls].
- clustervar: symbol specifying categorical variable on which to cluster when calculating standard errors
Returns
- regression output from FixedEffectModels.jl
Details
Uses orthogonal (with respect to f) moments to estimate β. In particular, it uses
\[0 = E[((y - E[y|controls]) - (x - E[x|controls])β)*(x - E[x|controls])]\]
to estimate β. In practice this can be done by regressing (y - E[y|controls]) on (x - E[x|controls]). FixedEffectModels is used for this regression. Due to the orthogonality of the moment condition, the standard errors on β will be the same as if E[y|controls] and E[x|controls] were observed (i.e. FixedEffectModels will report valid standard errors).
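The partialling-out step can be sketched with matrices in place of a DataFrame. Everything named here is hypothetical: `polyfit_sketch` is a stand-in for `npregress` (a polynomial in a single control), and plain least squares replaces the FixedEffectModels regression, so no standard errors are produced.

```julia
# Stand-in for npregress (assumption): polynomial fit in the first control column.
function polyfit_sketch(xpred::AbstractMatrix, xdata::AbstractMatrix,
                        ydata::AbstractMatrix; degree=2)
    basis(x) = hcat((x[:, 1] .^ p for p in 0:degree)...)
    return basis(xpred) * (basis(xdata) \ ydata)
end

# Illustrative sketch only; returns the point estimate of β.
function partiallinear_sketch(y::AbstractMatrix, x::AbstractMatrix,
                              controls::AbstractMatrix; npregress=polyfit_sketch)
    ey = y .- npregress(controls, controls, y)   # y - E[y|controls]
    ex = x .- npregress(controls, controls, x)   # x - E[x|controls]
    return ex \ ey                               # regress residualized y on residualized x
end
```

For example, with `c = reshape(collect(range(-1, 1, length=50)), :, 1)`, `x = c .+ sin.(7 .* c)` and `y = 2 .* x .+ c .^ 2`, the call `partiallinear_sketch(y, x, c)` recovers the coefficient on x because the quadratic f(controls) is removed from both sides before the regression.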
Dialysis.partiallinearIV — Method
partiallinearIV(y::Symbol, q::Symbol, z::Symbol,
controls::Array{Symbol,1}, data::DataFrame;
npregress::Function)
Estimates a partially linear model using IV. That is, estimate
y = αq + Φ(controls) + ϵ
using z as an instrument for q with first stage
q = h(z,controls) + u
It assumes that E[ϵ|z, controls] = 0.
Uses orthogonal (wrt Φ and other nuisance functions) moments for estimating α. In particular, it uses
0 = E[(y - E[y|controls] - α(q - E[q|controls]))*(E[q|z,controls] - E[q|controls])]
See section 4.2 (in particular footnote 8) of Chernozhukov, Chetverikov, Demirer, Duflo, Hansen, Newey, and Robins (2018) for more information.
In practice α can be estimated by an IV regression of (y - E[y|controls]) on (q - E[q|controls]) using (E[q|z,controls] - E[q|controls]) as an instrument. FixedEffectModels is used for this regression. Due to the orthogonality of the moment condition, the standard error on α will be the same as if E[y|controls] and E[q|controls] were observed (i.e. FixedEffectModels will report valid standard errors).
Arguments
- y: symbol specifying y variable
- q
- z: list of instruments
- controls: list of control variables entering Φ
- data: DataFrame where all variables are found
- npregress: function for estimating E[w|x] nonparametrically. Used to partial out E[y|controls], E[q|z,controls], and E[q|controls]. Syntax should be the same as locallinear or polyreg
Returns
- α: estimate of α
- Φ: estimate of Φ(controls)
- regest: regression output with standard error for α
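The orthogonal moment above has a closed-form solution for α when there is a single instrument index, which the following sketch computes directly. The names `quadbasis`, `quadfit_sketch`, and `partiallinearIV_sketch` are hypothetical, the quadratic series fit is only a stand-in for `npregress`, and recovering Φ as the fit of y - αq on controls is an assumption about one reasonable approach, not a statement of how the package does it. No standard errors are computed.

```julia
using LinearAlgebra

# Stand-in for npregress (assumption): intercept, levels, and squares of each column.
quadbasis(x) = hcat(ones(size(x, 1)), x, x .^ 2)
quadfit_sketch(xp, xd, yd) = quadbasis(xp) * (quadbasis(xd) \ yd)

# Illustrative sketch only; all inputs are n × 1 or n × k matrices.
function partiallinearIV_sketch(y, q, z, controls; npregress=quadfit_sketch)
    zc = hcat(z, controls)
    q_c = npregress(controls, controls, q)        # E[q|controls]
    ey = y .- npregress(controls, controls, y)    # y - E[y|controls]
    eq = q .- q_c                                 # q - E[q|controls]
    ez = npregress(zc, zc, q) .- q_c              # E[q|z,controls] - E[q|controls]
    α = dot(ez, ey) / dot(ez, eq)                 # solves the sample moment for α
    Φ = npregress(controls, controls, y .- α .* q)  # assumed way to recover Φ(controls)
    return (α=α, Φ=Φ)
end
```

The ratio `dot(ez, ey) / dot(ez, eq)` is the just-identified IV estimator with `ez` as the instrument for `eq`.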
Dialysis.polyreg — Method
polyreg(xpred::AbstractMatrix,
xdata::AbstractMatrix,
ydata::AbstractMatrix; degree=1)
Computes polynomial regression of ydata on xdata. Returns predicted y at x=xpred.
Arguments
- xpred: x values to compute fitted y
- xdata: observed x
- ydata: observed y, must have size(y)[1] == size(xdata)[1]
- degree
- deriv: whether to also return df(xpred). Only implemented when xdata is one dimensional
Returns
- Estimates of f(xpred)
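A one-dimensional version of such a polynomial regression, including the derivative option, might look like the sketch below. The name `polyreg_sketch`, the restriction to a single x column, and the choice to return a tuple when `deriv=true` are assumptions for illustration.

```julia
# Derivative of x^p with respect to x, elementwise.
dpower(x, p) = p == 0 ? zero(x) : p .* x .^ (p - 1)

# Illustrative sketch only; handles one-dimensional x.
function polyreg_sketch(xpred::AbstractMatrix, xdata::AbstractMatrix,
                        ydata::AbstractMatrix; degree=1, deriv=false)
    size(xdata, 2) == 1 ||
        throw(ArgumentError("this sketch handles one-dimensional x only"))
    powers(x) = hcat((x[:, 1] .^ p for p in 0:degree)...)
    coef = powers(xdata) \ ydata                 # least-squares polynomial coefficients
    fhat = powers(xpred) * coef                  # fitted values at xpred
    deriv || return fhat
    dfhat = hcat((dpower(xpred[:, 1], p) for p in 0:degree)...) * coef
    return fhat, dfhat                           # fitted values and their derivatives
end
```

The argument order (xpred, xdata, ydata) matches the signature documented above, so a function of this shape could be passed wherever an `npregress` function is expected.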
Dialysis.singlevars_new — Method
Parses data from a file in the new format (2020 or newer).
Dialysis.singlevars_old — Method
Parses data from a file in the old format (2019 or older).
Dialysis.yvars_new — Method
Parses data from a file in the new format (2020 or newer).