Title: | An API for M-Estimation |
---|---|
Description: | Provides a general, flexible framework for estimating parameters and empirical sandwich variance estimator from a set of unbiased estimating equations (i.e., M-estimation in the vein of Stefanski & Boos (2002) <doi:10.1198/000313002753631330>). All examples from Stefanski & Boos (2002) are published in the corresponding Journal of Statistical Software paper "The Calculus of M-Estimation in R with geex" by Saul & Hudgens (2020) <doi:10.18637/jss.v092.i02>. Also provides an API to compute finite-sample variance corrections. |
Authors: | Bradley Saul [aut, cre], Brian Barkley [ctb] |
Maintainer: | Bradley Saul <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.1.1 |
Built: | 2025-01-03 04:45:41 UTC |
Source: | https://github.com/bsaul/geex |
geex provides an extensible API for estimating parameters and their covariance from a set of estimating functions (M-estimation). M-estimation theory has a long history [see reference in the M-estimation bibliography: https://bsaul.github.io/geex/articles/articles/mestimation_bib.html. For an excellent introduction, see the primer by L.A. Stefanski and D.D. Boos, "The Calculus of M-estimation" (The American Statistician (2002), 56(1), 29-38) (http://www.jstor.org/stable/3087324).
M-estimation encompasses a broad swath of statistical estimators and ideas including:
the empirical "sandwich" variance estimator
generalized estimating equations (GEE)
many maximum likelihood estimators
robust regression
and many more
geex can implement all of these using a user-defined estimating function.
To learn more about geex, see the package vignettes: browseVignettes(package = 'geex')
.
If you can specify a set of unbiased estimating equations, geex does the rest. The goals of geex are simply:
To minimize the translational distance between a set of estimating functions and R code;
To return numerically accurate point and covariance estimates from a set of unbiased estimating functions.
geex does not, by itself, necessarily aim to be fast nor precise. Such goals are left to the user to implement or confirm.
Maintainer: Bradley Saul [email protected]
Other contributors:
Brian Barkley [contributor]
Saul, Bradley C., and Michael G. Hudgens. (2020). "The Calculus of M-estimation in R with geex." Journal of Statistical Software 92(2), 1-15. doi:10.18637/jss.v092.i02.
Useful links:
Report bugs at https://github.com/bsaul/geex/issues
EXPERIMENTAL. See example 7 in vignette("01_additional_examples", package = "geex")
for usage.
.FUN
a function which approximates an estFUN
.
.options
a list of options passed to .FUN
.
A general class for defining a function
, and the options passed to the
function
.FUN
a function
.options
a list of options passed to .FUN
root_control-class
, deriv_control-class
approx_control-class
Gets the parameter estimates from a geex object
## S4 method for signature 'geex' coef(object) ## S4 method for signature 'geex_summary' coef(object)
## S4 method for signature 'geex' coef(object) ## S4 method for signature 'geex_summary' coef(object)
object |
a |
ex_eeFUN <- function(data){ function(theta){ with(data, c(Y1 - theta[1], (Y1 - theta[1])^2 - theta[2] )) }} results <- m_estimate( estFUN = ex_eeFUN, data = geexex, root_control = setup_root_control(start = c(1,1))) coef(results)
ex_eeFUN <- function(data){ function(theta){ with(data, c(Y1 - theta[1], (Y1 - theta[1])^2 - theta[2] )) }} results <- m_estimate( estFUN = ex_eeFUN, data = geexex, root_control = setup_root_control(start = c(1,1))) coef(results)
Computes with
provided
and
matrices.
compute_sigma(A, B, solver = solve)
compute_sigma(A, B, solver = solve)
A |
a matrix, generally the |
B |
a matrix, generally the |
solver |
the function used to compute the inverse of |
the matrix
Ainv %*% B %*% t(Ainv)
A <- diag(2, nrow = 2, ncol = 2) B <- matrix(4, nrow = 2, ncol = 2) compute_sigma(A = A, B = B)
A <- diag(2, nrow = 2, ncol = 2) B <- matrix(4, nrow = 2, ncol = 2) compute_sigma(A = A, B = B)
Modifies the matrices in a sandwich_components
object
using the function and options in a correct_control
object.
The function correction
is a utility for creating
correct_control
objects.
correct_by(.components, .correct_control)
correct_by(.components, .correct_control)
.components |
an object of class |
.correct_control |
an object of class |
See the finite sample corrections vignette for further examples.
the result of .FUN
in .correct_control
.
fay_bias_correction
and fay_df_correction
for corrections provided by geex
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } mybasis <- create_basis( estFUN = myee, data = geexex) mats <- estimate_sandwich_matrices(mybasis, .theta = c(5.04, 10.04)) correct_by(mats, .correct_control = correction(fay_bias_correction, b = .75))
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } mybasis <- create_basis( estFUN = myee, data = geexex) mats <- estimate_sandwich_matrices(mybasis, .theta = c(5.04, 10.04)) correct_by(mats, .correct_control = correction(fay_bias_correction, b = .75))
correct_control S4 class
.FUN
a function which "corrects" a sandwich_components
object. Usually a small-sample correction
.options
a list of options passed to .FUN
.
Creates a correct_control object
correction(FUN, ...)
correction(FUN, ...)
FUN |
a correction to perform. |
... |
additional arguments passed to |
a correct_control
object
correction(FUN = fay_bias_correction, b = 0.75)
correction(FUN = fay_bias_correction, b = 0.75)
Creates an m_estimation_basis object
create_basis(estFUN, data, units, outer_args, inner_args)
create_basis(estFUN, data, units, outer_args, inner_args)
estFUN |
a function that takes in group-level data and returns a function that takes parameters as its first argument |
data |
a data.frame |
units |
an optional character string identifying the grouping variable in |
outer_args |
a list of arguments passed to the outer (data) function of |
inner_args |
a list of arguments passed to the inner (theta) function of |
Either data
or split_data
must be provided
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } mybasis <- create_basis( estFUN = myee, data = geexex)
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } mybasis <- create_basis( estFUN = myee, data = geexex)
From a list of for i = 1, ..., m,
creates
,
called
GFUN
. Here, is the
*inner* part of an
estFUN
, in that the data is fixed and
is a function of
.
create_GFUN(object, ...) ## S4 method for signature 'm_estimation_basis' create_GFUN(object)
create_GFUN(object, ...) ## S4 method for signature 'm_estimation_basis' create_GFUN(object)
object |
an object of class |
... |
additional arguments passed to other methods |
a function
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } mybasis <- create_basis( estFUN = myee, data = geexex) f <- grab_GFUN(create_GFUN(mybasis)) # Evaluate GFUN at mean and variance: should be close to zero n <- nrow(geexex) f(c(mean(geexex$Y1), var(geexex$Y1) * (n - 1)/n))
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } mybasis <- create_basis( estFUN = myee, data = geexex) f <- grab_GFUN(create_GFUN(mybasis)) # Evaluate GFUN at mean and variance: should be close to zero n <- nrow(geexex) f(c(mean(geexex$Y1), var(geexex$Y1) * (n - 1)/n))
Creates the estimating function ()
for each unit. That is, this function evaluates the outer function in
estFUN
for each independent unit and a returns the inner function in
estFUN
.
create_psiFUN_list(object, ...) ## S4 method for signature 'm_estimation_basis' create_psiFUN_list(object)
create_psiFUN_list(object, ...) ## S4 method for signature 'm_estimation_basis' create_psiFUN_list(object)
object |
an object of class |
... |
additional arguments passed to other methods |
the object
with the .psiFUN_list
slot populated.
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } mybasis <- create_basis( estFUN = myee, data = geexex) psi_list <- grab_psiFUN_list(create_psiFUN_list(mybasis)) # A list of functions head(psi_list)
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } mybasis <- create_basis( estFUN = myee, data = geexex) psi_list <- grab_psiFUN_list(create_psiFUN_list(mybasis)) # A list of functions head(psi_list)
deriv_control S4 class
.FUN
a function which computes a numerical derivation. This functions
first argument must the function on which the derivative is being compute.
Defaults to jacobian
.
.options
a list of options passed to .FUN
. Defaults to
list(method = 'Richardson')
Computes the value of
, i.e., the estimating
equations at theta
. Used to verify that (or close to 0).
diagnose_roots(GFUN, theta)
diagnose_roots(GFUN, theta)
GFUN |
a function of theta |
theta |
parameter estimates to use in evaluating the estimating equations. |
a numeric vector
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } mest <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1, 1))) f <- grab_GFUN(mest@basis) # Should be close to zero diagnose_roots(GFUN = f, theta = roots(mest))
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } mest <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1, 1))) f <- grab_GFUN(mest@basis) # Should be close to zero diagnose_roots(GFUN = f, theta = roots(mest))
Using the rootFUN
specified by the user (defaults to multiroot
),
this function estimates the roots of the equations:
estimate_GFUN_roots(.basis)
estimate_GFUN_roots(.basis)
.basis |
an object of class |
This is primilary an internal function used within m_estimate
,
but it is exported for use in debugging and development.
For an example of how to use a different rootFUN
,
see the root solver vignette, vignette('geex_root_solvers', package = 'geex')
.
the output of the rootFUN
function
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } # Start with a basic basis mybasis <- create_basis( estFUN = myee, data = geexex) # Add a control for the root solver mycontrol <- new('geex_control', .root = setup_root_control(start = c(1, 1))) [email protected] <- mycontrol # Now estimate roots of GFUN roots <- estimate_GFUN_roots(mybasis) roots
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } # Start with a basic basis mybasis <- create_basis( estFUN = myee, data = geexex) # Add a control for the root solver mycontrol <- new('geex_control', .root = setup_root_control(start = c(1, 1))) mybasis@.control <- mycontrol # Now estimate roots of GFUN roots <- estimate_GFUN_roots(mybasis) roots
For a given set of estimating equations computes the 'meat' (
in Stefanski and Boos notation) and 'bread' (
in Stefanski and
Boos notation) matrices necessary to compute the covariance matrix.
estimate_sandwich_matrices(.basis, .theta)
estimate_sandwich_matrices(.basis, .theta)
.basis |
basis an object of class |
.theta |
vector of parameter estimates (i.e. estimated roots) |
For a set of estimating equations (),
this function computes:
where all of the above are evaluated at . The partial derivatives in
numerically approximated by the function defined in
deriv_control
.
Note that and not
, and the same for
.
a sandwich_components
object
Stefanski, L. A., & Boos, D. D. (2002). The calculus of m-estimation. The American Statistician, 56(1), 29-38.
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } # Start with a basic basis mybasis <- create_basis( estFUN = myee, data = geexex) # Now estimate sandwich matrices estimate_sandwich_matrices( mybasis, c(mean(geexex$Y1), var(geexex$Y1)))
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } # Start with a basic basis mybasis <- create_basis( estFUN = myee, data = geexex) # Now estimate sandwich matrices estimate_sandwich_matrices( mybasis, c(mean(geexex$Y1), var(geexex$Y1)))
estimating_function S4 class
.estFUN
the estimating function.
.outer_args
a named list
of arguments passed to the outer
function of .estFUN
. Should *not* include the data
argument.
.inner_args
a named list
of arguments passed to the inner
function of .estFUN
. Should *not* include the theta
argument.
Computes the bias corrected sandwich covariance matrix described in Fay and
Graubard (2001). See vignette("05_finite_sample_corrections", package = "geex")
for further information.
fay_bias_correction(components, b = 0.75)
fay_bias_correction(components, b = 0.75)
components |
an object of class |
b |
a numeric value < 1. Defaults to 0.75 as in Fay. |
a corrected covariance matrix
Fay, M. P., & Graubard, B. I. (2001). Small-Sample adjustments for Wald-type tests using sandwich estimators. Biometrics, 57(4), 1198-1206
# This example demonstrates usage of the corrections, not a meaningful application myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1)), corrections = list( bias_correction_.1 = correction(fay_bias_correction, b = .1), bias_correction_.3 = correction(fay_bias_correction, b = .3)) ) get_corrections(results)
# This example demonstrates usage of the corrections, not a meaningful application myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1)), corrections = list( bias_correction_.1 = correction(fay_bias_correction, b = .1), bias_correction_.3 = correction(fay_bias_correction, b = .3)) ) get_corrections(results)
Computes the degrees of freedom correction described in Fay and
Graubard (2001). See vignette("05_finite_sample_corrections", package = "geex")
for further information.
fay_df_correction(components, b = 0.75, L, version)
fay_df_correction(components, b = 0.75, L, version)
components |
an object of class |
b |
a numeric value < 1. Defaults to 0.75 as in Fay. |
L |
a k x p matrix where p is the dimension of theta |
version |
either 1 or 2, corresponding to hat(d) or tilde(d), respectively |
a scalar corresponding to the estimated degrees of freedom
Fay, M. P., & Graubard, B. I. (2001). Small-Sample adjustments for Wald-type tests using sandwich estimators. Biometrics, 57(4), 1198-1206
# This example demonstrates usage of the corrections, not a meaningful application myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1)), corrections = list( df_correction1 = correction(fay_df_correction, b = .75, L = c(0, 1), version = 1 ), df_correction2 = correction(fay_df_correction, b = .75, L = c(0, 1), version = 2 )) ) get_corrections(results)
# This example demonstrates usage of the corrections, not a meaningful application myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1)), corrections = list( df_correction1 = correction(fay_df_correction, b = .75, L = c(0, 1), version = 1 ), df_correction2 = correction(fay_df_correction, b = .75, L = c(0, 1), version = 2 )) ) get_corrections(results)
An object which control all the basic_control
objects
necessary to perform M-estimation
.approx
an approx_control
object
.root
a root_control
object
.deriv
a deriv_control
object
geex summary object
estFUN
a estimating-function
outer_args
the list
arguments passed to the m_estimate
call
inner_args
the list
arguments passed to the m_estimate
call
data
the data.frame
passed to the m_estimate
call
weights
the weights
passed to the m_estimate
call
nobs
the number of observational units used to compute the M-estimator
units
the name of the variable identifying the observational units
corrections
a list
of correction performed on sandwich_components
estimates
a numeric
vector of parameter estimates
vcov
the empirical sandwich variance matrix
geex S4 class
call
the m_estimate
call
basis
a m_estimation_basis
object
rootFUN_results
the results of call to the root finding algorithm function
sandwich_components
a sandwich_components
object
GFUN
the function
of which the roots are computed.
corrections
a list
of correction performed on sandwich_components
estimates
a numeric
vector of parameter estimates
vcov
the empirical sandwich variance matrix
The data used to illustrate examples 1-9 of Stefanski and Boos (2002).
a dataset with 9 variables and 100 observations
Y1 rnorm(mean = 5, sd = 4)
Y2 rnorm(mean = 2, sd = 1)
X1 rgamma(shape =5)
Y3 2 + 3*X1 + 1*rnorm(0, 1)
W1 X1 + 0.25 * rnorm(0, 1)
Z1 2 + 1.5*X1 + 1*rnorm(0, 1)
X2 0 for first 50 observation, 1 for rest
Y4 0.1 + 0.1*X1 + 0.5*X2 + rnorm(0, 1)
Y5 rbinom(prob = plogis(0.1 + 0.1*X1 + 0.5*X2))
Stefanski, L. A., & Boos, D. D. (2002). The calculus of m-estimation. The American Statistician, 56(1), 29-38.
Gets the corrections from a geex object
get_corrections(object, ...) ## S4 method for signature 'geex' get_corrections(object) ## S4 method for signature 'geex_summary' get_corrections(object)
get_corrections(object, ...) ## S4 method for signature 'geex' get_corrections(object) ## S4 method for signature 'geex_summary' get_corrections(object)
object |
a |
... |
arguments passed to other methods |
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1)), corrections = list( bias_correction_.1 = correction(fay_bias_correction, b = .1), bias_correction_.3 = correction(fay_bias_correction, b = .3)) ) get_corrections(results)
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1)), corrections = list( bias_correction_.1 = correction(fay_bias_correction, b = .1), bias_correction_.3 = correction(fay_bias_correction, b = .3)) ) get_corrections(results)
Grab something from an object
grab(from, what, ...)
grab(from, what, ...)
from |
an object |
what |
what to grab one of 'response', 'design_matrix', 'response_formula', 'fixed_formula', 'eeFUN' |
... |
additional arguments passed to |
the value returns depends on the argument what
.
grab_response
, grab_design_matrix
,
grab_response_formula
, grab_fixed_formula
,
grab_design_levels
Grabs the .A (bread matrix) slot
grab_bread(object) ## S4 method for signature 'sandwich_components' grab_bread(object)
grab_bread(object) ## S4 method for signature 'sandwich_components' grab_bread(object)
object |
a |
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1))) grab_bread(results@sandwich_components)
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1))) grab_bread(results@sandwich_components)
Gets the .A_i (list of bread matrices) slot
grab_bread_list(object) ## S4 method for signature 'sandwich_components' grab_bread_list(object)
grab_bread_list(object) ## S4 method for signature 'sandwich_components' grab_bread_list(object)
object |
a |
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1))) head(grab_bread_list(results@sandwich_components))
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1))) head(grab_bread_list(results@sandwich_components))
Useful when splitting data later, used with grab_design_matrix
or especially when calling grab_psiFUN
from within an eeFun.
grab_design_levels(model)
grab_design_levels(model)
model |
a model object such as |
A named list of character vectors that provides the fentire set of
levels that each factor predictor in model
will take on. This is
hopefully identical to what the xlev
argument to
link[stats]{model.frame}
desires. When model
has no factors
as predictors, then an empty list is returned.
## Not run: geex::grab_design_matrix( data = data, rhs_formula = geex::grab_fixed_formula(model), xlev = geex::grab_design_levels(model) ) ## Below is helpful within an eeFun. geex::grab_psiFUN( data = data,## Especially when this is a subset of the data rhs_formula = geex::grab_fixed_formula(model), xlev = geex::grab_design_levels(model) ) ## End(Not run)
## Not run: geex::grab_design_matrix( data = data, rhs_formula = geex::grab_fixed_formula(model), xlev = geex::grab_design_levels(model) ) ## Below is helpful within an eeFun. geex::grab_psiFUN( data = data,## Especially when this is a subset of the data rhs_formula = geex::grab_fixed_formula(model), xlev = geex::grab_design_levels(model) ) ## End(Not run)
Grab a matrix of fixed effects from a model object
grab_design_matrix(data, rhs_formula, ...)
grab_design_matrix(data, rhs_formula, ...)
data |
the data from which to extract the matrix |
rhs_formula |
the right hand side of a model formula |
... |
Can be used to pass |
# Create a "desigm" matrix for the first ten rows of iris data fit <- lm(Sepal.Width ~ Petal.Width, data = iris) grab_design_matrix( data = iris[1:10, ], grab_fixed_formula(fit))
# Create a "desigm" matrix for the first ten rows of iris data fit <- lm(Sepal.Width ~ Petal.Width, data = iris) grab_design_matrix( data = iris[1:10, ], grab_fixed_formula(fit))
Gets the .ee_i (observed estimating function) slot
grab_ee_list(object)
grab_ee_list(object)
object |
a |
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1))) grab_ee_list(results@sandwich_components)
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1))) grab_ee_list(results@sandwich_components)
Grab estimating functions from a model object
grab_estFUN(object) ## S4 method for signature 'estimating_function' grab_estFUN(object)
grab_estFUN(object) ## S4 method for signature 'estimating_function' grab_estFUN(object)
object |
a |
Grab the RHS formula from a model object
grab_fixed_formula(model)
grab_fixed_formula(model)
model |
a model object such as |
the right-hand side of a model's formula
object
fit <- lm(Sepal.Width ~ Petal.Width, data = iris) grab_fixed_formula(fit)
fit <- lm(Sepal.Width ~ Petal.Width, data = iris) grab_fixed_formula(fit)
Gets the .psi_list slot in a m_estimation_basis
grab_GFUN(object) ## S4 method for signature 'm_estimation_basis' grab_GFUN(object) ## S4 method for signature 'geex' grab_GFUN(object)
grab_GFUN(object) ## S4 method for signature 'm_estimation_basis' grab_GFUN(object) ## S4 method for signature 'geex' grab_GFUN(object)
object |
a |
Gets the .B (meat matrix) slot
grab_meat(object) ## S4 method for signature 'sandwich_components' grab_meat(object)
grab_meat(object) ## S4 method for signature 'sandwich_components' grab_meat(object)
object |
a |
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1))) grab_meat_list(results@sandwich_components)
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1))) grab_meat_list(results@sandwich_components)
Gets the .B_i (list of bread matrices) slot
grab_meat_list(object) ## S4 method for signature 'sandwich_components' grab_meat_list(object) ## S4 method for signature 'sandwich_components' grab_ee_list(object)
grab_meat_list(object) ## S4 method for signature 'sandwich_components' grab_meat_list(object) ## S4 method for signature 'sandwich_components' grab_ee_list(object)
object |
a |
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1))) head(grab_meat_list(results@sandwich_components))
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1))) head(grab_meat_list(results@sandwich_components))
Grab estimating functions from a model object
grab_psiFUN(object, ...) ## S3 method for class 'glm' grab_psiFUN(object, data, ...) ## S3 method for class 'geeglm' grab_psiFUN(object, data, ...) ## S3 method for class 'merMod' grab_psiFUN(object, data, numderiv_opts = NULL, ...)
grab_psiFUN(object, ...) ## S3 method for class 'glm' grab_psiFUN(object, data, ...) ## S3 method for class 'geeglm' grab_psiFUN(object, data, ...) ## S3 method for class 'merMod' grab_psiFUN(object, data, numderiv_opts = NULL, ...)
object |
the object from which to extrace |
... |
additonal arguments passed to other methods |
data |
the data to use for the estimating function |
numderiv_opts |
a list of arguments passed to |
a function corresponding to the estimating equations of a model
grab_psiFUN(glm)
: Create estimating equation function from a glm
object
grab_psiFUN(geeglm)
: Create estimating equation function from a geeglm
object
grab_psiFUN(merMod)
: Create estimating equation function from a merMod
object
## Not run: library(geepack) library(lme4) data('ohio') glmfit <- glm(resp ~ age, data = ohio, family = binomial(link = "logit")) geefit <- geeglm(resp ~ age, data = ohio, id = id, family = binomial(link = "logit")) glmmfit <- glmer(resp ~ age + (1|id), data = ohio, family = binomial(link = "logit")) example_ee <- function(data, model){ f <- grab_psiFUN(model, data) function(theta){ f(theta) } } m_estimate( estFUN = example_ee, data = ohio, compute_roots = FALSE, units = 'id', roots = coef(glmfit), outer_args = list(model = glmfit)) m_estimate( estFUN = example_ee, data = ohio, compute_roots = FALSE, units = 'id', roots = coef(geefit), outer_args = list(model = geefit)) m_estimate( estFUN = example_ee, data = ohio, compute_roots = FALSE, units = 'id', roots = unlist(getME(glmmfit, c('beta', 'theta'))), outer_args = list(model = glmmfit)) ## End(Not run)
## Not run: library(geepack) library(lme4) data('ohio') glmfit <- glm(resp ~ age, data = ohio, family = binomial(link = "logit")) geefit <- geeglm(resp ~ age, data = ohio, id = id, family = binomial(link = "logit")) glmmfit <- glmer(resp ~ age + (1|id), data = ohio, family = binomial(link = "logit")) example_ee <- function(data, model){ f <- grab_psiFUN(model, data) function(theta){ f(theta) } } m_estimate( estFUN = example_ee, data = ohio, compute_roots = FALSE, units = 'id', roots = coef(glmfit), outer_args = list(model = glmfit)) m_estimate( estFUN = example_ee, data = ohio, compute_roots = FALSE, units = 'id', roots = coef(geefit), outer_args = list(model = geefit)) m_estimate( estFUN = example_ee, data = ohio, compute_roots = FALSE, units = 'id', roots = unlist(getME(glmmfit, c('beta', 'theta'))), outer_args = list(model = glmmfit)) ## End(Not run)
Gets the .psi_list slot in a m_estimation_basis
grab_psiFUN_list(object) ## S4 method for signature 'm_estimation_basis' grab_psiFUN_list(object) ## S4 method for signature 'geex' grab_psiFUN_list(object)
grab_psiFUN_list(object) ## S4 method for signature 'm_estimation_basis' grab_psiFUN_list(object) ## S4 method for signature 'geex' grab_psiFUN_list(object)
object |
a |
Grab a vector of responses from a model object
grab_response(data, formula)
grab_response(data, formula)
data |
data.frame from which to extract the vector of responses |
formula |
model formula |
# Grab vector of responses for the first ten rows of iris data fit <- lm(Sepal.Width ~ Petal.Width, data = iris) grab_response( data = iris[1:10, ], formula(fit))
# Grab vector of responses for the first ten rows of iris data fit <- lm(Sepal.Width ~ Petal.Width, data = iris) grab_response( data = iris[1:10, ], formula(fit))
Grab the LHS formula from a model object
grab_response_formula(model)
grab_response_formula(model)
model |
a model object such as |
the left-hand side of a model's formula
object
fit <- lm(Sepal.Width ~ Petal.Width, data = iris) grab_response_formula(fit)
fit <- lm(Sepal.Width ~ Petal.Width, data = iris) grab_response_formula(fit)
M-estimation theory provides a framework for asympotic properties of estimators that are solutions to estimating equations. Many R packages implement specific applications of estimating equations. geex aims to be provide a more general framework that any modelling method can use to compute point and variance estimates for parameters that are solutions to estimating equations of the form:
m_estimate( estFUN, data, units = character(0), weights = numeric(0), outer_args = list(), inner_args = list(), roots = NULL, compute_roots = TRUE, compute_vcov = TRUE, Asolver = solve, corrections, deriv_control, root_control, approx_control )
m_estimate( estFUN, data, units = character(0), weights = numeric(0), outer_args = list(), inner_args = list(), roots = NULL, compute_roots = TRUE, compute_vcov = TRUE, Asolver = solve, corrections, deriv_control, root_control, approx_control )
estFUN |
a function that takes in group-level data and returns a function that takes parameters as its first argument |
data |
a data.frame |
units |
an optional character string identifying the grouping variable in |
weights |
an optional vector of weights. See details. |
outer_args |
a list of arguments passed to the outer (data) function of |
inner_args |
a list of arguments passed to the inner (theta) function of |
roots |
a vector of parameter estimates must be provided if |
compute_roots |
whether or not to find the roots of the estimating equations.
Defaults to |
compute_vcov |
whether or not to compute the variance-covariance matrix.
Defaults to |
Asolver |
a function passed to |
corrections |
an optional list of small sample corrections where each
list element is a |
deriv_control |
a |
root_control |
a |
approx_control |
a |
The basic idea of geex is for the analyst to provide at least two items:
data
estFUN
: (the function), a function that takes unit-level
data and returns a function in terms of parameters (
)
With the estFUN
, geex computes the roots of the estimating equations
and/or the empirical sandwich variance estimator.
The root finding algorithm defaults to multiroot
to
estimate roots though the solver algorithm can be specified in the rootFUN
argument. Starting values for multiroot
are passed via the
root_control
argument. See vignette("v03_root_solvers", package = "geex")
for information on customizing the root solver function.
To compute only the covariance matrix, set compute_roots = FALSE
and pass
estimates of via the
roots
argument.
M-estimation is often used for clustered data, and a variable by which to split
the data.frame into independent units is specified by the units
argument.
This argument defaults to NULL
, in which case the number of units equals
the number of rows in the data.frame.
For information on the finite-sample corrections, refer to the finite sample
correction API vignette: vignette("v05_finite_sample_corrections", package = "geex")
a geex
object
An estFUN
is a function representing . geex works
by breaking
into two parts:
the "outer" part of the estFUN
which manipulates data
and
outer_args
and returns an
"inner" function of theta
and inner_args
. Internally, this
"inner" function is called psiFUN
.
In pseudo-code this looks like:
function(data, <<outer_args>>){ O <- manipulate(data, <<outer_args>>) function(theta, <<inner_args>>){ map(O, to = theta, and = <<inner_args>>) } }
See the examples below or the package vignettes to see an estFUN
in action.
Importantly, the data
used in an estFUN
is *unit* level data,
which may be single rows in a data.frame or block of rows for clustered data.
Additional arguments may be passed to both the inner and outer function of the
estFUN
. Elements in an outer_args
list are passed to the outer
function; any elements of the inner_args
list are passed to the inner
function. For an example, see the finite sample correction vignette [
vignette("v05_finite_sample_corrections", package = "geex")
].
To estimate roots of the estimating functions, geex uses the rootSolve
multiroot
function by default, which requires starting
values. The root_control
argument expects a root_control
object, which the utility function setup_root_control
aids in
creating. For example, setup_root_control(start = 4)
creates a
root_control
setting the starting value to 4. In general,
the dimension of start
must the same as theta
in the inner
estFUN
.
In some situations, use of weights can massively speed computations. Refer
to vignette("v04_weights", package = "geex")
for an example.
Stefanski, L. A., & Boos, D. D. (2002). The calculus of M-estimation. The American Statistician, 56(1), 29-38.
# Estimate the mean and variance of Y1 in the geexex dataset ex_eeFUN <- function(data){ function(theta){ with(data, c(Y1 - theta[1], (Y1 - theta[1])^2 - theta[2] )) }} m_estimate( estFUN = ex_eeFUN, data = geexex, root_control = setup_root_control(start = c(1,1))) # compare to the mean() and variance() functions mean(geexex$Y1) n <- nrow(geexex) var(geexex$Y1) * (n - 1)/n # A simple linear model for regressing X1 and X2 on Y4 lm_eefun <- function(data){ X <- cbind(1, data$X1, data$X2) Y <- data$Y4 function(theta){ t(X) %*% (Y - X %*% theta) } } m_estimate( estFUN = lm_eefun, data = geexex, root_control = setup_root_control(start = c(0, 0, 0))) # Compare to lm() results summary(lm(Y4 ~ X1 + X2, data = geexex))
# Estimate the mean and variance of Y1 in the geexex dataset ex_eeFUN <- function(data){ function(theta){ with(data, c(Y1 - theta[1], (Y1 - theta[1])^2 - theta[2] )) }} m_estimate( estFUN = ex_eeFUN, data = geexex, root_control = setup_root_control(start = c(1,1))) # compare to the mean() and variance() functions mean(geexex$Y1) n <- nrow(geexex) var(geexex$Y1) * (n - 1)/n # A simple linear model for regressing X1 and X2 on Y4 lm_eefun <- function(data){ X <- cbind(1, data$X1, data$X2) Y <- data$Y4 function(theta){ t(X) %*% (Y - X %*% theta) } } m_estimate( estFUN = lm_eefun, data = geexex, root_control = setup_root_control(start = c(0, 0, 0))) # Compare to lm() results summary(lm(Y4 ~ X1 + X2, data = geexex))
m_estimation_basis S4 class
.data
the analysis data.frame
.units
an (optional) character string identifying the variable in
.data
which splits the data into indepedent units
.weights
a numeric vector of weights used in weighting the estimating functions
.psiFUN_list
a list of psiFUN
s created by create_psiFUN_list
.GFUN
a function created by create_GFUN
.control
a geex_control
object
Extract the number observations
## S4 method for signature 'geex' nobs(object) ## S4 method for signature 'geex_summary' nobs(object)
## S4 method for signature 'geex' nobs(object) ## S4 method for signature 'geex_summary' nobs(object)
object |
a |
library(geepack) data('ohio') glmfit <- glm(resp ~ age, data = ohio, family = binomial(link = "logit")) example_ee <- function(data, model){ f <- grab_psiFUN(model, data) function(theta){ f(theta) } } z <- m_estimate( estFUN = example_ee, data = ohio, compute_roots = FALSE, units = 'id', roots = coef(glmfit), outer_args = list(model = glmfit)) nobs(z)
library(geepack) data('ohio') glmfit <- glm(resp ~ age, data = ohio, family = binomial(link = "logit")) example_ee <- function(data, model){ f <- grab_psiFUN(model, data) function(theta){ f(theta) } } z <- m_estimate( estFUN = example_ee, data = ohio, compute_roots = FALSE, units = 'id', roots = coef(glmfit), outer_args = list(model = glmfit)) nobs(z)
root_control S4 class
.FUN
a root finding function whose first argument must be named f
.
.options
a list of options passed to .FUN
.
.object_name
a character string identifying the object containing the
roots in the output of .FUN
.
Gets the parameter estimates matrix from a geex object
roots(object, ...) ## S4 method for signature 'geex' roots(object) ## S4 method for signature 'geex_summary' roots(object)
roots(object, ...) ## S4 method for signature 'geex' roots(object) ## S4 method for signature 'geex_summary' roots(object)
object |
a |
... |
arguments passed to other methods |
ex_eeFUN <- function(data){ function(theta){ with(data, c(Y1 - theta[1], (Y1 - theta[1])^2 - theta[2] )) }} results <- m_estimate( estFUN = ex_eeFUN, data = geexex, root_control = setup_root_control(start = c(1,1))) roots(results)
ex_eeFUN <- function(data){ function(theta){ with(data, c(Y1 - theta[1], (Y1 - theta[1])^2 - theta[2] )) }} results <- m_estimate( estFUN = ex_eeFUN, data = geexex, root_control = setup_root_control(start = c(1,1))) roots(results)
sandwich_components S4 class
.A
the "bread" matrix
.A_i
a list of "bread" matrices per unit
.B
the "meat" matrix
.B_i
a list of "meat" matrices per unit
.ee_i
a list of observed estimating function values per unit
Setup an approx_control object
setup_approx_control(FUN, ...)
setup_approx_control(FUN, ...)
FUN |
a function |
... |
arguments passed to |
a approx_control
object
# For usage, see example 7 in vignette("01_additional_examples", package = "geex")
# For usage, see example 7 in vignette("01_additional_examples", package = "geex")
Setup a basic_control object
setup_control(type, FUN, ...)
setup_control(type, FUN, ...)
type |
one of |
FUN |
a function |
... |
arguments passed to |
a basic_control
object
setup_root_control
, setup_deriv_control
,
setup_approx_control
Setup a deriv_control object
setup_deriv_control(FUN, ...)
setup_deriv_control(FUN, ...)
FUN |
a function |
... |
arguments passed to |
a deriv_control
object
setup_deriv_control() # default setup_deriv_control(method = "simple") # will speed up computations
setup_deriv_control() # default setup_deriv_control(method = "simple") # will speed up computations
Setup a root_control object
setup_root_control(FUN, roots_name, ...)
setup_root_control(FUN, roots_name, ...)
FUN |
a function |
roots_name |
a character string identifying the object containing the |
... |
arguments passed to |
a root_control
object
# Setup the default setup_root_control(start = c(3, 5, 6)) # Also setup the default setup_root_control(FUN = rootSolve::multiroot, start = c(3, 5, 6)) # Or use uniroot() setup_root_control(FUN = stats::uniroot, interval = c(0, 1))
# Setup the default setup_root_control(start = c(3, 5, 6)) # Also setup the default setup_root_control(FUN = rootSolve::multiroot, start = c(3, 5, 6)) # Or use uniroot() setup_root_control(FUN = stats::uniroot, interval = c(0, 1))
m_estimation_basis
, or geex
object
show(object) ## S4 method for signature 'sandwich_components' show(object) ## S4 method for signature 'm_estimation_basis' show(object) ## S4 method for signature 'geex' show(object) ## S4 method for signature 'geex_summary' show(object)
show(object) ## S4 method for signature 'sandwich_components' show(object) ## S4 method for signature 'm_estimation_basis' show(object) ## S4 method for signature 'geex' show(object) ## S4 method for signature 'geex_summary' show(object)
object |
the object to print |
Object Summaries
## S4 method for signature 'geex' summary(object, keep_data = TRUE, keep_args = TRUE)
## S4 method for signature 'geex' summary(object, keep_data = TRUE, keep_args = TRUE)
object |
a |
keep_data |
keep the original data or not |
keep_args |
keep the |
library(geepack) data('ohio') glmfit <- glm(resp ~ age, data = ohio, family = binomial(link = "logit")) example_ee <- function(data, model){ f <- grab_psiFUN(model, data) function(theta){ f(theta) } } z <- m_estimate( estFUN = example_ee, data = ohio, compute_roots = FALSE, units = 'id', roots = coef(glmfit), outer_args = list(model = glmfit)) object.size(z) object.size(summary(z)) object.size(summary(z, keep_data = FALSE)) object.size(summary(z, keep_data = FALSE, keep_args = FALSE))
library(geepack) data('ohio') glmfit <- glm(resp ~ age, data = ohio, family = binomial(link = "logit")) example_ee <- function(data, model){ f <- grab_psiFUN(model, data) function(theta){ f(theta) } } z <- m_estimate( estFUN = example_ee, data = ohio, compute_roots = FALSE, units = 'id', roots = coef(glmfit), outer_args = list(model = glmfit)) object.size(z) object.size(summary(z)) object.size(summary(z, keep_data = FALSE)) object.size(summary(z, keep_data = FALSE, keep_args = FALSE))
Gets the variance-covariance matrix from a geex object
## S4 method for signature 'geex' vcov(object) ## S4 method for signature 'geex_summary' vcov(object)
## S4 method for signature 'geex' vcov(object) ## S4 method for signature 'geex_summary' vcov(object)
object |
a |
ex_eeFUN <- function(data){ function(theta){ with(data, c(Y1 - theta[1], (Y1 - theta[1])^2 - theta[2] )) }} results <- m_estimate( estFUN = ex_eeFUN, data = geexex, root_control = setup_root_control(start = c(1,1))) vcov(results)
ex_eeFUN <- function(data){ function(theta){ with(data, c(Y1 - theta[1], (Y1 - theta[1])^2 - theta[2] )) }} results <- m_estimate( estFUN = ex_eeFUN, data = geexex, root_control = setup_root_control(start = c(1,1))) vcov(results)
Extract Model weights
## S4 method for signature 'geex' weights(object) ## S4 method for signature 'geex_summary' weights(object)
## S4 method for signature 'geex' weights(object) ## S4 method for signature 'geex_summary' weights(object)
object |
a |