| Title: | An API for M-Estimation |
|---|---|
| Description: | Provides a general, flexible framework for estimating parameters and empirical sandwich variance estimator from a set of unbiased estimating equations (i.e., M-estimation in the vein of Stefanski & Boos (2002) <doi:10.1198/000313002753631330>). All examples from Stefanski & Boos (2002) are published in the corresponding Journal of Statistical Software paper "The Calculus of M-Estimation in R with geex" by Saul & Hudgens (2020) <doi:10.18637/jss.v092.i02>. Also provides an API to compute finite-sample variance corrections. |
| Authors: | Bradley Saul [aut, cre], Brian Barkley [ctb] |
| Maintainer: | Bradley Saul <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.1.1 |
| Built: | 2026-05-31 09:42:56 UTC |
| Source: | https://github.com/bsaul/geex |
geex provides an extensible API for estimating parameters and their covariance from a set of estimating functions (M-estimation). M-estimation theory has a long history [see reference in the M-estimation bibliography: https://bsaul.github.io/geex/articles/articles/mestimation_bib.html. For an excellent introduction, see the primer by L.A. Stefanski and D.D. Boos, "The Calculus of M-estimation" (The American Statistician (2002), 56(1), 29-38) (http://www.jstor.org/stable/3087324).
M-estimation encompasses a broad swath of statistical estimators and ideas including:
the empirical "sandwich" variance estimator
generalized estimating equations (GEE)
many maximum likelihood estimators
robust regression
and many more
geex can implement all of these using a user-defined estimating function.
To learn more about geex, see the package vignettes: browseVignettes(package = 'geex').
If you can specify a set of unbiased estimating equations, geex does the rest. The goals of geex are simply:
To minimize the translational distance between a set of estimating functions and R code;
To return numerically accurate point and covariance estimates from a set of unbiased estimating functions.
geex does not, by itself, necessarily aim to be fast nor precise. Such goals are left to the user to implement or confirm.
Maintainer: Bradley Saul [email protected]
Other contributors:
Brian Barkley [contributor]
Saul, Bradley C., and Michael G. Hudgens. (2020). "The Calculus of M-estimation in R with geex." Journal of Statistical Software 92(2), 1-15. doi:10.18637/jss.v092.i02.
Useful links:
Report bugs at https://github.com/bsaul/geex/issues
EXPERIMENTAL. See example 7 in vignette("01_additional_examples", package = "geex")
for usage.
.FUNa function which approximates an estFUN.
.optionsa list of options passed to .FUN.
A general class for defining a function, and the options passed to the
function
.FUNa function
.optionsa list of options passed to .FUN
root_control-class, deriv_control-class
approx_control-class
Gets the parameter estimates from a geex object
## S4 method for signature 'geex' coef(object) ## S4 method for signature 'geex_summary' coef(object)## S4 method for signature 'geex' coef(object) ## S4 method for signature 'geex_summary' coef(object)
object |
a |
ex_eeFUN <- function(data){ function(theta){ with(data, c(Y1 - theta[1], (Y1 - theta[1])^2 - theta[2] )) }} results <- m_estimate( estFUN = ex_eeFUN, data = geexex, root_control = setup_root_control(start = c(1,1))) coef(results)ex_eeFUN <- function(data){ function(theta){ with(data, c(Y1 - theta[1], (Y1 - theta[1])^2 - theta[2] )) }} results <- m_estimate( estFUN = ex_eeFUN, data = geexex, root_control = setup_root_control(start = c(1,1))) coef(results)
Computes with
provided and matrices.
compute_sigma(A, B, solver = solve)compute_sigma(A, B, solver = solve)
A |
a matrix, generally the |
B |
a matrix, generally the |
solver |
the function used to compute the inverse of |
the matrix Ainv %*% B %*% t(Ainv)
A <- diag(2, nrow = 2, ncol = 2) B <- matrix(4, nrow = 2, ncol = 2) compute_sigma(A = A, B = B)A <- diag(2, nrow = 2, ncol = 2) B <- matrix(4, nrow = 2, ncol = 2) compute_sigma(A = A, B = B)
Modifies the matrices in a sandwich_components object
using the function and options in a correct_control object.
The function correction is a utility for creating
correct_control objects.
correct_by(.components, .correct_control)correct_by(.components, .correct_control)
.components |
an object of class |
.correct_control |
an object of class |
See the finite sample corrections vignette for further examples.
the result of .FUN in .correct_control.
fay_bias_correction and fay_df_correction
for corrections provided by geex
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } mybasis <- create_basis( estFUN = myee, data = geexex) mats <- estimate_sandwich_matrices(mybasis, .theta = c(5.04, 10.04)) correct_by(mats, .correct_control = correction(fay_bias_correction, b = .75))myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } mybasis <- create_basis( estFUN = myee, data = geexex) mats <- estimate_sandwich_matrices(mybasis, .theta = c(5.04, 10.04)) correct_by(mats, .correct_control = correction(fay_bias_correction, b = .75))
correct_control S4 class
.FUNa function which "corrects" a sandwich_components
object. Usually a small-sample correction
.optionsa list of options passed to .FUN.
Creates a correct_control object
correction(FUN, ...)correction(FUN, ...)
FUN |
a correction to perform. |
... |
additional arguments passed to |
a correct_control object
correction(FUN = fay_bias_correction, b = 0.75)correction(FUN = fay_bias_correction, b = 0.75)
Creates an m_estimation_basis object
create_basis(estFUN, data, units, outer_args, inner_args)create_basis(estFUN, data, units, outer_args, inner_args)
estFUN |
a function that takes in group-level data and returns a function that takes parameters as its first argument |
data |
a data.frame |
units |
an optional character string identifying the grouping variable in |
outer_args |
a list of arguments passed to the outer (data) function of |
inner_args |
a list of arguments passed to the inner (theta) function of |
Either data or split_data must be provided
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } mybasis <- create_basis( estFUN = myee, data = geexex)myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } mybasis <- create_basis( estFUN = myee, data = geexex)
From a list of for i = 1, ..., m,
creates ,
called GFUN. Here, is the
*inner* part of an estFUN, in that the data is fixed and
is a function of .
create_GFUN(object, ...) ## S4 method for signature 'm_estimation_basis' create_GFUN(object)create_GFUN(object, ...) ## S4 method for signature 'm_estimation_basis' create_GFUN(object)
object |
an object of class |
... |
additional arguments passed to other methods |
a function
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } mybasis <- create_basis( estFUN = myee, data = geexex) f <- grab_GFUN(create_GFUN(mybasis)) # Evaluate GFUN at mean and variance: should be close to zero n <- nrow(geexex) f(c(mean(geexex$Y1), var(geexex$Y1) * (n - 1)/n))myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } mybasis <- create_basis( estFUN = myee, data = geexex) f <- grab_GFUN(create_GFUN(mybasis)) # Evaluate GFUN at mean and variance: should be close to zero n <- nrow(geexex) f(c(mean(geexex$Y1), var(geexex$Y1) * (n - 1)/n))
Creates the estimating function ()
for each unit. That is, this function evaluates the outer function in
estFUN for each independent unit and a returns the inner function in
estFUN.
create_psiFUN_list(object, ...) ## S4 method for signature 'm_estimation_basis' create_psiFUN_list(object)create_psiFUN_list(object, ...) ## S4 method for signature 'm_estimation_basis' create_psiFUN_list(object)
object |
an object of class |
... |
additional arguments passed to other methods |
the object with the .psiFUN_list slot populated.
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } mybasis <- create_basis( estFUN = myee, data = geexex) psi_list <- grab_psiFUN_list(create_psiFUN_list(mybasis)) # A list of functions head(psi_list)myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } mybasis <- create_basis( estFUN = myee, data = geexex) psi_list <- grab_psiFUN_list(create_psiFUN_list(mybasis)) # A list of functions head(psi_list)
deriv_control S4 class
.FUNa function which computes a numerical derivation. This functions
first argument must the function on which the derivative is being compute.
Defaults to jacobian.
.optionsa list of options passed to .FUN. Defaults to
list(method = 'Richardson')
Computes the value of
, i.e., the estimating
equations at theta. Used to verify that (or close to 0).
diagnose_roots(GFUN, theta)diagnose_roots(GFUN, theta)
GFUN |
a function of theta |
theta |
parameter estimates to use in evaluating the estimating equations. |
a numeric vector
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } mest <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1, 1))) f <- grab_GFUN(mest@basis) # Should be close to zero diagnose_roots(GFUN = f, theta = roots(mest))myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } mest <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1, 1))) f <- grab_GFUN(mest@basis) # Should be close to zero diagnose_roots(GFUN = f, theta = roots(mest))
Using the rootFUN specified by the user (defaults to multiroot),
this function estimates the roots of the equations:
estimate_GFUN_roots(.basis)estimate_GFUN_roots(.basis)
.basis |
an object of class |
This is primilary an internal function used within m_estimate,
but it is exported for use in debugging and development.
For an example of how to use a different rootFUN,
see the root solver vignette, vignette('geex_root_solvers', package = 'geex').
the output of the rootFUN function
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } # Start with a basic basis mybasis <- create_basis( estFUN = myee, data = geexex) # Add a control for the root solver mycontrol <- new('geex_control', .root = setup_root_control(start = c(1, 1))) [email protected] <- mycontrol # Now estimate roots of GFUN roots <- estimate_GFUN_roots(mybasis) rootsmyee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } # Start with a basic basis mybasis <- create_basis( estFUN = myee, data = geexex) # Add a control for the root solver mycontrol <- new('geex_control', .root = setup_root_control(start = c(1, 1))) mybasis@.control <- mycontrol # Now estimate roots of GFUN roots <- estimate_GFUN_roots(mybasis) roots
For a given set of estimating equations computes the 'meat' (
in Stefanski and Boos notation) and 'bread' ( in Stefanski and
Boos notation) matrices necessary to compute the covariance matrix.
estimate_sandwich_matrices(.basis, .theta)estimate_sandwich_matrices(.basis, .theta)
.basis |
basis an object of class |
.theta |
vector of parameter estimates (i.e. estimated roots) |
For a set of estimating equations (),
this function computes:
where all of the above are evaluated at . The partial derivatives in
numerically approximated by the function defined in deriv_control.
Note that and not , and the same for .
a sandwich_components object
Stefanski, L. A., & Boos, D. D. (2002). The calculus of m-estimation. The American Statistician, 56(1), 29-38.
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } # Start with a basic basis mybasis <- create_basis( estFUN = myee, data = geexex) # Now estimate sandwich matrices estimate_sandwich_matrices( mybasis, c(mean(geexex$Y1), var(geexex$Y1)))myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } # Start with a basic basis mybasis <- create_basis( estFUN = myee, data = geexex) # Now estimate sandwich matrices estimate_sandwich_matrices( mybasis, c(mean(geexex$Y1), var(geexex$Y1)))
estimating_function S4 class
.estFUNthe estimating function.
.outer_argsa named list of arguments passed to the outer
function of .estFUN. Should *not* include the data argument.
.inner_argsa named list of arguments passed to the inner
function of .estFUN. Should *not* include the theta argument.
Computes the bias corrected sandwich covariance matrix described in Fay and
Graubard (2001). See vignette("05_finite_sample_corrections", package = "geex")
for further information.
fay_bias_correction(components, b = 0.75)fay_bias_correction(components, b = 0.75)
components |
an object of class |
b |
a numeric value < 1. Defaults to 0.75 as in Fay. |
a corrected covariance matrix
Fay, M. P., & Graubard, B. I. (2001). Small-Sample adjustments for Wald-type tests using sandwich estimators. Biometrics, 57(4), 1198-1206
# This example demonstrates usage of the corrections, not a meaningful application myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1)), corrections = list( bias_correction_.1 = correction(fay_bias_correction, b = .1), bias_correction_.3 = correction(fay_bias_correction, b = .3)) ) get_corrections(results)# This example demonstrates usage of the corrections, not a meaningful application myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1)), corrections = list( bias_correction_.1 = correction(fay_bias_correction, b = .1), bias_correction_.3 = correction(fay_bias_correction, b = .3)) ) get_corrections(results)
Computes the degrees of freedom correction described in Fay and
Graubard (2001). See vignette("05_finite_sample_corrections", package = "geex")
for further information.
fay_df_correction(components, b = 0.75, L, version)fay_df_correction(components, b = 0.75, L, version)
components |
an object of class |
b |
a numeric value < 1. Defaults to 0.75 as in Fay. |
L |
a k x p matrix where p is the dimension of theta |
version |
either 1 or 2, corresponding to hat(d) or tilde(d), respectively |
a scalar corresponding to the estimated degrees of freedom
Fay, M. P., & Graubard, B. I. (2001). Small-Sample adjustments for Wald-type tests using sandwich estimators. Biometrics, 57(4), 1198-1206
# This example demonstrates usage of the corrections, not a meaningful application myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1)), corrections = list( df_correction1 = correction(fay_df_correction, b = .75, L = c(0, 1), version = 1 ), df_correction2 = correction(fay_df_correction, b = .75, L = c(0, 1), version = 2 )) ) get_corrections(results)# This example demonstrates usage of the corrections, not a meaningful application myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1)), corrections = list( df_correction1 = correction(fay_df_correction, b = .75, L = c(0, 1), version = 1 ), df_correction2 = correction(fay_df_correction, b = .75, L = c(0, 1), version = 2 )) ) get_corrections(results)
An object which control all the basic_control objects
necessary to perform M-estimation
.approxan approx_control object
.roota root_control object
.deriva deriv_control object
geex summary object
estFUNa estimating-function
outer_argsthe list arguments passed to the m_estimate call
inner_argsthe list arguments passed to the m_estimate call
datathe data.frame passed to the m_estimate call
weightsthe weights passed to the m_estimate call
nobsthe number of observational units used to compute the M-estimator
unitsthe name of the variable identifying the observational units
correctionsa list of correction performed on sandwich_components
estimatesa numeric vector of parameter estimates
vcovthe empirical sandwich variance matrix
geex S4 class
callthe m_estimate call
basisa m_estimation_basis object
rootFUN_resultsthe results of call to the root finding algorithm function
sandwich_componentsa sandwich_components object
GFUNthe function of which the roots are computed.
correctionsa list of correction performed on sandwich_components
estimatesa numeric vector of parameter estimates
vcovthe empirical sandwich variance matrix
The data used to illustrate examples 1-9 of Stefanski and Boos (2002).
a dataset with 9 variables and 100 observations
Y1 rnorm(mean = 5, sd = 4)
Y2 rnorm(mean = 2, sd = 1)
X1 rgamma(shape =5)
Y3 2 + 3*X1 + 1*rnorm(0, 1)
W1 X1 + 0.25 * rnorm(0, 1)
Z1 2 + 1.5*X1 + 1*rnorm(0, 1)
X2 0 for first 50 observation, 1 for rest
Y4 0.1 + 0.1*X1 + 0.5*X2 + rnorm(0, 1)
Y5 rbinom(prob = plogis(0.1 + 0.1*X1 + 0.5*X2))
Stefanski, L. A., & Boos, D. D. (2002). The calculus of m-estimation. The American Statistician, 56(1), 29-38.
Gets the corrections from a geex object
get_corrections(object, ...) ## S4 method for signature 'geex' get_corrections(object) ## S4 method for signature 'geex_summary' get_corrections(object)get_corrections(object, ...) ## S4 method for signature 'geex' get_corrections(object) ## S4 method for signature 'geex_summary' get_corrections(object)
object |
a |
... |
arguments passed to other methods |
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1)), corrections = list( bias_correction_.1 = correction(fay_bias_correction, b = .1), bias_correction_.3 = correction(fay_bias_correction, b = .3)) ) get_corrections(results)myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1)), corrections = list( bias_correction_.1 = correction(fay_bias_correction, b = .1), bias_correction_.3 = correction(fay_bias_correction, b = .3)) ) get_corrections(results)
Grab something from an object
grab(from, what, ...)grab(from, what, ...)
from |
an object |
what |
what to grab one of 'response', 'design_matrix', 'response_formula', 'fixed_formula', 'eeFUN' |
... |
additional arguments passed to |
the value returns depends on the argument what.
grab_response, grab_design_matrix,
grab_response_formula, grab_fixed_formula,
grab_design_levels
Grabs the .A (bread matrix) slot
grab_bread(object) ## S4 method for signature 'sandwich_components' grab_bread(object)grab_bread(object) ## S4 method for signature 'sandwich_components' grab_bread(object)
object |
a |
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1))) grab_bread(results@sandwich_components)myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1))) grab_bread(results@sandwich_components)
Gets the .A_i (list of bread matrices) slot
grab_bread_list(object) ## S4 method for signature 'sandwich_components' grab_bread_list(object)grab_bread_list(object) ## S4 method for signature 'sandwich_components' grab_bread_list(object)
object |
a |
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1))) head(grab_bread_list(results@sandwich_components))myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1))) head(grab_bread_list(results@sandwich_components))
Useful when splitting data later, used with grab_design_matrix
or especially when calling grab_psiFUN from within an eeFun.
grab_design_levels(model)grab_design_levels(model)
model |
a model object such as |
A named list of character vectors that provides the fentire set of
levels that each factor predictor in model will take on. This is
hopefully identical to what the xlev argument to
link[stats]{model.frame} desires. When model has no factors
as predictors, then an empty list is returned.
## Not run: geex::grab_design_matrix( data = data, rhs_formula = geex::grab_fixed_formula(model), xlev = geex::grab_design_levels(model) ) ## Below is helpful within an eeFun. geex::grab_psiFUN( data = data,## Especially when this is a subset of the data rhs_formula = geex::grab_fixed_formula(model), xlev = geex::grab_design_levels(model) ) ## End(Not run)## Not run: geex::grab_design_matrix( data = data, rhs_formula = geex::grab_fixed_formula(model), xlev = geex::grab_design_levels(model) ) ## Below is helpful within an eeFun. geex::grab_psiFUN( data = data,## Especially when this is a subset of the data rhs_formula = geex::grab_fixed_formula(model), xlev = geex::grab_design_levels(model) ) ## End(Not run)
Grab a matrix of fixed effects from a model object
grab_design_matrix(data, rhs_formula, ...)grab_design_matrix(data, rhs_formula, ...)
data |
the data from which to extract the matrix |
rhs_formula |
the right hand side of a model formula |
... |
Can be used to pass |
# Create a "desigm" matrix for the first ten rows of iris data fit <- lm(Sepal.Width ~ Petal.Width, data = iris) grab_design_matrix( data = iris[1:10, ], grab_fixed_formula(fit))# Create a "desigm" matrix for the first ten rows of iris data fit <- lm(Sepal.Width ~ Petal.Width, data = iris) grab_design_matrix( data = iris[1:10, ], grab_fixed_formula(fit))
Gets the .ee_i (observed estimating function) slot
grab_ee_list(object)grab_ee_list(object)
object |
a |
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1))) grab_ee_list(results@sandwich_components)myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1))) grab_ee_list(results@sandwich_components)
Grab estimating functions from a model object
grab_estFUN(object) ## S4 method for signature 'estimating_function' grab_estFUN(object)grab_estFUN(object) ## S4 method for signature 'estimating_function' grab_estFUN(object)
object |
a |
Grab the RHS formula from a model object
grab_fixed_formula(model)grab_fixed_formula(model)
model |
a model object such as |
the right-hand side of a model's formula object
fit <- lm(Sepal.Width ~ Petal.Width, data = iris) grab_fixed_formula(fit)fit <- lm(Sepal.Width ~ Petal.Width, data = iris) grab_fixed_formula(fit)
Gets the .psi_list slot in a m_estimation_basis
grab_GFUN(object) ## S4 method for signature 'm_estimation_basis' grab_GFUN(object) ## S4 method for signature 'geex' grab_GFUN(object)grab_GFUN(object) ## S4 method for signature 'm_estimation_basis' grab_GFUN(object) ## S4 method for signature 'geex' grab_GFUN(object)
object |
a |
Gets the .B (meat matrix) slot
grab_meat(object) ## S4 method for signature 'sandwich_components' grab_meat(object)grab_meat(object) ## S4 method for signature 'sandwich_components' grab_meat(object)
object |
a |
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1))) grab_meat_list(results@sandwich_components)myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1))) grab_meat_list(results@sandwich_components)
Gets the .B_i (list of bread matrices) slot
grab_meat_list(object) ## S4 method for signature 'sandwich_components' grab_meat_list(object) ## S4 method for signature 'sandwich_components' grab_ee_list(object)grab_meat_list(object) ## S4 method for signature 'sandwich_components' grab_meat_list(object) ## S4 method for signature 'sandwich_components' grab_ee_list(object)
object |
a |
myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1))) head(grab_meat_list(results@sandwich_components))myee <- function(data){ function(theta){ c(data$Y1 - theta[1], (data$Y1 - theta[1])^2 - theta[2]) } } results <- m_estimate( estFUN = myee, data = geexex, root_control = setup_root_control(start = c(1,1))) head(grab_meat_list(results@sandwich_components))
Grab estimating functions from a model object
grab_psiFUN(object, ...) ## S3 method for class 'glm' grab_psiFUN(object, data, ...) ## S3 method for class 'geeglm' grab_psiFUN(object, data, ...) ## S3 method for class 'merMod' grab_psiFUN(object, data, numderiv_opts = NULL, ...)grab_psiFUN(object, ...) ## S3 method for class 'glm' grab_psiFUN(object, data, ...) ## S3 method for class 'geeglm' grab_psiFUN(object, data, ...) ## S3 method for class 'merMod' grab_psiFUN(object, data, numderiv_opts = NULL, ...)
object |
the object from which to extrace |
... |
additonal arguments passed to other methods |
data |
the data to use for the estimating function |
numderiv_opts |
a list of arguments passed to |
a function corresponding to the estimating equations of a model
grab_psiFUN(glm): Create estimating equation function from a glm object
grab_psiFUN(geeglm): Create estimating equation function from a geeglm object
grab_psiFUN(merMod): Create estimating equation function from a merMod object
## Not run: library(geepack) library(lme4) data('ohio') glmfit <- glm(resp ~ age, data = ohio, family = binomial(link = "logit")) geefit <- geeglm(resp ~ age, data = ohio, id = id, family = binomial(link = "logit")) glmmfit <- glmer(resp ~ age + (1|id), data = ohio, family = binomial(link = "logit")) example_ee <- function(data, model){ f <- grab_psiFUN(model, data) function(theta){ f(theta) } } m_estimate( estFUN = example_ee, data = ohio, compute_roots = FALSE, units = 'id', roots = coef(glmfit), outer_args = list(model = glmfit)) m_estimate( estFUN = example_ee, data = ohio, compute_roots = FALSE, units = 'id', roots = coef(geefit), outer_args = list(model = geefit)) m_estimate( estFUN = example_ee, data = ohio, compute_roots = FALSE, units = 'id', roots = unlist(getME(glmmfit, c('beta', 'theta'))), outer_args = list(model = glmmfit)) ## End(Not run)## Not run: library(geepack) library(lme4) data('ohio') glmfit <- glm(resp ~ age, data = ohio, family = binomial(link = "logit")) geefit <- geeglm(resp ~ age, data = ohio, id = id, family = binomial(link = "logit")) glmmfit <- glmer(resp ~ age + (1|id), data = ohio, family = binomial(link = "logit")) example_ee <- function(data, model){ f <- grab_psiFUN(model, data) function(theta){ f(theta) } } m_estimate( estFUN = example_ee, data = ohio, compute_roots = FALSE, units = 'id', roots = coef(glmfit), outer_args = list(model = glmfit)) m_estimate( estFUN = example_ee, data = ohio, compute_roots = FALSE, units = 'id', roots = coef(geefit), outer_args = list(model = geefit)) m_estimate( estFUN = example_ee, data = ohio, compute_roots = FALSE, units = 'id', roots = unlist(getME(glmmfit, c('beta', 'theta'))), outer_args = list(model = glmmfit)) ## End(Not run)
Gets the .psi_list slot in a m_estimation_basis
grab_psiFUN_list(object) ## S4 method for signature 'm_estimation_basis' grab_psiFUN_list(object) ## S4 method for signature 'geex' grab_psiFUN_list(object)grab_psiFUN_list(object) ## S4 method for signature 'm_estimation_basis' grab_psiFUN_list(object) ## S4 method for signature 'geex' grab_psiFUN_list(object)
object |
a |
Grab a vector of responses from a model object
grab_response(data, formula)grab_response(data, formula)
data |
data.frame from which to extract the vector of responses |
formula |
model formula |
# Grab vector of responses for the first ten rows of iris data fit <- lm(Sepal.Width ~ Petal.Width, data = iris) grab_response( data = iris[1:10, ], formula(fit))# Grab vector of responses for the first ten rows of iris data fit <- lm(Sepal.Width ~ Petal.Width, data = iris) grab_response( data = iris[1:10, ], formula(fit))
Grab the LHS formula from a model object
grab_response_formula(model)grab_response_formula(model)
model |
a model object such as |
the left-hand side of a model's formula object
fit <- lm(Sepal.Width ~ Petal.Width, data = iris) grab_response_formula(fit)fit <- lm(Sepal.Width ~ Petal.Width, data = iris) grab_response_formula(fit)
M-estimation theory provides a framework for asympotic properties of estimators that are solutions to estimating equations. Many R packages implement specific applications of estimating equations. geex aims to be provide a more general framework that any modelling method can use to compute point and variance estimates for parameters that are solutions to estimating equations of the form:
m_estimate( estFUN, data, units = character(0), weights = numeric(0), outer_args = list(), inner_args = list(), roots = NULL, compute_roots = TRUE, compute_vcov = TRUE, Asolver = solve, corrections, deriv_control, root_control, approx_control )m_estimate( estFUN, data, units = character(0), weights = numeric(0), outer_args = list(), inner_args = list(), roots = NULL, compute_roots = TRUE, compute_vcov = TRUE, Asolver = solve, corrections, deriv_control, root_control, approx_control )
estFUN |
a function that takes in group-level data and returns a function that takes parameters as its first argument |
data |
a data.frame |
units |
an optional character string identifying the grouping variable in |
weights |
an optional vector of weights. See details. |
outer_args |
a list of arguments passed to the outer (data) function of |
inner_args |
a list of arguments passed to the inner (theta) function of |
roots |
a vector of parameter estimates must be provided if |
compute_roots |
whether or not to find the roots of the estimating equations.
Defaults to |
compute_vcov |
whether or not to compute the variance-covariance matrix.
Defaults to |
Asolver |
a function passed to |
corrections |
an optional list of small sample corrections where each
list element is a |
deriv_control |
a |
root_control |
a |
approx_control |
a |
The basic idea of geex is for the analyst to provide at least two items:
data
estFUN: (the function), a function that takes unit-level
data and returns a function in terms of parameters ()
With the estFUN, geex computes the roots of the estimating equations
and/or the empirical sandwich variance estimator.
The root finding algorithm defaults to multiroot to
estimate roots though the solver algorithm can be specified in the rootFUN
argument. Starting values for multiroot are passed via the
root_control argument. See vignette("v03_root_solvers", package = "geex")
for information on customizing the root solver function.
To compute only the covariance matrix, set compute_roots = FALSE and pass
estimates of via the roots argument.
M-estimation is often used for clustered data, and a variable by which to split
the data.frame into independent units is specified by the units argument.
This argument defaults to NULL, in which case the number of units equals
the number of rows in the data.frame.
For information on the finite-sample corrections, refer to the finite sample
correction API vignette: vignette("v05_finite_sample_corrections", package = "geex")
a geex object
An estFUN is a function representing . geex works
by breaking into two parts:
the "outer" part of the estFUN which manipulates data and
outer_args and returns an
"inner" function of theta and inner_args. Internally, this
"inner" function is called psiFUN.
In pseudo-code this looks like:
function(data, <<outer_args>>){
O <- manipulate(data, <<outer_args>>)
function(theta, <<inner_args>>){
map(O, to = theta, and = <<inner_args>>)
}
}
See the examples below or the package vignettes to see an estFUN
in action.
Importantly, the data used in an estFUN is *unit* level data,
which may be single rows in a data.frame or block of rows for clustered data.
Additional arguments may be passed to both the inner and outer function of the
estFUN. Elements in an outer_args list are passed to the outer
function; any elements of the inner_args list are passed to the inner
function. For an example, see the finite sample correction vignette [
vignette("v05_finite_sample_corrections", package = "geex")].
To estimate roots of the estimating functions, geex uses the rootSolve
multiroot function by default, which requires starting
values. The root_control argument expects a root_control
object, which the utility function setup_root_control aids in
creating. For example, setup_root_control(start = 4) creates a
root_control setting the starting value to 4. In general,
the dimension of start must the same as theta in the inner
estFUN.
In some situations, use of weights can massively speed computations. Refer
to vignette("v04_weights", package = "geex") for an example.
Stefanski, L. A., & Boos, D. D. (2002). The calculus of M-estimation. The American Statistician, 56(1), 29-38.
# Estimate the mean and variance of Y1 in the geexex dataset ex_eeFUN <- function(data){ function(theta){ with(data, c(Y1 - theta[1], (Y1 - theta[1])^2 - theta[2] )) }} m_estimate( estFUN = ex_eeFUN, data = geexex, root_control = setup_root_control(start = c(1,1))) # compare to the mean() and variance() functions mean(geexex$Y1) n <- nrow(geexex) var(geexex$Y1) * (n - 1)/n # A simple linear model for regressing X1 and X2 on Y4 lm_eefun <- function(data){ X <- cbind(1, data$X1, data$X2) Y <- data$Y4 function(theta){ t(X) %*% (Y - X %*% theta) } } m_estimate( estFUN = lm_eefun, data = geexex, root_control = setup_root_control(start = c(0, 0, 0))) # Compare to lm() results summary(lm(Y4 ~ X1 + X2, data = geexex))# Estimate the mean and variance of Y1 in the geexex dataset ex_eeFUN <- function(data){ function(theta){ with(data, c(Y1 - theta[1], (Y1 - theta[1])^2 - theta[2] )) }} m_estimate( estFUN = ex_eeFUN, data = geexex, root_control = setup_root_control(start = c(1,1))) # compare to the mean() and variance() functions mean(geexex$Y1) n <- nrow(geexex) var(geexex$Y1) * (n - 1)/n # A simple linear model for regressing X1 and X2 on Y4 lm_eefun <- function(data){ X <- cbind(1, data$X1, data$X2) Y <- data$Y4 function(theta){ t(X) %*% (Y - X %*% theta) } } m_estimate( estFUN = lm_eefun, data = geexex, root_control = setup_root_control(start = c(0, 0, 0))) # Compare to lm() results summary(lm(Y4 ~ X1 + X2, data = geexex))
m_estimation_basis S4 class
.datathe analysis data.frame
.unitsan (optional) character string identifying the variable in
.data which splits the data into indepedent units
.weightsa numeric vector of weights used in weighting the estimating functions
.psiFUN_lista list of psiFUNs created by create_psiFUN_list
.GFUNa function created by create_GFUN
.controla geex_control object
Extract the number observations
## S4 method for signature 'geex' nobs(object) ## S4 method for signature 'geex_summary' nobs(object)## S4 method for signature 'geex' nobs(object) ## S4 method for signature 'geex_summary' nobs(object)
object |
a |
library(geepack) data('ohio') glmfit <- glm(resp ~ age, data = ohio, family = binomial(link = "logit")) example_ee <- function(data, model){ f <- grab_psiFUN(model, data) function(theta){ f(theta) } } z <- m_estimate( estFUN = example_ee, data = ohio, compute_roots = FALSE, units = 'id', roots = coef(glmfit), outer_args = list(model = glmfit)) nobs(z)library(geepack) data('ohio') glmfit <- glm(resp ~ age, data = ohio, family = binomial(link = "logit")) example_ee <- function(data, model){ f <- grab_psiFUN(model, data) function(theta){ f(theta) } } z <- m_estimate( estFUN = example_ee, data = ohio, compute_roots = FALSE, units = 'id', roots = coef(glmfit), outer_args = list(model = glmfit)) nobs(z)
root_control S4 class
.FUNa root finding function whose first argument must be named f.
.optionsa list of options passed to .FUN.
.object_namea character string identifying the object containing the
roots in the output of .FUN.
Gets the parameter estimates matrix from a geex object
roots(object, ...) ## S4 method for signature 'geex' roots(object) ## S4 method for signature 'geex_summary' roots(object)roots(object, ...) ## S4 method for signature 'geex' roots(object) ## S4 method for signature 'geex_summary' roots(object)
object |
a |
... |
arguments passed to other methods |
ex_eeFUN <- function(data){ function(theta){ with(data, c(Y1 - theta[1], (Y1 - theta[1])^2 - theta[2] )) }} results <- m_estimate( estFUN = ex_eeFUN, data = geexex, root_control = setup_root_control(start = c(1,1))) roots(results)ex_eeFUN <- function(data){ function(theta){ with(data, c(Y1 - theta[1], (Y1 - theta[1])^2 - theta[2] )) }} results <- m_estimate( estFUN = ex_eeFUN, data = geexex, root_control = setup_root_control(start = c(1,1))) roots(results)
sandwich_components S4 class
.Athe "bread" matrix
.A_ia list of "bread" matrices per unit
.Bthe "meat" matrix
.B_ia list of "meat" matrices per unit
.ee_ia list of observed estimating function values per unit
Setup an approx_control object
setup_approx_control(FUN, ...)setup_approx_control(FUN, ...)
FUN |
a function |
... |
arguments passed to |
a approx_control object
# For usage, see example 7 in vignette("01_additional_examples", package = "geex")# For usage, see example 7 in vignette("01_additional_examples", package = "geex")
Setup a basic_control object
setup_control(type, FUN, ...)setup_control(type, FUN, ...)
type |
one of |
FUN |
a function |
... |
arguments passed to |
a basic_control object
setup_root_control, setup_deriv_control,
setup_approx_control
Setup a deriv_control object
setup_deriv_control(FUN, ...)setup_deriv_control(FUN, ...)
FUN |
a function |
... |
arguments passed to |
a deriv_control object
setup_deriv_control() # default setup_deriv_control(method = "simple") # will speed up computationssetup_deriv_control() # default setup_deriv_control(method = "simple") # will speed up computations
Setup a root_control object
setup_root_control(FUN, roots_name, ...)setup_root_control(FUN, roots_name, ...)
FUN |
a function |
roots_name |
a character string identifying the object containing the |
... |
arguments passed to |
a root_control object
# Setup the default setup_root_control(start = c(3, 5, 6)) # Also setup the default setup_root_control(FUN = rootSolve::multiroot, start = c(3, 5, 6)) # Or use uniroot() setup_root_control(FUN = stats::uniroot, interval = c(0, 1))# Setup the default setup_root_control(start = c(3, 5, 6)) # Also setup the default setup_root_control(FUN = rootSolve::multiroot, start = c(3, 5, 6)) # Or use uniroot() setup_root_control(FUN = stats::uniroot, interval = c(0, 1))
m_estimation_basis, or geex object
show(object) ## S4 method for signature 'sandwich_components' show(object) ## S4 method for signature 'm_estimation_basis' show(object) ## S4 method for signature 'geex' show(object) ## S4 method for signature 'geex_summary' show(object)show(object) ## S4 method for signature 'sandwich_components' show(object) ## S4 method for signature 'm_estimation_basis' show(object) ## S4 method for signature 'geex' show(object) ## S4 method for signature 'geex_summary' show(object)
object |
the object to print |
Object Summaries
## S4 method for signature 'geex' summary(object, keep_data = TRUE, keep_args = TRUE)## S4 method for signature 'geex' summary(object, keep_data = TRUE, keep_args = TRUE)
object |
a |
keep_data |
keep the original data or not |
keep_args |
keep the |
library(geepack) data('ohio') glmfit <- glm(resp ~ age, data = ohio, family = binomial(link = "logit")) example_ee <- function(data, model){ f <- grab_psiFUN(model, data) function(theta){ f(theta) } } z <- m_estimate( estFUN = example_ee, data = ohio, compute_roots = FALSE, units = 'id', roots = coef(glmfit), outer_args = list(model = glmfit)) object.size(z) object.size(summary(z)) object.size(summary(z, keep_data = FALSE)) object.size(summary(z, keep_data = FALSE, keep_args = FALSE))library(geepack) data('ohio') glmfit <- glm(resp ~ age, data = ohio, family = binomial(link = "logit")) example_ee <- function(data, model){ f <- grab_psiFUN(model, data) function(theta){ f(theta) } } z <- m_estimate( estFUN = example_ee, data = ohio, compute_roots = FALSE, units = 'id', roots = coef(glmfit), outer_args = list(model = glmfit)) object.size(z) object.size(summary(z)) object.size(summary(z, keep_data = FALSE)) object.size(summary(z, keep_data = FALSE, keep_args = FALSE))
Gets the variance-covariance matrix from a geex object
## S4 method for signature 'geex' vcov(object) ## S4 method for signature 'geex_summary' vcov(object)## S4 method for signature 'geex' vcov(object) ## S4 method for signature 'geex_summary' vcov(object)
object |
a |
ex_eeFUN <- function(data){ function(theta){ with(data, c(Y1 - theta[1], (Y1 - theta[1])^2 - theta[2] )) }} results <- m_estimate( estFUN = ex_eeFUN, data = geexex, root_control = setup_root_control(start = c(1,1))) vcov(results)ex_eeFUN <- function(data){ function(theta){ with(data, c(Y1 - theta[1], (Y1 - theta[1])^2 - theta[2] )) }} results <- m_estimate( estFUN = ex_eeFUN, data = geexex, root_control = setup_root_control(start = c(1,1))) vcov(results)
Extract Model weights
## S4 method for signature 'geex' weights(object) ## S4 method for signature 'geex_summary' weights(object)## S4 method for signature 'geex' weights(object) ## S4 method for signature 'geex_summary' weights(object)
object |
a |