Title: | Surrogate Outcome Regression Analysis |
---|---|
Description: | Performs estimation and inference on a partially missing target outcome (e.g. gene expression in an inaccessible tissue) while borrowing information from a correlated surrogate outcome (e.g. gene expression in an accessible tissue). Rather than regarding the surrogate outcome as a proxy for the target outcome, this package jointly models the target and surrogate outcomes within a bivariate regression framework. Unobserved values of either outcome are treated as missing data. In contrast to imputation-based inference, no assumptions are required regarding the relationship between the target and surrogate outcomes. Estimation in the presence of bilateral outcome missingness is performed via an expectation conditional maximization either algorithm. In the case of unilateral target missingness, estimation is performed using an accelerated least squares procedure. A flexible association test is provided for evaluating hypotheses about the target regression parameters. For additional details, see: McCaw ZR, Gaynor SM, Sun R, Lin X: "Leveraging a surrogate outcome to improve inference on a partially missing target outcome" <doi:10.1111/biom.13629>. |
Authors: | Zachary McCaw [aut, cre] |
Maintainer: | Zachary McCaw <[email protected]> |
License: | GPL-3 |
Version: | 0.6.0.1 |
Built: | 2024-11-10 06:16:10 UTC |
Source: | https://github.com/zrmacc/surrogateregression |
Bivariate Regression Model
Covariance
Residual covariance matrix.
Covariance.info
Information for covariance parameters.
Covariance.tab
Table of covariance parameters.
Method
Method used for estimation.
Regression.info
Information for regression coefficients.
Regression.tab
Table of regression coefficients.
Residuals
Outcome residuals.
Check Initialization
CheckInit(init)
init |
Optional list of initial parameters for fitting the null model. |
Check Test Specification
CheckTestSpec(is_zero, p)
is_zero |
Logical vector, with as many entries as columns in the target model matrix, indicating which columns have coefficient zero under the null. |
p |
Number of columns for the target model matrix. |
Extract Coefficients from Bivariate Regression Model
## S3 method for class 'bnr'
coef(object, ..., type = NULL)
object |
A bnr object. |
... |
Unused. |
type |
Either Target or Surrogate. |
Covariance Information Matrix
CovInfo(data_part, sigma)
data_part |
List of partitioned data. See PartitionData. |
sigma |
Target-surrogate covariance matrix. |
3x3 Numeric information matrix for the target variance, target-surrogate covariance, and surrogate variance.
Tabulate Covariance Parameters
CovTab(point, info, sig = 0.05)
point |
Point estimates. |
info |
Information matrix. |
sig |
Significance level. |
Data.table containing the point estimate, standard error, and confidence interval.
Covariance Update
CovUpdate(data_part, b0, a0, b1, a1, sigma0)
data_part |
List of partitioned data. See PartitionData. |
b0 |
Previous target regression coefficient. |
a0 |
Previous surrogate regression coefficient. |
b1 |
Current target regression coefficient. |
a1 |
Current surrogate regression coefficient. |
sigma0 |
Initial target-surrogate covariance matrix. |
ECM update of the target-surrogate covariance matrix.
Estimation procedure for bivariate normal regression models in which the target and surrogate outcomes are both subject to missingness.
FitBNEM( t, s, X, Z, sig = 0.05, b0 = NULL, a0 = NULL, sigma0 = NULL, maxit = 100, eps = 1e-06, report = TRUE )
t |
Target outcome vector. |
s |
Surrogate outcome vector. |
X |
Target model matrix. |
Z |
Surrogate model matrix. |
sig |
Type I error level. |
b0 |
Initial target regression coefficient. |
a0 |
Initial surrogate regression coefficient. |
sigma0 |
Initial covariance matrix. |
maxit |
Maximum number of parameter updates. |
eps |
Minimum acceptable improvement in log likelihood. |
report |
Report fitting progress? |
The target and surrogate model matrices are expected in numeric format. Include an intercept if required. Expand factors and interactions in advance. Initial values may be specified for any of the target coefficient b0, the surrogate coefficient a0, or the target-surrogate covariance matrix sigma0.
An object of class 'bnr' with slots containing the estimated regression coefficients, the target-surrogate covariance matrix, the information matrices for the regression and covariance parameters, and the residuals.
Estimation procedure for bivariate normal regression models in which only the target outcome is subject to missingness.
FitBNLS(t, s, X, sig = 0.05)
t |
Target outcome vector. |
s |
Surrogate outcome vector. |
X |
Model matrix. |
sig |
Type I error level. |
The model matrix is expected in numeric format. Include an intercept if required. Expand factors and interactions in advance.
An object of class 'bnr' with slots containing the estimated regression coefficients, the target-surrogate covariance matrix, the information matrices for the regression and covariance parameters, and the residuals.
Estimation procedure for bivariate normal regression models. The EM algorithm is applied if s contains missing values, or if X differs from Z. Otherwise, an accelerated least squares procedure is applied.
FitBNR(t, s, X, Z = NULL, sig = 0.05, ...)
t |
Target outcome vector. |
s |
Surrogate outcome vector. |
X |
Target model matrix. |
Z |
Surrogate model matrix. Defaults to X. |
sig |
Significance level. |
... |
Additional arguments accepted if fitting via EM. See FitBNEM. |
The target and surrogate model matrices are expected in numeric format. Include an intercept if required. Expand factors and interactions in advance.
An object of class 'bnr' with slots containing the estimated regression coefficients, the target-surrogate covariance matrix, the information matrices for regression parameters, and the residuals.
# Case 1: No surrogate missingness.
set.seed(100)
n <- 1e3
X <- stats::rnorm(n)
data <- rBNR(X = X, Z = X, b = 1, a = -1, t_miss = 0.1, s_miss = 0.0)
t <- data[, 1]
s <- data[, 2]

# Model fit (least squares, since s is complete and Z defaults to X).
fit_bnls <- FitBNR(t = t, s = s, X = X)

# Case 2: Target and surrogate missingness.
set.seed(100)
n <- 1e3
X <- stats::rnorm(n)
Z <- stats::rnorm(n)
data <- rBNR(X = X, Z = Z, b = 1, a = -1, t_miss = 0.1, s_miss = 0.1)

# Model fit (EM, since s contains missing values).
fit_bnem <- FitBNR(t = data[, 1], s = data[, 2], X = X, Z = Z)
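The choice between the two estimation procedures described for FitBNR can be illustrated with a small base R sketch. The function name choose_method_sketch is hypothetical, and the comparison of X and Z via all.equal is an assumption; the package may make this decision differently.

```r
# Hypothetical helper, not part of the package: mirrors the documented rule
# that EM is used if s has missing values or if X differs from Z.
choose_method_sketch <- function(s, X, Z = NULL) {
  # Assumption: a missing Z means the surrogate shares the target design.
  if (is.null(Z)) Z <- X
  same_design <- isTRUE(all.equal(as.matrix(X), as.matrix(Z)))
  if (anyNA(s) || !same_design) "EM" else "LS"
}

X <- cbind(1, c(0.5, -0.2, 1.1))
Z <- cbind(1, c(0.3, 0.9, -1.4))

m1 <- choose_method_sketch(s = c(1, 2, 3), X = X)          # complete s, Z = X
m2 <- choose_method_sketch(s = c(1, NA, 3), X = X)         # missing surrogate
m3 <- choose_method_sketch(s = c(1, 2, 3), X = X, Z = Z)   # different designs
```

Under these assumptions, only the first call would select least squares.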
Fits the standard OLS model.
fitOLS(y, X)
y |
Nx1 Numeric vector. |
X |
NxP Numeric matrix. |
List containing the following:
Beta |
Regression coefficient. |
V |
Outcome variance. |
Ibb |
Information matrix for beta. |
Resid |
Outcome residuals. |
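For orientation, the four components listed above can be reproduced in base R. This is a sketch only: fit_ols_sketch is a hypothetical name, and the use of n - p in the variance estimate is an assumption that may not match the package.

```r
# Hypothetical base R sketch of an OLS fit returning the documented components.
fit_ols_sketch <- function(y, X) {
  X <- as.matrix(X)
  n <- nrow(X)
  p <- ncol(X)
  XtX <- crossprod(X)                       # X'X
  beta <- solve(XtX, crossprod(X, y))       # normal equations
  resid <- y - X %*% beta
  v <- sum(resid^2) / (n - p)               # assumption: divisor n - p
  list(
    Beta = as.numeric(beta),
    V = v,
    Ibb = XtX / v,                          # information for beta
    Resid = as.numeric(resid)
  )
}

set.seed(1)
X <- cbind(1, rnorm(50))
y <- as.numeric(X %*% c(2, -1) + rnorm(50))
fit <- fit_ols_sketch(y, X)
```

The coefficients can be compared against stats::lm(y ~ X - 1) as a sanity check.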
Format Output
FormatOutput(data_part, method, b, a, sigma, sig)
data_part |
List of partitioned data. See PartitionData. |
method |
Estimation method. |
b |
Final target regression parameter. |
a |
Final surrogate regression parameter. |
sigma |
Final target-surrogate covariance matrix. |
sig |
Significance level. |
Object of class 'bnr'.
Update Iteration
IterUpdate(theta0, update, maxit, eps, report)
theta0 |
List containing the initial parameter values. |
update |
Function to iterate. Should accept and return a list of parameter values with a similar structure. |
maxit |
Maximum number of parameter updates. |
eps |
Minimum acceptable improvement in log likelihood. |
report |
Report fitting progress? |
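The roles of maxit and eps can be illustrated with a generic update loop. This is a sketch under stated assumptions, not the package's IterUpdate: iterate_sketch and toy_update are hypothetical, and the update function is assumed to return a list with loglik and delta elements.

```r
# Hypothetical iteration loop: stop when the improvement falls below eps
# or after maxit updates.
iterate_sketch <- function(theta0, update, maxit = 100, eps = 1e-6,
                           report = FALSE) {
  theta <- theta0
  for (i in seq_len(maxit)) {
    proposal <- update(theta)
    # Assumption: `delta` holds the change in the objective.
    if (proposal$delta < eps) break
    theta <- proposal
    if (report) cat("Iteration", i, "objective:", theta$loglik, "\n")
  }
  theta
}

# Toy objective f(x) = -(x - 3)^2, improved by moving halfway toward 3.
toy_update <- function(theta) {
  x_new <- theta$x + 0.5 * (3 - theta$x)
  ll_new <- -(x_new - 3)^2
  list(x = x_new, loglik = ll_new, delta = ll_new - theta$loglik)
}

fit <- iterate_sketch(list(x = 0, loglik = -9), toy_update)
```

A proposal is accepted only if it improves the objective by at least eps, so the returned value is the last accepted state.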
Calculates the determinant of $A$.
matDet(A, logDet = FALSE)
A |
Numeric matrix. |
logDet |
Return the logarithm of the determinant? |
Scalar.
Calculates the inverse $A^{-1}$.
matInv(A)
A |
Numeric matrix. |
Numeric matrix.
Calculates the inner product $A'B$.
matIP(A, B)
A |
Numeric matrix. |
B |
Numeric matrix. |
Numeric matrix.
Calculates the outer product $AB'$.
matOP(A, B)
A |
Numeric matrix. |
B |
Numeric matrix. |
Numeric matrix.
Calculates the quadratic form $X'AX$.
matQF(X, A)
X |
Numeric matrix. |
A |
Numeric matrix. |
Numeric matrix.
Calculates the matrix product $AB$.
MMP(A, B)
A |
Numeric matrix. |
B |
Numeric matrix. |
Numeric matrix.
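The matrix helpers documented above (determinant, inverse, inner product, outer product, quadratic form, matrix product) correspond to standard operations. The base R equivalents below are offered as a reference sketch; they are not the package's implementations, which may differ in speed and return type.

```r
# Base R equivalents of the documented matrix operations (illustrative only).
A <- matrix(c(2, 1, 1, 3), nrow = 2)
B <- matrix(c(1, 0, 2, 1), nrow = 2)

det_A <- det(A)                                              # determinant
logdet_A <- as.numeric(determinant(A, logarithm = TRUE)$modulus)  # log determinant
inv_A <- solve(A)                                            # inverse
ip <- crossprod(A, B)                                        # inner product A'B
op <- tcrossprod(A, B)                                       # outer product AB'
qf <- crossprod(B, A %*% B)                                  # quadratic form B'AB
prod_AB <- A %*% B                                           # matrix product AB
```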
Observed Data Log Likelihood
ObsLogLik(data_part, b, a, sigma)
data_part |
List of partitioned data. See PartitionData. |
b |
Target regression coefficient. |
a |
Surrogate regression coefficient. |
sigma |
Target-surrogate covariance matrix. |
Observed data log likelihood.
Parameter Initialization
ParamInit(data_part, b0, a0, sigma0)
data_part |
List of partitioned data. See PartitionData. |
b0 |
Initial target regression coefficient. |
a0 |
Initial surrogate regression coefficient. |
sigma0 |
Initial covariance matrix. |
List containing initial values of beta, alpha, sigma.
Partition Data by Outcome Missingness Pattern.
PartitionData(t, s, X, Z = NULL)
t |
Target outcome vector. |
s |
Surrogate outcome vector. |
X |
Target model matrix. |
Z |
Surrogate model matrix. |
List containing these components:
'Orig' original data.
'Dims' dimensions and names.
'Complete', data for complete cases.
'TMiss', data for subjects with target missingness.
'SMiss', data for subjects with surrogate missingness.
'IPs', inner products.
# Generate data.
n <- 1e3
X <- rnorm(n)
Z <- rnorm(n)
data <- rBNR(X = X, Z = Z, b = 1, a = -1)
data_part <- PartitionData(
  t = data[, 1],
  s = data[, 2],
  X = X,
  Z = Z
)
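The idea of partitioning by missingness pattern can be shown with row indices alone. The helper partition_sketch is hypothetical and returns only indices, whereas the package returns the data and inner products for each pattern; how the package treats rows with both outcomes missing is not stated here, so the sketch simply leaves them out.

```r
# Hypothetical sketch: classify rows by which outcome is observed.
partition_sketch <- function(t, s) {
  list(
    Complete = which(!is.na(t) & !is.na(s)),  # both outcomes observed
    TMiss = which(is.na(t) & !is.na(s)),      # target missing only
    SMiss = which(!is.na(t) & is.na(s))       # surrogate missing only
  )
}

t_obs <- c(1.2, NA, 0.4, NA, 2.1)
s_obs <- c(0.3, 0.8, NA, NA, 1.5)
idx <- partition_sketch(t_obs, s_obs)
```

In this toy input, row 4 has neither outcome observed and therefore appears in none of the three groups.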
Print for Bivariate Regression Model
## S3 method for class 'bnr'
print(x, ..., type = "Regression")
x |
A bnr object. |
... |
Unused. |
type |
Either Regression or Covariance. |
Function to simulate from a bivariate normal regression model with outcomes missing completely at random.
rBNR( X, Z, b, a, t_miss = 0, s_miss = 0, sigma = NULL, include_residuals = TRUE )
X |
Target design matrix. |
Z |
Surrogate design matrix. |
b |
Target regression coefficient. |
a |
Surrogate regression coefficient. |
t_miss |
Target missingness in [0,1]. |
s_miss |
Surrogate missingness in [0,1]. |
sigma |
2x2 target-surrogate covariance matrix. |
include_residuals |
Include the residual? Default: TRUE. |
Numeric Nx2 matrix. The first column contains the target outcome, the second contains the surrogate outcome.
set.seed(100)
# Observations.
n <- 1e3
# Target design.
X <- cbind(1, matrix(rnorm(3 * n), nrow = n))
# Surrogate design.
Z <- cbind(1, matrix(rnorm(3 * n), nrow = n))
# Target coefficient.
b <- c(-1, 0.1, -0.1, 0.1)
# Surrogate coefficient.
a <- c(1, -0.1, 0.1, -0.1)
# Covariance structure.
sigma <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)
# Data generation, target and surrogate subject to 10% missingness.
y <- rBNR(X, Z, b, a, t_miss = 0.1, s_miss = 0.1, sigma = sigma)
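To make the data-generating model concrete, the following base R sketch simulates correlated target and surrogate outcomes and then removes values completely at random. It is not the package's rBNR: sim_bnr_sketch is a hypothetical name, and removing exactly round(n * t_miss) target values (and likewise for the surrogate), independently of each other, is an assumption.

```r
# Hypothetical simulator for a bivariate normal regression with MCAR outcomes.
sim_bnr_sketch <- function(X, Z, b, a, sigma, t_miss = 0, s_miss = 0) {
  X <- as.matrix(X)
  Z <- as.matrix(Z)
  n <- nrow(X)
  # Residuals with covariance sigma: standard normals times the Cholesky factor.
  E <- matrix(stats::rnorm(2 * n), nrow = n) %*% chol(sigma)
  y <- cbind(as.numeric(X %*% b), as.numeric(Z %*% a)) + E
  colnames(y) <- c("Target", "Surrogate")
  # Assumption: a fixed number of values is removed from each outcome.
  y[sample.int(n, round(n * t_miss)), 1] <- NA
  y[sample.int(n, round(n * s_miss)), 2] <- NA
  y
}

set.seed(100)
n <- 1e3
X <- cbind(1, rnorm(n))
Z <- cbind(1, rnorm(n))
sigma <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)
y <- sim_bnr_sketch(X, Z, b = c(-1, 0.1), a = c(1, -0.1), sigma = sigma,
                    t_miss = 0.1, s_miss = 0.1)
```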
Regression Information
RegInfo(data_part, sigma, as_matrix = FALSE)
data_part |
List of partitioned data. See PartitionData. |
sigma |
Target-surrogate covariance matrix. |
as_matrix |
Return as an information matrix? If FALSE, returns a list. |
List containing the information matrix for beta (Ibb), the information matrix for alpha (Iaa), and the cross information (Iba).
Tabulate Regression Coefficients
RegTab(point, info, sig = 0.05)
point |
Point estimates. |
info |
Information matrix. |
sig |
Significance level. |
Data.table containing the point estimate, standard error, confidence interval, and Wald p-value.
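The columns described above follow from the point estimates and the information matrix by standard Wald calculations. The sketch below uses a plain data.frame and a hypothetical function name, reg_table_sketch; the package returns a data.table, and its column names are not shown here.

```r
# Hypothetical sketch: standard errors, confidence intervals, and Wald
# p-values from point estimates and an information matrix.
reg_table_sketch <- function(point, info, sig = 0.05) {
  se <- sqrt(diag(solve(info)))       # SEs from the inverse information
  z <- stats::qnorm(1 - sig / 2)      # two-sided critical value
  data.frame(
    Point = point,
    SE = se,
    L = point - z * se,
    U = point + z * se,
    p = 2 * stats::pnorm(-abs(point / se))
  )
}

point <- c(1.0, 0.1)
info <- diag(c(100, 400))
tab <- reg_table_sketch(point, info)
```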
Regression Update
RegUpdate(data_part, sigma)
data_part |
List of partitioned data. See PartitionData. |
sigma |
Target-surrogate covariance matrix. |
List containing the generalized least squares estimates of beta and alpha.
Extract Residuals from Bivariate Regression Model
## S3 method for class 'bnr'
residuals(object, ..., type = NULL)
object |
A bnr object. |
... |
Unused. |
type |
Either Target or Surrogate. |
Calculates the efficient information $I_{bb} - I_{ba} I_{aa}^{-1} I_{ab}$.
SchurC(Ibb, Iaa, Iba)
Ibb |
Information of target parameter |
Iaa |
Information of nuisance parameter |
Iba |
Cross information between target and nuisance parameters |
Numeric matrix.
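The efficient information for the target parameter is the Schur complement of the nuisance block in the full information matrix. A base R sketch follows, assuming Iab is the transpose of Iba; schur_sketch is a hypothetical name and not the package's implementation.

```r
# Hypothetical sketch of the Schur complement Ibb - Iba Iaa^{-1} Iab.
schur_sketch <- function(Ibb, Iaa, Iba) {
  # Assumption: Iab = t(Iba), as holds for a symmetric information matrix.
  Ibb - Iba %*% solve(Iaa, t(Iba))
}

Ibb <- matrix(c(4, 1, 1, 3), nrow = 2)   # target block
Iaa <- matrix(2, nrow = 1)               # nuisance block
Iba <- matrix(c(1, 1), nrow = 2)         # cross information
eff <- schur_sketch(Ibb, Iaa, Iba)

# Full information matrix, for comparison.
full <- rbind(cbind(Ibb, Iba), cbind(t(Iba), Iaa))
```

The inverse of eff equals the target block of the inverse of the full information matrix, which is why it yields the correct variance for the target parameter after accounting for the nuisance parameter.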
Performs a Score test of the null hypothesis that a subset of the regression parameters for the target outcome are zero.
ScoreBNEM( t, s, X, Z, is_zero, init = NULL, maxit = 100, eps = 1e-08, report = FALSE )
t |
Target outcome vector. |
s |
Surrogate outcome vector. |
X |
Target model matrix. |
Z |
Surrogate model matrix. |
is_zero |
Logical vector, with as many entries as columns in the target model matrix, indicating which columns have coefficient zero under the null. |
init |
Optional list of initial parameters for fitting the null model. |
maxit |
Maximum number of parameter updates. |
eps |
Minimum acceptable improvement in log likelihood. |
report |
Report model fitting progress? Default is FALSE. |
A numeric vector containing the score statistic, the degrees of freedom, and a p-value.
Show for Bivariate Regression Model
## S4 method for signature 'bnr'
show(object)
object |
A bnr object. |
This package performs estimation and inference on a partially missing target
outcome while borrowing information from a correlated surrogate outcome.
Rather than regarding the surrogate outcome as a proxy for the target
outcome, this package jointly models the target and surrogate outcomes within
a bivariate regression framework. Unobserved values of either outcome are
treated as missing data. In contrast to imputation-based inference, no
assumptions are required regarding the relationship between the target and
surrogate outcomes. However, in order for surrogate inference to improve
power, the target and surrogate outcomes must be correlated, and the target
outcome must be partially missing. The primary estimation function is FitBNR.
In the case of bilateral missingness, i.e. missingness in both the target and
surrogate outcomes, estimation is performed via an expectation conditional
maximization either (ECME) algorithm. In the case of unilateral target
missingness, estimation is performed using an accelerated least squares
procedure. Inference on regression parameters for the target outcome is
performed using TestBNR.
Zachary R. McCaw
Performs a test of the null hypothesis that a subset of the regression parameters for the target outcome are zero in the bivariate normal regression model.
TestBNR(t, s, X, Z = NULL, is_zero, test = "Wald", ...)
t |
Target outcome vector. |
s |
Surrogate outcome vector. |
X |
Target model matrix. |
Z |
Surrogate model matrix. |
is_zero |
Logical vector, with as many entries as columns in the target model matrix, indicating which columns have coefficient zero under the null. |
test |
Either Score or Wald. Only Wald is available for LS. |
... |
Additional arguments accepted if fitting via EM. See
|
A numeric vector containing the test statistic, the degrees of freedom, and a p-value.
# Generate data.
set.seed(100)
n <- 1e3
X <- cbind(1, rnorm(n))
Z <- cbind(1, rnorm(n))
data <- rBNR(X = X, Z = Z, b = c(1, 0), a = c(-1, 0), t_miss = 0.1, s_miss = 0.1)

# Test 1st coefficient.
wald_test1 <- TestBNR(
  t = data[, 1], s = data[, 2], X = X, Z = Z,
  is_zero = c(TRUE, FALSE), test = "Wald"
)
score_test1 <- TestBNR(
  t = data[, 1], s = data[, 2], X = X, Z = Z,
  is_zero = c(TRUE, FALSE), test = "Score"
)

# Test 2nd coefficient.
wald_test2 <- TestBNR(
  t = data[, 1], s = data[, 2], X = X, Z = Z,
  is_zero = c(FALSE, TRUE), test = "Wald"
)
score_test2 <- TestBNR(
  t = data[, 1], s = data[, 2], X = X, Z = Z,
  is_zero = c(FALSE, TRUE), test = "Score"
)
Calculates the trace of a matrix $A$.
tr(A)
A |
Numeric matrix. |
Scalar.
EM Update
UpdateEM(data_part, b0, a0, sigma0)
data_part |
List of partitioned data. See PartitionData. |
b0 |
Initial target regression coefficient. |
a0 |
Initial surrogate regression coefficient. |
sigma0 |
Initial covariance matrix. |
List containing updated values for beta 'b', alpha 'a', 'sigma', the log likelihood 'loglik', and the change in log likelihood 'delta'.
Returns either the estimated covariance matrix of the outcome, the information matrix for regression coefficients, or the information matrix for covariance parameters.
## S3 method for class 'bnr'
vcov(object, ..., type = "Regression", inv = FALSE)
object |
A bnr object. |
... |
Unused. |
type |
Select "Covariance", "Outcome", or "Regression". Default is "Regression". |
inv |
Invert the covariance matrix? Default is FALSE. |
Performs a Wald test of the null hypothesis that a subset of the regression parameters for the target outcome are zero.
WaldBNEM( t, s, X, Z, is_zero, init = NULL, maxit = 100, eps = 1e-08, report = FALSE )
t |
Target outcome vector. |
s |
Surrogate outcome vector. |
X |
Target model matrix. |
Z |
Surrogate model matrix. |
is_zero |
Logical vector, with as many entries as columns in the target model matrix, indicating which columns have coefficient zero under the null. |
init |
Optional list of initial parameters for fitting the null model, with one or more of the components: a0, b0, S0. |
maxit |
Maximum number of parameter updates. |
eps |
Minimum acceptable improvement in log likelihood. |
report |
Report model fitting progress? Default is FALSE. |
A numeric vector containing the Wald statistic, the degrees of freedom, and a p-value.
Performs a Wald test of the null hypothesis that a subset of the regression parameters for the target outcome are zero.
WaldBNLS(t, s, X, is_zero)
t |
Target outcome vector. |
s |
Surrogate outcome vector. |
X |
Model matrix. |
is_zero |
Logical vector, with as many entries as columns in the target model matrix, indicating which columns have coefficient zero under the null. |
A numeric vector containing the Wald statistic, the degrees of freedom, and a p-value.
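The three returned quantities can be illustrated with a generic Wald test for a subset of coefficients. This is a sketch only: wald_sketch is a hypothetical name, and info is assumed to be the (efficient) information matrix for the target coefficients; how the package obtains that matrix is not reproduced here.

```r
# Hypothetical sketch of a Wald test that the coefficients flagged by
# is_zero are all zero.
wald_sketch <- function(beta, info, is_zero) {
  stopifnot(is.logical(is_zero), length(is_zero) == length(beta), any(is_zero))
  # Covariance of the tested coefficients from the inverse information.
  v <- solve(info)[is_zero, is_zero, drop = FALSE]
  b <- beta[is_zero]
  stat <- as.numeric(t(b) %*% solve(v, b))
  df <- sum(is_zero)
  c(Wald = stat, df = df,
    p = stats::pchisq(stat, df = df, lower.tail = FALSE))
}

beta <- c(1.0, 0.2)
info <- diag(c(100, 100))
out <- wald_sketch(beta, info, is_zero = c(FALSE, TRUE))
```

The statistic is referred to a chi-squared distribution with degrees of freedom equal to the number of coefficients tested.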