| Title: | Rank Normal Transformation Omnibus Test |
|---|---|
| Description: | Inverse normal transformation (INT) based genetic association testing. These tests are recommended for continuous traits with non-normally distributed residuals. INT-based tests robustly control the type I error in settings where standard linear regression does not, as when the residual distribution exhibits excess skew or kurtosis. Moreover, INT-based tests outperform standard linear regression in terms of power. These tests may be classified into two types. In direct INT (D-INT), the phenotype is itself transformed. In indirect INT (I-INT), phenotypic residuals are transformed. The omnibus test (O-INT) adaptively combines D-INT and I-INT into a single robust and statistically powerful approach. See McCaw ZR, Lane JM, Saxena R, Redline S, Lin X. "Operating characteristics of the rank-based inverse normal transformation for quantitative trait analysis in genome-wide association studies" <doi:10.1111/biom.13214>. |
| Authors: | Zachary R. McCaw [aut, cre] |
| Maintainer: | Zachary R. McCaw <[email protected]> |
| License: | GPL-3 |
| Version: | 1.0.1.4 |
| Built: | 2026-05-26 09:07:52 UTC |
| Source: | https://github.com/zrmacc/rnomni |
INT-based tests control type I error when standard linear regression does not (e.g. skewed or kurtotic residuals) and typically outperform linear regression in power. The package provides:
BAT: Basic association test (no transformation).
DINT: Direct INT (phenotype is rank-normalized).
IINT: Indirect INT (phenotypic residuals are rank-normalized).
OINT: Omnibus test combining D-INT and I-INT via Cauchy combination (OmniP).
Helper functions include RankNorm (rank-based INT) and FitOLS (OLS fit).
Genetic association tests based on the rank-based inverse normal transformation (INT). Recommended for continuous traits with non-normally distributed residuals.
Use OINT for a single robust test that adapts to the trait
distribution. Use DINT when the trait may be a
rank-preserving transform of a normal trait, and IINT when
the trait is linear in covariates but has non-normal residuals.
Maintainer: Zachary McCaw [email protected] (ORCID)
McCaw ZR, Lane JM, Saxena R, Redline S, Lin X (2020). Operating characteristics of the rank-based inverse normal transformation for quantitative trait analysis in genome-wide association studies. Biometrics, doi:10.1111/biom.13214.
OINT, DINT, IINT, BAT,
RankNorm, OmniP
Stops evaluation if inputs are improperly formatted.
BasicInputChecks(y, G, X)
y |
Numeric phenotype vector. |
G |
Genotype matrix with observations as rows, SNPs as columns. |
X |
Covariate matrix. |
None; called for side effects (stops on invalid input).
Conducts tests of association between the loci in G and the
untransformed phenotype y, adjusting for the model matrix X.
BAT(y, G, X = NULL, test = "Score", simple = FALSE)
y |
Numeric phenotype vector. |
G |
Genotype matrix with observations as rows, SNPs as columns. |
X |
Model matrix of covariates and structure adjustments. Should include
an intercept. Omit or set to |
test |
Character: |
simple |
If |
If simple = TRUE, returns a vector of p-values, one for each column
of G. If simple = FALSE, returns a numeric matrix, including the
Wald or Score statistic, its standard error, the Z-score, and the p-value.
    set.seed(100)
    # Design matrix
    X <- cbind(1, stats::rnorm(1e3))
    # Genotypes
    G <- replicate(1e3, stats::rbinom(n = 1e3, size = 2, prob = 0.25))
    storage.mode(G) <- "numeric"
    # Phenotype
    y <- as.numeric(X %*% c(1, 1)) + stats::rnorm(1e3)
    # Association test
    p <- BAT(y = y, G = G, X = X)
Converts a Cauchy random variable to a p-value.
CauchyToP(z)
z |
Numeric Cauchy random variable. |
Numeric p-value.
Applies the rank-based inverse normal transformation (RankNorm)
to the phenotype y. Conducts tests of association between the loci in
G and transformed phenotype, adjusting for the model matrix X.
DINT(y, G, X = NULL, k = 0.375, test = "Score", ties.method = "average", simple = FALSE)
y |
Numeric phenotype vector. |
G |
Genotype matrix with observations as rows, SNPs as columns. |
X |
Model matrix of covariates and structure adjustments. Should include
an intercept. Omit or set to |
k |
Offset for rank-normalization; see |
test |
Character: |
ties.method |
Method for breaking ties, passed to |
simple |
If |
If simple = TRUE, returns a vector of p-values, one for each column
of G. If simple = FALSE, returns a numeric matrix, including the
Wald or Score statistic, its standard error, the Z-score, and the p-value.
    set.seed(100)
    # Design matrix
    X <- cbind(1, stats::rnorm(1e3))
    # Genotypes
    G <- replicate(1e3, stats::rbinom(n = 1e3, size = 2, prob = 0.25))
    storage.mode(G) <- "numeric"
    # Phenotype
    y <- exp(as.numeric(X %*% c(1, 1)) + stats::rnorm(1e3))
    # Association test
    p <- DINT(y = y, G = G, X = X)
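As a rough illustration of the direct approach for a single SNP, the sketch below rank-normalizes the phenotype and then regresses it on the covariates and genotype using base R only. It is not the package's implementation: the offset formula `(rank - k) / (n - 2k + 1)` is an assumed Blom-type form, and a Wald test from `stats::lm` stands in for the package's Score test.

```r
set.seed(100)
n <- 500
# Design matrix with intercept, one SNP, and a skewed (log-normal) phenotype
X <- cbind(1, stats::rnorm(n))
g <- as.numeric(stats::rbinom(n, size = 2, prob = 0.25))
y <- exp(as.numeric(X %*% c(1, 1)) + stats::rnorm(n))

# Rank-based INT of the phenotype itself (assumed offset formula)
k <- 0.375
z <- stats::qnorm((rank(y, ties.method = "average") - k) / (n - 2 * k + 1))

# Regress the transformed phenotype on covariates and genotype;
# X already contains the intercept, so the formula suppresses it
fit <- stats::lm(z ~ 0 + X + g)
wald <- summary(fit)$coefficients["g", ]
p_dint <- wald[["Pr(>|t|)"]]
```

Because the genotype here is simulated independently of the phenotype, `p_dint` is a draw under the null hypothesis.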
Fits the linear model by OLS.
FitOLS(y, X)
y |
Numeric response vector (length |
X |
Numeric design matrix ( |
A list:
Beta |
Estimated coefficient vector |
V |
Residual variance |
Ibb |
Information matrix for |
Resid |
Residual vector |
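The returned quantities can be sketched in base R via the normal equations. The function name `fit_ols_sketch` is hypothetical, and two details are assumptions rather than documented behavior: the residual variance is divided by `n - q` (the residual degrees of freedom), and the information matrix is taken to be `X'X / V`.

```r
# Hypothetical stand-in for illustration; not the package's FitOLS.
fit_ols_sketch <- function(y, X) {
  n <- nrow(X)
  q <- ncol(X)
  XtX <- crossprod(X)                       # X'X
  beta <- solve(XtX, crossprod(X, y))       # solve the normal equations
  resid <- as.numeric(y - X %*% beta)
  v <- sum(resid^2) / (n - q)               # assumed: unbiased residual variance
  list(
    Beta = as.numeric(beta),
    V = v,
    Ibb = XtX / v,                          # assumed form of the information for beta
    Resid = resid
  )
}

set.seed(100)
X <- cbind(1, stats::rnorm(200))
y <- as.numeric(X %*% c(1, 2)) + stats::rnorm(200)
fit <- fit_ols_sketch(y, X)

# Cross-check against base R's QR-based least squares
ref <- stats::lm.fit(x = X, y = y)
```

Comparing `fit$Beta` with `ref$coefficients` is a quick sanity check that the two routes agree on a well-conditioned design.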
Two-stage association testing procedure. In the first stage, phenotype
y and genotype G are each regressed on the model matrix
X to obtain residuals. The phenotypic residuals are transformed
using RankNorm. In the next stage, the INT-transformed
residuals are regressed on the genotypic residuals.
IINT(y, G, X = NULL, k = 0.375, ties.method = "average", simple = FALSE)
y |
Numeric phenotype vector. |
G |
Genotype matrix with observations as rows, SNPs as columns. |
X |
Model matrix of covariates and structure adjustments. Should include
an intercept. Omit or set to |
k |
Offset for rank-normalization; see |
ties.method |
Method for breaking ties, passed to |
simple |
If |
If simple = TRUE, returns a vector of p-values, one for each column
of G. If simple = FALSE, returns a numeric matrix, including the
Wald or Score statistic, its standard error, the Z-score, and the p-value.
    set.seed(100)
    # Design matrix
    X <- cbind(1, stats::rnorm(1e3))
    # Genotypes
    G <- replicate(1e3, stats::rbinom(n = 1e3, size = 2, prob = 0.25))
    storage.mode(G) <- "numeric"
    # Phenotype
    y <- exp(as.numeric(X %*% c(1, 1)) + stats::rnorm(1e3))
    # Association test
    p <- IINT(y = y, G = G, X = X)
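The two-stage procedure can be sketched for a single SNP with base R alone. This is an illustration under stated assumptions, not the package's code: the offset formula `(rank - k) / (n - 2k + 1)` is an assumed Blom-type form, and the second stage uses a plain `stats::lm` Wald test whose degrees of freedom ignore the covariates removed in stage one.

```r
set.seed(100)
n <- 500
X <- cbind(1, stats::rnorm(n))
g <- as.numeric(stats::rbinom(n, size = 2, prob = 0.25))
y <- exp(as.numeric(X %*% c(1, 1)) + stats::rnorm(n))

# Stage 1: residualize both phenotype and genotype on the model matrix
e_y <- stats::lm.fit(x = X, y = y)$residuals
e_g <- stats::lm.fit(x = X, y = g)$residuals

# Rank-normalize the phenotypic residuals (assumed offset formula)
k <- 0.375
z <- stats::qnorm((rank(e_y, ties.method = "average") - k) / (n - 2 * k + 1))

# Stage 2: regress transformed residuals on genotypic residuals (no intercept)
stage2 <- stats::lm(z ~ 0 + e_g)
p_iint <- summary(stage2)$coefficients["e_g", "Pr(>|t|)"]
```

Residualizing the genotype as well as the phenotype keeps the second-stage regressor orthogonal to the covariates, which is the point of the two-stage design.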
Association test that synthesizes the DINT and
IINT tests. The first approach is most powerful for traits that
could have arisen from a rank-preserving transformation of a latent normal
trait. The second approach is most powerful for traits that are linear in
covariates, yet have skewed or kurtotic residual distributions. During the
omnibus test, the direct and indirect tests are separately applied, then the
p-values are combined via the Cauchy combination method.
OINT(y, G, X = NULL, k = 0.375, ties.method = "average", weights = c(1, 1), simple = FALSE)
y |
Numeric phenotype vector. |
G |
Genotype matrix with observations as rows, SNPs as columns. |
X |
Model matrix of covariates and structure adjustments. Should include an intercept. Omit to perform marginal tests of association. |
k |
Offset applied during rank-normalization. See
|
ties.method |
Method of breaking ties, passed to |
weights |
Numeric length-2 vector of weights for the D-INT and I-INT
p-values in the Cauchy combination. Default |
simple |
If |
If simple = TRUE, a named numeric vector of omnibus p-values
(one per column of G). If simple = FALSE, a numeric matrix
with columns DINT-p, IINT-p, OINT-p and one row per SNP.
    set.seed(100)
    # Design matrix
    X <- cbind(1, rnorm(1e3))
    # Genotypes
    G <- replicate(1e3, rbinom(n = 1e3, size = 2, prob = 0.25))
    storage.mode(G) <- "numeric"
    # Phenotype
    y <- exp(as.numeric(X %*% c(1, 1)) + rnorm(1e3))
    # Omnibus
    p <- OINT(y = y, G = G, X = X, simple = TRUE)
Combines a vector of potentially dependent p-values into a single p-value using the Cauchy combination method: p-values are converted to Cauchy random deviates, then weighted and summed; the sum is again Cauchy, and is converted back to a p-value.
OmniP(p, w = NULL)
p |
Numeric vector of p-values (each in [0, 1]; cannot mix 0 and 1). |
w |
Optional numeric weight vector of the same length as |
A single numeric p-value.
Liu Y, Xie J (2020). Cauchy combination test: a powerful test with analytic p-value calculation under arbitrary dependency structures. J Am Stat Assoc, doi:10.1080/01621459.2018.1554485.
OINT, which uses OmniP to combine D-INT and
I-INT p-values.
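The Cauchy combination described above can be sketched in a few lines of base R. The function `omni_p_sketch` is a hypothetical stand-in, not the package's OmniP: it assumes equal weights when `w` is `NULL`, normalizes the weights to sum to one, and for simplicity only accepts p-values strictly inside (0, 1).

```r
# Hypothetical illustration of the Cauchy combination; not the package's OmniP.
omni_p_sketch <- function(p, w = NULL) {
  stopifnot(is.numeric(p), all(p > 0 & p < 1))
  if (is.null(w)) w <- rep(1, length(p))    # assumed default: equal weights
  stopifnot(length(w) == length(p), all(w >= 0), sum(w) > 0)
  w <- w / sum(w)                           # normalize so the sum is standard Cauchy
  # Map each p-value to a Cauchy deviate, then take the weighted sum
  stat <- sum(w * tan((0.5 - p) * pi))
  # Upper-tail probability of the standard Cauchy distribution
  stats::pcauchy(stat, lower.tail = FALSE)
}

p_equal <- omni_p_sketch(c(0.02, 0.02))     # identical inputs
p_mixed <- omni_p_sketch(c(1e-6, 0.8))      # one strong signal, one null-like
```

Combining identical p-values returns that same p-value, while a single very small p-value dominates the combination, which is why the omnibus test retains power when only one of D-INT or I-INT detects the signal.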
Partitions the residuals e and the model matrix X according to the missingness pattern of the genotype g.
PartitionData(e, g, X)
e |
Numeric residual vector. |
g |
Genotype vector. |
X |
Model matrix of covariates. |
List containing:
"g_obs", observed genotype vector.
"X_obs", covariates for subjects with observed genotypes.
"X_mis", covariates for subjects with missing genotypes.
"e_obs", residuals for subjects with observed genotypes.
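A minimal base-R sketch of this partitioning follows. The name `partition_sketch` is hypothetical, and the sketch assumes that missing genotypes are coded as `NA`.

```r
# Hypothetical illustration; not the package's PartitionData.
partition_sketch <- function(e, g, X) {
  obs <- !is.na(g)                          # assumed: missing genotypes are NA
  list(
    g_obs = g[obs],
    X_obs = X[obs, , drop = FALSE],         # drop = FALSE keeps matrix shape
    X_mis = X[!obs, , drop = FALSE],
    e_obs = e[obs]
  )
}

g <- c(0, 1, NA, 2, NA)
e <- c(0.3, -1.2, 0.5, 0.1, 0.8)
X <- cbind(1, c(10, 20, 30, 40, 50))
parts <- partition_sketch(e, g, X)
```

Using `drop = FALSE` matters when only one subject falls in a partition, since the covariates would otherwise collapse from a matrix to a vector.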
Converts a p-value to a Cauchy random variable.
PtoCauchy(p)
p |
Numeric p-value. |
Numeric Cauchy random variable.
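The two conversions are inverses of each other, which a short base-R sketch can make concrete. The helper names below are hypothetical and the formulas are an assumption about the mapping (the upper-tail quantile and tail probability of the standard Cauchy distribution), not a copy of the package's code.

```r
# Hypothetical helpers illustrating the p-value <-> Cauchy mapping.
p_to_cauchy_sketch <- function(p) tan((0.5 - p) * pi)
cauchy_to_p_sketch <- function(z) stats::pcauchy(z, lower.tail = FALSE)

p <- c(0.001, 0.05, 0.5, 0.9)
z <- p_to_cauchy_sketch(p)       # small p-values map to large positive deviates
p_back <- cauchy_to_p_sketch(z)  # round trip recovers the original p-values
```

Under this mapping a p-value of 0.5 corresponds to a deviate of zero, and p-values above 0.5 map to negative deviates.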
Applies the rank-based inverse normal transform (INT) to a numeric vector. Observations are first mapped to the (0, 1) scale via the empirical cumulative distribution function (ECDF), then to the real line via the probit (inverse normal CDF).
RankNorm(u, k = 0.375, ties.method = "average")
u |
Numeric vector. Must not contain |
k |
Offset in the probability scale; must be in (0, 0.5). Default
|
ties.method |
Method for breaking ties, passed to |
Numeric vector of rank-normalized values (same length as u).
    # Draw from chi-1 distribution
    y <- stats::rchisq(n = 1e3, df = 1)
    # Rank normalize
    z <- RankNorm(y)
    # Plot density of transformed measurement
    plot(stats::density(z))
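The transformation itself is compact enough to sketch in base R. The function `rank_norm_sketch` is a hypothetical stand-in: it assumes the ECDF step takes the Blom-type form `(rank - k) / (n - 2k + 1)`, which keeps every probability strictly inside (0, 1) before the probit is applied.

```r
# Hypothetical illustration of a rank-based INT; not the package's RankNorm.
rank_norm_sketch <- function(u, k = 0.375, ties.method = "average") {
  stopifnot(is.numeric(u), k > 0, k < 0.5)
  n <- length(u)
  r <- rank(u, ties.method = ties.method)
  # Offset ranks onto (0, 1), then apply the inverse normal CDF
  stats::qnorm((r - k) / (n - 2 * k + 1))
}

set.seed(1)
u <- stats::rexp(500)            # strongly right-skewed input
z <- rank_norm_sketch(u)
```

The output preserves the ordering of the input and, in the absence of ties, is symmetric about zero regardless of how skewed the input was.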