Commit 2f555297 by Eric Coissac

New version of the manuscript

And changes on some default parameter in multivariate.R
parent 96f5bf28
......@@ -24,7 +24,6 @@ Suggests: knitr,
vegan
VignetteBuilder: knitr
Collate:
'IR.R'
'internals.R'
'procmod_frame.R'
'multivariate.R'
......
......@@ -26,7 +26,6 @@ export(bicenter)
export(corls)
export(corls.partial)
export(corls.test)
export(icor)
export(is.euclid)
export(is.procmod.frame)
export(nmds)
......
......@@ -62,7 +62,7 @@ as.data.frame.dist <- function(x, row.names = NULL, optional = FALSE, ...) {
#' @author Christelle Gonindard-Melodelima
#' @export
nmds <- function(distances,
-maxit = 50, trace = TRUE,
+maxit = 100, trace = FALSE,
tol = 0.001, p = 2) {
......
......@@ -4,7 +4,7 @@
\alias{nmds}
\title{Project a distance matrix in a euclidean space (NMDS).}
\usage{
-nmds(distances, maxit = 50, trace = TRUE, tol = 0.001, p = 2)
+nmds(distances, maxit = 100, trace = FALSE, tol = 0.001, p = 2)
}
\arguments{
\item{distances}{a \code{\link[stats]{dist}} object or a
......
......@@ -359,7 +359,7 @@ r2s <- c(1:2) * 5/100
n_rand <- 1000
@
-To evaluate relative power of the three considered tests, pairs of to random matrices were produced for various $p \in \{\Sexpr{p_qs}\}$, $n \in \{\Sexpr{n_indivduals}\}$ and two levels of shared variances $R^2 \in \{\Sexpr{r2s}\}$. For each combination of parameters, $k = \Sexpr{n_sim}$ simulations are run. Each test are estimated based on $\Sexpr{n_rand}$ randomizations for the $CovLs$ test, or permutations for \texttt{protest} and \texttt{procuste.rtest}.
+To evaluate relative power of the three considered tests, pairs of to random matrices were produced for various $p \in \{\Sexpr{p_qs}\}$, $n \in \{\Sexpr{n_indivduals}\}$ and two levels of shared variations $R^2 \in \{\Sexpr{r2s}\}$. For each combination of parameters, $k = \Sexpr{n_sim}$ simulations are run. Each test are estimated based on $\Sexpr{n_rand}$ randomizations for the $CovLs$ test, or permutations for \texttt{protest} and \texttt{procuste.rtest}.
<<estimate_power, cache=TRUE, message=FALSE, warning=FALSE, include=FALSE, dependson="estimate_power_setting">>=
......@@ -634,10 +634,10 @@ print(tab,
\subsection{Power of the test based on randomisation}
-Power of the $CovLs$ test based on the estimation of $\overline{RCovLs(X,Y)}$ is equivalent of the power estimated for both \texttt{vegan::protest} and \texttt{ade4::procuste.rtest} tests (Table \ref{tab:power}). As for the two other tests, power decreases when the number of variable ($p$ or $q$) increases and increase with the number of individuals and the shared variance. The advantage of the test based on the Monte-Carlo estimation of $\overline{RCovLs(X,Y)}$ is to remove the need of running a supplementary set of permutations when \irls is computed.
+Power of the $CovLs$ test based on the estimation of $\overline{RCovLs(X,Y)}$ is equivalent of the power estimated for both \texttt{vegan::protest} and \texttt{ade4::procuste.rtest} tests (Table \ref{tab:power}). As for the two other tests, power decreases when the number of variable ($p$ or $q$) increases and increase with the number of individuals and the shared variation. The advantage of the test based on the Monte-Carlo estimation of $\overline{RCovLs(X,Y)}$ is to remove the need of running a supplementary set of permutations when \irls is computed.
\begin{table}[!t]
-\processtable{Power estimation of the procruste tests for two low level of shared variances $5\%$ and $10\%$.\label{tab:power}} {
+\processtable{Power estimation of the procruste tests for two low level of shared variations $5\%$ and $10\%$.\label{tab:power}} {
<<power_table, echo=FALSE, results="asis" >>=
n_indivduals <- dimnames(power)[[1]]
p_qs <- dimnames(power)[[2]]
......@@ -691,7 +691,8 @@ print(tab,
}{} % <- we can add a footnote in the last curly praces
\end{table}
-\subsection{Evaluating the shared variance}
+\subsection{Evaluating the shared variation}
$\rls$ can be considered for matrices as a strict equivalent of Pearson's $\rpearson$ for vectors. Therefore its squared value is an estimator of the shared variation between two matrices. But because of over-fitting the estimation is over-estimated. The proposed corrected vection ($\irls$) of that coefficient is able to provide a good estimate of the shared variation and is perfectly robust to the over-fitting phenomenon (Figure~\ref{fig:shared_variation}). Only a small over evalution is observable for the low values of simulated shared variation.
\begin{figure*}[!tpb]%figure1
<<fig__r2, echo=FALSE, message=FALSE, warning=FALSE, fig.height=4, fig.width=8>>=
......@@ -715,7 +716,7 @@ r2_sims_all %>%
theme(axis.text.x = element_text(angle = 45, size=7, hjust = 1))
@
\caption{Shared variation ($R^2$) between two matrices is mesured with both the corrected ($\irls$) and the original ($\rls$) versions of the procrustean correlation coefficient. A gradiant of $R^2$ is simulated for two population sizes ($n \in \{10,24\}$) and two numbers of descriptive variables ($p \in \{10,100\}$). The black dashed line corresponds to a perfect match where measured $R^2$ equals the simulated one.}
-\label{fig:shared_variance}
+\label{fig:shared_variation}
\end{figure*}
......
......@@ -233,7 +233,7 @@ $p \in \{10, 20, 50\}$ are simulated under the null hypothesis of independancy.
-To evaluate relative power of the three considered tests, pairs of to random matrices were produced for various $p \in \{10, 20, 50, 100\}$, $n \in \{10, 15, 20, 25\}$ and two levels of shared variances $R^2 \in \{0.05, 0.1\}$. For each combination of parameters, $k = 1000$ simulations are run. Each test are estimated based on $1000$ randomizations for the $CovLs$ test, or permutations for \texttt{protest} and \texttt{procuste.rtest}.
+To evaluate relative power of the three considered tests, pairs of to random matrices were produced for various $p \in \{10, 20, 50, 100\}$, $n \in \{10, 15, 20, 25\}$ and two levels of shared variations $R^2 \in \{0.05, 0.1\}$. For each combination of parameters, $k = 1000$ simulations are run. Each test are estimated based on $1000$ randomizations for the $CovLs$ test, or permutations for \texttt{protest} and \texttt{procuste.rtest}.
......@@ -289,7 +289,7 @@ whatever the $p$ tested (Table~\ref{tab:alpha_pvalue}). This ensure that the pro
of the distribution of $P_{values}$ correlation test to $\mathcal{U}(0,1)$
under the null hypothesis.\label{tab:alpha_pvalue}} {
% latex table generated in R 3.5.2 by xtable 1.8-4 package
-% Fri Jun 7 14:46:13 2019
+% Fri Jun 7 15:08:34 2019
\begin{tabular*}{0.98\linewidth}{@{\extracolsep{\fill}}crrr}
\hline
& \multicolumn{3}{c}{Cramer-Von Mises p.value} \\
......@@ -306,12 +306,12 @@ whatever the $p$ tested (Table~\ref{tab:alpha_pvalue}). This ensure that the pro
\subsection{Power of the test based on randomisation}
-Power of the $CovLs$ test based on the estimation of $\overline{RCovLs(X,Y)}$ is equivalent of the power estimated for both \texttt{vegan::protest} and \texttt{ade4::procuste.rtest} tests (Table \ref{tab:power}). As for the two other tests, power decreases when the number of variable ($p$ or $q$) increases and increase with the number of individuals and the shared variance. The advantage of the test based on the Monte-Carlo estimation of $\overline{RCovLs(X,Y)}$ is to remove the need of running a supplementary set of permutations when \irls is computed.
+Power of the $CovLs$ test based on the estimation of $\overline{RCovLs(X,Y)}$ is equivalent of the power estimated for both \texttt{vegan::protest} and \texttt{ade4::procuste.rtest} tests (Table \ref{tab:power}). As for the two other tests, power decreases when the number of variable ($p$ or $q$) increases and increase with the number of individuals and the shared variation. The advantage of the test based on the Monte-Carlo estimation of $\overline{RCovLs(X,Y)}$ is to remove the need of running a supplementary set of permutations when \irls is computed.
\begin{table}[!t]
-\processtable{Power estimation of the procruste tests for two low level of shared variances $5\%$ and $10\%$.\label{tab:power}} {
+\processtable{Power estimation of the procruste tests for two low level of shared variations $5\%$ and $10\%$.\label{tab:power}} {
% latex table generated in R 3.5.2 by xtable 1.8-4 package
-% Fri Jun 7 14:46:13 2019
+% Fri Jun 7 15:08:34 2019
\begin{tabular}{lcrrrrrrrrr}
\hline
& $R^2$ & \multicolumn{4}{c}{5\%} & &\multicolumn{4}{c}{10\%} \\
......@@ -336,7 +336,8 @@ Power of the $CovLs$ test based on the estimation of $\overline{RCovLs(X,Y)}$ is
}{} % <- we can add a footnote in the last curly praces
\end{table}
-\subsection{Evaluating the shared variance}
+\subsection{Evaluating the shared variation}
$\rls$ can be considered for matrices as a strict equivalent of Pearson's $\rpearson$ for vectors. Therefore its squared value is an estimator of the shared variation between two matrices. But because of over-fitting the estimation is over-estimated. The proposed corrected vection ($\irls$) of that coefficient is able to provide a good estimate of the shared variation and is perfectly robust to the over-fitting phenomenon (Figure~\ref{fig:shared_variation}). Only a small over evalution is observable for the low values of simulated shared variation.
\begin{figure*}[!tpb]%figure1
\begin{knitrout}
......@@ -345,7 +346,7 @@ Power of the $CovLs$ test based on the estimation of $\overline{RCovLs(X,Y)}$ is
\end{knitrout}
\caption{Shared variation ($R^2$) between two matrices is mesured with both the corrected ($\irls$) and the original ($\rls$) versions of the procrustean correlation coefficient. A gradiant of $R^2$ is simulated for two population sizes ($n \in \{10,24\}$) and two numbers of descriptive variables ($p \in \{10,100\}$). The black dashed line corresponds to a perfect match where measured $R^2$ equals the simulated one.}
-\label{fig:shared_variance}
+\label{fig:shared_variation}
\end{figure*}
......
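Note on the `nmds()` default change: this commit raises `maxit` from 50 to 100 and switches `trace` from TRUE to FALSE, so existing calls that relied on the defaults will iterate longer and no longer print progress. A minimal sketch of a wrapper that keeps the earlier behaviour is given below. The name `nmds_legacy` is hypothetical and is not part of the package; the sketch assumes that the package exporting `nmds()` is attached and that the signature is the one shown in the usage section of this diff.

```r
# Hypothetical helper (not part of the package): calls nmds() with the
# defaults that were in place before this commit (maxit = 50, trace = TRUE).
# Assumes nmds() is available on the search path when the helper is called.
nmds_legacy <- function(distances,
                        maxit = 50, trace = TRUE,
                        tol = 0.001, p = 2) {
  nmds(distances, maxit = maxit, trace = trace, tol = tol, p = p)
}

# Usage sketch (requires the package that exports nmds() to be attached):
# d <- dist(matrix(rnorm(40), nrow = 10))
# coords <- nmds_legacy(d)
```

Passing `maxit = 50, trace = TRUE` explicitly at each call site gives the same result without a wrapper; the helper is only convenient when many scripts depend on the old defaults.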