Roxygen2: unexpected missing links in documentation object

Created on 16 Nov 2019  ·  12 Comments  ·  Source: r-lib/roxygen2

In some functions in my package, I am inheriting arguments from lme4::lmer and lme4::glmer:

#' @inheritDotParams lme4::lmer

This works without any issue:
Source: https://indrajeetpatil.github.io/groupedstats/reference/grouped_lmer.html#arguments

But it produces a WARNING in R CMD check:

> checking Rd cross-references ... WARNING

  Missing link or links in documentation object 'grouped_lmer.Rd':
    '[lme4]{list}' '[lme4]{offset}' '[lme4]{model.offset}'

  See section 'Cross-references' in the 'Writing R Extensions' manual.

This is not supposed to happen: these are not references in my code, only links in the generated documentation.

As @bbolker pointed out, this _might_ be a roxygen2 buglet?
(specifically, here: https://github.com/r-lib/roxygen2/blob/e016a7020de1a4ca6a695e7f1eb1b43ffe83e27f/R/rd-inherit.R#L188-L190)

bug rd

All 12 comments

Can you please create a minimal reprex?

Sure, here is a minimal reprex.

The problem is caused by \code{\link[lme4]{model.offset}} in the generated .Rd file: there is no lme4::model.offset, only stats::model.offset, and that leads to the WARNING in R CMD check:

library(roxygen2)
roc_proc_text(rd_roclet(), "
                      #' Title
                      #' @inheritDotParams lme4::lmer
                      #' 
                      foo <- function(...) {}") 

#> $foo.Rd
#> % Generated by roxygen2: do not edit by hand
#> % Please edit documentation in ./<text>

#> \name{foo}
#> \alias{foo}
#> \title{Title}
#> \usage{
#> foo(...)
#> }
#> \arguments{
#> \item{...}{
#>   Arguments passed on to \code{\link[lme4:lmer]{lme4::lmer}}
#>   \describe{
#>     \item{\code{formula}}{a two-sided linear formula object describing both the
#>     fixed-effects and random-effects part of the model, with the
#>     response on the left of a \code{~} operator and the terms, separated
#>     by \code{+} operators, on the right.  Random-effects terms are
#>     distinguished by vertical bars (\code{|}) separating expressions
#>     for design matrices from grouping factors.  Two vertical bars
#>     (\code{||}) can be used to specify multiple uncorrelated random
#>     effects for the same grouping variable. (Because of the way it is
#>     implemented, the \code{||}-syntax \emph{works
#>     only for design matrices containing
#>     numeric (continuous) predictors}; to fit models with independent
#>   categorical effects, see \code{\link[lme4]{dummy}} or the \code{lmer_alt}
#>   function from the \code{afex} package.) 
#> }
#>     \item{\code{data}}{an optional data frame containing the variables named in
#>     \code{formula}.  By default the variables are taken from the
#>     environment from which \code{lmer} is called. While \code{data} is
#>     optional, the package authors \emph{strongly} recommend its use,
#>     especially when later applying methods such as \code{update} and
#>     \code{drop1} to the fitted model (\emph{such methods are not
#>     guaranteed to work properly if \code{data} is omitted}). If
#>     \code{data} is omitted, variables will be taken from the environment
#>     of \code{formula} (if specified as a formula) or from the parent
#>     frame (if specified as a character vector).}
#>     \item{\code{REML}}{logical scalar - Should the estimates be chosen to
#>     optimize the REML criterion (as opposed to the log-likelihood)?}
#>     \item{\code{control}}{a list (of correct class, resulting from
#>     \code{\link[lme4]{lmerControl}()} or \code{\link[lme4]{glmerControl}()}
#>     respectively) containing control parameters, including the nonlinear
#>     optimizer to be used and parameters to be passed through to the
#>     nonlinear optimizer, see the \code{*lmerControl} documentation for
#>     details.}
#>     \item{\code{start}}{a named \code{\link[lme4]{list}} of starting values for the
#>     parameters in the model.  For \code{lmer} this can be a numeric
#>     vector or a list with one component named \code{"theta"}.}
#>     \item{\code{verbose}}{integer scalar.  If \code{> 0} verbose output is
#>     generated during the optimization of the parameter estimates.  If
#>     \code{> 1} verbose output is generated during the individual
#>     penalized iteratively reweighted least squares (PIRLS) steps.}
#>     \item{\code{subset}}{an optional expression indicating the subset of the rows
#>     of \code{data} that should be used in the fit. This can be a logical
#>     vector, or a numeric vector indicating which observation numbers are
#>     to be included, or a character vector of the row names to be
#>     included.  All observations are included by default.}
#>     \item{\code{weights}}{an optional vector of \sQuote{prior weights} to be used
#>     in the fitting process.  Should be \code{NULL} or a numeric vector.
#>     Prior \code{weights} are \emph{not} normalized or standardized in
#>     any way.  In particular, the diagonal of the residual covariance
#>     matrix is the squared residual standard deviation parameter
#>     \code{\link[lme4]{sigma}} times the vector of inverse \code{weights}.
#>     Therefore, if the \code{weights} have relatively large magnitudes,
#>     then in order to compensate, the \code{\link[lme4]{sigma}} parameter will
#>     also need to have a relatively large magnitude.}
#>     \item{\code{na.action}}{a function that indicates what should happen when the
#>     data contain \code{NA}s.  The default action (\code{na.omit},
#>     inherited from the 'factory fresh' value of
#>     \code{getOption("na.action")}) strips any observations with any
#>     missing values in any variables.}
#>     \item{\code{offset}}{this can be used to specify an \emph{a priori} known
#>     component to be included in the linear predictor during
#>     fitting. This should be \code{NULL} or a numeric vector of length
#>     equal to the number of cases.  One or more \code{\link[lme4]{offset}}
#>     terms can be included in the formula instead or as well, and if more
#>     than one is specified their sum is used.  See
#>     \code{\link[lme4]{model.offset}}.}
#>     \item{\code{contrasts}}{an optional list. See the \code{contrasts.arg} of
#>     \code{model.matrix.default}.}
#>     \item{\code{devFunOnly}}{logical - return only the deviance evaluation
#>     function. Note that because the deviance function operates on
#>     variables stored in its environment, it may not return
#>     \emph{exactly} the same values on subsequent calls (but the results
#>     should always be within machine tolerance).}
#>   }}
#> }
#> \description{
#> Title
#> }

Created on 2019-11-16 by the reprex package (v0.3.0)
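To double-check which package actually owns the three topics named in the warning, here is a minimal sketch that uses only base R. The `candidates` vector is my own assumption about where to look, and `owner` is just an illustrative name; none of this is part of roxygen2.

```r
# Topics named in the R CMD check warning.
topics <- c("list", "offset", "model.offset")

# Namespaces to search. This short list is an assumption for illustration.
candidates <- c("base", "stats")

# For each topic, report the first candidate namespace that defines it.
owner <- vapply(topics, function(topic) {
  hits <- Filter(
    function(pkg) exists(topic, envir = asNamespace(pkg), inherits = FALSE),
    candidates
  )
  if (length(hits) > 0) hits[[1]] else NA_character_
}, character(1))

print(owner)
```

If each topic maps to base or stats, then a link of the form \link[lme4]{...} cannot resolve for it, which is consistent with the warning above.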

I can't reproduce this. On my system the instances of \code{\link{...}} do not have the extra [lme4] component added ...

R Under development (unstable) (2019-10-24 r77329)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.6 LTS

[locale and matrix-product info omitted]

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] roxygen2_6.1.1

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.3       lattice_0.20-38  commonmark_1.7   MASS_7.3-51.4   
 [5] grid_4.0.0       R6_2.4.0         nlme_3.1-142     magrittr_1.5    
 [9] rlang_0.4.1      stringi_1.4.3    minqa_1.2.4      nloptr_1.2.1    
[13] Matrix_1.2-17    xml2_1.2.2       boot_1.3-23      splines_4.0.0   
[17] statmod_1.4.32   lme4_1.1-21.9002 tools_4.0.0      stringr_1.4.0   
[21] purrr_0.3.3      compiler_4.0.0  

$foo.Rd
% Generated by roxygen2: do not edit by hand
% Please edit documentation in RtmprVcnC4/file7a343a4e318f
\name{foo}
\alias{foo}
\title{Title}
\usage{
foo(...)
}
\arguments{
  \item{...}{Arguments passed on to \code{lme4::lmer}
  \describe{ \item{formula}{a two-sided linear formula object
  describing both the fixed-effects and random-effects part
  of the model, with the response on the left of a \code{~}
  operator and the terms, separated by \code{+} operators,
  on the right. Random-effects terms are distinguished
  by vertical bars (\code{|}) separating expressions for
  design matrices from grouping factors. Two vertical bars
  (\code{||}) can be used to specify multiple uncorrelated
  random effects for the same grouping variable. %----------
  (Because of the way it is implemented, the \code{||}-
  syntax \emph{works only for design matrices containing
  numeric (continuous) predictors}; to fit models with
  independent categorical effects, see \code{\link{dummy}}
  or the \code{lmer_alt} function from the \href{https://
  CRAN.R-project.org/package=afex}{\pkg{afex}} package.) }
  \item{data}{an optional data frame containing the variables
  named in \code{formula}. By default the variables are
  taken from the environment from which \code{lmer} is
  called. While \code{data} is optional, the package authors
  \emph{strongly} recommend its use, especially when later
  applying methods such as \code{update} and \code{drop1} to
  the fitted model (\emph{such methods are not guaranteed to
  work properly if \code{data} is omitted}). If \code{data}
  is omitted, variables will be taken from the environment
  of \code{formula} (if specified as a formula) or from
  the parent frame (if specified as a character vector).}
  \item{REML}{logical scalar - Should the estimates be
  chosen to optimize the REML criterion (as opposed to
  the log-likelihood)?} \item{control}{a list (of correct
  class, resulting from \code{\link{lmerControl}()} or
  \code{\link{glmerControl}()} respectively) containing
  control parameters, including the nonlinear optimizer to be
  used and parameters to be passed through to the nonlinear
  optimizer, see the \code{*lmerControl} documentation
  for details.} \item{start}{a named \code{\link{list}}
  of starting values for the parameters in the model. For
  \code{lmer} this can be a numeric vector or a list with
  one component named \code{"theta"}.} \item{verbose}{integer
  scalar. If \code{> 0} verbose output is generated during
  the optimization of the parameter estimates. If \code{>
  1} verbose output is generated during the individual
  penalized iteratively reweighted least squares (PIRLS)
  steps.} \item{subset}{an optional expression indicating the
  subset of the rows of \code{data} that should be used in
  the fit. This can be a logical vector, or a numeric vector
  indicating which observation numbers are to be included,
  or a character vector of the row names to be included.
  All observations are included by default.} \item{weights}
  {an optional vector of \sQuote{prior weights} to be used
  in the fitting process. Should be \code{NULL} or a numeric
  vector. Prior \code{weights} are \emph{not} normalized
  or standardized in any way. In particular, the diagonal
  of the residual covariance matrix is the squared residual
  standard deviation parameter \code{\link{sigma}} times
  the vector of inverse \code{weights}. Therefore, if the
  \code{weights} have relatively large magnitudes, then in
  order to compensate, the \code{\link{sigma}} parameter
  will also need to have a relatively large magnitude.}
  \item{na.action}{a function that indicates what should
  happen when the data contain \code{NA}s. The default action
  (\code{na.omit}, inherited from the 'factory fresh' value of
  \code{getOption("na.action")}) strips any observations with
  any missing values in any variables.} \item{offset}{this
  can be used to specify an \emph{a priori} known component
  to be included in the linear predictor during fitting. This
  should be \code{NULL} or a numeric vector of length equal
  to the number of cases. One or more \code{\link{offset}}
  terms can be included in the formula instead or as
  well, and if more than one is specified their sum is
  used. See \code{\link{model.offset}}.} \item{contrasts}
  {an optional list. See the \code{contrasts.arg} of
  \code{model.matrix.default}.} \item{devFunOnly}{logical
  - return only the deviance evaluation function. Note that
  because the deviance function operates on variables stored
  in its environment, it may not return \emph{exactly} the
  same values on subsequent calls (but the results should
  always be within machine tolerance).} }}
}
\description{
Title
}

@bbolker You are using roxygen2 6.1.1, while I am using roxygen2 7.0, the latest CRAN release.

OK. 7.0 is dated 2019-11-12, so pretty recent!

Skimming the NEWS file, it looks like this behaviour was implemented here:

@inheritDotParams includes link to function and wraps parameters in \code{} (@halldc, #842).

In my opinion, this strengthens the case that this is an unforeseen edge case (but to be fair, I haven't analyzed what's actually going on here).

It’ll be #635, but I thought I already checked for bare links to base package functions.

Does the fact that these links are from stats and not base change anything?

Sorry, I meant base packages, not base package, but it’s possible that confusion is at the root of the problem.

Minimal reprex:

rd <- roxygen2:::parse_rd("\\code{\\link{model.offset}}")
roxygen2:::tweak_links(rd, "lme4")
#> \code{\link[lme4]{model.offset}}

Created on 2019-11-20 by the reprex package (v0.3.0)

And the root cause is this:

roxygen2:::has_topic("model.offset", "lme4")
#> [1] TRUE

Created on 2019-11-20 by the reprex package (v0.3.0)
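For comparison, here is a rough sketch of a stricter lookup that asks the help system whether a topic is documented in one specific package. `topic_in_package` is a hypothetical helper, not roxygen2's `has_topic()`, and I have not checked how `has_topic()` is implemented. The parentheses around the arguments are the trick documented in `?help` to stop a variable name from being deparsed as a literal package name.

```r
# Hypothetical helper (not part of roxygen2): TRUE only if `topic` has a
# help page in the package `pkg` itself.
topic_in_package <- function(topic, pkg) {
  hits <- tryCatch(
    # Wrapping in parentheses forces help() to evaluate the variables
    # instead of treating `pkg` as the literal package name.
    utils::help((topic), package = (pkg)),
    error = function(e) character()
  )
  length(hits) > 0
}

in_stats <- topic_in_package("model.offset", "stats")
in_base <- topic_in_package("model.offset", "base")
print(c(stats = in_stats, base = in_base))
```

With lme4 installed, the same call with "lme4" as the package would show whether the help system really finds a model.offset topic there.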

This error is still happening. Are there any updates?

This is a closed issue. If you think that this is still not fixed, please open another issue with an example. Thanks!
