Data.table: Rename coalesce() to fcoalesce()

Created on 9 Aug 2019  Â·  8Comments  Â·  Source: Rdatatable/data.table

As @MichaelChirico metioned in https://github.com/Rdatatable/data.table/issues/3740#issuecomment-519751021, we have 10 functions named with prefix f (some are no conflicts with base). However, coalesce() is an exception. I support @MichaelChirico 's idea that making the function names as consistent as possible.

So I think it may be better to rename the function coalesce() to fcoalesce() since it's not been published on CRAN yet.

| Function | Conflict in base? |
| --------- | ------------------ |
| fintersect | ✅ |
| foverlaps | ❌ |
| frank{v} | ✅ |
| f{read,write} | ❌ |
| froll{mean,sum} | ❌ |
| fsetdiff | ✅ |
| fsetequal | ✅ |
| fsort | ✅ |
| funion | ✅ |

High dev

Most helpful comment

revdep testing #3581 reveals the wrinkle.

Exporting coalesce causes warnings in revdeps grattan, mlr, OpenML, vosonSML, at least.

The warning (just because it's any warning) would need to be resolved by communication with those maintainers. But more significantly, those packages may start to use our new coalesce; we would inflict it on them without them asking. That inflict happens for mlr and OpenML but not grattan or vosonSML; I guess the difference is the order they import the packages. One bug in our coalesce revealed by mlr (#3826) in revdep testing because of a difference between our coalesce and BBmisc::coalesce. So it was good to call it coalesce otherwise we wouldn't have found that. But we shouldn't force the new function on them, until a few years have passed anyway. (Aside: dplyr::coalesce returns the same type error we did in that coalesce(NULL,"foo") example, I'd say dplyr::coalesce needs to fix that too).

The case for changing to fcoalesce seems clear then. We can still optimize coalesce inside DT[...] to use fcoalesce and we can export fcoalesce too.

Warning: replacing previous import 'data.table::coalesce' by 'hutils::coalesce' when loading 'grattan' 
Warning: replacing previous import ‘BBmisc::coalesce’ by ‘data.table::coalesce’ when loading ‘mlr’
Warning: replacing previous import ‘BBmisc::coalesce’ by ‘data.table::coalesce’ when loading ‘OpenML’
Warning: replacing previous import ‘data.table::coalesce’ by ‘dplyr::coalesce’ when loading ‘vosonSML’

All 8 comments

Main factor for deciding names should be not overlapping to base R. Although I named frollmean to not overlap to zoo's function. Anyway my vote goes for keeping coalesce, overlapping shouldn't be very problematic here as both functions behaves quite the same. fcoalesce is not that obvious what it is.

@jangorecki just to push back a little, why cater frollmean not to overlap w zoo, but it's ok for coalesce to overlap with dplyr/hutils?

Not arguing strongly for fcoalesce (I'm kind of indifferent TBH), but rather think if we're going to close it would be best to have complete arguments laid out here to refer back to

  • zoo::rollmean has different defaults than frollmean, (align arg) thus will yield different answer
  • name of the function doesn't sound so artificial, "roll" from rolling, and "mean", adding "f" ("fast") prefix to this sounds more natural than adding to an artificial name "coalesce" which for me is just an SQL keyword not having any other meaning.
  • this is rather personal opinion, but I found using data.table+zoo more common than data.table+dplyr

btw. there was discussion related to frollmean naming: https://github.com/orgs/Rdatatable/teams/project-members/discussions/5

After a second thought, I think the current name is ok. Close this for now.

revdep testing #3581 reveals the wrinkle.

Exporting coalesce causes warnings in revdeps grattan, mlr, OpenML, vosonSML, at least.

The warning (just because it's any warning) would need to be resolved by communication with those maintainers. But more significantly, those packages may start to use our new coalesce; we would inflict it on them without them asking. That inflict happens for mlr and OpenML but not grattan or vosonSML; I guess the difference is the order they import the packages. One bug in our coalesce revealed by mlr (#3826) in revdep testing because of a difference between our coalesce and BBmisc::coalesce. So it was good to call it coalesce otherwise we wouldn't have found that. But we shouldn't force the new function on them, until a few years have passed anyway. (Aside: dplyr::coalesce returns the same type error we did in that coalesce(NULL,"foo") example, I'd say dplyr::coalesce needs to fix that too).

The case for changing to fcoalesce seems clear then. We can still optimize coalesce inside DT[...] to use fcoalesce and we can export fcoalesce too.

Warning: replacing previous import 'data.table::coalesce' by 'hutils::coalesce' when loading 'grattan' 
Warning: replacing previous import ‘BBmisc::coalesce’ by ‘data.table::coalesce’ when loading ‘mlr’
Warning: replacing previous import ‘BBmisc::coalesce’ by ‘data.table::coalesce’ when loading ‘OpenML’
Warning: replacing previous import ‘data.table::coalesce’ by ‘dplyr::coalesce’ when loading ‘vosonSML’

Even with #3826 fixed, mlr still fails due to coalesce.

> require(mlr)
Loading required package: mlr
Loading required package: ParamHelpers
Warning message:
replacing previous import ‘BBmisc::coalesce’ by ‘data.table::coalesce’ when loading ‘mlr’ 
> example(makeLearners)

mkLrnr> makeLearners(c("rpart", "lda"), type = "classif", predict.type = "prob")
Error in mapply(makeLearner, cl = cls, id = ids, MoreArgs = list(...),  : 
  zero-length inputs cannot be mixed with those of non-zero length
> 
> makeLearners
function (cls, ids = NULL, type = NULL, ...) 
{
    if (!is.null(type)) {
        assertChoice(type, listTaskTypes())
        cls = stri_paste(type, cls, sep = ".")
    }
    assertCharacter(cls, any.missing = FALSE)
    assertCharacter(ids, any.missing = FALSE, len = length(cls), 
        unique = TRUE, null.ok = TRUE)
    ids = coalesce(ids, cls)
    lrns = mapply(makeLearner, cl = cls, id = ids, MoreArgs = list(...), 
        SIMPLIFY = FALSE)
    setNames(lrns, ids)
}
<bytecode: 0x55a8a00af370>
<environment: namespace:mlr>
> debug(makeLearners)
> example(makeLearners)

mkLrnr> makeLearners(c("rpart", "lda"), type = "classif", predict.type = "prob")
debugging in: makeLearners(c("rpart", "lda"), type = "classif", predict.type = "prob")
debug: {
    if (!is.null(type)) {
        assertChoice(type, listTaskTypes())
        cls = stri_paste(type, cls, sep = ".")
    }
    assertCharacter(cls, any.missing = FALSE)
    assertCharacter(ids, any.missing = FALSE, len = length(cls), 
        unique = TRUE, null.ok = TRUE)
    ids = coalesce(ids, cls)
    lrns = mapply(makeLearner, cl = cls, id = ids, MoreArgs = list(...), 
        SIMPLIFY = FALSE)
    setNames(lrns, ids)
}
Browse[2]> n
debug: if (!is.null(type)) {
    assertChoice(type, listTaskTypes())
    cls = stri_paste(type, cls, sep = ".")
}
Browse[2]> 
debug: assertChoice(type, listTaskTypes())
Browse[2]> 
debug: cls = stri_paste(type, cls, sep = ".")
Browse[2]> 
debug: assertCharacter(cls, any.missing = FALSE)
Browse[2]> 
debug: assertCharacter(ids, any.missing = FALSE, len = length(cls), 
    unique = TRUE, null.ok = TRUE)
Browse[2]> 
debug: ids = coalesce(ids, cls)
Browse[2]> ids
NULL
Browse[2]> cls
[1] "classif.rpart" "classif.lda"  
Browse[2]> n
debug: lrns = mapply(makeLearner, cl = cls, id = ids, MoreArgs = list(...), 
    SIMPLIFY = FALSE)
Browse[2]> ids
NULL
Browse[2]> Q
> BBmisc::coalesce(NULL, c("classif.rpart","classif.lda"))
[1] "classif.rpart" "classif.lda"  
> 

That's because BBmisc::coalesce is not like the other 3 coalesce (hutils, dplyr and data.table). Instead, it returns the "first non-missing, non-null argument, otherwise NULL." mlr is relying on that. This further points to change to fcoalesce.

@mllg FYI

Was this page helpful?
0 / 5 - 0 ratings