Sf: Encoding bug with sf-tibbles and degree-unit crs

Created on 20 Dec 2018  ·  8 Comments  ·  Source: r-spatial/sf

On Windows with a cp1252 locale, when passing a tibble with a geographic coordinate system like WGS84 (i.e. one that uses degrees as the unit of measurement) to st_as_sf, the column type of the geometry column is not displayed correctly. Namely, the degree symbol ° in <POINT [°]> is displayed as Â°, which is what the UTF-8 bytes c2 b0 look like when they are read as cp1252.

library(sf)
#> Linking to GEOS 3.6.1, GDAL 2.2.3, PROJ 4.9.3
library(tibble)
x_tbl <- tibble(place = "Münster", x = 7.625808, y = 51.96311)
st_as_sf(x_tbl, coords = c("x", "y"), crs = 4326) 
#> Simple feature collection with 1 feature and 1 field
#> geometry type:  POINT
#> dimension:      XY
#> bbox:           xmin: 7.625808 ymin: 51.96311 xmax: 7.625808 ymax: 51.96311
#> epsg (SRID):    4326
#> proj4string:    +proj=longlat +datum=WGS84 +no_defs
#> # A tibble: 1 x 2
#>   place              geometry
#>   <chr>          <POINT [Â°]>
#> 1 Münster (7.625808 51.96311)

This is because the string “°” is not marked as UTF-8, but as latin1.
I am not sure, though, whether this is a bug in sf or in tibble.
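The effect can be simulated independently of the session locale. The following is only a sketch of the mechanism, not the code path sf or tibble actually take: it assumes the degree sign is stored as UTF-8 bytes and that a reader interprets those bytes as cp1252. The variable names are made up for illustration.

```r
# U+00B0 (degree sign); R stores a "\u" escape as UTF-8 bytes
degree <- "\u00b0"
charToRaw(degree)

# Simulate a cp1252 reader: interpret the same two bytes as cp1252
# characters and re-encode the result as UTF-8. iconv() works on the
# bytes and the `from` argument, not on the encoding mark.
misread <- iconv(degree, from = "CP1252", to = "UTF-8")
charToRaw(misread)

# Number of characters before and after the misreading
nchar(degree)
nchar(misread)
```

In cp1252 the byte c2 is "Â" and the byte b0 is "°", which is why one character turns into two.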

Session info

devtools::session_info()
#> - Session info ----------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.5.1 (2018-07-02)
#>  os       Windows 10 x64              
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language EN                          
#>  collate  German_Germany.1252         
#>  ctype    German_Germany.1252         
#>  tz       Europe/Berlin               
#>  date     2018-12-20                  
#> 
#> - Packages --------------------------------------------------------------
#>  package     * version    date       lib source                     
#>  assertthat    0.2.0      2017-04-11 [1] CRAN (R 3.5.1)             
#>  backports     1.1.3      2018-12-14 [1] CRAN (R 3.5.1)             
#>  callr         3.1.0      2018-12-10 [1] CRAN (R 3.5.1)             
#>  class         7.3-14     2015-08-30 [2] CRAN (R 3.5.1)             
#>  classInt      0.2-3      2018-04-16 [1] CRAN (R 3.5.1)             
#>  cli           1.0.1.9000 2018-10-25 [1] Github (r-lib/cli@56538e3) 
#>  crayon        1.3.4      2017-09-16 [1] CRAN (R 3.5.1)             
#>  DBI           1.0.0      2018-05-02 [1] CRAN (R 3.5.1)             
#>  desc          1.2.0      2018-10-25 [1] Github (r-lib/desc@7c12d36)
#>  devtools      2.0.1      2018-10-26 [1] CRAN (R 3.5.1)             
#>  digest        0.6.18     2018-10-10 [1] CRAN (R 3.5.1)             
#>  e1071         1.7-0      2018-07-28 [1] CRAN (R 3.5.1)             
#>  evaluate      0.12       2018-10-09 [1] CRAN (R 3.5.1)             
#>  fansi         0.4.0      2018-10-05 [1] CRAN (R 3.5.1)             
#>  fs            1.2.6      2018-08-23 [1] CRAN (R 3.5.1)             
#>  glue          1.3.0      2018-07-17 [1] CRAN (R 3.5.1)             
#>  highr         0.7        2018-06-09 [1] CRAN (R 3.5.1)             
#>  htmltools     0.3.6      2017-04-28 [1] CRAN (R 3.5.1)             
#>  knitr         1.21       2018-12-10 [1] CRAN (R 3.5.1)             
#>  magrittr      1.5        2014-11-22 [1] CRAN (R 3.5.1)             
#>  memoise       1.1.0      2017-04-21 [1] CRAN (R 3.5.1)             
#>  pillar        1.3.1      2018-12-15 [1] CRAN (R 3.5.1)             
#>  pkgbuild      1.0.2      2018-10-16 [1] CRAN (R 3.5.1)             
#>  pkgload       1.0.2      2018-10-29 [1] CRAN (R 3.5.1)             
#>  prettyunits   1.0.2      2015-07-13 [1] CRAN (R 3.5.1)             
#>  processx      3.2.1      2018-12-05 [1] CRAN (R 3.5.1)             
#>  ps            1.2.1      2018-11-06 [1] CRAN (R 3.5.1)             
#>  R6            2.3.0      2018-10-04 [1] CRAN (R 3.5.1)             
#>  Rcpp          1.0.0      2018-11-07 [1] CRAN (R 3.5.1)             
#>  remotes       2.0.2      2018-10-30 [1] CRAN (R 3.5.1)             
#>  rlang         0.3.0.1    2018-10-25 [1] CRAN (R 3.5.1)             
#>  rmarkdown     1.11       2018-12-08 [1] CRAN (R 3.5.1)             
#>  rprojroot     1.3-2      2018-01-03 [1] CRAN (R 3.5.1)             
#>  sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 3.5.1)             
#>  sf          * 0.7-1      2018-10-24 [1] CRAN (R 3.5.1)             
#>  spData        0.2.9.6    2018-12-03 [1] CRAN (R 3.5.1)             
#>  stringi       1.2.4      2018-07-20 [1] CRAN (R 3.5.1)             
#>  stringr       1.3.1      2018-05-10 [1] CRAN (R 3.5.1)             
#>  testthat      2.0.1      2018-10-13 [1] CRAN (R 3.5.1)             
#>  tibble      * 1.4.2      2018-01-22 [1] CRAN (R 3.5.1)             
#>  units         0.6-2      2018-12-05 [1] CRAN (R 3.5.1)             
#>  usethis       1.4.0      2018-08-14 [1] CRAN (R 3.5.1)             
#>  utf8          1.1.4      2018-05-24 [1] CRAN (R 3.5.1)             
#>  withr         2.1.2      2018-03-15 [1] CRAN (R 3.5.1)             
#>  xfun          0.4        2018-10-23 [1] CRAN (R 3.5.1)             
#>  yaml          2.2.0      2018-07-25 [1] CRAN (R 3.5.1)             
#> 
#> [1] D:/Users/Daniel/Documents/R/win-library/3.5
#> [2] C:/Program Files/R/R-3.5.1/library

All 8 comments

Thanks Daniel! Would be great if you could confirm this one works.
Also it would be good to know what this does in your locale:

> units::set_units(1, degree)
1 [°]

Hi!

My setup is not identical, but the code page is the same (1252). The fix in commit 83157d1 does not seem to work. I don't think enc2utf8 is the appropriate solution here. The following example shows that enc2utf8() treats the unmarked bytes c2 b0 as the cp1252 string "Â°" and converts that to UTF-8 (c3 82 c2 b0), so the wrong interpretation is kept. What could work is keeping the byte representation but marking the Encoding() as "UTF-8".

library(sf)
#> Linking to GEOS 3.6.1, GDAL 2.2.3, PROJ 4.9.3
library(tibble)
x_tbl <- tibble(place = "M\u00fcnster", x = 7.625808, y = 51.96311)
st_as_sf(x_tbl, coords = c("x", "y"), crs = 4326)
#> Simple feature collection with 1 feature and 1 field
#> geometry type:  POINT
#> dimension:      XY
#> bbox:           xmin: 7.625808 ymin: 51.96311 xmax: 7.625808 ymax: 51.96311
#> epsg (SRID):    4326
#> proj4string:    +proj=longlat +datum=WGS84 +no_defs
#> # A tibble: 1 x 2
#>   place              geometry
#>   <chr>          <POINT [Â°]>
#> 1 Münster (7.625808 51.96311)

degree <- "\u00b0"
degree_unknown <- degree
Encoding(degree_unknown) <- "unknown"
degree_enc2utf8 <- enc2utf8(degree_unknown)
degree_back <- degree_unknown
Encoding(degree_back) <- "UTF-8"

stopifnot(identical(degree, degree_back))
degree
#> [1] "°"
degree_unknown
#> [1] "Â°"
degree_enc2utf8
#> [1] "Â°"
Encoding(degree)
#> [1] "UTF-8"
Encoding(degree_unknown)
#> [1] "unknown"
Encoding(degree_enc2utf8)
#> [1] "UTF-8"
charToRaw(degree)
#> [1] c2 b0
charToRaw(degree_unknown)
#> [1] c2 b0
charToRaw(degree_enc2utf8)
#> [1] c3 82 c2 b0
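The "keep the bytes, change only the mark" idea could be wrapped roughly as follows. This is a minimal sketch; `mark_utf8()` is a hypothetical helper, not a function from sf or units, and it assumes the input bytes really are UTF-8 (checked with base R's `validUTF8()`).

```r
# Hypothetical helper: declare that a string's bytes are already UTF-8,
# without re-encoding them (unlike enc2utf8(), which converts from the
# native encoding when the string is unmarked).
mark_utf8 <- function(x) {
  stopifnot(is.character(x))
  # Refuse to mark byte sequences that are not valid UTF-8
  if (!all(validUTF8(x))) stop("input contains invalid UTF-8 bytes")
  Encoding(x) <- "UTF-8"
  x
}

# UTF-8 bytes of the degree sign, built without any encoding mark
degree_raw <- rawToChar(as.raw(c(0xc2, 0xb0)))
Encoding(degree_raw)

degree_marked <- mark_utf8(degree_raw)
Encoding(degree_marked)
charToRaw(degree_marked)
```

The bytes stay c2 b0; only the declared encoding changes, so a cp1252 session should no longer treat them as two separate characters.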

Session info

devtools::session_info()
#> - Session info ----------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.5.2 (2018-12-20)
#>  os       Windows 7 x64 SP 1          
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language (EN)                        
#>  collate  Finnish_Finland.1252        
#>  ctype    Finnish_Finland.1252        
#>  tz       Europe/Helsinki             
#>  date     2018-12-21                  
#> 
#> - Packages --------------------------------------------------------------
#>  package     * version    date       lib source                          
#>  assertthat    0.2.0      2017-04-11 [1] CRAN (R 3.5.0)                  
#>  backports     1.1.3      2018-12-14 [1] CRAN (R 3.5.1)                  
#>  callr         3.1.0      2018-12-10 [1] CRAN (R 3.5.1)                  
#>  class         7.3-14     2015-08-30 [1] CRAN (R 3.5.0)                  
#>  classInt      0.3-1      2018-12-18 [1] CRAN (R 3.5.1)                  
#>  cli           1.0.1      2018-09-25 [1] CRAN (R 3.5.1)                  
#>  crayon        1.3.4      2017-09-16 [1] CRAN (R 3.5.0)                  
#>  DBI           1.0.0      2018-05-02 [1] CRAN (R 3.5.0)                  
#>  desc          1.2.0      2018-05-01 [1] CRAN (R 3.5.0)                  
#>  devtools      2.0.1      2018-10-26 [1] CRAN (R 3.5.1)                  
#>  digest        0.6.18     2018-10-10 [1] CRAN (R 3.5.1)                  
#>  e1071         1.7-0      2018-07-28 [1] CRAN (R 3.5.1)                  
#>  evaluate      0.12       2018-10-09 [1] CRAN (R 3.5.1)                  
#>  fansi         0.4.0      2018-10-05 [1] CRAN (R 3.5.1)                  
#>  fs            1.2.6      2018-08-23 [1] CRAN (R 3.5.1)                  
#>  glue          1.3.0      2018-07-17 [1] CRAN (R 3.5.1)                  
#>  highr         0.7        2018-06-09 [1] CRAN (R 3.5.0)                  
#>  htmltools     0.3.6      2017-04-28 [1] CRAN (R 3.5.0)                  
#>  knitr         1.21       2018-12-10 [1] CRAN (R 3.5.1)                  
#>  magrittr      1.5        2014-11-22 [1] CRAN (R 3.5.0)                  
#>  memoise       1.1.0      2017-04-21 [1] CRAN (R 3.5.0)                  
#>  pillar        1.3.1      2018-12-15 [1] CRAN (R 3.5.1)                  
#>  pkgbuild      1.0.2      2018-10-16 [1] CRAN (R 3.5.1)                  
#>  pkgload       1.0.2      2018-10-29 [1] CRAN (R 3.5.1)                  
#>  prettyunits   1.0.2      2015-07-13 [1] CRAN (R 3.5.0)                  
#>  processx      3.2.1      2018-12-05 [1] CRAN (R 3.5.1)                  
#>  ps            1.2.1      2018-11-06 [1] CRAN (R 3.5.1)                  
#>  R6            2.3.0      2018-10-04 [1] CRAN (R 3.5.1)                  
#>  Rcpp          1.0.0      2018-11-07 [1] CRAN (R 3.5.1)                  
#>  remotes       2.0.2      2018-10-30 [1] CRAN (R 3.5.1)                  
#>  rlang         0.3.0.1    2018-10-25 [1] CRAN (R 3.5.1)                  
#>  rmarkdown     1.11       2018-12-08 [1] CRAN (R 3.5.1)                  
#>  rprojroot     1.3-2      2018-01-03 [1] CRAN (R 3.5.0)                  
#>  sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 3.5.1)                  
#>  sf          * 0.7-3      2018-12-21 [1] Github (r-spatial/sf@83157d1)   
#>  stringi       1.2.4      2018-07-20 [1] CRAN (R 3.5.1)                  
#>  stringr       1.3.1      2018-05-10 [1] CRAN (R 3.5.0)                  
#>  testthat      2.0.1      2018-10-13 [1] CRAN (R 3.5.1)                  
#>  tibble      * 1.4.2      2018-01-22 [1] CRAN (R 3.5.0)                  
#>  units         0.6-2      2018-12-05 [1] CRAN (R 3.5.1)                  
#>  usethis       1.4.0      2018-08-14 [1] CRAN (R 3.5.1)                  
#>  utf8          1.1.4      2018-05-24 [1] CRAN (R 3.5.0)                  
#>  withr         2.1.2.9000 2018-10-23 [1] Github (jimhester/withr@be57595)
#>  xfun          0.4        2018-10-23 [1] CRAN (R 3.5.1)                  
#>  yaml          2.2.0      2018-07-25 [1] CRAN (R 3.5.1)                  
#> 
#> [1] C:/Omat/R/win-library/3.5
#> [2] C:/Program Files/R/R-3.5.2/library

set_units()

The set_units() example also returns a badly formatted result.

units::set_units(1, degree)
#> 1 [Â°]

set_utf8() should do the job (the issue is equivalent to #5, see your post here).

I don't think so, as that works on a data.frame, here we have a (length 1) character vector.

Right.
~set_utf8()~ to_utf8() should do the job. 😉

Giving up.

FWIW, this looks fine from my point of view (sf v0.8-0, units 0.6-5):

library(sf)
#> Linking to GEOS 3.6.1, GDAL 2.2.3, PROJ 4.9.3
library(tibble)
x_tbl <- tibble(place = "Münster", x = 7.625808, y = 51.96311)
st_as_sf(x_tbl, coords = c("x", "y"), crs = 4326)
#> Simple feature collection with 1 feature and 1 field
#> geometry type:  POINT
#> dimension:      XY
#> bbox:           xmin: 7.625808 ymin: 51.96311 xmax: 7.625808 ymax: 51.96311
#> epsg (SRID):    4326
#> proj4string:    +proj=longlat +datum=WGS84 +no_defs
#> # A tibble: 1 x 2
#>   place                 geometry
#>   <chr>              <POINT [°]>
#> 1 Münster    (7.625808 51.96311)

Session info

devtools::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.6.1 (2019-07-05)
#>  os       Windows 10 x64              
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language EN                          
#>  collate  German_Germany.1252         
#>  ctype    German_Germany.1252         
#>  tz       Europe/Berlin               
#>  date     2019-11-22                  
#> 
#> - Packages -------------------------------------------------------------------
#>  package     * version    date       lib source                          
#>  assertthat    0.2.1      2019-03-21 [1] CRAN (R 3.5.3)                  
#>  backports     1.1.5      2019-10-02 [1] CRAN (R 3.6.1)                  
#>  callr         3.3.2      2019-09-22 [1] CRAN (R 3.6.1)                  
#>  class         7.3-15     2019-01-01 [2] CRAN (R 3.6.1)                  
#>  classInt      0.4-2      2019-10-17 [1] CRAN (R 3.6.1)                  
#>  cli           1.1.0      2019-03-19 [1] CRAN (R 3.5.3)                  
#>  crayon        1.3.4      2017-09-16 [1] CRAN (R 3.5.1)                  
#>  DBI           1.0.0      2018-05-02 [1] CRAN (R 3.5.1)                  
#>  desc          1.2.0      2019-05-19 [1] Github (r-lib/desc@c860e7b)     
#>  devtools      2.2.1      2019-09-24 [1] CRAN (R 3.6.1)                  
#>  digest        0.6.22     2019-10-21 [1] CRAN (R 3.6.1)                  
#>  e1071         1.7-2      2019-06-05 [1] CRAN (R 3.5.3)                  
#>  ellipsis      0.3.0      2019-09-20 [1] CRAN (R 3.6.1)                  
#>  evaluate      0.14       2019-05-28 [1] CRAN (R 3.5.3)                  
#>  fansi         0.4.0      2018-10-05 [1] CRAN (R 3.5.1)                  
#>  fs            1.3.1      2019-05-06 [1] CRAN (R 3.5.3)                  
#>  glue          1.3.1      2019-03-12 [1] CRAN (R 3.5.3)                  
#>  highr         0.8        2019-03-20 [1] CRAN (R 3.5.3)                  
#>  htmltools     0.4.0      2019-10-04 [1] CRAN (R 3.6.1)                  
#>  KernSmooth    2.23-15    2015-06-29 [2] CRAN (R 3.6.1)                  
#>  knitr         1.26       2019-11-12 [1] CRAN (R 3.6.1)                  
#>  magrittr      1.5        2014-11-22 [1] CRAN (R 3.5.1)                  
#>  memoise       1.1.0      2017-04-21 [1] CRAN (R 3.6.1)                  
#>  pillar        1.4.2      2019-06-29 [1] CRAN (R 3.5.3)                  
#>  pkgbuild      1.0.6      2019-10-09 [1] CRAN (R 3.6.1)                  
#>  pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 3.6.1)                  
#>  pkgload       1.0.2      2018-10-29 [1] CRAN (R 3.5.1)                  
#>  prettyunits   1.0.2      2015-07-13 [1] CRAN (R 3.5.1)                  
#>  processx      3.4.1      2019-07-18 [1] CRAN (R 3.6.1)                  
#>  ps            1.3.0      2018-12-21 [1] CRAN (R 3.5.2)                  
#>  R6            2.4.1      2019-11-12 [1] CRAN (R 3.6.1)                  
#>  Rcpp          1.0.3      2019-11-08 [1] CRAN (R 3.6.1)                  
#>  remotes       2.1.0.9000 2019-10-16 [1] Github (dpprdan/remotes@c82381d)
#>  rlang         0.4.1      2019-10-24 [1] CRAN (R 3.6.1)                  
#>  rmarkdown     1.17       2019-11-13 [1] CRAN (R 3.6.1)                  
#>  rprojroot     1.3-2      2018-01-03 [1] CRAN (R 3.5.1)                  
#>  sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 3.5.1)                  
#>  sf          * 0.8-0      2019-09-17 [1] CRAN (R 3.6.1)                  
#>  stringi       1.4.3      2019-03-12 [1] CRAN (R 3.5.3)                  
#>  stringr       1.4.0      2019-02-10 [1] CRAN (R 3.5.2)                  
#>  testthat      2.3.0      2019-11-05 [1] CRAN (R 3.6.1)                  
#>  tibble      * 2.1.3      2019-06-06 [1] CRAN (R 3.5.3)                  
#>  units         0.6-5      2019-10-08 [1] CRAN (R 3.6.1)                  
#>  usethis       1.5.1      2019-07-04 [1] CRAN (R 3.5.3)                  
#>  utf8          1.1.4      2018-05-24 [1] CRAN (R 3.5.1)                  
#>  vctrs         0.2.0      2019-07-05 [1] CRAN (R 3.5.3)                  
#>  withr         2.1.2      2018-03-15 [1] CRAN (R 3.5.1)                  
#>  xfun          0.11       2019-11-12 [1] CRAN (R 3.6.1)                  
#>  yaml          2.2.0      2018-07-25 [1] CRAN (R 3.5.1)                  
#>  zeallot       0.1.0      2018-01-28 [1] CRAN (R 3.5.1)                  
#> 
#> [1] D:/Users/Daniel/Documents/R/win-library/3.6
#> [2] C:/Program Files/R/R-3.6.1/library

Oh, happy to hear that! Maybe it is only appveyor (my only windows platform) that does it wrong...
