Lubridate: Spanish month labels

Created on 20 Jul 2019  ·  9 Comments  ·  Source: tidyverse/lubridate

With a Spanish or French locale, the abbreviated month labels are displayed with an escaped dot, such as "ene\.", when they should be just "ene.". I'm using Windows 10.

> library(lubridate)
> Sys.getlocale("LC_TIME")
## [1] "Spanish_Spain.1252"

> dt <- seq(ymd("2018-01-01"), ymd("2018-12-31"), "day")

> head(month(dt, label = TRUE))
## [1] ene\\. ene\\. ene\\. ene\\. ene\\. ene\\.
## 12 Levels: ene\\. < feb\\. < mar\\. < abr\\. < may\\. < ... < dic\\.

> Sys.setlocale("LC_TIME", "French")
## [1] "French_France.1252"
> head(month(dt, label = TRUE))
## [1] janv\\. janv\\. janv\\. janv\\. janv\\. janv\\.
## 12 Levels: janv\\. < févr\\. < mars < avr\\. < mai < juin < ... < déc\\.

> Sys.setlocale("LC_TIME", "English")
## [1] "English_United States.1252"
> head(month(dt, label = TRUE))
## [1] Jan Jan Jan Jan Jan Jan
## 12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec

Labels: bug, clock, localisation

All 9 comments

This appears to happen because the stored month names are regular expressions (with the dot escaped), so we also need to store canonical names for output.
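
A minimal base R sketch of that diagnosis, not lubridate's actual code: the names in `abbr` are hypothetical sample values, and `escape_regex` is an illustrative helper built around the escaping pattern quoted later in this thread. It shows why the escaped form is needed for matching but should not be reused as a label.

```r
# Hypothetical abbreviated month names, as a Windows locale might return them.
abbr <- c("ene.", "feb.", "mar.")

# Illustrative helper: put a backslash in front of regex metacharacters.
escape_regex <- function(x) gsub("([].|(){^$*+?[])", "\\\\\\1", x)
abbr_regex <- escape_regex(abbr)

# For matching, the escaped form is the right one: an unescaped dot
# would match any character.
unescaped <- paste0("^(", paste(abbr, collapse = "|"), ")$")
escaped <- paste0("^(", paste(abbr_regex, collapse = "|"), ")$")
grepl(unescaped, "enex")  # TRUE, a false positive
grepl(escaped, "enex")    # FALSE

# For output, the canonical names should be used. Reusing the regex form
# as factor levels leaks the backslash into the labels.
leaky <- factor(abbr_regex[1], levels = abbr_regex)
clean <- factor(abbr[1], levels = abbr)
levels(leaky)
levels(clean)
```

Keeping the two vectors side by side is one way to "store canonical names for output" while still matching safely.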

This is surely a Windows-only issue, and probably a regex bug indeed.

@dominicroye could you please provide the output of the following (with the locale replaced by your French and Spanish locale names)?

Sys.setlocale("LC_TIME", "es_ES.utf8")
format <- "%a@%A@%b@%B@%p@"
enc2utf8(unique(format(lubridate:::.date_template, format = format)))
##  [1] "jue@jueves@ene@enero@@"      "lun@lunes@feb@febrero@@"     "mar@martes@mar@marzo@@"     
##  [4] "dom@domingo@abr@abril@@"     "vie@viernes@may@mayo@@"      "mar@martes@jun@junio@@"     
##  [7] "vie@viernes@jul@julio@@"     "mié@miércoles@ago@agosto@@"  "mar@martes@sep@septiembre@@"
## [10] "vie@viernes@oct@octubre@@"   "mar@martes@nov@noviembre@@"  "sáb@sábado@dic@diciembre@@" 

Also the value of

str(lubridate:::.get_locale_regs("...your_locales..."))

es_ES.utf8 doesn't exist on Windows.

Here is my output from your code:

SPANISH

> Sys.setlocale("LC_TIME", "Spanish_Spain.1252")
> format <- "%a@%A@%b@%B@%p@"
> enc2utf8(unique(format(lubridate:::.date_template, format = format)))
 [1] "ju.@jueves@ene.@enero@@"      "lu.@lunes@feb.@febrero@@"     "ma.@martes@mar.@marzo@@"     
 [4] "do.@domingo@abr.@abril@@"     "vi.@viernes@may.@mayo@@"      "ma.@martes@jun.@junio@@"     
 [7] "vi.@viernes@jul.@julio@@"     "mi.@miércoles@ago.@agosto@@"  "ma.@martes@sep.@septiembre@@"
[10] "vi.@viernes@oct.@octubre@@"   "ma.@martes@nov.@noviembre@@"  "sá.@sábado@dic.@diciembre@@" 

> str(lubridate:::.get_locale_regs("Spanish_Spain.1252"))
List of 6
 $ alpha_flex : Named chr [1:6] "((?<b_b>ene\\.|feb\\.|mar\\.|abr\\.|may\\.|jun\\.|jul\\.|ago\\.|sep\\.|oct\\.|nov\\.|dic\\.)|(?<B_b>enero|febre"| __truncated__ "(?<B_B>enero|febrero|marzo|abril|mayo|junio|julio|agosto|septiembre|octubre|noviembre|diciembre)(?![[:alpha:]])" "((?<a_a>ju\\.|lu\\.|ma\\.|do\\.|vi\\.|mi\\.|sá\\.)|(?<A_a>jueves|lunes|martes|domingo|viernes|miércoles|sábado)"| __truncated__ "(?<A_A>jueves|lunes|martes|domingo|viernes|miércoles|sábado)(?![[:alpha:]])" ...
  ..- attr(*, "names")= chr [1:6] "b" "B" "a" "A" ...
 $ num_flex   : Named chr [1:24] "(?<d>[012]?[1-9]|3[01]|[12]0)(?!\\d)" "(?<q>[0]?[1-4])(?!\\d)" "(?<H>2[0-4]|[01]?\\d)(?!\\d)" "(?<H>2[0-4]|[01]?\\d)(?!\\d)" ...
  ..- attr(*, "names")= chr [1:24] "d" "q" "H" "h" ...
 $ alpha_exact: Named chr [1:6] "((?<b_b_e>ene\\.|feb\\.|mar\\.|abr\\.|may\\.|jun\\.|jul\\.|ago\\.|sep\\.|oct\\.|nov\\.|dic\\.)|(?<B_b_e>enero|f"| __truncated__ "(?<B_B_e>enero|febrero|marzo|abril|mayo|junio|julio|agosto|septiembre|octubre|noviembre|diciembre)(?![[:alpha:]])" "((?<a_a_e>ju\\.|lu\\.|ma\\.|do\\.|vi\\.|mi\\.|sá\\.)|(?<A_a_e>jueves|lunes|martes|domingo|viernes|miércoles|sáb"| __truncated__ "(?<A_A_e>jueves|lunes|martes|domingo|viernes|miércoles|sábado)(?![[:alpha:]])" ...
  ..- attr(*, "names")= chr [1:6] "b" "B" "a" "A" ...
 $ num_exact  : Named chr [1:24] "(?<d_e>[012][1-9]|3[01]|[12]0)" "(?<q_e>[0][1-4])" "(?<H_e>2[0-4]|[01]\\d)" "(?<H_e>2[0-4]|[01]\\d)" ...
  ..- attr(*, "names")= chr [1:24] "d" "q" "H" "h" ...
 $ wday_names :List of 2
  ..$ abr : chr [1:7] "do\\." "lu\\." "ma\\." "mi\\." ...
  ..$ full: chr [1:7] "domingo" "lunes" "martes" "miércoles" ...
 $ month_names:List of 2
  ..$ abr : chr [1:12] "ene\\." "feb\\." "mar\\." "abr\\." ...
  ..$ full: chr [1:12] "enero" "febrero" "marzo" "abril" ...

FRENCH

> Sys.setlocale("LC_TIME", "French_France.1252")
> format <- "%a@%A@%b@%B@%p@"
> enc2utf8(unique(format(lubridate:::.date_template, format = format)))
 [1] "jeu.@jeudi@janv.@janvier@@"    "lun.@lundi@févr.@février@@"    "mar.@mardi@mars@mars@@"       
 [4] "dim.@dimanche@avr.@avril@@"    "ven.@vendredi@mai@mai@@"       "mar.@mardi@juin@juin@@"       
 [7] "ven.@vendredi@juil.@juillet@@" "mer.@mercredi@août@août@@"     "mar.@mardi@sept.@septembre@@" 
[10] "ven.@vendredi@oct.@octobre@@"  "mar.@mardi@nov.@novembre@@"    "sam.@samedi@déc.@décembre@@"  

> str(lubridate:::.get_locale_regs("French_France.1252"))
List of 6
 $ alpha_flex : Named chr [1:6] "((?<b_b>janv\\.|févr\\.|mars|avr\\.|mai|juin|juil\\.|août|sept\\.|oct\\.|nov\\.|déc\\.)|(?<B_b>janvier|février|"| __truncated__ "(?<B_B>janvier|février|mars|avril|mai|juin|juillet|août|septembre|octobre|novembre|décembre)(?![[:alpha:]])" "((?<a_a>jeu\\.|lun\\.|mar\\.|dim\\.|ven\\.|mer\\.|sam\\.)|(?<A_a>jeudi|lundi|mardi|dimanche|vendredi|mercredi|s"| __truncated__ "(?<A_A>jeudi|lundi|mardi|dimanche|vendredi|mercredi|samedi)(?![[:alpha:]])" ...
  ..- attr(*, "names")= chr [1:6] "b" "B" "a" "A" ...
 $ num_flex   : Named chr [1:24] "(?<d>[012]?[1-9]|3[01]|[12]0)(?!\\d)" "(?<q>[0]?[1-4])(?!\\d)" "(?<H>2[0-4]|[01]?\\d)(?!\\d)" "(?<H>2[0-4]|[01]?\\d)(?!\\d)" ...
  ..- attr(*, "names")= chr [1:24] "d" "q" "H" "h" ...
 $ alpha_exact: Named chr [1:6] "((?<b_b_e>janv\\.|févr\\.|mars|avr\\.|mai|juin|juil\\.|août|sept\\.|oct\\.|nov\\.|déc\\.)|(?<B_b_e>janvier|févr"| __truncated__ "(?<B_B_e>janvier|février|mars|avril|mai|juin|juillet|août|septembre|octobre|novembre|décembre)(?![[:alpha:]])" "((?<a_a_e>jeu\\.|lun\\.|mar\\.|dim\\.|ven\\.|mer\\.|sam\\.)|(?<A_a_e>jeudi|lundi|mardi|dimanche|vendredi|mercre"| __truncated__ "(?<A_A_e>jeudi|lundi|mardi|dimanche|vendredi|mercredi|samedi)(?![[:alpha:]])" ...
  ..- attr(*, "names")= chr [1:6] "b" "B" "a" "A" ...
 $ num_exact  : Named chr [1:24] "(?<d_e>[012][1-9]|3[01]|[12]0)" "(?<q_e>[0][1-4])" "(?<H_e>2[0-4]|[01]\\d)" "(?<H_e>2[0-4]|[01]\\d)" ...
  ..- attr(*, "names")= chr [1:24] "d" "q" "H" "h" ...
 $ wday_names :List of 2
  ..$ abr : chr [1:7] "dim\\." "lun\\." "mar\\." "mer\\." ...
  ..$ full: chr [1:7] "dimanche" "lundi" "mardi" "mercredi" ...
 $ month_names:List of 2
  ..$ abr : chr [1:12] "janv\\." "févr\\." "mars" "avr\\." ...
  ..$ full: chr [1:12] "janvier" "février" "mars" "avril" ...

Ok, so on Windows most of the abbreviations come with a dot at the end (French mars, mai, juin and août do not). Let me see what I can do.
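
To check whether a given setup is affected, the abbreviations that the current LC_TIME locale produces can be inspected with base R alone. This is only a sketch: the year is arbitrary, and the result depends entirely on the operating system and locale, so no output is shown.

```r
# One date per month; the year is arbitrary.
dates <- as.Date(sprintf("2018-%02d-01", 1:12))

# Abbreviated month names as produced by the current LC_TIME locale.
abbr <- format(dates, "%b")

# Flag the abbreviations that end in a literal dot.
trailing_dot <- endsWith(abbr, ".")
data.frame(abbr = abbr, trailing_dot = trailing_dot)
```

Run it after `Sys.setlocale("LC_TIME", ...)` with the locale of interest to compare platforms.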

This should be fixed now. I would really appreciate it if you could try the dev version and let me know whether it works correctly.

It is working correctly. Thank you!

I have to reopen this issue since it is still happening with weekdays. Sorry that I only noticed it now!

> library(lubridate)
> Sys.getlocale("LC_TIME")
[1] "Spanish_Spain.1252"

> dt <- seq(ymd("2018-01-01"), ymd("2018-12-31"), "day")

> head(wday(dt, label = TRUE))
[1] lu\\. ma\\. mi\\. ju\\. vi\\. sá\\.
Levels: do\\. < lu\\. < ma\\. < mi\\. < ju\\. < vi\\. < sá\\.

I can confirm this bug. However, the fix applied to the guess_formats function (https://github.com/tidyverse/lubridate/commit/cc5f1a6de86863f983fd3f69ac842c31997a03a0) works and can easily be applied to .get_locale_regs as well, which is what the wday function uses.

This line (https://github.com/tidyverse/lubridate/blob/6f26b02de432cd9373ad4ce7766c36eacfc29918/R/guess.r#L311) needs to be replaced with the following:

  mat[] <- gsub("\\.$", "", mat) # remove abbrev trailing dot in some locales (#781)
  mat[] <- gsub("([].|(){^$*+?[])", "\\\\\\1", mat) # escaping meta chars
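
As a rough illustration of what these two lines do, here they are applied to a small character matrix. The sample values are hypothetical and only stand in for the locale names that lubridate collects; `"a.m."` is included to show that a dot inside a name is still escaped.

```r
# Hypothetical sample of locale names, stored as a character matrix.
mat <- matrix(c("ene.", "mars", "vi.", "a.m."), nrow = 2)

# Step 1: drop a single trailing dot, if present.
mat[] <- gsub("\\.$", "", mat)

# Step 2: escape any remaining regex metacharacters.
mat[] <- gsub("([].|(){^$*+?[])", "\\\\\\1", mat)

# Assigning through mat[] keeps the matrix dimensions.
mat
```

The order matters: stripping the trailing dot first means there is nothing left at the end of the name to escape, so the backslash no longer shows up in the labels.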

I imagine this works correctly with clock, since we don't do anything with regular expressions:

library(clock)

dt <- seq(date_parse("2018-01-01"), date_parse("2018-12-31"), "day")

head(date_month_factor(dt, labels = "es", abbreviate = TRUE))
#> [1] ene. ene. ene. ene. ene. ene.
#> 12 Levels: ene. < feb. < mar. < abr. < may. < jun. < jul. < ago. < ... < dic.

head(date_weekday_factor(dt, labels = "es", abbreviate = TRUE))
#> [1] lun. mar. mié. jue. vie. sáb.
#> Levels: dom. < lun. < mar. < mié. < jue. < vie. < sáb.

If the labels aren't exactly what you expect, you can always create a custom clock_labels() object to use as the labels argument.
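
A sketch of such a custom object, assuming the goal is Spanish abbreviations without the trailing dots. The name `es_no_dots` and the exact label strings (including the `am_pm` values) are illustrative choices, not something taken from clock itself.

```r
library(clock)

# Hypothetical label set: Spanish names, abbreviations without dots.
# Weekdays are listed starting on Sunday.
es_no_dots <- clock_labels(
  month = c("enero", "febrero", "marzo", "abril", "mayo", "junio",
            "julio", "agosto", "septiembre", "octubre", "noviembre",
            "diciembre"),
  month_abbrev = c("ene", "feb", "mar", "abr", "may", "jun",
                   "jul", "ago", "sep", "oct", "nov", "dic"),
  weekday = c("domingo", "lunes", "martes", "miércoles", "jueves",
              "viernes", "sábado"),
  weekday_abbrev = c("dom", "lun", "mar", "mié", "jue", "vie", "sáb"),
  am_pm = c("a. m.", "p. m.")
)

dt <- seq(date_parse("2018-01-01"), date_parse("2018-12-31"), "day")

# Pass the custom object wherever a language code such as "es" was used.
month_fct <- date_month_factor(dt, labels = es_no_dots, abbreviate = TRUE)
wday_fct <- date_weekday_factor(dt, labels = es_no_dots, abbreviate = TRUE)

head(month_fct)
head(wday_fct)
```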
