Spring-boot: Consistent URI/body decoding

Created on 27 Jun 2014  ·  9 Comments  ·  Source: spring-projects/spring-boot

Spring MVC defaults to ISO-8859-1 for historical reasons, and it's not easy to change that value. The decoding of the URI happens in two stages: the container does it (see #542 for what we did for Tomcat, since Jetty already defaults to UTF-8), and Spring MVC does it, based on the request encoding or the default if none is set.

What people usually do is configure the org.springframework.web.filter.CharacterEncodingFilter with the same encoding as the one defined on the container. That way the body and the URI are decoded in a consistent manner.
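To illustrate why the two sides have to agree, here is a minimal sketch in plain Java, with no Spring or servlet classes involved. The class and method names are hypothetical; it only assumes that the client sends UTF-8 bytes and that the server decodes them with whichever charset it has been configured to use.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Spring or any servlet container.
final class EncodingMismatchDemo {

    // Encode text with the client's charset, then decode the same bytes
    // with the server's charset, as happens to a request body or URI.
    static String roundTrip(String text, Charset clientCharset, Charset serverCharset) {
        byte[] wire = text.getBytes(clientCharset);
        return new String(wire, serverCharset);
    }

    public static void main(String[] args) {
        String original = "\u65e5\u672c\u8a9e"; // the three characters of "Japanese" in Japanese

        // Both sides agree on UTF-8: the text survives.
        String consistent = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);

        // Client sends UTF-8, server decodes as ISO-8859-1: every byte becomes its own character.
        String garbled = roundTrip(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        System.out.println(consistent.equals(original)); // true
        System.out.println(garbled.length());            // 9 (three bytes per character)
    }
}
```

The same mismatch appears if only one of the two decoding stages is switched to UTF-8, which is why the filter and the container are usually configured together.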

Long story short: #542 does not fully provide UTF-8 decoding by default. It would be nice to bring that property back at a generic level if we are able to configure Jetty as well: that way the uriEncoding could be customized with a single property.

enhancement

Most helpful comment

Japanese Spring Boot users always configure CharacterEncodingFilter as follows

    @Bean
    @Order(Ordered.HIGHEST_PRECEDENCE)
    CharacterEncodingFilter characterEncodingFilter() {
        CharacterEncodingFilter filter = new CharacterEncodingFilter();
        filter.setEncoding("UTF-8");
        filter.setForceEncoding(true);
        return filter;
    }

to handle Japanese text in POST requests.

An auto-configured CharacterEncodingFilter would be very helpful.

All 9 comments

Not on a computer right now but a similar issue already exists FYI.

Here is another related issue in Spring MVC: SPR-11925

I was just going to point out that HTTP specifies ISO-8859-1 as the default charset (see p. 26 here: http://www.w3.org/Protocols/rfc2068/rfc2068.txt), but I noticed that is also mentioned in the SPR-11925 issue that you just referenced.

There are plenty of other such tickets. One was commented on just yesterday: https://jira.spring.io/browse/SPR-11035

RFC 2068 is ancient; it was made obsolete by RFC 2616, and that one is now "dead" as well. Appendix B of RFC 7231 says the default charset of ISO-8859-1 has been removed.

I think UTF-8 is a reasonable default these days. Boot should either default to it and/or ideally make it easy to switch.

Rossen, let's discuss how best we can achieve that, because it's something I have wanted to do for a very long time. Thanks!

I'd like to point out one more thing regarding this topic.

I have had another encoding problem with the Spring Security filter.
Even though I configured CharacterEncodingFilter, it didn't work well.

Strangely, it depends on the environment it runs in.
For example,
it occurred on an AWS Elastic Beanstalk c3.large but not on t1.micro or m3.large.
On my local MacBook it works well, even though Spring Security is configured.

At that time, I fixed it with the following configuration in a class extending WebSecurityConfigurerAdapter:

    @Override
    protected void configure(HttpSecurity http) throws Exception {
        // omitted
        CharacterEncodingFilter filter = new CharacterEncodingFilter();
        filter.setEncoding("UTF-8");
        filter.setForceEncoding(true);
        http.addFilterBefore(filter, CsrfFilter.class);
    }

I haven't investigated much yet, but I wonder whether it is related to the execution order of HIGHEST_PRECEDENCE filters.
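As a rough illustration of why filter order could matter here, the sketch below simulates a request whose parameters are decoded lazily on first access, with an encoding change after that point having no effect (the servlet API documents the same constraint for setCharacterEncoding). LazyRequest and FilterOrderDemo are hypothetical stand-ins, not the real servlet or Spring Security classes, and this is not a confirmed diagnosis of the behaviour described above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a servlet request; not the real servlet API.
final class LazyRequest {
    private final byte[] rawBody;
    private Charset encoding = StandardCharsets.ISO_8859_1; // historical default
    private String parsed; // cached on first read

    LazyRequest(byte[] rawBody) {
        this.rawBody = rawBody;
    }

    // Setting the encoding only takes effect before the body has been parsed.
    void setCharacterEncoding(Charset charset) {
        if (parsed == null) {
            encoding = charset;
        }
    }

    // Decodes the body on first access and caches the result.
    String getParameter() {
        if (parsed == null) {
            parsed = new String(rawBody, encoding);
        }
        return parsed;
    }
}

final class FilterOrderDemo {

    // Runs two simulated filters in the given order and returns what the
    // application finally sees.
    static String run(boolean encodingFilterFirst) {
        String text = "\u65e5\u672c\u8a9e";
        LazyRequest request = new LazyRequest(text.getBytes(StandardCharsets.UTF_8));
        if (encodingFilterFirst) {
            request.setCharacterEncoding(StandardCharsets.UTF_8); // encoding filter
            request.getParameter(); // e.g. a filter that reads a request parameter
        } else {
            request.getParameter(); // parameter read happens too early
            request.setCharacterEncoding(StandardCharsets.UTF_8); // too late, ignored
        }
        return request.getParameter();
    }

    public static void main(String[] args) {
        System.out.println(run(true).equals("\u65e5\u672c\u8a9e"));  // true
        System.out.println(run(false).equals("\u65e5\u672c\u8a9e")); // false
    }
}
```

Under these assumptions, any filter that reads a parameter before the encoding filter runs would lock in the default charset, which is consistent with registering the encoding filter ahead of CsrfFilter.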

I'd appreciate it if Spring Boot encapsulated this :)

+1
