By default, Java .properties files are encoded in ISO-8859-1 (see https://en.wikipedia.org/wiki/.properties ), but since #4622 Spring Boot reads them as UTF-8.
This makes Spring Boot incompatible with plain Java and with Spring itself.
The difference from Spring is especially awkward in unit tests. When a test case has its own Spring context, .properties files are loaded by Spring, not by Spring Boot, and are parsed correctly as ISO-8859-1. When the application is started, the same .properties files are loaded by Spring Boot and valid ISO-8859-1 characters are corrupted.
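The mismatch can be reproduced with the JDK alone. The following is a minimal sketch, not Spring Boot's actual loading code: the class name `PropertiesEncodingDemo` and the `greeting` key are made up for illustration. It relies on `Properties.load(InputStream)` decoding its input as ISO-8859-1, and contrasts that with reading the same bytes through a UTF-8 `Reader`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class; it only illustrates the decoding difference.
final class PropertiesEncodingDemo {

    // The bytes of "greeting=äöü" as stored in an ISO-8859-1 encoded file.
    static byte[] latin1File() {
        return "greeting=\u00e4\u00f6\u00fc".getBytes(StandardCharsets.ISO_8859_1);
    }

    // Properties.load(InputStream) always decodes the stream as ISO-8859-1.
    static String loadAsLatin1(byte[] file) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(file));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("greeting");
    }

    // Reading the same bytes through a UTF-8 Reader: the bytes are not valid
    // UTF-8, so the decoder substitutes the replacement character U+FFFD.
    static String loadAsUtf8(byte[] file) {
        Properties props = new Properties();
        try {
            props.load(new InputStreamReader(
                    new ByteArrayInputStream(file), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("greeting");
    }

    public static void main(String[] args) {
        byte[] file = latin1File();
        System.out.println("ISO-8859-1 intact: "
                + loadAsLatin1(file).equals("\u00e4\u00f6\u00fc"));
        System.out.println("UTF-8 corrupted:   "
                + loadAsUtf8(file).contains("\uFFFD"));
    }
}
```

The literals use `\u` escapes so that the example does not depend on the encoding of the source file itself.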
A JUnit test configured like this:
@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(classes = MyTestConfig.class)
public class MyTest {
// ...
}
with a test config importing a PropertySource:
@Configuration
@PropertySource(value = "/application.properties")
public class MyTestConfig {
}
will pass with ISO-8859-1 characters like "äöü" in application.properties, but the Spring Boot application itself will show those characters corrupted because it decodes the file as UTF-8.
Furthermore, the typical Java IDEs treat .properties files as ISO-8859-1 (or the similar CP1252) by default.
@dsyer can you provide any background on why #4622 changed the default?
We changed it because someone asked us to support extended character sets (8859 is pretty narrow). I don't think we should change it back. Surely @PropertySource is handled by Spring (not Boot) so it would always be the same.
It looks to me like @PropertySource uses ResourcePropertySource, which uses EncodedResource but doesn't set UTF-8 encoding. I think we might be unique with our handling, so I'd vote for reverting back to ISO-8859-1.
Spring Framework recently added support for an encoding attribute on @PropertySource. The default hasn't changed, so I also think we should revert it. Having said that, maybe we could offer a configuration option (that will be hard to set) to customize this?
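To make the proposal concrete, here is a rough sketch of what "ISO-8859-1 by default, with an optional override" could look like using only the JDK. It is not a Spring or Spring Boot API: `ConfigurablePropertiesLoader`, `DEFAULT_CHARSET`, and the `load` signature are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical loader; it only sketches the "default plus override" idea.
final class ConfigurablePropertiesLoader {

    // Assumed default, matching java.util.Properties and plain Spring.
    static final Charset DEFAULT_CHARSET = StandardCharsets.ISO_8859_1;

    // Loads properties with the given charset, or the default when null.
    static Properties load(InputStream in, Charset charset) {
        Charset effective = (charset != null) ? charset : DEFAULT_CHARSET;
        Properties props = new Properties();
        try (Reader reader = new InputStreamReader(in, effective)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        String content = "greeting=\u00e4\u00f6\u00fc";

        // No charset configured: an ISO-8859-1 file is read correctly.
        byte[] latin1 = content.getBytes(StandardCharsets.ISO_8859_1);
        Properties a = load(new ByteArrayInputStream(latin1), null);

        // Explicit opt-in: a UTF-8 file is read correctly as well.
        byte[] utf8 = content.getBytes(StandardCharsets.UTF_8);
        Properties b = load(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8);

        System.out.println(a.getProperty("greeting").equals(b.getProperty("greeting")));
    }
}
```

With this shape, existing ISO-8859-1 files keep working unchanged, and only users who need an extended character set have to opt in.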