Actix-web: actix-files having trouble with Chinese characters in file content (not filename)

Created on 15 Sep 2020 · 9 Comments · Source: actix/actix-web

Expected Behavior

Chinese characters (and other unicode characters) should be displayed correctly.

Current Behavior

They are not.

Possible Solution

I noticed a related pull request (#1151). That PR fixed the issue with filenames, but file contents still seem to be affected.

Steps to Reproduce (for bugs)

See: https://github.com/lucifer1004/actix-file-utf-8

Context


Your Environment

  • Rust Version (I.e, output of rustc -V): rustc 1.47.0-nightly (30f0a0768 2020-08-18)
  • Actix Web Version: 3.0.1
  • Actix Files Version: 0.3.0
C-feature P-files

All 9 comments

Update:

  • curl is OK.
  • The problem is with browsers. Chrome, Safari, and Firefox were tested, and none of them displays the characters correctly.

I had a look at the source code and did not find a specific handler for files that will be directly opened in the browser (e.g., .txt, .html, etc.).

I then used the Node.js http-server to serve the same file, and found that the resulting page has content-type: text/plain; charset=UTF-8, whereas the page served by actix-files has only content-type: text/plain.

I think this is the cause.
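To illustrate why the missing charset matters, here is a minimal sketch using only the standard library. It assumes the browser falls back to a single-byte encoding such as ISO-8859-1 when no charset is declared; real browsers may pick a different fallback (for example windows-1252 or a locale-dependent encoding), so the exact garbled output can differ. The helper name `decode_latin1` is made up for this example.

```rust
/// Decode bytes as ISO-8859-1: every byte maps to the code point of the same value.
/// (Hypothetical helper for illustration; not part of actix-files.)
fn decode_latin1(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| b as char).collect()
}

fn main() {
    // The file on disk holds the UTF-8 encoding of the text (3 bytes per character here).
    let body = "中文".as_bytes();

    // With `charset=UTF-8` the client decodes the bytes as intended.
    let as_utf8 = std::str::from_utf8(body).expect("valid UTF-8");

    // Without a charset, a single-byte fallback turns each byte into its own character.
    let as_latin1 = decode_latin1(body);

    println!("decoded as UTF-8:      {}", as_utf8);
    println!("decoded as ISO-8859-1: {}", as_latin1);
}
```

The bytes on the wire are identical in both cases, which is consistent with curl showing the file correctly: only the interpretation chosen by the client differs.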

Here's a somewhat reasonable workaround. We will add the content types to the file handling, though.

use actix_files as fs;
use actix_service::Service;
use actix_web::{
    http::{header, HeaderValue},
    web, App, HttpResponse, HttpServer,
};
use futures::FutureExt;

fn configure_files(cfg: &mut web::ServiceConfig) {
    cfg.service(
        web::scope("")
            .wrap_fn(|req, srv| {
                srv.call(req).map(|res| {
                    res.map(|mut res| {
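                        let headers = res.headers_mut();

                        // Build the new value only when a readable Content-Type
                        // without a charset is present, instead of panicking.
                        let new_ct = headers
                            .get(header::CONTENT_TYPE)
                            .and_then(|ct| ct.to_str().ok())
                            .filter(|ct| !ct.to_ascii_lowercase().contains("charset="))
                            .and_then(|ct| {
                                HeaderValue::from_str(&[ct, "; charset=UTF-8"].concat()).ok()
                            });

                        if let Some(new_ct) = new_ct {
                            headers.insert(header::CONTENT_TYPE, new_ct);
                        }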

                        res
                    })
                })
            })
            .service(fs::Files::new("/static", ".").show_files_listing()),
    );
}

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    HttpServer::new(|| {
        App::new()
            .route("/", web::to(HttpResponse::Conflict))
            .route("/one", web::to(HttpResponse::Created))
            .configure(configure_files)
    })
    .bind("127.0.0.1:8888")?
    .run()
    .await
}
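The header rewrite in the workaround above can also be isolated into a pure function, which makes it easier to test. The following is only a sketch: `with_utf8_charset` is a hypothetical helper, not an actix-files API, and restricting it to `text/*` types is an assumption made here rather than something the workaround does.

```rust
/// Hypothetical helper: returns the Content-Type value with `; charset=UTF-8`
/// appended, or `None` when the value should be left untouched.
fn with_utf8_charset(content_type: &str) -> Option<String> {
    let lower = content_type.to_ascii_lowercase();
    // Assumption: only text/* types without an explicit charset are rewritten.
    if lower.starts_with("text/") && !lower.contains("charset=") {
        Some(format!("{}; charset=UTF-8", content_type))
    } else {
        None
    }
}

fn main() {
    for ct in &["text/plain", "text/html; charset=ISO-8859-1", "image/png"] {
        println!("{:?} -> {:?}", ct, with_utf8_charset(ct));
    }
}
```

A middleware could call such a function on the existing header value and insert the result only when it returns `Some`, so binary responses and responses that already declare a charset pass through unchanged.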

@robjtede Thanks for your quick reply!

It turns out this isn't a bug. The HTTP/1.1 spec (RFC 2616) states that text media types sent without a charset parameter default to ISO-8859-1. We should provide the ability to add the charset=utf-8 parameter, though; I'm looking into it.

It would be nice if we could set the charset with a simple option. Manipulating headers manually works, but it is far from elegant.

Actually, W3C recommends using UTF-8 for all content here.

Yep, that's correct. Whatever solution we come up with for adding the charset param should eventually become the default.

Released a workaround in 0.4.0.
