Crystal: Encoding issue while building an XML file

Created on 23 Mar 2018 · 2 Comments · Source: crystal-lang/crystal

Hi,

I just noticed something odd when I try to build an XML document:

Reproducible code

require "xml"
string = XML.build(indent: "  ", encoding: "utf-8") do |xml|
  xml.element("person", id: 1) do
    xml.element("firstname") { xml.text "Jane" }
    xml.element("lastname") { xml.text "Doe" }
  end
end
puts string

Result

<?xml version="1.0" encoding="utf-8U"?>
<person id="1">
  <firstname>Jane</firstname>
  <lastname>Doe</lastname>
</person>

The 'encoding' attribute is utf-8U instead of utf-8.
Why is a 'U' appended to the character encoding I specified?

Crystal version

Crystal 0.24.1 on Debian 9
This issue is also reproducible in the Crystal playground.

All 2 comments

This looks really strange. It happens with other encodings as well, but not all of them (for example, ISO-8859-1 is unmodified), and the result differs depending on capitalization. For example, ASCII becomes ASCIIU but ascii becomes asciiV.
I could not identify any potential issues in the Crystal bindings. XML::Builder just passes a char pointer to xmlTextWriterStartDocument pointing to the chars in the string (Bytes[117_u8, 116_u8, 102_u8, 45_u8, 56_u8, 0_u8]) and that's how it is supposed to work.
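To make the byte sequence above concrete, here is a small sketch that inspects what a C function receives when it is handed a Crystal String. It only uses String#to_slice, String#to_unsafe and Slice.new; the variable names are mine, and nothing here touches XML::Builder itself.

```crystal
encoding = "utf-8"

# The string's own bytes, without any terminator.
p encoding.to_slice

# String#to_unsafe is the pointer handed to C functions. Crystal keeps a
# NUL byte right after the string data, so reading bytesize + 1 bytes from
# that pointer includes the terminator C expects.
with_nul = Slice.new(encoding.to_unsafe, encoding.bytesize + 1)
p with_nul
```

If the last byte printed is 0, the string is properly terminated on the Crystal side, which is consistent with the bindings not being the culprit.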

When using the C API directly, everything works as expected: https://carc.in/#/r/3rri

My example uses xmlBufferCreate and xmlNewTextWriterMemory instead of xmlOutputBufferCreateIO and xmlNewTextWriter. So I guess it has something to do with the IO based buffer used by XML::Builder.
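For readers who cannot open the link, the memory-buffer approach can be sketched roughly as follows. This is not the linked snippet, only an approximation under assumptions: the lib name LibXMLSketch and its opaque Buffer and TextWriter types are hypothetical bindings declared just for this example, while the fun names are the libxml2 C functions.

```crystal
# Hypothetical minimal bindings, declared here only for this sketch.
@[Link("xml2")]
lib LibXMLSketch
  type Buffer = Void*
  type TextWriter = Void*

  fun xmlBufferCreate : Buffer
  fun xmlBufferContent(buf : Buffer) : UInt8*
  fun xmlBufferFree(buf : Buffer)
  fun xmlNewTextWriterMemory(buf : Buffer, compression : Int32) : TextWriter
  fun xmlTextWriterStartDocument(writer : TextWriter, version : UInt8*, encoding : UInt8*, standalone : UInt8*) : Int32
  fun xmlTextWriterWriteElement(writer : TextWriter, name : UInt8*, content : UInt8*) : Int32
  fun xmlTextWriterEndDocument(writer : TextWriter) : Int32
  fun xmlFreeTextWriter(writer : TextWriter)
end

# Write into an in-memory xmlBuffer instead of an IO-backed output buffer.
buf = LibXMLSketch.xmlBufferCreate
writer = LibXMLSketch.xmlNewTextWriterMemory(buf, 0)

# Strings are passed as NUL-terminated char pointers; nil becomes NULL.
LibXMLSketch.xmlTextWriterStartDocument(writer, "1.0", "utf-8", nil)
LibXMLSketch.xmlTextWriterWriteElement(writer, "firstname", "Jane")
LibXMLSketch.xmlTextWriterEndDocument(writer)

# Freeing the writer flushes pending output into buf but does not free buf.
LibXMLSketch.xmlFreeTextWriter(writer)

# Copy the result into a Crystal String before releasing the buffer.
result = String.new(LibXMLSketch.xmlBufferContent(buf))
LibXMLSketch.xmlBufferFree(buf)
puts result
```

Comparing the declaration line printed here with the output of XML.build should help narrow down whether the stray character comes from the IO-based writer path.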
