V8-archive: Length validation uses wrong encoding type to count value length

Created on 11 Sep 2019  ·  17 Comments  ·  Source: directus/v8-archive

After you create a Text Input interface, set its type to VarChar, and specify a Length, the input in the panel does not accept text of that specified length. I think it accepts up to 10 characters fewer.

bug

All 17 comments

@anelad - Can you please verify that the steps I followed are correct? I couldn't reproduce the 10-character length difference.

[Image: ezgif com-video-to-gif]

Hi @bjgajjar, to reproduce the issue, you don't need to do anything special. Just type. The only difference is that my client uses Turkish characters. I think the interface miscounts Turkish characters such as "ş", "ç", or "ğ".

Hey @anelad — you'll need to provide more info for this one. We would need all the info on your server/install/db/OS to properly debug this since it's working fine for normal instances.

I just tested this on one of the demo's varchar Text Input fields and it worked as expected (even with the Turkish characters you provided). The column below has a DB limit of 50.

[Image: Screen Shot 2019-10-12 at 7 00 10 PM]

@benhaynes Will handle that after Monday

It could be that certain characters are "ligature style" characters that are in fact two characters that are rendered as one (like emojis). In that case it reads as 1, but it's stored / counted as 2.
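Whether a character counts as one or two depends on what is being counted. The minimal JavaScript sketch below (helper names are illustrative, not from Directus) compares UTF-16 code units with Unicode code points. It suggests that the precomposed Turkish letters are single units, whereas an emoji or a letter typed as a base character plus a combining mark is not:

```javascript
// What JavaScript's .length reports: UTF-16 code units.
const utf16Units = (s) => s.length;
// Array.from iterates a string by Unicode code point.
const codePoints = (s) => Array.from(s).length;

// "s\u0327" is a plain "s" followed by a combining cedilla; it renders like "ş".
const samples = ["ş", "ç", "ğ", "😄", "s\u0327"];

for (const s of samples) {
  console.log(JSON.stringify(s), "units:", utf16Units(s), "code points:", codePoints(s));
}
```

If the client's keyboard produces precomposed letters, .length alone would not explain a mismatch, so the byte count on the server side is worth checking as well.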

Machine: CentOS Linux 7.7.1908
DB: mysql 10.2.27-1.el7.centos
Directus: Directus Suite 7.11.0
Server: FPM app served by Apache (2.4.6-90.el7.centos) with PHP version of 7.3.10 and FastCGI support

A fresh error while entering value BIST 100 endeksi, günü yüzde 0,56 değer artışının ardından 100.345,64 puan ile tamamladı. Analistler, TCMB'nin faiz kararının ardından dolar/TL'de sakin bir sey:

POST https://cms.medyaeli.com/_/items/haberler 422
    (anonymous) @ directus-sdk.umd.min.js:1
    e.exports @ directus-sdk.umd.min.js:1
    e.exports @ directus-sdk.umd.min.js:8
    Promise.then (async)
    c.request @ directus-sdk.umd.min.js:8
    (anonymous) @ directus-sdk.umd.min.js:1
    request @ directus-sdk.umd.min.js:1
    post @ directus-sdk.umd.min.js:1
    createItem @ directus-sdk.umd.min.js:1
    hc @ actions.js:44
    (anonymous) @ vuex.esm.js:732
    m.dispatch @ vuex.esm.js:437
    dispatch @ vuex.esm.js:331
    save @ item.vue:697
    click @ item.vue?e165:1
    nt @ vue.runtime.esm.js:1854
    n @ vue.runtime.esm.js:2179
    nt @ vue.runtime.esm.js:1854
    e.$emit @ vue.runtime.esm.js:3882
    click @ header-button.vue?ac7a:8
    nt @ vue.runtime.esm.js:1854
    n @ vue.runtime.esm.js:2179
    Zo.i._wrapper @ vue.runtime.esm.js:6911


{
    "code": 12,
    "message": "The value submitted (BIST 100 endeksi, günü yüzde 0,56 değer artışının ardından 100.345,64 puan ile tamamladı. Analistler, TCMB'nin faiz kararının ardından dolar/TL'de sakin bir sey) for 'Ozet' is longer than the field's supported length (160). Please submit a shorter value or ask an Admin to increase the length."
    "__proto__": {
        Bu @ error.js:9
        nt @ vue.runtime.esm.js:1854
        e.$emit @ vue.runtime.esm.js:3882
        (anonymous) @ item.vue:755
        Promise.catch (async)
        save @ item.vue:751
        click @ item.vue?e165:1
        nt @ vue.runtime.esm.js:1854
        n @ vue.runtime.esm.js:2179
        nt @ vue.runtime.esm.js:1854
        e.$emit @ vue.runtime.esm.js:3882
        click @ header-button.vue?ac7a:8
        nt @ vue.runtime.esm.js:1854
        n @ vue.runtime.esm.js:2179
        Zo.i._wrapper @ vue.runtime.esm.js:6911
}

Full log here

Hey @anelad,
I tried with the text given by you in the Directus Demo APP and successfully created an item without any error.

[Image: ezgif com-video-to-gif]

Perhaps this is an issue with your database's encoding, character set, or some other specific difference in your server. Make sure everything is set to UTF8.

@benhaynes My DB collation is utf8_general_ci or utf8mb4_general_ci, and the storage engine is InnoDB.

I successfully added the text to the DB via phpMyAdmin, and it works as expected. The problem is with Directus.

JavaScript's .length returns a different size for this string than PHP's strlen. That's most likely the cause of the issue, as the JS-based front-end lets you enter more characters than the PHP API will allow in its validation:

// PHP

$str = "BIST 100 endeksi, günü yüzde 0,56 değer artışının ardından 100.345,64 puan ile tamamladı. Analistler, TCMB'nin faiz kararının ardından dolar/TL'de sakin bir sey";

echo strlen($str);

// => 173

// JavaScript

var str = "BIST 100 endeksi, günü yüzde 0,56 değer artışının ardından 100.345,64 puan ile tamamladı. Analistler, TCMB'nin faiz kararının ardından dolar/TL'de sakin bir sey";

str.length

// => 160

I believe this is because JS's .length counts UTF-16 code units, which for this text is one per character, while PHP's strlen counts bytes, and each Turkish letter such as "ü", "ş", or "ı" takes two bytes in UTF-8. You can confirm this by checking the difference between:

// PHP 

$str = "BIST 100 endeksi, günü yüzde 0,56 değer artışının ardından 100.345,64 puan ile tamamladı. Analistler, TCMB'nin faiz kararının ardından dolar/TL'de sakin bir sey";

echo strlen($str);

// => 173

and

// PHP

$str = "BIST 100 endeksi, günü yüzde 0,56 değer artışının ardından 100.345,64 puan ile tamamladı. Analistler, TCMB'nin faiz kararının ardından dolar/TL'de sakin bir sey";

echo mb_strlen($str, "UTF-8");

// => 160

Because mb_strlen is told the string is UTF-8, it counts characters rather than bytes. MySQL measures a VARCHAR length in characters, and the database column uses utf8_general_ci, so a value that passes the UTF-8 character count should fit in the column.
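The same comparison can be reproduced from the JavaScript side. The sketch below assumes a runtime where TextEncoder is available as a global (Node.js or a modern browser); the helper names are made up for illustration. Counting code points is roughly what mb_strlen($str, "UTF-8") does, and counting encoded bytes is roughly what strlen does on a UTF-8 string:

```javascript
// Roughly mb_strlen($str, "UTF-8"): number of Unicode code points.
const countCodePoints = (s) => Array.from(s).length;
// Roughly strlen($str) on a UTF-8 string: number of encoded bytes.
const countUtf8Bytes = (s) => new TextEncoder().encode(s).length;

const str = "BIST 100 endeksi, günü yüzde 0,56 değer artışının ardından 100.345,64 puan ile tamamladı. Analistler, TCMB'nin faiz kararının ardından dolar/TL'de sakin bir sey";

console.log("UTF-16 units (.length):", str.length);
console.log("code points:", countCodePoints(str));
console.log("UTF-8 bytes:", countUtf8Bytes(str));
```

The byte count should come out larger than the other two, because every non-ASCII letter in the text needs two bytes in UTF-8.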

Changing this

https://github.com/directus/api/blob/7e379d9045e2126bc56fc82687b5a051ad5f9251/src/core/Directus/Services/AbstractService.php#L545

to this

                 );
             }
-        }else{
-            if(!is_null($field['length']) && ((is_array($value) && $field['length'] < strlen(json_encode($value))) || (!is_array($value) && $field['length'] < strlen($value)))){
+        } else {
+            if (!is_null($field['length']) && ((is_array($value) && $field['length'] < strlen(json_encode($value))) || (!is_array($value) && $field['length'] < mb_strlen($value, 'UTF-8')))){
                 throw new UnprocessableEntityException(
                     sprintf("The value submitted (%s) for '%s' is longer than the field's supported length (%s). Please submit a shorter value or ask an Admin to increase the length.",!is_array($value) ? $value : 'Json / Array',$field->getFormatisedName(),$field['length'])
                 );

(i.e. replacing strlen($value) with mb_strlen($value, "UTF-8")) fixes it on my end. @bjgajjar could you confirm this change won't screw up anything else?

I guess a cleaner solution would be to extract the encoding type from the database column instead of assuming UTF-8, but seeing that Directus creates columns with that encoding type, I think it's a good starting point.
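The effect of swapping strlen for mb_strlen can be sketched in JavaScript. This is a hypothetical port of the length check for illustration only, not Directus code; exceedsLength, byteLength, and charLength are invented names, and TextEncoder is assumed to be available as a global:

```javascript
// Hypothetical stand-ins for the two PHP functions.
const byteLength = (s) => new TextEncoder().encode(s).length; // like strlen
const charLength = (s) => Array.from(s).length;               // like mb_strlen(..., "UTF-8")

// Mirrors the shape of the PHP condition: a null length means "no limit".
function exceedsLength(value, maxLength, measure) {
  return maxLength !== null && maxLength < measure(value);
}

// "şçğıü" written with escapes: 5 characters, 10 bytes in UTF-8.
const value = "\u015F\u00E7\u011F\u0131\u00FC";

console.log(exceedsLength(value, 5, byteLength)); // true: rejected when counting bytes
console.log(exceedsLength(value, 5, charLength)); // false: accepted when counting characters
```

With the byte-based measure, a value that the front-end considers exactly at the limit is rejected as soon as it contains any two-byte letter.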

> could you confirm this change won't screw up anything else?

Nope, it's not a breaking change.

And it seems like the master branch already contained this code, so we can close this.

@rijkvanzanten - Can you please help me to find out what's required to close this issue?

> And it seems like the master branch already contained this code

It _seems_ like it contains it... You were the one who merged it in @bjgajjar ! 😄

> You were the one who merged it

@rijkvanzanten - Yeah I know that. That's why I mentioned here to get a confirmation. :)

Ah, lost in translation then 😉
