Zeroclickinfo-goodies: URL Decode: UTF-8 characters are incorrectly decoded

Created on 1 Sep 2017 · 4 Comments · Source: duckduckgo/zeroclickinfo-goodies

Description

Characters whose percent-encoding spans more than one byte (more than one %XX sequence) are decoded by URL Decode as if each byte were a separate character. For example, %C3%9C becomes Ã (followed by the unprintable character U+009C) but should be Ü.

Steps to recreate

Search for %C3%9C. This becomes Ã (followed by the unprintable character U+009C) but should be Ü.
Search for %E4%B8%AD%E6%96%87. This becomes ä¸æ interspersed with unprintable characters but should be 中文.
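The two searches above can be reproduced outside the Instant Answer. The following is a minimal Python sketch of the byte-level effect, not the goodie's Perl code. It assumes the wrong output comes from treating each decoded byte as one Latin-1 character; the helper names `decode_as_latin1` and `decode_as_utf8` are hypothetical.

```python
from urllib.parse import unquote_to_bytes


def decode_as_latin1(query):
    """Percent-decode, then map every byte to one character (assumed buggy path)."""
    return unquote_to_bytes(query).decode("latin-1")


def decode_as_utf8(query):
    """Percent-decode, then interpret the byte sequence as UTF-8 (expected path)."""
    return unquote_to_bytes(query).decode("utf-8")


for query in ("%C3%9C", "%E4%B8%AD%E6%96%87"):
    # repr() makes the unprintable characters in the wrong result visible.
    print(query, repr(decode_as_latin1(query)), repr(decode_as_utf8(query)))
```

If the assumption holds, the byte-per-character result for each query matches what the report describes, and the UTF-8 result matches the expected text.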

Tested on Firefox 55.0.3 and Safari 10.1.2, on macOS Sierra 10.12.6.


IA Page: http://duck.co/ia/view/urldecode
Maintainer: @mintsoft

Bug Perl Needs a Developer

All 4 comments

Adding the discussion label until the potential bug is scoped out.

Basically all this does is uri_unescape($in) and return, so it might be an upstream bug?

Most likely the input string isn't being handled as UTF-8.

Not only the input string, but also the output string :)
If, instead of title => $decoded, I write title => "микрокредит hey", I get the output "Ð¼Ð¸ÐºÑÐ¾ÐºÑÐµÐ´Ð¸Ñ hey". So it doesn't output UTF-8 characters correctly.
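The garbled title in this comment looks like a string that was encoded to UTF-8 and then displayed one byte per character. Here is a minimal Python sketch of that effect, assuming the bytes are read as Latin-1; `garble` and `repair` are hypothetical names, and this is not the goodie's Perl code.

```python
def garble(text):
    """Encode to UTF-8, then misread each byte as one Latin-1 character."""
    return text.encode("utf-8").decode("latin-1")


def repair(garbled):
    """Undo the misreading: recover the bytes, then decode them as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


title = "микрокредит hey"
shown = garble(title)

# Each Cyrillic letter is two bytes in UTF-8, so it shows up as two characters.
print(repr(shown))
print(repair(shown) == title)
```

Under that assumption the ASCII part (" hey") survives unchanged while every Cyrillic letter turns into a two-character pair, which is the pattern in the reported output.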

Btw, I've found how to output the correct string, but I don't know why it works :) I'll commit it if you want.
