Using: _angular-1.3.0-build.2860_
I was expecting AngularJS to encode query string parameters using encodeURIComponent. According to the following test it is not the case:
describe('$http', function () {
  it('encodes uri components correctly', inject(function ($http, $httpBackend) {
    var data = 'Hello from http://example.com';
    $httpBackend.expectGET('/api/process?data=' + encodeURIComponent(data));
    $http({ method: 'GET', url: '/api/process', params: { data: data } });
    $httpBackend.flush();
  }));
});
The test fails with the following error:
$http encodes uri components correctly
Error: Unexpected request: GET /api/process?data=Hello+from+http:%2F%2Fexample.com
Expected GET /api/process?data=Hello%20from%20http%3A%2F%2Fexample.com
To sum up:
Expected: Hello%20from%20http%3A%2F%2Fexample.com
Actual: Hello+from+http:%2F%2Fexample.com
This makes testing requests hard, as I don't know what encoded value to expect. Is there any reason why Angular is not using encodeURIComponent?
This is documented in the source, on the function replacing encodeURIComponent (in src/Angular.js).
encodeURIComponent encodes more characters than it must, preventing certain characters from being passed that are allowed by RFC3986.
/**
 * We need our custom method because encodeURIComponent is too aggressive and doesn't follow
 * http://www.ietf.org/rfc/rfc3986.txt with regards to the character set (pchar) allowed in path
 * segments:
 *    segment       = *pchar
 *    pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
 *    pct-encoded   = "%" HEXDIG HEXDIG
 *    unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
 *    sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
 *                     / "*" / "+" / "," / ";" / "="
 */
function encodeUriSegment(val) {
  return encodeUriQuery(val, true).
             replace(/%26/gi, '&').
             replace(/%3D/gi, '=').
             replace(/%2B/gi, '+');
}
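To illustrate the over-encoding that comment describes, here is a quick check you could run in any JavaScript console. It only uses the built-in encodeURIComponent; the variable names are made up for the example.

```javascript
// The sub-delims plus ":" and "@" that RFC 3986 allows unescaped in a path segment.
var allowedInSegment = "!$&'()*+,;=:@";

// encodeURIComponent leaves only ! ' ( ) * from this set untouched
// and percent-encodes the rest.
var encoded = encodeURIComponent(allowedInSegment);
// encoded is "!%24%26'()*%2B%2C%3B%3D%3A%40"
```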
I agree that encodeURIComponent encodes characters that do not really need encoding. However, this does not cause any data loss or corruption, does it? Why is avoiding over-encoding worth rolling a custom encoding function? The encoded data is sent to the server and is not used for routing or anything "useful" to Angular, as far as I can see. I am curious to know the rationale behind this.
On the other hand, the encodeUriSegment function is not part of the API, so it can't be used in testing. Any idea on how we can work around that?
If you look at the tests that were added along with these functions, in 9e30baad, the comments note that some external APIs rely on unescaped characters, like @:
it('should not encode @ in url params', function() {
//encodeURIComponent is too agressive and doesn't follow http://www.ietf.org/rfc/rfc2396.txt
//with regards to the character set (pchar) allowed in path segments
//so we need this test to make sure that we don't over-encode the params and break stuff like
//buzz api which uses @self
Could you copy the functions for use in your own tests, write new tests based on the RFC, or open a PR to make these functions public (angular.encodeUri)? Those are my best ideas.
As @zzmp mentioned, this is intentional because some server-side APIs don't like URLs that were over-encoded.
To test, either use literal URL strings or create a helper function to escape params.
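A helper along those lines might look like the sketch below. The name encodeQueryValue is hypothetical, and the list of un-escaped characters is an assumption inferred from the failing test output in this thread (space becomes "+", ":" is left alone), so check it against encodeUriQuery in src/Angular.js for the version you are running.

```javascript
// Hypothetical test helper, not part of Angular's public API.
// Assumption: query values are run through encodeURIComponent, then
// "@", ":", "$" and "," are restored and spaces are written as "+".
function encodeQueryValue(value) {
  return encodeURIComponent(value)
    .replace(/%40/gi, '@')
    .replace(/%3A/gi, ':')
    .replace(/%24/g, '$')
    .replace(/%2C/gi, ',')
    .replace(/%20/g, '+');
}

var data = 'Hello from http://example.com';
var expectedUrl = '/api/process?data=' + encodeQueryValue(data);
// expectedUrl is '/api/process?data=Hello+from+http:%2F%2Fexample.com',
// the URL reported as the actual request in the failing test.
```

In the spec, expectedUrl would then be passed to $httpBackend.expectGET in place of the encodeURIComponent-based string. Because the helper is a copy, it can drift from Angular's behaviour between versions; literal URL strings avoid that at the cost of being harder to read.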