flow-remove-types produces invalid JS after non-latin characters

Created on 3 Jun 2019 · 13Comments · Source: facebook/flow

Flow version: 0.100.0

Expected behavior

Given the following input:

// @flow
// п
function foo(bar: number) {}

flow-remove-types should produce:

//      
// п
function foo(bar        ) {}

Actual behavior

It produces invalid JS:

//      
// п
function foo(bar:         {}

The non-latin character is the trigger. Without it, it works correctly. This seems like quite a serious bug considering how common it is to use non-latin characters in comments (e.g. comment in other languages, draw arrows etc.).

cc @mroch @samwgoldman

bug parsing

Source

mourner

👍4

All 13 comments

This seems to be a bug in flow-parser which produces incorrect range values in AST for non-latin characters. E.g. parsing // п and // n produces [0, 5] and [0, 4] ranges respectively, although both should be the same.

mourner on 3 Jun 2019

same root cause as #5793, I believe. the lexer reports its position in unicode codepoints, of which these are 1, but they are multibyte.

mroch on 3 Jun 2019

👍1

@mroch I believe it's a recent regression, because https://astexplorer.net/ (which uses Flow v0.94.0) shows identical AST for both cases ([0, 4]).

mourner on 3 Jun 2019

Screen Shot 2019-06-03 at 9 27 32 AM

this is 0.94, when hovering over the FunctionDeclaration in the AST. so it's been there a while, but flow-remove-types regressed because it moved to flow-parser.

mroch on 3 Jun 2019

this appears to only be an issue with the range. the loc matches babylon.

it's a bit easier to see if you do it all on one line: https://flow.org/try/#0PQKk-CAsBmD2MNxA (see the AST tab). since it's all on line 1, the range should match the columns in loc.

mroch on 3 Jun 2019

@mroch ah, yeah, you're right. The link I commented earlier triggers the bug if you add a new line after the comment.

mourner on 3 Jun 2019

hrm i think we intentionally made range work like this. it's the byte offset in the file, which is what you'd want if you wanted to seek to read that part of the file from disk. but it's not what you'd pass to substring in JS, which is what tools like flow-remove-types do.

mroch on 3 Jun 2019

cc @nmote I think we should change range to match esprima and babylon, and potentially add (an option for) another field like byte_range or something. will need to figure out which tools need to be updated to use that.

mroch on 3 Jun 2019

👍1

https://github.com/facebook/flow/blob/a05cbb202197d57e5e909e11dcbb859f11c48641/src/parser/offset_utils.mli#L8-L30

... yup

mroch on 3 Jun 2019

Thanks for looking into this @mroch! Agreed about bringing Flow AST more in line with other parsers.

In the meantime, to fix the breaking bug with flow-remove-types sooner, should we just rewrite the logic so that it only depends on line/column, not range? I could take a stab at a PR.

mourner on 3 Jun 2019

alternatively, if it could work with UTF8 bytes instead of JS strings, that would make the existing ranges correct.

for example, perhaps changing flow-remove-types to use Buffers? I think the offsets of a Buffer are in bytes, not codepoints. modifying a buffer in place instead of lots of string concatenation could be a perf win too.
https://github.com/facebook/flow/blob/a05cbb202197d57e5e909e11dcbb859f11c48641/packages/flow-remove-types/index.js#L107

mroch on 3 Jun 2019

👍1

@mroch yep, will do!

mourner on 3 Jun 2019

We are running into the same problem, in our case with a ™ in a comment.

// @flow
import React from 'react';

/*
  This should Just Work™
*/
export const SomeComponent = (props: Object) => (
  <div {...props}>
    This breaks
  </div>
);

becomes

import React from 'react';

/*
  This should Just Work™
*/
export const SomeComponent = (props:=> (
  <div {...props}>
    This breaks
  </div>
);

Looks like it's been a few months now, and it shows up in all v2 releases of flow-remove-types -- any updates on a fix?

walkerdb on 17 Aug 2019

Was this page helpful?

0 / 5 - 0 ratings

Related issues

Export module commonJS problem

Beingbook · 3Comments

Are there some good way to cast `List<number>` to `List<number|string>` ?

iamchenxin · 3Comments

"const abc: string = {}.foo;" is wrong but passes type-checking.

damncabbage · 3Comments

DefaultProps don't work when using functional components

marcelbeumer · 3Comments

How to define custom method of HTMLElement?

mjj2000 · 3Comments