Godot-proposals: Make division operator less error prone without a breaking change

Created on 24 Nov 2020 · 6Comments · Source: godotengine/godot-proposals

Describe the project you are working on:

Not so relevant in this case.

Describe the problem or limitation you are having in your project:

Encountering bugs caused by ambiguity of float vs integer division.

Describe the feature / enhancement and how it helps to overcome the problem or limitation:

This is a follow up proposal to #1866. Overall I think @dalexeev's proposal would be a much cleaner solution purely from a language design perspective, but has the downside of:

the breaking change,
people coming from statically typed languages expect a single division operator (which however can be turned around to: people coming from dynamically typed languages expect float and integer operators to be different operators).

The idea of this proposal is:

Make no change of the behavior of /, i.e., keep a single division operator.
Provide some linting rules that help to catch the majority of the typical division bugs arising in dynamic contexts.

The fundamental question is if GDScript is closer to a dynamically or a statically typed language.

Python 3 and GDScript both support type annotations, but there is an important difference. In GDScript type annotations actually lead to type conversions, whereas in Python they don't:

Python

def f(x: float):
    print(type(x))

f(42)
# prints: <class 'int'>

GDScript

func f(x: float):
    print(typeof(x) == TYPE_REAL)   # should be called TYPE_FLOAT

f(42)
# prints: True

This also means that in Python having the classic single / division behavior would be a bug even with type annotations:

def f(a: float, b: float):
    # type annotations don't convert potential ints to float, i.e., the division would not necessarily
    # be a float division
    assumed_float_division_result = a / b

f(5, 2)

Compared to GDScript with type annotations:

func f(a: float, b: float):
    # type annotations do convert anything to float, i.e., the division is guaranteed to be a float division
    var assumed_float_division_result = a / b

f(5, 2)

Thus, the way type annotations work in GDScript bring it closer to a statically typed language. In general I would conclude that the division semantics are sane if the types of the operands are known -- like it is the case in statically typed languages.

This leaves the case when operands have unknown types. Omitting type annotations opens the door for the classic division bug of dynamically typed language: Integer values creep into a place where they aren't expected and thus the wrong return type is obtained:

func f(a, b):
    # the classic division bug
    var assumed_float_division_result = a / b

f(5, 2)

The idea of this proposal is to at least highlight / catch such cases via a linter rule producing something like:

var assumed_float_division_result = a / b
# Encountered error-prone division with unknown operand types. Consider giving the operands an explicit type.

There would be no linter warning if:

both operand types are known (which includes the common int / int case)
only one of the operands is known and it is a float. This case should be fine, because integer division can already be excluded.

The latter would allow to silence the warning in the line above via the typical:

var assumed_float_division_result = float(a) / b

There are some use cases where users really want to have full dynamic typing (see this as an example), in which case they really cannot annotate the types of the operands at all. In such a case the user could explicitly silence the linter to express "trust me, I know what I'm doing here". To illustrate on the generic normalize_values function:

func normalize_values(values, reference_value):
    var new_values = []
    for value in values:
        # Deliberately apply the "multiply by 1.0" trick to convert integers to float
        # in a duck typing compatible way. Requires to tell the linter that we know
        # that this untyped division is fine.
        new_values.append(value * 1.0 / reference_value) # nolint
    return new_values

(I don't know if there is already a mechanism to silence the linter on a per line basis. Just using the # nolint as an example here.)

The feasibility of this proposal depends on whether GDScript's type inference system is already strong enough to infer the type in non-trivial cases, i.e., could it see that function_returning_float() * 1.0 * local_int * local_float is of type float?

This basically leads back to the question: Is GDScript statically typed "enough" to afford a / operator like in true statically typed languages, or is its nature still more dynamic?

Describe how your proposal will work, with code, pseudocode, mockups, and/or diagrams:

See above.

If this enhancement will not be used often, can it be worked around with a few lines of script?:

Language property.

Is there a reason why this should be core and not an add-on in the asset library?:

Language property.

gdscript

Source

bluenote10

👍3 👀2

Most helpful comment

@reduz

Having seen your thoughts on IRC, I'd really like to reach out to you. While I fully trust you in so many things, I feel like you may be missing an important aspect in this particular case.

It's the same thing either way, you can hit a bug dividing two integers and getting a float unexpectedly, or dividing two integers and getting an integer unexpectedly.

As far as I can see, this really isn't the case. The source of bugs is not just about a developer expecting one behavior, while it is the other way around. The core issue is the "integer-finding-their-way-into-expressions-bug", and that only exists with the single division operator. This bug cannot occur with the two division solution! So it is wrong to conclude you can go either way, and you just get the bugs from mixing up the behavior. The source of bugs really is asymmetric. There is entire class of bugs that can only occur in one direction, but not in the other. This is not a "tabs vs spaces" discussion as many mistakenly view it.

Incidentally the original issue triggering the proposal was exactly such a "integer-finding-their-way-into-expressions-bug", in this case in disguise of replacing an internal type having changed from Vector2 to Vector2i (the viewport) which had messed up the semantics of an internal division (dividing the x / y of the vector). A nasty time-wasting bug that would have been entirely preventable. Note that with the current division operator we would have to write float(vector.x) / vector.y everywhere to be safe w.r.t. to any potential Vector2 => Vector2i refactoring, which is totally crazy.

Since you haven't touched the "integer creeping in problem" in your reasoning, it makes sense to arrive at the conclusion:

I also dislike this from python 3 and feel it was completely unecesary

I tried my best to illustrate this problem in the example above, but perhaps PEP 238 does a better job:

Python doesn't have argument type declarations, so integer arguments can easily find their way into an expression.

The problem is particularly pernicious since ints are perfect substitutes for floats in all other circumstances: math.sqrt(2) returns the same value as math.sqrt(2.0), 3.14100 and 3.14100.0 return the same value, and so on. Thus, the author of a numerical routine may only use floating point numbers to test his code, and believe that it works correctly, and a user may accidentally pass in an integer input value and get incorrect results.

After using Python for 15 years, I am absolute certain that this has prevented many bugs, and it was a wise decision. In the early days it was a typical issue to run some_algorithm(epsilon=0.001) successfully, but running some_algorithm(epsilon=1) suddenly lead to weird/buggy behavior due to the integer creeping into an internal division which had switched the division semantics. Note that these bugs are silent, and were quite tedious to identify. It also often raised the question who is supposed to solve the problem, because library authors would argue that you are not supposed to pass in integers. PEP 238 has solved these issues for good.

As long as you view the Python 3 decision as "completely unnecessary", you do not seem to have a good foundation to make an informed decision.

That being said, what the example also illustrates however, is why it is much less of an issue in GDScript if type annotations are involved thanks to type conversions.

The question you may have to answer for yourself: _Would you like to support writing numerical routines without type annotations?_

If so, splitting the division would be highly recommended to prevent the "integer-creeping-in-problem".
Otherwise the above proposal should be sufficient to catch many bugs and force users more towards using GDScript in a statically typed style.

bluenote10 on 25 Nov 2020

👍2

All 6 comments

Note that the proposed change to the behavior of the division operator does not actually break compatibility as much as it might seem at first glance. See comment for details.

dalexeev on 25 Nov 2020

That's a good proposal, but it is actually partially implemented already:

    var hours := 1337
    var days = hours / 24
    print(days)

raises:

W 0:00:00.657   Integer division, decimal part will be discarded.
  <C++ Error>   INTEGER_DIVISION
  <Source>      level.gd:13

I say partially because it seems to only happen when at least one of the operands has a type hint (here inferred to int). This doesn't trigger any warning:

    var hours = 1337
    var days = hours / 24
    print(days)

This might be considered a bug.

Aside from that I agree with your analysis about the (increasingly) statically typed nature of GDScript (even more so GDScirpt 2.0) which makes it more appropriate for it to behave like a statically typed scripting language (like C#) than a fully dynamically typed scripting language (like Python 3).

Reflecting upon type hints and their different behavior in GDScript and Python 3, it's also worth noting that type hints were added in Python 3.6 so they were not part of the decision making for introducing the // operator. And your further explanations on how type hints don't coerce types in Python definitely make the need for separate operators very clear.

akien-mga on 25 Nov 2020

👍2

I say _partially_ because it seems to only happen when at least one of the operands has a type hint (here inferred to int).

This might be considered a bug.

@akien-mga That is, after fixing the bug, the following code should produce Integer Division warning?

var a = 5 / 2

Thus, the purpose of Integer Division warning is to familiarize the user with the mathematically unexpected behavior of the division operator and then be disabled in the project settings?

dalexeev on 25 Nov 2020

Thus, the purpose of Integer Division warning is to familiarize the user with the mathematically unexpected behavior of the division operator and then be disabled in the project settings?

I don't fully understand the current Integer Division warning yet, but it doesn't make so much sense to me. In general, the linter shouldn't be a tool to teach the language, but rather a tool to hint for potential bugs. If we fully commit to the current / division, we at some point have to assume that users mean what they have written. Then it doesn't make sense to hint every int / int division to remind the user that this is an integer division. Basically such a lint would merely tell the user that the language has a design bug ;).

In any case, under my proposal 5 / 2 would be no warning because both types are statically known, and then there can be at least be no bugs from dynamic typing -- only from lack of language behaviour knowledge.

The current Integer Division warning also indicates that perhaps 5 // 2 would be clearer ;)

bluenote10 on 25 Nov 2020

👍1

@reduz

Having seen your thoughts on IRC, I'd really like to reach out to you. While I fully trust you in so many things, I feel like you may be missing an important aspect in this particular case.

It's the same thing either way, you can hit a bug dividing two integers and getting a float unexpectedly, or dividing two integers and getting an integer unexpectedly.

Since you haven't touched the "integer creeping in problem" in your reasoning, it makes sense to arrive at the conclusion:

I also dislike this from python 3 and feel it was completely unecesary

I tried my best to illustrate this problem in the example above, but perhaps PEP 238 does a better job:

Python doesn't have argument type declarations, so integer arguments can easily find their way into an expression.

The problem is particularly pernicious since ints are perfect substitutes for floats in all other circumstances: math.sqrt(2) returns the same value as math.sqrt(2.0), 3.14100 and 3.14100.0 return the same value, and so on. Thus, the author of a numerical routine may only use floating point numbers to test his code, and believe that it works correctly, and a user may accidentally pass in an integer input value and get incorrect results.

As long as you view the Python 3 decision as "completely unnecessary", you do not seem to have a good foundation to make an informed decision.

That being said, what the example also illustrates however, is why it is much less of an issue in GDScript if type annotations are involved thanks to type conversions.

The question you may have to answer for yourself: _Would you like to support writing numerical routines without type annotations?_

If so, splitting the division would be highly recommended to prevent the "integer-creeping-in-problem".
Otherwise the above proposal should be sufficient to catch many bugs and force users more towards using GDScript in a statically typed style.

bluenote10 on 25 Nov 2020

👍2

Having seen your thoughts on IRC, I'd really like to reach out to you. While I fully trust you in so many things, I feel like you may be missing an important aspect in this particular case.

Yes it would be nice. Unfortunately, I am not very accurate in expressing my thoughts in English, and much of this is unclear to those who do not yet see the problem. But since you understand what I wanted to say and you also have your own arguments, you could better explain this.