DefinitelyTyped: complete refactor of [underscore] - discussion

Created on 28 Jun 2019 · 17 comments · Source: DefinitelyTyped/DefinitelyTyped

The current underscore implementation is missing a lot of pieces and has many bugs in the entire Chain flow.

I want to completely rewrite the library and try to reach 100% test coverage.

__This means the new version will not be backwards compatible!!!__

The first big change involves separating each method into its own file (same folder structure as @types/lodash). Each file will contain all of the method's implementations in the UnderscoreStatic, Underscore, and _Chain interfaces. This will make the typings easier to maintain and read, since each method has 3 separate implementations and it is currently hard to see the correlation between them because of the 6k-line main file...

The second main change is that some iteratee types are currently completely disregarded. Most of them are easy to add, but one specific type is a bit more complicated - https://underscorejs.org/#property
I can easily implement it using https://github.com/pirix-gh/ts-toolbelt, but that requires TypeScript v3.5 (which will cause a chained dependency change in 18 more dependent libraries) - @jgonggrijp what do you think about that change?

The property change is simple and will look something like this:

import { Object, Any, Union } from 'ts-toolbelt';

export function getProp<O extends object, P extends string[]>(obj: Union.Nullable<O>,
                                                              ...keys: Any.Cast<P, Object.PathValid<O, P>>)
  : Object.Path<O, P> | undefined {
  return keys.reduce(
    (result: any, key: string) => (result === null || result === undefined) ? undefined : result[key],
    obj);
}
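
A hypothetical usage of the sketch above (assuming ts-toolbelt's Path/PathValid types resolve the way I expect):

// Hypothetical example; the exact inferred types depend on ts-toolbelt.
const user = { address: { city: 'Oslo' } };

const city = getProp(user, 'address', 'city');   // ideally inferred as string | undefined
// const zip = getProp(user, 'address', 'zip');  // ideally rejected as an invalid path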

All 17 comments

Finished version of map for example:

// https://underscorejs.org/#map

import {Object, Any} from 'ts-toolbelt';
import _ = require("../index");
import {Dictionary, List, TypeOfList, TypeOfDictionary} from "../index";

declare module "../index" {

    interface UnderscoreStatic {

        /**
         * Produces a new array of values by mapping each value in list through a transformation function
         * (iterator). If the native map method exists, it will be used instead. If list is a JavaScript
         * object, iterator's arguments will be (value, key, object).
         * @param list Maps the elements of this array.
         * @param iterator Map iterator function for each element in `list`.
         * @param context `this` object in `iterator`, optional.
         * @return The mapped array result.
         **/
        map<T extends TypeOfList<V>, TResult, V extends List<any> = List<T>>(
            list: V,
            iterator: ListIterator<T, TResult, V>,
            context?: any): TResult[];

        map<T extends TypeOfList<V>, K extends IterateeProperty<T>, V extends List<any> = List<T>>(
            list: V,
            iterator: K,
            context?: any): T[K][];

        map<T extends TypeOfList<V>, K extends string[], V extends List<any> = List<T>>(
            list: V,
            iterator: Any.Cast<K, Object.PathValid<V, K>>,
            context?: any): Array<Object.Path<V, K> | undefined>;

        map<T extends TypeOfList<V>, V extends List<any> = List<T>>(
            list: V,
            iterator: IterateeMatcherShorthand<T>,
            context?: any): boolean[];

        map<T extends TypeOfList<V>, V extends List<any> = List<T>>(
            list: V,
            iterator: null,
            context?: any): V;

        /**
         * @param object Maps the properties of this object.
         * @param iterator Map iterator function for each property on `object`.
         * @param context `this` object in `iterator`, optional.
         * @return The mapped object result.
         **/
        map<T extends TypeOfDictionary<V>, TResult, V extends Dictionary<any> = Dictionary<T>>(
            object: V,
            iterator: ObjectIterator<T, TResult, V>,
            context?: any): TResult[];

        map<T extends TypeOfDictionary<V>, K extends IterateeProperty<T>, V extends Dictionary<any> = Dictionary<T>>(
            object: V,
            iterator: K,
            context?: any): T[K][];

        map<T extends TypeOfDictionary<V>, K extends string[], V extends Dictionary<any> = Dictionary<T>>(
            object: V,
            iterator: Any.Cast<K, Object.PathValid<V, K>>,
            context?: any): Array<Object.Path<V, K> | undefined>;

        map<T extends TypeOfDictionary<V>, V extends Dictionary<any> = Dictionary<T>>(
            object: V,
            iterator: IterateeMatcherShorthand<T>,
            context?: any): boolean[];

        map<T extends TypeOfDictionary<V>, V extends Dictionary<any> = Dictionary<T>>(
            object: V,
            iterator: null,
            context?: any): T[];

        collect: typeof _.map;

    }

    interface _List<T extends TypeOfList<V>, V extends List<any>> {

        map<TResult>(iterator: ListIterator<T, TResult, V>,
                     context?: any): TResult[];

        map<K extends IterateeProperty<T>>(iterator: K,
                                           context?: any): T[K][];

        map<K extends string[]>(iterator: Any.Cast<K, Object.PathValid<V, K>>,
                                context?: any): Array<Object.Path<V, K> | undefined>;

        map(iterator: IterateeMatcherShorthand<T>,
            context?: any): boolean[];

        map(iterator: null, context?: any): V;

        collect<TResult>(iterator: ListIterator<T, TResult, V>,
                         context?: any): TResult[];

        collect<K extends IterateeProperty<T>>(iterator: K,
                                               context?: any): T[K][];

        collect<K extends string[]>(iterator: Any.Cast<K, Object.PathValid<V, K>>,
                                    context?: any): Array<Object.Path<V, K> | undefined>;

        collect(iterator: IterateeMatcherShorthand<T>,
                context?: any): boolean[];

        collect(iterator: null, context?: any): V;

    }

    interface _Object<T extends TypeOfDictionary<V>, V extends Dictionary<any>> {

        map<TResult>(iterator: ObjectIterator<T, TResult, V>,
                     context?: any): TResult[];

        map<K extends IterateeProperty<T>>(iterator: K,
                                           context?: any): T[K][];

        map<K extends string[]>(iterator: Any.Cast<K, Object.PathValid<V, K>>,
                                context?: any): Array<Object.Path<V, K> | undefined>;

        map(iterator: IterateeMatcherShorthand<T>,
            context?: any): boolean[];

        map(iterator: null,
            context?: any): T[];

        collect<TResult>(iterator: ObjectIterator<T, TResult, V>,
                         context?: any): TResult[];

        collect<K extends IterateeProperty<T>>(iterator: K,
                                               context?: any): T[K][];

        collect<K extends string[]>(iterator: Any.Cast<K, Object.PathValid<V, K>>,
                                    context?: any): Array<Object.Path<V, K> | undefined>;

        collect(iterator: IterateeMatcherShorthand<T>,
                context?: any): boolean[];

        collect(iterator: null,
                context?: any): T[];
    }

    interface _ChainList<T extends TypeOfList<V>, V extends List<any>> {

        map<TResult>(iterator: ListIterator<T, TResult, V>,
                     context?: any): _ChainListReturn<TResult>;

        map<K extends IterateeProperty<T>>(iterator: K,
                                           context?: any): _ChainListReturn<T[K]>;

        map<K extends string[]>(iterator: Any.Cast<K, Object.PathValid<V, K>>,
                                context?: any): _ChainListReturn<Object.Path<V, K> | undefined>;

        map(iterator: IterateeMatcherShorthand<T>,
            context?: any): _ChainListReturn<boolean>;

        map(iterator: null, context?: any): _ChainListReturn<T>;

        collect<TResult>(iterator: ListIterator<T, TResult, V>,
                         context?: any): _ChainListReturn<TResult>;

        collect<K extends IterateeProperty<T>>(iterator: K,
                                               context?: any): _ChainListReturn<T[K]>;

        collect<K extends string[]>(iterator: Any.Cast<K, Object.PathValid<V, K>>,
                                    context?: any): _ChainListReturn<Object.Path<V, K> | undefined>;

        collect(iterator: IterateeMatcherShorthand<T>,
                context?: any): _ChainListReturn<boolean>;

        collect(iterator: null, context?: any): _ChainListReturn<T>;
    }

    interface _ChainObject<T extends TypeOfDictionary<V>, V extends Dictionary<any>> {

        map<TResult>(iterator: ObjectIterator<T, TResult, V>,
                     context?: any): _ChainListReturn<TResult>;

        map<K extends IterateeProperty<T>>(iterator: K,
                                           context?: any): _ChainListReturn<T[K]>;

        map<K extends string[]>(iterator: Any.Cast<K, Object.PathValid<V, K>>,
                                context?: any): _ChainListReturn<Object.Path<V, K> | undefined>;

        map(iterator: IterateeMatcherShorthand<T>,
            context?: any): _ChainListReturn<boolean>;

        map(iterator: null,
            context?: any): _ChainListReturn<T>;

        collect<TResult>(iterator: ObjectIterator<T, TResult, V>,
                         context?: any): _ChainListReturn<TResult>;

        collect<K extends IterateeProperty<T>>(iterator: K,
                                               context?: any): _ChainListReturn<T[K]>;

        collect<K extends string[]>(iterator: Any.Cast<K, Object.PathValid<V, K>>,
                                    context?: any): _ChainListReturn<Object.Path<V, K> | undefined>;

        collect(iterator: IterateeMatcherShorthand<T>,
                context?: any): _ChainListReturn<boolean>;

        collect(iterator: null,
                context?: any): _ChainListReturn<T>;

    }

}

Let's be clear. By "complete rewrite", do you mean

  1. "throw away the existing code and start over", or
  2. "preserve all code, but carefully examine everything and change/reorder as needed"?

I believe both options would be a lot of work, perhaps even in the same order of magnitude as rewriting Underscore itself.

I would vote against option 1. Not because I think the code is good as it is, but because throwing away code to replace it has such strong inherent disadvantages that there is simply no way to justify it. Joel Spolsky wrote a good article on this long ago.

I'm OK with option 2, but if you really want to go this route, I warn against making it an all-or-nothing affair. Rewrite the entire thing if you want, but spread the changes over many pull requests. This way, everyone gets to benefit from your work right from the start. In addition, small PRs are more likely to be approved (if only because it is easier for reviewers to find the time), having many reviews will help you stay in sync with what other people think, and having many small steps with rewards (in the form of approvals) will also make it easier for you to stay motivated until you decide you have completed your mission.

Now to answer your question more specifically:

The second main change is that some iteratee types are currently completely disregarded. Most of them are easy to add, but one specific type is a bit more complicated - https://underscorejs.org/#property
I can easily implement it using https://github.com/pirix-gh/ts-toolbelt, but that requires TypeScript v3.5 (which will cause a chained dependency change in 18 more dependent libraries) - @jgonggrijp what do you think about that change?

I find it hard to say because of a couple of things I don't know:

  • How bad/difficult is the alternative? What will go wrong if you try to write a _.property typing that is compatible with TypeScript 2.9?
  • How bad is it to drop support for TypeScript <3.5? What proportion of Underscore+TypeScript users will be on an older version of TypeScript when you release this change into the world?
  • How happy are the DefinitelyTyped maintainers with external dependencies such as ts-toolbelt?

This is also a reason to submit many small PRs; more questions like this will come up as you go, and having them one at a time makes them easier to digest for everyone involved.

@jgonggrijp thanks for the reply!

Let's be clear. By "complete rewrite", do you mean

  1. "throw away the existing code and start over", or
  2. "preserve all code, but carefully examine everything and change/reorder as needed"?

Sadly I mean 1 - the current interface structure/architecture is just wrong - it allows you to perform dictionary operations on lists and vice versa, and it fails to preserve the actual type of the wrapped object (when using the constructor or chain flows). In order to tackle that I had to split the Underscore and _Chain interfaces into 2:

    interface _Dictionary<T extends TypeOfDictionary<V>, V extends Dictionary<any>> {

        chain(): _ChainDictionary<T, V>;

        value(): V

    }

    interface _List<T extends TypeOfList<V>, V extends List<any>> {

        chain(): _ChainList<T, V>;

        value(): V

    }

    interface _ChainDictionary<T extends TypeOfDictionary<V>, V extends Dictionary<any>> {

        chain(): _Chain<T, V>;

        value(): V

    }

    interface _ChainList<T extends TypeOfList<V>, V extends List<any>> {

        chain(): _Chain<T, V>;

        value(): V

    }

And I had to change all the method signatures to return the proper one based on their behavior.
This kind of change is hard to perform gradually, and because the change is a breaking change anyway, I think it would be best to release it all at once.
Moreover, it is impossible to maintain the current library as it is ~6k lines of code and each method is spread across those lines.
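
To make the "return the proper one based on their behavior" point concrete, here is a minimal toy sketch (simplified wrapper interfaces for illustration only, not the actual proposed ones):

    // Toy illustration: the wrapper you get back depends on what the operation
    // produces, not on the wrapper you started from.
    interface ListWrapper<T> {
        // map stays in list land:
        map<U>(iterator: (value: T) => U): ListWrapper<U>;
        value(): T[];
    }

    interface DictionaryWrapper<T> {
        // values always produces a list, so it hands back the list wrapper:
        values(): ListWrapper<T>;
        // mapObject stays in dictionary land:
        mapObject<U>(iterator: (value: T) => U): DictionaryWrapper<U>;
        value(): { [key: string]: T };
    }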

I believe both options would be a lot of work, perhaps even in the same order of magnitude as rewriting Underscore itself.

I agree it is a lot of work, but I don't mind putting in the time.

I would vote against option 1. Not because I think the code is good as it is, but because throwing away code to replace it has such strong inherent disadvantages that there is simply no way to justify it. Joel Spolsky wrote a good article on this long ago.

I agree with you, and it saddens me that this is how it has to be done... I believe that because I will add (or at least try to add) 100% coverage tests (in addition to making sure the current tests pass), the disadvantages will be somewhat reduced.

  • How bad/difficult is the alternative? What will go wrong if you try to write a _.property typing that is compatible with TypeScript 2.9?

I can skip these additions and keep the current everything-is-any generics implementation, but I want to make the typings airtight and that is the way to go. Adding proper typings without ts-toolbelt will result in a mega-verbose implementation like the current partial method https://github.com/DefinitelyTyped/DefinitelyTyped/blob/3fee2364dfda7c2beb65dc6e1980e41b2ed30751/types/underscore/index.d.ts#L3411-L3420
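
For comparison, a rough sketch (illustrative only, not the existing signatures) of the overload-per-depth style that would be needed without ts-toolbelt:

    // Illustrative only: without ts-toolbelt, each supported path depth needs
    // its own overload, much like the current partial typings.
    declare function property<K1 extends string>(
        path: [K1]
    ): <T extends Record<K1, any>>(obj: T) => T[K1];

    declare function property<K1 extends string, K2 extends string>(
        path: [K1, K2]
    ): <T extends Record<K1, Record<K2, any>>>(obj: T) => T[K1][K2];

    // ...and so on, one extra overload for every additional level of nesting.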

  • How bad is it to drop support for TypeScript <3.5? What proportion of Underscore+TypeScript users will be on an older version of TypeScript when you release this change into the world?

Just like with the other changes I made before - they can keep using the version of the library from before this breaking change.

  • How happy are the DefinitelyTyped maintainers with external dependencies such as ts-toolbelt?

Not sure - do you know who can answer that?

This is also a reason to submit many small PRs; more questions like this will come up as you go, and having them one at a time makes them easier to digest for everyone involved.

I agree, but I currently don't see a way to achieve it gradually.

@jgonggrijp thanks for the reply!

Thanks for openly discussing your plans. Since this is FOSS, you don't need anyone's agreement.

I'm not convinced yet that you really need to go for option 1. This doesn't have to stop you, but I hope that either of us will be able to convince the other that option 2 is/isn't available.

Sadly I mean 1 - the current interface structure/architecture is just wrong - it allows you to perform dictionary operations on lists and vice versa, and it fails to preserve the actual type of the wrapped object (when using the constructor or chain flows).

I do agree that a lot is wrong with the typings.

In order to tackle that I had to split the Underscore and _Chain interfaces into 2:

code

While I haven't looked into it carefully, I'm definitely willing to believe that such a split is necessary.

And I had to change all the method signatures to return the proper one based on their behavior.
This kind of change is hard to perform gradually

I don't see why this is the case. I can imagine a multistep plan with a separate PR after each step:

  1. Add the _Dictionary, _List, _ChainDictionary and _ChainList interfaces, initially empty. Preserve the soon-to-be-deprecated Underscore and _Chain interfaces for the time being. Allow lossy degradation from the new interfaces to the old ones.
  2. Move a single method (e.g., filter) to the new interfaces, leaving a forwarding alias on the old interfaces.

  ...  Repeat (2) with every other method.
  N.  Drop the deprecated interfaces.

Why wouldn't this work?

and because the change is a breaking change anyway, I think it would be best to release it all at once.

A single pull request with 100 breaking changes is much harder to get approval for (or in fact, to get a review for at all) than 100 pull requests with one breaking change each.

Moreover, it is impossible to maintain the current library as it is ~6k lines of code and each method is spread across those lines.

Why not move the methods into separate files one at a time? For example, together with moving them to the new interfaces?

I believe that because I will add (or at least try to add) 100% coverage tests (in addition to making sure the current tests pass), the disadvantages will be somewhat reduced.

Making sure that the existing tests keep passing is certainly a good idea. Please also keep an eye on real client code, though. Nobody's going to be happy if they suddenly need to overhaul all of their code because of 100 breaking changes, whether the tests pass or not.

  • How bad/difficult is the alternative? What will go wrong if you try to write a _.property typing that is compatible with TypeScript 2.9?

I can skip these additions and keep the current everything-is-any generics implementation, but I want to make the typings airtight and that is the way to go. Adding proper typings without ts-toolbelt will result in a mega-verbose implementation like the current partial method

Just to throw in an idea, wouldn't it be possible to make something like the following inductive typing work?

// May need to wrap the first branch of the conditional in an interface because of recursion.
type HasPath<K extends keyof any, V> = K extends (infer K1) | (infer Krest) ? {
    [Name in K1]: HasPath<Krest, V>;
} : {
    [Name in K]: V;
}

property<K extends keyof any>(key: K):
    <V, T extends HasPath<K, V>>(obj: T) => V;

property<K extends keyof any>([key: K]): typeof property(key);

property<
    K1 extends keyof any, K2 extends keyof any, Krest extends keyof any
>([key1: K1, key2: K2, ...keyrest: Krest[]]):
    <V, T extends HasPath<K1, HasPath<K2 | Krest, V>>>(obj: T) =>
        typeof property([key2, ...keyrest])(obj[key1]);
  • How bad is it to drop support for TypeScript <3.5? What proportion of Underscore+TypeScript users will be on an older version of TypeScript when you release this change into the world?

Just like with the other changes I made before - they can keep using the version of the library from before this breaking change.

Sure, but if everyone is stuck in the past, nobody will benefit from your improvements. From 2.9 to 3.5 crosses a major version, so it's a more intrusive upgrade than from 2.3 to 2.9.

  • How happy are the DefinitelyTyped maintainers with external dependencies such as ts-toolbelt?

Not sure - do you know who can answer that?

Probably either @sandersn or somebody he knows. I suggest mentioning him in a new post with just the question, so he doesn't need to catch up with the rest of the discussion.

I'm not convinced yet that you really need to go for option 1. This doesn't have to stop you, but I hope that either of us will be able to convince the other that option 2 is/isn't available.

I hope so too!

I don't see why this is the case. I can imagine a multi step plan with a separate PR after each step:

  1. Add the _Dictionary, _List, _ChainDictionary and _ChainList interfaces, initially empty. Preserve the soon-to-be-deprecated Underscore and _Chain interfaces for the time being. Allow lossy degradation from the new interfaces to the old ones.
  2. Move a single method (e.g., filter) to the new interfaces, leaving a forwarding alias on the old interfaces.

  ...  Repeat (2) with every other method.
  N.  Drop the deprecated interfaces.

Why wouldn't this work?

The reason it will not work is that the new interfaces will be 99% empty at the beginning, so if some method returns them, you won't be able to continue chaining.
For example, if in the first phase we moved only map, then the following would break:

_.chain(arr).map(...).filter()

as filter won't exist on the new interface. Do you agree?

A single pull request with 100 breaking changes is much harder to get approval for (or in fact, to get a review for at all) than 100 pull requests with one breaking change each.

I completely agree

Why not move the methods into separate files one at a time? For example, together with moving them to the new interfaces?

If we can agree on a plan to make the change gradually, of course we can defer the file structure change as well.

Making sure that the existing tests keep passing is certainly a good idea. Please also keep an eye on real client code, though. Nobody's going to be happy if they suddenly need to overhaul all of their code because of 100 breaking changes, whether the tests pass or not.

The breaking changes will not affect you at all if you used the typings without specifying types explicitly in the generic parts.
I can try to maintain the order of the generics where applicable, which will reduce the amount of code that breaks, at the expense of uniform code styling.
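
For example (a hedged illustration; the explicit call assumes the current <T, TResult> parameter order):

import _ = require('underscore');

// Call sites that rely purely on inference keep compiling even if the
// generic parameters get reshuffled:
const lengths: number[] = _.map(['a', 'bb', 'ccc'], s => s.length);

// Call sites that spell out the type arguments explicitly are the ones that
// may need touching when the generics change:
const lengthsExplicit = _.map<string, number>(['a', 'bb', 'ccc'], s => s.length);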

Just to throw in an idea, wouldn't it be possible to make something like the following inductive typing work?

Just from a quick look, it doesn't seem like it will do the trick. I think we can defer that decision until later - first make all the relevant changes that don't require the ts-toolbelt dependency, and then start working on the missing types and how to achieve them.

Sure, but if everyone is stuck in the past, nobody will benefit from your improvements. From 2.9 to 3.5 crosses a major version, so it's a more intrusive upgrade than from 2.3 to 2.9.

I agree, but sometimes progress is required from both sides of the coin :-) I too hate it when a library I work with stops supporting my current platform, but I understand that the problem is with me, not them - I should be more agile and be able to advance with everyone else. In any case that is my 2 cents on the matter.

Probably either @sandersn or somebody he knows. I suggest mentioning him in a new post with just the question, so he doesn't need to catch up with the rest of the discussion.

Will do

I got here from #36575. I have a couple of comments:

  1. Compatibility: Old versions of @types/underscore remain available if you start requiring 3.5. Since underscore is a stable package -- not many updates for the last 5 years -- it means that those old versions of the types will still be useful for many years as long as they don't need fixes themselves.
  2. Compatibility 2: If you want to maintain 3.5 and pre-3.5 versions simultaneously, see https://github.com/DefinitelyTyped/DefinitelyTyped#i-want-to-use-features-from-typescript-31-or-above. The main reason to do this is to allow the pre-3.5 version to get bug fixes, or to update when new versions of underscore are published.
  3. Compile time vs coverage: Increasing type coverage necessarily increases complexity of the types in underscore, and total coverage of a package like underscore [1] will be incredibly expensive to compile. And since underscore is a collection library, it will be part of a much bigger program that may not be able to compile if underscore balloons in size.

If you do make the change, be sure to keep compile time and memory in mind the whole time.

[1] a collection library designed without types in mind, specifically.

@sandersn thanks for the detailed response!
Your suggestion to use a separate library for the higher TS version is a perfect solution.
I will make the changes without the advanced typings first, and then I will continue with the advanced types in the forked version.

@regevbr

I can imagine a multi step plan with a separate PR after each step:
(plan)
Why wouldn't this work?

The reason it will not work is that the new interfaces will be 99% empty at the beginning, so if some method returns them, you won't be able to continue chaining.
For example, if in the first phase we moved only map, then [_.chain(arr).map(...).filter()] would break (...) as filter won't exist on the new interface. Do you agree?

I agree that this technical challenge exists, but this is the reason why I wrote "Allow lossy degradation from the new interfaces to the old ones" in step 1 and "Drop the deprecated interfaces" in the final step. After some more thought, I still think this can be made to work and I have a concrete-ish suggestion:

In step 1, introduce the new interfaces, keep the deprecated ones alongside them, and also add temporary compatibility interfaces, as below. I'm showing the principle for Underscore; _Chain would follow the same pattern. I'm also omitting the type parameters in order to keep it short.

// The interface to be deprecated
interface Underscore {
    // still contains all of the methods
}

// Compatibility interfaces
// (both equivalent to the deprecated interface for now)
// (the intersections will be removed in the end but the names will stay)
type _Dictionary = Underscore & _FutureDictionary;
type _List = Underscore & _FutureList;

// Future interfaces, both empty for now
// (these names are not used anywhere else)
interface _FutureDictionary {}
interface _FutureList {}

Intermediate result:

  • Underscore is still there and still has all the methods, so nothing breaks yet.
  • The new interfaces _List and _Dictionary are introduced and already contain all the methods they should contain in the end. They also still contain lots of method signatures they shouldn't contain, but at least users can already start rewriting their code to use _List and _Dictionary instead of Underscore.

In each step from 2 through N-1, move/split one method from Underscore to _FutureList and/or _FutureDictionary. Use _List and _Dictionary in the signature(s) of the updated method. Also cause other breaking changes in this method if you see a reason to do so. Intermediate result of each step:

  • A small subset of users will experience this as a breaking change, which may prompt them to start adapting their code.
  • The method's signatures are only available from the interfaces where they belong, i.e., signatures that belong in _List are not available from _Dictionary anymore and vice versa. In other words, the method is fixed.
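
To make this concrete, here is a hedged sketch of what one such step could look like for filter, continuing the snippet above (type parameters still omitted; the signatures are placeholders rather than the final ones):

// filter's signatures are removed from the deprecated Underscore interface
// and reappear only on the interfaces where they belong.
interface _FutureList {
    filter(predicate: (value: any, index: number, list: any) => boolean,
           context?: any): any[];
}

interface _FutureDictionary {
    // _.filter on an object still yields an array of values:
    filter(predicate: (value: any, key: string, object: any) => boolean,
           context?: any): any[];
}

// The _Chain counterparts would get matching signatures that return the
// corresponding chain wrappers instead of plain arrays.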

In the final step (N), remove the deprecated interface and the intersection types and remove the Future part from the names of the new interfaces (so the new interfaces take over the names of the temporary intersection types). All other code already uses the names without Future so it doesn't need to change.

End result:

  • The _List and _Dictionary interfaces are entirely correct.
  • The removal of Underscore isn't a major shock, because it didn't contain any methods anymore, anyway. By this time, everyone who has been using a recent version of @types/underscore has already adapted their code to use _List and _Dictionary instead of Underscore.
  • The interfaces are identical to the former intersection types, so people who already correctly adapted their code won't even notice that the intersection types are gone.

(...) sometimes progress is required from both sides of the coin :-) I too hate it when a library I work with stops supporting my current platform, but I understand that the problem is with me, not them - I should be more agile and be able to advance with everyone else. In any case that is my 2 cents on the matter.

Fair enough. I was a bit afraid that 3.5 might be "too new", but in the meanwhile I've learned that the first stable release of 3.5 was a month ago and the bleeding edge is already at 3.6.

@sandersn

  3. Compile time vs coverage: Increasing type coverage necessarily increases complexity of the types in underscore, and total coverage of a package like underscore [1] will be incredibly expensive to compile. And since underscore is a collection library, it will be part of a much bigger program that may not be able to compile if underscore balloons in size.

If you do make the change, be sure to keep compile time and memory in mind the whole time.

[1] a collection library designed without types in mind, specifically.

I think you're being a bit unfair to underscore in your footnote. Libraries that don't specifically target TypeScript will use the full type system of JavaScript rather than the subset of it that the TypeScript compiler is most comfortable with, but that doesn't mean types aren't taken into account. In the case of Underscore, the opposite is true; the designer (Jeremy Ashkenas) chose the signatures of each function very carefully. The semantics of underscore are well-typed, even if they are hard to express or compile in TypeScript.

As an aside, by "collection library", do you mean a toolkit?

@jgonggrijp I agree we can work that way, BUT:

  1. It will take us a few months to achieve that, as for every method (around 50, I guess) we need a PR and a review and then have to wait for an owner merge....
  2. We might introduce 50 breaking changes over time, which means the users of the library may need to fix their code 50 times in the coming months - which is arguably worse than a single fix.

Also can we agree to defer the special typings that use ts-toolbelt to after we finish the refactor? Or do you want me to add them straight away, so that the changes will be made on the 3.5 version of the library?

  1. It will take us a few months to achieve that, as for every method (around 50, I guess) we need a PR and a review and then have to wait for an owner merge....

Yes, this is a disadvantage of the approach I'm suggesting. I think it doesn't have to be too bad, though. You can already work on the next method while the previous PR is waiting for review and merge.

  2. We might introduce 50 breaking changes over time, which means the users of the library may need to fix their code 50 times in the coming months - which is arguably worse than a single fix.

I think 50 predictable small breaking changes are better for end users than a sudden single huge one, especially if the new interfaces are available from the start. The first time a user experiences a small breaking change, they essentially get an early warning of the changes to come and they can choose to adapt their code ahead of time, either incrementally or all at once. It is better to have some slack time while already knowing that your code will break in the future, than to suddenly find yourself having to overhaul everything whether you have time for it or not.

Come to think of it, the early warning effect can probably be amplified by putting a comment in the code near the method that has changed, containing a hyperlink to a GitHub comment (or some other article on the web) that summarizes the upcoming changes.

Also can we agree to defer the special typings that use ts-toolbelt to after we finish the refactor?

Yes, I agree with that approach.

@jgonggrijp OK, let's go with that.
I will start making PRs soon - prepare yourself for the blast of PRs to come :-)

Cool. Bring it on, I'll try hard to review quickly!

@jgonggrijp By collection library, I mean a library that operates on containers, like array, map, set and function.

Whether code is typed doesn’t make it good or bad, since as you say making it typed constrains how you approach certain problems. I don’t know the history of underscore’s development (did Jeremy write about it? — that would be fascinating), but functions like _.flatten are impossible to type in any type system I’m aware of:

  1. Ad-hoc behaviour: The flattening is either shallow or deep depending on whether the boolean parameter is true or false. I don’t know any non-dependent type system that can represent this reliably, even TypeScript.
  2. Untypeable behaviour: The first parameter of flatten has an infinite type. Again, I don’t know any type system that can perfectly represent this type.

The current state of the library makes me think that types were not systematically part of its development.

@sandersn Thanks for clarifying, the _.flatten example is illuminating.

I don’t know the history of underscore’s development (did Jeremy write about it? — that would be fascinating)

I'd be interested in that history, too.

, but functions like _.flatten are impossible to type in any type system I’m aware of:

  1. Ad-hoc behaviour: The flattening is either shallow or deep depending on whether the boolean parameter is true or false. I don’t know any non-dependent type system that can represent this reliably, even TypeScript.

Hm, C++ has constexpr true and constexpr false, which can influence template overload resolution with metafunctions like std::enable_if. I’ve heard that TS has a Turing-complete type system, too, so I wonder whether there is really no way to do this.

  2. Untypeable behaviour: The first parameter of flatten has an infinite type. Again, I don’t know any type system that can perfectly represent this type.

I see what you mean; it is impossible to annotate the type of the first parameter in such a way that, if the type of each element (of each element (...)) of the array is known statically, the type of each element of the return value can also be accurately statically inferred. In this sense, I can certainly agree that Underscore wasn’t designed with static type inference in mind. This is probably true of most libraries that target a dynamic language.

However, there is a subtle but important difference between "not designed with static type inference in mind" and "not designed with types in mind". I hope you see what I mean.

All of that said, I agree with you that _.flatten requires a very approximate typing in order to be efficiently compilable. There is no way around the fact that users will need to use a type assertion to make up for the loss of information.

If you make an overload set like this:

declare function flatten<T>(ts: T[][], shallow: true): T[];
declare function flatten(ts: any, shallow: false): any;

Then you can prevent callers from passing boolean: they'll either have to use literals or variables whose types were inferred from literals. That's 'restrictive' more than 'unreliable', I guess.

Yes, I was thinking along those lines, too. In most cases people will probably pass a literal. People who need to work with a variable can use workarounds like the following:

if (variable === true) {
    result = _.flatten(array, true);
} else {
    result = _.flatten(array, false);
}

By the way, the deep overload can be more specific because the return value is guaranteed to be an array without any nested arrays. If we ignore for now that users are allowed to pass in null and arguments, I arrive at something like this:

declare function flatten(ts: Array<any>, shallow: true): Array<any>;
declare function flatten(ts: Array<any>, shallow?: false): Array<Exclude<any, Array<any>>>;
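
And a quick (hypothetical) illustration of how these overloads behave at call sites:

declare const nested: any[];

const deep = flatten(nested);           // resolves to the deep overload (shallow omitted)
const shallow = flatten(nested, true);  // resolves to the shallow overload

declare const flag: boolean;
// flatten(nested, flag);               // error: boolean matches neither `true` nor `false`,
// so callers are nudged towards passing literals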