currently for using sequence or traverse on arrays we do this
import * as A from "fp-ts/lib/Array";
import * as T from "fp-ts/lib/Task";
A.sequence(T.task)([T.of(1), T.of(2)])
but after reading this issue I was thinking if we want to colocate the sequence of Array with our data types we can optimize and get better performance also stack safety.
if we look at issues concerning stack safety most(all) of them is there are some array that wants to traverse or sequence on some data types like this issue or this issue or this issue
I write sequence arrays for major data types if the community and @gcanti think this is a good idea I can make a PR.
I named this function sequenceA, A is an acronym for Array
as far as know option is stack safe but the A.sequence(O.option) is painfully slow for large arrays.
my solution is this
const sequenceA = <A>(arr: Array<Option<A>>) => {
const result = [];
for (const a of arr) {
if (a._tag === "None") {
return none;
}
result.push(a.value);
}
return some(result);
};
I write a small benchmark to see how much difference it makes
import * as O from "fp-ts/lib/Option";
import * as A from "fp-ts/lib/Array";
import { pipe } from "fp-ts/lib/pipeable";
const sequenceA = <A>(arr: Array<O.Option<A>>): Option<Array<A>> => {
const result = [];
for (const a of arr) {
if (a._tag === "None") {
return O.none;
}
result.push(a.value);
}
return O.some(result);
};
console.time("current");
pipe(A.range(0, 100000), A.map(O.some), A.sequence(O.option));
console.timeEnd("current");
console.time("suggestion");
pipe(A.range(0, 100000), A.map(O.some), sequenceA);
console.timeEnd("suggestion");
the result is
current: 40.118s
suggestion: 17.546ms
from now on I write just the result for other data types the array size is 100,000 for all of them
export const sequenceA: <E, A>(
arr: Array<Either<E, A>>
) => Either<E, Array<A>> = (arr) => {
const result = [];
for (const e of arr) {
if (e._tag === "Left") {
return e;
}
result.push(e.right);
}
return right(result);
};
benchmark result
current: 44.256s
solution: 17.599ms
export const sequenceA: <A>(arr: Array<Task<A>>) => Task<Array<A>> = (
arr
) => () => Promise.all(arr.map((x) => x()));
benchmark result
current: RangeError: Maximum call stack size exceeded
suggestion: 52.492ms
export const sequenceA: <R, A>(
arr: Array<Reader<R, A>>
) => Reader<R, Array<A>> = (arr) => (r) => arr.map((x) => x(r));
benchmark result
current: RangeError: Maximum call stack size exceeded
suggestion: 10.789ms
export const sequenceA: <A>(arr: Array<IO<A>>) => IO<Array<A>> = (
arr
) => () => arr.map((x) => x());
benchmark result
current: RangeError: Maximum call stack size exceeded
suggestion: 24.152ms
we can compose them together and make other sequenceA
export const sequenceA: <E, A>(
arr: Array<TaskEither<E, A>>
) => TaskEither<E, Array<A>> = (arr) =>
pipe(TASK.sequenceA(arr), TASK.map(EITHER.sequenceA));
benchmark result
current: RangeError: Maximum call stack size exceeded
suggestion: 57.102ms
const sequenceA: <R, E, A>(
arr: Array<ReaderTaskEither<R, E, A>>
) => ReaderTaskEither<R, E, A[]> = (arr) =>
pipe(READER.sequenceA(arr), READER.map(TASKEITHER.sequenceA));
current: RangeError: Maximum call stack size exceeded
suggestion: 81.367ms
I know that this solution is not exactly like something that Haskell does, but Haskell has a compiler that optimizes the hell out of these tasks.
I think this solution is rather easy to maintain compared to trampolining and the performance gain is so much that we can consider implementing it.
@mohaalak LGTM (not sure about the name though)
@mohaalak LGTM (not sure about the name though)
yeah the name looks odd to me too, but I don't have any other idea for it
Actually I find myself writing all that parametrisation by hand each time on each project. array.sequence(instance), array.traverse(instance) etc.. Looks like it's worth adding optimized versions of them to each ADT module. I would however suggest a full name for them: sequenceArray, traverseArray. There's already a bunch of magic letters in the library: "T" for tuples, "W" for widen, let's not introduce another one.
Agree with @raveclassic that magic letters should be avoided to improve clarity for both new and existing users of the library.
I also wanted to bring up the addition of TraversableWithIndex versions of these helpers. Data types that can be indexed and define a traverse function also define a traverseWithIndex function. I frequently utilize both in various projects.
Is it worth adding an additional function that exposes the index alongside the value to the main lib? Or should these helpers be defined on a per project basis?
@IMax153 I also think this function should not be in the main lib.
Adding it to fp-ts-contrib makes more sense to me. Am I wrong, @gcanti?
@iamomiid IMO we could add them too, I can't see nothing wrong with it.
Most helpful comment
Actually I find myself writing all that parametrisation by hand each time on each project.
array.sequence(instance),array.traverse(instance)etc.. Looks like it's worth adding optimized versions of them to each ADT module. I would however suggest a full name for them:sequenceArray,traverseArray. There's already a bunch of magic letters in the library: "T" for tuples, "W" for widen, let's not introduce another one.