Ramda: Lazy but finite reduce for recursive data structures and an unbounded number of reducers (see comments)

Created on 15 Mar 2017 · 55Comments · Source: ramda/ramda

Good day everyone!

Over the past weeks we've been working with arbitrarily nested objects and arrays (mixed altogether), up until we've spotted the necessity of a recursive reduce function that could be used to transform, but at least to obtain, certain if not every property of this nested structure, which could also be stopped at any given point if a condition is sufficient. Essentially, we've been looking at what in haskell would be a fold takeWhile, but for deep/recursive objects/arrays.

After several iterations, the function valid for these use cases that we haven't been able to simplify anymore is the following:

// Recursive fold takeWhile (like in haskell)
// fn is the iterator
// obj is the iteratee
// r is the accumulator
const foldWhile = (fn, obj, r = []) => {
  let innerFold = o => !(typeof o === 'object') ? o
    : Object.keys(o).every(k => fn(r, o[k], k) && innerFold(o[k]))
  innerFold(obj)
  return r
}

_Note: this is implemented with lodash in our code, but since it doesn't need lodash, I'm using generic es6 to show what it does._
_Note2: Besides using functions from the Ramda ecosystem, this would benefit from a trampoline._

The main point of the function is combining the use of every recursively to stop at any point of the (_lazy_) evaluation of the nested object/array, with the use of an accumulator, which is handled to the user.

To see use cases and tests, see this gist: https://gist.github.com/sadasant/de106b879c61f21797dffdaa577f7868

We have been looking for opensource alternatives, but we haven't been successful.

I know that there have been several issues in this repo about lazy evaluation and nested data structures, however there doesn't seem to be an agreement on how to combine both the nested structures and the stopping of the fold at any given point. I know that Ramda has reduceWhile and until, but they don't work over nested structures, right?.

Having that said,
I wonder if this is interesting to anybody. We would definitely prefer using opensource instead of maintaining our own code.

Thank you for the time.

Source

sadasant

👍6 ❤2 🎉2

Most helpful comment

i'm interested in seeing a PR for this

buzzdecafe on 20 Mar 2017

❤1 👍1

All 55 comments

( I also asked this on lodash https://github.com/lodash/lodash/issues/3058 )

sadasant on 15 Mar 2017

Here's an alternative that works for infinitely many reducers (pardon my lodash):

// Recursive reduce nested objects of any kind that
// stops at any point if a falsy value is returned.
export const flowReduce = (...fns) => (obj, acc = []) => {
    if (typeof obj !== 'object') return acc
    let _acc = acc
    const type = typeof acc
    _.every(key => {
        const pass = _.every(f => {
            let result = f(_acc, obj[key], key)
            if (typeof result === type) _acc = result
            return result
        }, fns)
        if (pass) _acc = flowReduce(...fns)(obj[key], _acc)
        return pass
    }, _.keys(obj))
    return _acc
}

Simple test with mutable referenced accumulators:

expect(f.flowReduce(acc => acc < 3, acc => ++acc)([ 1, 2, 3, 4 ], 0)).to.equal(3)

One test from the ones we did with foldWhile:

        const target = {
            something: {
                mostly_empty: {}
            },
            deep: {
                object: {
                    matching: {
                        key: 'value'
                    }
                }
            }
        }

        let found = f.flowReduce(
            r => r.length < 2,
            (acc, value, key) => key === 'matching' ? acc.concat({ [key]: value }) : acc
        )(target)

        expect(found).to.deep.equal([{
            matching: { key: 'value' }
        }])

Full tests: https://github.com/smartprocure/futil-js/pull/61/files

sadasant on 15 Mar 2017

_Note: If you are available, please give feedback on better code styles or any improvement anyone might spot 👍 thank you!_

sadasant on 15 Mar 2017

The name lacks the every part, which is important. Perhaps the name should be: manyfoldwhile 😂 flowReduceWhile ? flowEvery... recursiveFlowEvery ... 😵

sadasant on 15 Mar 2017

Studied a bit and found out that foldWhile and flowReduce were an implementation of an automata, a finite state machine, but ciclic structures of unknown size and an unbounded number of generators... which is generalized in Category Theory as the Groupoid Category. See https://en.wikipedia.org/wiki/Automata_theory#Connection_to_category_theory

This implementation satisfies exactly the same tests as flowReduce, but is completely stripped of lodash utilities for clarity.

The name is particularly tempting since is easy to pronounce, funny enough to remember, and technically precise 😱

Full tests: https://github.com/smartprocure/futil-js/pull/63/files
Please review 😻

// General variable automata is the definition of
// non-deterministic input iterators over discrete state changes
// aka groupoid category
// See: https://en.m.wikipedia.org/wiki/Automata_theory#Connection_to_category_theory
export const groupoid = (...funs) => function G (field, acc = []) {
    if (!field || typeof field !== 'object') return acc
    let accepted = acc
    let state = acc
    let keys = Object.keys(field)
    let key = keys.shift()
    let fN = 0
    while (state !== false) {
        accepted = state
        if (!funs[fN]) {
            key = keys.shift()
            fN = 0
        }
        if (!key) break
        let result = G(field[key], funs[fN](state, field[key], key))
        if (result !== true) state = result
        fN++
    }
    return accepted
}

sadasant on 16 Mar 2017

I don't want you to think you're being ignored. I'm interested by this, but have not been able to dedicate any real time to it. I expect to look at it over the weekend.

CrossEye on 17 Mar 2017

Hi!

Don't worry! Any time you can give is valuable \o/ I've been crazy trying to make this the most solid / reusable / etc. Any feedback is welcome!

sadasant on 17 Mar 2017

If this issue can be useful to define a solution for these general use cases, even if that means not using any of my code at all, I would have done my job here, since we don't want to maintain any form of private (and probably biased) code, and since having this _peer reviewed_ and available externally will better help us reuse it in many places in our codebase. So thanks for whatever is posible!

sadasant on 17 Mar 2017

_(just to keep this up to date with lodash issue, and to share the picture)_

I know that all of this has been somewhat confusing, but here's the argument on favor of Groupoid in a picture as pretty as I could come up with. Please feel free to counter it, what I really want is to find a general solution ♥︎

sadasant on 17 Mar 2017

Here is a new version that allows breadth first, bottom to top depth and breadth first and allows easy maps/automorphisms (which shouldn't be hard with any general reducer/endomorphism) because it sends the full parents keys to the reducers. The implementation is focused on making it work with some clarity rather than making it pretty (or even perfect), but the ideas are there!

// How to use it:
//   groupoid(reducer1, reducer2, ...reducerN)(field, [optional params, see below])
// Each reducer will receive:
//   - The result of the previous reducer
//   - The current value
//   - The key where the current value is located
//   - The key names of the parents of the current key
export const groupoid = (...funs) => function G (
    // REQUIRED
    field, // input n-dimensional field
    // Optional parameters
    acc = [], // accumulator
    breadth, // breadth first is falsy by default
    orientation = 1, // > 0 by default, goes bottom to top if < 0 // TODO: Expose a groupoidRight to make this easier
    path = [] // only used for recursion purposes, to expose the parent keys of the current key value pair
) {
    // return the current accumulator if the field is not iterable
    if (!field || typeof field !== 'object') return acc

    let accepted = acc
    let state = acc
    let keys = Object.keys(field)
    let fN = 0
    let nextBreadth = []

    if (orientation < 0) keys = keys.reverse()
    let key = keys.shift()

    // stops only if the state resulting from funs[fN] is exactly falsee
    // (or if (!key) break)
    while (state !== false) {
        accepted = state
        let f = funs[fN]
        // if we have no more functins to look at,
        // get the next key and reset the function
        // counter. Lets us avoid having two whiles.
        if (!f) {
            key = keys.shift()
            fN = 0
            f = funs[fN]
        }
        if (!key) break
        let val = field[key]
        // result of reducer f[fN]
        let result = f(state, val, key, path)
        // Either we accumulate iterables to go deeper
        // after all the current level (breadth),
        // or we go recursive right now.
        if (breadth && val && typeof val === 'object') {
            nextBreadth.push({ val, key })
        } else {
            result = G(val, result, breadth, orientation, path.concat(key))
        }
        if (result !== true) state = result
        fN++
    }
    // if breadth first, going recursive on what was nested in this layer
    let result = accepted
    while (result && nextBreadth.length) {
        let { val, key } = nextBreadth.shift()
        result = G(val, accepted, breadth, orientation, path.concat(key))
        if (result && result !== true) accepted = result
    }
    return accepted
}

It passes all the previous use cases, and some other more. Here's a reversed breadth first test:

   it('groupoid reverse breadth first for nested objects', () => {
        const target = {
            something: {
                matching: {
                    key: 'value3'
                }
            },
            matching: {
                key: 'value1'
            },
            matchingToo: {
                key: 'value2'
            },
            deep: {
                object: {
                    matching: {
                        key: 'value4'
                    },
                    matchingToo: {
                        key: 'value5'
                    }
                }
            }
        }

        let found = f.groupoid(
            (acc, val, key, path) =>
                key.indexOf('matching') >= 0
                ? acc.concat({ [key]: val, path }) : acc
        )(target, [], true, -1)

        expect(found).to.deep.equal([{
            matchingToo: { key: 'value2' },
            path: []
        }, {
            matching: { key: 'value1' },
            path: []
        }, {
            matchingToo: { key: 'value5' },
            path: [ 'deep', 'object' ]
        }, {
            matching: { key: 'value4' },
            path: [ 'deep', 'object' ]
        }, {
            matching: { key: 'value3' },
            path: [ 'something' ]
        }])
    })

And here is a proof of concept of a string map made with a groupoid:

    it('groupoid changing values in place, like a map', () => {
        let target = {
            key: 'root',
            items: [{
                key: 'x',
                items: [{
                    key: 'y',
                    items: [{
                        key: 'z',
                        extra: 'blah'
                    }, {
                        key: 's'
                    }]
                }]
            }]
        }

        const fixArrayKeys = k => isNaN(parseInt(k)) ? k : `[${k}]`
        const fixPath = _.flow(_.map(fixArrayKeys), _.join('.'), x => x.replace(/\.\[/g, '['))
        const mapStrings = (fn, obj) => f.groupoid(
            (o, v, k, path) => {
                if (!_.isString(v)) return o
                let loPath = fixPath(path.concat(k))
                return _.has(loPath, o) ? _.set(loPath, fn(v), o) : o
            }
        )(obj, obj)

        const result = mapStrings(x => x.toUpperCase(), target)

        expect(result).to.deep.equal({
            key: 'ROOT',
            items: [{
                key: 'X',
                items: [{
                    key: 'Y',
                    items: [{
                        key: 'Z',
                        extra: 'BLAH'
                    }, {
                        key: 'S'
                    }]
                }]
            }]
        })
    })

Full tests: https://github.com/smartprocure/futil-js/pull/63

sadasant on 18 Mar 2017

i'm interested in seeing a PR for this

buzzdecafe on 20 Mar 2017

❤1 👍1

Hi! I can do it (probably starting in 12 hours), however I was wondering if we could get feedback on the ideas, specially on the approach. For example, on what's exposed: groupoid( ...functions )( targetObject, accumulator, /* breadth, orientation */), would it be better differently?, perhaps something else could also be better?

sadasant on 20 Mar 2017

(thanks for the interest 😍)

sadasant on 20 Mar 2017

we have almost no variadic functions. So the first thing I would recommend is taking a list (or stream?) of functions rather than a variable number of arguments:

export const groupoid = ([funs]) => 
//...

we also have no functions with optional params, so you may want to consider that:

export const groupoid = (...funs) => function G (
    // REQUIRED
    field, // input n-dimensional field
    // Optional parameters
    acc = [], // accumulator
    breadth, // breadth first is falsy by default
    orientation = 1, // > 0 by default, goes bottom to top if < 0 // TODO: Expose a groupoidRight to make this easier
    path = [] // only used for recursion purposes, to expose the parent keys of the current key value pair
)

One possibility would be to expose e.g. bfs and dfs as separate functions derived from this one, and maybe not have groupoid on the public API at all?

Then let's consider whether the accumulator should in fact be a required param; I'm inclined to say it should -- let the user explicitly control the shape of the output:

export const groupoid = [funs] => curry((field, acc) => //...etc );

Then we should consider what order of arguments is optimal for this function. Which things are least likely to change? Different functions over one field? Or different fields using the same list of functions?

This looks very similar to a fold but with a list of functions instead of one. But the sig of the funs takes 3 arguments; can we align it with the binary fold-style function?

I also agree a generic trampoline helper may be helpful here. A long time ago we had one ... maybe that was in the other lib ...

buzzdecafe on 20 Mar 2017

In the PR I can also avoid recursuion, the code won't be nice but we won't be having maximum stack errors ✌

sadasant on 20 Mar 2017

got it @buzzdecafe ! I'll follow your guidelines 👍👍👍

sadasant on 20 Mar 2017

@buzzdecafe also, yes! this is just a fold, but goes deep. So in summary, we want:

A generic trampoline? or just avoid recursion?
A ... depth first deep fold?
A ... breadth first deep fold?
The <name>Right version of those above? (for the orientation?)

sadasant on 20 Mar 2017

if you are not recurring on G you may be able to curry the whole thing then?

// is the sig something like this?
// g :: [((a, x) -> a)] -> a -> Object -> [a]
const g = curry(([funs], acc, field) => { 
  // etc.
});

btw, this sig is appealing to me since it looks so much like reduce, the connection is obvious.

buzzdecafe on 20 Mar 2017

the connection is obvious yes! All I need is... a deep reduce that conditionally stops.

sadasant on 20 Mar 2017

In fact you can use the groupoid just like reduce, it works for one reducer and one-dimensional input fields.

sadasant on 20 Mar 2017

wow this is starting to look like a lot of work! For now, it may be easiest to implement iteratively; let's not overload this. Maybe do the dfs version first, and leave the right version for a future PR. I wanna keep this manageable for you and for anyone reviewing this

In fact you can use the groupoid just like reduce, it works for one reducer and one-dimensional input fields.

That makes sense, it is more general.

buzzdecafe on 20 Mar 2017

It's not that much work! I have a working implementation 😂

sadasant on 20 Mar 2017

Proposal: Four functions that,

Stop the iteration if a false value is sent.
Ignore true results from the reducer passed.
Accept only one reducer since we can compose before sending.
Do not use recursion internally.
Accept any kind of accumulator, except booleans.

depthFold(reducer, accumulator, target)
breadthFold(reducer, accumulator, target)
depthFoldRight(reducer, accumulator, target)
breadthFoldRight(reducer, accumulator, target)

@buzzdecafe would that be fine for you? I can make separate PRs (or all in one PR!)

sadasant on 20 Mar 2017

Stop the iteration if a false value is sent.

This makes me a little nervous. Transducers e.g. uses a {reduced: true} object as the token to terminate the iteration.

buzzdecafe on 20 Mar 2017

@buzzdecafe sure! we can stop if R.reduced is passed.

sadasant on 20 Mar 2017

not suggesting that necessarily, esp. since transducers may get pulled from core lib. but looking for a better way to indicate "stop iterating"

buzzdecafe on 20 Mar 2017

Well, unless we have R.reduced, I think we should trust on false, since it might be obnoxious to tell users to use some value other than false for any conditional function. I personally don't see much benefit of allowing boolean values to be accumulated.

sadasant on 20 Mar 2017

You know the right answer, @buzzdecafe. :stuck_out_tongue_winking_eye:

davidchambers on 20 Mar 2017

@davidchambers what's the right answer? 👀

sadasant on 20 Mar 2017

I expect @davidchambers is suggesting using an ADT to handle _either_ the return case or the recurse case, and pattern matching on the type to do the right thing.

buzzdecafe on 20 Mar 2017

@buzzdecafe what's an ADT?

sadasant on 20 Mar 2017

We can separate the reducer with the conditional stop, but I think it's better to use the reducer... at least prettier! But however is better for Ramda \o/

sadasant on 20 Mar 2017

@buzzdecafe wins the prize!

Rather than have the function return a value of type a we could have it return a value of type Maybe a, where Nothing (a value of type Maybe a) would signal completion. As we would not even inspect the a value, there would be no special cases. S.unfoldr uses this approach (whereas R.unfold requires the function to return either false or an array, which is rather odd).

To avoid introducing the Maybe type we could use Array a as the return type: [] would signal completion; [42] would signal that the computation should continue with 42.

davidchambers on 20 Mar 2017

@davidchambers woah! [] to stop is wonderful! Thanks for the idea :v:

sadasant on 20 Mar 2017

@davidchambers wait, what if I only want to aggregate some of the values that pass through the reducer? I can be printing empty arrays for a while.

sadasant on 20 Mar 2017

I'm not following this issue closely. What I'm suggesting is using [] rather than false (or some other sentinel value) and using [x] rather than x. If x is [1, 2, 3], say, you'll use [[1, 2, 3]].

davidchambers on 20 Mar 2017

I'm a bit late to the party here, but instead of using a specific value or object structure, wouldn't this be a good use case for a Symbol? I think it's easier than introducing specific types or structures and is part of the official JS standard.

daedalus28 on 20 Mar 2017

@davidchambers no, no, the reducer returns always the accumulator, so if at some point it returns [x], x, [z] all of them are valid changes to the accumulator

sadasant on 20 Mar 2017

Perhaps some code would help to convey my idea.

Before:

//  result :: a
var result = f(x);
if (result === false) {
  // terminate
} else {
  // continue with `result`
}

After:

//  result :: Array a
var result = f(x);
if (result.length === 0) {
  // terminate
} else {
  // continue with `result[0]`
}

Note that in the first case false is treated specially, whereas in the second case result[0] can be absolutely any JavaScript value.

davidchambers on 20 Mar 2017

@davidchambers if I'm following what you're saying, always wrapping results in array and returning [] to stop is the same as just returning undefined, isn't it? That's also a valid option - return undefined to stop, and if you want no value you return null. With that said, in general I think any kind of value has the potential to be what you're looking for in JS, so the only way to know for sure that the return value is _not_ an accumulator result is to make something specific to the method.

daedalus28 on 20 Mar 2017

See my comment above, @daedalus28. There is absolutely no possibility of conflict. [undefined] signals _continue with undefined_, [[1, 2, 3]] signals _continue with [1, 2, 3]_, etc. The array's length (0 or 1) determines whether or not to terminate.

davidchambers on 20 Mar 2017

@davidchambers ok I'm buying it, but only if Ramda already does that somewhere else.

sadasant on 20 Mar 2017

Got it, it's treating an array as a maybe where the array is just the container, which either has stuff in it or doesn't, e.g. KindaSortaLikeaMaybeContainer = x => [x]. That's neat 😄

To @sadasant's point, is there any precedent of using arrays as quick and dirty functional containers?

daedalus28 on 20 Mar 2017

[I]s there any precedent of using arrays as quick and dirty functional containers?

I don't think so.

davidchambers on 20 Mar 2017

@buzzdecafe are you on board with @davidchambers 's solution? It's super nice, but it needs to make sense in Ramda's context.

sadasant on 20 Mar 2017

definitely prefer the single return type. So yes, on board.

"ADT" = Algebraic Data Type
and we should fix unfold, that is pretty gross

buzzdecafe on 20 Mar 2017

@buzzdecafe got it, makes sense!
PRs coming through the day and/or tomorrow.
@buzzdecafe @davidchambers @daedalus28 thank you all for the feedback!

sadasant on 20 Mar 2017

We have a first PR for depthReduce 😸 https://github.com/ramda/ramda/pull/2133 once we adjust it to Ramda needs, the rest of the PRs will follow quickly 😱

Please review! But thanks in advance for the all time you've already invested 🙇

sadasant on 21 Mar 2017

Hello, @CrossEye ! I'm curious, were you able to see this? ^

sadasant on 22 Mar 2017

@buzzdecafe @davidchambers So, I recognize I'm pretty excited about contributing 😅, but I also want to make that energy useful: and we should fix unfold, that is pretty gross, I'm guessing an unfold where the iterator is only expected to return a single-element array? or an empty array if we want to stop? I can do that o/

sadasant on 22 Mar 2017

Since we don't have the option of using Maybe (Pair a b), we'd need to use Array (Pair a b) and assume that the array's length will either be zero or one ([] or [['foo', 42]], say).

davidchambers on 22 Mar 2017

@davidchambers I made an issue to continue there. It has a sketch of the upcoming unfold. I'll be making the PR during the evening :v: https://github.com/ramda/ramda/issues/2138

sadasant on 22 Mar 2017

Actually, I just found that partial.lenses exist: https://github.com/calmm-js/partial.lenses

L.select([L.elems, "y"], [{x:1},{y:2},{z:3}])
L.selectAs(x => x > 3 ? -x : undefined, L.elems, [3,1,4,1,5])

That works over nested structures. Is exactly what I've been looking for, but also formalized.
I'm still wondering if there's value in this for ramda though.

sadasant on 23 Mar 2017

👍1

partial.lenses author also has https://github.com/polytypic/fastener which works for structures of unknown shape.

sadasant on 23 Mar 2017

Why was this issue closed when the PR (#2133) is still open?
And why is the PR still open?