Ramda: Lazy but finite reduce for recursive data structures and an unbounded number of reducers (see comments)

Created on 15 Mar 2017  ยท  55Comments  ยท  Source: ramda/ramda

Good day everyone!

Over the past weeks we've been working with arbitrarily nested objects and arrays (mixed altogether), up until we've spotted the necessity of a recursive reduce function that could be used to transform, but at least to obtain, certain if not every property of this nested structure, which could also be stopped at any given point if a condition is sufficient. Essentially, we've been looking at what in haskell would be a fold takeWhile, but for deep/recursive objects/arrays.

After several iterations, the function valid for these use cases that we haven't been able to simplify anymore is the following:

// Recursive fold takeWhile (like in haskell)
// fn is the iterator
// obj is the iteratee
// r is the accumulator
const foldWhile = (fn, obj, r = []) => {
  let innerFold = o => !(typeof o === 'object') ? o
    : Object.keys(o).every(k => fn(r, o[k], k) && innerFold(o[k]))
  innerFold(obj)
  return r
}

_Note: this is implemented with lodash in our code, but since it doesn't need lodash, I'm using generic es6 to show what it does._
_Note2: Besides using functions from the Ramda ecosystem, this would benefit from a trampoline._

The main point of the function is combining the use of every recursively to stop at any point of the (_lazy_) evaluation of the nested object/array, with the use of an accumulator, which is handled to the user.

To see use cases and tests, see this gist: https://gist.github.com/sadasant/de106b879c61f21797dffdaa577f7868

We have been looking for opensource alternatives, but we haven't been successful.

I know that there have been several issues in this repo about lazy evaluation and nested data structures, however there doesn't seem to be an agreement on how to combine both the nested structures and the stopping of the fold at any given point. I know that Ramda has reduceWhile and until, but they don't work over nested structures, right?.

Having that said,
I wonder if this is interesting to anybody. We would definitely prefer using opensource instead of maintaining our own code.

Thank you for the time.

Most helpful comment

i'm interested in seeing a PR for this

All 55 comments

( I also asked this on lodash https://github.com/lodash/lodash/issues/3058 )

Here's an alternative that works for infinitely many reducers (pardon my lodash):

// Recursive reduce nested objects of any kind that
// stops at any point if a falsy value is returned.
export const flowReduce = (...fns) => (obj, acc = []) => {
    if (typeof obj !== 'object') return acc
    let _acc = acc
    const type = typeof acc
    _.every(key => {
        const pass = _.every(f => {
            let result = f(_acc, obj[key], key)
            if (typeof result === type) _acc = result
            return result
        }, fns)
        if (pass) _acc = flowReduce(...fns)(obj[key], _acc)
        return pass
    }, _.keys(obj))
    return _acc
}

Simple test with mutable referenced accumulators:

expect(f.flowReduce(acc => acc < 3, acc => ++acc)([ 1, 2, 3, 4 ], 0)).to.equal(3)

One test from the ones we did with foldWhile:

        const target = {
            something: {
                mostly_empty: {}
            },
            deep: {
                object: {
                    matching: {
                        key: 'value'
                    }
                }
            }
        }

        let found = f.flowReduce(
            r => r.length < 2,
            (acc, value, key) => key === 'matching' ? acc.concat({ [key]: value }) : acc
        )(target)

        expect(found).to.deep.equal([{
            matching: { key: 'value' }
        }])

Full tests: https://github.com/smartprocure/futil-js/pull/61/files

_Note: If you are available, please give feedback on better code styles or any improvement anyone might spot ๐Ÿ‘ thank you!_

The name lacks the every part, which is important. Perhaps the name should be: manyfoldwhile ๐Ÿ˜‚ flowReduceWhile ? flowEvery... recursiveFlowEvery ... ๐Ÿ˜ต

Studied a bit and found out that foldWhile and flowReduce were an implementation of an automata, a finite state machine, but ciclic structures of unknown size and an unbounded number of generators... which is generalized in Category Theory as the Groupoid Category. See https://en.wikipedia.org/wiki/Automata_theory#Connection_to_category_theory

This implementation satisfies exactly the same tests as flowReduce, but is completely stripped of lodash utilities for clarity.

The name is particularly tempting since is easy to pronounce, funny enough to remember, and technically precise ๐Ÿ˜ฑ

Full tests: https://github.com/smartprocure/futil-js/pull/63/files
Please review ๐Ÿ˜ป

// General variable automata is the definition of
// non-deterministic input iterators over discrete state changes
// aka groupoid category
// See: https://en.m.wikipedia.org/wiki/Automata_theory#Connection_to_category_theory
export const groupoid = (...funs) => function G (field, acc = []) {
    if (!field || typeof field !== 'object') return acc
    let accepted = acc
    let state = acc
    let keys = Object.keys(field)
    let key = keys.shift()
    let fN = 0
    while (state !== false) {
        accepted = state
        if (!funs[fN]) {
            key = keys.shift()
            fN = 0
        }
        if (!key) break
        let result = G(field[key], funs[fN](state, field[key], key))
        if (result !== true) state = result
        fN++
    }
    return accepted
}

I don't want you to think you're being ignored. I'm interested by this, but have not been able to dedicate any real time to it. I expect to look at it over the weekend.

Hi!

Don't worry! Any time you can give is valuable \o/ I've been crazy trying to make this the most solid / reusable / etc. Any feedback is welcome!

If this issue can be useful to define a solution for these general use cases, even if that means not using any of my code at all, I would have done my job here, since we don't want to maintain any form of private (and probably biased) code, and since having this _peer reviewed_ and available externally will better help us reuse it in many places in our codebase. So thanks for whatever is posible!

_(just to keep this up to date with lodash issue, and to share the picture)_

I know that all of this has been somewhat confusing, but here's the argument on favor of Groupoid in a picture as pretty as I could come up with. Please feel free to counter it, what I really want is to find a general solution โ™ฅ๏ธŽ

Here is a new version that allows breadth first, bottom to top depth and breadth first and allows easy maps/automorphisms (which shouldn't be hard with any general reducer/endomorphism) because it sends the full parents keys to the reducers. The implementation is focused on making it work with some clarity rather than making it pretty (or even perfect), but the ideas are there!

// How to use it:
//   groupoid(reducer1, reducer2, ...reducerN)(field, [optional params, see below])
// Each reducer will receive:
//   - The result of the previous reducer
//   - The current value
//   - The key where the current value is located
//   - The key names of the parents of the current key
export const groupoid = (...funs) => function G (
    // REQUIRED
    field, // input n-dimensional field
    // Optional parameters
    acc = [], // accumulator
    breadth, // breadth first is falsy by default
    orientation = 1, // > 0 by default, goes bottom to top if < 0 // TODO: Expose a groupoidRight to make this easier
    path = [] // only used for recursion purposes, to expose the parent keys of the current key value pair
) {
    // return the current accumulator if the field is not iterable
    if (!field || typeof field !== 'object') return acc

    let accepted = acc
    let state = acc
    let keys = Object.keys(field)
    let fN = 0
    let nextBreadth = []

    if (orientation < 0) keys = keys.reverse()
    let key = keys.shift()

    // stops only if the state resulting from funs[fN] is exactly falsee
    // (or if (!key) break)
    while (state !== false) {
        accepted = state
        let f = funs[fN]
        // if we have no more functins to look at,
        // get the next key and reset the function
        // counter. Lets us avoid having two whiles.
        if (!f) {
            key = keys.shift()
            fN = 0
            f = funs[fN]
        }
        if (!key) break
        let val = field[key]
        // result of reducer f[fN]
        let result = f(state, val, key, path)
        // Either we accumulate iterables to go deeper
        // after all the current level (breadth),
        // or we go recursive right now.
        if (breadth && val && typeof val === 'object') {
            nextBreadth.push({ val, key })
        } else {
            result = G(val, result, breadth, orientation, path.concat(key))
        }
        if (result !== true) state = result
        fN++
    }
    // if breadth first, going recursive on what was nested in this layer
    let result = accepted
    while (result && nextBreadth.length) {
        let { val, key } = nextBreadth.shift()
        result = G(val, accepted, breadth, orientation, path.concat(key))
        if (result && result !== true) accepted = result
    }
    return accepted
}

It passes all the previous use cases, and some other more. Here's a reversed breadth first test:

   it('groupoid reverse breadth first for nested objects', () => {
        const target = {
            something: {
                matching: {
                    key: 'value3'
                }
            },
            matching: {
                key: 'value1'
            },
            matchingToo: {
                key: 'value2'
            },
            deep: {
                object: {
                    matching: {
                        key: 'value4'
                    },
                    matchingToo: {
                        key: 'value5'
                    }
                }
            }
        }

        let found = f.groupoid(
            (acc, val, key, path) =>
                key.indexOf('matching') >= 0
                ? acc.concat({ [key]: val, path }) : acc
        )(target, [], true, -1)

        expect(found).to.deep.equal([{
            matchingToo: { key: 'value2' },
            path: []
        }, {
            matching: { key: 'value1' },
            path: []
        }, {
            matchingToo: { key: 'value5' },
            path: [ 'deep', 'object' ]
        }, {
            matching: { key: 'value4' },
            path: [ 'deep', 'object' ]
        }, {
            matching: { key: 'value3' },
            path: [ 'something' ]
        }])
    })

And here is a proof of concept of a string map made with a groupoid:

    it('groupoid changing values in place, like a map', () => {
        let target = {
            key: 'root',
            items: [{
                key: 'x',
                items: [{
                    key: 'y',
                    items: [{
                        key: 'z',
                        extra: 'blah'
                    }, {
                        key: 's'
                    }]
                }]
            }]
        }

        const fixArrayKeys = k => isNaN(parseInt(k)) ? k : `[${k}]`
        const fixPath = _.flow(_.map(fixArrayKeys), _.join('.'), x => x.replace(/\.\[/g, '['))
        const mapStrings = (fn, obj) => f.groupoid(
            (o, v, k, path) => {
                if (!_.isString(v)) return o
                let loPath = fixPath(path.concat(k))
                return _.has(loPath, o) ? _.set(loPath, fn(v), o) : o
            }
        )(obj, obj)

        const result = mapStrings(x => x.toUpperCase(), target)

        expect(result).to.deep.equal({
            key: 'ROOT',
            items: [{
                key: 'X',
                items: [{
                    key: 'Y',
                    items: [{
                        key: 'Z',
                        extra: 'BLAH'
                    }, {
                        key: 'S'
                    }]
                }]
            }]
        })
    })

Full tests: https://github.com/smartprocure/futil-js/pull/63

i'm interested in seeing a PR for this

Hi! I can do it (probably starting in 12 hours), however I was wondering if we could get feedback on the ideas, specially on the approach. For example, on what's exposed: groupoid( ...functions )( targetObject, accumulator, /* breadth, orientation */), would it be better differently?, perhaps something else could also be better?

(thanks for the interest ๐Ÿ˜)

we have almost no variadic functions. So the first thing I would recommend is taking a list (or stream?) of functions rather than a variable number of arguments:

export const groupoid = ([funs]) => 
//...

we also have no functions with optional params, so you may want to consider that:

export const groupoid = (...funs) => function G (
    // REQUIRED
    field, // input n-dimensional field
    // Optional parameters
    acc = [], // accumulator
    breadth, // breadth first is falsy by default
    orientation = 1, // > 0 by default, goes bottom to top if < 0 // TODO: Expose a groupoidRight to make this easier
    path = [] // only used for recursion purposes, to expose the parent keys of the current key value pair
) 

One possibility would be to expose e.g. bfs and dfs as separate functions derived from this one, and maybe not have groupoid on the public API at all?

Then let's consider whether the accumulator should in fact be a required param; I'm inclined to say it should -- let the user explicitly control the shape of the output:

export const groupoid = [funs] => curry((field, acc) => //...etc );

Then we should consider what order of arguments is optimal for this function. Which things are least likely to change? Different functions over one field? Or different fields using the same list of functions?

This looks very similar to a fold but with a list of functions instead of one. But the sig of the funs takes 3 arguments; can we align it with the binary fold-style function?

I also agree a generic trampoline helper may be helpful here. A long time ago we had one ... maybe that was in the other lib ...

In the PR I can also avoid recursuion, the code won't be nice but we won't be having maximum stack errors โœŒ

got it @buzzdecafe ! I'll follow your guidelines ๐Ÿ‘๐Ÿ‘๐Ÿ‘

@buzzdecafe also, yes! this is just a fold, but goes deep. So in summary, we want:

  • A generic trampoline? or just avoid recursion?
  • A ... depth first deep fold?
  • A ... breadth first deep fold?
  • The <name>Right version of those above? (for the orientation?)

if you are not recurring on G you may be able to curry the whole thing then?

// is the sig something like this?
// g :: [((a, x) -> a)] -> a -> Object -> [a]
const g = curry(([funs], acc, field) => { 
  // etc.
});

btw, this sig is appealing to me since it looks so much like reduce, the connection is obvious.

the connection is obvious yes! All I need is... a deep reduce that conditionally stops.

In fact you can use the groupoid just like reduce, it works for one reducer and one-dimensional input fields.

wow this is starting to look like a lot of work! For now, it may be easiest to implement iteratively; let's not overload this. Maybe do the dfs version first, and leave the right version for a future PR. I wanna keep this manageable for you and for anyone reviewing this

In fact you can use the groupoid just like reduce, it works for one reducer and one-dimensional input fields.

That makes sense, it is more general.

It's not that much work! I have a working implementation ๐Ÿ˜‚

Proposal: Four functions that,

  • Stop the iteration if a false value is sent.
  • Ignore true results from the reducer passed.
  • Accept only one reducer since we can compose before sending.
  • Do not use recursion internally.
  • Accept any kind of accumulator, except booleans.

depthFold(reducer, accumulator, target)
breadthFold(reducer, accumulator, target)
depthFoldRight(reducer, accumulator, target)
breadthFoldRight(reducer, accumulator, target)

@buzzdecafe would that be fine for you? I can make separate PRs (or all in one PR!)

Stop the iteration if a false value is sent.

This makes me a little nervous. Transducers e.g. uses a {reduced: true} object as the token to terminate the iteration.

@buzzdecafe sure! we can stop if R.reduced is passed.

not suggesting that necessarily, esp. since transducers may get pulled from core lib. but looking for a better way to indicate "stop iterating"

Well, unless we have R.reduced, I think we should trust on false, since it might be obnoxious to tell users to use some value other than false for any conditional function. I personally don't see much benefit of allowing boolean values to be accumulated.

You know the right answer, @buzzdecafe. :stuck_out_tongue_winking_eye:

@davidchambers what's the right answer? ๐Ÿ‘€

I expect @davidchambers is suggesting using an ADT to handle _either_ the return case or the recurse case, and pattern matching on the type to do the right thing.

@buzzdecafe what's an ADT?

We can separate the reducer with the conditional stop, but I think it's better to use the reducer... at least prettier! But however is better for Ramda \o/

@buzzdecafe wins the prize!

Rather than have the function return a value of type a we could have it return a value of type Maybe a, where Nothing (a value of type Maybe a) would signal completion. As we would not even inspect the a value, there would be no special cases. S.unfoldr uses this approach (whereas R.unfold requires the function to return either false or an array, which is rather odd).

To avoid introducing the Maybe type we could use Array a as the return type: [] would signal completion; [42] would signal that the computation should continue with 42.

@davidchambers woah! [] to stop is wonderful! Thanks for the idea :v:

@davidchambers wait, what if I only want to aggregate some of the values that pass through the reducer? I can be printing empty arrays for a while.

I'm not following this issue closely. What I'm suggesting is using [] rather than false (or some other sentinel value) and using [x] rather than x. If x is [1, 2, 3], say, you'll use [[1, 2, 3]].

I'm a bit late to the party here, but instead of using a specific value or object structure, wouldn't this be a good use case for a Symbol? I think it's easier than introducing specific types or structures and is part of the official JS standard.

@davidchambers no, no, the reducer returns always the accumulator, so if at some point it returns [x], x, [z] all of them are valid changes to the accumulator

Perhaps some code would help to convey my idea.

Before:

//  result :: a
var result = f(x);
if (result === false) {
  // terminate
} else {
  // continue with `result`
}

After:

//  result :: Array a
var result = f(x);
if (result.length === 0) {
  // terminate
} else {
  // continue with `result[0]`
}

Note that in the first case false is treated specially, whereas in the second case result[0] can be absolutely any JavaScript value.

@davidchambers if I'm following what you're saying, always wrapping results in array and returning [] to stop is the same as just returning undefined, isn't it? That's also a valid option - return undefined to stop, and if you want no value you return null. With that said, in general I think any kind of value has the potential to be what you're looking for in JS, so the only way to know for sure that the return value is _not_ an accumulator result is to make something specific to the method.

See my comment above, @daedalus28. There is absolutely no possibility of conflict. [undefined] signals _continue with undefined_, [[1, 2, 3]] signals _continue with [1, 2, 3]_, etc. The array's length (0 or 1) determines whether or not to terminate.

@davidchambers ok I'm buying it, but only if Ramda already does that somewhere else.

Got it, it's treating an array as a maybe where the array is just the container, which either has stuff in it or doesn't, e.g. KindaSortaLikeaMaybeContainer = x => [x]. That's neat ๐Ÿ˜„

To @sadasant's point, is there any precedent of using arrays as quick and dirty functional containers?

[I]s there any precedent of using arrays as quick and dirty functional containers?

I don't think so.

@buzzdecafe are you on board with @davidchambers 's solution? It's super nice, but it needs to make sense in Ramda's context.

definitely prefer the single return type. So yes, on board.

  1. "ADT" = Algebraic Data Type
  2. and we should fix unfold, that is pretty gross

@buzzdecafe got it, makes sense!
PRs coming through the day and/or tomorrow.
@buzzdecafe @davidchambers @daedalus28 thank you all for the feedback!

We have a first PR for depthReduce ๐Ÿ˜ธ https://github.com/ramda/ramda/pull/2133 once we adjust it to Ramda needs, the rest of the PRs will follow quickly ๐Ÿ˜ฑ

Please review! But thanks in advance for the all time you've already invested ๐Ÿ™‡

Hello, @CrossEye ! I'm curious, were you able to see this? ^

@buzzdecafe @davidchambers So, I recognize I'm pretty excited about contributing ๐Ÿ˜…, but I also want to make that energy useful: and we should fix unfold, that is pretty gross, I'm guessing an unfold where the iterator is only expected to return a single-element array? or an empty array if we want to stop? I can do that o/

Since we don't have the option of using Maybe (Pair a b), we'd need to use Array (Pair a b) and assume that the array's length will either be zero or one ([] or [['foo', 42]], say).

@davidchambers I made an issue to continue there. It has a sketch of the upcoming unfold. I'll be making the PR during the evening :v: https://github.com/ramda/ramda/issues/2138

Actually, I just found that partial.lenses exist: https://github.com/calmm-js/partial.lenses

L.select([L.elems, "y"], [{x:1},{y:2},{z:3}])
L.selectAs(x => x > 3 ? -x : undefined, L.elems, [3,1,4,1,5])

That works over nested structures. Is exactly what I've been looking for, but also formalized.
I'm still wondering if there's value in this for ramda though.

partial.lenses author also has https://github.com/polytypic/fastener which works for structures of unknown shape.

Why was this issue closed when the PR (#2133) is still open?
And why is the PR still open?

Was this page helpful?
0 / 5 - 0 ratings

Related issues

allangomessl picture allangomessl  ยท  34Comments

luizbills picture luizbills  ยท  98Comments

ivan-kleshnin picture ivan-kleshnin  ยท  99Comments

esamattis picture esamattis  ยท  40Comments

davidchambers picture davidchambers  ยท  39Comments