Pomegranate: incompatibilities with the upcoming networkx 2.0

Created on 15 Dec 2016  ·  20Comments  ·  Source: jmschrei/pomegranate

networkx 2.0 switches graph.nodes() to return a generator rather than a list (see https://networkx.readthedocs.io/en/latest/reference/release_2.0.html#api-changes), so constructs such as len(self.graph.nodes()) (https://github.com/jmschrei/pomegranate/blob/master/pomegranate/hmm.pyx#L693) will fail with

TypeError: object of type 'generator' has no len()

All 20 comments

Great, thanks for the heads up. I'll go through the code and make the relevant changes soon.

any progress on this ? is it in release 0.9.0 ?

Howdy

Unfortunately not. The blocker I ran into is that the functionality of
topological sort changes and they don’t seem to provide a method for
recreating the old function. I’ll take another look at it, but since they
changed a great deal of functions without adding in much documentation is
takes a lot of work to identify what the right conversion is.

On Wed, Jan 3, 2018 at 10:16 AM stonebig notifications@github.com wrote:

any progress on this ? is it in release 0.9.0 ?


You are receiving this because you commented.

Reply to this email directly, view it on GitHub
https://github.com/jmschrei/pomegranate/issues/209#issuecomment-355085284,
or mute the thread
https://github.com/notifications/unsubscribe-auth/ADvEEBB2Jx63MXJtOsQYyZMkgsFxoVpFks5tG8QDgaJpZM4LNshS
.

for the line with problem, would this suffice ?
`````

prestates = self.graph.nodes()

from distutils.version import LooseVersion
if LooseVersion(networkx.__version__) < LooseVersion("2.0.0"):
prestates = self.graph.nodes()
else:
prestates = [i for i in self.graph.nodes()]
# or otherwise, prestates = list(self.graph.nodes())
````

it would not force you to change your program logic.

any progress on this ?

No progress on this yet.

I made an attempt, however some hmm tests return location mismatch errors for the arrays compared, similar to the following arrays:

 x: array([5, 1, 0, 2, 2, 2, 4])
 y: array([4, 1, 0, 2, 2, 2, 5])

 x: array([[      -inf,       -inf,       -inf,       -inf,       -inf,
          0.      ],
       [ -6.72405 ,  -3.874849,  -7.39693 ,  -2.656593,  -4.680333,...
 y: array([[      -inf,       -inf,       -inf,       -inf,   0.      ,
              -inf],
       [ -6.72405 ,  -3.874849,  -7.39693 ,  -2.656593,       -inf,...

When i first run the tests i get 10-15 tests failures. Calling PyCharm's rerun-failed-tests option 3 to 5 times makes all tests passed.

Hi @uludag, the reason is that topological_sort has completely changed, so sometimes you hit the "right" (i.e. previous) order just by chance. So just

- silent_states_sorted = networkx.topological_sort(silent_subgraph, nbunch=silent_states)
+ silent_states_sorted = networkx.topological_sort(silent_subgraph)

as you did won't work.

@jmschrei
Would it be possible if we included the 1.11 topological_sort function as kind of a legacy code with pomegranate? It would be quick and dirty but very convenient until we've figured out how to replace it. I'm very dependent on using both networkx and pomegranate in their newest versions so I'm pretty motivated. At any rate much better than constantly switching environments.

By the way, comparing the current release with nx1.11 (https://github.com/networkx/networkx/compare/networkx-1.11...master)
I also found the following:

topolgical_sort no longer accepts reverse or nbunch arguments.
If nbunch was a single node source, then the same effect can now be achieved
using the subgraph operator:
nx.topological_sort(G.subgraph(nx.descendants(G, nbunch)))

Indicating it might be possible to reproduce old behaviour by iterating over all silent_states, somehow...?

Source code in nx1.11:

import networkx as nx

def topological_sort(G, nbunch=None, reverse=False):
    """Return a list of nodes in topological sort order.

    A topological sort is a nonunique permutation of the nodes
    such that an edge from u to v implies that u appears before v in the
    topological sort order.

    Parameters
    ----------
    G : NetworkX digraph
        A directed graph

    nbunch : container of nodes (optional)
        Explore graph in specified order given in nbunch

    reverse : bool, optional
        Return postorder instead of preorder if True.
        Reverse mode is a bit more efficient.

    Raises
    ------
    NetworkXError
        Topological sort is defined for directed graphs only. If the
        graph G is undirected, a NetworkXError is raised.

    NetworkXUnfeasible
        If G is not a directed acyclic graph (DAG) no topological sort
        exists and a NetworkXUnfeasible exception is raised.

    Notes
    -----
    This algorithm is based on a description and proof in
    The Algorithm Design Manual [1]_ .

    See also
    --------
    is_directed_acyclic_graph

    References
    ----------
    .. [1] Skiena, S. S. The Algorithm Design Manual  (Springer-Verlag, 1998).
        http://www.amazon.com/exec/obidos/ASIN/0387948600/ref=ase_thealgorithmrepo/
    """
    if not G.is_directed():
        raise nx.NetworkXError(
            "Topological sort not defined on undirected graphs.")

    # nonrecursive version
    seen = set()
    order = []
    explored = set()

    if nbunch is None:
        nbunch = G.nodes_iter()
    for v in nbunch:     # process all vertices in G
        if v in explored:
            continue
        fringe = [v]   # nodes yet to look at
        while fringe:
            w = fringe[-1]  # depth first search
            if w in explored:  # already looked down this branch
                fringe.pop()
                continue
            seen.add(w)     # mark as seen
            # Check successors for cycles and for new nodes
            new_nodes = []
            for n in G[w]:
                if n not in explored:
                    if n in seen:  # CYCLE !!
                        raise nx.NetworkXUnfeasible("Graph contains a cycle.")
                    new_nodes.append(n)
            if new_nodes:   # Add new_nodes to fringe
                fringe.extend(new_nodes)
            else:           # No new nodes so w is fully explored
                explored.add(w)
                order.append(w)
                fringe.pop()    # done considering this node
    if reverse:
        return order
    else:
        return list(reversed(order))

Source code now:

def topological_sort(G):
    """Return a generator of nodes in topologically sorted order.
    A topological sort is a nonunique permutation of the nodes such that an
    edge from u to v implies that u appears before v in the topological sort
    order.
    Parameters
    ----------
    G : NetworkX digraph
        A directed acyclic graph (DAG)
    Returns
    -------
    iterable
        An iterable of node names in topological sorted order.
    Raises
    ------
    NetworkXError
        Topological sort is defined for directed graphs only. If the graph `G`
        is undirected, a :exc:`NetworkXError` is raised.
    NetworkXUnfeasible
        If `G` is not a directed acyclic graph (DAG) no topological sort exists
        and a :exc:`NetworkXUnfeasible` exception is raised.  This can also be
        raised if `G` is changed while the returned iterator is being processed
    RuntimeError
        If `G` is changed while the returned iterator is being processed.
    Examples
    --------
    To get the reverse order of the topological sort:
    >>> DG = nx.DiGraph([(1, 2), (2, 3)])
    >>> list(reversed(list(nx.topological_sort(DG))))
    [3, 2, 1]
    If your DiGraph naturally has the edges representing tasks/inputs
    and nodes representing people/processes that initiate tasks, then
    topological_sort is not quite what you need. You will have to change
    the tasks to nodes with dependence reflected by edges. The result is
    a kind of topological sort of the edges. This can be done
    with :func:`networkx.line_graph` as follows:
    >>> list(nx.topological_sort(nx.line_graph(DG)))
    [(1, 2), (2, 3)]
    Notes
    -----
    This algorithm is based on a description and proof in
    "Introduction to Algorithms: A Creative Approach" [1]_ .
    See also
    --------
    is_directed_acyclic_graph, lexicographical_topological_sort
    References
    ----------
    .. [1] Manber, U. (1989).
       *Introduction to Algorithms - A Creative Approach.* Addison-Wesley.
    """
    if not G.is_directed():
        raise nx.NetworkXError(
            "Topological sort not defined on undirected graphs.")

    indegree_map = {v: d for v, d in G.in_degree() if d > 0}
    # These nodes have zero indegree and ready to be returned.
    zero_indegree = [v for v, d in G.in_degree() if d == 0]

    while zero_indegree:
        node = zero_indegree.pop()
        if node not in G:
            raise RuntimeError("Graph changed during iteration")
        for _, child in G.edges(node):
            try:
                indegree_map[child] -= 1
            except KeyError:
                raise RuntimeError("Graph changed during iteration")
            if indegree_map[child] == 0:
                zero_indegree.append(child)
                del indegree_map[child]

        yield node

    if indegree_map:
        raise nx.NetworkXUnfeasible("Graph contains a cycle or graph changed "
                                    "during iteration")

BTW, I'm still pretty amused by the comment in the original code:

# Get the sorted silent states. Isn't it convenient how NetworkX has
# exactly the algorithm we need?
silent_states_sorted = networkx.topological_sort(silent_subgraph, nbunch=silent_states)

Sometimes it's too good to be true :)

@uludag, @jmschrei , see https://github.com/jmschrei/pomegranate/compare/master...poplarShift:nx2 for my cheap fix. All tests passing now as far as I can see.

@jmschrei any idea whether you'll pull @poplarShift's fix in? seems like a reasonable workaround to me 🤷‍♂️

What breaks when NetworkxX 2.0 is installed? Many of my pipelines are dependent on nx >= 2.0 at the moment :/

@jmschrei Another vote for folding in @poplarShift fix. Visualizing the graph would be really nice.

Pretty sure this should have been closed by #551, I've been using nx 2 w/o issue

@bnaul, what version of nx are you using? I've tried 2.2 and 2.3. Here's what I'm getting on 2.3 using python3.6:

  File ".../lib/python3.6/site-packages/networkx/drawing/nx_pylab.py", line 1067, in draw_spectral
    draw(G, spectral_layout(G), **kwargs)
  File ".../lib/python3.6/site-packages/networkx/drawing/layout.py", line 784, in spectral_layout
    G, center = _process_params(G, center, dim)
  File ".../lib/python3.6/site-packages/networkx/drawing/layout.py", line 50, in _process_params
    empty_graph.add_nodes_from(G)
  File ".../lib/python3.6/site-packages/networkx/classes/graph.py", line 564, in add_nodes_from
    for n in nodes_for_adding:

That's weird. I thought that I had fixed this already. Can you open an issue with a reproducible example and your current environment?

@jmschrei Sure, here's an example:

from pomegranate import BayesianNetwork
import networkx as nx
import matplotlib.pyplot as plt
import numpy

X = numpy.random.randint(2, size=(2000, 7))
X[:,3] = X[:,1]
X[:,6] = X[:,1]
X[:,0] = X[:,2]
X[:,4] = X[:,5]
model = BayesianNetwork.from_samples(X, algorithm='exact')
print(model.structure)
fig = plt.figure(figsize=(8, 8))
nx.draw_spectral(model.graph, with_labels=True, node_color='gray', font_size='8')

output:

((), (), (0,), (1,), (), (4,), (3,))
Traceback (most recent call last):
  File "/Users/landes/tmp/d.py", line 16, in <module>
    nx.draw_spectral(model.graph, with_labels=True, node_color='gray', font_size='8')
  File ".../lib/python3.6/site-packages/networkx/drawing/nx_pylab.py", line 1067, in draw_spectral
    draw(G, spectral_layout(G), **kwargs)
  File ".../lib/python3.6/site-packages/networkx/drawing/layout.py", line 784, in spectral_layout
    G, center = _process_params(G, center, dim)
  File ".../lib/python3.6/site-packages/networkx/drawing/layout.py", line 50, in _process_params
    empty_graph.add_nodes_from(G)
  File ".../lib/python3.6/site-packages/networkx/classes/graph.py", line 564, in add_nodes_from
    for n in nodes_for_adding:
TypeError: 'pomegranate.FactorGraph.FactorGraph' object is not iterable

The model.graph attribute there is a FactorGraph object, not a networkx object. While pomegranate explicitly stores a networkx graph in HMMs, it doesn't for BayesianNetworks, since BayesianNetwork objects are just a wrapper for FactorGraphs (which do the real inference). I realize this may be a little confusing. If you try doing model.plot it will visualize the network by building it in the function. If you'd like to do more with networkx than just plot it, or have more control over how the plotting happens, you can follow what I did in that function (https://github.com/jmschrei/pomegranate/blob/master/pomegranate/BayesianNetwork.pyx#L244) Fortunately it doesn't seem like this is an issue with networkx2. Sorry for the inconvenience.

If you replace the last two lines with

model.plot()
plt.show()

you should get the following picture:

Figure_1-1

@jmschrei Correct! Yes, I was able to get this to work and it's what I need.

However, what your saying regarding the data structure doesn't seem to match what the documentation says. But I could be missing something once again.

Thanks very much for your quick response (and to the others on this thread) for your help. It's much appreciated.

Was this page helpful?
0 / 5 - 0 ratings