Pomegranate: Cannot calculate model.probability()

Created on 7 Jan 2020 · 11 Comments · Source: jmschrei/pomegranate

I was following the tutorial and got an error on inputs 7 and 8. I made no changes to the code.

TypeError                                 Traceback (most recent call last)
<ipython-input-43-c1e53d06d14d> in <module>
----> 1 model.probability(['A', 'B', 'C'])

~/Documents/GitHub/py-bolit/venv/lib/python3.6/site-packages/pomegranate/base.pyx in pomegranate.base.Model.probability()

~/Documents/GitHub/py-bolit/venv/lib/python3.6/site-packages/pomegranate/BayesianNetwork.pyx in pomegranate.BayesianNetwork.BayesianNetwork.log_probability()

TypeError: list indices must be integers or slices, not tuple

pip list

Package            Version
------------------ -------
attrs              19.3.0 
backcall           0.1.0  
bleach             3.1.0  
cycler             0.10.0 
decorator          4.4.1  
defusedxml         0.6.0  
entrypoints        0.3    
importlib-metadata 1.3.0  
ipykernel          5.1.3  
ipython            7.11.1 
ipython-genutils   0.2.0  
jedi               0.15.2 
Jinja2             2.10.3 
joblib             0.14.1 
json5              0.8.5  
jsonschema         3.2.0  
jupyter-client     5.3.4  
jupyter-core       4.6.1  
jupyterlab         1.2.4  
jupyterlab-server  1.0.6  
kiwisolver         1.1.0  
MarkupSafe         1.1.1  
matplotlib         3.1.2  
mistune            0.8.4  
more-itertools     8.0.2  
nbconvert          5.6.1  
nbformat           5.0.3  
networkx           2.4    
notebook           6.0.2  
numpy              1.18.1 
pandas             0.25.3 
pandocfilters      1.4.2  
parso              0.5.2  
pexpect            4.7.0  
pickleshare        0.7.5  
Pillow             7.0.0  
pip                9.0.1  
pkg-resources      0.0.0  
pomegranate        0.12.0 
prometheus-client  0.7.1  
prompt-toolkit     3.0.2  
ptyprocess         0.6.0  
Pygments           2.5.2  
pygraphviz         1.5    
pyparsing          2.4.6  
pyrsistent         0.15.6 
python-dateutil    2.8.1  
pytz               2019.3 
PyYAML             5.3    
pyzmq              18.1.1 
scipy              1.4.1  
seaborn            0.9.0  
Send2Trash         1.5.0  
setuptools         39.0.1 
six                1.13.0 
terminado          0.8.3  
testpath           0.4.4  
tornado            6.0.3  
traitlets          4.3.3  
watermark          2.0.2  
wcwidth            0.1.8  
webencodings       0.5.1  
wheel              0.33.6 
zipp               0.6.0  

Python 3.6

All 11 comments

Same here

Can confirm that issue as well. Downgrading from 0.12.0 to 0.11.2 solves that ...

Downgrading solves that, but then there is another bug...

CODE

from pomegranate import BayesianNetwork, DiscreteDistribution, ConditionalProbabilityTable, Node, State
import matplotlib.pyplot as plt

angina_pectoris = DiscreteDistribution({0: 0.99, 1: 0.01})
heart_attack = DiscreteDistribution({0: 0.99, 1: 0.01})

pain_syndrome = ConditionalProbabilityTable([

    [0, 0, 1, 0.01],
    [0, 0, 0, 0.99],

    [0, 1, 1, 0.9],
    [0, 1, 0, 0.1],

    [1, 0, 1, 0.9],
    [1, 0, 0, 0.1],

    [1, 1, 1, 1],
    [1, 1, 0, 0],

], [angina_pectoris, heart_attack])

relief_of_pain = ConditionalProbabilityTable([

    [0, 0, 'yes', 0.1],
    [0, 0, 'not_completely', 0.1],
    [0, 0, 'no', 0.1],

    [0, 1, 'yes', 0.05],
    [0, 1, 'not_completely', 0.35],
    [0, 1, 'no', 0.6],

    [1, 0, 'yes', 0.9],
    [1, 0, 'not_completely', 0.05],
    [1, 0, 'no', 0.05],

    [1, 1, 'yes', 0.33],
    [1, 1, 'not_completely', 0.33],
    [1, 1, 'no', 0.33],

], [angina_pectoris, heart_attack])

ps = State(pain_syndrome, name='pain_syndrome')
rp = State(relief_of_pain, name='relief_of_pain')
ap = State(angina_pectoris, name='angina_pectoris')
ha = State(heart_attack, name='heart_attack')

model = BayesianNetwork('Medical decision support system')

model.add_states(ps, rp, ap, ha)

model.add_edge(ap, ps)
model.add_edge(ap, rp)

model.add_edge(ha, ps)
model.add_edge(ha, rp)

model.bake()

model.probability([0, 'yes', 0, 1])

ERROR

KeyError                                  Traceback (most recent call last)

<ipython-input-19-cfc278c8d52f> in <module>
----> 1 model.probability([0, 'yes', 0, 1])
      2 

~/Documents/GitHub/py-bolit/venv/lib/python3.6/site-packages/pomegranate/base.pyx in pomegranate.base.Model.probability()

~/Documents/GitHub/py-bolit/venv/lib/python3.6/site-packages/pomegranate/BayesianNetwork.pyx in pomegranate.BayesianNetwork.BayesianNetwork.log_probability()

~/Documents/GitHub/py-bolit/venv/lib/python3.6/site-packages/pomegranate/distributions/ConditionalProbabilityTable.pyx in pomegranate.distributions.ConditionalProbabilityTable.ConditionalProbabilityTable.log_probability()

KeyError: ('0', '1', '0')

@sviperm

Your error appears to originate from the mixed datatypes in your assignments. I suggest converting all integers (0, 1) to strings ('0', '1').

So it's a different issue from the one you stumbled upon initially.
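The string key in the KeyError is consistent with NumPy coercing a mixed int/str sample to strings. A minimal sketch of that effect, assuming only NumPy is installed; the dictionaries `int_table` and `str_table` are hypothetical stand-ins for a probability table, not pomegranate's internal data structure:

```python
import numpy

# A sample mixing integers and strings, as in the failing call.
# NumPy has no common numeric type for it, so every value becomes a string.
sample = numpy.array([0, 'yes', 0, 1])
values = sample.tolist()
print(values)

# Hypothetical lookup tables keyed by (angina_pectoris, heart_attack, pain_syndrome).
int_table = {(0, 1, 0): 0.1, (0, 1, 1): 0.9}
str_table = {('0', '1', '0'): 0.1, ('0', '1', '1'): 0.9}

# State order in the model is ps, rp, ap, ha, so the key is built from
# positions 2, 3 and 0 of the sample.
key = (values[2], values[3], values[0])
print(key)

# The integer-keyed table does not contain the string key...
print(key in int_table)
# ...while the string-keyed table does.
print(str_table[key])
```

This is why writing the table entries as strings, as suggested above, makes the lookup key and the table keys agree.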

I would recommend making the input to probability and log_probability always be 2D even when there's a single example. I have a paper deadline on the 30th but I'll look into this shortly after. Sorry for the inconvenience.

Thanks to @SebastianBelkner, we found the cause of the 0.12 incompatibility:
the documentation and examples for model.probability(...) require a 2D array-like structure.

Looking at the v0.11.2 implementation, the array-like structure was first converted to a NumPy array and then accessed via numpyarr[i,j]:

def log_probability(self, X, n_jobs=1):
    <...>
    X = numpy.array(X, ndmin=2)
    <...>
    for i in range(n):
        for j, state in enumerate(self.states):
            logp[i] += state.distribution.log_probability(X[i, self.idxs[j]])

    return logp if n > 1 else logp[0]

That conversion to a NumPy array was later removed, so indexing the input with X[i, j] no longer works on a plain list. Converting the 2D array to a NumPy array before passing it to model.probability(...) solves the problem.
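The indexing difference can be reproduced without pomegranate. A minimal sketch, assuming only NumPy is installed; the sample values come from the tutorial call in the original report, and `X_list` and `X_array` are illustrative names:

```python
import numpy

# A single sample passed as a plain Python list, as in the failing call.
X_list = ['A', 'B', 'C']

# X[i, j] indexes with a tuple, which a Python list does not support.
try:
    X_list[0, 1]
except TypeError as err:
    message = str(err)
    print(message)

# The conversion used in v0.11.2 promotes the same input to a 2D array
# of shape (1, 3), on which tuple indexing is valid.
X_array = numpy.array(X_list, ndmin=2)
print(X_array.shape)
print(X_array[0, 1])
```

The caught message matches the TypeError in the traceback at the top of the thread.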

IMHO there are two solutions:

  • either include X = numpy.array(X, ndmin=2) again, or
  • update documentation and examples.

I will add the cast to a numpy array back in the next version. Thanks for catching this and posting a temporary solution.

Downgrading from 0.12.0 to 0.11.2 really solved my problem. If you are using Kaggle, install pomegranate in the first cell with the command "!pip install pomegranate==0.11.2"

Passing in a single vector now raises an error. You should pass in a 2D matrix or a list of lists (even when there is only one example). In v0.12.1 print(model.probability([[0, 'yes', 0, 1]])) will return 4.95e-5, which looks like the right answer.
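The 4.95e-5 figure can be checked by hand from the tables posted earlier in the thread. A short sketch of the arithmetic, assuming the joint probability factorizes as the product of each node's probability given its parents; the variable names are illustrative:

```python
# Sample [0, 'yes', 0, 1] in state order ps, rp, ap, ha means:
# pain_syndrome=0, relief_of_pain='yes', angina_pectoris=0, heart_attack=1.

p_ap0 = 0.99      # P(angina_pectoris = 0)
p_ha1 = 0.01      # P(heart_attack = 1)
p_ps0 = 0.1       # P(pain_syndrome = 0 | ap = 0, ha = 1), row [0, 1, 0, 0.1]
p_rp_yes = 0.05   # P(relief_of_pain = 'yes' | ap = 0, ha = 1), row [0, 1, 'yes', 0.05]

# Product of the four factors, approximately 4.95e-5.
joint = p_ap0 * p_ha1 * p_ps0 * p_rp_yes
print(joint)
```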

@jmschrei In the current release neither a list nor a list of lists works. Has the source code already been changed, or will the example be updated?

v0.12.1 will have the fix. I will release that soon. For now, pass in a 2D numpy array and it should work.
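Until the fixed release is installed, the workaround can be wrapped in a small helper. This is only a sketch: `probability_2d` is a hypothetical helper name, and `_EchoModel` is a stand-in used here so the example runs without pomegranate; with a real model you would pass the baked BayesianNetwork instead.

```python
import numpy


def probability_2d(model, sample):
    """Promote a 1D or 2D sample to a 2D NumPy array before calling the model."""
    X = numpy.array(sample, ndmin=2)
    return model.probability(X)


class _EchoModel:
    """Stand-in for a model: returns the shape of the input it receives."""

    def probability(self, X):
        return X.shape


# A single flat sample and a list of lists both arrive as a (1, 4) array.
shape_1d = probability_2d(_EchoModel(), [0, 'yes', 0, 1])
shape_2d = probability_2d(_EchoModel(), [[0, 'yes', 0, 1]])
print(shape_1d, shape_2d)
```

Note that NumPy converts a sample mixing integers and strings to an array of strings, so the advice above about using string values in the tables still applies.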
