Pymc3: GaussianRandomWalk prior predictive is broken

Created on 13 Jun 2020  ·  8 Comments  ·  Source: pymc-devs/pymc3

Description of your problem

When running sample_prior_predictive for a GaussianRandomWalk, the result does not look even close to what one would expect:

import numpy
import pymc3
from matplotlib import pyplot

x = numpy.arange(0, 10)
with pymc3.Model() as pmodel:
    grw = pymc3.GaussianRandomWalk('grw', mu=0, sd=1, shape=len(x))
    pp = pymc3.sample_prior_predictive()
fig, (left, right) = pyplot.subplots(ncols=2, figsize=(10,5))
# left: 40 randomly chosen draws from the prior predictive
for i in numpy.random.randint(0, 500, size=40):
    left.plot(x, pp['grw'][i,:])
left.set_title('sample_prior_predictive')
# right: 50 independent draws directly from the distribution
for _ in range(50):
    right.plot(x, grw.random())
right.set_title('.random()')
pyplot.show()

[image: two-panel plot, left 'sample_prior_predictive', right '.random()']

Versions and main components

  • PyMC3 Version: latest master
  • Theano Version: the one and only
  • Python Version: 3.6.8
  • Operating system: Windows
  • How did you install PyMC3: pip
defects

All 8 comments

I would like to work on this

So far I have been able to tie the starting point of the lines in the sample_prior_predictive() plot to 0. Please find the screenshot of the result below.
[image]

The issue was that in the _random() function of GaussianRandomWalk, data = data - data[0] produces the intended line starting from 0 for a 1d array. For a 2d array, however, it subtracts the first row from every row, which turns the first row of the 2d array into zeros.
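
To illustrate the point above, here is a minimal NumPy-only sketch. It does not use the actual PyMC3 code; the array shape (one walk per row, time steps along the last axis), the seed, and the names walks, shifted_wrong and shifted_right are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed layout: 4 independent walks (rows), 6 time steps (columns).
walks = rng.normal(size=(4, 6)).cumsum(axis=-1)

# data - data[0] on a 2d array broadcasts the whole first ROW,
# so walk 0 is subtracted from every walk.
shifted_wrong = walks - walks[0]

# Subtracting each walk's own first value keeps the walks independent
# and makes every one of them start at 0.
shifted_right = walks - walks[:, :1]

print(shifted_wrong[0])      # all zeros: the first walk is wiped out
print(shifted_right[:, 0])   # all zeros: every walk starts at 0
```

With the first form only the first row ends up at zero, and every other row carries a copy of walk 0 in it. With the second form each row is shifted by its own starting value.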

is this fine now?

@Rish001 No, there is this weird correlation between instantiations that is quite puzzling. The left plot should look as random as the right.

@twiecki I suspect the issue lies in the internal workings of scipy's .rvs(size) method when size is 2d. When I invoked the same method in a loop for each row of the 2d matrix, it generated the expected plot. Please find the screenshot below.
image

Here is the code snippet for the 2d matrix:
data = np.empty(size)
for i in range(size[0]):
    data[i] = rv.rvs((size[1],)).cumsum(axis=axis)
    data[i] = data[i] - data[i][0]

That looks promising!
What if the shape is reversed (and the result transformed)?
If that doesn't help, do you think it's a scipy bug that should be reported?

@michaelosthege I believe it's a valid issue that should be reported to scipy.

I just commented on the scipy issue (https://github.com/scipy/scipy/issues/12482). It looks like the problem is actually in PyMC3; apparently the _random method in GaussianRandomWalk is not performing the cumulative sum along the correct axis.
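
A small NumPy-only sketch of what a cumulative sum along the wrong axis does. This is not the PyMC3 implementation; the (n_samples, n_steps) layout, the seed, and the helper neighbour_corr are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples, n_steps = 500, 10

# Independent Gaussian increments, one row per prior predictive sample.
steps = rng.normal(size=(n_samples, n_steps))

# Wrong axis: accumulates ACROSS samples, so sample i contains all
# increments of samples 0..i and neighbouring samples look alike.
wrong = steps.cumsum(axis=0)

# Intended axis: accumulates along the time steps of each sample.
right = steps.cumsum(axis=-1)

def neighbour_corr(paths):
    """Correlation between consecutive samples at the first time step."""
    return np.corrcoef(paths[:-1, 0], paths[1:, 0])[0, 1]

print("wrong axis:", neighbour_corr(wrong))
print("right axis:", neighbour_corr(right))
```

With axis=0 consecutive samples differ by a single increment per time step, which would produce the kind of correlation between instantiations mentioned earlier in the thread. With axis=-1 the samples stay independent of each other.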

Thanks a lot @WarrenWeckesser ! Don't know how I missed that 🤦
