Shap: problem with pytorch model

Created on 5 Mar 2019 · 11 Comments · Source: slundberg/shap

I am trying to use shap to interpret a genomic deep learning model in PyTorch. The input tensors to shap are three-dimensional, with shapes like torch.Size([100, 4, 2000]) for the background and torch.Size([5, 4, 2000]) for the input samples passed to shap_values().

I get the following error message:

  File "/home/users/bioinf/erane/work/unity/tools/gedai/dna_pytorch_shap.py", line 668, in <module>
    main()
  File "/home/users/bioinf/erane/work/unity/tools/gedai/dna_pytorch_shap.py", line 215, in main
    shap_values = e.shap_values(xdata)
  File "/evogene/software/anaconda3/envs/py3.6-env-erane/lib/python3.6/site-packages/shap/explainers/deep/__init__.py", line 119, in shap_values
    return self.explainer.shap_values(X, ranked_outputs, output_rank_order)
  File "/evogene/software/anaconda3/envs/py3.6-env-erane/lib/python3.6/site-packages/shap/explainers/deep/deep_pytorch.py", line 221, in shap_values
    sample_phis = self.gradient(feature_ind, joint_x)
  File "/evogene/software/anaconda3/envs/py3.6-env-erane/lib/python3.6/site-packages/shap/explainers/deep/deep_pytorch.py", line 166, in gradient
    grads = [torch.autograd.grad(selected, x)[0].cpu().numpy() for x in X]
  File "/evogene/software/anaconda3/envs/py3.6-env-erane/lib/python3.6/site-packages/shap/explainers/deep/deep_pytorch.py", line 166, in <listcomp>
    grads = [torch.autograd.grad(selected, x)[0].cpu().numpy() for x in X]
  File "/evogene/software/anaconda3/envs/py3.6-env-erane/lib/python3.6/site-packages/torch/autograd/__init__.py", line 144, in grad
    inputs, allow_unused)
RuntimeError: hook 'deeplift_grad' has changed the size of value

On the same machine I can successfully use shap with PyTorch MNIST image classification examples (such as https://www.kaggle.com/ceshine/pytorch-deep-explainer-mnist-example). Any hint would be much appreciated.

Eran

eranbio

Most helpful comment

Hi!

1) A Flatten layer hasn't been implemented in PyTorch yet. Assuming the flatten layer looks something like what is being implemented here: https://github.com/pytorch/pytorch/pull/22245, then this doesn't affect the gradients - the DeepExplainer should work fine for this.

2) Support for Adaptive Average Pooling was added here: https://github.com/slundberg/shap/pull/609. However, the correct behaviour for the DeepExplainer for Adaptive Average Pooling layers is to leave the gradient unchanged, since it is linear. This is what the DeepExplainer does when it doesn't recognize a Module, so it should be safe to ignore that warning.

3) Assuming the BasicBlock you are referring to is this: https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py#L34, then the explainer should be able to handle it, since all of its component modules have been explicitly implemented.

gabrieltseng on 17 Jul 2019
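
The claim in points 1) and 2), that flattening and adaptive average pooling leave the gradient untouched because they are linear, can be checked with plain PyTorch. The following is a minimal sketch under assumed tensor sizes (2 x 3 x 8 x 8) and an assumed helper name `pool_and_flatten`; it uses `torch.flatten` as a stand-in for a Flatten layer and does not call shap.

```python
import torch
from torch import nn

torch.manual_seed(0)
pool = nn.AdaptiveAvgPool2d((1, 1))


def pool_and_flatten(t):
    # Average each channel down to 1x1, then flatten to (batch, channels).
    return torch.flatten(pool(t), start_dim=1)


# Assumed input sizes, chosen only for illustration.
x = torch.randn(2, 3, 8, 8)
y = torch.randn(2, 3, 8, 8)
a, b = 2.0, -0.5

# Superposition: f(a*x + b*y) == a*f(x) + b*f(y) holds for a linear map.
lhs = pool_and_flatten(a * x + b * y)
rhs = a * pool_and_flatten(x) + b * pool_and_flatten(y)
is_linear = torch.allclose(lhs, rhs, atol=1e-5)

# For a linear map the gradient does not depend on where it is evaluated.
x1 = x.clone().requires_grad_(True)
x2 = y.clone().requires_grad_(True)
g1, = torch.autograd.grad(pool_and_flatten(x1).sum(), x1)
g2, = torch.autograd.grad(pool_and_flatten(x2).sum(), x2)
same_gradient = torch.allclose(g1, g2)

print(is_linear, same_gradient)
```

Because the gradient is the same at every input, there is nothing for a DeepLIFT-style rule to correct, which is consistent with the advice that the warning can be ignored for these layers.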

All 11 comments

Do you get the same error for GradientExplainer as well?

slundberg on 7 Mar 2019

I haven't tested my code with GradientExplainer yet. However, I have some insight into the problem with DeepExplainer. After I isolated the components of my PyTorch model, it turned out that the crash occurs in the Conv1d layer. After changing the tensor dimensions and using Conv2d instead, I was able to run shap and the results look fine, although I got a warning message that I am not sure about. I will post it soon.

eranbio on 10 Mar 2019
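
The workaround described above (changing the tensor dimensions and swapping Conv1d for Conv2d) can be sketched as follows. This is a minimal illustration, not the poster's actual model: the output channel count, kernel size, and padding are assumptions, and only the 4-channel, length-2000 input matches the thread. It copies the Conv1d weights into a Conv2d with a height-1 kernel and checks that both layers produce the same output.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Assumed layer sizes; only the 4 input channels come from the thread.
conv1d = nn.Conv1d(in_channels=4, out_channels=8, kernel_size=5, padding=2)

# Equivalent Conv2d: treat the sequence as an image of height 1.
conv2d = nn.Conv2d(in_channels=4, out_channels=8,
                   kernel_size=(1, 5), padding=(0, 2))
with torch.no_grad():
    # Conv1d weight is (out, in, k); Conv2d expects (out, in, 1, k).
    conv2d.weight.copy_(conv1d.weight.unsqueeze(2))
    conv2d.bias.copy_(conv1d.bias)

x = torch.randn(5, 4, 2000)        # (batch, channels, length)
out_1d = conv1d(x)                 # (5, 8, 2000)
out_2d = conv2d(x.unsqueeze(2))    # input (5, 4, 1, 2000), output (5, 8, 1, 2000)

# Dropping the dummy height axis recovers the Conv1d result.
same = torch.allclose(out_1d, out_2d.squeeze(2), atol=1e-4)
print(same)
```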

This is the warning message I get:
Warning: unrecognized nn.Module:

Scott, for your question - I don't get the same error with GradientExplainer. With GradientExplainer I can run with either Conv1d or Conv2d. I also don't get the above warning.

eranbio on 19 Mar 2019
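
For readers who want to try the GradientExplainer route mentioned above, here is a minimal sketch. The model `TinyDnaNet`, its layer sizes, and the random data are all assumptions made for illustration; only the tensor shapes come from the original post. shap is imported lazily inside the helper so the model code can be run on its own, and the shape of the returned SHAP values is not asserted because it may differ between shap versions.

```python
import torch
from torch import nn

torch.manual_seed(0)


class TinyDnaNet(nn.Module):
    """Hypothetical 1D CNN with a Conv1d and a MaxPool1d layer."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv1d(4, 8, kernel_size=9, padding=4)
        self.relu = nn.ReLU()
        self.pool = nn.MaxPool1d(4)          # (N, 8, 2000) -> (N, 8, 500)
        self.linear = nn.Linear(8 * 500, 1)

    def forward(self, x):
        h = self.pool(self.relu(self.conv(x)))
        return self.linear(h.flatten(1))


model = TinyDnaNet().eval()
background = torch.randn(100, 4, 2000)   # same shape as the background in the post
samples = torch.randn(5, 4, 2000)        # same shape as the explained samples


def explain_with_gradient_explainer(model, background, samples):
    # Imported here so the rest of the sketch runs without shap installed.
    import shap
    explainer = shap.GradientExplainer(model, background)
    return explainer.shap_values(samples)


output_shape = tuple(model(samples).shape)
print(output_shape)
```

Calling `explain_with_gradient_explainer(model, background, samples)` requires shap to be installed.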

Hmm... @gabrieltseng might have some ideas since he wrote the PyTorch parts.

slundberg on 20 Mar 2019

Hi!

With respect to the warning you are getting (Warning: unrecognized nn.Module: ), this is to tell you that Conv1d isn't implemented in the deep explainer yet. There was a typo in the warning, which has been fixed here: https://github.com/slundberg/shap/commit/714110b0eea969e78437771c50c29dc6cc486605

For the first error, could you send me the model you are using? Because Conv1d isn't implemented in the deep explainer, it should default to not changing the gradient.

Modifying one of the tests to include a conv1d layer does seem to work:

def test_pytorch_regression():
    """Testing regressions (i.e. single outputs)
    """
    try:
        import numpy as np
        import torch
        from torch import nn
        from torch.nn import functional as F
        from torch.utils.data import TensorDataset, DataLoader
        # load_boston was removed in scikit-learn 1.2, so this test is
        # skipped on newer releases unless another dataset is substituted.
        from sklearn.datasets import load_boston
    except ImportError:
        print("Skipping test_pytorch_regression!")
        return
    import shap

    X, y = load_boston(return_X_y=True)
    num_features = X.shape[1]
    data = TensorDataset(torch.tensor(X).float(),
                         torch.tensor(y).float())
    loader = DataLoader(data, batch_size=128)

    class Net(nn.Module):
        def __init__(self, num_features):
            super(Net, self).__init__()
            # A 1x1 Conv1d over a single channel, added to exercise Conv1d.
            self.conv1d = nn.Conv1d(1, 1, 1)
            self.linear = nn.Linear(num_features, 1)

        def forward(self, X):
            # (N, F) -> (N, 1, F) for Conv1d, then back to (N, F).
            return self.linear(self.conv1d(X.unsqueeze(1)).squeeze(1))
    model = Net(num_features)
    optimizer = torch.optim.Adam(model.parameters())

    def train(model, device, train_loader, optimizer, epoch):
        model.train()
        num_examples = 0
        for batch_idx, (data, target) in enumerate(train_loader):
            num_examples += target.shape[0]
            data, target = data.to(device), target.to(device)
            optimizer.zero_grad()
            output = model(data)
            loss = F.mse_loss(output.squeeze(1), target)
            loss.backward()
            optimizer.step()
            if batch_idx % 2 == 0:
                print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                    epoch, batch_idx * len(data), len(train_loader.dataset),
                           100. * batch_idx / len(train_loader), loss.item()))

    device = torch.device('cpu')
    train(model, device, loader, optimizer, 1)

    # Use 20 randomly chosen rows of the first batch as the background.
    next_x, next_y = next(iter(loader))
    np.random.seed(0)
    inds = np.random.choice(next_x.shape[0], 20, replace=False)
    e = shap.DeepExplainer(model, next_x[inds, :])
    test_x, test_y = next(iter(loader))
    shap_values = e.shap_values(test_x[:1])

    # The SHAP values should sum to f(x) minus the mean background output.
    model.eval()
    model.zero_grad()
    with torch.no_grad():
        diff = (model(test_x[:1]) - model(next_x[inds, :])).detach().numpy().mean(0)
    sums = np.array([shap_values[i].sum() for i in range(len(shap_values))])
    d = np.abs(sums - diff).sum()
    assert d / np.abs(diff).sum() < 0.001, "Sum of SHAP values does not match difference! %f" % (
            d / np.abs(diff).sum())

gabrieltseng on 20 Mar 2019
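
The final assertion in the test above checks that the SHAP values sum to the difference between the model output for the sample and the mean output over the background. For a purely affine network like this `Net` (Conv1d followed by Linear, with no nonlinearity), the same property can be reproduced with plain autograd, which illustrates why leaving the gradient unchanged is enough for such layers. This is a minimal sketch, assuming random data in place of the Boston dataset and 13 input features; it does not call shap, and `attributions` is an illustrative name.

```python
import torch
from torch import nn

torch.manual_seed(0)


class Net(nn.Module):
    def __init__(self, num_features):
        super().__init__()
        self.conv1d = nn.Conv1d(1, 1, 1)
        self.linear = nn.Linear(num_features, 1)

    def forward(self, X):
        return self.linear(self.conv1d(X.unsqueeze(1)).squeeze(1))


num_features = 13                       # assumed feature count
model = Net(num_features).eval()
background = torch.randn(20, num_features)
x = torch.randn(1, num_features)

# Gradient of the output with respect to the input. The model is affine,
# so this gradient is the same at every input point.
x_req = x.clone().requires_grad_(True)
grad, = torch.autograd.grad(model(x_req).sum(), x_req)

# Per-feature attribution: gradient times (input minus mean background).
attributions = grad * (x - background.mean(0, keepdim=True))

with torch.no_grad():
    diff = (model(x) - model(background)).mean()

# For an affine model the attributions add up to the output difference.
gap = (attributions.sum() - diff).abs().item()
print(gap < 1e-4)
```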

Hi Gabriel,
Sorry, I think I caused some confusion. I actually got the original error (RuntimeError: hook 'deeplift_grad' has changed the size of value) with the Pool1d layer. After moving to Pool2d, it ran successfully.
The warning (Warning: unrecognized nn.Module: ) is probably related to Conv1d.
So if I understand correctly, Conv1d and Pool1d are not yet supported in DeepExplainer. Is there a plan to support them? Is there an up-to-date list of all supported modules?
Thanks!
Eran

eranbio on 24 Mar 2019
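
The Pool1d-to-Pool2d change described above can be wrapped in a small module so the rest of the model keeps its (batch, channels, length) tensors. This is a minimal sketch: `MaxPool1dVia2d` is a hypothetical name, the kernel size of 4 is an assumption, and the code only checks that the wrapper matches `nn.MaxPool1d` numerically. Whether it avoids the DeepExplainer error in a given shap version would need to be confirmed separately.

```python
import torch
from torch import nn


class MaxPool1dVia2d(nn.Module):
    """Hypothetical stand-in for nn.MaxPool1d built on nn.MaxPool2d."""

    def __init__(self, kernel_size, stride=None):
        super().__init__()
        stride = kernel_size if stride is None else stride
        # Pool only along the last axis; the dummy height axis has size 1.
        self.pool = nn.MaxPool2d(kernel_size=(1, kernel_size),
                                 stride=(1, stride))

    def forward(self, x):
        # (N, C, L) -> (N, C, 1, L) -> pool along L -> (N, C, L_out)
        return self.pool(x.unsqueeze(2)).squeeze(2)


torch.manual_seed(0)
x = torch.randn(5, 4, 2000)
reference = nn.MaxPool1d(kernel_size=4)(x)
workaround = MaxPool1dVia2d(kernel_size=4)(x)

# Max pooling only selects existing values, so the match is exact.
identical = torch.equal(reference, workaround)
print(identical)
```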

Hi Eran,

With respect to the warning, it should be fixed by this pull request: https://github.com/slundberg/shap/pull/507 . However, since a conv layer is linear, the default behaviour of the deep explainer when it doesn't recognize the module (to just take the gradient as normally calculated by pytorch) is the correct one, so the results will be correct with a conv1d layer despite the warning.

You are right; there is a bug in the max pool 1d layer; thanks for spotting it! I will look into fixing it.

You can see which modules are explicitly supported here: https://github.com/slundberg/shap/blob/master/shap/explainers/deep/deep_pytorch.py#L313

Thanks,

Gabi

gabrieltseng on 24 Mar 2019
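
To compare a model against the list of explicitly supported modules linked above, it helps to enumerate the layer types the model actually contains. The following is a minimal sketch: `leaf_module_types` is a hypothetical helper that is not part of shap, and the example model is made up. The resulting names still have to be checked by hand against the linked list.

```python
from collections import Counter

from torch import nn


def leaf_module_types(model):
    """Count class names of all leaf modules (modules without children)."""
    return Counter(
        type(m).__name__
        for m in model.modules()
        if len(list(m.children())) == 0
    )


# Hypothetical 1D model used only to demonstrate the helper.
model = nn.Sequential(
    nn.Conv1d(4, 8, kernel_size=5),
    nn.ReLU(),
    nn.MaxPool1d(4),
    nn.Conv1d(8, 8, kernel_size=5),
    nn.ReLU(),
)

types = leaf_module_types(model)
for name, count in sorted(types.items()):
    print(f"{name}: {count}")
```

Container modules such as `nn.Sequential` are skipped because they have children, so only the layers that actually transform tensors are listed.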

Hi, I'm wondering if it's safe to ignore this warning for the layer types Flatten and/or AdaptiveAvgPool2d?
Edit: Oh also, what about resnet BasicBlocks?

austinmw on 17 Jul 2019

Hi!

gabrieltseng on 17 Jul 2019

That information helps a ton, thank you!!

(hadn't realized Flatten was part of fastai and not actually in base pytorch)

austinmw on 17 Jul 2019

Hi Gabi,

I have a question: I'm trying to use DeepExplainer for a feedforward linear model in PyTorch trained on a tabular dataset, but I couldn't find any example. Where can I find one, and does DeepExplainer work with tabular data at all?
I also have a CNN model for text classification, and I'm curious whether I can use DeepExplainer for that model.
Thank you so much in advance.