Seurat: RunPCA error

Created on 3 Jul 2019 · 16 Comments · Source: satijalab/seurat

Hi Seurat Team,
I followed the tutorial "Integrating stimulated vs. control PBMC datasets to learn cell-type specific responses".

immune.combined <- IntegrateData(anchorset = immune.anchors, dims = 1:30)
DefaultAssay(immune.combined) <- "integrated"  
immune.combined <- ScaleData(immune.combined, verbose = FALSE)

These steps ran successfully.

The immune.combined object is:

An object of class Seurat
16494 features across 142032 samples within 2 assays
Active assay: integrated (2000 features)
1 other assay present: RNA

But when I ran RunPCA, it failed.
immune.combined <- RunPCA(immune.combined, npcs = 30, verbose = FALSE)
The error is:
Error in irlba(A = t(x = object), nv = npcs, ...) :
max(nu, nv) must be positive

Could you please help me with that?

Many thanks,
Emily

more-information-needed

zyzhangyan

Most helpful comment

Hi, in case someone comes across this post, I had the same error message. I saw another issue regarding this, and I made the same comment there too, sorry for posting twice.
So after finding the variable genes with the FindVariableFeatures() function prior to PCA, RunPCA() did not complain any more. Don't know why it did not work without it though.

seu <- FindVariableFeatures(object = seu)
seu <- RunPCA(seu, features = VariableFeatures(object = seu) )

eregenyi on 4 Oct 2019

👍10

All 16 comments

Hi Emily,

Are you running this on the data from that tutorial? If not, could you provide an object that reproduces the issue?

andrewwbutler on 12 Jul 2019

Hi Andrew,

I'm not running the data from the tutorial; I'm using a large dataset that contains more than 130,000 cells from 23 samples, so it may not be convenient to provide the data.
I was wondering whether the number of cells is too large.

zyzhangyan on 15 Jul 2019

Hmm, if you downsample the object to say 1k cells do you get the same error?

andrewwbutler on 19 Jul 2019

Closing this now as we have not heard back, but please re-open if you are still having problems

timoast on 9 Aug 2019

Hi Emily,

Have you solved this problem?
I ran into the same error with my data, but I only have around 500 cells from 5 groups (5 96cells/well).

woshilushu on 30 Aug 2019

Hi Seurat Team,
I have 98000 cells from 27 samples.

I ran the following code:

Gastricv2.integrated <- IntegrateData(anchorset = Gastricv1.anchors, features.to.integrate = my.genes,  dims = 1:30)
DefaultAssay(Gastricv2.integrated) <- "integrated"
Gastricv2.integrated <- ScaleData(Gastricv2.integrated, features = my.genes, verbose = FALSE)

and it ran without problems.

However, when I run the following code:
Gastricv2.integrated <- RunPCA(Gastricv2.integrated, npcs = 30, verbose = FALSE)

I get the following error and warning:

Error in irlba(A = t(x = object), nv = npcs, ...) :
max(nu, nv) must be positive
In addition: Warning message:
In PrepDR(object = object, features = features, verbose = verbose) :
The following 2000 features requested have zero variance (running reduction without them)...

So I tried running the code again using 9 samples and 49000 cells, and it ran successfully.

Can you help me figure out why it fails when I use 27 samples with 98000 cells?

Thanks very much,
Vikrant

Vikrant-Kumar2019 on 25 Dec 2019

I'm getting the same error in the SCT workflow for data integration, even after down-sampling the whole data set from roughly 90k to 25k cells.

seurat_integrated <- IntegrateData(
  anchorset = seurat_anchors,
  normalization.method = 'SCT'
)

seurat_integrated
# An object of class Seurat 
# 37391 features across 24680 samples within 3 assays 
# Active assay: integrated (3000 features)
#  2 other assays present: RNA, SCT

seurat_integrated <- RunPCA(seurat_integrated)
# Error in irlba(A = t(x = object), nv = npcs, ...) : 
#   max(nu, nv) must be positive
# In addition: Warning message:
# In PrepDR(object = object, features = features, verbose = verbose) :
#   The following 3000 features requested have zero variance (running reduction without them): LYZ, HBA1, HBB, S100A8, HBA2, S100A9, HBD, HBM, CA1, GNLY, RP11-1143G9.4, AHSP, CCL5, CXCL8, HIST1H4C, HLA-DRA, S100A12, CST3, TYROBP, CD74, IGLL1, CA2, FCN1, IGKC, JCHAIN, KLRB1, NKG7, GYPA, IGLC3, LST1, LGALS1, IGLC2, G0S2, GZMB, STMN1, CCL3, FCER1G, AIF1, TUBA1B, CTSS, CCL4, HLA-DPB1, TCL1A, HLA-DPA1, CSTA, HLA-DRB1, FCGR3A, ALAS2, PRDX2, VCAN, MZB1, HMGB2, GZMK, HEMGN, CMC1, IGHM, SNCA, VPREB3, GZMA, HLA-DQA1, FCER1A, TRDC, SAT1, TUBB, SPINK2, GZMH, HLA-DQB1, CD79B, RETN, KIAA0101, S100A11, CD79A, MNDA, GYPB, IFIT1B, LGALS2, BLVRB, COTL1, IFITM3, AZU1, CD14, SERPINA1, CD24, MS4A1, SLC4A1, FGFBP2, TMCC2, TRBC1, CXCL2, SOX4, SLC25A37, IGHA1, IGHG3, IGHG1, CFD, CH17-373J23.1, CCL3L3, LTB, EREG, KLRF1, MS4A6A, FAM178B, IL32, CD8B, KLRD1, HLA-DRB5, IGHD, NEAT1, CST7, MS4A7, S100A4, S100A6, SRGN, FTL, HMGB1, CLIC3, PLAUR, PSAP, UBE2C, IRF8, CTSW, IFI30, RGS2, IFNG, PLD4, PRSS57, VIM, PRF1, HOPX, NA [... truncated]

I also tried to run RunPCA() with the features specified, as suggested by @eregenyi, but that didn't work either.

With all cells, I already got an error when trying to split the cells by group using SplitObject() which I couldn't figure out how to solve.

EDIT: I'm using Seurat v3.1.1 on R 3.6.1.

romanhaa on 21 Jan 2020

Hi Seurat Community,

I encountered a similar issue when trying to combine a 700-cell object with a previously combined 4800-cell set. When I tried running FindVariableFeatures() before running PCA, I got the error: "Error in simpleLoess(y, x, w, span, degree = degree, parametric = parametric, : invalid 'x'". Downsampling the 4800-cell set to 2000 cells doesn't solve the issue either.

Sincerely hope someone could share their solution!

Thanks!

cadyyuheng on 5 Feb 2020

I encountered the same problem and then found that the total number of variable genes was zero (0) because of an unintentional mistake in my pipeline. So please check whether any variable genes are present. I am not sure, but having cells from only one cell type, or only a few cells, might cause this issue too.

rahulnutron on 19 Mar 2020

👍1
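
The check suggested in rahulnutron's comment can be made explicit before calling RunPCA(). The following is a minimal base-R sketch, not Seurat code: `check_pca_features` is a hypothetical helper, and `scaled` stands in for a feature-by-cell scaled matrix taken from the object. It only assumes that PCA needs at least one requested feature to be present in that matrix.

```r
# Hypothetical helper (not part of Seurat): fail early with a readable
# message when no usable features would reach the PCA step.
check_pca_features <- function(features, scaled) {
  if (length(features) == 0) {
    stop("No variable features are set; select features before PCA.")
  }
  # Keep only requested features that actually exist in the scaled matrix.
  usable <- intersect(features, rownames(scaled))
  if (length(usable) == 0) {
    stop("None of the requested features are present in the scaled matrix.")
  }
  usable
}

# Toy scaled matrix: 3 genes x 4 cells (values are arbitrary).
scaled <- matrix(
  rnorm(12), nrow = 3,
  dimnames = list(c("LYZ", "GNLY", "CD74"), paste0("cell", 1:4))
)

# "HBB" is requested but missing from the matrix, so it is dropped.
usable <- check_pca_features(c("LYZ", "CD74", "HBB"), scaled)

# An empty feature vector is reported as an error instead of reaching PCA.
empty_result <- tryCatch(
  check_pca_features(character(0), scaled),
  error = function(e) conditionMessage(e)
)
```

On a real object, the feature vector would be the variable features and the matrix would be the scaled data of the active assay; how to extract them depends on the Seurat version.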

Hi Rahul,
Thanks for sharing! May I ask how you fixed it when no variable genes are present because one cell type contains only a few cells?

cadyyuheng on 19 Mar 2020

I encountered the same issue. There are genes in my data with variance = 0. This creates NA values in the sparse matrix after log normalization. Then, when I used ScaleData, it produced a lot of zeros. My guess is that when Seurat fed this scaled data to irlba, irlba removed columns/rows with variance = 0, and the final matrix was smaller than the expected number of left/right singular vectors. FYI, this is my data (a Seurat object):

sum(data@assays$RNA@meta.features$vst.variance == 0)
# 95
table(is.na(data@assays$RNA@counts@x))
# FALSE
# 19332
table(is.na(data@assays$RNA@data@x))
# FALSE  TRUE
# 19332 12870
data <- Seurat::RunPCA(data, npcs = 50, verbose = FALSE)
# Error in irlba(A = t(x = object), nv = npcs, ...) :
#  max(nu, nv) must be positive

So I replaced the NA values in my normalized matrix with 0 (I know this is not right), re-scaled the data, and re-ran PCA, and it worked.

data@assays$RNA@data@x[is.na(data@assays$RNA@data@x)] <- 0
data <- Seurat::ScaleData(data, features = Seurat::VariableFeatures(data))
data <- Seurat::RunPCA(data, npcs = 50, verbose = FALSE)

Any better solution?

TriLe965 on 9 Apr 2020
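
The diagnosis in TriLe965's comment (NA values and zero-variance features leaving too little for the decomposition) can be checked on a plain matrix before running PCA. This is a minimal base-R sketch under stated assumptions: `summarise_features` is a hypothetical helper, and `mat` is a toy feature-by-cell matrix standing in for the normalized or scaled data of the assay. It reports how many features are affected rather than overwriting NA values with 0.

```r
# Hypothetical helper (not part of Seurat): count features that contain NA
# values or have zero variance across cells, and how many remain usable.
summarise_features <- function(mat) {
  has_na   <- apply(mat, 1, anyNA)
  row_var  <- apply(mat, 1, var, na.rm = TRUE)
  zero_var <- !has_na & !is.na(row_var) & row_var == 0
  c(
    total         = nrow(mat),
    with_na       = sum(has_na),
    zero_variance = sum(zero_var),
    usable        = sum(!has_na & !zero_var)
  )
}

# Toy matrix: rows are features (genes), columns are cells.
mat <- rbind(
  geneA = c(1, 2, 3, 4),
  geneB = c(0, 0, 0, 0),    # zero variance
  geneC = c(1, NA, 2, NA),  # contains NA values
  geneD = c(5, 5, 6, 7)
)

report <- summarise_features(mat)
print(report)
```

On the toy matrix, two of the four features remain usable. If the usable count on real data is zero, or smaller than the requested number of components, the problem is more likely upstream (for example in normalization) than in RunPCA itself.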

Hi romanhaa,

Did you ever figure out the solution to this issue? I'm experiencing the exact same thing on a dataset of 9k cells.

Thanks,
Will

wzhao01 on 6 May 2020

I encountered the same issue. Any news on this? Thanks

ipatop on 22 Jun 2020

I have the same problem with RunPCA

pbmc <- RunPCA(pbmc, features = VariableFeatures(object = pbmc))
Error in irlba(A = t(x = object), nv = npcs, ...) :
max(nu, nv) must be strictly less than min(nrow(A), ncol(A))

Does anyone have a solution?

ehatl on 20 Jul 2020
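
The message in ehatl's comment differs from the one in the original post: it says the number of requested singular vectors must be strictly less than the smaller dimension of the input matrix. Since RunPCA() defaults to npcs = 50, one plausible cause (an assumption, not confirmed in the thread) is an object with 50 or fewer cells or features. The sketch below uses a hypothetical helper, `cap_npcs`, to pick a value that satisfies that constraint.

```r
# Hypothetical helper (not part of Seurat or irlba): cap the requested number
# of components so it is strictly less than min(n_features, n_cells).
cap_npcs <- function(requested, n_features, n_cells) {
  limit <- min(n_features, n_cells) - 1
  if (limit < 1) {
    stop("Too few features or cells to compute any component.")
  }
  min(requested, limit)
}

# 2000 features but only 40 cells: the default of 50 components is too many.
npcs_small <- cap_npcs(50, n_features = 2000, n_cells = 40)

# With 5000 cells the requested value is already valid and is kept.
npcs_large <- cap_npcs(50, n_features = 2000, n_cells = 5000)
```

The resulting value could then be passed as the npcs argument of RunPCA(), as in the calls shown elsewhere in this thread.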

One has to do the following steps before running a PCA:

GEX <- NormalizeData(GEX, normalization.method = "LogNormalize", scale.factor = 10000)
GEX <- ScaleData(GEX, features = rownames(GEX))
GEX <- FindVariableFeatures(GEX, selection.method = "vst", nfeatures = 2000)
GEX <- RunPCA(GEX, features = VariableFeatures(object = GEX))