ValueError: Length of values (1) does not match length of index with sc.pp.calculate_qc_metrics(adata) #2008
I am suddenly having a similar problem as well, in addition to the other issue I raised...
I'm also suddenly having this problem with "ValueError: Length of values (1) does not match length of index()" for certain Scanpy functions like
Could you update your version of scanpy and see if the issue persists? I believe this issue was an incompatibility with the 1.3.0 release of pandas (pandas-dev/pandas#42376), which was fixed in scanpy 1.8.1 (#1917).
I had this issue on scanpy 1.8.1 with pandas 1.3.3.
I encountered a similar issue last week, and it was caused by an assignment like this:
import scanpy as sc
from scipy.sparse import csr_matrix
adata = sc.datasets.pbmc3k()
adata.X = csr_matrix(adata.X)
adata.obs['total_counts'] = adata.X.sum(1)  # (n x 1) matrix from a sparse sum; pandas doesn't complain
adata.obs  # raises the formatter error
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/IPython/core/formatters.py in __call__(self, obj)
700 type_pprinters=self.type_printers,
701 deferred_pprinters=self.deferred_printers)
--> 702 printer.pretty(obj)
703 printer.flush()
704 return stream.getvalue()
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/IPython/lib/pretty.py in pretty(self, obj)
392 if cls is not object \
393 and callable(cls.__dict__.get('__repr__')):
--> 394 return _repr_pprint(obj, self, cycle)
395
396 return _default_pprint(obj, self, cycle)
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
698 """A pprint that just redirects to the normal repr function."""
699 # Find newlines and replace them with p.break_()
--> 700 output = repr(obj)
701 lines = output.splitlines()
702 with p.group():
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/core/frame.py in __repr__(self)
993 else:
994 width = None
--> 995 self.to_string(
996 buf=buf,
997 max_rows=max_rows,
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/core/frame.py in to_string(self, buf, columns, col_space, header, index, na_rep, formatters, float_format, sparsify, index_names, justify, max_rows, min_rows, max_cols, show_dimensions, decimal, line_width, max_colwidth, encoding)
1129 decimal=decimal,
1130 )
-> 1131 return fmt.DataFrameRenderer(formatter).to_string(
1132 buf=buf,
1133 encoding=encoding,
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/format.py in to_string(self, buf, encoding, line_width)
1051
1052 string_formatter = StringFormatter(self.fmt, line_width=line_width)
-> 1053 string = string_formatter.to_string()
1054 return save_to_buffer(string, buf=buf, encoding=encoding)
1055
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/string.py in to_string(self)
23
24 def to_string(self) -> str:
---> 25 text = self._get_string_representation()
26 if self.fmt.should_show_dimensions:
27 text = "".join([text, self.fmt.dimensions_info])
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/string.py in _get_string_representation(self)
38 return self._empty_info_line
39
---> 40 strcols = self._get_strcols()
41
42 if self.line_width is None:
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/string.py in _get_strcols(self)
29
30 def _get_strcols(self) -> list[list[str]]:
---> 31 strcols = self.fmt.get_strcols()
32 if self.fmt.is_truncated:
33 strcols = self._insert_dot_separators(strcols)
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/format.py in get_strcols(self)
538 Render a DataFrame to a list of columns (as lists of strings).
539 """
--> 540 strcols = self._get_strcols_without_index()
541
542 if self.index:
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/format.py in _get_strcols_without_index(self)
802 int(self.col_space.get(c, 0)), *(self.adj.len(x) for x in cheader)
803 )
--> 804 fmt_values = self.format_col(i)
805 fmt_values = _make_fixed_width(
806 fmt_values, self.justify, minimum=header_colwidth, adj=self.adj
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/format.py in format_col(self, i)
816 frame = self.tr_frame
817 formatter = self._get_formatter(i)
--> 818 return format_array(
819 frame.iloc[:, i]._values,
820 formatter,
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/format.py in format_array(values, formatter, float_format, na_rep, digits, space, justify, decimal, leading_space, quoting)
1238 )
1239
-> 1240 return fmt_obj.get_result()
1241
1242
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/format.py in get_result(self)
1269
1270 def get_result(self) -> list[str]:
-> 1271 fmt_values = self._format_strings()
1272 return _make_fixed_width(fmt_values, self.justify)
1273
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/format.py in _format_strings(self)
1516
1517 def _format_strings(self) -> list[str]:
-> 1518 return list(self.get_result_as_array())
1519
1520
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/format.py in get_result_as_array(self)
1480 float_format = lambda value: self.float_format % value
1481
-> 1482 formatted_values = format_values_with(float_format)
1483
1484 if not self.fixed_width:
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/format.py in format_values_with(float_format)
1454 values = self.values
1455 is_complex = is_complex_dtype(values)
-> 1456 values = format_with_na_rep(values, formatter, na_rep)
1457
1458 if self.fixed_width:
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/format.py in format_with_na_rep(values, formatter, na_rep)
1425 mask = isna(values)
1426 formatted = np.array(
-> 1427 [
1428 formatter(val) if not m else na_rep
1429 for val, m in zip(values.ravel(), mask.ravel())
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/format.py in <listcomp>(.0)
1426 formatted = np.array(
1427 [
-> 1428 formatter(val) if not m else na_rep
1429 for val, m in zip(values.ravel(), mask.ravel())
1430 ]
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/IPython/core/formatters.py in __call__(self, obj)
343 method = get_real_method(obj, self.print_method)
344 if method is not None:
--> 345 return method()
346 return None
347 else:
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/core/frame.py in _repr_html_(self)
1045 decimal=".",
1046 )
-> 1047 return fmt.DataFrameRenderer(formatter).to_html(notebook=True)
1048 else:
1049 return None
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/format.py in to_html(self, buf, encoding, classes, notebook, border, table_id, render_links)
1027 render_links=render_links,
1028 )
-> 1029 string = html_formatter.to_string()
1030 return save_to_buffer(string, buf=buf, encoding=encoding)
1031
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/html.py in to_string(self)
70
71 def to_string(self) -> str:
---> 72 lines = self.render()
73 if any(isinstance(x, str) for x in lines):
74 lines = [str(x) for x in lines]
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/html.py in render(self)
619 self.write("<div>")
620 self.write_style()
--> 621 super().render()
622 self.write("</div>")
623 return self.elements
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/html.py in render(self)
76
77 def render(self) -> list[str]:
---> 78 self._write_table()
79
80 if self.should_show_dimensions:
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/html.py in _write_table(self, indent)
246 self._write_header(indent + self.indent_delta)
247
--> 248 self._write_body(indent + self.indent_delta)
249
250 self.write("</table>", indent)
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/html.py in _write_body(self, indent)
393 def _write_body(self, indent: int) -> None:
394 self.write("<tbody>", indent)
--> 395 fmt_values = self._get_formatted_values()
396
397 # write values
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/html.py in _get_formatted_values(self)
583
584 def _get_formatted_values(self) -> dict[int, list[str]]:
--> 585 return {i: self.fmt.format_col(i) for i in range(self.ncols)}
586
587 def _get_columns_formatted_values(self) -> list[str]:
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/html.py in <dictcomp>(.0)
583
584 def _get_formatted_values(self) -> dict[int, list[str]]:
--> 585 return {i: self.fmt.format_col(i) for i in range(self.ncols)}
586
587 def _get_columns_formatted_values(self) -> list[str]:
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/format.py in format_col(self, i)
816 frame = self.tr_frame
817 formatter = self._get_formatter(i)
--> 818 return format_array(
819 frame.iloc[:, i]._values,
820 formatter,
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/format.py in format_array(values, formatter, float_format, na_rep, digits, space, justify, decimal, leading_space, quoting)
1238 )
1239
-> 1240 return fmt_obj.get_result()
1241
1242
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/format.py in get_result(self)
1269
1270 def get_result(self) -> list[str]:
-> 1271 fmt_values = self._format_strings()
1272 return _make_fixed_width(fmt_values, self.justify)
1273
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/format.py in _format_strings(self)
1516
1517 def _format_strings(self) -> list[str]:
-> 1518 return list(self.get_result_as_array())
1519
1520
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/format.py in get_result_as_array(self)
1480 float_format = lambda value: self.float_format % value
1481
-> 1482 formatted_values = format_values_with(float_format)
1483
1484 if not self.fixed_width:
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/format.py in format_values_with(float_format)
1454 values = self.values
1455 is_complex = is_complex_dtype(values)
-> 1456 values = format_with_na_rep(values, formatter, na_rep)
1457
1458 if self.fixed_width:
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/format.py in format_with_na_rep(values, formatter, na_rep)
1425 mask = isna(values)
1426 formatted = np.array(
-> 1427 [
1428 formatter(val) if not m else na_rep
1429 for val, m in zip(values.ravel(), mask.ravel())
~/.miniconda3/envs/cellrank/lib/python3.8/site-packages/pandas/io/formats/format.py in <listcomp>(.0)
1426 formatted = np.array(
1427 [
-> 1428 formatter(val) if not m else na_rep
1429 for val, m in zip(values.ravel(), mask.ravel())
1430 ]
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
With pandas 1.3.4 and 1.3.3
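For anyone landing here, a minimal sketch of what seems to go wrong in the snippet above, and one way around it. It assumes the culprit is the shape of the sparse row sum: summing a scipy `csr_matrix` along an axis gives an (n x 1) matrix rather than a 1-D array. The toy matrix and variable names are made up for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy stand-in for a cells x genes count matrix (3 cells, 4 genes).
X = csr_matrix(np.array([[1, 0, 2, 0],
                         [0, 0, 0, 3],
                         [4, 5, 0, 0]]))

# Row sums of a csr_matrix come back two-dimensional, with shape (3, 1).
row_sums = X.sum(axis=1)

# Flatten to a plain 1-D ndarray before storing it as a DataFrame column.
total_counts = np.asarray(row_sums).ravel()
```

With an AnnData object the same pattern would presumably be `adata.obs['total_counts'] = np.asarray(adata.X.sum(axis=1)).ravel()`, so the column holds scalars instead of one-element matrices.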
This looks like an upstream issue in pandas. Tomorrow I will check whether it has already been reported there and, if not, submit an issue. It may be a fairly easy fix (e.g. checking the value's shape more carefully during column assignment in pandas), but it can take a while to figure out how to fix things there. AFAIK, we removed the calls in scanpy that assigned (n x 1) matrices to pandas because of a related, non-formatting error. Is the current scanpy release assigning these matrices anywhere?
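Until something like that lands upstream, the shape check described above could be approximated on the user side. This is only a sketch: `to_1d_column` is a hypothetical helper, not part of pandas or scanpy, and it assumes that collapsing (n, 1) or (1, n) inputs is what you want.

```python
import numpy as np

def to_1d_column(values, n_rows):
    """Hypothetical helper: return `values` as a 1-D array of length n_rows, or raise."""
    arr = np.asarray(values)
    # Collapse (n, 1) or (1, n) inputs, e.g. the result of a sparse row sum.
    if arr.ndim == 2 and 1 in arr.shape:
        arr = arr.ravel()
    # Anything that is still not 1-D, or has the wrong length, is rejected early.
    if arr.ndim != 1 or arr.shape[0] != n_rows:
        raise ValueError(
            f"expected a 1-D array of length {n_rows}, got shape {arr.shape}"
        )
    return arr

# An (n x 1) input is flattened to shape (3,) before it would be assigned.
column = to_1d_column(np.array([[1.0], [2.0], [3.0]]), n_rows=3)
```

Failing at assignment time with a clear message is arguably easier to debug than the formatter error that only shows up later, when the DataFrame is displayed.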
Many thanks for everyone's input. The bug is indeed due to an issue with pandas ≥1.3. I am running scanpy 1.8.1 and can confirm that the indexing problem remains with pandas 1.3.0, 1.3.2, and the latest 1.3.4, but is resolved by downgrading to 1.2.5.
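To check quickly whether your environment matches one of the combinations mentioned so far, something like the following might help. It is a sketch that only knows the pandas versions people have named in this thread (all reported with scanpy 1.8.1); it is not a complete compatibility table, and the function names are made up.

```python
from importlib.metadata import PackageNotFoundError, version

# pandas versions named in this thread; anything else is simply unknown here.
REPORTED_FAILING = {"1.3.0", "1.3.2", "1.3.3", "1.3.4"}
REPORTED_WORKING = {"1.2.5"}

def thread_status(pandas_version):
    """Classify a pandas version string by what was reported in this thread."""
    if pandas_version in REPORTED_FAILING:
        return "reported failing"
    if pandas_version in REPORTED_WORKING:
        return "reported working"
    return "not reported in this thread"

def installed_pandas_status():
    """Look up the installed pandas version and classify it."""
    try:
        return thread_status(version("pandas"))
    except PackageNotFoundError:
        return "pandas not installed"
```

Calling `installed_pandas_status()` in the affected environment tells you whether a downgrade to 1.2.5 is worth trying.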
Thanks for everyone's input. I tried to solve this problem by downgrading pandas to 1.1.5. The cause of this problem may be that in Python 3.9 and above, pandas modifies the matrix function.
Opened a PR to pandas which should hopefully fix this: pandas-dev/pandas#42376
My PR was merged, so this should be resolved with the next version of pandas. |
I also tried 'log1p=False', which produced the other error. Thank you.
sc.pp.calculate_qc_metrics(adata, log1p=False)
Versions
WARNING: If you miss a compact list, please try print_header!
anndata 0.7.6
scanpy 1.7.2
sinfo 0.3.1
PIL 8.3.2
anndata 0.7.6
beta_ufunc NA
binom_ufunc NA
cffi 1.14.6
colorama 0.4.4
concurrent NA
cycler 0.10.0
cython_runtime NA
dateutil 2.8.2
dunamai 1.6.0
encodings NA
genericpath NA
get_version 3.5
h5py 3.4.0
joblib 1.0.1
kiwisolver 1.3.2
legacy_api_wrap 0.0.0
llvmlite 0.37.0
matplotlib 3.4.3
mpl_toolkits NA
natsort 7.1.1
nbinom_ufunc NA
ntpath NA
numba 0.54.0
numexpr 2.7.3
numpy 1.20.3
opcode NA
packaging 21.0
pandas 1.3.3
pkg_resources NA
posixpath NA
pycparser 2.20
pyexpat NA
pyparsing 2.4.7
pytz 2021.1
scanpy 1.7.2
scipy 1.7.1
setuptools_scm NA
sinfo 0.3.1
six 1.16.0
sklearn 1.0
sphinxcontrib NA
sre_compile NA
sre_constants NA
sre_parse NA
tables 3.6.1
Python 3.9.7 | packaged by conda-forge | (default, Sep 29 2021, 19:20:46) [GCC 9.4.0]
Linux-5.4.72-microsoft-standard-WSL2-x86_64-with-glibc2.31
24 logical CPU cores, x86_64
Session information updated at 2021-10-01 14:56