
BUG: #47350 if else added to add NaT for missing time values #47647
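A minimal reproduction of the underlying bug (GH 47350), adapted from the test added later in this PR — before the fix, `idxmax` over a resample bin with no rows raised a ValueError instead of producing NaT:

```python
import pandas as pd

# 2022-01-03 is missing, so daily resampling produces one empty bin.
dates = pd.DatetimeIndex(["2022-01-01", "2022-01-02", "2022-01-04"])
df = pd.DataFrame([0, 1, 2], index=dates)

try:
    print(df.resample("D")[0].idxmax())
except ValueError as exc:
    # Pre-fix behavior: reducing the empty bin fails.
    print("ValueError:", exc)
```

Whether the call raises or returns NaT for the empty bin depends on the pandas version in use.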


Closed. Wanted to merge 54 commits.

54 commits:
aa9f8c7
BUG: If else added for idxmax / idxmin ValueError occurs if a period …
hamedgibago Jul 8, 2022
5d75134
Merge branch 'pandas-dev:main' into main
hamedgibago Jul 8, 2022
2987981
Merge branch 'main' into main
hamedgibago Jul 9, 2022
952f74e
Merge branch 'pandas-dev:main' into main
hamedgibago Jul 9, 2022
541ec70
BUG:47350 if added by hamedgibago (local checks with pre-commit passed)
hamedgibago Jul 10, 2022
18e02b4
Merge branch 'main' into main
hamedgibago Jul 10, 2022
d586077
Test added for # GH 47350
hamedgibago Jul 12, 2022
c4025dc
Merge branch 'pandas-dev:main' into main
hamedgibago Jul 12, 2022
ff0734d
Merge branch 'main' into main
hamedgibago Jul 12, 2022
702c586
BUG:47350 If exchanged with try except
hamedgibago Jul 20, 2022
6716cfc
Merge branch 'pandas-dev:main' into main
hamedgibago Jul 20, 2022
4645cd4
Changed the comment in code
hamedgibago Jul 20, 2022
4815b5e
1 added
hamedgibago Jul 20, 2022
f95a2e4
__init__.py was changed and some errores occured. Reverted it and it …
hamedgibago Jul 21, 2022
da61402
Merge branch 'pandas-dev:main' into main
hamedgibago Jul 23, 2022
a8da21b
AttributeError added to except part in addition to ValueError.
hamedgibago Jul 25, 2022
f94e812
Merge branch 'main' of https://github.com/pandas-dev/pandas
hamedgibago Jul 25, 2022
dc40ce1
Merge branch 'main' of https://github.com/hamedgibago/pandas
hamedgibago Jul 25, 2022
57cb1c6
Doctest errors cleared
hamedgibago Jul 25, 2022
cf8af3f
Merge branch 'pandas-dev:main' into main
hamedgibago Jul 25, 2022
4c3bb61
More Doctests errors cleared
hamedgibago Jul 26, 2022
bed87fb
Merge branch 'main' of https://github.com/hamedgibago/pandas
hamedgibago Jul 26, 2022
4766327
Multiline error during doctest
hamedgibago Jul 26, 2022
e9d50b0
Some Doctests errors cleared
hamedgibago Jul 27, 2022
9877034
Merge branch 'main' into main
hamedgibago Jul 27, 2022
26b24f2
Some more errors from online Doctests cleared
hamedgibago Jul 27, 2022
0ea81fd
Doctest leading whitespace cleared
hamedgibago Jul 27, 2022
dbee98a
Doctest errors
hamedgibago Jul 27, 2022
80ede97
Doctest errors fixed online
hamedgibago Jul 27, 2022
7456a26
Doctest errors corrected online
hamedgibago Jul 27, 2022
22f1166
Doctest online errors cleared
hamedgibago Jul 27, 2022
a9ce0c2
Doctest debug
hamedgibago Jul 27, 2022
fc96625
Extra old comments and variables removed
hamedgibago Jul 29, 2022
cdda71f
Merge branch 'main' into main
hamedgibago Jul 29, 2022
8519f66
Merge branch 'main' into main
hamedgibago Aug 3, 2022
32c1a74
Merge branch 'main' into main
hamedgibago Aug 3, 2022
70a9433
Test added for #GH 47653 (Origin param with no effect)
hamedgibago Aug 3, 2022
a402371
Merge remote-tracking branch 'upstream/main'
hamedgibago Aug 3, 2022
df2f37d
Revert "More Doctests errors cleared"
hamedgibago Aug 3, 2022
9b89614
Revert "Some Doctests errors cleared"
hamedgibago Aug 3, 2022
fb06399
Revert "Doctest debug"
hamedgibago Aug 3, 2022
3d1951f
Revert "Doctest online errors cleared"
hamedgibago Aug 3, 2022
fc8c00a
Revert "Doctest errors corrected online"
hamedgibago Aug 3, 2022
24d9cd4
Revert "Doctest errors fixed online"
hamedgibago Aug 3, 2022
d8d18e8
Revert "More Doctests errors cleared"
hamedgibago Aug 3, 2022
f050fda
Merge branch 'pandas-dev:main' into main
hamedgibago Aug 4, 2022
2b57e4d
Merge branch 'main' of https://github.com/hamedgibago/pandas
hamedgibago Aug 4, 2022
3335cfd
Merge branch 'main' into main
hamedgibago Aug 5, 2022
2bc6a29
Merge branch 'main' into main
hamedgibago Aug 6, 2022
fdb123f
Merge branch 'main' of https://github.com/hamedgibago/pandas
hamedgibago Aug 7, 2022
8777790
Revert "Some more errors from online Doctests cleared"
hamedgibago Aug 7, 2022
ffd5b55
Revert "More Doctests errors cleared"
hamedgibago Aug 7, 2022
c5ae12f
Merge branch 'pandas-dev:main' into main
hamedgibago Aug 14, 2022
df958ce
Merge branch 'pandas-dev:main' into main
hamedgibago Aug 15, 2022
25 changes: 24 additions & 1 deletion pandas/core/groupby/ops.py
@@ -821,14 +821,37 @@ def apply(
# This calls DataSplitter.__iter__
zipped = zip(group_keys, splitter)

i = 0
Member: This `i` variable doesn't look to be used anywhere.

Author: I have to remove it; it was created for my old code.

Member: Same here.

for key, group in zipped:
# BUG:47350 if replaced 1 by hamedgibago
# if key not in data.index and is_datetime64_any_dtype(data.index):
Member: Is this a comment or old code?

Author: Yes, I have to remove this comment for my old code too.

Member: Looks like this old code is still here.

# #or (key not in data.index and f.__name__ in ['idxmax','idxmin']) :
# ser=Series(i,[key])
# res = None
# else:
# res = f(group)
try:
res = f(group)
except (ValueError, AttributeError):
# except ValueError:
res = None

object.__setattr__(group, "name", key)

# group might be modified
group_axes = group.axes
res = f(group)

if not mutated and not _is_indexed_like(res, group_axes, axis):
mutated = True

i = i + 1

# BUG:47350 if added by hamedgibago
Member: Looks unused?

Author: Yes, I have to remove it too.

Member: Same here.

# if key in data.index:
# result_values.append(res)
# else:
# result_values.append(np.nan)

result_values.append(res)

# getattr pattern for __name__ is needed for functools.partial objects
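The core idea of the change above — treating a per-group reduction that raises as a missing result rather than letting the error propagate — can be sketched outside of pandas internals (a simplified illustration, not the actual GroupBy.apply machinery):

```python
def apply_per_group(groups, func):
    """Apply func to each (key, group) pair; groups that cannot be
    reduced yield None instead of raising.

    Mirrors the diff above: a ValueError (e.g. argmax of an empty
    sequence) or AttributeError is recorded as a missing result.
    """
    results = []
    for key, group in groups:
        try:
            res = func(group)
        except (ValueError, AttributeError):
            res = None
        results.append((key, res))
    return results

# idxmax-like reduction: index of the maximum; empty input raises ValueError.
groups = [("a", [3, 1]), ("b", [])]
out = apply_per_group(groups, lambda g: max(range(len(g)), key=g.__getitem__))
# out == [("a", 0), ("b", None)]
```

In the real code path the `None` result is later materialized as NaT when the result dtype is datetime-like.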
33 changes: 14 additions & 19 deletions pandas/core/indexes/base.py
@@ -45,8 +45,8 @@
tz_compare,
)
from pandas._typing import (
AnyArrayLike,
ArrayLike,
Axes,
Dtype,
DtypeObj,
F,
@@ -261,10 +261,6 @@ def _new_Index(cls, d):
# GH#23752 "labels" kwarg has been replaced with "codes"
d["codes"] = d.pop("labels")

# Since this was a valid MultiIndex at pickle-time, we don't need to
# check validty at un-pickle time.
d["verify_integrity"] = False

elif "dtype" not in d and "data" in d:
# Prevent Index.__new__ from conducting inference;
# "data" key not in RangeIndex
@@ -277,9 +273,8 @@

class Index(IndexOpsMixin, PandasObject):
"""
Immutable sequence used for indexing and alignment.

The basic object storing axis labels for all pandas objects.
Immutable sequence used for indexing and alignment. The basic object
storing axis labels for all pandas objects.

Parameters
----------
@@ -2297,7 +2292,8 @@ def is_monotonic(self) -> bool:
@property
def is_monotonic_increasing(self) -> bool:
"""
Return a boolean if the values are equal or increasing.
Return if the index is monotonic increasing (only equal or
increasing) values.

Examples
--------
@@ -2313,7 +2309,8 @@ def is_monotonic_increasing(self) -> bool:
@property
def is_monotonic_decreasing(self) -> bool:
"""
Return a boolean if the values are equal or decreasing.
Return if the index is monotonic decreasing (only equal or
decreasing) values.

Examples
--------
@@ -3815,9 +3812,8 @@ def get_loc(self, key, method=None, tolerance=None):
_index_shared_docs[
"get_indexer"
] = """
Compute indexer and mask for new index given the current index.

The indexer should be then used as an input to ndarray.take to align the
Compute indexer and mask for new index given the current index. The
indexer should be then used as an input to ndarray.take to align the
current data to the new index.

Parameters
@@ -4586,7 +4582,8 @@ def join(
sort: bool = False,
) -> Index | tuple[Index, npt.NDArray[np.intp] | None, npt.NDArray[np.intp] | None]:
"""
Compute join_index and indexers to conform data structures to the new index.
Compute join_index and indexers to conform data
structures to the new index.

Parameters
----------
@@ -4687,7 +4684,6 @@ def join(
not isinstance(self, ABCMultiIndex)
or not any(is_categorical_dtype(dtype) for dtype in self.dtypes)
)
and not is_categorical_dtype(self.dtype)
):
# Categorical is monotonic if data are ordered as categories, but join can
# not handle this in case of not lexicographically monotonic GH#38502
@@ -5983,9 +5979,8 @@ def set_value(self, arr, key, value) -> None:
_index_shared_docs[
"get_indexer_non_unique"
] = """
Compute indexer and mask for new index given the current index.

The indexer should be then used as an input to ndarray.take to align the
Compute indexer and mask for new index given the current index. The
indexer should be then used as an input to ndarray.take to align the
current data to the new index.

Parameters
@@ -7283,7 +7278,7 @@ def ensure_index_from_sequences(sequences, names=None) -> Index:
return MultiIndex.from_arrays(sequences, names=names)


def ensure_index(index_like: Axes, copy: bool = False) -> Index:
def ensure_index(index_like: AnyArrayLike | Sequence, copy: bool = False) -> Index:
"""
Ensure that we have an index from some index-like object.

43 changes: 34 additions & 9 deletions pandas/core/resample.py
@@ -1982,6 +1982,12 @@ def _get_timestamp_range_edges(
-------
A tuple of length 2, containing the adjusted pd.Timestamp objects.
"""
if isinstance(origin, Timestamp):
first, last = _adjust_dates_anchored(
first, last, freq, closed=closed, origin=origin, offset=offset
)
return first, last

if isinstance(freq, Tick):
index_tz = first.tz
if isinstance(origin, Timestamp) and (origin.tz is None) != (index_tz is None):
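The new early return exists because GH 47653 reported that an explicit Timestamp `origin` could be silently ignored for anchored frequencies. A quick way to probe this behavior (an illustration; the exact bin edges differ across pandas versions, so no output is asserted here):

```python
import pandas as pd

idx = pd.date_range("2022-01-02", "2022-02-01", freq="D")
ser = pd.Series(range(len(idx)), index=idx)

# Weekly bins anchored at the series start vs. at an explicit timestamp.
by_start = ser.resample("W", origin="start").sum()
by_ts = ser.resample("W", origin=pd.Timestamp("2022-01-05")).sum()

# GH 47653: before this change, both calls could yield identical bin
# edges even though different origins were requested.
print(by_start.index.tolist())
print(by_ts.index.tolist())
```

Regardless of the binning, both results must account for every row, so their totals agree.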
@@ -2116,7 +2122,10 @@ def _adjust_dates_anchored(
origin_nanos = origin.value
elif origin in ["end", "end_day"]:
origin = last if origin == "end" else last.ceil("D")
sub_freq_times = (origin.value - first.value) // freq.nanos
if isinstance(freq, Tick):
sub_freq_times = (origin.value - first.value) // freq.nanos
else:
sub_freq_times = origin.value - first.value
if closed == "left":
sub_freq_times += 1
first = origin - sub_freq_times * freq
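The `isinstance(freq, Tick)` guards above exist because only fixed-duration offsets expose `freq.nanos`; calendar-based offsets such as Week raise when it is accessed:

```python
from pandas.tseries.frequencies import to_offset

tick = to_offset("5min")  # Tick subclass: fixed duration
week = to_offset("W")     # Week: calendar-based, not a Tick

print(tick.nanos)  # 300000000000 (5 minutes in nanoseconds)

try:
    week.nanos
except ValueError as exc:
    # Non-fixed frequencies have no well-defined nanosecond length.
    print("ValueError:", exc)
```

This is why the pre-existing `// freq.nanos` arithmetic could not be reached with a weekly frequency without raising.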
Expand All @@ -2133,19 +2142,29 @@ def _adjust_dates_anchored(
if last_tzinfo is not None:
last = last.tz_convert("UTC")

foffset = (first.value - origin_nanos) % freq.nanos
loffset = (last.value - origin_nanos) % freq.nanos
if isinstance(freq, Tick):
foffset = (first.value - origin_nanos) % freq.nanos
loffset = (last.value - origin_nanos) % freq.nanos
else:
foffset = first.value - origin_nanos
loffset = last.value - origin_nanos

if closed == "right":
if foffset > 0:
# roll back
fresult_int = first.value - foffset
else:
fresult_int = first.value - freq.nanos
if isinstance(freq, Tick):
fresult_int = first.value - freq.nanos
else:
fresult_int = first.value

if loffset > 0:
# roll forward
lresult_int = last.value + (freq.nanos - loffset)
if isinstance(freq, Tick):
# roll forward
lresult_int = last.value + (freq.nanos - loffset)
else:
lresult_int = last.value - loffset
else:
# already the end of the road
lresult_int = last.value
@@ -2157,10 +2176,16 @@
fresult_int = first.value

if loffset > 0:
# roll forward
lresult_int = last.value + (freq.nanos - loffset)
if isinstance(freq, Tick):
# roll forward
lresult_int = last.value + (freq.nanos - loffset)
else:
lresult_int = last.value - loffset
else:
lresult_int = last.value + freq.nanos
if isinstance(freq, Tick):
lresult_int = last.value + freq.nanos
else:
lresult_int = last.value
fresult = Timestamp(fresult_int)
lresult = Timestamp(lresult_int)
if first_tzinfo is not None:
2 changes: 1 addition & 1 deletion pandas/core/series.py
@@ -2175,7 +2175,7 @@ def unique(self) -> ArrayLike:
Examples
--------
>>> pd.Series([2, 1, 3, 3], name='A').unique()
array([2, 1, 3])
array([2, 1, 3], dtype=int64)

>>> pd.Series([pd.Timestamp('2016-01-01') for _ in range(3)]).unique()
array(['2016-01-01T00:00:00.000000000'], dtype='datetime64[ns]')
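Changing this doctest to expect `dtype=int64` is fragile: numpy prints the dtype in an array repr only when it is not the platform's default integer type, so the output differs between 64-bit Linux/macOS (default int64, suffix omitted) and Windows (default int32, suffix shown). A quick check:

```python
import numpy as np

arr = np.array([2, 1, 3], dtype=np.int64)
# On platforms where int64 is the default integer type, repr omits the
# dtype; elsewhere it prints "dtype=int64".
print(repr(arr))
```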
25 changes: 25 additions & 0 deletions pandas/tests/groupby/test_groupby.py
@@ -39,6 +39,31 @@ def test_repr():
assert result == expected


def test_origin_param_no_effect():
# GH 47653
df = DataFrame(
[
{"A": A, "datadate": datadate}
for A in range(1, 3)
for datadate in date_range(start="1/2/2022", end="2/1/2022", freq="D")
]
)

result = df.groupby(["A", Grouper(key="datadate", freq="W", origin="start")])

# for i, dfg in result:
# print(dfg[["A", "datadate"]])
# print("-----------------------")

expected = df.groupby(["A", Grouper(key="datadate", freq="W", origin="1/5/2022")])

# for i, dfg in expected:
# print(dfg[["A", "datadate"]])
# print("-----------------------")

tm.assert_series_equal(result, expected)


@pytest.mark.parametrize("dtype", ["int64", "int32", "float64", "float32"])
def test_basic(dtype):

11 changes: 11 additions & 0 deletions pandas/tests/resample/test_resampler_grouper.py
@@ -45,6 +45,17 @@ async def test_tab_complete_ipython6_warning(ip):
list(ip.Completer.completions("rs.", 1))


def test_dataframe_missing_a_day():
# GH 47350
dates = pd.DatetimeIndex(["2022-01-01", "2022-01-02", "2022-01-04"])
df = DataFrame([0, 1, 2], index=dates)
result = df.resample("D")[0].idxmax() # raises value error

expected = df.resample("D")[0].apply(lambda x: x.idxmax() if len(x) else None)

tm.assert_series_equal(result, expected)
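With the fix applied, the `expected` object built in this test maps the empty 2022-01-03 bin to a missing value; the construction can be run standalone (a sketch of what the test asserts):

```python
import pandas as pd

dates = pd.DatetimeIndex(["2022-01-01", "2022-01-02", "2022-01-04"])
df = pd.DataFrame([0, 1, 2], index=dates)

# Same construction as `expected` in the test: empty bins map to None,
# which pandas stores as a missing value (NaT) in the result.
expected = df.resample("D")[0].apply(lambda x: x.idxmax() if len(x) else None)
print(expected)
```

The non-empty days map to their own timestamps, since each daily bin's maximum is its single row.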


def test_deferred_with_groupby():

# GH 12486