[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of pandas.
[x] I have confirmed this bug exists on the main branch of pandas.
import pandas as pd
dr1 = pd.date_range("2021-08-05", "2021-08-10", freq="1D")
print(dr1) # prints dates only
print(dr1.asof("2021-08-09")) # correctly prints 2021-08-09 00:00:00
# dr2 = dr1.append(pd.DatetimeIndex(["2021-08-11 00:00:00"])) # adding a midnight time works fine
dr2 = dr1.append(pd.DatetimeIndex(["2021-08-11 00:00:01"])) # non-midnight time breaks asof
print(dr2) # prints dates and times, now that there is a non-midnight time - this is fine.
# wrongly prints 2021-08-06 00:00:00 even though 2021-08-09 exists in Index:
print("This should be 2021-08-09: ", dr2.asof("2021-08-09"))
print("2021-08-09" in dr2) # True, so why does asof not find it?
With a daily DatetimeIndex, appending an entry whose time is not midnight causes asof() to stop finding existing midnight datetimes when it is given a date-only string. I have reproduced the problem with pandas 1.5.2, 1.5.3 and the nightly build '2.0.0.dev0+1147.g7cb7592523' (see below), all on Python 3.10.
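One visible difference between the two indexes is their inferred resolution, which the sketch below prints. Whether this is what trips up asof() is only my assumption; I have not confirmed it in the pandas source.

```python
import pandas as pd

dr1 = pd.date_range("2021-08-05", "2021-08-10", freq="1D")
dr2 = dr1.append(pd.DatetimeIndex(["2021-08-11 00:00:01"]))

# DatetimeIndex.resolution is the finest unit needed to represent every entry.
res1 = dr1.resolution  # all entries are at midnight
res2 = dr2.resolution  # one entry carries a seconds component
print(res1, res2)
```

A date-only string such as "2021-08-09" is coarser than the second index's resolution, which may be why the two calls behave differently.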
dr2.asof("2021-08-09") should return 2021-08-09 00:00:00, the same value that dr1.asof("2021-08-09") returns.
commit : 7cb7592523380133f552e258f272a5694e37957a
python : 3.10.4.final.0
python-bits : 64
OS : Linux
OS-release : 5.13.0-35-generic
Version : #40~20.04.1-Ubuntu SMP Mon Mar 7 09:18:32 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8

pandas : 2.0.0.dev0+1147.g7cb7592523
numpy : 1.23.5
pytz : 2022.1
dateutil : 2.8.2
setuptools : 44.0.0
pip : 22.3.1
Cython : None
pytest : 7.2.0
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 3.0.6
lxml.etree : 4.9.1
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.7.0
pandas_datareader : None
bs4 : 4.11.1
bottleneck : None
brotli : 1.0.9
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.6.1
numba : None
numexpr : None
odfpy : None
openpyxl : 3.0.10
pandas_gbq : None
pyarrow : 10.0.0
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.9.3
snappy : None
sqlalchemy : 0.7.10
tables : None
tabulate : 0.9.0
xarray : None
xlrd : 2.0.1
zstandard : None
tzdata : None
qtpy : 2.1.0
pyqt5 : None
I reproduced the problem with the nightly build - is that the same as the "main branch"? (as per the link above)
It seems I can work around it using either of the following:
dr2.asof(pd.to_datetime("2021-08-09"))
dr2.asof("2021-08-09 00:00:00")
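For anyone hitting the same thing, here is a minimal sketch of the first workaround wrapped in a function. The name asof_exact is hypothetical (it is not part of pandas); it only converts the label with pd.to_datetime before calling asof(), so the lookup is done with a full Timestamp rather than a date-only string.

```python
import pandas as pd


def asof_exact(index: pd.DatetimeIndex, label) -> pd.Timestamp:
    """Hypothetical helper: convert the label to a Timestamp, then call asof()."""
    return index.asof(pd.to_datetime(label))


dr1 = pd.date_range("2021-08-05", "2021-08-10", freq="1D")
dr2 = dr1.append(pd.DatetimeIndex(["2021-08-11 00:00:01"]))

# Exact match on an existing midnight entry.
found = asof_exact(dr2, "2021-08-09")
# No exact match: asof() falls back to the most recent earlier entry.
between = asof_exact(dr2, "2021-08-10 12:00")
print(found, between)
```

This only sidesteps the problem on the caller's side; it does not change how asof() treats date-only strings.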