[X] I have checked that this issue has not already been reported.
[X] I have confirmed this bug exists on the latest version of pandas.
[X] I have confirmed this bug exists on the main branch of pandas.
import pandas as pd
from datetime import date
df = pd.DataFrame([date.today()])
df.to_excel('/tmp/out.xlsx')
datetime.date values are formatted incorrectly due to usage of YYYY-MM-DD instead of yyyy-mm-dd. It seems to be reproducible with both openpyxl and xlsxwriter. See https://xlsxwriter.readthedocs.io/format.html?highlight=decimal#set_num_format
The uppercase YYYY-MM-DD comes from the default date_format of ExcelWriter, which is then used to get the num_format at https://github.com/pandas-dev/pandas/blob/33f4f7b57b46bb295c1b71f3890377a5e541122e/pandas/io/excel/_base.py#L1288.
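One way to check which format string actually ends up in the file, independent of any spreadsheet application, is to read the workbook's styles part directly. The following is a minimal sketch using only the standard library. It assumes the engine stores the date format as a custom number format in xl/styles.xml; custom_number_formats is a hypothetical helper name, not part of pandas.

```python
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by xl/styles.xml.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def custom_number_formats(xlsx_path):
    """Return {numFmtId: formatCode} for the custom formats in a workbook.

    Hypothetical helper for debugging; an .xlsx file is a zip archive,
    and custom number formats live under <numFmts> in xl/styles.xml.
    """
    with zipfile.ZipFile(xlsx_path) as zf:
        root = ET.fromstring(zf.read("xl/styles.xml"))
    return {
        int(fmt.get("numFmtId")): fmt.get("formatCode")
        for fmt in root.findall("m:numFmts/m:numFmt", NS)
    }
```

Calling custom_number_formats('/tmp/out.xlsx') on the file from the reproducer should show whether the stored format code is the uppercase or the lowercase variant, which helps separate what pandas writes from how each application renders it.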
I think we could change the default date format to yyyy-mm-dd, but I am not sure what the impact would be.
In Microsoft Excel, the date is shown in the correct format but with an extra time part, so the format string probably was not parsed correctly.
In Google Sheets, it works correctly.
In macOS Numbers, it shows YYYY-01-DD, which is weird.
The date should be formatted as YYYY-MM-DD and should display correctly in Excel-compatible applications.
commit : 2e218d10984e9919f0296931d92ea851c6a6faf5
python : 3.10.9.final.0
python-bits : 64
OS : Darwin
OS-release : 22.1.0
Version : Darwin Kernel Version 22.1.0: Sun Oct 9 20:14:30 PDT 2022; root:xnu-8792.41.9~2/RELEASE_ARM64_T8103
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.UTF-8

pandas : 1.5.3
numpy : 1.24.1
pytz : 2022.7.1
dateutil : 2.8.2
setuptools : 65.6.3
pip : 22.3.1
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 3.0.7
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
brotli : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : None
snappy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
zstandard : None
tzdata : None
Thanks for the report! I'm not able to duplicate on linux.
datetime.date are formatted incorrectly due to usage of YYYY-MM-DD instead of yyyy-mm-dd.
vs
The date should be formatted as YYYY-MM-DD
These two seem to be in conflict. Are you saying you get correct results with the following?
import pandas as pd
from datetime import date
df = pd.DataFrame([date.today()])
with pd.ExcelWriter("test.xlsx", date_format="yyyy-mm-dd") as writer:
    df.to_excel(writer)
Thanks for the report! I'm not able to duplicate on linux.
Ah, I tried reproducing on Linux and can't reproduce it there either.
Yeah, that works on macOS if I specify the date_format.