Pandas version checks

  • [X] I have checked that this issue has not already been reported.

  • [X] I have confirmed this bug exists on the latest version of pandas.

  • [X] I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
from datetime import date
df = pd.DataFrame([date.today()])
df.to_excel('/tmp/out.xlsx')

Issue Description

datetime.date are formatted incorrectly due to usage of YYYY-MM-DD instead of yyyy-mm-dd. It seems to be reproducible with both openpyxl and xlsxwriter. See https://xlsxwriter.readthedocs.io/format.html?highlight=decimal#set_num_format

The uppercase format comes from the default date_format of ExcelWriter, which is used to get the num_format at https://github.com/pandas-dev/pandas/blob/33f4f7b57b46bb295c1b71f3890377a5e541122e/pandas/io/excel/_base.py#L1288.

I think we could set the default date format to yyyy-mm-dd, but I am not sure what the impact would be.

In Microsoft Excel, the date is shown in the correct format but with an extra time part, so it probably wasn't parsed correctly.

In Google Sheets, it works correctly.

In macOS Numbers, it shows YYYY-01-DD, which is weird.

Expected Behavior

The date should be formatted as YYYY-MM-DD and work correctly with Excel-related applications.

Installed Versions

INSTALLED VERSIONS

commit : 2e218d10984e9919f0296931d92ea851c6a6faf5
python : 3.10.9.final.0
python-bits : 64
OS : Darwin
OS-release : 22.1.0
Version : Darwin Kernel Version 22.1.0: Sun Oct 9 20:14:30 PDT 2022; root:xnu-8792.41.9~2/RELEASE_ARM64_T8103
machine : arm64
processor : arm
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.UTF-8

pandas : 1.5.3
numpy : 1.24.1
pytz : 2022.7.1
dateutil : 2.8.2
setuptools : 65.6.3
pip : 22.3.1
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : 3.0.7
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
brotli : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : None
snappy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
zstandard : None
tzdata : None

Thanks for the report! I'm not able to duplicate on linux.

datetime.date are formatted incorrectly due to usage of YYYY-MM-DD instead of yyyy-mm-dd.

vs

The date should be formatted as YYYY-MM-DD

These two seem to be in conflict. Are you saying you get correct results with the following?

import pandas as pd
from datetime import date

df = pd.DataFrame([date.today()])
with pd.ExcelWriter("test.xlsx", date_format="yyyy-mm-dd") as writer:
    df.to_excel(writer)

Thanks for the report! I'm not able to duplicate on linux.

Ah, I tried reproducing on Linux but couldn't reproduce it there either.

Yes, it works on macOS if I specify the date_format.
