Date parsing error, importing CSV

Do you want to: Report a bug
OS version: Win 10 Pro 22H2
App version: 8.0.1.31 (x64)
Downloaded from: ONLYOFFICE website
Additional information: Australian Region setting

In Australia, date format is typically dd/mm/yyyy

And when importing a CSV file created by one of the banks I use, some of the file had the following date data;

16/05/2024
15/05/2024
13/05/2024
13/05/2024
09/05/2024
08/05/2024
03/05/2024
03/05/2024
03/05/2024
03/05/2024
03/05/2024
02/05/2024
01/05/2024
29/04/2024
29/04/2024
29/04/2024

I pared away the other columns as they contain personal financial information.

When OO displayed the data in the spreadsheet, it was rendered as;

16/05/2024
15/05/2024
13/05/2024
13/05/2024
05.09.2024
05.08.2024
05.03.2024
05.03.2024
05.03.2024
05.03.2024
05.03.2024
05.02.2024
05.01.2024
29/04/2024
29/04/2024
29/04/2024

Not only is the middle set of data delimited with a period instead of a slash character, the day and month data was reversed, corrupting the data.
In effect, a date of the 3 May 24 was changed to 5 March 24, not helpful for financial reconciliation.

It looks like the date parser for CSV is assuming a US date format, but since some of the records contain a 13th, 15th, 16th, or 29th month in the US date format, it rendered those correctly as day, but the others were passed as the US date format as the data could support it.

IT would be helpful of the date parser paid attention to the regional settings of the OO application rather than assuming a US regional format.

Thanks, Team!

Ah! I’ve just noticed that this issue was reported as well.

with a work-around

which does appear to work. And hopefully will be resolved in version 8.1

Hello @DavidRGreen

I’d like to inform that this issue will be fixed in upcoming version 8.1.

1 Like

Nope, it doesn’t.

I use OnlyOffice 8.1.1 on macOS 12.7.5

Here’s the content of the csv file (utf-8 / CRLF):

Expected date;Y-M-D;Y-D-M;D-M-Y;Y/M/D;Y/D/M;D/M/Y
“-> 28 October 2024”;2024-11-28;2024-28-11;28-11-2024;2024/11/28;2024/28/11;28/11/2024
“-> 8 April 2023”;2023-04-08;2023-08-04;08-04-2023;2023/04/08;2023/08/04;08/04/2023
“-> 22 July 1994”;1994-07-22;1994-22-07;22-07-1994;1994/07/22;1994/22/07;22/07/1994

dates.csv (309 Bytes)

And here’s what OO parsed:

So, as you can see, every parsed dates are wrong. Dates in DD-MM-YYYY format look right but in fact are not recognized as dates at all.

I imported (copy & paste) the test data from my earlier post into OO, and it was parsed as expected.

@Kroc I am not surprised that it struggled with your test data given the number of date format changes; however…

OO Dates

by applying custom date formatting to some of the relevant columns, marked as “custom”, the data was straightened out, excepting the 8 April 2023 row where day 08 and month 04 are both under 12, so the date parser could not accurately distinguish them apart.

I would suggest that if it is needful to display dates in multiple format standards on the one worksheet, it may be better to supply the dates in a format that is not ambiguous, like the internal date number, ie. 8 April 2023 which is 45024, or the Unix time code, but that would require a formula to convert to the internal date number.

… also, your expected October date row has November date data.

The October / November error was a typo when I prepared this sample file. My bad.

Getting working data on my side is not difficult. What we need is that OO works with all date formats encoutered in CSV files like Excel does, not only one.

Maybe an optional step in which we could indicate the type and structure of the CSV columns would do the job?

Can you reference the way your data parsed there?

@Kroc

I am not sure how even Excel would handle parsing this date data.

My expectation would be that date data would be parsed according to the region upon which one was configured in the computer’s OS.

What you are attempting to do, parsing regional formatted dates for multiple regions in a single sheet seems a bit of an impossible task. And as I mentioned previously, I would expect that the only reliable approach would be using a format-less date code like the internal date number or the Unix time, which based on the number of seconds since the Epoch Date (1 Jan 1970). ie. I am typing this at time 1722340735, which much to my surprise, Excel can’t parse this without a formula.

These format-less date can then be parsed to any local date format without error.
Unlike 08/04/2024 or 08-04-1024 which is could be parsed as 8 April 2024 or 4 August 2024 depending upon region formatting or context. I would not expect software to sort this out without some degree of external assistance.
And expecting date format parsing instruction in the CSV file is not going to work without introducing a new extended CSV format.

When I raised the ticket, all my date data was in dd/mm/yyyy format, but prior to the bug fix, it parsed correctly until the data supported mm/dd/yyyy then it parsed it that way, before reverting back to dd/mm/yyyy when forced to by the data. The bug fix in the latest version addressed this anomalous behaviour, using OS regional settings I expect.

I’m not sure I understood your question. If you want to know how I got from the uploaded CSV file to the screenshot: I simply opened the CSV file in OnlyOffice Mac.

But this CSV file was just a sample I made to illustrate OnlyOffice Mac fails to guess dates format, whatever day - month - year order we choose and regardless of the separation character used. What I was hoping with this example file, was that OnlyOffice would recognise at least one entire date column without any errors. Then I would have used this date format scheme in all my CSV files.

In the case that most often causes me problems (data exported from MySQL), I solved the problem by transforming the MySQL EPOCH values into Excel date values using this formula:

SELECT (i_date/86400)+25569 AS Date

instead of trying to convert MySQL EPOCH values to a dd/mm/yyyy sequence OnyOffice Mac would recognize (using my CSV file to find which one).

All this to say that this CSV file is not one that I want OnlyOffice to be able to understand. It was simply a tool.

But if it exists, it would be nice to tell us the date format that OnlyOffice will always be able to convert from CSV without making any errors.

And to complete my answer above, even with only one dates column, which @DavidRGreen indicates to import correctly on his side, it fails on my Mac, as you can see on this screenshot:

I just open the CSV with OnlyOffice, nothing else, and all the cell values were transformed to 21/01/1970.

I was asking to share a screenshot of how provided CSV is parsed by Excel in your case to see the difference.


We will investigate this case.

1 Like

Here’s the result with the free version of Excel from Mac App Store.

This is what you get when you simply open the CSV file and do nothing else:

And to show you clearly which columns Excel recognized as valid dates, this is the same data but with cell data type set to long date format (in French) and valid results in green and wrong ones in red:

I’ve deliberately left my error “28 October 2024” in cell A2 which should be “28 November 2024” to show you that I used exactly the same CSV file as in my OnlyOffice screenshot of my message above.

The issue seems to occur only on macOS. Bug was registered, thanks for the report and sorry for the inconvenience caused.

1 Like

Maybe not the exact problem, but also with date formats and CSV. Linux Mint 22 with ONLYOFFICE 8.1. It does not import CSV date fields in proper format when directly importing CSV files by double-clicking on the file.

Sample CSV as text from external text editor with mm/dd/yyyy date format as expected.:

Blockquote
POSTED DATE,TRANSACTION DATE,TRANSACTION TIME
08/10/2024,08/09/2024,02:52:31 PM
08/10/2024,08/09/2024,02:44:18 PM
08/10/2024,08/09/2024,02:40:10 PM
08/10/2024,08/09/2024,02:32:03 PM
08/10/2024,08/09/2024,12:56:11 PM
08/10/2024,08/09/2024,12:44:43 PM
08/10/2024,08/09/2024,12:48:12 PM
08/10/2024,08/09/2024,12:36:48 PM
07/10/2024,07/10/2024,05:35:59 AM
07/10/2024,07/09/2024,07:25:09 AM
07/10/2024,07/09/2024,07:17:09 AM
07/09/2024,07/09/2024,08:07:43 AM
07/09/2024,07/09/2024,07:38:17 AM
07/09/2024,07/09/2024,07:35:21 AM

When directly importing by double-clicking on CSV file, date is in dd/mm/yyyy format (not appropriate for Region).

Blockquote
|POSTED DATE|TRANSACTION DATE|TRANSACTION TIME||
|10/8/2024|9/8/2024|02:52:31 PM||
|10/8/2024|9/8/2024|02:44:18 PM||
|10/8/2024|9/8/2024|02:40:10 PM||
|10/8/2024|9/8/2024|02:32:03 PM||
|10/8/2024|9/8/2024|12:56:11 PM||
|10/8/2024|9/8/2024|12:44:43 PM||
|10/8/2024|9/8/2024|12:48:12 PM||
|10/8/2024|9/8/2024|12:36:48 PM||
|10/7/2024|10/7/2024|05:35:59 AM||
|10/7/2024|9/7/2024|07:25:09 AM||
|10/7/2024|9/7/2024|07:17:09 AM||
|9/7/2024|9/7/2024|08:07:43 AM||
|9/7/2024|9/7/2024|07:38:17 AM||
|9/7/2024|9/7/2024|07:35:21 AM||

When Importing via File->Create New- > Data ->Get Data->From Local TXT/CSV, date is in correct regional format of mm/dd/yyyy.

Blockquote
|POSTED DATE|TRANSACTION DATE|TRANSACTION TIME||
|8/10/2024|8/9/2024|2:52:31 PM|
|8/10/2024|8/9/2024|2:44:18 PM|
|8/10/2024|8/9/2024|2:40:10 PM|
|8/10/2024|8/9/2024|2:32:03 PM|
|8/10/2024|8/9/2024|12:56:11 PM|
|8/10/2024|8/9/2024|12:44:43 PM|
|8/10/2024|8/9/2024|12:48:12 PM|
|8/10/2024|8/9/2024|12:36:48 PM|
|7/10/2024|7/10/2024|5:35:59 AM|
|7/10/2024|7/9/2024|7:25:09 AM|
|7/10/2024|7/9/2024|7:17:09 AM|
|7/9/2024|7/9/2024|8:07:43 AM|
|7/9/2024|7/9/2024|7:38:17 AM|
|7/9/2024|7/9/2024|7:35:21 AM|

Is directly importing CSV by double-clicking on the file not the proper procedure or is this a bug?

More Information to last reply:

  • Using ONLYOFFICE->Open Local file gives same result as double-clicking from file manager. Totally expected, just confirming.
  • Regional setting for Linux Mint 22, applied US English (including date format) System-Wide, logged out and rebooted. Same results as listed.
  • Confirmed Region setting is English (United States) in ONLYOFFICE calc.
  • Exact version is 8.1.1.27 via flatpack.

Hello @TomB

So the expected date to see is 10th of August in the this line, is it correct? At least it seems so according to US format.

Yes, that’s supposed to be August 2024 dates.

Thank you for the reply. We will take a closer look at this situation. I’m keeping you posted.

We have registered an issue on this behavior for further investigation. Thank you for pointing to it.

Thanks for your efforts. This is how things get better.

2 Likes