Dear OnlyOffice Team,
We have integrated OnlyOffice into our application to enable editing and previewing of documents uploaded by users to our server. OnlyOffice works exceptionally well; users find it very smooth, and it fully meets our expectations. The only issue we’ve encountered relates to OnlyOffice’s versioning mechanism.
By examining the OnlyOffice working directory on the server (`/var/lib/onlyoffice/documentserver/App_Data/cache/files/data`), we observed the following workflow for file editing:
- When a user first opens a file via OnlyOffice, an `Editor.bin` file and a `media` folder are created under `App_Data/cache/files/data/${fileKey}/`.
- After editing, the changes are saved to a new directory: `App_Data/cache/files/data/${fileKey}_\d+/`. This directory contains three files: `changes.zip`, `changeHistory.json`, and `output.${fileType}`.
- Every edit generates a new directory with these three files.
This poses a problem: if the original file is large, multiple edits (even minor ones) create numerous copies of `output.${fileType}`, consuming significant storage space.
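To put a number on the storage cost, here is a minimal sketch of how the space taken by the repeated output files could be measured per file key. It relies only on the directory layout described above; the function name `output_bytes_per_key` is our own, and it assumes a `fileKey` never itself ends in `_` followed by digits.

```python
import re
from collections import defaultdict
from pathlib import Path

# Version directories are named "<fileKey>_<digits>", as observed above.
# Assumption: a fileKey never itself ends in "_<digits>".
VERSION_DIR = re.compile(r"^(?P<key>.+)_(?P<n>\d+)$")


def output_bytes_per_key(data_dir):
    """Sum the sizes of output.* files in all version directories, grouped by fileKey."""
    totals = defaultdict(int)
    for entry in Path(data_dir).iterdir():
        match = VERSION_DIR.match(entry.name)
        if match is None or not entry.is_dir():
            continue
        # Only the output.${fileType} copies are counted, not changes.zip
        # or changeHistory.json.
        for output in entry.glob("output.*"):
            totals[match.group("key")] += output.stat().st_size
    return dict(totals)


if __name__ == "__main__":
    data_dir = "/var/lib/onlyoffice/documentserver/App_Data/cache/files/data"
    if Path(data_dir).is_dir():
        for key, size in sorted(output_bytes_per_key(data_dir).items()):
            print(f"{key}: {size} bytes in output files")
```

The script only reads file sizes and does not modify anything in the cache directory.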
Our Questions:
- Can OnlyOffice be configured to retain only the latest version of edits? If so, how?
- If configuration isn’t possible, can we manually delete intermediate versions? Based on our analysis of the directory structure before/after edits, we suspect OnlyOffice works as follows:
  - The original file is converted to `Editor.bin` upon first open and remains unchanged.
  - Each edit generates a `changes.zip`, which functions like a diff/patch file. The `output.${fileType}` is generated by applying `changes.zip` to `Editor.bin`.
If our understanding is correct, we could safely delete intermediate versions and keep only the latest edit results. We plan to identify the latest version using the `last_open_date` field in the `task_result` table of the PostgreSQL database.
Would there be any side effects if we programmatically delete these intermediate versions?
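For reference, the selection step we have in mind could look roughly like the sketch below. It is a dry run that only decides which directories would be kept or removed and deletes nothing. The function name `split_latest` and the sample rows are hypothetical; we assume a mapping from version directory name to its `last_open_date` value has already been read from `task_result`, and how those rows map to directory names is also our assumption.

```python
from datetime import datetime


def split_latest(last_open_by_dir):
    """Return (latest_dir, intermediate_dirs) from a {dir_name: last_open_date} mapping."""
    if not last_open_by_dir:
        return None, []
    # The directory with the most recent last_open_date is treated as the latest version.
    latest = max(last_open_by_dir, key=last_open_by_dir.get)
    intermediates = sorted(name for name in last_open_by_dir if name != latest)
    return latest, intermediates


if __name__ == "__main__":
    # Hypothetical rows, standing in for values queried from task_result.
    rows = {
        "fileKey_1": datetime(2024, 1, 1, 9, 0),
        "fileKey_2": datetime(2024, 1, 2, 9, 0),
        "fileKey_3": datetime(2024, 1, 3, 9, 0),
    }
    latest, candidates = split_latest(rows)
    print("keep:", latest)
    print("candidates for deletion (dry run):", candidates)
```

Whether removing the candidate directories is actually safe is exactly what we are asking, so this sketch stops at listing them.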
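------------------------------------------------ The original text behind the AI translation above follows (translated from Chinese) ------------------------------------------------
How to delete the historical versions of an edited file and keep only the latest version.
Hello, OnlyOffice team. I have integrated OnlyOffice into our application so that documents users upload to our server can be edited and previewed.
OnlyOffice works very well and feels smooth to users; it fully meets our expectations. The only problem lies in OnlyOffice's multi-version mechanism.
By looking at the OnlyOffice working directory on the server, /var/lib/onlyoffice/documentserver/App_Data/cache/files/data, I found that OnlyOffice handles file editing as follows:
- After a user opens a file through OnlyOffice for the first time, an Editor.bin file and a media folder are created under App_Data/cache/files/data/${fileKey}/.
- After the user edits the file, the result is saved to the directory App_Data/cache/files/data/${fileKey}_\d+/, which receives changes.zip, changeHistory.json, and output.${fileType}.
- Each time the user edits the file, a new directory is created containing the three files described in the second item.
This causes a problem: if a file is very large, editing it many times (even changing a single character) saves many versions of output.${fileType}, which consumes a great deal of storage space.
Can OnlyOffice be configured to keep only the latest version of the edit results? If so, how should it be configured? If not, can I delete the unneeded intermediate versions myself? Based on how the directory structure changes before and after an edit, I suspect OnlyOffice works as follows:
- When a user first opens a file, the original file is converted to Editor.bin, and this file is never modified afterwards.
- After each edit, a changes.zip is generated in that edit's directory. changes.zip acts like a patch file produced by the diff command: Editor.bin plus changes.zip is enough to generate output.${fileType}.
If my understanding above is correct, I can delete the intermediate versions and keep only the result of the last edit. Which directory holds the latest version can be determined by querying the last_open_date field of the task_result table in the PostgreSQL database. If I delete these intermediate versions from my own code, will there be any other side effects?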