Combining three documents into one

Hello there,

I’m trying to accomplish the following in the Document Builder:

  1. Open first file and save it into a variable
  2. Open second file and save it into a variable
  3. Open the main template file which contains some text and table of contents (first two pages of final document). It is a normal docx file - just called a template
  4. Clear out the file
  5. Insert the original template contents
  6. Go to the end of the opened file and insert contents of first file
  7. After that, insert the contents of second file
  8. Update the table of contents so it contains newly inserted sections
  9. Do some text replacement
  10. Add a paragraph at the end
  11. Save as .pdf

Here is the script I use for now:

builder.SetTmpFolder("DocBuilderTemp")

builder.OpenFile("firstfile.docx")
const oDocument = Api.GetDocument()
const json = oDocument.ToJSON(true, true)
GlobalVariable["firstfile"] = json
builder.CloseFile()

builder.OpenFile("secondfile.docx")
const oDocument = Api.GetDocument()
const json = oDocument.ToJSON(true, true)
GlobalVariable["secondfile"] = json
builder.CloseFile()

builder.OpenFile("template.docx")
const ofirstfile = Api.FromJSON(GlobalVariable["firstfile"])
const osecondfile = Api.FromJSON(GlobalVariable["secondfile"])
const oDocument = Api.GetDocument()
const oDocument2 = oDocument.GetContent()

oDocument.RemoveAllElements()
oDocument.InsertContent(oDocument2)
oDocument.InsertContent(ofirstfile)
oDocument.InsertContent(osecondfile)
oDocument.UpdateAllTOC()
builder.SaveFile("pdf", "result.pdf")
builder.CloseFile()

I’m struggling a bit with steps 4-8. The resulting file behaves strangely: it contains the contents of only one file and nothing else.
If I skip steps 4-5 and just insert the first file into the template, it gets inserted at the beginning instead of at the end, and I can’t insert any more files after that.

Should I be using some different workflow? Maybe other methods?

Hello @3io4ugorhiucvb

First of all, please let me know which version of Document Builder you are using.

As for the scenario: I do not see the point in copy-pasting the content of the template document back into it - wouldn’t it be easier to simply place the content of those 2 files into the template document with InsertContent and save it as PDF?

Hi there,
I’m using the latest available version from the download page, so I guess it’s 8.2.0.

Like I mentioned in my first post, I tried just “appending” the file to the template, but it copies the contents to the beginning of the template, not the end. It also doesn’t allow me to insert any more files, so the line oDocument.InsertContent(osecondfile) doesn’t do anything.

There is also another issue: I use the oDocument.UpdateAllTOC() function because there is a table of contents in the template file. After adding the new content, it should add the headers from the new files to the TOC. But it looks like the headers are not discovered by this function, and the TOC remains empty.

Let’s focus on the “combining” issue first. I did some tests and managed to build a document with slight corrections to your script. Here is what I used:

builder.SetTmpFolder("DocBuilderTemp");

builder.OpenFile("1.docx");
var oDocument = Api.GetDocument();
var json1 = oDocument.ToJSON(true, true);
GlobalVariable["firstfile"] = json1;
builder.CloseFile();

builder.OpenFile("2.docx");
var oDocument = Api.GetDocument();
var json2 = oDocument.ToJSON(true, true);
GlobalVariable["secondfile"] = json2;
builder.CloseFile();

builder.OpenFile("0.docx");
var ofirstfile = Api.FromJSON(GlobalVariable["firstfile"]);
var osecondfile = Api.FromJSON(GlobalVariable["secondfile"]);
var oDocument = Api.GetDocument();
var oDocument2 = oDocument.GetContent();

oDocument.RemoveAllElements();
oDocument.InsertContent(oDocument2);
oDocument.InsertContent(ofirstfile);
oDocument.InsertContent(osecondfile);
oDocument.UpdateAllTOC();
builder.SaveFile("pdf", "result.pdf");
builder.CloseFile();

As you can see, I’ve renamed the original json variables to json1 and json2, because the second assignment was simply overwriting the first variable, which is why the second call, InsertContent(osecondfile);, did not work. The document names are also changed here:

  • 0.docx is the template document;
  • 1.docx is the first document to append;
  • 2.docx is the second document to append.

As a result I am getting a PDF file with combined content:
result.pdf (26.7 KB)
As you can see, the structure is built as expected: the content follows the order in which the documents were appended.


As for the table of contents not being updated, please start a new topic for that issue to avoid mixing up several problems in a single topic. I will assist you with it separately. Thanks.

While I agree it would be good to discuss the TOC issue independently, I have a strong case for it being very much related to the combining of multiple documents.
One issue is that (at least in my case) when a table of contents is created in Microsoft Word, it isn’t recognized by the UpdateAllTOC() function as a valid TOC. This can be easily mitigated by using OnlyOffice to edit a template file with a TOC, so I’d say this is a non-issue for me at this time.

However, there is a much bigger issue in the process of “importing” two files into the template.
Take the example files I provided.
firstfile.docx (26.7 KB)
secondfile.docx (26.8 KB)
template.docx (47.1 KB)

The files all have headers, text and lists.
However, when importing files into a global variable, it seems that all the attributes/text types(?) are stripped. If you run the following script on them:

builder.SetTmpFolder("DocBuilderTemp")

builder.OpenFile("firstfile.docx")
var oDocument = Api.GetDocument()
var json1 = oDocument.ToJSON(true, true)
GlobalVariable["firstfile"] = json1
builder.CloseFile()

builder.OpenFile("secondfile.docx")
var oDocument = Api.GetDocument()
var json2 = oDocument.ToJSON(true, true)
GlobalVariable["secondfile"] = json2
builder.CloseFile()

builder.OpenFile("template.docx")
var ofirstfile = Api.FromJSON(GlobalVariable["firstfile"])
var osecondfile = Api.FromJSON(GlobalVariable["secondfile"])
var oDocument = Api.GetDocument()
var oDocument2 = oDocument.GetContent()

oDocument.RemoveAllElements()

oDocument.InsertContent(oDocument2)
oDocument.InsertContent(ofirstfile)
oDocument.InsertContent(osecondfile)
oDocument.UpdateAllTOC()
builder.SaveFile("pdf", "result.pdf")
builder.SaveFile("docx", "result.docx")
builder.CloseFile()

…you can see in the resulting file that headers and lists got converted to plain text, as if they were ordinary paragraphs.
Because of that, UpdateAllTOC() only sees headers in the “template” file and inserts them into the TOC.
You can open the result.docx file and verify that there are, in fact, no headers or lists in the later parts of the file.
It also doesn’t matter whether the source files were created in OnlyOffice or Microsoft Word.

So yes, it is technically affecting the TOC generation, however the combining also doesn’t really work - it loses all the styling and structure.

My guess is that either the ToJSON() or the FromJSON() method is messing up all the attributes.
I also noticed that after importing with FromJSON() conversion, there are no methods available for the imported document elements.
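
One way to narrow down which of the two methods is at fault might be to inspect the string returned by ToJSON() before it goes through FromJSON(). Below is a minimal sketch in plain JavaScript. The helper name collectValues, the sample data and the key name "pStyle" are all hypothetical: I have not verified which key names the real output uses, so they would have to be read from an actual ToJSON() result first.

```javascript
// Recursively collect every value stored under `key` anywhere in a parsed JSON tree.
function collectValues(node, key, found = []) {
  if (Array.isArray(node)) {
    for (const item of node) collectValues(item, key, found);
  } else if (node !== null && typeof node === "object") {
    for (const [name, value] of Object.entries(node)) {
      if (name === key) found.push(value);
      collectValues(value, key, found);
    }
  }
  return found;
}

// Hypothetical sample standing in for the string returned by ToJSON().
// The structure and the "pStyle" key are assumptions made for illustration only.
const sample = JSON.stringify({
  content: [
    { type: "paragraph", pStyle: "Heading1", content: [{ type: "run", text: "Intro" }] },
    { type: "paragraph", content: [{ type: "run", text: "Body" }] }
  ]
});

// One entry per element that carries a value under the chosen key.
const styles = collectValues(JSON.parse(sample), "pStyle");
console.log(styles);
```

If style references do show up in the real ToJSON() output, the loss would more likely happen in FromJSON() or InsertContent(); if they are already missing at that point, ToJSON() would be the first suspect.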

I will take a closer look at the situation with the ToJSON/FromJSON methods and get back to you with feedback.

I’ve researched the scenario of combining several documents into one while preserving heading styles, and found an issue. Unfortunately, there is no direct mechanism for combining documents, and the ToJSON/FromJSON approach is not a reliable option here because these methods are very complex at their core. A bug has been registered for the heading styles going missing after these methods are used to combine documents.

Sorry for the inconvenience.