OnlyOffice Workspace portal doesn't start after server failure

Hello, I have a problem with Onlyoffice Workspace. I have a PC running Ubuntu 20.04.1 LTS, which serves as a server in a local network. A few years ago, Onlyoffice Workspace was installed on it using Docker (communityserver:11.1.0.1506, controlpanel:2.9.1.369, documentserver:6.2.0.123, mysql:5.7.30). Recently, the server had issues with one RAM module, which was replaced. Since then, the portal keeps loading infinitely upon startup. I’m wondering if there is a way to restore the functionality of the portal or at least retrieve the data from it.

I tried following the instructions in the thread "Onlyoffice portal never starts properly". However, the command docker exec -it onlyoffice-community-server service monoserve status returns monoserve: unrecognized service. Restarting the Community Server container with docker restart onlyoffice-community-server did not help. Running docker exec -it onlyoffice-community-server service systemd status likewise returns systemd: unrecognized service.

Hello, @kuzminol :wave:

Please execute the following commands and provide me with the file logs.txt

  1. Docker Version:
  • docker -v > logs.txt
  2. Container Information:
  • docker ps -a >> logs.txt
  3. Logs for the “community-server” container:
  • docker logs onlyoffice-community-server >> logs.txt
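The three commands above can also be wrapped into one helper. This is only a minimal sketch, not an official OnlyOffice tool: the function name `collect_logs` and the `DOCKER` override variable are made up for illustration. It also merges stderr into the file, because `docker logs` replays the container's stderr stream on stderr, which a plain `>>` redirect would miss.

```shell
# Hypothetical helper: gather the three diagnostics into a single file.
# DOCKER can be overridden (e.g. DOCKER="sudo docker"); it defaults to "docker".
DOCKER="${DOCKER:-docker}"

collect_logs() {
    out="${1:-logs.txt}"
    {
        echo "### docker version"
        $DOCKER -v
        echo "### containers"
        $DOCKER ps -a
        echo "### community-server logs"
        # Merge stderr so messages the container wrote to stderr are kept too.
        $DOCKER logs onlyoffice-community-server 2>&1
    } > "$out"
}
```

Usage: `collect_logs logs.txt`, then attach the resulting file.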

:upside_down_face:

Hello, @Nikolas :slight_smile:
I did everything you asked as soon as I got to the server.
I can’t upload the file logs.txt to the forum, so I uploaded it to Google Drive:
https://drive.google.com/file/d/1SE1TOZq85m3j1w45fYtTDASDY-wO-uu3/view?usp=drivesdk

Thank you!

@kuzminol
Requested access.

@Nikolas , sorry, forgot to do it before. Done.

@kuzminol

Briefly summarizing the logs you provided:

The issue lies in establishing a connection to the onlyoffice-document-server container.
The script is performing a check for the availability of the onlyoffice-document-server host on port 8000 and continues execution only after the connection is established.
Unfortunately, there is no response from the document server, causing the container to restart its configuration process.

...
+ bash /app/assets/tools/wait-for-it.sh onlyoffice-document-server:8000 --quiet -s -- echo 'Document Server is up'
+ sleep 1
+ bash /app/assets/tools/wait-for-it.sh onlyoffice-document-server:8000 --quiet -s -- echo 'Document Server is up'
+ sleep 1
+ bash /app/assets/tools/wait-for-it.sh onlyoffice-document-server:8000 --quiet -s -- echo 'Document Server is up'
+ sleep 1
+ bash /app/assets/tools/wait-for-it.sh onlyoffice-document-server:8000 --quiet -s -- echo 'Document Server is up'
+ sleep 1
+ bash /app/assets/tools/wait-for-it.sh onlyoffice-document-server:8000 --quiet -s -- echo 'Document Server is up'
+ sleep 1
+ bash /app/assets/tools/wait-for-it.sh onlyoffice-document-server:8000 --quiet -s -- echo 'Document Server is up'
+ sleep 1
+ bash /app/assets/tools/wait-for-it.sh onlyoffice-document-server:8000 --quiet -s -- echo 'Document Server is up'
+ echo '##########################################################'
##########################################################
+ echo '#########  Start container configuration  ################'
#########  Start container configuration  ################
+ echo '##########################################################'
##########################################################
+ SERVER_HOST=
+ APP_DIR=/var/www/onlyoffice
+ APP_DATA_DIR=/var/www/onlyoffice/Data
+ APP_INDEX_DIR=/var/www/onlyoffice/Data/Index/v7.4.0
+ APP_PRIVATE_DATA_DIR=/var/www/onlyoffice/Data/.private
+ APP_SERVICES_DIR=/var/www/onlyoffice/Services
................
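To see at a glance whether the Community Server ever got past this wait loop, the saved log can be checked with grep. A minimal sketch, assuming the log was saved to a file such as logs.txt; the function name `ds_wait_summary` is hypothetical, and it assumes that a successful wait prints the message `Document Server is up` on a line of its own (the traced `+ bash ...` lines only mention it as an argument).

```shell
# Hypothetical helper: summarize the wait loop seen in a saved log file.
ds_wait_summary() {
    log="$1"
    # Count the traced wait-for-it.sh invocations for the Document Server.
    # "|| true" keeps a zero count from being treated as a failure.
    attempts=$(grep -c 'wait-for-it.sh onlyoffice-document-server:8000' "$log" || true)
    # Assumption: on success the echo prints this message on its own line.
    if grep -qx 'Document Server is up' "$log"; then
        echo "attempts=$attempts reached=yes"
    else
        echo "attempts=$attempts reached=no"
    fi
}
```

Usage: `ds_wait_summary logs.txt`. Many attempts with `reached=no` would match the behaviour described above.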


Let’s take a look at what’s happening inside the document server container.
Please provide the logs from the document server container:

docker logs onlyoffice-document-server > logDS.txt

Logs from document server available here:

@kuzminol

.....
 * Starting PostgreSQL 12 database server        [ OK ]
nc: port number invalid: "services":
Waiting for connection to the { host on port "services":
nc: port number invalid: "services":
Waiting for connection to the { host on port "services":
nc: port number invalid: "services":
Waiting for connection to the { host on port "services":
.....

Looks like we hit a snag when starting up RabbitMQ. Let’s take a peek at the logs for RabbitMQ, Redis, and PostgreSQL.

The logs are located in the onlyoffice-document-server container:

/var/log/rabbitmq/
/var/log/redis/
/var/log/postgresql/
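One way to get these directories onto the host for uploading is `docker cp`, which copies a path out of a container. The helper below is only a sketch: the name `fetch_ds_logs`, the `DOCKER` override variable and the default destination `ds-logs` are assumptions made for illustration.

```shell
# DOCKER can be overridden (e.g. DOCKER="sudo docker"); it defaults to "docker".
DOCKER="${DOCKER:-docker}"

# Hypothetical helper: copy the three log directories from the container to the host.
fetch_ds_logs() {
    dest="${1:-ds-logs}"
    mkdir -p "$dest"
    for dir in rabbitmq redis postgresql; do
        # docker cp CONTAINER:SRC_PATH DEST_PATH copies a directory out of the container.
        $DOCKER cp "onlyoffice-document-server:/var/log/$dir" "$dest/$dir"
    done
}
```

Usage: `fetch_ds_logs ds-logs`, then pack the result with `tar czf ds-logs.tar.gz ds-logs` before sharing it.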

@Nikolas

Logs from the onlyoffice-document-server container here:

Let’s continue troubleshooting.
The attached logs didn’t provide the necessary information.
Follow the steps in the suggested sequence.

  1. Access the container:
docker exec -it onlyoffice-document-server bash
  2. Provide the output of the following command, run inside the onlyoffice-document-server container:
rabbitmq-diagnostics status
  3. Stop the RabbitMQ service:
service rabbitmq-server stop
  4. Attach the logs from the directory /var/log/rabbitmq/.
  5. Clear the RabbitMQ data directory:
rm -rf /var/lib/rabbitmq/mnesia/*
  6. Start the RabbitMQ service:
service rabbitmq-server start
  7. Give the community server some time to start up on its own.

If the issue persists, please provide the output of the following command: rabbitmq-diagnostics status (in the Community Server container)

@Nikolas

The output of the command rabbitmq-diagnostics status is:

root@e7c840398cc0:/# rabbitmq-diagnostics status
Error: unable to perform an operation on node 'rabbit@e7c840398cc0'. 
Please see diagnostics information and suggestions below.

Most common reasons for this are:

 * Target node is unreachable (e.g. due to hostname resolution, 
TCP connection or firewall issues)
 * CLI tool fails to authenticate with the server 
(e.g. due to CLI tool's Erlang cookie not matching that of the server)
 * Target node is not running

In addition to the diagnostics info below:

 * See the CLI, clustering and networking guides on
https://rabbitmq.com/documentation.html to learn more
 * Consult server logs on node rabbit@e7c840398cc0
 * If target node is configured to use long node names, 
don't forget to use --longnames with CLI tools

DIAGNOSTICS
===========

attempted to contact: [rabbit@e7c840398cc0]

rabbit@e7c840398cc0:
  * connected to epmd (port 4369) on e7c840398cc0
  * epmd reports: node 'rabbit' not running at all
                  no other nodes on e7c840398cc0
  * suggestion: start the node

Current node details:
 * node name: 'rabbitmqcli-2849002-rabbit@e7c840398cc0'
 * effective user's home directory: /var/lib/rabbitmq
 * Erlang cookie hash: 77oOWpjzbslkx0WjLqgilw==

Logs from the directory /var/log/rabbitmq/ available here:

Cleared the RabbitMQ data directory with the provided command and started RabbitMQ. The output was:

root@e7c840398cc0:/# service rabbitmq-server start
 * Starting RabbitMQ Messaging Server rabbitmq-server
* FAILED - check /var/log/rabbitmq/startup_\{log, _err\}

Files startup_log and startup_err from /var/log/rabbitmq/ available here:

UPD. Tried to start the RabbitMQ service one more time. It worked.
The output of rabbitmq-diagnostics status is:

root@e7c840398cc0:/# rabbitmq-diagnostics status
Status of node rabbit@e7c840398cc0 ...
Runtime

OS PID: 2854909
OS: Linux
Uptime (seconds): 772
RabbitMQ version: 3.8.2
Node name: rabbit@e7c840398cc0
Erlang configuration: Erlang/OTP 22 [erts-10.6.4] [source] 
[64-bit] [smp:1:1] [ds:1:1:10] [async-threads:64]
Erlang processes: 257 used, 1048576 limit
Scheduler run queue: 1
Cluster heartbeat timeout (net_ticktime): 60

Plugins

Enabled plugin file: /etc/rabbitmq/enabled_plugins
Enabled plugins:


Data directory

Node data directory: /var/lib/rabbitmq/mnesia/rabbit@e7c840398cc0

Config files


Log file(s)

 * /var/log/rabbitmq/rabbit@e7c840398cc0.log
 * /var/log/rabbitmq/rabbit@e7c840398cc0_upgrade.log

Alarms

(none)

Memory

Calculation strategy: rss
Memory high watermark setting: 0.4 of available memory, 
computed to: 6.6412 gb

other_proc: 0.0305 gb (29.87 %)
code: 0.0268 gb (26.21 %)
other_system: 0.0225 gb (22.05 %)
allocated_unused: 0.0179 gb (17.48 %)
other_ets: 0.0026 gb (2.58 %)
atom: 0.0014 gb (1.41 %)
binary: 0.0002 gb (0.23 %)
mnesia: 0.0001 gb (0.07 %)
metrics: 0.0 gb (0.05 %)
msg_index: 0.0 gb (0.03 %)
plugins: 0.0 gb (0.01 %)
quorum_ets: 0.0 gb (0.01 %)
connection_channels: 0.0 gb (0.0 %)
connection_other: 0.0 gb (0.0 %)
connection_readers: 0.0 gb (0.0 %)
connection_writers: 0.0 gb (0.0 %)
mgmt_db: 0.0 gb (0.0 %)
queue_procs: 0.0 gb (0.0 %)
queue_slave_procs: 0.0 gb (0.0 %)
quorum_queue_procs: 0.0 gb (0.0 %)
reserved_unallocated: 0.0 gb (0.0 %)

File Descriptors

Total: 2, limit: 1048479
Sockets: 0, limit: 943629

Free Disk Space

Low free disk space watermark: 0.05 gb
Free disk space: 11801.3621 gb

Totals

Connection count: 0
Queue count: 0
Virtual host count: 1

Listeners

Interface: [::], port: 25672, protocol: 
clustering, purpose: inter-node and CLI tool communication
Interface: [::], port: 5672, protocol: 
amqp, purpose: AMQP 0-9-1 and AMQP 1.0
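Since the first `service rabbitmq-server start` failed and a later attempt succeeded, a small retry loop may save some manual repetition. This is only a sketch: `retry` is a generic, hypothetical helper, not part of RabbitMQ or OnlyOffice, and it does not explain why the first attempt failed.

```shell
# Hypothetical helper: run a command up to N times, pausing between attempts.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $n/$attempts failed: $*" >&2
        n=$((n + 1))
        # Pause only when another attempt will follow.
        if [ "$n" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

Inside the container this could be called as `retry 3 10 service rabbitmq-server start`.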

UPD2. Half an hour later, I still see only the endless startup screen in the web interface.

@kuzminol

Great progress so far!
Please restart the Document Server (DS) and Community Server (CS) containers:
docker restart onlyoffice-document-server onlyoffice-community-server

@Nikolas
Nothing changed, still endless loading.
After restarting the onlyoffice-document-server container, the command rabbitmq-diagnostics status gives the same output as before:

root@e7c840398cc0:/# rabbitmq-diagnostics status
Error: unable to perform an operation on node 'rabbit@e7c840398cc0'. 
Please see diagnostics information and suggestions below.

Most common reasons for this are:

 * Target node is unreachable (e.g. due to hostname resolution, 
TCP connection or firewall issues)
 * CLI tool fails to authenticate with the server (e.g. due to CLI 
tool's Erlang cookie not matching that of the server)
 * Target node is not running

In addition to the diagnostics info below:

 * See the CLI, clustering and networking guides on 
https://rabbitmq.com/documentation.html to learn more
 * Consult server logs on node rabbit@e7c840398cc0
 * If target node is configured to use long node names, 
don't forget to use --longnames with CLI tools

DIAGNOSTICS
===========

attempted to contact: [rabbit@e7c840398cc0]

rabbit@e7c840398cc0:
  * connected to epmd (port 4369) on e7c840398cc0
  * epmd reports: node 'rabbit' not running at all
                  no other nodes on e7c840398cc0
  * suggestion: start the node

Current node details:
 * node name: 'rabbitmqcli-150-rabbit@e7c840398cc0'
 * effective user's home directory: /var/lib/rabbitmq
 * Erlang cookie hash: 77oOWpjzbslkx0WjLqgilw==

It looks like RabbitMQ is down again after the container restart.
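To re-check this from the host after each restart without opening a shell in the container, something like the following might help. It is a sketch under assumptions: the name `rabbit_state` and the `DOCKER` variable are hypothetical, and it relies on `rabbitmq-diagnostics status` exiting with a non-zero code when the node cannot be reached, as in the error output above.

```shell
# DOCKER can be overridden (e.g. DOCKER="sudo docker"); it defaults to "docker".
DOCKER="${DOCKER:-docker}"

# Hypothetical helper: report whether the RabbitMQ node in a container answers.
rabbit_state() {
    container="${1:-onlyoffice-document-server}"
    # Assumption: the CLI exits non-zero when the node is unreachable.
    if $DOCKER exec "$container" rabbitmq-diagnostics status >/dev/null 2>&1; then
        echo "running"
    else
        echo "down"
    fi
}
```

Usage: `rabbit_state onlyoffice-document-server` prints `running` or `down`.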

@kuzminol :eyes:
It’s unfortunate.
Let’s try reinstalling the Document Server container, assuming the issue lies there.

Please make sure to create a snapshot of the server hosting OnlyOffice Workspace before proceeding.
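If a full snapshot is not possible, it may at least help to know which host directories hold each container's data before removing anything, so they can be copied aside first. A minimal sketch: `list_mounts` and the `DOCKER` variable are hypothetical names, while `docker inspect -f` with a Go template is standard Docker CLI. Which directories appear depends on how the containers were originally created.

```shell
# DOCKER can be overridden (e.g. DOCKER="sudo docker"); it defaults to "docker".
DOCKER="${DOCKER:-docker}"

# Hypothetical helper: print "host path -> container path" for every mount.
list_mounts() {
    $DOCKER inspect \
        -f '{{range .Mounts}}{{println .Source "->" .Destination}}{{end}}' \
        "$1"
}
```

Usage: `list_mounts onlyoffice-document-server`, and the same for the other containers listed by `docker ps -a`.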

At this stage, you can attempt two solutions:

1. Reinstalling the Document Server using the script:

docker rm -f onlyoffice-document-server # remove the old document-server
wget https://download.onlyoffice.com/install/workspace-install.sh # download the new workspace installation script
bash workspace-install.sh -u true -dv 6.2.0.123 -ics false -icp false -ims false -ies false -skiphc true # Reinstall Document Server only to version 6.2.0.123

The script will automatically install and restart all containers.

Description:
-u or --update → update existing components (true|false)
-dv or --documentversion → Document Server version to install
-ics or --installcommunityserver → install or update community server (true|false|pull)
-icp or --installcontrolpanel → install or update control panel (true|false|pull)
-ims or --installmailserver → install or update mail server (true|false|pull)
-ies or --installelasticsearch → install or update elasticsearch (true|false|pull)
-skiphc or --skiphardwarecheck → skip hardware check

2. Updating to the latest versions of all Workspace components using a script:

wget https://download.onlyoffice.com/install/workspace-install.sh
bash workspace-install.sh -u true

You can start with the first option.
If the reinstall doesn’t help, you can then try the second option.
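After either option, it may be worth confirming which image each container actually runs. The sketch below is an illustration only: `check_image_tag` and the `DOCKER` variable are hypothetical names, and it assumes the image reference stored in the container config ends in `:<tag>`.

```shell
# DOCKER can be overridden (e.g. DOCKER="sudo docker"); it defaults to "docker".
DOCKER="${DOCKER:-docker}"

# Hypothetical helper: check that a container was created from the expected tag.
check_image_tag() {
    container="$1"
    expected_tag="$2"
    # .Config.Image holds the image reference the container was created with.
    image=$($DOCKER inspect -f '{{.Config.Image}}' "$container") || return 2
    case "$image" in
        *":$expected_tag")
            echo "ok: $container uses $image"
            ;;
        *)
            echo "mismatch: $container uses $image, expected tag $expected_tag"
            return 1
            ;;
    esac
}
```

Usage after the first option: `check_image_tag onlyoffice-document-server 6.2.0.123`.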