Check_mk omd backup or WATO backup feature -> Connection reset by peer (Since 1.5.0)


(Overlord) #1

Hey guys

Since I switched from check_mk 1.4 to 1.5, all my instances have a problem creating backups. Whether I write the backup directly to the mounted network path or locally, via the CLI with omd or through WATO, I get “Connection reset by peer”.

Any ideas?

Greetz
Ovrld


(Overlord) #2

Nothing?

root@server:/home/admin# omd backup Monitoring /home/admin/Monitoring_05.09.18
Failed to perform backup: [Errno 104] Connection reset by peer

I have the same issue on all my check_mk machines (all with the same Debian version and updates).


#3

Same here… :frowning:


#4

My CMK servers still run 1.4, but I just tested with 1.5.0p2 (Enterprise). No issues. It’s a fairly fresh test install, so the problem may be specific to older, upgraded installs.

Ubuntu 16.04 LTS


(Raphaël Berlamont) #5

Hi,

I have the same issue with 1.5.0p5 and can’t find a way to do the backup.

Has anybody gotten rid of this “bug”?


(Volker A Mönch) #6

Hi,

Did you figure it out? I have the same problem.

Regards,
Volker


(Philipp Näther) #7

@KleskMS @Raphux @Paubolix which OS do you run exactly?

@All
How big are your sites?
Does it work for fresh site installations?
What does it show if you call # omd -v backup ...?
Do you have sufficient disk space?
Does the backup work if you exclude one of the following: --no-rrds, --no-dbs, --no-logs ?
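
The last check can be run in one pass. A minimal sketch, assuming a POSIX shell: `try_backup_variants` is a hypothetical helper name, and the option names are simply the ones listed in the question above, so an omd version that does not know one of them will show up as a failure for that option.

```shell
# Try "omd backup" once without extra options and once per exclusion option.
# Assumption: the options come from the question above; they are not verified
# against any particular omd version.
try_backup_variants() {
    site="$1"
    target_dir="$2"
    for opt in "" --no-rrds --no-dbs --no-logs; do
        label="${opt#--}"
        label="${label:-default}"
        # $opt is deliberately unquoted so the empty case adds no argument.
        if omd backup $opt "$site" "$target_dir/$site-$label.backup" >/dev/null 2>&1; then
            echo "$label: ok"
        else
            echo "$label: failed"
        fi
    done
}
```

Calling `try_backup_variants Monitoring /home/admin` prints one `ok` or `failed` line per variant. Rerun a failing variant by hand with `omd -v backup ...` to see the details.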


(Volker A Mönch) #8

How big are your sites?

I’m a beginner, only 3 Hosts. 76 services.

Does it work for fresh site installations?

It is fresh: the installation is about 4 weeks old.

what does it show if you call # omd -v backup ... ?

Many “resumed” messages from rrdcached. The last message was: rrdcached response: '0 0 rrds resumed\n'

Do you have sufficient disk space?

yes

Does the backup work if you exclude one of the following: --no-rrds, --no-dbs, --no-logs ?

No, same message as above

Regards from Volker


(Overlord) #9

How big are your sites?

#1: 292 Services / 15 Hosts / 100 MB Backup Folder
#2: 2156 Services / 90 Hosts / 780 MB Backup Folder

Does it work for fresh site installations?

My installations are old

what does it show if you call # omd -v backup ... ?

Both sites show a lot of these:
rrdcached response: '-1 /opt/omd/sites/Monitoring/var/pnp4nagios/perfdata/www-qa-02.test.de/Memory_sreclaimable.rrd - No such file or directory\n'
Resuming RRD updates for /opt/omd/sites/Monitoring/var/pnp4nagios/perfdata/www-qa-02.test.de/Memory_sreclaimable.rrd
rrdcached command: RESUME /opt/omd/sites/Monitoring/var/pnp4nagios/perfdata/www-qa-02.test.de/Memory_sreclaimable.rrd
skipping rrdcached command (broken pipe)
Pausing RRD updates for /opt/omd/sites/Monitoring/var/pnp4nagios/perfdata/www-qa-02.test.de/Memory_pending.rrd
rrdcached command: SUSPEND /opt/omd/sites/Monitoring/var/pnp4nagios/perfdata/www-qa-02.test.de/Memory_pending.rrd
rrdcached response: '-1 /opt/omd/sites/Monitoring/var/pnp4nagios/perfdata/www-qa-02.test.de/Memory_pending.rrd - No such file or directory\n'
Resuming RRD updates for /opt/omd/sites/Monitoring/var/pnp4nagios/perfdata/www-qa-02.test.de/Memory_pending.rrd
rrdcached command: RESUME /opt/omd/sites/Monitoring/var/pnp4nagios/perfdata/www-qa-02.test.de/Memory_pending.rrd
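
With thousands of lines of verbose output, a short summary is easier to compare between sites. A minimal sketch, assuming the output of `omd -v backup ...` has been saved to a file and that its lines look like the ones quoted in this thread. `summarize_backup_log` is a hypothetical helper name, not part of omd.

```shell
# Summarize a saved "omd -v backup" log read from stdin.
# Counts rrdcached "-1 ... No such file or directory" responses and
# "broken pipe" skips, and prints the final failure line if there is one.
summarize_backup_log() {
    awk '
        /^rrdcached response: [^0-9-]*-1 .* - No such file or directory/ { missing++ }
        /^skipping rrdcached command \(broken pipe\)/ { broken++ }
        /^Failed to perform backup:/ { failure = $0 }
        END {
            printf "missing RRD responses: %d\n", missing
            printf "broken pipe skips: %d\n", broken
            if (failure != "") print failure
        }
    '
}
```

A possible use is `omd -v backup Monitoring /tmp/test.backup > backup.log 2>&1` followed by `summarize_backup_log < backup.log`. The counts only describe the log. They do not show which of the messages, if any, triggers the reset.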

Do you have sufficient disk space?

Yes

Does the backup work if you exclude one of the following: --no-rrds, --no-dbs, --no-logs?

Seems so. With “--no-rrds” the only line I get is:

Calling hook: /omd/sites/Monitoring/lib/omd/hooks/LIVESTATUS_TCP_ONLY_FROM 'default'


(Philipp Näther) #10

@Paubolix @Overlord

With “Does it work on fresh installations” I wanted you to create a new site, maybe add localhost as a host and try a backup.

@Paubolix

Which OS do you use?

@Overlord

Do you have a distributed setup?


(Volker A Mönch) #11

Hi Philipp,

One question first: could this be the cause?

[Warn] Apache number of processes

Thank you for your efforts.


(Philipp Näther) #12

No, this doesn’t affect the backup process, unless your system actually has too little RAM and the Apache processes use all of it.


(Volker A Mönch) #13

With “Does it work on fresh installations” I wanted you to create a new site, maybe add localhost as a host and try a backup.

No problems with a new site

Which OS do you use?

Debian 6.3 Linux 4.9.0-7-amd64


(Overlord) #14

With “Does it work on fresh installations” I wanted you to create a new site, maybe add localhost as a host and try a backup.

Me too, no problems. I think it’s an issue with the rrdcached performance data. Maybe an old format or something?

Which OS do you use?

Debian 9.5 x64

Do you have a distributed setup?

No, two separate servers: one at home and one at work.


(Overlord) #15

Hey guys

Related to the backup issue I got an answer from the check_mk team about my ticket:

Hi,
thanks for reporting this problem. We have fixed an issue which could have caused your problem with werk #6860 (https://mathias-kettner.de/check_mk-werks.php?werk_id=6860). This fix will be included with the 1.5.0p8 release.
Best regards
Lars


(Raphaël Berlamont) #16

Hi @TheLucKy, thank you for your time.

which OS do you run exactly?

Debian 9 x86_64

How big are your sites?

1 site, 24 hosts, 700 services

Does it work for fresh site installations?

Yes

What does it show if you call # omd -v backup ... ?

root@checkmk:~#  omd -v backup myfirm /tmp/myfirm-cmk.backup
Calling hook: /omd/sites/myfirm/lib/omd/hooks/LIVESTATUS_TCP_ONLY_FROM 'default'
Pausing RRD updates for /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/hebergement-mut-1.myfirm.fr/Interface_2_inerr.rrd
rrdcached command: SUSPEND /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/hebergement-mut-1.myfirm.fr/Interface_2_inerr.rrd
rrdcached response: '0 /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/hebergement-mut-1.myfirm.fr/Interface_2_inerr.rrd suspended\n'
Resuming RRD updates for /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/hebergement-mut-1.myfirm.fr/Interface_2_inerr.rrd
rrdcached command: RESUME /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/hebergement-mut-1.myfirm.fr/Interface_2_inerr.rrd
rrdcached response: '0 /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/hebergement-mut-1.myfirm.fr/Interface_2_inerr.rrd resumed\n'
Pausing RRD updates for /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/hebergement-mut-1.myfirm.fr/Filesystem__home_trend.rrd
rrdcached command: SUSPEND /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/hebergement-mut-1.myfirm.fr/Filesystem__home_trend.rrd
rrdcached response: '0 /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/hebergement-mut-1.myfirm.fr/Filesystem__home_trend.rrd suspended\n'
Resuming RRD updates for /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/hebergement-mut-1.myfirm.fr/Filesystem__home_trend.rrd
rrdcached command: RESUME /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/hebergement-mut-1.myfirm.fr/Filesystem__home_trend.rrd
rrdcached response: '0 /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/hebergement-mut-1.myfirm.fr/Filesystem__home_trend.rrd resumed\n'

[~9k lines removed]

Pausing RRD updates for /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/rou1a1r2.myfirm.fr/Memory_writeback.rrd
rrdcached command: SUSPEND /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/rou1a1r2.myfirm.fr/Memory_writeback.rrd
rrdcached response: '-1 /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/rou1a1r2.myfirm.fr/Memory_writeback.rrd - No such file or directory\n'
Resuming RRD updates for /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/rou1a1r2.myfirm.fr/Memory_writeback.rrd
rrdcached command: RESUME /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/rou1a1r2.myfirm.fr/Memory_writeback.rrd
Failed to perform backup: [Errno 104] Connection reset by peer
root@checkmk:~#

Do you have sufficient disk space?

5.5 GB free, so I guess yes.

Does the backup work if you exclude one of the following: --no-rrds, --no-dbs, --no-logs ?

--no-rrds: OK

root@checkmk:~# omd -v backup --no-rrds myfirm /tmp/myfirm-cmk.backup
Calling hook: /omd/sites/myfirm/lib/omd/hooks/LIVESTATUS_TCP_ONLY_FROM 'default'
root@checkmk:~#

--no-dbs: does not work, the option does not exist.

root@checkmk:~# omd -v backup --no-dbs librit /tmp/librit-cmk.backup
Invalid option '--no-dbs'
root@checkmk:~#

--no-logs: same failure as without the option.

root@checkmk:~# omd -v backup --no-logs myfirm /tmp/myfirm-cmk.backup
Calling hook: /omd/sites/myfirm/lib/omd/hooks/LIVESTATUS_TCP_ONLY_FROM 'default'
Pausing RRD updates for /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/hebergement-mut-1.myfirm.fr/Interface_2_inerr.rrd
rrdcached command: SUSPEND /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/hebergement-mut-1.myfirm.fr/Interface_2_inerr.rrd
rrdcached response: '0 /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/hebergement-mut-1.myfirm.fr/Interface_2_inerr.rrd suspended\n'
Resuming RRD updates for /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/hebergement-mut-1.myfirm.fr/Interface_2_inerr.rrd
rrdcached command: RESUME /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/hebergement-mut-1.myfirm.fr/Interface_2_inerr.rrd
rrdcached response: '0 /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/hebergement-mut-1.myfirm.fr/Interface_2_inerr.rrd resumed\n'

[~6k lines removed]

skipping rrdcached command (broken pipe)
Pausing RRD updates for /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/zoneminder.myperso.com/Memory_sunreclaim.rrd
rrdcached command: SUSPEND /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/zoneminder.myperso.com/Memory_sunreclaim.rrd
rrdcached response: '-1 /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/zoneminder.myperso.com/Memory_sunreclaim.rrd - No such file or directory\n'
Resuming RRD updates for /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/zoneminder.myperso.com/Memory_sunreclaim.rrd
rrdcached command: RESUME /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/zoneminder.myperso.com/Memory_sunreclaim.rrd
skipping rrdcached command (broken pipe)
Pausing RRD updates for /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/zoneminder.myperso.com/Memory_shmem_pmd_mapped.rrd
rrdcached command: SUSPEND /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/zoneminder.myperso.com/Memory_shmem_pmd_mapped.rrd
rrdcached response: '-1 /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/zoneminder.myperso.com/Memory_shmem_pmd_mapped.rrd - No such file or directory\n'
Resuming RRD updates for /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/zoneminder.myperso.com/Memory_shmem_pmd_mapped.rrd
rrdcached command: RESUME /opt/omd/sites/myfirm/var/pnp4nagios/perfdata/zoneminder.myperso.com/Memory_shmem_pmd_mapped.rrd
Failed to perform backup: [Errno 104] Connection reset by peer
root@checkmk:~#
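
To see which RRD files rrdcached complained about, and whether they are really absent on disk, the paths can be pulled out of a saved log. A minimal sketch, assuming the log format quoted above. Both function names are hypothetical, and the thread does not establish that the missing files themselves cause the reset.

```shell
# Print each path from a "-1 <path> - No such file or directory" response once.
missing_rrds_from_log() {
    sed -n 's/^rrdcached response: [^0-9-]*-1 \(.*\) - No such file or directory.*$/\1/p' | sort -u
}

# For each path on stdin, report whether the file currently exists.
report_rrd_presence() {
    while IFS= read -r path; do
        if [ -e "$path" ]; then
            echo "present: $path"
        else
            echo "absent: $path"
        fi
    done
}
```

For example, `missing_rrds_from_log < backup.log | report_rrd_presence` lists every affected path once, where `backup.log` is a placeholder for the saved verbose output.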

(Philipp Näther) #17

I would say you have to wait for the next release; as Overlord stated, a fix is coming. I think it is a bug related to the RRD implementation in Debian environments.


(Marcel Schulte) #18

Hi, the “connection reset by peer” issue is fixed in Werk 6860.

…will be included in 1.5.0p8.
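
To check whether a given site version already contains the fix, the patch level can be compared against p8. A minimal sketch that only understands version strings of the form `1.5.0pN`, optionally followed by a suffix. `has_backup_fix` is a hypothetical helper name, and anything outside the 1.5.0 series is reported as unknown rather than guessed.

```shell
# Exit status 0: 1.5.0pN with N >= 8 (fix from Werk 6860 should be included).
# Exit status 1: a 1.5.0 version below p8, or one without a patch level.
# Exit status 2: not a 1.5.0 version, so this sketch cannot tell.
has_backup_fix() {
    case "$1" in
        1.5.0p[0-9]*)
            patch="${1#1.5.0p}"
            patch="${patch%%[!0-9]*}"   # drop any non-numeric suffix
            [ "$patch" -ge 8 ]
            ;;
        1.5.0*) return 1 ;;
        *) return 2 ;;
    esac
}
```

For example, `has_backup_fix 1.5.0p5 || echo "update before relying on omd backup"`. The version string has to be read from the installation first, for instance from the output of `omd version`; the exact format of that output is not shown in this thread.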


(Raphaël Berlamont) #19

I confirm that it works perfectly on 1.5.0p8. Thanks, all!