rrdcached/npcd is not removing or flushing the old journal files. If left unchecked, the rrdcached process gets killed by the Red Hat kernel and the OMD site is reported as "Partially started". When this occurs, either viewing graph data in the GUI takes forever, the site crashes, or graph data is not updated.
This issue can be kept in check if I manually stop the site, remove the rrd.journal* files in /opt/omd/sites/master/var/rrdcached, and start the site again, about twice a week.
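For anyone who wants to reproduce the manual cleanup, here is a minimal Python sketch of the file-removal step only. The directory is the one from this post; the function names (find_journals, remove_journals) and the dry_run switch are made up for illustration and are not part of OMD or rrdcached. It assumes the site has already been stopped by hand, and note that deleting journal files may discard updates that were not yet written to the RRD files.

```python
import glob
import os

# Path taken from the post; adjust for other sites.
JOURNAL_DIR = "/opt/omd/sites/master/var/rrdcached"


def find_journals(journal_dir=JOURNAL_DIR):
    """Return the rrd.journal* files in journal_dir, oldest first."""
    paths = glob.glob(os.path.join(journal_dir, "rrd.journal*"))
    return sorted(paths, key=os.path.getmtime)


def remove_journals(journal_dir=JOURNAL_DIR, dry_run=True):
    """Delete the journal files, or only list them when dry_run is True.

    Hypothetical helper: run it only while the site is stopped.
    """
    handled = []
    for path in find_journals(journal_dir):
        if not dry_run:
            os.remove(path)
        handled.append(path)
    return handled


if __name__ == "__main__":
    # Dry run by default: show what would be removed and how large it is.
    for path in remove_journals(dry_run=True):
        print(path, os.path.getsize(path))
```

Running it as-is only lists the journal files and their sizes; calling remove_journals(dry_run=False) actually deletes them.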
It seems that only the sites with a larger number of hosts/services have this issue. I am including logs from one working and one non-working server:
Server master - Check_mk RAW 1.2.8p18 (Environment: Production) - Not working
Server gmon - Check_mk RAW 1.2.8p18 (Environment: Test) - Working
We have been troubleshooting this issue for quite a while, so any help is greatly appreciated!