Thruk 2.38 - CentOS 7 logcache upgrade is growing huge

Hi,

We installed Thruk 2.38-0 in our monitoring environment yesterday, and ever since then the logcache ("/var/lib/mysql/ibdata1") has kept growing. I have now had to expand the disk for the third time.

I see the following entries in the logs:

[15:05:02,175][ERROR] logcache version too old: 5, recreating with version 6…
[15:05:02,179][ERROR] logcache version too old: 5, recreating with version 6…
[15:05:02,181][ERROR] logcache version too old: 5, recreating with version 6…
[15:05:02,213][ERROR] logcache version too old: 5, recreating with version 6…
[15:05:02,215][ERROR] logcache version too old: 5, recreating with version 6…
[15:05:02,218][ERROR] logcache version too old: 5, recreating with version 6…
[15:05:02,223][ERROR] logcache version too old: 5, recreating with version 6…
[15:45:01,517][ERROR] logcache version too old: 5, recreating with version 6…

These messages repeat every few minutes, and I can also see the corresponding processes running.
ibdata1 is currently about 110 GB and still growing. Does creating the logcache in the new version really need this much space?

Or is this a bug?

Thanks in Advance
Thomas

The logcache should not grow that much. You can see logcache statistics on the “Performance Info” page. Disk usage should not be much more than the size listed there.
I would suggest using innodb_file_per_table. See https://dba.stackexchange.com/questions/8982/what-is-the-best-way-to-reduce-the-size-of-ibdata-in-mysql for more details.
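
To compare what the logcache tables report with what ibdata1 actually occupies on disk, something like the following sketch may help. It assumes GNU stat (as on CentOS 7), the default data directory /var/lib/mysql, and the schema name thruk_logs (the name that shows up in an error message later in this thread); the helper name file_size_gib is made up for this example. With a shared tablespace, ibdata1 never shrinks, so the file can be far larger than the sum the query returns.

```shell
#!/bin/sh
# Size of a file in GiB with two decimals (uses GNU stat).
file_size_gib() {
    stat -c %s "$1" | awk '{ printf "%.2f\n", $1 / (1024 * 1024 * 1024) }'
}

# SQL that sums data and index size of all tables in the logcache schema.
# 'thruk_logs' is an assumption taken from the error message in this thread.
LOGCACHE_SIZE_SQL="SELECT ROUND(SUM(data_length + index_length) / POW(1024, 3), 2) AS logcache_gib
FROM information_schema.tables WHERE table_schema = 'thruk_logs';"

# Path is an assumption; override with IBDATA=/other/path if needed.
IBDATA="${IBDATA:-/var/lib/mysql/ibdata1}"
if [ -f "$IBDATA" ]; then
    echo "ibdata1: $(file_size_gib "$IBDATA") GiB"
fi

# Only query the server if a mysql client is installed.
if command -v mysql >/dev/null 2>&1; then
    mysql -e "$LOGCACHE_SIZE_SQL" || echo "could not query MariaDB" >&2
fi
```

Run it as a user that may read /var/lib/mysql and connect to the database, for example root.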

Besides that, the logcache in version 6 should not be much larger, or larger at all, than before. It now uses two time ranges: a short one of 10 weeks by default with all data, and a longer one from 10 weeks to 2 years with only SLA-relevant data. This should save a lot of space. See http://thruk.org/whatsnew/v2.38.html for more details.
If the update does not start automatically, run the import manually with:
%> thruk logcache import --start=-1y

Thank you, Sven, for your answer. I will take care of that. The update has been running for hours, and ibdata1 has been growing for just as long.

Our environment has been growing over the years.
[Screenshot: perf]

I will wait until tomorrow, maybe it “heals” itself :wink:
Otherwise I have to dig deeper.

I have now increased the partition from 60 GB to 180 GB and it is still growing.
I think Thruk has left the path of normal behaviour. This is our production environment, and my only workaround is to increase the disk, which I won't do again once it has used up the last bit.

Any hint as to what I am doing wrong?
Is it possible to drop/truncate all logcache data so that Thruk can rebuild the logcache?

Is there a way to roll back? It is Friday and I want to avoid outages during the weekend.

If I knew, for example, that it simply needs more space, it would be no problem at all to give Thruk everything it needs, but afaik it already has plenty of space?

@Sven, do you have a Patreon or something similar to support you and your effort?

Thanks Thomas

There is not much Thruk can do. This is “normal” behaviour of MySQL with the InnoDB backend and the innodb_file_per_table setting disabled.
You could clean the mysql folder and start the import with a short time period, let's say a week or so.
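
As a rough sketch of what enabling the setting could look like: the function below only prints a my.cnf fragment, which you would review and save yourself. The function name and the drop-in path /etc/my.cnf.d/ are assumptions for a stock CentOS 7 MariaDB install; the option itself is the one named above. Note that the setting only applies to tables created after it is enabled, so data already stored in ibdata1 stays there and the file does not shrink on its own.

```shell
#!/bin/sh
# Print a my.cnf fragment that enables one tablespace file per table.
print_file_per_table_cnf() {
    printf '%s\n' '[mysqld]' 'innodb_file_per_table = 1'
}

print_file_per_table_cnf

# After saving the fragment (e.g. under /etc/my.cnf.d/, an assumed path)
# and restarting the database service, the running value can be checked with:
#   mysql -e "SHOW VARIABLES LIKE 'innodb_file_per_table'"
```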

What exactly do you mean by “clean the mysql folder”? Delete the contents of /var/lib/mysql/?
And then restart the Thruk logcache import with a time period of -1w?

Before Thruk 2.38, /var had 60 GB; now it has 180 GB. If the file keeps growing, can I assume everything is fine and it just needs more space?

Right, I would enable innodb_file_per_table in my.cnf and then run the import so that it covers only one week. Remove the ibdata1 file afterwards; it should no longer be in use by then.

I tried your solution, and MySQL still used the ibdata1 file. So I stopped the httpd, mariadb, … services, renamed the files and started everything again.

Now it created another ibdata1 file and logfiles and I see the following error messages in the thruk.log file

DBD::mysql::db do failed: Can't create table 'thruk_logs.95007_contact' (errno: -1) at /usr/share/thruk/lib/Thruk/Backend/Provider/Mysql.pm line 2205.

With SELinux set to permissive the result is the same, and I don't see any audit log entries about denials related to Thruk.

I think we are not far from the solution; if Thruk can now sync from scratch, then I suppose we are done. I am not sure why it is no longer possible to create the tables for Thruk.

I deleted the database and all related files, then recreated the database, having set innodb_file_per_table beforehand.
I restarted all services, and now it seems that Thruk is creating the logcache correctly and in separate files, as expected.

Let's wait and see :wink: Thanks so far for your help, Sven.
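
For anyone following along, the reset described above could be sketched roughly as below. This is a reconstruction, not the exact commands that were run: the order of steps, the run/reset_logcache helper names and the ib_logfile0/ib_logfile1 file names are assumptions, and the list of services to stop may be longer on other setups. Removing ibdata1 is only reasonable if this MariaDB instance holds nothing but the Thruk logcache, and innodb_file_per_table must already be set in my.cnf. The script only prints the steps unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry run by default: print each step instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reset_logcache() {
    run systemctl stop httpd
    # Schema name taken from the error message in this thread.
    run mysql -e "DROP DATABASE IF EXISTS thruk_logs"
    run systemctl stop mariadb
    # Assumption: no other InnoDB data lives in this instance.
    run rm -f /var/lib/mysql/ibdata1 /var/lib/mysql/ib_logfile0 /var/lib/mysql/ib_logfile1
    # innodb_file_per_table must already be enabled in my.cnf here.
    run systemctl start mariadb
    run mysql -e "CREATE DATABASE thruk_logs"
    run systemctl start httpd
}

reset_logcache
```

Review the printed steps first; a backup of /var/lib/mysql before running with DRY_RUN=0 is a sensible precaution.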

Great to hear it's working now. Disk usage should be much smaller now.
Otherwise I would have suggested using the MyISAM table engine and disabling InnoDB completely.