Graphite not showing Data older than a week

metrics
graphite

(Markus Radis) #1

Hi, Graphite is not showing data that is older than 1 week (I only got it that far yesterday - before that it only showed the data of a day). I've searched for a bit and find it quite hard to find any relevant information on this issue. My config looks like this:

[carbon]
pattern = ^carbon\.
retentions = 60:90d
[icinga2_internals]
pattern = ^icinga2\..*\.(max_check_attempts|reachable|current_attempt|execution_time|latency|state|state_type)
retentions = 60s:1d,1m:2d,5m:7d,5m:10d,5m:31d,5m:30d,30m:90d,360m:1y,360m:4y
[icinga2_default]
pattern = ^icinga2\.
retentions = 60s:1d,1m:2d,5m:7d,5m:10d,5m:31d,5m:30d,30m:90d,360m:1y,360m:4y
[default_1min_for_1day]
pattern = .*
retentions = 60s:1d,1m:2d,5m:7d,5m:10d,5m:31d,5m:30d,30m:90d,360m:1y,360m:4y

One of the issues was that the default pattern had to be placed at the bottom - for good measure I added all retentions to all patterns. Still, I only get the data of a week. I already deleted the data from before (mv /var/lib/graphite/whisper/icinga2 /var/lib/graphite/whisper/icinga2_old) and restarted the icinga2 daemon. When trying to access the monthly / yearly data the view won't change (click 1 day, then 1 month, and the same graph is showing).


(Thomas Widhalm) #2

Hi,

Carbon uses the first match when deciding which retention policy to use. It starts at the top of your list, and the retentions of the first pattern that matches are used. The icinga2_internals pattern doesn’t look like valid regex to me. So most of your metrics will use icinga2_default.
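Which section a metric ends up in can be checked by hand. The following is only a minimal sketch of the first-match rule, not carbon's actual code: the section names and patterns are copied from the config above, while the metric paths are made-up examples and may not match what your Icinga 2 really sends.

```python
import re

# Section name -> pattern, in the same order as in the posted config.
SCHEMAS = [
    ("carbon", r"^carbon\."),
    ("icinga2_internals",
     r"^icinga2\..*\.(max_check_attempts|reachable|current_attempt"
     r"|execution_time|latency|state|state_type)"),
    ("icinga2_default", r"^icinga2\."),
    ("default_1min_for_1day", r".*"),
]


def first_match(metric):
    """Return the name of the first section whose pattern matches the metric."""
    for name, pattern in SCHEMAS:
        if re.search(pattern, metric):
            return name
    return None


# Hypothetical metric paths, for illustration only.
for metric in (
    "icinga2.web01.host.hostalive.metadata.state",
    "icinga2.web01.services.load.load.perfdata.load1.value",
    "carbon.agents.graphite-a.metricsReceived",
    "collectd.web01.cpu.idle",
):
    print(metric, "->", first_match(metric))
```

Everything below the first matching section is ignored for that metric, which is why the catch-all `.*` section has to be last.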

Since carbon aggregates older entries, each timeframe should always be a multiple of the one before it. What you have is: 1 minute, 1 minute, 5 minutes, 5 minutes, all with different retention times. What you want is something like

1m:1d,5m:7d,30m:30d,1h:1y,1d:4y

So the first part of each tuple shows the timeframe of the aggregation, the second shows how long to keep it. If you use one timeframe (e.g. 1m) multiple times with different retention times, carbon doesn’t know what to do.

And always use multiples of the predecessors.
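A rough way to check a retentions line against these rules (no timeframe used twice, each timeframe a multiple of the one before) is sketched here. It is a simplified illustration and not whisper's own validation code. The helper names are made up, and as an extra assumption it also flags a retention that does not grow from one archive to the next.

```python
# Seconds per unit suffix as used in retentions like "5m:7d".
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def to_seconds(token):
    """Convert '5m', '60s' or a bare '60' (seconds) into seconds."""
    if token.isdigit():
        return int(token)
    return int(token[:-1]) * UNIT_SECONDS[token[-1]]


def parse_retentions(text):
    """Turn '1m:1d,5m:7d' into [(60, 86400), (300, 604800)]."""
    archives = []
    for part in text.split(","):
        precision, retention = part.strip().split(":")
        if retention.isdigit():
            # A bare retention is a point count in carbon; not handled here.
            raise ValueError("retention without a unit is not supported in this sketch")
        archives.append((to_seconds(precision), to_seconds(retention)))
    return archives


def problems(text):
    """List rule violations between neighbouring archives."""
    found = []
    archives = parse_retentions(text)
    for (p1, r1), (p2, r2) in zip(archives, archives[1:]):
        if p2 == p1:
            found.append(f"timeframe {p1}s is used twice")
        elif p2 % p1 != 0:
            found.append(f"{p2}s is not a multiple of {p1}s")
        if r2 <= r1:
            found.append(f"retention {r2}s does not extend beyond {r1}s")
    return found


original = "60s:1d,1m:2d,5m:7d,5m:10d,5m:31d,5m:30d,30m:90d,360m:1y,360m:4y"
suggested = "1m:1d,5m:7d,30m:30d,1h:1y,1d:4y"

for line in (original, suggested):
    print(line)
    for issue in problems(line) or ["no problems found"]:
        print("   ", issue)
```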


(Markus Radis) #3

Thanks, that helped a lot - I forgot to mention that I need the following times:

10m,1h,12h,1d,1w,1m,1y

Is the config you provided going to work for that as well? Or only for 1d/7d/30d/1y/4y?


(Thomas Widhalm) #4

Every aggregation timeframe has to be a multiple of every one of the timeframes you used before.

10m is 10*1m and 2*5m so this is ok. You just have to decide how long you want to keep these retentions. I’d go for not too many timeframes. Remember, these are only aggregation levels. Theoretically you could use 1m:4y (please don’t do this, you will end up with unreasonably huge files) and zoom in and out just as you wish.
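To put a number on "unreasonably huge", here is a back-of-the-envelope estimate per whisper file. It assumes whisper's usual on-disk layout of 12 bytes per datapoint, a 16-byte file header and 12 bytes of header per archive; the function name is made up for this sketch.

```python
# Seconds per unit suffix as used in retentions like "5m:7d".
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}

# Assumed whisper layout: 4-byte timestamp + 8-byte value per point.
POINT_BYTES = 12
FILE_HEADER_BYTES = 16
ARCHIVE_HEADER_BYTES = 12


def to_seconds(token):
    """Convert '5m' or '4y' into seconds."""
    return int(token[:-1]) * UNIT_SECONDS[token[-1]]


def whisper_file_bytes(retentions):
    """Estimate the size of one whisper file for a retentions string."""
    total = FILE_HEADER_BYTES
    for part in retentions.split(","):
        precision, retention = part.strip().split(":")
        points = to_seconds(retention) // to_seconds(precision)
        total += ARCHIVE_HEADER_BYTES + points * POINT_BYTES
    return total


for schema in ("1m:4y", "1m:1d,5m:7d,30m:30d,1h:1y,1d:4y"):
    size = whisper_file_bytes(schema)
    print(f"{schema}: about {size / 1024 / 1024:.2f} MiB per metric")
```

Keep in mind that this is per metric, and every performance data label of every host and service gets its own file.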

The first part of a 1m:1d tuple states the aggregation level. Let me explain with a part from my example above:

1m:1d,5m:7d

Means:

  • One datapoint every minute for 1 day
  • After this day, 5 of these datapoints are calculated into one aggregation (by default it’s the mean: all summed up and divided by 5).
  • These aggregations will be saved for 7 days
  • After these 7 days, the already aggregated values will be aggregated again
  • Once data is older than the retention of the last aggregation level, it is deleted from the database
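The aggregation step from the list above can be sketched in a few lines. This only illustrates the default mean with made-up values; it is not how whisper stores or schedules the aggregation internally.

```python
def aggregate(points, factor):
    """Average each full run of `factor` consecutive points into one value."""
    return [
        sum(points[i:i + factor]) / factor
        for i in range(0, len(points) - factor + 1, factor)
    ]


# Ten made-up one-minute datapoints become two five-minute datapoints.
minute_values = [2.0, 4.0, 6.0, 8.0, 10.0, 1.0, 1.0, 1.0, 1.0, 1.0]
print(aggregate(minute_values, 5))  # [6.0, 1.0]
```

The single spike of 10.0 is no longer visible after aggregation, which is what "you cannot zoom in deeper than the aggregation timeframe" means in practice.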

So the aggregation timeframe means you cannot zoom in deeper than the aggregation timeframe. You can always zoom out. Using an aggregation timeframe of 1y would only be reasonable if you wanted to keep your data for, say, 100 years.

Remember that every time you change the storage schema, this only affects newly created whisper files. You will have to transform older files if you want them to use the new schema as well. This might not be lossless. In most cases it’s just easier to delete the whisper files already created.


(Markus Radis) #5

Thanks for the explanation! Helped a lot! Seems to be working now.


(Markus Radis) #6

@widhalmt The issue still exists - now I can see the scope, but no data (graphs) is showing for anything older than a day. Before the change I could not even view the scope.


(Thomas Widhalm) #7

Could it be that you reset Carbon just a day ago? Old data is not preserved when you delete the whisper files. Icinga 2 writes only new data.


(Markus Radis) #8

Nope, I did it on the day you helped me. I'm not sure, but I think on that day I was able to see graphs in the other scopes - just really short ones. Shouldn't I be able to see parts of them even though I deleted all the data a few days ago?


(Thomas Widhalm) #9

You should see all data since the last time you deleted the files.

Could you send me your current storage schema configuration? Did you restart carbon and delete the whisper files after you set the storage schema?


(Markus Radis) #10

[carbon]
pattern = ^carbon\.
retentions = 60:90d

[icinga2_internals]
pattern = ^icinga2\..*\.(max_check_attempts|reachable|current_attempt|execution_time|latency|state|state_type)
retentions = 60s:1d,1m:2d,5m:7d,5m:10d,5m:31d,5m:30d,30m:90d,360m:1y,360m:4y

[icinga2_default]
pattern = ^icinga2\.
retentions = 1m:1d,5m:7d,30m:30d,1h:1y,1d:4y

[default_1min_for_1day]
pattern = .*
retentions = 1m:1d,5m:7d,30m:30d,1h:1y,1d:4y

I did move all whisper data icinga2 -> icinga2old but did not restart the carbon service. Is the service called “carbon-cache”? I just restarted it (it's not showing any difference yet, maybe not enough time has passed).


(Thomas Widhalm) #11

Hi,

I think there is a bit of a misunderstanding about how Graphite works.

The software that takes the data from Icinga 2 and writes it into “whisper” databases is called carbon-cache (or sometimes “carbon” for short).

When you start carbon, it reads the storage schema from its configuration files. Every time data arrives from a new source (a new host or service monitored by Icinga 2), it creates a whisper file whose schema does not change afterwards. Only new values are written to this file, but the schema stays the same. In my postings before I showed you how to change the storage schema so it will work. What you sent me just now won’t work. It seems that your “icinga2_internals” schema matches. icinga2_default might not be used at all.

After you changed the storage schema you have to restart the carbon cache process so it will use this schema for new whisper files it creates.

After the restart, delete or move all whisper files away. The data in them is lost. New data will be written with the new storage schema. So for some time you will not see anything. One after the other the whisper files will be recreated with the new schema. You will see all data from when you restarted until the time you look at it. Nothing that happened before.

Be aware that the shortest interval (the first one) of your storage schema must match the check_interval of every check the schema's pattern matches. So if you use your icinga2_default, then all checks have to be executed every minute. If they aren’t, please adjust your storage schema.
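One way to see why the first interval matters: when a check runs less often than the first interval, most slots in the first archive stay empty, and too many empty slots make the aggregated value empty as well. The sketch below assumes whisper's default xFilesFactor of 0.5 (at least half of the slots must be filled), which can be changed in storage-aggregation.conf. The function name and the values are made up.

```python
def aggregate_slot(values, x_files_factor=0.5):
    """Average the known values, or return None if too few slots are filled."""
    known = [v for v in values if v is not None]
    if not values or len(known) / len(values) < x_files_factor:
        return None
    return sum(known) / len(known)


# 1m:1d,5m:7d with a check every minute: all five one-minute slots are filled.
every_minute = [1.0, 2.0, 3.0, 2.0, 2.0]

# Same schema with a check every five minutes: only one slot in five is filled.
every_five_minutes = [1.0, None, None, None, None]

print(aggregate_slot(every_minute))        # 2.0
print(aggregate_slot(every_five_minutes))  # None
```

With the second pattern the one-day view still has data, but every longer timeframe stays empty, which looks exactly like "no data older than a day".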


(Markus Radis) #12

Ok, thanks - seems that I was a bit confused and thought you said the icinga2_internals won't be used anyway - changed it though.

You stated that the first value must match the check_interval. The problem is that I have different check intervals for hosts and services. Hosts will be checked every 2 minutes and services every 5 minutes - which one should I use?

And if I set it to 5 minutes I need to change the second value as well, right?

EDIT: Just set it to

[icinga2_internals]
pattern = ^icinga2\..*\.(max_check_attempts|reachable|current_attempt|execution_time|latency|state|state_type)
retentions = 5m:1d,10m:7d,30m:30d,1h:1y,1d:4y

Seems to be working for now :slight_smile:


(Thomas Widhalm) #13

Hi,

Great that it’s working. Is it still working?

If you have several check_interval settings (and you will want that) you need to have different storage schemas. That’s what the pattern part is for. But remember: First match wins, so you will have to add more specific ones at the top of the file, less specific ones at the bottom.

Cheers,
Thomas


(Markus Radis) #14

Yes, still working - Thanks Thomas :slight_smile:


(Thomas Widhalm) #15

Great!

Would you be so kind to mark one of the replies as solution, then? :smile:


(Markus Radis) #16

Just realized that Graphite does not seem to work properly: some graphs are drawn normally, some have not been drawn for 2 weeks, some for 1 week, and for a few there is no data at all… ideas?


(Thomas Widhalm) #17

Are there any differences that could lead to the source of the problems? Like different check intervals, different storage schemas matching? IIRC you should see which of your storage schemas is used for creating the whisper file in the log of carbon.


(Markus Radis) #18

Just remembered that I changed the check interval for check_load / memory / disk to 1 hour and these checks stopped showing graphs for the Month/Year tabs in Graphite. I wouldn't like to change them back to 5 minutes because checking that often for 1000+ hosts generates a lot of unnecessary overhead. We would like to have graphs to check how the hosts behave over a long time.


(Thomas Widhalm) #19

Then you have to change your storage schema for these checks. Use the pattern to match the path of the whisper files you want to change and set a new schema accordingly. Remember that the first match wins, so this has to be placed above the more general schemas.
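As a rough illustration of such a layout, the sketch below parses a small schema file and applies the first-match rule to it. Everything in the embedded config is a hypothetical example: the section name icinga2_hourly_checks, its pattern (which assumes the services are literally named load, memory and disk), the 1h:90d,1d:4y retentions and the metric paths. Adjust all of them to your own setup.

```python
import configparser
import re

# Hypothetical storage-schemas.conf: most specific section first, catch-all last.
SCHEMA_TEXT = r"""
[icinga2_hourly_checks]
pattern = ^icinga2\..*\.services\.(load|memory|disk)\.
retentions = 1h:90d,1d:4y

[icinga2_default]
pattern = ^icinga2\.
retentions = 5m:1d,10m:7d,30m:30d,1h:1y,1d:4y

[catch_all]
pattern = .*
retentions = 1m:1d,5m:7d,30m:30d,1h:1y,1d:4y
"""

parser = configparser.ConfigParser(interpolation=None)
parser.read_string(SCHEMA_TEXT)


def retentions_for(metric):
    """Return (section, retentions) of the first section matching the metric."""
    for section in parser.sections():
        if re.search(parser[section]["pattern"], metric):
            return section, parser[section]["retentions"]
    return None


# Made-up metric paths, for illustration only.
for metric in (
    "icinga2.web01.services.load.load.perfdata.load1.value",
    "icinga2.web01.services.ping4.ping4.perfdata.rta.value",
):
    print(metric, "->", retentions_for(metric))
```

If the hourly section were placed below icinga2_default, it would never be reached, because `^icinga2\.` already matches every Icinga 2 metric.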


(Markus Radis) #20

Looks like this now:


Changed the 30m:30d to 60m:30d - do I have to delete all the graphs now? (Still being a noob with Graphite.) You suggested changing it for the specific checks - what would that look like?