Hi,
So we have around 500 monitors running, and I notice my team completely ignores the 'confirmed anomaly' dashboard and the 'all anomaly' control, which I think are very important. I did a deep dive into 43 of the 69 anomalies reported in 1 day, and all were false positives. Site24x7 keeps reporting traffic spikes (bytes sent or received) and CPU spikes based on BASELINES. While this is similar to what I reported on www.site24x7.com/community/cloudfront-monitoring-ia-thresdhold-not-useful, there's something very specific: not all production environments are the same, and people sleep! :)
We do newspapers and eCommerce, and most of the traffic is between 10am and 10pm. Within that range, you also have peak hours, most likely around 12pm and 6pm. When you build a baseline considering the full 24 hours, it is going to be wrong.
I'm not sure if this is the actual problem that causes all the false positives, but I think for most production environments you should consider different baselines and different anomaly percentages.
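To illustrate what I mean by different baselines, here is a rough sketch. It is only my assumption of how a per-hour baseline could work, not how Site24x7 actually computes it; the function names, the use of a median, and the tolerance percentage are all made up for the example.

```python
from collections import defaultdict
from statistics import median


def hourly_baselines(samples):
    """Build one baseline per hour of day instead of one for all 24 hours.

    samples: iterable of (hour_of_day, value) pairs, e.g. bytes sent per hour.
    Returns a dict mapping hour_of_day -> median of the values seen at that hour.
    """
    by_hour = defaultdict(list)
    for hour, value in samples:
        by_hour[hour].append(value)
    return {hour: median(values) for hour, values in by_hour.items()}


def is_anomaly(hour, value, baselines, tolerance_pct):
    """Flag a value only if it deviates from the baseline of its own hour."""
    base = baselines[hour]
    if base == 0:
        # No traffic expected at this hour: any traffic at all is a deviation.
        return value > 0
    deviation_pct = abs(value - base) / base * 100
    return deviation_pct > tolerance_pct
```

With something like this, normal noon traffic is compared only against other noons, so it does not look like a spike just because the 3am numbers pulled a 24-hour baseline down.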
Dear Hernan,
We apologize for the problems you faced. The primary reason for the issue is the units: the anomalies are reported in bytes, and although the anomaly engine is right, the small unit makes them look like false positives. We are already addressing this at the framework level and will get in touch with you once it is done.
Thank you for the critical feedback. We are working towards getting it right, as it is as important for us as it is for you.
-Jasper
Site24x7 PM
Hi, that's great, and it will reduce the daily anomalies to about 1/3, but we will end up with 20 false positives a day anyway. I think the anomaly detection should consider the reality of the load (off hours, on hours, peak hours) and allow different thresholds for those times.
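Something along these lines is what I have in mind. This is only a sketch: the hour ranges come from our own traffic pattern described earlier (10am to 10pm, with peaks around 12pm and 6pm), while the threshold percentages and all the names are invented for the example and are not Site24x7 settings.

```python
# Hours follow the traffic pattern described in this thread.
PEAK_HOURS = {12, 18}        # around 12pm and 6pm
ON_HOURS = range(10, 22)     # 10am up to (not including) 10pm

# Hypothetical thresholds: allowed deviation from the baseline, in percent.
THRESHOLD_PCT = {"off": 300, "on": 100, "peak": 50}


def load_period(hour):
    """Classify an hour of day (0-23) as 'peak', 'on' or 'off'."""
    if hour in PEAK_HOURS:
        return "peak"
    if hour in ON_HOURS:
        return "on"
    return "off"


def exceeds_threshold(hour, value, baseline):
    """Compare the deviation against the threshold of the hour's load period."""
    if baseline == 0:
        return value > 0
    deviation_pct = abs(value - baseline) / baseline * 100
    return deviation_pct > THRESHOLD_PCT[load_period(hour)]
```

The point is that the same deviation can be ignored at 3am, when absolute numbers are tiny, and still be reported during a peak hour.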
Dear Herman,
The anomaly engine does consider seasonality. We'll check whether there are any issues with it.
-Jasper
That's great. We continue to have around 20 reported anomalies a day from Cloudfront which are basically not anomalies. For us, it continues to fail to account for the changing nature of web traffic, and in some cases we don't think it is working OK (see screenshot). On the Cloudfront console we see a pretty stable hourly graph with no anomalies, but Site24x7 is reporting one.
Also, a lot of the anomalies are from small sites that are growing. It would be great to have a way to filter them out. An anomaly on a website with 139 requests in an hour does not help much :)
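As a rough idea of the filter I am asking for, here is a sketch. The `Anomaly` record, its fields, and the minimum-requests setting are hypothetical; I don't know how Site24x7 represents anomalies internally.

```python
from dataclasses import dataclass


@dataclass
class Anomaly:
    monitor: str            # name of the monitored site (hypothetical field)
    requests_per_hour: int  # traffic volume when the anomaly was raised


def filter_low_volume(anomalies, min_requests_per_hour):
    """Keep only anomalies from monitors with enough traffic to matter."""
    return [a for a in anomalies if a.requests_per_hour >= min_requests_per_hour]
```

For example, with a minimum of 1000 requests per hour, the anomaly on the site with 139 requests would be dropped, while one on a busy site would still show up on the dashboard.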