[PSA] Automatic Incident Creation for AWS ALB Healthy Host Count
What is this about?
To improve our automatic incident detection, the BEI team will add a centralized Datadog monitor for the ALB healthy host count metric (aws.applicationelb.healthy_host_count). The monitor will automatically create a Datadog incident and page the respective team when the ALB healthy host count is < 1.
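For teams curious what such a monitor might look like, here is a minimal sketch in Python. It is only an illustration: the evaluation window, the `max` aggregation, the `loadbalancer` grouping tag, and the helper names `build_monitor_query` and `should_alert` are all assumptions, and the actual query and threshold are still to be finalized by the BEI team.

```python
# Hypothetical sketch only: the real query, window, and tags are decided by the BEI team.
METRIC = "aws.applicationelb.healthy_host_count"
THRESHOLD = 1  # alert when the healthy host count is below this value


def build_monitor_query(window: str = "last_5m",
                        group_by: tuple[str, ...] = ("loadbalancer",)) -> str:
    """Build a Datadog metric-monitor query string for the healthy host count.

    The window and group-by tag are assumed defaults, not the confirmed setup.
    """
    groups = ",".join(group_by)
    return f"max({window}):max:{METRIC}{{*}} by {{{groups}}} < {THRESHOLD}"


def should_alert(healthy_host_counts: list[float]) -> bool:
    """Mirror the query locally: alert only if every point in the window is below 1.

    Using max() means a single healthy data point in the window prevents the alert.
    An empty window (no data) is treated as "do not alert" in this sketch.
    """
    if not healthy_host_counts:
        return False
    return max(healthy_host_counts) < THRESHOLD


if __name__ == "__main__":
    print(build_monitor_query())
    print(should_alert([0, 0, 0]))
```

Taking the maximum over the window is one possible design choice: it avoids paging on a single missed data point, at the cost of alerting slightly later.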
Who is this announcement for?
All teams whose services run behind an AWS Application Load Balancer (ALB).
Impacted AWS Accounts
- tvlk-acd-prod
- tvlk-afc-prod
- tvlk-afi-prod
- tvlk-asi-prod
- tvlk-ast-prod
- tvlk-ath-prod
- tvlk-bei-prod
- tvlk-cnt-prod
- tvlk-coi-prod
- tvlk-con-prod
- tvlk-cri-prod
- tvlk-ctv-prod
- tvlk-cxp-prod
- tvlk-ecb-prod
- tvlk-eci-prod
- tvlk-ewl-prod
- tvlk-fpr-prod
- tvlk-fsp-prod
- tvlk-gmf-prod
- tvlk-gtr-prod
- tvlk-gvo-prod
- tvlk-hcn-prod
- tvlk-ins-prod
- tvlk-ipi-prod
- tvlk-loc-prod
- tvlk-mch-prod
- tvlk-mfc-prod
- tvlk-msg-prod
- tvlk-pay-prod
- tvlk-pkg-prod
- tvlk-pts-prod
- tvlk-rec-prod
- tvlk-sec-prod
- tvlk-srs-prod
- tvlk-tqs-prod
- tvlk-trp-prod
- tvlk-txt-prod
- tvlk-ugc-prod
- tvlk-usc-prod
- tvlk-usr-prod
- tvlk-vcp-prod
- tvlk-vsa-prod
- tvlk-web-prod
- tvlk-xpe-prod
- tvlk-xps-prod
- tvlk-xxt-prod
What do you need to do?
As of now, no action is required from your side.
Timeline
- 24 Oct - 3 Nov: BEI team will create and trial the Datadog monitor query and threshold
- 4 Nov: BEI team will integrate the Datadog monitor with the automatic incident creation
Questions/Concerns?
Should you have any questions or concerns, feel free to reach out to @U7FC1KYER, @U02UU1QE7, and @U08G58KL3 in #C03A4ENFK, #techops-<vertical>, or directly in this thread.