On-call Report 8-14 Aug
Action Items
Inquiry
Prod
- Inquiry to check the review schedule -- a review was not requested although the visit date has passed
- Item price shown in the product detail is not the cheapest one
- A customer claimed that 3 out of 9 vouchers are marked as redeemed although they haven't redeemed them
- Demand - Help provide impacted item/product ID as additional data point
- XS-529 Photo sharpness on frontend
- Clue: size and quality processing for each image is exposed via the API
- Homework: find the setting we are currently using
- XS-530 Cannot remove photo + photo duplicate on live site
- From AXES, there's a nested HTML tag <h6><img/></h6>
- Current code doesn't support it yet.
- Xperience Review content minimum length
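For XS-530 above, the report only says the current code does not handle an image wrapped in another tag such as <h6><img/></h6>. The sketch below is one way to collect images regardless of nesting depth, using Python's standard `html.parser`. It is not the existing implementation: the class and function names, the use of the `src` attribute, and the sample file names are all assumptions for illustration.

```python
from html.parser import HTMLParser


class ImgCollector(HTMLParser):
    """Collect the src of every <img>, however deeply it is nested (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Fires for <img ...>; self-closing <img/> is forwarded here by the
        # default handle_startendtag implementation.
        if tag == "img":
            src = dict(attrs).get("src")
            if src is not None:
                self.sources.append(src)


def extract_image_sources(html):
    """Return the src values of all images in document order."""
    parser = ImgCollector()
    parser.feed(html)
    parser.close()
    return parser.sources


# Assumed sample markup: one image nested in <h6>, one nested in <p>.
sample = '<h6><img src="a.jpg"/></h6><p><img src="b.jpg"></p>'
sources = extract_image_sources(sample)
```

Because the parser reacts to the `img` tag itself rather than to its parent, the wrapping element does not matter, which is the property the nested case needs.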
Staging
- SHS cannot search & book
- Search: xpepapi ran out of memory. Solved by restarting
- Booking: POT has been down since 15:09 JKT
- trppapi circuit breaker tripped
- Can not connect to upstream: POST https://trppapi.trp.stg-tvlk.cloud/id-id/v2/trip/booking/createBooking giving up after 4 attempt(s): unexpected HTTP status 500 Server Error
- Applies only to Web because there's a proxy between web & backend.
- Reason: xpebook threw so many exceptions that trppapi declared our service unhealthy.
- The cause is an NPE in the code
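To illustrate why repeated exceptions from xpebook ended with trppapi refusing calls, here is a minimal circuit breaker sketch. It is not trppapi's actual implementation, which this report does not describe: the class names, the consecutive-failure threshold, and the reset timeout are assumptions chosen for illustration.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    """Hypothetical breaker that trips after N consecutive failures."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # assumed value
        self.reset_timeout = reset_timeout          # assumed value, in seconds
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the breaker is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Still open: fail fast without touching the downstream service.
                raise CircuitOpenError("downstream marked unhealthy")
            # Timeout elapsed: let one trial call through (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # trip, or re-trip after a failed trial
            raise
        # A success closes the breaker and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Under this model, a bug that throws on every request (such as the NPE) trips the breaker quickly, and healthy requests are rejected too until a trial call succeeds after the timeout.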
Pager Duty
- [10 Aug] High latency in booking flow, potentially caused by RWS provider being down
- POT team response: based on this thread, the RWS API appears to be down, which might be the cause since there's a spike in RWS latency in Datadog
- [11 Aug] XPE search & booking unavailable 07:00-07:30 https://app.datadoghq.com/incidents/320
- 1. Check xpesrch -> POT
- 2. Check POT dashboard
- 3. Report & attend war room
- Conclusion: depleted connection pool from application to database
- [11 Aug] XPE high latency mixed with exceptions from POT 23:00-23:59
- High latency from BMG -- inconclusive, still pending investigation from POT
- [13 Aug] Xpedata low memory
- No system impact
- Restart the service to silence the alarm
Pending Tasks