tvlk-build is a shared AWS account that hosts the Java build and AMI baking pipelines for teams that still reside in tvlk-prod. We have several shared AWS accounts, such as tvlk-prod, tvlk-dev, and tvlk-build. When a team migrates to multi-account, its usage of all these shared accounts moves to its own AWS account, which the team manages itself.
We don't see a need to increase the number of engineers. It is just a redistribution of responsibility from the central team to all engineers. Multi-account sets a clear boundary per product domain, so the blast radius is small and we feel more confident spreading this responsibility to the engineers.
Yes, the central team will shift its focus from maintaining the infrastructure for all product domains to teaching the engineers and spreading the knowledge they need to maintain the infrastructure themselves.
We have not calculated the cost yet, but it will become more visible to the team. In the centralised infrastructure, we cannot do cost attribution very well. Once the cost is more visible, the product team can justify the cost of their product domain, and in some cases they may consider deprecating a product domain because its cost is not worth the value it provides.
Based on the projection, the cost will drop from $8,000/month to around $1,000/month. We also don't see it as limited to one product domain. If this solution can be applied to all accommodation product domains, the reduction will be bigger. This solution might not be low-hanging fruit, though, because researching and exploring it takes effort.
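To make the projection concrete, here is a minimal sketch of the arithmetic. The two monthly figures come from the projection above. The `total_monthly_saving` helper and the idea that every product domain would save the same amount are assumptions for illustration only, not measured data.

```python
# Figures from the projection above (USD per month).
current_monthly = 8000
projected_monthly = 1000

monthly_saving = current_monthly - projected_monthly      # 7000
saving_pct = 100 * monthly_saving / current_monthly       # 87.5
annual_saving = 12 * monthly_saving                       # 84000


def total_monthly_saving(domains: int, per_domain_saving: int = monthly_saving) -> int:
    """Hypothetical helper: assumes every product domain saves the same amount."""
    return domains * per_domain_saving


print(f"Monthly saving: ${monthly_saving} ({saving_pct}%)")
print(f"Annual saving:  ${annual_saving}")
```

The real saving per product domain will differ, so the helper is only a rough way to reason about scale.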
So far, the XPS team is using containers. We can ask them for data to compare the cost reduction.
Currently, RI is managed by Serhiy. Utilisation is very low at the moment because instance usage is in transition after the teams load-tested their applications. We suggested that the teams not move back to the reserved instance types, because we want to establish a new baseline for instance type usage. For more details, you can talk to Serhiy directly. AWS also has a new model: instead of reserving an instance type, we reserve vCPU and RAM.
If accommodation already has a baseline for the instance types they will use, maybe you can talk to Serhiy about it so he can make a more accurate RI projection.
How fast does AWS release a new generation of an instance type? How disruptive is a new generation release to our usage (especially RI)?
It's around 2-3 years, depending on how fast Intel or AMD release new processors. It's not that disruptive; the team can still use the old generation (especially if they have already reserved the RI). The current problem with RI arose because, in the past, teams requested machines without a thorough analysis, and after load testing they realised they did not need instances that big.
It's kind of hard to see in the shared account. We (BEI) don't have visibility, but if we are talking about the cost in the first days of each month, it is most likely the fixed costs such as AWS Config, GuardDuty, CodePipeline, etc.
We are currently researching these tools. By the end of this month, we will have a thought process for how to use them. At a high level, all three tools should be used together. No one tool is more credible than the others; it depends on the need.
As far as we know, all these tools are already enabled in multi-account. However, Cost Recommendation and Compute Optimizer need memory metrics to provide better recommendations. We are currently working on enabling memory metrics to be sent from the EC2 instances.
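As a rough illustration of what sending a memory metric from an instance involves, here is a minimal sketch. It is not our actual setup: the CloudWatch agent is the usual way to collect memory metrics, and the namespace `Custom/Memory`, the metric name `MemoryUsedPercent`, and all function names below are hypothetical. The only AWS call used is boto3's `put_metric_data`.

```python
def memory_used_percent(meminfo_text: str) -> float:
    """Compute the used-memory percentage from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = int(parts[0])  # values are reported in kB
    total = fields["MemTotal"]
    available = fields["MemAvailable"]
    return 100.0 * (total - available) / total


def build_metric_data(instance_id: str, used_percent: float) -> list:
    """Build the MetricData payload for CloudWatch PutMetricData."""
    return [{
        "MetricName": "MemoryUsedPercent",  # hypothetical metric name
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Unit": "Percent",
        "Value": used_percent,
    }]


def publish(instance_id: str, namespace: str = "Custom/Memory") -> None:
    """Read memory usage on a Linux instance and push it to CloudWatch."""
    import boto3  # imported lazily so the helpers above work without AWS access

    with open("/proc/meminfo") as f:
        used = memory_used_percent(f.read())
    boto3.client("cloudwatch").put_metric_data(
        Namespace=namespace,
        MetricData=build_metric_data(instance_id, used),
    )
```

Calling `publish("<instance-id>")` on a schedule (for example from cron) would emit one data point per run. The instance role would need permission to call `cloudwatch:PutMetricData`.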
Fully adopted means the team has migrated all of its infrastructure to its own account. Yes, the resource count should be zero. We also remind them to create a decommission ticket for their resources. If they have not cleaned up their resources in tvlk-prod, yes, they will be charged for it.
It should be useful, especially when they are managing their own infrastructure. Correction: after confirming with Serhiy, he said it is not useful for up-skilling the engineers, but it will be useful for testing the quality of the engineers.
We notice that some engineers are already proactively researching and exploring new technology from AWS. The problem is that they don't know other engineers are doing the same, so the work gets duplicated. If we have centralised documentation and reduce the duplicated work, we can keep up with AWS technology. We don't use every AWS technology; some are not related to us at all. We also usually wait several months before using a new AWS technology. In our experience, it takes several months for a new AWS technology to become stable and reliable.
We have a Confluence space with documentation they can contribute to, but either they don't know it exists or they don't get value from the documentation itself, so they don't see why they should contribute to it.
Currently, we rely only on the analytics from Google Docs/Confluence itself. We don't have any tracking yet to know whether the engineers are really doing the lab. We are mostly still piloting it with engineers who were recently onboarded to multi-account and getting feedback from them.
We try to add as many labels as possible and consider any keywords that might relate to the documentation. Search in Confluence is a bit limited because it only searches on the title and labels of the documentation.
Yes, we have it. Additional: here are the links to the recordings:
AWS Cost Management Q&A: https://tvlk.slack.com/files/T02T3CAFM/FN30GV0JF?origin_team=T02T3CAFM
AWS Security Q&A: https://tvlk.slack.com/files/T02T3CAFM/FMWKAAEGK?origin_team=T02T3CAFM
As for our (BEI) initiatives, accommodation is really good, judging by how proactive they are with them.
Actually, we don't know exactly; we don't have metrics for it yet. What we believe is that technical debt is like a snowball: the more we delay it, the more it will slow us down in the future.
Even when the congestion goes down, we don't think the product team will get more granular and powerful access in the shared account. It might not be worth considering in the shared account. For now, it still makes sense for culinary to stay in tvlk-prod, but please take into consideration that there will be a milestone at which they need to migrate to their own account.