Kafka Decommission Planning
Hi All,
As we plan to decommission Kafka as a result of moving the product to an AWS multi-account setup, please take note of the following action items:
- The backend tracking service will use Protobuf to avoid a dependency on the Kafka Schema Registry located in the Data Team's VPC (a minimal sketch is shown after this list).
- During the migration, the following stakeholders will need to discuss how to handle each domain's migration (we usually create a separate Slack channel per domain):
- Product engineer
- Backend infra engineer
- Data integration engineer
- Data analyst
- Data leads
- Data engineer
- Affected downstream systems include:
- Kafka consumers
- Dataflow pipelines that may consume data from Google Pub/Sub
- ETL and other services that read S3 data from your topic (ask the relevant Data Integration Engineer for help)
- At this step, you should expect the following outcomes:
- A JIRA ticket to track the migration progress with the Data team
- A list of topics to be migrated (usually in a Google Sheet)
- A list of downstream services that consume your topic
- The Protobuf schema and the JAR generated from it
- Note that we cannot release the new tracking services without permission from the downstream owners.
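
For illustration only, here is a minimal sketch of how a tracking event could be serialized with a Protobuf-generated class and published to Pub/Sub without any Schema Registry lookup. The `TrackingEvent` message, its fields, and the project/topic names are placeholders, not the actual schema; refer to the guideline page linked below for the real setup:

```java
import com.google.cloud.pubsub.v1.Publisher;
import com.google.protobuf.ByteString;
import com.google.pubsub.v1.PubsubMessage;
import com.google.pubsub.v1.TopicName;

// Hypothetical class generated from a .proto file; the real schema ships in the
// JAR listed as an expected outcome above.
import com.example.tracking.TrackingEvent;

public class TrackingPublisher {
    public static void main(String[] args) throws Exception {
        // The schema is compiled into the generated class, so no Schema Registry
        // call to the Data Team's VPC is needed at serialization time.
        TrackingEvent event = TrackingEvent.newBuilder()
                .setEventName("page_view")                      // placeholder field
                .setTimestampMillis(System.currentTimeMillis()) // placeholder field
                .build();

        Publisher publisher = Publisher.newBuilder(
                TopicName.of("my-gcp-project", "tracking-events")).build();
        try {
            PubsubMessage message = PubsubMessage.newBuilder()
                    .setData(ByteString.copyFrom(event.toByteArray()))
                    .build();
            publisher.publish(message).get(); // block until the publish is acknowledged
        } finally {
            publisher.shutdown();
        }
    }
}
```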
More details on how to migrate existing tracking to Protobuf are written here: https://29022131.atlassian.net/wiki/spaces/DATA/pages/1011007676/Tracking+Migration+to+Protobuf+Guidelines.
The exact decommissioning date is not finalized yet, but it will be around Q2 2019. Right now the dependencies on Kafka and Gobblin are our ETL scripts/jobs/etc. that use the raw data, and the product engineering teams that subscribe to Kafka. Product teams have already started moving to Pub/Sub subscriptions and are expected to finish by early/mid Q2 (for the Kafka consumer part); a rough subscriber sketch is included below.
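
As a rough idea of what that consumer-side move could look like, here is a minimal Pub/Sub subscriber sketch in Java; the project and subscription names are placeholders, and the actual deserialization would use the Protobuf-generated classes from the schema JAR:

```java
import com.google.cloud.pubsub.v1.AckReplyConsumer;
import com.google.cloud.pubsub.v1.MessageReceiver;
import com.google.cloud.pubsub.v1.Subscriber;
import com.google.pubsub.v1.ProjectSubscriptionName;
import com.google.pubsub.v1.PubsubMessage;

public class TrackingSubscriber {
    public static void main(String[] args) {
        // Placeholder project and subscription names.
        ProjectSubscriptionName subscription =
                ProjectSubscriptionName.of("my-gcp-project", "tracking-events-sub");

        // Replaces the old Kafka consumer poll loop: Pub/Sub pushes messages to this receiver.
        MessageReceiver receiver = (PubsubMessage message, AckReplyConsumer consumer) -> {
            // Deserialize message.getData() with the Protobuf-generated class here.
            System.out.println("Received " + message.getData().size() + " bytes");
            consumer.ack();
        };

        Subscriber subscriber = Subscriber.newBuilder(subscription, receiver).build();
        subscriber.startAsync().awaitRunning();
        // Keep the subscriber running; a real service would handle shutdown explicitly.
        subscriber.awaitTerminated();
    }
}
```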
Public Slack channel for this: #kafka-decommission