Enterprise Grade Single-Step Streaming Data Infrastructure Setup. (Under Development)
Feedback at this stage will help shape the initial features.
Read more about it on my Blog at Towards Data Science: https://tinyurl.com/yyqr79dh
Oesophagus enables you to deploy an entirely plug-n-play Data Infrastructure to advance your organisation’s data capability.
The architecture consists of:
# Start kafka, connect, schema-registry, ksqldb, ksqlcli, postgres, elasticsearch and automation-scripts
$ docker-compose up -d
# GET Request on Elasticsearch server to test availability
$ curl -f 'localhost:9200'
# Search all indices in Elasticsearch
$ curl -f 'localhost:9200/_search'
Oesophagus Postgres CDC Producer is built to extract, transform and load data from relational databases into downstream databases/services.
It uses the Change Data Capture pattern to read changes from the WAL (Write-Ahead Log) of the source database.
Change Data Capture (CDC), as its name suggests, is a Database Design Pattern that captures individual data changes instead of dealing with the entire data. Instead of dumping your entire database, using CDC, you would capture just the data changes made to the master database and apply them to the BI databases to keep both of your databases in sync. This is much more scalable because it only deals with data changes. Also, the replication can be done much faster, often in near real-time.
Information Source: FlyData
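To make the idea concrete, here is a minimal sketch of applying a stream of captured changes to a downstream copy instead of re-dumping the whole table. The event shape (`kind`, `key`, `row`) and the `apply_changes` helper are hypothetical and chosen only for illustration; they are not the producer's actual message format.

```python
from typing import Any, Dict, Iterable

Row = Dict[str, Any]


def apply_changes(replica: Dict[Any, Row], changes: Iterable[Dict[str, Any]]) -> Dict[Any, Row]:
    """Apply insert/update/delete events, in order, to a replica keyed by primary key.

    Assumption: each event is a dict with a "kind", a primary "key",
    and (for insert/update) a "row" holding the new column values.
    """
    for change in changes:
        kind, key = change["kind"], change["key"]
        if kind == "insert":
            replica[key] = dict(change["row"])
        elif kind == "update":
            # Only the changed columns need to travel downstream.
            replica.setdefault(key, {}).update(change["row"])
        elif kind == "delete":
            replica.pop(key, None)
        else:
            raise ValueError(f"unknown change kind: {kind!r}")
    return replica


# Downstream copy as it looked after the last sync.
replica = {1: {"id": 1, "name": "alice"}, 2: {"id": 2, "name": "bob"}}

# Three changes captured from the source since then.
changes = [
    {"kind": "update", "key": 1, "row": {"name": "alicia"}},
    {"kind": "delete", "key": 2},
    {"kind": "insert", "key": 3, "row": {"id": 3, "name": "carol"}},
]

apply_changes(replica, changes)
print(replica)
```

Only three small events cross the wire here, regardless of how large the source table is, which is where the scalability benefit described above comes from.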
Note: Before starting the service, the wal2json plugin should be installed on your Postgres container to fetch database logs.
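For orientation, wal2json emits each transaction as a JSON document. The sketch below decodes a hand-written sample in wal2json's default output layout (format version 1) into flat events. The `decode_wal2json` helper, the `SAMPLE` payload and the `users` table are assumptions made for illustration, not part of Oesophagus.

```python
import json
from typing import Any, Dict, List

# Hand-written sample in the shape wal2json produces by default (format version 1).
SAMPLE = """
{"change": [
  {"kind": "insert", "schema": "public", "table": "users",
   "columnnames": ["id", "name"], "columntypes": ["integer", "text"],
   "columnvalues": [1, "alice"]},
  {"kind": "delete", "schema": "public", "table": "users",
   "oldkeys": {"keynames": ["id"], "keytypes": ["integer"], "keyvalues": [1]}}
]}
"""


def decode_wal2json(payload: str) -> List[Dict[str, Any]]:
    """Turn one wal2json transaction into a list of flat change events."""
    events = []
    for change in json.loads(payload).get("change", []):
        old = change.get("oldkeys", {})
        events.append({
            "kind": change["kind"],
            "table": f'{change["schema"]}.{change["table"]}',
            # New column values (present for insert/update).
            "row": dict(zip(change.get("columnnames", []), change.get("columnvalues", []))),
            # Identifying key of the old row (present for update/delete).
            "key": dict(zip(old.get("keynames", []), old.get("keyvalues", []))),
        })
    return events


events = decode_wal2json(SAMPLE)
for event in events:
    print(event)
```

Events in this flattened form are straightforward to serialize and publish to a Kafka topic, one message per change.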
producer.json
.