A fast data feed designed for machine consumption
DataHub is an experimental high-throughput data feed designed for machine
consumption.
I need a centralized data feed with a defined format so that distributed
applications can consume and respond to events.
Redis instances use a single thread, which is a good fit for robotics
applications running on Raspberry Pis. Although a single Redis instance
offers exceptional performance, Redis can be scaled out as application
usage grows.
The API design is optimized for scaling behind a load balancer in architectures where many separate systems transmit information to a central endpoint. This design means that only a single, unidirectional connection (from the sensor systems to the API) is required, which is optimal when data is being transmitted from devices that aren't publicly accessible on a network.
Data should be sent to the API as a POST request with a JSON body,
for example readings from a GPS sensor.
The application that sends the data determines when it expires.
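A minimal sender sketch is below. The field names ("sensor", "data", "expires_at") and the endpoint URL are assumptions for illustration, since the actual DataHub schema isn't shown here; the sender sets the expiry itself, as described above.

```python
import json
import time
import urllib.request

# Hypothetical event payload; field names are illustrative,
# not the actual DataHub schema.
event = {
    "sensor": "gps-01",
    "data": {"lat": 37.7749, "lon": -122.4194},
    # The sending application decides when the data expires
    # (here: 30 seconds from now, as a Unix timestamp).
    "expires_at": time.time() + 30,
}

body = json.dumps(event).encode("utf-8")

# Build the POST request; the endpoint URL is an assumption.
req = urllib.request.Request(
    "http://localhost:8000/events",
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req)  # uncomment to actually send
```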
For 1000 requests sent using tests/benchmark.py:
Post time: 3.70 seconds
Get time: 1.80 seconds
Total execution time: 5.54 seconds
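From the timings above we can back out an approximate per-request throughput. This is a rough check only, since the benchmark's concurrency settings aren't stated here:

```python
# Approximate throughput implied by the benchmark timings above.
posts_per_sec = 1000 / 3.70  # POST time: 3.70 s for 1000 requests
gets_per_sec = 1000 / 1.80   # GET time: 1.80 s for 1000 requests

print(f"{posts_per_sec:.0f} posts/s, {gets_per_sec:.0f} gets/s")
# → 270 posts/s, 556 gets/s
```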
The following projects the number of requests per second that the system
will need to handle, based on a hypothetical robotics project.
So about 125 requests per second for incoming event data.
Let’s also estimate that there will be 50 subscribers reading
the incoming data at a rate of 1 request per second.
That gives us a grand total of 175 requests per second that this
system needs to be able to handle. This is definitely doable.
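The estimate can be sanity-checked against the benchmark timings above. This is a coarse comparison, since POSTs and GETs have different costs, but it shows the projected load sits well below the measured single-instance rate:

```python
incoming = 125           # projected event POSTs per second
subscribers = 50 * 1     # 50 subscribers, each polling once per second
total = incoming + subscribers

# ~270 requests/s measured for POSTs by tests/benchmark.py
measured_post_rate = 1000 / 3.70

print(total, total < measured_post_rate)
# → 175 True
```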