In today's connected world, websites, mobile apps, and IoT devices collect a large volume of users’ personally identifiable activity data. This data is used for a variety of purposes, such as analytics, marketing, and personalisation of services. It is gathered through site cookies, tracked device IDs, embedded JavaScript, and tracking pixels, to name a few. Much of this tracking and use of the collected data happens behind the scenes and is not apparent to the average user. Consequently, many countries and regions have formulated legislation (e.g. …

This is a quick note on the key considerations I have weighed over the years while designing and developing large-scale, distributed, real-time data processing systems. These principles follow evolving usage patterns and use cases.

  1. Processing Semantics
  • “Exactly-once” or “at-least-once” processing semantics. Choose the semantics based on the needs of your use case; different semantics can be configured for each use case.
  • Consistency (a mix of strict and eventual) over Availability. Refer to the CAP theorem, and remember that you must account for network partitions.
  • In-Memory Data Grids (IMDGs) or a caching layer that supports transactions, preferably ACID compliant…
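
To make the difference between the two processing semantics concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any particular streaming framework does it: the `Event`, `NaiveCounter`, `IdempotentCounter`, and `deliver_at_least_once` names are all hypothetical. It simulates an at-least-once delivery in which one event arrives twice, and shows how a consumer that deduplicates on a producer-assigned event ID can get an effectively exactly-once result.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    event_id: str  # hypothetical unique ID assigned by the producer
    amount: int


class NaiveCounter:
    """Applies every delivery, so redelivered events are counted twice."""

    def __init__(self) -> None:
        self.total = 0

    def process(self, event: Event) -> bool:
        self.total += event.amount
        return True


@dataclass
class IdempotentCounter:
    """Applies each event ID at most once, even if it is delivered repeatedly."""

    total: int = 0
    seen_ids: set = field(default_factory=set)

    def process(self, event: Event) -> bool:
        if event.event_id in self.seen_ids:
            return False  # duplicate delivery: already applied, skip it
        # Assumption: in a real system the state update and the ID record
        # would have to be committed atomically, e.g. in one transaction.
        self.total += event.amount
        self.seen_ids.add(event.event_id)
        return True


def deliver_at_least_once(events, handler, redeliver_ids):
    """Simulates a broker that redelivers some events (hypothetical helper)."""
    for event in events:
        handler(event)
        if event.event_id in redeliver_ids:
            handler(event)  # second delivery, e.g. after a lost acknowledgement


events = [Event("a", 10), Event("b", 5), Event("c", 7)]

naive = NaiveCounter()
deliver_at_least_once(events, naive.process, redeliver_ids={"b"})

idempotent = IdempotentCounter()
deliver_at_least_once(events, idempotent.process, redeliver_ids={"b"})

print(naive.total)       # 27: event "b" was counted twice
print(idempotent.total)  # 22: the duplicate of "b" was ignored
```

The deduplication set grows without bound in this sketch. A production design would likely bound it, for example by expiring IDs after a retention window, and would keep it in a transactional store such as the caching layer mentioned above.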

Subhadip Mitra

Distributed Systems, Artificial Intelligence, Blockchain, Theoretical Physics, Open Source
