Our objective at BDS Bynfo is to provide our customers with continuous value through the data and analytics services we implement. It is therefore crucial that architectures and services are well planned and that proper orchestration is in place. DataOps gives us standardised processes and is one of the cornerstones of the BDS Bynfo Way.
DataOps is an operating model for developing and operating data and analytics solutions. Among other things, this is done by automating routine workflows designed to monitor and control quality.
The goal of DataOps is the continuous delivery of analytics, whether that means data, models, or any related artefact. By following DataOps principles, insights are delivered to customers faster and with higher quality.
The intellectual heritage of DataOps comes from three different sources.
One crucial element of data analytics is managing and orchestrating the data pipeline. Data continuously enters the pipeline on one side, passes through a series of steps, and exits on the other side in the form of reports, models, and views. This can be thought of as the operations side of analytics.
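The idea of data flowing through a series of orchestrated steps can be sketched as a minimal pipeline. The step names and sample data below are purely illustrative assumptions, not part of any specific tooling:

```python
# A minimal sketch of a data pipeline: raw data enters on one side,
# passes through orchestrated steps, and exits as a report-like summary.

def extract():
    # Raw source data entering the pipeline (illustrative values).
    return [{"customer": "A", "sales": 120}, {"customer": "B", "sales": 95}]

def transform(rows):
    # An intermediate orchestrated step: derive a new field.
    return [{**r, "sales_eur": float(r["sales"])} for r in rows]

def load(rows):
    # The far end of the pipeline: a report/view-style summary.
    return {"total_sales": sum(r["sales_eur"] for r in rows)}

report = load(transform(extract()))
print(report)  # {'total_sales': 215.0}
```

In a real environment each step would typically be a scheduled, monitored job rather than a plain function call, but the ordered hand-off between steps is the same.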
SPC, or statistical process control, in this context measures and monitors the data and the operational characteristics of the data pipeline, ensuring that statistics remain within acceptable ranges. The role of SPC is to verify that the pipeline flow is working and to alert the data analytics team when anomalies occur.
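A common SPC-style check is to compare a batch statistic, such as the incoming row count, against control limits derived from historical batches. The following is a hedged sketch of that idea; the three-sigma limits, the sample history, and the function names are assumptions for illustration:

```python
import statistics

def control_limits(history):
    # Control limits as mean +/- 3 standard deviations of past batches.
    mean = statistics.mean(history)
    sigma = statistics.stdev(history)
    return mean - 3 * sigma, mean + 3 * sigma

def check_batch(row_count, history):
    # Flag a batch whose row count drifts outside the control limits.
    low, high = control_limits(history)
    if not (low <= row_count <= high):
        # In a real pipeline this would alert the data analytics team.
        return f"ALERT: row count {row_count} outside [{low:.0f}, {high:.0f}]"
    return "OK"

history = [1000, 1020, 990, 1010, 995]  # row counts of recent batches
print(check_batch(1005, history))  # within limits
print(check_batch(400, history))   # anomaly, triggers an alert
```

The same pattern applies to any measurable characteristic of the pipeline, such as processing time or null-value ratios.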
In DataOps the lifecycle consists of two active, intersecting pipelines. The first is the Value pipeline, which takes in raw source data and, after a series of orchestrated steps, produces analytic insights that create value for the organisation. The orchestration is automated, and SPC monitors quality throughout the pipeline. The second, the Innovation pipeline, is the process by which new analytic ideas are introduced into the Value pipeline.
There are two orchestrations in the DataOps process. The first is a replica of the data pipeline, used to test and verify new analytics before they are deployed into production. The second is the Value pipeline's orchestration, the data analytics operations, which consists of the series of steps that turn data into value.
Testing plays a major role in both the Value and Innovation pipelines. In the Innovation pipeline, testing focuses on new analytics features before they are deployed to production. New tests are created to verify that the new feature works as intended, while the existing tests that currently monitor the Value pipeline ensure that production will not be broken by the new feature.
In the Value pipeline, tests monitor the data values flowing through the pipeline to catch anomalies and flag data outside statistical norms. Once the new feature is deployed to production, the tests created in the Innovation pipeline are moved to the Value pipeline to verify that the feature continues to work as intended.