Process Data Lake: a Data Lake designed for industrial operations

Process Data Lake principle: from data sources to uses and analyses


The Process Data Lake is a concept that we have developed over many years of experience in processing industrial data. Its objective is to simplify the handling of data relating to production processes by industrializing and automating the entire processing chain for this type of information.

Data Lake, because it is a solution that collects, stores and organizes data to make it available to different groups of users: operations staff, R&D teams, process experts, data specialists…

Process, because our solution natively embeds many concepts found in industrial processes that are essential for processing data efficiently in these activities.

With the Process Data Lake, digitalize your industrial production today

All of these elements of our Process Data Lake allow you to:

  • Reduce the implementation time of your plant digitalization projects.
  • Facilitate the use of data by your teams on the shop floor.
  • Take advantage of the full power of our infrastructure to handle your data at the best cost.
  • Be agile and keep control of your data processing.
  • Benefit from a centralized data source for your production processes, thanks to its capacity and its openness.

A way to structure industrial data according to business specificities

The Process Data Lake natively supports the following concepts related to the variety of data encountered on an industrial site:

  • Time series data (sensors, quality controls, etc.) and their subtleties, such as resampling, interpolation, extrapolation and realignment of non-synchronous series.
  • Data related to batches, operations, campaigns, cycles (recipe, indicators, team, tools, …)
  • Data related to events (scheduled or unscheduled shutdowns, alerts, change of tool, etc.)
  • Traceability data (where, when and how an operation or a batch was carried out) and genealogy (how the different unit operations are linked together, how an operation uses one or more batches from previous operations…)
  • Measurements and units: we offer a standard framework for keeping the manipulated data consistent, and for tracing and automatically converting units for storage and for presentation to users.
  • Description and labeling of data to make it easier to identify and group (site, equipment, line, etc.), with simultaneous support for several languages.
  • Management of data access rights for partitioned access to data according to users.
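
To illustrate the realignment of non-synchronous series mentioned in the list above, here is a minimal sketch in plain Python. It is not the Process Data Lake implementation: the function names (interpolate_at, realign), the sensor names and the sample values are all assumptions made for the example, and only linear interpolation without extrapolation is shown.

```python
from bisect import bisect_left


def interpolate_at(times, values, t):
    """Linearly interpolate a series (times sorted ascending) at instant t.

    Returns None outside the observed range: this sketch does not extrapolate.
    """
    if not times or t < times[0] or t > times[-1]:
        return None
    i = bisect_left(times, t)
    if times[i] == t:
        return values[i]
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def realign(series_by_name, grid):
    """Resample several non-synchronous series onto one shared time grid."""
    rows = []
    for t in grid:
        row = {"t": t}
        for name, (times, values) in series_by_name.items():
            row[name] = interpolate_at(times, values, t)
        rows.append(row)
    return rows


# Hypothetical sensors sampled at different instants (time in seconds).
temperature = ([0, 10, 20], [20.0, 22.0, 26.0])
pressure = ([5, 15, 25], [1.0, 1.2, 1.1])

rows = realign({"temperature": temperature, "pressure": pressure}, grid=[5, 10, 15, 20])
for row in rows:
    print(row)
```

Each row then holds one value per sensor at the same instant, which is what makes the two series directly comparable.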

Continuous data processing

The Process Data Lake does not stop there: it also integrates a full computation layer that combines data and turns it into “information”, relying on the structuring in place. In particular, it centralizes calculations with a wide range of functions (mathematical, logical, statistical, or specific to the problems encountered in the process industries…). For example, this layer makes it possible to:

  • Calculate indicators associated with a production batch by combining traceability and sensor data.
  • Calculate a material balance by combining sensor data (for example flow rates) and analysis laboratory data (for example concentrations).
  • Aggregate parameters over time or across parameters according to their characteristics (for example to compare production lines with each other or to draw up energy balances).
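
The first two calculations in the list above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the product's computation layer: the batch records, flow readings, concentrations, units and function names are all hypothetical, and a reading taken exactly at a batch boundary is counted in both batches for simplicity.

```python
from statistics import mean

# Hypothetical traceability records: batch id with start and end (minutes).
batches = [
    {"batch": "B1", "start": 0, "end": 30},
    {"batch": "B2", "start": 30, "end": 60},
]
# Hypothetical flow sensor readings: (time in minutes, flow in m3/h).
flow = [(0, 10.0), (10, 12.0), (20, 14.0), (30, 8.0), (40, 8.0), (50, 10.0), (60, 12.0)]
# Hypothetical laboratory results: concentration per batch, in kg/m3.
concentration = {"B1": 2.0, "B2": 3.0}


def readings_in(series, start, end):
    """Keep the readings that fall inside the batch time window (inclusive)."""
    return [(t, v) for t, v in series if start <= t <= end]


def integrate(points):
    """Trapezoidal integral of the value over time."""
    return sum((t1 - t0) * (v0 + v1) / 2 for (t0, v0), (t1, v1) in zip(points, points[1:]))


def batch_indicators(batch):
    """Combine traceability, sensor and laboratory data for one batch."""
    window = readings_in(flow, batch["start"], batch["end"])
    volume_m3 = integrate(window) / 60  # flow is in m3/h, time in minutes
    return {
        "batch": batch["batch"],
        "mean_flow_m3_h": mean(v for _, v in window),
        "volume_m3": volume_m3,
        "mass_kg": volume_m3 * concentration[batch["batch"]],  # material balance
    }


indicators = [batch_indicators(b) for b in batches]
for row in indicators:
    print(row)
```

The traceability record supplies the time window, the sensor supplies the flow inside that window, and the laboratory value converts the integrated volume into a mass.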

To complete this computation layer and extend the capabilities of the Process Data Lake further, it integrates a Python code execution module that lets you manage your own code, whether numerical calculations, Machine Learning models or any other processing that can be done in Python. This code can be executed automatically in different contexts, drawing on the available data and calculations to generate new information and enrich the content of the Process Data Lake.
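
As an example of the kind of Python processing such a module could run, here is a small self-contained function that flags outliers in a window of sensor values. How the module passes data to user code and stores the results is not described here, so the function signature, the threshold and the sample readings are assumptions chosen purely for illustration.

```python
from statistics import mean, pstdev


def flag_outliers(values, threshold=3.0):
    """Return one boolean per value: True when the value lies more than
    `threshold` population standard deviations from the window mean."""
    if len(values) < 2:
        return [False] * len(values)
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [False] * len(values)
    return [abs(v - mu) / sigma > threshold for v in values]


# Hypothetical temperature readings with one abnormal value.
readings = [20.1, 20.3, 19.9, 20.0, 35.0, 20.2]
flags = flag_outliers(readings, threshold=2.0)
print(flags)
```

The flags produced this way are the sort of new information that could be written back alongside the original series.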

An efficient and reliable technical architecture

Performance: we combine cutting-edge data storage and processing technologies to provide you with a smooth user experience.

Reliability of data flows: in association with our OIBus collection agent, the Process Data Lake enables continuous and robust multi-source data collection.

Security, sustainability and data integrity: we rely on proven cloud infrastructures to offer a high level of security. For more details, we have a Security Assurance Plan covering the measures put in place.


Contact us to discuss your needs

Fill in this form and we will get back to you to talk about your business needs.