Data Lake Archives - Digital Innovation Blog

Industrial Data Platforms on Microsoft Azure: a decision guide for manufacturing – Part 1: From use case to Azure stack

Manufacturing Solutions

From concept to implementation

In the previous articles, we discussed architecture concepts for industrial data platforms: brownfield challenges, latency classes, batch versus streaming, the Medallion Architecture, and the question of edge versus cloud. But all these concepts remain abstract until they lead to concrete technology decisions.

This article is a guide to finding the right technology foundation for your manufacturing environment. How do we translate architecture ideas into sensible decisions on Microsoft Azure from the perspective of OT, IT, and data teams? On Azure, these questions usually lead in one of two directions today: Platform as a Service (PaaS), with a stack composed of individual Azure building blocks, or a more integrated Software as a Service (SaaS) solution with Microsoft Fabric. Neither direction is inherently superior; what matters are the use case, the operating model, and the skills that already exist.

At ZEISS Digital Innovation, we support manufacturing companies in exactly this transformation. This article is intentionally not a product catalog, but a decision guide based on our experience from real projects. We start from the specific use case, ask the right questions, and show which direction the answers point to on Azure. One thing becomes clear along the way: managed services remove much of the infrastructure work, but domain architecture and governance remain project-specific tasks.

One important point must be considered from the start: the cost structure in day-to-day operation. Wrong architecture decisions, such as streaming instead of batch, missing storage lifecycle policies, or unnecessary data redundancy, quickly lead to unexpectedly high cloud costs. From our project experience, we know this: economically sound architectures emerge when cost is considered from the beginning and weighed carefully in the context of the operating model.

A brief recap: the basic principles

Modern manufacturing companies struggle with hundreds to thousands of data sources in silos. An industrial data platform creates a central infrastructure to collect, process, and use this data. We have learned that not every use case needs real-time data. The range goes from milliseconds for process control to days for management reports. True closed loops in the millisecond range remain in the automation layer or at the edge; the central data platform mainly supports monitoring, analysis, and coordination. The right classification saves cost and complexity.

The Medallion Architecture structures the data flow: Bronze for raw data, Silver for cleaned data, Gold for aggregated business views. Depending on latency requirements and network conditions, we decide whether batch ingestion is sufficient or streaming is necessary, and whether we preprocess data at the edge or send it directly to the cloud. With this basic understanding, we can now get specific: How do these principles translate into actual technology? In the Microsoft ecosystem, this means a focused selection of suitable services.

From use cases to technology decisions

But let us not start with technology. Let us start with typical manufacturing use cases. The following three scenarios should be understood as stages of expansion with increasing complexity: from simple reporting to ongoing monitoring to machine learning (ML). In practice, an industrial data platform often grows in exactly this sequence. For each case, we first outline the solution approach and then derive suitable technology paths from it.

3D illustration of three stacked blue blocks showing typical industrial Azure data platform use cases: KPI reporting with batch ingestion from MES/ERP into a central data lake and dashboards; OEE monitoring with streaming ingestion, stream processing, live data and live dashboards from machine data; and predictive maintenance with hybrid ingestion of sensor data, historical plus live data in an integrated data lake with ML model storage, resulting in maintenance orders. — *Figure 1: Data-driven use cases in manufacturing*

Scenario 1: KPI Reporting and Management Dashboards

Let us first look at the classic case: manufacturing staff and managers want daily or weekly reports on production figures, scrap, and energy consumption. The data sources are manageable, network connectivity is stable, and an update every hour or every day is enough. Here, a clear batch approach is sufficient: data is loaded through batch ingestion, structured in Bronze, Silver, and Gold, and then provided for dashboards. The main effort lies less in the technology than in data modeling, KPI definition, and governance.

Scenario 2: OEE Monitoring and Live Dashboards

Now it becomes more demanding: staff in the control room need second- to minute-level views of machine condition and Overall Equipment Effectiveness (OEE) across several production lines. This is where streaming becomes relevant. Machine data is captured through streaming ingestion, processed almost in real time, and stored in parallel for historical analysis. In unstable networks or strict OT security zones, additional edge processing on the shop floor is recommended. Technically this is manageable, but organizationally it only succeeds when OT, IT, and the data team share responsibility for the end-to-end path.
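To make the metric behind this scenario tangible, here is a minimal sketch of the commonly used OEE definition (availability × performance × quality). The counter names and the `ShiftCounters` structure are hypothetical; which times and part counts actually feed the calculation is a project-specific decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShiftCounters:
    """Hypothetical per-line counters for one evaluation window."""
    planned_time_s: float      # planned production time
    run_time_s: float          # time the machine was actually running
    ideal_cycle_time_s: float  # ideal time to produce one part
    total_parts: int           # all parts produced in the window
    good_parts: int            # parts that passed quality checks


def oee(c: ShiftCounters) -> float:
    """Return OEE as availability * performance * quality (0.0 to 1.0)."""
    if c.planned_time_s <= 0 or c.run_time_s <= 0 or c.total_parts <= 0:
        return 0.0  # nothing planned, nothing running, or nothing produced
    availability = c.run_time_s / c.planned_time_s
    performance = (c.ideal_cycle_time_s * c.total_parts) / c.run_time_s
    quality = c.good_parts / c.total_parts
    return availability * performance * quality
```

Calling `oee(ShiftCounters(28_800, 25_200, 10.0, 2_268, 2_200))` evaluates one eight-hour shift. In a live dashboard, the same function would be applied per line and per time window rather than per shift.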

Scenario 3: Predictive Maintenance

Predictive maintenance combines both worlds and is the most demanding of the three scenarios. You need years of historical time-series data for model training and, at the same time, current streams for predictions that flow back into day-to-day operations, for example as maintenance orders in the Computerized Maintenance Management System (CMMS). The right approach is hybrid: streaming ingestion for current sensor data, historical time-series data in the Data Lakehouse, and a machine learning environment for training and inference. This is exactly where the difference between platform building blocks and project work becomes especially clear: Microsoft provides tools, but model selection, feature engineering, and integration into the CMMS remain project-specific.

Azure Building Blocks: not as a product list, but as a toolbox

Instead of going through an endless product list, let us look at Azure services by task area. This makes it easier to find the right technology for each challenge.

In principle, this leads to two well-supported paths: a PaaS approach, where you combine individual services for each task area, or a more integrated approach with Microsoft Fabric, where individual functions are more tightly connected. Neither path is automatically better. What matters are the desired level of integration, the operating model, and the question of how much platform composition your team wants to take on itself.

Connecting data sources and ingesting data

How does data from machines and sensors, as well as data from databases, files, or Application Programming Interfaces (APIs), get into the platform?

Table 1: Azure building blocks for connecting data sources and ingesting data

Service	Best suited for	Typical classification
Azure IoT Hub	Bidirectional communication with devices, device identities, and device lifecycle	Near real-time
Azure Event Hubs	Highly scalable streaming for millions of events per second, no device identity	Near real-time
Azure Event Grid	Event-based architectures, MQTT support, Pub/Sub	Near real-time
Azure IoT Edge	Containerized logic on devices or gateways, local preprocessing, offline capability	Edge processing
Azure IoT Operations	Edge data layer on Azure Arc/Kubernetes with MQTT broker, OPC UA connectivity, and streaming	Edge processing
Azure Data Factory	Connection to databases, file and API sources, including from on-premises environments	Mainly batch
Partner solutions (e.g. OPC UA gateways)	Protocol translation and machine connectivity in brownfield environments	Depends on the setup

For continuous data streams from machines and sensors, Azure IoT Hub, Azure Event Hubs, Azure Event Grid, and edge services are the obvious building blocks. Azure IoT Hub is suitable when you need device identities, secure communication, and device lifecycle management. Azure Event Hubs, in contrast, is designed for pure high-volume streaming without device management. Azure Event Grid is especially suitable for MQTT or event-driven architectures. For edge scenarios, there are currently two equally valid paths: Azure IoT Edge is a good fit for containerized logic on individual devices or gateways, while Azure IoT Operations is stronger when you want to build standardized industrial data flows with MQTT, OPC UA, and predefined cloud targets on Azure Arc/Kubernetes.

For batch ingestion from systems such as MES, ERP, SQL databases, file shares, SFTP, or APIs, Azure Data Factory is usually the more suitable building block. With its many connectors and a self-hosted integration runtime, on-premises sources can also be connected to the platform. This shows that data ingestion into the platform is broader than pure device communication. The right Azure solution depends on the source, the latency class, and the operating model.
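The selection logic described above can be written down as a small rule set. The following is only a sketch: `SourceProfile` and its three flags are hypothetical simplifications, and a real decision also has to weigh the operating model, security zones, and cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceProfile:
    """Hypothetical, simplified description of a data source."""
    continuous_stream: bool      # machine/sensor stream vs. periodic extract
    needs_device_identity: bool  # per-device identity and lifecycle management
    uses_mqtt_or_events: bool    # MQTT or event-driven publish/subscribe


def suggest_ingestion_service(p: SourceProfile) -> str:
    """Map a source profile to the building block named in the text."""
    if not p.continuous_stream:
        return "Azure Data Factory"  # batch from MES, ERP, SQL, files, APIs
    if p.needs_device_identity:
        return "Azure IoT Hub"       # device identities and lifecycle
    if p.uses_mqtt_or_events:
        return "Azure Event Grid"    # MQTT and event-driven architectures
    return "Azure Event Hubs"        # pure high-volume streaming
```

The order of the checks is itself an assumption: it gives device management priority over protocol choice. Teams with a different priority would reorder the rules.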

Data storage and preparation

How do we store data in a structured way, version it, keep its history, and prepare it for analysis?

Table 2: Azure building blocks for data storage and preparation

Service	Best suited for	Medallion role
Azure Data Lake Storage Gen2	Scalable, cost-efficient object storage for structured and unstructured data	Bronze, Silver, Gold
Apache Iceberg or Delta Lake (e.g. on Azure Databricks)	ACID transactions, time travel, schema evolution on the Data Lake	Silver, Gold
Microsoft Fabric (OneLake and Lakehouse)	Integrated SaaS platform for storage, preparation, usage, and governance	Bronze, Silver, Gold
Azure Data Explorer	High-performance analysis of telemetry, log, and time-series data	Silver, Gold
Azure SQL Database / Azure Cosmos DB	Relational or NoSQL databases for specific use cases	Gold (for applications)

Azure Data Lake Storage (ADLS) Gen2 is the cost-efficient standard storage for all data. A table format such as Apache Iceberg or Delta Lake is added when you need ACID (Atomicity, Consistency, Isolation, Durability) transactions and historical tracking with schema evolution, which is typical for Silver and Gold. If you want to analyze large telemetry and time-series datasets interactively, Azure Data Explorer is often the more precise choice. Microsoft Fabric covers this layer in a more integrated way: OneLake as the shared storage foundation, Lakehouse for data preparation, and shared data use across several workloads. The Medallion model maps as follows: Bronze stores raw data unchanged, Silver cleans and harmonizes it in tables, Gold aggregates it for business views.

Orchestration and processing

How do we control, transform, and aggregate data flows, in batch or in streaming?

Table 3: Azure building blocks for orchestration and processing

Service	Best suited for	Batch/Streaming
Azure Data Factory	Orchestration, ETL/ELT, many connectors, GUI-based	Mainly batch
Azure Databricks	Spark-based, flexible for batch and streaming, ML workflows	Batch and Streaming
Azure Stream Analytics	SQL-based streaming, simple for straightforward transformations	Streaming
Microsoft Fabric with Data Factory / Real-Time Intelligence	Integrated orchestration, event streams, Eventhouse, and real-time analytics	Batch and Streaming
Azure Functions	Serverless, event-driven, for small processing steps	Batch and Streaming

In the PaaS approach, Azure Data Factory is suitable for classic ETL jobs. Azure Databricks comes into play for complex transformations, large data volumes, and ML integration. Azure Stream Analytics is a good fit for simple streaming scenarios with SQL, while Azure Functions handle small, event-driven tasks. When using Microsoft Fabric, Data Factory and Real-Time Intelligence cover large parts of these tasks within one platform, from ingestion through event streams to analysis in Eventhouse or Power BI.

Usage and integration

How do we make data accessible to end users and applications?

Table 4: Azure building blocks for usage and integration

Service	Best suited for
Power BI	Business intelligence, dashboards, reports, and data analysis in business units
Azure API Management	Provide, secure, monitor, and version APIs
Azure Digital Twins	Digital twins for complex assets, room models, and process models
Azure App Service / Azure Container Apps	Web apps, custom user interfaces, microservices

Power BI is often the standard choice for dashboards. With Microsoft Fabric, it is embedded directly in the platform. Azure API Management is suitable for providing data and ML models as APIs. Azure Digital Twins makes sense when you want to model assets, spaces, or process relationships semantically. Azure App Service or Azure Container Apps come into play when custom applications or microservices are needed.

Technology stack examples

To make the theory more concrete, let us look at three specific stack examples for the scenarios introduced above. Each example first shows the PaaS approach and then a possible alternative with Microsoft Fabric.

3D graphic comparing three Azure technology stacks for manufacturing: a minimal reporting stack with Azure Data Factory, Azure Data Lake Storage Gen2 and Power BI; a near‑real‑time OEE stack adding Azure IoT Hub or Event Hubs plus Azure Stream Analytics/Databricks Structured Streaming; and a predictive maintenance stack with Azure IoT Hub, ADLS Gen2, Azure Databricks/Azure Machine Learning, Azure Functions/Stream Analytics and Azure API Management for exposing predictions. — *Figure 2: Examples of the technology stack for typical manufacturing use cases*

Example 1: minimal stack for reporting

Requirements:

Daily KPI reports for one plant
Data sources: MES database (SQL), a few CSV exports
Users: management, controlling
Latency: daily update is sufficient

Azure stack:

Ingestion: Azure Data Factory with SQL connector and Blob connector; for on-premises sources usually through Self-Hosted Integration Runtime
Storage: ADLS Gen2
1. Bronze: raw data from SQL and CSV
2. Silver: cleaned data (e.g. normalized timestamps, duplicates removed), Apache Iceberg as the table format
3. Gold: aggregated KPIs (e.g. produced parts per line, scrap per product)
Transformation: Azure Data Factory with visually designed data transformations (mapping data flows) or simple copy activities
Usage: Power BI reads directly from the Gold layer

Alternative with Microsoft Fabric: Data Factory loads the data into OneLake, a Lakehouse maps Bronze, Silver, and Gold, and Power BI accesses the same platform directly. This path is especially attractive when data integration, governance, and business intelligence should be tightly connected in one SaaS environment.

Note: This stack is almost turnkey. The main effort lies in data modeling, KPI definition, and governance (who may see which data?).
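The Bronze, Silver, and Gold steps of this example can be illustrated in a few lines of plain Python. This is a sketch under assumptions: the row schema (`line`, `ts`, `produced`, `scrap`) is hypothetical, and in the stack above the same logic would live in Data Factory data flows or in table transformations, not in a script.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Bronze: raw rows as they might arrive from an MES export (hypothetical schema).
bronze = [
    {"line": "L1", "ts": "2024-05-01T06:00:00+02:00", "produced": 120, "scrap": 3},
    {"line": "L1", "ts": "2024-05-01T06:00:00+02:00", "produced": 120, "scrap": 3},  # duplicate
    {"line": "L1", "ts": "2024-05-01T04:30:00+00:00", "produced": 100, "scrap": 1},
    {"line": "L2", "ts": "2024-05-01T05:00:00+00:00", "produced": 80, "scrap": 4},
]


def to_silver(rows):
    """Silver: normalize timestamps to UTC and drop exact duplicates."""
    seen, silver = set(), []
    for r in rows:
        ts_utc = datetime.fromisoformat(r["ts"]).astimezone(timezone.utc)
        key = (r["line"], ts_utc, r["produced"], r["scrap"])
        if key not in seen:
            seen.add(key)
            silver.append({**r, "ts": ts_utc})
    return silver


def to_gold(silver):
    """Gold: aggregate produced parts and scrap per line and UTC day."""
    totals = defaultdict(lambda: {"produced": 0, "scrap": 0})
    for r in silver:
        key = (r["line"], r["ts"].date())
        totals[key]["produced"] += r["produced"]
        totals[key]["scrap"] += r["scrap"]
    return {
        k: {**v, "scrap_rate": v["scrap"] / v["produced"] if v["produced"] else 0.0}
        for k, v in totals.items()
    }
```

`to_gold(to_silver(bronze))` returns one KPI record per line and day. Even in this small form, the open questions are modeling questions: what counts as a duplicate, and which day boundary (UTC or plant time) applies to a KPI.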

Example 2: near real-time OEE for multiple lines

Requirements:

Second- to minute-level view of machine condition and OEE
Multiple production lines, different machine types
Display in the control room on large monitors
Simple alerting in case of faults

Azure stack:

Ingestion: Azure IoT Hub or Azure Event Hubs
1. OPC UA gateway collects data from machines and sends it to Azure IoT Hub
Edge (optional): Azure IoT Edge or Azure IoT Operations
1. Azure IoT Edge for local preprocessing, filtering, and buffering during network outages
2. Azure IoT Operations for standardized data flows on Azure Arc/Kubernetes
Streaming: Azure Stream Analytics or Structured Streaming in Azure Databricks
1. Calculates OEE almost in real time and writes to ADLS Gen2 (Bronze/Silver)
Storage: ADLS Gen2 with Apache Iceberg for Silver/Gold
Usage: Power BI with real-time streaming or custom dashboards (e.g. React app)

Alternative with Microsoft Fabric: Real-Time Intelligence handles ingestion, event streams, and real-time analysis, while OneLake and Eventhouse form the data foundation. Power BI or real-time dashboards visualize the results. This is especially interesting when streaming, analysis, and visualization should be combined in one platform.

Note: Azure provides streaming and visualization building blocks, but the edge architecture (filter logic, offline handling) and the OEE calculation logic are project-specific. OT must connect the machines, IT must operate the edge infrastructure, and the data team must develop the streaming logic.
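To illustrate what the project-specific streaming logic might compute, the sketch below groups machine state samples into fixed (tumbling) time windows and derives the share of running samples per machine and window. The event shape `(machine_id, epoch_seconds, is_running)` and the 60-second window are assumptions. A streaming engine such as Azure Stream Analytics or Structured Streaming would additionally handle late and out-of-order events, which this sketch ignores.

```python
from collections import defaultdict

WINDOW_S = 60  # hypothetical window length in seconds


def tumbling_availability(events, window_s=WINDOW_S):
    """Aggregate state samples into tumbling windows.

    events: iterable of (machine_id, epoch_seconds, is_running) samples.
    Returns {(machine_id, window_start): share of samples with is_running}.
    """
    counts = defaultdict(lambda: [0, 0])  # [running samples, all samples]
    for machine_id, ts, is_running in events:
        window_start = int(ts // window_s) * window_s
        bucket = counts[(machine_id, window_start)]
        bucket[0] += 1 if is_running else 0
        bucket[1] += 1
    return {key: running / total for key, (running, total) in counts.items()}
```

The result is only the availability component of OEE. Performance and quality need part counts and quality results, which often come from different sources with different latencies.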

Example 3: Predictive Maintenance with ML

Requirements:

Prediction of bearing failures based on vibration data
Historical data over 2 years needed for model training
Current streaming data for predictions
Predictions should flow into the CMMS

Azure stack:

Ingestion: Azure IoT Hub for vibration sensors
Storage: ADLS Gen2 with Apache Iceberg
1. Bronze: raw data (vibration, temperature, etc.)
2. Silver: cleaned time series, feature engineering for ML
3. Gold: aggregated features for ML training
ML: Azure Databricks or Azure Machine Learning
1. Model training with historical data (Silver/Gold)
2. Model deployment as REST API (e.g. through Azure Machine Learning endpoints or Model Serving in Azure Databricks)
Streaming for inference: Azure Functions or Azure Stream Analytics call the model API
Integration: Azure API Management provides predictions for the CMMS
Optional: Azure IoT Edge or Azure IoT Operations brings the model or preprocessing locally to the asset

Alternative with Microsoft Fabric: Microsoft Fabric combines OneLake, data engineering, data science, and Power BI in one platform. For streaming-related analyses, Real-Time Intelligence can capture the current data, while models are trained and evaluated on the historical data in OneLake. If predictions must be generated very close to the asset, the edge part remains a separate architecture decision.

Note: This is where the difference between platform building blocks and custom development becomes especially clear. Azure provides ML tools, APIs, and deployment mechanisms, but the model itself, the selection of suitable features, the concept for retraining the model, and the integration into the CMMS are pure project work. Close cooperation between data science experts, OT, and IT is essential here.
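As an example of the feature selection mentioned above, the following sketch computes three widely used time-domain features for one window of vibration samples: RMS, peak, and crest factor. Whether these features are suitable for a given bearing, sensor, and sampling rate is an assumption that has to be validated in the project. The function name and the return structure are hypothetical.

```python
import math


def vibration_features(samples):
    """Compute simple time-domain features for one window of vibration samples."""
    if not samples:
        raise ValueError("window must contain at least one sample")
    n = len(samples)
    rms = math.sqrt(sum(x * x for x in samples) / n)  # overall vibration energy
    peak = max(abs(x) for x in samples)               # largest absolute amplitude
    crest_factor = peak / rms if rms > 0 else 0.0     # how impulsive the signal is
    return {"rms": rms, "peak": peak, "crest_factor": crest_factor}
```

Features like these would typically be computed in the Silver layer per asset and time window, and then aggregated in Gold as training input.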

Interim conclusion: from use case to Azure stack

The three scenarios and the Azure toolbox make it clear: there is no universal answer to the question of which stack is the right one. The key is to think consistently from current and especially future use cases. Latency requirements, data volume, user groups, and organizational conditions determine which combination of ingestion, storage, processing, and visualization makes sense.

Microsoft provides two solid directions for this: the composed Azure PaaS path with services such as Azure IoT Hub, Azure Event Hubs, Azure Data Factory, ADLS Gen2, Azure Data Explorer, and Power BI, and the more integrated SaaS path via Microsoft Fabric, with OneLake, Data Factory, Real-Time Intelligence, and Power BI in one platform. The stack examples presented show typical starting points, from minimal batch reporting to an ML-driven predictive maintenance setup.

But choosing the right technology stack is not enough. In the second part of this article, we will address questions that go beyond the mere use of tools: When do you need edge processing and when is the cloud sufficient? Where do managed services end, and where does the actual project work begin? How can governance, modern software development, and operations be designed to ensure the platform remains viable in the long term? And what mistakes should you avoid from the very start?

Industrial Data Platforms on Microsoft Azure: a decision guide for manufacturing – Part 2: From architecture to implementation and the most costly misunderstandings in practice

From the Azure stack to a viable architecture

In the first part of this article, we translated typical manufacturing scenarios, from KPI reporting through OEE monitoring to predictive maintenance, into specific Azure stacks. We grouped the Azure building blocks into task clusters and used three example stacks to show how data ingestion, storage, processing, and usage interact.

But a platform does not stand or fall on tool selection alone. In this second part, we focus on deeper questions: when is edge processing necessary, and when is pure cloud ingestion enough? Where do the capabilities already provided by Azure or Microsoft Fabric end, and where does project-specific development begin? Which development practices ensure long-term maintainability of the platform? And which decision-making patterns repeatedly lead to unnecessary complexity or avoidable costs?

Edge vs. cloud: the central architecture question

When is pure cloud ingestion enough, and when do we need edge? The answer depends on latency, network stability, and OT security zones. With daily reports, stable network connectivity, and IT-side data sources, you can go directly to the cloud. But with strict latency requirements, unstable internet connections, strict OT security zones, or high data volume, edge is the better choice. Real control loops in the millisecond range remain the responsibility of the automation layer; here the data platform mainly supports monitoring, analysis, and coordination. In our work with manufacturing companies, we see this decision regularly: it is rarely purely technical, but also touches security policies, operating concepts, and organizational boundaries.
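The criteria above can be captured as a simple checklist. This is a sketch, not a decision engine: `SiteProfile` and its four flags are hypothetical, and each flag hides a discussion between OT, IT, and the data team about what "strict" or "unstable" means at a given site.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteProfile:
    """Hypothetical yes/no summary of the criteria named in the text."""
    strict_latency: bool    # decisions needed faster than a cloud round trip
    unstable_network: bool  # offline phases or unreliable connectivity
    strict_ot_zones: bool   # OT security zones restrict outbound traffic
    high_raw_volume: bool   # raw data volume too large to send unfiltered


def edge_reasons(p: SiteProfile) -> list[str]:
    """Return the reasons that speak for edge processing (empty: cloud only)."""
    checks = [
        (p.strict_latency, "strict latency requirements"),
        (p.unstable_network, "unstable network connectivity"),
        (p.strict_ot_zones, "strict OT security zones"),
        (p.high_raw_volume, "high raw data volume"),
    ]
    return [reason for applies, reason in checks if applies]
```

An empty list suggests that direct cloud ingestion is likely sufficient. Any non-empty list is a prompt for a joint architecture discussion rather than an automatic decision.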

For edge implementation, there are currently two comparable approaches in the Microsoft ecosystem, with different strengths. Azure IoT Edge is especially suitable when containerized logic should run on individual devices or gateways, for example for local preprocessing, filtering, inference, or offline buffering. Azure IoT Operations is stronger when you want to build a standardized industrial edge data layer with MQTT broker, OPC UA connectivity, and data flows to targets such as Azure Event Hubs, Azure Data Lake Storage (ADLS) Gen2, Microsoft Fabric OneLake, or Azure Data Explorer on Azure Arc and Kubernetes. What Microsoft does not take off your hands in either case is the choice of protocols, the filtering logic, the failover behavior, and the OT integration. OT, IT, and the data team need to work together here: OT defines latency and security requirements, IT operates the edge infrastructure, and the data team develops the processing logic.
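Offline buffering is one example of the failover behavior that remains project work. The sketch below shows the basic store-and-forward idea with an in-memory queue. The class name, the `send` callback contract (returns `True` on success), and the drop-oldest policy are assumptions. A production edge module would usually persist the queue to disk and rely on the buffering features of the chosen edge runtime.

```python
from collections import deque


class StoreAndForwardBuffer:
    """Keep messages locally while the uplink is down and forward them in order."""

    def __init__(self, send, max_items=10_000):
        self._send = send  # callable(message) -> bool, True when delivered
        # Bounded queue: when full, the oldest message is dropped (assumed policy).
        self._queue = deque(maxlen=max_items)

    def publish(self, message):
        """Queue a message and try to deliver everything that is pending."""
        self._queue.append(message)
        return self.flush()

    def flush(self):
        """Send queued messages oldest first; stop at the first failure."""
        sent = 0
        while self._queue:
            if not self._send(self._queue[0]):
                break  # uplink still down, keep the message for later
            self._queue.popleft()
            sent += 1
        return sent

    def __len__(self):
        return len(self._queue)
```

The interesting design questions sit in the parameters: how much data may be lost when the buffer is full, and whether late data may still be delivered out of its original time window.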

3D graphic comparing edge and cloud scenarios for industrial data: on the left, stacked blocks labeled “Milliseconds / Real-time”, “Unstable / Offline phases”, “Strict OT zones” and “Mass raw data”; on the right, blocks labeled “Daily / Hourly”, “Stable / Permanent”, “Standard IT network” and “Aggregated KPIs”, illustrating when edge versus cloud processing is appropriate. — *Figure 1: Comparison of key parameters of edge and cloud architectures*

Where Azure is turnkey and where project work begins

Azure is not a turnkey “Industry 4.0 product”, but a powerful ecosystem of building blocks. In the PaaS approach, Microsoft provides strong infrastructure support: Azure IoT Hub manages the device lifecycle, Azure Data Factory includes a large number of built-in connectors, ADLS Gen2 and open table formats such as Apache Iceberg or Delta Lake provide a solid Lakehouse foundation, Azure Data Explorer supports interactive time-series and telemetry analysis, Power BI integrates smoothly, and Azure Monitor provides central monitoring. In the more integrated SaaS approach via Microsoft Fabric, OneLake provides the shared storage base, Data Factory handles data integration, Lakehouse handles processing, and Power BI handles usage within the same platform.

However, OT-specific connectors often require partners or custom development. Semantics are pure project work: what does “machine condition” mean? Which tags are needed? Which units apply? You develop the Bronze/Silver/Gold design, data contracts, data quality checks, and domain-specific applications yourself. Microsoft handles the infrastructure work, but domain architecture, data modeling, and governance remain your responsibility.
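A data contract makes these semantic decisions explicit and testable. The following sketch checks a record against a small contract that fixes field names, types, units, and plausible ranges. The fields `spindle_temp` and `spindle_speed`, their units, and their limits are invented for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldRule:
    name: str
    type_: type
    unit: str
    min_value: float
    max_value: float


# Hypothetical contract for a Silver-layer machine reading.
CONTRACT = [
    FieldRule("spindle_temp", float, "degC", -20.0, 150.0),
    FieldRule("spindle_speed", float, "rpm", 0.0, 30_000.0),
]


def violations(record: dict, contract=CONTRACT) -> list[str]:
    """Return a list of human-readable contract violations for one record."""
    problems = []
    for rule in contract:
        value = record.get(rule.name)
        if value is None:
            problems.append(f"{rule.name}: missing")
        elif not isinstance(value, rule.type_):
            problems.append(f"{rule.name}: expected {rule.type_.__name__}")
        elif not rule.min_value <= value <= rule.max_value:
            problems.append(f"{rule.name}: {value} {rule.unit} out of range")
    return problems
```

Records with violations can be rejected, quarantined, or flagged on the way from Bronze to Silver. Which of these applies is part of the data quality concept, not of the tooling.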

Modern software development: not optional, but mandatory

An often underestimated point is this: an industrial data platform is software and must be treated as such. Without modern development practices, it quickly becomes difficult to manage. At ZEISS Digital Innovation, we deliberately combine software engineering practices with the world of industrial data, not as an end in itself, but to keep projects maintainable and scalable in the long term.

Diagram of a DevOps lifecycle for an industrial data platform: circular flow from “Code & Infrastructure as Code (IaC)” to “Test & Build”, “Staging”, “Production”, “Observability & FinOps” and “Feedback during development” around a “Production environment – stable and scalable”, with IT team, OT department, and Data & Dev team shown collaborating underneath. — *Figure 2: Principles of Modern Software Development*

The foundation for this is the automation of infrastructure and deployments. Instead of manually clicking Azure resources together, the entire environment is described as Infrastructure as Code (IaC), for example with Bicep or Terraform. This allows even complex setups for several plants to be rolled out consistently and under version control. Closely linked to this is Continuous Integration and Continuous Delivery (CI/CD) for data pipelines: Azure Data Factory pipelines or Azure Databricks notebooks are treated like classic code, go through automated unit and integration tests with realistic test data, and move through clean staging environments into production. Faulty versions can then be rolled back within minutes, before they cause problems that go unnoticed.
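Treating pipeline logic like classic code means that transformations are plain functions with tests that run in the CI pipeline. The sketch below uses Python's standard `unittest` module. The transformation `clean_scrap_rows` and its validity rules are hypothetical stand-ins for real pipeline logic.

```python
import unittest


def clean_scrap_rows(rows):
    """Hypothetical transformation: keep only rows with plausible counts."""
    return [
        r for r in rows
        if r.get("produced") is not None
        and r.get("scrap") is not None
        and r["produced"] >= 0
        and 0 <= r["scrap"] <= r["produced"]
    ]


class CleanScrapRowsTest(unittest.TestCase):
    def test_drops_invalid_rows(self):
        rows = [
            {"produced": 100, "scrap": 2},     # valid
            {"produced": 100, "scrap": 120},   # more scrap than parts
            {"produced": -5, "scrap": 0},      # negative count
            {"produced": None, "scrap": 1},    # missing value
        ]
        self.assertEqual(clean_scrap_rows(rows), [rows[0]])

    def test_keeps_empty_input(self):
        self.assertEqual(clean_scrap_rows([]), [])
```

Saved as a module, the tests can be run with `python -m unittest`. The same function can then be imported by the notebook or job that runs in staging and production, so the tested code and the deployed code are identical.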

Once the platform is live, observability closes the loop, both technically and economically. Tools such as Azure Monitor and Azure Log Analytics do not just monitor whether pipelines run without errors and latencies stay within limits, but also continuously check data quality. Proactive alerts report problems before users notice them. Closely related to this is cost monitoring: Azure Cost Management does not only track spending at a high level, but also breaks it down by use case, plant, or business area with the help of cost allocation tags. Only this transparency allows sound decisions about which use case is economically sensible and where optimization is worthwhile. Cost awareness thus becomes an integral part of platform governance.
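The idea of breaking down cost by tag can be shown with a few lines of Python. The record structure below (a `cost` value plus a `tags` dictionary) is an assumption for illustration and not the export format of Azure Cost Management. The tag names `plant` and `use_case` are examples as well.

```python
from collections import defaultdict


def cost_by_tag(records, tag):
    """Sum cost per value of one allocation tag; untagged cost stays visible."""
    totals = defaultdict(float)
    for r in records:
        totals[r.get("tags", {}).get(tag, "untagged")] += r["cost"]
    return dict(totals)


# Hypothetical monthly cost records.
records = [
    {"cost": 120.0, "tags": {"plant": "A", "use_case": "oee"}},
    {"cost": 30.0, "tags": {"plant": "A", "use_case": "reporting"}},
    {"cost": 45.5, "tags": {"plant": "B", "use_case": "oee"}},
    {"cost": 10.0, "tags": {}},
]
```

`cost_by_tag(records, "use_case")` groups the same spending by use case instead of by plant. The explicit `untagged` bucket is a deliberate choice: a growing untagged share is often the first sign that tagging conventions are not being followed.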

The roles are clearly divided: IT is responsible for landing zones, IaC, and the CI/CD setup. Data and development teams build pipelines and ML models, while OT provides the requirements and tests in the staging environment. Only this interaction creates a reliably maintainable platform.

Data governance: not a later add-on

Data governance deserves its own article, but the core message is clear: governance must be built in from the start. It is about data ownership (who is responsible?), data quality (which standards apply?), access control (who may see what?), and compliance (GDPR, audit requirements).

Azure supports this with Microsoft Purview for data catalogs and lineage, Azure RBAC (Role-Based Access Control) and Microsoft Entra ID for fine-grained access control, landing zones for clear domain ownership, and Azure Policy for enforced standards. Governance is especially critical in industrial data: production data may be regulated (pharma, automotive), OT data must not fall into the wrong hands, and without trust in data quality nobody will use the platform. Azure and Microsoft Fabric provide the tools, but you must define the governance strategy, roles, processes, and standards yourself.

Common mistakes – and what we can learn from them

In our work with customers, we keep seeing similar challenges. Knowing them and addressing them early is part of our role as a partner.

Infographic listing common misconceptions about industrial data platforms on Azure on the left (“We do everything in real time”, “We’re looking for THAT tool”, “OT and IT decide separately”, “Azure is only storage”, “Governance comes later”, “All data stays in hot storage forever” and “We don’t want cloud dependency”) and the corresponding solutions on the right: “Batch-first approach”, “Use cases & architecture before choosing a tool”, “Teamwork is crucial”, “Medallion & governance”, “Governance from day 1”, “Storage tiering” and “Managed services & open data formats”. — *Figure 3: Myths in decision-making*

“We do everything in real time” is a classic. Every dashboard is supposed to update immediately, even when daily updates would be fully sufficient. The result: unnecessary complexity, higher costs, longer development time. The key question is: Which decisions are really made in real time? Often, a minimum viable product (MVP) with batch processing is the better start.

“We are looking for the one Azure product that solves everything” reveals a misunderstanding. There is no single “product”, but an ecosystem of building blocks. A platform is created through architecture, not through tool selection. Define use cases and architecture first, then choose the suitable building blocks.

“OT and IT decide separately” leads to isolated solutions. OT procures edge gateways, IT builds the cloud platform, and the data team hears nothing about it until the systems are incompatible. Industrial data processing is teamwork. Joint kickoffs, a shared architecture vision, and clear end-to-end responsibility are essential.

“Azure is only storage” is the safe road to a data swamp. If data lands in ADLS Gen2 “somehow” without structure, transformation, or governance, nobody will find anything later. The Medallion structure, data quality checks, and a catalog, for example with Microsoft Purview, are not extras, but basic requirements.

“Governance comes later” is an illusion. Governance added later is much harder than governance from the start. Define basic roles, access controls, and naming conventions from day one.

“All data stays in hot storage forever” is a classic cost trap. ADLS Gen2 offers different access tiers, including Hot, Cool, and Archive, with clearly different cost structures. If all historical data stays permanently in the Hot tier, storage costs become unnecessarily high. Define from the start: Which data needs fast access, which is rarely used, and which is only for long-term archiving? Azure Storage lifecycle management policies automate this tiering. The same applies to data resolution: not every historical time series needs to be stored at full resolution. Downsampling older data saves storage volume and therefore cost.
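Downsampling can be as simple as replacing all points in a time bucket with their mean. The sketch below assumes a time series of `(epoch_seconds, value)` pairs and a bucket size chosen per use case. Both the data shape and the choice of the mean as the aggregate are assumptions.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, bucket_s):
    """Reduce a time series to one mean value per bucket.

    points: iterable of (epoch_seconds, value).
    Returns a list of (bucket_start, mean value), sorted by time.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[int(ts // bucket_s) * bucket_s].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]
```

For example, `downsample([(0, 1.0), (30, 3.0), (60, 5.0)], 60)` reduces three points to two. Averaging hides short peaks, so for condition monitoring it can make sense to keep minimum and maximum per bucket as well, or to exclude data that is still needed for model training.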

“We don’t want cloud dependency” sounds cautious but often leads to expensive extra effort. If you only use VMs and open-source components, you give up managed services and must operate everything yourself: patching, scaling, and monitoring. The better question is: Can we keep data in standard formats and still benefit from managed services?

Conclusion: from architecture to implementation

An industrial data platform on Azure does not mean buying a product, but designing and implementing an architecture. Microsoft offers a mature ecosystem of building blocks that removes much of the infrastructure and platform work. The challenge is to choose the right building blocks, combine them sensibly, and create sustainable governance.

The most important principles are these: Start from the use case, not from the technology. Think in task clusters, not in product lists. A PaaS approach using individual Azure services and an integrated SaaS approach using Microsoft Fabric are two valid options with different strengths. Edge versus cloud is an architecture decision, not a tool question. Microsoft provides the infrastructure, not the business logic. Domain model, transformations, and governance remain your responsibility. Modern software development with IaC, CI/CD, and tests is mandatory, not optional. And above all, OT, IT, and the data team must work together. Industrial data is a shared task.

This is exactly where we at ZEISS Digital Innovation come in: as a partner that understands both the manufacturing world and modern cloud architectures. We translate between OT, IT, and data, ask the right questions, create clear architectures, and work with our customers to develop solutions that work in practice and remain maintainable in the long term. From requirements through implementation to operations, we support you on the path to a scalable, future-ready data platform.