Industrial Data Platforms on Microsoft Azure: a decision guide for manufacturing – Part 1: From use case to Azure stack

Manufacturing Solutions

From concept to implementation

In the previous articles, we discussed architecture concepts for industrial data platforms: brownfield challenges, latency classes, batch versus streaming, the Medallion Architecture, and the question of edge versus cloud. But all these concepts remain abstract until they lead to concrete technology decisions.

This article is a guide to finding the right technology foundation for your manufacturing environment. How do we translate architecture ideas into sensible decisions on Microsoft Azure from the perspective of OT, IT, and data teams? On Azure, this question usually leads in one of two directions today: either Platform as a Service (PaaS) with a stack made up of Azure building blocks, or a more integrated Software as a Service (SaaS) solution with Microsoft Fabric. Neither direction is inherently superior; what matters is the use case, the operating model, and the skills that already exist. At ZEISS Digital Innovation, we support manufacturing companies in exactly this transformation. This article is intentionally not a product catalog, but a decision guide based on our experience from real projects. We start from the specific use case, ask the right questions, and show which direction the answers point to on Azure. One thing becomes clear: managed services remove much of the infrastructure work, but domain architecture and governance remain project-specific tasks.

One important point must be considered from the start: the cost structure in day-to-day operation. Wrong architecture decisions, such as streaming instead of batch, missing storage lifecycle policies, or unnecessary data redundancy, quickly lead to unexpectedly high cloud costs. From our project experience, we know this: economically sound architectures emerge when cost is considered from the beginning and weighed carefully in the context of the operating model.

A brief recap: the basic principles

Modern manufacturing companies struggle with hundreds to thousands of data sources in silos. An industrial data platform creates a central infrastructure to collect, process, and use this data. We have learned that not every use case needs real-time data. The range goes from milliseconds for process control to days for management reports. True closed loops in the millisecond range remain in the automation layer or at the edge; the central data platform mainly supports monitoring, analysis, and coordination. The right classification saves cost and complexity.
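
The classification by latency can be sketched as a small helper. This is a minimal illustration, not a fixed rule: the function name, the class names, and both thresholds (1 second and 15 minutes) are assumptions chosen for the example and must be agreed per use case.

```python
from enum import Enum


class LatencyClass(Enum):
    CONTROL_LOOP = "automation layer or edge"
    STREAMING = "streaming ingestion"
    BATCH = "batch ingestion"


def classify_latency(max_delay_seconds: float) -> LatencyClass:
    """Map the data delay a use case tolerates to a latency class.

    Both thresholds are illustrative assumptions, not values from a standard.
    """
    if max_delay_seconds < 1.0:
        # Closed loops stay out of the central data platform.
        return LatencyClass.CONTROL_LOOP
    if max_delay_seconds < 15 * 60:
        return LatencyClass.STREAMING
    return LatencyClass.BATCH
```

For example, a daily management report (`classify_latency(24 * 3600)`) lands in the batch class, which is usually the cheaper and simpler option.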

The Medallion Architecture structures the data flow: Bronze for raw data, Silver for cleaned data, Gold for aggregated business views. And depending on latency requirements and network conditions, we decide whether batch ingestion is sufficient or streaming is necessary, and whether we preprocess data at the edge or send it directly to the cloud. With this basic understanding, we can now get specific: How do these principles translate into actual technology? In the Microsoft ecosystem, this means a focused selection of suitable services.

From use cases to technology decisions

But let us not start with technology. Let us start with typical manufacturing use cases. The following three scenarios should be understood as stages of expansion with increasing complexity: from simple reporting to ongoing monitoring to machine learning (ML). In practice, an industrial data platform often grows in exactly this sequence. For each case, we first outline the solution approach and then derive suitable technology paths from it.

3D illustration of three stacked blue blocks showing typical industrial Azure data platform use cases: KPI reporting with batch ingestion from MES/ERP into a central data lake and dashboards; OEE monitoring with streaming ingestion, stream processing, live data and live dashboards from machine data; and predictive maintenance with hybrid ingestion of sensor data, historical plus live data in an integrated data lake with ML model storage, resulting in maintenance orders. — *Figure 1: Data-driven use cases in manufacturing*

Scenario 1: KPI Reporting and Management Dashboards

Let us first look at the classic case: manufacturing staff and managers want daily or weekly reports on production figures, scrap, and energy consumption. The data sources are manageable, network connectivity is stable, and an update every hour or every day is enough. Here, a clear batch approach is sufficient: data is loaded through batch ingestion, structured in Bronze, Silver, and Gold, and then provided for dashboards. The main effort lies less in the technology than in data modeling, KPI definition, and governance.

Scenario 2: OEE Monitoring and Live Dashboards

Now it becomes more demanding: staff in the control room need second- to minute-level views of machine condition and Overall Equipment Effectiveness (OEE) across several production lines. This is where streaming becomes relevant. Machine data is captured through streaming ingestion, processed almost in real time, and stored in parallel for historical analysis. In unstable networks or strict OT security zones, additional edge processing on the shop floor is recommended. Technically this is manageable, but organizationally it only succeeds when OT, IT, and the data team share responsibility for the end-to-end path.

Scenario 3: Predictive Maintenance

Predictive maintenance combines the best and the most demanding parts of both worlds. You need years of historical time-series data for model training and, at the same time, current streams for predictions that flow back into day-to-day operations, for example as maintenance orders in the Computerized Maintenance Management System (CMMS). The right approach is hybrid: streaming ingestion for current sensor data, historical time-series data in the Data Lakehouse, and a machine learning environment for training and inference. This is exactly where the difference between platform building blocks and project work becomes especially clear: Microsoft provides tools, but model selection, feature engineering, and integration into the CMMS remain project-specific.

Azure Building Blocks: not as a product list, but as a toolbox

Instead of going through an endless product list, let us look at Azure services by task area. This makes it easier to find the right technology for each challenge.

In principle, this leads to two well-supported paths: a PaaS approach, where you combine individual services for each task area, or a more integrated approach with Microsoft Fabric, where individual functions are more tightly connected. Neither path is automatically better. What matters are the desired level of integration, the operating model, and the question of how much platform composition your team wants to take on itself.

Connecting data sources and ingesting data

How does data from machines and sensors, as well as data from databases, files, or Application Programming Interfaces (APIs), get into the platform?

Table 1: Azure building blocks for connecting data sources and ingesting data

Service	Best suited for?	Typical classification
Azure IoT Hub	Bidirectional communication with devices, device identities, and device lifecycle	Near real-time
Azure Event Hubs	Highly scalable streaming for millions of events per second, no device identity	Near real-time
Azure Event Grid	Event-based architectures, MQTT support, Pub/Sub	Near real-time
Azure IoT Edge	Containerized logic on devices or gateways, local preprocessing, offline capability	Edge processing
Azure IoT Operations	Edge data layer on Azure Arc/Kubernetes with MQTT broker, OPC UA connectivity, and streaming	Edge processing
Azure Data Factory	Connection to databases, file and API sources, including from on-premises environments	Mainly batch
Partner solutions (e.g. OPC UA gateways)	Protocol translation and machine connectivity in Brownfield environments	Depends on the setup

For continuous data streams from machines and sensors, Azure IoT Hub, Azure Event Hubs, Azure Event Grid, and edge services are the obvious building blocks. Azure IoT Hub is suitable when you need device identities, secure communication, and device lifecycle management. Azure Event Hubs, in contrast, is designed for pure high-volume streaming without device management. Azure Event Grid is especially suitable for MQTT or event-driven architectures. For edge scenarios, there are currently two equally valid paths: Azure IoT Edge is a good fit for containerized logic on individual devices or gateways, while Azure IoT Operations is stronger when you want to build standardized industrial data flows with MQTT, OPC UA, and predefined cloud targets on Azure Arc/Kubernetes.

For batch ingestion from systems such as MES, ERP, SQL databases, file shares, SFTP, or APIs, Azure Data Factory is usually the more suitable building block. With its many connectors and a self-hosted integration runtime, on-premises sources can also be connected to the platform. This shows that data ingestion into the platform is broader than pure device communication. The right Azure solution depends on the source, the latency class, and the operating model.

Data storage and preparation

How do we store data in a structured way, version it, keep its history, and prepare it for analysis?

Table 2: Azure building blocks for data storage and preparation

Service	Best suited for?	Medallion role
Azure Data Lake Storage Gen2	Scalable, cost-efficient object storage for structured and unstructured data	Bronze, Silver, Gold
Apache Iceberg or Delta Lake (e.g. on Azure Databricks)	ACID transactions, time travel, schema evolution on the Data Lake	Silver, Gold
Microsoft Fabric (OneLake and Lakehouse)	Integrated SaaS platform for storage, preparation, usage, and governance	Bronze, Silver, Gold
Azure Data Explorer	High-performance analysis of telemetry, log, and time-series data	Silver, Gold
Azure SQL Database / Azure Cosmos DB	Relational or NoSQL databases for specific use cases	Gold (for applications)

Azure Data Lake Storage (ADLS) Gen2 is the cost-efficient standard storage for all data. A table format such as Apache Iceberg or Delta Lake is added when you need ACID (Atomicity, Consistency, Isolation, Durability) transactions and historical tracking with schema evolution, which is typical for Silver and Gold. If you want to analyze large telemetry and time-series datasets interactively, Azure Data Explorer is often the more precise choice. Microsoft Fabric covers this layer in a more integrated way: OneLake as the shared storage foundation, Lakehouse for data preparation, and shared data use across several workloads. The Medallion model maps as follows: Bronze stores raw data unchanged, Silver cleans and harmonizes it in tables, Gold aggregates it for business views.

Orchestration and processing

How do we control, transform, and aggregate data flows, in batch or in streaming?

Table 3: Azure building blocks for orchestration and processing

Service	Best suited for?	Batch/Streaming
Azure Data Factory	Orchestration, ETL/ELT, many connectors, GUI-based	Mainly batch
Azure Databricks	Spark-based, flexible for batch and streaming, ML workflows	Batch and Streaming
Azure Stream Analytics	SQL-based streaming, simple for straightforward transformations	Streaming
Microsoft Fabric with Data Factory / Real-Time Intelligence	Integrated orchestration, event streams, Eventhouse, and real-time analytics	Batch and Streaming
Azure Functions	Serverless, event-driven, for small processing steps	Batch and Streaming

In the PaaS approach, Azure Data Factory is suitable for classic ETL jobs. Azure Databricks comes into play for complex transformations, large data volumes, and ML integration. Azure Stream Analytics is a good fit for simple streaming scenarios with SQL, while Azure Functions handle small, event-driven tasks. When using Microsoft Fabric, Data Factory and Real-Time Intelligence cover large parts of these tasks within one platform, from ingestion through event streams to analysis in Eventhouse or Power BI.

Usage and integration

How do we make data accessible to end users and applications?

Table 4: Azure building blocks for usage and integration

Service	Best suited for?
Power BI	Business intelligence, dashboards, reports, and data analysis in business units
Azure API Management	Provide, secure, monitor, and version APIs
Azure Digital Twins	Digital twins for complex assets, room models, and process models
Azure App Service / Azure Container Apps	Web apps, custom user interfaces, microservices

Power BI is often the standard choice for dashboards. With Microsoft Fabric, it is embedded directly in the platform. Azure API Management is suitable for providing data and ML models as APIs. Azure Digital Twins makes sense when you want to model assets, spaces, or process relationships semantically. Azure App Service or Azure Container Apps come into play when custom applications or microservices are needed.

Technology stack examples

To make the theory more concrete, let us look at three specific stack examples for the scenarios introduced above. Each example first shows the PaaS approach and then a possible alternative with Microsoft Fabric.

3D graphic comparing three Azure technology stacks for manufacturing: a minimal reporting stack with Azure Data Factory, Azure Data Lake Storage Gen2 and Power BI; a near‑real‑time OEE stack adding Azure IoT Hub or Event Hubs plus Azure Stream Analytics/Databricks Structured Streaming; and a predictive maintenance stack with Azure IoT Hub, ADLS Gen2, Azure Databricks/Azure Machine Learning, Azure Functions/Stream Analytics and Azure API Management for exposing predictions. — *Figure 2: Examples of the technology stack for typical manufacturing use cases*

Example 1: minimal stack for reporting

Requirements:

Daily KPI reports for one plant
Data sources: MES database (SQL), a few CSV exports
Users: management, controlling
Latency: daily update is sufficient

Azure stack:

Ingestion: Azure Data Factory with SQL connector and Blob connector; for on-premises sources usually through Self-Hosted Integration Runtime
Storage: ADLS Gen2
1. Bronze: raw data from SQL and CSV
1. Silver: cleaned data (e.g. normalized timestamps, duplicates removed), Apache Iceberg as the table format
1. Gold: aggregated KPIs (e.g. produced parts per line, scrap per product)
Transformation: Azure Data Factory with visually designed data transformations (mapping data flows) or simple copy activities
Usage: Power BI reads directly from the Gold layer

Alternative with Microsoft Fabric: Data Factory loads the data into OneLake, a Lakehouse maps Bronze, Silver, and Gold, and Power BI accesses the same platform directly. This path is especially attractive when data integration, governance, and business intelligence should be tightly connected in one SaaS environment.

Note: This stack is almost turnkey. The main effort lies in data modeling, KPI definition, and governance (who may see which data?).
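
The Bronze, Silver, and Gold steps of this reporting stack can be sketched in plain Python. This is a simplified illustration only: the record layout, field names, and sample values are hypothetical, and in a real project the same logic would typically run in Azure Data Factory data flows or a Lakehouse rather than in a script.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical Bronze records, as they might arrive from an MES export.
bronze = [
    {"line": "L1", "product": "A", "ts": "2024-05-01T06:00:00+02:00", "good": 95, "scrap": 5},
    {"line": "L1", "product": "A", "ts": "2024-05-01T06:00:00+02:00", "good": 95, "scrap": 5},
    {"line": "L2", "product": "B", "ts": "2024-05-01T04:30:00+00:00", "good": 50, "scrap": 0},
]


def to_silver(rows):
    """Normalize timestamps to UTC and drop rows that repeat
    the same line, product, and timestamp."""
    seen, silver = set(), []
    for row in rows:
        ts_utc = datetime.fromisoformat(row["ts"]).astimezone(timezone.utc)
        key = (row["line"], row["product"], ts_utc)
        if key in seen:
            continue
        seen.add(key)
        silver.append({**row, "ts": ts_utc})
    return silver


def to_gold(rows):
    """Aggregate produced parts per line and scrap per product."""
    parts_per_line = defaultdict(int)
    scrap_per_product = defaultdict(int)
    for row in rows:
        parts_per_line[row["line"]] += row["good"] + row["scrap"]
        scrap_per_product[row["product"]] += row["scrap"]
    return dict(parts_per_line), dict(scrap_per_product)


silver = to_silver(bronze)
parts_per_line, scrap_per_product = to_gold(silver)
```

Even in this small form, the real decisions are visible: what counts as a duplicate and how a produced part is defined are modeling questions, not technology questions.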

Example 2: near real-time OEE for multiple lines

Requirements:

Second- to minute-level view of machine condition and OEE
Multiple production lines, different machine types
Display in the control room on large monitors
Simple alerting in case of faults

Azure stack:

Ingestion: Azure IoT Hub or Azure Event Hubs
1. OPC UA gateway collects data from machines and sends it to Azure IoT Hub
Edge (optional): Azure IoT Edge or Azure IoT Operations
1. Azure IoT Edge for local preprocessing, filtering, and buffering during network outages
1. Azure IoT Operations for standardized data flows on Azure Arc/Kubernetes
Streaming: Azure Stream Analytics or Structured Streaming in Azure Databricks
1. Calculates OEE almost in real time and writes to ADLS Gen2 (Bronze/Silver)
Storage: ADLS Gen2 with Apache Iceberg for Silver/Gold
Usage: Power BI with real-time streaming or custom dashboards (e.g. React app)

Alternative with Microsoft Fabric: Real-Time Intelligence handles ingestion, event streams, and real-time analysis, while OneLake and Eventhouse form the data foundation. Power BI or real-time dashboards visualize the results. This is especially interesting when streaming, analysis, and visualization should be combined in one platform.

Note: Azure provides streaming and visualization building blocks, but the edge architecture (filter logic, offline handling) and the OEE calculation logic are project-specific. OT must connect the machines, IT must operate the edge infrastructure, and the data team must develop the streaming logic.
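
To illustrate the project-specific OEE logic, the following sketch uses the common definition of OEE as the product of availability, performance, and quality. The class and field names are assumptions for this example. How planned time, run time, and ideal cycle time are derived from machine signals has to be defined together with OT.

```python
from dataclasses import dataclass


@dataclass
class ProductionWindow:
    planned_time_s: float      # planned production time in the window
    run_time_s: float          # time the machine was actually running
    ideal_cycle_time_s: float  # ideal time to produce one part
    total_count: int           # all parts produced
    good_count: int            # parts without defects


def oee(w: ProductionWindow) -> dict:
    """Compute availability, performance, quality, and OEE for one window.

    Empty denominators yield 0.0 instead of raising, so an idle line
    shows up as OEE 0 on the dashboard.
    """
    availability = w.run_time_s / w.planned_time_s if w.planned_time_s else 0.0
    performance = (
        w.ideal_cycle_time_s * w.total_count / w.run_time_s if w.run_time_s else 0.0
    )
    quality = w.good_count / w.total_count if w.total_count else 0.0
    return {
        "availability": availability,
        "performance": performance,
        "quality": quality,
        "oee": availability * performance * quality,
    }
```

In a streaming job, a function like this would be applied per line and per time window, for example in Structured Streaming or as an equivalent query in Azure Stream Analytics.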

Example 3: Predictive Maintenance with ML

Requirements:

Prediction of bearing failures based on vibration data
Historical data over 2 years needed for model training
Current streaming data for predictions
Predictions should flow into the CMMS

Azure stack:

Ingestion: Azure IoT Hub for vibration sensors
Storage: ADLS Gen2 with Apache Iceberg
1. Bronze: raw data (vibration, temperature, etc.)
1. Silver: cleaned time series, feature engineering for ML
1. Gold: aggregated features for ML training
ML: Azure Databricks or Azure Machine Learning
1. Model training with historical data (Silver/Gold)
1. Model deployment as REST API (e.g. through Azure Machine Learning endpoints or Model Serving in Azure Databricks)
Streaming for inference: Azure Functions or Azure Stream Analytics call the model API
Integration: Azure API Management provides predictions for the CMMS
Optional: Azure IoT Edge or Azure IoT Operations brings the model or preprocessing locally to the asset

Alternative with Microsoft Fabric: Microsoft Fabric combines OneLake, data engineering, data science, and Power BI in one platform. For streaming-related analyses, Real-Time Intelligence can capture the current data, while models are trained and evaluated on the historical data in OneLake. If predictions must be generated very close to the asset, the edge part remains a separate architecture decision.

Note: This is where the difference between platform building blocks and custom development becomes especially clear. Azure provides ML tools, APIs, and deployment mechanisms, but the model itself, the selection of suitable features, the concept for retraining the model, and the integration into the CMMS are pure project work. Close cooperation between data science experts, OT, and IT is essential here.
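
The project-specific part of this path can be sketched as follows: features are computed from a window of vibration samples, a model scores them, and a maintenance order is created above a risk threshold. All names, the order fields, and the threshold of 0.8 are hypothetical. The `predict` argument merely stands in for a call to a deployed model endpoint and is not an Azure API.

```python
import math


def vibration_features(samples):
    """Compute common vibration features for one window of samples."""
    if not samples:
        raise ValueError("window must contain at least one sample")
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    peak = max(abs(x) for x in samples)
    crest_factor = peak / rms if rms else 0.0
    return {"rms": rms, "peak": peak, "crest_factor": crest_factor}


def maintenance_order(asset_id, features, predict, threshold=0.8):
    """Return an order payload for the CMMS, or None if the risk is low.

    `predict` is a placeholder for the model call and must return a
    failure risk between 0 and 1.
    """
    risk = predict(features)
    if risk < threshold:
        return None
    return {"asset_id": asset_id, "type": "bearing_inspection", "risk": risk}
```

Which features carry signal for a specific bearing, and which threshold justifies an order, are questions that only data science and maintenance can answer together.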

Interim conclusion: from use case to Azure stack

The three scenarios and the Azure toolbox make it clear: there is no universal answer to the question of which stack is the right one. The key is to think consistently from current and especially future use cases. Latency requirements, data volume, user groups, and organizational conditions determine which combination of ingestion, storage, processing, and visualization makes sense.

Microsoft provides two solid directions for this: the composed Azure PaaS path with services such as Azure IoT Hub, Azure Event Hubs, Azure Data Factory, ADLS Gen2, Azure Data Explorer, or Power BI, and the more integrated SaaS path via Microsoft Fabric with OneLake, Data Factory, Real-Time Intelligence, and Power BI in one platform. The stack examples presented show typical starting points, from minimal batch reporting to an ML-driven predictive maintenance setup.

But choosing the right technology stack is not enough. In the second part of this article, we will address questions that go beyond the mere use of tools: When do you need edge processing and when is the cloud sufficient? Where do managed services end, and where does the actual project work begin? How can governance, modern software development, and operations be designed to ensure the platform remains viable in the long term? And what mistakes should you avoid from the very start?

Industrial Data Platforms on Microsoft Azure: a decision guide for manufacturing – Part 2: From architecture to implementation and the most costly misunderstandings in practice

From the Azure stack to a viable architecture

In the first part of this article, we translated typical manufacturing scenarios, from KPI reporting through OEE monitoring to predictive maintenance, into specific Azure stacks. We grouped Azure building blocks into task clusters and used three example stacks to show how data ingestion, storage, processing, and use interact.

But a platform does not stand or fall based on tool selection alone. In this second part, we focus on deeper questions: when is edge processing necessary, and when is pure cloud ingestion enough? Where do the capabilities already provided by Azure or Microsoft Fabric end, and where does project-specific development begin? Which development practices ensure long-term maintainability of the platform? And which decision-making patterns repeatedly lead to unnecessary complexity or avoidable costs?

Edge vs. cloud: the central architecture question

When is pure cloud ingestion enough, and when do we need edge? The answer depends on latency, network stability, and OT security zones. With daily reports, stable network connectivity, and IT-side data sources, you can go directly to the cloud. But with strict latency requirements, unstable internet connections, strict OT security zones, or high data volume, edge is the better choice. Real control loops in the millisecond range remain the responsibility of the automation layer; here the data platform mainly supports monitoring, analysis, and coordination. In our work with manufacturing companies, we see this decision regularly: it is rarely purely technical, but also touches security policies, operating concepts, and organizational boundaries.
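
The criteria named above can be written down as a simple checklist. The sketch below is only a conversation aid under the assumption that each criterion can be answered with yes or no; all names are hypothetical, and in practice the answers are often gradual and need to be weighed.

```python
from dataclasses import dataclass


@dataclass
class IngestionContext:
    strict_latency: bool     # seconds or below are required on site
    unstable_network: bool   # offline phases are expected
    strict_ot_zone: bool     # OT security zones restrict outbound traffic
    high_raw_volume: bool    # raw data is too large to send unfiltered


def edge_reasons(ctx: IngestionContext) -> list:
    """Return the criteria that speak for edge processing."""
    checks = [
        (ctx.strict_latency, "strict latency requirements"),
        (ctx.unstable_network, "unstable network"),
        (ctx.strict_ot_zone, "strict OT security zones"),
        (ctx.high_raw_volume, "high data volume"),
    ]
    return [reason for applies, reason in checks if applies]


def recommend(ctx: IngestionContext) -> str:
    """Recommend 'edge' if at least one criterion applies, else 'cloud'."""
    return "edge" if edge_reasons(ctx) else "cloud"
```

Returning the reasons, and not just the verdict, keeps the decision traceable when OT, IT, and the data team discuss it.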

For edge implementation, there are currently two comparable approaches in the Microsoft ecosystem, with different strengths. Azure IoT Edge is especially suitable when containerized logic should run on individual devices or gateways, for example for local preprocessing, filtering, inference, or offline buffering. Azure IoT Operations is stronger when you want to build a standardized industrial edge data layer with MQTT broker, OPC UA connectivity, and data flows to targets such as Azure Event Hubs, Azure Data Lake Storage (ADLS) Gen2, Microsoft Fabric OneLake, or Azure Data Explorer on Azure Arc and Kubernetes. What Microsoft does not take off your hands in either case is the choice of protocols, the filtering logic, the failover behavior, and the OT integration. OT, IT, and the data team need to work together here: OT defines latency and security requirements, IT operates the edge infrastructure, and the data team develops the processing logic.
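
What filtering logic and failover behavior mean in practice can be shown with a small store-and-forward sketch. It is not the API of Azure IoT Edge or Azure IoT Operations: the class name, the deadband filter, and the `send` callable are assumptions for illustration. `send` stands for any uplink that returns True on success.

```python
from collections import deque


class StoreAndForwardBuffer:
    """Filter small value changes and buffer readings while offline."""

    def __init__(self, send, max_items=10_000, deadband=0.5):
        self._send = send                      # callable: item -> bool
        self._queue = deque(maxlen=max_items)  # oldest items drop when full
        self._deadband = deadband
        self._last_forwarded = {}

    def publish(self, tag, value):
        last = self._last_forwarded.get(tag)
        if last is not None and abs(value - last) < self._deadband:
            return  # change is below the deadband, do not forward
        self._last_forwarded[tag] = value
        self._queue.append((tag, value))
        self.flush()

    def flush(self):
        """Send buffered items in order and stop at the first failure."""
        while self._queue:
            if not self._send(self._queue[0]):
                break
            self._queue.popleft()
```

The sketch makes the open design questions explicit: how large the buffer may grow, which data is dropped first when it is full, and how large the deadband may be per signal.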

3D graphic comparing edge and cloud scenarios for industrial data: on the left, stacked blocks labeled “Milliseconds / Real-time”, “Unstable / Offline phases”, “Strict OT zones” and “Mass raw data”; on the right, blocks labeled “Daily / Hourly”, “Stable / Permanent”, “Standard IT network” and “Aggregated KPIs”, illustrating when edge versus cloud processing is appropriate. — *Figure 1: Comparison of key parameters of edge and cloud architectures*

Where Azure is turnkey and where project work begins

Azure is not a turnkey “Industry 4.0 product”, but a powerful ecosystem of building blocks. In the PaaS approach, Microsoft provides strong infrastructure support: Azure IoT Hub manages the device lifecycle, Azure Data Factory includes hundreds of standard connectors, ADLS Gen2 and open table formats such as Apache Iceberg or Delta Lake provide a solid Lakehouse foundation, Azure Data Explorer supports interactive time-series and telemetry analysis, Power BI integrates smoothly, and Azure Monitor monitors everything centrally. In the more integrated SaaS approach via Microsoft Fabric, OneLake provides the shared storage base, Data Factory handles data integration, Lakehouse handles processing, and Power BI handles usage within the same platform.

However, OT-specific connectors often require partners or custom development. Semantics are pure project work: what does “machine condition” mean? Which tags are needed? Which units apply? You develop the Bronze/Silver/Gold design, data contracts, data quality checks, and domain-specific applications yourself. Microsoft handles the infrastructure work, but domain architecture, data modeling, and governance remain your responsibility.
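
A data contract with a matching quality check can start very small. The following sketch assumes a hypothetical contract with two tags; tag names, units, and limits are invented for illustration and would come from the joint semantic model of OT and the data team.

```python
# Hypothetical contract: which tags exist, in which unit, with which values.
CONTRACT = {
    "spindle_temperature": {"unit": "degC", "min": -20.0, "max": 200.0},
    "machine_state": {"unit": None, "allowed": {"RUNNING", "IDLE", "FAULT"}},
}


def validate(reading: dict) -> list:
    """Return a list of contract violations for one reading (empty if valid)."""
    tag = reading.get("tag")
    spec = CONTRACT.get(tag)
    if spec is None:
        return [f"unknown tag: {tag}"]

    errors = []
    if reading.get("unit") != spec["unit"]:
        errors.append(f"unexpected unit: {reading.get('unit')}")

    value = reading.get("value")
    if "allowed" in spec:
        if value not in spec["allowed"]:
            errors.append(f"unexpected value: {value}")
    elif not isinstance(value, (int, float)) or not spec["min"] <= value <= spec["max"]:
        errors.append(f"value out of range: {value}")
    return errors
```

Readings that fail such a check would typically stay in Bronze and be reported, instead of silently entering Silver.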

Modern software development: not optional, but mandatory

An often underestimated point is this: an industrial data platform is software and must be treated as such. Without modern development practices, it quickly becomes difficult to manage. At ZEISS Digital Innovation, we deliberately combine software engineering practices with the world of industrial data, not as an end in itself, but to keep projects maintainable and scalable in the long term.

Diagram of a DevOps lifecycle for an industrial data platform: circular flow from “Code & Infrastructure as Code (IaC)” to “Test & Build”, “Staging”, “Production”, “Observability & FinOps” and “Feedback during development” around a “Production environment – stable and scalable”, with IT team, OT department, and Data & Dev team shown collaborating underneath. — *Figure 2: Principles of Modern Software Development*

The foundation for this is the automation of infrastructure and deployments. Instead of manually clicking Azure resources together, the entire environment is described as Infrastructure as Code (IaC) (for example with Bicep or Terraform). This allows even complex setups for several plants to be rolled out consistently and under version control. Closely linked to this is Continuous Integration and Continuous Delivery (CI/CD) for data pipelines: Azure Data Factory pipelines or Azure Databricks Notebooks are treated like classic code, go through automated unit and integration tests with realistic test data, and move through clean staging environments into production. Faulty versions can then be reverted within minutes, before they cause unnoticed problems.
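
Treating pipeline logic like classic code means that transformations are plain functions with automated tests. The sketch below uses Python's built-in `unittest` module; the function `scrap_rate` and its rules are a hypothetical example of a KPI transformation, not part of any Azure service.

```python
import unittest


def scrap_rate(good: int, scrap: int) -> float:
    """Share of scrapped parts; 0.0 if nothing was produced."""
    if good < 0 or scrap < 0:
        raise ValueError("counts must not be negative")
    total = good + scrap
    return scrap / total if total else 0.0


class ScrapRateTest(unittest.TestCase):
    def test_typical_shift(self):
        self.assertAlmostEqual(scrap_rate(95, 5), 0.05)

    def test_no_production_yields_zero(self):
        self.assertEqual(scrap_rate(0, 0), 0.0)

    def test_negative_counts_are_rejected(self):
        with self.assertRaises(ValueError):
            scrap_rate(-1, 5)
```

In a CI pipeline, such tests would run on every change, for example with `python -m unittest`, before a pipeline version is promoted to staging.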

Once the platform is live, observability closes the loop, both technically and economically. Tools such as Azure Monitor and Azure Log Analytics do not just monitor whether pipelines run without errors and latencies stay within limits, but also continuously check data quality. Proactive alerts report problems before users notice them. Closely related to this is cost monitoring: Azure Cost Management does not only track spending at a high level, but also breaks it down by use case, plant, or business area with the help of cost allocation tags. Only this transparency allows sound decisions about which use case is economically sensible and where optimization is worthwhile. Cost awareness thus becomes an integral part of platform governance.
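
The breakdown by cost allocation tags boils down to a grouped sum. The sketch below works on a simplified, hypothetical record layout with a `cost` value and a `tags` dictionary; it does not reproduce the export schema of Azure Cost Management.

```python
from collections import defaultdict


def cost_by_tag(records, tag_key):
    """Sum costs per value of one tag; untagged resources get their own bucket."""
    totals = defaultdict(float)
    for record in records:
        totals[record["tags"].get(tag_key, "untagged")] += record["cost"]
    return dict(totals)


# Hypothetical monthly cost records.
records = [
    {"resource": "iot-hub", "cost": 120.0, "tags": {"use_case": "oee", "plant": "A"}},
    {"resource": "stream-job", "cost": 80.0, "tags": {"use_case": "oee", "plant": "A"}},
    {"resource": "test-vm", "cost": 40.0, "tags": {}},
]
per_use_case = cost_by_tag(records, "use_case")
```

The `untagged` bucket is worth watching: if it grows, the tagging standard is not being enforced.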

The roles are clearly divided: IT is responsible for landing zones, IaC, and the CI/CD setup. Data and development teams build pipelines and ML models, while OT provides the requirements and tests in the staging environment. Only this interaction creates a reliably maintainable platform.

Data governance: not a later add-on

Data governance deserves its own article, but the core message is clear: governance must be built in from the start. It is about data ownership (who is responsible?), data quality (which standards apply?), access control (who may see what?), and compliance (GDPR, audit requirements).

Azure supports this with Microsoft Purview for data catalogs and lineage, Azure RBAC (Role-Based Access Control) and Microsoft Entra ID for fine-grained access control, landing zones for clear domain ownership, and Azure Policy for enforced standards. Governance is especially critical in industrial data: production data may be regulated (pharma, automotive), OT data must not fall into the wrong hands, and without trust in data quality nobody will use the platform. Azure and Microsoft Fabric provide the tools, but you must define the governance strategy, roles, processes, and standards yourself.

Common mistakes – and what we can learn from them

In our work with customers, we keep seeing similar challenges. Knowing them and addressing them early is part of our role as a partner.

Infographic listing common misconceptions about industrial data platforms on Azure on the left—such as “We do everything in real time”, “We’re looking for THAT tool”, “OT and IT decide separately”, “Azure is only storage”, “Governance comes later”, “All data stays in hot storage forever” and “We don’t want cloud dependency”—and the corresponding solutions on the right: “Batch-first approach”, “Use cases & architecture before choosing a tool”, “Teamwork is crucial”, “Medallion & governance”, “Governance from day 1”, “Storage tiering” and “Managed services & open data formats”. — *Figure 3: Myths in decision-making*

“We do everything in real time” is a classic. Every dashboard is supposed to update immediately, even when daily updates would be fully sufficient. The result: unnecessary complexity, higher costs, longer development time. The key question is: Which decisions are really made in real time? Often, a minimum viable product (MVP) with batch processing is the better start.

“We are looking for the one Azure product that solves everything” reveals a misunderstanding. There is no single “product”, but an ecosystem of building blocks. A platform is created through architecture, not through tool selection. Define use cases and architecture first, then choose the suitable building blocks.

“OT and IT decide separately” leads to isolated solutions. OT procures edge gateways, IT builds the cloud platform, and the data team hears nothing about it until the systems are incompatible. Industrial data processing is teamwork. Joint kickoffs, a shared architecture vision, and clear end-to-end responsibility are essential.

“Azure is only storage” is the safe road to a data swamp. If data lands in ADLS Gen2 “somehow” without structure, transformation, or governance, nobody will find anything later. The Medallion structure, data quality checks, and a catalog, for example with Microsoft Purview, are not extras, but basic requirements.

“Governance comes later” is an illusion. Governance added later is much harder than governance from the start. Define basic roles, access controls, and naming conventions from day one.

“All data stays in hot storage forever” is a classic cost trap. ADLS Gen2 offers different access tiers, such as Hot, Cool, and Archive, with clearly different cost structures. If all historical data stays permanently in the Hot tier, storage costs become unnecessarily high. Define from the start: Which data needs fast access, which is rarely used, and which is only for long-term archiving? Lifecycle management policies in Azure Storage automate this tiering. The same applies to data resolution: not every historical time series needs to be stored at full resolution. Downsampling older data saves storage volume and therefore cost.
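
A tiering rule can be described as a lifecycle management policy in JSON. The following sketch builds such a policy as a Python dictionary. The function name, the rule name, the prefix, and the day values are assumptions for illustration; the field names follow the policy schema of Azure Storage lifecycle management as we understand it and should be verified against the current documentation before use.

```python
import json


def lifecycle_policy(prefix: str, cool_after_days: int, archive_after_days: int) -> dict:
    """Build a policy that moves block blobs under `prefix` to cooler tiers."""
    if not 0 < cool_after_days < archive_after_days:
        raise ValueError("expected 0 < cool_after_days < archive_after_days")
    return {
        "rules": [
            {
                "enabled": True,
                "name": "tier-historical-data",  # hypothetical rule name
                "type": "Lifecycle",
                "definition": {
                    "filters": {
                        "blobTypes": ["blockBlob"],
                        "prefixMatch": [prefix],
                    },
                    "actions": {
                        "baseBlob": {
                            "tierToCool": {
                                "daysAfterModificationGreaterThan": cool_after_days
                            },
                            "tierToArchive": {
                                "daysAfterModificationGreaterThan": archive_after_days
                            },
                        }
                    },
                },
            }
        ]
    }


# Hypothetical example: raw vibration data, cool after 30 and archive after 180 days.
policy_json = json.dumps(lifecycle_policy("bronze/vibration/", 30, 180), indent=2)
```

Keeping such a policy in version control next to the IaC templates makes the tiering decision reviewable, just like any other architecture decision.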

“We don’t want cloud dependency” sounds cautious but often leads to expensive extra effort. If you only use VMs and open-source components, you give up managed services and must operate everything yourself: patching, scaling, and monitoring. The better question is: Can we keep data in standard formats and still benefit from managed services?

Conclusion: from architecture to implementation

An industrial data platform on Azure does not mean buying a product, but designing and implementing an architecture. Microsoft offers a mature ecosystem of building blocks that removes much of the infrastructure and platform work. The challenge is to choose the right building blocks, combine them sensibly, and create sustainable governance.

The most important principles are these: Start from the use case, not from the technology. Think in task clusters, not in product lists. A PaaS approach using individual Azure services and an integrated SaaS approach using Microsoft Fabric are two valid options with different strengths. Edge versus cloud is an architecture decision, not a tool question. Microsoft provides the infrastructure, not the business logic. Domain model, transformations, and governance remain your responsibility. Modern software development with IaC, CI/CD, and tests is mandatory, not optional. And above all, OT, IT, and the data team must work together. Industrial data is a shared task.

This is exactly where we at ZEISS Digital Innovation come in: as a partner that understands both the manufacturing world and modern cloud architectures. We translate between OT, IT, and data, ask the right questions, create clear architectures, and work with our customers to develop solutions that work in practice and remain maintainable in the long term. From requirements through implementation to operations, we support you on the path to a scalable, future-ready data platform.

Recap Thin[gk]athon – Taking Virtual Shopfloor to the Next Level

How do you bring a physical production hall into the digital world? This question was at the heart of the Thin[gk]athon, with ZEISS Digital Innovation and Volkswagen Sachsen as the challenge owners. The co-innovation format of the Smart Systems Hub provides the methodological framework to collaboratively tackle highly relevant challenges. This time, the motto was “Taking Virtual Shopfloor to the Next Level,” inviting ambitious minds with diverse professional backgrounds to jointly explore the boundaries of digitalization. The result: interdisciplinary teams where fresh perspectives met diverse experiences to develop innovative solutions.

Participants and jury, front row from left to right: Prof. Marius Brade (University of Applied Sciences Dresden), Dr. Stefan Feldmann (Zeiss), Dr. Dirk Thieme (Volkswagen), Daniel Beltz (Sachsenmilch), Leonid Dendya (Volkswagen)

The Challenge: Making Digitalization Tangible

At the center of the challenge was the task of transferring real production data into a virtual 3D world in a way that is interactive, scalable, and practical. The teams had to link scanned production environments, consisting of point clouds and images, with digital product passports. Specifically, this meant automatically recognizing QR codes in the 3D scans and linking them with information from the Asset Administration Shell (AAS). The goal was a walkable, interactive 3D scene that not only visualizes the environment but also provides data and helps optimize processes. To make the results transferable, a generic approach was chosen.

26.06.2025, Dresden, Thin[gk]athon, ZEISS Digital Innovation, Smart Systems Hub, Volkswagen Sachsen

Teamwork as an Innovation Driver

The teams, consisting of participants with diverse professional backgrounds, demonstrated how valuable this format is. By combining knowledge from IT, production, design, and science, new ideas emerged that replaced conventional thinking with fresh impulses. Our colleagues Pawel Adaszewski and Gergely Honti from ZEISS Digital Innovation also supported the teams with their expertise in software development and their shopfloor experience. In the end, all teams presented practical concepts within just three days and outlined possible next steps for building an interactive and scalable digital twin.

Technological Deep Dive: A Challenge in Three Dimensions

The true complexity of the challenge was revealed in the raw data. The datasets were provided as so-called 3D Gaussian splats, a modern method for rendering photorealistic scenes from photos. The result: a huge and often noisy point cloud that first had to be cleaned up. Some teams tried to analyze the density of the point cloud with the k-nearest-neighbors algorithm (KNN) but quickly abandoned this approach because of the enormous amount of data. A box-based method that divided the space into a voxel grid (imagine a division into small cubes) was more successful. By calculating the point density in each cube, machine clusters could be isolated quickly and efficiently.
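The voxel idea can be sketched in a few lines of plain Python. This is not the teams' code: the function names, the voxel size, and the density threshold are assumptions chosen for illustration. Each point is assigned to a cube by integer division of its coordinates, and only points in sufficiently populated cubes are kept.

```python
from collections import Counter
from math import floor


def voxel_index(point, voxel_size):
    """Map a 3D point to the integer index of the cube that contains it."""
    x, y, z = point
    return (floor(x / voxel_size), floor(y / voxel_size), floor(z / voxel_size))


def filter_by_density(points, voxel_size, min_points):
    """Keep only points that lie in voxels with at least min_points points."""
    counts = Counter(voxel_index(p, voxel_size) for p in points)
    return [p for p in points if counts[voxel_index(p, voxel_size)] >= min_points]


cloud = [
    (0.1, 0.1, 0.1),
    (0.2, 0.2, 0.2),
    (0.3, 0.1, 0.4),
    (5.0, 5.0, 5.0),  # isolated point, treated as noise
]
dense = filter_by_density(cloud, voxel_size=1.0, min_points=2)
```

Because every point is touched only a constant number of times, the effort grows linearly with the size of the cloud, which is what makes this approach attractive compared with a neighbor search per point.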

The biggest hurdle remained the detection of QR codes. The crucial insight was that the 3D scans only depicted the surfaces of the objects. Instead of searching the entire 3D space, the teams could specifically search the rendered surface for areas with high black-and-white contrast. However, the practical implementation was rocky: attempts to visualize the found contrast hotspots with Python-based 3D renderers failed due to software conflicts and bugs. The breakthrough finally came through a pragmatic switch to the Unity engine. Using WebGL rendering, the virtual camera was moved to the calculated coordinates to find and read the QR codes.
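One possible reading of the contrast search is sketched below in plain Python. It assumes a rendered view is available as a grid of grayscale values from 0 to 255 and flags tiles that contain a substantial share of both very dark and very bright pixels. The thresholds, the tile size, and the function name are illustrative assumptions, not the teams' actual parameters.

```python
def contrast_hotspots(image, tile, dark=64, bright=192, min_share=0.25):
    """Return the top-left (row, col) of tiles with strong black-and-white contrast.

    image is a list of rows of grayscale values (0-255). A tile counts as a
    hotspot if at least min_share of its pixels are dark and at least
    min_share are bright. All thresholds are assumptions for illustration.
    """
    hotspots = []
    for row in range(0, len(image) - tile + 1, tile):
        for col in range(0, len(image[0]) - tile + 1, tile):
            pixels = [
                image[y][x]
                for y in range(row, row + tile)
                for x in range(col, col + tile)
            ]
            dark_share = sum(p <= dark for p in pixels) / len(pixels)
            bright_share = sum(p >= bright for p in pixels) / len(pixels)
            if dark_share >= min_share and bright_share >= min_share:
                hotspots.append((row, col))
    return hotspots


view = [
    [0, 255, 128, 128],
    [255, 0, 128, 128],
    [128, 128, 128, 128],
    [128, 128, 128, 128],
]
candidates = contrast_hotspots(view, tile=2)
```

Tiles found this way are only candidates; an actual QR code reader still has to confirm and decode them.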

After preprocessing, a spatial relationship between the QR codes and the associated machines was established. This spatial relationship was used to retrieve data from the Asset Administration Shell (AAS) for the user and display it in an interactive user interface. A possible frontend allowed users to interactively navigate through the point cloud and explore the relevant information.

The Winning Team’s Solution

The winning team from the Information Management department at HTW Dresden, consisting of Stefan Vogt, Robert Pampuch, Johannes Metzler, Felix Fritzsche, and Paul Patolla, impressed the jury with a functioning solution. For them, it was three intensive days full of technology, exchange, and practical development.

*Figure 1: Interface in the administration shell*

The technical processing of their solution was based on several steps. First, video files were broken down into individual frames to identify QR codes along with position information in each frame. For this, they primarily used the Python library OpenCV, which enabled efficient image preparation and robust QR code recognition. In parallel, the team used the Gaussian Splatting method, based on the scientific publication by Kerbl et al., to generate a realistic point cloud from the image data. This allowed the precise localization of the QR codes in a three-dimensional context.

For visualization, Babylon.js was used, allowing the results to be experienced directly in the browser (and optionally in VR) as a WebXR application. Additionally, NVIDIA Omniverse was used to generate textures, significantly enhancing the visual quality of the representation. This way, a comprehensive solution was created that intelligently combined data analysis and immersive 3D visualization, demonstrating a clear connection to practice.

*Figure 2: Screenshot of the winning team’s solution*

Conclusion and Outlook

A panel of experts from industry and academia evaluated the concepts based on technical depth, feasibility, and teamwork. The teams demonstrated a strong commitment to social responsibility by donating their prize money of €2,000 to charitable organizations.

Winning team “Punkt Pioniere”

The Thin[gk]athon once again proved that the digitalization of production is not an end in itself but an effective tool. It enables more efficient processes and data-based decisions that create sustainable added value for the shopfloor. The key to success lies in structured data enablement and the targeted use of existing production data.

The developed concepts impressively show how complex challenges can be solved through the combination of technology, team spirit, and methodological support. We look forward to continuing formats of this kind and further promoting exchange together.

Example implementation of a digital twin exchange with the Asset Administration Shell concept

The Asset Administration Shell (AAS) is an emerging concept in digital twin data management. It is a virtual representation of a physical or logical unit, such as a machine or a product. The AAS enables relevant information about this unit to be collected, processed, and used to ensure efficient and intelligent production. The architectural standardization of the AAS enables communication between digital systems and is therefore the basis for cyber-physical systems (CPS). You can find out more about the need for digital twins in industry here: ZEISS Digital Innovation Blog – The digital twin as a pillar of Industry 4.0

The following article focuses on the actual implementation of an example. In one scenario, we will use a reference implementation from the Fraunhofer Institute for Experimental Software Engineering IESE (BaSyx implementation of AAS v3) to establish an information exchange between two participating partners. For the sake of simplicity, both parties are based within one company and share the infrastructure.

Note: This application is also suitable for cross-company distribution. The respective security aspects are not part of this scenario.

Scenario

Factory A is a manufacturer of measuring head accessories and produces pushbuttons, among other products.

Factory B manufactures measuring heads and would like to install pushbuttons from factory A on its measuring head.

In addition to the physical forwarding of the pushbutton, information must also be exchanged. The sales department of factory A provides contact information so that the technical purchasing department of factory B can access and negotiate prices. Moreover, factory A must also provide documentation, etc.

The AAS concept is to be used for this exchange of information:

Contact details
Documentation

Infrastructure

In our scenario, we want to use type 2 of the Asset Administration Shell. Here, the AAS is provided on servers and there is no exchange of AASX files, but rather communication between the different services via REST interfaces.

In our scenario, the different services run within a Kubernetes cluster in Docker containers.

Services

To provide the AAS, we use the BaSyx implementation (https://github.com/eclipse-basyx/basyx-java-server-sdk). Specifically, we use the following components:

AAS repository
Submodel repository
ConceptDescription repository
Discovery service
AAS registry
Submodel registry

AAS repository, submodel repository and ConceptDescription repository are provided within one container – the AAS environment.

Technical implementation

Communication takes place exclusively via API calls to the REST interfaces of the respective services.

We use simple Python scripts to illustrate a communication in which manufacturing factory B wants to receive information about a product (here: a pushbutton for a sensor) from component factory A.

Procedure

Factory B only knows the asset ID of one pushbutton, but knows nothing else about this component.

Factory B queries the discovery service with the asset ID in order to obtain an AAS ID.
This AAS ID is used to query the central AAS registry, which returns the AAS repository endpoint of factory A.
Via this endpoint and the AAS ID, factory B receives the complete Asset Administration Shell, including the relevant submodel IDs.
With these submodel IDs (e.g., for contact details and documentation), factory B queries the submodel registry of factory A and receives the submodel repository endpoints.
Factory B queries the submodel endpoints and receives the complete submodel data. This data includes the desired contact information for the buyer and the complete documentation for the pushbutton asset.
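The lookup chain can be sketched without any network access by letting plain dictionaries stand in for the services. This only illustrates the order of the lookups and is not the BaSyx API: all IDs, URLs, and field names below are hypothetical, and in the real setup each dictionary access would be a REST call to the corresponding service.

```python
# In-memory stand-ins for the AAS services. All IDs, URLs, and values are hypothetical.
DISCOVERY = {"asset:pushbutton-001": "aas:pushbutton-001"}
AAS_REGISTRY = {"aas:pushbutton-001": "http://factory-a.example/aas-repo"}
AAS_REPOSITORIES = {
    "http://factory-a.example/aas-repo": {
        "aas:pushbutton-001": {
            "id": "aas:pushbutton-001",
            "submodel_ids": ["sm:contact", "sm:documentation"],
        }
    }
}
SUBMODEL_REGISTRY = {
    "sm:contact": "http://factory-a.example/sm-repo",
    "sm:documentation": "http://factory-a.example/sm-repo",
}
SUBMODEL_REPOSITORIES = {
    "http://factory-a.example/sm-repo": {
        "sm:contact": {"email": "sales@factory-a.example"},
        "sm:documentation": {"manual": "pushbutton-manual.pdf"},
    }
}


def fetch_component_data(asset_id):
    """Resolve an asset ID to its submodel data, following the five steps above."""
    aas_id = DISCOVERY[asset_id]                     # step 1: discovery service
    aas_endpoint = AAS_REGISTRY[aas_id]              # step 2: AAS registry
    shell = AAS_REPOSITORIES[aas_endpoint][aas_id]   # step 3: AAS repository
    submodels = {}
    for submodel_id in shell["submodel_ids"]:
        endpoint = SUBMODEL_REGISTRY[submodel_id]    # step 4: submodel registry
        submodels[submodel_id] = SUBMODEL_REPOSITORIES[endpoint][submodel_id]  # step 5
    return submodels


data = fetch_component_data("asset:pushbutton-001")
```

The sketch makes one design property visible: factory B never needs to know where factory A hosts its repositories, because every endpoint is resolved through a registry at request time.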

The process is illustrated in detail below using UML sequence diagrams.

UML sequence: Creation of AAS registration in component factory A

Factory A is responsible for registering its own component.

Figure 1: UML sequence: Creation of the AAS registration in component factory A

UML sequence: Requesting the component data

Factory B requires data about the component and requests it from the AAS infrastructure. Here, factory B may only have access to the asset ID of the component, in our case that of the pushbutton.

Figure 2: UML sequence: Requesting the component data

Implementation

For the process described above to work, two things are required. First, the services mentioned must be made available. Second, shells and submodels must be created and registered in the registries.

Service deployment

Deployment takes place via Docker containers that run within a Kubernetes cluster. Each BaSyx image receives a Kubernetes deployment, which starts the corresponding pods via Kubernetes replica sets.
The corresponding ports are made accessible from the host, for example by means of port forwarding. This is necessary so that the example Python scripts can address the APIs.

A relatively simple Kubernetes deployment configuration looks like this. There are four deployments, each with a replica set.

Figure 3: Kubernetes deployment configuration

Creating the AAS from factory A

Factory A would like to provide the contact details for its pushbutton component, for example.

For this, submodels are created in the AAS repository (here: in the AAS environment) and registered in the submodel registry.

Subsequently, the Asset Administration Shell is created in the AAS repository and registered in the AAS registry.

The shell reference is then added to the AAS repository.

This completes the registration process from the component factory.

Contact details requests from factory B

Factory B would like to receive the contact details and requests the various AAS services according to the workflow described above. Ultimately, the submodel data set containing the required values is obtained and can then be made available within a user interface, for example.

Conclusion

The Type 2 Asset Administration Shell is a distributed system that is based on the corresponding repositories, registries and the discovery service. In our example, we have only used a simple submodel template for contact data. However, there are far more templates available for many applications.

Communication between the services for the provision and retrieval of data is relatively straightforward, although aspects such as security were not a focus in this scenario.

The example implementation described in this article gives an idea of the immense potential of the AAS concept and encourages users to start concrete implementations.

This post was written by:

Daniel Bruegge

Daniel Brügge works as a software developer at ZEISS Digital Innovation with a focus on cloud development and distributed applications.

Cyber-physical systems as a pillar of Industry 4.0

What is that?

A cyber-physical system (CPS) is used to control a physical-technical process and, for this purpose, combines electronics, complex software and network communication, e.g. via the Internet. One characteristic feature is that all elements make an inseparable contribution to the functioning of the system. For this reason, it would be wrong to consider any device with some software and a network connection to be a CPS.

Especially in manufacturing, CPSs are often mechatronic systems, e.g. interconnected robots. Embedded systems form the core of these systems; they are interconnected by networks and supplemented by central software systems, e.g. in the cloud.

Due to their interconnection, cyber-physical systems can also be used to automatically control infrastructures that are located far away from each other or a large number of locations. These could only be automated to a limited extent – until now. Some examples of this are decentrally controlled power grids, logistics processes and distributed production processes.

Thanks to their automation, digitalization and interconnection, CPS provide a high degree of flexibility and autonomy in manufacturing. This enables matrix production systems, which support a wide range of variants at large and small quantities [1].

So far, no standardized definition has been established, as the term is used broadly and non-specifically and is sometimes used to market utopian-futuristic concepts [2].

Where did this term originate?

In recent years, innovations in the fields of IT, network technology, electronics, etc. have made complex, automated and interconnected control systems possible. Academic disciplines such as control engineering and information technology offered no suitable concept for the new mix of technical processes, complex data and software. As a result, a new concept with a suitable name was needed.

The term is closely related to the Internet of Things (IoT). Moreover, cyber-physical systems make up the technical core of many innovations that bear the label “smart” in their name: Smart Home, Smart City, Smart Grid etc.

Features of CPS

As mentioned above, there is no generally recognized definition. But the following characteristics can be distilled from the multitude of definitions:

At its core there is a physical or technical process.
There are sensors and models to digitally record the status of the process.
There is complex software to allow for a (partially) automatic decision to be made based on the status. While human intervention is possible, it is not absolutely required.
There are technical means for implementing the selected decision.
All elements of the system are interconnected in order to exchange information.

One CPS design model is the layer model according to [2].

Figure 1: Layer model for the internal structure of cyber-physical systems

Examples of cyber-physical systems

Self-controlled manufacturing machines and processes (Smart Factory)
Decentralized control of power generation and consumption (Smart Grids)
Household automation (Smart Home)
Traffic control in real time, via central or decentral control with traffic management systems or apps (element of the Smart City)

Example of an industrial cyber-physical system

This example shows a manufacturing machine that can operate largely autonomously thanks to software and interconnection, thereby minimizing idle times, downtimes and maintenance times. As an example, let us assume a machine tool for cutting.

Interconnected elements of the system:

Machine tool with
- QR code camera for workpiece identification
- RFID reader for tool identification
- Automatic inventory monitoring
- Wear detection and maintenance prediction
Central IT system for design data and tool parameters (CAM)
MES/ERP system

The manufacturing machine of our example is capable of identifying the workpiece and the tool. The common technologies RFID or QR code can be used for this purpose. A central IT system manages design and specification data, e.g. a computer-aided manufacturing system (CAM) for CNC machines. The manufacturing machine retrieves all the data required for processing from the central system using the ID of workpiece and tool. As a result, there is no need to enter parameters manually as the data is processed digitally throughout. The identification allows the physical layer and data layer of a cyber-physical system to be linked.

The digitized data for workpieces, machines and other manufacturing elements can be grouped under the term digital twin, which was presented in the blog article “Digital twins: a central pillar of Industry 4.0” by Marco Grafe.

The set-up tools and the material and resource inventories available in the machine are checked on the basis of the design and specification data. The machine notifies personnel if necessary. By performing this validation before processing begins, rejects can be avoided and utilization increased.
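The validation step before processing can be sketched as follows. The sketch assumes the central system can be queried by workpiece ID and returns the required tool and material quantity. The data model, the IDs, and the function name are invented for illustration and do not correspond to any specific CAM or MES product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobSpec:
    """Specification data for one workpiece (hypothetical data model)."""
    required_tool: str
    required_material_units: int


# Stand-in for the central IT system, keyed by workpiece ID (hypothetical data).
CAM_JOBS = {"WP-001": JobSpec(required_tool="T-12", required_material_units=3)}


def check_setup(workpiece_id, mounted_tool_ids, material_units_in_stock):
    """Return a list of problems; an empty list means processing may start."""
    job = CAM_JOBS.get(workpiece_id)
    if job is None:
        return [f"unknown workpiece {workpiece_id}"]
    problems = []
    if job.required_tool not in mounted_tool_ids:
        problems.append(f"tool {job.required_tool} is not mounted")
    if material_units_in_stock < job.required_material_units:
        problems.append("not enough material in stock")
    return problems


ok = check_setup("WP-001", {"T-12", "T-07"}, material_units_in_stock=5)
blocked = check_setup("WP-001", {"T-07"}, material_units_in_stock=1)
```

In this sketch, a non-empty result corresponds to the notification sent to personnel before processing begins.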

The machine monitors its status (in operation, idle, failure) and reports the status digitally to a central system that records utilization and other operating indicators. These types of status monitoring functions are typically integrated into a Manufacturing Execution System (MES) and are now in widespread use. In our example, the machine is also able to measure its own wear and tear in order to predict and report maintenance requirements, thereby increasing its autonomy. These functions are known as predictive maintenance. All these measures improve machine availability and make maintenance and work planning easier.

Through the use of electronics and software, our fictitious manufacturing machine is capable of working largely autonomously. The role of humans is reduced to feeding, set-up, troubleshooting and maintenance; humans only support the machine in the manufacturing process.

References

[1] Forschungsbeirat Industrie 4.0, „Expertise: Umsetzung von cyber-physischen Matrixproduktionssystemen,“ acatech – Deutsche Akademie der Technikwissenschaften, München, 2022.

[2] P. H. J. Nardelli, Cyber-physical systems: theory, methodology, and applications, Hoboken, New Jersey: Wiley, 2022.

[3] P. V. Krishna, V. Saritha and H. P. Sultana, Challenges, Opportunities, and Dimensions of Cyber-Physical Systems, Hershey, Pennsylvania: IGI Global, 2015.

[4] P. Marwedel, Eingebettete Systeme: Grundlagen Eingebetteter Systeme in Cyber-Physikalischen Systemen, Wiesbaden: Springer Vieweg, 2021.

Kinematic Simulation for Beginners

Introduction

Mass individualized production, demographic change, labor shortage, and manufacturing reshoring are some global challenges that producing companies worldwide face. As a result, production sites in high-wage countries demand highly automated and flexible production systems to remain competitive. Industrial robots and custom machines have proven to be key assets and enablers in addressing some of these challenges. The flexible programming of these manufacturing systems allows tasks such as handling, transportation, and various production processes to be performed automatically with high precision, speed, and quality.

Although the benefits of such production systems have been widely demonstrated, it should be noted that there is a significant amount of integration and programming that must be considered before these systems can be run productively. In most cases, developing software for production systems requires using physical components under real conditions. However, the availability of such systems is limited in most cases for various reasons, e.g., the system is running, is being developed in parallel, or, in the worst case, does not exist. This problem causes software development to be delayed or postponed until the necessary physical components are available and integrated. In addition, programming an industrial robot is considered a non-trivial task that requires skilled operators with a good spatial understanding of the workspace and domain knowledge. Their main task is to program a sequence of motions that will ensure the completion of the production process while avoiding collisions and ensuring the process’s quality. For these reasons, industrial robot programming is still manually performed and considered challenging and resource-intensive.

At a more abstract level, these problems are common to software development in other domains. So the question arises: How do developers deal with these problems when there are no modules or interfaces? We create mocks! In a sense, mocks are nothing more than models that simulate a desired functionality. However, modeling a robot sounds a bit more complicated than mocking a database or an interface. This article aims to prove the opposite and presents a quick tutorial on how to create kinematic simulation models to make programming kinematic manufacturing systems more straightforward and efficient.

Takeaways for the reader:

Basic understanding of kinematic simulation models.
Ability to create a simple kinematic model of a manipulator (e.g., robot, rotary table, linear axes) based on Free and open-source software (FOSS) that can be used for prototyping, development, and testing purposes.

Kinematic Modelling

Reference System

Before simulating the kinematics of a robot system, some essential mathematical concepts and the common terminology of kinematic modeling must first be understood. To this end, consider a minimal system consisting of a revolute joint \(j_1\) and a link \(l_1\).

The link is rigidly coupled to the joint. This means that when the joint rotates about its z-axis (frame \(B_{j_1}\)) by an angle \(\phi_{j_1}\), link \(l_1\) rotates with it. Assume the system’s origin is located at the base coordinate frame \(B_0\). Moreover, the vector \(p_0^w := (x, y, z, \alpha^x, \beta^y, \phi^z)^T\) describes the link’s pose (position and rotation) at the work frame \(B_w\). The 2D kinematic model of this system and its components is illustrated on the left of Figure 1.

Kinematic Model

Now, assuming that we need to program or simulate some motions, we will inevitably be confronted with at least one of the following problems:

Inverse Kinematic Problem: Which joint angle \(\phi_{j_1}\) corresponds to an actual pose \(p_0^{w,act}\)?
Forward Kinematic Problem: Which link pose \(p_0^w\) results from the actual joint angle \(\phi_{j_1}^{act}\)?

These questions are depicted on the right of Figure 1 and represent the fundamental problems of kinematic modeling, denoted as the inverse and forward kinematic problems. To answer either of these questions, we first need to determine the spatial relationships between all consecutive^[1] frames of the system, that is, from the base \(B_0\) to the joint \(B_{j_1}\) and from the joint \(B_{j_1}\) to the work frame \(B_w\). Let the relative spatial relationship^[2] between two frames be modeled by the translational components \(x_{\Delta}\), \(y_{\Delta}\), and \(z_{\Delta}\) and their rotational counterparts \(\alpha_{\Delta}^x\), \(\beta_{\Delta}^y\), and \(\phi_{\Delta}^z\). Table 1 shows the geometric relationships for all consecutive frames of the kinematic system of Figure 1.

Base Frame	Reference Frame	Translation	Rotation
Base \(B_0\)	Joint \(B_{j_1}\)	\(x_{\Delta}\) = 0, \(y_{\Delta}\) = 0, \(z_{\Delta}\) = \(d_0\)	\(\alpha_{\Delta}^x\) = 0, \(\beta_{\Delta}^y\) = 0, \(\phi_{\Delta}^z\) = \(\phi_{j_1}\)
Joint \(B_{j_1}\)	Work frame \(B_w\)	\(x_{\Delta}\) = \(a_1\), \(y_{\Delta}\) = 0, \(z_{\Delta}\) = \(d_0\)	\(\alpha_{\Delta}^x\) = -90°, \(\beta_{\Delta}^y\) = 0, \(\phi_{\Delta}^z\) = 0

Table 1: Kinematic relationships

Having defined the geometric relationships of our system, we can now describe the kinematic model that answers the previous questions. For example, the forward kinematic model, which gives the resulting pose for a commanded joint angle, can be described using the trigonometric relationships depicted in Figure 2.

*Figure 2: Trigonometric relationships of reference system*

\(x_0^w = a_1 \cos(\phi_{j_1})\)

\(y_0^w = 0\)

\(z_0^w = d_0 + a_1 \sin(\phi_{j_1})\)

Although these geometric functions sufficiently model the kinematics of our reference system, describing a kinematic model in this way is not so straightforward for more complex systems with multiple joints and links. For this reason, other mathematical approaches, e.g., homogeneous transformation matrices or quaternions, are generally used for computing multiple coordinate transformations. The description of these techniques falls outside this blog’s scope. For the rest of the article, knowing which inputs and outputs we can expect from a kinematic model is sufficient. These models can be treated as black-box components, as depicted in Figure 3.

*Figure 3: Forward and inverse kinematic black-box models*
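For the reference system, both black boxes are small enough to write down directly. The following sketch implements the forward model from the trigonometric relationships above and derives the inverse model by solving them for the joint angle with `atan2`. The function names and the link length of 0.3 m are assumptions for illustration; the base height of 0.4 m is taken from the URDF excerpt in Figure 4.

```python
import math


def forward_kinematics(phi_j1, a_1, d_0):
    """Return the work-frame position (x, y, z) for joint angle phi_j1 in radians."""
    x = a_1 * math.cos(phi_j1)
    y = 0.0
    z = d_0 + a_1 * math.sin(phi_j1)
    return (x, y, z)


def inverse_kinematics(x, z, d_0):
    """Return the joint angle in radians that places the work frame at (x, 0, z).

    Assumes the target position is reachable, i.e. it lies on the circle of
    radius a_1 around the joint.
    """
    return math.atan2(z - d_0, x)


A_1 = 0.3  # link length in meters (assumed value)
D_0 = 0.4  # base height in meters (from the URDF excerpt)

position = forward_kinematics(math.radians(30.0), A_1, D_0)
recovered_angle = inverse_kinematics(position[0], position[2], D_0)
```

Feeding the forward result back into the inverse model recovers the commanded angle, which is a simple consistency check for any pair of kinematic models.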

Implementation

Now that the core concepts of kinematic modeling have been introduced, we turn to a practical way of describing kinematic chains.

There are a handful of specifications and formats that address the modeling of kinematic chains, e.g., COLLADA, AutomationML, and OPC UA Robotics. However, in our experience, no standardized format has become established within the industry. This represents a broader problem in the robotics domain, where programming languages are primarily vendor-specific, and there are no standards for programming or modeling robots. This is one of the reasons why the Robot Operating System (ROS) was founded in 2010. ROS is a FOSS robotics middleware that includes several libraries (e.g., kinematic modeling, perception, visualization, path planning) for hardware-agnostic programming of robotic systems. This has made ROS the state-of-the-art framework used within robotics research. Because of its popularity and characteristics (e.g., performance, hardware-agnostic, FOSS, modularity, SOA), manufacturers of robots, field devices (e.g., grippers and sensors), and software vendors have begun to offer programming interfaces for ROS.

As part of the development of ROS, the Unified Robot Description Format (URDF) was introduced for modeling kinematic chains. The URDF is an open standard XML schema for describing the geometric relationships between joints and links of a robot. In addition to modeling kinematic chains, the URDF makes it possible to model the physical properties of joints (e.g., inertia, dynamics, and axis limits) and to use CAD files for modeling the volumetric properties of links, which can be used for collision testing. Furthermore, since the URDF follows an XML schema, kinematic models can be represented in a readable manner. For example, the excerpt in Figure 4 describes the kinematic relationships between the joint \(j_1\) and the link \(l_1\) from Table 1.

Having described the geometric relationships between all links and joints using a URDF file, the kinematic model can be visualized and used to calculate end-effector positions or required joint rotations. ROS integrates a handful of packages that use third-party libraries implementing all these functionalities. The use of these libraries is described in the ROS documentation.

<!-- All links of our model. -->
<!-- The root link in ROS is called base_link and represents the base frame (B_0) of our system. -->
<link name="base_link"/>
<!-- Link 1 of our model. -->
<link name="link_1"/>
<!-- The work frame of our model is represented as a link. -->
<link name="work_frame"/>

<!-- All joints of our model. -->
<!-- Revolute joint 1 couples the base link (parent link) with link 1 (child link).
The joint is located at the origin of the child link. -->
<joint name="joint_1" type="revolute">
	<parent link="base_link"/>
	<child link="link_1"/>
	<!-- Rotation axis of the joint: in our case the z-axis of the joint frame, in positive direction. -->
	<axis xyz="0 0 1"/>
	<!-- The transformation between the parent and child link is given here. -->
	<!-- The translational components (xyz) are given in meters. -->
	<!-- The rotation is expressed by the Euler angles (rpy) in radians according to the following
	notation: (r)oll (x-axis rot.), (p)itch (y-axis rot.), and (y)aw (z-axis rot.). -->
	<origin xyz="0 0 0.4" rpy="1.57079632679 0.0 0.0"/>
	<!-- A revolute joint must also define limits: effort, lower and upper joint position (radians), and velocity (radians per second). -->
	<limit effort="100" lower="-0.175" upper="3.1416" velocity="0.5"/>
</joint>

Figure 4: URDF excerpt describing kinematic relationship between joint \(j_1\) and link \(l_1\)
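Because a URDF file is plain XML, its content can be inspected with standard tooling even without a ROS installation. The following sketch uses Python's `xml.etree.ElementTree` to read the joint definition from a shortened copy of the excerpt. The `<robot>` wrapper element and its name are added here as an assumption to obtain a well-formed document; the work frame is omitted for brevity, and the function name is illustrative.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the excerpt, wrapped in a <robot> root element (assumed name).
URDF = """<robot name="minimal_kinematic_model">
  <link name="base_link"/>
  <link name="link_1"/>
  <joint name="joint_1" type="revolute">
    <parent link="base_link"/>
    <child link="link_1"/>
    <axis xyz="0 0 1"/>
    <origin xyz="0 0 0.4" rpy="1.57079632679 0.0 0.0"/>
    <limit effort="100" lower="-0.175" upper="3.1416" velocity="0.5"/>
  </joint>
</robot>"""


def parse_floats(text):
    """Convert a whitespace-separated attribute such as "0 0 0.4" to a tuple of floats."""
    return tuple(float(value) for value in text.split())


def read_joints(urdf_text):
    """Return a dictionary describing every joint found in the URDF text."""
    root = ET.fromstring(urdf_text)
    joints = {}
    for joint in root.findall("joint"):
        origin = joint.find("origin")
        limit = joint.find("limit")
        joints[joint.get("name")] = {
            "type": joint.get("type"),
            "parent": joint.find("parent").get("link"),
            "child": joint.find("child").get("link"),
            "xyz": parse_floats(origin.get("xyz")),
            "rpy": parse_floats(origin.get("rpy")),
            "limits": (float(limit.get("lower")), float(limit.get("upper"))),
        }
    return joints


joints = read_joints(URDF)
```

A script like this can serve as a quick sanity check of a model, for example to verify joint limits, before the file is handed to the ROS tooling.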

Extended System

Having understood the basics of kinematic modeling and how to use URDFs to implement a kinematic model, nothing stands in the way of describing more complex multi-joint kinematic chains like the one shown in Figure 5.

*Figure 5: Extended kinematic model considering a prismatic and a revolute joint.*

The corresponding geometric relationships are given in Table 2. In addition, the complete URDF can be found attached.

Base Frame	Reference Frame	Translation	Rotation
Base \(B_0\)	Joint \(B_{j_1}\)	\(\Delta x\) = \(a_{j_1}\), \(\Delta y\) = 0, \(\Delta z\) = \(d_0\)	\(\alpha_x\) = 0, \(\beta_y\) = 0, \(\phi_z\) = 0
Joint \(B_{j_1}\)	Joint \(B_{j_2}\)	\(\Delta x\) = \(a_2\), \(\Delta y\) = 0, \(\Delta z\) = 0	\(\alpha_x\) = 90°, \(\beta_y\) = 0, \(\phi_z\) = 0
Joint \(B_{j_2}\)	Work frame \(B_w\)	\(\Delta x\) = \(a_3\), \(\Delta y\) = 0, \(\Delta z\) = 0	\(\alpha_x\) = -90°, \(\beta_y\) = 0, \(\phi_z\) = 0

Table 2: Kinematic relationships extended system

The URDF can then be straightforwardly used within ROS to visualize and position joints, as depicted in Figure 6.

Joint 1: \(a_{j_1}\) = -170mm, Joint 2: \(\phi_{j_2}\) = -48°

Joint 1: \(a_{j_1}\) = 80mm, Joint 2: \(\phi_{j_2}\) = 52°

Figure 6: URDF visualization in ROS using two different joint configurations and the following values: \(d_0\) = 300 mm, \(a_2\) = 500 mm, \(a_3\) = 200 mm. The image on the right side also depicts the integration of surface models for modeling the volumetric properties of the links.
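The forward model of the extended system can be sketched in the same way as for the reference system. The sketch rests on an assumption that is not spelled out in Table 2: the prismatic joint shifts the chain along the x-axis by \(a_{j_1}\), and the revolute joint rotates the last link in the x-z plane by \(\phi_{j_2}\), analogous to the minimal system. The default values are those given for Figure 6; the function name is illustrative.

```python
import math


def extended_forward_kinematics(a_j1, phi_j2_deg, d_0=300.0, a_2=500.0, a_3=200.0):
    """Return the work-frame position (x, y, z) in millimeters.

    a_j1 is the prismatic displacement in millimeters, phi_j2_deg the revolute
    joint angle in degrees. Assumes rotation of the last link in the x-z plane.
    """
    phi = math.radians(phi_j2_deg)
    x = a_j1 + a_2 + a_3 * math.cos(phi)
    z = d_0 + a_3 * math.sin(phi)
    return (x, 0.0, z)


# The two joint configurations shown in Figure 6.
left_pose = extended_forward_kinematics(-170.0, -48.0)
right_pose = extended_forward_kinematics(80.0, 52.0)
```

For chains with more joints, writing such closed-form expressions by hand quickly becomes error-prone, which is where a URDF-based toolchain pays off.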

Summary and Outlook

Programming robots is a complex and resource-intensive task that requires expert knowledge, time, and, in most cases, access to the physical system. These barriers directly affect the software development and commissioning of such systems. A kinematic (mock) model of the robot makes it possible to program robots without the physical system and thereby reduces costs. However, modeling robot systems is a non-trivial task, and its first step is to describe the kinematic model. For this reason, this blog post first introduced the minimal mathematical fundamentals needed to understand kinematic modeling and then showed how kinematic models can be implemented in the standard URDF format.

With the insights from this blog post, the reader should be able to describe kinematic models that can be used as mock-ups for prototyping or development purposes. With the first obstacle of kinematic modeling overcome, the next steps might include:

integrate the kinematic model with a real robot system and build a digital twin (further reading: Smart Manufacturing, IOT with Azure Digital Twins)

offer kinematic models as microservices for development, testing, and commissioning purposes (further reading: Mocks in test environment)

develop more user-friendly programming frameworks based on kinematic simulations using cutting-edge technologies such as VR (virtual reality) or AR (augmented reality).

Minimal kinematic model (urdf):

minimal_kinematic_model.urdf

Extended kinematic model (urdf):

extended_kinematic_model.urdf

[1] For this reason, these models are commonly denoted as serial kinematic chains. There also exist parallel kinematic models for representing delta robots. The kinematic modeling of such systems lies outside the scope of this blog.

[2] In robotics, the transformation between two coordinate frames is frequently described using homogeneous transformations and a set of four parameters describing the translational and rotational displacement, known as the Denavit-Hartenberg parameters.

Smart Manufacturing at the office desk

Manufacturing Solutions

*Figure 1: Overview of the learning factory in Görlitz*

More and more start-ups, mid-sized companies and large corporations are using digitalisation and networking to expand their business and to develop entirely new business models, and the global demand for standardisation and implementation expertise is growing accordingly. Terms such as “Big Data”, “Internet of Things (IoT)” and “Industry 4.0”, which for a long time were little more than buzzwords, have evolved into real-life technologies that drive digital transformation and help companies to increase their productivity, optimise their supply chains and, ultimately, increase their gross profit margins. These companies primarily benefit from reusable services from hyperscalers such as Amazon, Microsoft, Google or IBM, but are often unable to implement tailor-made solutions with their own staff. ZEISS Digital Innovation (ZDI) supports its customers in their digital transformation as both a partner and development service provider.

Cloud solutions struggled to gain acceptance for a long time, especially in the industrial environment. This was due to widespread scepticism regarding data, IT and system security, as well as development and operating costs. In addition, connecting and upgrading a large number of heterogeneous existing systems required a great deal of imagination. For the most part, these basic questions have now been resolved, and cloud providers are using specific IoT services to attract new customers from the manufacturing industry.

To illustrate the typical opportunities and challenges of IoT environments as realistically as possible, an interdisciplinary ZDI team, consisting of experts in business analysis, software architecture, front-end and back-end development, DevOps engineering, test management and test automation, will use a proven agile process to develop a demonstrator that can later be used to check the feasibility of customer-specific requirements.

A networked production environment is simulated in the demonstrator using a fischertechnik Learning Factory and is controlled using a cloud application developed by us. With its various sensors, kinematics, extraction technology and, in particular, a Siemens S7 control unit, the learning factory contains many of the typical elements that are also used in real industrial systems. Established standards such as OPC UA and MQTT are used to link the devices to an integrated IoT gateway, which in turn supplies the collected data via a standard interface to the cloud services that have been optimised for this purpose. Conversely, the gateway also allows controlled access to the production facilities from outside of the factory infrastructure while taking the strict IT and system security requirements into account.
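The data flow from device to cloud can be pictured with a small sketch: a value read from the control unit is wrapped into a JSON payload and assigned to an MQTT-style topic before the gateway forwards it. Everything named here is a hypothetical illustration (the `Reading` type, the node identifier, the topic layout, and the payload fields); the demonstrator's actual data model is not described in this article, and the sketch deliberately uses only the Python standard library rather than an OPC UA or MQTT client.

```python
import json
from dataclasses import dataclass


@dataclass
class Reading:
    node_id: str    # OPC UA-style node identifier (hypothetical example)
    value: float    # measured value
    timestamp: str  # ISO 8601 timestamp in UTC


def to_mqtt_message(reading, factory="learning-factory", station="gripper"):
    """Map one reading to a (topic, payload) pair; the layout is an assumption."""
    topic = f"{factory}/{station}/telemetry"
    payload = json.dumps(
        {"nodeId": reading.node_id, "value": reading.value, "ts": reading.timestamp},
        sort_keys=True,
    )
    return topic, payload


reading = Reading("ns=2;s=Gripper.Pressure", 21.5, "2024-01-01T12:00:00Z")
topic, payload = to_mqtt_message(reading)
print(topic)
print(payload)
```

Keeping the mapping in one pure function makes it easy to test without a broker or a control unit, which matches the idea of testing without occupying machine time.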

*Figure 2: Gripping arm with blue NFC workpiece*

Establishing and securing connectivity for employees across all ZDI locations after commissioning is an organisational requirement, but it is also a core requirement of any practical IoT solution and has profound effects on the overall architecture. In terms of technology, the team will initially focus on the cloud services offered by Microsoft (Azure) and Amazon (AWS) and will contribute extensive experience from challenging customer projects in the IoT environment. The initial focus is on architecture and technology reviews and on implementing the first monitoring use cases. Building on this foundation, more complex use cases for cycle time optimisation, machine efficiency, quality assurance and tracing (track and trace) are being planned.

ZDI is also especially well positioned in the field of testing services. In contrast to extremely software-heavy industries such as logistics or the financial sector, test managers in production-related use cases have repeatedly been confronted with the question of how hardware, software and, in particular, their interaction at the control level can be tested fully and automatically without taking up valuable machine and system time. In highly complex production environments, such as those ZEISS has encountered in the semiconductor and automotive industries, digital twins, which are widely used elsewhere, offer only limited relief because relationships are difficult to model and some influencing factors are entirely unknown. This makes it all the more important to design a suitable testing environment that can be used to narrow down errors, reproduce them and eliminate them in the least invasive way possible.

We will use this blog to regularly report on the project’s progress and share our experiences.