MoDeS3 – Railway system: Summary of the project

In the last two months we have invested a great deal of time in demonstrating model-based development with the help of a mixed-criticality model railway system. We wanted to showcase the newest techniques that can be used to develop highly reliable safety-critical systems.

We started from an old hardware architecture, but for this project we completely redesigned it to meet the requirements of the envisioned system – we wanted to demonstrate various development, verification, and validation techniques, integrate them, and run a real-time open-source operating system on the controllers. After the architectural plans were made, we decided to use BeagleBone Black boards running a special Debian-based Linux with a real-time kernel developed by Texas Instruments.

The development of safety-critical systems starts by clarifying the requirements – in this case, our goal is to avoid crashes and accidents. Then comes the more difficult part: model-based development of the distributed logic. This allows verification and code generation based on the model, resulting in a more reliable system. The modelling was carried out in the open-source Yakindu Statechart Tools. Validation and verification rules were used to find design problems at the early stages of development. The validation rules were defined with the help of the open-source IncQuery engine (used for graph queries), while the transformation rules were implemented in the VIATRA framework. The source code of the distributed logic was generated automatically from the design models.

MQTT is used as the communication platform; Mosquitto and Paho provided the technology to implement the communication infrastructure of the system. These IoT technologies proved their applicability in time-critical systems. Our experience was that they are not only easy to use, but also provide the reliability and performance needed for our project.
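
As an illustration, here is a minimal sketch of how a component might publish an event to the broker using the libmosquitto C client; the client ID, topic, and payload are hypothetical placeholders, not taken from the actual codebase:

```cpp
// Minimal sketch: publish an occupancy event over MQTT with libmosquitto.
// Client ID, topic, and payload are hypothetical placeholders.
#include <mosquitto.h>
#include <cstdio>
#include <cstring>

int main() {
    mosquitto_lib_init();
    struct mosquitto *mosq = mosquitto_new("section-controller", true, NULL);
    if (!mosq || mosquitto_connect(mosq, "localhost", 1883, 60) != MOSQ_ERR_SUCCESS) {
        std::fprintf(stderr, "could not connect to the broker\n");
        return 1;
    }
    const char *payload = "{\"section\": 3, \"occupied\": true}";
    // QoS 1, so the safety logic does not silently miss the event.
    mosquitto_publish(mosq, NULL, "modes3/section/3/occupancy",
                      (int)std::strlen(payload), payload, 1, false);
    mosquitto_disconnect(mosq);
    mosquitto_destroy(mosq);
    mosquitto_lib_cleanup();
    return 0;
}
```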

Conceptually, we have two different levels of monitoring: one runs on the programmable real-time unit (PRU) of the BeagleBone hardware, and the other runs at the system level.

We have developed a language to support the runtime verification of the local components: a special statechart language built with Xtext, from which we can generate embedded monitors.

On the top level, there is a computer-vision-based monitoring system. The video is processed in real time by OpenCV, an open-source library of computer vision techniques. An open-source complex-event processing engine maintains an abstract view represented in a live model: IncQuery patterns are used to define the critical events in the system, and VIATRA-CEP processes the event stream. Our developments will hopefully appear in a future official version of VIATRA-CEP.

During this challenge, one of the main difficulties regarding the complex-event processing part was creating a program module that can later be integrated into an open-source project. The main non-technological challenge was to design and implement a mathematical formalism appropriate for use in complex-event processing frameworks. Using the open-source IoT technologies of Eclipse was challenging at first, but since they are well documented and intuitive, they turned out to be simple to use after all. Integrating the complex-event processing engine with other open-source communication technologies was not difficult.

We have tested the system both in the IBM Bluemix cloud in Docker containers and on the real hardware. Node-RED flows and the generated statechart code have been deployed to IBM Bluemix. Each turnout has its own statechart, and they run separately, connected through the Node-RED flow, to make the communication easier and to use cutting-edge Internet of Things technology!

The high-level overview of the system is depicted in the next picture.

During the development we faced theoretical challenges, such as the new formalism for complex-event processing, the development of the distributed safety logic, and the verification of the components. On the other hand, we also solved many technology-related problems and integration tasks.

image

We have also shot some videos: Model Railway Track Side Perspective, Model Railway Onboard Footage #1 and #2, and Restlessly building the hardware. The videos are available on our blog.

Here is a picture giving a quick overview of the technologies we used.

image

During the development, we have also contributed to the open-source world! In addition, all modules we implemented are open source and available in our GitHub repository.

For more details, read our blog!

Runtime verification of the component behaviours in the distributed system – The embedded approach

We’ve covered the basics of our runtime verification workflow in a previous post – now it’s time to see it in action. To keep things simple, the monitor will only check whether a message that arrives at the BeagleBone is always followed by a response. A simple statechart to achieve this can be represented as:

Monitor automaton

We start from a state with no previous messages. If a message arrives, the message broker raises signal one and the statechart moves to a waiting state. This state has an entry action that arms a timer: if the state remains active after 500 time units have passed, it raises signal three. We decided to define a single time unit as one millisecond for the BeagleBone. This means that if a response isn’t sent within half a second, the statechart enters an error state (for debugging and the videos, we used two seconds). If the response was sent in time, we return to the first state and the process starts over.
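
Before looking at the concrete language, here is the same logic sketched in plain C++; the names and the signal numbering are hypothetical, and the real monitor is generated rather than hand-written:

```cpp
// Sketch of the monitor's logic (hypothetical names). One time unit = 1 ms.
enum State { NO_MESSAGE, AWAITING_RESPONSE, ERROR_STATE };

struct RequestResponseMonitor {
    State state;
    long  elapsedMs;                      // time spent in the current state
    static const long TIMEOUT_MS = 500;   // half a second without a response

    RequestResponseMonitor() : state(NO_MESSAGE), elapsedMs(0) {}

    // signal one: a message arrived at the BeagleBone
    void onMessage()  { if (state == NO_MESSAGE) { state = AWAITING_RESPONSE; elapsedMs = 0; } }
    // the matching response was sent
    void onResponse() { if (state == AWAITING_RESPONSE) state = NO_MESSAGE; }

    // called periodically; entering ERROR_STATE corresponds to raising signal three
    void tick(long deltaMs) {
        if (state == AWAITING_RESPONSE && (elapsedMs += deltaMs) >= TIMEOUT_MS)
            state = ERROR_STATE;
    }
};
```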

This monitor can be expressed in our Xtext language as depicted in the following code.

Statechart in our editor

After the statechart is described, we need to map it to the corresponding C++ monitor – luckily, this can be done with a click in Eclipse. The transformations are developed with the help of Xtend. We use low-level monitors with minimalistic C++ code so that they fit into the memory of embedded systems. Generating such code is easier than writing it by hand, just take a look at this snippet:

Generated monitor

Finally, we need to write small bits of glue code to connect the component to the system being monitored – mostly mapping the signals to their corresponding messages. We’ve put together a small system with the functionality of the above statechart, where we send messages and control the replies of the BeagleBone, which then enables certain pins’ outputs to light up LEDs. The first LED is lit in the second state, where a message has already arrived but hasn’t been answered yet. The second LED signals reaching an error state. We sent a message that was answered, and then a message that was not – and so we reached an error state.
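
For illustration, hypothetical glue code mapping broker messages to the monitor’s signals could look like the following, building on the monitor sketch above and the libmosquitto callback API; the topic names are made up:

```cpp
// Hypothetical glue code: translate incoming MQTT messages into monitor signals.
#include <mosquitto.h>
#include <string>

static RequestResponseMonitor monitor;  // the sketched monitor from above

static void on_message(struct mosquitto *mosq, void *userdata,
                       const struct mosquitto_message *msg) {
    const std::string topic = msg->topic;
    if (topic == "modes3/bbb/request")        monitor.onMessage();   // raises signal one
    else if (topic == "modes3/bbb/response")  monitor.onResponse();  // response arrived
}

// Registered once after connecting to the broker:
//   mosquitto_message_callback_set(mosq, on_message);
```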

MoDeS3 Lego Robot – Modelling and simulation of the robot arm

Simulation is an important means in complex cyber-physical and IoT applications as it can provide:

  1. analysis at design time of the development
  2. prediction at runtime

We have developed the physical model of the robot arm (crane) in the open-source OpenModelica framework.

The project states its goal as follows: "The goal with the OpenModelica effort is to create a comprehensive Open Source Modelica modeling, compilation and simulation environment based on free software distributed in binary and source code form for research, teaching, and industrial usage."

OpenModelica is a comprehensive environment: there are many built-in functions and libraries, and it is also able to compute the complex behaviours of hybrid systems. Our robot arm is inherently hybrid, as the controller has discrete modes while the physical system is continuous.

We have built the model of the robot arm.

Lego robot - the physical reality

By decomposing the physical model into smaller pieces, we get the high-level Modelica model depicted in the following figure.

Mapping the robot arm to Modelica

This hierarchical model can be further refined according to the parts of the physical component.

Two levels of hierarchy of the physical model

We have used many different components of the library and parameterized them according to our measurements, and what we got is quite close to reality. In our measurements, the model predicted the real physics with only 2-3% error. This is quite a good result 🙂

The simulation is provided to the controller as a service running on a separate Linux virtual machine. A virtual controller compiles the uploaded Modelica files with the given parameters and computes the simulation. The results can be shown by the web server of this virtual machine, or they can be sent back to the controller.
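
As a rough illustration of the idea (not the actual implementation), such a virtual controller could drive OpenModelica by generating a .mos script and invoking the omc compiler; the function and file names below are hypothetical:

```cpp
// Hypothetical sketch: compile and simulate an uploaded Modelica model
// by generating an OpenModelica script and calling the omc compiler.
#include <cstdlib>
#include <fstream>
#include <string>

void runSimulation(const std::string &modelFile,   // e.g. "RobotArm.mo"
                   const std::string &modelName,   // e.g. "RobotArm"
                   double stopTime) {
    std::ofstream script("run.mos");
    script << "loadFile(\"" << modelFile << "\");\n"
           << "simulate(" << modelName << ", stopTime=" << stopTime << ");\n";
    script.close();
    // omc writes a result file (e.g. RobotArm_res.mat) that the VM's web
    // server can plot, or that can be sent back to the controller.
    std::system("omc run.mos");
}
```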

The overview of the architecture is depicted in the next figure.

This way, simulation both at design time and at runtime is available through a simple interface!

And now let’s look at a simple simulation scenario! The model in the next picture is being simulated.

image

The angle of the three motors as a function of time is depicted in the next picture.

image

As can be seen, the three motors can work independently at different speeds! Note that this is just a simple example where only some parameters are examined. However, the tool is able to evaluate more complex parameters and movements of the system.

Simulation is a useful feature both at design time and at runtime: various questions regarding the path of the robot arm can then be answered!

The integration of the Lego robot with the controller: using MQTT from Python

In the Lego robot subproject of MoDeS3, it was important to integrate the sensor information from the Lego sensors, the control, and the logic responsible for safety. For this purpose, we have implemented an advanced control protocol in Python. This script runs on an embedded Linux distribution called ev3dev. With this operating system we can use the Lego devices connected to the EV3 brick while having access to all generic Linux packages, such as the Mosquitto broker.

The simplified overview of the dependencies is depicted in the next picture:

Overview

Our logic is able to detect when the motors are overdriven or when the robot is getting into a twisted, dangerous position, and it prevents the robot from pushing its limits further by stopping the motors. Through this protocol, other components can control the crane via MQTT messages, or stop it if any other sensor detects something dangerous.

The code snippet below runs in a cycle; if it notices a change in the state of the touch sensor, it sends a message through MQTT (Paho). If needed, it also executes some safety routines.

using MQTT

The sensors of the hardware, which provide this information, are depicted in the following figures:

Sensor, Sensor2

The solution is built modularly: each part is responsible for certain movements and sensor information. The control software enables the user to control any of the motors individually and to get back raw sensor data through MQTT. A safety module observes the behaviour and the available information, and intervenes if something goes wrong.

Controlling and ensuring safety of the Lego robot arm: the computer vision challenge

As the Lego robot arm executes a mission, information about the environment is required. Besides executing missions correctly, our goal is to detect any kind of danger caused by the robot. The goal is therefore twofold: first, the robot has to know when to execute a mission, i.e., when the object to be moved is in the right place. Second, it has to stop when a dangerous situation occurs, for example when a human is present near the robot.

We are building the monitoring infrastructure of the robot arm on computer vision technologies. OpenCV helps us detect and track the movements of the robot: while the robot is moving, no other moving objects should be present. In addition, computer vision detects whether the object to be transported is in the proper place for the robot to handle.

At first, we built only a robot arm with limited functionality. Based on our experiences, we have totally rebuilt it.

Rebuilding the robot was a big step forward for the project’s computer vision goals. Now we are able to detect the orientation and also the movement of the arm, without markers.

The new concept is to combine Lego elements into a larger component with a distinctive shape and colour. The camera observes the whole loading area from the top and searches for these elements.

Using the same camera frames, we can detect the orientation of the gripper and find the cargo and the train.

For the gripper we needed a marker in addition to the Lego element mentioned above. The marker is directly connected to the gripper’s motor, so it moves with the gripper during rotation and other movements. Its orientation is compared to the arm’s orientation, so it does not change while the arm moves; only the rotation influences it. The marker is a black circle, therefore we replaced colour detection with circle detection in this case.

The cargo has a distinctive colour.

In the following picture, the outputs of the various steps of the process are depicted.

Output of the image processing steps

In the following, we sketch how the detection algorithms work. First, the camera picture is transformed to the HSV (Hue-Saturation-Value) representation, which serves as the base for further processing. The next step is to decompose the picture according to the information we are looking for: to ease the later processing tasks, the picture is cut into pieces, so that the Lego arm, the gripper, and the object to be moved end up in different pieces of the picture.

To detect the various objects of interest, we need to assess the colour and the size of the objects in the picture; this happens in the next phase of the processing.

Edge detection algorithms search for the contours of the objects, and pattern matching algorithms try to find rectangles in the picture.

Numerical filters are then used to sort the found objects (rectangles) according to their size. From the filtered objects, special heuristics select those that are likely to be the searched objects, namely the arm, the gripper, and the load to be moved.

The movement of the arm is traced by reducing the problem to finding the moving rectangles in the filtered picture. Computing averages and tracing the middle points of the objects provides quite precise results.
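
A condensed sketch of this pipeline using the OpenCV C++ API is shown below; the HSV thresholds and size limits are hypothetical placeholders, not the values used in the project:

```cpp
// Condensed sketch of the detection pipeline (hypothetical thresholds).
#include <opencv2/opencv.hpp>
#include <vector>

std::vector<cv::Point2f> detectCandidates(const cv::Mat &frame) {
    // 1. Transform the camera picture to the HSV representation.
    cv::Mat hsv, mask;
    cv::cvtColor(frame, hsv, cv::COLOR_BGR2HSV);

    // 2. Keep only the distinctive colour of the searched component.
    cv::inRange(hsv, cv::Scalar(100, 120, 80), cv::Scalar(130, 255, 255), mask);

    // 3. Contour detection on the thresholded picture.
    std::vector<std::vector<cv::Point> > contours;
    cv::findContours(mask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);

    // 4. Fit rectangles, then filter them numerically by size.
    std::vector<cv::Point2f> centers;
    for (size_t i = 0; i < contours.size(); ++i) {
        cv::RotatedRect rect = cv::minAreaRect(contours[i]);
        double area = rect.size.width * rect.size.height;
        if (area < 400.0 || area > 20000.0) continue;  // numerical filter
        centers.push_back(rect.center);  // middle point, used for tracing movement
    }
    return centers;
}
```

The black circular marker of the gripper would be found analogously, using cv::HoughCircles instead of the colour threshold.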

So, as you can see, many algorithms work together on the control and safety assurance of the Lego robot arm. Despite this complexity, it works well in practice!

Runtime verification in the MoDeS3 project – an introduction

Most of us get into elevators, ride trains, and board airplanes without thinking about the danger. Trusting technology has become widespread, at least for those parts of it that have been surrounding us ever since we were born. We were taught that these devices are safe. Nothing operates perfectly, of course, but these devices somehow manage to keep the severity of failures to a minimum. In the IT world, however, having a few errors in projects with millions of lines of code is more than common. So how is it that these systems can still operate safely?

Safety-critical development

In safety-critical and mixed-criticality systems – such as ours – it is very important to ensure correctness not only at design time, but also during the operation of the system. Traditional verification, as introduced in a former post, can find design problems early in the design process. However, it is also a great idea to use the formally verified specification at runtime, to check whether the runs of the system conform to the specification.

There are many problems that cannot be handled by traditional design-time verification. We generate the code from the design models; however, there is no assurance that the code generator is correct. The second problem is that we cannot verify our hardware: problems caused by the hardware cannot be taken into account in the verification of the distributed logic. In addition, there can be transient or permanent faults in the components, caused by short circuits or many other kinds of events. Communication problems might result in the loss of messages, and errors in the network components might cause serious problems in the system.

In fact, in our small system we have faced many of the aforementioned problems; network delays in particular caused serious problems and “accidents”.

Runtime verification

The outputs of running systems can be validated by external components checking conformance. For complex systems, only the safety-critical parts are monitored, to keep the approach cost efficient. Our approach is to generate small monitors that receive the same inputs as the running component and verify that the output is correct. Erroneous behaviour can usually be detected by much smaller components. Imagine, for example, an airplane with the safety criterion that its acceleration cannot exceed 30 m/s^2. The components that control the exact power of the engines can be complex and distributed across many different parts of the system, and any error anywhere in the chain could lead to faulty behaviour. A single component at the end can monitor the result and signal an error if the final value would cause an acceleration higher than 30 m/s^2. Such monitors are a simple and efficient means to check certain properties.
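
The essence of such an end-of-chain monitor fits in a few lines; here is a toy illustration with made-up names and parameters:

```cpp
// Toy end-of-chain monitor: reject any command that would exceed the
// acceleration limit (hypothetical names and values).
const double MAX_ACCELERATION = 30.0;  // m/s^2, the safety criterion

bool thrustCommandIsSafe(double commandedThrustN, double massKg) {
    double predictedAcceleration = commandedThrustN / massKg;  // F = m * a
    return predictedAcceleration <= MAX_ACCELERATION;          // false => raise an error
}
```

However complex the control chain before it, this one check is enough to catch a violation of this single property.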

Workflow

Monitoring components have a much lower complexity than the system itself, so they can easily be generated from models, and we can trust their correctness with higher confidence. One of the most widely known modelling methods in the engineering world is statechart-based modelling, so our approach is based on engineers creating statechart models, from which monitoring components can be generated with minimal non-generated code (glue code required to connect the signals in the statecharts to the actual systems they monitor).

Statechart-based monitors

There are many flavors of statechart languages, from low-level ones that resemble state machines to complex ones like UML statecharts. Our goal was to develop an intuitive and highly expressive statechart language with features like error-state annotation to make monitor generation easy. A simple statechart can be described as follows:

Editor

The described system simply switches between an odd and an even state on every tick signal. The features of UML statecharts are fully available (entry and exit actions, state hierarchy, etc.), with a few extensions such as parametrized handling of certain situations. A system specification can hold multiple statechart definitions, which can communicate via shared signals.

Monitor generation from statecharts

We had a few ideas on how monitor generation should work, from flat, highly efficient monitors to high-level ones that preserve the statechart’s hierarchy in the code itself (creating easily extendable and readable source files in the process). We also had a few options between these two extremes, but most systems have either plenty of or nearly no extra computing power for monitoring, so a midway approach isn’t really necessary. We ended up implementing most of the functionality for both the high-level and the low-level monitor generators. So, let’s dive into how they work!

High-level monitor generation

This method preserves the hierarchy of the statecharts completely. As a specification might consist of multiple statecharts, a statechart handler is responsible for the top-level functionality. It works with a signal handler that connects the monitor to its environment and handles the signal queues. The signal queues are one of the parts that have to be written by hand. They will mostly operate either by using shared memory for the queues (locking functions are built in), or by attaching to a network interface to monitor packets, where certain packets raise certain signals. The statechart handler is responsible for the proper working of the statecharts, which in turn contain states and transitions. These are all separate classes derived from generic state and transition classes, which allows developers to extend the functionality of certain actions or guards. State names are also stored as strings, which can be used to send informative error messages. For example, a state with a built-in and a custom entry action is represented as follows:

Monitor example
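
For readers who prefer text, here is a simplified sketch of what such a generated state class could look like; the names are hypothetical, not the actual generated output:

```cpp
// Simplified sketch of a generated, hierarchy-preserving state class.
#include <iostream>
#include <string>

class State {
public:
    explicit State(const std::string &name) : name(name) {}
    virtual ~State() {}
    // Built-in entry behaviour; subclasses extend it with custom actions.
    virtual void onEntry() { std::clog << "entering state " << name << "\n"; }
    const std::string name;  // stored as a string for informative error messages
};

class AwaitingResponseState : public State {
public:
    AwaitingResponseState() : State("AwaitingResponse") {}
    virtual void onEntry() {
        State::onEntry();          // built-in entry action
        startTimer(500 /* ms */);  // custom entry action: arm the timeout
    }
private:
    void startTimer(int ms) { /* delegated to the separate timing class */ }
};
```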

The handling of time is delegated to a separate class. By default, it uses standard C++11 timing features and a clock with millisecond resolution, but it can easily be changed to platform-specific solutions: three functions need to be replaced in the class – one for getting the current time, one for getting the current time with an offset (which is needed for timers), and the comparison function between two timestamps.
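
A sketch of such a timing class, assuming std::chrono as the default implementation (the class and function names are hypothetical):

```cpp
// Sketch of the replaceable timing class. Swap these three functions to
// port the monitor to a platform-specific clock.
#include <chrono>

class MonitorClock {
public:
    typedef std::chrono::steady_clock::time_point Timestamp;

    // 1. Current time, with millisecond resolution by default.
    static Timestamp now() { return std::chrono::steady_clock::now(); }

    // 2. Current time shifted by an offset (used to arm timers).
    static Timestamp nowPlus(long offsetMs) {
        return now() + std::chrono::milliseconds(offsetMs);
    }

    // 3. Comparison between two timestamps (e.g. has a deadline passed?).
    static bool isBefore(Timestamp a, Timestamp b) { return a < b; }
};
```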

This allows the generator to be used not only for monitor generation, but also as a general tool to create object-oriented C++ code from statecharts. It naturally results in a larger codebase than a low-level monitor would, which is usually problematic when running on embedded systems.

Monitors with low overhead

After we realised that a BeagleBone PRU (which is what we wanted the monitor to run on) only has 8 kB of code memory, a low-level monitor generator had to be implemented. Code for handling hierarchy was the first to go – flat statecharts are just as good as hierarchical ones when the memory limit is 8 kB. The state names can also be omitted: even if it’s less friendly to the eye, storing an ID is enough to trace back which error state was reached. Creating child classes for states is also unacceptable overhead on such a small system, so we decided to use a general state class with function pointers. C++11 compilers are also seldom available for embedded systems, so the code was downgraded to be C++98 compliant (which is the reason why no nice-looking initialization lists are used). Finally, a single function running in an infinite loop checks for any changes in the signal queue (in shared memory) and takes time steps accordingly.

By now, you should have quite a bad feeling about how such code might look. Well, wait no more: here is a small part of an example statechart, showing how a transition is handled and how the monitoring code is built:
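
To give a flavour of it, here is a hedged sketch in the same low-level style; the state and signal IDs are hypothetical, not the actual generated output:

```cpp
// Hedged sketch in the style of the generated low-level monitor:
// C++98, flat states as integer IDs, transitions with function pointers.
typedef void (*Action)();

struct Transition {
    int    fromState;
    int    signal;
    int    toState;
    Action action;   // optional transition action, may be NULL
};

static void reportError() { /* write the error state's ID to the signal queue */ }

static const int NO_MESSAGE = 0, AWAITING = 1, ERROR_STATE = 2;
static const Transition TRANSITIONS[] = {
    { NO_MESSAGE, 1 /* message  */, AWAITING,    0           },
    { AWAITING,   2 /* response */, NO_MESSAGE,  0           },
    { AWAITING,   3 /* timeout  */, ERROR_STATE, reportError }
};

void runMonitor(volatile int *signalQueue) {  // queue lives in shared memory
    int state = NO_MESSAGE;
    for (;;) {                                // the single infinite loop
        int signal = *signalQueue;            // poll for a new signal
        if (signal == 0) continue;            // nothing pending
        *signalQueue = 0;                     // consume it (real code locks the queue)
        for (unsigned i = 0; i < sizeof(TRANSITIONS) / sizeof(Transition); ++i) {
            if (TRANSITIONS[i].fromState == state && TRANSITIONS[i].signal == signal) {
                if (TRANSITIONS[i].action) TRANSITIONS[i].action();
                state = TRANSITIONS[i].toState;
                break;
            }
        }
    }
}
```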

Looks horrible, right? Still, that’s how low-level monitors for embedded systems are born.