Blog | Songlark

Verifying reactive systems in F*

March 24, 2023 · 11 min read

Bird enthusiast

This post describes early work embedding Lustre-style control systems in F*, and shows a demo of it in action in a real (simple) controller.

Introduction

For safety-critical systems such as the braking controllers in cars, we want strong assurances that the software that controls the system is correct. One way to achieve such assurances is to write them in a high-level language, such as Lustre, and formally prove that they satisfy a high-level specification. To build confidence that our program itself is correct, we can prove that our Lustre programs satisfy the specification using a model-checker such as Kind2. To build confidence that our compiled code faithfully implements our original program, we can use a verified compiler such as Vélus.

Unfortunately, however, there are a few practical issues with this approach: Kind2 and Vélus both use different dialects of Lustre, so it's not possible to use them both on the same input program without converting from one syntax to another. Vélus and Kind2 also support different feature subsets, because they have different objectives and priorities.

Aside from practical issues of syntax, there is also the more theoretical issue of whether Kind2's proofs apply to Vélus's generated code. The two tools define the semantics of the language in very different ways: Kind2 translates input programs to transition systems, which are good for reasoning about but aren't the best for executing. Vélus, on the other hand, uses a coinductive semantics which is designed for proving compiler correctness. These two different semantics have no formal connection, which doesn't give us much confidence that they really agree.

One option to build confidence that programs satisfy their specification would be to define a secondary semantics in Vélus, one which is better for reasoning about programs, and allow users to prove their programs correct manually in Coq. However, Coq has relatively little proof automation. Compared to the fully-automated proofs that Kind2 can produce, requiring manual Coq proofs would prohibit its use by many systems engineers.

Instead, I'd like to introduce Pipit, an embedded language for implementing and verifying control systems in F*. The goal is to reuse F*'s excellent SMT-solver-based proof automation to automatically verify transition systems via k-induction, which is the same key method that Kind2 uses to perform its proofs. There is also an imperative subset of F* for which C code can be extracted, so we can translate our control systems to imperative code and generate C code to run on embedded systems.

Pipit is a work-in-progress. Once the translation to transition systems and to imperative code have both been verified, we can be confident that any property we prove about the high-level program also applies to the generated imperative code. This is a reasonably strong guarantee, but it's not as strong as Vélus' guarantee that the generated assembly is correct, as the proof of correctness of F*'s C code generation is not mechanised yet. So far, I have proved soundness of the core part of the translation to transition systems for verification, but I still need to prove some additional features of the translation to transition systems and the translation to imperative code.

Pipit only supports a small core language yet and doesn't have a nice front-end syntax, but in its current state I can define simple controllers, verify them, and generate C code to execute on a small embedded system. It's enough for a small demo.

Plumbing a coffee machine

I have a domestic coffee machine with a water reservoir. In normal use, the water reservoir must be manually filled with water from the tap. I wanted to plumb the reservoir to receive water directly from the tap, but I was concerned about flooding the kitchen if the tap somehow got stuck open. To ensure that the kitchen wouldn't flood, I decided to implement a small controller to open and close the tap.

I have added a solenoid connected to the water mains, which I have mounted above the lid of the reservoir. The solenoid is normally-closed so that water cannot flow when the power is off. When power is applied, water flows from the mains into the reservoir. I have also added a float switch suspended from the lid of the reservoir, which allows the system to sense the water level. When the water goes above the level of the float switch, the switch turns off to indicate that the water level is sufficiently high. Finally, I have attached an "emergency stop" lever switch to the lid of the reservoir. When the lid is placed on the reservoir, the estop switch turns off; when the lid is removed, the estop switch turns on.

The system has two safety controls to reduce the risk of flooding: firstly, if the emergency stop lever indicates that the lid is not attached to the reservoir, the controller closes the tap. Secondly, if the tap has been open for over a minute, the controller closes the tap.

The control system is very simple: if the water level has been low for long enough, it opens the tap. If the level is high, if the emergency stop lever is on, or if the system is "stuck", then the tap is closed. Once the system becomes stuck, it stays stuck until you restart the microcontroller.

To define the control system in Pipit, we first define a function called once to check if a signal has been true at any point in the past ("at least once"):

let once (signal: exp) =
  recursive (fun once' -> signal || fby false once')

This function introduces a recursively-defined stream called once', which is true if the input signal is true, or if the previous value of once' is true (fby false once'). The false in fby false once' means that if there is no previous value, as is the case at the very start of execution, it defaults to false.

Using the once function, as well as a lastn t function that checks if a signal has been true for at least some window of history t, the controller looks like the following:

// Timeouts
let settle_time   = 100  // one second, assuming the system runs at 100Hz
let stuck_time    = 6000 // one minute
// Flags for bitfield
let solenoid_flag = 1
let stuck_flag    = 2

let controller estop level_low =
  // Try to turn the solenoid on if estop has been false and the water level
  // has been low for at least a second
  let sol_try = lastn settle_time (!estop && level_low) in
  // Consider the system to be stuck if, now or in the past, the solenoid has
  // been on for a minute
  let stuck   = once (lastn stuck_time sol_try) in
  // Only actually turn the solenoid on if we're not stuck
  let sol_en  = !stuck && sol_try in

  // Properties to be proved
  property "if estop then do not engage"
    (estop      => !sol_en);
  property "if level high then do not engage"
    (!level_low => !sol_en);

  // Encode the two results as a bitfield as we don't support tuples yet
  let result =
    (if sol_en then solenoid_flag else 0) +
    (if stuck  then stuck_flag    else 0) in
  result

(The actual implementation is syntactically messier because the core language doesn't have a nice front-end yet; here I am presenting the "aspirational" syntax.)

To prove that our controller satisfies the two (very simple) properties, we convert it to a transition system and prove it inductively. There is a bit of boilerplate here for the conversion, but the actual proof goes through automatically after applying the normalisation-by-evaluation tactic (tac_nbe) to simplify away the translation to a transition system.

let controller_lts =
  system_of_exp (controller (XVar 0) (XVar 1))

let controller_prove (): Lemma (ensures induct1' controller_lts) =
  assert (base_case' controller_lts) by tac_nbe ();
  assert (step_case' controller_lts) by tac_nbe ();
  ()

To generate C code, there is also a bit of boilerplate, but it's not too bad. The interface for the generated code has the usual reset and step functions for the controller:

typedef struct Example_Compile_Pump_input_s
{
  bool estop;
  bool level_low;
}
Example_Compile_Pump_input;

typedef struct Example_Compile_Pump_output_s
{
  bool sol_en;
  bool nok_stuck;
}
Example_Compile_Pump_output;

void Example_Compile_Pump_reset(Example_Compile_Pump_state *stref);

Example_Compile_Pump_output
Example_Compile_Pump_step(Example_Compile_Pump_input inp, Example_Compile_Pump_state *stref);

The implementation of the C code is surprisingly long for such a simple controller, but it works. My current translation to imperative code is very dumb and duplicates a lot of work, but this issue is fixable.

(The example is called Example.Pump, but there is no pumping here at all, only solenoiding.)

Demo

I have implemented the above on a microcontroller (a Raspberry Pi Pico) and attached it to my coffee machine. Here is a video of it in action.

Future work

I am happy to have Pipit working as a whole end-to-end system. We can implement a simple controller, prove some properties, and run them on a real embedded system, even though it's still very raw. I have verified the core of the translation to transition systems (which is used for proving systems correct), which means that any properties we prove on the transition system hold for the language's semantics. I'm confident that the rest of this translation can be verified, but first I'd like to focus on improving the language a bit.

The examples I showed above use an "aspirational" syntax, as the real implementation uses de Bruijn indices with no support for named variables. Manually writing programs with de Bruijn indices is pretty awful. There are standard approaches to fix this, but I haven't implemented them yet. The language is also untyped: all expressions are represented by integer values, and boolean operations implicitly treat non-zero integers as true. Again, there are standard approaches to fix this, but I wanted to see the system working end-to-end before investing time into these more-standard "engineering" problems.

Once I have improved the language and finished the verification of translation to transition systems, the obvious next step is to verify the translation to imperative code. This proof will give us confidence that the two translations agree, and that any properties we can prove really do hold on the executable code. This proof will be more challenging than verifying the translation to transition system, as there is a larger gap between the programming language's high-level semantics and the imperative code. I believe that this proof will be easier than the proof of correctness given by Vélus, the verified Lustre compiler, as the imperative subset of F* is still higher-level than the C that Vélus needs to generate. This smaller gap is a trade-off, however, as until F* itself is verified, we have a larger trusted computing base.

Finally, we need more evaluation, which involves writing and verifying real safety-critical systems in Pipit — not just coffee machines. I am excited about the possibilities of writing control systems in Pipit with F* as a metalanguage, as I believe having a good metalanguage will be more expressive than traditional Lustre, without sacrificing the beauty and simplicity of Lustre. I also believe that F*'s support for both automatic and manual proofs will be useful for verifying larger control systems.

Applicative invariants for Lustre

November 27, 2022 · 12 min read

Amos Robinson

Bird enthusiast

This article describes some preliminary work on an invariant generation technique for Lustre. I gave a talk on this at Synchron 2022 in late November 2022. Slides are available here (PDF), but the slides are quite large (25MB) as they contain many bird photographs. I intend to do more evaluation and write this up as a paper in the future.

For safety-critical programs written in Lustre, we can prove that they satisfy certain correctness properties using model checking. Most model checkers for Lustre first transform the program into a transition system and prove properties on the transition system, rather than proving them directly on the streaming semantics of Lustre, because the streaming semantics becomes complex for streams with arbitrary clocks. However, some properties which are obvious on the streaming semantics require extra inductive invariants on the transition system. These invariants may need to refer to internal state of components, which makes them difficult to write in the source language. For such properties, if the model checker's invariant generation techniques do not find the required invariants within the time limit, the proof attempt will time out. I introduce an invariant generation technique called Applicative Invariants for Lustre (AIL) which uses equivalence graphs (e-graphs) to find invariants that hold on the streaming semantics but require inductive invariants on the transition system. This invariant generation technique is designed to work in cooperation with other techniques, as the invariants it proposes are not as diverse as those proposed by instantiation-based invariants [Kahsai11] or counterexample-based invariants such as IC3 [Bradley12]. The advantage is that it draws from a smaller set of possible invariants so that the whole set can be explored exhaustively, making it more resiliant to minor program changes.

The Songlark model checker

October 17, 2022 · 5 min read

Amos Robinson

Bird enthusiast

Real-time control systems, such as the microcontroller that controls your car's brakes, can have horrific consequences if they go wrong. That's why, for safety-critical software, we need real assurance that the software always does the right thing.

Introduction​

Plumbing a coffee machine​

Demo​

Future work​

Introduction

Plumbing a coffee machine

Demo

Future work