Increase device driver quality with device and platform simulation

By Sam Post
Numonyx Software Engineer

Device drivers are critical software components that transform complex hardware implementation details into simple software interfaces. Well-designed device drivers simplify user program logic and increase system security without compromising stability or preventing user programs from accomplishing real work. Most device drivers run in the operating system’s kernel with administrator (root) privilege, which makes debugging and validation difficult. This article examines the use of platform and device simulators to increase driver quality in the absence of physical hardware.

The need for simulation

Device driver writers invariably come across two main obstacles during the development lifecycle:

  • In-kernel debugging is difficult or unreliable
  • Hardware is not always available (or stable) during the development process

Additionally, if the device in question is running on an embedded target, the development cycle involves many steps (compile for target, download binary to target, attach remote debugger stub, etc.), each of which can take several minutes to complete.

Simulation solves these issues by transforming the driver into a userspace program (thereby allowing industry-standard debugging and validation tools) and/or removing the dependency on hardware for functional validation. Validation suites built on userspace simulation often run on multi-gigahertz machines, which shortens each run and reduces the time it takes to test changes to the software in question.

In one specific instance, the turnaround time to compile/validate a code change in the simulator was less than two minutes – compared to an hour (or more) to validate on the target platform. Simulator-based validation suites are an invaluable supplement to full-scale platform validation that can dramatically reduce development time.

In addition, a simulator can be used to inject infrequent or uncommon errors on-demand. For example, a NOR flash device simulator could inject “write errors” upon request as part of a validation suite. On a real NOR device, that specific failure may happen extremely rarely, leaving a code path untested or less frequently tested. Simulated error injection allows for more complete validation of error cases and results in more robust/stable code.
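On-demand error injection of this kind can be sketched in a few lines of C. The following is a minimal illustration only, not the interface of any particular simulator: every name in it (sim_nor, sim_nor_inject_write_error(), and so on) and the 4 KB device size are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SIM_NOR_SIZE 4096u   /* hypothetical device size in bytes */

enum sim_status { SIM_OK = 0, SIM_WRITE_ERROR = -1 };

/* Hypothetical NOR simulator state. */
struct sim_nor {
    uint8_t cells[SIM_NOR_SIZE];
    unsigned fail_after;   /* good writes remaining before the injected failure */
    int inject_armed;      /* nonzero while an injected failure is pending */
};

/* Erased NOR flash reads back as all ones. */
static void sim_nor_init(struct sim_nor *dev)
{
    memset(dev->cells, 0xFF, sizeof dev->cells);
    dev->fail_after = 0;
    dev->inject_armed = 0;
}

/* Ask the simulator to fail exactly one write after `good_writes` successes. */
static void sim_nor_inject_write_error(struct sim_nor *dev, unsigned good_writes)
{
    dev->fail_after = good_writes;
    dev->inject_armed = 1;
}

/* Program one byte. NOR programming can only clear bits (1 -> 0). */
static enum sim_status sim_nor_write(struct sim_nor *dev, uint32_t addr, uint8_t value)
{
    assert(addr < SIM_NOR_SIZE);
    if (dev->inject_armed) {
        if (dev->fail_after == 0) {
            dev->inject_armed = 0;    /* one-shot failure */
            return SIM_WRITE_ERROR;   /* cell contents are left unchanged */
        }
        dev->fail_after--;
    }
    dev->cells[addr] &= value;
    return SIM_OK;
}

static uint8_t sim_nor_read(const struct sim_nor *dev, uint32_t addr)
{
    assert(addr < SIM_NOR_SIZE);
    return dev->cells[addr];
}
```

A validation suite could arm the failure just before exercising the driver’s write path and then check that the driver reports the error and retries or recovers as designed.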

Simulation overview

This article discusses two types of simulation: device simulators and platform simulators. Device simulators mimic the specific behaviors of the target device, while platform simulators can be used to simulate the operating system and other system services.

The following diagram depicts one possible arrangement showing the interaction between simulators and the driver being tested.

 
Figure 1: Interaction between simulators and the driver being tested

A platform simulator can be used to “stub out” the operating system into a userspace module that can then be debugged. For example, when building a Linux device driver, the platform simulator could contain a subset of the block layer that is suitable for taking fopen(), fwrite(), fread() calls from the test application and translating them to bios or requests (in the case of a block device). Or, if the driver in test is a character driver, system calls could be translated into the driver APIs llseek(), read(), write(), poll(), etc.

If developing a driver for Microsoft Windows*-based operating systems, the platform simulator would convert system calls into I/O Request Packets (IRP) and pass them to the device driver in test.

The device driver is dispatched by the platform simulator to handle a request (bio, IRP, read/write, etc.), and will perform any necessary logic to prepare for interacting with the hardware in question. Calls to platform-specific APIs that access device registers (such as ioread32()/iowrite32() or READ_REGISTER_ULONG/WRITE_REGISTER_ULONG) can be intercepted by the platform simulator and sent to the device simulator for processing. Eventually, a result will be returned to the test application, which can then validate the result as part of an automated test suite.

A simpler system can be designed without the complexities of platform simulation, and is illustrated in the following figure. 


Figure 2: Example of a simpler system without the complexities of platform simulation

In this scenario, the test application calls directly into the driver’s entry points, rather than being routed through a platform simulator. This simplicity comes at a price: the test application is more tightly coupled to the specific driver APIs and cannot be reused with the full operating system and driver running in kernel mode.
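As a rough sketch of this simpler arrangement, the test below calls a driver’s entry points directly. The driver here is a hypothetical stand-in: drv_llseek(), drv_read() and drv_write() are loosely modeled on a character driver’s entry points but use simplified signatures invented for this example, and the “device” is just a RAM buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DRV_CAPACITY 64u   /* hypothetical device capacity in bytes */

/* Hypothetical driver state; a real driver would talk to hardware here. */
struct drv_state {
    unsigned char media[DRV_CAPACITY];
    size_t pos;   /* current file position */
};

/* Move the file position; returns the new position or -1 if out of range. */
static long drv_llseek(struct drv_state *s, size_t offset)
{
    if (offset > DRV_CAPACITY)
        return -1;
    s->pos = offset;
    return (long)offset;
}

/* Write up to len bytes; returns the number of bytes actually written. */
static long drv_write(struct drv_state *s, const void *buf, size_t len)
{
    size_t room;

    assert(s->pos <= DRV_CAPACITY);
    room = DRV_CAPACITY - s->pos;
    if (len > room)
        len = room;   /* short write at the end of the device */
    memcpy(s->media + s->pos, buf, len);
    s->pos += len;
    return (long)len;
}

/* Read up to len bytes; returns the number of bytes actually read. */
static long drv_read(struct drv_state *s, void *buf, size_t len)
{
    size_t left;

    assert(s->pos <= DRV_CAPACITY);
    left = DRV_CAPACITY - s->pos;
    if (len > left)
        len = left;
    memcpy(buf, s->media + s->pos, len);
    s->pos += len;
    return (long)len;
}

/* Test application: drives the entry points with no platform simulator. */
static int test_write_then_read(void)
{
    struct drv_state s = { {0}, 0 };
    char out[4];

    if (drv_write(&s, "abcd", 4) != 4) return 0;
    if (drv_llseek(&s, 0) != 0) return 0;
    if (drv_read(&s, out, 4) != 4) return 0;
    return memcmp(out, "abcd", 4) == 0;
}
```

Because the test names the driver’s functions directly, it would have to be rewritten to run against the same driver loaded into a real kernel, which is the coupling cost described above.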

Simulator implementation details

Device simulators mimic the specific behavior and characteristics of the device in question. Examples of devices that benefit from simulation include individual Numonyx® NAND, NOR and Phase Change Memory (PCM) die, extending up the stack to high-end graphics cards or rotational media.

Assuming that the device in question is presented to the system via a register set accessible through memory-mapped I/O, the device simulator can present a regular RAM region, which is written to by the driver. The device simulator can then poll selected registers and trigger state changes according to the device specification. In practice, inserting a sleep(0) into the polling loop can reduce CPU utilization and force other threads to run more often, resulting in dramatically improved validation runtimes.
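The polling approach might look like the following sketch, in which the simulator watches a shared register block that the driver writes to. The register layout, command values and function names are all hypothetical; a real simulator would take them from the device specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register layout shared between driver and simulator. */
struct sim_regs {
    volatile uint32_t control;   /* written by the driver */
    volatile uint32_t status;    /* updated by the simulator */
};

#define SIM_CMD_NONE     0x0u
#define SIM_CMD_START    0x1u
#define SIM_STATUS_IDLE  0x0u
#define SIM_STATUS_DONE  0x1u

/* One polling pass. Returns 1 if a command was consumed, 0 otherwise. */
static int sim_poll_once(struct sim_regs *regs)
{
    assert(regs != NULL);
    if (regs->control == SIM_CMD_NONE)
        return 0;
    if (regs->control == SIM_CMD_START)
        regs->status = SIM_STATUS_DONE;   /* simulated state change */
    regs->control = SIM_CMD_NONE;         /* acknowledge the command */
    return 1;
}

/* Poll at most max_passes times. Returns 1 once a command has been handled. */
static int sim_poll(struct sim_regs *regs, unsigned max_passes)
{
    while (max_passes--) {
        if (sim_poll_once(regs))
            return 1;
        /* A threaded simulator could yield here, for example with sleep(0). */
    }
    return 0;
}
```

This sketch is single-threaded so that it stays deterministic. If the simulator runs in its own thread, volatile alone does not provide synchronization, and the shared registers would need atomics or a lock.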

 
Figure 3: Simulator polls waiting for the driver to write to a shared RAM region

Since polling is CPU-intensive and subject to timing and/or race conditions, the driver can instead access device registers through macros, which removes the need for polling when the driver is compiled for simulation mode. Take, for example, a hypothetical device where writing 0xdeadbeef to the device “control” register triggers state change Y on the device. The driver writer could create a macro, WRITE_TO_DEVICE_CONTROL(value). In a production environment, this macro (or function) could be defined to call iowrite32() or WRITE_REGISTER_ULONG; when compiling the driver in test mode, WRITE_TO_DEVICE_CONTROL could be redefined to call a specific simulator function, thus directly triggering the state change.
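A minimal sketch of this macro switch follows. The DRIVER_SIMULATION build flag, the sim_device structure, sim_control_write() and the control_reg address in the production branch are assumptions made for illustration; only WRITE_TO_DEVICE_CONTROL, the 0xdeadbeef value and iowrite32() come from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Normally set by the build system, e.g. -DDRIVER_SIMULATION (assumed name). */
#define DRIVER_SIMULATION 1

#define CONTROL_TRIGGER_Y 0xdeadbeefu

#ifdef DRIVER_SIMULATION
/* Hypothetical device simulator state. */
struct sim_device {
    uint32_t control;   /* last value written to the control register */
    int state_y;        /* nonzero once state change Y has been triggered */
};

/* The test attaches a simulated device before the driver runs. */
static struct sim_device *sim_attached;

/* Simulator entry point: the state change happens immediately, no polling. */
static void sim_control_write(uint32_t value)
{
    assert(sim_attached != NULL);
    sim_attached->control = value;
    if (value == CONTROL_TRIGGER_Y)
        sim_attached->state_y = 1;
}

#define WRITE_TO_DEVICE_CONTROL(value) sim_control_write(value)
#else
/* Production build on Linux: control_reg is assumed to be the driver's
 * ioremap()ed address of the control register. */
#define WRITE_TO_DEVICE_CONTROL(value) iowrite32((value), control_reg)
#endif

/* Driver code is identical in both builds. */
static void driver_trigger_y(void)
{
    WRITE_TO_DEVICE_CONTROL(CONTROL_TRIGGER_Y);
}
```

Keeping the switch inside one macro means the driver source contains no simulator-specific branches, so the logic exercised under simulation is the same logic that ships.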

A platform simulator can be implemented as a series of “stub” functions that mimic the target operating system application interfaces. For example, the Linux block device APIs include a number of functions, which are listed briefly in the following table. When the driver is compiled for the simulation target environment, these APIs can be re-implemented into a simple userspace application that hands system calls to the driver in question, performing any necessary translation. The work required to simulate a platform is non-trivial, but can be well worth the time invested, as it allows full end-to-end functional verification of the driver in question.

The following table provides examples of Linux block layer APIs, which can be re-implemented for userspace platform simulation.

 API                  Description
 blk_init_queue()     Initializes a block device’s request queue.
 elv_next_request()   Gets the next request from the I/O scheduler.
 blk_end_request()    Completes an I/O to the kernel, returns status to user program.
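To give a feel for what such stubs involve, here is a deliberately simplified userspace sketch. The sim_* functions are hypothetical stand-ins for the three APIs in the table and do not reproduce the kernel’s signatures or semantics (for instance, this sim_next_request() removes the request from the queue as it returns it); a real platform simulator would match the prototypes the driver is compiled against.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_QUEUE_DEPTH 8u

/* Hypothetical, much-reduced request descriptor. */
struct sim_request {
    unsigned long sector;
    int is_write;
    int done;     /* set once the driver completes the request */
    int status;   /* completion status reported by the driver */
};

struct sim_queue {
    struct sim_request *slots[SIM_QUEUE_DEPTH];
    unsigned head, tail;   /* head: next to fetch; tail: next free slot */
};

/* Stand-in for blk_init_queue(): prepare an empty request queue. */
static void sim_queue_init(struct sim_queue *q)
{
    q->head = q->tail = 0;
}

/* Platform simulator side: queue a request. Returns 0, or -1 if full. */
static int sim_queue_submit(struct sim_queue *q, struct sim_request *rq)
{
    if (q->tail - q->head == SIM_QUEUE_DEPTH)
        return -1;
    rq->done = 0;
    q->slots[q->tail++ % SIM_QUEUE_DEPTH] = rq;
    return 0;
}

/* Stand-in for elv_next_request(): FIFO order, NULL when the queue is empty. */
static struct sim_request *sim_next_request(struct sim_queue *q)
{
    if (q->head == q->tail)
        return NULL;
    return q->slots[q->head++ % SIM_QUEUE_DEPTH];
}

/* Stand-in for blk_end_request(): record the completion status. */
static void sim_end_request(struct sim_request *rq, int status)
{
    assert(!rq->done);   /* a request must be completed only once */
    rq->status = status;
    rq->done = 1;
}

/* Hypothetical driver request handler: drain the queue, complete each request. */
static void driver_request_fn(struct sim_queue *q)
{
    struct sim_request *rq;

    while ((rq = sim_next_request(q)) != NULL)
        sim_end_request(rq, 0);
}
```

In a fuller simulator, sim_queue_submit() would be called by the code that translates the test application’s file operations into requests, and the completion status would be passed back to the test as the result of that call.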

Simulator limitations

The simulation discussed in this article removes the dependency on platforms and hardware devices by replacing both dependencies with userspace programs. As such, timing-related bugs and operating system intricacies are not easily modeled. Simulation will not uncover all driver defects, as the process is further limited by the accuracy of the simulator itself.

Summary

Device and platform simulators improve the driver development process by removing the dependency on physical hardware and operating systems, and by enabling userspace debugging of the driver’s functional logic. Additionally, simulation can simplify and speed up the software development cycle, allowing defects to be detected and corrected in a shorter time and with greater accuracy.

For additional information on using simulators in the development process, refer to the article: Using Software simulators in the development process.