By Phil March
Numonyx Software Product Manager
You have just finished a call and you are putting your phone away but you miss the holster. The unthinkable happens: you drop your phone. It hits the ground and comes apart – cover, battery and all. This time you got lucky, nothing is broken. You breathe a sigh of relief and put things back together. Then, you press the power button and wait expectantly for the familiar splash screen, hourglass and signal strength indicator. But this time the sequence looks different. No splash screen, just an error message. You wind up with an “exception” fault and the phone does not boot up. You try again, cycling the power off, then on again without success. Your phone is dead. This is one of the many negative outcomes of using a file system without Power Loss Recovery (PLR).
PLR is a critical aspect of file system safety and integrity for wireless devices because unexpected power loss can result in the following problems:
- Loss of data
- Lost directories
- Inability of the file system to initiate or mount
- Incorrect naming of files or directories
- Application crashes
PLR ensures that a file system and critical data will not be corrupted when power is lost unexpectedly. Without PLR, cell phones would experience catastrophic errors when the battery is disconnected unexpectedly or fails. For cell phones that use NAND flash memory, PLR takes place in both the Flash Translation Layer (FTL), such as the Numonyx® NAND Flash Translation Layer (NFTL), and the file system software, such as the Numonyx® Sector-Based Compact File System (SCFS). SCFS is a FAT file system that is unique in the market because traditionally FAT file systems have not had PLR. This article concentrates on the PLR requirements and capabilities of flash file systems and describes the steps Numonyx takes to ensure PLR for the cell phone market.
In some systems, power is lost frequently. In these instances, the system returns to default values, which allows the application to operate correctly once power is restored. There are situations, however, in which this is unacceptable. One such situation is when an application must remember system-critical information. For example, a cell phone is calibrated at the factory for its own specific characteristics. This tuning information is stored in flash memory and is used to initialize the baseband hardware at power up. If the tuning information is corrupted by a power loss, the phone will not function.
In addition, PLR is particularly critical when writing data to NAND flash memory. The sequence of writing a file to NAND flash memory (irrespective of PLR) by FAT-based flash file systems is as follows:
- Write data
- Update the FAT
- Update the directory location
However, if the file system performs the above without doing any logging, a power loss can leave it in an inconsistent state, for example with data written to flash but no FAT or directory update pointing to it. For this reason, PLR incorporates an additional step after performing any write: writing a log entry. By doing this, the system is able to keep track of the last step that was executed and whether or not it was successful. If a power loss should occur at any point during this process, the system is able to return to the last operation, review it, and either discard corrupt data or complete the operation.
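The logged write sequence can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the SCFS or NFTL implementation: the names `STEPS`, `write_file`, `recover` and the `fail_after` parameter are hypothetical, the log covers a single operation, and the recovery policy (complete the operation if the data step was logged, otherwise discard) is one plausible choice among several.

```python
# Hypothetical step names for the three-step FAT write sequence.
STEPS = ["write_data", "update_fat", "update_directory"]

def write_file(log, fail_after=None):
    """Run each step and append a log entry after it succeeds.

    fail_after simulates a power loss once that many steps have completed.
    Returns True if the whole operation was committed.
    """
    for done, step in enumerate(STEPS):
        if fail_after is not None and done == fail_after:
            return False  # power lost before this step ran
        # A real file system would perform the flash operation here.
        log.append(step)
    log.append("commit")
    return True

def recover(log):
    """At mount time, inspect the log and resolve an interrupted operation."""
    if "commit" in log:
        return "clean"
    if "write_data" in log:
        # The data is intact, so finish the remaining metadata updates.
        log.extend(s for s in STEPS if s not in log)
        log.append("commit")
        return "completed"
    # Nothing reliable was written: discard the partial operation.
    log.clear()
    return "discarded"
```

Calling `write_file(log, fail_after=1)` leaves only the `write_data` entry in the log, and a later `recover(log)` finishes the FAT and directory updates rather than leaving orphaned data.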
This logging can be done in a separate area or in the actual data being written. In the latter method, PLR keeps track of the state of all data writes to flash memory. Based on the last state of a write to flash, initialization code can decide whether or not the write completed successfully. If the write did not complete, then the previous version of the data must be restored. This can be accomplished through a simple Write State Machine (WSM).
The WSM can be implemented by adding an extra word to the data being written to store the state of the data. The state machine will have the following states:
- Empty
- Writing Data
- Data Written
- Replacing/Deleting Data
- Deleted Data
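One way to encode these states in the extra word takes advantage of the fact that programming flash can only change bits from 1 to 0, while an erase sets them back to 1. The sketch below is an assumption for illustration: the 16-bit values and the helper names `program` and `can_advance` are hypothetical, and it presumes the device allows the same word to be programmed more than once between erases.

```python
# Hypothetical state encodings: each later state clears one more bit,
# so every forward transition is a plain program operation (no erase).
EMPTY        = 0xFFFF  # erased flash reads as all ones
WRITING_DATA = 0xFFFE
DATA_WRITTEN = 0xFFFC
REPLACING    = 0xFFF8  # "Replacing/Deleting Data"
DELETED      = 0xFFF0

ORDER = [EMPTY, WRITING_DATA, DATA_WRITTEN, REPLACING, DELETED]

def program(current, new):
    """Model a flash program operation: bits can go 1 -> 0, never 0 -> 1."""
    return current & new

def can_advance(current, new):
    """True if 'new' can be reached from 'current' without an erase."""
    return program(current, new) == new
```

With this encoding the state word can only move forward through the list, which matches the one-way nature of the state machine.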
Before a parameter is written for the first time, the state is changed from Empty to Writing Data. The data is then written and the state is changed to Data Written. If power is lost before the state is changed to Data Written, the data will be invalidated on initialization. If the state is read as Data Written, the data is considered valid and can be used by the application.
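The first-write sequence and the check performed at initialization can be sketched as follows. The `Record` class, the `first_write` and `initialize` functions, and the `power_fails` flag are hypothetical names used to simulate the behavior in memory; marking an interrupted write as Deleted Data is an assumption about how the invalidation is recorded.

```python
from enum import Enum

class State(Enum):
    EMPTY = 0
    WRITING_DATA = 1
    DATA_WRITTEN = 2
    REPLACING = 3   # "Replacing/Deleting Data"
    DELETED = 4

class Record:
    """In-memory stand-in for a parameter plus its state word."""
    def __init__(self):
        self.state = State.EMPTY
        self.data = None

def first_write(rec, value, power_fails=False):
    rec.state = State.WRITING_DATA   # announce the write before starting it
    rec.data = value
    if power_fails:
        return                       # power lost before the final state change
    rec.state = State.DATA_WRITTEN   # the data is now valid

def initialize(rec):
    """Return True if rec holds usable data; invalidate interrupted writes."""
    if rec.state is State.WRITING_DATA:
        rec.state = State.DELETED
    return rec.state is State.DATA_WRITTEN
```

A record whose write was interrupted is rejected by `initialize`, while a record that reached Data Written is accepted.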
If a parameter is replaced with a new value, the current data’s state is changed from Data Written to Replacing/Deleting Data. The new data’s state is then changed from Empty to Writing Data. The new data is written to flash memory, followed by a state change to Data Written. Finally, the old data’s state is changed to Deleted Data to signal that the old copy is now invalid. If power is lost before the new data’s state is changed to Data Written, the old data remains valid and initialization invalidates the new data by changing its state to Deleted Data. If power is lost after the new data’s state is set to Data Written, initialization changes the old data’s state to Deleted Data and the old data is discarded.
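The replacement sequence and its recovery rules can be simulated in a few lines. This is a sketch, not the Numonyx code: records are modeled as dictionaries, and `replace`, `recover` and the `fail_before` parameter (the index of the step before which power is lost) are hypothetical.

```python
from enum import Enum

class State(Enum):
    EMPTY = 0
    WRITING_DATA = 1
    DATA_WRITTEN = 2
    REPLACING = 3   # "Replacing/Deleting Data"
    DELETED = 4

def replace(old, new, value, fail_before=None):
    """Replace old with new; stop early to simulate a power loss."""
    steps = [
        lambda: old.update(state=State.REPLACING),
        lambda: new.update(state=State.WRITING_DATA),
        lambda: new.update(data=value),
        lambda: new.update(state=State.DATA_WRITTEN),
        lambda: old.update(state=State.DELETED),
    ]
    for i, step in enumerate(steps):
        if i == fail_before:
            return False  # power lost before this step ran
        step()
    return True

def recover(old, new):
    """Resolve an interrupted replacement and return the valid record."""
    if old["state"] is State.REPLACING:
        if new["state"] is State.DATA_WRITTEN:
            old["state"] = State.DELETED   # new copy is complete: finish the job
        elif new["state"] is State.WRITING_DATA:
            new["state"] = State.DELETED   # partial copy: discard it
    return new if new["state"] is State.DATA_WRITTEN else old
```

Whichever step is interrupted, `recover` hands back exactly one usable copy: the old value if the new one never reached Data Written, the new value otherwise.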
While there are a number of different methods for implementing PLR, and at least two different software layers in which to do PLR, the key factor is that both your file system and translation layer software incorporate this feature. We have seen the detrimental effects of the lack of PLR, which affects both the wireless and embedded markets. Most of these negative outcomes are no longer acceptable for today’s sophisticated users. Failure to have PLR on either software layer can be catastrophic.
Today, not only cell phones but also televisions, set-top boxes and many DVRs must be PLR safe. When incorporating NAND into your design, you can rely on the Numonyx flash file system and FTL software, which are power loss safe. PLR has been addressed and fully tested by Numonyx for the most stringent applications to ensure fully functioning devices in the field.
For more information about Numonyx PLR software solutions for designs using NAND flash memory, contact your Numonyx representative.