How To Use MTD

Memory Technology Device (MTD) is the name of the Linux subsystem that handles most raw flash devices, such as NOR, NAND, dataflash, and SPI flash. It provides both character and block access to these devices, as well as a number of specialized filesystems.

Device Support

MTD Devices

The following devices are supported by the MTD subsystem:

  • NAND flash
  • NOR flash
  • OneNAND flash
  • Atmel Dataflash
  • SPI Flash

Non-MTD Devices

The following devices are not handled by the MTD subsystem, even though they are commonly referred to as "flash." Many of these include Flash Translation Layer (FTL) hardware that takes care of many of the concerns surrounding flash usage, such as wear-leveling and bad blocks.

  • USB Sticks — handled by USB Host and SCSI subsystems
  • Compact Flash — handled by the PC Card/IDE subsystems, depending on the implementation
  • EEPROMs — handled by either the SPI EEPROM driver, or I2C EEPROM driver
  • MMC/SD Cards — handled by the MMC subsystem

Overview

The MTD subsystem provides a number of mechanisms for interacting with raw flash chips. It consists of a number of generic drivers for classes of chips (NAND, NOR), as well as drivers for individual flash controllers. It also includes support for accessing these chips as either block or character devices, and the ability to divide a single chip into multiple, smaller partitions. Many features of the MTD subsystem deal with flash-centric issues such as wear-leveling, bad block detection and handling, and out-of-band (OOB) data. There are also a number of filesystems available specifically for the MTD subsystem.

Wear-Leveling

Wear-leveling is an important consideration when dealing with flash chips. A flash chip is composed of a large number of memory cells, or gates, which are erased in groups called blocks. Each block is rated for a limited number of erase/program cycles. This means that blocks will eventually wear down and become "bad blocks." Without wear-leveling, some blocks will wear down much faster than others. For example, consider a file that keeps track of the total operational time of the board. If we were to update this file once a minute, and always use the same block when storing the file, that block would reach its rated limit after around 70 days (100,000 minutes) on flash rated for 100,000 erase/program cycles. Wear-leveling ensures that this burden is spread across multiple blocks, which increases the overall life of the chip. MTD itself doesn't perform wear-leveling, but many of the flash filesystems do.
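
The arithmetic behind that example can be sketched as follows. This is a rough estimate only: it assumes each update costs exactly one erase/program cycle and that wear-leveling spreads writes perfectly evenly. The figure of 1,000 blocks is a hypothetical value chosen for illustration.

```python
MINUTES_PER_DAY = 24 * 60


def days_until_worn(rated_cycles, writes_per_day, blocks_sharing_load=1):
    """Estimate the days until the rated erase/program cycles are used up.

    Assumes every write costs one erase/program cycle and that the writes
    are spread evenly over `blocks_sharing_load` blocks.
    """
    cycles_per_block_per_day = writes_per_day / blocks_sharing_load
    return rated_cycles / cycles_per_block_per_day


# One update per minute, always written to the same block.
single = days_until_worn(100_000, MINUTES_PER_DAY)

# The same workload spread evenly over 1,000 blocks (hypothetical figure).
spread = days_until_worn(100_000, MINUTES_PER_DAY, blocks_sharing_load=1000)

print(f"one block:    {single:.1f} days")
print(f"1,000 blocks: {spread / 365:.0f} years")
```

With a single block the estimate comes out at about 69 days, which matches the "around 70 days" figure above. Spreading the same writes over 1,000 blocks stretches it to roughly 190 years.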

Bad Blocks

One distinguishing feature of NAND flash is that it is allowed to contain bad blocks straight from the factory. These bad blocks are tracked on the chip so that software can skip them when necessary. This allows the manufacturers to achieve a much larger yield, and therefore decrease the costs of the chips. Most (if not all) NAND flash manufacturers will also guarantee that the first block is good, allowing you to use that space for a bootloader.

Linux MTD also has facilities for detecting and tracking bad blocks on NAND chips. Linux generates a bad block table (bbt) and stores this information in the last two good blocks of the chip. Most Linux-capable bootloaders also treat NAND in this way. When using a flash-friendly filesystem, or userspace MTD utilities, bad blocks are automatically handled.
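
As an illustration of what "skipping" bad blocks means when data is written sequentially, here is a small simulation. It is only a sketch of the idea and does not talk to real hardware or reflect how any particular tool is implemented; the function name, block counts, and bad block positions are all made up for the example.

```python
def place_blocks(num_data_blocks, bad_blocks, total_blocks, start=0):
    """Return the physical block used for each data block, skipping bad ones.

    `bad_blocks` is a set of physical block numbers known to be bad.
    """
    placement = []
    physical = start
    for _ in range(num_data_blocks):
        # Step over any bad blocks before placing the next data block.
        while physical < total_blocks and physical in bad_blocks:
            physical += 1
        if physical >= total_blocks:
            raise ValueError("not enough good blocks for the data")
        placement.append(physical)
        physical += 1
    return placement


# Four blocks of data on an 8-block device where blocks 1 and 4 are bad.
print(place_blocks(4, bad_blocks={1, 4}, total_blocks=8))  # [0, 2, 3, 5]
```

Note that the data ends up occupying a larger physical range than its own size, so a partition that must survive bad blocks needs some spare room.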

Block vs. Character Access

MTDs can be accessed as either character or block devices in the Linux kernel. However, using the device as a normal block device is extremely unsafe, and not fully implemented. For the most part, you should always use the character device interface.

NOTE: One notable exception is the JFFS2 filesystem, which uses the mtdblock handle to determine which device to mount. You must enable CONFIG_MTD_BLOCK=y for JFFS2 support. However, even JFFS2 doesn't actually use any of the mtdblock functionality.

MTD Partitions

Due to cost and space constraints, many embedded systems have only one flash chip on board. On these systems, it may be necessary to logically divide that flash chip into several smaller partitions, much like a hard disk. For instance, you may wish to have some small amount of space set aside for a bootloader and its environment, another for the Linux kernel, and yet another for the root filesystem. You may even wish to have multiple filesystems. To accommodate these requirements, MTD provides a system for creating partitions on an MTD chip.

Unlike a normal block device, such as a hard disk, you cannot use the fdisk utility to create a partition table. Instead, the MTD subsystem provides a number of ways to define partitions on an MTD chip. Partitions are purely a software construct — besides the RedBoot table method, there's no real partition table stored on the device. This means that your bootloader is typically unaware of any partition scheme.

  • Static (compiled-in) partitioning: Use data structures in the kernel, which are passed to the relevant flash driver's probe function, to determine the flash partitions.
  • Device tree partitioning: Define the partition scheme in a device tree blob that is provided to the kernel at boot time.
  • RedBoot Partition Table: Use the table created by the RedBoot bootloader (and stored in its environment) to define your partition table.
  • Command-line partitioning: Provide a kernel command-line argument that defines the partition scheme. See How To Partition MTD on the Command Line.

Type          Configuration Option          Supported Systems
Static        CONFIG_MTD_PARTITIONS=y       All
Device Tree   CONFIG_MTD_OF_PARTS=y         PowerPC
RedBoot       CONFIG_MTD_REDBOOT_PARTS=y    Any system with RedBoot FIS support
Command-line  CONFIG_MTD_CMDLINE_PARTS=y    All

The static structure is typically defined in a board-specific file in the arch directory of the kernel. For instance, on the AT91SAM9263-EK, it is defined in the file arch/arm/mach-at91/board-sam9263ek.c, in the ek_nand_partition array. This is an array of mtd_partition structs, each of which defines one partition on the given device. The exact method for passing these structures to the driver varies depending on the driver itself.

The device tree structure follows a very similar format to the static type. It is a structure defined in the device tree file for your system. See section VI.2.c of the file Documentation/powerpc/booting-without-of.txt for more information about device tree partitioning. This information applies to most partitioned chips, not just NOR.

RedBoot partitioning requires you to provide the location of the RedBoot configuration table to the kernel. This is done in the kernel configuration file (.config). This method allows you to use RedBoot to create your partition tables, and ensures that you have consistent values between your bootloader and kernel.
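
For the command-line method, the kernel argument has the general form mtdparts=<mtd-id>:<size>(<name>)[ro],..., where a size of "-" means "all remaining space." The helper below is a minimal sketch that assembles such a string from a list of partitions. The MTD id atmel_nand, the partition names, and the sizes are hypothetical; the real id must match the name your flash driver registers.

```python
def format_size(num_bytes):
    """Format a byte count using the k/m suffixes accepted on the command line."""
    if num_bytes % (1024 * 1024) == 0:
        return f"{num_bytes // (1024 * 1024)}m"
    if num_bytes % 1024 == 0:
        return f"{num_bytes // 1024}k"
    return str(num_bytes)


def build_mtdparts(mtd_id, partitions):
    """Build an mtdparts= kernel argument.

    `partitions` is a list of (name, size_in_bytes, read_only) tuples in
    flash order. A size of None means "all remaining space" and is
    written as "-".
    """
    parts = []
    for name, size, read_only in partitions:
        size_text = "-" if size is None else format_size(size)
        parts.append(f"{size_text}({name})" + ("ro" if read_only else ""))
    return f"mtdparts={mtd_id}:" + ",".join(parts)


# Hypothetical layout: read-only bootloader, kernel, then the root filesystem.
layout = [
    ("bootloader", 256 * 1024, True),
    ("kernel", 2 * 1024 * 1024, False),
    ("rootfs", None, False),
]
print(build_mtdparts("atmel_nand", layout))
```

For this layout the helper produces mtdparts=atmel_nand:256k(bootloader)ro,2m(kernel),-(rootfs), which would be appended to the kernel command line by the bootloader.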

A logical MTD partition is functionally equivalent to its physical counterpart. That is, any operation that can be done to an unpartitioned chip can be done to the partition. The kernel treats them exactly the same. For instance, if a single NAND chip is not partitioned, it will appear to the kernel as a single MTD device, such as /dev/mtd0. If that chip is partitioned into 4 parts, then it will appear to the kernel as 4 devices, /dev/mtd0, /dev/mtd1, /dev/mtd2, /dev/mtd3. Unlike a hard disk, there is no device file that references the entire flash chip.
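
To see which partitions the kernel has actually registered, you can read /proc/mtd, which lists each MTD device with its size, erase block size, and name. The following sketch parses that listing into a list of dictionaries. The sample text is hypothetical; on a real system you would pass in the contents of /proc/mtd instead.

```python
import re

# Matches lines such as: mtd0: 00040000 00020000 "bootloader"
MTD_LINE = re.compile(r'^(mtd\d+):\s+([0-9a-fA-F]+)\s+([0-9a-fA-F]+)\s+"(.*)"$')


def parse_proc_mtd(text):
    """Parse /proc/mtd-style text into a list of partition descriptions."""
    entries = []
    for line in text.splitlines():
        match = MTD_LINE.match(line.strip())
        if not match:
            continue  # skips the header line and anything unexpected
        dev, size, erasesize, name = match.groups()
        entries.append({
            "dev": "/dev/" + dev,
            "size": int(size, 16),            # sizes are printed in hex
            "erasesize": int(erasesize, 16),
            "name": name,
        })
    return entries


# Hypothetical listing for a chip split into four partitions.
SAMPLE = '''dev:    size   erasesize  name
mtd0: 00040000 00020000 "bootloader"
mtd1: 00020000 00020000 "environment"
mtd2: 00200000 00020000 "kernel"
mtd3: 0fda0000 00020000 "rootfs"
'''

for entry in parse_proc_mtd(SAMPLE):
    print(f'{entry["dev"]}: {entry["size"] // 1024} KiB, "{entry["name"]}"')
```

On a target board, parse_proc_mtd(open("/proc/mtd").read()) would return the live partition list.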

Unsorted Block Images (UBI)

Unsorted Block Images (UBI) is a relatively new layer that provides additional functionality on top of the standard MTD subsystem. It was created in response to the proliferation of NAND devices, which raise a number of concerns above and beyond those of other flash chips. In particular, UBI can place data around bad blocks. This is useful for extremely small partitions, which may be seriously affected if bad blocks occur within their bounds. Consider a small configuration section that only needs a single block. If we define that partition as a fixed offset from the beginning of flash, there's a possibility that the block at that offset is bad. UBI instead lets you specify an abstract block, which UBI itself allocates in a known good location. Rather than referring to the offset, you refer to the UBI device.
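
The idea of an "abstract block" can be illustrated with a toy model. This is not UBI's real algorithm or API, and the class and method names are invented for the example. It only shows that callers refer to a logical block and the layer underneath decides which good physical block backs it. A real implementation would also have to move the data when a block is retired.

```python
class ToyUbi:
    """Toy model of logical-to-physical erase block mapping.

    This only illustrates the idea; it is not UBI's real algorithm or API.
    """

    def __init__(self, total_blocks, bad_blocks=()):
        bad = set(bad_blocks)
        # Physical blocks that are good and not yet assigned.
        self.free = [b for b in range(total_blocks) if b not in bad]
        # (volume name, logical block) -> physical block
        self.mapping = {}

    def physical_block(self, volume, logical):
        """Return the physical block backing a logical block, allocating on first use."""
        key = (volume, logical)
        if key not in self.mapping:
            if not self.free:
                raise RuntimeError("no good physical blocks left")
            self.mapping[key] = self.free.pop(0)
        return self.mapping[key]

    def mark_bad(self, physical):
        """Retire a physical block and remap any logical block that used it."""
        if physical in self.free:
            self.free.remove(physical)
        for key, mapped in list(self.mapping.items()):
            if mapped == physical:
                del self.mapping[key]
                self.physical_block(*key)


# An 8-block chip whose first block is bad; "config" needs one logical block.
ubi = ToyUbi(total_blocks=8, bad_blocks=[0])
before = ubi.physical_block("config", 0)
ubi.mark_bad(before)
after = ubi.physical_block("config", 0)
print(f"config block 0: physical {before}, then physical {after}")
```

The caller asks for logical block 0 of the "config" volume both times and never needs to know that the physical location changed.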

For more information about UBI, see the document: How To Use UBIFS.

Filesystems

Raw flash devices are very different from hard disks. Since many of the commonly used filesystems, such as ext2, are optimized for spinning disk accesses, they are generally not appropriate for flash devices.

NOTE: Some types of non-MTD flash, like Compact Flash and SD cards, implement a Flash Translation Layer (FTL) in hardware that makes the device appear as if it were a disk. These devices implement wear-leveling and bad block management in hardware. This is why you are able to write ext2 or FAT filesystems to these devices without the problems that you would encounter on a raw flash device.

JFFS2

JFFS2 has been the de facto standard for flash filesystems on Linux for quite some time. It is in the mainline kernel, which gives it widespread exposure. However, it has recently been superseded by UBI and ubifs, although it is still in widespread use.

See the document How To Create JFFS2 Images.

ubifs

ubifs is a relatively new filesystem that was introduced to the mainline kernel in the 2.6.27 release. It is meant to be the successor to the popular JFFS2 filesystem. It addresses many of the shortcomings of the JFFS2 filesystem, and is typically much faster.

See the document How To Use UBIFS for information about UBI and ubifs.

Userspace Utilities

The package mtd-utils provides a number of userspace utilities for interacting with MTD flash chips. These utilities typically operate on the character device (/dev/mtdN) for the given MTD chip.

flash_erase

The flash_erase utility is used to erase an arbitrary number of blocks from the given device. It sets all of the memory cells in the defined range to the erased value, which is 0xFF on most types of flash. It takes the form:

flash_erase MTD-device [start] [cnt (# erase blocks)] [lock]

This command erases cnt blocks, starting from block start, on MTD-device, which is the character device that you wish to erase from (e.g. /dev/mtd1).

Example

This command will erase 20 blocks, starting at the first block, from /dev/mtd1:

flash_erase /dev/mtd1 0 20
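
When erasing space for an image, the cnt value is the image size divided by the erase block size, rounded up. The sketch below shows that calculation. The 128 KiB erase block size and the image sizes are assumptions for illustration; the real erase block size depends on your chip.

```python
def erase_blocks_needed(image_size, erase_block_size):
    """Return how many whole erase blocks are needed to hold image_size bytes."""
    if image_size < 0 or erase_block_size <= 0:
        raise ValueError("sizes must be non-negative, block size positive")
    # Ceiling division: a partly filled block still has to be erased.
    return -(-image_size // erase_block_size)


ERASE_BLOCK = 128 * 1024  # assumed erase block size (128 KiB)

print(erase_blocks_needed(2_621_440, ERASE_BLOCK))  # exactly 2.5 MiB
print(erase_blocks_needed(2_621_441, ERASE_BLOCK))  # one byte more
```

Under these assumptions a 2.5 MiB image needs exactly 20 blocks, and a single extra byte pushes the count to 21.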

flash_eraseall

The flash_eraseall utility is used to erase all of the blocks on the given device. It sets all of the memory cells to the erased value, which is 0xFF on most types of flash. It takes the form:

Usage: flash_eraseall [OPTION] MTD_DEVICE
Erases all of the specified MTD device.

  -j, --jffs2    format the device for jffs2
  -q, --quiet    don't display progress messages
      --silent   same as --quiet
      --help     display this help and exit
      --version  output version information and exit

The -j flag also writes a JFFS2 cleanmarker to each erased block, so that JFFS2 recognizes the block as erased when it mounts the device. On NAND flash, the cleanmarker is stored in the out-of-band (OOB) area rather than in the block's data area.

Example

The following command will erase all data on /dev/mtd1:

flash_eraseall /dev/mtd1

flashcp

The flashcp utility will write a file to the given flash device. Unlike the dd command, it is safe for devices with bad blocks, and provides a number of error detection properties.

usage: flashcp [ -v | --verbose ] <filename> <device>
       flashcp -h | --help

   -h | --help      Show this help message
   -v | --verbose   Show progress reports
   <filename>       File which you want to copy to flash
   <device>         Flash device to write to (e.g. /dev/mtd0, /dev/mtd1, etc.)

Example

This command will copy the file sample.img, located in the /root directory, to the device /dev/mtd1:

flashcp /root/sample.img /dev/mtd1

nandwrite

The nandwrite utility is designed for writing files to NAND flash. It provides a number of additional options that specifically apply to NAND.

NOTE: Most of the time, flashcp is sufficient for writing images to NAND. It will almost always "do the right thing." You should only need to use this utility if you absolutely need to make sure that the information is written in a very specific way, such as legacy support for flash filesystems, or if a certain application is reading flash in a very specific manner.

Usage: nandwrite [OPTION] MTD_DEVICE [INPUTFILE|-]
Writes to the specified MTD device.

  -a, --autoplace         Use auto oob layout
  -j, --jffs2             Force jffs2 oob layout (legacy support)
  -y, --yaffs             Force yaffs oob layout (legacy support)
  -f, --forcelegacy       Force legacy support on autoplacement-enabled mtd device
  -m, --markbad           Mark blocks bad if write fails
  -n, --noecc             Write without ecc
  -o, --oob               Image contains oob data
  -s addr, --start=addr   Set start address (default is 0)
  -p, --pad               Pad to page size
  -b, --blockalign=1|2|4  Set multiple of eraseblocks to align to
  -q, --quiet             Don't display progress messages
      --help              Display this help and exit
      --version           Output version information and exit

Example

This command will copy the file sample.img, located in the /root directory, to the device /dev/mtd1:

nandwrite /dev/mtd1 /root/sample.img
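
NAND is written a whole page at a time, which is why the -p option exists for images whose length is not a multiple of the page size. The sketch below shows what such padding amounts to. The 2048-byte page size is an assumption, and the 0xFF fill byte is chosen here because it is the erased value described earlier.

```python
def pad_to_page(data, page_size, fill=0xFF):
    """Return data padded with the fill byte up to a multiple of page_size."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    remainder = len(data) % page_size
    if remainder == 0:
        return data  # already page-aligned, nothing to add
    return data + bytes([fill]) * (page_size - remainder)


PAGE_SIZE = 2048  # assumed NAND page size

image = b"\x01\x02\x03" * 1000  # 3000 bytes of stand-in image data
padded = pad_to_page(image, PAGE_SIZE)
print(len(image), "->", len(padded))
```

The original bytes are left untouched and only the tail of the last page is filled.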

mkfs.jffs2 and sumtool

mtd-utils also provides two tools for working with JFFS2 images, mkfs.jffs2 and sumtool. See the document How To Create JFFS2 Images for more information about these utilities.

UBI utilities

mtd-utils also provides a number of utilities for UBI and UBIFS. See the document How To Use UBIFS for information about the UBI userspace utilities.
