


Introduction

In 2000, Microsoft introduced support for sparse files with the release of the New Technology File System (NTFS) version 3.0. Operating systems in the Windows NT family, starting with Windows 2000, can use this file system. This article looks at what sparse files are and how they are used.

What are Sparse Files?

A sparse file is a file that handles the empty regions within itself efficiently, so that it takes up less disk space when stored.

A normal file may contain regions of empty blocks that hold no actual information. This empty space is filled with zero bytes and stored alongside the regions that contain actual data, so both the empty space and the actual data take up disk space. An example is a database file that stores a large volume of zero bytes, either in place of data that was deleted or to reserve space for future data. Storing such files in the standard format takes up a great deal of disk space that could be used for other purposes.

With sparse file support, a file containing many zero bytes is marked as sparse by setting a special attribute on it. NTFS then stores the file differently from a normal file: only the regions that contain actual data are allocated storage space on the disk volume, while the ranges of zero bytes are not. The file system tracks the location of these empty ranges in metadata, which stands in for the actual empty blocks.
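The storage idea described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not how NTFS is implemented: the class name SparseFileModel, its methods and the 4096-byte cluster size are all hypothetical choices made for illustration. The model allocates a cluster only when data is written into it and answers reads from unallocated clusters with zeros.

```python
class SparseFileModel:
    """Toy model of a sparse file. Hypothetical; not the NTFS on-disk format."""

    def __init__(self, cluster_size=4096):
        self.cluster_size = cluster_size  # assumed allocation unit
        self.clusters = {}                # cluster index -> bytearray, data-bearing clusters only
        self.logical_size = 0             # the size an application would see

    def write(self, offset, data):
        """Store data at offset, allocating only the clusters it touches."""
        for i, byte in enumerate(data):
            index, within = divmod(offset + i, self.cluster_size)
            cluster = self.clusters.setdefault(index, bytearray(self.cluster_size))
            cluster[within] = byte
        self.logical_size = max(self.logical_size, offset + len(data))

    def read(self, offset, length):
        """Return up to length bytes; unallocated clusters read back as zeros."""
        end = min(offset + length, self.logical_size)
        out = bytearray()
        for pos in range(offset, end):
            index, within = divmod(pos, self.cluster_size)
            cluster = self.clusters.get(index)
            out.append(cluster[within] if cluster is not None else 0)
        return bytes(out)

    def allocated_size(self):
        """Bytes of storage actually held by the model."""
        return len(self.clusters) * self.cluster_size


# Two small writes separated by a gap of roughly one megabyte.
model = SparseFileModel()
model.write(0, b"header")
model.write(1_000_000, b"tail")
print(model.logical_size, model.allocated_size())  # prints: 1000004 8192
```

In this model the file appears to be about 1 MB long, yet only two clusters are held, because the gap between the two writes is never allocated.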

NTFS manages sparse files transparently in the background: when a read operation accesses an area of the file where those zeros are located, NTFS fills the read buffer with zeros. The application is unaware of this substitution.
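From the application's side, this transparency can be seen with ordinary file I/O. The sketch below is a minimal illustration, assuming nothing beyond Python's standard library; the helper names write_with_gap and read_gap are hypothetical. It writes two short markers separated by a gap that is never written, then reads the gap back. Whether the gap is actually left unallocated on disk depends on the file system, and on NTFS the file would additionally have to be marked sparse (for example with the FSCTL_SET_SPARSE control code), which this sketch does not do. It demonstrates only what the reading application observes.

```python
import os
import tempfile


def write_with_gap(path, gap_bytes):
    """Write b"start", skip gap_bytes without writing, then write b"end"."""
    with open(path, "wb") as fh:
        fh.write(b"start")
        fh.seek(gap_bytes, os.SEEK_CUR)  # move forward; no data is written for the gap
        fh.write(b"end")


def read_gap(path, gap_bytes):
    """Read back the region that was skipped by write_with_gap."""
    with open(path, "rb") as fh:
        fh.seek(len(b"start"))
        return fh.read(gap_bytes)


GAP = 64 * 1024  # arbitrary gap length chosen for the demonstration

with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.bin")
    write_with_gap(demo_path, GAP)
    gap = read_gap(demo_path, GAP)
    size = os.path.getsize(demo_path)

# The reader sees plain zero bytes and a normal file length,
# regardless of how the gap is represented on disk.
print(gap == bytes(GAP), size)
```

The reading code contains no special handling for the gap; it receives zeros exactly as it would from a file in which those zeros were physically stored.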

Sparse files are widely used in disk images, database files, log files and scientific applications.

Difference between File Compression and Sparse Files

Besides sparse files, NTFS also includes built-in file compression. Both features save space on the disk volume, but they achieve that goal differently. The main disadvantage of file compression is that it may degrade system performance when the file is read or written, because processing resources are spent decompressing and compressing the data as required. Such overhead is sometimes unacceptable in critical applications.

computer_science/what_is_sparse_files_explained_ntfs_features.1442448257.txt.gz · Last modified: 2015/09/17 00:04 (external edit)