As anyone in the software industry can testify, applications evolve as requirements change to meet expanding market demands. Deciding when the file system is no longer an appropriate data store must be driven by current requirements as well as by the expected scaling and direction of the application. The decision should weigh not only the feature set of a commercial database but also time-to-market, reliability, and maintenance costs.
A database management system, or DBMS, is a feature-rich tool that the application developer can use to easily and reliably store and retrieve data. However, a DBMS is not always an appropriate solution for an application’s data storage needs. Conditions under which a DBMS may not be needed include:
- Application data requirements are simple, or the data is special-purpose and relatively static;
- Data sets are very small and data loss is an acceptable risk; or,
- Concurrent, multi-user access to data is not needed (“users” can be defined as multiple threads and/or multiple processes).
One example of an appropriate use for a file-system data store is the application configuration file or INI file. This type of data is typically static and rarely modified after the application is installed.
Key Questions to Ask
Determining when a file system is no longer an effective choice for data storage may not be clear-cut. The following questions may help guide the decision.
Do you need object management?
Real-world data are not files; they are objects. Applications map their objects into files and manage them through application code. A DBMS, on the other hand, is designed to manage objects and lets the application work with them directly, without the need to add object management code.
Do you need object relationship management?
Most applications require relationships between different object types. File systems have no concept of an object, so they have no ability to manage relationships. A DBMS is designed to provide and manage object relationships.
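As an illustration of relationship management, here is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for an embedded DBMS. The `customers` and `orders` tables and the helper names are hypothetical, chosen only for this example. The database, not the application, rejects an order that refers to a missing customer and removes a customer's orders when the customer is deleted.

```python
import sqlite3

def open_db():
    """Create an in-memory database with two related tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is enabled per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE customers (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE orders (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL
                        REFERENCES customers(id) ON DELETE CASCADE,
            item        TEXT NOT NULL
        );
    """)
    return conn

def add_order(conn, customer_id, item):
    """Insert an order; return False if the referenced customer does not exist."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO orders (customer_id, item) VALUES (?, ?)",
                (customer_id, item),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def order_count(conn):
    """Return the number of rows in the orders table."""
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

conn = open_db()
with conn:
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Acme')")
add_order(conn, 1, "widget")   # accepted: customer 1 exists
add_order(conn, 99, "gadget")  # rejected: there is no customer 99
with conn:
    conn.execute("DELETE FROM customers WHERE id = 1")  # cascades to the orders table
```

A file-based design would need equivalent checks written and maintained by hand in every code path that touches either file.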
Do you need transactional data operations?
A file system cannot ensure that your data does not get corrupted, that threads don’t see inconsistent data, or that data changes are never left partially completed. Without support for ACID (Atomic, Consistent, Isolated, Durable) transactions, vendors can fall into these traps, resulting in a large amount of code being added to the application. Embedded databases implement ACID transactions, which prevent these problems.
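The atomicity guarantee can be sketched with Python's built-in `sqlite3` module, used here only as a convenient embedded database. The `accounts` table and the `make_db`, `transfer`, and `balances` helpers are assumptions made for this example. The transfer credits one account and then debits another; if the debit violates a constraint, the credit that was already applied is rolled back as well.

```python
import sqlite3

def make_db():
    """Create an in-memory database with a hypothetical accounts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; return True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # If this debit would make the balance negative, the CHECK
            # constraint raises and the credit above is undone too.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    """Return {name: balance} for all accounts."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

conn = make_db()
ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # fails and leaves no partial change
```

Reproducing this behavior on plain files typically requires the application to implement its own journaling or write-then-rename scheme.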
Do you need concurrent access to your data?
Multiple applications accessing the same file will only exercise the file system lock arbiter. File systems do not notify waiting applications when another application releases its lock. The application ends up polling or managing the sharing of data through its own code. A DBMS manages concurrent access to data efficiently, resulting in faster access to your data and overall better performance for your application.
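To make the polling burden concrete, the following is a minimal sketch of the kind of lock-file code an application might end up carrying when it shares a file without a DBMS. It is one possible approach under stated assumptions, not a recommended design: the `acquire_lock` and `release_lock` names, the lock-file convention, and the timing parameters are all hypothetical.

```python
import os
import tempfile
import time

def acquire_lock(lock_path, timeout=1.0, poll_interval=0.05):
    """Poll until the lock file can be created; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so only
            # one caller can create the lock file at a time.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            return True
        except FileExistsError:
            # Nothing tells us when the holder is done; we can only retry.
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll_interval)

def release_lock(lock_path):
    """Release the lock by deleting the lock file."""
    os.remove(lock_path)

workdir = tempfile.mkdtemp()
lock_path = os.path.join(workdir, "data.lock")

first = acquire_lock(lock_path)                 # lock is free, so this succeeds
second = acquire_lock(lock_path, timeout=0.2)   # still held, so this times out
release_lock(lock_path)
third = acquire_lock(lock_path)                 # free again after the release
release_lock(lock_path)
```

Even this small sketch leaves open questions, such as what happens to the lock file if the holder crashes, which is part of the code a DBMS takes off the application's hands.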
Do you need indexed data for fast lookups?
File systems do not index objects; they index files. An application that needs fast access to objects stored in a file system must manage the index information itself. A DBMS manages indexing for the application seamlessly through the database schema.
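The following sketch shows schema-managed indexing using Python's built-in `sqlite3` module as an example embedded database. The `parts` table, the `idx_parts_sku` index, and the `query_plan` helper are hypothetical names for this illustration. The query text stays the same; only the schema changes, and the database switches from scanning the table to searching the index.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return SQLite's description of how it would execute `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the plan text

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (id INTEGER PRIMARY KEY, sku TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO parts (sku, name) VALUES (?, ?)",
    [(f"SKU-{i:05d}", f"part {i}") for i in range(10_000)],
)
conn.commit()

lookup = "SELECT name FROM parts WHERE sku = ?"

# Without an index on sku, the database has to scan the whole table.
plan_before = query_plan(conn, lookup, ("SKU-04242",))

# Declaring the index is a schema change; the application query is untouched.
conn.execute("CREATE INDEX idx_parts_sku ON parts (sku)")
plan_after = query_plan(conn, lookup, ("SKU-04242",))

name = conn.execute(lookup, ("SKU-04242",)).fetchone()[0]
```

Printing `plan_before` and `plan_after` shows the change in strategy; the exact wording of the plan text varies between SQLite versions.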
Do you need intelligent, memory-managed data?
Real-time applications needing complex memory and data management will find it hard to combine in-memory management with file-based data management. With real-time data, RAM-based management may be the only way to achieve performance requirements.
Do you need data redundancy?
Application vendors who desire failover and data redundancy will need to manage this in the application if it’s based on a file system.
Will your data management requirements change?
As applications get more complex, so does their data management. File-based solutions are tightly coupled to the initial application requirements and are extremely hard to redefine and change.
Answering “yes” to any of the above questions may indicate that the data requirements for your application design are becoming more complex. Using a commercial DBMS allows the application development team to focus on the core competencies of their application while taking advantage of a data management solution that addresses the complexities of managed data.
The feature list of a DBMS can only be considered an advantage if those features are essential to providing an effective solution to the application data management requirements. A quick summary of the features that address the questions above is as follows:
- Transaction Support – Atomic transactions guarantee that an operation either succeeds completely or fails completely. This includes automatic recovery of the database to a transaction-consistent point in the event of an abnormal termination of the application (crash, power loss, etc.).
- Concurrent Access – The ability to share data by controlling access to data items; many users (processes or threads) can access data concurrently.
- Data Normalization – A well-designed database schema can reduce storage requirements on the target storage media by reducing duplicate data.
- Expandability, Flexibility, Scalability – A database system can scale easily to larger data sets.
- Standards Enforcement – One example of this advantage would be to use the DBMS for all of the application’s data storage requirements. Multiple data structures can then be manipulated using the same API functions. This can lead to reduced application development times and reduced maintenance costs in the future.
- Fast Query Access – Databases allow indexing based on any attribute or data property (i.e., SQL columns). This enables fast retrieval of data based on the indexed attribute. The advantage becomes more important as data sets grow large, because indexing provides a more predictable query response time.
- Interoperability – Connectivity through industry-standard protocols allows third-party tools to access and analyze data.