So, there are different levels of data manipulation:
- Data access layer, which is convenient to use from programming languages;
- Storage layer. This is a separate layer because the layout that is convenient for storing data usually differs from the one that is convenient for using it: data has to be packed efficiently in memory, aligned, and laid out on disk. This bears on the schemaless question: a schema that is easy to store is not necessarily easy to access.
- Hardware ("iron"), the layer where the data physically lives. There it is organized in yet a third way, because the disks are managed by the operating system and are reached only through a driver. We are not going to go into that level very much.
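To make the difference between the access layer and the storage layer concrete, here is a minimal Python sketch. The record, its field names, and the binary layout are all hypothetical and chosen only for illustration; no real database stores data exactly this way. The point is that the same data has one shape for application code (a dictionary with named fields) and another for storage (a fixed-width byte string with no names at all).

```python
import struct

# Hypothetical record as the access layer exposes it: convenient for code.
user = {"id": 42, "age": 30, "balance": 1250.75}

# Hypothetical storage layout: fixed width, no field names.
# "<" = little-endian with no alignment padding,
# I = uint32, H = uint16, d = float64.
RECORD = struct.Struct("<IHd")


def to_storage(record: dict) -> bytes:
    """Pack a record into the compact form meant for disk or memory."""
    return RECORD.pack(record["id"], record["age"], record["balance"])


def from_storage(raw: bytes) -> dict:
    """Unpack bytes back into the form convenient for application code."""
    user_id, age, balance = RECORD.unpack(raw)
    return {"id": user_id, "age": age, "balance": balance}


raw = to_storage(user)
print(RECORD.size)        # 14 bytes: 4 + 2 + 8
print(from_storage(raw))  # the original dictionary, rebuilt from bytes
```

The packed form is compact and cheap to write, but it is useless without knowing the layout, which is one way to read the remark that a schema that is easy to store is not automatically easy to access.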
For the data access layer, there are requirements we want to meet so that it is easy to work with:
- Versatility, so that it is possible to request data using any technology.
- Query efficiency. The access method must make it both efficient and convenient to get data out of the database.
- Parallelism. Everything is scaled out now, so different servers access the database for the same data at the same time. We need to make the most of this parallelism and process the data faster because of it.
For the storage layer, it is just as important to support this parallelism in such a way that concurrent access does not corrupt or overwrite the data.
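As a rough illustration of what "overwritten" means here, the following Python sketch has several threads updating one shared value. The `Counter` class and the `run_clients` helper are hypothetical stand-ins for a stored value and its concurrent clients, not how any particular database is implemented. The lock makes each read-modify-write step atomic; without it, two writers could read the same old value and one of the updates would be silently lost.

```python
import threading


class Counter:
    """Toy stand-in for a stored value updated by several clients."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # The read-modify-write sequence must not interleave with another
        # writer, otherwise one update overwrites the other.
        with self._lock:
            current = self.value
            self.value = current + 1


def run_clients(counter: Counter, clients: int = 8, updates: int = 10_000) -> int:
    """Run `clients` threads that each apply `updates` increments."""

    def worker() -> None:
        for _ in range(updates):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


total = run_clients(Counter())
print(total)  # 80000: no update was lost
```

A single lock is the simplest possible choice and serializes all writers; real storage engines use finer-grained mechanisms, but the requirement they satisfy is the same.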
At the same time, the data must be stored reliably and read back reliably. That is, if we have written something to the database, we must be sure that we will get it back.
If you have worked with old databases such as FoxPro, you know that data often gets corrupted. In new databases like MongoDB, Cassandra and others, such problems also happen. Maybe they just go unnoticed more often, because there is a lot of data and corruption is harder to spot.
For the hardware, reliability is really important. Here it is something of an assumption, because we are going to talk about theoretical things. In our model, if something has reached the disk, we consider it safe. How to replace a disk in a RAID array on time is, for today, the admins' concern. We won't go deep into this question and will hardly touch on how efficiently the storage is physically organized.
To solve these problems, there are approaches that are very similar across different data stores, both new and classic.