“Polyglot persistence” is the name given to a data architecture that brings together different storage technologies within a single system. Martin Fowler wrote about it many years ago, yet some companies are only now discovering its benefits as they embark on digital transformation processes.
In data-centric architectures, where the data is the protagonist, the arguments in favour of using the best possible data store for every use case become even stronger. But, faced with so many options, which product does one choose?
Bear in mind that, from the point of view of the final applications, the technology used in the persistence layer is transparent to development.
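One way to picture this transparency is a small interface that application code depends on, with the actual storage technology hidden behind it. The following is a minimal sketch, not a prescribed design: `ProductStore`, `InMemoryProductStore` and `rename_product` are hypothetical names invented for illustration.

```python
from typing import Optional, Protocol


class ProductStore(Protocol):
    """Hypothetical interface that the application depends on."""

    def get(self, product_id: str) -> Optional[dict]: ...

    def put(self, product_id: str, product: dict) -> None: ...


class InMemoryProductStore:
    """Stand-in backend; a document or key-value store could replace it."""

    def __init__(self) -> None:
        self._items = {}  # product_id -> product document

    def get(self, product_id: str) -> Optional[dict]:
        return self._items.get(product_id)

    def put(self, product_id: str, product: dict) -> None:
        self._items[product_id] = dict(product)


def rename_product(store: ProductStore, product_id: str, name: str) -> bool:
    """Application logic that only knows about the interface."""
    product = store.get(product_id)
    if product is None:
        return False
    product["name"] = name
    store.put(product_id, product)
    return True
```

Swapping the backend then means writing another class with the same two methods, while `rename_product` stays unchanged.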
Some of the factors that should be considered when choosing a data store are:
- The architecture and principles on which it is based. You should delve into the product’s features and understand it well before making a decision.
- The maturity of the product and the clients that use it.
- The functionality that the product offers us and how it fits our needs.
- The ease of scaling and, in general, administering the service.
- A powerful but simple development experience that makes it easy to build software in an agile and efficient way.
- Complete visual documentation.
- A large community within which to consult examples, find talent or seek help to solve the occasional problem.
- The rate at which new versions are released and bugs are fixed, but also the kind of improvements that have been included with each release.
But these are not the only things to consider before choosing a technology. At times, choosing from amongst hundreds of products is really complicated. Because of this, having a list of use cases and which technologies solve them can be a significant help in finding the best candidate. We will look at some of them below.
Recommendation systems in real time
Nowadays, it is very common for a service to offer recommendations based on information at hand. For example, products in a shop, advertisements on an ad server, social network connections, taste in music…
A graph database that efficiently maintains the relationships between these elements will allow you to make recommendations based on certain patterns by collecting your own or third-party information.
Although many products offer graph functions, Neo4j stands out because it was designed for this purpose from the beginning.
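To make the idea of pattern-based recommendations concrete, here is a minimal in-memory sketch of the classic traversal "user, product, other user, other product". It does not use Neo4j or any graph database API; the function names and sample data are hypothetical, and a real graph database would express the same walk as a query.

```python
from collections import Counter, defaultdict


def build_graph(purchases):
    """Build a bipartite graph: user -> products and product -> users."""
    user_products = defaultdict(set)
    product_users = defaultdict(set)
    for user, product in purchases:
        user_products[user].add(product)
        product_users[product].add(user)
    return user_products, product_users


def recommend(user, user_products, product_users, limit=3):
    """Walk user -> product -> other user -> product and rank by path count."""
    owned = user_products.get(user, set())
    scores = Counter()
    for product in owned:
        for other in product_users[product]:
            if other == user:
                continue
            for candidate in user_products[other]:
                if candidate not in owned:
                    scores[candidate] += 1
    # Sort by score (descending), then by name for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [product for product, _ in ranked[:limit]]


# Hypothetical sample data: (user, product) purchase pairs.
purchases = [
    ("ana", "guitar"), ("ana", "amp"),
    ("ben", "guitar"), ("ben", "amp"), ("ben", "tuner"),
    ("cat", "guitar"), ("cat", "tuner"), ("cat", "strap"),
]
user_products, product_users = build_graph(purchases)
print(recommend("ana", user_products, product_users))  # ['tuner', 'strap']
```

The score is simply the number of distinct paths that lead from the user to a product they do not yet own, which is the kind of pattern a graph store can evaluate efficiently.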
Price analysis
Visualising and analysing prices requires querying a large number of samples collected over a period of time. This can be useful for, amongst other things, predicting the best time to buy a product or suggesting prices by detecting patterns or fluctuations.
In this scenario, the best solution is Riak TS: it supports native aggregations, stores data contiguously according to the time range to which it refers, uses a SQL-like language, can be configured in a multi-cluster environment and provides a connector for Spark to analyse the stored information efficiently.
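The kind of aggregation involved can be sketched in a few lines: group samples into fixed time buckets and compute summary statistics per bucket. This is an illustration only, not the API of any time series database; `aggregate_prices` and the sample data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def aggregate_prices(samples, bucket_seconds):
    """Group (timestamp, price) samples into fixed time buckets.

    Returns {bucket_start: {"min": ..., "max": ..., "avg": ...}}.
    """
    buckets = defaultdict(list)
    for timestamp, price in samples:
        bucket_start = timestamp - (timestamp % bucket_seconds)
        buckets[bucket_start].append(price)
    return {
        start: {"min": min(prices), "max": max(prices), "avg": mean(prices)}
        for start, prices in sorted(buckets.items())
    }


# Hypothetical samples: (seconds since start, price).
samples = [(0, 10.0), (30, 12.0), (60, 9.0), (90, 11.0), (150, 8.0)]
per_minute = aggregate_prices(samples, bucket_seconds=60)
```

A time series store keeps samples from the same time range close together, so a range query like this one touches far less data than a scan over everything.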
Fraud detection
There are many situations in which you must identify fraudulent usage of your services: rating systems, online purchases or comments, to cite a few examples.
One of the most widespread techniques for detecting such situations is based on finding cyclic references between entities. A graph database is the optimal solution for this scenario.
Neo4j is once again the preferred choice, not only for the reasons given above, but also because an ACID database provides transactional guarantees for critical operations and, as a result, gives you a reliable source of information with which to detect possible fraud.
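Finding a cyclic reference between entities comes down to cycle detection in a directed graph. The sketch below uses a plain depth-first search over an adjacency dictionary; it is not a graph database query, and the account names are hypothetical. Because it is recursive, it is only suitable for small graphs.

```python
def find_cycle(graph):
    """Return one cycle as a list of nodes (first node repeated at the end), or None.

    `graph` maps each node to the nodes it points to.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for neighbour in graph.get(node, ()):
            state = colour.get(neighbour, WHITE)
            if state == GREY:
                # Back edge: the cycle is the tail of the current path.
                return path[path.index(neighbour):] + [neighbour]
            if state == WHITE:
                cycle = visit(neighbour)
                if cycle:
                    return cycle
        colour[node] = BLACK
        path.pop()
        return None

    for start in graph:
        if colour.get(start, WHITE) == WHITE:
            cycle = visit(start)
            if cycle:
                return cycle
    return None


# Hypothetical money transfers: account -> accounts it sent money to.
transfers = {
    "acct_a": ["acct_b"],
    "acct_b": ["acct_c"],
    "acct_c": ["acct_a"],
    "acct_d": ["acct_a"],
}
print(find_cycle(transfers))  # ['acct_a', 'acct_b', 'acct_c', 'acct_a']
```

A ring of transfers that ends where it started, as in this sample, is the sort of pattern that is worth flagging for review.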
Real-time product inventory
It may appear to be a simple task, but keeping your catalogue’s product inventory up to date is one of the most important priorities. Without a system that responds in real time, you risk dissatisfied customers and possibly even financial losses.
Couchbase is primarily a robust document store with an integrated cache that optimises response times. Amongst other interesting features, its Multi-Dimensional Scaling (MDS) lets you scale services such as data, index and query independently, making better use of system resources.
Shopping cart management
Shopping carts hold simple, volatile information that must be accessible at great speed, since it is queried frequently. Redis is an in-memory key-value store that lets you create data sets that expire automatically, including in distributed environments.
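The behaviour described here, carts stored by key that disappear after a period of inactivity, can be sketched without any external service. The class below is an in-memory stand-in written for illustration; it is not Redis and does not use a Redis client, and `ExpiringCartStore` is a hypothetical name. The clock is injectable so that expiry can be exercised without waiting.

```python
import time


class ExpiringCartStore:
    """In-memory stand-in for a key-value cache whose keys expire."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._carts = {}  # cart_id -> (expires_at, {sku: quantity})

    def add_item(self, cart_id, sku, quantity=1):
        items = self.get(cart_id)
        items[sku] = items.get(sku, 0) + quantity
        # Every write refreshes the expiry, so active carts stay alive.
        self._carts[cart_id] = (self._clock() + self._ttl, items)

    def get(self, cart_id):
        entry = self._carts.get(cart_id)
        if entry is None:
            return {}
        expires_at, items = entry
        if self._clock() >= expires_at:
            del self._carts[cart_id]  # lazily evict expired carts
            return {}
        return dict(items)
```

In a real key-value cache the expiry is handled by the server itself, so the application only sets a time to live when it writes the cart.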
FAQs and semantic search
Many products and applications aim to give a human touch to their frequently asked questions or contact sections, at times building complex natural language recognition systems to give the desired response every time.
Elasticsearch, from Elastic, is the best product for building analysers, stemmers and tokenisers simply and unobtrusively into a semantic search engine. In addition, it is a resilient, flexible and scalable product with excellent documentation and a great community. Ultimately, Elastic has not only built a good tool, but also offers a package that includes many additional elements.
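To show what analysers, tokenisers and stemmers actually do, here is a deliberately naive sketch of an analysis pipeline feeding an inverted index. It is not the Elastic API and its stemmer is far cruder than a real one; every name, the stop-word list and the sample FAQs are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list, kept tiny for the example.
STOP_WORDS = {"a", "an", "the", "is", "to", "how", "do", "i", "my", "can"}


def tokenise(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def stem(token):
    """Very naive suffix stripping; real stemmers are far more careful."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyse(text):
    """Analyser = tokeniser + stop-word filter + stemmer."""
    return [stem(t) for t in tokenise(text) if t not in STOP_WORDS]


def build_index(documents):
    """Map each analysed term to the ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in analyse(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Rank documents by how many query terms they share."""
    hits = defaultdict(int)
    for term in set(analyse(query)):
        for doc_id in index.get(term, ()):
            hits[doc_id] += 1
    return sorted(hits, key=lambda doc_id: (-hits[doc_id], doc_id))


faqs = {
    "faq-1": "How do I reset my password?",
    "faq-2": "Tracking shipped orders",
    "faq-3": "Returning a damaged order",
}
index = build_index(faqs)
print(search(index, "track my orders"))  # ['faq-2', 'faq-3']
```

Because the query and the documents pass through the same analyser, "track" matches "Tracking" and "orders" matches "order", which is the effect a search engine's analysis chain is designed to achieve.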
Heterogeneous information catalogues
These days, information catalogues with millions of elements and information that is not very homogeneous are commonplace. For example, information on users, products, purchases, etc.
A system like MongoDB can store documents in JSON as well as validate them at runtime, manage large volumes of information and offer extraordinary performance, making it the best option.
In addition to offering a powerful query framework capable of retrieving information from whatever viewpoint you need, it lets you connect traditional Business Intelligence tools through its BI Connector.
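Validating documents at runtime means checking each one against a set of rules before it is accepted, while still allowing extra fields. The sketch below illustrates the idea with a hand-rolled checker; the schema format and the names `validate` and `PRODUCT_SCHEMA` are hypothetical and do not reproduce MongoDB's own validation syntax.

```python
def validate(document, schema):
    """Return a list of validation errors (empty means the document is valid)."""
    errors = []
    for field in schema.get("required", []):
        if field not in document:
            errors.append(f"missing required field: {field}")
    for field, expected_type in schema.get("types", {}).items():
        if field in document and not isinstance(document[field], expected_type):
            errors.append(f"{field} must be {expected_type.__name__}")
    return errors


# Hypothetical rules for product documents; unknown fields are allowed.
PRODUCT_SCHEMA = {
    "required": ["sku", "name"],
    "types": {"sku": str, "name": str, "price": float, "tags": list},
}

print(validate({"sku": "A1", "name": "Amp", "colour": "red"}, PRODUCT_SCHEMA))  # []
print(validate({"sku": 1}, PRODUCT_SCHEMA))
# ['missing required field: name', 'sku must be str']
```

This combination, a few enforced fields plus freedom for everything else, is what makes a document store a comfortable fit for information that is not very homogeneous.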
Offline mobile applications
At present, most mobile applications query information extensively over the Internet, which poses a problem when users lose their connection: if disconnections are not handled, the application can crash. This makes it necessary to develop components that guarantee the resilience of your application in such circumstances.
Couchbase Mobile offers a sound alternative, allowing you to run NoSQL queries on the device itself, with the benefits this brings, and to synchronise transparently with the application’s data source. As a result, you do not have to worry about whether you have an Internet connection, as Couchbase Mobile will always return the latest valid data available.
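The pattern at work is often called offline-first: reads are always served from a local copy, and synchronisation happens whenever the network allows. The sketch below illustrates that pattern only; it is not the Couchbase Mobile API, and `LocalReplica` and `pull_changes` are hypothetical names.

```python
class LocalReplica:
    """Serve reads from a local copy and sync it when the network allows."""

    def __init__(self, pull_changes):
        # pull_changes: callable returning {key: value}; it is assumed to
        # raise ConnectionError when the device is offline.
        self._pull_changes = pull_changes
        self._documents = {}

    def sync(self):
        """Try to pull remote changes; keep local data untouched when offline."""
        try:
            changes = self._pull_changes()
        except ConnectionError:
            return False
        self._documents.update(changes)
        return True

    def get(self, key):
        """Reads never touch the network, so they work without a connection."""
        return self._documents.get(key)
```

With this split, a failed synchronisation is a non-event for the user interface: the application keeps showing the last data it managed to pull.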
Conclusion
I have presented some of the most powerful and ubiquitous technologies on the market, but you should not think that they only work in the situations described. Each product is designed and optimised to solve a relatively small set of problems, and those problems appear in a host of use cases.
I hope that this list serves as a reference the next time you are faced with the challenge of choosing a data store.
If you will allow me to offer some advice, try to be creative, innovative and conscientious and, if possible, build a proof of concept that validates your proposal. This will be the first step in leaving your comfort zone and exploring new paths that take your business to a higher level.