How we use open source software to expand storage capacities
The Internet is full of superlatives, and the same can be said of e-mail providers: continually increasing numbers of users are leading to ever-larger mail quotas. In particular, providers operating beyond the enterprise level face inexorable, exponential growth in storage demand simply because of the number of users they serve. This causes problems: the capacity required per user keeps spiraling, while the storage is still expected to remain secure, cost-effective and performant. This can only be achieved if storage expansion is decoupled from the actual mail servers, so that the mail servers can grow with the number of users and the storage can scale with the capacity required per user.
Our long-standing customer, Deutsche Telekom, commissioned us to develop a flexible solution for their mail storage, similar to the one described above. They laid down two conditions for the project:
Local storage on the server machines should be avoided, which makes large-scale local caching of mail objects impossible.
Open source software (OSS) should be used wherever possible.
Conventional shared file systems such as NFS often reach their limits here in terms of capacity, budget and performance. For this reason we have, step by step, developed a bespoke solution that rests on two pillars. The first pillar is Dovecot. This mail system currently has the greatest market penetration, is already in use at Deutsche Telekom and, as required by the customer, is available as open source software. Together with the customer, we chose the Ceph storage system as the second pillar, to provide Dovecot with an arbitrarily scalable, petabyte-capable storage system. Ceph is a storage cluster that runs on commercially available server hardware and can store data volumes ranging from a couple of hundred terabytes through petabytes up to exabytes. Intelligent peer-to-peer communication ensures that all data is stored redundantly and with high availability. The managed storage is based on binary objects of “arbitrary” size that are addressed by object name. In addition, various standard protocols are available for accessing the objects.
We are using two of these protocols in our Deutsche Telekom project. The first is RADOS, the native protocol, which provides direct access to the objects and underlies the other standard access methods. The second is CephFS, a shared file system. Together, they pave the way for an almost arbitrarily scalable storage volume.
Why do we use two protocols?
The Dovecot mail server has a well-functioning plug-in system covering many areas. Plug-ins can be used to implement extensions or new back-ends for storage, user databases, etc. without making any changes to Dovecot itself. Our solution involved developing a plug-in for the so-called storage API, which handles access to the actual mails. We used the native RADOS protocol for this, as it allows direct, very low-latency access to the Ceph object storage devices (OSDs).
Dovecot owes much of its efficiency to its sophisticated index and cache management. This data management, however, offers no plug-in concept; it is based exclusively on files in a file system (local or shared).
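To make the plug-in idea more tangible, here is a minimal Python sketch of a back-end registry. It is only an analogy: Dovecot's real storage API is written in C and looks different, and every name below (MailStorage, register_backend, open_storage, the "memory-objects" driver) is hypothetical. A plain dictionary stands in for the object store.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Type


class MailStorage(ABC):
    """Hypothetical interface that the core would program against."""

    @abstractmethod
    def save(self, name: str, data: bytes) -> None:
        """Store one mail under the given object name."""

    @abstractmethod
    def load(self, name: str) -> bytes:
        """Return the mail stored under the given object name."""


# Registry of available back-ends, keyed by driver name.
_BACKENDS: Dict[str, Type[MailStorage]] = {}


def register_backend(driver: str) -> Callable[[Type[MailStorage]], Type[MailStorage]]:
    """Class decorator that adds a back-end without touching the core code."""
    def decorator(cls: Type[MailStorage]) -> Type[MailStorage]:
        _BACKENDS[driver] = cls
        return cls
    return decorator


@register_backend("memory-objects")
class ObjectStorage(MailStorage):
    """Back-end that keeps mails as named objects (in-memory stand-in)."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def save(self, name: str, data: bytes) -> None:
        self._objects[name] = data

    def load(self, name: str) -> bytes:
        return self._objects[name]


def open_storage(driver: str) -> MailStorage:
    """Instantiate the back-end registered under the given driver name."""
    try:
        return _BACKENDS[driver]()
    except KeyError:
        raise ValueError(f"unknown storage driver: {driver!r}") from None
```

The calling code only ever sees open_storage() and the MailStorage interface, so a new back-end can be added by registering one more class.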
This is why we decided to aim initially for a hybrid solution.
The mails are stored directly in Ceph as RADOS objects.
The index and cache files are saved in CephFS.
Both are stored in the same Ceph cluster. Because Ceph, unlike S3- or Swift-compatible systems, has very low latency and handles CephFS metadata efficiently in parallel, we expect good response times. This rests on the assumption that Dovecot already tries to keep the load on the file systems it uses low.
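The hybrid layout can be illustrated with a small Python model. This is a sketch under stated assumptions, not the plug-in's code: a dictionary stands in for the RADOS pool, an ordinary directory stands in for the CephFS mount, and the class name, the index file name and its JSON-lines format are invented for the example (Dovecot's real index files are binary and far more elaborate).

```python
import json
import uuid
from pathlib import Path
from typing import Dict, List


class HybridMailStore:
    """Toy model: mail bodies as named objects, index data as files."""

    def __init__(self, index_root: Path) -> None:
        # Stand-in for a RADOS pool: object name -> object data.
        self.objects: Dict[str, bytes] = {}
        # Stand-in for a CephFS mount point holding the index files.
        self.index_root = index_root

    def _index_file(self, user: str) -> Path:
        # Hypothetical layout: one index file per user.
        return self.index_root / user / "index.jsonl"

    def save_mail(self, user: str, subject: str, body: bytes) -> str:
        """Write the body as an object, then append a record to the index file."""
        oid = uuid.uuid4().hex
        self.objects[oid] = body
        index = self._index_file(user)
        index.parent.mkdir(parents=True, exist_ok=True)
        record = {"oid": oid, "subject": subject, "size": len(body)}
        with index.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")
        return oid

    def list_mails(self, user: str) -> List[dict]:
        """Answer a mailbox listing from the index alone, without reading bodies."""
        index = self._index_file(user)
        if not index.exists():
            return []
        with index.open(encoding="utf-8") as fh:
            return [json.loads(line) for line in fh]

    def fetch_mail(self, oid: str) -> bytes:
        """Read one mail body directly by its object name."""
        return self.objects[oid]
```

The point of the split shows in list_mails(): a listing touches only the small index file, while the potentially large mail bodies are read individually by object name when a client actually opens a mail.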
Details of the implementation include:
A Dovecot storage plug-in for saving e-mails as RADOS objects. It is our aim to provide this plug-in as open source software (OSS) and to keep it up to date.
A Dovecot dictionary plug-in that allows storage of key/value pairs in RADOS omaps. Once again, it is our aim to provide this plug-in as open source software (OSS) and to keep it up to date.
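As a rough illustration of the dictionary idea, the following Python sketch models key/value pairs kept in per-object omaps. It is not the plug-in's implementation: nested dictionaries stand in for the RADOS pool, the class and method names are hypothetical, and the way keys are mapped to object names (one object per user for "priv/" keys, one common object for "shared/" keys) is purely an assumption made for the example.

```python
from typing import Dict, Iterator, Optional, Tuple


class OmapDict:
    """Toy key/value dictionary backed by per-object omaps (in-memory stand-in)."""

    def __init__(self) -> None:
        # Stand-in for a RADOS pool: object name -> omap (key -> value).
        self._pool: Dict[str, Dict[str, bytes]] = {}

    @staticmethod
    def _oid(user: str, key: str) -> str:
        # Assumed naming scheme, chosen only for this sketch.
        if key.startswith("priv/"):
            return f"{user}/dict"
        if key.startswith("shared/"):
            return "shared/dict"
        raise ValueError(f"key must start with 'priv/' or 'shared/': {key!r}")

    def set(self, user: str, key: str, value: bytes) -> None:
        """Store a value in the omap of the object that owns the key."""
        self._pool.setdefault(self._oid(user, key), {})[key] = value

    def get(self, user: str, key: str) -> Optional[bytes]:
        """Return the stored value, or None if the key is not set."""
        return self._pool.get(self._oid(user, key), {}).get(key)

    def unset(self, user: str, key: str) -> None:
        """Remove a key; removing a missing key is not an error."""
        self._pool.get(self._oid(user, key), {}).pop(key, None)

    def iterate(self, user: str, prefix: str) -> Iterator[Tuple[str, bytes]]:
        """Yield all pairs whose key starts with prefix, in sorted key order."""
        omap = self._pool.get(self._oid(user, prefix), {})
        for key in sorted(omap):
            if key.startswith(prefix):
                yield key, omap[key]
```

Private keys of different users end up in different objects, so a lookup or prefix scan for one user never has to touch another user's data, while shared keys are visible to everyone.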
This is a very future-proof solution for our customer, Deutsche Telekom. It facilitates cost-conscious growth and significantly simplifies the current setup, not least because the use of open source software guarantees independence from system suppliers.
It’s still too early to make a final judgment – but we at Tallence are convinced that this is the right solution and are enthusiastically looking forward to the first load testing.
You are kindly invited to participate. Please feel free to test it, give feedback and report bugs. You can contribute here: https://github.com/ceph-dovecot