
Ceph BlueStore: To cache or not to cache, that is the question

John Mazzie | March 2019

To Cache or not to Cache, that is the question.

Well, should you? Should you cache for your Ceph® cluster? The answer is: it depends.

You can use high-end enterprise NVMe™ drives, such as the Micron® 9200 MAX, and not have to worry about getting the most performance from your Ceph cluster. But what if you would like to gain more performance from a system that is made up mostly of SATA drives? In that case, there are benefits to adding a couple of faster drives to your Ceph OSD servers for storing your BlueStore database and write-ahead log.
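If you want to try this yourself, a BlueStore OSD with its data on a SATA drive and its DB and WAL on NVMe can be created with ceph-volume along the following lines. This is only a sketch: the device paths below are assumptions, not the layout used in the tested cluster.

    # Assumed devices: /dev/sdb is a SATA data drive; /dev/nvme0n1p1 and
    # /dev/nvme0n1p2 are partitions reserved on an NVMe drive for the
    # BlueStore database and write-ahead log.
    ceph-volume lvm create --bluestore \
        --data /dev/sdb \
        --block.db /dev/nvme0n1p1 \
        --block.wal /dev/nvme0n1p2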

Micron developed and tested the popular Accelerated Ceph Storage Solution, which leverages servers with Red Hat Ceph Storage running on Red Hat Enterprise Linux. I will go through a few workload scenarios and show you where caching can help, based on actual results from our solution testing lab.

System Configuration

Testing was done using a four OSD node Ceph cluster with the following configuration:

Processor: AMD EPYC 7551P (single socket)
Memory: 256GB DDR4 @ 2666 MHz (8x 32GB)
Network: 100G
SATA drives: Micron 5210 ION 3.84TB (x12)
NVMe drives (cache devices): Micron 9200 MAX 1.6TB (x2)
OS: Red Hat® Enterprise Linux 7.6
Application: Red Hat Ceph Storage 3.2
OSDs per SATA drive: 2
Dataset: 50 RBDs @ 150GB each with 2x replication

Table 1: Ceph OSD Server Configuration
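As a quick sanity check (not part of the original test procedure), Ceph's OSD metadata shows whether an OSD's BlueStore DB and WAL actually live on the intended NVMe devices; OSD 0 below is just an example ID:

    # List the BlueStore device paths reported for OSD 0
    ceph osd metadata 0 | grep -E 'bluefs|bluestore_bdev'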

4KiB Random Block Testing

For 4KiB random writes, using FIO (Flexible I/O), you can see that utilizing caching drives greatly increases your performance while keeping your tail latency low, even under heavy load. With 40 FIO instances, performance is 71% higher (190K vs. 111K IOPS) and tail latency is 72% lower (119ms vs. 665ms).

 


Figure 1: 4KiB Random Write Performance and Tail Latency
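For reference, a 4KiB random-write test against an RBD image can be run with FIO's rbd engine along these lines. This is a sketch with assumed pool, image and client names, not the exact job definition used for the results above:

    # Assumed names: pool 'rbd', image 'fio_test', client 'admin'
    fio --name=4k-randwrite \
        --ioengine=rbd --clientname=admin \
        --pool=rbd --rbdname=fio_test \
        --rw=randwrite --bs=4k --iodepth=32 \
        --direct=1 --time_based --runtime=300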

There is some performance gain during 4KiB random read testing, but it is much less convincing. This is to be expected: during a read test, the write-ahead log is not used and the BlueStore database changes little, if at all.

Figure 2: 4KiB Random Read Performance and Tail Latency

A mixed workload (70% read/30% write) also shows the benefits of having caching devices in your system. Performance gains range from 30% at a queue depth of 64 to 162% at a queue depth of 6.

Figure 3: 4KiB Random 70% Read/30% Write Performance and Tail Latency
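The mixed workload only changes the I/O pattern; with the same assumed pool and image names as above, the FIO invocation would look roughly like this:

    # 70% reads / 30% writes at 4KiB against the same assumed RBD target
    fio --name=4k-mixed \
        --ioengine=rbd --clientname=admin \
        --pool=rbd --rbdname=fio_test \
        --rw=randrw --rwmixread=70 --bs=4k --iodepth=32 \
        --direct=1 --time_based --runtime=300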

4MiB Object Testing

When running the rados bench command with 4MiB objects, there is some performance gain with caching devices, but it is not as dramatic as with the small block workloads. Since the write-ahead log is small and the objects are large, adding caching devices has much less impact on performance. Throughput is 9% higher (4.94 GiB/s vs. 4.53 GiB/s) with caching than without, while average latency is 7% lower (126ms vs. 138ms), when running 10 instances of rados bench.

Figure 4: 4MiB Object Write Performance
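A single instance of this object test can be reproduced with rados bench roughly as follows; the pool name and thread count are assumptions, and the numbers above came from 10 such instances running in parallel:

    # Write 4MiB objects for 300 seconds against an assumed pool 'bench';
    # --no-cleanup keeps the objects so a read test can run afterwards
    rados bench -p bench 300 write -b 4M -t 16 --no-cleanup
    # Sequential read test over the objects written above
    rados bench -p bench 300 seq -t 16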

With reads, we again see that there is negligible performance gain across the board.

Figure 5: 4MiB Object Read Performance

Conclusion

As you can see, if your workload is almost all reads, you won't gain much, if anything, from adding caching devices to your Ceph cluster for BlueStore database and write-ahead log storage. With writes, it is a completely different story. While there is some benefit for large objects, the real showstopper for caching devices is small block writes and mixed workloads. For the small investment of adding a couple of high-performance Micron 9200 NVMe drives to your system, you can get the most out of your Ceph cluster.

What sorts of results are you getting with your open source storage? Learn more at Micron Accelerated Ceph Storage.

Stay up to date by following us on Twitter @MicronStorage and connect with us on LinkedIn.

Principal Storage Solutions Engineer

John Mazzie

John is a Member of the Technical Staff in the Data Center Workload Engineering group in Austin, TX. He graduated from West Virginia University in 2008 with an MSEE with an emphasis in wireless communications. John worked for Dell on its MD3 series of storage arrays on both the development and sustaining sides. He joined Micron in 2016, where he has worked on Cassandra, MongoDB, Ceph, and other advanced storage workloads.