How to achieve 1 GByte/sec I/O throughput with commodity IDE disks

Jens Mache, Joshua Bower-Cooley, Jason Guchereau, Paul Thomas, and Matthew Wilkinson
Lewis & Clark College, Portland, OR 97219

The Problem

In order to compete with custom-made systems, PC clusters have to provide not
only fast computation and communication, but also high-performance disk access.
I/O performance can play a critical role in the completion times of many
applications that transfer large amounts of data to and from secondary storage,
for example, simulations, computer graphics, file serving, data mining, or
visualization.

An I/O throughput of 1 GByte/sec was first achieved on ASCI Red with I/O
hardware costing over one million dollars. We set out to achieve similar I/O
performance on our PC cluster by harnessing the power of commodity IDE disks on
remote nodes.

The Approach

We set out to achieve an I/O throughput of 1 GByte/sec on a PC cluster that (1)
has as few as 32 nodes and (2) uses less than ten thousand dollars worth of I/O
hardware. To reach this goal, each of the 32 nodes must be able to access data
at a rate of at least 32 MBytes/sec (1 GByte/sec divided across 32 nodes).

The novelty of our approach is (A) to use two commodity IDE disks (not SCSI
disks) in a software RAID configuration on each node, and (B) to configure the
parallel file system such that each node acts as both an I/O node and a compute
node.

In our first experiment, we measured the local read and write performance of our
two IDE drives (IBM 20GB ATA100 7200rpm costing $112 each), configured as a
software RAID 0. Using the Bonnie disk benchmark, we measured up to 68.23
MBytes/sec.
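For a rough sense of what such a measurement involves, a minimal sequential-read timing loop is sketched below. This is an illustrative sketch, not the Bonnie benchmark itself; the file path and block size are placeholder assumptions, and the test file must be much larger than main memory so that caching does not inflate the result.

/* Illustrative sketch: time large sequential reads from a file on the
 * software RAID 0 and report MBytes/sec.  Not the Bonnie benchmark;
 * the default path below is a placeholder. */
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/time.h>

#define BLOCK (1024 * 1024)   /* 1 MByte per read() */

int main(int argc, char **argv)
{
    const char *path = (argc > 1) ? argv[1] : "/mnt/raid0/testfile";
    char *buf = malloc(BLOCK);
    int fd = open(path, O_RDONLY);
    if (fd < 0 || buf == NULL) { perror("open"); return 1; }

    struct timeval t0, t1;
    long long bytes = 0;
    ssize_t n;

    gettimeofday(&t0, NULL);
    while ((n = read(fd, buf, BLOCK)) > 0)
        bytes += n;
    gettimeofday(&t1, NULL);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
    printf("%.2f MBytes/sec\n", bytes / (1024.0 * 1024.0) / secs);

    close(fd);
    free(buf);
    return 0;
}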

In our second experiment, we measured the performance of a concurrent
read/write test program that sits on top of PVFS, an open-source parallel file
system. Parallel file systems allow transparent access to disks on remote nodes.
We configured each machine as both an I/O node and a compute node to make the
best use of our limited number of nodes. Using MPI and the native PVFS API, we
measured I/O throughputs well above 1 GByte/sec: up to 2007.199 MBytes/sec read
throughput and 1698.896 MBytes/sec write throughput (with an appropriate file
view and stripe size such that most disk accesses were local).
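To illustrate the access pattern (this is a sketch, not our actual test program), a minimal MPI-IO concurrent-write timer might look as follows. The file name /pvfs/iotest, the 16 MByte per-rank block, and the striping_unit hint value are placeholder assumptions; the point is that each rank writes its own contiguous block and the stripe size is matched to it, so that most accesses stay on the local disks.

/* Illustrative sketch: each MPI rank writes one contiguous block of a
 * shared PVFS file and rank 0 reports the aggregate write rate. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define BLOCK (16 * 1024 * 1024)   /* bytes per rank (placeholder) */

int main(int argc, char **argv)
{
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    char *buf = malloc(BLOCK);
    for (int i = 0; i < BLOCK; i++) buf[i] = (char)rank;

    /* Ask the file system for a stripe size that matches the per-rank block. */
    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "striping_unit", "16777216");

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "/pvfs/iotest",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);

    /* Each rank sees only its own block of the shared file. */
    MPI_Offset disp = (MPI_Offset)rank * BLOCK;
    MPI_File_set_view(fh, disp, MPI_BYTE, MPI_BYTE, "native", info);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    MPI_File_write_all(fh, buf, BLOCK, MPI_BYTE, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);
    MPI_Barrier(MPI_COMM_WORLD);
    double t1 = MPI_Wtime();

    if (rank == 0)
        printf("aggregate write: %.1f MBytes/sec\n",
               (double)nprocs * BLOCK / (1024.0 * 1024.0) / (t1 - t0));

    free(buf);
    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}

The corresponding read test simply opens the file with MPI_MODE_RDONLY and calls MPI_File_read_all; varying the stripe size and the file view in such a sketch gives a feel for the configuration sensitivities discussed below.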

In additional experiments, we measured the I/O performance of a ray tracing
application and studied how sensitive I/O performance is to configuration and
programming choices.

Our conclusions are as follows:

· High-performance I/O is now possible on PC clusters with commodity IDE disks.

· Compared to ASCI Red, price/performance for I/O improved by over a factor of
100. (To achieve 1 GByte/sec, we used 64 IDE drives costing $112 each, or about
$7,200 in total, whereas ASCI Red used 18 SYMBIOS RAIDs costing $60,000 each,
or $1,080,000 in total.)

· In contrast to ASCI Red, the I/O nodes in our cluster have higher throughput
than the interconnect. (Using ttcp, we measured 38 to 46 MBytes/sec network
throughput for our copper Gigabit Ethernet Foundry switch and Intel cards. ASCI
Red’s SYMBIOS RAIDs can write data at 70 MBytes/sec, while its custom-made
network can transfer data at 380 MBytes/sec.)

Impact, Importance, Interest, Audience

Interest in cluster computing is at an all-time high. While there is not yet an
I/O category in the top500 ranking (nor among the SC awards), I/O performance is
getting more and more attention (“the I/O bottleneck”).

The impact of our work is

(A) showing how commodity IDE disks on remote nodes can be harnessed,

(B) reporting I/O performance sensitivities, and

(C) reporting extremely good price/performance (better by a factor of 100).

Thus, parallel I/O now seems affordable, even for small businesses and colleges.

Our sensitivity results are highly valuable

(1) as a source of performance recommendations for application development,

(2) as a guide to I/O benchmarking (which will play an important role in
compiling the new “clusters @ top500” ranking), and

(3) as a guide to further improvement of parallel file systems.

Visual Presentation

First, we’ll have a traditional color poster display (32″x40″) describing the
problem, our approach (IDE disks in RAID configuration, PVFS with overlapped
nodes), our experiments (graphs and tables), and our conclusions.

Second, we plan to show the performance of application and benchmark runs “on
demand”. (It only takes a laptop and an Internet connection for us to start
programs on our cluster from Denver and get the performance results back.)
