Storage Systems

Organization, Performance, Coding, Reliability, and Their Data Processing

1st Edition - October 13, 2021
Latest edition
Author: Alexander Thomasian
Language: English

Description

Storage Systems: Organization, Performance, Coding, Reliability and Their Data Processing was motivated by the 1988 Redundant Array of Inexpensive/Independent Disks proposal to replace large form factor mainframe disks with an array of commodity disks. Disk loads are balanced by striping data into strips, with one strip per disk, and storage reliability is enhanced via replication or erasure coding, which at best dedicates k strips per stripe to tolerate k disk failures. Flash memories have resulted in a paradigm shift, with Solid State Drives (SSDs) replacing Hard Disk Drives (HDDs) for high performance applications. RAID and Flash have resulted in the emergence of new storage companies, namely EMC, NetApp, SanDisk, and Pure Storage, and a multibillion-dollar storage market. Key new conferences and publications are reviewed in this book.
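
The striping and erasure coding idea in the description above can be illustrated with a minimal sketch for the simplest case, k = 1, where one XOR parity strip per stripe tolerates one disk failure. This is not code from the book; the function names, the strip size, and the four-data-disk layout are assumptions chosen for illustration, and the sketch handles a single stripe only.

```python
from functools import reduce


def xor_strips(strips):
    """Return the byte-wise XOR of equal-length strips."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))


def make_stripe(data, n_data_disks, strip_size):
    """Split one stripe's worth of data into strips and append a parity strip.

    Hypothetical helper: assumes the data fits in a single stripe and
    pads the tail with zero bytes.
    """
    capacity = n_data_disks * strip_size
    if len(data) > capacity:
        raise ValueError("data does not fit in one stripe")
    padded = data.ljust(capacity, b"\0")
    strips = [padded[i * strip_size:(i + 1) * strip_size]
              for i in range(n_data_disks)]
    return strips + [xor_strips(strips)]  # one strip per disk, parity last


def rebuild(stripe, failed_disk):
    """Reconstruct the strip on a failed disk by XORing the surviving strips."""
    survivors = [s for i, s in enumerate(stripe) if i != failed_disk]
    return xor_strips(survivors)


stripe = make_stripe(b"storage systems!", n_data_disks=4, strip_size=4)
recovered = rebuild(stripe, failed_disk=2)
print(recovered)  # b'syst'
```

Tolerating k > 1 failures with only k redundant strips requires stronger codes than plain XOR, such as Reed-Solomon coding; the sketch covers only the single-parity case.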

The goal of the book is to expose students, researchers, and IT professionals to the more important developments in storage systems, while covering the evolution of storage technologies, traditional and novel databases, and novel sources of data. We describe several prototypes (FAWN at CMU, RAMCloud at Stanford, and Lightstore at MIT), commercial systems (Oracle's Exadata, AWS' Aurora, Alibaba's PolarDB, and Fungible Data Center), and the author's paper designs for cloud storage, namely heterogeneous disk arrays and hierarchical RAID.

Key features

Surveys storage technologies and lists sources of data: measurements, text, audio, images, and video
Familiarizes readers with paradigms to improve performance: caching, prefetching, log-structured file systems, and log-structured merge-trees (LSMs)
Describes RAID organizations and analyzes their performance and reliability
Conserves storage via data compression, deduplication, and compaction, and secures data via encryption
Specifies implications of storage technologies on performance and power consumption
Exemplifies database parallelism for big data, analytics, and deep learning via multicore CPUs, GPUs, FPGAs, and ASICs, e.g., Google's Tensor Processing Units

Readership

Scientists, researchers, and MSc and PhD students in computer science and engineering; researchers, practitioners, and students in computer architecture, operating systems, and management information systems

1. Introduction

2. Storage Technologies and Their Data

3. Disk Drive Data Placement and Scheduling

4. Mirrored & Hybrid Arrays

5. Redundant Arrays of Independent Disks - RAID

6. Coding for Multiple Disk Failures

7. Saving Power in Disks, Flash Memories, and Servers

8. Database Parallelism, Big Data and Analytics, Deep Learning

9. Structured, Unstructured, and Diverse Databases

10. Heterogeneous Disk Arrays - HDAs

11. Hierarchical RAID - HRAID

12. Conclusions

Appendix

Product details

Edition: 1
Published: October 20, 2021
Language: English

About the author

Alexander Thomasian

Dr. Alexander Thomasian is the founder and CEO of Thomasian Associates consulting, in Pleasantville, NY, USA. A former IBM Systems Engineer, he earned a PhD in Computer Science at UCLA. Dr. Thomasian has held teaching and research positions at Case Western Reserve U., U. Southern California, Burroughs Corp., IBM T.J. Watson Research Center, and New Jersey Institute of Technology. At IBM's Almaden Research Center, he developed the analysis to predict the performance of IBM's RAID5 product under development. His storage research was funded by the National Science Foundation, Hitachi Global Storage Technologies, and AT&T. He was a visiting scientist at the Chinese Academy of Sciences in Shenzhen and a Fulbright Fellow at the American University of Armenia in Yerevan. He is a Life Fellow of IEEE for fundamental contributions to the performance analysis of computer systems. He was an Editor of IEEE Transactions on Parallel and Distributed Systems and has authored a monograph on database concurrency control and 150 papers, more recently on storage systems.

Affiliations and expertise

CEO, Thomasian and Associates consulting, Pleasantville, NY, USA
