分布式系统

Storage Systems Course: My proposal(zz)

2012年4月26日 阅读(404)

微博上看到的一篇文章,存储领域非常系统的介绍,尤其给出了各个部分的经典论文,需要翻墙,为方便阅读抄录如下

原文地址: http://dirkmeister.blogspot.com/2010/01/storage-systems-course-my-own-idea.html  作者:Dirk Meister

In my last post, I summarized some of the storage systems courses from international top universities with storage system labs.

In this post, I will distill my own ideas and my own views into a structure for a storage system course. Here, I assume here a 15-weeks course with a single 1 1/2 hour lecture per week (as we have in Germany):

  1. Introduction, Overview, Disk Drive Architecture
    Material: Ruemmler, Wilkes An introduction to disk drive modeling
  2. Disk Scheduling / SSD
    Material: Iyer, Druschel. Anticipatory scheduling: A disk scheduling framework to overcome deceptive idleness in synchronous I/O, Agrawal et al. Design Tradeoffs for SSD Performance
  3. RAID
    Material: Patterson et al. Introduction to Redundant Arrays of Inexpensive Disk (RAID), Corbett. Row-Diagonal Parity for Double Disk Failure Correction
  4. Local File Systems
  5. Local File System Case Studies: ext3, btrfs
    Material: Valerie Aurora. A short history of btrfs, Card et al. Design and Implementation of the Second Extended Filesystem
  6. Local File Structures (Sequential, Hashing, B-Tree)
    Material: Comer. The Ubiquitous B-Tree
  7. SAN / NAS / Object-based Storage
    Material: Sacks. Demystifying DAS, SAN, NAS, NAS Gateways, Fibre Channel, and iSCSI
  8. Examples: NFS, Ceph, GoogleFS/Hadoop DFS
    Material: Weil. Ceph, A scalable, high-performance distributed file system, Ghemawat et al. The Google File System
  9. Snapshots and Log-based Storage Designs
    Material: Brinkmann, Effert. Snapshots and Continuous Data Replication in Cluster Storage Environments, Hitz et al.File System Design for an NFS File Server Appliance, Rosenblum, Ousterhout. The Design and Implementation of a Log-Structured File System
  10. Fault Tolerance, Journaling, and Soft Updates
    Material: Prabhakaran et al. Analysis and Evolution of Journaling File Systems, Seltzer et al. Journaling Versus Soft Updates: Asynchronous Meta-data Protection in File Systems
  11. Advanced Hashing: Consistent Hashing, Share, and Crush
    Material: Karger et al. Consistent hashing and random trees: distributed caching protocols for relieving hot spots on the World Wide Web, Weil et al. CRUSH: controlled, scalable, decentralized placement of replicated data
  12. Caching, Replication
    Material: Nelson et al. Caching in the Sprite network file system, Kistler et al. Disconnected operation in the Coda File System
  13. Consistency, Availability, and Partition Tolerance
    Material: DeCandia et al. Dynamo: Amazon’s Highly Available Key-value Store, Helland, Life beyond Distributed Transaction: An Apostate’s Opinion
  14. Data Deduplication
    Material: Muthitacharoen et al., A Low-bandwidth Network File System, Douglis, Iyengar. Application-specific Delta-encoding via Resemblance Detection
  15. Performance Analysis
    Material: Traeger, A nine year study of file system and storage benchmarking (at least parts of it)

As books I would recommend: 

For me, a few key points are important: 

  • To clearly separate between classes of file systems and a concrete example. The best example is the class of network file systems vs. NFS. At the end there should be no much question if something is a inherent property of a class of file systems or of the concrete implementation
  • To have enough time to handle the basic concepts independently from concrete usages. For example explaining B-Trees as an important file structures independent from the usage in e.g. BTRFS. 
  • The concepts are more important than the current technology or standards.

You Might Also Like