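[Note: the earlier post "A Collection of Translated Classic Papers in Distributed Systems" was rather sprawling and covered all sorts of topics, and some readers said they did not know where to start. Hence this post. It uses the components of the Hadoop ecosystem as a thread to string together the more important papers, ordered roughly from earlier to later and from easier to harder, and it includes some items that did not appear in the translation collection. Among these articles, those marked (译) were translated by me, those marked (zz) are translations or original articles by other people, and the remaining unmarked ones are mostly English originals, of which I may translate the most important.]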
Case Study GFS: Evolution on Fast-forward(译)
The Hadoop Distributed File System(译)
HDFS Scalability: The Limits to Growth(译)
HDFS Reliability
MapReduce: Simplified Data Processing on Large Clusters(译)
A Comparison of Approaches to Large-Scale Data Analysis(译)
MapReduce: A Flexible Data Processing Tool(译)
MapReduce and Parallel DBMSs: Friends or Foes?(译)
Hadoop Fair Scheduler Guide(zz)
Hadoop MapReduce Source Code Analysis
Bigtable: A Distributed Storage System for Structured Data(译)
HFile: A Block-Indexed File Format to Store Sorted Key-Value Pairs
LevelDB: A Fast and Lightweight Key-Value Storage Library(译)
LevelDB: Source Code Analysis
Chubby: The Chubby lock service for loosely-coupled distributed systems(译)
ZooKeeper: Wait-free coordination for Internet-scale systems
Sawzall: Interpreting the Data: Parallel Analysis with Sawzall(zz)
Hive – A Petabyte Scale Data Warehouse Using Hadoop(zz)
Hive – A Warehousing Solution Over a Map-Reduce Framework
RCFile: A Fast and Space-efficient Data Placement Structure in MapReduce-based Warehouse Systems
Tenzing: A SQL Implementation on the MapReduce Framework(译)
6. Optimization
A Survey of Hadoop Platform Optimization(zz)
Hadoop Performance Tuning(zz)
HBase Performance Tuning(zz)
In-Depth Analysis of HBase Performance(zz)
Hadoop Performance Evaluation
The Performance of MapReduce: An In-depth Study
Optimizing Hadoop for the cluster
Starfish: A Self-tuning System for Big Data Analytics
To Compress or Not To Compress – Compute vs. IO tradeoffs for MapReduce Energy Efficiency
Apache Hadoop Goes Realtime at Facebook(译)
The Anatomy of Hadoop I/O Pipeline(译)
The Next Generation of Apache Hadoop MapReduce(zz)
Note:
When reposting, please credit the author: phylips@bmy 2011-10-7
Source: http://duanple.blog.163.com/blog/static/7097176720119791920962/