Distributed Systems

Hadoop Reading List

October 7, 2011

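[Note: The earlier post, "Collected Translations of Classic Papers in Distributed Systems", was rather sprawling and covered all sorts of topics, and some readers said they did not know where to start. Hence this list. It uses the components of the Hadoop ecosystem as a thread to tie together the more important papers, and it runs roughly from basic to advanced. It also includes some papers that did not appear in the translation collection. Items marked (译, "translated") were translated by me; items marked (zz, "reposted") are translations or original articles by other people; the remaining unmarked items are mostly English originals, the most important of which I may translate later.]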

1.HDFS 

GFS: The Google File System(译)

Case Study GFS: Evolution on Fast-forward(译)

The Hadoop Distributed File System(译)

HDFS Scalability: The Limits to Growth(译)

HDFS Reliability

2.MapReduce   

MapReduce: Simplified Data Processing on Large Clusters(译)

The MapReduce Debate

MapReduce and Parallel Databases: Friends or Foes?(zz)

MapReduce: A Major Step Backwards(zz)

MapReduce: A Major Step Backwards (II)

A Comparison of Approaches to Large-Scale Data Analysis(译)

MapReduce: A Flexible Data Processing Tool(译)

MapReduce and Parallel DBMSs: Friends or Foes?(译)

MapReduce Online

Hadoop Fair Scheduler Guide(zz)

Hadoop MapReduce Source Code Analysis

3.HBase  

Bigtable: A Distributed Storage System for Structured Data(译)

HBase Architecture(译)

HFile: A Block-Indexed File Format to Store Sorted Key-Value Pairs

HFile V2

LevelDB: A Fast and Lightweight Key-Value Storage Library(译)

LevelDB: Implementation(译)

LevelDB: Source Code Analysis

4.ZooKeeper

Chubby: The Chubby lock service for loosely-coupled distributed systems(译)

ZooKeeper: Wait-free coordination for Internet-scale systems

5.Hive

Sawzall: Interpreting the Data: Parallel Analysis with Sawzall(zz)

Hive – A Petabyte Scale Data Warehouse Using Hadoop(zz)

Hive – A Warehousing Solution Over a Map-Reduce Framework

Hive RCFile: An Efficient Storage Structure(zz)

RCFile: A Fast and Space-efficient Data Placement Structure in MapReduce-based Warehouse Systems

Pig Latin Analysis Report(zz)

Tenzing: A SQL Implementation on the MapReduce Framework(译)

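6.Optimization

A Survey of Hadoop Platform Optimization(zz)

Hadoop Job Tuning Parameters and How They Work(zz)

Hadoop Performance Tuning(zz)

HBase Performance Tuning(zz)

An In-Depth Analysis of HBase Performance(zz)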

HBase Performance Tuning

Hadoop Performance Evaluation

The Performance of MapReduce: An In-depth Study

Optimizing Hadoop for the cluster

Starfish: A Self-tuning System for Big Data Analytics

To Compress or Not To Compress – Compute vs. IO tradeoffs for MapReduce Energy Efficiency 

7.General and Miscellaneous

Apache Hadoop Goes Realtime at Facebook(译)

The Anatomy of Hadoop I/O Pipeline(译)

Next-Generation Apache Hadoop MapReduce(zz)

Apache Hadoop 0.23 

Avro: A Data Format for Big Data(zz)

Note:

When reposting, please credit the author: phylips@bmy 2011-10-7

Source: http://duanple.blog.163.com/blog/static/7097176720119791920962/
