duanple – Page 9 – 银河里的星星

Pregel: A System for Large-Scale Graph Processing
December 18, 2010, 332 views

Reposted from:

http://blog.csdn.net/AE86_FC/archive/2010/08/08/5796640.aspx

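Abstract
Many practical computing problems involve large graphs, such as the Web link graph and social networks. These graphs share one trait: enormous scale, often billions of vertices and trillions of edges. That scale makes efficient computation over them very hard. In this paper we present a computational model suited to this task. Programs are expressed as a sequence of iterations. In each iteration, a vertex can receive messages sent in the previous iteration, send messages to other vertices, and in the process modify its own state, the state of its outgoing edges, or the topology of the graph. This vertex-centric approach is flexible enough to express a broad set of algorithms. The model is designed to be efficient, scalable, and fault-tolerant, and it has been implemented on clusters of thousands of machines. Its implied synchronicity makes reasoning about programs easier. Distribution-related details are hidden behind an abstract API. What users see is an expressive, easy-to-program framework for processing large graphs.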
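To make the vertex-centric iteration model concrete, here is a minimal single-process sketch in Python. It is not Pregel's API: the function name `pregel_max`, the dictionary-based graph, and the "propagate the maximum value" task are all assumptions chosen for illustration. Each pass of the loop plays the role of one iteration: a vertex reads the messages sent to it in the previous iteration, updates its own state, and sends messages along its outgoing edges.

```python
def pregel_max(values, out_edges):
    """Propagate the largest vertex value to every reachable vertex.

    values:    dict mapping vertex -> initial value (hypothetical input format)
    out_edges: dict mapping vertex -> list of target vertices
    Returns (final_state, number_of_iterations_run).
    """
    state = dict(values)
    inbox = {v: [] for v in state}   # messages delivered at the start of an iteration
    superstep = 0
    while True:
        outbox = {v: [] for v in state}
        for v in state:
            msgs = inbox[v]
            # After the first iteration a vertex only runs if it received messages.
            if superstep > 0 and not msgs:
                continue
            best = max([state[v]] + msgs)
            # Send only when the value is new, so the computation eventually goes quiet.
            if superstep == 0 or best > state[v]:
                state[v] = best
                for target in out_edges.get(v, ()):
                    outbox[target].append(best)
        superstep += 1
        if not any(outbox.values()):  # no messages in flight: every vertex is idle
            return state, superstep
        inbox = outbox                # messages become visible in the next iteration


graph_values = {"a": 3, "b": 6, "c": 2}
graph_edges = {"a": ["b"], "b": ["c"], "c": ["a"]}
final_state, iterations = pregel_max(graph_values, graph_edges)
```

Messages written in one iteration are only read in the next one, which is the synchronicity the abstract refers to: inside an iteration no vertex ever sees a half-updated neighbor.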
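Keywords
Distributed computing, graph algorithms
1. Introduction
The Internet made the Web graph a popular object of analysis and research, and Web 2.0 pushed interest in social networks to new heights. Other large graphs (such as transportation routes, newspaper articles, paths of disease outbreaks, and citation relationships among published scientific work) have been studied for many years. Many algorithms are applied to them, such as shortest-path computations and graph algorithms derived from the PageRank idea. Many other graph computing problems also have considerable practical value, such as minimum cut and connected components.
Efficient processing of large graphs has proven to be very challenging. Graph algorithms often exhibit poor locality of memory access, too little work per vertex, and a degree of parallelism that changes during execution. Distribution across machines aggravates the locality problem and increases the probability that a machine failure will disrupt the computation. Although large graphs are ubiquitous and commercially important, no general-purpose implementation of arbitrary graph algorithms for large distributed environments exists so far.
Implementing an algorithm over a large graph typically means choosing among the following options: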
Spanner: Google's next Massive Storage and Computa
December 18, 2010, 522 views

Spanner: Google's next Massive Storage and Computation infrastructure

MapReduce, Bigtable, and Pregel have their origins at Google, and they all deal with large systems. But all of them may be dwarfed in size and complexity by a new project Google is working on, which was mentioned briefly (maybe unintentionally) at an event last year.
Scalable System Design Patterns(zz)
December 18, 2010, 417 views

Reposted from:

http://horicky.blogspot.com/2010/10/scalable-system-design-patterns.html

Looking back after 2.5 years since my previous post on scalable system design techniques, I’ve observed an emergence of a set of commonly used design patterns. Here is my attempt to capture and share them.
Load Balancer
In this model, a dispatcher decides which worker instance will handle each request, based on a chosen policy. The application should ideally be "stateless" so that any worker instance can handle the request.
This pattern is deployed in almost every medium to large web site setup.
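As a rough illustration, the dispatcher can be sketched in a few lines of Python. Round-robin is just one possible policy, and the names `RoundRobinDispatcher` and `make_worker` are made up for this example; the workers are plain stateless callables.

```python
import itertools


class RoundRobinDispatcher:
    """Hands each request to the next worker in a fixed rotation."""

    def __init__(self, workers):
        if not workers:
            raise ValueError("at least one worker is required")
        self._rotation = itertools.cycle(workers)

    def dispatch(self, request):
        worker = next(self._rotation)   # the policy: simple round-robin
        return worker(request)


def make_worker(name):
    # A stateless worker: the response depends only on the request.
    def handle(request):
        return f"{name} handled {request}"
    return handle


dispatcher = RoundRobinDispatcher([make_worker("w1"), make_worker("w2")])
responses = [dispatcher.dispatch(r) for r in ("a", "b", "c")]
```

Because the workers keep no per-client state, it does not matter that requests "a" and "c" land on one worker and "b" on another.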

Scatter and Gather
In this model, the dispatcher multicasts the request to all workers in the pool. Each worker computes a local result and sends it back to the dispatcher, which consolidates the results into a single response and sends it back to the client.
This pattern is used in search engines such as Yahoo and Google to handle users’ keyword search requests.
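A small Python sketch of the idea, using a thread pool to stand in for the worker pool. The shard layout (one dict of document id to text per worker), the scoring by substring count, and the function name `scatter_gather` are all assumptions made for illustration, not how any real search engine works.

```python
from concurrent.futures import ThreadPoolExecutor


def scatter_gather(query, shards, top_k=3):
    """Send the query to every shard, then merge the partial results."""

    def search_shard(shard):
        # Local result: (document id, number of occurrences) for matching documents.
        return [(doc, text.count(query)) for doc, text in shard.items() if query in text]

    # Scatter: every shard receives the same query and works in parallel.
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(search_shard, shards))

    # Gather: consolidate the partial results into one ranked response.
    merged = [hit for partial in partials for hit in partial]
    merged.sort(key=lambda hit: (-hit[1], hit[0]))
    return merged[:top_k]


example_shards = [
    {"d1": "graph graph", "d2": "cache"},
    {"d3": "graph"},
]
hits = scatter_gather("graph", example_shards)
```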

Result Cache
In this model, the dispatcher first looks up whether the request has been made before and, if so, returns the previous result, saving the actual execution.
This pattern is commonly used in large enterprise applications. Memcached is a very commonly deployed cache server.
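The lookup-before-execute logic can be sketched as follows. This uses an in-process dictionary rather than a cache server such as Memcached, and the class name `ResultCache` and its hit/miss counters are invented for the example.

```python
class ResultCache:
    """Wraps an expensive function and remembers results per request."""

    def __init__(self, compute):
        self._compute = compute
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, request):
        if request in self._store:       # seen before: skip the real execution
            self.hits += 1
            return self._store[request]
        self.misses += 1
        result = self._compute(request)  # first time: execute and remember
        self._store[request] = result
        return result


executed = []

def slow_square(x):
    executed.append(x)                   # records every real execution
    return x * x

cache = ResultCache(slow_square)
first = cache.get(4)
second = cache.get(4)
```

A production cache would also need eviction and invalidation, which this sketch leaves out.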

Shared Space
This model is also known as "Blackboard": all workers monitor information in the shared space and contribute partial knowledge back to the blackboard. The information is continuously enriched until a solution is reached.
This pattern is used in JavaSpace and also in the commercial product GigaSpace.

Pipe and Filter
This model is also known as "Data Flow Programming": all workers are connected by pipes through which data flows.
This pattern is a very common EAI pattern.
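In Python, generators give a compact way to sketch this: each filter is a generator that consumes one stream and yields another, and the "pipes" are simply the chaining of those generators. The filter names `strip_blank` and `to_upper` and the helper `pipeline` are hypothetical.

```python
def strip_blank(lines):
    # Filter 1: trim whitespace and drop empty lines.
    for line in lines:
        text = line.strip()
        if text:
            yield text


def to_upper(lines):
    # Filter 2: transform each item that flows through.
    for line in lines:
        yield line.upper()


def pipeline(source, *filters):
    """Connect the filters in order; each one reads the previous one's output."""
    stream = iter(source)
    for stage in filters:
        stream = stage(stream)
    return stream


result = list(pipeline(["a", "", " b "], strip_blank, to_upper))
```

Items are pulled through one at a time, so no stage needs to hold the whole data set in memory.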

Map Reduce
This model targets batch jobs where disk I/O is the major bottleneck. It uses a distributed file system so that disk I/O can be done in parallel.
This pattern is used in many of Google’s internal applications, and is also implemented in the open-source Hadoop parallel processing framework. I also find that this pattern can be used in many application design scenarios.

Bulk Synchronous Parallel
This model is based on lock-step execution across all workers, coordinated by a master. Each worker repeats the following steps until the exit condition is reached, that is, when there are no more active workers.
1. Each worker reads data from its input queue.
2. Each worker performs local processing based on the data it read.
3. Each worker pushes its local result along its direct connections.
This pattern has been used in Google’s Pregel graph processing model as well as in the Apache Hama project.
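The lock-step loop above can be sketched with Python threads and a `threading.Barrier`. Several things here are assumptions for the sake of a small example: the workers form a ring, the task is summing everyone's value, the barrier stands in for the coordinating master, and the exit condition is a fixed number of rounds rather than "no more active workers".

```python
import threading


def bsp_ring_sum(values):
    """Each worker ends with the sum of all values after n - 1 lock-step rounds."""
    n = len(values)
    if n == 0:
        return []
    slots = [None] * n              # slots[i] is worker i's one-message input queue
    totals = list(values)
    barrier = threading.Barrier(n)  # every worker must arrive before any continues

    def worker(i):
        token = values[i]
        for _ in range(n - 1):
            slots[(i + 1) % n] = token  # push the local result to the next worker
            barrier.wait()              # all pushes finish before anyone reads
            token = slots[i]            # read from the input queue
            totals[i] += token          # local processing
            barrier.wait()              # all reads finish before the next pushes

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals


ring_totals = bsp_ring_sum([1, 2, 3])
```

The two barrier waits per round are what make the execution lock-step: no worker can overwrite a slot that a slower worker has not read yet.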

Execution Orchestrator
This model is based on an intelligent scheduler / orchestrator that schedules ready-to-run tasks (based on a dependency graph) across a cluster of dumb workers.
This pattern is used in Microsoft’s Dryad project.
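A toy version of such a scheduler can be written with the standard library's `graphlib.TopologicalSorter` (Python 3.9 or later). This has nothing to do with Dryad's actual interfaces: the function `run_tasks`, the task names, and running the "workers" as plain callables in one process are all assumptions for illustration.

```python
from graphlib import TopologicalSorter


def run_tasks(dependencies, actions):
    """Run every task once all of its prerequisites have finished.

    dependencies: dict mapping task -> set of tasks it depends on
    actions:      dict mapping task -> zero-argument callable (the dumb worker code)
    Returns the order in which tasks were executed.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    order = []
    while sorter.is_active():
        # get_ready() returns tasks whose dependencies are all done;
        # a real orchestrator would hand these to idle workers in parallel.
        for task in sorted(sorter.get_ready()):
            actions[task]()
            order.append(task)
            sorter.done(task)
    return order


job_graph = {
    "load": set(),
    "parse": {"load"},
    "index": {"parse"},
    "stats": {"parse"},
    "report": {"index", "stats"},
}
log = []
job_actions = {name: (lambda name=name: log.append(name)) for name in job_graph}
execution_order = run_tasks(job_graph, job_actions)
```

All the intelligence sits in the scheduler; the workers only run whatever they are handed.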

Although I have tried to cover the whole set of commonly used design patterns for building large-scale systems, I am sure I have missed some other important ones. Please drop me a comment with your feedback.
Also, there is a whole set of scalability patterns around the data tier that I haven’t covered here. These include some very basic patterns underlying NoSQL, and it is worth taking a deep look at some leading implementations.
Beyond Hadoop: Next-Generation Big Data Architectu
December 18, 2010, 194 views

from:

http://www.nytimes.com/external/gigaom/2010/10/23/23gigaom-beyond-hadoop-next-generation-big-data-architectu-81730.html

After 25 years of dominance, relational databases and SQL have in recent years come under fire from the growing “NoSQL movement.” A key element of this movement is Hadoop, the open-source clone of Google’s internal MapReduce system. Whether it’s interpreted as “No SQL” or “Not Only SQL,” the message has been clear: If you have big data challenges, then your programming tool of choice should be Hadoop.
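The Differences Between select, poll, and epoll (zz)
December 2, 2010, 432 views

from: http://hi.baidu.com/makeittrue/blog/item/bb6ca4371b4941360b55a954.html

Reference: http://wenku.baidu.com/view/0ea86ffdc8d376eeaeaa3198.html

The select() system call provides a mechanism for implementing synchronous multiplexed I/O: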

#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>
int select (int n,
            fd_set *readfds,
            fd_set *writefds,
            fd_set *exceptfds,
            struct timeval *timeout);
void FD_CLR(int fd, fd_set *set);
int FD_ISSET(int fd, fd_set *set);
void FD_SET(int fd, fd_set *set);
void FD_ZERO(fd_set *set);
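To see the call in action without writing a full C program, here is a small sketch using Python's `select` module, which wraps the same system call. The three list arguments correspond to readfds, writefds, and exceptfds, and the last argument to the timeout. The function name `demo_select` and the use of a local socket pair are choices made for this example only.

```python
import select
import socket


def demo_select():
    """Wait on one socket twice: once with nothing to read, once after a write."""
    left, right = socket.socketpair()
    try:
        # Nothing has been written yet, so select() times out
        # and reports no readable sockets.
        readable, _, _ = select.select([left], [], [], 0.1)
        idle = list(readable)

        # After the peer writes, the same call reports `left` as readable.
        right.sendall(b"ping")
        readable, _, _ = select.select([left], [], [], 1.0)
        data = readable[0].recv(4) if readable else b""
        return idle, data
    finally:
        left.close()
        right.close()


idle_result, received = demo_select()
```

One call watches a whole set of descriptors and blocks until at least one is ready or the timeout expires, which is what "multiplexed" means here.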
What’s New in Hadoop Core 0.19(zz)
November 17, 2010, 247 views

from: http://www.cloudera.com/blog/2008/12/whats-new-in-hadoop-core-019/

What’s New in Hadoop Core 0.19
by Tom White
December 31, 2008

The first release (0.19.0) from the 0.19 branch of Hadoop Core was made on November 24. Many changes go into a release like this, and it can be difficult to get a feel for the more significant ones, even with the detailed Jira log, change log, and release notes. (There’s also JDiff documentation, which is a great way to see how the public API changed, via a JavaDoc-like interface.) This post gives a high-level feel for what’s new.
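Massive Data Analysis: Parallel Processing with Sawzall (Chinese Translation of the Paper) (zz)
November 16, 2010, 759 views

from: http://peopleyun.com/?p=896

To make MapReduce easier for internal staff to use, Google's engineers developed a DSL named Sawzall. Hadoop has likewise introduced Pig, a language similar to Sawzall, although the syntax differs somewhat. Today I am posting the Sawzall paper. It is worth noting that its first author is Rob Pike, one of the UNIX masters. Link to the original, and thanks here to the translator, 崮山路上走9遍.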
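About Linux Threads (zz)
October 24, 2010, 258 views

Multiple threads running in the same process can share all of the process's resources. They can also streamline program flow by letting much of the work proceed asynchronously.
Resources owned by each thread individually: the thread ID, stack, signal mask, a set of register values, the errno variable, and the scheduling priority and policy.
Resources shared by all threads: all resources of the process, including program code, global variables, heap space, and file descriptors.
The thread interface specified by the POSIX.1-2001 standard is called POSIX threads, or pthreads. Link with -lpthread when compiling.
The current Linux kernel implements multithreading with lightweight processes (LWPs).
Each LWP in the kernel corresponds to one user-space thread. An LWP has its own task_struct and is itself a unit of process scheduling.
The difference between an LWP and an ordinary process is that multiple LWPs share certain resources, such as the address space and open files.
The Solaris thread library, by contrast, does not map one LWP to one user-space thread. Instead, user-space threads are time-multiplexed over a limited number of LWPs.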
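The split between shared and per-thread resources can be observed from a high-level language as well. The sketch below uses Python's `threading` module rather than pthreads directly, and the function name `run_threads` is made up for this example. The dictionary plays the role of shared process data, while `local_count` is an ordinary local variable, so every thread gets its own copy.

```python
import threading


def run_threads(num_threads=4, increments=1000):
    shared = {"counter": 0}   # one object, visible to every thread in the process
    lock = threading.Lock()   # shared data needs synchronization
    local_counts = [0] * num_threads

    def worker(index):
        local_count = 0       # local variable: private to this thread
        for _ in range(increments):
            local_count += 1
            with lock:
                shared["counter"] += 1
        local_counts[index] = local_count

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["counter"], local_counts


total, per_thread = run_threads(4, 1000)
```

Every thread updates the same counter, so the total reflects all of them, while each private count only reflects that thread's own work.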
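Initialization of Local Static Variables in Multithreaded Programs
October 24, 2010, 605 views

Pitfalls of local static variable initialization in multithreaded code: http://www.rxyj.org/html/2010/0424/279529.php

Advanced topics on multiprocessor environments and thread synchronization: http://baiy.cn/doc/cpp/advanced_topic_about_multicore_and_threading.htm

How to solve thread-safety problems with static variables: http://www.programfan.com/club/showtxt.asp?id=294156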
Beware: C++ Local Static Initialization Is Not Thread-Safe! (zz)
October 24, 2010, 359 views

http://www.cppblog.com/lymons/archive/2010/08/01/120638.aspx

8 Mar 2004 7:00 AM

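The rule for static variables at block scope (as opposed to static variables at global scope) is that they are initialized the first time execution reaches their declaration.

Consider the following race condition:

int ComputeSomething()
{
    static int cachedResult = ComputeSomethingSlowly();
    return cachedResult;
}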
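[Google Paper 4] Bigtable: A Distributed Storage System for Structured Data (Part 2)
October 16, 2010, 396 views

When reposting, please credit: http://duanple.blog.163.com/blog/static/709717672010916103257933/ Author: phylips@bmy

7. Performance Evaluation

We set up a Bigtable cluster with N tablet servers to measure the performance and scalability of Bigtable as N varies. The tablet servers were configured to use 1 GB of memory and to write to a GFS cell consisting of 1786 machines with 400 GB IDE hard drives. N client machines generated the workload for these tests. (We used the same number of clients as tablet servers to ensure that the clients were never a bottleneck.) Each machine had dual-core Opteron 2 GHz processors, enough physical memory for the running processes, and a gigabit Ethernet link. The machines were connected by a two-level tree-shaped switched network with roughly 100-200 Gbps of aggregate bandwidth at the root. All of the machines were in the same hosting facility, so the round-trip time between any two machines was less than 1 ms.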
The power of Python’s yield(zz)
October 10, 2010, 406 views

(October 2007) http://users.softlab.ntua.gr/~ttsiod/yield.html
Computing permutations
What is a permutation? Reading from my local copy of Wikipedia:
Permutation is the rearrangement of objects or symbols into distinguishable sequences. Each unique ordering is called a permutation. For example, with the numerals one to six, each possible ordering consists of a complete list of the numerals, without repetitions. There are 720 total permutations of these numerals, one of which is: "4, 5, 6, 1, 2, 3".
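The definition above maps naturally onto a recursive generator. The following is a minimal sketch, not necessarily the code from the linked article: pick each element in turn as the head, then yield it followed by every permutation of the remaining elements.

```python
def permutations(items):
    """Yield every ordering of the list `items`, one at a time."""
    if len(items) <= 1:
        yield list(items)
        return
    for i, head in enumerate(items):
        rest = items[:i] + items[i + 1:]   # everything except the chosen head
        for tail in permutations(rest):
            yield [head] + tail


numerals = [1, 2, 3, 4, 5, 6]
count = sum(1 for _ in permutations(numerals))
```

Because `yield` produces one ordering at a time, the caller can stop early or count the results without ever holding all 720 lists in memory. The standard library's `itertools.permutations` offers the same functionality ready-made.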
Python-Related Links
October 10, 2010, 376 views

http://www.red-dove.com/python_logging.html Python logging module documentation and source code; http://gashero.yeax.com/?p=16 partial Chinese translation

http://www.advsofteng.com/download.html ChartDirector charting library (chartdir)

http://docs.python.org/library/profile.html Python profile

http://docs.python.org/library/unittest.html Python unittest
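[Google Paper 4] Bigtable: A Distributed Storage System for Structured Data (Part 1)
October 6, 2010, 1,569 views

When reposting, please credit: http://duanple.blog.163.com/blog/static/709717672010961173782/

Author: phylips@bmy

Abstract

Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size, for example petabytes of data stored across thousands of servers. Many Google projects store their data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications place very different demands on Bigtable, both in data size (from plain URLs to web pages with image attachments) and in latency requirements. Despite these varied demands, Bigtable has successfully provided a flexible, high-performance solution for all of these Google products. In this paper we describe the simple data model that Bigtable provides, which gives clients dynamic control over data layout and format, and we describe the design and implementation of Bigtable.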
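[Google Paper 3] MapReduce: Simplified Data Processing on Large Clusters (Part 2)
October 2, 2010, 519 views

When reposting, please credit: http://duanple.blog.163.com/blog/static/70971767201092673696/ Author: phylips@bmy

5. Performance

In this section we measure the performance of MapReduce on two computations running on a large cluster of machines. One computation searches through approximately 1 TB of data for text matching a given pattern. The other computation sorts approximately 1 TB of data.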
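[Google Paper 3] MapReduce: Simplified Data Processing on Large Clusters (Part 1)
October 2, 2010, 1,798 views

When reposting, please credit: http://duanple.blog.163.com/blog/static/709717672010923203501/

Author: phylips@bmy

Abstract:

MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same key. As the paper shows, many real-world tasks can be expressed in this model.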
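The programming model can be illustrated with a single-process Python sketch. It only mirrors the shape of the model (map, group by key, reduce); there is no distribution, fault tolerance, or file system here, and the names `map_reduce`, `word_count_map`, and `word_count_reduce` are made up for the example.

```python
from collections import defaultdict


def map_reduce(records, map_fn, reduce_fn):
    """Apply map_fn to every record, group by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):      # map: emit intermediate pairs
            groups[key].append(value)          # group values that share a key
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def word_count_map(line):
    # Emit (word, 1) for every word in the input line.
    for word in line.split():
        yield word.lower(), 1


def word_count_reduce(word, counts):
    # Merge all intermediate values for one key.
    return sum(counts)


word_counts = map_reduce(
    ["the quick fox", "The lazy dog", "the end"],
    word_count_map,
    word_count_reduce,
)
```

The user writes only the two small functions; everything between them (grouping, and in a real system partitioning and scheduling) belongs to the framework.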
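[Google Paper 2] The Google File System (Part 3)
October 1, 2010, 545 views

When reposting, please credit: http://duanple.blog.163.com/blog/static/7097176720109151534289/ Author: phylips@bmy

6. Measurements

In this section we use a few small-scale benchmarks to illustrate the bottlenecks inherent in the GFS architecture and implementation, along with some numbers from real clusters at Google.

6.1 Small-Scale Benchmarks

We measured performance on a GFS cluster consisting of one master, two master replicas, 16 chunkservers, and 16 clients. This configuration was set up for ease of testing. Real clusters typically have hundreds of chunkservers and hundreds of clients.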
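[Google Paper 2] The Google File System (Part 2)
October 1, 2010, 757 views

When reposting, please credit: http://duanple.blog.163.com/blog/static/7097176720109151211526/ Author: phylips@bmy

3. System Interactions

We designed the system to minimize the master's involvement in all operations. With that background, we now describe how the client, master, and chunkservers interact to implement data mutations, record append, and snapshots.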
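[Google Paper 2] The Google File System (Part 1)
October 1, 2010, 3,764 views

When reposting, please credit: http://duanple.blog.163.com/blog/static/7097176720109145829346/

Author: phylips@bmy

Abstract

We have designed and implemented the Google File System, a scalable distributed file system for large distributed data-intensive applications. It provides fault tolerance while running on inexpensive commodity hardware, and it delivers high aggregate performance to a large number of clients.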
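[Google Paper 1] Web Search for a Planet: The Google Cluster Architecture
October 1, 2010, 1,461 views

When reposting, please credit: http://duanple.blog.163.com/blog/static/70971767201091102339246/ Author: phylips@bmy

To support scalable parallelization, Google's web search application lets different queries be handled by different processors and, by partitioning the global index, lets a single query use multiple processors. To handle this type of workload, Google's cluster architecture consists of 15,000 commodity PCs and fault-tolerant software. This architecture achieves very high performance, and because it uses commodity PCs it also avoids most of the cost of expensive high-end servers.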
