Interpreting the Data: Parallel Analysis with Sawzall. Rob Pike, Sean Dorward, Robert Griesemer, Sean Quinlan. Google, Inc. Scientific Programming Journal, Special Issue. Sawzall is a new language that Google uses to write distributed, parallel data-processing programs for use on its clusters.


The paper is well written, with a lot of examples.

Interpreting the Data: Parallel Analysis with Sawzall

The paper comes from Google, which is known for its capacity for massive computation on data, and it describes a tool the company uses to solve day-to-day problems in parallel. Sawzall is a statically typed language for processing very large amounts of data on multiple machines.

It generally breaks the calculation into two phases: the first phase analyzes each record, and the second phase aggregates the results. The calculation is divided into pieces and distributed, keeping the computation near the data.
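The two phases can be sketched in Python as a rough analogy; this is not Sawzall syntax. The names `analyze`, `aggregate`, and `run_sharded` are hypothetical and do not come from the paper, and the shards here are plain lists standing in for pieces of data that would live on different machines.

```python
from typing import Dict, Iterable, List


def analyze(record: str) -> Dict[str, float]:
    """Phase 1: look at a single record and emit intermediate values."""
    value = float(record)
    return {"count": 1, "total": value, "sum_of_squares": value * value}


def aggregate(partials: Iterable[Dict[str, float]]) -> Dict[str, float]:
    """Phase 2: combine emitted values by summing them per key."""
    result: Dict[str, float] = {}
    for partial in partials:
        for key, value in partial.items():
            result[key] = result.get(key, 0) + value
    return result


def run_sharded(shards: List[List[str]]) -> Dict[str, float]:
    """Process each shard on its own, then merge the per-shard results.

    Assumption: in a real cluster each shard would be handled on the
    machine that stores it; here the loop is sequential for simplicity.
    """
    per_shard = [aggregate(analyze(r) for r in shard) for shard in shards]
    return aggregate(per_shard)
```

Because the aggregation is a plain sum, it does not matter how the records are split into shards: `run_sharded([["1", "2"], ["3"]])` and `run_sharded([["1", "2", "3"]])` produce the same totals.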


It runs on top of Google's infrastructure. Protocol Buffers are used to describe the format of permanent records stored on disk. Software called the Workqueue handles scheduling a job to run on a cluster of machines.

The paper gives a detailed overview of the Sawzall programming language, with examples. The benchmark test cases are all CPU-bound. However, the authors also describe the applications for this language as mostly I/O-bound.

It would have made sense to include some I/O-bound examples that still show the performance advantage of Sawzall. Sawzall is also a level of abstraction above MapReduce, but it still appears to be a bit more restrictive than Pig Latin [1]. A Sawzall program has a fairly rigid structure consisting of a filtering phase (the map step) followed by an aggregation phase (the reduce step). This was a somewhat concerning factor, since errors can easily happen when terabytes of data are being processed.
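To make the rigid structure concrete, here is a small Python sketch, again an analogy rather than Sawzall code. `SumTable`, `per_record`, and `run` are hypothetical names, and the three-field log line format is invented for illustration. The point is the shape: the filtering code sees exactly one record at a time and can only emit values, while the aggregator is the only place where values from different records meet.

```python
from collections import defaultdict
from typing import Dict, Iterable


class SumTable:
    """Aggregation phase: sums emitted values per key."""

    def __init__(self) -> None:
        self.totals: Dict[str, int] = defaultdict(int)

    def emit(self, key: str, value: int) -> None:
        self.totals[key] += value


def per_record(line: str, table: SumTable) -> None:
    """Filtering phase: inspects one record and emits zero or more values."""
    # Hypothetical record format: "<method> <path> <status>"
    method, path, status = line.split()
    if status != "200":
        return  # filtered out: nothing is emitted for this record
    table.emit(path, 1)


def run(lines: Iterable[str]) -> Dict[str, int]:
    """Apply the filter to every record, then return the aggregated table."""
    table = SumTable()
    for line in lines:
        per_record(line, table)
    return dict(table.totals)
```

For example, `run(["GET /a 200", "GET /b 404", "GET /a 200"])` counts only the successful requests per path. Anything that needs to compare two records directly does not fit this shape, which is where the restrictiveness comes from.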


Kamath, S Narayanam, C.
