Tuesday, March 30, 2010

The seminar is canceled

A thousand apologies, but I will not be able to hold the seminar today.

[Protocols]: FAST TCP

FAST TCP:
- wiki
- "FAST TCP: motivation, architecture, algorithms, performance"
- "FAST TCP: from theory to experiments"

To poll or epoll: that is the question

To poll or epoll: that is the question:

"One of the updates in build 59 of Mustang (JavaTM SE 6) is that the New I/O Selector implementation will use the epoll event notification facility when running on the Linux 2.6 kernel. The epoll event mechanism is much more scalable than the traditional poll when there are thousands of file descriptors in the interest set. The work done by poll depends on the size of the interest set whereas with epoll (like Solaris /dev/poll) the registration of interest is separated from the retrieval of the events. A lot has been written on the topic. The C10K problem has been documenting I/O frameworks and strategies for several years. One relatively recent paper on Comparing and Evaluating epoll, select, and poll Event Mechanisms makes it clear the workloads where epoll performs a lot better than poll.

[Protocols]: Sockets Direct Protocol (SDP)

Lesson: Understanding the Sockets Direct Protocol:

For high performance computing environments, the capacity to move data across a network quickly and efficiently is a requirement. Such networks are typically described as requiring high throughput and low latency. High throughput refers to an environment that can deliver a large amount of processing capacity over a long period of time. Low latency refers to the minimal delay between processing input and providing output, such as you would expect in a real-time application.

In these environments, conventional networking using socket streams can create bottlenecks when it comes to moving data. Introduced in 1999 by the InfiniBand Trade Association, InfiniBand (IB) was created to address the need for high performance computing. One of the most important features of IB is Remote Direct Memory Access (RDMA). RDMA enables moving data directly from the memory of one computer to another computer, bypassing the operating system of both computers and resulting in significant performance gains.

The Sockets Direct Protocol (SDP) is a networking protocol developed to support stream connections over InfiniBand fabric. SDP support was introduced to Java Platform, Standard Edition ("Java SE Platform") in JDK7 for applications deployed in the Solaris Operating System ("Solaris OS"). The Solaris OS has supported SDP and InfiniBand since Solaris 10 5/08.
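
As the tutorial goes on to describe, SDP support is enabled through configuration rather than new APIs: a text file lists which bind/connect operations should use SDP, and the JVM is pointed at it with a system property. A sketch, where the addresses, the port range, and MyApp are all placeholders:

    # sdp.conf -- the addresses and port ranges below are examples only
    # Use SDP when binding to this local InfiniBand address, any port:
    bind 192.0.2.1 *
    # Use SDP when connecting to ports 1024 and up on the 192.0.2.0/24 subnet:
    connect 192.0.2.0/24 1024-*

The application itself is unchanged; it is launched with the configuration file named on the command line:

    java -Dcom.sun.sdp.conf=sdp.conf -Djava.net.preferIPv4Stack=true MyApp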

понедельник, 29 марта 2010 г.

Seminar 30.03: GFS, Chubby, BigTable

Topics for the seminar on 30.03:
- GFS
- Chubby
- BigTable

These are core technologies used at Google.

Friday, March 26, 2010

The Google File System

The Google File System
Sanjay Ghemawat, Howard Gobioff and Shun-Tak Leung, Google

Abstract
We have designed and implemented the Google File System, a scalable distributed file system for large distributed data-intensive applications. It provides fault tolerance while running on inexpensive commodity hardware, and it delivers high aggregate performance to a large number of clients.
While sharing many of the same goals as previous distributed file systems, our design has been driven by observations of our application workloads and technological environment, both current and anticipated, that reflect a marked departure from some earlier file system assumptions. This has led us to reexamine traditional choices and explore radically different design points.

MapReduce: Simplified Data Processing on Large Clusters

MapReduce: Simplified Data Processing on Large Clusters
Jeffrey Dean and Sanjay Ghemawat, Google Inc.

Abstract
MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. Many real world tasks are expressible in this model, as shown in the paper.
Programs written in this functional style are automatically parallelized and executed on a large cluster of commodity machines. The run-time system takes care of the details of partitioning the input data, scheduling the program's execution across a set of machines, handling machine failures, and managing the required inter-machine communication. This allows programmers without any experience with parallel and distributed systems to easily utilize the resources of a large distributed system.
Our implementation of MapReduce runs on a large cluster of commodity machines and is highly scalable: a typical MapReduce computation processes many terabytes of data on thousands of machines. Programmers find the system easy to use: hundreds of MapReduce programs have been implemented and upwards of one thousand MapReduce jobs are executed on Google's clusters every day.
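
A toy, single-machine sketch of the programming model from the abstract, using word count (the paper's canonical example); the real system of course shards both phases across thousands of machines:

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // map() emits intermediate key/value pairs; reduce() merges all values
    // that share the same intermediate key.
    public class WordCount {
        static void map(String document, Map<String, List<Integer>> intermediate) {
            for (String word : document.split("\\s+")) {
                List<Integer> values = intermediate.get(word);
                if (values == null) {
                    values = new ArrayList<Integer>();
                    intermediate.put(word, values);
                }
                values.add(1);  // emit (word, 1)
            }
        }

        static int reduce(String word, List<Integer> values) {
            int sum = 0;
            for (int v : values) sum += v;  // merge all counts for this word
            return sum;
        }

        public static void main(String[] args) {
            Map<String, List<Integer>> intermediate =
                    new HashMap<String, List<Integer>>();
            map("to be or not to be", intermediate);
            for (Map.Entry<String, List<Integer>> e : intermediate.entrySet())
                System.out.println(e.getKey() + ": " + reduce(e.getKey(), e.getValue()));
        }
    }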

Interpreting the Data: Parallel Analysis with Sawzall

Interpreting the Data: Parallel Analysis with Sawzall
Rob Pike, Sean Dorward, Robert Griesemer, Sean Quinlan, Google Inc.

Abstract
Very large data sets often have a flat but regular structure and span multiple disks and machines. Examples include telephone call records, network logs, and web document repositories. These large data sets are not amenable to study using traditional database techniques, if only because they can be too large to fit in a single relational database. On the other hand, many of the analyses done on them can be expressed using simple, easily distributed computations: filtering, aggregation, extraction of statistics, and so on. We present a system for automating such analyses. A filtering phase, in which a query is expressed using a new procedural programming language, emits data to an aggregation phase. Both phases are distributed over hundreds or even thousands of computers. The results are then collated and saved to a file. The design—including the separation into two phases, the form of the programming language, and the properties of the aggregators—exploits the parallelism inherent in having data and computation distributed across many machines.
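
A rough Java analogue of the two-phase split the abstract describes: a per-record filter emits values into aggregator tables, and because the aggregations are simple order-independent folds they distribute trivially. All names here are invented for illustration; the real filter phase is written in the Sawzall language itself:

    import java.util.HashMap;
    import java.util.Map;

    public class FilterAggregate {
        private final Map<String, Long> sumTables = new HashMap<String, Long>();

        // Aggregation phase: a commutative/associative fold per table.
        void emit(String table, long value) {
            Long current = sumTables.get(table);
            sumTables.put(table, (current == null ? 0L : current) + value);
        }

        // Filter phase: runs once per input record and sees only that record.
        void filter(String record) {
            emit("count", 1);
            emit("totalBytes", record.length());
        }

        public static void main(String[] args) {
            FilterAggregate fa = new FilterAggregate();
            for (String rec : new String[]{"record one", "rec2"}) fa.filter(rec);
            System.out.println(fa.sumTables);  // count=2, totalBytes=14
        }
    }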

The Chubby lock service for loosely-coupled distributed systems

The Chubby lock service for loosely-coupled distributed systems.
Mike Burrows, Google Inc.

Abstract
We describe our experiences with the Chubby lock service, which is intended to provide coarse-grained locking as well as reliable (though low-volume) storage for a loosely-coupled distributed system. Chubby provides an interface much like a distributed file system with advisory locks, but the design emphasis is on availability and reliability, as opposed to high performance. Many instances of the service have been used for over a year, with several of them each handling a few tens of thousands of clients concurrently. The paper describes the initial design and expected use, compares it with actual use, and explains how the design had to be modified to accommodate the differences.
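
To make "an interface much like a distributed file system with advisory locks" concrete, here is a hypothetical client-side interface in the spirit of the abstract; all names are invented, and the actual Chubby API (open handles, events, sequencers) differs:

    // Hypothetical sketch only, not the real Chubby client API.
    public interface CoarseLockService {
        // Nodes are named like file paths, e.g. "/ls/cell/service/master".
        byte[] readContents(String node);              // small, reliable storage
        void writeContents(String node, byte[] data);
        // Advisory lock: only cooperating clients are constrained by it.
        // Holding it for hours or days (coarse-grained) is the intended
        // pattern, e.g. for electing a long-lived master.
        boolean tryAcquire(String node);
        void release(String node);
    }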

Bigtable: A Distributed Storage System for Structured Data

Bigtable: A Distributed Storage System for Structured Data
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, and others, Google Inc.

Abstract
Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications place very different demands on Bigtable, both in terms of data size (from URLs to web pages to satellite imagery) and latency requirements (from backend bulk processing to real-time data serving). Despite these varied demands, Bigtable has successfully provided a flexible, high-performance solution for all of these Google products. In this paper we describe the simple data model provided by Bigtable, which gives clients dynamic control over data layout and format, and we describe the design and implementation of Bigtable.
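
The "simple data model" the abstract mentions is, per the paper, a sorted map from (row, column, timestamp) to an uninterpreted byte string. A single-machine sketch of just that model, with everything about tablets, SSTables, and distribution elided:

    import java.util.Collections;
    import java.util.Map;
    import java.util.TreeMap;

    public class TinyTable {
        // Key: row + '\0' + column; value: timestamp -> bytes, newest first.
        private final TreeMap<String, TreeMap<Long, byte[]>> cells =
                new TreeMap<String, TreeMap<Long, byte[]>>();

        public void put(String row, String column, long timestamp, byte[] value) {
            String key = row + '\0' + column;
            TreeMap<Long, byte[]> versions = cells.get(key);
            if (versions == null) {
                versions = new TreeMap<Long, byte[]>(Collections.reverseOrder());
                cells.put(key, versions);
            }
            versions.put(timestamp, value);
        }

        // Latest version at or before the given timestamp, or null.
        public byte[] get(String row, String column, long timestamp) {
            TreeMap<Long, byte[]> versions = cells.get(row + '\0' + column);
            if (versions == null) return null;
            // Reverse-ordered map: ceilingEntry(t) is the newest version <= t.
            Map.Entry<Long, byte[]> e = versions.ceilingEntry(timestamp);
            return e == null ? null : e.getValue();
        }
    }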

Wednesday, March 24, 2010

Comparing Two High-Performance I/O Design Patterns

Comparing Two High-Performance I/O Design Patterns:
1. A short and clear comparison of the Reactor and Proactor I/O patterns.
2. Interesting information about the properties of different I/O mechanisms on different platforms (Windows, Linux, BSD, Solaris).

Summary
This article investigates and compares different design patterns of high performance TCP-based servers. In addition to existing approaches, it proposes a scalable single-codebase, multi-platform solution (with code examples) and describes its fine-tuning on different platforms. It also compares performance of Java, C# and C++ implementations of proposed and existing solutions.
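
The contrast in one sentence: a Reactor is told a channel is ready and performs the read itself (as in the Selector loop above), while a Proactor hands the OS a buffer and is called back after the read has already completed. A minimal proactor-flavoured sketch using JDK7's NIO.2 asynchronous channels (the port number is a placeholder):

    import java.net.InetSocketAddress;
    import java.nio.ByteBuffer;
    import java.nio.channels.AsynchronousServerSocketChannel;
    import java.nio.channels.AsynchronousSocketChannel;
    import java.nio.channels.CompletionHandler;

    public class ProactorSketch {
        public static void main(String[] args) throws Exception {
            final AsynchronousServerSocketChannel server =
                    AsynchronousServerSocketChannel.open()
                            .bind(new InetSocketAddress(8080));
            server.accept(null, new CompletionHandler<AsynchronousSocketChannel, Void>() {
                public void completed(AsynchronousSocketChannel client, Void att) {
                    server.accept(null, this);  // keep accepting new connections
                    final ByteBuffer buf = ByteBuffer.allocate(4096);
                    client.read(buf, null, new CompletionHandler<Integer, Void>() {
                        public void completed(Integer bytesRead, Void a) {
                            // the data is already in buf when this runs
                        }
                        public void failed(Throwable exc, Void a) { }
                    });
                }
                public void failed(Throwable exc, Void att) { }
            });
            Thread.sleep(Long.MAX_VALUE);  // keep the demo JVM alive
        }
    }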

Monday, March 15, 2010

Concurrent Data Structures

Concurrent Data Structures from concurrency experts Mark Moir and Nir Shavit (Sun Microsystems Laboratories).

In this 23-page overview the authors cite 138 (!) world-class sources and provide a good index. A small code taste of Section 1.3 follows the contents below.

Contents:
1.1 Designing Concurrent Data Structures
- Performance
- Blocking Techniques
- Nonblocking Techniques
- Complexity Measures
- Correctness
- Verification Techniques
- Tools of the Trade
1.2 Shared Counters and Fetch-and-φ Structures
1.3 Stacks and Queues
1.4 Pools
1.5 Linked Lists
1.6 Hash Tables
1.7 Search Trees
1.8 Priority Queues
1.9 Summary
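
As a small taste of Section 1.3, here is the classic nonblocking (Treiber) stack, the canonical example of the lock-free techniques the chapter surveys:

    import java.util.concurrent.atomic.AtomicReference;

    // Push and pop retry a compare-and-set on the head pointer instead of
    // taking a lock, so no thread can block the others' progress.
    public class LockFreeStack<T> {
        private static class Node<T> {
            final T value;
            Node<T> next;
            Node(T value) { this.value = value; }
        }

        private final AtomicReference<Node<T>> head =
                new AtomicReference<Node<T>>();

        public void push(T value) {
            Node<T> node = new Node<T>(value);
            do {
                node.next = head.get();
            } while (!head.compareAndSet(node.next, node));  // retry on contention
        }

        public T pop() {
            Node<T> top;
            do {
                top = head.get();
                if (top == null) return null;  // empty stack
            } while (!head.compareAndSet(top, top.next));
            return top.value;
        }
    }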

Friday, March 12, 2010

Amazon SimpleDB Consistency Enhancements

Amazon SimpleDB Consistency Enhancements.

Abstract
This document outlines the new strong consistency features in SimpleDB - consistent read and conditional put/delete. It also demonstrates how they can be used to program key database application scenarios such as persistent application state, concurrency control, item counter, and conditional update/delete.
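
The item-counter scenario reduces to an optimistic read-then-conditional-put loop; a hypothetical sketch of that logic (the store interface here is invented shorthand, not the actual SimpleDB SDK):

    // Invented minimal interface standing in for SimpleDB: conditionalPut
    // succeeds only if the attribute currently holds the expected value.
    interface KeyValueStore {
        String consistentGet(String item, String attribute);
        boolean conditionalPut(String item, String attribute,
                               String expectedValue, String newValue);
    }

    class Counter {
        // Optimistic increment: read, then conditionally write; if another
        // writer got in between, the condition fails and we retry.
        static long increment(KeyValueStore store, String item) {
            while (true) {
                String current = store.consistentGet(item, "count");
                long next = (current == null ? 0L : Long.parseLong(current)) + 1;
                if (store.conditionalPut(item, "count", current,
                                         String.valueOf(next))) {
                    return next;
                }
            }
        }
    }

With a consistent read and a conditional put, concurrent increments cannot be lost: one of the racing writers fails the condition and retries.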

Thursday, March 11, 2010

Scalable servers: NIO + NIO.2

Some interesting slides:
+ Doug Lea, "Scalable IO in Java" (the Reactor pattern in NIO);
+ Alan Bateman (Sun) and Jeanfrancois Arcand (Sun), "Asynchronous I/O Tricks and Tips".

Wednesday, March 10, 2010

HotSpot Glossary of Terms

HotSpot Glossary of Terms:
- adaptive spinning
- biased locking
- block start table
- bootstrap classloader
- bytecode verification
- C1 compiler
- C2 compiler
- card table
- class data sharing
- class hierarchy analysis (CHA)

Seminar

A thousand apologies that I could not hold the seminar on March 9 and was unable to notify everyone in advance.