Monday, June 4, 2012

A Note on Distributed Computing

The gist
Every ten years or so a new wave of fashion in distributed computing arrives: message passing in the 1970s, RPC in the 1980s, distributed objects in the 1990s (the paper dates from 1994). But the core problem is not the model that binds computation to communication; it lies in the fundamental difficulties: latency, a different model of memory access, concurrency, partial failure, and the lack of a central resource manager. The authors appear to be NFS developers.

Abstract

    We argue that objects that interact in a distributed system need to be dealt with in ways that are intrinsically different from objects that interact in a single address space. These differences are required because distributed systems require that the programmer be aware of latency, have a different model of memory access, and take into account issues of concurrency and partial failure.
    We look at a number of distributed systems that have attempted to paper over the distinction between local and remote objects, and show that such systems fail to support basic requirements of robustness and reliability. These failures have been masked in the past by the small size of the distributed systems that have been built. In the enterprise-wide distributed systems foreseen in the near future, however, such a masking will be impossible.
    We conclude by discussing what is required of both systems-level and application-level programmers and designers if one is to take distribution seriously.



"... This vision is centered around the following principles that may, at first, appear plausible:
• there is a single natural object-oriented design for a given application, regardless of the context in which that application will be deployed;
• failure and performance issues are tied to the implementation of the components of an application, and consideration of these issues should be left out of an initial design; and
• the interface of an object is independent of the context in which that object is used.
Unfortunately, all of these principles are false."

"Historically, the language approach has been the less influential of the two camps. Every ten years (approximately), members of the language camp notice that the number of distributed applications is relatively small. They look at the programming interfaces and decide that the problem is that the programming model is not close enough to whatever programming model is currently in vogue (messages in the 1970s [7], [8], procedure calls in the 1980s [9], [10], [11], and objects in the 1990s [1], [2]). A furious bout of language and protocol design takes place and a new distributed computing paradigm is announced that is compliant with the latest programming model. After several years, the percentage of distributed applications is discovered not to have increased significantly, and the cycle begins anew."

"The hard problems in distributed computing concern dealing with partial failure and the lack of a central resource manager. The hard problems in distributed computing concern insuring adequate performance and dealing with problems of concurrency. The hard problems have to do with differences in memory access paradigms between local and distributed entities. People attempting to write distributed applications quickly discover that they are spending all of their efforts in these areas and not on the communications protocol programming interface."
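The partial-failure problem the quote points at can be made concrete with a toy sketch (mine, not the paper's): when the network drops a reply, the client cannot distinguish "the call never ran" from "the call ran but the answer was lost" — a situation that simply cannot arise in a single address space.

```python
import random

class ToyServer:
    """Toy 'remote' server: the request may succeed on the server
    side while the reply is lost on the way back to the client."""
    def __init__(self):
        self.applied = 0

    def handle(self, rng):
        self.applied += 1            # side effect happens first...
        if rng.random() < 0.5:       # ...then the reply may be lost
            raise ConnectionError("reply lost")
        return "ok"

def client_call(server, rng):
    try:
        return server.handle(rng)
    except ConnectionError:
        # Did the call actually execute? The client has no way to know.
        return "timeout"

rng = random.Random(0)
server = ToyServer()
results = [client_call(server, rng) for _ in range(10)]
# Every request mutated server state, including the ones the
# client saw as failures:
print(results.count("timeout"), server.applied)
```

This is why remote interfaces end up needing idempotent operations, retries, or at-least-once/at-most-once semantics; a local call signature has no place to express any of that.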

"Providing developers with tools that help manage the complexity of handling the problems of distributed application development as opposed to the generic application development is an area that has been poorly addressed."

"The major differences between local and distributed computing concern the areas of latency, memory access, partial failure, and concurrency. The difference in latency is the most obvious, but in many ways is the least fundamental. The often overlooked differences concerning memory access, partial failure, and concurrency are far more difficult to explain away, and the differences concerning partial failure and concurrency make unifying the local and remote computing models impossible without making unacceptable compromises."
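The memory-access difference can also be shown in a few lines (again my own sketch, not from the paper): within one address space an object is passed by reference and mutations are shared, while a remote call must marshal the argument by value, so the "server" works on a copy and the caller's object is untouched.

```python
import copy

class Counter:
    def __init__(self):
        self.value = 0

def bump_local(c: Counter) -> None:
    c.value += 1                 # shared address space: caller sees it

def bump_remote(c: Counter) -> Counter:
    # A remote call marshals the argument by value: the server-side
    # code operates on a copy, and the new state reaches the caller
    # only if it is explicitly shipped back in the result.
    server_copy = copy.deepcopy(c)
    server_copy.value += 1
    return server_copy

local = Counter()
bump_local(local)
print(local.value)               # 1 — the mutation is visible

remote = Counter()
result = bump_remote(remote)
print(remote.value, result.value)  # 0 1 — caller's object unchanged
```

A system that papers over this distinction makes the same source line mean two different things depending on where the object happens to live, which is exactly the compromise the paper rejects.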

P.S. Found in the Akka documentation on Remote Actors — Location Transparency ("Everything in Akka is designed to work in a distributed setting: all interactions of actors use purely message passing and everything is asynchronous. This effort has been undertaken to ensure that all functions are available equally when running within a single JVM or on a cluster of hundreds of machines. The key for enabling this is to go from remote to local by way of optimization instead of trying to go from local to remote by way of generalization. See this classic paper for a detailed discussion on why the second approach is bound to fail.").
