Cloud and Desktop Grid Computing
The performance variation of cloud resources makes it difficult to run certain scientific applications in the cloud because of their unique synchronization and communication requirements. This problem is similar to that of desktop grids, except that cloud networks are more reliable. While applications with little or no communication between worker nodes (such as independent task applications) perform well in such computational environments, applications that rely on frequent communication (such as distributed matrix multiplications) perform rather poorly.
We argue that by assigning individual tasks to (groups of) nodes with appropriate computational and communication characteristics, it is possible to achieve better performance for such applications. A centralized scheduler that considers performance information for a large number of nodes, however, would become a bottleneck.
Our solution to this problem is to employ decentralized scheduling algorithms for many-task applications that assign individual tasks to cloud nodes based on periodic performance measurements of the cloud resources.
As a proof of concept, we have developed a vector-based scheduling algorithm that assigns tasks to nodes based on measuring the compute performance and the queue length of those nodes. Our experiments with a set of tasks in CloudLab show that the application proceeds in three distinct phases: flooding the cloud nodes with tasks, a steady state in which all nodes are busy, and the end game in which the remaining tasks are executed on the fastest nodes.
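The idea of choosing a node from its measured compute performance and queue length can be illustrated with a minimal Python sketch. This is not the published algorithm: the `NodeInfo` record, the `estimated_finish` scoring rule, and the units are assumptions made for illustration, and a decentralized scheduler would apply such a rule locally at each node to the nodes it knows about rather than to a global list.

```python
from dataclasses import dataclass


@dataclass
class NodeInfo:
    """Most recent measurements for one node (hypothetical record layout)."""
    name: str
    speed: float        # measured compute performance, in tasks per second (assumed unit)
    queue_length: int   # number of tasks already waiting on the node


def estimated_finish(node: NodeInfo) -> float:
    """Assumed scoring rule: time until one more task would complete on this node."""
    return (node.queue_length + 1) / node.speed


def assign(tasks, nodes):
    """Greedily place each task on the node with the smallest estimated finish time."""
    placement = {}
    for task in tasks:
        best = min(nodes, key=estimated_finish)
        best.queue_length += 1  # the chosen node's queue grows by one task
        placement[task] = best.name
    return placement


nodes = [NodeInfo("fast", 2.5, 0), NodeInfo("slow", 1.0, 0)]
placement = assign(["t1", "t2", "t3", "t4"], nodes)
```

With these made-up measurements, the faster node receives tasks until its queue makes the slower node the better choice, which mirrors the flooding behavior described above on a very small scale.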
In previous work, we developed a biologically inspired and fully decentralized approach to the organization of computation in a desktop grid that is based on the autonomous scheduling of strongly mobile agents on a peer-to-peer network. Our approach achieves the following design objectives: near-zero knowledge of network topology, zero knowledge of system status, autonomous scheduling, distributed computation, and lack of specialized nodes. Every node is equally responsible for scheduling and computation, both of which are performed with practically no information about the system.
We have implemented an extension of Java with strong mobility that allows multi-threaded agents to migrate with all of their execution state by translating Java with strong mobility into Java with weak mobility. We built a prototype grid infrastructure, the Organic Grid, in which an application is scheduled by encapsulating it in an agent together with a scheduler specific to the application characteristics. Similar to other desktop grids, the Organic Grid can be deployed in a screen saver.
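The difference between strong and weak mobility can be illustrated with a small sketch. Under weak mobility only code and data move, so one way to emulate strong mobility is to keep the execution state (a program counter and the local variables) in an explicit, serializable object that travels with the agent. The sketch below is written in Python rather than Java for brevity, and the `AgentState`, `run`, and `migrate` names are hypothetical; it is not the translation scheme used in the actual implementation.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class AgentState:
    """Explicit execution state of a summing agent (hypothetical layout)."""
    pc: int = 0                                 # index of the next item, acts as a program counter
    total: int = 0                              # local variable of the computation
    items: list = field(default_factory=list)   # input data carried by the agent


def run(state: AgentState, max_steps: int) -> AgentState:
    """Advance the computation by at most max_steps steps, resuming at state.pc."""
    steps = 0
    while state.pc < len(state.items) and steps < max_steps:
        state.total += state.items[state.pc]
        state.pc += 1
        steps += 1
    return state


def migrate(state: AgentState) -> AgentState:
    """Simulate a move: serialize on the source host, rebuild on the target host."""
    wire_format = json.dumps(asdict(state))
    return AgentState(**json.loads(wire_format))


state = run(AgentState(items=[1, 2, 3, 4, 5]), max_steps=2)  # interrupted partway through
moved = migrate(state)                                       # state travels as plain data
finished = run(moved, max_steps=10)                          # resumes where it left off
```

Because the loop always resumes from `state.pc`, the computation continues on the target exactly where it stopped on the source, which is the behavior a strongly mobile agent would provide implicitly.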
Collaborators
- Qingyang Wang, CSE Division, School of Electrical Engineering and Computer Science, Louisiana State University
- Mario Lauria, Microsoft Research – University of Trento Centre for Computational and Systems Biology, Trento, Italy
Former Students
- Arjav J. Chakravarti (PhD, June 2004), Dasra
- Yalda Fazlalizadeh
- Peter Franz
- John T. Glass
- Rajneesh Khambham (MS, December 2006)
- Brian L. Peterson (PhD, May 2017), MathWorks
- Anindya Poddar
- Arvind Saini (PhD, May 2018)
- Harshini Vannikkarasan (MS, December 2021)
- Xiaojin Wang (MS, December 2001), Amazon.com
Publications
2018
- A Vector-Scheduling Approach for Running Many-Task Applications in the Cloud
B. Peterson, Y. Fazlalizadeh, G. Baumgartner, Q. Wang. In M. Luo and J.-J. Zhang (Eds.): Proceedings of the 2018 International Conference on Cloud Computing (CLOUD 2018), Seattle, WA, 25-30 June 2018. Lecture Notes in Computer Science, Vol. 10967, Springer-Verlag, pp. 3-19. Best paper award.
2017
- A Decentralized Scheduling Framework for Many-Task Scientific Computing in a Hybrid Cloud
B. Peterson, G. Baumgartner, Q. Wang. Service Transactions on Cloud Computing (STCC), Vol. 5, No. 1, Dec. 2017, pp. 1-13, doi: 10.29268/stcc.2017.5.1.1.
2015
- An Optimizing Translation Framework for Strongly Mobile Java
A. Saini, G. Baumgartner. In Proceedings of the 18th Workshop on Programming Languages and Foundations of Programming, Pörtschach am Wörthersee, Austria, 5-7 October 2015.
- A Hybrid Cloud Framework for Scientific Computing
B. Peterson, G. Baumgartner, X. Wang. In Proceedings of the 8th IEEE International Conference on Cloud Computing (IEEE CLOUD 2015), New York, NY, 27 June - 2 July 2015, pp. 373-380.
2007
-
Self-Organizing Scheduling on the Organic Grid
A.J. Chakravarti, G. Baumgartner, M. Lauria. In Manish Parashar, Salim Hariri (eds.), Autonomic Computing: Concepts, Infrastructure, and Applications, CRC Press, 2007, Chapter 19, pp. 389-411.
2006
-
The Organic Grid: Self-Organizing Computational Biology on Desktop Grids
A.J. Chakravarti, G. Baumgartner, M. Lauria. In Albert Zomaya (ed.), Parallel Computing for Bioinformatics and Computational Biology: Models, Enabling Technologies, and Case Studies, John Wiley & Sons, February 2006, Chapter 27, pp. 671-703.
-
Self-Organizing Scheduling on the Organic Grid
A.J. Chakravarti, G. Baumgartner, M. Lauria. International Journal on High-Performance Computing Applications, Vol. 20, No. 1, January 2006, pp. 115-130.
2005
-
The Organic Grid: Self-Organizing Computation on a Peer-to-Peer Network
A.J. Chakravarti, G. Baumgartner, M. Lauria. IEEE Transactions on Systems, Man, and Cybernetics, Part A, Vol. 35, No. 3, May 2005, pp. 373-384.
2004
-
Application-Specific Scheduling for the Organic Grid
A.J. Chakravarti, G. Baumgartner, M. Lauria. In Proceedings of the 5th IEEE/ACM International Workshop on Grid Computing (Grid '04), Pittsburgh, Pennsylvania, 8 November 2004, pp. 146-155.
Also available as Technical Report OSU-CISRC-4/04-TR23, Dept. of Computer and Information Science, The Ohio State University, April 2004.
-
The Organic Grid: Self-Organizing Computation on a Peer-to-Peer Network
A.J. Chakravarti, G. Baumgartner, M. Lauria. In Proceedings of the First International Conference on Autonomic Computing (ICAC '04), New York, NY, 17-18 May 2004, IEEE Computer Society Press, pp. 96-103.
An extended version of this paper is available as Technical Report OSU-CISRC-10/03-TR55, Dept. of Computer and Information Science, The Ohio State University, October 2003.
-
Application-Specific Scheduling for the Organic Grid
A.J. Chakravarti, G. Baumgartner, M. Lauria. Technical Report OSU-CISRC-4/04-TR23, Dept. of Computer and Information Science, The Ohio State University, April 2004.
2003
-
The Organic Grid: Self-Organizing Computation on a Peer-to-Peer Network
A.J. Chakravarti, G. Baumgartner, M. Lauria. Technical Report OSU-CISRC-10/03-TR55, Dept. of Computer and Information Science, The Ohio State University, October 2003.
-
Implementation of Strong Mobility for Multi-Threaded Agents in Java
A.J. Chakravarti, X. Wang, J.O. Hallstrom, G. Baumgartner. In Proceedings of the 2003 International Conference on Parallel Processing (ICPP '03), Kaohsiung, Taiwan, 6-9 October 2003, IEEE Computer Society Press, pp. 321-330.
An extended version of this paper is available as Technical Report OSU-CISRC-2/03-TR06, Dept. of Computer and Information Science, The Ohio State University, October 2003.
-
Implementation of Strong Mobility for Multi-Threaded Agents in Java
A.J. Chakravarti, X. Wang, J.O. Hallstrom, G. Baumgartner. Technical Report OSU-CISRC-2/03-TR06, Dept. of Computer and Information Science, The Ohio State University, March 2003.
2001
-
Reliability Through Strong Mobility
X. Wang, J. Hallstrom, G. Baumgartner. In Proceedings of the 7th ECOOP Workshop on Mobile Object Systems: Development of Robust and High Confidence Agent Applications (MOS '01), Budapest, Hungary, 18 June 2001, pp. 1-13.