Methods of distributed processing for combat simulation data generation
Conference contribution posted on 01.01.2017, authored by L Holden, S Russack, Mohamed Abdelrazek, Rajesh Vasa, S Pace, Kon Mouzakis, Rhys Adams
© 2017 Proceedings - 22nd International Congress on Modelling and Simulation, MODSIM 2017. All rights reserved. Combat simulation requires an extensive amount of data to be generated for the execution of the simulations, followed by a large number of iterations of the simulation tool to produce quantities of data sufficient for analysis. This generation process typically exceeds what a single desktop computer can perform in a usable time period, so effective data generation requires a method of harnessing the power of multiple computers working towards the same goal. To meet the data generation requirements of combat simulation execution, a series of distributed processing architectures were developed, expanding from specific-task solutions to generic-task distributed processing architectures. Each implementation had to solve the problem of distributing processing tasks that were not originally developed with an existing distributed processing framework (such as Map-Reduce) in mind. Each architecture was built on lessons learned from previous implementations, and these lessons have resulted in two architectures available for our distributed processing needs. The two take different approaches to the distribution of jobs and the management of work execution and scheduling. The first, a Distributed Queue architecture, is based on a dynamic client pool of processing nodes that pull jobs from a well-known job description queue using transient data transfer; this approach allows cross-platform processing nodes to be added to the distribution network on a needs basis. The second, a Distributed Scheduler architecture, uses a resource scheduling algorithm to distribute jobs to resources on a network; this scheduler can manage task dependencies and the transfer of persisted data between processing tasks.
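The pull model described for the Distributed Queue architecture can be sketched in a few lines. This is a hypothetical, minimal illustration only (the paper's actual system is networked and cross-platform); the names `job_queue`, `run_job`, and `worker` are invented for the example, with threads standing in for remote processing nodes.

```python
import queue
import threading

# Shared, well-known job description queue that processing nodes pull from.
job_queue = queue.Queue()
results = {}
results_lock = threading.Lock()

def run_job(job_id):
    # Stand-in for a simulation data-generation task (hypothetical).
    return job_id * job_id

def worker():
    # A processing node: repeatedly pull a job, run it, record the result.
    while True:
        try:
            job_id = job_queue.get_nowait()  # pull, rather than being pushed work
        except queue.Empty:
            return
        outcome = run_job(job_id)
        with results_lock:
            results[job_id] = outcome
        job_queue.task_done()

for j in range(8):
    job_queue.put(j)

# A dynamic client pool: capacity is added simply by starting another worker.
threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because nodes pull work at their own pace, faster nodes naturally take on more jobs, which is what allows heterogeneous, cross-platform nodes to join the pool on a needs basis.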
Both implementations must handle resource nodes that fail, processing errors in remote tasks, and the monitoring of progress on assigned tasks. This paper examines the history of the distributed processing architectures we have used, describes the two resulting architectures and the differences between them, and then outlines our selection criteria, the currently used distribution implementation, and future improvements to the process.
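The Distributed Scheduler's two distinguishing duties, honouring task dependencies and coping with failed tasks, can be sketched as follows. This is an assumed illustration, not the paper's algorithm: `schedule`, `deps`, and the retry limit are invented for the example, and cyclic dependencies are not handled.

```python
from collections import deque

def schedule(tasks, deps, run, max_retries=2):
    """Run tasks so each starts only after its prerequisites succeed.

    tasks: list of task names; deps: {task: set of prerequisite tasks};
    run: callable executing one task (may raise on a processing error).
    Failed tasks are re-queued up to max_retries times, mirroring the
    need to manage processing errors on remote tasks.
    """
    done, order = set(), []
    pending = deque(tasks)
    retries = {t: 0 for t in tasks}
    while pending:
        task = pending.popleft()
        if not deps.get(task, set()) <= done:
            pending.append(task)  # prerequisites not yet complete; defer
            continue
        try:
            run(task)
        except Exception:
            retries[task] += 1
            if retries[task] > max_retries:
                raise  # give up: the resource node is not working correctly
            pending.append(task)  # re-queue the failed task for another attempt
            continue
        done.add(task)
        order.append(task)
    return order

# Hypothetical pipeline: data generation must precede simulation,
# which must precede analysis.
executed = schedule(
    ["analyse", "generate", "simulate"],
    {"simulate": {"generate"}, "analyse": {"simulate"}},
    run=lambda t: None,
)
# → ["generate", "simulate", "analyse"]
```

Persisted-data transfer between tasks, which the paper's scheduler also manages, would hang off the same dependency edges: a task's outputs become available to its dependents once it enters `done`.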