File(s) under permanent embargo
An optimal recruitment algorithm based on an efficient tree search policy
Conference contribution posted on 2017-01-01, 00:00, authored by P Lalbakhsh, A Novak, Terry Caelli
© 2017 Proceedings - 22nd International Congress on Modelling and Simulation, MODSIM 2017. All rights reserved. In highly specialized skills training, where the cost of training is high and the available infrastructure is limited, recruitment and manpower training scheduling can be quite complex. Royal Australian Navy (RAN) pilot training has additional unique features, such as feedback loops generated by the requirement for graduated pilots to return to the training continuum as instructors. Furthermore, trainee numbers are relatively small, and failure rates are high and highly variable. In this paper, we consider a simplified RAN training scheduler solution as an optimal control problem with feedback loops and cost functions that penalize prolonged waiting periods between courses (buffers), with the overall objective of minimizing the total training time. The solution algorithm converges on an optimal recruitment strategy through which the training continuum maintains a functional squadron over a specific timeframe while imposing the least possible cost on the organisation. Solutions also take into account course pass rates, squadron wastage, the number of trainees in each course and the number of trainees waiting in the intermediate buffers. The proposed algorithm uses states and actions as in Markov Decision Processes (MDP), where current states and actions are used to predict new states so as to minimize cost. The algorithm differs from MDPs in that the MDP “optimal policy” for predicting future states and the associated optimal actions is replaced by an optimal tree search process, in which traversing a level of the tree is interpreted as taking an action that results in a transition from one state to another. The algorithm uses a recruitment-wastage near-equilibrium condition to prune the tree, avoiding suboptimal solutions. 
To select the best recruitment strategy, the combined root-to-leaf cost is taken as the final merit, thus replacing the stochastic MDP policy approach with a deterministic optimal tree search strategy. The algorithm benefits from a solution archive that maintains a sorted list of the n best solutions found. The experimental results show that the algorithm can perform the tree search efficiently and rapidly find feasible recruitment policies with optimal costs.
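The ideas in the abstract (one tree level per action, pruning by a recruitment-wastage near-equilibrium condition, root-to-leaf cost as the merit, and a sorted archive of the n best solutions) can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `search_recruitment`, every parameter, the single-course state (squadron size only, with no buffers and no instructor feedback loop) and the cost model are assumptions made purely for illustration.

```python
from bisect import insort


def search_recruitment(horizon, squadron, target, wastage, pass_rate,
                       max_intake, tolerance, archive_size,
                       recruit_cost=1.0, deviation_penalty=4.0):
    """Depth-first tree search over intake decisions, one level per period.

    Hypothetical toy model: the state is the squadron size only, and each
    action is the number of recruits taken in that period.
    Returns a sorted list of (total_cost, plan) for the best plans found.
    """
    archive = []  # sorted list of the best `archive_size` (cost, plan) pairs

    def visit(level, size, cost, plan):
        if level == horizon:
            # Leaf: the accumulated root-to-leaf cost is the plan's merit.
            insort(archive, (cost, plan))
            del archive[archive_size:]  # keep only the n best
            return
        for intake in range(max_intake + 1):
            graduates = intake * pass_rate
            # Near-equilibrium pruning (assumed form): skip actions whose
            # expected graduates are far from the expected wastage.
            if abs(graduates - wastage) > tolerance:
                continue
            new_size = size - wastage + graduates
            # Assumed step cost: recruitment effort plus a penalty for
            # deviating from the target squadron size.
            step = (recruit_cost * intake
                    + deviation_penalty * abs(target - new_size))
            visit(level + 1, new_size, cost + step, plan + (intake,))

    visit(0, squadron, 0.0, ())
    return archive


if __name__ == "__main__":
    for cost, plan in search_recruitment(
            horizon=3, squadron=10, target=10, wastage=2, pass_rate=0.5,
            max_intake=6, tolerance=0.5, archive_size=3):
        print(cost, plan)
```

With these toy numbers, only intakes of 3, 4 or 5 per period survive pruning, and the cheapest plan recruits 4 trainees in every period at a total cost of 12.0. A tighter `tolerance` prunes more of the tree at the risk of discarding plans that recover from a temporary imbalance.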