Deakin University

Is high performance computing (HPC) ready to handle big data?

Version 2 2024-06-03, 07:42
Version 1 2017-11-17, 20:58
conference contribution
posted on 2024-06-03, 07:42 authored by BR Ray, Morshed Chowdhury, U Atif
In recent years, ‘big data’ has emerged as a universal term, and its management has become a crucial research topic. The phrase refers to data sets so large and complex that processing them requires collaborative High Performance Computing (HPC). Allocating resources effectively is one of the prime challenges in HPC. This leads to the question: are existing HPC resource allocation techniques effective enough to support future big data challenges? In this context, we investigated the effectiveness of HPC resource allocation using the Google cluster dataset and a number of data mining tools to determine the correlation coefficients between resource allocation, resource usage and priority. Our analysis initially focused on the correlation between resource allocation and resource usage. The findings show that a large share of the resources the system allocates to a job is not used by that job. To investigate further, we analyzed the correlation between resource allocation, resource usage and job priority. Our clustering, classification and prediction techniques identified that the allocation and usage of resources are only very loosely correlated with job priority. This research shows that current HPC scheduling needs improvement in order to accommodate the big data challenge efficiently.
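To make the kind of analysis described in the abstract concrete, the following is a minimal Python sketch of a correlation check between requested resources, used resources and priority. It is an illustration only, not the authors' method or tooling: the function names (`pearson`, `unused_fraction`), the variable names and all of the numbers are hypothetical, and the values are made up rather than taken from the Google cluster dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def unused_fraction(requested, used):
    """Share of the total requested resource that was never used."""
    total_requested = sum(requested)
    return (total_requested - sum(used)) / total_requested


# Made-up per-job values for illustration (assumption, not real trace data).
cpu_requested = [0.50, 0.25, 0.75, 0.10, 0.60]
cpu_used = [0.10, 0.20, 0.15, 0.05, 0.30]
priority = [9, 2, 4, 0, 11]

r_alloc_usage = pearson(cpu_requested, cpu_used)
r_alloc_priority = pearson(cpu_requested, priority)
wasted = unused_fraction(cpu_requested, cpu_used)

print(f"allocation vs usage:    r = {r_alloc_usage:.3f}")
print(f"allocation vs priority: r = {r_alloc_priority:.3f}")
print(f"unused share of requested CPU: {wasted:.1%}")
```

A coefficient near 0 would indicate the loose relationship the abstract reports, while a value near 1 or -1 would indicate a strong linear relationship. A real study would run such a check over the full trace rather than a handful of jobs.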

History

Volume

759

Pagination

97-112

Location

Gainesville, Florida

Start date

2017-08-31

End date

2017-09-02

ISSN

1865-0929

ISBN-13

9783319655475

Language

eng

Publication classification

E Conference publication, E1 Full written paper - refereed

Copyright notice

2017, Springer

Editor/Contributor(s)

Doss R, Piramuthu S, Zhou W

Title of proceedings

FNSS 2017 : Proceedings of the 3rd International Conference on Future Network Systems and Security 2017

Event

Future Network Systems and Security. International Conference (3rd : 2017 : Gainesville, Florida)

Publisher

Springer

Place of publication

Berlin, Germany

Series

Communications in Computer and Information Science
