Deakin University

Video-grounded Dialog: Models and Applications

thesis
posted on 2025-10-27, 02:51 authored by Hoang Anh Pham
This research introduces new approaches to video-grounded dialog that use neural reasoning and object-centric analysis to support meaningful conversations about visual content. By decomposing videos into object trajectories and preserving dialog history, COST (Conversation about Objects in Space-Time) and N2N (End-to-End) address challenges in visual understanding, linguistic comprehension, and advanced reasoning, and show promising results in performance evaluations.

Open access

  • Yes

Language

eng

Copyright notice

All rights reserved

Editor/Contributor(s)

Truyen Tran, Thao Minh Le

Pagination

85 p.

Degree type

Masters

Degree name

MRes
