Deakin University

File(s) under permanent embargo

Extracting domain models from natural-language requirements

conference contribution
posted on 2016-01-01, 00:00 authored by Chetan Arora, Mehrdad Sabetzadeh, Lionel Briand, Frank Zimmer
Domain modeling is an important step in the transition from natural-language requirements to precise specifications. For large systems, building a domain model manually is a laborious task. Several approaches exist to assist engineers with this task, whereby candidate domain model elements are automatically extracted using Natural Language Processing (NLP). Despite the existing work on domain model extraction, important facets remain under-explored: (1) there is limited empirical evidence about the usefulness of existing extraction rules (heuristics) when applied in industrial settings; (2) existing extraction rules do not adequately exploit the natural-language dependencies detected by modern NLP technologies; and (3) an important class of rules developed by the information retrieval community for information extraction remains unutilized for building domain models.

To address these limitations, we develop a domain model extractor by bringing together existing extraction rules from the software engineering literature, extending these rules with complementary rules from the information retrieval literature, and proposing new rules to better exploit the results obtained from modern NLP dependency parsers. We apply our model extractor to four industrial requirements documents and report how frequently the different extraction rules are applied. We also conduct an expert study over one of these documents to investigate the accuracy and overall effectiveness of our domain model extractor.
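To make the idea of a dependency-based extraction rule concrete, the following is a minimal illustrative sketch and not the extractor described in the paper. It assumes that a dependency parser has already produced (head, relation, dependent) triples, and that subjects and direct objects carry the labels `nsubj` and `obj`; the function name `extract_candidates` and the sample sentence are hypothetical.

```python
from collections import defaultdict


def extract_candidates(dependencies):
    """Toy rule: a verb with a subject and a direct object suggests an association.

    `dependencies` is an iterable of (head, relation, dependent) triples,
    assumed to come from a dependency parser (hypothetical input format).
    Returns (sorted candidate concepts, list of candidate associations).
    """
    subjects = defaultdict(list)  # verb -> its subject words
    objects = defaultdict(list)   # verb -> its direct-object words
    for head, relation, dependent in dependencies:
        if relation == "nsubj":
            subjects[head].append(dependent)
        elif relation == "obj":
            objects[head].append(dependent)

    concepts = set()
    associations = []
    for verb, verb_subjects in subjects.items():
        for subject in verb_subjects:
            concepts.add(subject)
            for obj in objects.get(verb, []):
                concepts.add(obj)
                # Candidate association: (source concept, label, target concept)
                associations.append((subject, verb, obj))
    return sorted(concepts), associations


# Hand-written triples for the sentence "The simulator sends the log."
deps = [
    ("sends", "nsubj", "simulator"),
    ("sends", "obj", "log"),
    ("simulator", "det", "The"),
    ("log", "det", "the"),
]
concepts, associations = extract_candidates(deps)
print(concepts)
print(associations)
```

Output of this kind is only a set of candidates; as the abstract notes, the accuracy of such rules has to be assessed by experts on real requirements documents.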



Model Driven Engineering Languages and Systems. Conference (2016 : Saint-Malo, France)


1 - 11


ACM Press


Saint-Malo, France

Place of publication

New York, N.Y.

Publication classification

E1.1 Full written paper - refereed

Title of proceedings

MODELS '16: Proceedings of the ACM/IEEE 19th International Conference on Model Driven Engineering Languages and Systems