Data is becoming the world’s new natural resource,
and the use of big data is growing quickly. The trend in computing
technology is for everything to be merged into the Internet and
for ‘big data’ to be integrated into complete
information for collective intelligence. As big data grows
in size, refining the data itself to reduce its size
while keeping critical data (or useful information) is a new
direction. In this paper, we provide a novel data
consumption model, which separates the consumption of data
from the raw data, and thus enables cloud computing for big
data applications. We define a new Data-as-a-Product (DaaP)
concept; a data product is a small-sized summary of the
original data and can directly answer users’ queries. Thus, we
separate the mining of big data into two classes of processing
modules: refine modules, which turn raw big data into small-sized
data products, and application-oriented mining modules,
which discover further desired knowledge for applications from
well-defined data products. Our experience mining big stream
data, including medical sensor stream data, streams of text
data and trajectory data, demonstrated the efficiency and
precision of our DaaP model for answering users’ queries.
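
The two classes of modules described above can be illustrated with a minimal sketch. It is not the implementation used in this work: the per-window summary format, the names `WindowSummary`, `refine`, `windows_exceeding` and `overall_mean`, and the fixed window size are all assumptions chosen for illustration. The point it shows is that the application-side functions read only the small data products and never touch the raw stream.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class WindowSummary:
    """Hypothetical data product: one small record per window of raw readings."""
    start: int        # index of the first raw reading in the window
    count: int        # number of raw readings summarized
    minimum: float
    maximum: float
    mean: float


def _summarize(start: int, values: List[float]) -> WindowSummary:
    return WindowSummary(start, len(values), min(values), max(values),
                         sum(values) / len(values))


def refine(stream: Iterable[float], window: int) -> List[WindowSummary]:
    """Refine module (assumed design): reduce a raw stream to per-window summaries."""
    if window <= 0:
        raise ValueError("window must be positive")
    products: List[WindowSummary] = []
    buffer: List[float] = []
    start = 0
    for index, value in enumerate(stream):
        buffer.append(value)
        if len(buffer) == window:
            products.append(_summarize(start, buffer))
            buffer = []
            start = index + 1
    if buffer:  # summarize the final, possibly shorter, window
        products.append(_summarize(start, buffer))
    return products


def windows_exceeding(products: List[WindowSummary], threshold: float) -> List[int]:
    """Application-oriented module: start indices of windows whose maximum exceeds threshold."""
    return [p.start for p in products if p.maximum > threshold]


def overall_mean(products: List[WindowSummary]) -> float:
    """Application-oriented module: mean of all raw readings, recovered from the summaries."""
    total_count = sum(p.count for p in products)
    if total_count == 0:
        raise ValueError("no data products to query")
    return sum(p.mean * p.count for p in products) / total_count


if __name__ == "__main__":
    # Made-up readings standing in for a sensor stream.
    readings = [70, 72, 71, 120, 118, 75, 74]
    products = refine(readings, window=3)
    print(windows_exceeding(products, threshold=100))
    print(overall_mean(products))
```

In this sketch a query such as "which windows contain a reading above a threshold" is answered exactly from the summaries, while a query about individual readings inside a window could not be. What a data product must retain therefore depends on the queries it is meant to serve.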