The effective and efficient management and use of growing amounts of stored data, and in particular the transformation of these data into information and knowledge, is considered a key requirement in modern information systems. Data mining, also known as Knowledge Discovery in Databases (KDD), is the technology addressing this information need. However, the field has mainly been developed for largely homogeneous and localized computing environments. These assumptions are increasingly not met in modern scientific and industrial complex problem-solving environments, which are characterized by two important features:
- Increasing amounts of digital data, and
- Rising demands for co-ordinated resource sharing across geographically widely dispersed sites.
Next-generation grid technologies promise to provide the infrastructure needed for seamless sharing of computing resources in distributed environments. Grid computing thus has the potential to meet the computing requirements of future distributed data-mining environments.
Currently there exists no coherent framework for developing and deploying data-mining applications on the grid. The DataMiningGrid project addressed this gap by developing generic, sector-independent data-mining tools and grid interfaces that allow data-mining tools to operate in a distributed grid computing environment. The aim of the project is therefore to upgrade data-mining technologies so that traditional knowledge discovery approaches can be applied in a distributed manner.
Updated on June 4, 2008