The DataMiningGrid Data Services are one of the basic groups of grid-enabled services. Data services are concerned with locating data sources, manipulating datasets derived from those sources, and transferring the resulting data to other locations where they can be more easily processed, analysed or stored. Data services contrast with analysis services, which perform calculations in order to produce the desired results.
Data sets and data sources used for data mining vary considerably across domains and sectors in structure, size, problem-solving context, background knowledge, and other statistical and technological aspects. Data differs from a raw stream of bits and bytes, and this fact needs to be reflected in the protocol and service layers. Functionality supported by the data services includes:
- data service location (metadata annotation)
- database federation
- data access and selection
- data transfer
- data transformation and pre-processing
We explicitly emphasise that grid-enabled data services for data mining are not the same as services for query-oriented systems and applications. Therefore, new data interfaces and services are being developed that allow data mining tools to operate on distributed data located in heterogeneous data sources (e.g. relational databases, file systems).
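As a rough illustration of what a common interface over heterogeneous sources could look like, the following Python sketch places a relational table and a flat file behind the same row-oriented access method. It is not the DataMiningGrid interface: the names `RelationalSource`, `FileSource` and `select` are hypothetical, and SQLite and CSV stand in for whatever databases and file formats a real deployment would use.

```python
import csv
import sqlite3
from typing import Callable, Dict, Iterable, Iterator, List, Protocol

Row = Dict[str, str]


class DataSource(Protocol):
    """Hypothetical common interface: every source yields rows as dicts."""

    def rows(self) -> Iterator[Row]: ...


class RelationalSource:
    """Hypothetical adapter exposing one table of a SQLite database."""

    def __init__(self, conn: sqlite3.Connection, table: str) -> None:
        self.conn = conn
        self.table = table

    def rows(self) -> Iterator[Row]:
        # The table name is interpolated for brevity; pass trusted names only.
        cursor = self.conn.execute(f"SELECT * FROM {self.table}")
        columns = [description[0] for description in cursor.description]
        for record in cursor:
            yield {name: str(value) for name, value in zip(columns, record)}


class FileSource:
    """Hypothetical adapter exposing a CSV file with a header line."""

    def __init__(self, path: str) -> None:
        self.path = path

    def rows(self) -> Iterator[Row]:
        with open(self.path, newline="") as handle:
            yield from csv.DictReader(handle)


def select(sources: Iterable[DataSource],
           predicate: Callable[[Row], bool]) -> List[Row]:
    """Collect the matching rows from all sources, in source order."""
    return [row for source in sources for row in source.rows()
            if predicate(row)]
```

A data mining tool written against `rows()` would not need to know whether a record came from a database or a file; for example, `select([RelationalSource(conn, "samples"), FileSource("extra.csv")], lambda r: r["label"] == "a")` would return the matching rows from both, assuming both sources have a `label` column.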
Contact us if you would like to learn more, or browse our digital library for more information.
Updated on June 4, 2008