Data warehouse systems use back-end tools and utilities to populate and refresh their data. These tools and utilities include the following functions:
1 Data Extraction —
It typically gathers data from multiple, heterogeneous, and external sources.
2 Data Cleaning —
This detects errors in the data and rectifies them when possible.
3 Data Transformation —
It converts data from legacy or host format to warehouse format.
4 Load —
It sorts, summarizes, consolidates, computes views, checks integrity, and builds indices and partitions.
5 Refresh —
It propagates the updates from the data sources to the warehouse.
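The five functions above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not how any particular warehouse product works: the two in-memory sources (`CRM_ROWS`, `ERP_ROWS`), the `sales` table, and the `sales_by_region` view are all hypothetical names, an in-memory SQLite database stands in for the warehouse, and refresh is done by simply re-running the whole pipeline rather than by propagating only the changed rows.

```python
import sqlite3

# Hypothetical source data (assumed for illustration): two sources with
# different shapes, one holding dicts with string amounts, one holding tuples.
CRM_ROWS = [
    {"id": 1, "name": " Alice ", "amount": "10.50", "region": "east"},
    {"id": 2, "name": "Bob", "amount": "n/a", "region": "west"},
]
ERP_ROWS = [
    (3, "carol", 7.25, "EAST"),
]


def extract():
    """Extraction: gather rows from heterogeneous sources into one shape."""
    rows = [dict(r) for r in CRM_ROWS]
    rows += [
        {"id": i, "name": n, "amount": a, "region": g}
        for i, n, a, g in ERP_ROWS
    ]
    return rows


def clean(rows):
    """Cleaning: detect errors, rectify what can be fixed, reject the rest."""
    good = []
    for r in rows:
        try:
            amount = float(r["amount"])
        except (TypeError, ValueError):
            continue  # amount cannot be recovered, so the row is rejected
        good.append({**r, "name": r["name"].strip(), "amount": amount})
    return good


def transform(rows):
    """Transformation: convert source format to the warehouse row format."""
    return [
        (r["id"], r["name"].title(), r["amount"], r["region"].upper())
        for r in rows
    ]


def load(conn, rows):
    """Load: sort rows, enforce integrity, build an index and a summary view."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
        "amount REAL NOT NULL, region TEXT NOT NULL)"
    )
    conn.execute("CREATE INDEX IF NOT EXISTS idx_sales_region ON sales(region)")
    conn.execute(
        "CREATE VIEW IF NOT EXISTS sales_by_region AS "
        "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
    )
    # INSERT OR REPLACE keeps the primary key unique when a row is reloaded.
    conn.executemany("INSERT OR REPLACE INTO sales VALUES (?, ?, ?, ?)", sorted(rows))
    conn.commit()


def refresh(conn):
    """Refresh: re-run the pipeline so source updates reach the warehouse."""
    load(conn, transform(clean(extract())))
```

Calling `refresh(sqlite3.connect(":memory:"))` populates the table from both sources; the row with the unparseable amount is dropped during cleaning. Calling `refresh` again after a source changes picks up the new rows. A production system would more likely refresh incrementally, which this sketch does not attempt.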
Besides cleaning, loading, refreshing, and metadata definition tools, data warehouse systems usually provide a good set of data warehouse management tools. Data cleaning and data transformation are important steps in improving the quality of the data and, subsequently, of the data mining results.