0845 643 64 63

ETL (Extract Transform Load)

What is ETL?

The Extract Transform and Load (ETL) process is used to populate the data warehouse. The term ETL is used to refer to sets of tools or processes used to extract information from any existing IT system, transform it into a more suitable structure, validate it and then load it into a data warehouse.

 

Source Systems

We can extract information out of virtually any source system, for example SQL Server, Oracle, mySQL, FoxPro, Access, Excel, SAP, Sage, Text Files and Binary files. It is preferable to have access to some details of the source data structures, however even if you don't have have any system details, we can still work this out for you. We pride ourselves on being able to quickly understand your business and systems in order to implement a solution that is right for you.

 

Source Systems

 

 

System Migration and Upgrade

A key use for ETL systems is to enable a smooth migration from one system to another. By creating an ETL script for each system, data can be stored in a consistent format in the data warehouse. The source system can then be changed, without any impact on the data warehouse or the reporting/analysis systems.

 

Data Cleansing

Data CleansingIn any environment where data is being entered, either manually or via automated data capture, it is almost guaranteed that some invalid or contradictory data will be introduced into the system. This could be manual mistakes, data corruptions or 3rd party system errors.

As the data warehouse is being loaded, the ETL process can filter out these errors, and either correct them or save them in a reject area for manual review. You can then be confident that all the information being reported on is correct and consistent.

Different currencies can be converted into a single, or a number of, reporting currencies. If different source systems use different reference codes (such as different product codes, employee numbers etc.) then the ETL process can standardise these into a single common set of references.

 

Performance

As the demand for information grows, the load on your systems grows with it. If a report takes 5 minutes to run, that is putting 5 minutes of heavy load on your source systems, compromising any other users that are working at the time.
As the data warehouse takes over job of information supplier, it can have a huge performance effect on your existing systems, purely by reducing their workload. You can then use the flexability of SQL Server to add clustering, multi processor support, mirroring and failover support to increase the performance, scalability and reliability further and further.