A data warehouse sometimes referred to as a management decision support system uses dimensional modelling to address the shortcomings inherent in the transaction model. Data is stored in an consolidated form displaying the totals of all transactions as opposed to the individual transactions of an OLTP. A data warehouse exists primarily to answer questions and to enable management teams to reach critical decisions that affect the business. Dimensions are used to measure the state of the business and represent the different aspects required for a report. Typical dimensions would be Time, geographical locations, products, customers etc where each dimension would measure the data according to a particular requirement.
An example would be the total number of customers purchasing new sports equipment from all in the southern stores during the year 2000. Using the same dimensions we could measure the profitable stores of sports equipment sales to the less profitable by asking for the totals of all sports equipment sold from each southern store during the year 2000. The first example could display a total of 500 customers bought sports equipment from some 50 stores where as the second example could display that only 20 of the stores sold sports equipment during that year.
To summarize a data warehouse can be structured to respond efficiently to the types of questions management teams may require answers to.
A data warehouse has the ability to produce results not readily available from a transaction database, these results are based on the clean data extracted from an OLTP database and consolidated into a dimensional model. Data mining moves beyond the concept of a data warehouse by looking for patterns and relationships in the data. Using data mining algorithms it is possible to evaluate and understand trends, these trends can improve the profitability of stores by examining the sales transactions of each store in question. One of the key aspects to this style of analysis is bar coding, which has made it possible to identify each single item in a purchase. This data can then be studied and conclusions reached about the customers purchases. It may be that each customer who purchased sports equipment also purchased a box of chocolates. To sell more chocolate perhaps a prompting on sports equipment would be in order.
Do male customers buy chocolate ?. This could be established by other purchases identified in the shopping basket, such as shaving cream or panty hose, where as a male shopper would buy shaving cream he would be unlikely to purchase panty hose. A female shopper on the other hand could buy both items. Each shopping basket resulting in the identification of the shoppers gender. So by moving female products closer to the sports section the sale of panty hose could increase. To summarize data mining is the ability analyze current trends in order to predict future markets.
Typically, most companies will store information in an OLTP On Line Transaction Processing database. Each transaction or detail of operation will be stored as a single record or row of data. An OLAP On Line Analytical Processing database stores data in an consolidated form displaying the totals for each row of data in an OLTP. If a store sold a hundred tin of beans in a days trading a query on an OLTP database for every tin of beans sold, would return a hundred rows of data. An OLAP database would return a single row of data displaying the total number of beans sold.
A client / server application normally splits the business logic between the server and the client machine. The two main criteria for any system querying a database is speed of access and network traffic. I have seen some horrendous client / server systems where nearly all the business logic was built into the client application and the management couldn't understand why the network was so slow.
By implementing a middle tier the business logic can be removed from the client thereby reducing the client size proportionally. All clients connect to the middle tier using DCOM which in turn connects to the database server. each client in turn is then able to download any required data which can then be worked on off line. If the database server should fail for any reason all clients currently logged on are still able to process data locally, this can then be sent back to the main database server as and when the IT department has rectified the problem. An alternative would be to switch to another mirrored database server.
Another advantage is that the middle tier can be worked on by the IT department who may need to address some requirements by end users not foreseen in the original specification or because business requirements have changed. A single EXE or DLL can be installed onto the middle tier with less disruption to end users. This saves having to reinstall new clients on every desktop, all possible with a multi tier application.