Monday, March 14, 2016

Software Tools


3/14/2016

https://en.wikipedia.org/wiki/Data_warehouse

Software tools

The typical extract-transform-load (ETL)-based data warehouse uses staging, data integration, and access layers to house its key functions. The staging layer or staging database stores raw data extracted from each of the disparate source data systems. The integration layer integrates the disparate data sets by transforming the data from the staging layer, often storing this transformed data in an operational data store (ODS) database. The integrated data is then moved to yet another database, often called the data warehouse database, where it is arranged into hierarchical groups, often called dimensions, and into facts and aggregate facts. The combination of facts and dimensions is sometimes called a star schema. The access layer helps users retrieve data.[4]
This definition of the data warehouse focuses on data storage. The main source of the data is cleaned, transformed, cataloged, and made available for use by managers and other business professionals for data mining, online analytical processing, market research, and decision support.[5] However, the means to retrieve and analyze data, to extract, transform, and load data, and to manage the data dictionary are also considered essential components of a data warehousing system. Many references to data warehousing use this broader context. Thus, an expanded definition for data warehousing includes business intelligence tools, tools to extract, transform, and load data into the repository, and tools to manage and retrieve metadata.
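To make the staging, integration, and access layers described above more concrete, here is a minimal sketch using Python's built-in sqlite3 module. Everything in it is an assumption made for illustration: the sample rows, the table names (staging_sales, dim_product, dim_region, fact_sales), and the helper functions are hypothetical, and a real warehouse would normally keep these layers in separate databases rather than in one in-memory file.

```python
import sqlite3

# Hypothetical raw rows as they might arrive from different source systems:
# inconsistent casing, stray spaces, amounts stored as text.
RAW_ROWS = [
    ("2016-03-14", " Widget ", "east", "20.00"),
    ("2016-03-14", "gadget", "WEST", "5.00"),
    ("2016-03-15", "widget", "East", "20.00"),
]


def run_etl(conn):
    cur = conn.cursor()

    # Staging layer: store the extracted data exactly as received.
    cur.execute(
        "CREATE TABLE staging_sales "
        "(sale_date TEXT, product TEXT, region TEXT, amount TEXT)"
    )
    cur.executemany("INSERT INTO staging_sales VALUES (?, ?, ?, ?)", RAW_ROWS)

    # Warehouse tables: two dimensions and one fact table (a small star schema).
    cur.execute(
        "CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT UNIQUE)"
    )
    cur.execute(
        "CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, name TEXT UNIQUE)"
    )
    cur.execute(
        "CREATE TABLE fact_sales ("
        "sale_date TEXT, "
        "product_id INTEGER REFERENCES dim_product(product_id), "
        "region_id INTEGER REFERENCES dim_region(region_id), "
        "amount REAL)"
    )

    # Integration step: clean and transform each staged row, then load it.
    staged = cur.execute("SELECT * FROM staging_sales").fetchall()
    for sale_date, product, region, amount in staged:
        product = product.strip().lower()
        region = region.strip().lower()
        cur.execute("INSERT OR IGNORE INTO dim_product (name) VALUES (?)", (product,))
        cur.execute("INSERT OR IGNORE INTO dim_region (name) VALUES (?)", (region,))
        product_id = cur.execute(
            "SELECT product_id FROM dim_product WHERE name = ?", (product,)
        ).fetchone()[0]
        region_id = cur.execute(
            "SELECT region_id FROM dim_region WHERE name = ?", (region,)
        ).fetchone()[0]
        cur.execute(
            "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
            (sale_date, product_id, region_id, float(amount)),
        )
    conn.commit()


def sales_by_region(conn):
    # Access layer: a query a report or BI tool might run against the star schema.
    return conn.execute(
        "SELECT r.name, SUM(f.amount) "
        "FROM fact_sales f JOIN dim_region r ON r.region_id = f.region_id "
        "GROUP BY r.name ORDER BY r.name"
    ).fetchall()


conn = sqlite3.connect(":memory:")
run_etl(conn)
print(sales_by_region(conn))
```

The cleaning step is what lets "east" and "East" land in the same region row; without it the same region would be split across several dimension entries.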

Basics

3/14/2016

Data Warehouse: A warehouse is a place where you keep all your material. If a person says a building is his warehouse, it means it holds the variety of material he wants to store for the future needs of his business. In our native language we call it a 'godown'. So what is a data warehouse?

A data warehouse is a data godown: a system used for reporting and data analysis. Data is collected from different sources (for example employees, users of Facebook or the Google search engine, or machines such as satellites). It is called a central repository of integrated data from one or more disparate sources. Because data warehouses store current and historical data, they are also used for creating analytical reports.

So a data warehouse combines two ideas: storage (collection) of data from different sources + analysis of that data.
https://en.wikipedia.org/wiki/Data_warehouse

In computing, a data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for reporting and data analysis. DWs are central repositories of integrated data from one or more disparate sources. They store current and historical data and are used for creating analytical reports for knowledge workers throughout the enterprise. 


Examples of reports could range from annual and quarterly comparisons and trends to detailed daily sales analysis.
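As a rough illustration of the quarterly comparison reports mentioned above, the sketch below rolls daily sales up into quarters and compares one quarter with the same quarter a year earlier. The sample figures and the function names (quarterly_totals, year_over_year) are made up for this example and are not taken from any particular reporting tool.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales records: (day, amount).
DAILY_SALES = [
    (date(2015, 2, 10), 100.0),
    (date(2015, 5, 3), 150.0),
    (date(2016, 1, 20), 120.0),
    (date(2016, 2, 11), 60.0),
]


def quarterly_totals(sales):
    """Sum the amounts per (year, quarter)."""
    totals = defaultdict(float)
    for day, amount in sales:
        quarter = (day.month - 1) // 3 + 1  # months 1-3 -> Q1, 4-6 -> Q2, ...
        totals[(day.year, quarter)] += amount
    return dict(totals)


def year_over_year(totals, quarter, year):
    """Change in a quarter's total versus the same quarter one year earlier."""
    return totals[(year, quarter)] - totals[(year - 1, quarter)]


totals = quarterly_totals(DAILY_SALES)
print(totals)
print(year_over_year(totals, quarter=1, year=2016))
```

This kind of comparison only works because the warehouse keeps historical data alongside current data.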


The data stored in the warehouse is uploaded from the operational systems (such as marketing, sales, etc.). The data may pass through an operational data store for additional operations before it is used in the DW for reporting.


http://searchsqlserver.techtarget.com/definition/data-warehouse
Data warehousing emphasizes the capture of data from diverse sources for useful analysis and access, but does not generally start from the point-of-view of the end user who may need access to specialized, sometimes local databases. The latter idea is known as the data mart.

Data Mining


3/14/2016

http://searchsqlserver.techtarget.com/definition/data-mining

Data mining is sorting through data to identify patterns and establish relationships.

Data mining parameters include:
  • Association - looking for patterns where one event is connected to another event
  • Sequence or path analysis - looking for patterns where one event leads to another later event
  • Classification - looking for new patterns (this may result in a change in the way the data is organized, but that's OK)
  • Clustering - finding and visually documenting groups of facts not previously known
  • Forecasting - discovering patterns in data that can lead to reasonable predictions about the future (This area of data mining is known as predictive analytics.)
Data mining techniques are used in many research areas, including mathematics, cybernetics, genetics and marketing. Web mining, a type of data mining used in customer relationship management (CRM), takes advantage of the huge amount of information gathered by a Web site to look for patterns in user behavior.
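The association parameter in the list above can be sketched with a small example. It counts how often two items appear together in the same basket (support) and how often one item appears given that another is present (confidence). The basket data and the function names are assumptions made for illustration; this is a simple counting sketch, not a full association-rule algorithm such as Apriori.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, one set of items per purchase.
BASKETS = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
]


def pair_support(baskets):
    """Fraction of baskets that contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sort so each pair is always stored in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}


def confidence(baskets, antecedent, consequent):
    """Of the baskets containing `antecedent`, the fraction that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)


support = pair_support(BASKETS)
print(support[("bread", "butter")])
print(confidence(BASKETS, "butter", "bread"))
```

In this sample data every basket with butter also contains bread, so the rule "butter implies bread" has a confidence of 1.0, while the reverse rule is weaker because bread is also bought without butter.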