Data warehouse
Uploaded: 2015-03-09 19:44:07
From Wikipedia, the free encyclopedia
[Figure: Data Warehouse Overview]
In computing, a data warehouse (DW or DWH), also known as an enterprise data
warehouse (EDW), is a system used for reporting and data analysis. DWs are central
repositories of integrated data from one or more disparate sources. They store current
and historical data and are used for creating trending reports for senior management
reporting such as annual and quarterly comparisons.
The data stored in the warehouse is uploaded from the operational systems (such as
marketing, sales, etc., shown in the figure to the right). The data may pass through
an operational data store for additional operations before it is used in the DW for
reporting.
Contents
1 Types of systems
2 Software tools
3 Benefits
4 Generic data warehouse environment
5 History
6 Information storage
    o 6.1 Facts
    o 6.2 Dimensional vs. normalized approach for storage of data
7 Top-down versus bottom-up design methodologies
    o 7.1 Bottom-up design
    o 7.2 Top-down design
    o 7.3 Hybrid design
8 Data warehouses versus operational systems
9 Evolution in organization use
10 See also
11 References
12 Further reading
13 External links
Types of systems
Data mart
A data mart is a simple form of a data warehouse that is focused on a single subject
(or functional area), such as sales, finance or marketing. Data marts are often built and
controlled by a single department within an organization. Given their single-subject
focus, data marts usually draw data from only a few sources. The sources could be
internal operational systems, a central data warehouse, or external data.[1]
Online analytical processing (OLAP)
OLAP is characterized by a relatively low volume of transactions. Queries are often
very complex and involve aggregations. For OLAP systems, response time is an
effectiveness measure. OLAP applications are widely used by data mining techniques.
OLAP databases store aggregated, historical data in multi-dimensional schemas
(usually star schemas). OLAP systems typically have data latency of a few hours, as
opposed to data marts, where latency is expected to be closer to one day.
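
As a rough illustration of an aggregating query over a star schema, the sketch below builds a tiny in-memory SQLite database with one fact table and two dimension tables. The table names, column names and sample rows are hypothetical and chosen only for this example; a real OLAP system would hold far more data and would typically use a dedicated engine.

```python
import sqlite3

# Hypothetical star schema: one fact table surrounded by two dimensions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE fact_sales (
    date_id INTEGER REFERENCES dim_date(date_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    amount REAL
);
""")
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                 [(1, 2014, 1), (2, 2014, 2), (3, 2015, 1)])
conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                 [(10, "hardware"), (11, "software")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                 [(1, 10, 100.0), (1, 11, 50.0), (2, 10, 70.0),
                  (3, 10, 120.0), (3, 11, 80.0)])


def sales_by_year_and_category(connection):
    """Join the fact table to its dimensions and aggregate the amounts."""
    return connection.execute("""
        SELECT d.year, p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_date AS d ON d.date_id = f.date_id
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY d.year, p.category
        ORDER BY d.year, p.category
    """).fetchall()


olap_result = sales_by_year_and_category(conn)
for year, category, total in olap_result:
    print(year, category, total)
```

The query touches every fact row but returns only a handful of aggregated rows, which is the typical shape of an analytical workload.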
Online transaction processing (OLTP)
OLTP is characterized by a large number of short on-line transactions (INSERT, UPDATE,
DELETE). OLTP systems emphasize very fast query processing and maintaining data
integrity in multi-access environments. For OLTP systems, effectiveness is measured
by the number of transactions per second. OLTP databases contain detailed and
current data. The schema used to store transactional databases is the entity model
(usually 3NF).[2]
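
The following is a minimal sketch of one short OLTP-style transaction, assuming a hypothetical stock and orders schema in an in-memory SQLite database. It shows how a transaction either commits as a whole or is rolled back when an integrity rule (here a CHECK constraint) is violated; it does not attempt to model concurrency or throughput.

```python
import sqlite3

# Hypothetical schema: stock levels must never become negative.
oltp = sqlite3.connect(":memory:")
oltp.execute(
    "CREATE TABLE stock (item TEXT PRIMARY KEY, "
    "qty INTEGER NOT NULL CHECK (qty >= 0))"
)
oltp.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, "
    "item TEXT NOT NULL, qty INTEGER NOT NULL)"
)
oltp.execute("INSERT INTO stock VALUES ('widget', 5)")
oltp.commit()


def place_order(connection, item, qty):
    """Record an order and decrement stock as one atomic transaction."""
    try:
        # The connection context manager commits on success and
        # rolls back if any statement inside raises an exception.
        with connection:
            connection.execute(
                "INSERT INTO orders (item, qty) VALUES (?, ?)", (item, qty))
            connection.execute(
                "UPDATE stock SET qty = qty - ? WHERE item = ?", (qty, item))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the update; nothing was stored.
        return False


first = place_order(oltp, "widget", 3)   # enough stock
second = place_order(oltp, "widget", 4)  # would drive stock below zero
remaining = oltp.execute(
    "SELECT qty FROM stock WHERE item = 'widget'").fetchone()[0]
order_count = oltp.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

After both calls, only the first order is stored and the stock level reflects only that order, because the failed transaction left no partial changes behind.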
Predictive analysis
Predictive analysis is about finding and quantifying hidden patterns in the data using
complex mathematical models that can be used to predict future outcomes. Predictive
analysis is different from OLAP in that OLAP focuses on historical data analysis and is
reactive in nature, while predictive analysis focuses on the future. These systems are
also used for CRM (Customer Relationship Management).[3]
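
To make the contrast with OLAP concrete, the sketch below uses a deliberately simple stand-in for a predictive model: an ordinary least-squares trend line fitted to hypothetical quarterly revenue figures. Real predictive systems generally rely on far richer models; the data and function names here are assumptions for illustration only.

```python
def fit_trend(values):
    """Fit y = intercept + slope * x by least squares, with x = 0, 1, 2, ..."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical data: revenue for four consecutive quarters.
quarterly_revenue = [100.0, 110.0, 120.0, 130.0]

# OLAP-style question: what happened? (aggregate over history)
historical_total = sum(quarterly_revenue)

# Predictive question: what is likely next? (extrapolate the fitted trend)
slope, intercept = fit_trend(quarterly_revenue)
next_quarter_forecast = intercept + slope * len(quarterly_revenue)
```

The historical total only summarizes past quarters, while the forecast extends the fitted line one step beyond the observed data.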
Software tools
The typical extract-transform-load (ETL)-based data warehouse uses staging, data
integration, and access layers to house its key functions. The staging layer or staging
database stores raw data extracted from each of the disparate source data systems.
The integration layer integrates the disparate data sets by transforming the data from
the staging layer, often storing this transformed data in an operational data store (ODS)
database. The integrated data is then moved to yet another database, often called
the data warehouse database, where the data is arranged into hierarchical groups,
often called dimensions, and into facts and aggregate facts. The combination of facts
and dimensions is sometimes called a star schema. The access layer helps users
retrieve data.[4]
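
The layers described above can be sketched in a few lines of plain Python. This is only a toy model under stated assumptions: the two source systems, their field names and the sample rows are hypothetical, and in-memory dictionaries stand in for the staging, integration and warehouse databases.

```python
from collections import defaultdict

# Staging layer: raw extracts, kept in each source's own (hypothetical) layout.
staging = {
    "sales_system": [
        {"cust": "ACME", "region": "north", "total": "120.50"},
        {"cust": "Bolt", "region": "south", "total": "80.00"},
    ],
    "web_shop": [
        {"customer_name": "acme", "area": "North", "amount_cents": 3000},
    ],
}


def integrate(staged):
    """Integration layer: map every source onto one common record shape."""
    records = []
    for row in staged["sales_system"]:
        records.append({"customer": row["cust"].upper(),
                        "region": row["region"].lower(),
                        "amount": float(row["total"])})
    for row in staged["web_shop"]:
        records.append({"customer": row["customer_name"].upper(),
                        "region": row["area"].lower(),
                        "amount": row["amount_cents"] / 100})
    return records


def load(records):
    """Warehouse layer: split records into a dimension and a list of facts."""
    dim_customer = {}  # (customer, region) -> surrogate key
    facts = []
    for rec in records:
        key = dim_customer.setdefault((rec["customer"], rec["region"]),
                                      len(dim_customer) + 1)
        facts.append({"customer_key": key, "amount": rec["amount"]})
    return dim_customer, facts


def total_by_region(dim_customer, facts):
    """Access layer: aggregate facts by a dimension attribute."""
    region_of = {key: region for (_, region), key in dim_customer.items()}
    totals = defaultdict(float)
    for fact in facts:
        totals[region_of[fact["customer_key"]]] += fact["amount"]
    return dict(totals)


dim_customer, facts = load(integrate(staging))
region_totals = total_by_region(dim_customer, facts)
```

Because both sources are normalized to the same customer and region values during integration, the two ACME rows end up sharing one dimension entry.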
This definition of the data warehouse focuses on data storage. The main source data
is cleaned, transformed, cataloged and made available for use by managers
and other business professionals for data mining, online analytical processing, market
research and decision support.[5] However, the means to retrieve and analyze data,
to extract, transform and load data, and to manage the data dictionary are also
considered essential components of a data warehousing system. Many references to
data warehousing use this broader context. Thus, an expanded definition for data
warehousing includes business intelligence tools, tools to extract, transform and
load data into the repository, and tools to manage and retrieve metadata.
Benefits
A data warehouse maintains a copy of information from the source transaction
systems. This architectural complexity provides the opportunity to:
- Congregate data from multiple sources into a single database so a single query
  engine can be used to present data.
- Mitigate the problem of database isolation level lock contention in transaction
  processing systems caused by attempts to run large, long-running analysis queries
  in transaction processing databases.
- Maintain data history, even if the source transaction systems do not.
- Integrate data from multiple source systems, enabling a central view across the
  enterprise. This benefit is always valuable, but particularly so when the organization
  has grown by merger.
- Improve data quality, by providing consistent codes and descriptions, flagging or
  even fixing bad data.
- Present the organization's information consistently.
- Provide a single common data model for all data of interest regardless of the
  data's source.
- Restructure the data so that it makes sense to the business users.
- Restructure the data so that it delivers excellent query performance, even for
  complex analytic queries, without impacting the operational systems.
- Add value to operational business applications, notably customer relationship
  management (CRM) systems.
- Make decision-support queries easier to write.
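
One of the benefits listed above, improving data quality through consistent codes and flagging bad data, might look like the following in miniature. The mapping table, field names and sample rows are hypothetical; a real warehouse would usually keep such reference data in managed lookup tables rather than in code.

```python
# Hypothetical reference mapping from source spellings to one canonical code.
CANONICAL_COUNTRY = {
    "us": "US", "usa": "US", "united states": "US",
    "de": "DE", "germany": "DE",
}


def standardize_country(raw):
    """Return (canonical_code, is_valid) for a raw country value."""
    key = (raw or "").strip().lower()
    code = CANONICAL_COUNTRY.get(key)
    return (code, True) if code else (None, False)


def clean_rows(rows):
    """Rewrite recognizable codes and set aside rows that cannot be mapped."""
    cleaned, flagged = [], []
    for row in rows:
        code, ok = standardize_country(row.get("country"))
        if ok:
            cleaned.append({**row, "country": code})
        else:
            flagged.append(row)
    return cleaned, flagged


source_rows = [
    {"id": 1, "country": "USA"},
    {"id": 2, "country": " germany "},
    {"id": 3, "country": "Atlantis"},
    {"id": 4, "country": None},
]
cleaned_rows, flagged_rows = clean_rows(source_rows)
```

Rows with recognizable values are loaded under one consistent code, while unmapped rows are kept aside for review instead of being silently dropped or guessed.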
Generic data warehouse environment
The environment for data warehouses and marts includes the following:
- Source systems that provide data to the warehouse or mart;
- Data integration technology and processes that are needed to prepare the data
  for use;
- Different architectures for storing data in an organization's data warehouse or
  data marts;