Managing Data in Motion
Data Integration Best Practice Techniques and Technologies
- 1st Edition - February 26, 2013
- Author: April Reeve
- Language: English
- Paperback ISBN: 978-0-12-397167-8
- eBook ISBN: 978-0-12-397791-5
Managing Data in Motion describes techniques that have been developed for significantly reducing the complexity of managing system interfaces and enabling scalable architectures. Author April Reeve brings over two decades of experience to present a vendor-neutral approach to moving data between computing environments and systems. Readers will learn the techniques, technologies, and best practices for managing the passage of data between computer systems and integrating disparate data in an enterprise environment.
The average enterprise's computing environment comprises hundreds to thousands of computer systems that have been built, purchased, and acquired over time. The data from these various systems needs to be integrated for reporting and analysis, shared for business transaction processing, and converted from one format to another when old systems are replaced and new systems are acquired.
The management of "data in motion" in organizations is rapidly becoming one of the biggest concerns for business and IT management. Data warehousing and conversion, real-time data integration, and cloud and "big data" applications are just a few of the challenges facing organizations and businesses today. Managing Data in Motion tackles these and other topics in a style easily understood by business and IT managers as well as programmers and architects.
- Presents a vendor-neutral overview of the different technologies and techniques for moving data between computer systems, including emerging solutions for unstructured as well as structured data types
- Explains, in non-technical terms, the architecture and components required to perform data integration
- Describes how to reduce the complexity of managing system interfaces and enable a scalable data architecture that can handle the dimensions of "Big Data"
Data Warehouse Professionals; Data Modelers and Architects; Database and Network Administrators; ETL and Application Programmers; Project Managers; IT and Data Center Managers; CIO/CTO
Dedication
Foreword
Acknowledgements
Biography
Introduction
What this book is about and why it’s necessary
What the reader will learn
Who should read this book
How this book is organized
Part 1: Introduction to data integration
Part 2: Batch data integration
Part 3: Real-time data integration
Part 4: Big data integration
Part 1: Introduction to Data Integration
Chapter 1. The Importance of Data Integration
The natural complexity of data interfaces
The rise of purchased vendor packages
Key enablement of big data and virtualization
Chapter 2. What Is Data Integration?
Data in motion
Integrating into a common format—transforming data
Migrating data from one system to another
Moving data around the organization
Pulling information from unstructured data
Moving process to data
Chapter 3. Types and Complexity of Data Integration
The differences and similarities in managing data in motion and persistent data
Batch data integration
Real-time data integration
Big data integration
Data virtualization
Chapter 4. The Process of Data Integration Development
The data integration development life cycle
Inclusion of business knowledge and expertise
Part 2: Batch Data Integration
Chapter 5. Introduction to Batch Data Integration
What is batch data integration?
Batch data integration life cycle
Chapter 6. Extract, Transform, and Load
What is ETL?
Profiling
Extract
Staging
Access layers
Transform
Load
Chapter 7. Data Warehousing
What is data warehousing?
Layers in an enterprise data warehouse architecture
Types of data to load in a data warehouse
Chapter 8. Data Conversion
What is data conversion?
Data conversion life cycle
Data conversion analysis
Best practice data loading
Improving source data quality
Mapping to target
Configuration data
Testing and dependencies
Private data
Proving
Environments
Chapter 9. Data Archiving
What is data archiving?
Selecting data to archive
Can the archived data be retrieved?
Conforming data structures in the archiving environment
Flexible data structures
Chapter 10. Batch Data Integration Architecture and Metadata
What is batch data integration architecture?
Profiling tool
Modeling tool
Metadata repository
Data movement
Transformation
Scheduling
Part 3: Real-Time Data Integration
Chapter 11. Introduction to Real-Time Data Integration
Why real-time data integration?
Why two sets of technologies?
Chapter 12. Data Integration Patterns
Interaction patterns
Loose coupling
Hub and spoke
Synchronous and asynchronous interaction
Request and reply
Publish and subscribe
Two-phase commit
Integrating interaction types
Chapter 13. Core Real-Time Data Integration Technologies
Confusing terminology
Enterprise service bus (ESB)
Service-oriented architecture (SOA)
Extensible markup language (XML)
Data replication and change data capture
Enterprise application integration (EAI)
Enterprise information integration (EII)
Chapter 14. Data Integration Modeling
Canonical modeling
Message modeling
Chapter 15. Master Data Management
Introduction to master data management
Reasons for a master data management solution
Purchased packages and master data
Reference data
Masters and slaves
External data
Master data management functionality
Types of master data management solutions—registry and data hub
Chapter 16. Data Warehousing with Real-Time Updates
Corporate information factory
Operational data store
Master data moving to the data warehouse
Chapter 17. Real-Time Data Integration Architecture and Metadata
What is real-time data integration metadata?
Modeling
Profiling
Metadata repository
Enterprise service bus—data transformation and orchestration
Data movement and middleware
External interaction
Part 4: Big, Cloud, Virtual Data
Chapter 18. Introduction to Big Data Integration
Data integration and unstructured data
Big data, cloud data, and data virtualization
Chapter 19. Cloud Architecture and Data Integration
Why is data integration important in the cloud?
Public cloud
Cloud security
Cloud latency
Cloud redundancy
Chapter 20. Data Virtualization
A technology whose time has come
Business uses of data virtualization
Data virtualization architecture
Chapter 21. Big Data Integration
What is big data?
Big data dimension—volume
Big data dimension—variety
Big data dimension—velocity
Traditional big data use cases
More big data use cases
Leveraging the power of big data—real-time decision support
Big data architecture
Chapter 22. Conclusion to Managing Data in Motion
Data integration architecture
Data integration engines
Data integration hubs
Metadata management
The end
References
Index
- No. of pages: 204
- Language: English
- Edition: 1
- Published: February 26, 2013
- Imprint: Morgan Kaufmann
- Paperback ISBN: 9780123971678
- eBook ISBN: 9780123977915