Fundamentals of Data Science
Theory and Practice
- 1st Edition - November 17, 2023
- Authors: Jugal K. Kalita, Dhruba K. Bhattacharyya, Swarup Roy
- Language: English
- Paperback ISBN: 978-0-323-91778-0
- eBook ISBN: 978-0-323-97263-5
Fundamentals of Data Science: Theory and Practice presents basic and advanced concepts in data science along with real-life applications. The book gives students, researchers and professionals at different levels a good understanding of the concepts of data science, machine learning, data mining and analytics. Readers will find the authors’ research experience and achievements in data science applications, along with in-depth discussions of topics essential to data science projects. These include pre-processing, which is carried out before predictive and descriptive data analysis tasks are applied, and proximity measures for numeric, categorical and mixed-type data.
The authors systematically present many predictive and descriptive learning algorithms, including recent developments that have handled large datasets with high accuracy. A number of descriptive learning tasks are also covered.
- Presents the foundational concepts of data science along with advanced concepts and real-life applications for applied learning
- Includes coverage of a number of key topics such as data quality and pre-processing, proximity and validation, predictive data science, descriptive data science, ensemble learning, association rule mining, Big Data analytics, as well as incremental and distributed learning
- Provides updates on key applications of data science techniques in areas such as Computational Biology, Network Intrusion Detection, Natural Language Processing, Software Clone Detection, Financial Data Analysis, and Scientific Time Series Data Analysis
- Includes program code for implementing descriptive and predictive algorithms
- Cover image
- Title page
- Table of Contents
- Copyright
- Dedication
- Preface
- Acknowledgment
- Foreword
- Foreword
- 1: Introduction
- Abstract
- 1.1. Data, information, and knowledge
- 1.2. Data Science: the art of data exploration
- 1.3. What is not Data Science?
- 1.4. Data Science tasks
- 1.5. Data Science objectives
- 1.6. Applications of Data Science
- 1.7. How to read the book?
- References
- 2: Data, sources, and generation
- Abstract
- 2.1. Introduction
- 2.2. Data attributes
- 2.3. Data-storage formats
- 2.4. Data sources
- 2.5. Data generation
- 2.6. Summary
- References
- 3: Data preparation
- Abstract
- 3.1. Introduction
- 3.2. Data cleaning
- 3.3. Data reduction
- 3.4. Data transformation
- 3.5. Data normalization
- 3.6. Data integration
- 3.7. Summary
- References
- 4: Machine learning
- Abstract
- 4.1. Introduction
- 4.2. Machine Learning paradigms
- 4.3. Inductive bias
- 4.4. Evaluating a classifier
- 4.5. Summary
- References
- 5: Regression
- Abstract
- 5.1. Introduction
- 5.2. Regression
- 5.3. Evaluating linear regression
- 5.4. Multidimensional linear regression
- 5.5. Polynomial regression
- 5.6. Overfitting in regression
- 5.7. Reducing overfitting in regression: regularization
- 5.8. Other approaches to regression
- 5.9. Summary
- References
- 6: Classification
- Abstract
- 6.1. Introduction
- 6.2. Nearest-neighbor classifiers
- 6.3. Decision trees
- 6.4. Support-Vector Machines (SVM)
- 6.5. Incremental classification
- 6.6. Summary
- References
- 7: Artificial neural networks
- Abstract
- 7.1. Introduction
- 7.2. From biological to artificial neuron
- 7.3. Multilayer perceptron
- 7.4. Learning by backpropagation
- 7.5. Loss functions
- 7.6. Activation functions
- 7.7. Deep neural networks
- 7.8. Summary
- References
- 8: Feature selection
- Abstract
- 8.1. Introduction
- 8.2. Steps in feature selection
- 8.3. Principal-component analysis for feature reduction
- References
- 9: Cluster analysis
- Abstract
- 9.1. Introduction
- 9.2. What is cluster analysis?
- 9.3. Proximity measures
- 9.4. Exclusive clustering techniques
- 9.5. High-dimensional data clustering
- 9.6. Biclustering
- 9.7. Cluster-validity measures
- 9.8. Summary
- References
- 10: Ensemble learning
- Abstract
- 10.1. Introduction
- 10.2. Ensemble-learning framework
- 10.3. Supervised ensemble learning
- 10.4. Unsupervised ensemble learning
- 10.5. Semisupervised ensemble learning
- 10.6. Issues and challenges
- 10.7. Summary
- References
- 11: Association-rule mining
- Abstract
- Acknowledgement
- 11.1. Introduction
- 11.2. Association analysis: basic concepts
- 11.3. Frequent itemset-mining algorithms
- 11.4. Association mining in quantitative data
- 11.5. Correlation mining
- 11.6. Distributed and parallel association mining
- 11.7. Summary
- References
- 12: Big Data analysis
- Abstract
- 12.1. Introduction
- 12.2. Characteristics of Big Data
- 12.3. Types of Big Data
- 12.4. Big Data analysis problems
- 12.5. Big Data analytics techniques
- 12.6. Big Data analytics platforms
- 12.7. Big Data analytics architecture
- 12.8. Tools and systems for Big Data analytics
- 12.9. Active challenges
- 12.10. Summary
- References
- 13: Data Science in practice
- Abstract
- 13.1. Need of Data Science in the real world
- 13.2. Hands-on Data Science with Python
- 13.3. Dataset preprocessing
- 13.4. Feature selection and normalization
- 13.5. Classification
- 13.6. Clustering
- 13.7. Summary
- References
- 14: Conclusion
- Abstract
- Index
- No. of pages: 334
- Language: English
- Edition: 1
- Published: November 17, 2023
- Imprint: Academic Press
- Paperback ISBN: 9780323917780
- eBook ISBN: 9780323972635
Jugal K. Kalita
Dhruba K. Bhattacharyya