Data mining and statistics for decision making
"This practical guide to understanding and implementing data mining techniques discusses traditional methods--cluster analysis, factor analysis, linear regression, PLS regression, and generalized linear models--and recent methods--bagging and boosting, decision trees, neural networks, support v...
Main author: | |
---|---|
Format: | E-book |
Language: | English |
Published: | Chichester, West Sussex ; Hoboken, NJ : Wiley, 2011. |
Series: | Wiley series in computational statistics. |
Subjects: | |
Available Online: | Full Text via HEAL-Link |
Table of Contents:
- Front Matter
- Overview of Data Mining
- The Development of a Data Mining Study
- Data Exploration and Preparation
- Using Commercial Data
- Statistical and Data Mining Software
- An Outline of Data Mining Methods
- Factor Analysis
- Neural Networks
- Cluster Analysis
- Association Analysis
- Classification and Prediction Methods
- An Application of Data Mining: Scoring
- Factors for Success in a Data Mining Project
- Text Mining
- Web Mining
- Appendix A: Elements of Statistics
- Appendix B: Further Reading
- Index.
- Machine generated contents note: Preface
- Foreword
- Contents
- Overview of data mining
- 1.1. What is data mining?
- 1.2. What is data mining used for?
- 1.3. Data mining and statistics
- 1.4. Data mining and information technology
- 1.5. Data mining and protection of personal data
- 1.6. Implementation of data mining
- The development of a data mining study
- 2.1. Defining the aims
- 2.2. Listing the existing data
- 2.3. Collecting the data
- 2.4. Exploring and preparing the data
- 2.5. Population segmentation
- 2.6. Drawing up and validating predictive models
- 2.7. Synthesizing predictive models of different segments
- 2.8. Iteration of the preceding steps
- 2.9. Deploying the models
- 2.10. Training the model users
- 2.11. Monitoring the models
- 2.12. Enriching the models
- 2.13. Remarks
- 2.14. Life cycle of a model
- 2.15. Costs of a pilot project
- Data exploration and preparation
- 3.1. The different types of data
- 3.2. Examining the distribution of variables
- 3.3. Detection of rare or missing values
- 3.4. Detection of aberrant values
- 3.5. Detection of extreme values
- 3.6. Tests of normality
- 3.7. Homoscedasticity and heteroscedasticity
- 3.8. Detection of the most discriminating variables
- 3.9. Transformation of variables
- 3.10. Choosing ranges of values of continuous variables
- 3.11. Creating new variables
- 3.12. Detecting interactions
- 3.13. Automatic variable selection
- 3.14. Detection of collinearity
- 3.15. Sampling
- Using commercial data
- 4.1. Data used in commercial applications
- 4.2. Special data
- 4.3. Data used by business sector
- Statistical and data mining software
- 5.1. Types of data mining and statistical software
- 5.2. Essential characteristics of the software
- 5.3. The main software packages
- 5.4. Comparison of R, SAS and IBM SPSS
- 5.5. How to reduce processing time
- An outline of data mining methods
- 6.1. A note on terminology
- 6.2. Classification of the methods
- 6.3. Comparison of the methods
- 6.4. Using these methods in the business world
- Factor analysis
- 7.1. Principal component analysis
- 7.2. Variants of principal component analysis
- 7.3. Correspondence analysis
- 7.4. Multiple correspondence analysis
- Neural networks
- 8.1. General information on neural networks
- 8.2. Structure of a neural network
- 8.3. Choosing the training sample
- 8.4. Some empirical rules for network design
- 8.5. Data normalization
- 8.6. Learning algorithms
- 8.7. The main neural networks
- Automatic clustering methods
- 9.1. Definition of clustering
- 9.2. Applications of clustering
- 9.3. Complexity of clustering
- 9.4. Clustering structures
- 9.5. Some methodological considerations
- 9.6. Comparison of factor analysis and clustering
- 9.7. Intra-class and inter-class inertias
- 9.8. Measurements of clustering quality
- 9.9. Partitioning methods
- 9.10. Hierarchical ascending clustering
- 9.11. Hybrid clustering methods
- 9.12. Neural clustering
- 9.13. Clustering by aggregation of similarities
- 9.14. Clustering of numeric variables
- 9.15. Overview of clustering methods
- Finding associations
- 10.1. Principles
- 10.2. Using taxonomy
- 10.3. Using supplementary variables
- 10.4. Applications
- 10.5. Example of use
- Classification and prediction methods
- 11.1. Introduction
- 11.2. Inductive and transductive methods
- 11.3. Overview of classification and prediction methods
- 11.4. Classification by decision tree
- 11.5. Prediction by decision tree
- 11.6. Classification by discriminant analysis
- 11.7. Prediction by linear regression
- 11.8. Classification by logistic regression
- 11.9. Developments in logistic regression
- 11.10. Bayesian methods
- 11.11. Classification and prediction by neural networks
- 11.12. Classification by support vector machines (SVMs)
- 11.13. Prediction by genetic algorithms
- 11.14. Improving the performance of a predictive model
- 11.15. Bootstrapping and aggregation of models
- 11.16. Using classification and prediction methods
- An application of data mining: scoring
- 12.1. The different types of score
- 12.2. Using propensity scores and risk scores
- 12.3. Methodology
- 12.4. Implementing a strategic score
- 12.5. Implementing an operational score
- 12.6. The kinds of scoring solutions used in a business
- 12.7. An example of credit scoring (data preparation)
- 12.8. An example of credit scoring (modelling by logistic regression)
- 12.9. An example of credit scoring (modelling by DISQUAL discriminant analysis)
- 12.10. A brief history of credit scoring
- Factors for success in a data mining project
- 13.1. The subject
- 13.2. The people
- 13.3. The data
- 13.4. The IT systems
- 13.5. The business culture
- 13.6. Data mining: eight common misconceptions
- 13.7. Return on investment
- Text mining
- 14.1. Definition of text mining
- 14.2. Text sources used
- 14.3. Using text mining
- 14.4. Information retrieval
- 14.5. Information extraction
- 14.6. Multi-type data mining
- Web mining
- 15.1. The aims of web mining
- 15.2. Global analyses
- 15.3. Individual analyses
- 15.4. Personal analyses
- Appendix: Elements of statistics
- 16.1. A brief history
- 16.2. Elements of statistics
- 16.3. Statistical tables
- Further reading
- 17.1. Statistics and data analysis
- 17.2. Data mining and statistical learning
- 17.3. Text mining
- 17.4. Web mining
- 17.5. R software
- 17.6. SAS software
- 17.7. IBM SPSS software
- 17.8. Websites
- Index.