# Biostatistics

## A Guide to Design, Analysis and Discovery

### Ronald N. Forthofer

### Eun Sul Lee

### Mike Hernandez

- 2nd Edition - December 14, 2006
- Authors: Ronald N. Forthofer, Eun Sul Lee, Mike Hernandez
- Language: English
- Hardback ISBN: 978-0-12-369492-8
- eBook ISBN: 978-0-08-046772-6

*Biostatistics, Second Edition*, is a user-friendly guide to biostatistics that focuses on the proper use and interpretation of statistical methods.

This textbook does not require an extensive background in mathematics, making it accessible to all students in the public health sciences. Instead of highlighting derivations of formulas, the authors provide rationales for them, helping students better understand the link between biology and statistics.

The material on life tables and survival analysis allows students to better understand the recent literature in the health field, particularly in the study of chronic disease treatment. This updated edition contains over 40% new material with modern real-life examples, exercises, and references, including new chapters on Logistic Regression, Analysis of Survey Data, and Study Designs.

The book is recommended for students in the health sciences, public health professionals, and practitioners.

- Over 40% new material with modern real-life examples, exercises and references
- New chapters on Logistic Regression; Analysis of Survey Data; and Study Designs
- Introduces strategies for analyzing complex sample survey data
- Written in a conversational style, with real data, to be more accessible to students

Students in the health sciences, public health professionals, practitioners

1. INTRODUCTION

1.1 What is Biostatistics?

1.2 Data – The Key Component of a Study

1.3 Design – The Road to Relevant Data

1.4 Replication – Part of the Scientific Method

1.5 Applying Statistical Methods

Concluding Remarks

Exercises

References

2. DATA AND NUMBERS

2.1 Data: Numerical Representation

2.2 Observations and Variables

2.3 Scales Used with Variables

2.4 Reliability and Validity

2.5 Randomized Response Technique

2.6 Common Data Problems

Concluding Remarks

Exercises

References

3. DESCRIPTIVE METHODS

3.1 Introduction to Descriptive Methods

3.2 Tabular and Graphic Presentation of Data

3.2.1 Frequency Tables

3.2.2 Line Graphs

3.2.3 Bar Charts

3.2.4 Histograms

3.2.5 Stem-and-Leaf Plots

3.2.6 Dot Plots

3.2.7 Scatter Plots

3.3 Measures of Central Tendency

3.3.1 Mean, Median, and Mode

3.3.2 Use of the Measures of Central Tendency

3.3.3 The Geometric Mean

3.4 Measures of Variability

3.4.1 Ranges and Percentiles

3.4.2 Box Plots

3.4.3 Variance and Standard Deviation

3.5 Rates and Ratios

3.5.1 Crude and Specific Rates

3.5.2 Adjusted Rates

3.6 Measures of Change Over Time

3.6.1 Linear Growth

3.6.2 Geometric Growth

3.6.3 Exponential Growth

3.7 Correlation Coefficients

3.7.1 Pearson Correlation Coefficient

3.7.2 Spearman Rank Correlation Coefficient

Concluding Remarks

Exercises

References

4. PROBABILITY AND LIFE TABLES

4.1 A Definition of Probability

4.2 Rules for Calculating Probabilities

4.2.1 Addition Rule for Probabilities

4.2.2 Conditional Probabilities

4.2.3 Independent Events

4.3 Definitions from Epidemiology

4.3.1 Rates and Probabilities

4.3.2 Sensitivity, Specificity, and Predicted Value Positive and Negative

4.3.3 Receiver Operating Characteristic Plot

4.4 Bayes’ Theorem

4.5 Probability in Sampling

4.5.1 Sampling With Replacement

4.5.2 Sampling Without Replacement

4.6 Estimating Probabilities by Simulation

4.7 Probability and the Life Table

4.7.1 The First Four Columns of the Life Table

4.7.2 Some Uses of the Life Table

4.7.3 Expected Values in the Life Table

4.7.4 Other Expected Values in the Life Table

Concluding Remarks

Exercises

References

5. PROBABILITY DISTRIBUTIONS

5.1 The Binomial Distribution

5.1.1 Binomial Probabilities

5.1.2 Mean and Variance of the Binomial Distribution

5.1.3 Shapes of the Binomial Distribution

5.2 The Poisson Distribution

5.2.1 Poisson Probabilities

5.2.2 Mean and Variance of the Poisson Distribution

5.2.3 Finding Poisson Probabilities

5.3 The Normal Distribution

5.3.1 Normal Probabilities

5.3.2 Transforming to the Standard Normal Distribution

5.3.3 Calculation of Normal Probabilities

5.3.4 Normal Probability Plot

5.4 The Central Limit Theorem

5.5 Approximations to the Binomial and Poisson Distributions

5.5.1 Normal Approximation to the Binomial Distribution

5.5.2 Normal Approximation to the Poisson Distribution

Concluding Remarks

Exercises

References

6. STUDY DESIGNS

6.1 Design: Putting Chance to Work

6.2 Sample Surveys and Experiments

6.3 Sampling and Sample Designs

6.3.1 Sampling Frame

6.3.2 Importance of Probability Sampling

6.3.3 Simple Random Sampling

6.3.4 Systematic Sampling

6.3.5 Stratified Random Sampling

6.3.6 Cluster Sampling

6.3.7 Problems Due to Unintended Sampling

6.4 Designed Experiments

6.4.1 Comparison Groups and Randomization

6.4.2 Random Assignment

6.4.3 Sample Size

6.4.4 Single and Double Blind Experiments

6.4.5 Blocking and Extraneous Variables

6.4.6 Limitations of Experiments

6.5 Variations in Study Designs

6.5.1 The Cross-Over Design

6.5.2 The Case Control Design

6.5.3 The Cohort Study Design

Concluding Remarks

Exercises

References

7. INTERVAL ESTIMATION

7.1 Prediction, Confidence, and Tolerance Intervals

7.2 Distribution-Free Intervals

7.2.1 Prediction Interval

7.2.2 Confidence Interval

7.2.3 Tolerance Interval

7.3 Confidence Intervals Based on the Normal Distribution

7.3.1 Confidence Interval for the Mean

7.3.2 Confidence Interval for a Proportion

7.3.3 Confidence Interval for Crude and Adjusted Rates

7.4 Confidence Interval for the Difference of Two Means and Proportions

Difference of Two Independent Means

7.4.1 Difference of Two Dependent Means

7.4.2 Difference of Two Independent Proportions

7.4.3 Difference of Two Dependent Proportions

7.5 Confidence Interval and Sample Size

7.6 Confidence Interval for Other Measures

7.6.1 Confidence Interval for the Variance

7.6.2 Confidence Interval for Pearson Correlation Coefficient

7.7 Prediction and Tolerance Intervals Based on the Normal Distribution

7.7.1 Prediction Interval

7.7.2 Tolerance Interval

Concluding Remarks

Exercises

References

8. TESTS OF HYPOTHESES

8.1 Preliminaries in Tests of Hypotheses

8.1.1 Definitions of Terms Used in Hypothesis Testing

8.1.2 Determination of Decision Rule

8.1.3 Relationship of the Decision Rule, α and β

8.1.4 Conducting the Test

8.2 Testing Hypotheses about the Mean

8.2.1 Known Variance

8.2.2 Unknown Variance

8.3 Testing Hypotheses about the Proportion and Rates

8.4 Testing Hypotheses about the Variance

8.5 Testing Hypotheses about the Pearson Correlation Coefficient

8.6 Testing Hypotheses about the Difference of Two Means

8.6.1 Difference of Two Independent Means

8.6.2 Difference of Two Dependent Means

8.7 Testing Hypotheses about the Difference of Two Proportions

8.7.1 Difference of Two Independent Proportions

8.7.2 Difference of Two Dependent Proportions

8.8 Tests of Hypotheses and Sample Size

8.9 Statistical and Practical Significance

Concluding Remarks

Exercises

References

9. NONPARAMETRIC TESTS

9.1 Why Nonparametric Tests?

9.2 The Sign Test

9.3 The Wilcoxon Signed Rank Test

9.4 The Wilcoxon Rank Sum Test

9.5 The Kruskal-Wallis Test

9.6 The Friedman Test

Concluding Remarks

Exercises

References

10. ANALYSIS OF CATEGORICAL DATA

10.1 Goodness-of-Fit Test

10.2 The 2 by 2 Contingency Table

10.2.1 Comparing Two Independent Binomial Proportions

10.2.2 Expected Cell Counts Assuming No Association

10.2.3 The Odds Ratio – a Measure of Association

10.2.4 The Fisher’s Exact Test

10.2.5 Analysis of Paired Data: The McNemar Test

10.3 The r by c Contingency Table

10.3.1 Testing Hypothesis of Non Association: The Chi-Square Test

10.3.2 Testing Hypothesis of No Trend

10.4 Multiple 2 by 2 Tables

10.4.1 Analyzing the Tables Separately

10.4.2 The Cochran-Mantel-Haenszel Test

10.4.3 The Mantel-Haenszel Common Odds Ratio

Concluding Remarks

Exercises

References

11. ANALYSIS OF SURVIVAL DATA

11.1 Data Collection in Follow-Up Studies

11.2 The Life Table Method

11.3 The Product-Limit Method

11.4 Comparison of Two Survival Distributions

11.4.1 The Cochran-Mantel-Haenszel Test

11.4.2 The Log-Rank Test

Concluding Remarks

Exercises

References

12. ANALYSIS OF VARIANCE

12.1 Assumptions for the Use of the ANOVA

12.2 One-Way ANOVA

12.2.1 Sums of Squares and Mean Squares

12.2.2 The F Statistics

12.2.3 The ANOVA Table

12.3 Multiple Comparisons

12.3.1 Error Rates: Individual and Family

12.3.2 Tukey-Kramer Method

12.3.3 Fisher’s Least Significant Difference Method

12.3.4 Dunnett’s Method

12.4 Two-Way ANOVA for the Randomized Block Design with m Replicates

12.5 Two-Way ANOVA with Interaction

12.6 Linear Model Representation of the ANOVA

12.6.1 The Completely Randomized Design

12.6.2 The Randomized Block Design with m Replicates

12.6.3 Two-Way ANOVA with Interaction

12.7 ANOVA with Unequal Numbers of Observations in Subgroups

Concluding Remarks

Exercises

References

13. LINEAR REGRESSION

13.1 Simple Linear Regression

13.1.1 Estimation of Coefficients

13.1.2 The Variance of Y

- No. of pages: 528
- Language: English
- Edition: 2
- Published: December 14, 2006
- Imprint: Academic Press
- Hardback ISBN: 9780123694928
- eBook ISBN: 9780080467726

### Ronald N. Forthofer

Affiliations and expertise

Boulder County, Colorado, U.S.A.

### Eun Sul Lee

Affiliations and expertise

Oregon Health Science University, Portland, U.S.A.

### Mike Hernandez

Mike Hernandez has worked as a statistical analyst in the Department of Biostatistics at the MD Anderson Cancer Center for over 10 years. Working in a large medical center, he has developed expertise in collaborative research spanning several disciplines, from health disparities to clinical trials. He has coauthored over 40 peer-reviewed manuscripts and is a co-author of *Biostatistics: A Guide to Design, Analysis, and Discovery*, 2nd ed.

Affiliations and expertise

Anderson Cancer Center, Houston, TX, USA