Biostatistics and Bioinformatics in Clinical Trials
Summary of Key Points
• The process of conducting cancer research must change in the face of prohibitive costs and limited patient resources.
• Biostatistics has a tremendous impact on the level of science in cancer research, especially in the design and conduct of clinical trials.
• The Bayesian statistical approach to clinical trial design and conduct can be used to develop more efficient and effective cancer studies.
• Modern technology and advanced analytic methods are directing the focus of medical research to subsets of disease types and to future trials across different types of cancer.
• A consequence of the rapidly changing technology for generating “omics” data is that biological assays are often not stable long enough to discover and validate a model in a clinical trial.
• Bioinformaticians must use technology-specific data normalization procedures and rigorous statistical methods to account for sample collection, batch effects, multiple testing, confounding covariates, and any other potential biases.
• Best practices in developing prediction models include public access to the information, rigorous validation of the model, and model lockdown prior to its use in patient care management.
1. In research, the P value is:
2. Type I and type II error rates, fixed distribution index parameters, and confidence intervals are characteristic of which statistical approach?
3. Which statistical approach to clinical trial design is characterized by flexibility, updating knowledge as data accrue, and using the updated knowledge to dynamically guide patient randomization and direct trial conduct?
4. Batch effects in “omics” studies:
A May arise when sequencing runs are conducted on different days
B Must be accounted for in standardized laboratory protocols and analytic methods
1. Answer: D. The P value is not intuitive because it is conditioned on a hypothesis that seems unlikely to be true in view of the results (the null hypothesis) and because it depends on probabilities of possible outcomes that were not observed. P values are commonly misunderstood and misinterpreted.
2. Answer: B. Frequentist inference is based on the type I error rate (rejecting the null hypothesis when it is true) and the type II error rate (failing to reject the null hypothesis when it is false). The frequentist approach treats the unknown index (parameter) of the data distribution as a fixed value. In the Bayesian approach, such parameters are not fixed; they are treated as uncertain and described by probability distributions. The frequentist approach uses confidence intervals to represent the range of plausible values for a population parameter that the study is intended to estimate.
3. Answer: A. The Bayesian approach can be used in some clinical trial designs to guide adaptive randomization of patients to better-performing treatment arms and to dynamically direct the conduct of the trial according to the accumulating knowledge obtained from the patients already enrolled. The frequentist approach uses a fixed randomization scheme, a fixed sample size, and predetermined rules of trial conduct.
4. Answer: D. Batch effects are technological artifacts that are characteristic of the many processes involved in collecting, processing, and analyzing complex specimen samples and data. Batch effects arise from many sources, such as sample collection, preparation, and storage; reagents; assay technology; and instrumentation. Randomization and blinding must be used to avoid confounding true findings with batch effects. Statistical modeling can be used to adjust for some batch effects.
5. Answer: D. Specialized methods are required for processing raw “omics” data and converting them to a form that can be analyzed. Data normalization procedures, which correct for technological artifacts that may distort the measurements, are specific to the type of assay being analyzed and to the corrections needed for a given experiment. Determining the appropriate data normalization procedure requires expertise in the technology used to create the assay and in the mathematical methods incorporated in the normalization procedures.
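Answer 1 notes that a P value depends on the probabilities of outcomes that were not observed. A minimal sketch of that idea, assuming a hypothetical single-arm trial with a binary response and a one-sided exact binomial test (the function name, the trial numbers, and the 20% null response rate are illustrative assumptions, not taken from the chapter):

```python
from math import comb


def binomial_tail_p_value(successes: int, n: int, p_null: float) -> float:
    """One-sided exact P value: P(X >= successes) for X ~ Binomial(n, p_null).

    The sum runs over the observed count AND every more extreme count,
    i.e. over outcomes that did not actually occur in the trial.
    """
    return sum(
        comb(n, k) * p_null**k * (1 - p_null) ** (n - k)
        for k in range(successes, n + 1)
    )


# Hypothetical data: 12 responses among 30 patients, null response rate 20%.
p_value = binomial_tail_p_value(12, 30, 0.20)
print(f"one-sided P value = {p_value:.4f}")
```

Only the term for k = 12 corresponds to what was observed; the terms for k = 13 through 30 are hypothetical results, and the whole sum is computed under the assumption that the null hypothesis is true. It is therefore not the probability that the null hypothesis is true.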
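Answer 3 describes updating knowledge as data accrue and using it to guide randomization. One simple way to illustrate this is Thompson sampling with a Beta-Bernoulli model; this is a swapped-in textbook technique chosen for brevity, not a design described in the chapter. The function name, the uniform Beta(1, 1) priors, and the true response rates used to simulate patients are all assumptions made for illustration.

```python
import random


def adaptive_trial(true_rates, n_patients, seed=0):
    """Simulate Bayesian adaptive randomization (Thompson sampling).

    Each arm starts with a Beta(1, 1) prior on its response rate.
    Returns (successes, failures) per arm after n_patients.
    """
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    for _ in range(n_patients):
        # Draw one plausible response rate per arm from its current posterior,
        # Beta(1 + successes, 1 + failures).
        draws = [
            rng.betavariate(1 + s, 1 + f)
            for s, f in zip(successes, failures)
        ]
        # Assign the next patient to the arm with the highest draw, so an arm
        # is chosen with the posterior probability that it is the best arm.
        arm = max(range(len(draws)), key=draws.__getitem__)
        # Simulate the patient's outcome (hypothetical true rates).
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


s, f = adaptive_trial(true_rates=[0.2, 0.5], n_patients=100, seed=1)
for arm, (si, fi) in enumerate(zip(s, f)):
    print(f"arm {arm}: {si + fi} patients, {si} responses")
```

Allocation is not fixed in advance here: each outcome changes the posterior, which changes the assignment probabilities for the next patient. A real adaptive trial would add prespecified stopping rules and operating-characteristic simulations, which this sketch omits.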
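Answer 4 states that randomization must be used to avoid confounding true findings with batch effects. A minimal sketch of one way to do this, assuming samples are randomized to processing batches separately within each study group so that no batch is dominated by one group (the function name, sample identifiers, and group labels are hypothetical):

```python
import random
from collections import Counter


def assign_batches(sample_groups, n_batches, seed=0):
    """Randomly assign samples to batches, balanced within each group.

    sample_groups maps sample_id -> group label (e.g. "case" / "control").
    Returns a dict mapping sample_id -> batch index in range(n_batches).
    """
    rng = random.Random(seed)
    by_group = {}
    for sample_id, group in sample_groups.items():
        by_group.setdefault(group, []).append(sample_id)

    assignment = {}
    for group in sorted(by_group):
        ids = by_group[group]
        rng.shuffle(ids)                   # random order within the group
        offset = rng.randrange(n_batches)  # random starting batch
        for position, sample_id in enumerate(ids):
            # Deal samples round-robin so per-batch counts for this group
            # differ by at most one.
            assignment[sample_id] = (position + offset) % n_batches
    return assignment


# Hypothetical study: 12 cases and 12 controls run in 3 batches.
groups = {f"case{i}": "case" for i in range(12)}
groups.update({f"ctrl{i}": "control" for i in range(12)})
batches = assign_batches(groups, n_batches=3, seed=42)
print(Counter((groups[s], b) for s, b in batches.items()))
```

Because every batch contains the same mix of cases and controls, a shift that affects one whole batch cannot masquerade as a case-control difference. Batch membership can still be recorded and included as a covariate in later statistical modeling.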