Biostatistics

Designing a study includes developing good research question(s), choosing an appropriate methodology, estimating sample size, selecting data collection tools, and creating an analysis plan.

A good rule of thumb for a pilot study is to obtain 25% of the sample size needed for the full research study.

The sample size calculation depends on the hypothesis test you plan to use, the significance level (usually set at 5%), the desired power, and the effect size estimated from your pilot study. There are many formulas available for different research situations.
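
As an illustration, here is a minimal sketch for one common situation, comparing two group means with a two-sample t-test, using Python's statsmodels package; the effect size (Cohen's d = 0.5) is an assumed value that would normally come from your pilot study or the literature:

from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(
    effect_size=0.5,            # assumed medium effect (Cohen's d); use your pilot estimate
    alpha=0.05,                 # significance level
    power=0.80,                 # desired power
    alternative="two-sided",
)
print(f"Required sample size per group: {n_per_group:.0f}")   # about 64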

For surveys, Qualtrics has a sample size calculator.
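
Calculators like this typically implement the standard formula for estimating a proportion with a given margin of error, n = z^2 * p * (1 - p) / e^2. A quick sketch, assuming 95% confidence, a margin of error of plus or minus 5%, and the most conservative proportion p = 0.5:

import math

z = 1.96    # z-score for 95% confidence
p = 0.50    # assumed proportion; 0.5 gives the largest (most conservative) n
e = 0.05    # desired margin of error (plus or minus 5%)

n = math.ceil(z**2 * p * (1 - p) / e**2)
print(n)    # 385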

The appropriate analysis will depend on the type of data that you collected and the hypothesis you want to test or the research objective you want to address.

Many common statistical analyses, such as ANOVA and linear regression, assume that the data (or the model residuals) are normally distributed. However, non-parametric tests are distribution-free and do not require the normality assumption. Transformations (for example, a log transformation) can sometimes produce approximately normal data, but the transformed results are trickier to interpret.
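
For example, one common workflow is to check normality with a test such as Shapiro-Wilk and then choose between a parametric and a non-parametric test. A minimal sketch in Python, using hypothetical data for two independent groups:

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
group_a = rng.normal(5.0, 1.0, 30)     # hypothetical, roughly normal measurements
group_b = rng.exponential(5.0, 30)     # hypothetical, skewed measurements

# Shapiro-Wilk tests the null hypothesis that a sample came from a normal distribution.
normal_a = stats.shapiro(group_a).pvalue > 0.05
normal_b = stats.shapiro(group_b).pvalue > 0.05

if normal_a and normal_b:
    result = stats.ttest_ind(group_a, group_b)       # parametric two-sample t-test
else:
    result = stats.mannwhitneyu(group_a, group_b)    # non-parametric alternative
print(result)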

When your sample size is relatively large, the normality assumption may be relaxed. With a sufficiently large sample (roughly more than 200 per group), the Central Limit Theorem ensures that the sampling distribution of the mean (and of similar estimates) will approximate normality, even when the raw data do not.

When you have very small samples, it is important to check for a possible violation of the normality assumption.
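
A small simulation illustrates the Central Limit Theorem at work: the individual observations below are heavily skewed, but the means of repeated samples are nearly symmetric:

import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# 10,000 samples, each of size n = 200, from a heavily skewed (exponential) distribution
samples = rng.exponential(scale=1.0, size=(10_000, 200))

print(stats.skew(samples.ravel()))        # about 2: individual observations are skewed
print(stats.skew(samples.mean(axis=1)))   # near 0: sample means are roughly normal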

The level of significance, α, is the probability of a Type I error (rejecting the null hypothesis when it is true) that you are willing to accept, fixed before your data collection. It is usually set at 0.05 or 0.01, more rarely at 0.10.

The p-value is the probability, computed assuming the null hypothesis (no effect) is true, of obtaining a result at least as extreme as the one observed. When the p-value is less than α, the data provide evidence against the null hypothesis and the result is called ‘statistically significant’. The smaller the p-value, the stronger the evidence.

It is a good idea to support a p-value with a confidence interval (CI) for the estimate or effect size being tested. The ‘estimate of the effect’ you found applies only to the sample data you collected; the ‘true effect’ for the whole population may be different. A 95% CI is constructed so that, across repeated samples, 95% of such intervals will contain the true effect.
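
As a sketch, here is how a p-value and a 95% CI for a mean difference might be reported together in Python; the two groups are hypothetical, and the CI is computed with the pooled two-sample t method:

import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
treatment = rng.normal(10.5, 2.0, 40)   # hypothetical outcome scores
control = rng.normal(9.5, 2.0, 40)

t_stat, p_value = stats.ttest_ind(treatment, control)

# 95% CI for the mean difference, using the pooled standard error
n1, n2 = len(treatment), len(control)
diff = treatment.mean() - control.mean()
sp2 = ((n1 - 1) * treatment.var(ddof=1) + (n2 - 1) * control.var(ddof=1)) / (n1 + n2 - 2)
se = np.sqrt(sp2 * (1 / n1 + 1 / n2))
t_crit = stats.t.ppf(0.975, df=n1 + n2 - 2)

print(f"p-value: {p_value:.4f}")
print(f"difference: {diff:.2f}, 95% CI: ({diff - t_crit * se:.2f}, {diff + t_crit * se:.2f})")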

Power is the probability that the statistical test detects a real difference or effect, that is, correctly rejects the null hypothesis. It depends on the sample size: the larger the sample size, the greater the power.

It is important to calculate the sample size needed for sufficient power before you begin your data collection. When your sample size is small, your study might not be able to detect the difference or effect, even when it is real, because of a lack of power.
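
A short sketch, assuming a two-sample t-test and a medium effect size (Cohen's d = 0.5), shows how quickly power falls off at small sample sizes:

from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for n in (10, 20, 40, 64, 100):
    pw = analysis.power(effect_size=0.5, nobs1=n, alpha=0.05)
    print(f"n = {n:3d} per group -> power = {pw:.2f}")   # n = 64 gives roughly 0.80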

Statistical significance simply means that we reject the null hypothesis (no effect); it does not guarantee clinical significance. For example, a clinical trial may enrol hundreds of thousands of patients to compare a new anti-hypertension drug with the current one. Because of the large sample, the test may reject the null hypothesis that the two drugs are equivalent, even though the difference between them is relatively small and has no real clinical importance. The clinician should not blindly follow the results, but should combine professional judgement with the statistical evidence.
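
A toy simulation of this point (all numbers are illustrative): with 100,000 patients per arm, a true difference of only 0.3 mmHg is 'statistically significant' even though its effect size is negligible:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 100_000                                   # patients per arm
current_drug = rng.normal(140.0, 15.0, n)     # systolic blood pressure (mmHg)
new_drug = rng.normal(139.7, 15.0, n)         # true difference of only 0.3 mmHg

t_stat, p_value = stats.ttest_ind(new_drug, current_drug)
cohens_d = (new_drug.mean() - current_drug.mean()) / 15.0   # effect size in SD units

print(f"p-value: {p_value:.1e}")      # tiny, so 'statistically significant'
print(f"Cohen's d: {cohens_d:.3f}")   # near zero, so clinically trivial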

 

Useful online resources are available to help you get started.

UCalgary's Research Computing Services is available to help researchers with study design, interpretation of results, and writing up results for publication.