When two clinicians independently rate a subject according to a categorical scale, we might be interested in estimating the probability that the 2 clinicians are in exact agreement.
For example, suppose that 100 subjects were categorized by each of 2 physicians as having either mild, moderate, severe or no impairment in shoulder mobility. The results of this hypothetical study are tabulated in the table below.
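As a rough illustration of what "exact agreement" means in practice, the Python sketch below computes the observed proportion of exact agreement from a 4 × 4 cross-tabulation, together with Cohen's kappa as one common chance-corrected companion. The counts are invented purely for illustration and are not the results of the hypothetical study described above. The function names are assumptions as well.

```python
# Rows: physician 1's rating; columns: physician 2's rating.
# Category order: none, mild, moderate, severe.
# NOTE: these counts are made up for illustration only (they sum to 100).
example_table = [
    [20, 5, 0, 0],
    [4, 25, 6, 0],
    [1, 5, 18, 4],
    [0, 0, 3, 9],
]


def observed_agreement(table):
    """Proportion of subjects placed in the same category by both raters."""
    n = sum(sum(row) for row in table)
    agree = sum(table[i][i] for i in range(len(table)))
    return agree / n


def cohen_kappa(table):
    """Agreement beyond chance, using the row and column marginal totals."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_obs = observed_agreement(table)
    p_exp = sum(row_tot[i] * col_tot[i] for i in range(k)) / n ** 2
    return (p_obs - p_exp) / (1 - p_exp)


print(observed_agreement(example_table))
print(cohen_kappa(example_table))
```

The diagonal cells hold the subjects on whom the two physicians agree exactly, so the observed agreement is simply the diagonal total divided by the number of subjects.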
The intraclass correlation coefficient (ICC) is a useful descriptive statistic for measuring the strength of the relationship between the observations within a class or group. It is often used, for example, to assess reliability in interrater reliability studies.
When it comes to calculating the ICC, there are a number of formulas available from which to choose. Fisher (1954) discusses a product-moment estimator, Shrout and Fleiss (1979) present estimators for several linear (ANOVA) models that assume normality, and Rothery (1979) suggests a nonparametric version of the ICC. Which is the best one to use?
Since applying an incorrect or inappropriate ICC formula can lead to erroneous conclusions, it’s important to spend some time identifying what is and is not appropriate for any particular study.
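To make the idea of an ANOVA-based estimator concrete, here is a minimal Python sketch of just one of the forms presented by Shrout and Fleiss (1979): the one-way random effects ICC for a single rating, often written ICC(1,1). It is shown only as an example of how such an estimator is built from mean squares, not as the form that suits any particular study. The function name and the ratings are made up for illustration.

```python
def icc_oneway(ratings):
    """One-way random effects ICC for a single rating, ICC(1,1).

    ratings: list of classes (e.g. subjects), each a list of k ratings.
    Assumes every class has the same number of ratings.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand_mean = sum(sum(r) for r in ratings) / (n * k)
    class_means = [sum(r) / k for r in ratings]

    # Between-class and within-class mean squares from a one-way ANOVA
    ms_between = k * sum((m - grand_mean) ** 2 for m in class_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for r, m in zip(ratings, class_means) for x in r
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical data: 5 subjects, each rated by 4 raters
example_ratings = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]
print(icc_oneway(example_ratings))
```

Other designs (for example, the same fixed set of raters assessing every subject) call for different mean squares and a different formula, which is exactly why the choice of ICC matters.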
There are many free sample size and/or power “calculators” available on the internet, aimed at assisting researchers in planning the number of subjects to include in studies. Not all are of equal quality, and before deciding to rely on one of these free calculators, be sure of what you’re getting.
What to look for:
Statistical method(s) stated clearly: Using information provided on the sample size calculator website, you should be able to determine what statistical method(s) form the basis for the sample size/power calculations provided. This is essential information. The statistical methods used for estimating the sample size/power must match the researcher’s planned statistical methods for analysing the study data at the time of study completion.
Reference provided: Ensure that the website provides a trusted, published reference for the statistical methods used to calculate sample size/power. Peer-reviewed journal articles are usually reliable sources of information, as are statistical encyclopaedias.
Assumptions: In addition to knowing what statistical method(s) are being used, it’s important to be aware of the assumptions necessary for the methods to be appropriate. For example, is it necessary to assume a normal distribution, equal variances across groups, or other conditions?
Validation: Does the calculator website describe what steps were taken to ensure that the sample size/power calculator actually does what it’s supposed to do? If not, contact the website owner to get an answer to this question.
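To see why the underlying method matters, consider the following Python sketch of one textbook calculation: the normal-approximation sample size per group for comparing two independent means with a two-sided test, n = 2(z₁₋α/₂ + z_power)²σ²/δ². This is an assumption about what a calculator might do, not a description of any specific website. The function name and the input values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group for two independent means.

    delta: difference in means to detect
    sigma: common standard deviation (assumes equal variances and normality)
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided test
    z_power = std_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sigma / delta) ** 2)


# Hypothetical inputs: detect a 5-point difference when the SD is 10
print(n_per_group(delta=5, sigma=10))  # 63
```

A calculator based on the t distribution rather than the normal approximation will typically return a slightly larger number for the same inputs, so two reputable tools can legitimately disagree unless their methods and assumptions are stated.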
Does your Bland-Altman plot look more like a sloped regression line than a flat, horizontal line? Does it look more like Figure B than Figure A? Then read on for a solution to the problem.
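One simple way to check numerically whether the differences trend with the averages is to fit a least-squares line of the differences on the averages and look at its slope. The Python sketch below does this alongside the usual bias and 95% limits of agreement. It is only a diagnostic under stated assumptions, not the solution this post goes on to describe. The function name, the direction of the difference (new minus old) and the data are illustrative.

```python
from statistics import mean, stdev


def bland_altman_summary(new, old):
    """Bias, 95% limits of agreement, and slope of difference on average."""
    diffs = [n - o for n, o in zip(new, old)]
    avgs = [(n + o) / 2 for n, o in zip(new, old)]

    bias = mean(diffs)
    sd = stdev(diffs)

    # Least-squares slope of the differences regressed on the averages
    avg_mean = mean(avgs)
    sxy = sum((a - avg_mean) * (d - bias) for a, d in zip(avgs, diffs))
    sxx = sum((a - avg_mean) ** 2 for a in avgs)

    return {
        "bias": bias,
        "lower_loa": bias - 1.96 * sd,
        "upper_loa": bias + 1.96 * sd,
        "slope": sxy / sxx,
    }


# Hypothetical measurements where the new method reads proportionally higher
old_values = [10.0, 20.0, 30.0, 40.0, 50.0]
new_values = [11.0, 22.5, 33.0, 44.5, 55.0]
print(bland_altman_summary(new_values, old_values))
```

A slope near zero corresponds to the flat pattern, while a clearly non-zero slope suggests that the size of the difference depends on the magnitude of the measurement.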
ANOVA is often the statistical method used to analyze the results of a clinical trial with independent treatment groups, when the assumption of normality is reasonable. Normality is not the only assumption that must be satisfied, though: all treatment/intervention groups must have the same variance. There are several useful approaches available for determining whether the assumption of equal variances is met:
- graphic / visual check of treatment group variability
- use descriptive statistics to summarize treatment group variability
- hypothesis tests about a possible common variance
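The second and third approaches in the list above can be sketched in Python as follows. The descriptive step reports each group's sample variance. The test step computes the Brown-Forsythe form of Levene's statistic (absolute deviations from the group medians), chosen here only as one common example of such a test. The function names and the data are hypothetical, and the statistic would still need to be compared against an F distribution with (k − 1, N − k) degrees of freedom to obtain a p-value.

```python
from statistics import mean, median, variance


def group_variances(groups):
    """Sample variance of each treatment group."""
    return {name: variance(values) for name, values in groups.items()}


def brown_forsythe_statistic(groups):
    """Levene-type statistic based on absolute deviations from group medians."""
    deviations = [
        [abs(x - median(values)) for x in values] for values in groups.values()
    ]
    k = len(deviations)
    n_total = sum(len(g) for g in deviations)
    grand_mean = sum(sum(g) for g in deviations) / n_total

    # One-way ANOVA F ratio computed on the absolute deviations
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in deviations) / (k - 1)
    within = sum((x - mean(g)) ** 2 for g in deviations for x in g) / (n_total - k)
    return between / within


# Hypothetical outcome data for three treatment groups
example_groups = {
    "placebo": [12.1, 14.3, 13.8, 15.0, 12.9],
    "low dose": [11.5, 16.2, 13.0, 17.8, 10.9],
    "high dose": [9.8, 19.5, 14.1, 21.0, 8.7],
}
print(group_variances(example_groups))
print(brown_forsythe_statistic(example_groups))
```

Large differences among the group variances, or a large value of the statistic relative to its F reference distribution, would cast doubt on the equal variance assumption.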
In many clinical studies, clustering is present in the data, either by nature or by design. If it cannot be reasonably assumed that the observations within a cluster are independent of one another, then the relationships within the cluster might be described using an intraclass correlation coefficient. More on that later.
First, consider a few different study designs where observations within a cluster, or class, might not be independent of each other:
- each spinal injury patient is assessed for level of hand function by four independent healthcare professionals (interrater reliability study; the 4 observations on a given subject form a class)
- school teachers record their personal physical activity over a 7-day period (repeated measures with compound symmetry; the 7 observations recorded by a specific teacher form a class)
- a case-control study of breast cancer used age to match a cancer patient with a related non-cancer patient and an unrelated non-cancer patient (case-control study; each group of 3 patients matched on age is a class)
- dementia clinics were randomized to one of two exercise regimens to determine their impact on dementia: all patients attending a particular clinic were assigned to the same exercise program (cluster randomized clinical trial: patients attending a certain clinic form a class)
You should be starting with a dataset (suppose it’s called mydata) that contains two measurements/assessments to be compared for each subject. Here the 2 assessments will be referred to as new and old.
Calculate the difference between the 2 variables, and the average of the two. There are a number of ways to do this, but here is how to do it using a data step:
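data diffs ;
set mydata ;
/* calculate the difference (new minus old; the name diff is illustrative) */
diff = new - old ;
/* calculate the average (the name avg is illustrative) */
avg = (new + old) / 2 ;
run ;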