Published online before print June 12, 2014, doi: 10.1681/ASN.2014040410


For more than two centuries, the measurement of urinary proteins has been a standard tool for nephrologists to diagnose kidney disease. Renal disease is characterized by changes in the glomerular filtration barrier that lead to increased urinary excretion of proteins, chiefly albumin. This was later called macroalbuminuria and appeared to be associated with an increased risk of progressive renal function loss. 1 In the 1980s, novel methods were introduced that enabled the measurement of small quantities of albumin in the urine, called microalbuminuria. The introduction of these techniques encouraged endocrinologists and diabetologists to measure urinary albumin in their patients with diabetes (at that time, predominantly patients with type 1 diabetes). Of note, these small quantities of albumin also predicted the risk of developing progressive renal disease in patients with type 1 diabetes. 2 In recent decades, attention to albuminuria as a predictor of renal risk has shifted mainly to patients with type 2 diabetes, likely because of the increasing worldwide prevalence of this disease. Studies in populations with type 2 diabetes confirmed the important role of macroalbuminuria in the progression of renal disease, as also found in nondiabetic renal disease. 3 In addition, the success of renoprotective interventions was shown to be associated with, and often to depend on, the albuminuria-lowering effect: the greater the reduction in albuminuria in the first months of treatment, the greater the subsequent renal risk reduction. 4,5

The study by de Boer et al. in this issue of JASN brings the role and importance of macroalbuminuria in the type 1 diabetes population to our attention and provides important insights into the long-term renal outcomes in a contemporary cohort of patients with type 1 diabetes and macroalbuminuria. 6 This study analyzed data from the Diabetes Control and Complications Trial (DCCT) and its observational follow-up, the Epidemiology of Diabetes Interventions and Complications (EDIC) study. The analysis included 159 individuals who developed incident macroalbuminuria over a 25-year follow-up period. Such long-term follow-up yields a wealth of data from which much can be learned. First, de Boer et al. showed that despite the improvement in treatments over the years, the incidence of progression to macroalbuminuria is still high and is surprisingly similar to that reported in type 2 diabetes. Second, de Boer et al. found that macroalbuminuria is associated with a marked risk of renal disease progression, as evidenced by an eGFR change of −5.4 ml/min per 1.73 m² per year and a high risk of developing an eGFR <60 ml/min per 1.73 m². Because these risk data resemble those observed in type 2 diabetes, we cannot escape the notion that the treatment of type 1 diabetes should receive more attention, particularly in the prevention of macroalbuminuria. In addition, data on patients with type 2 diabetes, macroalbuminuria, and compromised eGFR show an extremely high risk of ESRD (which leads to death in the absence of dialysis or transplantation) or death, a risk higher than that of all treated cancers. 7 de Boer et al. report that, over time, the use of renin-angiotensin-aldosterone system inhibition (RAASi) increased, macroalbuminuria regressed, and the risk of renal disease progression decreased. These findings are intuitively compelling and could lead to the conclusion that RAASi may have played a role either by preventing macroalbuminuria or, where macroalbuminuria was present, by exerting renoprotective effects through lowering albuminuria and slowing eGFR decline.


A randomized controlled trial (RCT) is a study design that randomly assigns participants to an experimental group or a control group. As the study is conducted, the only expected difference between the control and experimental groups is the outcome variable being studied.



Design pitfalls to look out for

An RCT should be a study of one population only.

The variables being studied should be the only variables that differ between the experimental group and the control group.

Fictitious Example

To determine how a new type of short wave UVA-blocking sunscreen affects the general health of skin in comparison to a regular long wave UVA-blocking sunscreen, 40 trial participants were randomly separated into equal groups of 20: an experimental group and a control group. All participants' skin health was then initially evaluated. The experimental group wore the short wave UVA-blocking sunscreen daily, and the control group wore the long wave UVA-blocking sunscreen daily.

After one year, the general health of the skin was measured in both groups and statistically analyzed. In the control group, wearing long wave UVA-blocking sunscreen daily led to improvements in general skin health for 60% of the participants. In the experimental group, wearing short wave UVA-blocking sunscreen daily led to improvements in general skin health for 75% of the participants.
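The fictitious example does not say how the groups were randomized or how the results were "statistically analyzed". As a minimal sketch, assuming a simple shuffle-and-split allocation and a two-sided Fisher exact test (both are illustrative choices, not methods named in the example), the design and the comparison of 15/20 versus 12/20 improvers might look like this. The function names `randomize` and `fisher_exact_two_sided` are hypothetical.

```python
import random
from math import comb


def randomize(participants, seed=None):
    """Shuffle the participants and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more probable than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )


# 40 hypothetical participant IDs, allocated at random to two groups of 20.
experimental, control = randomize(range(1, 41), seed=1)

# Counts taken from the example: 75% of 20 and 60% of 20 improved.
improved_experimental = 15
improved_control = 12

p_value = fisher_exact_two_sided(
    improved_experimental, 20 - improved_experimental,
    improved_control, 20 - improved_control,
)
print(len(experimental), len(control), round(p_value, 2))
```

Under these assumptions the p-value comes out at roughly 0.5, so with only 20 participants per group the 15-percentage-point difference would not, on its own, count as statistically significant at conventional thresholds.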

Real-life Examples

Ensrud, K. E., Stock, J. L., Barrett-Connor, E., Grady, D., Mosca, L., Khaw, K., et al. (2008). Effects of raloxifene on fracture risk in postmenopausal women: The raloxifene use for the heart trial. 112-120.

Müller, O., Traoré, C., Kouyaté, B., Yé, Y., Frey, C., Coulibaly, B., et al. (2006). Effects of insecticide-treated bednets during early infancy in an African area of intense malaria transmission: A randomized controlled trial. 120-126.

A study is blinded when the groups that have been randomly selected from a population do not know whether they are in the control group or the experimental group.

Causation means being able to show that an independent variable directly causes the dependent variable. This is generally very difficult to demonstrate in most study designs.

Confounding variables are variables that cause or prevent an outcome outside of, or along with, the variable being studied. These variables make it difficult or impossible to distinguish the relationship between the variable and the outcome being studied.

In order to explain randomization’s eminent role, one may refer to the logic of the experiment, largely based on J. S. Mill’s method of difference [ 4 ]: If one compares two groups of subjects (Treatment T versus Control C, say) and observes a salient contrast in the end, that difference must be due to the experimental manipulation, IF the groups were equivalent at the very beginning of the experiment.

In other words, since the difference between treatment and control (i.e. the experimental manipulation) is the only perceivable reason that can explain the variation in the observations, it must be the cause of the observed effect (the difference in the end). The situation is quite altered, however, if the two groups already differed substantially at the beginning. Then (see Table 1 below), there are two possible explanations of an effect:

Table 1. Mill’s logic.

Thus, for the logic of the experiment, it is of paramount importance to ensure equivalence of the groups at the beginning of the experiment. The groups, or even the individuals involved, must not be systematically different; one has to compare like with like. Alas, in the social sciences exact equality of units, e.g. human individuals, cannot be achieved. Therefore one must settle for comparable subjects or groups (T ≈ C).
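The two explanations that Mill's logic has to separate can be made concrete with a toy calculation. The sketch below assumes a purely additive model (final outcome = baseline level + treatment effect), which is an illustrative assumption and not part of the argument above. The function name and all numbers are hypothetical.

```python
def observed_difference(baseline_t, baseline_c, treatment_effect):
    """End-of-experiment contrast between T and C under an additive model."""
    outcome_t = baseline_t + treatment_effect  # treated group
    outcome_c = baseline_c                     # control group, no treatment
    return outcome_t - outcome_c


# Case 1: groups equivalent at the start, real treatment effect of 5.
equal_start = observed_difference(50.0, 50.0, 5.0)

# Case 2: groups differ by 5 at the start, treatment does nothing.
unequal_start = observed_difference(55.0, 50.0, 0.0)

print(equal_start, unequal_start)
```

Both cases produce the same contrast at the end of the experiment. Only when the groups are known to be equivalent at the start can that contrast be attributed to the manipulation.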

2.1 Defining comparability

In practice, it is straightforward to define comparability with respect to the features or properties of the experimental units involved. In a typical experimental setup, statistical units (e.g. persons) are represented by their corresponding vectors of attributes (properties, variables) such as gender, body height, age, etc.

If the units are almost equal in as many properties as possible, they should be comparable, i.e., the remaining differences shouldn’t alter the experimental outcome substantially. However, since, in general, vectors have to be compared, there is not a single measure of similarity. Rather, there are quite a lot of measures available, depending on the kind of data at hand. An easily accessible and rather comprehensive overview may be found here:

As an example, suppose a unit is represented by a binary vector $\mathbf{a} = (a_1, \ldots, a_m)$. The Hamming distance $d(\cdot,\cdot)$ between two such vectors is the number of positions at which the corresponding symbols are different. In other words, it is the minimum number of substitutions required to change one vector into the other. Let $\mathbf{a}_1 = (0,0,1,0)$, $\mathbf{a}_2 = (1,1,1,0)$, and $\mathbf{a}_3 = (1,1,1,1)$. Therefore $d(\mathbf{a}_1, \mathbf{a}_2) = 2$, $d(\mathbf{a}_1, \mathbf{a}_3) = 3$, $d(\mathbf{a}_2, \mathbf{a}_3) = 1$, and $d(\mathbf{a}_i, \mathbf{a}_i) = 0$. Having thus calculated a reasonable number for the “closeness” of two experimental units, one next has to consider what level of deviance from perfect equality may be tolerable.
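The Hamming distance described above takes only a few lines to compute. This is a minimal sketch using the three example vectors from the text; the variable names `a1`, `a2`, `a3` and the function name are labels chosen here for illustration.

```python
def hamming_distance(u, v):
    """Number of positions at which two equal-length vectors differ."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(u, v))


# The three binary vectors from the example.
a1 = (0, 0, 1, 0)
a2 = (1, 1, 1, 0)
a3 = (1, 1, 1, 1)

print(hamming_distance(a1, a2))  # first two positions differ
print(hamming_distance(a1, a3))  # positions 1, 2 and 4 differ
print(hamming_distance(a2, a3))  # only the last position differs
print(hamming_distance(a1, a1))  # a vector has distance 0 to itself
```

The function only counts mismatches. Deciding how large a distance still counts as "comparable" remains a judgement that the code cannot make.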

Due to this, coping with similarities is a tricky business. Typically many properties (covariates) are involved and conscious (subjective) judgement seems to be inevitable. An even more serious question concerns the fact that relevant factors may not have been recorded or might be totally unknown. In the worst case, similarity with respect to some known factors has been checked, but an unnoticed nuisance variable is responsible for the difference between the outcome in the two groups.

Moreover, comparability depends on the phenomenon studied. A clearly visible difference, such as gender, is likely to be important with respect to life expectancy, and can influence some physiological and psychological variables such as height or social behaviour, but it is independent of skin color or blood type. In other words, experimental units do not need to be twins in any respect; it suffices that they be similar with respect to the outcome variable under study.

Given a unique sample it is easy to think about a reference set of other samples that are alike in all relevant respects to the one observed. However, even Fisher could not give these words a precise formal meaning [ 5 ]. Thus De Finetti [ 6 ] proposed exchangeability, i.e. “instead of judging whether two groups are similar, the investigator is instructed to imagine a hypothetical exchange of the two groups … and then judge whether the observed data under the swap would be distinguishable from the actual data” (see [ 7 ], p. 196). Barnard [ 8 ] gives some history on this idea and suggests the term permutability, “which conveys the idea of replacing one thing by another similar thing.” Nowadays, epidemiologists say that “the effect of treatment is unconfounded if the treated and untreated groups resemble each other in all relevant features” [ 7 ], p. 196.
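A related, though not identical, computational idea is the permutation test. If the group labels really were interchangeable, every relabelling of the pooled observations would be as plausible as the one observed. The sketch below is an illustration under that assumption, not De Finetti's or Barnard's own procedure. It enumerates all relabellings of a small hypothetical data set and counts how often the swapped data look at least as extreme as the actual data. All values and the function name are made up for the example.

```python
from itertools import combinations


def permutation_p_value(treated, control):
    """Exact two-sided permutation p-value for the difference in means."""
    pooled = list(treated) + list(control)
    n_total = len(pooled)
    n_treated = len(treated)
    total = sum(pooled)

    def mean_difference(treated_indices):
        sum_t = sum(pooled[i] for i in treated_indices)
        return sum_t / n_treated - (total - sum_t) / (n_total - n_treated)

    # The observed labelling puts the first n_treated pooled values in T.
    observed = abs(mean_difference(range(n_treated)))

    relabellings = 0
    at_least_as_extreme = 0
    for indices in combinations(range(n_total), n_treated):
        relabellings += 1
        if abs(mean_difference(indices)) >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / relabellings


# Hypothetical outcomes: every treated value exceeds every control value.
treated = [12.0, 14.0, 15.0, 17.0]
control = [8.0, 9.0, 10.0, 11.0]

p = permutation_p_value(treated, control)
print(p)
```

With four units per group there are 70 possible relabellings, and only the observed split and its mirror image are this extreme, so the swapped data would usually be distinguishable from the actual data.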

2.2 Experimental techniques to achieve comparability

There are a number of strategies to achieve comparability. Starting with the experimental units, it is straightforward to match similar individuals, i.e., to construct pairs of individuals that are alike in many (most) respects. Looking at the group level (T and C), another straightforward strategy is to balance all relevant variables when assigning units to groups. Many approaches of this kind are discussed in [ 9 ], minimization being the most prominent among them. Treasure and MacRae [ 10 ] explain:

“In our study of aspirin versus placebo … we chose age, sex, operating surgeon, number of coronary arteries affected, and left ventricular function. But in trials in other diseases those chosen might be tumour type, disease stage, joint mobility, pain score, or social class.”

However, apart from being cumbersome and relying on the experimenter’s expertise (in particular in choosing and weighing the factors), these strategies are always open to the criticism that unknown nuisance variables may have had a substantial impact on the result. Therefore Fisher [ 1 ], pp. 18–20, advised strongly against treating every conceivable factor explicitly. Instead, he taught that “the random choice of the objects to be treated in different ways [guarantees] the validity of the test of significance … against corruption by the causes of disturbance which have not been eliminated.” More explicitly, Berger [ 11 ], pp. 9–10, explains:

