Friday, January 11, 2013

Genetic variation and neuroimaging: some ground rules for reporting research

Those who follow me on Twitter may have
noticed signs of tetchiness in my tweets over the past few weeks. In the course
of writing a review article, I’ve been reading papers linking genetic variants
to language-related brain structure and function. This has gone more slowly than I expected for
two reasons. First, the literature gets ever more complicated and technical:
both genetics and brain imaging involve huge amounts of data, and new methods
for crunching the numbers are developed all the time. If you really want to understand
a paper, rather than just assuming the Abstract is accurate, it can be a long,
hard slog, especially if, like me, you are neither a geneticist nor a
neuroimager. That’s understandable and perhaps unavoidable. The other reason,
though, is less acceptable. For all their complicated methods, many of the
papers in this area fail to tell the reader some important and quite basic
information. This is where the tetchiness comes in. Having burned my brains out
trying to understand what was done, I then realise that I have no idea about
something quite basic like the sample size. The initial assumption is that I’ve
missed it, and so I wade through the paper again, and the Supplementary Material, looking
for the key information. Only when I’m absolutely certain that it’s not there
am I reduced to writing to the authors for the information. So
this is a plea – to authors, editors and reviewers. If a paper is concerned
with an association between a genetic variant and a phenotype (in my case the
interest is in neural phenotypes, but I suspect this applies more widely) then
could we please ensure that the following information is clearly reported in
the Methods or Results section:

1. What genetic variant are we talking about?
You might think this is very simple, but it’s not: for instance, one of the
genes I’m interested in is CNTNAP2, which has been associated with a range of
neurodevelopmental disorders, especially those affecting language. The evidence
for a link between CNTNAP2 and developmental disorders comes from studies that
have examined variation in single-nucleotide polymorphisms or SNPs. These are
segments of DNA that are useful in revealing differences between people because
they are highly variable. DNA is composed of four bases, C, T, G, and A in
paired strands. So for instance, we might have a locus where some people have
two copies of C, some have two copies of T, and others have a C and a T. SNPs
are not necessarily a functional part of
the gene itself – they may be in a non-coding region, or so close to a gene that
variation in the SNP co-occurs with variation in the gene. Many different SNPs
can index the same gene. So for CNTNAP2, Vernes et al (2008) tested 38 SNPs,
ten of which were linked to language problems. So we have to decide which SNP
to study – or whether to study all of them. And we have to decide how to do the
analysis. For instance, SNP rs2710102 can take the form CC, CT or TT. We could
look for a dose response effect (CC < CT < TT) or we could compare CC/CT with TT, or we could compare CC with CT/TT. Which of these we do may depend on whether prior research suggests the genetic effect is additive or dominant, but for brain imaging studies grouping can also be dictated by practical considerations: it’s usual to compare just two groups and to combine genotypes to give a reasonable sample size. If you’ve followed me so far, and you have some background in statistics, you will already be starting to see why this is potentially problematic. If the researcher can select from ten possible SNPs, and two possible analyses, the opportunities for finding spuriously ‘significant’ results are increased. If there are no directional predictions – i.e. we are just looking for a difference between two groups, but don’t have a clear idea of what type of difference will be associated with ‘risk’ – then the number of potentially ‘interesting’ results is doubled.
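
To put rough numbers on this, here is a minimal sketch of how the chance of at least one spurious 'significant' result grows with the number of tests. It assumes every test is independent and run at alpha = .05. That is a simplification: SNPs in the same gene are often correlated, so the real inflation would be somewhat smaller. The function name is hypothetical, chosen for illustration.

```python
def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across n_tests independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Ten candidate SNPs and two ways of grouping genotypes, as in the text.
n_snps = 10
n_groupings = 2
for n_tests in (1, n_snps, n_snps * n_groupings):
    rate = familywise_error_rate(n_tests)
    print(f"{n_tests:2d} tests: P(at least one false positive) = {rate:.2f}")
```

Under these assumptions the rate rises from 5% for a single test to about 40% for ten tests and about 64% for twenty.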

For CNTNAP2, I found two papers that had
looked at brain correlates of SNP rs2710102. Whalley et al (2011) found that adults
with the CC genotype had different patterns of brain activation from CT/TT
individuals. However, the other study, by Scott-Van Zeeland et al (2010), treated
CC/CT as a risk genotype that was compared with TT. (This was not clear in the
paper, but the authors confirmed it was what they did.)

Four studies looked at another SNP -
rs7794745, on the basis that an increased risk of autism had been reported for
the T allele in males. Two of them (Tan et al, 2010; Whalley et al, 2011) compared TT vs TA/AA and two (Folia et al, 2011; Kos et al, 2012) compared
TT/TA with AA. In any case, the ground is rather cut from under the feet of
these researchers by a recent failure to replicate an association of this SNP
with autism (Anney et al, 2012).

2. Who are the participants? It’s not very
informative to just say you studied “healthy volunteers”. There are some types
of study where it doesn’t much matter how you recruited people. A study looking
at genetic correlates of cognitive ability isn’t one of them. Samples of
university students, for instance, are not representative of the general
population, and aren’t likely to include many people with significant language
problems.

3. How many people in the study had each type
of genetic variant? And if subgroup analyses are reported, how many people in
each subgroup had each type of genetic variant? I've found that papers in top-notch journals often fail to provide this basic
information.

Why is this important? For a start, the likelihood
of showing significant activation of a brain region will be affected by sample
size. Suppose you have 24 people with genotype A and 8 with genotype B. You
find significant activation of brain region X in those with genotype A, but not
for those with genotype B. If you don’t do an explicit statistical comparison
of groups (you should - but many people don’t) you may be misled into concluding that brain
activation is defective in genotype B – when in fact you just have low power to
detect effects in that group because it is so small.
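
The 24 versus 8 example can be sketched numerically. The code below uses a normal approximation to the power of a two-sided one-sample test, and it assumes the same true standardised effect (0.6, a made-up value) in both genotype groups. With samples this small a t-based calculation would give somewhat lower power, so treat the output as indicative only.

```python
from statistics import NormalDist


def approx_power(effect_size: float, n: int, alpha: float = 0.05) -> float:
    """Normal-approximation power of a two-sided one-sample test.

    effect_size is the true mean activation divided by its standard deviation.
    """
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    shift = effect_size * n ** 0.5
    # Probability that the test statistic lands beyond either critical value.
    return NormalDist().cdf(shift - z_crit) + NormalDist().cdf(-shift - z_crit)


ASSUMED_EFFECT = 0.6  # hypothetical standardised effect, identical in both groups
for label, n in (("genotype A", 24), ("genotype B", 8)):
    power = approx_power(ASSUMED_EFFECT, n)
    print(f"{label}: n = {n:2d}, approximate power = {power:.2f}")
```

Even though the effect is identical by construction, the larger group is roughly twice as likely to yield a 'significant' activation, which is why a significant result in one group and a nonsignificant one in the other says little about a group difference.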

In addition, if you don’t report the N, then
it’s difficult to get an idea of the effect size and confidence interval for
any effect that is reported. The reasons why reporting effect sizes and confidence intervals is preferable to relying on p-values alone are
well-articulated here. This issue has been much discussed in psychology, but seems not to have
permeated the field of genetics, where reliance on p-values seems the norm. In
neuroimaging it gets particularly complicated, because some form of correction
for ‘false discovery’ will be applied when multiple comparisons are conducted. It’s
often hard to work out quite how this was done, and you can end up staring at
a table that shows brain regions and p-values, with only a vague idea of how
big a difference there actually is between groups.
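
As an illustration of why the N matters, here is a sketch that turns a standardised group difference (Cohen's d) and the two group sizes into an approximate 95% confidence interval. It uses a common large-sample formula for the standard error of d, and the effect size of 0.8 is a hypothetical value, not one taken from any of the papers discussed.

```python
from statistics import NormalDist


def cohens_d_interval(d: float, n1: int, n2: int, level: float = 0.95):
    """Approximate confidence interval for Cohen's d from two group sizes."""
    # Large-sample standard error of d (a common approximation).
    se = ((n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))) ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return d - z * se, d + z * se


REPORTED_D = 0.8  # hypothetical effect size, for illustration only
for n1, n2 in ((24, 8), (240, 80)):
    low, high = cohens_d_interval(REPORTED_D, n1, n2)
    print(f"n = {n1} vs {n2}: d = {REPORTED_D}, 95% CI {low:.2f} to {high:.2f}")
```

With groups of 24 and 8 the interval stretches from around zero to well above 1.5, so the same nominal effect is far less informative than it would be with samples ten times the size. Without the N, a reader cannot make this judgement at all.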

Most of the SNPs that are being used in brain studies are ones that
were found to be associated with a behavioural phenotype in large-scale genomic
studies where the sample size would include hundreds if not thousands of
individuals, so small effects could be detected. Brain-based studies often use
sample sizes that are relatively small, but some of them find large, sometimes
very large, effects. So what does that mean? The optimistic interpretation is
that a brain-based phenotype is much closer to the gene effect, and so gives
clearer findings. This is essentially
the argument used by those who talk of ‘endophenotypes’ or ‘biomarkers’.
There is, however, an alternative, and much more pessimistic view, which is
that studies linking genotypes with brain measures are prone to generate false
positive findings, because there are too many places in the analysis pipeline
where the researchers have opportunities to pick and choose the analysis that
brings out the effect of interest most clearly. Neuroskeptic has a nice blogpost illustrating this well-known problem in
the neuroimaging area; matters are only made worse by uncertainty about how to classify SNP genotypes
(point 1).

A source of concern here is the
unpublishability of null findings. Suppose you did a study where you looked at,
say, 40 SNPs and a range of measures of brain structure, covering the whole
brain. After doing appropriate corrections for multiple comparisons, nothing is
significant. The sad fact is that your study is unlikely to find a home in a
journal. But is this right? After all, we don’t want to clutter up the
literature with a load of negative results. The answer depends on your sample
size, among other things. In a small sample, a null result might well reflect
lack of statistical power to detect a small effect. This is precisely why
people should avoid doing small studies: if you find nothing, it’s
uninterpretable. What we need are studies that allow us to say with confidence
whether or not there is a significant gene effect.

4. How do the genetic/neuroimaging results relate to cognitive measures in your sample? Your notion that ‘underactivation of brain area
X’ is an endophenotype that leads to poor language, for instance, doesn’t look
very plausible if people who have such underactivation have excellent language skills. Out
of five papers on CNTNAP2 that I reviewed, three made no mention of cognitive measures,
one gathered cognitive data but did not report how it related to genotype or
brain measures, and only one provided some relevant, though sketchy, data.

5. Report negative findings. The other kind of
email I’ve been writing to people is one that says – could you please clarify
whether your failure to report on the relationship between X and Y was because
you didn’t do that analysis, or whether you did the analysis but failed to find
anything. This is going to be an uphill battle, because editors and reviewers
often advise authors to remove analyses with nonsignificant findings. This is a
very bad idea as it distorts the literature.

And last of all....

A final plea is not so much to journal
editors as to press officers. Please be aware that studies of common SNPs aren't the same as studies of rare genetic mutations. The genetic variants in the
studies I looked at were all relatively common in the general population, and so
aren't going to be associated with major brain abnormalities. Sensationalised
press releases can only cause confusion:

This release on the Scott-Van Zeeland et al (2010) study described neuroimaging
findings from CNTNAP2 variants that are found in over 70% of the population. It claims that:

“A gene variant tied to autism rewires the
brain”

"Now we can begin to unravel the mystery
of how genes rearrange the brain's circuitry, not only in autism but in many
related neurological disorders."

“Regardless of their diagnosis, the children
carrying the risk variant showed a disjointed brain. The frontal lobe was
over-connected to itself and poorly connected to the rest of the brain”

"If we determine that the CNTNAP2
variant is a consistent predictor of language difficulties, we could begin to
design targeted therapies to help rebalance the brain and move it toward a path
of more normal development."

Only at the end of the press release are we
told that "One third of the population [sic: should be two thirds] carries this variant in its DNA.
It's important to remember that the gene variant alone doesn't cause autism, it
just increases risk."

References

Anney, R., Klei, L., Pinto, D., Almeida, J., Bacchelli, E., Baird, G., . . . Devlin, B. (2012). Individual common variants exert weak effects on the risk for autism spectrum disorders. Human Molecular Genetics, 21(21), 4781-4792. doi: 10.1093/hmg/dds301

Folia, V., Forkstam, C., Ingvar, M., Hagoort, P., & Petersson, K. M. (2011). Implicit artificial syntax processing: Genes, preference, and bounded recursion. Biolinguistics, 5.

Kos, M., et al. (2012). CNTNAP2 and language processing in healthy individuals as measured with ERPs. PLOS One, 7.

Scott-Van Zeeland, A., Abrahams, B., Alvarez-Retuerto, A., Sonnenblick, L., Rudie, J., Ghahremani, D., Mumford, J., Poldrack, R., Dapretto, M., Geschwind, D., & Bookheimer, S. (2010). Altered Functional Connectivity in Frontal Lobe Circuits Is Associated with Variation in the Autism Risk Gene CNTNAP2. Science Translational Medicine, 2(56), 56-56. doi: 10.1126/scitranslmed.3001344

Tan, G. C., Doke, T. F., Ashburner, J., Wood, N. W., & Frackowiak, R. S. (2010). Normal variation in fronto-occipital circuitry and cerebellar structure with an autism-associated polymorphism of CNTNAP2. Neuroimage, 53, 1030.

Vernes, S. C., Newbury, D. F., Abrahams, B., Winchester, L., Nicod, J., Groszer, M., . . . Fisher, S. (2008). A functional genetic link between distinct developmental language disorders. New England Journal of Medicine, 359, 2337-2345.

Whalley, H. C., et al. (2011). Genetic variation in CNTNAP2 alters brain function during linguistic processing in healthy individuals. Am. J. Med. Genet. B, 156B, 941.

Sunday, July 15, 2012

The devaluation of low-cost psychological research

Psychology encompasses a wide range of subject areas,
including social, clinical and developmental psychology, cognitive psychology
and neuroscience. The costs of doing different types of psychology vary hugely.
If you just want to see how people remember different types of material, for
instance, or test children's understanding of numerosity, this can be done at very
little cost. For most of the psychology I did as an undergraduate, data
collection did not involve complex equipment, and data analysis was pretty
straightforward - certainly well within the capabilities of a modern desktop
computer. The main cost for a research proposal in this area would be for staff
to do data collection and analysis. Neuroscience, however, is a different
matter. Most kinds of brain imaging require not only expensive equipment, but
also a building to house it and staff to maintain it, and all or part of these
costs will be passed on to researchers. Furthermore, data analysis is usually
highly technical and complex, and can take weeks, or even months, rather than
hours. A project that involves neuroimaging will typically cost orders of
magnitude more than other kinds of psychological research.

In academic research, money follows money. This is quite
explicit in funding systems that reward an institution in proportion to its
research income. This makes sense: an institution that is doing costly research
needs funding to support the infrastructure for that research. The problem is
that the money, rather than the research, can become the indicator of success. Hiring
committees will scrutinise CVs for evidence of ability to bring in large
grants. My guess is that, if choosing between one candidate with strong
publications and modest grant income vs. another with less influential
publications and large grant income, many would favour the latter.
Universities, after all, have to survive in a tough financial climate, and so
we are all exhorted to go after large grants to help shore up our institution's
income. Some Universities have even taken to firing people who don't bring in
the expected income. This means that cheap, cost-effective research in
traditional psychological areas will be devalued relative to more expensive
neuroimaging.

I have no quarrel, in principle, with psychologists doing
neuroimaging studies - some of my best friends are neuroimagers - and it is important that, if good science is to be done in
this area, it should be properly funded. I am uneasy, though, about an
unintended consequence of the enthusiasm for neuroimaging, which is that it has
led to a devaluation of the other kinds of psychological research. I've been
reading Thinking, Fast and Slow,
by Daniel Kahneman, a psychologist who has the rare distinction of
being a Nobel Laureate. This is just one example of a psychologist who has made major advances without using brain scanners. I couldn't help thinking that Kahneman would not fare
well in the current academic climate, because his experiments were simple,
elegant ... and inexpensive.

I've suggested previously that systems of academic rewards
need to be rejigged to take into account not just research income and
publication outputs, but the relationship between the two. Of course, some
kinds of research require big bucks, but large-scale grants are not always
cost-effective. And on the other side of the coin, there are people who do
excellent, influential work on a small budget.

I thought I'd see if it might be possible to get some hard
data on how this works in practice. I used data for Psychology Departments from
the last Research Assessment Exercise (RAE), from this website, and matched
this up against citation counts for publications that came out in the same time
period (2000-2007) from Web of Knowledge. The latter is a bit tricky, and I'm
aware that figures may contain inaccuracies, as I had to search by address,
using the name of the institution coupled with the words Psychology and UK. This will miss articles that don't have these words in the address. Also when double-checking the numbers, I found that for a search by address, results can fluctuate from one occasion to the next. For these reasons, I'd urge readers to treat the results with caution, and
I won't refer to institutions by name. Note too that though I restrict consideration to articles published from 2000 to 2007, the citations extend
beyond the period when the RAE was completed. Web of Knowledge helpfully gives
you an H-index for the institution if you ask for a citation report, and this
is what I report here, as it is more stable across repeated searches than the citation count. Figure 1 shows how research income for a department
relates to its H-index, just for those institutions deemed research active,
which I defined as having a research income of at least £500K over the reporting
period. The overall RAE rating is colour-coded into bandings, and the symbol denotes
whether or not the departmental submission mentions neuroimaging as an
important part of its work.

Figure 1. Data from RAE and Web of Knowledge: treat with caution!
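
For readers unfamiliar with the H-index, the sketch below shows the usual definition: the largest h such that h publications each have at least h citations. The citation counts are made up for two hypothetical departments and are not drawn from the RAE or Web of Knowledge data.

```python
def h_index(citation_counts):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, citations in enumerate(ranked, start=1):
        if citations >= rank:
            h = rank
        else:
            break
    return h


# Made-up citation counts for two hypothetical departments.
few_big_hits = [400, 350, 12, 3, 1, 0]
steady_output = [40, 38, 35, 30, 22, 9]
print(h_index(few_big_hits), h_index(steady_output))
```

The first list has far more citations in total but the lower H-index, because the index is driven by the number of well-cited papers and not by a handful of very highly cited ones. That property is one plausible reason why it is less sensitive than the raw citation count to which articles a given search happens to pick up.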

Several features are seen in these data, and most are
unsurprising:

Research income and H-index are positively correlated, r =
.74 (95%CI .59-.84) as we would expect. Both variables are correlated with the
number of staff entered in the RAE, but the correlation between them remains
healthy when this factor is partialled out, r = .61 (95%CI .40-.76).

Institutions coded as doing neuroimaging have bigger grants: after taking into account differences in number of staff, the mean income
for departments with neuroimaging was £7,428K and for those without it was
£3,889K (difference significant at p = .01).

Both research income and H-index are predictive of RAE
rankings: the correlations are .68 (95% CI .50-.80) for research income and .79
(95% CI .66-.87) for H-index, and together they account for 80% of the variance
in rankings. We would not expect perfect prediction, given that the RAE committee
went beyond metrics to assess aspects of research quality not
reflected in citations or income. And in addition, it must be noted that the
citations counted here are for all researchers at a departmental address, not
just those entered in the RAE.
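
Confidence intervals for correlations like those quoted above can be obtained in several ways; one standard route is the Fisher z transform, sketched below. The number of departments is not stated here, so the value of 55 is purely an assumption chosen for illustration.

```python
import math
from statistics import NormalDist


def correlation_interval(r: float, n: int, level: float = 0.95):
    """Confidence interval for a Pearson correlation via the Fisher z transform."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


ASSUMED_N = 55  # hypothetical number of departments; not reported in the text
low, high = correlation_interval(0.74, ASSUMED_N)
print(f"r = .74, n = {ASSUMED_N}: 95% CI {low:.2f} to {high:.2f}")
```

With this assumed n the interval comes out close to the .59-.84 quoted for income and H-index, but that is a back-calculation under the stated assumption, not a figure from the data.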

A point of concern to me in these data, though, is the wide
spread in H-index seen for those institutions with the highest levels of grant
income. If these numbers are accurate, some departments are using their
substantial income to do influential work, while others seem to achieve no more
than other departments with much less funding. There may be reasonable
explanations for this - for instance, a large tranche of funding may have been
awarded in the RAE period but not had time to percolate through to
publications. But nevertheless, it adds to my concern that we may
be rewarding those who chase big grants without paying sufficient attention to
what they do with the funding when they get it.

What, if anything, should we do about this? I've toyed in
the past with the idea of a cost-efficiency metric (e.g. citations divided by
grant income), but this would not work as a basis for allocating funds, because
some types of research are intrinsically more expensive than others. In
addition, it is difficult to get research funding, and success in this arena is
in itself an indicator that the researchers have impressed a tough committee of
their peers. So, yes, it makes sense to treat level of research funding as one indicator
of an institution's research excellence when rating departments to determine
who gets funding. My argument is simply that we should be aware of the
unintended consequences if we rely too heavily on this metric. It would be nice
to see some kind of indicator of cost-effectiveness included in ratings of
departments alongside the more traditional metrics. In times of financial
stringency, it is particularly short-sighted to discount the contribution of
researchers who are able to do influential work with relatively scant
resources.