789share: funding


Sunday, January 12, 2014

Why does so much research go unpublished?

As described in my last blogpost, I attended an excellent symposium on waste in research this week. A recurring theme was research that never got published. Rosalind Smyth described her experience of sitting on the funding panel of a medium-sized charity. The panel went to great pains to select the most promising projects, and would end a meeting with a sense of excitement about the great work that they were able to fund. A few years down the line, though, they'd find that many of the funds had been squandered. The work had either not been done, or had been completed but not published.

In order to tackle this problem, we need to understand the underlying causes. Sometimes, as Robert Burns noted, the best-laid schemes go wrong. Until you've tried to run a few research projects, it's hard to imagine the myriad different ways in which life can conspire to mess up your plans. The eight laws of psychological research formulated by Hodgson and Rollnick are as true today as they were 25 years ago.

But much research remains unpublished despite being completed. Reasons are multiple, and the strategies needed to overcome them are varied, but here is my list of the top three problems and potential solutions.

Inconclusive results

Probably the commonest reason for inconclusive results is lack of statistical power. A study is undertaken in the fond hope that a difference will be found between condition X and condition Y, and if the difference is found, there is great rejoicing and a rush to publish. A negative result should also be of interest, provided the study was well-designed and adequately motivated. But if the sample is small, then we can't be sure whether our failure to observe the effect is because it is absent: a real but small effect could be swamped by noise.

I think the solution to this problem lies in the hands of funding panels and researchers: quite simply, they need to take statistical power very seriously indeed and to consider carefully whether anything will be learned from a study if the anticipated effects are not obtained. If not, then the research needs to be rethought. In the fields of genetics and clinical trials, it is now recognised that multicentre collaborations are the way forward to ensure that studies are conducted with sufficient power to obtain a conclusive result.
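To make the point about power concrete, here is a minimal simulation sketch. It is an illustration, not an analysis from the symposium: the function name `estimated_power`, the effect size of 0.3 standard deviations, and the use of a normal approximation (|z| > 1.96) in place of a proper t-test are all assumptions chosen for simplicity.

```python
import random
import statistics


def estimated_power(n_per_group, effect_size, n_sims=2000, seed=1):
    """Estimate how often a two-group study detects a true effect.

    Illustrative assumptions: scores in both groups are normally
    distributed with SD 1, the true group difference is `effect_size`
    SDs, and an effect counts as "detected" when |z| > 1.96 under a
    normal approximation to the two-sample test.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        y = [rng.gauss(effect_size, 1.0) for _ in range(n_per_group)]
        # Standard error of the difference between the two group means
        se = ((statistics.variance(x) + statistics.variance(y)) / n_per_group) ** 0.5
        z = (statistics.mean(y) - statistics.mean(x)) / se
        if abs(z) > 1.96:
            hits += 1
    return hits / n_sims


# Same true effect, three different sample sizes
for n in (20, 50, 200):
    print(n, round(estimated_power(n, effect_size=0.3), 2))
```

Under these assumptions, a study with 20 participants per group should miss a real effect of this size in most runs, whereas one with 200 per group should detect it most of the time. A null result from the small study tells us very little.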

Rejection of completed work by journals

Even well-conducted and adequately powered studies may be rejected by journals if the results are not deemed to be exciting. To solve this problem, we must look to journals. We need recognition that - provided a study is methodologically strong and well-motivated - negative results can be as informative as positive ones. Otherwise we are doomed to waste time and money pursuing false leads. As Paul Glasziou has emphasised, failure is part of the research process. It is important to tell people about what doesn't work if we are not to repeat our mistakes.

We do now have some journals that will publish negative results, and there is a growing move toward pre-registration of studies, with guaranteed publication if the methods meet quality criteria. But there is still a lot to be done, and we need a radical change of mindset about what kinds of research results are valuable.

Lack of time

Here, I lay the blame squarely on the incentive structures that operate in universities. To get a job, or to get promoted, you need to demonstrate that you can pull in research income. In many UK institutions this is quite explicit, and promotions criteria may give a specific figure to aim for of X thousand pounds research income per annum. There are few UK universities whose strategic plan does not include a statement about increasing research funding. This has changed the culture dramatically; as Fergus Millar put it: "in the modern British university, it is not that funding is sought in order to carry out research, but that research projects are formulated in order to get funding".

Of course, for research to thrive, our Universities need people who can compete for funding to support their work. But the acquisition of funding has become an end in itself, rather than a means to an end. This has the pernicious effect of driving people to apply for grant after grant, without adequately budgeting for the time it takes to analyse and write up research, or indeed to carefully think about what they are doing. As I argued previously, even junior researchers these days have an 'academic backlog' of unwritten papers.

At the Lancet meeting there were some useful suggestions for how we might change incentive structures to avoid such waste. Malcolm Macleod argued that researchers should be evaluated not by research income and high-impact publications, but by the quality of their methods, the extent to which their research was fully reported, and the reproducibility of findings. An-Wen Chan echoed this, arguing for performance metrics that recognise full dissemination of research and use of research datasets by other groups. However, we may ask whether such proposals have any chance of being adopted when University funding is directly linked to grant income, and Universities increasingly view themselves as businesses.

I suspect we would need revised incentives to be reflected at the level of those allocating central funding before vice-chancellors took them seriously. It would, however, be feasible for behaviour to be shaped at the supply end, if funders adopted new guidelines. For a start, they could look more carefully at the time commitments of those to whom grants are given: in my experience this is never taken into consideration, and one can see successful 'fat cats' accumulating grant after grant, as success builds on success. Funders could also monitor more closely the outcomes of grants: Chan noted that NIHR withholds 10% of research funds until a paper based on the research has been submitted for publication. Moves like this could help us change the climate so that an award of a grant would confer responsibility on the recipient to carry through the work to completion, rather than acting solely to embellish the researcher's curriculum vitae.

References

Chan, A., Song, F., Vickers, A., Jefferson, T., Dickersin, K., Gotzsche, P., Krumholz, H. M., Ghersi, D., & van der Worp, H. B. (2014). Increasing value and reducing waste: addressing inaccessible research. Lancet (8 Jan). doi:10.1016/S0140-6736(13)62296-5

Macleod, M. R., Michie, S., Roberts, I., Dirnagl, U., Chalmers, I., Ioannidis, J. P. A., . . . Glasziou, P. (2014). Biomedical research: increasing value, reducing waste. Lancet, 383(9912), 101-104.

Thursday, January 9, 2014

Off with the old and on with the new: the pressures against cumulative research

Yesterday I escaped a very soggy Oxford to make it down to London for a symposium on "Increasing value, reducing waste" in Research. The meeting marked the publication of a special issue of the Lancet containing five papers and two commentaries, which can be downloaded here.

I was excited by the symposium because, although the focus was on medicine, it raised a number of issues that have much broader relevance for science. Several are ones I have raised on this blog: pre-registration of research, criteria used by high-impact journals, ethics regulation, academic backlogs, and incentives for researchers. It was impressive to see that major players in the field of medicine are now recognizing that there is a massive problem of waste in research. Better still, they are taking seriously the need to devise ways in which this could be fixed.

I hope to blog about more of the issues that came up in the meeting, but for today I'll confine myself to one topic that I hadn't thought about much before but now see as important: the need for research to build on previous research, and the current pressures against this.

Iain Chalmers presented one of the most disturbing slides of the day, a forest plot of effect sizes found in medical trials for a treatment to prevent bleeding during surgery.

Based on Figure 3 of Chalmers et al, 2014

Time is along the x-axis, and the horizontal line corresponds to a result where the active and control treatments do not differ. Points which are below the line and whose fins do not cross it show a beneficial effect of treatment. The graph shows that the effectiveness of the treatment was clearly established by around 2002, yet a further 20 studies including several hundred patients were reported in the literature after that date. Chalmers made the point that it is simply unethical to do a clinical trial if previous research has already established an effect. The problem is that researchers often don't check the literature to see what has already been done, and so there is wasteful repetition of studies. In the field of medicine this is particularly serious because patients may be denied the most effective treatment if they enrol in a research project.

Outside medicine, I'm not sure this is so much of an issue. In fact, as I've argued elsewhere, in psychology and neuroscience I think there's more of a problem with lack of replication. But there definitely is much neglect of prior research. I lose count of the number of papers I review where the introduction presents a biased view of the literature that supports the authors' conclusions. For instance, if you are interested in the relation between auditory deficit and children's language disorders, it is possible to write an introduction presenting this association as an established fact, or to write one arguing that it has been comprehensively debunked. I have seen both.

Is this just lazy, biased or ignorant authors? In part, I suspect it is. But I think there is a deeper problem which has to do with the insatiable demand for novelty shown by many journals, especially the high-impact ones. These journals typically have a lot of pressure on page space and often allow only 500 words or less for an introduction. Unless authors can refer to a systematic review of the topic they are working on, they are obliged to give the briefest account of prior literature. It seems we no longer value the idea that research should build on what has gone before: rather, everyone wants studies that are so exciting that they stand alone. Indeed, if a study is described as 'incremental' research, that is typically the death knell in a funding committee.

We need good syntheses of past research, yet these are not valued because they are not deemed novel. One point made by Iain Chalmers was that funders have in the past been reluctant to give grants for systematic reviews. Reviews also aren't rated highly in academia: for instance, I'm proud of a review on mismatch negativity that I published in Psychological Bulletin in 2007. It not only condensed and critiqued existing research, but also discovered patterns in data that had not previously been noted. However, for the REF, and for my publications list on a grant renewal, reviews don't count.

We need a rethink of our attitude to reviews. Medicine has led the way and specified rigorous criteria for systematic reviews, so that authors can't just cherrypick specific studies of interest. But it has also shown us that such reviews are an invaluable part of the research process. They help ensure that we do not waste resources by addressing questions that have already been answered, and they encourage us to think of research as a cumulative, developing process, rather than a series of disconnected, dramatic events.

Reference

Chalmers, Iain, Bracken, Michael B., Djulbegovic, Ben, Garattini, Silvio, Grant, Jonathan, Gülmezoglu, A. Metin, Howells, David W., Ioannidis, John P. A., & Oliver, Sandy (2014). How to increase value and reduce waste when research priorities are set. Lancet. doi:10.1016/S0140-6736(13)62229-1

Monday, October 14, 2013

The Matthew effect and REF2014

For unto every one that hath shall be given, and he shall have abundance: but from him that hath not shall be taken away even that which he hath. Matthew 25:29

So you’ve slaved over your departmental submission for REF2014, and shortly will be handing it in. A nervous few months await before the results are announced. You’ve sweated blood over deciding whether staff publications or impact statements will be graded as 1*, 2*, 3* or 4*, but it’s not possible to predict how the committee will judge them, nor, more importantly, how these ratings will translate into funding. In the last round of evaluation, in 2008, a weighted formula was used, such that a submission earned 1 point for every 2* output, 3 points for every 3* output, and 7 points for every 4* output. Rumour has it that this year there may be no money for 2* outputs and even more for 4*. It will be more complicated than this, because funding allocations will also take into account ratings of ‘impact statements’, and the ‘environment’.

I’ve blogged previously about concerns I have with the inefficiency of the REF2014 as a method for allocating funds. Today I want to look at a different issue: the extent to which the REF increases disparities between universities over time. To examine this, I created a simulation which made a few simple assumptions. We start with a sample of 100 universities, each of which is submitting 50 staff in a Unit of Assessment. At the outset, we start with all universities equal in terms of the research quality of their staff: they are selected at random from a pool of possible staff whose research quality is normally distributed. Funding is then allocated according to the formula used in RAE2008. The key feature of the simulation is that over every assessment period there is turnover of staff (estimated at 10% in simulation shown here), and universities with higher funding levels are able to recruit replacement staff with higher scores on the research quality scale. These new staff are then the basis for computing funding allocations in the next cycle – and so on, through as many cycles as one wishes.
This simulation shows that funding starts out fairly normally distributed, but as we progress through each cycle, it becomes increasingly skewed, with the top-performers moving steadily away from the rest (Figure A). In the graphs, funding is shown over time for universities grouped in deciles, i.e., bands of 10 universities after ranking by funding level.

Simulation: Mean income for universities in each of 10 deciles over 6 funding cycles

Depending on specific settings of parameters in the model, we may even see a bimodal distribution developing over time: a large pool of ‘have-nots’ vs an elite group of ‘haves’.

Despite the over-simplifications of the model, I would argue that it captures an essential feature of the current funding framework: funding goes to those who are successful, allowing them to enter a positive feedback loop whereby they can recruit more high-calibre researchers and become even more successful – and hence gain even more funds in the next round. For those who are unsuccessful, it can be hard to break out of a downward spiral into research inactivity.
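For readers who want to experiment, the simulation described above can be sketched roughly as follows. This is a reconstruction under stated assumptions, not the original code: the cut-offs that turn a quality score into a star rating, the rule that the richest universities get first pick of each cycle's recruits, the zero weight for 1* outputs, and the `WEIGHTS_LINEAR` alternative are all guesses made for illustration.

```python
import random

# Points per output in RAE2008 as described above; 1* is assumed to earn nothing.
WEIGHTS_RAE2008 = {1: 0, 2: 1, 3: 3, 4: 7}
# Hypothetical linear alternative (an assumption, not a documented formula).
WEIGHTS_LINEAR = {1: 1, 2: 2, 3: 3, 4: 4}


def stars(quality):
    """Map a quality z-score to a star rating (cut-offs are assumed)."""
    if quality > 1.5:
        return 4
    if quality > 0.5:
        return 3
    if quality > -0.5:
        return 2
    return 1


def funding(staff, weights):
    """Total funding points earned by one university's staff."""
    return sum(weights[stars(q)] for q in staff)


def simulate(weights, n_unis=100, n_staff=50, turnover=0.10, cycles=6, seed=0):
    """Return a list with one sorted list of university incomes per cycle."""
    rng = random.Random(seed)
    # All universities start equal: staff drawn at random from one pool.
    unis = [[rng.gauss(0, 1) for _ in range(n_staff)] for _ in range(n_unis)]
    n_leave = round(n_staff * turnover)
    history = []
    for _ in range(cycles):
        income = [funding(staff, weights) for staff in unis]
        history.append(sorted(income))
        # A random 10% of staff leave each university.
        for staff in unis:
            rng.shuffle(staff)
            del staff[:n_leave]
        # Replacements come from the same pool, but the best-funded
        # universities recruit the strongest candidates (assumed rule).
        recruits = sorted(
            (rng.gauss(0, 1) for _ in range(n_unis * n_leave)), reverse=True
        )
        richest_first = sorted(range(n_unis), key=lambda i: income[i], reverse=True)
        for rank, i in enumerate(richest_first):
            unis[i].extend(recruits[rank * n_leave:(rank + 1) * n_leave])
    return history


def decile_means(sorted_income):
    """Mean income in each of 10 equal bands, poorest first."""
    size = len(sorted_income) // 10
    return [sum(sorted_income[d * size:(d + 1) * size]) / size for d in range(10)]


history = simulate(WEIGHTS_RAE2008)
for cycle, income in enumerate(history, start=1):
    means = decile_means(income)
    print(cycle, "bottom decile:", round(means[0], 1), "top decile:", round(means[-1], 1))
```

Swapping in `WEIGHTS_LINEAR`, or changing `turnover`, lets you see how sensitive the widening gap is to the funding formula. The exact numbers will depend on the assumed cut-offs and recruitment rule, so they should not be expected to match the figures.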

We could do things differently. Figure B shows how tweaking the funding model could avoid opening up such a wide gulf between the richest and poorest, and retain a solid core of middle-ranking universities.

Simulation using linear weighting of * levels. Each line is average for institutions in a given decile

Figure C, on the other hand, shows the effect of a formula that predominantly rewards 4* outputs (weighting of 1 for 3* and 7 for 4*, which is rumoured to be a possible formula for REF2014). This would dramatically increase the gulf between the elite and other institutions.

Simulation where 4* outputs get favoured. Each line is average for institutions in a given decile

I’m sure people will have very different views about whether or not the consequences illustrated here are desirable. One argument is that it is best to concentrate our research strength in a few elite institutions. That way the UK will be able to compete with the rest of the world in University league tables. Furthermore, by pooling the brightest brains in places where they have the best resources to do research, we have a chance of making serious breakthroughs. We could even use biblical precedent to justify such an approach: the Matthew effect refers to the biblical parable of the talents, in which servants are entrusted different sums of money by their master, and those who have most make the best use of it. There is no sympathy for those with few resources: they fail to make good use of what they do have and end up cast out into outer darkness, where there is weeping and gnashing of teeth. This robust attitude characterises those who argue that only internationally outstanding research should receive serious funding.

However, given that finances are always limited, there will be a cost to the focus on an elite; the middle-ranking universities will get less funding, and be correspondingly less able to attract high-calibre researchers. And it could be argued that we don’t just need an elite: we need a reasonable number of institutions in which there is a strong research environment, where more senior researchers feel valued and their graduate students and postdocs are encouraged to aim high. Our best strategy for retaining international competitiveness might be to foster those who are doing well but have potential to do even better. In any case, much research funding is awarded through competition for grants, and most of this goes to people in elite institutions, so these places would not be starved of income if we were to adopt a more balanced system of awarding central funds.

What worries me most is that I haven’t been able to find any discussion of this issue – namely, whether the goal of a funding formula should be to focus on elite institutions or distribute funds more widely. The nearest thing I’ve found so far is a paper analysing a parallel issue in grant awards (Fortin & Currie, 2013) – which comes to the conclusion that broader distribution of smaller grants is more effective than narrowly distributed large grants. Very soon, somebody somewhere is going to decide on the funding formula, and if rumours are to be believed, it will widen the gap between the haves and have-nots even further. I'm concerned that if we continue to concentrate funding only in those institutions with a high proportion of research superstars, we may be creating an imbalance in our system of funding that will be bad for UK research in the long run.

Reference

Fortin, J. M., & Currie, D. J. (2013). Big Science vs. Little Science: How Scientific Impact Scales with Funding. PLoS ONE, 8(6). PMID: 23840323

Saturday, January 26, 2013

An alternative to REF2014?

After I blogged last week about the use of journal impact factors in REF2014, many people asked me what alternative I'd recommend. Clearly, we need a transparent, fair and cost-effective method for distributing funding to universities to support research. Those designing the REF have tried hard over the years to devise such a method, and have explored various alternatives, but the current system leaves much to be desired.

Consider the current criteria for rating research outputs, designed by someone with a true flair for ambiguity:

Rating	Definition
4*	Quality that is world-leading in terms of originality, significance and rigour
3*	Quality that is internationally excellent in terms of originality, significance and rigour but which falls short of the highest standards of excellence
2*	Quality that is recognised internationally in terms of originality, significance and rigour
1*	Quality that is recognised nationally in terms of originality, significance and rigour

Since only 4* and 3* outputs will feature in the funding formula, a great deal hinges on whether research is deemed “world-leading”, “internationally excellent” or “internationally recognised”. This is hardly transparent or objective. That’s one reason why many institutions want to translate these star ratings into journal impact factors. But substituting a discredited, objective criterion for a subjective criterion is not a solution.

The use of bibliometrics was considered but rejected in the past. My suggestion is that we should reconsider this idea, but in a new version. A few months ago, I blogged about how university rankings in the previous assessment exercise (RAE) related to grant income and citation rates for outputs. Instead of looking at citations for individual researchers, I used Web of Science to compute an H-index for the period 2000-2007 for each department, by using the ‘address’ field to search. As noted in my original post, I did this fairly hastily and the method can get problematic in cases where a Unit of Assessment does not correspond neatly to a single department. The H-index reflected all research outputs of everyone at that address – regardless of whether they were still at the institution or entered for the RAE. Despite these limitations, the resulting H-index predicted the RAE results remarkably well, as seen in the scatterplot below, which shows H-index in relation to the funding level following from RAE. This is computed by number of full-time staff equivalents multiplied by the formula:

.1 x 2* + .3 x 3* + .7 x 4*

(N.B. I ignored subject weighting, so units are arbitrary).
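As a worked illustration of the formula, here is a small sketch. It assumes that 2*, 3* and 4* in the formula stand for the percentage of a submission rated at each level, which is my reading of the description. The function name and the example department are hypothetical.

```python
def rae_funding_units(fte, pct_2star, pct_3star, pct_4star):
    """FTE staff multiplied by the weighted quality profile.

    The pct_* arguments are the percentage (0-100) of the submission
    rated at each star level (assumed interpretation). Subject weighting
    is ignored, so the result is in arbitrary units.
    """
    profile = 0.1 * pct_2star + 0.3 * pct_3star + 0.7 * pct_4star
    return fte * profile


# Hypothetical department: 30 FTE staff, rated 40% 2*, 35% 3*, 15% 4*
print(rae_funding_units(30, 40, 35, 15))
```

For this made-up department the weighted profile is 4 + 10.5 + 10.5 = 25 per FTE, giving 750 units. Note that the 15% of work rated 4* contributes as much as the 35% rated 3*.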

Psychology (Unit of Assessment 44), RAE2008 outcome by H-index

Yes, you might say, but the prediction is less successful at the top end of the scale, and this could mean that the RAE panels incorporated factors that aren’t readily measured by such a crude score as H-index. Possibly true, but how do we know those factors are fair and objective? In this dataset, one variable that accounted for additional variance in outcome, over and above departmental H-index, was whether the department had a representative on the psychology panel: if they did, then the trend was for the department to have a higher ranking than that predicted from the H-index. With panel membership included in the regression, the correlation (r) increased significantly from .84 to .86, t = 2.82, p = .006. It makes sense that if you are a member of a panel, you will be much more clued up than other people about how the whole process works, and you can use this information to ensure your department’s submission is strategically optimal. I should stress that this was a small effect, and I did not see it in a handful of other disciplines that I looked at, so it could be a fluke. Nevertheless, with the best intentions in the world, the current system can’t ever defend completely against such biases.

So overall, my conclusion is that we might be better off using a bibliometric measure such as a departmental H-index to rank departments. It is crude and imperfect, and I suspect it would not work for all disciplines – especially those in the humanities. It relies solely on citations, and it's debatable whether that is desirable. But for sciences, it seems to be pretty much measuring whatever the RAE was measuring, and it would seem to be the lesser of various possible evils, with a number of advantages compared to the current system. It is transparent and objective, it would not require departments to decide who they do and don’t enter for the assessment, and most importantly, it wins hands down on cost-effectiveness. If we'd used this method instead of the RAE, a small team of analysts armed with Web of Science would have been able to derive the necessary data in a couple of weeks, giving outcomes virtually identical to those of the RAE. The money saved both by HEFCE and individual universities could be ploughed back into research. Of course, people will attempt to manipulate whatever criterion is adopted, but this one might be less easily gamed than some others, especially if self-citations from the same institution are excluded.

It will be interesting to see how well this method predicts RAE outcomes in other subjects, and whether it can also predict results from the REF2014, where the newly-introduced “impact statement” is intended to incorporate a new dimension into assessment.

Sunday, July 15, 2012

The devaluation of low-cost psychological research

Psychology encompasses a wide range of subject areas,
including social, clinical and developmental psychology, cognitive psychology
and neuroscience. The costs of doing different types of psychology vary hugely.
If you just want to see how people remember different types of material, for
instance, or test children's understanding of numerosity, this can be done at very
little cost. For most of the psychology I did as an undergraduate, data
collection did not involve complex equipment, and data analysis was pretty
straightforward - certainly well within the capabilities of a modern desktop
computer. The main cost for a research proposal in this area would be for staff
to do data collection and analysis. Neuroscience, however, is a different
matter. Most kinds of brain imaging require not only expensive equipment, but
also a building to house it and staff to maintain it, and all or part of these
costs will be passed on to researchers. Furthermore, data analysis is usually
highly technical and complex, and can take weeks, or even months, rather than
hours. A project that involves neuroimaging will typically cost orders of
magnitude more than other kinds of psychological research.

In academic research, money follows money. This is quite explicit in funding systems that reward an institution in proportion to its research income. This makes sense: an institution that is doing costly research needs funding to support the infrastructure for that research. The problem is that the money, rather than the research, can become the indicator of success. Hiring committees will scrutinise CVs for evidence of ability to bring in large grants. My guess is that, if choosing between one candidate with strong publications and modest grant income vs. another with less influential publications and large grant income, many would favour the latter. Universities, after all, have to survive in a tough financial climate, and so we are all exhorted to go after large grants to help shore up our institution's income. Some Universities have even taken to firing people who don't bring in the expected income. This means that cheap, cost-effective research in traditional psychological areas will be devalued relative to more expensive neuroimaging.

I have no quarrel, in principle, with psychologists doing neuroimaging studies - some of my best friends are neuroimagers - and it is important that, if good science is to be done in this area, it should be properly funded. I am uneasy, though, about an unintended consequence of the enthusiasm for neuroimaging, which is that it has led to a devaluation of other kinds of psychological research. I've been reading Thinking, Fast and Slow, by Daniel Kahneman, a psychologist who has the rare distinction of being a Nobel Laureate. He is just one example of a psychologist who has made major advances without using brain scanners. I couldn't help thinking that Kahneman would not fare well in the current academic climate, because his experiments were simple, elegant ... and inexpensive.

I've suggested previously that systems of academic rewards
need to be rejigged to take into account not just research income and
publication outputs, but the relationship between the two. Of course, some
kinds of research require big bucks, but large-scale grants are not always
cost-effective. And on the other side of the coin, there are people who do
excellent, influential work on a small budget.

I thought I'd see if it might be possible to get some hard data on how this works in practice. I used data for Psychology Departments from the last Research Assessment Exercise (RAE), from this website, and matched this up against citation counts for publications that came out in the same time period (2000-2007) from Web of Knowledge. The latter is a bit tricky, and I'm aware that figures may contain inaccuracies, as I had to search by address, using the name of the institution coupled with the words Psychology and UK. This will miss articles that don't have these words in the address. Also, when double-checking the numbers, I found that for a search by address, results can fluctuate from one occasion to the next. For these reasons, I'd urge readers to treat the results with caution, and I won't refer to institutions by name. Note too that though I restrict consideration to articles published from 2000 to 2007, the citations extend beyond the period when the RAE was completed. Web of Knowledge helpfully gives you an H-index for the institution if you ask for a citation report, and this is what I report here, as it is more stable across repeated searches than the citation count. Figure 1 shows how research income for a department relates to its H-index, just for those institutions deemed research active, which I defined as having a research income of at least £500K over the reporting period. The overall RAE rating is colour-coded into bandings, and the symbol denotes whether or not the departmental submission mentions neuroimaging as an important part of its work.
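For anyone unfamiliar with the H-index, it is the largest number h such that h papers have each been cited at least h times. A minimal sketch of the calculation follows. The citation counts are made up, and this is the standard definition, not Web of Knowledge's own code.

```python
def h_index(citation_counts):
    """Largest h such that h papers each have at least h citations."""
    h = 0
    ranked = sorted(citation_counts, reverse=True)
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for seven papers from one address
papers = [25, 8, 5, 3, 3, 1, 0]
print(h_index(papers))
```

Because the index depends only on the most-cited papers, gaining or losing a few rarely cited articles between searches leaves it unchanged. That is a plausible reason for it being more stable than a raw citation count.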

Data from RAE and Web of Knowledge: treat with caution!

Several features are seen in these data, and most are
unsurprising:

Research income and H-index are positively correlated, r =
.74 (95%CI .59-.84) as we would expect. Both variables are correlated with the
number of staff entered in the RAE, but the correlation between them remains
healthy when this factor is partialled out, r = .61 (95%CI .40-.76).

Institutions coded as doing neuroimaging have bigger grants: after taking into account differences in number of staff, the mean income
for departments with neuroimaging was £7,428K and for those without it was
£3,889K (difference significant at p = .01).

Both research income and H-index are predictive of RAE
rankings: the correlations are .68 (95% CI .50-.80) for research income and .79
(95% CI .66-.87) for H-index, and together they account for 80% of the variance
in rankings. We would not expect perfect prediction, given that the RAE committee
went beyond metrics to assess aspects of research quality not
reflected in citations or income. And in addition, it must be noted that the
citations counted here are for all researchers at a departmental address, not
just those entered in the RAE.
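The first feature above reports a correlation with staff numbers "partialled out". For readers who want the arithmetic, here is a minimal sketch of the standard first-order partial correlation formula. Only the value .74 comes from the data above. The two correlations with staff numbers are hypothetical placeholders, so the printed result is not expected to reproduce the reported .61.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with the influence of z removed."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


r_income_hindex = 0.74  # reported above
r_income_staff = 0.60   # hypothetical placeholder
r_hindex_staff = 0.60   # hypothetical placeholder

print(round(partial_correlation(r_income_hindex, r_income_staff, r_hindex_staff), 2))
```

The stronger the link between each variable and staff numbers, the more the raw correlation shrinks once department size is taken into account.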

A point of concern to me in these data, though, is the wide
spread in H-index seen for those institutions with the highest levels of grant
income. If these numbers are accurate, some departments are using their
substantial income to do influential work, while others seem to achieve no more
than other departments with much less funding. There may be reasonable
explanations for this - for instance, a large tranche of funding may have been
awarded in the RAE period but not had time to percolate through to
publications. But nevertheless, it adds to my concern that we may
be rewarding those who chase big grants without paying sufficient attention to
what they do with the funding when they get it.

What, if anything, should we do about this? I've toyed in
the past with the idea of a cost-efficiency metric (e.g. citations divided by
grant income), but this would not work as a basis for allocating funds, because
some types of research are intrinsically more expensive than others. In
addition, it is difficult to get research funding, and success in this arena is
in itself an indicator that the researchers have impressed a tough committee of
their peers. So, yes, it makes sense to treat level of research funding as one indicator
of an institution's research excellence when rating departments to determine
who gets funding. My argument is simply that we should be aware of the
unintended consequences if we rely too heavily on this metric. It would be nice
to see some kind of indicator of cost-effectiveness included in ratings of
departments alongside the more traditional metrics. In times of financial
stringency, it is particularly short-sighted to discount the contribution of
researchers who are able to do influential work with relatively scant
resources.