
Reform of higher education research assessment and funding

DfE Universities & higher education Government departments & agencies HE funding & research strategy 2006

1 Which, if any, of the RAE 2008 panels might adopt a greater or wholly metrics-based approach?

This question raises a rather larger one, namely whether a change to a metrics-based assessment will ever be undertaken at the level of a subject-based assessment unit. Most informed opinion suggests that metric assessment will only work at the aggregated scale of a whole institution (that of the University itself). Our view is that the great strength of the RAE has been that it is subject-based and "bottom-up", in that the academic community participates in the selection of panel members and, through a consultation process, in the definition of criteria. This buy-in by the academic community must play a significant role in the ownership of the assessment process, which is necessary if research assessment is to drive an improvement in overall research quality. Our view is that this ownership will be lost as a result of a shift to a metrics-based assessment.

We also wish to stress that the current RAE is in fact a metric assessment. It uses qualitative professional judgement to "measure" a system and to assign semi-quantitative grades. It achieves this by combining information on the quality of outputs, grant income, studentship numbers etc, with the weightings attached to these factors varying to an appropriate degree between subjects. As part of the process of generating the output metric of the RAE, there can be variation in the input metrics employed to suit the requirements of individual subjects, but within a consistent overall philosophy. This consistent philosophy is an essential element of the assessment procedure, as it allows fair distribution amongst subject areas of equal international standing. That consistency will be lost if there is a differential method of assessment based on metrics in some cases, and something more like the current RAE in others. Indeed, we fail to see how such a mixed system can work, because University-scale metric assessment and subject-based peer review are obviously incompatible.

A further critical issue is that the RAE must adopt a consistent but flexible approach so that comparisons can be legitimately made between UoAs. Consistency is also required to avoid creating tension within and amongst disciplines that embrace a wide range of research traditions, since such tension might inhibit inter- and multi-disciplinary working. We return in more detail to this issue in respect of Geography and similar disciplines, in our response to Q.3.

2 Have we identified all the important metrics? Bearing in mind the need to avoid increasing the overall burden of data collection on institutions, are there other indicators that we should consider?

A critical requirement must be that any proposed metric indicator is independently and reliably measured and validated. Unless there is consistency in the sampling and measurement of metric variables, they will lack credibility, open up the possibility of judicial review and other challenges, and result in divisive arguments between disciplines about relative validity. In addition, metrics will need to be scaled relative to the size of a research community; citations will inevitably be lower in a smaller community. It is by no means clear that citation data can be scaled in a straightforward way without adequate research; after all, the citations are international, so must be scaled by the size of the relevant international community.

Many metrics will be applied in a simplistic manner, and will fail to measure research quality and strength. For example, if publication in highly-rated journals is prioritised, this will miss the fact that many papers in lower impact journals may have greater impact than many papers in high impact journals. In the field of development geography, international impacts may be substantial as a result of papers published in well-respected journals outside the Anglo-American nexus, and a metric assessment will be unable to judge this. Research funding in interdisciplinary fields may be derived from a wide diversity of sources, including commercial sources and overseas agencies, but there may be a weakness in accounting for these reliably. These and many other examples illustrate that metrics are far from being straightforward indicators of either activity or quality; that they can be difficult to define reliably; and that their use as a relatively blunt tool will marginalise many highly productive and valued academics. Declining morale and increased out-migration are predictable consequences.

In the social sciences there are no reliable correlations between metric variables and other variables. For example, it is not possible to show a correlation between research income and research impact as measured by citations; the correlation is very weak even in the Russell Group social sciences. This implies that metric assessment would be a poor guide to research quality. It follows that the reputation of UK research may suffer if there is evidence of a weak relationship between input and output measures.

It is also the case that the correlation between different input measures (eg grant income and QR) is only evident in a dozen or so large institutions that can include all the heavy sciences by virtue of their size; in the rest of the University system, there is no such correlation, implying that there would be considerable variation in funding levels in a majority of institutions, simply because of a change in the measuring system. This kind of variance will destroy confidence in the procedure for allocation of funding, as it will appear to endorse an arbitrariness of outcome.

Finally, we wish to emphasise that the strength of the RAE is its capacity to judge the quality of a very wide range of types of research output, far beyond the standard journal article. The rest of the world envies this eclecticism, and it is assessment based on peer judgement that permits it. To switch to a metric assessment will narrow the range of "measurable" outputs, and lead to a stultifying uniformity. This will be extremely damaging to disciplines where the journal article is not the gold standard for quality (eg the humanities with their monographs, or with practice-based outputs). There should not be an assessment practice introduced for bureaucratic reasons that then distorts and even changes the established and approved methods of communicating research of the highest quality. Again, UK HE and its reputation will suffer.

3 Which of the alternative models described in this chapter do you consider to be the most suitable for STEM subjects? Are there alternative models or refinements of these models that you would want to propose?

We are of the view that even in the context of STEM subjects, there is considerable variation in the degree to which subjects will value metric assessment; we cannot envisage pure mathematics and theoretical physics as being measured by the same metric criteria as engineering and medicine. Equally, there are several disciplines, of which Geography is one, that interact strongly with STEM subjects and embody a wide range of intellectual traditions and research methods, and we are extremely concerned that those subjects will be severely damaged by a change to a metric assessment procedure which is of variable merit amongst the component sub-disciplines. Geography, Archaeology, Town Planning, Psychology, and Anthropology all have a very broad intellectual base (from the humanities to the natural sciences), and our discussions with representatives of others of those subjects indicate considerable concern that introducing metric assessment of differential value to the different parts of these subjects will cause serious internal tensions. It is short-sighted to introduce a method of assessment that could damage subjects that are inherently interdisciplinary, given that great stress is today placed on the value of interdisciplinarity (and indeed, the Treasury has previously emphasised the importance of promoting such interdisciplinarity: Science & innovation investment framework 2004-2014, #2.9, p.22).

Given the view expressed in 1, that the current RAE generates a metric output, and uses metric inputs, but as part of a balanced portfolio of evidence, we do not believe that there is any reason to depart in principle from the model currently in place, although there are ways in which it could be simplified (see the response to Q4). We believe that criticism of the RAE relative to metric-based assessment is misguided, and commend the argument by Sastry and Bekhradnia in "Using metrics to allocate research funds: initial response to the Government’s consultation proposals" (HEPI, June 2006) that the increased annual cost of the Research Council peer review system following introduction of metric assessment based on grant income will outweigh that of the 5-7 year RAE cycle.

4 What, in your view, would be an appropriate and workable basis for assessing and funding research in non-STEM subjects?

The current RAE has become more complex. The best solution is not to shift to an entirely new system of assessment, which will have many unforeseen consequences, but to review the current system constructively and to streamline it. The shift to producing a distribution rather than a grade has significantly increased the burden of the RAE, as there is less justification for sampling; all outputs must be examined in detail. It has become more structured and constraining, as a result of the weightings for outputs, environment and esteem. It would be possible to revert to a simpler procedure.

5 What are the possible undesirable behavioural consequences of the different models and how might the effects be mitigated?

Considerable difficulties will arise for those subject areas that traditionally include a wide diversity of research methods and output types; different components of these subject areas will value and accept metric assessment to varying degrees (including not at all), and academics in these subjects will find collaboration within their own subject area increasingly difficult because some will be unwilling to contribute to outputs that they consider cannot be measured (or judged) adequately by one or the other of the assessment procedures. This issue goes beyond the subjects mentioned in 3, because the implications will be damaging for interdisciplinarity more generally.

Because metric assessment tends to be strongly dependent on historic data, there will be less room for innovation, and new fields of research will find great difficulty in becoming established.

As noted in 3, there will be a change in behaviour far more radical than that which has been attributed to the current style of RAE, in that academics will adapt their choice of output in ways designed less to communicate their research and more to influence the metrics that will then determine their income. These changes of output strategy will have a serious negative effect on the international standing of many areas of UK research.

The transaction costs of grant acquisition will be increased, and this will be a continual, annual cost, not an intermittent cost. The RC peer review procedures will become more costly, and under pressure, they will change in ways that will lose the confidence of the research community. Universities will strengthen their own internal procedures for vetting and encouraging applications. These will be expensive, and will sap morale.

In short, switching to metric assessment is likely to increase further the bureaucratisation of Universities. The cost of making this change could therefore be considerable in terms of the negative unquantifiable consequences.

6 In principle, do you believe that a metrics-based approach for assessment or funding can be used across all institutions?

No. It is evident in the attempts to show strong correlations between RC research grant income and QR that the strength of correlation relies entirely on the largest 12-14 institutions. In the remainder, there is no correlation, and accordingly it is dangerous to assume that the relationship applies widely. The implication is that for most institutions, the change to metric assessment is likely to result in considerable instability in funding levels, and significant manipulation of the outcomes will be necessary to compensate for this. This will, of course, result in a loss of trust in the institution responsible for that manipulation.
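The statistical point can be illustrated with a minimal sketch in Python. The figures are entirely synthetic and hypothetical (they are not RC grant income or QR data, and the names `large`, `rest` and `pearson` are ours): twelve large institutions whose two measures rise together, and sixteen smaller ones constructed so that the two measures are unrelated. The sketch compares the Pearson correlation across all institutions with that for the smaller institutions alone.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data in arbitrary units: (grant income, QR).
# Twelve large institutions: QR tracks grant income closely.
large = [(40 + 5 * i, 40 + 5 * i + (2 if i % 2 == 0 else -2)) for i in range(12)]

# Sixteen smaller institutions: every combination of four income levels
# and four QR levels, so the two measures are unrelated by construction.
rest = [(x, y) for x in range(1, 5) for y in range(1, 5)]

everyone = large + rest

r_all = pearson([x for x, _ in everyone], [y for _, y in everyone])
r_rest = pearson([x for x, _ in rest], [y for _, y in rest])

print(f"All institutions:       r = {r_all:.3f}")
print(f"Excluding the large 12: r = {r_rest:.3f}")
```

On this toy data the pooled coefficient is close to 1, while the coefficient for the smaller institutions alone is zero. A strong sector-wide correlation can therefore sit alongside the absence of any relationship in the majority of institutions, which is the situation described above.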

7 Should the funding bodies receive and consider institutions' research plans as part of the assessment process?

No. These will have been constructed for a variety of reasons, and there is no consistent way in which they might be judged. Indeed, since they would have to be the subject of qualitative, judgement-based appraisal, we find it extremely strange that this question is posed at all. In essence, in a consultation about metrics-based assessment, we are asked if it would be useful to have a University-level "research environment" statement comparable to that requested of departments in the RAE. How do these fit together? Not at all, as far as we can tell. Furthermore, Q7 implies that the assessment is at institution level, whereas Q1 implies otherwise.

8 How important do you feel it is for there to continue to be an independent assessment of UK higher education research quality for benchmarking purposes? Are there other ways in which this could be accomplished?

Yes. The current RAE does provide for international quality benchmarking, but again, it does so by incorporating the professional judgement of international scholars of some standing. This is valuable benchmarking for the RAE procedure, but it is unclear how comparable benchmarking might be used in conjunction with a largely data-driven metrics-based assessment.

One final point. It has been suggested that a metric exercise will be run in parallel with the 2008 RAE. If this is to be done, its role must be completely transparent, and it must not subvert the outcome of the RAE by being used to complement or supplant the RAE outcomes. The academic community was generally dissatisfied with the introduction, without warning, of the "6*" status after the last RAE, a move which undermined the authority of the RAE process. If the RAE is to be credible, and if the process is not to be undermined again, the RAE outcomes must guide funding for at least 3 years before any changes are introduced.