One approach to overcoming, or at least easing, the complexity of judging individual cases is the use of metrics or statistical calculations to measure impact or influence. This has been an area of increasing interest even with traditional publications. Impact is often represented by a statistical measure such as the ‘h-index’, which is based upon bibliometric calculations of citations across a specific set of publisher databases. The measure identifies references to one publication within another, giving ‘an estimate of the importance, significance, and broad impact of a scientist's cumulative research contributions’ (Hirsch, 2005). Promising though this may sound, it is a system that can be cheated, or gamed (Falagas and Alexiou, 2008), for instance by authors citing their own previous papers or by groups citing each other, and so a continual cycle of detecting and then eliminating such behaviours begins, rather akin to the battle fought between computer-virus makers and antivirus software. The Research Excellence Framework (REF) examined the potential of using such measures as part of the assessment process and found that currently available systems and data were ‘not sufficiently robust to be used formulaically or as a primary indicator of quality; but there is considerable scope for it to inform and enhance the process of expert review’ (HEFCE, 2010).
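To make the calculation concrete: under Hirsch's definition, a researcher has an h-index of h when h of his or her publications have each received at least h citations. The sketch below, in Python (the language is a choice made for this illustration, not one used by the sources cited above), computes the value from a list of citation counts:

```python
def h_index(citations):
    """Return the h-index: the largest h such that h publications
    have each received at least h citations (Hirsch, 2005)."""
    h = 0
    for rank, cites in enumerate(sorted(citations, reverse=True), start=1):
        if cites >= rank:
            h = rank  # this paper still satisfies the h-condition
        else:
            break  # once counts fall below rank, h cannot grow further
    return h

# Five papers cited 10, 8, 5, 4 and 3 times give h = 4: four papers
# have at least four citations each, but there are not five with five.
print(h_index([10, 8, 5, 4, 3]))  # prints 4
```

Gaming the measure, as described above, amounts to artificially inflating the citation counts that feed this calculation.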
There are at least three further degrees of separation from this walled-garden approach to citations. The first is to use data from outside a proprietary database as a measure of an article's impact. This ‘webometrics’ approach was identified early on as offering the potential for richer information about the use of an article, by analysing links to it, downloads from a server and citations across the web (e.g. Marek and Valauskas, 2002). Cronin et al. (1998) argue that such data could ‘give substance to modes of influence which have historically been backgrounded in narratives of science’.
The next step is to broaden this webometrics approach to include the more social Web 2.0 tools. This covers references to articles in social networks such as Twitter and blogs, social bookmarking tools such as CiteULike and recommendation tools such as Digg (Patterson, 2009). It recognises that a good deal of academic discourse now takes place outside the formal journal, and that there is a wealth of data which can add to the overall representation of an article's influence.
The ease of participation that is a key characteristic of these tools also makes them even more open to potential gaming. As Priem and Hemminger (2010) report, there are services which will, for a fee, attempt to increase the references from services such as Digg to a site (or article). But they are reasonably optimistic that gaming can be controlled, proposing that ‘one particular virtue of an approach examining multiple social media ecosystems is that data from different sources could be cross-calibrated, exposing suspicious patterns invisible in single source’.
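Priem and Hemminger do not specify a method, but one simple way of realising such cross-calibration is to compare an article's percentile rank in each source and flag articles whose standing diverges sharply, for instance an article that tops Digg yet barely registers in CiteULike. The data, threshold and function names below are illustrative assumptions, not a published algorithm:

```python
def percentile_ranks(counts):
    """Map each article to its percentile rank (0 to 1) within one source."""
    order = sorted(counts, key=counts.get)  # ascending by count
    n = len(order)
    return {article: i / (n - 1) for i, article in enumerate(order)}

def flag_suspicious(sources, threshold=0.7):
    """Flag articles whose rank differs sharply between sources.
    Assumes, for simplicity, that every source covers the same articles."""
    ranks = [percentile_ranks(counts) for counts in sources.values()]
    articles = next(iter(sources.values()))
    return [a for a in articles
            if max(r[a] for r in ranks) - min(r[a] for r in ranks) > threshold]

# Article 'a' dominates one source but trails badly in the other,
# the kind of single-source spike a gaming service might produce.
sources = {
    'digg':      {'a': 500, 'b': 3,  'c': 5,  'd': 2},
    'citeulike': {'a': 1,   'b': 40, 'c': 35, 'd': 30},
}
print(flag_suspicious(sources))  # prints ['a']
```

A production system would need to handle differing article coverage, tied counts and the choice of threshold, but the principle of exposing single-source anomalies is the same.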
A more radical move away from the citation work conducted so far is to extend metrics to outputs beyond the academic article. A digital scholar is likely to have a distributed online identity, the elements of which can be seen to represent factors such as reputation, impact, influence and productivity. Establishing a digital scholar footprint across these services is problematic because different people use different tools, so the standard unit of the scholarly article is lacking. Nevertheless, one could begin to establish a representation of scholarly activity by analysing data from a number of sites, such as an individual's blog, Twitter, Slideshare and YouTube accounts, and then using the webometrics approach to analyse references to these outputs from elsewhere. A number of existing tools seek to perform this function for blogs; for example, PostRank tracks the conversation around blog posts, including comments, Twitter links and Delicious bookmarks. These metrics are not without their problems, and a robust measure is still some way off, but there is a wealth of data now available which can add to the overall case an individual makes.
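As a simplified illustration of what assembling such a footprint might involve, the fragment below totals raw activity counts across several service accounts. The services, metric names and figures are invented for illustration, and the summing applies no weighting or normalisation, which any robust measure would certainly require:

```python
# Purely illustrative figures for one scholar's distributed presence.
footprint = {
    'blog':       {'posts': 120, 'comments': 340, 'inbound_links': 95},
    'twitter':    {'followers': 800, 'retweets': 150},
    'slideshare': {'presentations': 15, 'views': 9000},
    'youtube':    {'videos': 8, 'views': 4200},
}

def activity_totals(footprint):
    """Sum each raw metric across services: a crude first-pass summary,
    not a calibrated measure of reputation or influence."""
    totals = {}
    for metrics in footprint.values():
        for name, value in metrics.items():
            totals[name] = totals.get(name, 0) + value
    return totals

print(activity_totals(footprint)['views'])  # prints 13200 (9000 + 4200)
```

Even a crude summary like this begins to make visible activity that citation databases ignore entirely.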