Evaluating the recommender – undergraduate focus groups

We held more focus groups over the summer holidays, which is tricky when your testing audience is undergraduates!  Not many are left on campus during the summer months, but we did manage to find some undergraduates still around the Manchester Metropolitan University (MMU) campus.  This might be a reflection of the heavy rainfall we’ve been having in Manchester this summer.

We chose MMU because we wanted to test the recommender on undergraduates who didn’t already have a recommender on their own library catalogue; students who did might have treated the tests as a comparison exercise.  We spoke to 11 undergraduates in total, and between them they tested the recommender 42 times.

The Results

The students were positive about the concept of the book recommender and were keen to use it as another tool in their armoury for finding out about new resources.  A key bonus for them was how little input the recommender needed in order to deliver maximum output.  To a time-poor, pressured undergraduate this is a huge plus point.

‘Yeah, I would use it, I don’t have to do anything’

‘I would always look at it if it was on the MMU library catalogue’.

The recommender also offered an alternative source of materials to the ubiquitous reading list.  This is absolutely crucial because it quickly became apparent that our participants struggled to find resources:

‘I go off the reading list, all those books are checked out anyway’

‘I’ve used Google scholar when I’ve had nowhere else to go but it returned stuff I couldn’t get hold of and that just frustrated me’.

So in theory it offered more substance to their reading lists.  The additional books it found were more likely to be in the library and came with the advantage of being suggested on the basis of student borrowing patterns.  Our respondents liked having this insider knowledge about what their peers had read in previous years.

‘It would be useful as I had to read a book for a topic area and when we got the topic area there were already 25 reservations on the book, so if I could click on something and see what the person who did this last year read, that would be very useful’.

Testing the prototype

In testing it proved difficult to conclude whether the recommender was useful or not, as some testers seemed to have more luck than others in finding resources that were useful to them. Obviously, some margin of error in the data collection method needs to be accounted for.

Of course, you could argue that whether a book is useful or not is a highly subjective decision.  One person’s wildcard might be another’s valuable and rare find, and whereas one tester might be searching for similar books, others might be looking for tangentially linked books.   As an example, within our group History students wanted old texts and Law students wanted new ones.

Positively, 91.4% of the recommendations looked useful, and only 3 searches returned nothing at all of any use to the user. 88.6% of searches generated at least one item that the user wanted to borrow, and only 4 searches produced nothing that the user would borrow. Even allowing for some deviation due to subjectivity, these are compelling results. Since the recommender requires no substantial input from the user in order to produce results, the users we interviewed felt that a small proportion of searches returning nothing was acceptable.

Privacy concerns?

As in previous research, none of the undergraduates attending the focus groups expressed any concern about privacy; they understood that the library collects circulation data and that the book recommender’s results are generated from that circulation data.

‘I would benefit from everyone else’s borrowing as they are benefitting from mine, so I haven’t got a problem’.

‘It would be nice to be informed about it and given the option to opt out, but I don’t have a problem with it. No.’

Although more than one attendee said it would be ‘nice to be asked’, they wouldn’t want this to delay the development of the book recommender.

In conclusion, the time-poor, pressured student struggling to find reading-list books that are still in the library would welcome another way of finding precious resources. The majority of students in our groups would use the recommender and, although some recommendations are better than others, they would be willing to forgive this if it gave them just one more resource when the coursework deadline looms!

Working with Academics and the COPAC Recommender

Over the past month, after compiling a list of 132 potential candidates, I’ve been working with fourteen academics from representative disciplines within the Humanities around the UK to test and give feedback on the new Copac recommender.  Those individual interviews are now complete, and I am starting to put together a synthesis report on what they’ve told me. So far, it’s all quite heartening.

The first thing, as you will no doubt be pleased to hear, is that the recommender works!  For the majority of the searches, it returned legitimate and useful suggestions, which the academics said were definitely important and could be used to develop reading lists: “quite a few of these or so are just spot on” was a comment a number of them made, as well as “I know these books and authors; this is what I would hope would come up”.  Others also found the recommender could be applied to their own professional research: “I knew about this book, but I’d forgotten about it, so this is a good memory jogger, since that would be something I need to consult again”.  One academic found a book during the testing that he hadn’t heard of before and, judging from the description and the contents, thought it likely that he would consult it for his own current project.

In terms of the actual recommendations for a reading list, we heard comments such as “this is a seriously useful addition to Copac… it’s great for a reading list developer but this is a major advantage for students”, illustrating that the recommender could also support undergraduate research, as well as developing lists for modules and courses.

To qualify, however, not all searches returned such useful content, and we knew they wouldn’t.  Some recommendations were considered tangential or too general, which is partly down to the nature of the algorithm used, which groups searches with other searches; thus, not all suggestions are going to carry the same weight.  However, there is something to be said for serendipitous research in that sense: we often find that researchers themselves practise a non-linear approach, particularly near the beginning of their search on a library catalogue or database, allowing those accidents and peregrinations to guide rather than hinder.  To that end, one lecturer pointed out that “the editions and recommendations offered some interesting trails”, emphasising that the circuitous still has its place in the world of research.

In other instances, some searches produced no results at all.  This could simply be because the searches were too obscure: the reality of a union catalogue like Copac is that the majority of its users will not have taken out those same specialised books in relation to other books.  It could also still be connected to the glitch we discovered with the ISBNs, which was discussed in the post “Data Loading, Processing, and More Challenges with ISBNs – a Technical Update”: “In terms of the ISBN issue, we found our problem was not so much that we have duplicates appearing, but that when we implement it into Copac many results did not have recommendations at all – quite simply because we couldn’t easily match works with the same ISBN to one another.”  As with all testing, the discovery of glitches and bugs is advantageous, since these kinds of problems can only be found by people actually using the tool.

In terms of how the recommender functions for the end user (rather than the technical end), testers said that the system “works the way you’d expect it to work”, which is important to highlight: we wanted to ensure that teachers and students recognised it as a useful tool and were familiar with its basic functions from the outset, without needing any kind of training.   Another said “from a personal perspective, it is a very good exploration tool” and that the “mixture of suggestions is very interesting but their usefulness would depend on the kind of course I’d be developing.”  That comment is also important, because the recommender is not meant to replace or surpass the expertise of the researcher, particularly with regard to reading lists.  It is a tool, pure and simple, and we understand that most academics will still have their own methods for choosing.  However, if the recommender can give them titles they had possibly forgotten about, or wouldn’t have thought about in relation to their searches before, then it is indeed valuable.  One lecturer also said that the recommender offered “a useful method for browsing”, which could help develop improved research skills and information literacy.

All in all, the testing showed us that academics do appreciate the recommender and see it as valuable for putting together reading lists, as well as for aiding undergraduates in their own research; the other valuable point I discovered in our favour was that none of them immediately saw the recommender as “just like Amazon’s”: to a person, they all saw it as much more refined and suited to their work as teachers and researchers.

Data loading, processing, and more challenges with ISBNs — a technical update

A while back I wrote a post detailing some of the challenges we were encountering with resolving ISBNs through the API to ensure that items were allocated relevant recommendations.  This problem meant not only that duplicate items could appear in lists of recommendations, but also that relevancy could be weakened.  We said then that we were opting to ‘settle for grabbing the first ISBN to get the demonstrator working’ purely for testing purposes.

But then we began work on aggregating and normalising the data from our four additional partners, and found (of course) that the issue only became more acute as the quantity of data and the variance between records increased significantly. Processing of data has also slowed considerably as we tackle these larger pots of data, and if this work were taken further we’d be exploring how to enhance and streamline the database and processing workflows. In addition, calls to the API can currently return very slowly, which is clearly not sustainable in the longer term if the API is to be used more broadly as part of a core service infrastructure.  For detailed information on the loading and processing routines we’re using, see this document prepared by our developer, Dave Chaplin.

In terms of the ISBN issue, we found our problem was not so much that we have duplicates appearing, but that when we implement it into Copac many results did not have recommendations at all – quite simply because we couldn’t easily match works with the same ISBN to one another.  The level of duplication currently existing in the Copac database compounds this issue further, and is something we’re tackling separately – calling upon the API against work level records will go a long way in making this issue go away for Copac users.

But for testing purposes, the problem of empty results has been resolved by using OCLC’s xISBN service, which allows us to cross-walk from one ISBN to any of its aliases that might appear in the transaction data (see figure below). Right now we’re using the free API, which allows a pretty generous 1,000 calls a day – but at the scale of data and use we’re talking about here, the free service is not going to be a viable solution in the long term.
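
As an aside, here is a minimal sketch of what resolving an ISBN to its aliases via xISBN can look like. It is illustrative only: the endpoint, parameters, and response shape shown are assumptions based on OCLC’s public xISBN documentation, and the function name is hypothetical rather than taken from our loader.

```python
# Illustrative sketch: resolve one ISBN to the set of ISBNs xISBN treats as
# editions of the same work. Endpoint and JSON shape are assumptions based on
# OCLC's public xISBN documentation, not code from our prototype.
import json
import urllib.request

XISBN_URL = "http://xisbn.worldcat.org/webservices/xid/isbn/{isbn}?method=getEditions&format=json"

def isbn_aliases(isbn):
    """Return the ISBN plus any aliases xISBN knows about."""
    with urllib.request.urlopen(XISBN_URL.format(isbn=isbn)) as response:
        data = json.load(response)
    aliases = {isbn}
    for edition in data.get("list", []):         # one entry per known edition
        aliases.update(edition.get("isbn", []))  # each entry can carry several ISBNs
    return aliases
```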

The diagram below gives an overview of how the API currently works with the loan data from the 5 institutions.   Dave has stripped back the API so that it grabs one ISBN from each search result, and then we use xISBN to return all known variants.  These aliases are then matched to individual (and anonymised) user circulation data in the database (in other words, we find all the people who have that book in common), and we then trawl the database to see what other books those users have in common. Any item borrowed by 8 or more of the people in that subset will be automatically recommended.  Note that each recommendation is weighted by the total number of times the item has been borrowed (as per Dave Pattern’s methodology; see http://www.daveyp.com/blog/archives/1453) and ranked accordingly, with the top 40 suggestions offered. This is an attempt to present the user with relevant recommendations, rather than simply the related items that have been borrowed the most, while not swamping them with potentially hundreds, if not thousands, of suggestions.

Simple overview of how the API works
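
To make that flow concrete, the sketch below shows one way the ‘people who have that book in common’ step could be expressed in code. It is a simplified illustration under stated assumptions: the loan data is reduced to (user, ISBN) pairs, the function and field names are hypothetical, and the weighting shown (shared borrowers divided by an item’s total borrowers) is a stand-in for Dave Pattern’s exact formula rather than a copy of it.

```python
# Simplified sketch of the recommendation step, assuming circulation data has
# already been reduced to one (user_id, isbn) pair per loaned item per person.
from collections import defaultdict

THRESHOLD = 8     # minimum borrowers in common before an item is suggested
MAX_RESULTS = 40  # cap the list so users aren't swamped with suggestions

def recommend(seed_aliases, loans):
    """seed_aliases: every ISBN of the searched work; loans: iterable of (user_id, isbn)."""
    borrowers_by_isbn = defaultdict(set)
    for user_id, isbn in loans:
        borrowers_by_isbn[isbn].add(user_id)

    # Everyone who has borrowed any edition of the seed work.
    seed_borrowers = set()
    for alias in seed_aliases:
        seed_borrowers |= borrowers_by_isbn.get(alias, set())

    scored = []
    for isbn, borrowers in borrowers_by_isbn.items():
        if isbn in seed_aliases:
            continue
        shared = len(borrowers & seed_borrowers)
        if shared >= THRESHOLD:
            # Normalise by overall popularity so ubiquitous titles don't dominate.
            scored.append((shared / len(borrowers), isbn))

    return [isbn for _, isbn in sorted(scored, reverse=True)[:MAX_RESULTS]]
```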

This approach has improved matters significantly – but not completely.  Behind the scenes there is a hell of a lot of processing going on, which is slowing things down somewhat – and the call upon the xISBN service in each instance is not helping matters.  The diagram above definitely belies the scale we’re often dealing with.

For example, Foucault’s History of Sexuality has been a seminal text in many advanced humanities, arts, and social science disciplines for several decades now. This work has 71 individual ISBN aliases, 3,327 individual borrowers, 182,270 cumulative ‘items’ associated with those borrowers (or loans, although we don’t count multiple loans of the same item by the same person). Of those 182,270 books borrowed by those 3,327 people, 12,497 have 8 people in common.  Using our current experimental system, the first time we ran that search it took around 70 seconds to process (!)

So that we can test the qualitative value of the results with academic users, we’re storing that search result locally so that the next user does not hit the same delay, although (again) this is not a long-term solution for stable service delivery, as it could reasonably be argued that it goes beyond fair use of the OCLC API. Obviously, further testing would need to be undertaken once the system was improved, to evaluate the functionality and speed.
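
The stop-gap caching amounts to little more than the sketch below: remember the computed list the first time a work is looked up so subsequent users skip the expensive xISBN and co-borrowing work. It is purely illustrative; the storage layer and names are assumptions, not our prototype’s actual code.

```python
# Illustrative stop-gap cache: store computed recommendations per seed ISBN so
# repeat searches avoid the ~70-second processing hit. Schema is hypothetical.
import json
import sqlite3

db = sqlite3.connect("recommendation_cache.db")
db.execute("CREATE TABLE IF NOT EXISTS cache (seed_isbn TEXT PRIMARY KEY, results TEXT)")

def cached_recommend(seed_isbn, compute):
    """compute: callable that does the expensive xISBN + co-borrowing lookup."""
    row = db.execute("SELECT results FROM cache WHERE seed_isbn = ?", (seed_isbn,)).fetchone()
    if row:
        return json.loads(row[0])   # cache hit: answer immediately
    results = compute(seed_isbn)    # cache miss: do the slow work once
    db.execute("INSERT OR REPLACE INTO cache VALUES (?, ?)", (seed_isbn, json.dumps(results)))
    db.commit()
    return results
```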

Work is now underway to put the prototype in front of groups of academics, undergraduates, and librarians, so we can further understand the value of the service in supporting learning and research. This will all be reported, along with the technical lessons learned and routes forward, in a final shared services feasibility study.  Certainly, working with the data in aggregate and at such a large scale has unearthed challenges we hadn’t anticipated — all of them surmountable, but they mean that if we take this development further we will need to go back to the drawing board in terms of system infrastructure, which works fine as a live proof of concept but is not production-ready in terms of handling large amounts of data processing or usage.

 

Evaluating the recommender

While our techies, Dave and Neeta, are working away to crunch the data we’ve pulled in from all of our partners, the other half of the Copac AD team are looking at the methods we’ll be using to evaluate the recommender.

We had some interesting user evaluations from the previous SALT project, where we spoke to postgraduates at The University of Manchester to find out if the recommender would support their research by surfacing valuable materials that might otherwise have been unknown to them.  There was overwhelming support for the concept from the groups we spoke to, so we’re interested to find out how other users will respond to the recommender.

This time, we’re planning focus groups with undergraduate students to find out if the recommendations we generate will help them to find course materials.  We think that students, already familiar with the concept from the likes of Amazon and Spotify, will welcome the Copac AD recommender so we’ll be showing the groups our prototype and asking what they think.

We’ll also be interviewing academics and teachers to find out if recommendations could support the development of course reading lists.  Some of the postgraduate researchers from the SALT focus groups were also graduate teaching assistants, and they told us that they could see how recommendations from the library catalogue could help their students to read more material and move beyond reading lists – we’d like to explore this more widely with academics from a range of institutions.

And finally, we’ll be conducting interviews with academic liaison librarians to ask if the recommender could support their work with collection development.  I was a liaison librarian in a previous life, and I can definitely see the potential of the tool – especially for gathering recommendations for stock purchases – but again we’ll be speaking to librarians from our project partner institutions to get their views.

We’ll be publishing our evaluation reports on this blog along with our testing instruments over the next couple of months.

Progress so far, and some of the challenges around identifiers and ISBNs we’re facing along the way

Over the last few weeks we’ve been liaising with our counterparts at the University of Sussex, Cambridge University Library, and Lincoln University to extract data and bring it over here to Mimas to start processing. Our aim is to add those sets to the existing API (along with updated data from JRUL and Huddersfield), so that the recommendations, or aggregations of related texts, produced are less ‘skewed’ to the JRUL context (course reading lists, etc.).

When we ran the SALT project, we worked only with the substantial JRUL set of circulation data.  Interestingly (and usefully), the way that JRUL set up their system locally allowed us to see both ISBNs and the JRUL-assigned work ID for identifying items. This meant we could deal with items without ISBNs — somewhat critical to our ‘long tail’ hypothesis, which posited that recommenders could help surface under-used items, many of which might pre-date the 1970s, when ISBNs were phased in.

But now we’re dealing with circulation data from more than one source, and of course there are issues with this approach. The JRUL local solution for items without ISBNs is not widely applied, and now that we’re dealing with more datasets we need to map items between them; the only common ID we have is the ISBN. This means that for now we need to shift back to using only the ISBN as the ID we deal with, and adjust our tables and API accordingly.  We do see this as limiting, but for the key objectives of this project it’s good enough. However, we want to return to this challenge later in the project to see if we can refine the system so it can surface older items.

The other issue emerging currently is that of multiple ISBNs for the same work – a perennial and complex issue, which is particularly coming to the fore in the debate on how to identify eBooks: http://publishingperspectives.com/2010/11/isbns-and-e-books-the-ongoing-dilemma/

With some of our partners’ data, this field has only one value – it is difficult to pinpoint exactly where in the supply chain the decision as to which ISBN to assign occurs (it depends on vendor systems and cataloguing practices), but it’s clear it will vary a great deal according to institution and processes. In other datasets, multiple ISBNs for one work are recorded, and we need to make a call as to which ISBN we work with.   We could just go with the first ISBN that appears, but this will likely result in duplicates appearing in the recommendations list; it also means that the algorithm on which the recommendation itself is made is watered down (i.e., recommendations will be less meaningful).

For now, we’re going to have to settle for grabbing the first ISBN to get the demonstrator working.  But we’ll also need to develop a stage in our processing where we map ISBNs, and this would also need to be part of the API (so institutions using the API can also map effectively). Right now we’re trying to find out if there is some sort of service that might help us out here. The general consensus is that ‘there must be something’ (surely we’re not the first people to tackle this), but so far we haven’t come across anything that fits the bill.  Any suggestions gratefully received!
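
For what it’s worth, the interim rule is as simple as the sketch below: take the first ISBN on a record, with a hook left for a proper mapping step once we find a suitable service. The field and function names are hypothetical rather than our partners’ actual export format.

```python
# Sketch of the interim normalisation: grab the first ISBN on a record, leaving
# a hook where an ISBN-to-work mapping step could later slot in.
def canonical_isbn(record, isbn_map=None):
    """record: dict parsed from a partner's circulation export (hypothetical shape)."""
    isbns = record.get("isbns", [])
    if not isbns:
        return None      # no ISBN: the item can't yet be matched across datasets
    chosen = isbns[0]    # interim rule: just take the first ISBN listed
    if isbn_map:
        # Future stage: collapse known aliases of the same work to one canonical ISBN.
        chosen = isbn_map.get(chosen, chosen)
    return chosen
```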

 

 

Announcing the Copac Activity Data Project (otherwise known as SALT 2)

We’re extremely pleased to announce that, thanks to funding from JISC, we are about to commence work that builds on the success of SALT and provides further understanding of the potential of aggregating and sharing library circulation data to support recommender functionality at the local and national levels. From now until July 31st 2012, we want to strengthen the existing business case for openly sharing circulation data to support recommendations, and we will produce a scoping and feasibility report for a shared national service to support circulation data aggregation, normalisation, and distribution for reuse via an open API.

To achieve this we plan to aggregate and normalise data from libraries in addition to JRUL and to make this available, along with the John Rylands Library, University of Manchester dataset, through a shared API. Our new partners in this include Cambridge University Library, Lincoln University Library, Sussex University Library, and the University of Huddersfield Library.

Copac AD will conduct primary research to investigate the following additional use cases:

  • an undergraduate from a teaching and learning institution searching for course related materials
  • academics/teachers using the recommender to support the development of course reading lists
  • librarians using the recommendations to support academics/lecturers and collections development.

At the same time, we’re going to develop a Shared Service Scoping and Feasibility Study, which will explore the options for a shared service for aggregating, normalising and hosting circulation data, and the potential range of web services/APIs that could be made available on top of that data.

Issues we’ll address include: what infrastructure would need to be in place; how scaleable the service would need to be and whether it could grow with demand; the potential use cases for such a service and the benefits to be realised; the projected ongoing costs of such a service; and technical and financial sustainability, including potential business model options moving forward.

If you’re interested in learning more, here’s the proposal for this work [doc].  And as with SALT, we will be regularly updating the community on our progress and lessons learned through this blog.

Introducing the SALT recommender API (based on 10 years of University of Manchester circulation data)

I’m pleased to announce the release of the SALT recommender API which works with over ten years of circulation data from the University of Manchester’s John Rylands Library.

The data source is currently static, but nonetheless yields excellent results. Please experiment and let us know how you get on. Stay tuned for a future post detailing some work we have planned for continuing this project, which will include assessing additional use cases, aggregating more data sources (and adding them to the API) and producing a shared service feasibility report for JISC.

User Feedback Results – Super 8

In an effort to find the magic number, the SALT team opened its testing labs again this week.  Another 6 University of Manchester postgraduate students spent the afternoon interrogating the Copac and John Rylands library catalogues to evaluate the recommendations thrown back by the SALT API.

With searches ranging from ‘The Archaeology of Islam in Sub Saharan Africa’ to ‘Volunteering and Society: Principles and Practice’, no corner of the Arts and Humanities was left unexplored, or at least it felt that way.  We tried to find students with diverse interests within Arts and Humanities to test the recommendations from as many angles as possible.  Using the same format as the previous groups (documented in our earlier blog post ‘What do users think of the SALT recommender?’), the library users were asked to complete an evaluation of the recommendations they were given.  Previously the users tested SALT with the threshold set at 3 (that is, 3 people had borrowed a book, which made it eligible to be returned as a recommendation), but we felt that the results could be improved: although 77.5% found at least one recommendation useful, too many recommendations were rated as ‘not that useful’ (see the charts in ‘What do users think of the SALT recommender?’).

This time, we set the threshold at 15 in the John Rylands library catalogue and 8 in Copac.  Like the LIDP team at Huddersfield (http://library.hud.ac.uk/blogs/projects/lidp/2011/08/30/focus-group-analysis/), we have a lot of data to work with now, and we’d like to spend some more time interrogating the results to find out whether clear patterns emerge.  Although our initial analysis has raised some further questions, it has also revealed some interesting and encouraging results.  Here are the highlights of what we found out.

The Results

On initial inspection, JRUL, with its threshold of 15, improved on previous results:

Do any of the recommendations look useful?

92.3% of the searches returned at least one item the user thought was useful; however, when asked whether they would borrow at least one item, only 56.2% answered that they would.

When asked, a lot of the users stated that they knew the book and so wouldn’t need to borrow it again, or that although the book was useful, their area of research was so niche that it wasn’t specifically useful to them but they would deem it as ‘useful’ to others in their field.

One of the key factors that came up in the discussions with users was the year the book had been published. The majority of researchers need up-to-date material, many preferring journals to monographs, and this was taken into account when deciding whether a book was worth borrowing. Many users wouldn’t borrow anything more than 10 years old:

‘Three of the recommendations are ‘out of date’ 1957, 1961, 1964 as such I would immediately discount them from my search’ 30/08/11 University of Manchester, Postgraduate, Arts and Humanities, SALT testing group.

So a book could be a key text, and ‘useful’, but it wouldn’t necessarily be borrowed.  Quite often, one user explained, rather than reading a key text she would search for journal articles about it, to get up-to-date discussion and analysis. This has an impact on our hypothesis, which is about discovering the long tail: quite often the long tail that is uncovered includes older texts, which some users discount.

Copac, with a threshold of 8, was also tested. Results here were encouraging:

Do any of the recommendations look useful?

Admittedly, further tests would need to be done on both thresholds, as the number of searches conducted (25) does not give enough results to draw concrete conclusions from, but it does seem as if the results are vastly improved when the threshold is increased.

No concerns about privacy

The issue of privacy was raised again. Many of the postgraduate students are studying niche areas and seemed to understand how this could affect them should the recommendations be attributed back to them. However, as much as they were concerned about their research being followed, they were also keen to use the tool themselves and so their concerns were outweighed by the perceived benefits. As a group they agreed that a borrowing rate of 5 would offer them enough protection whilst still returning interesting results. The group had no concerns about the way in which the data was being used and indeed trusted the libraries to collect this data and use it in such a productive way.

‘It’s not as if it is being used for commercial gain, then what is the issue?’ 30/08/11 University of Manchester, Postgraduate, Arts and Humanities, SALT testing group.

Unanimous support for the recommender

The most encouraging outcome from the group was the uniform support for the book recommender. Every person in the group agreed that the principle of the book recommender was a good one, and they gave their resolute approval for their data being collected and used in a positive way.

All of them would use the book recommender if it was available. Indeed one researcher asked, ‘can we have it now?’

Janine Rigby and Lisa Charnock 31/08/11

Final blog post

In this final post I’m going to sum up what this project has produced, potential next steps, key lessons learned, and what we’d pass on to others working in this area.

In the last five months, the SALT project has produced a number of outputs:

  1.  Data extraction recipe: http://salt11.wordpress.com/recipe-data-extraction-from-talis/
  2.  Details on how the algorithm can support recommendations (courtesy Dave Pattern): http://www.daveyp.com/blog/archives/1453
  3. Technical processes documentation for processing the data and supporting the recommender API (though the API itself is not yet published): http://salt11.wordpress.com/technical-processes/
  4. An open licensing statement from JRUL which means the data can be made available for reuse (we’ve yet to determine how to make this happen, given the size of the dataset; and we also need to explore whether CC-BY is the most appropriate license going forward): http://salt11.wordpress.com/2011/07/26/agreeing-licensing-of-data/
  5. A trial recommender functionality in the live Copac prototype: http://salt11.files.wordpress.com/2011/07/copac_recommender.jpg
  6. A recommender function in the JRUL library search interface prototype: http://salt11.files.wordpress.com/2011/08/salt_jrul.jpg
  7. User testing instruments: SALT Postgraduate User Discussion Guide and SALT user response sheet and results
  8. Feedback from collections managers & potential data contributors helping us consider weaknesses and opportunities, as well as possible sustainable next steps.

 

Next steps:

There are a number of steps that can be taken as a result of this project – some imminent ‘quick wins’ which we plan to take on after the official end, and then others that are ‘bigger’ than this project.

What we plan to do next anyway:

  • Adjust the threshold to a higher level (using the ‘usefulness’ benchmark given to us by users as a basis) so as to suppress some of the more off-base recommendations our users were bemused by.
  • Implement the recommender in the JRUL library search interface
  • Once the threshold has been reset, consider implementing the recommender as an optional feature in the new Copac interface. We’d really like to, but we’d need to assess whether the results are too JRUL-centric.
  • Work with JRUL to determine most appropriate mechanisms for hosting the data and supporting the API in the longer term (decisions here are dependent on how, if at all, we continue with this work from a Shared Services perspective)
  • Work with JRUL to assess the impact of this in the longer term (on user satisfaction, and on borrowing behaviour)

The Big Picture (what else we’d like to see happen):

1.       Aggregate more data. Combine the normalised data from JRUL with processed data from additional libraries that represent a wider range of institutions, including learning and teaching. Our hunch is that only a few more would make the critical difference in ironing out some of the skewed results we get from focusing on one data set (i.e. results skewed to JRUL course listings)

2.  Assess longer term impact. Longer-term analysis of the impact of the recommender functionality on JRUL user satisfaction and borrowing behaviour.  Is there, as with Huddersfield, more borrowing from ‘across the shelf’? Is our original hypothesis borne out?

3.  Requirements and costs gathering for a shared service. Establish the requirements and potential costs for a shared service to support processing, aggregation, and sharing of activity data via an API.  Based on this project, we have a fair idea of what those requirements might be, but our experience with JRUL indicates that such provision needs to adequately support the handling and processing of large quantities of data.  How much FTE, processing power, and storage would we need if we scaled to handling more libraries? Part of this requirements-gathering exercise would involve identifying additional contributing libraries, and the size of their data.

4.       Experiment with different UI designs and algorithm thresholds to support different use cases. For example, undergraduate users vs ‘advanced’ researcher users might benefit from the thresholds being set differently; in addition, there are users who want to see items held elsewhere and how to get them vs those who don’t. Some libraries will be keen to manage user expectations if they are ‘finding’ stock that’s not held at the home institution.

5.       Establish more recipes to simplify data extraction from the more common LMSs beyond Talis (Horizon, ExLibris Voyager, and Innovative).

6.       Investigate how local activity data can help collections managers identify collection strengths and recognise items that should be retained because of association with valued collections. We thought about this as a form of “stock management by association.”  Librarians might treat some long-tail items (e.g. items with limited borrowing) with caution if they were aware of links/associations to other collections (although there is also the caveat that this wouldn’t be possible with local activity data reports in isolation)

 7.       More ambitiously, investigate how nationally aggregated activity data could support activities such as stock weeding by revealing collection strengths or gaps and allowing librarians to cross check against other collections nationally. This could also inform the number of copies a library should buy, and which books from reading lists are required in multiple copies.

8.       Learning and teaching support. Explore the relationship between recommended lists and reading lists, and how it can be used as a tool to support academic teaching staff.

9.       Communicate the benefits to decision-makers.  If work were to continue along these lines, then a recommendation that has come out strongly from our collaborators is the need to accompany any development activity with a targeted communications plan, which continually articulates to decision-makers within libraries the benefits of utilising activity data to support search. While within our community a significant amount of momentum is building in this area, our meetings with librarians indicate that the ‘why should I care?’ and, more to the point, ‘why should I make this a priority?’ questions are not adequately answered. In a nutshell, ‘leveraging activity data’ can easily fall down or off the priority lists of most library managers.  It would be particularly useful to tie these benefits to the strategic aims and objectives of University libraries as a means to get such work embedded in annual operational planning.

What can other institutions do to benefit from our work?

  1. For those using the Talis LMS (and with a few years of data stored), institutions can extract data, and create their own API to pull in as a recommender function using these recipes.
  2. Institutions can benefit from the work we did with users to understand their perceptions of the function, and can be assured that students (undergraduates and postgraduates) can see the immediate benefit (as long as we get rid of some of the odd stuff by setting the threshold higher)
  3. Use the findings of this project to support a business case for this work to their colleagues

How can they go about this?

  1. Assess the quality and quantity of the data stored in your LMS to determine if there’s potential there. For this project (and for the simple recommender based on ‘people who borrowed’) you only need data that ties unique individuals to borrowed items (see more from Andy Land on the data extraction process and how anonymisation is handled here: http://salt11.wordpress.com/recipe-data-extraction-from-talis/). A minimal sketch of this data shape follows this list.
  2. To understand how the recommender algorithm works, see this post Dave Pattern wrote for us: http://www.daveyp.com/blog/archives/1453
  3. To follow our steps in terms of data format, loading, processing, and setting up an API, see Dave Chaplin’s explanation: http://salt11.wordpress.com/technical-processes/
  4. To conduct user testing and focus groups to assess the recommender, feel free to draw from our SALT Postgraduate User Discussion Guide and SALT user response sheet.
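
As promised above, here is an illustrative example of the minimal anonymised data shape a ‘people who borrowed’ recommender needs: a hashed borrower token paired with an item identifier. The column names and sample values are hypothetical, not the format described in Andy Land’s recipe.

```python
# Illustrative only: the minimal anonymised circulation data a simple
# "people who borrowed" recommender needs. Column names are hypothetical.
import csv
import io

sample = io.StringIO(
    "user_token,isbn,loan_date\n"
    "a1f4c9,9780140268867,2011-02-14\n"
    "a1f4c9,9780631198192,2011-02-14\n"
    "7be02d,9780140268867,2011-03-02\n"
)

loans = [(row["user_token"], row["isbn"]) for row in csv.DictReader(sample)]
print(loans)  # pairs of (anonymised borrower, borrowed item)
```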

Our most significant lessons:

  1. A lower threshold may throw up ‘long tail’ items, but they are likely not to be deemed relevant or useful by users (although they might be seen as ‘interesting’ and something they might look into further). Set a threshold of ten or so, as the University of Huddersfield has, and the quality of recommendations is relatively sound.
  2. Concerns over anonymisation and data privacy are not remotely shared by the users we spoke to.  While we might question this response as potentially naive, this does indicate that users trust libraries to handle their data in a way that protects them and also benefits them.
  3. You don’t necessarily need a significant backlog of data to make this work locally. Yes, we had ten years’ worth from JRUL, which turned out to be a vast amount of data to crunch.  But interestingly, in our testing phases when we worked with only 5 weeks of data, the recommendations were remarkably good.  Of course, whether this holds true elsewhere depends on the nature and size of the institution. But it’s certainly worth investigating.
  4. If the API is to work on the shared service level, then we need more (but potentially not many more) representative libraries to aggregate data from in order to ensure that recommendations aren’t skewed to represent one institution’s holdings, course listings or niche research interests, and can support different use cases (i.e. learning and teaching).

Lessons learned from the user evaluation perspective (or can we define the ‘long tail’?)

The key lesson we’ve learned during this project is that the assumptions behind the project’s hypothesis need to be reconsidered, as in this context the ‘long tail’ is complex and difficult to measure. Firstly, how do we evaluate what is ‘long tail’ from a user perspective? We may draw a line in the sand in terms of the number of times an item has been borrowed, but this doesn’t necessarily translate into individual or community contexts. Most of this project was taken up with processing the data and creating the API and UI; if we’d had a bit more time we could have spent more resource dealing with these questions as they arose during testing.

The focus groups highlighted how diverse and unique each researcher, and what they are researching, is. We chose humanities postgrads, at PhD and master’s level, but in this group alone we had a huge range of topic areas, from the incredibly niche to the rather more popular. As a result, some respondents found the niche searches fruitful while others found nothing, because their research area is so niche that hardly any material exists that they don’t already know about. In addition, when long-tail material is revealed, some researchers find it outdated or irrelevant; that, of course, is why it isn’t borrowed very often. So is there any merit in bringing it to the attention of the research community?

Further, more in-depth testing in this area needs to be done in order to find answers to some of these problems.  The testing for this project asked the respondents to rate their searches and pick out some of the more interesting texts, but we need to sit down with fewer researchers and broaden the discussions. What is relevant? How do you gauge it as relevant? Some of the respondents said the books were not relevant but still said they would borrow them, so where does this discrepancy come from? Perhaps ‘relevant’ is not the correct term: can the long tail of discovery produce new perspectives, and interesting associations perhaps previously not thought of? Only one-to-one, in-depth testing can give the right data, which will then indicate the level at which the threshold should be set.

After all, is there any point in having a recommender which only gives you recommendations you expect or already know about? However, some participants wanted or expected exactly this from a recommender, and were disappointed when they got results they could not predict. I know that if I search on Amazon for a CD I’m familiar with, I sometimes get recommendations I know about or already own. So the recommender means different things to different people. There is a group that is satisfied to find they know all the recommended texts and can sleep soundly knowing they have completely saturated their research topic, and there is a group that needs new material.

The long tail hypothesis is a difficult one to prove in a short-term project of 6 months. As its name suggests, the long tail needs to be explored over a long time. Monitoring borrowing patterns in the library, click-through, and feedback from the user community and librarians will help to refine the recommender tool for ultimate effectiveness.