February 2010


The following text is taken from a briefing paper I prepared for the UK federation Policy and Advisory Board – I thought it might be of broader interest!

1. Introduction

One of the most discussed topics within the federation space at the moment is ‘interfederation’. This describes the process of two or more federations exchanging metadata to allow members within different federations to connect via a federated access management exchange. This process results in a ‘metadata aggregation’ – the subject of a useful paper by Ian Young and Chad La Joie. This briefing paper is intended to give an overview of current thinking on interfederation.

Most interfederation models assume that Identity Providers are static and Service Providers are mobile. This means that Identity Providers are expected to join their ‘home’ federation (their local education and research federation) but that Service Providers have no such natural affiliation. At present, Service Providers therefore have to join multiple federations to interact with each separate group of national Identity Providers. This is clearly sub-optimal for Service Providers, who have to deal with multiple agreements, different approaches to discovery, attributes etc. and differing approaches to charging. The interfederation approach aims to solve this problem as effectively as possible.

Whilst these assumptions generally form the basis of most discussions, there is no requirement for Identity Providers to be ‘static’ within federations and future models may see more mobility from IdPs.
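To make the idea of a ‘metadata aggregation’ a little more concrete, here is a minimal sketch in Python. It assumes each federation publishes a SAML 2.0 metadata document with an EntitiesDescriptor root, and it simply merges the entities and drops duplicates by entityID. The function name and the example entities are hypothetical, and the sketch deliberately leaves out the steps a real aggregator would need most: verifying each federation's signature and re-signing the result.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the SAML 2.0 metadata specification.
MD_NS = "urn:oasis:names:tc:SAML:2.0:metadata"
ET.register_namespace("md", MD_NS)


def aggregate_metadata(federation_documents):
    """Merge several metadata documents into one EntitiesDescriptor.

    Entities are de-duplicated by entityID; the first copy seen wins.
    NOTE: no signature verification is done here (illustration only).
    """
    aggregate = ET.Element(f"{{{MD_NS}}}EntitiesDescriptor")
    seen = set()
    for xml_text in federation_documents:
        root = ET.fromstring(xml_text)
        for entity in root.iter(f"{{{MD_NS}}}EntityDescriptor"):
            entity_id = entity.get("entityID")
            if entity_id is None or entity_id in seen:
                continue
            seen.add(entity_id)
            aggregate.append(entity)
    return aggregate


# Hypothetical metadata from two federations (example entities only).
FEDERATION_A = """<md:EntitiesDescriptor xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata">
  <md:EntityDescriptor entityID="https://idp.example.ac.uk/shibboleth"/>
  <md:EntityDescriptor entityID="https://sp.example.com/shibboleth"/>
</md:EntitiesDescriptor>"""

FEDERATION_B = """<md:EntitiesDescriptor xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata">
  <md:EntityDescriptor entityID="https://sp.example.com/shibboleth"/>
  <md:EntityDescriptor entityID="https://idp.example.edu/shibboleth"/>
</md:EntitiesDescriptor>"""

aggregate = aggregate_metadata([FEDERATION_A, FEDERATION_B])
for entity in aggregate:
    print(entity.get("entityID"))
```

The Service Provider that had joined both federations appears only once in the aggregate, which is the technical counterpart of the ‘join once’ goal described above.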

2. Available approaches

2.1 Aligning Policy

Whilst not strictly an ‘interfederation’ approach, the complexities faced by Service Providers could be addressed through more work on ensuring that education and research federations use policies that are aligned. This would mean that SPs could be given assurances that the policy of federation A is the same as federation B, with perhaps minor changes to clauses x, y and z, thus cutting down on the lead time and legal expenses of SPs as they join multiple federations. This approach was the subject of a JISC-funded study: “Investigation into the Feasibility of a Cross-Jurisdictional Common Access Management Federation Agreement”. This report noted that there were no significant legal reasons why federations have adopted different policy agreements, and that most differences were based on cultural and funding issues.

Advantages

  • Supports SPs by improving their experience of approaching multiple federations.
  • Does not impact on charging models adopted by many federations.
  • No need for interfederation agreements to be signed.

Disadvantages

  • Still requires SPs to join multiple federations.

Whilst it is unlikely that we will see a wholesale change in policy across federations, the study has been useful in making small changes to policies in order to support interfederation – such as the alignment of meaning assigned to values in the eduPersonScopedAffiliation field.
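As an illustration of why aligned meanings matter, here is a short Python sketch of how a Service Provider might read an eduPersonScopedAffiliation value. The value has the form affiliation@scope, and the affiliation part is drawn from the controlled vocabulary in the eduPerson schema. The function name is hypothetical, and lower-casing both parts is a choice made for this sketch rather than something any federation policy requires.

```python
# Controlled vocabulary for the affiliation part, from the eduPerson schema.
EDUPERSON_AFFILIATIONS = {
    "faculty", "student", "staff", "alum",
    "member", "affiliate", "employee", "library-walk-in",
}


def parse_scoped_affiliation(value):
    """Split 'affiliation@scope' and check the affiliation is a known value."""
    affiliation, separator, scope = value.partition("@")
    if not separator or not affiliation or not scope or "@" in scope:
        raise ValueError(f"not of the form affiliation@scope: {value!r}")
    affiliation = affiliation.lower()
    if affiliation not in EDUPERSON_AFFILIATIONS:
        raise ValueError(f"unknown affiliation: {affiliation!r}")
    return affiliation, scope.lower()


print(parse_scoped_affiliation("member@example.ac.uk"))
```

The code can check that the value is well formed, but it cannot check that ‘member’ means the same thing in two federations. That shared meaning is exactly what the policy alignment work provides.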

2.2 Interfederation

Interfederation is achieved by two federations bilaterally agreeing to exchange metadata, and agreeing a policy for achieving this aim. Possible uses for the UK federation would be interfederation with the Government Gateway to allow parents to use their citizen ID to access school data, and interfederation with organisations such as InCommon, whose Service Providers are of interest to UK Identity Providers.

Advantages

  • Solves the problem of SPs joining multiple federations;
  • Interfederation agreement can be lightweight;
  • Model template agreement is available.

Disadvantages

  • Getting commitment and agreement from two federations to take forward;
  • Legal issues surrounding the agreements.

This model is now well developed, and an interfederation agreement for use by educational and research federations has been tabled. However, no real use is being made of the process. For this approach to be successful, it will be necessary for two federations to take the plunge, sign an agreement and start testing with Service Providers.

2.3 Confederation

Confederation involves multiple federations all agreeing to abide by a single agreement on how metadata will be published, issued and aggregated. This model is being explored by the GEANT-funded eduGain project.

Advantages

  • Federations only need to agree to one policy;
  • Easier for entities to understand the process when centrally managed.

Disadvantages

  • Not all federations are likely to be in each confederation ‘club’ so bi-lateral agreements will still be needed;
  • Sensitivity over charging models used by each federation;
  • Complex to achieve widescale agreement;
  • Complexities over ‘lowest common denominator’ for assurance.

As this approach requires many different parties to agree on an approach, it is the most complex to finalise. The eduGain model is suffering from this, and has built up quite a complex set of agreements: a constitution that federations will need to sign, a policy agreement that federations will need to sign and a metadata terms of use (which seems redundant in the light of the preceding agreements). This will act as a significant barrier to entry for many federations, including the UK federation.

Another example of this in action is Kalmar2. This is a collaboration between four of the Nordic countries, allowing confederation to be achieved between a set of like-minded federations.

2.4 Metadata Terms of Use

In this approach, a Federation Operator simply publishes a set of metadata, with terms of use attached to it (similar to an open source software licence). Any other Federation Operator, or indeed any other metadata distributor, may use the metadata file subject to the terms of use. Trust is established by the consuming Federation Operator obeying the terms of use and the publishing Federation Operator providing a ‘Federation Operator Practice’ statement that the consuming Operator can read, assess and choose to trust.

Advantages

  • No need for complex legal agreements;
  • Allows metadata aggregation at many levels – does not need to involve a Federation Operator / Registrar;
  • Advantageous for ‘virtual organisations’ that cross multiple federations.

Disadvantages

  • Does not provide the ‘safety net’ of a signed legal agreement.

This approach is popular among technical developers and federations that have very limited liability, but is less popular with those who are naturally risk averse or have concerns about legal liability. As this is the easiest way to achieve interfederation, it is beginning to be used extensively amongst small projects. This ‘bottom up’ approach is likely to grow rapidly, and as federations mature it is likely to be the process of choice for achieving simple interfederation.

One of the things that we are looking at closely with the UK federation at the moment is a move towards a more seamless approach to metadata management. Metadata is clearly one of the most important things about a federation – it has all the information to allow IdPs and SPs to connect to each other. It is also critically important that the metadata is accurate – bad metadata could easily break the trust model of a federation.

However, metadata takes a long time to process, check and verify. One approach that federations have been taking to help streamline this process is to introduce systems whereby members can automagically update their own metadata. A good example of this is the SWITCH AAI Resource Registry.

Implementing something like this for the UK federation is an interesting concept, but I still have a number of questions:

  • What is the impact on members in terms of additional cost / time from having to upload their own metadata information?
  • Is there a corresponding reduction in staff time and effort at the federation operator, and is it right to switch the balance of effort?
  • How do we maintain integrity and accuracy of data? What would be the impact of incorrect data being passed through?
  • What is an appropriate level of human intervention / checking of data with this automated process?

I’d be really interested to hear people’s thoughts on this process.
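One way to think about the balance between automation and human checking is to let software catch the mechanical errors and pass everything else to a person. The Python sketch below shows what such a first-pass check on a member's uploaded SAML metadata might look like. The function name and the three checks are my own assumptions for illustration; they are not how the SWITCH registry or the UK federation actually validate submissions.

```python
import xml.etree.ElementTree as ET

# Namespaces from the SAML 2.0 metadata and XML Signature specifications.
MD = "{urn:oasis:names:tc:SAML:2.0:metadata}"
DS = "{http://www.w3.org/2000/09/xmldsig#}"


def check_submission(xml_text):
    """Return a list of problems; an empty list means the basic checks passed.

    Passing only means the submission is mechanically plausible. Whether the
    entity is who it claims to be still needs a human (hypothetical policy).
    """
    try:
        entity = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]
    if entity.tag != MD + "EntityDescriptor":
        return ["root element is not an EntityDescriptor"]

    problems = []
    if not entity.get("entityID"):
        problems.append("entityID is missing")
    roles = (entity.findall(MD + "IDPSSODescriptor")
             + entity.findall(MD + "SPSSODescriptor"))
    if not roles:
        problems.append("no IdP or SP role descriptor")
    certificates = entity.iter(DS + "X509Certificate")
    if not any(cert.text and cert.text.strip() for cert in certificates):
        problems.append("no X.509 certificate present")
    return problems


# A deliberately incomplete submission (hypothetical entity).
INCOMPLETE = (
    '<md:EntityDescriptor xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata" '
    'entityID="https://sp.example.ac.uk/shibboleth"/>'
)
for problem in check_submission(INCOMPLETE):
    print(problem)
```

Checks like these reduce the operator's workload for obvious mistakes, but they say nothing about whether the data is accurate, so they do not answer the trust questions above on their own.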

Of course, another option would be to adopt a more radical approach whereby Identity Providers and Service Providers host their own metadata and merely inform the federation of its location. This embraces the idea of a truly distributed service model…but is perhaps a step we are not yet ready for.

This week, I’m getting excited about statistics! Well, I need something down to earth to balance out the amazing experience of being at APAN29 in Sydney.

Just before I started at JISC, we had some long and detailed conversations about statistics as part of the ANGEL project. Usage statistics work has mumbled on in the background, but there hasn’t been any significant work in this area…until now. Like buses, JISC usage statistics projects all come at once.

Something I am very happy to see funded, particularly as I saw the birth of the project idea whilst walking on a very hot day in San Antonio, is the RAPTOR project at Cardiff University. At the moment, Shibboleth Identity Providers can produce very useful access logs for institutions, but in a format that is not particularly friendly or helpful to the needs of librarians who need to be able to quickly review and assess resource usage. RAPTOR will produce a toolkit to not only provide this functionality but also to integrate these statistics with EZProxy logs – a joined up approach which I’m sure will be appreciated.

Hand in hand with this, the UK federation are planning on producing a portal to allow institutions to upload appropriately anonymised statistics…possibly using the outputs from RAPTOR if we are smart about it. This will give us an interesting national view of resource usage, useful for both JISC and JISC Collections in focusing attention on the requirements of our community.

At the other end of the picture, it is equally important that we look at Service Provider statistics to provide the more detailed view of user behaviour beyond the authentication point. JISC Collections have been examining the potential of a usage statistics portal that will aggregate statistics from COUNTER compliant reports provided by publishers. Again, the point here is to reduce the amount of time librarians are forced to spend aggregating this information.

To complete the picture, the PIRUS project is looking at usage statistics right down at the article level across both publisher resources and repositories. More information is available in this post from Ben Wynne. PIRUS has produced a review of what information would be required to provide article level statistics. My only concern about this report is the ‘who’ section and the options described for identifying unique users. eduPersonTargetedID and eduPersonPrincipalName seem obvious candidates for potential unique identifiers but are missing from the report. The challenge here will be any suggestion that looks at tracking the same user across multiple Service Providers. Obviously this is useful information for institutions, publishers and authors, but the privacy issues and management of Personally Identifiable Information (PII) will have to be carefully examined.
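The cross-provider tracking challenge follows from how eduPersonTargetedID is meant to work: it is an opaque identifier that stays the same for one user at one Service Provider but differs between Service Providers. The Python sketch below shows the general idea using a keyed hash. The function name, the salt and the entityIDs are hypothetical, and this is not the exact algorithm used by Shibboleth or any other product.

```python
import hashlib
import hmac


def targeted_id(local_user_id, sp_entity_id, secret_salt):
    """Derive an opaque per-Service-Provider identifier for a user.

    The same user gets a stable value at each SP, but the values at two
    different SPs cannot be linked without the IdP's secret salt.
    """
    message = f"{sp_entity_id}!{local_user_id}".encode("utf-8")
    return hmac.new(secret_salt, message, hashlib.sha256).hexdigest()


# Hypothetical salt held only by the Identity Provider.
SALT = b"example-secret-held-by-the-idp"

publisher_a = targeted_id("jsmith", "https://sp.publisher-a.example/shibboleth", SALT)
publisher_b = targeted_id("jsmith", "https://sp.publisher-b.example/shibboleth", SALT)

print(publisher_a)
print(publisher_b)
print("linkable across SPs:", publisher_a == publisher_b)
```

An identifier such as eduPersonPrincipalName, by contrast, is the same at every Service Provider, which makes aggregation across providers easy and is precisely why the privacy questions need careful examination.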

So that is your usage stats round-up – certainly lots of good stuff to keep an eye on.