Tuesday, 28 January 2014

InterProScan 5 is out


Towards the end of last year, we released InterProScan 5. This represented a complete redesign of the popular InterProScan software that compares protein sequences against different InterPro member database signatures, to classify them into protein families and predict the presence of important domains and sites. The new version of the software is highly flexible, robust and scalable. Multiple output formats are available, including tab-delimited, XML and GFF3, as well as graphical output in HTML and SVG formats (an example of which is given below). You can find more details about InterProScan 5 here and read the paper describing it here.

InterProScan graphical output when searched with UniProt entry B0A027. 
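For anyone planning to post-process the tab-delimited output, here is a minimal Python sketch of reading it. The column positions used (protein accession, analysis, signature accession, start and stop) are an assumption based on the InterProScan 5 documentation and should be checked against the version you run; the sample row is made up purely for illustration and is not real InterProScan output.

```python
import csv
import io
from typing import Iterator, NamedTuple


class Match(NamedTuple):
    protein: str
    analysis: str
    signature: str
    start: int
    stop: int


def read_matches(handle) -> Iterator[Match]:
    """Yield one Match per non-empty row of a tab-delimited results file."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        # Assumed column order: 0 protein, 3 analysis, 4 signature, 6 start, 7 stop.
        yield Match(row[0], row[3], row[4], int(row[6]), int(row[7]))


# Made-up row for illustration only; not real InterProScan output.
SAMPLE = (
    "seq1\tmd5placeholder\t120\tPfam\tPF00000\texample domain"
    "\t10\t95\t1.0E-5\tT\t01-01-2014\n"
)

matches = list(read_matches(io.StringIO(SAMPLE)))
for match in matches:
    print(match.protein, match.analysis, match.signature, match.start, match.stop)
```

To read a real results file, pass an open file handle to read_matches in place of the in-memory sample.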
A month or two on from the software release, we think we have ironed out (most of!) the bugs and urge you to give it a go. This is particularly important for InterProScan 4 users; whilst InterProScan 4.8 is still available for download, we are not supporting it any more and so highly recommend that you upgrade. We are also very interested in hearing what you think about the new software, your experiences installing and running it, and what you might be using it for. So drop us a line, using the comments section below.

Alex Mitchell
on behalf of the InterPro team

Friday, 2 August 2013

Migrating from InterProScan v4.x to InterProScan v5

First, a bit of background...

It's been a while in development but finally we are ready to make InterProScan v5 the official release of InterProScan, meaning that v4.x will be retired.  The development of InterProScan 5 was motivated by several factors - both feedback from our users and internal needs to have a robust pipeline for keeping all the calculations of InterPro signatures against UniProtKB proteins up-to-date.

As explained on the Google Code wiki for InterProScan 5, there are numerous differences between the two versions, most notably the change from a Perl-based architecture to a Java-based one. We've also simplified the installation process, added a new analysis type (Phobius), improved the output formats in which users can get their results, and tailored this version to work at large scale better than ever before.

Migrating from v4 to v5

To help users make the transition, we have created a page on our Google Code wiki that explains what might need to change when moving from v4 to v5. The initial version of the page (which we will add to as and when users ask for clarification) describes changes to the available command-line options and to the output formats generated. This is mainly of interest to users who have downloaded InterProScan and installed it locally.

Changes will also be reflected in the EBI-hosted versions, with both the web interface and the SOAP/REST web services changing to v5 shortly. We will continue to run the hosted InterProScan v4.8 in parallel with v5 whilst people switch over, but the older version will no longer be actively developed (and will receive no further data releases).

Timelines

It is our intention to officially launch InterProScan 5 with the next public release of the InterPro database, currently scheduled for 19th September 2013.  

If you have any questions or suggestions about the above, please contact us.

Thursday, 1 August 2013

InterPro 43.1 is released, fixing the previous data problems

Good news!

Last week we were able to finally release an update to release 43 to fix the problems we identified.

43.1 contains the same member databases as v43.0 (i.e. an upgrade to Pfam from v26.0 to v27.0 compared with the InterPro 42.0 release) but there are 2 main differences:
  1. The pre-calculated match information we supply via the InterPro website and the downloadable match_complete.xml file that is used by InterProScan are now correct.
  2. Additional Pfam signatures have been integrated into InterPro entries since v43.0.  This is why in the release notes for 43.1, 13579/14831 Pfam signatures belong to an InterPro entry but in 43.0 only 13079/14831 signatures were included.  (Our curators have been busy!)
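As a quick check of the figures quoted in point 2, the short Python sketch below works out how many Pfam signatures were newly integrated between 43.0 and 43.1 and what share of Pfam is covered in each case. The counts are taken from the release notes quoted above; the variable and function names are only illustrative.

```python
# Counts quoted from the 43.0 and 43.1 release notes.
TOTAL_PFAM = 14831
INTEGRATED_43_0 = 13079
INTEGRATED_43_1 = 13579


def coverage(integrated: int, total: int) -> float:
    """Return the percentage of signatures that belong to an InterPro entry."""
    return 100.0 * integrated / total


newly_integrated = INTEGRATED_43_1 - INTEGRATED_43_0
print(f"Newly integrated Pfam signatures: {newly_integrated}")
print(f"Coverage in 43.0: {coverage(INTEGRATED_43_0, TOTAL_PFAM):.1f}%")
print(f"Coverage in 43.1: {coverage(INTEGRATED_43_1, TOTAL_PFAM):.1f}%")
```

That is 500 additional signatures, taking Pfam coverage from roughly 88.2% to roughly 91.6%.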
We apologise again for this problem and the inconvenience it might have caused our users. We have now put in place measures to ensure that this particular problem won't happen again.

Sarah

Monday, 17 June 2013

InterPro is temporarily reverted to v42.0

An update on our progress. We decided to revert the InterPro website to the previous release's data (v42.0). This means that the Pfam release that was incorporated into release v43.0 is no longer visible via the website, at least until the fix is completed. The full status of all our services is now as follows:

InterPro website

Currently displays v42.0 data - all protein match information visible on the site is now correct and can be used with confidence.  The version of Pfam that is visible is v26.0, however.

InterProScan5 (downloadable)

The InterProScan5 current version (RC6) was built against v42.0.  We hadn't built and distributed the version (RC7) that was for v43.0 of the data and so users are still safe using InterProScan5 RC6.

InterProScan4 (downloadable)

Standalone InterProScan4 (downloadable from our FTP site) had data released for v43.0, which included Pfam 27.0; however, only the match_complete.xml file was affected by the problem. Users can either run their InterProScan4 installation with 43.0 data using the -nocrc option on the command line, or download the data for release 42.0 from the FTP site (ftp://ftp.ebi.ac.uk/pub/software/unix/iprscan/DATA/iprscan_MATCH_DATA_42.0.tar.gz and ftp://ftp.ebi.ac.uk/pub/software/unix/iprscan/DATA/iprscan_DATA_42.0.tar.gz) and revert to that version.
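If you launch the standalone InterProScan4 from a script, one way to make sure the lookup is always disabled is to add -nocrc to the argument list before running it. The Python sketch below only builds the argument list; the base command shown (executable name, input and output arguments, file names) is a hypothetical placeholder and should be replaced with whatever your own installation uses. Only -nocrc comes from the text above.

```python
from typing import List


def disable_lookup(argv: List[str]) -> List[str]:
    """Return a copy of argv with -nocrc appended if it is not already present."""
    if "-nocrc" in argv:
        return list(argv)
    return list(argv) + ["-nocrc"]


# Hypothetical base command: adjust the executable and arguments to match
# your own InterProScan4 installation.
base_command = ["iprscan", "-cli", "-i", "proteins.fasta", "-o", "results.out"]

command = disable_lookup(base_command)
print(" ".join(command))
```

The resulting list can then be passed to subprocess.run or pasted into a shell script.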

InterProScan4 (EBI-hosted)

InterProScan4 is currently running using InterPro release 42.0 data and can therefore be used with confidence.  The version of Pfam included is v26.0.

Next steps

We hope to make a new public release next week, which will bring Pfam 27.0 and correct protein match information to the website. Updates to InterProScan v4 data and InterProScan 5 (RC7) will follow shortly afterwards. These updates will be announced on the Twitter feed and mailing lists as v43.1.

Again, many thanks for your patience whilst we sort out these issues.

Friday, 14 June 2013

Update on fix to InterPro 43.0

We're still working on fixing release 43.0 and we are aiming to release a fixed version (v43.1) next week.  We're sorry it's taking so long to sort out but we are working hard to do so.  It's highly likely we'll temporarily revert the public data to release 42.0 if we've not fixed 43 by Monday.

InterProScan

We have had some questions about the use of InterProScan.

Users of InterProScan v5 will be pleased to know that we noticed the problem with 43.0 before we had updated I5. All the data coming from InterProScan 5 should therefore be correct and you can use it with confidence.

InterProScan4, however, is affected by this problem if you have not used the "-nocrc" option on the command line of the standalone version, or if you have used the EBI-hosted version without specifying "-nocrc". Running InterProScan 4 with the lookup disabled (using "-nocrc") will not use the problematic dataset, and so the results should be OK.

Once again, we apologise for any inconvenience this might have caused our users.

Monday, 10 June 2013

Problem with InterPro release 43.0

For the first time, we've discovered a major problem with the match data generated for InterPro.

This has resulted in incorrect InterPro calculations for approximately 3 million protein sequences in the UniParc database, so it is highly likely that a number of UniProtKB proteins have incorrect match data visible in the InterPro web interface. At the same time, we have noticed that some of the pathway mappings associated with InterPro entries (e.g. mappings of entries to KEGG, Reactome, etc.) are incorrect.

We are currently working to fix this problem and re-release the data as soon as possible.  Note that this potentially affects the data in both the InterPro website and InterProScan XML files.

We apologise for the inconvenience and will make a new announcement once the problem is fixed and the new data is available (it will be called InterPro v43.1).

Please let us know if you have any questions about the above by using our support channels.

Tuesday, 26 March 2013

New (RESTful) interface for InterPro


In September 2012, InterPro launched a new look and feel for its website (http://www.ebi.ac.uk/interpro/).  We re-designed the pages based on feedback from our users (we did a lot of usability testing, user surveys and talking to people directly about what they wanted from our website).  We hope that, as a consequence, people have found it much easier to find the information they need and the site more enjoyable to use.
The redesign of the website was almost entirely cosmetic, and very little of the underlying codebase and database was changed before it was released. However, immediately after launching the re-styled site, we decided to re-factor the entire web application and its back-end database. This may seem like a back-to-front way to do things, but doing it that way round meant we were confident that all of the data we were delivering through our web app was utilised in some way. Rather than using our production database schema (as in the past), we generated a new query-optimised warehouse and built a Spring MVC web application on top of it. The new website that we launched on March 11th is faster, less prone to failures and much easier for the team to maintain (hurrah!). An additional happy consequence is that we've been able to produce RESTful URLs for the data in InterPro.

For example, if you:


There are RESTful URLs for all of the entry and protein entity pages in InterPro.  Please take a look around the site and let us know if there are any features that you think we should add.  If you link to us already, please update to using these new URLs in your resource.
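For resources that link to InterPro programmatically, the Python sketch below shows one way to generate such links from accessions. The path patterns used here (/entry/<accession> and /protein/<accession> under the site root) are an assumption about the URL scheme and should be checked against the live site before you rely on them; the helper names are purely illustrative.

```python
from urllib.parse import quote

# Site root as given in this post.
BASE_URL = "http://www.ebi.ac.uk/interpro"


def entry_url(accession: str) -> str:
    """Build a link to an InterPro entry page (assumed path pattern)."""
    return f"{BASE_URL}/entry/{quote(accession, safe='')}"


def protein_url(accession: str) -> str:
    """Build a link to an InterPro protein page (assumed path pattern)."""
    return f"{BASE_URL}/protein/{quote(accession, safe='')}"


# IPR000001 is used only as an example entry accession; B0A027 is the
# UniProt accession used in the InterProScan 5 post on this blog.
print(entry_url("IPR000001"))
print(protein_url("B0A027"))
```

Keeping the site root in a single constant means only one line needs to change if the base address moves.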

Sarah
on behalf of the InterPro team