InterPro database blog: In the pipeline – streamlined InterPro production

Wednesday, 26 November 2014

You may have noticed that InterPro has had fewer releases than usual this year. It is not that we haven’t been working as hard as ever, integrating member database signatures into InterPro entries and adding Gene Ontology terms - we have! But a number of things have been going on behind the scenes, which we thought you might be interested in knowing about.

Sequence growth

InterPro release 1.0, back in 2000, was built using a version of Swiss-Prot/TrEMBL that contained just over 300 thousand sequences. Our current InterPro release (49.0) is built using over 77 million Swiss-Prot/TrEMBL sequences. That is a massive amount of sequence growth - and even more remarkable is the fact that almost half of these sequences have been added in the last year.

A new InterPro production pipeline

As you might imagine, processing this number of sequences can cause all kinds of problems for computational pipelines that were developed when sequence data volumes were orders of magnitude smaller. To make sure that we can handle the kind of data volume growth we have been seeing - and expect to see in the future - we have been busy rebuilding our production pipeline. The new system is built entirely on InterProScan, which, for a variety of complicated historical reasons, the previous version was not. This change helps streamline the production process, removes a number of bottlenecks, and generally makes many things associated with data production a lot less complicated.

Further pipeline developments and a new data centre

To put these changes in place, we have had to focus a lot of our efforts on pipeline development, with knock-on effects on our release schedule. As a consequence, while we have maintained our usual rate of database integrations, these have been squeezed into slightly fewer InterPro releases. And, as a further complication, we have also recently moved all of our data (in the form of hard drives on the back of a truck - no, really!) to a new data centre, as part of EMBL-EBI’s consolidation of its Web infrastructure. This has affected our release schedule further still. However, we believe that we are now much better placed to calculate and provide match data for our users. We think we are also better prepared for future data production challenges - as the number of protein sequences hits 100 million, and beyond.

Alex Mitchell
on behalf of the InterPro team
