Wednesday, 21 May 2014

Protein focus: Dionysian mysteries - the aldehyde dehydrogenase (ALDH) family

Do you have friends who cannot handle alcoholic drinks? Just half a pint of beer or a few sips of wine, and their faces turn red, possibly with some hangover symptoms, such as headaches and nausea. You may envy their cheap night out, but wonder why these people cannot tolerate alcohol as you do. The phenomenon is called ‘alcohol flush reaction’, also known as ‘Asian flush syndrome’, due to its association with the Asian population. It is a condition caused by the accumulation of acetaldehyde, a byproduct of alcohol metabolism.

Picture provided by Louise Daugherty
Normally, during alcohol metabolism, ethanol is converted to acetaldehyde by an alcohol dehydrogenase enzyme called ADH1B. The acetaldehyde is then broken down to acetic acid by an aldehyde dehydrogenase enzyme (ALDH).
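
The two steps described above can be written out as a reaction scheme. This is a sketch that assumes NAD+ is the coenzyme in both steps; the post itself only names the enzymes here and notes later that ALDH family members use NAD or NADP.

```latex
\begin{align*}
\mathrm{CH_3CH_2OH} + \mathrm{NAD^+}
  &\xrightarrow{\ \text{ADH1B}\ } \mathrm{CH_3CHO} + \mathrm{NADH} + \mathrm{H^+} \\
\mathrm{CH_3CHO} + \mathrm{NAD^+} + \mathrm{H_2O}
  &\xrightarrow{\ \text{ALDH}\ } \mathrm{CH_3COOH} + \mathrm{NADH} + \mathrm{H^+}
\end{align*}
```

If the second step is slow, the intermediate (acetaldehyde, CH3CHO) builds up, which is the situation described for the flush reaction.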

In humans, there are nineteen identified ALDH genes (ALDH1-19). Most Europeans have a normal copy of the ALDH2 gene, whilst approximately 30-50% of East Asians carry an allele (ALDH2*2) that results in the synthesis of a less efficient enzyme 1.

ALDH2 forms homotetramers. Each subunit in the tetramer consists of three domains: the catalytic domain, the coenzyme-binding domain and the oligomerisation domain. The low activity of ALDH2*2 is the result of a substitution of lysine for glutamate at position 487 (Glu487) of the 500-amino-acid mature enzyme 2. Glu487 links the coenzyme-binding site to the active site, creating a stable structural scaffold that contributes to catalysis (Figure 1). In the ALDH2*2 apoenzyme, the presence of a lysine at residue 487 disturbs the hydrogen bonds and disrupts the αG helix structure 3. This reduces affinity for the coenzyme and lowers the rate of the metabolic process 3.

As a result of this mutation, acetaldehyde accumulates whenever alcohol is consumed. Unfortunately, acetaldehyde is a DNA-damaging agent that can cause cancer 4, and several studies have shown that ALDH2-deficient drinkers have a higher risk of developing esophageal cancer 4,5,6. A knockout mouse model also links ethanol consumption with a higher risk of acetaldehyde toxicity in ALDH2-deficient individuals 7. But whilst the outlook seems gloomy for ALDH2*2 drinkers, on the bright side, they are less likely to suffer alcohol addiction problems 8. In fact, disulfiram, a drug that causes symptoms similar to Asian flush syndrome, is used to treat alcoholism.

Figure 1. The protein structure of a single subunit of ALDH2. Residue 487 is indicated in violet. The ALDH2*2 αG helix is shown in red, the wild type in blue. Picture modified from Larson et al. 2005 3.
Interestingly, some ALDH2*2 individuals have less intense flushing symptoms. This is because they also have a less active form of ADH1B (ADH1B*1/*1), which prevents a steep rise in acetaldehyde after drinking. However, some studies have shown that these individuals may have a higher risk of both alcoholism and cancer (Figure 2) 8, 9.

Figure 2. Ethanol metabolic process.

It is intriguing that a single mutation in the ALDH2 gene could cause alcohol-related health problems. From an evolutionary point of view, aldehyde dehydrogenases are utilised by different species to detoxify harmful chemical intermediates, and hence play an important role in cell survival. They catalyse the conversion of a wide variety of aldehyde substrates to their respective carboxylic acids, using NAD or NADP as a coenzyme. The aldehyde dehydrogenase family members contain two conserved sites: a cysteine active site and a glutamic acid active site (Figure 4). These two sites are represented by the InterPro entries IPR016160 and IPR029510, and are conserved across species, from archaea and bacteria to eukaryotes.

ALDH in different species

In contrast to humans, budding yeast have only five ALDHs. They are the key enzymes of the pyruvate dehydrogenase (PDH) bypass, which generates additional acetyl-CoA 10. In the wine-producing process, the acetate produced by the PDH bypass accumulates during the alcoholic fermentation of sugars 11. The level of acetate has important effects on wine quality: most unspoiled wines have a level of 0.2 to 0.8 g of acetate per litre 12.

Figure 3. Key enzymes of the PDH bypass pathway. PDH, pyruvate dehydrogenase; PDC, pyruvate decarboxylase; ADH, alcohol dehydrogenase; mtALDH, mitochondrial ALDH; cALDH, cytoplasmic ALDH. Modified from Wei et al. 2009 16.

Plants also have multiple ALDHs, and 14 have been identified in Arabidopsis 13. They play an important role in the adaptation of plants to various stresses, such as drought, salinity and extreme temperatures 14. They may also be involved in different transduction pathways 15.

An unsolved mystery

We may not yet understand why ALDH2 deficiency is so widespread in Asian populations. However, research can help us understand more about the relationship between ALDH2, cancers and alcoholism, as well as potentially uncovering the safe number of alcohol units that ALDH2*2 individuals can consume.

So before you encourage your friends to have another glass of wine or a pint of beer, you may need to check if they have the Asian flush symptoms, or even review their ALDH2 phenotype!
By Hsin-Yu Chang and Alex Mitchell


1. Helminen A, Väkeväinen S, Salaspuro M. ALDH2 genotype has no effect on salivary acetaldehyde without the presence of ethanol in the systemic circulation. PLoS One. 8(9):e74418. 2013. [PMID: 24058561]

2. Larson HN, Zhou J, Chen Z, Stamler JS, Weiner H, Hurley TD. Structural and functional consequences of coenzyme binding to the inactive Asian variant of mitochondrial aldehyde dehydrogenase: roles of residues 475 and 487. J Biol Chem. 282(17):12940-50. 2007. [PMID: 17327228]

3. Larson HN, Weiner H, Hurley TD. Disruption of the coenzyme binding site and dimer interface revealed in the crystal structure of mitochondrial aldehyde dehydrogenase "Asian" variant. J Biol Chem. 280(34):30550-6. 2005. [PMID: 15983043]

4. Lewis SJ, Smith GD. Alcohol, ALDH2, and esophageal cancer: a meta-analysis which illustrates the potentials and limitations of a Mendelian randomization approach. Cancer Epidemiol Biomarkers Prev. 14(8):1967-71. 2005. [PMID: 16103445]

5. Yokoyama A, Omori T, Yokoyama T. Alcohol and aldehyde dehydrogenase polymorphisms and a new strategy for prevention and screening for cancer in the upper aerodigestive tract in East Asians. Keio J Med. 59(4):115-30. 2010. [PMID: 21187698]

6. Seitz HK, Meier P. The role of acetaldehyde in upper digestive tract cancer in alcoholics. Transl Res. 149(6):293-7. 2007. [PMID: 17543846]

7. Isse T, Oyama T, Matsuno K, Ogawa M, Narai-Suzuki R, Yamaguchi T, Murakami T, Kinaga T, Uchiyama I, Kawamoto T. Paired acute inhalation test reveals that acetaldehyde toxicity is higher in aldehyde dehydrogenase 2 knockout mice than in wild-type mice. J Toxicol Sci. 30(4):329-37. 2005. [PMID: 16404141]

8. Yokoyama A, Omori T, Yokoyama T. Alcohol and aldehyde dehydrogenase polymorphisms and a new strategy for prevention and screening for cancer in the upper aerodigestive tract in East Asians. Keio J Med. 59(4):115-30. 2010. [PMID: 21187698]

9. Lee CH, Lee JM, Wu DC, Goan YG, Chou SH, Wu IC, Kao EL, Chan TF, Huang MC, Chen PS, Lee CY, Huang CT, Huang HL, Hu CY, Hung YH, Wu MT. Carcinogenetic impact of ADH1B and ALDH2 genes on squamous cell carcinoma risk of the esophagus with regard to the consumption of alcohol, tobacco and betel quid. Int J Cancer. 122(6):1347-56. 2008. [PMID: 18033686]

10. Boubekeur S, Camougrand N, Bunoust O, Rigoulet M, Guérin B. Participation of acetaldehyde dehydrogenases in ethanol and pyruvate metabolism of the yeast Saccharomyces cerevisiae. Eur J Biochem. 268(19):5057-65. 2001. [PMID: 11589696]

11. Saint-Prix F, Bönquist L, Dequin S. Functional analysis of the ALD gene family of Saccharomyces cerevisiae during anaerobic growth on glucose: the NADP+-dependent Ald6p and Ald5p isoforms play a major role in acetate formation. Microbiology. 150(Pt 7):2209-20. 2004. [PMID: 15256563]

12. Remize F, Roustan JL, Sablayrolles JM, Barre P, Dequin S. Glycerol overproduction by engineered Saccharomyces cerevisiae wine yeast strains leads to substantial changes in byproduct formation and to a stimulation of fermentation rate in stationary phase. Appl Environ Microbiol. 65(1):143-9. 1999. [PMID: 9872772]

13. Kirch HH, Schlingensiepen S, Kotchoni S, Sunkar R, Bartels D. Detailed expression analysis of selected genes of the aldehyde dehydrogenase (ALDH) gene superfamily in Arabidopsis thaliana. Plant Mol Biol. 57(3):315-32. 2005. [PMID: 15830124]

14. Zhang Y, Mao L, Wang H, Brocker C, Yin X, Vasiliou V, Fei Z, Wang X. Genome-wide identification and analysis of grape aldehyde dehydrogenase (ALDH) gene superfamily. PLoS One. 7(2):e32153. 2012. [PMID: 22355416]

15. Kirch HH, Bartels D, Wei Y, Schnable PS, Wood AJ. The ALDH gene superfamily of Arabidopsis. Trends Plant Sci. 9(8):371-7. 2004. [PMID: 15358267]

16. Wei Y, Lin M, Oliver DJ, Schnable PS. The roles of aldehyde dehydrogenases (ALDHs) in the PDH bypass of Arabidopsis. BMC Biochem. 10:7. 2009. [PMID: 19320993]

Tuesday, 28 January 2014

InterProScan 5 is out

Towards the end of last year, we released InterProScan 5. This represented a complete redesign of the popular InterProScan software that compares protein sequences against different InterPro member database signatures, to classify them into protein families and predict the presence of important domains and sites. The new version of the software is highly flexible, robust and scalable. Multiple output formats are available, including tab-delimited, XML and GFF3, as well as graphical output in HTML and SVG formats (an example of which is given below). You can find more details about InterProScan 5 here and read the paper describing it here.

InterProScan graphical output when searched with UniProt entry B0A027. 
A month or two on from the software release, we think we have ironed out (most of!) the bugs and urge you to give it a go. This is particularly important for InterProScan 4 users; whilst InterProScan 4.8 is still available for download, we are not supporting it any more and so highly recommend that you upgrade. We are also very interested in hearing what you think about the new software, your experiences installing and running it, and what you might be using it for. So drop us a line, using the comments section below.
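
For anyone wanting to try the tab-delimited output mentioned above, here is a minimal Python sketch of reading it. The column positions (protein accession, analysis, signature accession, start, stop, and an optional InterPro accession in column 12) are an assumption based on the InterProScan 5 documentation and should be checked against your version. The helper names (`Match`, `parse_tsv`) and the sample rows are invented for illustration and are not real InterProScan results.

```python
import io
from typing import Iterator, NamedTuple, Optional, TextIO


class Match(NamedTuple):
    protein: str             # column 1: protein accession
    analysis: str            # column 4: member database / analysis name
    signature: str           # column 5: signature accession
    start: int               # column 7: match start
    stop: int                # column 8: match end
    interpro: Optional[str]  # column 12: InterPro accession, if present


def parse_tsv(handle: TextIO) -> Iterator[Match]:
    """Yield one Match per non-empty line of tab-delimited output.

    The column layout is assumed, not taken from the blog post.
    """
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        cols = line.split("\t")
        # Unintegrated signatures may have no InterPro columns at all.
        has_ipr = len(cols) > 11 and cols[11] not in ("", "-")
        interpro = cols[11] if has_ipr else None
        yield Match(cols[0], cols[3], cols[4],
                    int(cols[6]), int(cols[7]), interpro)


# Invented rows for illustration only (placeholder identifiers).
SAMPLE = (
    "example_protein\td41d8cd98f00b204e9800998ecf8427e\t500\tExampleDB\t"
    "SIG00001\tExample signature\t30\t490\t1.0E-50\tT\t28-01-2014\t"
    "IPR000000\tExample entry\n"
    "example_protein\td41d8cd98f00b204e9800998ecf8427e\t500\tExampleDB\t"
    "SIG00002\tUnintegrated signature\t260\t271\t-\tT\t28-01-2014\n"
)

matches = list(parse_tsv(io.StringIO(SAMPLE)))
for m in matches:
    print(m.protein, m.signature, f"{m.start}-{m.stop}",
          m.interpro or "unintegrated")
```

To read a real results file, pass an open file handle to `parse_tsv` instead of the in-memory sample.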

Alex Mitchell
on behalf of the InterPro team

Friday, 2 August 2013

Migrating from InterProScan v4.x to InterProScan v5

First, a bit of background...

It's been a while in development but finally we are ready to make InterProScan v5 the official release of InterProScan, meaning that v4.x will be retired.  The development of InterProScan 5 was motivated by several factors - both feedback from our users and internal needs to have a robust pipeline for keeping all the calculations of InterPro signatures against UniProtKB proteins up-to-date.

As explained on the Google Code wiki for InterProScan 5, there are numerous differences between the two versions, most notably the change from a Perl-based architecture to a Java-based one.  We've also simplified the installation process, added a new analysis type (Phobius), improved the output formats available to users, and tailored this version to work at large scale better than ever before.

Migrating from v4 to v5

In order to help users transition from using v4 to v5, we have created a page on our Google Code wiki that explains what might need to be changed to migrate from using v4 to v5.  In the initial version of the page (which will be added to as and when we get requests from users for clarification) we describe changes to the command-line options available and the output formats generated.  This is mainly of interest to users who have downloaded InterProScan and installed it locally.

Changes will also be reflected in the EBI-hosted versions, with both the web interface and SOAP/REST web services changing to v5 shortly.  We will continue to run the hosted InterProScan v4.8 in parallel with v5 whilst people switch over, but the older version will no longer be actively developed (including releases of data).


It is our intention to officially launch InterProScan 5 with the next public release of the InterPro database, currently scheduled for 19th September 2013.  

If you have any questions or suggestions about the above, please contact us.

Thursday, 1 August 2013

InterPro 43.1 is released, fixing the previous data problems

Good news!

Last week we were finally able to release an update to release 43 that fixes the problems we identified.

43.1 contains the same member databases as v43.0 (i.e. an upgrade to Pfam from v26.0 to v27.0 compared with the InterPro 42.0 release) but there are 2 main differences:
  1. The pre-calculated match information we supply via the InterPro website and the downloadable match_complete.xml file that is used by InterProScan is now correct
  2. Additional Pfam signatures have been integrated into InterPro entries since v43.0.  This is why in the release notes for 43.1, 13579/14831 Pfam signatures belong to an InterPro entry but in 43.0 only 13079/14831 signatures were included.  (Our curators have been busy!)
We apologise again for this problem and the inconvenience it might have caused our users.  We have now put in place measures to ensure that this particular problem won't happen again.
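
The Pfam integration figures quoted above are easier to compare as percentages. The short Python sketch below uses only the numbers given in this post; the variable and function names are just for illustration.

```python
# Pfam signatures integrated into InterPro entries, as quoted in the post.
TOTAL_PFAM = 14831
integrated = {"43.0": 13079, "43.1": 13579}


def coverage_percent(n_integrated: int, total: int = TOTAL_PFAM) -> float:
    """Return the percentage of Pfam signatures that belong to an InterPro entry."""
    return 100.0 * n_integrated / total


newly_integrated = integrated["43.1"] - integrated["43.0"]

for release, n in integrated.items():
    print(f"InterPro {release}: {n}/{TOTAL_PFAM} = {coverage_percent(n):.1f}%")
print(f"Signatures integrated between 43.0 and 43.1: {newly_integrated}")
```

So 500 additional Pfam signatures were integrated between the two releases, taking coverage from roughly 88.2% to 91.6%.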


Monday, 17 June 2013

InterPro is temporarily reverted to v42.0

An update on our progress.  We decided to revert the InterPro website back to the previous release's data (v42.0).  This means that the Pfam release that was incorporated into release v43.0 is no longer visible via the website, at least, until the fix is completed.  The full status of all our services is now as follows:

InterPro website

Currently displays v42.0 data - all protein match information visible on the site is now correct and can be used with confidence.  The version of Pfam that is visible is v26.0, however.

InterProScan5 (downloadable)

The InterProScan5 current version (RC6) was built against v42.0.  We hadn't built and distributed the version (RC7) that was for v43.0 of the data and so users are still safe using InterProScan5 RC6.

InterProScan4 (downloadable)

Standalone InterProScan4 (downloadable from our FTP site) had data released for v43.0, which included Pfam 27.0. However, only the match_complete.xml file was affected by the problem.  Users can either run their InterProScan4 installation with 43.0 data using the -nocrc option on the command line, or download the data for release 42.0 from the FTP site and revert to that version.

InterProScan4 (EBI-hosted)

InterProScan4 is currently running using InterPro release 42.0 data and can therefore be used with confidence.  The version of Pfam included is v26.0.

Next steps

We will hopefully make a new public release next week, which will bring Pfam 27.0 and correct protein match information to the website.  Updates to InterProScan v4 data and InterProScan 5 (RC7) will follow shortly afterwards.  These updates will be announced on the Twitter feed and mailing lists as v43.1.

Again, many thanks for your patience whilst we sort out these issues.

Friday, 14 June 2013

Update on fix to InterPro 43.0

We're still working on fixing release 43.0 and we are aiming to release a fixed version (v43.1) next week.  We're sorry it's taking so long to sort out but we are working hard to do so.  It's highly likely we'll temporarily revert the public data to release 42.0 if we've not fixed 43 by Monday.


We have had some questions about the use of InterProScan

Users of InterProScan v5 will be pleased to know that we noticed the problem with 43.0 before we had updated I5, therefore, all the data coming from InterProScan 5 should be correct and you can use it with confidence.

InterProScan4, however, is affected by this problem if you have not used the "-nocrc" option on the command line of the standalone version, or if you have used the EBI-hosted version without specifying "-nocrc".  Running InterProScan 4 with the lookup disabled (using "-nocrc") will not use the problematic dataset, and so the results should be OK.

Once again, we apologise for any inconvenience this might have caused our users.

Monday, 10 June 2013

Problem with InterPro release 43.0

For the first time, we've discovered a major problem with the match data generated for InterPro.

This has resulted in incorrect InterPro calculations for approximately 3 million protein sequences in the UniParc database. It is therefore highly likely that a number of UniProtKB proteins will have incorrect match data visible in the InterPro web interface.  At the same time, we have noticed that some of the pathway mappings associated with InterPro entries (e.g. mappings of entries to KEGG, Reactome, etc.) are incorrect.

We are currently working to fix this problem and re-release the data as soon as possible.  Note that this potentially affects the data in both the InterPro website and InterProScan XML files.

We apologise for the inconvenience and will make a new announcement once the problem is fixed and the new data is available (it will be called InterPro v43.1).

Please let us know if you have any questions about the above by using our support channels.