Genome mining for novel natural product discovery

GL Challis - Journal of medicinal chemistry, 2008 - ACS Publications
Genomics has resulted in the deposition of a huge quantity of DNA sequence data from a wide variety of organisms in publicly accessible databases. Such data can be exploited to generate new knowledge in several areas relevant to medicinal chemistry, including the characterization of human physiological processes, the identification and validation of new drug targets in human pathogens, and the discovery of new chemical entities (NCEs) from natural sources, which may form the basis for new drug leads. The term “genome mining” has been used in various fields to describe the exploitation of genomic information for the discovery of new processes, targets, and products. This Miniperspective will focus on the development of genome mining approaches for the discovery of new natural products. It will also discuss future prospects for the application of genome mining technology to NCE discovery and lead generation.
Natural products and their derivatives form the basis of many important drugs that have found widespread use in the clinic, e.g., as antibacterial (penicillin G 1, vancomycin 2, erythromycin A 3, daptomycin 4), antifungal (amphotericin B 5), immunosuppressant (cyclosporin A 6, tacrolimus 7), antitumor (doxorubicin 8, paclitaxel 9, bleomycin A2 10, calicheamicin 11), and cholesterol-lowering agents (mevastatin 12) (Figure 1) [1, 2]. Despite this, most large pharmaceutical companies are no longer seriously engaged in the search for new drug leads from natural sources. The advent of combinatorial chemistry is partly responsible for this decline, together with the high frequency at which known natural products are rediscovered in activity screens of natural extracts [3]. However, combinatorial chemistry has failed to deliver leads that form the basis for development of successful new drugs, and new possibilities in natural product drug discovery have been opened up by the genomic age [4, 5]. Thus, it is likely that we will soon witness a resurgence of interest in natural products for new drug discovery.
The concept of exploiting genomic sequence data for the discovery of new natural products has grown out of the rapid expansion, in the 1980s and 1990s, in knowledge of the genetic and biochemical basis for secondary metabolite biosynthesis, particularly in microorganisms [6]. As large quantities of genomic sequence data began to accumulate in public databases at the turn of the century, it quickly became apparent that many genomes, in particular those of plants and microorganisms, contain numerous genes encoding proteins likely to participate in the assembly of structurally complex bioactive natural products but not associated with the production of known metabolites. In the microbial arena, this phenomenon was first recognized during analysis of the complete genome sequences of the model actinomycete Streptomyces coelicolor A3(2) and the industrial actinomycete Streptomyces avermitilis [7–9]. Similar observations have since been made for other microbial genomes, e.g., Pseudomonas fluorescens Pf-5, Saccharopolyspora erythraea NRRL2338, and Aspergillus species, as well as some plant genomes [10–15].
Progress in understanding the biochemical programming and molecular basis of substrate specificity in two types of natural product biosynthetic systems, known as the modular polyketide synthases (PKSs) and nonribosomal peptide synthetases (NRPSs) [6], has facilitated the prediction of structural features of natural products assembled by new examples of these systems uncovered by genomics. Modular PKSs and NRPSs are both multienzymes containing numerous enzymatic domains organized into functional units termed modules …