New algorithm finds lots of gene-editing enzymes in environmental DNA

Protein structure of Cas, shown bound to nucleic acid.

CRISPR (clustered regularly interspaced short palindromic repeats) is the microbial world’s answer to adaptive immunity. Bacteria do not generate antibodies when they are invaded by a pathogen and then keep those antibodies in reserve in case they encounter the same pathogen again, as we do. Instead, they integrate some of the pathogen’s DNA into their genome and link it to an enzyme that they can use to recognize the pathogen’s DNA sequence and cut it into pieces if the pathogen reappears.

The enzyme that does the cutting is called Cas, for “CRISPR-associated.” Although the CRISPR-Cas system evolved as a bacterial defense mechanism, it has been harnessed and adapted by researchers as a powerful tool for genetic manipulation in laboratory studies. It also has proven agricultural uses, and the first CRISPR-based treatment has been approved in the United Kingdom to treat sickle cell disease and transfusion-dependent beta-thalassemia.

Now, researchers have developed a new way to search genomes for CRISPR-Cas-like systems. They discovered that we might have a lot of additional tools to work with.

DNA modification

To date, six types of CRISPR-Cas systems have been identified in different microbes. Although they differ in details, they all have the same appeal: they deliver proteins to a specific sequence of genetic material with a degree of specificity that has until now been technically difficult, expensive, and time-consuming to achieve. Any DNA sequence of interest can be programmed into the system and targeted.

Native systems in microbes typically bring a nuclease, an enzyme that cuts DNA, to the target sequence, where it chops up the genetic material of pathogens. This ability can be used to cut any chosen DNA sequence for gene editing; combined with other enzymes and/or DNA sequences, it can be used to insert or delete additional short sequences and to correct mutant genes. Some CRISPR-Cas systems cut specific RNA molecules instead of DNA. They can be used to eliminate disease-causing RNA, such as the genomes of some viruses, much as they do in their native bacteria. They can also be used to rescue defects in RNA processing.

But there are many additional ways to modify nucleic acids that may be useful. It is an open question whether enzymes that make additional modifications have evolved. So some researchers decided to look for them.

Researchers at MIT have developed a new tool to detect variant CRISPR arrays and applied it to 8.8 tera (10¹²) base pairs of prokaryotic genomic information. Many of the systems they found are rare and have only appeared in the data set in the past 10 years, highlighting how important it is to keep adding environmental samples that were previously difficult to obtain to these data repositories.

The new tool was needed because databases of protein and nucleic acid sequences are expanding at a ridiculous rate, so techniques for analyzing all that data need to keep up. Some of the algorithms used to analyze them attempt to compare each sequence to every other sequence, which is clearly untenable when dealing with billions of genes. Others rely on clustering, but only find genes that are highly similar, so they cannot shed light on evolutionary relationships between distantly related proteins. But fast locality-sensitive hashing-based clustering (“FLSHclust”) works by binning billions of proteins into fewer, larger clusters of sequences that differ only slightly, which makes it possible to identify new and rare relatives.
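To make the idea of locality-sensitive hashing more concrete, here is a minimal Python sketch. It is not FLSHclust itself, whose actual design is described in the paper; it uses MinHash with banding, one common form of locality-sensitive hashing, and every function name and parameter (`kmers`, `minhash_signature`, `lsh_clusters`, `num_hashes`, `bands`, `k`) is a hypothetical choice made for illustration. The point is that sequences are only ever compared through shared hash buckets, so no all-against-all comparison is needed.

```python
import hashlib
from collections import defaultdict


def kmers(seq, k=3):
    """Return the set of overlapping k-letter substrings of a sequence."""
    # max(1, ...) keeps very short sequences from producing an empty set.
    return {seq[i:i + k] for i in range(max(1, len(seq) - k + 1))}


def minhash_signature(seq, num_hashes=32, k=3):
    """MinHash signature: for each seeded hash, keep the smallest k-mer hash."""
    parts = kmers(seq, k)
    signature = []
    for seed in range(num_hashes):
        smallest = min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}:{p}".encode(), digest_size=8).digest(),
                "big",
            )
            for p in parts
        )
        signature.append(smallest)
    return tuple(signature)


def lsh_clusters(seqs, num_hashes=32, bands=8, k=3):
    """Group sequences that share at least one signature band (toy LSH)."""
    rows = num_hashes // bands
    parent = list(range(len(seqs)))  # union-find forest over sequence indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    buckets = {}  # (band index, band values) -> first sequence seen there
    for idx, seq in enumerate(seqs):
        sig = minhash_signature(seq, num_hashes, k)
        for b in range(bands):
            key = (b, sig[b * rows:(b + 1) * rows])
            if key in buckets:
                parent[find(idx)] = find(buckets[key])  # merge clusters
            else:
                buckets[key] = idx

    groups = defaultdict(list)
    for idx in range(len(seqs)):
        groups[find(idx)].append(idx)
    return sorted(groups.values())


if __name__ == "__main__":
    proteins = ["MKTAYIAKQR", "MKTAYIAKQR", "WWWWWWWWWW"]  # made-up sequences
    print(lsh_clusters(proteins))
```

In this sketch, using fewer rows per band makes it more likely that loosely similar sequences land in the same bucket, while more rows per band makes the clustering stricter. The work done per sequence does not depend on how many other sequences there are, which is what lets hashing-based approaches scale where pairwise comparison cannot.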

A search using FLSHclust successfully extracted 188 new CRISPR-Cas systems.

Lots of CRISPyness

Some themes emerged from the work. One is that some of the newly identified CRISPR systems use Cas enzymes with domains that have never been seen before, or that appear to be fusions with known genes. The scientists also characterized some of these enzymes and found that one is more specific than the CRISPR enzymes currently in use. Another cuts RNA, and they propose that it is structurally distinct enough to constitute an entirely new type VII CRISPR-Cas system.

A corollary of this theme is the linkage of enzymes with different functions, not just nucleases (enzymes that cut DNA and RNA), with CRISPR arrays. Scientists have exploited CRISPR’s remarkable ability to target genes by attaching it to other types of enzymes and molecules, such as fluorescent dyes. But apparently evolution got there first.

For example, FLSHclust identified something called a transposase that is associated with two different types of CRISPR systems. A transposase is an enzyme that helps move a specific stretch of DNA to another part of the genome. CRISPR RNA-guided transposition has been seen before, but this is another example of it. A whole range of proteins with different functions, such as proteins with transmembrane domains and signaling molecules, were found associated with CRISPR arrays, highlighting the mix-and-match nature of the evolution of these systems. The researchers even found a protein expressed by a virus that binds to CRISPR arrays and renders them inactive, with the virus essentially disabling the CRISPR system that evolved to protect against viruses.

Not only did the researchers find new proteins associated with CRISPR arrays, they also found other regularly spaced repeat arrays that resemble CRISPR but are not associated with any Cas enzymes. They are not sure of the function of these RNA-guided systems but speculate that they are involved in defense, just like CRISPR.
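For readers wondering what a “regularly spaced repeat array” looks like in sequence data, here is a toy Python sketch that scans a DNA string for an exact repeat recurring at least three times with a short spacer between copies. It is purely illustrative and is not the detection method used in the study; the function name `find_repeat_arrays`, its parameters, and the sequences in the usage example are all made up, and practical CRISPR-finding tools also have to tolerate imperfect repeats.

```python
def find_repeat_arrays(dna, repeat_len=8, min_spacer=4, max_spacer=12,
                       min_copies=3):
    """Find exact repeats separated by spacers of bounded length.

    Returns a list of (repeat, start_positions) tuples.
    """
    arrays = []
    i = 0
    while i + repeat_len <= len(dna):
        repeat = dna[i:i + repeat_len]
        positions = [i]
        pos = i
        while True:
            # The next copy must start after a spacer of allowed length.
            lo = pos + repeat_len + min_spacer
            hi = pos + repeat_len + max_spacer
            nxt = dna.find(repeat, lo, hi + repeat_len)
            if nxt == -1:
                break
            positions.append(nxt)
            pos = nxt
        if len(positions) >= min_copies:
            arrays.append((repeat, positions))
            i = positions[-1] + repeat_len  # continue past this array
        else:
            i += 1
    return arrays


if __name__ == "__main__":
    repeat = "GTTTTAGA"                      # hypothetical repeat unit
    spacers = ["ACGTAC", "TTGCAA"]           # hypothetical spacers
    dna = "AAAA" + repeat + spacers[0] + repeat + spacers[1] + repeat + "CCCC"
    for unit, starts in find_repeat_arrays(dna):
        print(unit, starts)
```

In a real CRISPR array, the spacers between the repeats are the stored snippets of pathogen DNA, so the unique sequences between the repeats are the informative part.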

The authors set out to find “a list of RNA-guided proteins that expand our understanding of the biology and evolution of these systems and provide a starting point for the development of new biotechnologies.” And they appear to have achieved their goal: “The results of this work reveal unprecedented regulatory and functional flexibility and modularity of CRISPR systems,” they write. They go on to conclude: “This represents only a small fraction of the systems discovered, but it highlights the breadth of untapped potential of the Earth’s biodiversity, and the remaining candidates will serve as a resource for future exploration.”

Article DOI: 10.1126/science.adi1910