During the initial phase of research into the etiology of SARS, an unknown virus from a patient suffering from SARS was cultured in Vero cells (Ksiazek et al. 2003). Total nucleic acid purified from this viral culture, as well as from a control culture, was obtained from the Centers for Disease Control and Prevention on March 22, 2003. These two samples, along with additional controls (HeLa cell RNA and water alone), were amplified and hybridized to the virus DNA microarray within 24 h. The strongest hybridizing array elements from the infected culture were derived from two families: Astroviridae and Coronaviridae. Table 1 lists the oligonucleotides from these families with the greatest hybridization intensity. By comparison, these oligonucleotides yielded essentially background levels of hybridization in the various control arrays performed in parallel. This hybridization pattern initially suggested that members of both viral families might be present. However, alignment of the oligonucleotides using ClustalX revealed that all four hybridizing oligonucleotides from the Astroviridae and one oligonucleotide from avian infectious bronchitis virus (IBV) (GenBank NC_001451), an avian coronavirus, shared a core consensus motif spanning 33 nucleotides (data not shown); thus, these five oligonucleotides behaved essentially as multiple redundant probes for the same sequence. This motif is known to be present in the 3′ UTR of all astroviruses and the avian coronaviruses (Jonassen et al. 1998), but appears to be absent in the available sequenced mammalian coronaviruses (bovine coronavirus, murine hepatitis virus [MHV], human coronavirus 229E, porcine epidemic diarrhea virus, and transmissible gastroenteritis virus). The other three hybridizing oligonucleotides were derived from three conserved regions within the ORF1AB polyprotein common to all coronaviruses (Figure 1).
Based on the aggregate hybridization pattern, the virus appeared to be a novel member of the coronavirus family.
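The idea that several probes can act as redundant reporters of one shared motif can be illustrated with a minimal sketch. It is not the ClustalX analysis used here: it looks only for the longest exact substring common to every sequence, whereas a multiple alignment tolerates mismatches and gaps. The function name `longest_common_substring` and the toy sequences are assumptions invented for illustration and are not the array oligonucleotides.

```python
def longest_common_substring(seqs):
    """Return the longest substring present in every sequence.

    Exact matching only; on ties, the first candidate found in the
    shortest sequence is returned.
    """
    if not seqs:
        return ""
    shortest = min(seqs, key=len)
    # Try candidate lengths from longest to shortest so the first hit is maximal.
    for length in range(len(shortest), 0, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            if all(candidate in s for s in seqs):
                return candidate
    return ""


# Toy sequences made up for this example (hypothetical, not real probe data).
toy_oligos = [
    "TTGACCGAGTACCGTAGGAT",
    "CCGAGTACCGTAGGCA",
    "AACCGAGTACCGTAGGTT",
]

core = longest_common_substring(toy_oligos)
print(core, len(core))
```

With real probe sequences, a shared core of this kind would indicate that the probes report on the same target region rather than on independent loci.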
To further characterize this virus, we sequenced fragments of the viral genome using two complementary approaches. First, BLAST alignment of two of the hybridizing viral oligonucleotides, one each from bovine coronavirus and human coronavirus 229E, to the IBV genome indicated that the oligonucleotides possessed homology to distinct conserved regions within the NSP11 gene (BLAST identity matches of 42/47 and 26/27, respectively). A pair of PCR primers was designed to amplify the intervening sequences between the two conserved regions, and a fragment that possessed 89% identity over 37 amino acids to MHV, a murine coronavirus, was obtained (Figure 1; sequence available as Data S1).
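The BLAST identity matches quoted above can be converted to percentages with simple arithmetic. The sketch below uses only the two ratios reported in the text (42/47 and 26/27); the helper name `percent_identity` and the dictionary labels are assumptions introduced for illustration.

```python
def percent_identity(matches, aligned_length):
    """Percent identity = identical positions / aligned positions * 100."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return 100.0 * matches / aligned_length


# Identity matches as reported in the text (matches, aligned length).
blast_matches = {
    "bovine coronavirus oligonucleotide vs. IBV": (42, 47),
    "human coronavirus 229E oligonucleotide vs. IBV": (26, 27),
}

for label, (matches, length) in blast_matches.items():
    print(f"{label}: {matches}/{length} = {percent_identity(matches, length):.1f}%")
```

These work out to roughly 89.4% and 96.3% nucleotide identity, respectively.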
In a parallel approach, we directly recovered hybridized viral sequences from the surface of the microarray. This procedure took advantage of the physical separation achieved during microarray hybridization, which effectively purified the viral nucleic acid from other nucleic acid species present in the sample. Using a tungsten needle, the DNA microarray spot corresponding to the conserved 3′ UTR motif was repeatedly scraped and the hybridized nucleic acid was recovered. This material was subsequently amplified, cloned, and sequenced (Figure 2). The largest clone spanned almost 1.1 kb; this fragment encompassed the 3′ UTR conserved motif and extended into the 3′-most coding region of the viral genome. BLAST analysis revealed 33% identity over 157 amino acids to MHV nucleocapsid, thus confirming the presence of a novel coronavirus (see Figure 1; see Data S1). We subsequently confirmed the results of both strategies described above using a random-primed RT-PCR shotgun sequencing approach that generated contigs totaling approximately 25 kb of viral genome sequence (see Data S1).