capsulatum with a large set of tiling arrays, and combined the results with gene-targeted expression profiling and sequence homology, yielding a high-confidence set of validated gene predictions for G217B, with 7,362 gene predictions validated by at least two of the three methods. In addition, the unbiased approach of the tiling arrays allowed us to detect 264 novel transcripts that are now being incorporated into our oligo expression arrays, directly extending the sensitivity of that platform. The results of
this study are available at http://histo.ucsf.edu in an interactive format intended to facilitate expression, insertional mutagenesis, and bioinformatics-based studies. Thus, the transcript sets resulting from this study represent an enhancement of the previously available H. capsulatum gene set and a starting point for the experimental and theoretical characterization of the molecular biology of this important intracellular pathogen.

Methods

RNA Extraction and cDNA synthesis

To generate a diverse RNA sample for the tiling experiment, we prepared RNA from yeast-form Histoplasma capsulatum strain G217B (ATCC 26032; a kind gift of William
Goldman, Washington University, St. Louis, MO) under a variety of conditions: early, middle, and late logarithmic growth; stationary phase; heat shock (42°C for 30 min); oxidative stress (1 mM menadione for 80 min); sulfhydryl reducing stress (10 mM DTT for 2 hours); and a range of media (HMM[20], 3M[20], YPD[21], and SD complete[21]). Total RNA and polyA RNA were prepared as previously described[8, 9]. Cy5-labeled cDNA was prepared from individual RNA samples as previously described[8], and an equal mass of cDNA was pooled from each sample and hybridized to individual tiling arrays as described below.

Whole Genome Tiling Array Design

The whole genome tiling arrays were designed based on the GSC Histoplasma capsulatum strain G217B genome assembly as of 11/30/2004. Degenerate sequence and transposable elements were removed from the assembly using RepeatMasker[22] with default parameters and the repeat families determined by the
GSC. The remaining sequence was tiled with 50-mer probes at an average frequency of one probe every 60 base pairs. Probe spacing was adjusted to minimize variation in melting temperature, and a subset of probes was truncated to optimize synthesis, in collaboration with CombiMatrix. The number of arrays used to tile a given contig was minimized, and the location of tiling probes was randomized within a given array. In addition, each array contained a common set of control probes, viz.: quality control (QC) and negative control (NC) probes designed by CombiMatrix (Mukilteo, WA); positive control probes tiling the genomic loci and non-genic flanking sequence of TEF1 (P40911)[23], TYR1[9], and CBP1 (AF006209)[24]; and probes specific to a spike-in control sequence.
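
The tiling step described above can be illustrated with a minimal sketch. It is not the procedure used with CombiMatrix; it only shows the general idea of placing 50-mer probes roughly every 60 bp on repeat-masked sequence while nudging each probe start to reduce variation in a melting-temperature proxy. Assumptions that are not in the text: masked bases appear as `N`, GC fraction stands in for melting temperature, and the names `tile_probes`, `JITTER`, and `TARGET_GC` (with their values) are hypothetical.

```python
from typing import List, Tuple

PROBE_LEN = 50   # probe length stated in the text
STEP = 60        # average probe spacing stated in the text
JITTER = 5       # hypothetical: max shift (bp) of a probe start from its nominal position
TARGET_GC = 0.5  # hypothetical: target GC fraction, used as a crude Tm proxy


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in seq (seq must be non-empty)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def tile_probes(
    masked: str,
    probe_len: int = PROBE_LEN,
    step: int = STEP,
    jitter: int = JITTER,
    target_gc: float = TARGET_GC,
) -> List[Tuple[int, str]]:
    """Return (start, probe) pairs tiled across a repeat-masked sequence.

    For each nominal position (0, step, 2*step, ...), candidate starts within
    +/- jitter are examined. Candidates overlapping a masked base ('N') are
    rejected; among the rest, the one whose GC fraction is closest to
    target_gc is kept (lowest start wins ties). If every candidate overlaps
    masked sequence, no probe is emitted for that position.
    """
    masked = masked.upper()
    probes: List[Tuple[int, str]] = []
    last_start = len(masked) - probe_len
    for nominal in range(0, last_start + 1, step):
        best = None  # (score, start, probe)
        lo = max(0, nominal - jitter)
        hi = min(last_start, nominal + jitter)
        for start in range(lo, hi + 1):
            window = masked[start:start + probe_len]
            if "N" in window:
                continue  # skip probes touching masked sequence
            score = abs(gc_fraction(window) - target_gc)
            if best is None or score < best[0]:
                best = (score, start, window)
        if best is not None:
            probes.append((best[1], best[2]))
    return probes


if __name__ == "__main__":
    # Toy contig: 100 bp of unique sequence, 100 masked bp, 100 bp of unique sequence.
    contig = "ACGT" * 25 + "N" * 100 + "ACGT" * 25
    for start, probe in tile_probes(contig):
        print(start, probe, f"GC={gc_fraction(probe):.2f}")
```

A real design would replace the GC proxy with a proper melting-temperature model and would also handle probe truncation, assignment of probes to arrays, and randomization of probe locations, none of which are attempted here.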