Draft genome sequence of Arthrobacter sp. strain B6 isolated from the high-arsenic sediments in Datong Basin, China

Arthrobacter sp. B6 is a Gram-positive, non-motile, facultatively aerobic bacterium isolated from arsenic-contaminated aquifer sediment in the Datong Basin, China. This strain displays high resistance to arsenic and can dynamically transform arsenic under aerobic conditions. Here, we describe the high-quality draft genome sequence, annotation and features of Arthrobacter sp. B6. The G + C content of the genome is 64.67%. The genome is 4,663,437 bp in size and is arranged in 8 scaffolds that contain 25 contigs. From the sequences, 3956 protein-coding genes, 264 pseudogenes and 89 tRNA/rRNA-encoding genes were identified. Analysis of this genome helps to better understand the mechanism by which the microbe efficiently tolerates arsenic in arsenic-contaminated environments.


Introduction
The genus Arthrobacter, first proposed in 1947 by Conn and Dimmick [1], belongs to the family Micrococcaceae in the class Actinobacteria. Recently, based on intrageneric phylogeny and chemotaxonomic characteristics, the description of the genus Arthrobacter sensu lato was emended by Busse, and the genus Arthrobacter sensu stricto was restricted to A. globiformis, A. pascens, A. oryzae and A. humicola [2]. Owing to their nutritional versatility and tolerance to various environmental stressors [3][4][5][6][7], Arthrobacter species are widely present in soils and in environments contaminated with chemicals and heavy metals [8][9][10][11][12][13], as well as in extreme environments such as Antarctic and radioactive sediments [14,15].
Arthrobacter sp. B6 was isolated from an arsenic-contaminated sediment sample collected from the Datong Basin, China, where the use of high-arsenic groundwater for drinking and irrigation has resulted in endemic arsenic poisoning among tens of thousands of residents [16]. Strain B6 is of particular interest because it shows a high level of resistance to arsenic and can dynamically transform arsenic under aerobic conditions. Here, we present a summary of the taxonomic characterization of Arthrobacter sp. B6 and its main genomic features. These data help to better understand the microbial detoxification mechanism for arsenic, and are useful for comparing the genomic and physiological features of this isolate with those of other Arthrobacter species.
The 16S rRNA gene sequence of strain B6 shares 94.67-99.59% identity with those of other known species of the genus Arthrobacter. To evaluate the evolutionary relationships between B6 and other known strains of the genus Arthrobacter, the 16S rRNA gene sequences of all of these bacteria were aligned using ClustalW [17], and phylogenetic trees were constructed using the maximum-likelihood and neighbor-joining algorithms implemented in MEGA 6.0 [18]. The phylogeny illustrated that strain B6 is closely associated with Arthrobacter oryzae, A. globiformis, A. pascens and A. humicola, suggesting that B6 is affiliated with the genus Arthrobacter (Fig. 2). We also found that Arthrobacter sp. B6 showed high resistance to arsenic, with maximal inhibitory concentrations of 150.0 mM for arsenate and 5.0 mM for arsenite. A dynamic transformation of arsenic catalyzed by strain B6 was observed when it was cultured aerobically with arsenate.
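To illustrate how a pairwise identity such as the 94.67-99.59% range above can be obtained from an alignment, the following is a minimal Python sketch. It is not the procedure used in this study: the function name `percent_identity`, the choice to skip gapped columns, and the toy sequences are all assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: gapped columns are excluded from the denominator. Other
    conventions exist and give slightly different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment (not real 16S rRNA data).
toy_a = "ACGTACGT-A"
toy_b = "ACGTTCGTGA"
print(f"{percent_identity(toy_a, toy_b):.2f}% identity")
```

In practice the two input strings would be rows taken from a multiple sequence alignment, such as the one produced by ClustalW.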

Genome project history
Arthrobacter sp. strain B6 was selected for sequencing on the basis of its high resistance to arsenic and its dynamic arsenic transformation capability. The Whole Genome Shotgun project has been deposited in the DDBJ/EMBL/GenBank database under the accession number LQAP00000000. A summary of the main project information, in compliance with MIGS version 2.0, is shown in Table 2 [19].

Growth conditions and genomic DNA preparation
Strain B6 was grown at 30°C in 0.1× Trypticase Soy Broth liquid medium to mid-exponential phase. Genomic DNA was extracted from 0.5 to 1.0 g of cells using the modified method of Marmur [20]. The purity of DNA, expressed as the value of A260/A280, was assessed on a NanoDrop™ ND-1000 Spectrophotometer (Biolab).
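The purity measure mentioned above is a simple ratio of two absorbance readings. The sketch below shows the calculation in Python; the readings are hypothetical, and the 1.8-2.0 acceptance window is a common rule of thumb for DNA that is assumed here, not a threshold reported in this study.

```python
def a260_a280_ratio(a260: float, a280: float) -> float:
    """Return the A260/A280 absorbance ratio used as a DNA purity indicator."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280


def looks_pure(ratio: float, low: float = 1.8, high: float = 2.0) -> bool:
    """Check the ratio against an assumed rule-of-thumb window for DNA."""
    return low <= ratio <= high


# Hypothetical spectrophotometer readings (not data from this study).
ratio = a260_a280_ratio(a260=1.25, a280=0.66)
print(f"A260/A280 = {ratio:.2f}, within assumed window: {looks_pure(ratio)}")
```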

Genome sequencing and assembly
The draft genome of Arthrobacter sp. B6 was sequenced at the Beijing Genomics Institute (BGI, Shenzhen) using    [21]. The final draft assembly contains 25 contigs in 8 scaffolds. The final assembly was based on all clean reads, which provide an average of 161-fold coverage of the genome. The total size of the genome is 4.66 Mbp.
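Average fold coverage is conventionally the total number of sequenced bases divided by the genome size. The Python sketch below applies that relationship to the two figures reported here (161-fold coverage and 4,663,437 bp). The total number of clean bases is not stated in the text, so the value computed below is only what those two figures imply, not a reported result.

```python
GENOME_SIZE_BP = 4_663_437   # assembly size reported in the text
REPORTED_COVERAGE = 161      # average fold coverage reported in the text


def mean_coverage(total_bases: int, genome_size: int) -> float:
    """Average fold coverage = total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome size must be positive")
    return total_bases / genome_size


# Back-calculated quantity (an inference, not a reported number).
implied_clean_bases = REPORTED_COVERAGE * GENOME_SIZE_BP
print(f"implied clean bases: {implied_clean_bases:,}")
print(f"coverage check: {mean_coverage(implied_clean_bases, GENOME_SIZE_BP):.0f}x")
```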

Genome annotation
Genes were identified using Glimmer v3.02 [22]. The predicted CDSs were translated into amino acid sequences, which were used as BLAST queries against the GenBank, Swissprot, InterPro, KEGG, COG and GO databases. These data were combined to assign a product description to each predicted protein. Additional gene prediction analysis and functional annotation were performed using the Integrated Microbial Genomes-Expert Review (IMG-ER) platform [23].
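The translation step described above can be sketched in a few lines of Python. This is an illustrative example, not the pipeline used in this study: the function name `translate_cds` and the toy sequence are hypothetical, and the standard codon assignments are assumed. Alternative bacterial start codons such as GTG or TTG, which encode methionine at the first position of a gene, are not handled.

```python
BASES = "TCAG"
# Standard codon assignments, ordered by first, second, then third base (T, C, A, G).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_cds(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    protein = []
    for pos in range(0, len(cds), 3):
        residue = CODON_TABLE.get(cds[pos:pos + 3], "X")  # X marks ambiguous codons
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical toy CDS (not a gene from strain B6).
toy_cds = "ATGGCGAAAGGCTGA"
print(translate_cds(toy_cds))
```

The resulting protein strings would then serve as queries for the database searches.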

Genome properties
The assembly of the draft genome sequence consists of 8 scaffolds amounting to 4,663,437 bp. The G + C content is 64.67% (Table 3). From the genome, 4309 genes were predicted, of which 3956 are protein-coding genes. Among these protein-coding genes, 154 were assigned to putative functions, and 275 were annotated as hypothetical proteins. The assignment of genes into COG functional categories is presented in Table 4 and Fig. 3.
Fig. 3 A graphical circular map of the genome generated with the CGView Comparison Tool [31]. From outside to center, rings 1 and 4 show protein-coding genes oriented in the forward and reverse directions, respectively (both colored by COG categories); rings 2 and 3 denote genes on the forward and reverse strands; ring 5 shows the G + C content plot; and the innermost ring shows GC skew, with purple indicating negative values and olive indicating positive values.
A three-gene (arsR-acr3-arsC) operon involved in the regulation of arsenate tolerance and reduction was identified in the genome of Arthrobacter sp. B6. The putative arsenate reductase (ArsC) of strain B6 shows 96% and 95% sequence identity to those of Arthrobacter sp. Leaf137 and Pseudarthrobacter phenanthrenivorans Sphe3, respectively. It also shows 89% identity to those of A. globiformis NBRC 12137, A. nitrophenolicus SJCon, A. enclensis NIO-1008 and Arthrobacter sp. FB24. The amino acid sequence of ACR3 displays 85% identity to that of the arsenic transporter from Arthrobacter sp. FB24. Numerous genes responsible for tolerance to or detoxification of metals were identified in the genome of Arthrobacter sp. B6, including the copper resistance proteins CopC and CopD, a copper chaperone, a copper-translocating P-type ATPase, the cobalt-zinc-cadmium resistance protein CzcD, mercuric reductase, DNA gyrase subunits A and B involved in fluoroquinolone resistance, various polyol ABC transporters, and the DedA protein involved in the uptake of selenate and selenite.
In addition, the genome contains some genes involved in the response to osmotic stress. The high salt tolerance (7% NaCl) of strain B6 may be explained by the presence of a glycine betaine ABC transport system permease protein in the genome.
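The G + C content and GC skew values discussed in this section follow from simple base counts. The Python sketch below shows the usual definitions, with GC skew taken as (G - C)/(G + C) per window. The window size, step, and toy sequence are assumptions for illustration and are not the settings used to draw the genome map.

```python
def gc_content(seq: str) -> float:
    """G + C content as a percentage of unambiguous (A, C, G, T) bases."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / total


def gc_skew(seq: str) -> float:
    """GC skew = (G - C) / (G + C); returns 0.0 when no G or C is present."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return 0.0 if g + c == 0 else (g - c) / (g + c)


def windows(seq: str, size: int, step: int):
    """Yield (start, subsequence) for each full window along the sequence."""
    for start in range(0, len(seq) - size + 1, step):
        yield start, seq[start:start + size]


# Hypothetical toy sequence (not from strain B6); window settings are assumed.
toy = "GGGCATGCCCAT"
print(f"overall G + C: {gc_content(toy):.2f}%")
for start, sub in windows(toy, size=6, step=6):
    print(start, sub, f"skew={gc_skew(sub):+.2f}")
```

Positive and negative skew values computed per window correspond to the two colors of the innermost ring of a circular genome map.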

Conclusions
In the present study, we characterized the genome of Arthrobacter sp. B6, which was isolated from arsenic-contaminated aquifer sediment in the Datong Basin, China. It contains numerous genes involved in heavy metal tolerance and detoxification. Knowledge of the genome sequence of Arthrobacter sp. B6 lays a foundation for better understanding the special metabolic abilities of the strain and for elucidating the metabolic diversity of bacteria inhabiting high-arsenic environments. Further functional analyses of the identified genes may provide insights into the detailed molecular mechanisms by which microbes tolerate and transform arsenic in arsenic-contaminated environments.