Publications by authors named "Takatomo Fujisawa"

40 Publications

Complete sequence and structure of the genome of the harmful algal bloom-forming cyanobacterium Planktothrix agardhii NIES-204 and detailed analysis of secondary metabolite gene clusters.

Harmful Algae 2021 Jan 15;101:101942. Epub 2020 Dec 15.

Center for Environmental Biology and Ecosystem Studies, National Institute for Environmental Studies, 16-2 Onogawa, Tsukuba, Ibaraki 305-8506, Japan.

Planktothrix species are distributed worldwide, and these prevalent cyanobacteria occasionally form potentially devastating toxic blooms. Given the ecological and taxonomic importance of Planktothrix agardhii as a bloom species, we set out to determine the complete genome sequence of the type strain Planktothrix agardhii NIES-204. Remarkably, we found that the 5S ribosomal RNA genes are not adjacent to the 16S and 23S ribosomal RNA genes. The genomic structure of P. agardhii NIES-204 is highly similar to that of another P. agardhii strain isolated from a geographically distant site, although they differ distinctly by a large inversion. We identified numerous gene clusters that encode the components of the metabolic pathways that generate secondary metabolites. We found that the aeruginosin biosynthetic gene cluster was more similar to that of another toxic bloom-forming cyanobacterium Microcystis aeruginosa than to that of other strains of Planktothrix, suggesting horizontal gene transfer. Prenyltransferases encoded in the prenylagaramide gene cluster of Planktothrix strains were classified into two phylogenetically distinct types, suggesting a functional difference. In addition to the secondary metabolite gene clusters, we identified genes for inorganic nitrogen and phosphate uptake components and gas vesicles. Our findings contribute to further understanding of the ecologically important genus Planktothrix.

Source
http://dx.doi.org/10.1016/j.hal.2020.101942
January 2021

DDBJ update: streamlining submission and access of human data.

Nucleic Acids Res 2021 Jan;49(D1):D71-D75

Bioinformation and DDBJ Center, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan.

The Bioinformation and DDBJ Center (DDBJ Center, https://www.ddbj.nig.ac.jp) provides databases that capture, preserve and disseminate diverse biological data to support research in the life sciences. This center collects nucleotide sequences with annotations, raw sequencing data, and alignment information from high-throughput sequencing platforms, and study and sample information, in collaboration with the National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI). This collaborative framework is known as the International Nucleotide Sequence Database Collaboration (INSDC). In collaboration with the National Bioscience Database Center (NBDC), the DDBJ Center also provides a controlled-access database, the Japanese Genotype-phenotype Archive (JGA), which archives and distributes human genotype and phenotype data, requiring authorized access. The NBDC formulates guidelines and policies for sharing human data and reviews data submission and use applications. To streamline all of the processes at NBDC and JGA, we have integrated the two systems by introducing a unified login platform with a group structure in September 2020. In addition to the public databases, the DDBJ Center provides a computer resource, the NIG supercomputer, for domestic researchers to analyze large-scale genomic data. This report describes updates to the services of the DDBJ Center, focusing on the NBDC and JGA system enhancements.

Source
http://dx.doi.org/10.1093/nar/gkaa982
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7779041
January 2021

BioHackathon 2015: Semantics of data for life sciences and reproducible research.

F1000Res 2020 24;9:136. Epub 2020 Feb 24.

St Vincent's Clinical School, Faculty of Medicine, University of New South Wales, Darlinghurst, Australia.

We report on the activities of the 2015 edition of the BioHackathon, an annual event that brings together researchers and developers from around the world to develop tools and technologies that promote the reusability of biological data. We discuss issues surrounding the representation, publication, integration, mining and reuse of biological data and metadata across a wide range of biomedical data types of relevance for the life sciences, including chemistry, genotypes and phenotypes, orthology and phylogeny, proteomics, genomics, glycomics, and metabolomics. We describe our progress to address ongoing challenges to the reusability and reproducibility of research results, and identify outstanding issues that continue to impede the progress of bioinformatics research. We share our perspective on the state of the art, continued challenges, and goals for future research and development for the life sciences Semantic Web.

Source
http://dx.doi.org/10.12688/f1000research.18236.1
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7141167
February 2021

DDBJ Data Analysis Challenge: a machine learning competition to predict Arabidopsis chromatin feature annotations from DNA sequences.

Genes Genet Syst 2020 Apr 26;95(1):43-50. Epub 2020 Mar 26.

Center for Information Biology, National Institute of Genetics.

Recently, the prospect of applying machine learning tools for automating the process of annotation analysis of large-scale sequences from next-generation sequencers has raised the interest of researchers. However, finding research collaborators with knowledge of machine learning techniques is difficult for many experimental life scientists. One solution to this problem is to utilise the power of crowdsourcing. In this report, we describe how we investigated the potential of crowdsourced modelling for a life science task by conducting a machine learning competition, the DNA Data Bank of Japan (DDBJ) Data Analysis Challenge. In the challenge, participants predicted chromatin feature annotations from DNA sequences with competing models. The challenge engaged 38 participants, with a cumulative total of 360 model submissions. The performance of the top model resulted in an area under the curve (AUC) score of 0.95. Over the course of the competition, the overall performance of the submitted models improved by an AUC score of 0.30 from the first submitted model. Furthermore, the 1- and 2-ranking models utilised external data such as genomic location and gene annotation information with specific domain knowledge. The effect of incorporating this domain knowledge led to improvements of approximately 5%-9%, as measured by the AUC scores. This report suggests that machine learning competitions will lead to the development of highly accurate machine learning models for use by experimental scientists unfamiliar with the complexities of data science.
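
The AUC metric used to rank submissions can be illustrated with a minimal sketch. The function name `auc` and the toy labels and scores below are assumptions made for illustration. They are not the challenge's evaluation code or data. The sketch computes the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, counting ties as one half.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5).

    labels: iterable of 0/1 ground-truth annotations (hypothetical data).
    scores: iterable of model scores, one per label.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs in which the positive is ranked higher.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy example: three of the four positive/negative pairs are ordered correctly.
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

A score of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which gives a sense of scale for the 0.95 reported for the top model.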

Source
http://dx.doi.org/10.1266/ggs.19-00034
April 2020

DDBJ Database updates and computational infrastructure enhancement.

Nucleic Acids Res 2020 Jan;48(D1):D45-D50

The Bioinformation and DDBJ Center, National Institute of Genetics, Mishima, Shizuoka, 411-8540, Japan.

The Bioinformation and DDBJ Center (https://www.ddbj.nig.ac.jp) in the National Institute of Genetics (NIG) maintains a primary nucleotide sequence database as a member of the International Nucleotide Sequence Database Collaboration (INSDC) in partnership with the US National Center for Biotechnology Information and the European Bioinformatics Institute. The NIG operates the NIG supercomputer as a computational basis for the construction of DDBJ databases and as a large-scale computational resource for Japanese biologists and medical researchers. In order to accommodate the rapidly growing amount of deoxyribonucleic acid (DNA) nucleotide sequence data, NIG replaced its supercomputer system, which is designed for big data analysis of genome data, in early 2019. The new system is equipped with 30 PB of DNA data archiving storage; large-scale parallel distributed file systems (13.8 PB in total) and 1.1 PFLOPS computation nodes and graphics processing units (GPUs). Moreover, as a starting point of developing multi-cloud infrastructure of bioinformatics, we have also installed an automatic file transfer system that allows users to prevent data lock-in and to achieve cost/performance balance by exploiting the most suitable environment from among the supercomputer and public clouds for different workloads.

Source
http://dx.doi.org/10.1093/nar/gkz982
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7145692
January 2020

Generating Publication-Ready Prokaryotic Genome Annotations with DFAST.

Methods Mol Biol 2019;1962:215-226

Department of Informatics, National Institute of Genetics, Shizuoka, Japan.

DDBJ Fast Annotation and Submission Tool (DFAST) is a genome annotation pipeline for prokaryotes, which also assists data submission to the public sequence database. It is available both as a web service and as a stand-alone tool that runs on local machines. DFAST can annotate a typical-sized bacterial genome within 5 min. The default annotation workflow contains a gene prediction phase for protein coding sequence, rRNA, tRNA, and CRISPR, and a functional annotation phase to infer protein functions. DFAST generates result files in standard annotation formats and data files for submission to DNA Data Bank of Japan (DDBJ). In this chapter, the annotation workflow and applications of DFAST are introduced.

Source
http://dx.doi.org/10.1007/978-1-4939-9173-0_13
August 2019

TogoGenome/TogoStanza: modularized Semantic Web genome database.

Database (Oxford) 2019 Jan 1;2019. Epub 2019 Jan 1.

National Institute of Genetics, Mishima, Shizuoka, Japan.

TogoGenome is a genome database that is purely based on the Semantic Web technology, which enables the integration of heterogeneous data and flexible semantic searches. All the information is stored as Resource Description Framework (RDF) data, and the reporting web pages are generated on the fly using SPARQL Protocol and RDF Query Language (SPARQL) queries. TogoGenome provides a semantic-faceted search system by gene functional annotation, taxonomy, phenotypes and environment based on the relevant ontologies. TogoGenome also serves as an interface to conduct semantic comparative genomics by which a user can observe pan-organism or organism-specific genes based on the functional aspect of gene annotations and the combinations of organisms from different taxa. The TogoGenome database exhibits a modularized structure, and each module in the report pages is separately served as TogoStanza, which is a generic framework for rendering an information block as IFRAME/Web Components, which can, unlike several other monolithic databases, also be reused to construct other databases. TogoGenome and TogoStanza have been under development since 2012 and are freely available along with their source codes on the GitHub repositories at https://github.com/togogenome/ and https://github.com/togostanza/, respectively, under the MIT license.

Source
http://dx.doi.org/10.1093/database/bay132
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6323299
January 2019

Draft Genome Sequence of the Nitrogen-Fixing and Hormogonia-Inducing Cyanobacterium Strain WK-1, Isolated from the Coralloid Roots of .

Genome Announc 2018 Feb 15;6(7). Epub 2018 Feb 15.

Kobe University Research Center for Inland Seas, Awaji, Hyogo, Japan

We report here the whole-genome sequence of strain WK-1, which was isolated from cyanobacterial colonies growing in the coralloid roots of the gymnosperm It can provide valuable resources to study the mutualistic relationships and the syntrophic metabolisms between the cyanobacterial symbiont and the host plant, .

Source
http://dx.doi.org/10.1128/genomeA.00021-18
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5814485
February 2018

DFAST: a flexible prokaryotic genome annotation pipeline for faster genome publication.

Bioinformatics 2018 Mar;34(6):1037-1039

Center for Information Biology, National Institute of Genetics, Research Organization of Information and Systems, 1111 Yata, Mishima 411-8540, Japan.

Summary: We developed a prokaryotic genome annotation pipeline, DFAST, that also supports genome submission to public sequence databases. DFAST was originally started as an on-line annotation server, and to date, over 7000 jobs have been processed since its first launch in 2016. Here, we present a newly implemented background annotation engine for DFAST, which is also available as a standalone command-line program. The new engine can annotate a typical-sized bacterial genome within 10 min, with rich information such as pseudogenes, translation exceptions and orthologous gene assignment between given reference genomes. In addition, the modular framework of DFAST allows users to customize the annotation workflow easily and will also facilitate extensions for new functions and incorporation of new tools in the future.

Availability And Implementation: The software is implemented in Python 3 and runs in both Python 2.7 and 3.4, on Macintosh and Linux systems. It is freely available at https://github.com/nigyta/dfast_core/ under the GPLv3 license with external binaries bundled in the software distribution. An on-line version is also available at https://dfast.nig.ac.jp/.

Contact: yn@nig.ac.jp.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
http://dx.doi.org/10.1093/bioinformatics/btx713
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5860143
March 2018

Complete Genome Sequence of a Coastal Cyanobacterium, sp. Strain NIES-970.

Genome Announc 2017 Apr 6;5(14). Epub 2017 Apr 6.

Center for Environmental Biology and Ecosystem Studies, National Institute for Environmental Studies, Tsukuba, Ibaraki, Japan.

Members of the cyanobacterial genus are abundant in marine environments. To better understand the genomic diversity of marine spp., we determined the complete genome sequence of a coastal cyanobacterium, sp. NIES-970. The genome had a size of 3.1 Mb, consisting of one chromosome and four plasmids.

Source
http://dx.doi.org/10.1128/genomeA.00139-17
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5383900
April 2017

DNApod: DNA polymorphism annotation database from next-generation sequence read archives.

PLoS One 2017 24;12(2):e0172269. Epub 2017 Feb 24.

Genome Informatics Laboratory, National Institute of Genetics, Mishima, Shizuoka, Japan.

With the rapid advances in next-generation sequencing (NGS), datasets for DNA polymorphisms among various species and strains have been produced, stored, and distributed. However, reliability varies among these datasets because the experimental and analytical conditions used differ among assays. Furthermore, such datasets have been frequently distributed from the websites of individual sequencing projects. It is desirable to integrate DNA polymorphism data into one database featuring uniform quality control that is distributed from a single platform at a single place. DNA polymorphism annotation database (DNApod; http://tga.nig.ac.jp/dnapod/) is an integrated database that stores genome-wide DNA polymorphism datasets acquired under uniform analytical conditions, and this includes uniformity in the quality of the raw data, the reference genome version, and evaluation algorithms. DNApod genotypic data are re-analyzed whole-genome shotgun datasets extracted from sequence read archives, and DNApod distributes genome-wide DNA polymorphism datasets and known-gene annotations for each DNA polymorphism. This new database was developed for storing genome-wide DNA polymorphism datasets of plants, with crops being the first priority. Here, we describe our analyzed data for 679, 404, and 66 strains of rice, maize, and sorghum, respectively. The analytical methods are available as a DNApod workflow in an NGS annotation system of the DNA Data Bank of Japan and a virtual machine image. Furthermore, DNApod provides tables of links of identifiers between DNApod genotypic data and public phenotypic data. To advance the sharing of organism knowledge, DNApod offers basic and ubiquitous functions for multiple alignment and phylogenetic tree construction by using orthologous gene information.

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0172269
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5325239
September 2017

Genome sequence and overview of Shr3 in the eighth class of the phylum .

Stand Genomic Sci 2016 13;11:90. Epub 2016 Dec 13.

Genetic Strains Research Center, National Institute of Genetics, 1111 Yata, Mishima, 411-8540 Japan; Department of Genetics, The Graduate University for Advanced Studies (SOKENDAI), 1111 Yata, Mishima, 411-8540 Japan.

Shr3 is the first strain described in the newest (eighth) class of the phylum . This strain was isolated from the 0.2-μm filtrate of a suspension of sand gravels collected in the Sahara Desert in the Republic of Tunisia. The genome of Shr3 is 7,569,109 bp long and consists of one scaffold with a 54.3% G + C content. A total of 6,463 genes were predicted, comprising 6,406 protein-coding and 57 RNA genes. Genome sequence analysis suggested that strain Shr3 had multiple terminal oxidases for aerobic respiration and various transporters, including the resistance-nodulation-cell division-type efflux pumps. Additionally, gene sequences related to the incomplete denitrification pathway lacking the final step to reduce nitrous oxide (N₂O) to nitrogen gas (N₂) were found in the Shr3 genome. The results presented herein provide insight into the metabolic versatility and N₂O-producing activity of species.

Source
http://dx.doi.org/10.1186/s40793-016-0210-6
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5154148
December 2016

DNA Data Bank of Japan.

Nucleic Acids Res 2017 Jan 24;45(D1):D25-D31. Epub 2016 Oct 24.

DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan

The DNA Data Bank of Japan (DDBJ) (http://www.ddbj.nig.ac.jp) has been providing public data services for thirty years (since 1987). We are collecting nucleotide sequence data from researchers as a member of the International Nucleotide Sequence Database Collaboration (INSDC, http://www.insdc.org), in collaboration with the US National Center for Biotechnology Information (NCBI) and European Bioinformatics Institute (EBI). The DDBJ Center also services Japanese Genotype-phenotype Archive (JGA), with the National Bioscience Database Center to collect human-subjected data from Japanese researchers. Here, we report our database activities for INSDC and JGA over the past year, and introduce retrieval and analytical services running on our supercomputer system and their recent modifications. Furthermore, with the Database Center for Life Science, the DDBJ Center improves semantic web technologies to integrate and to share biological data, for providing the RDF version of the sequence data.

Source
http://dx.doi.org/10.1093/nar/gkw1001
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5210514
January 2017

CyanoBase: a large-scale update on its 20th anniversary.

Nucleic Acids Res 2017 Jan 29;45(D1):D551-D554. Epub 2016 Nov 29.

Center for Information Biology, National Institute of Genetics, Research Organization of Information and Systems, Yata, Mishima 411-8540, Japan

The first ever cyanobacterial genome sequence was determined two decades ago and CyanoBase (http://genome.microbedb.jp/cyanobase), the first database for cyanobacteria was simultaneously developed to allow this genomic information to be used more efficiently. Since then, CyanoBase has constantly been extended and has received several updates. Here, we describe a new large-scale update of the database, which coincides with its 20th anniversary. We have expanded the number of cyanobacterial genomic sequences from 39 to 376 species, which consists of 86 complete and 290 draft genomes. We have also optimized the user interface for large genomic data to include the use of semantic web technologies and JBrowse and have extended community-based reannotation resources through the re-annotation of Synechocystis sp. PCC 6803 by the cyanobacterial research community. These updates have markedly improved CyanoBase, providing cyanobacterial genome annotations as references for cyanobacterial research.

Source
http://dx.doi.org/10.1093/nar/gkw1131
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5210588
January 2017

DFAST and DAGA: web-based integrated genome annotation tools and resources.

Biosci Microbiota Food Health 2016 14;35(4):173-184. Epub 2016 Jul 14.

Center for Information Biology, National Institute of Genetics, 1111 Yata, Mishima, Shizuoka 411-8540, Japan; RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi, Yokohama, Kanagawa 230-0045, Japan.

Quality assurance and correct taxonomic affiliation of data submitted to public sequence databases have been an everlasting problem. The DDBJ Fast Annotation and Submission Tool (DFAST) is a newly developed genome annotation pipeline with quality and taxonomy assessment tools. To enable annotation of ready-to-submit quality, we also constructed curated reference protein databases tailored for lactic acid bacteria. DFAST was developed so that all the procedures required for DDBJ submission could be done seamlessly online. The online workspace would be especially useful for users not familiar with bioinformatics skills. In addition, we have developed a genome repository, DFAST Archive of Genome Annotation (DAGA), which currently includes 1,421 genomes covering 179 species and 18 subspecies of two genera, and , obtained from both DDBJ/ENA/GenBank and Sequence Read Archive (SRA). All the genomes deposited in DAGA were annotated consistently and assessed using DFAST. To assess the taxonomic position based on genomic sequence information, we used the average nucleotide identity (ANI), which showed high discriminative power to determine whether two given genomes belong to the same species. We corrected mislabeled or misidentified genomes in the public database and deposited the curated information in DAGA. The repository will improve the accessibility and reusability of genome resources for lactic acid bacteria. By exploiting the data deposited in DAGA, we found intraspecific subgroups in and , whose variation between subgroups is larger than the well-accepted ANI threshold of 95% to differentiate species. DFAST and DAGA are freely accessible at https://dfast.nig.ac.jp.

Source
http://dx.doi.org/10.12938/bmfh.16-003
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5107635
July 2016

Complete Genome Sequence of Aurantimicrobium minutum Type Strain KNC(T), a Planktonic Ultramicrobacterium Isolated from River Water.

Genome Announc 2016 Jun 30;4(3). Epub 2016 Jun 30.

Genetic Strains Research Center, National Institute of Genetics, Mishima, Japan; The Graduate University for Advanced Studies (SOKENDAI), Mishima, Japan.

Aurantimicrobium minutum type strain KNC(T) is a planktonic ultramicrobacterium isolated from river water in western Japan. Strain KNC(T) has an extremely small, streamlined genome of 1,622,386 bp comprising 1,575 protein-coding sequences. The genome annotation suggests that strain KNC(T) has an actinorhodopsin-based photometabolism.

Source
http://dx.doi.org/10.1128/genomeA.00616-16
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4929513
June 2016

FALDO: a semantic standard for describing the location of nucleotide and protein feature annotation.

J Biomed Semantics 2016 Jun 13;7:39. Epub 2016 Jun 13.

The James Hutton Institute, Dundee, DD2 5DA, UK.

Background: Nucleotide and protein sequence feature annotations are essential to understand biology on the genomic, transcriptomic, and proteomic level. Using Semantic Web technologies to query biological annotations, there was no standard that described this potentially complex location information as subject-predicate-object triples.

Description: We have developed an ontology, the Feature Annotation Location Description Ontology (FALDO), to describe the positions of annotated features on linear and circular sequences. FALDO can be used to describe nucleotide features in sequence records, protein annotations, and glycan binding sites, among other features in coordinate systems of the aforementioned "omics" areas. Using the same data format to represent sequence positions that are independent of file formats allows us to integrate sequence data from multiple sources and data types. The genome browser JBrowse is used to demonstrate accessing multiple SPARQL endpoints to display genomic feature annotations, as well as protein annotations from UniProt mapped to genomic locations.

Conclusions: Our ontology allows users to uniformly describe - and potentially merge - sequence annotations from multiple sources. Data sources using FALDO can prospectively be retrieved using federalised SPARQL queries against public SPARQL endpoints and/or local private triple stores.
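
The subject-predicate-object representation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not code from the FALDO authors. The helper `faldo_region` and the `example.org` URIs are hypothetical. The FALDO term names (`location`, `Region`, `begin`, `end`, `ExactPosition`, `position`, `reference`, and the strand classes) come from the published ontology namespace. Triples are plain Python tuples so that no RDF library is required.

```python
FALDO = "http://biohackathon.org/resource/faldo#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def faldo_region(feature, reference, begin, end, forward=True):
    """Return (subject, predicate, object) triples locating `feature` on `reference`.

    `feature` and `reference` are URIs chosen by the caller (hypothetical here);
    `begin` and `end` are integer coordinates on the reference sequence.
    """
    strand = FALDO + ("ForwardStrandPosition" if forward else "ReverseStrandPosition")
    region = feature + "#region"
    triples = [
        (feature, FALDO + "location", region),
        (region, RDF_TYPE, FALDO + "Region"),
    ]
    # Each boundary is its own position node, typed by exactness and strand.
    for name, coord in (("begin", begin), ("end", end)):
        pos = f"{region}-{name}"
        triples += [
            (region, FALDO + name, pos),
            (pos, RDF_TYPE, FALDO + "ExactPosition"),
            (pos, RDF_TYPE, strand),
            (pos, FALDO + "position", coord),
            (pos, FALDO + "reference", reference),
        ]
    return triples


if __name__ == "__main__":
    for triple in faldo_region("http://example.org/gene1",
                               "http://example.org/chr1", 100, 250):
        print(triple)
```

Because each boundary is a separate node, the same pattern extends to fuzzy or strand-specific positions by changing only the position types, and the result is independent of the file format the annotation came from.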

Source
http://dx.doi.org/10.1186/s13326-016-0067-z
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4907002
June 2016

Complete Genome Sequence of Cyanobacterium Leptolyngbya sp. NIES-3755.

Genome Announc 2016 Mar 17;4(2). Epub 2016 Mar 17.

Genome Research Center, Tokyo University of Agriculture, Setagaya-ku, Tokyo, Japan

Cyanobacterial genus Leptolyngbya comprises genetically diverse species, but the availability of their complete genome information is limited. Here, we isolated Leptolyngbya sp. strain NIES-3755 from soil at the Toyohashi University of Technology, Japan. We determined the complete genome sequence of the NIES-3755 strain, which is composed of one chromosome and three plasmids.

Source
http://dx.doi.org/10.1128/genomeA.00090-16
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4796116
March 2016

Complete genome sequence of cyanobacterium Fischerella sp. NIES-3754, providing thermoresistant optogenetic tools.

J Biotechnol 2016 Feb 16;220:45-6. Epub 2016 Jan 16.

Genome Research Center, Tokyo University of Agriculture, 1-1-1 Sakuragaoka, Setagaya-ku, Tokyo 156-8502, Japan.

Cyanobacterial phytochrome-class photosensors are emerging optogenetic tools. We isolated Fischerella sp. strain NIES-3754 from a hot spring at Suwa-shrine, Suwa, Nagano, Japan. We determined the complete genome sequence of the NIES-3754 strain, which is composed of one chromosome and two putative replicons (5,826,863 bp in total, with no gaps). We identified photosensor genes for 5 phytochromes and 9 cyanobacteriochromes, which will facilitate optogenetics in thermophiles.

Source
http://dx.doi.org/10.1016/j.jbiotec.2016.01.011
February 2016

Complete genome sequence of cyanobacterium Nostoc sp. NIES-3756, a potentially useful strain for phytochrome-based bioengineering.

J Biotechnol 2016 Jan 4;218:51-2. Epub 2015 Dec 4.

Genome Research Center, Tokyo University of Agriculture, 1-1-1 Sakuragaoka, Setagaya-ku, Tokyo 156-8502, Japan.

To explore the diverse photoreceptors of cyanobacteria, we isolated Nostoc sp. strain NIES-3756 from soil at Mimomi-Park, Chiba, Japan, and determined its complete genome sequence. The genome consists of one chromosome and two plasmids (6,987,571 bp in total, with no gaps). The NIES-3756 strain carries 7 phytochrome and 12 cyanobacteriochrome genes, which will facilitate studies of phytochrome-based bioengineering.

Source
http://dx.doi.org/10.1016/j.jbiotec.2015.12.002
January 2016

DNA data bank of Japan (DDBJ) progress report.

Nucleic Acids Res 2016 Jan 17;44(D1):D51-7. Epub 2015 Nov 17.

DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan; National Bioscience Database Center, Japan Science and Technology Agency, Tokyo 102-8666, Japan

The DNA Data Bank of Japan Center (DDBJ Center; http://www.ddbj.nig.ac.jp) maintains and provides public archival, retrieval and analytical services for biological information. The contents of the DDBJ databases are shared with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI) within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). Since 2013, the DDBJ Center has been operating the Japanese Genotype-phenotype Archive (JGA) in collaboration with the National Bioscience Database Center (NBDC) in Japan. In addition, the DDBJ Center develops semantic web technologies for data integration and sharing in collaboration with the Database Center for Life Science (DBCLS) in Japan. This paper briefly reports on the activities of the DDBJ Center over the past year including submissions to databases and improvements in our services for data retrieval, analysis, and integration.

Source
http://dx.doi.org/10.1093/nar/gkv1105
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4702806
January 2016

Implementation of linked data in the life sciences at BioHackathon 2011.

J Biomed Semantics 2015 7;6. Epub 2015 Jan 7.

Department of Bioclinical informatics, Tohoku Medical Megabank Organization, Tohoku University, Seiryo-cho 4-1, Aoba-ku, Sendai-shi Miyagi, 980-8575 Japan.

Background: Linked Data has gained some attention recently in the life sciences as an effective way to provide and share data. As a part of the Semantic Web, data are linked so that a person or machine can explore the web of data. Resource Description Framework (RDF) is the standard means of implementing Linked Data. In the process of generating RDF data, not only are data simply linked to one another, the links themselves are characterized by ontologies, thereby allowing the types of links to be distinguished. Although there is a high labor cost to define an ontology for data providers, the merit lies in the higher level of interoperability with data analysis and visualization software. This increase in interoperability facilitates the multi-faceted retrieval of data, and the appropriate data can be quickly extracted and visualized. Such retrieval is usually performed using the SPARQL (SPARQL Protocol and RDF Query Language) query language, which is used to query RDF data stores. For the database provider, such interoperability will surely lead to an increase in the number of users.

Results: This manuscript describes the experiences and discussions shared among participants of the week-long BioHackathon 2011 who went through the development of RDF representations of their own data and developed specific RDF and SPARQL use cases. Advice regarding considerations to take when developing RDF representations of their data are provided for bioinformaticians considering making data available and interoperable.

Conclusions: Participants of the BioHackathon 2011 were able to produce RDF representations of their data and gain a better understanding of the requirements for producing such data in a period of just five days. We summarize the work accomplished with the hope that it will be useful for researchers involved in developing laboratory databases or data analysis, and those who are considering such technologies as RDF and Linked Data.

Source
http://dx.doi.org/10.1186/2041-1480-6-3
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4429360
May 2015

The BioMart community portal: an innovative alternative to large, centralized data repositories.

Nucleic Acids Res 2015 Jul 20;43(W1):W589-98. Epub 2015 Apr 20.

Oncology Computational Biology, Pfizer, La Jolla, USA.

The BioMart Community Portal (www.biomart.org) is a community-driven effort to provide a unified interface to biomedical databases that are distributed worldwide. The portal provides access to numerous database projects supported by 30 scientific organizations. It includes over 800 different biological datasets spanning genomics, proteomics, model organisms, cancer data, ontology information and more. All resources available through the portal are independently administered and funded by their host organizations. The BioMart data federation technology provides a unified interface to all the available data. The latest version of the portal comes with many new databases that have been created by our ever-growing community. It also comes with better support and extensibility for data analysis and visualization tools. A new addition to our toolbox, the enrichment analysis tool, is now accessible through graphical and web service interfaces. The BioMart Community Portal averages over one million requests per day. Building on this level of service and the wealth of information that has become available, the BioMart Community Portal has introduced a new, more scalable and cheaper alternative to the large data stores maintained by specialized organizations.

Source
http://dx.doi.org/10.1093/nar/gkv350
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4489294
July 2015

Complete Genome Sequence of Bifidobacterium longum 105-A, a Strain with High Transformation Efficiency.

Genome Announc 2014 Dec 18;2(6). Epub 2014 Dec 18.

The United Graduate School of Agricultural Science, Gifu University, Gifu, Gifu, Japan.

Bifidobacterium longum 105-A shows high transformation efficiency and allows for the generation of gene knockout mutants through homologous recombination. Here, we report the complete genome sequence of strain 105-A. Genes encoding at least four putative restriction-modification systems were found in this genome, which might contribute to its transformation efficiency.

Source
http://dx.doi.org/10.1128/genomeA.01311-14
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4271160
December 2014

The DDBJ Japanese Genotype-phenotype Archive for genetic and phenotypic human data.

Nucleic Acids Res 2015 Jan 3;43(Database issue):D18-22. Epub 2014 Dec 3.

DDBJ Center, National Institute of Genetics, Shizuoka 411-8540, Japan

The DNA Data Bank of Japan Center (DDBJ Center; http://www.ddbj.nig.ac.jp) maintains and provides public archival, retrieval and analytical services for biological information. Since October 2013, the DDBJ Center has operated the Japanese Genotype-phenotype Archive (JGA) in collaboration with our partner institute, the National Bioscience Database Center (NBDC) of the Japan Science and Technology Agency. The DDBJ Center provides the JGA database system, which securely stores genotype and phenotype data collected from individuals whose consent agreements authorize data release only for specific research use. NBDC has established guidelines and policies for sharing human-derived data and reviews data submission and usage requests from researchers. In addition to the JGA project, the DDBJ Center develops Semantic Web technologies for data integration and sharing in collaboration with the Database Center for Life Science. This paper gives an overview of the JGA project and describes updates to the DDBJ databases and services for data retrieval, analysis and integration.

Source
http://dx.doi.org/10.1093/nar/gku1120
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4383935
January 2015

Loss of cytochrome cM stimulates cyanobacterial heterotrophic growth in the dark.

Plant Cell Physiol 2015 Feb 20;56(2):334-45. Epub 2014 Nov 20.

Graduate School of Bioagricultural Sciences, Nagoya University, Nagoya, 464-8601 Japan

Although cyanobacteria are photoautotrophs, they have the capability for heterotrophic metabolism that enables them to survive in their natural habitat. However, cyanobacterial species that grow heterotrophically in the dark are rare. It remains largely unknown how cyanobacteria regulate heterotrophic activity. The cyanobacterium Leptolyngbya boryana grows heterotrophically with glucose in the dark. A dark-adapted variant dg5 isolated from the wild type (WT) exhibits enhanced heterotrophic growth in the dark. We sequenced the genomes of dg5 and the WT to identify the mutation(s) of dg5. The WT genome consists of a circular chromosome (6,176,364 bp), a circular plasmid pLBA (77,793 bp) and two linear plasmids pLBX (504,942 bp) and pLBY (44,369 bp). Genome comparison revealed three mutation sites. Phenotype analysis of mutants isolated from the WT by introducing these mutations individually revealed that the relevant mutation is a single adenine insertion causing a frameshift of cytM encoding Cyt cM. The respiratory oxygen consumption of the cytM-lacking mutant grown in the dark was significantly higher than that of the WT. We isolated a cytM-lacking mutant, ΔcytM, from another cyanobacterium Synechocystis sp. PCC 6803, and ΔcytM grew in the dark with a doubling time of 33 h in contrast to no growth of the WT. The respiratory oxygen consumption of ΔcytM grown in the dark was about 2-fold higher than that of the WT. These results suggest a suppressive role(s) for Cyt cM in regulation of heterotrophic activity.

Source
http://dx.doi.org/10.1093/pcp/pcu165
February 2015

Draft Genome Sequence of Lactobacillus oryzae Strain SG293T.

Genome Announc 2014 Aug 28;2(4). Epub 2014 Aug 28.

National Agriculture and Food Research Organization, National Institute of Livestock and Grassland Science, Nasushiobara, Japan

We report the 1.86-Mb draft genome and annotation of Lactobacillus oryzae SG293(T) isolated from fermented rice grains. This genome information may provide further insights into the mechanisms underlying the fermentation of rice grains.

Source
http://dx.doi.org/10.1128/genomeA.00861-14
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4148733
August 2014

Draft Genome Sequence of Weissella oryzae SG25T, Isolated from Fermented Rice Grains.

Genome Announc 2014 Jul 10;2(4). Epub 2014 Jul 10.

National Agriculture and Food Research Organization, National Institute of Livestock and Grassland Science, Nasushiobara, Japan

Weissella oryzae was originally isolated from fermented rice grains. Here we report the draft genome sequence of the type strain of W. oryzae. This first report on the genomic sequence of this species may help identify the mechanisms underlying bacterial adaptation to the ecological niche of fermented rice grains.

Source
http://dx.doi.org/10.1128/genomeA.00667-14
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4110764
July 2014

Klebsormidium flaccidum genome reveals primary factors for plant terrestrial adaptation.

Nat Commun 2014 May 28;5:3978. Epub 2014 May 28.

Department of Biological Sciences, Tokyo Institute of Technology, Yokohama City, Kanagawa 226-8501, Japan.

The colonization of land by plants was a key event in the evolution of life. Here we report the draft genome sequence of the filamentous terrestrial alga Klebsormidium flaccidum (Division Charophyta, Order Klebsormidiales) to elucidate the early transition step from aquatic algae to land plants. Comparison of the genome sequence with that of other algae and land plants demonstrates that K. flaccidum acquired many genes specific to land plants. We demonstrate that K. flaccidum indeed produces several plant hormones and homologues of some of the signalling intermediates required for hormone actions in higher plants. The K. flaccidum genome also encodes a primitive system to protect against the harmful effects of high-intensity light. The presence of these plant-related systems in K. flaccidum suggests that, during evolution, this alga acquired the fundamental machinery required for adaptation to terrestrial environments.

Source
http://dx.doi.org/10.1038/ncomms4978
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4052687
May 2014

BioHackathon series in 2011 and 2012: penetration of ontology and linked data in life science domains.

J Biomed Semantics 2014 Feb 5;5(1). Epub 2014 Feb 5.

Database Center for Life Science, Research Organization of Information and Systems, 2-11-16, Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan.

The application of semantic technologies to the integration of biological data and the interoperability of bioinformatics analysis and visualization tools has been the common theme of a series of annual BioHackathons hosted in Japan for the past five years. Here we provide a review of the activities and outcomes from the BioHackathons held in 2011 in Kyoto and 2012 in Toyama. In order to efficiently implement semantic technologies in the life sciences, participants formed various sub-groups and worked on the following topics: Resource Description Framework (RDF) models for specific domains, text mining of the literature, ontology development, essential metadata for biological databases, platforms to enable efficient Semantic Web technology development and interoperability, and the development of applications for Semantic Web data. In this review, we briefly introduce the themes covered by these sub-groups. The observations made, conclusions drawn, and software development projects that emerged from these activities are discussed.

Source
http://dx.doi.org/10.1186/2041-1480-5-5
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3978116
February 2014