Publications by authors named "Atsuko Yamaguchi"

35 Publications

Spatial-Temporal Distribution of Megamouth Shark, Megachasma pelagios, Inferred from over 250 Individuals Recorded in the Three Oceans.

Animals (Basel) 2021 Oct 12;11(10). Epub 2021 Oct 12.

Graduate School of Fisheries Science and Environmental Studies, Nagasaki University, Nagasaki 852-8521, Japan.

The megamouth shark (Megachasma pelagios) is one of the rarest shark species in the three oceans, and its biological and fishery information is still very limited. A total of 261 landing/stranding records were examined, including 132 females, 87 males, and 42 individuals of unknown sex, to provide the most detailed information on global megamouth shark records, and the spatial-temporal distribution of M. pelagios was inferred from these records. The vertical distribution of M. pelagios ranged from 0 to 1203 m in depth, and immature individuals were mostly found in waters shallower than 200 m. Mature individuals are not only able to dive deeper, but also move to higher-latitude waters. The majority of M. pelagios are found in the western North Pacific Ocean (>5° N). The Indian and Atlantic Oceans are potential nursery areas for this species; immature individuals are mainly found in Indonesian and Philippine waters. Large individuals tend to move towards higher-latitude waters (>15° N) for foraging and growth from April to August. Sexual segregation of M. pelagios is found: females tend to move to higher-latitude waters (>30° N) in the western North Pacific Ocean, but males may move across the North Pacific Ocean.

Source
DOI: http://dx.doi.org/10.3390/ani11102947
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8532755
October 2021

O-JMeSH: creating a bilingual English-Japanese controlled vocabulary of MeSH UIDs through machine translation and mutual information.

Genomics Inform 2021 Sep 30;19(3):e26. Epub 2021 Sep 30.

Graduate School of Integrative Science and Engineering, Tokyo City University, Tokyo 158-8557, Japan.

Previous approaches to creating a controlled vocabulary for Japanese have relied on existing bilingual dictionaries and transformation rules to allow such mappings. However, given the possible new terms introduced due to coronavirus disease 2019 (COVID-19) and the emphasis on respiratory and infection-related terms, coverage might not be guaranteed. In this work, we propose creating a Japanese bilingual controlled vocabulary based on MeSH terms assigned to COVID-19-related publications. To this end, we relied on manual curation of several bilingual dictionaries and a computational approach based on machine translation of sentences containing such terms and the ranking of possible translations for the individual terms by mutual information. Our results show that we achieved nearly 99% occurrence coverage in LitCovid, while our computational approach presented an average accuracy of 63.33% for all terms, and 84.51% for drugs and chemicals.
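
The ranking step mentioned in this abstract can be illustrated with a minimal sketch. It assumes pointwise mutual information computed from sentence-level co-occurrence counts; the abstract does not give the exact formulation, so the function name `rank_translations`, the input layout, and the toy data are all hypothetical.

```python
import math
from collections import Counter

def rank_translations(sentence_pairs, term):
    """Rank Japanese tokens as candidate translations of an English term.

    sentence_pairs: list of (english_terms, japanese_tokens) pairs, each a set.
    Scoring (an assumption, not the paper's stated method): pointwise mutual
    information, log2( p(term, token) / (p(term) * p(token)) ).
    """
    n = len(sentence_pairs)
    term_count = 0
    token_count = Counter()
    joint_count = Counter()
    for english_terms, japanese_tokens in sentence_pairs:
        has_term = term in english_terms
        term_count += has_term
        for token in japanese_tokens:
            token_count[token] += 1
            if has_term:
                joint_count[token] += 1
    scores = {}
    for token, joint in joint_count.items():
        p_joint = joint / n
        p_term = term_count / n
        p_token = token_count[token] / n
        scores[token] = math.log2(p_joint / (p_term * p_token))
    # Highest-scoring candidates first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy corpus of machine-translated sentence pairs.
pairs = [
    ({"fever"}, {"発熱", "患者"}),
    ({"fever"}, {"発熱", "の"}),
    ({"cough"}, {"咳", "患者"}),
    ({"cough"}, {"咳", "の"}),
]
ranked = rank_translations(pairs, "fever")
```

In this toy corpus the token that appears only alongside "fever" ranks first, while tokens spread evenly across all sentences score zero.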

Source
DOI: http://dx.doi.org/10.5808/gi.21014
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8510863
September 2021

Constructing Japanese MeSH term dictionaries related to the COVID-19 literature.

Genomics Inform 2021 Sep 30;19(3):e25. Epub 2021 Sep 30.

Computer Science Department, The University of Sheffield, Western Bank, Sheffield S10 2TN, UK.

The coronavirus disease 2019 (COVID-19) pandemic has led to a flood of research papers and the information has been updated with considerable frequency. For society to derive benefits from this research, it is necessary to promote sharing up-to-date knowledge from these papers. However, because most research papers are written in English, it is difficult for people who are not familiar with English medical terms to obtain knowledge from them. To facilitate sharing knowledge from COVID-19 papers written in English for Japanese speakers, we tried to construct a dictionary with an open license by assigning Japanese terms to MeSH unique identifiers (UIDs) annotated to words in the texts of COVID-19 papers. Using this dictionary, 98.99% of all occurrences of MeSH terms in COVID-19 papers were covered. We also created a curated version of the dictionary and uploaded it to PubDictionary for wider use in the PubAnnotation system.
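
The coverage figure in this abstract counts occurrences rather than distinct terms. A minimal sketch of that distinction follows, assuming annotations are available as a flat list of UIDs and the dictionary maps UIDs to Japanese labels; the function name and the placeholder UIDs are hypothetical.

```python
from collections import Counter

def occurrence_coverage(annotated_uids, dictionary):
    """Fraction of annotation occurrences whose UID has a dictionary entry."""
    counts = Counter(annotated_uids)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(c for uid, c in counts.items() if uid in dictionary)
    return covered / total

def term_coverage(annotated_uids, dictionary):
    """Fraction of distinct UIDs that have a dictionary entry."""
    distinct = set(annotated_uids)
    if not distinct:
        return 0.0
    return len(distinct & set(dictionary)) / len(distinct)

# Placeholder UIDs and labels, not real MeSH identifiers.
uids = ["UID_A", "UID_A", "UID_A", "UID_B", "UID_C"]
ja_dictionary = {"UID_A": "用語A", "UID_B": "用語B"}
```

Because frequent terms dominate the occurrence count, occurrence coverage (0.8 here) can be higher than distinct-term coverage (2 of 3 here).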

Source
DOI: http://dx.doi.org/10.5808/gi.21012
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8510869
September 2021

A proof-of-concept study of extracting patient histories for rare/intractable diseases from social media.

Genomics Inform 2020 Jun 18;18(2):e17. Epub 2020 Jun 18.

Leiden University Medical Center, Leiden, 2333 ZA, The Netherlands.

The amount of content on social media platforms such as Twitter is expanding rapidly. Simultaneously, the lack of patient information seriously hinders the diagnosis and treatment of rare/intractable diseases. However, these patient communities are especially active on social media. Data from social media could serve as a source of patient-centric knowledge for these diseases complementary to the information collected in clinical settings and patient registries, and may also have potential for research use. To explore this question, we attempted to extract patient-centric knowledge from social media as a task for the 3-day Biomedical Linked Annotation Hackathon 6 (BLAH6). We selected amyotrophic lateral sclerosis and multiple sclerosis as use cases of rare and intractable diseases, respectively, and we extracted patient histories related to these health conditions from Twitter. Four diagnosed patients for each disease were selected. From the user timelines of these eight patients, we extracted tweets that might be related to health conditions. Based on our experiment, we show that our approach has considerable potential, although we identified problems that should be addressed in future attempts to mine information about rare/intractable diseases from Twitter.

Source
DOI: http://dx.doi.org/10.5808/GI.2020.18.2.e17
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7362943
June 2020

Changes in feeding habits of the starspotted smooth-hound, Mustelus manazo, in Tokyo Bay between periods with different stock size levels.

Mar Pollut Bull 2020 Mar 17;152:110863. Epub 2020 Feb 17.

Center for Health and Environmental Risk Research, National Institute for Environmental Studies, Onogawa, Tsukuba, Ibaraki 305-8506, Japan.

We investigated differences in the feeding habits of the starspotted smooth-hound, Mustelus manazo, in Tokyo Bay between the mid-1990s (low stock size) and the late 2000s (high stock size). The frequency of M. manazo with empty stomachs increased from 5.9% in the mid-1990s to 16.1% in the late 2000s. A decrease in the relative weight of the stomach contents was evident from the mid-1990s to the late 2000s, especially in the small size classes, along with changes in the species composition in the stomach contents. Although crustaceans were the main constituents of the stomach contents, the proportion of crabs increased while those of shrimps and hermit crabs decreased. Changes in the feeding habits of M. manazo may be associated with shifts in the benthic community structure in Tokyo Bay.

Source
DOI: http://dx.doi.org/10.1016/j.marpolbul.2019.110863
March 2020

BioHackathon 2015: Semantics of data for life sciences and reproducible research.

F1000Res 2020 24;9:136. Epub 2020 Feb 24.

St Vincent's Clinical School, Faculty of Medicine, University of New South Wales, Darlinghurst, Australia.

We report on the activities of the 2015 edition of the BioHackathon, an annual event that brings together researchers and developers from around the world to develop tools and technologies that promote the reusability of biological data. We discuss issues surrounding the representation, publication, integration, mining and reuse of biological data and metadata across a wide range of biomedical data types of relevance for the life sciences, including chemistry, genotypes and phenotypes, orthology and phylogeny, proteomics, genomics, glycomics, and metabolomics. We describe our progress to address ongoing challenges to the reusability and reproducibility of research results, and identify outstanding issues that continue to impede the progress of bioinformatics research. We share our perspective on the state of the art, continued challenges, and goals for future research and development for the life sciences Semantic Web.

Source
DOI: http://dx.doi.org/10.12688/f1000research.18236.1
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7141167
February 2021

Split4Blank: Maintaining consistency while improving efficiency of loading RDF data with blank nodes.

PLoS One 2019 4;14(6):e0217852. Epub 2019 Jun 4.

Database Center for Life Science (DBCLS), Research Organization of Information and Systems, Kashiwa, Chiba, Japan.

In the life sciences, vast amounts of data are being generated, driven by the rapid growth of sequencing technology and the advancement of research. As the size of Resource Description Framework (RDF) datasets increases, efficient loading into triple stores becomes increasingly crucial. For example, UniProt's RDF version contains 44 billion triples as of December 2018. PubChem also has an RDF dataset with 137 billion triples. As data sizes become extremely large, loading them into a triple store consumes time. To improve the efficiency of this task, parallel loading has been recommended for several stores. However, with parallel loading, dataset consistency must be considered if the dataset contains blank nodes. By definition, blank nodes do not have global identifiers; thus, pairs of identical blank nodes in the original dataset are recognized as different if they reside in separate files after the dataset is split for parallel loading. To address this issue, we propose the Split4Blank tool, which splits a dataset into multiple files under the condition that identical blank nodes are not separated. The proposed tool uses connected component and multiprocessor scheduling algorithms and satisfies the above condition. Furthermore, to confirm the effectiveness of the proposed approach, we applied Split4Blank to two life sciences RDF datasets. In addition, we generated synthetic RDF datasets to evaluate scalability based on the properties of various graphs, such as scale-free and random graphs.
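
The splitting condition described in this abstract can be sketched as follows. This is not the Split4Blank implementation: it assumes triples are plain string tuples with blank nodes written in the N-Triples style (`_:label`), uses union-find for the connected components, and uses a greedy longest-processing-time heuristic as one possible multiprocessor scheduling strategy. The function name `split_triples` is hypothetical.

```python
import heapq
from collections import defaultdict

def split_triples(triples, n_files):
    """Split triples into n_files groups without separating any blank node."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    def blanks_of(triple):
        s, _, o = triple
        return [t for t in (s, o) if t.startswith("_:")]

    # Step 1: blank nodes that share a triple belong to one component.
    for triple in triples:
        blanks = blanks_of(triple)
        for b in blanks:
            find(b)
        if len(blanks) == 2:
            union(blanks[0], blanks[1])

    # Step 2: triples touching the same component form one indivisible job;
    # triples without blank nodes are independent jobs of size one.
    jobs = defaultdict(list)
    for i, triple in enumerate(triples):
        blanks = blanks_of(triple)
        key = ("bnode", find(blanks[0])) if blanks else ("free", i)
        jobs[key].append(triple)

    # Step 3: greedy scheduling, largest job first onto the emptiest file.
    heap = [(0, i) for i in range(n_files)]
    files = [[] for _ in range(n_files)]
    for job in sorted(jobs.values(), key=len, reverse=True):
        size, idx = heapq.heappop(heap)
        files[idx].extend(job)
        heapq.heappush(heap, (size + len(job), idx))
    return files

# Hypothetical example data.
example = [
    ("_:a", "p", "_:b"), ("_:b", "p", "x"), ("y", "p", "_:a"),
    ("u", "p", "v"), ("w", "p", "z"), ("_:c", "p", "q"),
]
parts = split_triples(example, 2)
```

Each output group can then be written to its own file and loaded in parallel, since no blank node label appears in more than one group.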

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0217852
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6548388
February 2020

Isolation and characterization of polymorphic microsatellite loci from pale-edged stingray, Telatrygon zugei (Elasmobranchii, Dasyatidae).

Integr Zool 2019 May;14(3):318-322

Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China.

Source
DOI: http://dx.doi.org/10.1111/1749-4877.12331
May 2019

YummyData: providing high-quality open life science data.

Database (Oxford) 2018 01;2018

Novartis Institutes for Biomedical Research, Basel, Switzerland.

Many life science datasets are now available via Linked Data technologies, meaning that they are represented in a common format (the Resource Description Framework), and are accessible via standard APIs (SPARQL endpoints). While this is an important step toward developing an interoperable bioinformatics data landscape, it also creates a new set of obstacles, as it is often difficult for researchers to find the datasets they need. Different providers frequently offer the same datasets, with different levels of support: as well as having more or less up-to-date data, some providers add metadata to describe the content, structures, and ontologies of the stored datasets while others do not. We currently lack a place where researchers can go to easily assess datasets from different providers in terms of metrics such as service stability or metadata richness. We also lack a space for collecting feedback and improving data providers’ awareness of user needs. To address this issue, we have developed YummyData, which consists of two components. One periodically polls a curated list of SPARQL endpoints, monitoring the states of their Linked Data implementations and content. The other presents the information measured for the endpoints and provides a forum for discussion and feedback. YummyData is designed to improve the findability and reusability of life science datasets provided as Linked Data and to foster its adoption. It is freely accessible at http://yummydata.org/.

Source
DOI: http://dx.doi.org/10.1093/database/bay022
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5846286
January 2018

The Effect of Aromatherapy Treatment on Fatigue and Relaxation for Mothers during the Early Puerperal Period in Japan: A Pilot Study.

Int J Community Based Nurs Midwifery 2017 Oct;5(4):365-375

Maternity Ward, Tokyo Medical Center, Tokyo, Japan.

Background: Early in the postpartum period, mothers are often nervous and tired from the delivery, breast-feeding and caring for a new-born. The aim of this study was to evaluate the process and outcome of using aromatherapy treatments to increase relaxation and decrease fatigue for mothers during the first to the seventh day of the postpartum period.

Methods: This non-randomized controlled study used a quasi-experimental one-group pretest-posttest design to evaluate scores in relaxation and fatigue before and after the intervention. Aromatherapy hand treatments were performed on a purposive sample of 34 postpartum mothers in Tokyo, Japan, from May to July 2016. The single treatment consisted of a hand and forearm massage with a choice of one of five essential aroma oils. Relaxation and fatigue were measured by self-administered valid and reliable questionnaires. The Wilcoxon signed-rank test was conducted to analyze the data before and after the intervention. The software program SPSS, v. 23.0 (SPSS, Tokyo), was used to analyze the data, with the significance level set at 5%.

Results: Valid responses were obtained from 29 participants. A comparison of the scores before and after aroma treatment intervention indicated that the participants' relaxation scores increased significantly (P<0.001) and fatigue scores were significantly reduced (P<0.001). The majority of participants (77.8%) were satisfied with the treatment.

Conclusion: The aroma treatments significantly improved relaxation and reduced fatigue for mothers in the early puerperal period and were well received. Therefore, a larger study using a pretest-posttest random control trial is recommended.
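
For readers unfamiliar with the Wilcoxon signed-rank test named in the Methods, a minimal sketch of the paired before/after comparison follows. It is not the SPSS procedure used in the study: it assumes a two-sided normal approximation without tie or continuity correction, and the function name and scores are hypothetical.

```python
import math

def wilcoxon_signed_rank(before, after):
    """Return (W+, W-, approximate two-sided p) for paired samples."""
    # Zero differences carry no sign information and are dropped.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    # Normal approximation to the null distribution of min(W+, W-).
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, w_minus, p

# Hypothetical relaxation scores for five participants.
before_scores = [10, 12, 9, 14, 11]
after_scores = [12, 15, 10, 18, 16]
result = wilcoxon_signed_rank(before_scores, after_scores)
```

For samples this small an exact test is usually preferable to the normal approximation; the approximation is used here only to keep the sketch short.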

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5635556
October 2017

Mediators of the effects of rice intake on health in individuals consuming a traditional Japanese diet centered on rice.

PLoS One 2017 2;12(10):e0185816. Epub 2017 Oct 2.

Department of Psychiatry, Hokkaido University Graduate School of Medicine, Sapporo, Japan.

Although the Japanese diet is believed to be balanced and healthy, its benefits have been poorly investigated, especially in terms of effects on mental health. We investigated dietary patterns and physical and mental health in the Japanese population using an epidemiological survey to determine the health benefits of the traditional Japanese diet. Questionnaires to assess dietary habits, quality of life, sleep quality, impulsivity, and depression severity were distributed to 550 randomly selected middle-aged and elderly individuals. Participants with any physical or mental disease were excluded. Two-hundred and seventy-eight participants were selected for the final statistical analysis. We determined rice to be one of the most traditional foods in Japanese cuisine. Scores for each questionnaire were computed, and the correlations between rice intake and health indices were assessed. When analyzing the direct correlations between rice intake and health indices, we found only two correlations, namely those with quality of life (vitality) and sleep quality. Path analysis using structural equation modeling was performed to investigate the association between rice intake and health, with indirect effects included in the model. Additional associations between rice intake and health were explained using this model when compared to those using direct correlation analysis. Path analysis was used to identify mediators of the rice-health association. These mediators were miso (soybean paste) soup, green tea, and natto (fermented soybean) intake. Interestingly, these mediators have been major components of the Japanese diet since 1975, which has been considered one of the healthiest diets since the 1960s. Our results indicate that the combination of rice with other healthy foods, which is representative of the traditional Japanese diet, may contribute to improvements in physical and mental health.

Source
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0185816
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5624626
October 2017

The health care and life sciences community profile for dataset descriptions.

PeerJ 2016 16;4:e2331. Epub 2016 Aug 16.

Database Center for Life Science, Kashiwa, Japan.

Access to consistent, high-quality metadata is critical to finding, understanding, and reusing scientific data. However, while there are many relevant vocabularies for the annotation of a dataset, none sufficiently captures all the necessary metadata. This prevents uniform indexing and querying of dataset repositories. Towards providing a practical guide for producing a high quality description of biomedical datasets, the W3C Semantic Web for Health Care and the Life Sciences Interest Group (HCLSIG) identified Resource Description Framework (RDF) vocabularies that could be used to specify common metadata elements and their value sets. The resulting guideline covers elements of description, identification, attribution, versioning, provenance, and content summarization. This guideline reuses existing vocabularies, and is intended to meet key functional requirements including indexing, discovery, exchange, query, and retrieval of datasets, thereby enabling the publication of FAIR data. The resulting metadata profile is generic and could be used by other domains with an interest in providing machine readable descriptions of versioned datasets.

Source
DOI: http://dx.doi.org/10.7717/peerj.2331
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4991880
September 2016

Structural basis for the recognition of two consecutive mutually interacting DPF motifs by the SGIP1 μ homology domain.

Sci Rep 2016 Jan 29;6:19565. Epub 2016 Jan 29.

Division of Structural Biology, Medical Institute of Bioregulation, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka 812-8582, Japan.

FCHo1, FCHo2, and SGIP1 are key regulators of clathrin-mediated endocytosis. Their μ homology domains (μHDs) interact with the C-terminal region of an endocytic scaffold protein, Eps15, containing fifteen Asp-Pro-Phe (DPF) motifs. Here, we show that the high-affinity μHD-binding site in Eps15 is a region encompassing six consecutive DPF motifs, while the minimal μHD-binding unit is two consecutive DPF motifs. We present the crystal structures of the SGIP1 μHD in complex with peptides containing two DPF motifs. The peptides bind to a novel ligand-binding site of the μHD, which is distinct from those of other distantly related μHD-containing proteins. The two DPF motifs, which adopt three-dimensional structures stabilized by sequence-specific intramotif and intermotif interactions, are extensively recognized by the μHD and are both required for binding. Thus, consecutive and singly scattered DPF motifs play distinct roles in μHD binding.

Source
DOI: http://dx.doi.org/10.1038/srep19565
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4731787
January 2016

Serum DJ-1 level is positively associated with improvements in some aspects of metabolic syndrome in Japanese women through lifestyle intervention.

Nutr Res 2014 Oct 16;34(10):851-5. Epub 2014 Sep 16.

Graduate School of Pharmaceutical Sciences, Hokkaido University, Sapporo 060-0812, Japan.

DJ-1 is a protein that is associated with Parkinson disease and cancer, and the reduction of DJ-1 function and expression is also thought to be a cause of diabetes and hypertension. However, little is known about the association between the plasma concentration of DJ-1 and risk of metabolic syndrome. We hypothesized that a lifestyle intervention would increase serum DJ-1 and that up-regulated DJ-1 functions will result in the prevention of metabolic syndrome. The objective of our study is to examine whether the level of serum DJ-1 is associated with the risk of metabolic syndrome. Therefore, to reveal the association between DJ-1 and metabolic syndrome, this study investigated lifestyle intervention in a control group (n = 37) and intervention group (n = 45). The results showed that body mass index, body fat ratio, waist-hip ratio, waist circumference, blood pressure, and plasma glucose level were improved in the intervention group, as compared with those in the control group. Furthermore, serum levels of DJ-1 were increased in the intervention group, when compared with those in the control group. These results suggest that serum DJ-1 is increased by lifestyle intervention and that increased serum DJ-1 prevents metabolic syndrome. Thus, the level of serum DJ-1 will become one of the indexes for the risk of metabolic syndrome.

Source
DOI: http://dx.doi.org/10.1016/j.nutres.2014.09.004
October 2014

Semantic Web technologies for the big data in life sciences.

Biosci Trends 2014 Aug;8(4):192-201

Database Center for Life Science, Research Organization of Information and Systems.

The life sciences field is entering an era of big data with the breakthroughs of science and technology. More and more big data-related projects and activities are being performed in the world. Life sciences data generated by new technologies are continuing to grow in not only size but also variety and complexity, with great speed. To ensure that big data has a major influence in the life sciences, comprehensive data analysis across multiple data sources and even across disciplines is indispensable. The increasing volume of data and the heterogeneous, complex varieties of data are two principal issues mainly discussed in life science informatics. The ever-evolving next-generation Web, characterized as the Semantic Web, is an extension of the current Web, aiming to provide information for not only humans but also computers to semantically process large-scale data. The paper presents a survey of big data in life sciences, big data related projects and Semantic Web technologies. The paper introduces the main Semantic Web technologies and their current situation, and provides a detailed analysis of how Semantic Web technologies address the heterogeneous variety of life sciences big data. The paper helps to understand the role of Semantic Web technologies in the big data era and how they provide a promising solution for the big data in life sciences.

Source
DOI: http://dx.doi.org/10.5582/bst.2014.01048
August 2014

BioBenchmark Toyama 2012: an evaluation of the performance of triple stores on biological data.

J Biomed Semantics 2014 10;5:32. Epub 2014 Jul 10.

Database Center for Life Science, Research Organization of Information and Systems, 178-4-4 Wakashiba, Kashiwa, Chiba 277-0871, Japan.

Background: Biological databases vary enormously in size and data complexity, from small databases that contain a few million Resource Description Framework (RDF) triples to large databases that contain billions of triples. In this paper, we evaluate whether RDF native stores can be used to meet the needs of a biological database provider. Prior evaluations have used synthetic data with a limited database size. For example, the largest BSBM benchmark uses 1 billion synthetic e-commerce knowledge RDF triples on a single node. However, real-world biological data differs greatly from simple synthetic data. It is difficult to determine whether the synthetic e-commerce data is adequate to represent biological databases. Therefore, for this evaluation, we used five real data sets from biological databases.

Results: We evaluated five triple stores, 4store, Bigdata, Mulgara, Virtuoso, and OWLIM-SE, with five biological data sets, Cell Cycle Ontology, Allie, PDBj, UniProt, and DDBJ, ranging in size from approximately 10 million to 8 billion triples. For each database, we loaded all the data into our single node and prepared the database for use in a classical data warehouse scenario. Then, we ran a series of SPARQL queries against each endpoint and recorded the execution time and the accuracy of the query response.

Conclusions: Our paper shows that, with appropriate configuration, Virtuoso and OWLIM-SE can satisfy the basic requirements to load and query biological data of up to approximately 8 billion triples on a single node, with simultaneous access by 64 clients. OWLIM-SE performs best for databases with approximately 11 million triples. For data sets that contain 94 million and 590 million triples, OWLIM-SE and Virtuoso perform best, and neither shows an overwhelming advantage over the other. For data over 4 billion triples, Virtuoso works best. 4store performs well on small data sets with limited features when the number of triples is less than 100 million, but our test shows that its scalability is poor. Bigdata demonstrates average performance and is a good open source triple store for middle-sized (500 million or so) data sets. Mulgara shows some fragility.

Source
DOI: http://dx.doi.org/10.1186/2041-1480-5-32
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4118313
August 2014

Mitochondrial genome of Japanese angel shark Squatina japonica (Chondrichthyes: Squatinidae).

Mitochondrial DNA A DNA Mapp Seq Anal 2016 27;27(2):832-3. Epub 2014 May 27.

Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, P.R. China.

Squatina japonica, which belongs to the monogeneric family Squatinidae, is endemic to the Northwest Pacific. The complete mitochondrial genome sequence of S. japonica is 16,689 bp long and comprises 13 protein-coding genes, 22 tRNA genes, 2 rRNA genes, and 1 control region. The base composition of the genome is 31.10% A, 31.04% T, 24.42% C, and 13.43% G. The geographic clade and phylogenetic relationship of S. japonica are ambiguous. Therefore, studying the complete mitochondrial genome of S. japonica is highly important to understand the aforementioned aspect and to analyze the conservation genetics in the genus Squatina.
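
The base composition reported in this abstract is the percentage of each nucleotide in the sequence. A minimal sketch of that calculation follows, assuming the sequence is available as a plain string; the function name and the short example sequence are hypothetical, and ambiguous bases are simply ignored.

```python
from collections import Counter

def base_composition(sequence):
    """Return the percentage of A, T, C and G among unambiguous bases."""
    counts = Counter(base for base in sequence.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: 100.0 * counts[base] / total for base in "ATCG"}

# Hypothetical 10-base example, not real mitochondrial sequence.
composition = base_composition("AAAATTTCCG")
```

Applied to a full mitochondrial genome read from a FASTA file, the same function would yield figures comparable to those quoted in the abstract.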

Source
DOI: http://dx.doi.org/10.3109/19401736.2014.919463
October 2016

TogoTable: cross-database annotation system using the Resource Description Framework (RDF) data model.

Nucleic Acids Res 2014 Jul 14;42(Web Server issue):W442-8. Epub 2014 May 14.

Database Center for Life Science, Research Organization of Information and Systems, 178-4-4 Wakashiba, Kashiwa, Chiba 277-0871, Japan.

TogoTable (http://togotable.dbcls.jp/) is a web tool that adds user-specified annotations to a table that a user uploads. Annotations are drawn from several biological databases that use the Resource Description Framework (RDF) data model. TogoTable uses database identifiers (IDs) in the table as a query key for searching. RDF data, which form a network called Linked Open Data (LOD), can be searched from SPARQL endpoints using a SPARQL query language. Because TogoTable uses RDF, it can integrate annotations from not only the reference database to which the IDs originally belong, but also externally linked databases via the LOD network. For example, annotations in the Protein Data Bank can be retrieved using GeneID through links provided by the UniProt RDF. Because RDF has been standardized by the World Wide Web Consortium, any database with annotations based on the RDF data model can be easily incorporated into this tool. We believe that TogoTable is a valuable Web tool, particularly for experimental biologists who need to process huge amounts of data such as high-throughput experimental output.

Source
DOI: http://dx.doi.org/10.1093/nar/gku403
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4086138
July 2014

BioHackathon series in 2011 and 2012: penetration of ontology and linked data in life science domains.

J Biomed Semantics 2014 Feb 5;5(1). Epub 2014 Feb 5.

Database Center for Life Science, Research Organization of Information and Systems, 2-11-16, Yayoi, Bunkyo-ku, Tokyo 113-0032, Japan.

The application of semantic technologies to the integration of biological data and the interoperability of bioinformatics analysis and visualization tools has been the common theme of a series of annual BioHackathons hosted in Japan for the past five years. Here we provide a review of the activities and outcomes from the BioHackathons held in 2011 in Kyoto and 2012 in Toyama. In order to efficiently implement semantic technologies in the life sciences, participants formed various sub-groups and worked on the following topics: Resource Description Framework (RDF) models for specific domains, text mining of the literature, ontology development, essential metadata for biological databases, platforms to enable efficient Semantic Web technology development and interoperability, and the development of applications for Semantic Web data. In this review, we briefly introduce the themes covered by these sub-groups. The observations made, conclusions drawn, and software development projects that emerged from these activities are discussed.

Source
DOI: http://dx.doi.org/10.1186/2041-1480-5-5
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3978116
February 2014

Mitochondrial genome of longheaded eagle ray Aetobatus flagellum (Chondrichthyes: Myliobatidae).

Mitochondrial DNA 2015 7;26(5):763-4. Epub 2014 Jan 7.

School of Life Science, Anhui University, Hefei, China.

The complete mitochondrial genome sequence of Aetobatus flagellum is 20,201 bp long and consists of 13 protein-coding genes, 22 tRNA genes, 2 rRNA genes and 1 control region (CR). The base composition of the genome is 30.9% A, 28.2% T, 27.1% C and 13.8% G. By comparison with elasmobranch mtDNA sequences deposited in NCBI, our study not only identified the longest mitochondrial genome, with a 4490 bp CR, in A. flagellum, but also strongly indicated that records in the northwest Pacific may belong to a separate species from those distributed in Indonesia.

Source
DOI: http://dx.doi.org/10.3109/19401736.2013.855740
June 2016

A new species of eagle ray Aetobatus narutobiei from the Northwest Pacific: an example of the critical role taxonomy plays in fisheries and ecological sciences.

PLoS One 2013 31;8(12):e83785. Epub 2013 Dec 31.

Faculty of Fisheries, Nagasaki University, Nagasaki, Japan.

Recent taxonomic and molecular work on the eagle rays (Family Myliobatidae) revealed a cryptic species in the northwest Pacific. This species is formally described as Aetobatus narutobiei sp. nov. and compared to its congeners. Aetobatus narutobiei is found in eastern Vietnam, Hong Kong, China, Korea and southern Japan. It was previously considered to be conspecific with Aetobatus flagellum, but these species differ in size, structure of the NADH2 and CO1 genes, some morphological and meristic characters and colouration. Aetobatus narutobiei is particularly abundant in Ariake Bay in southern Japan, where it is considered a pest species that preys heavily on farmed bivalve stocks and is culled annually as part of a 'predator control' program. The discovery of A. narutobiei highlights the paucity of detailed taxonomic research on this group of rays. This discovery affects current conservation assessments of A. flagellum, which need to be revised based on the findings of this study.

Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0083785 PLOS
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3877081 PMC
September 2014

Building Linked Open Data towards integration of biomedical scientific literature with DBpedia.

J Biomed Semantics 2013 Mar 13;4(1). Epub 2013 Mar 13.

Database Center for Life Science, Faculty of Engineering Bldg, 12, The University of Tokyo, 2-11-16, Yayoi, Bunkyo-ku, Tokyo, Japan.

Background: There is a growing need for efficient and integrated access to databases provided by diverse institutions. Using a linked data design pattern allows the diverse data on the Internet to be linked effectively and accessed efficiently by computers. Previously, we developed the Allie database, which stores pairs of abbreviations and long forms (LFs, or expanded forms) used in the life sciences. LFs define the semantics of abbreviations, and Allie provides a Web-based search service for researchers to look up the LF of an unfamiliar abbreviation. This service encounters two problems. First, it does not display each LF's definition, which could help the user to disambiguate and learn the abbreviations more easily. Furthermore, there are too many LFs for us to prepare a full dictionary from scratch. On the other hand, DBpedia has made the contents of Wikipedia available in the Resource Description Framework (RDF), which is expected to contain a significant number of entries corresponding to LFs. Therefore, linking the Allie LFs to DBpedia entries may present a solution to Allie's problems. This requires a method that is capable of matching large numbers of string pairs within a reasonable period of time because Allie and DBpedia are frequently updated.

Results: We built a Linked Open Data set that links LFs to DBpedia titles by applying key collision methods (i.e., fingerprint and n-gram fingerprint), which are simple approximate string-matching methods, to their literals. In addition, we used UMLS resources to normalise the life science terms. As a result, combining the key collision methods with the domain-specific resources performed best, and 44,027 LFs have links to DBpedia titles. We manually evaluated the accuracy of the string matching by randomly sampling 1200 LFs, and our approach achieved an F-measure of 0.98. In addition, our experiments revealed the following. (1) Performance was similar regardless of the frequency of the LFs in MEDLINE. (2) There is a relationship (r² = 0.96, P < 0.01) between the occurrence frequencies of LFs in MEDLINE and their presence probabilities in DBpedia titles.

Conclusions: The obtained results help Allie users locate the correct LFs. Because the methods are computationally simple and yield a high performance and because the most frequently used LFs in MEDLINE appear more often in DBpedia titles, we can continually and reasonably update the linked dataset to reflect the latest publications and additions to DBpedia. Joining LFs between scientific literature and DBpedia enables cross-resource exploration for mutual benefits.
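
As a rough illustration of the key collision idea named in the abstract above, the following Python sketch builds a "fingerprint" key and an "n-gram fingerprint" key for each string and links strings whose keys collide. The normalization steps (lowercasing, stripping diacritics, treating punctuation as a separator) and the helper names `fingerprint`, `ngram_fingerprint` and `key_collisions` are assumptions made for this example; the abstract does not specify the exact procedure, and the UMLS-based normalization step is not reproduced here.

```python
import re
import unicodedata
from collections import defaultdict


def _to_ascii(text):
    # Decompose accented characters and drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fingerprint(text):
    # Assumed recipe: lowercase, turn punctuation into spaces,
    # then sort the unique whitespace-separated tokens.
    cleaned = re.sub(r"[^\w\s]", " ", _to_ascii(text).lower())
    return " ".join(sorted(set(cleaned.split())))


def ngram_fingerprint(text, n=2):
    # Assumed recipe: lowercase, drop everything except letters and digits,
    # then sort the unique character n-grams and concatenate them.
    cleaned = re.sub(r"[^\w]|_", "", _to_ascii(text).lower())
    grams = {cleaned[i:i + n] for i in range(len(cleaned) - n + 1)}
    return "".join(sorted(grams))


def key_collisions(long_forms, titles, key_func):
    # Index the titles by key, then look up each long form's key.
    index = defaultdict(list)
    for title in titles:
        index[key_func(title)].append(title)
    links = {}
    for lf in long_forms:
        key = key_func(lf)
        if key in index:
            links[lf] = index[key]
    return links


if __name__ == "__main__":
    # Hypothetical inputs, not taken from Allie or DBpedia.
    lfs = ["tumor-necrosis factor", "magnetic resonance imaging"]
    titles = ["Tumor necrosis factor", "Magnetic Resonance Imaging", "Interleukin 6"]
    print(key_collisions(lfs, titles, fingerprint))
```

Because each string is reduced to a key once and matches are found through a dictionary lookup, the cost grows roughly linearly with the number of strings, which fits the abstract's requirement of matching large numbers of string pairs in a reasonable time.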

Source
http://dx.doi.org/10.1186/2041-1480-4-8 DOI Listing
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3621846 PMC
March 2013

The 3rd DBCLS BioHackathon: improving life science data integration with Semantic Web technologies.

J Biomed Semantics 2013 Feb 11;4(1). Epub 2013 Feb 11.

Database Center for Life Science, Research Organization of Information and Systems, 2-11-16, Yayoi, Bunkyo-ku, Tokyo, 113-0032, Japan.

Background: BioHackathon 2010 was the third in a series of meetings hosted by the Database Center for Life Sciences (DBCLS) in Tokyo, Japan. The overall goal of the BioHackathon series is to improve the quality and accessibility of life science research data on the Web by bringing together representatives from public databases, analytical tool providers, and cyber-infrastructure researchers to jointly tackle important challenges in the area of in silico biological research.

Results: The theme of BioHackathon 2010 was the 'Semantic Web', and all attendees gathered with the shared goal of producing Semantic Web data from their respective resources, and/or consuming or interacting with those data using their tools and interfaces. We discussed topics including guidelines for designing semantic data and interoperability of resources. We consequently developed tools and clients for analysis and visualization.

Conclusion: We provide a meeting report from BioHackathon 2010, in which we describe the discussions, decisions, and breakthroughs made as we moved towards compliance with Semantic Web technologies - from source provider, through middleware, to the end-consumer.

Source
http://dx.doi.org/10.1186/2041-1480-4-6 DOI Listing
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3598643 PMC
February 2013

Mitochondrial genome of Dasyatis bennettii (Chondrichthyes: Dasyatidae).

Mitochondrial DNA 2013 Aug 24;24(4):344-6. Epub 2013 Jan 24.

School of Life Science, Anhui University, Hefei 230039, PR China.

Dasyatis bennettii is a bottom-dweller that inhabits the coastal waters of the Indian and Pacific Oceans as well as the freshwaters of Southern China. In this study, we determined the complete mitochondrial genome of this stingray species. The results showed that the mitogenome is a circular DNA with a total length of 17,668 bp and contains 13 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes, and 1 control region. The base composition of the complete mitochondrial DNA was 31.1% A, 28.7% T, 26.7% C, and 13.5% G. All the genes in D. bennettii were distributed on the H-strand, except for the ND6 subunit gene and eight tRNA genes, which were encoded on the L-strand.

Source
http://dx.doi.org/10.3109/19401736.2012.760552 DOI Listing
August 2013

Discriminative application of string similarity methods to chemical and non-chemical names for biomedical abbreviation clustering.

BMC Genomics 2012 Jun 11;13 Suppl 3:S8. Epub 2012 Jun 11.

Database Center for Life Science, Bunkyo-ku, Tokyo, Japan.

Background: Term clustering, by measuring the string similarities between terms, is known within the natural language processing community to be an effective method for improving the quality of texts and dictionaries. However, we have observed that chemical names are difficult to cluster using string similarity measures. In order to clearly demonstrate this difficulty, we compared the string similarities determined using the edit distance, the Monge-Elkan score, SoftTFIDF, and the bigram Dice coefficient for chemical names with those for non-chemical names.

Results: Our experimental results revealed the following: (1) The edit distance had the best performance in the matching of full forms, whereas Cohen et al. reported that SoftTFIDF with the Jaro-Winkler distance would yield the best measure for matching pairs of terms for their experiments. (2) For each of the string similarity measures above, the best threshold for term matching differs for chemical names and for non-chemical names; the difference is especially large for the edit distance. (3) Although the matching results obtained for chemical names using the edit distance, Monge-Elkan scores, or the bigram Dice coefficients are better than the result obtained for non-chemical names, the results were contrary when using SoftTFIDF. (4) A suitable weight for chemical names varies substantially from one for non-chemical names. In particular, a weight vector that has been optimized for non-chemical names is not suitable for chemical names. (5) The matching results using the edit distances improve further by dividing a set of full forms into two subsets, according to whether a full form is a chemical name or not. These results show that our hypothesis is acceptable, and that we can significantly improve the performance of abbreviation-full form clustering by computing chemical names and non-chemical names separately.

Conclusions: In conclusion, the discriminative application of string similarity methods to chemical and non-chemical names may be a simple yet effective way to improve the performance of term clustering.
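
To make two of the measures named in the abstract above concrete, here is a minimal Python sketch of the edit (Levenshtein) distance and the bigram Dice coefficient. The length-normalized similarity, the use of bigram sets rather than multisets, and the function names are assumptions made for this illustration; the abstract does not state the exact formulations used, and Monge-Elkan and SoftTFIDF are not covered.

```python
def edit_distance(a, b):
    # Classic dynamic-programming Levenshtein distance, one row at a time.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def edit_similarity(a, b):
    # Assumed normalization: 1 minus distance divided by the longer length.
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def bigrams(s):
    # Set of adjacent character pairs in the string.
    return {s[i:i + 2] for i in range(len(s) - 1)}


def dice_coefficient(a, b):
    # Dice coefficient over bigram sets: 2|X ∩ Y| / (|X| + |Y|).
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 1.0 if a == b else 0.0
    return 2 * len(x & y) / (len(x) + len(y))


if __name__ == "__main__":
    # Illustrative pair chosen for this sketch: two distinct compounds
    # whose names differ by a single character.
    print(edit_similarity("1-propanol", "2-propanol"))
    print(dice_coefficient("1-propanol", "2-propanol"))
```

The pair in the usage example hints at why a single matching threshold may not suit both classes of names: distinct chemical names can differ by one character, so a threshold tuned on non-chemical terms could merge them.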

Source
http://dx.doi.org/10.1186/1471-2164-13-S3-S8 DOI Listing
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3394426 PMC
June 2012

Study on the necessary survey days for energy intake in school children assessed by 7 day survey.

J Med Invest 2012 ;59(1-2):111-5

International Nutrition, Ochanomizu University Graduate School of Humanities and Sciences, Tokyo, Japan.

Theoretically, the longer the period of a nutrition survey, the more reliable the results. However, a long survey can impose a burden on subjects and cause the results to become inaccurate. For adults, a survey of 3 non-consecutive days is usually recommended; however, for school children, at least in Japan, it has not been determined whether this is necessary. In this study we conducted a 7-day survey and tried to find the minimum number of days necessary to determine the energy intake. The subjects were about 300 children aged 6 to 7, 10 to 11 and 13 to 14 years in a city in the western part of Japan. The weighing method was used for the school lunch, and other meals were surveyed by the 24-hour recall method. For the 6-7 year-old school children, guardians were asked to keep dietary records. The final number of subjects who were able to complete the 7-day survey was 139. Energy intakes did not differ significantly among weekdays (p>0.05) or among weekend days (p>0.05). Average energy intakes on weekdays were higher than those on weekend days in 10-11 and 13-14 year-old children. The average intakes of energy in 10-11 and 13-14 year-old children were lower than Japanese estimated energy requirements (EER). However, body weight of more than 90% of subjects was within the normal range. The results suggest that a survey of one weekday is reliable for all weekdays and that a survey of one weekend day is reliable for any weekend day, and they also indicate the necessity of further studies of EER in rapidly growing children.

Source
http://dx.doi.org/10.2152/jmi.59.111 DOI Listing
July 2012

Brain atrophy caused by vitamin B12-deficient anemia in an infant.

J Pediatr Hematol Oncol 2011 Oct;33(7):556-8

Department of Pediatrics and Neonatology, Nagoya City University, Graduate School of Medical Sciences, Nagoya, Japan.

Vitamin B12 deficiency in infants often presents with nonspecific hematological, gastrointestinal, and neurological manifestations. It is usually caused by inadequate intake, abnormal absorption, or congenital disorders of vitamin B12 metabolism, including transport disorders. We describe a vitamin B12-deficient infant with severe anemia who was breastfed. His mother had undiagnosed vitamin B12 deficiency having undergone total gastrectomy 18 years earlier. The infant developed normally after taking vitamin B12. It is important to suspect vitamin B12 deficiency in mothers who have undergone gastrectomy. Early diagnosis and treatment of vitamin B12 deficiency in infants is important and will help improve long-term prognosis.

Source
http://dx.doi.org/10.1097/MPH.0b013e31821e5290 DOI Listing
October 2011

The 2nd DBCLS BioHackathon: interoperable bioinformatics Web services for integrated applications.

J Biomed Semantics 2011 Aug 2;2. Epub 2011 Aug 2.

Database Center for Life Science, Research Organization of Information and Systems, 2-11-16 Yayoi, Bunkyo-ku, Tokyo, 113-0032, Japan.

Background: The interaction between biological researchers and the bioinformatics tools they use is still hampered by incomplete interoperability between such tools. To ensure interoperability initiatives are effectively deployed, end-user applications need to be aware of, and support, best practices and standards. Here, we report on an initiative in which software developers and genome biologists came together to explore and raise awareness of these issues: BioHackathon 2009.

Results: Developers in attendance came from diverse backgrounds, with experts in Web services, workflow tools, text mining and visualization. Genome biologists provided expertise and exemplar data from the domains of sequence and pathway analysis and glyco-informatics. One goal of the meeting was to evaluate the ability to address real world use cases in these domains using the tools that the developers represented. This resulted in i) a workflow to annotate 100,000 sequences from an invertebrate species; ii) an integrated system for analysis of the transcription factor binding sites (TFBSs) enriched based on differential gene expression data obtained from a microarray experiment; iii) a workflow to enumerate putative physical protein interactions among enzymes in a metabolic pathway using protein structure data; iv) a workflow to analyze glyco-gene-related diseases by searching for human homologs of glyco-genes in other species, such as fruit flies, and retrieving their phenotype-annotated SNPs.

Conclusions: Beyond deriving prototype solutions for each use-case, a second major purpose of the BioHackathon was to highlight areas of insufficiency. We discuss the issues raised by our exploration of the problem/solution space, concluding that there are still problems with the way Web services are modeled and annotated, including: i) the absence of several useful data or analysis functions in the Web service "space"; ii) the lack of documentation of methods; iii) lack of compliance with the SOAP/WSDL specification among and between various programming-language libraries; and iv) incompatibility between various bioinformatics data formats. Although it was still difficult to solve real world problems posed to the developers by the biological researchers in attendance because of these problems, we note the promise of addressing these issues within a semantic framework.

Source
http://dx.doi.org/10.1186/2041-1480-2-4 DOI Listing
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3170566 PMC
August 2011

Allie: a database and a search service of abbreviations and long forms.

Database (Oxford) 2011 15;2011:bar013. Epub 2011 Apr 15.

Database Center for Life Science, Bunkyo-ku, Tokyo, Japan.

Many abbreviations are used in the literature especially in the life sciences, and polysemous abbreviations appear frequently, making it difficult to read and understand scientific papers that are outside of a reader's expertise. Thus, we have developed Allie, a database and a search service of abbreviations and their long forms (a.k.a. full forms or definitions). Allie searches for abbreviations and their corresponding long forms in a database that we have generated based on all titles and abstracts in MEDLINE. When a user query matches an abbreviation, Allie returns all potential long forms of the query along with their bibliographic data (i.e. title and publication year). In addition, for each candidate, co-occurring abbreviations and a research field in which it frequently appears in the MEDLINE data are displayed. This function helps users learn about the context in which an abbreviation appears. To deal with synonymous long forms, we use a dictionary called GENA that contains domain-specific terms such as gene, protein or disease names along with their synonymic information. Conceptually identical domain-specific terms are regarded as one term, and then conceptually identical abbreviation-long form pairs are grouped taking into account their appearance in MEDLINE. To keep up with new abbreviations that are continuously introduced, Allie has an automatic update system. In addition, the database of abbreviations and their long forms with their corresponding PubMed IDs is constructed and updated weekly. Database URL: The Allie service is available at http://allie.dbcls.jp/.

Source
http://dx.doi.org/10.1093/database/bar013 DOI Listing
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3077826 PMC
September 2011

The DBCLS BioHackathon: standardization and interoperability for bioinformatics web services and workflows. The DBCLS BioHackathon Consortium*.

J Biomed Semantics 2010 Aug 21;1(1). Epub 2010 Aug 21.

Database Center for Life Science, Research Organization of Information and Systems, 2-11-16 Yayoi, Bunkyo-ku, Tokyo, 113-0032, Japan.

Web services have become a key technology for bioinformatics, since life science databases are globally decentralized and the exponential increase in the amount of available data demands efficient systems that do not need to transfer entire databases for every step of an analysis. However, various incompatibilities among database resources and analysis services make it difficult to connect and integrate these into interoperable workflows. To resolve this situation, we invited domain specialists from web service providers, client software developers, Open Bio* projects, the BioMoby project and researchers of emerging areas where a standard exchange data format is not well established, for an intensive collaboration entitled the BioHackathon 2008. The meeting was hosted by the Database Center for Life Science (DBCLS) and Computational Biology Research Center (CBRC) and was held in Tokyo from February 11th to 15th, 2008. In this report we highlight the work accomplished and the common issues that arose from this event, including the standardization of data exchange formats and services in the emerging fields of glycoinformatics, biological interaction networks, text mining, and phyloinformatics. In addition, common shared object development based on BioSQL, as well as technical challenges in large data management, asynchronous services, and security are discussed. Consequently, we improved interoperability of web services in several fields; however, further cooperation among major database centers and continued collaborative efforts between service providers and software developers are still necessary for an effective advance in bioinformatics web service technologies.

Source
http://dx.doi.org/10.1186/2041-1480-1-8 DOI Listing
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2939597 PMC
August 2010
-->