Publications by authors named "Arto Klami"

7 Publications

Towards brain-activity-controlled information retrieval: Decoding image relevance from MEG signals.

Neuroimage 2015 May 13;112:288-298. Epub 2015 Jan 13.

Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland; Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Espoo, Finland.

We hypothesize that brain activity can be used to control future information retrieval systems. To this end, we conducted a feasibility study on predicting the relevance of visual objects from brain activity. We analyze both magnetoencephalographic (MEG) and gaze signals from nine subjects who were viewing image collages, a subset of which was relevant to a predetermined task. We report three findings: i) the relevance of an image a subject looks at can be decoded from MEG signals with performance significantly better than chance, ii) fusion of gaze-based and MEG-based classifiers significantly improves the prediction performance compared to using either signal alone, and iii) non-linear classification of the MEG signals using Gaussian process classifiers outperforms linear classification. These findings break new ground for building brain-activity-based interactive image retrieval systems, as well as for systems utilizing feedback both from brain activity and eye movements.
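The abstract does not say how the gaze-based and MEG-based classifiers were fused. As a minimal sketch of one common option (late fusion by a weighted average of log-odds), assuming each classifier outputs a relevance probability, the idea could look like this. The function names and the weight `w_meg` are hypothetical and are not taken from the paper.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def sigmoid(z: float) -> float:
    """Inverse of logit: map log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fuse(p_meg: float, p_gaze: float, w_meg: float = 0.5) -> float:
    """Late fusion (an assumption, not the paper's stated method):
    weighted average of the two classifiers' log-odds."""
    z = w_meg * logit(p_meg) + (1.0 - w_meg) * logit(p_gaze)
    return sigmoid(z)


# Hypothetical per-image relevance probabilities from the two classifiers.
fused = fuse(p_meg=0.6, p_gaze=0.8)
```

With equal weights the fused probability lies between the two inputs, and setting `w_meg` to 1.0 or 0.0 falls back to a single modality.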

Source
DOI: http://dx.doi.org/10.1016/j.neuroimage.2014.12.079
May 2015

Group Factor Analysis.

IEEE Trans Neural Netw Learn Syst 2015 Sep 18;26(9):2136-47. Epub 2014 Dec 18.

Factor analysis (FA) provides linear factors that describe the relationships between individual variables of a data set. We extend this classical formulation into linear factors that describe the relationships between groups of variables, where each group represents either a set of related variables or a data set. The model also naturally extends canonical correlation analysis to more than two sets, in a way that is more flexible than previous extensions. Our solution is formulated as variational inference of a latent variable model with structural sparsity, and it consists of two hierarchical levels: 1) the higher level models the relationships between the groups and 2) the lower level models the observed variables given the higher level. We show that the resulting solution solves the group factor analysis (GFA) problem accurately, outperforming alternative FA-based solutions as well as more straightforward implementations of GFA. The method is demonstrated on two life science data sets, one on brain activation and the other on systems biology, illustrating its applicability to the analysis of different types of high-dimensional data sources.

Source
DOI: http://dx.doi.org/10.1109/TNNLS.2014.2376974
September 2015

Identifying fragments of natural speech from the listener's MEG signals.

Hum Brain Mapp 2013 Jun 17;34(6):1477-89. Epub 2012 Feb 17.

Brain Research Unit, MEG Core, and Advanced Magnetic Imaging Centre, Low Temperature Laboratory, Aalto University, Finland.

It is a challenge for current signal analysis approaches to identify the electrophysiological brain signatures of continuous natural speech that the subject is listening to. To relate magnetoencephalographic (MEG) brain responses to the physical properties of such speech stimuli, we applied canonical correlation analysis (CCA) and a Bayesian mixture of CCA analyzers to extract MEG features related to the speech envelope. Seven healthy adults listened to news for an hour while their brain signals were recorded with whole-scalp MEG. We found shared signal time series (canonical variates) between the MEG signals and speech envelopes at 0.5-12 Hz. By splitting the test signals into equal-length fragments from 2 to 65 s (corresponding to 703 down to 21 pieces per the total speech stimulus) we obtained better than chance-level identification for speech fragments longer than 2-3 s, not used in the model training. The applied analysis approach thus allowed identification of segments of natural speech by means of partial reconstruction of the continuous speech envelope (i.e., the intensity variations of the speech sounds) from MEG responses, provided means to empirically assess the time scales obtainable in speech decoding with the canonical variates, and it demonstrated accurate identification of the heard speech fragments from the MEG data.
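The fragment identification step described above can be illustrated with a small sketch: given an envelope fragment reconstructed from MEG, pick the candidate speech fragment whose envelope correlates with it most strongly. This assumes Pearson correlation as the matching score, which the abstract does not confirm, and all function names and numbers are hypothetical.

```python
import math
from typing import List, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def split(signal: Sequence[float], length: int) -> List[Sequence[float]]:
    """Cut a signal into consecutive equal-length fragments, dropping any remainder."""
    return [signal[i:i + length] for i in range(0, len(signal) - length + 1, length)]


def identify(reconstructed: Sequence[float], candidates: List[Sequence[float]]) -> int:
    """Index of the candidate envelope most correlated with the reconstruction."""
    scores = [pearson(reconstructed, c) for c in candidates]
    return max(range(len(scores)), key=scores.__getitem__)


def identification_accuracy(recon: List[Sequence[float]],
                            true: List[Sequence[float]]) -> float:
    """Fraction of reconstructed fragments matched to their own true fragment."""
    hits = sum(identify(r, true) == i for i, r in enumerate(recon))
    return hits / len(recon)


# Toy envelopes (hypothetical numbers): three true fragments and noisy reconstructions.
true_frags = [[1, 2, 3, 4], [4, 3, 2, 1], [1, 3, 1, 3]]
recon_frags = [[1.1, 2.0, 2.9, 4.2], [3.9, 3.1, 2.2, 0.8], [0.9, 3.1, 1.2, 2.8]]
accuracy = identification_accuracy(recon_frags, true_frags)
```

Chance level in this setup is one over the number of candidate fragments, which is why longer fragments (fewer candidates, more samples per correlation) are easier to identify.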

Source
DOI: http://dx.doi.org/10.1002/hbm.22004
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6869971
June 2013

High Expression of Complement Component 5 (C5) at Tumor Site Associates with Superior Survival in Ewing's Sarcoma Family of Tumour Patients.

ISRN Oncol 2011 2;2011:168712. Epub 2011 Oct 2.

Department of Pathology, Haartman Institute and HUSLAB, University of Helsinki and Helsinki University Central Hospital, 00014, Helsinki, Finland.

Background. Unlike in most adult-onset cancers, an association between typical paediatric neoplasms and inflammatory triggers is rare. We studied whether immune system-related genes are activated and have prognostic significance in Ewing's sarcoma family of tumors (ESFTs). Method. Data analysis was performed on gene expression profiles of 44 ESFT patients, 11 ESFT cell lines, and 18 normal skeletal muscle samples. Differential expression of 238 inflammation and 299 macrophage-related genes was analysed by t-test, and survival analysis was performed according to gene expression. Results. Inflammatory genes are activated in ESFT patient samples, as 38 of 238 (16%) inflammatory genes were upregulated (P < 0.001) when compared to cell lines. This inflammatory gene activation was characterized by significant enrichment of macrophage-related gene expression with 58 of 299 (19%) of genes upregulated (P < 0.001). High expression of complement component 5 (C5) correlated with better event-free (P = 0.01) and overall survival (P = 0.004) in a dose-dependent manner. C5 and its receptor C5aR1 expression was verified at protein level by immunohistochemistry on an independent ESFT tumour tissue microarray. Conclusion. Immune system-related gene activation is observed in ESFT patient samples, and prognostically significant inflammatory genes (C5, JAK1, and IL8) for ESFT were identified.
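The abstract says differential expression was analysed by t-test but does not name the variant. As a hedged illustration, the sketch below computes Welch's unequal-variance t statistic and its approximate degrees of freedom for one gene across two sample groups. The choice of Welch's test, the function name, and the expression values are assumptions for illustration only.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Each group needs at least two values with non-zero variance overall.
    """
    na, nb = len(a), len(b)
    # Squared standard errors of the two group means.
    se2_a = variance(a) / na
    se2_b = variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Hypothetical expression values for one gene in tumour samples vs cell lines.
tumour = [2.0, 4.0, 6.0, 8.0]
cell_line = [1.0, 2.0, 3.0, 4.0]
t_stat, dof = welch_t(tumour, cell_line)
```

Turning the statistic into a P value requires the Student t distribution, which is not in the Python standard library, so that step is left out here.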

Source
DOI: http://dx.doi.org/10.5402/2011/168712
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3196920
November 2011

Combined use of expression and CGH arrays pinpoints novel candidate genes in Ewing sarcoma family of tumors.

BMC Cancer 2009 Jan 14;9:17. Epub 2009 Jan 14.

Department of Pathology, Haartman Institute and HUSLAB, University of Helsinki, Helsinki, Finland.

Background: Ewing sarcoma family of tumors (ESFT), characterized by t(11;22)(q24;q12), is one of the most common tumors of bone in children and young adults. In addition to EWS/FLI1 gene fusion, copy number changes are known to be significant for the underlying neoplastic development of ESFT and for patient outcome. Our genome-wide high-resolution analysis aspired to pinpoint genomic regions of highest interest and possible target genes in these areas.

Methods: Array comparative genomic hybridization (CGH) and expression arrays were used to screen for copy number alterations and expression changes in ESFT patient samples. A total of 31 ESFT samples were analyzed by aCGH, and in 16 patients DNA- and RNA-level data were integrated, the latter created by expression arrays. Follow-up time for these patients was 5-192 months. Clinical outcome was statistically evaluated by Kaplan-Meier/log-rank methods, and RT-PCR was applied to 42 patient samples to study the gene of highest interest.

Results: Copy number changes were detected in 87% of the cases. The most recurrent copy number changes were gains at 1q, 2, 8, and 12, and losses at 9p and 16q. Cumulative event-free survival (EFS) and overall survival (OS) were significantly better (P < 0.05) for primary tumors with three or fewer copy number changes than for tumors with a higher number of copy number aberrations. In three samples copy number imbalances were detected in chromosomes 11 and 22 affecting the FLI1 and EWSR1 loci, suggesting that an unbalanced t(11;22) and subsequent duplication of the derivative chromosome harboring the fusion gene is a common event in ESFT. Further, amplifications on chromosomes 20 and 22 seen in one patient sample suggest a novel translocation type between EWSR1 and an unidentified fusion partner at 20q. In total, 20 novel ESFT-associated putative oncogenes and tumor suppressor genes were found in the integration analysis of array CGH and expression data. Quantitative RT-PCR to study the expression levels of the most interesting gene, HDGF, confirmed that its expression was higher than in control samples. However, no association between HDGF expression and patient survival was observed.

Conclusion: We conclude that array CGH and integration analysis proved to be effective methods to identify chromosome regions and novel target genes involved in the tumorigenesis of ESFT.
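The Kaplan-Meier estimate mentioned in the Methods can be sketched as a product over distinct event times of one minus the fraction of at-risk patients who had an event. The code below is a minimal illustration of that standard estimator, not the authors' analysis pipeline. The function name and the follow-up times are hypothetical.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is True if the event was observed at times[i], False if the
    patient was censored (still event-free at last follow-up).
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after time t.
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up times in months; False marks a censored patient.
months = [5, 8, 12, 12, 20, 30]
observed = [True, False, True, True, False, True]
curve = kaplan_meier(months, observed)
```

Comparing two such curves, for example tumors with few versus many copy number changes, is what the log-rank test does. That test is omitted from this sketch.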

Source
DOI: http://dx.doi.org/10.1186/1471-2407-9-17
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2633345
January 2009

Simple integrative preprocessing preserves what is shared in data sources.

BMC Bioinformatics 2008 Feb 21;9:111. Epub 2008 Feb 21.

Department of Computer Science, P.O. Box 68, FI-00014, University of Helsinki, Finland.

Background: The bioinformatics data analysis toolbox needs general-purpose, fast and easily interpretable preprocessing tools that perform data integration during exploratory data analysis. Our focus is on vector-valued data sources, each consisting of measurements of the same entity but on different variables, and on tasks where source-specific variation is considered noisy or not interesting. Principal components analysis of all sources combined together is an obvious choice if it is not important to distinguish between data source-specific and shared variation. Canonical Correlation Analysis (CCA) focuses on mutual dependencies and discards source-specific "noise" but it produces a separate set of components for each source.

Results: It turns out that components given by CCA can be combined easily to produce a linear and hence fast and easily interpretable feature extraction method. The method fuses together several sources, such that the properties they share are preserved. Source-specific variation is discarded as uninteresting. We give the details and implement them in a software tool. The method is demonstrated on gene expression measurements in three case studies: classification of cell cycle regulated genes in yeast, identification of differentially expressed genes in leukemia, and defining stress response in yeast. The software package is available at http://www.cis.hut.fi/projects/mi/software/drCCA/.

Conclusion: We introduced a method for the task of data fusion for exploratory data analysis, when statistical dependencies between the sources and not within a source are interesting. The method uses canonical correlation analysis in a new way for dimensionality reduction, and inherits its good properties of being simple, fast, and easily interpretable as a linear projection.

Source
DOI: http://dx.doi.org/10.1186/1471-2105-9-111
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2278131
February 2008

Improved learning of Riemannian metrics for exploratory analysis.

Neural Netw 2004 Oct-Nov;17(8-9):1087-100

Neural Networks Research Centre, Helsinki University of Technology, PO Box 5400, FI-02015 HUT, Finland.

We have earlier introduced a principle for learning metrics, which shows how metric-based methods can be made to focus on discriminative properties of data. The main applications are in supervising unsupervised learning to model interesting variation in data, instead of modeling all variation as plain unsupervised learning does. The metrics are derived by approximations to an information-geometric formulation. In this paper, we review the theory, introduce better approximations to the distances, and show how to apply them in two different kinds of unsupervised methods: prototype-based and pairwise distance-based. The two examples are self-organizing maps and multidimensional scaling (Sammon's mapping).

Source
DOI: http://dx.doi.org/10.1016/j.neunet.2004.06.008
January 2005