Latest in q-bio.gn

FindeR: Accelerating FM-Index-based Exact Pattern Matching in Genomic Sequences through ReRAM technology (Jul 11 2019). Genomics is key to enabling precision medicine, ensuring global food security and enforcing wildlife conservation. The massive genomic data produced by various genome sequencing technologies presents a significant challenge for genome ...
Quantitative Immunology for Physicists (Jul 08 2019). The adaptive immune system is a dynamical, self-organized multiscale system that protects vertebrates from both pathogens and internal irregularities, such as tumours. For these reasons it fascinates physicists, yet the multitude of different cells, molecules ...
Non-Invasive MGMT Status Prediction in GBM Cancer Using Magnetic Resonance Images (MRI) Radiomics Features: Univariate and Multivariate Machine Learning Radiogenomics Analysis (Jul 08 2019). Background and aim: This study aimed to predict the methylation status of the O-6 methyl guanine-DNA methyl transferase (MGMT) gene promoter by using MRI radiomics features, as well as univariate and multivariate analysis. Material and Methods: Eighty-two ...
Predicting Gene Expression Between Species with Neural Networks (Jul 05 2019). We train a neural network to predict human gene expression levels based on experimental data for rat cells. The network is trained with paired human/rat samples from the Open TG-GATES database, where paired samples were treated with the same compound ...
Next Generation Radiogenomics Sequencing for Prediction of EGFR and KRAS Mutation Status in NSCLC Patients Using Multimodal Imaging and Machine Learning Approaches (Jul 03 2019). Aim: In the present work, we aimed to evaluate a comprehensive radiomics framework that enabled prediction of EGFR and KRAS mutation status in NSCLC cancer patients based on PET and CT multi-modalities radiomic features and machine learning (ML) algorithms. ...
Machine Learning based Prediction of Hierarchical Classification of Transposable Elements (Jul 02 2019, Jul 09 2019). Transposable Elements (TEs) or jumping genes are DNA sequences that have an intrinsic capability to move within a host genome from one genomic location to another. Studies show that the presence of a TE within or adjacent to a functional gene may ...
A Bayesian Phylogenetic Hidden Markov Model for B Cell Receptor Sequence Analysis (Jun 27 2019). The human body is able to generate a diverse set of high affinity antibodies, the soluble form of B cell receptors (BCRs), that bind to and neutralize invading pathogens. The natural development of BCRs must be understood in order to design vaccines for ...
Convolutional neural network models for cancer type prediction based on gene expression (Jun 18 2019). Background: Precise prediction of cancer types is vital for cancer diagnosis and therapy. Important cancer marker genes can be inferred through predictive models. Several studies have attempted to build machine learning models for this task; however, none ...
Nested partitions from hierarchical clustering statistical validation (Jun 17 2019). We develop a greedy algorithm that is fast and scalable in the detection of a nested partition extracted from a dendrogram obtained from hierarchical clustering of a multivariate series. Our algorithm provides a $p$-value for each clade observed in the ...
Exploring Bayesian approaches to eQTL mapping through probabilistic programming (Jun 12 2019). The discovery of genomic polymorphisms influencing gene expression (also known as expression quantitative trait loci or eQTLs) can be formulated as a sparse Bayesian multivariate/multiple regression problem. An important aspect in the development of such ...
Unsupervised Representation Learning of DNA Sequences (Jun 07 2019). Recently several deep learning models have been used for DNA sequence based classification tasks. Often such tasks require long and variable length DNA sequences in the input. In this work, we use a sequence-to-sequence autoencoder model to learn a latent ...
DOT: Gene-set analysis by combining decorrelated association statistics (Jun 05 2019, Jun 10 2019). Historically, the majority of statistical association methods have been designed assuming availability of SNP-level information. However, modern genetic and sequencing data present new challenges to access and sharing of genotype-phenotype datasets, including ...
Incorporating Biological Knowledge with Factor Graph Neural Network for Interpretable Deep Learning (Jun 03 2019). While deep learning has achieved great success in many fields, one common criticism about deep learning is its lack of interpretability. In most cases, the hidden units in a deep neural network do not have a clear semantic meaning or correspond to any ...
Point mutations in a growing cell population (May 29 2019). A growing cell population such as cancer or bacteria accumulates point mutations (single nucleotide changes in DNA). To model this process, we consider that cells divide and die as a branching process, and that each cell contains a sequence of nucleotides ...
The network architecture of the human brain is modularly encoded in the genome (May 18 2019). The form of genotype-phenotype maps is typically modular. However, the form of the genotype-phenotype map in human brain connectivity is unknown. A modular mapping could exist, in which distinct sets of genes' coexpression across brain regions is similar ...
ncRNA Classification with Graph Convolutional Networks (May 16 2019). Non-coding RNA (ncRNA) are RNA sequences which don't code for a gene but instead carry important biological functions. The task of ncRNA classification consists in classifying a given ncRNA sequence into its family. While it has been shown that the graph ...
The Energetics of Molecular Adaptation in Transcriptional Regulation (May 15 2019). Mutation is a critical mechanism by which evolution explores the functional landscape of proteins. Despite our ability to experimentally inflict mutations at will, it remains difficult to link sequence-level perturbations to systems-level responses. Here, ...
Tumor Microenvironment-based Gene Signatures Divides Novel Immune and Stromal Subgroup Classification of Lung Adenocarcinoma (May 10 2019). Tumor microenvironment has complex effects on tumorigenesis and metastasis. However, there is still a lack of comprehensive understanding of the relationship among molecular and cellular characteristics in tumor microenvironment, clinical prognosis and ...
RNASeqR: an R package for automated two-group RNA-Seq analysis workflow (May 10 2019). RNA-Seq analysis has revolutionized researchers' understanding of the transcriptome in biological research. Assessing the differences in transcriptomic profiles between tissue samples or patient groups enables researchers to explore the underlying biological ...
Tasks, Techniques, and Tools for Genomic Data Visualization (May 08 2019). Genomic data visualization is essential for interpretation and hypothesis generation as well as a valuable aid in communicating discoveries. Visual tools bridge the gap between algorithmic approaches and the cognitive skills of investigators. Addressing ...
RACS: Rapid Analysis of ChIP-Seq data for contig based genomes (May 07 2019). Background: Chromatin immunoprecipitation coupled to next generation sequencing (ChIP-Seq) is a widely used technique to investigate the function of chromatin-related proteins in a genome-wide manner. ChIP-Seq generates large quantities of data which ...
Somatic mutations render human exome and pathogen DNA more similar (May 07 2019). Immunotherapy has recently shown important clinical successes in a substantial number of oncology indications. Additionally, the tumor somatic mutation load has been shown to associate with response to these therapeutic agents, and specific mutational ...
Analysis of Gene Interaction Graphs for Biasing Machine Learning Models (May 06 2019). Gene interaction graphs aim to capture various relationships between genes and can be used to create more biologically-intuitive models for machine learning. There are many such graphs available which can differ in the number of genes and edges covered. ...
A joint model of unpaired data from scRNA-seq and spatial transcriptomics for imputing missing gene expression measurements (May 06 2019). Spatial studies of transcriptome provide biologists with gene expression maps of heterogeneous and complex tissues. However, most experimental protocols for spatial transcriptomics suffer from the need to select beforehand a small fraction of genes to ...
DNA energy constraints shape biological evolutionary trajectories (May 02 2019). Most living systems rely on double-stranded DNA (dsDNA) to store their genetic information and perpetuate themselves. Thus, the biological information contained within a dsDNA molecule, in terms of a linear sequence of nucleotides, has been considered ...
A bioinformatics pipeline for the identification of CHO cell differential gene expression from RNA-Seq data (May 01 2019). In recent years the publication of genome sequences for the Chinese hamster and Chinese hamster ovary (CHO) cell lines has facilitated study of these biopharmaceutical cell factories with unprecedented resolution. Our understanding of the CHO cell transcriptome, ...
Genome analysis and pleiotropy assessment using causal networks with loss of function mutation and metabolomics (Apr 29 2019). Background: Many genome-wide association studies have detected genomic regions associated with traits, yet understanding the functional causes of association often remains elusive. Utilizing systems approaches and focusing on intermediate molecular phenotypes ...
Statistical methods for the quantitative genetic analysis of high-throughput phenotyping data (Apr 28 2019). The advent of plant phenomics, coupled with the wealth of genotypic data generated by next-generation sequencing technologies, provides exciting new resources for investigations into and improvement of complex traits. However, these new technologies also ...
Generating protein sequences from antibiotic resistance genes data using Generative Adversarial Networks (Apr 28 2019). We introduce a method to generate synthetic protein sequences which are predicted to be resistant to certain antibiotics. We did this using 6,023 genes that were predicted to be resistant to antibiotics in the intestinal region of the human gut and were ...
Read classification using semi-supervised deep learning (Apr 23 2019). In this paper, we propose a semi-supervised deep learning method for detecting the specific types of reads that impede the de novo genome assembly process. Instead of dealing directly with sequenced reads, we analyze their coverage graphs converted to ...
MinCall - MinION end2end convolutional deep learning basecaller (Apr 22 2019). Oxford Nanopore Technologies' MinION is the first portable DNA sequencing device. It is capable of producing long reads; reads of over 100 kbp have been reported. However, it has a significantly higher error rate than other methods. In this study, we present MinCall, ...
Radiogenomics models in precision radiotherapy: from mechanistic to machine learning (Apr 21 2019). Machine learning provides a broad framework for addressing high-dimensional prediction problems in classification and regression. While machine learning is often applied for imaging problems in medical physics, there are many efforts to apply these principles ...
Genomics models in radiotherapy: from mechanistic to machine learning (Apr 21 2019, Jul 04 2019). Machine learning provides a broad framework for addressing high-dimensional prediction problems in classification and regression. While machine learning is often applied for imaging problems in medical physics, there are many efforts to apply these principles ...
Solution of the FISH-Hi-C paradox for Human Interphase Chromosomes (Apr 20 2019). Hi-C experiments are used to infer the contact probabilities between loci separated by varying genome lengths. Contact probability should decrease as the spatial distance between two loci increases. However, studies comparing Hi-C and FISH data show that ...
Testing for differential abundance in compositional counts data, with application to microbiome studies (Apr 18 2019, Jun 02 2019). In order to identify which taxa differ in the microbiome community across groups, the relative frequencies of the taxa are measured for each unit in the group by sequencing PCR amplicons. Statistical inference in this setting is challenging due to the ...
Disease gene prioritization using network topological analysis from a sequence based human functional linkage network (Apr 15 2019). Sequencing a large number of candidate disease genes in order to identify the relationship between them is an expensive and time-consuming task. To handle these challenges, different computational approaches have been developed. Based ...
A mean first passage time genome rearrangement distance (Apr 12 2019). This paper introduces a new way to define a genome rearrangement distance, using the concept of mean first passage time from control theory. Crucially, this distance estimate provides a genuine metric on genome space. We develop the theory and introduce ...
pdbmine: A Node.js API for the RCSB Protein Data Bank (PDB) (Apr 03 2019). Summary: The advent of Web-based tools that assist in the analysis and visualization of macromolecules requires application programming interfaces (APIs) designed for modern web frameworks. To this end, we have developed a Node.js module pdbmine that allows ...
Learning Clinical Outcomes from Heterogeneous Genomic Data Sources (Apr 02 2019). Translating the vast data generated by genomic platforms into reliable predictions of clinical outcomes remains a critical challenge in realizing the promise of genomic medicine, largely due to the small number of independent samples. In this paper, we show ...
BPPart and BPMax: RNA-RNA Interaction Partition Function and Structure Prediction for the Base Pair Counting Model (Apr 02 2019). RNA-RNA interaction (RRI) is ubiquitous and has complex roles in cellular functions. In human health studies, miRNA-target and lncRNAs are among an elite class of RRIs that have been extensively studied. Bacterial ncRNA-target and RNA interference ...
Data structures to represent sets of k-long DNA sequences (Mar 29 2019). The analysis of biological sequencing data has been one of the biggest applications of string algorithms. The approaches used in many such applications are based on the analysis of k-mers, which are short fixed-length strings present in a dataset. While ...
Why understanding multiplex social network structuring processes will help us better understand the evolution of human behavior (Mar 26 2019). Anthropologists have long appreciated that single-layer networks are insufficient descriptions of human interactions: individuals are embedded in complex networks with dependencies. One debate explicitly about this surrounds food sharing. Some argue ...
HIV-1 virus cycle replication: a review of RNA polymerase II transcription, alternative splicing and protein synthesis (Mar 12 2019). HIV virus replication is a time-related process that includes several stages. Focusing on the core steps, RNA polymerase II transcribes in an early stage pre-mRNA containing regulator proteins (i.e., nef, tat, rev, vif, vpr, vpu), which are completely spliced ...
conLSH: Context based Locality Sensitive Hashing for Mapping of noisy SMRT Reads (Mar 11 2019). Single Molecule Real-Time (SMRT) sequencing is a recent advancement of Next Gen technology developed by Pacific Bio (PacBio). It comes with an explosion of long and noisy reads demanding cutting edge research to get the most out of it. To deal with the high ...
A biologically constrained encoding solution for long-term storage of images onto synthetic DNA (Mar 07 2019). Living in the age of the digital media explosion, the amount of data that is being stored increases dramatically. However, even if existing storage systems suggest efficiency in capacity, they are lacking in durability. Hard disks, flash, tape or even ...
On genetic correlation estimation with summary statistics from genome-wide association studies (Mar 04 2019). Genome-wide association studies (GWAS) have been widely used to examine the association between single nucleotide polymorphisms (SNPs) and complex traits, where both the sample size n and the number of SNPs p can be very large. Recently, cross-trait polygenic ...
CAMIRADA: Cancer microRNA association discovery algorithm, a case study on breast cancer (Feb 27 2019). In recent studies, non-coding protein RNAs have been identified as microRNA that can be used as biomarkers for early diagnosis and treatment of cancer, that decrease mortality in cancer. A microRNA may target hundreds or thousands of genes and a gene ...
Fast Approximation of Frequent $k$-mers and Applications to Metagenomics (Feb 26 2019). Estimating the abundances of all $k$-mers in a set of biological sequences is a fundamental and challenging problem with many applications in biological analysis. While several methods have been designed for the exact or approximate solution of this problem, ...
Diversity and its decomposition into variety, balance and disparity (Feb 25 2019, Feb 26 2019). Diversity is a central concept in many fields. Despite its importance, there is no unified methodological framework to measure diversity and its three components of variety, balance and disparity. Current approaches take into account disparity of the ...
A Nonparametric Multi-view Model for Estimating Cell Type-Specific Gene Regulatory Networks (Feb 21 2019). We present a Bayesian hierarchical multi-view mixture model termed Symphony that simultaneously learns clusters of cells representing cell types and their underlying gene regulatory networks by integrating data from two views: single-cell gene expression ...
Using sequencing coverage statistics to identify sex chromosomes in minke whales (Feb 18 2019). The ever-increasing number of genome sequencing and resequencing projects is a central source of insights into the ecology and evolution of non-model organisms. An important aspect of genomics is the elucidation of sex determination systems and identifying ...
BOAssembler: a Bayesian Optimization Framework to Improve RNA-Seq Assembly Performance (Feb 14 2019). High throughput sequencing of RNA (RNA-Seq) can provide us with millions of short fragments of RNA transcripts from a sample. How to better recover the original RNA transcripts from those fragments (RNA-Seq assembly) is still a difficult task. For example, ...
OPENMENDEL: A Cooperative Programming Project for Statistical Genetics (Feb 14 2019). Statistical methods for genomewide association studies (GWAS) continue to improve. However, the increasing volume and variety of genetic and genomic data make computational speed and ease of data manipulation mandatory in future software. In our view, ...
PLIT: An alignment-free computational tool for identification of long non-coding RNAs in plant transcriptomic datasets (Feb 12 2019). Long non-coding RNAs (lncRNAs) are a class of non-coding RNAs which play a significant role in several biological processes. RNA-seq based transcriptome sequencing has been extensively used for identification of lncRNAs. However, accurate identification ...
Apollo: A Sequencing-Technology-Independent, Scalable, and Accurate Assembly Polishing Algorithm (Feb 12 2019). A large proportion of the basepairs in the long reads that third-generation sequencing technologies produce possess sequencing errors. These errors propagate to the assembly and affect the accuracy of genome analysis. Assembly polishing algorithms minimize ...
Achieving GWAS with Homomorphic Encryption (Feb 12 2019, Mar 11 2019). One way of investigating how genes affect human traits would be with a genome-wide association study (GWAS). Genetic markers, known as single-nucleotide polymorphisms (SNPs), are used in GWAS. This raises privacy and security concerns as these genetic markers ...
Scalable optimal Bayesian classification of single-cell trajectories under regulatory model uncertainty (Feb 08 2019). Single-cell gene expression measurements offer opportunities in deriving mechanistic understanding of complex diseases, including cancer. However, due to the complex regulatory machinery of the cell, gene regulatory network (GRN) model inference based ...
Some Enumeration Problems in the Duplication-Loss Model of Genome Rearrangement (Feb 01 2019, Jun 06 2019). Tandem-duplication-random-loss (TDRL) is an important genome rearrangement operation studied in evolutionary biology. This paper investigates some of the formal properties of TDRL operations on the symmetric group (the space of permutations over an $ ...
Adaptive Monte Carlo Multiple Testing via Multi-Armed Bandits (Feb 01 2019, May 18 2019). The Monte Carlo (MC) permutation test is considered the gold standard for statistical hypothesis testing, especially when standard parametric assumptions are not clear or likely to fail. However, in modern data science settings where a large number of hypothesis ...
Predicting Toxicity from Gene Expression with Neural Networks (Jan 31 2019). We train a neural network to predict chemical toxicity based on gene expression data. The input to the network is a full expression profile collected either in vitro from cultured cells or in vivo from live animals. The output is a set of fine grained ...
GeNet: Deep Representations for Metagenomics (Jan 30 2019). We introduce GeNet, a method for shotgun metagenomic classification from raw DNA sequences that exploits the known hierarchical structure between labels for training. We provide a comparison with state-of-the-art methods Kraken and Centrifuge on datasets ...
Causal Mediation Analysis Leveraging Multiple Types of Summary Statistics Data (Jan 24 2019). Summary statistics of genome-wide association studies (GWAS) teach causal relationships between millions of genetic markers and tens of thousands of phenotypes. However, underlying biological mechanisms are yet to be elucidated. We can achieve necessary ...
Proteomic and metagenomic insights into prehistoric Spanish Levantine Rock Art (Jan 24 2019). The Iberian Mediterranean Basin is home to one of the largest groups of prehistoric rock art sites in Europe. Despite the cultural relevance of prehistoric Spanish Levantine rock art, pigment composition remains partially unknown, and the nature of the ...
Identifying centromeric satellites with dna-brnn (Jan 22 2019, Mar 18 2019). Summary: Human alpha satellite and satellite 2/3 contribute to several percent of the human genome. However, identifying these sequences with traditional algorithms is computationally intensive. Here we develop dna-brnn, a recurrent neural network to ...
Dual Graph-Laplacian PCA: A Closed-Form Solution for Bi-clustering to Find "Checkerboard" Structures on Gene Expression Data (Jan 21 2019). In the context of cancer, internal "checkerboard" structures are normally found in the matrices of gene expression data, which correspond to genes that are significantly up- or down-regulated in patients with specific types of tumors. In this paper, we ...
Spatial clustering and common regulatory elements correlate with coordinated gene expression (Jan 18 2019). Many cellular responses to surrounding cues require temporally concerted transcriptional regulation of multiple genes. In prokaryotic cells, a single-input-module motif with one transcription factor regulating multiple target genes can generate coordinated ...
A Hybrid HMM Approach for the Dynamics of DNA Methylation (Jan 18 2019). The understanding of mechanisms that control epigenetic changes is an important research area in modern functional biology. Epigenetic modifications such as DNA methylation are in general very stable over many cell divisions. DNA methylation can however ...
Determining Multifunctional Genes and Diseases in Human Using Gene Ontology (Jan 11 2019). The study of human genes and diseases is very rewarding and can lead to improvements in healthcare, disease diagnostics and drug discovery. In this paper, we further our previous study on gene disease relationship specifically with the multifunctional ...
The Mahalanobis kernel for heritability estimation in genome-wide association studies: fixed-effects and random-effects methods (Jan 09 2019). Linear mixed models (LMMs) are widely used for heritability estimation in genome-wide association studies (GWAS). In standard approaches to heritability estimation with LMMs, a genetic relationship matrix (GRM) must be specified. In GWAS, the GRM is frequently ...
De novo inference of diversity genes and analysis of non-canonical V(DD)J recombination in immunoglobulins (Jan 08 2019). The V(D)J recombination forms the immunoglobulin genes by joining the variable (V), diversity (D), and joining (J) germline genes. Since variations in germline genes have been linked to various diseases, personalized immunogenomics aims at finding alleles ...
Figure 1 Theory Meets Figure 2 Experiments in the Study of Gene Expression (Dec 30 2018). It is tempting to believe that we now own the genome. The ability to read and re-write it at will has ushered in a stunning period in the history of science. Nonetheless, there is an Achilles heel exposed by all of the genomic data that has accrued: we ...
ATHENA: Automated Tuning of Genomic Error Correction Algorithms using Language Models (Dec 30 2018). The performance of most error-correction algorithms that operate on genomic sequencer reads is dependent on the proper choice of its configuration parameters, such as the value of k in k-mer based techniques. In this work, we target the problem of finding ...
Parallel Clustering of Single Cell Transcriptomic Data with Split-Merge Sampling on Dirichlet Process Mixtures (Dec 25 2018). Motivation: With the development of droplet based systems, massive single cell transcriptome data has become available, which enables analysis of cellular and molecular processes at single cell resolution and is instrumental to understanding many biological ...
Pan-Cancer Epigenetic Biomarker Selection from Blood Samples Using SAS (Dec 21 2018). A key focus in current cancer research is the discovery of cancer biomarkers that allow earlier detection with high accuracy and lower costs for both patients and hospitals. Blood samples have long been used as a health status indicator, but DNA methylation ...
Bayesian Manifold-Constrained-Prior Model for an Experiment to Locate Xce (Dec 20 2018). We propose an analysis for a novel experiment intended to locate the genetic locus Xce (X-chromosome controlling element), which biases the stochastic process of X-inactivation in the mouse. X-inactivation bias is a phenomenon where cells in the embryo ...
GenHap: A Novel Computational Method Based on Genetic Algorithms for Haplotype Assembly (Dec 18 2018). The computational problem of inferring the full haplotype of a cell starting from read sequencing data is known as haplotype assembly, and consists in assigning all heterozygous Single Nucleotide Polymorphisms (SNPs) to exactly one of the two chromosomes. ...
α7 nicotinic acetylcholine receptor signaling modulates ovine fetal brain astrocytes transcriptome in response to endotoxin: comparison to microglia, implications for prenatal stress and development of autism spectrum disorder (Dec 17 2018, Dec 31 2018). Neuroinflammation in utero may result in lifelong neurological disabilities. Astrocytes play a pivotal role, but the mechanisms are poorly understood. No early postnatal treatment strategies exist to enhance neuroprotective potential of astrocytes. We ...
Alpha7 nicotinic acetylcholine receptor signaling modulates ovine fetal brain astrocytes transcriptome in response to endotoxin (Dec 17 2018, Apr 09 2019). Neuroinflammation in utero may result in lifelong neurological disabilities. Astrocytes play a pivotal role, but the mechanisms are poorly understood. No early postnatal treatment strategies exist to enhance neuroprotective potential of astrocytes. We ...
Topological Data Analysis of Single-cell Hi-C Contact Maps (Dec 04 2018). In this article, we show how the recent statistical techniques developed in Topological Data Analysis for the Mapper algorithm can be extended and leveraged to formally define and statistically quantify the presence of topological structures coming from ...
Integrating omics and MRI data with kernel-based tests and CNNs to identify rare genetic markers for Alzheimer's diseaseDec 02 2018Mar 05 2019For precision medicine and personalized treatment, we need to identify predictive markers of disease. We focus on Alzheimer's disease (AD), where magnetic resonance imaging scans provide information about the disease status. By combining imaging with ... More
Reliable uncertainty estimate for antibiotic resistance classification with Stochastic Gradient Langevin Dynamics Nov 27 2018 Antibiotic resistance monitoring is of paramount importance in the face of this on-going global epidemic. Deep learning models trained with traditional optimization algorithms (e.g. Adam, SGD) provide poor posterior estimates when tested against out-of-distribution ...
Interlacing Personal and Reference Genomes for Machine Learning Disease-Variant Detection Nov 26 2018 DNA sequencing to identify genetic variants is becoming increasingly valuable in clinical settings. Assessment of variants in such sequencing data is commonly implemented through Bayesian heuristic algorithms. Machine learning has shown great promise ...
A Framework for Implementing Machine Learning on Omics Data Nov 26 2018 The potential benefits of applying machine learning methods to -omics data are becoming increasingly apparent, especially in clinical settings. However, the unique characteristics of these data are not always well suited to machine learning techniques. ...
Private Shotgun DNA Sequencing Nov 23 2018 Current techniques in sequencing a genome allow a service provider (e.g. a sequencing company) to have full access to the genome information, and thus the privacy of individuals regarding their lifetime secret is violated. In this paper, we introduce ...
Inference of the three-dimensional chromatin structure and its temporal behavior Nov 22 2018 Understanding the three-dimensional (3D) structure of the genome is essential for elucidating vital biological processes and their links to human disease. To determine how the genome folds within the nucleus, chromosome conformation capture methods such ...
Group induced graphical lasso allows for discovery of molecular pathways-pathways interactions Nov 21 2018 Complex systems may contain heterogeneous types of variables that interact in a multi-level and multi-scale manner. In this context, high-level layers may be considered as groups of variables interacting in lower-level layers. This is particularly true in ...
DeepZip: Lossless Data Compression using Recurrent Neural Networks Nov 20 2018 Sequential data is being generated at an unprecedented pace in various forms, including text and genomic data. This creates the need for efficient compression mechanisms to enable better storage, transmission and processing of such data. To solve this ...
A Multi-Trait Approach Identified Genetic Variants Including a Rare Mutation in RGS3 with Impact on Abnormalities of Cardiac Structure/Function Nov 19 2018 Heart failure is a major cause of premature death. Given the heterogeneity of the heart failure syndrome, identifying genetic determinants of cardiac function and structure may provide greater insights into heart failure. Despite progress in understanding ...
Prediction of Signal Sequences in Abiotic Stress Inducible Genes from Main Crops by Association Rule Mining Nov 18 2018 It is important to study genes affecting the growing environment of main crops. In particular, the recognition problem of promoter regions, which is the problem of predicting whether DNA sequences contain promoter regions or not, is prior to find abiotic stress-inducible ...
Synergistic Drug Combination Prediction by Integrating Multi-omics Data in Deep Learning Models Nov 16 2018 Drug resistance is still a major challenge in cancer therapy. Drug combination is expected to overcome drug resistance. However, the number of possible drug combinations is enormous, and thus it is infeasible to experimentally screen all effective drug ...
Linking de novo assembly results with long DNA reads by dnaasm-link application Nov 13 2018 Currently, third-generation sequencing techniques, which yield much longer DNA reads than next-generation sequencing technologies, are becoming more and more popular. There are many possibilities to combine data from next-generation ...