• Dowd Iqbal posted an update 1 month, 3 weeks ago

    Polyploidy is a widespread phenomenon in eukaryotes that can lead to phenotypic novelty and has important implications for evolution and diversification. The modification of phenotypes in polyploids relative to their diploid progenitors may be associated with altered gene expression. However, it is largely unknown how interactions between duplicated genes affect their diurnal expression in allopolyploid species. In this study, we explored parental legacy and hybrid novelty in the transcriptomes of an allopolyploid species and its diploid progenitors. We compared the diurnal transcriptomes of representative Brachypodium cytotypes, including the allotetraploid Brachypodium hybridum and its diploid progenitors Brachypodium distachyon and Brachypodium stacei. We also artificially induced an autotetraploid B. distachyon. We identified patterns of homoeolog expression bias (HEB) across Brachypodium cytotypes and time-dependent gain and loss of HEB in B. hybridum. Furthermore, we established that many genes with diurnal expression experienced HEB, while their expression patterns and peak times were correlated between homoeologs in B. hybridum relative to B. distachyon and B. stacei, suggesting diurnal synchronization of homoeolog expression in B. hybridum. Our findings provide insight into the parental legacy and hybrid novelty associated with polyploidy in Brachypodium, and highlight the evolutionary consequences of diurnal transcriptional regulation that accompanied allopolyploidy.

In recent years, eukaryotic long non-coding RNAs (lncRNAs) have been identified as important factors involved in a wide variety of biological processes, including histone modification, alternative splicing and transcription enhancement. The expression of lncRNAs is highly tissue-specific and is regulated by environmental stresses. Recently, a large number of plant lncRNAs have been identified, but very few of them have been studied in detail.
Furthermore, the mechanism of lncRNA expression regulation remains largely unknown. Arabidopsis HISTONE DEACETYLASE 6 (HDA6) and LSD1-LIKE 1/2 (LDL1/2) can repress gene expression synergistically by regulating H3Ac/H3K4me. In this research, we performed RNA-seq and ChIP-seq analyses to further clarify the function of HDA6-LDL1/2. Our results indicated that the global expression of lncRNAs is increased in hda6/ldl1/2 and that this increased lncRNA expression is particularly associated with H3Ac/H3K4me2 changes. In addition, we found that HDA6-LDL1/2 is important for repressing lncRNAs that are non-expressed or show low expression, which may be strongly associated with plant development. GO-enrichment analysis also revealed that the neighboring genes of the lncRNAs that are upregulated in hda6/ldl1/2 are associated with various developmental processes. Collectively, our results revealed that the expression of lncRNAs is associated with H3Ac/H3K4me2 changes regulated by the HDA6-LDL1/2 histone modification complex.

Single-cell RNA-sequencing (scRNA-seq) technology, a powerful tool for analyzing the entire transcriptome at the single-cell level, is receiving increasing research attention. The presence of dropouts is an important characteristic of scRNA-seq data that may affect the performance of downstream analyses, such as dimensionality reduction and clustering. Cells sequenced to lower depths tend to have more dropouts than those sequenced to greater depths. In this study, we aimed to develop a dimensionality reduction method to address both dropouts and the non-negativity constraints in scRNA-seq data. The developed method simultaneously performs dimensionality reduction and dropout imputation under the non-negative matrix factorization (NMF) framework. The dropouts were modeled as a non-negative sparse matrix. The sum of the observed data matrix and the dropout matrix was approximated by NMF.
To ensure the sparsity pattern was maintained, a weighted ℓ1 penalty that took into account the dependency of dropouts on the sequencing depth in each cell was imposed. An efficient algorithm was developed to solve the proposed optimization problem. Experiments using both synthetic data and real data showed that dimensionality reduction via the proposed method afforded more robust clustering results compared with those obtained from the existing methods, and that dropout imputation improved the differential expression analysis.

CRISPR arrays and CRISPR-associated (Cas) proteins comprise a widespread adaptive immune system in bacteria and archaea. These systems function as a defense against exogenous parasitic mobile genetic elements that include bacteriophages, plasmids and foreign nucleic acids. With the continuous spread of antibiotic resistance, knowledge of pathogen susceptibility to bacteriophage therapy is becoming more critical. Additionally, gene-editing applications would benefit from the discovery of new cas genes with favorable properties. While next-generation sequencing has produced staggering quantities of data, transitioning from raw sequencing reads to the identification of CRISPR/Cas systems has remained challenging. This is especially true for metagenomic data, which has the highest potential for identifying novel cas genes. We report a comprehensive computational pipeline, CasCollect, for the targeted assembly and annotation of cas genes and CRISPR arrays (even isolated arrays) from raw sequencing reads. Benchmarking our targeted assembly pipeline demonstrates a speedup of almost two orders of magnitude compared with conventional assembly and annotation, while retaining the ability to detect CRISPR arrays and cas genes.
CasCollect is a highly versatile pipeline and can be used for targeted assembly of any specialty gene set, as it can be reconfigured with user-provided Hidden Markov Models and/or reference nucleotide sequences.

After diverging, each chimpanzee subspecies has been the target of unique selective pressures. Here, we employ a machine learning approach to classify regions as under positive selection or neutrality genome-wide. The regions determined to be under selection reflect the unique demographic and adaptive history of each subspecies. The results indicate that effective population size is important for determining the proportion of the genome under positive selection. The chimpanzee subspecies share signals of selection in genes associated with immunity and gene regulation. With these results, we have created a selection map for each population that can be displayed in a genome browser (www.hsb.upf.edu/chimp_browser). This study is the first to use a detailed demographic history and machine learning to map selection genome-wide in chimpanzees. The chimpanzee selection map will improve our understanding of the impact of selection on closely related subspecies and will empower future studies of chimpanzees.
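
For readers who want a concrete picture of the scRNA-seq method summarized earlier in this post (NMF applied to the sum of the observed matrix and a sparse non-negative dropout matrix, with a depth-weighted ℓ1 penalty), here is a minimal sketch in Python with NumPy. It is not the authors' algorithm. The squared-error loss, the multiplicative updates, the choice of weights proportional to each cell's sequencing depth, the restriction of imputation to observed zeros, and every name (nmf_with_dropouts, lam, rank) are assumptions made for illustration only.

```python
import numpy as np


def nmf_with_dropouts(X, rank, lam=1.0, n_iter=200, eps=1e-10, seed=0):
    """Sketch: approximate X + D by W @ H with W, H, D all non-negative.

    X is a non-negative genes x cells matrix. Assumed objective:
        ||X + D - W H||_F^2 + lam * sum_ij w_j * D_ij
    where w_j grows with the sequencing depth of cell j (an assumption),
    so low-depth cells are allowed more imputed dropouts.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n_genes, n_cells = X.shape
    W = rng.random((n_genes, rank)) + eps
    H = rng.random((rank, n_cells)) + eps
    D = np.zeros_like(X)

    # Assumed weighting: per-cell depth relative to the mean depth.
    depth = X.sum(axis=0)
    w = depth / (depth.mean() + eps)

    # Assumption: only observed zeros can be dropouts.
    zero_mask = (X == 0)

    history = []
    for _ in range(n_iter):
        Y = X + D  # current imputed matrix, non-negative by construction

        # Standard multiplicative updates for the Frobenius-norm NMF of Y.
        H *= (W.T @ Y) / (W.T @ W @ H + eps)
        W *= (Y @ H.T) / (W @ H @ H.T + eps)

        # Entrywise closed-form minimizer over D >= 0 (a shifted, clipped
        # residual), applied only where X is zero.
        D = np.maximum(W @ H - X - 0.5 * lam * w[None, :], 0.0) * zero_mask

        obj = np.sum((X + D - W @ H) ** 2) + lam * np.sum(w[None, :] * D)
        history.append(obj)

    return W, H, D, history
```

In this sketch the columns of H serve as the low-dimensional representation of the cells, and X + D is the imputed matrix. A larger lam gives a sparser D, so fewer zeros are treated as dropouts.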