Jubayer Hossain

Biomedical Researcher

Resources


Helpful materials, guides, and tools for learning bioinformatics, data science, and public health research.


Recommended Reading and RNA-seq Resources

Resources to help find RNA-seq Data Sets

Unix/Linux Tutorials

General Bioinformatics Resources

Getting More Help

Literature Review

  • typeset.io - A platform that helps researchers, academics, and authors write, format, and prepare documents for publication.
  • scholarcy - The AI-powered article summarizer.
  • elicit - AI-based research assistant.

Text Editor (Vi/Vim) Tutorials

Journal Finders

  • Journal Finder by Elsevier: This tool allows you to enter your manuscript title and abstract to find journals that are a good match for your research. It also provides information on journal metrics and submission guidelines.

  • Springer Journal Suggester: Springer's tool helps you find the right journal for your research by analyzing your abstract or keywords and suggesting journals that publish articles in your field.

  • PubMed Journal Selector: If your research is in the biomedical or life sciences, PubMed's Journal Selector can assist you in finding journals that match your keywords and research area.

  • JANE (Journal/Author Name Estimator): JANE is a free online tool that helps you find journals and authors based on the text of your article's title and abstract.

  • JournalGuide: This tool provides a comprehensive database of journals in various fields and allows you to search for journals by keywords or browse by subject area.

  • Scopus Journal Finder: Scopus, a bibliographic database, offers a journal finder feature that helps you find journals related to your research area based on keywords or article titles.

  • DOAJ (Directory of Open Access Journals): If you're interested in open-access journals, DOAJ is a directory of freely available scholarly journals that you can search by subject or keyword.

  • Scimago Journal & Country Rank: Provides journal rankings, citation data, and related metrics, which helps researchers identify suitable journals for their work or assess the impact and prestige of journals in their field.

Data Wrangling

  • readxl for importing data into R
  • dplyr, tidyr and others from the tidyverse for data preparation.

Data Visualization

  • ggplot2 for the vast majority of the graphics, together with hrbrthemes for styling.
  • patchwork to put graphics together.
  • ggraph and igraph for most of the network-related graphics.
  • plotly and other HTML widgets for interactive graphics.
  • RColorBrewer, viridis, and colormap to control color in charts.
  • ggrepel and other ggplot2 extensions that make your life simpler.
  • heatmaply for most of the heatmaps.

Publication-ready Tables

  • gtsummary for creating publication-ready descriptive and analytical tables.
  • gt to customize tables and export them as Word (.docx) or LaTeX (.tex) files.

Reproducible Research

  • R Markdown to produce statistical reports.
  • Quarto to build 95% of the website for my courses and others.

Statistical Modeling

  • easystats for easy statistical modeling, visualization, and reporting

Data Science

  • NumPy for scientific computing.
  • Pandas for data wrangling and analysis
  • Matplotlib for data visualization
  • Seaborn for advanced statistical visualizations
  • Plotly for interactive data visualization
  • researchpy to summarize data and perform statistical tests.
  • Dask for big data analysis
  • scikit-learn for machine learning
  • scikit-image for life science image manipulation
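
As a quick illustration of how two of the libraries above fit together, here is a minimal sketch that uses NumPy for a numeric transform and Pandas for a grouped summary. The data frame, column names, and values are made up for the example and do not come from any real data set.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: expression values for two genes across four samples.
df = pd.DataFrame(
    {
        "sample": ["s1", "s2", "s3", "s4"],
        "group": ["control", "control", "treated", "treated"],
        "gene_a": [1.0, 3.0, 7.0, 15.0],
        "gene_b": [3.0, 3.0, 1.0, 1.0],
    }
)

# NumPy: apply a log2(x + 1) transform to the numeric columns.
genes = ["gene_a", "gene_b"]
df[genes] = np.log2(df[genes] + 1)

# Pandas: average each gene within each group.
group_means = df.groupby("group")[genes].mean()
print(group_means)
```

The same pattern (transform, then group and summarize) carries over to larger tables; the other libraries in the list cover plotting, statistics, and modeling on top of it.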

Data Resources

Some Data Sources

Infectious Disease Specific

General

Influenza

TB

Cancer Bioinformatics

Data Sources

Analysis Tools

  • UCSC Xena: An online exploration tool for public and private, multi-omic and clinical/phenotype data

  • GEO2R: GEO2R is an interactive web tool that allows users to compare two or more groups of Samples in a GEO Series in order to identify genes that are differentially expressed across experimental conditions. Results are presented as a table of genes ordered by P-value, and as a collection of graphic plots to help visualize differentially expressed genes and assess data set quality. GEO2R uses a variety of R packages from the Bioconductor project. Bioconductor is an open-source software project based on the R programming language that provides tools for the analysis of high-throughput genomic data.

  • GEPIA2: GEPIA2 is a web-based tool for analyzing gene expression data in cancer. It stands for Gene Expression Profiling Interactive Analysis 2 and is an updated version of the original GEPIA tool. GEPIA2 allows users to explore gene expression patterns, perform survival analyses, and visualize gene expression data across various cancer types.

  • TIMER2.0: TIMER is a comprehensive resource for systematic analysis of immune infiltrates across diverse cancer types. This version of the web server provides immune infiltrate abundances estimated by multiple immune deconvolution methods, and allows users to generate high-quality figures dynamically to explore tumor immunological, clinical, and genomic features.

  • UALCAN: UALCAN is a web-based platform that provides interactive and comprehensive analysis of cancer transcriptome data. It enables users to explore gene expression patterns, perform survival analyses, and compare gene expression between tumor and normal samples across different cancer types. UALCAN utilizes data from The Cancer Genome Atlas (TCGA) to facilitate cancer research and provide insights into tumor biology.

  • cBioPortal for Cancer Genomics: cBioPortal hosts a large collection of cancer genomics datasets, allowing users to explore and visualize the data.

  • GREIN (GEO RNA-seq Experiments Interactive Navigator): GREIN is an interactive web platform that provides user-friendly options to explore and analyze GEO RNA-seq data. GREIN is powered by a back-end computational pipeline for uniform processing of RNA-seq data and a large number (>6,000) of already processed datasets. These datasets were retrieved from GEO and reprocessed consistently by the back-end GEO RNA-seq experiments processing pipeline (GREP2).

  • OncoLnc: OncoLnc is a web resource that provides survival analysis and expression correlation for genes of interest across multiple cancer datasets.

  • UCSC Cancer Genomics Browser: The UCSC Cancer Genomics Browser offers a comprehensive collection of cancer genomics data integrated with genomic annotations.

  • ONCOMINE: ONCOMINE is a powerful web-based platform for the analysis and visualization of cancer transcriptomic data. It provides researchers with access to a vast collection of publicly available gene expression datasets derived from cancer studies. ONCOMINE allows users to explore gene expression patterns, identify potential biomarkers, and compare gene expression between different cancer types or subtypes.
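
Several of the tools above compare gene expression between two groups of samples and rank genes by the strength of the difference. The sketch below shows that general idea only: it is not how GEO2R or any other listed tool computes its results (GEO2R relies on Bioconductor packages), and it ranks by a plain Welch t-statistic rather than a P-value. The gene names and values are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical toy expression values: gene -> (control samples, treated samples).
expression = {
    "gene_a": ([5.0, 5.2, 4.8], [8.0, 8.3, 7.9]),
    "gene_b": ([6.0, 6.1, 5.9], [6.0, 6.2, 5.8]),
    "gene_c": ([9.0, 8.5, 9.5], [7.0, 7.4, 6.6]),
}


def welch_t(a, b):
    """Welch's t-statistic for two independent groups with unequal variances."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(b) - mean(a)) / se


# Rank genes by the absolute size of the statistic, largest first.
ranked = sorted(
    ((gene, welch_t(ctrl, trt)) for gene, (ctrl, trt) in expression.items()),
    key=lambda item: abs(item[1]),
    reverse=True,
)

for gene, t in ranked:
    print(f"{gene}\t{t:+.2f}")
```

A positive statistic means higher expression in the treated group, a negative one means lower. Real analyses also need multiple-testing correction and methods suited to small sample sizes, which the web tools handle for you.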

R Packages

  • TCGAbiolinks: An R/Bioconductor package for integrative analysis with GDC data. TCGAbiolinks accesses the National Cancer Institute (NCI) Genomic Data Commons (GDC) through its Application Programming Interface (API) to search, download, and prepare relevant data for analysis in R.

  • maftools: Summarize, Analyze and Visualize MAF Files. This package attempts to summarize, analyze, annotate and visualize MAF files in an efficient manner from either TCGA sources or any in-house studies as long as the data is in MAF format.

  • SummarizedExperiment: The SummarizedExperiment container contains one or more assays, each represented by a matrix-like object of numeric or other mode. The rows typically represent genomic ranges of interest and the columns represent samples.

  • MutationalPatterns: Comprehensive genome-wide analysis of mutational processes. The package covers a wide range of patterns, including mutational signatures, transcriptional and replicative strand bias, lesion segregation, genomic distribution, and association with genomic features, which are collectively meaningful for studying the activity of mutational processes.

  • GenVisR: Short for "Genomic Visualizations in R," this tool provides visualization capabilities tailored to a variety of genomic data types, including data common in cancer research such as somatic mutations, copy number variations, and more.
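
To give a feel for the kind of summary that MAF-oriented packages produce, here is a minimal Python sketch that counts variants per classification and mutated samples per gene from MAF-style text. It is not the maftools API (maftools is an R package); the function name summarize_maf and the example rows are hypothetical, and only three standard MAF columns (Hugo_Symbol, Variant_Classification, Tumor_Sample_Barcode) are assumed.

```python
import csv
import io
from collections import Counter

# Hypothetical, made-up MAF-style rows (tab-delimited). Real MAF files carry
# many more columns; only three standard ones are used here.
MAF_TEXT = (
    "#version 2.4\n"
    "Hugo_Symbol\tVariant_Classification\tTumor_Sample_Barcode\n"
    "TP53\tMissense_Mutation\tsample_1\n"
    "TP53\tNonsense_Mutation\tsample_2\n"
    "KRAS\tMissense_Mutation\tsample_1\n"
    "TP53\tMissense_Mutation\tsample_1\n"
)


def summarize_maf(handle):
    """Count variants per classification and mutated samples per gene."""
    per_class = Counter()
    samples_per_gene = {}
    # Skip comment lines such as the optional "#version" header.
    rows = (line for line in handle if not line.startswith("#"))
    for row in csv.DictReader(rows, delimiter="\t"):
        per_class[row["Variant_Classification"]] += 1
        samples = samples_per_gene.setdefault(row["Hugo_Symbol"], set())
        samples.add(row["Tumor_Sample_Barcode"])
    gene_counts = {gene: len(s) for gene, s in samples_per_gene.items()}
    return per_class, gene_counts


per_class, gene_counts = summarize_maf(io.StringIO(MAF_TEXT))
print(per_class.most_common())
print(sorted(gene_counts.items(), key=lambda kv: -kv[1]))
```

Counting samples per gene rather than rows per gene avoids inflating a gene's total when one sample carries several variants in it. To run this on a real file, pass an open file handle instead of the io.StringIO wrapper.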

Teaching Tools