Resources
Helpful materials, guides, and tools for learning bioinformatics, data science, and public health research.
Journal Finder by Elsevier: This tool allows you to enter your manuscript title and abstract to find journals that are a good match for your research. It also provides information on journal metrics and submission guidelines.
Springer Journal Suggester: Springer's tool helps you find the right journal for your research by analyzing your abstract or keywords and suggesting journals that publish articles in your field.
PubMed Journal Selector: If your research is in the biomedical or life sciences, PubMed's Journal Selector can assist you in finding journals that match your keywords and research area.
JANE (Journal/Author Name Estimator): JANE is a free online tool that helps you find journals and authors based on the text of your article's title and abstract.
JournalGuide: This tool provides a comprehensive database of journals in various fields and allows you to search for journals by keywords or browse by subject area.
Scopus Journal Finder: Scopus, a bibliographic database, offers a journal finder feature that helps you find journals related to your research area based on keywords or article titles.
DOAJ (Directory of Open Access Journals): If you're interested in open-access journals, DOAJ is a directory of freely available scholarly journals that you can search by subject or keyword.
Scimago Journal & Country Rank: On the Scimago platform, you can find information about journals, their rankings, citation data, and more, which can be useful for researchers looking to identify suitable journals for their research or assess the impact and prestige of journals in their field. It's a valuable tool for academic research and evaluation.
The Project Open Data Dashboard gives overview statistics of available government data from various agencies.
Guide to Open Data Publishing & Analytics - A good article describing best practices for publishing data openly. It is also a good read for those who want to analyze others' data.
A short list of data related R packages - packages that either access data or include data
Kaggle Data - A growing number of datasets used in Kaggle data analysis contests and available for any other use.
Nasdaq Data Link - mainly finance related data
NHANES - longstanding and thorough survey done by CDC
SEER - Cancer data
CDC WONDER - list of mainly CDC online databases
Healthy People Website - contains among other things links to various data sources
HCUP - collection of health related databases, focusing on US wide and state-specific samples of ER and hospital visits. Not free, but not too expensive.
Clinical Study Data Request - a way to get (tedious) access to clinical trial data
EMA Clinical Data Portal - looks like a way to get access to some clinical trial data for EMA registered studies.
MIMIC - a free and open database of critical care patient visits to a Boston hospital.
Data.gov - federal government data platform.
Analyze Survey Data for Free - Step by Step Instructions to Explore Public Microdata from an Easy to Type Website
Inter-university Consortium for Political and Social Research (ICPSR) - access to various social and behavioral sciences data.
A list hosted by Microsoft with links to various data sources
National Cancer Institute (NCI) Genomic Data Commons (GDC): GDC is an open-access data portal providing access to a wide range of cancer genomics datasets.
cellxgene.cziscience.com - Download and visually explore reference-quality data to understand the functionality of human tissues at the cellular level with Chan Zuckerberg CELL by GENE Discover (CZ CELLxGENE Discover).
10x Genomics - vendor of single-cell and in situ (spatial) genomics platforms; the site also offers publicly downloadable example datasets.
UCSC Xena: An online exploration tool for public and private, multi-omic and clinical/phenotype data
GEO2R: GEO2R is an interactive web tool that allows users to compare two or more groups of Samples in a GEO Series in order to identify genes that are differentially expressed across experimental conditions. Results are presented as a table of genes ordered by P-value, and as a collection of graphic plots to help visualize differentially expressed genes and assess data set quality. GEO2R uses a variety of R packages from the Bioconductor project. Bioconductor is an open-source software project based on the R programming language that provides tools for the analysis of high-throughput genomic data.
GEPIA2: GEPIA2 is a web-based tool for analyzing gene expression data in cancer. It stands for Gene Expression Profiling Interactive Analysis 2 and is an updated version of the original GEPIA tool. GEPIA2 allows users to explore gene expression patterns, perform survival analyses, and visualize gene expression data across various cancer types.
TIMER2.0: TIMER is a comprehensive resource for systematic analysis of immune infiltrates across diverse cancer types. This version of the web server provides immune infiltrate abundances estimated by multiple immune deconvolution methods, and allows users to dynamically generate high-quality figures for exploring tumor immunological, clinical, and genomic features.
UALCAN: UALCAN is a web-based platform that provides interactive and comprehensive analysis of cancer transcriptome data. It enables users to explore gene expression patterns, perform survival analyses, and compare gene expression between tumor and normal samples across different cancer types. UALCAN utilizes data from The Cancer Genome Atlas (TCGA) to facilitate cancer research and provide insights into tumor biology.
cBioPortal for Cancer Genomics: cBioPortal hosts a large collection of cancer genomics datasets, allowing users to explore and visualize the data.
GREIN (GEO RNA-seq Experiments Interactive Navigator): GREIN is an interactive web platform that provides user-friendly options to explore and analyze GEO RNA-seq data. GREIN is powered by a back-end computational pipeline for uniform processing of RNA-seq data and a large number (>6,000) of already processed datasets. These datasets were retrieved from GEO and reprocessed consistently by the back-end GEO RNA-seq experiments processing pipeline (GREP2).
OncoLnc: OncoLnc is a web resource that provides survival analysis and expression correlation for genes of interest across multiple cancer datasets.
UCSC Cancer Genomics Browser: The UCSC Cancer Genomics Browser offers a comprehensive collection of cancer genomics data integrated with genomic annotations.
ONCOMINE: ONCOMINE is a powerful web-based platform for the analysis and visualization of cancer transcriptomic data. It provides researchers with access to a vast collection of publicly available gene expression datasets derived from cancer studies. ONCOMINE allows users to explore gene expression patterns, identify potential biomarkers, and compare gene expression between different cancer types or subtypes.
TCGAbiolinks: An R/Bioconductor package for integrative analysis with GDC data. TCGAbiolinks accesses the National Cancer Institute (NCI) Genomic Data Commons (GDC) through its GDC Application Programming Interface (API) to search, download, and prepare relevant data for analysis in R.
maftools: Summarize, Analyze and Visualize MAF Files. This package attempts to summarize, analyze, annotate and visualize MAF files in an efficient manner from either TCGA sources or any in-house studies as long as the data is in MAF format.
SummarizedExperiment: The SummarizedExperiment container contains one or more assays, each represented by a matrix-like object of numeric or other mode. The rows typically represent genomic ranges of interest and the columns represent samples.
MutationalPatterns: Comprehensive genome-wide analysis of mutational processes. The package covers a wide range of patterns including mutational signatures, transcriptional and replicative strand bias, lesion segregation, genomic distribution, and association with genomic features, which are collectively meaningful for studying the activity of mutational processes.
GenVisR: Short for "Genomic Visualizations in R," this tool provides visualization capabilities tailored to a variety of genomic data types, including data common in cancer research such as somatic mutations, copy number variations, and more.
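
As a rough illustration of the SummarizedExperiment layout described above (one or more assay matrices that share the same rows and columns), here is a minimal Python sketch. It is not the Bioconductor API: the class name MiniSummarizedExperiment, its fields, and the subset_columns method are hypothetical names made up for this example, and plain gene names stand in for genomic ranges.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MiniSummarizedExperiment:
    """Toy container (hypothetical, not the Bioconductor class).

    Every assay is a rows-by-columns matrix; all assays share the same
    row names (features) and column names (samples).
    """
    assays: Dict[str, List[List[float]]]
    row_names: List[str]
    col_names: List[str]

    def __post_init__(self) -> None:
        # Reject any assay whose shape disagrees with the shared dimensions.
        for name, matrix in self.assays.items():
            if len(matrix) != len(self.row_names):
                raise ValueError(f"assay {name!r}: wrong number of rows")
            if any(len(row) != len(self.col_names) for row in matrix):
                raise ValueError(f"assay {name!r}: wrong number of columns")

    def subset_columns(self, keep: List[str]) -> "MiniSummarizedExperiment":
        # Subsetting samples applies to every assay at once, so the
        # matrices and the sample labels cannot drift out of sync.
        idx = [self.col_names.index(c) for c in keep]
        return MiniSummarizedExperiment(
            assays={
                name: [[row[i] for i in idx] for row in matrix]
                for name, matrix in self.assays.items()
            },
            row_names=list(self.row_names),
            col_names=[self.col_names[i] for i in idx],
        )


# Made-up example values: 2 features x 3 samples, one assay called "counts".
se = MiniSummarizedExperiment(
    assays={"counts": [[10, 20, 30], [5, 15, 0]]},
    row_names=["geneA", "geneB"],
    col_names=["s1", "s2", "s3"],
)
sub = se.subset_columns(["s1", "s3"])
print(sub.col_names, sub.assays["counts"])
```

The point of the design is that data and labels travel together: selecting samples once updates every assay, which is the property that makes this kind of container convenient for downstream analysis.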