The Kiel Microbiome Center workshop
Overview
Here you can find the teaching and practical materials from the workshop.
Concepts in microbiome ecology and data analysis
Link all files of the talks here, use a logical order:
Hands-on material
- Introduction to R
- Alpha diversity analysis
- Beta diversity analysis
- Differential abundance analysis
- Metagenome outlook
All commands are written in R, and the scripts are provided as Jupyter notebooks. The instructions below explain how to set everything up.
Setting up the tutorial
Download the whole project directory to your computer and unzip it. Scripts are in the scripts folder, inputs are in the inputs folder, and intermediary RData files are in the R_objects folder.
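After unzipping, a quick check can confirm that the three folders described above are present. The following is a minimal sketch; PROJECT_DIR is a hypothetical variable for illustration and defaults to the current directory.

```shell
#!/usr/bin/env bash
# Sketch: confirm the unzipped project contains the three folders named in this README.
# PROJECT_DIR is a hypothetical variable; set it to wherever you unzipped the project.
PROJECT_DIR="${PROJECT_DIR:-.}"

# Report each expected folder as found or missing; return non-zero if any is missing.
check_layout() {
    local root="$1"
    local missing=0
    local d
    for d in scripts inputs R_objects; do
        if [ -d "$root/$d" ]; then
            echo "found:   $root/$d"
        else
            echo "missing: $root/$d"
            missing=1
        fi
    done
    return "$missing"
}

check_layout "$PROJECT_DIR" || echo "Some folders are missing; check where the archive was unzipped."
```

Run it from inside the unzipped project directory, or set PROJECT_DIR to its location first.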
The tutorial assumes that you are working on a macOS or Linux computer.
Input files
16S Data Files
To run the tutorials, you will need the 16S dataset files from the inputs folder. This dataset has been reduced in size and randomly shuffled compared to the original dataset. The files you need are:
- feature-table.tsv
- metadata.tsv
- taxonomy.csv
- tree.nwk
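Before starting the notebooks, it may help to confirm that all four 16S files are in place. This sketch assumes the files sit directly under the inputs folder; INPUT_DIR is a hypothetical variable that you can point elsewhere if your copy is organised differently.

```shell
#!/usr/bin/env bash
# Sketch: check that the four 16S input files exist and are not empty.
# INPUT_DIR is a hypothetical variable; the default assumes the files are directly in inputs/.
INPUT_DIR="${INPUT_DIR:-inputs}"

# Usage: check_files DIR FILE...
# Returns non-zero if any listed file is absent or has zero size.
check_files() {
    local dir="$1"
    local missing=0
    local f
    shift
    for f in "$@"; do
        if [ -s "$dir/$f" ]; then
            echo "ok:      $dir/$f"
        else
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    return "$missing"
}

check_files "$INPUT_DIR" feature-table.tsv metadata.tsv taxonomy.csv tree.nwk \
    || echo "At least one 16S input file was not found in $INPUT_DIR."
```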
Metagenome Files
For metagenome analysis, you will need to download the files from public repositories and place them in the inputs folder.
HMP Sample
The tutorial uses an HMP2 metagenome sample with the ID MSM79H8.
Download its files to inputs/metagenome/.
Human/GRCh38 genome files
- genome.1.bt2
- genome.2.bt2
- genome.3.bt2
- genome.4.bt2
- genome.fa
- genome.rev.1.bt2
- genome.rev.2.bt2
You can download the required Human/GRCh38 genome files from, e.g., here. Unzip the files into inputs/metagenome/.
Sylph database
Download the file to inputs/metagenome/.
The database used in the tutorial can be downloaded from the official Sylph repository. Note that this file is around 13 gigabytes in size.
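Because the database is large, it can be worth checking free disk space before starting the download. The sketch below uses the standard df utility; TARGET_DIR and REQUIRED_GB are hypothetical variables, and the 15 GiB threshold is only an assumption that leaves some headroom above the stated file size.

```shell
#!/usr/bin/env bash
# Sketch: check free disk space before downloading the Sylph database.
# TARGET_DIR and REQUIRED_GB are hypothetical variables chosen for illustration.
TARGET_DIR="${TARGET_DIR:-.}"
REQUIRED_GB="${REQUIRED_GB:-15}"   # assumption: roughly 13 GB plus headroom

# Print the free space, in whole GiB, on the filesystem that holds the given directory.
# df -P gives a stable POSIX layout; -k reports 1024-byte blocks; column 4 is "Available".
free_gb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

available="$(free_gb "$TARGET_DIR")"
if [ "$available" -ge "$REQUIRED_GB" ]; then
    echo "OK: ${available} GiB free in $TARGET_DIR"
else
    echo "Only ${available} GiB free in $TARGET_DIR; the Sylph database needs about 13 GB."
fi
```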
GTDB-Tk taxonomy
Download the file to inputs/metagenome/.
BBMap Files
Download the files to inputs/metagenome/bbmap/.
- nextera.fa.gz
- phix174_ill.ref.fa.gz
- phix_adapters.fa.gz
You can download the needed files from the BBMap Repository.
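Interrupted downloads can leave truncated archives behind. As a hedged sketch, the three gzip-compressed BBMap files can be tested with gzip -t, which verifies the compressed stream without extracting it. BBMAP_DIR is a hypothetical variable that defaults to the folder named above.

```shell
#!/usr/bin/env bash
# Sketch: verify that the downloaded BBMap reference files are intact gzip archives.
# BBMAP_DIR is a hypothetical variable; the default matches the folder named in this README.
BBMAP_DIR="${BBMAP_DIR:-inputs/metagenome/bbmap}"

# Returns 0 if the file is a valid gzip archive, 1 if it is missing, 2 if gzip -t rejects it.
check_gzip() {
    local f="$1"
    if [ ! -f "$f" ]; then
        echo "missing: $f"
        return 1
    fi
    if gzip -t "$f" 2>/dev/null; then
        echo "ok:      $f"
    else
        echo "corrupt: $f"
        return 2
    fi
}

for name in nextera.fa.gz phix174_ill.ref.fa.gz phix_adapters.fa.gz; do
    check_gzip "$BBMAP_DIR/$name" || true   # keep going so every file is reported
done
```

A file reported as corrupt should be downloaded again.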
Intermediary back-up files for phyloseq
Intermediary RData files are in the R_objects folder. You will create these files yourself during the tutorial, but because many of the tutorials use them as input, we also provide them here as a back-up.
- subset_phyloseq_object.RData
- subset_rare_phyloseq_object.RData
Running the R commands
Option 1: Using Jupyter Notebooks with the R Kernel
Install conda environment
First, make sure you have conda installed. You can install all required packages and tools with conda. Once you install conda, you can create a dedicated conda environment for the tutorial with the following command:
conda env create --file kmc_16s_environment.yml --prefix kmc_workshop
Once the environment is created, activate it. Because it was created with --prefix, refer to it by its path:
conda activate ./kmc_workshop
Launch Jupyter Notebook
If Jupyter is not already installed, you can install it within the environment:
conda install jupyter
Start Jupyter, then select the R kernel when you open or create a notebook:
jupyter notebook
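If no R kernel shows up in Jupyter, listing the registered kernels can help narrow down the problem. This sketch relies on the jupyter kernelspec list command and assumes the R kernel is registered under the name ir, which is the IRkernel default; your environment may use a different name.

```shell
#!/usr/bin/env bash
# Sketch: check whether an R kernel is registered with Jupyter.
# Assumption: the R kernel uses the default IRkernel name "ir".

# Reads `jupyter kernelspec list` output on stdin; succeeds if a kernel named "ir" is listed.
has_r_kernel() {
    awk '$1 == "ir" { found = 1 } END { exit !found }'
}

if command -v jupyter >/dev/null 2>&1; then
    if jupyter kernelspec list | has_r_kernel; then
        echo "R kernel found."
    else
        echo "No kernel named 'ir' is registered; check the conda environment."
    fi
else
    echo "jupyter is not on PATH; activate the conda environment first."
fi
```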
Option 2: Using RStudio
Ensure that conda is installed and the environment is activated (same as the first step of Option 1 above).
Link Conda Environment with RStudio:
In RStudio, go to Tools > Global Options > R and select the R Version associated with the Conda environment.
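To find the R installation that belongs to the conda environment, you can print its path from a terminal in which the environment is active. This is a minimal sketch: CONDA_PREFIX is set by conda activate, while the bin/R location is an assumption about the usual conda layout on macOS and Linux.

```shell
#!/usr/bin/env bash
# Sketch: locate the R binary inside the currently active conda environment.
# CONDA_PREFIX is set by `conda activate`; the bin/R location is an assumption
# about a typical conda layout on macOS and Linux.

# Print the path to R in the active environment; return 1 if it cannot be found.
conda_r_path() {
    local r_bin="${CONDA_PREFIX:-}/bin/R"
    if [ -n "${CONDA_PREFIX:-}" ] && [ -x "$r_bin" ]; then
        echo "$r_bin"
    else
        return 1
    fi
}

if r_path="$(conda_r_path)"; then
    echo "Select this R in RStudio: $r_path"
else
    echo "No R binary found; activate the conda environment first."
fi
```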
Run Your R Code
You can now open your R scripts or RMarkdown notebooks in RStudio and run the code, just as we did in the Jupyter environment.