The Kiel Microbiome Center workshop

Overview

Here you can find the teaching and practical materials from the workshop.

Concepts in microbiome ecology and data analysis

The talk files are listed below in a logical order:

  1. Data processing
  2. Alpha diversity
  3. Beta diversity
  4. Metagenomics

Hands-on material

  1. Introduction to R
  2. Alpha diversity analysis
  3. Beta diversity analysis
  4. Differential abundance analysis
  5. Metagenome outlook

All commands are written in R, and the scripts are provided as Jupyter notebooks. The sections below explain how to set everything up.

Setting up the tutorial

Download the whole project directory to your computer and unzip it. Scripts are in the scripts folder, inputs are in the inputs folder, and intermediary RData files are in the R_objects folder.

The tutorial assumes that you are working on macOS or a Linux-based computer.

Input files

16S Data Files

To run the tutorials, you will need the 16S dataset files from the inputs folder. This dataset has been reduced in size and randomly shuffled compared to the original dataset. The files you need are:

  • feature-table.tsv
  • metadata.tsv
  • taxonomy.csv
  • tree.nwk
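Before starting, it can help to confirm that all four files are in place. The following is a minimal sketch, not part of the workshop scripts: the function name check_16s_inputs is hypothetical, and it assumes the files sit directly inside the inputs folder and that you run it from the unzipped project directory.

```shell
#!/bin/sh
# Report which of the required 16S input files are missing from a directory.
# Prints one line per missing file; returns non-zero if any file is missing.
# Assumption: the files sit directly inside the given directory.
check_16s_inputs() {
    dir="${1:-inputs}"
    missing=0
    for f in feature-table.tsv metadata.tsv taxonomy.csv tree.nwk; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    return "$missing"
}

# Example call from the unzipped project directory.
if check_16s_inputs inputs; then
    echo "All 16S input files found."
fi
```

If a file is reported as missing, re-download or re-unzip the project directory before opening the notebooks.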

Metagenome Files

For metagenome analysis, you will need to download the files from public repositories and place them in the inputs folder.

HMP Sample

The tutorial uses an HMP2 metagenome sample with the ID MSM79H8.

Download the files to (inputs/metagenome/).

Human/GRCh38 genome files

  • genome.1.bt2
  • genome.2.bt2
  • genome.3.bt2
  • genome.4.bt2
  • genome.fa
  • genome.rev.1.bt2
  • genome.rev.2.bt2

You can download the required Human/GRCh38 genome files from, e.g., here. Unzip the files into inputs/metagenome/.
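The .bt2 files follow the naming of a Bowtie2 index with the basename genome, which consists of six files. As a rough sketch, the helper below lists any index file that is absent. The function name missing_bt2_files is hypothetical, and the path assumes the archive unpacks its files directly into inputs/metagenome/ rather than into a subfolder.

```shell
#!/bin/sh
# Print every file of a Bowtie2 index with basename $1 that does not exist.
# Prints nothing when the index is complete.
missing_bt2_files() {
    base="$1"
    for suffix in 1.bt2 2.bt2 3.bt2 4.bt2 rev.1.bt2 rev.2.bt2; do
        [ -f "$base.$suffix" ] || echo "$base.$suffix"
    done
}

# Assumed location: files unpacked directly into inputs/metagenome/.
missing=$(missing_bt2_files inputs/metagenome/genome)
if [ -z "$missing" ]; then
    echo "Bowtie2 index complete."
else
    echo "Missing index files:"
    echo "$missing"
fi
```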

Sylph database

Download the file to (inputs/metagenome/).

The database can be downloaded from the official Sylph Repository. Note that this file is around 13 gigabytes in size.
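Given the size of the database, it may be worth checking free disk space before downloading. The sketch below is one way to do that with df. The function name has_free_space and the 15 GiB threshold (the database plus some headroom) are assumptions, not workshop requirements.

```shell
#!/bin/sh
# Succeed if the filesystem holding directory $1 has at least $2 GiB free.
has_free_space() {
    dir="$1"
    need_gib="$2"
    # df -Pk prints POSIX-format output in 1024-byte blocks;
    # the fourth column of the second line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    need_kb=$((need_gib * 1024 * 1024))
    [ "$avail_kb" -ge "$need_kb" ]
}

# Fall back to the current directory if the target folder does not exist yet.
target="inputs/metagenome"
[ -d "$target" ] || target="."

# 15 is an assumed threshold: the database size plus headroom.
if has_free_space "$target" 15; then
    echo "Enough free space for the Sylph database."
else
    echo "Less than 15 GiB free; free up space before downloading."
fi
```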

GTDB-Tk taxonomy

Download the file to (inputs/metagenome/).

BBMap Files

Download the files to (inputs/metagenome/bbmap/).

  • nextera.fa.gz
  • phix174_ill.ref.fa.gz
  • phix_adapters.fa.gz

You can download the needed files from the BBMap Repository.

Intermediary back-up files for phyloseq

Intermediary RData files are in the R_objects folder. You will create these objects yourself during the tutorial, but because many of the tutorials use them as input, we also provide them here as a back-up.

  • subset_phyloseq_object.RData
  • subset_rare_phyloseq_object.RData

Running the R commands

Option 1: Using Jupyter Notebooks with the R Kernel

Install conda environment

First, make sure conda is installed; all required packages and tools can be installed through it. Then create a dedicated conda environment for the tutorial with the following command:

conda env create --file kmc_16s_environment.yml --prefix kmc_workshop

Once the environment is created, activate it. Because it was created with --prefix, refer to it by its path rather than by name:

conda activate ./kmc_workshop
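To confirm that activation worked, you can inspect the CONDA_PREFIX variable, which conda sets to the path of the active environment. The snippet below is a small sketch; the function name env_is_active is hypothetical, and it only compares the last path component against the environment name.

```shell
#!/bin/sh
# Succeed if the directory name of the active conda environment matches $1.
# conda sets CONDA_PREFIX to the active environment's path on activation.
env_is_active() {
    [ -n "${CONDA_PREFIX:-}" ] && [ "$(basename "$CONDA_PREFIX")" = "$1" ]
}

if env_is_active kmc_workshop; then
    echo "kmc_workshop is active: $CONDA_PREFIX"
else
    echo "kmc_workshop is not active; run the activate command first."
fi
```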

Launch Jupyter Notebook

If Jupyter is not already installed, you can install it within the environment:

conda install jupyter

Start Jupyter, then select the R kernel when you open or create a notebook:

jupyter notebook
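If the R kernel does not show up in Jupyter, you can check whether it is registered by inspecting the output of jupyter kernelspec list. The sketch below assumes the kernel carries the name ir, which is the default name IRkernel registers under; the helper name has_kernel is hypothetical.

```shell
#!/bin/sh
# Read the output of `jupyter kernelspec list` on stdin and succeed
# if one of the listed kernels is named $1.
has_kernel() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

if command -v jupyter >/dev/null 2>&1; then
    # "ir" is assumed to be the R kernel name (the IRkernel default).
    if jupyter kernelspec list | has_kernel ir; then
        echo "R kernel found."
    else
        echo "R kernel not found; the environment may be missing IRkernel."
    fi
else
    echo "jupyter is not on PATH; activate the environment first."
fi
```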

Option 2: Using RStudio

Ensure that conda is installed and the kmc_workshop environment is active (see "Install conda environment" under Option 1).

Run Your R Code

You can now open the R scripts or R Markdown notebooks in RStudio and run the code, just as in the Jupyter environment.