library(tidyverse) # various useful R enhancements
## ── Attaching packages ────────────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.2     ✓ purrr   0.3.4
## ✓ tibble  3.0.3     ✓ dplyr   1.0.1
## ✓ tidyr   1.1.1     ✓ stringr 1.4.0
## ✓ readr   1.3.1     ✓ forcats 0.5.0
## ── Conflicts ───────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(readxl) # for reading excel files
library(bipartite) # for network tools
## Loading required package: vegan
## Loading required package: permute
## Loading required package: lattice
## This is vegan 2.5-6
## Loading required package: sna
## Loading required package: statnet.common
## 
## Attaching package: 'statnet.common'
## The following object is masked from 'package:base':
## 
##     order
## Loading required package: network
## network: Classes for Relational Data
## Version 1.16.0 created on 2019-11-30.
## copyright (c) 2005, Carter T. Butts, University of California-Irvine
##                     Mark S. Handcock, University of California -- Los Angeles
##                     David R. Hunter, Penn State University
##                     Martina Morris, University of Washington
##                     Skye Bender-deMoll, University of Washington
##  For citation information, type citation("network").
##  Type help("network-package") to get started.
## sna: Tools for Social Network Analysis
## Version 2.5 created on 2019-12-09.
## copyright (c) 2005, Carter T. Butts, University of California-Irvine
##  For citation information, type citation("sna").
##  Type help(package="sna") to get started.
##  This is bipartite 2.15.
##  For latest changes see versionlog in ?"bipartite-package". For citation see: citation("bipartite").
##  Have a nice time plotting and analysing two-mode networks.
## 
## Attaching package: 'bipartite'
## The following object is masked from 'package:vegan':
## 
##     nullmodel

Flores et al.

Read the raw data from the Excel file. There is a separate sheet for each experiment, and we load all of them into a list, with one interaction matrix per element.

studies <- excel_sheets("data/sd02.xls")
phageDat <- lapply(studies, function(x) read_excel("data/sd02.xls", sheet = x))
## New names:
## * `1` -> `1...1`
## * `1` -> `1...2`
## * `1` -> `1...4`
## (similar column-renaming messages are printed for each of the remaining sheets)
names(phageDat) <- studies
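
Quick check of what we loaded (my addition): read_excel() returns tibbles, so if any bipartite function downstream is picky about the input type, coercing each sheet to a plain numeric matrix is a simple fallback. The name phageMats below is hypothetical and is not used later.

# Hedged sketch: dimensions of each interaction web, plus an optional
# coercion of the tibbles to plain numeric matrices for bipartite.
sapply(phageDat, dim)
phageMats <- lapply(phageDat, function(x) as.matrix(as.data.frame(x)))  # hypothetical, unused below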

Getting started

The details of the methods are in the online supplement; among other things, they used the Q metric.
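
bipartite also ships a modularity routine, so here is a minimal sketch of computing Q for a single study with computeModules(); this uses bipartite's default algorithm, which is not necessarily the one the paper used.

# Minimal sketch (assumption: bipartite's default modularity algorithm,
# not necessarily the paper's): Q for the first study.
mod1 <- computeModules(as.matrix(phageDat[[1]]))
mod1@likelihood  # the Q value of the fitted module partition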

Examining nestedness in the data set

if (file.exists("nulls.Rdata")) {
  nulls <- readRDS("nulls.RData")
} else { 
  nulls <- lapply(phageDat, function(x) nullmodel(x, N=100, method="shuffle.web"))
  saveRDS(nulls, "nulls.RData")
}
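
As a quick sanity check (my addition), a shuffle.web null should keep the dimensions of the original web and, since it only redistributes the observed interactions, its total number of interactions too.

# Sanity check (hedged sketch): compare one null web with its source web.
dim(nulls[[1]][[1]])
sum(nulls[[1]][[1]]) == sum(as.matrix(phageDat[[1]]))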

# Observed nestedness for each study
nestednessObs <- sapply(phageDat, function(x) networklevel(x, index = "nestedness"))

if (file.exists("nestednessNulls.Rdata")) {
  nestednessNulls <- readRDS("nestednessNulls.RData")
} else { 
  nestednessNulls <- pbapply::pblapply(nulls, function (x) sapply(x, function(y) networklevel(y, index="nestedness")))
  saveRDS(nestednessNulls, "nestednessNulls.RData")
}
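
Before plotting, one simple way to summarise how unusual each observed value is (my addition; other summaries are equally reasonable) is a z-score against the study's null distribution.

# Hedged sketch: z-score of observed nestedness relative to the null
# distribution, one value per study; NAs in the nulls are dropped.
nestednessZ <- mapply(function(obs, null) {
  (obs - mean(null, na.rm = TRUE)) / sd(null, na.rm = TRUE)
}, nestednessObs, nestednessNulls)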

Examining output

We’ll try to make something like Fig. 6B, plotting the observed nestedness of each study against its null distribution.

# Summarise each study's null distribution (mean and a 95% quantile envelope)
# alongside its observed nestedness.
nestednessDat <- data.frame(mean = sapply(nestednessNulls, mean),
           lower = sapply(nestednessNulls, quantile, probs = 0.025, na.rm = TRUE),
           upper = sapply(nestednessNulls, quantile, probs = 0.975, na.rm = TRUE),
           observed = nestednessObs,
           study = studies)
# Order the studies by their null mean so the plot reads left to right.
nestednessDat$study <- factor(nestednessDat$study, levels = studies[order(nestednessDat$mean)])

ggplot(nestednessDat, aes(x = study, y = observed)) +
  geom_point(color = "red") +
  geom_point(aes(y = mean), color = "black") +
  geom_errorbar(aes(ymin = lower, ymax = upper), width = 0.2) +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
## Warning: Removed 1 rows containing missing values (geom_point).

## Warning: Removed 1 rows containing missing values (geom_point).

On the whole we get similar results: the observed networks tend to be somewhat more nested than expected under the null model. The numbers don’t match the paper exactly, most likely because we aren’t using the same null model or the same nestedness statistic, but the overall pattern is similar.
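
To see how much the choice of statistic matters, one option is to repeat the comparison with the NODF index (a widely used nestedness measure), reusing the null webs generated above; this is a sketch only, and slow enough that it is worth caching like the earlier runs.

# Hedged sketch: the same comparison with the NODF nestedness index,
# reusing the shuffle.web nulls from above.
nodfObs <- sapply(phageDat, function(x) networklevel(x, index = "NODF"))
nodfNulls <- pbapply::pblapply(nulls, function(x) sapply(x, function(y) networklevel(y, index = "NODF")))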

Homework