Minimal intro SNA

Objectives

This is an introduction to social networks using built-in functions in R and the packages sna and network. We will

  • learn the basic features of the adjacency matrix represented as a matrix object,
  • calculate the degrees of the nodes,
  • calculate some fundamental descriptives of the network, and
  • translate the network from a matrix object to a network object in order to plot the sociogram.

For full details of the packages, see https://cran.r-project.org/web/packages/sna/sna.pdf and https://cran.r-project.org/web/packages/network/network.pdf. For accessible general R help, see https://www.statmethods.net/, and for any kind of error message, search https://google.com.

This introduction is deliberately written in inelegant R, using functions that are as basic as possible. Many packages, such as ‘igraph’, offer sleeker and more user-friendly network routines. In particular, I would like to recommend the packages of David Schoch (http://mr.schochastics.net/) for accessible and elegant network analysis in R. In general, basic plots in R (described in https://www.statmethods.net/graphs/index.html) are functional, but more advanced and better-looking plots can be achieved with ‘ggplot’.

For basic concepts in network analysis, see Robins (2015) and Borgatti, Everett, and Johnson (2018). There is also a handy online book, http://faculty.ucr.edu/~hanneman/nettext/ (Hanneman and Riddle 2005).

Build your own network

To use sna (Butts 2016) and network (Butts 2015) for the first time, install the packages

install.packages("sna")
install.packages("network")

Once packages are installed, load them

library("sna")
library("network")

The Matrix

Create an empty adjacency matrix for n = 5 nodes

n <- 5
ADJ <- matrix(0,n,n) # create a matrix with n rows and n columns and all values 0

Add ties \(1 \rightarrow 2\), \(1 \rightarrow 3\), \(2 \rightarrow 3\), \(3 \rightarrow 4\), and \(4 \rightarrow 5\)

ADJ[1,2] <- 1
ADJ[1,3] <- 1
ADJ[2,3] <- 1
ADJ[3,4] <- 1
ADJ[4,5] <- 1
ADJ
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    0    1    1    0    0
## [2,]    0    0    1    0    0
## [3,]    0    0    0    1    0
## [4,]    0    0    0    0    1
## [5,]    0    0    0    0    0
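The same directed matrix can also be filled in one step from an edge list. The following is a minimal sketch, not part of the original walkthrough; the names `edges` and `ADJ_dir` are illustrative assumptions. It relies on the base-R rule that indexing a matrix with a two-column matrix selects one (row, column) cell per row of the index.

```r
# Build the directed adjacency matrix from an edge list (one row per tie).
n <- 5
edges <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5))  # sender, receiver

ADJ_dir <- matrix(0, n, n)  # empty n x n matrix
ADJ_dir[edges] <- 1         # each row of 'edges' addresses one cell
ADJ_dir
```

This gives the same matrix as the five separate assignments and scales better when there are many ties.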

To make the network undirected, add the ties \(2 \rightarrow 1\), \(3 \rightarrow 1\), \(3 \rightarrow 2\), \(4 \rightarrow 3\), and \(5 \rightarrow 4\)

ADJ[2,1] <- 1
ADJ[3,1] <- 1
ADJ[3,2] <- 1
ADJ[4,3] <- 1
ADJ[5,4] <- 1
ADJ
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    0    1    1    0    0
## [2,]    1    0    1    0    0
## [3,]    1    1    0    1    0
## [4,]    0    0    1    0    1
## [5,]    0    0    0    1    0
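Instead of adding each reverse tie by hand, a directed matrix can be symmetrised with the transpose. This is a sketch under the assumption that a tie in either direction should count as an undirected tie; `ADJ_dir` and `ADJ_und` are illustrative names.

```r
# Directed version of the example network
n <- 5
ADJ_dir <- matrix(0, n, n)
ADJ_dir[rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5))] <- 1

# A tie between i and j exists if it is present in either direction
ADJ_und <- (ADJ_dir + t(ADJ_dir) > 0) * 1

isSymmetric(ADJ_und)  # TRUE: an undirected network has a symmetric matrix
```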

Cells in the adjacency matrix and tie-variables

In general the cell ADJ[i,j] corresponds to the tie-variable \(X_{i,j}\). Here \(x_{1,2}=1\)

ADJ[1,2]
## [1] 1

but, for example, \(x_{1,4}=0\)

ADJ[1,4]
## [1] 0

The ties of node \(i=1\) are given by the \(i\)-th row

ADJ[1,]
## [1] 0 1 1 0 0
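Reading a row as "the ties of node i" can be turned into a list of neighbours with which(). A small sketch that rebuilds the undirected example matrix so it runs on its own; the object `ties` is an illustrative helper.

```r
# Rebuild the undirected example matrix
n <- 5
ADJ <- matrix(0, n, n)
ties <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5))
ADJ[ties] <- 1          # i -> j
ADJ[ties[, 2:1]] <- 1   # j -> i (columns swapped)

which(ADJ[1, ] == 1)    # nodes tied to node 1: 2 3
which(ADJ[3, ] == 1)    # nodes tied to node 3: 1 2 4
```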

Density

The adjacency matrix has

dim(ADJ)
## [1] 5 5

rows and columns. This means that there are \(n \times n\) cells in the adjacency matrix.

dim(ADJ)[1]*dim(ADJ)[2]
## [1] 25
length(ADJ)
## [1] 25

The \(n\) diagonal elements \(x_{1,1},x_{2,2},\ldots,x_{n,n}\) are zero by definition, which means that there are \(n \times n - n = n(n-1)\) variables that can be non-zero, here

dim(ADJ)[1]*dim(ADJ)[2] - n
## [1] 20

Density: How many variables are equal to 1 out of the total possible?

The total number of ones \[L = \sum_{i,j,i\neq j}x_{i,j}=x_{1,2}+\cdots+x_{1,n}+x_{2,1}+\cdots+x_{n,n-1}\] is simply a count of the number of non-zero entries

sum(ADJ)
## [1] 10

The density thus is

sum(ADJ)/(n*(n-1))
## [1] 0.5

and 50% of possible ties are present in the network.
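Because the matrix is symmetric, sum(ADJ) counts every undirected tie twice, once as \(x_{i,j}\) and once as \(x_{j,i}\). The sketch below checks that counting each tie once, out of \(n(n-1)/2\) unordered pairs, gives the same density. Variable names are illustrative.

```r
# Rebuild the undirected example matrix
n <- 5
ADJ <- matrix(0, n, n)
ties <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5))
ADJ[ties] <- 1
ADJ[ties[, 2:1]] <- 1

L_ordered   <- sum(ADJ)                 # 10: every tie counted in both directions
L_unordered <- sum(ADJ[upper.tri(ADJ)]) # 5: every tie counted once

L_ordered / (n * (n - 1))     # 0.5, density over ordered pairs
L_unordered / choose(n, 2)    # 0.5, density over unordered pairs
```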

Degree

How many ties does a node have?

The degree \(d_i\) of a node \(i\) is defined as the sum \(d_i=\sum_{j}x_{i,j}=x_{i,1}+x_{i,2}+\cdots + x_{i,n}\). The degree of node \(i=1\) is thus

sum(ADJ[1,])
## [1] 2

and the degree of node \(i=2\) is

sum(ADJ[2,])
## [1] 2

Degree distribution

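Calculate the column sums of the adjacency matrix to get the vector of degrees (note the capital S). For an undirected network the matrix is symmetric, so the column sums equal the row sums used in the definition above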

colSums(ADJ)
## [1] 2 2 3 2 1
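Row and column sums only coincide when the matrix is symmetric. As a sketch of the difference, the directed version of the example (before the reverse ties were added) gives out-degrees from rowSums() and in-degrees from colSums(); `ADJ_dir` and `ADJ_und` are illustrative names.

```r
n <- 5
ADJ_dir <- matrix(0, n, n)
ADJ_dir[rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5))] <- 1

rowSums(ADJ_dir)  # out-degrees (ties sent):     2 1 1 1 0
colSums(ADJ_dir)  # in-degrees  (ties received): 0 1 2 1 1

# After symmetrising, the two agree
ADJ_und <- (ADJ_dir + t(ADJ_dir) > 0) * 1
all(rowSums(ADJ_und) == colSums(ADJ_und))  # TRUE
```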

The degree distribution is the table of frequencies of degrees

table( colSums(ADJ) )
## 
## 1 2 3 
## 1 3 1

You can chart the degree distribution with a bar chart

plot( table( colSums(ADJ) ))
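table() only lists degrees that actually occur. If the chart should show every possible degree from 0 to \(n-1\), including those with frequency zero, one option is to convert the degrees to a factor with fixed levels first. A minimal sketch with illustrative names:

```r
# Rebuild the undirected example matrix
n <- 5
ADJ <- matrix(0, n, n)
ties <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5))
ADJ[ties] <- 1
ADJ[ties[, 2:1]] <- 1

deg <- colSums(ADJ)
deg_dist <- table(factor(deg, levels = 0:(n - 1)))  # keep empty categories
deg_dist

barplot(deg_dist, xlab = "Degree", ylab = "Number of nodes")
```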

You can use standard R-routines to explore the adjacency matrix

For example, find which node(s) have, say, degree 3

which(colSums(ADJ)==3)
## [1] 3

Or subsetting the adjacency matrix to look only at nodes with degree 2 or greater

use <- which(colSums(ADJ)>=2) # indices of the nodes with degree 2 or greater
ADJ[use,use]
##      [,1] [,2] [,3] [,4]
## [1,]    0    1    1    0
## [2,]    1    0    1    0
## [3,]    1    1    0    1
## [4,]    0    0    1    0
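A subset of an unlabelled matrix is renumbered from 1, so the original node identities are lost whenever a node in the middle is dropped. One way around this, sketched here as a suggestion, is to give the matrix row and column names before subsetting. The example keeps nodes with degree exactly 2 to make the effect visible.

```r
# Rebuild the undirected example matrix
n <- 5
ADJ <- matrix(0, n, n)
ties <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5))
ADJ[ties] <- 1
ADJ[ties[, 2:1]] <- 1

# Label rows and columns with the node ids
dimnames(ADJ) <- list(as.character(1:n), as.character(1:n))

keep <- colSums(ADJ) == 2   # logical vector: TRUE for nodes 1, 2 and 4
sub <- ADJ[keep, keep]
sub                          # rows and columns are still labelled 1, 2, 4
```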

Fun Fact: Linear algebra

Most network metrics can be calculated using linear algebra. For example, if element \(X_{i,j}\) of \(X\) tells you whether \(i\) and \(j\) are directly connected, then element \((XX)_{i,j}\) of the matrix product \(XX\) tells you how many two-step paths \(i \rightarrow k \rightarrow j\) there are (the diagonal counts the paths \(i \rightarrow k \rightarrow i\) that return to \(i\))

ADJ %*% ADJ
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    2    1    1    1    0
## [2,]    1    2    1    1    0
## [3,]    1    1    3    0    1
## [4,]    1    1    0    2    0
## [5,]    0    0    1    0    1

Element \((XXX)_{i,j}\) of the matrix product \(XXX\) tells you how many three-step paths \(i \rightarrow k \rightarrow h \rightarrow j\) there are

ADJ %*% ADJ %*% ADJ
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    2    3    4    1    1
## [2,]    3    2    4    1    1
## [3,]    4    4    2    4    0
## [4,]    1    1    4    0    2
## [5,]    1    1    0    2    0
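Two consequences of these products can be checked directly; this is a small sketch with illustrative names. The diagonal of \(XX\) reproduces the degrees, because every tie gives one path out and back. In an undirected network without self-ties, the diagonal of \(XXX\) counts closed three-step paths, and each triangle is counted six times (three starting nodes, two directions).

```r
# Rebuild the undirected example matrix
n <- 5
ADJ <- matrix(0, n, n)
ties <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5))
ADJ[ties] <- 1
ADJ[ties[, 2:1]] <- 1

A2 <- ADJ %*% ADJ
A3 <- A2 %*% ADJ

diag(A2)              # 2 2 3 2 1, identical to rowSums(ADJ)
sum(diag(A3)) / 6     # 1: the single triangle among nodes 1, 2 and 3
```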

Network object

Plotting the matrix object ADJ is not meaningful because R does not know that this is an adjacency matrix. To interpret ADJ as a network, translate the adjacency matrix to a network object

net <- as.network(ADJ, directed = FALSE)

NB: in the network package you use directed=FALSE in lieu of setting mode equal to graph.

The new object net is an object of type

class(net)
## [1] "network"

While printing ADJ to screen just gives you the matrix, printing net gives you a summary of the network

net
##  Network attributes:
##   vertices = 5 
##   directed = FALSE 
##   hyper = FALSE 
##   loops = FALSE 
##   multiple = FALSE 
##   bipartite = FALSE 
##   total edges= 5 
##     missing edges= 0 
##     non-missing edges= 5 
## 
##  Vertex attribute names: 
##     vertex.names 
## 
## No edge attributes
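The numbers in the printed summary can also be extracted individually. The sketch below uses accessor functions from the network package (network.size(), network.edgecount(), is.directed() and as.matrix()) and assumes the package is installed as described above. Note that the network object reports 5 edges, half of sum(ADJ), because each undirected tie is stored once.

```r
library(network)

# Rebuild the undirected example matrix and convert it
n <- 5
ADJ <- matrix(0, n, n)
ties <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5))
ADJ[ties] <- 1
ADJ[ties[, 2:1]] <- 1
net <- as.network(ADJ, directed = FALSE)

network.size(net)       # number of nodes: 5
network.edgecount(net)  # number of undirected ties: 5
is.directed(net)        # FALSE
as.matrix(net)          # back to an adjacency matrix
```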

Plot sociogram

When plotting a network object, R knows that you want to plot the sociogram

plot( net )

For various plotting options, see ?plot.network. For example, set node-size to degree, include labels, and set different colours

plot( net , # the network object
      vertex.cex = degree(net) , # how should nodes (vertices) be scaled
      displaylabels =  TRUE, # display the labels of vertices
      vertex.col = c('red','blue','grey','green','yellow'))

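Note that degree(net) is a function in the sna package (not in network) for calculating the degrees of the nodes; see ?degree for its gmode argument, which controls whether the network is treated as directed or undirected. The next step will explore more of these functions.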
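Node sizes can also be taken straight from the adjacency matrix, which keeps the plot independent of sna. The following is a sketch only: it assumes the network package is installed, and it assumes that the default layout draws on R's random number generator, so that calling set.seed() first makes the picture repeatable.

```r
library(network)

# Rebuild the undirected example matrix and convert it
n <- 5
ADJ <- matrix(0, n, n)
ties <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5))
ADJ[ties] <- 1
ADJ[ties[, 2:1]] <- 1
net <- as.network(ADJ, directed = FALSE)

deg <- rowSums(ADJ)   # degrees computed from the matrix: 2 2 3 2 1

set.seed(1)           # assumption: fixes the random starting layout
plot(net,
     vertex.cex = deg,        # scale nodes by degree
     displaylabels = TRUE,    # show node labels
     vertex.col = "grey")
```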

References

Borgatti, Stephen P, Martin G Everett, and Jeffrey C Johnson. 2018. Analyzing Social Networks. Sage.

Hanneman, Robert A., and Mark Riddle. 2005. Introduction to Social Network Methods. Riverside, CA: University of California, Riverside. http://faculty.ucr.edu/~hanneman/.

Robins, Garry. 2015. Doing Social Network Research: Network-Based Research Design for Social Scientists. Sage.